Transportation and Traffic Theory

TRANSPORTATION
AND
TRAFFIC THEORY
Related Pergamon books
LESORT
Transportation and Traffic Theory: Proceedings of the 13th ISTT
BELL
Transportation Networks: Recent Methodological Advances
DAGANZO
Fundamentals of Transportation and Traffic Operations
ETTEMA & TIMMERMANS

Activity Based Approaches to Travel Analysis
GARLING, LAITILA & WESTIN

Theoretical Foundations of Travel Choice Modeling
GRIFFITHS
Mathematics in Transport Planning and Control
STOPHER & LEE-GOSSELIN

Understanding Travel Behaviour in an Era of Change
Related Pergamon journals

Transportation Research Part A: Policy and Practice Editor: Frank A. Haight
Transportation Research Part B: Methodological Editor: Frank A. Haight
Free specimen copies of journals available on request

TRANSPORTATION
AND
TRAFFIC THEORY
Proceedings of the 14th International Symposium
on Transportation and Traffic Theory
Jerusalem, Israel, 20-23 July, 1999
edited by
AVISHAI CEDER
Transportation Research Institute
Faculty of Civil Engineering
Technion - Israel Institute of Technology
Haifa, Israel
PERGAMON
An Imprint of Elsevier Science
Amsterdam - Lausanne - New York - Oxford - Shannon - Singapore - Tokyo
ELSEVIER SCIENCE Ltd
The Boulevard, Langford Lane
Kidlington, Oxford OX5 1GB, UK
© 1999 Elsevier Science Ltd. All rights reserved.
This work and the individual contributions contained in it are protected under copyright by Elsevier Science, and the following terms and
conditions apply to its use:
Photocopying
Single photocopies of single chapters may be made for personal use as allowed by national copyright laws. Permission of the publisher and
payment of a fee is required for all other photocopying, including multiple or systematic copying, copying for advertising or promotional pur-
poses, resale, and all forms of document delivery. Special rates are available for educational institutions that wish to make photocopies for
non-profit educational classroom use.
Permissions may be sought directly from Elsevier Science Rights & Permissions Department, PO Box 800, Oxford OX5 1DX, UK; phone:
(+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.co.uk. You may also contact Rights & Permissions directly
through Elsevier's home page (http://www.elsevier.nl), selecting first 'Customer Support', then 'General Information', then 'Permissions
Query Form'.
In the USA, users may clear permissions and make payments through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers,
MA 01923, USA; phone: (978) 7508400, fax: (978) 7504744, and in the UK through the Copyright Licensing Agency Rapid Clearance
Service (CLARCS), 90 Tottenham Court Road, London W1P 0LP, UK; phone: (+44) 171 631 5555; fax: (+44) 171 631 5500. Other countries
may have a local reprographic rights agency for payments.
Derivative Works
Tables of contents may be reproduced for internal circulation, but permission of Elsevier Science is required for external resale or distribution
of such material.
Permission of the publisher is required for all other derivative works, including compilations and translations.
Electronic Storage or Usage

Permission of the publisher is required to store or use electronically any material contained in this work, including any chapter or part of a
chapter. Contact the publisher at the address indicated.
Except as outlined above, no part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means,
electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the publisher.
Address permissions requests to: Elsevier Science Rights & Permissions Department, at the mail, fax and e-mail addresses noted above.
Notice
No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence
or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid
advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.
First edition 1999
Library of Congress Cataloging in Publication Data

A catalog record from the Library of Congress has been applied for.
British Library Cataloguing in Publication Data

A catalogue record from the British Library has been applied for.
ISBN: 0 08 043448 7
The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of
Paper).
Printed in The Netherlands.
CONTENTS
Preface and Overview ix
International Advisory Committee xiii
In Memoriam xv
Contributors xix
Chapter 1 - Traffic Flow Models 1 -104

MACROSCOPIC TRAFFIC FLOW MODELS:
A QUESTION OF ORDER
J.P. Lebacque, J.B. Lesort 3
MACROSCOPIC MULTIPLE USER-CLASS TRAFFIC FLOW
MODELLING: A MULTILANE GENERALISATION
USING GAS-KINETIC THEORY
S.P. Hoogendoorn, P.H.L. Bovy 27
THE CHAPMAN-ENSKOG EXPANSION: A NOVEL APPROACH
TO HIERARCHICAL EXTENSION OF LIGHTHILL-WHITHAM MODELS
P. Nelson, A. Sopasakis 51
THE LAGGED CELL-TRANSMISSION MODEL
C.F. Daganzo 81
Chapter 2 - Traffic Flow Behaviour 105-188

OBSERVATIONS AT A FREEWAY BOTTLENECK
M.J. Cassidy, R.L. Bertini 107
FLOWS UPSTREAM OF A HIGHWAY BOTTLENECK
G.F. Newell 125
THEORY OF CONGESTED TRAFFIC FLOW: SELF-ORGANIZATION
WITHOUT BOTTLENECKS
B.S. Kerner 147
A MERGING-GIVEWAY BEHAVIOR MODEL CONSIDERING
INTERACTIONS AT EXPRESSWAY ON-RAMPS
H. Kita, K. Fukuyama 173
Chapter 3 - Road Safety and Pedestrians 189 - 254

COMPARISON OF RESULTS OF METHODS OF THE IDENTIFICATION
OF HIGH RISK ROAD SECTIONS
M. Tracz, M. Nowakowska 191
BEHAVIOURAL ADAPTATION AND SEAT-BELT USE: A HYPOTHESIS
INVOKING LOOMING AS A NEGATIVE REINFORCER
A.H. Reinhardt-Rutland 213
BI-DIRECTIONAL EMERGENT FUNDAMENTAL PEDESTRIAN
FLOWS FROM CELLULAR AUTOMATA MICROSIMULATION
V.J. Blue, J.L. Adler 235
Chapter 4 - Flow Evaluation on Road Networks 255 - 324

FLOW MODEL AND PERFORMABILITY OF A ROAD NETWORK
UNDER DEGRADED CONDITIONS
Y. Asakura, M. Kashiwadani, E. Hato 257
A SENSITIVITY BASED APPROACH TO NETWORK RELIABILITY ASSESSMENT
M.G.H. Bell, C. Cassir, Y. Iida, W.H.K. Lam 283
A CAPACITY INCREASING PARADOX FOR A DYNAMIC TRAFFIC
ASSIGNMENT WITH DEPARTURE TIME CHOICE
T. Akamatsu, M. Kuwahara 301
Chapter 5 - Traffic Assignment 325- 416

A DYNAMIC TRAFFIC ASSIGNMENT FORMULATION THAT
ENCAPSULATES THE CELL-TRANSMISSION MODEL
H.K. Lo 327
FORMULATIONS OF EXTENDED LOGIT STOCHASTIC USER EQUILIBRIUM
ASSIGNMENTS
S. Bekhor, J.N. Prashker 351
A DOUBLY DYNAMIC TRAFFIC ASSIGNMENT MODEL FOR PLANNING
APPLICATIONS
V. Astarita, V. Adamo, G.E. Cantarella, E. Cascetta 373
ROUTE FLOW ENTROPY MAXIMIZATION IN ORIGIN-BASED
TRAFFIC ASSIGNMENT
H. Bar-Gera, D. Boyce 397
Chapter 6 - Traffic Demand, Forecasting and Decision Tools 417 - 514

THE USE OF NEURAL NETWORKS FOR SHORT-TERM PREDICTION
OF TRAFFIC DEMAND
J. Barcelo, J. Casas 419
ALGORITHMS FOR THE SOLUTION OF THE CONGESTED TRIP MATRIX
ESTIMATION PROBLEM
M. Maher, X. Zhang 445
COMBINING PREDICTIVE SCHEMES IN SHORT-TERM TRAFFIC FORECASTING
N.-E. ElFaouzi 471
A THEORETICAL BASIS FOR IMPLEMENTATION OF A QUANTITATIVE
DECISION SUPPORT SYSTEM - USING BILEVEL OPTIMISATION
A. Clune, M. Smith, Y. Xiang 489
Chapter 7 - Traffic Simulation 515-574

MACROSCOPIC MODELLING OF TRAFFIC FLOW BY AN APPROACH
OF MOVING SEGMENTS
M. Cremer, D. Staecker, P. Unbehaun 517
MICROSCOPIC ONLINE SIMULATIONS OF URBAN TRAFFIC
J. Esser, L. Neubert, J. Wahle, M. Schreckenberg 535
MODELLING THE SPILL-BACK OF CONGESTION IN LINK BASED DYNAMIC
NETWORK LOADING MODELS: A SIMULATION MODEL WITH APPLICATION
V. Adamo, V. Astarita, M. Florian, M. Mahut, J.H. Wu 555
Chapter 8 - Traffic Information and Control 575-662

INVESTIGATION OF ROUTE GUIDANCE GENERATION ISSUES BY
SIMULATION WITH DynaMIT
J. Bottom, M. Ben-Akiva, M. Bierlaire, I. Chabini, H. Koutsopoulos, Q. Yang 577
A NEW FEED-BACK PROCESS BY MEANS OF DYNAMIC REFERENCE
VALUES IN REROUTING CONTROL
A. Poschinger, M. Cremer, H. Keller 601
OPTIMAL CO-ORDINATED AND INTEGRATED MOTORWAY NETWORK
TRAFFIC CONTROL
A. Kotsialos, M. Papageorgiou, A. Messmer 627
PROGRESSION OPTIMIZATION IN LARGE SCALE URBAN NETWORKS.
A HEURISTIC DECOMPOSITION APPROACH
C. Stamatiadis, N.H. Gartner 645
Chapter 9 - Road Tolling and Parking Balance 663 - 732

TOLLING AT A FRONTIER: A GAME THEORETIC ANALYSIS
D.M. Levinson 665
CARPOOLING AND PRICING IN A MULTILANE HIGHWAY WITH
HIGH-OCCUPANCY-VEHICLE LANES AND BOTTLENECK CONGESTION
H.-J. Huang, H. Yang 685
BALANCE OF DEMAND AND SUPPLY OF PARKING SPACES
W.H.K. Lam, M.L. Tam, H. Yang, S.C. Wong 707
Chapter 10 - Traveller Survey and Transit Planning 733 - 796

THE ROLE OF LIFESTYLE AND ATTITUDINAL CHARACTERISTICS
IN RESIDENTIAL NEIGHBORHOOD CHOICE
M.N. Bagley, P.L. Mokhtarian 735
PLANNING OF SUBWAY TRANSIT SYSTEMS
S.C. Wirasinghe, U. Vandebona 759
SCHEDULING RAIL TRACK MAINTENANCE TO MINIMISE OVERALL DELAYS
A. Higgins, L. Ferreira, M. Lake 779
Index 797
PREFACE AND OVERVIEW
These proceedings represent a further step forward in the understanding and solution of
transportation and traffic problems. This is the 14th publication in the series of Symposia on
Transportation and Traffic Theory. We are endeavouring to continue in the tradition of
Professor Robert Herman and a group of scientists who gathered at the General Motors
Research Laboratory in Michigan and initiated this undertaking in 1959. We have all been
inspired by their leadership.
It is with great sadness that we learned that Professor Robert Herman (Honorary Member) and
Professor Michael Cremer (International Advisory Committee) have passed away since our last
symposium. We have included a tribute to them in this book.
This publication goes a long way towards providing innovative, advanced knowledge
regarding traffic and transportation problems and the analytical tools required to solve
them. Transportation issues such as safety, mobility, efficiency, productivity, planning
and the environment are of direct interest to a growing number of professionals, including
those in communication, data processing, electronics and environmental quality, as well as
policy makers. Every advanced transportation or traffic system, existing or under development,
absorbs its "know how" from models, methods and analyses represented by papers like the
ones in this publication. To name but a few: ATMS (Advanced Traffic Management System),
ATIS (Advanced Traveller-Information System), APTS (Advanced Public Transportation
System), ACVO (Advanced Commercial Vehicle Operations) and ETC (Electronic Toll
Collection), all of which are known as ITS (Intelligent Transportation Systems). ITS can be
seen as a link between technology and the driver-vehicle-road system and models.
There is a saying: "A man's real worth is determined by what he does when he has nothing to
do." The contributors to this publication are among a group of scientists who, even when they
have nothing to do, think about how to resolve transportation and traffic problems.
They follow George Bernard Shaw's remark that some people look at existing things that do
not work (here, transportation and traffic problems) and ask "why?", while others dream about
things that do not exist, but would work, and ask "why not?"
This book has been compiled to follow the same session order as the Symposium and each of
the ten chapters has been prefaced by three sayings.
An overview of these chapters has been created in the following table. The thirty-five papers
in this publication represent contributions from sixteen different countries.
Each entry lists the chapter and its authors, the input, and the advance in knowledge.

1. TRAFFIC FLOW MODELS
Authors: [Lebacque, Lesort]; [Hoogendoorn, Bovy]; [Nelson, Sopasakis]; [Daganzo]
Input: Relationships between traffic flow, density, speed, time, spacing and headway, including analogy to fluid dynamics and gas-kinetic theory.
Advance in Knowledge: New formulae that allow a more accurate and better understanding of the various traffic flow scenarios.

2. TRAFFIC FLOW BEHAVIOUR
Authors: [Cassidy, Bertini]; [Newell]; [Kerner]; [Kita, Fukuyama]
Input: Reproducible pattern of traffic congestion around bottlenecks and on-ramps, including the basic model of fluid dynamics.
Advance in Knowledge: Better understanding of traffic evolution around bottlenecks, on-ramps and traffic congestion using vehicle count, occupancy, speed and analytical flow patterns.

3. ROAD SAFETY AND PEDESTRIANS
Authors: [Tracz, Nowakowska]; [Reinhardt-Rutland]; [Blue, Adler]
Input: Methods used for identifying high risk road sections, general belief in seat belts and pedestrian flow models.
Advance in Knowledge: Improved methods for identifying high risk road sections, explaining changes in driving behaviour while using seat belts, and improved pedestrian dynamics for uni- and bi-directional cases.

4. FLOW EVALUATION ON ROAD NETWORKS
Authors: [Asakura, Kashiwadani, Hato]; [Bell, Cassir, Iida, Lam]; [Akamatsu, Kuwahara]
Input: Travel behaviour, origin-destination and capacity models on road networks.
Advance in Knowledge: Performance and reliability measures, and origin-destination patterns for short- and long-term deteriorated road networks.

5. TRAFFIC ASSIGNMENT
Authors: [Lo]; [Bekhor, Prashker]; [Astarita, Adamo, Cantarella, Cascetta]; [Bar-Gera, Boyce]
Input: Analytical and simulated behavioural rules for dynamic traffic assignment.
Advance in Knowledge: New formulations for optimal, dynamic and user equilibrium traffic assignments with route choice, origin and stochastic considerations.

6. TRAFFIC DEMAND, FORECASTING AND DECISION TOOLS
Authors: [Barcelo, Casas]; [Maher, Zhang]; [El Faouzi]; [Clune, Smith, Xiang]
Input: Forecasting procedures and algorithms for short-term origin-destination and trip predictions, and basic decision strategies about traffic demand and choices.
Advance in Knowledge: New and more accurate methods and algorithms for short-term origin-destination and trip predictions using real-time traffic flows, and a decision support model for transportation strategies.

7. TRAFFIC SIMULATION
Authors: [Cremer, Staecker, Unbehaun]; [Esser, Neubert, Wahle, Schreckenberg]; [Adamo, Astarita, Florian, Mahut, Wu]
Input: Traffic as fluid dynamics, fuzzy logic for dynamic route guidance and reproduced route flows on road networks.
Advance in Knowledge: Simulation of traffic flow, density and travel time with examination of dynamic traffic management, and new modelling of congestion spill-back.

8. TRAFFIC INFORMATION AND CONTROL
Authors: [Bottom, Ben-Akiva, Bierlaire, Chabini, Koutsopoulos, Yang]; [Poschinger, Cremer, Keller]; [Kotsialos, Papageorgiou, Messmer]; [Stamatiadis, Gartner]
Input: Analysis framework for route guidance, control theory with optimal signal setting and advanced software for optimal control on roads with ramp metering.
Advance in Knowledge: New algorithms and models for route guidance, optimal traffic signal setting on networks, integration of traffic control strategies and traffic control with a feedback component.

9. ROAD TOLLING AND PARKING BALANCE
Authors: [Levinson]; [Huang, Yang]; [Lam, Tam, Yang, Wong]
Input: Means for road financing, toll collection ideas, road tolling along with high-occupancy-vehicle lanes and attributes of public car parks.
Advance in Knowledge: Better understanding of the welfare implications of road tolling, optimal strategies for congestion pricing with high-occupancy-vehicle lanes, and an optimization model for balancing demand and supply of parking spaces.

10. TRAVELLER SURVEY AND TRANSIT PLANNING
Authors: [Bagley, Mokhtarian]; [Wirasinghe, Vandebona]; [Higgins, Ferreira, Lake]
Input: Connection between residential location and density and urban travel patterns, methods for subway network planning and procedures for rail track maintenance.
Advance in Knowledge: Definition and understanding of the variables that influence travel patterns, an optimal subway plan with minimum system cost and an optimal model for rail track maintenance crews and projects.

The selection of these 35 papers, made through a two-stage international review process, was
particularly difficult. It is for this reason that, for the first time,
22 additional papers will also be published, in a separate bound volume which will be available
at the Symposium. Warmest thanks are due to all the reviewers who completed this arduous
task within a very limited time frame.
Thanks are also due to Mr. J.-B. Lesort, who organised the last Symposium, to the
International Advisory Committee (listed separately) and to the Local Programme Committee
(Y. Berechman, Y. Gur, Y. Israeli, T. Lotan, D. Mahalel, A. Mandelbaum, M. Pollatschek,
Y. Prashker, D. Shefer, Y. Shiftan, I. Salomon).
We are indebted to the following main sponsors: the Transportation Research Institute of the
Technion-Israel Institute of Technology, the Israel Ministry of Transport, the General Motors
Foundation, and the European Commission (DG Transport).
Avishai Ceder
April 1999
INTERNATIONAL ADVISORY COMMITTEE
Prof. E. Hauer University of Toronto, Canada (Convenor)
Prof. R.E. Allsop University College London, UK
Prof. M.G.H. Bell University of Newcastle upon Tyne, UK
Prof. P.H.L. Bovy Delft University of Technology, The Netherlands
Prof. W. Brilon Ruhr University, Bochum, Germany
Prof. A. Ceder Technion - Israel Institute of Technology, Israel
Prof. C.F. Daganzo University of California, Berkeley, USA
Prof. N. Gartner University of Massachusetts at Lowell, USA
Prof. H. Keller Technical University of Munich, Germany
Prof. M. Kuwahara University of Tokyo, Japan
Mr. J.B. Lesort INRETS/ENTPE, Lyon, France
Prof. H. Mahmassani University of Texas at Austin, USA
Prof. Y. Makigami Ritsumeikan University, Japan
Prof. V.V. Silyanov Moscow Automobile and Road Construction Institute, Russia
Prof. M.A.P. Taylor University of South Australia, Australia
Prof. M. Tracz Cracow Technical University, Poland
Prof. S.C. Wirasinghe University of Calgary, Canada
HONORARY MEMBERS
Prof. R. Hamerslag Delft University of Technology, The Netherlands

Prof. M. Koshi Tokyo University, Japan
Prof. W. Leutzbach University of Karlsruhe, Germany
Prof. G.F. Newell University of California at Berkeley, USA
Prof. H.G. Retzko Technical University Darmstadt, Germany
Dr. D.I. Robertson University of Nottingham, UK
Prof. T. Sasaki Kyoto University, Japan
Prof. S. Yagar University of Waterloo, Canada
IN MEMORIAM ROBERT HERMAN
"If you live long enough then even some nice things will happen..." was Robert Herman's
characteristically modest response to well-wishers congratulating him upon receiving any of
the many prestigious awards that he has received over the years in recognition of his
tremendous contribution across a spectrum of fields.
Born August 29, 1914, in New York City, he was graduated cum laude and with special
honors in physics from the City College of New York in 1935, and was awarded the M.A. and
Ph.D. degrees in physics in 1940 from Princeton University. After a distinguished career at
Johns Hopkins' Applied Physics Laboratory, he joined the General Motors Research
Laboratories in 1956, where he was appointed Head of the Theoretical Physics Department in
1959, and later Head of the Traffic Science Department, a position he held until 1979 when he
became General Motors Research Fellow. In September 1979, he joined the faculty of The
University of Texas at Austin as Professor of Physics, in the Center for Studies in Statistical
Mechanics, and L.P. Gilvin Professor in Civil Engineering. He continued as the L.P. Gilvin
Centennial Professor Emeritus in Civil Engineering and sometime Professor of Physics until
his death, at home in Austin, on February 13, 1997, after a battle with lung cancer.
Dr. Herman is widely acknowledged as a pioneer in the rapid development of the field of
vehicular traffic science. He made significant contributions to the areas of single lane and
multiple lane traffic flow. With Ilya Prigogine, he developed a Boltzmann-like kinetic
theory of multi-lane traffic flow which provides what is perhaps the best description up to this
time of this complex traffic situation. He developed a two-fluid model of town traffic
which coupled with observation provides a description of vehicular traffic in an overall
macroscopic sense. In 1959, Dr. Herman organized a General Motors Research Laboratories
symposium on the theory of traffic flow, the first international gathering of its kind on this
subject, and remained actively involved in the organization of the subsequent symposia held
under various auspices around the world. The Tenth International Symposium on
Transportation and Traffic Theory was held in his honor at The Massachusetts Institute of
Technology in July 1987.
His interest in developing a scientific foundation for the study of traffic phenomena came
after a distinguished career as a physicist and cosmologist with far-reaching contributions. In
collaboration with Ralph A. Alpher and George Gamow, he initiated a theory of the origin and
relative abundance of the chemical elements in a relativistic "Big Bang" expanding universe.
In subsequent work, he and Alpher examined the properties of the expanding radiation-matter
universe according to general relativity theory, and in 1948 made the prediction that the
temperature of the residual black-body radiation, a vestige of the initial "explosion" of the "Big
Bang" universe, should be about 5°K, thus anticipating later work on the primordial cosmic
fireball radiation at 2.8°K which pervades the universe homogeneously and isotropically. The
theory was confirmed in the 1960s by scientists at Bell Laboratories trying to solve a problem
of microwave noise.
In recognition of this work, Herman and Alpher were awarded the Henry Draper Medal
from the National Academy of Sciences in 1993. They were recognized "for their insight and
skill in developing a physical model of the evolution of the universe and in predicting the
existence of a microwave background radiation years before this radiation was serendipitously
discovered". Herman also received the Magellanic Premium of the American Philosophical
Society, the oldest scientific award in the United States (1975), and the eighth quadrennial
George Vanderlinden Prix of the Belgian Royal Academy (1975), for the radiation prediction.
In 1980, he and Alpher were awarded the John Price Wetherill Gold Medal of The Franklin
Institute, and in 1981, they received The New York Academy of Sciences Award in Physical
and Mathematical Sciences.
Dr. Herman was a member of Phi Beta Kappa, Sigma Xi, the Washington Philosophical
Society, and the Royal Institution of Great Britain, and a Fellow of the American Physical
Society, the Washington Academy of Sciences, and The Franklin Institute. He was elected to
the National Academy of Engineering in 1978 for his contributions to the science of vehicular
traffic. In 1979, he was elected a fellow in the mathematical and physical sciences of the
American Academy of Arts and Sciences. He was an associate editor of the Reviews of
Modern Physics, was one of the founders of the Transportation Science Section of the
Operations Research Society of America (ORSA, now INFORMS), and became its first
chairman as well as the founding editor of its journal, Transportation Science. He served
as the President of ORSA (1980-1981).
In 1959, Dr. Herman was co-recipient of the Lanchester Prize in Operations Research for
pioneering research on the stability and flow of single-lane traffic. In 1963, he was awarded an
honorary medal by the Universite Libre de Bruxelles, and during that same year received the
Townsend Harris Medal from the Alumni Association of the City College of New York as a
distinguished alumnus and for his scientific contributions. Other prestigious INFORMS
awards include the George E. Kimball Medal (1976) and the John Von Neumann Theory Prize
(1993). Also in 1993, he received the Roy W. Crum Distinguished Service Award of the
Transportation Research Board for "his pioneering contributions to the field of traffic science".
In 1984, he was awarded an Honorary Doctorate in Engineering by the University of Karlsruhe
in recognition of his outstanding research in the mathematical foundations and development of
the theory of traffic flow. He received the first Lifetime Achievement Award of the Operations
Research Society of America's (now INFORMS') Transportation Science Section for his body of work on
vehicular traffic science in 1990. The award was subsequently renamed the Robert Herman
Lifetime Achievement Award.
In October and November 1990, The National Academy of Engineering presented the first
public showing of some of Dr. Herman's small abstract sculptures in exotic wood, which he
had produced over a period of about 30 years.
Notwithstanding all these accomplishments, two activities occupied a very special place in
Bob's heart: the ISTTT series and Transportation Science (the journal). In addition to
symbolizing the coming of age of traffic science as a legitimate field of human scientific
inquiry, and providing it with its scientific pillars, they also defined Bob's special family and
community. With us, he shared the love for the work, the ideas, the excitement and the people
engaged in pushing the intellectual frontiers of this field, through theoretical development and
painstaking measurement. The collective effects of ISTTT gatherings always ensured a far
greater impact than the sum of their individual parts. Bob's unbounded human contributions as a
colleague, friend, mentor and educator to virtually every generation that has entered the
Transportation Science domain will endure. As it showcases recent developments in traffic
science and transportation analysis methods, the 14th ISTTT stands as a testimonial to the
house that Robert Herman built, and a celebration of the life and intellectual energy that he so
freely shared with all of us.
Hani S. Mahmassani
Austin, Texas, February 1999
IN MEMORIAM MICHAEL CREMER

Dr. Michael Cremer, university professor at the Technical University Hamburg-Harburg,
died on September 3, 1998, after a long illness which he bore patiently and with hope of
recovery.
Prof. Cremer studied Electrical Engineering and Control Engineering at the Berlin
University of Technology, received his doctorate from the Ruhr University Bochum, and was
already involved in both Control and Traffic Engineering during his postdoctoral work at the
Munich University of Technology.
He accepted the position of university professor at the Hamburg University in 1979, and at
the Hamburg-Harburg University of Technology in 1992, where he taught in the departments
of Industrial Engineering and Control Engineering. It was there that he built up the
Automation Engineering Unit.
Apart from his research activities in the area of control engineering and its applications to
the distribution of pollutants in waterways, his major research interests were control and traffic
engineering. Michael Cremer made a series of original contributions in this area, such as those
on the dynamic fundamental diagram, on the application of system dynamics to the estimation
of the matrix of traffic flows from traffic counts, and on the development of control rules for the
optimisation of traffic flows in motorway networks.
Cremer produced a large number of simulation tools which have been used and developed
further by his staff and his colleagues. The macroscopic and microscopic traffic flow models
which are offspring of Cremer's institution have been named SIMONE, MAKSIMOS, and
MIKROSIM. The deployment of the Kalman filter, with its wide variety of applications and
extensions, to allow an analysis of traffic flow or incident detection is another example of his
innovative developments in traffic engineering. The consideration of floating car generated
data in macroscopic traffic flow models was the subject of both one of his first and one of his
last scientific achievements.
Michael Cremer played a significant role in the European research and development
programmes, from the design of system architectures within PROMETHEUS and DRIVE to
optimum traffic control strategies for congestion reduction in over-saturated networks within
the projects HERMES, COSMOS and OFFENSIVE, to name a few.
Michael Cremer was an active representative and monitor of science in his capacity as a
member of German and international professional and research associations. The most
noteworthy of these were the committee on "Traffic Flow Theory" of the German
Forschungsgesellschaft für Straßen- und Verkehrswesen and the Advisory Committee to the
"International Symposium on Transportation and Traffic Theory".
Professor Cremer earned a significant national and international reputation with his
innovative contributions in the field of traffic control and engineering. The research
community in general and his colleagues in particular will miss his originality and
profound competence in research, his witty and sophisticated manner and his positive view of life.
Hartmut Keller
1998
CONTRIBUTORS
Adler, J.L. Rensselaer Polytechnic Institute, Troy, NY, USA

Adamo, V. Universita della Calabria, Italy
Akamatsu, T. Dept. of Knowledge-based Information Engineering, Toyohashi
University of Technology, Toyohashi, Japan
Asakura, Y. Dept. of Civil & Environmental Engineering, Ehime University,
Matsuyama, Japan
Astarita, V. Universita della Calabria, Italy
Bagley, M.N. University of California, Davis, USA
Barcelo, J. TSS-Transport Simulation Systems, Barcelona, Spain
Bar-Gera, H. Univ. of Illinois at Chicago, USA
Bekhor, S. Dept. of Civil Engineering, Technion - Israel Institute of
Technology, Haifa, Israel
Bell, M.G.H. Transport Operations Research Group, University of Newcastle,
UK
Ben-Akiva, M. Massachusetts Institute of Technology, USA
Bertini, R.L. Dept. of Civil & Environmental Engineering & Institute of
Transportation Studies, University of California at Berkeley,
USA
Bierlaire, M. Swiss Federal Institute of Technology, Switzerland
Blue, V.J. New York State Dept. of Transportation, Poughkeepsie, NY,
USA
Bottom, J. Massachusetts Institute of Technology, USA
Bovy, P.H.L. Delft University of Technology, Traffic Engineering Section,
The Netherlands
Boyce, D. University of Illinois at Chicago, USA
Cantarella, G.E. Universita di Napoli, Naples, Italy
Casas, J. Universitat de Vic, Spain
Cascetta, E. Universita di Napoli, Italy
Cassidy, M.J. Dept. of Civil & Environmental Engineering & Institute of
Transportation Studies, University of California at Berkeley,
USA
Cassir, C. Transport Operations Research Group, University of Newcastle,
UK
Chabini, I. Massachusetts Institute of Technology, USA
Clune, A. York Network Control Group, Dept. of Mathematics, University
of York, UK
Cremer, M. Technical University of Hamburg-Harburg, Germany
Daganzo, C.F. Inst. of Transportation Studies, University of California,
Berkeley, USA
El Faouzi, N.E. Laboratoire d'Ingenierie Circulation - Transport, Unite Mixte de
Recherche INRETS - ENTPE, Bron, France
Esser, J. Los Alamos National Laboratory, NM, USA
Ferreira, L. School of Civil Engineering, Queensland University of
Technology, Australia
Florian, M. Centre for Research on Transportation, University of Montreal,
Quebec, Canada
Fukuyama, K. Dept. of Social Systems Engineering, Tottori University, Japan

Gartner, N.H. University of Massachusetts at Lowell, MA, USA
Hato, E. Dept. of Civil & Environmental Engineering, Ehime University,
Matsuyama, Japan
Higgins, A. CSIRO, Brisbane, Australia
Hoogendoorn, S.P. Delft University of Technology, Traffic Engineering Section,
The Netherlands
Huang, H.J. Beijing University of Aeronautics & Astronautics, P.R. China
Iida, Y. Transport Operations Research Group, University of Newcastle,
UK
Kashiwadani, M. Dept. of Civil & Environmental Engineering, Ehime University,
Matsuyama, Japan
Keller, H. Fachgebiet Verkehrstechnik und Verkehrsplanung, Technische
Universitat Munchen, Germany
Kerner, B.S. Daimler Chrysler AG, Stuttgart, Germany
Kita, H. Dept. of Social Systems Engineering, Tottori University, Japan
Kotsialos, A. Dynamic Systems & Simulation Laboratory, Technical
University of Crete, Chania, Greece
Koutsopoulos, H. Volpe Transportation Systems Center
Kuwahara, M. Institute of Industrial Science, University of Tokyo, Japan
Lake, M. School of Civil Engineering, Queensland University of
Technology, Australia
Lam, W.H.K. Dept. of Civil & Structural Engineering, Hong Kong University
of Science & Technology, P.R. China
Lebacque, J.P. ENPC-CERMICS, Marnes La Vallee, France
Lesort, J.B. INRETS/ENTPE, Lyon, France
Levinson, D.M. Institute of Transportation Studies, University of California at
Berkeley, USA
Lo, H.K. Dept. of Civil Engineering, Hong Kong University of Science &
Technology, Clear Water Bay, P.R. China
Maher, M. School of Built Environment, Napier University, UK
Mahut, M. Centre for Research on Transportation, University of Montreal,
Quebec, Canada
Messmer, A. Ingenieurburo A. Messmer, Munich, Germany
Mokhtarian, P.L. University of California, Davis, USA
Nelson, P. Dept. of Mathematics, Texas A&M University, College Station,
USA
Neubert, L. Physik von Transport und Verkehr,
Gerhard-Mercator-Universitat, Duisburg, Germany
Newell, G.F. Dept. of Civil & Environmental Engineering & Institute of
USA
Nowakowska, M. Laboratory of Computer Science, Kielce University of
Technology, Kielce, Poland
Papageorgiou, M. Dynamic Systems & Simulation Laboratory, Technical
University of Crete, Chania, Greece
Poschinger, A. Fachgebiet Verkehrstechnik und Verkehrsplanung, Technische
Universitat Munchen, Germany
Prashker, N.J. Dept. of Civil Engineering, Technion - Israel Institute of
Technology, Haifa, Israel
Reinhardt-Rutland, A.H. Psychology Dept, University of Ulster at Jordanstown, UK
Schreckenberg, M. Physik von Transport und Verkehr,
Gerhard-Mercator-Universitat, Germany
Smith, M. York Network Control Group, Dept. of Mathematics, University
of York, Heslington, UK
Sopasakis, A. Dept. of Mathematics, Texas A&M University, College Station,
USA
Staecker, D. Technical University of Hamburg-Harburg, Germany
Stamatiadis, C. University of Massachusetts, USA
Tam, M.L. Dept. of Civil & Structural Engineering, The Hong Kong
University of Science & Technology, P.R. China
Tracz, M. Cracow University of Technology, Poland
Unbehaun, P. Technical University of Hamburg-Harburg, Germany
Vandebona, U. University of New South Wales, Australia
Wahle, J. Physik von Transport und Verkehr,
Gerhard-Mercator-Universitat, Germany
Wirasinghe, S.C. Dept. of Civil Engineering, The University of Calgary, Canada
Wong, S.C. Dept. of Civil Engineering, The University of Hong Kong, P.R.
China
Wu, J.H. Centre for Research on Transportation, University of Montreal,
Quebec, Canada
Xiang, Y. York Network Control Group, Dept. of Mathematics, University
of York, Heslington, UK
Yang, H. Dept. of Civil & Structural Engineering, The Hong Kong
University of Science & Technology, P.R. China
Yang, Q. Caliper Corporation, Boston, USA
Zhang, X. School of Built Environment, Napier University, UK
CHAPTER 1
TRAFFIC FLOW MODELS
Imagination is more important than knowledge. (Albert Einstein)
We think in generalities, we live in details. (Alfred North Whitehead)
Science proceeds more by what it has learned to ignore than what
it takes into account. (Galileo)
MACROSCOPIC TRAFFIC FLOW MODELS:
A QUESTION OF ORDER
J.P. Lebacque, ENPC-CERMICS, Marnes La Vallee, France
J.B. Lesort, LICIT INRETS/ENTPE, Lyon, France
Abstract The aim of this paper is to propose a methodology for the comparison of first and
second order macroscopic traffic flow models. First, we identify a certain number of difficulties
and investigate how various macroscopic models cope with them. Second, we propose a set of
problems or situations which could be used as a test workbench for the comparison of models.
1 INTRODUCTION
Since the first papers by Lighthill and Whitham [1955] and Richards [1956] and the introduction
of higher order models by Payne [1971], a great number of macroscopic traffic flow models have
been developed. They are all based on a few similar variables and assumptions: the state variables
are the flow q(x, t), the density k(x, t) and the mean flow speed u(x, t), defined as:
(1) $u = q/k$
The variables q, k and u are considered as piece-wise differentiable functions of space x and time
t, still allowing the existence of some singularities (discontinuities in space-time corresponding
to shock waves, discontinuities in time corresponding to incidents and discontinuities in space
corresponding to variations in the road layout). The conservation equation constitutes the second
basic equation. It can be written as:
(2) $\frac{\partial q}{\partial x} + \frac{\partial k}{\partial t} = 0$
To realise a complete model, a third equation is necessary. It can be derived in different ways.
The first one, which results in the basic Lighthill-Whitham-Richards (LWR) model, is to consider
an empirical relationship between speed and concentration:
(3) $u(x,t) = u_e(k(x,t))$

These are first order models. In various extensions, the equilibrium relationship may depend on
space (variations in the road layout) and time (occurrence of incidents):
(4) $u(x,t) = u_e(k(x,t), x, t)$
The important point is that this dependence is exogenous to the model and not a part of it: ue is
a function of the three variables k, x, t. A second way to obtain a third equation is to consider
relation (3) only as an equilibrium relationship, and to describe explicitly transitory states, using
for instance a relaxation process expressing the tendency of traffic to tend to an equilibrium.
Models of that kind may be derived from microscopic considerations, as made by Payne [1971],
resulting in a speed equation:
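(5) $\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \frac{1}{\tau}\left(u_e(k) - u - \frac{\nu}{k}\frac{\partial k}{\partial x}\right)$
with $\nu$ an anticipation coefficient and $\tau$ a reaction time. These are higher order models. More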
recently, many higher order models have been developed from the kinetic models generalizing
the model proposed by Prigogine and Hermann [1971]. Examples are [Phillips, 1979], [Helbing,
1997], [Kerner et al., 1996]. To develop such models, some simplifying assumptions have to be
made, such as considering a Gaussian distribution of individual speeds [Helbing, 1997]. The
resulting equations are somewhat similar to Payne's.
For a long time, numerous experiments have been conducted to compare various classes of models
(e.g. [Michalopoulos et al., 1992] and [Papageorgiou et al., 1983, 1989]). Most of these
experiments have concluded that higher order models are superior. However, it must be noted
that most of these experiments had as their first aim to validate a higher order model. For
these experiments, the LWR model discretization and implementation had not been optimized.
On the other hand, the values obtained in the calibration process for parameters such as the reaction
time are often far from physical, as pointed out by Del Castillo et al. [1993]. More recently,
comparisons and analyses have been conducted on a more theoretical basis. Schochet [1988] has
shown that Payne's model converges towards the simple LWR model when the reaction time
tends to zero. Daganzo [1995b] claimed that higher order models included fundamental flaws,
such as negative speeds, which made them unsuitable for practical use. In reaction, models have
been proposed [Liu et al., 1998], [Zhang, 1998] that try to avoid such flaws. In-depth experimental
analyses have been conducted [Kerner and Rehborn, 1997] to understand how theoretical models
are able to explain observations. In turn, these experimental results are contested [Daganzo et al.,
1997], various explanations being given of the same traffic phenomena. To date, no definite
conclusion is possible on the respective advantages and drawbacks of each kind of model.
The purpose of this paper is to propose some clarifications to this debate. In order to achieve
this aim, we first identify some specific difficulties, and assess how the various macroscopic
models deal with them, thus achieving some measure of comparison. Further, we propose a set
of archetypical problems or modelling situations, for which we give some elements of comparison
between models, and on which various models could be tested for performance or simply
modelling capability. Such a comparison procedure is of course quite theoretical, but the results
expected from the models ultimately rely on experience. Actually, some of the proposed
archetypes, such as traffic dispersion or congestion formation, are inspired directly by experimental
data, whereas others, such as intersection or lane drop modelling, result more from theoretical
or logical considerations.
2 THE MAIN MODELING ISSUES

In the literature, a number of theoretical difficulties have been identified for each type of model.
They may be grouped into a few classes, which we shall review now.
2.1 Basic variables
It has long been noticed that most models have difficulties in keeping some of the basic
variables describing the traffic flow dynamics within physically sound limits. This is particularly
true for speed and acceleration.
2.1.1 Flow speed

The nature of the flow speed varies depending on the class of the model. For 1st order models,
the flow speed is only a derived variable, and not an intrinsic one (the model is based only on
flow and concentration). It is one of the reasons why, as noticed by [Buisson et al, 1996], the
calculation of speed in discretized versions is not straightforward.
In higher order models, the flow speed is one of the basic variables, as one of the equations of the
model, say (5), is a speed equation. On the other hand, it has been noticed by [Daganzo, 1995b]
and others that this speed equation (5) may lead, under some initial conditions, to unphysical
values of speed (negative speeds). In reaction, some authors have recently presented models
designed to avoid these negative speeds. However, it seems that these models often present other
inconsistencies. For instance, the speed equation of the model proposed by [Liu et al, 1998] is
given by:
(6)
with $\tau(k)$ a reaction time depending on k. The other equations of the model are the classical
equations (1) and (2). It is then clear that:
• If an initial condition is u = 0 (locally), then $\partial u/\partial t$ is positive, which avoids negative speeds,
• On the other hand, consider the following initial conditions, which are those of a stopped
queue, with $k = k_{jam}$, $u = u_e(k_{jam}) = 0$ for x < 0 and k = 0, $u = v_f = u_e(0)$ for x > 0,
with $v_f$ the desired speed (i.e. q = 0 uniformly). Then $\partial u/\partial t = 0$ (because $u_e(k) = u$ for
all x), $\partial k/\partial t = 0$ (because initially q = 0 uniformly) and the queue will never start again: the
solution is stationary.
It is to date not obvious whether higher order models presenting no risk of negative speed can be
developed.
In first order models, speeds are always physical for they are bounded by 0 and a maximum
equilibrium free speed.
2.1.2 Acceleration
Acceleration is not an intrinsic variable of any macroscopic model. However, it is interesting to
examine whether the acceleration of a vehicle moving with the flow is kept within reasonable
bounds. Indeed, models aiming for instance at the description of sound or pollutant emission by
traffic flow require reasonable acceleration estimates. If w(t) is the speed of such a vehicle, the
acceleration can be written as:
(7) $\frac{dw}{dt} = \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x}$
In the case of first order models, the possibility of the occurrence of unbounded acceleration
values will be illustrated by two examples. First, let us recall that, if the equilibrium speed-
density relationship is dependent on the position x, it follows from the conservation equation and
the speed-density relationship that:
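(8) $\frac{dw}{dt} = -k\left(\frac{\partial u_e}{\partial k}\right)^2\frac{\partial k}{\partial x} + \left(u_e - k\frac{\partial u_e}{\partial k}\right)\frac{\partial u_e}{\partial x}$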
If we consider a homogeneous roadway, the equilibrium speed is only a function of density:
u(x, t) = ue(k(x, t)). Under these conditions the acceleration may be written:
(9) $\frac{dw}{dt} = -k\left(\frac{du_e}{dk}\right)^2\frac{\partial k}{\partial x}$
Since the value of $\partial k/\partial x$ cannot be bounded (through a shock wave, or when a queue discharges), it
is clear that the acceleration may take any value.
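The intermediate steps behind formulas (8) and (9) can be sketched as follows. The sketch assumes only $q = k\,u_e(k, x)$, the conservation equation (2) and the chain rule; no further modelling hypothesis is introduced.

```latex
\begin{align*}
\frac{\partial u}{\partial t} &= \frac{\partial u_e}{\partial k}\,\frac{\partial k}{\partial t},
\qquad
\frac{\partial u}{\partial x} = \frac{\partial u_e}{\partial k}\,\frac{\partial k}{\partial x}
  + \frac{\partial u_e}{\partial x}, \\
\frac{\partial k}{\partial t} &= -\frac{\partial (k\,u_e)}{\partial x}
  = -\left(u_e + k\,\frac{\partial u_e}{\partial k}\right)\frac{\partial k}{\partial x}
    - k\,\frac{\partial u_e}{\partial x}, \\
\frac{dw}{dt} &= \frac{\partial u}{\partial t} + u_e\,\frac{\partial u}{\partial x}
  = -k\left(\frac{\partial u_e}{\partial k}\right)^{2}\frac{\partial k}{\partial x}
    + \left(u_e - k\,\frac{\partial u_e}{\partial k}\right)\frac{\partial u_e}{\partial x}.
\end{align*}
```

The terms in $u_e\,(\partial u_e/\partial k)\,(\partial k/\partial x)$ cancel in the last line. Setting $\partial u_e/\partial x = 0$ then gives the homogeneous case (9).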
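A second instance which may be mentioned is the effect of a variation in the road layout, i.e.
ue is a function of x: u(x, t) = ue(k(x, t), x). It is then possible to examine the acceleration of a
vehicle under steady-state flow conditions with $\partial k/\partial t = 0$ and $\partial q/\partial t = 0$, since the analytical solution
is straightforward. Under these conditions the acceleration of a vehicle can be written:
$\frac{dw}{dt} = \frac{u^2}{\partial q_e/\partial k}\,\frac{\partial u_e}{\partial x}$
This formula results from (8) by noting that the steady state conditions $\partial k/\partial t = 0 = \partial q/\partial t$ imply,
through the conservation equation,
$\frac{\partial q_e}{\partial k}\,\frac{\partial k}{\partial x} + \frac{\partial q_e}{\partial x} = 0$
It can be seen that: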
• For a given road layout profile (given $\partial u_e/\partial x$), the acceleration sign depends on traffic
conditions (fluid or congested),
• Since the derivative $\partial q_e/\partial k$ is bounded by the free speed, if $\partial u_e/\partial x$ is very high, there is no physical
bound to the acceleration. What this presumably means in reality is that the actual use of
the road by traffic (actual capacity) is adapted to the acceleration/deceleration capabilities
of vehicles.
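A small numerical sketch may help to see the sign change stated in the first remark. It assumes, purely for illustration, a Greenshields-type relationship $u_e(k, x) = v_f(x)(1 - k/k_{jam})$ in which only the free speed varies along the road; the function names and the numerical values are hypothetical. The second function evaluates the general formula (8) with $\partial k/\partial x$ fixed by the steady-state condition, as a consistency check.

```python
def steady_state_acceleration(k, v_f, dvf_dx, k_jam):
    """Acceleration u^2 / (dq_e/dk) * du_e/dx under steady-state flow.

    Assumes (for illustration only) u_e(k, x) = v_f(x) * (1 - k / k_jam).
    Not defined at the critical density, where dq_e/dk = 0.
    """
    u = v_f * (1.0 - k / k_jam)               # equilibrium speed u_e
    dqe_dk = v_f * (1.0 - 2.0 * k / k_jam)    # slope of q_e = k * u_e at fixed x
    due_dx = dvf_dx * (1.0 - k / k_jam)       # layout effect at fixed density
    return u * u / dqe_dk * due_dx


def acceleration_from_8(k, v_f, dvf_dx, k_jam):
    """Same quantity from the general formula (8), with dk/dx set by dq/dx = 0."""
    u = v_f * (1.0 - k / k_jam)
    due_dk = -v_f / k_jam
    due_dx = dvf_dx * (1.0 - k / k_jam)
    dqe_dk = u + k * due_dk
    dk_dx = -(k * due_dx) / dqe_dk            # from dq_e/dk * dk/dx + dq_e/dx = 0
    return -k * due_dk ** 2 * dk_dx + (u - k * due_dk) * due_dx


# Hypothetical values: free speed falls by 5 m/s per 100 m, k_jam = 0.15 veh/m.
a_fluid = steady_state_acceleration(k=0.03, v_f=30.0, dvf_dx=-0.05, k_jam=0.15)
a_congested = steady_state_acceleration(k=0.12, v_f=30.0, dvf_dx=-0.05, k_jam=0.15)
print(a_fluid, a_congested)  # opposite signs for the same layout profile
```

With these values the same layout profile produces a deceleration in the fluid state and a slight acceleration in the congested state.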
2.2 Supply representation
The representation of various road layout configurations is one of the problems to be dealt with
by traffic flow models. This problem will be investigated in more detail in the next section, but
two questions are related to some theoretical issues: the notion of capacity, and the occurrence
of discontinuities in the road layout.
2.2.1 Capacities
Even if the capacity has a straightforward definition (maximum possible flow), its representation
is not that obvious. The main point of controversy is the concept of dynamic capacity: for
instance, does the maximum flow depend on the presence of congestion [Papageorgiou, 1998] or
not?
In first order models, the solutions used for the model are entropy solutions which always maximise
the flow. The definition of capacity is thus obvious, since capacity is one of the parameters
of the equilibrium relationship.
On the other hand, if non-entropy solutions are used, as proposed for instance in [Lebacque,
1997] for bounded acceleration models, the notion of capacity corresponds to that of maximum
flow only under steady state traffic conditions. (Actually the flow can never exceed the capacity
value, but it is lower under transitory conditions).
In higher order models, there is a notion of maximum flow in steady state conditions, when the
flow speed is equal to the equilibrium one. On the other hand, outside of these conditions, the
flow may be lower but also higher than this value. Indeed, the speed u and the density k being
independent variables, there is no possibility to control the value of their product q = ku. This is
particularly apparent in Ross's model [Ross, 1988], $\frac{du}{dt} = \frac{1}{\tau}(v_f - u)$ (with $v_f$ the desired speed),
in which the capacity constraint $ku \le q_{max}$ must be added to the model in order to recapture
such basic features as congestion propagation. For Payne's or similar models, Schochet's results
[Schochet, 1988] imply that such excessive values occur only in limited ranges of space and
time. This last result is actually only valid if the viscosity $\nu$ and relaxation time $\tau$ parameters
take physical values, i.e. are very small, as proposed in [Del Castillo et al., 1993]. Otherwise,
convergence towards traffic states satisfying capacity constraints is slow, or takes place over a
large space scale only.
2.2.2 Discontinuities
By discontinuities we mean mainly either space or time discontinuities of the equilibrium relationships
ue and qe. Spacewise discontinuities would be related to abrupt increases or decreases
in capacity. Timewise discontinuities represent accidents or incidents, which imply a more or
less extended capacity restriction for some finite duration. Finally, moving capacity restrictions
might also be considered, such as buses (in urban areas) [Lebacque, Lesort, Giorgi 1998] or slow
moving truck convoys on highways [Newell, 1993 and 1997].
Discontinuities constitute a normal feature of macroscopic models, a necessary consequence of
the continuum hypothesis and of the corresponding approximation. Actually, the size of any
feature smaller than say 50 to 100 meters should be neglected, a remark that applies also to the
unbounded acceleration estimate problem mentioned in subsection 2.1.2.
We shall discuss spacewise discontinuities in more detail in subsection 3.1. A few remarks
are in order here. First order models accommodate such discontinuities well, since boundary
conditions are well defined for these models (see subsection 3.7): the flow at such a discontinuity
is continuous (spacewise) and equal to the minimum between upstream demand and downstream
supply (entropy solution). Since speed is a function of density, a discontinuity of the speed is
generated, implying infinite acceleration and an unrealistic solution, which is flow maximizing,
a point often criticized in the literature [Papageorgiou, 1998]. It might be argued that speed
discontinuities constitute a normal feature of macroscopic models too, i.e. that the velocity
changes abruptly over a range of the same order as the vehicle spacing. Such a velocity gradient is
nevertheless not compatible with the finite acceleration of traffic (and the acceleration capabilities
of vehicles).
Second order models are much better behaved in theory, owing to the damping effect of the
diffusion term (term in $\partial k/\partial x$ in the speed equation (5)): only the acceleration would be
discontinuous. The speed should gradually relax towards the equilibrium speed. Nevertheless, it may
well be that this favorable situation actually results from the fact that variations of the equilibrium
relationships have not been considered in the derivation of higher order models. Indeed,
if such variations are considered, as in subsection 3.1.2, supplementary terms in $\partial u_e/\partial x$ should be
introduced into the speed equation, implying a speed discontinuity at the discontinuity of ue.
Timewise discontinuities will not be discussed in detail here; we refer the reader to [Buisson et
al., 1996], [Mongeot, 1997], [Heydecker, 1994]. The difficulties involved are the same, with the
difference that the density, and not the flow, is conserved at the discontinuity. Moving
discontinuities can be treated by considering the moving frame associated with the discontinuity,
a programme which has been carried out in the case of first order models (see the references
cited above), but not, to the authors' knowledge, in the case of second order models. It is actually
questionable whether moving capacity restrictions of small size, such as buses or convoys, can
be modelled at all with second order models, considering that the model equations may not be
able to enforce a small sized capacity restriction through the mechanism of relaxation towards
the equilibrium state.
2.3 Models solutions
There are two possible ways of computing solutions to traffic models: either to calculate analytical
solutions, which is not always possible, or to derive a discretized model from the continuous one
and to compute simulation solutions.
2.3.1 Analytical solutions

One of the main advantages of first order models is probably that they make it possible to compute
analytical solutions for a wide variety of simple but nontrivial cases. This is due to the simplicity
of the model and the existence of characteristic lines (straight in the homogeneous case)
carrying constant flows and densities. It is thus possible to calculate the analytical solutions of
the Riemann problem for all possible initial and boundary conditions. The necessary initial and
boundary conditions are also clearly defined: the knowledge of initial densities $k(x, t_0)$ is a sufficient
initial condition, and the knowledge of the traffic demands at the entrances of the network
and traffic supplies at the exits are sufficient boundary conditions. Let us recall that following
[Lebacque, 1996b], in the LWR model, the traffic demand at any point x is the greatest outflow at
that point, and the traffic supply is the greatest inflow at that point. These quantities result from
the local left respectively right hand side density values at x through the equilibrium demand and
supply functions:
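(11)
$\Delta_e(\kappa, x) = \begin{cases} q_e(\kappa, x^-) & \text{if } \kappa \le k_{crit}(x^-) \\ q_{max}(x^-) & \text{if } \kappa \ge k_{crit}(x^-) \end{cases}$
$\Sigma_e(\kappa, x) = \begin{cases} q_{max}(x^+) & \text{if } \kappa \le k_{crit}(x^+) \\ q_e(\kappa, x^+) & \text{if } \kappa \ge k_{crit}(x^+) \end{cases}$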
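The demand and supply functions (11) can be sketched in a few lines. The sketch below assumes a Greenshields-type parabolic relationship on each side of the point considered. The class name, the function names and the numerical values are hypothetical. The flow through the point is taken as the minimum of upstream demand and downstream supply, as in the entropy solution of first order models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Greenshields:
    """Parabolic relationship q_e(k) = v_f * k * (1 - k / k_jam), an assumed example."""
    v_f: float    # free speed (m/s)
    k_jam: float  # maximum density (veh/m)

    @property
    def k_crit(self) -> float:
        return self.k_jam / 2.0

    @property
    def q_max(self) -> float:
        return self.v_f * self.k_jam / 4.0

    def q_e(self, k: float) -> float:
        return self.v_f * k * (1.0 - k / self.k_jam)


def demand(fd, k):
    """Delta_e: greatest possible outflow for an upstream density k."""
    return fd.q_e(k) if k <= fd.k_crit else fd.q_max


def supply(fd, k):
    """Sigma_e: greatest possible inflow for a downstream density k."""
    return fd.q_max if k <= fd.k_crit else fd.q_e(k)


def interface_flow(fd_up, k_up, fd_down, k_down):
    """Flow through a point: minimum of upstream demand and downstream supply."""
    return min(demand(fd_up, k_up), supply(fd_down, k_down))


# Hypothetical lane drop: three lanes upstream, two lanes downstream.
upstream = Greenshields(v_f=30.0, k_jam=0.45)
downstream = Greenshields(v_f=30.0, k_jam=0.30)
print(interface_flow(upstream, 0.30, downstream, 0.05))  # limited by downstream capacity
```

In this example the upstream section is congested, so its demand equals its capacity, and the flow is set by the smaller downstream capacity.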
Concerning higher order models, the possibilities to compute analytical solutions are much more
restricted. For some specific models such as the model proposed by Ross [Ross, 1988], analytical
solutions can be derived in a variety of cases, as shown by [Lebacque, 1995]. This is why
Ross's model, although it has been much criticized (see [Newell, 1989] and Ross's response,
[Ross, 1989]), constitutes a simplified archetype of second order models (much as Greenshields'
model or the simplified piecewise linear equilibrium flow-density relationship model in [Daganzo,
1994] do for first order models).
An interesting example of analytical calculations for second order models was given by Kühne
in his model [Kühne, 1984]:
$\frac{du}{dt} = \frac{1}{\tau}\left(u_e(k) - u\right) - \frac{c_0^2}{k}\frac{\partial k}{\partial x}$
This model results from Phillips' model, in which the anticipation term of Payne's model is
replaced by a pressure term
$-\frac{1}{k}\frac{\partial \mathcal{P}}{\partial x}$
(as resulting from a kinetic model) by assuming that the traffic pressure $\mathcal{P}$ is a linear function
of the density. This model, with the addition of a viscosity term in $\partial^2 u/\partial x^2$, a special choice of the
viscosity coefficient ($\nu \propto 1/k$) and of the equilibrium relationship
$u_e(k) = u^0\left[\frac{1}{1 + \exp\left((k/k_{max} - 0.25)/0.06\right)} - 3.72 \cdot 10^{-6}\right]$
became the study object of Kerner and Konhäuser [Kerner et al., 1996].
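The equilibrium relationship quoted above can be evaluated directly. The following sketch uses hypothetical values for $k_{max}$ and $u^0$ (they are not taken from the cited papers) and suggests that the small constant $3.72 \cdot 10^{-6}$ serves to bring the equilibrium speed to practically zero at maximum density.

```python
import math


def equilibrium_speed(k, k_max, u0):
    """u_e(k) = u0 * (1 / (1 + exp((k/k_max - 0.25) / 0.06)) - 3.72e-6)."""
    return u0 * (1.0 / (1.0 + math.exp((k / k_max - 0.25) / 0.06)) - 3.72e-6)


# Hypothetical parameter values, chosen only for illustration.
K_MAX = 0.15   # maximum density (veh/m)
U0 = 30.0      # speed scale (m/s)

# Equilibrium speed at 0 %, 10 %, ..., 100 % of the maximum density.
speeds = [equilibrium_speed(i * K_MAX / 10, K_MAX, U0) for i in range(11)]
for i, speed in enumerate(speeds):
    print(f"k/k_max = {i / 10:.1f}  u_e = {speed:.6f} m/s")
```

The speed decreases monotonically with density, stays slightly below $u^0$ at zero density, and is negligible at $k = k_{max}$.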
2.3.2 Simulation
Discretizations have been proposed for most models presented in the literature. Concerning first
order models, various schemes have been proposed, including [Lebacque 1984] (in which the
Godunov flux was introduced in a heuristic way), [Michalopoulos et al., 1984a] (based on a
shock-fitting Lax-Wendroff scheme), [Michalopoulos et al., 1984b], [Bui et al., 1992] (using
Osher's formula for the Godunov flux function), [Leo and Pretty, 1992] (applying Roe's approximate
formula for the Godunov flux function to several models including the LWR model). More
recently, discretizations based on the Godunov scheme and presenting a better consistency with
the continuous model have been proposed by [Lebacque, 1996b] and [Daganzo, 1994]. However,
few systematic analyses of the discretization question have been made. An older example is
provided by [Leo and Pretty, 1992] and a more recent example is [Zhang and Wu, 1997].
On the other hand, some models exist only in a discretized form. This is the case of the model
proposed by [Hilliges, 1995], or of some models introducing bounded acceleration (the phenomenological
model in [Lebacque, 1997]). The main drawback of this kind of model is that
the behaviour of the model is dependent on the discretization parameters (time $\delta t$ and space $\delta x$
discretization steps), and that the convergence of the model is not guaranteed when $\delta x$ and $\delta t$
tend towards zero. Considering the equivalent equation is usually not helpful. Furthermore, the
convergences towards zero of $\delta x$ and $\delta t$ should not be considered independently, as the following
cursory analysis of numerical viscosity effects (and also Schochet's convergence results) shows.
When the discretization scheme is consistent with the continuous model, the only effect of the
discretization is to introduce a numerical viscosity into the model behaviour. The effect of this
viscosity is visible on a simple example. Let us consider a simple first order model, under fluid
traffic conditions. Let
• $q^I(t)$ and $q^O(t)$ be the flows entering and leaving a discretization segment during time-step
$[t, t + \delta t]$,
• k(t) be the mean density at time t.
Under fluid conditions, and at first order approximation (for example a perturbation of a stationary
state), $q^I(t)$ can be considered as given, and $q^O(t + \delta t)$ written as:
$q^O(t + \delta t) = q_e(k(t)) + \frac{\delta t}{\delta x}\frac{dq_e}{dk}\left(q^I(t) - q^O(t)\right)$ by first order Taylor expansion
(with $q_e(k(t)) = q^O(t)$ under fluid conditions).
This is actually a smoothing formula (a fact not incompatible with the existence of shock waves)
which bears some resemblance to the experimental TRANSYT smoothing formula [Robertson,
1969].
Introducing the maximum speed $u_{max}$, the viscosity factor $\frac{\delta t}{\delta x}\frac{dq_e}{dk}$ can be considered as the product
of two terms:
• A physical and intrinsic term $\frac{1}{u_{max}}\frac{dq_e}{dk}$ which expresses that the viscosity is a function of
density
• A purely numerical term $u_{max}\frac{\delta t}{\delta x}$ (the Courant-Friedrichs-Lewy number)
A symmetric formula relating $q^I(t + \delta t)$ to $q^I(t)$ and $q^O(t)$ applies in the congested case.
The Courant-Friedrichs-Lewy (CFL) condition $\delta x \ge u_{max}\,\delta t$ requires this numerical term to
be no greater than 1 in order to guarantee the stability of the model. On the other hand, the smaller
this term, the higher the numerical viscosity (therefore it should be as close to one as feasible in
order to optimize the discretization).
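The effect of the CFL number on numerical viscosity can be illustrated with a minimal Godunov (demand/supply) scheme. The sketch below assumes a triangular flow-density relationship, for which fluid states travel at exactly the free speed. The function names and parameter values are hypothetical and the code is not the implementation of any of the cited schemes.

```python
def godunov_step(k, dt, dx, v_f, w, k_jam, inflow=0.0):
    """One demand/supply (Godunov) update of the cell densities k.

    Assumes a triangular flow-density relationship with free speed v_f and
    backward wave speed w, and a downstream exit with sufficient supply.
    """
    if v_f * dt > dx:
        raise ValueError("CFL condition violated: need dx >= v_f * dt")
    q_max = v_f * w * k_jam / (v_f + w)
    demand = [min(v_f * d, q_max) for d in k]
    supply = [min(q_max, w * (k_jam - d)) for d in k]
    # flux[i] enters cell i; flux[len(k)] leaves the last cell.
    flux = [min(inflow, supply[0])]
    flux += [min(demand[i], supply[i + 1]) for i in range(len(k) - 1)]
    flux.append(demand[-1])
    return [k[i] + dt / dx * (flux[i] - flux[i + 1]) for i in range(len(k))]


def peak_after_ten_cells(cfl, dx=100.0, v_f=25.0, w=5.0, k_jam=0.15):
    """Peak density of a one-cell platoon after travelling about ten cells."""
    dt = cfl * dx / v_f
    k = [0.0] * 30
    k[2] = 0.02                      # fluid state: below the critical density
    for _ in range(round(10 / cfl)):
        k = godunov_step(k, dt, dx, v_f, w, k_jam)
    return max(k)


print(peak_after_ten_cells(1.0))   # platoon transported without smoothing
print(peak_after_ten_cells(0.5))   # platoon flattened by numerical viscosity
```

With a CFL number of 1 the platoon keeps its shape, whereas with a CFL number of 0.5 its peak density falls to a fraction of the initial value over the same distance, although nothing in the continuous model disperses it.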
An extreme example is given by some flow models used for dynamic assignment, which can
be considered as first order models continuous in time, with an imposed space discretization
(the link). The basic example is the model proposed by [Merchant and Nemhauser, 1978], of
which a time-continuous version has been given by [Friesz et al., 1989]. Indeed, a one segment
Godunov discretization of a link yields (assuming that the downstream supply is sufficient):
$x(t + \delta t) = x(t) + \delta t\left[q^I(t) - \Delta_e(x(t)/l)\right]$
with $q^I$ the given inflow, x(t) the total number of vehicles at time t, $\Delta_e$ the demand function and
x(t)/l the mean density, l the length of the link. Taking $\delta t \to 0$, the following model results:
$\frac{dx}{dt}(t) = q^I(t) - g(x(t))$
with
$g(x) = \Delta_e(x/l)$
As explained by [Astarita, 1996] these models present a high degree of numerical viscosity,
resulting in unphysical behaviour (null or infinite travel-times ...).
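The null travel-time effect can be seen on a minimal sketch of such a link model. The sketch integrates the single-link equation with an explicit Euler step and assumes a triangular demand function. The function name and all numerical values are hypothetical.

```python
def simulate_link(inflow, dt, length, v_f, q_max, steps):
    """Explicit Euler integration of dx/dt = q_in - g(x), with g(x) = demand(x / length).

    A triangular demand function min(v_f * k, q_max) is assumed for illustration.
    Returns the outflow recorded at the start of each time step.
    """
    x = 0.0                      # vehicles on the link (initially empty)
    outflows = []
    for _ in range(steps):
        out = min(v_f * x / length, q_max)
        outflows.append(out)
        x += dt * (inflow - out)
    return outflows


# Hypothetical link: 2 km long, free speed 25 m/s, so free-flow travel time is 80 s.
outflows = simulate_link(inflow=0.4, dt=1.0, length=2000.0, v_f=25.0, q_max=0.6, steps=600)
print(outflows[1])   # already positive after 1 s, long before 80 s
```

Vehicles start leaving the link one time step after the first vehicle has entered it, although no vehicle could physically reach the exit before the free-flow travel time has elapsed.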
Numerical viscosity can be considered as a positive effect, as it looks like a platoon dispersion
effect (this will be developed further on). However, it must be kept in mind that part of this is a
purely numerical effect, with little possibility to fit it to physical observations.
Second order models in simulation have been analyzed carefully, notably by [Papageorgiou et al.,
1989] and [Michalopoulos et al., 1992]. For lack of sufficient supportive mathematical
results (the theory is not as well developed as that of conservation equations, few analytical solutions
or convergence results are known), analyses concentrate on stability, parameter calibration
and ability to reproduce observations satisfactorily. The idea of these simulations is to use cells,
cell-averages, and finite difference schemes, an approach similar to the one used with first order
models. The cell dimension is liable to be much greater than in discretized first order models.
Some allowance must be made for special features: adding a constant $\kappa$ to the density in the 1/k
terms for instance, in order to avoid division-by-zero problems. The equation q = ku must be
discretized with care too; [Papageorgiou et al., 1989] proposed a linear interpolation formula for
the cell outflow of cell (i) during time-step k, $q_i^k$, relating it to the mean flows of cells (i) and
(i + 1):
(13) $q_i^k = \alpha\, k_i^k u_i^k + (1 - \alpha)\, k_{i+1}^k u_{i+1}^k$
Flux vector splitting upwind schemes were proposed too [Michalopoulos et al., 1992], and may
yield more rigorous discretizations. At this point, parameter calibration still has an enormous
importance for second order models, which means that discretized second order models should be
considered as phenomenological in that sense, and as having an existence in their own right,
irrespective of the continuous model from which they are derived. This is possibly a good thing:
corrective features can be introduced into discretized second order models, correcting some flaws
of the corresponding continuous models, such as those pointed out in [Daganzo, 1995c] (negative
speeds for instance), possibly introducing capacity bounds too.
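The two discretization devices mentioned above, the interpolation formula (13) and the constant $\kappa$ added to the density, can be sketched as follows. The function names, the restriction of $\alpha$ to [0, 1], the plain forward difference and the numerical values are assumptions of this sketch, not the scheme of [Papageorgiou et al., 1989].

```python
def cell_outflow(k_i, u_i, k_next, u_next, alpha):
    """Interpolated outflow of cell i, formula (13), with 0 <= alpha <= 1 assumed."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * k_i * u_i + (1.0 - alpha) * k_next * u_next


def anticipation_term(k_i, k_next, dx, nu, tau, kappa):
    """Forward-difference sketch of -(nu / tau) * (dk/dx) / (k + kappa).

    The constant kappa > 0 keeps the term finite when the cell is empty.
    """
    return -(nu / tau) * (k_next - k_i) / (dx * (k_i + kappa))


# Hypothetical cell states (densities in veh/m, speeds in m/s).
print(cell_outflow(0.02, 25.0, 0.04, 15.0, alpha=0.8))
print(anticipation_term(0.0, 0.01, dx=500.0, nu=20.0, tau=20.0, kappa=0.01))
```

Without $\kappa$, the second call would divide by zero, since the upstream cell is empty.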
2.4 Models vs. reality
The physical soundness of a model can only be checked by comparison to real world observa-
tions. This means that
• The model must be able to explain observed physical phenomena,
• It can be identified using real data,

• The values of the model variables and parameters must be kept within physically sound
bounds.
2.4.1 Interpretation of measurements

The basic variables of macroscopic models, flow, speed and density, are readily accessible experimentally.
It is thus interesting to analyse whether various kinds of models may explain the
values of these variables as observed on actual traffic. This has been done through many investigations,
starting with [Greenshields, 1935], but surprisingly few definite conclusions can be
drawn from this literature.
For steady-state conditions, an important analysis has been made by [Cassidy, 1998], who, using
an original filtering technique, has shown the validity of the speed/flow equilibrium relationships.
This is an important result, but it does not confer any kind of superiority on any model, for all
models behave quite similarly under steady-state conditions.
Concerning dynamic conditions, many phenomena can occur, resulting in global perturbations
which are not easy to explain:
• The demand/supply mechanism may result, at a measurement point, and for a given value
of flow, in various density values: oscillation between both fluid and congested states,
mixed into a single observation period, can yield different values of the density, determin-
istic scatter and hysteresis, even if the flow is constant;
• The composition of traffic, particularly with respect to the various destinations, which is
never measured, may result in the occurrence of unexplained perturbations, particularly at
diverge points. This has been observed for instance by [Daganzo et al., 1998].
It is only recently that systematic analyses of observations in relation to traffic models have been
conducted. Classical experiments have mainly focused on the calibration/global validation of the
model on a statistical basis. One of the first attempts at an in-depth analysis has been made by
[Kerner and Rehborn, 1997]. Considering their paper and the explanations given of the same and
similar data by [Daganzo et al., 1997], it is interesting to notice how multiple explanations can
be given of the same physical phenomena, each following its own modelling rationale.
Other phenomena are much more difficult to explain, such as the constant size perturbations
observed by Kerner and Rehborn, which cannot be explained through classical models. It can
be noticed that it would be possible to derive some non-entropy solutions of the LWR model
explaining these perturbations. However, these solutions would lead to instability problems.
2.4.2 Parameters values

This topic will be considered again in the sequel. At this point, it can be noted that first order
models require few parameters, which all have a physical meaning and are easily identified by
network operators (maximum density, speed, flow, critical density and flow). On the other hand,
second order models require the same parameters, which are those of the equilibrium
relationships, plus many others whose physical meaning is not always obvious, with resulting
values contradicting both theory and common sense (reaction time $\tau$ of the order of 30 sec. for
instance) [Cremer and Papageorgiou 1981], [Papageorgiou et al., 1989], but also [Michalopoulos
et al., 1992] (parameter $t_0$). These parameters require extensive identification. A good example is
given by [Papageorgiou et al. 1989], in which the speed equation is given by:
$\frac{\partial u}{\partial t} + \zeta u \frac{\partial u}{\partial x} = \frac{1}{\tau}\left(u_e(k) - u - \frac{\nu}{k + \kappa}\frac{\partial k}{\partial x}\right)$
and the discretized version of the corresponding model requires the parameter values of the
equilibrium relationship ue, of the physical parameters $\tau$, $\delta$, $\nu$, and of the nonphysical parameters
$\kappa$, $\zeta$ and $\alpha$ (this last one, already mentioned, is specific to the discretization equation (13)). Let
us recall that in the above, g would be the ramp-inflow for instance. The issue of the reaction time
$\tau$ is also illuminating, since the estimated values vary from 1 to 50 seconds or more depending
on the authors and very different functional forms (for $\tau$ as a function of density k) have been
proposed (see [Del Castillo et al., 1993] and [Michalopoulos et al., 1992]).
3 ANALYSIS OF SOME TYPICAL CASES

This section is devoted to the definition of problems on which to test macroscopic models. Since
traffic flow is essentially nonlinear, there exists no set of problems allowing an exhaustive com-
parison. Therefore, the problems described hereafter represent simply a set of basic problems,
aimed at the representation of the difficulties described in the previous section and some other
difficulties related to measurements.
3.1 Variation of the road layout
This point has already been mentioned in the paragraph dealing with the basic variables of the
model. The basic problem is to represent a discontinuity in the road layout, i.e. a sudden increase
or decrease in capacity (addition or suppression of one lane, for instance).
The study of two different situations is of interest:
• Steady-state traffic conditions, either fluid or congested,
• A variation of traffic demand upstream (or supply downstream).
3.1.1 First order models

It has been explained in [Lebacque, 1996b] how the supply/demand expressions provide complete
boundary conditions for first order models. This makes possible a very simple resolution of the
Riemann problem when there is a discontinuity relative to the variable $x$ in the equilibrium function
$q_e(k, x)$: the supply at the discontinuity point is given by the function $q_e(k, x^+)$ corresponding
to downstream conditions, and the demand is computed using the function $q_e(k, x^-)$
corresponding to upstream conditions. The boundary condition is given by the continuity of
traffic flow through the discontinuity, which can also be expressed as a stationary shock wave:
denoting by $x^+$ and $x^-$ the points immediately downstream and upstream of the discontinuity $x$, the
speed of the shock wave is expressed as:
$$u_s(t) = \frac{q(x^+,t) - q(x^-,t)}{k(x^+,t) - k(x^-,t)}$$
The null speed of the shock wave implies $q(x^+, t) = q(x^-, t)$ for all $t$. The following examples show
the solutions in different cases. Solutions for capacity increase and decrease are completely
symmetric, depending on the fluid/congested conditions.
Ex 1. Capacity restriction, demand below capacity
Ex 2. Capacity restriction, increase of the demand from below to above capacity
Ex 3. Capacity expansion, increase of downstream supply from below to above upstream capacity
(this case is symmetric to the previous one).
The lines in these examples are either characteristics (straight lines) or the obvious shock waves.
What is important to notice is that:
• Accelerations are not bounded and are infinite in some cases
• The traffic flow through the discontinuity is always maximized by the entropy solutions
computed by the supply/demand process. Under stationary conditions anyway, the accel-
erations would be infinite whatever the solution used since the flow is uniform and the
concentration is not.
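The supply/demand computation at a capacity discontinuity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the equilibrium relationship is taken to be Greenshields' parabola on both sides, the demand and supply functions follow the usual definition (equilibrium flow on the undercritical, respectively overcritical, branch and capacity otherwise), and all function names and parameter values are hypothetical.

```python
def greenshields_diagram(v_free, k_jam):
    """Parabolic equilibrium flow q_e(k) = k * v_free * (1 - k / k_jam)."""
    k_crit = k_jam / 2.0          # density at which q_e is maximal
    q_max = v_free * k_jam / 4.0  # capacity of the section

    def q_e(k):
        return k * v_free * (1.0 - k / k_jam)

    return q_e, k_crit, q_max


def demand(k, diagram):
    """Largest flow the upstream section can send (assumed definition)."""
    q_e, k_crit, q_max = diagram
    return q_e(k) if k <= k_crit else q_max


def supply(k, diagram):
    """Largest flow the downstream section can receive (assumed definition)."""
    q_e, k_crit, q_max = diagram
    return q_max if k <= k_crit else q_e(k)


def flow_through_discontinuity(k_up, diagram_up, k_down, diagram_down):
    """Flow at the discontinuity: minimum of upstream demand and downstream supply."""
    return min(demand(k_up, diagram_up), supply(k_down, diagram_down))


# Illustrative lane drop from three lanes to two (densities in veh/m, speeds in m/s).
three_lanes = greenshields_diagram(v_free=30.0, k_jam=0.45)
two_lanes = greenshields_diagram(v_free=30.0, k_jam=0.30)

# Case of Ex 1: demand below the downstream capacity, the flow passes unchanged.
flow_light = flow_through_discontinuity(0.05, three_lanes, 0.0, two_lanes)
# Case of Ex 2: demand above the downstream capacity, the flow saturates.
flow_heavy = flow_through_discontinuity(0.15, three_lanes, 0.0, two_lanes)
```

In the second case the flow equals the downstream capacity, which is the maximization property noted in the list above.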
3.1.2 Higher order models

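The situation is somewhat less clear for second order models. In principle, the terms in $\partial k/\partial x$ in the
speed equation should smooth the discontinuities. Let us consider for instance Payne's model:
$$\frac{du}{dt} = \frac{1}{\tau}\left( u_e(k) - u - \frac{\nu}{k}\frac{\partial k}{\partial x} \right) \tag{14}$$
The stationary solution of this model satisfies:
$q = Q$ independent of $x$ and $t$
$k = k(x)$ function of $x$ only
$u = u(x)$ function of $x$ only
$Q = ku$, implying $\frac{1}{k}\frac{dk}{dx} = -\frac{1}{u}\frac{du}{dx}$
It follows from (14) that:
$$u\,\frac{du}{dx} = \frac{1}{\tau}\left( u_e\!\left(\frac{Q}{u}\right) - u + \frac{\nu}{u}\frac{du}{dx} \right) \tag{15}$$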
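Let us consider Greenshields' relationship:
$$u_e(k) = V_f\left(1 - \frac{k}{k_j}\right)$$
(and $\nu = V_f/(2k_j)$), with parameters $V_f$ and $k_j$ having different values on the left- and
right-hand side of the origin. The generic solution of (15) (i.e. excluding the particular solution
$u = (u^2 + 2\nu Q)/V_f$) is given by:
$$\frac{(u^2 - \nu/\tau)\,du}{u^2 - V_f u + 2\nu Q} = -\frac{dx}{\tau} \tag{16}$$
with boundary conditions given at infinity by the equilibrium conditions:
$$u(\pm\infty) = u_e\big(Q/u(\pm\infty)\big), \quad \text{i.e.} \quad u^2 - V_f u + 2\nu Q = 0$$
(with the parameter values of the corresponding side).
$u$ is continuous at the origin, but $du/dx$ is not, yielding for instance in the diverge case a solution of
the following type: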
(the exact functional form depending on the exact value of Q). It should be emphasized here
that nothing prevents Q from being equal to the maximum throughflow of the system (i.e. the
smaller of the maximum equilibrium flow values on the left- and right-hand-side of the origin).
This observation suggests that in general, outside transitory (non-stationary) situations, second and
first order models should yield the same outflows at, say, capacity restrictions.
This last remark is consistent both with Schochet's [1988] convergence result and with Del
Castillo et al.'s [1993] analysis, which showed that for theoretically consistent values of the
parameters $\nu$ and $\tau$ of Payne's model:
$$\tau(k) = -\frac{1}{2k^2\, u_e'(k)}$$
$$\nu(k) = -\frac{1}{2}\, u_e'(k)$$
the LWR and Payne models yield very close results.

There is nevertheless a fundamental difficulty which must be mentioned here: the derivation of
second order models usually does not take into account the effect of the possible variability of
the equilibrium speed-density relationship.
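Let us consider for example the derivation of Payne's model from a car-following model:
$$\frac{dx_n}{dt}(t+\tau) = A\big(x_{n-1}(t) - x_n(t)\big) \tag{17}$$
with $x_n$ the position of the n-th vehicle and $A(1/k) = u_e(k)$. Now, if $u_e$ depends on the position
$x$, we should rewrite (17) as:
$$\frac{dx_n}{dt}(t+\tau) = A\big(x_{n-1}(t) - x_n(t),\; x_{n-1}(t)\big) \tag{18}$$
or:
$$\frac{dx_n}{dt}(t+\tau) = A\Big(x_{n-1}(t) - x_n(t),\; \tfrac{1}{2}\big(x_{n-1}(t) + x_n(t)\big)\Big) \tag{19}$$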
There is not really any argument enabling us to prefer one of the formulas (18), (19) over the
other. Both formulas express the fact that the driver anticipates the variation of the physical
layout of the track; they differ by the range of this perception of the driver. Let us consider (19),
which results in the simplest calculations and is consistent with Payne's original approximation.
With the usual approximations:
$$x_n \approx x$$
$$\dot{x}_n \approx u(x,t)$$
$$x_{n-1} - x_n \approx 1/k\big(x + 1/(2k(x,t)),\, t\big)$$
$$(x_{n-1} + x_n)/2 \approx x + 1/(2k(x,t))$$
it follows:
$$\frac{du}{dt} = \frac{1}{\tau}\left( u_e(k,x) - u + \frac{1}{2k}\frac{\partial u_e}{\partial k}(k,x)\frac{\partial k}{\partial x} \right) + \frac{1}{2k\tau}\frac{\partial u_e}{\partial x}(k,x) \tag{20}$$
introducing a supplementary term
$$\frac{1}{2k\tau}\frac{\partial u_e}{\partial x}$$
accounting for a specific acceleration due to the variation of the physical environment.
The same remark applies to the more recent model of Zhang [1998]. Indeed, in keeping with
the derivation of the model, it would be necessary to start from the following macro-micro traffic
model:
$$\frac{dx_n}{dt}(t+\tau) = u_e\big(k(x_n+\Delta, t),\; x_n+\Delta\big) \tag{21}$$
(replacing equation (16) of the above-mentioned paper). In the above equation (21), we expressed
the awareness of the driver of the physical layout ahead of him (at a distance $\Delta$), i.e. the
anticipatory nature of the driver's behavior, in keeping with the ideas of the paper [Zhang 1998].
Now, after setting $x = x_n(t)$, $\dot{x}_n(t) = u$, and developing at first order in $\tau$ and $\Delta$, it follows:
$$\frac{du}{dt} = \frac{1}{\tau}\big(u_e(k,x) - u\big) + \frac{\Delta}{\tau}\left(\frac{\partial u_e}{\partial k}(k,x)\frac{\partial k}{\partial x} + \frac{\partial u_e}{\partial x}(k,x)\right) \tag{22}$$
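Since the characteristic speed is unchanged, Zhang's argument still applies to yield:
$$\frac{\Delta}{\tau} = -k\,\frac{\partial u_e}{\partial k} = u_e - \frac{\partial q_e}{\partial k}$$
Finally, Zhang's model becomes:
$$\frac{du}{dt} = \frac{1}{\tau}\big(u_e(k,x) - u\big) - k\,\frac{\partial u_e}{\partial k}(k,x)\left(\frac{\partial u_e}{\partial k}(k,x)\frac{\partial k}{\partial x} + \frac{\partial u_e}{\partial x}(k,x)\right) \tag{23}$$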
It is worth mentioning that this last model (23) does not admit the LWR model as its limiting
case when $u_e(k, x) = u$, since the acceleration in the LWR model is given by (8) when the
equilibrium speed-density relationship depends on the position.
As a conclusion, some further research should go into adapting second order models to the
possible variability of physical characteristics of the links. Some ideas in that direction have been
developed in a purely numerical setting, for instance in [Papageorgiou et al., 1989] (the $\alpha$
coefficients mentioned previously (13)).
3.2 Multiclass-multibehavior/multilane traffic modelling.
Both problems, which are related, have been treated in the literature. The necessity of these
problems (and the related intersection modelling problem) stems from the fact that traffic models
should actually describe flow on networks, and not simply on the real line.
We should distinguish between multibehavior flow, in which the equilibrium relationships
applying to different classes of users are different (for instance multilane flow), and multiclass
flow, in which users may belong to different classes, as for instance in the dynamic assignment
context, but whose equilibrium relationships are the same. Multiclass models are an inescapable
ingredient of any traffic flow model aiming at representing dynamic assignment situations.
Considering the case of first order models, multiclass flow description relies on the partial
flows-densities concept. For each user class $d$, partial flow $Q_d$, density $K_d$ and speed $V_d$ are introduced;
the partial flow and density are related by the usual conservation equation and the speed
definition equation $Q_d = K_d V_d$. An equation for $V_d$ yields a complete traffic flow model for class
$d$. The simplest possible model is the FIFO model $V_d = V$ for all $d$ (all users have the same speed,
irrespective of their class). Characteristic of the FIFO partial flow model is the fundamental fact
that the compositions of density and flow are the same. Such a FIFO model was introduced in a
discretized version in [Daganzo, 1994]; a non-FIFO extension was proposed (still in a discretized
setting) in the STRADA model [Buisson et al., 1996]. A continuous version of this model is
described in [Lebacque and Khoshyaran, 1998]: the partial speed $V_d$ results from the partial flow
$Q_d$, estimated directly at each point as the minimum of the downstream partial supply and
upstream partial demand. The behavioral part of the model is contained in the derivation of partial
supplies and demands from their global counterparts.
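The FIFO property stated above (identical composition of density and flow) can be checked with a minimal Python sketch. The Greenshields equilibrium speed, the class names and the numerical values are assumptions chosen for illustration only.

```python
def fifo_partial_flows(partial_densities, u_e):
    """Partial flows Q_d = K_d * V when every class travels at the common speed V."""
    total_density = sum(partial_densities.values())
    common_speed = u_e(total_density)  # FIFO model: V_d = V for every class d
    return {d: k_d * common_speed for d, k_d in partial_densities.items()}


def composition(quantities):
    """Share of each class in a set of partial densities or partial flows."""
    total = sum(quantities.values())
    return {d: value / total for d, value in quantities.items()}


def u_e(k, v_free=30.0, k_jam=0.15):
    """Assumed Greenshields equilibrium speed (m/s) for a density k in veh/m."""
    return v_free * (1.0 - k / k_jam)


densities = {"cars": 0.04, "trucks": 0.01}  # hypothetical partial densities K_d
flows = fifo_partial_flows(densities, u_e)

density_shares = composition(densities)
flow_shares = composition(flows)
```

With these values the common speed is 20 m/s, and both `density_shares` and `flow_shares` give 80 % cars and 20 % trucks. A non-FIFO model would replace `common_speed` by class-dependent speeds, and the two compositions would then differ.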
Second order models have a much older history of multiclass models. We might cite for
instance the Payne-derived METANET [Messner and Papageorgiou, 1990], in which the partial
flow model is approximately FIFO: the discretized partial flows are proportional to the
composition of the discretized density. The METANET- and SSMT-derived METACOR [Elloumi et
al., 1994] follows similar principles in its motorway part. Hilliges' model [Hilliges, 1995] is
approximately FIFO too (same speed irrespective of class $d$).
Basically, models differ on the issue of FIFO behavior of partial flows. FIFO behavior constrains
the model drastically, notably with respect to intersection modelling. Indeed, let $[a, b]$ be a link,
$T(t)$ the travel time of the link for vehicles entering it at time $t$, and $\chi_d(x, t)$ the composition
of the traffic at location $x$ and time $t$; then $\chi_d(a, t) = \chi_d(b, t + T(t))$. This last identity shows
that in FIFO models, the inflow composition of intersections may be completely determined
by past upstream conditions. Thus, the modelling of intersections in which this is not the case
(intersections with preselection lanes, left-turn storage capacity, etc.) requires non-FIFO
links. This is the purpose of multi-lane modelling. First order models do not accommodate these
models easily, because of the equilibrium speed-density relationship. Two examples may be cited
here. The model of [Daganzo et al., 1997] treats the special case of two lane types and two user
classes, with the effective flow resulting from a user optimum. [Daganzo et al., 1997] derived
analytical solutions for the corresponding Riemann problem. Lebacque and Khoshyaran [1998]
present a general lane assignment model, based on the analogy with classical static assignment
problems. In this last model, the users $d$, of density $K_d$, have access to the set of lanes $i \in I_d$,
and the density $K_d$ is split between lanes $i$ as $K_d = \sum_i k_i^d$. With $k_i^d = 0$ if $i \notin I_d$, and with $K_i$
the density in lane $i$ (given by $K_i = \sum_d k_i^d$) being less than the maximum density $K_{j,i}$ of lane $i$,
the following constraints apply to the unknowns $k_i^d$:
$$K_d = \sum_i k_i^d \quad \forall d$$
$$k_i^d = 0 \ \text{ if } i \notin I_d, \quad \forall i$$
$$k_i^d \ge 0 \quad \forall i, d$$
The unknowns $k_i^d$ are determined by maximizing the total flow
$$Q \stackrel{\text{def}}{=} \sum_i \gamma_i\, q_e(K_i/\gamma_i)$$
with $\gamma_i = K_{j,i}/K_j$ the relative width of lane $i$ (system optimal, entropy maximizing-like model),
or the following criterion
$$\sum_i \gamma_i \int_0^{K_i/\gamma_i} u_e(s)\, ds$$
(user-optimal, similar to the Beckmann transform), yielding [Daganzo 1997] as a particular case.
An older model can be recalled here: the model of [Michalopoulos et al., 1984a], in which the
interaction between lanes is assumed to follow a simple linear relaxation behavior, of the form:
$$\frac{\partial k_1}{\partial t} + \frac{\partial q_1}{\partial x} = \alpha\,(k_2 - k_1)$$
for instance in the case of two lanes.

Second order models also accommodate multilane models (see again the paper by Michalopoulos
et al. [1984a], and Hoogendoorn [1996], combining multiclass and multilane).
More recent developments favor the derivation from microscopic lane interaction models,
aggregated through kinetic models, as in Helbing [1997].
3.3 Platoons and their dispersion.
Platoon dispersion is an experimental fact, incorporated as a fundamental feature in the TRANSYT
model [Robertson, 1969]. The platoon dispersion problem can be expressed as the solution
of the problem presenting the following initial conditions, with $k_0$ an undercritical concentration
value:
to which we should possibly add an initial condition for speed.


3.3.1 First order models

The standard solution of the first order model corresponds to a shock wave occurring at the end
of the initial platoon, and a fan-like characteristic scheme at the front, as the following example
shows, with Greenshields' relationship (the straight lines represent as usual the characteristics and
the curves the shock waves):
This means that the evolution of the platoon is as follows:
[Figure: successive concentration profiles k of the platoon]
The dispersion as predicted by first order models occurs always at the head of the platoon and
concerns the faster vehicles (this is the consequence of the entropy condition). The situation is
slightly improved by considering discretized first order models, but as we have seen previously,
numerical viscosity is not really a reliable modelling tool. It can be noticed that, when computing
the solution numerically, the platoon aspect is more realistic, because of the numerical viscosity
introduced by discretization.
An illustration of the emulation of platoon dispersion by numerical viscosity effects can be given
by a derivation similar to the one presented above. Let the road be divided into cells of uniform
size $\delta x$. If a concentration $k_0$ is observed in cell $i = 0$ at time $t = 0$, with a null concentration
everywhere else, with the simplification that $dq_e/dk$ is constant between 0 and $k_0$, the concentration
at time-step $p$ in the cell $i$ is
$$k_i(p) = C_p^i\, \beta^i (1-\beta)^{p-i}\, k_0$$
with $\beta = \frac{dq_e}{dk}\frac{\delta t}{\delta x}$. $\beta$ is necessarily at most 1 due to the Courant-Friedrichs-Lewy (CFL) condition,
and the distribution of concentration inside the platoon is thus binomial, which is quite similar
to the platoon dispersion model proposed by [Robertson, 1969] for the TRANSYT program.
Again, this is a purely numerical effect depending on the discretization parameters. If $\frac{\delta x}{\delta t} = \frac{dq_e}{dk}$,
which minimizes the numerical viscosity, then $\beta = 1$ and there is no dispersion at all.
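The binomial spreading can be reproduced with a short Python sketch. It assumes, as in the simplification above, an equilibrium flow that is linear in the density over the range of interest, so that the discretized first order model reduces to the upwind update $k_i(p+1) = (1-\beta)k_i(p) + \beta k_{i-1}(p)$. Function names and numerical values are illustrative only.

```python
import math


def upwind_platoon(k0, beta, n_steps, n_cells):
    """Advance a platoon initially concentrated in cell 0 with the upwind scheme."""
    k = [0.0] * n_cells
    k[0] = k0
    for _ in range(n_steps):
        # Each cell keeps a share (1 - beta) and receives a share beta from upstream.
        k = [(1.0 - beta) * k[i] + (beta * k[i - 1] if i > 0 else 0.0)
             for i in range(n_cells)]
    return k


def binomial_profile(k0, beta, n_steps, n_cells):
    """Closed form C(p, i) * beta**i * (1 - beta)**(p - i) * k0 for cell i at step p."""
    return [math.comb(n_steps, i) * beta**i * (1.0 - beta)**(n_steps - i) * k0
            if i <= n_steps else 0.0
            for i in range(n_cells)]


simulated = upwind_platoon(k0=0.05, beta=0.5, n_steps=10, n_cells=20)
closed_form = binomial_profile(k0=0.05, beta=0.5, n_steps=10, n_cells=20)

# With beta = 1 the platoon is simply shifted by one cell per step, without dispersion.
shifted = upwind_platoon(k0=0.05, beta=1.0, n_steps=10, n_cells=20)
```

The spread of `simulated` depends only on $\beta$ and on the number of steps, that is on the discretization, which is the reason why this dispersion cannot be considered as a modelling feature.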
3.3.2 Higher order models.

Higher order models on the other hand, thanks to their diffusion term (or the $\partial k/\partial x$ term in the speed
equation), yield a much better emulation of platoon diffusion. For instance, the platoon diffusion
model based on a normal speed distribution proposed by Pacey [1956] has been shown [Grace
and Potts, 1964] to correspond to a diffusion equation, a result hardly surprising in view of the
relation between stochastic diffusion processes and parabolic partial differential equations.
3.4 Traffic signals - interrupted flows
The presence of a traffic signal on the road can be considered in a simplified way as a boundary
condition $q(x_0, t) = 0$ when the signal is red. The problem is to compute traffic flow evolution
when the signal turns red and when it turns green again. We can use as archetypical initial
conditions that the signal has been green for an infinite time, and that a uniform undercritical
flow $q_0$ is present. Boundary conditions are a constant demand $q_0$ upstream and an infinite supply
downstream.

3.4.1 First order models

The solution of the traffic signal problem is well known and presented in many books and
papers, for one or several intersections (see for instance [Michalopoulos et al., 1980]). For an
uncongested signal it has the following form:
What is interesting to notice again is that the accelerations of vehicles are not bounded, neither
when negative (vehicles stop instantaneously when they join the queue) nor when positive: the
first vehicle starting when the signal turns green instantaneously reaches its free speed, and the
next ones may present unphysical acceleration values too. For instance, using a Greenshields
equilibrium relationship $u_e = u_{max}\left(1 - k/k_{max}\right)$, it is easy to calculate the acceleration of a vehicle
moving with the flow at position $x$ and time $t$ in the fan area:
$$\frac{du}{dt} = \frac{u_{max}}{4t}\left(1 - \frac{x}{u_{max}\,t}\right)$$
The maximum value of this acceleration is obtained for $x = -u_{max}t$, which corresponds to the
place where vehicles start. It is equal to:
$$\frac{du}{dt} = \frac{u_{max}}{2t}$$
It is clear that, whatever the value of a physical maximum acceleration, it is possible to find a
time $t$ when this value is exceeded.
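The acceleration in the fan can be checked numerically with the following Python sketch. It assumes the stop line at $x = 0$, the green onset at $t = 0$ and a Greenshields relationship, for which the speed in the fan is $u = (u_{max} + x/t)/2$ for $-u_{max}t \le x \le u_{max}t$. The function names and the values used are hypothetical.

```python
def fan_speed(x, t, u_max):
    """Speed in the acceleration fan for a Greenshields relationship."""
    return 0.5 * (u_max + x / t)


def fan_acceleration(x, t, u_max):
    """Closed-form acceleration of a vehicle moving with the flow in the fan."""
    return u_max / (4.0 * t) * (1.0 - x / (u_max * t))


def numerical_acceleration(x, t, u_max, dt=1e-6):
    """Finite-difference derivative of the speed along the vehicle trajectory."""
    u = fan_speed(x, t, u_max)
    return (fan_speed(x + u * dt, t + dt, u_max) - u) / dt


def time_below_bound(a_max, u_max):
    """Time after which the largest acceleration u_max / (2 t) stays below a_max."""
    return u_max / (2.0 * a_max)


u_max = 15.0                                           # m/s, illustrative free speed
inside_fan = fan_acceleration(-10.0, 2.0, u_max)       # 10 m upstream, 2 s after green
at_queue_tail = fan_acceleration(-u_max * 2.0, 2.0, u_max)
critical_time = time_below_bound(a_max=3.0, u_max=u_max)
```

For a bound of 3 m/s², the largest acceleration exceeds the bound during the first 2.5 s after the green onset in this example, and it grows without limit as $t$ tends to 0.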
However, bounded acceleration solutions can be constructed, using for the solution of the above
example the trajectory of the first vehicle of the platoon as a boundary condition. These solutions
must nevertheless be defined in a proper functional setting [Lebacque, 1997] and still need to be
investigated in more detail.
3.4.2 Higher order models

It is clear that higher order models present a better representation of the accelerations downstream
of the stop line when the signal turns green. On the other hand, as pointed out by Daganzo [1995],
the presence of a stopped queue at a red traffic signal may result, if the upstream demand is
insufficient, in negative speeds. As explained in the first part of this paper, some models designed
to avoid negative speeds may present other flaws, such as the one proposed by [Liu et al., 1998].
3.5 Intersections.
Modelling intersections is a very difficult task in any macroscopic model, considering that the
interactions inside an intersection are very complex, generally microscopic in nature, and below
the scale-level of the validity of the continuum hypothesis. From a geometrical point of view,
two possibilities coexist: describe the intersection as a point, or as an extended object.
The latter approach has been used mainly in the discretized context and first order models, with
SSMT [Lebacque 1984], in which each movement inside the intersection is discretized as a
single cell, and the intersection cells overlap; with the model proposed by [Michalopoulos, 1988],
also based on overlapping cells, but with each movement discretized into several cells (the
discretization was a three-cell discretization); or with METACOR [Elloumi et al., 1994], whose
urban part is partially SSMT-derived. A different concept, that of non-overlapping exchange
zones that accommodate several movements, the interaction of which is summarized in the
zone dynamics, has been introduced with STRADA [Buisson et al., 1996].
Modelling intersections as points requires combining quantities describing the traffic state
upstream and downstream of the intersection. Particularly illuminating in this respect, notably as
far as difficulties occurring in higher order models go, is Hilliges' model [1995], which exists
only in semi-discretized form:
$$l_i\,\frac{dk_i}{dt} = k_{i-1}u_i - k_i u_{i+1} \tag{24}$$
with $l_i$ the length of cell $i$. It is obvious that the necessary downstream information for any
upstream cell $u$ is an equivalent speed of downstream cells $d$, whereas a downstream cell $d$
requires an equivalent speed of upstream cells $u$
(formulas (6.12) to (6.18) of [Hilliges, 1995]). Furthermore, the FIFO or non-FIFO character of
the flow on the upstream links has an impact on the intersection models.
A few solutions have been proposed in the first order case. Daganzo [1994] considered merges
and diverges, and proposed a priority rule for merges, and a maximum flow rule for diverges
(with a FIFO upstream flow). Maximum entropy-like models, constrained by upstream demand
and downstream supply, were proposed in [Lebacque and Khoshyaran, 1998]. The impact of the
nature of the upstream flow is particularly clear in the case of a diverge: if the flow is non-FIFO, the
partial demands are proportional to the traffic composition coefficients upstream of the diverge,
but if the flow is FIFO, it is the actual partial flows (throughflows through the intersection) that
are proportional to the traffic composition coefficients.
To illustrate this point, let us consider for instance a pointwise diverge, say $a$, with upstream link
$u$, downstream links $i$, upstream demand $\delta_u(a, t)$, downstream supplies $\sigma_i(a, t)$ resulting from the
up- and downstream densities and equilibrium functions (11) by:
$$\sigma_i(a,t) = \Sigma_e\big(k_i(a^+,t),\, a;\, i\big)$$
$$\delta_u(a,t) = \Delta_e\big(k_u(a^-,t),\, a;\, u\big)$$
Let $q_i(a, t)$, $q(a, t)$ be the partial and total flow through the intersection, and $\delta_i(a, t)$ and $\delta(a, t)$ the
partial and total demands upstream. Finally, let $\chi_i(a^-, t)$ be the composition of traffic entering
the intersection. Then, if the model is non-FIFO, the partial demands are proportional to the
compositions, yielding:
$$\delta_i(a,t) = \chi_i(a^-,t)\,\delta(a,t)$$
$$q_i(a,t) = \mathrm{Min}\,[\delta_i(a,t),\, \sigma_i(a,t)]$$
$$q(a,t) = \sum_i q_i(a,t)$$
On the other hand, if the upstream flow is FIFO, the partial flows are proportional to the
compositions, yielding:
$$q(a,t) = \mathrm{Min}_i\,[\sigma_i(a,t)/\chi_i(a^-,t),\ \delta(a,t)]$$
$$q_i(a,t) = \chi_i(a^-,t)\, q(a,t)$$
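The difference between the two diverge rules can be illustrated by the following Python sketch. The FIFO rule follows the formula given above; the non-FIFO rule assumes that each partial flow is the minimum of the corresponding partial demand and downstream supply, which is an assumption of this sketch. The function names and the numerical values are hypothetical.

```python
def diverge_non_fifo(total_demand, composition, supplies):
    """Non-FIFO diverge: partial demands are proportional to the composition."""
    partial_flows = [min(chi * total_demand, sigma)
                     for chi, sigma in zip(composition, supplies)]
    return partial_flows, sum(partial_flows)


def diverge_fifo(total_demand, composition, supplies):
    """FIFO diverge: partial flows are proportional to the composition."""
    bounds = [sigma / chi for chi, sigma in zip(composition, supplies) if chi > 0]
    total_flow = min(bounds + [total_demand])
    return [chi * total_flow for chi in composition], total_flow


# Illustrative case: 70 % of the demand heads for link 1, 30 % for a nearly blocked link 2.
demand = 1.0
shares = [0.7, 0.3]
supplies = [1.0, 0.15]

non_fifo_partial, non_fifo_total = diverge_non_fifo(demand, shares, supplies)
fifo_partial, fifo_total = diverge_fifo(demand, shares, supplies)
```

In the non-FIFO case only the users bound for link 2 are delayed (total flow 0.85), whereas in the FIFO case the restricted exit holds back the whole upstream flow (total flow 0.5).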
In the case of second order models, friction coefficients in the speed equation have been used to
account for the effect of off- and on-ramps (i.e. diverges and merges). The basic idea is to assume
that, say, the outflow (on the off-ramp) has a given speed related to that of traffic upstream of the
intersection (generally smaller), as in [Papageorgiou et al., 1989]. Similar models with friction
terms were proposed in [Michalopoulos et al., 1992], generalizing Papageorgiou's model:
$$\frac{du}{dt} = \frac{\phi}{T}\big(u_f(x) - u\big) - G - \nu k^{\beta}\frac{\partial k}{\partial x}$$
with $G = \mu k^{\varepsilon} g$ and $g$ the inflow (on-ramp).
3.6 Highway traffic jams.
[Kerner, 1997] exhibited a beautiful experimental example of two traffic jams of constant size,
persisting on a German highway for a very long period (of the order of an hour), surrounded by
fluid traffic. Such a structure cannot be explained by first order models, except by resorting to
non-entropy solutions, as illustrated by the following characteristics chart:
But such a solution is highly unstable in the sense that the solution is not unique, and that
infinitely small divergences from the right initial conditions may produce a (unique) solution
totally different from the above solution, but very close to the entropy solution with its
acceleration fan. Let us recall that the entropy solution depends continuously on the initial conditions.
Some second order models of the Kerner-Konhäuser-Helbing type might be able to explain the
persistence of such structures.
The above example definitely constitutes a challenge for future traffic flow models.
3.7 Boundary conditions.
Boundary conditions are an important feature: they determine both intersection models and
network entry and exit point models. No traffic equation should be written only on an infinite track!
In the case of first order models, the supply and demand concepts yield the boundary conditions
for any link. For instance the link inflow and outflow $Q(a, t)$ and $Q(b, t)$ of the following link:
are given by:

$$Q(a,t) = \mathrm{Min}\,[\Delta_u(t),\, \Sigma(a,t)]$$
$$Q(b,t) = \mathrm{Min}\,[\Delta(b,t),\, \Sigma_d(t)]$$
with the boundary conditions $\Delta_u(t)$ and $\Sigma_d(t)$ the demand upstream of the link entry point $a$ and
the supply downstream of the exit point $b$. $\Delta(b, t)$ and $\Sigma(a, t)$ are as usual the demand and supply
of the link at points $b$ and $a$ as resulting from the equilibrium supply and demand functions (11)
by the formulas:
$$\Delta(b,t) = \Delta_e\big(k(b^-,t),\, b\big)$$
$$\Sigma(a,t) = \Sigma_e\big(k(a^+,t),\, a\big)$$
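A discretized link driven by such boundary conditions can be sketched as follows in Python. The sketch assumes a Godunov-type cell update, a Greenshields relationship in normalized units and constant boundary demand and supply; all names and values are illustrative assumptions.

```python
def simulate_link(upstream_demand, downstream_supply, n_cells=10, n_steps=2000,
                  v_free=1.0, k_jam=1.0, dx=1.0, dt=0.5):
    """Return (densities, inflow, outflow); the flows are those of the last time step."""
    k_crit = k_jam / 2.0
    q_max = v_free * k_jam / 4.0

    def q_e(k):
        return v_free * k * (1.0 - k / k_jam)

    def demand(k):
        return q_e(k) if k <= k_crit else q_max

    def supply(k):
        return q_max if k <= k_crit else q_e(k)

    k = [0.0] * n_cells  # initially empty link
    flows = []
    for _ in range(n_steps):
        # Entry point a: upstream demand against the supply of the first cell.
        flows = [min(upstream_demand, supply(k[0]))]
        # Interior cell boundaries: demand of the cell upstream, supply of the cell downstream.
        flows += [min(demand(k[i]), supply(k[i + 1])) for i in range(n_cells - 1)]
        # Exit point b: demand of the last cell against the downstream supply.
        flows.append(min(demand(k[-1]), downstream_supply))
        # Conservation of vehicles in each cell (dt * v_free <= dx satisfies the CFL condition).
        k = [k[i] + dt / dx * (flows[i] - flows[i + 1]) for i in range(n_cells)]
    return k, flows[0], flows[-1]


# Unrestricted exit: the outflow settles at the upstream demand.
_, inflow_free, outflow_free = simulate_link(0.15, float("inf"))
# Restricted exit: a queue fills the link and the inflow is limited by the link supply.
_, inflow_jam, outflow_jam = simulate_link(0.20, 0.10)
```

In the second case the queue reaches the entry point and the inflow drops to the downstream supply, without any additional boundary rule being needed.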
In the case of second order models, to the authors' knowledge, no simple boundary conditions
exist. Let us consider again Hilliges' model (eq. 24) to show the difficulties involved. For a
network entry point, corresponding to an entry cell, say $i$, the inflow $k_{i-1}u_i$ must be given, i.e.
the upstream density $k_{i-1}$ must be given.
[Figure: entry point]
The same applies to the speed equation, which requires an upstream speed. Basically, upstream
boundary conditions will comprise both an upstream density and an upstream speed. For exit
points, only the downstream speed is required (the analogy with Papageorgiou's exit ramp model
is obvious). There is no simple way to accommodate supply and demand concepts (for example
a constraint on the demand at an exit point would imply a constraint on the downstream speed
$u_{i+1}$, dependent on the exit link density $k_i$).
4 CONCLUSION
The analysis presented above has obvious limits, since it is clear that the only real validation
of a traffic model is its confrontation with real data. However, it is also true that real data always
present a high proportion of noise that may mask some inconsistencies of a tested model, and
that several and possibly conflicting interpretations may often be given of the same observed
situations. The validation of a model against real data is thus not necessarily a guarantee that
the model will behave properly in all situations. This is the reason why many papers presenting
the validation of some traffic models are of limited interest. It thus seems useful to have a
basic framework to analyse the behaviour of a model on a set of well-defined cases, to point out
possible inconsistencies or side effects, and to compare a particular model (or class of models)
to the existing ones. From that point of view, the various cases presented in this paper may
contribute to such a framework.
REFERENCES
Astarita V., (1996). A continuous time link model for dynamic network loading based on travel
time function. In: Transportation and Traffic Theory, proceedings of the 13th ISTTT (J.B. Lesort
ed.), pp 79-102, Pergamon, Oxford.
Bui D.D., P. Nelson, S.L. Narasimhan, (1992). Computational realizations of the entropy
condition in modelling congested traffic flow. Report 1232-7. Texas Transportation Institute, USA.
Buisson C., J.P. Lebacque, J.B. Lesort, H. Mongeot, (1996). The STRADA model for dynamic
assignment. Proc. of the 1996 ITS Conference. Orlando, USA.
Cassidy M.J., (1998). Bivariate relation in nearly stationary highway traffic. Trsp. Res. 32B, pp
49-60.
Cremer M., M. Papageorgiou, (1981). Parameter identification for a traffic flow model.
Automatica 17, pp 837-843.
Daganzo C.F., (1994). The cell transmission model 1: a dynamic representation of highway
traffic consistent with the hydrodynamic theory. Trsp. Res. 28B, 4, pp 269-287.
Daganzo C.F., (1995a). The cell transmission model 2: network traffic simulation. Trsp. Res.
29B, 2, pp 79-93.
Daganzo C.F., (1995b). A finite difference approximation of the kinematic wave model. Trsp.
Res. 29B, pp 261-276.
Daganzo C.F., (1995c). Requiem for second-order fluid approximation of traffic flow. Trsp. Res.
29B, pp 277-286.
Daganzo C.F., Cassidy M.J. and Bertini R.L., (1997). Causes and effects of phase transitions in
highway traffic. Report UCB-ITS-RR-97-8.
Daganzo C.F., (1997). A continuum theory of traffic dynamics for freeways with special lanes.
Trsp. Res. 31B, pp 83-102.
Del Castillo J.M., Pintado P. and Benitez F. G., (1993). A formulation for the reaction time
of traffic flow models. Twelfth International Symposium on Transportation and Traffic Theory,
Berkeley, California.
Elloumi E., H. Hadj Salem, M. Papageorgiou, (1994). METACOR, a macroscopic modelling tool
for urban corridors. TRISTAN II Int. Conf., Capri.
Friesz T.L., J. Luque, R.L. Tobin, B.W. Wie, (1989). Dynamic network traffic assignment
considered as a continuous time optimal control problem. Op. Res. 37, 6, pp 893-901.
Grace M.J., R.B. Potts, (1964). A theory of the diffusion of traffic platoons. Op. Res. 12-2, pp
529-533.
Greenshields B.D., (1935). A study of traffic capacity. Proceedings of the Highway Research
Board, vol 14, pp 448-477.
Helbing D., (1997). Verkehrsdynamik. Springer.
Heydecker B.G., (1994). Incidents and interventions on freeways. PATH Research Report
UCB-ITS-PRR 94-5.
Hilliges M., (1995). Ein phänomenologisches Modell des dynamischen Verkehrsflusses in
Schnellstraßennetzen. PhD dissertation. Institut für theoretische Physik der Universität
Stuttgart. Shaker Verlag.
Kerner B.S., (1997). Experimental characteristics of traffic flow for evaluation of traffic
modelling. IFAC/IFIP/IFORS symposium, Chania, Greece.
Kerner B.S. and Rehborn H., (1997). Experimental properties of phase transitions in traffic flow.
Phys. Rev. Lett. 79, pp 4030-4033.
Kerner B.S., Konhäuser P. and Schilke M., (1996). A new approach to problems of traffic flow
theory. In: Transportation and Traffic Theory, proceedings of the 13th ISTTT (J.B. Lesort ed.),
pp 79-102, Pergamon, Oxford.
Kühne R.D., (1984). Macroscopic freeway model for dense traffic: stop-start waves and incident
detection. Ninth Int. Symposium on Transportation and Traffic Theory. VNU Science Press,
pp 21-42.
Lebacque J.P., J.B. Lesort, F. Giorgi, (1998). Introducing buses into first order macroscopic
traffic flow models. Trsp. Res. Rec. 1664, pp 70-79.
Lebacque J.P., (1984). Semimacroscopic simulation of urban traffic. Int. 84 Minneapolis
Summer Conference. AMSE.
Lebacque J.P., (1995). Le modèle de trafic de P. Ross: solutions analytiques. Actes INRETS 45,
Arcueil, France.
Lebacque J.P., (1996a). Instantaneous travel times for macroscopic traffic flow models.
CERMICS Report 59-96.
Lebacque J.P., (1996b). The Godunov scheme and what it means for first order traffic flow
models. In: Transportation and Traffic Theory, proceedings of the 13th ISTTT (J.B. Lesort ed.),
pp 647-677, Pergamon, Oxford.
Lebacque J.P., (1997). A finite acceleration scheme for first order macroscopic traffic flow
models. The 8th IFAC symposium on transportation systems, Chania, Greece.
Lebacque J.P., M.M. Khoshyaran, (1998). First order macroscopic traffic flow models for
networks in the context of dynamic assignment. CERMICS Report. To be published. Presented at
the 6th Meeting of the EURO Working Group on Transportation, Göteborg.
Leo C.J., R.L. Pretty, (1992). Numerical simulation of macroscopic traffic models. Trsp. Res.
26B, 3, pp 207-220.
Lighthill M.J., G.B. Whitham, (1955). On kinematic waves II: a theory of traffic flow on long
crowded roads. Proc. Royal Soc. (Lond.) A 229, pp 317-345.
Liu G., Lyrintzis A. and Michalopoulos P., (1998). Improved high order model for freeway
traffic flow. TRB 77th meeting, Washington, DC.
Merchant D.K., G.L. Nemhauser, (1978). A model and an algorithm for the dynamic traffic
assignment problem. Transportation Science 12, pp 183-199 & 200-207.
Messner A., M. Papageorgiou, (1990). METANET, a macroscopic modelling simulation for
motorway networks. Technische Universität München.
Michalopoulos P.G, G. Stephanopoulos, V.B. Pisharody (1980). Modelling of traffic flow at
signalized links, Trsp. Sc. Vol 14-1, pp 9-41
Michalopoulos P.G., D.E. Beskos, Y. Yamauchi, (1984a). Multilane traffic flow dynamics some
macroscopic considerations.
Michalopoulos P.G. , D.E. Beskos, J.K. Lin, (1984b). Analysis of interrupted flow by finite
difference methods. Trsp. Res. B 18B pp 409-421.
Michalopoulos P.G., (1988). Analysis of traffic flow at complex congested arterials. Trsp. Res.
Rec 1194 pp 77-86. Trsp. res. 18B, 377-395.
Michalopoulos P. G., Yi P. and Lyrintzis A. S., (1992). Developement of an improved high-order

continuum traffic flow model.Trap. Res. Rec., Vol. 1365, pp 125-132.
Mongeot H., (1997). Traffic incident modelling in mixed urban network, Traff. Eng. + Ctrl, vol
38-1 l.pp 584-592
Newell G. R, (1989). Comments on traffic dynamics. Trsp. Res. 23B, 386-389.
Newell G. R, (1993). A moving bottleneck, UCB ITS report UCB-ITS-RR-93-3.
Newell G. R, (1998). A moving bottleneck. Trsp. Res., 32B, pp 531-538.
Pacey G.M., (1956). The progress of a bunch of vehicles released from a traffic signal, Research
note RN/2665/GMP, RRL, London
Papageorgiou M., Posch B. and Schmidt G., (1983). Comparison of macroscopic models for
control of freeway traffic.Trsp. Res., 17B, pp 107-116.
Papageorgiou M., Blosseville J.-M. and Hadj-Salem H., (1989). Macroscopic modelling of traffic
flow on the boulevard peripherique in Paris. Trsp. Res., Vol. 23B, pp 29-47.
Papageorgiou M., (1998). Some remarks on macroscopic traffic flow modelling.7>.sp. Res. 32A
5 p 323-330.
Payne H. J., (1971). Models of freeway traffic and control. Simulation Council proceedings, 1,
ch6.
Phillips W.F., (1979). A kinetic model for traffic flow with continuum implications.
Transportation Planning and Technology, Vol. 5-3, pp 131-138.
Prigogine I. and Herman R., (1971). Kinetic theory of vehicular traffic, American Elsevier, New
York.
Richards P.I., (1956). Shock-waves on the highway. Op. Res. 4, 42-51.
Robertson D.I., (1969). TRANSYT, a traffic network study tool, RRL Report 153, Crowthorne.
Ross, P., (1988). Traffic dynamics, Trsp. Res. 22B pp 421-434.
Ross, P., (1989). Response to Newell. Trsp. Res. 23B, pp 390-391.
Schochet S., (1988). The instant response limit in Whitham's non linear traffic model: uniform
well-posedness and global existence. Asymptotic Analysis 1, pp 263-282.
Zhang H.M. and Wu T., (1997). Numerical simulation and analysis of traffic flow. TRB 76th
meeting, Washington, DC.
Zhang H. M., (1998). Theoretical inquiry into transient speed-concentration relationship in traffic
flow. TRB 77th meeting, Washington, DC.
MULTICLASS MACROSCOPIC TRAFFIC FLOW
MODELLING: A MULTILANE GENERALISATION
USING GAS-KINETIC THEORY
Serge P. Hoogendoorn and Piet H.L. Bovy, Delft University of Technology, Faculty of Civil
Engineering and Geosciences, Transportation and Traffic Engineering Section, Delft, The
Netherlands
ABSTRACT
In contrast to microscopic traffic flow models, macroscopic models describe traffic in terms
of aggregate variables such as traffic density, flow-rate, and velocity. The implied mean
traffic behavior depends on the traffic conditions in the direct environment of the vehicles in
the traffic stream. The analogy between vehicular flow and the flow of fluids motivated the
derivation of these models (e.g. Lighthill and Whitham (1955), Payne (1979)). The advantages of
macroscopic models include the insight gained into traffic flow operations (e.g.
shock-wave analysis), the applicability in model-based control, the relatively small number of
parameters, which simplifies model calibration, and the applicability to large traffic networks.
Generally, macroscopic models consider the behavior of the aggregate traffic flow. That is,
they distinguish neither user-classes, such as traveler types (commuters, freight, recreational,
etc.), vehicle types (person-cars, trucks, buses, vans), paying and non-paying traffic, and
various types of guided vehicles, nor roadway lanes. However, we
envisage that a generalization of macroscopic traffic flow models to both user-classes and
lanes is advantageous. On the one hand, this generalization increases the applicability of mac-
roscopic models to the synthesis and analysis of multilane multiclass (MLMC) traffic flow.
As a result, more insight is gained into the response-behavior of the heterogeneous multilane
flow, such as effective capacity, velocity distribution, and the distribution of vehicles over the
roadway lanes.
On the other hand, from the traffic control perspective, contemporary policies pursue a more
efficient use of the available infrastructure (e.g. dynamic allocation of roadway lanes to
classes, class-selective ramp-metering). The heterogeneous multilane network-wide traffic
control problem is characterized by multiple objectives (efficiency, safety, etc.), multiple
target groups (the user-classes), and a high complexity. The latter is caused by the interaction
between the user-classes, the interplay between the available control instruments, and the
interaction between the different parts of the network. This complexity requires a model-based
approach, demanding the availability of operational models providing deterministic conditional
predictions of the multilane heterogeneous traffic flow, given some specific control
configuration.
Only very recently, attempts to generalize the classical macroscopic models emerged. Hoo-
gendoorn (1997), and Hoogendoorn and Bovy (1998a) present a multiclass generalization of
the model of Helbing (1996) based on gas-kinetic principles. Research on the multilane gen-
eralization of macroscopic flow models is reported by Daganzo (1997), Helbing (1997), and
Klar et al. (1998). Helbing (1997) briefly discusses the multiclass generalization of the gas-
kinetic multilane equations.
In this paper, we present a macroscopic model describing the dynamics of heterogeneous
multilane traffic flow, based on gas-kinetic multiclass multilane traffic dynamics. In contrast
to the aforementioned models, the MLMC model describes the traffic flow by considering the
conservative variables density, momentum, and energy, rather than the primitive variables
density, velocity, and velocity variance. Using these so-called conservatives simplifies the
derivation approach and enables improved mathematical and numerical analysis (cf. Hoogendoorn
and Bovy (1998d, 1999)). Since the acceleration and lane-changing behavior differs
significantly between free-flowing and constrained drivers, the macroscopic flow model
considers both driver states.
Other novelties are the derived expressions of the MLMC equilibrium momentum and energy,
quantifying the asymmetric user-class and lane interaction. From these expressions, the
MLMC equilibrium velocity and velocity variance can be determined. The equilibrium rela-
tions result from competitive acceleration and deceleration processes: on the one hand, vehi-
cles accelerate towards their desired velocity, while on the other hand, vehicles that interact
with slower vehicles from different user-classes - without being able to immediately overtake
to an adjacent lane - decelerate. Also, the equilibrium lane-distribution of the classes as a re-
sult of overtaking can be determined. On the input-side, the model allows the specification of
the class specific desired velocity, acceleration time, and overtaking probabilities.
The paper is organized as follows. First we present the MLMC generalization of the gas-
kinetic flow equations of Paveri-Fontana (1975) for constrained and free-flowing vehicles.
Secondly, we present the macroscopic flow model using conservative variables, while subse-
quently discussing the equilibrium speed-density relations and the density lane distribution.
After discussing the numerical solution approach, we present results from application of the
macroscopic model to two test cases. Finally, in the closing section we summarize our re-
search findings.
DERIVATION OF THE MLMC GAS-KINETIC EQUATIONS
Gas-kinetic models describe traffic using the reduced phase-space density (PSD) $\rho(x,v,t)$,
where $\rho(x,v,t)\,dx\,dv$ equals the expected number of vehicles in $[x,x+dx)$ driving with a
velocity in $[v,v+dv)$ at instant $t$. This concept is borrowed from statistical physics and can
be considered as a mesoscopic generalization of the traffic density $r(x,t)$.
Equations describing dynamic changes in the implied velocity distributions are based on the
work of Prigogine and Herman (1971), who assumed that changes in the reduced PSD are
caused by acceleration, deceleration, and convection. The latter simply describes changes due
to the movement of the traffic. Their deliberations yielded the following equation:
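$$\frac{\partial\rho}{\partial t} + v\frac{\partial\rho}{\partial x} = \left[\frac{\partial\rho}{\partial t}\right]_{\mathrm{ACC}} + \left[\frac{\partial\rho}{\partial t}\right]_{\mathrm{INT}} \tag{1}$$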
The acceleration term $[\partial\rho/\partial t]_{\mathrm{ACC}}$ describes relaxation of drivers'
speed towards a traffic-condition dependent velocity. Prigogine and Herman (1971) proposed:
$$\left[\frac{\partial\rho}{\partial t}\right]_{\mathrm{ACC}} = \frac{\Omega^0(x,v,t) - \rho(x,v,t)}{T} \tag{2}$$
where $T$ denotes the acceleration time and $\Omega^0(x,v,t)$ reflects the distribution of desired
velocities, that is, of the expected desired velocity of vehicles driving with velocity $v$.
Paveri-Fontana (1975) improved the relaxation process by considering the Phase-Space-Density
(PSD), which can be considered as a generalization of the reduced PSD extended with an
independent variable describing the desired velocity $v^0$, that is $\rho(x,v,v^0,t)$.
The interaction term $[\partial\rho/\partial t]_{\mathrm{INT}}$ reflects fast vehicles catching up
with slower vehicles, considering the immediate overtaking probability $\pi$. The interaction
term is composed of contributions of active and passive interactions. An active interaction
occurs when a vehicle driving with velocity $v$ interacts with a slower vehicle driving with
velocity $w<v$. A passive interaction is defined from the viewpoint of the impeding vehicle: it
occurs when a vehicle driving with velocity $v$ impedes a faster vehicle ($w>v$). The assumption
of vehicular chaos (cf. Prigogine and Herman (1971)) yields the following expressions for the
contributions of active and passive interactions to the dynamics of the reduced PSD:
$$\left[\frac{\partial\rho}{\partial t}\right]_{\mathrm{ACTIVE}} = (1-\pi)\,\rho(x,v,t)\int_{w<v}(w-v)\,\rho(x,w,t)\,dw \tag{3}$$
and:
$$\left[\frac{\partial\rho}{\partial t}\right]_{\mathrm{PASSIVE}} = (1-\pi)\,\rho(x,v,t)\int_{w>v}(w-v)\,\rho(x,w,t)\,dw \tag{4}$$
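As a rough numerical illustration of the interaction integrals (3) and (4), the sketch below
evaluates both on a discretized velocity grid at one fixed location and instant. The grid, the
example density values, the constant overtaking probability, and the function name
`interaction_terms` are illustrative assumptions, not part of the model specification.

```python
def interaction_terms(velocities, rho, pi, dv):
    """Riemann-sum approximation of the active (3) and passive (4) terms.

    velocities : evenly spaced velocity grid v_i (m/s), spacing dv
    rho        : reduced PSD values rho(x, v_i, t) at one fixed (x, t)
    pi         : immediate overtaking probability (assumed constant here)
    """
    active, passive = [], []
    for v, rho_v in zip(velocities, rho):
        # Active: a vehicle at v catches up with slower vehicles (w < v).
        slower = sum((w - v) * rho_w
                     for w, rho_w in zip(velocities, rho) if w < v) * dv
        # Passive: a vehicle at v impedes faster vehicles (w > v).
        faster = sum((w - v) * rho_w
                     for w, rho_w in zip(velocities, rho) if w > v) * dv
        active.append((1.0 - pi) * rho_v * slower)
        passive.append((1.0 - pi) * rho_v * faster)
    return active, passive


# Hypothetical example values (not taken from the paper).
velocities = [20.0, 25.0, 30.0]
rho = [0.002, 0.004, 0.002]
active, passive = interaction_terms(velocities, rho, pi=0.5, dv=5.0)
```

In this discretization the active terms are non-positive, the passive terms are non-negative,
and their totals over the grid cancel, since every active interaction of a fast vehicle is a
passive interaction of the vehicle it catches up with.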
Hoogendoorn and Bovy (1998a,b) propose a multiclass generalization of the gas-kinetic
equations of Paveri-Fontana (1975) by considering the Multiclass Phase-Space Density
(MUC-PSD). Their model is characterized by asymmetric interactions, caused by the distinction
of slow and fast classes. Helbing (1998) proposes similar models for the multilane case.
In this section we will establish the first step in the derivation approach which is the determi-
nation of gas-kinetic equations for multiclass multilane traffic flow operations. These equa-
tions describe the dynamics of the multilane multiclass Phase-Space Density (abbreviated as
MLMC-PSD), which is dis-aggregated into contributions of platooning and free-flowing ve-
hicles respectively due to differences in driving-characteristics. Similar to other gas-kinetic
models, several class-specific and lane-specific processes govern the dynamics of the
MLMC-PSD (acceleration towards the desired velocity, deceleration caused by vehicle inter-
actions, immediate, postponed and spontaneous lane changing, and state-transitions). Typical
class-specific parameters are the desired velocity, the acceleration time, the reaction time, the
vehicle length, and the within-user-class velocity variance (cf. Hoogendoorn and Bovy
(1998a,b)). The traffic conditions on the roadway lanes differ, due to among others the class-
dependent overtaking behavior and lane preferences.
The phase-space density for constrained and free-flowing vehicles
In order to accommodate the multilane description of heterogeneous traffic, we dis-aggregate
the PSD $\rho(x,v,v^0,t)$ (cf. Paveri-Fontana (1975)) by distinguishing classes and lanes. The
MLMC-PSD $\rho_u^j(x,v,v^0,t)$ denotes the expected number of vehicles per unit road-length of
user-class $u$ at $x$ on lane $j$ at instant $t$ which are currently driving at a velocity equal
to $v$ while aiming to traverse along the road at a desired velocity $v^0$, where $j=1,\ldots,M$
and $u\in U$. By definition, $j=1$ denotes the rightmost lane while $j=M$ denotes the leftmost
lane. By notational convention, dropping the respective index from the notation indicates
lane- and/or class-aggregation (e.g. $\rho_u=\sum_j\rho_u^j$). The reduced MLMC-PSD and the MLMC
traffic density are respectively defined by:
$$\rho_u^j(x,v,t) = \int \rho_u^j(x,v,v^0,t)\,dv^0 \tag{5}$$
and:
$$r_u^j(x,t) \overset{\mathrm{def}}{=} \iint \rho_u^j(x,v,v^0,t)\,dv^0\,dv = \int \rho_u^j(x,v,t)\,dv \tag{6}$$
Constrained and free-flowing vehicles
In the sequel we will show that the state of a driver - that is, whether he is constrained or
freely flowing - to a large extent determines both the lane changing behavior and the accel-
eration behavior. A constrained or platooning driver refers to any driver who is impeded by a
slower vehicle in front without being able to immediately change to an adjacent lane.
Conversely, we will refer to any driver not impeded by any slow vehicle or who is able to
immediately change lanes as a free-flowing driver. The MLMC-PSD is then dis-aggregated as
follows:
$$\rho_u^j(x,v,v^0,t) = \sigma_u^j(x,v,v^0,t) + \xi_u^j(x,v,v^0,t) \tag{7}$$
where $\sigma_u^j(x,v,v^0,t)$ and $\xi_u^j(x,v,v^0,t)$ respectively denote the contribution of the
constrained and the free-flowing MLMC-PSD. Correspondingly to the mixed-state reduced
MLMC-PSD, we define the separate reduced MLMC-PSDs for constrained and free-flowing vehicles.
The gas-kinetic equations for free-flowing and constrained vehicles
The gas-kinetic equations that describe the dynamics of $\xi_u^j$ are governed by both continuum
and non-continuum processes. The continuum processes reflect the smooth changes in the
free-flowing MLMC-PSD due to balancing inflow and outflow in the phase-space $(x,v,v^0,t)$.
We can show that the gas-kinetic equations describing the dynamics of the free-flowing
MLMC-PSD equal (cf. Hoogendoorn and Bovy (1998c)):
$$\frac{\partial\xi_u^j}{\partial t} + v\frac{\partial\xi_u^j}{\partial x} + \frac{\partial}{\partial v}\left(\xi_u^j\frac{dv}{dt}\right) = \left[\frac{\partial\xi_u^j}{\partial t}\right]_{\mathrm{NC}} \tag{8}$$
where $[\partial\xi_u^j/\partial t]_{\mathrm{NC}}$ reflects dynamic changes caused by non-continuum
processes.

The derivative $dv/dt$ in (8) reflects the acceleration of vehicles. We assume that the
free-flowing drivers of class $u$ accelerate to their desired velocity $v^0$ in an exponential
fashion, i.e.:
$$\frac{dv}{dt} = \frac{v^0 - v}{\tau_u^0} \tag{9}$$
where $\tau_u^0$ denotes the class-specific relaxation constant, reflecting the acceleration
capabilities.
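To make the exponential acceleration assumption concrete, the sketch below compares the
closed-form solution of $dv/dt = (v^0 - v)/\tau$ with a simple forward-Euler integration. The
function names and the numerical values in the example (initial velocity, desired velocity,
relaxation constant) are assumptions chosen for illustration only.

```python
import math


def relaxed_velocity(v_start, v_desired, tau, t):
    """Closed-form solution of dv/dt = (v_desired - v) / tau at time t."""
    return v_desired - (v_desired - v_start) * math.exp(-t / tau)


def euler_velocity(v_start, v_desired, tau, t, steps=10000):
    """Forward-Euler integration of the same equation, for comparison."""
    dt = t / steps
    v = v_start
    for _ in range(steps):
        v += dt * (v_desired - v) / tau
    return v


# Hypothetical example: a vehicle at 20 m/s relaxing towards 32 m/s, tau = 10 s.
v_exact = relaxed_velocity(20.0, 32.0, 10.0, 10.0)
v_euler = euler_velocity(20.0, 32.0, 10.0, 10.0)
```

After one relaxation time the remaining velocity gap has shrunk by a factor $1/e$, which is
what "exponential fashion" amounts to.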
The continuum terms describing the changes in $\sigma_u^j$ can be derived similarly. However, in
contrast to free-flowing drivers, constrained drivers are by definition unable to accelerate
towards their (own) desired velocity. Assuming $\sigma_u^j(v^0-v)/\tau_u^0=0$, the gas-kinetic
equations for the constrained MLMC-PSD become:
$$\frac{\partial\sigma_u^j}{\partial t} + v\frac{\partial\sigma_u^j}{\partial x} = \left[\frac{\partial\sigma_u^j}{\partial t}\right]_{\mathrm{NC}} \tag{10}$$
where $[\partial\sigma_u^j/\partial t]_{\mathrm{NC}}$ reflects dynamic changes caused by
non-continuum processes. For both free-flowing and constrained traffic, the non-continuum
processes reflect the influences of deceleration and lane changing. Let us now discuss these
processes in some detail.
Braking and immediate lane-changing. Given the assumption of vehicular chaos, the expected
number of actively interacting free-flowing vehicles of user-class $u$ at $(x,t)$ on lane $j$
driving with a velocity $v$ while having a desired velocity $v^0$ equals:
$$\Psi^j(v)\,\xi_u^j(x,v,v^0,t) \tag{11}$$
where $\Psi^j(v)=\sum_u\Psi_u^j(v)\geq 0$ equals the expected number of active interactions per
unit time of a vehicle driving with velocity $v$ with slower vehicles of any user-class on
lane $j$.
When a free-flowing vehicle interacts, it either immediately changes lanes, or it decelerates to
the (exact) velocity of the preceding vehicle, while becoming constrained (cf. Hoogendoorn
and Bovy (1998c)). Consequently, the contribution of active interactions to the dynamics of
$\xi_u^j$ equals:
$$\left[\frac{\partial\xi_u^j}{\partial t}\right]_{\mathrm{ACTIVE}} = -(1-p_u^j)\,\Psi^j(v)\,\xi_u^j(v,v^0) \tag{12}$$
where $p_u^j$ denotes the immediate lane-changing probability on lane $j$. When the interacting
free-flowing vehicles are able to change lanes, the $\xi_u^{j\pm1}$ on the target lane $j\pm1$
increases, yielding:
$$\left[\frac{\partial\xi_u^{j\pm1}}{\partial t}\right]_{\mathrm{ILC}} = p_u^{j\to j\pm1}\,\Psi^j(v)\,\xi_u^j(v,v^0) \tag{13}$$
where $p_u^{j\to j\pm1}$ denotes the immediate overtaking probability to the left lane ($j+1$) and
right lane ($j-1$) respectively. Similarly, actively interacting free-flowing vehicles on lane
$j\pm1$ able to immediately change lanes to lane $j$ cause an increase in $\xi_u^j$. In Europe,
overtaking regulations during non-congested traffic conditions follow the 'drive on the right -
overtake on the left' principle. That is, $p_u^{j\to j-1}=0$ when traffic is not congested.
Similarly to the immediate overtaking processes of free-flowing vehicles, a constrained vehi-

cle may be able to change to either of the adjacent lanes when it has actively interacted with a
slower vehicle. An active interaction of a constrained vehicle is defined by the corresponding
active interaction of the platoon-leading vehicle. Since at this stage, we assume that vehicular
particles have no physical length, following (constrained) vehicles interact with a slower ve-
hicle at the same instant and location as the leading vehicle does. As a consequence, we pro-
pose expressions similar to (12) and (13) for constrained traffic.
Postponed lane changing. Constrained vehicles previously not able to change lanes may
change lanes if the opportunity arises, after which the driver is able to accelerate towards his
desired velocity until he is again impeded. We will model the postponed lane changes using
the postponed lane-change rates $\lambda_u^{j\to j\pm1}$, with
$\lambda_u^{1\to 0}=\lambda_u^{M\to M+1}=0$ and
$\lambda_u^j=\lambda_u^{j\to j-1}+\lambda_u^{j\to j+1}$. Postponed lane changing causes a
migration of constrained vehicles $\sigma_u^j$ to free-flowing vehicles $\xi_u^{j\pm1}$:
$$\left[\frac{\partial\xi_u^{j\pm1}}{\partial t}\right]_{\mathrm{PLC}} = \lambda_u^{j\to j\pm1}\,\sigma_u^j(v,v^0) \tag{14}$$
Spontaneous lane changing. Free-flowing drivers may choose to change to either of the
adjacent lanes depending on the driver's preferences. This yields a flow from the current lane
to an adjacent lane in correspondence to the class-specific driver's preference. Since this
process is comparable to the postponed lane-changing process, it yields comparable expressions.
Under European legislation, this spontaneous lane changing process to a large extent results
from vehicles which have overtaken slower vehicles using the left lane returning to their
origin lane. In this case, spontaneous lane changes to the left lane are rare. Thus, the
spontaneous lane changing intensities satisfy $\gamma_u^{j\to j+1}=0$. If we consider American
legislation, the spontaneous lane changes result from drivers having a distinct preference for
a specific lane. That is, a driver changes to the left lane if he prefers to be on any of the
lanes left of his current lane.
Assuming that free-flowing drivers only change lanes if they remain free-flowing, spontaneous
lane changing yields the following contribution to the dynamics of $\xi_u^j$ (cf. Hoogendoorn
and Bovy (1998c)):
$$\left[\frac{\partial\xi_u^j}{\partial t}\right]_{\mathrm{SLC}} = -\gamma_u^{j\to j\pm1}\,\xi_u^j(v,v^0) \quad\text{and}\quad \left[\frac{\partial\xi_u^{j\pm1}}{\partial t}\right]_{\mathrm{SLC}} = \gamma_u^{j\to j\pm1}\,\xi_u^j(v,v^0) \tag{15}$$
Passive interactions. If any vehicle of class $u$ on lane $j$ driving at a velocity $w$ interacts
with a vehicle driving at a velocity $v<w$, without having the opportunity to immediately change
lanes, it will assume the velocity $v$ of the impeding vehicle while becoming constrained. In
this latter case, the number of constrained vehicles increases with rate:
$$\rho^j(v)\int_{w>v}(1-p_u^j)(w-v)\,\xi_u^j(w,v^0)\,dw \quad\text{and}\quad \rho^j(v)\int_{w>v}(1-q_u^j)(w-v)\,\sigma_u^j(w,v^0)\,dw \tag{16}$$
due to passive interactions with free-flowing and constrained vehicles respectively
(cf. Hoogendoorn and Bovy (1998c)), where $p_u^j=p_u^j(x,v,t)$ and $q_u^j=q_u^j(x,v,t)$ denote the
immediate lane-changing probabilities of interacting free-flowing and constrained vehicles.
Let us define the passive interaction rates by:
$$\tilde\Psi_u^j(v,v^0) \overset{\mathrm{def}}{=} \int_{w>v}(w-v)\,\xi_u^j(w,v^0)\,dw \quad\text{and}\quad \tilde\Phi_u^j(v,v^0) \overset{\mathrm{def}}{=} \int_{w>v}(w-v)\,\sigma_u^j(w,v^0)\,dw \tag{17}$$
Thus, the total contribution of passive interactions equals:
$$\left[\frac{\partial\sigma_u^j}{\partial t}\right]_{\mathrm{PASSIVE}} = \rho^j(v)\left((1-p_u^j)\,\tilde\Psi_u^j(v,v^0) + (1-q_u^j)\,\tilde\Phi_u^j(v,v^0)\right) \tag{18}$$
Relaxation due to vanishing impeding vehicles. When the constrained vehicles on lane $j$ are
able to change lanes, vehicles that were previously constrained by the lane-changing vehicles
become 'free-flowing'. This event can in itself yield a relaxation of other vehicles. Since
this process is very complex, we will assume that the process can be modeled by:
$$\left[\frac{\partial\xi_u^j}{\partial t}\right]_{\mathrm{REL}} = \vartheta(\Psi^j(v))\,\sigma_u^j(v,v^0) \quad\text{and}\quad \left[\frac{\partial\sigma_u^j}{\partial t}\right]_{\mathrm{REL}} = -\vartheta(\Psi^j(v))\,\sigma_u^j(v,v^0) \tag{19}$$
where $\vartheta$ is a monotonically increasing function of the mean number of active
interactions $\Psi^j(v)$ of vehicles driving at a velocity $v$.
The resulting gas-kinetic equations
Combining the gas-kinetic equation for free-flowing traffic (8) with the specifications for the
non-continuum processes derived in the previous paragraphs, yields:
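$$\frac{\partial\xi_u^j}{\partial t} + v\frac{\partial\xi_u^j}{\partial x} + \frac{\partial}{\partial v}\left(\xi_u^j\,\frac{v^0-v}{\tau_u^0}\right) = -(1-p_u^j)\,\Psi^j(v)\,\xi_u^j + \vartheta(\Psi^j(v))\,\sigma_u^j - \sum_{j'=j\pm1}\left(p_u^{j\to j'}\Psi^j(v)\,\xi_u^j - p_u^{j'\to j}\Psi^{j'}(v)\,\xi_u^{j'}\right) - \sum_{j'=j\pm1}\left(\gamma_u^{j\to j'}\xi_u^j - \gamma_u^{j'\to j}\xi_u^{j'} - \lambda_u^{j'\to j}\sigma_u^{j'}\right) \tag{20}$$
for all $u\in U$ and $j=1,\ldots,M$. Similarly, the gas-kinetic dynamics of $\sigma_u^j$ become:
[Equation (21), the constrained-vehicle counterpart of (20), is not legible in the source.]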
DERIVATION OF THE MACROSCOPIC EQUATIONS
This section presents the macroscopic equations describing the dynamics of MLMC traffic
flow. To this end, aggregation operators are applied to the gas-kinetic
equations (20) and (21). These operators aggregate the contributions of free-flowing and
constrained drivers to the respective conservative variables density, momentum, and energy.
Hoogendoorn and Bovy (1998c) show that using these conservative variables yields a simpli-
fied derivation approach and improved numerical analysis of the resulting macroscopic mod-
els. Moreover, the model can be easily recast in its primitive form (density, expected velocity,
velocity variance).
Aggregation operators. Let us define the operator $\Xi_u^j$ on any function $a(v,v^0)$ by:
$$\Xi_u^j[a(v,v^0)] = \iint a(w,w^0)\,\xi_u^j(x,w,w^0,t)\,dw^0\,dw \tag{22}$$
For instance, if we consider the momentum $v$ of a free-flowing vehicle of class $u$ on lane $j$
driving at velocity $v$, $\Xi_u^j[v]$ determines the total free-flowing traffic momentum of class
$u$ on lane $j$. Equivalently, we can define the operator $\Sigma_u^j$ by:
$$\Sigma_u^j[a(v,v^0)] = \iint a(w,w^0)\,\sigma_u^j(x,w,w^0,t)\,dw^0\,dw \tag{23}$$
To derive the macroscopic flow equations, the gas-kinetic equations (20) and (21) are
multiplied by $v^k$, with $k=0,1,2$. Subsequently, the resulting equations are integrated with
respect to the velocity $v$ and the desired velocity $v^0$. In effect, these aggregation
operators total the contributions of the vehicles driving at various velocities $v$ to the
respective conservative variable. For instance, $v\,\xi_u^j(v)$ reflects the contribution to the
traffic momentum $\Xi_u^j[v]$ of free-flowing vehicles driving at a velocity $v$. By multiplying
the reduced Paveri-Fontana equations by $v^k$ for $k=0$, 1, and 2, the dynamics of
$\Xi_u^j[v^k]$ and $\Sigma_u^j[v^k]$ are established. Hoogendoorn and Bovy (1998c) show that
these respectively equal the traffic density ($k=0$), the traffic momentum or flow ($k=1$), and
two times the traffic energy ($k=2$) of class $u$ on lane $j$ of free-flowing and constrained
vehicles respectively.
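The aggregation step can be illustrated numerically: multiplying a discretized reduced
phase-space density by $v^k$ and summing over the velocity grid gives the density ($k=0$), the
momentum ($k=1$), and twice the energy ($k=2$). The sketch below does this for a single state,
class, and lane; the function name `velocity_moments` and the example values are assumptions
for illustration.

```python
def velocity_moments(velocities, rho, dv):
    """Return (density, momentum, energy) from a discretized reduced PSD.

    The k-th moment is approximated by sum_i v_i**k * rho_i * dv.
    The energy is half of the second moment.
    """
    moments = [
        sum(v ** k * p for v, p in zip(velocities, rho)) * dv
        for k in range(3)
    ]
    density, momentum, twice_energy = moments
    return density, momentum, 0.5 * twice_energy


# Hypothetical example values (not taken from the paper).
velocities = [20.0, 25.0, 30.0]      # m/s
rho = [0.002, 0.004, 0.002]          # veh/m per (m/s)
r, m, e = velocity_moments(velocities, rho, dv=5.0)
```

Dividing the momentum by the density recovers the mean velocity of the example distribution,
here 25 m/s.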
State-specific macroscopic traffic flow equations
State-specific MLMC conservation-of-vehicles. We establish the conservation of free-flowing
vehicles equation by assessing the aforementioned equations for $k=0$. Hoogendoorn and Bovy
(1998c) show that this yields:
[Equation (24) is not legible in the source. Its right-hand side groups three contributions:
deceleration after active interaction, immediate lane-changing, and spontaneous / postponed
lane-changing.]
Compared to the regular conservation-of-vehicles equation, the generalized conservation
equation for free-flowing vehicles of class $u$ on lane $j$ contains additional terms: the
free-flowing density changes both due to lane changing and due to transitions between
constrained and unconstrained driving. Equivalently, we can determine the conservation of
constrained vehicles equations. Moreover, we can show that the effects of active and
passive interactions of constrained vehicles without immediate overtaking cancel each other
out. That is, the constrained traffic density is not affected by interactions of constrained
vehicles that cannot immediately change to another lane. We find:
[Equation (25) is not legible in the source. Its right-hand side groups a state-changing term
and a postponed lane-changing term.]
where the operator $P_u^j$ is defined by:
$$P_u^j[a(v,v^0)] \overset{\mathrm{def}}{=} \iint a(w,w^0)\,\rho_u^j(x,w,w^0,t)\,dw^0\,dw = \Xi_u^j[a(v,v^0)] + \Sigma_u^j[a(v,v^0)] \tag{26}$$
and thus:
[Equation (27) is not legible in the source; only its right-hand side, an application of the
operator $P_u^j$ to a function of $(v,v^0)$, survives.]
State-specific MLMC momentum dynamics. Similar to the derivation of the vehicle conservation
equations, we can establish the free-flowing momentum dynamics equation by assessing the
aforementioned equations for $k=1$ (cf. Hoogendoorn and Bovy (1998c)):
[Equation (28) is not legible in the source; its right-hand side includes a state-changing
term.]
where $v_u^0$ denotes the mean desired velocity of class $u$. Equation (28) shows that, among
other things, the mean momentum of free-flowing traffic of user-class $u$ on lane $j$ changes
due to the inflow and outflow of momentum, reflected by the spatial derivative of the traffic
energy. Alternatively stated, the arithmetic mean velocity $v_u^j=m_u^j/r_u^j$ changes due to
the balance of inflow and outflow of vehicles with different velocities.
We can also derive the momentum dynamics equation for the constrained vehicles:
[Equation (29) is not legible in the source. Its right-hand side groups state-transitions,
immediate overtaking, and postponed lane changing.]
Let us remark the subtle difference between the reduction in the traffic momentum due to ac-
tive interactions for free-flowing and constrained vehicles. On the one hand, when a free-
flowing driver interacts, he either changes lanes or joins the platoon. As a result, a density
flux from the free-flowing spatial density to the constrained traffic density causes the mo-
mentum to decrease. However, this does not necessarily imply that the mean velocity of free-
flowing vehicles decreases. On the other hand, when a constrained vehicle is impeded, it will
reduce its velocity. This causes the mean velocity of the constrained vehicles to decrease. As
a consequence, the traffic momentum of the constrained vehicles is decreased. Note that this
does not necessarily imply a reduction in the spatial density of constrained vehicles.
State-specific MLMC energy dynamics. Finally, the multilane free-flowing energy dynamics
can be determined for $k=2$. These equations are similar to the dynamic equations for the
unconstrained and constrained momentum presented in the previous section, and will not be
explicitly presented in this paper. Their formulation can be found in Hoogendoorn and Bovy
(1998c).
Mixed-state macroscopic traffic flow equations
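To study the differences between aggregate-lane conservative models (cf. Hoogendoorn and
Bovy (1998d)), and the presented MLMC model, it is worthwhile to determine the mixed-state
traffic dynamics. To this end, we define the fraction of constrained vehicles (with
$r_u^{j,\mathrm{c}}$ denoting the density of constrained vehicles of class $u$ on lane $j$):
$$\alpha_u^j(x,t) \overset{\mathrm{def}}{=} r_u^{j,\mathrm{c}}(x,t)\,/\,r_u^j(x,t) \tag{30}$$
the mixed-state immediate overtaking probability from lane $j$ to lane $j\pm1$:
$$\pi_u^{j\to j\pm1} \overset{\mathrm{def}}{=} (1-\alpha_u^j)\,p_u^{j\to j\pm1} + \alpha_u^j\,q_u^{j\to j\pm1} \tag{31}$$
and the mixed-state lane-changing rates from lane $j$ to lane $j\pm1$:
$$\Delta_u^{j\to j\pm1} \overset{\mathrm{def}}{=} (1-\alpha_u^j)\,\gamma_u^{j\to j\pm1} + \alpha_u^j\,\lambda_u^{j\to j\pm1} \tag{32}$$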
MLMC conservation-of-vehicle equations. By adding the equations (24) and (25), the
multiclass multilane conservation-of-vehicles equation results:
$$\frac{\partial r_u^j}{\partial t} + \frac{\partial m_u^j}{\partial x} = -\sum_{j'=j\pm1}\left(\pi_u^{j\to j'}P_u^j[\Psi^j(v)] - \pi_u^{j'\to j}P_u^{j'}[\Psi^{j'}(v)]\right) - \sum_{j'=j\pm1}\left(\Delta_u^{j\to j'}r_u^j - \Delta_u^{j'\to j}r_u^{j'}\right) \tag{33}$$
By state-aggregation, both the state-changing term and the within-lane influence of vehicles
interacting vanish, due to the fact that neither of these processes causes changes in the
number of vehicles of class $u$ on lane $j$. In the sequel of this section, we will see that
these processes do change the momentum and energy. Compared to the aggregate-lane multiclass
model of Hoogendoorn and Bovy (1998d), additional density fluxes between the motorway lanes are
present due to the different types of lane-changing.
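MLMC momentum dynamics. By defining the equilibrium momentum $M_u^j$:
[Equation (34) is not legible in the source. It defines $M_u^j$ as $r_u^j v_u^0$ reduced by an
interaction term that depends on the relaxation constant, the immediate overtaking
probability, and the operator $P_u^j$ applied to velocity-weighted interaction rates.]
we can establish the mixed-state momentum dynamics by adding equations (28) and (29):
$$\frac{\partial m_u^j}{\partial t} + 2\frac{\partial e_u^j}{\partial x} = \frac{M_u^j - m_u^j}{\tau_u^0/(1-\alpha_u^j)} - \sum_{j'=j\pm1}\left(\pi_u^{j\to j'}P_u^j[v\Psi^j(v)] - \pi_u^{j'\to j}P_u^{j'}[v\Psi^{j'}(v)]\right) - \sum_{j'=j\pm1}\left(\Delta_u^{j\to j'}m_u^j - \Delta_u^{j'\to j}m_u^{j'}\right) \tag{35}$$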
Compared to the aggregate-lane multiclass model of Hoogendoorn and Bovy (1998d), the
distinction of lanes introduces momentum flows between the motorway lanes caused by the
different types of lane-changing. The expression for the equilibrium momentum M,/ is equal
to the expression derived by Hoogendoorn and Bovy (1998d) for aggregate-lane multiclass
traffic flow. The equilibrium momentum (34) describes changes in the traffic momentum of
class u on lane j due to vehicles interacting with other vehicles, without having the ability to
immediately overtake to either adjacent lane. Hoogendoorn and Bovy (1998d) show that this
interaction is asymmetric. That is, the presence of relatively slow vehicles, such as trucks,
has a more profound impact on the momentum of faster vehicles than vice versa. Bliemer
(1998) has used this quantification of the asymmetric interaction to determine multiclass
travel time functions for multiclass dynamic traffic assignment.
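Alternatively, we can determine the equilibrium velocity $V_u^j$. This velocity equals the
equilibrium momentum $M_u^j$ divided by the density $r_u^j$, i.e.:
[Equation (36) is not legible in the source. It expresses $V_u^j = M_u^j/r_u^j$ as the desired
velocity $v_u^0$ reduced by the interaction term of (34) divided by $r_u^j$.]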
MLMC energy dynamics. Using a similar approach, we can determine the mixed-state energy
dynamics:
$$\frac{\partial e_u^j}{\partial t} + \frac{\partial}{\partial x}\left(m_u^j H_u^j + \mathcal{J}_u^j\right) = 2\,\frac{E_u^j - e_u^j}{\tau_u^0/(1-\alpha_u^j)} - \sum_{j'=j\pm1}\left(\pi_u^{j\to j'}P_u^j[\tfrac{1}{2}v^2\Psi^j(v)] - \pi_u^{j'\to j}P_u^{j'}[\tfrac{1}{2}v^2\Psi^{j'}(v)]\right) - \sum_{j'=j\pm1}\left(\Delta_u^{j\to j'}e_u^j - \Delta_u^{j'\to j}e_u^{j'}\right) \tag{37}$$
where the equilibrium energy $E_u^j$ of class $u$ on lane $j$ is defined by:
[Equation (38) is not legible in the source. It defines $E_u^j$ through the operator $P_u^j$
applied to $v^2$-weighted interaction rates.]
where $H_u^j=3e_u^j/r_u^j-(m_u^j/r_u^j)^2$ and $\mathcal{J}_u^j$ respectively depict the traffic
enthalpy (convective energy flux) and the flux of velocity variance (non-convective energy
flux). By noticing that the traffic energy, the traffic velocity and the variance relate as
follows:
$$e_u^j = \tfrac{1}{2}\,r_u^j\left((v_u^j)^2 + \Theta_u^j\right) \tag{39}$$
where $\Theta_u^j$ denotes the velocity variance, we can establish (Hoogendoorn and Bovy (1998a)):
$$E_u^j = \tfrac{1}{2}\,r_u^j\left((V_u^j)^2 + \Theta_u^{j,e}\right) \tag{40}$$
where $\Theta_u^{j,e}$ denotes the equilibrium velocity variance:
[Equation (41) is not legible in the source.]
In the sequel, we specify relations for the equilibrium velocity and velocity variance. Using
$M_u^j=r_u^jV_u^j$ and (40), we can determine respectively the equilibrium momentum and energy.
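The relation between energy, velocity, and variance, $e = \tfrac{1}{2}r(v^2+\Theta)$, together
with $m = rv$, is what allows switching between the conservative variables (density, momentum,
energy) and the primitive variables (density, velocity, velocity variance). A minimal sketch of
both conversions for a single class and lane follows; the function names are assumptions, and
the example values are illustrative.

```python
def to_primitive(r, m, e):
    """Conservative (density, momentum, energy) -> primitive (density, velocity, variance)."""
    v = m / r
    theta = 2.0 * e / r - v * v
    return r, v, theta


def to_conservative(r, v, theta):
    """Primitive (density, velocity, variance) -> conservative (density, momentum, energy)."""
    return r, r * v, 0.5 * r * (v * v + theta)


# Hypothetical example: 0.04 veh/m, 1.0 veh/s, energy 12.75 veh*m/s^2.
r, v, theta = to_primitive(0.04, 1.0, 12.75)
r2, m2, e2 = to_conservative(r, v, theta)
```

The round trip returns the original conservative values, so neither form loses information as
long as the density is non-zero.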
MLMC-MODEL FORMULATIONS
We can summarize the model equations by defining the vector $w_u^j=(r_u^j,m_u^j,e_u^j)$.
Hoogendoorn and Bovy (1998c) show that the model equations (33), (35) and (37) can be recast as
follows:
$$\frac{\partial w_u^j}{\partial t} + \frac{\partial f(w_u^j)}{\partial x} = \frac{\partial w_u^j}{\partial t} + A_u^j\,\frac{\partial w_u^j}{\partial x} = s_u^j$$
where $f$ denotes the flux vector and $A_u^j$ is the conservative flux-Jacobian. It describes how
small spatial variations in the conservative variables influence the other conservative
variables over time. The vector $s_u^j$ summarizes the right-hand sides of the equations (33),
(35) and (37).
Using this formulation, the model can be recast into among others its primitive form, and its
characteristic or Riemann form. The former describes the dynamics of the MLMC density,
velocity and velocity variance and is consequently well suited for comparing the MLMC
model equations with other macroscopic flow models. The characteristic form describes the
dynamics of the characteristic variables. Although these variables lack intuitive appeal, they
are of dominant importance when mathematically analyzing the properties of the flow equa-
tions. For instance, they reveal the way in which small perturbations are transported in the
flow along the so-called characteristic curves. It can be shown (cf. Hoogendoorn (1998c)) that
when the traffic conditions are free-flow, disturbances are transported downstream. In
contrast, when traffic conditions are congested, perturbations are transported in both
upstream and downstream directions.
Figure 1 shows the relations between the various formulations and their respective uses. For a
detailed account on the different model formulations, we refer to Hoogendoorn (1998c,1999).
[Figure 1 (diagram). Three model forms are related: CONSERVATIVE (density r, momentum m,
kinetic energy e; w=(r,m,e); conservation of vehicles, momentum dynamics, energy dynamics),
PRIMITIVE, and RIEMANN (one path-line variable and two mach-line variables; z=(z,z,z);
characteristic equations, a decoupled system describing the dynamics of the Riemann
variables). Numerical methods listed: upwind schemes (CIR, Van Leer, Steger-Warming) and
Godunov-type / Riemann solvers (Godunov scheme, Roe's approximate Riemann solver).]
Figure 1: Different forms of traffic flow models, the relevant variables, and the applicable
numerical solution methods.
THE MLMC-EQUILIBRIUM CONDITIONS
In this section we will consider the MLMC equilibrium conditions for mixed-state traffic. To
this end, we propose a simple procedure to determine these equilibrium conditions. The dis-
cussion focuses on both the distribution of density on the roadway lanes, and the equilibrium
velocity. We will consider two user-classes, namely trucks and person-cars.
Specification of model relations
Before presenting the approach to determine the equilibrium traffic conditions for the
MLMC-model, the acceleration time $\tau_u$, the desired velocities $v_u^0$, the fraction of
constrained vehicles $\alpha_u^j$, the immediate lane-changing probabilities $\pi_u^j$, and the
lane changing rates $\Delta_u^j$ are specified. In the scope of this preliminary study, we have
neglected the role of the flux of velocity variance, i.e. $\mathcal{J}_u^j=0$.
Desired velocities. For the desired velocities of person-cars and trucks, the following values
have been respectively chosen:
$$v^0_{\text{person-car}} = 32\ \mathrm{m/s} \quad\text{and}\quad v^0_{\text{truck}} = 24\ \mathrm{m/s} \tag{42}$$
These values agree with average values observed on two-lane motorways in the Netherlands.
Note that in the Netherlands, the distinct speed limits on motorways for person-cars and
trucks are 32 m/s and 22 m/s respectively.
The acceleration times. The acceleration time reflects the average acceleration capabilities of
vehicles of a specific user-class. Since person-cars generally have better acceleration
capabilities than trucks, we assume $\tau_{\text{person-car}}<\tau_{\text{truck}}$. More
specifically, we have chosen:
[Equation (43): the chosen values of $\tau_{\text{person-car}}$ and $\tau_{\text{truck}}$ are not
legible in the source.]
Immediate overtaking probabilities. To specify the immediate overtaking probabilities, we
consider the distribution of gaps on the destination lane. Let $G^j$ be a random variate
describing the gap on lane $j$. We define the gap as the distance between the rear bumper of
the leading vehicle and the front bumper of the following vehicle. The mean gap $E(G^j)$ can be
expressed in terms of the lane density $r^j$ and the mean vehicle length $L^j$ on the lane:
$$E(G^j) = 1/r^j - L^j \tag{44}$$
where:
$$L^j = \sum_u \left(r_u^j L_u\right) / r^j \tag{45}$$
We have assumed that the available gaps can be modeled by a log-normal distribution.
Let $s_u^j$ denote the space needed by a vehicle of class u driving on lane j on either of the
destination lanes j±1. The space needed is expressed as a function of the mean vehicle length $L_u$,
the average velocity $v_u^j$ and the reaction time $T_u$. For a vehicle of class u driving with
velocity v we assume:
$$s_u(v) = 2.5\,(L_u + T_u v) \qquad (6)$$
Let us assume that the velocities of user-class u on lane j are Gaussian-distributed random
variates with mean $v_u^j$ and variance $\Theta_u^j$. The random variate $S_u^j$ describing the distribution of
space needed by vehicles of class u on lane j is then also Gaussian, with mean $2.5\,(L_u + T_u v_u^j)$ and
variance $(2.5\,T_u)^2\,\Theta_u^j$. The probability that a vehicle of class u on lane j can change
to either of the adjacent lanes equals:
$$\chi_u^{j\to j\pm1} = \chi_u^{j\to j\pm1}(v_u^j, \Theta_u^j, r^{j\pm1}) = \Pr(S_u^j < G^{j\pm1}) \qquad (7)$$
We assume that the immediate lane-changing probabilities for constrained vehicles are
negligible, i.e. qj=0. That is, the probability that an overtaking opportunity occurs precisely when
a constrained vehicle actively interacts equals zero. Thus $\pi_u^{j\to j\pm1}$ equals:
$$\pi_u^{j\to j\pm1} = \pi_u^{j\to j\pm1}(v_u^j, \Theta_u^j, \alpha_u^j, r^{j\pm1}) = (1-\alpha_u^j)\,\beta_u^{j\to j\pm1}\,\chi_u^{j\to j\pm1} \qquad (8)$$
where $\beta_u^{j\to j\pm1}$ models both the preference for either of the adjacent lanes, and whether the driver
may use that lane for overtaking. For instance, for non-congested traffic operations in Europe
we have $\beta_u^{j\to j+1}=1$ and $\beta_u^{j\to j-1}=0$.
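As an illustration of relations (4)-(8), the following minimal Python sketch evaluates the mean gap, the probability Pr(S < G) and the resulting immediate lane-changing probability. It is not the authors' implementation. The function names, the coefficient of variation of the log-normal gap (gap_cv), and any vehicle lengths or reaction times passed in are assumptions introduced here for illustration only; the text fixes the mean gap through (4) but does not state the shape of the log-normal distribution.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the log-normal CDF


def mean_vehicle_length(lane_density, length):
    """Eq. (5): density-weighted mean vehicle length on one lane [m]."""
    total = sum(lane_density.values())
    return sum(lane_density[u] * length[u] for u in lane_density) / total


def mean_gap(lane_density, length):
    """Eq. (4): mean gap E(G) = 1/r - L on one lane [m]."""
    total = sum(lane_density.values())
    return 1.0 / total - mean_vehicle_length(lane_density, length)


def gap_survival(s, gap_mean, gap_cv):
    """Pr(G > s) for a log-normal gap with given mean and coefficient of variation.

    gap_cv is an assumed shape parameter; it is not specified in the text.
    """
    if s <= 0.0:
        return 1.0
    sigma2 = math.log(1.0 + gap_cv ** 2)
    mu = math.log(gap_mean) - 0.5 * sigma2
    return 1.0 - STD_NORMAL.cdf((math.log(s) - mu) / math.sqrt(sigma2))


def gap_acceptance_probability(v_mean, v_var, length_u, reaction_time,
                               gap_mean, gap_cv=1.0, steps=2000):
    """Eq. (7): Pr(S < G), with S Gaussian as implied by eq. (6)."""
    s_mean = 2.5 * (length_u + reaction_time * v_mean)
    s_std = 2.5 * reaction_time * math.sqrt(v_var)
    if s_std == 0.0:
        return gap_survival(s_mean, gap_mean, gap_cv)
    space = NormalDist(s_mean, s_std)
    lower = s_mean - 6.0 * s_std
    h = 12.0 * s_std / steps
    total = 0.0
    # Trapezoidal rule for the integral of f_S(s) * Pr(G > s) over mean +/- 6 sd.
    for i in range(steps + 1):
        s = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * space.pdf(s) * gap_survival(s, gap_mean, gap_cv)
    return total * h


def immediate_lane_change_probability(alpha, beta, chi):
    """Eq. (8): pi = (1 - alpha) * beta * chi."""
    return (1.0 - alpha) * beta * chi
```

With these definitions, a denser destination lane (smaller mean gap) yields a smaller acceptance probability, and hence a smaller immediate lane-changing probability, which is the qualitative behaviour the specification is meant to capture.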
Spontaneous and postponed lane changing rates. The spontaneous lane changing rates are also
specified by considering the gap distribution on the destination lane. Considering European
traffic regulations, traffic must use the rightmost lane if possible. Thus, we assume:
$$Y_u^{j\to j+1} = 0 \quad\text{and}\quad Y_u^{j\to j-1} = \chi_u^{j\to j-1} / T_u^0 \qquad (9)$$
where $1/T_u^0$ is the free-flow spontaneous lane changing rate. $T_u^0$ can be considered to be the
mean time needed for an overtaking maneuver.
Postponed lane changing is also a function of the available gap distribution on the destination
lanes. We propose:
K^j±l =$Ju-*j±1ti~*j±l/w? (10)
where $W_u^0$ denotes the mean time waiting behind the leading vehicle given free-flow
conditions.
Velocity variance and fraction of constrained vehicles. We assume that both the velocity
variance and the constrained vehicle fraction can be adequately expressed as functions of the
mean number of vehicles per unit unoccupied lane-space $\hat{r}^j = 1/E(G^j)$ (see Figure 2).
[Plot, two panels: "Equilibrium velocity variance" and "Constrained vehicle fraction", each against effective density [veh/(m lane)], range 0.00 to 0.08.]
Figure 2: Equilibrium velocity variance and constrained vehicle fraction as functions of the
mean number of vehicles per unit unoccupied space on a lane.
Determination of the equilibrium conditions
We have assumed that the velocities are Gaussian distributed random variates, and can thus
be specified by the mean velocity $v_u^j$ and the velocity variance $\Theta_u^j$. The assumption of
Gaussian distributed velocities, together with the specifications of the constrained vehicle fraction
and the immediate overtaking probability, enables us to determine the equilibrium velocity (36)
given $r_u^j$, $v_u^j$, and $\Theta_u^j$, i.e.:
$$V_u^j = V_u^j(\mathbf{r}, \mathbf{v}, \boldsymbol{\Theta}) \qquad (11)$$
where $\mathbf{r}$, $\mathbf{v}$, and $\boldsymbol{\Theta}$ are vectors of respectively the densities, velocities, and velocity variances of
each class on each lane.
We define equilibrium traffic conditions by:

1. Equilibrium of inflow and outflow for each of the lanes. That is, the number of vehicles
leaving lane j equals the number of vehicles arriving at lane j due to lane changing.
2. The velocities and velocity variances equal the equilibrium velocities and variances.
Considering the MLMC conservation of vehicle equation (33), condition 1 yields:

7i/P y 'pF ; (v)]-tiR' = V ,
U It U U ^^ j =j±\
(Ti^P/L^v)]-A{^'/?j')
" U
(12)
^~ '
where $R_u^j$ denotes the equilibrium density distribution. Condition 2 yields:
$$V_u^j = V_u^j(\mathbf{R}, \mathbf{V}, \boldsymbol{\Theta}) \qquad (13)$$
[Plot, two panels: "Right-lane equilibrium velocities" and "Left-lane equilibrium velocities", each against effective density [veh/(m lane)].]
Figure 3: The equilibrium velocity on the right lane and the left lane of a two-lane
motorway, for constant truck densities, and different person-car densities.
[Plot against effective density [veh/(m lane)], range 0.00 to 0.08.]
Figure 4: The equilibrium fraction of person-cars and trucks using the left lane for fixed truck
density values and increasing person-car densities.
Given the aggregate-lane densities $r_u$ for the respective user-classes, we can iteratively
determine the equilibrium lane distribution, velocity, and velocity variance satisfying (12) and
(13). By way of illustration, Figures 3 and 4 show the equilibrium velocities and the left-lane
fractions for fixed truck densities ($r_{\text{truck}}$ = 0, 2, and 10 trucks/km/lane).
NUMERICAL SOLUTION APPROACHES
In the past, several numerical approximation schemes have been proposed to determine
solutions to a variety of macroscopic models (see Lyrintzis et al. (1994), Lebacque (1996), and
Hoogendoorn and Bovy (1998a)). Because of the increased complexity of the developed
higher-order traffic flow model there is a need for more efficient numerical approaches to ap-
proximate solutions.
As with other traffic flow models, the numerical treatment of our multilane multiclass flow
model is quite cumbersome. Hoogendoorn (1999) describes a new approach to solve higher-
order multilane flow models, based on the flow equations cast in conservative variables.
The resulting scheme is an adaptation of the Van Leer Flux-Vector splitting scheme (cf. Van
Leer (1982)). It considers the direction in which the perturbations are transported in the
MLMC traffic flow, while conserving the density, momentum, and energy in the distin-
guished roadway segments that follow if only convective processes are considered. The
scheme adapts to the prevailing traffic conditions (free-flow/congested). Moreover, the effects
of the non-continuum processes are quantified in a multistep approach, in order to prevent ve-
hicles laterally flowing out of an empty lane. A fourth-order Runge-Kutta approach was cho-
sen for the temporal discretization. For details, we refer to Hoogendoorn (1999).
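The text names a fourth-order Runge-Kutta method for the temporal discretization but refers to Hoogendoorn (1999) for the details. The sketch below shows only the classical RK4 step for a generic semi-discrete system dy/dt = rhs(t, y). The right-hand side used here, a relaxation of the velocity towards the desired velocity with acceleration time tau, is a toy stand-in chosen for illustration; the value tau = 10 s and the function names are assumptions, not values or code from the paper.

```python
def rk4_step(rhs, t, state, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = rhs(t, y)."""
    k1 = rhs(t, state)
    k2 = rhs(t + 0.5 * dt, [y + 0.5 * dt * k for y, k in zip(state, k1)])
    k3 = rhs(t + 0.5 * dt, [y + 0.5 * dt * k for y, k in zip(state, k2)])
    k4 = rhs(t + dt, [y + dt * k for y, k in zip(state, k3)])
    return [y + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]


def relaxation_rhs(t, state, v_desired=24.0, tau=10.0):
    """Toy right-hand side: dv/dt = (v_desired - v) / tau (tau is hypothetical)."""
    return [(v_desired - v) / tau for v in state]


def integrate(rhs, state, t_end, dt):
    """Advance the state from t = 0 to t_end with fixed steps of size dt."""
    t = 0.0
    for _ in range(round(t_end / dt)):
        state = rk4_step(rhs, t, state, dt)
        t += dt
    return state
```

In a full scheme the state vector would hold the conservative variables of every class, lane and cell, and rhs would return the spatially discretized flux differences and source terms.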
APPLICATION OF THE MLMC-MODEL
In this section we discuss results of macroscopic simulation of two test-case examples, based
on the specifications proposed in the preceding section. These are: mixing traffic classes, and
a lane drop.
Mixing of classes
In the first example, we consider a two-lane ringroad of 20km length. At time t=0, three
homogeneous regions are present. The first region $X_0^0$=[2km,6km) consists of person-cars only.
The second region $X_0^1$=[6km,8km) consists of both person-cars and trucks, while the third
region $X_0^2$=[8km,12km) consists of trucks only (Figure 5). We assume that initially, traffic is in
equilibrium. That is, the initial velocities, velocity variances, and the distribution of vehicles on
the lanes are determined by applying the approach described in the previous section (Figure
6).
When we consider the dynamics of the vehicles in the head of the region $X_0^2$, trucks flow into
the empty downstream roadway section at a velocity nearly equal to the desired velocity. For
instance, considering instant t=4min, the fastest trucks are located at approximately x=18.5km.
These vehicles have traveled at a velocity of approximately 95km/hr. By studying the
characteristics of the flow model, we can show that these vehicles have traveled along the
so-called Mach-line of the flow, defined by $dx/dt = v_u^j + (3\Theta_u^j)^{1/2}$ (cf. Hoogendoorn and Bovy
(1998d)).
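The quoted front speed can be checked with a few lines of arithmetic. The sketch below assumes that the front starts at the downstream edge of the trucks-only region (x = 12 km at t = 0) and that the mean truck velocity at the front equals the desired velocity of 24 m/s from (2). Under those assumptions it inverts the Mach-line relation for the velocity variance. The resulting variance is an illustrative back-of-the-envelope figure, not a value reported in the paper, and the function names are hypothetical.

```python
def front_speed_kmh(x_start_km, x_end_km, elapsed_min):
    """Average speed of the front between two positions [km/h]."""
    return (x_end_km - x_start_km) / (elapsed_min / 60.0)


def variance_from_mach_line(front_speed_ms, mean_speed_ms):
    """Invert dx/dt = v + sqrt(3 * theta) for the velocity variance theta [m^2/s^2]."""
    excess = front_speed_ms - mean_speed_ms
    if excess < 0.0:
        raise ValueError("front is slower than the mean velocity")
    return excess ** 2 / 3.0


# Front position: x = 12 km at t = 0 (assumed), x = 18.5 km at t = 4 min (from the text).
speed = front_speed_kmh(12.0, 18.5, 4.0)
# 24 m/s is the desired truck velocity of eq. (2), used here as the mean velocity.
theta = variance_from_mach_line(speed / 3.6, 24.0)
print(round(speed, 1), round(theta, 2))
```

The implied front speed is 97.5 km/h, in line with the "approximately 95km/hr" quoted above, and the excess over 24 m/s corresponds to a velocity variance of roughly 3 m^2/s^2.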
[Schematic labels: person-cars, trucks; positions x=0km, x=2km, x=8km.]
Figure 5: Schematics of two-lane ringroad, and initial distribution of person-cars and trucks.
Compared to the trucks-only region $X_0^2$, where trucks use both lanes of the two-lane roadway
comparably, in the mixed region $X_0^1$ the lane-use of trucks is more confined to the right-lane.
Fast person-cars flowing from this region into the truck-only region do not affect the veloci-
ties of the trucks (e.g. t=4min, x=[\lkm,\lkm)). However, the truck-density lane-distribution
is affected, in that trucks only use the right roadway lane. In the upstream region, where per-
son-car densities are higher, the velocities of person-cars are reduced to such an extent, that
trucks actively interact with slow person-cars, and consequently need to reduce their velocity.
A jam forms between the person-car only region $X_0^0$ and the mixed region $X_0^1$. This area
appears due to fast person-cars from the upstream region $X_0^0$ flowing into the low-velocity
mixed region $X_0^1$. These fast person-cars interact with trucks and slow person-cars, and
consequently need to slow down. This results in both a decreased velocity and an increased
density. The increase in the density results in even more interactions with slow vehicles. This
avalanche-like process causes the formation of congestion in the transitional area between the
two regions. This phenomenon is comparable to the formation of localized structures leading
to the development of phantom-jams (cf. Kerner et al. (1996)).
Let us finally consider the tails of the regions $X_0^0$ and $X_0^1$. Clearly, none of the vehicles flow
back into the lower-density regions upstream of these tails. We may therefore conclude that
the model satisfies the so-called anisotropy condition (cf. Daganzo (1995)). That is, vehicles
mainly react to stimuli in front.
[Plot, panels at t = 0 min and t = 1 min: densities and velocities against road position [km], for person-cars and trucks on the right and left lanes.]
Figure 6: Traffic conditions on a two-lane ringroad for 'mixing of traffic' example.
The lane drop
The second test case describes the traffic conditions on the 20km two-lane ringroad at a lane
drop. The lane drop is located at X=[lkm,13km). We will consider a left-lane drop (Figure 7).
Upstream of the lane drop, both person-cars and trucks are present. We assume that at t=0,
traffic conditions are in equilibrium (see Figure 8). The lane-drop is modeled by placing
motionless virtual vehicles on the left lane. The presence of these virtual vehicles causes an
increased number of active interactions, yielding on the one hand an increase in the number of
lane-changes to the right lane, and on the other hand, an increase in the density on the left lane.
[Schematic labels: positions x=0km, x=10km.]
Figure 7: Schematics of left-lane drop on a unidirectional two-lane ringroad.
From the simulation results we see that initially, person-cars and trucks flow into the left lane.
However, they change to the right lane quickly, resulting in an empty left lane. Some of the
person-cars change to the left roadway lane when arriving at the end of the lane drop at x=13km.
Figure 8 shows that congestion occurs on the right roadway lane. That is, although vehicles
are able to flow into the right roadway lane without causing congestion upstream of the lane
drop, fast vehicles flowing into the right lane interact with slow vehicles downstream
on the right lane and need to reduce their velocity. This reduction is more profound, since no
overtaking opportunities occur. Since each interaction leads to a velocity decrease, the right lane
is more susceptible to the seemingly spontaneous formation of jams due to vehicle interaction.
CONCLUSIONS AND OUTLOOK
This paper presents a macroscopic multilane multiclass traffic flow model that is founded on
gas-kinetic principles. The model dynamics are governed by processes of a convective nature,
acceleration towards the desired velocity, deceleration due to interaction, and various types of
lane changing. The model distinguishes constrained and free-flowing vehicles. It is cast using
the class-specific and lane-specific conservative variables density, momentum, and energy,
enabling a simplified model derivation and improved numerical analysis.
The model allows the specification of several parameters that define the characteristics of the
classes and the roadway lanes. Among these parameters are the vehicle length, the desired
velocity, the acceleration time, the reaction time, the constrained vehicle fraction, and the
lane-changing probability functions.
The gas-kinetic approach yields expressions for the equilibrium momentum, the equilibrium
energy, and the distribution of densities over the roadway lanes. From these relations,
expressions describing the equilibrium velocity and velocity variance have been determined. The
equilibrium relations are functions of the user-class specific immediate overtaking probabili-
ties, acceleration times, desired velocities, covariances between the velocity and the desired
velocity, and the asymmetric interaction between fast and slow vehicles of the same and dif-
ferent user-classes.
[Plot: densities and velocities against road position [km], for person-cars and trucks on the right and left lanes, with the lane drop marked.]
Figure 8: Traffic conditions for the 'left lane drop' example.
From the test-case example of a two-lane ringroad we conclude that the preliminary results of
macroscopic simulation are plausible. Trucks increasingly use the right lane when the person-
car density increases, leaving the left lane for the faster person-cars. The slower trucks are
virtually unaffected by faster person-cars. However, when the velocity of person-cars de-
creases, trucks are affected by these slower person-cars and consequently need to reduce their
velocity as well.
MLMC traffic operations near a lane drop were simulated. Again, the macroscopic MLMC
model was able to simulate the traffic operations at the lane drop realistically. Moreover,
incidents and on-ramps can also be described.
Hoogendoorn (1999) identifies and remedies some of the model's current shortcomings. For
instance, he argues that the vehicular chaos assumption only holds for dilute traffic, due to the
increased correlation between vehicles for increasing traffic densities. This can be remedied
by considering the correlation of the velocities of unconstrained vehicles and platooning vehi-
cles. Moreover, Hoogendoorn (1999) shows that platooning vehicles are able to accelerate
(to the desired velocity of the unconstrained platoon leader). Finally, the author incorporates
the vehicular space requirements in the MLMC dynamic equations.
REFERENCES
Bliemer, M. (1998). Multiclass travel time functions. Proceedings of the 6th meeting of the
EURO Working Group of Transportation.
Daganzo, C.F. (1995). Requiem for second-order fluid approximations of traffic flow. Trans-
portation Research B, 29, 277-286.
Daganzo, C.F. (1997). A Continuum Theory of Traffic Dynamics for Freeways with Special
Lanes. Transportation Research B, 31, vol. 2, 83-102.
Helbing, D. (1996). Gas-kinetic derivation of Navier-Stokes-like traffic equations, Physical
Review E, 53, vol. 3, 2366-2381.
Helbing, D. (1997). Modelling multilane traffic flow with queuing effects. Physica A, 242,
175-194.
Hoogendoorn, S.P. (1997). A Macroscopic Model for Multiple User-Class Traffic Flow. Pro-
ceedings of the 3rd TRAIL PhD. Congress, vol. I.
Hoogendoorn, S.P. (1999). Multiclass Continuum Modelling of Multilane Traffic Flow. Dis-
sertation Thesis, Delft University Press.
Hoogendoorn, S.P. and Bovy, P.H.L. (1998a). Multiple User-Class Traffic Flow Modelling -
Derivation, Analysis and Numerical Results. Research Report VK2205.328, Delft Univer-
sity of Technology.
Hoogendoorn, S.P. and Bovy, P.H.L. (1998c). Continuum Modelling of Multilane Heteroge-
neous Traffic Flow Operations. Research Report VK 2205.330, Delft University of Tech-
nology.
Hoogendoorn, S.P. and Bovy, P.H.L. (1998d). Macroscopic Modelling of Multiple User-Class
Traffic Flow using Conservative Variables. Proceedings of the 6th meeting of the EURO
Working Group of Transportation.
Hoogendoorn, S.P., and Bovy, P.H.L. (1998b). Modelling Multiple User-Class Traffic. Pre-
print 980692 of the 1998 TRB annual meeting.
Kerner, B.S., Konhäuser, P., and Schilke, M. (1996). A new approach to problems of traffic
flow theory, Proceedings of the 13th International Symposium of Transportation and
Traffic Theory, INRETS, Lyon, 119-145.
Klar, A., R.D. Kühne, and R. Wegener (1998). A hierarchy of models for multilane vehicular
traffic I: Modeling. To appear in SIAM J. Appl. Math.
Lebacque, J.P. (1996). The Godunov scheme and what it means for first order traffic flow
models. Proceedings of the 13th International Symposium of Transportation and Traffic
Theory, 647-677.
Lyrintzis, A.D., Liu, G., Michalopoulos, P.G. (1994). Development and comparative evalua-
tion of high-order traffic flow models, Transportation Research Record 1547, 174-183.
Paveri-Fontana, S.L. (1975). On Boltzmann-Like treatments for traffic flow: a critical review
of the basic model and an alternative proposal for dilute traffic analysis. Transportation
Research, 9, 225-235.
Payne, H.J. (1979). FREFLO: A Macroscopic Simulation Model For Freeway Traffic.
Transportation Research Record 772, 68-75.
Prigogine, I., and Herman, R. (1971). Kinetic theory of vehicular traffic. American Elsevier
Publishing Co., New York.
Van Leer (1982). Flux vector splitting for the Euler equations. Proceedings of the 8th Interna-
tional Conference in Numerical Methods in Fluid Dynamics, Berlin, Springer-Verlag.
THE CHAPMAN-ENSKOG EXPANSION:

A NOVEL APPROACH TO
HIERARCHICAL EXTENSION
OF LIGHTHILL-WHITHAM MODELS
Paul Nelson1
Alexandros Sopasakis
Department of Mathematics, Texas A&M University, College Station, Texas 77843-3368, U.S.A.
ABSTRACT
Attempts to improve on the basic continuum (hydrodynamic, macroscopic) model of traffic flow,
as developed in the seminal 1955 paper of Lighthill and Whitham, have largely followed the
1971 work of Payne in retaining the continuity equation, but replacing the classical traffic stream
model by a "dynamic traffic stream model" (or "momentum equation"). In the present work it
is suggested that, whatever may be the advantages and disadvantages of the Payne models, they
should not properly be regarded as the traffic flow analog of the Navier-Stokes equations of fluid
dynamics. Further, the Chapman-Enskog asymptotic expansion in a small parameter is shown
to lead to an alternate class of models that seem to have a more legitimate claim to that distinc-
tion. Details of this expansion, about the stable-flow equilibria of the Prigogine-Herman kinetic
equation and in the case that the passing probability and relaxation time are constant, are
presented to orders zero and one. The zero-order and first-order expansions correspond respectively
to the Lighthill-Whitham (LWR) model, and to the Lighthill-Whitham model with a diffusive
correction. These are suggested to be the correct traffic-flow analogs of respectively the Euler
and Navier-Stokes equations of fluid dynamics. Results of a numerical simulation for a simple
traffic-flow problem suggest that the diffusive term represents a correction to the LWR model that
captures, to some extent, effects stemming from the fact that vehicles actually travel at various
speeds. (By contrast, Lighthill-Whitham models proceed as if all vehicles travel at the average
speed corresponding to the density of vehicles in their immediate vicinity.) Some further related
work is suggested.
1 Also affiliated with the Departments of Computer Science and of Nuclear Engineering at Texas A&M
University.
INTRODUCTION
Traffic stream models (fundamental diagrams, flow/density relations) date back to (at least) the
work of Greenshields (1934), and are often considered as the foundation of capacity analy-
sis (Transportation Research Board, 1985). Lighthill and Whitham (1955), and independently
Richards (1956), observed that if a traffic stream model is supplemented by the equation of conti-
nuity (conservation of vehicles), then the resulting partial differential equation presumably could
be solved, subject to suitable initial and boundary conditions, for the concentration (and hence
the mean speed and flow) as a function of location and time. This Lighthill-Whitham-Richards
model (LWR model) is widely considered the most fundamental continuum (macroscopic, hy-
drodynamic) model of traffic flow.
Unfortunately, observational data (e.g., Drake, Schofer and May, 1965) are at best ambiguous
as regards existence of traffic stream models.2 This has led some workers (e.g., Ceder (1976),
Hall (1987), Disbro and Frame (1989)) to suggest alternative macroscopic models that differ
from LWR models in a revolutionary manner. However, the usual approach to alternative con-
tinuum models is more evolutionary, in following the work of Payne (1971) by supplementing
the continuity equation with a "dynamic traffic stream model" (or "momentum equation") that is
itself a differential equation, as opposed to the classical "static" traffic stream models that do not
contain derivatives. This gives rise to a system of two first-order partial differential equations in
two unknowns, typically concentration and mean speed. Such systems are customarily known as
higher-order models.
2 This difficulty was already noted by Lighthill and Whitham (1955, p. 344).
Higher-order continuum theories of traffic flow frequently (e.g., Helbing, 1996b; Kerner and
Konhäuser, 1993) are considered as analogs of the Navier-Stokes equations of fluid flow, and
correspondingly the LWR theory is viewed as analogous to the Euler equations of fluid flow. The
present work is intended as somewhat of a counterpoint to the first of these views. Specifically,
the primary objectives of this paper are as follows:
1. It is suggested, in some detail, that there is a viewpoint from which the often repeated sup-
posed analogy between current higher-order models of traffic flow and the Navier-Stokes
equations of fluid dynamics is questionable.
2. It is noted that a systematic application of the Chapman-Enskog asymptotic expansion to
a kinetic model of vehicular traffic, following the lines of the use of this technique for the
Boltzmann kinetic equation and its solutions in the flow of rarefied gases (e.g., Chapman
and Cowling, 1952; Cercignani, 1988; Liboff, 1990), has the potential to provide not only
continuum models of traffic flow that are true analogs of the Navier-Stokes equations, but
even a systematic hierarchy of continuum models for traffic flow.
3. Some initial developments of this approach via the Chapman-Enskog asymptotic
expansion are presented. These developments are based on the classical Prigogine-Herman
(1971) equation of the kinetic theory of vehicular traffic.
The first two of these objectives are met in the following section, in the context of a general crit-
ical review of Continuum Models of Vehicular Traffic. The remainder of this work is devoted
to the third objective. This is initiated with a section in which the salient features of Kinetic
Equations for Vehicular Traffic are collected. The focus in this section is particularly upon the
equilibrium solutions of the Prigogine-Herman kinetic equation, and the necessity to limit further
considerations to the regime of stable flow, because of the form of these equilibrium solutions.
The next section is devoted to The Chapman-Enskog Expansion, and contains the central
new results presented in this work. First the Chapman-Enskog expansion is described generally,
in the context of the Prigogine-Herman kinetic equation. Then the zero-order instance of this
expansion is shown to give rise to an LWR model, with traffic stream model completely defined
in terms of the knowns of the underlying Prigogine-Herman equation. Finally, the first-order
Chapman-Enskog expansion is shown to give rise to a continuum approximation that consists
of an LWR model with an additional "diffusive" term. The section on Computational Results
provides a comparison of numerical results, and a discussion of the interpretation of these differ-
ences, for this first-order diffusive approximation and the LWR model that is the corresponding
zero-order approximation, in a simple instance of traffic flow. The paper closes with a section of
Conclusions, which primarily consists of suggestions for further related work.
CONTINUUM MODELS OF VEHICULAR TRAFFIC

In the opening paragraphs of this section the elements of LWR and higher-order models are
collected, and three key issues (validity, mathematical solvability and computational solution)
affecting each of these types of models are identified. Then follow three subsections in which
the current state of affairs of both LWR and higher-order models vis-à-vis these three issues is
briefly reviewed. The concluding subsection is devoted to a discussion of possible continuum
models that potentially improve upon LWR models, but in a fundamentally different manner
from that underlying current higher-order models.
Traffic stream models3 will be written as
$$q = Q(c), \qquad (1)$$
where q is flow, c is density, and Q is a known function. The equation of continuity (conservation
of vehicles) will be expressed as
$$\frac{\partial c}{\partial t} + \frac{\partial q}{\partial x} = 0.$$
The two can be combined to give a determined system consisting of a single (nonlinear) partial
differential equation in one unknown (the density),
$$\frac{\partial c}{\partial t} + q'(c)\,\frac{\partial c}{\partial x} = 0,$$
($q' = dq/dc$) that presumably can be solved, subject to suitable initial and boundary conditions,
for the density (and hence the mean speed and flow) as a function of location and time.
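To make the combined equation concrete, the following Python sketch advances the density on a closed ring road with a first-order Godunov (demand/supply) scheme, a standard conservative method for scalar conservation laws with a concave flux. Everything specific in it is an assumption made for illustration: the Greenshields parabola used for Q(c), the parameter values, the periodic boundary, and the function names. It is not the numerical method used for the computational results of this paper.

```python
def greenshields_flow(c, v_free=30.0, c_jam=0.15):
    """Hypothetical traffic stream model q = Q(c): Greenshields parabola [veh/s]."""
    return v_free * c * (1.0 - c / c_jam)


def godunov_flux(c_up, c_down, flow, c_crit):
    """Godunov flux across a cell interface for a concave Q with maximum at c_crit."""
    demand = flow(min(c_up, c_crit))    # what the upstream cell can send
    supply = flow(max(c_down, c_crit))  # what the downstream cell can receive
    return min(demand, supply)


def lwr_step(c, dx, dt, flow, c_crit):
    """One explicit conservative update of dc/dt + dQ(c)/dx = 0 on a ring road.

    Stability requires dt <= dx / max|Q'(c)| (CFL condition).
    """
    n = len(c)
    # inflow[i] is the flux from cell i-1 into cell i (index -1 wraps around).
    inflow = [godunov_flux(c[i - 1], c[i], flow, c_crit) for i in range(n)]
    return [c[i] + dt / dx * (inflow[i] - inflow[(i + 1) % n]) for i in range(n)]
```

For example, starting from a congested block followed by light traffic, repeated calls to lwr_step with dx = 100 m and dt = 2 s (CFL number 0.6 for the assumed free speed of 30 m/s) keep the total number of vehicles on the ring constant while the queue discharges.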
The latter equation is the most concise mathematical form of an LWR model. These are widely
used, at least conceptually. The possibility of further improved continuum descriptions of traffic
flow already was considered by Lighthill and Whitham (1955, p. 344), who suggested adding
"diffusion" (representing adjustments by drivers to the concentration slightly ahead) and "in-
ertia" (representing the nonzero time required for accelerations or decelerations) effects to the
continuity equation. However, subsequent development of presumably better models, beginning
with the work of Payne (1971), has instead been directed toward so-called higher-order models.
In these, the "static" traffic stream model (1) is replaced by a "dynamic" counterpart that
constitutes a second differential equation (in addition to the continuity equation). A typical instance
of such a "dynamic traffic stream model" (also known as "momentum equation") is that due to
Kühne and Beckschulte (1993),
$$\frac{\partial v}{\partial t} + v\,\frac{\partial v}{\partial x} = \frac{1}{\tau}\left(V(c) - v\right) - \frac{c_0^2}{c}\,\frac{\partial c}{\partial x} + \nu\,\frac{\partial^2 v}{\partial x^2}. \qquad (2)$$
Here $v = q/c$ is the mean speed, $V(c) = Q(c)/c$ is the mean speed at concentration c according
to some associated static traffic stream model, and $\tau$, $c_0$, $\nu$ are coefficients that presumably are
to be provided from observations. Such higher-order models have been rather widely studied in
recent years, especially in the physics literature. An exhaustive list of references, and an excellent
systematic overview of this subject (including references to workers who have used somewhat
different forms of a dynamic traffic stream model), appear in recent works by Helbing (1995a,
1995b, 1996a).
3 The terminology used here is that of Gerlough and Huber (1974), and of May (1990).
The precise relationship between LWR and higher-order models has been the subject of
considerable discussion in the literature (e.g., Kühne and Beckschulte, 1993; del Castillo, Pintado and
Benitez, 1993; Daganzo, 1995b; Helbing, 1996a). The view of the present work is that, in regard
to either of these types (i.e., LWR or higher-order) of continuum models of traffic flow, there are
three fundamental issues that arise:
Validity: What is the basis, either empirical or theoretical, for the associated (static or dynamic)
traffic stream model?
Mathematical solvability: What additional conditions (boundary, initial, diverge) are necessary
to provide a solution of the traffic flow model that represents, to some reasonable degree
of approximation, what one actually observes in a real traffic network?
Computational solution: How does one effectively obtain these "real" solutions on a modern
high-speed digital computer?
A brief overview of the authors' view of the current and historical understanding of each of these
issues will be presented in, respectively, the following three subsections.
VALIDITY
In the traffic-flow literature it usually is indicated that static traffic stream models should be de-
termined from observations. However, as already mentioned in the Introduction, it is well-known
that observations tend to show considerable scatter in observed flows (or speeds), especially in
the region of unstable flow. This phenomenon has led some workers (e.g., Ross (1988), p. 422;
Kühne and Beckschulte (1993), p. 367) to doubt existence of traffic stream models, in the sense
of a relationship that expresses mean speed (or flow) as a single-valued function of concentration.
Others, similarly motivated, have suggested a variety of alternatives to traditional continuum the-
ories (Ceder (1976), Hall (1987), Disbro and Frame (1989)), although none of these as yet seems
to have found significant application to the quantitative modeling of traffic flow.
Higher-order models seem primarily to be validated on the basis of adjusting the variety of
parameters that they contain (e.g., $\tau$, $c_0$, $\nu$) to obtain a solution that provides a reasonable fit to
observations. The authors are unaware of any efforts to validate dynamic traffic stream mod-
els themselves by direct comparison with observations. Any such program would appear to be
rather difficult to effect operationally, because (e.g.) each of the four derivatives appearing in Eq.
(2) (especially the second derivative) would be rather sensitive to uncertainty in measurements,
and then these errors could be subject to even greater magnification from combining them alge-
braically as in Eq. (2). Additionally, most dynamic traffic stream models invoke an underlying
static traffic stream model, and it is unclear how such a dynamic model can be valid without
some degree of validity attaching to the associated static model. (The semiviscous and viscous
models of Lyrintzis, Liu and Michalopoulos (1994) are exceptions. These dynamic traffic stream
models were developed on the basis of consideration of a number of car-following models, and
seem to give rather good results.)
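The sensitivity of measured derivatives to measurement uncertainty can be illustrated with central finite differences. If each sample carries an error of at most eps and samples are spaced h apart, the worst-case error is eps/h for the first derivative and 4*eps/h^2 for the second. The Python sketch below compares these bounds with the true derivatives of a smooth speed profile. The profile, the 0.5 m/s uncertainty and the 500 m detector spacing are hypothetical numbers chosen only to show orders of magnitude; they are not taken from any data set discussed here.

```python
import math


def central_first(f_minus, f_plus, h):
    """Central-difference estimate of the first derivative."""
    return (f_plus - f_minus) / (2.0 * h)


def central_second(f_minus, f_mid, f_plus, h):
    """Central-difference estimate of the second derivative."""
    return (f_plus - 2.0 * f_mid + f_minus) / (h * h)


def worst_case_errors(eps, h):
    """Largest error in the two estimates when every sample is off by at most eps."""
    return eps / h, 4.0 * eps / (h * h)


def speed(x):
    """Hypothetical smooth mean-speed profile [m/s] over position x [m]."""
    return 25.0 + 5.0 * math.sin(2.0 * math.pi * x / 5000.0)


eps, h = 0.5, 500.0                  # assumed uncertainty [m/s] and detector spacing [m]
err_first, err_second = worst_case_errors(eps, h)
k = 2.0 * math.pi / 5000.0
max_first, max_second = 5.0 * k, 5.0 * k * k   # largest true derivatives of speed(x)
print(err_first / max_first, err_second / max_second)
```

Under these assumed numbers the error bound is about 16% of the largest true first derivative, but roughly as large as the largest true second derivative itself, which is consistent with the remark that the second derivative is the most fragile term in Eq. (2).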
There seems to have been relatively little work taking the form of a comparison of both LWR
and higher-order models against some independent standard (e.g., field data). It appears likely
that any early efforts (e.g., prior to the 1971 paper of Payne) to compare LWR models against
field data were significantly hampered by the primitive state, at that time, of the development of
numerical methods known to approximate the entropy solution that is thought (Ansorge, 1990)
to be relevant to traffic flow (see the following subsection on computation for more details). It
is a hypothesis that is consistent with the authors' reading of the literature that such efforts were
not undertaken in the U.S. soon after 1971, because of an "official" commitment to higher-order
models, notably as implemented in the well-known FREFLO code (Payne, 1979). The work
of Michalopoulos and co-workers is an exception to the apparent general lack of an effort to
provide an objective comparison of LWR and higher-order models. This work is particularly
noteworthy, because it seems to represent one of the first attempts to use, for LWR models,
modern numerical methods that are known to approximate the entropy solution. Michalopoulos,
Beskos and Lin (1984, p. 418) concluded, in a comparison against field data of LWR ("simple
continuum") models as solved by three distinct numerical methods, that "despite the absence of a
momentum equation, the close agreement of the numerical results with field data . . . is notable."
Michalopoulos and Beskos (1984) carried out a comparison of LWR and higher-order models, as
solved using a variety of numerical methods, against detailed microscopic simulations, and con-
cluded that "existing high order models are only slightly better than the simple continuum model
(i.e., LWR) at uncongested flows; however at congested flow conditions the simple continuum
model performs better ..." (p. 103). (More recent results by Lyrintzis, Liu and Michalopoulos
(1994) seem more favorable to high-order models.) Daganzo (1995b, p. 284) has noted, in re-
gard to high-order models, that "although reasonable fits to specific data-sets can be achieved
with adjusted models that include many parameters ... these limited successes do not constitute
a validation," and that "given enough degrees of freedom ... a good fit should be expected."
MATHEMATICAL SOLVABILITY
Mathematical solvability is best known (and understood) in the context of LWR models. The lat-
ter are a class of nonlinear conservation laws that can have multiple solutions (even infinitely
many) that satisfy given initial conditions (cf. Ansorge, 1990; LeVeque, 1990). This non-
uniqueness phenomenon requires imposition of some additional condition to select the solution
that represents actual traffic flow. This is typically done by selecting the solution that satisfies the
so-called "entropy condition." But in fact this condition originates from application of the second
law of thermodynamics in fluid flow (Ansorge, 1990), and it is unclear why one should expect
such a condition to hold sway in traffic theory (the valiant effort of Ansorge (1990) toward an
interpretation notwithstanding). Lebacque (1997) has recently noted that there are conditions un-
der which the entropy solutions imply a clearly nonphysical infinite acceleration, and suggested
that there are circumstances (e.g., entrance ramps) in which this is perhaps so badly wrong that
some other criterion should be used. Newell (1993) suggested the geometrically motivated cri-
terion of selecting as the solution of interest the lower envelope of all solutions, when the LWR
model is reformulated in terms of A(x, t), the cumulative flow past x at time t, as the dependent
variable. A version of this criterion that is suitable for digital computation does not seem to have
been developed, nor does its precise relationship to the entropy condition seem to have been clar-
ified. Boundary and intersection conditions for LWR models have been discussed by Lebacque
(1996).
Uniqueness issues do not seem to have been systematically studied in conjunction with higher-
order models. Lebacque (1995) notes that Schochet (1988) has shown:
i) Existence and uniqueness of solutions to Payne's (1971) higher-order model, which corre-
sponds to (2) with v = 0.
ii) The solutions of Payne's model converge, as $T \to 0$, to the solution of the continuity equation
with an added diffusion-like term.
iii) The solution of this "continuity plus diffusion" equation converges, as $\nu_0 \to 0$, to the entropy
solution of the LWR model.
Thus this "continuity plus diffusion" model plays the same role in justifying the entropy con-
dition for the LWR model as the Navier-Stokes equations play for the Euler equations in fluid
dynamics. This is rather strongly suggestive that the "correct" higher-order generalization of
the LWR model is this "continuity plus diffusion" model, or something very similar. The first-
order form of the continuum approximation corresponding to the Chapman-Enskog asymptotic
expansion has exactly that form, as described in the following section on The Chapman-Enskog
Expansion.
COMPUTATIONAL SOLUTION
Numerous discrete approximations, dating back to the 1960's, have been developed (see LeVeque
(1990); Ansorge (1990); Leo and Pretty (1992); and extensive references cited in these works)
that converge, in the fine-mesh limit, to the entropy solution of the Euler equations of fluid
dynamics. Although the Euler equations are the fluid-flow counterpart of the LWR equation of
traffic flow, the first applications of such entropy methods to the LWR equations of traffic flow
seem to have come somewhat later (Lebacque, 1984; Michalopoulos, Beskos and Lin, 1984).
Other workers (Bui, Nelson and Narasimhan, 1992; Leo and Pretty, 1992; Daganzo, 1995a) have
subsequently developed and applied such methods within the theory of traffic flow.
There is some evidence that this apparent early lack of awareness of the necessity to use, for
LWR models, discrete approximations that converge to the entropy solution (and of the exis-
tence of such approximations) provided some of the disbelief in LWR models that led to the
original motivation for the development of higher-order models. For example, as recently as ten
years past (Ross, 1988, p. 422) the LWR model was criticized in terms that make it clear the
author was thinking of a discrete approximation that did not predict maximal flow between an
upstream region at jam density and a downstream region devoid of traffic, and therefore could
not correctly capture the "acceleration wave" that is the entropy solution of this problem (and the
traffic flow equivalent of a rarefaction wave in fluid flow). (See also Newell, 1989; Ross, 1989;
Nelson, 1995b.) Similarly, the seminal work on higher-order models (Payne, 1971, Eq. (7.10))
approximates the flow between two cells as the average of the two densities times the average
of the two velocities. This discrete approximation shares with the upstream method the property
that it incorrectly predicts zero flow between an upstream region at jam density and a downstream
region devoid of traffic, and therefore it cannot converge to the desired entropy solution
of an LWR model. The discrete approximation apparently employed in the original version of
FREFLO (Payne, 1979) is the upwind method, which has this same defect. Because of this, one
would expect FREFLO to exhibit some characteristics of poor performance (e.g., slow conver-
gence with mesh refinement) under conditions such that this model approximates the LWR model
(see the above discussion of the work of Schochet, 1988). It is a reasonable suspicion that this
contributes to the deficiencies of FREFLO and related codes that have been noted by numerous
workers (Rathi, Lieberman and Yedlin, 1987; Ross, 1988; Leo and Pretty, 1992; Michalopoulos,
Yi and Lyrintzis, 1993; Lyrintzis, Liu and Michalopoulos, 1994; and other works cited in these).
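The jam-to-empty test case discussed above can be illustrated by a minimal sketch. The Greenshields-type flow relation, the unit parameter values, and the function names below are assumptions introduced only for illustration; they are not taken from Payne (1971, 1979) or Ross (1988). The upwind flux evaluates the flow at the upstream density, whereas the Godunov-type flux, written here in demand/supply form for a concave flow relation, is one of the discrete approximations known to converge to the entropy solution.

```python
def greenshields_flow(c, v_free=1.0, c_jam=1.0):
    """ASSUMED static traffic stream model: q(c) = v_free * c * (1 - c / c_jam)."""
    return v_free * c * (1.0 - c / c_jam)


def upwind_flux(c_up, c_down, v_free=1.0, c_jam=1.0):
    """Upwind (upstream) flux: the inter-cell flow is q evaluated at the upstream density."""
    return greenshields_flow(c_up, v_free, c_jam)


def godunov_flux(c_up, c_down, v_free=1.0, c_jam=1.0):
    """Godunov flux for a concave q(c), written as min(demand, supply)."""
    c_cap = 0.5 * c_jam  # density at which the assumed q(c) attains its maximum
    demand = greenshields_flow(min(c_up, c_cap), v_free, c_jam)
    supply = greenshields_flow(max(c_down, c_cap), v_free, c_jam)
    return min(demand, supply)


if __name__ == "__main__":
    # Upstream cell at jam density, downstream cell empty.
    print("upwind :", upwind_flux(1.0, 0.0))
    print("godunov:", godunov_flux(1.0, 0.0))
```

With the upstream cell at jam density and the downstream cell empty, the upwind flux is zero, so the queue never discharges, while the Godunov flux equals the capacity flow of the assumed relation, as the acceleration wave requires.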
ALTERNATIVE CONTINUUM FORMULATIONS
On the one hand it appears almost certain that there exist traffic phenomena (e.g., the "sponta-
neous traffic jams" of Kerner, Konhauser and Schilke, 1995) that are unlikely to be describable
by LWR models. Therefore, one suspects it should be possible to develop macroscopic (hydrody-
namic) models of traffic flow that improve upon the LWR model. On the other hand, higher-order
models of the type that have been introduced to date provide, at best, marginal improvements on
LWR models. (Even the explanations given by Kerner, Konhauser and Schilke (1995) for sponta-
neous traffic jams seem more based on ad hoc physical models of traffic flow than on the specific
mathematical higher-order model that is nominally introduced to provide an explanatory frame-
work.) This raises the question of what alternative type of continuum model might be developed
in an effort to improve on LWR models. The overall objective of the present work is to describe
initial results for an alternative approach that seems to lead to models more nearly consistent
with the improvements suggested in the seminal work of Lighthill and Whitham (1955, pp. 344)
than are the higher-order models that employ a dynamic traffic stream model.
The motivation for the idea that there should be some alternative treatment arises from the obser-
vation that the often repeated analogy between current higher-order models of traffic flow and the
Navier-Stokes equations of fluid dynamics is questionable, as follows. The compressible Euler
equations of fluid dynamics are a system of five equations in five unknowns, with each equation
representing conservation of some quantity (particle number, three components of momentum,
and energy) that is conserved in molecular interactions, and each of the five unknowns represent-
ing a macroscopic analog (density, fluid velocity, temperature) of one of these microscopically
conserved quantities. The LWR model of traffic flow likewise is a single equation, in a single un-
known, that represents conservation of the sole quantity (number of vehicles) that is conserved
in vehicular interactions, and the single unknown (concentration, or density) is a macroscopic
analog of that microscopically conserved quantity.
Thus, it seems reasonable to regard LWR models as the traffic flow analog of the compressible
Euler equations of fluid dynamics. But the Navier-Stokes equations of fluid dynamics likewise
consist of five conservation laws in these same five unknowns as the compressible Euler equa-
tions. These are merely of a different form from the Euler equations, in that viscosity and dif-
fusion are now represented, but by terms expressed in the same five dependent variables (and
their spatial or temporal derivatives). On the other hand, as described in the introduction, current
higher-order models of traffic flow introduce a "dynamic traffic stream model" that has the form
of an additional conservation law, and an additional dependent variable (e.g., mean speed)
that is not a macroscopic analog of some quantity that is microscopically conserved in vehicular
interactions. On this basis alone it seems not quite appropriate to regard current higher-order
models as traffic-flow analogs of the Navier-Stokes equations.4 Rather they seem to stem from
following the form of the development of the Navier-Stokes equations of fluid dynamics, as opposed
to following the spirit of such development, with due regard for the differences between
fluids and traffic streams.5
If one accepts this contention that current higher-order macroscopic models of traffic flow are
not the proper analogs of the Navier-Stokes equations of fluid flow, then it follows that the true
analog remains to be discovered. How should it be found? The answer proposed here is a de-
velopment following the lines of the Chapman-Enskog asymptotic expansion of the Boltzmann
equation and its solutions that has been shown (e.g., Chapman and Cowling, 1952; Cercignani,
1988; Liboff, 1990) to lead to a hierarchy of macroscopic equations for rarefied gases. The
(compressible) Euler equations are the lowest-order (zero-order) member of this hierarchy, the
Navier-Stokes equations are the first-order approximation, and the seldom-used Burnett equations
are the second-order approximation. The proposed approach thus follows the spirit of the
standard theoretical development of macroscopic fluid dynamic equations, as opposed to simply
adopting the form of the fluid-dynamic result of that development. In particular, the specific
characteristics of vehicular flow, as opposed to fluid flow, presumably will be represented in such
an approach, to the extent that these characteristics are incorporated in the underlying kinetic
equation that forms the starting point for the development.
4
In itself this says nothing about the validity or invalidity of such higher order models. In fact it seems quite
reasonable to consider them as analogs of the Thirteen Moment method of Grad (1949), which is a quite respectable,
although seldom used, description of fluid dynamics.
5
Kerner and Konhauser (1993) even simply directly import the Navier-Stokes equations from fluid flow into
traffic flow. Daganzo (1995b) has emphasized the differences between traffic and ordinary fluids, and warned of the
dangers of pushing too far the interesting, and sometimes useful, analogy between the two.
The foundation of any such development is a suitable kinetic equation for vehicular traffic. Such
kinetic equations are the subject of the next section.
KINETIC EQUATIONS FOR VEHICULAR TRAFFIC

A kinetic equation is formed by setting the rate of change of the vehicular distribution in ve-
locity and position space equal to the rate of change caused by the changes of vehicle speeds
due to vehicular interactions (i.e., slowing-down, speeding up, and passing) according to some
(microscopic) mechanical model of driver responses to various situations. An equilibrium so-
lution of such a kinetic equation is a distribution function such that the latter rate of change is
identically zero. The equilibrium solutions of a kinetic equation are crucial elements of the con-
nection of that kinetic equation to a continuum model of traffic flow. Indeed, the equilibrium
solutions themselves normally lead directly to a traffic stream model. Further, the Chapman-
Enskog asymptotic expansion is a formal expansion of the solution of the kinetic equation about
an arbitrary equilibrium solution. For these reasons, in order to develop the Chapman-Enskog
continuum approximations corresponding to a particular kinetic equation it is essential to have a
good mathematical characterization of the equilibrium solutions of that kinetic equation.
In this work the venerable kinetic equation of Prigogine and Herman (1971) will be employed.
One certainly can raise legitimate questions about the degree of validity of some aspects of the
Prigogine-Herman kinetic equation, particularly the somewhat phenomenological "relaxation
term." As a result, other kinetic equations for vehicular traffic have been suggested; these are
summarized briefly in the concluding subsection of the present section, especially as regards the
state of knowledge of their equilibrium solutions and their prior use in developing continuum
models. In the first subsection the equilibrium solutions of the Prigogine-Herman kinetic equa-
tion are described. This description will be primarily for the modified Prigogine-Herman kinetic
model recently considered by Nelson and Sopasakis (1998). Their equilibrium solution contains,
as a special case, that originally given by Prigogine, Herman and Anderson (1962).
EQUILIBRIUM SOLUTIONS OF THE PRIGOGINE-HERMAN KINETIC MODEL
The Prigogine-Herman (1971) kinetic equation of vehicular traffic is
$$\frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x} = -\frac{f - f_0}{T} + c\,(1 - P)\,(\bar{v} - v)\,f. \tag{3}$$
Here $f$ is the density function for the distribution of vehicles in phase space, so that $f(x, v, t)\,dx\,dv$
is the expected number of vehicles at time $t$ that have position between $x$ and $x + dx$ and speed
between $v$ and $v + dv$, $c = c(x, t)$ is the spatial density of vehicles (vehicles per unit length), $f_0$
is the corresponding distribution function for the desired speed of vehicles, $\bar{v}$ is mean speed, and
$P$, $T$ are respectively the passing probability and the relaxation time.
The equilibrium solutions of the P-H model are given by setting the right-hand side of (3) equal
to zero, and solving for $f = f_{eq}$. Suppose all desired speeds lie between some positive lower
limit $w_-$ and some upper limit $w_+$. Then there is some critical density $c_{crit}$,6 defined as the root
of
$$\frac{1}{T c_{crit}^2 (1 - P)} \int_{w_-}^{w_+} \frac{f_0(v)}{v - w_-}\,dv = 1,$$
such that the equilibrium solutions of the Prigogine-Herman model are as follows:
Stable flow: For $c < c_{crit}$,
$$f_{eq} = f_{eq}(v; c) := \frac{f_0(v)}{T c (1 - P)(v - \zeta)}, \tag{4}$$
with $\zeta = \zeta(c)$ defined implicitly by $\int_{w_-}^{w_+} \frac{f_0(v)}{v - \zeta}\,dv = T c^2 (1 - P)$.
Unstable flow: For $c > c_{crit}$,
$$f_{eq} = f_{eq}(v; c, \alpha) := \frac{f_0(v)}{T c (1 - P)(v - \zeta)} + \alpha c\,\delta(v - \zeta). \tag{5}$$
Here $\delta$ is the delta function of Dirac, and $\alpha$ is not determined as a function of $c$, but rather
is required only to satisfy
$$\alpha_{min}(c) = \max\Big\{0,\; 1 - \frac{1}{T c^2 (1 - P)} \int_{w_-}^{w_+} \frac{f_0(v)}{v - w_-}\,dv\Big\} \le \alpha \le 1 - \frac{1}{T c^2 (1 - P)} \int_{w_-}^{w_+} \frac{f_0(v)}{v}\,dv = \alpha_{max}(c).$$
Further, $\zeta = \zeta(c, \alpha)$ is defined implicitly, as a
function of both $\alpha$ and $c$, by $\frac{1}{T c^2 (1 - P)} \int_{w_-}^{w_+} \frac{f_0(v)}{v - \zeta}\,dv = 1 - \alpha$.
This means that, in the regime of unstable flow, the mean speed (or flow) is a function of two
independent parameters. In the above description these parameters were taken as density and the
fraction ($\alpha$) of the traffic in the collective ("highly platooned") mode. It is equally possible to
take them as the density and the speed ($\zeta$) of the collective flow. Notice that $c_{crit}$ is infinite, and
therefore the collective flow does not exist, unless $f_0$ is such that $\int_{w_-}^{w_+} \frac{f_0(v)}{v - w_-}\,dv$ is finite.
6
The critical density is to be interpreted as the smallest density at which equilibrium flow involves some vehicles
travelling slower than any driver wishes. This differs from the definition of the 1985 Highway Capacity Manual
(Transportation Research Board, 1985, pp. 1-6) as the density at which maximum (i.e., capacity) flow is reached.
However, the values of the two are quite close, under reasonable assumptions on $f_0$.
In any case a (classical) traffic stream model is obtained in the stable regime, as
$$q(c) = \int v\,f_{eq}(v; c)\,dv.$$
Prigogine, Herman and Anderson (1962) (see also Chap. 4 of Prigogine and Herman, 1971) assumed
that $w_- = 0$. In that case it turns out that $\alpha_{max}(c) = \alpha_{min}(c) = \alpha(c)$, so that again a (classical)
traffic stream model is obtained, as $q(c) = \int v\,f_{eq}(v; c, \alpha(c))\,dv$. In this case $\zeta(c, \alpha) = 0$,
so that the critical density is the smallest density at which some traffic is stopped, at equilibrium
flow, and the component (or mode) $\alpha c\,\delta(v)$ represents a "collective flow" in which some fraction
($\alpha$) of the vehicles are stopped.7 Prigogine and Herman (1971, Chap. 4) provide graphical presentations
of a number of such classical traffic stream models, both with and without collective
flows.
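The stable-flow construction above can be sketched numerically. Everything specific in the following block is an assumption made for illustration: the ramp-shaped reduced desired-speed density $\varphi_0(v) = 2(v - w_-)/(w_+ - w_-)^2$ (chosen because it vanishes at $w_-$, so that the critical density is finite), the parameter values, and the function names. The sketch also assumes that the stable equilibrium has the form $f_0(v)/[Tc(1-P)(v-\zeta)]$ with $f_0 = c\,\varphi_0$, so that $\zeta(c)$ solves $F_0(\zeta) = T(1-P)c$ and the mean speed is $\zeta + 1/[Tc(1-P)]$.

```python
import math


def f0_integral(zeta, w_lo, w_hi):
    """F0(zeta): integral over [w_lo, w_hi] of phi0(v) / (v - zeta), for zeta <= w_lo.

    phi0 is the ASSUMED ramp density 2 (v - w_lo) / (w_hi - w_lo)**2,
    for which the integral has the closed form used below.
    """
    span = w_hi - w_lo
    gap = w_lo - zeta
    if gap <= 0.0:
        return 2.0 / span  # limiting value as zeta -> w_lo
    return (2.0 / span**2) * (span - gap * math.log1p(span / gap))


def critical_density(w_lo, w_hi, T, P):
    """Density at which zeta reaches w_lo (finite for the assumed phi0)."""
    return f0_integral(w_lo, w_lo, w_hi) / (T * (1.0 - P))


def solve_zeta(c, w_lo, w_hi, T, P, iterations=200):
    """Bisection for zeta(c) in the stable regime: F0(zeta) = T (1 - P) c."""
    target = T * (1.0 - P) * c
    lo, hi = w_lo - 2.0 / target, w_lo  # F0(lo) < target < F0(hi) when c < c_crit
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f0_integral(mid, w_lo, w_hi) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_speed(c, w_lo, w_hi, T, P):
    """Equilibrium mean speed in the stable regime: zeta + 1 / (T c (1 - P))."""
    return solve_zeta(c, w_lo, w_hi, T, P) + 1.0 / (T * (1.0 - P) * c)


if __name__ == "__main__":
    w_lo, w_hi, T, P = 10.0, 30.0, 2.0, 0.5  # hypothetical values (m/s, s)
    for c in (0.001, 0.02, 0.05, 0.08):      # vehicles per metre, below critical
        print(c, mean_speed(c, w_lo, w_hi, T, P), c * mean_speed(c, w_lo, w_hi, T, P))
```

At very low density the computed mean speed approaches the mean desired speed, and it decreases as density rises toward the critical value, which is the qualitative behaviour expected of a classical traffic stream model in the stable regime.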
But, the fact that equilibrium solutions lead to a traffic stream model is closely associated with
the fact that the equilibrium solutions are a one-parameter family, and that it is possible to take
the density as that parameter. This is the normal expectation. That expectation has its roots
in the Boltzmann equation of the kinetic theory of gases, where it is typically found that the
parameters required to characterize an equilibrium solution are macroscopic counterparts of the
microscopic invariants of molecular interactions, so that the number of parameters is the same as
the number of invariants.8 As there is only one invariant in vehicular interactions, the number of
vehicles, one therefore expects the equilibrium solutions of a kinetic equation for vehicular traffic
to comprise a one-parameter family. That expectation is realized for stable flow, and even for
unstable flow, provided one makes the somewhat unrealistic assumption that there exist drivers
desiring arbitrarily small speeds. It is therefore perhaps surprising9 to find that this expectation
is not met, in the general case of unstable flow.
To emphasize the radical distinction between the stable and unstable flow regimes, note that in
the unstable flow regime the classical static traffic stream model generally becomes a surface in
three-dimensional $c$/$\alpha$/$v$ (or $c$/$\zeta$/$v$) space, rather than a two-dimensional curve. (See Nelson
and Sopasakis (1998) for figures illustrating this phenomenon.) The projection of this surface on
two-dimensional c/v space is, of course, a region rather than a curve. This is qualitatively similar
to what is observed in attempted measurements of traffic stream models.
7
Thus, in the unstable flow regime, $\alpha$ is very similar to the "percent time delay" that is used as a primary level
of service indicator for two-lane highways in the 1985 Highway Capacity Manual (Transportation Research Board,
1985). This quantity is defined as "the average percent of time that all vehicles are delayed while traveling in
platoons due to the inability to pass."
8
The equilibrium solutions of the Boltzmann equation depend upon five parameters, one for each of the five
invariants of molecular interactions. Specifically, these invariants are mass, three components of momentum, and
energy. The corresponding parameters can be taken as density, three components of the mean speed, and tempera-
ture. These are the parameters that normally are taken as characterizing the well-known Maxwellian distributions,
which are the equilibrium solutions of the Boltzmann equations, for some intermolecular force laws.
9
And emphasizes the danger of pushing fluid flow analogies too far in traffic flow.
The continuum models to be developed in the following section will be based upon the equi-
librium solutions described above. However, in order to obtain a LWR model as the zero-order
approximation (i.e., to obtain a static traffic stream model from the equilibrium solution) it ap-
pears to be necessary to restrict these developments to the stable-flow regime (i.e., densities
below critical). That restriction will be followed in the present work. The development of suit-
able continuum models for the unstable-flow regime is presently an open problem. Note that the
nature of the equilibrium solution in the region of unstable flow appears to cast some doubt on
the validity of LWR models per se, in this region.
OTHER APPROACHES TO DERIVING HYDRODYNAMIC MODELS FROM MICROSCOPIC MODELS
Approaches to obtaining LWR models from the Prigogine-Herman kinetic theory of vehicular
travel have been discussed above, along with a theoretical basis for the limit of their range of
validity. The objective of this subsection is to briefly summarize work on the development of
continuum models (LWR or higher-order) from other microscopic models. The central focus
will be on recent work based on kinetic models other than that of Prigogine and Herman (1971).
Somewhat different approaches to this subject seem to have been initiated independently by
Nelson (1995a) and by Helbing (1995a). Nelson (1995a) developed a kinetic equation in which
the phenomenological relaxation term in the Prigogine-Herman equation is replaced by a more
fundamental representation of speeding-up (and passing is neglected). He and co-workers (Bui,
Nelson and Sopasakis, 1996; Nelson, Bui and Sopasakis, 1997) subsequently showed that the
equilibrium solutions of this kinetic equation lead to a traffic stream model that provides a fit to
data that is as close as that of any of several classical such models, but with use of only parameters
taken directly from a microscopic model of driver behavior (i.e., no "free" parameters available
to provide the best fit to observations). Note that the equilibrium solutions found from this
kinetic equation do not display collective flow at nonzero speeds, and thus the newer results of
Nelson and Sopasakis as described above, along with the empirical scatter of flow under unstable
conditions, bring into doubt the applicability in the unstable regime of LWR models based upon
this traffic stream model.
On the other hand, Helbing (1995a, 1995b, 1996a, 1996b) has based his approach upon the
Paveri-Fontana (1975) extension of the Prigogine-Herman (1971) kinetic equation. This extension
includes the desired speed as an additional independent variable, on the same footing with
the actual speed, as opposed to being a parameter as in the original Prigogine-Herman model.
(Daganzo (1995b) has emphasized the necessity of this, in order that the desired speed be a prop-
erty of vehicles and drivers, not of roads.) Helbing (1996a) has developed a number of hydrody-
namic models, by proceeding in the manner that frequently is used in fluid flow; i.e., by taking
low-order polynomial moments of the underlying (Paveri-Fontana) kinetic equation, then using
ad hoc techniques to close these (i.e., reduce them to determined systems). Note also that Hel-
bing (1995a, pp. 383-384) was unable to obtain the equilibrium solutions of the Paveri-Fontana
kinetic equation, but rather assumed them to be Gaussian (Maxwellian) distributions, based upon
empirical data that was specifically indicated as being for stable traffic. This certainly brings into
question the validity for unstable flow of the continuum equations thus obtained.
Wegener and Klar (1996) built upon the approach of Nelson (1995a) to obtain a more realistic
kinetic model that uses first principles to incorporate speeding-up. They studied numerically the
equilibrium solutions, and associated traffic stream models, of their class of kinetic equations,
but no evidence of a collective flow seems to have been uncovered. These workers (Klar and
Wegener, 1997) also developed an "Enskog-like" kinetic model that differs from all previous
kinetic models of vehicular traffic in that the effect of nonzero vehicle length is taken into ac-
count in the slowing-down and speeding up of vehicles. They derived from this approach a set of
macroscopic equations formally similar to the higher-order model of Payne (1979), but with co-
efficients computable from microscopic considerations, and that clearly showed the importance
of including nonzero vehicle lengths. They also obtained computational comparisons between
the solution of the kinetic (Enskog) equation and their hydrodynamic model, for a rather real-
istic lane-drop problem. It is not yet clear how suitable their particular kinetic equation might
be for unstable flow (e.g., whether it exhibits collective flow), but it seems likely that ultimately
the effect of nonzero vehicle length must be incorporated into any realistic treatment of traffic
flow. Finally, Klar and Wegener (to appear-a, to appear-b) are currently pursuing very interesting
studies of multilane kinetic models.
For completeness, two further lines of development should be mentioned. First, Phillips (1977,
1979, 1981) also has developed continuum models on the basis of an underlying kinetic equa-
tion. However, the underlying equilibrium solution does not contain a collective-flow mode, so
that it appears unlikely this continuum model will be suitable for modelling unstable flow. Sec-
ond, there is an extensive older literature on obtaining traffic stream models from car-following
models. Most of this is based on the assumption of steady state conditions (constant headway;
see Nelson, 1995b), and thus seems unlikely to be helpful in providing insight into the collective
dynamics underlying unstable traffic flow. However, a paper of Newell (1962) that is perhaps
less well-known than deserved is specifically directed toward adapting car-following models to
unstable flow.
Before turning to the development of continuum models via asymptotic expansion of the Prigogine-
Herman kinetic equation, it is appropriate to note a significant difference between this approach
and that in the kinetically based developments of continuum descriptions of traffic flow in some
of the references cited above. In these previous works the approach is to take the first two poly-
nomial moments of the kinetic equation, and then introduce ad hoc approximations as necessary
to close the system (i.e., reduce to two the number of unknowns = dependent variables). By
contrast, in the approach via asymptotic expansions the number and nature of the moments of
the kinetic equation that are introduced at any order are determined automatically by the require-
ment that the next higher-order approximation exist. Further, the number and nature of these
moments usually are very closely related to the number and nature of the quantities conserved
in the interactions described by the particular kinetic equation (invariants). For vehicular kinetic
equations one expects only one quantity (the number of vehicles) to be conserved, and therefore
the associated continuum models to involve only one moment (as opposed to the two moments of
current higher-order models) in any corresponding continuum equation. In this respect a kinetic
model of vehicular traffic is perhaps closer to the Lorentz model of the Boltzmann Equation, for
which number of particles is also the sole conserved quantity, than to the Boltzmann equation of
the kinetic theory of gases. For this model, it has been shown analytically (Hauge, 1969) that the
nth-order Chapman-Enskog approximation has the form of a single partial differential equation
of order 2n. One can perhaps reasonably anticipate somewhat similar results for Chapman-
Enskog approximations in the kinetic theory of vehicular traffic.
THE CHAPMAN-ENSKOG EXPANSION

In this section the Chapman-Enskog expansions, of orders zero and one, will be developed for
the Prigogine-Herman kinetic equation, with the probability of passing and the relaxation time
taken as constant. The treatment given here is adapted from that of Cercignani (1988, Section
V.3).
To begin, it is convenient to slightly rewrite the Prigogine-Herman equation (3) as
$$\frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x} = -\frac{1}{T}\Big(f - \varphi_0 \int_0^{w_+} f(v')\,dv'\Big) + (1 - P)\,f \int_0^{w_+} (v' - v)\,f(v')\,dv'. \tag{6}$$
Here the arguments $x$, $v$ and $t$ have been omitted, in order to keep the notation as concise as
possible. Additionally, $w_+$ is the maximum speed desired by any driver, as in the preceding
section, and $\varphi_0$ is the reduced desired speed distribution, as defined by $\varphi_0 = f_0/c$, and otherwise
the notation is as explained above in conjunction with Eq. (3). For present purposes $\varphi_0$ is assumed
to depend only on $v$ (in particular, to be independent of $c$), and further the relaxation time ($T$)
and passing probability ($P$) are assumed to be constants, independent of concentration ($c$).
The beginning point for the Chapman-Enskog approximation is to further rewrite the Prigogine-
Herman kinetic equation (6) in the form
$$\epsilon\,\mathcal{D}f = Kf. \tag{7}$$
Here $\mathcal{D}f$ and $Kf$ are respectively the left-hand and right-hand sides of (3), and $\epsilon$ is a parameter that
is introduced as a formal device to separate (3) into (an infinite number of) more easily solved
problems. A formal series solution, in the form
$$f = \sum_{n=0}^{\infty} \epsilon^n f^{(n)}, \tag{8}$$
is then sought. However, the density (the sole "hydrodynamic" variable, in the case of traffic
flow) is specifically not expanded in a power series in $\epsilon$. That means the
densities corresponding to the higher-order corrections to the distribution function are required
to be zero,
$$\int_0^{w_+} f^{(n)}(x, v, t)\,dv = 0, \quad n = 1, 2, \ldots. \tag{9}$$
Thus density is carried entirely by the zero-order approximation to the distribution function,
$$c(x, t) = \int_0^{w_+} f^{(0)}(x, v, t)\,dv. \tag{10}$$
In the Chapman-Enskog expansion the (partial) time derivative in the Prigogine-Herman equation
(7) is further expanded as
$$\frac{\partial f}{\partial t} = \sum_{k=0}^{\infty} \epsilon^k \frac{\partial^{(k)} f}{\partial t} = \sum_{n=0}^{\infty} \epsilon^n \sum_{k=0}^{n} \frac{\partial^{(k)} f^{(n-k)}}{\partial t}. \tag{11}$$
(In the last equality the expansion (8) has been used, with the $\partial^{(k)}/\partial t$ treated as linear operators.)
Here the $\partial^{(k)}/\partial t$ are to be determined so as to ensure existence of the terms of the formal solution
(8). That is, the $\partial^{(k)}/\partial t$ are to be regarded as unknowns, to be determined by substituting
the expansions (8) and (11) into the Prigogine-Herman equation (7), in exactly the same way as the various
terms in the expansion (8) (i.e., the $f^{(n)}$) are determined.
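The result of carrying out these substitutions, and equating coefficients of like powers of $\epsilon$ is
$$K f^{(0)} = 0, \tag{12}$$
at order $n = 0$. Otherwise it is
$$L(f^{(0)})\,f^{(n)} = \sum_{k=0}^{n-1} \frac{\partial^{(k)} f^{(n-k-1)}}{\partial t} + v\,\frac{\partial f^{(n-1)}}{\partial x} - S_n, \quad n = 1, 2, \ldots. \tag{13}$$
Here $L(g)$ is, for any function $g$, the linear operator defined by
$$L(g)\,h := -\frac{1}{T}\Big(h - \varphi_0 \int_0^{w_+} h(v')\,dv'\Big) + (1 - P)\Big[h \int_0^{w_+} (v' - v)\,g(v')\,dv' + g \int_0^{w_+} (v' - v)\,h(v')\,dv'\Big],$$
and
$$S_n := (1 - P) \sum_{k=1}^{n-1} f^{(k)} \int_0^{w_+} (v' - v)\,f^{(n-k)}(v')\,dv',$$
where, as usual, the dependence on $x$ and $t$ is understood, but is not explicit in the notation. (The
assumption that $P$ and $T$ are constants, independent of $c$, is used in obtaining these equations.)
If $f^{(0)}, f^{(1)}, \ldots, f^{(n-1)}$ exist, then (13) has a solution for $f^{(n)}$ if, and only if, the right-hand side is
an element of the range of $L(f^{(0)})$, or equivalently it is orthogonal to the null space of the adjoint
of $L(f^{(0)})$, notationally $L(f^{(0)})^*$. But $L(f^{(0)})^*$ is given by
$$L(f^{(0)})^*\,\psi = -\frac{1}{T}\Big(\psi - \int_0^{w_+} \varphi_0(v')\,\psi(v')\,dv'\Big) + (1 - P) \int_0^{w_+} (v' - v)\,f^{(0)}(v')\,[\psi(v) - \psi(v')]\,dv',$$
and it is readily shown from this representation that the null space of $L(f^{(0)})^*$ is exactly the one-
dimensional space of constants. Thus a necessary and sufficient condition for the equations (13)
to have a solution for the $f^{(n)}$ is the compatibility condition that the integrals over $0 < v < w_+$
of their respective right-hand sides be zero. But
$$\int_0^{w_+} \Big[\sum_{k=0}^{n-1} \frac{\partial^{(k)} f^{(n-k-1)}}{\partial t} + v\,\frac{\partial f^{(n-1)}}{\partial x}\Big]\,dv = \frac{\partial^{(n-1)} c}{\partial t} + \frac{\partial q^{(n-1)}}{\partial x},$$
where Eqs. (9) and (10) have been employed. Similarly, (9) implies that $\int_0^{w_+} S_n\,dv = 0$, for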
n > 1. Therefore, the compatibility conditions for Eqs. (13) are the (LWR-like) equations
^("-l)/> /^(n-l)
= 1,2,..., (15)
ot ox
where
.- f +
(16)
Jo
If (15) is multiplied by e""1, and the results summed over all n = 1 , 2 . . . , then the result can be
written as
n=0
Here the definition of the expansion (11) has been employed. This equation, which arises as
a consequence of the basic assumptions of the asymptotic expansion rather than from ad hoc
assumptions, is the basis for the continuum approximations arising from the Chapman-Enskog
approximation, as follows. If the $q^{(n)}$ are computed by solving Eqs. (13), then it turns out that
$q^{(n)}$ depends on $x$ and $t$ only through $c(x, t)$ and its spatial derivatives up to order $n$. If the sum
in (17) is truncated at $n = N$, and $\varepsilon$ set to unity, then the resulting partial differential equation of
order $N$ is the continuum form of the $N$th-order Chapman-Enskog approximation. The details
of these continuum approximations will now be developed, for $N = 0, 1$, in the following two
respective subsections.
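The summation and truncation just described can be written out explicitly. The lines below are only a sketch; they assume that the expansion (11) has the form $\partial c/\partial t = \sum_n \varepsilon^n \partial^{(n)} c/\partial t$, which is inferred from the surrounding discussion rather than quoted from it.

```latex
% Multiply (15) by eps^{n-1}, sum over n >= 1, and re-index with m = n - 1:
\begin{align*}
0 &= \sum_{n=1}^{\infty} \varepsilon^{n-1}
     \left[ \frac{\partial^{(n-1)} c}{\partial t}
          + \frac{\partial q^{(n-1)}}{\partial x} \right]
   = \underbrace{\sum_{m=0}^{\infty} \varepsilon^{m}
       \frac{\partial^{(m)} c}{\partial t}}_{=\,\partial c/\partial t
       \ \text{(assumed form of (11))}}
   + \sum_{m=0}^{\infty} \varepsilon^{m}\,\frac{\partial q^{(m)}}{\partial x}.
\end{align*}
% Truncating at m = N and setting eps = 1:
\begin{align*}
N = 0:&\quad \frac{\partial c}{\partial t} + \frac{\partial q^{(0)}}{\partial x} = 0,\\
N = 1:&\quad \frac{\partial c}{\partial t}
        + \frac{\partial}{\partial x}\bigl(q^{(0)} + q^{(1)}\bigr) = 0.
\end{align*}
```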
THE ZERO-ORDER CHAPMAN-ENSKOG APPROXIMATION
From (4) and (8) the zero-order Chapman-Enskog solution of the Prigogine-Herman kinetic
equation, for the stable flow regime ($c < c_{crit}$), is given by
\[ f^{(0)} = f_{eq}(v; c) := \frac{F(v)}{T(1-P)\,(v - \zeta(c))}. \]
Here $\zeta(c)$ is the root of
\[ F_0(\zeta) = T(1-P)\,c, \]
where $F_n(\zeta) := \int_0^{w_+} F(v)/(v - \zeta)^{n+1} \, dv$. From (17), the corresponding continuum approximation is
\[ \frac{\partial c}{\partial t} + \frac{\partial q^{(0)}}{\partial x} = 0, \tag{18} \]
along with the classical traffic stream model given by
\[ q^{(0)} = Q_0(c) := \frac{1}{T(1-P)} + c\,\zeta(c). \tag{19} \]
Note that the stable flow regime is defined quantitatively by $c < c_{crit} := F_0(0)/(T(1-P))$, so that
necessarily $\zeta(c)$ is negative in this regime.
Eqs. (18) and (19) comprise a LWR model. As a partial differential equation for the zero-order
approximation to the concentration, this LWR model reads as
\[ \frac{\partial c^{(0)}}{\partial t} + \frac{\partial Q_0(c^{(0)})}{\partial x} = 0. \tag{20} \]
Subject to suitable auxiliary (initial and boundary) conditions, this LWR model presumably determines
$c^{(0)}$ as a function of position and time. Note that the formal asymptotic procedure gives
rise to both the conservation law and the traffic stream model, and that the latter is expressed in
terms of the given data in the underlying (Prigogine-Herman) kinetic model.
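As a rough numerical illustration of how the zero-order model might be evaluated for a general desired speed distribution, the sketch below solves $F_0(\zeta) = T(1-P)c$ for $\zeta$ by bisection and then evaluates a traffic stream model of the form $1/(T(1-P)) + c\,\zeta(c)$. The function names, the Simpson quadrature, the form $F_0(\zeta) = \int F(v)/(v-\zeta)\,dv$, and the uniform test distribution are assumptions made for illustration; none of them is taken from the paper's own computations.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

def F0(zeta, F, v_lo, v_hi):
    """Assumed F_0(zeta): integral of F(v) / (v - zeta) over the support of F."""
    return simpson(lambda v: F(v) / (v - zeta), v_lo, v_hi)

def zeta_of_c(c, T, P, F, v_lo, v_hi):
    """Root of F_0(zeta) = T (1 - P) c with zeta <= 0 (stable regime, c > 0)."""
    target = T * (1 - P) * c
    hi = 0.0
    if F0(hi, F, v_lo, v_hi) < target:
        raise ValueError("c exceeds the critical density")
    lo = -1.0
    while F0(lo, F, v_lo, v_hi) > target:   # F_0 increases with zeta, so move left
        lo *= 2
    for _ in range(100):                     # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if F0(mid, F, v_lo, v_hi) > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def q0(c, T, P, F, v_lo, v_hi):
    """Zero-order flow, 1 / (T (1 - P)) + c * zeta(c)."""
    return 1 / (T * (1 - P)) + c * zeta_of_c(c, T, P, F, v_lo, v_hi)

# Hypothetical test case: desired speeds uniform on 40..80 mph, c in veh/mile.
uniform = lambda v: 1 / 40
print(zeta_of_c(3.0, 0.0015, 2 / 3, uniform, 40.0, 80.0))
print(q0(3.0, 0.0015, 2 / 3, uniform, 40.0, 80.0))
```

Bisection is used here only because $F_0$ is monotone in $\zeta$ on the stable branch; any bracketing root finder would serve equally well.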
In the following subsection the corresponding problem of determining $f^{(1)}$ is considered. This
section will be concluded by briefly discussing how the problem of determining $f^{(0)}$ is modified
in the case of unstable flow ($c^{(0)} > c_{crit}$). In the unstable case, all of the preceding formal
development goes through verbatim, except that $Q_0$ no longer depends upon only $c^{(0)}$, but rather
upon both $c^{(0)}$ and $\alpha$. Therefore, in this case the partial differential equation (20) is a single
equation, which cannot be expected to determine the two unknowns, $c^{(0)}$ and $\alpha$. The central
problem of extending the Chapman-Enskog expansion to encompass unstable flow is to find a
suitable second equation in $c^{(0)}$ and $\alpha$ that will, along with (20) and suitable auxiliary conditions,
serve to determine $c^{(0)}$ and $\alpha$. This problem is outside the scope of the present work.
THE FIRST-ORDER CHAPMAN-ENSKOG APPROXIMATION
The first-order Chapman-Enskog solution of the Prigogine-Herman kinetic equation is $f = f^{(0)} +
f^{(1)}$, where $f^{(0)}$ is the solution of (12) and $f^{(1)}$ is the solution of the instance $n = 1$ of (13). The
resulting zero-order approximation to the vehicular distribution is $f^{(0)} = f_{eq}(v; c)$, just as before.
However, there is a subtle but important difference, as regards the density parameter appearing
in this equilibrium solution, between this equilibrium solution and that arising in the zero-order
approximation of the previous subsection. In the equilibrium solutions arising as the zero-order
approximation for the Chapman-Enskog expansion truncated at zero order, the appropriate value
of the density parameter is determined by the LWR model of the preceding subsection. But in the
first-order Chapman-Enskog approximation the appropriate value of the density is the solution
of the continuum model corresponding to (17), with the infinite series truncated at N = 1. It will
be seen below that this is not a LWR model.
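In order to determine this first-order Chapman-Enskog continuum model it is necessary to calculate
$q^{(1)}$, as defined by the instance $n = 1$ of (16). This calculation will now be outlined.
As already mentioned, the starting point for the computation of $f^{(1)}$, and hence $q^{(1)}$, is the
instance $n = 1$ of (13). The compatibility condition (15) for this instance is
\[ \frac{\partial^{(0)} c}{\partial t} + \frac{\partial q^{(0)}}{\partial x} = 0. \]
The right-hand side of the relevant instance of (13) is then
\[ \frac{\partial^{(0)} f^{(0)}}{\partial t} + v \frac{\partial f^{(0)}}{\partial x}
 = \frac{\partial f^{(0)}}{\partial c} \frac{\partial^{(0)} c}{\partial t} + v \frac{\partial f^{(0)}}{\partial c} \frac{\partial c}{\partial x}
 = \frac{\partial f^{(0)}}{\partial c} \left\{ -\frac{\partial q^{(0)}}{\partial x} + v \frac{\partial c}{\partial x} \right\}
 = \frac{\partial f^{(0)}}{\partial c} \left\{ v - \zeta(c) - c\,\zeta'(c) \right\} \frac{\partial c}{\partial x}. \]
Here $\zeta'$ is the derivative of $\zeta$, and the explicit expression (19) for $q^{(0)}$ has been employed.
The instance $n = 1$ of (13) can then be written as
[display illegible in the source; the factor $(1-P)\,c\,(v - \zeta(c))$ is among its recoverable terms]
This is a (degenerate) integral equation of the first kind for $f^{(1)}$. It can be solved explicitly for
$f^{(1)}$, but it is somewhat easier to solve for $q^{(1)}$, which is the quantity of actual interest. If this
integral equation is integrated over $0 < v < w_+$, then the result can be solved explicitly for $q^{(1)}$
as
[display illegible in the source]
But it is relatively easy to show that $\int_0^{w_+} g_{-1}(v) \, dv = 0$ and
[second identity illegible in the source; its right-hand side is a function of $\zeta(c)$ multiplying $\partial c/\partial x$]
Therefore,
\[ q^{(1)} = -D(c) \frac{\partial c}{\partial x}, \]
where the "diffusion coefficient" is
[displayed expression for $D(c)$ in terms of $F_1(\zeta(c))$ and $F_2(\zeta(c))$, illegible in the source]
The first-order continuum approximation associated with the Chapman-Enskog expansion then
is defined by the continuity equation (18), but now the flow is given by
\[ q = q^{(0)} + q^{(1)} = Q_0(c) - D(c) \frac{\partial c}{\partial x}. \]
Thus this first-order continuum approximation is the advection-diffusion equation
\[ \frac{\partial c}{\partial t} + \frac{\partial}{\partial x} \left[ Q_0(c) \right] = \frac{\partial}{\partial x} \left[ D(c) \frac{\partial c}{\partial x} \right]. \tag{21} \]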
It is an easy application of the Cauchy-Schwarz inequality, along with the equality $F_0(\zeta(c)) =
cT(1 - P)$, which defines $\zeta(c)$, to show that $F_1(\zeta(c))^2 < cT(1 - P)F_2(\zeta(c))$, so that $D(c) > 0$.
Therefore, the initial-value problem for this partial differential equation is well-posed compu-
tationally. Some initial computational results for this equation are presented in the following
section.
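For readers who wish to trace the diffusion coefficient, one possible route is sketched below. It is a sketch under stated assumptions, not a transcription of the original displays: it assumes the Prigogine-Herman collision term in its usual constant-$P$, constant-$T$ form, the normalization $\int_0^{w_+} f^{(1)} dv = 0$, and $F_n(\zeta) = \int_0^{w_+} F(v)(v-\zeta)^{-n-1} dv$ with $F$ the reduced desired speed distribution. The symbol $R(v)$ and the closed form obtained for $D(c)$ are introduced here for illustration only.

```latex
% Order-one balance; R(v) is hypothetical notation for the n = 1 right-hand side.
% All F_n are evaluated at zeta = zeta(c).
\begin{align*}
R(v) &:= \frac{\partial^{(0)} f^{(0)}}{\partial t} + v\,\frac{\partial f^{(0)}}{\partial x}
       = \frac{F(v)\,\zeta'}{T(1-P)}\;
         \frac{v-\zeta-c\,\zeta'}{(v-\zeta)^{2}}\;\frac{\partial c}{\partial x},\\
R(v) &= -(1-P)\,c\,(v-\zeta)\,f^{(1)}(v) + \frac{F(v)}{T\,(v-\zeta)}\;q^{(1)} .
\end{align*}
% Divide by (v - zeta), integrate over 0 < v < w_+, and use \int f^{(1)} dv = 0.
% Differentiating F_0(zeta(c)) = T(1-P)c gives zeta' = T(1-P)/F_1.
\begin{align*}
\frac{F_1}{T}\,q^{(1)} &= \int_0^{w_+}\frac{R(v)}{v-\zeta}\,dv
   = \frac{\zeta'}{T(1-P)}\,\bigl(F_1 - c\,\zeta' F_2\bigr)\,\frac{\partial c}{\partial x},\\
q^{(1)} &= -D(c)\,\frac{\partial c}{\partial x},
   \qquad
   D(c) = T\,\frac{c\,T(1-P)\,F_2 - F_1^{2}}{F_1^{3}} .
\end{align*}
% Cauchy-Schwarz with the factors F^{1/2}(v-zeta)^{-1/2} and F^{1/2}(v-zeta)^{-3/2}:
%   F_1^2 <= F_0 F_2 = c T (1-P) F_2,  hence D(c) >= 0.
```

Under these assumptions the sign of $D(c)$ follows from exactly the inequality quoted in the text, which is some reassurance that the sketch is consistent with the original derivation.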
The diffusive term in (21) can be interpreted as a correction to the zero-order LWR model, which
can be written as
\[ \frac{\partial c}{\partial t} + \frac{\partial}{\partial x} \left[ Q_0(c) \right] = 0. \]
It is interesting to note that the possibility of further improved continuum descriptions of traf-
fic flow already was considered by Lighthill and Whitham (1955, p. 344), who suggested adding
"diffusion" (representing adjustments by drivers to the concentration slightly ahead) and "inertia"
(representing the nonzero time required for accelerations or decelerations) effects to the continu-
ity equation. (The present approach thus confirms the diffusion term, but not the inertia term.)
By the time of his book on nonlinear waves, Whitham (1974) mentioned both this possibility and
an approach similar to current dynamic stream models (as an effort to take account of the "time
lag in the response of the driver"), but there was no associated citation of the work of Payne
(1971). Payne (1971) and Schochet (1988) have observed that the Payne model, as its relaxation
time (which does not appear simply related to that in the Prigogine-Herman model) approaches
zero, reduces to a form similar to (21). On the basis that the first-order Chapman-Enskog approx-
imation to the Boltzmann equation is the Navier-Stokes equations of fluid dynamics, it appears
that the advection-diffusion equation (21) is better considered as the true analog for traffic flow
of the Navier-Stokes equations than are current higher-order models. This supposition seems to
be rather strongly supported by the result of Schochet (1988) described earlier, which shows that
this "continuity plus diffusion" model plays the same role in justifying the entropy condition for
the LWR model as the Navier-Stokes equations play in justifying it for the Euler equations of
fluid dynamics.
COMPUTATIONAL RESULTS
For the example of this section the values $P = .67$ and $T = .0015$ hours were used. These correspond
to $\eta := c/c_{jam} = .3333$ and $\tau = .003$ hours $\approx$ 11 seconds, in the notation of Chapter 4 of
Prigogine and Herman (1971). The reduced desired speed distribution used is that corresponding
to a uniform distribution of desired speeds from 40 miles per hour to 80 miles per hour. The
corresponding equilibrium solution, in the stable regime, is
\[ f_{eq}(v; c) = \begin{cases} \dfrac{50}{v - \zeta(c)}, & \text{for } 40 \text{ mph} < v < 80 \text{ mph}, \\ 0, & \text{otherwise.} \end{cases} \]
Here $\zeta$ is given explicitly by $\zeta(c) = 40(e^{.02c} - 2)/(e^{.02c} - 1)$ mph, where $c$ is density in vehicles
per mile per lane (vpmpl). The critical density is the root of $\zeta(c) = 0$, which is $c_{crit} = 50 \ln 2 =
34.67$ vpmpl. The corresponding traffic stream model, from the equilibrium solution, is $q^{(0)}(c) =
2000 + c\,\zeta(c)$ vehicles per hour per lane (vphpl). A plot of this traffic stream model, for the stable
regime, is shown in Figure 1. The critical flow (i.e., the flow at critical density) is $q_{crit} =
1/(T(1-P)) = 2000$ vphpl.
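The closed-form expressions of this example are easy to tabulate. The short script below is a sketch that evaluates $\zeta(c)$, the traffic stream model and the implied mean speed at a few densities; the function names are hypothetical, and $P$ is taken as 2/3 (an assumption, chosen so that $1/(T(1-P))$ equals the quoted 2000 vphpl).

```python
import math

T, P = 0.0015, 2 / 3          # hours; P = .67 in the text, taken here as 2/3
K = 1 / (T * (1 - P))         # 1 / (T (1 - P)), about 2000 vphpl

def zeta(c):
    """zeta(c) in mph for desired speeds uniform on 40..80 mph (c in vpmpl)."""
    e = math.exp(0.02 * c)
    return 40 * (e - 2) / (e - 1)

def q0(c):
    """Zero-order traffic stream model, in vphpl."""
    return K + c * zeta(c)

c_crit = 50 * math.log(2)     # root of zeta(c) = 0

for c in (3.0, 10.0, 20.0, 30.0, c_crit):
    print(f"c = {c:6.2f} vpmpl   flow = {q0(c):7.1f} vphpl   "
          f"mean speed = {q0(c) / c:5.1f} mph")
```

The mean speed $q^{(0)}(c)/c$ printed in the last column is the single speed at which an LWR model moves all vehicles at density $c$.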
The problem considered here is defined by the parameters of the preceding paragraph, and the
initial conditions
\[ c(x, 0) = c(x, t)\big|_{t=0} = \begin{cases} c_{crit} = 34.67 \text{ vpmpl}, & x < 0, \\ 3 \text{ vpmpl}, & x > 0. \end{cases} \]
This corresponds to release, into a relatively vacant region, at $t = 0$ of a semi-infinite "platoon"
of vehicles extending indefinitely to the left from $x = 0$, and initially at the critical concentration.
The corresponding exact solution to the LWR model (18) consists of a shock, with density $c_l =
c_{crit} = 34.67$ vpmpl on the left and density $c_r = 3$ vpmpl on the right. The shock thus propagates
to the right at a speed of $(q^{(0)}(c_r) - q^{(0)}(c_l)) / (c_r - c_l) = 57.3$ mph.
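The jump-speed quotient used above can be evaluated directly from the traffic stream model of the example. The following is a minimal sketch; the function names are hypothetical, and the result depends slightly on how the parameters are rounded.

```python
import math

def q0(c):
    """Traffic stream model of the example, 2000 + c * zeta(c), in vphpl."""
    e = math.exp(0.02 * c)
    return 2000 + c * 40 * (e - 2) / (e - 1)

def jump_speed(c_left, c_right):
    """Speed (mph) of a density jump: flow difference over density difference."""
    return (q0(c_right) - q0(c_left)) / (c_right - c_left)

print(jump_speed(34.67, 3.0))
```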
This problem was solved computationally, for both the zero-order LWR approximation and
the first-order "advection-diffusion" approximation (21), by means of the generalized Godunov
method (Morton, 1996, Section 7.3). This method was selected specifically because it is a shock-
capturing method when applied to pure hyperbolic advection problems such as the LWR model,
but it also is applicable to advection-diffusion equations, such as the first-order Chapman-Enskog
approximation (21) obtained above. Thus this method permits comparison of the zero-order and
first-order approximations, without extraneous effects stemming from the use of fundamentally
different computational methods. Some sense of the capabilities and limits of this method can
be obtained from Fig. 2. In this figure, the densities for both the exact solution of the LWR
model, as described above, and for its computational approximation by the generalized Godunov
method, are displayed, for t = 1.03 minutes. Notice that the computational method captures
very well the location of the shock. The shock in the computational result displays a nonzero
thickness, that is just discernible on the scale of the motion of the front (it is three to five compu-
tational spatial cells thick). This "numerical dispersion" phenomenon is practically unavoidable
for computational approximations to nonlinear conservation laws (such as LWR models). The
LWR computational results also display relatively small (about 10% of the critical density) non-
physical oscillations near either edge of the computed shock. Many (but not all) computational
methods for nonlinear conservation laws display such high-frequency oscillatory errors. All in
all, these results provide some evidence of credibility for the computational methodology.
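To give a feel for this kind of computation, the sketch below marches the zero-order model with a plain first-order Godunov update. It is not the generalized Godunov method of Morton (1996) used for the results reported here. Because the characteristic speed of the example's traffic stream model is positive throughout the stable regime, the Godunov interface flux reduces to the flux of the upstream cell. The grid extent, the inflow boundary treatment and all names are assumptions; the cell size and time step are those annotated in Figure 2.

```python
import math

def q0(c):
    """Traffic stream model of the example, in vphpl (c in vpmpl, c > 0)."""
    e = math.exp(0.02 * c)
    return 2000 + c * 40 * (e - 2) / (e - 1)

def godunov_lwr(c_left, c_right, x_min=-0.5, x_max=2.0,
                dx=88 / 5280, dt=0.97 / 3600, t_end=1.03 / 60):
    """March c_t + q0(c)_x = 0 from step initial data; miles and hours throughout."""
    n = round((x_max - x_min) / dx)
    x = [x_min + (i + 0.5) * dx for i in range(n)]       # cell centres
    c = [c_left if xi < 0 else c_right for xi in x]
    for _ in range(round(t_end / dt)):
        # flux[i] is the flow into cell i; flux[0] is the constant inflow.
        # Upstream-cell flux = Godunov flux here, since q0'(c) > 0 on this range.
        flux = [q0(c_left)] + [q0(ci) for ci in c]
        c = [c[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]
    return x, c

x, c = godunov_lwr(34.67, 3.0)
for xi, ci in zip(x, c):
    if 0.8 < xi < 1.2:
        print(f"{xi:5.3f} mi   {ci:6.2f} vpmpl")
```

With these step sizes the Courant number is just under one, so the update is monotone and the computed densities stay between the two initial values.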
The computational results for the advection-diffusion model (21) at the same time also are displayed
in Fig. 2. The most significant difference from the LWR results is that the shock profile
is significantly wider and smoother. (The shock is approximately .3 miles wide for the diffusive
approximation, versus .1 miles for the computed LWR results.) To understand the basis for this
difference, notice that the LWR model effectively approximates the vehicular flow as if all vehicles
travelled at the average speed. (The average speed is 57.7 miles per hour upstream of the
shock, and 59.7 mph downstream.) However, according to the underlying kinetic model there is
in fact a distribution of vehicular speeds extending from v = 40 mph to v = 80 mph. Therefore,
on either side of the shock a significant fraction of the vehicles are in fact moving faster than the
average speed. Those vehicles in fact penetrate into the low-density downstream region further
in a given time than would be predicted on the basis of the assumption that all vehicles move at
the average speed corresponding to the local density. It is this physical dispersive effect, arising
from the underlying statistical distribution of vehicular speeds, that seems to be represented by
the additional flow (i.e., $q^{(1)}$) giving rise to the diffusive term in the first-order approximation
(21).
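The link between the diffusive term and the spread of desired speeds can be made concrete numerically. The sketch below evaluates a candidate diffusion coefficient for the uniform 40 to 80 mph example and compares it with $T$ times the variance of the desired speeds. The closed form $D(c) = T\,[cT(1-P)F_2 - F_1^2]/F_1^3$ used here is an assumption, obtained by carrying the $n = 1$ calculation through under the standard Prigogine-Herman collision term; it is not quoted from the text, and all names are hypothetical.

```python
import math

T, P = 0.0015, 2 / 3          # hours; P taken as 2/3 (assumption)

def zeta(c):
    """zeta(c) in mph for desired speeds uniform on 40..80 mph."""
    e = math.exp(0.02 * c)
    return 40 * (e - 2) / (e - 1)

def diffusion_coefficient(c):
    """Assumed D(c) = T (c T (1-P) F2 - F1^2) / F1^3, in square miles per hour."""
    a, b = 40 - zeta(c), 80 - zeta(c)
    F1 = 1 / (a * b)                       # (1/40) * integral of (v - zeta)^-2
    F2 = (a + b) / (2 * (a * b) ** 2)      # (1/40) * integral of (v - zeta)^-3
    return T * (c * T * (1 - P) * F2 - F1 ** 2) / F1 ** 3

variance = 40 ** 2 / 12       # variance of a uniform law on 40..80 mph, mph^2

for c in (3.0, 15.0, 34.0):
    print(f"c = {c:5.1f}   D = {diffusion_coefficient(c):.4f}   "
          f"T * variance = {T * variance:.4f}")
```

If the assumed form is right, the coefficient stays close to $T$ times the speed variance across the stable regime, which matches the interpretation of the diffusive term as dispersion caused by differing desired speeds.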
Figure 3 is a plot of the densities, as computed from the first-order diffusive model, from $t = 0$
to $t = 1.03$ minutes, and over a significant spatial region surrounding the initial location of the
leading edge of the platoon. The region to the left of the slowly widening shock is the relatively
crowded region initially to the left of $x = 0$, and that to the right is the free-flow region initially to
the right of $x = 0$. From the perspective of Fig. 3 the widening of the shock as time increases is
quite apparent. This is due to dispersion of vehicles at the leading edge of the platoon, because of
their different desired speeds. By contrast, the shock in the comparable LWR results (not shown)
maintains a constant and narrower width.
CONCLUSIONS
It has been shown that the Chapman-Enskog expansion applied to the Prigogine-Herman kinetic
equation gives rise, in the regime of stable flow, to a LWR model at order zero, and to an
advection-diffusion equation, consisting of an LWR model with an additional diffusive term, at
order one. It is suggested that the diffusive term represents the tendency of vehicles initially
together to disperse, because of the different desired speeds of the drivers. (By contrast, the LWR
model proceeds as if all vehicles were traveling at the average speed corresponding to the density
of vehicles in their immediate vicinity.)

[Figure 1: Traffic Stream Model. Horizontal axis: Concentration (vehicles per mile).]
[Figure 2: The platoon release problem at t = 1.03 minutes. Curves: Diffusive, Cmptd. LWR, Exact LWR. Horizontal axis: Space (miles). Annotations: Delta t = .97 sec., Delta x = 88 ft., T = .0015 hrs, P = .666.]
[Figure 3: The Diffusive Approximation to the Platoon Release Problem. Horizontal axis: Space (miles).]

Helbing (1996a, p. 2379) has earlier discussed this dispersion phenomenon, by way of answering
the objection of Daganzo (1995b) to the fact that higher-order models predict vehicular movement
faster than the (average) vehicular speed. Dispersion of vehicles due to differing speeds
clearly is a real phenomenon that is not represented in LWR models. But this phenomenon is
captured, to some degree, by either existing higher-order models or by the first-order Chapman-
Enskog approximation presented here. The choice between these two alternatives, at least in
terms of capturing the dispersion effect, may ultimately be a matter of taste. The authors prefer
the advection-diffusion model developed here, on the basis of the following two points:
• A corrective term in the conservation equation, with no additional unknowns, seems simpler
than an additional equation, with an additional unknown. (This point invokes the
principle of Occam's razor.)
• The first-order model obtained here arises from the kinetic theory via standard asymptotic
expansions, rather than by ad hoc techniques.
In any event, the results presented above clearly provide an alternative to current higher-order
models, for those seeking continuum models of traffic flow that are not subject to some of the
limitations of LWR methods.
The approach presented here also has the advantage, relative to current higher-order models, that
it can be systematically extended to give a hierarchy of arbitrarily high-order extensions of the
zero-order LWR model. For example, in the theory of rarefied gases the second-order Chapman-
Enskog expansion gives rise to the Burnett equations of fluid flow (Cercignani, 1988). These
equations are not used in fluid flow as often as their first-order Navier-Stokes counterpart, but
they have found application. It remains to be seen whether the traffic-flow counterpart of the
Burnett equations has any significant application. Work related to this issue will be presented
elsewhere.
It would be of some interest to extend the Chapman-Enskog expansions presented here to the
case that P and T vary with density (c) in the manner assumed in Chap. 4 of Prigogine and
Herman (1971). However, this interest may be mostly academic in nature. This is because the
effect of this dependence upon c is most pronounced for larger values of c, which are outside
the stable-flow regime for which the present theory applies. Probably a more important issue,
as regards the potential application to congested flow, is to determine a counterpart continuum
theory that is applicable to unstable flow. The results of this paper, combined with those of
Nelson and Sopasakis (1998), suggest that the applicability of LWR models may be essentially
limited to stable flow.
The asymptotic expansion of Hilbert (e.g., Cercignani, 1988) has been used in the kinetic the-
ory of gases, as yet another approach to obtaining a hierarchy of continuum approximations.
When applied to the Prigogine-Herman kinetic model (in the stable flow regime) it gives rise to
a hierarchy of continuum models in which the model of order n arises by adding a first-order
partial differential equation of LWR type to the previous $n$ first-order partial differential equations.
However, the unknowns in these equations are successive corrections to the density, and the
equation added to comprise the model of order $n$ contains not only the $n$th order correction, say
$c^{(n)}$, but also all lower-order contributions $c^{(n-1)}, c^{(n-2)}, \ldots, c^{(0)}$. Thus there is a one-directional
(triangular) coupling between the equations marking the successive orders. More details of this
approach will be presented elsewhere.
The first-order diffusive Chapman-Enskog approximation is a differential equation of order two.
Thus one would expect to need two boundary conditions. And at each successive further order in
the Chapman-Enskog hierarchy one would need one further boundary condition. Thus difficulties
with boundary and interface conditions, which already are significant even for the zero-order
LWR models, are only compounded for successive entries in the Chapman-Enskog hierarchy.
This is to be expected, as boundary conditions for the counterpart expansion in the kinetic theory
of gases also are a difficult issue. This issue also arises for current higher-order models, and does
not seem to have been addressed in that context.
Finally, it would be of great interest to extend the types of results presented here to other, and
more realistic, kinetic models of vehicular traffic. Perhaps the most immediate such extension
should be to a Prigogine-Herman equation modified so as to incorporate the spatial extent of
vehicles in the interaction term.
REFERENCES
Ansorge R. (1990). What does the entropy condition mean in traffic flow theory? Transportation
Research B 24B, 133-144.
Bui, D. D., Nelson, P. and Narasimhan, S. L. (1992). Computational realizations of the entropy
condition in modelling congested traffic flow. Texas Transportation Institute report no.
1232-7.
Bui, D. D., Nelson, P. and Sopasakis, A. (1996). The generalized bimodal traffic stream model
and two-regime theory. Transportation and Traffic Theory, (J.-B. Lesort, Ed.), Elsevier,
Oxford, 679-696.
del Castillo J. M., Pintado, P. and Benitez, F. G. (1993). A formulation for the reaction time
of traffic flow models. Transportation and Traffic Theory, (C. F. Daganzo, Ed.), Elsevier,
Amsterdam, 387-406.
Ceder A. (1976). A deterministic flow model for the two-regime approach. Transp. Res. Rec.
567, 16-30.
Cercignani C. (1988). The Boltzmann equation and its applications. Springer-Verlag, New York.
Chapman S. and Cowling, T. G. (1952). The mathematical theory of nonuniform gases. Cam-
bridge University Press, Cambridge.
Daganzo C. F. (1995a). A finite difference approximation of the kinematic wave model of traffic
flow. Transportation Research B 29B, 261-276.
Daganzo C. F. (1995b). Requiem for second-order fluid approximations of traffic flow. Trans-
portation Research B 29B, 277-286.
Disbro J. E. and Frame M. (1989). Traffic flow theory and chaotic behavior. Transp. Res. Rec.
1225, 109-115.
Drake J. S., Schofer, J. L. and May, A. D. (1967). A statistical analysis of speed density hypothesis.
Highway Research Record 154, 53-87.
Gerlough D.L. and Huber M. J. (1975). Traffic Flow Theory. Transportation Research Board
Special Report 165, Washington, D.C.
Grad, H. (1949). On the kinetic theory of rarefied gases. Communications on Pure and Applied
Mathematics 2, 331-407.
Greenshields B. D. (1934). A study of traffic capacity. Procs. Highw. Res. Board 14, 448-477.
Hauge E. H. (1969). Exact and Chapman-Enskog solutions of the Boltzmann equation for the
Lorentz model. Arkiv for Det Fysiske Seminar i Trondheim, No. 5.
Hall F. L. (1987). An interpretation of speed-flow concentration relationships using catastrophe
theory. Transp. Res. A 21 A, 191-201.
Helbing, D. (1995a). Theoretical foundation of macroscopic traffic models. Physica A 219,
375-390.
Helbing, D. (1995b). High-fidelity macroscopic traffic equations. Physica A 219, 391-407.
Helbing D. (1996a). Gas-kinetic derivation of Navier-Stokes-like traffic equations. Physical
Review E 53, 2366-2381.
Helbing, D. (1996b). Derivation and empirical validation of a refined traffic flow model. Physica
A 233, 253-282.
Kerner B. S., Konhäuser, P. and Schilke, M. (1995). Deterministic spontaneous appearance of
traffic jams in slightly inhomogeneous traffic flow. Physical Review E 51, 6243-6246.
Klar A. and Wegener, R. (1997). Enskog-like kinetic models for vehicular traffic. J. Statistical
Physics 87, 91-114.
Klar A. and Wegener, R. (to appear-a). A hierarchy of models for multilane vehicular traffic I:
Modeling. SIAM J. Applied Math.
Klar A. and Wegener, R. (to appear-b). A hierarchy of models for multilane vehicular traffic II:
Numerical and stochastic investigations. SIAM J. Applied Math.
Kühne R. D. and Beckschulte R. (1993). Non-linearity stochastics of unstable traffic flow.
Transportation and Traffic Theory, (C. F. Daganzo, Ed.), Elsevier, Amsterdam.
Lebacque, J.-P. (1984). Semimacroscopic simulation of urban traffic. Procs. Int. AMSE Conf
"Modelling & Simulation," 4, 273-292.
Lebacque, J.-P. (1995). The Godunov scheme and what it means for first order traffic flow models.
CERMICS Report No. 95-48, November.
Lebacque J.-P. (1996). The Godunov scheme and what it means for first order traffic flow models.
Transportation and Traffic Theory, (J.-B. Lesort, Ed.), Elsevier, Oxford, 647-678.
Lebacque, J.-P. (1997). A finite acceleration scheme for first order macroscopic traffic flow mod-
els. Transportation Systems, Procs. 8th IFAC/IFIP/IFORS Symposium, (M. Papageorgiou
and A. Pouliezos, Eds.), Tech. Univ. Crete, Chania, 2, 815-820.
Leo, C. J. and Pretty, R. L. (1992). Numerical simulation of macroscopic continuum traffic
models. Transportation Research B, 26B, 207-220.
LeVeque, R. J. (1990). Numerical methods for conservation laws. Birkhäuser-Verlag, Basel.
Liboff R. L. (1990). Kinetic theory: classical, quantum and relativistic descriptions. Prentice-
Hall, Englewood Cliffs, NJ.
Lighthill M. J. and Whitham G. B. (1955). On kinematic waves II - a theory of traffic flow on
long crowded roads. Proc. Royal Society, London A229, 317-345.
Lyrintzis A. S., Liu, G. and Michalopoulos, P. G. (1994). Development and comparative evalua-
tion of high-order traffic flow models. Transportation Research Record, 1457, 174-183.
May, A. D. (1990). Traffic Flow Fundamentals. Prentice-Hall, Englewood Cliffs, NJ.
Michalopoulos, P. G., Beskos, D. E. and Lin, J-K (1984). Analysis of interrupted traffic flow by
finite difference methods. Transp. Res. B 18(B)(4/5), 409-421.
Michalopoulos, P. G. and Beskos, D. E. (1984). Improved continuum models of freeway flow.
Ninth International Symposium on Transportation and Traffic Theory, (J. Volmuller and R.
Hamerslag, Eds.,) VNU Science Press, Utrecht, 89-111.
Michalopoulos, P. G., Yi, P. and Lyrintzis, A. S. (1993). Continuum modelling of traffic dynamics
for congested freeways. Transportation Research B, 27B, 315-332.
Morton K. W. (1996). Numerical solution of convection-diffusion problems, Chapman & Hall,
London.
Nelson, P. (1995a). A kinetic model of vehicular traffic and its associated bimodal equilibrium
solutions. Transport Theory and Statistical Physics 24(1-3), 383-409.
Nelson P. (1995b). On deterministic developments of traffic stream models. Transp. Res. B 29B
No 4, 297-302.
Nelson, P., Bui, D. D. and Sopasakis, A. (1997). A novel traffic stream model deriving from
a bimodal kinetic equilibrium. Transportation Systems, Procs. 8th IFAC/IFIP/IFORS
Symposium, (M. Papageorgiou and A. Pouliezos, Eds.), Tech. Univ. Crete, Chania, 2, 799-804.
Nelson P. and Sopasakis, A. (1998). The Prigogine-Herman kinetic model predicts widely scattered
traffic flow data at high concentrations. Transportation Research B 32(8), 589-604.
Newell G. F. (1962). Nonlinear effects in theory of car following. Operations Res. 9, 209-229.
Newell, G. F. (1989). Comments on traffic dynamics. Transportation Research B 23B, 386-389.
Newell, G. F. (1993). A simplified theory of kinematic waves in highway traffic, Part I: General
theory. Transportation Research B 27B No 4, 281-287.
Paveri-Fontana S. L. (1975). On Boltzmann-like treatments for traffic flow: A critical review
of the basic model and an alternative proposal for dilute traffic analysis. Transp. Res. 9,
225-235.
Payne H. J. (1971). Models of freeway traffic and control. Simulation Council Proc., 1, 51-61.
La Jolla, CA; Simulations Council, Inc.
Payne, H. J. (1979). FREFLO: A macroscopic simulation model of freeway traffic. Transporta-
tion Research Record 722, 68-77.
Phillips W. F. (1977). Kinetic model for traffic flow. Report No. DOT/RSPD/DPB/50-77/17,
Utah State University for the U.S. Department of Transportation.
Phillips W. F. (1979). A kinetic model for traffic flow with continuum implications. Transporta-
tion Planning and Technology 5, 131-138.
Phillips, W. F. (1981). A new continuum model for traffic flow. Report No. DOT/RSPA/DPA-
50/81/5, U. S. Department of Transportation, Washington D. C. (Prepared by Utah State
University, Logan, Utah).
Prigogine I. and Herman R. (1971). Kinetic Theory of Vehicular Traffic. American Elsevier Pub.
Co., New York.
Prigogine, I. Herman, R. and Anderson, R. L. (1962). On individual and collective flow. Bulletin
Acad. Royale de Belgique 48, 792-804.
Rathi, A. K., Lieberman, E. B. and Yedlin, M. (1987). Enhanced FREFLO Program: Simulation
of congested environments. Transportation Research Record 1112, 61-71.
Richards, P. I. (1956). Shockwaves on the highway. Operations Research, 4, 42-51.
Ross P. (1988). Traffic dynamics. Transp. Res. 22B, 421-435.
Ross, P. (1989). Response to Newell. Transportation Research B 23B, 390-391.
Schochet S. (1988). The instant response limit in Whitham's nonlinear traffic-flow model: Uni-
form well-posedness and global existence. Asymptotic Analysis, 1, 263-282.
Transportation Research Board (1985), Highway Capacity Manual. Special Report 209, National
Academy of Sciences, Washington, DC.
Wegener, R. and Klar, A. (1996). A kinetic model for vehicular traffic derived from a stochastic
microscopic model. Transport Theory and Statistical Physics 25, 785-798.
Whitham G. B. (1974). Linear and Nonlinear Waves. Wiley, New York.
THE LAGGED CELL-TRANSMISSION MODEL
Carlos F. Daganzo, Inst. of Transp. Studies, Univ. of California, Berkeley, California, USA
ABSTRACT
Cell-transmission models of highway traffic are discrete versions of the simple continuum
(kinematic wave) model of traffic flow that are convenient for computer implementation. They are
in the Godunov family of finite difference approximation methods for partial differential equations.
In a cell-transmission scheme one partitions a highway into small sections (cells) and keeps track
of the cell contents (number of vehicles) as time passes. The record is updated at closely spaced
instants (clock ticks) by calculating the number of vehicles that cross the boundary separating each
pair of adjoining cells during the corresponding clock interval. This average flow is the result of
a comparison between the maximum number of vehicles that can be "sent" by the cell directly
upstream of the boundary and those that can be "received" by the downstream cell.
The sending (receiving) flow is a simple function of the current traffic density in the
upstream (downstream) cell. The particular form of the sending and receiving functions depends
on the shape of the highway's flow-density relation, the proximity of junctions and on whether the
highway has special (e.g., turning) lanes for certain (e.g., exiting) vehicles. Although the discrete
and continuum models are equivalent in the limit of vanishingly small cells and clock ticks, the
need for practically sized cells and clock intervals generates numerical errors in actual applications.
This paper shows that the accuracy of the cell-transmission approach is enhanced if the
downstream density that is used to calculate the receiving flow(s) is read $\ell$ clock intervals earlier
than the current time, where $\ell$ is a non-negative integer that should be chosen by means of a simple
formula. The rationale for the introduction of this lag is explained in the paper. The lagged cell-
transmission model is related (but not equivalent) to both Godunov's first order method for general
flow-density relations and Newell's exact method for concave flow density relations. It is easier
to apply and more general than the latter, and more accurate than the former. In fact, if the flow-
density relation is triangular and the lag is chosen optimally, then the lagged cell-transmission model
is a conservative, second order, finite difference scheme. As a result, very accurate results can be
obtained with relatively large cells. Accuracy formulae and sample illustrations are presented for
both the triangular and the general case.
1. INTRODUCTION
This paper describes a new finite difference approximation for the kinematic wave model
of traffic flow formulated by Lighthill and Whitham (1955) and Richards (1956), called here the
LWR model, and for the generalized continuum model that applies to freeways with special lanes
(Daganzo, 1995). The proposed scheme is conservative, in that vehicles are not created or lost
during the simulation except at the highway's entrances and exits, like the procedures described
in Bui et al (1992), Lebacque (1993) and Daganzo (1993 and 1993a). Unlike its predecessors,
however, the proposed scheme is not in Godunov's family of finite difference approximations (see
Godunov, 1961, or LeVeque, 1992), and is more accurate than they are.
In the new scheme, flows across the boundary between two cells are calculated with a
sending/receiving metaphor similar to that introduced in Lebacque (1993) and Daganzo (1993a),
using only information from the two neighboring cells.1 Such a metaphor is called here the cell-
transmission (CT) model. The CT recipe has been modified to model junctions (Daganzo, 1994)
and highway links with special lanes, e.g., for turning or high occupancy vehicles (Daganzo, Lin
and DelCastillo, 1995). These modifications make it possible to model complex networks. The
similarity between the new and the old CT recipes ensures that these modifications can be extended
trivially to the new method. Thus, the improvement in accuracy obtained with the new method
should apply to network models.
1 Lebacque (1993) uses the terms "local demand" and "local supply" to express the metaphor.

The main difference between the new approach and the original CT recipe is that the density
of the downstream cell, used to calculate the "receiving" flow, is now taken from an earlier time,
with a lag of $\ell$ simulation clock intervals. Newell's exact solution method for concave flow-
density relations (Newell, 1993) is also based on the introduction of lags. Lags are useful for
highway traffic modeling because traffic information travels several times more slowly in the
upstream than in the downstream direction. In our case, we shall see that if the two wave speeds
(forward and backward) are independent of density, then the lagged cell-transmission (LCT) model
turns out to be second order accurate.
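The sending/receiving bookkeeping, with the downstream density read a few clock ticks late, can be sketched as follows. This is only an illustration of the description given so far and not the full LCT recipe developed later in the paper: the triangular flow-density relation, every parameter value, the closed boundaries, the clamping of the receiving flow at zero, and the function name are assumptions, and the formula for choosing the lag is not reproduced here.

```python
from collections import deque

def simulate(k0, steps, lag, vf=60.0, w=15.0, kj=200.0, dx=0.1, dt=0.001):
    """Cell densities (veh/mile) after `steps` clock ticks on a closed road section.

    vf, w: forward and backward wave speeds (mph); kj: jam density (veh/mile);
    dx: cell length (miles); dt: clock tick (hours). All values are hypothetical.
    """
    assert vf * dt <= dx, "a vehicle must not cross more than one cell per tick"
    qmax = vf * w * kj / (vf + w)                    # capacity of the triangular relation
    send = lambda k: min(vf * k, qmax)               # most the upstream cell can send
    recv = lambda k: max(0.0, min(qmax, w * (kj - k)))   # most the downstream cell can take
    k = list(k0)
    history = deque([k], maxlen=lag + 1)             # history[0] is `lag` ticks old once full
    n = len(k)
    for _ in range(steps):
        k_lagged = history[0]                        # lag = 0 gives the unlagged CT recipe
        flow = [0.0] * (n + 1)                       # flow[i] enters cell i; ends are closed
        for i in range(1, n):
            flow[i] = min(send(k[i - 1]), recv(k_lagged[i]))
        k = [k[i] + dt / dx * (flow[i] - flow[i + 1]) for i in range(n)]
        history.append(k)
    return k

k0 = [150.0] * 5 + [20.0] * 5                        # a queue discharging into light traffic
print(simulate(k0, steps=50, lag=0))
print(simulate(k0, steps=50, lag=2))
```

Because each boundary flow is subtracted from one cell and added to its neighbour, the scheme conserves vehicles for any lag; only the accuracy of the result depends on how the lag is chosen.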
Because this paper is closely related to Daganzo (1993a), the reader is referred to that
reference for more extensive introductory remarks. The remainder of this paper is organized as
follows. Section 2 presents experimental evidence pertaining to wave speeds and illustrates by
means of an example the minor difficulties one encounters with the LWR theory when the wave
speed does not decline with density monotonically, i.e., when the flow-density relation is non-
concave; in particular, it is shown that the "stable" LWR solution can have waves emanating from
a shock. The new finite difference approximation is then presented in Section 3, together with an
analysis of its accuracy and stability. The paper concludes with some examples (Sec. 4) that
illustrate the results of Sec. 3.
2. NON-DECLINING WAVE SPEEDS
The LWR model is intended to describe traffic on large scales of observation, where it
makes sense to define a density function k(t, x) and a flow function q(t, x) in time-space (t, x).
It is based on the assumption that q and k are locally related by a flow-density relation q = T(k,
t, x). When the highway is homogeneous and its features do not depend on time (e.g. no incidents
or moving bottlenecks) then the relation only includes k as an argument:
q = T(k). (1)
This is the case that will receive attention for the most part of this paper.
Flow and density are also related by the conservation equation:
$k_t + q_x = 0$, (2)
where subscripts have been used to denote partial derivatives.2 Thus, it is possible to eliminate
q from (1) and (2). The result is the following simple first order quasi-linear partial differential
equation in k:
$k_t + T_k k_x = 0$. (3)
In this equation $k_t$ and $k_x$ are functions of t and x, and $T_k$ is a known function of k. This is the
conventional way of expressing the LWR model for numerical approximations.
2
Subscripts t, x and k will be used in this paper to denote partial derivatives of the subscripted
variable with respect to time, distance and density. All other subscripts, e.g., i, j, and $\ell$, will be
indices for the subscripted variables.
If $T_k$ is independent of k then the solution to (3) has the form $k(t, x) = g(x - T_k t)$; i.e., it
is a translationally symmetric function (wave) with wave velocity $T_k$. The particular form of g
depends on the boundary conditions. For example, for the initial value problem where k(0, x) is
given, g(x) = k(0, x) and
$k(t, x) = k(0, x - T_k t)$. (4)
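That the translated profile satisfies (3) when the wave velocity is constant can be checked in one line. The following is only a verification sketch; g' denotes the derivative of g with respect to its single argument, a notation not used elsewhere in this paper.

```latex
\begin{align*}
k(t,x) &= g(x - T_k t)
  \;\Longrightarrow\; k_t = -T_k\, g'(x - T_k t), \qquad k_x = g'(x - T_k t), \\
k_t + T_k k_x &= -T_k\, g'(x - T_k t) + T_k\, g'(x - T_k t) = 0 .
\end{align*}
```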
The wave velocity Tk also plays an important role in the quasi-linear case. In this case too,
the density is constant along wave-lines (characteristics) issued from the boundary, but the wave-
lines can now focus and cross. The solution can be extended into these regions of the (t, x) plane
by introducing curves, called shocks in the LWR theory, where k(t, x) is discontinuous.
Conservation of vehicles across such discontinuities results in the following equation for the shock
velocity, u :
$u = \Delta q / \Delta k$, (5)
where $\Delta q$ and $\Delta k$ represent the changes in flow and density across the shock.
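As a small numerical illustration of (5), the sketch below evaluates the velocity of a density discontinuity between two states. The function name and argument order are hypothetical; the two states are the ones labeled A and A' in the example of Sec. 2.1, and treating their interface as a discontinuity is an assumption made only for this illustration.

```python
def shock_velocity(q_up, k_up, q_down, k_down):
    """Velocity u = (change in flow) / (change in density), as in Eq. (5).

    Units follow the inputs: flows in veh/sec and densities in veh/Km
    give a velocity in Km/sec.
    """
    dk = k_down - k_up
    if dk == 0:
        raise ValueError("the two states have the same density; no discontinuity")
    return (q_down - q_up) / dk


# State A (upstream):    q = 1.0 veh/sec, k = 20 veh/Km
# State A' (downstream): q = 0.6 veh/sec, k = 12 veh/Km
u_interface = shock_velocity(1.0, 20.0, 0.6, 12.0)   # Km/sec
u_interface_m_per_s = 1000.0 * u_interface            # m/sec
```

With these inputs the interface moves forward at about 0.05 Km/sec, i.e., about 50 m/s.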
Equations (3) and (5), however, are not always enough to specify a solution. It turns out
that shocks can often be introduced in more than one way to form a mathematically correct solution
of a properly formulated problem; i.e., a solution that is consistent with (3) and (5) and with the
initial data. If the problem has been properly formulated, however, i.e., if it makes physical sense,
then there should be one and only one solution that makes physical sense. This solution can be
identified with a standard stability argument; i.e., by making sure that the solution does not come
undone if a small perturbation is introduced in it.3
If q = T(k) is represented by a curve as in the top part of Fig. 1, then the wave velocity is
the slope of the curve. Note that if the curve is concave, then the wave velocity declines with
density, and also with the traffic speed. That is, waves focus (into a shock) when traffic decelerates
and fan out (as an expanding wave) when it accelerates. The best evidence available and common
sense indicate that the maximum wave speed in the forward direction, obtained for low densities,
is comparable with the free-flow traffic speed, e.g., on the order of 60 mi/hr (27 m/s), and that the
backward wave speeds are several times slower. This is illustrated on the bottom part of Fig. 1,
which has been taken from Cassidy (1998).
3
The physically relevant solution must be one where the shocks are "stable"; i.e., where if (at any
given time, $t_0$) one were to replace a shock by a quick but gradual transition in density between the
values prevailing on both sides of the shock, then the shock would reform itself and the solution
would quickly approach the original (for $t > t_0$).
[Figure 1 plots: top panel against Density, k (veh/m); bottom panel against Occupancy (%), with data series "10-min Criterion" and "Relaxed Criterion".]
Figure 1. Stationary relations between traffic variables. Top: hypothetical non-concave
relation between flow and density. Bottom: actual form of a steady-state relationship
between flow and occupancy (source: Cassidy, 1998).

Additional experiments correlating data from several detectors point to the existence of
expansive deceleration waves and focused acceleration waves in congested traffic (Windover and
Cassidy, 1997). Related anecdotal evidence has also been recorded by this author on Highway US-50
West of Placerville, California. This is a heavily traveled two-lane highway with very few
intersections, which experienced a sustained capacity-reducing incident on the date in question. This
created queues of many miles on both sides of the road. On crossing the incident location and
traveling past the queue in the opposite direction, this author noted a period of a minute or two
(spanning between 1 and 2 highway miles) where rather dense traffic appeared to be coasting
toward the end of the queue(!). This was not a stop-and-go wave, for those were observed within
the queue and they had a much shorter period (from the standpoint of the moving observer).
This coasting effect cannot be explained by the LWR theory with a concave T(k), since in
that theory the end of a queue should involve a transition with just a few vehicles. It cannot be
explained either by linear car-following models (e.g., as in Herman et al, 1959), but it can be
explained by the LWR theory with a non-concave T(k) and a "tail" to the right (such as that in the
top part of Fig. 1) and/or by the corresponding non-linear car-following model. We do not wish
to speculate further about this phenomenon in this paper, since the goal of this comment is only to
establish the desirability of having numerical methods that can treat non-concave T(k) relations.
Such methods can be used for prediction if such a tail exists, or they may help disprove its existence
if it does not.
Improved numerical methods are desirable because the exact procedures in Newell (1993)
cannot be used with non-concave relations, and because the cell-transmission approaches introduce
some error into the calculation. Before the new method is presented, a brief example is introduced
which illustrates the stability condition for non-concave T(k).
2.1 An example
Let us examine how a line of cars comes to a halt according to the LWR theory, when the
flow-density relation, T(k), is as in Fig. 2a. We assume that the corners of our curve have been
smoothed very slightly, not enough to be seen in the figure, so that the wave velocity is defined for
all densities.4
It is assumed that the leading cars in the line are in state A' (q = 0.6 veh/sec, k = 12
veh/Km) and that heavier traffic (state A with q = 1 veh/sec and k = 20 veh/Km) is found 6 Km
upstream of the leading car; see Fig. 2b. The origin of the (t, x) coordinates has been chosen so that
the first car stops at x = 0 (Km) when t = 0 (secs).
4
Smoothed piece-wise linear q-k curves such as ours are good for illustration purposes because
they lead to solutions that are simple and easy to interpret.
[Figure 2 time axis: t (sec), from -200 to 400.]
Figure 2. Solutions of a lead-vehicle problem with kinematic wave theory: (a) T(k) relation; (b)
a mathematically valid but physically unacceptable solution; and (c) the stable solution.

Part b of the figure satisfies (3) and (5) but is unstable. To see this, note that if the shock
at time t0 is replaced by a smooth transition in density (e.g., at t0 = 400s) then a fan of waves would
emanate from it and the solution would not evolve into the future (t > t0) as originally assumed
because the fan would introduce a state "E" into the solution.
Part c is the stable solution to this problem. It includes a fan of waves (carrying state "E")
that are issued tangentially from the shock at the point where it bends. That this solution is stable
is seen by noting that if the shock is replaced by a smooth but rapid transition in density anywhere,
even at the point where the shock bends, then the wave pattern so generated will match the one in
the solution.5 This means that no new states can be introduced into the solution of Fig. 2c by an
infinitesimal perturbation and that the solution is the one that would arise physically.
That Fig. 2c is the relevant solution can also be verified from microscopic (car-following)
considerations, although this is more tedious. This statement is based on the principle that any
asymptotically stable car-following model, in the sense of Herman et al (1959), must be consistent
with the corresponding (stable) LWR result; e.g. with the macroscopic result of Fig. 2c in our case,
provided of course that the equilibrium speed-spacing relation is consistent with the T(k) of Fig.
2a and that the initial conditions are also consistent with those of Fig. 2. The development of a
coasting state, such as state "E" of Fig. 2c, is demonstrated with car-following theory in a longer
version of this paper (Daganzo, 1997). Figure 10 of that reference shows that the car-following
vehicle trajectories indeed develop a coasting state when a platoon is brought to a halt.
3. THE LAGGED CELL-TRANSMISSION RULE
3.1 Background
The cell-transmission model can be applied to unimodal T(k) curves with maximum flow
qmax. First one defines two monotonic curves that take values in the interval [0, qmax] , as shown
in Fig. 3: a non-increasing receiving curve R(k) and a non-decreasing sending curve S(k). The
symbol k° is used to denote one of the densities where the maximum is achieved.
A rectangular lattice with spacings e and d is then overlaid on the (t, x)-plane, as shown
on Fig. 4. It is understood that traffic flows in the direction of increasing x. The x-coordinates of
the lattice points, denoted {$x_j$}, represent the centers of the "cells" into which the highway has been
discretized, and the t-coordinates {$t_i$}, the times at which the cell densities are evaluated. The
numbering scheme is such that $x_{j+1} > x_j$ and $t_{i+1} > t_i$, so that traffic advances in the direction of
increasing j.
5
To check stability at the point where the shock bends, one should treat the transition from A' to
A as being rapid but smooth, remembering that the diagram of Fig. 2c is on a large scale. On a
resolution scale where the transition from A' to A can be discerned the stable shock would have
to bend gradually. Furthermore, waves would peel off from it as it curves. This detailed geometry
is consistent with the macroscopic picture painted in Fig. 2c.
If we now let $K(t_i, x_j)$ denote the average density estimated for cell j at time $t_i$, and we
write $Q(t_i + e/2, x_j + d/2)$ for the average flow that would advance from cell j to cell j+1 (i.e.,
crossing location $x_j + d/2$) in the time interval $[t_i, t_{i+1}]$, then we require:
K(t+e, x) = K(t, x) - (e/d)[ Q(t+e/2, x+d/2) - Q(t+e/2, x-d/2)] (6)
by virtue of conservation. The subscripts i and j have been omitted in (6) for simplicity of notation.
This will be done from now on when reference to particular cells and/or time slices is not
necessary. In these cases it should be understood that (t, x) is a point on the lattice.
The cell transmission model is completed by a formula that gives Q in terms of the sending
and receiving functions evaluated at the upstream and downstream cells,
Q(t+e/2, x+d/2) = min{ S(K(t, x)) , R(K(t, x+d)) }, (7)
and by specifying that
$e \le d / |T_k|_{\max}$, (8)
where $|T_k|_{\max}$ is the maximum of the absolute wave speed for the given T(k) relation. For
maximum accuracy, one should choose
$e = d / |T_k|_{\max}$. (9)
Figure 3. Sending and receiving functions of the cell-transmission model.

Figure 4. Lattice and stencil for the lagged cell-transmission model. Dots
are lattice points; crosses are points where the average flows are evaluated.
Equation (8) ensures that data from outside the two neighboring cells cannot influence the
calculated flow, which is a requirement for convergence.
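The recipe (6)-(8) can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: it uses a triangular T(k) with free-flow speed `v_free`, backward wave speed `w_back` and jam density `k_jam` (names that do not appear in this paper), it leaves the two boundary cells unchanged because no entrance or exit model is specified here, and all function names are hypothetical.

```python
def make_triangular(v_free, w_back, k_jam):
    """Sending and receiving curves for T(k) = min(v_free*k, w_back*(k_jam - k))."""
    k_crit = w_back * k_jam / (v_free + w_back)   # density at which flow is maximal
    q_max = v_free * k_crit

    def sending(k):
        # Non-decreasing curve S(k): follows T(k) up to q_max, then stays flat.
        return min(v_free * k, q_max)

    def receiving(k):
        # Non-increasing curve R(k): flat at q_max, then follows T(k) down.
        return min(q_max, w_back * (k_jam - k))

    return sending, receiving, q_max


def satisfies_cfl(e, d, v_free, w_back):
    """Condition (8): the time step must not exceed d divided by the fastest wave speed."""
    return e <= d / max(v_free, w_back)


def ct_step(K, e, d, sending, receiving):
    """One application of (6) and (7) to the interior cells of the density list K."""
    # flows[j] is the flow crossing the boundary between cell j and cell j+1, Eq. (7).
    flows = [min(sending(K[j]), receiving(K[j + 1])) for j in range(len(K) - 1)]
    new_K = list(K)  # boundary cells are simply copied (assumption of this sketch)
    for j in range(1, len(K) - 1):
        new_K[j] = K[j] - (e / d) * (flows[j] - flows[j - 1])  # conservation, Eq. (6)
    return new_K
```

For example, with `v_free = 1.0`, `w_back = 0.2` and `k_jam = 180.0` (the triangular relation used in Sec. 4), one step with e = d = 1 applied to the densities `[20, 20, 100, 100]` raises the second cell to 24 and leaves the third at 100, since the flow entering the queue is limited by the receiving value of the congested cell.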
Recall now that in Godunov's approach, Eq.(7) would be the flow at the discontinuity in
the stable solution of a Riemann problem6 for which the upstream density, $k_u$, is that of the
upstream cell, $k_u = K(t, x)$, and the downstream density, $k_d$, is that of the downstream cell,
$k_d = K(t, x+d)$; see LeVeque, 1992. The reader can verify, e.g., using the ideas in Sec. 2, that Eq. (7)
indeed yields the stable flow at the location of the discontinuity for a Riemann problem with
densities K(t, x) and K(t, x+d), no matter how these two values are chosen, if the T(k) relation
is unimodal. This establishes that the CT model (6-7) is in Godunov's family of finite difference
approximations for unimodal T(k)'s.
3.2 The new rule.
Let $S_{k,\max}$ and $|R_k|_{\max}$ denote the maximum (absolute) wave speeds in the forward and
reverse directions, and $|T_k|_{\max}$ the maximum in any direction. We show below that whenever
$|R_k|_{\max} < S_{k,\max}$,
as one would expect for most traffic streams, it is advantageous to read
the sending and receiving flows from different time slices. That is, one can define a lag, $\ell$ = 0, 1,
2,..., and use:
$Q(t+e/2, x+d/2) = \min\{ S(K(t, x)), R(K(t-\ell e, x+d)) \}$, (10)
instead of (7). This corresponds to the stencil depicted in Fig. 4, which predicts the flow at point
"P" when $\ell = 2$. The special case with $\ell = 0$ reduces to the conventional cell-transmission rule.
6
A Riemann problem is an initial value problem where the initial density is a step function with
one step; i.e., $k(0, x) = k_d$ if x > x°, and $k(0, x) = k_u$ if x < x°.
It will be shown in Sec. 3.3 that rule (10) is most accurate when the velocity of the wave
reaching "P" happens to match the slope of one of the arrows in the figure; i.e., if the prediction
at "P" is evaluated as close as possible to the source of its wave.7 This will be the basis for our
choice for $\ell$. It also turns out, for stability reasons discussed in Secs. 3.3 and 3.5, that the backward
slope of the arrow in our diagram should not be less than the maximum backward wave speed,
$|R_k|_{\max}$. This means that $\ell$ must satisfy:
$e \le d / [\,|R_k|_{\max}(2\ell+1)\,]$. (11)
A choice of $\ell$ where (11) is as close as possible to a pure equality is recommended; i.e., where:
$\ell = \tfrac{1}{2}[\,d/(e|R_k|_{\max}) - 1\,]$. (12)
The accuracy of (8), (10) and (12) is evaluated below. The steps parallel those in Sec. 4 of
Daganzo (1993a).
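The lag recommended by (12) is easy to evaluate. The sketch below uses exact rational arithmetic so that non-integer lags are not blurred by rounding; the function name is hypothetical, and the two sets of inputs are the ones used in the examples of Sec. 4.

```python
from fractions import Fraction


def recommended_lag(e, d, r_max):
    """Lag from Eq. (12): one half of [d / (e * |R_k|max) - 1].

    e      time step
    d      cell length
    r_max  maximum backward wave speed, in units of d per unit of e
    The result need not be an integer.
    """
    return Fraction(1, 2) * (Fraction(d) / (Fraction(e) * Fraction(r_max)) - 1)


# e = 1 min, d = 1 mile, backward wave speed 1/5 mile/min
lag_triangular = recommended_lag(1, 1, Fraction(1, 5))

# e = 2 sec, d = 100 m, backward wave speed 350/13 m/sec
lag_lead_vehicle = recommended_lag(2, 100, Fraction(350, 13))
```

The first case yields an integer lag of 2, so (11) holds as an equality. The second yields 3/7, which is not an integer; Sec. 4 handles that case by interpolating between the densities read with lags 0 and 1.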
3.3 Error estimation.
Consider a region of the time-space plane where the waves move back (congested traffic).
Then, (6) and (10) may be rewritten as:
$K(t+e, x) = K(t, x) - (e/d)[\,T(K(t-\ell e, x+d)) - T(K(t-\ell e, x))\,]$ (13)
since in this region T(k) = R(k).
7
Insofar as cause and effect relations propagate as waves in the LWR model, it should not be
surprising to see that accuracy is greatest when we take our data from the lattice point closest to
the wave.
In the exact theory, the solution at time t+e is related to the solution at time t by:
$k(t+e, x) = k(t, x - T_k e)$,
where $T_k$ is evaluated for the density prevailing at $(t, x - T_k e)$. Thus, in a region where K > k°, it is
convenient to rewrite (13) in the following manner:
$K(t+e, x) = K(t, x-T_ke) + [K(t, x) - K(t, x-T_ke)] - (e/d)[\,T(K(t-\ell e, x+d)) - T(K(t-\ell e, x))\,]$. (14)
This is useful because a second order power series expansion of the second and third terms in this
expression about point $(t, x - T_k e)$ yields an estimate of the error in (13) in terms of known
quantities.
The expansion of the second term is:
$[K(t, x) - K(t, x-T_ke)] \approx (T_ke)K_x + \tfrac{1}{2}(T_ke)^2K_{xx}$,
where a double subscript denotes a second derivative. Likewise, the expansion of the third term
can be reduced to:
$-(e/d)[\,T(K(t-\ell e, x+d)) - T(K(t-\ell e, x))\,] =$
$-(e/d)[\,(dT_kK_x) + \tfrac{1}{2}(d^2 + 2dT_ke)(T_{kk}K_x^2 + T_kK_{xx}) - (d\ell e)(T_{kk}K_xK_t + T_kK_{xt})\,]$,
and the sum of these two expressions can be further simplified. The combined result is:
$-\tfrac{1}{2}d^2K_{xx}[(T_ke/d)^2 + (T_ke/d)] - \tfrac{1}{2}(T_{kk}K_x^2)(de)[1 + 2(T_ke/d)] + (\ell e^2)(T_{kk}K_xK_t + T_kK_{xt})$,
which allows us to rewrite (14) as follows:
$K(t+e, x) = K(t, x-T_ke) - \tfrac{1}{2}d^2K_{xx}[(T_ke/d)^2 + (T_ke/d)]$
$\qquad - \tfrac{1}{2}(T_{kk}K_x^2)(de)[1 + 2(T_ke/d)] + (\ell e^2)(T_{kk}K_xK_t + T_kK_{xt})$. (15)
The first three terms of this expression coincide with (9) in Daganzo (1993a). The last term is the
contribution to the error caused by the lag. This term becomes more meaningful if the time-derivatives
$K_t$ and $K_{xt}$ are eliminated from the solution. This can be done if one notes from (3) that
$k_t = -T_kk_x$, and that the x-derivative of this expression (keeping time constant) is:
$k_{tx} = -T_{kk}k_x^2 - T_kk_{xx}$. If we use these relationships in the last term of (15) and use p as an abbreviation for
$|T_ke/d|$, then we obtain the following expression for the error committed in time e (when the
system is congested):
$\Theta = d^2K_{xx}\,p^2[\tfrac{1}{2}(1/p - 1) - \ell] + T_{kk}K_x^2\,de\,[-\tfrac{1}{2} + p(1+2\ell)]$. (16)
This is the generalization of (10) in Daganzo (1993a).
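The substitution that leads to (16) can be spelled out as follows. This is a sketch reconstructed from (3) and from the expansion of the third term of (14); in that expansion the lag contribution, $-(d\ell e)(\cdot)$, is multiplied by the leading factor $-(e/d)$ and therefore enters the sum with a positive sign. In congested traffic $T_k < 0$, so that $T_ke/d = -p$.

```latex
\begin{align*}
T_{kk}K_xK_t + T_kK_{xt}
  &= -2\,T_kT_{kk}K_x^2 - T_k^2K_{xx}, \\
\ell e^2\,(T_{kk}K_xK_t + T_kK_{xt})
  &= -\ell\,d^2p^2K_{xx} + 2\ell p\,(de)\,T_{kk}K_x^2, \\
-\tfrac{1}{2}\,d^2K_{xx}\,[\,p^2 - p\,]
  &= \tfrac{1}{2}\,d^2K_{xx}\,p^2\,(1/p - 1), \\
-\tfrac{1}{2}\,T_{kk}K_x^2\,(de)\,[\,1 - 2p\,]
  &= T_{kk}K_x^2\,(de)\,[\,-\tfrac{1}{2} + p\,].
\end{align*}
```

Adding the last three lines and collecting the terms in $K_{xx}$ and in $T_{kk}K_x^2$ gives the two bracketed coefficients of (16).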

When the system is uncongested, with T(k) = S(k), the LCT recipe does not use a lag.
Therefore, the original derivation applies. The result is again (16), but with $\ell = 0$.
Note from (15) that the last three terms of that expression, i.e., what we have called $\Theta$,
represent the change in K(t, x) along the characteristic in time e. Therefore, the ratio $\Theta/e$ is the
directional time-derivative of K(t, x) when $e \to 0$. Thus, in the case of a constant wave speed (i.e.,
linear T(k)) where the second term of (16) disappears because $T_{kk} = 0$, the value of K as seen by
an observer moving with the wave satisfies the diffusion equation if e is small. The solution of the
diffusion equation is stable only if the diffusion coefficient is non-negative; i.e., if the bracketed
quantity in the first term of (16) is non-negative. Thus, one should choose an $\ell$ that satisfies this
condition for the largest possible p. This is the rationale for condition (11). Section 3.5, below,
looks at the stability issue in a different way.
We have just seen that the second term of (16) vanishes if $T_{kk} = 0$. Note as well that if
(12) is used to choose $\ell$, then the first term will also vanish whenever the prevailing (backward)
wave speed is at its maximum value. Since (16) also holds with $\ell = 0$ when the highway is
uncongested, the first term also vanishes where the wave velocity is at its maximum value in the
forward direction (i.e., when p = 1). Since the term also vanishes if p = 0, we see that the LCT
model must be second order accurate when the T(k) relation is trapezoidal or triangular, provided
that the lag is chosen with (12).
We also see from (16) that the (first order) errors arising from the first term of (16) should
be proportional to the difference between the lag one would like to use for the given wave speed,
which is $\tfrac{1}{2}(1/p - 1)$, and the actual lag. This difference is minimized with rule (12), and this is
particularly important when T(k) is piecewise linear.
3.4 Variable meshes
The LCT method can be applied to highways and networks that have been discretized with
cells of different lengths. In this more general case the LCT lag should be cell-dependent so as to
ensure that the density is always read from the earliest possible time without violating the stability
condition. That is, the lag for cell j, $\ell_j$, should satisfy as tightly as possible the inequality:
$e \le d_j / [\,|R_k|_{\max}(2\ell_j+1)\,]$, (17)
where $d_j$ is the cell length, instead of inequality (11).

To preserve the good properties of the LCT method, one should also introduce a forward
lag, $f_j$, which should satisfy:
$e \le d_j / [\,S_{k,\max}(2f_j+1)\,]$, (18)
instead of (8). The forward lag can forestall the deterioration of accuracy where large cells are
being used, especially if $f_j$ is chosen so that (18) is as close to an equality as possible. The logic
for this choice is the same as that in Sec. 3.3, and the outcome is also similar; e.g., in that the
resulting method is still second order accurate in the case of a triangular T(k) relation.
The LCT rule with a variable discretization is then:
$K(t+e, x_j) = K(t, x_j) - (e/d_j)[\,Q(t+e/2, x_j+d_j/2) - Q(t+e/2, x_j-d_j/2)\,]$ (19)
with
$Q(t+e/2, x_j+d_j/2) = \min\{ S(K(t-f_je, x_j)), R(K(t-\ell_{j+1}e, x_{j+1})) \}$. (20)
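For a variable mesh, the integer lags that satisfy (17) and (18) most tightly are the largest integers that do not violate them. The sketch below computes these per cell; the function name and the choice to return (forward lag, backward lag) pairs are assumptions of this illustration, and exact rational arithmetic is used so that cases of exact equality are not lost to rounding.

```python
import math
from fractions import Fraction


def integer_lags(e, cell_lengths, s_max, r_max):
    """Largest integer lags (f_j, l_j) allowed by (18) and (17) for each cell.

    e             common time step
    cell_lengths  cell lengths d_j
    s_max, r_max  maximum forward and backward wave speeds (absolute values)
    """
    e, s_max, r_max = Fraction(e), Fraction(s_max), Fraction(r_max)
    lags = []
    for d_j in cell_lengths:
        d_j = Fraction(d_j)
        # (18): e <= d_j / [s_max (2 f_j + 1)]  <=>  f_j <= (d_j / (e s_max) - 1) / 2
        f_j = math.floor((d_j / (e * s_max) - 1) / 2)
        # (17): e <= d_j / [r_max (2 l_j + 1)]  <=>  l_j <= (d_j / (e r_max) - 1) / 2
        l_j = math.floor((d_j / (e * r_max) - 1) / 2)
        if f_j < 0 or l_j < 0:
            raise ValueError("cell too short for this time step, even with zero lag")
        lags.append((f_j, l_j))
    return lags


# Cells of 1, 2 and 3 miles, e = 1 min, wave speeds of 1 and 1/5 mile/min
example_lags = integer_lags(1, [1, 2, 3], 1, Fraction(1, 5))
```

With these inputs the result is `[(0, 2), (0, 4), (1, 7)]`: only the longest cell admits a forward lag, while the backward lag grows with the cell length.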
3.5. Stability and relationship to Godunov's method
Godunov's procedure identifies the stable solution because, as we have seen, its flows are
always derived from the stable solution of a Riemann problem. Likewise, it will be shown here
that the flows of the LCT method (20) are based on the stable solution of a modified Riemann
problem, and therefore that the LCT solution should also approximate the stable solution.
Let us consider a modified Riemann problem (MRP) in which constant upstream and
downstream densities have been defined on a V-shaped boundary, such as the dark line in Fig. 5.
For an MRP to be well-posed, waves from both legs of the "V" should point into the
solution space. This will happen if the (t, x)-slopes, $s_u$ and $s_d$, of these legs satisfy $s_u \ge S_{k,\max}$ and
$s_d \le -|R_k|_{\max}$; i.e., if the V-shaped boundary is inside the shaded wedges shown in the figure.
Consideration reveals that the stable solution of the MRP for t > 0 is independent of the
slopes of the "V" and that, as a result, said solution is also the stable solution of the conventional
Riemann problem with the same initial densities. In other words, the stable flows arising at the
discontinuity are the same for the conventional and modified Riemann problems.
Since (20) has the same form as (7), i.e., it is the minimum of a sending and a receiving
value, it expresses the stable flow of a Riemann problem with $k_u = K(t-f_je, x_j)$ and
$k_d = K(t-\ell_{j+1}e, x_{j+1})$. As we have just shown, this is also the solution of a well-posed MRP with the
same initial densities. In particular, it is the stable flow for the MRP in which the "V" passes
through the lattice points at which ku and kd have been evaluated (points "A" and "B" in Fig. 5).
This MRP will be well-posed if the "V" passing through the lattice points is inside the shaded
wedges; i.e., if (17) and (18) are satisfied.
Figure 5. The modified Riemann problem at the core of the LCT.

Thus, when (17) and (18) are satisfied the LCT method is nothing but a recursive solution
of well-posed MRP's, just like the Godunov/CT method involves the recursive solution of ordinary
Riemann problems. Insofar as the stable solution is always used in both procedures, we can
conclude that the LCT method should share the good stability properties of the conventional CT
method. The advantage of the former is that it is based on "older" data, which should be less
corrupted by numerical errors.8
The following section presents some examples that illustrate both the accuracy and stability
properties of the LCT method.
4. EXAMPLES
Let us start by examining the accuracy of the LCT model with both smooth and
discontinuous initial conditions.
8
The accuracy formulae of Sec. 3.3 confirm that the LCT method is most accurate when $f_j$ and
$\ell_{j+1}$ are chosen to be as large as possible, i.e., when the "V" forms as acute an angle as possible.
Tables 1 and 2 present 17 iterations of the conventional and lagged cell transmission models
for a triangular flow-density relation of the form:
$q = \min\{k, (180-k)/5\}$, (21)
where k is in veh/mile and q in veh/min. It is assumed that the initial density profile is quadratic
and in the congested range $k \in [30, 180]$; more specifically, that $k(0,x) = 50 + \tfrac{1}{2}x^2$ and
$x \approx 10 \pm 5$. Because the (congested) wave velocity is constant (i.e., $R_k = -0.2$) the exact solution of the
problem in this range of x for small t is: $k(t, x) = 50 + \tfrac{1}{2}(x + t/5)^2$. That is, in the exact solution the
density at mile x-1 must equal the density that prevailed at mile x five minutes earlier.
Table 1 presents the results of the ordinary CT model when one uses e = 1 and d = 1. In
the exact solution the numbers in boldface would be equal, so that the observed discrepancy is the
CT error. The discrepancies observed in the table are consistent with what would be expected from
(16) with $\ell = 0$, and from the more detailed analysis in Daganzo (1993a).
If one uses the LCT model with e = 1, d = 1 and $\ell = 2$, i.e., the value recommended by (12),
then Table 2 is obtained. In this case the LCT model reproduces the exact results, as one would
expect from (16) and the analysis leading to it.9
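The first computed row of Tables 1 and 2 can be checked with a short script. The sketch below is an illustration only: the function names are hypothetical, the sending and receiving curves are the ones implied by (21), and a single step of rule (10) is applied to three slices of exact initial data (t = 0, 1, 2) on four neighboring cells, returning only the two interior cells.

```python
def sending(k):
    """S(k) for relation (21): rises with slope 1 up to the maximum flow of 30."""
    return min(k, 30.0)


def receiving(k):
    """R(k) for relation (21): the maximum flow of 30, then falling with slope -1/5."""
    return min(30.0, (180.0 - k) / 5.0)


def exact_density(t, x):
    """Exact solution for the quadratic initial profile of Sec. 4."""
    return 50.0 + 0.5 * (x + t / 5.0) ** 2


def lct_step(history, lag, e=1.0, d=1.0):
    """Apply (6) with flows from (10) to the interior cells.

    history  list of density slices, oldest first, most recent last
    lag      number of time slices by which the receiving data are lagged
    """
    current, lagged = history[-1], history[-1 - lag]
    flows = [min(sending(current[j]), receiving(lagged[j + 1]))
             for j in range(len(current) - 1)]
    return [current[j] - (e / d) * (flows[j] - flows[j - 1])
            for j in range(1, len(current) - 1)]


positions = [5.0, 6.0, 7.0, 8.0]
history = [[exact_density(t, x) for x in positions] for t in (0, 1, 2)]

ct_next = lct_step(history, lag=0)    # conventional CT rule, densities at x = 6, 7
lct_next = lct_step(history, lag=2)   # lagged rule with the lag given by (12)
```

At t = 3 and x = 6 the CT value is 71.86 and the LCT value is 71.78, which agree with row 12 of Tables 1 and 2 respectively; the LCT value coincides with `exact_density(3, 6.0)`.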
Of course, the performance of the LCT procedure deteriorates in less favorable cases, e.g.,
with non-quadratic density profiles, non-linear T(k) functions, and non-ideal lags.
This is illustrated by Fig. 6, which contains the initial and final density profiles (at t = 25
min) for a Riemann initial value problem where (21) holds and where the initial density changes
suddenly from k = 50 to k = 100 at x = 20 miles. The thin lines are the CT and LCT results
obtained with e = 1 min and d = 1 mile. Note the lesser spread of the LCT result, and that the LCT
model does not transition monotonically from one density value to the other.10 The relative
accuracy of the two models can be evaluated better with the cumulative curves of vehicle count;
see Fig. 7. Whereas in the CT model the maximum error in vehicle position is on the order of 2/3
mile (which is not surprising since d = 1 mile), with the LCT the largest error is less than 1/5 mile.
In order to illustrate the stability of the method for non-concave T(k) relations, Figs. 8 and
9 depict the LCT solution of the lead vehicle problem in Fig. 2c using cells with d = 100 m and
time steps e = 2 secs. This is efficient because with this arrangement (8) is satisfied as an equality;
note from Fig. 2a that in our case $|T_k|_{\max} = v_f = 50$ m/s. Since $|R_k|_{\max} = 350/13$ m/s, the
9
Note that, for this to be the case, $\ell + 1$ time slices of internally consistent initial data had to be
specified.
10
This undesirable feature of the LCT model is typical of second order approximations (see, e.g.,
LeVeque, 1992). As a result, the LCT model may produce densities slightly greater than the
theoretical maximum from time to time. Therefore, the receiving function R(k) should be defined
for $0 \le k < \infty$ in an implementation.
Table 1. Estimated densities at different positions with the cell-transmission model. Row 3 is the
position (x) in Km. Rows 5 and 6 are the maximum flow and the jam density at the given position.
Rows 9 to 11 are the initial data, $k = 50 + \tfrac{1}{2}(x + t/5)^2$, where t = 0 for row 9, t = 1 for row 10 and
t = 2 for row 11.
3 6 7 8 9 10 11
4
5 30 30 30 30 30 30
6 180 180 180 180 180 180
7
8
9 68 74.5 82 90.5 100 110.5
10 69.22 75.92 83.62 92.32 102.02 112.72
11 70.48 77.38 85.28 94.18 104.08 114.98
12 71.86 78.96 87.06 96.16 106.26 117.36
13 73.28 80.58 88.88 98.18 108.48 119.78
14 74.74 82.24 90.74 100.24 110.74 122.24
15 76.24 83.94 92.64 102.34 113.04 124.74
16 77.78 85.68 94.58 104.48 115.38 127.279
17 79.36 87.46 96.56 106.66 117.76 129.852
18 80.98 89.28 98.58 108.88 120.178 132.453
19 82.64 91.14 100.64 111.14 122.633 135.071
20 84.34 93.04 102.74 113.438 125.121 137.694
21 86.08 94.98 104.88 115.775 127.635 140.308
22 87.86 96.9599 107.059 118.147 130.17 142.897
23 89.68 98.9796 109.276 120.551 132.715 145.445
24 91.5399 101.039 111.531 122.984 135.261 147.936
25 93.4397 103.137 113.822 125.44 137.796 150.355
26 95.3793 105.274 116.145 127.911 140.308 152.691
27 97.3583 107.449 118.499 130.39 142.785 154.931
28 99.3763 109.659 120.877 132.869 145.214 157.066
29 101.433 111.902 123.275 135.338 147.584 159.09
Table 2. Estimated densities at different positions with the lagged cell-transmission model. Row
3 is the position (x) in Km. Rows 5 and 6 are the maximum flow and the jam density at the given
position. Rows 9 to 11 are the initial data, $k = 50 + \tfrac{1}{2}(x + t/5)^2$, where t = 0 for row 9, t = 1 for
row 10 and t = 2 for row 11.
3 6 7 8 9 10 11
4
5 30 30 30 30 30 30
6 180 180 180 180 180 180
7
8
9 68 74.5 82 90.5 100 110.5
10 69.22 75.92 83.62 92.32 102.02 112.72
11 70.48 77.38 85.28 94.18 104.08 114.98
12 71.78 78.88 86.98 96.08 106.18 117.28
13 73.12 80.42 88.72 98.02 108.32 119.62
14 74.5 82 90.5 100 110.5 122
15 75.92 83.62 92.32 102.02 112.72 124.42
16 77.38 85.28 94.18 104.08 114.98 126.88
17 78.88 86.98 96.08 106.18 117.28 129.38
18 80.42 88.72 98.02 108.32 119.62 131.92
19 82 90.5 100 110.5 122 134.5
20 83.62 92.32 102.02 112.72 124.42 137.12
21 85.28 94.18 104.08 114.98 126.88 139.78
22 86.98 96.08 106.18 117.28 129.38 142.48
23 88.72 98.02 108.32 119.62 131.92 145.22
24 90.5 100 110.5 122 134.5 148
25 92.32 102.02 112.72 124.42 137.12 150.82
26 94.18 104.08 114.98 126.88 139.78 153.677
27 96.08 106.18 117.28 129.38 142.48 156.565
28 98.02 108.32 119.62 131.92 145.22 159.47
29 100 110.5 122 134.5 147.999 162.369
[Figure 6 plot: density profiles against Distance, x.]
Figure 6. Evolution of a discontinuous traffic disturbance as predicted by the CT and LCT
models for the case of a triangular T(k) curve with p = 0.2: density profile comparison.
optimum lag according to (12) should be $\ell = 3/7$. Because this is not an integer, R(k) was
evaluated for a k that was an interpolation of the densities obtained with $\ell = 0$ and $\ell = 1$. The
numerical results should then still be second order accurate for densities less than or equal to that
of state "E" in Fig. 2a, and less accurate for more congested states.
The density profiles of Fig. 8 confirm this. Note how the curves have sharper steps below
the line K = 0.5 than near the top. (In the exact solution these curves would be perfect step
functions.) Note as well the good agreement between this figure and 2c; in particular, the
development of intermediate state "E", with K = 0.4, after t = 120 secs.
Figure 7. Evolution of a discontinuous traffic disturbance as predicted by the CT and LCT
models for the case of a triangular T(k) curve with p = 0.2: cumulative vehicle count
comparison.
Figure 9 displays the N-curves (of vehicle number) at five locations in ½ Km increments
upstream of the stoppage. They also agree qualitatively with the exact solution of the problem,
which in this case is piecewise linear. Clearly, the curved arcs in the figure, which correspond to
episodes of very congested traffic, have some numerical error. However, the straight portions of
the curves match the exact solution precisely, and therefore it is easy to ascertain from the figure
the magnitude of the numerical errors in the curved sections. This error does not exceed 5 vehicles
in any of the curves.
5. CONCLUSION
The LCT procedure is also well suited for modeling intersections and inhomogeneous
highways, since the only change needed in the existing procedures is reading the traffic density for
the downstream conditions with a time lag. This modification is so minor that it can also be
applied to highways with special lanes and their junctions; e.g. by modifying the procedure
described in Daganzo et al. (1995).
It should also be said that lags impose additional memory storage requirements on an LCT
simulation, since cell densities must be stored for the past 1+fj time slices. For multi-destination
networks, however, most of the storage is consumed by the cell content proportions (by destination
and entry time), which are only used as arguments as part of the "sending" functions. Thus, in an
LCT implementation the bulk of the information would only have to be kept for 1+fj time slices.
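The storage requirement described above amounts to keeping a short rolling window of density slices. The sketch below shows one way this could be organized; the class and method names are hypothetical and are not part of any existing LCT implementation.

```python
from collections import deque


class DensityHistory:
    """Rolling window holding the most recent (max_lag + 1) density slices."""

    def __init__(self, initial_slices, max_lag):
        if len(initial_slices) != max_lag + 1:
            raise ValueError("max_lag + 1 initial time slices are required")
        # A bounded deque discards the oldest slice automatically on append.
        self._slices = deque((list(s) for s in initial_slices), maxlen=max_lag + 1)

    def push(self, new_slice):
        """Store the newest slice; the oldest one falls out of the window."""
        self._slices.append(list(new_slice))

    def lagged(self, lag):
        """Densities as of `lag` time steps ago (lag = 0 is the current slice)."""
        return self._slices[-1 - lag]

    def __len__(self):
        return len(self._slices)
```

For instance, a history created with three slices and `max_lag = 2` returns the oldest slice for `lagged(2)`; after one `push`, the same call returns what was previously the middle slice, and the window still holds exactly three slices.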
[Figure 8 plot: density (vertical axis, 0 to 0.1) against Distance, x (100 m), with one profile each for t = 0, 40, 80, 120 and 160 s.]
Figure 8. Density profiles for the lead vehicle problem of Fig. 2.

Figure 9. N-curves for the lead vehicle problem of Fig. 2. (Axes: vehicle number versus time (secs); curves shown for x = 0, -0.5, -1, -1.5 and -2.0 km.)
REFERENCES
Bui, D.D., P. Nelson and S. L. Narasimhan (1992) Computational realizations of the entropy
condition in modeling congested traffic flow. FHWA Report TX-92/1232-7.
Cassidy, M. (1998), Bivariate relations in highway traffic. Trans. Res. 32B(1), 49-59.
Daganzo, C.F. (1993) The cell-transmission model: A dynamic representation of highway traffic
consistent with the hydrodynamic theory. Institute of Transportation Studies, Research
Report, UCB-ITS-RR-93-7, Univ. of California, Berkeley, CA. Short version in Trans.
Res., 28B(4), 269-287.
Daganzo, C.F. (1993a) A finite difference approximation for the kinematic wave model. Institute
of Transportation Studies, Research Report, UCB-ITS-RR-93-11, Univ. of California,
Berkeley, CA. Short version in Trans. Res. 29B(4) 261-276.
Daganzo, C.F. (1994) The cell-transmission model. Part II: Network traffic. PATH working paper
UCB-ITS-PWP-94-12, Univ. of California, Berkeley, CA. Short version in Trans. Res.,
29B(2), 79-94.
Daganzo, C.F. (1995) A continuum theory of traffic dynamics for freeways with special lanes.
PATH Technical Note UCB-ITS-PTN-95-08, Univ. of California, Berkeley, CA. Short
version in Trans Res. 31B (2), 83-102.
Daganzo, C.F. (1997) An enhanced cell-transmission rule for traffic simulation. Institute of
Transportation Studies Research UCB-ITS-RR-97-06, Univ. of California, Berkeley, CA.
Daganzo, C.F., Lin, W.H. and del Castillo, J.M. (1995) A simple physical principle for the
simulation of freeways with special lanes and priority vehicles. PATH Technical Note
UCB-ITS-PTN-95-09, Univ. of California, Berkeley, CA. Short version in Trans. Res.
31B (2), 105-125.
Godunov, S.K. (1961) Bounds on the discrepancy of approximate solutions constructed for the
equations of gas dynamics. J. Comp. Math. and Math. Phys. 1, 623-637.
Herman, R., E.W. Montroll, R.B. Potts, and R.W. Rothery (1959) Traffic dynamics: Analysis of
stability in car following. Opns. Res. 7, 86-106.
Lax, P.D. (1973) Hyperbolic systems of conservation laws and the mathematical theory of
shock waves. Regional Conf. Series in Applied Mathematics. SIAM, Philadelphia, PA.
Lebacque, J.P. (1993) Les modèles macroscopiques de trafic. Annales des Ponts, (3rd trim) 67, 28-45.
LeVeque, R.J. (1992) Numerical methods for conservation laws, (2nd edition), Birkhäuser-Verlag,
Boston, MA.
Lighthill, M.J. and G.B. Whitham (1955) On kinematic waves. I Flood movement in long rivers.
II A theory of traffic flow on long crowded roads. Proc. Roy. Soc. A, 229, 281-345.
Luke, J.C. (1972) Mathematical models for landform evolution. J.Geophys. Res., 77, 2460-2464.
Newell, G.F. (1961) Non-linear effects in the dynamics of car-following. Opns. Res. 9(2), pp.
209-229.
Newell, G.F. (1993) A simplified theory of kinematic waves in highway traffic, I general theory,
II queuing at freeway bottlenecks, III multi-destination flows. Trans. Res., 27B, 281-313.
Richards P.I. (1956) Shock waves on the highway. Opns. Res., 4, 42-51.
Windover, J. and Cassidy, M.J. (1997) Private communication.

CHAPTER 2
TRAFFIC FLOW BEHAVIOUR
• Common sense is the least common of all senses.

• Doubt is not a very pleasant status but certainty is a ridiculous one.
(Voltaire)
• Every extension of knowledge arises from making conscious the
unconscious. (Friedrich Nietzsche)
OBSERVATIONS AT A FREEWAY BOTTLENECK
Michael J. Cassidy and Robert L. Bertini

Department of Civil and Environmental Engineering and Institute of Transportation Studies,
University of California at Berkeley
ABSTRACT
Traffic was studied upstream and downstream of a bottleneck on a freeway in Toronto, Canada
using transformed curves of cumulative vehicle count and cumulative occupancy. The
bottleneck was located more than a kilometer downstream of a busy on-ramp. After
diagnosing its location and the times that it remained active each day, the study focused on the
traffic patterns that arose in each travel lane. It was observed that prior to the bottleneck's
activation, the vehicles' lane-changing trends created extraordinarily high flows in the median
(i.e., left-most) lane and that these high flows were sustained for extended durations. When a
queue eventually formed at the bottleneck, its discharge rates were considerably lower than
those flows measured prior to queueing. Within each lane, the queue discharge rates remained
nearly constant over the rush and the average rates varied only slightly across days. Finally,
vehicles arriving to the bottleneck from the nearby upstream on-ramp entered the freeway at
high rates, even after the bottleneck's queue propagated beyond this ramp.
1. INTRODUCTION
In an earlier study, the authors examined traffic at two bottlenecks on freeways in and near
Toronto, Canada (Cassidy and Bertini, 1999). From this study, certain reproducible patterns
were observed. For example, each bottleneck formed more than a kilometer downstream of an
on-ramp and their formations occurred at these same locations each day. While the bottlenecks
were active,1 the vehicles discharged through them at nearly constant rates, although some time
dependencies were observed for short periods following the onset of queueing. Furthermore, a
bottleneck's average queue discharge rate did not vary much from one day to the next and
these average rates were typically 8 to 10 percent lower than the flows measured prior to
queueing upstream.
These earlier findings came by visually comparing sets of transformed cumulative curves.
Each curve was constructed from either the counts or the occupancies collected at one of
several neighboring loop detector stations. Of note, these curves described measurements that
were taken across all travel lanes, meaning that the above findings came by grouping together
the traffic streams in multiple lanes and studying them in the aggregate.
In this paper, we add to the previous findings on bottleneck flow by reporting on some
observations taken from individual lanes. At each of three neighboring detector stations
located upstream and downstream of a bottleneck, curves of cumulative vehicle count and
cumulative occupancy were separately constructed for each travel lane. Visual comparisons of
these curves revealed certain details of traffic evolution, some of which were unexpected.
It was observed, for example, that large numbers of vehicles gradually moved into the median
lane as they approached and passed through the bottleneck. This lane-changing pattern even
continued at locations more than 2 kilometers downstream of the neighboring on-ramp. As a
consequence, flows measured in the median lane were remarkably high; e.g., at a location well
downstream of the on-ramp, the median lane flows sometimes exceeded 2,600 vehicles per
hour. These high rates persisted each day for durations of up to 40 minutes before queues
formed upstream and lower discharge rates ensued.
The bottleneck's queue formed at nearly, but not exactly, the same times in each lane; i.e., this
formation occurred in the shoulder lane several minutes after it had occurred in the adjacent
lanes. Following the queue formations, the flow reductions observed downstream were most
pronounced in the median and center lanes and less so in the shoulder lane. The discharge
rates remained nearly constant so long as the bottleneck was active and free of any incidents
nearby. Although these average rates varied across lanes, each lane's average was reproduced
from day to day.
It was further observed that vehicles entered the freeway from the upstream on-ramp at very
high rates, even after the bottleneck's queue propagated beyond the ramp and obstructed this
flow. Thus, the on-ramp vehicles did not share the available capacity with vehicles in the
adjacent freeway lane on a strictly alternating or "one-to-one" basis. Rather, motorists from the
on-ramp forced themselves into the queue in such a way that the gaps between the freeway
vehicles were filled by multiple on-ramp vehicles.
1 The term "active bottleneck" denotes that the discharge rates measured downstream of a queue were not
affected by traffic conditions from further downstream (Daganzo, 1997).
The following section provides descriptions of the bottleneck and of the loop detector data used
in this study. Section 3 describes the use of cumulative curves to identify the bottleneck's
location and the times that it remained active; uncovering these details was a requisite step for
studying the evolution of the bottleneck flows. Section 4 presents the bottleneck's traffic
patterns that were observed in each lane during a single rush. Some findings from repeating
these analyses with data from other days, and comments regarding future research directions,
are provided in the fifth and final section.
2. THE DATA
The segment of the Gardiner Expressway shown in Figure 1 was the site used in our study. As
an aside, this site has been used in several earlier studies of freeway capacity (Persaud, 1986;
Persaud and Hurdle, 1991; Persaud, et al., 1998), including the previous one by the authors
(Cassidy and Bertini, 1999). It is located in Toronto, Canada and meters are not deployed on
its on-ramps (although the Jameson Avenue on-ramp is closed during a portion of each
afternoon rush).
Figure 1: Gardiner Expressway, Toronto, Canada (legend: loop detector; detector station)
The loop detector stations for measuring traffic data are labeled in Figure 1 as per the
numbering strategy adopted by the City of Toronto. These detectors record counts,
occupancies and (time) mean speeds in each lane over 20-second intervals. In total,
measurements were made during three weekday afternoons when the local weather bureau
reported clear skies and no measurable precipitation.
In the next section, data from one of the observation days are used to demonstrate that a
bottleneck was activated between detector stations 60 and 70. This bottleneck remained active
for more than two hours before a queue spilled over from further downstream and restricted its
discharge. We will also show that, some minutes prior to its deactivation, the bottleneck's flow
was impeded by an incident that occurred near detector 50.
3. THE ACTIVE BOTTLENECK

Figure 2 presents transformed curves of cumulative count, N, versus time, t, for detectors 40
through 80. These were constructed using the vehicle counts taken in all lanes during a 30-minute
period that spanned the onset of queueing.2 Untransformed N-curves give the
cumulative number of vehicles to have passed (detector) location x by t. By constructing the
curves as linear interpolations through the cumulative counts measured every 20 seconds, each
curve's slopes would be the flows past x during the 20-second measurement intervals.
Moreover, since the counts for each curve in Figure 2 started (N = 0) with the passage of a
reference vehicle, the horizontal and vertical separations between the curves would have been
the trip times and the vehicle accumulations between detectors, respectively (Newell, 1982;
Newell, 1993).
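The construction of an untransformed N-curve can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function names and the sample counts are hypothetical, and only the 20-second reporting interval is taken from the text. Each linear segment's slope is the flow in that interval, and the vertical separation between two curves started with the same reference vehicle is the vehicle accumulation between the detectors.

```python
from itertools import accumulate

INTERVAL_S = 20  # detector reporting interval stated in the text


def cumulative_curve(counts):
    """Cumulative count N at the end of each interval, starting from N = 0."""
    return [0] + list(accumulate(counts))


def interval_flows_vph(counts, interval_s=INTERVAL_S):
    """Slope of each linear segment of the N-curve, in vehicles per hour."""
    return [c * 3600 / interval_s for c in counts]


def accumulation(n_upstream, n_downstream):
    """Vertical separation between two N-curves at the same instants."""
    return [u - d for u, d in zip(n_upstream, n_downstream)]


# Illustrative counts only (vehicles per 20 s); not data from the study.
upstream_counts = [10, 12, 11, 9]
downstream_counts = [0, 10, 12, 11]

n_up = cumulative_curve(upstream_counts)       # [0, 10, 22, 33, 42]
n_down = cumulative_curve(downstream_counts)   # [0, 0, 10, 22, 33]
flows = interval_flows_vph(upstream_counts)    # [1800.0, 2160.0, 1980.0, 1620.0]
stored = accumulation(n_up, n_down)            # [0, 10, 12, 11, 9]
```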
In Figure 2, however, each curve, along with its corresponding time axis, was shifted to the
right by the average free-flow trip time between the respective detector and downstream
detector 80. Consequently, the vertical separations in the curves are the excess vehicle
accumulations between detectors due to vehicular delays. These shifts are advantageous
because two superimposed curves indicate that traffic in the intervening segment was freely
flowing; every feature of an upstream N-curve is passed to its downstream neighbor a free-flow
trip time later. In addition, Figure 2 shows only the differences between each curve of
cumulative count and the line N = q0·t', where q0 is the rate used to re-scale the curves and t' is
the elapsed time from the start of each curve. This is important because reducing the curves'
cumulative count by a background flow q0 magnifies details, such as the time dependencies in
the flows, without changing the excess accumulations (Cassidy and Windover, 1995).
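The two transformations described above (the time shift and the subtraction of the line q0·t') can be sketched in a few lines. The function names and sample values below are hypothetical and serve only to illustrate the idea; the background rate of 1,800 vph is an arbitrary choice. In the example the downstream detector sees the upstream pattern exactly one free-flow trip time (assumed here to be 20 s) later, so the two transformed curves superimpose and the excess accumulation is zero throughout.

```python
def shift_time_axis(times_s, free_flow_trip_time_s):
    """Shift a curve's time axis to the right by the free-flow trip time to the reference detector."""
    return [t + free_flow_trip_time_s for t in times_s]


def rescale(times_s, cumulative, q0_vph):
    """Return N - q0 * t', where t' is the elapsed time from the start of the curve."""
    t0 = times_s[0]
    return [n - q0_vph * (t - t0) / 3600 for t, n in zip(times_s, cumulative)]


# Illustrative values only: free flow between the two detectors.
up_times = shift_time_axis([0, 20, 40, 60], free_flow_trip_time_s=20)
up_rescaled = rescale(up_times, [0, 10, 22, 33], q0_vph=1800)

down_times = [20, 40, 60, 80]
down_rescaled = rescale(down_times, [0, 10, 22, 33], q0_vph=1800)

# Vertical separation of the transformed curves = excess accumulation.
excess = [u - d for u, d in zip(up_rescaled, down_rescaled)]
```

Because the same line q0·t' is subtracted from both curves, their vertical separation is unaffected by the re-scaling; only the visual resolution of the plot changes.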
2 The on-ramp counts at Spadina Avenue were also used to construct the curve for detector 40 so that all of the
curves in Figure 2 describe the same collection of vehicles. Conversely, a curve for detector 30 was not
included in Figure 2, since the vehicles measured at this location were not identical to those measured at the
detector stations further downstream.
The (nearly) superimposed curve portions in Figure 2 indicate that traffic was initially in free
flow and remained in free flow between detectors 70 and 80. The reader may use a
straightedge to confirm that curves 40, 50 and 60 exhibit increased slopes sometime shortly
after 15:40, as delimited by the large arrow in Figure 2, and this caused these three curves to
Figure 2: Transformed N-curves, Detectors 40 through 80.
diverge from their (two) downstream counterparts. These curve features indicate that a
bottleneck was activated between detectors 60 and 70 when increased flows arrived from
upstream. The subsequent separation between curves 60 and 50 (at about 15:51:23) reveals
when the queue arrived at detector 60. Likewise, the queue's arrival at detector 50 is made
evident by the divergence in curves 50 and 40 (at about 15:53:03).
In short, the transformed N-curves in Figure 2 conclusively diagnose the bottleneck's location
by showing that excess vehicle accumulations occurred upstream of detector 70 while free-flow
conditions prevailed immediately downstream. Especially notable are the pronounced
flow reductions (i.e., the reduced slopes of the N-curves) that followed the queue's formation;
further details on this are provided in the next section.
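In this study the divergences were identified by visual inspection of the curves. An automated analogue might flag the first time at which the excess accumulation between two transformed curves exceeds a threshold and stays above it. The sketch below is offered only as an illustration of that idea; the function name, the threshold, the persistence requirement and the sample values are all assumptions and are not part of the method reported here.

```python
def first_sustained_divergence(times, upstream, downstream, threshold, persist):
    """Return the first time at which upstream - downstream exceeds `threshold`
    and remains above it for `persist` consecutive samples, or None if it never does."""
    run = 0
    for i, (u, d) in enumerate(zip(upstream, downstream)):
        run = run + 1 if (u - d) > threshold else 0
        if run == persist:
            return times[i - persist + 1]
    return None


# Illustrative values only: seconds from an arbitrary start, 20-s samples.
times = [0, 20, 40, 60, 80, 100, 120, 140]
upstream_curve = [0, 1, 0, 6, 2, 7, 9, 12]
downstream_curve = [0, 0, 0, 0, 0, 0, 0, 0]

onset = first_sustained_divergence(
    times, upstream_curve, downstream_curve, threshold=5, persist=3
)
```

The isolated excursion at 60 s is ignored because it does not persist; the reported onset is the start of the sustained separation.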
Figure 3: Re-scaled N- and T-curves, Detector 40
Figure 3 reveals the approximate time that the backward-moving queue arrived at the (freeway)
detectors at station 40. Presented in this figure is a re-scaled N-curve for the freeway lanes at
this detector station along with a re-scaled curve of cumulative occupancy versus time, or a
T-curve, where cumulative occupancy is the total vehicular trip time over the detectors by time t
(Lin and Daganzo, 1997). Again for the purpose of magnifying details, the T-curve shown here
is the difference between the cumulative occupancy actually measured in the three freeway
lanes at detector 40 and the line T = b0·t', where b0 is the background occupancy rate used to
re-scale the curve and t' is the elapsed time from the curve's start. The two curves show that a
sharp reduction in flow was followed closely by an increase in the occupancy rate at about
15:54:23. These features reveal the arrival of the queue at station 40. That this arrival
occurred some time around 15:54 will be an important part of later discussion regarding the
observed time dependencies in the on-ramp flows from Spadina Avenue.
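The pairing of a re-scaled N-curve with a re-scaled T-curve can be sketched as below. This is a hedged illustration only: the function names, the background rates and the sample measurements are assumptions, with occupancy expressed as the seconds a detector was occupied in each 20-second interval. A queue's arrival shows up as a drop in the flow together with a rise in the occupancy rate.

```python
INTERVAL_S = 20  # detector reporting interval stated in the text


def rescaled_cumulative(per_interval, background_rate, interval_s=INTERVAL_S):
    """Cumulative sum of interval values minus background_rate * t' (t' in seconds)."""
    total = 0.0
    curve = [0.0]
    for k, value in enumerate(per_interval, start=1):
        total += value
        curve.append(total - background_rate * k * interval_s)
    return curve


def mean_rate(per_interval, start, stop, interval_s=INTERVAL_S):
    """Average rate (per second) over intervals start .. stop - 1."""
    return sum(per_interval[start:stop]) / ((stop - start) * interval_s)


def looks_like_queue_arrival(counts, occupied_s, k):
    """True if, at interval index k, the flow drops while the occupancy rate rises."""
    flow_drops = mean_rate(counts, k, len(counts)) < mean_rate(counts, 0, k)
    occupancy_rises = mean_rate(occupied_s, k, len(occupied_s)) > mean_rate(occupied_s, 0, k)
    return flow_drops and occupancy_rises


# Illustrative measurements only; not data from the study.
counts = [10, 10, 10, 7, 7, 7]                # vehicles per 20 s
occupied_s = [2.5, 2.5, 2.5, 7.5, 7.5, 7.5]   # seconds occupied per 20 s

n_curve = rescaled_cumulative(counts, background_rate=0.5)        # q0 = 0.5 veh/s
t_curve = rescaled_cumulative(occupied_s, background_rate=0.125)  # b0 = 0.125
queue_arrived = looks_like_queue_arrival(counts, occupied_s, k=3)
```

With these values the re-scaled N-curve turns downward and the re-scaled T-curve turns upward at the same interval, which is the signature described above.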
Finally, Figures 4a and 4b reveal the period that the bottleneck remained active. Figure 4a
shows transformed N-curves for detectors 60 and 80; these were constructed in the manner
previously described, but for an extended period of over 4 hours. The slope of curve 80 drops
noticeably some time around 18:00 and a similar reduction is displayed by curve 60 soon
thereafter. Notably, the queue between detectors 60 and 80 persisted, even after these flow
reductions; this is evident from the continued displacement between the two curves. These
curve features indicate that a queue from further downstream arrived at these detector stations
and thereby deactivated our bottleneck. In Figure 4b, the divergence in the re-scaled curves of
N and T reveals that the queue from downstream arrived at station 80 at about 18:12:03.
Before concluding this section, it is worth re-emphasizing that the curves in Figures 2 and 4a
were instrumental in identifying the bottleneck's location and the period that it remained
active. These curves derived their value, in part, by displaying the excess vehicle
accumulations that arose between detectors and this required that the curves be constructed
from the counts taken over all travel lanes.3 Having now identified these bottleneck details (i.e.,
its location and the time it was active), traffic patterns could be studied in the individual lanes
at locations upstream and downstream of the bottleneck. Some notable findings from this
study are presented next.
4. SOME OBSERVATIONS IN INDIVIDUAL LANES

Figure 5 presents re-scaled N-curves in the shoulder lane for detectors 60, 70 and 80. Figures 6
and 7 present the re-scaled curves for these same detector stations in the center and median
lanes, respectively. These three detector stations (i.e., 60, 70 and 80) were selected because
they were situated immediately upstream and downstream of the bottleneck. In each figure, the
curves span a period of more than 3 hours, which includes the time that the bottleneck was
active.
Before presenting some of the findings obtained by jointly examining Figures 5, 6 and 7, a
brief explanation of their annotations is warranted. First, piece-wise linear approximations
3 If a set of N-curves does not describe node conservation, their vertical displacements would not be the excess
accumulations (Newell, 1982; Newell, 1993; Cassidy and Windover, 1995).
Figure 4a: Transformed N-curves, Detectors 60 and 80. (Curves: Sta. 60 and Sta. 80; horizontal axes: time, t @ 60 and time, t @ 80.)
Figure 4b: Re-scaled N- and T-curves, Detector 80. (Curves: N(80,t) - q0t' and T(80,t) - b0t'; horizontal axis: time, t.)
were superimposed on the N-curves to highlight periods of nearly constant flow; the N-curves usually
deviated from their corresponding linear approximations by no more than about 10 vehicles. The
start and end times for each period of near-constant flow were selected "by eye" and re-scaled
T-curves (not shown here) were also used to aid in these delimitations, since (sizable) changes
in flow are accompanied by changes in occupancy. The times marking these flow changes are
labeled on each curve.4 Further, the times marking the onset and the termination of queue
discharge flows are noted in boldface type for the curves at downstream stations 70 and 80.
Also shown are the rates corresponding to each period of near-constant flow; the numbers
shown without parentheses are in units of vehicles per hour (vph) and those in parentheses are
the corresponding average counts per minute. Finally, dotted lines are used to highlight the
average queue discharge rates measured over the rush at downstream detectors 70 and 80.
Labels specifying these average discharge flows are shown on the figures in boldface type.
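The annotations just described involve two simple calculations: converting a rate in vph to an average count per minute, and measuring how far a cumulative curve strays from a linear approximation. A minimal sketch follows. The function names and the sample counts are hypothetical; the rate of 2,340 vph is the Day 1 average queue discharge rate for the median lane reported in Table 1.

```python
def vph_to_per_minute(rate_vph):
    """Average count per minute corresponding to a flow in vehicles per hour."""
    return rate_vph / 60


def max_deviation_from_trend(cumulative, rate_per_interval):
    """Largest absolute gap between a cumulative curve and a straight line
    of the given slope drawn through the curve's first point."""
    n0 = cumulative[0]
    return max(abs(n - (n0 + rate_per_interval * k)) for k, n in enumerate(cumulative))


per_minute = vph_to_per_minute(2340)   # 39 vehicles per minute
per_interval = per_minute / 3          # 13 vehicles per 20-s interval

# Illustrative cumulative counts only (20-s intervals); not data from the study.
cumulative = [0, 13, 27, 39, 52, 67, 78]
deviation = max_deviation_from_trend(cumulative, per_interval)
```

In this example the curve never departs from the 13-vehicle-per-interval trend by more than 2 vehicles, the kind of check that underlies a description of flow as "near-constant".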
We now turn our attention to the traffic patterns revealed by Figures 5 through 7. From even
cursory examination of these figures, one observes a trend in vehicle lane changing; namely,
that large numbers of vehicles changed lanes to the left while they traveled between detectors
60 and 80. The curves in Figure 5, for example, were each started with the same value of N,
but curve 60, N(60, t), lies well above curve 70 for all time t. In similar fashion, curve 70 rises
above curve 80, indicating that vehicles continued to exit the shoulder lane at locations well
downstream of the Spadina Avenue on-ramp.
Figure 6, on the other hand, shows no such obvious trend. Rather, the (net) flows in the center
lane remained nearly unchanged as traffic moved through the bottleneck, indicating that,
between detectors, the number that moved into the center lane nearly equaled the number that
moved out. In fact, the Figure 6 curves have been vertically displaced (by arbitrary distances)
because not separating these curves would have made it difficult to view their details.
Figure 7 shows that large numbers entered the median lane between detectors 60 and 80 and
that this trend gave rise to some extraordinarily high flows. For example, a flow in excess of
2,600 vph was measured in the median lane at detector 80 prior to the bottleneck's activation.
Remarkably, this very high rate was observed for over 40 minutes before the queue formed
upstream and a lower discharge rate ensued. (The queue's formation is signaled by the onset of
its discharge flow at detector 70).
The onset of queueing occurred at nearly, but not precisely, the same times in each lane; i.e.,
the queue appears to have formed in the shoulder lane several minutes after it formed in the
adjacent lanes. The curves at detector 60 (in all three lanes) exhibit sustained surges some
4 A few of these times are not labeled in Figure 5 so that the figure would not become cluttered.
Figure 7: Re-scaled N-curves, Median Lane. (Vertical axis: N(x,t) - q0t'; horizontal axis: time, t, from 14:55:03 to 18:19:03. Note: times shown in bold mark the onsets and terminations of queue discharge; rates shown in bold are average queue discharge flows observed during the rush.)
minutes prior to the queue's formation. Conversely, the onset of this queue was accompanied
by rather dramatic flow reductions. Figures 5 through 7 show that the periods immediately
following the bottleneck's activation were marked by some of the lowest discharge rates
observed during the rush, but that these relatively low flows were short-lived relative to the
rush. In the center and median lanes (Figures 6 and 7), these so-called flow collapses (Cassidy
and Windover, 1995) prevailed for just under 20 minutes before being replaced by higher
discharge flows. In the shoulder lane (Figure 5), the collapse persisted for less than 10
minutes.
Also of note, Figures 5 through 7 show that sizable reductions in the discharge flows were
measured in all lanes some minutes before the bottleneck was deactivated by the arrival of the
queue from downstream (recall that this queue arrived at detector 80 at about 18:12). Visual
inspection of N- and T-curves measured in individual lanes revealed that these flow drops were
caused by an incident (perhaps a vehicle stall or a small collision) that occurred in the shoulder
lane near detector station 50. Notably, detector 50 measured near-zero counts and occupancies
in the shoulder lane from 18:02:23 to 18:08:23. Figure 8a shows that during this period, the
shoulder lane traffic at upstream station 40 exhibited a sharp reduction in the flow coupled with
a rise in the occupancy, features which mark the passage of a queue. Conversely, Figure 8b
shows that, within this period, shoulder lane traffic at downstream station 60 exhibited sudden
reductions in both the flow and the occupancy. These features would be expected to occur
downstream of a sudden restriction (i.e., as traffic passed the obstruction and moved into the
shoulder lane). A reduction in the discharge flow also occurred in the shoulder lane at about
16:59 (see Figure 5), but this reduction was short-lived and of no real consequence.
Despite the flow variations that occurred during the bottleneck's active period, Figures 5, 6 and
7 reveal that the queue discharge rates never deviated much from the linear trends shown with
the dotted lines (the reduced discharge flows that accompanied the incident near detector 50
were excluded from consideration here). Thus, for each lane, the discharge rates can be
described as being nearly constant over the rush. By comparing the average rates that
correspond to each dotted line with the flows that prevailed prior to the bottleneck's activation,
it is clear that queueing was accompanied by long-run flow reductions, especially in the
median and center lanes. Also by examining these three figures collectively, it is evident that
the average queue discharge rates varied across lanes.
Finally, Figure 9 shows a re-scaled N-curve constructed solely from the counts on the Spadina
Avenue on-ramp (at detector station 40). The sustained surge in the ramp flow evident at
15:42:43 corresponds to the measured increase in upstream flow previously revealed in Figure
Figure 8a: Re-scaled N- and T-curves, Detector 40, Shoulder Lane.
Figure 8b: Re-scaled N- and T-curves, Detector 60, Shoulder Lane.
Figure 9: Re-scaled N-curve, Spadina Avenue on-ramp.
2.5 Likewise, the sharp reduction in this ramp flow at 15:50:23 corresponds closely to the
arrival of the queue from our active bottleneck; recall that this queue was shown (in Figure 3)
to have arrived to the freeway detectors at station 40 some time shortly after 15:50. Although
this queue apparently suppressed the on-ramp flow, vehicles continued to enter the freeway via
this ramp at a high rate; i.e., an average ramp flow of 1,650 vph persisted for nearly an hour.
This flow dropped at about 16:46:43, perhaps due to a reduction in the on-ramp demand
(although additional ramp detectors do not exist to confirm this). In any event, these high ramp
flows mean that, just downstream of the merge, more than half of the vehicles traveling in the
shoulder lane originated from the on-ramp. Thus, the merging process did not exhibit the so-
called "zipper effect" (Newman, 1986) whereby freeway and ramp vehicles share the shoulder
lane in a strictly alternating fashion.
5 Study of the N-curves constructed from freeway counts at detector stations 30 and 40 (not shown here)
revealed that the increased flows observed in Figure 2 were part of sustained surges in the freeway flows as well
as in the flow from the Spadina Avenue on-ramp.
5. FINDINGS FROM REPEATED EXPERIMENTS AND FUTURE RESEARCH DIRECTIONS
Data from detectors 40 through 80 were extracted during two other weekday afternoons and
were examined in the manner previously described. On each of these two additional days, the
observed traffic patterns were similar to those presented above. As a means of exemplifying
some of these day to day similarities, Table 1 presents certain observations taken each day at
detector 80. Row 1 of this table shows that, in the median lane, very high flows were observed
each day prior to queueing. The table's second row reveals that these high rates were always
sustained for periods of at least 5 minutes, and in two instances, for much longer. Row 3 of
Table 1 shows that the flow collapse at the onset of queueing was a reproducible feature in the
median lane (note that the rates shown in this row are lower than their corresponding average
discharge rates measured over the rush and presented in row 5). However, these collapses
persisted for durations that varied across days, as shown in row 4 of the table. The flow
collapse was likewise reproduced each day in the center lane, although this information is
excluded from the table.
Important features also not shown in Table 1 are that the bottleneck always formed at the same
location (i.e., between detectors 60 and 70) and that it was always activated by a sustained
surge in the flow from upstream. Thus, traffic transitioned from free flow to queued conditions
in a predictable way; the queues formed at an inhomogeneity, the bottleneck, due to
reproducible, exogenous reasons, i.e., the increased flows. In this instance, one might presume
that the freeway's horizontal curve is the inhomogeneity creating the bottleneck (see Figure 1).
While this may indeed be the case, it is worth noting that the same analysis methods applied to
data from another freeway location found that a bottleneck consistently formed more than a
kilometer downstream of an on-ramp, even though there was no obvious inhomogeneity at this
location (Cassidy and Bertini, 1999). In any event, our studies to date have revealed no
evidence suggesting that traffic can break down and form queues in a spontaneous manner.
Also of note, rows 5, 6 and 7 of Table 1 indicate that, while the bottleneck was active, the
average discharge flow (in a given lane) exhibited only small variation across days. On each
day, these rates can be described as "near-constant" since the cumulative counts never deviated
much from a linear trend. Given these predictable features of its discharge rates, it seems
reasonable to postulate about how queues might evolve upstream of this bottleneck; e.g., by
using a continuum model of highway traffic (Lighthill and Whitham, 1955; Richards, 1956;
Newell, 1993).
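As an indication of how such a postulate might proceed, kinematic wave theory gives the speed of the interface between two traffic states as the ratio of the change in flow to the change in density across it. The sketch below applies that relation; the function name and all four state values are hypothetical, since densities were not measured in this study.

```python
def interface_speed_kmh(q_upstream_vph, k_upstream_vpkm, q_queued_vph, k_queued_vpkm):
    """Speed of the boundary between an arriving state and a queued state:
    change in flow divided by change in density (negative means it moves upstream)."""
    return (q_queued_vph - q_upstream_vph) / (k_queued_vpkm - k_upstream_vpkm)


# Hypothetical states for one lane: arrivals at 2,000 vph and 25 veh/km,
# queued traffic at 1,700 vph and 55 veh/km.
speed = interface_speed_kmh(2000, 25, 1700, 55)   # -10.0 km/h
```

With these assumed values the back of the queue would travel upstream at 10 km/h; a near-constant discharge rate is what makes the queued state predictable enough for a calculation of this kind.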
Table 1
Some Observations Taken From Station 80

Row  Observation                                                 Day 1     Day 2     Day 3
                                                                 3/5/97    2/20/97   7/21/97
1    Maximum Flow (vph) - Median Lane                            2630      2630      2400
2    Measured Duration of Maximum Flow (min:sec) - Median Lane   43:40     5:40      19:40
3    Flow Collapse (vph) - Median Lane                           2280      2110      2300
4    Measured Duration of Flow Collapse (min:sec) - Median Lane  19:20     7:00      5:20
5    Average Queue Discharge Rate (vph) - Median Lane            2340      2290      2330
6    Average Queue Discharge Rate (vph) - Center Lane            1920      1910      1950
7    Average Queue Discharge Rate (vph) - Shoulder Lane          1720      1690      1690
Note: Data from Day 1 were used in Figures 2 through 9.
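The table's median-lane entries can be used to gauge the flow reduction that accompanied queueing and the day-to-day spread in the discharge rate. The short calculation below uses only values transcribed from rows 1 and 5 of Table 1; the variable and function names are hypothetical, and comparing the maximum pre-queue flow with the average discharge rate is merely one possible way of summarizing the reduction.

```python
# Values transcribed from Table 1 (median lane, station 80), in vph.
max_flow = {"Day 1": 2630, "Day 2": 2630, "Day 3": 2400}
avg_discharge = {"Day 1": 2340, "Day 2": 2290, "Day 3": 2330}


def percent_drop(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100 * (before - after) / before


drops = {
    day: round(percent_drop(max_flow[day], avg_discharge[day]), 1)
    for day in max_flow
}
# drops == {"Day 1": 11.0, "Day 2": 12.9, "Day 3": 2.9}

# Range of the average discharge rate across the three days, in vph.
spread = max(avg_discharge.values()) - min(avg_discharge.values())   # 50
```

The 50 vph spread in row 5 amounts to roughly 2 percent of the discharge rate, whereas the reduction from the maximum flow differs considerably from day to day.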
In closing, we note that the observations reported above bring to light a number of unanswered
questions. For example, the reason(s) why the observed lane-changing trends persisted well
downstream of the on-ramp, and the extent to which the high flows observed in the median
lane might be reproduced at other bottlenecks, are unknown. Also unknown are the causes of
the flow reductions that accompanied queueing, especially the relatively large reductions at the
onset of queueing. Finally, the potential for using control measures, such as ramp metering, to
extend the periods marked by high flows (observed prior to queueing, for example) is
uncertain.
Answers to the above will only come through additional empirical study. Since (freeway)
bottlenecks come in many forms, including merges, diverges, weaves and lane reductions, and
since the traffic patterns on each type of bottleneck may exhibit their own peculiarities, the
study of bottlenecks in each of their forms seems warranted. Cumulative curves like those
described here might be used to conduct these studies since they provide a robust way of
diagnosing the details of bottleneck traffic.
ACKNOWLEDGEMENTS
The authors are indebted to Mr. David Nesbitt, City of Toronto, for providing the data used in
this study, and to G.F. Newell for his helpful comments.
REFERENCES
Cassidy, M.J. and R.L. Bertini (1999). Some traffic features at freeway bottlenecks. Transpn.
Res., 33B, 25-42.
Cassidy, M.J. and J.R. Windover (1995). Methodology for assessing dynamics of freeway
traffic flow. Transpn Res. Rec., 1484, 73-79.
Daganzo, C.F. (1997). Fundamentals of transportation and traffic operations. Elsevier, New
York, p. 133.
Lighthill, M.J. and G.B. Whitham (1955). On kinematic waves. I: Flood movement in long
rivers. II: A theory of traffic flow on long crowded roads. Proc. Royal Soc., A229,
281-345.
Lin, W.H. and C.F. Daganzo (1997). A simple detection scheme for delay-inducing freeway
incidents. Transpn Res., 31A, 141-155.
Newell, G.F. (1982). Applications of queueing theory. Chapman Hall, London.
Newell, G.F. (1993). A simplified theory of kinematic waves in highway traffic I: General
theory. II: Queuing at freeway bottlenecks. III: Multi-destination flows. Transpn Res.,
27B, 281-313.
Newman, L. (1986). Freeway operations analysis course notes. Institute of Transportation
Studies, University Extension, Univ. of California, Berkeley, U.S.A.
Persaud, B.N. (1986). Study of a freeway bottleneck to explore some unresolved traffic flow
issues. PhD thesis, Univ. of Toronto, Toronto, Canada.
Persaud, B.N. and V.F. Hurdle (1991). Freeway capacity: definition and measurement issues.
Proc., International Symposium of Highway Capacity, A.A. Balkema press, Germany,
289-307.
Persaud, B., S. Yagar and R. Brownlee (1998). Exploration of the breakdown phenomenon in
freeway traffic. Transpn Res. Rec., 1634, 64-69.
Richards, P.I. (1956). Shock waves on the highway. Opns. Res., 4, 42-51.
FLOWS UPSTREAM OF A HIGHWAY BOTTLENECK
Gordon F. Newell
Department of Civil and Environmental Engineering and Institute of Transportation Studies
University of California at Berkeley
ABSTRACT
Suppose that the local capacity of a highway is a smooth function of location, approximated by a
parabolic function with a minimum value at some location (the bottleneck). The flow approaching
the bottleneck increases approximately linearly with time as it exceeds the capacity of the
bottleneck. We present here an analytic solution for the resulting flow pattern upstream of the
bottleneck as predicted by the theory of Lighthill and Whitham (1955) for two different types of
analytic forms for the relation between flow and density.
Although, in each of the two cases, the formulation of the problem contains seven parameters, it is
shown that, by appropriate linear transformation of variables, the flow pattern can be described in
terms of a single dimensionless pattern. In each case, a shock first forms at some point upstream of
the bottleneck with an amplitude which increases proportional to the square root of the time from
its beginning.
1. INTRODUCTION
Suppose that the capacity at each location along a section of highway is a smoothly varying
function of location, as might be the case if the highway curves or changes grade, and that the
capacity has a minimum value at some location which we arbitrarily identify as x = 0. Over some
distance in the vicinity of the bottleneck, x = 0, we will assume that the capacity Qm(x) can be
approximated by a quadratic function of the form
Qm(x) = q0 + Ax^2    (1.1)
with q0 the capacity of the bottleneck and A some positive constant describing how rapidly the
capacity changes with the distance from the bottleneck.
Suppose also that at some location x = -L, well upstream of the bottleneck, there is a flow q(-L, t)
approaching the bottleneck. This flow is increasing with time and at some time t = t0 becomes
equal to the capacity, q0. Over some interval of time near t0 we will assume that the flow is
increasing (nearly) linearly with time so we can approximate
q(-L, t) = q0 + B(t - t0)    (1.2)
for some positive constant B describing the rate of increase of the flow.
A flow larger than q0 certainly cannot pass the bottleneck nor can it pass any point upstream of
the bottleneck where that flow exceeds the local capacity. As this flow approaches the bottleneck
vehicles accumulate behind the bottleneck. The accumulation will eventually create a shock which
propagates upstream.
Lighthill and Whitham, L-W, (1955, p. 333) gave a qualitative description of how the shock forms
and propagates upstream based upon their theory of kinematic waves. In this theory it is
postulated that there is a specified functional relation between the density, k(x, t), and the flow
q(x, t),
q(x, t) = Q(k(x, t), x).    (1.3)
For any fixed x, Q(k, x) is a concave function of k having a maximum with respect to k, the
capacity Qm(x) at that location. L-W did not specify any particular form for Q(k, x) but it was
assumed to be a smooth function of k with a locally parabolic maximum with respect to k, and that
Qm(x) had a minimum at the bottleneck.
Our objective here is to give an analytic solution for q(x, t) based on the L-W theory for two
hypothetical forms for the function Q(k, x). In the first case it is assumed that Q(k, x) for fixed x
has a triangular shape, as illustrated in figure 1 for several values of x. The slope Q(k, x)/k on the
left hand side of the triangle is the "free speed" v0 assumed to be independent of x. The slope on
the right hand side of the triangle, the (negative) wave velocity, is also assumed to be independent
of x. The maximum height of the triangle is Qm(x) as in (1. 1). Thus the family of Q(k, x) curves for
various x differ only by a scaling factor Q(x) for both q and k.
[Fig. 1. Triangular q-k relations (flow vs. density k)]
[Fig. 2. Parabolic q-k relations (flow vs. density k, maximum at k0)]
In the second case, it is assumed that the function Q(k, x) has a local parabolic maximum with
respect to k at some density k0 and some specified curvature. The k0 and curvature are both
independent of x. Thus for k in some vicinity of k0 we can approximate Q(k, x) by
Q(k, x) = q0 + Ax^2 - C(k - k0)^2    (1.4)
for some positive constant C. This form is illustrated in figure 2.
We will show for each of the two forms for Q(k, x) that, by appropriate choice of units and
coordinates, q(x, t) can be described in terms of a single dimensionless function q*(x*, t*) of a
dimensionless length x* and time t*, independent of the parameters q0, A, B, C, L etc. The
function q*(x*, t*) can also be evaluated analytically. Similar methods could be applied also to
other types of q - k relations.
2. TRIANGULAR Q - K RELATION
For the triangular q - k relation illustrated in figure 1, it is advantageous first to introduce a moving
time origin traveling at velocity v0. If we measure time at each location x from the time that some
reference vehicle traveling at velocity v0 would pass (Newell 1993 II)
t' = t - x/v0    (2.1)
and define
q'(x, t') = q(x, t' + x/v0)    (2.2)
k'(x, t') = k(x, t' + x/v0) - q(x, t' + x/v0)/v0    (2.3)
then the relation between q' and k' would have the form illustrated in figure 3 in which the "free
speed" is, in effect, infinite in the new coordinates. The negative "wave pace" (-w0), the reciprocal
of the wave speed, is (-1/v0) plus the corresponding wave pace of figure 1, with w0 independent of x.
This eliminates the parameter v0, in the sense that q'(x, t') is a solution of the L-W equations for
the q' - k' relation of figure 3, which is independent of v0.
According to the L-W theory, the flow q'(x, t') in the absence of shocks must be a constant q' along
"characteristic curves" which travel at the wave velocity associated with the flow q'. For the q' - k'
relation of figure 3 there are only two possible values for the wave pace, 0 or -w0. Some
characteristic curves are illustrated in figure 4. (The numerical values of the flow, distance, and time
in figure 4 are the dimensionless values q*, x*, and t* to be specified shortly.) The characteristic
curves for flow less than q0 (not shown in figure 4) are vertical lines that extend through the
bottleneck. The characteristic curve for the flow q0, labeled as 0 in figure 4, travels vertically until
it reaches the bottleneck at x = 0 then travels backward from that point at wave pace -w0. A
characteristic curve for some flow larger than q0 travels vertically until it reaches the location where
the local capacity is equal to the flow, and then turns back from that point at pace -w0.
The "boundary conditions" specify q(-L, t) or equivalently

q'(-L, t') = q(-L, t'-L/v0) = q0 + B(t' - L/v0 - t0).
If q(-L, t) is linearly increasing in t, q'(-L, t') is linearly increasing in t'. Since the characteristic
curves (before they turn back) are vertical, a linearly increasing flow at one value of L implies a
linearly increasing flow at other values of L. Thus the value of L is irrelevant, and we can
arbitrarily choose the time origin (i.e. the t0) or the L so that the characteristic for flow q0 reaches
the bottleneck at t' = 0.
[Fig. 3. A q'-k' relation in moving coordinates (flow vs. density k')]
The locus of points where the characteristic curves turn back are points where
q'(x, t') = q0 + Bt' = q0 + Ax^2
i.e.
Bt' = Ax^2    (2.4)
shown in figure 4 by a dotted curve. Obviously figure 4 depends on the values of q'(x, t') - q0, but
the value of q0 itself is irrelevant. If we increase the flows by some fixed amount and increase the
capacities by the same amount the figure would not change.
A shock will form at the earliest time when two characteristic curves of different flows intersect.
This is obviously at such time t1', location x1, and flow q1, when the slope dt'/dx of the curve (2.4) is
equal to the wave pace -w0 (a wave velocity of -1/w0), i.e.
dt'/dx = (2A/B) x1 = -w0,
x1 = -w0 B/2A,  t1' = w0^2 B/4A,  q1 - q0 = w0^2 B^2/4A.    (2.5)
[Fig. 4. Curves of constant flow, q*]
We still have the option of choosing units for x, t', and q' - q0. We could, for example, choose a
"dimensionless" distance x*
x* = -x 2A/(B w0)    (2.6)
measured upstream of the bottleneck so that x1* = +1, and a dimensionless time t*
t* = t' 2A/(B w0^2)    (2.7)
so that the dimensionless wave velocity or pace is +1 (in the direction of increasing x*) and t1* = 1/2.
We can also choose the units of flow so that
q*(x*, t*) = [q'(x, t') - q0] 2A/(B^2 w0^2).    (2.8)
The approaching flow is then q*(x*, t*) = t* and the q1* at t* = 1/2 is 1/2.
Thus, starting from a formulation which contained potentially seven parameters, q0, A, L, B, t0, v0,
and w0, the problem has been reduced to a dimensionless form with no parameters. The q*(x*, t*)
is equivalent to the q(-x, t) for q0 = 0, A = 1/2, L arbitrary, B = 1, t0 = 0, v0 = ∞ and w0 = 1 (a wave
pace of -1 in x, or +1 in x*).
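As a hedged illustration of the scalings (2.6) to (2.8), the following Python sketch maps dimensional quantities to their dimensionless counterparts and checks that the shock-onset point implied by the turn-back locus (2.4) lands at x* = 1, t* = 1/2, q* = 1/2. The function names and the sample parameter values are illustrative assumptions, not part of the original analysis.

```python
def shock_onset(A, B, w0):
    """Point on the locus B t' = A x^2 where its slope dt'/dx equals -w0."""
    x1 = -w0 * B / (2.0 * A)   # location (negative: upstream of the bottleneck)
    t1 = A * x1 ** 2 / B       # moving-frame time on the locus (2.4)
    dq1 = B * t1               # flow excess q1 - q0 at that time
    return x1, t1, dq1


def to_dimensionless(x, t_prime, dq, A, B, w0):
    """Apply the scalings (2.6), (2.7) and (2.8)."""
    x_star = -x * 2.0 * A / (B * w0)
    t_star = t_prime * 2.0 * A / (B * w0 ** 2)
    q_star = dq * 2.0 * A / (B ** 2 * w0 ** 2)
    return x_star, t_star, q_star


# Arbitrary sample values (assumed units: km, hr, veh/hr).
A, B, w0 = 100.0, 1000.0, 0.1
onset_star = to_dimensionless(*shock_onset(A, B, w0), A, B, w0)
```

Any other positive choice of A, B and w0 should give the same dimensionless onset point, which is the sense in which the parameters drop out.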
Finally to evaluate the q*(x*, t*) we note that, at any point (x*, t*), it can have only one of three
possible values:
q*(x*, t*) = t*    (2.9)
if it is determined by the approaching flow,
q*(x*, t*) = 0    (2.10)
if it is determined by the flow which can pass the bottleneck at x = 0, or a value determined
from the characteristic curve passing through (x*, t*) coming from the curve (2.4), t* = (1/2)x*^2,
where the flow is q* = t*. This last value gives
q*(x*, t*) = 1 - (x* - t*) - [1 - 2(x* - t*)]^(1/2).    (2.11)
This last expression is a function only of x* - t* since the flow is constant along the characteristic
curves x* - t* = constant, but it applies only for 0 < x* - t* < 1/2. One can verify that (2.11) is
correct by checking that along the curve (2.4), t* = (1/2)x*^2, (2.11) gives the approach flow
q*(x*, (1/2)x*^2) = (1/2)x*^2 = t*.
For small values of (x* - t*) an expansion of (2.11) in powers of x* - t* gives
q*(x*, t*) = (x* - t*)^2/2 + (x* - t*)^3/2 + ...    (2.11a)
thus q*(x*, t*) vanishes quadratically in (x* - t*) as x* - t* → 0. For x* - t* close to 1/2, however,
q*(x*, t*) varies rapidly with a 1/2 power singularity. The function (2.11) is illustrated by the
curve of figure 5.
For 0 < x* < 1, it is obvious that (2.9) applies for t* < (1/2)x*^2, (2.11) for (1/2)x*^2 < t* < x*, and
(2.10) for x* < t*. For 1 < x*, however, the boundary separating these solutions is a shock path.
L-W determine the path of the shock by integrating the equations for the velocity of the shock. A
simpler and more general method (Newell 1993 I) is to evaluate the cumulative flow and require
that the cumulative flow be continuous across the shock.
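The three regimes for 0 < x* < 1 can be sketched in Python as follows. This is a minimal illustration under the assumption that the point lies in the shock-free strip 0 < x* <= 1; the function names are hypothetical.

```python
import math


def q_backward(c):
    """Flow (2.11) on a backward-moving characteristic, c = x* - t*, 0 <= c <= 1/2."""
    return 1.0 - c - math.sqrt(max(0.0, 1.0 - 2.0 * c))


def q_star(x, t):
    """Dimensionless flow at (x*, t*) for 0 < x* <= 1, where no shock is present."""
    if not 0.0 < x <= 1.0:
        raise ValueError("this sketch only covers 0 < x* <= 1")
    if t < 0.5 * x * x:
        return t                  # (2.9): approaching flow
    if t < x:
        return q_backward(x - t)  # (2.11): wave returned from the turn-back locus
    return 0.0                    # (2.10): flow set by the bottleneck
```

On the locus t* = (1/2)x*^2 the first two branches agree, which is the consistency check described in the text.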
[Fig. 5. Flow vs. time at various locations (flow q* vs. time t* - x*)]
We can define a dimensionless cumulative flow A*(x*, t*) such that
∂A*(x*, t*)/∂t* = q*(x*, t*),  ∂A*(x*, t*)/∂x* = k*(x*, t*)    (2.12)
with k*(x*, t*) = 0 if the waves at (x*, t*) are going forward and
k*(x*, t*) = (1/2)x*^2 - q*(x*, t*)    (2.13)
if the waves at (x*, t*) are going backward. Since q*(x*, t*) = 0 along the characteristic curve t* = 0
for all x* > 0 and also at the bottleneck x* = 0 for all t* > 0, we can define A*(x*, t*) = 0 along
these lines. The A*(x*, t*) can be evaluated by integrating (2.12) along any path from a point
where A* = 0 to (x*, t*).
On the upstream side of the shock (2.9) applies. Integration of this with respect to t* for fixed x*
gives
A*(x*, t*) = (1/2)t*^2.    (2.14)
On the downstream side of the shock, for t* > x*, (2.10) applies and k*(x*, t*) = (1/2)x*^2.
Integration of this with respect to x* for fixed t* gives
A*(x*, t*) = (1/6)x*^3,  t* > x*.    (2.15)
Equating (2.14) and (2.15) we conclude that the path of the shock is
(1/2)t*^2 = (1/6)x*^3;  x* = 3(t*/3)^(2/3)    (2.16)
if t* > x*, or, equivalently, for t* > 3, x* > 3.
The formula for the shock path for 1 < x* < 3 is much more complicated because the flow on the
downstream side satisfies (2.11). In the region where (2.11) applies, we can evaluate A*(x*, t*) by
integrating the flow (2.11) along a line of constant x* from the point (x*, x*) where A*(x*, x*) =
(1/6)x*^3:
A*(x*, t*) = (1/6)x*^3 - ∫_{t*}^{x*} q*(x*, τ) dτ
           = x*^3/6 + (1/3){1 - 3(x* - t*) + (3/2)(x* - t*)^2 - [1 - 2(x* - t*)]^(3/2)}.    (2.17)
If one integrates the expansion (2.11a), one can verify directly that
A*(x*, t*) = x*^3/6 - (x* - t*)^3/6 - (x* - t*)^4/8 + ....    (2.17a)
Over most of the region where (2.17) applies, the second term of (2.17) is very small. Even at the
point where the shock first forms at x* = 1, t* = 1/2 where x* - t* is largest, the second term of
(2.17) is only (-1/4) times the first term and the relative size of this term decreases very rapidly as
(x* - t*) decreases.
The path of the shock for 1 < x* < 3 is obtained by equating (2.17) and (2.14). It is quite easy to
evaluate numerically the small deviation of the shock path from the curve defined by (2.16).
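One way to carry out this numerical evaluation is sketched below in Python. It is an illustration only: bisection is an assumed choice of root finder (the text does not name a method), and the function names are hypothetical. For 1 <= x* < 3 it solves A*(downstream) = A*(upstream) for t* on the interval x* - 1/2 <= t* <= x*, and for x* >= 3 it falls back to the closed form (2.16).

```python
import math


def A_upstream(t):
    """Cumulative flow (2.14) on the upstream side of the shock."""
    return 0.5 * t * t


def A_downstream(x, t):
    """Cumulative flow on the downstream side: (2.15) for t >= x, (2.17) otherwise."""
    c = x - t
    if c <= 0.0:
        return x ** 3 / 6.0
    root = max(0.0, 1.0 - 2.0 * c) ** 1.5
    return x ** 3 / 6.0 + (1.0 - 3.0 * c + 1.5 * c * c - root) / 3.0


def shock_time(x, tol=1e-12):
    """Time t* at which the shock passes location x* >= 1."""
    if x < 1.0:
        raise ValueError("no shock exists for x* < 1")
    if x >= 3.0:
        return math.sqrt(x ** 3 / 3.0)  # closed form (2.16)
    lo, hi = x - 0.5, x                 # (2.17) holds for 0 <= x - t <= 1/2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if A_downstream(x, mid) > A_upstream(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `shock_time(2.0)` comes out slightly earlier than the value (8/3)^(1/2) that (2.16) alone would give, which is the small deviation the text refers to.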
Figure 4 shows the path of the shock and the curves of constant flow q*. Figure 5 shows the flow
q*(x*, t*) as a function of t* - x* for various values of x*.
To give a more complete description of the solution one might wish also to draw some vehicle
trajectories. For any dimensionless solution q*(x*, t*), however, there is a whole family of
possible (dimensionless) trajectories depending on the capacity q0 of the bottleneck or its
dimensionless form
q0* = 2A q0/(B^2 w0^2).
To see this one need only observe that the "dimensionless cumulative flow" A*(x*, t*) defined by
(2.12) is the dimensionless cumulative flow for a hypothetical capacity q0 = 0 or, equivalently, it is
the actual dimensionless cumulative flow less a "background" cumulative flow of q0*t*. Thus, for
any q0* > 0, the dimensionless cumulative flow is actually
q0*t* + A*(x*, t*)
and the trajectories are the curves for which this is a constant. In particular, the trajectory which
passes x* = 0 at some time t0* is the curve
A*(x*, t*) = q0*(t0* - t*).    (2.18)
There is just one dimensionless function A*(x*, t*) as described above; the t0* labels the
trajectory, but there is still the parameter q0*. There is a different set of trajectories for each q0*.
In the region where the characteristics of figure 4 are vertical, A*(x*, t*) is given by (2.14) and the
trajectories are defined by the equation
t*^2/2 = q0*(t0* - t*).    (2.18a)
As expected, the trajectories are also vertical, i.e. t* is independent of x* (the vehicle speed and the
wave speed are the same), but the labels of the trajectories (vehicle number) depend on q0*.
In the region t* > x*, the trajectories are also simple. From (2.15)
x*^3/6 = q0*(t0* - t*)    (2.18b)
which means that the time displacement of a trajectory from a vertical trajectory is proportional
to x*^3 (for any q0*). The velocity in the x*, t* coordinate system is proportional to x*^(-2).
In the region where (2.17) applies, the trajectories are more complicated but, for any q0*, the
trajectories are continuous across the shock path.
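The two simple trajectory formulas, (2.18a) and (2.18b), can be evaluated directly. The Python sketch below is illustrative only; the function names are hypothetical, and the choice of the root with t* > -q0* in (2.18a) is an assumption made here so that the total dimensionless flow q0* + t* stays positive.

```python
import math


def trajectory_time_downstream(x, t0, q0_star):
    """(2.18b): time t* at position x* for the vehicle passing x* = 0 at t0*.

    Only meaningful in the region t* > x* (behind the shock, flow q* = 0).
    """
    return t0 - x ** 3 / (6.0 * q0_star)


def trajectory_time_upstream(t0, q0_star):
    """(2.18a): the constant t* of the same vehicle where characteristics are vertical.

    Solves t^2/2 = q0*(t0 - t) and keeps the root with t > -q0* (assumed branch).
    """
    return -q0_star + math.sqrt(q0_star ** 2 + 2.0 * q0_star * t0)
```

Since x* is measured upstream, the downstream formula gives earlier times at larger x*, as expected for a vehicle approaching the bottleneck.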
3. PARABOLIC Q - K RELATION
For the parabolic q - k relation described by (1.4) the formulas are somewhat more complicated
because the characteristic curves are not straight lines. In the absence of shocks the formal solution
of the L-W equations is that q(x, t) is constant along characteristic curves in the x - t plane, x(t; q),
having a slope, for fixed q,
dx(t; q)/dt = ∂Q(k, x)/∂k.    (3.1)
Along the characteristic curve for flow q, the density as given by (1.4) is
k - k0 = ± [q0 - q + Ax^2]^(1/2) C^(-1/2).
Thus the equation for the characteristic curve of flow q is
dx(t; q)/dt = ± 2(CA)^(1/2) [(q0 - q)/A + x^2]^(1/2)    (3.2)
in which either the + or - sign might apply depending on whether the velocity of the wave is
positive or negative.
A "formal" solution of (3.2) is
x(t; q) = ±(q0 - q)^(1/2) A^(-1/2) sinh(2(AC)^(1/2) t + D(q))  if q < q0,
x(t; q) = ±(q - q0)^(1/2) A^(-1/2) cosh(2(AC)^(1/2) t + D(q))  if q > q0,    (3.3)
in which D(q) is some unspecified "integration constant" for each value of q. The D(q) is to be
determined from appropriate boundary or initial conditions. In particular, we assume here that
"sufficiently far" upstream at x = -L, the flow satisfies (1.2).
Sufficiently far upstream is now interpreted to mean at a value of L such that the hyperbolic
functions in (3.3) can be approximated by positive exponentials
|sinh x| ~ cosh x ~ (1/2) exp(|x|).
Thus in (3.3), the boundary condition requires that
-L = ± |q0 - q(-L, T)|^(1/2) (1/2) A^(-1/2) exp(|2(AC)^(1/2) T + D(q(-L, T))|)
for all T. Equivalently if we represent T as a function of q rather than q as a function of T
through (1.2), this gives
-L = ± |q0 - q|^(1/2) (1/2) A^(-1/2) exp(|2(AC)^(1/2) (t0 + B^(-1)(q - q0)) + D(q)|).
We would naturally specify that, for some sufficiently large L, the flow would reach q0 at some
"large" negative t0, so the argument of exp(|·|) above is negative and the ± sign must be -. Thus
exp(D(q)) = |q0 - q|^(1/2) (2L)^(-1) A^(-1/2) exp(-2(AC)^(1/2) t0 - 2(AC)^(1/2) (q - q0)/B).
This relation depends on t0 and L only through a single factor L^(-1) exp(-2(AC)^(1/2) t0) on the right
hand side. Since a change in L is equivalent to a suitable change in t0, this means that if the flow is
linearly increasing with t as in (1.2) for some sufficiently large L, it is also linearly increasing with t
at other large values of L but with a displaced value of t0. Any change in this factor is equivalent to
a translation of the time coordinate and, by suitable choice of the t0, we can assign this factor any
value we wish. In particular, we will choose the time origin so that
exp(D(q)) = (AC)^(1/4) [2|q - q0|/B]^(1/2) exp(-2(AC)^(1/2) (q - q0)/B).
The equations for the characteristics (3.3) can now be written in the simple dimensionless
form
x*(t*; q*) = e^(-t* + q*) + q* e^(t* - q*)    (3.4)
with
x*(t*; q*) = -2^(3/2) A^(3/4) C^(1/4) B^(-1/2) x(t; q),
t* = 2(AC)^(1/2) t,  q* = 2(AC)^(1/2) (q - q0)/B.    (3.5)
Equation (3.4) is valid either for q < q0 or q > q0 since the q* changes sign when q passes q0.
The x*, t*, and q* are simply rescaled dimensionless versions of the x, t, and q - q0, respectively
(but not necessarily related to the corresponding symbols in section 2). Again, as in section 2, the
formulation of this problem started with seven parameters q0, A, L, B, t0, k0, and C, the first five
of which are the same as before but the parameters k0 and C associated with the shape of the
parabolic q - k curve replace the parameters v0 and w0 associated with the triangular q - k curve.
Again the final dimensionless form of the characteristic curves contains none of these parameters.
To complete the solution of the dimensionless flow pattern we note that (3.4) can also be written
in the form
x*(t*; q*) = -2|q*|^(1/2) sinh(t* - q* + (1/2) ln|q*|)  for q* < 0,
x*(t*; q*) = +2q*^(1/2) cosh(t* - q* + (1/2) ln q*)  for q* > 0.    (3.6)
Thus the characteristic curves for q* < 0 and those for q* > 0 all have the same shape except for a
scaling factor |q*|^(1/2) and a translation in time by q* - (1/2) ln|q*|.
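The equivalence of the exponential form (3.4) and the hyperbolic form (3.6) is easy to confirm numerically. The following Python sketch is an illustration with hypothetical function names; the q* = 0 branch uses the limit e^(-t*) of (3.4).

```python
import math


def x_char(t, q):
    """Characteristic curve (3.4): x* as a function of t* for fixed flow q*."""
    return math.exp(-t + q) + q * math.exp(t - q)


def x_char_hyperbolic(t, q):
    """The same curve written in the sinh/cosh form (3.6)."""
    if q == 0.0:
        return math.exp(-t)  # limiting case of (3.4) for q* = 0
    arg = t - q + 0.5 * math.log(abs(q))
    if q < 0.0:
        return -2.0 * math.sqrt(-q) * math.sinh(arg)
    return 2.0 * math.sqrt(q) * math.cosh(arg)
```

The cosh form makes the turning point of a q* > 0 characteristic explicit: it occurs where the argument vanishes, at x* = 2q*^(1/2).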
Figure 6 shows a family of characteristic curves evaluated from (3.4) or (3.6). The curves for q* < 0
pass through the bottleneck at x* = 0. Note that these are drawn for a geometric sequence of q*
values -1, -1/2, -1/4, etc. These characteristic curves will cover the whole space downstream of the
bottleneck, but clearly the flow at any location downstream rapidly approaches the flow q* = 0 (q
= q0) as t* increases.
[Fig. 6. Curves of constant flow (horizontal axis: time t*)]
The characteristic curve for q* = 0 is simply an exponential e^(-t*) which is asymptotically horizontal
for t* → ∞. This characteristic cannot pass the bottleneck. The characteristic curves for q* > 0
move forward in time until the flow q* reaches the location where the local capacity of the
highway is q*. At this point the slope of the characteristic curve becomes zero. Unlike the
characteristic for q* = 0, however, the slope of the characteristic curve for q* > 0 becomes zero at
a finite time after which the characteristic proceeds to move back upstream. Note that this is
similar to the pattern of figure 4 except that in figure 4 the characteristic curves change direction
abruptly.
The points at which the characteristic curves become horizontal are identified in figure 6 by the
dotted curve. They occur where the argument of the cosh (•) in (3.6) vanishes
t* = q* - (1/2) ln q*,  x* = 2q*^(1/2)
or
t* = (x*/2)^2 - ln(x*/2).
As a function of q* or x*, this point starts at t* = ∞ for q* = 0. It decreases with q* until it
reaches a minimum at q* = 1/2 and then increases again. This is the analogue of the dotted line
curve of figure 4 except that figure 4 is drawn with a moving time origin.
After a characteristic curve turns upstream it will, at some time, intersect a characteristic curve of
higher flow moving downstream so as to create a shock. In figure 6 it appears as if all the
characteristics for 1 < q* < 2 intersect at (nearly) the same point. The shock actually starts at the
first time that ∂x*(t*; q*)/∂q* vanishes. From (3.4) we see that
∂x*(t*; q*)/∂q* = e^(-t* + q*) + e^(t* - q*) - q* e^(t* - q*)
                = -e^(t* + q*) [(q* - 1) e^(-2q*) - e^(-2t*)].
The earliest that this can vanish is at that value of q* for which (q* - 1) exp(-2q*) has a maximum,
namely at
q1* = 3/2,  t1* = (3/2) + (1/2) ln 2 = 1.847,  x1* = 2^(3/2) = 2.83.    (3.7)
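The onset point (3.7) can be cross-checked numerically. In the Python sketch below, the condition ∂x*/∂q* = 0 is solved for t* as a function of q*, and the earliest such time is located by a ternary search; the search method and the function names are assumptions made for illustration.

```python
import math


def x_char(t, q):
    """Characteristic curve (3.4)."""
    return math.exp(-t + q) + q * math.exp(t - q)


def crossing_time(q):
    """t* at which dx*/dq* = 0 for q* > 1, from exp(-2t) = (q - 1) exp(-2q)."""
    return q - 0.5 * math.log(q - 1.0)


def shock_onset(lo=1.0 + 1e-9, hi=10.0, iterations=200):
    """Minimise crossing_time over q* > 1; it is convex there, so ternary search applies."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if crossing_time(m1) < crossing_time(m2):
            hi = m2
        else:
            lo = m1
    q1 = 0.5 * (lo + hi)
    t1 = crossing_time(q1)
    return q1, t1, x_char(t1, q1)
```

Calling `shock_onset()` should reproduce q1* = 3/2, t1* = 3/2 + (1/2) ln 2 and x1* = 2^(3/2) to within the accuracy of the search.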
To analyse the behavior of the characteristics in the vicinity of the point x1*, t1*, it is
advantageous to make a power series expansion of (3.4) in powers of x' = x* - x1*, t' = t* - t1*, and
q' = q* - q1*:
x' = 2^(1/2) t'(1 + t' + t'^2/6 + ...) + (2^(1/2) q'^3/3)(1 - q'/4 + ...)
     - 2^(1/2) t'q'(1 + q'/2 - q'^2/6 + ...) + (t'^2 q'^3/(3·2^(1/2)))(1 - q'/4 + ...) + ...    (3.8)
The first term on the right hand side of (3.8) is an expansion in powers of t' of the characteristic
curve for q' = 0. The second term is a power series expansion in q' for t' = 0. Successive terms give
power series expansions in q' of the coefficients of powers of t', t'^2, etc. (for powers of q' of at least
1).
Since the q1*, t1*, x1* were chosen so that ∂x'/∂q' = 0 at t' = 0, q' = 0, we knew that the second term
of (3.8) would not contain a term proportional to q'. We would have expected this term to start
with a (positive) term proportional to q'^2, but, by some coincidence, this term is also missing. The
second term of (3.8) starts with a term proportional to q'^3. This is the reason why it appears in
figure 6 as if a wide range of characteristic curves are crossing at (nearly) the same point x1*, t1*.
Also in (3.8) the fourth term is proportional to q'^3.
Equations (3.4), (3.6) or (3.8) define x* or x' as a single-valued function of q* and t*, but, to
determine the path of the shock, one must determine the flow (and corresponding densities) on
either side of the shock, i.e. q* as a function of x* and t*. From figure 6 it would appear that, in the
region where the characteristic curves intersect, there are actually three values of q* at each point
x*, t*.
In (3.8) it is not a priori obvious how many terms one needs to retain in a first approximation
because one does not know the relevant relative magnitudes of t' vs q'. If one were to keep only
terms linear or quadratic in t' and q', one might approximate (3.8) by
x' = 2^(1/2) t'(1 + t') - 2^(1/2) t'q' + ...
This could then be solved for q' as a function of x' and t'
q' = -x'/(2^(1/2) t') + 1 + t' + ...
This describes what appears to occur in figure 6. Over some range of (small) q', the characteristic
curves (nearly) pass through x' = 0, t' = 0. The value of q' at any point (x', t') depends mostly on
the slope x'/t' of the line from x' = 0, t' = 0. But this also shows that the relevant magnitudes of t'
and q' are not comparable. One is interested in values of q' large compared with t' and should retain
at least the term 2^(1/2) q'^3/3 in (3.8).
Along the characteristic curve
x' = 2^(1/2) t'(1 + t' + t'^2/6 + ...)
corresponding to q' = 0, (3.8) gives
0 = 2^(1/2) q'^3/3 - 2^(1/2) t'q' + ...
which, as an equation for q', is satisfied not only for q' = 0 but also for
q' = ±(3t')^(1/2).    (3.9)
Indeed these represent (to the lowest approximation) the other two values of q' along the
characteristic curve for q' = 0.
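The estimate (3.9) can be compared with the exact characteristics (3.4). The Python sketch below picks a point on the q' = 0 characteristic shortly after the shock forms and finds the two other flows whose characteristics pass through the same point. Bisection and the bracketing intervals (0.3 to 3 times (3t')^(1/2) on each side) are assumptions chosen for this illustration, suitable for small t' such as 0.01.

```python
import math

Q1 = 1.5                          # q1* from (3.7)
T1 = 1.5 + 0.5 * math.log(2.0)    # t1* from (3.7)


def x_char(t, q):
    """Characteristic curve (3.4)."""
    return math.exp(-t + q) + q * math.exp(t - q)


def bisect(f, lo, hi, tol=1e-13):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def side_roots(t_prime):
    """Return the two nonzero q' whose characteristics meet the q' = 0 curve at t'."""
    t = T1 + t_prime
    x0 = x_char(t, Q1)
    r = math.sqrt(3.0 * t_prime)  # leading-order estimate (3.9)

    def mismatch(q):
        return x_char(t, q) - x0

    lower = bisect(mismatch, Q1 - 3.0 * r, Q1 - 0.3 * r)
    upper = bisect(mismatch, Q1 + 0.3 * r, Q1 + 3.0 * r)
    return lower - Q1, upper - Q1
```

For t' = 0.01 the two roots should lie close to -(3t')^(1/2) and +(3t')^(1/2), roughly -0.17 and +0.17, with the small asymmetry coming from the higher-order terms of (3.8).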
The shock first forms at t' = 0 as an "infinitesimal" shock between the values q' = ±(3t')^(1/2) and, as
such, it travels at essentially the wave velocity for q' = 0, dx*/dt* = 2^(1/2). But even as the jump in
flow increases with t', the shock velocity between the flows q* = q1* + (3t')^(1/2) and q1* - (3t')^(1/2) will
stay very close to the wave velocity at the average of these two flows, namely at the wave
velocity for q' = 0. Thus the shock path will, in turn, stay very close to the characteristic curve for
q' = 0 even though the amplitude of the shock is increasing rapidly with time.
Starting with (3.9) as a first approximation one can iteratively obtain higher order approximations
by including other terms from (3.8). Figure 7 shows a magnified view (by a factor of 10) of the
characteristics in the vicinity of x1*, t1*. The cross indicates the place where the shock begins. Note
that there is a rather large curvature of the shock path near t1*; the shock velocity increases from
2^(1/2) = 1.41 at t* = 1.85 to about 2 for t* = 2.0.
To determine the path of the shock over long distances it is advantageous to introduce a
dimensionless cumulative flow as in (2.12), but now the dimensionless density is given by
k*^2(x*, t*) = (1/4)x*^2 - q*(x*, t*)    (3.10)
rather than (2.13).
Along a characteristic curve of flow q*
dA*(x*, t*)/dt* = k* dx*/dt* + q*
with
dx*/dt* = -∂q*/∂k* = 2k*
so
dA*(x*, t*)/dt* = 2k*^2 + q* = (1/2) x*^2(t*; q*) - q*.
Substitution of (3.4) for x* gives
dA*(x*, t*)/dt* = (1/2) e^(-2t* + 2q*) + (1/2) q*^2 e^(2t* - 2q*)
and integration of this with respect to t* along the curve of constant q* gives
A*(x*, t*) = (1/4)[-e^(-2t* + 2q*) + q*^2 e^(2t* - 2q*) + D*(q*)]    (3.11)
for some integration constant D*(q*) for each q*.
We can arbitrarily specify that for q* = 0 and t* → +∞,
A*(x*, t*) → 0 so that D*(0) = 0 and
A*(x*, t*) = -(1/4) e^(-2t*), for q* = 0,    (3.12)
or from (3.4)
A*(x*, t*) = -(1/4)x*^2, for q* = 0.    (3.13)
[Fig. 7. Curves of constant flow, a magnified view]
The boundary conditions specify that, for any large fixed x*, q*(x*, t*) is linearly increasing in t*
on the upstream side of the shock, so
A*(x*, t*) = -(1/4)x*^2 + (1/2)(t* + ln x*)^2 for large x*.    (3.14)
From this one can show that
D*(q*) = -2q* + 2q*^2
so that (3.11) determines A*(x*, t*) parametrically as a function of q* along the characteristic
curve (3.4) describing x*(t*; q*) also as a parametric function of q*. Where characteristic curves
intersect each other giving multiple values of q* for the same x*, t*, there will also be multiple
values of A*(x*, t*). The "correct" A*(x*, t*) is the smallest of its multiple values and the shock
path is where the A*(x*, t*) becomes multiple-valued (Newell 1993 I).
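The parametric form of A* can be checked against the relations it was built from. The Python sketch below evaluates (3.11) with the stated D*(q*) and compares it with the far-upstream form (3.14); the function names are hypothetical and the sample points used in any check are arbitrary.

```python
import math


def x_char(t, q):
    """Characteristic curve (3.4)."""
    return math.exp(-t + q) + q * math.exp(t - q)


def cumulative_flow(t, q):
    """A* along the characteristic of flow q*, from (3.11) with D*(q*) = -2q* + 2q*^2."""
    d_star = -2.0 * q + 2.0 * q * q
    return 0.25 * (-math.exp(-2.0 * t + 2.0 * q)
                   + q * q * math.exp(2.0 * t - 2.0 * q)
                   + d_star)


def cumulative_flow_far_upstream(x, t):
    """A* from (3.14), intended for large x* on the upstream side of the shock."""
    return -0.25 * x * x + 0.5 * (t + math.log(x)) ** 2
```

A point is described by the pair (t*, q*); its location follows from `x_char(t, q)`. Far upstream (large negative t*), `cumulative_flow(t, q)` and `cumulative_flow_far_upstream(x_char(t, q), t)` should nearly coincide.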
To determine the path of the shock for large t*, it suffices to note that, for large positive t*, q* is
nearly zero for all x* on the downstream of the shock and A*(0, t*) is also nearly zero for large t*.
If, as in section 2, we integrate (2.12) along a line of constant t*, but now with the density given
by (3.10) for q*(x*, t*) = 0, we have
A*(x*, t*) = ∫_0^{x*} k*(z) dz = (1/2) ∫_0^{x*} z dz = (1/4)x*^2.    (3.15)
The path of the shock is now obtained by equating (3.14) and (3.15),
(t* + ln x*)^2 = x*^2,
or
t* = x* - ln x*.    (3.16)
Actually the formula (3.16) is fairly accurate soon after the shock forms. One can show that to a
second approximation
t* = x* - ln x* - x* e^(-x*).
The additional term is negligible for x* > 4, only about one distance unit after the shock forms.
Figure 6 shows the shock path.
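For plotting or tabulating the long-distance shock path, (3.16) can be evaluated directly and inverted numerically. The Python sketch below is an illustration; the bisection inverse and the function names are assumptions, and the formula is only expected to be accurate some distance beyond the onset point (3.7).

```python
import math


def shock_time(x):
    """Asymptotic shock path (3.16): time t* at which the shock reaches x*."""
    return x - math.log(x)


def shock_position(t, tol=1e-12):
    """Invert (3.16) for x* >= 1, where x* - ln x* is increasing in x*."""
    if t < 1.0:
        raise ValueError("(3.16) has no solution with x* >= 1 for t* < 1")
    lo, hi = 1.0, 2.0
    while shock_time(hi) < t:
        hi *= 2.0                 # expand the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shock_time(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Evaluated at the onset location x* = 2^(3/2), (3.16) gives a time within about 0.06 of t1* = 1.847, consistent with the remark that the formula becomes fairly accurate soon after the shock forms.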
As with the triangular q - k relation, one might also like to describe the vehicle trajectories. For any
dimensionless q*(x*, t*) as illustrated in figures 6 and 7, however, there is a two-parameter family
of possible trajectories. For the triangular q - k relation there was only a one-parameter family in
the transformed coordinate system but in the original coordinate system the trajectories would also
depend on the free-flow speed v0.
The dimensionless cumulative flow defined by (3.11) is a cumulative flow less a dimensionless
background flow of q0* and density k0* so the actual dimensionless cumulative flow is
A*(x*, t*) + q0* t* + k0* x*    (3.17)
with
q0* = 2(AC)^(1/2) q0/B,  k0* = 2^(1/2) A^(1/4) C^(3/4) B^(-1/2) k0.
Although there is only one dimensionless A*(x*, t*), there are two arbitrary parameters q0* and
k0* in (3.17) (regardless of how these may be related to the q0 and k0).
The exact parametric representation of A*(x*, t*) along characteristic curves is not very
convenient for the evaluation of trajectories but, for most trajectories which pass through the
shock, (3.15) is valid downstream of the shock. The trajectory which passes x* = 0 at time t0*
satisfies the equation
q0*(t0* - t*) - k0* x* = x*^2/4.    (3.18)
Thus the deviation of a trajectory from a straight line is proportional to x*^2 (instead of x*^3 as in
(2.18b)). Upstream of the shock, according to (3.14), this trajectory would satisfy
q0*(t0* - t*) - k0* x* = -x*^2/4 + (t* + ln x*)^2/2.    (3.18a)
4. SOME ESTIMATES
There is no guarantee that traffic behaves in the manner described here, but, for the theory to be
plausible, the various scaling parameters should have values in a reasonable range.
In a typical rush hour we might expect the flow per lane to increase by about 1000 cars/hr
over a period of about 1 hour. Thus a reasonable value of B would be about 10^3/hour^2. For the
triangular q - k relation a plausible value of w0 is about 1/10 hr/km (a wave velocity of 10 km/hr). A
"typical" value of A in (1.1) is not as well defined but it might be helpful to relate the A to a
distance L* required for the capacity (1.1) to change by 100 vehicles/hr from its value at x = 0, i.e.
let
100/hr = A L*^2,  A = 100/(L*^2 hr).
With these values of A, B, and w0, (2.5) gives
|x1| = L*^2/(2 km),  t1' = L*^2 hr/(40 km^2),  q1 - q0 = 25 L*^2/(km^2 hr).
Thus, if L* is measured in km, |x1| = L*^2/2 (in km).
If the capacity of the road section did vary as in (1.1), a value of L* = 1 km would certainly
describe a very slowly changing capacity for typical roads. The prediction here is that, for L* = 1
km, a queue would first form about 1/2 km upstream of the bottleneck (a rather substantial
distance) at a time t1' of about 1.5 minutes (1/40 hr) and q1 - q0 = 25 veh./hr.
The value of x1, being proportional to L*^2, decreases rapidly with decreasing L*. If L* is reduced
to 1/2 km, x1 becomes only 1/8 km.
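The arithmetic behind these estimates can be reproduced with a few lines of Python. The sketch below evaluates the shock-onset quantities for the triangular relation from B, w0 and the distance L*; the default values are the ones quoted in the text, while the function name and the unit conventions (km, hr, veh/hr) are assumptions of this illustration.

```python
def triangular_estimates(L_star_km, B=1000.0, w0=0.1, dQ=100.0):
    """Return (distance upstream in km, time in minutes, flow excess in veh/hr).

    B  : rate of flow increase, veh/hr per hr
    w0 : wave pace, hr/km
    dQ : capacity change in veh/hr reached at distance L* from the bottleneck
    """
    A = dQ / L_star_km ** 2              # veh/hr per km^2, from dQ = A L*^2
    x1_km = w0 * B / (2.0 * A)           # |x1|
    t1_hr = w0 ** 2 * B / (4.0 * A)      # t1'
    dq1 = B * t1_hr                      # q1 - q0
    return x1_km, 60.0 * t1_hr, dq1
```

With the defaults, `triangular_estimates(1.0)` gives about 0.5 km, 1.5 minutes and 25 veh/hr, and `triangular_estimates(0.5)` gives a distance of about 0.125 km.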
For the parabolic q - k relation a typical value of C in (1.4) would be such that the flow would
drop by about 2000 veh./hr from capacity at a density displacement of about 100 veh./km. Thus,
2000/hr = C(100)^2/km^2, C ~ (1/5) km^2/hr.
For the same values of B and A as above, (3.5) and (3.7) predict that the shock forms at
x_1 ~ B^(1/2) C^(-1/4) A^(-3/4) ~ (1.5) L*^(3/2) (km)^(-1/2).
This value of x_1 will typically be considerably larger than for the triangular q - k relation because
the parabolic q - k relation provides less room to store excess vehicles near the bottleneck (but
more space far from the bottleneck). For L* = 1 km, the value of x_1 is about 1.5 km. The value of x_1
decreases with decreasing L*, proportional to L*^(3/2), but not as rapidly as for the triangular q - k
relation. For L* = (1/2) km, x_1 is still about (1/2) km.
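The arithmetic behind these estimates can be checked numerically. The following Python sketch is an illustration only: the function names are hypothetical, B = 1000/hr^2 and C = 1/5 km^2/hr are the values quoted above, and A is taken as 100/L*^2 per hr with L* in km.

```python
import math


def capacity_curvature(l_star_km, change_veh_per_hr=100.0):
    """A such that the capacity changes by 100 veh/hr over the distance L*."""
    return change_veh_per_hr / l_star_km**2


def shock_origin_triangular(l_star_km):
    """x_1 in km for the triangular q - k relation: x_1 = L*^2 / 2."""
    return l_star_km**2 / 2.0


def shock_origin_parabolic(l_star_km, b=1000.0, c=0.2):
    """x_1 in km for the parabolic q - k relation: B^(1/2) C^(-1/4) A^(-3/4)."""
    a = capacity_curvature(l_star_km)
    return math.sqrt(b) * c**-0.25 * a**-0.75


# Compare the two q - k relations for the two values of L* used in the text.
for l_star in (1.0, 0.5):
    print(l_star,
          shock_origin_triangular(l_star),
          round(shock_origin_parabolic(l_star), 2))
```

For L* = 1 km and L* = 1/2 km the sketch gives about 1.5 km and 0.53 km for the parabolic relation, against 1/2 km and 1/8 km for the triangular relation.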
5. CONCLUSIONS
We have described here an analytic solution of the problem posed in the introduction. No claim is
made that the theory described here is consistent in detail with how traffic actually behaves but
any attempt to verify the L-W theory should recognize that a shock wave does not necessarily
start at the bottleneck itself. It may begin at some point upstream of the bottleneck, at a location
depending on how the capacity varies with location (the A of eq. (1.1)), the rate at which the flow
increases (the B of eq. (1.2)), and the shape of the q - k curve near its maximum. Despite all the
parameters in the formulation of the problem, the final evaluation of the flow pattern is expressed
in terms of a single dimensionless form. All the parameters are absorbed in coordinate
transformations.
Installations of vehicle detectors at multiple locations along freeway sections at various places
throughout the world are presently providing very detailed information about traffic flows on
freeways. Data from these systems will soon provide a means of testing the validity or deficiencies
of various models of traffic flow, including the L-W theory, but to deal with masses of data at
different locations will require special techniques of analysis.
Many parameters are needed to characterize any particular set of observations and to compare one
set of observations with another. Even if the L-W theory should fail to describe correctly some
features of the flow pattern, maybe some of the dimensional arguments will still be valid. If one
makes observations at one location having a certain set of parameters, such as the A, B, q0, k0, etc.
here, and then observations at other locations with different values of these parameters, but one
presents the data in terms of the rescaled coordinates, the A*(x*, t*) for example, then maybe one
will obtain nearly the same A*(x*, t*) for all locations. This, in itself, would mean that observations
at one location can be used to predict behavior at another location having different parameter
values.
The trick of comparing observations at two or more locations by measuring time relative to a
coordinate system moving with the free speed has been exploited before. If there are no "delays",
an A'(x, t') derived from (2.2) would be independent of x. The trick of rescaling time and/or space
variables involves nothing more than relabeling the coordinate scales of x and/or t on a graph or
drawing graphs for two different locations on different physical coordinate scales.
The trick of subtracting some "background flow" from the actual flow or, equivalently, drawing
graphs of
A(x, t) - q0 t + k0 x (5.1)
for some possibly arbitrary values of q0 and k0 is less "well-known". This transformation does not
change the wave pattern or the characteristic curves and was exploited in section 3 to eliminate two
parameters from the dimensionless wave solution. This transformation, however,
does not eliminate these parameters from the trajectory pattern.
Cassidy and Windover (1995) have subtracted a background flow as in (5.1) from experimental
data, not to compare observations at different locations, but simply as a tool to "magnify"
observed flow variations in space and time. If, by appropriate choice of q0 and k0, one can greatly
reduce the magnitude of (5.1), then one can draw the graph on a magnified scale. The effect is quite
dramatic.
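The magnifying effect of (5.1) can be seen in a small numerical sketch. The Python below is illustrative only: the cumulative counts, the background flow q0 = 1800 veh/hr and the function name are made up for the example and are not taken from any of the cited data.

```python
def rescaled_counts(counts, times_hr, x_km, q0, k0):
    """Return A(x, t) - q0*t + k0*x for one detector located at x."""
    return [a - q0 * t + k0 * x_km for a, t in zip(counts, times_hr)]


# Hypothetical cumulative vehicle counts at x = 0, one sample per minute.
times = [i / 60.0 for i in range(6)]   # hours
counts = [0, 31, 61, 90, 121, 150]     # vehicles
q0 = 1800.0                            # assumed background flow, veh/hr

residual = rescaled_counts(counts, times, x_km=0.0, q0=q0, k0=0.0)
print(max(counts) - min(counts))        # range of the raw cumulative curve
print(max(residual) - min(residual))    # range after subtracting q0*t
```

With these made-up counts the raw curve spans 150 vehicles while the residual stays within about one vehicle, so the residual can be drawn on a much finer vertical scale.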
REFERENCES
Cassidy, M. J. and J. R. Windover (1995). A methodology for assessing the dynamics of freeway
traffic flow. Transportation Research Record (Washington, D.C.) (in press).
Lighthill, M. J. and G. B. Whitham (1955). On kinematic waves II. A theory of traffic flow on long
crowded roads. Proc. Royal Soc. (London) A 229, 317-345.
Newell, G. F. (1993). A simplified theory of kinematic waves in highway traffic, I general theory, II
queueing at freeway bottlenecks, III multi-destination flows. Transpn. Res. 27B, 281-287,
289-303, 305-313.
Theory of Congested Traffic Flow 147
THEORY OF CONGESTED TRAFFIC FLOW: SELF-ORGANIZATION WITHOUT BOTTLENECKS
Boris S. Kerner, DaimlerChrysler AG, FT1/V, HPC: E224, 70546 Stuttgart, Germany
ABSTRACT
Results of experimental observations of phenomena of self-organization in traffic flow on
German highways are presented. The observations suggest that there are at least two
phenomena of 'self-organization without bottlenecks' in real traffic flow: (i) The spontaneous
formation of a local region of synchronized traffic flow in an initially free traffic flow and (ii)
The spontaneous formation of a traffic jam in synchronized traffic flow. A theory of
congested traffic flow which may qualitatively explain these and other diverse effects of self-
organization in real traffic flow is discussed.
1. INTRODUCTION
1.1. Self-Organization
Self-organization, which is usually viewed as the spontaneous formation and evolution of
different patterns in a non-linear system, is a common phenomenon in many physical systems,
biological systems and chemical reactions (e.g., Nicolis and Prigogine, 1977; Kerner and
Osipov, 1994). The ideas of self-organization have also been applied in different theories to
explain phenomena observed in traffic flow. Indeed, as early as 1958 Chandler, Herman and
Montroll (1958) and Komentani and Sasaki (1958), and later other authors (e.g., Prigogine
and Herman, 1971; Kühne, 1991; Schreckenberg et al., 1995; Helbing, 1997) proposed that
there is a density range where homogeneous states of traffic flow cannot exist, due either to an
instability or to some other kind of phase transition. Therefore a sequence of traffic jams has to
occur spontaneously - the so-called 'stop and go' phenomenon, which is often observed in
real traffic flow (e.g., Treiterer, 1975; Koshi et al., 1983).
A different scenario of self-organization for jam formation was proposed in 1994 by
Kerner and Konhauser (1994), and later in other papers (Bando et al., 1995; Krauß et al.,
1997; Barlovic et al., 1998): Before the density range mentioned above is reached, there
should be a broad range of lower densities where homogeneous states of free traffic flow are
metastable states (see an explanation of 'metastable states' below in Sect. 1.4) and 'the local
cluster effect' leading to jam formation can occur (a comparison of the local cluster effect in
different traffic models can be found in (Herrmann and Kerner, 1998; Kerner, 1998b)).
In contrast, there are alternative theories which claim that there are no self-organization
processes in traffic flow and that all phenomena of the formation of spatial-temporal patterns
in traffic flow are without exception determined by the influence of on- and off-ramps or other
freeway bottlenecks. As a consequence of these theories it is also often claimed that the spatial-
temporal behavior of traffic flow can be described satisfactorily by the well-known classical
Lighthill-Whitham-theory of traffic flow (see Whitham, 1974), which is based on the
Lighthill-Whitham-model or by further developments of such 'first order' traffic flow models
where no self-organization processes are possible (e.g., Daganzo, 1997; Daganzo et al., 1998).
1.2. Phenomena of Pattern Formation in Traffic Flow
Only the results of experimental observations of real traffic flow can answer the question of
which traffic flow theories may actually explain related phenomena in real traffic flow. One
may mention two well-known phenomena experimentally observed in real traffic flow:
A. The occurrence of spatial-temporal patterns consisting of different traffic jams - the 'stop
and go' phenomenon (e.g., Treiterer, 1975; Koshi et al., 1983) and
B. The breakdown phenomenon in the vicinity of a freeway bottleneck, which is often
accompanied by the related 'capacity drop' (Fig. 1(a)) (e.g., Agyemang-Duah and Hall,
1991; Cassidy and Bertini, 1998; Persaud et al., 1998).
Are these phenomena the result of some self-organization processes or are they only the result
of an initial permanent non-homogeneity caused by freeway bottlenecks?
Recently the nature of the breakdown phenomenon in the vicinity of a freeway bottleneck has
been disclosed (Kerner and Rehborn, 1997): This breakdown phenomenon is linked to the
occurrence of a first order local phase transition 'Free Flow => Synchronized Flow' (the
definition of synchronized traffic flow will be given in Sect. 1.6). Such local phase transitions
are one phenomenon of self-organization. Kerner (1998c) discovered the nature of the 'stop-
and-go' phenomenon: It turned out that this phenomenon is linked to the double (cascade)
phase transitions 'Free Flow => Synchronized Flow => Traffic Jams' and the 'pinch effect' in
synchronized flow, which are both very complex phenomena of self-organization.
1.3. About the Role of Non-Homogeneity in Self-Organization Processes
A freeway bottleneck usually causes a permanent non-homogeneity in traffic flow (e.g., Koshi
et al., 1983; May, 1990; Daganzo, 1997; Cassidy and Bertini, 1998). On the one hand, it is
well-known from investigations of physical, biological and chemical systems (Kerner and
Osipov, 1994) that a permanent non-homogeneity acts as 'a permanent nucleus' for phase
transitions: Spontaneous formation of a spatial-temporal pattern occurs considerably more
frequently in the vicinity of the permanent non-homogeneity. Such phase transitions can occur
in a deterministic way, i.e., even if fluctuations are negligible. On the other hand, this
deterministic effect is also one phenomenon of self-organization (Kerner and Osipov, 1994).
Indeed, the reason for the spontaneous occurrence and the main non-linear properties of
spatial-temporal patterns are nevertheless determined by intrinsic non-linear properties of a
system. In other words, phenomena of self-organization can occur in such systems (although
usually considerably more rarely) even when no initial permanent non-homogeneity exists.
The aim of this paper is to show that self-organization in experimentally observed traffic flow
occurs outside any bottlenecks. To show such 'self-organization without bottlenecks' in real
traffic flow, one needs to perform experimental observation of the behavior of traffic flow on
a long enough highway section where no on- and off-ramps and other bottlenecks exist (Sect.
2). A qualitative theory of congested traffic flow (Kerner, 1998a, b, 1999) which may explain
phenomena of self-organization will be considered in Sect. 3. However, before the new results
are considered (Sect. 2), some recent experimental observations which are an 'indirect' proof
of 'self-organization without bottlenecks' (Sect. 1.4), some additional definitions (Sect. 1.5),
and the established features of synchronized flow (Sect. 1.6) should be briefly
reviewed.
1.4. Metastable States of Free Traffic Flow and Characteristic Parameters of Wide Jams
as 'Indirect' Proof of 'Self-Organization Without Bottlenecks'
It should be noted that an 'indirect' proof of the existence of 'self-organization without
bottlenecks' in real traffic flow has already been given by Kerner and Rehborn (1996a, 1998a)
from their experimental investigation of the propagation of 'wide' jams on highways. A wide
traffic jam is a jam whose width, i.e., the longitudinal distance between jam fronts, is
considerably greater than the widths of the jam's fronts. A jam front is a region where the flow
rate, the density and the average vehicle speed change sharply in space. Note that the term
'jam front' is used instead of 'shock wave' because the term shock wave in traffic flow
theories is usually associated with the classical Lighthill-Whitham theory of shock waves (see
Whitham, 1974). However, as has recently been shown (Kerner et al., 1997), the
Lighthill-Whitham theory and other 'first order models' cannot explain the existence of the
characteristic parameters of wide jams observed in experiments (Kerner and Rehborn, 1996a).
Indeed, when wide jams propagate outside bottlenecks and free flow is formed in the outflow
of the jam, it has been found (Kerner and Rehborn, 1996a) that there are characteristic
(unique, or coherent) parameters of traffic flow: (i) the velocity v_g of the downstream front of
a jam, (ii) the jam density ρ_max, and (iii) the flow rate q_out, the vehicle density ρ_min and the
average vehicle speed in the free flow which is formed downstream of a wide jam. The
characteristic parameters do not depend on initial conditions. In particular, it has been found
(Kerner and Rehborn, 1996b) that, independently of initial conditions, the characteristic
parameters were spontaneously self-formed during the development of any jam whose width
monotonically increased in time. This process of self-organization cannot be explained by any
'first order' models (e.g., Daganzo, 1997; Daganzo et al., 1998); on the contrary, it can be
explained by 'second order' models (Kerner and Konhauser, 1994; Kerner et al., 1997).
Such a stationary propagation of the downstream front of a wide jam can be represented by a
line in the flow-density plane. This characteristic line for the downstream front of the jam,
which will be called 'the line J', passes through the points (ρ_min, q_out) and (ρ_max, 0) in the
flow-density plane; the slope of 'the line J' is equal to the mean value of the velocity of the
downstream front v_g (Fig. 1(b), line J). It should be noted that 'the line J' is not a part of the
fundamental diagram: It represents the characteristics of the downstream front of a wide jam.
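The geometry of the line J can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the function names are hypothetical, and the numerical values in the usage lines (density of 20 veh/km in the jam outflow, outflow of 1800 veh/h, jam density of 140 veh/km) are invented for the example and are not measurements from this paper.

```python
def line_j_slope(rho_min, q_out, rho_max):
    """Slope of the line through (rho_min, q_out) and (rho_max, 0), in km/h.

    With densities in veh/km and flow rates in veh/h this slope is the
    velocity of the downstream jam front; it is negative because the
    front moves upstream.
    """
    return (0.0 - q_out) / (rho_max - rho_min)


def flow_on_line_j(rho, rho_min, q_out, rho_max):
    """Flow rate on the line J at the density rho."""
    return q_out + line_j_slope(rho_min, q_out, rho_max) * (rho - rho_min)


def is_above_line_j(rho, q, rho_min, q_out, rho_max):
    """True if the state (rho, q) lies above the line J."""
    return q > flow_on_line_j(rho, rho_min, q_out, rho_max)


# Illustrative values only (assumed, not measured).
v_g = line_j_slope(rho_min=20.0, q_out=1800.0, rho_max=140.0)
print(v_g)
print(is_above_line_j(80.0, 1200.0, 20.0, 1800.0, 140.0))
```

The same comparison of a measured state with the line J is what is meant when states of synchronized flow are said to lie above the line J in the flow-density plane.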
The other result of experimental observations of wide jams (Kerner and Rehborn,
1996a) is that the maximal possible flow rate in free flow q_max^(free) can be considerably higher
than the flow rate out of a wide jam q_out (Fig. 1(b)):
q_max^(free) > q_out. (1)
Therefore, in the range of the flow rate (and in the corresponding range of the vehicle density)
(Fig. 1(b))
q_out ≤ q ≤ q_max^(free) [ρ_min ≤ ρ ≤ ρ_max^(free)] (2)
at any chosen average vehicle density there are at least two different states of traffic: (i) free
traffic flow and (ii) wide jams. It must be noted that the existence of this range of the density
(2) is not linked to a freeway bottleneck. Therefore, a local perturbation in an initially
homogeneous (on average) free traffic flow can cause a jam to appear, if the amplitude of
this local perturbation is high enough. Because there is always a finite probability of the
spontaneous occurrence of such a fluctuation in traffic flow outside any bottleneck, a traffic
jam can spontaneously occur in traffic flow without any influence from a freeway bottleneck,
i.e. 'self-organization without bottlenecks' in traffic flow can really occur.
[Figure 1: panels (a) and (b) in the flow-density plane; axes: density and flow rate; labels q_max^(free), ρ_max^(free) and ρ_max.]
Fig. 1. A possible shape of the fundamental diagram of traffic flow (a) (e.g., Brannolte, 1991;
Ceder, 1976; Hall, 1987; Hall et al., 1986; Agyemang-Duah and Hall, 1991; Koshi et al.,
1983) and (b) - the concatenation of states of free flow with the characteristic line for the
downstream front of a wide traffic jam ('the line J') (Kerner and Rehborn, 1996a; Kerner and
Konhauser, 1994).
As in physical, biological and chemical systems, states of free traffic flow in the range
(2) may be called metastable states. Recall that a metastable state of a spatial system is stable
with respect to any infinitesimal perturbations. However, if the amplitude of a local
perturbation exceeds some critical amplitude, this local critical perturbation begins to grow.
The same definitions may be applied to traffic flow where local perturbations (fluctuations) of
traffic variables (vehicle speed, density and flow rate) usually occur. As follows from (2),
the flow rate and the density q_b = q_out, ρ_b = ρ_min are the boundary (threshold) values
which separate stable states of free flow and metastable states of free flow with respect to
jam formation (Kerner and Konhauser, 1994). This means that at q < q_b (ρ < ρ_b) no jams can
exist for a long time or be excited in free flow (Fig. 2).
[Figure 2: panel (a), critical amplitude of local perturbation versus density (flow rate); panel (b), probability of first order local phase transition versus density (flow rate); the threshold ρ_b or ρ_s is marked on each density axis.]
Fig. 2. Explanation of local first order phase transitions: (a) A qualitative shape of the
dependence of the critical amplitude of a local perturbation on the density (see Fig. 6(a) in Kerner and
Konhauser, 1994); (b) the dependence of the probability of first order phase transitions which
is related to Fig. 2(a) (see Figs. 3 and 4 in Persaud et al., 1998). Threshold vehicle densities in
free flow are designated as ρ_b for the phase transition 'Free Flow => Jam' and as ρ_s for the
phase transition 'Free Flow => Synchronized Flow'.
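The distinction between stable and metastable free flow with respect to jam formation, as described in Sect. 1.4, can be written as a simple classification. The Python below is a sketch only: the function name and the numerical values in the usage lines are hypothetical, and the thresholds are the flow rate out of a wide jam and the maximal flow rate in free flow.

```python
def classify_free_flow(q, q_out, q_max_free):
    """Classify a free-flow state by its flow rate q (all rates in veh/h).

    Below the threshold q_b = q_out no jam can persist, so free flow is
    stable; between q_out and the maximal free-flow rate it is metastable.
    """
    if q < q_out:
        return "stable"
    if q <= q_max_free:
        return "metastable"
    return "not attainable in free flow"


# Illustrative thresholds only (assumed, not measured).
for q in (1500.0, 2100.0, 2600.0):
    print(q, classify_free_flow(q, q_out=1800.0, q_max_free=2400.0))
```

In the metastable range the outcome depends on the amplitude of a local perturbation, which this classification deliberately leaves out.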
1.5. Nucleation Effect and Local First Order Phase Transitions
The effect of the growth of a local perturbation in a metastable state of a system whose
amplitude exceeds the critical value, i.e., the growth of a critical local perturbation, is called
the nucleation effect. The critical local perturbation plays the role of a 'nucleation center' for
local phase transitions in an initial state of the system (e.g., Kerner and Osipov, 1994). A first
order local phase transition is a phase transition which occurs in a metastable state of a
distributed system and is caused by the nucleation effect. First order local phase transitions are
accompanied by a jumping (i.e., breakdown) behavior of variables of a system and hysteresis
effects. Some other general properties of first order local phase transitions are shown in Fig. 2:
The critical amplitude of the critical local perturbation is maximal at the threshold which
separates stable and metastable states of a system. This critical amplitude decreases as the
density (or the flow rate) deviates from the threshold value inside the metastable range.
Apparently, for traffic flow these properties were first discovered by Kerner and
Konhauser (1994) from their theoretical investigation of the phase transition 'Free Flow =>
Jam' (Fig. 2(a)). Note that from statistical physics it is known that the probability of the
spontaneous occurrence of a fluctuation in a distributed system decreases as
the amplitude of this fluctuation increases. Therefore, from Fig. 2(a) follows the other general property
of first order local phase transitions: The lower the amplitude of the critical perturbation is,
the higher is the probability of the spontaneous occurrence of a first order local phase
transition (Fig. 2(b)). Note that the nature of the well-known breakdown phenomenon at a
freeway bottleneck, as has recently been discovered (Kerner and Rehborn, 1997), is
linked to another first order local phase transition 'Free Flow => Synchronized Flow'.
Therefore, the latter phase transition should show qualitatively the same behavior as
shown in Fig. 2 (with a different threshold density ρ_s). The behavior of the probability of the
breakdown phenomenon at a freeway bottleneck shown in Fig. 2(b) was found by
Persaud et al. (1998), which confirms the mentioned conclusion by Kerner and Rehborn (1997).
Observations (Kerner and Rehborn, 1996b) show that there are three qualitatively different
phases of traffic flow: (i) free flow, (ii) synchronized flow and (iii) wide jams. As a result,
there may be three qualitatively different types of phase transitions: 1) 'Free flow <=> Jam', 2)
'Free flow <=> Synchronized flow', and 3) 'Synchronized flow <=> Jam(s)'. The spontaneous
occurrence of these phase transitions determines the complexity of traffic flow observed in
experiments. It is also linked to the result (Kerner and Rehborn, 1997; Kerner, 1998c) that all
these transitions belong to the same class of first order phase transitions, i.e., they are
accompanied by similar looking breakdown and hysteresis effects. Besides, for each of these
phase transitions the properties shown in Fig. 2 are valid. However, they must be
distinguished one from another because the non-linear features of these phase transitions are
qualitatively different. This differentiation determines one of the main difficulties in traffic
flow theory.
1.6. Properties of Synchronized Traffic Flow
In free traffic flow, due to the relatively low densities of vehicles, drivers on a multi-lane road
are able to change lanes and to pass. On the contrary, due to the higher density in 'congested'
flow, vehicles are hardly able to pass. As a result, a bunching of drivers both on each
individual lane and between different lanes of a highway can occur. Therefore, when all lanes
of a highway correspond to the same route without on- and off-ramps or other bottlenecks,
drivers move with nearly synchronized average speed on the different lanes of the highway. A
possibility of such 'synchronized' ('collective') flow which is related to a part of the
fundamental diagram (e.g., Fig. 1(a)) at higher density has been theoretically predicted by
Prigogine and Herman (1971). Synchronized flow has been observed by Koshi et al. (1983).
In congested flow a broad and complex spreading of measurement points which cover a two-
dimensional region on the flow-density plane is observed (e.g., Koshi et al., 1983). This
spreading is interpreted either as fluctuations, or as an instability, or else as a jam formation
(e.g., Hall, 1987; Helbing, 1997; Koshi et al., 1983; Hall et al., 1986).
It is well-known that in traffic flow with high density traffic jams can appear (e.g., Treiterer,
1975). Inside a jam both the vehicle speed and the flow rate are very low or even zero. Recall
that in synchronized traffic flow, on the contrary, the vehicle speed is relatively low (but
finite) and the flow rate can be nearly as high as the flow rate in free flow. This
'quantitative' difference, as follows from results of experimental observations, leads to
qualitatively different non-linear properties of traffic jams and of synchronized flow (Kerner
and Rehborn, 1996a, 1996b). Traffic flow with high density, where both traffic jams and
synchronized flow may occur, is called congested traffic flow.
Kerner and Rehborn (1996b) found that synchronized flow has totally different
dynamical properties from free flow. In particular, the multitude of states of free flow
may indeed be described by a curve on the flow-density plane, i.e., by the fundamental diagram
(Fig. 1(a), curve 'free'). On the contrary, even the multitude of homogeneous states of
synchronized flow (i.e., states which are homogeneous spatially and stationary in time; often
such states are called 'steady speed' states) covers a two-dimensional region in the
flow-density plane: A given vehicle speed (a steady speed) in a homogeneous state of synchronized
flow may be related to an infinite multitude of vehicle densities, and a given density may be
related to an infinite multitude of different speeds. In other words, the real dynamics of
synchronized flow cannot be described within the framework of the hypothesis of a fundamental
diagram: There is no fundamental diagram which is able to describe the properties even of the
multitude of homogeneous states of synchronized traffic flow (Fig. 3).
[Figure 3: panels (a) and (b) in the flow-density plane; axes: density and flow rate; labels q_max^(free), ρ_max^(free) and ρ_max.]
Fig. 3. Homogeneous states of traffic flow (Kerner and Rehborn, 1996b; Kerner, 1998a): (a)
Multitudes of homogeneous states of free (curve F) and of synchronized flow (hatched region)
on a multi-lane road, (b) A multitude of homogeneous states of flow on a one-lane road.
2. SELF-ORGANIZATION WITHOUT BOTTLENECKS

Between 1995 and 1998 all the above-mentioned types of phase transitions in traffic flow on the
German highways A5, A1, A3 and A44 were investigated on different days. Since it has
been found that the features of these phenomena are similar in all cases, some general
results may be illustrated by a representative data set measured on Monday, March 17, 1997 on
a section of the highway A5 (Fig. 4) (Figs. 5 - 9).
The section of the highway has three intersections with other highways (I1, "Friedberg", I2,
"Bad Homburger Kreuz" and I3, "Nordwestkreuz Frankfurt") and is equipped with 24 sets of
induction loop detectors (D1,...,D24) (Fig. 4). Each of the sets D4-D6, D12-D15, and D23, D24
consists of four detectors for a left (passing), a middle and a right lane, plus one for the lane
related to on-ramps or to off-ramps. The other sets of detectors are situated on the three-lane
road without on- and off-ramps, where each of them consists of three detectors only.
[Figure 4: schematic of the highway section showing the direction of traffic flow, the intersections I1-I3 and the detector sets D1-D24.]
Fig. 4. Schematic configuration of the section of the highway A5-South in Germany.
Each induction loop detector records the crossing of a vehicle and measures its crossing speed. A
local road computer calculates the flow rate and the average vehicle speed in one minute
intervals. The accuracy of the detectors according to official regulations is: better than 10% for
flow rates higher than 600 veh/h, better than 20% for flow rates lower than 600 veh/h;
better than 3% for vehicle speeds higher than 100 km/h; to within 3 km/h for vehicle
speeds lower than 100 km/h. If every vehicle during the averaging interval (one minute)
has a speed lower than 20 km/h, then the average speed is set to 10 km/h. Only if no
vehicle crosses a detector during the averaging interval is the average speed set to zero.
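The one-minute averaging rules just described can be sketched in Python as follows. This is an illustration, not the road computer's actual software: the function name is hypothetical, and the use of an arithmetic mean of the crossing speeds is an assumption, since the text does not say how the average is formed.

```python
def aggregate_minute(speeds_kmh):
    """Reduce the crossing speeds of one minute to (flow rate, average speed).

    speeds_kmh: crossing speeds in km/h of the vehicles that passed one
    detector during the one-minute interval.
    Returns the flow rate in veh/h and the reported average speed in km/h.
    """
    n = len(speeds_kmh)
    flow_veh_per_h = n * 60          # vehicles per minute scaled to one hour
    if n == 0:
        return flow_veh_per_h, 0.0   # no vehicle: average speed set to zero
    if all(v < 20.0 for v in speeds_kmh):
        return flow_veh_per_h, 10.0  # every vehicle below 20 km/h: set to 10
    # Assumption: arithmetic mean of the crossing speeds.
    return flow_veh_per_h, sum(speeds_kmh) / n


print(aggregate_minute([]))
print(aggregate_minute([5.0, 12.0, 18.0]))
print(aggregate_minute([100.0, 120.0]))
```

One consequence of these rules is that a reported speed of exactly 10 km/h marks very slow traffic, while a reported speed of zero marks an interval with no crossings at all.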
2.1. Experimental Properties of Phase Transitions 'Free Flow => Synchronized Flow'
outside Freeway Bottlenecks
Let us first consider the time dependence of the average vehicle speed (v) and flow rate (q) within
the time interval 06:20 - 06:40 (Fig. 5(a)) in the vicinity of the freeway bottleneck which exists
on the section of the highway due to the off-ramp inside the intersection I3 (D23, D22, Fig. 4)
and in the vicinity of the freeway bottleneck which exists due to the on-ramp inside the
intersection I2 (D15, D16, Fig. 4). It can be seen from the dependence of the average vehicle
speed (Fig. 5(a), left) and flow rate (Fig. 5(a), right) that in the whole time interval 06:20 -
06:40 free flow is realized in the vicinity of both freeway bottlenecks. On the contrary, away
from both bottlenecks some transitions between free and synchronized flow occur (D18, at
t ≈ 06:27, t ≈ 06:32 and t ≈ 06:36). As has already been mentioned in Sect. 1.6, in
synchronized flow the flow rate may be of the same order of magnitude as in free flow (Fig.
5(a), D18, right), but the vehicle speed is noticeably lower than in free flow and it is
approximately the same on different lanes of the highway (Fig. 5(a), D18, left, t>06:36).
To show that the transition from free flow to synchronized flow outside freeway bottlenecks,
which occurs at t=06:36 in the vicinity of the detectors D18 (Fig. 5(a), D18, left, up arrow), is a
local phase transition 'Free Flow => Synchronized Flow' outside freeway bottlenecks, spatial-
temporal distributions of vehicle speed and the flow rate both upstream and downstream from
the detectors D18 at later times should be studied. The results of such an investigation in the
time interval 06:35 - 06:50 are shown in Fig. 5(b). It can be seen from Fig. 5(b) that transitions
from free to synchronized flow both upstream (D17, D16, left in Fig. 5(b), up arrows) and
downstream (D19, D20, left in Fig. 5(b), up arrows) occur later than at the detectors D18 (up
arrow at t=06:36, left in Fig. 5(b)). Besides, the greater the distance from the detectors D18, the
later the transition from free to synchronized flow occurs. This conclusion is true both upstream
(D17, D16, left in Fig. 5(b), up arrows) and downstream (D19, D20, left in Fig. 5(b), up
arrows) of the detectors D18. This confirms that the transition from free flow to synchronized
flow first occurs only in the vicinity of the detectors D18, i.e., it is really a local phase transition
'Free Flow => Synchronized Flow' outside freeway bottlenecks.
[Figure 5(a): A5-South, 17.03.1997; vehicle speed v [km/h] (left) and flow rate q [veh/h] (right) versus time, 06:20 - 06:40, for the left, middle and right lanes; the panels include the detectors D23 (with D23-off), D16 and D15.]
Fig. 5(a). See caption to Fig. 5 (a, b) below.
Note that the transitions from free flow to synchronized flow, which occur upstream (D17,
D16, left in Fig. 5(b), up arrows) of the detectors D18 are linked to the appearance of a wave of
induced transitions from free flow to synchronized flow. Correspondingly, the transitions from free
to synchronized flow downstream of the detectors D18 (D19, D20, left in Fig. 5(b), up arrows)
are linked to a wave of the propagating synchronized flow. The propagation of synchronized
flow supplants free flow downstream. Indeed, as it has already been mentioned, the transitions
from free flow to synchronized flow, which occur upstream (D17, D16, left in Fig. 5(b), up
arrows) and downstream (D19, D20, left in Fig. 5(b), up arrows) of the detectors D18, occur
later than at the detectors D18. Besides, the greater the distance from the detectors D18 is, the
later they occur. Therefore, the transitions from free flow to synchronized flow which occur
upstream (D17, D16, left in Fig. 5(b), up arrows) and downstream (D19, D20, left in Fig. 5(b),
up arrows) are not local phase transitions, but induced transitions. The induced transitions
upstream and the propagating synchronized flow downstream cause a widening of synchronized
flow, which has first spontaneously occurred in the vicinity of the detectors D18: The region of
localization of synchronized flow is widening both upstream and downstream over time. It
must be noted that the detectors D20-D17 are situated outside bottlenecks. Therefore, the
spontaneous occurrence of the local phase transition 'Free Flow => Synchronized Flow' is
really a process of 'self-organization without bottlenecks' in traffic flow.
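The criterion used above, namely that the transition occurs first at the detectors D18 and later the farther a detector lies from D18, can be checked mechanically. The Python below is a sketch under stated assumptions: the function name is hypothetical, detector numbers are used as a stand-in for position along the road (lower numbers upstream, as for D17 and D16), and apart from the 06:36 transition at D18 the times in the usage example are invented for illustration.

```python
def transitions_spread_from(origin, transition_minute):
    """Check that transitions occur later the farther a detector is from origin.

    transition_minute maps a detector number to the time of its transition
    from free to synchronized flow, in minutes after 06:00.
    """
    # Walk away from the origin on each side, nearest detector first.
    upstream = sorted((d for d in transition_minute if d < origin), reverse=True)
    downstream = sorted(d for d in transition_minute if d > origin)
    for side in (upstream, downstream):
        previous = transition_minute[origin]
        for detector in side:
            if transition_minute[detector] <= previous:
                return False
            previous = transition_minute[detector]
    return True


# D18 at 06:36 is from the text; the other times are hypothetical.
times = {16: 44, 17: 40, 18: 36, 19: 39, 20: 43}
print(transitions_spread_from(18, times))
```

A result of True is consistent with a local transition at the origin followed by induced transitions upstream and propagating synchronized flow downstream; it does not by itself distinguish between those two mechanisms.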
[Figure 5(b): vehicle speed (left) and flow rate q [veh/h] (right) versus time, 06:35 - 06:50, at the detectors D20, D19, D18, D17 and D16.]
Fig. 5(a, b). Results of experimental observations of the phase transition 'Free Flow =>
Synchronized Flow' outside highway bottlenecks: The dependence of the vehicle speed (left)
and the flow rate (right) at the different detectors both downstream and upstream from the
location of the phase transition (D18) within time interval 06:20 - 06:40 (a) and 06:35 -
06:50 (b) for three lanes of the highway (for the detectors D23 the vehicle speed on the off-
ramp is additionally shown, D23-off). Up arrows symbolically show the transitions 'Free
Flow => Synchronized Flow' at the related detectors.
2.2. Experimental Properties of Phase Transitions 'Synchronized Flow => Jam' outside
Freeway Bottlenecks
A different process of 'self-organization without bottlenecks' - the spontaneous occurrence of
a traffic jam inside synchronized flow outside bottlenecks, i.e., the phase transition
'Synchronized Flow => Jam' outside freeway bottlenecks - occurs in the example under
consideration a few minutes later (Fig. 6 (a), down arrows). It must be noted that the
synchronized flow where the jam spontaneously emerges (D20) is located downstream from
the region where the synchronized flow has earlier spontaneously occurred outside
bottlenecks in the initially free flow (D16, up arrow, Fig. 5(b)).
The jam occurs due to the sequence of the double phase transitions 'Free Flow =>
Synchronized Flow => Jam' (Kerner, 1998c): (i) First the phase transition
'Free => Synchronized flow' occurs (D18, Figs. 5(b) and 7(a)). (ii) Then the 'pinch effect' in
synchronized flow, i.e., a self-compression of synchronized flow is realized, where states of
synchronized flow are lying noticeably above the line J in the flow-density plane (D20, Fig.
7(b)). (iii) In the pinch region outside bottlenecks a growing local perturbation appears (down
arrows at t = 06:50 in Fig. 6(a), D20).
In contrast to the local phase transition 'Free flow => Synchronized flow' (Fig. 5), both the
vehicle speed and the flow rate decrease simultaneously and noticeably (down arrows, left and
right, at t = 06:50 in Fig. 6(a), D20). The further behavior of this local perturbation shows
the process of spontaneous self-formation of the traffic jam (down arrows in Fig. 6(a),
D19-D15): The amplitude of the local perturbation gradually increases during its propagation
upstream (from the detectors D20 to D15, Fig. 6(a)), i.e., both the vehicle speed and the flow
rate in the jam decrease over time until they reach values of nearly zero (D16, Fig. 6(a)). It
should be noted that the detectors D20-D17, where the processes of self-emergence and
self-formation of the jam are realized, are located outside bottlenecks. Therefore, the
following processes of self-organization really occur outside bottlenecks: (i) the spontaneous
occurrence of a local perturbation (D20, t = 06:50, Fig. 6(a)); (ii) the self-maintenance and
self-growth of this perturbation; (iii) the appearance of the traffic jam due to the further
self-growth of the local perturbation.
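The qualitative distinction drawn above (speed drops while the flow rate persists, versus speed and flow rate dropping together) can be sketched in code. The following is only an illustration, not the procedure used in the paper: the function name and the relative thresholds (20 % for speed, 30 % for flow rate) are assumptions chosen for the example.

```python
def classify_drop(v_before, v_after, q_before, q_after,
                  speed_drop=0.2, flow_drop=0.3):
    """Label the change between two detector readings.

    v_* are vehicle speeds in km/h, q_* are flow rates in veh/h.
    speed_drop and flow_drop are hypothetical relative thresholds;
    they are not taken from the paper.
    """
    if v_before <= 0 or q_before <= 0:
        raise ValueError("readings before the change must be positive")
    dv = (v_before - v_after) / v_before  # relative decrease in speed
    dq = (q_before - q_after) / q_before  # relative decrease in flow rate
    if dv >= speed_drop and dq >= flow_drop:
        return "jam-like"           # speed and flow rate fall together
    if dv >= speed_drop:
        return "synchronized-like"  # speed falls, flow rate persists
    return "none"


print(classify_drop(120, 80, 1800, 1700))  # speed falls, flow rate nearly unchanged
print(classify_drop(80, 20, 1700, 600))    # both fall noticeably
```

With real detector data the thresholds would have to be calibrated, and a single pair of readings would be replaced by averages over suitable time intervals.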
For an additional verification of the fact that the self-organization of the jam formation has
occurred outside bottlenecks, the vehicle speed and the flow rate at the bottleneck in the same
time interval as in Fig. 6(a) (from 06:45 to 07:10) have been studied (Fig. 6(b)). It can be
seen from Fig. 6(b) that no growing local perturbations have occurred downstream from the
detectors D20. Besides, both earlier and during the spontaneous occurrence of the local
perturbation (at t= 06:50, D20, Fig. 6(a)) and even later, when the process of the self-
formation of the jam continues (D19 - D17, Fig. 6(a)), only free flow exists in the vicinity of
the freeway bottleneck downstream (D23 and D22, Fig. 6(b)).
[Figure 6(a): A5-South, 17.03.1997; left, middle and right lanes. Vehicle speed v [km/h] (left) and flow rate q [veh/h] (right) versus time, 06:45 - 07:10, at detectors D20, D19, D18, D17 and upstream.]
Fig. 6(a). See caption to Fig. 6(a, b) below.
After the jam has formed, it propagates upstream, and the velocity of its downstream front
remains on average the same, independently of the complexity of the states of traffic flow and
independently of whether there is a freeway bottleneck or not (Fig. 8, down arrows).
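One way to quantify such a front velocity from detector data is a least-squares fit of front position against arrival time. The sketch below is an assumption-laden illustration rather than the method of the paper: the function name is hypothetical, and the detector positions and arrival times in the usage lines are invented numbers, not measurements from Fig. 8.

```python
def front_velocity_kmh(positions_km, times_min):
    """Least-squares slope of front position versus arrival time, in km/h.

    positions_km: detector locations along the road, increasing downstream
                  (hypothetical values in the example below).
    times_min:    times in minutes at which the downstream front of the jam
                  passes each detector.
    A negative result means that the front moves upstream.
    """
    n = len(positions_km)
    if n != len(times_min) or n < 2:
        raise ValueError("need at least two matching (position, time) pairs")
    t_mean = sum(times_min) / n
    x_mean = sum(positions_km) / n
    num = sum((t - t_mean) * (x - x_mean)
              for t, x in zip(times_min, positions_km))
    den = sum((t - t_mean) ** 2 for t in times_min)
    if den == 0:
        raise ValueError("arrival times must not all be equal")
    return 60.0 * num / den  # convert km/min to km/h


# Invented example: the front passes detectors 1 km apart every 4 minutes.
v_front = front_velocity_kmh([10.0, 9.0, 8.0], [0.0, 4.0, 8.0])
print(v_front)  # -15.0
```

A fit over several detectors is used instead of a two-point difference so that a single noisy arrival time has less influence on the estimate.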
[Figure 6(b): vehicle speed v [km/h] (left) and flow rate q [veh/h] (right) versus time, 06:45 - 07:10, at detectors D23 (with D23-off), D22 and D21.]
Fig. 6(a, b). Results of experimental observations of the phase transition 'Synchronized Flow
=> Jam' outside highway bottlenecks: The dependence of the vehicle speed (left) and the flow
rate (right) at the different detectors both upstream (a) and downstream (b) from the location
of the phase transition (D20). Arrows 'down' symbolically show the location of the jam at the
different detectors. Arrows 'up' symbolically show the transitions 'Free Flow =>
Synchronized Flow' at the detectors D16 and D15 in the vicinity of the freeway bottleneck at
the intersection I2 (Fig. 4).
[Figure 7: flow rate q [veh/h] versus density ρ [veh/km] at detectors D18 (a) and D20 (b).]
Fig. 7. Pinch effect in synchronized flow: The concatenation of states of free flow (black
squares), synchronized flow (circles) and the line J at D18 (a) and D20 (b). Synchronized
flow at D20 is related to higher densities than at D18.
[Figure 8: A5-South, 17.03.1997; left, middle and right lanes. Vehicle speed v [km/h] (left) and flow rate q [veh/h] (right) versus time, 06:55 - 07:40, at detectors including D15, D15-on and D13.]
Fig. 8. Propagation of the jam (Fig. 6) through the highway bottleneck and upstream of the
bottleneck: The dependence of the vehicle speed (left) and the flow rate (right) at the different
detectors. Down arrows symbolically show the location of the jam at the different detectors.
The up arrow at D16 symbolically shows the phase transition 'Free Flow => Synchronized Flow'
in the vicinity of the freeway bottleneck at the intersection I2 (Fig. 4).
2.3. Comparison of Phase Transitions 'Free Flow => Synchronized Flow' outside and in
the vicinity of Freeway Bottlenecks
It is interesting to make a comparison of the phase transition 'Free Flow => Synchronized
Flow' which occurs outside freeway bottlenecks (Fig. 5) with the related effects where
freeway bottlenecks may play an important role.
First note that the wave of induced transitions 'Free Flow => Synchronized Flow' upstream of
the location of the initial local phase transition (t = 06:36, D18, Fig. 5, left) reaches the
detectors D17 at time t ≈ 06:43 (up arrow, Fig. 5(b), left) and the detectors D16 at time
t ≈ 06:44 (up arrow in Fig. 5(b), left). The latter detectors are already located in the
vicinity of the bottleneck (on-ramp, D15-on). It could be expected that if the wave of induced
transitions 'Free Flow => Synchronized Flow' propagated further upstream, synchronized flow
would be caught in the vicinity of this freeway bottleneck, i.e., it would be self-maintained
there for a long time. Such a 'catch effect' can indeed be observed on a highway, as follows
from experimental observations made on other days. However, in the case under consideration
it does not occur. On the contrary, the reverse phase transition 'Synchronized Flow => Free
Flow' is realized at the detectors D16 (up dotted arrow in Fig. 6(a), left). Therefore, the
synchronized flow which has spontaneously occurred outside bottlenecks (t = 06:36, D18,
Fig. 5, left) has also remained localized outside bottlenecks during the whole time of its
existence.
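The ordering of the onset times quoted above can be checked with a few lines of Python. This sketch only restates the approximate times given in the text (06:36 at D18, about 06:43 at D17, about 06:44 at D16); the function name is hypothetical.

```python
from datetime import datetime

# Approximate onset times of synchronized flow quoted in the text above.
onsets = {"D18": "06:36", "D17": "06:43", "D16": "06:44"}


def delays_min(onset_times, origin="D18"):
    """Minutes between the onset at `origin` and at every other detector."""
    fmt = "%H:%M"
    t0 = datetime.strptime(onset_times[origin], fmt)
    return {
        det: int((datetime.strptime(t, fmt) - t0).total_seconds() // 60)
        for det, t in onset_times.items()
        if det != origin
    }


delays = delays_min(onsets)
print(delays)  # {'D17': 7, 'D16': 8}
```

The delay grows with the distance from D18, which is what distinguishes the induced transitions at D17 and D16 from the initial local transition at D18.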
The free flow which has occurred due to the mentioned reverse phase transition
'Synchronized Flow => Free Flow' has existed at the detectors D16 for only about 5 min.
After this time, the 'usual' phase transition 'Free Flow => Synchronized Flow' occurs in the
vicinity of a freeway bottleneck at the intersection I2 at t ≈ 06:57 (up solid arrow in
Fig. 6(a), left). It should be noted that at a distance of about 12.6 km from the intersection
I2, at time t ≈ 06:37, another local phase transition 'Free Flow => Synchronized Flow' has
occurred in a freeway bottleneck which is situated in the vicinity of the intersection I1 (up
arrow at the detectors D6 in Fig. 9, left). Both phase transitions 'Free Flow => Synchronized
Flow' (up solid arrow in Fig. 6(a), left and up arrow at the detectors D6 in Fig. 9, left) show
qualitatively the same peculiarities of first order phase transitions which have been
considered in (Kerner and Rehborn, 1997).
[Figure 9: A5-South, 17.03.1997; by lane. Vehicle speed v [km/h] (left) and flow rate q [veh/h] (right) versus time, 06:35 - 07:00, at detectors including D6.]
Fig. 9. The phase transition 'Free Flow => Synchronized Flow' in the bottleneck at the
intersection I1: The dependence of the vehicle speed (left) and the flow rate (right) at the
different detectors both downstream and upstream from the location of the phase transition
(D6). Up arrows show the transitions to synchronized flow at the related detectors.
Thus, there are at least two types of transitions 'Free Flow => Synchronized Flow' in
experimentally observed traffic flow:
(i) The local phase transition outside bottlenecks which is an example of 'self-
organization without bottlenecks' in traffic flow (up arrow, D18, Fig. 5, left).
(ii) The local phase transition in the vicinity of a freeway bottleneck (the bottleneck
located at the intersection I2, up solid arrow at D16, Fig. 6, left and the bottleneck
located at the intersection I1, up arrow at D6, Fig. 9, left).
The first difference between these phase transitions lies in the reasons for their
occurrence: The phase transition (i) occurs without obvious reason due to the growth of a
local perturbation outside bottlenecks, whereas the transition (ii) occurs due to the growth
of a local perturbation in the vicinity of a bottleneck. The second difference is that in the
case of the phase transition (ii) synchronized flow can be self-maintained in the vicinity of
the bottleneck for several hours after the transition has occurred. On the contrary, in the
case (i) synchronized flow usually exists for a relatively short time interval and it may
propagate both upstream and downstream from the location where the phase transition has
initially occurred.
[Figure 10: flow rate q [veh/h] versus density ρ [veh/km], left lane; panels (a) D18, (b) D16 and (c).]
Fig. 10. The transitions 'Free Flow => Synchronized Flow' in the flow-density plane: The
local first order phase transition outside bottlenecks (a), the local first order phase transitions
in the freeway bottlenecks at the intersection I2 (b) and at the intersection I1 (c), respectively.
Free flow is shown by black squares and synchronized flow by circles. Arrows symbolically
show the transitions at the related detectors (Figs. 5 - 8).
However, these two types of transitions 'Free Flow => Synchronized Flow' have important
common properties: In the flow-density plane all of them show qualitatively the same
breakdown effect in free flow (Fig. 10) which is well-known from numerous observations of
freeway bottlenecks (e.g., Agyemang-Duah and Hall, 1991; Brannolte, 1991; Cassidy and
Bertini, 1998; Persaud, et al, 1998). However, one could see from the consideration made
above that the breakdown effect in free traffic is not exclusively a property of freeway
bottlenecks: It can spontaneously occur outside any bottlenecks in free flow (up arrow at D18,
Fig. 5). The latter is linked to the fact that these breakdown phenomena have the same nature
- they are the local first order phase transitions 'Free Flow => Synchronized Flow' (Kerner
and Rehborn, 1997).
Thus, 'self-organization without bottlenecks' is really observed in traffic flow. Therefore,
phase transitions in traffic flow may not be explained by theories (Daganzo, 1997; Daganzo,
et al, 1998) where congestion in traffic flow can only occur due to some capacity restriction in
freeway bottlenecks rather than due to spontaneous effects of self-organization. On the
contrary, experimental observations presented in the paper show that congestion can indeed
spontaneously occur in traffic flow outside freeway bottlenecks. In other words, phase
transitions in traffic flow may really be explained within the framework of self-organization
phenomena, in particular as considered in (Kerner and Rehborn, 1997; Kerner, 1998c).
3. THEORY OF CONGESTED TRAFFIC FLOW

The results of experimental observations presented in Sect. 2 show that 'self-organization
without bottlenecks' can really occur and play an important role in real traffic flow. 'Self-
organization without bottlenecks' can be responsible for all three types of phase transitions in
traffic flow: 'Free Flow => Synchronized Flow', 'Free Flow => Jam', and 'Synchronized
Flow => Jam(s)'. Naturally, the local phase transition 'Free Flow => Synchronized Flow'
occurs considerably more frequently in the vicinity of a freeway bottleneck. It is linked to the
fact that traffic flow is usually permanently strongly non-homogeneous in the vicinity of
freeway bottlenecks. This permanent non-homogeneity can act as 'a permanent nucleus' for
the phase transition, whereas outside bottlenecks there is usually no such non-homogeneity
which may 'support' the spontaneous emergence of synchronized traffic flow.
On the one hand, the observations confirm that 'self-organization without bottlenecks' can
really occur. On the other hand, from these observations it follows that some features of this
'self-organization without bottlenecks' and of self-organization phenomena which occur in
the vicinity of a freeway bottleneck are qualitatively the same. Therefore, some intrinsic
non-linear properties of homogeneous traffic flow, which do not depend on whether bottlenecks
exist on a highway or not, are essentially responsible for these effects of self-organization. A
theory of these intrinsic non-linear properties of homogeneous traffic flow has recently been
developed in (Kerner, 1998a, 1998c, 1999). Some of the hypotheses of this theory which may
explain the observed phenomena of self-organization presented in this paper and in (Kerner
and Rehborn, 1996a, 1996b, 1997; Kerner, 1998c) are briefly explained below.
3.1. Homogeneous States of Traffic Flow
Experimental observations show that real traffic flow on a highway is usually non-
homogeneous. However, to understand features of real traffic flow, the properties of
homogeneous states (steady states) of traffic flow should first be understood. The hypotheses
presented below are devoted to these steady states of flow.
(i) A hypothesis about the multitude of homogeneous states (steady states)
of traffic flow (Kerner and Rehborn, 1996b; Kerner, 1998a): In the
flow-density plane homogeneous states (steady states) of flow on a
multi-lane road are related to a curve (curve F) for free flow and to a
two-dimensional region (hatched region in Fig. 3(a)) for synchronized flow.
The multitudes of states of free flow on a multi-lane road overlap
homogeneous states (steady states) of synchronized flow in the density.
They are separated by a gap in the flow rate at a given density.
(ii) A hypothesis about the behavior of infinitesimal perturbations in
initially homogeneous states of traffic flow (Kerner, 1998a):
Independently of the vehicle density in an initial state of flow, infinitesimal
perturbations of traffic flow variables (the vehicle speed and/or the density)
do not grow in any homogeneous states of either free or synchronized flow
or else of homogeneous-in-speed states of synchronized flow: In the whole
possible density range homogeneous states of flow can exist. In other
words, in the whole possible density range (Fig. 3) there are no unstable
homogeneous states of traffic flow with respect to infinitesimal
perturbations of any traffic flow variables.
(iii) A hypothesis about continuous spatial-temporal transitions between
different states of synchronized flow (Kerner, 1998a): Local perturbations
in synchronized flow can cause continuous spatial-temporal transitions
between different states of both homogeneous and homogeneous-in-speed
states of synchronized flow (hatched region in Fig. 3(a)).
To explain the hypotheses (i)-(iii), note that in synchronized flow spacing between vehicles is
relatively low (i.e., the density is relatively high) in comparison with free flow at the same
flow rate. At low spacing a driver is able to recognize a change in the spacing to the vehicle in
front of him, even if the difference in speed is negligible. In other words, the driver is able to
maintain a time-independent spacing (without taking fluctuations into account) to the vehicle
in front of him in an initially homogeneous state of synchronized flow. The ability of drivers
to maintain a time-independent spacing should be valid for a finite range of spacing.
Therefore, a given vehicle speed may be related to an infinite multitude of homogeneous
states with different densities in a limited range (p^^ < p < p^"' in Fig. 11 (a)). For this
reason, the multitude of homogeneous states of synchronized flow covers a two-dimensional
region in the flow-density plane (hatched region in Figs. 3(a), 11 (a)). States of free flow
(curve F in Fig. 11 (a)) and states of synchronized flow (hatched region) overlap in densities.
However, in free flow on a multi-lane road, owing to the possibility of passing, the average
vehicle speed can be higher than the maximum speed in synchronized flow at the same density.
Therefore, there is a gap in the flow rate between states of free flow and synchronized flow at
a given density.
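The geometric content of this argument can be written compactly using the identity $q = \rho v$ for a homogeneous state. In the sketch below, $\rho_1(v)$ and $\rho_2(v)$ are placeholder symbols (not the notation of the paper) for the ends of the density range in which drivers can hold a time-independent spacing at the speed $v$, and $v_{\min}$, $v_{\max}$ are likewise placeholders for the range of speeds in synchronized flow.

```latex
% Steady states at one fixed speed v form a line segment in the flow-density plane:
q = \rho\, v, \qquad \rho_1(v) < \rho < \rho_2(v).

% Letting v range over the speeds of synchronized flow sweeps out a two-dimensional region:
\mathcal{S} = \bigl\{ (\rho, q) : q = \rho\, v,\ \rho_1(v) < \rho < \rho_2(v),\ v_{\min} \le v \le v_{\max} \bigr\}.
```

By contrast, if each density allowed only one steady-state speed, the same construction would give a single curve, as for the free flow curve F.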
Small enough perturbations in spacing which are linked to an original fluctuation in the
braking of a vehicle do not grow. Indeed, small enough changes in spacing are allowed;
therefore drivers need not react to them immediately. For this reason, even after a time delay
due to the finite reaction time of drivers $T_{\mathrm{reac}}$, the drivers upstream need not brake
harder than the drivers in front of them to avoid an accident. As a result, a local perturbation of
traffic variables (flow rate, density, or vehicle speed) of small enough amplitude does not
grow. An occurrence of this perturbation may cause a spatial-temporal transition to another
state of synchronized flow. In other words, local perturbations may cause continuous
spatial-temporal transitions between different states of synchronized flow.
Note that on a one-lane road, independently of the vehicle density, vehicles cannot pass.
Therefore, homogeneous states of flow on the one-lane road at higher density are identical to
homogeneous states of synchronized flow on a multi-lane road (hatched region in Fig. 3(b)).
3.2. Nucleation Effects and Phase Transitions in Traffic Flow
(iv) A hypothesis about two different kinds of nucleation effects in traffic
flow (Kerner, 1999): There are two qualitatively different kinds of first
order local phase transitions and, respectively, two qualitatively different
kinds of 'nucleation effects' in traffic flow: (1) the 'nucleation effect'
which is responsible for the jam's formation (Fig. 11(a, b)) and (2) the
'nucleation effect' which is responsible for the phase transition 'Free flow
=> Synchronized flow' (Fig. 11(c, d)).
The nucleation effect which is responsible for the phase transition 'Free
flow => Synchronized flow' is linked to a self-decrease in the probability of
passing in traffic flow ($P$), i.e., in the probability that a driver is able to pass.
The self-decrease in the probability of passing occurs if, owing to some
local perturbation of traffic variables (the vehicle speed and/or the density),
the probability of passing in the related local region of traffic flow has
decreased below some critical value of the probability of passing $P_{cr}$ (dotted
curve $P_{cr}$ in Fig. 11(d)). In contrast, the 'nucleation effect' which is
responsible for the jam's formation is linked to the growth of a local critical
perturbation of the traffic variables (the vehicle speed and/or the density).
The local critical perturbation occurs if the amplitude of a local perturbation
exceeds some critical amplitude (curves $F^{(pert)}$ and $S^{(pert)}$ in Fig. 11(b)).
(v) A hypothesis about the probability of phase transitions in free flow
(Kerner, 1999): The limit density $\rho_s$ (Fig. 11(d)) for the phase transition
'Free flow => Synchronized flow' can differ from the threshold of the phase
transition 'Free flow => Jam' $\rho_b$ (Fig. 11(a, b)).
Because the critical value of the probability of passing is an increasing
function of the density (Fig. 11(d), dotted curve $P_{cr}$), the higher the density
is, the lower the amplitude of the local perturbation of the traffic variables
which can cause the critical value of the probability of passing.
Within the density range of free flow where both phase transitions can
occur, the probability of an occurrence of the phase transition 'Free flow =>
Synchronized flow' is considerably higher than of the phase transition 'Free
flow => Jam'. It is linked to the assumption that the amplitude of a local
perturbation of traffic variables (the vehicle speed and/or the density), which
is needed for the occurrence of the critical value of the probability of
passing $P_{cr}$ (Fig. 11(d), curve $P_{cr}$), is considerably lower than the critical
amplitude of a local perturbation (Fig. 11(b), curve $F^{(pert)}$), which is needed
for the jam's formation. The latter assumption is related to the result of
observations (Kerner, 1998c) where it has been found that the 'double'
phase transitions 'Free flow => Synchronized flow => Jam' occur
considerably more frequently than the phase transition 'Free flow => Jam'.
[Figure 11, panels (a)-(d). Axes: flow rate q; density ρ.]
Fig. 11. Hypotheses about different types of phase transitions in traffic flow: (a, b) -
Explanation of the phase transitions 'Free flow => Jam' and 'Synchronized flow => Jam(s)',
and (c, d) - Explanation of the phase transition 'Free flow => Synchronized flow'. In Fig. 11(a)
the concatenation of the line J with homogeneous states of free (curve F) and synchronized
flow (hatched region) is shown. In Fig. 11(b) the qualitative dependence of the critical
amplitude of the local density perturbation on the density in metastable homogeneous states
of free flow (curve $F^{(pert)}$) and of synchronized flow (curve $S^{(pert)}$, which is related to a
constant vehicle speed) is shown. In Fig. 11(d) a qualitative shape of the dependence of the
probability of passing $P$ on the density is shown. Homogeneous states of free (curve F) and
synchronized flow (hatched region) in Figs. 11(a, c) are the same as in Fig. 3(a).
(vi) A hypothesis about the nucleation effect which is responsible for the
jam's formation (Kerner, 1998a, c): The line J (line J in Fig. 11(a))
determines the threshold of the jam's existence and excitation. In other
words, all (an infinite number!) homogeneous states of traffic flow which
are related to the line J in the flow-density plane are threshold states with
respect to the jam's formation. The line J separates all homogeneous states
of both free and synchronized flow into two qualitatively different classes:
(1) In states which are related to points in the flow-density plane lying
below the line J (see axes in Fig. 11(a)), jams can neither continue to exist
nor be excited, and (2) states which are related to points in the
flow-density plane lying on and above the line J are 'metastable' states with
respect to the jam's formation where the related nucleation effect can be
realized. Local perturbations of traffic variables whose amplitude
exceeds some critical amplitude grow and can lead to the jam's formation
(up arrows in Fig. 11(b)); otherwise jams do not occur (down arrows in
Fig. 11(b)). These critical local perturbations act as 'nucleation centers' (nuclei)
for the jam's formation in traffic flow.
The critical amplitude of the local perturbations is maximal at the line J and
depends both on the density and on the flow rate above the line J. The
critical amplitude of the local perturbations is considerably lower in
synchronized flow than in free flow (Fig. 11(b)). The lower the vehicle
speed in synchronized flow is, the lower the critical amplitude of
perturbations for states of synchronized flow being at the same distance
above the line J in the flow-density plane.
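The classification in hypothesis (vi) amounts to a simple geometric test in the flow-density plane. The sketch below is a minimal illustration under stated assumptions: it assumes that the line J is straight, passes through the point (rho_max, 0) and has a slope equal to the (negative) velocity v_g of the downstream front of a jam. The function names and the default parameter values are placeholders, not values taken from the paper.

```python
def line_j_flow(rho, v_g_kmh, rho_max):
    """Flow rate (veh/h) on the line J at density rho (veh/km).

    Assumption: the line J is straight, passes through (rho_max, 0) and
    has slope v_g_kmh (negative, since the front moves upstream).
    """
    return v_g_kmh * (rho - rho_max)


def is_metastable(rho, q, v_g_kmh=-15.0, rho_max=140.0):
    """True if the state (rho, q) lies on or above the line J.

    The defaults for v_g_kmh and rho_max are illustrative placeholders.
    """
    return q >= line_j_flow(rho, v_g_kmh, rho_max)


# With the placeholder parameters the line J passes through (40 veh/km, 1500 veh/h).
print(line_j_flow(40.0, -15.0, 140.0))  # 1500.0
print(is_metastable(40.0, 1800.0))      # True: above the line J
print(is_metastable(40.0, 1200.0))      # False: below the line J
```

The test says nothing about whether a jam actually forms in a metastable state; according to the hypothesis this additionally requires a local perturbation whose amplitude exceeds the critical amplitude.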
To explain the hypotheses (iv), (v), note that in free flow of low enough densities a driver
is not hindered from passing, i.e., the related probability of passing is $P = 1$ (Fig. 11(d)). The
higher the density in free flow is, the lower the probability of passing; however, in free flow
up to the limit point $\rho = \rho_{max}^{(free)}$ the probability of passing does not decrease drastically
(solid curve $P_F$, Fig. 11(d)). In contrast, in synchronized flow of high enough
density drivers are not able to pass at all, i.e., $P = 0$. The lower the density of synchronized
flow is, the higher the probability of passing; however, in synchronized flow up to the limit
point $\rho = \rho_s$ the probability of passing cannot increase drastically (solid curve $P_S$,
Fig. 11(d)).
Synchronized and free flows overlap in the density, therefore $\rho_{max}^{(free)} > \rho_s$. As a result, the
dependence $P(\rho)$ is Z-shaped, i.e., it has a hysteresis loop, and therefore it consists of three
branches: (i) the branch $P_F$ for free flow, (ii) the branch $P_S$ for synchronized flow, and (iii)
the branch $P_{cr}$ which is related to the critical value of the probability of passing (dotted
curve $P_{cr}$, Fig. 11(d)). The latter means that if in a local region of free flow, due to a local
perturbation of traffic variables (the speed and/or the density), the probability of passing is
decreased below the critical value $P_{cr}$, then an avalanche self-decrease in the probability of
passing occurs, leading to a self-formation of synchronized flow where $P = P_S$ (down arrow
in Fig. 11(d)). If, on the contrary, in a local region of synchronized flow, due to a local
perturbation of traffic variables, the probability of passing is increased above the critical
value $P_{cr}$, then an avalanche self-increase in the probability of passing occurs, leading to a
self-formation of free flow where $P = P_F$ (up arrow in Fig. 11(d)).
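The hysteresis described above can be mimicked by a toy rule that looks only at which side of the critical branch a locally perturbed passing probability falls. This is a deliberately crude sketch, not a model from the paper: the function name is hypothetical, the three branch values are supplied by the caller for one density inside the overlap range, and the avalanche process is replaced by an instantaneous jump.

```python
def relax_passing_probability(p_local, p_free, p_cr, p_sync):
    """Return the phase and branch value that a perturbed state relaxes to.

    p_free, p_cr and p_sync are the values of the three branches of the
    Z-shaped curve P(rho) at one density in the overlap range, with
    p_sync < p_cr < p_free. The all-or-nothing jump is a toy simplification
    of the avalanche self-decrease or self-increase described in the text.
    """
    if not (p_sync < p_cr < p_free):
        raise ValueError("expected p_sync < p_cr < p_free")
    if p_local < p_cr:
        return "synchronized", p_sync  # avalanche self-decrease
    if p_local > p_cr:
        return "free", p_free          # avalanche self-increase
    return "critical", p_cr            # exactly on the critical branch


# Illustrative branch values only (not read off Fig. 11(d)).
print(relax_passing_probability(0.5, p_free=0.9, p_cr=0.6, p_sync=0.1))
print(relax_passing_probability(0.7, p_free=0.9, p_cr=0.6, p_sync=0.1))
```

Because the outcome depends on the side of the critical branch from which the perturbation starts, the same density can end up in either phase, which is the hysteresis loop of the Z-shaped dependence.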
It has been mentioned in the Introduction that above a threshold of any first order phase
transition, the greater the distance from the threshold is, the higher the related probability of
the occurrence of the phase transition. In other words, the probability of the phase
transition 'Free flow => Synchronized flow' should be an increasing function of the density
in free flow, and it should be one at the limit point $\rho = \rho_{max}^{(free)}$. Such a behavior of the
probability of the breakdown phenomenon in a freeway bottleneck has recently been
discovered by Persaud, et al (1998). This experimental fact confirms the conclusion made
by Kerner and Rehborn (1997) that the latter phenomenon is caused by the first order
local phase transition 'Free flow => Synchronized flow'. An explanation of the hypothesis
(vi) has been made in (Kerner, 1998c).
(vii) A hypothesis about the resulting states of traffic where the nucleation
effect responsible for the jam's formation occurs. The resulting state of
traffic flow, where the nucleation effect responsible for the jam's formation
occurs, essentially depends even on small peculiarities of the initial state of
flow. In particular, if the initial state is homogeneous, then jams appear
caused by the nucleation effect. If, on the contrary, spacing between
vehicles in an initial state of flow is essentially non-homogeneous, then the
hypothesis (viii) which will be considered below may be valid.
The hypotheses (iv)-(vii) may explain the results of experimental observations where two
qualitatively different types of first order phase transitions in traffic flow have been
distinguished (Kerner and Rehborn, 1997). Besides, they may explain the following results
which have been discovered in (Kerner, 1998c): (1) 'Double' phase transitions 'Free flow
=> Synchronized flow => Jam(s)' occur considerably more frequently than the phase
transition 'Free flow => Jam'; (2) While 'stop-and-go' traffic patterns occur in
synchronized flow, only single jams can appear in free flow.
3.3. Nucleation-Interruption Processes in Traffic Flow
(viii) A hypothesis about 'nucleation-interruption' processes in traffic flow
(Kerner, 1998a): In traffic flow where the spacings between vehicles differ
greatly from one another, the nucleation effect which is responsible for the
jam's formation can be interrupted. A related 'nucleation-interruption'
process may cause a sequence, i.e., a cascade, of qualitatively the same
nucleation-interruption processes upstream from the initial perturbation, if
the dispersion of spacings in an initial flow is large enough. The
nucleation-interruption processes may be responsible for phase transitions
from free to synchronized flow and for an occurrence of spatial-temporal
chaos in traffic flow.
An explanation of this hypothesis has been made in (Kerner, 1998a).

3.4. Propagation of Jams and 'Three Kinds' of Highway Capacity
(ix) A hypothesis about the process of jam propagation (Kerner, 1998c): The
velocity of the downstream front of a wide jam $v_g$ does not depend on
whether free flow or synchronized flow is formed in the outflow of the jam.
Note that the existence of three qualitatively different phases of traffic (Kerner and Rehborn,
1996b), (i) free flow, (ii) synchronized flow and (iii) jams, indicates that highway capacity
depends on the phase (i, ii, or iii) in which traffic actually is. The related 'maximal
capacity' for free flow is $q_{max}^{(free)}$, for synchronized flow it is $q_{max}^{(syn)}$, and downstream of a wide
jam it is $q_{out}$ (Fig. 11(a)).
4. CONCLUSIONS
1. Results of experimental observations allow the following conclusions:
• There are at least two phenomena of 'self-organization without bottlenecks' in real traffic
flow: (i) the local first order phase transition 'Free Flow => Synchronized Flow' and (ii)
the local first order phase transition 'Synchronized Flow => Jam'.
• The local phase transition 'Free Flow => Synchronized Flow', which occurs outside
freeway bottlenecks, can cause two waves: (i) a wave of induced transitions 'Free Flow
=> Synchronized Flow' upstream of the initial location of this phase transition and (ii) a
wave of the propagating synchronized flow downstream.
• These waves, in turn, can cause a widening of the region of synchronized flow (i.e., a
widening of congestion) both upstream and downstream.
• The well-known breakdown phenomenon in a freeway bottleneck (e.g., Agyemang-Duah
and Hall, 1991; Brannolte, 1991; Cassidy and Bertini, 1998; Persaud, et al, 1998), whose
nature is linked to the local first order phase transition 'Free Flow => Synchronized Flow'
in the vicinity of a freeway bottleneck (Kerner and Rehborn, 1997), has some differences
and some common features with the local phase transition 'Free Flow => Synchronized
Flow', which occurs outside freeway bottlenecks. A difference is that synchronized flow
can be self-maintained for several hours in the case of the phase transition in a freeway
bottleneck. On the contrary, in case of the local phase transition 'Free Flow =>
Synchronized Flow', which occurs outside freeway bottlenecks, synchronized flow
usually exists for a relatively short time interval and the region of the location of the
synchronized flow may propagate both upstream and downstream from the location
where the phase transition has initially occurred.
2. A theory of congested traffic flow (Kerner, 1998a, c, 1999) may qualitatively explain the
different local first order phase transitions in traffic flow.
ACKNOWLEDGMENTS
I would like to thank H. Rehborn, S. Valkenberg and M. Aleksić for their help, the
Autobahnamt Frankfurt for its support in the preparation of the experimental data, and the
German Ministry of Education and Research for financial support within the BMBF
project "SANDY".
REFERENCES
Agyemang-Duah, K., and F. Hall (1991) Some issues regarding the numerical value of
capacity. In: Proceedings of International Symposium of Highway Capacity (U.
Brannolte, ed.), p.1-15. A.A. Balkema, Rotterdam.
Bando, M., K. Hasebe, K. Nakanishi, A. Nakayama, A. Shibata, and Y. Sugiyama (1995).
Phenomenological study of dynamical model of traffic flow. J. Phys. I (France) 6,
1389-1399.
Barlovic, R., L. Santen, A. Schadschneider, and M. Schreckenberg (1998). Metastable states
in cellular automata for traffic flow. Eur. Phys. J. B, 5, 793-800.
Cassidy, M.J., and R.L. Bertini. (1998). Some Traffic Features at Freeway Bottlenecks. Trans.
Res. B (in press).
Ceder, A. (1976). A deterministic flow model for two-regime approach. Trans. Res. Rec. 567,
16-30.
Chandler, R. E., R. Herman, and E. W. Montroll (1958). Traffic dynamics: Studies in car
following. Oper. Res. 6, 165-184.
Daganzo, C. F. (1997). Fundamentals of Transportation and Traffic Operations. Elsevier
Science Inc., New York.
Daganzo, C. F., M.J. Cassidy and R.L. Bertini (1998). Causes of Phase Transitions in
Highway Traffic. Trans. Res. B (in press).
Hall, F. L. (1987). An interpretation of speed-flow concentration relationships using
catastrophe theory. Transp. Res. A, 21, 191-201.
Hall, F.L., B.L. Allen and M.A. Gunter. (1986). Empirical analysis of freeway flow-density
relationships. Transp. Res. A, 20, 197-210.
Helbing, D. (1997). Verkehrsdynamik. Springer, Berlin.
Herrmann, M., and B.S. Kerner (1998). Local cluster effect in different traffic flow models.
Physica A, 255, 163-188.
Kerner, B.S. (1998a) A Theory of Congested Traffic Flow. In: Proceedings of 3rd
International Symposium on Highway Capacity (R. Rysgaard, ed.), Vol. 2, pp. 621-
642. Road Directorate, Ministry of Transport - Denmark.
Kerner, B. S. (1998b). Traffic flow: Experiment and Theory. In: Traffic and Granular Flow
97 (M. Schreckenberg and D. E. Wolf, eds.), pp. 239-268. Springer, Singapore.
Kerner, B. S. (1998c). Experimental features of self-organization in traffic flow. Phys. Rev.
Letters, 81, 3797-3800.
Kerner, B.S. (1999). Congested traffic flow: Observations and theory. Preprint No. 990106,
TRB, 78th Annual Meeting, January 10-14, Washington D.C.
Kerner, B. S., S. L. Klenov, and P. Konhäuser. (1997). Asymptotic theory of traffic jams.
Phys. Rev. E, 56, 4200-4216.
Kerner, B. S. and P. Konhäuser. (1994). Structure and parameters of clusters in traffic flow.
Phys. Rev. E, 50, 54-83.
Kerner, B. S. and V.V. Osipov. (1994). Autosolitons: A New Approach to Problems of Self-
Organization and Turbulence. Kluwer Academic Publishers, Dordrecht, Boston,
London.
Kerner, B. S. and H. Rehborn. (1996a). Experimental features and characteristics of traffic
jams. Phys. Rev. E, 53, R4275-R4278.
Kerner, B. S. and H. Rehborn. (1996b). Experimental properties of complexity in traffic flow.
Phys. Rev. E, 53, R1297-R1300.
Kerner, B. S. and H. Rehborn. (1997). Experimental properties of phase transitions in traffic
flow. Phys. Rev. Letters, 79, 4030-4033.
Kerner, B. S. and H. Rehborn. (1998). Messungen des Verkehrsflusses: Charakteristische
Eigenschaften von Staus auf Autobahnen. Internationales Verkehrswesen 50, 5/98,
196-203.
Kometani, E., and T. Sasaki (1958). On stability of traffic flow. J. Oper. Res. (Japan) 2, 11-
26.
Koshi, M., M. Iwasaki and I. Ohkura. (1983). Overview on vehicular flow characteristics. In:
Transportation and Traffic Theory (V. F. Hurdle, E. Hauer and G. N. Stewart, eds.),
pp. 403-426. Proceedings of 8th International Symposium on Transportation and
Traffic Theory, University of Toronto Press, Toronto.
Krauß, S., P. Wagner, and C. Gawron (1997). Metastable states in a microscopic model of
traffic flow. Phys. Rev. E, 55, 5597.
Kühne, R. (1991). Traffic patterns in unstable traffic flow on freeway. In: Highway Capacity
and Level of Services (U. Brannolte, ed.), pp. 211-223. A. A. Balkema, Rotterdam.
May, A.D. (1990). Traffic Flow Fundamentals. Prentice-Hall, Englewood Cliffs, New Jersey.
Nicolis, G., and I. Prigogine. (1977). Self-Organization in Non-equilibrium Systems. Wiley,
New York.
Persaud, B., S. Yagar and R. Brownlee. (1998). Exploration of the Breakdown Phenomenon
in Freeway Traffic, Transportation Research Board, Preprints of the 77th Annual
Meeting, Washington, D.C.
Prigogine, I. and R. Herman. (1971). Kinetic Theory of Vehicular Traffic. American Elsevier,
New York.
Schreckenberg, M., A. Schadschneider, K. Nagel, and N. Ito. (1995). Discrete stochastic
models for traffic flow. Phys. Rev. E, 51, 2939-2949.
Treiterer, J. (1975). Investigation of traffic dynamics by aerial photogrammetry techniques.
Ohio State University, Report No. PB 246 094, Columbus, Ohio.
Whitham, G. B. (1974). Linear and Nonlinear Waves. Wiley, New York.
A MERGING-GIVEWAY BEHAVIOR MODEL
CONSIDERING INTERACTIONS AT EXPRESSWAY
ON-RAMPS
Hideyuki Kita and Kei Fukuyama, Department of Social Systems Engineering,
Tottori University, Tottori, Japan
INTRODUCTION
As the car-following model illustrates, traditional traffic flow theory assumes that the
influence of a car on the surrounding cars is one-directional. In reality, the actions of
surrounding drivers depend, to a greater or lesser degree, on one another. The traditional
approach has given us useful information for understanding traffic phenomena. It is,
however, not necessarily sufficient for describing traffic behavior when bi-directional
influence plays a dominant role in driving actions (e.g. Troutbeck, 1995).
Driving behavior between merging and through cars on an on-ramp merging section of an
expressway is a case of this sort. While merging cars usually avoid unsafe situations by
controlling the timing of their merge, the through car sometimes makes a "giveway" motion
and keeps safer passing by changing its lane to the passing lane next to the merging lane.
The behavior of the merging car under the influence of the through traffic also influences
the through traffic. That is, merging and through cars affect each other. Since these
influences are not independent of one another, they should be treated jointly in the analysis
as an "interaction". Ignoring this interaction may lead to an inaccurate description of the
traffic phenomena on a merging section.
While many studies point out the need to analyze the giveway behavior, few studies have
explicitly dealt with such behavior. Troutbeck (1995) analyzes the phenomena on a round-
about where the cars with the right of way running on the circle lane sometimes give their
way to the merging cars at the entrances. Nielsen and Rysgaard (1995) reported merging
and giveway behavior at motorway on-ramps in four EU countries and compared them with
several indices. These studies are useful data sources for understanding giveway behavior.
Many studies of lane changing behavior exist for traffic capacity analysis (e.g., TRB
(1985), Cassidy et al. (1990), Vermijs (1991)), but they merely try to clarify the relationship
between macroscopic characteristics, such as the traffic distribution ratio over lanes or the
lane changing ratio, and road and traffic characteristics. Chang and Cao (1991) model the
frequency of lane changing from the viewpoint of microscopic behavior analysis. Their study,
however, does not explicitly handle the fundamental mechanism by which the traffic behavior
as a whole is formed as the result of interacting driving decisions among cars.
Based on this recognition, we developed a simple model describing the
traffic behavior of a pair consisting of a merging car and a through car while explicitly
taking into consideration the interaction between them (Kita, 1999). That study views the
situation, in which each driver chooses his/her best action by considering a forecast of the
other driver's action, as a "game", and then clarifies the mechanism by which
traffic phenomena such as the merging and/or giveway ratio over lanes are determined
by the driving environment, consisting of road and traffic characteristics and the driving
decisions of other surrounding cars.
However, the study does not necessarily investigate the existence of multiple equilibria and
the transition of equilibria, so that the correspondence between the driving conditions as
input and the chosen driving actions as output is not unique. This means that it is difficult
to examine the replication capability of the model by using observed data.
In this study, we show that a unique correspondence can be found between a given driving
environment and the resulting pair of driving actions of the drivers, obtained as an
equilibrium solution of the game, which makes it possible to examine the replication ability
of the model using observed data. For this purpose, we extensively refine the earlier version
of the model in order to specify the timing and location of the driving actions corresponding
to the initial conditions, by analyzing the equilibria of the game and especially their
transition over time. Based on these results, the model is tested with observed data in a
case study.
A GAME THEORETIC INTERPRETATION OF MERGING-GIVEWAY
INTERACTION
Figure 1: Merging section and the influencing variables
Here, we analyze the giveway motion often seen in the section downstream of the
merging gore, in which a through car encountering a merging car avoids a conflict.
The merging section and the cars concerned in this study are depicted in Fig. 1. The
speed of the merging cars is assumed to be slower than that of the through cars. Under
this situation, if the merging car (Car [1] in Fig. 1) decides to merge behind the through
car (Car [2] in Fig. 1), the through car may not give its way. If the through car wishes to
give its way to the merging car, the merging car may merge in front of the through car.
In this way, both the merging car and the through car attempt to take the best action for
themselves by forecasting the other's action, respectively. To describe this situation as a
game, we specify the structure of the game as follows.
1. A merging car (Car [1] in Fig. 1, also named 'Merging Car' in this study) and
the closest through car approaching from the rear side in the adjacent lane to the
acceleration lane (Car [2] and named 'Through Car') are the only players of the game.
The other cars depicted in Fig. 1 such as Car [3] (which is also named 'Passing Car')
and Car [4] (named 'Following Car') constitute the driving environment of Car [1]
and [2].
2. The number of plays between two players over a merging section is one.
3. No communication and therefore no coalition can exist between the players.
4. The players have complete information.

THE MODEL AND THE EQUILIBRIUM ANALYSIS
The Model

                                Through Car
                       Go with Giveway    Go without Giveway
Merging Car   Merge    F11, G11           F10, G10
              Pass     F01, G01           F00, G00

Figure 2: Merge-Giveway Game in Normal Form

The behavior of the Merging Car and Through Car can be modeled as a two-person
non-zero-sum non-cooperative game. The strategies of the Merging Car are "Merge
(merging in front of the through car)" and "Pass (passing the through car)". The strategies
of the Through Car are "Go with Giveway" and "Go without Giveway". This game
is given in Fig. 2 in normal form. In Fig. 2, $F_{ij}$ is the payoff of the Merging
Car and $G_{ij}$ is the payoff of the Through Car, where $i = 1$ means "Merge" and $i = 0$
means "Pass", while $j = 1$ means "Go with Giveway" and $j = 0$ means "Go without
Giveway". We denote the probabilities of the merging and giveway choices by
$x$ and $y$, respectively. Then the expected payoff of the Merging Car, $U$, and that of the
Through Car, $V$, are given as follows.
$$U = xy F_{11} + x(1-y) F_{10} + (1-x)y F_{01} + (1-x)(1-y) F_{00} \qquad (1)$$
$$V = xy G_{11} + x(1-y) G_{10} + (1-x)y G_{01} + (1-x)(1-y) G_{00} \qquad (2)$$
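With mixed strategies, an expected payoff is the probability-weighted sum of the four cell payoffs of the normal form game. The following is a minimal Python sketch of that calculation, assuming the standard bilinear form; the function name and the numerical payoffs are illustrative and do not come from the study.

```python
from typing import Dict, Tuple

# Key (i, j): i = 1 "Merge" / 0 "Pass", j = 1 "Go with Giveway" / 0 "Go without Giveway".
Payoffs = Dict[Tuple[int, int], float]


def expected_payoff(payoff: Payoffs, x: float, y: float) -> float:
    """Expected payoff when Merge has probability x and giveway has probability y."""
    return (
        x * y * payoff[(1, 1)]
        + x * (1 - y) * payoff[(1, 0)]
        + (1 - x) * y * payoff[(0, 1)]
        + (1 - x) * (1 - y) * payoff[(0, 0)]
    )


if __name__ == "__main__":
    # Hypothetical payoffs of the Merging Car, for illustration only.
    F = {(1, 1): 2.0, (1, 0): 1.0, (0, 1): 0.0, (0, 0): 3.0}
    print(expected_payoff(F, 0.5, 0.5))
```

The same function gives the Through Car's expected payoff when it is called with that player's payoff table.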
Since driver behavior can be well described by a model based on "Time to Collision (TTC)"
(Kita, 1993), we assume that each payoff is determined solely by functions $f_{ij}$ and
$g_{ij}$ of the TTCs $t_1$, $t_2$, $t_3$, and $t_4$. These are:
FQQ =
Gn =
Goo = 0 (3)
where $t_1$, $t_2$, $t_3$, and $t_4$ represent the TTCs between Car [1] and the end of the merging lane,
Car [1] and Car [2], Car [2] and Car [3], and Car [1] and Car [4], respectively, and are defined
as $t_1 = X/v_1$, $t_2 = y_2/(v_2 - v_1)$, $t_3 = (y_3 - y_2)/(v_3 - v_2)$, and $t_4 = y_4/(v_2 - v_1)$, where $v_1$, $v_2$,
$v_3$, and $v_4\ (= v_2)$ are the velocities of Cars [1], [2], [3], and [4], respectively. Functions
$f_{ij}$ and $g_{ij}$ determine the payoffs of the Merging Car and Through Car, respectively, when
the Merging Car and Through Car choose the strategies $i$ (= 0 or 1) and $j$ (= 0 or 1).
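The four TTCs can be computed directly from the initial positions and speeds. The sketch below assumes the definitions read as t1 = X/v1, t2 = y2/(v2 - v1), t3 = (y3 - y2)/(v3 - v2), and t4 = y4/(v2 - v1), with SI units and v1 < v2 < v3; the function name and the distances in the example are hypothetical, while the speeds and the 200 m lane length are those quoted in the case study.

```python
def time_to_collision(X, y2, y3, y4, v1, v2, v3):
    """Return (t1, t2, t3, t4) in seconds for distances in m and speeds in m/s.

    Assumes v1 < v2 < v3 and that the following car [4] travels at v2.
    """
    t1 = X / v1                 # Car [1] to the end of the merging lane
    t2 = y2 / (v2 - v1)         # Car [2] catching Car [1]
    t3 = (y3 - y2) / (v3 - v2)  # Car [3] catching Car [2]
    t4 = y4 / (v2 - v1)         # Car [4] catching Car [1]
    return t1, t2, t3, t4


if __name__ == "__main__":
    kmh = 1.0 / 3.6  # km/h to m/s
    # Distances y2, y3, y4 are made-up values for illustration.
    print(time_to_collision(200.0, 50.0, 150.0, 120.0,
                            82.8 * kmh, 85.8 * kmh, 95.7 * kmh))
```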
Equilibria
The game has the following eight possible Nash equilibria, $(x^*, y^*)$. (For details about how
to obtain a Nash equilibrium, refer, for example, to Rasmusen (1994).)
I. $(x^*, y^*) = (1, 1)$ when $F_{11} - F_{01} > 0$, $F_{10} - F_{00} > 0$, $G_{11} - G_{10} > 0$
II. $(x^*, y^*) = (1, 0)$ when $F_{11} - F_{01} > 0$, $F_{10} - F_{00} > 0$, $G_{11} - G_{10} < 0$
III. $(x^*, y^*) = (1, 1)$, $(0, 0)$, or $(G, F)$ when $F_{11} - F_{01} > 0$, $F_{10} - F_{00} < 0$, $G_{11} - G_{10} > 0$
IV. $(x^*, y^*) = (0, 0)$ when $F_{11} - F_{01} > 0$, $F_{10} - F_{00} < 0$, $G_{11} - G_{10} < 0$
V. $(x^*, y^*) = (G, F)$ when $F_{11} - F_{01} < 0$, $F_{10} - F_{00} > 0$, $G_{11} - G_{10} > 0$
VI. $(x^*, y^*) = (1, 0)$ when $F_{11} - F_{01} < 0$, $F_{10} - F_{00} > 0$, $G_{11} - G_{10} < 0$
VII. $(x^*, y^*) = (0, 0)$ when $F_{11} - F_{01} < 0$, $F_{10} - F_{00} < 0$, $G_{11} - G_{10} > 0$
VIII. $(x^*, y^*) = (0, 0)$ when $F_{11} - F_{01} < 0$, $F_{10} - F_{00} < 0$, $G_{11} - G_{10} < 0$
where $F = (F_{00} - F_{10})/\{(F_{00} - F_{10}) + (F_{11} - F_{01})\}$ and $G = -G_{01}/(G_{11} - G_{10} - G_{01})$.
All the cases above have a unique equilibrium except for Case III, which has multiple (three
possible) equilibria: (1, 1), (0, 0), and the mixed strategy equilibrium.
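The case analysis can be checked numerically. The sketch below enumerates the pure-strategy equilibria of a 2x2 game by testing unilateral deviations, and adds the interior mixed equilibrium from the usual indifference conditions (y makes the Merging Car indifferent, x makes the Through Car indifferent). It is an illustrative helper, not the authors' procedure; the function name and the payoff numbers are assumptions chosen to satisfy the sign conditions of Case III.

```python
def nash_equilibria(F, G, tol=1e-12):
    """Pure and interior mixed Nash equilibria (x, y) of a 2x2 game.

    F[(i, j)] and G[(i, j)] are the payoffs of the Merging Car and the
    Through Car; i = 1 is Merge, j = 1 is Go with Giveway.
    """
    equilibria = []
    for x in (0, 1):
        for y in (0, 1):
            merging_ok = F[(x, y)] >= F[(1 - x, y)]  # Merging Car cannot gain by deviating
            through_ok = G[(x, y)] >= G[(x, 1 - y)]  # Through Car cannot gain by deviating
            if merging_ok and through_ok:
                equilibria.append((float(x), float(y)))
    # Mixed strategies: each player's probability makes the opponent indifferent.
    dy = (F[(0, 0)] - F[(1, 0)]) + (F[(1, 1)] - F[(0, 1)])
    dx = (G[(0, 0)] - G[(0, 1)]) + (G[(1, 1)] - G[(1, 0)])
    if abs(dx) > tol and abs(dy) > tol:
        y = (F[(0, 0)] - F[(1, 0)]) / dy
        x = (G[(0, 0)] - G[(0, 1)]) / dx
        if 0 < x < 1 and 0 < y < 1:
            equilibria.append((x, y))
    return equilibria


if __name__ == "__main__":
    # Hypothetical payoffs with F11 - F01 > 0, F10 - F00 < 0, G11 - G10 > 0, G00 = 0.
    F = {(1, 1): 2.0, (0, 1): 1.0, (1, 0): 0.0, (0, 0): 1.0}
    G = {(1, 1): 1.0, (1, 0): 0.0, (0, 1): -1.0, (0, 0): 0.0}
    print(nash_equilibria(F, G))
```

With G00 = 0 the mixed probability x computed here reduces to -G01/(G11 - G10 - G01), which matches the expression for G quoted in the text.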
Case I and its equilibrium show that when the merging car has a higher risk at the end of the
merging lane and the through car recognizes a low risk on the passing lane, then (Merge,
Go with Giveway) occurs. Here, (·, ·) indicates ('behavior of the merging car (Merging Car)',
'behavior of the through car (Through Car)'). Case II and its equilibrium indicate that
when the situation of the merging car is the same as in Case I and the through car recognizes
a higher risk on the passing lane, then (Merge, Go without Giveway) may be realized. Case
III and its equilibria mean that when the two cars are very close in TTC, either (Merge,
Go with Giveway), (Pass, Go without Giveway), or their mixed-strategy equilibrium is
realized. Case IV and its equilibrium reveal that when the situation of the through car is
the same as in Case II and the merging car recognizes a higher risk of merging, then (Pass, Go
without Giveway) is realized. Case V and its equilibrium show that when the merging car
still has a longer merging lane remaining and feels more risk from the following car than
from the through car, and the through car recognizes a low risk on the passing lane, then the
mixed strategy equilibrium (in which the merging car sometimes chooses 'Merge' and the
through car sometimes chooses 'Go with Giveway') occurs. Case VI and its equilibrium
indicate that when the situation of the merging car is the same as in Case V while the through
car recognizes a higher risk on the passing lane, then (Merge, Go without Giveway) occurs.
Case VII and its equilibrium show that when the merging car is close to the following
car and still has a longer merging lane remaining and the through car recognizes a low
risk on the passing lane, then (Pass, Go without Giveway) is realized. Finally, Case VIII
and its equilibrium mean that when the situation of the merging car is the same as in Case VII
and the through car recognizes a higher risk on the passing lane, then (Pass, Go without
Giveway) may be realized.
Figure 3: Transition of Equilibria
Analyses of Equilibrium
The transitions among the equilibria according to the change of the environmental
parameters, the TTCs, are analyzed in this section. Fig. 3 shows the possible transitions of
equilibria due to the change of (decrease in) TTCs.
At equilibria I and II the merging car chooses Merge and therefore the merging-giveway
game ends. At IV, because the merging car does not choose Merge, the game does not
end. When the initial environmental situation is at IV, the game ends when the situation
moves either to II or to I via III. The transition of the equilibrium from IV to II means that
the inequality $F_{10} - F_{00} < 0$ changes into $F_{10} - F_{00} > 0$. This change is brought about
either by the shortening of the merging lane length remaining for the merging car or by a
decrease in the TTC between the following car [4] and the merging car [1]. This transition
of the equilibrium shows that, due to the increase in both the risk of staying in the merging
lane and the risk of merging into the next gap, merging into the first gap becomes the
outcome.
Next consider the case where the equilibrium moves from IV to I via III. The equilibrium
transition from IV to III occurs when the inequality $G_{11} - G_{10} < 0$ is reversed. The
merging car and through car become closer ($t_2$ becomes smaller) and the through car,
recognizing the higher risk with respect to the merging car, gives way to it, resulting in III.
In III there exist three equilibria: (0, 0), (1, 1), and (G, F). We assume that the realized
equilibrium in III is (0, 0) by assuming path dependent decisions by the players; before
the transition from IV to III the merging car and through car have chosen 'Pass' and 'Go
without Giveway', respectively, at the previous equilibrium of IV. The transition from III
to I occurs when the inequality $F_{10} - F_{00} < 0$ is reversed by the change of the environment,
similar to the transition from IV to II. This change of environment is brought about either
by the shortening of the merging lane remaining for the merging car [1] or by the decrease
in the TTC to the following car ($t_4$), and it therefore means that the merging car chooses
Merge because of the increased risks of staying in the merging lane and of waiting for the
next gap.
Next, consider the case in which the game starts in III, where there exist three equilibria.
Without any additional information we cannot designate which of the three equilibria is
realized. By considering the 'pre-game' situation, however, the equilibrium can be selected.
The game is set to begin when the merging car appears at the merging section.
This means that in the 'pre-game' situation, before the drivers make their strategy choices,
the actions are 'Pass' and 'Go without Giveway', i.e., (0, 0). Employing the
path dependence assumption again, we can designate (0, 0) among the three equilibria as
the equilibrium realized in III.
GAME EQUILIBRIA AND TRAFFIC PHENOMENA

Specification of Payoff Functions
In the previous section, the equilibria are obtained by modeling the merging-giveway be-
haviors by focusing on the decision making process of the drivers at a certain environment
'at a moment'. Additionally by considering the transitions of conditions under which the
equilibria hold, the changes of the merging and giveway decisions with time are explained.
In this section, by specifying the payoff functions the traffic phenomena realized at the
merging section are explained and the merging and giveway time and location are clarified.
Here the relationships between the payoffs and the TTCs given in (3) are specified so that
the payoffs are given by the corresponding TTCs themselves. The payoffs are specified by
TTCs as follows.
„, + - i^
VtL -
X -vit
_
-TOO —
- 2/2) - (vst - 3/3)

t>3 - vl
- 3/2) - (v3t - 3/3)
-V20
Gin —
Goo = 0 (4)
where $t$ is the length of time measured from when the merging car first appears at the
merging lane, $y_2$, $y_3$, and $y_4$ are the distances of Cars [2], [3], and [4], respectively, from
the merging nose when $t = 0$, and $x$ is the length of the merging lane.
By substituting the specification of the payoff functions in (4) into the three inequality
conditions, $F_{11} - F_{01} > (<)\, 0$, $F_{10} - F_{00} > (<)\, 0$, and $G_{11} - G_{10} > (<)\, 0$, they can be
rewritten as follows.
< V2 - Vi < Vi
Fw-
Gn — GIO . 0
< V 2 — V i > V3 — V i
Notice that the three conditions that determine the realized equilibrium do not
include the time variable $t$ and are therefore not affected by the progress of time. In
other words, when the payoffs consist of the TTCs themselves, equilibrium transitions
never appear. The merging and giveway times and locations are
therefore determined at once by the locations of all cars when the merging car appears
at the merging nose.
These three conditions can be interpreted as follows.
Inequality $F_{11} - F_{01} > (<)\, 0$: The sign of this inequality is determined by the TTC of
the merging car [1] and the following car [4], and the TTC of the merging car [1] and
the end of the merging lane. This corresponds to the comparison of the sizes of $t_1$
and $t_4$, indicating whether the merging car [1] will be passed by the following car [4]
before reaching the end of the merging lane.
Inequality $F_{10} - F_{00} > (<)\, 0$: The sign of this inequality is determined by the TTC of
the merging car [1] and the end of the merging lane, and the TTC of the merging
car [1] and the through car [2]. This corresponds to the comparison of the sizes of $t_1$
and $t_2$, indicating whether the merging car [1] will be passed by the through car [2]
before reaching the end of the merging lane.
Inequality $G_{11} - G_{10} > (<)\, 0$: The sign of this inequality is determined by the TTC of
the merging car [1] and the through car [2], and the TTC of the merging car [1] and
the passing car [3]. This corresponds to the comparison of the sizes of $t_2$ and $t_3$,
indicating whether the through car [2] will be passed by the passing car [3] before
catching the merging car [1].
Traffic Phenomena and the Game Equilibrium
The game is constructed to capture the merging-giveway situation 'at a moment' in the
merging section. Accordingly, the equilibria include ones in which the merging car
decides not to merge into the central lane (i.e., chooses 'Pass'; $x^* = 0$). We are interested
in analyzing not the drivers' behavior at a given moment but the traffic phenomena that
occur at the merging section. The merging car must merge into the central lane at some point
in time; otherwise it is stuck at the end of the merging lane. Using the game analyses given
above, we examine when and where the merging car merges into the central lane and
also when and where the through car chooses to give way.
Table 1: Merging and Giveway Behaviors and Equilibrium Conditions

Equilibrium  (x*, y*)              F11-F01  F10-F00  G11-G10  (x, y)   (t_m, t_g)
I            (1,1)                 +        +        -        (1,1)    (0, 0)
II           (1,0)                 +        +        +        (1,0)    (0, -)
III          (1,1), (0,0), (G,F)   +        -        -        (1,1)*   (y2/(v2-v1), y2/(v2-v1))
IV           (0,0)                 +        -        +        (1,0)*   (y2/(v2-v1), -)
V            (G,F)                 -        +        -        —
VI           (1,0)                 -        +        +        —
VII          (0,0)                 -        -        -        (1,1)*   (y4/(v2-v1), y4/(v2-v1))
VIII         (0,0)                 -        -        +        (1,0)*   (y4/(v2-v1), -)

* Indicates results that differ from the game equilibrium
According to the interpretation of the three inequality conditions given in the
previous section, the game equilibria can be interpreted as traffic phenomena (merge
and giveway behaviors and their timings) under the payoff function specification by the
TTCs themselves given above. They are summarized in the sixth and seventh columns of
Table 1. In Table 1, $(x, y)$ indicates whether the merging car [1] ever merges ($x = 1$
when merging) and whether the through car [2] changes lane and chooses giveway
behavior ($y = 1$ when giveway), respectively. Notice that "—" in Table 1 means that the
corresponding behavior cannot occur because the conditions are inconsistent. $t_m$ and
$t_g$ indicate the time at which the merging car [1] merges into the central lane and the time
at which the through car [2] changes to the far right lane (chooses 'Go with Giveway'),
respectively.
The traffic phenomena realized in the cases that correspond to equilibria I and II are
exactly the same as the equilibrium strategies ($(x^*, y^*) = (x, y)$). In equilibrium I, the
merging car merges into the central lane immediately and the through car goes with giveway
immediately ($t_m = t_g = 0$). In equilibrium II, the merging car merges into the central lane
immediately, and the through car [2] stays in the central lane.
The other cases, Cases III to VIII, have traffic phenomena that differ from the game
equilibria (compare the entries in the second and sixth columns of Table 1). The reasons and
the interpretations of these cases are as follows.
In equilibrium III, the merging car [1], which will be passed by the through car [2] before
the end of the merging lane but not by the following car [4], merges into the central lane
right after being passed by the through car (into the gap between the through car [2] and
the following car [4]). Also, the through car [2], faced with the risk of catching the merging
car [1], chooses to give way to the merging car and changes to the passing lane
because it is never caught by the passing car [3]. Consequently, the traffic phenomenon in
this case is (Merge, Go with Giveway), and the times of merging and giveway, which
occur simultaneously, are given by $t_m = t_g = y_2/(v_2 - v_1)$.
In equilibrium IV, the situation and the behavior of the merging car [1] are the same as in
equilibrium III above. The through car [2], which is at risk of catching the merging car
[1] before the end of the merging lane, cannot change to the passing lane because it would
be caught by the passing car [3]. Consequently, the corresponding traffic phenomenon is
(Merge, Go without Giveway). The merging time is given by $t_m = y_2/(v_2 - v_1)$ because
the merging car merges into the central lane just after being passed by the through car, and
the giveway time does not exist because no giveway behavior occurs.
Under the specification of the payoffs in this study, the two conditions $F_{11} - F_{01} < 0$ and
$F_{10} - F_{00} > 0$, which constitute equilibria V and VI, imply that the following car [4] is
running ahead of the through car [2]. Consequently, these conditions cannot hold and the
traffic phenomena corresponding to these two equilibria do not exist.
In equilibrium VII, the merging car [1] is passed by the through car [2] and also by the
following car [4] before reaching the end of the merging lane. It merges into the central lane
just after being passed by the following car [4]. The through car [2], which is assumed
always to run in front of the following car [4], has already passed the merging car when the
latter merges into that lane. Also, the through car [2], which is never caught by the passing
car [3], changes to the passing lane. Consequently, the traffic phenomenon
realized is (Merge, Go with Giveway). The corresponding merging and giveway
times, which occur simultaneously, are given by $t_m = t_g = y_4/(v_2 - v_1)$.
Finally, in equilibrium VIII, the situation and the behavior of the merging car [1] are
exactly the same as in equilibrium VII above, and it merges into the central
lane right after being passed by the following car [4]. The through car [2], which will
be caught by the passing car [3] before the merging car reaches the end of the merging
lane, does not change to the passing lane. Consequently, the realized traffic
phenomenon is (Merge, Go without Giveway). The corresponding merging time is given
by $t_m = y_4/(v_2 - v_1)$, while the giveway time does not exist.
Consequently, the merging car merges into the central lane immediately when the equilibrium
is I or II, merges after being passed by the through car [2] when it is III or IV, and merges
after being passed by the following car [4] when it is VII or VIII. On the other hand, the
through car changes to the far right lane when the equilibrium is I, III, or VII, and does
not change lane for II, IV, or VIII. These relationships between the equilibria and the
resulting traffic behaviors are depicted in Fig. 4 on the TTC planes $(t_2, t_4)$ and $(t_2, t_3)$,
using the three inequalities that govern the realized equilibrium. Fig. 4 a) depicts the
merging behavior; the horizontal and vertical lines that distinguish the equilibrium
areas indicate $t_2 = X/v_1$ and $t_4 = X/v_1$, respectively. Fig. 4 b) shows the equilibrium
areas for the giveway behavior; the line that distinguishes the two equilibrium areas is given
by $t_3 = (v_2 - \ldots$
Figure 4: Occurrence of the merging and giveway behaviors. a) TTCs of cars and the merging
behavior (regions: merge immediately, merge behind Car [2], merge behind Car [4]; axes: TTC
of Car [1] and Car [4], TTC of Car [1] and Car [2]). b) TTCs of cars and the giveway behavior.
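The merging part of this classification (Fig. 4 a)) depends only on how t2 and t4 compare with t1, the time for the merging car to reach the end of the merging lane. A minimal sketch under that reading follows; the function name, the string labels, and the treatment of ties are assumptions, and the giveway condition is deliberately left out because its exact form is not reproduced here.

```python
def merging_behavior(t1, t2, t4):
    """Classify the merging behavior and merging time t_m from the TTCs.

    t1: time for Car [1] to reach the end of the merging lane
    t2: TTC of Car [1] and the through car [2]
    t4: TTC of Car [1] and the following car [4]
    Ties (t2 == t1 or t4 == t1) are treated as "not passed" by assumption.
    """
    passed_by_through = t2 < t1
    passed_by_following = t4 < t1
    if not passed_by_through and not passed_by_following:
        return "merge immediately", 0.0        # equilibria I and II
    if passed_by_through and not passed_by_following:
        return "merge behind Car [2]", t2      # equilibria III and IV
    if passed_by_through and passed_by_following:
        return "merge behind Car [4]", t4      # equilibria VII and VIII
    # Car [4] would pass before Car [2]: inconsistent (equilibria V and VI).
    return "inconsistent", None


if __name__ == "__main__":
    print(merging_behavior(8.696, 5.0, 30.0))
```

Because t_m equals 0, t2, or t4 in the three feasible regions, the function returns the merging time together with the label.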
A CASE STUDY
Data and Their Handling
To examine the proposed model, a case study is carried out using a set of video-recorded
data observed at an on-ramp merging section of Nagoya I.C., Tohmei Expressway, in
Nagoya (the data source is Research Group on Design of Intersections (1987)). The number
of observed samples of merging behavior was 74. Among these, only observations that
have complete sets of explanatory variables were used; there are 10 such complete sets
in the data. The merging lane is 200 m long, and the average
velocities of the vehicles were assumed to be constant and estimated as $v_1 = 82.8$ km/h for
the merging car, $v_2 = v_4 = 85.8$ km/h for the through and following cars, and $v_3 = 95.7$ km/h
for the passing car.
Results
The data are plotted in Fig. 5 a) and Fig. 5 b). The solid dots and circles indicate data
for which "Car [1] merges behind Car [2]" and "Car [1] merges immediately", respectively, in
Fig. 5 a), and "Car [2] goes with giveway" and "Car [2] goes without giveway", respectively,
in Fig. 5 b).
Given the estimated velocities, the lines $t_2 = x/v_1$ and $t_4 = x/v_1$ that distinguish the
equilibrium areas in Fig. 4 a) both lie at 8.696 sec. The plots of the observed data ('Merge
immediately' and 'Merge behind Car [2]') are consistent with the designated areas of the
game equilibrium, while no data are available in which a merging car merges into the central
lane after being passed by the following car ('Merge behind Car [4]').
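The 8.696 sec threshold follows from the 200 m merging lane and the 82.8 km/h merging-car speed quoted in the data description. As a check on the arithmetic (the symbols $x$ and $v_1$ are assumed to denote the lane length and the merging-car speed):

```latex
\[
  \frac{x}{v_1}
  = \frac{200\ \mathrm{m}}{82.8\ \mathrm{km/h}}
  = \frac{200\ \mathrm{m}}{23.0\ \mathrm{m/s}}
  \approx 8.696\ \mathrm{s}
\]
```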
Next consider Fig. 5 b). According to the velocities of the car set, the slope that distin-
guishes the two areas is (v\ — v\)/(v^ — v\] = 0.36. The nine data plots for the cars that did
not give way are consistent with the area designated by the game equilibrium analysis,
and the same is true for the one observation in which the car went with giveway.
Though the sample size is not large enough to fully support the validity of the model,
the data indicate that the proposed model gives
a good description of giveway and merging behavior in on-ramp merging
sections. This result supports the ability of this new approach to explain traffic
phenomena involving interactions.
CONCLUSION
In this study, traffic behaviors such as merging and giveway are expressed as game
strategies, and the relationships between the realized traffic phenomena and the driving
environment surrounding the drivers are clarified. By understanding the change of driving
behaviors (and therefore of the traffic phenomena) as the transition from one game
equilibrium to another, the mechanism by which the drivers' behaviors and the resulting
phenomena occur and change with the progress of time and the change of the driving
environment is explained. Furthermore, by specifying the payoff functions in the
merging-giveway game model, the conditions under which the merging and giveway behaviors
emerge are explicitly derived, and the merging and giveway timing and locations are
specified. With this model development, a unique correspondence is found between a given
driving environment and the resulting pair of driving actions of the drivers as an
equilibrium solution of the game, which makes it possible to examine the replication ability
of the model based on observed data. Finally, the validity and applicability of the proposed
model are checked against observed data.
Figure 5: Observed merging and giveway behaviors. a) TTCs of cars and merging behavior
(axes in sec; white: merge behind Car [2]; black: merge immediately). b) TTCs of cars and the
giveway behavior (axes in sec; line t3 = 0.36 t2; white: go with giveway; black: go without
giveway).
With the approach and models developed in this study, the occurrence and change
(location and timing) of drivers' behaviors such as giveway and merging motions, and
the traffic phenomena resulting from the combination of such behaviors, can be specified.
Accordingly, this study can contribute to the construction of effective traffic
control systems and to the design of safer and more comfortable intersections of
expressways that explicitly consider car motions.
The game theoretic approach to traffic behavior modeling has so far received little
research attention, and several important aspects remain uninvestigated. Most
importantly, a new payoff estimation technique needs to be developed. This would make
other important advances possible: the application of non-linear payoff functions,
which would enable us to explain merging in the middle of a gap, and game
modeling with incomplete information. A formulation that translates this microscopic
model into macroscopic results should also be developed.
These extensions, which have not been considered in this study, are all important; the
existence of many possible future directions indicates that the model and approach can
be easily extended to become more robust.
ACKNOWLEDGEMENT
The authors would like to express their appreciation to the anonymous referees for their
valuable comments. This study was supported financially in part by the Grant-in-Aid for
Scientific Research, Ministry of Education, and Sumitomo Marine Welfare Foundation.
REFERENCES
Cassidy, M. J. et al. (1990). A proposed analytical technique for the design and analysis
of major freeway weaving sections, Inst. of Transp. Studies, Univ. of California at
Berkeley, UCB-ITS-RR-90-16.
Chang, G.-L., and Y.-M. Cao (1991). An empirical investigation of macroscopic lane changing
characteristics on uncongested multilane freeways, Trans. Res., 25A, 6, 375-389.
Kita, H. (1993). Effects of merging lane length on the merging behavior at expressway
on-ramps. In: Transportation and Traffic Theory (C. Daganzo ed.), Elsevier, 37-51.
Kita, H. (1999). A merging-giveway interaction model of cars in a merging section: a game
theoretic analysis, Transportation Research, Vol.33A, No.3/4, 305-312.
Nielsen, M. A., and Rysgaard, R. (1995). Merging Contra Give Way When Entering A
Motorway. Road Directorate Report, 27, Danish Road Directorate, Copenhagen.
Rasmusen, E. (1994). Games and Information. 2nd edition, Blackwell Publisher,
Massachusetts, 67-91.
Japan Society of Traffic Engineers (1987). Report on the Design of the Intersections,
Research Group on Design of Intersections, 2, Japan Society of Traffic Engineers (in
Japanese).
Transportation Research Board (1985). Highway Capacity Manual, Special Report, 209.
Troutbeck, R. J. (1995). The capacity of a limited priority merge. Physical Infrastructure
Centre Research Report, 95-8, Queensland University of Technology, Australia.
Vermijs, R. G. M. M. (1991). The use of micro simulation for the design of weaving
sections, In: Highway Capacity and Level of Service (U. Brannolte. ed.), 419-227,
A. A. Balkema.
CHAPTER 3
ROAD SAFETY AND PEDESTRIANS
• Nothing in life is to be feared. It is only to be understood. (Marie Curie)

• There is more to life than increasing its speed. (Mahatma Gandhi)
• A problem adequately stated is a problem well on its way to being solved.

Comparison of Results of Methods of the Identification of High Risk Road Sections
Marian Tracz, Chair of Highway & Traffic Engineering, Cracow University of Technology, ul.
Warszawska 24, 31-155 Cracow, POLAND, e-mail: mt@rodes.wil.pk.edu.pl
Marzena Nowakowska, Laboratory of Computer Science, Kielce University of Technology,
Al. 1000-lecia P.P. 3, 25-314 Kielce, POLAND, e-mail: spimn@eden.tu.kielce.pl
ABSTRACT
In this paper some measures for evaluating the effectiveness of methods of identifying
high risk sections on rural roads are proposed. Two categories of measures are
taken into account. The first concerns the level of concentration of accidents along hazardous
road sections, as well as the level of concentration of accident severity. The second category
concerns the repeatability of the identification procedures over two consecutive
time periods, in relation both to the location of dangerous sections along a road and to some
features of road accidents. The measures have been applied to three Polish methods, and real
accident data have been used to conduct the comparison. Some gaps found in the definitions
of the identification methods have been filled in this study. The results varied across
the aspects considered and across accident data from different roads.
INTRODUCTION
Accident risk can be an important factor in identifying and evaluating dangerous road
sites. In practice, such sites are found in parts of a road network where accident
intensity is comparatively high. Studies are therefore usually focused on urban
roads and on sites that are prone to accidents of various types because of their geometrical
characteristics, such as urban junctions.
In the identification and evaluation of dangerous road sections, the majority of highway
authorities in Poland use methods based on rather simple criteria, such as the number of
accidents or the average accident density (acc./km/year) recorded on a road section during
a certain period of time (usually 3 years). In research it is commonly assumed that the number
of accidents at a site over a fixed time period is well modelled by a Poisson distribution. Traffic
volume is not taken into account because current traffic data are lacking.
When investigating the safety of rural roads, the problem arises of how to identify dangerous
sections, as they can have various lengths and are not evenly spaced along a road. Polish
methods use different algorithms for dividing a road into sections for further investigation. In
addition, the time periods for which high risk road sites are identified differ between
methods. In consequence, the results of accident analyses are frequently not comparable.
In this study some measures are proposed for evaluating the effectiveness of
methods oriented towards rural roads. These measures should help those who deal with
the problem to compare results. The paper not only presents a
methodological study of the analysed methods but also includes some practical aspects.
MEASURES FOR COMPARING RESULTS OF THE IDENTIFICATION OF HIGH RISK ROAD SECTIONS
Several methodologies are used in the analysis of traffic accidents and in the
identification of high risk sections on a road (Hauer, 1995). When different methods are used,
a change of selection criterion can lead to different distributions and numbers of dangerous
sections along a road. This in turn can affect the effectiveness of road safety measures.
All these issues depend to a great extent on the reliability and validity of accident data (Hakkert
and Hauer). Consequently, the results of the identification of dangerous road sections are
influenced by the accident analysis methodology, data collection and data handling. Therefore, an
important question arises: which identification procedure should be used to get the best results?
A variety of techniques have been applied to answer this question (Maher and Mountain, 1988;
Mountain et al., 1992a, 1992b). To choose a proper method, researchers use statistical tools
and indices derived on the basis of probability theory. In the analyses they use real or artificial
(computer-generated) accident data. In this paper some measures for evaluating and
comparing the performance of various identification methods are introduced. The
comparison was carried out on the basis of real accident data.
Method Inaccuracy Coefficient
Identification procedures applied to rural roads usually rely on data from time periods
of different lengths. Such data carry information at various levels of reference, so the
results of identifying dangerous road sections obtained from these data should be
compared on the basis of indices that can be easily examined. For road sections, a method
can be considered more accurate if:
• the proportion of the total number of accidents on dangerous sections in relation to the
total number of accidents on the whole analysed road is high,
• the proportion of the total length of these sections in relation to the length of the road is low.
The average accident density (acc./km/year) is a very commonly used ratio. The greater its
value calculated for hazardous road sections, the better the results of an identification
procedure. In the extreme, the accident density tends to infinity as the section length
tends to zero. There are, however, some disadvantages of using this measure. When comparing
the accuracy of results, it works quite well for one road but not across several roads.
Average accident density is an absolute measure and can also be considered as a
supplementary measure.
In order to obtain a more intuitive measure, a relative ratio is put forward in this
work. It is called the Method Inaccuracy Coefficient (MIC) and is defined as follows:
$$ MIC = \frac{\sum_{i=1}^{n} l_i / rl}{\sum_{i=1}^{n} nacc_i / rnacc} \qquad (1) $$
where: $n$ - number of high risk sections on a road,
$l_i$ - length of the i-th high risk road section,
$rl$ - length of the considered road,
$nacc_i$ - total number of accidents on the i-th high risk road section,
$rnacc$ - total number of accidents on the road.
The MIC coefficient expresses the ratio of the share of the road length taken up by dangerous
sections to the share of the road's accidents that occurred on these sections. The value of
the coefficient can exceed unity. However, if the sections really are high risk road
sections, the value lies in the interval <0,1>. The closer to zero the value of the
coefficient, the more accurate or precise the method, and vice versa.
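As an illustration of formula (1), the coefficient can be computed in a few lines of Python. This is only a sketch: the function name, argument names and the sample road below are hypothetical and are not taken from the study.

```python
def method_inaccuracy_coefficient(section_lengths, section_accidents,
                                  road_length, road_accidents):
    """MIC = (sum(l_i) / rl) / (sum(nacc_i) / rnacc), following formula (1)."""
    if len(section_lengths) != len(section_accidents):
        raise ValueError("one accident count is needed per section")
    length_share = sum(section_lengths) / road_length
    accident_share = sum(section_accidents) / road_accidents
    if accident_share == 0:
        raise ValueError("no accidents on the identified sections")
    return length_share / accident_share


# Hypothetical road: 50 km long, 100 accidents, two identified sections.
mic = method_inaccuracy_coefficient([2.0, 3.0], [10, 15], 50.0, 100)
print(round(mic, 2))  # 10% of the length holds 25% of the accidents: 0.4
```

Passing injury, fatality or vehicle counts instead of accident counts gives the corresponding severity coefficients in the same way.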
Coefficients describing accident severity are defined in the same manner and are used here as
supporting measures. They are denoted MIC(I), MIC(F) and MIC(V) and are calculated
according to formula (1), in which the number of accidents is replaced by the
number of injuries (I), fatalities (F) and vehicles (V), respectively. These coefficients are
interpreted in the same way as the MIC.
Space Compatibility Index
When considering the identification of hazardous rural road sections, an additional
potential impact should be taken into account. It is commonly known as "accident
migration" or "migration of accident risk" and refers to a phenomenon that arises after
remedial treatments have been applied on a road section, or because of an increased percent
time delay (travelling in platoon). Examples are accidents typical of dangerous overtaking,
which sometimes occur a few kilometres from the place where the overtaking demand arose.
However, during this study it was noticed that the ends of a high risk road section can
change between two consecutive time periods of the identification process, even if no treatment
has been applied to the "earlier period" section. In this paper, such period-after-period
changes in the locations of hazardous sections along a road are expressed by a
certain index. This index has been built using "positive" and "negative" differences between
the mutual positions of the ends of consecutive (earlier and later) overlapping dangerous road
sections:
$$ dl_i = \mathrm{sgn}(l_i(1) - l_i(2)) \cdot |l_i(1) - l_i(2)| \qquad (2) $$
$$ dr_i = \mathrm{sgn}(r_i(1) - r_i(2)) \cdot |r_i(1) - r_i(2)| \qquad (3) $$
where: $l_i(1)$ - kilometreage of the left end of the i-th overlapping section obtained from an
identification procedure for the earlier time period; $l_i(2)$ is defined likewise for
the later time period,
$r_i(1)$ - kilometreage of the right end of the i-th overlapping section obtained from an
identification procedure for the earlier time period; $r_i(2)$ is defined likewise for
the later time period,
$dl_i$ - difference between the left ends of the two i-th overlapping road sections; the sign
of the difference shows whether the kilometreage of the end of the later time
period section is lower ("plus" sign) or higher ("minus" sign) than the
kilometreage of the respective end of the earlier time period section,
$dr_i$ - defined similarly to the $dl_i$ difference, with respect to the right ends.
The earlier time period, for which identification procedures have been processed, will also be
called the first time period in this paper, and the later time period will be called the second time
period. The scheme of overlapping of two identified dangerous road sections is presented in
Figure 1. In this scheme, the lower section s(1) represents a road section from the first time
period, whereas the upper section s(2) represents a road section from the second time period.
The notations s(1) and s(2) also denote the lengths of the respective sections. The remaining
notations follow the explanation given for equations (2) and (3). Subscripts are omitted
for simplicity.
Figure 1. Illustration of a mutual position of overlapping dangerous road sections identified
for two consecutive time periods (horizontal axis: kilometreage).
All possible alternatives of the mutual position of the two overlapping dangerous road sections s(1)
and s(2), obtained for the first and the second time periods in accident analysis, are presented in
Figure 2. For each alternative a short description is given that explains the meaning of the signs of
the positive and negative differences (formulae (2) and (3)). As can be seen, there is one perfect
overlap (alternative IX) among the nine presented cases. There are also a few other
alternatives that can be considered as representing satisfactory space compatibility of the
overlapping sections: alternatives III-VIII. Only in the two cases I and II does the space
compatibility seem doubtful. The doubt is especially justified when the length of the common part
of the two sections is shorter than 50% of the length of the total section <l(2), r(1)> in case I or of
the total section <l(1), r(2)> in case II.
In order to evaluate the degree of repeatability of an identification method - regarding the space
distribution of the identified dangerous sections, the Space Compatibility Index (SCI) was
introduced as follows:
(4)
where: $n_1$ - number of dangerous sections on a road identified in the first time period,
$n_2$ - number of dangerous sections on the same road identified in the second time
period,
$n$ - number of dangerous sections on the road identified in the first time period that
have common parts with dangerous sections on the same road identified in the
second time period; i.e. the number of pairs of dangerous sections on the road
identified in two consecutive time periods. The mutual locations of sections
treated as a pair follow one of the schemes presented in Figure 2. If all
dangerous sections from both time periods follow the schemes, then $n_1 = n_2 = n$,
$x_i$ - coefficient that specifies the way in which the i-th pair of dangerous road
sections overlaps. The coefficient is defined only for those pairs of sections that
have common parts (i = 1,...,n), and it distinguishes satisfactory space
compatibility (alternatives III-IX in Figure 2) from unsatisfactory space
compatibility (alternatives I-II):
$$ x_i = \begin{cases} 1.0 & dl_i \cdot dr_i \le 0 \\ ssl_i / s_i(1) & dl_i \cdot dr_i > 0 \end{cases} \qquad i = 1 \ldots n \qquad (5) $$
In formula (5), $ssl_i$ is the total length spanned by the two overlapping road
sections, determined by their outermost ends:
$$ ssl_i = \max\{r_i(1), r_i(2)\} - \min\{l_i(1), l_i(2)\} $$
and $s_i(1)$ is the length of the i-th overlapping road section identified in the first time
period.
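The coefficient of formula (5) can be sketched as a small Python function. The sketch reads the case condition as the sign of the product of the two differences and the second case as the ratio of the spanned length to the first-period section length; the function and argument names are hypothetical, and the kilometreage values in the usage lines are invented for illustration.

```python
def overlap_coefficient(l1, r1, l2, r2):
    """x_i of formula (5) for one pair of overlapping sections.

    (l1, r1) are the left and right ends of the first-period section and
    (l2, r2) those of the second-period section, given as kilometreage.
    """
    if min(r1, r2) < max(l1, l2):
        raise ValueError("the two sections have no common part")
    dl = l1 - l2  # formula (2): sgn(a) * |a| is simply a
    dr = r1 - r2  # formula (3)
    if dl * dr <= 0:
        return 1.0  # alternatives III-IX: satisfactory compatibility
    ssl = max(r1, r2) - min(l1, l2)  # length spanned by both sections
    s1 = r1 - l1                     # length of the first-period section
    return ssl / s1


# Second-period section shifted to the right (alternative II).
print(overlap_coefficient(10.0, 12.0, 11.0, 13.5))  # 1.75
# Second-period section covers the first one (alternative III).
print(overlap_coefficient(10.0, 12.0, 9.0, 13.0))   # 1.0
```

With this reading, a shifted pair always yields a coefficient greater than 1, and the coefficient grows with the size of the shift.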
The idea of the SCI index described by (4) has been taken from the concept of the standard
logarithm information function (Nicholson, 1995). The SCI index is a complex measure that
takes into account the number and the mutual positions of dangerous road sections identified
in two consecutive time periods. The minimal value of SCI is equal to 0.3. This value describes
the most satisfactory space compatibility, where $n_1 = n_2 = n$ and $x_i = 1.0$ for each i = 1,...,n. This
means that all dangerous road sections identified in the first time period were also identified as
dangerous in the second time period with satisfactory period-after-period shifts along the road
(schemes III-IX), and no additional sections were identified in the second time period.
I) dl>0, dr>0; the s(2) section is shifted to the left in relation to the s(1) section
II) dl<0, dr<0; the s(2) section is shifted to the right in relation to the s(1) section
III) dl>0, dr<0; the s(2) section covers the s(1) section
IV) dl<0, dr>0; the s(2) section is covered by the s(1) section
V) dl=0, dr>0; the s(2) section is covered by the s(1) section, left ends are equal
VI) dl=0, dr<0; the s(2) section covers the s(1) section, left ends are equal
VII) dl>0, dr=0; the s(2) section covers the s(1) section, right ends are equal
VIII) dl<0, dr=0; the s(2) section is covered by the s(1) section, right ends are equal
IX) dl=0, dr=0; the perfect overlapping - s(2) and s(1) are the same sections
Figure 2. Schemes of mutual positions of overlapping dangerous road sections obtained from
an identification procedure for two consecutive time periods.
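The nine alternatives of Figure 2 are determined entirely by the signs of the two differences, so a pair of sections can be classified with a simple lookup. The following Python sketch is illustrative only; the function names and the sample kilometreage values are hypothetical.

```python
def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


# (sign of dl, sign of dr) -> alternative listed in Figure 2
ALTERNATIVES = {
    (1, 1): "I", (-1, -1): "II", (1, -1): "III", (-1, 1): "IV",
    (0, 1): "V", (0, -1): "VI", (1, 0): "VII", (-1, 0): "VIII",
    (0, 0): "IX",
}


def classify_overlap(l1, r1, l2, r2):
    """Alternative of Figure 2 for first-period ends (l1, r1) and
    second-period ends (l2, r2), all given as kilometreage."""
    dl = l1 - l2  # positive when the later left end has lower kilometreage
    dr = r1 - r2  # positive when the later right end has lower kilometreage
    return ALTERNATIVES[(sign(dl), sign(dr))]


print(classify_overlap(10.0, 12.0, 9.0, 13.0))   # III: s(2) covers s(1)
print(classify_overlap(10.0, 12.0, 11.0, 13.5))  # II: s(2) shifted right
```

Because equal ends are detected by exact comparison, the section ends should be rounded to the same kilometreage resolution before they are classified.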
A value of the SCI higher than 0.3 means that:
• $n_1 = n_2$ but there are "unsatisfactory shifts" in the mutual positions of overlapping sections,
such as those presented by schemes I or II (Figure 2), or
• $n_1 \ne n_2$ but there are "satisfactory shifts" in the mutual positions of overlapping sections,
such as those presented by schemes III-IX, or
• $n_1 \ne n_2$ and there are "unsatisfactory shifts" in the mutual positions of overlapping sections.
The last case is usually the most frequent one. The greater the value of SCI, the worse the
method of identifying high risk road sections is with regard to space compatibility.
Accident Patterns Repeatability Measures
The SCI index evaluates the space repeatability of dangerous road sections in two consecutive
time periods. In order to support this index, some additional measures are suggested in this
section to check the repeatability of accident patterns on overlapping road sections.
In this work, accident patterns are expressed by qualitative features of the accidents recorded on
dangerous road sections, such as driver's behaviour and accident type. The classification
categories representing these features have to be independent of each other within each
feature (Tracz and Nowakowska, 1998).
The measures assessing the repeatability of accident patterns are built using the well-known
standard error idea from estimation theory. For each overlapping dangerous
section i, the occurrence of the classification categories of a chosen accident feature was checked
for the two consecutive time periods t = 1, 2. Then, if a given accident category was recorded in at
least one time period, the percentages of occurrence of this category in the respective time periods
were calculated according to formula (6):
$$ FCP_{ij}(t) = \frac{n_{ij}(t)}{\sum_{m=1}^{k_i} n_{im}(t)} \qquad t = 1, 2 \quad i = 1 \ldots n \quad j = 1 \ldots k_i \qquad (6) $$
where: $FCP_{ij}(t)$ - percentage of the occurrence of the j-th accident feature category, recorded on
the i-th overlapping dangerous road section in the time period t,
$n_{ij}(t)$ - frequency of the occurrence of the j-th accident feature category on the i-th
overlapping dangerous road section recorded in the time period t,
$k_i$ - number of different categories of the accident feature recorded on the i-th
overlapping pair of road sections.
The introduction of the percentages $FCP_{ij}$ made it possible to define three complementary
measures of accident pattern repeatability:
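- minimum absolute difference MIAD(FCP) in accident feature pattern:
$$ MIAD(FCP) = \min_{i=1,\ldots,n} \left\{ \frac{1}{k_i} \sum_{j=1}^{k_i} \left| FCP_{ij}(1) - FCP_{ij}(2) \right| \right\} \qquad (7) $$
- maximum absolute difference MAAD(FCP) in accident feature pattern:
$$ MAAD(FCP) = \max_{i=1,\ldots,n} \left\{ \frac{1}{k_i} \sum_{j=1}^{k_i} \left| FCP_{ij}(1) - FCP_{ij}(2) \right| \right\} \qquad (8) $$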
- mean absolute difference MAD in accident feature pattern:
(9)
All notations on the right-hand sides of the definitions (7) - (9) are the same as in the measures
described earlier (formulae (1), (4), (6)). The range of the measures of accident pattern
repeatability is the interval <0,1>. The value 0 of the measures MAAD and MAD confirms
perfect repeatability in accident patterns. The closer to unity the values of the MIAD and the
MAD are, the lower the degree of repeatability.
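The three repeatability measures can be illustrated with a short Python sketch. Several points are assumptions rather than statements of the source: the category percentages are taken as shares of the category counts within each period, each per-section value is the average absolute difference over the categories recorded in at least one period, and MAD is taken as the unweighted mean of the per-section values, since formula (9) is not legible in this text. All names and the sample counts are hypothetical.

```python
def category_shares(counts):
    """Shares of each category in one period (all zeros if nothing recorded)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count / total for count in counts]


def section_difference(counts_1, counts_2):
    """(1 / k_i) * sum_j |FCP_ij(1) - FCP_ij(2)| for one pair of sections."""
    # Keep only categories recorded in at least one of the two periods.
    pairs = [(a, b) for a, b in zip(counts_1, counts_2) if a or b]
    if not pairs:
        raise ValueError("no accident category recorded in either period")
    shares_1 = category_shares([a for a, _ in pairs])
    shares_2 = category_shares([b for _, b in pairs])
    total = sum(abs(a - b) for a, b in zip(shares_1, shares_2))
    return total / len(pairs)


def repeatability(sections):
    """MIAD, MAAD and (assumed unweighted) MAD over pairs of sections."""
    diffs = [section_difference(c1, c2) for c1, c2 in sections]
    return {"MIAD": min(diffs), "MAAD": max(diffs),
            "MAD": sum(diffs) / len(diffs)}


# Hypothetical counts per accident type for two overlapping pairs.
sections = [([4, 4, 0], [4, 4, 0]),   # identical pattern in both periods
            ([2, 0, 0], [0, 2, 0])]   # completely different pattern
print(repeatability(sections))  # {'MIAD': 0.0, 'MAAD': 1.0, 'MAD': 0.5}
```

The two sample pairs mark the ends of the range: an unchanged pattern gives 0 and a pattern with no category in common gives 1.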
METHODS OF THE IDENTIFICATION OF HIGH RISK ROAD SECTIONS USED IN POLAND
A few methods for evaluating traffic safety levels on rural roads have been implemented
in Poland. Some of them are attractive to traffic engineers because of their simplicity. These
are the commonly used Warsaw method and Gdansk method (Datka et al., 1989, 1997).
Both are based on accident numbers and average accident
densities. Other methods are less popular because of their requirements regarding accident and
traffic data, or because of the probabilistic treatment of the stochastic character of road accidents.
One such method has been developed recently (Tracz and Nowakowska, 1996). It is based on
cluster analysis and Bayesian theory and takes into account the random character of accidents in
time and space. These three methods are the subject of the survey in this work. For
simplicity they are named here the WM method, the GM method and the BM method.
The Warsaw Method
A unit road section is used to define the method. It is a one-kilometre-long section
determined by the road kilometreage. In the WM method, a unit section on which at least
four road accidents have been recorded during a year is classified as a particularly dangerous
unit road section, whereas a unit section with two or three road accidents
during a year is classified as a dangerous unit road section. In order to characterise the
safety of any road section, the numbers of both types of unit sections on this section are
considered.
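The unit-section rule of the WM method can be sketched in Python as follows. The helper names, the label used for unit sections with fewer than two accidents, and the sample accident locations are hypothetical; the thresholds are the ones stated in the text.

```python
import math
from collections import Counter


def classify_unit_section(accidents_per_year):
    """WM class of a one-kilometre unit section for one year of data."""
    if accidents_per_year >= 4:
        return "particularly dangerous"
    if accidents_per_year >= 2:
        return "dangerous"
    return "not classified"  # label assumed here for fewer than 2 accidents


def classify_road(accident_km):
    """Count accidents per unit section and classify each unit section.

    accident_km lists the kilometreage of every accident in one year.
    """
    counts = Counter(math.floor(km) for km in accident_km)
    return {unit: classify_unit_section(n) for unit, n in sorted(counts.items())}


# Hypothetical accident locations (km) recorded during one year.
print(classify_road([12.1, 12.4, 12.8, 12.9, 13.5, 13.6, 15.2]))
# {12: 'particularly dangerous', 13: 'dangerous', 15: 'not classified'}
```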
However, the definition given above has a disadvantage. It does not state how
many particularly dangerous and dangerous unit road sections the considered road
section should contain in order to be identified as a high risk one. To make the
definition more precise, the performance of the WM method applied to real accident data has
been studied. As a result, the following completion has been made in this work:
At least three unit road sections with the number of accidents not less than 2 and with the
distance between these sections not less than 2 km identify the location of a high risk section on
a road.
The Gdansk Method
The method classifies a road section on the basis of its accident density $D$ (acc./km/year).
The classification criterion uses the accident density $D_{rd}$ of the road that contains the
considered section. The $D_{rd}$ value depends on the administrative division of Polish roads,
as it has to be calculated for the part of the road that lies within the administrative border
of a province. According to the GM method, a road section can be classified into one of three
categories:
• relatively safe, if $D_{rd} < D \le 2 \cdot D_{rd}$,
• being threatened, if $2 \cdot D_{rd} < D \le 3 \cdot D_{rd}$,
• dangerous, if $D > 3 \cdot D_{rd}$.
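The three categories translate directly into a threshold test. The Python sketch below uses hypothetical names; it returns None for densities that do not exceed the road density, because the text lists no category for that case.

```python
def gm_category(density, road_density):
    """GM class of a section from its accident density D and the road's D_rd.

    Both densities are in acc./km/year.
    """
    if density > 3 * road_density:
        return "dangerous"
    if density > 2 * road_density:
        return "being threatened"
    if density > road_density:
        return "relatively safe"
    return None  # no category is listed for D <= D_rd


# Hypothetical road density of 2.0 acc./km/year.
for d in (3.0, 5.0, 6.0, 7.0):
    print(d, gm_category(d, 2.0))
# 3.0 relatively safe / 5.0 being threatened / 6.0 being threatened / 7.0 dangerous
```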
The disadvantage of the GM method is that it gives no information on how the ends of a section
should be determined. After analysing some possible solutions to this problem, the
following completion has been made in this work:
The sections where accidents cluster are determined initially using the single linkage method
in the way described in the BM method. Then each section is checked in order to decide to
which category it belongs according to the GM method classification.
The Bayesian Method
In order to perform the road safety analysis, road sections where accidents cluster are
determined first. This is done by applying the single linkage method - one of the methods of
cluster analysis. To identify hazardous sections among the accident cluster sections, an
accident-proneness model has been developed (Tracz and Nowakowska, 1996). The model
describes the distributions of the following accident variables: number of injuries, number of
fatalities and number of vehicles involved in accidents. As the model is derived from the
Bayesian approach (Benjamin and Cornell, 1977) the identification method has been called a
Bayesian one.
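For accident locations along a single road, single linkage clustering cut at a given distance reduces to grouping consecutive accidents whose gaps do not exceed that distance. The Python sketch below illustrates this; the cut-off distance, the function name and the sample locations are assumptions made for illustration, since the text does not state the threshold used in the study.

```python
def accident_cluster_sections(accident_km, max_gap_km):
    """Group accident locations (kilometreage) by single linkage.

    Two accidents fall into the same cluster when they are linked by a chain
    of accidents whose consecutive gaps are at most max_gap_km (a hypothetical
    cut-off). Returns (left end, right end, number of accidents) per cluster.
    """
    points = sorted(accident_km)
    if not points:
        return []
    clusters = [[points[0]]]
    for km in points[1:]:
        if km - clusters[-1][-1] <= max_gap_km:
            clusters[-1].append(km)   # close enough: extend current cluster
        else:
            clusters.append([km])     # gap too large: start a new cluster
    return [(c[0], c[-1], len(c)) for c in clusters]


# Hypothetical accident locations and a 0.5 km cut-off.
print(accident_cluster_sections([3.2, 3.5, 3.9, 7.0, 7.4, 12.0], 0.5))
# [(3.2, 3.9, 3), (7.0, 7.4, 2), (12.0, 12.0, 1)]
```

The ends of each cluster give candidate section ends, which is the information that the completed GM method also relies on.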
The model is based on two sources of information. The first one h(x) is a prior distribution of
the accident variable X over the accident cluster sections:
for x =0
(10)
for x >1
The second source, h(x|z,i), is a posterior distribution of the variable X for
those accident cluster sections where the value z of the accident variable has been recorded
over i time units:
for x =0
h(x\z,i) =
h(x-l\z,i)- for x> 1
x • (i + 1 + s)
In the above formulas s > 0 is a scale parameter and k > 0 is a shape parameter of the distribution.
The level of safety obviously varies between roads and between road sections
along a given road. Therefore, to capture the change in the value of the accident variable X for each
road section, the cumulative probability of the h(x|z,i) function, calculated with respect to the
median m of the h(x) distribution, is proposed. Thus, a single section can be
characterised by the cumulative posterior probability h(x|z,i) of the X value lying above
the median, i.e. by the weight W(X) expressed as follows:
(12)
where m is determined by the equation: $\int_0^m h(x)\,dx \approx \int_m^{\infty} h(x)\,dx \approx 0.5$.
In order to include the influence of a road section length / in the identification process, a weight
function for this length has been defined by the following form:
u=
LW(l) max
for
In the BM method the accident cluster sections are subsequently used to find the prior
distributions for injuries, fatalities and total vehicles involved in accidents. Then posterior
distributions are determined in order to calculate accident severity weights from formula (12).
These weights and the road section length weight (13) are the elements of a set S that
characterises the safety of a section:
$$ S = \{ W(X_1), W(X_2), W(X_3), LW(l) \} \qquad (14) $$
The variable $X_j$ denotes the j-th accident variable: $X_1$ is the number of
injuries, $X_2$ is the number of fatalities and $X_3$ is the number of vehicles involved in
accidents on a section.
The identification is processed on the basis of a classification value calculated from the
elements of the set S. This value is called a Safety Weight SW and is defined in the following
form:
$$ SW = \sqrt{\sum_{j} W(X_j)^2 + LW(l)^2} \qquad (15) $$
The criterion for a high risk road section is determined by a critical set S* obtained from the
set S with elements equal to 0.5, 0.4, 0.6 and 0.75. These numbers reflect the rank of the respective
elements of the set: the lower the value, the higher the rank. An accident cluster section is
identified as a high risk road section if its classification value is greater than the value of SW
for the critical set S*.
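The classification step can be sketched in Python, reading formula (15) as the square root of the sum of the squared elements of S. The function names and the sample weights are hypothetical. The critical values are assigned to W(X1), W(X2), W(X3) and LW(l) in the order in which the text lists them, which is an assumption that does not affect the critical value, because the sum of squares is symmetric in its terms.

```python
import math


def safety_weight(severity_weights, length_weight):
    """SW: square root of the sum of squared weights (reading of formula (15))."""
    return math.sqrt(sum(w * w for w in severity_weights) + length_weight ** 2)


# SW of the critical set S* = {0.5, 0.4, 0.6, 0.75} given in the text.
CRITICAL_SW = safety_weight([0.5, 0.4, 0.6], 0.75)


def is_high_risk(severity_weights, length_weight):
    """True if the section's SW exceeds the SW of the critical set."""
    return safety_weight(severity_weights, length_weight) > CRITICAL_SW


print(round(CRITICAL_SW, 3))                # 1.154
print(is_high_risk([0.9, 0.8, 0.7], 0.8))   # True for these invented weights
print(is_high_risk([0.3, 0.2, 0.4], 0.5))   # False for these invented weights
```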
ANALYSIS OF EMPIRICAL DATA
In order to present a practical application of the described measures and to compare the methods
of identifying high risk road sections, real accident data recorded in the period 1991-1996 on
four selected rural roads in two neighbouring provinces of south-central Poland were
investigated. These roads represent different categories and are characterised by various
lengths, traffic volumes and accident densities. High risk road sections were identified for
different time periods according to the identification methods. The studied road sections were
not treated during the two considered periods. The comparison of results was based on the
calculated values of the measures. This made it possible to present the advantages and
disadvantages of the methods.
Comparison of methods in relation to the degree of accident clustering
The main differences in the concentration of accidents and of accident
consequences (injuries, fatalities and vehicles involved in accidents) obtained with the
three methods (WM, GM, BM) are presented in Table 1. The WM method uses data from a
one-year time period in the identification procedure. The results given in Table 1 for this
method were calculated for the year 1991. The other methods (GM and BM) use
a three-year time period; their results are given for the period 1991-1993.
Table 1. Comparison of the values of the Method Inaccuracy Coefficient and additional
measures for high risk road sections obtained for different identification methods.

Time    Method  Accident density  Total length   Method Inaccuracy Coefficient
period          [acc./km/year]    [km]           MIC         MIC(I)      MIC(F)      MIC(V)
-1-     -2-     -3-               -4-            -5-         -6-         -7-         -8-
Road No 7
1991    WM      2.0               83.0 (43%)     0.60 (71%)  0.57 (76%)  0.59 (73%)  0.57 (76%)
91-93   GM      5.4               17.8 (9%)      0.24 (38%)  0.24 (39%)  0.23 (41%)  0.24 (38%)
91-93   BM      3.9               40.0 (21%)     0.33 (62%)  0.33 (65%)  0.30 (70%)  0.33 (63%)
Road No 44
1991    WM      2.7               6.0 (7%)       0.38 (19%)  0.31 (23%)  0.57 (13%)  0.39 (18%)
91-93   GM      3.8               9.2 (11%)      0.23 (48%)  0.21 (53%)  0.23 (48%)  0.23 (47%)
91-93   BM      3.0               9.9 (12%)      0.30 (39%)  0.27 (44%)  0.34 (35%)  0.30 (39%)
Road No 74
1991    WM      2.1               20.0 (24%)     0.51 (48%)  0.53 (46%)  0.45 (54%)  0.59 (42%)
91-93   GM      4.5               3.6 (4%)       0.20 (23%)  0.20 (23%)  0.20 (22%)  0.21 (21%)
91-93   BM      2.6               11.2 (14%)     0.34 (40%)  0.33 (42%)  0.33 (41%)  0.34 (40%)
Road No 728
1991    WM      1.7               6.0 (4%)       0.21 (20%)  0.20 (20%)  0.15 (27%)  0.22 (18%)
91-93   GM      2.0               11.6 (8%)      0.17 (45%)  0.17 (47%)  0.16 (49%)  0.19 (43%)
91-93   BM      2.0               4.2 (3%)       0.17 (17%)  0.14 (20%)  0.14 (21%)  0.15 (19%)
Accident densities for dangerous sections are presented in column 3 of Table 1. Some
additional supporting measures are included in brackets:
- the percentage in column 4 is the ratio of the total length of dangerous
road sections to the length of the road,
- the percentage in column 5 is the ratio of the total number of accidents
on dangerous road sections to the total number of accidents on the road,
- the percentages in columns 6, 7 and 8 are ratios like that in column 5, but calculated
for injuries, fatalities and the number of vehicles involved in accidents.
For a road with a high accident density (road No 7) the WM method determines dangerous
sections on which the shares of the accident features exceed 70%. However, the
total length of the sections can be considerable, even more than 40% of the road length
(see the percentage in column 4 for road No 7). The lower the category of a road, the
shorter the total length of dangerous sections, but also the lower the percentage values of
accident features (roads No 44, 74 and 728). For all four roads, accident densities on dangerous
road sections identified by the WM method are lower than those identified by the GM and BM
methods. The values of the Method Inaccuracy Coefficient are also the highest for all roads. The
situation was similar when this coefficient was calculated for subsequent years (1992, 1993). So,
from the point of view of the concentration of accident features, the WM method is not advisable.
The accuracy of the GM method proves to be very good for all roads. Accident densities on
high risk road sections obtained with this method have the highest values. The supporting
ratios (percentages in columns 5-8) vary from road to road: they are the lowest for roads
No 7 and No 74, whereas for roads No 44 and No 728 they are the highest. The MIC
measures are very low: they range from 0.17 to 0.24. In terms of the Method
Inaccuracy Coefficient, the performance of this method is the best. Nevertheless, it should be
pointed out that the results of the GM method depend on the way in which the road sections
subjected to the identification procedure are determined. If the sections are selected
using cluster analysis, the selection criterion can be very weak and only a few accident
cluster sections are rejected. The GM method then hardly acts as a sieve (Hauer and Persaud,
1984).
The values of the comparative measures calculated for the BM method place it, on
average, between the WM and the GM methods in a ranking of method performance
with regard to the degree of accident clustering. In the case of roads No 7, No 44 and No 74
the MIC values are relatively low (0.20-0.34), but a little higher than for the GM method. In the
case of road No 728 the BM method gives the lowest (i.e. the best) values of the MIC coefficient.
Considering the values of the measures of the degree of accident clustering, it can be said that
the best average results were obtained for the GM method, the BM method proved quite
acceptable and the WM method can be classified as the worst.
Comparison of methods in relation to the repeatability of locations of high risk road sections
The time period for which high risk road sections are identified differs in the considered
methods. Consequently, two consecutive time periods taken for calculation of the Space
Compatibility Index differ in their lengths. In order to investigate the space repeatability of the
WM method the results of identification for two years (1991 and 1992) were taken into
account. For the two other methods (GM and BM) trade-off analyses were carried out on the
basis of the results from two three-year time intervals 1991-1993 and 1994-1996. The results of
analyses are presented in Table 2. The notation in columns 3-7 is taken from the definition of
the SCI measure.
The different numbers of high risk road sections identified by any of the three methods for two
consecutive time intervals can confirm the phenomenon of accident migration and,
consequently, the migration of accident risk, even if no remedial treatment was applied.
For three roads (No 7, No 44, No 74) the ratios of the number of overlapping dangerous
road sections to the total number of such sections obtained in two consecutive time periods are,
for the WM method, equal to or very close to the same ratios calculated for the BM method.
These ratios are better than the ratios for the GM method - see column 6. The values of the SCI
index vary widely: in a ranking by repeatability they place the WM method first
(road No 74), but also second (road No 7) or last (road No 44). The
extremely high value of the SCI measure for the WM method for road No 44 (1.12) results
mainly from there being only one overlap of high risk road sections, whereas altogether five
high risk sections were identified by this method in the two considered time periods. On road
No 728 the WM method identified one dangerous section in the first time period and one
in the second period. However, they do not overlap and they are more than fifteen kilometers
apart.
For all considered roads the values of the n/(n1+n2) ratio for the GM method are lower than
the respective ratios for the two other methods. The SCI measures never place this method
first in the repeatability ranking. The worst results were obtained for road No 74: seven
dangerous road sections were identified in the first time period and five in the second time
period using the GM method. Of these sections only one pair overlaps, so the values of both
measures of space repeatability (columns 6 and 7) calculated for this method are most
unsatisfactory. On average, the GM method gives the worst results.
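The n/(n1+n2) ratio discussed above can be checked with a few lines of arithmetic. The sketch below assumes that n1 and n2 are the numbers of dangerous sections found in the first and second period and n is the number of overlapping sections, as the column headings of Table 2 suggest; the function name is illustrative only.

```python
def overlap_ratio(n1, n2, n):
    """n/(n1+n2): overlapping sections relative to all sections from both periods."""
    total = n1 + n2
    if total == 0:
        raise ValueError("no dangerous sections identified in either period")
    return n / total

# Road No 74, GM method: seven and five sections, one overlapping pair.
gm_road_74 = overlap_ratio(7, 5, 1)
# Road No 44, WM method: five sections in total (1 and 4 in Table 2), one overlap.
wm_road_44 = overlap_ratio(1, 4, 1)

print(f"GM, road No 74: {gm_road_74:.2f}")  # 0.08
print(f"WM, road No 44: {wm_road_44:.2f}")  # 0.20
```

Because every overlap involves one section from each period, the ratio cannot exceed 0.5, so the value of 0.08 for the GM method on road No 74 is low by any standard.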
Table 2. Comparison of the values of the Space Compatibility Index and
supplementary measures obtained for different methods of the identification of
high risk road sections.
Road     Method   Number of dangerous road sections   n/(n1+n2)   SCI
number            n1       n2       n                 ratio
-1-      -2-      -3-      -4-      -5-               -6-         -7-
7        WM       7        10       5                 0.29        0.65
7        GM       30       43       11                0.15        0.86
7        BM       36       34       20                0.29        0.60
44       WM       1        4        1                 0.20        1.12
44       GM       15       12       5                 0.19        0.78
44       BM       9        10       4                 0.21        0.67
74       WM       3        2        2                 0.40        0.54
74       GM       7        5        1                 0.08        1.16
74       BM       7        10       6                 0.35        0.64
728      WM       1        1        0                 -           -
728      GM       10       8        3                 0.17        0.90
728      BM       3        4        2                 0.33        0.62
For roads No 7, No 44 and No 728 the BM method gives the lowest values of the SCI
measure, and for road No 74 its value is only marginally larger than the SCI calculated for
the WM method. Consequently, in qualitative terms, the BM method can be considered as working
better than the two other methods. This is supported by its n/(n1+n2) ratios, which are
comparatively the best overall.
Evaluating all values of the space compatibility measures, it can be said that the best results
were obtained for the BM method. This means that the precision in the location of dangerous road
sections determined by this method was generally greater than in the case of the two other
methods. The WM and GM methods perform more or less on the same level with regard to
space compatibility.
Comparison of methods in relation to the repeatability of accident patterns on
overlapping high risk road sections
In Poland, accident details are recorded in a road police report known as the Road
Accident Card. This accident information contains nine qualitative features describing accident
circumstances (Tracz and Nowakowska, 1997). Two of them have been chosen as subjects of
the analysis of accident pattern repeatability on overlapping high risk road sections:
accident type and driver's behaviour. Some categories of these features have been
combined in order to obtain a more coherent classification (Tracz and Nowakowska, 1998). In
this way accident type was classified into eight categories and driver's behaviour into
eleven. To mark the chosen feature in the measures MIAD, MAAD and MAD, the
notation FCP is replaced by ATP for accident type pattern and by DBP for driver's behaviour
pattern. The values of the pattern repeatability measures for the analysed methods and for the
four considered roads are presented in Table 3.
The WM method gives the best results for the road No 7 for both qualitative accident features.
The average differences in accident type pattern MAD(ATP) and in driver's behaviour pattern
MAD(DBP) do not exceed 10%. The method performs better than the GM method but worse
than the BM method for the other considered roads.
The results of accident pattern repeatability on overlapping dangerous road sections are the
worst for the GM method both in the case of accident type (columns 3-5) and in the case of
driver's behaviour (columns 6-8) for all roads.
Table 3. Comparison of the measures of repeatability of accident patterns on
overlapping high risk road sections obtained for different identification methods.
Road     Method   MIAD(ATP)   MAAD(ATP)   MAD(ATP)   MIAD(DBP)   MAAD(DBP)   MAD(DBP)
number
-1-      -2-      -3-         -4-         -5-        -6-         -7-         -8-
7        WM       0.045       0.099       0.076      0.050       0.129       0.086
7        GM       0.052       0.300       0.163      0.071       0.280       0.171
7        BM       0.052       0.172       0.106      0.037       0.206       0.111
44       WM       0.120       0.120       0.120      0.133       0.133       0.133
44       GM       0.071       0.213       0.124      0.065       0.244       0.147
44       BM       0.071       0.114       0.093      0.065       0.244       0.130
74       WM       0.112       0.148       0.130      0.099       0.129       0.114
74       GM       0.240       0.240       0.240      0.253       0.253       0.253
74       BM       0.044       0.201       0.122      0.084       0.133       0.102
728      WM       -           -           -          -           -           -
728      GM       0.079       0.157       0.126      0.054       0.250       0.141
728      BM       0.079       0.141       0.110      0.109       0.119       0.114
The BM method performs best of all for roads No 44, No 74 and No 728, despite the fact
that its MIAD and MAAD values are not the smallest in some cases. The average differences in
pattern repeatability (the MAD measure) are smaller for the BM method than for the other two
methods. They range from 9.3% to 13.0%. This range is determined by the results obtained for
road No 44; the MAD values for roads No 74 and No 728 lie within this interval.
For road No 44, the MAD(DBP) value for the WM method is only 0.3 percentage points
greater than the value for the BM method.
Considering all values of the measures of accident pattern repeatability, it can be said that the
best average results were obtained for the BM method. This confirms the general advantage of
this method over the WM and GM methods with regard to accident pattern repeatability in
relation to overlapping dangerous road sections.
Summing up evaluation of the measures used for comparison
Accident data such as the numbers of accidents, fatalities, injuries and involved vehicles
recorded on dangerous road sections, together with the total length of these sections,
determine the degree of accident clustering. The ratios of these numbers on dangerous sections
to their totals for a considered road can indicate the degree of clustering; the closer to unity
such a ratio is, the better the results of identification. However, a comparatively large value
of the described percentage can be accompanied by a comparatively large total length of
dangerous sections in relation to the length of the road - see the results in Table 1. The
Method Inaccuracy Coefficient strikes a balance between these two characteristics, thus
becoming a measure which, with regard to accident clustering, plays a fundamental role in the
comparison of methods of high risk section identification. The lower the values of the MIC
coefficient, the better the performance of the method. The ratios mentioned earlier can help
in the interpretation of MIC values.
The repeatability measures can confirm the accuracy of a method with respect to the location of
sites with high accident risk. Accidents can, but need not, occur at the same sites every year;
however, a method can be considered sound if it indicates the sites around which these
occurrences oscillate over two consecutive time intervals. Therefore two
aspects are important:
• the number of dangerous sections that overlap,
• the way of overlapping (i.e. the range of overlapping).
The more important aspect is the ratio of overlapping dangerous sections to the total number of
dangerous sections from two consecutive time intervals. If this ratio is the same for two
methods, the range of overlapping indicates the preferable method - see the results for road
No 7 in columns 6 and 7 of Table 2. Notwithstanding all these aspects, the SCI values alone are
quite satisfactory for evaluating the repeatability of methods - compare, for example, the
results in columns 6 and 7 for all roads in Table 2.
Accident patterns described by accident type and driver's behaviour can help in deciding on a
remedial treatment to improve road safety. So, in relation to
overlapping dangerous sections, perfect repeatability of the accident pattern is highly
desirable. As the measures of repeatability of accident pattern are related to overlapping
dangerous sections, they should be treated as supplementary to the SCI index. The
interpretation of these measures is easy and intuitive: the lower the values, the better.
The considerations presented in this paper have shown that the range of the values of the MIC,
SCI, MAD(ATP) and MAD(DBP) measures depends on the method of high risk road section
identification and also on the studied road. Following a suggestion of a referee, the
authors have considered the accuracy of these measures (Hauer, 1997). However, in the
presented study the sample size is too small (only four roads) to derive general conclusions
concerning this issue. Therefore, only preliminary calculations
dealing with the variability of the measures were done. The variability of the measures for
the three methods is expressed by the coefficients of variation in Table 4.
It can be seen from this table that the coefficients of variation are small for three measures
(SCI, MAD(ATP), MAD(DBP)) for the BM method. This method can be expected to have the same
efficiency for other roads. For the WM and GM methods the variation is comparatively large.
Considering the repeatability of these methods, their efficiency for other roads can be either
satisfactory or not. For the MIC measure all coefficients of variation are rather large, so the
efficiency of all considered methods, with respect to this measure, can be difficult to predict.
Table 4. Percent coefficients of variation of accuracy
measures of high risk road sections identification methods.
All roads   MIC     SCI     MAD(ATP)   MAD(DBP)
WM          22.3%   40.0%   26.4%      21.3%
GM          15.1%   17.8%   33.2%      29.0%
BM          27.6%   4.7%    11.1%      10.2%
These results seem to be interesting but it should be pointed out that more accident data are
required in order to confirm conclusions concerning the accuracy of the measures. This is the
aim of the authors' future research in this area.
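The SCI column of Table 4 can be recomputed from the per-road SCI values in Table 2. The sketch below assumes that the coefficient of variation is the sample standard deviation (n-1 denominator) divided by the mean; the paper does not state the formula, but under this assumption the printed percentages are reproduced.

```python
from statistics import mean, stdev

# SCI values per road as printed in Table 2
# (the WM method has no SCI value for road No 728).
sci = {
    "WM": [0.65, 1.12, 0.54],
    "GM": [0.86, 0.78, 1.16, 0.90],
    "BM": [0.60, 0.67, 0.64, 0.62],
}

def percent_cv(values):
    """Sample standard deviation divided by the mean, in percent (assumed definition)."""
    return 100.0 * stdev(values) / mean(values)

cv = {method: percent_cv(values) for method, values in sci.items()}
for method, value in cv.items():
    print(f"{method}: {value:.1f}%")  # WM: 40.0%, GM: 17.8%, BM: 4.7%
```

With only three or four roads per method, each coefficient rests on very few values, which is consistent with the authors' caution about drawing general conclusions.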
CONCLUDING REMARKS
The purpose of this study was to develop a methodology for comparing the efficiency of
different methods of the identification of high risk road sections.
The analysis carried out in this work made it possible to detect a few deficiencies in the
dangerous section identification methods (WM and GM). Thus, some suggestions for their
improvement were formulated. The suggestions concern:
• the way of defining road sections for further high risk sections classification, and
• making some definitions of classification criteria more precise.
In order to generalize the results and to qualify the methods, some main and supplementary
measures independent of a methodology have been defined. Two aspects were taken into
account. One concerns the level of concentration of accidents along hazardous road sections
and the level of concentration of accident severity. The other aspect concerns the problem of
repeatability of the identification procedures during two consecutive time periods in relation to
the location of dangerous sections along a road and in relation to some descriptive features of
road accidents.
It has turned out that the conclusions from the comparison are not clear-cut. Even simplified
methods can give satisfactory identification results provided that their deficiencies are
remedied. A very simple method such as the GM method performs well in terms of accident feature
concentration and badly in terms of repeatability. Comparison measures for the
WM method (also a simple one) are the worst for the feature concentration aspect and the most
diversified for the other aspects. Only in the case of the more sophisticated BM method are the
values of all indices the least controversial.
All measures considered in this work are based on the results of the process of identifying
high risk road sections. The methods used in such processes vary between countries,
despite common roots in several cases. However, having the results of identification, one
can use the proposed measures, which are transferable to other countries, in comparing
different methods or in estimating the efficiency of a method.
ACKNOWLEDGEMENTS
The authors wish to acknowledge the helpful comments and suggestions of the anonymous referees.
REFERENCES
Datka S., W. Suchorzewski and Tracz M. (1989, 1997). Traffic engineering. WKiL Press,
Warszawa (in Polish).
Hakkert A. S. and E. Hauer. Extent and some implications of incomplete accident reporting. In:
Methods for evaluating highway improvements. Transportation Research Record
1185, TRB, Washington DC.
Hauer E., and B. N. Persaud (1984). Problem of Identifying Hazardous Locations Using
Accident Data. Transp. Res. Rec., 975, Transportation Research Board, 131-140.
Hauer E. (1995). Identification of 'Sites With Promise'. International Conference in Prague on
Strategic Highway Research Program and Traffic Safety on Two Continents. The
Czech Republic, 20-22 September.
Hauer E. (1997). Observational before-after studies in road safety. Estimating the effect of
highway and traffic engineering measures on road safety. Pergamon (Elsevier),
Oxford, New York, Tokyo.
Maher M. J. and L. J. Mountain (1988). The identification of accident blackspots: a comparison
of current methods. Acc. Anal. and Prev., 20, 143-151.
Mountain L., Fawaz B., Sineng L. (1992a). The assessment of changes in accident frequencies
at treated intersections: a comparison of four methods. TE&C, 2, 85-87.
Mountain L., Fawaz B., Sineng L. (1992b). The assessment of changes in accident frequencies
on link segments: a comparison of four methods. TE&C, 7-8, 429-431.
Nicholson A. (1995). Indices of accident clustering: a re-evaluation. TE&C, 5, 291-294.
Tracz M. and M. Nowakowska (1996). Bayesian Theory and Cluster Analysis in the
Identification of Road Accident Blackspots. 13-th International Symposium on
Transportation and Traffic Theory, (Editor J-B Lesort), Pergamon (Elsevier),
Oxford, New York, Tokyo, 261-276.
Tracz M. and M. Nowakowska (1997). Characteristics of some accident circumstances on road
blackspot sections. 20-th International Conference on Theories and Concepts in
Traffic Safety, Lund, Sweden, 5-7 November.
Tracz M. and Nowakowska M. (1998). Using qualitative analysis in road safety research. Third
IMA International Conference on Mathematics in Transport Planning and
Control, Cardiff, Great Britain, 1-3 April.
BEHAVIORAL ADAPTATION AND SEAT-BELT USE:
A HYPOTHESIS INVOKING LOOMING AS A
NEGATIVE REINFORCER
Anthony H. Reinhardt-Rutland, Psychology Department,
University of Ulster at Jordanstown, UK
ABSTRACT
The technical performance of seat-belts is not in doubt, but their continuing value is diminished
if driving deteriorates following the switch to seat-belt use. Janssen (1994) demonstrated such
behavioral adaptation over one year after the switch, but the effects were not sufficient to
nullify seat-belt use. Longer experiments are probably impractical because of near-universal
seat-belt legislation. However, UK statistics suggest that driving speeds have increased over the
years since legislation to the extent of more than nullifying the effectiveness of seat-belts. In the
present paper, Fuller's learning model of road behavior is developed in conjunction with the
perceptual phenomenon of looming as an alternative to risk formulations of behavioral
adaptation. Looming acts as negative reinforcement for unbelted drivers, but not for belted
drivers. Because it represents threat to life, negative reinforcement persists in its effectiveness:
it will take some years for the loss of looming to affect fully the new seat-belt user's behavior.
In addition, behavioral adaptation in motorists inevitably militates against the encouragement of
environmentally-sustainable - but vulnerable - modes of travel, such as walking and cycling. If
safety-related interventions are to be properly assessed, there must be adequate empirical data -
obtained over a time-span which permits a full assessment of behavioral adaptation - in
conjunction with a plausible theoretical framework in which the data can be interpreted.
INTRODUCTION
For many years, driving had been regarded simply as a perceptual-motor skill subject to
straightforward cause-effect relationships between the driver's actions and the road
environment. In this context, any technical intervention to reduce the consequences of failure in
the performance of any aspect of driving skill would be deemed beneficial; the intervention may
be to the road environment - for example, modification of the road-layout at an "accident
black-spot" - or to the vehicle - for example, equipping the vehicle with seat-belts, ABS
braking and air-bags. However, in the last thirty years or so, it has become clear that such
thinking is superficial (Summala, 1996). For example, casualties may simply "migrate" from the
former accident black-spot to other parts of the road-system (Davis, 1993). One crucial
difficulty is that the technical intervention can be followed over time by changes in driving
behavior, a phenomenon labelled behavioral adaptation. Occasionally, behavioral adaptation is
in the direction of more cautious driving. More commonly, however, behavioral adaptation
entails less cautious driving characteristics. It seems that behavioral adaptation is an
automatized process occurring with little, if any, conscious effort on the part of the driver.
One well-documented illustration of undesirable behavioral adaptation concerns antilock
braking (ABS), which renders a braked vehicle more controllable in slippery conditions. In a
study of two matched groups of drivers operating either ABS-fitted or conventionally-braked
taxis, collisions over three years were in fact not statistically different for the two groups; the
ABS group evinced changes in behavior, particularly increase in abrupt braking, which nullified
the expected reduction of casualties (Aschenbrenner and Biehl, 1994).
In a contrasting example, the dangers inherent in switching from left- to right-hand driving in
Sweden and Iceland were accepted as an unfortunate by-product of legislation required to
achieve conformity with neighbouring jurisdictions. In fact, the switch initially elicited reduced
casualty rates, presumably because drivers were sensitive to the particular problems and made a
conscious effort to avoid them. Unfortunately, casualty rates then increased steadily over the
subsequent years to the levels applying before the switch (Naatanen and Summala, 1976;
Wilde, 1994).
An inference that might be drawn from the above two cases of behavioral adaptation is that all
safety-pertinent interventions or situations become void over time, which has profound
implications for road-safety policy: there seems little point in devising safety interventions if
inevitably there will merely be a return to previous casualty rates. It might be argued that little
is lost if the group to whom the safety-pertinent intervention or situation is targeted fails to
demonstrate better casualty rates, but this is simplistic. The targeted group does not exist in
isolation: it is not helpful for vulnerable road-users - pedestrians, cyclists and, arguably, motor-
cyclists - if the targeted group consists of motorists who now drive more dangerously and
generate a more hostile environment for vulnerable road-users.
CURRENT THEORIES: CONCEPTS OF RISK, HABITUAL BEHAVIOR AND
BEHAVIORAL ADAPTATION
Theories to explain behavioral adaptation generally invoke risk as a mediating variable: it is
assumed that subjective risk is adequately manifested in observable behavior. Much has been
written about risk theories, so only a brief review will be attempted here.
In the best-known example of risk theory, Wilde (1982, 1994) proposes that an individual's
performance is determined by a target level of risk, dependent mainly on psychological factors,
but including economic and cultural factors. Risk is linked with the individual's "need for
stimulation", so the frequently-cited link between high casualty rates and young male drivers
can be attributed to the high target risk-levels among this group (Heino et al, 1996). In brief,
the target risk-level for the driver is reflected in behavioral factors such as habitual speed,
following distance and frequency of overtaking. Following a safety-pertinent intervention or
situation, risk homeostasis returns risk to the target level, as reflected in adjustments to habitual
speed, for example.
Summala and Naatanen (1988) conceptualise risk differently. Their zero-risk theory argues that
drivers internalise safety margins; these lead to habitual patterns of behavior which the driver
does not consider to be risky. Only when the driver determines that safety margins have
become inappropriate - for example, unnecessarily eliciting slow speed - does perceived risk
"switch in" to affect behavior.
Taking the case of ABS brakes, the differences between Wilde's theory and Summala and
Naatanen's theory can be illustrated. For Wilde, the driver's experience after adopting ABS
brakes may suggest that the cautious behavior applying in previously hazardous conditions is
now unduly low in risk: increase of speed in such conditions will raise risk to tolerable levels. In
contrast, for Summala and Naatanen the driver's experience after adopting ABS brakes -
perhaps obtained by chance - may suggest that cautious behavior in previously hazardous
conditions is no longer necessary: there will be no risk if the driver now increases speed in such
conditions.
Criticisms of risk
Risk theories have intuitive appeal but several issues prevent full acceptance. One issue
concerns definition. In most research, risk is necessarily defined in terms of objective casualty
statistics. However, risk has a major subjective component which completely eludes objective
statistics (Adams, 1988; Rumar, 1988). For example, flying for the population at large is
subjectively riskier than driving, but in terms of casualties per unit distance traveled the reverse
is emphatically the case (UK Government Statistical Service, 1997). Given such ambiguities, it
is unsurprising that Wilde's theory and the competing theory of Summala and Naatanen can
conceptualise risk in such contrasting ways.
The proposed mechanisms to explain behavioral adaptation are dubious in their relationship to
other evidence. For example, typical cases of homeostasis concerning energy, water and
temperature regulation are mediated via specific brain areas, particularly the hypothalamus
(Carlson, 1993; Kolb and Wishaw, 1985). In such cases, the time-constant for homeostasis is of
the order of hours and a plausible physiological model can be postulated. In contrast, the time-
constant for risk homeostasis is of the order of years, suggesting an entirely different and
unfamiliar mechanism.
Finally, evidence of complete nullification of a safety-pertinent intervention or situation is not
always straightforward. For example, ABS braking is technically less effective than had been
expected, since certain collisions are in fact exacerbated by ABS (Farmer et al, 1997; Kahane,
1994): Aschenbrenner and Biehl's (1994) casualty rates - referred to above - cannot exclude
this as a possible contributory factor. Another well-publicised intervention - the airbag - is
affected likewise, with reports of injuries to short drivers caused by the inopportune operation
of airbags at low speeds (Duma et al, 1996; Mahmud and Alrabady, 1995). The implication is
that the determination of a target level of risk can rarely be precise, so it follows that we cannot
know for sure that behavioral adaptation elicits changes that fulfil the requirements of a
homeostatic conceptualization.
Another case suggests little behavioral adaptation: British casualties in poor visibility such as
fog and night are invariably higher than in good visibility (Parker and Cross, 1981; White and
Jeffery, 1980; Smeed, 1977). Since poor visibility is familiar in Britain, risk theories should
apply: drivers should evince slower speeds and longer following distances to compensate for
the increased incidence of casualties in poor visibility. Although some reduction of speed and
increase of following distances may occur, the changes in behavior are far from adequate.
Furthermore, the dangers of poor visibility might elicit caution amongst new drivers, given the
evidence from switching direction of travel in Sweden and Iceland. However, there is little
evidence that these inferences apply. The increased casualties in poor visibility are better
understood in terms of visual perception: there are several phenomena - reduced vection,
anomalous myopia, motion contrast, aerial perspective and so on - which entail serious
misperception of speed and distance in conditions of restricted visibility (Cavallo et al, 1997, in
press; Leibowitz et al, 1982; Reinhardt-Rutland, 1986, 1992a,b,c).
The above reasons suggest that the invoking of risk as a global factor in road-behavior requires
caution. As a psychological concept risk is plausible and perhaps even necessary, if only as a
short-hand descriptor in regard to behavior that seems unnecessarily dangerous. However,
current risk formulations seem unable to link risk to any established body of psychological
knowledge. For example, one might expect to link risk to theories and research concerning
motivation and personality (Maslow, 1987) and issues of social interaction (Hogg and
Vaughan, 1998; Moghaddam, 1998), but there seems little evidence that this has been or can be
realistically attempted.
As a starting point in developing theories of road-behavior, it may be better to consider (a) each
safety-pertinent intervention or situation in detail on an individual basis, while (b) seeking to
relate the intervention or situation closely to empirical research and theory in pertinent areas of
psychology.
SEAT-BELT USE AND LEGISLATION
Of all safety-pertinent interventions, seat-belts have been regarded as particularly efficacious
(e.g., Evans (1991), Wyatt and Richardson (1994) and many governmental agencies). Certainly
the potential of seat-belts in reducing casualty rates is difficult to fault. For a given severity of
collision, seat-belt use limits motion of the self within the vehicle and ejection from the vehicle.
Furthermore - and in contrast with ABS and airbags - the cases in which seat-belt use is
disadvantageous are agreed generally to be minor (Rutherford et al, 1985; Salam and
Frauenhoffer, 1996). Particularly compelling evidence arises from those cases in which a
vehicle involved in a collision contains both belted and unbelted occupants: the belted
occupants have a fatality rate which is 40% less than for the unbelted occupants (Evans and
Frick 1989; Evans 1990).
Seat-belt effectiveness modified by driver behavior
The limitation of the above evidence is that it can tell us nothing about whether the driver's
behavior has altered as a result of wearing a seat-belt. This is an issue that is more difficult to
establish. Evans (1991) provides a useful overview of the types of non-intrusive data-collection
procedures that may be employed for drawing conclusions about the actual effectiveness of
seat-belts among vehicle-occupants. One procedure is to compare casualty rates in jurisdictions
matched as far as possible except in regard to seat-belt legislation. Seemingly, this is
unsuccessful because the necessary degree of matching is difficult to attain: other legislation
and differing patterns of road-use are likely to confound the effects of seat-belt use.
Other procedures relate to the time around the introduction of legislation for compulsory seat-
belt use in a given jurisdiction. This can be a simple before-and-after count of casualties or a
more sophisticated time-series analysis: a discontinuity is expected in an otherwise continuous
function for casualty-rates against time. The ideal source of data for such a study is a
jurisdiction in which seat-belt use switches from zero before legislation is introduced to 100%
after legislation is introduced. Unfortunately, in many jurisdictions seat-belt use has tended to
change only gradually over the years, whether legislation is introduced or not: any changes that
may result are thus difficult to attribute to seat-belt use.
However, the case of the United Kingdom may provide sufficient of a contrast in seat-belt use
during pre- and post-legislation times: legislation in 1983 brought a change in seat-belt usage
from about 40% to over 90%. Scott and Willis (1985) reported about a 20% reduction in
fatalities, which - given the 50% change in seat-belt usage - is in line with Evans and Prick's
results from belted and unbelted occupants of the same vehicle. Evans and Prick's results were
also broadly confirmed in a more sophisticated time-series analysis (Harvey and Durbin, 1986).
Nonetheless, caution is required in the interpretation: drink-driving legislation was also
introduced simultaneously in 1983 (Broughton and Stark, 1986) and selective recruitment may
apply when a safety-pertinent intervention is not compulsory: seat-belt users prior to legislation
were probably individuals who put a premium on safety (Evans, 1985). The 20% reduction in
fatalities therefore overestimates the effect of seat-belt use. This indicates that the effectiveness
of seat-belts suggested by Evans and Frick diminished over the duration of Scott and Willis'
study. Since the study extended one year either side of the introduction of legislation,
behavioral adaptation was possible.
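
The claim that a fall of about 20% in fatalities is "in line with" a belt effectiveness of about 40% for the individual occupant (the Evans and Frick figure quoted later in this paper) can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that belt wearers are otherwise typical occupants (no selective recruitment), that behavior does not change, and that "over 90%" can be taken as 0.90. The function and variable names are hypothetical.

```python
def relative_fatalities(usage: float, effectiveness: float) -> float:
    """Fatalities relative to a no-belt baseline when a fraction `usage`
    of occupants wear belts that cut individual fatality risk by
    `effectiveness`. Assumes wearers are otherwise typical occupants."""
    if not (0.0 <= usage <= 1.0 and 0.0 <= effectiveness <= 1.0):
        raise ValueError("usage and effectiveness must lie in [0, 1]")
    return 1.0 - effectiveness * usage


EFFECTIVENESS = 0.40  # Evans and Frick figure quoted in the text
USAGE_BEFORE = 0.40   # "about 40%" before the 1983 legislation
USAGE_AFTER = 0.90    # "over 90%" afterwards, taken here as 0.90 (assumption)

# Rule of thumb: change in usage multiplied by effectiveness.
simple_reduction = (USAGE_AFTER - USAGE_BEFORE) * EFFECTIVENESS

# Reduction expressed relative to the pre-legislation fatality level.
before = relative_fatalities(USAGE_BEFORE, EFFECTIVENESS)
after = relative_fatalities(USAGE_AFTER, EFFECTIVENESS)
proportional_reduction = 1.0 - after / before

print(f"simple estimate:       {simple_reduction:.0%}")
print(f"proportional estimate: {proportional_reduction:.0%}")
```

Under these assumptions both estimates lie between 20% and 24%, which is the sense in which the reported reduction matches the same-vehicle results.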
An experimental study
Janssen (1994) confirms this view. This important study was unique in the degree of control
exercised over nuisance variables. Three groups were investigated - new seat-belt users who up
to the beginning of the study had been habitual non-users, habitual seat-belt users and habitual
seat-belt non-users - in various driving behaviors on Dutch freeways over one year. Janssen had
intended to include a fourth group - new seat-belt non-users who had been habitual users - but
could not obtain willing participants. Compared with the other groups, new seat-belt users
increased their speeds by 1.6 km/h - average speeds were 110 km/h - and reduced their
following distances. These changes do not negate the benefit for seat-belt users. Empirical data
suggest that the ratio of road fatality rates at two speeds is determined by the fourth power of
the ratio of the speeds (Evans, 1991). Hence, the above increase in speed implies a 6% increase
in fatalities. The reduced following distances would add to fatalities, although quantification in
this case is not easy. Janssen found that behavioral changes had not stabilised: given time, they
could have seriously compromised the benefit for seat-belt users.
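
The 6% figure follows directly from the fourth-power relationship. A minimal sketch of the arithmetic is given below; the function name is hypothetical, and the exponent of 4 is the empirical rule attributed to Evans (1991) in the text, not a physical law.

```python
def fatality_ratio(v_before_kmh: float, v_after_kmh: float,
                   exponent: float = 4.0) -> float:
    """Ratio of fatality rates at two mean speeds under the
    fourth-power rule quoted in the text."""
    if v_before_kmh <= 0 or v_after_kmh <= 0:
        raise ValueError("speeds must be positive")
    return (v_after_kmh / v_before_kmh) ** exponent


# Janssen's new seat-belt users: mean speed 110 km/h, increase of 1.6 km/h.
ratio = fatality_ratio(110.0, 110.0 + 1.6)
increase_pct = (ratio - 1.0) * 100.0
print(f"implied increase in fatalities: {increase_pct:.1f}%")
```

Because the exponent is large, even a speed change of under 2% produces a fatality change of several percent.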
Trends in British road-traffic
This assertion is consistent with trends in British road-traffic speeds since the early 1980s. Until
very recently officially-available data on speeds and speed trends has been sparse. However,
Thomson et al (1985) as part of a study concerning road-use in relation to school-age children
reported that on urban and suburban roads for which a 30 mph (48 km/h) limit applies, speeds
in 1981 averaged 28 mph (45 km/h), with 38% of vehicles exceeding the limit. Corresponding
1996 figures are 33 mph (53 km/h) and 72% (UK Government Statistical Service, 1997).
Following Evans' relationship, the 8 km/h increase in mean speed implies a 92% increase in
fatalities - much more than enough to nullify the benefit of seat-belts for their users, given Evans
and Frick's 40% reduction in fatalities due to belt-use from their data for belted and unbelted
occupants of the same automobiles. Changes other than seat-belt legislation have occurred over
the eighties and nineties - for example, in regard to the comfort and ease-of-use of automobiles
- which might also have contributed to the elevation of speed.
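
The arithmetic behind the 92% figure, and behind the claim that it outweighs the belt benefit, can be written out as follows. This is a sketch under two assumptions that are not established here: that the fourth-power rule holds at urban speeds, and that the speed effect and the 40% belt effect combine multiplicatively. The symbols $F$ (fatality rate) and $v^{*}$ (break-even speed) are introduced only for this illustration.

```latex
\[
\frac{F_{1996}}{F_{1981}} \approx \left(\frac{53}{45}\right)^{4} \approx 1.92
\qquad\text{(a 92\% increase)}
\]
\[
\underbrace{(1 - 0.40)}_{\text{belt effect}} \times 1.92 \approx 1.15
\qquad\text{(net fatality rate about 15\% above the unbelted 1981 level)}
\]
\[
\left(\frac{v^{*}}{45}\right)^{4} = \frac{1}{0.60}
\;\Longrightarrow\;
v^{*} \approx 51\ \text{km/h}
\]
```

On these assumptions, a rise in mean speed from 45 km/h to about 51 km/h would already cancel a 40% belt benefit; the observed 53 km/h exceeds that break-even value.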
It is unfortunate that much less is known about other trends in driving behavior, such as close-
following and violation of overtaking rules.
With regard to long-term empirical trends in fatalities for British automobile occupants, data
over a 20 year period - much longer than was available for the studies of Scott and Willis
(1985) and Harvey and Durbin (1986) - show a dip around 1983 and 1984 which was not
maintained over subsequent years (UK Government Statistical Service, 1991). Adams (1994)
reports at length on this and similar long-term trends from other jurisdictions.
The possibility of further empirical studies
The problem for empirical research is that controlled seat-belt studies with the time-span of,
say, Aschenbrenner and Biehl's (1994) ABS study are probably unrealistic, since most
jurisdictions now opt for compulsory seat-belt use. Indeed, this applies in the Netherlands, the
location of Janssen's study. In the absence of a long-term experimental study, one might appeal
to theoretical issues. However, the currently-prevailing risk formulations that are postulated for
behavioral adaptation are problematic. Hence, the extent of behavioral adaptation for seat-belt
use remains something of an open question. Proponents of seat-belt use can argue that other
factors in the ever-changing roadway environment account for the disappointing trends. Of
course, proponents need to identify these alternative factors - which they have yet to do in any
systematic way.
However, the theoretical underpinnings for behavioral adaptation need not be couched in terms
of the slippery concepts of risk and homeostasis. Earlier, I argued that each safety-pertinent
intervention or situation should be considered individually and in the context of established
bodies of psychological research. In fact, there is an alternative model of driver behavior
offered by Fuller (1984, 1988, 1992) which can be developed to explain why behavioral
adaptation eventually becomes substantial in the case of seat-belt use.
THE BASIS OF SIMPLE LEARNING: CLASSICAL CONDITIONING AND OPERANT CONDITIONING
Fuller's model refers to simple learning in humans and non-humans, particularly classical
conditioning - associated with Ivan Pavlov - and operant conditioning - associated with B. F.
Skinner (Mazur, 1994; Leslie, 1996). The foundation for such learning resides in associations
with desired outcomes - rewarding stimuli or events - and associations with undesired
outcomes - aversive stimuli or events.
In classical conditioning, low-level or autonomic physiological responses become associated
with otherwise neutral stimuli. In a typical Pavlovian study with dogs, the normally neutral
stimulus of a tone instigates salivation, after the tone and food have been presented
contemporaneously a number of times. In a contrasting example, electric shock elicits
physiological responses associated with fear, undesired outcomes preparing the individual to
cope with threat to life; classical conditioning is illustrated if the site at which the electric shock
was given evokes fear responses in its own right.
Operant conditioning concerns behavior that is under voluntary control at least in the early
stages in the acquisition of a given behavior. In a typical Skinnerian study, a food-deprived
pigeon learns to peck a key if the pecking is followed by the presentation of a pellet of food. In
a contrasting example, an electrician always turns off the mains supply before examination of an
appliance, since this has prevented electric shock in the past. As learning becomes established,
the learnt behavior is likely to become increasingly automatized, with the learner having little
insight or understanding of the learnt behavior; the learning can then take on some of the
characteristics of classical conditioning. Indeed, the different types of conditioning can
simultaneously affect many cases of learning.
Conditioning depends on the reinforcement contingencies with which it is associated: a positive
reinforcer elicits learning by its presence - as in the two food-related examples above - while a
negative reinforcer elicits learning by its removal - as in the two electricity-related examples
above. Learning can be lost if a reinforcer no longer operates, a phenomenon known as
extinction. Positive reinforcement for the driver includes early arrival at destination: behaviors
associated with fast driving become learnt. Collision is likely to be the main source of negative
reinforcement, since it entails tangible damage and injury: slow driving behaviors should be
learnt. Learning due to negative reinforcement is often less easily extinguished than learning
due to positive reinforcement, since many negative reinforcers are linked with threat to life. On
the face of it, this should apply strongly to the roadway. However, despite very high casualty
rates in relation to other forms of transport, the roadway is a "forgiving" environment and
collisions are rare in relation to distance driven: the average vehicle has only about a 20%
probability of being involved in a collision every 15,000 km or so (Fuller, 1992). The unhappy
conclusion from Fuller's theory is that positive reinforcement has an overriding effect on driver
behavior, because of the rarity of negative reinforcement.
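
To give a feel for how rare collisions are on the figure quoted from Fuller (1992), the sketch below treats each 15,000 km of driving as an independent trial with a 20% chance of at least one collision. The independence assumption and all names are illustrative additions, not part of Fuller's account.

```python
BLOCK_KM = 15_000.0            # distance unit quoted in the text
P_COLLISION_PER_BLOCK = 0.20   # "about a 20% probability" per block


def p_at_least_one_collision(distance_km: float) -> float:
    """Probability of at least one collision over `distance_km`,
    assuming independent 15,000 km blocks (an idealisation)."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    blocks = distance_km / BLOCK_KM
    return 1.0 - (1.0 - P_COLLISION_PER_BLOCK) ** blocks


# Mean distance up to the first collision when blocks are independent trials.
expected_km_to_first_collision = BLOCK_KM / P_COLLISION_PER_BLOCK

for km in (15_000, 75_000, 150_000):
    print(f"{km:>7} km: P(collision) = {p_at_least_one_collision(km):.2f}")
print(f"expected distance to first collision: "
      f"{expected_km_to_first_collision:,.0f} km")
```

On this idealisation a driver covers tens of thousands of kilometres between collisions, whereas the positive reinforcement of early arrival is available on almost every journey.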
Another feature of the rarity of negative reinforcement on the road is that learning may not
generalise beyond the specific circumstances of the negative reinforcement. Thus, surviving
participants in fatal collisions drive cautiously ever afterwards when faced with the specific
circumstances of the collision, particularly the location of the collision - but their driving may
be little altered away from these specific circumstances (Foeckler et al, 1978; Rajalin and
Summala, 1997).
Fuller's Model and Seat-Belt Use
A major interest of Fuller concerns the difficulty in getting drivers to anticipate danger. For
example, discriminative stimuli signal potential reinforcement. A road-sign or markings before
a bend on a narrow winding road is a potential discriminative stimulus leading to slowing, since
such signs indicate hidden hazards. However, such road-signs and markings are often
ineffective (Shinar et al, 1980): collisions are rare, so the driver fails to slow.
Fuller has less to say about behavioral adaptation as such. Indeed one area in which his model
and risk theories seem to conflict concerns seat-belts: Fuller (1984, p. 1139) believes that seat-
belts remain beneficial for their users. This implies that the effectiveness of a negative reinforcer
is independent of its consequences: the reduced injuries for a seat-belt user do not alter the
effectiveness of a given collision as a negative reinforcer. Although this seems a dubious
proposition, the rarity of collisions argues that such issues would in fact have little effect.
Furthermore, the specificity of the learning among fatal-collision survivors was noted above
(Foeckler et al, 1978; Rajalin and Summala, 1997).
Near-Misses as Sources of Negative Reinforcement
However, the scope of negative reinforcement can be extended in another way. For every
collision that a driver may experience, the "forgiving" nature of the roadway suggests that there
are likely to be many near-misses. Although not entailing tangible damage and injury, near-
misses are in fact likely to be negatively reinforcing. It might be retorted that this is simply a
matter of classically-conditioned or secondary reinforcement, by which an effect - say,
vestibular reaction to sudden deceleration - becomes negatively reinforcing because of its
association with the primary reinforcer of collision. Indeed, were this the case little might be
added by considering near-misses, given the rarity of collisions and the limited chances of
associating deceleration and collision.
However, near-misses provide primary negative reinforcement in their own right; the features
of near-misses do not need to occur with collisions in order to become effective.
VISUAL EXPANSION
Potential collision with an object - for example, a pedestrian, another vehicle or a static item of
road "furniture" - is accompanied by visual expansion at the driver's eye elicited by the object.
Features of this expansion, such as angular velocity and acceleration, in conjunction with
knowledge of the object's physical size and level of driving experience, provide information
from which the driver can determine time-to-collision and take controlled avoiding action
(Cavallo & Laurent, 1988; Lee, 1976; Stewart et al, 1993).
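
One idealised way of seeing how the expansion pattern carries time-to-collision information is the following small-angle sketch, in the spirit of Lee (1976). It assumes a constant closing speed and an object that is small relative to its distance; the symbols $s$ (object size), $d$ (distance), $v$ (closing speed), $\theta$ (visual angle) and $\tau$ (time-to-collision) are introduced only for this illustration.

```latex
\[
\theta \approx \frac{s}{d},
\qquad
\dot{\theta} = \frac{d\theta}{dt} \approx \frac{s\,v}{d^{2}}
\quad\text{(since } \dot{d} = -v\text{)},
\qquad
\tau = \frac{\theta}{\dot{\theta}} \approx \frac{d}{v}
\]
```

Under these assumptions the ratio of visual angle to its rate of change approximates the time remaining before contact. The accounts cited above treat knowledge of object size and driving experience as further sources of information alongside the optical variables.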
Near-misses often entail sharp braking, with concomitant visual, vestibular and kinaesthetic
changes: the unbelted, unrestrained motorist will be thrown towards interior fittings - the
steering-wheel, dashboard, windscreen, windscreen pillars and so on. The visual changes
associated with collision between the motorist and the internal fittings are similar to those for
collision between the automobile and an object external to the automobile, with the crucial
exception that simple mathematical considerations show that visual expansion will be much
greater in the former case. This is because collision between motorist and internal fittings
concerns much closer distances and directly affects the self: while the motorist is separated
from collision with external objects by the automobile's superstructure, the motorist's motion
towards the internal fittings is not. Objects close to oneself subtend bigger visual angles than
distant objects. When object and self move closer together - as with a motorist being thrown
towards the vehicle's internal fittings - the rate of change of the subtended visual angles is of a
different order of magnitude, in comparison with the motorist and external objects.
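
The "simple mathematical considerations" can be illustrated numerically. For an object of width s at distance d approached at closing speed v, the subtended angle is 2·atan(s / 2d) and its rate of change is s·v / (d² + s²/4). The sketch below compares an interior fitting with an external vehicle; all sizes, distances and speeds are hypothetical round numbers chosen for illustration, not measurements from any study.

```python
import math


def visual_angle(size_m: float, distance_m: float) -> float:
    """Angle (radians) subtended at the eye by an object of width
    `size_m` viewed face-on at `distance_m`."""
    return 2.0 * math.atan(size_m / (2.0 * distance_m))


def expansion_rate(size_m: float, distance_m: float,
                   closing_speed_ms: float) -> float:
    """Rate of change of the visual angle (radians per second) when the
    distance shrinks at `closing_speed_ms`."""
    return size_m * closing_speed_ms / (distance_m ** 2 + size_m ** 2 / 4.0)


# Hypothetical values: a steering wheel 0.38 m wide, 0.5 m away, approached
# at 2 m/s by an unrestrained driver under sharp braking.
interior_rate = expansion_rate(0.38, 0.5, 2.0)

# Hypothetical values: a car 1.8 m wide, 20 m ahead, closing at 10 m/s.
exterior_rate = expansion_rate(1.8, 20.0, 10.0)

print(f"interior fitting: {interior_rate:.2f} rad/s")
print(f"external vehicle: {exterior_rate:.3f} rad/s")
print(f"ratio: {interior_rate / exterior_rate:.0f}x")
```

With these illustrative values the interior fitting expands tens of times faster than the external vehicle, even though the closing speed is five times lower, because the distance term enters as its square.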
The Looming Phenomenon
The unbelted motorist thrown towards the vehicle's interior fittings provides conditions that
evoke the looming phenomenon, defensive and fear responses elicited by the threat of
immediate collision conveyed visually. For a wide range of non-human species, nothing more
elaborate than a rapidly expanding shadow back-projected onto a screen is sufficient to elicit
looming responses; neonates are affected similarly to adults - neonates having not encountered
collision cannot have learned to link visual expansion with collision (Schiff et al, 1962; Schiff,
1965). Human infants are affected similarly to other species (Ball and Tronick, 1971), while
human adults evince looming responses to moving objects that stop short of hitting them (King
et al, 1992). Such evidence argues that looming responses are "hardwired": they are the subject
of automatic and inborn neural processes.
Further support for this assertion comes from human psychophysical evidence that centrifugal
visual motion - corresponding to visual expansion - elicits special responses in a number of
phenomena: motion aftereffects, reaction-times and threshold responses (Ball and Sekuler,
1980; Georgeson and Harris, 1978; Reinhardt-Rutland, 1994). Motion aftereffects in particular
provide evidence of low-level neural processing (Reinhardt-Rutland, 1987, 1988). Looming
itself can therefore act as a primary negative reinforcer, without any requirement that it be
preceded by experience of a collision.
Looming and Seat-Belt Use
The negative reinforcement contingencies associated with sharp braking differ according to
whether or not a seat-belt is worn. The seat-belt user suffers much less motion within the
interior of the automobile than does the seat-belt non-user (Evans, 1991): the former will be
protected from looming responses. The greater vestibular and kinaesthetic responses for seat-
belt non-users will no doubt also provide other primary negative reinforcers for this group,
although there seems to be less in the way of research to which this can be related, probably
because such responses are relatively inconvenient and difficult to control in the laboratory.
Research in aviation psychology may provide some clues (Hawkins, 1987).
Given that near-misses are more frequent than collisions on the road - and therefore likely to be
experienced in a variety of situations and locations - greater generalisation of any learning
instigated by near-misses is likely. In contrast with the case of fatal-collision survivors noted
above (Foeckler et al, 1978; Rajalin and Summala, 1997), slower driving among seat-belt non-
users is not likely to be confined to specific circumstances of near-misses, but is likely to apply
to all circumstances.
The Time Course of Learning and Extinction Regarding Negative Reinforcement
As noted earlier, behaviors learnt during negative reinforcement are not easily lost, because of
the threat to life that negative reinforcement represents; hence, the long-term fear of the
collision circumstances shown by fatal-collision participants (Foeckler et al, 1978; Rajalin and
Summala, 1997). Often the individual can avoid the negative reinforcement altogether: a
consequence is that the individual is then unlikely to find out if reinforcement contingencies
remain constant. Away from the road, a good example concerns food aversions. An individual
who consumes an unfamiliar food that is followed by sickness is likely to avoid that food
thereafter: the sickness is a negative reinforcer for avoidance of the food. The individual will
not find out if the case of sickness was a one-off occurrence, or was truly a consequence of that
food (Carlson, 1993).
A difference between seat-belt non-use and food aversion is that the reinforcement
contingencies for the driver are less predictable, since other road-users are ever-present: the
driver is likely to experience near-misses, no matter how slowly he/she drives. When near-
misses are found to be no longer associated with looming, extinction of slow driving becomes
possible, while positive reinforcement such as early arrival at destination can become more
salient. This analysis supports Janssen's assertion that behavioral adaptation requires much time
to be complete.
New Licence Holders
One factor not easily included in a controlled study of seat-belt use is the increasing proportion
of the driver population - new licence holders - who will always have worn seat-belts and will
never experience looming: a negative reinforcer leading to safer driving behaviors is
permanently unavailable. The performance of these drivers early in their driving careers will
therefore be more dangerous than the performance of their predecessors who initially lacked
seat-belts. Given that new licence holders include young males - notorious for dangerous
driving (Evans, 1991) - this factor will over time come to affect reported statistics substantially.
DISCUSSION
In this paper, it is argued that risk models have serious shortcomings in explaining behavioral
adaptation: definition of risk, substantiation of risk homeostasis and the fact that behavioral
adaptation does not always relate in obvious ways to risk however defined are problems.
Instead, given our rudimentary knowledge of psychological factors - particularly as they act
over long periods of time - behavioral adaptation should currently be considered on a case-by-
case basis.
Regarding the case of seat-belts, Fuller's discussion of negative reinforcement on the road can
be developed to provide an explanation of behavioral adaptation among seat-belt users. While
collisions entail tangible damage and injury - so have obvious reinforcing properties - primary
negative reinforcement need not be so restricted. In near-misses, not using a seat-belt leads to
vehicle occupants being thrown towards the internal fittings of the automobile and the looming
phenomenon - fear and avoidance responses conveyed by visual information for imminent
collision - can be a primary negative reinforcer in the driver learning slow, careful behaviors. In
contrast, the near-misses encountered by seat-belt users are not accompanied by substantial
movements within the automobile: looming is eliminated as a negative reinforcer, to the
detriment of driving behavior.
After switching from not using seat-belts to using seat-belts, the driver may avoid the fast
driving behaviors that lead to near-misses, analogous to the avoidance applying in food
aversions. However, given that the driver must interact with other road-users, he/she will
necessarily encounter near-misses which with belt use no longer engender looming. Only then
can the behaviors learned under the influence of looming begin to be extinguished. Nonetheless,
the persistence of learning under the influence of primary negative reinforcers - with their link
with threat to life - suggests that learning faster driving behaviors will take much time. Another
factor as time proceeds after seat-belt legislation, is that there will be increasing numbers of
new licence-holders - particularly young males - who have always used seat-belts, for whom
looming will never have been a negative reinforcer.
The eventually substantial effects envisaged in the present proposal are consistent with other
evidence. Janssen's (1994) study demonstrated small behavioral changes, which nonetheless
would have been greater had the study been longer, an assertion supported by long-term
increases in UK traffic speeds and in casualty rates. In fact, such data suggest that behavioral
adaptation may have reversed the benefit for seat-belt users in Great Britain.
The urgent need for an adequate mix of empirical and theoretical knowledge
The case of seat-belts is an especially instructive one for our appreciation of interventions
intended to reduce casualties. Of all interventions, seat-belts have been regarded as particularly
efficacious by many governmental or quasi-governmental authorities. Their belief may seem to
be supported by certain short-term empirical investigations. However, it is much more difficult
to amass evidence to support such an intervention in the long term, particularly when, as in the
case of seat-belts, legislation has made usage mandatory. Furthermore, the effects of other
interventions and developments can never be discounted: in Britain, the introduction of drink-
driving legislation simultaneously with seat-belt legislation is a clear case in point.
It is not unusual to hear the off-the-cuff remark among the road-safety community that present
driving behavior would be careful and sedate if the clock was turned back and the driver was
unbelted and surrounded by sharp rigid interior-fittings which would impact with him/her in the
event of sudden braking. Yet, it seems highly unlikely that such an assertion would ever be put
to the test. Once a safety intervention attains compulsory status, it is thereafter highly
improbable that that intervention will be the subject of rescinding legislation leading to its
abandonment. Furthermore, such an intervention would probably not have societal support:
recall Janssen's (1994) inability to recruit habitual seat-belt users who were prepared to give
up seat-belt use for the year's duration of his study.
Non-drivers
The main concern of this paper has been with the consequences of a particular safety
intervention as it affects drivers. In that regard, it might be argued that it is unfortunate that a
safety intervention becomes ineffective for the group to whom it is targeted, but nothing is lost
by this: the targeted group at least may not be worse off than before the intervention.
However, this is to imply a proposition that the road-system entails more-or-less separate and
non-interacting groups of road-users. Unless groups are rigorously segregated, this is simply
untrue: how one group performs on the road inevitably affects other groups. As an increasing
number of authors have noted, some groups are going to be disadvantaged by any level of
behavioral adaptation among drivers resulting from interventions directed at driver safety: it is
of absolutely no benefit to vulnerable groups such as pedestrians and cyclists that drivers travel
faster than they did twenty-five years ago. As Davis (1993) eloquently shows, the persistent
disregard of the needs of vulnerable groups of road-users amounts to an issue of civil rights; it
is one that governments - particularly in the UK - have consistently ignored under the pressure
of rich and vocal interests representing various strands of "the motor industry".
Now this lack of official concern with these civil rights is reaping problems that affect all road-
users, including motorists themselves. Increasingly, congestion is affecting urban and suburban
roads. In governmental circles, the answer no longer lies in new road build. Beyond concerns
about quality of life, pollution and the like - even if the civil rights of vulnerable road-users are
discounted - there is a realization that automobiles are extremely inefficient users of road-space
(Highway Code, 1993; Marchetti, 1983): each new stretch of road only accommodates a
minimal increase in motoring. Such issues compel governments to campaign for greater
dependence on walking and cycling, whatever might be the dangers (Adams, 1988, 1994;
Davis, 1993; Hillman et al, 1991).
One of the more bizarre developments in road-use campaigning in Northern Ireland - and
perhaps elsewhere - is an emerging view that children - that most vulnerable group of road-
users - should be taking a lead and travelling to school on foot. The "school-run" is seen as the
most expendable form of car-dependency - but little or no parallel effort is directed to
diminishing motoring for other purposes.
Comments on possible developments
Such observations all point to the need for coherent combinations of empirical and theoretical
understanding regarding the psychological factors at play in safety-related interventions. It may
be that this understanding always has to be atomistic and specific to individual safety
interventions. On the other hand, it may be that more holistic model-building can be
undertaken; Fuller's learning model might well form the basis for such an undertaking.
Compared with risk formulations, the empirical and theoretical underpinnings of Fuller's
learning model are considerably more secure and there is no need to posit intervening variables
whose definition and investigation are at the least problematic.
Nonetheless, such holistic model-building has limitations. In the particular circumstances of
poor visibility, the increased casualties are - as noted in the introduction - best understood in
terms of visual perception (Cavallo et al, 1997, in press; Leibowitz et al, 1982; Reinhardt-
Rutland, 1992a,b). Note that the issue of poor visibility represents another area in which
greater attention to adequate empirical and theoretical understanding is needed. The problems
of poor visibility have never been adequately addressed in safety-related interventions. For the
most part, these interventions have been restricted to issues of basic conspicuity: for example,
such interventions have entailed the introduction of high-intensity lamps and retroreflective
material, neither of which has been particularly successful. Unfortunately, interventions
affecting conspicuity do not address the underlying problems revealed by a deeper analysis
based on the psychology of perception: the illusory effects of motion and space - alluded to
earlier - cannot be eliminated by reference to conspicuity alone. The consequence is, for
example, the continuing occurrence of "motorway madness" cases each winter when fog
affects fast through-routes (Reinhardt-Rutland, 1992a).
Another and recently-recognized illustration of the need for adequate empirical and theoretical
understanding regarding psychological factors concerns the ever-expanding propensity of
information technology to automate, direct navigation or otherwise affect the task of driving.
As Harris and Smith (1997) and Young and Stanton (1997) show, it is necessary that such
developments can be understood and evaluated within a solid framework of psychological
knowledge. Such technology will be of limited value for the motorist if it merely displaces
congestion from one location to another. More crucially, if the cognitive demands (or lack of
them) of the technology affect driver-performance in relation to other road-users, then its value
really does become contentious - as already remarked, it is at last becoming unacceptable that
technical gains for one group of road-users entail losses for vulnerable groups to whom they
are not applicable. Judging from the past, if the use of information technology on the road is
allowed to develop without restraint, it will become difficult to reverse the process if and when
detrimental effects that should have been foreseen become serious.
REFERENCES
Adams, J. G. U. (1988). Risk homeostasis and the purpose of safety regulation. Ergonomics,
31, 407-428.
Adams, J. G. U. (1994). Seat belt legislation: the evidence revisited. Safety Science, 18, 135-
152.
Aschenbrenner, M. and B. Biehl (1994). Improved safety through improved technical
measures? Empirical studies regarding risk compensation processes in relation to anti-
lock brake systems. In: Changes in Accident Prevention: The Issue of Risk
Compensation (R. M. Trimpop and G. J. S. Wilde, eds.). Styx, Groningen
Netherlands.
Ball, K. and R. Sekuler (1980). Human vision favors centrifugal motion. Perception, 9, 317-
325.
Ball, W. and E. Tronick (1971). Infant responses to impending collision: optical and real.
Science, 171, 818-820.
Broughton, J. and D. C. Stark (1986). The effect of the 1983 changes to the law relating to
drink/driving. Research Report 89. Transport and Road Research Laboratory,
Crowthorne UK.
Carlson, N. R. (1993). Psychology, the Science of Behavior. Allyn and Bacon, Boston MA.
Cavallo, V. and M. Laurent (1988). Visual information and skill level in time-to-collision
estimation. Perception, 17, 623-632.
Cavallo, V., J. Dore, M. Colomb, and G. Legoueix (1997). Distance perception of vehicles
rear in fog. In: Human Factors in Road Traffic II (P. Albuquerque, J. Santos, A.
Pires da Costa and C. Rodrigues, eds.). Braga, University of Minho, Braga, Portugal.
Cavallo, V., J. Dore, M. Colomb, and G. Legoueix (in press). Distance over-estimation of
vehicle rear lights in fog. In: Vision in Vehicles 7 (A. G. Gale, ed.), North-Holland,
Amsterdam.
Davis, R. (1993). Death on the Streets: Cars and the Mythology of Road Safety. Leading
Edge, Hawes UK.
Duma, S. M., T. A. Kress, D. J. Porta, C. D. Woods, J. N. Snider, P. M. Fuller, and R. J.
Simmons (1996). Airbag-induced eye injuries - a report of 25 cases. Journal of
Trauma-Injury Infection and Critical Care, 41, 114-119.
Evans, L. (1985). Human behavior, feedback and traffic safety. Human Factors, 27, 555-576.
Evans, L. (1990). Restraint effectiveness, occupant ejection from cars and fatality reductions.
Accident Analysis and Prevention, 22, 167-175.
Evans, L. (1991). Traffic Safety and the Driver. Van Nostrand Reinhold, New York.
Evans, L. and M. C. Frick (1988). Safety belt effectiveness in preventing driver fatalities versus
a number of vehicular, accident, roadway and environmental factors. Journal of
Safety Research, 17, 143-154.
Farmer, C. M., A. K. Lund, R. E. Trempel, and E. R. Braver (1997). Fatal crashes of passenger
vehicles before and after adding antilock braking systems. Accident Analysis and
Prevention, 29, 745-757.
Foeckler, M., F. Hutchenson, C. Williams, A. Thomas, and T. Jones (1978). Vehicle drivers
and fatal accidents. Suicide and Life Threatening Behavior, 8, 174-182.
Fuller, R. (1984). A conceptualization of driving behavior as threat avoidance. Ergonomics, 27,
1139-1155.
Fuller, R. (1988). On learning to make risky decisions. Ergonomics, 31, 519-526.
Fuller, R. (1992). Learned riskiness. Irish Journal of Psychology, 13, 250-257.
Georgeson, M. A. and M. G. Harris (1978). Apparent foveofugal drift of counterphase
gratings. Perception, 7, 527-536.
Harris, D. and F. J. Smith (1997). What can be done versus what should be done: A critical
evaluation of the transfer of human engineering solutions between application
domains. In: Engineering Psychology and Cognitive Ergonomics. Vol. 1:
Transportation Systems (D. Harris, ed.). Ashgate, Aldershot UK.
Harvey, A. C. and J. Durbin (1986). The effects of seat belt legislation on British road
casualties: A case study in structural time series modelling. Journal of the Royal
Statistical Society, A149, 187-227.
Hawkins, F. H. (1987). Human Factors in Flight. Gower, Aldershot UK.
Heino, A., H. H. van der Molen, and G. J. S. Wilde (1996). Risk perception, risk taking,
accident involvement and the need for stimulation. Safety Science, 22, 35-48.
Highway Code (1993). Belfast UK: HMSO.
Hillman, M., J. Adams, and J. Whitelegg (1991). One False Move: a Study of Children's
Independent Mobility. Policy Studies Unit, London.
Hogg, M. A., and G. M. Vaughan (1998). Social Psychology: an Introduction. Prentice-Hall,
New York.
Janssen, W. (1994). Seat-belt wearing and driving behavior: an instrumented-vehicle study.
Kahane, C. J. (1994). Preliminary Evaluation of the Effectiveness of Antilock Brake Systems
for Passenger Cars. Report DOT-HS-808-206. National Highway Traffic Safety
Administration, Washington DC.
King, S. M., D. Dykeman, P. Redgrave, and P. Dean (1992). Use of a distracting task to
obtain defensive head movements to looming stimuli by human adults in a
laboratory setting. Perception, 21, 245-259.
Kolb, B., and I. Q. Whishaw (1985). Fundamentals of Human Neuropsychology. Freeman,
New York.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-
collision. Perception, 5, 437-459.
Leibowitz, H. W., R. B. Post, T. Brandt, and J. Dichgans (1982). Implications of recent
developments in dynamic spatial orientation and visual resolution for vehicle
guidance. In: Tutorials in Motion Perception (A.H. Wertheim, W. A. Wagenaar
and H. W. Leibowitz, eds.). Plenum, New York.
Leslie, J. C. (1996). Principles of Behavioral Analysis. Harwood, Amsterdam.
Mahmud, S. M. and A. I. Alrabady (1995). A new decision-making algorithm for airbag
control. IEEE Transactions on Vehicular Technology, 44, 690-697.
Marchetti, C. (1983). The automobile in a system context: The past eighty years and the next
twenty years. Technological Forecasting and Social Change, 23, 3-23.
Maslow, A. H. (1987). Motivation and Personality. Harper and Row, New York.
Mazur, J. E. (1994). Learning and Behavior. Prentice-Hall, Englewood Cliffs, NJ.
Moghaddam, F. M. (1998). Social Psychology: Exploring Universals across Cultures.
Freeman, New York.
Naatanen, R. and Summala, H. (1976). Road-User Behavior and Traffic Accidents. North
Holland, Amsterdam.
Parker, D. B. and I. G. Cross (1981). The effectiveness of motorway matrix signalling - a
police view. Police Journal, 54, 266-276.
Rajalin, S. and H. Summala (1997). What surviving drivers learn from a fatal road accident.
Reinhardt-Rutland, A. H. (1986). Misleading perception and vehicle guidance under poor
conditions of visibility. In Vision in Vehicles (A. G. Gale, M. H. Freeman, C. M.
Haslegrave, P. Smith and S. P. Taylor, eds.), pp. 413-416. North Holland,
Amsterdam.
Reinhardt-Rutland, A. H. (1987). Aftereffect of visual movement - the role of relative
movement: A review. Current Psychological Research and Reviews, 6, 275-288.
Reinhardt-Rutland, A. H. (1988). Induced movement in the visual modality: an overview.
Psychological Bulletin, 103, 57-72.
Reinhardt-Rutland, A. H. (1992a). On learning, distance overestimation and mist-related
motorway accidents. Perceptual and Motor Skills, 126, 130.
Reinhardt-Rutland, A. H. (1992b). Poor-visibility road accidents: theories invoking "target"
risk-level and relative visual motion. Journal of Psychology, 126, 63-71.
Reinhardt-Rutland, A. H. (1992c). Some implications of motion perception theory for road
accidents. Journal of the International Association of Traffic and Safety Sciences (IATSS
Research), 16, 9-14.
Reinhardt-Rutland, A. H. (1994). Perception of motion-in-depth from luminous rotating
spirals: Direction asymmetries during and after rotation. Perception, 23, 763-769.
Rumar, K. (1988). Collective risk but individual safety. Ergonomics, 31, 507-518.
Rutherford, W. H., T. Greenfield, H. R. M. Hayes, and J. K. Nelson (1985). The medical
effects of seat belt legislation in the United Kingdom. Research Report 13.
Department of Health and Social Security, Office of the Chief Scientist. HMSO,
London.
Salam, M. M. and E. E. Frauenhoffer (1996). Left atrial appendage rupture caused by a seat-
belt - a case-report and review of the literature. Journal of Trauma-Injury Infection
and Critical Care, 40, 642-643.
Schiff, W. (1965). Perception of impending collision: a study of visually directed avoidant
behavior. Psychological Monographs, 79:604, 1-26.
Schiff, W., J. A. Caviness, and J. J. Gibson (1962). Persistent fear responses in rhesus to the
optical stimulus of "looming". Science, 136, 982-983.
Scott, P. P. and P. A. Willis (1985). Road casualties in Great Britain in the first year with seat
belt legislation. Report 9. Transport and Road Research Laboratory, Crowthorne
UK.
Shinar, D., T. H. Rockwell, and J. A. Malecki (1980). The effects of changes in driver
perception on rural curve negotiation. Ergonomics, 23, 263-275.
Smeed, R. J. (1977). Pedestrian accidents. In Proceedings of the International Conference on
Pedestrian Safety II (A. S. Hakkert, ed.), pp. 7-21. Michlol: Haifa.
Stewart, D., C. J. Cudworth, and J. R. Lishman (1993). Misperception of time-to-collision by
drivers in pedestrian accidents. Perception, 22, 1227-1244.
Summala, H. (1996). Accident risk and driver behavior. Safety Science, 22, 103-117.
Summala, H. and R. Naatanen (1988). Risk control is not risk adjustment: The zero-risk theory
of driver behavior and its implications. Ergonomics, 31, 491-506.
Thomson, S. J., E. J. Fraser, and C. I. Howarth (1985). Driver behavior in the presence of child
and adult pedestrians. Ergonomics, 28, 1469-1474.
UK Government Statistical Service (1991). Transport Statistics Great Britain. HMSO,
London.
UK Government Statistical Service (1997). Transport Statistics Great Britain. HMSO,
London.
White, M. E. and D. J. Jeffery (1980). Some aspects of motorway traffic in fog. Report LR958.
Transport and Road Research Laboratory, Crowthorne UK.

Wilde, G. J. S. (1982). The theory of risk-homeostasis: implications for safety and health. Risk
Analysis, 2, 209-255.
Wilde, G. J. S. (1994). Target Risk. PDE Publications, Toronto.
Wyatt, J. P. and J. M. Richardson (1994). The use of seat belts on British motorways. Journal
of the Royal Society of Medicine, 87, 206-207.
Young, M. S. and N. A. Stanton (1997). Automotive automation: effects, problems and
implications for driver mental workload. In Engineering Psychology and Cognitive
Ergonomics. Vol. 1: Transportation Systems (D. Harris, ed.). Ashgate, Aldershot
UK.
Bi-Directional Emergent Fundamental Pedestrian Flows
From Cellular Automata Microsimulation
Victor J. Blue, New York State Department of Transportation, Poughkeepsie, NY, USA
Jeffrey L. Adler, Rensselaer Polytechnic Institute, Troy, NY, USA
INTRODUCTION
Pedestrian flow is an important component in the design and analysis of transportation
facilities and in urban transportation planning. The need to assess the level of service of
pedestrian walkways motivates much of the work on modeling pedestrian flows. Statistically
derived flow models, expressing the fundamental relationships between density, flow, and
speed, are used to generate level of service criteria. The methodology of assessing level of
service for pedestrian walkways is similar to that for vehicle roadways.
Microscopic modeling of vehicle flows is well studied and most, if not all, of the resulting
models have been shown to generate fundamental relationships. In contrast, few attempts have
been made to develop microscopic models of pedestrian flows that capture the complex
behavioral movements and decisions. Historically, researchers have found the modeling task
to be daunting.
In recent years, Cellular Automata (CA) has emerged as a technique for modeling complex
behavior using a set of simple rules. When emulating the complex behaviors of living systems,
CA is characterized as an artificial life approach to simulation modeling (Levy 1997). CA is
named after the principle of automata (entities) occupying cells according to localized
neighborhood rules of occupancy. The CA local rules prescribe the behavior of the automata,
creating an approximation of actual behavior. Emergent group behavior is an outgrowth of the
interaction of the microsimulation rule set. Unlike traditional simulation models, which apply
equations rather than behavioral rules, CA models let behavior-based cellular changes of state
determine the emergent results. Furthermore, CA models function as discrete idealizations of
the partial differential equations that describe fluid flows and allow simulation of flows and
interactions that are otherwise intractable (Wolfram, 1994).
The purpose of this paper is to present CA microsimulation models of uni- and bi-directional
pedestrian walkways and to demonstrate that fundamental properties of pedestrian
movements can be captured using a fairly simple rule-based approach. The paper begins by
characterizing the pedestrian flow problem. A brief literature review of previous modeling
efforts is provided. The formulations for the uni- and bi-directional walkway cases are
described. This is followed by a discussion of results gathered from numerous simulation
experiments. Included is a comparison of emergent flow patterns to the fundamental
parameters described in the Highway Capacity Manual (1994).
BACKGROUND
Microscopic models of car-following behavior and continuum models of traffic streams are
mathematical relationships that describe the complex, and often chaotic, movement of vehicles
and the high degree of interaction between vehicles. It is well known that vehicle flow exhibits
non-linear speed transitions and self-organized criticality. Lieberman and Rathi (1997) suggest
that, while the local behavior of individual vehicles may be represented logically and
mathematically with acceptable confidence, the complex simultaneous interactions of the
system are difficult to describe in mathematical form. In response to the need to study
traffic networks, researchers have turned to simulation.
Microscopic simulation attempts to model the behavior and movement of individual vehicles.
Simulations are constructed from a set of models that represent a variety of behaviors from
changing vehicle speed to lane switching and wayfinding decisions. Traditionally, microscopic
simulation of vehicle flow is constructed around car-following models. However, microscopic
simulations are difficult to develop and calibrate due to the complex behaviors and large
number of parameters that are needed. In addition, chaotic vehicular traffic phenomena that
occur at higher densities are difficult to capture with equation-based models and therefore are
also difficult to simulate with sufficient realism.
Cellular automata microsimulation has been shown to provide a good approximation of
complex flow patterns over a range of densities (Nagel and Rasmussen, 1994; Paczuski and
Nagel, 1995; Nagel 1996). Over the past several years researchers have demonstrated the
applicability of CA to car-following and vehicular flows including traffic within a single-lane
(Nagel and Schreckenberg, 1992), two-lane flow with passing (Rickert, Nagel, Schreckenberg
and Latour, 1995), and network-level vehicle flows in the TRANSIMS model (Nagel, Barrett
and Rickert, 1996). While pedestrian flows are much more complex and chaotic than vehicular
flows, the success with which researchers were able to use CA to model traffic flow provides
the impetus to explore the use of CA for pedestrian flows.
Although fundamental properties for pedestrian flows and level of service methods for
assessing capacity are well established, it has been difficult to develop microscopic models of
pedestrian flows that generate these fundamental properties. Pedestrian flow is highly chaotic
and individualized. Pedestrian corridors may have several openings and support movement in
several directions. Pedestrian walkways are not regulated as are roadways. For the most part
pedestrian flows are not channeled; pedestrians are free to vary speed and allowed to occupy
any part of a walkway. Unlike roadways, where vehicle flow is separated by direction,
bi-directional walkways are the norm rather than the exception. Moreover, because safety and
crash avoidance are less of a concern to pedestrians, slight bumping and nudging are often a
part of walking through crowded corridors. Pedestrians are capable of changing speed more quickly
when gaps arise and can accelerate to full speed from a standstill. In addition, it is not
uncommon for pairs or groups of pedestrians to walk side-by-side or in clusters.
Gipps (1986) came closest to developing a realistic microscopic model of pedestrian flow with
a cell-based approach using a hexagonal lattice. Individual pedestrians occupy hexagonal cells
and over discrete time intervals are moved in relation to one another. Movement is dictated
according to reverse gravity-based rules. Pedestrians are repelled from each other as they seek
their destinations in buildings. This work was limited in that the formulation is encumbered by
sequential floating-point calculations based on the number and proximity of pedestrians who
aim to avoid each other. In addition, the approach was tested only on one-directional
uncongested flows with inconclusive, though apparently reasonable results for the uncongested
speed-volume relationship. Gipps also failed to yield a tenable method for handling bi-
directional movements.
Fruin discusses several efforts from the 1960's to develop models of pedestrian flow within
different scenarios. There are few notable recent efforts by researchers to model pedestrians
using more traditional techniques. Lovas (1994) examined pedestrian traffic in building
evacuation with a discrete event queuing network. AlGadhi and Mahmassani (1991) simulated
crowd behavior using a set of conservation of mass flow simultaneous partial differential
equations solved for several classifications of pedestrians over discrete time steps.
Helbing and Molnar (1995) developed a model in which a pedestrian behaves as if acted upon
by external attractive and repulsive forces, termed social fields. Their model displayed the
formation of lanes by opposing direction flows and oscillatory changes in walking direction at
a doorway. The paper demonstrated a single instance of these abilities and did not show results
over a range of densities or calibrate them against established fundamental flows. The CA
approach explored in this paper is distinctly different from the social fields model,
computationally much simpler and based on maximizing forward progress.
The attractiveness of using CA is that the models are based on behavioral rules, rather than
performance functions. This paper will demonstrate that emergent group behavior comes from
the dynamics of the pedestrians in motion across the defined walkway. The CA behavior-based
cellular changes of state determine the emergent results, giving rise to very lifelike phenomena,
and to new possibilities of modeling based on behavioral rules. The relevant behaviors become
the focus of the modeling and that leads to an understanding of the factors that contribute to the
dynamics.
MODEL FORMULATION
The objective is to develop an intuitively and empirically appealing CA microsimulation for
modeling bi-directional flows on a pedestrian walkway. The model aims to employ the
essential minimal rule set needed and avoids the pitfall of trying to capture all possible
behaviors of pedestrians, which are too numerous to identify and problematic to verify. Many
pedestrian behaviors are unnecessary distractions from the essential parameters. It has been
observed in CA simulations that very simple models are capable of capturing essential system
features of extraordinary complexity (Bak 1996). Aiming at the essential and minimal rule set
by eliminating anything but critical behavioral factors permits a clear understanding of the
underlying fundamental dynamics.
There are two fundamental elements of pedestrian movement to capture: forward movement
and resolving conflicts. Forward movement refers to the velocity and acceleration of each
pedestrian. Desired walking speeds vary over the pedestrian population. Under less crowded
conditions, individual pedestrians will strive to accelerate toward their desired maximum
walking speed. In congested walkways, pedestrians will adapt their speed to the prevailing
flow rates in the immediate walkway neighborhood.
To account for differences in desired maximum walking speeds (termed v_max), a pedestrian
population consisting of three walker classes is adopted:
Table 1. Maximum Walking Speeds
Pedestrian Class | Walking Speed (m/s)
Fast | 1.80
Standard | 1.30
Slow | 0.85
Conflict resolution refers to movements that are undertaken to avoid head-on collisions or to
enable passing or bypassing. In a bi-directional walkway, pedestrians from opposing
directions can vie for the same location. The proposed model will introduce the concept of
"place exchange", whereby the positions of opposing pedestrians are swapped to emulate the
collision avoidance behavior. Passing movements occur when two pedestrians are moving one
in front of the other in the same direction and the trailing pedestrian wishes to accelerate past
the lead pedestrian. Bypassing occurs when two pedestrians approach one another from
opposite directions and circumvent one another. In this behavior, termed "lane-changing" or
"side-stepping", the passing pedestrian shifts position to the right or left in order to enable a
higher walking speed. Though pedestrians do not follow lanes, lane changing is a helpful term
to use when referring to a CA grid of cells where de facto lanes exist.
Rule Set
The CA microsimulation is based on six rules applied across four parallel updating stages. In
the first parallel update stage, a set of lane changing rules is applied to each pedestrian on the
lattice to determine the next lane of each pedestrian. All pedestrians are assigned to the new
cells during the second stage. In the third stage, a set of forward movement rules is applied to
each pedestrian. The allowable speed of each pedestrian is based on the available gap ahead
and the pedestrian's desired speed with all the entities in their current positions. Finally, the
pedestrians hop forward to new cells in the fourth update.
In the first parallel update stage Rules 1-5 determine lane switching. Pedestrians can change
lanes only when an adjacent cell is available. Rule 1 determines if the adjacent cells to the
immediate right and left are available. Rule 2 resolves possible conflicts if an adjacent cell is
available but the cell two lanes over is occupied. A random number is drawn to designate the
lane as free to this pedestrian or to the pedestrian two cells away. Rule 3 eliminates a lane
change if both adjacent cells are unavailable. If an adjacent lane is free, then lane change is
determined by the maximum gap in Rule 4. If the maximum gap is common to two or more
lanes, Rule 5 breaks ties for making lane assignments. Rule 5a, an 80/10/10 split for all three
lanes, assumes pedestrians generally stay in the current lane and drift out of it occasionally.
Rule 5b, a 50/50 split between the adjacent lanes, is the most reasonable assumption. Rule 5c,
a 50/50 split between the current lane and an adjacent lane, assumes pedestrians wish to have a
cell-width of separation from a person in the adjacent lane as much as they want to stay with
the current course. These probabilities worked well for the simulation. The second parallel
update stage moves all the pedestrians into their new lanes.
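The lane choice described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `choose_lane`, the dictionary encoding of lanes as offsets (-1 for left, 0 for current, +1 for right), and the assumption that Rules 1 and 2 have already removed unavailable lanes from the input are all illustrative choices.

```python
import random

def choose_lane(gaps, rng=random):
    """Pick a lane offset (-1 left, 0 current, +1 right) for one pedestrian.

    `gaps` maps each available lane offset to its forward gap in cells.
    The current lane (0) is always present; an adjacent lane is left out
    when Rules 1-2 have marked it as occupied (an assumed encoding).
    """
    best = max(gaps.values())
    tied = sorted(lane for lane, gap in gaps.items() if gap == best)
    if len(tied) == 1:
        # Rule 4: a uniquely maximal gap wins. This also covers Rule 3,
        # because only the current lane is present when both sides are blocked.
        return tied[0]
    if tied == [-1, 0, 1]:
        # Rule 5a: 80/10/10 split favouring the current lane.
        return rng.choices([0, -1, 1], weights=[80, 10, 10])[0]
    # Rules 5b and 5c: 50/50 split between the two tied lanes.
    return rng.choice(tied)
```

For example, `choose_lane({-1: 4, 0: 2})` returns -1 because the left lane has the uniquely largest gap, while a three-way tie keeps the pedestrian in the current lane about 80 percent of the time.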
The third parallel update stage determines the forward movement of the pedestrians. The gap
ahead is determined first. If the pedestrian ahead is moving in the opposing direction, Rule 6
guards against deadlocks by emulating what people actually do. Under constrained conditions,
opposing pedestrians may
exchange places. In actuality temporary standoffs may occur when people guess which way to
step past one another. Thus, the simulation contains a probability of a temporary standoff
between closely opposing walkers. With probability p_exchg closely opposing pedestrians
exchange places in the time step. The opposing entities each move the same number of cells,
which is 0, 1, or 2 cells. In the fourth parallel update, the pedestrians are moved forward based
on the gaps determined in the third parallel update stage.
Parallel Update #1: Lane Assignments
(Rule 1): Check adjacent cells
    IF the cell immediately to the left (right) is unavailable
    THEN assign the cell to be occupied and GOTO Rule 3
    ELSE GOTO Rule 2
(Rule 2): Determine if adjacent lanes are free
    IF the cell two lanes over to the left (right) is occupied by a pedestrian
    THEN with probability r assign the left (right) lane to be occupied
    GOTO Rule 3
(Rule 3): Determine if pedestrian must remain in current lane
    IF the lane immediately to the right is occupied
    AND the cell immediately to the left is occupied
    THEN assign pedestrian p_n to his current lane
    ELSE GOTO Rule 4
(Rule 4): Assign to uniquely maximal gaps
    Compute the available gaps for the current lane and for unoccupied adjacent lanes
    IF a gap is uniquely maximal
    THEN assign pedestrian p_n to the lane having maximum gap
    ELSE GOTO Rule 5
(Rule 5): Tie-breaking of equal maximum gaps
    Use the appropriate tie-breaking rule:
    (a - 3-way tie): Randomly apply 80/10/10 split for current lane and two adjacent lanes.
    (b - 2-way tie between the adjacent lanes): Randomly apply 50/50 split.
    (c - 2-way tie between current lane and single adjacent lane): Randomly apply 50/50 split.
Parallel Update #2: Lane Movement
    Move each pedestrian p_n to the lane assigned in the lattice.
Parallel Update #3: Assigning Travel Speeds
(Rule 6): Update velocity
    Let v(p_n) = available gap
    IF gap = 0 or 1 and gap = gap2 (cell occupied by an opposing pedestrian)
    THEN with probability p_exchg v(p_n) = gap + 1
    ELSE v(p_n) = 0
Parallel Update #4: Forward Movement
    Advance each pedestrian p_n, v(p_n) cells forward in the lattice.
The available gap ahead depends on the direction of flow of the next pedestrian downstream.
From the gap calculation, if both pedestrians are going in the same direction, the new velocity
of the follower is the minimum of the desired velocity of the pedestrian (v_max) and the
available gap ahead. If the gap between opposing pedestrians is less than the total distance that
both pedestrians could move at maximum speed (i.e., 8 cells is the maximum - 4 in each
direction), then the updated velocity is the minimum of v_max and moving halfway forward.
Moving halfway forward guards against collisions and hopping over other entities. Taking the
minimum gap of the same and opposite directions of movement is shown as a combined task at
the end of the gap calculation.
Computing the Available Gap
    Look ahead a max of 8 cells (since 2 * largest v_max = 8)
    IF occupied cell found with same direction
    THEN set gap1 to number of cells between entities
    ELSE gap1 = 8
    IF occupied cell found with opposite direction
    THEN set gap2 to INT (0.5 * number of cells between entities)
    ELSE gap2 = 4
    Assign gap = MIN (gap1, gap2, v_max)
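As a hedged illustration, the gap calculation and the Rule 6 velocity update might be sketched in Python as follows. Several details are assumptions rather than statements from the paper: a lane is encoded as a circular list with 0 for an empty cell and +1 or -1 for a pedestrian walking toward increasing or decreasing index, the function names are hypothetical, and the ELSE branch of Rule 6 is read as the standoff outcome of the random draw (so that pedestrians not in close opposition keep v = gap).

```python
import random

def available_gap(lane, pos, v_max, look_ahead=8):
    """Return (gap, gap2) for the pedestrian occupying lane[pos].

    `lane` is a circular list: 0 = empty, +1 / -1 = walking direction.
    """
    direction = lane[pos]
    gap1, gap2 = look_ahead, look_ahead // 2   # defaults when nothing is found
    found_same = found_opp = False
    for k in range(1, look_ahead + 1):
        cell = lane[(pos + direction * k) % len(lane)]
        between = k - 1                        # empty cells separating the pair
        if cell == direction and not found_same:
            gap1, found_same = between, True
        elif cell == -direction and not found_opp:
            gap2, found_opp = between // 2, True   # INT(0.5 * cells between)
    return min(gap1, gap2, v_max), gap2

def rule6_velocity(gap, gap2, p_exchg, rng=random):
    """Velocity in cells per step; place exchange applies only at close range."""
    v = gap
    if gap in (0, 1) and gap == gap2:
        # Closely opposing walkers: exchange with probability p_exchg,
        # otherwise a temporary standoff (assumed reading of the ELSE branch).
        v = gap + 1 if rng.random() < p_exchg else 0
    return v
```

With two opposing walkers separated by three empty cells, `available_gap` gives each of them gap = gap2 = 1, and `rule6_velocity(1, 1, 1.0)` returns 2, so both move two cells and pass through one another's positions.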
SIMULATION EXPERIMENTS
A series of simulation experiments was conducted to evaluate the rule set over different
uni-directional and bi-directional scenarios. The pedestrian walkway is modeled as a circular
lattice of width W and length G. Each cell in the lattice is denoted L(i, j), where
$1 \le i \le W$ and $1 \le j \le G$. Pedestrian densities are predetermined at the start of the
simulation and remain constant throughout each run. At the start of each simulation, for a
density d (the proportion of occupied cells, where $0.05 \le d < 1.0$), N = INT (d*W*G)
pedestrians are created and assigned randomly to the lattice. The circular lattice enables the
set of pedestrians to interact at constant density while maintaining strict conservation of
flow. Cells in the lattice are considered square at 0.457 m per side. This cell size is scaled
according to minimal requirements for personal space as described in the Highway Capacity
Manual. The scale is also used to generate the speed-flow-density relationships that emerge.
Density as the proportion of occupied cells is considered more helpful for discussion of
results than units of pedestrians/m^2, which can be obtained as d/(0.457 m)^2.
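A small Python sketch of the lattice set-up and the density conversion follows. The function names and the example lattice dimensions are hypothetical; only the cell side of 0.457 m and the formula N = INT(d*W*G) come from the text.

```python
import random

CELL_SIDE_M = 0.457  # cell side length given in the text

def populate_lattice(width, length, density, rng=random):
    """Place N = INT(d*W*G) pedestrians on distinct, randomly chosen cells."""
    n_peds = int(density * width * length)
    cells = [(i, j) for i in range(1, width + 1) for j in range(1, length + 1)]
    return rng.sample(cells, n_peds)

def density_to_ped_per_m2(density):
    """Convert the proportion of occupied cells to pedestrians per square metre."""
    return density / CELL_SIDE_M ** 2
```

For an illustrative lattice of W = 10 and G = 100 at d = 0.25, `populate_lattice` returns 250 distinct cells, and a lattice density of 0.65 converts to about 3.11 pedestrians/m^2.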
To capture variations in walking speeds across the population, a distribution of 5 percent "fast"
walkers, 90 percent "standard" walkers, and 5 percent "slow" walkers (termed the 5:90:5
distribution) was used. This distribution gave the best realization of the fundamental diagram
compared with other distributions in the single-direction case (Blue and Adler 1998). For the
circular lattice, the walking speeds are 4, 3, and 2 cells per time step respectively, and these
values fall within ranges of speed and standard deviations used by other researchers (Lovas
1994).
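The walker classes can be sketched in Python as below. This is an illustration only: the text does not say whether classes are drawn at random or assigned in exact proportions, so independent random draws are assumed here, and the names are hypothetical. The conversion to m/s uses the 0.457 m cell and the one-second time step.

```python
import random

CELL_SIDE_M = 0.457                                    # metres per cell
V_MAX_CELLS = {"fast": 4, "standard": 3, "slow": 2}    # cells per 1 s time step
CLASS_SHARE = {"fast": 0.05, "standard": 0.90, "slow": 0.05}

def assign_classes(n_peds, rng=random):
    """Draw a walker class for each pedestrian (assumed: independent draws)."""
    names = list(CLASS_SHARE)
    weights = [CLASS_SHARE[name] for name in names]
    return rng.choices(names, weights=weights, k=n_peds)

def speed_m_per_s(walker_class):
    """Maximum walking speed implied by the lattice scale."""
    return V_MAX_CELLS[walker_class] * CELL_SIDE_M
```

On this scale the three classes move at about 1.83, 1.37, and 0.91 m/s, slightly above the nominal values of 1.80, 1.30, and 0.85 m/s listed in Table 1.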
Fundamental parameters of pedestrian flows (space, flow rate, and average walking speed)
were generated from the experiments. For the circular lattice, space is the reciprocal of density
and is constant for each run. Flow rate is computed as the total number of revolutions made by
each pedestrian over the duration of the simulation. Speed is computed as number of total
steps divided by the time duration.
Each simulation was run for 11,000 time steps of one second each. The first 1000 time steps
were used to initiate the simulation and the latter 10,000 were used to generate performance
statistics. Each set of experiments included runs at 19 densities ranging from 0.05 to 0.95
(5 to 95 percent occupancy) in increments of 0.05. For statistical accuracy, twenty repetitions
at each density level were run and the fundamental parameters were computed as the average over these
replications. The resulting emergent fundamental profile is a map of the relationships between
speed and flow over the range of densities.
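The experimental design can be summarized in a short Python sketch. The constants come from the text; the callable `run_once`, which stands in for a single simulation run returning one statistic, is hypothetical, as is the use of the replication index as a random seed.

```python
N_STEPS = 11_000        # total time steps per run (one second each)
WARM_UP = 1_000         # initial steps discarded before collecting statistics
REPLICATIONS = 20       # repetitions per density level
DENSITIES = [round(0.05 * k, 2) for k in range(1, 20)]   # 0.05, 0.10, ..., 0.95

def average_over_replications(run_once, density):
    """Average one statistic over the replications at a given density.

    `run_once(density, seed)` is a hypothetical stand-in for a full
    simulation run that returns a single number (e.g. mean speed).
    """
    results = [run_once(density, seed) for seed in range(REPLICATIONS)]
    return sum(results) / len(results)
```

Each point on the emergent fundamental profile would then be `average_over_replications(run_once, d)` for one `d` in `DENSITIES`, using only the last `N_STEPS - WARM_UP` = 10,000 steps of each run.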
Uni-Directional Flow
The first set of experiments was conducted on a uni-directional pedestrian walkway. This
eliminates the need to resolve head-on conflicts as all pedestrians are moving in the same
direction. Several model forms were fitted against the resulting fundamental data. Figure 1
depicts three volume vs. density curves: (1) the results of the microsimulation experiment, (2)
a fitted May's bell-shaped curve, and (3) a composite two-regime model curve-fitted to the
data.
May's flow equation was estimated from the simulated data. It was found to have the
following bell-shaped speed-density relationship:
$$S = 83.16\, e^{-4.11 D^{2}} \qquad (1)$$
The best fitting model is a two-regime speed-density relationship. Under low densities, the
flows appear to follow May's bell-shaped relationship; a linear relationship between flow and
density is suggested at higher densities. The best estimated model for the data had the
following speed-density relationship:
$$S = \begin{cases} 84.4\, e^{-4.75 D^{2}}, & D < 0.45 \\ (26.59/D) - 26.39, & D > 0.45 \end{cases}$$
As seen on the chart, the maximum flow of this two-regime model is approximately 80
ped/min/meter of width. The HCM suggests a capacity around 25 ped/min/ft of width,
equivalent to 82 ped/min/meter of width.
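The capacity figures quoted above can be checked numerically with a short Python sketch. It assumes that S in the fitted two-regime relationship is in metres per minute and that D is the proportion of occupied cells, so that flow per metre of width is S times D divided by the cell area; the text does not state the units explicitly, and the function names are illustrative.

```python
import math

CELL_AREA_M2 = 0.457 ** 2   # area of one lattice cell in square metres

def speed_two_regime(d):
    """Speed (assumed m/min) from the fitted two-regime relationship."""
    if d < 0.45:
        return 84.4 * math.exp(-4.75 * d ** 2)
    return 26.59 / d - 26.39

def flow_ped_per_min_per_m(d):
    """Flow = speed * density, with density converted to pedestrians/m^2."""
    return speed_two_regime(d) * d / CELL_AREA_M2

# Grid search for the density that maximizes flow.
densities = [k / 1000 for k in range(50, 1000)]
d_star = max(densities, key=flow_ped_per_min_per_m)
q_max = flow_ped_per_min_per_m(d_star)

# HCM capacity of 25 ped/min/ft converted to ped/min/m (1 ft = 0.3048 m).
hcm_capacity = 25 / 0.3048
```

Under these assumptions the maximum falls near a density of 0.32 with a flow of roughly 79.5 ped/min/m, close to the value of about 80 read from the chart and to the converted HCM capacity of about 82 ped/min/m.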
The results compare favorably with work reported by Virkler and Elayadath (1994) who
examined uni-directional flows and fit various curves to field data. Their study found that two
functional forms fit the data best: (1) single regime model represented by May's bell-shaped
function and (2) a two-regime model with separate linear regimes. A closer look at this study
reveals that the estimation of flow equations was based on a data set where density reached a
maximum value of 3.12 Ped/m2, corresponding to a density of 0.65 on the circular lattice.
Therefore, Virkler and Elayadath's estimation is limited to low and medium densities. The
results of the uni-directional CA model for low densities are consistent with these previous
findings, combining the best features of May's bell-shaped curve and the two-regime linear
model. The CA data set continues to high density and provides improvements to the
fundamental model. The CA model is a useful tool for extending the examination and knowledge
of pedestrian behavior and aggregate movement.
Figure 1. Volume-Density Curves for Uni-Directional Flow
[Figure: volume versus density (cells occupied / total cells) for the CA simulation, May's bell curve, and the 2-regime model.]
Bi-Directional Flow
In a bi-directional walkway, pedestrian movements are limited to two opposing directions.
According to the HCM (1994), flows along a bi-directional walkway will tend to segregate
over time and occupy proportional shares of the walkway width. For example, if sixty percent
of a pedestrian population is walking northbound, it can be expected that, over time, sixty
percent of the walkway capacity will become dedicated to the northbound pedestrians. For a
walkway lattice of ten cells wide, it would be expected that the northbound pedestrians would
occupy the six rightmost lanes and the southbound pedestrians would occupy the remaining
four lanes. Under this condition, the flow is directionally separated and crossover movements
to lanes supporting flow in the opposing direction only occur briefly to accommodate passing.
Since cases of directionally separated flow are nearly identical to having two adjoining uni-
directional walkways, the emerging flow-density profile should be similar to the uni-
directional case. The HCM suggests that depending on the directional split, the flow profile
will retain the same shape except for some reduction in capacity. If the directional flows are
roughly equal then little reduction in capacity is expected. However when there is a significant
imbalance, say, a 90-10 split, capacity reductions of about 15 percent have been observed
(HCM 1994).
Figure 2 depicts results of microsimulation experiments conducted over a range of directional
splits from 100-0 to 50-50. As expected, it was found that the emergent flow-density curves
were virtually identical to the results of the unidirectional case. In addition, very little variation
in the resulting behaviors between splits was found.
There are cases in which flows along bi-directional walkways are not segregated; opposing
traffic intermingles and directional lanes do not readily evolve. This usually rather short-lived
case of randomly interspersed pedestrians moving in opposing directions may occur at busy
crosswalks, in crowded subway stations, and in emergency situations, among others.
Historically, interspersed bi-directional flow has not been well examined, documented, or
modeled. As a result this case is not thoroughly understood.
The CA microsimulation was used to investigate the case of bi-directional interspersed flow.
When opposing pedestrians converge on the same cell but cannot negotiate sidestepping, the
conflict is resolved by exchanging the places of the opposing pedestrians. This is realized
through Rule 6 with the use of a variable, p_exchg. To the authors' knowledge, the
characteristics of place exchange have not been examined in any field study of pedestrian
activity, though it is well known that pedestrians will bypass, bump, slip past, and otherwise
exchange places with one another when necessary.
Figure 2. Volume-Density Curves for Bi-directional Walkway and Varied Splits
[Figure: volume versus density for directional splits of 100/0, 90/10, 80/20, 70/30, 60/40, and 50/50.]
Given that the microsimulation model correctly approximates speed-volume-density
relationships for the single and lane-based bi-directional cases, a set of runs was conducted to
examine the effect of handling conflicts using the place exchange logic. In the absence of
empirical data, these findings are considered preliminary, though they are quite instructive
regarding the capabilities of the model and, to a lesser extent, actual crowd behavior in the
random bi-directional setting.
Exchange rates of 100, 75, 50, 25, and 0 percent were applied across the various directional
splits. 0 percent is the base case where pedestrians cannot move if an opposing pedestrian
creates a direct conflict; 100 percent indicates that all conflicts are resolved.
Figure 3 illustrates the resulting speed-density curves for several directional splits based on
exchange probabilities of 100 percent success. The speed-density plots of each directional split
show a pattern emerging that is orderly and understandable. At a density below 0.45, speeds
are reasonably reduced according to the directional split. Above a density of 0.45, the high
degree of exchange rate allows higher speeds, as splits become more even. This is because
pedestrians are not as often held back by opposing persons ahead and because more balanced
directional flow provides more opportunities for face-to-face conflicts and smooth exchanges
of position.
However, perfect exchanges of opposing pedestrians allow flows to do what appears to be
unreasonably well: evenly split flows at very high density rise above the maximum that
single-direction flow would allow. Thirty or more exchanges per pedestrian
per minute occur, a number that appears unlikely in practice. Primarily because of this result,
the perfect exchange concept appears more useful as a benchmark of maximum flow. Actual
levels of exchange are likely to be hampered by occasional wrong-steps.
Figure 3. Speed vs. Density Curves for Bi-directional Walkway,
Interspersed Flow, and varied splits for 1.0 exchange probability
(Curves for 100/0, 90/10, 80/20, 70/30, and 60/40 directional splits; horizontal axis: Density.)
Figure 4 illustrates flow-density curves for 90-10 splits over the range of exchange
probabilities. As the rate goes down, significant attenuation of flow is observed. In addition,
even for the 100 percent exchange case there is a significant (approximately 15 percent)
reduction in capacity as the HCM describes for the case where lane formation does not occur.
The 75 percent exchange case has a similar peak volume, appropriately diminished from the
uni-directional and lane-separated bi-directional case. Lower exchange rates appear to have
peak volumes out of range of what the HCM describes.
Figure 4. Flow vs. Density Curves for Bi-directional Walkway,
Interspersed Flow, and 90:10 volume split
(Curves for p_exchg = 0, 0.25, 0.50, 0.75, and 1.0; horizontal axis: Density.)
Lane Changing Behavior
Understandably, most of the historical effort that has gone into capturing pedestrian flow
characteristics has been directed toward their forward movement. However, an important
factor in understanding pedestrian flow is lane changing. Lane changing has been modeled in
vehicular traffic (see, for example, Rickert, Nagel, Schreckenberg & Latour 1995; Ben-Akiva,
Koutsopoulos & Yang 1995). Since extensive data gathering and modeling of pedestrian lane
changes has not been previously done, the effort here has been to perceive pedestrian lane
changing from experience and empirically capture the forward movement with reasonable
rules.
Attention to modeling of lane changing is also needed due to mode locking of pedestrians, or a
marching effect, which may occur in some instances. Mode locking, or synchronization of
motions, occurs in many physical processes (Schroeder 1991). In this case, pedestrian
self-organization affects the ability to synchronize movement. Mode locking is almost certain to
occur at about 0.25 density if all the entities are of the same walker class, having the same
maximum speed. If rigid mode locking occurs, slowdowns and small jams do not occur and
average speed will be unreasonably high. Even with substantial opportunity to pass, the
pedestrians achieve flexible self-organization. Self-organization arises naturally in the model,
but the lane changing rules should avoid rigid locking into step, or the forward flow
characteristics will not emerge correctly. While self-organization is of interest, and the model
does display it, it is not the primary focus of attention at this stage of research investigation.
Figure 5 illustrates the relationship between lattice density and rate of side stepping. The
uni-directional lane change rate (100/0 split case in Figure 5) is affected by density in an
unexpected way, with a local minimum at 0.25 and a local maximum at 0.35 density. The
phenomenon appears to be an outgrowth of self-organization at various stages of interaction
among the pedestrians. There are, thus, some spatial configurations of entities that promote
lane changing and others that impede it. In fact, spatial use and the uni-directional lane change
curve can be used to explain the shape of the S-curve in the speed-density relationship. As the
density for the 100/0 split increases above 0.25, the spatial efficiency for forward movement
decreases and lane changing increases as entities tend to self-organize into the available space,
breaking up any even spacing between pedestrians and making forward movement more
difficult.
Figure 5 also illustrates the relationship between sidesteps and density at the 1.0 exchange
probability for interspersed flow over a range of directional splits. As can be seen from the
figure, the spatial advantages of mode locking are entirely lost and the opposite effect even
occurs. In fact, the spatial advantage of mode locking at 0.25 density becomes a disadvantage
with opposing flows and increases as flows become more evenly divided. Avoiding entities
coming from the opposite direction is most effective for reasons similar to why mode locking
is so effective in uni-directional flow.
Figure 5. Sidesteps vs. Density Curves for Bi-directional Walkway,
Interspersed Flow, and varied splits for 1.0 exchange probability
(Curves for 100/0, 90/10, 80/20, 70/30, 60/40, and 50/50 directional splits; horizontal axis: Density (Cells Occupied/Total Cells).)
When exchanges occur freely, the lane changing is not as impeded. As the exchange
probability goes down, the lane changing increases (not shown). One finding from this analysis
is that lane changing is sometimes helpful and other times not very helpful in promoting flows.
A lot depends on the spatial use and density. Especially as density increases, when a
pedestrian changes lane to pass, another pedestrian can change lanes and block the first
pedestrian's movement. Pedestrians make their decisions independently and are myopic to the
movements others may take. In actual pedestrian traffic pedestrians can "read" the body
movements of others to a certain degree, but this can also be deceptive and oncoming
pedestrians may still wind up in a face off. In road traffic lane changing, a phenomenon
referred to by some as "snaking" can be observed (Resnick 1994). In congested traffic, if an
adjacent lane moves faster, drivers switch lanes and congest the formerly moving lane. Then,
the formerly stopped lane begins to move. The traffic moves like a snake, alternating the lane
that is in freer flow. The fact that lane changing can be counterproductive or non-productive
for forward flow does not stop lane changing from happening. Pedestrians can be as myopic as
drivers, as can be observed at multiple queues, where queue hopping occurs to the seemingly
eternal frustration of the person who always seems to switch to the wrong queue.
It is noteworthy that the CA model has a significant capability in capturing lane changing to
the degree needed for apparently any application. At the very least, lane change rules in CA
models must prevent excessive mode locking from occurring while permitting reasonable
amounts of self-organization of pedestrians.
CONCLUSIONS
This paper demonstrated the use of CA microscopic simulation for modeling pedestrian
behavior. The basic pedestrian behaviors have been approximated with a minimum rule set
that effectively captures the essential pedestrian dynamics for the single- and bi-directional
case. The rule set and scaling can be adjusted to situations and conditions that correspond, in
some measure, to actual behaviors. The correspondence of the rules to actual behaviors need
not be exact. The essential rules are simple enough to program and modify without imposing
unnecessary detail, and yet capture complex phenomena.
The CA microsimulation exhibits emergent fundamental flows that correspond to published
field data and accepted norms. The resultant single-directional flow-density case corresponds
to a two-regime curve consisting of (a) May's Bell curve at low-to-mid density and (b) a linear
curve at mid-to-high density. The bi-directional case where lanes are dedicated by
directional split appears sufficiently validated in that those flows are not significantly different
from single-direction flows. The 90-10 interspersed bi-directional case corresponds well to the
15 percent reduction in capacity noted in the HCM.
Over all the simulation tests, the region of density 0.2-0.4, where the maximum flows occur,
has the largest differences in speed and volume between tests. These are presumed to be the
speed-flow-density combinations that have the most volatile dynamics. This inference agrees
with Paczuski and Nagel (1995), who reveal complex dynamics at work, especially in the
maximum flow range of the Nagel-Schreckenberg automobile traffic model. This is evidently
due in part to spatial sensitivities of lane changing that affect forward movement in this
region. Since nonlinear effects occur around maximum flow, the region of maximum flow
would then be the area where the greatest concentration of effort in fine-tuning the model
would benefit. With further work it would be possible to reveal more about and to better
understand the dynamics in the maximum flow range.
Lane changing effects yield some interesting insights into the speed-density relationship. At
lower densities mode locking may occur that is efficient for forward movement, but requires
and permits few lane changes. At higher volumes, lane changing may help as well as hinder
overall flow and evidently has relatively small impact upon the emergent group flow behavior.
However, though not well studied or understood by field research to date, lane change behavior
is an important feature of pedestrian movement. Its inclusion in the model is essential and adds
realism.
Among those issues that would benefit from further examination, lane changing and place
exchange, especially, emerge as factors that are critical to the functioning of the model. Field
studies should verify the hypothetical lane change and position exchange phenomena.
Previously published data were used to model the distribution of walker speeds. However, the
use of a CA model as a design tool would require careful study of local pedestrian populations
with respect to characteristics of lane changing, position exchange, and the speed distribution
of walkers in free flow.
The more complex and hypothetical case of interspersed bi-directional flows clearly illustrates
the modeling power of the CA method. Its capabilities have been merely touched upon in this
investigation. The CA approach yields a viable tool for pedestrian modeling that has daunted
the efforts of researchers for years. It captures micro-level pedestrian dynamics and offers an
experimental platform for better grasping the important parameters of pedestrian flows. As a
new method, its possibilities have barely begun to be theoretically explored and considerable
opportunities exist for innovative applications. Among the many possibilities, the authors are
investigating a 4-directional pedestrian model and complex walking environments, such as
shopping malls, street intersections, and bus/rail stations. Multiple-mode models that combine
pedestrians, autos, trucks, buses, bicycles, and so forth (e.g., auto-rickshaws and scooters in
Asian traffic) are plausible extensions of the CA pedestrian and auto work done to date.
A version of the simulation can be seen online at http://www.ulster.net/~vjblue.
REFERENCES
AlGadhi, S.A.H. and H. Mahmassani (1991). Simulation of Crowd Behavior And Movement:
Fundamental Relations And Application. Transportation Research Record, 1320, 260-268.
Bak, P. (1996). How Nature Works: The Science of Self-Organized Criticality, Springer-
Verlag New York, Inc.
Ben-Akiva, M., H. N. Koutsopoulos, and Q. Yang (1995). A Simulation Laboratory for
Testing Traffic Management Systems. 7th World Conference on Transport Research
(WCTR). Sydney, Australia.
Blue, V. J. and J. L. Adler (1998). Emergent Fundamental Pedestrian Flows from Cellular
Automata Microsimulation. Transportation Research Record, 1644, 29-36.
Fruin, J. J. (1971). Pedestrian Planning and Design. Metropolitan Association of Urban
Designers and Environmental Planners, New York, N.Y.
Gipps, P. G. (1986). Simulation of Pedestrian Traffic in Buildings. Schriftenreihe des Instituts
fuer Verkehrswesen, 35, Institut fuer Verkehrswesen, Universitaet Karlsruhe, Germany.
Helbing D. and P. Molnar (1995). Social Force Model for Pedestrian Dynamics. Physical
Review E, 51 (5) 4282-4286.
Lieberman E. and A. Rathi (1997). Traffic Simulation, In: Revised Monograph on Traffic
Flow Theory, Federal Highway Administration, (N. Gartner, M. Messer, and A. Rathi,
eds.) WWW publication http://www.tfhrc.gov/its/tft/tft.htm.
Levy, S. (1992). Artificial Life. Vintage Books, New York.
Lovas, G. G. (1994). Modeling and Simulation of Pedestrian Traffic Flow. Transportation
Research, 28B (6), 429-443.
Nagel, K. and M. Schreckenberg (1992). A Cellular Automaton Model for Freeway Traffic. J.
Physique I, 2.
Nagel, K. and S. Rasmussen. (1994). Traffic at the Edge Of Chaos. Artificial Life IV:
Proceedings of the 4th International Workshop on the Synthesis and Simulation of Living
Systems, pp. 222-225.
Nagel, K. (1996). Particle Hopping Models and Traffic Flow Theory. Physical Review E, 53
(5), 4655-4661.
Nagel, K., C. Barrett, and M. Rickert (1996). Parallel Traffic Micro-simulation by Cellular
Automata and Application for Large Scale Transportation Modeling. Los Alamos
Unclassified Report 96:0050, Los Alamos National Laboratory, Los Alamos New Mexico.
Paczuski, M. and K. Nagel (1995). Self-Organized Criticality and 1/f Noise in Traffic, Los
Alamos Unclassified Report 95:4108 (published in Traffic and Granular Flow, eds. D.E.
Wolf, M. Schreckenberg, and A. Bachem, Singapore: World Scientific, 1996, p 41) Los
Alamos National Laboratory, Los Alamos New Mexico.
Resnick, M. (1994). Turtles, Termites, and Traffic Jams: Explorations in Massively Parallel
Microworlds. MIT Press, Cambridge, Mass.
Rickert, M., Nagel, K., Schreckenberg, M. and Latour, A (1995). Two-Lane Traffic
Simulations Using Cellular Automata. Los Alamos Unclassified Report 95:4367 (published
in Physica A, Vol. 231, pp. 534, 1996), Los Alamos National Laboratory, Los Alamos New
Mexico.
Schroeder, M. (1991). Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise. W.H.
Freeman and Company, New York.
Special Report 209: Highway Capacity Manual (1994). Transportation Research Board,
National Research Council, Washington, D.C.
Virkler, M. R. and S. Elayadath (1994). Pedestrian Speed-Flow-Density Relationships.
Transportation Research Record, 1438, 51-58.
Wolfram, S. (1994). Cellular Automata and Complexity. Addison-Wesley Publishing
Company.
CHAPTER 4
FLOW EVALUATION IN ROAD NETWORKS
• Time is nature's way of keeping everything from happening at once.

• You cannot depend on your eyes when your imagination is out of focus.
(Mark Twain)
• Discovery consists of seeing what everybody has seen and thinking what
nobody has thought.
FLOW MODEL AND PERFORMABILITY OF A

ROAD NETWORK UNDER DEGRADED
CONDITIONS
Yasuo Asakura, Masuo Kashiwadani and Eiji Hato
Department of Civil and Environmental Engineering, Ehime University

Matsuyama, 790-77, JAPAN, FAX. +81-89-927-9843, E-mail, asakura@enl.ehime-u.ac.jp
INTRODUCTION
Transport network flows are the results of the interaction between travel demand and supply
conditions. Flows in a network are not stable due to the fluctuation of travel demand or the
occurrence of failure in supply conditions. Even if links and nodes in a network are not
physically damaged and the supply condition of the network is normal, flows in the network
may not always be stable. Travel demand is variable from time-to-time or day-by-day, and
the resulting network flow will be fluctuating. We sometimes experience unusual network
flow conditions due to seasonal fluctuations of travel demand.
On the other hand, traffic accidents, road-works or natural disasters occasionally cause
damage to the supply conditions of a network. Some links in the network may be closed to
traffic, and flows in the network will become unstable. Almost all of the links are damaged in
an extraordinary natural disaster such as the great Hanshin Earthquake. We may suffer from
extremely heavy traffic congestion caused by the interaction between fluctuating travel
demand and degraded network capacity.
Figure 1 shows a conceptual relation of physical conditions and traffic flows in a transport
network. As mentioned above, some degree of unusuality of traffic flows is
observed even in a physically sound network. The instability of network flows will be
magnified as the physical damage to the supply conditions becomes worse. In
countries that often suffer from natural disasters, such as Japan, it is necessary to describe
network flows and estimate network performance in deteriorated conditions for strategic
transport network planning under uncertainty. Beyond those countries, modern
society in general requires more reliable transport systems. A reliable network means a network which
can guarantee an acceptable level of service for traffic even if some links of the network are
physically damaged or a large amount of travel demand is occasionally generated.
Figure 1 Level of Network Degradation and Corresponding Flow Conditions
(Axes: Degree of Unusuality of Traffic and Degree of Physical Damage of Network; regions: Physically Normal Network and Degraded Network.)
Network reliability models have been studied for evaluating transport networks in both usual
and unusual conditions. Turnquist and Bowman (1980) presented a set of simulation
experiments to study the effects of the structure of the urban network on service reliability.
Iida and Wakabayashi (1989) developed an approximation method for calculating the
connectivity between a node pair in a network. These studies are concerned with the
reliability analysis of a pure network. Flows in the network were not explicitly considered in
those studies.
Asakura and Kashiwadani (1991) proposed a time reliability model of a network considering
day-to-day fluctuation of traffic flow. Although they used a traffic assignment model and
considered flows in the network, the reliability model was calculated independently from
network supply conditions. Du and Nicholson (1993) showed a general framework for the
analysis and design of Degradable Transportation Systems. The User Equilibrium assignment
was involved in the reliability analysis. By extending the algorithm shown by Du and
Nicholson, Asakura (1996) presented an approximation algorithm for the distribution function
of a performance measure in a deteriorated network. Asakura and Kashiwadani (1997)
compared some different reliability models of an origin and destination (OD) pair in a road
network. Travel time, travel demand and consumer's surplus between an OD pair were
respectively used as performance measures in those reliability models. These studies focus on
a flow network, in which the interaction between travel demand and network condition is
described.
Network flows in degraded conditions might be different from the ones in ordinary
conditions. As Khattak and Palma (1997) mentioned, however, few studies have
described traveller response in normal and unexpected situations. In particular, network
flow models have not been studied for unusual conditions. The uncertainty and inconvenience
of travel may reduce travel demand, which in turn will change network flow patterns.
Deterministic flow models have been used in previous network reliability analyses.
However, this may cause inconsistency between the flow description and the reliability evaluation
process, since the flow is deterministic while the network itself is not. Thus, it is necessary to employ
stochastic models for describing flows in a network. The objective of this paper is first to
formulate a network flow model for describing flows in degraded conditions. The second
objective is to propose some performance measures of a network when some links are closed
to traffic. The network flow model is involved in the evaluation process of the performance
measures.
In the following Chapter 2, the Stochastic User Equilibrium model with variable demand is
applied to describe flows in a network with some disconnected links. Chapter 3 shows some
performance measures considering uncertain factors of the network flow. Approximation
algorithms for estimating the expected value and the distribution functions of performance
measures are also described in this chapter. Numerical examples are calculated in Chapter 4 to
verify the convergence of the algorithm. A brief conclusion and implication of the models are
presented in Chapter 5.
FLOWS IN A NETWORK WITH SOME DISCONNECTED LINKS

Network State Vector and State Probability
A transport network is represented by a directed graph which consists of a set of numbered
nodes and a set of numbered links. We assume that the function of links may deteriorate
for some reason, while that of nodes will not. In order to describe flows in a deteriorated network
as simply as possible, we assume that failed links are completely closed to traffic and
that such a condition continues for a long time. This assumption is introduced so that the static
User Equilibrium conditions can be applied for describing network flows.
A degraded road network is identified by the state vector $x = \{x_1, \ldots, x_a, \ldots, x_L\}$. The element
$x_a$ of the vector denotes whether link $a$ is degraded or not: $x_a$ is equal to 0 if the link
is closed to traffic, and $x_a$ is equal to 1 if the link maintains its functions in the ordinary
condition. If all links are connected, the state vector is written as $x_0 = \{1, \ldots, 1\}$. This
state is referred to as the ordinary or normal state in the remainder of this paper. The worst
state vector, in which all links are disconnected, is written as $x_w = \{0, \ldots, 0\}$. $X = \{x\}$ denotes
the set of possible state vectors.
Since the occurrence of a failure in a link is uncertain, the state of a network is not
deterministic. We introduce the link connectivity, which is the probability that the link
is connected and not closed to traffic. The connectivity of link $a$ is denoted by
$p_a$ $(a = 1, \ldots, L)$. We assume that the value of link connectivity is exogenously estimated and fixed.
When the link connectivities $p_a$ are independent of each other, the probability of the occurrence of
a state $x$ in a network is calculated as
$p(x) = \prod_{a \in A} p_a^{x_a} (1 - p_a)^{1 - x_a}$    (1)
We call $p(x)$ the state probability.
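As a small illustration of the state probability of Eq. (1), the following Python sketch multiplies the connectivity of each connected link by the failure probability of each closed link, assuming independent links as in the text. The function names and the connectivity values are illustrative assumptions, not data from the paper.

```python
from itertools import product


def state_probability(x, p):
    """Probability of state vector x for independent link connectivities p (Eq. 1)."""
    prob = 1.0
    for x_a, p_a in zip(x, p):
        # A connected link (x_a = 1) contributes p_a; a closed link contributes 1 - p_a.
        prob *= p_a if x_a == 1 else 1.0 - p_a
    return prob


def all_states(n_links):
    """Enumerate all 2**n_links state vectors of a network with n_links links."""
    return list(product((0, 1), repeat=n_links))


p = [0.9, 0.8, 0.95]  # illustrative connectivities, not taken from the paper
```

Summing `state_probability` over `all_states(len(p))` gives 1, which is a quick check that the state probabilities form a distribution. The number of states doubles with every link, so full enumeration is only feasible for small networks.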
Formulation of the Stochastic User Equilibrium Model with Variable Demand
According to a survey by the authors, 15% of drivers cancelled their trips in the deteriorated
network, and 75% of diverted drivers were obliged to choose routes with higher risks. We
formulate a network flow model in a deteriorated network considering this travel choice
behaviour in the network. Travelling in a deteriorated network is more uncertain than in the
ordinary network. A stochastic travel choice model is appropriate to describe a driver's travel
behaviour. Since some of the drivers may cancel their trips, the travel demand is assumed
to be elastic with respect to network conditions.
In a previous study (Asakura, 1996), we applied the Deterministic User Equilibrium model
with Variable Demand. Although the elasticity of demand was considered in the model, the
route choice behaviour was assumed deterministic. In order to represent uncertain route choice
behaviour in a deteriorated network, stochastic route choice modelling is desirable. This
will also resolve the inconsistency between route choice behaviour and trip making behaviour.
We will apply a Nested Logit based Stochastic User Equilibrium model with Variable Demand
(NLSUE-VD) for describing flows in the transport network. The congestion effect is included in
the route travel cost, which is a function of the number of travellers. The trip making and
route choice behaviour is described by the Nested Logit model. The choice probability
functions for an OD pair are given as follows.
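$p[k|rs] = \dfrac{\exp(-\theta_1 c_k^{rs})}{\sum_{k \in K_{rs}} \exp(-\theta_1 c_k^{rs})} \quad \forall k \in K_{rs},\ rs \in \Omega$    (2)
$p[rs] = \dfrac{\exp\{-\theta_2 (S_{rs} + C_{rs})\}}{\exp\{-\theta_2 (S_{rs} + C_{rs})\} + \exp(-\theta_2 R_{rs})} \quad \forall rs \in \Omega$    (3)
$\bar{p}[rs] = 1 - p[rs] \quad \forall rs \in \Omega$    (4)
where $p[k|rs]$ denotes the choice probability of route $k$ between origin $r$ and destination $s$,
$p[rs]$ represents the choice probability of generating a trip between OD pair $rs$, and $\bar{p}[rs]$ is
the choice probability of cancelling a trip between the OD pair. $c_k^{rs}$ denotes the cost of
travelling route $k$ between OD pair $rs$, and $S_{rs}$ represents the expected minimum cost
between the OD pair, which is written as Eq. (5).
$S_{rs} = -\dfrac{1}{\theta_1} \ln \sum_{k \in K_{rs}} \exp(-\theta_1 c_k^{rs})$    (5)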
The route costs and the expected minimum cost are functions of flows in a network. $C_{rs}$
and $R_{rs}$ are the fixed costs of travelling between the OD pair and of cancelling the travel, respectively.
$\theta_1$ and $\theta_2$ are the parameters of route choice and travel choice. $\Omega$ and $K_{rs}$ are the set of OD pairs
and the set of routes between OD pair $rs$.
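When the number of potential trips of an OD pair is given as $T_{rs}$, the number of generated
trips $q_{rs}$ and the number of cancelled trips $\bar{q}_{rs}$ of the OD pair are written as
$q_{rs} = T_{rs} \dfrac{\exp\{-\theta_2 (S_{rs} + C_{rs})\}}{\exp\{-\theta_2 (S_{rs} + C_{rs})\} + \exp(-\theta_2 R_{rs})} \quad \forall rs \in \Omega$    (6)
$\bar{q}_{rs} = T_{rs} \dfrac{\exp(-\theta_2 R_{rs})}{\exp\{-\theta_2 (S_{rs} + C_{rs})\} + \exp(-\theta_2 R_{rs})} \quad \forall rs \in \Omega$    (7)
The flow of route $k$ of the OD pair is also written as
$f_k^{rs} = q_{rs} \dfrac{\exp(-\theta_1 c_k^{rs})}{\sum_{k \in K_{rs}} \exp(-\theta_1 c_k^{rs})} \quad \forall k \in K_{rs},\ rs \in \Omega$    (8)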
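The nested choice structure can be made concrete with a short Python sketch for a single OD pair with fixed route costs. It assumes the usual logit log-sum form for the expected minimum cost $S_{rs}$; the function names and all numerical values in the usage note are illustrative assumptions, not values from the paper.

```python
import math


def logsum(route_costs, theta1):
    """Expected minimum cost over the routes (log-sum of the route choice logit)."""
    return -math.log(sum(math.exp(-theta1 * c) for c in route_costs)) / theta1


def nested_logit_flows(route_costs, T, C, R, theta1, theta2):
    """Split T potential trips into generated trips q, cancelled trips q_bar,
    and route flows f for one OD pair, given fixed route costs."""
    S = logsum(route_costs, theta1)
    u_travel = math.exp(-theta2 * (S + C))   # utility term for making the trip
    u_cancel = math.exp(-theta2 * R)         # utility term for cancelling the trip
    q = T * u_travel / (u_travel + u_cancel)
    q_bar = T - q
    weights = [math.exp(-theta1 * c) for c in route_costs]
    total = sum(weights)
    f = [q * w / total for w in weights]     # route flows share q by the logit weights
    return q, q_bar, f
```

For example, `nested_logit_flows([10.0, 12.0], T=100.0, C=0.0, R=15.0, theta1=0.5, theta2=0.2)` returns generated and cancelled trips that add up to the 100 potential trips, with the cheaper route carrying the larger share. In the full model the route costs depend on the flows, so this split is only one evaluation inside an equilibrium iteration.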
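An optimization model equivalent to the above conditions is formulated as the following Eqs.
(9) to (13).
[NLSUE-VD]
min. $Z(f, q, \bar{q}) = Z_1(x(f)) + Z_2(q, \bar{q}) + Z_3(q, f) + Z_4(q, \bar{q})$    (9)
subject to
$T_{rs} = q_{rs} + \bar{q}_{rs} \quad \forall rs \in \Omega$    (10)
$q_{rs} = \sum_{k \in K_{rs}} f_k^{rs} \quad \forall rs \in \Omega$    (11)
$f_k^{rs} \ge 0 \quad \forall k \in K_{rs},\ rs \in \Omega$    (12)
$q_{rs} \ge 0,\ \bar{q}_{rs} \ge 0 \quad \forall rs \in \Omega$    (13)
where $t_a(w)$ denotes the link cost function, and $x_a$ is the flow of a link. We assume the link
cost function is separable and monotonically increasing for its link flow.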
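Let us prove that the first-order necessary conditions of the above non-linear optimization
model [NLSUE-VD] are equivalent to the Nested Logit model. Introducing the Lagrange
multipliers $\{\lambda_{rs}\}$ and $\{\mu_{rs}\}$, the Lagrangian of [NLSUE-VD] can be written as
$L[f, q, \bar{q}, \lambda, \mu] = Z_1(x(f)) + Z_2(q, \bar{q}) + Z_3(q, f) + Z_4(q, \bar{q}) + \sum_{rs \in \Omega} \lambda_{rs} \{T_{rs} - (q_{rs} + \bar{q}_{rs})\} + \sum_{rs \in \Omega} \mu_{rs} (q_{rs} - \sum_{k \in K_{rs}} f_k^{rs})$    (14)
where $Z_1(x(f))$, $Z_2(q, \bar{q})$, $Z_3(q, f)$ and $Z_4(q, \bar{q})$ correspond to the 1st to the 4th terms
of the original objective function, respectively. The Kuhn-Tucker conditions are
$f_k^{rs} \dfrac{\partial L}{\partial f_k^{rs}} = 0$ and $\dfrac{\partial L}{\partial f_k^{rs}} \ge 0 \quad \forall k \in K_{rs},\ rs \in \Omega$    (15)
$q_{rs} \dfrac{\partial L}{\partial q_{rs}} = 0$ and $\dfrac{\partial L}{\partial q_{rs}} \ge 0 \quad \forall rs \in \Omega$    (16)
$\bar{q}_{rs} \dfrac{\partial L}{\partial \bar{q}_{rs}} = 0$ and $\dfrac{\partial L}{\partial \bar{q}_{rs}} \ge 0 \quad \forall rs \in \Omega$    (17)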
Constraints (10) to (13) of the original problem also hold. Since the route flow
and the OD flow are positive, the above conditions can be rewritten as
$\dfrac{\partial L}{\partial f_k^{rs}} = 0$    (18)
$\dfrac{\partial L}{\partial q_{rs}} = 0$    (19)
$\dfrac{\partial L}{\partial \bar{q}_{rs}} = 0$    (20)
Then, the route flow is calculated using Eq. (18) as
$f_k^{rs} = q_{rs} \exp(-\theta_1 c_k^{rs} \ldots)$    (21)
When this equation is put into the route flow conservation constraint Eq. (11), we obtain the
Logit model for route choice. Similarly, the OD flow is calculated using Eqs. (19), (20) and
(10). Then, we obtain the Binary Logit equation for trip making. These are the Nested Logit
equations which we have assumed for the traveller's choice behaviour in a network. Thus, the
optimality conditions of [NLSUE-VD] are proved equivalent to the stochastic route and trip
making choice models. The objective function of [NLSUE-VD] is strictly convex in path flow
and OD flow, and the feasible region is convex as well. If we eliminate cyclic paths, the feasible
region of path flow is bounded. Thus, the solution of [NLSUE-VD] is unique in path flow
and OD flow.
The conventional SUE models assume a probability distribution of the perceived travel times of
drivers in a deterministic network. This includes both the logit based SUE model by Fisk
(1980) and the probit based SUE model by Sheffi and Powell (1981). Although the probit
based SUE model overcomes the deficiency of the logit based model, it remains
computationally inefficient to solve. Mirchandani and Soroush
(1987) proposed a general SUE model in a stochastic network. Some limited cases are
approximately solved, and a consistent extension to variable demand seems difficult. The
formulated [NLSUE-VD] is categorized among the logit based SUE models, and it is not
sufficient to capture the similarity between different routes. The "independence from irrelevant
alternatives" property of the logit model may cause overloading of the overlapping paths in a
network. Nevertheless, [NLSUE-VD] seems useful since the route choice and the trip
generation behaviour in a network are consistently formulated and the model is
computationally efficient to solve.
Solution Algorithm for NLSUE-VD
Applying the Simplicial Decomposition method, the formulated NLSUE-VD can be solved.
Two phases are iterated: the phase of solving the Restricted Master problem for a given path
set, and the Column Generation phase of extending the path set. The iterative procedure for
the Restricted Master problem and the Column Generation problem can be summarized as
follows. The route flow vector and the OD flow vector are referred to as $f = \{f_k^{rs}\}$ and
$q = \{q_{rs}, \bar{q}_{rs}\}$, respectively.
[Phase 0] Initialization. Set initial path set $\{K_{rs}^0\}$. Calculate initial feasible network flows
$\{f^0, q^0\}$. Set iteration counter m=0.
[Phase 1] Restricted Master problem. Solve the Restricted Master problem for the given path
set $\{K_{rs}^m\}$ and obtain network flows $\{f^m, q^m\}$.
[Phase 2] Column Generation problem. Examine the possibility of path set extension. If no
additional route is found, terminate the iteration. Otherwise, update the path set $\{K_{rs}^{m+1}\}$, set
iteration counter m=m+1, and go back to [Phase 1].
For finding the additional route in [Phase 2], the shortest path is calculated for the link
costs obtained by loading route flow vector $f^m$. If the shortest path is new and not included in the
current path set $\{K_{rs}^m\}$, add the path to the path set.
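The path set extension of [Phase 2] can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the representation of links as (tail, head) pairs, the representation of a path as a tuple of link indices, and the function names are all assumptions, and Dijkstra's algorithm is used here as one standard way to find the shortest path for the current link costs.

```python
import heapq


def shortest_path(links, costs, origin, destination):
    """Dijkstra on a directed graph; links[a] = (tail, head), costs[a] = t_a.

    Returns the path as a tuple of link indices, or None if unreachable.
    """
    out = {}
    for a, (tail, head) in enumerate(links):
        out.setdefault(tail, []).append((a, head))
    dist = {origin: 0.0}
    pred = {}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node == destination:
            break
        for a, head in out.get(node, []):
            nd = d + costs[a]
            if nd < dist.get(head, float("inf")):
                dist[head] = nd
                pred[head] = (a, node)
                heapq.heappush(heap, (nd, head))
    if destination not in dist:
        return None
    path = []
    node = destination
    while node != origin:
        a, node = pred[node]
        path.append(a)
    return tuple(reversed(path))


def extend_path_set(path_set, links, costs, origin, destination):
    """Add the current shortest path if it is new; return True if the set grew."""
    path = shortest_path(links, costs, origin, destination)
    if path is None or path in path_set:
        return False
    path_set.add(path)
    return True
```

When `extend_path_set` returns False for every OD pair, no additional route has been found and the outer iteration stops. A closed link can be represented in this sketch by giving it an infinite cost, since such a link is never relaxed.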
The Restricted Master problem in [Phase 1] can be solved using the partial linearization
method. The algorithm for the Restricted Master problem is as follows.
<Step.0> Initialization. Initial feasible network flows $\{f^0, q^0\}$ are calculated for the given path
set $\{K_{rs}\}$. Set iteration counter n=0.
<Step.1> Travel cost update. Update link flow $x_a = \sum_{rs \in \Omega} \sum_{k \in K_{rs}} \delta_{a,k}^{rs} f_k^{rs}$ and link travel cost
$t_a = t_a(x_a)$ for all links in a network. Then, calculate route cost $c_k^{rs} = \sum_{a \in A} \delta_{a,k}^{rs} t_a$ and the
expected minimum cost $S_{rs}$.
<Step.2> Direction search. Calculate auxiliary OD flows $v = \{v_{rs}, \bar{v}_{rs}\}$ and route flows
$g = \{g_k^{rs}\}$ using the following equations.
$v_{rs} = T_{rs} \dfrac{\exp\{-\theta_2 (S_{rs} + C_{rs})\}}{\exp\{-\theta_2 (S_{rs} + C_{rs})\} + \exp(-\theta_2 R_{rs})}$
$g_k^{rs} = v_{rs} \dfrac{\exp(-\theta_1 c_k^{rs})}{\sum_{k \in K_{rs}} \exp(-\theta_1 c_k^{rs})}$
<Step.3> Move-size determination. Find the optimal value of $\alpha$ by a one-dimensional search
method.
min. $Z(\alpha)$ = min. $Z(f^n + \alpha(g - f^n),\ q^n + \alpha(v - q^n))$
subject to $0 \le \alpha \le 1$
<Step.4> Flow update. Update the route flow vector and the OD flow vector using the
optimal value $\alpha_{opt}$.
<Step.5> Convergence check. If the difference of the flow vectors $\|f^{n+1} - f^n\| + \|q^{n+1} - q^n\|$
is sufficiently small, then the flow vectors have converged and the Restricted Master
problem is closed. Otherwise, set n=n+1 and return to <Step.1>.
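The structure of Steps 1 to 5 can be sketched in Python for a single OD pair. Two simplifications are made, and both are assumptions of this sketch rather than part of the method described above: the one-dimensional search of Step 3 is replaced by a fixed damping factor `alpha`, so that the objective function is not needed, and the link cost is taken to be linear, $t_a(x_a) = t0_a + slope_a x_a$. The function names and parameter values are likewise illustrative. A fixed step is only expected to converge when the costs respond mildly to flow, as in the example below.

```python
import math


def auxiliary_flows(f, paths, t0, slope, T, C, R, theta1, theta2):
    """Steps 1-2: costs from the current route flows f, then auxiliary flows (g, v)."""
    x = [0.0] * len(t0)
    for k, path in enumerate(paths):
        for a in path:
            x[a] += f[k]                                   # link flow from route flows
    t = [t0[a] + slope[a] * x[a] for a in range(len(t0))]  # assumed linear link cost
    c = [sum(t[a] for a in path) for path in paths]        # route costs
    w = [math.exp(-theta1 * ck) for ck in c]
    S = -math.log(sum(w)) / theta1                         # expected minimum cost
    u = math.exp(-theta2 * (S + C))
    v = T * u / (u + math.exp(-theta2 * R))                # auxiliary OD flow
    g = [v * wk / sum(w) for wk in w]                      # auxiliary route flows
    return g, v


def solve_restricted_master(paths, t0, slope, T, C, R, theta1, theta2,
                            alpha=0.5, tol=1e-10, max_iter=1000):
    """Damped fixed-point iteration over a fixed path set; returns (f, q, q_bar)."""
    f = [0.0] * len(paths)
    q = 0.0
    for _ in range(max_iter):
        g, v = auxiliary_flows(f, paths, t0, slope, T, C, R, theta1, theta2)
        # Steps 3-4: move a fixed fraction alpha toward the auxiliary flows.
        f_new = [fk + alpha * (gk - fk) for fk, gk in zip(f, g)]
        q_new = q + alpha * (v - q)
        # Step 5: stop when successive flow vectors are close.
        change = sum(abs(a - b) for a, b in zip(f_new, f)) + abs(q_new - q)
        f, q = f_new, q_new
        if change < tol:
            break
    return f, q, T - q
```

At convergence the route flows reproduce themselves through the nested logit equations, which is the fixed-point property that the optimality conditions express. A line search on the objective, as in Step 3, would take the place of the fixed `alpha` in a faithful implementation.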
PERFORMABILITY MEASURES AND APPROXIMATION

ALGORITHMS
Performance Measures
Solving [NLSUE-VD] for the network at state $x$, we obtain the flow variables as well as travel
cost in the network. The equilibrium OD flow $\{q_{rs}(x)\}$ and the equilibrium route flow
$\{f_k^{rs}(x)\}$ are the representative flow variables, and can be used for evaluating the performance
of a deteriorated network. In addition to those flow variables, the equilibrium route travel
cost $\{c_k^{rs}(x)\}$ and the expected minimum travel cost between OD pair $\{S_{rs}(x)\}$ are also
available for evaluation.
The OD travel demand may decrease when some links in a network become deteriorated.
Thus, the simplest performance measure is the degree of reduction of OD travel demand from
the normal network state, that is, the ratio of the OD travel demand of a state $x$ to that of the
normal state $x_0$. This is denoted by $q_{rs}(x) / q_{rs}(x_0)$. This measure is convenient for comparing
the performance between different OD pairs. When it is necessary to evaluate the reduction
rate for an entire network, the reduction rate of total travel demand
$\sum_{rs \in \Omega} q_{rs}(x) / \sum_{rs \in \Omega} q_{rs}(x_0)$ can
be used as a performance measure.
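Another performance measure is the travel cost of an OD pair. Using the route travel cost
$c^{rs}_k(x)$ and the route choice probability $p(k|rs)$, the average travel cost of an OD pair is
represented as
$$\sum_{k \in K_{rs}} c^{rs}_k(x)\, p(k|rs) = \sum_{k \in K_{rs}} c^{rs}_k(x) \frac{\exp(-\theta_1 c^{rs}_k(x))}{\sum_{k' \in K_{rs}} \exp(-\theta_1 c^{rs}_{k'}(x))} \tag{22}$$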
The expected minimum travel cost between an OD pair, $S_{rs}(x)$, is an alternative
composite measure based on route travel costs. As with the performance measures based on flow
variables, it is possible to define ratios of the cost-based performance measures, for
example, the increase ratio of the expected minimum cost, $S_{rs}(x)/S_{rs}(x_0)$.
We define the reliability of a flow network using a performance measure. A reliable
transport network generally means a network in which one can travel from one's origin to
the destination without much uncertainty. The state of a network is probabilistic, so the
performance measures are also random variables. Therefore, we define the reliability as the
probability that a performance measure is sustained within an acceptable level. The
probability is written as:
$$R_{rs}(c) = \begin{cases} \mathrm{Prob}[PM(x) \le c], & \text{when } PM(x) \text{ is increasing,} \\ \mathrm{Prob}[PM(x) \ge c], & \text{when } PM(x) \text{ is decreasing.} \end{cases} \tag{23}$$
PM(x) denotes the value of a performance measure at a network state x. The flow-based
performance measures are usually decreasing, since the ratio becomes smaller for a worse
network state. The cost-based performance measures, in contrast, are increasing. Parameter c
denotes an acceptable level of the performance measure. The value of c is exogenously
determined considering the level of service which should be maintained even in deteriorated
situations.
When the ratio of OD flow, $q_{rs}(x)/q_{rs}(x_0)$, is used as a performance measure, $R_{rs}(c)$
becomes the OD flow reliability. This is the probability that travel between
the OD pair is possible within an acceptable reduction rate c ($0 \le c \le 1$). In other words, the
OD pair is regarded as connected as long as no more than a fraction (1-c) of the travellers
cancel their trips. When the value of c is set equal to 1, no reduction of OD flow is permitted
and the highest level of service is required.
The reliability is numerically evaluated using the expected value of the operated/failure
function, which determines whether the performance measure is within the given level. Taking
a decreasing performance measure such as the OD flow ratio as an example, the operated/failure
function is written as
$$Z(c,x) = \begin{cases} 1 & \text{if } PM(x) \ge c \\ 0 & \text{if } PM(x) < c \end{cases} \tag{24}$$
Note that the subscripts rs are dropped to simplify the notation. The probability R(c) is
the mathematical expectation of Z(c,x) weighted by the state probability p(x), which is
written as
$$R(c) = E[Z(c,x)] = \sum_{x \in X} p(x) Z(c,x) \tag{25}$$
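As a hedged illustration of Eq.(24) and Eq.(25), the sketch below evaluates R(c) by direct enumeration for the single OD pair of the numerical example later in the paper. The probabilities and OD flows are the pattern values reported in Table 1; treating each aggregated network pattern as one state is a simplification made here, and the function names are illustrative.

```python
# (state probability, OD flow) for the eight network patterns reported in Table 1
PATTERNS = {
    "P1": (0.3629, 36.67), "P2": (0.0907, 20.67), "P3": (0.0403, 36.31),
    "P4": (0.2261, 15.90), "P5": (0.0403, 33.35), "P6": (0.0101, 8.33),
    "P7": (0.1155, 29.52), "Disconnected": (0.1141, 0.0),
}
Q_NORMAL = PATTERNS["P1"][1]   # OD flow of the normal (full) network

def operated(pm, c):
    """Operated/failure function Z(c, x) of Eq.(24) for a decreasing measure."""
    return 1 if pm >= c else 0

def reliability(c):
    """R(c) of Eq.(25): expectation of Z(c, x) weighted by the state probability."""
    return sum(p * operated(q / Q_NORMAL, c) for p, q in PATTERNS.values())

for c in (0.5, 0.9, 1.0):
    print(f"R({c}) = {reliability(c):.4f}")
```

Each new criterion c requires a fresh pass over all states, which is the cost that the distribution-function approach described next avoids.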
When we evaluate the reliability measure for different criteria c using the approach above, it
is necessary to calculate the operated/failure function for each value of c. Even if we use an
approximation algorithm, this is time-consuming work. Here, we show an alternative
approach for evaluating reliability. The occurrence of a state is stochastic and the performance
measures are randomly distributed. If we can estimate the cumulative distribution function
of a performance measure, $F_{rs}[PM]$, the probability is easily calculated for any value of c
as
$$R_{rs}(c) = F_{rs}[PM \ge c] \tag{26}$$
The cumulative distribution function can also be approximated using methods similar to those for
approximating the expectation of the operated/failure function. This is explained in the next
section as well.
Approximation Algorithm
For a network with L links, the number of possible states amounts to $2^L$. If [NLSUE-VD] is
solved for each network state, the direct calculation of the expected value of reliability
using Eq.(25) requires a huge computation cost. This is also true for estimating the cumulative
distribution function. In this section, we show two algorithms for approximating those
equations. The original idea was presented by Li and Silvester (1984). The algorithm defines
the lower and upper bounds using the J most probable state vectors.
The state vectors are sorted in descending order of state probability as in Eq.(27),
$$p(x_1) \ge \dots \ge p(x_J) \ge p(x_{J+1}) \ge \dots \ge p(x_N) \tag{27}$$
where $x_j$ denotes the j-th most probable state vector, $p(x_j)$ represents the state probability
for the state $x_j$ and N is the number of all state vectors. Using the state vectors up to the J-th
most probable one, $\{x_1, \dots, x_J\}$, and the corresponding values of the operated/failure
function $\{Z_{rs}(c,x_j);\ j = 1, \dots, J\}$, the upper and lower bounds of the expected value can be
defined as follows. The upper bound is obtained through the optimistic expectation of the
operated/failure function. If we assume that the states $x_j$ for j=J+1 to N are equivalent to the
normal state $x_0$, the performance measures for those states are also equal to that of the
normal state, namely $PM_{rs}(x_j) = PM_{rs}(x_0)$ for j=J+1 to N. This means that the values of the
operated/failure function $Z_{rs}(c,x_j)$ are equal to 1 for j=J+1 to N. Thus, the upper bound
$R^U_{rs}(c)$ is obtained as the expected value of $Z_{rs}(c,x)$ under these conditions. That is,
$$R^U_{rs}(c) = \sum_{j=1}^{J} p(x_j) Z_{rs}(c,x_j) + \sum_{j=J+1}^{N} p(x_j) Z_{rs}(c,x_0) \tag{28}$$
On the other hand, the lower bound is obtained through the pessimistic expectation of the
operated/failure function. Assuming that the states $x_j$ for j=J+1 to N are equivalent to the
worst state $x_w$, the performance measures for those states are also equal to that of the
worst state, namely $PM_{rs}(x_j) = PM_{rs}(x_w)$ for j=J+1 to N. For example, the OD flow of the
worst state $q_{rs}(x_w)$ will probably be zero or much smaller than that of the normal state
$q_{rs}(x_0)$. Since the ratio of the two, $q_{rs}(x_w)/q_{rs}(x_0)$, is also very small, the value of the
operated/failure function for the worst state $Z_{rs}(c,x_w)$ is equal to 0 for any criterion c.
Accordingly, the values of the operated/failure function $Z_{rs}(c,x_j)$ are equal to 0 for j=J+1 to
N. Thus, the lower bound $R^L_{rs}(c)$ is obtained as the expected value of $Z_{rs}(c,x)$ under these
conditions. That is,
$$R^L_{rs}(c) = \sum_{j=1}^{J} p(x_j) Z_{rs}(c,x_j) + \sum_{j=J+1}^{N} p(x_j) Z_{rs}(c,x_w) \tag{29}$$
The expected value of $R_{rs}(c)$ lies between $R^L_{rs}(c)$ and $R^U_{rs}(c)$. Comparing the values of the
upper and the lower bounds of the J-th iteration with those of the (J+1)-th iteration, we
obtain the following:
$$R^U_{rs}(c,x_J) \ge R^U_{rs}(c,x_{J+1}), \qquad R^L_{rs}(c,x_J) \le R^L_{rs}(c,x_{J+1}) \tag{30}$$
This means that the upper bound and the lower bound converge to the exact expected value of
$R_{rs}(c)$ from above and from below, respectively. Figure 2 shows the convergence
image of the upper and the lower bounds. We take the next most probable state vectors one
after another and update the approximated expected value of the reliability measure $R_{rs}(c)$
until the difference between the upper and lower bounds becomes small enough.
[Figure 2: R(c) against the number of iterations J (1 to N); the upper bound and the lower bound close in on the exact value R*(c), with the approximate value between them]
Figure 2 An Image of Convergence of Approximation Algorithm
This procedure can be represented as the following algorithm.
<Algorithm>
Step.0 Calculate the flow and cost variables by solving [NLSUE-VD] for the normal
state $x_0$. Set iteration counter J=1.
Step.1 Take the J-th most probable state vector $x_J$, for which the probability $p(x_J)$ is the J-th
largest. An efficient algorithm for this was proposed by Lam and Li (1986).
Step.2 Calculate the performance measure $PM_{rs}(x_J)$ by solving [NLSUE-VD] for state $x_J$.
Step.3 Evaluate the operated/failure function.
$$Z_{rs}(c,x_J) = \begin{cases} 1 & \text{if } PM_{rs}(x_J) \ge c \\ 0 & \text{if } PM_{rs}(x_J) < c \end{cases} \tag{31}$$
Step.4 Calculate the upper and lower bounds using Eq.(28) and Eq.(29), respectively.
Step.5 Check the convergence. If the difference between the upper and the lower bounds is small
enough, go to Step.6. Otherwise, set J=J+1 and return to Step.2.
Step.6 Approximate the expected value as
$$R_{rs}(c) \approx \{R^U_{rs}(c) + R^L_{rs}(c)\}/2 \tag{32}$$
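The algorithm above can be sketched as a single pass over the states in descending order of probability. This is a minimal sketch, assuming that the performance measure of every examined state is already available (in the paper it comes from solving [NLSUE-VD]) and that c does not exceed 1, so that Z(c, x_0) = 1 in Eq.(28). The data are the aggregated pattern values of Table 1, used here as stand-ins for individual states purely for illustration; the function name and tolerance are hypothetical.

```python
def approximate_reliability(states, c, tol):
    """states: non-empty list of (probability, performance measure), most probable first.

    Returns (estimate, lower, upper, J) once upper - lower <= tol.
    """
    covered = 0.0   # probability mass of the states examined so far
    lower = 0.0     # Eq.(29): unexamined states are assumed to fail
    for J, (p, pm) in enumerate(states, start=1):
        covered += p
        lower += p * (1 if pm >= c else 0)   # Eq.(31)
        upper = lower + (1.0 - covered)      # Eq.(28): unexamined states assumed to operate
        if upper - lower <= tol:
            break
    return 0.5 * (upper + lower), lower, upper, J   # Eq.(32)

# (probability, OD flow) from Table 1, sorted by probability
TABLE1 = [(0.3629, 36.67), (0.2261, 15.90), (0.1155, 29.52), (0.1141, 0.0),
          (0.0907, 20.67), (0.0403, 36.31), (0.0403, 33.35), (0.0101, 8.33)]
states = [(p, q / 36.67) for p, q in TABLE1]   # performance measure = q(x) / q(x_0)

estimate, lower, upper, J = approximate_reliability(states, c=0.8, tol=0.1)
print(f"J = {J}, bounds = [{lower:.4f}, {upper:.4f}], estimate = {estimate:.4f}")
```

The gap between the bounds depends only on the probability mass not yet examined, so the stopping rule does not depend on c even though the bounds themselves do.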
This algorithm seems efficient for evaluating the reliability of OD pairs rs ($rs \in \Omega$) for
a given criterion c. However, the iterations must be repeated if it is necessary to
obtain the expected value of $R_{rs}(c')$ for a different criterion c'.
We therefore propose another algorithm, which approximates the cumulative distribution function
$F_{rs}(t)$ of a decreasing performance measure. In the same way as for the expected
value of $R_{rs}(c)$, we can approximate the function $F_{rs}(t)$. Using the J most probable state
vectors $\{x_1, \dots, x_J\}$ and the corresponding values of the occurrence probability
$\{p(x_1), \dots, p(x_J)\}$, the upper and the lower bounds of the cumulative distribution function are
represented as
$$F^U_{rs}(t) = \sum_{j \in H(t)} p(x_j) + \Big(1 - \sum_{j=1}^{J} p(x_j)\Big), \qquad F^L_{rs}(t) = \sum_{j \in H(t)} p(x_j) \tag{33}$$
where H(t) denotes the subset of {1,...,J} for which the performance measure $PM_{rs}(x_j)$ is greater
than or equal to t. The difference between the upper and the lower bounds is
$$F^U_{rs}(t) - F^L_{rs}(t) = 1 - \sum_{j=1}^{J} p(x_j) \tag{34}$$
which is constant over the whole range of t. This is convenient for examining the convergence of
the iteration.
When we take the next most probable state vector $x_{J+1}$, the upper bound is lowered and the
lower bound is raised. That is,
$$F^U_{rs,J}(t) \ge F^U_{rs,J+1}(t), \qquad F^L_{rs,J}(t) \le F^L_{rs,J+1}(t) \tag{35}$$
$F^U_{rs,J}(t)$ and $F^L_{rs,J}(t)$ denote the upper and the lower bounds for the J-th iteration.
$F^U_{rs,J+1}(t)$ and $F^L_{rs,J+1}(t)$ represent the upper and the lower bounds for the (J+1)-th iteration,
respectively. This means that the difference between the upper and the lower bounds becomes
small enough when a sufficient number of the most probable state vectors are included. The
cumulative distribution function of the performance measure, $F_{rs}(t)$, is then approximated as:
$$F_{rs}(t) = \{F^U_{rs}(t) + F^L_{rs}(t)\}/2 \quad \text{for all } t \ge 0 \tag{36}$$
Once the distribution function is estimated, we can evaluate the value of $R_{rs}(c)$ using Eq.(26)
for any criterion c. This is quite useful for analyzing network reliability measures, since
no additional computation is needed for different criteria. The algorithm is
similar to the one for approximating the expected value of $R_{rs}(c)$, and is summarized as
follows:
<Algorithm>
Step.0 Solve [NLSUE-VD] for the normal network state. Set iteration counter J=1.
Step.1 Take the J-th most probable state vector.
Step.2 Calculate the performance measure by solving the NLSUE problem for the J-th most
probable state vector.
Step.3 Evaluate the upper and the lower bounds using Eq.(33) for all ranges of the
performance measure.
Step.4 Go to Step.5 if the difference between the upper and lower bounds is small enough.
Otherwise, set J=J+1 and return to Step.2.
Step.5 Approximate the distribution function using Eq.(36).
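The following sketch shows how the bounds of the distribution function can be evaluated for any threshold t from one list of examined states. It is only an illustration: the bound formulas are written so that their difference equals Eq.(34), H(t) follows the definition given in the text (examined states whose performance measure is at least t), and the five (probability, performance measure) pairs are hypothetical values, not results from the paper.

```python
def distribution_bounds(states, J, t):
    """Lower and upper bounds of F(t) from the J most probable states."""
    examined = states[:J]
    lower = sum(p for p, pm in examined if pm >= t)   # states in H(t)
    gap = 1.0 - sum(p for p, _ in examined)           # Eq.(34), independent of t
    return lower, lower + gap

def approximate_distribution(states, J, t):
    """Average of the two bounds, as in Eq.(36)."""
    lower, upper = distribution_bounds(states, J, t)
    return 0.5 * (lower + upper)

# Hypothetical (probability, performance measure) pairs, most probable first
states = [(0.50, 1.0), (0.20, 0.6), (0.15, 0.9), (0.10, 0.0), (0.05, 0.3)]

for t in (0.2, 0.8):
    lower, upper = distribution_bounds(states, J=3, t=t)
    print(f"t = {t}: {lower:.2f} <= F(t) <= {upper:.2f}, "
          f"estimate {approximate_distribution(states, 3, t):.3f}")
```

Because the examined states are stored once, changing t (and hence the criterion c in Eq.(26)) costs only a re-summation, not a new set of equilibrium calculations.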

NUMERICAL EXAMPLES
Input Conditions
Figure 3 presents a small-scale network with 4 nodes and 5 directed links. Linear link cost
functions are also shown in the same figure. One origin-destination (OD) flow is assumed,
from node 1 to node 2. In the following numerical examples, the subscript of the
origin and destination pair is omitted since only one OD pair is concerned. The maximum
number of paths is 3 in the full network, and those paths are identified as shown in Figure 4.
[Figure 3: network diagram annotated with the linear link cost functions]
Figure 3 Network Configuration and Link Cost Function
[Figure 4: diagrams of path-1, path-2 and path-3]
Figure 4 Paths in the Full Network
The upper limit of OD flow T equals 40. The fixed travel cost C and the cost R for
cancelling the travel of the OD pair are assumed to be 0 and 50, respectively. The parameter values
$\theta_1$ and $\theta_2$ are 0.1 and 0.2, respectively. Using these conditions for OD demand and route choice,
the OD flow q and the path flows $f_k$ (k = 1, 2 and 3) are written as follows:
$$q = T \frac{\exp(-\theta_2 (C + S))}{\exp(-\theta_2 (C + S)) + \exp(-\theta_2 R)} = 40 \frac{\exp(-0.2 S)}{\exp(-0.2 S) + \exp(-0.2 \times 50)}$$
$$f_k = q \frac{\exp(-\theta_1 c_k)}{\sum_{k'} \exp(-\theta_1 c_{k'})} = q \frac{\exp(-0.1 c_k)}{\sum_{k'} \exp(-0.1 c_{k'})} \quad \text{for } k = 1, 2, 3,$$
where S denotes the expected cost between the OD pair, written as
$$S = -\frac{1}{\theta_1} \ln \sum_{k} \exp(-\theta_1 c_k) = -10 \ln \sum_{k} \exp(-0.1 c_k)$$
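The demand and route choice formulas above are easy to evaluate once path costs are given. The sketch below uses the stated parameter values (T = 40, C = 0, R = 50, θ1 = 0.1, θ2 = 0.2); the path costs passed in are hypothetical inputs chosen for illustration and are not equilibrium costs from Table 1, since at equilibrium the costs themselves depend on the flows.

```python
import math

T, C, R = 40.0, 0.0, 50.0     # upper limit of OD flow, fixed cost, cancellation cost
THETA1, THETA2 = 0.1, 0.2     # route choice and trip-making parameters

def expected_min_cost(costs):
    """S = -(1/theta1) * ln(sum_k exp(-theta1 * c_k))."""
    return -math.log(sum(math.exp(-THETA1 * c) for c in costs)) / THETA1

def od_flow(costs):
    """Binary logit between travelling (cost C + S) and cancelling (cost R)."""
    S = expected_min_cost(costs)
    travel = math.exp(-THETA2 * (C + S))
    return T * travel / (travel + math.exp(-THETA2 * R))

def path_flows(costs):
    """Logit split of the OD flow over the available paths."""
    q = od_flow(costs)
    weights = [math.exp(-THETA1 * c) for c in costs]
    return [q * w / sum(weights) for w in weights]

costs = [45.0, 60.0, 40.0]    # hypothetical path costs c_1, c_2, c_3
print(f"S = {expected_min_cost(costs):.2f}, q = {od_flow(costs):.2f}")
print("path flows:", [round(f, 2) for f in path_flows(costs)])
```

A quick sanity check of the formula: with a single path whose cost equals the cancellation cost R, travelling and cancelling are equally attractive and `od_flow([50.0])` is half of T.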
The values of link connectivity $p_a$ (for a = 1, 2, 3, 4 and 5) are assumed to be 0.9, 0.8, 0.7, 0.9 and
0.8. When link closures occur independently of each other, the probability of a
network state is given by $p(x) = \prod_a p_a^{x_a} (1 - p_a)^{1 - x_a}$. For example, the state probability for
x = (1,1,1,1,0) is calculated as p(x) = 0.9 x 0.8 x 0.7 x 0.9 x (1 - 0.8) = 0.09072.
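The state probability formula can be checked with a few lines of Python. The link connectivity values are those given in the text; the function name is illustrative, and x_a = 1 is taken to mean that link a is open.

```python
from itertools import product

P_LINK = [0.9, 0.8, 0.7, 0.9, 0.8]   # link connectivity p_a for a = 1..5

def state_probability(x, p=P_LINK):
    """p(x) = prod_a p_a**x_a * (1 - p_a)**(1 - x_a) for independent link closures."""
    prob = 1.0
    for xa, pa in zip(x, p):
        prob *= pa if xa == 1 else 1.0 - pa
    return prob

all_states = list(product((0, 1), repeat=len(P_LINK)))   # every open/closed combination

print(f"{state_probability((1, 1, 1, 1, 0)):.5f}")
print(len(all_states), f"{sum(state_probability(x) for x in all_states):.6f}")
```

The first line printed reproduces the worked value for x = (1,1,1,1,0), and the probabilities of all enumerated states sum to one.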
SUE Flow for Different Network States
Before evaluating network performance measures under degraded network conditions, we
show the results of the network flow analysis. The number of links is 5, and the number of
states is $2^5 = 32$. These states are aggregated into 8 network patterns, each with a common set
of paths. Table 1 presents the 8 patterns and the corresponding state probabilities. When the OD
pair 1-2 is not connected, the state is categorized as "disconnected". [NLSUE-VD] is
solved for each network state. The results are summarized in Table 1.
It is possible to load 36.67 units of OD flow onto the full network (P1) without any disconnected
links. This amount of flow is equal to 92% of the given upper limit of the potential OD
demand. For this normal state of the network, the largest proportion of the OD flow uses
path-3, and the second largest uses path-1. Path-2 is used very little, since its cost
is the highest among the three possible paths.
Comparing the network patterns with some disconnected links, it is found that the
performance of the network becomes worse when link 5 is not available. If only link 5 is
removed (P2) from the full network, the amount of OD flow loaded onto the network is
20.67. This corresponds to 56% of the OD flow for the full network. On the other hand, the
influence of removing link 4 is not so large. The OD demand is reduced by just 1% if link 4 is
disconnected (P3). Similarly, the closure of link 1 (P5) results in a 10% reduction of the OD
flow of the full network.
Table 1 Network State Probability and SUE Flows for Network Patterns
Network        Network State   OD Flow   Path Flow                Path Cost
Pattern        Probability     (q)       f1      f2      f3       c1      c2      c3
P1             0.3629          36.67     14.90    1.12   20.65    50.90   68.31   42.18
P2             0.0907          20.67     11.94    8.73    0       52.64   72.41    -
P3             0.0403          36.31     15.01    0      21.30    50.02    -      44.26
P4             0.2261          15.90     15.90    0       0       51.8     -       -
P5             0.0403          33.35      0       6.51   26.84     -      60.49   43.30
P6             0.0101           8.33      0       8.33    0        -      59.99    -
P7             0.1155          29.52      0       0      29.52     -       -      45.90
Disconnected   0.1141           0         0                        -
When only one path is available (P4, P6 and P7), the performance of the network
decreases. In particular, path-1 or path-2 alone can carry only 20-40% of the OD demand of the full
network. However, path-3 alone is still capable of maintaining 80% of the OD flow (P7). Although
the SUE calculation is executed only for a small-scale toy network, it is useful for finding
the paths with higher performance under degraded conditions.
Convergence of the Approximation Algorithm
Figure 5 shows the convergence of the upper and the lower bounds of the expected value of
the OD flow. From Table 1, the exact expected value of the OD flow
is calculated as
$$\sum_{j=1}^{N} p(x_j)\, q(x_j) = 25.078,$$
where $q(x_j)$ denotes the OD flow from node 1 to node 2 for the state vector $x_j$. Using the state
vectors up to the J-th most probable one, the upper and lower bounds of the expected value can be
approximated. It is found that the difference between the upper and the lower bounds becomes smaller as
the number of the probable state vectors increases. The figure illustrates the convergence of the
approximation algorithm for the upper and the lower bounds, $E^U[q]$ and $E^L[q]$, given by the
following equations.
$$E^U[q] = \sum_{j=1}^{J} p(x_j)\, q(x_j) + \Big(1 - \sum_{j=1}^{J} p(x_j)\Big) q(x_0), \qquad E^L[q] = \sum_{j=1}^{J} p(x_j)\, q(x_j)$$
[Figure 5: OD flow (vertical axis, 10 to 40) against the number of iterations (horizontal axis, 1 to 31), showing the upper bound, the lower bound and the expected value]
Figure 5 Convergence of the Upper & Lower Bounds of OD Flow and the Expected Value
The values of the upper and the lower bounds approximated using up to the 6-th most
probable state are 31.20 and 23.14, respectively. The averaged value of the upper and the
lower bounds is 27.17, and the approximation error is 8.3%. When the approximation goes to
the 12-th most probable state vector, the upper and the lower bounds become 26.92 and
24.19, respectively. The averaged value of these two bounds is 25.56, and only 1.9% of the
approximation error remains at the 12-th iteration.
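The bounds quoted above can be checked approximately by enumerating the 32 states. The sketch below is an illustration under one loudly stated assumption: the path-link incidence (path-1 uses links 1 and 2, path-2 uses links 2, 3 and 4, path-3 uses links 3 and 5) is not given in the text and is inferred here from the pattern probabilities of Table 1, most of which it reproduces to four decimals. The OD flow of each pattern is taken from Table 1 instead of being recomputed by [NLSUE-VD].

```python
from itertools import product

P_LINK = [0.9, 0.8, 0.7, 0.9, 0.8]                  # link connectivity, from the text
# ASSUMPTION: path-link incidence inferred from Table 1, not stated in the text
PATHS = {1: (1, 2), 2: (2, 3, 4), 3: (3, 5)}
# OD flow from Table 1, keyed by the tuple of available paths
Q_BY_PATHS = {
    (1, 2, 3): 36.67, (1, 2): 20.67, (1, 3): 36.31, (1,): 15.90,
    (2, 3): 33.35, (2,): 8.33, (3,): 29.52, (): 0.0,
}

def prob(x):
    out = 1.0
    for xa, pa in zip(x, P_LINK):
        out *= pa if xa else 1.0 - pa
    return out

def od_flow(x):
    """OD flow of state x, looked up through the paths that remain open."""
    open_paths = tuple(k for k, links in PATHS.items() if all(x[a - 1] for a in links))
    return Q_BY_PATHS[open_paths]

# States in descending order of probability
states = sorted(product((0, 1), repeat=5), key=prob, reverse=True)

def flow_bounds(J):
    """Lower and upper bounds of E[q] from the J most probable states."""
    top = states[:J]
    lower = sum(prob(x) * od_flow(x) for x in top)
    upper = lower + (1.0 - sum(prob(x) for x in top)) * od_flow((1, 1, 1, 1, 1))
    return lower, upper

lower6, upper6 = flow_bounds(6)
print(f"J = 6: lower = {lower6:.2f}, upper = {upper6:.2f}")
```

Under these assumptions the J = 6 bounds come out at roughly 23.14 and 31.19, within rounding of the values reported in the text. Bounds for larger J can depend on how states with equal probability are ordered, so they are not asserted here.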
The upper and the lower bounds of the cumulative distribution function of the OD flow,
$F^U(q)$ and $F^L(q)$, are approximated as follows:
$$F^U(q) = \sum_{j \in H(q)} p(x_j) + \Big(1 - \sum_{j=1}^{J} p(x_j)\Big)$$
$$F^L(q) = \sum_{j \in H(q)} p(x_j)$$
where H(q) denotes the subset of {1,...,J} for which the OD flow is less than or equal to q.
[Figure 6: cumulative probability (0.0 to 1.0) against OD flow (0 to 40), showing the upper and the lower bounds]
Figure 6 Upper and Lower Bounds of OD Flow Distribution (up-to 6th probable states)
[Figure 7: upper and lower bounds of the cumulative probability of the OD flow]
Figure 7 Upper and Lower Bounds of OD Flow Distribution (up-to 12th probable states)
Figure 6 shows the upper and the lower bounds of the cumulative distribution function of the
OD flow at the 6-th iteration. The difference between the upper and the lower bounds is
approximately 0.220, so the bounds may not yet have converged sufficiently. Figure 7 shows the approximated
function for the 12-th iteration, in which the difference between the upper and lower bounds falls to
0.074. This difference may still not be satisfactorily small. However, the distribution function can
be approximated well since it is given by the average of the upper and the lower bounds.
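The gap between the two bounds depends only on the probability mass of the states not yet examined, so it can be reproduced from the link connectivity values alone. The sketch below does this and also reports how many states are needed for a given tolerance; the helper names and the tolerance of 0.05 are illustrative choices, not values from the paper.

```python
from itertools import product

P_LINK = [0.9, 0.8, 0.7, 0.9, 0.8]   # link connectivity, from the text

def prob(x):
    out = 1.0
    for xa, pa in zip(x, P_LINK):
        out *= pa if xa else 1.0 - pa
    return out

# State probabilities in descending order
probs = sorted((prob(x) for x in product((0, 1), repeat=5)), reverse=True)

def bound_gap(J):
    """Upper bound minus lower bound after the J most probable states."""
    return 1.0 - sum(probs[:J])

def states_needed(tol):
    """Smallest J for which the gap does not exceed tol."""
    for J in range(1, len(probs) + 1):
        if bound_gap(J) <= tol:
            return J
    return len(probs)

print(f"gap at J=6: {bound_gap(6):.3f}, gap at J=12: {bound_gap(12):.3f}")
print("states needed for a gap of 0.05:", states_needed(0.05))
```

The two gaps agree with the differences of about 0.220 and 0.074 read from Figures 6 and 7.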
Once the distribution function is estimated, it is possible to calculate other performance
measures as well as the expected value of the OD flow. The shape of the function gives
some information for discussing the reliability of the network. We have shown only the
distribution function of the flow for an OD pair. However, a similar discussion is possible
for any other performance measure, such as the expected travel cost. It is also possible to
compare the distributions of the link flows for different links in the network. The results can
be used to find the links with larger fluctuations of flows. Those links should be carefully
operated, since the influence of the closure of the other links might be magnified.
CONCLUSION
Flows in a transport network are not stable even if all links are in service. When several links
deteriorate for some reason, the fluctuation of network flows is magnified. The
uncertainty and inconvenience of travel may reduce travel demand. It is
necessary to consider those aspects of network flows when we evaluate the performance of
a transport network. This paper aims to show a network flow model for degraded
conditions, and then to propose performance measures incorporating flows in a network
with some links closed to traffic.
Travelling in a deteriorated network is more uncertain than in the ordinary network. A
stochastic travel choice model is appropriate for describing a driver's travel behaviour. Some of
the drivers may cancel their trips, and the travel demand is assumed to be elastic with respect to
network conditions. We have applied a Stochastic User Equilibrium model with variable demand to
describe flows in a network. Assuming that the trip generation and route choice behaviour are
described by the Nested Logit model, we have formulated the Nested Logit based Stochastic
User Equilibrium model with Variable Demand (NLSUE-VD). The optimality conditions of
[NLSUE-VD] were proved to be equivalent to the stochastic route and trip-making choice models.
The formulated [NLSUE-VD] can be solved using the Simplicial Decomposition method.
Solving [NLSUE-VD] for a state of the network, we obtain the flow variables as well as travel
cost in the network. Those flow and cost variables can be used for evaluating the performance
of a deteriorated network. The simplest performance measure is the degree of reduction of OD
travel demand from the normal network state, that is, the ratio of the OD travel demand of a
state to that of the normal state. When it is necessary to evaluate the reduction rate for an
entire network, the reduction rate of total travel demand can be used as a performance
measure.
Then, the reliability of a flow network was defined using a performance measure. It is defined
as the probability that a performance measure is sustained within an acceptable level.
For a large-scale network, the number of possible states is enormous, and the direct calculation of
the expected value of the reliability requires a huge computation cost. We have shown an
algorithm for approximating the upper and the lower bounds of the expected value. The
algorithm is also applied to approximate the probability distribution function of a performance
measure. Through numerical examples in Chapter 4, the convergence of the algorithm was
verified.
As well as being useful for estimating the expected value, the distribution functions can be
widely used to discuss the characteristics of the performance measures. When those
performance measures are compared between different OD pairs, it is possible to find less
reliable OD pairs. The links that reduce network performability can be identified, and the results will
be useful for improving the physical durability of those links. It is also possible to formulate a network
design problem from the viewpoint of reliability. Although solving such a reliability
network design model will be more difficult than the usual bilevel network design problems, this
will be an attractive field of extension.
We have not yet applied the proposed flow model and the performance measures
to actual transport networks. In parallel with this study, a behavioural survey of car drivers
is ongoing to study their travel choice behaviour when some links are closed to
traffic due to heavy rain. The external validity of the methodology will be examined using
those data.
REFERENCES
Asakura, Y. and M. Kashiwadani (1991). Road Network Reliability Caused by Daily
Fluctuation of Traffic Flow. Proc. of the 19th PTRC Summer Annual Meeting in
Brighton, Seminar G, pp.73-84.
Asakura, Y. and M. Kashiwadani (1995). Traffic Assignment in a Road Network with
Degraded Links by Natural Disasters. Journal of the Eastern Asia Society for Transport
Studies, Vol.1, No.3, pp.1135-1152.
Asakura, Y. (1996). Reliability Measures of an Origin and Destination Pair in a Deteriorated
Road Network with Variable Flows. Paper presented at the 4th Meeting of the EURO
Working Group on Transportation in Newcastle upon Tyne. 14 Pages.
Asakura, Y. (1997). Comparison of Some Reliability Models in a Deteriorated Road Network.
Journal of the Eastern Asia Society for Transport Studies, Vol.2, No.3, pp.705-720.
Du, Z. P. and A.J. Nicholson (1993). Degradable Transportation Systems Performance,
Sensitivity and Reliability Analysis. Research Report, No.93-8, Department of Civil
Engineering, University of Canterbury, Christchurch, New Zealand.
Fisk, C. (1980). Some Developments in Equilibrium Traffic Assignment Methodology.
Transpn. Res.-B, Vol.14-B, pp.243-255.
Iida, Y. and H. Wakabayashi (1989). An Approximation Method of Terminal Reliability of
Road Network Using Partial Minimal Path and Cut Set. Proceedings of the 5th World
Conference on Transport Research, Vol.IV, Yokohama, Japan, pp.367-380.
Khattak, A.J. and A.D. Palma (1997). The Impact of Adverse Weather Conditions on the
Propensity to Change Travel Decisions: A Survey of Brussels Commuters. Transpn.
Res.-A, Vol.31, No.3, pp.181-203.
Li, V.O.K. and J.A. Silvester (1984). Performance Analysis of Networks with Unreliable
Components. IEEE Trans. on Communications, Vol.COM-32, No.10, pp.1105-1110.
Mirchandani, P. and H. Soroush (1987). Generalized Traffic Equilibrium with Probabilistic
Travel Times and Perceptions. Transportation Science, Vol.21, No.3, pp.133-152.
Patriksson, M. (1994). The Traffic Assignment Problem; Models and Methods. VSP Utrecht,
The Netherlands.
Sheffi, Y. and W. Powell (1981). A Comparison of Stochastic and Deterministic Traffic
Assignment Over Congested Networks. Transpn. Res.-B, Vol.15-B, pp.53-64.
Turnquist, M.A. and L.A. Bowman (1980). The Effect of Network Structure on Reliability of
Transit Service. Transpn. Res.-B, Vol.14-B, pp.79-86.
A SENSITIVITY BASED APPROACH TO NETWORK RELIABILITY ASSESSMENT
Michael G H Bell1, Chris Cassir1, Yasunori Iida2 and William H K Lam3
Abstract
This paper presents a methodology for evaluating the reliability of transportation networks.
While tools already exist to determine the expected benefits of travel demand management or
new infrastructure, tools have yet to be developed which take into account disbenefits arising
from randomly occurring disturbances. The paper focuses on the sensitivity of both path travel
times and expected minimum origin-destination travel times to normal within-day demand and
supply variation, where demand variation takes the form of perturbations to origin-destination
flows and supply variation takes the form of perturbations to link saturation flows. Two extreme
cases are distinguished; one where route choices fully respond to the perturbations,
corresponding to the more major, longer-term incident and the other where route choice does
not respond, corresponding to the more minor, shorter-term incident. A logit assignment model,
referred to as the Path Flow Estimator (PFE), is linearised with respect to the parameters
affected by within-day variation, using sensitivity expressions. As analytically derived
sensitivities are sometimes difficult to calculate for large networks, their approximation by finite
differencing is considered. Results obtained for a large network (5000 links, 8000 OD pairs) in
York are discussed, as well as results obtained for a much smaller network (100 links, 60 OD
pairs) in Leicester.
1. INTRODUCTION
1.1 Background
The potential sources of disruption to transportation networks are numerous, ranging from
natural or man-made disasters at one extreme, which tend to occur rather infrequently, to
events which occur on a daily basis at the other. The scale, impact, frequency and
predictability of such events will of course vary enormously. While little can be done about
their scale, frequency or predictability, particularly where natural disasters are concerned, it
should be possible to design transportation networks so as to minimise the disruption such
events can cause.
1 Transport Operations Research Group, University of Newcastle, Newcastle upon Tyne, NE1 7RU
2 Department of Transportation Engineering, Kyoto University
3 Department of Civil and Structural Engineering, Hong Kong Polytechnic University
This paper describes the theory behind a software tool being developed for the analysis of
network reliability. While many tools exist for studying the impact of new transport
infrastructure or travel demand management measures on traffic flows (for example, the widely-
used CONTRAM and SATURN programs), there are no tools for assessing the impact of such
measures on the reliability of transportation networks.
Network reliability has two dimensions. The first relates to the connectivity reliability of a
network. When links fail in unfavourable configurations it may no longer be possible to reach a
given destination from a given origin, in which case the network becomes disconnected.
However, even a connected network may fail to provide an adequate level of service. For
example, random events may cause unacceptable variation in origin-to-destination travel times,
making it difficult for travellers to arrive at their destinations on schedule. The second
dimension of reliability is therefore the performance reliability of a network.
Previous work on network reliability (Du and Nicholson, 1993; Iida and Wakabayashi, 1989)
has focused principally on connectivity in degradable transportation networks. Although
Asakura and Kashiwadani (1991, 1995) have looked at the problem of travel time reliability, the
field of performance reliability is distinctly under-researched.
This paper focuses on the performance reliability of transportation networks in the face of
normal within-day variation of origin-destination flows and link properties. It makes use of the
Path Flow Estimator (PFE), a Stochastic User Equilibrium (SUE) assignment model, in order to
estimate the distributions of the variables of interest (flows, travel times). Reliability with
respect to given performance criteria can then be computed on the basis of those distributions.
The eventual objective is to develop a software tool that facilitates the design of robust
transportation networks.
1.2 Path Flow Estimator
The Path Flow Estimator (PFE) is a flexible traffic assignment tool. It was originally developed
in the DRIVE 2 project MARGOT to estimate path travel times for route guidance systems
given vehicle detector data, traffic signal times, and possibly also travel time data from probe
vehicles. The PFE is based on the notion of a Stochastic User Equilibrium (SUE) between
demand and supply. The demand for the paths is determined by a logit path choice model and
the link costs, which characterise the supply side, are determined primarily by flow-dependent
delay formulae. The delay function currently used is the Kimber and Hollis (1979)
time-dependent function, which allows for over-saturated conditions. The PFE utilises the fact that
under a SUE, in contrast to a deterministic user equilibrium, path flows are uniquely defined at
the equilibrium.
The equilibrium path flows are found by solving an equivalent optimisation problem iteratively.
An outer loop generates paths and an inner loop assigns flows to paths according to the logit
path choice model. The inputs take the form of traffic counts and/or trip tables and the outputs
take the form of estimated path flows and travel times, as well as data which may be derived
from these, like junction turning proportions, link flows where unmeasured and link delays.
Further information about the PFE can be found on http://www.ncl.ac.uk/~nsg5/PFE/html.
The structure of the algorithm is illustrated in Fig. 1. Interesting features of the algorithm are
the inclusion in path generation of the "pressure" exerted by active link constraints, and the
allowance for measurement errors through user-defined confidence intervals.
Input: Static network data and dynamic flow and signal control data
    Outer loop: Generate paths of "least resistance" taking delay and the pressure
    exerted by active link constraints into account
        Inner loop: Assign measured flows to paths according to the logit path
        choice model taking flow-dependent delays into account
Output: Path flows and travel times, turning movements, unmeasured link flows
Fig. 1 : The Path Flow Estimator
The approach to network reliability analysis adopted in this study is based on the sensitivity of
the PFE outputs, in particular path travel times and expected minimum origin-destination travel
times, to perturbations of the inputs, in particular the origin-destination demands and link
saturation flows. Tobin and Friesz (1988) obtained expressions for the sensitivity of
deterministic user equilibrium link flows to perturbations in origin-destination demands and link
travel times. Bell and Iida (1997) extended the method to obtain expressions for the sensitivity
of minimum origin-to-destination travel times to perturbations in origin-destination flows and
link travel times. Using the same approach, Bell and Iida (1997) have obtained comparable
sensitivity expressions for the PFE.
2. NETWORK PERFORMANCE RELIABILITY
2.1 Network performance reliability measures
This paper focuses on normal within-day variations of traffic conditions, like the morning peak-
hour for instance. The measures of network performance reliability considered are based on
travel times, since broadly speaking the main function of a transportation network is to carry
travellers to their destinations within acceptable times.
Two measures of potential interest to both network users and planners are:
• Reliability of path travel times

• Reliability of the expected minimum travel time between the origin and destination
The first measure is defined as the probability that the travel time on a given path is less than an
acceptable threshold. This is of direct interest to the users, since path travel time reliability is a
factor likely to influence their route choice. The second measure is taken at a more aggregate
level, namely the level of the Origin-Destination (OD) pair. The expected minimum OD travel
time is the expected lowest perceived travel time amongst all the relevant paths, allowing for
random variations across users in the perception of travel time. Thus changes of route
undertaken by users to avoid perceived increases in link travel times are taken into account. The
second measure may be more useful for planners than for the network users themselves.
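Consider the logit path choice model with perceived path travel time
$$C_k = g_k^{rs} - \frac{1}{\theta}\,\varepsilon_k$$
where $g_k^{rs}$ is the actual travel time on a path $k$ between an OD pair $rs$, $\varepsilon_k$ is a random, Gumbel
distributed, perception error term, and $\theta$ is a scaling factor, then the expected perceived
minimum travel time is given by
$$E\left[\min_{k \in rs} C_k\right] = -\frac{1}{\theta} \ln \sum_{k \in rs} \exp(-\theta\,g_k^{rs})$$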
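The expected perceived minimum travel time can be evaluated numerically as a "logsum". The following is a minimal sketch, assuming the path travel times of one OD pair are available as a plain list; the function names are illustrative and are not taken from the PFE software.

```python
import math

def expected_min_travel_time(path_times, theta):
    """Logit logsum: -(1/theta) * ln(sum_k exp(-theta * g_k))."""
    g_min = min(path_times)  # shift by the minimum for numerical stability
    s = sum(math.exp(-theta * (g - g_min)) for g in path_times)
    return g_min - math.log(s) / theta

def logit_shares(path_times, theta):
    """Logit path choice proportions for the same travel times."""
    g_min = min(path_times)
    w = [math.exp(-theta * (g - g_min)) for g in path_times]
    total = sum(w)
    return [x / total for x in w]
```

With a single path the logsum equals that path's travel time; with several paths it lies below the smallest actual travel time, because users are assumed to pick the path they perceive as quickest.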
For both types of reliability measure, an estimation of the distributions of actual (not perceived)
path travel times in the face of normal within-day variations of exogenous parameters in the
demand or the supply side of the transportation system is required.
2.2 Sources of unreliability
There are basically two types of variation which affect travel times in a transportation network
on a daily basis. One relates to the demand, characterised by average OD flows during the
period of interest, and the other relates to the supply, for example the saturation flows of the
links in the network. Both types of variation can have a range of causes and impacts in terms of
their scale and duration. Examples of sources of demand- and supply-side variation are
presented in Table 1.
| Impact | Demand | Supply |
|---|---|---|
| Minor/Short-term | Addition or cancellation of trips; change of destination; change of departure time; etc. | Reduction of capacity due to minor accidents; signal timing changes (where not demand related); etc. |
| Major/Long-term | Exceptional events (football matches, fairs, etc.); holiday periods; etc. | Major accidents; road closures; etc. |

Table 1: Sources of demand- and supply-side variation
For both the demand- and the supply-side sources of variation, the distinction between
minor/short-term and major/long-term impacts serves to indicate whether or not there is likely
to be an induced re-routing. For more minor, short-term incidents, there may be insufficient
time for information to disseminate to network users, and even where information has
disseminated, it may not be worthwhile for users to respond by changing route.
These sources of fluctuation can be modelled by the use of appropriate discrete or continuous
probability distributions for the OD flows and link saturation flows. To model the impacts the
fluctuations will have on the travel times, and hence estimate path travel time distributions,
linear relationships between the PFE estimates of path travel times and the sources of random
variation (the OD flows and the link saturation flows) are established. Two approaches are
adopted, depending on the severity and time-scale of the perturbations, reflecting whether or not
induced re-routing is expected.
3. PFE SENSITIVITY
3.1 Linearising the PFE
Linearising the PFE allows approximate distributions for model outputs to be obtained from
distributions of model inputs thanks to the conservation of form for certain commonly used
types of distribution. For instance, the summation of independent normal variates has a normal
distribution whose defining moments are straightforward to compute. The linearisation is valid
only for small variations about some point, in this case taken to be the SUE solution obtained
for average input values.
Where the fluctuations may induce re-routing, one extreme case arises where full adjustment to
the fluctuations has occurred. SUE sensitivity analysis may be used to estimate the distributions
of equilibrium path travel times. The other extreme case arises where no re-routing at all occurs.
The fluctuations affect only the supply side, since the route-choice proportions do not respond.
3.2 Notation
Let:
h = Vector of path flows
v = Vector of link flows
c = Vector of link travel times
g = Vector of path travel times
t = Vector of Origin-Destination (OD) flows
s = Vector of link saturation flows
A = Link-path incidence matrix
B = OD-path incidence matrix
P = Matrix of path choice proportions
α = Dispersion parameter of the logit model.
The PFE gives an estimation of average flows and travel times in a network for some period of
the day. The solution yields flows and travel times consistent with each other by virtue of the
equilibrium principle, and the assignment of flows to paths is governed by the logit model as
follows
$$h_j = t_w\,p_j$$
with
$$p_j = \frac{\exp(-\alpha\,g_j(h,s))}{\sum_{k \in P(w)} \exp(-\alpha\,g_k(h,s))}$$
for any path flow $h_j$ between an OD pair $w$.

The relation between path costs and path flows is through the fact that path travel times are the
summation of link travel times, and that link travel times depend on link flows in relation to
their saturation flows through monotonically increasing functions
$$g_j = \sum_{i \in j} c_i(v_i, s_i)$$
where $i \in j$ indicates that the summation is over all links on path $j$, and the relation between
link flows and path flows is
$$v_i = \sum_{j \in i} h_j$$
where $j \in i$ indicates that the summation is over all paths using link $i$.
This SUE solution can be interpreted as the mean value of a network equilibrium where users
minimise their perceived travel times. Due to random errors in travel time perception, the model
is stochastic and therefore the estimated flows and travel times are also random variables.
However the mean values provided by the SUE are a reasonable estimator of average network
traffic conditions for a particular period of the day, such as the morning peak hour. We shall
hereafter refer to the 'base solution' as being the PFE stochastic equilibrium obtained for a
given trip matrix $t^0$, considered to be the average trip table, and the vector $s^0$ of average link
saturation flows. From this base we can then look at the effects of random variations in both
demand and saturation flows on path travel times and expected minimum OD travel times
necessary for calculating reliability measures.
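To make the fixed-point character of such a base solution concrete, the sketch below solves a logit equilibrium for a single OD pair served by parallel single-link paths. It is only an illustration under stated assumptions: the BPR-style delay function, its coefficients, and the method of successive averages (MSA) are choices made here for simplicity, not the delay functions or solution algorithm of the PFE.

```python
import math

def link_time(v, free_time, sat_flow):
    # Assumed BPR-style delay function (hypothetical), increasing in flow v.
    return free_time * (1.0 + 0.15 * (v / sat_flow) ** 4)

def logit_sue_parallel(t, free_times, sat_flows, alpha, iterations=2000):
    """MSA for one OD pair with demand t on parallel single-link paths."""
    n_paths = len(free_times)
    h = [t / n_paths] * n_paths  # start from an even split
    for n in range(1, iterations + 1):
        g = [link_time(h[j], free_times[j], sat_flows[j]) for j in range(n_paths)]
        g_min = min(g)
        w = [math.exp(-alpha * (gj - g_min)) for gj in g]
        total = sum(w)
        target = [t * wj / total for wj in w]  # logit loading at current times
        h = [hj + (yj - hj) / n for hj, yj in zip(h, target)]  # averaging step
    g = [link_time(h[j], free_times[j], sat_flows[j]) for j in range(n_paths)]
    return h, g
```

At convergence the returned flows satisfy $h_j = t_w p_j$ with the proportions $p_j$ computed from the travel times those same flows produce, which is the consistency between flows and travel times described above.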
3.3 PFE sensitivity analysis with full route response
Assuming OD flows are normally distributed with known parameters, Asakura and
Kashiwadani (1991) ran a static User Equilibrium (UE) assignment several times, with a
demand sampled from these normal distributions, in order to estimate OD travel time
distributions. A similar approach would be possible for the PFE. However, for large networks
with many OD pairs, this method is computationally prohibitive and also simulation adds an
element of non-reproducibility.
Another approach would be to estimate variances of travel times from the variances of OD trips
by making linear approximations of the relationship between OD flows and the path travel times
obtained from the equilibrium assignment model. This can be done by applying sensitivity
analysis to the PFE model.
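Fisk (1980) proposed the following objective function
$$f(h) = \sum_j h_j(\ln h_j - 1) + \alpha \sum_i \int_0^{v_i} c_i(x)\,dx$$
When $f(h)$ is minimised subject to $t = Bh$ and $h \ge 0$, the Kuhn-Karush-Tucker optimality
conditions are
$$\nabla f(h^*) + B^T u^* \ge 0, \quad h^* \ge 0 \quad \text{and} \quad (\nabla f(h^*) + B^T u^*)^T h^* = 0$$
where $\nabla f(h) = \ln h + \alpha g$. As $h^* > 0$ (since all paths are used)
$$\ln h^* = -\alpha g^* - B^T u^*$$
It can be shown that this implies the logit path choice model
$$h_j^* = t_w \frac{\exp(-\alpha g_j^*)}{\sum_{k \in P(w)} \exp(-\alpha g_k^*)}$$
Hence at the optimum
$$\nabla f(h^*) + B^T u^* = 0$$
where $\nabla f(h^*) = \ln h^* + \alpha g^*$, so at equilibrium
$$\nabla^2 f(h^*)\,\Delta h + B^T \Delta u = -\alpha\,\Delta g$$
where $\Delta g$ is the exogenous perturbation of path cost. This leads to
$$\begin{bmatrix} -\alpha\,\Delta g \\ \Delta t \end{bmatrix} = \begin{bmatrix} H & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} \Delta h \\ \Delta u \end{bmatrix}$$
where $H = \nabla^2 f(h^*)$. Following Bell and Iida (1997)
$$\begin{bmatrix} \Delta h \\ \Delta u \end{bmatrix} = \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} \begin{bmatrix} -\alpha\,\Delta g \\ \Delta t \end{bmatrix}$$
where
$$J_{11} = H^{-1}\left(I - B^T (B H^{-1} B^T)^{-1} B H^{-1}\right)$$
$$J_{12} = H^{-1} B^T (B H^{-1} B^T)^{-1}$$
$$J_{21} = (B H^{-1} B^T)^{-1} B H^{-1}$$
$$J_{22} = -(B H^{-1} B^T)^{-1}$$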
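As a consistency check, the four blocks can be verified to form the inverse of the coefficient matrix by direct multiplication. The shorthand $S = B H^{-1} B^T$ is introduced here only for brevity and is not part of the original notation; $H$ is assumed to be invertible and $B$ to have full row rank, so that $S^{-1}$ exists. In this shorthand $J_{11} = H^{-1}(I - B^T S^{-1} B H^{-1})$, $J_{12} = H^{-1} B^T S^{-1}$, $J_{21} = S^{-1} B H^{-1}$ and $J_{22} = -S^{-1}$, so that

$$\begin{aligned}
H J_{11} + B^T J_{21} &= (I - B^T S^{-1} B H^{-1}) + B^T S^{-1} B H^{-1} = I \\
H J_{12} + B^T J_{22} &= B^T S^{-1} - B^T S^{-1} = 0 \\
B J_{11} &= B H^{-1} - S S^{-1} B H^{-1} = 0 \\
B J_{12} &= S S^{-1} = I
\end{aligned}$$

The product of the coefficient matrix and the block matrix of the $J$ terms is therefore the identity.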
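Let $z$ = Expected minimum OD costs
At optimality $\ln h^* + \alpha g^* + B^T u^* = 0$, implying $h_j = \exp(-\alpha g_j - u_w)$. From this logit model
$$\exp(-u_w) = \frac{t_w}{\sum_{j \in P(w)} \exp(-\alpha g_j)}$$
so
$$-u_w = \ln t_w - \ln \sum_{j \in P(w)} \exp(-\alpha g_j)$$
Hence $-u = \ln t + \alpha z$, so
$$\frac{dz}{dt} = \frac{1}{\alpha}(B H^{-1} B^T)^{-1}$$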
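Putting this together
$$\left[\frac{dz}{ds}\right]_{s=s^*} = -\frac{1}{\alpha}(B H^{-1} B^T)^{-1} B H^{-1} A^T \left[\frac{dc}{ds}\right]_{s=s^*}$$
$$\left[\frac{dz}{dt}\right]_{t=t^*} = -\frac{1}{\alpha}\left[\frac{du}{dt}\right]_{t=t^*} = \frac{1}{\alpha}(B H^{-1} B^T)^{-1}$$
$$\left[\frac{dh}{ds}\right]_{s=s^*} = H^{-1}\left(I - B^T (B H^{-1} B^T)^{-1} B H^{-1}\right) A^T \left[\frac{dc}{ds}\right]_{s=s^*}$$
$$\left[\frac{dh}{dt}\right]_{t=t^*} = H^{-1} B^T (B H^{-1} B^T)^{-1}$$
This leads to the following approximate variance and covariance expressions
$$Var(z) = \left[\frac{dz}{ds}\right]_{s=s^*} Var(s) \left[\frac{dz}{ds}\right]_{s=s^*}^T + \left[\frac{dz}{dt}\right]_{t=t^*} Var(t) \left[\frac{dz}{dt}\right]_{t=t^*}^T$$
$$Var(h) = \left[\frac{dh}{ds}\right]_{s=s^*} Var(s) \left[\frac{dh}{ds}\right]_{s=s^*}^T + \left[\frac{dh}{dt}\right]_{t=t^*} Var(t) \left[\frac{dh}{dt}\right]_{t=t^*}^T$$
assuming that s and t are vectors of independent random variables.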
Gradients and variances for path travel times rather than path flows may be obtained from the
gradients of path travel times to path flows, via the link travel time Jacobian.
Calculating those expressions requires extensive matrix operations (like inversion), which for
large networks are thought to be unmanageable. Therefore it may be preferable to approximate
these gradients by finite differencing
$$\left[\frac{dz}{dt}\right]_{t=t^*} \approx \left[\frac{\Delta z}{\Delta t}\right]_{t=t^*}$$
$$\left[\frac{dh}{ds}\right]_{s=s^*} \approx \left[\frac{\Delta h}{\Delta s}\right]_{s=s^*}$$
To obtain finite differences one needs first to run the PFE program with the trip table $t^0$ to
obtain a base solution in all variables of interest. Then the model has to be run sequentially
after applying a small change $\Delta t_j$ (say 1 vehicle/hr) to each OD pair $j$, one at a time. To reduce
the number of iterations within the PFE at each run, the initial solution is taken to be the base
solution, which is assumed to be not so distant from the perturbed solutions. Thus after each run
$j$, the differences with respect to the base solution in terms of path travel times can be
calculated. Those differences divided by $\Delta t_j$ provide the finite difference approximations
to the partial derivatives needed.
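The finite differencing procedure can be sketched as a simple loop. In the sketch below, `model` is a hypothetical stand-in for one run of the PFE: any callable that maps a list of OD flows to a list of outputs (for example path travel times). The warm start from the base solution mentioned above is not modelled here.

```python
def finite_difference_gradients(model, t0, step=1.0):
    """Forward differences d(output_k)/d(t_i), returned as rows k, columns i.

    model: callable taking a list of OD flows and returning a list of
           outputs (hypothetical stand-in for one PFE run).
    t0:    base trip table as a flat list of OD flows.
    step:  perturbation applied to one OD flow at a time (e.g. 1 veh/hr).
    """
    base = model(t0)  # base solution
    grads = [[0.0] * len(t0) for _ in base]
    for i in range(len(t0)):
        t = list(t0)
        t[i] += step  # perturb OD pair i only
        out = model(t)
        for k, (new, old) in enumerate(zip(out, base)):
            grads[k][i] = (new - old) / step
    return grads
```

The loop makes the computational cost explicit: one model run for the base solution plus one run per OD pair.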
3.4 PFE sensitivity with no route response
If we assume that there is no re-routing induced by the fluctuations of demand and saturation
flow, then it is no longer appropriate to consider expected minimum OD travel time. Instead we
can look at distributions of path travel times around the equilibrium caused by the demand- and
supply-side perturbations. The following gradients reflect the corresponding sensitivities
$$\left[\frac{dg}{dt}\right]_{t=t^*} = A^T \left[\frac{dc}{dv}\right]_{v=v^*} A\,P$$
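The matrix product in this gradient is small enough to write out directly. The sketch below uses plain lists of rows; the three-link, two-path network and the numerical values in the usage note are invented purely for illustration.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*x)]

def path_time_gradient_no_rerouting(A, dc_dv, P):
    """dg/dt = A^T (dc/dv) A P, with the path proportions P held fixed.

    A:     link-path incidence matrix (links x paths)
    dc_dv: Jacobian of link travel times w.r.t. link flows (links x links)
    P:     path choice proportions (paths x OD pairs)
    """
    return matmul(transpose(A), matmul(dc_dv, matmul(A, P)))
```

For example, with two paths sharing a third link, `A = [[1, 0], [0, 1], [1, 1]]`, a diagonal `dc_dv` with entries 0.01, 0.02 and 0.05, and `P = [[0.6], [0.4]]`, both path gradients are dominated by the shared link, which carries all of the OD flow.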
4. EXAMPLES
4.1 Example 1: network in York (2323 links)
4.1.1 Network Data.
Network data were provided by York City Council (YCC) in SATURN (a simulation and
assignment model) format, as part of a European project AIUTO, which looked at how to model
the effects of various travel demand management measures. The SATURN model was based
upon one constructed by YCC in 1992 and extensively updated by them in 1996.
The node-based data from SATURN had to be translated into a link-based format suitable for
the PFE. Also the morning peak hour trip matrix, obtained from a survey with real number
entries, was transformed into an integer trip matrix by a rounding operation that sought to
maintain approximately the same column and row totals as in the original real valued matrix.
The resulting PFE network thus consists of about 2000 links, 8635 Origin-Destination (OD)
pairs, and includes such necessary information as link free-flow average speed, link saturation
flows, as well as the signal times. A representation of the modelled network is shown in Fig. 2.
Fig. 2: The York network
4.1.2 Base SUE solution.
The base SUE solution was obtained by running the PFE on the York network with the average
morning peak hour trip matrix. Fig. 3 shows the network with the congested links highlighted
(links with the ratio of flow over saturation flow greater than 0.9 were defined as congested).
Fig. 3: York network with congested links highlighted
4.1.3 Finite differences and variance computation.
It was proposed in this study to obtain morning peak hour variances of both equilibrium and
non-equilibrium path travel times. The source of variation was taken to be the trip table (OD
flows). For equilibrium path travel times we used sensitivity of the PFE solution about the base
solution calculated previously. Since applying analytical sensitivity expressions was not
practically possible for such a large network, it was decided to approximate the derivatives of
equilibrium path travel times with respect to OD flow variation by finite differences. This was
carried out by simply changing each OD flow $t_i$, one at a time, by one unit of flow, and then
calculating a new PFE solution for the trip matrix thus perturbed. The differences in path travel times
with respect to the base solution give the approximate derivatives
$$\frac{\partial g_j^*}{\partial t_i} \approx \frac{\Delta g_j^*}{\Delta t_i}$$
where $g_j^*$ is the SUE travel time of path $j$.
This meant running the PFE 8635 times (the number of OD pairs) with a differently perturbed
OD matrix each time. Eventually the outcome would yield an estimation of the equilibrium
path travel time variances, through the linearisation of the relationship between equilibrium path
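travel times and OD flows discussed in the previous section
$$Var[g_j^*] \approx \sum_{i \in \text{OD pairs}} \left(\frac{\partial g_j^*}{\partial t_i}\right)^2 Var[t_i]$$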
What is needed as input are the variances of each OD flow. Due to lack of empirical statistical
data, some assumptions about the distributions of OD flows were required. For simplicity, we
used a normal distribution for each OD flow centred around $t_i^0$, the base solution. This implied
a normal distribution as output for the path travel times.
To have a range of OD flows with negligible negative values, and given that the majority of all
OD pairs had flows of the order of 1 veh/hr, it seemed quite reasonable to take a standard
deviation $\sigma$ at a third of the mean value $\mu$. So for each OD pair $i$, the variance of flow was
thus assumed to be
$$Var[t_i] = \sigma_i^2 = (t_i^0 / 3)^2$$
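Combining the assumed OD flow variances with the linearised relationship gives the path travel time variances. A minimal sketch follows, assuming independent OD flows and a gradient matrix with one row per path and one column per OD pair; the function name and the default ratio of one third are illustrative.

```python
def path_time_variances(gradients, od_means, cv=1.0 / 3.0):
    """First-order variance of each path travel time.

    gradients: rows = paths, columns = OD pairs, entries d g_j / d t_i
    od_means:  base OD flows t_i^0
    cv:        assumed ratio of standard deviation to mean for each OD flow
    """
    od_variances = [(cv * m) ** 2 for m in od_means]
    return [
        sum(d * d * var for d, var in zip(row, od_variances))
        for row in gradients
    ]
```

Because the OD flows are treated as independent, no covariance terms appear; correlated demands would require the full matrix form of the variance expression.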
4.1.4 Analysis of results.
The finite differences method was applied to obtain variances of SUE path travel times. Four
days of computational time were necessary in order to compute the required variances, despite
the fact that each of the 8635 PFE runs was started from the base solution, expected to be not
too distant from the perturbed solutions. We also obtained the variances for non-equilibrium
path travel times using finite differences for proper comparison, even though the variances
could have been computed analytically as shown in the previous section.
With the variances thus calculated, reliability measures could be computed for each path. The
performance criterion, or the travel time threshold set for reliability, was arbitrarily taken at 110%
of the mean path travel time.
Thus for each path $j$ the travel time reliability $R_j$, defined as the probability that the travel time
will be less than 110% of the mean travel time $g_j^0$, was calculated as follows:
$$R_j = \Phi\left(\frac{1.1\,g_j^0 - g_j^0}{\sqrt{Var[g_j]}}\right)$$
where $\Phi\left(\frac{x-\mu}{\sigma}\right) = F_x(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy$ is the cumulative distribution of a normally
distributed variable $x$ with mean $\mu$ and variance $\sigma^2$. The values of $\Phi$ are obtained by
tabulation.
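Instead of tabulated values, the normal distribution function can be evaluated with the error function. The following is a minimal sketch of the reliability measure, assuming normally distributed path travel times; the function names are illustrative, and the mean and standard deviation must be given in the same time unit.

```python
import math

def standard_normal_cdf(x):
    """Cumulative distribution function of the standard normal variate."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def path_reliability(mean_time, std_dev, tolerance=0.10):
    """Probability that the travel time stays below (1 + tolerance) * mean."""
    if std_dev == 0.0:
        return 1.0  # no variation: the threshold is never exceeded
    return standard_normal_cdf(tolerance * mean_time / std_dev)
```

For instance, a path with a mean travel time of 7.06 min (423.6 s) and a standard deviation of 49 s has a reliability of about 0.81 at the 110% threshold.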
Given the huge number of paths in this network (around 23000), a thorough analysis of all the
results cannot be presented here. For illustrative purposes, we chose a particular OD pair
with two clear cut alternatives to analyse the results and illustrate the differences between the
equilibrium and non-equilibrium methodology for path travel time reliability. Fig. 4 shows the
two paths (with the arrows) on a zoomed part of the network (in the centre of town), along with
the congested links (highlighted). The table below indicates the average path travel times with
the values of the standard deviations and reliability for both equilibrium (full route adjustment)
and non-equilibrium (no route adjustment) distributions.
Fig. 4: Two alternative paths
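| Path | Average time (min) | Std deviation (sec), equilibrium | Std deviation (sec), non-equilibrium | Reliability, equilibrium | Reliability, non-equilibrium |
|---|---|---|---|---|---|
| 1 | 7.06 | 49 | 66 | 0.81 | 0.76 |
| 2 | 4.63 | 1 | 8 | 1.00 | 0.99 |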
We can see that for both paths the standard deviation increases quite significantly when
readjustment to equilibrium is not considered in the calculation. This suggests that the effect of
drivers re-routing is to lessen the impacts of variations.
We can also notice that path 2 experiences hardly any significant variations (whether in the
equilibrium case or not) compared to path 1. By looking at the map, the explanation is clear:
path 2 does not include any congested links whereas path 1 has two of them. So fluctuations of
demand on those congested links are bound to produce greater changes in travel times than on the
links that are less congested in the base solution. This is a feature that actually appears for all
the most unreliable paths in the network; they all contain one or several congested links.
While this appears sensible, it still prompts a question about the usefulness of conducting such a
time-consuming calculation; the most unreliable paths could indeed have been identified before
the calculations of the variances just by finding out the paths containing congested links in the
base solution.
This limitation comes from the linear approximation used in both methods, which is only valid
in the vicinity of the base solution. Therefore it would be useful to account for larger variations,
which tend to happen in reality anyway. However to model the impacts of these larger variations
the non-linear effects need to be included.
4.2 Example 2 : network in Leicester (103 links)
To illustrate the points made in the previous example further, the methodology was applied to a
much smaller network, thereby allowing a more complete overview of the results to be
presented in the scope of this paper.
4.2.1 The network.
The network considered here is a small part of an urban network in Leicester, England. It
consists of 103 links (including micro-links at junctions), 9 origins and 9 destinations. A
representation of the network is shown in Fig. 5. The topological data (including signal
timings), along with a trip table for a peak period, were made available for a PhD project on OD
estimation based on SCOOT traffic counts. The link detector data were not used here.
[Figure: diagram of the Leicester network; the legend distinguishes signalised and unsignalised junctions]
Fig. 5: Leicester network
4.2.2 Computation of path time variances and reliability measures with respect to OD flow
fluctuations.
We used the PFE model on the Leicester network with the trip table available, and applied the
sensitivity expressions defined in chapter 3, with and without route response. It was here
possible to compute the sensitivity expressions in the full (equilibrium) route response case,
directly without using finite differences, given the small size of the network (typically the
number of paths generated was of the order of 100).
As in the York case we assumed normal distributions of OD flows centred around the trip table
values $t^0$, and again chose a standard deviation of $t^0/3$, to be consistent with the previous
example. However since the order of magnitude of most OD flows was here much larger than in
the York case, we put a limit of 30 veh/hr on the standard deviation, which corresponds to the
largest deviation in the York case. This was to avoid very large deviations, which given the first
order approximation used in the method, would probably lead to unrealistically large variances
in the equilibrium travel times. Note that a deviation of 30 veh/hr is not strictly speaking small
enough to warrant a valid use of the first order approximation for the relation between the
equilibrium solution and the OD flows. However it is conjectured that for moderately congested
networks, the second-order derivatives of the equilibrium flows will be sufficiently small so as to
make the second-order terms negligible.
The following table shows the time deviations and reliability (defined as in the York example)
results obtained for the 20 most 'unreliable' paths, in the equilibrium and non-equilibrium
variance evaluation case.
| Orig. | Dest. | Time (min) | Deviat. (min), equilibrium | Reliab., equilibrium | Deviat. (min), non-equilibrium | Reliab., non-equilibrium | Route |
|---|---|---|---|---|---|---|---|
| OR_F | DES_B | 6.00 | 1.28 | 0.68 | 2.09 | 0.61 | 212S-241C-224I-226S-215K-DES_B |
| OR_F | DES_B | 5.78 | 1.27 | 0.68 | 2.06 | 0.61 | 212S-241C-242C-243C-213C-226X-215K-DES_B |
| OR_F | DES_B | 5.93 | 1.27 | 0.68 | 1.90 | 0.62 | 212S-222A-224L-226S-215K-DES_B |
| OR_F | DES_A | 4.54 | 1.26 | 0.64 | 1.69 | 0.61 | 212S-241C-242C-243C-213C-244L-215L-DES_A |
| OR_F | DES_B | 4.54 | 1.26 | 0.64 | 1.69 | 0.61 | 212S-241C-242C-243C-213C-244L-215L-DES_B |
| OR_F | DES_B | 5.68 | 1.25 | 0.67 | 1.35 | 0.66 | 212S-222A-224L-241K-242C-243C-213C-244L-215L-DES_B |
| OR_D | DES_B | 5.05 | 1.09 | 0.68 | 1.86 | 0.61 | 232K-222D-224L-226S-215K-DES_B |
| OR_D | DES_B | 4.98 | 1.09 | 0.68 | 1.85 | 0.61 | 232K-245L-224K-226S-215K-DES_B |
| OR_E | DES_B | 4.63 | 1.09 | 0.66 | 1.85 | 0.60 | 222F-224L-226S-215K-DES_B |
| OR_G | DES_B | 4.79 | 1.08 | 0.67 | 1.85 | 0.60 | OR_G-241Q-224I-226S-215K-DES_B |
| OR_C | DES_B | 4.88 | 1.07 | 0.68 | 1.26 | 0.65 | OR_C-245J-224K-241K-242C-243C-213C-244L-215L-DES_B |
| OR_D | DES_B | 4.73 | 1.07 | 0.67 | 1.27 | 0.65 | 232K-245L-224K-241K-242C-243C-213C-244L-215L-DES_B |
| OR_E | DES_B | 4.38 | 1.07 | 0.66 | 1.27 | 0.63 | 222F-224L-241K-242C-243C-213C-244L-215L-DES_B |
| OR_G | DES_B | 4.53 | 1.07 | 0.66 | 1.80 | 0.60 | OR_G-243C-213C-226X-215K-DES_B |
| OR_J | DES_B | 4.33 | 1.07 | 0.66 | 1.80 | 0.59 | OR_J-213A-226X-215K-DES_B |
| OR_H | DES_B | 4.37 | 1.07 | 0.66 | 1.80 | 0.60 | OR_H-213C-226X-215K-DES_B |
| OR_C | DES_B | 4.02 | 1.06 | 0.65 | 1.82 | 0.59 | OR_C-226R-215K-DES_B |
| OR_G | DES_A | 3.29 | 1.06 | 0.62 | 1.26 | 0.60 | OR_G-243C-213C-244L-215L-DES_A |
| OR_G | DES_B | 3.29 | 1.06 | 0.62 | 1.26 | 0.60 | OR_G-243C-213C-244L-215L-DES_B |
As for the two paths in the York example, we can see here that, for all the paths, the standard
deviation is greater, and thus the reliability lower, when equilibrium route response is not taken
into account. We also highlighted two links, 215L and 215K, which are included in all the most
unreliable paths, and those two links happen to be the only over-saturated links in the whole
network. This confirms the relation between congestion in the average equilibrium situation and
high variations in travel times and again points to the limitation of the first-order approximation
in gaining more useful insights.
The next table shows the standard deviation of the expected minimum OD travel time, for the
20 most 'unreliable' OD pairs:

| Orig | Dest | Deviation (min) (equilibrium) | Deviation (min) |
|---|---|---|---|
| OR_F | DES_B | 1.2621 | 1.634 |
| OR_D | DES_B | 1.0804 | 1.4536 |
| OR_E | DES_B | 1.0792 | 1.3646 |
| OR_H | DES_B | 1.0612 | 1.2594 |
| OR_G | DES_B | 1.0611 | 1.2632 |
| OR_C | DES_B | 1.0594 | 1.6743 |
| OR_J | DES_B | 1.0579 | 1.2547 |
| OR_F | DES_A | 0.5138 | 0.6661 |
| OR_F | DES_C | 0.4795 | 0.584 |
| OR_F | DES_J | 0.4651 | 0.7232 |
| OR_F | DES_G | 0.4645 | 0.7223 |
| OR_F | DES_H | 0.4645 | 0.7223 |
| OR_F | DES_D | 0.4494 | 0.405 |
| OR_A | DES_E | 0.1582 | 0.1644 |
| OR_J | DES_A | 0.1535 | 0.1607 |
| OR_B | DES_E | 0.1476 | 0.154 |
| OR_A | DES_D | 0.1242 | 0.1242 |
| OR_E | DES_A | 0.1216 | 0.1525 |
| OR_D | DES_A | 0.1216 | 0.1526 |
| OR_H | DES_A | 0.1173 | 0.1152 |
We can see that all OD pairs connecting DES_B have a substantially higher variability in travel
time compared to all other pairs. A quick look at the network shows why: To get to DES_B, it is
impossible to avoid either link 215L or link 215K. The same could be said for DES_A on the
evidence of Fig 5; however in reality link 215K is divided into 2 sub-links, and it is only the
right-turn sub-link leading to DES_B that is actually over-saturated.
5. FUTURE WORK
Including non-linear effects is a problem in the sense that the form of the input distributions
(demand, capacity) is no longer conserved in the output (travel time) distributions. We present
here briefly the problem of attempting to derive path travel time distributions knowing the
distributions of OD flows, and keeping the path-choice proportions fixed (thus restricting the
non-linearity to the supply side only).
We shall concentrate on the link travel times $c_i(v_i(h), s_i)$. Suppose, without loss of generality,
the demand $t$ to be normally distributed around $t^0$, the base trip table.
Then assuming the path choice proportion matrix fixed at the base solution, $P^0$, the link flow
$v_i(h)$ also becomes a normally distributed random variable, centred around $v_i^0$, since there is a
linear relationship:
$$v_i = \sum_{j \in i} h_j = \sum_{j \in i} \sum_w p_{jw}^0\,t_w$$
and
$$Var[v_i] = \sum_w \Big(\sum_{j \in i} p_{jw}^0\Big)^2 Var[t_w]$$
provided the $t_w$ are independent (a condition that is not necessarily satisfied, but which we will
assume holds here as we did in the previous calculations).
Thus the travel time $c_i(v_i, s_i)$ is also a random variable, being a function of a random variable,
though it is not normal, since the link cost function is non-linear. However the expected value
and the density function can be calculated, using the density function of $v_i$, $f_{v_i}$:
$$E[c_i] = \int c_i(v, s_i)\,f_{v_i}(v)\,dv$$
and the cumulative probability distribution of $c_i$ is given by
$$F_{c_i}(x) = P(c_i \le x) = F_{v_i}\big(c_i^{-1}(x)\big)$$
where the inverse function $c_i^{-1}(x)$ exists, since the link cost function is monotonically increasing
as a function of link flow $v$ only.
The density function $f_{c_i}(x)$ can then be obtained by taking the derivative of $F_{c_i}(x)$.
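The change of variable through the inverse link cost function can be illustrated numerically. The sketch below assumes a BPR-style link cost function with a closed-form inverse; this functional form and its coefficients are hypothetical choices for illustration and are not the delay functions used by the PFE. It also assumes that the probability of a negative link flow is negligible.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal variate."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def link_time(v, c0, s):
    # Assumed BPR-style cost function, increasing in v for v >= 0.
    return c0 * (1.0 + 0.15 * (v / s) ** 4)

def link_time_inverse(x, c0, s):
    # Flow at which the assumed cost function equals x (requires x >= c0).
    return s * ((x / c0 - 1.0) / 0.15) ** 0.25

def link_time_cdf(x, c0, s, v_mean, v_std):
    """P(c <= x) = F_v(c^{-1}(x)) for a normally distributed link flow."""
    if x <= c0:
        return 0.0  # travel time cannot fall below the free-flow time
    return normal_cdf(link_time_inverse(x, c0, s), v_mean, v_std)
```

The median of the travel time distribution is the travel time at the median flow, but because the cost function is convex the distribution is skewed towards long travel times, so it is no longer normal.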
For reliability studies, we are mainly interested in path travel times, which being summations of
random link travel times, are also random variables whose cumulative probability function is
just what is needed to obtain the required measures of reliability. The probability of a trip cost
on a path $j$ being close enough to the 'normal situation' (base solution) value serves as a good
performance measure and will be given by $F_{g_j}(g_j^0 + \varepsilon)$. The problem is thus how to compute
$F_{g_j}$.
Suppose, for the sake of clarity, that a path $X$ is composed of links 1 and 2.
Then we have:
$$g_X = c_1 + c_2$$
as the summation of two random variables.
The cumulative distribution of $g_X$ is then
$$F_{g_X}(x) = \iint_{z+y<x} f_{c_1 c_2}(z, y)\,dz\,dy$$
If $c_1$ and $c_2$ were independent variables, $F_{g_X}(x)$ would then be quite easy to compute, since in
that case:
$$f_{c_1 c_2}(z, y) = f_{c_1}(z)\,f_{c_2}(y)$$
The $f_{c_i}(x)$, density functions for the link costs, have been calculated previously. So $F_{g_X}(x)$ may
be calculated relatively easily (at least in this case of two variates):
$$F_{g_X}(x) = \iint_{z+y<x} f_{c_1}(z)\,f_{c_2}(y)\,dz\,dy$$
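For two independent link travel times the double integral reduces to a one-dimensional convolution, $F_{g_X}(x) = \int F_{c_1}(x - y)\,f_{c_2}(y)\,dy$, which can be evaluated by simple quadrature. The sketch below uses the trapezoidal rule; the helper names are illustrative, and normal distributions are used in the usage note only because the exact answer is then known for checking.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal variate."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal variate."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def sum_cdf(x, cdf1, pdf2, lower, upper, steps=2000):
    """P(c1 + c2 < x) for independent c1, c2 by the trapezoidal rule.

    cdf1: cumulative distribution function of c1
    pdf2: density function of c2
    lower, upper: integration range covering the support of c2
    """
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lower + k * width
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cdf1(x - y) * pdf2(y)
    return total * width
```

With `cdf1 = lambda v: normal_cdf(v, 3.0, 0.4)` and `pdf2 = lambda v: normal_pdf(v, 2.0, 0.3)`, the sum is normal with mean 5 and standard deviation 0.5, so `sum_cdf(5.5, cdf1, pdf2, -0.4, 4.4)` should be close to the standard normal distribution function evaluated at 1. The same routine accepts non-normal link time distributions, but each additional independent link adds another level of integration.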
However, in the case of fluctuating demand, $c_1$ and $c_2$ will not in general be independent (they
will be affected by some common flows). Thus the joint density function $f_{c_1 c_2}(z, y)$ will not be
easy to compute, especially for paths containing many correlated links. Some approximation
methods might be needed. Furthermore, even if we supposed that the travel times on all the
links were independent, we would be faced for large realistic networks with the nearly
impossible task of evaluating multiple integrals. Thus it appears that further research into heuristics
that a) include the correlation between the numerous random variates involved in the determination of
path travel times and b) integrate the density functions of those variates is necessary and
justified.
6. CONCLUSIONS
We have presented a methodology for obtaining some performance reliability measures for a
transportation network. The work is essentially centred around estimating typical distributions
of path travel times resulting from random fluctuations of exogenous factors. Having identified
those factors, we proposed a method based on sensitivity analysis (linearisation) of a SUE model, with
two different options depending on whether re-routing is considered or not. We showed some results for a
large realistic network and also a smaller one to illustrate the differences between the two
options and noticed that paths subject to bigger variations in time are those containing links that
are congested in the average situation. We concluded by indicating that more useful results
might be obtained if non-linear effects were taken into account so as to allow a wider range of
fluctuations to be considered.
REFERENCES
Asakura, Y and Kashiwadani, M (1991). Road Network Reliability Caused by Daily Fluctuation
of Traffic Flow. Proceedings of the 19th PTRC Summer Annual Meeting in Brighton, Seminar
G, 73-84.
Asakura, Y and Kashiwadani, M (1992). Road Network Reliability Measures Based on Statistic
Estimation of Day-to-Day Fluctuation of Link Traffic. Proceedings of the 6th World Conference
on Transport Research, Lyon, France, June.
Asakura, Y and Kashiwadani, M (1995). Traffic Assignment in a Road Network with Degraded
Links by Natural Disasters. Journal of the Eastern Asia for Transport Studies, Vol.1, No.3,
1135-1152.
Asakura, Y (1996). Reliability Measures of an Origin and Destination pair in a Deteriorated
Road Network with Variable Flow. Proceedings of the 4th Meeting of the EURO Working
Group, Newcastle-upon-Tyne, UK, September.
Bell, M.G.H., Lam, W and Iida, Y (1996). A Time-Dependent Multiclass Path Flow Estimator.
Proceedings of the 13th International Symposium on Transportation and Traffic Theory, Lyon,
France, July.
Bell, M.G.H. and Iida, Y (1997). Transportation Network Analysis, Wiley, England.
Du, Z.P. and Nicholson, A.J. (1993). Degradable Transportation Systems Performance,
Sensitivity and Reliability Analysis. Research Report, No. 93-8, Department of Civil
Engineering, University of Canterbury, Christchurch, New Zealand.
Du, Z.P. and Nicholson, A.J. (1997). Degradable Transportation Systems: Sensitivity and
Reliability Analysis. Transportation Research B, 31, No 3, 225-237.
Fisk, C (1980). Some developments in equilibrium traffic assignment. Transportation
Research, 14B, 243-255.
Iida, Y and Wakayabashi, H (1989). An Approximation Method of Terminal Reliability of Road
Network Using Partial Minimal Path and Cut Set. Proceedings of the 5th World Conference,
Vol. IV, Yokohama, Japan, 367-380.
Kimber, R.M. and Hollis, E.M. (1979). Traffic queues and delays at road junctions.
TRRL Laboratory Report 909, Transport and Road Research Laboratory, Crowthorne, England.
Tobin, S.J. and Friesz T.L. (1988). Sensitivity Analysis for equilibrium network flow.
Transportation Science, 22, 242-250.
A CAPACITY INCREASING PARADOX

FOR A DYNAMIC TRAFFIC ASSIGNMENT
WITH DEPARTURE TIME CHOICE
Takashi Akamatsu
Department of Knowledge-based Information Engineering

Toyohashi University of Technology
Toyohashi, Aichi 441-8580, Japan
Masao Kuwahara
Institute of Industrial Science

University of Tokyo
Minato-ku, Tokyo 106-8558, Japan
ABSTRACT
This paper demonstrates that the capacity increasing paradox in a transportation network,
as in Braess (1968), also occurs under non-stationary settings, in particular, under
dynamic traffic assignment with endogenous time-varying origin-destination (OD)
demands. Through the analyses, the analytical formulae for the solutions of the dynamic
equilibrium assignment are explicitly derived for two kinds of networks: the networks
with a one-to-many OD pattern and the reversed networks with a many-to-one OD pattern;
the formulae clarify the significant difference in the properties of the two dynamic flow
patterns. This also leads us to the finding that one of the crucial conditions that
determine whether the paradox occurs or not is the OD pattern of the underlying
networks.
1. INTRODUCTION
Local improvements in a transportation network do not necessarily lead to the improvement
of the global performance of the network. This fact has been well recognized as "Braess's
paradox" (Braess (1968)) or "Smith's paradox" (Smith (1978)). The paradoxes stimulated many
researchers in the field, and a considerable number of studies have been made on the relevant
topics such as the network design problem or the sensitivity analysis of the equilibrium traffic
assignment. Almost all the studies are, however, based on the framework of static (equilibrium)
traffic assignment; only a few attempts have so far been made to study non-stationary (dynamic)
traffic flow patterns with queues. Since the properties of the dynamic flow with queues are
significantly different from those of the static flow without queues, many basic problems on the
paradox under non-stationary settings are yet to be investigated.
The purpose of this paper is first to demonstrate that the capacity increasing paradox also
occurs under non-stationary settings, in particular under dynamic traffic assignment with
endogenous, time-varying origin-destination (OD) demands. The paper also aims to capture the
conditions that determine whether the paradox is likely to occur; we show that the OD pattern
of the underlying network is one of the crucial conditions.
To this end, we first show that the solution of the dynamic user equilibrium (DUE) traffic
assignment with elastic OD demands (i.e. the assignment considering users' departure-time
choice behavior) can be obtained explicitly for a particular type of network satisfying some
conditions. The solutions are derived for two kinds of network: (i) networks with a single
origin and multiple destinations (regarded as an "evening rush hour" on the network of a city
with a single CBD; we refer to this as the "E-net" hereafter); and (ii) networks with a single
destination and multiple origins (obtained by reversing the direction of all links and
interchanging the origins and destinations of the E-net; it may be regarded as a "morning rush
hour" on the same network, and we refer to it as the "M-net"). Through the analyses of the two
cases, we see a significant difference in the properties of the two dynamic flow patterns,
not only when time-varying OD demands are given but also when OD demand is elastic owing to
users' departure-time choice. These basic results for the DUE assignment then enable us to
demonstrate the dynamic version of the capacity increasing paradox and to discuss the
significant effect of the OD pattern on the occurrence of the paradox.
The organization of this paper is as follows. In the second chapter, we briefly explain the
basic properties of dynamic user equilibrium assignment, restricting ourselves to the minimum
knowledge required for considering our problem. The third chapter explores the structure of the
dynamic equilibrium assignment with exogenous OD demands for E-net and M-net. The
analytical solution formulae of the equilibrium flow patterns for E-net and M-net are derived. The
fourth chapter extends the analyses to the model with endogenous OD demand; not only the route
choice but also the departure time choice are simultaneously considered in the model. For an
appropriate set of boundary conditions, the explicit equilibrium flow patterns are derived for E-net
and M-net. By using the results obtained in Chapters 3 and 4, we demonstrate a dynamic version
of Braess's paradox in the fifth chapter. We first discuss the paradox for the model with
exogenous OD demand; the analysis of a simple network shows that the paradox arises only under a
particular condition for the network with a one-to-many OD pattern, while the corresponding
paradox always arises for the reversed network with a many-to-one OD pattern. We then show that
the same results also hold for the model with endogenous OD demand. Finally, the last chapter
summarizes the results and remarks on some further research topics.
2. DECOMPOSITION OF DYNAMIC EQUILIBRIUM ASSIGNMENT
2.1 Networks
Our model is defined on a transportation network G[N, L, W] consisting of the set L of
directed links with L elements, the set N of nodes with N elements, and the set W of
origin-destination (OD) node pairs. The origins and the destinations are subsets of N, and we
denote them by R and S, respectively. In this paper, we deal only with networks with a
one-to-many OD pattern (i.e. R has a unique element) or with a many-to-one OD pattern (i.e. S
has a unique element). Sequential integers from 1 to N are allocated to the N nodes. A link
from node i to node j is denoted as link (i, j). We also indicate a link by the sequential
numbers from 1 to L allocated to all the links in the set L.
The structure of a network is represented by a node-link incidence matrix $A^*$, which is an
N x L matrix whose (n, a) element is 1 if node n is the upstream node of link a, -1 if node n
is the downstream node of link a, and zero otherwise. The rank of this matrix is N-1 (for a
connected network) since the rows sum to zero in every column. Hence, it is convenient in
representing our model to use the reduced incidence matrix $A$ (instead of $A^*$), which is an
(N-1) x L matrix obtained by eliminating an arbitrary row of $A^*$. We call the node
corresponding to the eliminated row the "reference node". It is also convenient to "split" the
matrix $A$ into a pair of matrices, $A_-$ and $A_+$, defined as follows: $A_-$ is the matrix
obtained by setting all the +1 elements of $A$ to zero (i.e. the (n, a) element is -1 if
link a arrives at node n, zero otherwise); $A_+$ is the matrix obtained by setting all the -1
elements of $A$ to zero (i.e. the (n, a) element is +1 if link a leaves node n, zero
otherwise). Clearly, $A = A_- + A_+$ holds.
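The construction of these matrices can be sketched in a few lines of Python with NumPy. This is only an illustration, not part of the original formulation: the function name `incidence_matrices`, the link-list representation and the 0-based node indices are assumptions made for the example. The three-node network used is the one whose incidence matrix appears in (5.1).

```python
import numpy as np

def incidence_matrices(n_nodes, links, reference):
    """Return (A_star, A, A_minus, A_plus) for a directed network.

    links is a list of (upstream, downstream) node indices (0-based);
    reference is the index of the node whose row is eliminated.
    """
    a_star = np.zeros((n_nodes, len(links)), dtype=int)
    for a, (i, j) in enumerate(links):
        a_star[i, a] = 1    # link a leaves node i
        a_star[j, a] = -1   # link a arrives at node j
    a = np.delete(a_star, reference, axis=0)   # reduced incidence matrix
    a_minus = np.where(a < 0, a, 0)            # keep only the -1 entries
    a_plus = np.where(a > 0, a, 0)             # keep only the +1 entries
    return a_star, a, a_minus, a_plus

# Hypothetical example: nodes 1, 2, 3 stored as 0, 1, 2;
# links 1->2, 1->3, 2->3; node 1 is the reference node.
A_star, A, A_minus, A_plus = incidence_matrices(
    3, [(0, 1), (0, 2), (1, 2)], reference=0
)
```

Keeping the split as two separate arrays makes the later products such as $A M A_-^T$ direct matrix multiplications.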
2.2. Link Model and Dynamic Equilibrium Assignment
For a link model in our dynamic assignment, we employ the First-In-First-Out (FIFO)
principle and the point queue concept, in which a vehicle has no physical length: it is assumed
that the flow arriving at link (i, j) leaves the link after the free-flow travel time $m_{ij}$ if
there is no queue on the link; otherwise it leaves the link at the maximum departure rate
(capacity) $\mu_{ij}$.
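The behavior of such a link can be illustrated with a small discrete-time sketch. It is a simplification under stated assumptions, not the paper's formulation: the queue is evaluated at the moment a vehicle enters the link, time is discretized with step `dt`, and the function name and numbers are hypothetical. While a queue exists, the travel time grows at the rate (inflow / capacity) - 1, which is the slope used for the link travel time in Section 3.1.

```python
def point_queue_travel_times(inflow, mu, m, dt):
    """Travel time seen by vehicles entering the link at each time step.

    inflow: list of arrival rates (vehicles per unit time), one per step
    mu:     maximum departure rate (capacity) of the link
    m:      free-flow travel time
    dt:     length of one time step
    """
    queue = 0.0          # number of vehicles in the point queue
    travel_times = []
    for lam in inflow:
        # FIFO: a new arrival waits for the vehicles already queued.
        travel_times.append(m + queue / mu)
        # The queue grows at rate (lam - mu) and can never become negative.
        queue = max(0.0, queue + (lam - mu) * dt)
    return travel_times

# Hypothetical numbers: inflow 3 exceeds capacity 2, so a queue builds up.
times = point_queue_travel_times([3.0] * 5, mu=2.0, m=1.0, dt=0.1)
```

With inflow below capacity the queue stays empty and every vehicle experiences only the free-flow time `m`.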
Concerning the assignment principle, we assume the dynamic user equilibrium (DUE)
assignment, which is a natural extension of the static user equilibrium assignment; the DUE is
defined as the state where no user can reduce his/her travel time by changing his/her route
unilaterally for an arbitrary time period.
2.3. Decomposition Property of Dynamic Equilibrium Assignment
Under the DUE state, the users who depart their origin at the same time, regardless of their
routes, have the same arrival time at any node that is commonly passed through on the way to their
destination. Furthermore, under the DUE state, the order of departure from the origin must be
kept at every node on the way to the destinations. From these properties, we can define a unique
equilibrium arrival time at each node for each departure time from the origin.
As defined in the previous section, the link travel time $c_{ij}(t)$ depends only on the vehicles that
arrived at the link before time t. Therefore, together with the above discussion on the order of
arrivals at a node, it is concluded that the travel time experienced by the vehicle that departs from
an origin at time s is independent of the flows of the vehicles that depart from the origin after time s.
Consequently, we can consider the assignment sequentially in the order of departure from the
single origin. That is, the assignment can be decomposed with respect to the departure time from
the single origin provided that the OD pattern is one-to-many. Similarly, for a many-to-one OD
pattern, we can easily conclude that the assignment can be decomposed with respect to the arrival
time at the single destination. For the detailed discussions on this property, see Kuwahara and
Akamatsu (1993) and Akamatsu and Kuwahara (1994).
3. EQUILIBRIUM FLOW PATTERNS ON SATURATED NETWORKS - FIXED DEMAND CASE
In general, the DUE assignment is formulated as a non-linear complementarity problem
(NCP) or a variational inequality problem (VIP), which implies that it is difficult to obtain the
analytical properties of the assignment. Hence, instead of exploring the properties of the DUE
assignment under general settings, we confine our analysis to "saturated networks" where we can
obtain the analytical solution. The "saturated networks" are the networks satisfying the following
two conditions: a) there exist inflows on all links over the network, b) there exist queues on all
links over the network. The first condition a) is not very restrictive, since we can constitute the
networks satisfying this condition after knowing the set of links with positive flows. Although the
second condition b) may not be satisfied in many cases, we nevertheless employ this assumption
because this assumption, as shown below, gives us the explicit formula for the solution of the DUE
assignment, which enables us to understand the qualitative properties of interest.
We will first show the formulation for E-net and derive the solution in 3.1; and then the
formulation and the solution for M-net will be examined in 3.2.
3.1. Equilibrium on Saturated Networks with a One-to-Many Pattern
(1) Formulation
The DUE assignment on a network with a one-to-many OD pattern can be decomposed with
respect to the origin departure-time as mentioned in chapter 2. Hence, once we know the method of
solving the equilibrium pattern for one particular departure-time, we can obtain the equilibrium
pattern for the whole time period by successively applying the same procedure in the order of
departure-time. In the following, we consider the problem of obtaining the equilibrium pattern for
vehicles departing from origin o at time s, assuming that the solutions for vehicles departing before
time s are already given.
In the decomposed formulation with origin departure time s, two kinds of variables,
$(y_{ij}^s, \tau_i^s)$, play a central role: $\tau_i^s$ is the earliest arrival time at node i for a vehicle
departing from origin o at time s; $y_{ij}^s$ is the link flow rate with respect to s, that is,
$y_{ij}^s = dF_{ij}(\tau_i^s)/ds$, where $F_{ij}(t)$ denotes the cumulative number of vehicles that have
entered link (i, j) by time t. In addition, we denote the number of vehicles with destination d
departing from origin o until time s (cumulative OD demand by departure-time) by $Q_{od}(s)$.
In the DUE state, each user chooses a route whose travel time is (ex post) minimum over
the network. In other words, the links with positive inflows should be on the minimum path tree.
In our saturated networks, all the links have positive inflows, and therefore the minimum path
condition for users with origin departure-time s is written as $c(s) + A^T \tau(s) = 0$, where $c(s)$
is an L dimensional column vector with elements $c_{ij}^s = c_{ij}(\tau_i^s)$, and $\tau(s)$ is an (N-1)
dimensional column vector with elements $\tau_i^s$. Since the equation above should hold for any s,
taking the derivative with respect to s, we have
$$\frac{dc(s)}{ds} + A^T \frac{d\tau(s)}{ds} = 0 \quad \forall s, \qquad (3.1)$$
where $dc(s)/ds$ is an L dimensional column vector with elements $dc_{ij}^s/ds$, and $d\tau(s)/ds$ is
an (N-1) dimensional column vector with elements $d\tau_i^s/ds$.
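In our link model, the point queue and the FIFO principle are assumed, and therefore the rate
of change in link travel time is given by
$$\frac{dc_{ij}(t)}{dt} = \begin{cases} \lambda_{ij}(t)/\mu_{ij} - 1 & \text{if there is a queue,} \\ 0 & \text{otherwise,} \end{cases}$$
where $\lambda_{ij}(t)$ is the standard link flow rate defined as $dF_{ij}(t)/dt$. Hence, in our saturated
networks where all links have queues, the rate of change in the time needed to traverse link (i, j) for
users with origin departure time s, $dc_{ij}^s/ds$, can be represented as
$$\frac{dc_{ij}^s}{ds} = \frac{dc_{ij}(\tau_i^s)}{d\tau_i^s} \frac{d\tau_i^s}{ds} = \left( \frac{\lambda_{ij}(\tau_i^s)}{\mu_{ij}} - 1 \right) \frac{d\tau_i^s}{ds}.$$
Noting the definitional relationship $y_{ij}^s = \lambda_{ij}(\tau_i^s) \cdot d\tau_i^s/ds$, we see that $dc_{ij}^s/ds$
reduces to a function of $y_{ij}^s$ and $\tau_i^s$:
$$\frac{dc_{ij}^s}{ds} = \frac{y_{ij}^s}{\mu_{ij}} - \frac{d\tau_i^s}{ds}, \qquad (3.2a)$$
or equivalently
$$\frac{dc(s)}{ds} = M^{-1} y(s) - A_+^T \frac{d\tau(s)}{ds} \quad \forall s, \qquad (3.2b)$$
where M is a diagonal matrix whose a-th diagonal element is the maximum capacity of link a, and
$y(s)$ is an L dimensional column vector with elements $y_{ij}^s$.
Substituting (3.2) into (3.1), we obtain
$$M^{-1} y(s) + A_-^T \frac{d\tau(s)}{ds} = 0 \quad \forall s, \qquad (3.3)$$
and rearranging this yields
$$y(s) = -M A_-^T \frac{d\tau(s)}{ds} \quad \forall s. \qquad (3.4)$$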
On the other hand, in the decomposed DUE formulation, the flow constraints that consist of
the FIFO condition for each link and the flow conservation at each node over a network reduce to
the following equations (for details, see Kuwahara and Akamatsu (1993) and Akamatsu and
Kuwahara (1994)):
$$-A y(s) - \frac{dQ(s)}{ds} = 0 \quad \forall s, \qquad (3.5)$$
where $dQ(s)/ds$ is defined as an (N-1) dimensional vector with elements $dQ_{od}(s)/ds$ (given).
Combining (3.5) with (3.4), we obtain
$$(A M A_-^T) \frac{d\tau(s)}{ds} = \frac{dQ(s)}{ds} \quad \forall s. \qquad (3.6)$$
Thus, we see that the DUE assignment has a unique solution $d\tau(s)/ds$ if the rank of the matrix
$A M A_-^T$ is N-1.
(2) Solution
The rank of the matrix $A M A_-^T$ generally depends on the choice of the reference node. For a
network with a one-to-many OD pattern, the rank of $A M A_-^T$ can be less than N-1 when we choose
a node that is not the origin as the reference node. The rank, however, is always N-1 when
the origin is employed as the reference node. Furthermore, since the value of $d\tau_i(s)/ds$ for the
origin node is always 1 from the definition of $\tau_i(s)$, it is natural to choose the origin as the
reference node. Thus, by setting the origin as the reference node, we obtain the equilibrium
solution, $d\tau(s)/ds$, by the following formula:
$$\frac{d\tau(s)}{ds} = (A M A_-^T)^{-1} \frac{dQ(s)}{ds} \quad \forall s. \qquad (3.7)$$
In addition, we can obtain the equilibrium link flow pattern, $y(s)$, by substituting (3.7) into (3.4).
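The two-step solution (first the arrival-time rates from (3.7), then the link flows from (3.4)) can be sketched numerically as follows. The function name `solve_one_to_many` and all numerical values are assumptions chosen for illustration; the matrices are those of the three-node network in (5.1). The sketch does not check that the resulting pattern actually keeps a queue on every link, which the saturated-network assumption requires.

```python
import numpy as np

def solve_one_to_many(A, A_minus, mu, dQ):
    """Fixed-demand DUE rates on a saturated one-to-many network.

    A, A_minus: reduced incidence matrix and its -1 part (origin row removed)
    mu:         link capacities, one per link
    dQ:         OD demand rates dQ_od/ds, one per non-origin node
    Returns (dtau, y): arrival-time rates per node and link flow rates.
    """
    M = np.diag(mu)
    dtau = np.linalg.solve(A @ M @ A_minus.T, dQ)   # eq. (3.7)
    y = -M @ A_minus.T @ dtau                       # eq. (3.4)
    return dtau, y

# Hypothetical data for the three-node network of (5.1).
A = np.array([[-1.0, 0.0, 1.0], [0.0, -1.0, -1.0]])
A_minus = np.array([[-1.0, 0.0, 0.0], [0.0, -1.0, -1.0]])
mu = np.array([3.0, 2.0, 1.0])
dQ = np.array([1.0, 1.5])
dtau, y = solve_one_to_many(A, A_minus, mu, dQ)
```

Solving the linear system directly, rather than forming the inverse, is the usual numerical choice; the resulting flows satisfy the conservation condition (3.5) by construction.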
3.2. Equilibrium on Saturated Networks with a Many-to-One Pattern
(1) Formulation
The DUE assignment on a network with a many-to-one OD pattern can be decomposed with
respect to the destination arrival-time as shown in chapter 2. In the following, we consider the
problem of obtaining the equilibrium pattern for vehicles arriving at the destination at time u,
assuming that the solutions for vehicles arriving before time u are already given.
For networks with a many-to-one OD pattern, decomposed with respect to the arrival
time at the single destination, the discussion almost parallels that of the previous section. In the
decomposed formulation with destination arrival time u, two kinds of variables, $(y_{ij}^u, \tau_i^u)$, play a
central role: $\tau_i^u$ is the latest arrival time at node i for a vehicle reaching destination d at time u;
$y_{ij}^u$ is the link flow rate with respect to u, that is, $y_{ij}^u = dF_{ij}(\tau_i^u)/du$. In addition, we denote the
number of vehicles with origin o arriving at destination d until time u (cumulative OD demand by
arrival-time) by $Q_{od}(u)$.
The formulation almost parallels the discussion in 3.1. First, the minimum path
conditions for saturated networks reduce to the following conditions:
$$\frac{dc(u)}{du} + A^T \frac{d\tau(u)}{du} = 0 \quad \forall u. \qquad (3.8)$$
Then the link travel time with a point queue for saturated networks should also satisfy
$$\frac{dc(u)}{du} = M^{-1} y(u) - A_+^T \frac{d\tau(u)}{du} \quad \forall u. \qquad (3.9)$$
Substituting (3.9) into (3.8), we obtain
$$y(u) = -M A_-^T \frac{d\tau(u)}{du} \quad \forall u. \qquad (3.10)$$
On the other hand, the link flow y should satisfy the flow constraints:
$$A y(u) - \frac{dQ(u)}{du} = 0 \quad \forall u. \qquad (3.11)$$
Combining (3.10) with (3.11), we reach
$$(A M A_-^T) \frac{d\tau(u)}{du} = -\frac{dQ(u)}{du} \quad \forall u. \qquad (3.12)$$
Thus, we see that the DUE assignment has a unique solution ($d\tau(u)/du$ and $y(u)$) if the rank of
$A M A_-^T$ is N-1.
(2) Solution
An arbitrary network with a many-to-one OD pattern can be obtained by reversing the
direction of all links and interchanging the origins and destinations of a network with a
one-to-many OD pattern. Therefore, it is natural to expect that, "reversing" the result in 3.1, the
rank of $A M A_-^T$ becomes N-1 when the destination is chosen as the reference node. However, this is
not the case for this problem; the rank becomes less than N-1 even if we set the destination as the
reference node; furthermore, we can prove that the rank is less than N-1 for any choice of the
reference node.
The reason why the rank of the matrix $A M A_-^T$ becomes less than N-1 is that there exist
particular origins (we call these "pure origins") that are not traversal nodes (i.e. origins that have
no arriving links). Letting $B_{ij}$ be the (i, j) element of $A^* M A_-^{*T}$, we easily see that
$$B_{ij} = \begin{cases} \sum_k \mu_{ki} & \text{if } i = j, \\ -\mu_{ij} & \text{if link } (i, j) \text{ exists,} \\ 0 & \text{otherwise.} \end{cases} \qquad (3.13)$$
Hence, the column vectors of $A M A_-^T$ corresponding to the pure origins are always zero, and the
rank of $A M A_-^T$ necessarily decreases by the number of pure origins.
To see this fact more precisely, we divide the node set N into two subsets: the set of pure
origins, $N_1$, and the set of the other nodes, $N_2$. Then, we divide $A^*$, $A_-^*$, $d\tau(u)/du$ and $dQ(u)/du$
into two blocks corresponding to $N_1$ and $N_2$, respectively:
$$A^* = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}, \quad A_-^* = \begin{bmatrix} A_{1-} \\ A_{2-} \end{bmatrix}, \quad \frac{d\tau(u)}{du} = \begin{bmatrix} d\tau_1(u)/du \\ d\tau_2(u)/du \end{bmatrix}, \quad \frac{dQ(u)}{du} = \begin{bmatrix} dQ_1(u)/du \\ dQ_2(u)/du \end{bmatrix},$$
where the i-th element of $dQ_2(u)/du$ is defined as $-\sum_o \{dQ_{od}(u)/du\} = -\sum_k \mu_{kd}$ if i is the
destination d, $dQ_{id}(u)/du$ if i is an origin, and zero otherwise. Note that $A_{1-}$, which is the first block
of $A_-^*$ corresponding to $N_1$, is always 0 according to the definition of the pure origins. Rewriting (3.12)
with these partitioned variables, we have
$$\begin{bmatrix} dQ_1(u)/du \\ dQ_2(u)/du \end{bmatrix} = -A^* M A_-^{*T} \begin{bmatrix} d\tau_1(u)/du \\ d\tau_2(u)/du \end{bmatrix}. \qquad (3.14)$$
That is,
$$\frac{dQ_1(u)}{du} = -A_1 M A_{2-}^T \frac{d\tau_2(u)}{du}, \qquad (3.15a)$$
$$\frac{dQ_2(u)}{du} = -A_2 M A_{2-}^T \frac{d\tau_2(u)}{du}. \qquad (3.15b)$$
This means that no condition determining $d\tau_1/du$ for the pure origins is included in the
equilibrium condition (3.12), while $d\tau_2/du$ for the traversal nodes can be obtained by
$$\frac{d\tau_2(u)}{du} = -(A_2 M A_{2-}^T)^{-1} \frac{dQ_2(u)}{du}. \qquad (3.16)$$
Thus we see that the solution of the DUE assignment with a many-to-one OD pattern cannot be
unique, and that for the problem to have a unique solution we should add appropriate conditions to
resolve the indeterminacy of $d\tau_1/du$.
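The rank deficiency caused by a pure origin, and the partial solution (3.16) for the remaining nodes, can be checked numerically. The sketch below assumes a three-node network (destination node 1, origins 2 and 3, links 2->1, 3->1 and 3->2, so that node 3 is a pure origin), which has the same topology as the example treated in Section 5.2; the capacities and the demand rate are hypothetical values.

```python
import numpy as np

# Hypothetical three-node many-to-one network: node 1 is the destination,
# nodes 2 and 3 are origins, links are 2->1, 3->1 and 3->2.
# Rows are ordered as [node 3 (pure origin), node 1, node 2].
A1 = np.array([[0.0, 1.0, 1.0]])                         # pure origin block
A2 = np.array([[-1.0, -1.0, 0.0], [1.0, 0.0, -1.0]])     # nodes 1 and 2
A_star = np.vstack([A1, A2])
A_star_minus = np.where(A_star < 0, A_star, 0.0)
A2_minus = A_star_minus[1:]

mu = np.array([3.0, 2.0, 1.0])       # assumed link capacities
M = np.diag(mu)
B = A_star @ M @ A_star_minus.T      # the matrix whose rank is discussed
rank_B = np.linalg.matrix_rank(B)

# Demand block for the traversal nodes: -(mu_1 + mu_2) at the destination,
# and an assumed demand rate dQ_21/du = 1.2 at origin node 2.
dQ2 = np.array([-(mu[0] + mu[1]), 1.2])
dtau2 = -np.linalg.solve(A2 @ M @ A2_minus.T, dQ2)       # eq. (3.16)
```

The first column of `B`, which belongs to the pure origin, is identically zero, so nothing in the system constrains the rate at node 3, while the rates at nodes 1 and 2 are fully determined.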
4. EQUILIBRIUM FLOW PATTERNS ON SATURATED NETWORKS - ELASTIC DEMAND CASE
The previous chapter analyzed the solution of the DUE assignment where only users' route choice
is endogenously described, given time-varying OD demands. This chapter extends the analyses to the
case where the time-dependent OD demands are endogenously determined (we call the model "DUE
assignment with elastic demand") by incorporating users' departure time choice. The model
employed here is the simplest one that consistently unifies the two kinds of dynamic equilibrium models:
the dynamic equilibrium assignment presented in the previous chapter and the dynamic equilibrium
model of departure time choice, well known since Vickrey (1969) and Hendrickson and Kocur (1980).
For expositional brevity, the following assumptions are made in this paper:
1) The users with the same OD pair are homogeneous, that is, their utility functions are all the same and
their desired arrival time is unique;
2) No user arrives later than the desired arrival time [This is not a restrictive
assumption but one made just to keep the exposition as simple as possible; it is easy to extend to the case
where late arrival is permitted.].
3a) For the problems with a one-to-many OD pattern (i.e. when we consider the problem on the basis of
the origin departure-time), the disutility function for the users with destination d leaving the origin at time s,
$V_{od}(s)$, is given as the linear combination of their travel time from the origin to destination d and their
"schedule delay":
$$V_{od}(s) = a_1 \{\tau_d(s) - s\} + a_2 \{t_d - \tau_d(s)\}, \qquad (4.1)$$
where $a_1$, $a_2$ are positive parameters that satisfy $a_1 > a_2$, $\tau_d(s)$ is the destination arrival-time for the
users who start from the origin at time s, and $t_d$ is the users' desired arrival time.
3b) For the problems with a many-to-one OD pattern (i.e. when we consider the problem on the basis of
the destination arrival-time), the disutility function for the users with origin o arriving at the destination
at time u, $V_{od}(u)$, is given as the linear combination of their travel time from origin o to the destination
and their "schedule delay":
$$V_{od}(u) = a_1 \{u - \tau_o(u)\} + a_2 \{t_d - u\}, \qquad (4.2)$$
where $\tau_o(u)$ is the origin departure-time for the users who arrive at the destination at time u.
4) The networks can be regarded as "saturated networks" as defined in the previous chapter.
4.1. Equilibrium on Saturated Networks with a One-to-Many Pattern
(1) Formulation
In this section we consider networks with a one-to-many OD pattern where all nodes except the
origin are destinations, i.e., there are no nodes that are neither origin nor destination. [This is simply for
expositional brevity. An appropriate division of the node set easily extends our analyses
to the general case where there are some nodes that are neither origin nor destination. See Appendix.]
The elastic demand DUE employed in this chapter is defined as the state where no one can improve
his/her utility by unilaterally changing either his/her route or his/her departure-time. To formulate this,
consider users who choose time s as departure time. Since the users choose their optimal route
(conditional on the optimal departure time) in the DUE state, the equilibrium conditions for the route
choice should be represented by the following differential equations, as shown in Chapter 3:
$$\frac{dQ(s)}{ds} = (A M A_-^T) \frac{d\tau(s)}{ds} \quad \forall s, \qquad (4.3)$$
where the origin node is selected as the reference node as discussed in 3.1. Then, the condition that no user
can improve his/her utility by changing his/her departure-time in the DUE state can be represented by
$$\frac{dV_{od}(s)}{ds} = 0 \quad \forall s, \forall d. \qquad (4.4)$$
Substituting the definition of the disutility function (4.1) into this, we obtain the equilibrium rate of change in
the destination arrival-time as follows:
$$\frac{d\tau_d(s)}{ds} = \frac{a_1}{a_1 - a_2} \quad \forall s, \forall d. \qquad (4.5)$$
[We are assuming that networks can be regarded as "saturated networks" and all OD pairs have positive
OD flows during the period of time. In general we should consider the analysis period to include the time
where some OD pairs have no generation of OD flows. By introducing appropriate classification,
however, the general case can be reduced to the combination of our basic case (the case where all OD
pairs have positive OD flows during the period for our analysis) and the case presented in Appendix.].
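The step from the disutility function (4.1) and the stationarity condition (4.4) to the rate (4.5) can be written out explicitly. The following is a worked version of that substitution, using only the symbols defined in (4.1):

```latex
\[
\frac{dV_{od}(s)}{ds}
  = a_1\left(\frac{d\tau_d(s)}{ds} - 1\right) - a_2\,\frac{d\tau_d(s)}{ds} = 0
\;\Longrightarrow\;
(a_1 - a_2)\,\frac{d\tau_d(s)}{ds} = a_1
\;\Longrightarrow\;
\frac{d\tau_d(s)}{ds} = \frac{a_1}{a_1 - a_2}.
\]
```

Since $a_1 > a_2 > 0$, this rate is larger than 1: arrival times at each destination spread out faster than the corresponding departure times, so the experienced travel time increases with departure time over the period considered.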
Thus, the elastic DUE conditions are represented as the following system of differential equations:
$$\frac{d\tau(s)}{ds} = E \frac{a_1}{a_1 - a_2}, \qquad (4.6a)$$
$$\frac{dQ(s)}{ds} = (A M A_-^T) \frac{d\tau(s)}{ds}, \qquad (4.6b)$$
where E is an (N-1) dimensional column vector whose elements are all equal to 1. It is worthwhile to
compare these equilibrium conditions with those for the fixed demand case. In the fixed demand DUE
model, eq.(4.3) with a given constant vector $dQ(s)/ds$ determines $d\tau(s)/ds$. In contrast, in the
elastic demand DUE, $d\tau(s)/ds$ is first determined from the departure-time equilibrium condition, and
then eq.(4.3) with fixed $d\tau(s)/ds$ determines $dQ(s)/ds$.
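The reversed order of determination in the elastic case (rates of arrival time first, demand rates second) can be sketched as follows. The function name `elastic_one_to_many` and the numerical values are assumptions for illustration; the matrices are those of the three-node network in (5.1).

```python
import numpy as np

def elastic_one_to_many(A, A_minus, mu, a1, a2):
    """Elastic-demand DUE rates on a saturated one-to-many network.

    Returns (dtau, dQ): arrival-time rates from (4.6a) and the OD demand
    rates they imply through (4.6b).
    """
    if not a1 > a2 > 0:
        raise ValueError("the model assumes a1 > a2 > 0")
    M = np.diag(mu)
    E = np.ones(A.shape[0])
    dtau = E * a1 / (a1 - a2)            # eq. (4.6a)
    dQ = A @ M @ A_minus.T @ dtau        # eq. (4.6b)
    return dtau, dQ

A = np.array([[-1.0, 0.0, 1.0], [0.0, -1.0, -1.0]])
A_minus = np.array([[-1.0, 0.0, 0.0], [0.0, -1.0, -1.0]])
dtau, dQ = elastic_one_to_many(
    A, A_minus, mu=np.array([3.0, 2.0, 1.0]), a1=2.0, a2=1.0
)
```

No linear system has to be solved here: the demand rates follow from a single matrix-vector product, which is why the rank of $A M A_-^T$ plays no role in the elastic case.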
(2) Solution
By setting appropriate boundary conditions, we can obtain the solution $(\tau(s), Q(s))$ for the
differential equation (4.6). For the boundary conditions, we first set the initial time $s_s$ of the time period
(measured with respect to the origin departure-time) during which eq.(4.6) holds (i.e. the networks can be
regarded as "saturated networks" and all OD pairs have positive OD flows). Then we give the value of
cumulative OD flows for the time $s_s$ and for the final time of the period:
$$Q_{od}(s_s) = \underline{Q}_{od} = \text{given} \quad \forall d, \qquad (4.7a)$$
$$Q_{od}(s(t_d)) = \overline{Q}_{od} = \text{given} \quad \forall d, \qquad (4.7b)$$
where $s(t_d)$ is the origin departure-time of the final users who arrive at destination d at time $t_d$ (note that
we do not have to give the value of $s(t_d)$ explicitly).
Integrating the second equation of (4.6) from time $s_s$ to s with the initial condition (4.7a), we have
$$Q(s) = \underline{Q} + (A M A_-^T) E \frac{a_1}{a_1 - a_2} (s - s_s), \qquad (4.8)$$
where $\underline{Q}$ is an (N-1) dimensional vector with elements $\underline{Q}_{od}$.
We then solve (4.6) with respect to $\tau$. Integrating the first equation of (4.6) from time $s_s$ to time
$s(t_d)$ reduces to
$$t - \tau(s_s) = \frac{a_1}{a_1 - a_2} (s_t - E s_s), \qquad (4.9)$$
where t, $\tau(s_s)$, and $s_t$ are (N-1) dimensional vectors with elements $t_d$, $\tau_d(s_s)$, and $s(t_d)$, respectively.
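The length of the time period that appears in the right hand side of (4.9), $s(t_d) - s_s$, can be obtained by
substituting (4.7b) into (4.8):
$$s_t - E s_s = \frac{a_1 - a_2}{a_1} (A M A_-^T)^{-1} (\overline{Q} - \underline{Q}). \qquad (4.10)$$
Hence, from (4.10) and (4.9), we can determine the initial equilibrium arrival time corresponding to $s_s$:
$$\tau(s_s) = t - (A M A_-^T)^{-1} (\overline{Q} - \underline{Q}). \qquad (4.11)$$
Thus, the equilibrium pattern $(\tau(s), Q(s))$ with the boundary condition (4.7) is given by
$$\tau(s) = \tau(s_s) + E \frac{a_1}{a_1 - a_2} (s - s_s), \qquad Q(s) = \underline{Q} + (A M A_-^T) E \frac{a_1}{a_1 - a_2} (s - s_s) \quad \forall s, \qquad (4.12)$$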
and the corresponding equilibrium disutility Td (s) is calculated by
p = (t - Es, ) a, + ( AMAT_ )"' (JQ - Q -a2) Vs . (4.13)
4.2. Equilibrium on Saturated Networks with a Many-to-One Pattern
(1) Formulation
In the following we consider networks with a many-to-one OD pattern where all nodes except
the destination are origins, i.e., there is no node that is neither origin nor destination. For the general case
where there are some nodes that are neither origin nor destination, see Appendix.
We divide the node set N into two subsets: the set of origins, $N_1$, and the set consisting of the single
destination, $N_2$. Then, we divide $A^*$, $A_-^*$, $d\tau(u)/du$ and $dQ(u)/du$ into two blocks
corresponding to $N_1$ and $N_2$, respectively:
$$A^* = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}, \quad A_-^* = \begin{bmatrix} A_{1-} \\ A_{2-} \end{bmatrix}, \quad \frac{d\tau(u)}{du} = \begin{bmatrix} d\tau_1(u)/du \\ 1 \end{bmatrix}, \quad \frac{dQ(u)}{du} = \begin{bmatrix} dQ_1(u)/du \\ -\mu_d \end{bmatrix}, \qquad (4.14)$$
where $A_1$ is an (N-1) x L matrix, $A_2$ is an L dimensional row vector, $dQ_1(u)/du$ is an (N-1)
dimensional column vector with elements $dQ_{od}(u)/du$, and $\mu_d = \sum_{ij \in L_d} \mu_{ij}$.
The elastic demand DUE employed here is defined as the state where no one can improve his/her
utility by unilaterally changing either his/her route or his/her departure/arrival-time. Since the users choose
their optimal route (conditional on having chosen their optimal departure/arrival-time) in the DUE state,
the equilibrium conditions for the route choice should be represented by the following differential
equations, as shown in Chapter 3:
$$\frac{dQ(u)}{du} = -(A^* M A_-^{*T}) \frac{d\tau(u)}{du} \quad \forall u. \qquad (4.15)$$
Rewriting this with the variables introduced in (4.14), we have
$$\frac{dQ_1(u)}{du} = -(A_1 M A_-^{*T}) \frac{d\tau(u)}{du} \quad \forall u. \qquad (4.16)$$
The condition that no user can improve his/her utility by changing his/her arrival-time (or departure-time)
in the DUE state can be represented as
$$\frac{dV_{od}(u)}{du} = 0 \quad \forall u, \forall o. \qquad (4.17)$$
Substituting the definition of the disutility function (4.2) into this, we obtain the equilibrium rate of change in
the origin departure-time as follows:
$$\frac{d\tau_o(u)}{du} = \frac{a_1 - a_2}{a_1} \quad \forall u, \forall o. \qquad (4.18)$$
Hence, the elastic DUE conditions are given by the following system of differential equations:
$$\frac{d\tau_1(u)}{du} = E \frac{a_1 - a_2}{a_1}, \qquad \frac{dQ_1(u)}{du} = -(A_1 M A_-^{*T}) \frac{d\tau(u)}{du}. \qquad (4.19)$$
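The rate (4.18) follows from (4.2) and (4.17) by a one-line differentiation, written out here as a worked step using only the symbols of (4.2):

```latex
\[
\frac{dV_{od}(u)}{du}
  = a_1\left(1 - \frac{d\tau_o(u)}{du}\right) - a_2 = 0
\;\Longrightarrow\;
\frac{d\tau_o(u)}{du} = \frac{a_1 - a_2}{a_1}.
\]
```

Because $a_1 > a_2 > 0$, this rate lies strictly between 0 and 1: users arriving later at the destination departed later from their origin, but the departure times are more compressed than the arrival times.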
It is worthwhile to compare these equilibrium conditions with those for the fixed demand case. In
the fixed demand DUE model, we tried to determine $d\tau(u)/du$ from eq.(4.15) with a given constant
vector $dQ(u)/du$. Then we encountered the indeterminacy of $d\tau(u)/du$ due to the decrease in the rank
of the matrix $A^* M A_-^{*T}$. In contrast, in the elastic demand DUE, the indeterminacy problem is resolved
since $d\tau(u)/du$ is first determined from the departure-time equilibrium condition, and then eq.(4.16)
with fixed $d\tau(u)/du$ determines $dQ(u)/du$.
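A small numerical sketch shows how fixing the departure-time rates removes the indeterminacy. It assumes a three-node network with destination node 1, origins 2 and 3, and links 2->1, 3->1 and 3->2 (the topology of the example in Section 5.2); the capacities and the parameters $a_1$, $a_2$ are hypothetical values.

```python
import numpy as np

# Hypothetical network: destination node 1, origins 2 and 3,
# links 2->1, 3->1, 3->2. Rows are ordered [node 2, node 3, node 1].
A1 = np.array([[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]])   # origin block
A2 = np.array([[-1.0, -1.0, 0.0]])                   # destination block
A_star = np.vstack([A1, A2])
A_star_minus = np.where(A_star < 0, A_star, 0.0)

mu = np.array([3.0, 2.0, 1.0])   # assumed link capacities
a1, a2 = 2.0, 1.0                # assumed disutility parameters, a1 > a2
M = np.diag(mu)

# Departure-time equilibrium fixes the rates at the origins (4.18);
# the rate at the destination is 1 by definition.
dtau = np.array([(a1 - a2) / a1, (a1 - a2) / a1, 1.0])
dQ1 = -A1 @ M @ A_star_minus.T @ dtau                # eq. (4.16)
```

The two demand rates add up to the total capacity of the links entering the destination, as flow conservation at the destination requires.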
(2) Solution
As in the case of the one-to-many OD pattern, we can obtain the solution $(\tau(u), Q(u))$ for the differential
equation (4.19) by giving appropriate boundary conditions. For the boundary conditions, we first set the
initial time $u_s$ of the time period (measured with respect to the destination arrival-time) during which
eq.(4.19) holds (i.e. the networks can be regarded as "saturated networks" and all OD pairs have positive
OD flows). Then, in parallel with the discussion in 4.1, it is natural to give the value of cumulative OD
flows for $u_s$ and for the final time $t_d$:
$$Q_{od}(u_s) = \underline{Q}_{od} = \text{given} \quad \forall o, \qquad (4.20a)$$
$$Q_{od}(t_d) = \overline{Q}_{od} = \text{given} \quad \forall o. \qquad (4.20b)$$
The conditions (4.20) in conjunction with (4.19) can be solved with respect to $Q(u)$. However, these
conditions are not enough to determine the value of $\tau$. Hence, instead of (4.20a), we give the time
needed to travel from origin o to the destination at the initial time $u_s$ as a new boundary condition:
$$u_s - \tau_o(u_s) = r_{od} = \text{given} \quad \forall o. \qquad (4.20c)$$
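Integrating the second equation of (4.19) from time u to $t_d$ with the terminal condition (4.20b), we have
$$Q_1(u) = \overline{Q} + (A_1 M A_-^{*T}) \frac{d\tau(u)}{du} (t_d - u) \quad \forall u, \qquad (4.21)$$
where $d\tau(u)/du$ is the constant vector determined by (4.14) and (4.18).
We next solve (4.19) with respect to $\tau$. Integrating the first equation of (4.19) from time $u_s$ to time u
with the initial condition (4.20c), we obtain
$$\tau_1(u) = E u_s - r + E \frac{a_1 - a_2}{a_1} (u - u_s) \quad \forall u, \qquad (4.22)$$
where r is the vector with elements $r_{od}$, and the corresponding equilibrium disutility $\rho$ is calculated by
$$\rho = a_2 (t_d - u_s) E + a_1 r. \qquad (4.23)$$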
5. PARADOXES
Having derived the formulae for the solution of the dynamic traffic equilibrium assignment,
we can now discuss the capacity increasing paradox. The paradox presented here is a
situation in which improving the capacity of a certain link on a network worsens the total travel cost
over the network; this is a dynamic version of Braess's paradox, which is well known in static
assignment. Using the results obtained in Chapters 3 and 4, we derive the necessary conditions
for the occurrence of the paradox for E-net and M-net, which are shown to be significantly
different.
5.1. A Paradox for a Network with a One-to-Many OD Pattern
We consider the paradox for the network shown in Fig. 5.1, where node 1 is the unique origin;
nodes 2 and 3 are destinations; the maximum departure rate of link a (a = 1, 2, 3) is given by $\mu_a$.
Fig. 5.1. Example Network with Single Origin and Two Destinations
For brevity of notation, we employ the superscript "·" as the derivative operation with respect
to origin departure-time s in this section (e.g. $\dot{\tau}_i(s) = d\tau_i(s)/ds$, $\dot{Q}_{od}(s) = dQ_{od}(s)/ds$).
(1) Fixed Demand Case

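For the network in Fig. 5.1, the origin (i.e. node 1) should be the reference node; the
incidence matrix $A^*$, the reduced incidence matrix $A$, and the corresponding $A_-$ are given as
follows:
$$A^* = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & -1 \end{bmatrix}, \quad A = \begin{bmatrix} -1 & 0 & 1 \\ 0 & -1 & -1 \end{bmatrix}, \quad A_- = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & -1 \end{bmatrix}. \qquad (5.1)$$
Hence,
$$A M A_-^T = \begin{bmatrix} \mu_1 & -\mu_3 \\ 0 & \mu_2 + \mu_3 \end{bmatrix}. \qquad (5.2)$$
The equilibrium pattern for the vehicles with departure time s from the single origin can be
calculated using the results of Chapter 3. From (3.6), we first obtain the rate of change in
equilibrium arrival time:
$$\dot{\tau}_2(s) = \frac{1}{\mu_1} \left\{ \dot{Q}_{12}(s) + \frac{\mu_3}{\mu_2 + \mu_3} \dot{Q}_{13}(s) \right\}, \quad \dot{\tau}_3(s) = \frac{1}{\mu_2 + \mu_3} \dot{Q}_{13}(s). \qquad (5.3)$$
Substituting these into (3.3), we have the following equilibrium link flow pattern:
$$y_1(s) = \dot{Q}_{12}(s) + \frac{\mu_3}{\mu_2 + \mu_3} \dot{Q}_{13}(s), \quad y_2(s) = \frac{\mu_2}{\mu_2 + \mu_3} \dot{Q}_{13}(s), \quad y_3(s) = \frac{\mu_3}{\mu_2 + \mu_3} \dot{Q}_{13}(s). \qquad (5.4)$$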
To discuss the "capacity increasing paradox", we employ the total travel time for the users
departing from an origin from time 0 to T as an indicator for measuring the efficiency of the
network flow pattern:
s
X f y* =Z f Q (5.5)
We then call the situation a "paradox" if increasing the capacity of a certain link, $\mu_a$, causes an
increase of TC (i.e. $dTC/d\mu_a > 0$ implies "paradox").
Let us examine whether the paradox arises for the network in Fig. 5.1. Substituting
(5.3) into (5.5), we obtain TC:
(5.6)
From (5.6), we easily see that an increase of $\mu_1$ or $\mu_2$ always decreases TC (note that both $\mu_1$
and $\mu_2$ appear only in the denominator of TC), that is, the paradox does not arise for links 1 and 2.
Increasing $\mu_3$, however, causes the paradox. The reason is that since
if the condition:
S
'° ' 2 ^ " > -*0 '3 ^ " * (5.8)
is always positive , this means the occurrence of the paradox.

Condition (5.8) is the condition under which the paradox occurs for a certain time period from 0 to T. From
this, we can also derive the condition under which the paradox occurs for an arbitrary time period:
$$\frac{\dot{Q}_{12}(s)}{\mu_1} > \frac{\dot{Q}_{13}(s)}{\mu_2} \quad \forall s. \qquad (5.9)$$

The meaning of this inequality is simple. Since an increase of $\mu_3$ always results in an increase of
$y_3$ (see (5.4)), suppose the flow on link 3 ($= y_3$) increases by 1 unit. This means that the number of
users with destination 3 who pass through link 1 increases by 1 unit. The increase in flow on link 1
then causes an increase of $\dot{Q}_{12}(s)/\mu_1$ in total travel time for the users with destination 2 ("User-2").
On the other hand, total travel time for the users with destination 3 ("User-3") decreases by
$\dot{Q}_{13}(s)/\mu_2$, since the flow on link 2 decreases by 1 unit. Therefore, the 1 unit increase in flow on
link 3 increases total travel time by $\dot{Q}_{12}(s)/\mu_1 - \dot{Q}_{13}(s)/\mu_2$. Thus, we see that (5.9)
is the condition that the "net benefit" for User-2 and User-3 (User-3's benefit minus User-2's
loss) due to the increase of $\mu_3$ becomes negative.
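The link between the sign of $dTC/d\mu_3$ and the inequality (5.9) can be checked numerically under simplifying assumptions. The sketch below assumes constant demand rates, travel times at s = 0 that do not depend on the capacities, and a travel-time indicator of the form $\sum_d \int_0^T \{\tau_d(s) - s\} \dot{Q}_{1d}(s)\,ds$, which may differ in detail from (5.5). The arrival-time rates are those obtained by solving (3.6) for the network of (5.1). All function names and numbers are hypothetical.

```python
def tc_rate(mu1, mu2, mu3, q12, q13):
    """Capacity-dependent part of total travel time, per unit of T**2 / 2.

    Assumes constant demand rates q12, q13 and travel times at s = 0 that
    do not depend on the capacities (both are assumptions of this sketch).
    """
    dtau2 = (q12 + mu3 * q13 / (mu2 + mu3)) / mu1   # arrival-time rate, node 2
    dtau3 = q13 / (mu2 + mu3)                       # arrival-time rate, node 3
    # Travel time of destination-d users grows at rate (dtau_d - 1).
    return q12 * (dtau2 - 1.0) + q13 * (dtau3 - 1.0)

def paradox_by_condition(mu1, mu2, q12, q13):
    """Condition (5.9): User-2's loss exceeds User-3's gain."""
    return q12 / mu1 > q13 / mu2

def paradox_by_finite_difference(mu1, mu2, mu3, q12, q13, step=1e-6):
    """True if a small increase of mu3 increases the travel-time indicator."""
    return tc_rate(mu1, mu2, mu3 + step, q12, q13) > tc_rate(mu1, mu2, mu3, q12, q13)
```

For example, with capacities (1, 4, 1) and demand rates (1, 2) both functions report a paradox, while lowering the first demand rate to 0.2 makes both report none.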
(2) Elastic Demand Case

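The equilibrium pattern for the network in Fig. 5.1 can be calculated from the results of
Chapter 4. From (4.12), we first obtain the equilibrium arrival times and OD flows:
$$\tau_d(s) = \tau_d(s_s) + \frac{a_1}{a_1 - a_2} (s - s_s) \quad (d = 2, 3), \qquad (5.10)$$
$$Q_{12}(s) = Q_{12}(s_s) + \frac{a_1}{a_1 - a_2} (\mu_1 - \mu_3)(s - s_s), \quad Q_{13}(s) = Q_{13}(s_s) + \frac{a_1}{a_1 - a_2} (\mu_2 + \mu_3)(s - s_s). \qquad (5.11)$$
Writing $\hat{Q}_{od}$ for the difference between the two boundary values of $Q_{od}$ given in (4.7), (4.13) gives the
equilibrium disutility for each destination:
$$\rho_2 = a_1 (t_2 - s_s) + (a_1 - a_2) \frac{\hat{Q}_{12}}{\mu_1 - \mu_3}, \quad \rho_3 = a_1 (t_3 - s_s) + (a_1 - a_2) \frac{\hat{Q}_{13}}{\mu_2 + \mu_3}. \qquad (5.12)$$
We define the sum of the disutility experienced by all users over the network, TC, as an indicator for measuring
the efficiency of the network usage:
$$TC \equiv \sum_d \rho_d \hat{Q}_{od}. \qquad (5.13)$$
The TC for the network in Fig. 5.1 is given by
$$TC = a_1 \{ (t_2 - s_s) \hat{Q}_{12} + (t_3 - s_s) \hat{Q}_{13} \} + (a_1 - a_2) \left\{ \frac{\hat{Q}_{12}^2}{\mu_1 - \mu_3} + \frac{\hat{Q}_{13}^2}{\mu_2 + \mu_3} \right\}. \qquad (5.14)$$
To check the occurrence of the paradox, we calculate $dTC/d\mu_3$:
$$\frac{dTC}{d\mu_3} = (a_1 - a_2) \left\{ \left( \frac{\hat{Q}_{12}}{\mu_1 - \mu_3} \right)^2 - \left( \frac{\hat{Q}_{13}}{\mu_2 + \mu_3} \right)^2 \right\}. \qquad (5.15)$$
Note that the capacity of link 1 should be greater than that of link 3 (i.e. $\mu_1 > \mu_3$) in order for (5.11) to
satisfy the (physically evident) condition $Q_{12}(s) - Q_{12}(s_s) > 0$. Hence $dTC/d\mu_3 > 0$ holds only if
$$\frac{\hat{Q}_{12}}{\mu_1 - \mu_3} > \frac{\hat{Q}_{13}}{\mu_2 + \mu_3}. \qquad (5.16a)$$
We see from (5.16a) that the paradox arises (with the capacity increase of link 3) independent of the value
of $\mu_3$ if the following condition holds:
$$\frac{\hat{Q}_{12}}{\mu_1} > \frac{\hat{Q}_{13}}{\mu_2}. \qquad (5.16b)$$
It is noteworthy that the condition (5.16b) is identical in form to the condition for the fixed demand case.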
5.2. A Paradox for a Network with a Many-to-One OD Pattern
We consider the paradox for the network in Fig. 5.2, where node 1 is the unique destination;
nodes 2 and 3 are origins; the maximum departure rate of link a (a = 1, 2, 3) is given by $\mu_a$.
For brevity of notation, we employ the superscript "·" as the derivative operation with respect
to destination arrival time u in this section (e.g. $\dot{\tau}_i(u) = d\tau_i(u)/du$, $\dot{Q}_{od}(u) = dQ_{od}(u)/du$).
Fig. 5.2. Example Network with Two Origins and Single Destination
(1) Fixed Demand Case

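For the network in Fig. 5.2, node 3 is the pure origin; we divide the incidence matrix $A^*$, the
corresponding $A_-^*$ and the OD flow vector as follows:
$$A_1 = [\,0 \;\; 1 \;\; 1\,] \ (\text{node 3}), \quad A_{1-} = [\,0 \;\; 0 \;\; 0\,], \quad \dot{Q}_1(u) = \dot{Q}_{31}(u),$$
$$A_2 = \begin{bmatrix} -1 & -1 & 0 \\ 1 & 0 & -1 \end{bmatrix} \begin{matrix} (\text{node 1}) \\ (\text{node 2}) \end{matrix}, \quad A_{2-} = \begin{bmatrix} -1 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \quad \dot{Q}_2(u) = \begin{bmatrix} -(\mu_1 + \mu_2) \\ \dot{Q}_{21}(u) \end{bmatrix}. \qquad (5.17)$$
Hence,
$$A_2 M A_{2-}^T = \begin{bmatrix} \mu_1 + \mu_2 & 0 \\ -\mu_1 & \mu_3 \end{bmatrix}. \qquad (5.18)$$
The equilibrium pattern for the vehicles with arrival time u at the single destination can be
calculated from the results in Chapter 3. From (3.16), we first obtain the rate of change in
equilibrium arrival time for nodes 1 and 2:
$$\dot{\tau}_1(u) = 1, \quad \dot{\tau}_2(u) = \frac{\mu_1 - \dot{Q}_{21}(u)}{\mu_3}. \qquad (5.19)$$
Substituting these into (3.10) yields the link flow rates (with respect to u):
$$y_1(u) = \mu_1, \quad y_2(u) = \mu_2, \quad y_3(u) = \mu_1 - \dot{Q}_{21}(u). \qquad (5.20)$$
Note that this flow pattern is significantly different from that for the reversed network (see (5.4)).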
In order to determine the rate of change in equilibrium arrival time for node 3 (= the pure
origin), an additional condition is required. Here we assume for node 3 that the OD flow
rate measured at the origin, <?31 = dQ3l (u) I dr3 (u) = Q3} (u) I f3 (u), is given. On the other hand,
the OD flow rate measured at the destination, q3] = Q3l (u) , is determined from (3.15a):
031 (") = ^ 1 + ^ 2 - 021 («)

(5.21)
Substituting this into the definitional relationship between and qod :
qod(u) du dT0(u) du ^
we obtain the rate of change in equilibrium arrival time at node 3:
~ f \ (5.22)
q3](u)
Defining the total travel time for the users arriving at the destination from time 0 to T as an
indicator for measuring the efficiency of the network flow pattern:
TC
* (5.23)
let us examine whether the paradox arises or not in the network in Fig. 5.2. Substituting (5.19),
(5.21) and (5.22) into (5.23), we obtain the TC for this network:
(5.24)
du
where $Q_{od}(u) = \int q_{od}(u)\,du$. We see from this equation that an increase in $\mu_1$ or $\mu_2$ will
always decrease TC; the paradox does not arise for links 1 and 2. However, an increase in the
capacity of link 3 always results in the occurrence of the paradox. This fact can be easily
verified as follows. Calculating the derivative of TC with respect to $\mu_3$, we have
Note that $\dot{\tau}_2(u)$ should be positive in the DUE state. The reason is that if $\dot{\tau}_2(u)$ is not positive,
the users with the destination arrival time $u' > u$ must depart from their origin before the users with
arrival time u, and this contradicts the assumption that the state is in the DUE. Therefore, from
(5.25) and the fact that $\dot{\tau}_2(u) > 0$ for any u, the inequality $dTC/d\mu_3 > 0$ always holds; we see
that the paradox for link 3 takes place without any additional conditions.
(2) Elastic Demand Case

The equilibrium pattern for the network in Fig. 5.2 can be calculated from the results of
Chapter 4. For the network in Fig.5 .2, the matrices A , M A \_ and A2 MA 2_ defined in 42 are
oi T (5.26)
Hence, from (4.21) and (4.22), we obtain the equilibrium arrival times and OD flows:
/ -, a^-a2 a2 L a, /, J , \ a,-a 2 a2 L a, /„L J

^(«) = -a-L" a+ — \usa -- Lr2l(u,)\, T3(u) =a ^-2a-u + -±\u
a
s -- r 31 (wJ (5.27)
\ \ ( 2 I \ \ [ 2 }
, ,
We also get the equilibrium disutility from (4.23):
$\rho_2 = \alpha_2 (t - u_s) + \alpha_1 \tau_{21}(u_s)$, $\rho_3 = \alpha_2 (t - u_s) + \alpha_1 \tau_{31}(u_s)$. (5.29)
Let us define the sum of disutility experienced by all users over a network, TC, as an indicator for
measuring the efficiency of the network usage:
$TC \equiv \sum_o \rho_o Q_{od}$. (5.30)
Substituting (5.29) into the definition (5.30), we get the TC for the network in Fig. 5.2:
(5.31)
To check the occurrence of the paradox, we calculate $dTC/d\mu_3$:
$\frac{dTC}{d\mu_3} = \alpha_1 (\alpha_1 - \alpha_2)(t - u_s)\{\tau_{31}(u_s) - \tau_{21}(u_s)\}$. (5.32)
Note that the relationship
$\tau_{31}(u_s) > \tau_{21}(u_s)$ (5.33)
or equivalently,
$\tau_2(u_s) > \tau_3(u_s)$ (5.34)
should hold as long as the network in Fig. 5.2 is a saturated network. This can be proved
by contradiction: consider two users with origins 2 and 3, denoted as U2 and U3, who arrive at
the destination at the same time $u_s$; suppose that (5.34) does not hold; this implies that
U2 leaves his origin earlier than U3 does, which clearly contradicts the assumption of the
saturated network. Thus, from (5.32) and (5.33), we see that $dTC/d\mu_3 > 0$ always holds; in
other words, the occurrence of the paradox is inevitable when the capacity of link 3 is
increased. It is worth noting that we eventually obtained the same result as in the fixed
demand case.
6. Concluding Remarks
This paper discussed a capacity increasing paradox under a dynamic equilibrium assignment
with elastic OD demands: the paradox is a situation in which improving the capacity of a certain
link on a network worsens the total travel cost over the network. Our analysis of simple networks
disclosed that the paradox arises only under a particular condition for a network with a one-to-many
OD pattern, while the corresponding paradox always arises for the reversed network with a
many-to-one OD pattern. This asymmetrical result cannot be seen in the classical static
assignment framework; it is particular to the dynamic assignment with queues. Furthermore, we
showed that this property holds not only for the assignment with fixed OD demands but also for the
assignment with elastic OD demands.
In this paper, particular simple networks were employed to demonstrate the paradox. Note,
however, that the examples presented here are not the exceptional ones that can hardly be observed
in practical situations but the ones that can be seen universally if we regard the example networks
as a macroscopic representation of real road networks. Therefore, we think that the examples,
despite their simplicity, describe one of the essential points that should be considered in deciding
practical traffic management operations such as ramp metering or addition of lanes in freeways.
We recognize that there are still several relevant topics to be studied. First, we should extend
our analysis to the paradox in a more complex network by exploiting the analytical formula of the
DUE solution derived in this paper; it may be possible to obtain systematic methods for general
networks that detect (without computing the equilibrium patterns) the links where the paradox
takes place; the exploration of this possibility would be an interesting future topic. Secondly, we
should analyze more realistic cases where the assumption of "saturated networks" is relaxed; the
exploration would be achieved by employing not only the analytical approach just as shown in this
paper but also the numerical approach based on the recent convergent algorithms for the DUE
assignment (see Akamatsu (1998)). Finally, we should explore the case where physical queues
are explicitly incorporated into the analysis. Though the incorporation of physical queues may
cause very complex phenomena as shown in Daganzo (1998), comprehensive studies on this topic
would be indispensable for a clear understanding of the properties of dynamic network flows.
Acknowledgements
The authors gratefully acknowledge stimulating discussions with Nozomu Takamatsu on the topic
of this paper. Thanks are also due to Benjamin Heydecker and three anonymous referees for their
helpful comments and suggestions.
REFERENCES
Akamatsu T. (1996). The Theory of Dynamic Traffic Network Flows, Infrastructure Planning
Review 13, 23-48.
Akamatsu T. (1998). An Efficient Algorithm for Dynamic User Equilibrium Assignment for a One-
to-Many OD Pattern, submitted to Transportation Science.
Akamatsu T. and M. Kuwahara (1994). Dynamic User Equilibrium Assignment on Over-saturated
Road Networks for a One-to-Many / Many-to-One OD Pattern, JSCE Journal of
Infrastructure Planning and Management IV-23, 21-30.
Arnott R., De Palma A., and R. Lindsey (1993). Properties of Dynamic Traffic Equilibrium
Involving Bottlenecks, Including a Paradox and Metering, Transportation Science 27,
148-160.
Bernstein D., T.L. Friesz, R.L. Tobin, and B.W. Wie (1993). A Variational Control Formulation
of the Simultaneous Route and Departure Choice Equilibrium Assignment, Proc. of
the 12th International Symposium on Transportation and Traffic Theory, 107-126.
Braess D. (1968). Über ein Paradoxon aus der Verkehrsplanung, Unternehmensforschung 12, 258-268.
Daganzo C.F. (1997). Fundamentals of Transportation and Traffic Operations, Elsevier Science,
Oxford.
Daganzo C.F. (1998). Queue Spillovers in Transportation Networks with a Route Choice,
Transportation Science 32, 1-11.
Hendrickson C. and G. Kocur (1981). Schedule Delay and Departure Time Decisions in a
Deterministic Model, Transportation Science 15, 62-77.
Heydecker B.G. and J.D. Addison (1996). An Exact Expression of Dynamic Traffic Equilibrium,
In J.-B. Lesort (Ed.) Proc. of the 13th International Symposium on Transportation and
Traffic Theory, 359-383.
Kuwahara M. (1990). Equilibrium Queueing Patterns at a Two-Tandem Bottleneck during the
Morning Peak, Transportation Science 24, 217-229.
Kuwahara M. and T. Akamatsu (1993). Dynamic Equilibrium Assignment with Queues for a
One-to-Many OD Pattern. In C. Daganzo (Ed.) Proc. of the 12th International
Symposium on Transportation and Traffic Theory, 185-204.
Kuwahara M. and G.F. Newell (1987). Queue Evolution on Freeways Leading to a Single Core
City during the Morning Peak. Proc. of the 10th International Symposium on
Transportation and Traffic Theory, 21-40.
Murchland J.D. (1970). Braess's Paradox of Traffic Flow. Transportation Research 4, 391-394.
Smith M.J. (1978). In a Road Network, Increasing Delay Locally Can Reduce Delay Globally.
Transportation Research 12B, 419-422.
Smith M.J. (1993). A New Dynamic Traffic Model and the Existence and Calculation of Dynamic
User Equilibria on Congested Capacity-constrained Road Networks. Transportation
Research 27B, 49-63.
Vickrey, W.S. (1969). Congestion Theory and Transportation Investment. American Economic Review 59.
Yang H. and M.G.H. Bell (1998). A Capacity Paradox in Network Design and How to Avoid It.
Transportation Research 32A, 539-545.
Appendix
In Chapter 4 it is assumed that all the OD pairs have always positive flows during the
period of analysis. In this appendix, we briefly demonstrate how the formulation can be
extended to the case where some OD pairs have no OD flows. The formulations for the one-to-
many OD problem and the many-to-one are presented in turn.
(1) One-to-Many OD pattern

We first divide the node set N (where the origin is excluded as a reference node) into two
subsets: the set of destinations with positive OD flows, $N_1$, and the set of the other destinations, $N_2$.
Then, we divide $\mathbf{A}$, $\mathbf{A}_-$, $d\boldsymbol{\tau}(s)/ds$ and $d\mathbf{Q}(s)/ds$ into the two blocks corresponding to $N_1$ and $N_2$,
respectively:
~dTi(sY ~dQi(s)~ r / xn
ctr(s dO(s] î V /
A= A =
A,. v /} as/ as (A-l)
dQ2(s) ds
ds dT2(S) ' ds o
ds ds
For the destinations with positive OD flows, the arrival times are governed by the departure-
time equilibrium condition (4.5):
=E (A-2)
ds
For the other destination nodes, the arrival times should be determined from the route choice
equilibrium condition (3.6). Rewriting the condition (3.6) with the variables defined above,
ds (A-3)
dr2(s)
0
ds
or equivalently,
(A-4a)
ds ds ds
0 = (A-4b)
ds ds
= dt7(s) ft , . . r V i / ' . , , A r V , #1
E- = -(A 2 MA 2 _) (A 2 MA,_jE —,
«57
a, — a2 asJ \ ^ ^ ~ / \
a, -a-,
' i l
(A-5)
r r r
= fA,MA, _)- (A,MA2 _ \A2MA2_)"' (A2MA, _
ds
(2) Many-to-One OD pattern

We first divide the node set N into two subsets: the set of origins with positive OD flows, $N_1$,
and the set of the other nodes (including the destination), $N_2$. Then, we divide $\mathbf{A}$, $\mathbf{A}_-$, $d\boldsymbol{\tau}(u)/du$
and $d\mathbf{Q}(u)/du$ into the two blocks corresponding to $N_1$ and $N_2$, respectively:
~drt(u}~ fdQiMl
A dr(u) du dQ(u) du
A-| '.A = (A-6)
A, " A , du dr2(u) ' du dQ2(u)
du du
where $d\mathbf{Q}_1(u)/du$ is a column vector with element $dQ_{od}(u)/du$, and $d\mathbf{Q}_2(u)/du$ is a column vector
whose element is $\mu_d = \sum_k \mu_{kd}$ if it corresponds to the destination, and zero otherwise.
For the origins with positive OD flows, the departure times are governed by the
departure-time equilibrium condition (4.18):
:-!—2-
s-hf If/ I /-f sy
(A-?)
du a}
For the other destination nodes, the arrival/departure times should be determined from the
route choice equilibrium condition (3.12). Rewriting the condition (3.12) with the variables
defined above,
fdQiMl "rfE,(«)~
du
dQ2(u)
du
[ A, 1 r T r '
A;JM[A[. AL.
du
dr2(u)
du
(A-8)
or equivalently,
(A-9a)
du du
v
(A-9b)
du
du a, du a, du
(A-10)
du
du
CHAPTER 5
TRAFFIC ASSIGNMENT
Making the simple complicated is commonplace; making the complicated
simple, awesomely simple, that's creativity.
Everything should be made as simple as possible, but not simpler.

(Albert Einstein)
To get nowhere, follow the crowd.

A Dynamic Traffic Assignment Formulation that
Encapsulates the Cell-Transmission Model
Hong K. Lo
Department of Civil Engineering, Hong Kong University of Science and Technology
Clear Water Bay, Hong Kong. E-mail: cehklo@ust.hk
ABSTRACT
This study develops an analytical dynamic traffic assignment (DTA) formulation based on a
dynamic extension of Wardrop's Principle, referred to as dynamic user optimal (DUO) (Ran
and Boyce, 1996). We develop a gap function for the corresponding nonlinear complementarity
problem (NCP) and prove that minimizing the gap function produces a solution that fulfills the
ideal DUO conditions. Existing analytical DTA formulations mostly use macroscopic link
travel time functions to model traffic. In general it is difficult for such functions to capture
traffic interactions across multiple links such as queue spill-back and dynamic traffic
phenomena such as shock-wave. Instead, traffic in this formulation is modeled after the Cell-
Transmission Model (CTM) (Daganzo, 1994, 1995a). CTM provides a convergent
approximation to the Lighthill and Whitham (1955) and Richards (1956) (LWR) model and
covers the full range of the fundamental diagram. This study transforms CTM in its entirety to a
set of mixed-integer constraints. The significance of this is that it opens up CTM to a wide
range of dynamic traffic optimization problems, such as the DUO formulation developed
herein, dynamic signal control, and possibly other applications.
1. INTRODUCTION
Existing Dynamic Traffic Assignment (DTA) models basically follow two approaches:
simulation and mathematical formulation. The first approach emphasizes microscopic traffic
flow characteristics. Strict adherence to traffic assignment principles, such as Wardrop's, is
secondary. Earlier generations of this approach used intersection-turning ratios to split traffic
without route specification. Recent models specify route choices based on the k-shortest paths
criteria, which was then extended to the concept of "bounded rationality" for dynamic route
switching (Chang and Mahmassani, 1988). This approach shares the following properties.
Firstly, they are essentially descriptive, not prescriptive tools. They simulate the probable
results of a certain traffic management strategy but do not prescribe what the strategy ought to
be. Secondly, they lack well-defined solution properties. One cannot prove whether the solution
has achieved the required optimality. In each computer simulation, the model produces a
realization out of a large space of probable realizations. Therefore, one must be careful in
generalizing or transferring results.
DTA models can also be developed through an analytical approach (examples: Ran and Boyce,
1996; Ran et al. 1996; Lo, et al., 1996a; Jayakrishnan et al., 1995; Janson, 1991; Friesz et al.,
1993; Smith, 1993; Wie et al., 1990). This approach often has well-defined properties, in terms
of optimality conditions and adherence to a dynamic version of Wardrop's principle (1952).
Depending on how the objective functions are defined, these models may be used for
prescriptive or descriptive purposes. The main difficulty with the analytical approach is adding
realistic traffic dynamics to already complicated formulations. For this reason, most DTA
formulations use macroscopic link travel time functions (e.g., variations of the BPR function)
to approximate traffic dynamics. This lack of accurate traffic dynamics is a shortcoming.
Daganzo (1995b) pointed to the potential problem of link travel time functions under dynamic
loads. Our studies (Lo et al., 1996b, 1996c) confirmed that traffic dynamics is too important to
be replaced or simplified by a macroscopic link travel time function.
This study develops a DTA formulation to overcome this shortcoming. The formulation has
well-defined solution properties and follows a dynamic extension of Wardrop's Principle,
referred to as dynamic user optimal (DUO) (Ran and Boyce, 1996). It models traffic dynamics
by encapsulating the network version of the Cell-Transmission Model (CTM) (Daganzo, 1994,
1995a). CTM provides a convergent numerical approximation to the Lighthill and Whitham
(1955) and Richards (1956) (LWR) model and covers the full range of the fundamental
diagram. This property makes it a suitable platform for modeling dynamic traffic. However,
incorporating CTM as part of the constraints of a mathematical formulation is not
straightforward, as we will see in Section 3.
Generally, traffic assignment formulations use four approaches: (i) mathematical programming
(MP) (example, Sheffi, 1985); (ii) variational inequality (VI) (examples, Dafermos, 1980;
Nagurney, 1993; extensions to dynamic traffic: Ran and Boyce, 1996; Friesz et al. 1993); (iii)
nonlinear complementarity problem (NCP) (Aashtiani, 1979); and (iv) fixed-point problem
(FPP) (Asmuth, 1978). Lin and Lo (1998) discussed a potential problem of extending the MP
approach for dynamic traffic assignment. Two summaries of these approaches are annotated by
Patriksson (1994) and Florian and Hearn (1995). They showed the linkages and equivalence
conditions among these approaches. One common requirement for all formulations is the
demand constraint (i.e., the sum of path flows per origin-destination (OD) pair equals the given
OD demand or follows the demand function). For this reason, the path flow variables are
included in most formulations as part of the constraint set [1]. In problems where the path costs
are additive in the link costs, one can use the link-path incidence matrix to express the path cost
as the sum of link costs on the path. Hence, one can avoid the path flow variables in the
objective function. Subsequently, by using a column generation method in the solution, as in
most of the widely used algorithms, one can solve the problem without the need to store the
path flows (see Patriksson, 1994, and Florian and Hearn, 1995, for a summary of these
algorithms). This is a marked advantage, especially for large networks in which the number of
paths far exceeds the number of links. The outputs of these algorithms are mostly
expressed in link flows, although path flows would also be available if the paths generated in
each column generation procedure are stored.
[1] One can avoid the use of the path flow variables totally by defining the link flow variables by origin or by
destination and maintaining multiple copies of the node and link flow conservation constraints to form a
"link-based" multi-commodity flow problem. See LeBlanc et al. (1975) for example.
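The role of the link-path incidence matrix in the additive case can be illustrated with a small sketch. The network, link costs, and path flows below are hypothetical values chosen only for illustration; they do not come from this paper.

```python
# Hypothetical 4-link network with 3 paths serving one OD pair.
# delta[a][p] = 1 if link a lies on path p, else 0 (link-path incidence matrix).
delta = [
    [1, 1, 0],  # link 0 is used by paths 0 and 1
    [1, 0, 0],  # link 1 is used by path 0
    [0, 1, 1],  # link 2 is used by paths 1 and 2
    [0, 0, 1],  # link 3 is used by path 2
]
link_cost = [2.0, 3.0, 1.5, 4.0]  # assumed link travel costs
path_flow = [10.0, 5.0, 20.0]     # assumed path flows


def path_costs(delta, link_cost):
    """Additive path cost: sum of the costs of the links on each path."""
    n_links, n_paths = len(delta), len(delta[0])
    return [sum(delta[a][p] * link_cost[a] for a in range(n_links))
            for p in range(n_paths)]


def link_flows(delta, path_flow):
    """Link flow: sum of the flows of the paths that traverse each link."""
    n_links, n_paths = len(delta), len(path_flow)
    return [sum(delta[a][p] * path_flow[p] for p in range(n_paths))
            for a in range(n_links)]


costs = path_costs(delta, link_cost)
flows = link_flows(delta, path_flow)

# With additive costs, total system cost is the same in path or link terms,
# which is why the objective can be written without path flow variables.
total_by_path = sum(c * f for c, f in zip(costs, path_flow))
total_by_link = sum(c * v for c, v in zip(link_cost, flows))
```

When path costs are nonadditive, the last identity no longer holds, which is one of the motivations listed for path-based formulations.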
Recently, there has been a resurgence of interest in reformulating or solving the traffic assignment
problem directly with path flows (examples: Jayakrishnan et al., 1994; Cascetta et al., 1997; Bell et al.,
1997; Gabriel and Bernstein, 1997; Lo and Chen, 1998). This redirection of effort can be
summarized by four motivations:
(i) for some types of problems, obtaining the solution in path flows (but without exhaustive
path enumeration) is faster (Chen and Jayakrishnan, 1998);
(ii) where route preference [2] is an important attribute, or where turning restrictions are common
and hence restrict the route choices substantially (e.g., downtown), formulating and solving
the problem with path flows is more convenient;
(iii) where the path costs are nonadditive (i.e., the path cost is not the direct sum of link costs), it
is difficult, if not impossible, to formulate and solve the problem with link flows alone
(Gabriel and Bernstein, 1997; Lo and Chen, 1998);
(iv) in assignment models that explicitly consider queuing, as in this study, one must track the
paths of the spill-back queues, which may extend over multiple links. Moreover, path flow
provides important information to model traffic at diverges and merges.
For these reasons, we develop this DTA formulation based on the path-flow variables. The
possible paths between each OD pair are either predefined or generated through a column
generation procedure in the solution algorithm. In practice, the predefined paths could be based
on travelers' preferences or interview results. The formulation will then equilibrate traffic flows
according to the DUO principle. That is, all the used paths between the same OD pair will have
equal travel time; while the unused ones will have equal or higher travel times. We will
delineate this formulation and discuss its optimality conditions in Section 2.
2. DYNAMIC TRAFFIC ASSIGNMENT FORMULATION
We consider a general transportation network with multiple origin-destination (OD) flows. The
traffic network is represented by a set of cells and directed links. Road segments are
represented by a series of cells that have physical length, while links are there merely to
delineate connectivity between cells. As such, links have no physical length and cannot hold
traffic. Traffic begins at an origin cell (denoted as r) and terminates at a destination cell
(denoted as s). We consider the fixed time period [0, T] that is long enough to allow all
vehicles to complete their trips.
[2] Such as route guidance in which users have specific route preferences, such as choosing a safe, scenic, or
highway-based route whose distance or travel time may not necessarily be the shortest.
Generally, two dynamic user optimal (DUO) conditions have been defined in the literature
(Ran and Boyce, 1996): Ideal and Instantaneous DUO. This paper focuses on the conditions of
Ideal DUO, restated as: for each origin-destination pair at each instant of time, the actual travel
times experienced by travelers departing at the same time are equal and minimal. Possible
scenarios where this could happen include (i) commuting traffic, where the patterns are
reproduced every day and commuters can therefore adjust their route choices over many days
of experience until they can improve no further; (ii) a future in which the techniques of
traffic detection and prediction have become accurate.
The Ideal DUO condition implies that a path p between OD pair r-s will not be used at time t if
its actual travel time is longer than the shortest travel time between r-s. Conversely, any used
path p at time t must have its travel time equal to the shortest travel time between r-s.
Mathematically, they are:
VrstRS (2)
" (3)
$\mathbf{f} \ge 0,\ \mathbf{u} \ge 0$ (4)
The notations are the following:
$RS$             set of OD pairs for the whole network
$rs$             an OD pair, $rs \in RS$
$P^{rs}$         set of paths connecting OD pair rs
$p$              a path between an OD pair, $p \in P^{rs}$
$f_p^{rs}(t)$    path flow on p between OD pair rs departing at t
$\mathbf{f}$     set of $\{f_p^{rs}(t)\}$ with dimension $n_1 = \sum_{rs} |P^{rs}| \times n_3$, where $n_3$ is the
                 discretized time dimension
$\eta_p^{rs}(t)$ path cost (or disutility) on path p between OD pair rs for flows departing at t
$\boldsymbol{\eta}$  set of $\{\eta_p^{rs}(t)\}$ with dimension $n_1 = \sum_{rs} |P^{rs}| \times n_3$
$\pi^{rs}(t)$    shortest travel cost (or disutility) between OD pair rs departing at t
$\mathbf{u}$     set of $\{\pi^{rs}(t)\}$ with dimension $n_2 = |RS| \times n_3$
$q^{rs}(t)$      demand between OD pair rs. It is a constant for the fixed demand case and could be a
                 function of $\mathbf{u}$ and t for the elastic demand case, $q^{rs}: R_+^{n_2} \to R_+^1$
$\mathbf{q}$     the set of $\{q^{rs}(t)\}$ with dimension $n_2 = |RS| \times n_3$
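For reference, the route choice and demand conditions that the proof of Proposition 1 invokes can be written out in the notation just listed. This is a restatement assembled from that proof, offered as a reading aid; it does not reproduce the original numbered display, so the assignment of these relations to equation numbers is left open.

```latex
\begin{align*}
  f_p^{rs}(t)\,\bigl[\eta_p^{rs}(t) - \pi^{rs}(t)\bigr] &= 0,
    & \eta_p^{rs}(t) - \pi^{rs}(t) &\ge 0,
    & f_p^{rs}(t) &\ge 0,
    && \forall\, p \in P^{rs},\ rs \in RS, \\
  \pi^{rs}(t)\,\Bigl[\sum_{p} f_p^{rs}(t) - q^{rs}(t)\Bigr] &= 0,
    & \sum_{p} f_p^{rs}(t) - q^{rs}(t) &\ge 0,
    & \pi^{rs}(t) &\ge 0,
    && \forall\, rs \in RS .
\end{align*}
```

The first row says that a path carries flow only if its cost equals the shortest cost; the second ties the path flows to the OD demand whenever the shortest cost is positive.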
Conditions (1) and (3) are the complementary slackness conditions for ideal DUO. Aashtiani
(1979) observed that the user optimal conditions for static traffic assignment are equivalent to a
nonlinear complementarity problem (NCP). Here we generalize this approach to the case of
Ideal DUO, stated as:
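$\mathbf{x} \cdot \mathbf{F}(\mathbf{x}) = 0$
$\mathbf{F}(\mathbf{x}) \ge 0$ (5)
$\mathbf{x} \ge 0$
by setting $\mathbf{x} = (\mathbf{f}, \mathbf{u}) \in R^n$ where $n = n_1 + n_2$ and letting
$\mathbf{F}(\mathbf{x}) = \left( \eta_p^{rs}(t) - \pi^{rs}(t),\ \sum_p f_p^{rs}(t) - q^{rs}(t) \right) \in R^n$. (6)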
Actually, (5) and (6) are just rewrites of (1)-(4) in NCP format. Recent advancements in the
analysis of NCP and mathematical programs (MP) offer new insights into reconsidering this
formulation. Furthermore, recasting this NCP formulation into a MP would make available a
large number of solution techniques already developed for MP. This reformulation from NCP
to MP primarily draws upon the use of a gap or merit function.
Definition: Let $\Omega$ be the set of solutions to the NCP formulation (5)-(6). A function
$G: R^n \to R^1$ is a gap function for the NCP formulation if
i. $G(\mathbf{x}) = 0 \iff \mathbf{x} \in \Omega$
ii. $G \ge 0$
In essence, the gap function provides a measure of convergence of the NCP formulation at any
point $\mathbf{x}$. By minimizing G over $\mathbf{x}$, a point in $\Omega$ is obtained. That is:
$\min G(\mathbf{x})$ s.t. $\mathbf{x} \in \Psi$ where $\Psi = \{\mathbf{x} \ge 0,\ \mathbf{F}(\mathbf{x}) \ge 0\}$. (7)
Facchinei and Soares (1995) suggested three desirable properties to have when transforming a
NCP formulation with a gap function, including:
1. G is smooth (or differentiable);
2. every stationary point of (7) is a global solution of the NCP formulation;
3. the level sets $L(\alpha) = \{\mathbf{x} \in R^n : \mathbf{x} \in \Psi,\ F_i(\mathbf{x}) \le \alpha\}$ are bounded, where $F_i(\mathbf{x})$ is an element
of $\mathbf{F}(\mathbf{x})$ and $\alpha$ is a finite constant.
These properties are important from a computational point of view. If they are satisfied, the MP
can be solved efficiently by applying a number of optimization algorithms to (7). Recently, a
new gap function has been proposed to solve NCP. It hinges on the key observation that this
simple, two-variable, convex function (Fischer, 1992):
$\phi(a,b) = \sqrt{a^2 + b^2} - (a + b)$, (8)
has this property:
$\phi(a,b) = 0 \iff a \ge 0,\ b \ge 0,\ ab = 0$. (9)
From (8), a gap function is proposed based on squaring $\phi(a,b)$ (Facchinei and Soares, 1995):
$\varphi(a,b) = \phi^2(a,b)$. The gap function $\varphi(a,b)$ satisfies the following properties (Kanzow and
Fukushima, 1996):
a. $\varphi$ is continuously differentiable on $R^2$; in particular $\nabla\varphi(0,0) = (0,0)$
b. $\varphi(a,b) \ge 0$, for all $(a,b) \in R^2$
c. $\varphi(a,b) = 0 \iff a \ge 0,\ b \ge 0,\ ab = 0$
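Properties b and c can be checked numerically on a few points. The sketch below is only an illustration: the function names `fischer` and `phi_sq` are ours, and the test points are arbitrary.

```python
import math


def fischer(a, b):
    """Fischer (1992) function: sqrt(a^2 + b^2) - (a + b)."""
    return math.sqrt(a * a + b * b) - (a + b)


def phi_sq(a, b):
    """Squared Fischer function used as the per-pair gap term."""
    return fischer(a, b) ** 2


# Complementary pairs (a >= 0, b >= 0, a*b = 0) give a zero value.
zero_cases = [phi_sq(0.0, 3.0), phi_sq(2.0, 0.0), phi_sq(0.0, 0.0)]

# Pairs that violate complementarity or nonnegativity give a positive value.
positive_cases = [phi_sq(1.0, 1.0), phi_sq(-1.0, 0.0), phi_sq(0.0, -2.0)]
```

The squaring is what removes the kink of the Fischer function at the origin, which is the reason property a holds for the squared form.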
This approach can be applied to reformulate the DTA problem in the form of NCP.
Specifically, we define the gap function for the traffic equilibrium problem as:
$G(\mathbf{x}) = \sum_i \varphi(x_i, F_i(\mathbf{x}))$ (10)
where $x_i, F_i$ are the corresponding elements in (5) and (6), and $\varphi(\cdot)$ is as defined above.
Solving the NCP formulation is equivalent to finding the unconstrained global solutions of the
problem $\{\min G(\mathbf{x})\}$.
Proposition 1: Let $\Omega$ be the set of solutions to the NCP formulation (5)-(6). Function
$G(\mathbf{x}): R^n \to R^1$ is a gap function for these conditions.
Proof:
(i) This gap function satisfies the condition $G(\mathbf{x}) = 0 \iff \mathbf{x} \in \Omega$, where $\mathbf{x} = (\mathbf{f}, \mathbf{u}) \in R^n$, $\mathbf{f}$
denotes the set of $\{f_p^{rs}(t)\}$, and $\mathbf{u}$ denotes the set of $\{\pi^{rs}(t)\}$.
Necessity:
Given the Ideal DUO conditions: $f_p^{rs}(t)[\eta_p^{rs}(t) - \pi^{rs}(t)] = 0$, $f_p^{rs}(t) \ge 0$, $\eta_p^{rs}(t) - \pi^{rs}(t) \ge 0$, let
$a = f_p^{rs}(t)$, $b = \eta_p^{rs}(t) - \pi^{rs}(t)$. The "only-if" condition of property c implies that
$\varphi(f_p^{rs}(t), \eta_p^{rs}(t) - \pi^{rs}(t)) = 0$ for all $rs \in RS$ and $p \in P^{rs}$. Given the demand conditions:
$\pi^{rs}(t)[\sum_p f_p^{rs}(t) - q^{rs}(t)] = 0$, $\pi^{rs}(t) \ge 0$, $\sum_p f_p^{rs}(t) - q^{rs}(t) \ge 0$, let $a = \pi^{rs}(t)$ and
$b = \sum_p f_p^{rs}(t) - q^{rs}(t)$. The "only-if" condition of property c implies that
$\varphi(\pi^{rs}(t), \sum_p f_p^{rs}(t) - q^{rs}(t)) = 0$ for all $rs \in RS$. Adding these two conditions for all
instances yields
$G(\mathbf{x}) = \sum_{rs} \sum_p \varphi(f_p^{rs}(t), \eta_p^{rs}(t) - \pi^{rs}(t)) + \sum_{rs} \varphi(\pi^{rs}(t), \sum_p f_p^{rs}(t) - q^{rs}(t)) = 0$,
since each term is zero. Therefore, given the Ideal DUO conditions, the gap function is
zero.
Sufficiency:
Here we prove that a zero gap function implies the Ideal DUO condition.
According to property b, $\varphi(f_p^{rs}(t), \eta_p^{rs}(t) - \pi^{rs}(t)) \ge 0$ and $\varphi(\pi^{rs}(t), \sum_p f_p^{rs}(t) - q^{rs}(t)) \ge 0$.
Given that $G(\mathbf{x}) = 0$, each term $\varphi(f_p^{rs}(t), \eta_p^{rs}(t) - \pi^{rs}(t))$ and
$\varphi(\pi^{rs}(t), \sum_p f_p^{rs}(t) - q^{rs}(t))$ must be zero. Using the "if" condition of property c, one obtains
the Ideal DUO and demand conditions.
(ii) The gap function satisfies this second condition: $G(\mathbf{x}) \ge 0$. From property b, both
$\varphi(f_p^{rs}(t), \eta_p^{rs}(t) - \pi^{rs}(t))$ and $\varphi(\pi^{rs}(t), \sum_p f_p^{rs}(t) - q^{rs}(t))$ are nonnegative. Therefore, the
sum of all terms, $G(\mathbf{x})$, is nonnegative.
This completes the proof. Thus, the Ideal DUO condition can be achieved by minimizing
the gap function (10).
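A toy instance may help make Proposition 1 concrete. The sketch below evaluates the gap function for one OD pair, two paths, and a single departure interval. The static, separable path cost functions are an assumption made purely for illustration; in this formulation the path costs come from the traffic flow model, not from closed-form functions.

```python
import math


def phi_sq(a, b):
    """Squared Fischer function: zero iff a >= 0, b >= 0 and a*b == 0."""
    return (math.sqrt(a * a + b * b) - (a + b)) ** 2


def path_costs(f):
    # Hypothetical cost functions, not taken from the paper.
    return [10.0 + f[0], 14.0 + f[1]]


def gap(f, pi, q):
    """Gap function for one OD pair: route choice terms plus the demand term."""
    eta = path_costs(f)
    route_terms = sum(phi_sq(f[p], eta[p] - pi) for p in range(len(f)))
    demand_term = phi_sq(pi, sum(f) - q)
    return route_terms + demand_term


q = 10.0
# Equal costs on both used paths (17 each) and flows summing to demand.
g_equilibrium = gap([7.0, 3.0], 17.0, q)
# All demand on path 0: its cost (20) exceeds the shortest cost (14).
g_all_on_path_0 = gap([10.0, 0.0], 14.0, q)
```

The gap is exactly zero at the equal-cost point and strictly positive for the lopsided assignment, so a minimizer of the gap over flows and shortest costs recovers the equilibrium in this small case.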
3. TRAFFIC FLOW MODEL
The above section describes an Ideal DUO formulation based on path flow and actual path
travel time. Many dynamic link travel time functions could be used for this purpose. To capture
detailed traffic dynamics, this formulation encapsulates the Cell-Transmission Model (CTM)
(Daganzo, 1994, 1995a) as the underlying traffic flow model. Developed as a simulation tool,
CTM uses nonlinear mathematical operations and logic statements. While their inclusion as
part of a simulation code does not pose any problem, casting them as constraints in a
mathematical program (MP) is problematic. In this paper, we develop a transformation to
convert CTM to a set of mixed-integer constraints. The significance of this is that the
transformed CTM (and hence, the underlying LWR model) can be used in a general
optimization context, such as dynamic signal control, dynamic traffic assignment, or other
optimization problems that include a traffic model. By encapsulating CTM in this ideal DUO
formulation, we improve the accuracy of traffic dynamics modeling significantly. This
conversion, however, increases the complexity of the formulation, as discussed in the
following.
Cell Transmission Model (CTM): Basic Principles
The Lighthill and Whitham (1955) and Richards (1956) (LWR) model can be stated by the
following two conditions:
$\frac{\partial y}{\partial x} + \frac{\partial k}{\partial t} = 0$ and $y = Y(k, x, t)$ (11)
where y is the traffic flow; k is the density; x and t, respectively, are the space and time
variables, and Y is a function relating y and k. The first partial differential equation states the
traffic flow conservation. The relation Y defines the Fundamental Diagram. Given a set of
well-posed initial conditions, one can determine y and k at any (x,t) by solving (11). This
model is sometimes referred to as the hydrodynamic or kinematic wave model of traffic flow.
Lighthill and Whitham (1955) and Newell (1991) developed two different solution approaches
to this model. Daganzo (1994, 1995a) simplified the solution scheme by adopting the following
relationship between traffic flow, y, and density, k:
$y = \min\{vk,\ Q,\ w(k_{jam} - k)\}$ (12)
where $k_{jam}$, Q, v, w denote, respectively, jam density, inflow capacity (or maximum allowable
inflow), free-flow speed, and the speed of the backward shock wave (or the backward
propagation speed of disturbances in congested traffic). With this relationship, the LWR equations for a single
highway link are approximated by a set of difference equations. Essentially, (12) approximates
the fundamental diagram by a piece-wise linear model as shown in Figure 1.
Figure 1 The fundamental diagram used in CTM (horizontal axis: density)
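The piece-wise linear relation (12) is simple to evaluate directly. In the sketch below the parameter values (free-flow speed 100 km/h, capacity 2000 veh/h, backward wave speed 20 km/h, jam density 150 veh/km) are assumed for illustration and are not taken from the paper.

```python
def flow(k, v, Q, w, k_jam):
    """Flow from relation (12): min of free-flow, capacity, and congested branches."""
    return min(v * k, Q, w * (k_jam - k))


# Assumed parameter values, for illustration only.
v, Q, w, k_jam = 100, 2000, 20, 150

free_flow = flow(10, v, Q, w, k_jam)   # free-flow branch v*k governs
at_capacity = flow(30, v, Q, w, k_jam) # flat capacity branch Q governs
congested = flow(100, v, Q, w, k_jam)  # congested branch w*(k_jam - k) governs
jammed = flow(k_jam, v, Q, w, k_jam)   # no flow at jam density

# Densities at which the branches meet.
k_free_end = Q / v           # end of the free-flow branch
k_cong_start = k_jam - Q / w # start of the congested branch
```

With these values the flat top of the diagram spans densities from 20 to 50 veh/km, and the same flow of 1000 veh/h occurs at both 10 veh/km (free flow) and 100 veh/km (congested).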
By discretizing the road into homogeneous sections (or cells) and time into intervals such that
the cell length is equal to the distance traveled by free-flowing traffic in one time interval,
the LWR results are approximated by this set of recursive equations (Daganzo, 1994, 1995a):
$n_j(t+1) = n_j(t) + y_j(t) - y_{j+1}(t)$ (13)
$y_j(t) = \min\{ n_{j-1}(t),\ Q_j(t),\ (w/v)[N_j(t) - n_j(t)] \}$ (14)
where the subscript j refers to cell j, and j+1 (j-1) represents the cell downstream (upstream) of
j. The variables $n_j(t)$, $y_j(t)$, $N_j(t)$ denote the number of vehicles, the actual inflow, and the
maximum number of vehicles (or holding capacity) that can be held in cell j at time t,
respectively. The variables $Q_j(t)$, v, w follow the earlier definitions. It is important to
differentiate between $Q_j(t)$ and $y_j(t)$. The former is the inflow capacity while the latter is the
actual inflow. Because (13)-(14) provide a numerical approximation to the LWR equations, all
the traffic phenomena demonstrated in the LWR model are replicated in CTM.
The key is to determine $y_j(t)$ from the minimization (14). Once this is accomplished, $n_j(t)$
can be determined recursively from the linear equation (13). However, including (14) as a
constraint would make the resultant MP difficult to solve. To overcome this problem, we
transform the minimization in (14) to a set of mixed-integer constraints, as discussed in the
next section.
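As a simulation (rather than as a constraint set), the cell recursion is only a few lines. The sketch below follows the standard form of Daganzo's cell update for a straight series of cells: flow into a cell is the minimum of what the upstream cell holds, the inflow capacity, and the remaining space scaled by w/v. The boundary treatment (a constant entry demand and an exit capacity), the function name `ctm_step`, and all numerical values are assumptions added for illustration.

```python
def ctm_step(n, N, Q, delta, demand, exit_cap):
    """Advance cell occupancies by one time interval.

    n[j]: vehicles in cell j; N[j]: holding capacity; Q[j]: inflow capacity
    per interval; delta: ratio w/v. Returns (new occupancies, flows), where
    y[j] is the inflow to cell j and y[-1] is the outflow from the last cell.
    """
    J = len(n)
    y = [0.0] * (J + 1)
    # Entry boundary (assumed): exogenous demand limited by the first cell.
    y[0] = min(demand, Q[0], delta * (N[0] - n[0]))
    for j in range(1, J):
        y[j] = min(n[j - 1], Q[j], delta * (N[j] - n[j]))
    # Exit boundary (assumed): outflow limited by an exit capacity.
    y[J] = min(n[J - 1], exit_cap)
    n_next = [n[j] + y[j] - y[j + 1] for j in range(J)]
    return n_next, y


# Four cells, exit blocked, so a queue forms downstream and spills back.
N = [10.0] * 4
Q = [3.0] * 4
delta = 0.5
n = [0.0] * 4
entered = 0.0
left = 0.0
for t in range(60):
    n, y = ctm_step(n, N, Q, delta, demand=3.0, exit_cap=0.0)
    entered += y[0]
    left += y[-1]
```

With the exit blocked, the last cell fills first and the queue then propagates upstream until the entry flow itself is throttled, which is the spill-back behavior that link travel time functions struggle to represent.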
Equations (13)-(14) provide the basic principle of modeling traffic flow on a series of straight
cells. To apply this principle to a general network with multiple OD pairs, three extensions are
required: (a) modeling merge and diverge junctions; (b) differentiating the OD specific traffic;
(c) maintaining the first-in-first-out (FIFO) property. In the next section, we discuss how CTM
is extended for network traffic and the corresponding transformations.
Cell Transmission Model: Network Traffic
The extension of CTM for network traffic was addressed in Daganzo (1995a). To facilitate
cross-reference, this paper adopts notation similar to that used there. Moreover, to keep the
description succinct, this paper only covers the basic concepts needed for exposition. One may
refer to that paper for detailed discussions. The focus here is to cast CTM as a constraint set in
this ideal DUO formulation.
As mentioned in Section 2, road segments are represented by a series of cells that have physical
length, while links, without physical length, are there merely to delineate connectivity between
cells. In general, five types of connectivity are needed to represent a network (Figure 2):
1. Ordinary connection: where two cells are joined by a straight link
2. Origin connection: where exogenous traffic enters the cell
3. Destination connection: where traffic terminates at the cell
4. Merge connection: where two cells merge into one
5. Diverge connection: where one cell diverges into two
In Figure 2, Bk and Ek refer to the beginning and ending cells of each cell-group, while Ck in
the merge and diverge connections refers to the complementary (third) cell. For the origin and
destination connections, Bk and Ek are replaced by r and s, respectively. Links are denoted by k
or ck, depending on whether the link has one end at a complementary cell Ck. More
complicated network connections can be decomposed into a combination of one or more of
these five connections.
In a general network with many OD pairs, the general principle of (13)-(14) still applies to the
aggregate traffic flows between cells. However, to maintain the intended paths of traffic and the
first-in-first-out (FIFO) property, traffic in each cell is disaggregated by path (p) and by arrival
time at the cell (τ, modeled as a discrete time index). The path variable is used to direct traffic
at merges and diverges. By tracking τ, the FIFO property is maintained by ensuring that earlier
arrivals (with a larger τ) leave sooner.
[Figure 2 panels: Ordinary (Bk, Ek); Origin (r, Ek); Destination (Bk, s); Merge; Diverge]
Figure 2 Five Types of Network Connections Captured in CTM
Ordinary Connection
For Bk and Ek on path p, denote $n_{Bk,p,\tau}(t)$ and $n_{Ek,p,\tau}(t)$, respectively, as the traffic in cells
Bk and Ek at time t that is on path p and entered the cells in the time interval immediately after
the clock tick (t - τ). Essentially, τ represents the waiting time of that packet of traffic in the cell.
Define $y_{k,p,\tau}(t)$ as the traffic (on path p that has been waiting for τ time intervals) leaving
Bk through link k at time t. Considering the conservation of traffic in cells Bk and Ek, we have:

$$n_{Ek,p,1}(t+1) = \sum_{\tau} y_{k,p,\tau}(t) \tag{15}$$

$$n_{Bk,p,\tau+1}(t+1) = n_{Bk,p,\tau}(t) - y_{k,p,\tau}(t) \tag{16}$$
Equation (15) states that the new traffic enters Ek at τ = 1. Equation (16) states that the
remaining traffic in Bk at time t+1 (the traffic at t minus that leaving) increases its waiting
time from τ to τ+1. If the minimum wait $a_{Bk}(t)$ in cell Bk is known, then the flow on link k
is given by³:

$$y_{k,p,\tau} = \begin{cases} n_{Bk,p,\tau} & \text{if } \tau > \lceil a_{Bk} \rceil^{+} \\ (\tau - a_{Bk})\, n_{Bk,p,\tau} & \text{if } \tau = \lceil a_{Bk} \rceil^{+} \\ 0 & \text{if } \tau < \lceil a_{Bk} \rceil^{+} \end{cases} \tag{17}$$

where $\lceil a_{Bk} \rceil^{+}$ denotes the smallest integer equal to or greater than $a_{Bk}$. Equation (17) states that
if the waiting time τ of a traffic packet exceeds the minimum wait rounded up to an integer, the
whole packet advances; none of the packet leaves Bk if its waiting time is less than $a_{Bk}$; and
lastly, for the packet with waiting time $\tau = \lceil a_{Bk} \rceil^{+}$, the fraction $(\tau - a_{Bk})$ leaves. Note that
the exit flow $y_{k,p,\tau}(t)$ is a decreasing function of $a_{Bk}$; i.e., $y_{k,p,\tau}(t)$ increases with
decreasing $a_{Bk}$.

³ Since all the variables are time-dependent, to simplify notation, we drop "(t)" from all the
variables hereafter except in instances of ambiguity.
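Mathematically, (17) is equivalent to:

$$y_{k,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{18}$$

If $\tau > \lceil a_{Bk} \rceil^{+}$ or, equivalently, $\tau - a_{Bk} \ge 1$, the middle term is $n_{Bk,p,\tau}$. If $\tau = \lceil a_{Bk} \rceil^{+}$ or,
equivalently, $0 \le \tau - a_{Bk} < 1$, the middle term is $(\tau - a_{Bk})\, n_{Bk,p,\tau}$. Lastly, if $\tau < \lceil a_{Bk} \rceil^{+}$ or,
equivalently, $\tau - a_{Bk} < 0$, the middle term is zero. This completes the verification.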
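The equivalence between the piecewise rule (17) and the mid-value form (18) can also be checked numerically. The sketch below is a minimal illustration, assuming (17) is the three-case rule described in the text (whole packet, fraction τ - a, or nothing) and (18) is the middle value of n, (τ - a)n, and 0. The function names are hypothetical and the packet size is assumed non-negative.

```python
import math


def exit_flow_piecewise(n, tau, a):
    """Exit flow of a packet of size n with waiting time tau, per the three cases of (17)."""
    ceil_a = math.ceil(a)  # smallest integer equal to or greater than a
    if tau > ceil_a:
        return n                  # the whole packet advances
    if tau == ceil_a:
        return (tau - a) * n      # only the fraction (tau - a) leaves
    return 0.0                    # the packet has not waited long enough


def mid(x, y, z):
    """Middle value of three numbers."""
    return sorted((x, y, z))[1]


def exit_flow_mid(n, tau, a):
    """Exit flow written as the mid-value function of (18)."""
    return mid(n, (tau - a) * n, 0.0)


# Compare the two forms for integer and fractional minimum waits.
for a in (0.0, 0.5, 2.0, 2.3):
    for tau in range(6):
        assert abs(exit_flow_piecewise(10.0, tau, a) - exit_flow_mid(10.0, tau, a)) < 1e-12
```

The mid-value form needs no rounding operator, which is what makes it convenient to convert into mixed-integer constraints.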
To incorporate (17) as part of the constraint set, the mid-value function (18) is converted to a
set of mixed-integer constraints. This is achieved by the technique discussed in Appendix C.
The resultant constraints are:
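$$L v_1 + \varepsilon \le (\tau - a_{Bk})\, n_{Bk,p,\tau} - n_{Bk,p,\tau} \le U (1 - v_1) \tag{19}$$

$$L v_2 + \varepsilon \le -(\tau - a_{Bk})\, n_{Bk,p,\tau} \le U (1 - v_2) \tag{20}$$

$$L v_3 + \varepsilon \le n_{Bk,p,\tau} \le U (1 - v_3) \tag{21}$$

$$L (v_1 + v_3) \le y_{k,p,\tau} - n_{Bk,p,\tau} \le U (v_1 + v_3) \tag{22}$$

$$L (2 - v_1 - v_3) \le y_{k,p,\tau} - n_{Bk,p,\tau} \le U (2 - v_1 - v_3) \tag{23}$$

$$L (v_1 + v_2) \le y_{k,p,\tau} - (\tau - a_{Bk})\, n_{Bk,p,\tau} \le U (v_1 + v_2) \tag{24}$$

$$L (2 - v_1 - v_2) \le y_{k,p,\tau} - (\tau - a_{Bk})\, n_{Bk,p,\tau} \le U (2 - v_1 - v_2) \tag{25}$$

$$L (v_2 + v_3) \le y_{k,p,\tau} \le U (v_2 + v_3) \tag{26}$$

$$L (2 - v_2 - v_3) \le y_{k,p,\tau} \le U (2 - v_2 - v_3) \tag{27}$$

where L, U are very large negative and positive constants, ε is a very small positive constant,
and $v_1, v_2, v_3$ are three binary variables. Refer to Appendix C for details.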
Without loss of generality, the disaggregate variables can be expressed in aggregate form by
summing over the path and arrival time indices:

$$n_{Bk} = \sum_{\tau}\sum_{p} n_{Bk,p,\tau}, \qquad n_{Ek} = \sum_{\tau}\sum_{p} n_{Ek,p,\tau} \tag{28}$$

$$y_k = \sum_{\tau}\sum_{p} y_{k,p,\tau} \tag{29}$$

Daganzo (1995a) showed that the aggregate flow on link k, $y_k$, observes:

$$y_k = \min\{S_{Bk},\ R_{Ek}\} \tag{30}$$

where

$$S_{Bk} = \min\{Q_{Bk},\ n_{Bk}\} \tag{31}$$

and

$$R_{Ek} = \min\{Q_{Ek},\ \delta_{Ek}[N_{Ek} - n_{Ek}]\} \tag{32}$$

The aggregate flow is the minimum of the effective capacity of the "sending" cell Bk, $S_{Bk}$,
and that of the "receiving" cell Ek, $R_{Ek}$. These two capacities are subject to the conditions
(31) and (32), where $Q_{Bk}$, $Q_{Ek}$ are the absolute flow capacities of the sending and receiving
cells; $N_{Ek}$, $n_{Ek}$ are the holding capacity and vehicle occupancy of cell Ek, respectively, so that
$N_{Ek} - n_{Ek}$ is the available space in the receiving cell; and $\delta_{Ek} = W/V$. In essence, these
conditions follow from the basic principle of (14).
The minimization equations (30)-(32) can each be converted to a set of linear constraints
through the transformation in Appendix A. For brevity, we only show the equivalent
constraints for (30):
$$0 \le R_{Ek} - y_k \le U (1 - z) \tag{33}$$

$$0 \le S_{Bk} - y_k \le U z \tag{34}$$

$$L (1 - z) \le S_{Bk} - R_{Ek} \le U z \tag{35}$$

where z is a binary variable and, as before, L, U are, respectively, very large negative and
positive constants. Their equivalence can be verified by substituting the only two possible
values of z into (33)-(35) or by referring to Appendix A for details. With this technique,
(31)-(32) can each be transformed in a similar way.
The minimum wait $a_{Bk}$, outflow $y_{k,p,\tau}$, occupancies $n_{Bk,p,\tau}$ and $n_{Ek,p,\tau}$, and their corresponding
aggregate variables are fixed simultaneously by satisfying both the disaggregate constraints
(15)-(16), (18), and the aggregate ones (28)-(32). Of course, both (18) and (30)-(32) are
represented by their equivalent mixed-integer constraints.
Origin Connection
The origin connection is almost identical to the ordinary connection. The only exception is that
exogenous dynamic traffic demand enters the network through cell r according to the path flow
$f_p^{rs}(t)$⁴. Other than this change, the entire set of constraints and the associated equivalent
mixed-integer constraints for the ordinary connection applies here. Vehicle occupancy in cell r
is described by:

$$n_{r,p,1}(t+1) = f_p^{rs}(t) \tag{36}$$
Destination Connection
The destination connection is also similar to the ordinary connection. The only
difference is that a destination cell is modeled with an infinite storage capacity to act as a big
parking lot. Traffic that arrives at a destination cell is considered to have left the network. The
only change to the set of constraints for the ordinary connection is that the holding capacity $N_s$
is taken as infinity, where the subscript s refers to the destination cell. So instead of setting
$R_s = \min\{Q_s,\ \delta_s[N_s - n_s]\}$, we use $R_s = Q_s$ as $N_s \to \infty$. Again, all the conditions can be
transformed to mixed-integer constraints through the techniques in Appendices A and B.
Merge Connection
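Much of the principle of developing the merge connection follows from the ordinary
connection. We have one set of conditions for the disaggregate traffic representation and one
set for the aggregate case. Vehicle occupancies in these three cells are represented by:

$$n_{Ek,p,1}(t+1) = \sum_{\tau} \left[ y_{k,p,\tau}(t) + y_{ck,p,\tau}(t) \right] \tag{37}$$

$$n_{Bk,p,\tau+1}(t+1) = n_{Bk,p,\tau}(t) - y_{k,p,\tau}(t) \tag{38}$$

$$n_{Ck,p,\tau+1}(t+1) = n_{Ck,p,\tau}(t) - y_{ck,p,\tau}(t) \tag{39}$$

The disaggregate exit flows from cells Bk and Ck, respectively, are:

$$y_{k,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{40}$$

$$y_{ck,p,\tau} = \operatorname{mid}\{\, n_{Ck,p,\tau},\ (\tau - a_{Ck})\, n_{Ck,p,\tau},\ 0 \,\} \tag{41}$$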

Note that each of the sending cells has its own minimum wait, $a_{Bk}$, $a_{Ck}$. The aggregate flows
are:

$$y_k = \sum_{\tau}\sum_{p} y_{k,p,\tau} \tag{42}$$

$$y_{ck} = \sum_{\tau}\sum_{p} y_{ck,p,\tau} \tag{43}$$

The effective capacities of the sending cells Bk and Ck and that of the receiving cell Ek are
defined, respectively, by:

$$S_{Bk} = \min\{Q_{Bk},\ n_{Bk}\} \tag{44}$$

$$S_{Ck} = \min\{Q_{Ck},\ n_{Ck}\} \tag{45}$$

$$R_{Ek} = \min\{Q_{Ek},\ \delta_{Ek}[N_{Ek} - n_{Ek}]\} \tag{46}$$

⁴ The notation of the path flow $f_p^{rs}$ follows from its earlier definition. The use of three indices
(r, s, p) is redundant because path p alone automatically defines the starting and ending cells
(r and s) as well as all the cells along p. This is only for presentation clarity and can be
simplified in coding.
Up to this point, all the conditions follow from the ordinary connection. By the techniques in
Appendices A and C, each of these operations can be converted to a set of mixed-integer
constraints.
In a merge connection, in addition, one must ensure that traffic exiting from the two sending
cells can be accommodated in the receiving cell. The basic idea is that if the receiving cell has
sufficient capacity, all the traffic from Bk and Ck enters. In the case of insufficient Ek capacity,
traffic from Bk and Ck is apportioned according to the amount of traffic from Bk and Ck and
the two priority parameters $P_k$ and $P_{ck}$ associated with the exit links. Daganzo (1995a)
provided a detailed treatment of this aspect, which is not repeated here. The mathematical
conditions are written as:

$$\text{if } R_{Ek} \ge S_{Bk} + S_{Ck}, \text{ then } \begin{cases} y_k = S_{Bk} \\ y_{ck} = S_{Ck} \end{cases}$$

$$\text{if } R_{Ek} < S_{Bk} + S_{Ck}, \text{ then } \begin{cases} y_k = \operatorname{mid}\{S_{Bk},\ R_{Ek} - S_{Ck},\ P_k R_{Ek}\} \\ y_{ck} = \operatorname{mid}\{S_{Ck},\ R_{Ek} - S_{Bk},\ P_{ck} R_{Ek}\} \end{cases} \tag{47}$$

$$P_k + P_{ck} = 1 \tag{48}$$

The parameters $P_k$ and $P_{ck}$ (which sum to one) provide flexibility in modeling prioritized
merge junctions or the action of a traffic signal.
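Letting:

$$M_k = \operatorname{mid}\{S_{Bk},\ R_{Ek} - S_{Ck},\ P_k R_{Ek}\} \tag{49}$$

$$M_{ck} = \operatorname{mid}\{S_{Ck},\ R_{Ek} - S_{Bk},\ P_{ck} R_{Ek}\}, \tag{50}$$

condition (47) can be rearranged as:

$$y_k = \begin{cases} S_{Bk} & \text{if } R_{Ek} \ge S_{Bk} + S_{Ck} \\ M_k & \text{if } R_{Ek} < S_{Bk} + S_{Ck} \end{cases} \tag{51}$$

$$y_{ck} = \begin{cases} S_{Ck} & \text{if } R_{Ek} \ge S_{Bk} + S_{Ck} \\ M_{ck} & \text{if } R_{Ek} < S_{Bk} + S_{Ck} \end{cases} \tag{52}$$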
Conditions (49) and (50) can each be transformed to a set of mixed-integer constraints by the
transformation in Appendix C. And (51) and (52) can be transformed by the approach in
Appendix B. In summary, all the conditions in the merge connection can be transformed to
mixed-integer constraints.
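The merge rule can be made concrete with a small numerical sketch. The function below is an illustration under stated assumptions rather than the mixed-integer form used in the formulation: it evaluates the if-then condition directly and takes the congested-case flows to be the mid-values of the sending capacity, the space left over by the other approach, and the priority share of the receiving capacity. The name `merge_flows` and its argument names are hypothetical.

```python
def mid(x, y, z):
    """Middle value of three numbers."""
    return sorted((x, y, z))[1]


def merge_flows(S_B, S_C, R_E, P_k):
    """Return (y_k, y_ck), the flows from sending cells Bk and Ck into Ek.

    S_B, S_C : effective sending capacities of cells Bk and Ck
    R_E      : effective receiving capacity of cell Ek
    P_k      : priority of link k; link ck receives 1 - P_k, as in (48)
    """
    P_ck = 1.0 - P_k
    if R_E >= S_B + S_C:
        return S_B, S_C  # enough room: both approaches discharge fully
    y_k = mid(S_B, R_E - S_C, P_k * R_E)     # M_k in (49)
    y_ck = mid(S_C, R_E - S_B, P_ck * R_E)   # M_ck in (50)
    return y_k, y_ck


print(merge_flows(4, 3, 10, 0.5))    # uncongested merge
print(merge_flows(6, 6, 8, 0.75))    # both approaches queued, priority split
print(merge_flows(1, 6, 4, 0.5))     # Bk sends little, Ck uses the remainder
```

In the two congested cases the flows sum to the receiving capacity, so no space in Ek is wasted; when one approach cannot use its priority share, the other takes up the slack.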
Diverge Connection
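Similar to the other connections, two sets of constraints are required to fully define the
diverge connection: one for the disaggregate traffic representation and one for the aggregate
representation. The flow conservation constraints at the disaggregate level are similar to those
of the ordinary connection, stated as:

$$n_{Ek,p,1}(t+1) = \sum_{\tau} y_{k,p,\tau}(t) \tag{53}$$

$$n_{Ck,p,1}(t+1) = \sum_{\tau} y_{ck,p,\tau}(t) \tag{54}$$

$$n_{Bk,p,\tau+1}(t+1) = n_{Bk,p,\tau}(t) - y_{k,p,\tau}(t) - y_{ck,p,\tau}(t) \tag{55}$$

The disaggregate exit flow from Bk is:

$$y_{k,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{56}$$

$$y_{ck,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{57}$$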
To maintain FIFO, a single minimum wait $a_{Bk}$ is defined for Bk regardless of which
downstream cell the traffic is heading to. (Although the right-hand sides of (56) and (57) look
identical, numerically they are different because cells Ek and Ck lie on different paths.) For
exposition purposes, we also define the sum of outflows from Bk:

$$y_{Bk,p,\tau} = y_{k,p,\tau} + y_{ck,p,\tau} \tag{58}$$

The aggregate traffic in each cell and the link flows are then defined by summing
the disaggregate variables over the path and arrival time indices:

$$n_{Bk} = \sum_{\tau}\sum_{p} n_{Bk,p,\tau}, \qquad n_{Ek} = \sum_{\tau}\sum_{p} n_{Ek,p,\tau}, \qquad n_{Ck} = \sum_{\tau}\sum_{p} n_{Ck,p,\tau} \tag{59}$$

$$y_k = \sum_{\tau}\sum_{p} y_{k,p,\tau}, \qquad y_{ck} = \sum_{\tau}\sum_{p} y_{ck,p,\tau} \tag{60}$$

The aggregate link flows must observe the capacity limits. That is,

$$y_k \le R_{Ek}, \qquad y_{ck} \le R_{Ck}, \qquad y_k + y_{ck} \le S_{Bk} \tag{61}$$

Note that the flows $y_k$, $y_{ck}$, and the sum $y_k + y_{ck}$ are functions of the minimum wait $a_{Bk}$. A
lower $a_{Bk}$ permits higher flows $y_k$, $y_{ck}$, and $y_k + y_{ck}$. In order not to violate any of the three
constraints in (61), one needs to find the value of $a_{Bk}$ at which the constraints begin to
bind. This can be achieved by these constraints:
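$$y'_{k,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a'_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{62}$$

$$y'_{ck,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a''_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{63}$$

$$y'_{Bk,p,\tau} = \operatorname{mid}\{\, n_{Bk,p,\tau},\ (\tau - a'''_{Bk})\, n_{Bk,p,\tau},\ 0 \,\} \tag{64}$$

$$y'_k = \sum_{\tau}\sum_{p} y'_{k,p,\tau}, \qquad y'_{ck} = \sum_{\tau}\sum_{p} y'_{ck,p,\tau}, \qquad y'_{Bk} = \sum_{\tau}\sum_{p} y'_{Bk,p,\tau} \tag{65}$$

$$y'_k = R_{Ek}, \qquad y'_{ck} = R_{Ck}, \qquad y'_{Bk} = S_{Bk} \tag{66}$$

$$a_{Bk} = \max\{a'_{Bk},\ a''_{Bk},\ a'''_{Bk}\} \tag{67}$$

The value of $a'_{Bk}$ that produces the limiting case of $y'_k = R_{Ek}$ is determined by (62), the first
equation in (65), and the first equation in (66). Similarly, $a''_{Bk}$ and $a'''_{Bk}$, corresponding to the
limiting cases of $y'_{ck} = R_{Ck}$ and $y'_{Bk} = S_{Bk}$, are determined from (62)-(66). Finally, $a_{Bk}$ takes the
maximum among $a'_{Bk}$, $a''_{Bk}$, $a'''_{Bk}$ in order not to violate any of the constraints in (61).
In the above six equations, (62)-(64) can be transformed to mixed-integer constraints through
the technique in Appendix C. The maximization (67) can be transformed to mixed-integer
constraints by the technique in Appendix A by re-stating it as
$a_{Bk} = -\min(-a'_{Bk},\ -a''_{Bk},\ -a'''_{Bk})$. Therefore, all the conditions and operations related to the
diverge connection can be cast into mixed-integer constraints.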
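The role of the single minimum wait at a diverge can be illustrated numerically. The sketch below does not use the mixed-integer constraints of the formulation; as a stand-in it finds, by bisection, the smallest wait at which each capacity limit stops being violated and then takes the largest of the three, which mirrors the maximization described above. It assumes non-negative packet sizes, so that mid{n, (τ - a)n, 0} equals n times (τ - a) clipped to [0, 1]. The data layout (a dictionary keyed by exit link and waiting time) and all function names are hypothetical.

```python
def clip01(x):
    """Clip x to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def link_flow(packets, a, links):
    """Aggregate outflow towards the given exit links when the minimum wait is a.

    packets maps (link, tau) to the vehicles in Bk bound for that exit link
    that have waited tau intervals.
    """
    return sum(n * clip01(tau - a) for (link, tau), n in packets.items() if link in links)


def smallest_wait(packets, links, limit, iters=60):
    """Smallest a >= 0 with link_flow <= limit (the flow is non-increasing in a)."""
    if link_flow(packets, 0.0, links) <= limit:
        return 0.0
    lo, hi = 0.0, float(max(tau for (_, tau) in packets))  # flow is zero at hi
    for _ in range(iters):
        m = 0.5 * (lo + hi)
        if link_flow(packets, m, links) <= limit:
            hi = m
        else:
            lo = m
    return hi


def diverge_min_wait(packets, R_E, R_C, S_B):
    """Common minimum wait: the largest of the three limiting waits."""
    a1 = smallest_wait(packets, {"k"}, R_E)         # limit of receiving cell Ek
    a2 = smallest_wait(packets, {"ck"}, R_C)        # limit of receiving cell Ck
    a3 = smallest_wait(packets, {"k", "ck"}, S_B)   # limit of sending cell Bk
    return max(a1, a2, a3)


packets = {("k", 1): 4.0, ("k", 2): 2.0, ("ck", 1): 3.0, ("ck", 3): 1.0}
a = diverge_min_wait(packets, R_E=3.0, R_C=10.0, S_B=10.0)
print(a, link_flow(packets, a, {"k"}), link_flow(packets, a, {"ck"}))
```

In this example only the limit on link k binds, yet the flow towards Ck is also reduced because both streams share the same minimum wait. This is the FIFO coupling that the single value of the minimum wait is meant to enforce.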
In summary, the network CTM involves three types of operations that are difficult to
incorporate directly as constraints, namely, "if-then" conditions, the mid-value function, and
minimization. We showed in this section that each of these three operations can be transformed
to mixed-integer constraints, which can then be part of the constraint set of an MP.
Determination of Path Travel Times
This ideal DUO formulation uses the actual path travel time $\eta_p^{rs}(t)$ to equilibrate traffic. The
network CTM tracks the path index in each cell. Let $\omega_p^r(t)$ be the cumulative vehicle count of
traffic on path p in the origin cell r and $\omega_p^s(t)$ the corresponding cumulative count in the
destination cell s. According to Figure 3, one can write:

$$\omega_p^r(t) = \omega_p^s\big(t + \eta_p^{rs}(t)\big) \tag{68}$$
Graphically, the travel time on path p is the horizontal distance between the two cumulative
curves. If time is discretized and hence traffic arrives in packets, generally the cumulative
counts in cells r and s will not be equal at the discretized time ticks. We estimate the path travel
time in this way. If the cumulative count in r at time t is bounded by the cumulative counts in s
between t' and t'+1, then the path travel time is set to be t' - t. The maximum error in this
estimation is one time interval. Mathematically, it can be stated as:

$$\text{if } \omega_p^s(t') \le \omega_p^r(t) < \omega_p^s(t'+1) \text{ then } \eta_p^{rs}(t) = t' - t \tag{69}$$

This condition is equivalent to:

$$L \rho_1 \le \omega_p^r(t) - \omega_p^s(t'+1) \le U [1 - \rho_1] - \varepsilon, \quad \text{for } t' > t \tag{70}$$

$$L \rho_2 + \varepsilon \le \omega_p^s(t') - \omega_p^r(t) \le U [1 - \rho_2], \quad \text{for } t' > t \tag{71}$$

$$L (2 - \rho_1 - \rho_2) \le \eta_p^{rs}(t) - (t' - t) \le U (2 - \rho_1 - \rho_2) \tag{72}$$

where L, U are very large negative and positive constants; ε is a small positive constant; and
$\rho_1, \rho_2$ are binary variables indicating the states of satisfying the constraints $\omega_p^r(t) < \omega_p^s(t'+1)$
and $\omega_p^s(t') \le \omega_p^r(t)$, respectively. The binary variables satisfy these if-and-only-if conditions:
$\rho_1 = 1$ if and only if $\omega_p^r(t) < \omega_p^s(t'+1)$, and $\rho_2 = 1$ if and only if $\omega_p^s(t') \le \omega_p^r(t)$. If both $\rho_1, \rho_2$
equal 1, then the condition $\omega_p^s(t') \le \omega_p^r(t) < \omega_p^s(t'+1)$ holds. By substituting $\rho_1 = \rho_2 = 1$ into
(72), one can verify that $0 \le \eta_p^{rs}(t) - (t' - t) \le 0$, or $\eta_p^{rs}(t) = t' - t$. If one of $\rho_1, \rho_2$ is not one (i.e.,
the condition $\omega_p^s(t') \le \omega_p^r(t) < \omega_p^s(t'+1)$ is not satisfied), (72) becomes
$L \le \eta_p^{rs}(t) - (t' - t) \le U$, which is true for any value of $\eta_p^{rs}(t)$ and hence non-binding. This
completes the verification.
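The travel time estimate can be sketched directly from two cumulative count series. The function below is an illustration only: it searches the discrete ticks t' > t for the one whose destination counts bracket the origin count at t, taking the first inequality as non-strict and the second as strict, and returns t' - t. The function name, the list-based inputs, and the choice to return `None` when no bracketing tick exists within the horizon are assumptions of this sketch.

```python
def path_travel_time(omega_r, omega_s, t):
    """Estimate the travel time of traffic departing origin cell r at tick t.

    omega_r[t] : cumulative count on the path in the origin cell at tick t
    omega_s[t] : cumulative count on the path in the destination cell at tick t
    Returns t' - t for the first t' > t with
    omega_s[t'] <= omega_r[t] < omega_s[t' + 1], or None if there is none.
    """
    for t_prime in range(t + 1, len(omega_s) - 1):
        if omega_s[t_prime] <= omega_r[t] < omega_s[t_prime + 1]:
            return t_prime - t
    return None


# Hypothetical cumulative counts over seven ticks.
omega_r = [0, 2, 4, 6, 6, 6, 6]
omega_s = [0, 0, 0, 1, 3, 5, 6]
print([path_travel_time(omega_r, omega_s, t) for t in range(4)])
```

For the last departure tick in this example the destination curve never rises above the origin count within the horizon, so no estimate is returned; in an optimization setting the time horizon would need to be long enough for the network to clear.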
[Figure 3: cumulative count of vehicles using path p plotted against time for the origin and
destination cells; the time axis is marked at t, t' and t'+1]
Figure 3 The cumulative vehicle counts in origin cell r and destination cell s on path p
4. REMARKS
We formulated the ideal DUO problem with two distinctive features. Firstly, it is based on
minimizing the gap function of the corresponding NCP. We proved that the solution to the
minimization corresponds to the ideal DUO conditions. Secondly, we showed that the network
CTM can be transformed in its entirety to a set of mixed-integer constraints, ready to be
incorporated in dynamic traffic optimization problems, such as this ideal DUO formulation or
dynamic traffic signal control problems (Lo, 1998, 1999). The advantage of this transformation
is that it improves the accuracy of traffic dynamics modeling significantly, as compared with
other analytical DTA formulations.
However, we note that this improvement comes at the cost of a significant increase in
computational effort. Developing efficient solution algorithms is an important topic, which is
outside the scope of this paper and is an area of our future research.
ACKNOWLEDGEMENTS
This research is sponsored by the research grant (RGC-DAG96/97.EG32) from the Hong Kong
Research Grant Council. I am also grateful for the helpful comments of the two anonymous
referees.
REFERENCES
Aashtiani, H. (1979). The Multi-Modal Traffic Assignment Problem. PhD Thesis, Operations
Research Center, MIT, Cambridge, MA.
Asmuth, R. (1978). Traffic Network Equilibria. PhD Thesis, Department of Operations
Research, Stanford University, Stanford, CA.
Bell, M.G., C. Cassir, S. Grosso, and S. Clement. (1997). Path Flow Estimation in Traffic
System Management. Proceedings of International Federation of Automatic Control:
Transportation Systems, Chania, Greece. 1316-1321.
Cascetta, E., F. Russo, A. Vitetta. (1997). Stochastic User Equilibrium Assignment with
Explicit Path Enumeration: Comparison of Models and Algorithms. Proceedings of
International Federation of Automatic Control: Transportation Systems, Chania,
Greece. 1078-1084.
Chang, G. L. and H. Mahmassani. (1988). Travel time prediction and departure time
adjustment behavior dynamics in a congested traffic system. Transportation Research,
22B(3), 217-232.
Chen, A. and R. Jayakrishnan. (1998). A Path-Based Gradient Projection Algorithm: Effects of
Equilibration with a Restricted Path Set Under Two Flow Update Policies. Paper
presented at 77th Annual Meeting of Transportation Research Board, 1443,
Washington D.C.
Dafermos, S. (1980). Traffic equilibrium and variational inequalities. Transportation Science,
14, 42-54.
Daganzo, C. F. (1994). The cell-transmission model: a simple dynamic representation of
highway traffic. Transportation Research. 28B(4), 269-287.
Daganzo, C. F. (1995a). The cell-transmission model, Part II: Network Traffic. Transportation
Research. 29B(2), 79-93.
Daganzo, C. F. (1995b). Properties of link travel time functions under dynamic loads.
Transportation Research. 29B(2), 93-98.
Facchinei, F. and J. Soares. (1995). Testing a New Class of Algorithms for Nonlinear
Complementarity Problems. In "Variational Inequalities and Network Equilibrium
Problems" edited by Giannessi and Maugeri. Plenum Press, New York.
Fischer, A. (1992). A Special Newton-type Optimization Method. Optimization, 24, 269-284.
Florian, M. and D. Hearn. (1995). Network Equilibrium Models and Algorithms. Handbooks in
Operations Research and Management Science, Volume 8, Network Routing (M.O.
Ball et al. Editors). Elsevier Science. Netherlands.
Friesz, T., D. Bernstein, T. Smith, R. Tobin, B. Wie. (1993). A Variational Inequality
Formulation of the Dynamic Network User Equilibrium Problem. Operations
Research, 41, 179-191.
Gabriel, S. and D. Bernstein. (1997). The Traffic Equilibrium Problem with Nonadditive Path
Costs. Transportation Science. 31,4, 337-348.
Janson, B. (1991). Dynamic Traffic Assignment With Arrival Time Costs. Transportation
Research Vol. 25B, 143-161.
Jayakrishnan, R., W.K. Tsai, J. N. Prashker, and S. Rajadhyaksha. (1994). Faster Path-Based
Algorithm for Traffic Assignment. Transportation Research Record, 1443, 75-83.
Jayakrishnan, R., W.K. Tsai, J. and A. Chen. (1995). A Dynamic Traffic Assignment Model
with Traffic Flow Relationship. Transportation Research 3C, 51-82.
Kanzow, C. and M. Fukushima. (1996). Equivalence of the Generalized Complementarity
Problem to Differentiable Unconstrained Minimization. Journal of Optimization
Theory and Applications. 90, 2, 581-603.
LeBlanc, L., E. Morlok, and W. Pierskalla. (1975). An Efficient Approach to Solving the Road
Network Equilibrium Traffic Assignment Problem. Transportation Research. 9, 430-
442.
Lighthill, M.J. and G.B. Whitham. (1955). On Kinematic Waves. I. Flood Movement in Long
Rivers. II. A Theory of Traffic Flow on Long Crowded Roads. Proceedings of the Royal
Society, A229, 281-345.
Lo, H., B. Ran, and B. Hongola. (1996a). A Multi-Class Dynamic Traffic Assignment Model:
Formulation And Computational Experiences. Transportation Research Record 1537,
pp. 74-82.
Lo, H., W. Lin, L. Liao, E. Chang, and J. Tsao. (1996b). A Comparison of Dynamic Traffic
Models: Part I Framework. PATH Published Research Report UCB-ITS-PRR-96-22.
University of California, Berkeley, Institute of Transportation Studies.
Lo, H., W. Lin, L. Liao, E. Chang, and J. Tsao. (1996c). A Comparison of Dynamic Traffic
Models: Part II Results. PATH Published Working Paper. UCB-ITS-PWP-97-15.
University of California, Berkeley, Institute of Transportation Studies.
Lo, H. (1998). A Novel Traffic Control Formulation. Transportation Research. In press.
Lo, H. (1999). A Cell-based Traffic Control Formulation: Strategies and Benefits of Dynamic
Plans. Transportation Science. Submitted.
Lo, H. and A. Chen. (1998). Traffic Equilibrium Problem with Path-specific Costs: Formulation
and Algorithms. Transportation Research. Submitted.
Lin, W. and H. Lo. (1998). Are the Objective and Solutions of the Dynamic User-Equilibrium
Models Always Consistent? Transportation Research. In press.
Nagurney, A. (1993). Network Economics: A Variational Inequality Approach. Kluwer
Academic Publishers. Norwell, Massachusetts, USA.
Newell, G. F. (1991). A Simplified Theory of Kinematic Waves. Research Report., University
of California, Berkeley. UCB-ITS-RR-91-12.
Patriksson, M. (1994). The Traffic Assignment Problem: Models and Methods. VSP. The
Netherlands.
Ran, B, H. Lo and D. Boyce. (1996). A Formulation and Solution Algorithm for A Multi-class
Dynamic Traffic Assignment Problem. Transportation and Traffic Theory:
Proceedings of the Thirteenth International Symposium on Transportation and Traffic
Theory (ISTTT). Editor Lesort. Pergamon, UK., 195-216.
Ran, B. and D. Boyce. (1996). Modeling Dynamic Transportation Networks. An Intelligent
Transportation System Oriented Approach. Second Revised Edition. Springer-Verlag,
Heidelberg.
Richards, P.I. (1956). Shockwaves on the highway. Operations Research, 4, 42-51.
Sheffi, Y. (1985). Urban Transportation Networks: Equilibrium Analysis with Mathematical
Programming Methods. Prentice-Hall Inc. Englewood Cliffs, New Jersey.
Smith, M. J. (1993). A New Dynamic Traffic Model and the Existence and Calculation of
Dynamic User Equilibria on Congested Capacity-Constrained Road Networks. Transportation
Research. 27B, 49-63.
Wardrop, J. (1952). Some Theoretical Aspects of Road Traffic Research. Proceedings of the
Institution of Civil Engineers, Part II, 325-378.
Wie, B., T. Friesz, R. Tobin. (1990). Dynamic User Optimal Traffic Assignment on Congested
Multidestination Networks. Transportation Research, 24B, 431-442.
APPENDIX A: TRANSFORMING 2- AND 3-TERM MINIMIZATIONS TO MIXED-INTEGER CONSTRAINTS
The minimization of G = min(A, B, C) can be divided into two steps:

$$\theta = \min\{A, B\} \tag{A-1}$$

and

$$G = \min\{\theta, C\} \tag{A-2}$$

(A-1) sets θ to be the minimum of A and B. (A-2) then defines G to be the minimum
of θ and C. (A-1) can be represented equivalently by this set of linear constraints:

$$0 \le B - \theta \le U (1 - z_1) \tag{A-3}$$

$$0 \le A - \theta \le U z_1 \tag{A-4}$$

$$L (1 - z_1) \le A - B \le U z_1 \tag{A-5}$$

where L, U are, respectively, very large negative and positive constants and $z_1$ is a binary
decision variable which must be either zero or one. If $z_1 = 1$, (A-3)-(A-5) become:

$$\theta = B \tag{A-6}$$

$$0 \le A - \theta \le U \ \Rightarrow\ A \ge \theta \tag{A-7}$$

$$0 \le A - B \le U \ \Rightarrow\ A \ge B \tag{A-8}$$

Or, equivalently, θ = B when B is the minimum. On the other hand, if $z_1 = 0$, (A-3)-(A-5)
become:

$$0 \le B - \theta \le U \ \Rightarrow\ B \ge \theta \tag{A-9}$$

$$0 \le A - \theta \le 0 \ \Rightarrow\ \theta = A \tag{A-10}$$

$$L \le A - B \le 0 \ \Rightarrow\ B \ge A \tag{A-11}$$

Or, equivalently, θ = A when A is the minimum. Therefore, the minimization (A-1) can be
replaced by (A-3)-(A-5) with one binary decision variable. Similarly, (A-2) is equivalent to:

$$0 \le C - G \le U (1 - z_2) \tag{A-12}$$

$$0 \le \theta - G \le U z_2 \tag{A-13}$$

$$L (1 - z_2) \le \theta - C \le U z_2 \tag{A-14}$$

where $z_2$ is a binary decision variable. Likewise, one can verify that $G = C \Leftrightarrow z_2 = 1 \Leftrightarrow \theta \ge C$.
On the other hand, G = θ (or the minimum of A, B) $\Leftrightarrow z_2 = 0 \Leftrightarrow C \ge \theta$. Therefore, (A-2) is
equivalent to the linear constraints (A-12)-(A-14). In summary, each instance of this 3-term
minimization can be replaced by these two sets of linear constraints, (A-3)-(A-5) and
(A-12)-(A-14).
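The transformation can be checked by brute force on small numbers. The sketch below is an illustration only: it encodes the two-term constraint set as a feasibility test, assuming the form 0 ≤ B - θ ≤ U(1 - z), 0 ≤ A - θ ≤ Uz, L(1 - z) ≤ A - B ≤ Uz, chains two such sets for the three-term case, and enumerates the binary variables and candidate values. The constants `U` and `L` and the function names are hypothetical choices for this sketch.

```python
from itertools import product

U, L = 1e6, -1e6  # large positive and negative constants (illustrative values)


def min2_feasible(theta, A, B, z):
    """True if (theta, z) satisfies the two-term constraint set for theta = min(A, B)."""
    return (0 <= B - theta <= U * (1 - z)         # form of (A-3)
            and 0 <= A - theta <= U * z           # form of (A-4)
            and L * (1 - z) <= A - B <= U * z)    # form of (A-5)


def feasible_G(A, B, C):
    """All values of G that satisfy the chained constraints for G = min(A, B, C)."""
    values = {A, B, C}
    found = set()
    for theta, G, z1, z2 in product(values, values, (0, 1), (0, 1)):
        # The first set fixes theta = min(A, B); the second fixes G = min(theta, C).
        if min2_feasible(theta, A, B, z1) and min2_feasible(G, theta, C, z2):
            found.add(G)
    return found


print(feasible_G(3, 1, 2), feasible_G(5, 7, 6), feasible_G(4, 4, 9))
```

In each case the only feasible value of G is the true minimum, including the tie, where either value of the binary variable leads to the same G.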
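APPENDIX B: TRANSFORMING AN "IF-THEN" CONDITION TO MIXED-INTEGER CONSTRAINTS
An "if-then" condition of the following form:

$$C = \begin{cases} A & \text{if } D \ge E \\ B & \text{if } D < E \end{cases}$$

can be transformed to the following set of mixed-integer constraints:

$$L \sigma \le D - E \le U (1 - \sigma) - \varepsilon \tag{B-1}$$

$$L \sigma \le C - A \le U \sigma \tag{B-2}$$

$$L (1 - \sigma) \le C - B \le U (1 - \sigma) \tag{B-3}$$

where L, U are, respectively, very large negative and positive constants, ε is a very small
positive constant, and σ is a binary variable indicating the state between D and E: σ = 0 for
the case D ≥ E; σ = 1 otherwise. This can be verified by substituting the two values of σ into
(B-1):

$$\sigma = 0:\ 0 \le D - E \le U - \varepsilon \ \Rightarrow\ D \ge E; \qquad \sigma = 1:\ L \le D - E \le -\varepsilon \ \Rightarrow\ D < E$$

The value of σ then determines that of C according to (B-2)-(B-3). Say, σ = 0; according to
(B-2)-(B-3):

$$C = A$$

$$L \le C - B \le U \ \Leftrightarrow\ \text{nonrestrictive on } C$$

One can verify that σ = 1 corresponds to the case of C = B. This completes the verification.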
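The if-then transformation can be checked in the same brute-force manner. The sketch below is an illustration under stated assumptions: it takes the three constraints to be L·σ ≤ D - E ≤ U(1 - σ) - ε, L·σ ≤ C - A ≤ U·σ, and L(1 - σ) ≤ C - B ≤ U(1 - σ), so that σ = 0 forces C = A and σ = 1 forces C = B. The constants and function names are hypothetical.

```python
U, L, EPS = 1e6, -1e6, 1e-6  # illustrative big-M constants and small tolerance


def if_then_feasible(C, sigma, A, B, D, E):
    """True if (C, sigma) satisfies the assumed if-then constraint set."""
    return (L * sigma <= D - E <= U * (1 - sigma) - EPS           # state of D versus E
            and L * sigma <= C - A <= U * sigma                   # sigma = 0 forces C = A
            and L * (1 - sigma) <= C - B <= U * (1 - sigma))      # sigma = 1 forces C = B


def if_then_solutions(A, B, D, E):
    """All feasible (C, sigma) pairs with C drawn from the two candidate values."""
    return {(C, s) for C in (A, B) for s in (0, 1)
            if if_then_feasible(C, s, A, B, D, E)}


print(if_then_solutions(A=10, B=20, D=5, E=2))   # D >= E
print(if_then_solutions(A=10, B=20, D=2, E=2))   # tie, treated as D >= E
print(if_then_solutions(A=10, B=20, D=1, E=2))   # D < E
```

Each case admits exactly one feasible pair, so the binary variable and the selected value are determined by the comparison of D and E, provided the gap between D and E is not smaller than the tolerance.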
APPENDIX C: TRANSFORMING THE MID-VALUE FUNCTION TO MIXED-INTEGER CONSTRAINTS
Let G be defined as the mid-value term among A, B, and C, stated as G = mid(A, B, C). In the
following, we show how the function G is converted to a set of mixed-integer constraints.
Consider these three linear constraints:

$$L w_1 + \varepsilon \le B - A \le U (1 - w_1) \tag{C-1}$$

$$L w_2 + \varepsilon \le C - B \le U (1 - w_2) \tag{C-2}$$

$$L w_3 + \varepsilon \le A - C \le U (1 - w_3) \tag{C-3}$$

where ε is a very small positive constant; L and U are very large negative and positive
constants, respectively; and $w_1, w_2, w_3$ are binary variables indicating the state of the
inequalities shown in Table C-1. One can verify the equivalence between (C-1)-(C-3) and
Table C-1.
Table C-1 The relationship between the values of w1, w2, w3 and the inequalities
Condition    Value of w1, w2, w3
A ≥ B        w1 = 1
A < B        w1 = 0
B ≥ C        w2 = 1
B < C        w2 = 0
C ≥ A        w3 = 1
C < A        w3 = 0
To find G, we establish the following six exclusive and exhaustive cases by ranking A, B, and
C, as shown in Table C-2. The corresponding values of w1, w2, w3 follow from Table C-1.
Table C-2 The relationship between the six cases and w1, w2, w3
Case             Result   w1               w2               w3
(a) B > A > C    G = A    0                doesn't matter⁵  0
(b) C > A > B    G = A    1                doesn't matter   1
(c) C > B > A    G = B    0                0                doesn't matter
(d) A > B > C    G = B    1                1                doesn't matter
(e) A > C > B    G = C    doesn't matter   0                0
(f) B > C > A    G = C    doesn't matter   1                1
The results shown in Table C-2 are equivalent to the following six linear constraints:

$$L (w_1 + w_3) \le G - A \le U (w_1 + w_3) \tag{C-4}$$

$$L (2 - w_1 - w_3) \le G - A \le U (2 - w_1 - w_3) \tag{C-5}$$

$$L (w_1 + w_2) \le G - B \le U (w_1 + w_2) \tag{C-6}$$

$$L (2 - w_1 - w_2) \le G - B \le U (2 - w_1 - w_2) \tag{C-7}$$

$$L (w_2 + w_3) \le G - C \le U (w_2 + w_3) \tag{C-8}$$

$$L (2 - w_2 - w_3) \le G - C \le U (2 - w_2 - w_3) \tag{C-9}$$

where L, U, $w_1, w_2, w_3$ follow their earlier definitions.

⁵ "doesn't matter" here means the result is conclusive even without specifying its value.
We use one example to illustrate how the constraints (C-4)-(C-9) capture the results of Table
C-2. Say, the values of A, B, and C are as represented in case (a), with G equal to A. By
substituting the values of $w_1, w_2, w_3$ of case (a), the constraints (C-4)-(C-9) become:

$$0 \le G - A \le 0 \tag{C-10}$$

$$2L \le G - A \le 2U \tag{C-11}$$

$$L \le G - B \le U \tag{C-12}$$

$$L \le G - B \le U \tag{C-13}$$

$$L \le G - C \le U \tag{C-14}$$

The resultant constraint (C-10) requires that G = A, while the rest of the constraints are
nonrestrictive. Say, (C-11) requires that the difference G - A lie between two times a very
large negative constant and two times a very large positive constant. As L and U are defined
to be very large negative and positive constants, (C-11) is always satisfied regardless of the
values of G and A. The same argument applies to constraints (C-12)-(C-14). So all the
constraints are satisfied, with one ensuring the required result. By repeating the same
verification approach for the other five cases, one can prove that the constraints (C-1)-(C-9)
indeed precisely replicate the function G = mid(A, B, C).
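The remaining cases can be verified mechanically. The sketch below is an illustration only: it assumes that (C-1)-(C-3) take the form L·w + ε ≤ (difference) ≤ U(1 - w) with differences B - A, C - B, and A - C, encodes (C-4)-(C-9) as stated, and enumerates the binary variables for every ordering of three values. The constants and function names are hypothetical.

```python
from itertools import permutations, product

U, L, EPS = 1e6, -1e6, 1e-6  # illustrative big-M constants and small tolerance


def mid_feasible(G, A, B, C, w1, w2, w3):
    """True if (G, w1, w2, w3) satisfies the assumed forms of (C-1)-(C-9)."""
    ordering = (L * w1 + EPS <= B - A <= U * (1 - w1)        # w1 = 1 when A >= B
                and L * w2 + EPS <= C - B <= U * (1 - w2)    # w2 = 1 when B >= C
                and L * w3 + EPS <= A - C <= U * (1 - w3))   # w3 = 1 when C >= A
    selection = all(
        L * s <= G - X <= U * s and L * (2 - s) <= G - X <= U * (2 - s)
        for X, s in ((A, w1 + w3), (B, w1 + w2), (C, w2 + w3))  # (C-4)-(C-9)
    )
    return ordering and selection


def feasible_mid_values(A, B, C):
    """All values of G in {A, B, C} that are feasible for some binary assignment."""
    return {G for G in (A, B, C)
            for w in product((0, 1), repeat=3)
            if mid_feasible(G, A, B, C, *w)}


# Every ordering of three distinct values must yield the middle one.
for A, B, C in permutations((1.0, 2.0, 3.0)):
    assert feasible_mid_values(A, B, C) == {2.0}
print(feasible_mid_values(0.0, 5.0, 0.0))
```

The last call corresponds to the exit-flow use of the mid-value function with a zero third term; ties are resolved consistently because equal values give the same G whichever binary pattern is selected.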
FORMULATIONS OF EXTENDED LOGIT
STOCHASTIC USER EQUILIBRIUM
ASSIGNMENTS
Shlomo Bekhor and Joseph N. Prashker
Department of Civil Engineering, Technion - Israel Institute of Technology, Haifa,
32000, Israel.
ABSTRACT
In the transportation literature, the "logit assignment" stands for a stochastic user
equilibrium model in which the multinomial logit is the route choice model. Efficient
algorithms using this mathematical formulation were proposed to solve the logit
assignment. However, the use of the logit function for route choice has some
theoretical drawbacks. Recently, extended logit-based models were proposed to
overcome the overlapping problem while keeping the analytical tractability of the logit
function. The purpose of this paper is to present new stochastic user equilibrium
formulations. The paper shows how two extended logit models, the
Cross-Nested Logit and the Paired Combinatorial Logit, can be derived from more
general entropy-type formulations, thus allowing the use of existing (and still
developing) algorithmic solutions for the more general logit-family stochastic
assignment model. The paper also shows how the general stochastic user equilibrium
formulation can be adapted for the new route choice models.
INTRODUCTION
In the transportation literature, the "logit assignment" stands for a stochastic user
equilibrium model in which the multinomial logit is the route choice model. Fisk (1980)
developed an equivalent mathematical formulation for the stochastic equilibrium, in
which the solution obtained is the logit function.
The use of the logit function for route choice has some theoretical drawbacks. The
most discussed one is related to the independence of irrelevant alternatives (IIA)
property of the logit function. In typical transportation networks, many routes have
common links, and the structure of the simple multinomial logit model is not able to
account for these common links. The probability of choosing a route depends solely
on the total cost of each route in the choice set. Despite this theoretical drawback, the
analytical simplicity of the logit model has motivated many authors to use the logit
model for route choice formulation, arguing that the congestion effect 'alleviates' in
some way the overlapping problem.
Recently, Cascetta et al. (1996) proposed a modified logit model, named C-Logit,
which takes into account the overlapping sections of the routes, while keeping the
analytical tractability of the logit-family models. Cascetta et al. (1998) further
developed their model and showed that the C-Logit can be seen as an Implicit
Availability Perception random utility choice model. This model can be used to
generate routes for a path-based stochastic assignment model.
Prashker and Bekhor (1998) presented two other general discrete choice models of the
logit family that can also be adapted for route choice situations. The models are the
Cross-Nested Logit model of Vovsha (1997) and the Paired Combinatorial Logit
model proposed by Chu (1989) and further developed by Koppelman and Wen (1997). The
performance of the extended logit models was tested for simple networks in Prashker
and Bekhor (1998). The adaptation of the Cross-Nested Logit for stochastic
equilibrium was suggested by Vovsha and Bekhor (1998), and is further developed
here.
The purpose of this paper is to show how extended logit models, such as the
Cross-Nested Logit and the Paired Combinatorial Logit, can be derived from more
general entropy-type formulations, thus allowing the use of existing (and still
developing) algorithmic solutions for the more general logit-family stochastic
assignment model. By presenting the more general formulations and corresponding
solutions, this paper introduces a possibly wider class of stochastic user equilibrium
formulations for generalized logit models.
This paper is organized as follows: first, Fisk's (1980) mathematical formulation and
the solution are presented for completeness. Next, two extended logit models are
considered in this paper: the Cross-Nested logit model and the Paired Combinatorial
logit model. The paper presents modified entropy-type mathematical formulations and
shows that the solutions obtained from these mathematical programs are extended logit
functions. The derivation from the Sheffi and Powell (1982) mathematical formulation is
also presented. The last part of the paper discusses how the generalized formulations
can be implemented in existing algorithms that solve the logit assignment problem.
THE MULTINOMIAL LOGIT MODEL
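The solution of Fisk's equivalent stochastic user equilibrium (SUE) minimization
program is the logit route choice model. The mathematical formulation of Fisk's SUE
model is stated as follows:

$$\min Z = Z_1 + Z_2, \qquad Z_1 = \sum_a \int_0^{x_a} c_a(w)\, dw, \qquad Z_2 = \frac{1}{\theta} \sum_{rs} \sum_k f_k^{rs} \ln f_k^{rs} \tag{1}$$

$$\text{s.t.} \quad \sum_k f_k^{rs} = q^{rs}, \ \forall\, r,s; \qquad f_k^{rs} \ge 0, \ \forall\, k,r,s$$

where $f_k^{rs}$ is the flow on route k between origin r and destination s;
$c_a$ is the cost on link a;
$x_a$ is the flow on link a;
$q^{rs}$ is the demand between r and s;
θ is a dispersion parameter.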
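To show that this formulation produces the SUE solution, we proceed to develop the
first-order conditions, by forming the Lagrangean function as follows:
$$L = Z + \sum_{rs} \lambda^{rs} \Big( q^{rs} - \sum_k f_k^{rs} \Big) \tag{2}$$
We omit the indexes r and s (origin and destination) for simplicity of notation. The
partial derivative of L with respect to a path flow $f_k$ is obtained as follows:
$$\frac{\partial L}{\partial f_k} = c_k + \frac{1}{\theta} \left( \ln f_k + 1 \right) - \lambda$$
where $c_k$, the cost of route k, is the sum of the costs of the links that form the route.
The solution is obtained by equating the first derivatives to zero:
$$f_k = \exp(\theta\lambda - 1) \exp(-\theta c_k) \tag{6}$$
Summing the above expression over all routes k results in the following expression:
$$\sum_k f_k = \exp(\theta\lambda - 1) \sum_k \exp(-\theta c_k) = q \tag{7}$$
Combining the two expressions above leads to the probability of choosing a route:
$$P_k = \frac{f_k}{q} = \frac{\exp(-\theta c_k)}{\sum_l \exp(-\theta c_l)} \tag{8}$$
which is the simple multinomial logit function.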
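As a minimal numerical sketch of equations (6) to (8), the following Python fragment computes logit route probabilities and flows for a single origin-destination pair and evaluates the first-order condition $c_k + (\ln f_k + 1)/\theta = \lambda$ for each route. The function names, the three route costs, the demand and the value of $\theta$ are illustrative assumptions, not data from the paper, and route costs are treated as fixed (no congestion).

```python
import math

def logit_route_probabilities(costs, theta):
    """P_k = exp(-theta * c_k) / sum_l exp(-theta * c_l)."""
    # Shifting by the minimum cost leaves the probabilities unchanged
    # and avoids underflow when theta * cost is large.
    c_min = min(costs)
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def stationarity_values(costs, flows, theta):
    """c_k + (ln f_k + 1) / theta, which the first-order condition sets equal to lambda."""
    return [c + (math.log(f) + 1.0) / theta for c, f in zip(costs, flows)]

# Hypothetical data: three routes, demand q = 100, dispersion theta = 0.5.
costs = [10.0, 10.0, 12.0]
theta, demand = 0.5, 100.0
probs = logit_route_probabilities(costs, theta)
flows = [demand * p for p in probs]
lambdas = stationarity_values(costs, flows, theta)
```

At the logit flows, the entries of `lambdas` coincide across routes, which is the common multiplier $\lambda$ of the demand constraint.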
In the above formulation, all the routes connecting an origin-destination pair should be
considered. Akamatsu (1997) showed that a link-based assignment can be derived if
the Markov property holds for the path set. He also showed that Dial's (1971) STOCH
network loading procedure produces a path set that is consistent with the Markov
property.
THE CROSS-NESTED LOGIT MODEL
Model description
The Cross-Nested Logit model, presented by Vovsha (1997), was applied to a
mode-choice situation. The model was defined as a particular case of McFadden's
(1981) generalized extreme value (GEV) function. The probability function can be
obtained when a generator function $G(y_1, y_2, \ldots, y_n)$ satisfies the following
conditions for serving as a basis of the distribution of random utilities:
1. G(...) is non-negative.
2. G(...) is homogeneous of degree $\mu$.
3. $\lim_{y_k \to \infty} G(\ldots) = \infty$, for each k.
4. The $l$-th partial derivative of G(...) with respect to any combination of $l$ distinct $y_k$'s
is non-negative if $l$ is odd and non-positive if $l$ is even.
When these conditions are satisfied, the probability function for choosing an alternative
is obtained as follows:
$$P_k = \frac{y_k\, G_k(y_1, y_2, \ldots, y_n)}{\mu\, G(y_1, y_2, \ldots, y_n)} \tag{9}$$
where $y_k = \exp(V_k)$, $G_k = \partial G / \partial y_k$, and $\mu$ in the denominator is the degree of homogeneity of G.
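The generator function for the Cross-Nested Logit model is obtained as follows:
$$G(y_1, y_2, \ldots, y_n) = \sum_m \Big( \sum_k (\alpha_{mk}\, y_k)^{1/\mu} \Big)^{\mu} \tag{10}$$
where $\alpha_{mk}$ is the inclusion coefficient of alternative k in nest m;
$\mu$ is the nesting coefficient.
The probability of choosing an alternative (route) k is then obtained as follows:
$$P(k) = \frac{\sum_m (\alpha_{mk} e^{V_k})^{1/\mu} \Big( \sum_l (\alpha_{ml} e^{V_l})^{1/\mu} \Big)^{\mu - 1}}{\sum_b \Big( \sum_l (\alpha_{bl} e^{V_l})^{1/\mu} \Big)^{\mu}} \tag{11}$$
where the utility $V_k$ is assumed to be a linear function of the path cost $c_k$.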
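It is possible to rewrite the expression for the probability of choosing a route as
follows:
$$P(k) = \sum_m P(m)\, P(k \mid m) \tag{12}$$
where the conditional probability of a route k being chosen in link (nest) m is:
$$P(k \mid m) = \frac{(\alpha_{mk} e^{V_k})^{1/\mu}}{\sum_l (\alpha_{ml} e^{V_l})^{1/\mu}} \tag{13}$$
and the marginal probability of a nest m being chosen is:
$$P(m) = \frac{\Big( \sum_l (\alpha_{ml} e^{V_l})^{1/\mu} \Big)^{\mu}}{\sum_b \Big( \sum_l (\alpha_{bl} e^{V_l})^{1/\mu} \Big)^{\mu}} \tag{14}$$
The probability of choosing route k depends on two factors: the generalized cost of
the route, $c_k$, and the inclusion coefficients $\alpha_{mk}$ associated with the links m that form route
k. The coefficient $\mu$ indicates the degree of nesting, as in the Nested Logit model.
When $\mu$ is equal to one, the model reduces to the Multinomial Logit.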
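The reduction to the Multinomial Logit can be checked directly. The following short derivation is a sketch under two assumptions: the conditional and marginal probabilities take the form $P(k \mid m) \propto (\alpha_{mk} e^{V_k})^{1/\mu}$ and $P(m) \propto \big( \sum_l (\alpha_{ml} e^{V_l})^{1/\mu} \big)^{\mu}$, and the inclusion coefficients of each alternative sum to one over the nests, $\sum_m \alpha_{mk} = 1$ (the normalization required later in the text for the route choice adaptation).

```latex
\begin{align*}
P(k \mid m)\big|_{\mu = 1} &= \frac{\alpha_{mk} e^{V_k}}{\sum_l \alpha_{ml} e^{V_l}},
\qquad
P(m)\big|_{\mu = 1} = \frac{\sum_l \alpha_{ml} e^{V_l}}{\sum_b \sum_l \alpha_{bl} e^{V_l}}, \\
P(k)\big|_{\mu = 1} &= \sum_m P(m)\, P(k \mid m)
 = \frac{e^{V_k} \sum_m \alpha_{mk}}{\sum_l e^{V_l} \sum_b \alpha_{bl}}
 = \frac{e^{V_k}}{\sum_l e^{V_l}} .
\end{align*}
```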
Adaptation for Route Choice Situation
The adaptation of the Cross-Nested Logit model to route choice situations was
proposed by Prashker and Bekhor (1998). It is possible to define a functional
relationship for the inclusion coefficient with respect to the links in a route. In the spirit
of the work of Cascetta et al. (1996), this coefficient can be specified as follows:
$$\alpha_{mk} = \frac{L_m}{L_k}\, \delta_{mk} \tag{15}$$
where $L_m$ is the link length;
$L_k$ is the path length;
$\delta_{mk}$ is equal to 1 if link m is included in path k and zero otherwise.
In this case, the inclusion coefficient depends only on network topology. If we
assume that the inclusion coefficient is proportional to the link costs (instead of
link lengths), then $\alpha_{mk}$ also depends on congestion. However, this assumption
demands a more elaborate formulation of the GEV generator function, particularly with
respect to the condition on the partial derivatives. Therefore, in this paper, we assume that
the inclusion coefficient does not depend on congestion.
The formulation of the Cross-Nested model presented above permits an alternative (in
our case, a route) to belong to more than one nest (in our case, a link). The crossing
effect is represented by the inclusion coefficient $\alpha_{mk}$, $0 \le \alpha_{mk} \le 1$. The Nested Logit
model is a special case of the Cross-Nested Logit model, in which the coefficients $\alpha_{mk}$
are either zero or one. When $\alpha_{mk}$ is zero, alternative k does not belong to nest m,
and when $\alpha_{mk}$ is one, alternative k belongs only to nest m.
Figure 1 below illustrates how the Nested Logit (NL) and Cross-Nested Logit (CNL)
models take into account the overlapping part of a simple network. For comparison,
the figure also includes a simple Multinomial Logit (MNL) model, which cannot
account for the effect of common links between different routes.
Figure 1. Overlapping effect in MNL, NL and CNL models.
[Figure: a network between nodes x and y with Route 1 (link A), Route 2 (links B-C) and
Route 3 (links B-D), shown with the tree representations of the Multinomial Logit,
Nested Logit and Cross-Nested Logit models.]
Figure 1 above shows a simple network with four links. There are three routes between
x and y. Routes 2 and 3 have a common link B. In the MNL model, the probability of
choosing a route depends only on the total route cost, so it is not possible to
"isolate" link B in the tree structure. In the NL model, each route belongs to only one
nest. In this simple example, the tree structure is also simple. However, for bigger
networks, with many links shared by many routes, the tree structure becomes
extremely complicated. Since each route is restricted to only one nest, the tree
representation must "duplicate" the common links in order to form different routes.
The tree representation of the CNL model differs from that of the NL model in two
respects. First, all links are grouped in the upper level of the tree, indicating that each
link may belong to different routes. In this way, the tree structure is kept simple, with
only two levels. The second point is related to the inclusion coefficient. Since a route
may belong to more than one nest (e.g., route 2 belongs to nests B and C), the inclusion
coefficient represents the proportion in which the route is "split" among the nests. To
keep consistency, the inclusion coefficients of each route must sum to one
(e.g., $\alpha_{B2} + \alpha_{C2} = 1$).
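The example network can be worked through numerically. The sketch below assumes hypothetical link lengths for links A to D, sets route costs equal to route lengths, takes $V_k = -\theta c_k$, uses the length-based inclusion coefficient $\alpha_{mk} = L_m / L_k$, and evaluates the CNL probability as the sum over nests of marginal times conditional probability. The function names and all numerical values are illustrative assumptions.

```python
import math

def inclusion_coefficients(link_lengths, routes):
    """alpha[m][k] = L_m / L_k for every link m that belongs to route k."""
    alpha = {m: {} for m in link_lengths}
    for k, links in routes.items():
        route_length = sum(link_lengths[m] for m in links)
        for m in links:
            alpha[m][k] = link_lengths[m] / route_length
    return alpha

def cnl_probabilities(costs, alpha, theta, mu):
    """Route probabilities P(k) = sum_m P(m) * P(k|m) with V_k = -theta * c_k."""
    # terms[m][k] = (alpha_mk * exp(-theta * c_k)) ** (1 / mu)
    terms = {m: {k: (a * math.exp(-theta * costs[k])) ** (1.0 / mu)
                 for k, a in members.items()}
             for m, members in alpha.items()}
    nest_sums = {m: sum(t.values()) for m, t in terms.items()}
    denominator = sum(s ** mu for s in nest_sums.values())
    probs = {k: 0.0 for k in costs}
    for m, t in terms.items():
        p_nest = nest_sums[m] ** mu / denominator       # marginal P(m)
        for k, term in t.items():
            probs[k] += p_nest * term / nest_sums[m]     # P(m) * P(k|m)
    return probs

# Hypothetical lengths for the network of Figure 1; costs are set equal to lengths.
link_lengths = {"A": 10.0, "B": 6.0, "C": 4.0, "D": 4.0}
routes = {1: ["A"], 2: ["B", "C"], 3: ["B", "D"]}
costs = {k: sum(link_lengths[m] for m in links) for k, links in routes.items()}
alpha = inclusion_coefficients(link_lengths, routes)
p_mnl = cnl_probabilities(costs, alpha, theta=0.2, mu=1.0)
p_cnl = cnl_probabilities(costs, alpha, theta=0.2, mu=0.5)
```

With these assumed values all three routes have equal cost, so $\mu = 1$ gives one third each, while $\mu = 0.5$ shifts probability towards the independent route 1 (about 0.378, against about 0.311 for each of the overlapping routes 2 and 3).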
It should be noted that in principle, for each origin-destination pair, each link in the
network is potentially a nest. The implementation of such a model for real-size
networks is of course not possible. A practical solution for this problem may be
obtained by reducing the choice set (the available routes) for each o-d pair, as with
other explicit route enumeration methods. In this way, the nests for each o-d pair are
restricted to the links that form the routes in the choice set.
The following section develops the stochastic user equilibrium formulation whose
solution is the Cross-Nested Logit model.
Cross Nested Logit Equivalent Mathematical Formulation
The next step is the development of a mathematical formulation, similar to Fisk's, for
which the solution obtained is the Cross-Nested Logit model. The CNL model is a
hierarchical model that can be decomposed into marginal and conditional probabilities.
Similarly, the objective function has to be decomposed into two entropy terms, instead
of only one in the multinomial logit model. In this way, it is possible to obtain the
conditional and marginal probabilities as solutions of the equivalent formulation.

Towards this goal, we formulate the following mathematical program:
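$$\min Z = Z_1 + Z_2 + Z_3$$
$$Z_1 = \sum_a \int_0^{x_a} c_a(w)\,dw$$
$$Z_2 = \frac{\mu}{\theta} \sum_{rs} \sum_m \sum_k f_{mk}^{rs} \ln \frac{f_{mk}^{rs}}{(\alpha_{mk}^{rs})^{1/\mu}}$$
$$Z_3 = \frac{1-\mu}{\theta} \sum_{rs} \sum_m \Big( \sum_k f_{mk}^{rs} \Big) \ln \Big( \sum_k f_{mk}^{rs} \Big) \tag{16}$$
$$\text{s.t.} \quad \sum_m \sum_k f_{mk}^{rs} = q^{rs}, \quad \forall\, r,s$$
$$f_{mk}^{rs} \ge 0, \quad \forall\, m,k,r,s$$
where $\theta$ is the dispersion coefficient and $\mu$ is the nesting coefficient.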
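There are two main differences between the mathematical program formulated above
and Fisk's: first, the inclusion of another entropy term ($Z_3$), corresponding to the
higher choice level, and second, the modification of the entropy term ($Z_2$) to include the
inclusion coefficient. The summation of the path flows is decomposed by the m links
(nests).
Omitting the indexes r and s, the Lagrangean function is formed as follows:
$$L = Z + \lambda \Big( q - \sum_m \sum_k f_{mk} \Big) \tag{17}$$
The partial derivatives are obtained as follows:
$$\frac{\partial Z_1}{\partial f_{mk}} = c_k, \qquad \frac{\partial Z_2}{\partial f_{mk}} = \frac{\mu}{\theta} \Big( \ln \frac{f_{mk}}{\alpha_{mk}^{1/\mu}} + 1 \Big), \qquad \frac{\partial Z_3}{\partial f_{mk}} = \frac{1-\mu}{\theta} \Big( \ln \sum_l f_{ml} + 1 \Big)$$
$$\frac{\partial L}{\partial f_{mk}} = \frac{\partial Z_1}{\partial f_{mk}} + \frac{\partial Z_2}{\partial f_{mk}} + \frac{\partial Z_3}{\partial f_{mk}} - \lambda \tag{20}$$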
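Equating the partial derivatives to zero and multiplying by $\theta$ we obtain the following:
$$\theta c_k + \mu \ln \frac{f_{mk}}{\alpha_{mk}^{1/\mu}} + (1-\mu) \ln \sum_l f_{ml} + 1 - \theta\lambda = 0 \tag{21}$$
Rearranging terms:
$$\ln \frac{f_{mk}^{\mu} \big( \sum_l f_{ml} \big)^{1-\mu}}{\alpha_{mk}} = \theta\lambda - 1 - \theta c_k \tag{22}$$
Taking the exponent:
$$f_{mk}^{\mu} \Big( \sum_l f_{ml} \Big)^{1-\mu} = \alpha_{mk} \exp(\theta\lambda - 1) \exp(-\theta c_k) \tag{23}$$
Raising both sides to the power $1/\mu$:
$$f_{mk} \Big( \sum_l f_{ml} \Big)^{(1-\mu)/\mu} = \alpha_{mk}^{1/\mu} \exp\Big( \frac{\theta\lambda - 1}{\mu} \Big) \exp\Big( -\frac{\theta c_k}{\mu} \Big) \tag{24}$$
Summing the above expression over routes k provides the following expression:
$$\Big( \sum_k f_{mk} \Big)^{1/\mu} = \exp\Big( \frac{\theta\lambda - 1}{\mu} \Big) \sum_k \alpha_{mk}^{1/\mu} \exp\Big( -\frac{\theta c_k}{\mu} \Big) \tag{25}$$
Raising both sides to the power $\mu$:
$$\sum_k f_{mk} = \exp(\theta\lambda - 1) \Big( \sum_k \alpha_{mk}^{1/\mu} \exp\Big( -\frac{\theta c_k}{\mu} \Big) \Big)^{\mu} \tag{26}$$
Summing the above expression over links (nests) m:
$$\sum_m \sum_k f_{mk} = q = \exp(\theta\lambda - 1) \sum_m \Big( \sum_k \alpha_{mk}^{1/\mu} \exp\Big( -\frac{\theta c_k}{\mu} \Big) \Big)^{\mu} \tag{27}$$
and finally dividing equation (26) by equation (27) leads to:
$$\frac{\sum_k f_{mk}}{\sum_b \sum_k f_{bk}} = \frac{\Big( \sum_k \alpha_{mk}^{1/\mu} \exp(-\theta c_k/\mu) \Big)^{\mu}}{\sum_b \Big( \sum_k \alpha_{bk}^{1/\mu} \exp(-\theta c_k/\mu) \Big)^{\mu}} \tag{28}$$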
which corresponds to the marginal probability of nest m being chosen. To obtain the
conditional probability, we divide equation (24) by (25):
$$\frac{f_{mk}}{\sum_l f_{ml}} = \frac{\alpha_{mk}^{1/\mu} \exp(-\theta c_k/\mu)}{\sum_l \alpha_{ml}^{1/\mu} \exp(-\theta c_l/\mu)} \tag{30}$$
which corresponds to the conditional probability of route k being chosen in nest m.
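The derivation can be verified numerically. The sketch below builds nest-route flows $f_{mk} = q\,P(m)\,P(k \mid m)$ from the Cross-Nested Logit probabilities and evaluates the quantity $\theta c_k + \mu \ln (f_{mk} / \alpha_{mk}^{1/\mu}) + (1-\mu) \ln \sum_l f_{ml}$, which the first-order condition requires to take the same value for every nest-route pair. The inclusion coefficients, costs, demand and parameter values are hypothetical, and route costs are held fixed.

```python
import math

def cnl_nest_route_flows(costs, alpha, theta, mu, demand):
    """Flows f_mk = q * P(m) * P(k|m) for every nest m and member route k."""
    terms = {(m, k): (a * math.exp(-theta * costs[k])) ** (1.0 / mu)
             for m, members in alpha.items() for k, a in members.items()}
    nest_sums = {}
    for (m, _), t in terms.items():
        nest_sums[m] = nest_sums.get(m, 0.0) + t
    denominator = sum(s ** mu for s in nest_sums.values())
    return {(m, k): demand * (nest_sums[m] ** mu / denominator) * (t / nest_sums[m])
            for (m, k), t in terms.items()}

def first_order_values(flows, costs, alpha, theta, mu):
    """theta*c_k + mu*ln(f_mk / alpha_mk**(1/mu)) + (1 - mu)*ln(sum_l f_ml)."""
    nest_flow = {}
    for (m, _), f in flows.items():
        nest_flow[m] = nest_flow.get(m, 0.0) + f
    return {(m, k): theta * costs[k]
                    + mu * math.log(f / alpha[m][k] ** (1.0 / mu))
                    + (1.0 - mu) * math.log(nest_flow[m])
            for (m, k), f in flows.items()}

# Hypothetical data: links A-D as nests, three routes with unequal costs.
alpha = {"A": {1: 1.0}, "B": {2: 0.6, 3: 0.6}, "C": {2: 0.4}, "D": {3: 0.4}}
costs = {1: 10.0, 2: 11.0, 3: 12.0}
flows = cnl_nest_route_flows(costs, alpha, theta=0.2, mu=0.5, demand=100.0)
values = first_order_values(flows, costs, alpha, theta=0.2, mu=0.5)
```

All entries of `values` agree to rounding error; their common value plays the role of $\theta\lambda - 1$ in the derivation.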
Uniqueness Conditions
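The uniqueness of the solution relies on the proof that the objective function and the
feasible region are convex. The term $Z_1$ (the Beckmann-type term) and the feasible
region are the same as in Fisk's formulation, and are therefore convex. It is left to
show that the components $Z_2$ and $Z_3$ are convex. Differentiating $Z_2$ and $Z_3$ with respect to a path
flow variable gives the following:
$$\frac{\partial Z_2}{\partial f_{mk}} = \frac{\mu}{\theta} \Big( \ln \frac{f_{mk}}{\alpha_{mk}^{1/\mu}} + 1 \Big) \tag{31}$$
$$\frac{\partial Z_3}{\partial f_{mk}} = \frac{1-\mu}{\theta} \Big( \ln \sum_l f_{ml} + 1 \Big) \tag{32}$$
Differentiating both expressions with respect to another path flow variable gives the following:
$$\frac{\partial^2 Z_2}{\partial f_{mk}\, \partial f_{nj}} = \begin{cases} \dfrac{\mu}{\theta f_{mk}} & \text{if } n = m \text{ and } j = k \\ 0 & \text{otherwise} \end{cases} \tag{33}$$
$$\frac{\partial^2 Z_3}{\partial f_{mk}\, \partial f_{nj}} = \begin{cases} \dfrac{1-\mu}{\theta \sum_l f_{ml}} & \text{if } n = m \\ 0 & \text{otherwise} \end{cases} \tag{34}$$
The Hessian of $Z_2$ is positive definite, and the Hessian of $Z_3$ is positive semi-definite
(the determinant is equal to zero). This assures convexity of the whole objective
function, since the sum of convex functions is also convex. Hence, the solution is
unique in terms of the path-flow variable $f_{mk}$. This means that, in principle, all flows in all
nests should be included in the path-set. This represents an additional difficulty with
respect to the simple logit model. This problem is further discussed in the section
on the algorithmic implementation.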
THE PAIRED COMBINATORIAL LOGIT MODEL
Model description
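Another GEV-type model, proposed by Chu (1989) and later developed by Koppelman
and Wen (1997), was also adapted to model route choice in Prashker and Bekhor
(1998). The generator function G(...) in this case is as follows:
$$G(y_1, y_2, \ldots, y_n) = \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} (1 - \sigma_{kj}) \Big( y_k^{1/(1-\sigma_{kj})} + y_j^{1/(1-\sigma_{kj})} \Big)^{1-\sigma_{kj}} \tag{35}$$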
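The probability of choosing an alternative (route) k is given as follows:
$$P(k) = \frac{\sum_{j \ne k} (1-\sigma_{kj})\, e^{V_k/(1-\sigma_{kj})} \Big( e^{V_k/(1-\sigma_{kj})} + e^{V_j/(1-\sigma_{kj})} \Big)^{-\sigma_{kj}}}{\sum_{l=1}^{n-1} \sum_{m=l+1}^{n} (1-\sigma_{lm}) \Big( e^{V_l/(1-\sigma_{lm})} + e^{V_m/(1-\sigma_{lm})} \Big)^{1-\sigma_{lm}}} \tag{36}$$
where $\sigma_{kj}$ is an index of similarity between alternatives k and j.
The double summation includes n(n-1)/2 elements, which is the number of different
pairs of alternatives in the choice set of n alternatives. If $\sigma_{kj}$ is equal to zero for all k, j
pairs, the PCL collapses to the MNL model. The PCL model allows a differential
correlation between pairs of alternatives, as can be seen as follows. Let:
$$P(k) = \sum_{j \ne k} P(k \mid kj)\, P(kj) \tag{37}$$
where $P(k \mid kj)$ is the conditional probability of choosing alternative k, given that the
binary pair (k, j) was chosen, as follows:
$$P(k \mid kj) = \frac{\exp\big( V_k/(1-\sigma_{kj}) \big)}{\exp\big( V_k/(1-\sigma_{kj}) \big) + \exp\big( V_j/(1-\sigma_{kj}) \big)} \tag{38}$$
and $P(kj)$ is the marginal probability for the binary pair (k, j) as follows:
$$P(kj) = \frac{(1-\sigma_{kj}) \Big( \exp\big( V_k/(1-\sigma_{kj}) \big) + \exp\big( V_j/(1-\sigma_{kj}) \big) \Big)^{1-\sigma_{kj}}}{\sum_{l=1}^{n-1} \sum_{m=l+1}^{n} (1-\sigma_{lm}) \Big( \exp\big( V_l/(1-\sigma_{lm}) \big) + \exp\big( V_m/(1-\sigma_{lm}) \big) \Big)^{1-\sigma_{lm}}} \tag{39}$$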
In the Nested Logit model, all pairs of alternatives in a common grouping are required
to be similar. In the PCL model, each pair of alternatives can have a similarity
relationship that is completely independent of the similarity relationship of other pairs
of alternatives. This feature is highly desirable for route choice models, since each pair
of routes may have different similarities.
Adaptation for Route Choice Situation
Like the Cross-Nested Logit model, which was adapted to route choice by defining
the inclusion coefficient, it is possible to relate the similarity index to the network
topology. The functional form is similar to that of the C-Logit model, as follows:
$$\sigma_{kj} = \frac{L_{kj}}{\sqrt{L_k L_j}} \tag{40}$$
where $L_{kj}$ is the length of the common part of routes k and j.
The above equation confines the similarity index to values between zero and one.
These conditions have to hold for the PCL model to be consistent with random utility
maximization. If $\sigma_{kj}$ approaches one, the links of one path coincide with the
links of the other path (maximum overlap). On the other hand,
if the similarity index is zero, the paths have no link in common
(disjoint paths).
Figure 2 below illustrates the tree representation of the PCL model for the same simple
network example as Figure 1.
Figure 2. Overlapping effect in PCL model.
[Figure: the network of Figure 1, with Route 1 (link A), Route 2 (links B-C) and
Route 3 (links B-D), shown with the tree representation of the Paired Combinatorial
Logit model.]
Figure 2 above shows the tree representation for the PCL model. In the upper level of
the tree representation, the similarity index between the different routes can be
calculated by equation (40). Since there is no link in common between routes 1 and 2
and between routes 1 and 3, the similarity index in both cases is zero. Since the model
is based on pair comparisons, each route in the lower level is reached by two points
from the upper level.
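The pairwise structure can be illustrated for the three-route example. The sketch below assumes a length-based similarity index of the form $\sigma_{kj} = L_{kj} / \sqrt{L_k L_j}$ (an assumed C-Logit-like form), utilities $V_k = -\theta c_k$, and hypothetical lengths in which routes 2 and 3 share 6 of their 10 length units; the function names and numbers are illustrative only.

```python
import math
from itertools import combinations

def similarity_index(common_length, length_k, length_j):
    """Assumed form: sigma_kj = L_kj / sqrt(L_k * L_j)."""
    return common_length / math.sqrt(length_k * length_j)

def pcl_probabilities(costs, sigma, theta):
    """P(k) = sum over pairs containing k of P(pair) * P(k | pair)."""
    pair_terms = {}
    for k, j in combinations(sorted(costs), 2):
        beta = 1.0 - sigma.get((k, j), 0.0)    # dissimilarity, must stay positive
        e_k = math.exp(-theta * costs[k] / beta)
        e_j = math.exp(-theta * costs[j] / beta)
        pair_terms[(k, j)] = (beta, e_k, e_j)
    denominator = sum(b * (ek + ej) ** b for b, ek, ej in pair_terms.values())
    probs = {k: 0.0 for k in costs}
    for (k, j), (b, ek, ej) in pair_terms.items():
        p_pair = b * (ek + ej) ** b / denominator   # marginal probability of the pair
        probs[k] += p_pair * ek / (ek + ej)          # binary logit within the pair
        probs[j] += p_pair * ej / (ek + ej)
    return probs

# Hypothetical data: three routes of length 10; routes 2 and 3 share link B (length 6).
costs = {1: 10.0, 2: 10.0, 3: 10.0}
sigma = {(2, 3): similarity_index(6.0, 10.0, 10.0)}   # pairs (1,2) and (1,3) default to zero
p_zero = pcl_probabilities(costs, {}, theta=0.2)
p_pcl = pcl_probabilities(costs, sigma, theta=0.2)
```

With all similarity indices equal to zero the result coincides with the MNL shares (one third each); with $\sigma_{23} = 0.6$ the independent route 1 obtains about 0.442 and each overlapping route about 0.279.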
The number of nests in the PCL model increases rapidly with network size, since
theoretically the upper level includes all possible route pairs. However, as in the CNL
model, the tree structure does not change with network size. To implement the
PCL model for real-size networks, the number of routes between each o-d pair should
be kept small. This is critical in the PCL model, since it requires a double summation over
the pairs of routes in an o-d pair (see equations 36 or 39) to compute the probability
of choosing a path. As with the CNL model, an explicit route enumeration method may
be used to generate the available routes for each o-d pair.
The following section shows the development of a stochastic user equilibrium

formulation, for which the solution is the Paired Combinatorial Logit model.
Paired Combinatorial Logit Equivalent Mathematical Formulation
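The proposed mathematical formulation follows the idea presented for the Cross-Nested
model. Since the PCL model is also a hierarchical model, the objective function should
be composed of two entropy terms: one reflecting the higher level (marginal
probability of choosing a pair of routes) and the other reflecting the lower level
(conditional probability of choosing a route given a chosen pair of routes). We form
the following mathematical program:
$$\min Z = Z_1 + Z_2 + Z_3$$
$$Z_1 = \sum_a \int_0^{x_a} c_a(w)\,dw$$
$$Z_2 = \frac{1}{\theta} \sum_{rs} \sum_k \sum_{j \ne k} \beta_{kj}\, f_{k(kj)}^{rs} \ln \frac{f_{k(kj)}^{rs}}{\beta_{kj}}$$
$$Z_3 = \frac{1}{\theta} \sum_{rs} \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} (1-\beta_{kj}) \Big( f_{k(kj)}^{rs} + f_{j(kj)}^{rs} \Big) \ln \frac{f_{k(kj)}^{rs} + f_{j(kj)}^{rs}}{\beta_{kj}} \tag{41}$$
$$\text{s.t.} \quad \sum_k \sum_{j \ne k} f_{k(kj)}^{rs} = q^{rs}, \quad \forall\, r,s$$
$$f_{k(kj)}^{rs} \ge 0, \quad \forall\, k,j,r,s$$
where $f_{k(kj)}^{rs}$ is the flow on route k (of the route pair kj) between r and s;
$\beta_{kj}$ is a measure of dissimilarity, defined as $\beta_{kj} = 1 - \sigma_{kj}$.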
Similar to the Cross-Nested model, there are two main differences between the
mathematical program formulated above and Fisk's: first, the modification of the
entropy term ($Z_2$) to include the similarity index in the final expression. Second, the
inclusion of another entropy term ($Z_3$), corresponding to the higher choice level. The
summation of the path flows is decomposed by the combinatorial number of pairs.
The solution is obtained in a similar way as for the Cross-Nested Logit model. To obtain
the first-order conditions, we form the Lagrangean function, and omit the indexes r and
s (origin and destination) for simplicity of notation. After forming the Lagrangean
function, equating the partial derivatives to zero, multiplying by $\theta/\beta_{kj}$ and taking
the exponent, we obtain the following:
$$\frac{f_{k(kj)}}{\beta_{kj}} \Big( \frac{f_{k(kj)} + f_{j(kj)}}{\beta_{kj}} \Big)^{(1-\beta_{kj})/\beta_{kj}} = \exp\Big( \frac{\theta\lambda}{\beta_{kj}} - \frac{1}{\beta_{kj}} - \frac{\theta c_k}{\beta_{kj}} \Big) \tag{42}$$
For the other route j of the pair, different from k, the following equation holds:
$$\frac{f_{j(kj)}}{\beta_{kj}} \Big( \frac{f_{k(kj)} + f_{j(kj)}}{\beta_{kj}} \Big)^{(1-\beta_{kj})/\beta_{kj}} = \exp\Big( \frac{\theta\lambda}{\beta_{kj}} - \frac{1}{\beta_{kj}} - \frac{\theta c_j}{\beta_{kj}} \Big) \tag{43}$$
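Dividing equation (42) by (43) gives:
$$\frac{f_{k(kj)}}{f_{j(kj)}} = \frac{\exp(-\theta c_k/\beta_{kj})}{\exp(-\theta c_j/\beta_{kj})} \tag{44}$$
which gives the binary logit model as the conditional probability of choosing route k
from route pair kj:
$$P(k \mid kj) = \frac{f_{k(kj)}}{f_{k(kj)} + f_{j(kj)}} = \frac{\exp(-\theta c_k/\beta_{kj})}{\exp(-\theta c_k/\beta_{kj}) + \exp(-\theta c_j/\beta_{kj})} \tag{45}$$
where $\beta_{kj} = 1 - \sigma_{kj}$ and $\sigma_{kj}$ is the similarity index.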
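To obtain the marginal probability, we sum equations (42) and (43) and raise the result to the
power $\beta_{kj}$:
$$f_{k(kj)} + f_{j(kj)} = \beta_{kj} \exp(\theta\lambda - 1) \Big( \exp(-\theta c_k/\beta_{kj}) + \exp(-\theta c_j/\beta_{kj}) \Big)^{\beta_{kj}} \tag{46}$$
Summing the above expression over all possible pairs m,l gives:
$$\sum_{m=1}^{n-1} \sum_{l=m+1}^{n} \Big( f_{m(ml)} + f_{l(ml)} \Big) = q = \exp(\theta\lambda - 1) \sum_{m=1}^{n-1} \sum_{l=m+1}^{n} \beta_{ml} \Big( \exp(-\theta c_m/\beta_{ml}) + \exp(-\theta c_l/\beta_{ml}) \Big)^{\beta_{ml}} \tag{47}$$
Dividing (46) by (47) gives the marginal probability of choosing pair kj amongst all
possible pairs m,l:
$$P(kj) = \frac{\beta_{kj} \Big( \exp(-\theta c_k/\beta_{kj}) + \exp(-\theta c_j/\beta_{kj}) \Big)^{\beta_{kj}}}{\sum_{m=1}^{n-1} \sum_{l=m+1}^{n} \beta_{ml} \Big( \exp(-\theta c_m/\beta_{ml}) + \exp(-\theta c_l/\beta_{ml}) \Big)^{\beta_{ml}}} \tag{48}$$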
Uniqueness Conditions
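As with the former cases, the feasible region and $Z_1$ are convex. The first derivatives of
$Z_2$ and $Z_3$ with respect to a path flow variable are equal to the following:
$$\frac{\partial Z_2}{\partial f_{k(kj)}} = \frac{\beta_{kj}}{\theta} \ln \frac{f_{k(kj)}}{\beta_{kj}} + \frac{\beta_{kj}}{\theta} \tag{49}$$
$$\frac{\partial Z_3}{\partial f_{k(kj)}} = \frac{1-\beta_{kj}}{\theta} \ln \frac{f_{k(kj)} + f_{j(kj)}}{\beta_{kj}} + \frac{1-\beta_{kj}}{\theta} \tag{50}$$
The second derivatives of $Z_2$ and $Z_3$ with respect to another path flow variable are
equal to the following:
$$\frac{\partial^2 Z_2}{\partial f_{k(kj)}\, \partial f_{l(lm)}} = \begin{cases} \dfrac{\beta_{kj}}{\theta f_{k(kj)}} & \text{if } l = k \text{ and } m = j \\ 0 & \text{otherwise} \end{cases} \tag{51}$$
$$\frac{\partial^2 Z_3}{\partial f_{k(kj)}\, \partial f_{l(lm)}} = \begin{cases} \dfrac{1-\beta_{kj}}{\theta \big( f_{k(kj)} + f_{j(kj)} \big)} & \text{if } (l, m) \text{ and } (k, j) \text{ are the same pair} \\ 0 & \text{otherwise} \end{cases} \tag{52}$$
Similar to the Cross-Nested formulation, $Z_2$ is strictly convex (its Hessian is a positive
diagonal matrix) and the Hessian of $Z_3$ is positive semi-definite. Therefore, the objective
function is convex, assuring uniqueness of the solution in terms of the path-flow variable $f_{k(kj)}$.
This means that, in principle, all flows in all pairs should be included in the path-set. As
with the Cross-Nested formulation, this represents an additional difficulty with respect
to the simple logit model. This problem is further discussed in the section on the
algorithmic implementation.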
DERIVATION FROM SHEFFI'S FORMULATION
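The new formulations presented in this paper are extensions of Fisk's equivalent
formulation, which is specific to the logit model. It is also possible to derive the
solution from the general mathematical formulation of the SUE problem, developed
by Sheffi and Powell (1982), as follows:
$$\min Z = \sum_a x_a c_a - \sum_{rs} q^{rs} S^{rs} - \sum_a \int_0^{x_a} c_a(w)\,dw \tag{53}$$
where $S^{rs}$ is the satisfaction function, defined as:
$$S^{rs} = E\Big[ \min_k \{ C_k^{rs} \} \,\Big|\, c^{rs}(x) \Big] \tag{54}$$
The solution of this mathematical program is obtained at the link level as follows:
$$x_a = \sum_{rs} q^{rs} \sum_k P_k^{rs}\, \delta_{ak}^{rs} \tag{55}$$
where $P_k^{rs}$ is the probability of choosing route k between r and s, and $\delta_{ak}^{rs}$ is
equal to 1 if link a is included in route k and zero otherwise.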
The importance of the above formulation lies in the algorithmic implementation.
Equation (55) above provides a descent direction. The step size computation is more
complicated, since it involves the evaluation of the satisfaction function. For
probit-based models, this function can only be approximated, but for logit-based
models, the satisfaction function can be evaluated in closed form. For the case of the
multinomial logit model, the satisfaction function is evaluated as follows:
$$S^{rs} = E\Big[ \min_k C_k^{rs} \Big] = -\frac{1}{\theta} \ln \sum_k \exp(-\theta c_k^{rs}) \tag{56}$$
The above expression was obtained based on Williams' (1977) theorem, which states
that the expected perceived utility of an alternative is equal to the composite cost of
the alternative. The composite cost, in turn, is obtained from the logarithm of
McFadden's (1981) GEV generator function.
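Since both the Cross-Nested and Paired Combinatorial Logit models are derived from
GEV theory, the satisfaction functions can be obtained in a straightforward manner.
For the Cross-Nested Logit model, the satisfaction function is given by:
$$S^{rs} = -\frac{1}{\theta} \ln \sum_m \Big( \sum_k \big( \alpha_{mk} \exp(-\theta c_k^{rs}) \big)^{1/\mu} \Big)^{\mu} \tag{57}$$
For the Paired Combinatorial Logit model, the satisfaction function is given by:
$$S^{rs} = -\frac{1}{\theta} \ln \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} (1-\sigma_{kj}) \Big( \exp\Big( -\frac{\theta c_k^{rs}}{1-\sigma_{kj}} \Big) + \exp\Big( -\frac{\theta c_j^{rs}}{1-\sigma_{kj}} \Big) \Big)^{1-\sigma_{kj}} \tag{58}$$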
The closed-form formulations presented above allow for an analytical evaluation
of the whole objective function. A possible algorithmic implementation using these
formulations is discussed further below. However, the solution of Sheffi's mathematical
formulation can be proved only at the link level. The extended logit formulations, as
with Fisk's formulation, can provide the solution at the route level.
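The closed-form satisfaction functions are simple to evaluate. The sketch below computes them as $-(1/\theta)$ times the logarithm of the respective GEV generator function evaluated at $y_k = \exp(-\theta c_k)$, assuming generator functions of the cross-nested form $\sum_m (\sum_k (\alpha_{mk} y_k)^{1/\mu})^{\mu}$ and of the paired form with a $(1-\sigma_{kj})$ weight on each pair. Function names, costs and coefficients are hypothetical.

```python
import math
from itertools import combinations

def satisfaction_mnl(costs, theta):
    """-(1/theta) * ln sum_k exp(-theta * c_k)."""
    return -math.log(sum(math.exp(-theta * c) for c in costs.values())) / theta

def satisfaction_cnl(costs, alpha, theta, mu):
    """-(1/theta) * ln sum_m (sum_k (alpha_mk * exp(-theta * c_k))**(1/mu))**mu."""
    g = sum(sum((a * math.exp(-theta * costs[k])) ** (1.0 / mu)
                for k, a in members.items()) ** mu
            for members in alpha.values())
    return -math.log(g) / theta

def satisfaction_pcl(costs, sigma, theta):
    """-(1/theta) * ln of the paired generator, summed over all route pairs."""
    g = 0.0
    for k, j in combinations(sorted(costs), 2):
        beta = 1.0 - sigma.get((k, j), 0.0)
        g += beta * (math.exp(-theta * costs[k] / beta)
                     + math.exp(-theta * costs[j] / beta)) ** beta
    return -math.log(g) / theta

# Hypothetical data for the three-route example.
costs = {1: 10.0, 2: 11.0, 3: 12.0}
alpha = {"A": {1: 1.0}, "B": {2: 0.6, 3: 0.6}, "C": {2: 0.4}, "D": {3: 0.4}}
theta = 0.2
s_mnl = satisfaction_mnl(costs, theta)
s_cnl = satisfaction_cnl(costs, alpha, theta, mu=0.5)
s_pcl = satisfaction_pcl(costs, {(2, 3): 0.6}, theta)
```

Two limiting cases serve as checks under these assumptions: with $\mu = 1$ the cross-nested value equals the multinomial one, and with all similarity indices equal to zero the paired value differs from the multinomial one only by the constant $\ln(n-1)/\theta$, which has no effect on the choice probabilities.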
This concludes the presentation of two stochastic user equilibrium formulations for
extended logit models. The final part of the paper discusses algorithmic
implementations which make use of the above formulations.
ALGORITHMIC CONSIDERATIONS
The Method of Successive Averages (MSA) algorithm may be used to solve the logit
assignment problem, as well as assignment problems based on other route choice
models, such as the probit model. The method is not efficient, in the sense that
predetermined step lengths are used instead of optimizing the step length in each
iteration. Since the equivalent mathematical program for the logit model is a convex
program, better algorithms can be proposed.
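For illustration, a minimal MSA sketch for logit assignment on two parallel (non-overlapping) routes is given below. The BPR-type cost function, its parameters, the demand and the dispersion parameter are assumptions made for this example only; the predetermined step 1/n at iteration n is the feature of MSA referred to above.

```python
import math

def link_cost(flow, free_time, capacity):
    """BPR-type cost function with hypothetical parameters 0.15 and 4."""
    return free_time * (1.0 + 0.15 * (flow / capacity) ** 4)

def logit_loading(costs, theta, demand):
    """Stochastic network loading: flows proportional to exp(-theta * cost)."""
    c_min = min(costs)
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [demand * w / total for w in weights]

def msa_logit_sue(free_times, capacities, theta, demand, iterations=2000):
    """Method of Successive Averages with predetermined step 1/n."""
    n_routes = len(free_times)
    flows = [demand / n_routes] * n_routes
    for n in range(1, iterations + 1):
        costs = [link_cost(f, t, c) for f, t, c in zip(flows, free_times, capacities)]
        auxiliary = logit_loading(costs, theta, demand)
        step = 1.0 / n
        flows = [f + step * (y - f) for f, y in zip(flows, auxiliary)]
    return flows

# Hypothetical two-route example.
free_times, capacities = [10.0, 12.0], [50.0, 60.0]
sue_flows = msa_logit_sue(free_times, capacities, theta=0.5, demand=100.0)
```

At convergence the flows reproduce themselves under the logit loading, which is the fixed-point property of the stochastic user equilibrium.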
Another important issue in traffic assignment (deterministic and stochastic) is the
choice between link-flow and path-flow algorithms. Generally, path-flow algorithms converge
faster than link-flow algorithms, at the expense of more computer storage space. The
MSA algorithm is basically a link-based algorithm, although it can also be implemented
as a path-flow algorithm.
Chen and Alfa (1991) proposed two link-based algorithms to solve the logit
assignment problem with Fisk's formulation. The difference between these algorithms
and the MSA is the step size computation. In the first algorithm, the step size
computation is performed only for the Beckmann-type $Z_1$ term. The second algorithm
performs a line search on a restricted form of the entire objective function. To evaluate
the line search in a link-based fashion, the computation of the inverse of a link-path
incidence matrix is required.
Bell et al. (1993) comment that the method proposed by Chen and Alfa (1991) may
result in inconsistent flows. They proposed a path-based algorithm in which the step
size computation is accomplished by iterative balancing, in a manner similar to
entropy-maximizing trip distribution models. In this way, the link and route flows are
kept consistent. A column generation method is used to store the paths, by adding the
current shortest path in each iteration to the path-set.
Regarding path-based algorithms, there are two basic approaches with respect to the
path-set generation. The first approach is to generate paths progressively as the
iterations proceed. Generally, the shortest path between each origin-destination
pair is added to the path-set. This approach is known as column generation.
Another approach is to externally define a set of alternative paths based on some
criteria. For example, Ben-Akiva et al. (1984) proposed a "labeling" technique in order
to generate paths. This technique was also applied in the work of Cascetta et al.
(1997). The recent work of Cascetta et al. (1998) can be used to generate a consistent
path-set based on random utility choice models.
Damberg et al. (1996) show that it is possible to implement efficient path-flow

algorithms to solve the assignment problem for large networks. The routes are
generated prior to the assignment, according to some heuristic method. One method
proposed is to store routes based on variations of the shortest path between each
origin-destination pair. Another method proposed is to perform some iterations of a
deterministic assignment and then store the paths generated at the first iterations. Once
the routes are generated and stored, they can be used to compute the probability of
using each route, using the logit model, and thus producing a descent direction. A line
search is then made in the direction obtained.
Damberg et al. (1996) described a solution for the overlapping problem between
different routes. They proposed different measures of overlap in the route generation
phase. However, this can be applied only in the route selection process; the logit model
used in the assignment cannot capture the similarity between the routes. The
C-Logit model and the models presented in this paper can also be used with Damberg's
algorithm, with the advantage that the route choice model can capture the similarity
between the routes. Finally, the formulations presented in this paper can be used to
perform a line search, either by minimizing the whole objective function, or by using the
Armijo rule as in Damberg's algorithm.
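A line search of this kind can be sketched for the simple logit case. The fragment below is not Damberg's algorithm; it is a simplified illustration, under stated assumptions, of a path-based descent on Fisk's objective for two disjoint routes: the search direction is the logit loading at current costs minus the current flows, and the step is chosen by Armijo backtracking. The cost function, its parameters and all constants are hypothetical.

```python
import math

FREE = [10.0, 12.0]         # hypothetical free-flow times of two disjoint routes
CAP = [50.0, 60.0]          # hypothetical capacities
THETA, DEMAND = 0.5, 100.0  # hypothetical dispersion parameter and demand

def cost(f, t0, cap):
    return t0 * (1.0 + 0.15 * (f / cap) ** 4)

def cost_integral(f, t0, cap):
    """Integral of the cost function from 0 to f (Beckmann-type term)."""
    return t0 * (f + 0.15 * f ** 5 / (5.0 * cap ** 4))

def objective(flows):
    z1 = sum(cost_integral(f, t, c) for f, t, c in zip(flows, FREE, CAP))
    z2 = sum(f * math.log(f) for f in flows) / THETA    # entropy term
    return z1 + z2

def gradient(flows):
    return [cost(f, t, c) + (math.log(f) + 1.0) / THETA
            for f, t, c in zip(flows, FREE, CAP)]

def logit_loading(flows):
    costs = [cost(f, t, c) for f, t, c in zip(flows, FREE, CAP)]
    c_min = min(costs)
    weights = [math.exp(-THETA * (c - c_min)) for c in costs]
    total = sum(weights)
    return [DEMAND * w / total for w in weights]

def armijo_step(flows, direction, sigma=1e-4, shrink=0.5):
    """Largest step in {1, 1/2, 1/4, ...} giving sufficient decrease."""
    slope = sum(g * d for g, d in zip(gradient(flows), direction))
    z0, step = objective(flows), 1.0
    while step > 1e-12:
        trial = [f + step * d for f, d in zip(flows, direction)]
        if objective(trial) <= z0 + sigma * step * slope:
            return step
        step *= shrink
    return 0.0

def solve(iterations=200):
    flows = [DEMAND / 2.0, DEMAND / 2.0]
    for _ in range(iterations):
        direction = [y - f for y, f in zip(logit_loading(flows), flows)]
        step = armijo_step(flows, direction)
        flows = [f + step * d for f, d in zip(flows, direction)]
    return flows

armijo_flows = solve()
```

Because the step never exceeds one, every trial point is a convex combination of the current flows and the strictly positive logit loading, so the logarithms in the entropy term remain defined.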
The implementation of extended logit models for stochastic assignment can be
proposed in a similar way to the path-based assignment algorithms mentioned above,
with additional computational effort due to the extended entropy terms. In this
way, the probability of choosing a route in the stochastic network loading phase of the
algorithm will be computed in accordance with the extended logit models. Numerical
experiments and performance results are currently being investigated, and will be
reported in a future paper.
REFERENCES
Akamatsu, T. (1997). Decomposition of path choice entropy in general transport
networks. Transportation Science, 31, 349-362.
Bell, M.G.H., W.H.K. Lam, G. Ploss and D. Inaudi (1993). Stochastic user equilibrium
assignment and iterative balancing. In: Proceedings of the 12th International
Symposium on Transportation and Traffic Theory (C.F. Daganzo ed.), pp.
427-439, Elsevier, New York.
Ben-Akiva, M., M.J. Bergman, A.J. Daly and R. Ramaswamy (1984). Modelling inter
urban route choice behaviour. In: Proceedings of the 9th International
Symposium on Transportation and Traffic Theory (J. Volmuller and R.
Hamerslag eds.), pp. 299-330, VNU Press, Utrecht.
Cascetta, E., A. Nuzzolo, F. Russo and A. Vitetta (1996). A modified logit route
choice model overcoming path overlapping problems: specification and some
calibration results for interurban networks. In: Proceedings of the 13th
International Symposium on Transportation and Traffic Theory (J.B. Lesort
ed.), pp. 697-711, Pergamon Press, London.
Cascetta, E., F. Russo and A. Vitetta (1997). Stochastic user equilibrium assignment
with explicit path enumeration: comparison of models and algorithms. In:
Proceedings of the 8th IFAC Symposium on Transportation Systems (M.
Papageorgiou and A. Pouliezos eds.), pp. 1078-1084, Chania.
Cascetta, E., A. Papola, F. Russo and A. Vitetta (1998). Implicit availability/perception
logit models for route choice in transportation networks. Preprints of the 8th
World Conference on Transport Research, Antwerp.
Chen, M. and A.S. Alfa (1991). Algorithms for solving Fisk's stochastic traffic
assignment model. Transportation Research, 25B, 405-412.
Chu, C. (1989). A paired combinatorial logit model for travel demand analysis. In:
Proceedings of the Fifth World Conference on Transportation Research, Vol. 4,
pp. 295-309, Ventura, CA.
Damberg, O., J.T. Lundgren and M. Patriksson (1996). An algorithm for the stochastic
user equilibrium problem. Transportation Research, 30B, 115-131.
Dial, R.B. (1971). A probabilistic multipath traffic assignment algorithm which
obviates path enumeration. Transportation Research, 5, 83-111.
Fisk, C. (1980). Some developments in equilibrium traffic assignment. Transportation
Research, 14B, 243-255.
Koppelman, F. and C. Wen (1997). The paired combinatorial logit model: properties,
estimation and application. Preprints of the 76th TRB Meeting, Washington D.C.
McFadden, D. (1981). Econometric models of probabilistic choice. In: Structural
Analysis of Discrete Data (D. McFadden and C. Manski eds.), pp. 198-272.
Prashker, J.N. and S. Bekhor (1998). Investigation of stochastic network loading
procedures. Preprints of the 77th TRB Meeting, Washington D.C.
Sheffi, Y. and W.B. Powell (1982). An algorithm for the equilibrium assignment
problem with random link times. Networks, 12, 191-207.
Vovsha, P. (1997). The cross-nested logit model: application to mode choice in the
Tel-Aviv metropolitan area. Preprints of the 76th TRB Meeting, Washington
D.C.
Vovsha, P. and S. Bekhor (1998). The link-nested logit model of route choice:
overcoming the route overlapping problem. Preprints of the 78th TRB Meeting,
Washington D.C.
Williams, H.C.W.L. (1977). On the formation of travel demand models and economic
evaluation measures of user benefit. Environment and Planning A, 9, 285-344.
A DOUBLY DYNAMIC TRAFFIC ASSIGNMENT
MODEL FOR PLANNING APPLICATIONS
Giulio Erberto Cantarella (1)
Ennio Cascetta (2)
Vincenzo Adamo (3)
Vittorio Astarita (3)
(1) Università di Reggio Calabria, Italy.
(2) Department of Transportation Engineering, Università di Napoli, Italy.
(3) Università della Calabria, Italy.
1. INTRODUCTION
Situations with high levels of congestion are increasingly frequent in urban and metropolitan
areas. The effects of different control strategies, demand management and infrastructural
schemes can be studied with accuracy only with within-day dynamic modeling of traffic
assignment explicitly taking into account peaks and oversaturated conditions.
Traditionally the interactions between demand and supply have been studied with an
equilibrium approach that has produced many consolidated mathematical methodologies
(Sheffi, 1985; Cascetta, 1998; Patriksson, 1994). In such formulations it is not necessary to
simulate explicitly the learning and adjustment behavior of the users, and the evolution over
time, because only the final, equilibrium, state of the system is considered (Cascetta and
Cantarella, 1993).
The presented model simulates the process of users' choice in a day-to-day dynamic framework
and also the traffic dynamics "within" each single simulated day. It can therefore be considered
a doubly dynamic model: day-to-day dynamic in representing user path choice, and within-day
dynamic in representing user movement on the network. The model has fixed departure times:
o-d demand flows are assumed known at each interval within the simulated period. Path choice
behaviour is modeled by explicitly considering learning or information updating processes for
experienced path costs. The users' choice adjustment process takes into account the inertia to
day-to-day path changes. Finally, path choices are simulated by applying a C-Logit random
utility model, which allows the effects of links shared between similar paths to be simulated
while keeping an analytical expression for the choice probability (Cascetta et al., 1996). The
difficulties that arise in the solution of the Network Flow Propagation problem (NFP, that is, the
reproduction of within-day variable link performances and flows given a corresponding O/D
demand and users' choice model) have led to the development of different approaches with
different levels of aggregation: from analytical models (macro-simulation: Daganzo, 1995b;
Lebacque, 1996) to meso-simulation or micro-simulation (Ben-Akiva et al., 1991; Cascetta
and Cantarella, 1991; DynaMIT, 1996; Fernandez and De Cea, 1994; Jayakrishnan et al.,
1994; Rilett et al., 1995; Smith and Wisten, 1993; Janson, 1995).
Partially supported by the National Research Council of Italy.
The Network Flow Propagation model adopted in the Dynamic Traffic Assignment (DTA) model
proposed in this paper is based on a mesoscopic simulation model and explicit path
enumeration. The model is based on analytical results (Astarita, 1996) on dynamic flow
propagation on links, but for the solution the explicit movements of packets on the network
are followed. The proposed model, like all mesoscopic models, does not need to establish
explicit constraints for the conservation of vehicles (as in some fully analytical approaches)
and guarantees that the FIFO rule is respected.
The model is also able to consider the propagation of queues on links upstream of an over-
saturated link (spill-back of queues).
The implementation of the complete procedure allows the complete behavior of the system to
be reproduced, giving estimates of flows, queues, travel times, speeds, densities etc. for each
link of the network in all the intervals of the simulated days. The model is intended mainly for
planning applications; computation times, at the moment, are not compatible with real time
application or ITS (Intelligent Transportation Systems), but are still compatible with a detailed
offline PC evaluation of planning options.
The model was developed to be compatible with the input data set of a static simulation
package in order to facilitate application to existing network data sets. An application is
reported on some small test networks and on the road network of Palermo (800,000
inhabitants). In section 2 the main features of within-day Dynamic Traffic Assignment and
day-to-day dynamic processes are recalled. In section 3 all the modeling assumptions of the
adopted model are presented, explicitly explaining the structure of the demand, supply and
demand-supply interaction models. In sections 4 and 5 the proposed algorithm and some
applications are presented. The simulations show that the model can be applied to large
networks and that the day-to-day process usually converges to a fixed-point attractor, or
"equilibrium" state, where the congestion spreads to the point that users no longer change
their paths.
2. DAY-TO-DAY AND WITHIN-DAY DYNAMICS IN TRAFFIC ASSIGNMENT

Traffic assignment models are used to simulate traffic flows on the links of a network, and the
values of key variables such as travel times, congestion, pollutant emissions, etc.
In transportation system planning applications such information is used to evaluate project
alternatives. The general structure of dynamic assignment models is the same as that of static
A Doubly Dynamic Traffic Assignment Model 375
models. They are composed of three sub-models (Cascetta and Cantarella 1993, Cascetta
1998):
• Demand model: simulates the effects of the transportation system state on user behavior.
• Supply model: simulates the effects of user behavior on the transportation system.
• Demand-supply interaction model: simulates the interaction between the two preceding
models.
In modeling the transportation system, two different evolutions over time were considered: the
fluctuations of the system variables within each single day (within-day dynamics) and between
subsequent "days" or more generally observation periods of similar characteristics (day-to-day
dynamics).
2.1 Day-to-day dynamics.
Day-to-day dynamics mainly involve the specification of the demand-supply interaction model
and some aspects of the demand model like the simulation of users' learning and the reaction to
the evolution of the transportation system. Within-day dynamics involve the supply model with
the reproduction of within-day variable link performances given a time-dependent path flow
vector (the demand model can also be affected if the departure time choice is explicitly
simulated). In day-to-day static assignment, equilibrium models are used to obtain consistency
between traffic flows and path costs. In day-to-day dynamic assignment, where the evolution of
the system is studied, convergence towards an equilibrium point is not certain and, even when
the system does converge towards a final state, it can be useful to study the evolution
of flows and path costs from day to day. The dynamic process approach to assignment has
been introduced recently (Cantarella and Cascetta, 1995). This approach is a generalization of
the equilibrium approach, since it also allows the simulation of convergence towards
equilibrium states and the transients due to modifications of supply and/or demand. Moreover,
equilibrium stability can be analysed and a full statistical description can be obtained.
Specification of a dynamic process model requires the explicit modeling of:

• users' learning and adjustment processes: that is how experience and information about
costs influence choices. Taking into account such processes allows simulation of
phenomena like users' memory and information diffusion strategies (cost updating model);
• users' choice updating behavior, that is how choices in a given day are influenced by
previous day choices, which allows the simulation of phenomena like habit (choice updating
model).
The state of the system, during its evolution over time, is defined by variables describing the
results of the above two types of behavior. This definition varies according to the adopted
modeling approach. It should be noted that anticipated path costs affecting users' choice
behavior generally differ from path costs actually experienced on the network, even if the
former clearly depend on the latter.
Moreover, a dynamic process model can be applied at an aggregate level, considering classes
of homogeneous users, thus generalizing multi-user equilibrium assignment, or at a
disaggregate level, where choice and cost updating is referred to each single individual. The
latter approach, requiring more computing resources, allows more realism in simulating users'
behavior and network performance in the presence of ATIS (Advanced Traveler Information
Systems).
Let
$i$ denote a single user or a class of homogeneous users,
$C_i^t$ be the actual path cost vector at day t for user (class) i,
$Y_i^t$ be the anticipated or expected path cost vector at day t for user (class) i,
$P_i^t$ be the path choice conditional probability matrix at day t for user (class) i,
$F_i^t$ be the path flow vector at day t for user (class) i (with entries equal to 0 or 1 for a
disaggregate approach where index i refers to a single user).
Anticipated path costs at day t generally depend on actual and anticipated costs at day t-1, and
are obtained through the cost updating model:
$Y_i^t = \Gamma_i\left(C_i^{t-1}, Y_i^{t-1}\right)$. (1)
Multiple information sources can be modeled by combining different updating models

(Emmerink, 1996). Then, the users' choice behaviour depends on the anticipated path costs and
is modelled through the choice updating model:
$P_i^t = P_i\left(Y_i^t\right)$, (2)
where the element $P^t_{kj,i}$ is the probability of choosing path k at day t for a user (class) i that
chose path j at day t-1.
Finally, the path flow vector should be modeled as a random vector with the expected value
given by the following relation, which generalizes the equilibrium approach:
$E\left[F_i^t\right] = P_i^t F_i^{t-1}$. (3)
Under the last assumption a stochastic process assignment model is specified (with a
probabilistic path choice model). The above framework could be easily extended to consider
actual path costs as random variables (Cascetta, 1998).
Assuming the path flow vector equal to its expected value, that is modelling it as a
deterministic vector, the related mean or deterministic process model (with a probabilistic path
choice model) is obtained:
$F_i^t = P_i^t F_i^{t-1}$. (4)
The different meanings of the terms stochastic and deterministic should be stressed when
referred to the type of dynamic process assignment model rather than to the type of equilibrium
assignment model. In the latter case they refer to the path choice model, while the underlying
assumption about system evolution remains deterministic.
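As a rough illustration of relation (4), the sketch below applies a conditional path choice matrix to the path flows of the previous day. The two-path matrix and the flow values are invented for illustration, and `next_day_flows` is a hypothetical helper name, not part of the authors' software.

```python
def next_day_flows(P, F_prev):
    """Deterministic process step F^t = P^t F^{t-1}.

    P[k][j] is the probability of choosing path k today for users
    who chose path j yesterday, so every column of P must sum to 1.
    """
    n = len(F_prev)
    for j in range(n):
        col_sum = sum(P[k][j] for k in range(n))
        if abs(col_sum - 1.0) > 1e-9:
            raise ValueError(f"column {j} of P sums to {col_sum}, expected 1")
    # Matrix-vector product written out explicitly.
    return [sum(P[k][j] * F_prev[j] for j in range(n)) for k in range(n)]


# Assumed values: two paths, 100 users in total.
P = [[0.9, 0.3],
     [0.1, 0.7]]
F_yesterday = [60.0, 40.0]
F_today = next_day_flows(P, F_yesterday)
```

Because each column of the matrix sums to one, the total flow is conserved from one day to the next.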
2.2 Within-day dynamics.
In within-day dynamic assignment the supply model (which is limited to a matrix product in
the static case), expressing time-varying network flows and costs resulting from movements of
users on the network, is obtained with a much more complex sub-model: the Network Flow
Propagation (NFP) model. The extension of within-day static models to take into account
within-day dynamics is by no means straightforward, since within-day dynamic supply
modelling requires completely new definitions and formulation of the problem (Cascetta and
Cantarella, 1991). Some of the existing NFP methods do not rule out overtaking between users,
and some do not represent the spill-back of queues. Respect of the FIFO rule is necessary in an
assignment model not only to ensure the internal consistency of the supply model, but also
because, where the rule is violated, a user may decide to leave later in order to arrive earlier. A
large number of papers have been presented on the first issue, but the second problem (the
spill-back of queues), which has almost the same importance, has received less attention. Daganzo (1995) has
correctly pointed out that in many assignment models based on link travel time functions the
spill-back of queues from congested links to the upstream links is not modeled at all. The aim
of dynamic network assignment models is to capture the time varying evolution of flows and
this cannot be accomplished correctly without representing the propagation of queues that
originate from congested links which is a commonly occurring phenomenon, at least in urban
networks.
The NFP model is the evaluation of densities and queues from path flows, using link
performance cost functions. It can be summarized as follows:
$f^t = \varphi(F^t)$, (5)
where:
$f^t$ is the (however defined) link flow vector for each interval of day t,
$F^t$ is the path flow vector for each interval of day t.
The network performance model, that is the evaluation of path costs for each interval (averaged
between the costs of the users departing at the beginning and at the end of the interval) as a
function of link costs, i. e. densities and queues, can be summarized by:
$C^t = \Gamma(F^t)$, (6)
where:
$C^t$ is the path cost vector for each interval of day t.
3. GENERAL DESCRIPTION OF THE MODEL
The proposed model is a day-to-day dynamic process reproducing a deterministic process of

averages. Only pre-trip path choice behavior and no en-route diversions are considered in a
within-day dynamic flow propagation framework. The model structure is presented in figure 1.
The actual path costs of day t-1 are combined with the previous anticipated path costs in the
cost updating model which gives the new anticipated path costs. Those costs are used in the
demand model that assigns the a priori known O/D flows (time-dependent) to the different
considered paths according to the choice updating model with a stochastic path choice C-Logit
model. The Network Flow Propagation model determines the time dependent link flows and
costs and the actual path costs that are used for updating the anticipated path costs of the
following day. In the following three sub-sections the demand, supply and demand-supply
interaction models are described in detail.
[Figure: flow chart. Boxes: Choice Updating and Path Choice Model; Path Flows; Network Flow Propagation Model; Link Flows and Costs; Actual Path Costs; Cost Updating Model; Anticipated Path Costs. The day-to-day links are labeled "Yesterday" and "Tomorrow".]
Figure 1: General architecture of the model
3.1 The demand model
Path choice behavior is simulated assuming that each day only a fraction of users $\alpha \in\, ]0,1]$
considers the opportunity of changing paths (but does not necessarily do so):
$F^t_{od,i,k} = \alpha\, d_{od,i}\, p^t_{od,i,k} + (1-\alpha)\, F^{t-1}_{od,i,k} \quad \forall\, od, i, k$ (7)
where:
$F^t_{od,i,k}$ is the flow on path k between pair od and departing in interval i of day t,
$\alpha$ is the probability of reconsidering the previous day's path choice,
$d_{od,i}$ is the demand flow for pair od, in interval i,
$p^t_{od,i,k}$ is the probability of choosing path k between pair od, departing in interval
i of day t, for users who reconsider the previous day's path choice.
Choice probabilities for users who reconsider the previous day's choice are obtained only for a
set of explicitly enumerated paths (the k shortest paths obtained from a previous static
assignment procedure; alternatively, free flow speeds could be used). The probabilities
$p^t_{od,i,k}$ for each departure interval are evaluated with a C-Logit random utility path choice
model (Cascetta et al., 1996), which yields an analytical expression as in the usual Logit model
but introduces a correction factor for the links that are shared between different paths.
$p^t_{od,i,k} = \dfrac{\exp(\theta\, v^t_{od,i,k})}{\sum_{j \in K_{od}} \exp(\theta\, v^t_{od,i,j})} \quad \forall k \in K_{od},\ \forall\, od, i$ (8)
where:
$\theta = \pi/(\sigma\sqrt{6}) = 1.283/\sigma$ is the parameter of the choice model, inversely proportional to the
standard deviation $\sigma$ of the random residuals,
$v^t_{od,i,k}$ is the systematic utility of path k between pair od, for users departing
in interval i of day t, which includes a "commonality factor" for
overlapping paths,
$K_{od}$ is the set of explicitly enumerated paths between pair od (the first k
minimum paths evaluated with static stochastic user equilibrium
assignment are used in the following).
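The choice probabilities of relation (8) can be sketched in a few lines. This is a minimal sketch, assuming the systematic utilities already include the commonality factor; the function names and the three example utilities are illustrative assumptions, not values from the paper.

```python
import math


def logit_theta(sigma):
    """Logit scale parameter theta = pi / (sigma * sqrt(6))."""
    return math.pi / (sigma * math.sqrt(6.0))


def path_choice_probabilities(utilities, sigma):
    """Relation (8) for one o-d pair and one departure interval.

    `utilities` holds the systematic utilities v of the enumerated paths,
    assumed to already include the commonality factor.
    """
    theta = logit_theta(sigma)
    v_max = max(utilities)  # shift for numerical stability; cancels in the ratio
    weights = [math.exp(theta * (v - v_max)) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Assumed example: three paths with utilities equal to minus the cost in seconds.
probs = path_choice_probabilities([-2400.0, -2500.0, -2900.0], sigma=500.0)
```

With a standard deviation of 500 sec the scale parameter is about 0.0026 per second, so a cost difference of 100 sec shifts the choice probabilities only moderately.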
The systematic utility of a path depends, not only on the correction factor, but also on the
anticipated path costs of the users (average value for the departure interval considered). Such
costs are obtained with a filter that reproduces the users' learning and adjustment mechanisms,
combining real costs (travel time) with the preceding day's anticipated costs. A parameter $\beta$ is
introduced:
v
od,i,k = ~y od,i,k
(10)
(11)
where:
$y^t_{od,i,k}$ is the anticipated cost (average) of path k between pair od departing in interval i of day
t,
$z_{od,k}$ is the overlapping factor of path k of pair od,
$l_k$ is the length of path k,
$l_{jk}$ is the length of the links shared by paths k and j,
$\beta$ is the weight given to the anticipated cost of day t-1 in the evaluation of the
anticipated cost of day t,
$g^t_{od,i,k}$ is the effective (average) cost of path k between pair od for users departing in interval
i of day t.
The application of this demand model, which simulates the path choice behaviour once the
demand flows are known, allows the path flows for each interval in the simulated
period to be evaluated from the anticipated and effective costs of the preceding day.
3.2 The supply model
Many simulation approaches to the problem of dynamic Network Flow Propagation have been
based on packet methods, where users are grouped together to form a packet that can be moved
along the network so as to realize a discretization of the demand of each O/D pair. It is possible
to distinguish between a point packet approach (Yagar, 1975, Leonard et al, 1978, Cascetta and
Cantarella, 1991, Adamo et al., 1996) in which a group of users is concentrated into a single
point and a continuous packet approach (de Romph et al, 1992, Smith and Wisten, 1993, Di
Gangi, 1992, Di Gangi and Astarita, 1994) where the users are supposed uniformly distributed
in time or space between packet edges.
In previous point packet approaches it was not straightforward to impose capacity on the link
outflows so as to reproduce network bottlenecks. Such bottlenecks were reproduced through
the use of additional constraints.
The continuous packet approach is also, from a computational point of view, more complicated
than the point packet because for each packet the movement of more than a point has to be
followed, leading to complicated evaluations of link flows and network characteristics.
Moreover, some theoretical inconsistencies in the continuous packet approach have been
indicated in dell'Orco (1997).
The approach used here is a point packet NFP model called MICE which is able to represent
flow propagation without FIFO rule violations or other inconsistencies even in a
multidestination framework (Adamo and Astarita, 1996). This approach is able easily and
adequately to reproduce network bottlenecks because it is based on an analytical formulation
that gives automatic respect of capacities. This new analytical approach overcomes flow
propagation inconsistencies that are characteristic of exit link function formulations.
The MICE model can be seen as a heuristic algorithm to solve the analytical time continuous
dynamic network loading model presented in Adamo et al. (1999) or as an independent time
continuous model with discretized demand. Links in the model are composed of two segments,
a running and a queuing segment. Each segment is ruled by the analytical model based on the
following system of differential equations for each link:
Conservation equation: $\dfrac{dx(t)}{dt} = u(t) - w(t)$
Travel time function: $\tau(t)$, with a functional form that depends on the segment type
Outflow function (general formulation): $w(t + \tau(t)) = \dfrac{u(t)}{1 + \dfrac{d\tau(t)}{dt}}$ (12)
where the link characteristics:
$u(t)$ = the inflow,
$w(t)$ = the outflow,
$x(t)$ = the number of users on the link,
$\tau(t)$ = the travel time of the user who arrives at time t at the beginning of the link,
are governed by a conservation equation, a travel time function and an outflow function that is
consistent with the respect of the FIFO rule.
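The FIFO-consistent outflow function can be recovered with a short argument. The cumulative counts $U(t)$ and $W(t)$ below are auxiliary symbols introduced only for this derivation, and the link is assumed to be empty at time 0.

```latex
% U(t): cumulative inflow, W(t): cumulative outflow (auxiliary symbols).
\begin{align*}
U(t) &= \int_{0}^{t} u(s)\,ds, \qquad W(t) = \int_{0}^{t} w(s)\,ds \\
% FIFO: the user entering at t leaves at t + \tau(t), after exactly
% the users who entered before that time.
W\bigl(t + \tau(t)\bigr) &= U(t) \\
% Differentiate both sides with respect to t (chain rule on the left).
w\bigl(t + \tau(t)\bigr)\left(1 + \frac{d\tau(t)}{dt}\right) &= u(t) \\
w\bigl(t + \tau(t)\bigr) &= \frac{u(t)}{1 + \dfrac{d\tau(t)}{dt}}
\end{align*}
```

The exit time $t + \tau(t)$ is increasing in $t$, so that no user overtakes another, only if $d\tau(t)/dt > -1$.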
In figure 2 a simulation is shown for a single link, giving similar results for the packet and for
the analytical model. Each link may have a capacity constraint at its downstream end, imposed
implicitly by the travel time function as in the analytical model presented in Astarita (1996).
The difference between running and queuing segments is in the functional form used to obtain
$\tau(t)$. The performance function of the running segment is a Greenshields-like relationship
between the speed and the density of the segment:
$\tau(t) = L/s(t)$ (13)
where:
$s(t) = s(k(t))$ (14)
$s(k) = \begin{cases} s_0 & 0 < k < k_{crit} \\ cap/k & k_{crit} \le k \le k_{jam} \end{cases}$
$s(t) = L/\tau(t)$ (15)
$k(t) = x(t)/L$ (16)
$s(t)$ is the speed of the link at time t,
$k(t)$ is the density of the link at time t,
$L$ is the length of the link,
$cap$ is the link capacity (obtained from the travel time function in the static
model),
$s_0$ is the free flow speed (obtained from free travel time in the static model),
$k_{jam}$ is the jam density,
$k_{crit}$ is the critical density.
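A minimal sketch of the running segment travel time follows. The speed-density law used here is an assumption: free flow speed `s0` below the critical density, taken as `cap / s0` so that the law is continuous, and speed equal to `cap / k` above it. The function name and the numerical values are illustrative only.

```python
def running_travel_time(x, length, cap, s0, k_jam):
    """Travel time (s) on the running segment from the number of users x.

    Assumed speed-density law: free flow speed s0 (m/s) below the critical
    density k_crit = cap / s0, and speed = cap / k above it.
    cap is in veh/s, length in m, densities in veh/m.
    """
    k = x / length                      # density, relation (16)
    if k > k_jam:
        raise ValueError("density exceeds jam density")
    k_crit = cap / s0                   # assumption: continuity at k_crit
    speed = s0 if k < k_crit else cap / k
    return length / speed               # relation (13)


# Assumed link: 500 m, 1800 veh/h (0.5 veh/s), 12.5 m/s, jam density 0.4 veh/m.
t_light = running_travel_time(10, 500.0, 0.5, 12.5, 0.4)
t_heavy = running_travel_time(100, 500.0, 0.5, 12.5, 0.4)
```

Under this assumed law the travel time above the critical density reduces to `x / cap`, the time needed to discharge the users on the segment at capacity.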
The queuing segment uses as its travel time function a deterministic queue model for
the exit delay of a link. The delay $\tau$ is (in principle) a function of the queue length and the
outflow:
$\tau(t) = q(t)/cap$ (17)
where:
$cap$ is the downstream capacity of the considered link,
$q(t)$ is the queue of the considered link.
This is true only when the exiting flow is equal to the downstream capacity, i.e. there are no
spill-backs from downstream links; otherwise, $\tau(t)$ is determined by the rate of queue
dissipation and can be evaluated analytically (once $w(t)$ is known) by solving for $\tau(t)$ in:
$\int_{t}^{t+\tau(t)} w(s)\,ds = q(t)$ (18)
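Relation (18) can be solved directly when the outflow is known as a piecewise constant profile. The sketch below assumes such a profile, which is not stated in the paper; `queue_delay` and its arguments are hypothetical names.

```python
def queue_delay(q, outflow_rates, dt):
    """Solve relation (18) for the delay when w(t) is piecewise constant.

    outflow_rates[n] is the outflow (veh/s) during the n-th step of length
    dt (s) counted from the current time t. Returns the time needed for the
    cumulative outflow to reach the queue q (veh).
    """
    remaining = q
    elapsed = 0.0
    for w in outflow_rates:
        if w * dt >= remaining:
            # The queue is discharged within this step.
            return elapsed + (remaining / w if remaining > 0 else 0.0)
        remaining -= w * dt
        elapsed += dt
    raise ValueError("queue not discharged within the given outflow profile")


# Outflow equal to capacity (0.5 veh/s): the result matches q / cap.
free_exit = queue_delay(30.0, [0.5] * 10, 10.0)
# Outflow throttled by a downstream spill-back during the first 30 s.
blocked_exit = queue_delay(30.0, [0.1] * 3 + [0.5] * 10, 10.0)
```

When the outflow equals the capacity throughout, the result coincides with relation (17); a temporary reduction of the outflow lengthens the delay.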
Storage capacity is also explicitly considered on each link in the network, in order to deal with
spill-back congestion phenomena. This feature is obtained here with heuristic modeling that
reproduces the rules, presented in Adamo et al, (1998), that govern flows at an intersection
node. The problem is to distribute the limited resource of flow supply on downstream links
between users in a proper way. This is one of the several possible approaches to the problem
(see also Daganzo, 1995b; DynaMIT, 1996 and Lebacque, 1996 for other approaches).
Figure 2: A simulation on a single link with MICE, meso-simulation packet-based model,
derived from an analytical formulation (Adamo and Astarita, 1996). Horizontal axis: time (sec.).
3.3 Demand-supply interaction model
The obtained dynamic process model is a deterministic process of averages. The state of the
system at day t is defined by the vectors of anticipated path costs and path flows for day t.
The convergence of the system depends on the values of the parameters used in the filters that
simulate the learning and adjustment behaviors of the users:
$F^t_{od,i,k} = \alpha\, d_{od,i}\, p^t_{od,i,k}(y^t_{od,i}) + (1-\alpha)\, F^{t-1}_{od,i,k} \quad \forall\, od, i, k$ (19)
$f^t = \varphi(F^t)$ (20)
(21)
$y^{t+1}_{od,i,k} = \beta\, g^t_{od,i,k} + (1-\beta)\, y^t_{od,i,k} \quad \forall\, od, i, k$ (22)
where:
$y^t_{od,i}$ is the vector of anticipated costs (average) for the users of the enumerated paths
between pair od departing in interval i of day t,
$p^t_{od,i,k}(y^t_{od,i})$ is the path choice probability model for users who reconsider the previous day's
path choice.
Initial estimates of the variables:
$y^0_{od,i,k}$ and $F^0_{od,i,k}$ (23)
are obtained with a static Stochastic User Equilibrium assignment.
Some general results, that have been presented in Cantarella and Cascetta (1995), indicate that
small values of parameters $\alpha$ and $\beta$ cause the evolution of the system towards a fixed-point
attractor. This equilibrium point can be seen as an extension of the static equilibrium with a
within-day dynamic network loading model.
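The day-to-day recursion of relations (19) and (22) can be illustrated on a toy case. In the sketch below the Network Flow Propagation model is replaced by two assumed linear cost functions for a single o-d pair with two paths and one departure interval; a plain Logit model without commonality factor is used, and all numerical values are invented.

```python
import math


def logit(costs, theta):
    """Plain Logit probabilities from anticipated costs (no commonality factor)."""
    c_min = min(costs)
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def path_costs(flows):
    """Stand-in for the NFP model: assumed linear congestion functions."""
    free_time = [600.0, 720.0]   # s
    slope = [2.0, 1.0]           # s per vehicle
    return [t0 + a * f for t0, a, f in zip(free_time, slope, flows)]


def simulate(days, demand=300.0, alpha=0.2, beta=0.4, theta=0.0026):
    """Iterate the deterministic process and return the path flows of each day."""
    flows = [demand / 2.0, demand / 2.0]
    y = path_costs(flows)        # initial anticipated costs
    history = []
    for _ in range(days):
        g = path_costs(flows)    # actual costs of the current day
        # Cost updating filter, as written in relation (22).
        y = [beta * gi + (1.0 - beta) * yi for gi, yi in zip(g, y)]
        p = logit(y, theta)
        # Choice updating, relation (19): only a share alpha reconsiders.
        flows = [alpha * demand * pk + (1.0 - alpha) * fk
                 for pk, fk in zip(p, flows)]
        history.append(list(flows))
    return history


history = simulate(200)
```

With these assumed values the flows settle on a fixed point after a few tens of days, with total demand conserved at every iteration.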
4. THE PROPOSED ALGORITHM
Demand between o-d pairs is considered fixed in the model, so the o-d flow matrices for each
interval of the simulated period are known. To allow the network to be loaded and unloaded,
some intervals are also simulated before and after the considered period of time, with a demand
equal to that of the first and last interval respectively.
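The loading and unloading intervals described above amount to a simple padding of the demand profile. The helper below is a hypothetical sketch; the example profile is the 60-minute peak of o-d pair 1-16 from the first test network, padded with three intervals on each side.

```python
def pad_demand_profile(profile, n_warmup, n_cooldown):
    """Add loading and unloading intervals around the studied period.

    Warm-up intervals copy the demand of the first interval and
    cool-down intervals copy the demand of the last one.
    """
    if not profile:
        raise ValueError("profile must contain at least one interval")
    return [profile[0]] * n_warmup + list(profile) + [profile[-1]] * n_cooldown


peak = [60, 120, 200, 160, 120, 80]          # six 10-minute intervals
padded = pad_demand_profile(peak, 3, 3)       # twelve intervals in total
```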
The proposed algorithm was developed to be compatible with the input data set of a static
simulation package (MT.Model, Inaudi et al., 1996) in order to facilitate application to existing
network data sets. Some input data are in fact obtained from the static model: the network
graph and an o/d demand matrix. Other data need to be added in order to obtain a dynamic
simulation: the time evolution of graph characteristics and the time evolution of o/d demand
matrix within the considered time period.
Different modules perform the following tasks:

• enumeration of the feasible paths for each o-d pair: a k-shortest paths algorithm is used in
the presented implementation, but any other algorithm can be easily introduced;
• evaluation of initialization path costs for the day-to-day process: the initialization path costs
are obtained from a static equilibrium assignment;
• evaluation of commonality factors of the feasible paths for each o-d pair: these factors are
based on link lengths and are used in the C-Logit (Cascetta et al., 1996) path choice model.
Two operations are necessary before performing the dynamic assignment: conversion of the
static network into a dynamic network and generation of the set of feasible paths. The
generation of the set of feasible paths for each o-d pair is performed using the algorithm of Yen
for the k-shortest paths with an internal L-deque shortest path algorithm (Ahuja et al., 1993).
Users belonging to the same O/D pair are assumed to be divided into sub-sets by common departure
interval and followed path; these sub-sets are called packets. In particular, in the proposed
model, a packet is assumed to be "physically" a point on the network. Let us order in the same
way all the O/D pairs and let $K_r$ be the set of admissible paths for the O/D pair r and $D_{jk}$ be the
number of users leaving during interval j and following path k, with $k \in K_r$. Thus a packet "P" is
identified by three indexes: "P(t,r,k)" where t is the starting time from the origin node. Link
travel times are strictly positive for each packet (even if the link is empty). Points on the links
move with a speed that depends on the conditions of the link itself at the moment the packet
reaches the link.
The movement of a packet to a new link is conditioned by its storage capacity. Packets are
forced to stay on the preceding link until the next link is discharged. Packets in the queue
therefore have to wait for some of the packets on the next link to exit. This allows the
propagation of the spill-back phenomenon from link to link. We can solve the problem in this
way by memorizing each packet position on the network. The procedure assigns the supply that
a saturated link (the flow that is still allowed to enter this saturated link equals the outflow) can
give to incoming links according to some pre-established weights. Those are obtained
proportionally to the number of lanes of each incoming link. Queued packets from different
lanes are selected proportionally to the weights. This procedure does not ensure an exact
spill-back propagation, being a heuristic approximation of the analytical model.
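The sharing rule described above can be sketched as follows. This is a simplified illustration that ignores the case where an incoming link has less queued demand than its share; `split_supply` is a hypothetical name and the link labels are taken from the spill-back example on the test network.

```python
def split_supply(supply, lanes):
    """Share the inflow still accepted by a saturated link among its
    incoming links, in proportion to their number of lanes.

    supply: flow the saturated link can still accept (veh/h)
    lanes:  dict mapping incoming link id -> number of lanes
    """
    total_lanes = sum(lanes.values())
    if total_lanes <= 0:
        raise ValueError("at least one incoming lane is required")
    return {link: supply * n / total_lanes for link, n in lanes.items()}


# A saturated link discharging 360 veh/h, fed by a two-lane and a one-lane link.
shares = split_supply(360.0, {"9-14": 2, "15-14": 1})
```

The two-lane link receives twice the share of the one-lane link, in line with the weights being proportional to the number of lanes.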
Figure 3: Paths potentially causing a deadlock grid for spill-back of queues.
MICE results show that in the worst case computing time is polynomial as:
O( max( nPo-d*no-d , nA*nP*log(nP))),
where nPo-d is the maximum number of paths per o-d, no-d the number of o-d pairs, nA the
number of arcs in the network and nP the generated number of packets for the simulation.
These results are valid except in the presence of a deadlock. The deadlock may occur when
packets of different paths get caught in a loop queue. If users are not allowed to switch to
alternative paths, the network loading procedure may reach a deadlock situation. This happens
whenever a loop queue is formed as in figure 3, in which case the algorithm (as well as the
traffic) slows down considerably. A special procedure is run in the presence of a deadlock. The
procedure guarantees that the packets will move with a flow equal to that of the minimum
capacity between the queued links. In Figure 4 the flow diagram of all the methodology is
represented. The dotted line separates the tasks that are performed only once in the
initialization from those that are executed every iteration (day) of the simulation. Different
arrows are used to indicate the operations that are performed within a single day. All the
procedures have been implemented in C++.
Figure 4: Flow diagram of the assignment algorithm.

5. SOME NUMERICAL EXAMPLES
Some modeling examples are presented in this section: the first is a small regular network with
20 nodes and 31 links and the second is the Palermo network, used by the city administration
for planning purposes. The first network was implemented to investigate convergence towards
a fixed point or other attractors. Some simulation results are presented here. The network has
a grid pattern. O/D demand flows first increase and then decrease,
simulating a demand peak over a period of 60 minutes. The demand is
steady during each departing interval 10 minutes long. Three warm-up intervals are added at
the beginning of the simulated period to load the network; three intervals are also necessary
after the demand peak to allow users to exit the network. The total period simulated is 120
minutes long. The o-d pairs considered, together with the path flows for each interval (veh/h)
and the used paths, are presented below (the links in brackets are used to verify upstream
capacity values). The path choice behavior was simulated as described in section 3.1. The
variance parameter of the C-Logit model is $\theta = 1.283/\sigma = 0.0026\ \text{sec}^{-1}$, which corresponds to a
standard deviation $\sigma$ = 500 sec (with an average travel time of 2400 sec). The parameters for
the learning and adjustment filters are respectively: $\alpha = 0.2$, $\beta = 0.4$.
O-D \ Intervals    1    2    3    4    5    6    7    8    9   10   11   12
1-16              60   60   60   60  120  200  160  120   80   80   80   80
2-17              60   60   60   60  120  200  160  120   80   80   80   80
3-19              15   15   15   15   30   50   40   30   20   20   20   20
3-20              15   15   15   15   30   50   40   30   20   20   20   20
5-19              15   15   15   15   30   50   40   30   20   20   20   20
Table 1: Demand flows for each interval (veh/10 min.).
Figure 5: Test Network 1

Links 1,2,3,4,5,6,9, and 10 have a downstream capacity of 1800 veh/h and a maximum density
of 0.4 veh/m (two lanes). Links 23,28,29 and 31 have the upstream and downstream capacities
of 540 veh/h and a maximum density of 0.2 veh/m (one lane); the remaining links have a
downstream capacity of 900 veh/h and a maximum density of 0.2 veh/m. All links are 500 m
long.
O-D 1-16 2-17 3-19 3-20 5-19

1 1 1 1 2 2 2 2 3 3 5
6 6 6 7 9 8 9 10 12 12 17 I
9 9 10 20 12 7 13 23 16 16 161
12 13 21 16 19 24 32 29 28 29
16 24 20 27 23 23 34 33 34
27 21 24 32 32 31
24 20 23
21 32
20
Table 2: Paths for each o-d pair.
The following RMSE indicator was used to show how the difference between the anticipated
and real costs evolves from day to day:
$RMSE = \dfrac{\sqrt{\sum_k \left(C^t_{od,i,k} - y^t_{od,i,k}\right)^2}}{\sum_k C^t_{od,i,k}}$
where:
$y^t_{od,i,k}$ is the anticipated cost (average) of path k between pair od departing in interval i of
day t,
$C^t_{od,i,k}$ is the real cost of path k between pair od departing in interval i of day t.
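The indicator can be computed as in the sketch below. It follows one plausible reading of the printed formula, namely the root of the summed squared differences divided by the sum of the real costs; that reading, the function name and the example values are assumptions.

```python
import math


def rmse_indicator(real_costs, anticipated_costs):
    """Relative gap between real and anticipated path costs for one o-d pair
    and one departure interval (assumed reading of the RMSE indicator)."""
    if len(real_costs) != len(anticipated_costs):
        raise ValueError("both cost lists must cover the same paths")
    squared = sum((c - y) ** 2 for c, y in zip(real_costs, anticipated_costs))
    return math.sqrt(squared) / sum(real_costs)


# Assumed costs in seconds for two paths.
gap = rmse_indicator([300.0, 400.0], [303.0, 404.0])
```

The indicator is zero when anticipated and real costs coincide, which is the condition approached at a fixed point of the day-to-day process.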
The results are presented in figure 6 where convergence is shown towards a fixed-point
attractor.
Figure 6: Convergence towards a fixed-point attractor (RMSE between anticipated and real path costs, by day)

The total travel time of users on the network decreases from the first (5,850,000 sec.) to the last
simulated day (5,800,000 sec.). This happens because congestion spreads over the network from
day to day until a final state of equilibrium is reached. In this case the difference in travel
time is small because there are few alternative paths (some of the o-d pairs have only
a single path); on the Palermo network, with 5 paths per o-d pair, the change in travel time is
considerable, as shown in the following.
On the same network another test was performed to see if the results of the NFP model are
sensitive to very small changes in path flows. This is a common problem in some
micro-simulation models where "robustness" of results is still an unsolved problem (see Nagel et
al., 1998).
Figure 7: Interval 4-10 exiting flows (veh/h).
Different simulations were carried out. The two issues investigated were:
- differences in link flows after a change in the size of packets without a change in path flows;
- differences in link flows after a very small change in path flows (a packet was removed from
the simulation).
The results showed only a very small change in link flows, supporting the robustness of the Network
Flow Propagation model.
Another simulation is presented here in detail (tables 3 and 4, figures 7 and 8) on the same
network to show the behavior of the NFP model MICE in spill-back simulation.
Figure 8: Interval 4-10 relative saturations of storage capacities.
The simulation is similar to the previous one except for the following:
- link 29 (14-19) has a downstream capacity of 360 veh/h.
Node 14 is affected by the queue of link 29 (14-19). This queue propagates mainly onto
link 30 (15-14), which has a capacity of only 900 veh/h (one lane), compared with
link 16 (9-14), which has a capacity of 1800 veh/h (two lanes): the weights of the two links are
proportional to the capacities.
In figures 7 and 8 outflows and relative saturations of storage capacities (densities divided by
the maximum possible link density) are presented.
On link 9-14 when congestion grows from node 14 the outflow is almost double that of link
15-14, in accordance with the relative weights of the two links. The queue propagates back,
growing first on link 30 (15-14) and then, when the link is saturated, on link 18 (10-15).
Paths   1-16   1-17   2-17   3-20   4-19   5-19
        1      1      2      3      4      5
        6      6      7      8      9      10
        11     11     12     13     14     15
        16     12     17     14     19     14
               17            15            19
                             20
Table 3: Paths used in the spill-back example
O-D \ Intervals    1    2    3    4    5    6    7    8    9   10   11   12
1-16              33   33   33   33   54   75   60   45   33   33   33   33
1-17              33   33   33   33   54   75   60   45   33   33   33   33
2-17              55   55   55   55   90  125  100   75   55   55   55   55
3-20              44   44   44   44   72  100   80   60   44   44   44   44
4-19              44   44   44   44   72  100   80   60   44   44   44   44
5-19              44   44   44   44   72  100   80   60   44   44   44   44
Table 4: Path flow table (veh./10 min.)
The second network (the city of Palermo) was implemented to investigate the capacity of the
model to deal with real size networks. Five paths were generated for each o-d pair, with the
specific conversion module (described above), from the results of MT Model static stochastic
assignment. The following are the statistics for the new network graph compared with the
graph used in MT Model:
                 MT.Model network   MICE network
Nodes            1337               3031
Links            2466               4160
O/D pairs        9279               7524
Assigned trips   72500              70600
The longest generated path consists of 248 links (125 on the original graph). The model used
for the generation of feasible paths is one of the possible models that can be used (see Cascetta
et al, 1996 for other methods).
The number of warm-up intervals added at the beginning of the simulated period to load the
network, and after the demand peak to allow users to exit the network, was fixed at 6 intervals
of 10 minutes. 50 minutes was the longest path travel time among the used paths of the static
assignment. This guarantees that when the real simulation starts, the users that entered at the
beginning of the simulated period have already reached their destination, and that all the users
of the peak hour are out of the network when the simulation ends.
The demand profile used for 18 intervals of 10 minutes, was obtained from a data survey on a
sample of users interviewed by the city administration.
The fractions indicated below were used to obtain the demand flows, for each interval, from the
original o-d matrix
Interval   1     6     7     8     9     10    11    12    13    18
Fraction   0.05  0.05  0.09  0.13  0.16  0.13  0.1   0.09  0.06  0.06
Table 5: Fractions used to obtain the demand flows, for each interval, from the original
static o-d matrix.
Figure 9: Percentage differences between anticipated and experienced path costs for each
interval (horizontal axis: days)
Figure 10: Numerical example 2, Palermo network
Average travel times are presented in the following table for the first and last simulated days.
Day 1
Intervals:                     6     7     8     9     10    11
Average travel time (min.):    16    19    22    24    24.4  24
Maximum travel time (min.):    97    130   153   144   133   125
Day 23
Intervals:                     6     7     8     9     10    11
Average travel time (min.):    13    15    17.4  18.5  18.3  17.8
Maximum travel time (min.):    60    80    99    112   117   115
Table 6: Palermo network: average travel times for the first and last simulated days.
This network was implemented mainly to investigate the capacity of the model to deal with real
size networks and to converge towards a fixed point. The probability of reconsidering the
previous day's path choice ($\alpha$) is 20% and the parameter $\beta$ used to weight the anticipated cost
of day t-1 in the evaluation of anticipated cost of day t is 0.3 (30%). The results show that the
total travel time of users on the network decreases from the first to the last simulated day in all
intervals. The cause of the reduction in travel times is the spread of congestion and
convergence towards an "equilibrium" point. The spread of congestion also causes a reduction
in the maximum travel time experienced in the worst path of each interval. The simulation on
the Palermo network took on average two hours of simulation time for each simulated day on a
Pentium MMX200. The algorithm was implemented in C++. Some deadlocks were present,
caused also by the small (compared with the network size) sets of considered paths.
6. CONCLUSIONS
The proposed methodology is a Doubly Dynamic Traffic Assignment model that can be applied
using demand and supply data typically available in a traditional static assignment procedure.
Day-to-day dynamics is simulated with a deterministic process. Within-day dynamics is
simulated with a meso-simulation model (MICE, packets based). A detailed discussion of
models for the generation of feasible path choice alternatives, as well as of path choice, is
beyond the scope of the paper (see Cascetta et al, 1996 for more details on the subject). Greater
accuracy in the results of the simulation would clearly be obtained with a larger set of feasible
paths, but the resulting computing times may be too large for practical applications.
At the moment the major limitations of the methodology are, in the authors' opinion,
computing times and the inability to simulate different users' classes. A multi-class version of
the model should be able to cope with different path choice behaviors, different information
access, and should be able to represent different classes of vehicles with different sizes moving
on the network at different speeds.
Current and future research efforts address the following subjects:
- A performance analysis of the Dynamic Network Model used and a sensitivity analysis to
see how the number of feasible paths, the travel time functions and other parameters affect the
results and the occurrence of deadlocks.
- The use of a stochastic process model for the day-to-day dynamics.
- The extension of the procedure to multi-class assignment.
- The extension of the procedure to simulate departure time choice.
7. REFERENCES
Adamo, V. and Astarita, V. (1996) - "Un nuovo modello di caricamento dinamico del traffico
con gestione dello spill-back e degli incidenti" - C.S.S.T. 1996.
Adamo, V., Astarita, V. and Di Gangi M. (1996) A dynamic network loading model for
simulations of queue effects and while-trip re-routing. 24th European Transport Forum
PTRC. 2-6 September 1996 Brunel University, Uxbridge.
Adamo V., Astarita V., Florian M., Mahut M. and Wu J.H. (1998a) Link based dynamic
network loading models with spill-back: intersection models. Publication CRT, 1998.
Adamo V., Astarita V., Florian M., Mahut M. and Wu J.H. (1998b) Link based dynamic
network loading models with spill-back: solution by simulation. Publication CRT, 1998.
Adamo V., Astarita V., Florian M., Mahut M. and Wu J.H. (1999) Modelling the Spillback of
Congestion in Link based Dynamic Network Loading Models: A simulation model with
application. 14ISTT International Symposium on Transportation and Traffic Theory.
Ahuja R.K., Magnanti T.L. and Orlin J.B. (1993) Network flows. Prentice-Hall.
Astarita, V. (1996) A continuous time link model for dynamic network loading based on travel
time function. 13th International Symposium on Theory of Traffic Flow, Lyon July 1996,
J.-B. Lesort ed. Pergamon Press pp. 79-102.
Ben Akiva M., de Palma A. and Kaysi I. (1991). Dynamic network models and driver
information systems. Transportation Research 25A (5), pp. 251-266.
Cantarella G.E. and Cascetta E. (1995). Dynamic Processes and Equilibrium in Transportation
Networks: Towards a Unifying Theory. Transportation Science, 29, pp. 305-329.
Cascetta E. (1998). Ingegneria dei sistemi di trasporto. UTET. Italy.
Cascetta, E. and G.E. Cantarella (1991) A Day-to-day and Within-day Dynamic Stochastic
Assignment Model. Transportation Research 25A (5), pp. 277-291.
Cascetta, E. and G.E. Cantarella (1993). Modelling dynamics in transportation networks.
Journal of Simulation Practice and Theory - Elsevier.
Cascetta, E. and G.E. Cantarella (1998). Stochastic assignment to transportation networks:
models and algorithms. Proceedings of Equilibrium and advanced transportation
modeling colloquium. Montreal, 10-11 October 1996
Cascetta E., Nuzzolo A., Russo F., and Vitetta A. (1996). A modified Logit route choice
model overcoming path overlapping problems. Specification and some calibration results
for interurban networks. In Proceedings of the 13th International Symposium on Theory
of Traffic Flow, Lyon July 1996, J.-B. Lesort ed. Pergamon Press, pp. 697-711 .
Daganzo C.F. (1995a) Properties of Link Travel Time Functions under Dynamics Loads.
Transportation Research 29B(2), pp. 95-98.
Daganzo C.F. (1995b) The cell Transmission model, Part I and Part II. Transportation Research
29B No.2.
de Romph, E., van Grol, H.J.M. and Hamerslag, R. (1992) A Dynamic Traffic Assignment
model for Short-Term predictions. II International Seminar on Urban Traffic Networks,
Capri, July 1992.
Di Gangi M. (1992). Continuous flow approach in Dynamic Network Loading. II International
Seminar on Urban Traffic Networks - CAPRI July 1992.
Di Gangi M. and Astarita V. (1994). Structure of a Dynamic Network Loading Model for the
Evaluation of Control Strategies. - Capri June 1994 - TRISTAN II TRIennial International
Symposium on Transportation ANalysis.
DynaMIT (1996). Development of a deployable real-time dynamic traffic assignment system.
MIT, Boston - Interim Report.
Emmerink R.H.M. (1996). Information and pricing in Road transport. Theoretical and applied
models. Thesis Publishers, Amsterdam.
Fernandez J.E. and de Cea J. (1994). Flow Propagation Description in Dynamic Network
Assignment Models - TRISTAN II TRIennial International Symposium on Transportation
ANalysis - Capri June 1994, pp. 517-532.
Inaudi D., Tartaro D., Toffolo S. and Velardi V. (1998). Modelli matematici per la redazione
del p.u.t. in I piani urbani di traffico. SIDT Franco Angeli Editore. Naples. Italy. 1996.
Janson, B.N. (1989) Dynamic traffic assignment for urban road networks. Transportation
Research, 25B (2/3) pp. 143-161.
Jayakrishnan R., Mahmassani H.S. and Hu T. (1994). An evaluation tool for advanced traffic
information and management systems in urban networks. Transportation Research B
1994.
Lebacque J.P. (1996) The Godunov scheme and what it means for first order traffic flow
models. 13th International Symposium on Theory of Traffic Flow, Lyon July 1996.
Published by Elsevier, pp.647-677.
Leonard D.R., Tough J.B. and Baguley P.C. (1978). A traffic assignment model for predicting
flows and queues during peak periods. TRRL Report 841, 1978.
Nagel K., Rickert M. and Simon P.M. (1998). The dynamics of iterated transportation
simulations. TRISTANHI Puerto Rico June 1998.
Patriksson M. (1994). The traffic assignment problem. - VSP 1994.
Rilett L., Benedek C., Rakha H., and Van Aerde M. (1994) Evaluation of FVHS Options Using
CONTRAM and INTEGRATION. First World Congress on Applications Transport
Telematics & Intelligent Vehicle Highway Systems, Paris, France.
Sheffi Y. (1985). Urban transportation networks. Prentice-Hall.
Smith M.J. and Wisten M.B. (1996) A distributed algorithm for the dynamic traffic equilibrium
assignment problem. 13th International Symposium on Theory of Traffic Flow, Lyon July
1996. Published by Elsevier, pp 385-408.
Yagar S. (1975). CORQ - A model for predicting flows and queues in a road corridor.
Transportation Research Record 533, pp. 77-87, TRB, National Research Council,
Washington D.C.
ROUTE FLOW ENTROPY MAXIMIZATION IN
ORIGIN-BASED TRAFFIC ASSIGNMENT
Hillel Bar-Gera and David Boyce, Department of Civil and Materials Engineering, University of
Illinois at Chicago, Chicago, Illinois, USA
ABSTRACT
Most solution methods for the static user-equilibrium traffic assignment problem are either
link-based or route-based; recently a new origin-based method was proposed. In general,
link-based solutions are unique, but origin-based and route-based solutions are not. Several
researchers have suggested that the entropy maximizing route flow solution is the most likely
one. In this paper the implications of route flow entropy maximization are studied in the context
of origin-based solutions. An alternative intuitive assumption is proposed, and the equivalence
between the proposed assumption and the entropy maximization criterion is examined. As a result,
a natural, easily obtainable route flow interpretation for origin-based solutions is derived.
Related improvements to the origin-based solution method are discussed.
INTRODUCTION
Consider a segment of a main road route with an alternative bypass. Wardrop's user equilibrium
assumption implies that the proportion of users choosing the bypass is such that the cost of each
alternative is the same. Interpreting that proportion as the probability that a certain user chooses
the bypass, one may ask whether that probability depends on the trip origin or trip destination.
The basic user equilibrium traffic assignment model assumes that all users are identical, in the
sense that they all decide in the same way to minimize route cost, which is the same for all users
regardless of their origin and destination. More complex models suggest that route generalized
cost may vary by trip purpose, user group and other attributes, but typically not by origin and
destination. Hence, it seems reasonable to assume that the probability of choosing the bypass is
independent of the origin and the destination.
Figure 1: The bypass proportionality assumption
The same arguments suggest that the probability of choosing the bypass is also independent of
decisions made prior to the point of diversion, and after the merge point. The bypass
proportionality assumption is that the proportion of users choosing a bypass is the same for all
origins, destinations, initial routes (the route segment from the origin to the bypass diverge),
and final routes (the route segment from the bypass merge to the destination).
For example, consider the network of Figure 1. In this network the main route passes through
nodes 1, 2, 3, 4, 5, and 6. Suppose that 800 vph (vehicles/hour) use the main route segment
3 → 4 → 5, while 200 vph divert to the bypass 3 → 8 → 5. The bypass proportionality assumption
suggests that in this case every user remains on the main route with probability 0.8 and chooses
the bypass with probability 0.2. In particular, if the demand from origin B to destination D is
200 vph, then 80% of those, i.e. 160 vph, choose the main route B → 2 → 3 → 4 → 5 → 6 → D,
while 20%, i.e. 40 vph, divert to the bypass and use the route B → 2 → 3 → 8 → 5 → 6 → D.
Similarly, suppose that the demand from origin A to destination C is 350 vph, and that 150 vph
of those start their trip on the initial route A → 7 → 2 → 3 from the origin A to the diverge
node 3. Suppose that out of these, 100 vph end their trip on the direct link from the merge node
5 to the destination C, while the other 50 vph choose the final route 5 → 6 → C. The bypass
proportionality assumption is that the same proportions (80/20) apply to each of these groups;
in particular 80% of the flow in the last group, i.e. 40 vph, follow the main route
A → 7 → 2 → 3 → 4 → 5 → 6 → C, and the remaining 10 vph choose the bypass and follow the
route A → 7 → 2 → 3 → 8 → 5 → 6 → C.
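The arithmetic of this example can be reproduced with a short Python sketch. The function name `split_by_bypass_proportion` is a hypothetical helper introduced here for illustration only; the flow values are those of the Figure 1 example.

```python
def split_by_bypass_proportion(group_flow, main_flow, bypass_flow):
    """Split one group's flow between the main segment and the bypass,
    using the proportions of the total flows on the two segments."""
    total = main_flow + bypass_flow
    if total <= 0:
        raise ValueError("segment flows must sum to a positive value")
    on_main = group_flow * main_flow / total
    on_bypass = group_flow * bypass_flow / total
    return on_main, on_bypass


# Total segment flows from the example: 800 vph on 3 -> 4 -> 5, 200 vph on 3 -> 8 -> 5.
b_to_d = split_by_bypass_proportion(200, 800, 200)      # demand from B to D
last_group = split_by_bypass_proportion(50, 800, 200)   # A -> C users ending via 5 -> 6 -> C
print(b_to_d, last_group)
```

The same call applies unchanged to any other group of users passing through the diverge node 3 and the merge node 5, which is exactly what the assumption states.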
The main goal of this paper is to study the bypass proportionality assumption and to compare it
with the entropy maximizing criterion for the most likely route flow solution suggested by Rossi
et al. (1989). A formal definition of the bypass proportionality condition and related notation
are given in section 1. General background on the traffic assignment problem is given in section
2; this topic is thoroughly covered by Patriksson (1994). In section 3 we show that bypass
proportionality is a necessary but not sufficient condition for route flow entropy maximization
under any feasible constraint on total link flows. In particular, solutions to maximum entropy
user equilibrium (MEUE) and LOGIT assignments must satisfy this condition. Section 4 shows
that the bypass proportionality assumption provides a constructive route flow interpretation for
any feasible a-cyclic origin-based link flow array. This interpretation also maximizes route flow
entropy. Therefore, in the context of route flow interpretations for origin-based solutions the two
assumptions are equivalent. Section 5 extends the constructive solution of section 4 to the general
case when only total link flows are known. Section 6 discusses the implications of this paper to
origin-based assignment methods. In particular, the results of section 4 are proposed as tools for
the analysis of origin-based solutions, and improvements to the origin-based assignment method
are suggested based on the results of section 5.
1 DEFINITIONS
Let the directed graph $G = \{N, A\}$ represent the transportation network, where $N$ is the set
of nodes, and $A$ the set of directed links. The set of origins $N_o$ is a subset of $N$. For
each origin $p \in N_o$, $N_d(p) \subseteq N$ is the set of destinations with positive demand. In
order to formalize the bypass proportionality condition we introduce the following notation for
routes, route segments, and route combinations.
A (simple) route segment is a sequence of (distinct) nodes $[v_1, \ldots, v_k]$ such that
$(v_l, v_{l+1}) \in A$ $\forall\, 1 \le l \le k-1$. In particular, the route segment $[i, j]$ is
the link from node $i$ to node $j$. For generality we also allow the route segment
$[v] \in R_{vv}$, which is the empty route segment at $v$, i.e. the route segment that starts
from $v$, ends at $v$, and does not contain any links. The set of all simple route segments, that
is route segments that do not contain cycles, from node $i$ to node $j$ is denoted by $R_{ij}$.
If route segment $r = [i = v_1, \ldots, v_n = j] \in R_{ij}$ is followed by route segment
$s = [j = u_1, \ldots, u_m = k] \in R_{jk}$ then the combination of the two segments is denoted
by $(r + s) = [i = v_1, \ldots, v_{n-1}, v_n = j = u_1, u_2, \ldots, u_m = k]$. In general, a
combination of simple route segments may not be simple; if it is simple, then
$(r + s) \in R_{ik}$. The statement $s \subseteq r$ means that route segment $s$ is part of route
segment $r$. In particular $a \subseteq r$ means that link $a$ is part of route $r$; this
relationship is also represented by the element $\delta_{ra}$ of the route-link incidence matrix,
which is equal to 1 if link $a$ is part of route $r$ and zero otherwise. The amount of traffic
that flows along route $r \in R_{pq}$ from the origin $p$ to the destination $q$ is denoted by
$h_{r,pq}$.
Using these definitions, bypass proportionality can be formally defined. The term bypass
proportionality stems intuitively from the situation presented in the introduction. In general it
may not be possible to distinguish between the "main route" and the "bypass", therefore we need
to consider any pair of alternative route segments $s, s' \in R_{n_d n_m}$ that start at some
diverge node $n_d \in N$ and end at some merge node $n_m \in N$. One way to formulate bypass
proportionality mathematically is to consider two groups of users: one group starts at origin
$p_1$, uses an initial route segment $r_1^i$ to the diverge node $n_d$, continues through either
alternative segment $s$ or $s'$ to the merge node $n_m$, and ends their trip through a final
route segment $r_1^f$ to their destination $q_1$. The other group starts at origin $p_2$, uses
another initial route segment $r_2^i$ to the diverge node $n_d$, chooses between the same
alternative segments $s$ and $s'$, and ends their trip through a final route segment $r_2^f$ to
their destination $q_2$. The bypass proportionality assumption suggests that the proportion of
users that choose each alternative segment is the same in both groups, hence the flow ratios are
equal, i.e.
$$\frac{h_{r_1^i+s+r_1^f,\,p_1q_1}}{h_{r_1^i+s'+r_1^f,\,p_1q_1}} = \frac{h_{r_2^i+s+r_2^f,\,p_2q_2}}{h_{r_2^i+s'+r_2^f,\,p_2q_2}} \tag{1}$$
Cross multiplying implies that
$$h_{r_1^i+s+r_1^f,\,p_1q_1} \cdot h_{r_2^i+s'+r_2^f,\,p_2q_2} = h_{r_1^i+s'+r_1^f,\,p_1q_1} \cdot h_{r_2^i+s+r_2^f,\,p_2q_2} \tag{2}$$
Condition (1) is probably more intuitive; however, it is only applicable if the denominators are
strictly positive, in which case (1) and (2) are equivalent. In the following definition, condition
(2) is used, as it can be applied to all possible combinations of zero flows as well.
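The practical difference between the ratio form and the cross-multiplied form can be seen in a minimal Python sketch. The function `satisfies_condition_2` and its tolerance argument are hypothetical names introduced for illustration; the four arguments are the route flows of group 1 via s and s', followed by those of group 2 via s and s'.

```python
def satisfies_condition_2(h1_s, h1_s_alt, h2_s, h2_s_alt, tol=1e-9):
    """Cross-multiplied proportionality test; well defined even when
    some (or all) of the four route flows are zero."""
    lhs = h1_s * h2_s_alt
    rhs = h1_s_alt * h2_s
    return abs(lhs - rhs) <= tol * max(1.0, abs(lhs), abs(rhs))


# 160/40 and 40/10 are the same 80/20 split: the condition holds.
same_split = satisfies_condition_2(160.0, 40.0, 40.0, 10.0)
# Nobody uses s' in either group: the ratio form would divide by zero,
# but the cross-multiplied form is still satisfied.
unused_alternative = satisfies_condition_2(10.0, 0.0, 5.0, 0.0)
# Group 1 avoids s' entirely while group 2 splits evenly: violated.
different_split = satisfies_condition_2(10.0, 0.0, 5.0, 5.0)
print(same_split, unused_alternative, different_split)
```

A relative tolerance is used here only because flows are floating-point numbers; it is a design choice of the sketch, not part of the definition.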
Definition: A route flow vector $\mathbf{h} = (h_{r,pq})_{p \in N_o;\, q \in N_d(p);\, r \in R_{pq}}$
satisfies the (strong) bypass proportionality condition iff it satisfies condition (2) for every
diverge node $n_d \in N$, merge node $n_m \in N$, pair of alternative route segments
$s, s' \in R_{n_d n_m}$, pair of origins $p_1, p_2 \in N_o$, pair of destinations
$q_1 \in N_d(p_1)$, $q_2 \in N_d(p_2)$, initial routes $r_1^i \in R_{p_1 n_d}$;
$r_2^i \in R_{p_2 n_d}$, and final routes $r_1^f \in R_{n_m q_1}$; $r_2^f \in R_{n_m q_2}$.
Notice that this definition requires that (2) holds even if some of the route combinations are not
simple. For the sake of simplicity, in the following flows are explicitly restricted to simple
routes, i.e. to routes that do not contain cycles. In that context an alternative condition is
considered, referred to as the weak bypass proportionality condition, which requires that (2)
holds only if all four route combinations are simple. That is,
$(r_1^i + s + r_1^f), (r_1^i + s' + r_1^f) \in R_{p_1 q_1}$, and
$(r_2^i + s + r_2^f), (r_2^i + s' + r_2^f) \in R_{p_2 q_2}$.
The bypass proportionality assumption corresponds to a behavioral postulate that the actual flows
satisfy the bypass proportionality condition.
2 BACKGROUND
Given the demand for travel $\mathbf{d} = (d_{pq})_{p \in N_o;\, q \in N_d(p)}$ (vehicles/hour)
between pairs of origins and destinations, the Traffic Assignment Problem (TAP) is to assign
those flows to specific routes of a given network according to a given behavioral hypothesis. By
the definition of the destination sets, $d_{pq} > 0$ $\forall p \in N_o$; $\forall q \in N_d(p)$.
A common hypothesis in transportation research is Wardrop's user equilibrium principle that users
minimize the cost of their chosen routes, where cost can be interpreted as travel time, monetary
cost, some combination of those, or any other measure of disutility of using the specific route.
The most detailed description of the resulting assignment is given by the route flow vector
$\mathbf{h} = (h_{r,pq})$ where $h_{r,pq}$ represents the flow along route $r$ from origin $p$ to
destination $q$. A route flow vector is feasible if it is non-negative, and if it satisfies the
demand, i.e. $\sum_{r \in R_{pq}} h_{r,pq} = d_{pq}$ $\forall p \in N_o$; $\forall q \in N_d(p)$.
A more concise description of the assignment is obtained by aggregating the flows by origin, the
result being the origin-based link flow array $\mathbf{f} = (f_{p,a})$ where
$f_{p,a} = \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \delta_{ra}$ is the flow on link
$a$ originating at $p$. An origin-based link flow array is feasible iff it is the aggregation of
some feasible route flow vector. The vector of total link flows $\mathbf{v} = (v_a)_{a \in A}$ is
obtained by aggregating the flows further over all origins, that is by letting
$v_a = v_a(\mathbf{h}) = \sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \delta_{ra}$.
Again, a link flow vector will be considered feasible iff it is the aggregation of some feasible
route flow vector. In this paper we assume that link costs $\mathbf{t} = (t_a)_{a \in A}$ are
strictly positive, strictly increasing and separable functions of total link flows, i.e.
$t_a = t_a(v_a)$. Route costs $\mathbf{c} = (c_r)$ are assumed additive over the links, i.e.
$c_r = \sum_{a \subseteq r} t_a$.
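In the deterministic model, costs are assumed to be known perfectly in advance. In that case a
route flow assignment satisfies Wardrop's user equilibrium conditions iff it is an optimal
solution for
$$\begin{aligned}
[\mathrm{TAP}] \quad \min\ & \int \mathbf{c}(\mathbf{h}) \cdot d\mathbf{h} = \sum_{a \in A} \int_0^{v_a(\mathbf{h})} t_a(x)\,dx \\
\text{s.t.}\ & \sum_{r \in R_{pq}} h_{r,pq} = d_{pq} \quad \forall p \in N_o;\ \forall q \in N_d(p) \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in R_{pq}
\end{aligned} \tag{3}$$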
Under the above assumptions with respect to link costs, the objective function of TAP is a well-
defined convex function, and a solution always exists. In general, the route flow solution and
the origin-based solution for TAP are not unique; however, the strict monotonicity of link costs
ensures that total link flows are unique. To overcome that problem Rossi et al. (1989), Janson
(1993) and others proposed consideration of the entropy maximizing user equilibrium (MEUE)
solution as the most likely one. Rydergren recently presented an effective dual method to obtain
the entropy maximizing solution once equilibrium link flows are known (Larsson et al., 1998).
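As an illustration of the deterministic model, the following Python sketch solves [TAP] for the simplest possible case: one O-D pair served by a main route and a bypass. The linear cost functions and the demand of 1000 vph are assumptions chosen only for illustration, and `equilibrium_split` is a hypothetical helper. With a single O-D pair, the derivative of the objective of (3) with respect to the main-route flow is the cost difference between the two routes, so bisection on that difference finds the minimizer.

```python
def equilibrium_split(demand, t_main, t_bypass, iters=60):
    """Return the main-route flow at user equilibrium for two parallel
    routes with strictly increasing cost functions."""
    def gap(x):
        # Derivative of the [TAP] objective w.r.t. the main-route flow x.
        return t_main(x) - t_bypass(demand - x)

    if gap(0.0) >= 0.0:       # main route is never cheaper: all flow on bypass
        return 0.0
    if gap(demand) <= 0.0:    # bypass is never cheaper: all flow on main route
        return demand
    lo, hi = 0.0, demand
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed (hypothetical) link cost functions, in minutes, of flow in vph.
def t_main(v):
    return 10.0 + 0.01 * v

def t_bypass(v):
    return 15.0 + 0.02 * v

main_flow = equilibrium_split(1000.0, t_main, t_bypass)
bypass_flow = 1000.0 - main_flow
print(main_flow, bypass_flow, t_main(main_flow), t_bypass(bypass_flow))
```

At the returned split both routes are used and their costs coincide, which is Wardrop's condition for this small case.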
Stochastic models assume that perceived costs are random variables. Various stochastic models
have been proposed in the literature. In this paper we refer only to the LOGIT model, which
assumes perceived route costs to be $C_r = c_r + \frac{1}{\mu}\epsilon_r$, where $\epsilon_r$ are
independent Gumbel variates. The resulting assignment is the unique optimal solution of
$$\begin{aligned}
[\mathrm{LOGIT}] \quad \min\ & \int \mathbf{c}(\mathbf{h}) \cdot d\mathbf{h} + \frac{1}{\mu} \cdot \sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
\text{s.t.}\ & \sum_{r \in R_{pq}} h_{r,pq} = d_{pq} \quad \forall p \in N_o;\ \forall q \in N_d(p) \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in R_{pq}
\end{aligned} \tag{4}$$
The additional term in the objective function is the negative of the route flow entropy, divided
by the cost sensitivity coefficient $\mu$. It is known that in the limit when the variation
approaches zero, i.e. $\mu \to \infty$, the LOGIT assignment approaches the deterministic user
equilibrium solution.
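The role of the cost sensitivity coefficient can be illustrated with a small Python sketch of the logit split for one O-D pair. It is a simplification: route costs are held fixed (an assumption of this sketch), whereas in [LOGIT] they depend on the resulting flows. The function name `logit_route_flows` and the numerical costs are hypothetical.

```python
import math

def logit_route_flows(demand, route_costs, mu):
    """Multinomial logit split of one O-D demand over routes with
    fixed costs: flow on route r is proportional to exp(-mu * c_r)."""
    c_min = min(route_costs)
    # Subtracting the minimum cost leaves the shares unchanged and
    # avoids overflow for large mu.
    weights = [math.exp(-mu * (c - c_min)) for c in route_costs]
    total = sum(weights)
    return [demand * w / total for w in weights]


costs = [20.0, 21.0, 25.0]                          # assumed route costs
indifferent = logit_route_flows(100.0, costs, 0.0)   # mu = 0: equal shares
moderate = logit_route_flows(100.0, costs, 1.0)
sharp = logit_route_flows(100.0, costs, 50.0)        # large mu: cheapest route only
print(indifferent, moderate, sharp)
```

As the coefficient grows, the split concentrates on the minimum cost route, in line with the limiting behaviour described above.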
3 MAXIMUM ENTROPY USER EQUILIBRIUM (MEUE)
As mentioned in the last section, the user equilibrium assignment is unique only in terms of total
link flows. A possible criterion for the most likely route flows is entropy maximization. In this
section route flow entropy maximization is used to obtain a route flow representation for a general
feasible vector of total link flows. When applied to the user equilibrium link flows, the MEUE
solution is obtained; however, the results are valid for any feasible link flows vector.
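Suppose that $\mathbf{v}$ is a feasible vector of total link flows. The route flow entropy
maximizing representation of $\mathbf{v}$ is the optimal route flow solution for:
$$\begin{aligned}
\max\ & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
\text{s.t.}\ & \sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \delta_{ra} = v_a \quad \forall a \in A \\
& \sum_{r \in R_{pq}} h_{r,pq} = d_{pq} \quad \forall p \in N_o;\ \forall q \in N_d(p) \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in R_{pq}
\end{aligned} \tag{5}$$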
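Since $\mathbf{v}$ is feasible, by definition it has at least one route flow representation which
is a feasible solution for (5). Let $\hat{R}_{pq}$ denote the set of routes $r \in R_{pq}$ such
that there is some feasible solution for (5) with $h_{r,pq} > 0$. Problem (5) is therefore
equivalent to:
$$\begin{aligned}
\max\ & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in \hat{R}_{pq}} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
\text{s.t.}\ & \sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in \hat{R}_{pq}} h_{r,pq} \cdot \delta_{ra} = v_a \quad \forall a \in A \\
& \sum_{r \in \hat{R}_{pq}} h_{r,pq} = d_{pq} \quad \forall p \in N_o;\ \forall q \in N_d(p) \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in \hat{R}_{pq}
\end{aligned} \tag{6}$$
In the optimal solution every route in
$\hat{R} = \bigcup_{p \in N_o} \bigcup_{q \in N_d(p)} \hat{R}_{pq}$ must have positive flow,
therefore the optimal solution of (6) is an inner point, at which the objective function is
differentiable. The Lagrangian is:
$$\begin{aligned}
L = & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in \hat{R}_{pq}} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
& - \sum_{a \in A} \beta_a \cdot \Big( v_a - \sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in \hat{R}_{pq}} h_{r,pq} \cdot \delta_{ra} \Big) \\
& - \sum_{p \in N_o} \sum_{q \in N_d(p)} \gamma_{pq} \cdot \Big( d_{pq} - \sum_{r \in \hat{R}_{pq}} h_{r,pq} \Big)
\end{aligned}$$
and the inner point optimality conditions are that for every origin $p$, destination $q$ and
route $r \in \hat{R}_{pq}$
$$\begin{aligned}
\frac{\partial L}{\partial h_{r,pq}} &= -\ln(h_{r,pq}/d_{pq}) + \sum_{a \in A} \beta_a \cdot \delta_{ra} + \gamma_{pq} = 0 \\
h_{r,pq} &= d_{pq} \cdot \exp\Big( \gamma_{pq} + \sum_{a \subseteq r} \beta_a \Big)
\end{aligned} \tag{7}$$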
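Using this derivation we can verify that the optimal solution for (5) satisfies the weak bypass
proportionality condition. Suppose $n_d, n_m \in N$; $s, s' \in R_{n_d n_m}$;
$p_1, p_2 \in N_o$; $q_1 \in N_d(p_1)$; $q_2 \in N_d(p_2)$; $r_1^i \in R_{p_1 n_d}$;
$r_2^i \in R_{p_2 n_d}$; $r_1^f \in R_{n_m q_1}$; $r_2^f \in R_{n_m q_2}$. If all four route
combinations can have positive flow, i.e.
$(r_1^i + s + r_1^f), (r_1^i + s' + r_1^f) \in \hat{R}_{p_1 q_1}$;
$(r_2^i + s + r_2^f), (r_2^i + s' + r_2^f) \in \hat{R}_{p_2 q_2}$, we can substitute (7) in (2),
denote $w_r = \exp\big(\sum_{a \subseteq r} \beta_a\big)$ and get
$$\begin{aligned}
& \big[d_{p_1q_1} \cdot \exp(\gamma_{p_1q_1}) \cdot w_{r_1^i} \cdot w_s \cdot w_{r_1^f}\big] \cdot \big[d_{p_2q_2} \cdot \exp(\gamma_{p_2q_2}) \cdot w_{r_2^i} \cdot w_{s'} \cdot w_{r_2^f}\big] = \\
& \big[d_{p_1q_1} \cdot \exp(\gamma_{p_1q_1}) \cdot w_{r_1^i} \cdot w_{s'} \cdot w_{r_1^f}\big] \cdot \big[d_{p_2q_2} \cdot \exp(\gamma_{p_2q_2}) \cdot w_{r_2^i} \cdot w_s \cdot w_{r_2^f}\big]
\end{aligned} \tag{8}$$
which is clearly true.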
Otherwise, suppose w.l.o.g. that $(r_1^i + s + r_1^f) \notin \hat{R}_{p_1 q_1}$ and hence
$h_{r_1^i+s+r_1^f,\,p_1q_1} = 0$. If the right hand side of (2) is not zero, then there is
$\epsilon > 0$ such that $h_{r_1^i+s'+r_1^f,\,p_1q_1} > \epsilon > 0$ and
$h_{r_2^i+s+r_2^f,\,p_2q_2} > \epsilon > 0$. If all four route combinations are simple, we can
shift $\epsilon$ flow from $(r_1^i + s' + r_1^f)$ to $(r_1^i + s + r_1^f)$ and from
$(r_2^i + s + r_2^f)$ to $(r_2^i + s' + r_2^f)$ and get another feasible solution where
$h_{r_1^i+s+r_1^f,\,p_1q_1} = \epsilon > 0$, hence $(r_1^i + s + r_1^f) \in \hat{R}_{p_1 q_1}$, a
contradiction.
We showed that the route flow entropy maximizing representation for any feasible total link
flows constraint satisfies the weak bypass proportionality condition. In particular, the solution
to MEUE which is defined by maximizing route flow entropy under the user equilibrium total
link flows, must satisfy the weak bypass proportionality condition. The solution to the LOGIT
assignment may also be described as maximizing entropy under some total link flows constraints;
hence it must also satisfy this condition.
Our conjecture is that if flows are not explicitly restricted to simple routes, the route flow entropy
maximizing representation will satisfy the strong bypass proportionality condition. In the case of
user equilibrium, the restriction to simple routes is done implicitly, i.e. routes that contain cycles
can not be cost minimizing since link costs are strictly positive. Therefore the MEUE solution
also satisfies the strong bypass proportionality condition.
The next question is whether satisfying bypass proportionality is sufficient for entropy
maximization. To answer this question consider the network in Figures 2a and 2b. In this network
demand flows along the diagonals, $d_{1,3} = d_{3,1} = d_{2,4} = d_{4,2} = 10$ vph, and is zero
for all other O-D pairs. The total flow on each link in both figures is 10 vph; therefore, both
are feasible solutions for the same route flow representation problem. In Figure 2a all the flow
from 1 to 3 and from 3 to 1 uses the counter-clockwise links, while the flow from 2 to 4 and from
4 to 2 uses the clockwise links. In Figure 2b the flows between each O-D pair are evenly
distributed, half going clockwise, and half going counter-clockwise. One can verify that the
route flow representations in both figures satisfy the bypass proportionality condition; however,
route flow entropy is maximized only by the route flow representation of Figure 2b. From this
example we learn that bypass proportionality is only a necessary, but not sufficient condition
for entropy maximization.
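The comparison of the two route flow representations can be checked numerically with the Python sketch below. It assumes that nodes 1, 2, 3, 4 are placed clockwise around a square with one link in each direction between neighbouring nodes; this layout is an assumption made for illustration, as are the helper names. Entropy is evaluated with the objective of (5).

```python
import math
from collections import defaultdict

DEMAND = 10.0  # vph for each of the O-D pairs (1,3), (3,1), (2,4), (4,2)

# Assumed layout: nodes 1, 2, 3, 4 placed clockwise around a square.
CLOCKWISE = {1: 2, 2: 3, 3: 4, 4: 1}
COUNTER = {j: i for i, j in CLOCKWISE.items()}  # counter-clockwise successor

def route_links(origin, direction):
    """Links of the two-link route from origin to the opposite corner."""
    step = CLOCKWISE if direction == "cw" else COUNTER
    mid = step[origin]
    return [(origin, mid), (mid, step[mid])]

def link_flows(route_flows):
    """Aggregate route flows, keyed by (origin, direction), to total link flows."""
    flows = defaultdict(float)
    for (origin, direction), h in route_flows.items():
        for link in route_links(origin, direction):
            flows[link] += h
    return dict(flows)

def entropy(route_flows, demand=DEMAND):
    """Route flow entropy as in the objective of (5); zero flows contribute nothing."""
    return -sum(h * (math.log(h / demand) - 1.0)
                for h in route_flows.values() if h > 0.0)

# Figure 2a: each O-D pair uses a single direction.
fig_2a = {(1, "ccw"): 10.0, (3, "ccw"): 10.0, (2, "cw"): 10.0, (4, "cw"): 10.0}
# Figure 2b: each O-D pair splits evenly between the two directions.
fig_2b = {(o, way): 5.0 for o in (1, 2, 3, 4) for way in ("cw", "ccw")}

print(link_flows(fig_2a) == link_flows(fig_2b))
print(entropy(fig_2a), entropy(fig_2b))
```

Both representations aggregate to the same total link flows, while the evenly split representation has the larger entropy value.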
4 ROUTE FLOW INTERPRETATIONS FOR ORIGIN-BASED SOLUTIONS
In the previous section, entropy maximization and bypass proportionality were considered in
determining the most likely route flow representation for a given total link flow vector. In this
section similar criteria are considered when route flows are further restricted by a specific
origin-based link flow array, $\mathbf{f} = (f_{p,a})$. One reason to consider this question
would be the use of origin-based solution methods to find the entropy maximizing user equilibrium
assignment; however, such methods have not yet been proposed. Origin-based methods were proposed
by Bruynooghe et al. (1969) and by Bar-Gera and Boyce (1998) to solve the basic user equilibrium
traffic assignment problem. The latter produces feasible a-cyclic origin-based solutions that are
not necessarily in agreement with the entropy maximizing route flow representation for the same
total link flows. Nevertheless, route flow interpretations for these origin-based solutions are
helpful in understanding the origin-based solutions, and useful for evaluation purposes.
Therefore, in this section we use the two assumptions, entropy maximization and bypass
proportionality, to determine the most likely route flow interpretation for a general feasible
a-cyclic origin-based link flow array.
Figure 2: a) Bypass proportionality holds (flow labels of 10 vph) b) Maximum route flow entropy
(flow labels of 5 vph)
Given a feasible a-cyclic origin-based link flow array, $\mathbf{f} = (f_{p,a})$, a route flow
interpretation is a vector of route flows $\mathbf{h}$ that satisfies
$\sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \delta_{ra} = f_{p,a}$ for every origin
$p \in N_o$ and every link $a \in A$. Notice that since $\mathbf{f}$ is feasible, every route
flow interpretation of it is a feasible assignment.
Denote the used subnetwork for origin $p$ by $A_p = \{a \in A : f_{p,a} > 0\} \subseteq A$. The
set of route segments from node $i$ to node $j$ that are included in the subnetwork $A_p$ is
denoted by $R_{ij}[A_p] = \{r \in R_{ij} : a \subseteq r \Rightarrow a \in A_p\}$. By assuming
that $\mathbf{f}$ is a-cyclic we mean that each of the used subnetworks $A_p$ is a-cyclic. Since
used subnetworks are a-cyclic, every route segment included in them is simple, and every
combination of such route segments is also simple. Therefore in this section, the earlier
distinction between strong and weak bypass proportionality is irrelevant.
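The entropy maximizing interpretation is the optimal solution for:
$$\begin{aligned}
\max\ & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
\text{s.t.}\ & \sum_{q \in N_d(p)} \sum_{r \in R_{pq}} h_{r,pq} \cdot \delta_{ra} = f_{p,a} \quad \forall p \in N_o;\ \forall a \in A \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in R_{pq}
\end{aligned} \tag{9}$$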
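In every interpretation of $\mathbf{f}$, a route $r$ can carry a positive flow only if it is
included in the used subnetwork, i.e. $h_{r,pq} > 0 \Rightarrow r \in R_{pq}[A_p]$. Therefore
problem (9) can be rewritten as:
$$\begin{aligned}
\max\ & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
\text{s.t.}\ & \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \cdot \delta_{ra} = f_{p,a} \quad \forall p \in N_o;\ \forall a \in A \\
& h_{r,pq} \ge 0 \quad \forall p \in N_o;\ \forall q \in N_d(p);\ \forall r \in R_{pq}[A_p]
\end{aligned} \tag{10}$$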
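The Lagrangian is:
$$\begin{aligned}
L = & -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \left(\ln(h_{r,pq}/d_{pq}) - 1\right) \\
& - \sum_{p \in N_o} \sum_{a \in A_p} \beta_{p,a} \cdot \Big( f_{p,a} - \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \cdot \delta_{ra} \Big)
\end{aligned} \tag{11}$$
and the inner point optimality conditions are that for every origin $p$, destination $q$ and
route $r \in R_{pq}[A_p]$
$$\begin{aligned}
\frac{\partial L}{\partial h_{r,pq}} &= -\ln(h_{r,pq}/d_{pq}) + \sum_{a \in A_p} \beta_{p,a} \cdot \delta_{ra} = 0 \\
h_{r,pq} &= d_{pq} \cdot \exp\Big( \sum_{a \in A_p} \beta_{p,a} \cdot \delta_{ra} \Big) = d_{pq} \cdot \exp\Big( \sum_{a \subseteq r} \beta_{p,a} \Big)
\end{aligned} \tag{12}$$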
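Notice that (12) is very similar to (7), except that the Lagrange multipliers related to the
links are origin-dependent, and the O-D Lagrange multiplier is omitted. As in the previous
section, this derivation allows us to check the bypass proportionality condition for the optimal
solution of (9). If all four route combinations are included in the used subnetworks, i.e.
$(r_1^i + s + r_1^f), (r_1^i + s' + r_1^f) \in R_{p_1 q_1}[A_{p_1}]$;
$(r_2^i + s + r_2^f), (r_2^i + s' + r_2^f) \in R_{p_2 q_2}[A_{p_2}]$, we can substitute (12) in
(2), use the abbreviation $w_{p,r} = \exp\big(\sum_{a \subseteq r} \beta_{p,a}\big)$ and get
$$\begin{aligned}
& d_{p_1q_1} \cdot w_{p_1,r_1^i} \cdot w_{p_1,s} \cdot w_{p_1,r_1^f} \cdot d_{p_2q_2} \cdot w_{p_2,r_2^i} \cdot w_{p_2,s'} \cdot w_{p_2,r_2^f} = \\
& d_{p_1q_1} \cdot w_{p_1,r_1^i} \cdot w_{p_1,s'} \cdot w_{p_1,r_1^f} \cdot d_{p_2q_2} \cdot w_{p_2,r_2^i} \cdot w_{p_2,s} \cdot w_{p_2,r_2^f}
\end{aligned} \tag{13}$$
which holds if
$$w_{p_1,s} \cdot w_{p_2,s'} = w_{p_1,s'} \cdot w_{p_2,s} \tag{14}$$
This equation is certainly true if $p_1 = p_2$; however in general it may not hold.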
It is also necessary to check the bypass proportionality condition if one of the four route
combinations is not included in the used subnetwork. Suppose w.l.o.g. that
$(r_1^i + s + r_1^f) \notin R_{p_1 q_1}[A_{p_1}]$ and therefore either
$r_1^i \notin R_{p_1 n_d}[A_{p_1}]$, or $r_1^f \notin R_{n_m q_1}[A_{p_1}]$, or
$s \notin R_{n_d n_m}[A_{p_1}]$. In any case, $h_{r_1^i+s+r_1^f,\,p_1q_1} = 0$. In the first two
cases, $(r_1^i + s' + r_1^f)$ is also not in $R_{p_1 q_1}[A_{p_1}]$, and condition (2) holds.
However in the last case, the fact that $s \notin R_{n_d n_m}[A_{p_1}]$ does not necessarily
imply that $s \notin R_{n_d n_m}[A_{p_2}]$, unless $p_1$ and $p_2$ are the same origin.
In conclusion, solution (12) satisfies condition (2) for groups of users that start from the same
origin. One should note that bypass proportions in a general feasible origin-based link flow array
may be different from one origin to the other, in which case there will not be any route flow
interpretation that satisfies the bypass proportionality condition, as defined in section 1.
Therefore a weaker condition should be considered, namely that equation (2) holds when the two
groups of users have the same origin, i.e. $p_1 = p_2$. This is referred to as the origin-based
bypass proportionality condition. As shown above, the entropy maximizing route flow
interpretation (12) does satisfy the origin-based bypass proportionality condition.
In section 3, when only total link flows were given, we found that there may be more than one
route flow representation that satisfy the bypass proportionality condition. In the following we
show that when the more detailed origin-based link flows are given, there is only one route flow
interpretation that satisfies the origin-based bypass proportionality condition. As will be shown,
this interpretation also satisfies the inner point optimality conditions of the entropy maximization
problem, thus demonstrating that the two assumptions are equivalent.
The following derivation uses additional definitions for aggregating flows in various ways. The
O-D segment flow $f_{pq,s}$ is the aggregation of all the flows from the origin $p$ to the
destination $q$ that share a specific route segment $s$. The route segment $s$ does not have to
start at the origin $p$ or end at the destination $q$. It can be part of a route, or possibly
part of several different routes from $p$ to $q$. It is defined mathematically as the sum over
all routes $r \in R_{pq}$ such that the route segment $s$ is part of the route $r$, that is:
$$f_{pq,s} = \sum_{r \in R_{pq};\ s \subseteq r} h_{r,pq} \tag{15}$$
The arriving flow from origin p to node j, $af_{pj}$, is the aggregation of all the flows that originate at
p and arrive at j, either on their way to another destination, or to stop at j, if it is the destination.
Formally: $af_{pj} = \sum_{[i,j] \in A} f_{p,[i,j]}$. When the arriving flow is strictly positive, the proportion of
flow that arrives at j from a specific approach $[i,j] \in A$ is denoted by $\alpha_{p,[i,j]} = f_{p,[i,j]} / af_{pj}$.
For a given origin p and link $a = [i,j] \in A_p$ consider a route segment $s_1 + [i,j]$ for some
$s_1 \in R_{pi}[A_p]$, an alternative route segment $s_2 \in R_{pj}[A_p]$, destinations $q, q' \in N_d(p)$ and final
routes $r \in R_{jq}[A_p]$; $r' \in R_{jq'}[A_p]$. The origin-based bypass proportionality assumption (with
empty initial routes $r_1^i = r_2^i = [p]$) implies that:
$$h_{s_1+[i,j]+r,\,pq} \cdot h_{s_2+r',\,pq'} = h_{s_2+r,\,pq} \cdot h_{s_1+[i,j]+r',\,pq'} \qquad (16)$$

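Notice that:
$$\sum_{s_1 \in R_{pi}[A_p]} h_{s_1+[i,j]+r,\,pq} = f_{pq,[i,j]+r} \qquad (17)$$
$$\sum_{s_2 \in R_{pj}[A_p]} \ \sum_{q' \in N_d(p)} \ \sum_{r' \in R_{jq'}[A_p]} h_{s_2+r',\,pq'} = af_{pj} \qquad (18)$$
$$\sum_{s_2 \in R_{pj}[A_p]} h_{s_2+r,\,pq} = f_{pq,r} \qquad (19)$$
$$\sum_{s_1 \in R_{pi}[A_p]} \ \sum_{q' \in N_d(p)} \ \sum_{r' \in R_{jq'}[A_p]} h_{s_1+[i,j]+r',\,pq'} = f_{p,[i,j]} \qquad (20)$$
Sum (16) over all possible $s_1$, $s_2$, $q'$, $r'$ and use (17), (18), (19), and (20) to get
$$f_{pq,[i,j]+r} \cdot af_{pj} = f_{pq,r} \cdot f_{p,[i,j]} \qquad (21)$$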
Since $[i,j] \in A_p$, $f_{p,[i,j]} > 0$, hence $af_{pj} > 0$ and therefore we can rewrite (21) as
$$f_{pq,[i,j]+r} = \alpha_{p,[i,j]} \cdot f_{pq,r} \qquad (22)$$
which may be interpreted as an approach proportionality condition. Since $A_p$ is a-cyclic, if
$r \in R_{pq}[A_p]$ then there cannot be a longer route $r' \in R_{pq}[A_p]$ that contains r; therefore the route
flow and the O-D segment flow are equal, $h_{r,pq} = f_{pq,r}$. By a similar argument $f_{pq,[q]} = d_{pq}$, and
hence for any route $r \in R_{pq}[A_p]$
$$h_{r,pq} = f_{pq,r} = f_{pq,[q]} \cdot \prod_{a \subseteq r} \alpha_{p,a} = d_{pq} \cdot \prod_{a \subseteq r} \alpha_{p,a} \qquad (23)$$
The route flow interpretation given by (23) clearly satisfies the inner point optimality conditions for
the entropy maximization problem with the Lagrange multipliers $\beta_{p,a} = \ln(\alpha_{p,a})$. In conclusion,
when a feasible a-cyclic origin-based link flow array is given, route flow entropy maximization,
the origin-based bypass proportionality assumption and the approach proportionality assumption
are equivalent.
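As an illustration of (22) and (23), the following minimal Python sketch computes approach proportions from the link flows of one origin and then builds the route flows as the demand times the product of approach proportions. The toy network (nodes 1 to 4, one origin, one destination) and the function names are hypothetical and are not taken from the paper; the sketch assumes the origin-based link flows are feasible and a-cyclic.

```python
from collections import defaultdict

def approach_proportions(link_flows):
    """alpha[(i, j)] = f[(i, j)] / (flow arriving at j), for a single origin."""
    arriving = defaultdict(float)
    for (_, j), f in link_flows.items():
        arriving[j] += f
    return {(i, j): f / arriving[j] for (i, j), f in link_flows.items() if f > 0}

def route_flows(origin, dest, demand, alpha):
    """Enumerate used routes backwards from dest and apply equation (23)."""
    preds = defaultdict(list)
    for (i, j) in alpha:
        preds[j].append(i)
    flows = {}

    def walk(node, suffix, prop):
        if node == origin:
            flows[(origin,) + suffix] = demand * prop
            return
        for i in preds[node]:
            walk(i, (node,) + suffix, prop * alpha[(i, node)])

    walk(dest, (), 1.0)
    return flows

# Hypothetical link flows from origin 1, with 10 units of demand to node 4.
link_flows = {(1, 2): 6.0, (1, 3): 4.0, (2, 3): 2.0, (2, 4): 4.0, (3, 4): 6.0}
alpha = approach_proportions(link_flows)
flows = route_flows(1, 4, 10.0, alpha)
```

In this toy case the three used routes 1-2-4, 1-3-4 and 1-2-3-4 receive flows of about 4, 4 and 2, which add up to the demand and reproduce the given link flows.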
Notice that $(\alpha_{p,a})$ are immediately available for any feasible origin-based link flow array $(f_{p,a})$.
The main effort in obtaining the route flow interpretation by (23) is to enumerate all used routes.
In some cases route enumeration may be avoided; for example, using (23) the computation of
route flow entropy can be simplified as follows:
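$$E(f) = E(h(f)) = -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \cdot \ln(h_{r,pq}/d_{pq})$$
$$= -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \cdot \ln\Big(\prod_{a \subseteq r} \alpha_{p,a}\Big)$$
$$= -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} \sum_{a \subseteq r} h_{r,pq} \cdot \ln(\alpha_{p,a})$$
$$= -\sum_{p \in N_o} \sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} \sum_{a \in A_p} h_{r,pq} \cdot \ln(\alpha_{p,a}) \cdot \mathbf{1}[a \subseteq r]$$
$$= -\sum_{p \in N_o} \sum_{a \in A_p} \ln(\alpha_{p,a}) \cdot \Big[\sum_{q \in N_d(p)} \sum_{r \in R_{pq}[A_p]} h_{r,pq} \cdot \mathbf{1}[a \subseteq r]\Big]$$
$$= -\sum_{p \in N_o} \sum_{a \in A_p} \ln(\alpha_{p,a}) \cdot f_{p,a} \qquad (24)$$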
The computation of route flow entropy by the last expression is substantially faster than computing
entropy using route flows directly.
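The equality in (24) can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the network, the flows and the function names are hypothetical, and the route flows are the ones that (23) yields for these link flows (demand times the product of approach proportions).

```python
import math
from collections import defaultdict

def entropy_from_link_flows(origin_link_flows):
    """Equation (24): E = -sum_p sum_a ln(alpha_{p,a}) * f_{p,a}."""
    total = 0.0
    for link_flows in origin_link_flows.values():
        arriving = defaultdict(float)
        for (_, j), f in link_flows.items():
            arriving[j] += f
        for (_, j), f in link_flows.items():
            if f > 0:
                total -= math.log(f / arriving[j]) * f
    return total

def entropy_from_route_flows(route_flows, demand):
    """Direct definition: E = -sum h_{r,pq} * ln(h_{r,pq} / d_pq)."""
    return -sum(h * math.log(h / demand[od])
                for od, routes in route_flows.items()
                for h in routes.values() if h > 0)

# Hypothetical data: one origin (node 1), one destination (node 4), demand 10.
origin_link_flows = {1: {(1, 2): 6.0, (1, 3): 4.0, (2, 3): 2.0,
                         (2, 4): 4.0, (3, 4): 6.0}}
demand = {(1, 4): 10.0}
route_flows = {(1, 4): {(1, 2, 4): 4.0, (1, 3, 4): 4.0, (1, 2, 3, 4): 2.0}}

e_links = entropy_from_link_flows(origin_link_flows)
e_routes = entropy_from_route_flows(route_flows, demand)
```

The link-based version loops once over the links of each origin, while the route-based version needs the enumerated routes, which is the saving the text refers to.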
5 EXTENDED APPROACH PROPORTIONALITY
In section 4 the origin-based bypass proportionality assumption was used to derive the approach
proportionality condition (22), which provided a constructive solution for the entropy maximizing
route flow interpretation problem. In this section we examine the possibility to extend this result
for solving (5) when only total link flows are given, using the strong bypass proportionality
assumption. In particular we are looking for situations where the approach proportions are
expected to be equal across origins, that is $\alpha_{p_1,a} = \alpha_{p_2,a}$ for some link $a = [i,j]$ and some origins
$p_1$ and $p_2$. It is quite unlikely to expect the approach proportions to be equal for all origins;
however, if the routes from two origins arrive at the termination node j from the same direction,
such equality may hold. For example in Figure 3a the approach proportions of links [6,8] and
[7,8] are the same for both origins 1 and 2. (All routes carry the same flow.) On the other
hand in Figure 3b when additional routes are considered, these approach proportions differ by
origin, even though the bypass proportionality assumption still holds. (Again all routes carry
the same flow.) The general direction from the origins to the termination node is therefore not a
sufficient condition for equal approach proportions, and the specific structure of used routes must
be considered.
To analyze the difference between the two cases we use the following definitions. The origin-based
segment flow $f_{p,s}$ is the aggregation of O-D segment flows over all destinations, that
is:
$$f_{p,s} = \sum_{q \in N_d(p)} f_{pq,s} = \sum_{q \in N_d(p)} \ \sum_{r \in R_{pq};\ s \subseteq r} h_{r,pq} \qquad (25)$$
This is consistent with the definition of the origin-based link flow $f_{p,a}$, as every link is also a route
segment. A common node from origin p to node j is a node other than j that is common to all
used route segments from p to j, that is
$$COM_{pj} = \{\, n \in N,\ n \neq j \ :\ n \in s \ \ \forall s \in R_{pj} \text{ with } f_{p,s} > 0 \,\} \qquad (26)$$
The definition of a common node is valid only if there is at least one used route segment from p
to j.
Figure 3: a) Equal approach proportions and same last common node; b) Different approach proportions and different last common node
In Figure 3a, node 4 is a common node from origin 1 to node 8; the same node is also a common
node from origin 2 to node 8. In Figure 3b the only common node from origin 1 to node 8 is the
origin - node 1, and the only common node from origin 2 to node 8 is the origin - node 2, and
they are different. The following lemma suggests that this is the essential criterion for approach
proportions to be equal.
Lemma: If h is a feasible route flow vector that satisfies the bypass proportionality condition,
and if for some node $j \in N$ and two origins $p_1, p_2 \in N_o$, there is a node n which is a common
node from $p_1$ to j and also from $p_2$ to j, then the proportion of every approach to j is the same
for both origins. That is:
$$COM_{p_1 j} \cap COM_{p_2 j} \neq \emptyset \ \Rightarrow\ \alpha_{p_1,[i,j]} = \alpha_{p_2,[i,j]} \quad \forall [i,j] \in A \qquad (27)$$
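Proof: Let $n \in COM_{p_1 j} \cap COM_{p_2 j}$. Consider a specific approach $[i,j] \in A$. For any route
segments $s \in R_{ni}$ and $s' \in R_{nj}$, destinations $q_1 \in N_d(p_1)$, $q_2 \in N_d(p_2)$, initial routes $r_1^i \in R_{p_1 n}$;
$r_2^i \in R_{p_2 n}$, and final routes $r_1^f \in R_{j q_1}$; $r_2^f \in R_{j q_2}$, the bypass proportionality assumption states
that:
$$h_{r_1^i+s+[i,j]+r_1^f,\,p_1 q_1} \cdot h_{r_2^i+s'+r_2^f,\,p_2 q_2} = h_{r_1^i+s'+r_1^f,\,p_1 q_1} \cdot h_{r_2^i+s+[i,j]+r_2^f,\,p_2 q_2} \qquad (28)$$
Sum over all possible $s$, $s'$, $q_1$, $q_2$, $r_1^i$, $r_1^f$, $r_2^i$, $r_2^f$ and note that:
$$\sum_{q_1 \in N_d(p_1)} \ \sum_{r_1^i \in R_{p_1 n}} \ \sum_{s \in R_{ni}} \ \sum_{r_1^f \in R_{j q_1}} h_{r_1^i+s+[i,j]+r_1^f,\,p_1 q_1} = f_{p_1,[i,j]}$$
$$\sum_{q_2 \in N_d(p_2)} \ \sum_{r_2^i \in R_{p_2 n}} \ \sum_{s' \in R_{nj}} \ \sum_{r_2^f \in R_{j q_2}} h_{r_2^i+s'+r_2^f,\,p_2 q_2} = af_{p_2 j}$$
$$\sum_{q_1 \in N_d(p_1)} \ \sum_{r_1^i \in R_{p_1 n}} \ \sum_{s' \in R_{nj}} \ \sum_{r_1^f \in R_{j q_1}} h_{r_1^i+s'+r_1^f,\,p_1 q_1} = af_{p_1 j}$$
$$\sum_{q_2 \in N_d(p_2)} \ \sum_{r_2^i \in R_{p_2 n}} \ \sum_{s \in R_{ni}} \ \sum_{r_2^f \in R_{j q_2}} h_{r_2^i+s+[i,j]+r_2^f,\,p_2 q_2} = f_{p_2,[i,j]}$$
to get
$$f_{p_1,[i,j]} \cdot af_{p_2 j} = af_{p_1 j} \cdot f_{p_2,[i,j]} \qquad (29)$$
The definition of a common node requires that there be at least one used route segment from
$p_1$ to j, and therefore $af_{p_1 j} > 0$; similarly $af_{p_2 j} > 0$, and therefore
$$\alpha_{p_1,[i,j]} = \frac{f_{p_1,[i,j]}}{af_{p_1 j}} = \frac{f_{p_2,[i,j]}}{af_{p_2 j}} = \alpha_{p_2,[i,j]} \qquad (30)$$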
Condition (27) is fairly general, but it may not be so easy to verify. When all used routes from each
origin are included in some a-cyclic subnetwork, an alternative condition can be derived, which
is easier to verify. For every such a-cyclic subnetwork $A_p$, a topological order can be defined,
i.e. a one-to-one function $o_p : N \to \{1, 2, 3, \ldots, |N|\}$ such that $[i,j] \in A_p \Rightarrow o_p(i) < o_p(j)$. The
last common node, $lcn_{pj}$, from origin p to node j is defined as the common node l with the highest
value of $o_p(l)$. Clearly if the last common node to j is the same for two origins, condition (27)
is satisfied. The following lemma shows that it is sufficient to compare only the last common
nodes.
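The definition of the last common node can be made concrete with a short Python sketch. It is a brute-force illustration, not the implementation referred to in the paper: it enumerates all route segments from the origin to the node inside a given a-cyclic link set, intersects their node sets, and picks the common node with the highest topological order. The toy link sets, the order and the function names are hypothetical.

```python
def used_segments(links, origin, node):
    """All route segments (as node tuples) from origin to node in an a-cyclic link set."""
    preds = {}
    for i, j in links:
        preds.setdefault(j, []).append(i)

    def back(n):
        if n == origin:
            return [(origin,)]
        return [seg + (n,) for i in preds.get(n, []) for seg in back(i)]

    return back(node)

def last_common_node(links, order, origin, node):
    """Common node (other than `node`) with the highest topological order, or None."""
    segments = used_segments(links, origin, node)
    if not segments:
        return None
    common = set(segments[0]).intersection(*map(set, segments[1:])) - {node}
    return max(common, key=order.__getitem__) if common else None

# Hypothetical used links for two origins that both reach node 8 through node 4.
links_1 = [(1, 4), (4, 6), (4, 7), (6, 8), (7, 8)]
links_2 = [(2, 4), (4, 6), (4, 7), (6, 8), (7, 8)]
order = {n: n for n in (1, 2, 4, 6, 7, 8)}  # node labels happen to be a valid order

lcn_1 = last_common_node(links_1, order, 1, 8)
lcn_2 = last_common_node(links_2, order, 2, 8)
```

Here both origins share node 4 as last common node to node 8, which is the situation in which equal approach proportions are expected. A production implementation would avoid enumerating segments, for example by processing nodes in topological order.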
Lemma: If h is a feasible route flow vector that satisfies the bypass proportionality condition,
where the flows from each origin p are restricted to some a-cyclic subnetwork $A_p$, and if for some
node $j \in N$ and two origins $p_1, p_2 \in N_o$, there is a node n which is a common node from $p_1$ to j
and also from $p_2$ to j, then the last common node to j is the same for both origins. That is:
$$COM_{p_1 j} \cap COM_{p_2 j} \neq \emptyset \ \Rightarrow\ lcn_{p_1 j} = lcn_{p_2 j} \qquad (31)$$
Proof: Denote $l_1 = lcn_{p_1 j}$; $l_2 = lcn_{p_2 j}$. Suppose $n \in COM_{p_1 j} \cap COM_{p_2 j}$. By definition there is a
destination $q_1 \in N_d(p_1)$ and route segments $r_1^i \in R_{p_1 n}[A_{p_1}]$, $s_1 \in R_{nj}[A_{p_1}]$, and $r_1^f \in R_{j q_1}[A_{p_1}]$
such that $h_{r_1^i+s_1+r_1^f,\,p_1 q_1} > 0$. Every used route segment from $p_2$ to j is of the form $(r_2^i + s_2)$, where
$r_2^i \in R_{p_2 n}[A_{p_2}]$, and $s_2 \in R_{nj}[A_{p_2}]$. Again by definition there is a destination $q_2 \in N_d(p_2)$ and
a route segment $r_2^f \in R_{j q_2}[A_{p_2}]$ such that $h_{r_2^i+s_2+r_2^f,\,p_2 q_2} > 0$.
The bypass proportionality assumption states that
$$h_{r_1^i+s_1+r_1^f,\,p_1 q_1} \cdot h_{r_2^i+s_2+r_2^f,\,p_2 q_2} = h_{r_1^i+s_2+r_1^f,\,p_1 q_1} \cdot h_{r_2^i+s_1+r_2^f,\,p_2 q_2} \qquad (32)$$
which implies that $h_{r_1^i+s_2+r_1^f,\,p_1 q_1} > 0$; hence $(r_1^i + s_2)$ is a used route segment from $p_1$ to j, and
therefore $l_1 \in (r_1^i + s_2)$. Using the topological order, $o_{p_1}(l_1) \geq o_{p_1}(n)$; hence $l_1 \in s_2$, i.e. $l_1$ is
common to all used route segments from $p_2$ to j, and if $l_1 \neq n$ then in each of these segments $l_1$
comes after n. Applying the same argument in the opposite direction, where $l_1$ replaces n and $l_2$
replaces $l_1$, shows that $l_2$ is common to all used route segments from $p_1$ to j, and if $l_1 \neq l_2$ then in
each of these segments $l_2$ comes after $l_1$. But this is a contradiction to the choice of $l_1$ as the last
common node from $p_1$ to j. Therefore $l_1 = l_2$. Q.E.D.
The conclusion from this section is that if flows from each origin are restricted to a-cyclic
subnetworks, then the bypass proportionality assumption implies that
$$lcn_{p_1 j} = lcn_{p_2 j} \ \Rightarrow\ \alpha_{p_1,[i,j]} = \alpha_{p_2,[i,j]} \quad \forall p_1, p_2 \in N_o;\ \forall [i,j] \in A \qquad (33)$$
which we refer to as the extended approach proportionality condition.
6 IMPLICATIONS FOR ORIGIN-BASED METHODS
Most solution methods for the traffic assignment problem are either link-based, i.e. only total
link flows are stored, or route-based, i.e. all used routes and the flow on each are stored. The
main advantages of route-based methods are: detailed solutions which are necessary for certain
evaluation procedures and useful for re-optimization, and computational efficiency. Their main
disadvantage is the large memory requirements.
Bar-Gera and Boyce (1998) presented a new origin-based method that is computationally efficient
in finding highly accurate solutions for large scale networks, as can be seen in Figure 4. The
memory requirements of this method are relatively reasonable. The upper bound on the memory
required is on the order of the number of origins times the number of links. The minimum
requirement, if no alternative routes are used, is one integer per origin per node. For example
in the case of the Chicago sketch network, which has 317 zones each representing an origin
and a destination, 76,267 O-D pairs with positive demand, 1,088 nodes, and 3,008 links, the
minimum memory requirement is about 1.4 MB, while the upper bound is about 26.0 MB. The
actual amount of memory used in the first iterations was about 4.9 MB, but as the algorithm
converged memory requirement decreased to about 1.9 MB at equilibrium. These features make
this origin-based method highly suitable for practical large-scale applications.
Figure 4: Relative gap vs. CPU time in minutes (Chicago sketch network)
The origin-based solution provides sufficient detail for re-optimization. Using the approach
proportionality condition (22) and the related route flow interpretation (23), the detail provided
by an origin-based solution is actually equivalent to route-based solutions.
A key issue in this origin-based method is the restriction of solutions to a-cyclic subnetworks. In
addition to the advantages discussed by Bar-Gera and Boyce (1998), the a-cyclic subnetworks
allow the definition of last common nodes. In fact the current implementation finds the last
common node from each origin to each node (for other purposes). It is therefore possible to
embed the extended approach proportionality assumption (33) into the method, so that whenever
last common nodes are the same, approach proportions will be equal. As a result of this
improvement, flow shifts from one approach to the other are calculated only once, thus reducing
the computation time per iteration. In addition, the resulting search direction has a better
coordination across origins, in comparison with the previous method in which the same shift
may have been applied to several different origins simultaneously. Overall this improvement
reduces the number of decision variables and thus simplifies the optimization problem. Finally,
the resulting solution is closer to satisfying the bypass proportionality assumption and the route
flow entropy maximization criterion. The last advantage is especially apparent when compared
with route-based methods that tend to choose solutions that use small numbers of routes, and
hence have low entropy.
CONCLUSIONS
In this paper the behavioral assumption of bypass proportionality was introduced and studied. This
assumption can be considered in conjunction with Wardrop's user equilibrium in determining
whether a certain route flow pattern is reasonable or not. The unique solution to the route
flow entropy maximizing user equilibrium problem satisfies the bypass proportionality condition;
however, they are not equivalent, as there may be several route flow patterns that satisfy Wardrop's
user equilibrium and bypass proportionality.
Bypass proportionality implies an immediate route flow interpretation for any feasible a-cyclic
origin-based solution, which is also the route flow entropy maximizing interpretation. Using this
interpretation, origin-based solutions provide equivalent detail to route-based solutions.
Bypass proportionality also implies certain relationships between the origin-based link flows for
different origins. These relationships can be used to improve the origin-based assignment method
proposed by Bar-Gera and Boyce (1998). Some of these relationships were revealed in this paper,
but there is more to explore in that area. We believe that this direction of research may lead to an
origin-based method for finding the route flow entropy maximizing user equilibrium assignment.
The validity of the bypass proportionality assumption in other models, and especially in the
various stochastic models, is another interesting area for future research.
ACKNOWLEDGMENTS
The authors are grateful for the financial support of the National Science Foundation through the
National Institute of Statistical Sciences, Research Triangle Park, NC.
REFERENCES
Bar-Gera, H. and D. Boyce (1998). Origin-based network assignment. Presented at: The 6th
meeting of the EURO Working Group on Transportation. Goteborg, Sweden, September
9-11.
Bruynooghe, M., A. Gibert and M. Sakarovitch (1969). Une methode d'affectation du traffic.
In: Proceedings of the 4th International Symposium on the Theory of Road Traffic Flow,
Karlsruhe, 1968, (W. Leutzbach and P. Baron, eds), pp. 198-204. Beitrage zur Theorie des
Verkehrsflusses, Strassenbau und Strassenverkehrstechnik, Heft 86, Herausgegeben von
Bundesminister fur Verkehr, Abteilung Strassenbau, Bonn.
Larsson, T., J. Lundgren, M. Patriksson and C. Rydergren (1998). Most likely traffic equilibrium
route flows - analysis and computation. Presented at: The 6th meeting of the EURO
Working Group on Transportation. Goteborg, Sweden, September 9-11.
Janson, B. N. (1993). Most likely origin-destination link uses from equilibrium assignment.
Transportation Research, 27B, 333-350.
Patriksson, M. (1994). The Traffic Assignment Problem - Models and Methods. VSP, Utrecht,
Netherlands.
Rossi, T. F., S. McNeil and C. Hendrickson (1989). Entropy model for consistent impact fee
assessment. Journal of Urban Planning and Development, ASCE, 115, 51-63.
CHAPTER 6
TRAFFIC DEMAND, FORECASTING AND DECISION TOOLS
• The true mystery of the world is the visible, not the invisible. (Oscar Wilde)
• The most beautiful thing we can experience is the mysterious. It is the
source of all true art and science. (Albert Einstein)
• We all live under the same sky but we don't all have the same horizon.
THE USE OF NEURAL NETWORKS FOR
SHORT-TERM PREDICTION OF TRAFFIC
DEMAND
Jaime Barceló, TSS-Transport Simulation Systems, Barcelona, Spain
Jordi Casas, Universitat de Vic, Spain
INTRODUCTION
All proposals for Advanced Traffic Management Systems based on Telematic Technologies
agree on the importance of a short-term forecast of the evolution of traffic flows or,
equivalently, of the network state, as a proper foundation for management decisions. Consequently
a lot of effort has been devoted to the research and development of suitable forecasting
procedures. Perhaps the most relevant of these efforts is the EU project DYNA (Gunn, 1994;
Ben-Akiva et al., 1994; Inaudi et al., 1994). Unfortunately the achievements of these projects
cannot be applied or extrapolated to complex urban structures. Other approaches more suited to
complex networks have been developed, for instance those referenced in (Cascetta, 1993;
Barceló, 1997). Unfortunately these models do not appear very appropriate for fully dynamic
applications, and we had to look in another direction in our search for a suitable prediction
model. The promising features of neural networks as predictive tools, Baldi and Hornik (1995),
led us to explore this approach.
DYNAMIC O/D ESTIMATES USING NEURAL NETWORKS
To address the problem we consider origins and destinations as being paired, I being the set of
all O/D pairs in the network, so if origin r and destination s form the i-th O/D pair, $g_i$ denotes the
corresponding entry of the demand matrix G, representing the total number of trips between
origin r and destination s: $O/D(r,s) = g_i$, $i = (r,s) \in I$. The total number of trips between an
origin r and a destination s is not a fixed value through time, but a dynamic value (i.e., it
depends on time). According to this dynamic view of the demand, we can consider each
component of the O/D matrix as a time series; forecasting an O/D matrix therefore consists of
performing the forecast for each component of this matrix, in other words, the simultaneous
forecasting of many multivariate time series. Solutions to this problem based on classical
forecasting methods, such as Box-Jenkins or Kalman filtering, have been proposed by several
authors (Davis, 1993; Davis et al., 1994; Van der Zijpp and Hamerslag, 1996). The proposed
approaches provide quite good results for linear infrastructures, as in the case of motorways,
but for more complex road networks it remains unclear whether they could provide sound
results. Even in some of the most promising cases, Davis (1994), the computational burden
makes their use doubtful for real-time applications on networks of considerable size, which
makes it advisable to look for other methods.
Neural networks appear as a natural candidate for a forecasting model, particularly because
their structure is easily parallelized should a computational speed-up be required to achieve
the system objectives. Further reasons to consider a neural network approach are the results
reported by Chakraborty (1992) on analyzing multivariate time series using neural networks,
and by Weigend (1992) on the evaluation of their predictive capabilities compared to other
classical models.
The research described in this paper should be considered as a preliminary exploration of the
performance of neural networks for the demand prediction problem, from the point of view of
the quality of the provided results, as well as from the computational requirements for real-time
applications.
A Neural Network, Hecht-Nielsen (1989), consists of a set of interconnected computational
units or neurons, each one performing a computational process on a weighted sum of inputs
according to a specific function, as shown in Figure 1. In general a Neural Network model is
characterized by three elements:
1. The topology of the Neural Network
2. The Neuron characteristics
3. The rules of the training or learning process
The neural network topology determines the connection between the various computational
units. The neural network that we have used is a multi-layer perceptron, Hecht-Nielsen (1989).
This topology corresponds to a Feed-forward network in which the neurons (i.e., processing
units) are arranged in layers and every neuron on each layer is connected directly to all neurons
on the next layer, as shown in Figure 2. The input layer, denoted by layer 0, contains no real
neurons and its purpose is to spread the input to the neurons of the first hidden layer.
Neural Networks For Short-Term Prediction 421
Figure 1. Neuron representation (input, computational unit (neuron), output)
The hidden layers are numbered from 1 to L-1, and the output layer is L. In general the l-th
layer contains $N_l$ neurons; therefore the input layer has $N_0$ elements and the output layer has $N_L$
neurons. A neuron n in layer l is connected to all the neurons in layer l-1 through several
connections (exactly $N_{l-1}$), each one associated with a weight. We organise these weights in a
vector $W^{(n)}(l)$, and this neuron has the corresponding bias or offset $b_n(l)$. These elements in a
multi-layer perceptron are static; they determine the topology of the neural network. There
is also a dynamic element that determines the state of each layer when input is propagated through
the neural network. If we apply an input, represented by a vector I with exactly $N_0$ components, the
neural network propagates it from input to output through every layer. Each l-th layer takes on a
precise state, represented by a vector S(l) with $N_l$ components that represent the outputs of the
$N_l$ neurons on this layer.
Figure 2: A Multi-layer Neural Network
As shown above, layer 0 contains no real neurons, its function being only to spread the
input to layer 1, so S(0) is exactly I and the output of the network will be S(L), which
corresponds to the vector O. The state of the n-th neuron of the l-th layer is computed with the
feed-forward rule:
$S_n(l) = f\left(S(l-1)^T W^{(n)}(l) + b_n(l)\right)$ in vectorial form
or $S_n(l) = f\left(\sum_i S_i(l-1)\, W_i^{(n)}(l) + b_n(l)\right)$ in scalar form
where f is the activation function of the neuron. The procedure to determine the output of the
neural network for a given input is called the feed-forward procedure and is described by the
algorithm:
Feed-Forward Procedure
  for each layer l = 1 ... L
    for each neuron n = 1 ... N_l
      S_n(l) = f( S(l-1)^T W^(n)(l) + b_n(l) )   {state of n-th neuron of layer l}
    end for
  end for
end procedure
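The feed-forward procedure translates almost line by line into code. The following Python sketch is a minimal illustration using only the standard library. The activation formula is not legible in this copy of the text, so `math.tanh` is used here as an assumption (its derivative matches the $(1 - S^2)$ factor that appears in the back-propagation rule); the example weights and the function name are hypothetical.

```python
import math

def feed_forward(x, weights, biases, f=math.tanh):
    """Propagate input x through the network and return the output state S(L).

    weights[l][n] is the weight vector of neuron n in layer l+1,
    biases[l][n] is its bias; f is the activation function (assumed tanh).
    """
    state = list(x)  # S(0) is the input vector itself
    for layer_weights, layer_biases in zip(weights, biases):
        state = [f(sum(s * w for s, w in zip(state, w_n)) + b_n)
                 for w_n, b_n in zip(layer_weights, layer_biases)]
    return state

# Hypothetical 2-2-1 network.
weights = [[[0.5, -0.5], [1.0, 1.0]],   # hidden layer: two neurons, two inputs each
           [[1.0, -1.0]]]               # output layer: one neuron, two inputs
biases = [[0.0, 0.0], [0.0]]

output = feed_forward([1.0, 1.0], weights, biases)
```

Passing the activation as a parameter keeps the sketch usable with whichever sigmoid variant is actually intended.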
Each neuron is characterized by a non-linear activation function. In our case the activation
function selected is the sigmoid function:
The sigmoid function, Hecht-Nielsen (1989), is a bounded differentiable real function that is
defined for all real input values and asymptotically approaches fixed finite upper and
lower limits as its argument becomes large or small, respectively. This limited dynamic range
effectively implements noise suppression and cut-off, as shown by Masson (1990). This
important property makes the sigmoid one of the most widely used non-linear activation
rules, and it is the main reason we selected it, given the nature of our problem with
continuous inputs and outputs.
THE TRAINING PROCESS
The training algorithm used is an ad hoc version of the Back Propagation algorithm as
described by Hecht-Nielsen (1989). This corresponds to a supervised learning process given
that the weights of the different neuron connections are iteratively changed with reference to a
set of predefined patterns specified as a set of input-output pairs. At each step the
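computational error is estimated as:
$$E = \sum_{p=1}^{P} \sum_{n=1}^{N_L} \left(t_n^{(p)} - S_n^{(p)}(L)\right)^2$$
where $t^{(p)}$ is the p-th desired output and $S^{(p)}(L)$ is the p-th output produced by the neural network.
Back propagation tries to minimize the total squared error E using the following gradient
algorithm:
Gradient is computed as:
$$\frac{\partial E}{\partial W_i^{(n)}(L)} = -\sum_{p=1}^{P} \delta_n^{(p)}(L) \cdot S_i^{(p)}(L-1)$$
Weights in the output layer are modified according to a delta-rule
$$\Delta W_i^{(n)}(L) = \eta \sum_{p=1}^{P} \delta_n^{(p)}(L) \cdot S_i^{(p)}(L-1)$$
where $\eta$ is a parameter
For other layers l = 1 .. L-1, the rule is defined as:
$$\delta_n^{(p)}(l-1) = \left(1 - [S_n^{(p)}(l-1)]^2\right) \cdot \sum_{j=1}^{N_l} \left[\delta_j^{(p)}(l)\, W_n^{(j)}(l)\right]$$
and weights are modified by:
$$\Delta W_i^{(n)}(l) = \eta \sum_{p=1}^{P} \delta_n^{(p)}(l) \cdot S_i^{(p)}(l-1)$$
where $\eta$ is a parameter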
The Back Propagation algorithm is then:
Back-Propagation Procedure
{Feed-Forward Phase}
for each layer l = 1 ... L
  for each neuron n = 1 ... N_l
    for each input p = 1 ... P
      S_n^(p)(l) = f( S^(p)(l-1)^T W^(n)(l) + b_n(l) )
    end for
  end for
end for
{Error Computation Phase}
for each neuron n = 1 ... N_L
  for each input p = 1 ... P
    {compute delta_n^(p)(L) from the error between t_n^(p) and S_n^(p)(L)}
  end for
end for
{Error Back-Propagation Phase}
for each layer l = L-1 ... 1
  for each neuron n = 1 ... N_l
    for each input p = 1 ... P
      delta_n^(p)(l) = ( 1 - [S_n^(p)(l)]^2 ) * sum_j [ delta_j^(p)(l+1) W_n^(j)(l+1) ]
    end for
  end for
end for
{Step Phase}
for each layer l = 1 ... L
  for each neuron n = 1 ... N_l
    for each weight i = 1 ... N_(l-1)
      DeltaW_i^(n)(l) = eta * sum_p [ delta_n^(p)(l) S_i^(p)(l-1) ]
    end for
  end for
end for
{Weight Updating Phase}
for each layer l = 1 ... L
  for each neuron n = 1 ... N_l
    for each weight i = 1 ... N_(l-1)
      W_i^(n)(l) = W_i^(n)(l) + DeltaW_i^(n)(l)
    end for
  end for
end for
end procedure
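The phases of the procedure can be sketched in plain Python as follows. This is an illustrative batch version under stated assumptions, not the authors' ad hoc implementation: the activation is taken to be `tanh` (an assumption consistent with the $(1 - S^2)$ factor in the rule above), the output-layer delta is taken as the output error times the activation derivative, biases are updated like weights, and the patterns, initial weights and learning rate are hypothetical.

```python
import math

def forward(x, W, b):
    """Feed-forward phase: return the states S(0), ..., S(L) for one input pattern."""
    states = [list(x)]
    for Wl, bl in zip(W, b):
        prev = states[-1]
        states.append([math.tanh(sum(s * w for s, w in zip(prev, wn)) + bn)
                       for wn, bn in zip(Wl, bl)])
    return states

def total_error(patterns, W, b):
    """Total squared error E over all patterns and output neurons."""
    return sum((t - s) ** 2
               for x, target in patterns
               for t, s in zip(target, forward(x, W, b)[-1]))

def train_step(patterns, W, b, eta=0.1):
    """One batch back-propagation step; W and b are updated in place."""
    dW = [[[0.0] * len(wn) for wn in Wl] for Wl in W]
    db = [[0.0] * len(bl) for bl in b]
    for x, target in patterns:
        S = forward(x, W, b)
        # Error computation phase (assumed form): error times tanh derivative.
        delta = [(t - s) * (1 - s * s) for t, s in zip(target, S[-1])]
        for l in range(len(W) - 1, -1, -1):
            # Step phase: accumulate delta_n * S_i over the patterns.
            for n, d in enumerate(delta):
                db[l][n] += d
                for i, s in enumerate(S[l]):
                    dW[l][n][i] += d * s
            if l > 0:
                # Error back-propagation phase for the previous layer.
                delta = [(1 - S[l][i] ** 2) *
                         sum(d * W[l][n][i] for n, d in enumerate(delta))
                         for i in range(len(S[l]))]
    # Weight updating phase.
    for l in range(len(W)):
        for n in range(len(W[l])):
            b[l][n] += eta * db[l][n]
            for i in range(len(W[l][n])):
                W[l][n][i] += eta * dW[l][n][i]

# Hypothetical patterns: inputs in [0, 1], targets in [-1, 1].
patterns = [([0.0, 0.0], [-0.5]), ([1.0, 0.0], [0.0]),
            ([0.0, 1.0], [0.0]), ([1.0, 1.0], [0.5])]
W = [[[0.1, -0.2], [0.3, 0.2]], [[0.2, -0.1]]]
b = [[0.0, 0.0], [0.0]]

error_before = total_error(patterns, W, b)
for _ in range(200):
    train_step(patterns, W, b)
error_after = total_error(patterns, W, b)
```

The updates are accumulated over all patterns before any weight changes, which mirrors the separate step and weight-updating phases of the procedure.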
PREDICTION
The prediction process forecasts the O/D matrix in the next interval from the detector
measurements collected and the historical O/D matrix, using a multi-layer perceptron neural
network. The predictor process receives as input the detector measurements $V_i$, with i ranging
over the set of arcs where detectors are located, together with the demand $O/D_j$, which represents
the demand between the j-th origin/destination pair. For the neural network to run properly, this
input must be normalized. This normalization can be done with these rules: 1) the detector
measurements can be divided by the maximum capacity of each arc, i.e., $V_i / V_{max}$, and 2) the
different demands of the historical O/D matrix can be divided by the maximum demand, i.e.,
$O/D_j / O/D_{max}$. Consequently the neural network input will be defined in the interval [0, 1]. The
output will not give the demand values of each O/D matrix component directly, but rather the
percentage of variation of each component with respect to the historical O/D matrix, i.e., $\alpha_i$
being the result for the i-th forecasted O/D matrix component, each component will be:
$$O/D_i^{predicted} = O/D_i^{historical} \cdot (1 + \alpha_i)$$
where $\alpha_i \in [-1, 1]$
The expected output of the prediction process for each O/D pair is shown in Figure 3, where the
expected prediction for the i-th O/D pair at two time instants is depicted.
Figure 3
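The input normalization and the conversion of the network output back into demand can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the relative variation is assumed to be applied multiplicatively, as historical demand times (1 + alpha), and the function names and numbers are hypothetical.

```python
def normalize_inputs(detector_flows, capacities, historical_od, max_demand):
    """Scale detector flows by arc capacity and demands by the maximum demand."""
    flows = [v / c for v, c in zip(detector_flows, capacities)]
    demands = [g / max_demand for g in historical_od]
    return flows + demands  # network input vector, every entry in [0, 1]

def predicted_demand(historical_od, alphas):
    """Apply the predicted relative variation alpha_i (assumed in [-1, 1])."""
    return [g * (1.0 + a) for g, a in zip(historical_od, alphas)]

# Hypothetical values: two detectors, one O/D pair.
network_input = normalize_inputs([900.0, 450.0], [1800.0, 1800.0], [120.0], 240.0)
forecast = predicted_demand([120.0], [0.25])
```

With these numbers the input vector is [0.5, 0.25, 0.5] and a predicted variation of 0.25 turns a historical demand of 120 trips into 150.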
The training process calibrates the neural network, i.e., it determines the weights of the
connections, and depends on a set of desired input and output pairs.
This training or learning process is performed with a back-propagation procedure using
simulated input data. The simulator used to generate these input data is AIMSUN2,
described in Barcelo and Ferrer (1997).
The experimentation has been conducted with the microscopic simulator, which provides as
output the detector measurements that correspond to the simulation of traffic flows obtained
from an O/D matrix. Then, from the historical O/D matrix, and small perturbations of this
historical O/D matrix, expressed as percentage variations, and the detector measurements
generated by simulation, the necessary inputs for the training module can be simulated, as
displayed in Figure 4.
Figure 4: Scheme of an experiment
Data used to apply the training methodology
The computational experiments have been conducted with data from the Madrid site for the
CAPITALS EU Project, reported in CAPITALS (1998). The O/D matrix for the site area has
been extracted from the most recently updated O/D matrix for Madrid after the "Madrid
Mobility Study" done in 1997 by the "Consorcio de Transportes de Madrid". The reference
sub-matrix has been obtained as a traversal matrix of the global matrix using the traversal
matrix procedures in the EMME/2 transportation planning software (INRO, 1996).
This traversal matrix has been adjusted for the time horizon from 7:00 until 14:00
using the traffic counts from the data collection. The adjustment procedure employed
is the Spiess (1990) heuristic for a bilevel matrix adjustment model, implemented as
a macro of the EMME/2 package.
Training methodology: generation of the Neural Network Patterns
The short term prediction process requires the input of a historical time sliced O/D matrix, as
well as the patterns to train the Neural Network which has to produce the forecasts. Time
sliced O/D matrices are not currently available, and producing them directly is neither easy nor
cheap, although some telematic applications have considered the possibility of generating
such information in real time (see for instance the Report on Floating Car Data of the
SOCRATES V1007 project of DRIVE I). Our proposal for CAPITALS consists of
generating an initial estimate using the available information. This initial estimate could
later be improved and refined with the experience gained during the testing and
evaluation of the system. The generation of the neural network patterns for the training
process consisted of the following steps, whose logic is displayed in the diagram of Figure 5.
1. The traversal matrix for the site and the link flow measurements provided by the data
collection were the input to the heuristic matrix adjustment whose output was the
Adjusted O/D Matrix. This process has proven in practice its capability to adjust trip
matrices to the flow variations in the time horizon considered, reflecting in that way the
time variability of the traffic demand.
We have adopted the following bilevel formulation of the matrix adjustment problem as a
non-linear optimization problem:
$$\min_{g} \ F(v(g)) = \frac{1}{2} \sum_{a \in \hat{A}} \left(v_a(g) - \hat{v}_a\right)^2$$
$$\text{s.t.} \quad v(g) = \arg\min \sum_{a \in A} \int_0^{v_a} s_a(x)\,dx$$
$$\text{s.t.} \quad \sum_{k \in K_i} h_k = g_i, \quad \forall i \in I$$
$$h_k \geq 0, \quad \forall k \in K_i,\ \forall i \in I$$
$$v_a = \sum_{i \in I} \sum_{k \in K_i} \delta_{ak} h_k, \quad \forall a \in A$$
where $v_a(g)$ is the flow on link a estimated by the lower level traffic assignment problem with
the adjusted trip matrix g, and $\hat{v}_a$ is the measured flow on link a. The algorithm used to
solve the problem, based on a proposal by Spiess (1990), is heuristic in nature, of steepest
descent type, and does not guarantee that a global optimum to the formulated problem will be
found. The iterative heuristic works as follows:
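At iteration k:
• Given a solution $g_i^k$, an equilibrium assignment is solved giving link flows $v_a^k$ and
proportions $\{p_{ia}^k\}$ satisfying the relationship
$$v_a^k = \sum_{i \in I} p_{ia}^k\, g_i^k, \quad a \in A$$
Note: the target matrix is used in the first iteration (i.e. $g_i^0 = \hat{g}_i$, $\forall i \in I$)
• The gradient of the objective function F(v(g)) is computed. For a more realistic
approach the gradient is based on the relative change in the demand, written as:
$$g_i^{k+1} = \begin{cases} \hat{g}_i & \text{for } k = 0 \\ g_i^k \left(1 - \lambda^k \left[\dfrac{\partial F(g)}{\partial g_i}\right]_{g_i^k}\right) & \text{for } k = 1, 2, \ldots \end{cases}$$
(Then a change in the demand is proportional to the demand in the initial matrix and
zeroes will be preserved in the process).
The gradient is approximated by
$$\frac{\partial F}{\partial g_i} = \sum_{k \in K_i} p_k \sum_{a \in \hat{A}} \delta_{ak} (v_a - \hat{v}_a), \quad \forall i \in I$$
(where $\hat{A} \subset A$ is the subset of links with flow counts).
• The step length is approximated as:
$$\lambda^* = \frac{\sum_{a \in \hat{A}} v'_a (\hat{v}_a - v_a)}{\sum_{a \in \hat{A}} (v'_a)^2}$$
where
$$v'_a = -\sum_{i \in I} g_i \left(\sum_{k \in K_i} p_k \sum_{\hat{a} \in \hat{A}} \delta_{\hat{a}k} (v_{\hat{a}} - \hat{v}_{\hat{a}})\right) \left(\sum_{k \in K_i} \delta_{ak} p_k\right)$$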
There have been two main reasons for selecting this heuristic: the quality of the results that it
provides, and the easy implementation using the macro language of the EMME/2
transportation-planning package. However, it should be noted that this is not the only way
of solving the problem and there are alternative heuristics, such as the one proposed by
Florian and Chen (1993), or by Chen (1994), which can also be implemented in EMME/2
with the available utilities.
2. The Adjusted O/D Matrix has been combined with the information collected about the
time distribution of the total number of trips on the network, to generate an Adjusted
Time Sliced O/D Matrix consistent with time variation of the link flows in the time
horizon, and the time distribution of the total number of trips.
3. This Adjusted Time Sliced O/D Matrix is the input to a Route Based variant of
AIMSUN2 microscopic simulation model, described in Barcelo et al. (1995), in which
vehicles follow time dependent routes from origins to destinations, performing in that
way a heuristic dynamic assignment.
4. The simulation model emulates the detector measurements generating in that way a set of
link flow measurements similar to those produced by the real detection system.
5. The Adjusted Time Sliced O/D Matrix, the simulated link flow measurements, and the
Adjusted O/D Matrix define the Neural Network Patterns used in the training process.
[Diagram boxes: Time Sliced O/D Matrix; AIMSUN2/RB Simulation Model; Simulated Link Flow
Measurements; Neural Network Patterns]
Figure 5: Logic diagram of the production of the neural network patterns
Figure 6 illustrates the graphic output of the heuristic matrix adjustment process for the
i-th entry of the O/D matrix, corresponding to the (5/36) origin-destination pair. The x-axis
of figures 8 and 9 is divided into 84 intervals of 5 minutes each, assuming
that the simulated data collection process aggregates the detector measurements every 5
minutes. The y-axis represents the number of trips.
[Plot title: Historical trips between pair 5/36]
Figure 6
Figure 7 shows the same entry for the Adjusted Time Sliced O/D Matrix.
Figure 7
Data preprocessing
Patterns are defined by:
• Input
a) Adjusted O/D Matrix for the k-th time interval. The number of entries
depends on the relationships that will be defined in Section 5. An initial
analysis has been conducted for each O/D pair to determine the most
suitable topology and validate the methodology. For the sake of
completeness we present the results for O/D pair 5/36. Therefore in
this exercise there is only 1 input value.
b) Simulated detection for the k-th time interval. The number of entries
depends on the number of links with detection on the corresponding
sub-network: 23 values for the links with detection on the best routes
connecting the selected O/D pair.
• Output
a) Adjusted Time Sliced O/D Matrix for the (k+1)-th time interval. The same
number of values as for the input: 1 value in our case, for O/D pair 5/36.
In our example the available values lie in the interval from 7:00 until 14:00,
divided into time intervals of 5 minutes, which gives a total of 84 data sets. Since we want
to predict the next time interval, 83 different patterns are available
for training the neural network. To avoid the influence of the different scales of
the various types of input data, all entries have been normalised to the interval
[-1, 1].
The proposed normalisation is done by the following transformation:
$$y = \frac{x - \mu}{2\sigma}$$
• Input corresponding to the adjusted O/D matrix:
- Number of values > 1 = 0
- Number of values < -1 = 0
• Input corresponding to measurement of a detection station:
- Number of values > 1 = 3 (corresponding to detector 11)
- Number of values < -1 = 0
• Output corresponding to the Adjusted Time Sliced O/D Matrix:
- Number of values > 1 = 0
- Number of values < -1 = 0
where x is the variable to transform, y is the variable resulting from the
transformation, and μ and σ are respectively the mean and standard deviation of the
observed input values. Once the pattern data have been transformed to range in
the interval [-1, +1], the next step consists of partitioning the 83 patterns into two sets. The
first is used to train the neural network and the second to validate the training. In our
example we have grouped 73 and 10 data sets in each class respectively. The 10 data sets
for validation have been chosen randomly (in this case the validation patterns are 6, 12,
14, 19, 23, 25, 37, 53, 67 and 75). The size of the sets has been set empirically after
various trials; the computational results show that in this case the selected values are
adequate for validating the training.
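The preprocessing just described (scaling by mean and twice the standard deviation, building one-step-ahead patterns, and holding out 10 of the 83 patterns) can be sketched as follows. The series is synthetic, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import random
import statistics


def normalise(values):
    """Scale with y = (x - mean) / (2 * stdev); most values fall in [-1, 1]."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population st. dev. (assumption)
    return [(x - mu) / (2.0 * sigma) for x in values]


def count_outside(values):
    """Return (number of values > 1, number of values < -1)."""
    return (sum(1 for y in values if y > 1.0),
            sum(1 for y in values if y < -1.0))


def split_patterns(n_patterns, n_validation, seed=0):
    """Randomly pick validation pattern numbers (1-based); the rest are for training."""
    rng = random.Random(seed)
    validation = sorted(rng.sample(range(1, n_patterns + 1), n_validation))
    held_out = set(validation)
    training = [k for k in range(1, n_patterns + 1) if k not in held_out]
    return training, validation


# Synthetic stand-in for 84 five-minute O/D values (7:00 to 14:00).
trips = [100.0 + 40.0 * math.sin(2.0 * math.pi * k / 84.0) for k in range(84)]
scaled = normalise(trips)
# Pattern k pairs interval k (input) with interval k+1 (desired output).
patterns = [(scaled[k], scaled[k + 1]) for k in range(83)]
training, validation = split_patterns(len(patterns), 10)
```

Dividing by 2σ rather than by the observed range does not guarantee that every value lands inside [-1, 1], which is why the text reports counts of values above 1 and below -1.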
Network topology
As described in section 2, the topology used for the neural network is "feed-forward",
with the following composition for the (5/36) O/D pair:
Input layer
The input layer, composed of 24 neurones, has the function of transferring
the input values; therefore the functions for each neurone are:
• Activation Function: Identity Function
y = x
• Output Function: Identity Function
y = x
Hidden Layer
• Activation Function:
y = tanh(x)
• Output Function: Identity Function
y = x
We have tested neural networks with 1 and 2 hidden layers, with a variable number of
neurones in each layer.
Output Layer
Only one neurone for the case of one O/D pair, defined by:
• Activation Function:
y = tanh(x)
• Output Function: Identity Function
y = x
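As an illustration of this topology, the following sketch evaluates a feed-forward network with identity input units and tanh hidden and output units. It is not SNNS code; the helper names are hypothetical and the weights are random placeholders rather than trained values.

```python
import math
import random


def random_layer(n_in, n_out, rng, scale=2.0):
    """Weights and biases drawn uniformly from [-scale, scale] (placeholders)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-scale, scale) for _ in range(n_out)]
    return weights, biases


def forward(x, layers):
    """Propagate x through the layers; every non-input neurone applies tanh."""
    activation = list(x)  # input layer: identity activation and output
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


def n_parameters(layers):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(1)
net_24_4_1 = [random_layer(24, 4, rng), random_layer(4, 1, rng)]
x = [rng.uniform(-1.0, 1.0) for _ in range(24)]  # 1 O/D value + 23 detector values
y = forward(x, net_24_4_1)
```

Since the output neurone uses tanh, the prediction is produced on the normalised scale and has to be mapped back to trips by inverting the normalisation.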
Training
The neural network has been modeled using the SNNS simulator (University of Stuttgart, 1995),
and the training has been conducted with a Back Propagation algorithm with the
following parameters, whose values have been determined empirically:
• η (learning parameter): determines the step length along the gradient descent
direction. The value used in our experiments has been η = 0.2.
• dmax: this parameter determines the tolerated difference between the output of the
neural network and the target values. The value used in our experiments has
been 0.01.
The number of iterations depends on the behavior of the SSE (Sum of Squared Errors) curve. On
the one hand we have to reduce the SSE resulting from the training patterns, and on the other hand
we also have to decrease the SSE resulting from testing the validation patterns.
In the typical training process of a neural network there is a first initialization phase that assigns
random weights to the connections. In our case we have initialized the weights randomly in the
interval [-2, 2].
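A minimal sketch of on-line back-propagation with these parameters is given below for a network with one hidden layer and a single output. It is written from the general description above, not taken from SNNS; in particular, treating an error no larger than dmax as zero is an assumption about how the tolerance is applied, and the toy patterns and all names are hypothetical.

```python
import math
import random


def init_net(n_in, n_hid, rng, scale=2.0):
    """Random initial weights in [-scale, scale]."""
    return {
        "w1": [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_hid)],
        "b1": [rng.uniform(-scale, scale) for _ in range(n_hid)],
        "w2": [rng.uniform(-scale, scale) for _ in range(n_hid)],
        "b2": rng.uniform(-scale, scale),
    }


def forward(net, x):
    """Return hidden activations and the single tanh output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = math.tanh(sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"])
    return h, y


def sse(net, patterns):
    """Sum of squared errors over a pattern set."""
    return sum((t - forward(net, x)[1]) ** 2 for x, t in patterns)


def backprop_step(net, x, t, eta=0.2, dmax=0.01):
    """Update the weights in place for one pattern (x, t)."""
    h, y = forward(net, x)
    err = t - y
    if abs(err) <= dmax:
        return  # error within tolerance: treated as zero, no update
    delta_out = err * (1.0 - y * y)  # tanh'(s) = 1 - tanh(s)^2
    delta_hid = [delta_out * net["w2"][j] * (1.0 - h[j] * h[j]) for j in range(len(h))]
    for j in range(len(h)):
        net["w2"][j] += eta * delta_out * h[j]
        net["b1"][j] += eta * delta_hid[j]
        for i in range(len(x)):
            net["w1"][j][i] += eta * delta_hid[j] * x[i]
    net["b2"] += eta * delta_out


# Toy usage: 9 hypothetical patterns with targets inside (-1, 1).
rng = random.Random(0)
net = init_net(2, 3, rng)
patterns = [([x0, x1], 0.5 * x0 - 0.3 * x1)
            for x0 in (-1.0, 0.0, 1.0) for x1 in (-1.0, 0.0, 1.0)]
sse_before = sse(net, patterns)
for _ in range(200):
    for x, t in patterns:
        backprop_step(net, x, t)
sse_after = sse(net, patterns)
```

In practice the loop would be stopped by monitoring the SSE on both the training and the validation patterns, as the text describes, rather than after a fixed number of sweeps.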
Results and Prediction
Two groups of neural network topologies have been tested computationally to determine
the most suitable for the O/D prediction problem. The first group was composed of neural
networks with only one hidden layer and the second group of networks with two hidden
layers.
Neural networks with one hidden layer
Topology description: n-m-p (n neurons in the input layer, m neurons in the hidden layer
and p neurons in the output layer).
            Topology 24-4-1   Topology 24-3-1   Topology 24-2-1
SSE               26.71             48.27             50.74
R2               0.9983            0.9963            0.9978
SSEval           101.80             80.14             38.46
Figures 8 and 9 display the results obtained for topology 24-4-1.
Figure 8: Comparison between the forecasted and the desired output for the training
patterns, topology 24-4-1
Figure 9: Comparison between the forecasted and the desired output for the validation
patterns, topology 24-4-1
Neural networks with two hidden layers
            Topology 24-3-2-1   Topology 24-2-4-1
SSE                6.12               12.35
R2               0.9996              0.9991
SSEval           122.07               70.94
Figures 10 and 11 display the results obtained for topology 24-3-2-1.
Figure 10: Comparison between the predicted and the desired output for the training
patterns, topology 24-3-2-1
Figure 11: Comparison between the predicted and the desired output for the validation
patterns, topology 24-3-2-1
ADDRESSING THE PROBLEM OF THE NEURAL NETWORK SIZE

The dynamic prediction of an O/D matrix by means of Neural Networks has one main
drawback: the amount of data required for a proper training of the Neural Network. If N
is the total number of centroids (origins and/or destinations) in the road network
representation, the maximum number of entries in the matrix is N*(N-1), taking into
account that there are no trips from a centroid to itself. Therefore, if the total number of
links with detectors in the road network is M, according to the selected topology
displayed in figure 3 the total number of inputs to the Neural Network will be
N*(N-1)+M, and the number of outputs N*(N-1). According to this topology the total
number of parameters to estimate during the Neural Network training is obtained as follows:
Let l_j be the number of links between layers j and j-1:
l_1 = (N*(N-1)+M)*N*(N-1)
l_2 = (N*(N-1))*(N*(N-1))
Let l be the total number of links, l = l_1 + l_2
Let b_j be the number of biases in layer j
Let b be the total number of biases, b = b_1 + b_2
Then the total number of parameters to estimate is p = l + b
That means, for example, that:
For N=25, the total number of parameters to estimate is p=721,200
For N=50, p=12,009,900
For N=100, p=196,039,800
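The count can be checked with a short calculation. The sketch below assumes that the hidden and output layers each have N*(N-1) neurones with one bias per neurone, which is inferred from the formulas rather than stated in the text; the function names are hypothetical. Under that assumption the three figures quoted above are reproduced when the detector term is set to M = 0.

```python
def parameter_count(n_od, n_detectors):
    """Weights plus biases for an (n_od + M)-(n_od)-(n_od) feed-forward network."""
    inputs = n_od + n_detectors
    hidden = n_od   # assumed hidden-layer size
    outputs = n_od
    links = inputs * hidden + hidden * outputs  # l = l_1 + l_2
    biases = hidden + outputs                   # b = b_1 + b_2
    return links + biases                       # p = l + b


def full_matrix_count(n_centroids, n_detectors=0):
    """Parameter count when every O/D pair of an N-centroid network is modelled."""
    return parameter_count(n_centroids * (n_centroids - 1), n_detectors)


for n in (25, 50, 100):
    print(n, full_matrix_count(n))
```

Because both link terms grow with the square of the number of O/D pairs, the count grows roughly with the fourth power of the number of centroids, which is what makes the full network impractical to train.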
With such a large number of parameters to be estimated, and taking into account that the
number of patterns for a proper training must be larger than the number of parameters, the
training process would be impossible. The proposed solution consists of reducing
the size of the Neural Network without losing the capacity to represent the road network.
We start at each time period with a pre-processing of the road network in which we analyse
the connectivity of the network, identifying the k most likely used current paths between
each O/D pair, as suggested by Jayakrishnan et al. (1994), which can be computed using the
algorithms proposed by Epstein (1994). The number k has been fixed empirically (3 or 4
in most cases), and the current travel times estimated by simulation have been the cost
criterion. Then, taking into account that the volume v_a on link a is given by:
$$v_a = \sum_{i \in I} \sum_{k \in K_i} h_k \, \delta_{ak}, \quad \text{where } \delta_{ak} = \begin{cases} 1 & \text{if arc } a \text{ belongs to path } k \\ 0 & \text{otherwise} \end{cases}$$
where I is the set of all O/D pairs, K_i is the set of all paths connecting the i-th O/D
pair, and h_k is the flow on the k-th path. Then, defining:
$$I_a = \{\text{set of O/D pairs using link } a \text{ in a } k \text{ shortest path}\}, \quad \text{where } a \in \hat{A} = \{\text{subset of links with detectors}\}$$
and defining the auxiliary graph G=(N,E), whose set of nodes N and set of links E is
given by:
$$N = \{ I_a : \forall a \in \hat{A} \}$$
then the number of different Neural Networks will be the number of non-connected
components of G, and consequently the number n of O/D pairs to be considered in each
neural network (determining therefore the number of neurones on the input and output
layers) is given by:
$$n = \mathrm{card}\{ I_a \cup I_b, \ \forall a, b : (a, b) \in E \}$$
The partitioning condition may prove very strict in most cases, so it would be desirable
to admit a certain degree of overlapping when no significant errors are induced. In this
case the proposed methodology can be replaced by a cluster analysis, in which the degree
of overlapping can be controlled as a function of the similarity level between the clusters.
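The strict partition can be sketched as a connected-components search. The edge rule used here, joining two detector links whenever their O/D sets share at least one pair, is an assumption made for the illustration (the definition of E is not legible in the text above); the function name and the toy data are hypothetical.

```python
def partition_networks(od_by_detector):
    """Group detector links whose O/D sets overlap, directly or through a chain.

    od_by_detector maps a detector link to the set I_a of O/D pairs using it.
    Returns a sorted list of (detector links, O/D pairs), one entry per network.
    """
    unseen = set(od_by_detector)
    components = []
    while unseen:
        start = unseen.pop()
        stack, comp_links = [start], {start}
        while stack:
            a = stack.pop()
            for b in list(unseen):
                # Assumed edge rule: I_a and I_b share at least one O/D pair.
                if od_by_detector[a] & od_by_detector[b]:
                    unseen.remove(b)
                    comp_links.add(b)
                    stack.append(b)
        od_pairs = set().union(*(od_by_detector[a] for a in comp_links))
        components.append((sorted(comp_links), sorted(od_pairs)))
    components.sort()
    return components


# Toy data (hypothetical): four detector links and the O/D pairs using them.
example = {
    "a1": {(1, 2), (1, 3)},
    "a2": {(1, 3), (4, 5)},
    "a3": {(6, 7)},
    "a4": {(6, 7), (8, 9)},
}
networks = partition_networks(example)
for links, od_pairs in networks:
    # Each network has len(od_pairs) + len(links) inputs and len(od_pairs) outputs.
    print(links, od_pairs)
```

A single detector shared by many O/D pairs is enough to merge everything into one component, which is why the text goes on to relax the partition with clustering.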
The traversal matrix for the CAPITALS site had 98 centroids and thus, in theory, 9506 O/D
pairs, and 377 detectors. This implies a theoretical total of 184,330,846 parameters to
estimate. After calibrating the AIMSUN2 microscopic simulation model of the site, the
analysis reveals that only 1117 O/D pairs have a significant number of trips (at least 5%
of the highest entry). Accepting at most 10% overlapping among clusters, these O/D
pairs have been clustered using Ward's method as implemented in the statistical
package MINITAB (1998), which considers the distance between two clusters as the sum of
the squared deviations from points to cluster centroids and minimizes the within-cluster
sum of squares; the distances used have been the Pearson distances. This method has
been selected for its tendency to produce clusters with similar numbers of
observations, although it is sensitive to outliers. The final partition led to the following 9
clusters:
            Number of      Within cluster    Average distance   Maximum distance
            observations   sum of squares    from centroid      from centroid
Cluster 1       106          128913.117          33.053             64.682
Cluster 2       372           16698.529           5.964             19.210
Cluster 3       315             964.750           1.467              5.925
Cluster 4       169           38485.558          13.812             40.860
Cluster 5        93           47907.066          21.019             49.551
Cluster 6        20           82398.261          62.351             95.052
Cluster 7        20            4636.483          14.560             23.397
Cluster 8         5            5864.195          33.311             47.550
Cluster 9        12           25708.829          45.043             68.794
The route-based AIMSUN2 simulation model enables the analysis of the paths used
between origins and destinations and the identification of the detectors located on the
links composing these paths. A neural network can be associated with each cluster. The
number of parameters of that neural network is determined by the number of O/D pairs
and the number of detectors on the links of the paths connecting these O/D pairs. In this
case the number of parameters to be estimated for the largest neural network, the one
associated with cluster number 2, is:
N*(N-1) = Number of O/D pairs = 372
M = Number of detectors = 237
Total number of parameters to estimate = 365,676
For the neural network associated with cluster number 6, N*(N-1)=20, M=51, and the
number of parameters is 1860, which are more reasonable numbers.
CONCLUSIONS AND FURTHER RESEARCH
The analysis of the results obtained shows that the best forecast is the one produced by
topology 24-2-1. This topology gives a higher SSE than the others for the
training patterns (50.74), but it also offers a much better SSE for the validation patterns
(38.46). This can be interpreted as follows: the more neurons in the hidden layer, or the more
hidden layers, the better the neural network learns the training patterns; however, this
improvement in the training is not transferred to the prediction, which can even become
worse. Another point worth noting is that the R2 statistic does not differ significantly across
topologies and consequently does not help to discriminate among them.
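The selection argument can be restated as a small calculation over the reported figures. The SSE values are copied from the two result tables of the previous section; the dictionary layout and names are only illustrative.

```python
# Training and validation SSE reported for each tested topology.
results = {
    "24-4-1": {"sse_train": 26.71, "sse_val": 101.80},
    "24-3-1": {"sse_train": 48.27, "sse_val": 80.14},
    "24-2-1": {"sse_train": 50.74, "sse_val": 38.46},
    "24-3-2-1": {"sse_train": 6.12, "sse_val": 122.07},
    "24-2-4-1": {"sse_train": 12.35, "sse_val": 70.94},
}

# The topology that fits the training patterns best ...
best_on_training = min(results, key=lambda name: results[name]["sse_train"])
# ... is not the one that predicts unseen patterns best.
best_on_validation = min(results, key=lambda name: results[name]["sse_val"])

print(best_on_training, best_on_validation)
```

Ranking by validation SSE rather than training SSE is what identifies 24-2-1, the smallest network tested, as the preferred topology.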
An interesting aspect, which confirms the hypothesis in section 4 that led us to propose the
neural network as a forecasting mechanism, is the quality of the predictions obtained. This
validates the methodology for predicting O/D matrices.
However, the evaluation of the system carried out as part of the CAPITALS project
(CAPITALS Final Report, 1998) reveals an important drawback of this procedure. Traffic
data from the 377 detectors on the CAPITALS site were collected every 5 minutes from
8:00 until 14:00 on Tuesdays, Wednesdays and Thursdays for 6 weeks (weeks
20 to 25 of 1998). The neural network reproduces the input historical O/D matrices with a
high degree of accuracy, but when the predicted O/D values are used to predict traffic
flows on the network by means of a heuristic dynamic network loading based on
simulation, the comparison between the measured and estimated flows reveals significant
deviations in 40% of the cases. A deeper analysis of these cases has been carried out. Let us
illustrate this analysis for one case: the difference between the observed and
predicted flow values for the time period 11:00-11:15 on Tuesday 21.07.98:
Descriptive Statistics
Mean                                     20.815
StDev                                   119.620
Variance                                14309.0
Skewness                               0.933078
Kurtosis                                6.30806
N                                           346
Minimum                                -524.694
1st Quartile                            -32.757
Median                                    7.500
3rd Quartile                             67.208
Maximum                                 666.667
95% Confidence Interval for Mu            8.166    33.464
95% Confidence Interval for Sigma       111.322   129.265
95% Confidence Interval for Median       -0.687    17.805
After removing the outliers, the test of the paired difference gives:
Test of mu = 0.00 vs mu not = 0.00
Variable      N     Mean    StDev   SE Mean      T        P
C27         334    13.18    87.80      4.80   2.74   0.0064
This value is even worse than the initial one, and repeating the descriptive statistical
analysis we get:
Descriptive Statistics
Variable: C27
Mean                                    13.1834
StDev                                   87.8044
Variance                                7709.60
N                                           334
Minimum                                -275.222
1st Quartile                            -32.757
Median                                    6.083
3rd Quartile                             61.750
Maximum                                 334.667
95% Confidence Interval for Mu            3.733    22.634
95% Confidence Interval for Sigma        81.612    95.021
95% Confidence Interval for Median       -1.667    16.589
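The standard error and T statistic of the paired-difference test can be re-derived from the summary statistics alone. The sketch below uses the usual one-sample formulas, SE = StDev / sqrt(N) and T = (Mean - mu0) / SE; the function name is hypothetical and the P value is not recomputed.

```python
import math


def one_sample_t(mean, stdev, n, mu0=0.0):
    """Return (standard error of the mean, t statistic) for H0: mu = mu0."""
    se = stdev / math.sqrt(n)
    return se, (mean - mu0) / se


# Differences after removing the outliers (N = 334).
se_clean, t_clean = one_sample_t(13.1834, 87.8044, 334)
# Differences before removing the outliers (N = 346).
se_all, t_all = one_sample_t(20.815, 119.620, 346)

print(round(se_clean, 2), round(t_clean, 2))
```

In both cases the mean difference is several standard errors away from zero, so the prediction bias is not explained by the handful of outlying detectors.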
The persistence of the variability indicates that it is generated by other causes. A clearer
picture is provided by the regression plot between the observed detector
values and the predicted values, as the following figure illustrates.
[Regression plot: fitted line Y = -38.3718 + 0.953590 X, with 95% CI bands]
If we observe the dispersion of the points beyond the 95% interval, it becomes more
evident that deeper reasons should be sought.
The main data used by the prediction process are the detector measurements and the
Historical Origin-Destination matrix. Therefore, if the simulation model is able to
reproduce acceptably the "average working day" for the considered period, as the
calibration procedure verifies, and the training of the Neural Network is of sufficient
quality, the reasons for the inability to explain the flow variability in some cases could
lie in the inaccuracy of the Historical Origin-Destination matrix.
The traffic data gathered during the data collection process have been used to
determine the 24 average 15-minute periods covering the 6 hours of the studied time
horizon. Average flows for each quarter-hour period have been used for adjusting the O/D
matrix for that period and for calibrating the simulation model for the same period.
Investigating the reasons for the deviations, we have found that meaningful differences
can appear between the average values for the same quarter-hour on the same weekdays.
These variations largely affect the accuracy of the adjusted O/D matrices used as input.
A way of trying to overcome this type of drawback would be to adjust a Historical O/D
matrix for each interval for each day of the week, to capture this unexplained variability.
Recent results of other projects also using O/D approaches point in this direction; see
Mauro (1998) or Inaudi and Morello (1998).
On the other hand, the quality of the O/D adjustment used to produce the Historical
O/D matrices depends on the number of links with detectors and on the detector layout, that is,
on which links have detectors. It is therefore essential to identify how many detectors are
needed and their best layout to achieve the most reliable estimate. In the case of the Madrid
site the only available detectors were those installed for the traffic control system. Obviously
the layout decision was taken with the objective of optimising the control strategies,
completely ignoring any other use. The best decision for control is not necessarily a good
one for the adjustment of O/D matrices. A careful review of the detector layout reveals
that:
1. A high percentage of detectors are redundant for O/D estimation purposes.
2. Some key links for the O/D adjustment do not have detectors.
A way of remedying these drawbacks would be the following:
1. Determine upper and lower bounds on the optimal number of detectors,
depending on the size and topology of the road network, as in Bianco et al.
(1998).
2. Identify the best layout for these detectors.
In the case of the Madrid site, for instance, only 147 out of the 346 detectors are
non-redundant, and therefore valid for the O/D adjustment. With respect to the detection
layout, a layout maximising the reliability of the O/D estimate can be obtained using the
procedure proposed by Yang et al. (1991). The improvement of the historical O/D
matrices appears to be the key component to ensure reliable predictions based on the
Neural Network approach. We do not consider these results conclusive. As we
pointed out in the introduction, our objective was to explore the feasibility of the proposed
approach. We believe that these results point towards the confirmation that it could be a
valid approach but, as one of the referees pointed out, a single experiment is not enough
to draw definite conclusions; more experiments are required, in particular to overcome the
identified drawbacks.
REFERENCES
Baldi, P.P. and Hornik, K. (1995). Learning in Neural Networks: A Survey, IEEE
Transactions on Neural Networks, 6, 837-858.
Barcelo, J., J.L. Ferrer, R. Grau, M. Florian and E. Le Saux (1995). A Route Based
Variant of the AIMSUN2 Microsimulation Model, Proceedings of the 2nd
World Congress on Intelligent Transport Systems, Yokohama.
Barcelo, J. and J.L.Ferrer (1997). AIMSUN2: Advanced Interactive Microscopic
Simulator for Urban Networks, User's Manual, Departament d'Estadistica i
Investigacio Operativa, Universitat Politecnica de Catalunya.
Barcelo, J. (1997). A Survey of Some Mathematical Programming Models in
Transportation, TOP (Journal of the Spanish Operations Research Society), 5,
1-40.
Ben-Akiva, M., E. Cascetta, H. Gunn, S. Smulders, and J. Whittaker (1994). DYNA: A
Real-Time Monitoring and Prediction System for Inter-Urban Motorways,
Proceedings of the First World Conference on Intelligent Transport Systems.
Bianco, L., G. Confessore and P. Reverberi (1998). A Network Based Model for Traffic
Sensor Location with Implications on O/D Estimates, Pre-prints of the TRISTAN
III Meeting.
CAPITALS, (1998). EU Telematics Applications Programme, DGXIII Project TR1007,
Deliverable DOS.2, Final Report on Advanced Traffic Control Strategies.
Cascetta, E. (1993). Dynamic Estimators of Origin-Destination Matrices Using Traffic
Counts, Transportation Science, 27, 363-373.
Chakraborty, K., K. Mehrotra, C.K. Mohan and S. Ranka (1992). Forecasting the
behaviour of multivariate time-series using neural networks. Neural Networks, 5,
961-970.
Chen, Y. (1994). Bilevel Programming Problems: Analysis, Algorithms and
Applications, Centre de Recherche sur les Transports, Universite de Montreal,
Publication #984.
Davis, G.A. (1993). A Statistical Theory for Estimation of Origin-Destination Parameters
from Time-Series of Traffic Counts. In: Transportation and Traffic Theory (C.F.
Daganzo, ed.). Elsevier Science Publishers.
Davis, G. A. and J.G. Kang (1994). Estimating Destination-Specific Transit Densities on
Urban Freeways for Advanced Traffic Management, Transportation Research
Records.
Epstein, D. (1994). Finding the k-Shortest Paths, Dept. of Information and Computer
Science, University of California Irvine, Tech. Report 94-26.
Florian, M. and Y. Chen (1993). A coordinate descent method for the bilevel O-D matrix
adjustment problem. Centre de Recherche sur les Transports, Universite de
Montreal.
Gunn, H. (1994). DYNA-DRIVE II Project V2036 Annual Project Review Report - Part
A Section 2, EC R&D Program Telematic System in the Area of Transport.
Hecht-Nielsen R. (1989). Neurocomputing, Addison-Wesley.
Inaudi, D., S. Manfredi, and S. Toffolo (1994). The DYNA on-line matrix estimation and
prediction model, Proceedings of the First World Conference on Intelligent
Transport Systems.
Inaudi, D. and E. Morello (1998). On Line Traffic Models for On Line Traffic
Management. Pre-prints of the TRISTAN III Meeting, June.
INRO Consultants Inc., (1996). EMME/2 User's Manual.
Jayakrishnan, R., H.S. Mahmassani, and Ta-Yin Hu (1994). An Evaluation Tool for
Advanced Traffic Information and Management Systems in Urban Networks.
Transportation Research C, 2, 129-147.
Masson, E. and Y. Wang (1990). Introduction to Computation and learning in artificial
neural networks. European Journal of Operational Research, 47, 1-28.
Mauro, V. (1998). Advanced Traffic Management and Guidance: Experimental Results
from the Torino 5T Scheme, Pre-prints of the TRISTAN III Meeting.
SOCRATES V1007 (1991). Project of DRIVE I, European Commission, Report on
Floating Car Data.
Spiess, H. (1990). A Gradient Approach for the O-D Matrix Adjustment Problem,
Publication #693, Centre de Recherche sur les Transports, Universite de Montreal.
University of Stuttgart, SNNS User's Manual, Version 4.1, (1995). Report #6/95.
Van der Zijpp, N.J. and R. Hamerslag (1996). Improved Kalman Filtering Approach for
Estimating Origin-Destination Matrices for Freeway Corridors. TRB, 1443.
Weigend, A.S, B.A. Huberman and D.E. Rumelhart (1992). Predicting sunspots and
exchange rates with connectionist networks. In: Nonlinear modeling and
forecasting (M. Casdagli and S. Eubank, eds.). Addison-Wesley.
Yang, H., Y. Iida and T. Sasaki (1991). An Analysis of the Reliability of an Origin-
Destination Trip Matrix Estimated from Traffic Counts. Transpn. Res. 25B, 5,
351-363.
Algorithms For Congested Trip Matrix Estimation 445
ALGORITHMS FOR THE SOLUTION OF THE
CONGESTED TRIP MATRIX ESTIMATION
PROBLEM
Mike Maher and Xiaoyan Zhang
School of Built Environment, Napier University, UK
ABSTRACT
This paper is concerned with the problem of estimating trip matrices from traffic counts on a
set of road links on congested networks. The purpose is to develop efficient algorithms for the
solution of the combined trip matrix estimation and User Equilibrium (UE) assignment
problem. Two types of solutions will be considered: one is the mutually consistent solution at
which the two sub-problems are solved simultaneously, and the other the solution to the bi-
level programming problem in which matrix estimation is the upper-level problem and UE
assignment the lower-level problem. The algorithms are tested on two artificial networks and
the Sioux Falls network. The algorithms developed are also applied to the combined signal
optimisation and UE assignment problem.
1. INTRODUCTION
Trip matrix estimation is the problem of determining the number of trips between each Origin-
Destination (O-D) pair from observations of traffic flows on a set of road links in a network.
The estimation methods developed to date may be divided into two categories: non-
assignment-based methods and assignment-based methods, according to whether a trip
assignment model is involved. In the first category of methods, it is assumed that the route
choice proportions between each O-D pair are constants and are determined separately from the
matrix estimation process. Several estimation procedures of this category have been developed
over the years, including entropy maximisation (Van Zuylen and Willumsen, 1980), Bayesian
inference (Maher, 1983), generalised least squares (Cascetta, 1984), and maximum likelihood
(Spiess, 1987). See Cascetta and Nguyen (1988) for a review of these methods. However,
assuming constant route choice proportions has an inherent shortcoming. Normally, route choice
proportions are obtained by assigning a trip matrix to the network by a trip assignment model.
Thus, there is in general an inconsistency between the results of matrix estimation and trip
assignment if the two problems are solved separately. The inconsistency tends to be more
serious when the network is congested because route choice will depend on the trip matrix in
congested networks.
In the assignment-based methods, matrix estimation and assignment are linked together to
overcome the problem of inconsistency. Normally, a User Equilibrium (UE) assignment is
used. Both matrix estimation and UE assignment have been formulated, separately, as
mathematical programming problems; the combined problem therefore involves two linked
programming problems. The problem may be formulated either as a two-objective
programming problem or as a bi-level programming problem. In the two-objective
programming problem, each sub-problem has its own set of decision variables (matrix
estimation having the matrix and UE assignment having link flows or, equivalently, route
choice proportions) and the main interest here is to obtain a mutually consistent solution at
which the two sub-problems are solved simultaneously. We will call this type of problem the
mutually consistent problem. In the bi-level programming problem, matrix estimation is the
upper-level problem and UE assignment the lower-level problem. This is essentially a single-
objective programming problem with link flows constrained by UE conditions.
Although the mutually consistent problem can be seen as a multiple-objective programming

problem, it is rather special in that in each of the sub-problems only one set of variables may be
varied for optimisation. Fisk (1984) discussed a range of combined problems in the framework
of game theory, and the discussion was illustrated by the combined signal optimisation and UE
assignment problem. In game theory, the mutually consistent problem is called a Nash non-
cooperative game and the bi-level problem a Stackelberg game or leader-follower game. In
Nash non-cooperative games, each of the two players tries to minimise his own objective
function only without considering the reaction of the other player. In a leader-follower game,
on the other hand, the leader chooses his variables so as to optimise his objective function,
taking into account the response of the follower who tries to optimise his objective function
according to the leader's decisions. In other words, although the leader cannot intervene in the
follower's decision, he can consider the follower's reaction in his own decision making. This is
particularly important in the bi-level signal optimisation problem, in which signal optimisation
is the leader and UE assignment the follower. The combined signal optimisation and UE
assignment problem will also be considered in this paper.
There has been an emphasis on the bi-level programming formulation for the two combined
problems in recent years (Fisk, 1988; Yang et al., 1992; Yang, 1995; and Yang and Yagar,
1995). However, it is of theoretical importance to identify both mutually consistent solutions
and the solutions to the bi-level problems so that comparison between them may be made.
Therefore, we will consider both types of solution in this paper. The purpose of this paper is to
develop efficient algorithms for the solution of the combined matrix estimation and UE
assignment problem. Two algorithms will be developed, respectively, for the two types of
solution. In the rest of this introduction section, we describe the formulation and solution of the
two types of problem. We also discuss some of the existing algorithms for solving the
combined problem. The two proposed algorithms are presented in section 2. In section 3, the
algorithms are tested on two artificial networks and the Sioux Falls network. The algorithms
developed are also applicable to the combined signal optimisation and UE assignment problem.
This application is described in section 4, where detailed description of the problem can be
found. The paper is summarised in the last section.
1.1 The Problem Formulations and Solutions
The problem of trip matrix estimation has been considered by many researchers. Most of the
approaches developed so far have the general form of an optimisation problem (e.g., Yang et
al., 1992):
$$\text{ME:} \quad \min Z_{ME}(t, v) = F_t(t, \tilde{t}) + F_v(v, \tilde{v}) \qquad (1)$$
subject to $v = A(t)$, $t \ge 0$
Here, "ME" stands for matrix estimation; $t$ and $\tilde{t}$ are, respectively, vectors of estimated and
target trip matrices; $v$ and $\tilde{v}$ are, respectively, vectors of estimated and observed link flows; $F_t$
is the function of "distance" between $t$ and $\tilde{t}$; $F_v$ is the function of "distance" between $v$ and $\tilde{v}$;
all vectors in this paper are column vectors. Note that the second term of the objective function
is defined only for those links with traffic counts.
In the matrix estimation problem, t is the set of decision variables, v is a function of t, and A(t)
is called a trip assignment map which predicts link flows for a given matrix. The simplest
assignment map is proportional assignment: v=Pt, where P is the matrix containing proportions
of each O-D flow using each link, or link choice proportions. Therefore, given a set of link
choice proportions and assuming v=Pt, problem (1) can be solved to get an optimal trip matrix.
For example, one of the most widely used formulations for matrix estimation is the
minimisation of the weighted sum of squared distances between the observed and estimated
traffic flows (Cascetta, 1984):
$$\min Z_{ME}(t, v) = (\tilde{t} - t)^T U^{-1} (\tilde{t} - t) + (\tilde{v} - v)^T W^{-1} (\tilde{v} - v) \qquad (2a)$$
subject to $v = Pt$, $t \ge 0$
where U and W are weighting matrices, or the variance-covariance matrices of the target matrix
and the observed link flows. The solution to this problem, t*, is the Generalised Least Squares
(GLS) estimator and is given by (Cascetta, 1984)
$$t^* = (U^{-1} + P^T W^{-1} P)^{-1} (U^{-1} \tilde{t} + P^T W^{-1} \tilde{v}) \qquad (2b)$$
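The closed form (2b) can be evaluated directly for a small example. The sketch below assumes diagonal U and W (independent errors), passed as lists of variances, and it ignores the non-negativity constraint on t, so it is valid only when the unconstrained solution happens to be non-negative. All function names and the data are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def gls_estimate(t_target, v_obs, P, u_var, w_var):
    """t* = (U^-1 + P^T W^-1 P)^-1 (U^-1 t_target + P^T W^-1 v_obs), diagonal U and W.

    P[a][i] is the proportion of O-D flow i using counted link a.
    """
    n, m = len(t_target), len(v_obs)
    # Rows of P^T W^-1 (one row per O-D pair).
    ptw = [[P[a][i] / w_var[a] for a in range(m)] for i in range(n)]
    lhs = [[sum(ptw[i][a] * P[a][j] for a in range(m)) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        lhs[i][i] += 1.0 / u_var[i]
    rhs = [t_target[i] / u_var[i] + sum(ptw[i][a] * v_obs[a] for a in range(m))
           for i in range(n)]
    return solve(lhs, rhs)


# Three O-D pairs, two counted links; counts are trusted more than the target.
P = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
t_star = gls_estimate([10.0, 20.0, 30.0], [35.0, 45.0], P,
                      u_var=[4.0, 4.0, 4.0], w_var=[1.0, 1.0])
```

With P equal to the identity and unit variances the estimator reduces to the average of the target matrix and the counts, which is a convenient sanity check.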
The assumption of a proportional assignment map may be appropriate only in uncongested

networks. When the network is congested, however, route choice will vary with O-D flows and
a UE assignment is normally included in the matrix estimation. A UE assignment has also been
formulated as an optimisation problem
$$\text{UE:} \quad \min Z_{UE}(t, v) = \sum_{a \in A} \int_0^{v_a} c_a(x)\,dx \qquad (3)$$
subject to $v_a = \sum_{r \in R} f_r \delta_{ar}$, $f_r \ge 0$, $\sum_{r \in R_i} f_r = t_i$
where $c_a$ is the cost on link a, $f_r$ is the flow on route r, $\delta_{ar} = 1$ if link a is in route r and $\delta_{ar} = 0$
otherwise, $R_i$ is the set of routes for O-D pair i, R is the set of all routes in the network, and A is
the set of all links in the network. Although we have included the trip matrix t in the objective
function, t is fixed in the UE assignment problem. It is possible to find a UE solution of link
choice proportions together with link flows, although the solution of link choice proportions is
not unique. Unlike the proportional assignment model, UE assignment does not in general have
an explicit functional form of A(t). We will use V(t) and P(t) to denote UE solutions of link
flows and link choice proportions.
Thus, in congested trip matrix estimation, there are two optimisation problems with the input of
one problem being the output of the other:
where t*, v* and P* are optimal solutions of ME and UE, respectively. The two sub-problems
may be coupled in two ways and two types of solutions may be defined. Consider firstly a
situation where the two sub-problems are solved simultaneously:
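$$\min_{t \in D_t}\; Z_{ME}(t, Pt) \qquad (4a)$$
$$\min_{v \in D_v}\; Z_{UE}(t, v) \qquad (4b)$$
where $D_t$ and $D_v$ are the feasible regions for t and v respectively. Then we get a mutually
consistent solution, $(t^{MC}, P^{MC} t^{MC})$, satisfying
$$Z_{ME}(t^{MC}, P^{MC} t^{MC}) \le Z_{ME}(t, P^{MC} t), \quad \forall t \in D_t$$
$$Z_{UE}(t^{MC}, v^{MC}) \le Z_{UE}(t^{MC}, v), \quad \forall v \in D_v$$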
In other words, tMC optimises ZME for given PMC and, in the meantime, PMC optimises Z UE for
given tMC. On the other hand, the two sub-problems may also be coupled in a hierarchical way.
This results in a bi-level programming problem, in which ME is the upper-level problem and
UE assignment the lower-level problem:
$$\min_{t \in D_t}\; Z_{ME}(t, V(t)) \qquad (5)$$
where V(t) is the lower-level UE assignment problem as defined by (3). An optimal solution,
$(t^{BL}, V(t^{BL}))$, to the bi-level problem must be such that
$$Z_{ME}(t^{BL}, V(t^{BL})) \le Z_{ME}(t, V(t)), \quad \forall t \in D_t$$
We will refer to this type of solution as a bi-level solution. Note that the mutually consistent
solution has been defined in terms of P rather than v. This is because in the ME problem, v
varies with t. For v fixed, the second term of the objective function becomes constant and the
problem is equivalent to that of minimising the first term. For the bi-level solution, on the other
hand, UE conditions must always be satisfied and the solution is in the form of (t,V(t)). Both
mutually consistent solutions and bi-level solutions satisfy UE conditions. In addition, a
mutually consistent solution is optimal to the ME problem while a bi-level solution is not
necessarily optimal for ME: we can normally solve the ME problem for given PBL=P(tBL) and obtain a different matrix. In
spite of this, however, the bi-level solution may still have a smaller objective function value for
the ME problem than the mutually consistent solution. In fact, among all the solutions that
satisfy UE conditions, the bi-level solution has the minimum ME objective function value.
1.2 A Two-link Network Example
Here we demonstrate the two types of solutions of combined ME and UE by an example of a

simple network with one O-D pair connected by two links. The cost functions on the two links
are
$$c_1 = 5 + v_1/1000$$
$$c_2 = 6.25 + v_2/1000$$
The target matrix is $\hat{t} = 2000$ and the link count is $\hat{v}_2 = 620$, made on link 2. The UE assignment
solution for given matrix, t, is
$$V_1(t) = t/2 + 625 \qquad (6a)$$
$$V_2(t) = t/2 - 625 \qquad (6b)$$
where $t > 1250$. The corresponding UE link choice proportions are
$$P_1(t) = 1/2 + 625/t \qquad (7a)$$
$$P_2(t) = 1/2 - 625/t \qquad (7b)$$
Using least squares matrix estimation
$$Z_{ME} = U(t - \hat{t})^{2} + W(v_2 - \hat{v}_2)^{2}$$
where U and W are weights of trip matrix and link flows respectively, the optimal solution of
ME problem is given by
$$dZ_{ME}/dt = 2U(t - \hat{t}) + 2W(v_2 - \hat{v}_2)\,(dv_2/dt) = 0 \qquad (8)$$
Replacing $v_2$ by $P_2 t$ in (8), the mutually consistent solution is the root of simultaneous
equations (8) and (7b):
$$2U(t - \hat{t}) + 2W(P_2 t - \hat{v}_2)\,P_2 = 0$$
$$P_2 = 1/2 - 625/t$$
The bi-level solution, on the other hand, is given by substituting $V_2(t)$ for $v_2$ in (8):
$$2U(t - \hat{t}) + 2W(V_2(t) - \hat{v}_2)\,V_2'(t) = 0$$
where $V_2(t)$ is given by (6b) and $V_2'(t) = 1/2$. The two types of solutions with U=1.0 and W=1.0
are shown in Table 1. Note that the objective function value of matrix estimation is smaller at
the bi-level solution than at the mutually consistent solution. This two-link example will be
used later to test the convergence of different algorithms.
Table 1. An example of two types of solutions on a two-link network

Solution                      t          v1         v2        P1      P2      ZME         ZUE
Mutually consistent solution  2043.3538  1646.6769  396.6769  0.8059  0.1941  51752.7613  12147.0637
Bi-level solution             2098.0000  1674.0000  424.0000  0.7979  0.2021  48020.0000  12511.0260
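The two rows of Table 1 can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are illustrative, the bi-level root is written in closed form because the condition is linear in t, and the mutually consistent root is located by plain bisection on t > 1250.

```python
T_TARGET = 2000.0   # target O-D flow from the text
V2_OBS = 620.0      # observed flow on link 2 from the text


def ue_v2(t):
    """UE flow on link 2, equation (6b); valid for t > 1250."""
    return t / 2.0 - 625.0


def ue_p2(t):
    """UE link choice proportion for link 2, equation (7b)."""
    return 0.5 - 625.0 / t


def z_me(t, v2, u=1.0, w=1.0):
    """Least squares objective of the two-link example."""
    return u * (t - T_TARGET) ** 2 + w * (v2 - V2_OBS) ** 2


def bilevel_solution(u=1.0, w=1.0):
    # 2U(t - t_hat) + 2W(t/2 - 625 - v_hat)(1/2) = 0 is linear in t.
    return (2.0 * u * T_TARGET + w * (625.0 + V2_OBS)) / (2.0 * u + 0.5 * w)


def mutually_consistent_solution(u=1.0, w=1.0, lo=1250.0, hi=10000.0):
    # Root of 2U(t - t_hat) + 2W(P2 t - v_hat) P2 = 0 with P2 = P2(t).
    def f(t):
        p2 = ue_p2(t)
        return 2.0 * u * (t - T_TARGET) + 2.0 * w * (p2 * t - V2_OBS) * p2

    for _ in range(200):          # bisection: f(lo) < 0 < f(hi)
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


t_mc = mutually_consistent_solution()
t_bl = bilevel_solution()
print(round(t_mc, 4), round(z_me(t_mc, ue_v2(t_mc)), 4))
print(round(t_bl, 4), round(z_me(t_bl, ue_v2(t_bl)), 4))
```

Both solutions lie on the UE curve (6b), so the comparison of ZME values is a comparison between two UE-feasible points, as stated at the end of section 1.1.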
1.3 Existing Algorithms for Solving the Combined Problems
Hall et al. (1980) considered a special case of problem (1), in which the observed link flows are
assumed to be at UE and are error-free (so that the second term of the objective function
vanishes), that is
$$\min_{t}\; Z_{ME}(t) = F_1(t, \hat{t})$$

subject to $P(t)\,t = \hat{v}$, $t \ge 0$.
An iterative algorithm for solving the problem is proposed, in which the two sub-problems are
solved alternately until convergence is achieved:
where n is the iteration number. This alternate algorithm has been widely used. However, it
has been demonstrated (Fisk, 1988) that this algorithm may or may not converge, depending on
whether the coupling between the two sub-problems is weak or not. In addition, when it does
converge, it will converge to the mutually consistent solution. Carvalho (1996) investigated a
number of modifications to the alternate algorithm; in these the change from the current
solution of matrix $t^{(n)}$ to the new one $t^{(n+1)}$ is moderated by using a pre-determined sequence of
step sizes of 1/n. This algorithm is similar to the Method of Successive Averages (MSA)
frequently used in Stochastic User Equilibrium (SUE) assignment (Sheffi, 1985). Although the
algorithm removes the erratic nature of the convergence, the convergence is generally slow
because the step sizes become smaller and smaller as the iterations proceed. The slow
convergence of MSA has also been observed by Maher and Hughes (1998) in their study of
SUE assignment with elastic demand.
Two heuristic algorithms for solving the bi-level problem (5) have been proposed by Yang et
al. (1992) and Yang (1995). The two algorithms also involve alternate optimisation of the
upper- and lower-level problems. For each solution of the upper-level problem, the UE
assignment map is approximated by a linear map. In the first algorithm, V(t) is approximated
by $P^{(n)} t$, and in the second algorithm by $V(t^{(n)}) + \nabla_t V(t^{(n)})\,(t - t^{(n)})$. The partial derivatives of UE
link flows with respect to O-D flows in the gradient $\nabla_t V(t^{(n)})$ are obtained from the sensitivity
analysis methods for non-linear programming problems or variational inequalities (Tobin and
Friesz, 1988). The first algorithm is essentially the same as that by Hall et al. (1980) mentioned
above, though the algorithm is developed for solving the more general problem (1). Therefore,
it should identify the mutually consistent solution of (4). Intuitively, the second algorithm may
converge to the bi-level solution because the approximation of V(t) is equivalent to taking the
first-order terms of the Taylor expansion of V(t). Numerical tests in Yang (1995) have shown
that the two algorithms do converge to different solutions and that, in most cases tested, the
second algorithm converges to a solution with a smaller ZME. Here, the two algorithms are
tested on the two-link network mentioned in section 1.2. The convergence of the algorithms is
shown in Table 2. By comparing Table 2 with Table 1, it can be seen that the first algorithm
converges to the mutually consistent solution while the second algorithm to the bi-level
solution. In this particular example, the linear assumption of the UE map in the second
algorithm is an exact one so the algorithm converges in only one iteration. In general, however,
conditions for the convergence of both algorithms remain to be proved. In section 3, we will
show an example in which the alternate algorithm diverges.
Table 2. Convergence of the two algorithms by Yang et al. (1992) and Yang (1995) on the
two-link network

First algorithm converges to the mutually consistent solution
n  t          v1         v2        P1      P2      ZME         ZUE
0  2000.0000  1625.0000  375.0000  0.8125  0.1875  60025.0000  11859.3750
1  2044.3774  1647.1887  397.1887  0.8057  0.1943  51614.2346  12153.8673
2  2043.3282  1646.6641  396.6641  0.8059  0.1941  51756.2633  12146.8934
3  2043.3544  1646.6772  396.6772  0.8059  0.1941  51752.6737  12147.0680
4  2043.3538  1646.6769  396.6769  0.8059  0.1941  51752.7635  12147.0636
5  2043.3538  1646.6769  396.6769  0.8059  0.1941  51752.7612  12147.0637

Second algorithm converges to the bi-level solution
n  t          v1         v2        P1      P2      ZME         ZUE
0  2000.0000  1625.0000  375.0000  0.8125  0.1875  60025.0000  11859.3750
1  2098.0000  1674.0000  424.0000  0.7979  0.2021  48020.0000  12511.0260
2. THE PROPOSED SOLUTION ALGORITHMS
In this section we present the two proposed algorithms, each for one type of solution. The core
of most algorithms for solving mathematical programming problems is to calculate, at each
iteration, a new solution, say, $x^{(n+1)}$, from the current solution, $x^{(n)}$, in the form of
$$x^{(n+1)} = x^{(n)} + \alpha\,(x^{*} - x^{(n)})$$
where $x^{*}$ is an auxiliary solution, which provides a search direction $(x^{*} - x^{(n)})$, and $\alpha$ is the step
size which determines how far to move from the current solution. The alternate algorithm can
be seen as one with the step length being 1 at each iteration and the MSA as one with the step
length being 1/n. In the two proposed algorithms, we will calculate an (approximate) optimal
step length at each iteration. The two algorithms are described in turn below.
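To make the role of the step length concrete, the sketch below applies the update "current solution plus step times (auxiliary minus current)" to the two-link example of section 1.2, once with a unit step (the alternate algorithm) and once with a 1/n step (the MSA-like variant). It is illustrative only: the function names are assumptions, U and W are treated as scalar weights as in section 1.2, and the auxiliary solution is the scalar least squares estimate with the link choice proportion held fixed.

```python
def unit_step(n):
    """Alternate algorithm: full step to the auxiliary solution."""
    return 1.0


def msa_step(n):
    """MSA-like variant: pre-determined step 1/n at iteration n."""
    return 1.0 / n


def iterate(step_rule, u, w, n_iter, t0=2000.0, t_hat=2000.0, v_hat=620.0):
    """Iterate t <- t + step * (t_star - t) on the two-link example."""
    t = t0
    history = [t]
    for n in range(1, n_iter + 1):
        # UE proportion on link 2 for the current matrix, equation (7b).
        p2 = max(0.0, 0.5 - 625.0 / t)
        # Auxiliary solution: least squares estimate assuming v2 = p2 * t.
        t_star = (u * t_hat + w * p2 * v_hat) / (u + w * p2 * p2)
        t = t + step_rule(n) * (t_star - t)
        history.append(t)
    return history


# U = W = 1: the unit step settles on the mutually consistent solution.
print([round(x, 4) for x in iterate(unit_step, 1.0, 1.0, 5)])

# U = 0.0001, W = 0.9999: the unit step oscillates with growing amplitude,
# while the 1/n step still approaches the mutually consistent solution.
print(round(iterate(unit_step, 1e-4, 0.9999, 20)[-1], 1))
print(round(iterate(msa_step, 1e-4, 0.9999, 200)[-1], 1))
```

With U = W = 1 the first iterates follow the first block of Table 2; with the weights of Figure 1 the unit-step sequence moves away from the fixed point.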
2.1. The Algorithm for the Mutually Consistent Solution
Suppose we have a current solution, $[t^{(n)}, P^{(n)}]$. The ME problem is firstly solved to get an
auxiliary solution of the trip matrix, $t^{*}$, assuming $v = P^{(n)} t$. Then, a UE assignment is performed
to find the auxiliary solution of UE link choice proportions, $P^{*}$, for $t^{*}$. This provides two
search directions, $(t^{*} - t^{(n)})$ and $(P^{*} - P^{(n)})$, for matrix and for link choice proportions,
respectively. We then search for a pair of optimal step lengths for the two sets of variables
respectively in the hyper-plane defined by the three points $(t^{(n)}, P^{(n)})$, $(t^{*}, P^{(n)})$, and $(t^{*}, P^{*})$. Let
$$t(\alpha) = t^{(n)} + \alpha\,(t^{*} - t^{(n)}) \qquad (9a)$$
$$P(\beta) = P^{(n)} + \beta\,(P^{*} - P^{(n)}) \qquad (9b)$$
Denote the derivatives of the two objective functions along the two directions by $g(\alpha,\beta)$ and
$h(\alpha,\beta)$, respectively. Then
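$$g(\alpha,\beta) = \frac{dZ_{ME}(t(\alpha), v)}{d\alpha}$$
$$h(\alpha,\beta) = \frac{dZ_{UE}(t(\alpha), v)}{d\beta} = \nabla_v Z_{UE}(t(\alpha), v)\,(P^{*} - P^{(n)})\,t(\alpha)$$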
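where $v = P(\beta)\,t(\alpha)$. The values of $g(\alpha,\beta)$ and $h(\alpha,\beta)$ can be readily calculated at the three points
mentioned above. Using these values and assuming that the two objective functions are
quadratic in $\alpha$ and $\beta$ in the vicinity of the current solution so that the derivatives are linear, we
have
$$g(\alpha,\beta) \cong g_{00} + \alpha\,(g_{10} - g_{00}) + \beta\,(g_{11} - g_{10})$$
$$h(\alpha,\beta) \cong h_{00} + \alpha\,(h_{10} - h_{00}) + \beta\,(h_{11} - h_{10})$$
where the subscripts refer respectively to values of $\alpha$ and $\beta$. For example, $g_{10}$ is the value of g
when $\alpha = 1$ and $\beta = 0$. A pair of optimal step lengths which minimise simultaneously $Z_{ME}(t(\alpha), v)$
and $Z_{UE}(t(\alpha), v)$ can now be found by solving the set of two linear equations
$$g_{00} + \alpha\,(g_{10} - g_{00}) + \beta\,(g_{11} - g_{10}) = 0$$
$$h_{00} + \alpha\,(h_{10} - h_{00}) + \beta\,(h_{11} - h_{10}) = 0$$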
Once an optimal set of step lengths is found, a new set of solutions is then given by (9). The
algorithm may be outlined as follows.
Algorithm 1
Step 0: Initialise $t^{(0)}$, $P^{(0)}$, and $v^{(0)} = P^{(0)} t^{(0)}$; set n=0.
Step 1: Determine $t^{*}$ by, for example, a GLS estimator, assuming $v = P^{(n)} t$.
Step 2: Find $P(t^{*})$ for $t^{*}$ by UE assignment.
Step 3: Find optimal step lengths, $\alpha^{*}$ and $\beta^{*}$.
Step 4: Set $t^{(n+1)} = t^{(n)} + \alpha^{*}(t^{*} - t^{(n)})$; $P^{(n+1)} = P^{(n)} + \beta^{*}(P^{*} - P^{(n)})$.
Step 5: If the convergence criterion is met, stop; otherwise, set n:=n+1 and go to step 1.
At step 0, the initial trip matrix can normally be set to be the target matrix. The initial UE link
flows and link choice proportions are obtained by assigning the target matrix to the network. At
step 1, the solution to the GLS estimation (2a) of the trip matrix is given by (2b). At step 2, the
UE link flows can be found by the well-known Frank-Wolfe algorithm. The stopping criterion
can be based on the maximum change in the elements of the estimated trip matrix at successive
iterations:
Max.
where $\varepsilon$ is the error tolerance. In this algorithm, the two auxiliary solutions both point to
descent directions of the two sub-problems, respectively, and so the optimal step lengths are
positive though not limited to the range [0,1] in each iteration. As the iterative process
converges, the auxiliary solutions approach the current solution or the optimal solution.
Therefore, the convergence of the algorithm to the mutually consistent solution can be
observed by the fact that the auxiliary solution approaches the current solution.
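The steps of Algorithm 1 can be sketched on the two-link example of section 1.2. This is an illustrative reading of the description above rather than the authors' implementation. The assumptions are: U and W are scalar weights, g is taken as the total derivative of ZME along the matrix direction with v = P t, h is the derivative of ZUE along the proportion direction, both are interpolated linearly from the three points, and the iteration stops when the auxiliary solution coincides with the current one. All function names are hypothetical.

```python
def c1(v):
    return 5.0 + v / 1000.0


def c2(v):
    return 6.25 + v / 1000.0


def ue_p2(t):
    """UE proportion on link 2, equation (7b), clamped at zero."""
    return max(0.0, 0.5 - 625.0 / t)


def algorithm1(u, w, t_hat=2000.0, v_hat=620.0, eps=1e-7, max_iter=100):
    t = t_hat                      # Step 0: start from the target matrix
    p = ue_p2(t)
    for n in range(max_iter):
        # Step 1: scalar least squares estimate with v2 = p * t.
        t_star = (u * t_hat + w * p * v_hat) / (u + w * p * p)
        # Step 2: UE proportion for the auxiliary matrix.
        p_star = ue_p2(t_star)
        dt, dp = t_star - t, p_star - p
        if abs(dt) < eps and abs(dp) * t < eps:
            return t, p, n         # auxiliary solution equals current one

        def g(a, b):               # d ZME / d alpha at (alpha, beta) = (a, b)
            tt, pp = t + a * dt, p + b * dp
            return (2.0 * u * (tt - t_hat)
                    + 2.0 * w * (pp * tt - v_hat) * pp) * dt

        def h(a, b):               # d ZUE / d beta at (alpha, beta) = (a, b)
            tt, pp = t + a * dt, p + b * dp
            return (c2(pp * tt) - c1((1.0 - pp) * tt)) * dp * tt

        # Step 3: solve the two linear equations for the step lengths.
        g00, g10, g11 = g(0, 0), g(1, 0), g(1, 1)
        h00, h10, h11 = h(0, 0), h(1, 0), h(1, 1)
        a11, a12 = g10 - g00, g11 - g10
        a21, a22 = h10 - h00, h11 - h10
        det = a11 * a22 - a12 * a21
        if det == 0.0:
            alpha = beta = 1.0     # fall back to a unit step
        else:
            alpha = (-g00 * a22 + a12 * h00) / det
            beta = (-a11 * h00 + a21 * g00) / det
        # Step 4: move along both directions.
        t, p = t + alpha * dt, p + beta * dp
    return t, p, max_iter


print(algorithm1(1.0, 1.0))
print(algorithm1(1e-4, 0.9999))
```

The second call uses the weights for which the unit-step alternate procedure fails to converge in section 3.1.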
2.2. The Algorithm for the Bi-level Solution
Suppose we have a current solution, $[t^{(n)}, v^{(n)}]$, where $v^{(n)} = V(t^{(n)})$. At each iteration, the upper-
level problem is firstly solved to get an auxiliary solution of the trip matrix, $t^{*}$, assuming
$v = P^{(n)} t$. Then, a UE assignment is performed to find the UE link flows, $v^{*}$, or $V(t^{*})$ at $t^{*}$. Thus,
we have two points satisfying UE conditions, t(n) and t*. We then search for one optimal step
length along (t*-t(n)) by a line search. A line search algorithm normally requires repeated
evaluation of the objective function. In the bi-level problem (5), however, the evaluation of the
upper-level objective function requires the solution of the lower-level UE assignment, whose
functional form is generally unknown. Therefore, a line search directly based on the objective
function (5) requires repeated UE assignment and is very inefficient. To overcome the
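difficulty, we linearise the UE assignment map between the two points, $(t^{(n)}, v^{(n)})$ and $(t^{*}, v^{*})$,
that is
$$V(t) = v^{(n)} + Q\,(t - t^{(n)})$$
where $Q = [Q_{ai}]$ and $Q_{ai} = (v_a^{*} - v_a^{(n)})/(t_i^{*} - t_i^{(n)})$. Let
$$t(\beta) = t^{(n)} + \beta\,(t^{*} - t^{(n)}) \qquad (10a)$$
We have
$$v(\beta) = v^{(n)} + \beta\,(v^{*} - v^{(n)}) \qquad (10b)$$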
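Then an optimal step length $\beta^{*}$ can be found by minimising $Z_{ME}(t(\beta), v(\beta))$. This is a standard
one-dimensional search and can be solved by, for example, the Newton method. The function
$Z_{ME}(t(\beta), v(\beta))$ and its derivatives with respect to $\beta$ can be evaluated for any value of $\beta$. By the
chain rule applied to (10a) and (10b), the first and the second derivatives of $Z_{ME}(t(\beta), v(\beta))$ with respect to $\beta$ are
$$\frac{dZ_{ME}}{d\beta} = \nabla_t Z_{ME}^{T}\,(t^{*} - t^{(n)}) + \nabla_v Z_{ME}^{T}\,(v^{*} - v^{(n)})$$
$$\frac{d^{2}Z_{ME}}{d\beta^{2}} = (t^{*} - t^{(n)})^{T}\,\nabla^{2}_{tt} Z_{ME}\,(t^{*} - t^{(n)}) + 2\,(t^{*} - t^{(n)})^{T}\,\nabla^{2}_{tv} Z_{ME}\,(v^{*} - v^{(n)}) + (v^{*} - v^{(n)})^{T}\,\nabla^{2}_{vv} Z_{ME}\,(v^{*} - v^{(n)})$$
with all derivatives of $Z_{ME}$ evaluated at $(t(\beta), v(\beta))$.
The new solution of the trip matrix is then given by (10a) with $\beta^{*}$. However, $v(\beta^{*})$ obtained by
(10b) is only an approximation to $V(t(\beta^{*}))$. Therefore, another UE assignment is performed to
find the exact new link flows, $V(t(\beta^{*}))$, for $t(\beta^{*})$. The algorithm can be outlined as follows.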
Algorithm 2
Step 0: Initialise $t^{(0)}$, $v^{(0)}$, and $P^{(0)}$; set n=0.
Step 1: Determine $t^{*}$ by, for example, a GLS estimator, assuming $v = P^{(n)} t$.
Step 2: Find $P(t^{*})$ for $t^{*}$ by UE assignment.
Step 3: Find $\beta^{*}$ which minimises $Z_{ME}(t(\beta), v(\beta))$.
Step 4: Set $t^{(n+1)} = t^{(n)} + \beta^{*}(t^{*} - t^{(n)})$.
Step 5: Find $v^{(n+1)} = V(t^{(n+1)})$ as well as $P^{(n+1)} = P(t^{(n+1)})$ for $t^{(n+1)}$ by UE assignment.
Step 6: If the convergence criterion is met, stop; otherwise, set n:=n+1 and go to step 1.
The calculations of all steps in this algorithm are the same as those in algorithm 1, except for
Step 3, which has been described above, and for Step 5, the extra UE assignment at the end of
each iteration. In contrast to algorithm 1, the auxiliary solution in algorithm 2 does not
necessarily point to a descent direction of the ME problem and the optimal step length is not
limited to be positive. In addition, the auxiliary solution is generally different from the current
solution even as the iteration converges. The optimal step length approaches zero as the
iterations converge. Numerical tests with different networks so far have shown that, in the first
few iterations, the auxiliary solution does provide a descent direction and negative step lengths
only occur in the vicinity of the optimal solution. In addition, the first few iterations are most
"cost effective"; the solution is close to the optimal one after only a few iterations. This is a
desirable feature for practical network estimations where high convergence precision may not
be justified due to errors in the target matrix and the observed link flows.
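A compact sketch of Algorithm 2 on the two-link example of section 1.2 follows. It is a reading of the steps above under stated assumptions, not the authors' code: U and W are scalar weights, the least squares objective is quadratic so a single Newton step gives the exact minimiser along the linearised path, and the helper names are hypothetical.

```python
def ue_v2(t):
    """UE flow on link 2, equation (6b), clamped at zero."""
    return max(0.0, t / 2.0 - 625.0)


def algorithm2(u, w, t_hat=2000.0, v_hat=620.0, eps=1e-9, max_iter=50):
    t = t_hat                                   # Step 0
    v2 = ue_v2(t)
    for n in range(1, max_iter + 1):
        p = v2 / t                              # current link choice proportion
        # Step 1: auxiliary matrix, least squares with v2 = p * t.
        t_star = (u * t_hat + w * p * v_hat) / (u + w * p * p)
        # Step 2: UE assignment of the auxiliary matrix.
        v2_star = ue_v2(t_star)
        dt, dv = t_star - t, v2_star - v2
        # Step 3: minimise ZME(t + b*dt, v2 + b*dv) over b.
        d1 = 2.0 * u * (t - t_hat) * dt + 2.0 * w * (v2 - v_hat) * dv
        d2 = 2.0 * u * dt * dt + 2.0 * w * dv * dv
        if d2 == 0.0:
            return t, n
        beta = -d1 / d2                         # exact for a quadratic objective
        # Step 4: new matrix; Step 5: extra UE assignment at the new matrix.
        t_new = t + beta * dt
        v2 = ue_v2(t_new)
        converged = abs(t_new - t) < eps        # Step 6
        t = t_new
        if converged:
            return t, n
    return t, max_iter


print(algorithm2(1.0, 1.0))
print(algorithm2(1e-4, 0.9999))
```

Because the UE map (6b) is linear for t > 1250, the linearisation is exact here and the first step already lands on the bi-level solution; the second iteration returns a step length of essentially zero, in line with the remark that the step length vanishes at convergence.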
The direction finding method in the two proposed algorithms is similar to that in the alternate
algorithm. However, the two proposed algorithms are different in that the step lengths at each
iteration are optimised rather than fixed. This will overcome the erratic nature of the
convergence in the alternate algorithm as well as the slowness of the convergence in the MSA-
like algorithm. The calculation of the optimal step lengths is straightforward and does not
require considerable computational effort. Another feature in the proposed algorithms is that
algorithm 2 needs an extra UE assignment at the end of each iteration. This is necessitated by
the bi-level nature of the problem: the UE condition must be satisfied at every solution.
3. TEST OF THE ALGORITHMS

3.1. The Two-link Network Revisited
Here we test the convergence of algorithm 1 and the alternate algorithm on the two-link
network for different values of weights, U and W. It was found that as U decreases and W
increases, the convergence of the alternate algorithm becomes poor, and eventually it diverges.
Figure 1 shows a comparison of the two algorithms. The alternate procedure diverges in this
example and algorithm 1 converges to the mutually consistent solution, 2489.6066, found
analytically by the method mentioned in section 1.2. It is worth mentioning that algorithm 2
converges in this case, too, to the bi-level solution, 2489.8041, found analytically.
[Figure 1 plot; legend entries: Alternate algorithm, Algorithm 1; horizontal axis: Iteration]
Figure 1. Test on the two-link network, with U=0.0001, W=0.9999.
3.2. A Grid Network
A grid network shown in Figure 2 is used for testing the two algorithms. The network has 9
nodes and 24 links. There are 5 centroids (nodes 1, 2, 3, 4, and 5) and 12 O-D pairs. The true
trip matrix is supposed to be known and the assumed matrix is shown in Table 3. Assigning the
true matrix to the network by UE assignment gives the true link flows. The target matrix and
the observed link flows are generated by (Yang et al., 1992)
$$\hat{t}_i = t_i^{+}\,(1.0 - C_{vod}\,\xi_i)$$
$$\hat{v}_a = v_a^{+}\,(1.0 - C_{vlk}\,\zeta_a)$$
where $t_i^{+}$ and $v_a^{+}$ are the elements of the true matrix and link flows, $\xi_i$ and $\zeta_a$ are randomly
generated N(0,1) variables, and $C_{vod}$ and $C_{vlk}$ are the coefficients of variation reflecting the
random variations of the target matrix and observation errors in link flows respectively. The
variance-covariance matrices, U and W, are assumed to be diagonal matrices with the variances
(Yang et al., 1992)
$$\mathrm{Var}(\hat{t}_i) = (C_{vod}\,t_i^{+})^{2}$$
$$\mathrm{Var}(\hat{v}_a) = (C_{vlk}\,v_a^{+})^{2}$$
The BPR (Bureau of Public Roads) link performance function
$$c_a(v_a) = c_a(0)\left[1 + \alpha\,(v_a/q_a)^{\gamma}\right]$$
is used with $\alpha = 0.15$ and $\gamma = 4$. The uncongested link costs [ca(0)] and link capacity [qa] are listed
in Table 4.
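The test set-up described above can be sketched as follows. This is an illustration only: it assumes the standard BPR form, a multiplicative perturbation "true value times (1 - Cv times a standard normal draw)", and variances equal to (Cv times true value) squared. The function names and the random seed are hypothetical.

```python
import random


def bpr_cost(v, c0, q, alpha=0.15, gamma=4.0):
    """BPR link cost for flow v, free-flow cost c0 and capacity q."""
    return c0 * (1.0 + alpha * (v / q) ** gamma)


def perturb(true_values, cv, rng):
    """Return (perturbed values, assumed variances) for coefficient of variation cv."""
    observed = [x * (1.0 - cv * rng.gauss(0.0, 1.0)) for x in true_values]
    variances = [(cv * x) ** 2 for x in true_values]
    return observed, variances


# True O-D flows of the grid network, in the order of Table 3.
TRUE_MATRIX = [1500, 500, 1000, 1000, 1500, 500,
               1000, 1000, 600, 900, 600, 900]

rng = random.Random(0)            # arbitrary seed, for repeatability only
target, u_diag = perturb(TRUE_MATRIX, 0.2, rng)

print([round(x) for x in target])
print(bpr_cost(1200.0, 10.0, 1200.0))   # cost at capacity on a 10-unit link
```

The list `u_diag` would form the diagonal of U; the diagonal of W is obtained in the same way from the true link flows on the counted links.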
Results from the case with Cvod = 0.2 and Cvlk = 0.1 are shown in Figure 3, where it can be seen
that the two proposed algorithms converge quickly and that algorithm 1 converges slightly
faster than the alternate procedure, bearing in mind that the alternate algorithm does not always
converge. It can also be seen that the value of the objective function of the matrix estimation
problem is lower at the bi-level solution (algorithm 2) than that at the mutually consistent
solution (algorithm 1 and the alternate algorithm), as stated in section 1.1. Calculations with
different values of Cvod and Cvlk have been made and similar results found. The computational
efficiency of the algorithms will be discussed in the next sub-section.
Figure 2. The grid network. All links are two-directional.

Table 3. The true trip matrix for the grid network.

O-D pair No.  Centroids  ti+    O-D pair No.  Centroids  ti+
1             1->3       1500   7             4->2       1000
2             1->5       500    8             4->5       1000
3             2->4       1000   9             5->1       600
4             2->5       1000   10            5->2       900
5             3->1       1500   11            5->3       600
6             3->5       500    12            5->4       900
Table 4. Uncongested link travel costs and link capacities on the grid network

Link No.  Nodes  ca(0)  qa     Link No.  Nodes  ca(0)  qa
1         1->6   15     1800   13        6->1   15     1800
2         1->9   10     1800   14        6->2   15     1800
3         2->6   15     1800   15*       6->5   10     1200
4         2->7   10     1800   16        7->2   10     1800
5         3->7   10     1800   17        7->3   10     1800
6         3->8   15     1800   18*       7->5   15     1200
7         4->8   15     1800   19        8->3   15     1800
8         4->9   10     1800   20        8->4   15     1800
9*        5->6   10     1200   21*       8->5   10     1200
10*       5->7   15     1200   22        9->1   10     1800
11*       5->8   10     1200   23        9->4   10     1800
12*       5->9   15     1200   24*       9->5   15     1200
Note: *links with observed flows
Figure 3. Matrix estimation on the grid network, with Cvod = 0.2 and Cvlk= 0.1. The
parallelograms denote the alternate algorithm; the circles algorithm 1; and the squares
algorithm 2.
3.3. The Sioux Falls Network
The Sioux Falls network has been widely used for testing (equilibrium) assignment models.
Information in the data set includes network characteristics (link-node topology and the
parameters in the link performance functions) and a demand trip matrix. The network has 24
nodes, 76 links, and 528 O-D pairs. The demand matrix is treated as the true matrix; the target
matrix and the observed link flows are generated by the same methods as those used for the
grid network in section 3.2. The performance of the two algorithms is shown in Figures 4(a)-
4(c), where the two algorithms are compared with the alternate algorithm for three
combinations of Cvod and Cvlk. It can be seen that all three algorithms converge in just a few
iterations. Shown in Table 5 are the number of iterations as well as the c.p.u. time needed for
the algorithms to converge at a given error tolerance. These are used to judge the efficiency of
the algorithms. It can be seen that the two proposed algorithms are slightly more efficient than
the alternate algorithm.
In general, the computation time of the algorithms also depends on the size of a network. For
the Sioux Falls network, each of the above calculations takes about 2-3 minutes to converge on
a 300MHz Pentium II machine with 64.0 Mb RAM. For the grid network, on the other hand, it
takes only 5-10 seconds for the iterations to converge at the same error tolerance on the same
machine. The main computational burden in the two proposed algorithms is the solution of the
ME problem and the UE assignment problem; the former involves a matrix inversion and the
latter is itself an iterative process. If there is a large number of O-D pairs, such as in the Sioux
Falls network, the solution of ME problem contributes more significantly to the c.p.u. time. On
the other hand, if there are a lot more links than O-D pairs, such as in the grid network, UE
assignment contributes more significantly to c.p.u. time. In both algorithm 1 and algorithm 2,
UE assignment can be made more efficient by starting with the latest link flows rather than the
free-flows. For example, in Step 2 in both algorithms, the initial link flows for UE assignment
can be set to be $v = P^{(n)} t^{*}$.
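The warm start mentioned above can be illustrated with a small Frank-Wolfe routine. This is a sketch for the special case of parallel links joining one O-D pair, not a general network code: the function name, the `v_init` argument and the bisection line search are assumptions made for the example.

```python
def frank_wolfe(demand, costs, v_init=None, gap_tol=1e-10, max_iter=100):
    """UE flows on parallel links; costs is a list of functions c_a(v_a).

    v_init, if given, is a warm-start flow pattern that sums to demand.
    Returns (flows, number of Frank-Wolfe steps taken).
    """
    n = len(costs)
    if v_init is None:
        # Cold start: all-or-nothing assignment at free-flow costs.
        free = [c(0.0) for c in costs]
        v = [0.0] * n
        v[free.index(min(free))] = demand
    else:
        v = list(v_init)
    for it in range(max_iter):
        c = [costs[a](v[a]) for a in range(n)]
        y = [0.0] * n
        y[c.index(min(c))] = demand              # all-or-nothing target
        d = [y[a] - v[a] for a in range(n)]
        gap = -sum(c[a] * d[a] for a in range(n))
        if gap <= gap_tol * sum(c[a] * v[a] for a in range(n)):
            return v, it

        def slope(lam):                          # derivative of ZUE along d
            return sum(costs[a](v[a] + lam * d[a]) * d[a] for a in range(n))

        if slope(1.0) <= 0.0:
            lam = 1.0
        else:
            lo, hi = 0.0, 1.0
            for _ in range(60):                  # bisection on the slope
                mid = 0.5 * (lo + hi)
                if slope(mid) > 0.0:
                    hi = mid
                else:
                    lo = mid
            lam = 0.5 * (lo + hi)
        v = [v[a] + lam * d[a] for a in range(n)]
    return v, max_iter


# Two-link example of section 1.2.
links = [lambda x: 5.0 + x / 1000.0, lambda x: 6.25 + x / 1000.0]
cold, _ = frank_wolfe(2000.0, links)
t_star = 2044.3774
warm, _ = frank_wolfe(t_star, links, v_init=[0.8125 * t_star, 0.1875 * t_star])
print([round(x, 4) for x in cold], [round(x, 4) for x in warm])
```

On two parallel links a single exact line search already equalises the costs, so the warm start saves nothing here; the saving referred to in the text arises on networks where Frank-Wolfe needs many steps from a free-flow start.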
Another factor that may affect the speed of convergence of the proposed algorithms is the
congestion level on the network. The number of iterations for the Frank-Wolfe algorithm to
converge increases with the congestion level (see, for example, Sheffi, 1985), although the
congestion level may have little effect on matrix estimation. Here, different levels of
congestion are tested on the grid network which has more links than the number of O-D pairs
and so the UE assignment forms the major part of the c.p.u. time at each iteration. It was found
that the number of iterations for the two proposed algorithms to converge is virtually the same
for different levels of congestion (measured by the average of the ratio of assigned link flows
over link capacity), ranging from 0.5 to 1.16, although the c.p.u. time increases slightly as the
level of congestion increases.
Figure 4. Test on the Sioux Falls network for three combinations of Cvod and Cvlk, with error
tolerance ε=0.001. The parallelograms denote the alternate algorithm; the circles algorithm 1;
and the squares algorithm 2. (a) Cvod=0.05 and Cvlk=0.1; (b) Cvod=0.1 and Cvlk=0.2; (c) Cvod=0.15
and Cvlk=0.3.
Table 5. Efficiency of the algorithms on the Sioux Falls network with an error tolerance of
ε=0.001.

              Alternate algorithm   Algorithm 1         Algorithm 2
Cvlk   Cvod   N   c.p.u. (sec.)     N   c.p.u. (sec.)   N   c.p.u. (sec.)
0.05   0.1    5   154.29            4   130.60          3   120.93
0.1    0.2    6   180.11            5   160.16          3   127.69
0.15   0.3    6   207.03            5   181.87          4   195.11
4. APPLICATION OF THE TWO ALGORITHMS TO THE SIGNAL OPTIMISATION PROBLEM
4.1. The Problem Formulation and the Previous Algorithms
Conventional methods for traffic signal optimisation assume fixed traffic flows, whereas the
trip matrix is assigned to the network assuming fixed signal settings. This separation of traffic
control from assignment may lead to inconsistency between traffic flows and signal settings
because they are in general inter-dependent. The inter-dependence tends to be more serious in
congested networks. The inconsistency may be eliminated by combining signal optimisation
with an equilibrium assignment. Here, we shall use UE assignment. The combined signal
optimisation and UE assignment problem is one in which a traffic engineer tries to optimise the
performance of signals while road users choose their routes in a UE manner. This combined
problem is mathematically similar to that of the combined matrix estimation and UE
assignment (Note that the trip matrix is assumed to be fixed in the signal optimisation
problem).
The most commonly used policy for signal optimisation is to minimise the total journey costs
in the network:
$$\text{SO:}\quad \min_{s}\; Z_{SO}(s, v) = \sum_{a \in A} v_a\,c_a(v_a, s_a)$$

subject to $s_a^{\max} \ge s_a \ge s_a^{\min}$, $a \in A$; $\sum_{a \in A_j} s_a = 1$, $A_j \subset A$,
where "SO" stands for signal optimisation; $v_a$ is the flow on link a; $c_a$ is the cost on link a; $s_a$ is
the green split for link a, $s = (\ldots, s_a, \ldots)$; $s_a^{\max}$ and $s_a^{\min}$ are maximum and minimum allowable
green split for link a, $s_a^{\min} > 0$, $s_a^{\max} < 1$; $A_j$ is the set of links heading for the j'th signal controlled
intersection. If link a is not controlled by a signal, then $s_a^{\max}$, $s_a$, and $s_a^{\min}$ will all be equal to 1.
The decision variables in this problem are the signal settings s while the set of link flows v is
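the output from a UE assignment problem. Given a signal setting, s, the UE assignment problem
may be written as
$$\text{UE:}\quad \min_{v}\; Z_{UE}(s, v) = \sum_{a \in A} \int_{0}^{v_a} c_a(x, s_a)\,dx$$

subject to $v_a = \sum_{r \in R} f_r\,\delta_{ar}$, $f_r \ge 0$, $\sum_{r \in R_i} f_r = t_i$.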
In this section, we will use V(s) to denote UE link flows for given s. As in the combined matrix
estimation and UE assignment problem, here we may also consider two types of problem and
solution. The Nash non-cooperative game can be written as
$$\min_{s \in D_s}\; Z_{SO}(s, v) \qquad (11a)$$
$$\min_{v \in D_v}\; Z_{UE}(s, v) \qquad (11b)$$
where $D_s$ and $D_v$ are the feasible regions for s and v respectively. And the mutually consistent
solution $(s^{MC}, v^{MC})$ must be such that $s^{MC}$ solves (11a) and $v^{MC}$ solves (11b). The bi-level
programming formulation of the problem is
$$\min_{s \in D_s}\; Z_{SO}(s, V(s))$$
where V(s) is the lower-level UE assignment problem. An optimal solution, $(s^{BL}, V(s^{BL}))$, to the
bi-level problem is characterised by
$$Z_{SO}(s^{BL}, V(s^{BL})) \le Z_{SO}(s, V(s)), \quad \forall s \in D_s$$
The comparison or the contrast of the two types of solutions here is the same as that in the
combined matrix estimation and UE assignment problem. However, it is important to note that
in the Nash non-cooperative game, "non-cooperation" implies that the traffic engineer does not
know or consider the route choice behaviour of drivers. By definition, among all the solutions
that satisfy UE conditions, the bi-level solution has the minimum SO objective function value.
In the SO problem, the objective function is normally some performance measure of the system
such as total cost in the network. Therefore, the total cost in the bi-level solution is smaller than
or equal to that of the mutually consistent solution. In this sense, the bi-level solution is more
advantageous than the mutually consistent solution.
An iterative algorithm in which the SO and UE problems are solved alternately has been used
for the solution of the combined SO and UE problem (Van Vuren and Van Vliet, 1992; Smith
and Van Vuren, 1993). As in the matrix estimation problem, this procedure may converge to
the mutually consistent solution but convergence is not guaranteed (Fisk, 1984, 1988). The
combined signal optimisation problem was formulated as a bi-level optimisation problem
in the 1980s, but an efficient solution algorithm has been long awaited. We shall mention
here only three algorithms which are different from the common alternate procedure. Sheffi
and Powell (1983) proposed a feasible descent direction method. The direction is found by the
gradient projection method, where the gradient is calculated by numerical differentiation. The
procedure thus requires as many UE assignments as the number of signal control parameters in
the network. In addition, the line search along the descent direction also requires repeated UE
assignment. Alternatively, a fixed, ever-reducing step length may be used, as in the MSA algorithm, but
the convergence may be very slow, as has been mentioned. Sheffi and Powell also proposed a
heuristic algorithm. It is assumed that a small variation in a signal setting parameter on a link
only causes the UE flow on that link to vary, and not UE flows on any other link. In this way,
only one UE assignment is needed for determining an approximate feasible descent direction.
However, the problem of repeated UE assignment in the line search remains. In Heydecker and
Khoo (1990), on the other hand, no descent directions are determined explicitly. The searches
at each iteration are made along a set of pre-determined directions which span the feasible
region. For each direction, the UE assignment map V(s) is approximated by a linear
relationship fitted to a number of UE flow patterns, such as 5 points, for given signal settings.
Yang and Yagar (1995) developed another feasible descent direction method. At each iteration,
both the upper-level objective function and its constraints are linearised at the current solution,
based on the partial derivatives of UE link flows with respect to signal control parameters
obtained by the sensitivity analysis method (Tobin and Friesz, 1988). This results in a linear
programming problem, the solution of which gives a direction that provides the largest
reduction of the objective function. This direction-finding algorithm requires only one UE
assignment at each iteration. The step lengths may either be optimised or predetermined, which
poses the same problem as the two algorithms by Sheffi and Powell (1983).
4.2. The Proposed Algorithms
The algorithms proposed here are very much the same as that for trip matrix estimation, with
the trip matrix estimation being replaced by signal optimisation with fixed link flows. The two
algorithms for the two types of solutions are as follows.
The mutually consistent solution: Algorithm 3
Step 0: Initialise $s^{(0)}$, $v^{(0)}$, where $v^{(0)} = V(s^{(0)})$; set n=0.
Step 1: Determine $s^{*}$ by solving the SO problem, with $v = v^{(n)}$.
Step 2: Find $V(s^{*})$ for $s^{*}$ by UE assignment.
Step 3: Find optimal step lengths, $\alpha^{*}$ and $\beta^{*}$.
Step 4: Set $s^{(n+1)} = s^{(n)} + \alpha^{*}(s^{*} - s^{(n)})$; $v^{(n+1)} = v^{(n)} + \beta^{*}(v^{*} - v^{(n)})$.
Step 5: If the convergence criterion is met, stop; otherwise, set n:=n+1 and go to step 1.
The bi-level solution: Algorithm 4
Step 0: Initialise $s^{(0)}$, $v^{(0)}$, where $v^{(0)} = V(s^{(0)})$; set n=0.
Step 1: Determine $s^{*}$ by solving the SO problem, with $v = v^{(n)}$.
Step 2: Find $V(s^{*})$ for $s^{*}$ by UE assignment.
Step 3: Find $\beta^{*}$ which minimises $Z_{SO}(s(\beta), v(\beta))$.
Step 4: Set $s^{(n+1)} = s^{(n)} + \beta^{*}(s^{*} - s^{(n)})$.
Step 5: Find $V(s^{(n+1)})$ by UE assignment.
Step 6: If the convergence criterion is met, stop; otherwise, set n:=n+1 and go to step 1.
In these algorithms, the UE link flows for given signal settings can be found by the Frank-
Wolfe algorithm. The SO problem is reduced to several sub-problems of determining the
optimal green split for each signal controlled intersection. Each of them may be solved by any
standard one-dimensional optimisation algorithm, such as the Newton method. The optimal
step length(s) may be found in the same way as in the matrix estimation problem.
4.3. A Test Example
We shall use a simple three-link network shown in Figure 5 to demonstrate the solutions and
the application of the two algorithms. The network has two O-D pairs, with demand $t_1 = t_2 =$
100. O-D pair 1 is connected by link 1 and link 2. O-D pair 2 is connected by link 3. There is a
signal at the intersection of links 1 and 3. The BPR (Bureau of Public Roads) link performance
function is used with $\alpha = \gamma = 1$. The uncongested link costs [ca(0)] and link capacities [qa] are
$$[c_a(0)] = [1 \;\; 2 \;\; 1]; \qquad [q_a] = [100 \;\; 80 \;\; 100]$$
The two types of solution are summarised in Table 6, where the mutually consistent solution
was found by the alternate algorithm, which converges in this case, and the bi-level solution
was found by exhaustive search over the possible signal settings, with an increment size of
0.002. However, it should be pointed out that neither of these methods is applicable to
general networks for obvious reasons. They are used here only to confirm that the
proposed algorithms converge to the right solutions. Figure 6 shows the convergence of the two
proposed algorithms applied to this example. It can be seen that, starting from different initial
conditions, the algorithms converge quickly to the same optimal solutions as those shown in
Table 6. In this example, link 2 is twice as long as link 1, although its capacity is comparable to
that of link 1 (considering signal control). More drivers would naturally use link 1 at low
demand. However, if the signal optimiser knows drivers' route choice behaviour, as in the bi-
level problem, he can reduce the green split on link 1 and thus divert more traffic to link 2. The
total cost, Zso, in the bi-level solution is therefore lower than that in the mutually consistent
solution.
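The two reference computations used for Table 6 (alternating between the two sub-problems, and an exhaustive search over the green split) can be sketched for this small network as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the additive cost form $c_a = c_a(0) + v_a/(s_a q_a)$ with link 2 unsignalised, an inference that reproduces the mutually consistent row of Table 6 but is not stated in the text, and all function names are hypothetical. No claim is made that the grid search below matches the bi-level row digit for digit.

```python
"""Three-link signal example: alternate iteration vs. grid search (illustrative sketch)."""

T1 = T2 = 100.0                    # O-D demands from the text
C0 = {1: 1.0, 2: 2.0, 3: 1.0}      # uncongested link costs from the text
Q = {1: 100.0, 2: 80.0, 3: 100.0}  # link capacities from the text


def ue_flows(s1):
    """UE split of O-D pair 1 between links 1 and 2 for green split s1 on link 1.

    ASSUMPTION: c_a = c_a(0) + v_a / (s_a * q_a), with s_2 = 1 (no signal on link 2).
    Equal route costs then give a linear equation in v1.
    """
    a1 = 1.0 / (s1 * Q[1])
    a2 = 1.0 / Q[2]
    v1 = (C0[2] - C0[1] + a2 * T1) / (a1 + a2)
    v1 = min(max(v1, 0.0), T1)     # keep the flow within [0, demand]
    return v1, T1 - v1


def total_cost(s1, v1, v2):
    """System cost Z_SO: sum of v_a * c_a over the three links (s3 = 1 - s1)."""
    c1 = C0[1] + v1 / (s1 * Q[1])
    c2 = C0[2] + v2 / Q[2]
    c3 = C0[3] + T2 / ((1.0 - s1) * Q[3])
    return v1 * c1 + v2 * c2 + T2 * c3


def so_split(v1):
    """Green split minimising Z_SO for fixed flows.

    Under the assumed costs and q_1 = q_3, the optimum satisfies s1 / s3 = v1 / v3.
    """
    return v1 / (v1 + T2)


def mutually_consistent(s1=0.5, tol=1e-10, max_iter=1000):
    """Alternate between UE assignment and SO signal setting until s1 stops changing."""
    for _ in range(max_iter):
        v1, _ = ue_flows(s1)
        s_new = so_split(v1)
        if abs(s_new - s1) < tol:
            break
        s1 = s_new
    v1, v2 = ue_flows(s1)
    return s1, v1, v2, total_cost(s1, v1, v2)


def bilevel_grid(step=0.002):
    """Exhaustive search over s1, evaluating Z_SO at the UE flows of each setting."""
    best = None
    for i in range(1, int(round(1.0 / step))):
        s1 = i * step
        v1, v2 = ue_flows(s1)
        z = total_cost(s1, v1, v2)
        if best is None or z < best[3]:
            best = (s1, v1, v2, z)
    return best


if __name__ == "__main__":
    print("mutually consistent:", mutually_consistent())
    print("bi-level (grid):    ", bilevel_grid())
```

The grid search evaluates the system cost only at flows that are already in user equilibrium for each candidate setting, which is why it can never do worse than the mutually consistent point it passes through.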
5. SUMMARY AND CONCLUSIONS
The combined trip matrix estimation and UE assignment problem has been considered. Two
types of solutions have been discussed: one is the mutually consistent solution at which the two
sub-problems are solved simultaneously, and the other the solution to the bi-level programming
problem in which matrix estimation is the upper-level problem and UE assignment the lower-
level problem. Two new algorithms have been described, one for each type of solution. The
two algorithms were tested in a simple two-link network, a 3x3 grid network with 24 links and
12 O-D pairs, and the Sioux Falls network with 76 links and 528 O-D pairs. In each case, the
algorithms were efficient and convergent. The algorithms developed have also been applied to
the combined signal optimisation and UE assignment problem and some preliminary test
results are encouraging.
For the bi-level matrix estimation problem, the upper-level objective function is the weighted
sum of the distance between the estimated and the target matrix and the distance between the
estimated and the observed link flows. Therefore, the bi-level solution is better than the
mutually consistent solution only if the target matrix and the observed link flows are close to
the true ones. For the combined signal optimisation and UE assignment problem, the bi-level
solution is clearly better than the mutually consistent solution because the total cost (or some
other measure such as queues and number of stops) in the former is no larger than that in the
latter. For both matrix estimation and signal optimisation problems, the advantage of the
mutually consistent solution is that it is easier to identify. For example, in both the alternate
algorithm and the algorithm suggested here, one can tell if the iteration is converging to the
optimal solution by observing if the auxiliary solution is approaching the current solution.
Currently work is being carried out to apply the two algorithms to the signal optimisation
problem in more realistic networks, using more suitable cost functions than the BPR function
so as to consider explicitly the delay caused by a traffic signal. The algorithms are also being
extended by replacing the UE assignment model with a Stochastic User Equilibrium (SUE)
assignment model in both the matrix estimation problem and the signal optimisation problem.
This work will be reported in the near future.
ACKNOWLEDGEMENT
The research is supported by a research grant from the UK Engineering and Physical Sciences
Research Council. The authors wish to thank Dirk Van Vliet of the Institute for Transport
Studies, Leeds University, for helpful discussions and comments on an earlier draft of the
paper, and an anonymous referee for constructive comments on the paper.
Figure 5. The three-link network.
Table 6. Solutions of the signal optimisation problem on the three-link network

                               s_1     s_3     v_1      v_2      Z_SO      Z_UE
Mutually consistent solution   0.357   0.643   55.556   44.444   511.111   377.778
Bi-level solution              0.240   0.760   41.507   58.493   504.654   381.574
[Figure 6: two convergence plots, panels (a) and (b); horizontal axis: Iteration, 0 to 10.]
Figure 6. Signal optimisation on the three-link network from different initial conditions: (a)
mutually consistent solution, (b) bi-level solution.
REFERENCES
Carvalho, M.D.S. (1996). Algorithms for improving the convergence of trip matrix estimation
and assignment models, PhD Thesis, Institute for Transport Studies, University of
Leeds.
Cascetta, E. (1984). Estimation of trip matrices from traffic counts and survey data: a
generalised least squares estimator. Transportation Research, 18B (4/5), 289-299.
Cascetta E. and S. Nguyen (1988). A unified framework for estimating or updating
origin/destination matrices from traffic counts. Transportation Research, 22B (6), 437-
455.
Fisk, C. S. (1984). Game theory and transportation systems modelling. Transportation
Research, 18B (4/5), 301-313.
Fisk, C. S. (1988). On combining maximum entropy trip matrix estimation with user optimal
assignment. Transportation Research, 22B (1), 69-79.
Hall, M. D., D. Van Vliet, and L. G. Willumsen (1980). SATURN: A simulation assignment
model for the evaluation of traffic management schemes. Traffic Engineering and
Control, 21 (4), 168-176.
Heydecker, B. G. and T. K. Khoo (1990). The equilibrium network design problem. In:
Proceedings of AIRO '90 Conference on Models and Methods for Decision Support,
Sorrento, pp 587-602.
Maher, M. J. (1983). Inferences on trip matrices from observations on link volumes, A
Bayesian statistical approach. Transportation Research, 17B, 435-447.
Maher, M. J. and P. C. Hughes (1998). New algorithms for the stochastic user equilibrium
assignment problem with elastic demand. Sixth Meeting of EURO Working Group on
Transportation, Gothenburg, 9-11th September, 1998.
Sheffi, Y. (1985). Urban Transportation networks: Equilibrium Analysis with Mathematical
Programming Methods. Prentice-Hall, Englewood Cliffs, New Jersey.
Sheffi, Y. and W. B. Powell (1983). Optimal signal setting over transportation networks.
Transportation Engineering, 109 (6), 824-839.
Spiess, H. (1987). A maximum likelihood model for estimating origin-destination matrices.
Transportation Research, 24B, 395-412.
Smith M. J. and T. Van Vuren (1993). Traffic equilibrium with responsive traffic control.
Transportation Science, 27 (2), 118-132.
Tobin R. L. and T. L. Friesz (1988). Sensitivity analysis for equilibrium network flows.
Transportation Science, 22 (4), 242-250.
Van Vuren T. and D. Van Vliet (1992). Route Choice and Signal Control. Athenaeum Press
Ltd., Newcastle upon Tyne.
Van Zuylen J. H. and L. G. Willumsen (1980). The most likely trip matrix estimated from
traffic counts. Transportation Research, 14B (3), 281-293.
Yang, H. (1995). Heuristic algorithms for the bi-level origin-destination matrix estimation
problem. Transportation Research, 29B (3), 1-12.
Yang, H., T. Sasaki, Y. Iida, and Y. Asakura (1992). Estimation of origin-destination matrices
from link traffic counts on congested networks. Transportation Research, 26B (6), 417-
434.
Yang H. and S. Yagar (1995). Traffic assignment and signal control in saturated road networks.
Transportation Research, 29A (2), 125-139.
Combining Traffic Forecasts 471
COMBINING PREDICTIVE SCHEMES IN
SHORT-TERM TRAFFIC FORECASTING
Nour-Eddin El Faouzi
Laboratoire d'Ingénierie Circulation - Transport
Unité Mixte de Recherche INRETS - ENTPE
25, Avenue F. Mitterrand, F-69675 Bron Cedex
E-mail: nour-eddin.elfaouzi@inrets.fr
Abstract - The principal motivation for combining forecasts, which can be either class
labels (classification) or numerical values (regression), has been to avoid the a priori choice
of forecasting method by aggregating all the information that each forecasting model
embodies. In selecting the 'best' model, the forecaster often discards useful independent
evidence contained in the models that are rejected. Hence the methodology of combining
forecasts is founded upon the axiom of maximal information usage.
Short-term traffic prediction is an area where combining two or more predictions
is a promising technique that could directly improve forecast accuracy. This approach
may eventually help in specifying the underlying processes more appropriately and
thus in building better individual models.
This article deals with methods for combining forecasts that are potentially suitable for
short-term prediction, together with comparisons of their performance. The emphasis lies on
the application to short-term traffic flow prediction. Since the combination of predictors has,
for the most part, implicitly assumed a stationary underlying process, attention has been
focused on taking into account the effect of nonstationarity of the traffic flow process.
Keywords: Traffic forecasting, Combining forecasts, Weighted averaging of forecasts,
Forecasts evaluation, Cross-validation, Multicollinearity.
1. INTRODUCTION
Traffic flow prediction has received increasing attention in recent years, and different
techniques have been developed, mainly for traffic surveillance and control (e.g. Lesort, 1987;
Moorthy and Ratcliffe, 1988). Many traffic flow prediction schemes have been obtained by
means of classic autoregressive models, especially time series techniques (e.g. Stephanedes
et al., 1981; Gafarian et al., 1977; Ahmed and Cook, 1979).
Some authors have tackled this problem within a Bayesian framework (e.g. Harrison
and Stevens, 1971). Others have used Kalman filtering (e.g. Okutani and
Stephanedes, 1984) or neural networks and system identification (e.g. Vythoulkas, 1993),
and more recently a nonparametric paradigm was adopted via a kernel prediction technique
(see El Faouzi, 1996).
None of these proposals achieves highly accurate predictions except in some
special situations (for some network configurations and/or with high detector coverage).
This is due to some extent to traffic dynamics, which cannot be formalized by a single
procedure. In such a situation, the basic problem is traditionally viewed as one of
identifying and subsequently choosing the method that produces the best forecasts (some
prediction techniques, such as the Box and Jenkins methodology, are more sensitive to this
identification problem).
Therefore, in the context of traffic operations where highly accurate forecasts are
needed, one can obtain different forecasts of the same quantity by two or more different
methods (the underlying assumption here is that the different predictors measure the same
quantity and/or various aspects of the same thing). The set of available methods may
consist of alternative models, different forecasters, or a mixture of models and forecasters.
Often, the approach used is to find the single 'best' predictor in some sense (most accurate
values, most appropriate model of the underlying process, most cost-effective, etc.)
among the available forecasting methods. Another approach consists of combining these
individual forecasts.
The idea of combining estimators instead of selecting the single 'best' model is well
known in statistics and has generated intensive theoretical work. The earliest statistical
study of combining multiple estimates appears to have been that of Edgerton and Kolbe
(1936). Since the seminal article of Bates and Granger (1969), which showed that a linear
combination of several predictors from a single data set can outperform the individual
predictors, methodological and practical issues related to combining forecasts produced
by different methods have been investigated extensively in various contexts with notable
successes: for example, in weather forecasting (Fraedrich and Leslie, 1988;
Fraedrich and Smith, 1989), in macroeconomic problems (Clemen and Guerard, 1989), and in
electrical demand prediction (Bunn, 1987; Smith, 1989), to name but a few.
Recent work has shown that in the presence of several predictors, it is almost always
better to combine information from them rather than, say, choosing one, as has been
standard practice with cross-validation (e.g. Wolpert, 1992 and Breiman, 1996 for the single
data set configuration; El Faouzi and Lesort, 1995 and El Faouzi, 1997 for the heterogeneous
data sources one). Accuracy is increased if the competing methods are unstable, i.e. if small
perturbations in their training sets or in the parameters involved in their construction may
result in large changes in the resulting predictors. One of the more obvious and feasible
ideas for combining predictors is simply to average them, and some studies report good
results from this (e.g. Makridakis and Winkler, 1983). Others used weighted averages where
the weights are constrained to sum to one (e.g. Newbold and Granger, 1974; Nelson, 1972).
However, one can do slightly better than that by choosing optimal weights that take
into account the relative precision of the forecasts.
Since identifying the 'best' of competing models a priori is rarely feasible, and since the
selected model is seldom (if ever) optimal (Clemen and Winkler, 1986; Makridakis
and Winkler, 1983), considerable research attention has recently been focused upon the
utility of combining the model forecasts (for an extensive review we refer the reader to
Mahmoud, 1984).
This study provides a methodological framework to combine various forecasts of the
same quantity, following a review of combining predictive schemes potentially suitable for
short-term traffic forecasting. The structure of this article is as follows: a framework for
combining predictors is described; the development of forecast combination is formulated in
terms of a statistical approach; methods for combining forecasts are discussed; then an
empirical analysis involving short-term traffic flow prediction is presented with performance
comparisons.
2. THEORY OF COMBINING
The basic problem assumes that at a given time $t$ one is facing $\ell$ predicting models
$\varphi_{1t}, \varphi_{2t}, \dots, \varphi_{\ell t}$ for the same uncertain variable $y_{t+h}$,
with $h \ge 1$ a time horizon. Assume that each predictor $\varphi_{kt}(\cdot)$ is built on the
basis of a training sample available at time period $t$ and gives a forecast
$f_{kt} = \varphi_{kt}(x_t)$ of $y_{t+h}$, defining the $\ell \times 1$ vector $F_t$. Here $x_t$
may represent past data of $y_{t+h}$ and/or some explanatory variables.
The issue here is how to combine these different assessments into a single forecast so that the
reliability of the combined forecast increases. The most widely used combination paradigm
is of linear type. In that case, the combined model is obtained via a linear weighting vector
$w_t$ such that:
$$\pi_t = \sum_{k=1}^{\ell} w_{kt}\,\varphi_{kt} \qquad (1)$$
and the combined forecast is given by:
$$\pi_t(x_t) = w_t' F_t \qquad (2)$$
Several estimation techniques have been used in the choice of the weights $w_t$ in equation
(1). The majority of these are based on heuristics, the practical interest of which is limited
by the absence of a theoretical justification that would make it possible to evaluate their
degree of legitimacy.
The simplest approach is arithmetic averaging, where $w_{kt} = 1/\ell$, which does not
require information about the precision of the predictors or the correlation between their
errors.
However, this method is in general not appealing as it treats the forecasts as though
they are exchangeable, i.e. indistinguishable from one another. This may be a reasonable
assumption when the models have similar error variances that are known a priori. If one
has some reason to believe that the models are not exchangeable or that the models are
known to have different accuracies, then this information should be incorporated into the
combined forecast. In these situations combining methods based on the relative precision of
the forecast components are desirable.
2.1 Weights Schemes

A desirable statistical property of the combined forecast is that its reliability increases. In
other words, we aim to make the prediction error of the combined forecast as low as possible.
Note that expression (2) leads to a biased linear combined forecast unless:
(a) each element of $F_t$ is assumed to be unbiased: $E\{f_{kt}\} = y_{t+h}$ for $k = 1, \dots, \ell$

(b) and the combining weights sum to unity: $\sum_{k=1}^{\ell} w_{kt} = 1$.
However, in some cases, it is possible to achieve a biased combined forecast that has a
smaller error measure than the unbiased one. Indeed, the biased combined forecast may
be more desirable if its variance is significantly less than that of the unbiased one and if
the bias is not too high. In this case, a more meaningful set of weights may result and
then a more stable combined forecast will follow.
A common error measure is the mean squared error (MSE). If $y_t$ is some underlying
value that we want to predict using the data at hand and $\hat{y}_t$ some forecast of $y_t$, this
measure of the assessment is defined as $\mathrm{MSE} = E(y_t - \hat{y}_t)^2$. A familiar
bias-variance decomposition states¹ that:
$$\mathrm{MSE} = \text{Variance} + (\text{Bias})^2$$
It follows that this measure of the assessment penalizes an underestimation or
overestimation on the one hand (bias term) and a large variability on the other hand (variance
term). Thus the MSE also expresses a trade-off between a reduction in the variance and an
increase in the bias, and vice versa.
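The decomposition can be checked numerically on simulated forecasts. The sketch below is illustrative only: the target value, the forecaster's bias and its spread are made-up numbers, and the identity is verified on empirical moments (population variance, dividing by $n$), for which it holds exactly.

```python
import random
import statistics

random.seed(0)
y_true = 10.0  # fixed target value (illustrative, not from the study)

# A deliberately biased, noisy forecaster: mean 10.5, standard deviation 2.
forecasts = [random.gauss(10.5, 2.0) for _ in range(5000)]

mse = statistics.fmean((f - y_true) ** 2 for f in forecasts)
variance = statistics.pvariance(forecasts)   # population variance (divides by n)
bias = statistics.fmean(forecasts) - y_true

# The two printed numbers agree up to floating-point rounding.
print(mse, variance + bias ** 2)
```

A forecaster with a smaller spread but a slightly larger bias can therefore have the lower MSE, which is the trade-off described above.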
2.2 Constrained Combining Regression

When each element of $F_t$ is assumed to be unbiased, to guarantee that the combined forecast
stays within the class of linear unbiased predictors, the weights must be constrained to sum
to unity. In this situation, the underlying model is a constrained regression problem with
no intercept:
$$y_t = w_t' F_t + \varepsilon_t \quad \text{with} \quad 1_\ell' w_t = 1$$
where $1_\ell$ denotes an $\ell$-dimensional vector of ones.
So, the optimal weights $w_t^*$ are the minimizer of:
$$E(y_t - w_t' F_t)^2 \quad \text{subject to} \quad 1_\ell' w_t = 1 \qquad (3)$$
Let $w_t^- = (w_{1t}, w_{2t}, \dots, w_{\ell-1,t})'$; the constraint $1_\ell' w_t = 1$ may be
expressed as:
$$w_{\ell t} = (1 - 1_{\ell-1}' w_t^-)$$
and the constrained problem given by (3) can be written as an unconstrained homogeneous
regression problem:
$$e_{\ell t} = X_t w_t^- + \varepsilon_t \qquad (4)$$
where $e_{\ell t} = (y_t - f_{\ell t})$ and $X_t = (X_{1t}, X_{2t}, \dots, X_{\ell-1,t})$ with
$X_{jt} = (f_{jt} - f_{\ell t})$.
Then the Ordinary Least Squares (OLS) estimate of $w_t^-$ is given by:
$$\hat{w}_t^- = (X_t' X_t)^{-1} X_t' e_{\ell t} \qquad (5)$$
If $\Omega_t = X_t' X_t$ and $S_t = X_t' e_{\ell t}$, it follows that
$$\hat{w}_t^- = \Omega_t^{-1} S_t \qquad (6)$$
Note that $\Omega_t$ and $S_t$ represent the variance-covariance matrix of $X_t$ and the vector of
covariances of $e_{\ell t}$ with $X_t$ respectively.
¹ Indeed, we can write:
$$\mathrm{MSE} = E(y_t - \hat{y}_t)^2 = E(\hat{y}_t - E\hat{y}_t)^2 + (E\hat{y}_t - y_t)^2$$
The well-known theory of optimal estimators (see Rao, 1973) states that the solution
expressed in (6) is equivalent to the following one:
$$w_t^* = (1_\ell' \Sigma_t^{-1} 1_\ell)^{-1}\, \Sigma_t^{-1} 1_\ell \qquad (7)$$
where $\Sigma_t$ is a positive definite covariance matrix of the forecast errors
$e_{jt} = (y_t - f_{jt})$, $j = 1, \dots, \ell$, in time period $t-1$.
Assuming that the vector of forecast errors $e_t = (e_{jt})$ is normally distributed with zero
mean and covariance matrix $\Sigma_t$, it follows from the sampling properties of linear
estimators that the combined forecast defined by (1) and (7) is optimal in the sense of having
minimum forecast error variance. This is known as the minimum variance linear unbiased
estimator (MVLUE). The combined error variance will be no larger than that of the best
individual predictor in $F_t$, provided $\Sigma_t$ is known a priori.
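The minimum variance weights of (7) can be computed by solving one small linear system. The sketch below is a minimal illustration, not the paper's code: the helper names are hypothetical, the covariance matrix is made up, and a plain Gaussian elimination is used only to keep the example free of external libraries.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def min_variance_weights(sigma):
    """Return w = Sigma^{-1} 1 / (1' Sigma^{-1} 1) and the combined error variance."""
    ones = [1.0] * len(sigma)
    z = solve(sigma, ones)       # z = Sigma^{-1} 1
    total = sum(z)               # 1' Sigma^{-1} 1
    return [zi / total for zi in z], 1.0 / total


# Illustrative error covariance for two forecasts (made-up numbers).
sigma = [[1.0, 0.5],
         [0.5, 4.0]]
weights, combined_var = min_variance_weights(sigma)
print(weights, combined_var)
```

The second return value is the quantity $(1_\ell' \Sigma_t^{-1} 1_\ell)^{-1}$, so the claim that the combination is no worse than the best single forecast can be checked directly against the diagonal of the covariance matrix.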
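From a Bayesian point of view, this is equivalent to saying that the posterior distribution
of the forecast quantity $y_t$ is normal with mean and variance given by:
$$E\{y_t \mid F_t, \Sigma_t\} = (1_\ell' \Sigma_t^{-1} 1_\ell)^{-1}\, 1_\ell' \Sigma_t^{-1} F_t \qquad (8)$$
and
$$\mathrm{var}\{y_t \mid F_t, \Sigma_t\} = (1_\ell' \Sigma_t^{-1} 1_\ell)^{-1} \qquad (9)$$
Note that the first equation provides the forecast of $y_t$ and the second one measures
the accuracy of the combined forecast.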
A special case occurs when we have independent forecasts, i.e. where the covariance
matrix $\Sigma_t = \mathrm{diag}\{\sigma_{1t}^2, \sigma_{2t}^2, \dots, \sigma_{\ell t}^2\}$ with
$\sigma_{jt}^2 = \mathrm{var}(e_{jt})$ and $e_{jt} = (y_t - f_{jt})$, which yields weights
of the form:
$$w_{kt}^* = \frac{\sigma_{kt}^{-2}}{\sum_{j=1}^{\ell} \sigma_{jt}^{-2}}, \quad k = 1, \dots, \ell \qquad (10)$$
It follows from (10) that if the forecasting error variances are known, then the optimal
weights are proportional to the precision (inverse error variance) of each predictor. Note that
in this case the weights are always nonnegative, in contrast to the general MVLUE estimates.
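As a small worked illustration of (10), with made-up error variances that are not taken from the study, consider two independent forecasts with error variances 1 and 4. Each weight is the forecast's inverse variance divided by the sum of inverse variances:

```latex
\sigma_{1t}^2 = 1, \qquad \sigma_{2t}^2 = 4
\\[4pt]
w_{1t}^* = \frac{1/1}{1/1 + 1/4} = 0.8, \qquad
w_{2t}^* = \frac{1/4}{1/1 + 1/4} = 0.2
\\[4pt]
\mathrm{var}(\text{combined error}) = \left( \frac{1}{1} + \frac{1}{4} \right)^{-1} = 0.8
```

The combined error variance (0.8) is below the smaller of the two individual variances (1), and both weights are positive, as stated above for the independent case.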
One obvious generalization of minimum variance weights estimation consists of relaxing
the constraints imposed so far, namely that the weights sum to unity and that there is no
intercept. The purpose of the next section is to develop these issues further. A general
linear model is used and the optimal weights are expressed as the solution of an unconstrained
OLS problem.
2.3 Unconstrained Combining Regression

When unbiasedness is not imposed a priori, we consider the general linear model defined
by
$$y_t = w_{0t} + w_t' F_t + \varepsilon_t \qquad (11)$$
where $w_t = (w_{1t}, w_{2t}, \dots, w_{\ell t})'$ and $w_{0t}$ is an intercept for a given time
period $t$. If $w_{0t}$ is unconstrained, then regardless of whether or not the $\ell$ forecasts
are unbiased, their weights need not sum to one in order to ensure unbiasedness of the combined
forecast. The inclusion of this intercept in the model may eliminate any bias that may be
present in the component forecasts. It is convenient to consider $w_{0t}$ as a bias correction.
Once again, from the well-known results of the standard regression problem, the optimal
choice (in the MSE sense) of weights is given by the familiar expression:
$$w_t^* = (F_t^{+\prime} F_t^{+})^{-1} F_t^{+\prime} y_t \qquad (12)$$
where $F_t^{+} = (1 \mid F_t')$ and $w_t^* = (w_{0t}^*, w_{1t}^*, \dots, w_{\ell t}^*)$.
2.4 Practical Considerations

In real-world applications, the mean squared error (MSE) defined by $E(y_t - \pi_t(x_t))^2$ is
replaced by its empirical counterpart:
$$\frac{1}{T-1} \sum_{i=1}^{T-1} \big(y_i - \pi_T(x_i)\big)^2 \qquad (13)$$
where $\pi_T(x_i) = w_T' F_T$, or $\pi_T(x_i) = w_T' F_T^{+}$ in the case of biasedness, and
$\mathcal{L}_T = \{(y_i, x_i),\ i = 1, \dots, T-1\}$ denotes a reference or training sample used
to obtain the set of competing forecasts.
Firstly, note that if the same reference sample is used in equations (10) and (11) to
obtain the optimal weights, the solution to the problem outlined in (2) gives rise to
complications in estimating the weights and in determining their statistical properties. Such
an approach presents two major drawbacks:
(a) an overestimation problem

(b) and multicollinearity
2.4.1 Overestimation
Because the individual forecasts $(f_{kT})_{k=1,\dots,\ell}$ are all established on the basis of
the same training sample $\mathcal{L}_T$, the weights obtained overfit the data. Consequently,
this may lead to a considerable amount of over-optimism with respect to the predictive ability
of the combined forecast.
To overcome this limitation one can use a cross-validation version of the MSE criterion
(cf. El Faouzi, 1997). Cross-validation (CV), which is probably the most
popular automatic approach, uses the data set at hand to verify how well a particular
choice of weights performs in terms of residuals. More precisely, leave-one-out
cross-validation uses a criterion defined as:
$$\frac{1}{T-1} \sum_{i=1}^{T-1} \Big(y_i - \sum_{k=1}^{\ell} w_{kT}\, f_{kT}^{(-i)}\Big)^2 \qquad (14)$$
where $f_{kT}^{(-i)} = \varphi_k^{(-i)}(x_i)$ is the so-called leave-one-out version of $f_{kT}$.
That is, $f_{kT}^{(-i)}$ is constructed with all data points except the data point $(y_i, x_i)$.
Note that $\big(y_i - \sum_{k=1}^{\ell} w_{kT}\, f_{kT}^{(-i)}\big)$ can be viewed as a prediction
error for the new observation $x_i$, and $f_{kT}^{(-i)}$ is independent of $y_i$.
As an alternative to leave-one-out cross-validation, $v$-fold cross-validation is often
advocated. The motivation for this version is that the standard leave-one-out technique is
time consuming when the set of available data is large. So a much cheaper
version can be used: instead of leaving one data point out at each step, one can leave out, say,
one $v$-th of the data, where $v$ is an integer such that $v > 1$.
2.4.2 Multicollinearity
This drawback is inherent in the problem of combining more than one predictor using the
OLS estimation procedure. It is known that this latter technique works best when there is
no correlation between pairs of explanatory variables. Unfortunately, in the combining setting
the competing predictors all attempt to predict the same quantity, which inevitably brings with
it high correlation between forecasts. This is referred to as multicollinearity. The resulting
weights $(w_t)$ from the minimization of the least squares criterion are highly sensitive to
slight data modifications, which weakens the synthetic capacities of the
combined predictor $\pi_t$.
In the case of multicollinearity, several techniques can be used, and most of them
work by imposing certain constraints or prior information on the coefficients. If the
underlying process is stationary (see § 2.6), one solution to the problem of multicollinear
explanatory variables is simply to acquire more data to improve the efficiency of the
estimation and thereby improve prediction performance. However, in real-world situations
this is often not possible, and one can consider biased estimation if this approach might
lead to a substantial improvement in terms of estimation efficiency.
The biased approach involves minimizing the least squares error under a given set
of constraints on the regression coefficients. Thus the solutions $(w_T)$ will be those that
minimize the generic constrained problem:
$$\sum_{i=1}^{T-1} \Big(y_i - \sum_{k=1}^{\ell} w_{kT}\, f_{kT}^{(-i)}\Big)^2 \qquad (15)$$
subject to: $\mathcal{H}(w_{1T}, w_{2T}, \dots, w_{\ell T}) \ge \nu_T$
where $\nu_T$ denotes a real parameter and $\mathcal{H}(w_T)$ a given functional which can be
viewed as a regularization term. For example, when $\mathcal{H}(w_T)$ denotes the squared
Euclidean norm of $w_T$ (i.e. $\|w_T\|^2$) with $\nu_T = 1$, we obtain the Principal Component
Regression (PCR) (e.g. Massy, 1965). Another technique dealing with multicollinearity in
regression is the method of ridge regression, corresponding to
$\mathcal{H}(w_T) = \|w_T\|^2$ where $\nu_T$ is a parameter to be estimated (e.g. Hoerl
and Kennard, 1970).
2.5 Negativity problem

From the previous approaches it can be shown that when a set of competing forecasts are
significantly positively correlated, this may lead to negative weights, which are undesirable.
To see this, consider the case of two forecasts. According to the constrained combining
regression solution, the optimal combined forecast is given by:
$$\pi_t = w_{1t}^* f_{1t} + (1 - w_{1t}^*) f_{2t}$$
where $w_{1t}^* = (1 - \rho_t \phi_t)/(1 + \phi_t^2 - 2\rho_t \phi_t)$,
$\phi_t = (\sigma_{1t}/\sigma_{2t})$ and $\rho_t$ is the correlation of the forecast errors.
If $\phi_t > 1$ and $\phi_t^{-1} < \rho_t < 1$, then $w_{1t}^*$ is negative and $(1 - w_{1t}^*)$
is greater than one. To circumvent such problems a version of combining regression is used here
which imposes non-negativity constraints on the solution. Thus, following this strategy, the
solution $(w_T)$ is that which minimizes the least squares criterion in (15)
subject to
$$w_{kT} \ge 0, \quad k = 1, \dots, \ell.$$
As this minimization problem falls into the generic constrained problem designed for
redundant information in a regression analysis (see equation (15)), it may be expected
that the derived optimal weights will overcome not only the negative weights problem but
also multicollinearity. That is why, in the application part, only the nonnegative weights
approach is chosen to deal with the two defects mentioned.
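The sign condition for the two-forecast weight can be explored with a few lines of code. This is a hedged sketch: `two_forecast_weight` simply evaluates the formula for the optimal weight given above, while `clipped_weight` is a simplification that assumes the weights are still required to sum to one, in which case the non-negative optimum of this one-dimensional convex problem is the unconstrained weight clipped to [0, 1]. Both function names are hypothetical, and the clipping is not necessarily the constrained procedure used in the paper.

```python
def two_forecast_weight(rho, phi):
    """w1* = (1 - rho*phi) / (1 + phi**2 - 2*rho*phi), with phi = sigma_1 / sigma_2."""
    return (1.0 - rho * phi) / (1.0 + phi ** 2 - 2.0 * rho * phi)


def clipped_weight(rho, phi):
    """ASSUMPTION: sum-to-one kept, so non-negativity reduces to clipping w1* to [0, 1]."""
    return min(1.0, max(0.0, two_forecast_weight(rho, phi)))


# Forecast 1 is twice as noisy as forecast 2 (phi = 2) and the errors are strongly
# correlated (rho = 0.8 > 1/phi), so the unconstrained weight on forecast 1 is negative.
w1 = two_forecast_weight(rho=0.8, phi=2.0)
w1_nonneg = clipped_weight(rho=0.8, phi=2.0)
print(w1, 1.0 - w1, w1_nonneg)
```

In this illustrative case the clipped solution puts all the weight on the more precise forecast, which shows how a non-negativity constraint can amount to dropping a redundant predictor.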
2.6 Stationary vs. non-stationary underlying process

Previous research on the combination of forecasts has, for the most part, implicitly
assumed a stationary underlying process, i.e. the performance of the individual forecasts
can be assumed to be consistent over time. This implies that the variance-covariance matrix of
prediction errors $\Sigma_t$ can be assumed constant over time. However, in a more
general framework, the error variance-covariance structure is not stationary, so a fixed choice
of $\Sigma_t$ may lead to a severely suboptimal combined forecast with instability of the weights.
In real forecasting settings, the properties of the forecasts are not known. Rather, a typical
strategy involves estimating them from past forecasts. For instance, the error covariance
matrix $\Sigma_t$ is generally unknown; an appropriate estimate $\hat{\Sigma}_t$ is often
substituted, and the applicability of regression-based combination lies in how well
$\hat{\Sigma}_t$ estimates $\Sigma_t$. This estimation reliability is constrained mainly by the
amount of data available to estimate $\Sigma_t$.
Under the assumption of stationarity of the underlying process, the elements of $\Sigma_t$ are
viewed as fixed quantities to be estimated from the sample observations. A common choice in
this situation is the sample covariance matrix:
$$\hat{\Sigma}_t = \frac{1}{T-1} \sum_{s=1}^{T-1} e_s e_s'$$
However, this is not a reasonable assumption for a traffic process. The true but unknown
matrix $\Sigma_t$, and hence the vector of weights, may not be fixed over time. To deal with
this problem some authors suggest basing the combining regression on the most recent,
say $m$, observations. We choose here to use the entire sample, while still weighting recent
observations more heavily. This goal can be met by using the weighted least squares
(WLS) technique instead of ordinary least squares (OLS). That is, instead of choosing $w_t$
to minimize the sum of squares as in (11) or its cross-validation version in (12), we choose
it to minimize the weighted sum of squares:
$$\sum_{i=1}^{T-1} \theta_i \Big(y_i - \sum_{k=1}^{\ell} w_{kT}\, f_{kT}^{(-i)}\Big)^2 \qquad (19)$$
where $(\theta_1, \theta_2, \dots, \theta_{T-1})$ is a vector of real values such that:
$$\theta_1 \le \theta_2 \le \dots \le \theta_{T-1}$$
This latter constraint ensures that the influence of past observations declines with
their distance from the present. For example, the $\theta_i$ could be elements that decrease
linearly or geometrically with the age of the observation.
The linear specification is given by :
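and a geometric specification with smoothing parameter $\lambda$ is:
$$\theta_s = \begin{cases} \lambda^{T-1-s} & \text{for } 0 < \lambda \le 1 \\ \lambda^{s} & \text{for } \lambda \ge 1 \end{cases}$$
For a given choice² of $\theta_i$, the optimal weights are given by:
$$w_t^* = \big(1_\ell' (\Sigma_t)_\theta^{-1} 1_\ell\big)^{-1} (\Sigma_t)_\theta^{-1} 1_\ell$$
where
$$(\Sigma_t)_\theta = \sum_{s=1}^{T-1} \theta_s\, e_s e_s' \Big/ \sum_{s=1}^{T-1} \theta_s$$
and in the case of biasedness, we have:
$$w_t^* = (F_t^{+\prime} \Theta F_t^{+})^{-1} F_t^{+\prime} \Theta\, y_t \qquad (22)$$
where $\Theta = \mathrm{diag}\{\theta_1, \theta_2, \dots, \theta_{T-1}\}$.
Note that the OLS approach emerges as a special case of WLS with geometric weights
for $\lambda = 1$. The WLS approach should produce a less noisy sequence of combining weights.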
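The effect of geometric discounting on the combining weights can be illustrated for two forecasts. The sketch below is an illustration under stated assumptions: the error series are synthetic (forecast 1 is poor in the first half of the sample and accurate in the second), the function names are hypothetical, and the two-forecast closed form of the minimum variance weight is used in place of a general matrix inverse.

```python
def geometric_weights(n, lam):
    """theta_s for s = 1..n (n = T - 1): lam**(n - s) if 0 < lam <= 1, else lam**s."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if lam <= 1.0:
        return [lam ** (n - s) for s in range(1, n + 1)]
    return [lam ** s for s in range(1, n + 1)]


def weighted_covariance(e1, e2, theta):
    """Entries (s11, s12, s22) of sum_s theta_s e_s e_s' / sum_s theta_s for two forecasts."""
    total = sum(theta)
    s11 = sum(t * a * a for t, a in zip(theta, e1)) / total
    s12 = sum(t * a * b for t, a, b in zip(theta, e1, e2)) / total
    s22 = sum(t * b * b for t, b in zip(theta, e2)) / total
    return s11, s12, s22


def first_weight(s11, s12, s22):
    """Two-forecast minimum variance weight on forecast 1 (weights sum to one)."""
    return (s22 - s12) / (s11 + s22 - 2.0 * s12)


# Synthetic errors (illustrative only): forecast 1 is poor early on and good recently.
n = 100
e1 = [(2.0 if s < 50 else 0.5) * (-1) ** s for s in range(n)]
e2 = [1.0 if (s // 2) % 2 == 0 else -1.0 for s in range(n)]

w_ols = first_weight(*weighted_covariance(e1, e2, geometric_weights(n, 1.0)))
w_wls = first_weight(*weighted_covariance(e1, e2, geometric_weights(n, 0.9)))
print(w_ols, w_wls)
```

With equal weights on all observations the early poor performance of forecast 1 still drags its weight down; with discounting the weight follows its recent accuracy more closely.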
3. EMPIRICAL RESULTS
3.1 Data
In this section the forecast combination methods described in the previous sections are
tested using operational data provided by the Rhône-Alpes toll highway company AREA,
collected during the month of November 1996. The data consist of vehicle counts from two
locations (inductive loop sites), recorded at 6-minute intervals from 1 to 22 November 1996.
The two vehicle count sites are 16 km apart.
3.2 Analysis
To check the consistency of the investigated combining schemes, this section deals
with the problem of predicting traffic flow levels, in which two forecasting methods were used
to generate forecasts. For evaluation purposes, we consider one-step-ahead prediction of the
downstream traffic flow level ($Q^d_t$) at the next time slice, i.e. 6 minutes ahead, for the
last day (i.e. 22 November 1996), using either past data from that downstream location or lagged
traffic flows recorded at the upstream one.
The forecasting methods used to generate competing forecasts of traffic flow are derived
from an autoregressive procedure and a traffic propagation-based model. Namely, the first
forecast (pred.1) is obtained using a nonparametric kernel estimator of traffic flow (we
refer the reader to El Faouzi, 1996), which uses only current and past values of traffic flow.
The second predicting scheme (pred.2) is based on the propagation of a lagged upstream
traffic flow.
The motivation behind such a choice is that endogenous prediction models based on
current and past values (e.g. time series approach, kernel predictor, ...) provide procedures
for detecting and adjusting to change in the forecast series, but cannot predict these
changes. So we argue that adding information of potential relevance to the current
and past values of the variable under study, such as explanatory variables in a regression
framework, phenomenologically related behaviors, etc., will aid the prediction of major
changes in the forecast variable.
Exogenous prediction (such as regression analysis), on the other hand, frequently ignores
information contained in the historical movement pattern of the forecast variable, whereas
the use of such information is the essence of endogenous prediction. So combining these
classes of models may consider forecasts from different types of sources, each containing some
independent amount of information to be modeled.
² In the case of WLS with geometric weights of parameter $\lambda$, one can use an approach
similar in spirit to cross-validation to choose the optimal value of the smoothing parameter
$\lambda$ instead of picking one arbitrarily.
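For any given time interval $r > 1$, let $(Q^d_1, \ldots, Q^d_r)$ denote a set of observations of $Q^d$
and $X_j^{(s)}$ the past data vector of length $s$:
\[ X_j^{(s)} = (Q^d_{j-s+1}, \ldots, Q^d_j). \]
The kernel predictor of $Q^d_{r+1}$ can be written in the following way:
\[ \hat{Q}^d_{r+1} = \sum_{j=s}^{r-1} \tilde{\alpha}_{r,j} \, Q^d_{j+1}, \]
where $\tilde{\alpha}_{r,j} = \alpha_{r,j} / \sum_{k=s}^{r-1} \alpha_{r,k}$ and the vector $(\alpha_{r,j})$ is defined as:
\[ \alpha_{r,j} = \exp\left( - \frac{\| X_r^{(s)} - X_j^{(s)} \|_2^2}{2 \beta_r} \right), \quad j = s, \ldots, r-1. \tag{25} \]
$\| \cdot \|_2$ denotes the Euclidean norm.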
The bandwidth parameter $\beta_r$, which determines how large a neighborhood of the target
point is used to calculate the average, is fixed according to the following expression:
f (26)
where $\sigma$ is the empirical standard deviation of the $r$ observations.
If $[\cdot]$ denotes the integer part operator, the length $s$ of the past data vector is chosen as
the minimizer of the MSE criterion:
\[ \sum_{j=[2r/3]}^{r-1} \left( Q^d_{j+1} - \hat{Q}^d_{j+1} \right)^2. \tag{27} \]
For more details concerning this method and its performance, see El Faouzi (1996).
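The kernel predictor described above can be illustrated with a minimal sketch. It assumes a Gaussian-type kernel on the squared Euclidean distance between past data vectors and takes the bandwidth as a user-supplied number; the function name `kernel_forecast` and the toy series are hypothetical, and the bandwidth rule (26) and the selection of $s$ by criterion (27) are not reproduced here.

```python
import math


def kernel_forecast(q, s, beta):
    """One-step-ahead kernel forecast of the value following the series q.

    q    : list of observations Q_1, ..., Q_r (q[0] holds Q_1)
    s    : length of the past data vector
    beta : bandwidth (assumed given; not the paper's rule)
    """
    r = len(q)
    if not 1 <= s < r:
        raise ValueError("s must satisfy 1 <= s < len(q)")
    target = q[r - s:]                 # X_r: the s most recent observations
    weighted_sum = 0.0
    weight_total = 0.0
    for j in range(s, r):              # j = s, ..., r-1 in 1-based notation
        x_j = q[j - s:j]               # X_j = (Q_{j-s+1}, ..., Q_j)
        dist2 = sum((a - b) ** 2 for a, b in zip(target, x_j))
        alpha = math.exp(-dist2 / (2.0 * beta))
        weighted_sum += alpha * q[j]   # q[j] holds Q_{j+1}, the successor of X_j
        weight_total += alpha
    return weighted_sum / weight_total  # normalised weights sum to one


# Toy periodic series: the pattern (2, 3) has always been followed by 1.
toy_series = [1.0, 2.0, 3.0] * 4
toy_forecast = kernel_forecast(toy_series, s=2, beta=0.1)
print(round(toy_forecast, 3))
```

With a small bandwidth the forecast is dominated by the past patterns closest to the current one, which is why the toy series returns a value very near 1. A larger bandwidth moves the forecast toward the plain average of the observed successors.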
The second model is based on the propagation of upstream traffic flow. More precisely, it
assumes that the value of traffic flow at time slice $t$ at a downstream location, $Q^d_t$, is given
by a dynamic linear function of the upstream traffic flow $Q^u_{t-1}$ at time $t-1$.
The reason why the upstream traffic flow is used as a potential explanatory variable is
that traffic propagates down the highway and, after some delay, the upstream level arrives
at the downstream location with potential attenuation from on-ramp and off-ramp operations.
The delay is identified as the maximizer of the cross-correlation function³ between
the two series $(Q^u_t; Q^d_t)$. Exploratory analyses reveal that the optimal delay is equal to 1
(i.e. 6 min). So, the complete dynamic system can be written as:
\[ (S) \quad \begin{cases} Q^d_t = \alpha_t + \beta_t Q^u_{t-1} + \varepsilon_t \\ \alpha_t = \alpha_{t-1} + \xi_{1t} \\ \beta_t = \beta_{t-1} + \xi_{2t} \end{cases} \]
where $\varepsilon_t$ and $\xi_t = (\xi_{1t}, \xi_{2t})$ are error random variables normally distributed with zero
mean and covariance matrices $\Gamma_t$ and $\Sigma_t$ respectively. The parameter $\alpha_t$ can be viewed as a
dynamic level at time $t$ and $\beta_t$ is a dynamic scale factor for the upstream traffic flow at time $t-1$.
The EM algorithm is used in conjunction with conventional Kalman smoothed estimators
to derive forecasts from this dynamic linear system. For more details on the estimation
aspects involved in this model, see Shumway and Stoffer (1982).
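The way the propagation delay is identified, as the lag maximizing the cross-correlation between the upstream and downstream series, can be sketched as follows. This is only an illustration on synthetic data: the function names and the toy flows are assumptions, and the EM and Kalman estimation of the dynamic linear system is not reproduced.

```python
import math


def sample_correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_delay(upstream, downstream, max_lag):
    """Return the lag (in time slices) with the highest cross-correlation,
    together with the correlation obtained at every lag tried."""
    scores = {}
    for lag in range(max_lag + 1):
        x = upstream[:len(upstream) - lag]   # upstream flow at time t - lag
        y = downstream[lag:]                 # downstream flow at time t
        scores[lag] = sample_correlation(x, y)
    return max(scores, key=scores.get), scores


# Synthetic flows: the downstream series is an attenuated copy of the
# upstream series shifted by one time slice (hypothetical numbers).
upstream = [100.0, 140.0, 90.0, 160.0, 120.0, 80.0, 150.0, 110.0, 130.0, 95.0]
downstream = [100.0] + [0.9 * u + 10.0 for u in upstream[:-1]]
delay, scores = best_delay(upstream, downstream, max_lag=3)
print(delay)
```

On real detector data the maximum is of course less sharp than in this constructed case, and the range of lags to scan has to be chosen from the distance between the two locations.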
The weights are derived using three different approaches. The first is the min-variance
approach (Mvc), obtained both by direct calculation (see equation 3) and as the solution of a
constrained regression (see equation 5). The special case of independent predictors is also
considered here (see equation 10). The second approach is the nonnegative weights combination
presented in section 2.5 (see expression 16). Finally, the unconstrained regression
approach was also used as a combining scheme (see equation 12). In this application,
we first fit a combination model using a cross-validation criterion and then calculate error
measures on an independent sample. A variable weights strategy was used here: for each
forecast period, data points were simply added without abandoning earlier data points,
so the length of the series was incremented every time. Table 1 presents the results
for the optimal weights.
In addition to the root mean squared prediction error (RMSE) of each combining
method, the standard deviation of prediction errors (STDE) and the correlation (R) between
each predictor and the target were also computed.
                                 Combination Methods
            Minimum Variance          Constrained Regression            Unconstrained
            General    Under          w_k >= 0,   sum_k w_k = 1         Regression
            Model      Independence   k = 1, 2    and w_0 = 0           Model
            (CM1)      (CM2)          (CM3)       (CM4)                 (CM5)
Intercept   0.000      0.000          0.000       0.000                 -4.936
w_1         0.419      0.444          0.424       0.422                  0.427
w_2         0.581      0.556          0.603       0.578                  0.612
Table 1. Optimal Weights for Investigated Combination Methods.

³ This approach is similar to the one used in Dailey et al. (1991) for travel time estimation.
Note that the two methods (CM1) and (CM2) are equivalent from a theoretical point of
view. However, we observe here slight differences in the resulting weights. These are caused
by differences in the numerical methods involved in the calculation of the weights.
Another interesting fact is that, although the weights are not constrained to sum to one
in the nonnegative weights approach, the nonnegative optimal weights were always found
to have a sum not far from 1 (here 1.027). This means that, as noted by Breiman
(1992), in regression with no intercept this constraint is largely unnecessary.
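To emphasize this, with the same notation, let $\sum_k w_{kt} = \gamma_t$, $f_{ct} = w_t^\top F_t$ and $\tilde{f}_{ct} = \tilde{w}_t^\top F_t$,
with $\tilde{w}_t = w_t / \gamma_t$. Then we have $f_{ct} = \gamma_t \tilde{f}_{ct}$ and $(y_t - f_{ct}) = (1 - \gamma_t) y_t + \gamma_t (y_t - \tilde{f}_{ct})$.
If $\| \cdot \|_2$ denotes the Euclidean norm on $L^2$, observe that:
\[ \| y_t - f_{ct} \|_2 \ge |1 - \gamma_t| \, \| y_t \|_2 - |\gamma_t| \, \| y_t - \tilde{f}_{ct} \|_2 \]
\[ \ge \| y_t \|_2 - |\gamma_t| \left[ \| y_t \|_2 + \| y_t - \tilde{f}_{ct} \|_2 \right] \]
leading to
\[ |\gamma_t| \ge \frac{\| y_t \|_2 - \| y_t - f_{ct} \|_2}{\| y_t \|_2 + \| y_t - \tilde{f}_{ct} \|_2}. \]
And symmetrically we have a second inequality
\[ \| y_t \|_2 + \| y_t - f_{ct} \|_2 \ge |\gamma_t| \left[ \| y_t \|_2 - \| y_t - \tilde{f}_{ct} \|_2 \right]. \]
It follows readily from these two inequalities that:
\[ \frac{\| y_t \|_2 - \| y_t - f_{ct} \|_2}{\| y_t \|_2 + \| y_t - \tilde{f}_{ct} \|_2} \le |\gamma_t| \le \frac{\| y_t \|_2 + \| y_t - f_{ct} \|_2}{\| y_t \|_2 - \| y_t - \tilde{f}_{ct} \|_2}. \]
So, if there are some good predictors among $F_t$, then both $\| y_t - f_{ct} \|_2$ and $\| y_t - \tilde{f}_{ct} \|_2$
will be small compared to $\| y_t \|_2$ and the optimal $w_t$ will have a sum not far from one.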
Finally, as the predicting models used here are constructed from two complementary
data sets, it can be expected that the prediction errors of the two individual forecasts are
statistically independent. In practice, the correlation between the two prediction errors is
0.307. Hence the covariance terms in $\Sigma_t$ are small, so that the results from methods
(CM1) and (CM2) are quite similar.
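The effect of the covariance term on the min-variance weights can be checked with a small sketch. It uses the textbook two-forecast minimum-variance formula (Bates and Granger, 1969), which is assumed here to correspond to the min-variance approach of this paper, and it plugs in the evaluation-period STDE values of the two predictors and the reported error correlation of 0.307 purely as illustrative inputs. The function name is hypothetical, and the paper's own weights were estimated on the fitting sample, not from these figures.

```python
def min_variance_weights(sd1, sd2, rho=0.0):
    """Weights (w1, w2), summing to one, that minimise the variance of
    w1*e1 + w2*e2 for errors with standard deviations sd1, sd2 and
    correlation rho (two-forecast minimum-variance combination)."""
    var1, var2 = sd1 ** 2, sd2 ** 2
    cov = rho * sd1 * sd2
    w1 = (var2 - cov) / (var1 + var2 - 2.0 * cov)
    return w1, 1.0 - w1


# Illustrative inputs: STDE of Pred.1 and Pred.2, and the error correlation.
w_general = min_variance_weights(108.668, 97.243, rho=0.307)
w_independent = min_variance_weights(108.668, 97.243, rho=0.0)
print([round(w, 3) for w in w_general])
print([round(w, 3) for w in w_independent])
```

With these inputs the first weight comes out near 0.42 when the correlation is used and near 0.44 under independence, which is close to the (CM1) and (CM2) columns of Table 1 and shows how little a correlation of this size moves the weights.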
        Individual Predictors    Combined Predictors Derived from the Investigated Methods
        Pred.1      Pred.2       (CM1)     (CM2)     (CM3)     (CM4)     (CM5)
RMSE    108.731     97.548       82.943    82.985    82.362    82.982    82.439
STDE    108.668     97.243       82.721    82.876    82.325    82.720    82.410
R       0.867       0.889        0.921     0.921     0.921     0.921     0.921
Table 2. Performance measures (RMSE, STDE and R) for the combined forecasts.
In each case, both individual forecasts are combined using the indicated technique
and the measures of assessment are calculated for the evaluation period only. Table 2
shows the performance measures obtained when the investigated combination techniques are
applied to the two forecasts of traffic flow. The results reported here show that, of the two
forecasts, Pred.2 had the smaller RMSE and STDE. What is important for our purpose is
that, regardless of the approach used, combining the forecasts consistently outperforms
the individual forecasts and provides a substantial improvement over the 'best' method
under the RMSE and STDE criteria. Results are quite homogeneous, with a marked difference for
the (CM3) technique. This method seems to be a strong performer when compared with the
others. In this case, the RMSE is reduced by about 15.56% relative to the
'best' predictor.
As mentioned earlier, the performance of the min-variance technique is not altered
by the independence assumption and the results of the two methods are very similar. When
we compare the min-variance approach with the nonnegative one, we obtain a lower
variance for the latter. This is somewhat surprising, because the theory of the min-variance
approach states that the combined predictor has the lowest forecast error variance within the
class of linear unbiased predictors. This result can be explained either by the lack of a
sufficiently long time series for estimating $\Sigma_t$ efficiently, which mainly affects the
min-variance combining technique, or by bias in the individual forecasts under consideration.
The correlations between forecasted and actual values of traffic flow are positive and
statistically significant. According to this measure of assessment, the performances of the
combining methods are equal.
It should be remarked that a similar approach has been used in the single data set
configuration. Namely, Breiman (1996) presented an approach in which a sequence of data sets
is generated from the same training data set by some resampling technique (the bootstrap,
for instance); a set of estimates is then constructed such that each one is based on the
same model and on one and only one data set. He showed that by combining the derived
estimates, one can considerably improve the accuracy of predictions. However, when
proceeding in such a manner, one implicitly assumes that the single data set has some desirable
properties: extended spatial and temporal coverage, no ambiguity, high confidence,
etc. Unfortunately, in real-world applications, it is not even possible to derive the
information needed for a particular task from one single data source (measured by a single detector,
for example) and it is reasonable to make use of the favourable properties of all the data at
hand, with possibly overlapping information, coming from detectors of different types.
This latter direction of research is the objective pursued in this work.
4. CONCLUSIONS
This study was concerned with the combination of forecasts. All the methods outlined
and applied in this paper improve the quality of the resulting forecast by
reducing its predictive error. At worst, the quality (in the error reduction sense) of the
combined forecast is comparable to that obtained if the 'best' model is chosen (in the mean
square sense).
Therefore, unless there is strong evidence that a particular forecasting method based
on a given data set is better than the other methods in a given situation, it might be
preferable to consider several methods. An advantage of combining forecasts is that when several
methods are used the results do not seem to be highly sensitive to the specific choice of
methods (robust performance). As noted in Makridakis and Winkler (1983), "using an
average of forecasts is undoubtedly better than using a wrong model or a simple poor
forecasting method".
It is important to note that the combining forecast methodology is not confined to
combinations utilizing model-based forecasts. The combining schemes can be generalized
to combine model-based forecasts (objective forecasts) with subjective forecasts from
experts, prior knowledge, etc. Such extensions are left to future research.
REFERENCES
Ahmed M. S. and A. R. Cook (1979). Analysis of Freeway Traffic Time Series Data by
Using Box-Jenkins Techniques. Transportation Research Board, 722, pp. 1-9.
Bates M. and W. J. Granger (1969). The Combination of Forecasts. Operational Research
Quarterly, 20, pp. 451-468.
Bunn W. (1987). Expert Use of Forecasts: Bootstrapping and Linear Models. In: Judgmental
Forecasting (G. Wright and P. Ayton Eds.), pp. 229-241, Wiley, New York.
Breiman L. (1992). Stacked Regression. Department of Statistics, University of California,
Berkeley.
Breiman L. (1996). Bagging Predictors. Machine Learning, 26, pp. 123-140.
Clemen T. and L. Winkler (1986). Combining Economic Forecasts. J. Bus. Econ. Statist.,
4, pp. 39-46.
Clemen T. and J. Guerard (1989). Econometric GNP Forecasts: Incremental Information
Relative to Naive Extrapolation. International Journal of Forecasting, 5, pp. 417-426.
Dailey D., M. Haselkorn and N. Nihan (1991). Travel Time Estimation Using
Cross-Correlation Techniques. TransNow, Final Report TNW91-02, October 1992.
Fraedrich K. and M. Leslie (1988). Real-time Short-term Forecasting of Precipitation at
an Australian Tropical Station. Weather and Forecasting, 3, pp. 104-114.
Fraedrich K. and R. Smith (1989). Combining Predictive Schemes in Long-range
Forecasting. Journal of Climate, 2, pp. 291-294.
Edgerton A. and E. Kolbe (1936). The Method of Minimum Variation for the Combination
of Criteria. Psychometrika, 1, pp. 183-188.
El Faouzi N.-E. and J.-B. Lesort (1995). Travel Time Estimation on Urban Networks from
Traffic Data and On-board Trip Characteristics. 2nd World Congress on Intelligent
Transport Systems, (M. Koshi Ed.), November 9-11, 1995, Yokohama (Japan).
El Faouzi N.-E. (1996). Nonparametric Traffic Flow Prediction Using Kernel Estimator.
Proceedings of the 13th ISTTT, (J.-B. Lesort Ed.), pp. 41-54, Lyon, France 1996.
El Faouzi N.-E. (1997). Heterogeneous Data Sources Fusion for Impedance Indicators.
Proceedings of the 8th IFAC-IFIP-IFORS, (M. Papageorgiou and A. Pouliezos Eds.),
volume 3, pp. 1375-1380, Chania, Greece 1997.
Gafarian V., J. Paul and L. Ward (1977). Discrete Time Series Models of Freeway Density
Process. Proceedings of the 7th ISTTT, (T. Sasaki and T. Yamaoka Eds.), pp.
387-411, Kyoto, Japan 1977.
Harrison J. and F. Stevens (1971). A Bayesian Approach to Short-term Forecasting.
Operational Research Quarterly, Vol. 22(4), pp. 341-362.
Hoerl A. E. and R. W. Kennard (1970). Ridge Regression. Technometrics, 12, pp. 55-67
and 69-82.
Kalman E. (1960). A New Approach to Linear Filtering and Prediction Problems. J. Basic
Engineering, 82, pp. 35-45.
Lesort J.-B. and J.-P. Lebacque (1983). Prevision a Court Terme du Trafic en Ville. (in
French), Rapport IRT, Arcueil, Mai 1983.
Lesort J.-B. (1987). Prediction of Traffic Flow. In Concise Encyclopedia of Traffic and
Transportation Systems, (M. Papageorgiou Ed.), pp. 329-331, Pergamon Press.
Mahmoud E. (1984). Accuracy in Forecasting: A Survey. Journal of Forecasting, 3,
pp. 139-159.
Makridakis S. and L. Winkler (1983). Averages of Forecasts: Some Empirical Results.
Management Science, 29, pp. 987-996.
Moorthy C. K. and B. G. Ratcliffe (1988). Short Term Traffic Forecasting Using Time
Series Methods. Transportation Planning and Technology, 12, pp. 45-56.
Nelson R. (1972). The Prediction Performance of the FRB-MIT-PENN Model for the U.S.
Economy. American Economic Review, 62, pp. 902-917.
Newbold P. and W. J. Granger (1974). Experience with Forecasting Univariate Time Series
and the Combination of Forecasts. Journal of the Royal Statistical Society, Series A,
137, pp. 131-149.
Okutani I. and Y. J. Stephanedes (1984). Dynamic Prediction of Traffic Volume through
Kalman Filtering Theory. Transportation Research B, 18(1), pp. 1-11.
Rao C. R. (1973). Linear Statistical Inference and Its Applications, 2nd Edition, New
York: John Wiley, 1973.
Smith D. (1989). Combination of Forecasts in Electricity Demand Prediction. Journal of
Forecasting, 8, pp. 349-356.
Shumway R. H. and D. S. Stoffer (1982). An Approach to Time Series Smoothing and
Forecasting Using the EM Algorithm. J. Time Series Anal., 4(3), pp. 253-264.
Stephanedes Y. J., P. G. Michalopoulos and R. A. Plum (1981). Improved Estimation of
Traffic Flow for Real-Time Control. Transportation Research Board, 795, pp. 28-39.
Vythoulkas P. C. (1993). Alternative Approaches to Short Term Traffic Forecasting for
Use in Driver Information Systems. Proceedings of the 12th ISTTT, (C. F. Daganzo
Ed.), pp. 485-506, Berkeley 1993.
Wolpert D. (1992). Stacked Generalization. Neural Networks, 5, pp. 241-259.

A THEORETICAL BASIS FOR IMPLEMENTATION
OF A QUANTITATIVE DECISION SUPPORT
SYSTEM — USING BILEVEL OPTIMISATION
Arthur Clune, Mike Smith and Yanling Xiang,
York Network Control Group
Department of Mathematics,
University of York,
Heslington,
York, YO10 5DD.
e-mail mjs7@york.ac.uk
ABSTRACT
There are very many assessment tools which answer "what if...?" questions, but there are no
tools which take the initiative and suggest options for the planner to consider. This paper
suggests that there is an urgent need for a decision support tool or system, implemented in
software, which does make suggestions to the transportation planner. The paper then outlines a
theoretical model which seeks to move non-optimal steady state (traffic, price, green-time)
distributions steadily toward locally optimal values taking account of users' responses. The
model is a multi-modal equilibrium transportation model with elastic demands. Road prices,
prices charged to traverse a route (as with public transport fares) and signal green-times are
explicitly included as control variables. The model should permit values of these control
variables which meet specified transportation targets to be estimated.
The paper expresses the central problem as a bilevel optimisation problem and suggests four
different approaches to solving the problem. Results of using a penalty method are given; but
the greatest justification is provided for approaches which involve solving a sequence of
quadratic or linear subproblems. These methods are perhaps most likely to prove to be
efficient.
Key words: Decision support system, bilevel optimisation, transportation networks.

THE NEED FOR A DECISION SUPPORT SYSTEM WHICH MAKES
SUGGESTIONS
Urban Transportation is at a cross-roads; with changing targets ahead, and an expanding
plethora of increasingly sophisticated controls to assist in the task of meeting them, the
transport planner faces a daunting task.
Currently, computer models of transportation only assess strategies given to them. So the
planner is left with complete responsibility for devising strategies, including pricing levels,
likely to be successful when tested (in computers and in reality) against new and changing
targets. While vast computational resources are employed for the assessment of options for
controlling town traffic, the design of these options is left to the planner.
The design of optimal or near optimal strategies, including the prices to be charged and signal
timings to be implemented, for controlling urban travel is far, far more difficult than assessing
any given option. Thus computational assistance is only available for the easier task; by far the
harder task is left entirely to the Transport Planner, Local Authority, Central Government or
the EU. The single most important tool currently entirely absent from the transportation
planner's tool-box is an effective and proven mathematical optimisation methodology,
implemented within helpful and easy-to-use software.
The tool is now becoming a practical necessity for two reasons. First, for transport planners
themselves to devise prices and systems for controlling urban travel which are anywhere near
the best would take (i) great effort, (ii) great insight, (iii) a long period of time and (iv) luck;
the problem simply has too many ramifications and feedback loops for a really good solution to
arise, reasonably quickly, without the assistance of the very best tools. Second, the
mathematical tools are now (and only just now) becoming available to allow this development
to actually take place. Thus now, for the first time, it is perhaps possible to implement a really
helpful decision support system like that described above.
The design tool advocated above might be thought of as a "supervisor" by those interested in
traffic control systems and as a means of calculating "second-best" prices by those interested in
pricing structures.
Prices
Prices are important controls of economic and transportation systems. The decision support
tool based on mathematical optimisation envisaged above would have immediate application to
the estimation of approximately optimal prices for transportation taking account of real-life
constraints and also taking into account other controls such as traffic signal control settings and
bus lane provision.
Realistic approximately optimal prices are likely to be very different to the unrealistic marginal
cost prices which have so far received most attention, both within EU-funded work and
elsewhere. Marginal cost prices as usually interpreted pay little regard to practical constraints.
In order to calculate prices which are confined by realistic constraints it is necessary to solve
the problem known as the "second best" problem; that is:
what should the prices be set at when some prices cannot be marginal cost prices?
For example, suppose some prices are readily changed by an authority (say car park prices) but
others (say bus fares or road prices) are not readily changed by the same authority. Given the
limitations (not being able to change bus fares or road prices from their current values, say)
what should the car parking charge be?
If the Authority could change at will all prices then under natural conditions an optimal set of
prices might be determined using marginal cost pricing. But with limited jurisdiction the only
available optima will almost certainly not arise from the usual marginal cost price, calculated
in the usual way. Or again, the usual network theoretic "System-Optimal" flow pattern only
gives the right prices if all links are charged; as soon as some links are not charged then we
have a "second-best" problem and we need a more general optimisation procedure to calculate
optimal controls (prices) and flows, subject to constraints reflecting the fact that certain links
must not be subject to charge. Finally, real life is essentially dynamic and so marginal cost
prices must in fact vary with time and it is quite unrealistic at this moment to contemplate
either a charging mechanism or a user-friendly form for such charges if they are to be faithfully
implemented.
As soon as we have a proper mathematical optimisation model we will be able to calculate
optimal road-use charges and other prices even if there are severe constraints on the charges
and prices which may be implemented. Such constraints would naturally ensure that the
charges should only vary in a very simple and understandable manner; and that the charges
were within the limits of political acceptability.
Signal Timings
There is a vast literature concerning signal timings. The following papers address the control
problem subject to equilibrium choices by travellers: Allsop (1974), Chiou (1997), Clegg and Smith
(1998), Fisk (1984), Smith (1979b), Smith et al (1997), Smith et al (1998a, b), Tobin and Friesz
(1988), Yang et al (1994a, b, 1995), Yang (1996a, b, c).
Context
The basic structure of the equilibrium model here (with any controls fixed) is as in Smith
(1979a): to find an $x$ in $\mathbb{R}^n_+$ satisfying
$-f(x)$ is normal at $x$ to $\mathbb{R}^n_+$. (1)
But here the function -f takes the form of an excess demand function with more aspects than
usual so as to embrace capacity and demand constraints as well as equilibrium constraints. This
structure is that of a complementarity problem.
The solution methods proposed here follow up ideas introduced in Smith (1979a, b); which
were partially inspired by cone-fields introduced in Smale (1976). Smale (1976) introduced
dynamical systems whose solution trajectories were not uniquely defined; they merely move in
"roughly" the right direction, rather than exactly the right direction; and argued that such
dynamical systems may be more appropriate for the study of the evolution of economic
systems. The "solution trajectories" in these systems have their direction of motion at each
point confined to be within a cone, instead of being confined to be in a precise direction.
This paper relates the cone projection method introduced in Battye et al (1998), Smith et al
(1998a, b), and Clegg and Smith (1998) to Smale's cone fields; and also extends proofs of
convergence given in earlier papers.
The cone projections discussed here are slightly similar to those discussed in Dupuis and
Nagurney (1993), Zhang and Nagurney (1995), Nagurney and Zhang (1996, 1998). However
there appear to be some very substantial differences. The main difference is that throughout we
are concerned with finding an appropriate direction to move controls.
One important element of the equilibrium modelling which arises is that the "equilibrium
objective function" introduced by Beckmann et al (1956) is changed to one which allows
asymmetries. The dynamics in this paper exploit "Lyapunov" methods. See Lyapunov (1907).
Achieving the Complementarity Formulation
The notation adopted is shown below; there is a base network and a multi-copy (Charnes and
Cooper (1961)) version of this. Within each copy the travellers or vehicles all have a single
destination node. This network structure is very similar to that in Smith (1998); here we have
chosen to give a route-formulation in which each copy has links which comprise routes in the
base network. In this paper then a link will always be an element of the base network; links in
the multi-copy network will here be called routes since they are routes constructed from base
network links. Route costs on the copies may have components which are sums of costs on
base network links (link delays will add in this way along routes) but may also have elements
which are non-additive with respect to base network links; so that bus fares and so on may be
represented.
The structure also represents multi-mode networks; flow on the routes in a single copy may
represent travellers using a single mode; or vehicles of a certain type. Within the framework
given here multi-modal effects are most easily represented by thinking of all flows on the
multi-copy network in congestion-causing units and defining the demand function (W(.)
below) appropriately. A typical link in the multi-copy part of the network will have suffix r (as
here this will be a route) and a typical base network link will have suffix i.
There is also a copy containing sets of base network links which will be called stages. Stage k
(say) will be a subset of base network links which may be shown green at the same time. All
base network links which are not controlled will comprise stage 1 which will be compatible
with all other stages and which will be regarded as being "given green" all the time. Such links
will be able to always operate at full capacity.
Notation
Variables which are to be found in the equilibrium problem
$X_r$ = flow along route r in the multi-copy network;
$C_n$ = least cost of reaching the destination from node n in the multi-copy network,
where $C_n = 0$ if n is a destination; and
$b_i$ = bottleneck delay at the exit of link i in the base network.
Control variables
$\pi_i$ = price to be paid to traverse link i;
$P_r$ = price to be paid for traversing route r; and
$Y_k$ = proportion of time stage k is green.
Fixed given variable
$s_i$ = saturation flow at exit of base link i (may be infinite).
Multi-copy network structure
$B_{nr}$ = 1 if node n is the entrance node of route r in the multi-copy network (n
Before r) and 0 otherwise;
$N_{ik}$ = $s_i$ if link i is in stage k and 0 otherwise (k = 1, 2, 3, ...); and
$M_{ir}$ = 1 if route r in the multi-copy network contains link i and 0 otherwise.
Nominal link capacity
$y_i = \sum_k N_{ik} Y_k$, the nominal capacity of link i.
Demand and supply functions
$H_r(X)$ = cost of traversing route r if the vector of route-flows is X;
$W_n(C)$ = demand arising at node n (heading for the unique destination in the
copy containing node n) if the cost to destination vector is C; and
$g_i(b_i, y_i)$ = maximum possible average flow when the nominal link capacity is $y_i$
and the bottleneck delay on link i is $b_i$.
Following Payne and Thompson (1975), the bottleneck delays $b_i$ and the node costs $C_n$ are
regarded as independent variables. Also if n is a destination node then $C_n = 0$ and $W_n(C) = 0$ for
all C. (The demand for travel and the cost of travel from a node to itself are both zero.)
All the links at non-signalised junctions will comprise just one "stage" (the first in the list of
stages) shown green for all time and so $Y_1 = 1$. Also $N_{i1}$ is (as in the general case above) to be $s_i$
for all i such that link i is not signal-controlled and then the nominal link capacity will still be
$y_i = \sum_k N_{ik} Y_k$ ($= N_{i1} Y_1 = s_i$) as within the notation list above. The formula $g_i(b_i, y_i)$ above will then
determine for each such link the maximum possible flow consistent with this $y_i$ and a bottleneck
delay $b_i$.
Equilibrium, demand and capacity constraints

We use Wardrop's (1952) condition: more costly routes carry no flow. But we choose to write this
in the following form: for each route r the (least) cost to the destination from the node B(r)
upstream of route r (Before r) is no more than the least cost to the destination via route r, and if it
is less then no flow will enter route r, or
\[ \sum_n B^T_{rn} C_n - H_r(X) - P_r - \sum_i M^T_{ri} (b_i + \pi_i) \le 0, \text{ and} \]
\[ \sum_n B^T_{rn} C_n - H_r(X) - P_r - \sum_i M^T_{ri} (b_i + \pi_i) < 0 \text{ implies } X_r = 0. \]
Here $\sum_n B^T_{rn} C_n$ is just a way of writing $C_{B(r)}$, the cost at the node B(r) upstream of route r. This
sum comprises just the single cost at that node at the entrance of route r.
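The route condition above can be checked numerically on a toy instance. The sketch below is only an illustration: the function names, the two-route network and all numbers are hypothetical, `M[i][r]` plays the role of $M_{ir}$, and `C_entry[r]` stands for the node cost $C_{B(r)}$ at the entrance of route r.

```python
def route_residuals(C_entry, H, P, M, b, pi):
    """For each route r, return C_B(r) - H_r - P_r - sum_i M[i][r]*(b[i] + pi[i]).

    A positive value means the node cost exceeds the cost via route r,
    which the equilibrium conditions forbid."""
    residuals = []
    for r in range(len(H)):
        link_part = sum(M[i][r] * (b[i] + pi[i]) for i in range(len(b)))
        residuals.append(C_entry[r] - H[r] - P[r] - link_part)
    return residuals


def is_wardrop(X, residuals, tol=1e-9):
    """True if no residual is positive and every route whose residual is
    strictly negative (a costlier route) carries no flow."""
    for flow, res in zip(X, residuals):
        if res > tol:
            return False
        if res < -tol and flow > tol:
            return False
    return True


# Hypothetical data: route 0 uses link 0, route 1 uses link 1.
M = [[1, 0],
     [0, 1]]
H = [10.0, 12.0]        # congestion cost along each route
P = [0.0, 1.0]          # route prices (e.g. a fare on route 1)
b = [2.0, 0.0]          # bottleneck delays at link exits
pi = [0.0, 0.0]         # link prices
C_entry = [12.0, 12.0]  # least cost to destination at the common origin

residuals = route_residuals(C_entry, H, P, M, b, pi)
print(residuals)
print(is_wardrop([4.0, 0.0], residuals), is_wardrop([4.0, 1.0], residuals))
```

In this instance route 0 costs exactly the node cost and may carry flow, while route 1 is one unit costlier, so any flow on it violates the condition.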
The (elastic or inelastic) demand constraints may be written: the total route-flow $\sum_r B_{nr} X_r$ out of
node n equals the demand $W_n(C)$. Rewriting this in a slightly weaker and more artificial way we
obtain:
\[ W_n(C) - \sum_r B_{nr} X_r \le 0, \text{ and} \]
\[ W_n(C) - \sum_r B_{nr} X_r < 0 \text{ implies } C_n = 0. \]
This is in fact, under natural conditions (which include the Wardrop condition above), equivalent
to the equality condition stated at the start of this paragraph.
For the capacity constraint condition we suppose here that for any average bottleneck delay $b_i$ and
nominal link capacity $y_i$ there is a maximum possible flow $g_i(b_i, y_i)$ consistent with the delay $b_i$.
Then the capacity constraint may be written:
\[ \sum_s M_{is} X_s - g_i(b_i, \sum_k N_{ik} Y_k) \le 0, \text{ and} \]
\[ \sum_s M_{is} X_s - g_i(b_i, \sum_k N_{ik} Y_k) < 0 \text{ implies } b_i = 0. \]
As specified here this condition will ensure that congestion costs normally represented by a
cost-flow function will in fact occur as "bottleneck" delays $b_i$ which arise from equilibrating via the
functions $g_i$. The $g_i$ may be thought of as the inverses of given cost-flow functions. Given a
nominal link capacity $y_i$, $g_i(b_i, y_i)$ delivers a (largest) flow $q_i$ compatible with a given cost or delay
$b_i$, instead of delivering a cost $b_i$ for each flow $q_i$. This formulation allows for explicit capacity
constraints: for example we may set $g_i(b_i, y_i) = y_i$ (independent of $b_i$), in which case we obtain the
simplest strictly capacitated model. This may be thought of as having a "vertical" cost-flow
"curve" but the inverse cost-flow functions $g_i(\cdot, y_i)$ are then flat and may be expected to have
numerical advantages.
The inverse delay-flow or cost-flow curves (these are flow-delay or flow-cost) may be expected to
have numerical advantages even if the delay-flow or cost-flow function is just steep, rather than
vertical, because the $g_i$ are then shallow. It is convenient and natural to think of $b_i$ as the delay at
the link exit, and as being a "steep" function of flow; and $H_r$ as the cost of traversing route r when
there may be congestion along the links of route r but no bottleneck delays at the exits of the links,
and as being a fairly flat function of flow.
The main thing which drives the forthcoming results is monotonicity of f below.
The complementarity formulation

First we define network response functions $f_{1r}$, $f_{2n}$, $f_{3i}$:
\[ -f_{1r}(X, C, b, \pi, Y, P) = \sum_n B^T_{rn} C_n - H_r(X) - P_r - \sum_i M^T_{ri} (b_i + \pi_i), \]
\[ -f_{2n}(X, C, b, \pi, Y, P) = W_n(C) - \sum_r B_{nr} X_r, \]
\[ -f_{3i}(X, C, b, \pi, Y, P) = \sum_s M_{is} X_s - g_i(b_i, \sum_k N_{ik} Y_k). \]
Then we rewrite the equilibrium, demand and capacity constraints above as follows:
$-f_{1r}(X, C, b, \pi, Y, P) \le 0$, and $-f_{1r}(X, C, b, \pi, Y, P) < 0$ implies $X_r = 0$,
$-f_{2n}(X, C, b, \pi, Y, P) \le 0$, and $-f_{2n}(X, C, b, \pi, Y, P) < 0$ implies $C_n = 0$,
$-f_{3i}(X, C, b, \pi, Y, P) \le 0$, and $-f_{3i}(X, C, b, \pi, Y, P) < 0$ implies $b_i = 0$.
Finally we put $x = (X, C, b)$, $p = (\pi, Y, P)$ and $f(x, p) = (f_1(x, p), f_2(x, p), f_3(x, p))$ to obtain:
$x$ belongs to $\mathbb{R}^n_+$ and $-f(x, p)$ is normal, at $x$, to $\mathbb{R}^n_+$; or (2)
$x$ belongs to $\mathbb{R}^n_+$; $-f(x, p) \le 0$; and $-f_i(x, p) < 0$ implies $x_i = 0$. (3)
(The i and the n here are "new" suffices and are unrelated to the previous i and n.)
These conditions are similar to a Tobin economic model; are of the form (1), and suggest
"following" a cone field like the "half-space field" in Smith (1979b).
Equilibrium as the solution set of 2n + 1 inequalities

Keep $p$ fixed (in $\mathbb{R}^m$) for the moment. The feasibility and equilibrium conditions (2) or
(3) are plainly equivalent to:
\[ x_i \ge 0 \text{ for } i = 1, \ldots, n; \quad -f_i(x, p) \le 0 \text{ for } i = 1, \ldots, n; \quad \text{and } E_2 = \sum_i x_i f_i(x, p) \le 0. \tag{4} \]
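The $2n + 1$ inequalities in (4) lend themselves to a direct numerical check: since every $x_i$ and every $f_i$ must be nonnegative, the sum of products $x_i f_i$ can only be nonpositive if it is exactly zero. The sketch below illustrates this on a hypothetical instance with one origin, two routes and a fixed demand. The cost functions, the function names and the numbers are assumptions made for the example, with prices and bottleneck delays left out.

```python
def equilibrium_violation(x, f):
    """Largest violation of the inequalities x_i >= 0, f_i >= 0 and
    sum_i x_i * f_i <= 0; zero means every inequality in (4) holds."""
    gap = sum(xi * fi for xi, fi in zip(x, f))
    worst = max([-xi for xi in x] + [-fi for fi in f] + [gap])
    return max(worst, 0.0)


def toy_response(x, demand):
    """f(x) for x = (X1, X2, C) on a toy network (hypothetical costs).

    f_1r = (cost via route r) - (node cost C), i.e. minus the route condition;
    f_2  = (total route flow) - (demand), i.e. minus the demand condition."""
    X1, X2, C = x
    return [(10.0 + X1) - C, (14.0 + X2) - C, (X1 + X2) - demand]


equilibrium = [3.0, 0.0, 13.0]      # all demand on the cheaper route
not_equilibrium = [2.0, 1.0, 12.0]  # one unit of flow on the costlier route

v_eq = equilibrium_violation(equilibrium, toy_response(equilibrium, 3.0))
v_bad = equilibrium_violation(not_equilibrium, toy_response(not_equilibrium, 3.0))
print(v_eq, v_bad)
```

A single nonnegative number of this kind is convenient as a stopping test or as a penalty term, although the solution methods discussed in this paper work with the inequalities themselves.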
FOUR SOLUTION METHODS

In each of these four methods we suppose given an objective function Z = Z(x, p) and seek to
minimise Z subject to equilibrium. We describe the third method in greatest detail (since this is
the case we have considered the most) although the first and fourth methods may be most
effective.
A generalisation to allow several different objectives

All four methods may be extended slightly so that they apply to the minimisation of
\[ Z(x, p) = \max (Z_1(x, p), Z_2(x, p), \ldots, Z_K(x, p)) \]
(say), instead of a single Z, by thinking of Z now as a free variable and minimising the
independent variable Z subject to all the equilibrium constraints in (4), constraints on the
control vector p and also:
\[ Z_k(x, p) - Z \le 0 \text{ for } k = 1, 2, 3, \ldots, K. \]
Then to take account of the constraints in (4) we may replace them by
\[ -x_i - M \le 0 \text{ for } i = 1, \ldots, n; \quad -f_i(x, p) - M \le 0 \text{ for } i = 1, \ldots, n; \quad \text{and } E_2 - M \le 0; \]
together with $M \le 0$. The bilevel problem now becomes:
Minimise Z subject to
\[ Z_k(x, p) - Z \le 0 \text{ for } k = 1, 2, 3, \ldots, K; \]
\[ -x_i - M \le 0 \text{ for } i = 1, \ldots, n; \quad -f_i(x, p) - M \le 0 \text{ for } i = 1, \ldots, n; \quad \text{and } E_2 - M \le 0; \]
and also: $M \le 0$.
In this way certain of the methods described in this paper may be naturally modified to assist in
meeting different objectives.
Solution method 1: Using a sequence of linear programs

Now suppose p is subject to linear constraints. The most straightforward method of minimising
Z subject to these p-constraints and the constraints in (4) is as follows. At a typical point $(x_0, p_0)$
linearise $f_i$, $E_2$ and Z around $(x_0, p_0)$; let $\tilde{E}_2((x_0, p_0), (x, p))$ be the value of the linearised $E_2$
(linearised at $(x_0, p_0)$), evaluated at (x, p); and similarly for $\tilde{f}_i$ and $\tilde{Z}$. Now minimise
$\tilde{Z}((x_0, p_0), (x, p))$ subject to the p-constraints and also the linearised constraints:
$x_i \ge 0$, $-\tilde{f}_i((x_0, p_0), (x, p)) \le 0$ for i = 1, 2, . . . , n; and $\tilde{E}_2((x_0, p_0), (x, p)) \le 0$.
It follows from the analysis in method 3 below that this linear program will be feasible for
certain natural functions $f_i$ and Z. Let an optimal solution of this LP be $(x_0, p_0) + \delta$. Then $\delta$ is to
be our search direction and a point in this direction closer (than $(x_0, p_0)$) to satisfying every
constraint in (4), and still satisfying the linear control constraints, is chosen. The process is
repeated at this new point, and so on.
The feasibility of this natural method emerges clearly from our analysis of method 3 below
since that analysis provides conditions under which the approximating linear program here is
feasible. Omitting some or many constraint functions from the LP will accelerate the
procedure.
Solution method 2. A penalty method

We have applied a penalty method to seek road prices on the separate links of a network model
of Edinburgh which minimise Z (total journey time) subject to equilibrium. We put E = EI as a
measure of the degree to which equilibrium is not satisfied; EI is an objective function we have
often used before and is specified in Smith et al (1998a, b) for example. Then we minimised
KE + Z for larger and larger K using the conjugate gradient method. We will be seeking to
impose realistic constraints in future work and so we have done this rather than using marginal
cost pricing. Results of preliminary tests are shown in Figures 1 and 2. Figure 1 shows that for
K = 10 or 100 E does appear to approach 0 quite closely; the initial value of E is about 106 in
each case. For these initial tests the ratio of total congestion benefit to total out-of-pocket cost
was found to be about 1/10.
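The penalty idea (minimise KE + Z for larger and larger K) can be illustrated on a toy problem. The sketch below is not the Edinburgh model: the functions E and Z are hypothetical one-dimensional stand-ins, x and p are scalars, and plain fixed-step gradient descent is used in place of the conjugate gradient method mentioned above.

```python
def E(x, p):
    """Hypothetical equilibrium gap: zero exactly when x = p."""
    return (x - p) ** 2

def Z(x, p):
    """Hypothetical objective standing in for total journey time."""
    return (x - 1.0) ** 2 + p ** 2

def minimise_penalised(K, x, p, steps=20000):
    """Fixed-step gradient descent on K*E + Z, started from (x, p)."""
    step = 1.0 / (4.0 * K + 2.0)   # reciprocal of the largest Hessian eigenvalue here
    for _ in range(steps):
        gx = 2.0 * K * (x - p) + 2.0 * (x - 1.0)
        gp = -2.0 * K * (x - p) + 2.0 * p
        x, p = x - step * gx, p - step * gp
    return x, p

x, p = 0.0, 0.0
history = []
for K in (1.0, 10.0, 100.0):
    x, p = minimise_penalised(K, x, p)   # warm start from the previous K
    history.append((K, x, p, E(x, p), Z(x, p)))
    print(K, round(x, 4), round(p, 4), E(x, p), Z(x, p))
```

For this toy problem the penalised minimiser is p = K/(1 + 2K), x = 1 - p, so the gap E shrinks as K grows and (x, p) approaches the constrained optimum x = p = 1/2.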
[Figure 1: Graphs showing that E becomes very small for K=10 and K=100. Edinburgh network with flow-delay curve y+yb/30; curves labelled E+Z, 10E+Z and 100E+Z; horizontal axis: steps.]
[Figure 2: Graphs showing the value of Z as the minimisation of KE+Z proceeds and also an estimate of Z for zero prices. Edinburgh network with flow-delay curve y+yb/30; curves labelled E+Z, 10E+Z, 100E+Z and "no prices"; horizontal axis: steps.]
Solution method 3. A Cone Projection Method

Suppose that for each fixed p the function f(·, p) is monotone and (for simplicity and
temporarily) linear in x. Then
$f(x, p) = A_p x + a_p$
for some square positive semi-definite matrix $A_p$ and some n-vector $a_p$ which both depend on p.
Since $A_p$ is positive semi-definite (that is $x^T A_p x \ge 0$ for each n-vector x), $x \cdot A_p x$ is a convex
function of x (its Hessian $A_p + A_p^T$ is twice the positive semi-definite symmetric part $\tfrac{1}{2}(A_p + A_p^T)$); and so
$x \cdot f(x, p)$ is a convex function of x for each fixed p. By linearity each $-f_i$ is convex too. Thus if f
is monotone and linear in x for fixed p each inequality in (4) specifies a convex set and the set
of equilibria is the intersection of the (2n + 1) convex sets in (4).
It thus appears that monotonicity combined with linearity has the excellent consequence of
making the sets in (4) convex. We will here just impose the convexity of all the sets in (4);
instead of assuming that f is linear.
Since we wish to vary p in $\mathbb{R}^m$ we need constraints on the set of possible p values. Let these
constraints be $e_j(x, p) \le 0$. Capacity constraints deriving from engineering considerations are
already included in the conditions $-f_i \le 0$ in (4); these new e-constraints may represent either
engineering constraints on the vector of controls p or new constraints, perhaps environmental
constraints, on x and p together. Now we let $e_j(x, p) \le 0$ for j = 1, 2, 3, . . . , J stand for $x \ge 0$ and
the "old" $e_j(x, p) \le 0$; and $h_k \le 0$ stand for $-f_i(x, p) \le 0$ and $E_2 = \sum_i x_i f_i(x, p) \le 0$. Then the
control-augmented set of constraints is:
$e_j(x, p) \le 0$ for j = 1, 2, 3, . . . , J and $h_k(x, p) \le 0$ for k = 1, 2, 3, . . . , K. (5)
Smoothness and convexity assumptions required

1. The original $e_j$ and $f_i$ are differentiable and their (x, p)-gradients are continuous and non-zero.
2. The original $e_j$, the $-f_i$ and $E_2$ are convex in x for each fixed p.
(The new $e_j$ and the $h_k$ inherit all these properties.)
Feasibility
Here the e-constraints will be regarded as feasibility constraints. So "(x, p) is feasible" means
that $e_j(x, p) \le 0$ for all j.
Delay and cost formulae and other conditions ensuring that the convexity constraints 2 hold
Suppose that a twice differentiable single link cost-flow function $\phi_i(\cdot)$ is given so that
$b_i = \phi_i(q_i)$ (where $q_i$ is a single link flow), and that the inverse of this function is $g_i$. It is easy to
check that $b_i g_i(b_i)$ is convex if and only if
$2[\phi_i'(q_i)]^2 \ge \phi_i(q_i)\,\phi_i''(q_i)$
for all relevant $q_i$. (This includes both capacitated links where delay tends to infinity as flow
approaches capacity and also uncapacitated links.)
Using this criterion it follows that certain standard cost flow functions have inverses $g_i$ such
that $b_i g_i(b_i)$ is convex. For example consider the second term alone of Webster's two-term delay
formula. This is proportional to (dropping suffices) $\phi(q) = q/(a - q)$ (a = capacity; q = flow);
so that $\phi'(q) = 1/(a - q) + q/(a - q)^2$, $\phi''(q) = 2/(a - q)^2 + 2q/(a - q)^3$ and
$2[\phi'(q)]^2 = 2/(a - q)^2 + 2q^2/(a - q)^4 + 4q/(a - q)^3 \ge 2q/(a - q)^3 + 2q^2/(a - q)^4 = \phi(q)\phi''(q)$.
This means that the second part of Webster's delay formula has an inverse g such that bg(b) is
convex.
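The Webster example can be checked numerically. The sketch below evaluates the quantity 2[φ'(q)]² - φ(q)φ''(q) that appears in the inequality displayed for Webster's second term, and separately tests the convexity of b·g(b) with a finite second difference. The capacity value and the sample flows are assumed for illustration; the closed-form inverse g(b) = ab/(1 + b) follows from solving b = q/(a - q) for q.

```python
def phi(q, a):
    """Second term of Webster's formula (up to a constant): q / (a - q)."""
    return q / (a - q)

def phi1(q, a):
    """First derivative: 1/(a-q) + q/(a-q)^2 = a/(a-q)^2."""
    return a / (a - q) ** 2

def phi2(q, a):
    """Second derivative: 2/(a-q)^2 + 2q/(a-q)^3 = 2a/(a-q)^3."""
    return 2.0 * a / (a - q) ** 3

def criterion_gap(q, a):
    """2*phi'(q)^2 - phi(q)*phi''(q); non-negative when the criterion holds."""
    return 2.0 * phi1(q, a) ** 2 - phi(q, a) * phi2(q, a)

def g(b, a):
    """Inverse of phi: the flow q that produces delay b."""
    return a * b / (1.0 + b)

def second_difference(b, a, h=1e-3):
    """Central second difference of b*g(b), a numerical stand-in for its second derivative."""
    F = lambda t: t * g(t, a)
    return (F(b + h) - 2.0 * F(b) + F(b - h)) / h ** 2

capacity = 10.0                      # assumed illustrative capacity
flows = [0.5, 2.5, 5.0, 9.0]         # assumed sample flows below capacity
for q in flows:
    b = phi(q, capacity)
    print(q, criterion_gap(q, capacity), second_difference(b, capacity))
```

Both printed quantities stay positive at every sampled flow, in line with the claim that b·g(b) is convex for this delay term.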
Further, it is clear that adding two g's with the desired convexity property yields another
function with the desired convexity property. Of course linear functions satisfy the convexity
property and so Kimber and Hollis (1979) delay formulae satisfy the convexity property as
they are generated in precisely this manner; by adding the inverse of a function of the form
q/(a-q) and a linear function (two functions each having the required convexity property) and
then inverting the sum.
Thus if the delay formulae $\phi_i(q_i, y_i)$ are all of Kimber and Hollis type the $b_i g_i(b_i, y_i)$ will all be
convex for fixed $y_i$. In this case $b \cdot g(b, y)$ is a convex function of b for fixed y.
So far here we have only looked at part of E2(x, p). Now

$E_2 = x \cdot f(x, p)$
$= (X, C, b) \cdot (-B^T C + H(X) + P + M^T(b + \pi),\ -W(C) + BX,\ -MX + g(b, NY))$
$= -X \cdot B^T C + X \cdot M^T b + C \cdot BX - b \cdot MX + X \cdot [H(X) + P + M^T \pi] - C \cdot W(C) + b \cdot g(b, y)$
$= 0 + X \cdot [H(X) + P + M^T \pi] - C \cdot W(C) + b \cdot g(b, y)$.
Thus $x \cdot f(x, p)$ is convex in x for fixed p if $X \cdot H(X)$ is convex in X, $-C \cdot W(C)$ is convex in C and
$b \cdot g(b, y)$ is convex in b for fixed y.
Also
$-f_{1r}(X, C, b, \pi, Y, P) = \sum_n B^T_{rn} C_n - H_r(X) - P_r - \sum_i M^T_{ri}(b_i + \pi_i)$,
$-f_{2n}(X, C, b, \pi, Y, P) = W_n(C) - \sum_r B_{nr} X_r$ and
$-f_{3i}(X, C, b, \pi, Y, P) = \sum_s M_{is} X_s - g_i(b_i, \sum_k N_{ik} Y_k)$
are all convex (for fixed $\pi$, Y, P) if $-H_r(X)$, $W_n(C)$, $-g_i(b_i, \sum_k N_{ik} Y_k)$ are convex (for fixed Y).
Convexity condition
A portmanteau convexity condition which ensures that the convexity constraint 2 holds is thus as
follows. For fixed y or Y and for each r, n and i
$-H_r(X)$, $W_n(C)$, $-g_i(b_i, y_i)$ are convex in X, C and b respectively; and
$X \cdot H(X)$, $-C \cdot W(C)$, $b \cdot g(b, y)$ are also convex in X, C and b respectively.
This convexity condition is bound to hold if H, -W, and g are all linear and f is monotone.
MIN-MAX condition for equilibrium

Given a feasible (x, p) put
$M(x, p) = \max\{0, h_k(x, p); k = 1, 2, 3, \ldots, K\} = \max\{h_k(x, p); k = 1, 2, 3, \ldots, K\}$
(the two maxima agree because, at a feasible point, $x \ge 0$ and $f(x, p) \ge 0$ together force $E_2 \ge 0$)
and for each fixed p let $MM_p = \inf\{M(x, p); (x, p) \text{ is feasible}\}$. Then a feasible (x, p) satisfies
all the constraints in (5) if and only if
M(x, p) = 0 or: x minimises M(·, p) and M(x, p) = 0. (6)
So to find an equilibrium we need only: (i) find a feasible (x, p), which satisfies $e_j(x, p) \le 0$ for
j = 1, 2, 3, . . . , J; (ii) maintain this feasibility throughout; (iii) reduce M(x, p) to a minimum; and
(iv) check that this M(x, p) = 0.
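The min-max condition can be illustrated by evaluating M on a grid of feasible points. In this sketch p is held fixed, the only feasibility constraint is taken to be x ≥ 0, and f is an assumed monotone linear map with illustrative coefficients (none of these numbers come from the paper).

```python
def f(x):
    """Toy monotone linear response f(x) = A x + a for a fixed p (assumed numbers)."""
    return [2.0 * x[0] + 1.0 * x[1] - 2.0, 1.0 * x[0] + 2.0 * x[1] + 1.0]

def h(x):
    """The h-constraints of (5): -f_i(x) <= 0 for each i, then E_2 = sum_i x_i f_i(x) <= 0."""
    fx = f(x)
    return [-fi for fi in fx] + [sum(xi * fi for xi, fi in zip(x, fx))]

def M(x):
    """Largest h-constraint value; it vanishes exactly at an equilibrium."""
    return max(h(x))

# Feasible grid points (x >= 0) on [0, 2] x [0, 2] with spacing 0.25.
grid = [(0.25 * i, 0.25 * j) for i in range(9) for j in range(9)]
best = min(grid, key=M)
print(best, M(best))
```

On this grid M is never negative and its minimum value, zero, is attained at the single point (1.0, 0.0), which is the equilibrium of the toy problem.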
Theorem 1. Let the smoothness and convexity assumptions 1 and 2 hold, let (x, p) be feasible
and let $M(x, p) > MM_p$. Then there are M-descent feasible directions at (x, p).
Proof. Since all the constraints in (5) are convex, M(x, p) is convex in x for each fixed p. Now
suppose that $M(x, p) > MM_p$. Then there is a feasible (y, p) such that M(y, p) < M(x, p). Since
M(·, p) is convex and the feasibility constraints are convex the line joining (x, p) and (y, p) is
an M-reducing feasible direction. Thus there are M-descent directions at (x, p) which preserve
feasibility. The proof is complete.
Corollary. Let p be fixed and the smoothness and convexity assumptions 1 and 2 hold.
Suppose that there is an equilibrium solution. Then at each feasible (x, p) there are directions
which reduce M(x, p) unless this is already zero.
Proof: The existence of a solution means that there is an $x_0$ such that $M(x_0, p) = 0$; so $MM_p = 0$
and theorem 1 ensures that there are feasible directions at (x, p) which reduce M(x, p) if it is
positive.
A cone-field method of calculating equilibria

In the light of theorem 1 (and theorem 2 below) it becomes natural to define the cone field F to
be the function which assigns to any feasible (x, p) the cone of feasible directions in which
M(x, p) does not increase at (x, p).
Suppose (x(t), p(t)) is defined for all t > 0 and has a right derivative (x*(t), p*(t)) at all t.
Suppose further that:
M(x(t), p(t)) is strictly decreasing for all t > 0 except at points t where (x*(t), p*(t)) = 0. (7)
In this case we shall call the trajectory (x(·), p(·)) an assignment process following Smith
(1979a) who followed Smale (1976). Thus an assignment process is a path along which M is
strictly decreasing. This notion of an assignment process is a generalisation of the solution of a
differential equation. It is very useful here.
Suppose that D(x, p) is specified in F(x, p), for all (x, p) such that M(x, p) > 0; satisfying:
M((x, p) + tD(x, p)) is strictly decreasing for 0 < t < 1 if int F(x, p) is nonempty;
and D(x, p) = 0 if int F(x, p) is empty. (8)
Here int F(x, p) is the relative interior of F(x, p).
Given a starting point $(x_0, p_0)$ we may then define a polygonal assignment process which
begins at $(x_0, p_0)$ and then goes to $(x_1, p_1)$, $(x_2, p_2)$, . . . where $(x_1, p_1) = (x_0, p_0) + D(x_0, p_0)$,
$(x_2, p_2) = (x_1, p_1) + D(x_1, p_1)$, $(x_3, p_3) = (x_2, p_2) + D(x_2, p_2)$, . . . . By design:
$M(x_0, p_0) > M(x_1, p_1) > M(x_2, p_2) > M(x_3, p_3) > \ldots$
unless the process stops. Usually we obtain an infinite sequence $\{(x_n, p_n)\}$.
The gain w(x, p) at (x, p) is defined by w(x, p) = M(x, p) - M((x, p) + D(x, p)).
Smale defined a complete process as one such that if (x, p) was approached by it; then there
would automatically be no non-trivial process starting from (x, p). With our polygonal
definition an assignment process will be complete if it is such that any point (x, p) approached
by it is automatically an equilibrium.
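A polygonal assignment process and its gains can be sketched as follows. The direction rule D used here is a deliberately crude, hypothetical stand-in (the best feasible axis step from a short list of step sizes); it is not the cone-field direction of the paper and is not claimed to satisfy condition (8) or to be complete. The response f, with p fixed, is an assumed monotone linear example.

```python
def f(x):
    """Toy monotone linear response for a fixed p (assumed numbers)."""
    return (2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0)

def M(x):
    """max over the h-constraints: -f_1, -f_2 and E_2 = x . f(x)."""
    fx = f(x)
    return max(-fx[0], -fx[1], x[0] * fx[0] + x[1] * fx[1])

def D(x, sizes=(0.5, 0.1, 0.02, 0.004)):
    """Hypothetical direction rule: best feasible, M-reducing axis step (zero if none)."""
    best, best_val = (0.0, 0.0), M(x)
    for s in sizes:
        for d in ((s, 0.0), (-s, 0.0), (0.0, s), (0.0, -s)):
            y = (x[0] + d[0], x[1] + d[1])
            if min(y) >= 0.0 and M(y) < best_val:   # keep x >= 0 and reduce M
                best, best_val = d, M(y)
    return best

def polygonal_process(x0, max_steps=500):
    """Corners x_0, x_1 = x_0 + D(x_0), x_2 = x_1 + D(x_1), ... until D vanishes."""
    path = [x0]
    while len(path) <= max_steps:
        d = D(path[-1])
        if d == (0.0, 0.0):
            break
        path.append((path[-1][0] + d[0], path[-1][1] + d[1]))
    return path

path = polygonal_process((2.0, 1.0))
values = [M(x) for x in path]
gains = [values[i] - values[i + 1] for i in range(len(values) - 1)]   # w at each corner
print(len(path), values[0], values[-1])
```

Every recorded gain is positive, so M decreases strictly along the corners; whether such a process reaches an equilibrium depends on the direction rule, which is the subject of Theorem 2.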
Theorem 2. (A class of complete assignment processes)

Let D(.,.) satisfy the above condition (8) and also be such that the gain w(x, p) is a continuous
function of (x, p). Then any bounded assignment process {(xn, pn)} generated by D(., .) will be
complete.
(These conditions may be made weaker. In particular it is sufficient to suppose that w is
bounded below by a continuous function which is positive at non-equilibria.)
Proof. Let D be such that w is continuous. It is automatically true that w will be positive at
non-equilibria. Let an assignment process arising from D be bounded. Then if it has only
finitely many positive steps it must terminate at $(x_m, p_m)$ (say) where $D(x_m, p_m) = 0$ and so
$w(x_m, p_m) = 0$ and we are at an equilibrium.
On the other hand let the bounded polygonal path contain an infinite number of corners $\{(x_n, p_n)\}$,
where n = 0, 1, 2, 3, 4, . . . . These corners are all distinct as M decreases. Hence the set
$\{(x_n, p_n)\}$ is infinite, and being bounded has a limit point (x', p'). We shall show that such a
limit point is an equilibrium.
Suppose that (x', p') is a limit point and that this (x', p') is not an equilibrium. Then int F(x', p')
is non-empty, D is non-zero, w(x', p') > 0 and so M(x', p') - M((x', p') + D(x', p')) = w(x', p') >
0. Since (x', p') is a limit point there is a subsequence of $\{(x_n, p_n)\}$ converging to (x', p'). Call
this subsequence $\{(x_n, p_n)\}$. Now $w(x_n, p_n)$ tends to w(x', p') = w' > 0 as n tends to infinity since
w is continuous. Hence there is an $n_0$ such that $w(x_n, p_n) > w'/2 > 0$ for all $n > n_0$.
Thus M is reduced by at least w'/2 > 0 between $(x_n, p_n)$ and $(x_{n+1}, p_{n+1})$ for $n = n_0, n_0 + 1,
n_0 + 2, \ldots, n_0 + k - 1$ (for any choice of k). Over k steps M is thus reduced by at least
kw'/2 which may be made more than $M(x_0, p_0)$ (the initial value of M) by choosing k
sufficiently large. M decreases at each step and so if k is so chosen then
$M(x_{n_0+k}, p_{n_0+k}) \le M(x_{n_0}, p_{n_0}) - kw'/2 \le M(x_0, p_0) - kw'/2 < 0$.
However M is intrinsically non-negative and so we have a contradiction.
This contradiction arises from the assumption that $\{(x_n, p_n)\}$ has a limit point (x', p') which is a
feasible non-equilibrium. Thus all feasible limit points of the sequence $\{(x_n, p_n)\}$ must be
equilibria and the assignment process defined by D is complete. QED.
Notes: (1) For practical implementation D(x, p) must be specified; (2) only continuity at the
limit point (x', p') is required; (3) even then continuity is only required with reference to a
subsequence of points of the original $\{(x_n, p_n)\}$.
The cone projection method

Now we specify a D(x, p) which not only ensures convergence to an equilibrium (that is gives
rise to a complete assignment process) but also seeks to do the best for any given smooth
objective function.
Suppose given a smooth (continuously differentiable) objective function Z = Z(x, p); where x
is the vector of flows, delays and costs, and p is the vector of (signal green-time proportions
and prices including any feasible road prices). The general form of "the cone-projection
method" is to begin at any feasible (x, p) and continually follow a polygonal path which at each
step follows a direction D which reduces M(x, p), while "approximately doing the best for" the
given Z. As motivated here such a trajectory may (under natural conditions) be expected to
converge to equilibrium and a weak variety of local-optimality simultaneously.
Let (x, p) be feasible but not an equilibrium. If $h_k(x, p) > M(x, p)/2 > 0$ we will say that
constraint k is very violated; otherwise it is not very violated.
Let $C_0(x, p)$ be the cone of directions in (x, p)-space which do not cause any e-constraint (in
(5)) to become violated and let $C_1(x, p)$ be the cone of those directions at (x, p) along which no
very violated h-constraint (in (5)) becomes more violated. Then $C_1(x, p)$ is the cone of
directions $\delta$ (in (x, p)-space) such that $\delta \cdot h_k'(x, p) \le 0$ if $h_k(x, p) > M(x, p)/2$. Directions in the
relative interior of $C_1(x, p)$, int $C_1(x, p)$, reduce the violation of all very violated constraints in
(5), at (x, p), simultaneously and so reduce M.
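The selection of very violated constraints and the membership test for the cone of directions along which none of them becomes more violated can be sketched as below. The constraint values and gradients are made-up illustrative numbers, and the function names are hypothetical.

```python
def very_violated(h_values):
    """Indices k with h_k > M/2 > 0, where M = max(0, max_k h_k)."""
    M = max(0.0, max(h_values))
    return [k for k, hk in enumerate(h_values) if M > 0.0 and hk > M / 2.0]

def in_C1(direction, h_values, h_gradients, tol=1e-12):
    """True if direction . h_k' <= 0 for every very violated constraint k."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return all(dot(direction, h_gradients[k]) <= tol for k in very_violated(h_values))

# Assumed illustrative data: four h-constraints in a two-dimensional (x, p)-space.
h_values = [4.0, 2.5, 1.0, -3.0]
h_gradients = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (1.0, 1.0)]

print(very_violated(h_values))
for direction in [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]:
    print(direction, in_C1(direction, h_values, h_gradients))
```

Here M = 4, so only the first two constraints exceed M/2 and count as very violated; the mildly violated third constraint is ignored when testing a direction.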
Let the vector d(x, p) be the "centre-line" of the cone $C_0(x, p) \cap C_1(x, p)$ (the zero vector if and
only if the cone is empty), $\mathrm{desc}Z(x, p) = -Z'(x, p)/\|Z'(x, p)\|$ and
$\mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p)$ be $\mathrm{desc}Z(x, p)$ projected onto the cone $C_0(x, p) \cap C_1(x, p)$.
The cone projection method follows the assignment process generated by D where D has the
following form:
$D(x, p) = D_{\alpha\beta}(x, p) = \alpha\, d(x, p) + \beta\, \mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p)$. (9)
At each step $\alpha$ and $\beta$ (both positive) are to be chosen so that (8) holds; and hence M decreases
at each step. Now D(x, p) is non-zero unless d(x, p) = 0 and hence M(x, p) = 0. Thus any zero
of D(x, p) is an equilibrium.
The solution method is thus in outline to follow a direction D(x, p) at each (x, p). This is
intended to be a refinement of the bi-level method proposed in Smith et. al. (1998a,b, 1997);
replacing half-spaces with cones to narrow the search region in an effort to reduce
numerical/computational problems.
So far we have not specified either the "centre-line" d(x, p) or $\mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p)$.
The "weighted centre-line" of the cone $C_0(x, p) \cap C_1(x, p) = C(x, p)$ is obtained by solving the
problem P(x, p) shown below:
Problem P(x, p):
Minimise $\| \sum_i \mu_i(-e_i'(x, p)) + \sum_k \lambda_k(-h_k'(x, p)) \|$
subject to $\mu_i \ge 0$ (all i); $\mu_i = 0$ if $e_i < 0$,
and $\lambda_k \ge 0$ (all k); $\lambda_k = 0$ if $h_k \le M/2$ and $\sum_k \lambda_k h_k = 1$.
This direction, if followed and if the $h_k$ were linear, would reduce all the currently very
violated constraint functions to zero in the shortest distance. Now at each feasible
non-equilibrium (x, p) let the centre-line be defined by:
$d(x, p) = \sum_i \mu_i(-e_i'(x, p)) + \sum_k \lambda_k(-h_k'(x, p))$,
where $\lambda$ and $\mu$ solve P(x, p). By virtue of theorem 1 above d(x, p) is non-zero at non-equilibria.
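For intuition, problem P can be solved in closed form in the simplest non-trivial case: two very violated h-constraints and no active e-constraints. This is an assumed special case, not the general method. Writing t = λ₁h₁, the normalisation Σλₖhₖ = 1 leaves d(t) = t·u₁ + (1 - t)·u₂ with uₖ = -hₖ'/hₖ and 0 ≤ t ≤ 1, so the centre-line is the minimum-norm point of a line segment. The function name and the numbers are hypothetical.

```python
def centre_line_two(g1, h1, g2, h2):
    """Centre-line d and multipliers (lambda_1, lambda_2) for two very violated constraints.

    g1, g2 are the gradients h_1', h_2'; h1, h2 > 0 are the constraint values.
    """
    u1 = [-c / h1 for c in g1]
    u2 = [-c / h2 for c in g2]
    diff = [a - b for a, b in zip(u1, u2)]
    denom = sum(c * c for c in diff)
    # Unconstrained minimiser of |u2 + t*(u1 - u2)|^2, then clipped to [0, 1].
    t = 0.5 if denom == 0.0 else -sum(b * c for b, c in zip(u2, diff)) / denom
    t = min(1.0, max(0.0, t))
    d = [t * a + (1.0 - t) * b for a, b in zip(u1, u2)]
    return d, (t / h1, (1.0 - t) / h2)

# Assumed illustrative data.
g1, h1 = (1.0, 0.0), 2.0
g2, h2 = (0.0, 1.0), 1.0
d, lambdas = centre_line_two(g1, h1, g2, h2)
norm_sq = sum(c * c for c in d)
shortest_step = [c / norm_sq for c in d]   # rescaled centre-line
print(d, lambdas, shortest_step)
```

In this example d is approximately (-0.4, -0.2), and the rescaled step d/|d|² = (-2, -1) changes the two linearised constraints by exactly -h₁ and -h₂, which matches the "reduce to zero in the shortest distance" description for linear constraints.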
Consider also problem Q(x, p):

Minimise $\| \mathrm{desc}Z(x, p) + \sum_i \mu_i(-e_i'(x, p)) + \sum_k \lambda_k(-h_k'(x, p)) \|$
subject to: $\mu_i \ge 0$ (all i) and $\mu_i = 0$ if $e_i < 0$; $\lambda_k \ge 0$ (all k) and $\lambda_k = 0$ if $h_k \le M/2$.
Now $\mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p)$ is defined by:
$\mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p) = \mathrm{desc}Z(x, p) + \sum_i \mu_i(-e_i'(x, p)) + \sum_k \lambda_k(-h_k'(x, p))$,
where $\lambda$ and $\mu$ solve Q(x, p).
Direction D(x, p) may now be defined as a weighted sum of these two vectors. This
"generalises" the corresponding direction in Smith et al (1998a, b) which involved projections
onto half-spaces. Smith et al (1997, 1998b) give an initial result of applying the method, using
half-spaces, to calculate good signal timings.
An iterative process
Suppose a feasible starting point $(x_0, p_0)$ is given, D(x, p) is given by (9) as prescribed above,
and w is continuous. Let $(x_1, p_1) = (x_0, p_0) + D(x_0, p_0)$; $(x_2, p_2) = (x_1, p_1) + D(x_1, p_1)$; and so on.
Thus we follow a standard "M-reducing" polygonal trajectory. We obtain a (usually infinite)
sequence $\{(x_n, p_n)\}$.
Proof of convergence of the iterative process to the set of equilibria

The proof relies on or is similar to Theorems 1 and 2. Let the polygonal path be bounded. Then
if it has only finitely many steps it must terminate at $(x_m, p_m)$ (say) where $D(x_m, p_m)$ is zero. At
such a point w = 0 and $(x_m, p_m)$ is an equilibrium.
On the other hand let the bounded polygonal path have an infinite number of corners $\{(x_n, p_n)\}$,
where n = 0, 1, 2, 3, 4, . . . . Since the path is bounded the set $\{(x_n, p_n)\}$, being infinite (no
duplicates as M declines), has a limit point (x', p'). We shall show that every such limit point is
an equilibrium.
Suppose that (x', p') is any such limit point and that this (x', p') is not an equilibrium. For each n
let A(n) be the set of those suffices i and k such that $h_k$ and $e_i$ occur in $C_0(x_n, p_n) \cap C_1(x_n, p_n)$.
(These suffices correspond to active constraints.) Then there is at most a finite number of
possible A(n) and some A(n) must be repeated infinitely often. Consider a subsequence of
$\{(x_n, p_n)\}$ which converges to (x', p') and for which A(n) (= A (say)) is always the same set of active
suffices. Call this subsequence $\{(x_n, p_n)\}$.
Then in this subsequence the cones $C_0(x_n, p_n) \cap C_1(x_n, p_n)$ all involve the same constraints and the
constraint functions are continuous; and so $\{C_0(x_n, p_n) \cap C_1(x_n, p_n)\}$ converges to $C_0' \cap C_1'$ (say)
with again the same active constraints. Now, since the constraint functions are continuous,
$C_0(x', p') \cap \mathrm{int}\,C_1(x', p') \subset C_0' \cap \mathrm{int}\,C_1'$; so (x', p') not being an equilibrium implies that
$C_0(x', p') \cap \mathrm{int}\,C_1(x', p')$ is non-empty, which implies that $C_0' \cap \mathrm{int}\,C_1'$ is non-empty.
Hence M must decline in direction $D_A(x', p')$, determined as D(x', p') is determined but using
only those e and h constraints with suffices in A. Let w be the greatest reduction in M possible
in direction $D_A(x', p')$ beginning at (x', p'). Then w(x', p') = w' > 0 and it follows that, since $w(x_n, p_n)$
tends to w(x', p') as n tends to infinity, there is an $n_0$ such that $w(x_n, p_n) > w'/2$ for all $n > n_0$.
Thus M is reduced by at least w'/2 between $(x_n, p_n)$ and $(x_{n+1}, p_{n+1})$ for $n = n_0, n_0 + 1, n_0 + 2,
\ldots, n_0 + k - 1$ (for any choice of k). Over k steps M is thus reduced by at least
kw'/2 which may be made more than $M(x_0, p_0)$ (the initial value of M) by choosing k
sufficiently large. However M decreases at each step and so if k is so chosen then
$M(x_{n_0+k}, p_{n_0+k}) \le M(x_{n_0}, p_{n_0}) - kw'/2 \le M(x_0, p_0) - kw'/2 < 0$. But M is intrinsically non-negative and so we
have a contradiction. This contradiction arises from the assumption that $\{(x_n, p_n)\}$ has a limit
point (x', p') which is a feasible non-equilibrium. Thus all feasible limit points of the sequence
$\{(x_n, p_n)\}$ must be equilibria.
Problem with the iterative process

The step lengths are chosen solely to reduce M to zero and so there is no reason to think that
the polygonal path will converge to a minimum of Z within the set of equilibria.
This problem may be resolved in several ways but here we propose to interrupt the previous M
reducing scheme periodically and to follow a modified direction with a constant small step
length aiming to reduce Z. This "interruption" may instead use a standard constrained
minimisation procedure: minimising Z subject to a relaxed equilibrium condition, allowing M
to increase somewhat as Z is minimised. Thus we come by the implementation proposed
below.
Implementation and outline justification

We have defined M(x, p) but now we put
$N(x, p) = \| \mathrm{desc}Z(x, p) \mid C_0(x, p) \cap C_1(x, p) \|$

$= \| \mathrm{desc}Z(x, p) + \sum_i \mu_i(-e_i'(x, p)) + \sum_k \lambda_k(-h_k'(x, p)) \|$
where $\mu_i$, $\lambda_k$ solve problem Q(x, p). N(x, p) is a measure of the degree to which (x, p) departs
from satisfying a KKT condition. (We shall say that (x, p) satisfies a KKT condition iff N(x, p)
= 0.)
We follow the two-stage method in Smith et al (1998a, b) and Clegg and Smith (1998). In
each iteration there are two stages.
Beginning at $(x_0, p_0)$, let $M_0 = M(x_0, p_0) > 0$ and $N_0 = 1$.
First iteration. In the first stage we follow the polygonal path above, each step of which begins
in direction D(x, p) given by (9), until $M(x, p) < M_0/4$.
Then in the second stage we follow $D_{\alpha\beta}$ given by (9), with $\alpha/\beta$ small, so that eventually
$N(x, p) < N_0/2$; and $M(x, p) < M_0/2$.
This will be a polygonal path and both $\alpha$ and $\beta$ must be chosen so that both conditions hold at
the termination of stage 2. (Conjecture: such a choice is possible. We believe that this is fairly
readily provable by using arguments like those in Zhang and Nagurney (1995).) Repeating
these two stages yields a sequence $\{(x_n, p_n)\}$. Let (x*, p*) be any limit point of this sequence.
Then
(i) $M(x_n, p_n) < M_0/2^n$ and so M(x*, p*) = 0 and (x*, p*) belongs to the equilibrium set; and
(ii) $N(x_n, p_n) < N_0/2^n$ and (x*, p*) is an "asymptotic" type of KKT point.
(It may be best to run the second stage until N satisfies a much tighter constraint - even until N
= 0.)
Alternative second stage

An alternative second stage here would be to use a constrained minimisation algorithm instead
of the direction $D_{\alpha\beta}$ above. Adopting the latter alternative provides a complete proof of
convergence to a locally optimal point at the termination of each stage 2.
Solution method 4. A sharper cone projection method

Here we decompose $E_2(x, p) \le 0$ into inequalities $x_i f_i(x, p) \le 0$ (i = 1, 2, 3, . . . , n) and treat all
these separately. Thus we let
$M(x, p) = \max\{-f_i(x, p); i = 1, 2, 3, \ldots, n\} + \max\{x_i f_i(x, p); i = 1, 2, 3, \ldots, n\}$
and seek a direction which reduces this new M. This is now far more "discriminating"; we
propose to cause all positive members of both the sets on the right hand side to diminish
simultaneously.
An alternative way of writing this M is as follows:

$M(x, p) = \max\{[-f_i(x, p)]_+ + [x_i f_i(x, p)]_+; i = 1, 2, 3, \ldots, n\}$
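The positive-part form of the sharper M is easy to evaluate coordinate by coordinate. The sketch below uses an assumed monotone linear f for a fixed p (illustrative numbers only) and reports, for each i, the term [-f_i]_+ + [x_i f_i]_+ whose maximum is M.

```python
def f(x):
    """Toy monotone linear response for a fixed p (assumed numbers)."""
    return [2.0 * x[0] + x[1] - 2.0, x[0] + 2.0 * x[1] + 1.0]

def positive_part(v):
    return max(0.0, v)

def sharper_terms(x):
    """Per-coordinate terms [-f_i]_+ + [x_i f_i]_+."""
    return [positive_part(-fi) + positive_part(xi * fi) for xi, fi in zip(x, f(x))]

def sharper_M(x):
    """Maximum of the per-coordinate terms."""
    return max(sharper_terms(x))

for x in ([1.0, 0.0], [2.0, 0.5], [0.0, 0.0]):
    print(x, sharper_terms(x), sharper_M(x))
```

The per-coordinate terms show which inequality is responsible for the violation: at (2, 0.5) both complementarity products are positive, while at the origin only the first coordinate has f_1 < 0.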
To ensure that we may always find a direction which reduces this new M (if it is positive) we
need to assume:
A. that f is monotone; and
B. that there is a direction which simultaneously reduces all positive $-f_i$ and $e_j$.
The condition A has been introduced already. Condition B here will typically hold for soluble
problems. Indeed condition B here may be thought of as a rather stronger version of the
condition: "there is a solution of the linearised equilibrium problem for this p". In technical
terms it has the form of a coercivity condition.
It is, at first glance at least, surprising that the existence of a direction which reduces all the
positive $-f_i$ and $e_j$ simultaneously ensures that there is a direction which reduces all positive $-f_i$,
all positive $e_j$ and all positive $x_i f_i(x, p)$ simultaneously. The reason of course is the
monotonicity assumption. Here we outline a proof of this initially surprising fact.
In this last section we shall suppose that the $e_j$ constraints act only on x or p separately and say
that a direction $\delta$ in x-space is feasible at $x \ge 0$ if and only if $x + \delta \ge 0$. Here gradients in
x-space will be written: $\phi'$.
It is easy to check that having a single feasible descent direction for all positive $\phi_i$ (say) is
equivalent to:
$-\sum_i \gamma_i \phi_i'$ is not normal at x to $\mathbb{R}^n_+$ for each $\gamma \ge 0$ with at least one co-ordinate which is
positive when $\phi_i > 0$ and also satisfies: $\phi_i(x, p) \le 0$ implies $\gamma_i = 0$.
Now we use this result, combined with an extension of Theorem 3 in the appendix which
depends on assumptions concerning the $e_j$, in both directions. First suppose that assumption B
holds. Then by the above result, if $\gamma \ge 0$ has at least one co-ordinate which is positive when $-f_i$ or $e_j$
> 0 and also has zero co-ordinates corresponding to any $-f_i \le 0$ or any $e_j \le 0$; then
$-\sum_j \gamma_j e_j' - \sum_i \gamma_i \{[-f_i(x, p)]_+\}'$ is not normal at x to $\mathbb{R}^n_+$.

Monotonicity of f, assumption A, now implies (using an extension of Theorem 3) that, if $\gamma \ge 0$
has at least one co-ordinate which is positive when $e_j > 0$ or $[-f_i(x, p)]_+ + [x_i f_i(x, p)]_+ > 0$ and
also has zero co-ordinates corresponding to any $-f_i \le 0$ or any $e_j \le 0$; then
$-\sum_j \gamma_j e_j' - \sum_i \gamma_i \{[-f_i(x, p)]_+ + [x_i f_i(x, p)]_+\}'$
is not normal at x to $\mathbb{R}^n_+$. The above result may now be used again to deduce that there is a
feasible direction which reduces all $e_j(x, p) > 0$ and all $([-f_i(x, p)]_+ + [x_i f_i(x, p)]_+) > 0$
simultaneously.
CONCLUSION
The paper has (i) outlined the need for a decision support system which makes positive
suggestions; (ii) provided a theoretical model and four solution methods which may be
appropriate as such a support system and (iii) shown that three of the methods converge to signal
timings, prices and flows which satisfy a weak asymptotic KKT condition.
All four methods apply to certain monotone multi-modal deterministic elastic and inelastic
problems; and may be extended to include stochastic elements.
Further work is needed (i) to implement the methods within software; (ii) to assess the practical
efficiency of the two cone projection methods, comparing with other possible bilevel design
methods; (iii) to convert the "outline ($\alpha$, $\beta$) justification" given here to a rigorous proof; and
(iv) to test the decision support tool on a spectrum of real-life problems.
ACKNOWLEDGEMENTS
The equilibrium framework was greatly developed during a LINK project undertaken with
MVA Limited and the Centre for Transport Studies at University College, London. The basic
complementarity equilibrium condition utilised here was extensively refined during that LINK
project and, in particular, the section concerning the Kimber and Hollis formulae was
hammered out within this collaborative project.
We are grateful for the funding for the LINK project provided by DETR, ESRC and EPSRC.
We are also grateful to DGVII (E2), which supported the MUSIC (Management of traffic
USIng Control) project over three years to February 1999; and to EPSRC for consistent
funding over many years.
APPENDIX
Definition: Let $a \in \mathbb{R}^n_+$. Then a vector $v \in \mathbb{R}^n$ is said to be normal at a to $\mathbb{R}^n_+$ if and only
if $v \cdot (x - a) \le 0$ for all $x \in \mathbb{R}^n_+$.
Notation: The set of all vectors which are normal at a to $\mathbb{R}^n_+$ will be denoted by N = N(a).
Theorem: Let $f: \mathbb{R}^n_+ \to \mathbb{R}^n$ be monotone and differentiable, let $a \in \mathbb{R}^n_+$ and let
$I = \{i: a_i f_i(a) > 0\}$ and let $J = \{i: f_i(a) < 0\}$. Suppose that
[$\beta_i \ge 0$ for all $i \in J$ and $\sum_{i \in J} \beta_i f_i'(a) \in N$] $\Rightarrow$ $\beta_i = 0$ for all $i \in J$. Then
$\alpha_i \ge 0$ for all $i \in I$, $\beta_i \ge 0$ for $i \in J$ and $v = \sum_{i \in J} \beta_i f_i'(a) - \sum_{i \in I} \alpha_i [x_i f_i]'(a) \in N$ implies that
$\alpha_i = 0$ for all $i \in I$ and $\beta_i = 0$ for all $i \in J$.

Proof: We show that, if the hypothesis of the theorem holds, and if the $\alpha_i$ are such that $\alpha_i \ge 0$ for
all $i \in I$ (and also not all zero), then $v = v(\alpha, \beta)$ is not normal at a to $\mathbb{R}^n_+$. We do this by
finding $x \in \mathbb{R}^n_+$ such that $v \cdot (x - a) > 0$; in which case v ("based" at a) "looks toward" x.
(When we know that [$v \in N \Rightarrow \alpha_i = 0\ \forall i$] the theorem ensures that all $\beta_i$ with $i \in J$ are zero.)
So suppose that I is non-empty and the $\alpha_i$ are not all zero. Put
$x = a + k \left[ -\sum_{i \in I} \alpha_i a_i e_i + \sum_{i \in J} \beta_i e_i \right]$,
where k is a (small) positive real number.
We now need to check that, for an appropriately small k, $x \in \mathbb{R}^n_+$ and $v \cdot (x - a) > 0$. For the
proof that $x \in \mathbb{R}^n_+$, first note that I is non-empty and that not all the $\alpha_i$ are zero. Now for
$i \in I$, $a_i f_i(a) > 0$ and so $a_i > 0$ (otherwise $a_i f_i$ would be zero). Hence for $i \in I$
$x_i = a_i - k\alpha_i a_i = (1 - k\alpha_i)a_i \ge 0$ if $1 - k\alpha_i \ge 0$ for all $i \in I$.
If, for example, we let $k = \min\{1/\alpha_i; i \in I\}$ (taken over those i with $\alpha_i > 0$) then $1 - k\alpha_i \ge 0$ for all $i \in I$. So we now choose k
to be this positive number, then $x_i \ge 0$ for $i \in I$. Also, for this choice of k and any $i \in J$,
$x_i = a_i + k\beta_i \ge a_i \ge 0$ (as $a \in \mathbb{R}^n_+$, $a_i \ge 0$ and also $\beta_i \ge 0$).
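Hence with our choice of k, for $i \in I$, and also for $i \in J$, $x_i \ge 0$. Therefore x has all
co-ordinates non-negative and $x \in \mathbb{R}^n_+$.
Now for the proof that $v \cdot (x - a) > 0$. Note that
$v = \sum_{i \in J} \beta_i f_i'(a) - \sum_{i \in I} \alpha_i [f_i(a) e_i + a_i f_i'(a)]$
and
$x - a = k\delta$, where $\delta = \sum_{i \in J} \beta_i e_i - \sum_{i \in I} \alpha_i a_i e_i$.
Hence:
$v \cdot (x - a) = k \sum_{i \in I} \alpha_i^2 a_i f_i(a) + k \left( \sum_i \delta_i f_i'(a) \right) \cdot \delta$.
Now the first term here is greater than 0 as all $a_i f_i(a) > 0$ and at least one $\alpha_i > 0$. Also the
second term has the form $(M\delta) \cdot \delta = \delta^T M^T \delta$ which must be greater than or equal to 0 as the
matrix $M = f'(a)$ is positive semi-definite (since f is monotone).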
REFERENCES
Allsop R E (1974), Some possibilities for using traffic control to influence trip distribution and
route choice, Proceedings of the 7th International Symposium on Transportation and Traffic
Theory, 345 - 374.
Battye A, Smith M J and Xiang Y (1998), The cone projection method of designing controls for
transportation networks, Mathematics in Transport Planning and Control, Selected Proceedings of
the Third International Conference on Mathematics in Transport Planning and Control,
Pergamon, 197-206.
Beckmann M, McGuire C B and Winsten C B (1956), Studies in the Economics of Transportation.
Yale University Press, New Haven, CT.
Chiou S-W (1997), Optimisation of area traffic control subject to user equilibrium traffic
assignment, Proceedings of the 25th European Transport Forum, Seminar F, Volume II, 53 - 64.
Charnes A and Cooper W W (1961), Multi-copy traffic network models. Proceedings of the
Symposium on the Theory of Traffic Flow, held at the General Motors Research Laboratories,
1958 (Editor: R Herman), Elsevier, Amsterdam.
Clegg J and Smith M J (1998), Bilevel optimisation of transportation networks,
Mathematics in Transport Planning and Control, the Proceedings of the Third
International IMA Conference on Mathematics in Transport Planning and Control,
Pergamon, 29 - 36.
Dupuis P and Nagurney A (1993), Dynamical systems and variational inequalities, Annals of
Operations Research 24, 9-42.
Fisk C S (1984), Optimal signal controls on congested networks. Proceedings of the Ninth
International Symposium on Transportation and Traffic Theory, Delft (Editors: J Volmuller
and R Hammerslag), VNU Science Press, Utrecht, 197-216.
Kimber R M and Hollis E M (1979), Traffic queues and delays at road junctions. Transport and
Road Research Laboratory Report 909.
Lyapunov A M (1907), Problème général de la stabilité du mouvement, Ann. Fac. Sci.
Toulouse 9, 203 - 274. Reprinted in Ann Math. Stud. No 12, 1949.
Nagurney A and Zhang D (1996), Projected dynamical systems and variational inequalities
with applications, Kluwer Academic Publishers, Boston, Massachusetts.
Nagurney A and Zhang D (1998), Network equilibria and disequilibria, Equilibrium and
Advanced Transportation Modelling, (Editors: Patrice Marcotte and Sang Nguyen) Kluwer
Academic Publishers, Massachusetts, 201 - 243.
Payne H J and W A Thompson (1975). Traffic assignment on transportation networks with
capacity constraints and queueing. Paper presented at the 47th National ORSA/TIMS North
American Meeting.
Smale S (1976), Exchange processes with price adjustment, Journal of Mathematical Economics,
3, 211 - 226.
Smith M J (1979a), The marginal cost taxation of a transportation network, Transportation
Research 13B, 237 - 242.
Smith M J (1979b), The existence, uniqueness and stability of traffic equilibria, Transportation
Research 13B, 295-304.
Smith M J, Xiang Y, and Yarrow R (1997), Bilevel optimisation of signal timings and road prices
on urban road networks. Preprints of the IFAC/IFIP/IFORS Symposium, Crete, 628 - 633.
Smith M, Xiang Y and Yarrow R(1998a), Descent methods of calculating locally optimal signal
controls and prices in multi-modal and dynamic transportation networks, Transportation
Networks: Recent Methodological Advances, Selected Proceedings of the 4th EURO
Transportation Meeting, 9 - 34.
Smith M J, Xiang Y, Yarrow R and Ghali M O (1998b), Bilevel and other modelling
approaches to urban traffic management and control, Equilibrium and Advanced
Transportation Modelling, (Editors: Patrice Marcotte and Sang Nguyen) Kluwer Academic
Publishers, Massachusetts, 283 - 325.
Tobin R L and Friesz T L (1988), Sensitivity analysis for equilibrium network flow,
Transportation Science, 22, 242 - 250.
Wardrop J G (1952) Some theoretical aspects of road traffic research, Proceedings, Institution of
Civil Engineers II, 1, 235-278.
Yang H, Yagar S, Iida Y, and Asakura Y (1994a), An algorithm for the inflow control problem
on urban freeway networks with user-optimal flows. Transportation Research 28B, 123 - 139.
Yang H, and Yagar S (1994b), Traffic assignment and traffic control in general freeway-arterial
corridor systems. Transportation Research 28B, 463 - 486.
Yang H and Yagar S (1995), Traffic assignment and signal control in saturated road networks.
Transportation Research 29A, 125 - 139.
Yang H (1996a), Sensitivity analysis for queueing equilibrium network flow and application to
traffic control. Mathematical and Computer Modelling, 22, 247 - 258.
Yang H (1996b), Sensitivity analysis for the elastic-demand network equilibrium problem with
applications, Transportation Research, 31B, 55 - 70.
Yang H (1996c), Equilibrium network traffic signal setting under conditions of queueing and
congestion. Applications of Advanced Technologies in Transportation Engineering.
Proceedings of the 4th International Conference (editors: Y J Stephanedes and F Filippi),
American Society of Civil Engineers, 578 - 582.
Zhang D and Nagurney A (1995), On the stability of projected dynamical systems, Journal of
Optimisation Theory and Applications 85, 97 - 124.
CHAPTER 7
TRAFFIC SIMULATION
It requires a very unusual mind to undertake the analysis of the obvious.

(Alfred North Whitehead)
The only way to discover the limit of the possible is to look beyond it into
the impossible.
The computer is down. I hope it is something serious.
MACROSCOPIC MODELLING OF TRAFFIC

FLOW BY AN APPROACH OF MOVING
SEGMENTS
Michael Cremer, Daniela Stacker and Pascal Unbehaun,

Technical University of Hamburg-Harburg, Hamburg, Germany
ABSTRACT
In this paper we present a new modelling approach to the macroscopic simulation of traffic
flow. The basic elements of the model are cells containing a certain fixed number of vehicles.
A road section is represented by a chain of these cells which cover the section completely.
The approach is inspired by hydrodynamics insofar as the cells move to represent
traffic flow while the number of vehicles per cell is kept constant. We give some results
of the simulation program and compare them with data from an actual test site.
1 INTRODUCTION
Dynamic simulation of traffic behaviour is becoming more and more important due to traffic
capacity problems. Applications range from the design of new traffic control strategies
and the estimation of their effects to real-time incident detection and the operation of traffic
control.
Due to the stochastic nature of traffic flow, such a simulation can never produce exact
results. However, experience with similar models has shown that macroscopic simulations
can provide important and very accurate information about traffic behaviour. This model
focuses on freeway flows, although an application to urban traffic might be possible.
From hydrodynamics two schemes are known for modelling mass transport: a formulation
of the physical effects in a fixed spatial coordinate system and a formulation in
moving (Lagrangian) coordinates. The latter has some advantages over
the fixed-coordinate approach. Among these are
• some simplifications in the formulation of the non-linear model equations
• fewer effects of numerical dispersion when a time- and space-discrete implementation
is applied.
In our model we still use a fixed coordinate system, but we take advantage of the basic idea
of the Lagrangian method: the space-discrete segments or "cells" do not stay at a given
place and in a given form, but they move with the particles they carry inside [3].
By applying this scheme to traffic flow modelling, further advantages are obtained, such as
• direct calculation of travel times and
• higher accuracy in determining the length of a congestion due to the variability of
the spatial discretization scheme.
On the other hand there are also some disadvantages connected with this approach which
possibly have inhibited a wider application of moving segments in traffic flow modelling.
Essentially these are
• higher effort for modelling local measurements in comparison to models using spatially
fixed cells
• additional problems when linking topological elements to form a network, which is
also easier with fixed segments.
In this paper we present the details of the modelling approach using moving segments. We
give an impression of the quality and of the potential of this modelling approach by the
comparison of some simulated scenarios with real traffic data.
2 MODEL DESCRIPTION
The basic element of this model is a spatial cell containing a fixed number N of
vehicles [4]. A distribution of many vehicles can then be represented by a chain of
cells which cover the road section completely, without gaps. The downstream boundary
$s_i(t)$ of the i-th cell moves downstream according to the current average speed $v_{i+1}(t)$ of
the vehicles in the next cell. In this way the cells contract and expand
according to a nonuniform speed distribution. Since the number N of vehicles in a cell
is constant, the density in a contracting cell increases while in an expanding cell it
decreases.
It should be pointed out here that we do not seek a numerical scheme for solving any
partial differential equation with infinitesimal or very small elements of discretization.
Rather, our spatial discretization scheme considers cells of considerable length, between
1 km in low-density traffic and about 100 m at full congestion.
In this way averaging and aggregation remain meaningful with respect to
real traffic, which is an aggregate of the maneuvers of individual objects.
Of course a macroscopic model cannot deal explicitly with effects like overtaking and
lane changing. The parameters of the model have to be adjusted in a way so that these
actions are covered by the simulation. A microscopic model, where individual vehicles are
modelled, can be more accurate. But simulations with such a model on a computer are of
course much more time consuming and would not be suitable for real-time simulations,
especially not for large freeway networks. This is the main reason why we investigate a
model with segments that "contain" averaged variables of state.
2.1 The Equations of State

To be more specific, let us consider a road section that is subdivided into moving cells
according to Fig. 1:
Figure 1: Road section
The dynamic state of our macroscopic model is defined by two state variables:
$v_i(k)$: mean speed of vehicles in cell i at time $k \cdot T$
$s_i(k)$, $s_{i-1}(k)$: boundaries of cell i at time $k \cdot T$
Within this definition we have already adopted a time-discrete formulation with a constant
time step T (preferably in the range of 5 to 10 seconds).
Now the motion of a cell boundary is simply described by
$$s_i(k+1) = s_i(k) + T \cdot v_{i+1}(k) \qquad (1)$$
For the dynamic adaptation of the mean speed $v_i(k)$ within a cell we refer to the macroscopic
model of Payne [2], later modified by Cremer [1], in which four terms affect
the cell's speed at the next timestep: the current speed, adaptation to the stationary
speed-density relation V(c), a term for the convection of speed profiles in downstream direction and a
weighted density gradient as foreseen by the driver in downstream direction. Since in our
model the cells carry their own speed, the convection term becomes obsolete and we get
^, (2)
where
$$l_i(k) = s_i(k) - s_{i-1}(k) \qquad (3)$$
is the length of the i-th cell and
$$c_i(k) = \frac{N}{l_i(k)} \qquad (4)$$
is the traffic density within cell i.
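As a minimal sketch of equations (1), (3) and (4), the following Python fragment advances a chain of cell boundaries by one timestep and recomputes the cell lengths and densities. The function names, the list layout and the treatment of the most downstream boundary (which here simply reuses the speed of the last cell) are assumptions made for illustration; they are not taken from the simulation program described in this paper.

```python
def step_boundaries(s, v, T):
    """Advance all cell boundaries by one timestep of length T.

    s[j] is boundary s_j (j = 0..n); cell j+1 lies between s[j] and s[j+1].
    v[j] is the mean speed of cell j+1, so boundary s_j moves with v[j],
    the speed of the cell directly downstream of it.
    Assumption: the last boundary has no downstream cell in this sketch,
    so it reuses the speed of the last cell.
    """
    n = len(v)
    moved = [s[j] + T * v[j] for j in range(n)]
    moved.append(s[n] + T * v[n - 1])
    return moved


def lengths_and_densities(s, N):
    """Cell lengths l_i = s_i - s_{i-1} and densities c_i = N / l_i."""
    lengths = [s[j + 1] - s[j] for j in range(len(s) - 1)]
    densities = [N / length for length in lengths]
    return lengths, densities


# Two cells of 500 m with N = 20 vehicles each; the downstream cell is faster,
# so the upstream cell expands and its density drops.
boundaries = step_boundaries([0.0, 500.0, 1000.0], [20.0, 30.0], T=5.0)
cell_lengths, cell_densities = lengths_and_densities(boundaries, N=20)
```

In this example the first cell grows from 500 m to 550 m because its downstream boundary moves with the higher speed of the next cell.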
For the steady-state speed-density relation we use [1]
$$V(c) = v_f \left[ 1 - \left( \frac{c}{c_{max}} \right)^{l} \right]^{m} \qquad (5)$$
with:
$v_f$: the maximum speed of undisturbed traffic flow
$c_{max}$: maximum density at full congestion
$l$, $m$: dimensionless parameters to adjust the form of the relation.
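A small numerical sketch of such a steady-state relation is given below, assuming the form $V(c) = v_f [1 - (c/c_{max})^l]^m$. The function name and all parameter values in the example are illustrative assumptions and not calibrated values from this paper.

```python
def steady_state_speed(c, v_f, c_max, l, m):
    """Steady-state speed for density c, assuming V(c) = v_f * (1 - (c/c_max)**l)**m.

    The density ratio is clipped to [0, 1] so that the result stays
    between 0 (full congestion) and v_f (undisturbed traffic).
    """
    ratio = min(max(c / c_max, 0.0), 1.0)
    return v_f * (1.0 - ratio ** l) ** m


# Illustrative values only: v_f in m/s, densities in vehicles per metre.
example_speeds = [
    steady_state_speed(c, v_f=36.0, c_max=0.2, l=1.8, m=1.7)
    for c in (0.0, 0.05, 0.1, 0.15, 0.2)
]
```

The resulting speeds fall monotonically from $v_f$ at zero density to zero at $c_{max}$, which is the qualitative behaviour the relation is meant to capture.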
This is our starting point. However, it will turn out that the last term in (2) is no longer
adequate if a scheme of moving cell boundaries is to be applied, as we shall show now.
Let us, as an example, examine equation (2) in a traffic situation where we have $v_i < v_{i+1}$
and $c_i < c_{i+1}$.
In this case speed as well as traffic density increase in downstream direction. Consequently,
the density gradient term in (2) slows down the i-th cell in the next timestep.
Since this cell is already slower than the adjacent downstream cell and falls behind, this is
not a desired effect.
We fix the problem by introducing a new term replacing the density gradient to obtain our
second equation of state:
-h(vi(k)).S*-(vi+l-Vi), (6)
T
where we set
for > vi+i
8* = (7)
for <V
The expression in brackets of the new term accelerates the i-th cell if the (i+1)-th cell is
faster than itself, or decelerates it if the (i+1)-th cell is slower.
$\delta^*$ is a dimensionless factor depending on the traffic density. For example, if
$v_i < v_{i+1}$, the acceleration caused by the new term is amplified if the vehicles foresee an area of
smaller density in downstream direction. Thus, it brings back the effect that was introduced
by the now omitted density gradient in (2).
The effect of this term turned out to be qualitatively correct. However, at full congestion it
was not strong enough, and therefore the speed-dependent weight $h(v_i(k))$ was introduced.
Around a certain cutoff speed $v_{cut}$ this weight increases from a value of 1 to 2 as the speed
decreases:
n(Vi(K)) f,,.(M_,. A/,,<r i 7 ' *• \"J
Figure 2: Speed-dependent weight with cutoff speed $v_{cut}$ = 30 km/h.
The parameter v" has the dimension of a speed and can be used to adjust the abruptness
of the transition to a higher weight.
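One smooth function with the properties described for the weight (a value close to 1 at high speeds, close to 2 at low speeds, and a transition around the cutoff speed whose abruptness is set by a second speed parameter) is a logistic transition. The sketch below uses this form purely as an illustrative stand-in; the function name, the parameter name `v_width` and the logistic shape itself are assumptions and not necessarily the expression used by the authors.

```python
import math


def speed_weight(v, v_cut, v_width):
    """Hypothetical speed-dependent weight rising from 1 to 2 as v decreases.

    Equivalent to 1 + 1 / (1 + exp((v - v_cut) / v_width)), written with tanh
    so that very large arguments cannot overflow. v, v_cut and v_width must
    share the same speed unit; a smaller v_width gives a sharper transition.
    """
    return 1.5 - 0.5 * math.tanh((v - v_cut) / (2.0 * v_width))


# Illustrative evaluation with a cutoff at 30 speed units.
weights = [speed_weight(v, v_cut=30.0, v_width=3.0) for v in (0.0, 30.0, 120.0)]
```

At the cutoff speed this weight equals 1.5, and it approaches 2 and 1 well below and well above the cutoff, respectively.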
These basic model equations have to be supplemented by the following mechanisms, for
which algorithmic solutions have to be found:
• Calculation of local measurements for volumes and average speeds at selected
positions, for model calibration with real measurements and for application of
traffic-responsive control.
• Generation of cells at the upstream end of a road section according to the given
arrival rate.
• Specification of boundary conditions at the downstream end of a section to
model congestions which move into the section.
We now want to show how the traffic volume q and the local speed w at an arbitrary
position $x_0$ on the road section can be calculated.
We have to look at two different cases:
• In the first case $x_0$ is still inside the same cell after one timestep.
• In the second case the upstream boundary of the cell has moved past $x_0$, so that the
adjacent upstream cell contains $x_0$ after one timestep.
Figure 3: Cells passing a fixed position $x_0$: a) first case, b) second case
Not more than one cell boundary may pass $x_0$ in one timestep. In Payne's model there is
a similar condition: the vehicles may not skip one complete cell. Our condition reads
(9)
To satisfy this condition one has to choose small enough timesteps.
We now compute the local speeds and volumes for two cases as mentioned above.
2.2 The Local Traffic Volumes

First case.
(Cf. Fig. 3 a.) We consider the vehicles to be homogeneously distributed within a cell.
After kT seconds there are $N_i^{up}(k)$ vehicles in the i-th cell before $x_0$ (i.e. in upstream
direction):
$$N_i^{up}(k) = c_i(k) \cdot (x_0 - s_{i-1}(k)) \qquad (10)$$
(Note that the total number N of vehicles in a cell is always constant, except in the first
and the very last cell.)
One timestep later this amount has decreased to $N_i^{up}(k+1)$ vehicles. The difference has passed
$x_0$:
$$N_{x_0}(k) = N_i^{up}(k) - N_i^{up}(k+1) \qquad (11)$$
$$= c_i(k) \cdot (x_0 - s_{i-1}(k)) - c_i(k+1) \cdot (x_0 - s_{i-1}(k+1)) \qquad (12)$$
By definition, the traffic volume is
$$q_x(k) = \frac{N_{x_0}(k)}{T} \qquad (13)$$
From this we get an expression for the volume in our model:
$$q_x(k) = \frac{c_i(k) \cdot (x_0 - s_{i-1}(k)) - c_i(k+1) \cdot (x_0 - s_{i-1}(k+1))}{T} \qquad (14)$$
Second case.
(Cf. Fig. 3 b.) Here we have to look at both cells:
$$N_{x_0}(k) = N_i^{up}(k) - \underbrace{N_i^{up}(k+1)}_{=0} + \underbrace{N_{i-1}^{up}(k)}_{=N} - N_{i-1}^{up}(k+1) \qquad (15)$$
We have $N_{i-1}^{up}(k) = N$ since the (i-1)-th cell has not reached $x_0$ yet. One timestep later,
the i-th cell has passed $x_0$ completely, which yields $N_i^{up}(k+1) = 0$.
The total number of vehicles that have passed $x_0$ is
$$N_{x_0}(k) = c_i(k) \cdot (x_0 - s_{i-1}(k)) + N - c_{i-1}(k+1) \cdot (x_0 - s_{i-2}(k+1)) \qquad (16)$$
Inserting this into (13) we get
$$q_x(k) = \frac{c_i(k) \cdot (x_0 - s_{i-1}(k)) + N - c_{i-1}(k+1) \cdot (x_0 - s_{i-2}(k+1))}{T} \qquad (17)$$
for the local traffic volume.
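Both cases of the local volume can be sketched with one small Python function: the number of vehicles upstream of $x_0$ is evaluated before and after the timestep for the cell that contains $x_0$ at that moment, and N is added when a cell boundary has passed $x_0$. The function and argument names are assumptions for illustration only.

```python
def vehicles_upstream(c, s_up, x0):
    """Vehicles of a cell located upstream of x0 (homogeneous distribution).

    c is the cell density, s_up the position of its upstream boundary.
    """
    return c * (x0 - s_up)


def local_volume(x0, T, N, c_before, s_up_before, c_after, s_up_after,
                 boundary_passed):
    """Local traffic volume at x0 over one timestep of length T.

    (c_before, s_up_before): cell containing x0 at time kT.
    (c_after, s_up_after):   cell containing x0 at time (k+1)T; this is the
                             same cell in the first case and the adjacent
                             upstream cell in the second case.
    boundary_passed: True if a cell boundary crossed x0 during the timestep.
    """
    passed = (vehicles_upstream(c_before, s_up_before, x0)
              - vehicles_upstream(c_after, s_up_after, x0))
    if boundary_passed:
        passed += N  # the whole upstream cell was still before x0 at time kT
    return passed / T


# Uniform traffic: 20 vehicles per 500 m cell, 25 m/s, T = 5 s, so q = c * v = 1 veh/s.
q_case1 = local_volume(1000.0, 5.0, 20, 0.04, 700.0, 0.04, 825.0, False)
q_case2 = local_volume(1000.0, 5.0, 20, 0.04, 900.0, 0.04, 525.0, True)
```

For uniform traffic both cases reproduce the product of density and speed, which is a simple consistency check of the two formulas.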
2.3 The Local Speeds

First case.
Again, we start with the situation shown in Fig. 3 a).
The information available for computing the local speed at $x_0$ consists of the speeds of the cells at
the k-th and (k+1)-th timesteps. We choose a linear expression $v_i^*(t)$ for the interpolation
of the i-th cell's speed from $v_i(k)$ to $v_i(k+1)$:
$$v_i^*(t) = v_i(k) + \frac{v_i(k+1) - v_i(k)}{T} \cdot (t - kT) \qquad (18)$$
To obtain the local speed $w_{x_0}$ we take the average over one timestep:
$$w_{x_0}(k) = \frac{1}{T} \int_{kT}^{(k+1)T} v_i^*(t)\,dt \qquad (19)$$
Since $v_i^*(t)$ is a linear expression we can write this as an arithmetic mean:
$$w_{x_0}(k) = \frac{v_i(k) + v_i(k+1)}{2} \qquad (20)$$
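Second case.
Here the i-th cell covers $x_0$ until $x_0$ is reached by its upstream boundary $s_{i-1}$. The boundary
travels with the speed $v_i(k)$ between kT and (k+1)T. We can now calculate the time $t_{x_0}$
at which $s_{i-1}$ reaches $x_0$:
$$x_0 = s_{i-1}(k) + (t_{x_0} - kT) \cdot v_i(k) \qquad (21)$$
$$t_{x_0} = kT + \frac{x_0 - s_{i-1}(k)}{v_i(k)} \qquad (22)$$
In the next timestep, $x_0$ will be inside the adjacent upstream cell.
For the local speeds, we have to average from kT until $t_{x_0}$ for cell i, and from $t_{x_0}$ until
(k+1)T for cell (i-1):
$$w_{x,i} = \frac{1}{t_{x_0} - kT} \int_{kT}^{t_{x_0}} v_i^*(t)\,dt \qquad (23)$$
$$= \frac{1}{2} \left( v_i(k) + v_i^*(t_{x_0}) \right) \qquad (24)$$
$$= v_i(k) + \frac{v_i(k+1) - v_i(k)}{2T} \cdot (t_{x_0} - kT) \qquad (25)$$
$$w_{x,i-1} = \frac{1}{(k+1)T - t_{x_0}} \int_{t_{x_0}}^{(k+1)T} v_{i-1}^*(t)\,dt \qquad (26)$$
$$= \frac{1}{2} \left( v_{i-1}^*(t_{x_0}) + v_{i-1}(k+1) \right) \qquad (27)$$
We average over these two speeds, weighted with the number of vehicles in the corresponding
cell section, and obtain an expression for the local speed:
$$w_{x_0}(k) = \frac{c_i(k) \cdot (x_0 - s_{i-1}(k)) \cdot w_{x,i} + \left[ N - c_{i-1}(k+1) \cdot (x_0 - s_{i-2}(k+1)) \right] \cdot w_{x,i-1}}{c_i(k) \cdot (x_0 - s_{i-1}(k)) + N - c_{i-1}(k+1) \cdot (x_0 - s_{i-2}(k+1))} \qquad (28)$$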
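The local speed computation can be sketched along the same lines. The code below assumes the linear interpolation of the cell speeds described above and, in the second case, weights the two partial averages with the numbers of vehicles of cell i and cell i-1 that have passed $x_0$. All function and argument names are hypothetical.

```python
def interpolate_speed(v_k, v_k1, k, T, t):
    """Linear interpolation of a cell speed between times kT and (k+1)T."""
    return v_k + (v_k1 - v_k) * (t - k * T) / T


def local_speed_case1(v_i_k, v_i_k1):
    """x0 stays inside cell i: arithmetic mean of the two cell speeds."""
    return 0.5 * (v_i_k + v_i_k1)


def local_speed_case2(x0, k, T, N,
                      c_i_k, s_im1_k, v_i_k, v_i_k1,
                      c_im1_k1, s_im2_k1, v_im1_k, v_im1_k1):
    """The upstream boundary s_{i-1} of cell i passes x0 during the timestep.

    Requires v_i_k > 0, which holds whenever the boundary actually moves past x0.
    """
    t_x0 = k * T + (x0 - s_im1_k) / v_i_k   # time at which the boundary reaches x0
    w_i = 0.5 * (v_i_k + interpolate_speed(v_i_k, v_i_k1, k, T, t_x0))
    w_im1 = 0.5 * (interpolate_speed(v_im1_k, v_im1_k1, k, T, t_x0) + v_im1_k1)
    n_i = c_i_k * (x0 - s_im1_k)            # vehicles of cell i that passed x0
    n_im1 = N - c_im1_k1 * (x0 - s_im2_k1)  # vehicles of cell i-1 that passed x0
    return (n_i * w_i + n_im1 * w_im1) / (n_i + n_im1)


# Cell i drives at a constant 20 m/s, the upstream cell at a constant 30 m/s.
w_example = local_speed_case2(1000.0, 0, 10.0, 20,
                              0.04, 900.0, 20.0, 20.0,
                              0.04, 650.0, 30.0, 30.0)
```

In the example 4 vehicles pass $x_0$ at 20 m/s and 6 vehicles at 30 m/s, so the weighted local speed lies between the two cell speeds.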
3 GENERATION AND ANNIHILATION OF CELLS

Since in our model the cells do not stay at a fixed place we have to find a mechanism to
generate new cells at the beginning of the road section. To explain how this works consider
the following scenario (Fig. 4 a - e):
The first cell in the road section is fully occupied by N vehicles. In the next timestep,
$q_0(k) \cdot T$ vehicles enter the section so that a new cell is generated. $q_0(k)$ denotes the measured
traffic volume at the beginning of the road section. T and N have to be set to appropriate
values so that not more than one full cell is generated per timestep. Therefore, the following
inequality has to be satisfied:
$$q_0(k) \cdot T < N, \qquad (29)$$
by analogy to equation (9).
Figure 4: Generation and merging of new cells (panels a to e, timesteps k to k+2)
We now presume this new cell not to be fully occupied. At (k+2)T, $q_0(k+1) \cdot T$ additional
vehicles enter the section. Now there are two possibilities:
First, both cells together might not contain more than N vehicles, i.e.
$$(q_0(k) + q_0(k+1)) \cdot T \le N.$$
In this case the cells are merged to give a new incomplete cell (Fig. 4 d).
Second, we might have more than N vehicles in the cells:
$$(q_0(k) + q_0(k+1)) \cdot T > N.$$
The cells will then be rearranged so that one complete cell is generated. The remaining
vehicles give a second incomplete cell (Fig. 4 e).
We still have to find expressions for the variables of state for the new cell, which
from now on we shall give the index 0.
The information we have is $q_0$ and $w_0$. We think of the new cell as containing $q_0(k) \cdot T$
vehicles that have a speed of $w_0(k)$ during the time interval [kT, (k+1)T]. At kT the
cell's downstream boundary is located at the beginning of the road section; at (k+1)T
its upstream boundary is at that position. Since the speed is constant during the time
interval we find
$$v_0(k) = w_0(k) \qquad (30)$$
Figure 5: A new cell moving into the section
The new cell is referenced by its boundaries $s_{-1}$ and $s_0$ (cf. Fig. 5). Using our first
equation of state (1) we get
$$s_{-1}(k+1) = s_{-1}(k) + T \cdot v_0(k). \qquad (31)$$
From Fig. 5 we see that
$$s_{-1}(k+1) = 0. \qquad (32)$$
From this we get
$$0 = s_{-1}(k) + T \cdot v_0(k) \quad\Rightarrow\quad s_{-1}(k) = -T \cdot v_0(k) \qquad (33)$$
and, using (4), the density
$$c_0(k) = \frac{N_0(k)}{s_0(k) - s_{-1}(k)} = \frac{q_0(k)}{w_0(k)} \qquad (34)$$
As we see, $w_0$ and $q_0$ contain enough information to compute, already one timestep earlier,
the new cell that enters the section at (k+1)T.
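The state of the entering cell follows directly from the measured inflow volume and speed. The short sketch below collects these relations, assuming that the downstream boundary of the new cell sits at position 0 at time kT; the function name and the dictionary keys are chosen for illustration only.

```python
def new_upstream_cell(q0, w0, T):
    """State at time kT of the cell that enters the section during [kT, (k+1)T].

    q0: measured inflow volume (vehicles per second), w0: measured local speed.
    Assumes w0 > 0 and that the cell's downstream boundary is at position 0.
    """
    n0 = q0 * T             # vehicles carried by the new cell
    v0 = w0                 # the cell moves with the measured local speed
    s_up = -T * v0          # upstream boundary s_{-1}(k), still outside the section
    s_down = 0.0            # downstream boundary s_0(k) at the section entrance
    c0 = n0 / (s_down - s_up)
    return {"n": n0, "v": v0, "s_up": s_up, "s_down": s_down, "c": c0}


entering = new_upstream_cell(q0=0.5, w0=30.0, T=5.0)
```

The density obtained this way equals the ratio of measured volume to measured speed, independent of the chosen timestep.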
In case cell 1, which is the cell adjacent to the new cell in downstream direction,
is not fully occupied, the vehicles have to be distributed differently in order to get a
complete cell with N vehicles.
First, we consider the case where the sum of the number of vehicles in both cells is less
than N:
$$N_0(k+1) + N_1(k+1) < N. \qquad (35)$$
Figure 6: Merging of two incomplete cells at the beginning of the road section
We have to merge these two cells into a new one (cf. Fig. 6). Therefore, we set the boundary
$s_0(k+1)$ to zero and add the numbers of vehicles. The speed is calculated by averaging the
speeds of the cells, weighted with the corresponding numbers of vehicles:
$$v_{1new}(k+1) = \frac{N_0(k+1) \cdot v_0(k+1) + N_1(k+1) \cdot v_1(k+1)}{N_0(k+1) + N_1(k+1)} \qquad (36)$$
Now we look at a second case where
$$N_0(k+1) + N_1(k+1) > N. \qquad (37)$$
In this case the vehicles are rearranged in such a way that cell 1 becomes complete:
$$N_{1new}(k+1) = N \qquad (38)$$
$$N_{0new}(k+1) = N_0(k+1) + N_1(k+1) - N \qquad (39)$$
The boundary between the cells is moved to keep $c_0(k+1)$ constant (cf. Fig. 7):
$$\frac{N_0(k+1)}{s_0(k+1)} = \frac{N_{0new}(k+1)}{s_{0new}(k+1)} \qquad (40)$$
$$s_{0new}(k+1) = \frac{N_{0new}(k+1)}{N_0(k+1)} \cdot s_0(k+1) \qquad (41)$$
Figure 7: Generation of a complete cell at the beginning of the road section
The speed of cell 0 remains unchanged, whereas the new speed of cell 1 is computed in
analogy to (36):
$$v_{1new}(k+1) = \frac{N_1(k+1) \cdot v_1(k+1) + (N - N_1(k+1)) \cdot v_0(k+1)}{N} \qquad (42)$$
It is useful to update the cell indices (cf. Fig. 8) so that the first cell in the section always
is referred to as No. 1.
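The merging and rearranging of the two most upstream cells can be summarised in a short Python sketch. It assumes that cell 0 starts at position 0, that cell 1 is still incomplete, and that in the rearranging case the boundary between the cells is shifted so that the density of cell 0 stays unchanged. The function name, the return format and the handling of the borderline case (a vehicle total of exactly N is treated as rearranging) are assumptions for illustration.

```python
def absorb_new_cell(N, n0, v0, s0, n1, v1):
    """Combine the entering cell 0 with the incomplete cell 1 (n1 < N).

    Cell 0 spans [0, s0] and holds n0 vehicles with speed v0; cell 1 starts
    at s0 and holds n1 vehicles with speed v1. Returns the cells in upstream
    to downstream order as dicts with vehicle count "n", speed "v" and
    upstream boundary position "s_up".
    """
    total = n0 + n1
    if total < N:
        # Merge into one incomplete cell with a vehicle-weighted mean speed.
        v_merged = (n0 * v0 + n1 * v1) / total
        return [{"n": total, "v": v_merged, "s_up": 0.0}]
    # Rearrange: cell 1 is filled up to N vehicles taken from cell 0.
    n0_new = total - N
    v1_new = (n1 * v1 + (N - n1) * v0) / N
    s0_new = s0 * n0_new / n0   # keeps the density n0 / s0 of cell 0 unchanged
    return [{"n": n0_new, "v": v0, "s_up": 0.0},
            {"n": N, "v": v1_new, "s_up": s0_new}]


merged = absorb_new_cell(N=20, n0=8, v0=30.0, s0=200.0, n1=6, v1=20.0)
rearranged = absorb_new_cell(N=20, n0=8, v0=30.0, s0=200.0, n1=15, v1=20.0)
```

After such a step the cells would be renumbered so that the most upstream cell carries index 1 again.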
Figure 8: Renumbering of the cells
The Variables of State of the Last Downstream Cell
Some part of the last downstream cell is already outside of the road section, where
our variables of state are not defined. But we have the measured values of the local traffic
volume q*L and the local speed w*L, both taken at the end of the section. We will use these
as our boundary conditions.
In the last cell, we will assume the number of vehicles N to remain constant. In
addition to that, we set
$$c_L(k) = \frac{q_L^*(k)}{w_L^*(k)} \qquad (43)$$
$$v_L^*(k) = w_L^*(k) \qquad (44)$$
Thus, the upstream boundary of the last cell can be calculated according to our first
equation of state (1) for the next timestep, which means that the last cell boundary moves
with a speed of $w_L^*(k)$.
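A minimal sketch of this downstream boundary condition is given below: the state of the last cell is taken from the measured volume and speed at the end of the section, and its upstream boundary is advanced by one timestep. The function name, the argument `section_length` and the returned flag are assumptions added for illustration.

```python
def last_cell_update(q_L, w_L, s_up, T, section_length):
    """Boundary condition for the last downstream cell over one timestep.

    q_L, w_L: measured local volume and speed at the end of the section
              (w_L must be positive).
    s_up:     current position of the upstream boundary of the last cell.
    Returns density, speed, the new boundary position and a flag telling
    whether the boundary has reached or crossed the end of the section.
    """
    c_L = q_L / w_L               # density implied by the measurements
    v_L = w_L                     # the last cell moves with the measured speed
    s_up_next = s_up + T * v_L    # first equation of state for the boundary
    leaves_section = s_up_next >= section_length
    return c_L, v_L, s_up_next, leaves_section


state = last_cell_update(q_L=0.6, w_L=30.0, s_up=5900.0, T=5.0, section_length=6000.0)
```

When the flag is set, the last-but-one cell would take over the role of the last cell in the next timestep.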
Should the last cell boundary cross the end of the road section during the simulation step
from kT to (k+1)T, the last-but-one cell becomes the last in the next timestep. The
above variables then refer to this cell at (k+1)T. The downstream boundary does not
have to be calculated as the information taken from the measured local values is sufficient
for modelling the last cell. In this way we are able to make use of the local volumes and
speeds at the end of the section, which enables us to model congestions. If, for example,
the end of the section constitutes a bottleneck, a congestion will build up and move in
upstream direction as it grows in size.
For handling this annihilation process of cells a more complicated scheme was considered.
Since the accuracy of the simulation results was not improved, this approach has been
dropped in order to save computing time. The boundary conditions at the beginning of the
road section are more important, since the vehicles travel from there into the simulated
area. However, in order to simulate a more complex topology, a different approach should
be chosen.
4 RESULTS OF THE SIMULATION PROGRAM

We compare our simulation results with real traffic data from a 6 km section of the
motorway A9 between Nuremberg and Munich, Germany. The data have been provided
by SIEMENS AG. We look at the local traffic volumes and speeds at a fixed position
near the end of the test site.
Since our model is still at a very early stage of development, it has not been calibrated
yet, and all parameters have been adjusted using our experience from other models.
First, we take a look at a free-flowing traffic situation (Fig. 9 and 10) where the
vehicles travel with a speed of approximately 130 km/h. The model gives good results for
this scenario of medium overall traffic load for both the local speed and volume.
Figure 9: Local speed of undisturbed traffic (simulation and measurement, plotted over time in minutes)
Figure 10: Local volume of undisturbed traffic

Our second example (Fig. 11 and 12) features a congestion building up and dissolving.
Whereas the speed is simulated accurately, there are intervals where the simulated traffic
volume deviates from the measured values. The reason for this might be found in statistical
fluctuations of the real traffic data due to the fact that these were taken in one minute
intervals, whereas the timesteps used for our simulation program are 3 to 6 seconds long.
Due to restrictions of the actual implementation of the model, we cannot deal with longer
timesteps yet, since it would have to generate more than one new cell per step. In a future
release, there should be a better match between traffic data and simulation intervals to
investigate whether the deviations result from the short simulation timesteps.
Figure 11: Local speed of a congestion (simulation and measurement, plotted over time in minutes)
Figure 12: Traffic volume of a congestion (simulation and measurement, plotted over time in minutes)
The third scenario (Fig. 13 and 14) shows a rapid drop of the mean speed, which is reproduced
very well by the simulation. However, during this time the traffic volume is
not simulated correctly, but shows a strong bias towards a higher volume. This is
different from our second example, where the deviations did not show a constant tendency.
Rather, what we have here is a traffic density which is consistently too high. The answer
to this riddle is also the reason why the mean speed decreased at all: the traffic data state
that heavy rain set in at the test site, to which the drivers reacted by slowing down while
roughly keeping the distance between the vehicles, so that the traffic density
remained about the same.
Since our model draws information of this kind from the steady state speed-density relation
which is fixed, it cannot react to such an incident.
Figure 13: A sudden decrease of mean speed due to rain (simulation and measurement)
Figure 14: Biased traffic volume caused by inadequate speed-density relation (simulation and measurement)

5 CONCLUSION
We have presented a new model for macroscopic traffic simulation using moving cell boundaries.
We indicated how potential problems of the approach, especially concerning the boundary
conditions, can be overcome. This should enable us to include local on- and off-ramp flows in
a next step of development, which would open the possibility of forming a more complex
topology like a network. Although the model has not yet been thoroughly calibrated, the
simulation results are very good in most cases, ranging from low to heavy traffic load. For
future research, emphasis should be laid on making the simulation program more versatile.
In addition to the aforementioned topology, a variable steady-state speed-density relation
might be included in order to make the simulation react to changes of the state of the
road as caused, for example, by a change of weather conditions. This should give more
accurate results for our third simulation scenario, where the sudden rain was not taken
into account. To reduce the influence of statistical fluctuations, the real traffic data used as
input for the simulation should be smoothed. Especially since our macroscopic model is
still in a very early phase of its development, we consider the simulation results to be very
encouraging.
References
[1] Michael Cremer, Der Verkehrsfluss auf Schnellstrassen, Springer-Verlag, 1979.
[2] H.J. Payne, Models of Freeway Traffic Control. Math. Models of Public Systems,
Simulation Council Proceedings, 1971.
[3] M. Baum, Zustandsschätzer zur Schadstoffüberwachung in Tideflüssen bei Modellierung
im mitbewegten Koordinatensystem, Schriftenreihe der Arbeitsgruppe
Automatisierungstechnik, Technische Universität Hamburg-Harburg, 1990.
[4] Daniela Stacker, Modellierung des Verkehrsflusses auf Autobahnen mit bewegten
Zellen, Diploma Thesis, Arbeitsgruppe Automatisierungstechnik, Technische
Universität Hamburg-Harburg, 1998.
MICROSCOPIC ONLINE SIMULATIONS

OF URBAN TRAFFIC
Jörg Esser, Los Alamos National Laboratory, MS M997, Los Alamos, NM 87545, US,
esser@santafe.edu
Lutz Neubert, Physik von Transport und Verkehr, Gerhard-Mercator-Universität, Lotharstr. 1,
47048 Duisburg, Germany, neubert@traffic.uni-duisburg.de
Joachim Wahle, Physik von Transport und Verkehr, Gerhard-Mercator-Universität, Lotharstr.
1, 47048 Duisburg, Germany, wahle@traffic.uni-duisburg.de
Michael Schreckenberg, Physik von Transport und Verkehr, Gerhard-Mercator-Universität,
Lotharstr. 1, 47048 Duisburg, Germany, schreck@traffic.uni-duisburg.de
Time-dependent information about traffic states in road networks is an important condition for
efficient dynamic traffic assignment. In this contribution a concept for online simulations of
urban road networks is developed: Real-time traffic data stemming from induction loops serve
as input for microscopic simulations. By running the simulation, traffic conditions are estimated
for regions which are not adequately covered by measurements. The quality of the reproduced
traffic states with regard to vehicular densities and link travel times is investigated. The
dynamic data are processed by a route guidance system based on a fuzzy logic approach. As an
example for dynamic traffic management different strategies for individual en-route guidance
systems and their efficiencies are studied. For all investigations the real road network of
Duisburg served as study area.
1 INTRODUCTION
Daily traffic jams reflect the fact that the road networks are not able to cope with the demand
of mobility which will still increase in the near future. Especially, in densely populated regions
the freeway network cannot be expanded to relax the situation and existing infrastructures have
to be used more efficiently (with "freeway" we mean the German "Autobahn"). Therefore, a lot
of work has been done to develop dynamic route guidance systems which offer possible travel
routes (for an overview see Ran and Boyce, 1996). To make such proposals these systems need
valuation criteria and a detailed knowledge base of the present traffic state.
Typically, traffic data are collected by locally fixed detectors like induction loops or cameras.
In most cases they are installed in the neighbourhood of intersections in order to control and
optimise the traffic signals. But in many cities the road network is not adequately equipped
with detection devices to gather information about the present traffic state in the whole
network.
A possible way to derive information for those regions which are not covered by measurements
is to combine local traffic counts with the network structure (i.e. type of roads, priority
regulations at the nodes, etc.) under consideration of realistic traffic flow dynamics. This is the
basic idea of online simulations: Local traffic counts serve as input for traffic flow simulations
to provide network-wide information. An advantage of this approach is the fact that all static
entities of the network like its structure or the traffic light management are incorporated
directly in the simulation dynamics.
The outline of this paper is as follows: Firstly, the cellular automaton approach for the
simulation is introduced. The underlying road network and the used database are described in
the third section. Within this section some technical remarks about the construction of the road
network are given. Some measurements in the network and the reproduction of traffic states are
discussed in sections four and five. A route guidance system is introduced in the sixth
section, and in the final section it is shown how the network simulation can be extended to a
freeway network.
2 THE CELLULAR AUTOMATON MODEL
In general, traffic flow models should describe relevant aspects of the flow dynamics as simply
as possible (for an overview see Schreckenberg and Wolf, 1997). In this spirit very recently
cellular automaton models (CA) were introduced which give a microscopic description of the
vehicular motion using a set of update rules. Among them the original Nagel-Schreckenberg
cellular automaton model (Nagel and Schreckenberg, 1992) is perhaps the simplest.
Nevertheless, it is capable of reproducing important properties of real traffic flow, like the
density-flow relation and the spatio-temporal evolution of jams (Treiterer et al., 1965) (Fig. 1).
Furthermore, cellular automata are by design ideal for large-scale computer simulations and
can therefore be used in complex practical applications very efficiently.
Figure 1: Time vs. distance plot of density waves. Each trajectory represents a vehicle. The
left picture was generated by using video sequences taken from an American highway
(Treiterer et al., 1965) and shows the spontaneous emergence of a jam. Similar structures can be
obtained by simulations with the Nagel-Schreckenberg model (right, $v_{max} = 5$, p = 0.5).
For completeness, we recall the definition of the Nagel-Schreckenberg model for single-lane
traffic. In the model the street is subdivided into cells with a length of $\Delta x = \rho_{jam}^{-1} = 7.5$ m, with
$\rho_{jam}$ the density of jammed cars (Fig. 2). Each cell i is either empty or occupied by only one
vehicle with a (discrete) velocity $v_i \in \{0, \ldots, v_{max}\}$, with $v_{max}$ the maximum velocity (velocity is
measured in cells per time step). The motion of the vehicles is determined through the
following rules (parallel update):
• Collision-free acceleration: $v_i \leftarrow \min(v_i + 1, v_{max}, gap)$
• Randomisation: with a certain probability p do $v_i \leftarrow \max(v_i - 1, 0)$
• Movement: $x_i \leftarrow x_i + v_i$
The variable gap is the number of empty cells in front of the vehicle at cell i. A time step
corresponds to At ~ 1 sec, the typical time for a driver to react. With a maximum velocity
Vmax = 5 cells per time step the cars can speed up to 135 km/h. Note that Vmax does not bound
the speed due to technical reasons but it is a speed which is desired by the majority of drivers if
they are not hindered by other vehicles. In reality, this "desired" speed can be spread widely. It
should also be remarked that a single vehicle might exhibit unrealistic behaviour from a
microscopic point of view, e.g. it is possible to slow down from maximum velocity to zero
within one time step.
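The three rules translate almost line by line into code. The following is a minimal sketch, not the implementation used in the online simulation: it assumes a closed ring road (periodic boundary) and a vehicle list kept in driving order, and the function name `nasch_step` is purely illustrative.

```python
import random


def nasch_step(positions, velocities, road_length, v_max=5, p=0.5, rng=random):
    """Apply one parallel update of the single-lane rules on a ring road.

    positions  -- cell index of each vehicle, listed in driving order
                  (vehicle i+1 drives directly ahead of vehicle i)
    velocities -- current velocity of each vehicle in cells per time step
    The ring boundary is an assumption made for this sketch only.
    """
    n = len(positions)
    new_velocities = []
    for i in range(n):
        ahead = positions[(i + 1) % n]
        # Number of empty cells between this vehicle and the one ahead.
        gap = (ahead - positions[i] - 1) % road_length
        # Rule 1: collision-free acceleration.
        v = min(velocities[i] + 1, v_max, gap)
        # Rule 2: randomisation with probability p.
        if rng.random() < p:
            v = max(v - 1, 0)
        new_velocities.append(v)
    # Rule 3: movement, applied to all vehicles at once (parallel update).
    new_positions = [(x + v) % road_length
                     for x, v in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

Because every vehicle's new velocity is computed from the old configuration before any vehicle moves, the update is parallel in the sense of the model definition, and the bound by the gap keeps two vehicles from ever occupying the same cell.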
Figure 2: Layout of a road in the Nagel-Schreckenberg model. The road is subdivided into
cells, which are 7.5 m long. Each car has a discrete speed up to $v_{\max}$, which is restricted by the
headway gap to the car ahead. A driver who wants to change lanes has to take into
consideration two more gaps, $\mathit{gap}_s$ and $\mathit{gap}_p$, on the alternative lane in order to prevent crashes.
How can the update rules be interpreted? The first rule describes an optimal driving strategy:
the driver accelerates if the vehicle has not reached the maximum velocity $v_{\max}$ and brakes to
avoid accidents, which are explicitly forbidden. So far, the model is completely deterministic,
i.e. the stationary state only depends on the initial conditions. Therefore, the introduction of the
noise p is essential for a realistic description of traffic flow. It mimics the complex interactions
between the vehicles and is also responsible for the spontaneous formation of jams. The parameter
p includes over-reactions like heavy braking or delayed accelerations. A comparison with
empirical results (Treiterer et al., 1965) shows that the macroscopic features of traffic flow are
reproduced quite well (Fig. 1). An analytical description of the model is very difficult; it can
only be solved in certain limiting cases, e.g. $v_{\max} = 1$ or $p \to 0$ (Schadschneider and
Schreckenberg, 1998 and references therein).
The fundamental (hydrodynamic) relation reflects the dependencies between the density $\rho$, the
velocity $v$ and the flow $\Phi$, which (from a global point of view) are given by

$$\rho = \frac{N}{L}, \qquad \langle v \rangle = \frac{1}{N} \sum_{i=1}^{N} v_i, \qquad \Phi = \rho \langle v \rangle \qquad (1)$$

for a system with N vehicles and L cells.
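As a small worked illustration of relation (1), read here as density = N/L, mean velocity = average of the individual velocities, and flow = density times mean velocity, the global quantities can be computed directly from a list of vehicle velocities. The function names are hypothetical; the unit conversion uses the cell length of 7.5 m and the time step of about 1 s given in the model definition.

```python
def global_observables(velocities, num_cells):
    """Return (density, mean velocity, flow) for N vehicles on L cells.

    Density is in vehicles per cell, velocity in cells per time step and
    flow in vehicles per time step, following relation (1).
    """
    n = len(velocities)
    density = n / num_cells
    mean_velocity = sum(velocities) / n if n else 0.0
    flow = density * mean_velocity
    return density, mean_velocity, flow


def cells_per_step_to_kmh(v, cell_length_m=7.5, time_step_s=1.0):
    """Convert a velocity in cells per time step to km/h."""
    return v * cell_length_m / time_step_s * 3.6
```

For example, four vehicles with velocities 5, 3, 0 and 2 on 20 cells give a density of 0.2 vehicles per cell, a mean velocity of 2.5 cells per time step and a flow of 0.5 vehicles per time step; a velocity of 5 cells per time step converts to 135 km/h.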
In fact, more detailed measurements of freeway traffic (Kerner and Rehborn, 1996; Helbing,
1996; Kerner and Rehborn, 1997) yield that flow is not a single-valued function of density. In
some density regimes two branches in the fundamental diagram coexist. The upper branch of
higher flow can be characterised by negligible interactions between vehicles, jams do not
emerge. A system in the lower branch shows homogeneous flow as well as large jams with
phase separation. The high-flow states are called metastable and have not been observed in
simulations with the original Nagel-Schreckenberg model. But with a velocity-dependent
randomisation (VDR) Barlovic et al. (1998) could find metastable states in their simulations. In
a VDR model the delay probabilities are velocity dependent, i.e. p = p(v). A continuous
extension of the cellular automaton which also shows metastable states was earlier proposed by
Krauß et al. (1996, 1997).
Another phenomenon observed in real traffic at intermediate densities is the synchronised flow
(Kerner and Rehborn, 1997). If a transition from free flow to congested flow takes place it
sometimes yields a sudden drop in velocity but the flow remains nearly constant. The new
phase in the jammed region is called synchronised flow. Its nature and its causes are
still under discussion.
Nevertheless, it has been shown that the Nagel-Schreckenberg model is sufficient to model
traffic flow in urban networks. In order to describe multilane traffic the set of fundamental
rules has to be expanded. This extension has to be carried out with regard to safety aspects and
legal constraints, which vary from country to country. A schematic lane change is
shown in Fig. 2. Firstly, the vehicle on cell $i$ checks if it is hindered by the predecessor on its
own lane. This is fulfilled, for example, for $\mathit{gap}_i < v_i$. Then it has to take into account the gap to the
successor $\mathit{gap}_s$ and to the predecessor $\mathit{gap}_p$ on the alternative lane. If the gaps allow a safe
change the vehicle moves to the other lane. A systematic approach for two-lane rules can be
found in Nagel et al. (1997).
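The lane change decision described above can be sketched as a single predicate. The hindrance test (own gap smaller than the current velocity) is taken from the text; the two safety thresholds on the alternative lane are assumptions chosen for illustration only, since the text states merely that the gaps must allow a safe change.

```python
def lane_change_allowed(v, gap_own, gap_pred_other, gap_succ_other, v_max=5):
    """Decide whether a vehicle changes to the alternative lane.

    v              -- current velocity of the vehicle (cells per time step)
    gap_own        -- empty cells to the predecessor on the own lane
    gap_pred_other -- empty cells to the predecessor on the alternative lane
    gap_succ_other -- empty cells to the successor on the alternative lane
    """
    # Incentive: the vehicle is hindered on its own lane.
    hindered = gap_own < v
    # Safety (assumed thresholds): enough room ahead for the current
    # velocity, and enough room behind for a follower driving at v_max.
    safe = gap_pred_other >= v and gap_succ_other >= v_max
    return hindered and safe
```

Keeping incentive and safety as separate conditions makes it easy to swap in other, for instance country-specific, rules without touching the rest of the update.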
ROAD NETWORK AND TRAFFIC DATA
An urban road network is very complex, but Esser and Schreckenberg (1997) showed that
arbitrary kinds of roads and intersections can be constructed with only a few basic elements.
With these elements it is possible to construct the complete main road network of a city like
Duisburg, an area of about 30 km² (Fig. 3). In the following an edge corresponds to a driving
direction on a road, i.e. each road usually consists of two edges. For each road the number of
lanes, the turning pockets, the traffic signal control and the detailed priority rules are included
into the simulation. The network consists of 107 nodes (61 signalised, 22 non-signalised and
24 boundary nodes), 280 edges and 22,059 cells corresponding to about 165 km. The boundary
nodes are the sources and sinks of the network.
For an online simulation the model has to be supplemented by traffic data gathered from all
over the city. Therefore, every minute the measurements at over 500 induction loops are sent
from the traffic computer of the municipality of Duisburg to the online simulation computer
(Fig. 4).
The data collection is used in several ways. On the one hand, at the so-called check points the
traffic data derived from the simulation and from the measurements are compared. With
respect to the differences the number of passing vehicles is changed by creating or annihilating
vehicles. The check points thus serve as internal sources and sinks, depending on the calculated
differences. It is convenient to make these adjustments at the check points, because at such a
point (in total there are 51) all lanes of the road are equipped with detection units and
cross-section information is accessible (Fig. 3). There are two feasible ways of tuning, either by
adapting densities or by adapting flows. The same flow, for instance, is connected with two
different densities, as can be seen in a fundamental diagram, and this ambiguity may cause
problems. But the flow can easily be expressed by the number of cars $N_p$ passing a check point
during a simulation period $\Delta T$:

$$\Phi = \frac{N_p}{\Delta T}. \qquad (2)$$

Figure 3: Sketch of the simulated road network with check points (filled circles) and sources
and sinks (letters). At the check points the data of all lanes of the road are accessible. Here the
local flow can be tuned with respect to the empirical data collection. The digits denote the
mean adjustments to rates of passing vehicles (cars per minute, averaged over 24 hours).

[Flowchart: traffic data collection (flux, density, type of vehicles, ...), online connection
with data interface, CA microsimulation]
Figure 4: Flowchart of the Online Simulation (OLSIM). The traffic data are sent via a
permanent connection to the Controller. The Controller handles static information like the
network structure and performs CA microsimulations. The results can be visualised and
processed in further applications and numerical experiments.
One finds two ways to estimate the local density: (a) by summing up the durations
$t^{\mathrm{occu}}$ during which vehicles occupy the area of an induction loop, or (b) according to (1),
provided the velocities $v_i$ are accessible, and taking into account the number $N_s$ of time steps
during which vehicles are stopped directly on an induction loop (only available in simulations):

[Equation (3), giving the expressions for (a) and (b), is illegible in the source.]

The most convenient method, which is applied here, is the calculation of the density of a
complete edge or at least of a section: if $n$ vehicles occupy a link of length $z_e$, one
defines

$$\text{(c)} \quad \rho_e = \frac{n}{z_e}. \qquad (4)$$
The simulations showed, however, that both methods, namely tuning the flow or tuning the
density, give comparable results. Special check points are the boundary nodes: here
the empirical traffic data directly generate the rates of vehicles entering and leaving the
network.
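The check-point adjustment can be sketched in a few lines: the flow is the number of passing cars divided by the period, the edge density is the number of vehicles divided by the number of cells, and the difference between measured and simulated counts tells how many vehicles to create or annihilate. This is an illustrative sketch under those readings; the function names are hypothetical and do not come from the OLSIM code.

```python
def flow_from_count(n_passed, period):
    """Flow at a check point: cars passing during the period, per time unit."""
    return n_passed / period


def edge_density(n_vehicles, n_cells):
    """Density of a complete edge: vehicles on the edge per cell."""
    return n_vehicles / n_cells


def vehicle_adjustment(measured_count, simulated_count):
    """Number of vehicles to add at a check point for one period.

    A positive value means vehicles have to be created (internal source),
    a negative value means vehicles have to be annihilated (internal sink).
    """
    return measured_count - simulated_count
```

For a one-minute period in which the loop detectors count 18 cars but only 15 pass in the simulation, `vehicle_adjustment(18, 15)` is 3, so three vehicles would be inserted at that check point.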
In Fig. 3 the rates of vehicles per time interval (typically 1 minute) that have to be put into the
network are shown. Obviously, at most check points vehicles have to be added into the
network meaning that there are additional sources in the network like smaller side streets or
parking lots, which are not recorded by the detection units. This is the main difference between
urban and freeway traffic: On freeways sources and sinks are well defined by on- and off-
ramps, so it is easier to handle them and to collect data. Thus in the urban network of Duisburg,
reliable results of the online simulation can only be obtained in the inner city, where the
density of check points is sufficiently high.
On the other hand, the collected data are necessary to compute the turning probabilities at the
crossings, if possible (currently, this can be done for 56 driving directions). In addition, the
necessary but missing turning probabilities were counted manually in order to get at least the
average number of turning vehicles at intersections which are not covered by measurements.
We decided to use this algorithm, since origin-destination information with a sufficient
resolution in time and space was not available. Therefore, all vehicles are guided randomly
through the network.
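Random guidance by turning probabilities amounts to a weighted random choice at each intersection. The sketch below assumes the probabilities for one approach are held in a dictionary keyed by direction; this data layout and the function name are illustrative assumptions.

```python
import random


def choose_turn(turn_probabilities, rng=random):
    """Pick a turning direction at random according to the given weights.

    turn_probabilities -- mapping from direction label to its probability,
                          e.g. {"left": 0.2, "straight": 0.7, "right": 0.1}
    """
    directions = list(turn_probabilities)
    weights = [turn_probabilities[d] for d in directions]
    # random.choices draws with replacement according to relative weights.
    return rng.choices(directions, weights=weights, k=1)[0]
```

Since each vehicle draws independently at every intersection, no origin-destination information is needed, which is exactly why this scheme was chosen when such data are unavailable.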
RESULTS OF THE NETWORK SIMULATION
The online simulation makes it possible to interpolate the traffic state between check points
(which are typically near intersections) and to extrapolate into areas which are hardly or not
at all equipped with detection units. This can be visualised (OLSIM, 1999) and can also serve as
a support for planning a trip (in this case for Duisburg, Fig. 5).
Figure 5: Screen shot from the interactive map (OLSIM, 1999). The roads are coloured
according to the traffic loads computed during the previous minute.
The empirical and numerical results allow more detailed examinations of network traffic (Fig.
6). The higher number of vehicles found in the simulation can be traced back to the enlarged
length of simulated roads in comparison to the length of roads which are covered by detection
units — a result of the extrapolation by the simulation. On a typical Wednesday (left column in
Fig. 6) the commuters cause a pronounced peak in the morning rush-hour. Due to shopping
traffic, varying working times and people working overtime the afternoon-peak is broader.
Smaller peaks in the early morning or in the late evening are caused by shift-workers. These
peaks can also be detected on Sunday (right column in Fig. 6). Additionally, the background
traffic is also higher on Saturday night ("Saturday Night Fever").
[Four panels plotted against time of day: Wednesday, empirical data; Sunday, empirical data;
Wednesday, simulation; Sunday, simulation]
Figure 6: Comparison between the empirical (upper row) and simulation results (lower row) as
well as between a typical Wednesday (left column) and a typical Sunday (right column). By
night the mean speed is higher than by day, but its variance increases due to the more
significant contribution of both cars waiting at intersections and cars driving at high speed on
the links.
In general, a smaller number of vehicles driving by night leads to an increased variance of the
measured and the simulated speed. Cars stopping at intersections affect the statistics strongly,
but it is also quite likely to find drivers who take the risk of driving very fast, since
the roads are quite empty. By day all these fluctuations are smoothed out. The network is
dominated by the intersections and their priority rules or traffic lights. On the roads between
intersections the behaviour of drivers is constrained: nearly everyone acts in a similar way
irrespective of preferences or engine performance.
The effect of smoothing out can be observed not only between day and night, but also between
empirical and simulated results. For the empirical data the drivers' behaviour at the
intersections mainly contributes to the statistics, whereas in the simulation the cars travelling
between the intersections are of greater relevance. This fact is also responsible for the slightly
increased mean velocity.
An important condition for online simulations or, in the next step, for generating a traffic state
forecast, is the ability to simulate large networks faster than real time. Given the simple set of
update rules, cellular automaton models are by design suited to meet this requirement. As
depicted in Fig. 7, the simulation of a network of the order of magnitude of Duisburg can
easily be performed on a common personal computer (Pentium 133 MHz). The simulation
includes the data transmission between the diverse modules of the application as well as
the handling of the traffic light data. Within the most interesting density interval (10 ... 40%, or
≈ 2,000 ... 8,000 vehicles in the network) it takes less than one second to simulate
one minute of real time. The remaining time can be used to calculate trip
alternatives (see section 6), to simulate several scenarios under varying conditions or to
estimate the traffic state to be expected within the next few minutes.
[Plot of simulation speed against density (0% to 80%), with the corresponding number of
vehicles in the network (4,000 to 16,000) on a second horizontal axis]
Figure 7: As expected, the simulation speed of a microscopic model like the cellular
automaton traffic flow model is inversely proportional to the number of simulated vehicles. For
the most frequently occurring densities the road network of Duisburg can be simulated around
100 times faster than real time.
REPRODUCING TRAFFIC STATES
A sensitive test for the quality of the online simulation is the ability to reproduce given traffic
states. Because network-wide information cannot be obtained from the measurements, we have
to compare the results of the simulations with artificial states (reference states) generated by an
independent simulation run. In other words, we perform two simulation runs with two
independent sets of random numbers, but the same set of simulation parameters (e.g. source
rates). After reaching the stationary state, we estimated the reproduction rate of the local
density via the following quantity:

$$R_\rho = \frac{1}{Z} \sum_{e=1}^{N_e} z_e \left( 1 - \left| \rho_e - \hat{\rho}_e \right| \right) \qquad (5)$$

The edges are weighted by $z_e$, the number of cells on the edge (hence $Z$ denotes the total
number of cells $z_e$ of all $N_e$ edges). The local density of an edge in the second run, $\rho_e$, is
compared with the local density of the same edge, $\hat{\rho}_e$, drawn from the reference state. Each
measure point serves as a possible check point in the reference run.
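A minimal sketch of the reproduction rate follows. It assumes that the quantity is the cell-weighted average of one minus the absolute density difference per edge, which is a reading of the damaged equation (5) based on the surrounding definitions rather than a confirmed formula; the function and argument names are hypothetical.

```python
def reproduction_rate(cells_per_edge, density_run, density_ref):
    """Cell-weighted similarity of two sets of edge densities.

    cells_per_edge -- number of cells z_e of each edge
    density_run    -- local density of each edge in the second run
    density_ref    -- local density of the same edges in the reference state
    Returns 1.0 for identical states and smaller values for larger deviations.
    """
    total_cells = sum(cells_per_edge)  # Z, the total number of cells
    weighted = sum(z * (1.0 - abs(run - ref))
                   for z, run, ref in zip(cells_per_edge, density_run, density_ref))
    return weighted / total_cells
```

With two edges of 10 and 30 cells, densities 0.2 and 0.4 in the run and 0.1 and 0.4 in the reference state, only the short edge deviates, and the rate is 1 - (10 x 0.1) / 40 = 0.975.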
[Two panels, Scenario 1 and Scenario 2; horizontal axis $p_{mp}$; separate symbols for flow
measurement and density measurement]
Figure 8: Similarity $R_\rho(p_{mp})$ resulting from reproducing traffic states of two scenarios.
Simulations in scenario 1 (2) are performed with an input rate $r_s = 0.1$ (0.5) at the boundary
nodes. Additionally, the special reproduction rates $R_\rho^{\mathrm{rn}}$ (dotted line) and $R_\rho^{\mathrm{h}}$ (dashed line) are
shown.
In Fig. 8 the reproduction rate is shown as a function of the probability $p_{mp}$ that a measure
point is actually used as a check point. The results are shown for two different
values of the input rates at the boundary nodes (scenarios 1 and 2). For obvious reasons the
similarity of the states increases for higher $p_{mp}$. But for small values of $p_{mp}$ the reproduction
rate does not increase monotonically. This is due to the fact that the reproduction rate strongly
depends on the position of the check points in the network. In order to show this effect we used
a completely new check point configuration for each value of $p_{mp}$. It should be mentioned that
even for the empty network (without any check points) high reproduction rates can be obtained
if the input rates of the simulation and the reference state agree. Therefore it should be possible
to extrapolate future states of the network with a reasonable accuracy using online simulations.
Comparing both scenarios we can see that the reproduction rate is higher for smaller input
rates. This means that it is more difficult to reproduce high density states. Finally, we consider
two special cases which are depicted in Fig. 8 as dashed and dotted lines. $R_\rho^{\mathrm{rn}}$ denotes the
reproduction rate that one obtains if the simulation and the reproduction only differ by the set
of random numbers used, i.e. the measurements start with two identical system
copies. For large values of $p_{mp}$ a higher reproduction rate than $R_\rho^{\mathrm{rn}}$ can be achieved, because
the check points also store the fluctuations in the reference run, leading to an extremely high
value of $R_\rho$. The reproduction rate $R_\rho^{\mathrm{h}}$ denotes the results obtained from an initial
homogeneous state in the reference run.
Another criterion for evaluating the quality of the reproduced traffic state is the time
needed for travelling between two intersections. Therefore, we define a parameter $D_t$ that
corresponds to the difference of travel times measured in two different simulation runs,

[Equation (6), the definition of $D_t$, is illegible in the source.]

as shown in Fig. 9 for two different configurations. The measurements include data collection
with floating cars (FC), which lead to a more sophisticated approach to the reference state.
[Two panels, Scenario 1 and Scenario 2; horizontal axis: density of floating cars and of
measurement points, from 0.0 to 1.0; vertical axis: travel time difference]
Figure 9: Travel time differences according to (6). The similarity of the compared states
increases with the density of floating cars (FC) and measurement points (MP). The dotted line
represents a reference run which only differs in the set of random numbers used; data below
this line characterise a usable reproduction rate.
6 DYNAMIC ROUTE GUIDANCE
In the previous sections we have seen that the online simulation estimates the actual traffic
densities and the average speeds on the links. This information can be processed in a dynamic
route guidance system. In contrast to static route guidance systems incorporating only static
network data the actual traffic state plays an important role for the decision-making process in
dynamic systems. As a first application we investigated different routing strategies with the
road network of Duisburg as a test case. Note that a certain share of vehicles is therefore
equipped with appropriate systems, and well-defined origins and destinations are assigned
explicitly to selected vehicles. This means that they do not turn randomly at intersections like
non-equipped vehicles.
Instead of calculating the routes for every demand vehicle individually, the optimum routes
were stored globally, so that the same route recommendations are given to all vehicles with
the same origin and destination in every time step. After a predefined update interval the
optimum routes are recalculated. For the edge costs we used both travel times (Fig. 10) and
online vehicle densities. While travel-time-based routing yields shorter overall travel times,
considering vehicle densities in the calculations allows aspects like driving comfort to be
incorporated. For this reason, both criteria have to be combined to provide route guidance
systems which go far beyond the calculation of shortest paths with several weights. A very
important parameter for commercial applications is the market penetration of route guidance
systems (Esser and Schreckenberg, 1998).
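The scheme of globally stored routes that are recalculated after an update interval can be sketched as a route table in front of a shortest-path search. The text does not name the search algorithm; Dijkstra's algorithm is used here as an assumption, and the class `RouteTable`, the function `shortest_route` and the nested-dictionary graph layout are illustrative only.

```python
import heapq


def shortest_route(graph, origin, destination):
    """Dijkstra search on graph[node][neighbour] = edge cost (e.g. travel time)."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    done = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in done:
            continue
        done.add(node)
        if node == destination:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in best:
        return None, float("inf")
    route = [destination]
    while route[-1] != origin:
        route.append(previous[route[-1]])
    route.reverse()
    return route, best[destination]


class RouteTable:
    """Stores one recommended route per origin-destination pair."""

    def __init__(self, update_period):
        self.update_period = update_period
        self.routes = {}
        self.last_update = None

    def recommend(self, graph, origin, destination, time_step):
        # Discard all stored routes once the update interval has elapsed.
        if self.last_update is None or time_step - self.last_update >= self.update_period:
            self.routes.clear()
            self.last_update = time_step
        key = (origin, destination)
        if key not in self.routes:
            self.routes[key] = shortest_route(graph, origin, destination)[0]
        return self.routes[key]
```

Between two updates every vehicle with the same origin and destination receives the same recommendation, even if the edge costs have changed in the meantime, which is the behaviour whose tuning Fig. 10 addresses.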
[Two panels; horizontal axes: update period (left, 0 to 1000) and refresh period (right, 0 to 20)]
Figure 10: It is important to tune thoroughly the update period (left) and the refresh period
(right) in order to get a real benefit for the drivers relying on a dynamic route guidance system.
The update period is the duration between two updates of the proposed route. For urban areas
the travel time is minimised for an update period of ≈ 10 minutes. The refresh period is more
artificial. It refers to the duration for which a stored travel time is taken into account for link
travel time estimations. After the refresh period it will be replaced by a mean value. This
procedure has been chosen in order to cut off the temporal correlations of different traffic
states.
In the simple approach given above only one valuation criterion was chosen and optimised. To
combine different criteria like route length, travel time and travel comfort, van Laak and
Torner (1998) implemented a symmetric decision model, a technique frequently used in fuzzy
set theory. The link travel times calculated by the online simulation are used as dynamic
weights.
A route guidance tool based on this approach will be presented on the Internet (OLSIM, 1999).
The road user can choose a starting and destination point and a special trip is calculated. This
can be optimised with regard to different criteria selected by the user. In addition to the
dynamic routes further information like road-works or timetables of other transportation modes
can be accessed from an underlying GIS database. The effects of such a route guidance system
on the network performance will be analysed in a future investigation.
FREEWAY TRAFFIC
An important share of German traffic capacities is provided by freeways. For most big cities it
is necessary to incorporate them, since they are either an explicit part of the urban road
network or at least their entrances and exits represent high-throughput sources and sinks.
Duisburg, for example, is connected with three of the most frequently used freeways in
Germany, enclosing several intersections and a number of junctions.
In contrast to the simulation of urban traffic, freeway traffic is dominated by travelling on links
between junctions and/or intersections. The typical length scale is of some tens of kilometres,
whereas in cities it is an order of magnitude smaller. Although Fig. 11 may suggest otherwise,
the junctions and intersections are indeed of simpler architecture: vehicles leaving or entering
the freeway via on- and off-ramps can be treated like those which only want to change lanes.
Only a few legal constraints have to be taken into account; safety aspects are of greater
importance. Additionally, there are fewer turning decisions.
Figure 11: A freeway intersection from the CA-point of view. Leaving or entering the freeway
is similar to a lane change procedure which simplifies the handling in comparison to urban
traffic simulations.
It has already been shown that even very large freeway networks (some 10,000 kilometres with
some millions of cars) can be simulated on high-performance computers faster than real time
(Rickert and Wagner, 1996). So it is natural to carry out an online traffic simulation. Since the
roads are well and homogeneously equipped with detection units (≈ 2,500 measurement points
in North Rhine-Westphalia covering complete cross-sections) the simulation can be
supplemented by a proper set of data. The detectors are able to determine the speed of passing
vehicles and to distinguish between vehicle types (passenger cars, trucks or
trailers).
A further reason for performing freeway simulations is an extended and more useful
application of route guidance systems. In urban areas real alternative routes with a significant
benefit are hard to find. In freeway networks, especially in a quite dense one like the German
Autobahn network, several alternative routes can be found. This fact also causes the increased
commercial interest in collecting freeway traffic data and generating useful information for
potential customers. Here, the simulation can serve as a tool for estimating travel times
necessary to offer the best route available. The same algorithms developed for urban traffic
simulation and prediction can now be applied. Finally, the link between long-distance traffic on
freeways and the local traffic in the cities can be achieved.
8 SUMMARY
We presented a simulation tool for urban traffic which can easily be extended to model traffic
flow on freeways. The microscopic dynamics based on the Nagel-Schreckenberg cellular
automaton permits the simulation of large networks many times faster than real time. The network model
includes complex intersections and their priority rules as well as the handling of parking
capacities and the circulations of public transports. We were able to show that the combination
of a high-speed road network simulation with real-time traffic counts serves as a basis for two
applications. On the one hand the simulations allow for extrapolating traffic states from a
spatial and also a temporal point of view, and on the other hand a useful laboratory
environment for designing and evaluating dynamic traffic management systems incorporating
different criteria can be implemented. Finally, practical applications like an interactive and
inter-modal support for route planning are conceivable.
ACKNOWLEDGEMENT
The authors would like to thank K. Froese for digitising the Duisburg network and J. Lange,
acting for the traffic control centre of the municipality of Duisburg, for providing essential road
network data and especially the online traffic data. We are also thankful to O. Annen, C.
Gawron, S. Krauß, K. Nagel, M. Rickert, O. van Laak and P. Wagner for fruitful discussions
and sharing insights. Parts of this work were developed within the framework of the
North Rhine-Westphalia Research Cooperative Traffic Simulation and Environmental Impact
NRW-FVU (FVU, 1997).
REFERENCES
Ran, B. and Boyce, D. (1996). Modeling Dynamic Transportation Networks: An Intelligent
Transportation System Oriented Approach. Springer, Berlin Heidelberg.
Schreckenberg, M. and Wolf, D.E. (eds.) (1998). Traffic and Granular Flow '97. Springer,
Singapore.
Nagel, K. and Schreckenberg, M. (1992). A cellular automaton model for freeway traffic. J.
Phys. I, 2, 2221.
Treiterer, J. et al. (1965). Investigation and measurement of traffic dynamics. Appx. IX to final
Report EES 202-2, Ohio State Univ., Columbus.
Schadschneider, A. and Schreckenberg, M. (1998). Garden of Eden states in traffic models. J.
Phys. A, 31, L225.
Kerner, B.S. and Rehborn, H. (1996). Experimental features and characteristics of traffic jams.
Phys. Rev. E, 53, R1297.
Kerner, B.S. and Rehborn, H. (1997). Experimental properties of phase transitions in traffic
flow. Phys. Rev. Lett., 79, 4030.
Helbing, D. (1997). Empirical traffic data and their implications for traffic modelling. Phys.
Rev. E, 55, R25.
Barlovic, R., Santen, L., Schadschneider, A. and Schreckenberg, M. (1998). Metastable states
in cellular automata for traffic flow. Eur. Phys. J. B, 5, 793.
Krauß, S., Wagner, P. and Gawron, C. (1996). The continuous limit of the Nagel-
Schreckenberg-Model. Phys. Rev. E, 54, 3707.
Krauß, S., Wagner, P. and Gawron, C. (1997). Metastable states in a microscopic model of
traffic. Phys. Rev. E, 55, 5597.
Nagel, K., Wolf, D.E., Wagner, P. and Simon, P. (1998). Two-lane traffic rules for cellular
automata: A systematic approach. Phys. Rev. E, 58, 1425.
Esser, J. and Schreckenberg, M. (1997). Microscopic simulation of urban traffic based on
cellular automata. Int. J. of Mod. Phys. C, 8, 1025.
OLSIM (1999). Online Simulation of downtown Duisburg, Physics of Transport and Traffic,
University of Duisburg (Germany), http://www.traffic.uni-duisburg.de.
Esser, J. and Schreckenberg, M. (1998). In: Traffic and Granular Flow '97 (Schreckenberg, M.
and Wolf, D.E., eds.), p. 181, Springer, Singapore.
van Laak, O. and Torner, G. (1998). In: Traffic and Granular Flow '97 (Schreckenberg, M. and
Wolf, D.E., eds.), p. 221, Springer, Singapore.
Rickert, M. and Wagner, P. (1996). Parallel real-time implementation of large-scale, route-
plan-driven traffic simulation. Int. J. of Mod. Phys. C, 7, 133.
FVU (1997). Home Page at the Centre for Parallel Computing ZPR Cologne (Germany).
http://www.zpr.uni-koeln.de/Forschungsverbund-Verkehr-NRW.
MODELLING THE SPILL-BACK OF CONGESTION
IN LINK BASED DYNAMIC NETWORK LOADING
MODELS: A SIMULATION MODEL WITH
APPLICATION
Vincenzo Adamo (1)
Vittorio Astarita (1)
Michael Florian (2)
Michael Mahut (2)
Jia Hao Wu (2)
(1) Università della Calabria, Italy.
(2) Centre for Research on Transportation, University of Montreal, Quebec, Canada.
1. INTRODUCTION
The spill-back of congestion on a network is the propagation of congestion backwards from a
link to its upstream links. It occurs whenever the downstream queue is so large that it impedes
the incoming speed and flow of vehicles. This very common situation in many urban networks
is usually caused by recurrent congestion on a day-to-day basis. This paper provides a way to
adjust some existing network models in order to take this complicated phenomenon into
account. There are a number of different approaches to dynamic network loading and
assignment that have been studied, such as: heuristic generalisation of within-day static
methods (Hammerslag, 1988; Janson, 1989), exit-function methods (Merchant and Nemhauser,
1978; Carey, 1986a,b; Friesz et al., 1989; Wie et al., 1990; Boyce et al., 1991), the packet
approach (Cascetta and Cantarella, 1991; Smith and Wisten, 1996) and continuous time link
models (Astarita, 1996; Wu et al., 1995; Xu et al., 1996, which identify conditions for the
satisfaction of FIFO). These models are all capable of performing Dynamic Network Loading
(DNL), i.e. the reproduction of within-day variable link (or local) performances given the path
flow rates, and have proved to be useful both to evaluate traffic flows and to simulate the
effects of control strategies on users' behaviour. However, none of these models adequately
addresses the problem of the backward propagation of congestion, and some do not even rule out
overtaking between users (i.e. they do not enforce the FIFO rule).
Traffic flow specialists have been aware of the necessity of modelling spill-back phenomena
since the post-war decade (Lighthill and Whitham, 1955). Most of the microscopic traffic
simulation packages (Barcelo, 1996; Barcelo et al., 1989; Rilett et al., 1994; Van Aerde et al.,
1988) are able to represent spill-back phenomena very well. Macroscopic traffic flow models
also have this ability, since they have a continuous representation of space and are able to
propagate waves both upstream and downstream. Recently, some network applications
(Daganzo, 1995b; Lebacque, 1996) have been proposed in order to perform DNL. Traffic
simulation models are more detailed in representing traffic flows than the DNL procedures
used in dynamic assignment models, but as a consequence of the greater detail of these
methods, more computing resources are needed. The DNL model presented in this paper is able
to deal with the spill-back of congestion with a limited amount of computing resources. This
model is based on the link model formulation presented in Friesz et al. (1993), Astarita
(1995, 1996), Wu et al. (1995) and in Xu et al. (1996). The model presented in this paper is still
a link based model and, as pointed out by Daganzo (1995a), is thus not capable of reproducing
realistic shock waves. Hence this model is not an attempt to substitute the above mentioned
models, but an attempt to give a theoretical basis for the development of new link models and
an analytical justification of a model that has been already developed, and applied to a large
scale network. Due to its approximate nature, this model lends itself more to general planning
applications, than to micro applications, such as adaptive signal control.
This paper is structured as follows: in section 2 it is explained why there is a need to model
the spill-back of congestion; in section 3 some of the basic definitions and notation that are
used in the paper are introduced; in section 4 the deterministic queuing model is extended in
order to take into account the spill-back of queues; in sections 5, 6 and 7 the analytical model
is presented. Different solution approaches are considered and one capable of representing a
whole network has been used to produce a simple example of queuing spill-back.
2. THE IMPORTANCE OF MODELLING SPILL-BACK OF CONGESTION: SPILL-BACK AS A
CONSEQUENCE OF CAPACITY CONSTRAINTS
In all link based approaches to DNL, a downstream capacity is given to flows exiting from each
link (such as proposed by Boyce et al., 1991; de Romph et al., 1992; Janson, 1989; Vythoulkas,
1990), but there has never been an attempt to limit the flows entering the links. The
resulting flows obtained with these models may exhibit links filled with enormous numbers of
vehicles, indicating that these models are essentially unqualified for practical application. The
representation of a network with only downstream capacities for the links is thus too
approximate and may not be representative of real situations.
Relying on common physical experience, the flow entering a link is not only limited by the
upstream capacity, but when the link is full (all the space is occupied by vehicles waiting to
exit), the entering flow should be smaller than or equal to the exiting flow. And this should be
reflected in a dynamic network assignment model.
Two simple concepts are taken into consideration by the model proposed here, that have not
previously been considered in link models:
• the spill-back of queues is caused by limited upstream capacity and/or link storage space;
• the surplus of flow that cannot be received by a saturated link is accumulated on the
preceding links.
3. BASIC DEFINITIONS AND NOTATION
Consider a transportation network Q = (N, A) composed of a set of nodes $i$, $i \in N$, and a set of
directed arcs (links) $a$, $a \in A$. Traffic originates at nodes $o$, $o \in O$, $O \subset N$, and is
destined to nodes $d$, $d \in D$, $D \subset N$. An origin-destination (o-d) pair is designated $r$,
$r \in R \subset O \times D$. A set of paths $k$, $k \in K_r$, connects o-d pair $r$, and $D_r(t)$ is the
time varying demand flow rate that uses these paths. Traffic departs from origins in the interval
$[0, T]$ and all traffic arrives at its destination within the interval $[0, T']$ ($T' > T$).
The time varying inflow rates, outflow rates and number of vehicles on link $a$ for o-d pair $r$ are
denoted $u_a^r(t)$, $w_a^r(t)$ and $x_a^r(t)$, respectively, at time $t$. The total inflow rates, outflow
rates and vehicles at time $t$ on link $a$ are:
$$u_a(t) = \sum_{r} u_a^r(t), \qquad w_a(t) = \sum_{r} w_a^r(t) \qquad \text{and} \qquad x_a(t) = \sum_{r} x_a^r(t),$$
respectively. The travel time on link $a$ at time $t$ is $\tau_a(t)$. It is also useful to identify the sets of
arcs $A_i^+$ and $A_i^-$, which are respectively the forward and backward stars of node $i$. In order to
state the model a link is subdivided into a running segment and a queuing segment.
$ur_a(t)$, $wr_a(t)$ and $xr_a(t)$ are related to the running segment, while $uq_a(t)$, $wq_a(t)$ and
$xq_a(t)$ are related to the queuing segment. $\tau r_a(t)$ and $\tau q_a(t)$ are the travel times for traffic
entering the running and queuing segments, respectively, at time $t$. $\tau r_a(t)$ is given by a running
travel time function $Tr_a(xr_a(t))$ which is monotone increasing. Upstream capacities, $C_a^{in}$,
downstream capacities, $C_a^{out}$, and link storage space, $C_a^{s}$, are associated with link $a$. The traffic
demand flow rate $\delta_a(t)$ represents the traffic that is ready to exit link $a$ to link $a^+$, and
$\sigma_{a^+}(t)$ is the traffic supply flow rate of the following link. It is also useful to denote
$\delta_{ab}(t)$ as the partial demand flow rate from link $a$ to $b$ and $\sigma_{ab}(t)$ as the partial supply
flow rate of link $b$ allocated to link $a$ at time $t$. A path partial demand flow rate is also
introduced as the partial demand flow rate $\delta_a^k(t)$ from $a$ to the following link of path $k$, so that:
$$\delta_{ab}(t) = \sum_{k:\, a,b \in k} \delta_a^k(t).$$
Further notation will be introduced when required in the following sections.
4. A SIMPLIFIED LINK DETERMINISTIC QUEUE MODEL WITH SPILL-BACK
This section is intended to introduce some simple and new concepts that may be useful for the
comprehension of the contents of next sections. A simple link queuing model can be stated to
represent traffic flow on a sequence of four links, where each link is represented as a
deterministic queuing model. The queues are vertical and the travel time is zero if there is no
queue. The link travel time for the user that arrives at time t can be represented (see Figure 1)
as a function of the number of vehicles on the link at time t:
Figure 1 Queuing segment with linear travel time function (travel time plotted against x, the
number of vehicles on the link).
Is this link model a good network traffic model? Certainly not, for many obvious reasons; but is
this a good link traffic queuing model (at least to represent deterministic delays)? The answer is
again no, because there is no propagation of the queue from link to link. Suppose that the
following network is simulated with this model:
Figure 2 Queuing network.
Traffic flow is 1000 veh/h moving from left to right. The capacities are 2000 veh/h for links 1
and 2, 500 veh/h for link 3 and 100 veh/h for link 4. Queues will be present on links 3 and 4
and will go to infinity unless a link storage space is defined (see Figure 2). The way towards an
almost correct link model is to fix a link storage space value. Is it possible now to correct the
link queuing model maintaining the same travel time function (1) and considering spill-back
due to storage capacity?
Suppose that a link storage space value is fixed on link 4 at 400 veh. It takes 1 h for the link to
fill up, since it receives 500 veh/h from link 3 and discharges 100 veh/h. At this time the queue
has reached the end of link 3. Now the flow exiting link 3 is
going to be immediately 100 veh/h and it may seem a good idea to evaluate the new travel time
with the same travel time function (1) using this value for downstream capacity. This is correct
for the users that arrive later than the time at which the outflow on link 3 changes, but this
value of capacity should also be applied to all the vehicles that are already on link 3 at the time
the outflow rate changes. However, the travel time for these vehicles was established when
they entered, and thus was estimated using the higher outflow rate value. For this reason it is
not correct to represent the travel time in the queue with equation (1). Thus, in the presence of
spill-back, the flow exiting a queue can depend on possible bottlenecks located downstream.
So it is clear that equation (1) cannot be used even for the simple queuing model presented
above. A new model has to be formulated in which the travel time $\tau_a(t)$ is not explicitly
established at time t.
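The fill-up time quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function name `fill_time_hours` is hypothetical, and the inflow of 500 veh/h into link 4 is assumed to equal the capacity of link 3, which is the active bottleneck upstream.

```python
# Rates in veh/h and storage in veh, as quoted in the four-link example.
inflow_link4 = 500.0    # assumed: link 4 is fed at the capacity of link 3
outflow_link4 = 100.0   # downstream capacity of link 4
storage_link4 = 400.0   # link storage space fixed on link 4


def fill_time_hours(inflow, outflow, storage):
    """Time for an initially empty vertical queue to reach `storage`."""
    net = inflow - outflow
    if net <= 0.0:
        return float("inf")  # the queue never fills up
    return storage / net


t_fill = fill_time_hours(inflow_link4, outflow_link4, storage_link4)
print(t_fill)
```

During the same hour the queue on link 3 grows at 1000 - 500 = 500 veh/h; once link 4 is full, its growth rate rises to 1000 - 100 = 900 veh/h.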
This new link model can be implemented using an idea applied in a finite difference method for
solving the Lighthill and Whitham model. Daganzo (1995b) and Lebacque (1996) have
presented a numerical method to solve the kinematic wave model of Lighthill and Whitham
(1955), where local flow is obtained as the minimum of two quantities (see Lebacque, 1996),
which can be considered as local traffic supply and demand. Using this idea it is possible
to correct the previous formulation of this simple link queuing model in this way:
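Queuing segment:
$$\frac{d\,xq_a(t)}{dt} = uq_a(t) - wq_a(t) \qquad (2)$$
$$wq_a(t) = \min\big(\delta_a(t),\ \sigma_{a^+}(t)\big) \qquad (3)$$
Demand and supply:
$$\delta_a(t) = \begin{cases} uq_a(t) & \text{if } xq_a(t) = 0 \text{ and } uq_a(t) < C_a^{out} \\ C_a^{out} & \text{otherwise} \end{cases} \qquad (4)$$
$$\sigma_a(t) = \begin{cases} C_a^{in} & \text{if } xq_a(t) < C_a^{s} \\ \min\big(wq_a(t),\ C_a^{in}\big) & \text{if } xq_a(t) \geq C_a^{s} \end{cases} \qquad (5)$$
Link storage and non-negativity constraints:
$$xq_a(t) \leq C_a^{s} \qquad (6)$$
$$xq_a(t),\ uq_a(t),\ wq_a(t) \geq 0 \qquad (7)$$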
Equation (2) is the conservation equation, which applies to every link in the simulation model.
Equation (3) states that the link outflow rate is the minimum of the demand flow rate
and the supply flow rate of the following link. Equation (4) establishes a value for the demand
flow rate of the link: the demand flow rate has an upper bound, which is the downstream
capacity flow rate $C_a^{out}$; if there is a queue on the link the demand flow rate is always equal to
the downstream capacity, otherwise the demand flow rate is equal to the inflow rate of the
queuing segment. Equation (5) gives the supply flow rate, that is, the maximum inflow rate that
a link can accept: the supply flow rate is a pre-established value, $C_a^{in}$, if link a still has enough
storage space; otherwise, when the link is full, it is equal to $wq_a(t)$ (the outflow rate of the
link). This rule guarantees that the link storage space will not be exceeded. Equation (6) is then
redundant, because the number of users on link a is forced to be no greater than $C_a^{s}$ by the other
equations.
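The interplay of equations (2)-(5) can be illustrated with a small time-stepped sketch for links in series. This is not the solution method of the paper: it is an explicit Euler discretisation written for illustration, the function name `simulate_chain` is hypothetical, the upstream capacities of 2000 veh/h are assumed values, and the first link is assumed to accept the origin flow at all times.

```python
import math


def simulate_chain(origin_flow, c_in, c_out, c_s, dt, steps):
    """Euler integration of equations (2)-(5) for vertical queues in series.

    Assumption: the first link always accepts `origin_flow`.
    Returns the number of queued vehicles on each link after `steps` steps.
    """
    n = len(c_out)
    xq = [0.0] * n
    for _ in range(steps):
        # Backward pass, equation (5): supply flow rate of every link.
        sigma = [math.inf] * (n + 1)  # sigma[n]: unconstrained exit
        for a in reversed(range(n)):
            if xq[a] < c_s[a]:
                sigma[a] = c_in[a]
            else:
                # A full link holds a queue, so it discharges at
                # min(c_out, supply of the next link) and accepts no more.
                sigma[a] = min(c_out[a], sigma[a + 1], c_in[a])
        # Forward pass, equations (4), (3) and (2).
        uq = origin_flow
        for a in range(n):
            no_queue = xq[a] == 0.0 and uq < c_out[a]
            delta = uq if no_queue else c_out[a]      # equation (4)
            wq = min(delta, sigma[a + 1])             # equation (3)
            xq[a] = max(0.0, xq[a] + (uq - wq) * dt)  # equation (2), clamped
            uq = wq  # the outflow feeds the next link
    return xq


# Four-link example of this section, two hours in steps of 1/8 h.
queues = simulate_chain(
    origin_flow=1000.0,
    c_in=[2000.0] * 4,  # assumed upstream capacities
    c_out=[2000.0, 2000.0, 500.0, 100.0],
    c_s=[math.inf, math.inf, math.inf, 400.0],
    dt=0.125,
    steps=16,
)
print(queues)
```

With these inputs the queue on link 4 stops at its storage space of 400 veh after one hour, and the surplus accumulates on link 3, which is the behaviour the supply rule is meant to produce.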
Something is clearly missing. Where is the travel time of the link?

The travel time $\tau q_a(t)$ can be calculated assuming link FIFO by imposing:
$$Uq_a(t) = Wq_a(t + \tau q_a(t)) \qquad (8)$$
where $Uq_a$ and $Wq_a$ are the known cumulative flows:
$$Uq_a(t) = \int_0^t uq_a(s)\,ds, \qquad Wq_a(t) = \int_0^t wq_a(s)\,ds \qquad (9)$$
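Equations (8) and (9) amount to reading the horizontal distance between the cumulative inflow and outflow curves. A minimal sketch follows, assuming both curves are sampled on a common time grid and are linear between samples; the function name `fifo_travel_time` and the sample data are hypothetical.

```python
from bisect import bisect_left


def fifo_travel_time(ts, U, W, i):
    """Travel time of the vehicle entering at ts[i], from equation (8).

    ts: increasing sample times; U, W: cumulative inflow and outflow at ts
    (both non-decreasing, piecewise linear between samples).
    Returns None if that vehicle has not left the queue by ts[-1].
    """
    target = U[i]
    if W[-1] < target:
        return None
    j = bisect_left(W, target)  # first sample with W[j] >= target
    if j == 0:
        t_exit = ts[0]
    else:
        # Linear interpolation inside [ts[j-1], ts[j]], where W[j-1] < target.
        frac = (target - W[j - 1]) / (W[j] - W[j - 1])
        t_exit = ts[j - 1] + frac * (ts[j] - ts[j - 1])
    # A vehicle cannot leave before it enters.
    return max(0.0, t_exit - ts[i])


# Inflow of 2 veh per time unit during [0, 2], outflow capped at 1 veh per unit.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
U = [0.0, 2.0, 4.0, 4.0, 4.0]
W = [0.0, 1.0, 2.0, 3.0, 4.0]
print(fifo_travel_time(ts, U, W, 1), fifo_travel_time(ts, U, W, 2))
```

Because the travel time is read from the outflow curve, it automatically reflects any later change in the outflow rate, which is what a travel time fixed at entry cannot do.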
Is this new queuing model now able to represent all queue dynamics?
This model is still not able to represent all types of shock waves propagating backwards.
Suppose that in the preceding four-link network an incident reduces the downstream
capacity of link 4 to 0 for some interval, and suppose the queue is as in the following
Figure 3:
Figure 3 Queue after an incident.
When the original capacity is restored the outflow rate, for each link, reaches instantaneously
the downstream capacity flow rate of link 4 because the supply flow rate is immediately
(according to equation 5) restored back to the value of the current downstream outflow. An
infinite speed wave moving backwards is the consequence of this modelling result. The queue
will dissipate maintaining the same front point and with a back point that moves forward as in
Figure 4 (if the arrival rate at the back of the queue is less than the new discharging flow rate
value):
Figure 4 Queue dissipation in proposed model.
This will be different from what is predicted by the Lighthill and Whitham theory, in which both
the back and the front of the queue move backwards (see Figure 5).
Figure 5 Queue dissipation in Lighthill and Whitham model.
This behaviour can be improved by introducing a delay when changing the supply flow rate of
the upstream links, but this is beyond the scope of this paper, and may be of questionable
benefit. For simplicity and numerical tractability it was decided to maintain a link based
structure, and not to attempt to model the upstream propagation of queue dissipation.
The queue model presented thus far is clearly not suitable for a real network, since the travel
times of the network paths are 0 if the flows are below capacities. A running segment is thus
added so as to represent both congested and uncongested flow regimes.
Figure 6 Proposed link model (a running segment followed by a queuing segment).
The idea of this paper is to use the supply and demand flow rates to establish flow rates at the
nodes and to use links that have a running segment and a queuing segment (that are modelled
as shown before) in order to represent the whole transportation network on a link basis (see
Figure 6), and to include the modelling of spill-backs.
5. THE NETWORK MODELLING CONCEPT
In this section a complete network loading model is formulated, such that once the set of paths
$K_r$ between each o-d pair r and the partial demand $h_k(t)$ to be allocated to each path are known:
hk(t), (10)
it is possible to reproduce the movements of the users on the network according to a set of
analytical rules. The idea of local flow as the minimum of local supply and demand is
generalised in this model. Links are composed of a running segment that has the same
analytical formulation as that presented in Friesz et al. (1993), Astarita (1995, 1996) and Wu et
al. (1995), and of a queuing segment that is governed by equations similar to the equations
presented above. In terms of input values and output values, the link model has the general
structure indicated below in Figure 7. It establishes partial supply and demand flow rates at
time $t'$ as a function of the inflow patterns at times $t < t'$. Details are shown in section 7.
Figure 7 General structure of proposed link model.
Flows at each node at time t respect the conservation condition, as in all network models,
except that flows at nodes are constrained by the interactions between the demands of entering
links and the supplies of exiting links. A link based DNL model that considers spill-back has to
solve the problem of calculating flows according to prevailing demands and supplies at each node
of the network, at each instant in time. This task may be reduced to solving the node model for n
time slices, assuming that supply and demand flow rates are constant during each time slice. More
details on the numerical solution approaches are presented in section 8. The node model, which
is described in detail in section 6, has the structure shown below in Figure 8. It establishes the
flows at time t, once supply and demand flow rates at time t are known.
Figure 8 General structure of proposed node model.

The DNL model must deal with the possibility that congestion propagates back to one of the
origin centroids (Ran and Boyce, 1994). By giving the first link of each path an infinite link
storage space, the model is able to deal with this situation. A connector link (from a centroid to
the network) is used for this purpose. The attributes of connector links are the same as those of
other links of the network, except that a connector has a fixed running travel time (which may
also be 0) and an infinite storage space. In the next two sections the node and link models are
presented. The node model is also an issue for the network extension of the Lighthill and Whitham
model (see Lebacque, 1996; and Daganzo, 1995b).
6. THE NODE MODEL
In this section we consider the problem of defining the rules that govern flows at any
intersection node in the network. The problem is to distribute the limited resource of physical
space on downstream links between users in a proper way.
Due to the constraining capacities, flow conservation rules alone are not sufficient in the DNL:
we need to solve a more complex problem that can be expressed as an optimization problem,
where the objective function is to maximize the total flow crossing the node and the constraints
are the local supplies and demands at the node.
A node stream ab is defined to be the set of all paths that contain link a, which enters into a
node i, followed by link b, which exits from this node (see Figure 9).
Set of paths:
path 1: x-x-1-4-x-x-x-x-x
path 2: x-x-x-1-4-x-x
path 3: x-1-5-x-x-x-x-x-x-x
path 4: x-2-4-x-x-x-x-x
path 5: x-x-x-x-x-x-x-2-4-x
path 6: x-x-3-4-x-x-x-x-x
path 7: x-x-x-x-3-4-x-x
path 8: x-x-3-5-x-x-x-x-x
Stream 14 = paths 1 + 2
Stream 34 = paths 6 + 7
Figure 9 Node stream.
The node problem consists of evaluating the flows for each node stream at each time t of the
simulation, given the values of partial supplies and partial demands of all the links included in
$A_i^-$ and $A_i^+$. The outflows $wq_{ab}(t)$ of link a are equal to the partial inflow rates $ur_{ab}(t)$ of
link b, so we have the following set of conservation equations for each pair of entering and exiting
links:
$$wq_{ab}(t) = ur_{ab}(t); \quad a \in A_i^-,\ b \in A_i^+. \qquad (11)$$
Stream supply (and demand) flow rates are the inputs to the intersection problem and respect
the following conditions, where $\delta_a^k(t)$ is the partial demand flow rate of path k from link a
to its following link, i is the node connecting links a and b, and $\sigma_{ab}(t)$ is the partial supply flow
rate of link b given to link a:
$$\delta_{ab}(t) = \sum_{k:\, a,b \in k} \delta_a^k(t)$$
$$\delta_a(t) = \sum_{b \in A_i^+} \delta_{ab}(t)$$
$$\sigma_b(t) = \sum_{a \in A_i^-} \sigma_{ab}(t)$$
The computation of the node flow rates uses the model described below, which is but one of
several possible approaches to this problem (see Lebacque, 1996). This formulation has been
used by the authors in a complete analytical implementation of the model (for more details see
Adamo et al., 1998b). It is in fact the simplest way of solving the node problem while
respecting the FIFO rule, as it has an analytical solution. It can be written as a problem of
maximizing the total flow Z through node i (the index t is omitted for simplicity of exposition):
$$Z = \sum_{a \in A_i^-} \sum_{b \in A_i^+} wq_{ab}$$
subject to the following constraints:
• Flow rates cannot be negative:
$$wq_{ab} \geq 0, \quad a \in A_i^-,\ b \in A_i^+ \qquad (16)$$
• Flow rates from link a to b must be less than or equal to the demand from a to b:
$$\delta_{ab} \geq wq_{ab}, \quad a \in A_i^-,\ b \in A_i^+ \qquad (17)$$
• Total inflow rates of link b must be less than or equal to the total supply flow rate of link b:
$$\sum_{a \in A_i^-} wq_{ab} \leq \sigma_b, \quad b \in A_i^+ \qquad (18)$$
• The partial flow rate from link a to link b ($wq_{ab}$), relative to the total flow exiting link a,
maintains the proportion of the partial demand flow rate from link a to link b ($\delta_{ab}$) to the
total demand flow rate of link a. This rule guarantees that FIFO is respected on link a, by
ensuring that, if a stream is delayed because of a lack of supply on its destination link, all other
streams exiting the same link are delayed as well. $\alpha_{ab}$ is defined as the proportion of partial
demand $\delta_{ab}$ relative to the total demand $\delta_a$:
$$\alpha_{ab} = \delta_{ab} \Big/ \sum_{b' \in A_i^+} \delta_{ab'} = \delta_{ab} / \delta_a, \quad a \in A_i^-,\ b \in A_i^+ \qquad (19)$$
and thus the constraint is:
$$wq_{ab} = \alpha_{ab} \sum_{b' \in A_i^+} wq_{ab'}, \quad a \in A_i^-,\ b \in A_i^+ \qquad (20)$$
• In a similar way $\beta_{ab}$ is defined as the ratio of the partial supply $\sigma_{ab}$ to the total supply $\sigma_b$:
$$\sigma_{ab} = \beta_{ab}\,\sigma_b, \quad a \in A_i^-,\ b \in A_i^+ \qquad (21)$$
so that the flow of stream ab must then be less than or equal to the given supply:
$$wq_{ab} \leq \beta_{ab}\,\sigma_b, \quad a \in A_i^-,\ b \in A_i^+ \qquad (22)$$
For each entering link a at node i the resulting partial outflow rate to link b which solves
the problem (16-22) is given by:
$$wq_{ab}(t) = wq_a(t)\,\frac{\delta_{ab}(t)}{\delta_a(t)} = \min\Big\{1,\ \min_{b' \in A_i^+} \frac{\sigma_{ab'}(t)}{\delta_{ab'}(t)}\Big\}\,\delta_{ab}(t), \quad a \in A_i^-,\ b \in A_i^+. \qquad (23)$$
The principal weakness of this formulation is that the values $\beta_{ab}$ are fixed. As a result, excess
supply is not re-assigned. If link b has a higher supply flow rate $\sigma_{ab}(t)$ than the demand $\delta_{ab}(t)$,
so that $\min(\sigma_{ab}(t), \delta_{ab}(t)) = \delta_{ab}(t)$, the remaining supply flow rate $\sigma_{ab}(t) - \delta_{ab}(t)$ cannot be
used by the other links, as would be physically expected. Thus a link will always affect the
supply for other links in the same way, regardless of its flow. There are other node models that
can overcome this deficiency (see Daganzo, 1995b for two links merging). For the sake of brevity
the reader is referred to Adamo et al. (1998b). The purpose of this paper is to give a general
framework for link models, rather than to present all possible model formulations.
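The closed-form solution (23) can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the function name `node_outflows`, the nested-dictionary layout and the link labels are all hypothetical, and the partial supplies are taken as already allocated to each entering link.

```python
def node_outflows(delta, sigma):
    """Stream flow rates wq[a][b] at one node, following equation (23).

    delta[a][b]: partial demand from entering link a to exiting link b.
    sigma[a][b]: partial supply of exiting link b allocated to link a.
    """
    wq = {}
    for a, demands in delta.items():
        # One common reduction factor per entering link: if any stream of
        # link a lacks supply, every stream of link a is slowed down (FIFO).
        ratios = [sigma[a][b] / d for b, d in demands.items() if d > 0.0]
        factor = min([1.0] + ratios)
        wq[a] = {b: factor * d for b, d in demands.items()}
    return wq


# Hypothetical diverge: link "a1" feeds "b1" (short of supply) and "b2".
delta = {"a1": {"b1": 300.0, "b2": 100.0}}
sigma = {"a1": {"b1": 150.0, "b2": 500.0}}
flows = node_outflows(delta, sigma)
print(flows)
```

In this example the stream towards "b2" is cut to half of its demand although "b2" has ample supply, which is exactly the FIFO coupling expressed by constraint (20). The unused supply of "b2" is not passed on to other entering links, which is the weakness noted above.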
7. THE LINK MODEL
A FIFO continuous time link dynamic modelling approach has been presented in Friesz et al.
(1993), Astarita (1995, 1996), Wu et al. (1995) and Ran et al. (1997), among others. The proposed
link model consists of the following system of differential equations, which constitutes a
complete link model where $t \in [0, T']$:
$$\frac{d\,x_a(t)}{dt} = u_a(t) - w_a(t) \qquad (24)$$
$$\tau_a(t) = T_a(x_a(t)) \qquad (25)$$
$$w_a(t + \tau_a(t)) = \frac{u_a(t)}{1 + \dfrac{d\,\tau_a(t)}{dt}} \qquad (26)$$
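The FIFO flow propagation rule of equation (26) has a simple discrete counterpart: vehicles entering during an interval leave during the interval bounded by the corresponding exit times. The sketch below illustrates this under the assumption of piecewise constant inflow; the function name `fifo_outflow` and the sample values are hypothetical.

```python
def fifo_outflow(ts, u, tau):
    """Discrete counterpart of equation (26).

    ts: entry times; u[j]: inflow rate on [ts[j], ts[j+1]];
    tau[j]: travel time of the vehicle entering at ts[j].
    Returns (exit_times, w), where w[j] is the outflow rate on
    [exit_times[j], exit_times[j+1]].
    """
    exits = [t + x for t, x in zip(ts, tau)]
    w = []
    for j in range(len(ts) - 1):
        span = exits[j + 1] - exits[j]
        if span <= 0.0:
            raise ValueError("FIFO violated: exit times must increase")
        # Same vehicles, different interval length: u * dt / (dt + d_tau).
        w.append(u[j] * (ts[j + 1] - ts[j]) / span)
    return exits, w


# Travel time grows by 0.5 per time unit during the first two intervals.
exits, w = fifo_outflow(
    ts=[0.0, 1.0, 2.0, 3.0],
    u=[2.0, 2.0, 2.0],
    tau=[10.0, 10.5, 11.0, 11.0],
)
print(exits, w)
```

While the travel time is increasing the outflow rate is lower than the inflow rate (here by the factor 1/1.5), and it equals the inflow rate once the travel time is constant.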
This link model does not represent the backward propagation of queues due to the lack of
storage space. To overcome this limitation the link model is modified as described in section 4:
links are supposed to be composed of two segments, a running segment (following equations
37-44) and a queuing segment (following equations 45-55). The running segment at time t is
travelled at a speed which depends on the number of users on the running part, as in the DNL
models indicated above (equations 24-26); alternatively, the travel time for the running
segment depends on the number of users on both the running and queuing parts. The queuing
segment holds vehicles which are unable to exit the link because the next link of their path
cannot accept them. A link may be unable to accept traffic because of insufficient storage space or
because the upstream capacity has too low a value.
A supply flow rate value is then introduced which corresponds to the upstream capacity of
exiting links. The supply is equal to a pre-established upstream capacity $C_a^{in}$ or, when the
link is saturated because the traffic on both running and queuing segments is equal to
the link storage space, to the present outflow rate $wq_a$ from
the queuing segment. For simplicity of exposition, only the equations for a single path along a
sequence of links (see Figure 10), as done in section 4, are presented and discussed briefly:
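Figure 10 Sequence of links.
Running segment:
$$\frac{d\,xr_a(t)}{dt} = ur_a(t) - wr_a(t) \qquad (26)$$
$$\tau r_a(t) = Tr_a(xr_a(t)) \qquad (27)$$
$$wr_a(t + \tau r_a(t)) = \frac{ur_a(t)}{1 + \dfrac{d\,\tau r_a(t)}{dt}} \qquad (28)$$
Queuing segment:
$$\frac{d\,xq_a(t)}{dt} = uq_a(t) - wq_a(t) \qquad (29)$$
$$wq_a(t) = \min\big(\delta_a(t),\ \sigma_{a^+}(t)\big) \qquad (30)$$
$$uq_a(t) = wr_a(t) \qquad (31)$$
Demand and supply:
$$\delta_a(t) = \begin{cases} wr_a(t) & \text{if } xq_a(t) = 0 \text{ and } wr_a(t) < C_a^{out} \\ C_a^{out} & \text{otherwise} \end{cases} \qquad (32)$$
$$\sigma_a(t) = \begin{cases} C_a^{in} & \text{if } xr_a(t) + xq_a(t) < C_a^{s} \\ \min\big(wq_a(t),\ C_a^{in}\big) & \text{if } xr_a(t) + xq_a(t) \geq C_a^{s} \end{cases} \qquad (33)$$
Link storage and non-negativity constraints:
$$xr_a(t) + xq_a(t) \leq C_a^{s} \qquad (34)$$
$$xr_a(t),\ xq_a(t),\ uq_a(t),\ ur_a(t),\ wq_a(t),\ wr_a(t) \geq 0 \qquad (35)$$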
Equations (26) and (29) are the conservation equations. The running segment is represented
exactly as in the previous formulations. The outflow rate from the running segment has no
constraint and the flow is always allowed to exit the running segment. The queuing segment is
capable of holding the flow in case there is not enough space on the following link or if the
flow is larger than the downstream capacity. Equation (30) is the node model for this simple
single path problem; it establishes that the outflow of the queuing segment is the minimum
of its demand flow rate and the supply flow rate of the next link.
Equation (32) states that the demand flow rate when there is no queue is just the running
segment outflow rate limited by the downstream capacity $C_a^{out}$. If a queue is present, the
demand flow rate is always equal to $C_a^{out}$. Equation (33) gives the supply flow rate for a link:
when the link is full it is equal to the outflow rate of the link, and the link is saturated when the
total traffic on its running and queuing segments is equal to $C_a^{s}$.
Running and queuing segments have very different behaviour: the dynamics of the running
segment are based on the travel time function, and the outflow is determined by imposing FIFO
as a function of inflow rate and travel time (equation (28)). The dynamics of the queuing
segment are based on the allowed outflow, which may be determined by the traffic on
downstream links. The travel time in that case cannot be established at the time traffic arrives
at the beginning of the segment but depends on the inflow and outflow functions. In fact, in a
spill-back situation, when the flow on the queuing segment of a link is affected by the
following links of the path, it is not possible to establish the travel time explicitly. The motion
of traffic may be affected by changes in supply flow rate while it is still in the queue. As before,
the travel time can be calculated by imposing FIFO as follows:
$$Uq_a(t) = Wq_a(t + \tau q_a(t)) \qquad (36)$$
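In a multidestination network the problem is more complicated, as it is necessary to propagate
flows on different paths maintaining the right proportion with respect to the FIFO rule. The
running segment can be represented by:
Running segment:
$$\frac{d\,xr_a^k(t)}{dt} = ur_a^k(t) - wr_a^k(t), \quad k : a \in k \qquad (37)$$
(38)
$$wr_a^k(t + \tau r_a(t)) = \frac{ur_a^k(t)}{1 + \dfrac{d\,\tau r_a(t)}{dt}}, \quad k : a \in k \qquad (39)$$
(40)
(41)
(42)
$$\Delta_{ak} = 0 \ \text{ if } a \notin k \qquad (43)$$
$$\Delta_{ak} = 1 \ \text{ if } a \in k \qquad (44)$$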
This formulation guarantees that the proportions of flow on different paths are respected due to
equations (39). The equations for the queuing segment are as follows:
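Queuing segment:
$$\frac{d\,xq_a^k(t)}{dt} = uq_a^k(t) - wq_a^k(t), \quad k : a \in k \qquad (45)$$
$$uq_a^k(t) = wr_a^k(t), \quad \forall k : a \in k \qquad (46)$$
(47)
(48)
(49)
$$Uq_a(t) = Wq_a(t + \tau q_a(t)) \qquad (50)$$
(51)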
Equations (45) are the continuity equations for each path, in which inflow and outflow are
given respectively by the running segment (equation (46)) and by the intersection model.
Equation (50) is the FIFO equation that establishes the travel time of the link and, together with
equations (51), the right propagation of the composition of flows. The following equations
are for the total demand and supply flow rates. The partial demands are obtained at time t
according to the composition of flow given by equations (51).
Demand and supply:
$$\delta_a(t) = \begin{cases} wr_a(t) & \text{if } xq_a(t) = 0 \text{ and } wr_a(t) < C_a^{out} \\ C_a^{out} & \text{otherwise} \end{cases} \qquad (52)$$
$$\sigma_a(t) = \begin{cases} C_a^{in} & \text{if } xr_a(t) + xq_a(t) < C_a^{s} \\ \min\big(wq_a(t),\ C_a^{in}\big) & \text{if } xr_a(t) + xq_a(t) \geq C_a^{s} \end{cases} \qquad (53)$$
Link storage and non-negativity constraints:
$$xr_a(t) + xq_a(t) \leq C_a^{s} \qquad (54)$$
$$xr_a^k(t),\ xq_a^k(t),\ uq_a^k(t),\ ur_a^k(t),\ wq_a^k(t),\ wr_a^k(t) \geq 0 \qquad (55)$$
This formidable set of equations cannot be solved analytically except for some trivial cases. In
order to compute flows which respect the constraints, some approximate methods are suggested
below. The FIFO condition is satisfied when the sufficient conditions in Xu et al. (1996) are
verified by the running segment of the link. The queuing segment always satisfies the FIFO
condition. For the sake of brevity no proof is given here that the link FIFO condition implies
path FIFO for this model.
8. SOLUTION APPROACHES
Three different methods are introduced here in order to solve the proposed DNL model:
• A point packet approach is presented in Adamo et al. (1996). This methodology is able
to represent flow propagation without FIFO rule violations or other inconsistencies even in a
multidestination network. The approach is able to easily and adequately reproduce network
bottlenecks. Packets on the network are represented as points and the travel time of a link for
each packet is established at the moment the packet reaches the beginning of the link. Travel
time depends on the traffic on the link at that time plus the packet itself.
• The model can be solved by numerical methods that consider a discretized version of
the problem. This methodology cannot be easily applied to real network simulation: the
representation of a whole network with finite difference methods based on a time discretization
common to all links is too complicated, as it needs to retain the inflow patterns on each link.
Some examples are presented in Adamo et al. (1998c).
• An approximate methodology for solution, which is based on a different time
discretization for each link, thus generating an asynchronous representation of the network, has
been presented in detail in Adamo et al. (1998c). This methodology, which is an extension of
the one presented in Astarita (1996), is capable of dealing with spill-backs, is very convenient
from a computational point of view, has a numerical solution that is reasonably close to the exact
solution for practical applications and guarantees the exact conservation of vehicles.
This last procedure, which is an event based simulation, is briefly introduced here. The general
flow chart of the DNL procedure is in Figure 11.
1. Initialize heap of events.
2. Select first event (E on link L) and the related instant (tL).
3. Evaluate number of users entered link L.
4. Evaluate number of users exited from link L to other links.
5. Compute next time related to event on link L (tL).
6. Insert the new event into the heap and update heap of events.
Figure 11 Algorithm of approximate DNL procedure.
The basic idea is simulating the network with constant values of flows between two
consecutive events. The flow rates change only at each event time, hence the complete dynamic
evolution of the network flows is always easy to follow. The traffic on each segment is easily
evaluated. This procedure for the DNL identifies the following events:
Running event: the time instant at which each running segment is discharged
Queuing event: the time instant at which each queuing segment is discharged.
Demand event: A change in demand flow rate of a link generates this event. The associated
node is checked by considering the set of demands and supplies of its entering and exiting
links in order to determine if there is a change in the flows of all streams of the node.
Supply event: A change in the supply flow rate of a link generates this event. The associated
node is checked by considering the set of entering links. For each of these links a demand
event is generated.
Empty queue event: This event occurs when a queue is totally discharged and has the effect of
changing the demand flow rate of the link.
Full link event: This event occurs when a link is full, in other words when the number of users
on both running and queuing segments are equal to the storage space of the link. It has the
main effect of changing the supply flow rate of the link.
Create queue event: This event occurs when the state of the link queuing segment is empty
and the demand of the running segment of the link cannot be absorbed by the following link(s).
Inflow event: This event occurs when one of the inflow rates is changed. It generates a new
full link event when necessary, and loads the entered vehicles on the link.
Outflow event: This event occurs when the outflow rates are changed. It generates a new full
link event or empty queue event, when necessary.
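The event based idea, constant flow rates between consecutive events kept in a heap, can be sketched on a single queuing segment. This is a deliberately reduced illustration and not the procedure of Adamo et al. (1998c): only inflow events and empty queue events are handled, and the function name `simulate_queue`, the event tuple layout and the version counter used to discard outdated events are assumptions.

```python
import heapq


def simulate_queue(inflow_changes, c_out):
    """Event based evolution of one vertical queue with capacity c_out.

    inflow_changes: list of (time, new_inflow_rate) inflow events.
    Returns the list of (time, queue) at every processed event.
    """
    heap = [(t, 0, "inflow", rate) for t, rate in inflow_changes]
    heapq.heapify(heap)
    t_now = queue = inflow = outflow = 0.0
    version = 0  # bumped at each event; older "empty" events are stale
    history = []
    while heap:
        t, ver, kind, rate = heapq.heappop(heap)
        if kind == "empty" and ver != version:
            continue  # scheduled under flow rates that have since changed
        # Flow rates were constant since the previous event.
        queue = max(0.0, queue + (inflow - outflow) * (t - t_now))
        t_now = t
        if kind == "inflow":
            inflow = rate
        else:
            queue = 0.0  # empty queue event
        outflow = c_out if (queue > 0.0 or inflow >= c_out) else inflow
        version += 1
        if queue > 0.0 and inflow < outflow:
            # Schedule the instant at which the queue will be discharged.
            t_empty = t + queue / (outflow - inflow)
            heapq.heappush(heap, (t_empty, version, "empty", 0.0))
        history.append((t, queue))
    return history


# Inflow of 3 veh/s for 10 s into a bottleneck of 2 veh/s, then no inflow.
history = simulate_queue([(0.0, 3.0), (10.0, 0.0)], c_out=2.0)
print(history)
```

The queue is only evaluated at event times, so the cost depends on the number of flow changes rather than on a fixed time step. A network version would add the demand, supply and full link events listed above.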
9. A SOLVED EXAMPLE ON A SMALL NETWORK
In this section a simple example is presented and solved by an implementation of the algorithm
described above. This example demonstrates how the storage capacity is respected in coherence
with the analytical formulation. A network with two paths, six links, two origins and only one
destination is considered (see Figure 12).
The first path is 1-2-5-6, and the second path is 3-4-5-6.

Figure 12 Simulated network.
The total time of the simulation is 3800 sec. The path flows (veh./sec.) are constant between 0
sec. and 1800 sec. and have the following values:
Path 1 flow: 0.14 veh./sec.
Path 2 flow: 0.14 veh./sec.
Between 1800 sec. and 3800 sec. the path flows are 0.
The links of the network have the following attributes (the object is to obtain a simple case):
Link  Length  Parameter a  Parameter b   Upstream     Downstream   Storage  Jam
      (m.)    (sec.)       (sec./veh.)   capacity     capacity     space    density
                                         (veh./sec.)  (veh./sec.)  (veh.)   (veh./km.)
1     500     60           0             1000         1000         1000     2000
2     500     60           0             1000         1000         10       60
3     500     60           0             1000         1000         1000     2000
4     500     60           0             1000         1000         1000     2000
5     500     60           0             1000         0.2          10       40
6     500     60           0             1000         1000         1000     2000
Parameters a and b are those of the linear cost function of the running segment.
Table 1 Attributes of the network.
The travel time function used is linear, with the parameters a and b of Table 1 (any desired
function can be used):
$$Tr_a(xr_a(t)) = a + b \cdot xr_a(t)$$
where $xr_a(t)$ is the number of users on the running segment at time t. With this travel time
function the speed is obtained as:
$$v_a = \frac{L_a}{Tr_a(xr_a)}$$
where $L_a$ is the length of the link.
And the travel time of the running segment is a function of xra(t) and xqa(t):
~n (t\ T
L
-~
,(0
The total number of vehicles accumulated on each link (both running and queuing segments) is
shown for links 1 to 5 in Figure 13:
Figure 13 Total number of vehicles on the links (including both running and queuing
segments). Vertical axis: vehicles on the links (veh.); horizontal axis: time (sec.).
Storage capacities are respected on links 2 and 5, and the queues grow only on links 1 and 4, which
have enough storage space to contain the queue without causing it to spread onto other links.
After 30 minutes the path demands are again 0 and the queues on links 1 and 4 start to
discharge (first on link 1, then on link 4).
The outflow rates of each link are shown in Figure 14:
Figure 14 Outflow rates. Vertical axis: outflow (veh./sec.); horizontal axis: time (sec.).
The outflow rate of link 5 is equal to its downstream capacity. When the back of the queue affects
links 1, 2 and 4, their outflow is equal to half the capacity of link 5, because $\beta_{25} = \beta_{45} = 0.5$.
The supply flow rate of link 5 is equally shared between links 2 and 4. Link 3 is not affected by the
queue because the storage space on link 4 is enough to store the queue. Link 1 has a drop in the
outflow rate at the moment the queue reaches its downstream end (when link 2 is full, as can be
seen in Figure 13).
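The shares quoted above follow directly from the data of the example. The sketch below is only a rough consistency check: it ignores the running segments and the start-up transient, and the variable names are hypothetical.

```python
C_OUT_5 = 0.2                # downstream capacity of link 5 (veh./sec.)
BETA = {2: 0.5, 4: 0.5}      # supply shares of link 5 for links 2 and 4
PATH_FLOW = 0.14             # demand of each path (veh./sec.)
LOADING_PERIOD = 1800.0      # duration of the demand (sec.)

# Partial supply of link 5 given to each entering link.
share = {a: beta * C_OUT_5 for a, beta in BETA.items()}
# Outflow of links 2 and 4 once the queue on link 5 constrains them.
outflow = {a: min(PATH_FLOW, s) for a, s in share.items()}
# Rate at which vehicles accumulate upstream on each path.
growth = {a: PATH_FLOW - w for a, w in outflow.items()}
# Approximate number of vehicles held back per path by the end of loading.
held_back = {a: g * LOADING_PERIOD for a, g in growth.items()}
print(outflow, held_back)
```

Under these simplifications each path is limited to 0.1 veh./sec. and accumulates vehicles at about 0.04 veh./sec., i.e. roughly 72 vehicles per path by the end of the loading period, most of which must be stored on links 1 and 4 because links 2 and 5 hold only 10 vehicles each.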
10. CONCLUSION
In this paper a proposal has been made to extend link-based DNL models to represent the
spill-back phenomenon. A complete set of modelling approaches for intersection representation
has been given in Adamo et al. (1998b). Future research efforts will be devoted to developing
the approximate solution methodology (briefly introduced above) and to
implementing and validating the model on real data and networks.
A simulation implementation on a medium-size city network (based on the point packet
approach) has already been submitted in Adamo et al. (1998a), showing that the analytical rules
presented here can be usefully implemented in planning applications.
11. REFERENCES
Adamo, V., Astarita, V. and Di Gangi, M. (1996) A dynamic network loading model for
simulations of queue effects and while-trip re-routing. 24th European Transport Forum
PTRC, 2-6 September 1996, Brunel University, Uxbridge.
Adamo V., Astarita V., Cantarella G.E. and Cascetta E. (1998a) "A Doubly Dynamic Traffic
Assignment Model for Planning Application", submitted to 14th International
Symposium on Theory of Traffic Flow.
Adamo V., Astarita V., Florian M., Mahut M. and Wu J.H. (1998b) Link based dynamic
network loading models with spill-back: intersection models. Publication CRT, 1998.
Adamo V., Astarita V., Florian M., Mahut M. and Wu J.H. (1998c) Link based dynamic
network loading models with spill-back: solution by simulation. Publication CRT, 1998.
Astarita, V. (1995) Flow Propagation Description in Dynamic Network Loading Models.
Proceedings of IV International Conference on Application of Advanced Technologies in
Transportation Engineering (AATT), 27-30 June 1995, Capri. Published by American
Society of Civil Engineers, pp. 599-603.
Astarita, V. (1996) A continuous time link model for dynamic network loading based on travel
time function. 13th International Symposium on Theory of Traffic Flow, Lyon, July 1996.
Published by Elsevier, pp. 79-102.
Barcelo, J. (1996) The parallelization of AIMSUN2 Microscopic Traffic Simulator for ITS
applications. Presented at the 3rd World Conference on Intelligent Transport Systems,
held in Orlando on October 14-18, 1996.
Barcelo, J., Ferrer J.L. and Montero L. (1989) AIMSUN: Advanced Interactive Microscopic
Simulator for Urban Networks. Vol I: System Description, and Vol II: Users Manual.
Departamento de Estadistica e Investigacion Operativa. Facultad de Informatica.
Universidad Politecnica de Cataluna.
Boyce, D.E., Ran, B. and LeBlanc, L.J. (1991) Dynamic User-Optimal traffic Assignment
model: A new Model and Solution Technique. First TRIennal Symposium on
Transportation ANalysis, Montreal Canada, June 6-11 1991.
Carey, M. (1986a) A constraint qualification for a dynamic traffic assignment model.
Operations Research, 35 No.5, pp. 55-58.
Carey, M. (1986b) Optimal time-varying flows on congested networks. Operations Research,
35 No. 5, pp. 58-69.
Cascetta, E. and G.E. Cantarella (1991) A Day-to-day and Within-day Dynamic Stochastic
Assignment Model. Transportation Research 25A (5), pp. 277-291.
Cantarella G.E. and Cascetta E. (1997) Un modello di assegnazione doppiamente dinamica del
traffico. 3° Convegno Nazionale Progetto Finalizzato Trasporti 2, Taormina, 10-12
November, 1997.
Daganzo C.F. (1995a) Properties of Link Travel Time Functions under Dynamics Loads.
Transportation Research 29B(2), pp. 95-98.
Daganzo C.F. (1995b) The cell Transmission model, Part I and Part II. Transportation Research
29B No. 2.
de Romph, E., van Grol, H.J.M. and Hamerslag, R. (1992) A Dynamic Traffic Assignment
model for Short-Term predictions. Seminar on Urban Traffic Networks, Capri, July 1992.
Friesz, T.L., Luque J., Tobin R.L. and Wie B.W. (1989) Dynamic network traffic assignment
considered as continuous time optimal control problem. Operations Research, 37, pp.
893-901.
Friesz, T.L., Bernstein, D., Smith, T.E., Tobin, R.L. and Wie, B.W. (1993) A Variational
Inequality Formulation of the Dynamic Network User Equilibrium Problem. Operations
Research, 41, pp. 179-191.
Hammerslag, R. (1988) A three-dimensional assignment in the time-space. UTSG Annual
Conference, 1988, London.
Lebacque J.P. (1996) The Godunov scheme and what it means for first order traffic flow
models. 13th International Symposium on Theory of Traffic Flow, Lyon, July 1996.
Published by Elsevier, pp. 647-677.
Lighthill M., Whitham G. (1955) On Kinematic waves I and II: a theory of traffic flow on long
crowded roads. Proc. Royal Society, London, Series A, 229, pp. 317-345.
Merchant, D.K. and G.L. Nemhauser (1978) A model and an algorithm for the dynamic traffic
assignment problems. Transportation Science, 12 (3), pp. 183-207.
Janson, B.N. (1989) Dynamic traffic assignment for urban road networks. Transportation
Research, 25B (2/3) pp. 143-161.
Ran B. and Boyce D.E. (1994) Dynamic Urban transportation Network Models. Lecture notes
in economics and mathematical systems, n.417. Springer-Verlag.
Ran B., Rouphail N.M., Tarko A. and Boyce D.E. (1997) Toward a class of link travel time
functions for dynamic assignment models on signalized networks. Transportation
Research, 31B, No. 4, pp. 277-290.
Rilett L., Benedek C., Rakha H., and Van Aerde M. (1994) Evaluation of IVHS Options Using
CONTRAM and INTEGRATION. First World Congress on Applications Transport
Telematics & Intelligent Vehicle Highway Systems, Paris, France.
Smith M.J. and Wisten M.B. (1996) A distributed algorithm for the dynamic traffic equilibrium
assignment problem. 13th International Symposium on Theory of Traffic Flow, Lyon,
July 1996. Published by Elsevier, pp. 385-408.
Van Aerde M., Yagar S., Ugge A. and Case E.R. (1988) A Review of Candidate Freeway
Arterial Corridor Traffic Models. Transportation Research Record 1132, TRB.
Vythoulkas, P.K. (1990) A Dynamic Stochastic Assignment Model for the Analysis of General
Networks. Transportation Research, 24B (6), pp. 453-469.
Xu Y., Wu J.H., Florian M., Marcotte P. and Zhu D.L. (1996) New advances in the continuous
dynamic network loading problem. Publication CRT-96-26. To appear in
Transportation Science.
Wie, B.W., Friesz T.L. and Tobin R.L. (1990) Dynamic user optimal traffic assignment on
congested multi destination networks. Transportation Research, 24B(6), pp. 431-442.
Wu J.H., Chen Y. and Florian M. (1998) The continuous dynamic network loading problem: a
mathematical formulation and solution method. Transportation Research, 32B, pp. 173-187.
CHAPTER 8
TRAFFIC INFORMATION AND CONTROL
Nine-tenths of wisdom is being wise in time. (Theodore Roosevelt)

Genius is one percent inspiration and ninety-nine percent perspiration.
(Thomas Alva Edison)
Never mistake motion for action (Ernest Hemingway).
INVESTIGATION OF ROUTE GUIDANCE GENERATION ISSUES
BY SIMULATION WITH DYNAMIT
J. Bottom¹, M. Ben-Akiva¹, M. Bierlaire², I. Chabini¹, H. Koutsopoulos³ and Q. Yang⁴
¹ Massachusetts Institute of Technology
² Swiss Federal Institute of Technology
³ Volpe Transportation Systems Center
⁴ Caliper Corporation
Abstract
We examine some of the issues associated with the generation of consistent anticipatory route
guidance, focusing particularly on the identification of effective and efficient algorithms, and on
the exploration of tradeoffs between computation speed and guidance quality. First we provide a
formulation of the guidance generation problem in terms of fixed points and propose specific
algorithms for solving such problems; these include standard averaging methods, which are
known to be slow, as well as a variety of "accelerated" methods. We next describe a rolling
horizon approach for practical guidance generation in a traffic information center. We identify a
number of implementation parameters that affect tradeoffs between guidance quality and
computation speed. We then investigate these issues by simulation, using the DynaMIT software
system. Results of a number of simulation experiments are presented. The paper concludes by
pointing out areas where further research is needed.
1. INTRODUCTION
By providing tripmakers with information about travel options and so allowing them to make
better travel decisions, Advanced Traveler Information Systems (ATIS) promise to enhance the
utilization of existing network infrastructure and to help manage congestion. Of particular
interest here are route guidance systems, which inform road users about traffic network
conditions (link traversal times, queue lengths, etc.) and/or suggest a path to follow from their
current position to their ultimate destination.
We are specifically interested in anticipatory route guidance, in which real-time traffic
measurements are combined with other data to make short-term predictions of road network
traffic conditions, and these predictions are the basis of the guidance information that is provided
to drivers by some means of communication such as variable message signs, highway advisory
radio, infrared beacons or digital cellular radio.
Anticipatory route guidance is likely to be more effective than route guidance based on historical
or on current traffic conditions because it accounts for the probable evolution of traffic
conditions over time and throughout the network, and it bases the information provided to a
driver on the traffic situation that is forecast to prevail at network locations at the time the driver
will actually arrive there. Simulation experiment results and theoretical analyses of small-scale
networks have generally confirmed this expectation (Kaysi, 1992; Ben-Akiva et al., 1996; Hall,
1996; Engelson, 1997).
Any method for generating anticipatory route guidance must confront the following fundamental
issue: anticipatory guidance is derived from predictions of future conditions, but these conditions
will themselves be affected by drivers' reactions (whatever they may be) to the guidance that
they receive. It is therefore necessary to ensure that anticipatory guidance is consistent—in
other words, that the forecasts on which the guidance is based are the same, within the limits of
modeling accuracy, as the outcome that is predicted to result when drivers react to the guidance.
Clearly, adequate treatment of route guidance generation requires explicit incorporation of
information as a problem component: applicable models must take account of the way in which
information is disseminated by the communication system, and of the way in which this
information is reacted to by drivers. Furthermore, route guidance generation requires an
algorithmic approach capable of addressing the complexities which information introduces into
dynamic network modeling.
To be useful in an operational environment, a route guidance generation method must also
confront a number of practical issues. Perhaps the most challenging of these is the time
constraint: guidance generation may involve considerable amounts of computation which must
be completed quickly and accurately enough for the results to be timely and of use to drivers.
Prior work by our group has examined theoretical aspects of the route guidance generation
problem and has developed software tools for working with this problem. Specifically, in
Bottom et al. (1998) we developed an analysis framework for route guidance generation, posing
it as a class of fixed point problems likely to be solvable by standard methods. In Ben-Akiva et
al. (1997a), we described the features of DynaMIT, a software system we have developed for
real-time guidance generation in an operational traffic information center. Finally, in Yang and
Koutsopoulos (1996) we presented MITSIM, an advanced microscopic traffic simulator which
accounts for driver reactions to information.
In this paper we examine more thoroughly some of the issues associated with the generation of
consistent anticipatory route guidance, focusing particularly on the identification of effective and
efficient algorithms, and on the exploration of tradeoffs between computational speed and
guidance quality. We do this by building on and extending our previous work.
We first briefly recall the analysis framework which leads to the formulation of the guidance
generation problem in terms of fixed points, after which we propose specific algorithms for
solving such problems; these include standard (i.e., MSA-type) averaging methods, which are
known to be slow, as well as a variety of "accelerated" methods which promise to offer
significant computational speedups. We next describe how guidance generation would likely be
carried out in a traffic information center, focusing on a rolling horizon approach. From this
approach, we identify a number of implementation parameters that can be adjusted in order to
affect the tradeoff between guidance quality and computation speed; such tradeoffs are likely to
be important, given the substantial amount of computation entailed by the guidance generation
problem and the time-critical nature of its solution. We then investigate these issues by
simulation. We do this by interfacing the MITSIM and DynaMIT software systems in a way
which mimics how the latter would operate in a traffic information center. The combined system
allows us to investigate the performance of the route guidance generation system by comparing it
with the detailed simulation of reality. Results of a number of simulation experiments are
presented. The paper finishes by summarizing the principal conclusions of the work and
pointing out areas where further research is needed.
In this investigation, we will be concerned only with user-optimal guidance on a road network,
and will consider only within-day driver behavior. We consider a fixed OD matrix of driver trips
by departure time, and do not allow for possible mode or departure time changes in response to
guidance. These simplifications have been made in order to facilitate the presentation here of the
essential ideas, but could be relaxed without requiring significant changes to the approach.
2. ROUTE GUIDANCE GENERATION AS A FIXED POINT PROBLEM

We sketch out in this section an analysis framework which enables us to formulate consistent
route guidance generation as a mathematical problem, and to investigate appropriate solution
methods. First we describe the various components which make up the framework. We then
propose three different formulations of the notion of consistency in anticipatory route guidance,
by defining alternative compositions of the components; in each case, consistency corresponds to
a fixed point of the composite mapping.
2.1 Components of the analysis framework
We assume a standard representation of a dynamic network as a directed graph consisting of
nodes and links, which is used by a population of drivers characterized by their origin,
destination, departure time, access to guidance information, and other relevant behavioral
parameters. The analysis framework is defined by three time-dependent variables and three
mappings which connect them. The variables of interest are path flows, link conditions and
guidance messages, while the mappings are the demand (path split) model, the network loading
model and the guidance generation model. Time may be considered to be either continuous or
discrete.
By a path flow we mean the number of drivers choosing a particular path at a particular time for
their trip to their destination. We designate by P the set of feasible path flows in the network.
By link conditions we mean time-dependent link impedance variables. For definiteness we will
focus here on link traversal times, although more general definitions of travel impedance can be
accommodated without difficulty. Let C denote the set of time-dependent link conditions.
Given the set of path flows, the network loading model determines the time-dependent link
conditions which result as the flows move over the network. Dynamic network loading involves
the determination of mutually consistent time-dependent link flows and conditions (including in
particular link traversal times). Such models are still the subject of active research. For our
purposes, however, we can schematically represent network loading as a mapping $S: P \to C$.
Guidance is provided in the form of time-dependent messages, each a unit of communication having a
specific content and format that can be accessed by some subset of drivers at particular network
locations and at particular times. One example of a message would be the predicted time-
dependent conditions on all links in the network, available to suitably equipped vehicles at any
location and at any time; we will refer to this as perfect guidance information. More realistically,
technological constraints may require guidance messages to convey information that is in some
way approximate, and is only available at certain locations or times. The coded wordings used
on variable message signs, broadcast link times that are truncated to reduce communications
bandwidth requirements, and next-turn instructions transmitted from a short-range infrared
beacon are all examples of this. Let M denote the set of time-dependent messages.
Given forecasts of link conditions, the guidance model outputs guidance messages. It represents
the response of a traffic information center to a particular condition forecast, taking account of
the available computation and communications technologies. (Note that this guidance is not
necessarily consistent since drivers' reactions to the guidance may invalidate the condition
forecasts.) The guidance model is thus a mapping $G: C \to M$ from the set of time-dependent link
condition predictions C to the set of time-dependent guidance messages M.
Finally, given the information available to drivers (including guidance messages), the demand
model determines the path flows that result. These flows might be obtained directly from an
aggregate model or indirectly by aggregating the outputs of individual driver behavior models.
In either case, we treat the resulting aggregate path flows as deterministic quantities. We allow
pre-trip or en route decisions, and do not necessarily assume full compliance with prescriptive
guidance messages.¹ The demand model can be represented as a mapping $D: M \to P$.
¹ The modeling of driver response to information is an important and active area of behavioral research.
Development of such models involves issues that are quite different from the central topics of this paper,
so will not be considered here. Note, moreover, that even if "perfect" models of driver response to
information were available, the question of determining what particular guidance information to
disseminate would still remain posed.
2.2 Fixed point formulations for guidance generation
The discussion in the preceding section identified three mappings useful in analyzing route
guidance. Each of these mappings relates one set of inputs to a different set of outputs, which in
turn constitute the inputs for another mapping.
Recall that we say that guidance is consistent when the assumptions used as the basis for
generating the guidance turn out to be realized, within the limits of modeling accuracy, after
drivers receive the guidance and react to it. It can be seen that any of three distinct sets of
assumptions—regarding time-dependent path flows, link conditions or guidance messages—
might be used as a basis for generating guidance. The notion of consistency can thus be
expressed in terms of the composite mapping from one of these sets of time-dependent variables
into itself. In fact there are three such mappings:
• a composite mapping of path flows into themselves, which starts with a set of time-dependent
path flows, forecasts the corresponding link conditions, computes a suitable set of guidance
messages, which are disseminated to drivers and cause them to react in some way, leading to
a new set of path flows. The composite path flow mapping is $D \circ G \circ S: P \to P$. In this case we
say that the guidance is consistent if the forecast path flows are the same as the flows that
were assumed at the start;
• a composite mapping of link conditions into themselves. This mapping begins with a set of
time-dependent link conditions, computes the corresponding guidance messages, which are
communicated to drivers and affect the path flows, thus resulting in a new set of conditions.
The composite condition mapping is $S \circ D \circ G: C \to C$. Here guidance is consistent if the
initial link conditions used for the guidance computation coincide with those which result
after the guidance is disseminated;
• a composite mapping of guidance messages into themselves which, beginning with a set of
time-dependent messages, determines the resulting path flows, forecasts the link conditions
which ensue from these, then generates a new set of guidance messages. The composite
message mapping is $G \circ S \circ D: M \to M$. The guidance is consistent if the computed guidance
messages reproduce the initially-assumed set of messages.
(Note that guidance messages are the direct output of the guidance mapping only; in the other
formulations, guidance is obtained as an intermediate result of the composite mapping.)
The coincidence of the assumed input value with the corresponding output value of a particular
composite mapping can be expressed by saying that the value is a fixed point of the mapping. (If
$T: X \to X$ is a mapping, then $x^* \in X$ is a fixed point of $T$ if $x^* = T(x^*)$.) Thus, consistency
in the context of anticipatory route guidance can be analyzed in terms of the fixed points of the
composite mappings.
The three composite mappings are equivalent with respect to the existence of a fixed point: if one
has a fixed point then they all do, and if one does not then none does. However, each is defined
on a different domain and has different mathematical properties. This characteristic gives us
flexibility to choose the most suitable formulation for a given purpose, for example, existence
proofs or computation.
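The relationship between the three composite mappings can be illustrated with a minimal sketch. The scalar functions `S`, `G` and `D` below are purely hypothetical stand-ins (they are not the models used in this paper); they were chosen only so that the composite condition mapping is contractive and a fixed point is easy to compute. Under that assumption, the sketch finds a fixed point of the condition mapping and then checks that pushing it through `G`, and then through `D`, yields fixed points of the other two composites.

```python
from typing import Callable

# Hypothetical scalar stand-ins for the three mappings of the framework.
def S(p: float) -> float:
    """Network loading: path flow -> link condition (travel time)."""
    return 10.0 + 0.5 * p

def G(c: float) -> float:
    """Guidance: link condition forecast -> (imperfect) message."""
    return 0.8 * c + 2.0

def D(m: float) -> float:
    """Demand: message -> path flow (flow falls as the advertised time rises)."""
    return 40.0 - m

def compose(*fs: Callable[[float], float]) -> Callable[[float], float]:
    """compose(f, g, h)(x) == f(g(h(x))), i.e. the rightmost mapping acts first."""
    def composite(x: float) -> float:
        for f in reversed(fs):
            x = f(x)
        return x
    return composite

DGS = compose(D, G, S)  # path flows   -> path flows
SDG = compose(S, D, G)  # conditions   -> conditions
GSD = compose(G, S, D)  # messages     -> messages

# Find a fixed point of the condition mapping by repeated application
# (this works here only because the toy composite is contractive).
c = 0.0
for _ in range(200):
    c = SDG(c)

c_star = c
m_star = G(c_star)   # induced fixed point of the message mapping
p_star = D(m_star)   # induced fixed point of the path flow mapping

print(abs(SDG(c_star) - c_star), abs(GSD(m_star) - m_star), abs(DGS(p_star) - p_star))
```

All three printed residuals should be at the level of floating-point rounding, which is the numerical counterpart of the statement that a fixed point of one composite mapping induces fixed points of the other two.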
Of course there is no guarantee that such fixed points exist. Standard theorems can be used to
assert the existence of a fixed point when the composite mappings satisfy specific properties
(e.g., when they are continuous mappings on convex compact sets). Stronger properties are
required to assert fixed point uniqueness. It is not difficult to think of realistic guidance system
designs which violate one or both of these sets of properties, however. We do not investigate
existence or uniqueness questions in this paper. Rather, we adopt a pragmatic viewpoint: if an
algorithm converges, then we know that we have found a fixed point; otherwise, we will be
content to find an approximate fixed point $x^*$ in the sense, for example, that $\|x^* - T(x^*)\| < \varepsilon$ for
some $\varepsilon > 0$ and some norm $\|\cdot\|$. If we find multiple (exact or approximate) fixed points, we may
then apply criteria other than user-optimality to pick one of them for implementation.
Note that if guidance is perfect (it consists of complete and accurate information on current and
future link conditions, and is available anywhere and at any time on the network), then the
mapping $G$ is the identity and the three formulations reduce to two: $S \circ D: C \to C$ and $D \circ S: P \to P$.
The route guidance generation problem then becomes equivalent to a dynamic traffic assignment
problem. The guidance fixed point problem identified by Kaysi (1992) and the dynamic network
composite routing-assignment mapping considered by Kaufman et al. (1998) are of this form.
More realistically, however, if guidance is not perfect, then its specific properties (as determined
by the computation and communication technologies in place) will affect network flows and
conditions. Route guidance generation with imperfect information is thus seen to be a more
general problem than the dynamic traffic assignment problem.
3. FIXED POINT SOLUTION METHODS

Motivated by the formulation of the route guidance generation problem in terms of fixed points
of composite mappings, we consider in this section the general problem of finding a fixed point
$x^*$ of a continuous mapping $T$ defined on a subset $X$ of finite-dimensional Euclidean space $\mathbb{R}^n$:
find $x^* \in X$ such that $T(x^*) = x^*$, where $T: X \to X$ is continuous and $X \subset \mathbb{R}^n$.
We also examine a number of published route guidance generation methods in terms of the fixed
point framework presented in section 2 and the solution methods discussed in this section.
We are particularly interested in solution methods which do not involve derivative information
since, in most realistic applications, the mapping T will be evaluated by probabilistic discrete-
time simulation and, consequently, numerically-calculated derivatives are likely to be noisy.
There are two main classes of fixed point solution methods which avoid derivative calculations:
simplicial triangulation and averaging.
Simplicial triangulation (Scarf, 1973; Todd, 1976) is a systematic method of subdividing a
compact domain into simplices, defined by their vertices; a number of schemes for carrying out
this subdivision have been developed. The function whose fixed point is sought is evaluated at
each vertex of the subdivision and the result of the evaluation is used to generate a label for the
vertex. If the vertex labels of any simplex in the subdivision satisfy a property known as
complete labeling, the algorithm stops, and that simplex is then known to be located in an
$\varepsilon$-neighborhood of a fixed point. A very attractive feature of this approach is that the density of the
triangulation can be chosen to approximate a fixed point to any desired degree of accuracy $\varepsilon > 0$.
On the other hand, despite the development of restart procedures that work with sequences of
increasingly dense triangulations around the fixed point, the method is still essentially based on
exhaustive exploration of the problem domain, and so is generally considered impractical for
high-dimensional problems such as route guidance generation.
Averaging refers to a class of iterative fixed point solution methods of the following form:
$$x^{k+1} = x^k + \alpha^k \left(T(x^k) - x^k\right); \quad x^0 \in X \qquad (3.1)$$
where $\alpha^k$ is drawn from some step size search set $S \subset \mathbb{R}$. These procedures are discussed in more
detail below.
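The averaging iteration (3.1), combined with the approximate fixed point test of section 2.2, can be sketched generically as follows. This is an illustrative sketch, not the implementation used in DynaMIT: the function name `averaging`, the step size callback `step`, the use of the maximum norm, and the toy mapping `toy_map` are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

Vector = List[float]

def averaging(T: Callable[[Vector], Vector],
              x0: Vector,
              step: Callable[[int], float],
              eps: float = 1e-8,
              max_iter: int = 1000) -> Tuple[Vector, int, float]:
    """Iterate x <- x + a_k (T(x) - x) until max|T(x) - x| < eps.

    Returns the final iterate, the number of updates performed and the
    last residual measured in the maximum norm.
    """
    x = list(x0)
    for k in range(1, max_iter + 1):
        tx = T(x)
        residual = max(abs(t - xi) for t, xi in zip(tx, x))
        if residual < eps:
            return x, k - 1, residual
        a = step(k)  # step size for iteration k (assumed supplied by the caller)
        x = [xi + a * (t - xi) for xi, t in zip(x, tx)]
    tx = T(x)
    residual = max(abs(t - xi) for t, xi in zip(tx, x))
    return x, max_iter, residual

def toy_map(x: Vector) -> Vector:
    """Hypothetical contractive mapping with fixed point (2, 4)."""
    return [0.5 * x[0] + 1.0, 0.25 * x[1] + 3.0]

# A constant step size of 1 makes every update a direct application of T.
x_star, n_updates, res = averaging(toy_map, [0.0, 0.0], step=lambda k: 1.0)
print(x_star, n_updates, res)
```

Passing the step size as a function of the iteration counter keeps the same loop usable for any rule that does not depend on the iterates; rules that do depend on them would need a richer callback.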
3.1 Functional iteration
Functional iteration (or successive approximation) is a well-known algorithm for solving certain
fixed point problems. If $T: X \to X$ is a mapping then, starting with some $x^0 \in X$, functional
iteration generates the sequence $x^{k+1} = T(x^k)$. Functional iteration can be seen to be a special case
of averaging in which $\alpha^k = 1$. If $T$ is contractive on $X$ (i.e., there exists a $\beta \in [0, 1)$ such that
$\|T(x) - T(y)\| \le \beta \|x - y\|$ for all $x, y \in X$ and some norm $\|\cdot\|$) and $X$ is closed, then functional
iteration converges to the fixed point (which is necessarily unique) of $T$ in $X$. Other sufficient
conditions for convergence have also been established (Ortega and Rheinboldt, 1970).
Kaufman et al. (1991) apply functional iteration as the basis of the SAVaNT (Simulation of
Anticipatory Network Vehicle Traffic) route guidance method. In SAVaNT, unguided vehicles
are assumed to follow minimum paths based on free-flow travel times, while guided vehicles are
assumed to comply fully with time-dependent minimum time path recommendations; a single
minimum path is computed for each node-destination pair and each departure time. SAVaNT
uses a mesoscopic traffic simulator to move vehicles along the assigned paths and to determine
the resulting traffic conditions. From a set of initial conditions, time-dependent minimum paths
are computed and guided vehicles are moved along them, resulting in a new set of conditions and
new minimum paths, which become in turn the starting point for the next iteration. Because of
the use of single-path guidance and the assumption of full driver compliance, the process evolves
deterministically and can assume only a finite number of states following initialization. Thus, it
necessarily either converges to a fixed point (the minimum paths are repeated in iterations k and
k+1) or cycles (minimum paths are repeated in iterations k and k+n, n > 1). In fact Kaufman et al.
found that SAVaNT invariably cycled when using the time-dependent link traversal times from
the mesoscopic simulator as the basis for the minimum path calculation.
The SAVaNT method admits a straightforward interpretation in terms of the analysis framework
presented in section 2. The mesoscopic traffic simulator implements the network loading model.
"Guidance" consists of a minimum path recommendation based on free-flow link time for the
unguided vehicles (this could be considered background information or habitual behavior) and
based on time-dependent link times for the guided vehicles. Since each class complies fully with
the guidance, the demand (path split) model simply has all the OD flow of each class choose the
recommended path. Kaufman et al. work with the composite guidance message fixed point
formulation (although they test for convergence by comparing the time-dependent link traversal
times between iterations). Their use of the minimum paths output by one iteration as input for
the next amounts to functional iteration of the composite message mapping. The cycling
problem they encountered is evidence that the composite mapping is not contractive (in fact,
because of the discontinuity inherent in all-or-nothing loading, the composite mapping may not
have a fixed point at all).
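The cycling behavior described above can be reproduced on a very small example. The sketch below is not SAVaNT: it assumes a hypothetical static network of two parallel routes with linear travel time functions, full compliance, and all-or-nothing assignment to the route with the lower predicted time. All constants and function names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Times = Tuple[float, float]

DEMAND = 100.0                   # vehicles to be routed (hypothetical)
FREE_FLOW: Times = (10.0, 15.0)  # free-flow times of routes 1 and 2
SLOPE = 0.25                     # added time per vehicle on a route

def all_or_nothing(times: Times) -> Tuple[float, float]:
    """Send the whole demand to the route with the lower predicted time."""
    return (DEMAND, 0.0) if times[0] <= times[1] else (0.0, DEMAND)

def load(flows: Tuple[float, float]) -> Times:
    """Toy network loading: linear travel time on each route."""
    return (FREE_FLOW[0] + SLOPE * flows[0], FREE_FLOW[1] + SLOPE * flows[1])

def composite(times: Times) -> Times:
    """Conditions -> recommended route -> flows -> new conditions."""
    return load(all_or_nothing(times))

def iterate_until_repeat(times: Times, max_iter: int = 50) -> Tuple[List[Times], int]:
    """Apply functional iteration until a state recurs; return history and cycle length."""
    seen: Dict[Times, int] = {}
    history: List[Times] = []
    for k in range(max_iter):
        if times in seen:
            return history, k - seen[times]
        seen[times] = k
        history.append(times)
        times = composite(times)
    return history, 0

history, period = iterate_until_repeat(FREE_FLOW)
print(history, period)
```

In this toy case the iteration alternates between loading everything onto route 1 and everything onto route 2, so the detected cycle has length 2 and none of the visited states is a fixed point. A cycle length of 1 would instead indicate convergence.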
Pursuing this work, Wunderlich (1994) developed a heuristic method called SAVaNT-CNV
which avoids cycling by biasing the time-dependent link times computed from the simulator
outputs: a simulated vehicle link traversal time corresponding to link entry in simulation time
period t, for example, is intentionally imputed to time period t+n, $n \neq 0$. The time-dependent
minimum paths are computed from these biased times. Wunderlich varies the magnitude of the
bias so as to reduce inaccuracies while still ensuring convergence of the method. This can be
seen as a way of modifying the guidance mapping to make it contractive. Note that, in general,
the method does not compute consistent anticipatory guidance since the conditions it predicts,
and from which its guidance is generated, are likely to be systematically different from those
which would actually be experienced by drivers.
In view of the problems experienced in these applications of the functional iteration method, it
seems worthwhile to examine other forms of averaging which may converge under a broader
range of conditions.
3.2 Averaging with predetermined step sizes
We speak of averaging with predetermined step sizes when the $\alpha^k$ in the averaging iteration (3.1)
do not depend on current or prior values of the iterate $x$. The method of successive averages
(MSA), in which $\alpha^k = 1/k$, is perhaps the prototypical example of such methods.
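A small numerical sketch may help contrast the MSA with functional iteration. It assumes a hypothetical two-route network with linear travel times and a logit route split with scale `THETA`; none of these choices come from the systems discussed in this paper. With these constants the composite condition mapping is continuous but not contractive, so the two step size rules behave very differently.

```python
import math
from typing import Callable, List

DEMAND = 100.0            # vehicles (hypothetical)
FREE_FLOW = [10.0, 15.0]  # free-flow times of routes 1 and 2
SLOPE = 0.25              # added time per vehicle on a route
THETA = 1.0               # logit scale: larger means more sensitive drivers

def composite(times: List[float]) -> List[float]:
    """Link times -> logit route split -> flows -> new link times."""
    p1 = 1.0 / (1.0 + math.exp(THETA * (times[0] - times[1])))
    flows = [DEMAND * p1, DEMAND * (1.0 - p1)]
    return [ff + SLOPE * f for ff, f in zip(FREE_FLOW, flows)]

def residual(times: List[float]) -> float:
    """Maximum-norm distance between the times and their image under the mapping."""
    return max(abs(t - x) for t, x in zip(composite(times), times))

def iterate(step: Callable[[int], float], n_iter: int = 2000) -> List[float]:
    """Run the averaging iteration (3.1) with a predetermined step size rule."""
    x = list(FREE_FLOW)
    for k in range(1, n_iter + 1):
        tx = composite(x)
        a = step(k)
        x = [xi + a * (ti - xi) for xi, ti in zip(x, tx)]
    return x

x_fi = iterate(lambda k: 1.0)       # functional iteration
x_msa = iterate(lambda k: 1.0 / k)  # method of successive averages

print("functional iteration residual:", residual(x_fi))
print("MSA residual:", residual(x_msa))
```

In this toy setting functional iteration keeps swinging between two states and its residual stays large, while the MSA residual becomes very small. With $\alpha^k = 1/k$ and k starting at 1, the iterate $x^{k+1}$ is simply the arithmetic mean of $T(x^1), \ldots, T(x^k)$, which is where the method gets its name.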
The MSA was originally applied to solve the equivalent minimization formulation of the static
network stochastic user equilibrium problem with separable link cost functions. Sheffi and
Powell (1982) proved the validity of this application by invoking a form of Blum's theorem
(Blum, 1954). They interpreted the averaging iteration (3.1) as a step in a descent method and
established the convergence of the iterative process to a minimizing value without requiring the
evaluation of the objective function and despite noisy (but unbiased) determinations of the
descent direction.
More recently, Cantarella (1997) has developed very general formulations of the static network
user equilibrium problem which allow for deterministic and probabilistic demand models, elastic
demand, mode choice, and hyperpath-based route choice, as well as link cost functions with
asymmetric Jacobians. The formulations explicitly express user equilibrium conditions in terms
of fixed points of composite network loading and demand mappings. Cantarella also invoked
Blum's theorem, in a form which allowed him to establish the convergence of MSA-type link
flow and link cost averaging algorithms for finding the fixed points of these formulations.
In the context of route guidance methods, the MSA is the basis of the dynamic flow prediction
algorithm in DYNASMART. DYNASMART (Mahmassani and Jayakrishnan, 1991;
Mahmassani and Peeta, 1993; Mahmassani et al., 1994) is a mesoscopic simulation-assignment
system which models a number of user classes including, among others, drivers who respond to
en route guidance recommendations (a minimum time path based on prevailing traffic
conditions) according to a boundedly rational compliance rule, drivers who follow dynamic user-
optimal routings, and drivers who follow approximate dynamic system-optimal routings. The
MSA application is part of an algorithm which averages the path flows of each user class in order
to obtain a flow pattern which simultaneously satisfies the path choice criteria of all user classes.
The DYNASMART approach can also be interpreted in terms of the analysis framework
presented in section 2. Again, the mesoscopic traffic simulator corresponds to the network
loading model. Guidance consists of minimum path recommendations based on current
(instantaneous) travel times, time-dependent travel times, and approximate time-dependent
marginal travel times, respectively, for each of the three classes. The demand model represents
full compliance by users of the two latter classes, and probabilistic compliance by users of the
first class. Although not expressly stated as such in the DYNASMART documentation, the
approach in effect attempts to compute a fixed point of the composite path flow guidance
mapping using the MSA.
The DynaMIT system (discussed in more detail below) is another dynamic traffic model system
which addresses the route guidance problem. Its default flow prediction algorithm is also
derived from the MSA. This default algorithm, however, is based on the composite link
condition mapping: it applies the MSA to find a fixed point of time-dependent link travel times.
Although computational experience in applying MSA-type methods to dynamic route guidance
problems is still somewhat limited, the MSA is known to exhibit slow convergence when applied
to the (simpler) static network user equilibrium problem. It is thus worthwhile to see if other,
more efficient, averaging methods are available to solve the fixed point problem.
3.3 Adaptive averaging
Adaptive averaging (Magnanti and Perakis, 1997a, 1997b) has recently emerged as a promising
approach for accelerating the convergence of averaging methods for certain classes of fixed point
problems. In adaptive averaging, the weights $\alpha_k$ in (3.1) are not predetermined, but rather are
calculated in each iteration as a function of prior iterations' data. If the scheme for determining
the weights is effective, then the total computational effort to convergence (the average effort per
iteration to determine the weights times the required number of iterations) will be less than the
effort to convergence taken by a method with predetermined step sizes.
Magnanti and Perakis (1997b) suggest a scheme to determine the weights $\alpha_k$ by minimizing a
potential function. If P(x) is such a function, they propose to compute $\alpha_{k+1}$ via the following
one-dimensional auxiliary minimization problem:
$$\alpha_{k+1} = \arg\min_{\alpha \in S} P(x_k(\alpha)) \qquad (3.2)$$
where $x_k(\alpha) = x_k + \alpha\,(T(x_k) - x_k)$ and $S \subseteq \mathbb{R}$ is some step size search set. The minimization may
be carried out using either exact or inexact (i.e., Armijo-type) line search methods.
Magnanti and Perakis identify and investigate a variety of potential functions for the fixed point
problem. They prove that the use of specific potential functions with particular classes of
mappings leads to a convergent algorithm for finding the fixed point of those mappings, and
show that for these mappings the theoretical rate of convergence can be faster than that of
averaging with predetermined step sizes.
Possible potential functions are often suggested by intuitive considerations. One obvious
approach, for example, is to define the potential in terms of the distance between an iterate and
its image under the map T:
$$P(x_k(\alpha)) = \|x_k(\alpha) - T(x_k(\alpha))\|^2 \qquad (P1)$$
Clearly, if x* is a fixed point of T then P(x*) = 0, and conversely. Magnanti and Perakis show
that use of this potential with contractive mappings leads to a convergent algorithm with a better
rate of convergence than functional iteration.
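The following Python sketch illustrates the auxiliary minimization (3.2) with the distance potential (P1). It is not the authors' or Magnanti and Perakis' implementation: the line search is replaced by a crude grid search over S = [0, 1], and the names `potential_p1`, `adaptive_averaging` and the contractive toy map `T` are assumptions made for illustration only.

```python
import numpy as np

def potential_p1(T, x):
    """Potential (P1): squared distance between x and its image T(x)."""
    r = x - T(x)
    return float(r @ r)

def adaptive_averaging(T, x0, max_iter=100, tol=1e-8, n_grid=101):
    """Averaging iteration whose step size minimizes (P1) over S = [0, 1]."""
    x = np.asarray(x0, dtype=float)
    grid = np.linspace(0.0, 1.0, n_grid)        # stand-in for a line search
    for k in range(max_iter):
        Tx = T(x)
        if np.linalg.norm(Tx - x) < tol:
            return x, k
        candidates = [x + a * (Tx - x) for a in grid]
        values = [potential_p1(T, c) for c in candidates]
        x = candidates[int(np.argmin(values))]   # next iterate x_k(a_{k+1})
    return x, max_iter

# Hypothetical contractive toy map (modulus 0.9): a rotation by
# 90 degrees about (1, 1), scaled by 0.9.
def T(x):
    c = np.array([1.0, 1.0])
    R = np.array([[0.0, -1.0], [1.0, 0.0]])
    return c + 0.9 * R @ (x - c)

x_star, n_iter = adaptive_averaging(T, [3.0, 0.0])
```

On this particular toy map each adaptive step reduces the residual ||x - T(x)|| by a factor of roughly 0.67, compared with 0.9 for functional iteration. The price is the extra evaluations of T needed to evaluate the potential at each candidate step size.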
Another approach is based on the general idea that, to better explore the solution space X, iterate
$x_{k+1}$ should be forced away from iterate $x_k$ unless the process is near a fixed point. This leads
naturally to the following potential function:
$$P(x_k(\alpha)) = \|x_k(\alpha) - T(x_k(\alpha))\|^2 - \beta\,\|x_k(\alpha) - x_k(0)\|^2 \qquad (P2)$$
where $\beta > 0$ is a "repelling" factor which tends to keep $x_{k+1}$ away from $x_k(0) = x_k$.
A natural extension of the preceding idea is to force iterate $x_{k+1}$ away from both $x_k$ and $T(x_k)$
unless the process is near a fixed point. The resulting potential function is then
$$P(x_k(\alpha)) = \|x_k(\alpha) - T(x_k(\alpha))\|^2 - \beta\,\|x_k(\alpha) - x_k(0)\|\,\|x_k(\alpha) - x_k(1)\| \qquad (P3)$$
Both P2 and P3 have attractive convergence properties when applied to nonexpansive maps
(maps $T: X \to X$ such that $\|T(x) - T(y)\| \le \beta\,\|x - y\|$ for $\beta \in [0,1]$, $x, y \in X$ and some norm $\|\cdot\|$).
In general, adaptive averaging methods can apply to maps that satisfy even weaker properties
than nonexpansiveness. Magnanti and Perakis derive sufficient conditions, depending on both
the mapping and the weights, for the convergence of a general adaptive averaging process.
The weights used in (3.2) with all of the above potential functions can in principle be chosen
from the entire positive real line, i.e. $S = \mathbb{R}_+$; however, in practice the need to keep the iterates
within a convex set X will generally constrain the step size search set to the range [0, 1].
As mentioned above, the auxiliary minimization problem (3.2) can be solved using either exact
or inexact line search methods. The justification for use of inexact methods in this case is very
similar to the justification when such methods are used in optimization problems: since at each
iteration we are only dealing with a subproblem that is a local approximation to the actual
problem, there is little point in devoting significant computational effort to compute a precise
solution to the subproblem.
Magnanti and Perakis propose the following Armijo-type line search rule for solving the
auxiliary minimization problems: for constants $D > 0$ and $0 < b < 1$, find the smallest integer
power n (different for each iteration) such that the step length $\alpha = b^n$ satisfies the condition
$$P(x_k(\alpha)) - P(x_k(0)) \le -D\, b^n\, \|x_k - T(x_k)\|^2 \qquad (3.3)$$

They show that, for the combinations of problem and potential functions considered above,
application of this rule to the auxiliary minimization problem results in a convergent algorithm.
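The Armijo-type rule (3.3) can be sketched in a few lines of Python. The sketch below restricts the search to powers n >= 0, so that the step length stays in (0, 1], and caps the number of trial powers; both restrictions, as well as the names `armijo_step`, `p1` and the toy map `T`, are assumptions made for illustration.

```python
import numpy as np

def armijo_step(T, P, x, D=0.1, b=0.5, max_power=30):
    """Return a = b**n for the smallest integer n >= 0 satisfying (3.3)."""
    Tx = T(x)
    decrease = D * float((x - Tx) @ (x - Tx))    # D * ||x_k - T(x_k)||^2
    p0 = P(x)                                     # P(x_k(0)) = P(x_k)
    for n in range(max_power + 1):
        a = b ** n
        if P(x + a * (Tx - x)) - p0 <= -a * decrease:
            return a                              # sufficient decrease achieved
    return 0.0   # no acceptable step found; the caller may stop or fall back

# Hypothetical contractive toy map and the distance potential (P1).
def T(x):
    c = np.array([1.0, 1.0])
    R = np.array([[0.0, -1.0], [1.0, 0.0]])
    return c + 0.9 * R @ (x - c)

def p1(x):
    r = x - T(x)
    return float(r @ r)

step = armijo_step(T, p1, np.array([3.0, 0.0]), D=0.5, b=0.5)
```

With these toy values the full step a = 1 does not reduce the potential enough, and the rule accepts a = 0.5 on the second trial. Only a handful of potential evaluations are needed per iteration, which is the motivation for inexact line searches given above.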
3.4 Conclusions
While the MSA and other averaging methods with predetermined step sizes are robust enough to
successfully solve a relatively large class of problems, their empirically-observed convergence
rates leave much to be desired. When applied to static network problems, the MSA has often
been observed to exhibit a pronounced tailing effect, in which convergence to equilibrium
proceeds very slowly. Although there is less experience in applying these methods to dynamic
problems, limited results indicate that the methods show similar behavior with these problems,
and this is not surprising because they take no account of problem structure or properties.
On the other hand, adaptive averaging methods based on potential minimization have so far only
been proven to converge for maps with somewhat restrictive properties such as
nonexpansiveness. For such maps the theoretical convergence rates of adaptive averaging
methods appear to be very attractive, although against this must be set the additional effort
needed to compute the step size. In fact, computational experience with adaptive averaging
methods is still largely lacking.
4. PRACTICAL ISSUES IN ROUTE GUIDANCE GENERATION

We suppose now that there is available a software system capable of applying methods such as
those discussed in section 3 to compute fixed points of the composite mappings identified in
section 2, and we consider how such a system might be used for practical guidance generation in
a traffic information center.
A reasonable approach would be to adopt a rolling horizon framework. According to this
framework, traffic condition and flow data are collected in real time from sensors distributed
over the network. In each roll stage (every 15 minutes, for example), the current network state is
estimated from the accumulated data. Using the estimated current state as a starting point,
network forecasts and consistent guidance are determined together over a guidance horizon (say
one hour into the future).
As discussed before, guidance consists of a set of time-dependent messages to be disseminated to
drivers via the available communications system. The time-discretization used for these
messages is called the guidance update interval (one minute, for example); at each guidance
update time, the corresponding set of messages in the block is disseminated. Guidance messages
can be considered to be fixed over the duration of an update interval; alternatively, with
appropriate technology (in-vehicle computers) it might be reasonable to interpolate messages
between the guidance update times.
After consistent guidance is computed for the entire guidance horizon, only the set of messages
for times up to the guidance transmission in the next roll stage need be disseminated. (In
general, however, the guidance computation needs to consider a horizon beyond the next stage
since traffic conditions predicted to occur at later times may affect the guidance provided to
currently-traveling vehicles whose trip durations are longer than one stage.) Then data collection
and guidance processing begin anew after moving forward in time by one roll stage.
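The timing structure of the rolling horizon framework can be made concrete with a short Python sketch. It only enumerates, for each roll stage, the guidance update times covered by the computation and the subset actually disseminated before the next recomputation; the function name `rolling_horizon_schedule` and its arguments are hypothetical, and the numerical values are the examples quoted in the text (15-minute roll stage, one-hour horizon, one-minute update interval).

```python
def rolling_horizon_schedule(sim_end, roll_stage, horizon, update_interval):
    """List, per roll stage, the guidance update times computed and disseminated.

    All arguments are in seconds. Guidance is computed for update times in
    [start, start + horizon), but only the messages with update times in
    [start, start + roll_stage) are sent out before the next recomputation.
    """
    schedule = []
    start = 0
    while start < sim_end:
        computed = list(range(start, start + horizon, update_interval))
        disseminated = [t for t in computed if t < start + roll_stage]
        schedule.append((start, computed, disseminated))
        start += roll_stage           # move forward in time by one roll stage
    return schedule

plan = rolling_horizon_schedule(sim_end=3600, roll_stage=900,
                                horizon=3600, update_interval=60)
```

In this example each roll stage computes 60 sets of messages but disseminates only the first 15; the remaining ones serve to capture the effect of later conditions on vehicles already traveling.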
Implementation of this rolling horizon framework requires decisions about the different
parameters identified above. By varying these parameters, it is possible to influence the
computation speed and the accuracy of the computed guidance solution. This is an important
general relationship that is central to the design of an operational system with real-time response
capability: by taking more computer time, a more consistent guidance solution can in principle
be computed; on the other hand, a long computation time reduces the timeliness and relevance of
the guidance once it is determined and disseminated.
Following are some specific rolling horizon implementation parameters and the speed/quality
tradeoffs which they affect:
• the length of the roll stage: a longer stage requires less frequent guidance recomputation and
allows more time for guidance generation and dissemination; on the other hand, this involves
forecasts over a longer time period, which are a less accurate basis for guidance;
• the length of the guidance update interval (or equivalently, the number of guidance changes
in a roll stage): a longer interval between guidance changes requires less computation but
makes for less precise guidance (because messages are less able to track predicted changes in
conditions) and may lead to situations in which significant numbers of drivers all react in
similar ways, shifting the location and possibly exacerbating the level of congestion;
• the tolerance used (or the number of iterations allowed) for the determination of consistent
guidance: a less stringent tolerance or a smaller number of iterations speeds termination of the
algorithm but reduces the accuracy of the computed guidance.
It is clear that the development of efficient guidance generation algorithms is a critical step
towards the implementation of operational guidance systems. On the other hand, until traffic
centers routinely have such powerful computing capabilities that the effort required to generate
consistent guidance becomes negligible, careful investigation of the speed/accuracy tradeoffs
identified above will remain important. This is the task we turn to next.
5. EXPERIMENTAL SETUP
We have developed a software system called DynaMIT (Dynamic traffic assignment for the
Management of Information to Travelers) (Ben-Akiva et al., 1997a) which is based on the
framework described in the preceding sections and is ultimately intended to generate guidance in
an operational traffic information center. DynaMIT features capabilities for network state
estimation and OD prediction from sensor data, for network condition predictions, and for
anticipatory guidance generation. To provide these capabilities, DynaMIT includes a
mesoscopic supply (traffic) simulator and a demand simulator incorporating a detailed driver
behavior model; these simulators are designed to operate significantly faster than real-time so
that they can be embedded in iterative algorithms.
In the work described here, we use DynaMIT to explore the speed/quality tradeoffs which were
discussed in section 4. We use a much more detailed microscopic traffic simulator called
MITSIM (Microscopic Traffic SIMulator) (Yang and Koutsopoulos, 1996; Yang, 1997; Ben-
Akiva et al., 1997b) to serve as the "ground truth" against which DynaMIT's outputs can be
compared. (Because of the amount of detail and computation involved in carrying out its
simulations, MITSIM itself is not suitable for use in real-time iterative algorithms.) MITSIM
carries out a very detailed probabilistic simulation of vehicle movements over the network; this
simulation accounts for a number of aspects of driver behavior, including response to guidance.
MITSIM also generates the sensor readings that would result from these movements; the
readings can be randomly perturbed, if desired, to simulate the effects of sensor inaccuracies.
Our experimental setup strictly controls the interconnections between DynaMIT and MITSIM: in
principle, the only data which flow from MITSIM to DynaMIT are the simulated sensor
readings, and the only data which flow from DynaMIT to MITSIM are the computed guidance
messages. However, MITSIM's detailed simulation outputs are stored and can be compared in a
post-evaluation with DynaMIT's estimates and forecasts.
Our interest here is in DynaMIT's guidance generation component only. Consequently, we have
simplified matters by providing DynaMIT, at the beginning of each roll stage, with full
information on MITSIM's current simulation state (location, status and destination of every
vehicle on the network) and on future OD flows. This perfect information substitutes for the
outputs of DynaMIT's state estimation and OD prediction operations, which normally would be
carried out by processing data from sensor readings together with historical data on traffic and
OD patterns to give more or less accurate results. Furthermore, we have set equal the driver
behavior model parameters in the two systems, whereas more generally the two could be
different to reflect behavioral modeling errors.
The version of DynaMIT used for these investigations accepts guidance information only in the
form of predicted time-dependent link traversal times. The current version of the system
provides enhanced capabilities in the guidance representation as well as in the demand and
supply simulators; however, details of this version will not be covered here. Indeed, our hope is
to be able to draw general conclusions about guidance generation that are relatively independent
of details of the software that is used for the computations.
In a typical computational experiment, we initialize both MITSIM and DynaMIT with input data
describing a given traffic situation (MITSIM's data requirements are much more extensive than
those of DynaMIT, but DynaMIT can extract most of the data it requires from MITSIM's input
files). At the beginning of a given roll stage, we transmit to DynaMIT an accurate "snapshot" of
MITSIM's current traffic pattern and the upcoming OD demands, as explained above, then start
MITSIM simulating detailed "ground truth" traffic conditions and start DynaMIT generating
guidance; the two are separate processes that can proceed in parallel.
Given the snapshot information, DynaMIT carries out a mesoscopic simulation of network
conditions over the entire guidance horizon. To do this, it must make an initial assumption about
the guidance (predicted time-dependent link traversal times) that will be provided during that
time. Based on this assumption, the simulation is run, and the resulting link traversal times are
obtained as an output. The input guidance traversal times are compared with the output traversal
times to ascertain the consistency of the guidance; this is done by computing the (quadratic)
norm of the difference between the two sets of times.
If the guidance turns out not to be consistent or is otherwise unsatisfactory, it is modified for use
in the next iteration. In the experiments presented here, the modification consisted of an MSA-
type averaging of the assumed guidance and the traversal times resulting from the simulation.
DynaMIT then repeats this loop of forecasting link conditions and modifying guidance until
either satisfactory consistency is obtained or the allotted number of iterations has been
completed. In the latter case, we retain the traversal times from the iteration with the least
inconsistent guidance. Guidance covering the roll stage is then communicated to MITSIM. To
account for delays due to guidance computation and communication, the reception of the
guidance information by MITSIM can be scheduled to take place a specified amount of
(simulated) time after the beginning of the roll stage; the MITSIM simulation will block itself, if
required, to ensure this synchronization.
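The guidance generation loop described above can be summarized by the following Python sketch. It is not DynaMIT code: `simulate` is a hypothetical stand-in for the mesoscopic simulation, mapping assumed guidance traversal times to the resulting traversal times, and the choice to return the guidance input of the least inconsistent iteration is one possible reading of the text.

```python
import numpy as np

def generate_guidance(simulate, g0, max_iter=3, tol=1e-3):
    """MSA-type search for consistent time-dependent link traversal times."""
    g = np.asarray(g0, dtype=float)
    best_g, best_gap = g, np.inf
    for k in range(1, max_iter + 1):
        out = simulate(g)                   # traversal times resulting from g
        gap = np.linalg.norm(g - out)       # (quadratic) norm of the difference
        if gap < best_gap:
            best_g, best_gap = g, gap       # least inconsistent guidance so far
        if gap < tol:
            break                           # satisfactory consistency reached
        g = g + (out - g) / k               # MSA-type averaging, step 1/k
    return best_g, best_gap

# Hypothetical two-link "simulator": each output time depends on the
# guidance given for the other link.
def simulate(g):
    return 5.0 + 0.5 * g[::-1]

best_g, best_gap = generate_guidance(simulate, [8.0, 14.0], max_iter=3)
```

With the three iterations allowed here the loop stops before reaching the tolerance, so the guidance from the third iteration, which has the smallest inconsistency, is retained.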
MITSIM then begins the simulation of the next roll stage; its vehicles, based on their information
access characteristics, use the guidance information provided by DynaMIT to decide their routes.
This process is repeated until the simulation is complete. As MITSIM simulates traffic
conditions, it also collects information which allows a post-evaluation of the effectiveness of the
guidance and of the accuracy of DynaMIT's condition forecasts.
6. EXPERIMENTS AND RESULTS
6.1 Problem data description
The simulation experiments described here used problem data describing the Boston Central
Artery network on a typical weekday from 7:00 to 9:00 AM. There are five origins (designated
by the letters A through E) and two destinations, for a total of ten OD pairs. Drivers can use
either of two tunnels to traverse the network: the Third Harbor tunnel to the south and the
Sumner/Callahan tunnel to the north. Each tunnel has two lanes in each direction. Drivers who
depart from origin B are restricted by the topology of the network to use the Third Harbor tunnel.
Drivers who depart from origin E are similarly restricted to use the Sumner/Callahan tunnel.
Drivers from the other three origins—roughly 68% of the total OD flows—can choose between
the two tunnels for their trip.
We assume that drivers are aware of historical time-dependent travel times for all available paths.
Driver route choice is represented by a logit-form model which uses these historical times in the
absence of guidance information, but takes account of guidance path time predictions when these
are available. Based on the number of vehicles assumed to be equipped to receive guidance
information and the distribution of OD flows mentioned above, approximately one-third of the
total number of vehicles can potentially change routes in response to guidance information.
The problem data simulates an incident in the Third Harbor tunnel that completely blocks one of
the lanes and restricts movement in the other lane for fifteen minutes from 7:15 to 7:30 AM.
Given the network structure and demand levels, this could be characterized as a major incident.
Drivers who do not receive guidance information are assumed not to know that the incident has
occurred, and to follow their habitual routes based on historical travel times.
6.2 Experiments and results
We report here on a set of experiments that were run to investigate the effects of the guidance
generation parameters discussed in section 4. Each experiment reported here involved a
comparison of aggregate simulation results (average travel time and speed, and change in total
vehicle-hours) from the computed guidance, against benchmark results from a simulation in
which no guidance was provided. To establish the no-guidance results, three MITSIM runs were
carried out, and their aggregate results were averaged. None of the results differed by more than
4% from the computed mean, thus we subsequently only performed one run for each guidance
parameter set.
The parameters that were investigated include:

• the length of the guidance recomputation interval; baseline value: 5 minutes;
• the length of the rolling horizon; baseline value: 45 minutes;
• the number of guidance generation iterations carried out; baseline value: 3 iterations;
• the length of the guidance update interval; baseline value: 400 seconds.
(See section 4 for a discussion of these parameters.) Most runs varied the value of a single
parameter, while leaving the others set at baseline values; the last set of runs, however,
investigated the effect of a shorter guidance update interval together with variations in other
guidance parameters.
Table 1 below presents the aggregate results obtained from these experiments. The following
conventions and abbreviations are used in the table to describe each run:
• the baseline settings are indicated by an asterisk (*);
• the benchmark is the MITSIM run without guidance;
• GR indicates the guidance recomputation interval;
• iterations indicates the number of guidance algorithm iterations per recomputation;
• RL indicates the rolling horizon length, i.e., the length of the forecast period;
• GU indicates the length of the guidance update interval.
6.3 Comments on simulation results
Guidance recomputation interval

The guidance recomputation interval is the length of time between successive operations of
combined state estimation, prediction and guidance; in other words, it is the stage length of the
rolling horizon procedure. Of the parameters tested here, the recomputation interval seemed to
have the greatest influence on the aggregate results, and did so in a systematic fashion:
shortening the recomputation interval increased the benefits of the guidance, while lengthening
the interval reduced their magnitude.
Number of guidance iterations

The number of guidance iterations refers to the number of iterations for which the averaging
algorithm is allowed to run. This relates in a general way to the accuracy of the fixed point
computation. However, it is known from applications of the MSA to static models that accuracy
does not generally increase monotonically with the number of iterations, particularly when only a
few iterations are performed. This characteristic appears to hold as well for application of the
MSA to the dynamic model considered here.
Rolling horizon length

The rolling horizon length is the period of time over which predictions are made for the purpose
of guidance generation. The findings from changing the rolling horizon length were quite
interesting. Of the four scenarios that were investigated, the best results were obtained when
using a horizon of one-half hour. When this value was either reduced or increased, the overall
benefits of the guidance were smaller. It may be that the deteriorated accuracy of longer-term
forecasts adversely affects the quality even of short-term guidance, while short time horizons do
not sufficiently account for the effects of future conditions on near-term guidance.
Length of the guidance update interval

Using shorter update intervals did in fact increase the benefits of guidance by an appreciable
amount. This result was expected, because a short update interval allows guidance
instructions to be updated rapidly for all drivers. Even if the guidance information is somewhat
more aggregate, the ability to update it rapidly is sufficient to compensate for its coarseness.
Table 1
Aggregate Effects of Variations in Guidance Parameters

                             Travel time (minutes)   Travel speed (mph)    Benefit
Run                          Mean      Std.dev.      Mean      Std.dev.    (veh-hrs)
No guidance                  12.7      6.0           23.3      10.3        -
2 min GR                      9.9      5.1           29.1       8.9        363.3
3 min GR                     10.4      4.9           27.6       9.0        301.7
4 min GR                     11.3      6.3           26.8       9.9        182.0
5 min GR*                    11.5      6.3           26.5      10.0        162.8
10 min GR                    11.9      7.0           26.1      10.1        111.0
15 min GR                    12.3      7.3           24.9       9.5         57.9
1 iteration                  12.4      5.9           23.8      10.3         42.6
2 iterations                 11.3      6.3           26.8       9.8        189.7
3 iterations*                11.5      6.3           26.5      10.0        162.8
6 iterations                 11.2      5.9           27.2      10.2        201.6
5 min RL                     12.2      5.6           24.0      10.0         72.5
15 min RL                    11.6      5.9           25.1       9.4        143.8
30 min RL                    10.9      5.7           27.1       9.6        236.5
45 min RL*                   11.5      6.3           26.5      10.0        162.8
400 sec GU*                  11.5      6.3           26.5      10.0        162.8
100 sec GU                   10.8      6.1           27.8       9.7        250.8
100 sec GU - 2 min GR        10.0      4.6           28.0       8.5        349.1
100 sec GU - 6 iterations    10.3      5.7           28.7       9.8        307.3
100 sec GU - 60 min RL       10.3      5.0           27.9       9.0        310.3
7. CONCLUSIONS AND DIRECTIONS FOR FURTHER WORK

In this paper we have introduced the route guidance generation problem; proposed an analysis
framework for the problem, leading to three alternative fixed point formulations; identified a
broad class of algorithms for solving the fixed point problems; described a software simulation
environment for evaluating these algorithms and for investigating speed vs. accuracy tradeoffs
that are important in a real-time operational setting; and presented results from simulation
experiments involving these tradeoffs.
In general, the experiments demonstrate the validity and practicality of consistent anticipatory
guidance generation based on a fixed point approach, and show the DynaMIT/MITSIM software
system to be a powerful environment for the investigation of guidance generation issues.
Specific results from the guidance tradeoff analyses were broadly as expected with regard to the
guidance recomputation interval, the number of guidance iterations and the length of the
guidance update interval; experiments involving the rolling horizon length exhibited interesting
results that merit further investigation.
Considerable work remains to be done in developing and refining algorithms for computing
guidance fixed points and, as we have argued, averaging methods are likely to be a useful
approach for this purpose. A superior averaging method would combine the robustness,
simplicity and generality of the MSA with the favorable convergence rates of adaptive averaging.
We are currently exploring two possibilities in this sense: first, by identifying properties of the
route guidance composite mappings that satisfy the weakest requirements posed by Magnanti and
Perakis; and second, by deriving potential functions which generate step sizes that automatically
satisfy the summability requirements on the weights in Blum's theorem (i.e., $\sum_k \alpha_k = \infty$, $\sum_k \alpha_k^2 < \infty$).
We are also currently conducting simulation tests in which, as a purely heuristic device,
potential-based adaptive averaging methods are applied to route guidance generation problems in
an attempt to accelerate their convergence.
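As a brief worked check of the summability requirements mentioned above, the MSA weights 1/k satisfy both conditions, since the harmonic series diverges while the series of inverse squares converges:

```latex
\[
\sum_{k=1}^{\infty} \frac{1}{k} = \infty,
\qquad
\sum_{k=1}^{\infty} \frac{1}{k^{2}} = \frac{\pi^{2}}{6} < \infty .
\]
```

By contrast, a constant weight violates the second condition and weights of the form 1/k^2 violate the first. Step sizes produced by an adaptive scheme would therefore have to decay, but not too quickly, if they were to meet both requirements.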
These and similar efforts are expected to lead to a better understanding of the route guidance
generation problem, and to be of practical value in implementing and operating advanced traffic
information systems.
References
Ben-Akiva, M.E., A. de Palma and I. Kaysi (1996). The Impact of Predictive Information on
Guidance Efficiency: An Analytical Approach. In: Advanced Methods in Transportation
Analysis (L. Bianco and P. Toth, eds.), pp. 413-432, Springer-Verlag, Berlin.
Ben-Akiva, M.E., M. Bierlaire, J. Bottom, H.N. Koutsopoulos and R. Mishalani (1997a).
Development of a Route Guidance Generation System for Real-Time Application.
Presented at the IFAC Conference, Chania, Greece.
Ben-Akiva, M.E., H.N. Koutsopoulos, R. Mishalani and Q. Yang (1997b). Simulation
Laboratory for Evaluating Dynamic Traffic Management Systems. ASCE Journal of
Transportation Engineering 123(4), pp. 283-289.
Blum, J.R. (1954). Multidimensional Stochastic Approximation Methods. Annals of
Mathematical Statistics 25, pp. 737-744.
Bottom, J., M.E. Ben-Akiva, M. Bierlaire and I. Chabini (1998). Generation of Consistent
Anticipatory Route Guidance. Presented at the Tristan III Conference, San Juan, Puerto
Rico.
Cantarella, G.E. (1997). A General Fixed-Point Approach to Multimode Multi-User Equilibrium
Assignment with Elastic Demand. Transportation Science 31(2), pp. 107-128.
Engelson, L. (1997). Self-fulfilling and Recursive Forecasts—an Analytical Perspective for
Driver Information Systems. Presented at the 8th IATBR Meeting, Austin, Texas.
Hall, R. W. (1996). Route Choice and Advanced Traveler Information Systems on a Capacitated
and Dynamic Network. Transportation Research C 4(5), pp. 289-306.
Kaufman, D.E., R.L. Smith and K.E. Wunderlich (1991). An Iterative Routing/Assignment
Method for Anticipatory Real-Time Route Guidance. Proceedings of the Conference on
Vehicle Navigation and Information Systems.
Kaufman, D.E., R.L. Smith and K.E. Wunderlich (1998). User-Equilibrium Properties of Fixed
Points in Dynamic Traffic Assignment. Transportation Research C 6(1), pp. 1-16.
Kaysi, I.A. (1992). Framework and Models for the Provision of Real-Time Driver Information.
Ph.D. thesis, Massachusetts Institute of Technology.
Magnanti, T.L. and G. Perakis (1997a). Averaging Schemes for Variational Inequalities and
Systems of Equations. Mathematics of Operations Research 22(3), pp. 568-587.
Magnanti, T.L. and G. Perakis (1997b). Solving Variational Inequality and Fixed Point Problems
by Averaging and Optimizing Potentials. Operations Research Center, Massachusetts
Institute of Technology, Report OR 324-97.
Mahmassani, H.S. and R. Jayakrishnan (1991). System Performance and User Response under
Real-Time Information in a Congested Traffic Corridor. Transportation Research A
25(5).
Mahmassani, H.S. and S. Peeta (1993). Network Performance under System Optimal and User
Equilibrium Dynamic Assignments: Implications for Advanced Traveler Information
Systems. Transportation Research Record 1408.
Mahmassani, H.S., T.-Y. Hu, S. Peeta and A. Ziliaskopoulos (1994). Development and Testing of
Dynamic Traffic Assignment and Simulation Procedures for ATIS/ATMS Applications.
Center for Transportation Research, University of Texas at Austin, Technical Report
DTFH61-90-R-00074-FG, revised version.
Ortega, J.M. and W.C. Rheinboldt (1970). Iterative Solution of Nonlinear Equations in Several
Variables. Academic Press, New York.
Scarf, H. E. (1973). The Computation of Economic Equilibria. Yale University Press, New
Haven.
Sheffi, Y. and W.B. Powell (1982). An Algorithm for the Equilibrium Assignment Problem with
Random Link Times. Networks 12, pp. 191-207.
Todd, M. J. (1976). The Computation of Fixed Points and Applications. Springer-Verlag, Berlin.
Wunderlich, K.E. (1994). Link Time Prediction for Dynamic Route Guidance in Vehicular
Traffic Networks. Ph.D. Thesis, University of Michigan.
Yang, Q. and H.N. Koutsopoulos (1996). A Microscopic Traffic Simulator for Evaluation of
Dynamic Traffic Management Systems. Transportation Research C 4(3), pp. 113-129.
Yang, Q. (1997). A Simulation Laboratory for the Evaluation of Dynamic Traffic Management
Systems, Ph.D. Thesis, Massachusetts Institute of Technology.
A NEW FEED-BACK PROCESS BY MEANS OF DYNAMIC REFERENCE VALUES
IN REROUTING CONTROL
Andreas Poschinger, Michael Cremer, Hartmut Keller
Fachgebiet Verkehrstechnik und Verkehrsplanung
Technische Universität München
Arcisstraße 21, 80333 München, Germany
email: andi@fgv.vpl.bauwesen.tu-muenchen.de
ABSTRACT
This paper presents an approach that extends the classical two-component control scheme
consisting of a traffic model and a control algorithm. A special feed-back controller which uses
dynamic reference values is added to this scheme. Within the framework of a motorway control
system it is shown that this approach allows for a modular and subsidiary system architecture.
In an example of rerouting control this extended scheme needs neither origin-destination (OD)
estimations nor traffic forecasts with respect to uncertainties of models and parameters, in
particular the compliance rate.
Practical requirements for the algorithms are derived from a test site. The efficiency of
this approach is assessed by simulation results utilising a macroscopic traffic flow model and
OD estimation from a real motorway topology. Data from loop detectors of the test site are used
to bring the simulated scenario close to reality. It is shown that the proposed algorithms are
insensitive to estimation errors of the traffic models to be expected when applying the models to
reality.
BACKGROUND AND MOTIVATION
Initial investigations into motorway control were undertaken in Germany in the 1970s. Since
then the aim of control has been to optimise objective functions, such as travel time, computed
by non-dynamic traffic models. Such a control system consists of a traffic model and a control
model in the sense of a control algorithm and is therefore regarded as a two-component control
system. Further investigations addressed the development of system technology for applying
control strategies to real facilities and of new dynamic traffic models for control use
(Cremer et al., 1980).
Cremer used these models to optimise the objective functions for a few data sets, but it
cannot be ruled out that the solutions found are not optimal for other data sets. This
problem was solved by using online optimisation based on optimal control theory (Cremer
et al., 1993; Meßmer, 1994). All of these optimisation approaches are directly based on
models which can be inaccurate due to an excessive level of abstraction or missing
information. Some approaches attempt to address these problems by developing more accurate
traffic models (e.g. Kates, 1994); other approaches combine models of different levels, e.g.
a model for traffic flow with a model for OD estimation (Cremer and Keller, 1987; Sachse, 1995).
Online optimisation of the performance functions by means of a traffic model has the
disadvantage that all states of the models have to be computed for the time horizon of the
optimisation. The horizon is typically chosen such that the vehicles influenced by control
leave the considered motorway network within it, so it can be computed as the sum of the
longest travel time in the network and the time during which the optimised control cycle is
active. Long prognosis intervals increase the inaccuracy of the models used. Furthermore,
parameters like the compliance rate are needed, which are hard to forecast (Tsavachidis et al., 1998).
Another approach, which extends the traffic model and control algorithm by a closed control
loop, is to avoid online optimisation of objective functions. Such investigations for rerouting
control can first be found in Papageorgiou (1990). In that work the control algorithm itself
consists of a feed-back controller with a static reference value.
An important and new approach is to introduce dynamic reference values (Poschinger,
1996; Poschinger et al., 1997); with this extension it is possible to consider time-varying
demands and traffic situations. When the dynamic reference values were used in the control
algorithm, only secondary effects like congestion lengths or travel times were considered, but
no primary effects like the measured demand on the main route.
This paper uses the approach of dynamic reference values to introduce a new three-component
control scheme. Within this control scheme the primary measured effects of control strategies
can also be taken into consideration simply by the use of control errors.
Dynamic Reference Values in Rerouting Control 603
METHODOLOGY
Three-Component Model
Figure 1 shows the two-component approach using a traffic model and a control algorithm. The
names of variables are taken from control theory. Generally the plant is a traffic network.
Figure 1: Two-component approach (blocks: observer, control algorithm, plant; signals: y(t), x(t), u(t), z(t))
The system equation of a non-linear plant is
$$\dot{x}(t) = f\bigl(x(t), u(t), z(t)\bigr) \qquad (1)$$
x(t) : system state
u(t) : control variables
z(t) : disturbances
The measurement equation of the system is
$$y(t) = c\bigl(x(t), w(t)\bigr) \qquad (2)$$
y(t) : measured output
w(t) : measurement noise
The control variables u(t) are directly calculated by the control algorithm. The control algorithm
primarily consists of an optimisation approach; sometimes, especially in continuously operating
control systems, simple control laws which yield the control variable are also used. The system
states are estimated by the traffic model, herein named an observer; this leads to the estimated
system variables x(t). In general the time argument may also include times in the future; the
observer therefore provides either an estimate of the current traffic situation or, in
addition, a prognosis.
Figure 2 shows the extended approach. The control algorithm yields one or more dynamic
reference values s(t) for the controlled variables in the additional feed-back controller.
The controlled variables should be measurable or at least observable with good quality; the
estimation error should be below 5%. The control algorithm may consist of simple heuristics but
also of optimisation approaches. It is also possible to use classical feed-back control together
with dynamic reference values, which is shown by the example of rerouting control. This leads
to a cascade structure of the control system. The feed-back controller may be based on classical
control theory, but also on more advanced approaches like neural networks.
Figure 2: Three-component approach (blocks: observer, control algorithm, feed-back control, plant; signals: y(t), x(t), s(t), u(t), z(t))
Basic Control Scheme for Motorways
The control scheme shown in Fig. 3 represents an approach to motorway network control using
e.g. rerouting, speed control and ramp metering, implemented by use of dynamic reference values
not only in the extended component model but also as part of the control algorithm. The main
aim of control is to provide the capacity needed for an incoming traffic flow varying over time
and space. Therefore the demand for a motorway network yields dynamic reference values.
Using cascade control structures an algorithm can calculate further dynamic reference values as
inputs for subalgorithms.
Figure 3: Basic control strategy (blocks: observer, rerouting algorithm, flow optimisation with speed-density relation, rerouting controller, speed controller, density controller, motorway network; signals: q_boundary, q_cap, q_nom, v_nom, k_nom, q_mes, v_mes, q_obs, v_obs, k_obs)
Starting with collecting traffic data from local detectors, the first step is
to reconstruct the traffic state in the form of complete traffic load profiles of the motorway
network considered (observer). With this information it is possible to assign traffic flows
entering the network to the possible routes by a rerouting algorithm, leading to reference
values for the traffic flows to be rerouted, which are controlled by a rerouting controller.
This leads to a minimum of required capacities on the links. The capacities lead to reference
values for speed and density (flow optimisation) which are controlled by variable speed limits
and by ramp metering. The observer could be subdivided to observe single routes. Please note
that this approach is a hierarchical architecture which can be implemented as subsidiary,
parallel and locally distributed control systems.
Observer
In this paper only the rerouting algorithm (for computing the dynamic reference value) and the
rerouting feed-back control (controlling the reference value) are discussed in detail. The observer
consists of a macroscopic traffic flow model with Kalman filter feed-back. In contrast to Cremer
and Schütt (1990), the variable qj yielding incident detection is not estimated. Instead, the
Kalman filter approach is extended to additionally estimate turning rates.
Rerouting Algorithm
Areas of Working Points. Fig. 4 shows a pair of two alternative routes, which can be regarded
as a highly abstract model for one direction of the test field Munich North.
Figure 4: Schematic diagram of considered network
Under the assumption of a capacity $q_{cap1}$ of route 1 and a capacity $q_{cap2}$ of route 2
one can find different areas of operating points (Fig. 5):
• No rerouting is required if the demand on both routes is less than their capacity.
• It is possible to reroute from route 1 to route 2 if the demand for route 1 is greater
than $q_{cap1}$ and the demand for route 2 is less than $q_{cap2}$.
• Vice versa, it is possible to reroute from route 2 to route 1.
• A special algorithm is needed if there is congestion on one of the two routes or if there is
overcapacity demand on both routes; in such cases a congestion balance algorithm is used
to produce reference values.
Figure 5: Areas of operating points (axes: flow route 1 with q_cap1, flow route 2 with q_cap2)
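The four areas of operating points listed above can be written down as a small decision function. The following is only a sketch under stated assumptions: the function name, the argument names and the returned labels are illustrative and not taken from the paper, and the total-demand check is borrowed from condition (3), which the paper states for the first three cases.

```python
def operating_area(q_d1, q_d2, q_cap1, q_cap2, congestion_observed=False):
    """Classify the operating point of a pair of alternative routes.

    q_d1, q_d2     : demand on route 1 and route 2 (veh/h)
    q_cap1, q_cap2 : capacity of route 1 and route 2 (veh/h)
    congestion_observed : True if congestion is already present on a route
    All names and labels are hypothetical; only the case logic follows the text.
    """
    # Congestion on a route, or total demand not below total capacity:
    # the congestion balance algorithm has to produce the reference values.
    if congestion_observed or q_d1 + q_d2 >= q_cap1 + q_cap2:
        return "congestion_balance"
    if q_d1 > q_cap1:
        return "reroute_1_to_2"
    if q_d2 > q_cap2:
        return "reroute_2_to_1"
    return "no_rerouting"


print(operating_area(4500, 2000, 4000, 4000))
```

With the illustrative capacities of 4000 veh/h per route, a demand of 4500 veh/h on route 1 and 2000 veh/h on route 2 falls into the area in which rerouting from route 1 to route 2 is possible.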
Control Objectives. The main objective is to ensure that, whenever rerouting is recommended,
the travel time on the alternative route is shorter than the travel time on the main route;
in other words, the desired optimum is the user optimum. Using the system optimum would
most probably result in low compliance rates. Therefore the user optimum, which results in
high compliance rates, also yields a "better optimum" for the system than using "the system
optimum" itself. In other words, "the system optimum" described in many publications, e.g. as
a function consisting of the summed travel time of all vehicles, is not the real system optimum
in routing control. The real system optimum is not known exactly but can be approximated by
the user optimum. A trivial example for this train of thought is an assumed compliance rate
cr = 0 in the case of "the system optimum".
In the case of a considerable length difference between main route and alternative route
only the area of operating points with no rerouting and the area of the congestion balance
algorithm can be used; otherwise, in the areas of no network overload, the user optimum is
not fulfilled.
Calculating Reference Values in Case of No Network Overload. For the first three cases the
following equations hold:
$$q_{dr1} + q_{dr2} < q_{cr1} + q_{cr2} \qquad (3)$$
$q_{dri}$ : flow entering route i
$q_{cri}$ : capacity of route i
In these cases the following reference values for the route input flows are chosen:
$$q_{nri} = \begin{cases}
q_{dri} & \text{for } q_{dri} < q_{cri} \text{ and } q_{drj} < q_{crj} \\
q_{cri} & \text{for } q_{dri} > q_{cri} \text{ and } q_{drj} < q_{crj} \\
q_{dri} + q_{drj} - q_{crj} & \text{for } q_{dri} < q_{cri} \text{ and } q_{drj} > q_{crj}
\end{cases} \qquad (4)$$
$q_{nri}$ : reference value of route i for the rerouting controller
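The case distinction of equation (4) can be illustrated with a short function that returns the reference inflows of both routes. This is a minimal sketch, assuming flows in veh/h; the function and argument names are hypothetical, and raising an error for the overload case is a choice made here for illustration only.

```python
def reference_flows(q_dr1, q_dr2, q_cr1, q_cr2):
    """Reference inflows (q_nr1, q_nr2) for the case of no network overload.

    q_dr1, q_dr2 : flows entering route 1 and route 2
    q_cr1, q_cr2 : capacities of route 1 and route 2
    """
    # Equation (3): total demand must stay below total capacity.
    if q_dr1 + q_dr2 >= q_cr1 + q_cr2:
        raise ValueError("network overload: congestion balance algorithm required")

    def reference(q_dri, q_cri, q_drj, q_crj):
        if q_dri > q_cri:
            # Route i is over capacity: limit its inflow to the capacity.
            return q_cri
        if q_drj > q_crj:
            # Route j is over capacity: route i takes over the excess of route j.
            return q_dri + q_drj - q_crj
        # Both routes are below capacity: keep the demand unchanged.
        return q_dri

    return (reference(q_dr1, q_cr1, q_dr2, q_cr2),
            reference(q_dr2, q_cr2, q_dr1, q_cr1))


print(reference_flows(4500, 2000, 4000, 4000))  # (4000, 2500)
```

In the example, 500 veh/h exceed the capacity of route 1 and are added to the reference value of route 2; the sum of the two reference values equals the total demand.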
Calculating Reference Values in the Case of Network Overload. If the demand is higher than the
maximum capacity, or if the alternative routes have different lengths and a user optimum is
desired in order to achieve a high compliance, congestion will occur on at least one route. In
this case a "congestion balance algorithm" is used (Fig. 6).
Figure 6: Relation of congestion on main and alternative route (axes: length of congestion on route 1, length of congestion on route 2; offset L0)
This algorithm fulfils the following equation:
$$l_{cn1} = (l_{r2} - l_{r1}) \cdot f_{ac} + l_{c2} \qquad (5)$$
$l_{cn1}$ : reference value of congestion length of route 1
$l_{ri}$ : length of route i in km
$l_{c2}$ : observed congestion length on route 2
$f_{ac}$ : factor considering the conversion of length difference and congestion length
which leads to a reference value for congestion on route 1. The observed congestion length on
route i is calculated by:
$$l_{ci} = \frac{1}{k_{max}^2} \sum_{\text{all segments } j \,\mid\, v_j < 50\,\mathrm{km/h}} \left(k_j^2 \cdot l_{sj}\right) \qquad (6)$$
$v_j$ : speed on segment j
$k_j$ : density on segment j
$k_{max}$ : maximum density
$l_{sj}$ : length of segment j
Weighting the segment lengths with the squared densities accounts for different kinds of
congestion, or even for zones with different densities and speeds within one congestion. It is
assumed that higher densities cause the costs to increase disproportionately. Therefore this
standardised congestion length is seen to be approximately proportional to the costs of a driver
using the route; however, this has not yet been validated. The simplified equation
$$l_{ci} = \sum_{\text{all segments } j \,\mid\, v_j < 50\,\mathrm{km/h}} l_{sj} \qquad (7)$$
without standardised congestion can be used for validation purposes as long as the traffic flow
model used does not distinguish between different kinds of congestion.
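Both congestion length definitions, the density-weighted form (6) and the simplified form (7), can be sketched in one function. The data layout (one tuple of speed, density and length per segment), the function name and the example values, including the maximum density of 160 veh/km, are assumptions made for illustration and are not taken from the paper.

```python
def congestion_length(segments, k_max=None, v_threshold=50.0):
    """Congestion length of one route.

    segments    : iterable of (speed in km/h, density, segment length in km)
    k_max       : maximum density; if given, segment lengths are weighted
                  with (k_j / k_max)**2 as in equation (6); if None, the
                  simplified sum of equation (7) is returned
    v_threshold : segments slower than this speed count as congested
    """
    total = 0.0
    for speed, density, length in segments:
        if speed < v_threshold:
            weight = (density / k_max) ** 2 if k_max is not None else 1.0
            total += weight * length
    return total


# Hypothetical route with three segments of 0.5 km each.
route = [(30.0, 80.0, 0.5), (40.0, 160.0, 0.5), (90.0, 20.0, 0.5)]
print(congestion_length(route, k_max=160.0))  # weighted form (6)
print(congestion_length(route))               # simplified form (7)
```

In this example the first segment is congested at half the maximum density and therefore contributes only a quarter of its length to the weighted value, while the free-flowing third segment contributes nothing in either form.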
The difference between the reference value and the real value of congestion length is
taken as an input for a controller which yields a reference value for the traffic flow on route 1.
$$q_{nr1} = k_p \cdot (l_{cn1} - l_{c1}) + q_{c1out} \qquad (8)$$
$q_{nr1}$ : reference value for flow on route 1
$k_p$ : amplification factor of proportional (P-) part of controller
$q_{c1out}$ : flow leaving congestion on route 1
This is the equation of a P controller under consideration of disturbances. The idea is that
the traffic flow on the main route must be approximately the flow out of congestion plus the
difference between flows on off- and on-ramps to keep the length of congestion constant; the
controller itself adapts the congestion length by adding a further part to the reference value for
flow on route 1. This equation is used together with the multi-threshold control and feed-back
control described below. Equation
$$q_{nr1} = k_p \cdot (l_{cn1} - l_{c1}) \qquad (9)$$
is used together with the bang-bang control. In this case $q_{nr1}$ is the reference value for
traffic flow to be rerouted.
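The congestion balance algorithm, i.e. the reference congestion length (5) followed by the P controller of equations (8) and (9), can be sketched as two small functions. The route lengths in the example are those of the test site; the values chosen for the conversion factor, the controller gain and the flow leaving congestion are purely hypothetical, as are the function names.

```python
def reference_congestion_length(l_r1, l_r2, l_c2, f_ac):
    """Equation (5): reference congestion length of route 1 in km."""
    return (l_r2 - l_r1) * f_ac + l_c2


def reference_flow_route1(l_cn1, l_c1, k_p, q_c1out=0.0):
    """Equation (8); with q_c1out = 0 this reduces to equation (9).

    l_cn1   : reference congestion length of route 1 (km)
    l_c1    : observed congestion length of route 1 (km)
    k_p     : gain of the P controller (veh/h per km)
    q_c1out : flow leaving the congestion on route 1 (veh/h)
    """
    return k_p * (l_cn1 - l_c1) + q_c1out


# Main route 9.5 km, alternative route 18 km, no congestion on route 2.
# f_ac = 0.2, k_p = 400 and q_c1out = 3200 are illustrative assumptions.
l_cn1 = reference_congestion_length(l_r1=9.5, l_r2=18.0, l_c2=0.0, f_ac=0.2)
q_nr1 = reference_flow_route1(l_cn1, l_c1=2.2, k_p=400.0, q_c1out=3200.0)
print(round(l_cn1, 3), round(q_nr1, 1))
```

With these numbers the observed congestion on route 1 is about 0.5 km longer than its reference value, so the reference inflow is set about 200 veh/h below the flow leaving the congestion.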
Rerouting Controller
Control Actions. It is assumed that there are several control actions with different effects.
These control actions are carried out via "destination plans". A destination plan consists of
a combination of destinations which are displayed on variable direction signs at the decision
points of a motorway interchange. Please note that three different destinations on one panel at
two decision points result in nine possible different destination plans. In the following sections
we assume that there are N different destination plans $d_n$ with the attribute
$$\mathrm{effect}(d_i) > \mathrm{effect}(d_j) \quad \text{if } i > j \qquad (10)$$
effect($d_i$) : effect of plan $d_i$ (traffic flow rerouted)
d : destination plan
i, j : indices
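The count of nine plans and the ordering attribute (10) can be checked with a few lines of code. This is only a sketch: the three sign displays are represented by the placeholder labels "A", "B" and "C", and the effect values used in the check are invented for illustration.

```python
from itertools import product

# Placeholder labels for the three possible displays of one panel.
SIGN_STATES = ("A", "B", "C")


def enumerate_plans(states=SIGN_STATES, decision_points=2):
    """All combinations of one display per decision point."""
    return list(product(states, repeat=decision_points))


def satisfies_ordering(effects):
    """True if effect(d_i) > effect(d_j) whenever i > j, as in attribute (10).

    effects : rerouted flow of each plan, listed in the order d_1, ..., d_N
    """
    return all(later > earlier for earlier, later in zip(effects, effects[1:]))


print(len(enumerate_plans()))             # 9
print(satisfies_ordering([0, 200, 450]))  # True
```

Only plans whose effects can be ranked in this strict order are usable by the controllers described in the following sections, which is presumably why fewer than nine plans are implemented at the test site.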
Bang-Bang Controller. For this additional and new component of the control structure three
approaches are shown. A simple approach which requires neither a traffic model nor additional
measurements is a bang-bang controller. The reference value for the vehicles to be rerouted is
the input for the rerouting control. The entire reroutable flow is rerouted when the reference
value for the flow to be rerouted exceeds 300 veh/h. This leads to the following
equation:
$$D(n+1) = \begin{cases}
d_1 & \text{for } q_{nr1} < 300 \\
d_N & \text{for } q_{nr1} > 300
\end{cases} \qquad (11)$$
D : destination plan in use
n : discrete time step
$d_1$ : destination plan with effect 0 veh/h rerouted
$d_N$ : destination plan with maximum effect
Please note that $q_{nr1}$ in this case is the traffic flow to be rerouted. Figure 7 shows the
simulated control structure. This strategy causes control oscillations.
Figure 7: Scheme of simulated bang-bang control (blocks: rerouting algorithm, rerouting controller)
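The bang-bang control law (11) amounts to a single comparison. A minimal sketch follows, in which plans are represented by their index 1..N; the function name is hypothetical, and the treatment of a reference value of exactly 300 veh/h, which equation (11) leaves open, is an assumption made here.

```python
def bang_bang_plan(q_to_reroute, n_plans, threshold=300.0):
    """Index of the destination plan for the next time step, equation (11).

    q_to_reroute : reference value for the flow to be rerouted (veh/h)
    n_plans      : number N of destination plans, ordered by effect
    Returns 1 (plan d_1, nothing rerouted) or N (plan d_N, maximum effect).
    Assumption: a value exactly at the threshold keeps plan d_1.
    """
    return n_plans if q_to_reroute > threshold else 1


# Reference values from equation (9) for a few hypothetical time steps.
for q in (120.0, 280.0, 310.0, 900.0):
    print(q, bang_bang_plan(q, n_plans=6))
```

Because the controller can only switch between no rerouting and the maximum effect, small changes of the reference value around the threshold produce large changes of the rerouted flow, which is consistent with the control oscillations reported above.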
Model Based Multi-Threshold Controller. A further approach, which depends strongly on a traffic
model, is a multi-threshold controller. With the OD relations known, a lookup table can be built
which yields the relation between the vehicles to be rerouted and the control state. The control
law follows the equation
$$D(n+1) = d_i \quad \text{with} \quad \min_i \left| \mathrm{effect}(d_i) - q_{nr1} \right| \qquad (12)$$
The optimisation is done by varying the variable i. The control structure simulated for the
results is shown in Fig. 8. In the first part of the scheme the congestion lengths are
calculated. In the middle part the congestion balancing algorithm, which consists of the
congestion length control error and a P controller, is calculated. The third part is the
rerouting control, which uses a lookup table based on OD information.
Figure 8: Scheme of simulated multi-threshold control (blocks: model, routes, rerouting algorithm, rerouting controller)
For this approach, however, one must know the acceptance rates with which the drivers follow
the rerouting suggestion. Empirical examinations have shown that this is not a reasonable
assumption in the test field of Munich: the compliance rates vary between 5% and 40%
(Sachse, 1998).
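The lookup table and the selection rule (12) can be sketched as follows. How the table is built is an assumption made for illustration: the effect of a plan is taken as the compliance rate times the OD flow towards the destinations shown by that plan. The OD flows, the plan definitions and all names are hypothetical.

```python
def build_effect_table(plans, od_flows, compliance):
    """Expected rerouted flow (veh/h) per plan, in the order d_1, ..., d_N.

    plans      : list of tuples with the destinations shown by each plan
    od_flows   : reroutable flow per destination (veh/h)
    compliance : assumed share of drivers following the recommendation
    """
    return [compliance * sum(od_flows[dest] for dest in plan) for plan in plans]


def multi_threshold_plan(q_nr1, effects):
    """Equation (12): 1-based index of the plan whose effect is closest to q_nr1."""
    best = min(range(len(effects)), key=lambda i: abs(effects[i] - q_nr1))
    return best + 1


plans = [(), ("Munich",), ("Munich", "Salzburg")]
od_flows = {"Munich": 1500.0, "Salzburg": 600.0}

table_40 = build_effect_table(plans, od_flows, compliance=0.40)
table_05 = build_effect_table(plans, od_flows, compliance=0.05)
print(multi_threshold_plan(500.0, table_40))
print(multi_threshold_plan(500.0, table_05))
```

The same reference value of 500 veh/h selects different plans depending on the assumed compliance rate, which illustrates why this controller depends on the compliance being known.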
Feed-Back Controller. A third approach, which requires less model knowledge but uses
measurements and demonstrates the benefit of the three-component approach, is based on automatic
control. If one knows the order of the destination plans with respect to their effects, some
kind of integral (I-) controller can be designed, which is modelled by the following equations.
Let us assume that
$$D(n) = d_i, \quad 1 < i < N \qquad (13)$$
then
$$D(n+1) = \begin{cases}
d_{i+1} & \text{for } q_{nr1} - q_{dr1} > 50 \\
d_{i-1} & \text{for } q_{nr1} - q_{dr1} < -50
\end{cases} \qquad (14)$$
If the control difference exceeds a small threshold of 50 veh/h, the next destination
plan is chosen with respect to the sign of the error. In the case
$$D(n) = d_1 \qquad (15)$$
the following equation holds:
$$D(n+1) = \begin{cases}
d_2 & \text{for } q_{nr1} - q_{dr1} > 50 \\
d_1 & \text{for } q_{nr1} - q_{dr1} < -50
\end{cases} \qquad (16)$$
Analogously, for
$$D(n) = d_N \qquad (17)$$
$$D(n+1) = \begin{cases}
d_N & \text{for } q_{nr1} - q_{dr1} > 50 \\
d_{N-1} & \text{for } q_{nr1} - q_{dr1} < -50
\end{cases} \qquad (18)$$
Figure 9: Scheme of simulated feed-back control (blocks: observer, model, routes, rerouting algorithm, rerouting controller)
Figure 9 shows the control structure simulated. The threshold avoids control oscillations in the
sense of pulse width modulation. In addition, the equations are evaluated only once every
minute, both to prevent a driver from seeing more than one activated sign and to take into
consideration the time lag between the activation and the effects at the next measurement loops
downstream. The idea is to avoid a model with great uncertainties as long as one can measure
the effects of the control. The approach can also be seen as some kind of trial and error.
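Equations (13) to (18) describe a stepwise controller that moves one plan up or down per evaluation and saturates at the first and last plan. A minimal sketch, with plans represented by their index 1..N: the function name is hypothetical, and keeping the current plan while the error stays within the threshold is an assumption that follows the surrounding text rather than the equations.

```python
def next_plan(i, q_nr1, q_dr1, n_plans, threshold=50.0):
    """Index of the destination plan for the next time step.

    i       : index of the plan currently in use, 1 <= i <= n_plans
    q_nr1   : reference value for the flow on route 1 (veh/h)
    q_dr1   : measured flow entering route 1 (veh/h)
    n_plans : number N of destination plans, ordered by effect
    """
    error = q_nr1 - q_dr1
    if error > threshold:
        return min(i + 1, n_plans)   # equations (14), (16), (18), upper case
    if error < -threshold:
        return max(i - 1, 1)         # equations (14), (16), (18), lower case
    return i                         # assumption: inside the threshold, keep the plan


# One evaluation per minute with hypothetical control errors (veh/h).
plan = 1
history = []
for error in (120.0, 80.0, 20.0, -70.0):
    plan = next_plan(plan, q_nr1=error, q_dr1=0.0, n_plans=6)
    history.append(plan)
print(history)  # [2, 3, 3, 2]
```

The controller needs only the order of the plans and a measurement of the flow, not the absolute effect of each plan, so no compliance rate has to be assumed.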
RESULTS
Motorway Network
The motorway network in the North of Munich consists of two alternative route pairs between the
motorway interchanges Neufahrn and München Nord. Fig. 10 shows the topology used for the
algorithms and simulations. The network has 27 on-ramps and off-ramps. The length of the main
route is about 9.5 km. The alternative route has a length of 18 km (direction Munich to Neufahrn)
or 19 km (direction Neufahrn to Munich). There are some destination plans which can be
activated for both the North and South direction. The destination plans arise from the
combination of the variable direction signs showing different destinations at different locations.
Fig. 11 shows the relevant destination plans which are implemented in the existing control system
(Klofkorn et al., 1991) and can be used for field trials. One variable direction sign (1) is located
at the link entering the interchange Neufahrn from Nuremberg, another (2) is located at the link
coming from the airport and from Deggendorf, respectively. Both direction signs can either show
nothing (-), or direction Munich, or direction Munich and Salzburg. Their combination yields the
destination plans $N_i$ for the direction Neufahrn to Munich. Further, two variable direction signs
located at the interchange Munich North from direction Munich (3) and direction Salzburg (4)
with the possible destinations Nuremberg and airport yield the plans $M_i$. The plans are sorted
with respect to their effects: the higher the index i, the higher the effect with respect to
rerouted vehicles.
Figure 10: Motorway network in the North of Munich (motorways A92 and A99 between Nuremberg and Munich; junctions Eching, Garching-Nord, Garching-Sued, Oberschleissheim, Feldmoching, Freimann, Frankfurter Ring; legend: link with loop detector)
Plan   Decision point 1     Decision point 2
N0     -                    -
N1     -                    Munich
N2     -                    Munich + Salzburg
N3     Munich               Munich
N4     Munich               Munich + Salzburg
N5     Munich + Salzburg    Munich + Salzburg

Plan   Decision point 3      Decision point 4
M0     -                     -
M2     -                     Airport
M5     Airport               Airport
M6     Airport               Airport + Nuremberg
M7     Airport + Nuremberg   Airport + Nuremberg
Figure 11: Table of possible destination plans for the VDS from North and South
Design of Experiment
The closed loop control is simulated with the second order macroscopic model SiMONE (Cremer,
1979) and an OD estimation (Ploss, 1993). As input, real data of the network inflows were taken
from the morning rush hours of May 15th, 1995. The compliance was assumed to be
constant and known during the simulation horizon. It was set to 40%, which is about the highest
value that can be achieved in reality and the worst case with respect to control stability. The
data are aggregated to 5 minutes for the traffic flow models and to 20 minutes for the OD
estimation. The results of the OD estimation are used to assign the 5 minute traffic flow to the
different destinations. The simulation produces virtual measurement data which are used as
input for the control system. The algorithms themselves neglect a lot of traffic information and
phenomena. Nevertheless, a lot of real traffic information and phenomena are considered by the
models used for validating the algorithms. The estimated traffic data are quite different from
the simulated data, because at the on- and off-ramps real data are used which differ from the
simulation. This is why the results can be seen as a measure of the applicability of the
algorithms to the real motorway network.
Scenario
Let us consider a situation where we have an incident on the main route due to a merging effect
at the interchange Munich North for traffic coming from the North. The capacity is reduced to
1600 veh/(h lane) in direction South. Without control the main route is fully congested after
20 minutes (Fig. 12). High speeds can be observed on the alternative route, because traffic
flows freely there.
Bang-Bang Controller
The traffic data are taken directly from the model; the results shown are only the effects of the
control algorithm and feed-back control component. The effects of the traffic model (observer)
are not considered. The congestion length ranges from 1.5 km to 2.4 km. The control oscilla-
tion can be explained with the delay (time lag) between the rerouting signal and the end of the
congestion. No congestion occurs on the alternative route. However, the user optimum is not
reached completely. The travel time on the main route is up to 4 minutes shorter than the travel
time on the alternative route (Fig. 12). This might be due to the bang-bang controller (too many
rerouted vehicles) and it could be solved by choosing other parameters in the congestion balance
algorithm.
Figure 12: Travel times with and without control (curves: main route and alternative route, each with and without control; horizontal axis: time / [60 sec])
Multi-Threshold Control
In this case, too, the effects of the observer are neglected. Using the multi-threshold
rerouting controller based on OD information, a significant reduction in control oscillation
can be observed. With growing congestion length the destination plan is switched from N0 to N4.
The remaining difference in travel times is intended, owing to the user optimum (Fig. 13).
Please note that these results can be achieved only because the compliance rate was assumed to
be known.
Figure 13: Travel times with and without control (curves: main route and alternative route, each with and without control; horizontal axis: time / [20 sec])
Feed-Back Control
The results of feed-back control are calculated using the same merging scenario again, but with
data from April 28th, 1997, 7 am. In contrast to both other experiments the effects of the traffic
model are also considered. Furthermore another version of the traffic flow model is used which,
together with the data supplied and the control output, explains the differences in simulation.
All results shown are with operating control. Figure 14 shows the simulated speed on the main
route. Figure 15 shows the estimated speed on the main route. The differences occur because of
the use of real data at on- and off-ramps, which are not compatible with the simulation with
respect to the control, but also because of uncertainties of the traffic flow model. Although
the observation quality in the example shown is not quite satisfactory, the results in terms of
travel time (Fig. 16) are remarkable. This indicates that the approach will also show good
results in reality. Figure 17 shows the simulated speed on the alternative route, Figure 18 the
estimated speed. No impact occurs although it was assumed that all rerouted vehicles use the
alternative route. Empirical analysis of data has shown that a non-negligible part of them use
the secondary road network, which results in higher speeds on the alternative route than
simulated.
Figure 14: Simulated speed on main route (feed-back control; axes: space [segment], time [1 min])
Figure 15: Observed speed on main route (feed-back control; axes: space [segment], time [1 min])
Figure 16: Travel times on main and alternative route (curves: main route, alternative route; horizontal axis: time [1 min])
Figure 17: Simulated speed on alternative route (feed-back control; axes: space [segment], time [1 min], speed [km/h])
Figure 18: Observed speed on alternative route (feed-back control; axes: space [segment], speed [km/h])
CONCLUSION
The implementation of the rerouting algorithm which utilises an additional feed-back controller
has shown quite good results although, or even because, measured data rather than a traffic
model were used in the new component. The new three-component approach presented here allows
the simplification of the traffic models used, their correction and even a kind of "adaptation
of reality" to the traffic models, using the additional controller to overcome estimation errors
by changing the real effects and the destination plan respectively. Thus the three-component
approach is a reference for a new class of control systems in traffic theory.
The control actions of the proposed examples are readily comprehensible because of the
simple rules of the control algorithms. This can be used for debugging purposes but it can also
convince operators to accept the rerouting recommendations of the control algorithm.
In further developments of the example control strategies the congestion length controller
will be revised or replaced by a travel time estimation. Further investigation includes the
adaptation of the approach to more complex traffic networks and the replacement of the simple
examples presented by more sophisticated subalgorithms.
In the projects MANAH and MOBINET for the greater Munich area a field trial and the
development of a continuously operating system are planned.
REFERENCES
Cremer, M. (1979). Der Verkehrsfluß auf Schnellstraßen, Springer-Verlag, Berlin, Heidelberg,
New York.
Cremer, M. and Keller, H. (1987). A new class of dynamic methods for the identification of
origin-destination flows. Transportation Research, 21B, 117-132.
Cremer, M., Meissner, F. and Schrieber, S. (1993). On predictive control schemes in dynamic
rerouting strategies, in: Transportation and Traffic Theory, 407-426.
Cremer, M., Papageorgiou, M. and Schmidt, G. (1980). Einsatz regelungstechnischer Mittel
zur Verbesserung des Straßenverkehrsablaufs auf Schnellstraßen. Forschung Straßenbau und
Straßenverkehrstechnik, Heft 307.
Cremer, M. and Schütt, H. (1990). A comprehensive concept for simultaneous state observation,
parameter estimation and incident detection, in: Transportation and Traffic Theory, ed.
M. Koshi, 95-112, Elsevier, New York, Amsterdam, London.
Dougherty, Cobett (1994). Short term inter-urban traffic forecasts using neural networks, in:
Second Drive-II Workshop on Short-Term Traffic Forecasting, 65-79.
Kates, R. (1994). The interplay of viscosity, instability, and nonlinearity in deterministic traffic
flow models: Numerical simulations and possible applications, Forschungsbericht, Astrophysical
Institute 11482 Potsdam, Germany, preprint.
Klofkorn, Ohio and Tilly (1991). Pflichtenheft Wechselwegweisung A9 / A92 / A99, Technischer
Bericht, SIEMENS AG.
Meßmer, A. (1994). Anwendung regelungstechnischer Verfahren zur dynamischen
Routenführung in Schnellstraßennetzen, Dissertation, Lehrstuhl für Steuerungs- und
Regelungstechnik der Technischen Universität München.
Papageorgiou, M. (1990). Dynamic modelling, assignment, and route guidance in traffic
networks. Transportation Research, 24B(6), 471-495.
Ploss, G. (1993). Ein dynamisches Verfahren zur Schätzung von Verkehrsbeziehungen aus
Querschnittszählungen, Dissertation, Fachgebiet Verkehrstechnik und Verkehrsplanung,
Technische Universität München.
Poschinger, A. (1996). Some aspects for the development of a new motorway control algorithm,
in: 4th Meeting of the EURO Working Group on Transportation, Newcastle upon Tyne, preprint
of extended abstracts.
Poschinger, A., Cremer, M. and Keller, H. (1997). A control scheme for variable direction signs
using dynamic reference values, in: IEEE Conference on Intelligent Transportation Systems,
Boston, 121 (Digest), 0-7803-4269-0/97/$10.
Sachse, T. (1995). Advanced regional traffic control systems, DRIVE II Project Deliverable 4011,
Fachgebiet Verkehrstechnik und Verkehrsplanung, TU München.
Sachse, T. (1998). Alternativroutensteuerung in Autobahnnetzen auf der Grundlage einer
erweiterten Analyse des Verkehrsablaufs, Dissertation, Technische Universität München,
Fachgebiet Verkehrstechnik und Verkehrsplanung.
Tsavachidis, M., McLean, T., Brader, C., Hangleiter, S., Damas, C., Maxwell, B. and Barber, P.
(1998). Urban integrated traffic control, TABASCO Deliverable number 8.3.
OPTIMAL COORDINATED AND INTEGRATED
MOTORWAY NETWORK TRAFFIC CONTROL
Apostolos Kotsialos, Dynamic Systems and Simulation Laboratory, Technical University of
Crete, Chania, Greece.
Markos Papageorgiou, Dynamic Systems and Simulation Laboratory, Technical University of
Crete, Chania, Greece.
Albert Messmer, Ingenieurbüro A. Messmer, Munich, Germany.
INTRODUCTION
An increasingly important area in the field of traffic engineering is motorway traffic control. A
number of approaches have been developed in the past for the design of control strategies that
involve control measures such as route recommendation with the use of Variable Message
Signs (VMS) or equipped vehicles, ramp metering, motorway-to-motorway (mtm) control, etc.
These approaches include expert systems, fuzzy systems, neural networks, and classical feed-
back control. A further approach in this area is the use of optimal control methods and corre-
sponding numerical solution algorithms, in which case the designer's expertise is required in
the problem formulation, rather than in the problem solution phase. More precisely, the de-
signer must suitably translate the technical problem at hand in the format of an optimal control
problem. The optimal decisions resulting from the solution of the formulated optimal control
problem may in many cases surprise the designer and may even call for an a posteriori inter-
pretation, thus challenging his technical judgement and extending or correcting his presumed
expertise. In fact, solutions obtained via this approach are most effective because control deci-
sions are based on the minimisation of a physically meaningful control criterion and on explicit
consideration of all process non-linearities and control constraints. For a number of application
examples that employ optimal control in such a fashion, see Papageorgiou (1997).
In this paper the optimal control approach (discrete-time formulation) is applied to the design
of optimal coordinated and integrated motorway control strategies. Coordinated control strate-
gies consider multiple control measures of the same type, while integrated control strategies
consider different types of control measures. A feasible-direction algorithm is used for the nu-
merical solution of the formulated discrete-time optimal control problem (see Papageorgiou
and Marinaki, 1995). Previous applications of nonlinear constrained optimal control to
coordinated ramp metering (without route guidance) were reported by Blinkin (1976), Papageorgiou
(1983), Papageorgiou and Mayr (1982), Bhouri et al. (1990), Bhouri (1991), and Zhang et al.
(1996), while optimal control applications to the route guidance problem in motorway
networks (without ramp metering) were suggested by Messmer and Papageorgiou (1995), Wie et
al. (1995), and Iftar (1995). Moreno-Banos et al. (1993) suggested an integrated control
strategy addressing both route guidance and ramp metering, based on a simplified traffic model.
Another approach to the same problem of integrated control is the application of a linear pro-
gramming method as described by Papageorgiou (1994) and Elloumi et al. (1996) where (in
addition to VMS, ramp metering, and motorway-to-motorway control) traffic signal control for
the urban parts of the network is also considered. An integrated heuristic feedback approach to
motorway traffic control was also presented recently by Ataslar and Iftar (1998). Finally, a
practically appealing, feedback approach when urban areas are included in the overall network
was presented by Diakaki et al. (1998, 1999), along with an application to a part of the Glas-
gow highway network. The approach presented in this paper incorporates for the first time a
realistic, nonlinear traffic flow model with various control measures, a powerful numerical so-
lution algorithm, and a generic software code (AMOC: Advanced Motorway Optimal Control)
while applying optimal control methods to motorway network traffic.
AMOC incorporates the macroscopic traffic flow model of METANET (Modèle d'Écoulement
du Trafic Autoroutier: Networks) (Messmer and Papageorgiou, 1990) and the nonlinear
optimisation algorithm required for the determination of the control variables' optimal trajecto-
ries over time suggested by Papageorgiou and Marinaki (1995). AMOC determines optimal
open-loop control trajectories for motorway networks of arbitrary topology and of arbitrary
traffic characteristics. The control measures considered within AMOC are route recommenda-
tion, ramp metering, and motorway-to-motorway control.
The rest of this paper is organised as follows. The second section describes the utilised macro-
scopic traffic flow model and the corresponding control problem. The third section presents the
discrete-time optimal control problem formulation and its numerical solution, while the fourth
contains some illustrative results obtained by use of AMOC for two hypothetical networks and
a number of different control scenarios. Finally, the fifth section summarises the conclusions
and future work.
THE MACROSCOPIC TRAFFIC FLOW MODEL

A validated second-order traffic flow model is used for the description of traffic flow on
motorway networks. A generic simulation tool, called METANET, was developed in the past
(see Messmer and Papageorgiou, 1990) on the basis of this model which provides the model-
ling part of the optimal control problem formulation.
Model overview
The macroscopic model employed is suitable for free-flow, critical, and congested traffic
conditions. It has two distinct modes of operation. When traffic assignment (routing) aspects of
the traffic process are not taken into consideration, it operates in the non-destination oriented
mode. When traffic assignment is an issue, it operates in the destination oriented mode. When
route guidance measures are not included, the traffic model need not operate in the
destination oriented mode, although it may still do so.
The network is represented as a directed graph whereby the links of the graph represent
motorway stretches. Each motorway stretch has uniform characteristics, i.e. no on-/off-ramps
and no major changes in geometry. The nodes of the graph are placed at locations where a
major change in road geometry occurs, as well as at junctions, on-ramps, and off-ramps.
The non-destination oriented model
In the METANET model, the time and space arguments are discretised. The discrete-time step
is denoted by $T$. A motorway link $m$ is divided into $N_m$ segments of equal length $L_m$. Each
segment $i$ of link $m$ at time instant $t = kT$, $k = 0, \ldots, K$, is characterised by the
following macroscopic quantities: the traffic density $\rho_{m,i}(k)$ (veh/lane-km) is the number
of vehicles in segment $i$ of link $m$ at time $t = kT$ divided by $L_m$ and by the number of lanes
$\lambda_m$; the mean speed $v_{m,i}(k)$ (km/h) is the mean speed of the vehicles included in
segment $i$ of link $m$ at time $kT$; and the traffic volume or flow $q_{m,i}(k)$ (veh/h) is the
number of vehicles leaving segment $i$ of link $m$ during the time period $[kT, (k+1)T]$, divided
by $T$. The basic equations used for their calculation for each segment $i$ of link $m$ at each
time step are

$$\rho_{m,i}(k+1) = \rho_{m,i}(k) + \frac{T}{L_m \lambda_m}\,[q_{m,i-1}(k) - q_{m,i}(k)] \qquad (1)$$

$$q_{m,i}(k) = \rho_{m,i}(k)\, v_{m,i}(k)\, \lambda_m \qquad (2)$$

$$V[\rho_{m,i}(k)] = v_{f,m} \exp\left[-\frac{1}{a_m}\left(\frac{\rho_{m,i}(k)}{\rho_{cr,m}}\right)^{a_m}\right] \qquad (3)$$

$$v_{m,i}(k+1) = v_{m,i}(k) + \frac{T}{\tau}\{V[\rho_{m,i}(k)] - v_{m,i}(k)\} + \frac{T}{L_m}\, v_{m,i}(k)\,[v_{m,i-1}(k) - v_{m,i}(k)] - \frac{\nu T}{\tau L_m}\, \frac{\rho_{m,i+1}(k) - \rho_{m,i}(k)}{\rho_{m,i}(k) + \kappa} \qquad (4)$$

where $v_{f,m}$ and $\rho_{cr,m}$ denote the free speed and critical density per lane, respectively,
of link $m$, while $a_m$ is a further parameter of the fundamental diagram (eqn. (3)) of link $m$.
Equation (1) is the well-known conservation equation, eqn. (2) is the flow equation to be
substituted in (1), and eqn. (4) is an empirical dynamic speed equation with (3) to be substituted
therein (for more details see Papageorgiou et al., 1990). The mean speeds in the network's links
are limited from below by $v_{\min}$. The quantities $v_{f,m}$, $\rho_{cr,m}$, $a_m$, $\tau$, $\nu$, and
$\kappa$ are constant parameters which reflect particular characteristics of a given traffic system
and depend upon street geometry, vehicle characteristics, drivers' behaviour, etc. For a real-life
network these parameters are determined by a validation procedure such as the one described by
Kotsialos et al. (1997), Annex B.
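The segment update can be sketched in a few lines of Python. This is only an illustrative sketch, assuming the standard METANET form of eqns. (1)-(4); the function names `equilibrium_speed` and `step_link`, the boundary arguments `q_in`, `v_up` and `rho_down`, and the chosen numerical values are hypothetical and not part of METANET or AMOC. Times are expressed in hours so that flows come out in veh/h.

```python
import math

def equilibrium_speed(rho, v_free, rho_cr, a):
    """Fundamental diagram, eqn. (3): V(rho) = v_f * exp(-(1/a) * (rho/rho_cr)**a)."""
    return v_free * math.exp(-(1.0 / a) * (rho / rho_cr) ** a)

def step_link(rho, v, q_in, v_up, rho_down, *, T, L, lanes,
              v_free, rho_cr, a, tau, nu, kappa, v_min=0.0):
    """Advance all segments of one link by one time step T (in hours).

    rho, v   : lists of segment densities (veh/lane-km) and mean speeds (km/h)
    q_in     : flow entering the first segment (veh/h)
    v_up     : speed upstream of the first segment (km/h)
    rho_down : density downstream of the last segment (veh/lane-km)
    """
    n = len(rho)
    q = [rho[i] * v[i] * lanes for i in range(n)]                      # eqn. (2)
    rho_new, v_new = [], []
    for i in range(n):
        q_prev = q_in if i == 0 else q[i - 1]
        v_prev = v_up if i == 0 else v[i - 1]
        rho_next = rho_down if i == n - 1 else rho[i + 1]
        rho_new.append(rho[i] + T / (L * lanes) * (q_prev - q[i]))      # eqn. (1)
        relaxation = T / tau * (equilibrium_speed(rho[i], v_free, rho_cr, a) - v[i])
        convection = T / L * v[i] * (v_prev - v[i])
        anticipation = nu * T / (tau * L) * (rho_next - rho[i]) / (rho[i] + kappa)
        v_new.append(max(v_min, v[i] + relaxation + convection - anticipation))  # eqn. (4)
    return rho_new, v_new, q

# Illustrative values: Table 1 parameters with the free speed and exponent of test network 1.
params = dict(T=10 / 3600, L=1.0, lanes=3, v_free=102.0, rho_cr=33.5,
              a=1.867, tau=18 / 3600, nu=60.0, kappa=40.0)
v_eq = equilibrium_speed(20.0, 102.0, 33.5, 1.867)
rho1, v1, q1 = step_link([20.0, 20.0], [v_eq, v_eq], q_in=20.0 * v_eq * 3,
                         v_up=v_eq, rho_down=20.0, **params)
```

In the usage example the link starts in a homogeneous equilibrium state, so every bracketed difference in (1) and (4) vanishes and the state is reproduced unchanged, which is a convenient sanity check for any implementation of the model.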
In order for the speed calculation to take into account the speed decrease caused by merging
phenomena, the term $-[\delta T / (L_m \lambda_m)]\,[q_\mu(k)\, v_{m,1}(k) / (\rho_{m,1}(k) + \kappa)]$ is
added to the right-hand side of (4), where $\delta$ is a further parameter, $\mu$ is the merging
link and $m$ is the leaving link. In order for the speed reduction due to weaving phenomena,
resulting from lane drops in the mainstream, to be considered, the term
$-[\phi T / (L_m \lambda_m)]\,[\Delta\lambda\, \rho_{m,N_m}(k) / \rho_{cr,m}]\, v_{m,N_m}(k)^2$ is also added
in (4), where $\Delta\lambda$ is the number of lanes being dropped, and $\phi$ is a further
parameter. For more details on these two additional terms, see Papageorgiou et al. (1990).
For origin links, i.e. links that receive traffic demand and subsequently forward it into the
motorway network, a simple queue model is used. Origin links are characterised by their
capacity and their queue length. The outflow $q_o(k)$ of an origin link $o$ is given by

$$q_o(k) = \min\,[\,d_o(k) + w_o(k)/T,\; q_{\max,o}(k)\,] \qquad (5)$$

where $d_o(k)$ is the demand flow at time period $k$ at origin $o$, $w_o(k)$ is the length (in
vehicles) of a possibly existing waiting queue at time $kT$, and $q_{\max,o}(k)$ is the maximum
outflow at the specific time instant. The latter depends on the density of the primary downstream
leaving link $\mu$ in the following way

$$q_{\max,o}(k) = \begin{cases} Q_o & \text{if } \rho_{\mu,1}(k) < \rho_{cr,\mu} \\ Q_o \cdot p(k) & \text{else} \end{cases} \qquad (6)$$

where $Q_o$ is the flow capacity of the origin link and $p(k)$ is the portion of $Q_o$ that is
permitted to enter link $\mu$, where

$$p(k) = \frac{\rho_{\max} - \rho_{\mu,1}(k)}{\rho_{\max} - \rho_{cr,\mu}} \qquad (7)$$

with $\rho_{\max}$ the maximum possible density in the network's links. The conservation equation
for an origin link yields

$$w_o(k+1) = w_o(k) + T\,[d_o(k) - q_o(k)] \qquad (8)$$

When ramp metering control measures are applied at an origin link $o$, then eqn. (6) is replaced
by the following relationship

$$q_{\max,o}(k) = \begin{cases} Q_o \cdot r_o(k) & \text{if } r_o(k) < p(k) \\ Q_o \cdot p(k) & \text{else} \end{cases} \qquad (9)$$

where $r_o(k) \in [r_{\min}, 1]$ is the metering rate for the origin link $o$ at period $k$, i.e. a
control variable. If $r_o(k) = 1$, i.e. no ramp metering is applied, equation (9) reduces to (6).
If $r_o(k) < 1$, ramp metering becomes active.
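The queue model with ramp metering can be illustrated with a short Python sketch. It assumes that eqns. (5), (7), (8) and (9) take the form described in the text, with (9) reducing to (6) for a metering rate of 1; the function name `origin_step` and the example numbers are hypothetical.

```python
def origin_step(w, demand, rho_down, r=1.0, *, T, Q_cap, rho_max, rho_cr):
    """One time step of an origin (on-ramp) queue.

    w        : current queue length (veh)
    demand   : demand flow d_o(k) (veh/h)
    rho_down : density of the first segment of the downstream link (veh/lane-km)
    r        : metering rate r_o(k); r = 1 means no ramp metering
    Returns the outflow q_o(k) (veh/h) and the next queue length w_o(k+1) (veh).
    """
    p = (rho_max - rho_down) / (rho_max - rho_cr)        # eqn. (7)
    q_max = Q_cap * r if r < p else Q_cap * p            # eqn. (9); r = 1 gives eqn. (6)
    q = min(demand + w / T, q_max)                       # eqn. (5)
    w_next = w + T * (demand - q)                        # eqn. (8)
    return q, w_next

# Illustrative values: T = 10 s, Q_o = 1500 veh/h, densities as in Table 1.
common = dict(T=10 / 3600, Q_cap=1500.0, rho_max=180.0, rho_cr=33.5)
q_free, w_free = origin_step(0.0, 1000.0, 20.0, **common)          # no metering
q_met, w_met = origin_step(0.0, 1000.0, 20.0, r=0.5, **common)     # metering at 50%
q_cong, w_cong = origin_step(0.0, 1000.0, 106.75, **common)        # congested mainline
```

With free-flowing traffic downstream and no metering the whole demand of 1000 veh/h enters the motorway. A metering rate of 0.5, or a downstream density of 106.75 veh/lane-km (for which $p(k) = 0.5$), limits the outflow to 750 veh/h, and the remaining 250 veh/h accumulate in the queue.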
For a number of reasons, among which is the modelling of motorway-to-motorway control
measures, a simple queue model may also be used for some internal network links, called
store-and-forward links. These links are characterised by their flow capacity, their queue
length, and their constant travel time. For the determination of the outflow and the queue
length of such links, equations similar to (5)-(8) hold. If motorway-to-motorway control is ap-
plied at a store-and-forward link, an equation similar to (9) holds.
Motorway bifurcations and junctions (including on-ramps and off-ramps) are represented by
nodes. Traffic enters a node n through a number of input links and is distributed to a number of
output links according to

$$Q_n(k) = \sum_{\mu \in I_n} q_{\mu,N_\mu}(k) \qquad (10)$$

$$q_{m,0}(k) = \beta_n^m(k) \cdot Q_n(k) \quad \forall m \in O_n \qquad (11)$$

where $I_n$ is the set of links entering node $n$, $O_n$ is the set of links leaving node $n$,
$Q_n(k)$ is the total traffic volume entering node $n$ at period $k$, $q_{m,0}(k)$ is the traffic
volume that leaves node $n$ via outlink $m$, and $\beta_n^m(k)$ is the portion of $Q_n(k)$ that
leaves the node through link $m$. The $\beta_n^m(k)$ are the turning rates of node $n$ and are
assumed known for the entire time horizon. Equations (10) and (11) provide $q_{m,0}(k)$, which is
needed in (1) for $i = 1$.
When a node $n$ has more than one leaving link, the upstream influence of density has to
be taken into account in the last segment of the incoming link (see (4)). This is provided via

$$\rho_{m,N_m+1}(k) = \frac{\sum_{\mu \in O_n} \rho_{\mu,1}(k)^2}{\sum_{\mu \in O_n} \rho_{\mu,1}(k)} \qquad (12)$$

where $\rho_{m,N_m+1}(k)$ is the virtual downstream density of the entering link $m$ to be used in
eqn. (4) for $i = N_m$, and $\rho_{\mu,1}(k)$ is the density of the first segment of leaving link
$\mu$. This quadratic form is used because one congested leaving link may block the entering link
even if there is free flow in the other leaving link.
When a node $n$ has more than one entering link, the downstream influence of speed has
to be taken into account according to equation (4). The mean speed value is calculated from

$$v_{m,0}(k) = \frac{\sum_{\mu \in I_n} v_{\mu,N_\mu}(k)\, q_{\mu,N_\mu}(k)}{\sum_{\mu \in I_n} q_{\mu,N_\mu}(k)} \qquad (13)$$

where $v_{m,0}(k)$ is the virtual speed upstream of the leaving link $m$ that is needed in (4) for
$i = 1$.
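A small Python sketch may help to see how the node relations work together. It assumes the usual METANET forms: flows are distributed by turning rates as in (10) and (11), the virtual downstream density is the density-weighted mean of the leaving links' first-segment densities (the quadratic form mentioned above), and the virtual upstream speed is the flow-weighted mean of the entering links' last-segment speeds. All function names and numbers are hypothetical.

```python
def node_flows(q_in, beta):
    """Distribute the total inflow of a node to its leaving links, eqns. (10)-(11).

    q_in : outflows of the last segments of the entering links (veh/h)
    beta : turning rates of the leaving links (should sum to 1)
    """
    total = sum(q_in)                                   # eqn. (10)
    return [b * total for b in beta]                    # eqn. (11)

def virtual_downstream_density(rho_first):
    """Density-weighted mean of the leaving links' first-segment densities."""
    total = sum(rho_first)
    return sum(r * r for r in rho_first) / total if total > 0 else 0.0

def virtual_upstream_speed(v_last, q_last):
    """Flow-weighted mean of the entering links' last-segment speeds."""
    total = sum(q_last)
    return sum(v * q for v, q in zip(v_last, q_last)) / total if total > 0 else 0.0

out_flows = node_flows([3000.0, 1000.0], [0.6, 0.4])
rho_virtual = virtual_downstream_density([20.0, 100.0])
v_virtual = virtual_upstream_speed([100.0, 50.0], [3000.0, 1000.0])
```

In the example one leaving link is in free flow (20 veh/lane-km) and the other is congested (100 veh/lane-km). The quadratic form yields about 86.7 veh/lane-km, well above the arithmetic mean of 60, so the congested branch dominates what the entering link "sees" downstream.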
The destination oriented model
When traffic assignment is taken into consideration, the notion of turning rates $\beta_n^m$ is
generalised to the notion of splitting rates. Assume that a destination $j$ is reachable from node
$n$ via more than one exiting link. Let $Q_{n,j}(k)$ be the total traffic volume entering motorway
node $n$ at period $k$ that is destined to $j$. Then the splitting rate $\beta_{n,j}^m(k)$ is the
portion of the traffic volume $Q_{n,j}(k)$ which leaves motorway node $n$ at period $k$ through link
$m$. We further define for segment $i$ of link $m$ the composition rate $\gamma_{m,i,j}(k)$ to be
the portion of traffic volume $q_{m,i}(k)$ which is destined to destination $j$. Based on these
definitions the following equations hold for any network node

$$Q_{n,j}(k) = \sum_{\mu \in I_n} q_{\mu,N_\mu}(k) \cdot \gamma_{\mu,N_\mu,j}(k) \quad \forall (n, j) \qquad (14)$$

$$q_{m,0}(k) = \sum_{j \in J_m} Q_{n,j}(k) \cdot \beta_{n,j}^m(k) \quad \forall m \in O_n \qquad (15)$$

$$\gamma_{m,0,j}(k) = \beta_{n,j}^m(k) \cdot Q_{n,j}(k) / q_{m,0}(k) \quad \forall m \in O_n \;\; \forall j \in J_m \qquad (16)$$

where $J_m$ is the set of all destinations reachable via link $m$. Equations (14)-(16) provide
$q_{m,0}(k)$ and $\gamma_{m,0,j}(k)$ that are needed in (17) for $i = 1$.
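The bookkeeping of eqns. (14)-(16) can be sketched as follows, assuming they have the form described in the text (per-destination inflow, outflow per leaving link, and composition rate at the entrance of each leaving link). The function name `destination_node`, the dictionary layout and the link keys `"L0"`, `"main"` and `"alt"` are hypothetical; the percentages are those quoted later for test network 2.

```python
def destination_node(q_in, gamma_in, beta):
    """Destination-oriented node model.

    q_in[mu]        : flow leaving the last segment of entering link mu (veh/h)
    gamma_in[mu][j] : composition rate of that flow towards destination j
    beta[m][j]      : splitting rate of destination-j traffic onto leaving link m
                      (only destinations reachable via m appear in beta[m])
    """
    dests = {j for comp in gamma_in.values() for j in comp}
    # eqn. (14): traffic volume entering the node per destination
    Q = {j: sum(q_in[mu] * gamma_in[mu].get(j, 0.0) for mu in q_in) for j in dests}
    q_out, gamma_out = {}, {}
    for m, split in beta.items():
        # eqn. (15): flow entering leaving link m
        q_out[m] = sum(Q[j] * rate for j, rate in split.items())
        # eqn. (16): composition rates at the entrance of link m
        gamma_out[m] = ({j: Q[j] * rate / q_out[m] for j, rate in split.items()}
                        if q_out[m] > 0 else {j: 0.0 for j in split})
    return Q, q_out, gamma_out

# 92% of the flow is bound for D1, 4% for D2 (main route), 4% for D3 (alternative route);
# D1 traffic splits 60/40 between the two routes.
Q, q_out, gamma_out = destination_node(
    q_in={"L0": 5000.0},
    gamma_in={"L0": {"D1": 0.92, "D2": 0.04, "D3": 0.04}},
    beta={"main": {"D1": 0.6, "D2": 1.0}, "alt": {"D1": 0.4, "D3": 1.0}},
)
```

For an inflow of 5000 veh/h this gives 2960 veh/h on the main route and 2040 veh/h on the alternative route, and the composition rates on each leaving link sum to one by construction.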
Another required addition to the destination oriented model is the consideration of partial
densities per destination. The partial density $\rho_{m,i,j}(k)$ is the density of vehicles in
segment $i$ of link $m$ at time $kT$ destined to destination $j \in J_m$. Equation (1) then becomes

$$\rho_{m,i,j}(k+1) = \rho_{m,i,j}(k) + \frac{T}{L_m \lambda_m} \cdot [\gamma_{m,i-1,j}(k) \cdot q_{m,i-1}(k) - \gamma_{m,i,j}(k) \cdot q_{m,i}(k)] \qquad (17)$$

where $\gamma_{m,i,j}(k) = \rho_{m,i,j}(k) / \rho_{m,i}(k)$. Similarly, for the queues of
store-and-forward and origin links, the notion of partial queues is introduced. Partial queues at
an origin link evolve according to the relationship (18), where $w_{o,j}(k)$ is the number of
vehicles in the queue of origin link $o$ with destination $j$ at time $kT$.
An analogous conservation equation applies to the partial queues of store-and-forward links.
For modelling various network topologies, AMOC follows METANET's nomenclature, i.e.
only three links may be attached to a network node, either leaving or entering. More complex
topologies are decomposed with the use of dummy links. This means that there are three
possible configurations for a node: one in-link and one out-link (continuation of a stretch), two
in-links and one out-link (merging of two motorways or on-ramp), or one in-link and two
out-links (bifurcation or off-ramp). The role of a VMS at a bifurcation node is to recommend to
the drivers heading to a certain destination the optimal route to that destination. This
recommendation affects the average drivers' behaviour, depending on their compliance with the VMS.
Since the routing message refers to particular destinations, the influence on the route choice is
projected onto the corresponding splitting rates of the node. In the presence of a VMS at
bifurcation node $n$ for destination $j$, an indicated splitting rate $\beta_{VMS,n,j}^m$ ($m$ being
the main out-link of node $n$ towards $j$), which is a control input, can be defined. If the sign
guides drivers at node $n$ via the main route towards their destination $j$, $\beta_{VMS,n,j}^m$ is
equal to 1. If the sign indicates the alternative route, $\beta_{VMS,n,j}^m$ is zero. The relation
between $\beta_{VMS,n,j}^m$ and the resulting real splitting rate $\beta_{n,j}^m$ is modelled by the
following equation

$$\beta_{n,j}^m = (1 - \varepsilon) \cdot \beta_{N,n,j}^m + \varepsilon \cdot \beta_{VMS,n,j}^m \qquad (19)$$

where $\varepsilon$ is the compliance rate to the route recommendation ($0 \le \varepsilon \le 1$)
and $\beta_{N,n,j}^m$ is the portion of vehicles that take the main route in the absence of any
route recommendation. In this paper, it is assumed for simplicity that $\varepsilon = 1$, i.e. all
drivers follow the VMS indications. Moreover, $\beta_{VMS,n,j}^m$ is considered as a continuous
variable within the range [0, 1] that can eventually be transformed to a binary sequence (see
Messmer and Papageorgiou, 1995).
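As a brief numerical illustration of the compliance model (19), the following Python sketch blends the nominal splitting rate with the one indicated by the VMS. The function name `real_splitting_rate` and the example values (a nominal rate of 0.6 and a compliance rate of 0.3) are hypothetical.

```python
def real_splitting_rate(beta_nominal, beta_vms, compliance):
    """Splitting rate towards the main route under a VMS recommendation, eqn. (19).

    beta_nominal : share taking the main route without any recommendation
    beta_vms     : share indicated by the VMS (1 = main route, 0 = alternative route)
    compliance   : compliance rate, between 0 and 1
    """
    if not 0.0 <= compliance <= 1.0:
        raise ValueError("compliance rate must lie in [0, 1]")
    return (1.0 - compliance) * beta_nominal + compliance * beta_vms

partial = real_splitting_rate(0.6, 0.0, 0.3)   # 30% of the drivers obey the sign
full = real_splitting_rate(0.6, 0.25, 1.0)     # full compliance, as assumed in the paper
```

With 30% compliance a recommendation for the alternative route only lowers the main-route share from 0.6 to 0.42, whereas with full compliance the real splitting rate coincides with the indicated one.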
The state and control variables
Based on the previous subsections, it may be seen that, when the traffic model operates in the
non-destination oriented mode, the state equations (22) are obtained by the substitution of
the static equations into the dynamic ones, i.e. by substituting (2), (10), (11) into (1); (3), (12),
(13) into (4); and (5), (6), (7), (9) into (8). Thus the state vector $\mathbf{x}$ consists of the
densities $\rho_{m,i}$ and mean speeds $v_{m,i}$ of every segment $i$ of every link $m$, the queues
$w_{saf}$ of every saf-link, and the queues $w_o$ of every origin link.
When the traffic model operates in the destination oriented mode, the state equations (22)
are obtained by substituting (2), (14), (15), (16) into (17); (3), (12), (13) into (4); and (5), (6),
(7), (9) into (18). The state vector $\mathbf{x}$ consists of the partial densities $\rho_{m,i,j}$ of
every segment $i$ and reachable destination $j$ from link $m$, the mean speeds $v_{m,i}$ of every
segment $i$ of every link $m$, the partial queues $w_{saf,j}$ of every saf-link, and the partial
queues $w_{o,j}$ of every origin link.
The control vector consists of the splitting rates $\beta_{n,j}^m$ at bifurcation nodes where VMS
are installed (only in the destination oriented mode), and of the metering rates $r$ of mtm and
ramp metering control measures (any mode), while the admissible control region for all control
variables is included in [0, 1].
The process disturbance vector d in the non-destination oriented mode consists of the demands
at the origin links and the turning rates at each bifurcation, while in the destination oriented
mode it consists of the demands and composition rates at the origin links, the splitting rates at
bifurcations which are not controlled by a VMS, and the drivers' compliance rates to VMS in-
dications. These disturbances are assumed to be known for the entire time horizon.
The cost criterion
The cost criterion in an optimal control problem may be arbitrary. In the particular case of the
motorway traffic process, possible cost criteria that may be used are: the Total Time Spent
(TTS), the Total Waiting Time (TWT) at origins and/or saf links' queues, the Total Distance
Travelled (TDT), or the Total Amount of Fuel Consumed. A cost criterion can also be
produced from a combination of the above criteria. The chosen cost criterion has the following
form
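
$$J = T \sum_{k} \Big\{ \sum_{m} \sum_{i} \rho_{m,i}(k)\, L_m \lambda_m + \sum_{o} w_o(k) + a_f \sum_{j} [u_j(k) - u_j(k-1)]^2 + a_w \sum_{o} \psi_o(k)^2 \Big\} \qquad (20a)$$

with

$$\psi_o(k) = \begin{cases} 0 & \text{if } w_o(k) \le w_{o,\max} \\ w_o(k) - w_{o,\max} & \text{else} \end{cases} \qquad (20b)$$
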
where $a_f$, $a_w$ are weighting factors and $j$ is the index of the control variables. This
criterion, excluding the weighted terms, corresponds to the total time spent by all vehicles in the
network and its origins during the considered time horizon. The term with weight $a_f$ is included
in the cost criterion so that high-frequency oscillations of the control trajectories are
suppressed. The contribution of this term to the total cost criterion is very small compared with
the contribution of TTS in all cases examined. The last additional weighted term is a penalty term
included in the cost criterion in order to enable the control strategy to limit the queue lengths
at the origins if, and to the level, desired. The parameters $w_{o,\max}$ are predetermined
constants that express the maximum desirable number of vehicles waiting at any time period in
origin $o$'s queue.
A noteworthy point related to the cost criterion (20) is that the minimisation of TTS results in
maximisation of the time-weighted total outflow from the network $T \sum_{k} (K - k)\, S(k)$, where
$S(k)$ is the sum of all traffic volumes exiting the motorway network at period $k$. This is a
fairly general result, independent of the macroscopic motorway traffic model used (see
Papageorgiou, 1983). Note also that the cost criterion (20) has the form of the cost functional (21).
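The unweighted part of the criterion, the total time spent, is simply the number of vehicles present on the motorway segments and in the origin queues, summed over all time steps and multiplied by the step length. A minimal Python sketch under that reading follows; the function names, the data layout (one list per time step) and the numbers are hypothetical, and `queue_excess` assumes that the penalty in (20b) is zero as long as the queue stays below its desired maximum.

```python
def total_time_spent(rho, w, *, T, L, lanes):
    """Total time spent (veh-h) on the segments and in the origin queues.

    rho[k][s] : density of segment s at time step k (veh/lane-km)
    w[k][o]   : queue length of origin o at time step k (veh)
    T         : time step (h); L[s], lanes[s] : segment length (km) and lane count
    """
    tts = 0.0
    for rho_k, w_k in zip(rho, w):
        on_road = sum(r * L[s] * lanes[s] for s, r in enumerate(rho_k))  # vehicles on segments
        tts += T * (on_road + sum(w_k))
    return tts

def queue_excess(w_o, w_max):
    """Queue length in excess of the desired maximum (zero if the limit is respected)."""
    return max(0.0, w_o - w_max)

# Two time steps of 10 s, one 1 km three-lane segment at 20 veh/lane-km, one queue of 10 veh.
tts = total_time_spent([[20.0], [20.0]], [[10.0], [10.0]],
                       T=10 / 3600, L=[1.0], lanes=[3])
```

In the example 70 vehicles are present during two steps of 10 s each, which gives 1400 veh-s or roughly 0.39 veh-h.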
NUMERICAL SOLUTION ALGORITHM
A discrete-time optimal control problem
The integrated motorway control task is formulated as a dynamic optimal control problem with
constrained control variables which can be solved numerically over a given time horizon. The
general discrete-time formulation of the optimal control problem reads
Minimise

$$J = \vartheta[\mathbf{x}(K)] + \sum_{k=0}^{K-1} \varphi[\mathbf{x}(k), \mathbf{u}(k), \mathbf{d}(k)] \qquad (21)$$

subject to

$$\mathbf{x}(k+1) = \mathbf{f}[\mathbf{x}(k), \mathbf{u}(k), \mathbf{d}(k)], \quad \mathbf{x}(0) = \mathbf{x}_0 \qquad (22)$$

$$u_{i,\min} \le u_i(k) \le u_{i,\max} \quad \forall i \qquad (23)$$

where $K$ is the considered time horizon, $\mathbf{x} \in \mathbb{R}^n$ is the state vector, and
$\mathbf{u} \in \mathbb{R}^m$ is the vector of control variables. The cost functional (21) expresses
the objectives of control in mathematical terms, whereby $\varphi$, $\vartheta$ are arbitrary,
twice differentiable, nonlinear cost functions. The state equation (22), with known initial state
$\mathbf{x}_0$, describes the dynamic process behaviour (nonlinear model), while $\mathbf{d}$ is a
vector comprising all disturbances acting on the process which are assumed to be known over the
time horizon. Equation (23) expresses the problem's control constraints.
Numerical solution
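The numerical algorithm used for the solution of the discrete-time optimal control problem is
outlined in this subsection (for more details see Papageorgiou and Marinaki, 1995). For a given
admissible trajectory $\mathbf{u}(k)$, $k = 0, \ldots, K-1$, the corresponding state trajectory
$\mathbf{x}(k)$ can be found by solving (22) starting with the known initial state, and hence the
cost criterion can be regarded as depending on the control variables only, i.e.
$J = J(\mathbf{u})$. The gradient of $J$ with respect to $\mathbf{u}$ on the equality constraints
surface is given by

$$\mathbf{g}(k) = \partial J / \partial \mathbf{u}(k) = \partial \varphi / \partial \mathbf{u}(k) + [\partial \mathbf{f} / \partial \mathbf{u}(k)]^T \cdot \boldsymbol{\lambda}(k+1) \qquad (24)$$

where the co-state vector $\boldsymbol{\lambda} \in \mathbb{R}^n$ (not to be confused with the number
of lanes $\lambda_m$ of link $m$) satisfies

$$\boldsymbol{\lambda}(k) = \partial \varphi / \partial \mathbf{x}(k) + [\partial \mathbf{f} / \partial \mathbf{x}(k)]^T \cdot \boldsymbol{\lambda}(k+1), \quad k = 0, \ldots, K-1 \qquad (25)$$

$$\boldsymbol{\lambda}(K) = \partial \vartheta / \partial \mathbf{x}(K). \qquad (26)$$

A projected gradient $\boldsymbol{\gamma}(k)$ is defined to have its component $\gamma_i(k)$ equal to
$g_i(k)$ if none of the corresponding bounds (23) is active, and $\gamma_i(k) = 0$ else. The scalar
product of two vector trajectories $\boldsymbol{\eta}(k)$, $\boldsymbol{\xi}(k)$,
$k = 0, \ldots, K-1$, is defined as follows

$$(\boldsymbol{\eta}, \boldsymbol{\xi}) = \sum_{k=0}^{K-1} \boldsymbol{\eta}(k)^T \boldsymbol{\xi}(k) \qquad (27)$$

Furthermore we define a saturation vector function $\mathbf{sat}(\boldsymbol{\eta})$ with components

$$\mathrm{sat}_i(\boldsymbol{\eta}) = \begin{cases} u_{i,\max} & \text{if } \eta_i > u_{i,\max} \\ u_{i,\min} & \text{if } \eta_i < u_{i,\min} \\ \eta_i & \text{else.} \end{cases} \qquad (28)$$

The necessary conditions of optimality are given by (22), (23), (25), (26) and $\gamma_i(k) = 0$
$\forall i, k$. If these equations are satisfied simultaneously by some trajectories
$\mathbf{x}(k)$, $\mathbf{u}(k)$, and $\boldsymbol{\lambda}(k)$, a stationary point of the optimal
control problem has been found. A well-known solution algorithm can be described as follows: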
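Step 1) Select an admissible initial control trajectory $\mathbf{u}^{(0)}(k)$, $k = 0, \ldots, K-1$;
set the iteration index $\iota = 0$.
Step 2) Using $\mathbf{u}^{(\iota)}(k)$ solve (22) from the known initial condition to obtain
$\mathbf{x}^{(\iota)}(k+1)$; using $\mathbf{x}^{(\iota)}(k+1)$ and $\mathbf{u}^{(\iota)}(k)$ solve (25)
from the terminal condition (26) to obtain $\boldsymbol{\lambda}^{(\iota)}(k+1)$, $k = 0, \ldots, K-1$.
Step 3) Using $\mathbf{x}^{(\iota)}(k+1)$, $\mathbf{u}^{(\iota)}(k)$, and
$\boldsymbol{\lambda}^{(\iota)}(k+1)$, calculate the gradients $\mathbf{g}^{(\iota)}(k)$ and
$\boldsymbol{\gamma}^{(\iota)}(k)$, $k = 0, \ldots, K-1$.
Step 4) Specify a search direction $\mathbf{p}^{(\iota)}(k)$, $k = 0, \ldots, K-1$ (see below).
Step 5) Apply a one-dimensional search routine along the $\mathbf{p}^{(\iota)}$-direction to obtain
a new, improved admissible control trajectory $\mathbf{u}^{(\iota+1)}(k)$, i.e.
$$\alpha^{(\iota)} = \arg\min_{\alpha} J\,[\mathbf{sat}(\mathbf{u}^{(\iota)} + \alpha \cdot \mathbf{p}^{(\iota)})]$$
where $\alpha^{(\iota)}$ is the resulting optimal scalar step length, and set
$$\mathbf{u}^{(\iota+1)}(k) = \mathbf{sat}[\mathbf{u}^{(\iota)}(k) + \alpha^{(\iota)} \cdot \mathbf{p}^{(\iota)}(k)], \quad k = 0, \ldots, K-1.$$
Step 6) If, for a given scalar $\sigma > 0$, the condition
$|J^{(\iota+1)} - J^{(\iota)}| / J^{(\iota)} < \sigma$ is satisfied, stop; otherwise set
$\iota := \iota + 1$ and go back to Step 2.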
Several techniques for specifying a search direction $\mathbf{p}^{(\iota)}$ in Step 4) of the
algorithm can be applied, including steepest descent, conjugate gradient, and variable metric. In
the examples presented here, the Polak-Ribière conjugate gradient search direction was used. The
one-dimensional search required in Step 5) of the algorithm makes use of cubic interpolation within
an appropriate frame until a suitable stop criterion is fulfilled (see Papageorgiou and Marinaki,
1995).
The algorithm's convergence to a global minimum is not guaranteed, but previous experiences
indicate that a physically satisfactory minimum is always achieved. In some cases, when the
algorithm was started with different initial control trajectories, convergence to different local
minima occurred. The corresponding differences in the cost criterion's values, however, were
insignificant.
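The six steps can be tried out on a toy problem. The following Python sketch is not AMOC: it assumes a hypothetical scalar process $x(k+1) = x(k) + T\,[d(k) - C\,u(k)]$ with a quadratic cost, controls bounded in [0, 1], a Polak-Ribière direction with a non-negativity safeguard, and a simple backtracking line search in place of the cubic interpolation mentioned above. All names and constants are illustrative.

```python
import numpy as np

T, C, R = 0.1, 5.0, 0.01        # toy step length, control gain, control weight
U_MIN, U_MAX = 0.0, 1.0         # control bounds, cf. (23)

def simulate(u, d, x0):
    """Solve the state equation (22) forward and evaluate a cost of the form (21)."""
    x = np.empty(len(u) + 1)
    x[0] = x0
    for k in range(len(u)):
        x[k + 1] = x[k] + T * (d[k] - C * u[k])
    return x, float(np.sum(x[:-1] ** 2 + R * u ** 2) + x[-1] ** 2)

def gradient(u, d, x0):
    """Backward co-state recursion (25)-(26) and gradient (24) for the toy problem."""
    x, _ = simulate(u, d, x0)
    K = len(u)
    lam = np.empty(K + 1)
    lam[K] = 2.0 * x[K]                               # (26)
    g = np.empty(K)
    for k in range(K - 1, -1, -1):
        g[k] = 2.0 * R * u[k] - T * C * lam[k + 1]    # (24): df/du = -T*C
        lam[k] = 2.0 * x[k] + lam[k + 1]              # (25): df/dx = 1
    return g

def projected(g, u):
    """Zero the gradient components that would push a control beyond an active bound."""
    blocked = ((u <= U_MIN) & (g > 0)) | ((u >= U_MAX) & (g < 0))
    return np.where(blocked, 0.0, g)

def optimise(d, x0, u0, sigma=1e-8, max_iter=200):
    u = np.clip(u0, U_MIN, U_MAX)                     # Step 1
    _, cost = simulate(u, d, x0)
    p_old = gam_old = None
    for _ in range(max_iter):
        gam = projected(gradient(u, d, x0), u)        # Steps 2 and 3
        if not gam.any():
            break                                     # stationary point reached
        p = -gam                                      # Step 4: steepest descent ...
        if p_old is not None:                         # ... or Polak-Ribiere update
            pr = max(0.0, gam @ (gam - gam_old) / (gam_old @ gam_old))
            p = -gam + pr * p_old
            if p @ gam >= 0:                          # not a descent direction: restart
                p = -gam
        alpha = 1.0                                   # Step 5: backtracking search
        while alpha > 1e-12:
            u_try = np.clip(u + alpha * p, U_MIN, U_MAX)   # saturation (28)
            _, new_cost = simulate(u_try, d, x0)
            if new_cost < cost:
                break
            alpha *= 0.5
        else:
            break                                     # no improving step found
        converged = abs(new_cost - cost) / cost < sigma    # Step 6
        u, cost, p_old, gam_old = u_try, new_cost, p, gam
        if converged:
            break
    return u, cost

d = np.full(30, 2.0)            # constant disturbance (hypothetical)
u0 = np.zeros(30)               # admissible initial control trajectory
_, cost0 = simulate(u0, d, 1.0)
u_opt, cost_opt = optimise(d, 1.0, u0)
```

The gradient is obtained from one forward pass for the state and one backward pass for the co-state, so its cost does not grow with the number of control variables, which is what makes this class of algorithms attractive for long horizons. As noted in the text, only convergence to a local minimum can be expected in general.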
COORDINATED AND INTEGRATED CONTROL STRATEGIES
Background information
Two hypothetical networks were considered, a plain stretch (Figure 1) with one on-ramp and
one off-ramp and a double-route network (Figure 6) with on-ramps and off-ramps attached at
its routes from the main origin to the main destination. In the first network, the only control
measure available is ramp metering. The problem at hand is the determination of an optimal
ramp metering trajectory over the entire time horizon (2.5 h) considered. In the second net-
work, the control measures available are one VMS located at the network's main bifurcation
and the ramp metering at the on-ramps of both routes. Thus the problem is that of the determi-
nation of optimal splitting rates and ramp metering control trajectories. The time horizon for
this example is 3 h. Because this network provides more space for different combinations of
control measures, the case of coordinated ramp metering (without the use of the VMS) is first
presented, followed by the case where no ramp metering is available, i.e. only route guidance is
active. Fully integrated control with and without a congested initial traffic state is studied next.
Finally, the case of integrated control with queue constraints is examined.
The network's characteristics remain the same for all cases. Table 1 displays the utilised model
parameter values that are common to both networks. The utilised values of $v_{f,m}$,
$\rho_{cr,m}$, and $a_m$ result in a capacity of 2000 veh/h/lane for the motorway links. Moreover,
the capacity of origin links (on-ramps) was set to $Q_o = 1500$ veh/h/lane. The time step is
$T = 10$ s, and it was assumed that the drivers' compliance rate to the VMS signals is 100%.
Finally, $a_f = 1000$ veh and, unless otherwise stated, $a_w = 0$ was used.
Test network 1
Network description and demand scenario. The first network considered (Figure 1) consists of
two origins, three motorway links, and two destinations. It is a motorway stretch with an
off-ramp located 0.5 km before an on-ramp. O1 is the main origin and has three lanes with capacity
2000 veh/h each. The motorway link L1 follows with three lanes, and it is 2 km long with the
off-ramp D2 attached at its end. The next motorway link also has three lanes and a length of
0.5 km, with a single-lane on-ramp O2 attached at its end, while the last motorway link (2 km
long) also has three lanes. Each of the links' segments is 1 km long, except for the single
segment of the 0.5 km link. The free speed in the network is $v_{f,m} = 102$ km/h and
$a_m = 1.867$. Trapezoidal demands have been used for the two origins (Figure 2). It was assumed
for the entire time horizon that 5% of the demand originating at O1 has as its destination the
off-ramp D2 and the remaining 95% is destined to D1.
Results and discussion. In the no-control case, congestion builds up just downstream of the
on-ramp because the sum of the flow coming from the upstream link and of the demand from the
on-ramp exceeds the capacity of the downstream link, despite the fact that some of the vehicles
have left the network from the off-ramp. This congestion propagates backwards and blocks D2's
upstream segment. The congested mainline reduces the flow of vehicles exiting the network at D1,
but the flow of vehicles leaving the network from D2 is also reduced since the congestion
propagates upstream of the off-ramp. These are the reasons for activating ramp metering during the
peak hour (Figure 3), resulting in an increase of the queue of O2 (Figure 4) under AMOC
application. Under optimal control, the network's outflow obtains higher values in an early phase
(Figure 5). The cost criterion of TTS is thus improved from 897 veh-h in the no-control
case to 834 veh-h in the control case, which corresponds to an improvement of 7%. It
should be noted that the TTS is calculated over the entire time horizon, not only during the
period of activation of ramp metering, which means that there is a systematic underestimation of
the improvement resulting from control application.
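The quoted improvement follows directly from the two TTS values given above:

$$\frac{897 - 834}{897} = \frac{63}{897} \approx 0.070,$$

i.e. a relative reduction of about 7.0% of the total time spent over the whole 2.5 h horizon.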
Table 1. Utilised parameter values for the traffic flow model.
Parameter   τ    κ             ν       ρmax          δ        φ      ρcr,m
Value       18   40            60      180           0.0122   2.98   33.5
Unit        s    veh/lane-km   km²/h   veh/lane-km   -        -      veh/lane-km
Figure 1. Test network 1.
Figure 2. Demand profiles for network 1 (horizontal axis: time, seconds x 10).
Figure 3. Optimal control trajectory, network 1 (horizontal axis: time, seconds x 10).
Figure 4. Queue #2, network 1.
Figure 5. Total network outflows, network 1.
Test network 2
Network description and demand scenario. Network 2 (Figure 6) firstly includes a single
motorway link L0 that is 2 km long and has four lanes. At the end of L0 there is a bifurcation
where two routes, the main and the alternative, lead to the main destination D1. The length of
the main route is 6.5 km while the length of the alternative route is 8.5 km. The main route
consists of the links L1, L4, and L3, each with 2 lanes and respective lengths of 3, 0.5, and 3
km. Along the main route, the off-ramp D2 and the single-lane on-ramp O2 exist. The
alternative route consists of the links L2, L5, and L6, each with 2 lanes and respective lengths
of 4, 0.5, and 4 km. Along the alternative route, the off-ramp D3 and the single-lane on-ramp O3
exist. Again the links' segment lengths are 1 km except for the links L4 and L5, which are
one-segment links with length 0.5 km. The free speed in the network is $v_{f,m} = 110$ km/h and
$a_m = 1.636$.
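The stated motorway capacity of 2000 veh/h/lane can be checked against the parameter values quoted for the two networks. The sketch below assumes the exponential fundamental diagram of eqn. (3), for which the flow per lane $\rho \cdot V(\rho)$ is maximal at the critical density and equals $v_f \, \rho_{cr} \exp(-1/a)$ there; the function name `capacity_per_lane` is hypothetical.

```python
import math

def capacity_per_lane(v_free, rho_cr, a):
    """Maximum of rho * V(rho) for V(rho) = v_free * exp(-(1/a) * (rho/rho_cr)**a).

    Setting the derivative of rho * V(rho) to zero gives rho = rho_cr,
    so the capacity is v_free * rho_cr * exp(-1/a) (veh/h/lane).
    """
    return v_free * rho_cr * math.exp(-1.0 / a)

cap_network_1 = capacity_per_lane(102.0, 33.5, 1.867)   # free speed 102 km/h
cap_network_2 = capacity_per_lane(110.0, 33.5, 1.636)   # free speed 110 km/h
```

Both parameter sets give a value within about one vehicle per hour of 2000 veh/h/lane, which shows how the higher free speed of network 2 is compensated by a smaller exponent so that the capacity is the same in both networks.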
Trapezoidal demands have been used for the three origins of the network (Figure 7). When no
VMS control is considered, it is assumed that 60% of the drivers who have D1 as their destination
take the main route and the remaining 40% take the alternative route. This assumption is
somewhat unrealistic, but it is included here for illustrative purposes. It is also assumed that
4% of the demand in O1 has D2 as its destination, another 4% has D3 as its destination, and the
remaining 92% is destined to D1. These values remain the same over the entire time horizon.
Figure 6. Network 2.
Figure 7. Demand profile for network 2.
Figure 8. Optimal control trajectories, coordinated ramp metering, network 2.
The no-control case. When no control measures are applied to network 2 for the described
demand scenario, congestion appears. At the main route, just downstream of the origin O2,
congestion appears during the peak hour for the same reasons as in network 1 discussed earlier.
This congestion is quite severe and it propagates backwards all the way to the main origin O1,
where a queue is formed. The blocking of L0 results in smaller outflow towards the alternative
route and therefore in underutilisation of its capacity. For this reason, any control strategy
aiming at minimising the TTS should focus on reducing as much as possible, or even eliminating,
this congestion.
For this example, three control scenarios have been tested. The first one concerns only coordi-
nated ramp metering at both on-ramps while routing is identical to the no-control case. The
second concerns VMS control only, without any ramp metering, and finally the third scenario
is concerned with integrated control, meaning that VMS control is acting cooperatively with
ramp metering control measures in order for the common criterion of TTS to be minimised.
Coordinated ramp metering. When only coordinated ramp metering is considered then, as can
be seen from Figure 8, the only on-ramp that is metered is O2, resulting in a long queue. Ramp
metering in the main route is required for the elimination of the congestion. Because there is no
VMS control, 60% of the drivers having D1 as their destination use the main route, leading to
flows that exceed the link capacity. The control strategy, by optimising the inflow into the
motorway from O2, builds up a long queue (Figure 9) but in this way allows higher flows at its
upstream and downstream links without the formation of congestion, as can be seen from
Figures 10-13.
Because ramp metering in O2 significantly ameliorates traffic conditions in the main route,
there is no congestion to propagate backwards and block L0 as in the no-control case.
Nevertheless, no ramp metering is needed in the alternative route because the flow destined to D3
and the 40% of the flow destined to D1 do not exceed the route's capacity, even after the on-ramp
inflow is added. Figure 14 depicts the flow of segment L6,1, which is located downstream of O3.
Thus the control measures taken in the main route also affect the alternative route positively.
From Figure 15 it can be seen that the network's total outflow is larger during the ramp metering
activation compared with the no-control case during the same period, whereas when the
peak hour demand is accommodated, the outflow becomes smaller than in the no-control case.
The difference observed at the network's outflows between the no-control and the control cases
mirrors the large differences of TTS between them, namely 3258 veh-h and 2721 veh-h in the
no-control and control cases, respectively, i.e. ramp metering improves the TTS by 16%.
Figure 9. Queue #2, coordinated ramp metering, network 2.
Figure 10. Flow LSJ, coordinated ramp metering, network 2.
Figure 11. Speed LSJ, coordinated ramp metering, network 2.
Figure 12. Flow, coordinated ramp metering, network 2.
Figure 13. Speed Lu, coordinated ramp metering, network 2 (horizontal axis: time, seconds x 10).
Figure 14. Flow L6,1, coordinated ramp metering, network 2 (horizontal axis: time, seconds x 10).
Figure 15. Total network outflow, coordinated ramp metering, network 2 (horizontal axis: time, seconds x 10).
Figure 16. Optimal control trajectory, route guidance, network 2 (horizontal axis: time, seconds x 10).
Route guidance. In this control scenario it is assumed that there is no possibility of ramp
metering in either of the two on-ramps. The only control measure available is route guidance via a
VMS installed at the network's bifurcation. The VMS can influence the drivers' decisions
concerning what route to take towards destination D1. Obviously the routing choice of those
drivers that are not destined to D1 cannot be influenced by a VMS. Thus the VMS's task is to alter
the percentages of 60% and 40% of the drivers that use the main and the alternative route,
respectively, in the no-control case, in such a manner that the congestion formed in the absence
of control is reduced or even eliminated. These percentages are the splitting rates at the
bifurcation node. The optimal splitting rates for the entire time horizon can be seen in Figure 16.
Despite the optimal splitting, avoiding the congestion is not possible due to excessive demand.
In fact the determined optimal control trajectory results in the elimination of the congestion in
the main route but creates a new one in the alternative (longer) route. In Figure 17 the speed at
L3,1 is shown. It is clear that the congestion that existed in the no-control case is no longer
present. This has a positive effect on the queue formed at the on-ramp, which is
decreased, and also on the off-ramp flow, which is increased.
The point where congestion begins in the alternative route is just downstream of the on-ramp,
i.e. at segment L6,1. Figure 18 depicts the speed trajectories for this segment in the control and
no-control cases. This congestion propagates backwards but it ends at the first segment of link
L2, i.e. L2,1 (Figure 19). This means that the link L0 remains virtually uncongested and no
queue is formed at the main origin. This behaviour is advantageous because the bifurcation
remains uncongested and therefore the flow that is assigned by the VMS to the main route
remains unobstructed, contrary to the congestion formed in the no-control case. The reason for
the optimal control strategy to "choose" the alternative route as the location of the unavoidable
congestion is that it is longer, hence more vehicles can be stored therein without blocking L0.
On the other hand, the results of this "well-behaved", unavoidable congestion in the alternative
route are an increase of the queue formed at O3 and a decrease of the outflow from the off-ramp
D3. The overall network outflow, though, is increased (Figure 20). The early increase of the total
outflow is reflected in the improvement of the TTS, which amounts to 3258 veh-h in the
no-control case and is reduced to 2316 veh-h when route guidance is applied. Thus by making optimal
route recommendations to the drivers, the TTS is improved by 29%.
Integrated control. In this control scenario it is assumed that both ramp metering and route
guidance control measures are possible. A control strategy that addresses all possible
control measures simultaneously must have them cooperating towards a common objective, which in this case is
the minimisation of the TTS. The control trajectories determined by AMOC for this case (Figure
21) indicate that the control strategy divides the flow almost equally between the two routes to
D1 and activates ramp metering at both on-ramps during the peak hour. This leads to the
elimination of the congestion in both routes. As can be seen from Figure 22, there is no congestion
downstream of O2, where the flow is at capacity (Figure 23). The queue at on-ramp O2 is
increased because of the metering actions, as can be seen in Figure 24. In the alternative route,
although it carries higher traffic volumes than in the no-control case, the congestion is avoided by
metering the on-ramp O3 during the peak hour. As can be seen from Figure 25, the speed
downstream of O3 is decreased but larger flows occur (Figure 26). Ramp metering results in a larger
queue at O3 (Figure 27).
The total network outflow (Figure 28) is at the capacity level for a prolonged time period which
demonstrates the efficiency of the integrated control in the network examined. The TTS in the
no-control case is, as was mentioned earlier, 3258 veh-h while with application of integrated
control it becomes 2264 veh-h which corresponds to an improvement of 30.5%.
Figure 17. Speed L3,1, route guidance, network 2.
Figure 18. Speed L6,1, route guidance, network 2.
Figure 19. Speed L2,1, route guidance, network 2.
Figure 20. Total network outflow, route guidance, network 2.
Figure 21. Optimal control trajectories, integrated control, network 2.
Figure 22. Speed L3,1, integrated control, network 2.
Figure 23. Flow L3,1, integrated control, network 2.
Figure 24. Queue O2, integrated control, network 2.
Figure 25. Speed L6,1, integrated control, network 2.
Figure 26. Flow L6,1, integrated control, network 2.
Figure 27. Queue O3, integrated control, network 2.
Figure 28. Total network outflow, integrated control, network 2.
Integrated control with congested initial state. This scenario is identical to the previous one,
the only difference being that in the present case there is congestion in the initial state of the
network, which was not present in the previous scenario. The initial congestion is located in
the main route; more specifically, it begins at L3,2, where the density is 140 veh/lane-km,
extends back to L3,1, where the density is 100 veh/lane-km, and stops at L4,1, where the
density is 60 veh/lane-km. The remaining segments are in free flow conditions.
Figure 29 presents the control trajectories for this scenario. Compared with Figure 21, the
ramp metering trajectories are identical, and the splitting rate trajectories are also the same
except for a short period at the beginning of the time horizon. This occurs because, when
there is an initial congestion, the integrated control strategy diverts vehicles destined to D1 to
the alternative route until the congestion is cleared in the main route. After congestion
clearance, the strategy begins to divide the volumes destined to D1 almost equally and initiates
ramp metering at the peak hour, just as in the previous control scenario.
The TTS in the no-control case with initial congestion is 3722 veh-h and reduces when inte-
grated control is applied to 2383 veh-h, corresponding to an improvement of 36%.
Integrated control with maximum queue constraints. This scenario is similar to that of
integrated control without initial congestion but with the addition of maximum queue constraints
w_O2,max = w_O3,max = 200 veh and a_w = 0.01 h. The resulting optimal control trajectories may be
seen in Figure 30. The control strategy adheres to the maximum queue constraint while
minimising the TTS. The result of this trade-off may be seen in Figures 31 and 32. The
characteristic feature of both queues is that, during the peak hour, ramp metering keeps the
queues at the maximum desired level of 200 vehicles. The resulting traffic conditions remain
uncongested although they are closer to critical conditions than in the unconstrained case
(Figures 33, 34).
Figure 29. Optimal control trajectories, integrated control with initial congestion, network 2.
Figure 30. Control trajectories, network 2, integrated control with queue constraints.
Figure 31. Queue O2, network 2, integrated control with queue constraints.
Figure 32. Queue O3, network 2, integrated control with queue constraints.
Figure 33. Speed L3,1, network 2, integrated control with queue constraints.
Figure 34. Speed L6,1, network 2, integrated control with queue constraints.
In the control case of this scenario, the TTS becomes 2273 veh-h which is larger than 2264
veh-h achieved in absence of queue constraints but still significantly smaller than the 3258
veh-h of the no-control case. Thus, when integrated control with queue constraints is applied, a
30.2% decrease in the TTS, versus 30.5% of the unconstrained case, is achieved.
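The percentage improvements quoted for the individual scenarios follow directly from the reported TTS values. The short Python sketch below recomputes them; the dictionary keys and the function name are illustrative choices, only the veh-h figures are taken from the text.

```python
# Total time spent (TTS, veh-h) as reported in the text for network 2.
NO_CONTROL_TTS = 3258.0

scenario_tts = {
    "ramp metering": 2721.0,
    "route guidance": 2316.0,
    "integrated control": 2264.0,
    "integrated control with queue constraints": 2273.0,
}


def improvement_percent(baseline, controlled):
    """Relative TTS reduction with respect to the baseline, in percent."""
    return 100.0 * (baseline - controlled) / baseline


improvements = {
    name: improvement_percent(NO_CONTROL_TTS, tts)
    for name, tts in scenario_tts.items()
}

# The scenario with initial congestion has its own no-control baseline.
initial_congestion = improvement_percent(3722.0, 2383.0)

for name, pct in improvements.items():
    print(f"{name}: {pct:.1f}%")
print(f"integrated control, congested initial state: {initial_congestion:.1f}%")
```

Rounded to the precision used in the text, these values agree with the quoted 16%, 29%, 30.5%, 30.2% and 36%.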
A number of investigations were carried out using different values of the weighting factor a_w. The
results of these investigations with respect to the TTS and the maximum occurring queue
lengths are summarised in Table 2. It can be seen that both maximum queue lengths attained at
the on-ramps for every non-zero value of the weighting factor a_w are equal. The quadratic
nature of the penalty term included in the cost criterion (eqn. (20)) is responsible for this effect,
which is not present if no queue constraints are considered, i.e. if a_w = 0.
Table 2. TTS and maximum occurring queue lengths for different values of a_w

a_w (h)               1.0    1.0x10^-1  1.0x10^-2  1.0x10^-3  1.0x10^-4  1.0x10^-5  0.0
TTS (veh-h)           2273   2273       2273       2272       2271       2265       2264
Max. queue O2 (veh)   200    200        200        201        211        251        240
Max. queue O3 (veh)   200    200        200        200        211        254        281
The values of the maximum occurring queues are not equal for different a_w values. Because the
maximum queue length requirement is considered in the formulated optimal control problem
by the inclusion of a penalty term in the cost criterion, the degree of violation of the queue
thresholds depends upon the value of a_w. The more a_w decreases, the more the maximum queue
constraint is violated, and the TTS decreases, gradually approaching the integrated control case
without queue constraints. The fact that the TTS increases with a_w reflects the fact that the
maximum queue constraints do not allow the congestion in the network to be completely
eliminated, especially for the larger values of a_w.
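The trade-off between the TTS and the queue thresholds can be illustrated with a small Python sketch of a penalised cost. The exact penalty term of eqn. (20) is not reproduced in this section, so the quadratic form used here (a_w times the sum of squared excesses over the threshold) is an assumption, as are the function names and the sample queue trajectories.

```python
def queue_penalty(queues, w_max, a_w):
    """Assumed quadratic penalty: a_w * sum of squared queue excess over w_max."""
    return a_w * sum(max(0.0, w - w_max) ** 2 for w in queues)


def penalised_cost(tts, queues_by_ramp, w_max, a_w):
    """TTS plus the assumed penalty summed over all on-ramps."""
    penalty = sum(queue_penalty(q, w_max, a_w) for q in queues_by_ramp.values())
    return tts + penalty


# Hypothetical queue trajectories (veh) at the two on-ramps.
within_limit = {"O2": [150.0, 200.0, 180.0], "O3": [120.0, 200.0, 160.0]}
over_limit = {"O2": [150.0, 210.0, 210.0], "O3": [120.0, 200.0, 160.0]}

cost_within = penalised_cost(2273.0, within_limit, w_max=200.0, a_w=0.01)
cost_over = penalised_cost(2273.0, over_limit, w_max=200.0, a_w=0.01)
cost_unweighted = penalised_cost(2273.0, over_limit, w_max=200.0, a_w=0.0)

print(cost_within, cost_over, cost_unweighted)
```

With such a soft constraint a violation is never forbidden, only made more expensive; a smaller a_w makes the same excess cheaper, which is consistent with the larger maximum queues reported in Table 2 for small a_w.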
A final observation from Table 2 is that the TTS and the maximum occurring queues are not
very sensitive to changes in the value of a_w.
CONCLUSIONS AND FUTURE WORK
A generic approach and the corresponding software tool (AMOC) for optimal integrated
motorway traffic control were presented. Relatively simple examples and topologies were
discussed in detail in order to demonstrate the efficiency and the intelligent behaviour of
the optimal control approach for coordinated and integrated motorway traffic control. It is
worth noting that the results obtained via this approach for a particular network and demand
scenario must be carefully analysed in order to identify and understand the deeper reasons why
they are optimal. In this sense, optimal control may be viewed as a genuine source of artificial
intelligence, because the resulting control trajectories must be studied by the human developers
to clarify their significance; moreover, the approach may in many instances provide
quite surprising results, fairly different from those one would expect at first view. Clearly, this
intelligent control behaviour, whose complexity increases exponentially with the size of the
network, is far beyond the real-time capacities of any human operator or expert system
approach.
The solutions determined by AMOC are of an open-loop nature but they can be used within a
rolling horizon framework. The rolling horizon concept provides the way of embedding an ini-
tially open-loop optimal control in a closed-loop structure utilising real-time measurements.

More precisely, the optimal control problem may be solved in real time periodically, whereby
the disturbance predictions d(k) and the initial condition x0 are updated each time according to the
current measurements. The repetition period equals a certain control interval, while the
optimisation horizon reaches a longer period into the future in order to consider all future effects of
current control measures to a significant extent. This procedure allows for the use of current
measurements in order to reduce the impact of modelling and prediction errors. In the
examples discussed, the rolling horizon concept was not considered. It is part of future work that
will be conducted with real networks with a high number of control measures.
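The rolling horizon idea described above can be sketched in a few lines of Python. This is not AMOC: the single-queue plant, the greedy open-loop "optimiser" and all names and numbers below are hypothetical stand-ins, chosen only to show how re-solving from the measured state corrects a biased demand prediction.

```python
def rolling_horizon(x0, solve_open_loop, predict, plant_step,
                    horizon, interval, n_steps):
    """Re-solve an open-loop problem every `interval` steps from the measured state.

    Only the first `interval` controls of each plan are applied before re-planning.
    """
    x, k = x0, 0
    applied = []
    while k < n_steps:
        d_hat = predict(k, horizon)        # updated disturbance prediction
        plan = solve_open_loop(x, d_hat)   # open-loop plan from the current state
        for u in plan[: min(interval, n_steps - k)]:
            x = plant_step(x, u, k)        # the real system may deviate from the model
            applied.append(u)
            k += 1
    return applied


# Toy single-queue example (all values hypothetical).
CAPACITY = 9.0          # maximum outflow per step
TRUE_DEMAND = 10.0      # actual arrivals per step
PREDICTED_DEMAND = 8.0  # biased prediction used by the planner


def toy_predict(k, horizon):
    return [PREDICTED_DEMAND] * horizon


def toy_solve(x, d_hat):
    """Greedy stand-in for an optimiser: serve as much as the model says is present."""
    plan = []
    for d in d_hat:
        u = min(CAPACITY, x + d)
        plan.append(u)
        x = x + d - u
    return plan


def toy_plant(x, u, k):
    return max(0.0, x + TRUE_DEMAND - u)


applied = rolling_horizon(0.0, toy_solve, toy_predict, toy_plant,
                          horizon=4, interval=2, n_steps=6)
open_loop = toy_solve(0.0, toy_predict(0, 6))
print(applied)
print(open_loop)
```

In this toy case the pure open-loop plan keeps serving the under-predicted demand, whereas the rolling horizon version raises the outflow to capacity as soon as the measured queue reveals the prediction error.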
ACKNOWLEDGEMENTS
This work was partly funded by the European Commission in the framework of the project
DACCORD (TR1017), Telematics Application Programme. The content of this paper is under
the sole responsibility of the authors and in no way represents the views of the European
Commission.
REFERENCES
Ataslar, B. and A. Iftar (1998). A decentralised control approach for transportation networks.
In: Preprints of the 8th IFAC Symp. on Large Scale Systems, Vol. 2, pp. 348-353. Patras,
Greece.
Bhouri, N. (1991). Commande d'un Systeme de Traffic Autoroutier: Application au Boulevard
Peripherique de Paris. Ph.D. Dissertation, Université de Paris-Sud, Centre d'Orsay,
France.
Bhouri, N., M. Papageorgiou and J.-M. Blosseville (1990). Optimal control of traffic flow on
periurban ringways with application to the Boulevard Peripherique in Paris. In: Preprints
of the 11th IFAC World Congress, Vol. 10, pp. 236-243. Tallinn, Estonia.
Blinkin, M. (1976). Problem of optimal control of traffic flow on highways. Automation and
Remote Control, 37, 662-667.
Diakaki, C., M. Papageorgiou and T. McLean (1998). Applying integrated corridor control in
Glasgow. In: Proc. ASCE 5th Inter. Conf. on Applications of Advanced Technologies in
Transportation Engineering, pp. 1-16. Newport Beach, CA, USA.
Diakaki, C., M. Papageorgiou and T. McLean (1999). Application and evaluation of the
integrated traffic-responsive urban corridor control strategy IN-TUC in Glasgow. In: Preprints
of the 78th Annual Meeting of Trans. Res. Board, Paper No 990310. Washington, D.C.,
USA.
Elloumi, N., H. Haj-Salem and M. Papageorgiou (1996). Integrated control of traffic corridors
- Application of an LP-methodology. In: 4th Meeting of the EURO Working Group on
Transportation Systems. Newcastle, UK.
Iftar, A. (1995). A decentralised routing controller for congested highways. In: Proc. 34th Conf.
on Decision and Control, pp. 4089-4094. N. Orleans, LA, USA.
Kotsialos, A., M. Papageorgiou, H. Haj-Salem, S. Manfredi, J. v. Schuppen, J. Taylor and M.
Westerman (1997). Coordinated Control Strategies. Deliverable D06.1 of the DACCORD
project (TR1017), European Commission, Brussels, Belgium.
Messmer, A. and M. Papageorgiou (1990). METANET: A macroscopic simulation program for
motorway networks. Traffic Engineering and Control, 31, 466-470; 32, 549.
Messmer, A. and M. Papageorgiou (1995). Route diversion control in motorway networks via
nonlinear optimisation. IEEE Trans. on Control Systems Technology, 3, 144-154.
Moreno-Banos, J.C., M. Papageorgiou and C. Schaffner (1993). Integrated optimal flow con-
trol in traffic networks. Eur. J. Op. Res., 71, 317-323.
Papageorgiou, M. (1983). Application of Automatic Control Concepts in Traffic Flow Model-
ling and Control, Springer Verlag, New York.
Papageorgiou, M. (1994). An integrated control approach for traffic corridors. Transp. Res. C,
3, 19-30.
Papageorgiou, M. (1997). Optimal control as a source of intelligent behaviour. In: Proc. of
IEEE Intern. Symp. on Intelligent Control, pp. 383-389. Istanbul, Turkey.
Papageorgiou, M. and R. Mayr (1982). Optimal decomposition methods applied to motorway
traffic control. Inter. J. Control, 35, 269-280.
Papageorgiou, M., J.-M. Blosseville and H. Haj-Salem (1990). Modelling and real-time control of
traffic flow on the southern part of Boulevard Peripherique in Paris. Part I Modelling.
Transp. Res. A, 24, 345-359.
Papageorgiou, M. and M. Marinaki (1995). A Feasible Direction Algorithm for the Numerical
Solution of Optimal Control Problems. Internal Report 1995-4, Dynamic Systems and
Simulation Laboratory, Technical University of Crete, Chania, Greece.
Wie, B., R. Tobin, D. Bernstein and T. Friesz (1995). Comparison of system optimum and user
equilibrium dynamic traffic assignment with schedule delays. Transp. Res. C, 3, 389-411.
Zhang, H., S. Ritchie and W. Recker (1996). Some general results on the optimal ramp meter-
ing control problem. Transp. Res. C, 4, 51-69.
PROGRESSION OPTIMIZATION IN LARGE
SCALE URBAN TRAFFIC NETWORKS: A
HEURISTIC DECOMPOSITION APPROACH
Chronis Stamatiadis and Nathan H. Gartner
Dept. of Civil and Environmental Engineering
University of Massachusetts
Lowell MA 01854, U.S.A.
ABSTRACT
Progression methods are widely used for optimization of traffic signal system operation in
arterials and in grid networks. The methods provide robust solutions for traffic control as
well as a multitude of design alternatives that are not readily available in other models.
Solution procedures were developed in recent years using mixed-integer linear programming
methods. While these methods can produce optimal solutions, they are computationally
demanding and inefficient for traffic control applications. This paper describes a heuristic
decomposition procedure for the optimization of the variable bandwidth network progression
problem. The procedure does not merely exploit the mathematical formulation of the mixed-
integer linear program, but is primarily based on the traffic characteristics of the network.
The network is decomposed into priority sub-networks, which facilitates the accelerated
determination of the optimal values for the integer variables. The heuristic dramatically
improves the efficiency of the computation, reducing the effort by at least a factor of 100. This makes it
possible to handle larger-scale networks, similar to the ones found in many metropolitan areas. Overall,
more efficient computational procedures result in the ability to obtain improved solutions
and, ultimately, lead to improved performance of the traffic network.
INTRODUCTION
Coordination of traffic signals in urban street networks has long been recognized as a
requirement for the smooth and efficient flow of traffic. Arterial progression methods, which
use the width of green bands along the arterial as the objective of optimization, are often
favored by traffic engineers and are in wide use (Gordon, 1996; RiLSA, 1992). A basic
characteristic of progression methods is that they offer robust solutions for traffic control by
providing the opportunity for continuous movement to platoons of vehicles through
successive traffic lights on the main arterial streets of the network.
Numerous computer programs were developed for finding optimal signal timing plans based
on the maximum green bandwidth criterion. Two of the most versatile and advanced such
programs are PASSER-II and MAXBAND, which provide offset, split, cycle length and left
turn phase sequence optimization on single arterial streets. The PASSER-II program (Messer
et al., 1973) uses a search procedure to determine the combination of offsets that will result in
the widest equal bands in both directions of an artery. A rigorous mathematical formulation
of the optimization problem is given in MAXBAND (Little et al., 1981) through which the
maximization of a weighted combination of the bandwidths in the two directions of the
arterial is achieved. The MAXBAND formulation includes several decision variables that are
integers and, therefore, mixed-integer linear programming (MILP) is used for solving the
problem. The original MILP formulation dates back to the 3rd International Symposium on
The Theory of Traffic Flow which was held in New York in 1965 (Little, 1967). The single
arterial MAXBAND formulation was later extended to handle grid networks of arterial
streets, which considerably expanded the size of the integer variable set (Chang et al., 1988).
A basic limitation of the above models is that the progression schemes generated by these
models are of uniform width. Conventional bandwidth optimization models apportion the
total available bandwidth along each arterial in proportion to the average directional volumes.
The underlying assumption is that the platoon of vehicles traveling through the arterial is
approximately fixed in size, in which case the calculated signal timing plan will provide
optimal progression. However, traffic volumes may vary significantly along an arterial due
to turn-in and turn-out traffic at each intersection. Consequently, the size of the platoon of
vehicles traveling through a sequence of intersections is not, generally, constant and thus the
basic assumption of conventional progression models often does not hold. The effect of
using average through-moving volume for apportioning the total bandwidth is that the green
band may be either wasted at intersections with lower than average through-moving volume,
or be deficient at intersections with higher than average through-moving volume. This has
been a major drawback of progression methods and, for varying traffic volumes, optimum
results are frequently not achieved.
Several attempts have been made to overcome this deficiency and place the arterial
bandwidth optimization concept on more sound traffic based principles. Tsay and Lin (1988)
developed an "inverted funnel" progression model in which the band could only increase in
width along the arterial. This model does not adequately adapt the bandwidth to the
variations of flow and its applicability is too limited. Sripathi et al. (1995) developed a search
procedure in which the total bandwidth available on each arterial link is apportioned in such a
way that a link utility function is maximized. This approach, while it does not guarantee an
optimum solution, provides significant improvements over uniform bandwidth models. A
more elaborate approach was developed by Gartner et al. (1990, 1991) in the MULTIBAND
model that is an extension of MAXBAND. By incorporating into the model a traffic-
dependent criterion, MULTIBAND calculates individual bandwidths for each directional link
of the arterial while still maintaining main street platoon progression. The individual bands
depend on the actual traffic volumes that each link carries and the resulting signal-timing plan
is tailored to the varying traffic flows along the arterial. The multi-band design has been
shown to provide significant benefits in terms of common performance measures such as
delays, number of stops and fuel consumption over conventional uniform bandwidth models.
The MULTIBAND model is also formulated as a mixed-integer linear program. A network
version of the variable bandwidth formulation was also introduced (Stamatiadis and Gartner,
1996). The multi-band formulation increases the size of the continuous variable set and of
the constraint set by about 20-60% compared with the uniform band formulation; however,
the size of the integer variables set remains the same.
A basic difficulty that is encountered in solving these models concerns the
multidimensionality of the discrete variables of the problems: the large number of discrete
variables, as well as their wide feasible ranges, make the solution of the program
computationally demanding. This paper presents a heuristic decomposition procedure for the
solution of the variable bandwidth network progression problem. The procedure does not
merely exploit the mathematical formulation of the mixed-integer linear program, but is
primarily based on the traffic characteristics of the network. Specifically, the network is
decomposed into priority sub-networks, which facilitates the accelerated determination of the
optimal values for the integer variables. The heuristic dramatically improves the efficiency of
the solution and thus makes it possible to handle large-scale networks, similar to the ones found in
many metropolitan areas. The heuristic solution is equal or nearly equal to the globally
optimal solution obtained by solving the original model. The speed and the quality of the
heuristic solution procedure may allow on-line implementation of the method. The paper
includes the application of the model to two large-scale networks, from Memphis, Tennessee
and from Ann Arbor, Michigan.
DESCRIPTION OF THE MODEL
This section presents the formulation of the multi-band optimization model. The model
consists of several blocks of constraints dealing with the individual arterials of the network as
well as a set of network constraints that ensures that continuous bands are being produced on
all the intersecting arterials. To explain the variable band formulation, the basic uniform
bandwidth optimization formulation is presented first.
The geometric relations for the uniform bandwidth model are shown in Figure 1. Consider a
network with $m$ arterials, each arterial $j$ having $n_j$ signalized intersections. Let $S_{ij}$ denote the
$i$-th signal on the $j$-th arterial of the network and $L_{ij}$ denote the $i$-th link (between signals $i$ and $i+1$)
of the $j$-th arterial, with $j = 1, \ldots, m$ and $i = 1, \ldots, n_j$. All time variables are defined in units of
the cycle time. The following variables are defined:
$C$ = cycle time (sec);
$b_j$ ($\bar{b}_j$) = outbound (inbound) bandwidth on arterial $j$;
$r_{ij}$ ($\bar{r}_{ij}$) = outbound (inbound) red time at $S_{ij}$;
$w_{ij}$ ($\bar{w}_{ij}$) = interference variables, time from right (left) side of red at $S_{ij}$ to left
(right) side of outbound (inbound) green band;
$t_{ij}$ ($\bar{t}_{ij}$) = travel time on link $i$ of arterial $j$ in the outbound (inbound) direction;
$\phi_{ij}$ ($\bar{\phi}_{ij}$) = internode offsets, time from the center of the outbound (inbound)
red at $S_{ij}$ to the center of the outbound (inbound) red at $S_{i+1,j}$;
$\Delta_{ij}$ = directional node phase shift, time from center of $\bar{r}_{ij}$ to nearest center
of $r_{ij}$;
$\tau_{ij}$ ($\bar{\tau}_{ij}$) = queue clearance time for advancement of outbound (inbound)
bandwidth at $S_{ij}$ to clear turning-in traffic before arrival of main-street platoon;
$v_{ij}$ ($\bar{v}_{ij}$) = outbound (inbound) progression speed on link $L_{ij}$ (ft/sec).
Figure 1: Time-space diagram for uniform bandwidth optimization.

In the case of uniform bands, the objective function has the following form:

Maximize $\sum_{j=1}^{m} (b_j + k_j \bar{b}_j)$    (1)

where $k_j$ is the target ratio of inbound to outbound bandwidth for arterial $j$.
The directional interference constraints ensure that the progression bands use only the green
time and do not cross through the red time. When the band has a fixed width
throughout the arterial, only one such constraint is needed for each signal $S_{ij}$ and each
directional band:

$w_{ij} + b_j \le 1 - r_{ij}$    (2.a)
$\bar{w}_{ij} + \bar{b}_j \le 1 - \bar{r}_{ij}$    (2.b)
The arterial loop constraints result from the fact that all signals must be synchronized, i.e.,
that they operate with a common cycle time. In Figure 1 it can be seen that for each link $L_{ij}$
the summation of the internode offsets and directional node phase shifts is an integer multiple
of the cycle time as follows:

$\phi_{ij} + \bar{\phi}_{ij} + \Delta_{ij} - \Delta_{i+1,j} = \kappa_{ij}$    (3)

where $\kappa_{ij}$ is an integer variable. The same principle of signal synchronization applies to
closed loops of the network consisting of more than 2 links, resulting in the network loop
constraints. For simplicity, we drop the arterial index in the notation of nodes and internode
offsets and we define the intranode offset $\omega_{ijk}$ as the time from the center of the red at $S_j$ for
traffic moving from $S_j$ to $S_i$, to the center of red in the crossing direction at the same node for
traffic moving from $S_j$ to $S_k$ (Figure 2.a). The network loop constraints specify that the
summation of internode and intranode offsets around a loop of intersecting arterials must be
an integer multiple of the cycle time (Figure 2.b):

$\phi_{ij} + \omega_{ijk} + \phi_{jk} + \omega_{jkl} + \phi_{kl} + \omega_{kli} + \phi_{li} + \omega_{lij} = \mu_n$    (4)

where $\mu_n$ is the integer variable of the $n$-th network loop. The number of network loop
constraints and the choice of a fundamental set of loops are given by Gartner (1972).
The cycle time $C$ (sec) and the link specific progression speeds $v_{ij}$ and $\bar{v}_{ij}$ are treated as
decision variables as well. This formulation introduces considerable flexibility in the
calculation of the best progression scheme. Each of these variables must be constrained by
upper and lower bounds as follows:
Figure 2.a: Closed loop of four intersecting arterials
Figure 2.b: Network loop constraint geometry

C,, C2 = lower and upper bounds on cycle length;

\ei"fi-)>(€ij,fij)= lower and upper bounds on outbound (inbound) speed v.. (iL)
(ft/sec);
(g.., /ziy), ( g y , hy)) = lower and upper bounds on change in outbound (inbound) speed v..
(v,) (ft/sec).
The corresponding constraints will be:
C<C<C (5)
(6)
and (7)
An additional important decision capability that can be added through the MILP formulation
of the problem is the identification of the optimal left-turn phase sequence with respect to the
through green at any signal $S_{ij}$. A left-turn green can be chosen to lead or lag the through
green, whichever results in the most total bandwidth. Figure 3 shows the four possible
patterns of left-turn phases, where $l_{ij}$ ($\bar{l}_{ij}$) is the outbound (inbound) green time for
left-turning traffic at $S_{ij}$. For the four phase patterns the intranode offsets can be expressed in
terms of $l_{ij}$ and $\bar{l}_{ij}$ as follows:
Figure 3: The four different phase sequences: (a) Outbound Left Leads, Inbound Left Lags;
(b) Outbound Left Lags, Inbound Left Leads; (c) Outbound Left Leads, Inbound Left Leads;
(d) Outbound Left Lags, Inbound Left Lags.

Pattern 1: $\Delta_{ij} = -(l_{ij} + \bar{l}_{ij})/2$
Pattern 2: $\Delta_{ij} = (l_{ij} + \bar{l}_{ij})/2$
Pattern 3: $\Delta_{ij} = -(l_{ij} - \bar{l}_{ij})/2$
Pattern 4: $\Delta_{ij} = (l_{ij} - \bar{l}_{ij})/2$

The above equations can be combined into a single equation by introducing two sets of binary
decision variables $\delta_{ij}$ and $\bar{\delta}_{ij}$ such that:

$\Delta_{ij} = [(2\delta_{ij} - 1)\, l_{ij} - (2\bar{\delta}_{ij} - 1)\, \bar{l}_{ij}]/2$    (8)

The binary variables $\delta_{ij}$ and $\bar{\delta}_{ij}$ are defined as follows:

Pattern   $\delta_{ij}$   $\bar{\delta}_{ij}$
1         0               1
2         1               0
3         0               0
4         1               1

The traffic engineer may specify that only some of these patterns are allowable, in which case
additional constraints are imposed on the combinations of allowable values of $\delta_{ij}$ and $\bar{\delta}_{ij}$.
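That the single expression (8) reproduces the four pattern formulas can be checked mechanically. The Python sketch below does so with exact fractions; the function name and the sample left-turn green times are illustrative assumptions, and equation (8) is used in the form $\Delta = [(2\delta - 1)\,l - (2\bar{\delta} - 1)\,\bar{l}]/2$.

```python
from fractions import Fraction


def phase_shift(delta_out, delta_in, l_out, l_in):
    """Directional node phase shift from the two binary variables, as in eq. (8)."""
    return ((2 * delta_out - 1) * l_out - (2 * delta_in - 1) * l_in) / 2


# Binary values (delta, delta-bar) for each left-turn phase pattern.
PATTERN_BINARIES = {1: (0, 1), 2: (1, 0), 3: (0, 0), 4: (1, 1)}

# Pattern-by-pattern formulas as listed in the text.
PATTERN_FORMULAS = {
    1: lambda l, lb: -(l + lb) / 2,
    2: lambda l, lb: (l + lb) / 2,
    3: lambda l, lb: -(l - lb) / 2,
    4: lambda l, lb: (l - lb) / 2,
}

# Hypothetical left-turn green times, in fractions of the cycle.
l_out, l_in = Fraction(3, 20), Fraction(1, 10)

results = {
    pattern: phase_shift(d_out, d_in, l_out, l_in)
    for pattern, (d_out, d_in) in PATTERN_BINARIES.items()
}
matches = all(
    results[p] == PATTERN_FORMULAS[p](l_out, l_in) for p in PATTERN_BINARIES
)
print(results, matches)
```

Restricting the allowable patterns then amounts to excluding some of the four (delta, delta-bar) pairs, which is a linear restriction on the binary variables.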
In the MULTIBAND model the width of the directional bands may differ from link to link.
The bandwidth can be individually weighted with respect to its contribution to the overall
objective function. By using weights that are computed based on the directional traffic that
each link carries, a method is obtained that is sensitive to the varying traffic conditions along
each arterial of the network. The link specific bands generated by MULTIBAND are
symmetric about the centerline of the arterial progression band. The geometry of
MULTIBAND is shown in Figure 4. The bandwidths and the interference variables are
redefined as follows:
$b_{ij}$ ($\bar{b}_{ij}$) = outbound (inbound) bandwidth of link $i$ on arterial $j$; there is now
one specific band for each link $L_{ij}$;
$w_{ij}$ ($\bar{w}_{ij}$) = the time from right (left) side of red at $S_{ij}$ to the centerline of the
outbound (inbound) green band; the reference point at each signal
has been moved from the edges to the centerline of the band.

Figure 4: Geometric relations for the variable bandwidth optimization model.

The objective function now has the form:

Maximize $\sum_{j=1}^{m} \sum_{i=1}^{n_j - 1} (a_{ij} b_{ij} + \bar{a}_{ij} \bar{b}_{ij})$    (9)

where $a_{ij}$ and $\bar{a}_{ij}$ are the link specific weighting coefficients for the outbound and inbound
directions respectively. There is a multitude of options for choosing appropriate expressions
for the weighting coefficients in (9). The coefficients currently used are as follows:

$a_{ij} = (V_{ij}/S_{ij})^p$ and $\bar{a}_{ij} = (\bar{V}_{ij}/\bar{S}_{ij})^p$    (10)

where
$V_{ij}$ ($\bar{V}_{ij}$) = outbound (inbound) directional flow rate on link $L_{ij}$; either the total
or the through volume can be used;
$S_{ij}$ ($\bar{S}_{ij}$) = saturation flow rate for the outbound (inbound) directional volume on link
$L_{ij}$; either the total flow rate or the through flow rate can be used;
$p$ = an integer exponential power; the values 0, 1, 2 and 4 were used.
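The effect of the exponent on the weighting coefficients can be seen with a small numerical sketch in Python. It assumes the volume-to-saturation-flow form of (10), $a = (V/S)^p$; the function name and the two link volumes are hypothetical.

```python
def link_weight(volume, saturation_flow, p):
    """Weighting coefficient (V / S) ** p for one directional link."""
    return (volume / saturation_flow) ** p


# Hypothetical directional links sharing a saturation flow of 1800 veh/h.
SATURATION_FLOW = 1800.0
volumes = {"heavy link": 900.0, "light link": 450.0}

weights = {
    p: {name: link_weight(v, SATURATION_FLOW, p) for name, v in volumes.items()}
    for p in (0, 1, 2, 4)
}

for p, w in weights.items():
    ratio = w["heavy link"] / w["light link"]
    print(f"p={p}: {w}  heavy/light ratio = {ratio}")
```

With p = 0 all links are weighted equally, which gives every band the same importance regardless of traffic; larger exponents shift the emphasis progressively towards the more heavily loaded links.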
In the case of variable bandwidths the band must be constrained from both sides so that
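neither edge of the band crosses through the red time. For each signal $S_{ij}$ and for each link
specific directional band there are two interference constraints, as follows:
$$w_{ij} + b_{ij}/2 \le 1 - r_{ij} \quad \text{and} \quad w_{ij} - b_{ij}/2 \ge 0 \qquad (11.a)$$
$$\bar{w}_{ij} + \bar{b}_{ij}/2 \le 1 - \bar{r}_{ij} \quad \text{and} \quad \bar{w}_{ij} - \bar{b}_{ij}/2 \ge 0 \qquad (11.b)$$
The same relationship must be valid at both ends of the band, i.e., at signals $S_{ij}$ and $S_{i+1,j}$:
$$w_{i+1,j} + b_{ij}/2 \le 1 - r_{i+1,j} \quad \text{and} \quad w_{i+1,j} - b_{ij}/2 \ge 0 \qquad (11.c)$$
$$\bar{w}_{i+1,j} + \bar{b}_{ij}/2 \le 1 - \bar{r}_{i+1,j} \quad \text{and} \quad \bar{w}_{i+1,j} - \bar{b}_{ij}/2 \ge 0 \qquad (11.d)$$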
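The requirement that neither edge of a link-specific band enters the red time can be checked with a short sketch. It assumes all times are expressed as fractions of the cycle, that w is measured from the end of red to the band centerline, and that the band extends b/2 to either side of the centerline; the function names are hypothetical.

```python
def band_fits_green(w, b, r, tol=1e-9):
    """True if a band of width b centred at w stays inside the green.

    w: time from the edge of red to the band centerline (cycle fraction)
    b: bandwidth (cycle fraction)
    r: red time (cycle fraction), so the green lasts 1 - r
    """
    lower_edge_ok = w - b / 2 >= -tol
    upper_edge_ok = w + b / 2 <= 1 - r + tol
    return lower_edge_ok and upper_edge_ok


def link_band_feasible(w_up, r_up, w_down, r_down, b):
    """Check one directional band at both signals that bound the link."""
    return band_fits_green(w_up, b, r_up) and band_fits_green(w_down, b, r_down)


print(link_band_feasible(w_up=0.30, r_up=0.40, w_down=0.25, r_down=0.45, b=0.40))
```

A full check of a candidate solution would apply the same test to the inbound band with the inbound quantities.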
The arterial loop constraints, and the network loop constraints are not affected by the variable
bandwidth extension of the procedure and remain as in expressions (3) and (4). The cycle
time and the progression speed, as well as, the left-turn phase pattern constraints remain
unchanged in MULTIBAND and are the same as in expressions (5) through (8).
It has been shown previously (Gartner et al, 1990, 1991, Stamatiadis and Gartner, 1996) that
the multi-band approach offers substantial flexibility for the design of progressions that result
in significant improvements in all performance measures. It produces optimal variable
progression schemes tailored to the traffic demand and capacity of each individual road
section along each arterial street, while simultaneously it optimizes progressions on the
crossing arterials as well. For network problems, delays produced by the variable bandwidth
method are reduced by as much as 13.8% relative to the delays produced by the uniform bandwidth
approach (Stamatiadis and Gartner, 1996). Examples of uniform and variable bandwidths for
4th Street in the Ann Arbor network are shown in Figure 5. The shaded bands in the first
segment of 4th Street indicate the existence of asymmetric progression bands. The Ann Arbor
network is used later as an example for the application of the heuristic methodology and is
shown in Figure 6.
THE HEURISTIC DECOMPOSITION APPROACH
The MULTIBAND model uses mixed-integer linear programming for determining the
optimal solution. The principal difficulty in solving mixed-integer problems is the number of
integer variables and their range, i.e., the size of the integer feasible set. Virtually all MILP
codes use branch-and-bound strategies for calculating the optimal values. While quite efficient
codes are now available for solving large-scale problems, it is incumbent on all
modelers, especially in traffic engineering applications, to devise the most efficient solution
procedure for a given problem. Clearly, more efficient computational procedures will result
in the ability to obtain improved solutions and, ultimately, will lead to improved performance
of the traffic network. Computational procedures are just as important as are more accurate
traffic models, especially in an era of increasing real time applications. In the case of
progression optimization in networks this would enable the following:
(a) to use more economical mathematical programming codes which are more affordable and
more easily accessible;
Figure 6: The downtown Ann Arbor, Michigan network (streets shown: Main St., 4th St., 5th St.,
Huron St., Washington St., Liberty St., William St.).
(b) to optimize larger-scale networks than would otherwise be possible;
(c) to assure more reliable convergence to optimal solutions, i.e., the procedure would not
fail as often;
(d) to analyze a larger number of alternatives at a much reduced cost (one should note that
the progression schemes may involve a multitude of data sets, choice of coefficients and
parameter ranges); and
(e) to have the opportunity to use the codes in real time, e.g., in a 2nd generation type of
control scheme (Gartner et al, 1995).
Branch-and-bound is essentially a strategy of "divide and conquer". The idea is to partition
the feasible region into more manageable subdivisions and, eventually, to fathom the entire
tree of integer solutions. However, branch-and-bound is a strategy for general purpose
mixed-integer programs that does not exploit the special characteristics that a particular
problem may have. Solving the original multi-band problem by general purpose branch-and-
bound is still a rather formidable task. For this reason, heuristic methods which quickly lead
to a good, though not always optimal solution are often preferable. Typically, heuristics have
an intuitive justification motivated by an intimate familiarity with the particular problem
characteristics (Nemhauser and Wolsey, 1988).
It is virtually always advantageous to decompose mixed-integer programming problems into
smaller sub-problems, in order to reduce the number of integer variables that have to be
considered in each sub-problem, and to restrict as much as possible the allowable range of
each variable. Several authors have proposed solution approaches along these lines. Mireault
(1991) tried to solve more efficiently an early version of the mixed integer linear
programming progression model by carefully restricting the range of the integer variable set.
Chaudhary et al. (1991) devised two decomposition procedures for the network version of
MAXBAND. The procedures are based on subdividing the integer variable set into two or
three subsets consisting of (1) the arterial two-way loop variables $K_{ij}$, (2) the network loop
variables $\mu_n$, and (3) the phase sequencing variables $\delta_{ij}$ and $\bar{\delta}_{ij}$. The procedures are based
on calculating one of the sets while relaxing the integrality requirements on the other two.
The results are then used to fix the values of the first set and to calculate, in turn, integer
values for the other two sets. Six alternative feasible integer solutions are kept from which
the best final solution is chosen. Pillai et al. (1998) proposed two greedy heuristics which,
similar to those of Chaudhary et al., involve partial relaxation of some integer variables coupled
with a depth-first search of the branch-and-bound tree. Both of the above approaches are based on a
rather arbitrary decomposition of the integer variable set and are not motivated by any traffic-
related considerations.
The new heuristic that is described in this paper does not merely exploit the mathematical
characteristics of the mixed-integer problem, but is primarily based on the traffic
characteristics of the network. A priority sub-network consisting of an arterial tree is selected
from the original traffic network based on its geometry and on the volume of traffic that each
link is carrying. The priority sub-network is optimized first and the results are then used for
the solution of the entire network. Alternative sub-networks can be selected in a heuristic
procedure if further improvements are desired. Most importantly, the new decomposition
process does not require relaxation of any integer variables unlike previous approaches. It is
believed that this feature, in addition to the fact that the priority sub-network contains the
bulk of the traffic volumes in the network, enables the achievement of faster and better
solutions. As a matter of fact, the results indicate that an optimal solution is obtained in the
majority of cases on the first try. The decomposition procedure is described below.
Let P1 be a progression optimization problem obtained by considering only a sub-network $N_1$
of the original network $N$, and P2 be another progression optimization problem obtained by
setting a selected subset of the arterial loop and the network loop integer variables in the
network problem to a specific set of values. Then the heuristic optimization method is as
follows:
* Step 1: Identify a new priority sub-network $N_1 \subset N$.
* Step 2: Optimize P1 for $N_1$, and save the resulting values of the arterial loop and network
loop integer variables $K_{ij}^*$, $\mu_n^*$ for all the links and network loops of $N_1$.
* Step 3: Optimize P2 by setting the integer variables to the values calculated in Step 2
($K_{ij} = K_{ij}^*$, $\mu_n = \mu_n^*$, $\forall K_{ij}, \mu_n \in$ P1).
* Step 4: Calculate the value of the objective function. If it is better than the previous
solution, save it.
* Step 5: Stop if all priority sub-networks have been considered; otherwise go back to Step
1.
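The five steps above can be sketched as a simple loop. This is only an outline under stated assumptions: `solve_milp` is a hypothetical callback standing in for whatever MILP code is used, and `toy_solver` is a made-up stand-in so that the sketch runs; neither reflects the actual MULTIBAND implementation.

```python
def heuristic_decomposition(network, priority_subnetworks, solve_milp):
    """Outline of the heuristic; solve_milp(links, fixed) -> (objective, integers)."""
    best_value, best_solution = float("-inf"), None
    for sub_network in priority_subnetworks:                  # Step 1
        # Step 2: solve P1 on the tree sub-network, keep its loop integers.
        _, loop_integers = solve_milp(sub_network, fixed={})
        # Step 3: solve P2 on the full network with those integers fixed.
        value, solution = solve_milp(network, fixed=loop_integers)
        if value > best_value:                                # Step 4
            best_value, best_solution = value, solution
    return best_value, best_solution                          # Step 5


def toy_solver(links, fixed):
    """Made-up stand-in for a MILP code: free integers default to 1."""
    integers = {link: fixed.get(link, 1) for link in links}
    return float(sum(integers.values())), integers


full_network = ["a", "b", "c"]
candidate_trees = [["a", "b"], ["b", "c"]]
print(heuristic_decomposition(full_network, candidate_trees, toy_solver))
```

Because no integrality requirement is relaxed at any point, every solution returned by the second solve is feasible for the original problem, which is the property the text emphasizes.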
A priority sub-network is chosen so that it contains only a "tree" of arterials, eliminating any
network loops. The arterials contained in the "tree" should include the principal arterials of
the network and can be chosen based on the following criteria:
1. Choose the principal arterial of the network to be the trunk of the tree and include only
crossing arterials in the sub-network;
2. The tree consists of the maximum number of arterial two-way links without forming any
network loops.
Examples of different priority tree sub-networks are shown in Figure 7. The solution of both
P1 and P2 can be obtained very quickly due to the reduced number of integer variables.
Table 1 shows the number of integer variables in the original problem and in the two sub-
problems P1 and P2 of the heuristic approach for an m x n closed grid network (m x n
intersections and m + n arterials).
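The integer-variable counts for an m x n closed grid can be checked with a short sketch. The formulas are the per-row expressions reported in Table 1; the function name and the dictionary keys are illustrative.

```python
def integer_counts(m, n):
    """Number of integer variables for an m x n closed grid network."""
    arterial_loops = 2 * m * n - m - n      # one per two-way link
    network_loops = (m - 1) * (n - 1)       # one per grid cell
    original = arterial_loops + network_loops
    step1 = m * n - 1                       # tree sub-network: mn - 1 links, no loops
    step2 = (m * n - m - n + 1) + network_loops
    return {"original": original, "step1": step1, "step2": step2}


for size in [(4, 6), (3, 7)]:
    print(size, integer_counts(*size))
```

For both example grids the Step 1 and Step 2 counts add up to the count of the original problem, which is why each sub-problem is so much smaller than the full MILP.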
The performance of the heuristic solution was tested on two networks: the first in downtown
Ann Arbor, Michigan (3 x 5 grid) and the second in downtown Memphis, Tennessee (4 x 4
grid). The grids are not complete, as not every node is signalized. The results
are shown in Table 2. All the runs were executed on a 200 MHz Pentium computer. The
heuristic approach was applied to both the uniform and the variable bandwidth models. For
the uniform bandwidth case the heuristic calculated an optimal solution (there are multiple
optimal solutions) for both sample networks. In the case of the multi-band model, the
heuristic located the optimal solution for only one of the test networks. However, the
execution times were significantly reduced, by a factor of as much as 263 compared to the
original times.
Table 1: Size of MILP problem for an m x n closed grid network.
Variable                 MAXBAND          MULTIBAND             Heuristic-Step 1   Heuristic-Step 2
$b, \bar{b}$             2(m+n)           2(m(n-1)+n(m-1))      2(mn-1)            2(2mn-m-n)
$z$                      1                1                     1                  1
$w, \bar{w}$             2(2mn-m-n)       2(2mn-m-n)            2(mn-1)            2(2mn-m-n)
$K$                      2mn-m-n          2mn-m-n               mn-1               mn-m-n+1
$\mu$                    (m-1)(n-1)       (m-1)(n-1)            0                  (m-1)(n-1)
Total integers           3mn-2m-2n+1      3mn-2m-2n+1           mn-1               2(mn-m-n+1)
Example (no. of integers)
4 x 6 network            53               53                    23                 30
3 x 7 network            44               44                    20                 24
Table 2: Objective Function Values and Execution Times of the
Network / Model     Size (Art./Nodes)   Original MILP Problem        Heuristic Decomposition
                                        Obj. Value   Exec. Time      Obj. Value       Exec. Time
Memphis, TN         8/17
  MAXBAND                               3.4682       6,735 sec       3.4682 (100%)    50 sec (1/135)
  MULTIBAND                             7.9381       15,012 sec      7.9381 (100%)    57 sec (1/263)
Ann Arbor, MI       8/14
  MAXBAND                               2.9381       4,235 sec       2.9381 (100%)    44 sec (1/96)
  MULTIBAND                             4.8930       9,995 sec       3.7680 (77%)     51 sec (1/196)
CONCLUSIONS
This paper describes a heuristic decomposition procedure for the solution of the mixed
integer linear programming formulation of the multi-band network progression problem. In
contrast to previous approaches for solving this problem, the present heuristic is based on the
traffic characteristics of the network, and it is shown that it can yield dramatic reductions in
computational effort with little or no compromise in the quality of the results.
Reductions range from 1:100 to 1:300 compared with the original formulation. With
these results one could more easily obtain optimal solutions for large-scale networks, analyze
a larger number of alternatives, and consider implementing this strategy in an on-line
system. Overall, more efficient computational procedures result in the ability to obtain
improved solutions and, ultimately, lead to improved performance of the traffic network.
They are just as important as are more accurate modeling techniques.
Figure 7: Example of priority sub-network selection (legend: isolated intersection, no coordination).
REFERENCES
Chang, E.C.-P., Cohen, S.L., Liu, C., Chaudhary, N.A., Messer, C. (1988). MAXBAND-86: A
program for optimizing left-turn phase sequence in multiarterial closed networks.
Transportation Research Record 1181, 61-67.
Chaudhary, N.A., Pinnoi, A., Messer, C., (1991). Proposed enhancements to MAXBAND-86
program. Transportation Research Record 1324, 98-104.
Gartner, N.H., (1972). Constraining relations among offsets in synchronized signal networks.
Transportation Science 6, 88-93.
Gartner, N.H., Assmann, S.F., Lasaga, F., Hou, D.L. (1990). MULTIBAND - A variable-
bandwidth arterial progression scheme. Transportation Research Record 1287,
212-222.
Gartner, N.H., Assmann, S.F., Lasaga, F., Hou, D.L., (1991). A multi-band approach to
arterial traffic signal optimization. Transportation Research Vol. 25B, No 1, 55-
74.
Gartner, N.H., Stamatiadis, C., Tarnoff, P.J. (1995). Development of advanced traffic signal
control strategies for IVHS: A multi-level design. Transportation Research Record
1494, 98-105.
Gordon, R.L., et al, (1996). Traffic Control Systems Handbook. Report No FHWA-SA-950-
032, Federal Highway Administration, U.S. Department of Transportation.
Little, J.D.C. (1967). A mixed-integer linear program for synchronizing traffic signals for
maximal bandwidth. Vehicular Traffic Science (L.C. Edie et al, editors), Elsevier,
New York.
Little, J.D.C., Kelson, M.D., Gartner, N.H. (1981). MAXBAND: A program for setting
signals on arteries and triangular networks. Transportation Research Record 795,
40-46.
Messer, C.J., Whitson, R.H., Dudek, C.L., Romano, E.J. (1973). A variable-sequence multi-
phase progression optimization program. Highway Research Record 445, 24-33.
Mireault, P., (1991). A branch-and-bound algorithm for the traffic signal synchronization
problem with variable speed. TRSTAN 1, Montreal.
Nemhauser, G.L. and Wolsey, L.A., (1988). Integer and Combinatorial Optimization. Wiley-
Interscience, New York.
Pillai, R.S., Rathi, A.K., Cohen, S., (1998). A restricted branch-and-bound approach for
setting the left turn phase sequence in signalized networks. Transportation
Research Vol. 32B, No 5.
RiLSA (1992). Richtlinien für Lichtsignalanlagen. Forschungsgesellschaft für Strassen- und
Verkehrswesen. FGSV Verlag, Köln (Germany).
Sripathi, H.K., Gartner, N.H., Stamatiadis C. (1995). Uniform and variable bandwidth arterial
progression schemes. Transportation Research Record 1494, 135-145.
Stamatiadis, C., Gartner, N.H., (1996). MULTIBAND-96: A program for variable bandwidth
progression optimization of multiarterial traffic networks. Transportation Research
Record 1554, 9-17.
Tsay, H.S., and Lin, L.T. (1988). New algorithm for solving the maximum progression
bandwidth. Transportation Research Record 1194, 15-30.
CHAPTER 9
ROAD TOLLING AND PARKING BALANCE
Great moments in science: Einstein discovers that time is actually money.
(from cartoon caption)
Logic is a systematic method of coming to the wrong conclusion with confidence.
More important than the quest for certainty is the quest for clarity.
(Francois Gautier)
TOLLING AT A FRONTIER: A GAME THEORETIC ANALYSIS
David Levinson, Institute of Transportation Studies Rm. 109 McLaughlin Hall University of
California at Berkeley dmlevins@uclink4.berkeley.edu
ABSTRACT
Frontiers provide an opportunity for one jurisdiction to remedy inequities (and even exploit
them) in highway finance by employing toll-booths, and thereby ensure the highest possible
share of revenue from non-residents. If one jurisdiction sets policy in a vacuum, it is clearly
advantageous to impose as high a toll on non-residents as can be supported. However, the
neighboring jurisdiction can set policy in response. This establishes the potential for a classical
prisoner's dilemma consideration: in this case to tax (cooperate) or to toll (defect). Even if
both jurisdictions would together raise as much revenue from taxes as from tolls (and perhaps
more since taxes may have lower collection costs), the equilibrium solution in game theory,
under a one-shot game, is for both parties to toll. However, in the case of a repeated game,
cooperation (taxes and possibly revenue sharing), which has lower collection costs, is stable.
INTRODUCTION
Tolls are viewed by transport economists as a more efficient means for financing highways and
allocating scarce road space than general taxes in many cases (Bernstein and Muller 1983; de
Palma and Lindsey 1998; Downs 1994; Dupuit 1849; Gittings 1987; Keeler and Small 1977;
Mohring 1970; Poole 1994; Roth 1996; Small 1983; Small, Winston, and Evans 1989; TRB
1994; Verhoef, Nijkamp, and Rietveld 1996; Vickrey 1963, 1969; Viton 1981, 1995). During
some periods in the history of road financing, tolls have been widely used, including in the
United States during the period from the late 1700s through the mid 1800s (Klein 1990), and
again from 1940 to 1956 (Gomez-Ibanez and Meyer 1993). However, most roads are now
financed with gas taxes or from general revenue. If tolls ever again become a widely used
revenue source, it won't happen overnight; they will be staged into wide acceptance. Some
locations will be more politically acceptable for new toll collections than others. In particular,
jurisdiction boundaries or frontiers, where at least half the crossing vehicles are driven by
non-residents, would seem to be among the most politically palatable. However, a frontier, by
definition, involves more than one jurisdiction, and the policies of neighbors affect each other.ⁱ
This paper considers the welfare implications of tolling at a frontier under alternative
behavioral assumptions: different objectives (welfare maximizing, profit maximizing, cost
recovery), willingness to cooperate on setting tolls, and over different time frames (one-time
interactions and repeated interactions). By understanding how tolls, welfare, and profits vary
under different behavioral assumptions, we can better understand the motivations of
jurisdictions and under which behaviors tolls will be most likely.
There are two problems that are considered in this paper, referred to as strategic and tactical
decisions respectively. First is the strategic decision: will a jurisdiction tax or toll? Second is
the tactical decision: if it tolls, what toll will it set? The decision to toll and the rate of toll set
by one jurisdiction affects the welfare of the residents of another jurisdiction, leading to
interactions and possible gains to both jurisdictions by cooperating. Game Theory, developed
by Von Neumann and Morgenstern (1944), presents an analytic approach to explain the
choices of multiple actors in conflict with each other with scope for cooperation, where the
payoffs are interdependent (Axelrod 1984, Hargreaves-Heap and Varoufakis 1995, Osborne
and Rubinstein 1994, Rapoport 1970, Taylor 1987).
The focus of this paper is on the revenue policies and rates of toll which emerge at jurisdiction
boundaries under alternative behaviors in the absence of congestion. The model is developed in
the next section. Alternative objectives, one aspect of behavioral variation, are then considered.
Two different toll-setting methods, cooperative and non-cooperative, are investigated, and
comprise the second main behavioral variation. After presenting empirical values for the
model coefficients, an algebraic solution to the model under the different behaviors is
computed. Then sensitivity tests are conducted and the model applied in the context of a one-
shot game. The application is extended into the realm of repeated games, where many
outcomes are possible. The paper closes with some concluding remarks.
MODEL
We assume an infinitely long two-way road covered by two jurisdictions, one ranging from the
point $-\infty$ to a boundary point b (jurisdiction $J_i$), the other covering the area from point b to
$+\infty$ (jurisdiction $J_j$). Both jurisdictions may establish toll-booths at the boundary. Tolls can be
collected in either one or both directions, which will affect welfare by a fixed amount
associated with establishing toll-booths and a variable cost per collection. For convenience we
assume tolls in both directions if tolls are collected. There are no internal toll-booths. This is
illustrated in Figure 1.
Figure 1: Infinitely Long Road Covered Completely by Two Jurisdictions
This network structure implies four classes of trips: trips staying within $J_i$ ($T_{ii}$), trips from
$J_i$ to $J_j$ ($T_{ij}$), trips from $J_j$ to $J_i$ ($T_{ji}$) and trips staying within $J_j$ ($T_{jj}$). We are only
concerned with trips crossing the boundary. By assuming symmetry, the equations for $T_{ij}$ and
$T_{ji}$ trips are identical.
Our model assumes that flow ($f_b$) across point b on a road is described by a negative
exponential model, in which demand depends on the tolls charged by both jurisdictions, as
described by the function below.
$$f_b = \omega e^{\alpha (r_i + r_j)} \qquad (1)$$
where: $f_b$ = flow past point b
$\omega$, $\alpha$ = model parameters
$r_i$, $r_j$ = the tolls charged by jurisdictions $J_i$, $J_j$
Because the jurisdictions are infinite in size, we are not interested in total welfare, rather only
in welfare crossing the boundary point b. The consumers' surplus of local boundary crossing
trips ($U_{ij}$) is measured as the difference between what each consumer would pay and what they
do pay. We can solve for consumers' surplus by integrating the demand function over the
range of tolls from what they do pay ($r_i + r_j$) to infinity. This is given by equation (2).ⁱⁱ
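The integration described above can be worked out directly from the demand function (1). The following is a sketch of that calculation, assuming $\alpha < 0$ so that the integral converges; the integration variable $\rho$ is introduced here only for illustration.

```latex
U_{ij} = \int_{r_i + r_j}^{\infty} \omega e^{\alpha \rho} \, d\rho
       = \left[ \frac{\omega}{\alpha} e^{\alpha \rho} \right]_{r_i + r_j}^{\infty}
       = -\frac{\omega}{\alpha} e^{\alpha (r_i + r_j)}, \qquad \alpha < 0 .
```

For $\alpha = -1$ the surplus is numerically equal to the flow $f_b$ itself.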
Two components comprise cost: network use cost ($C_{Nij}$) and toll collection cost ($C_{Vij}$). External
costs are excluded because jurisdictions don't generally include them in their decision making.
Implicit in this model is that jurisdictions have the obligation of maintaining a level of service
with a specific travel speed. Thus "congestion effects" are ascribed to infrastructure costs
which are proportional to traffic flow. To simplify the analysis we assume no (dis)economies
of scale and we assume smoothly and continuously increasing infrastructure costs. We assume
zero fixed costs associated with operating the network or collecting tolls or taxes. Equation 3
shows the network use cost ($C_{Nij}$), which equals the flow multiplied by the average trip length
of the portion of the trip in Jurisdiction I ($1/\psi$), multiplied by a cost per unit distance ($\phi$).
$$C_{Nij} = \frac{\phi}{\psi} \, \omega e^{\alpha (r_i + r_j)} \qquad (3)$$
Equation 4 provides the cost of toll collection ($C_{Vij}$) as the flow multiplied by the
collection cost per crossing ($\theta$).
$$C_{Vij} = \theta \, \omega e^{\alpha (r_i + r_j)} \qquad (4)$$
Equation 5 shows the revenue from toll collection ($R_{ij}$) as the rate of toll for Jurisdiction I
multiplied by flow.
$$R_{ij} = r_i \, \omega e^{\alpha (r_i + r_j)} \qquad (5)$$
OBJECTIVES
Which objective jurisdictions employ will shape the resulting tolls and welfare, and thus
perhaps the decision to employ tolls. When it is assumed that jurisdictions have the objective of
local welfare maximization, welfare is defined narrowly as the sum of profit (loss) from
administering the road and consumers' surplus for its residents, as shown in equation (6).
$$\text{Max} \; W_L = U_{ij} + 2 R_{ij} - 2 C_{Nij} - 2 C_{Vij} \qquad (6)$$
The profit maximization objective excludes all consumers' surplus as given in equation (7).
This represents conditions when the toll-booth is privately controlled, for instance to compare
the consequences of unfettered private control with the public control of the network. To the
extent that the welfare losses associated with private control are not excessive, it may be a
reasonable organizational form for jurisdictions to consider.
$$\text{Max} \; \Pi = 2 R_{ij} - 2 C_{Nij} - 2 C_{Vij} \qquad (7)$$
We can analyze the objective of local welfare maximization with a cost-recovery constraint.
This objective requires that tolls be high enough to recover the costs imposed by those crossing
the toll-booth but that toll revenue cannot be raised in excess of costs.
$$\text{Max} \; W_{LCR} = U_{ij} \quad \text{s.t.} \quad 0 = 2 R_{ij} - 2 C_{Nij} - 2 C_{Vij} \qquad (8)$$
Finally, we might for comparison purposes identify what would happen if both jurisdictions
($J_i$ and $J_j$) were under single control. If that government imposes tolls, it will only require a
single toll-booth, so collection costs will remain the same as for a single jurisdiction. On the
other hand, it will consider the consumers' surplus of all frontier crossing trips and the network
costs they impose on both jurisdictions' roads.
$$\text{Max} \; W_G = 2 U_{ij} + 4 R_{ij} - 4 C_{Nij} - 2 C_{Vij} \qquad (9)$$
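The demand, cost, revenue and objective expressions above can be collected in a short sketch. The parameter values are those the paper reports in its table of empirical values; the function names and the use of module-level constants are illustrative choices, and the consumers' surplus is taken as the integral of demand from the tolls paid to infinity, as described in the text.

```python
import math

OMEGA, ALPHA = 2338.0, -1.0   # demand multiplier and price coefficient
PHI, THETA = 0.018, 0.08      # network cost ($/vkt), collection cost ($/vehicle)
TRIP_KM = 6.67                # average trip length in the jurisdiction, 1/psi


def flow(r_i, r_j):
    """Equation (1): flow across the boundary."""
    return OMEGA * math.exp(ALPHA * (r_i + r_j))


def surplus(r_i, r_j):
    """Consumers' surplus: demand integrated from the tolls paid to infinity."""
    return -flow(r_i, r_j) / ALPHA


def network_cost(r_i, r_j):
    return flow(r_i, r_j) * TRIP_KM * PHI      # equation (3)


def collection_cost(r_i, r_j):
    return flow(r_i, r_j) * THETA              # equation (4)


def revenue(r_i, r_j):
    return r_i * flow(r_i, r_j)                # equation (5)


def local_profit(r_i, r_j):
    """Objective (7)."""
    return (2 * revenue(r_i, r_j) - 2 * network_cost(r_i, r_j)
            - 2 * collection_cost(r_i, r_j))


def local_welfare(r_i, r_j):
    """Objective (6): residents' surplus plus profit."""
    return surplus(r_i, r_j) + local_profit(r_i, r_j)


print(local_welfare(0.20, 0.20), local_profit(0.70, 0.70))
```

Profit changes sign near a toll of about 0.20, the point at which revenue just covers network and collection costs.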
TOLL-SETTING
The discussion to date still leaves some latitude in how to solve the tactical problem of toll-
setting. The issue, in solving for the toll of Jurisdiction I ($r_i$), is what toll ($r_j$) Jurisdiction
I should assume that Jurisdiction J uses, given that J's choice of policy is known. Two approaches
can be considered: non-cooperative and cooperative equilibria.
First, if we assume no collusion (implicit or otherwise), we attain a non-cooperative Nash
equilibrium for toll-setting. This means that Jurisdiction I can do no better by changing its toll
given what Jurisdiction J does, while Jurisdiction J can also do no better. This does not
necessarily result in the best satisfaction of the objective function, but is sustainable. This is
solved keeping the two toll variables, $r_i$ and $r_j$, separate and not necessarily equal.
It may be possible to attain higher overall welfare (profit) than under the non-cooperative
approach. However, it will be to the advantage of any jurisdiction to cheat (i.e. raise tolls) if
the other jurisdiction doesn't cheat or retaliate but retains the cooperative tolls resulting from
this solution. The cooperative solution is sustainable as an equilibrium in indefinitely repeated
games.ⁱⁱⁱ Simply, the issue again is how Jurisdiction I treats $r_j$. To attain this cooperative
solution, each jurisdiction includes both its own and the other jurisdiction's tolls as variables in
its objective satisfaction calculations. (Under non-cooperative equilibrium, the other
jurisdiction's toll could be treated as a constant.) The overall payoff maximizing result can be
achieved by setting $r_j = r_i$ in the equations, and solving for the equilibrium toll ($r^* = r_j = r_i$).
Economic theory argues that, when jurisdictions are welfare maximizing, cooperation should
result in the rate of toll equal to the marginal cost of travel for those paying the toll, that is, the
network cost, which is the average trip length of the portion of the trip in Jurisdiction I ($1/\psi$)
multiplied by a cost per unit distance ($\phi$), plus the cost of toll collection ($\theta$). In fact, this is
the case, as will be seen in the next section.ⁱᵛ In the absence of fixed costs, and where average
costs equal marginal costs, this implies cost recovery is satisfied.
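The two toll-setting approaches can be illustrated numerically. The sketch below is not the paper's algebraic method: it finds the non-cooperative toll by iterating best responses and the cooperative toll by maximizing the payoff along $r_i = r_j$, using a simple ternary search that assumes the payoff is unimodal in the toll. The parameter values are the paper's reported empirical values; all function names are illustrative.

```python
import math

OMEGA, ALPHA, PHI, THETA, TRIP_KM = 2338.0, -1.0, 0.018, 0.08, 6.67
UNIT_COST = PHI * TRIP_KM + THETA   # network plus collection cost per crossing


def local_welfare(r_i, r_j):
    """Residents' surplus plus profit of Jurisdiction I, as in objective (6)."""
    f = OMEGA * math.exp(ALPHA * (r_i + r_j))
    return -f / ALPHA + 2 * f * (r_i - UNIT_COST)


def argmax(fn, lo=0.0, hi=5.0, iters=200):
    """Ternary search for the maximizer; assumes fn is unimodal on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if fn(a) < fn(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def nash_toll(payoff, start=0.0, rounds=50):
    """Non-cooperative toll: each side best-responds to the other's toll."""
    r_i = r_j = start
    for _ in range(rounds):
        r_i = argmax(lambda r: payoff(r, r_j))
        r_j = argmax(lambda r: payoff(r, r_i))   # symmetric jurisdictions
    return r_i


def cooperative_toll(payoff):
    """Cooperative toll: both tolls are varied together (r_i = r_j)."""
    return argmax(lambda r: payoff(r, r))


print(round(nash_toll(local_welfare), 2), round(cooperative_toll(local_welfare), 2))
```

Under these assumptions the cooperative toll coincides with the unit cost per crossing, which is the marginal-cost result stated in the text.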
EMPIRICAL VALUES
The model does not have much real-world meaning without understanding typical values for
the model coefficients. Table 1 gives some values developed from earlier research by the
author (Levinson 1998). The first two variables, $\alpha$ and $\omega$, describe demand. The variable
$\alpha$ is set to -1; this value makes consistent what is known about the user costs of highway travel
developed from a full cost study (Levinson and Gillen 1998) and a gravity model's decay
function (Levinson and Kumar 1995). This variable must be less than zero to ensure that
demand falls when prices rise. The second demand variable, $\omega$, describes the number of trips
when the total monetary price $r_i + r_j = 0$. Clearly this is a scalar and does not affect tolls or the
ultimate decision to tax or toll in this analysis. To keep this analysis consistent with other
research by the author, it is set at 2338, which is a value derived from a more complex version
of the model (considering multiple jurisdictions). The variable network cost is the cost that a
jurisdiction faces for every vehicle kilometer traveled. The value of $\phi = 0.018$ was estimated by
Levinson and Gillen (1998) from a database of state highway expenditures and vehicle travel.
The variable collection cost ($\theta$) was estimated from toll collection costs on California bridges
(Levinson 1998). Average trip length ($1/\psi$) within the jurisdiction was calculated from the
multiple jurisdiction model, which required a factor for which trips were sensitive to distance
traveled ($\psi$ = \$0.15/km), developed in Levinson and Gillen (1998).
Table 1: Empirical Values of Model Coefficients
Variable              Description                                   Value
Alpha ($\alpha$)      coefficient relating demand to price          -1
Omega ($\omega$)      demand multiplier (trips at price = 0)        2338
Phi ($\phi$)          variable network cost (\$/vkt)                0.018
1/Psi ($1/\psi$)      average trip length in jurisdiction (km)      6.67
Theta ($\theta$)      variable collection cost (\$/vehicle)         0.08
SOLUTIONS
Table 2 shows algebraic solutions for each scenario (combining objective and toll-setting
methodology) assuming that jurisdictions do employ tolls. These results were simplified by
assuming the demand coefficient $\alpha = -1$. The final column shows the numerical result
assuming the empirical values described in the previous section.
Table 2: Tolls by Scenario
Scenario (Objective: Maximize; Toll-Setting)              Solution                                                  Result
Local Welfare, Non-Cooperative Toll-Setting (WN)          $r_i^{WN} = (2\phi + \psi + 2\theta\psi) / (2\psi)$       \$0.70
Local Welfare, Cooperative Toll-Setting (WC)              $r_i^{WC} = (\phi + \theta\psi) / \psi$                   \$0.20
Local Profit, Non-Cooperative Toll-Setting ($\Pi$N)       $r_i^{\Pi N} = (\phi + \psi + \theta\psi) / \psi$         \$1.20
Local Profit, Cooperative Toll-Setting ($\Pi$C)           $r_i^{\Pi C} = (2\phi + \psi + 2\theta\psi) / (2\psi)$    \$0.70
Local Welfare, Cost Recovery Toll-Setting (CR)            $r_i^{CR} = (\phi + \theta\psi) / \psi$                   \$0.20
Global Welfare Maximizing (WG)                            $r^{WG} = (2\phi + \theta\psi) / \psi$                    \$0.32*
note: solution obtained by setting $\alpha = -1$, result obtained with empirical values described in Table 1.
* indicates tolls in case of Global Welfare Maximization, which should be halved to compare with other
scenarios.
The first thing to note is that the tolls resulting from the non-cooperative welfare maximizing
scenario ($r_i^{WN}$) are the same as cooperative profit maximizing tolls ($r_i^{\Pi C}$). As mentioned in the
previous section, we find that welfare maximizing cooperative tolls ($r_i^{WC}$) do equal the
marginal costs of travel across the frontier. Also, because we have no fixed costs here, the
tolls and welfare from the cost recovery objective are the same as those from welfare maximizing
with cooperative toll-setting. The global welfare maximizing objective also has tolls equal to
marginal costs; with fewer toll-booths, marginal costs are simply lower.
We realize some other interesting relationships in the analysis, independent of the empirical
values of the model coefficients:
1. Profit maximizing cooperative tolls ($r_i^{\Pi C}$) are always \$0.50 higher than welfare
maximizing cooperative tolls ($r_i^{WC}$).
2. Profit maximizing non-cooperative tolls ($r_i^{\Pi N}$) are always \$0.50 higher than welfare
maximizing non-cooperative tolls ($r_i^{WN}$).
3. Welfare maximizing non-cooperative tolls ($r_i^{WN}$) are always \$0.50 higher than cooperative
tolls ($r_i^{WC}$).
4. Profit maximizing non-cooperative tolls ($r_i^{\Pi N}$) are always \$0.50 higher than cooperative
tolls ($r_i^{\Pi C}$).
5. Therefore, profit maximizing non-cooperative tolls ($r_i^{\Pi N}$) are always \$1.00 higher than
welfare maximizing cooperative tolls ($r_i^{WC}$).
These relationships are summarized in Equation (10):
$$r_i^{WC} + \$1.00 = r_i^{WN} + \$0.50 = r_i^{\Pi C} + \$0.50 = r_i^{\Pi N} \qquad (10)$$
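The closed-form tolls and the fixed differences between them can be checked with a few lines. The formulas are the algebraic solutions tabulated above for a demand coefficient of -1; the function name and dictionary keys are illustrative, and the default arguments are the reported empirical values with psi = 0.15 (so that 1/psi is about 6.67 km).

```python
def closed_form_tolls(phi=0.018, theta=0.08, psi=0.15):
    """Tolls by scenario for a demand coefficient alpha = -1."""
    return {
        "WN": (2 * phi + psi + 2 * theta * psi) / (2 * psi),  # welfare, non-coop
        "WC": (phi + theta * psi) / psi,                      # welfare, coop
        "PN": (phi + psi + theta * psi) / psi,                # profit, non-coop
        "PC": (2 * phi + psi + 2 * theta * psi) / (2 * psi),  # profit, coop
        "CR": (phi + theta * psi) / psi,                      # cost recovery
        "WG": (2 * phi + theta * psi) / psi,                  # global welfare
    }


tolls = closed_form_tolls()
for name, value in tolls.items():
    print(name, round(value, 2))

# The differences hold for any cost values, e.g. with doubled unit costs.
other = closed_form_tolls(phi=0.036, theta=0.16)
print(round(other["PN"] - other["WC"], 2), round(other["WN"] - other["WC"], 2))
```

The fixed differences of 0.50 and 1.00 come from the demand coefficient alone, which is why they do not change when the cost parameters are varied.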

In contrast with the usual application of cooperative equilibria for analyzing industrial
organization of competitive markets, the best repeated game (cooperative) equilibrium toll is
lower than the Nash equilibrium (non-cooperative) toll. Furthermore, the lower toll results in
higher welfare and profit. The main reason for this is that we are dealing with complementary
rather than substitute goods in our revenue mechanism game. Thus, cooperation to lower tolls
allows higher welfare in an application similar to serial monopolists raising profits by
cooperating to lower tolls (Chamberlin 1933). A second reason is that the objective function
includes not just profit, but also consumers' surplus.
SENSITIVITY TESTS
Figures 2 through 5 show the sensitivity of the model as we vary key parameters around their
assumed values (shown in Table 1). Table 3 gives us the elasticity (the percentage change of
the variable of interest: tolls, profits, and welfare, for each percentage change in the input
variable) for each scenario. Figures 2, 3, and 4 illustrate how tolls rise linearly as unit costs
($\phi$, $\theta$) and trip lengths ($1/\psi$) rise, keeping all other variables at the values shown in Table 1.
Figure 5 shows how welfare and profits change as tolls ($r_i = r_j$) vary, again assuming all other
variables are at the values shown in Table 1. Welfare is maximized when tolls are \$0.20, profits
when tolls are \$0.70. Clearly when collection and network unit costs rise, welfare and profits
decline. Trip lengths are somewhat more complicated: as they rise, tolls rise, but so do
welfare and profit until trip lengths exceed 66 km.
Table 2: Elasticity of Tolls, Profits, and Welfare as Inputs Vary

Input              Toll: W-NonCoop   Toll: Π-NonCoop   Toll: W-Coop   Payoff
Trip Length        0.171             0.100             0.600           1.765
Network Costs      0.171             0.100             0.600          -0.215
Collection Costs   0.114             0.067             0.400          -0.159
note: the elasticity of payoffs to changes in trip length, network costs, and collection costs is the same for both
welfare and profit, and for cooperative and non-cooperative equilibria
NON-COOPERATIVE GAME THEORY
Non-cooperative game theory is employed to analyze the strategic interactions between two
jurisdictions under various conditions and objectives. Two decisions are considered: first, the
strategic choice of revenue mechanism (tax or toll); and second, the tactical selection of the rate
of tax or toll given the strategic choices by jurisdiction J0 and the other jurisdictions (the
environment).
The application of game theory requires acceptance of certain assumptions about the behavior
of actors (in this case jurisdictions) and their level of knowledge. First, it is assumed that
actors are instrumentally rational, that is they express preferences (which are ordered
consistently and obey the property of transitivity) and act to best satisfy those preferences.
Second, it is assumed that there is common knowledge of rationality (CKR), which means that
each actor knows that each other actor is instrumentally rational, and that each actor knows that
each actor knows, and so on. Third, it is assumed that there is a consistent alignment of beliefs
(CAB), such that each actor, given the same information and circumstances, will make the
same decision; no actor should be surprised by what another actor does. Last, it is assumed all
players know the rules of the game, including all possible actions and the payoffs of each for
every player. These four assumptions are used in our analysis of a highly stylized game
between two jurisdictions who have clear objectives.
The payoff to each jurisdiction depends on the policy (tax or toll), objective (welfare or profit),
and the toll-setting equilibrium (cooperative or non-cooperative) taken by both itself and the
other jurisdiction. The source of interaction between jurisdictions derives from residents of one
jurisdiction traveling on the roads of the other. Thus the revenue and the pricing policy of one
jurisdiction alters the demand for the roads of both jurisdictions. The payoffs to jurisdictions
are shown in Tables 4 and 5, representing Welfare and Profit respectively.
Table 3: Payoffs for Welfare Maximizing Jurisdictions

Ji \ Jj                   Π-Non-Coop.    W-Non-Coop.      W-Coop.            Tax
                                         = Π-Coop.        = Cost Recovery
Π-Non-Coop.               [636, 636]     [1049, 699]      [1822, 577]        [2226, 535]
W-Non-Coop. = Π-Coop.     [699, 1049]    [1153, 1153]*    [1901, 951]        [2322, 883]
W-Coop. = Cost Recovery   [577, 1822]    [951, 1901]      [1567, 1567]       [1914, 1455]
Tax                       [535, 2226]    [883, 2322]      [1455, 1914]       [1777, 1777]
note: [payoff to Ji, payoff to Jj]; *: Indicates Nash Equilibrium in One-Shot Game; Italics: Indicates Higher
Welfare Scenario Pair; Underline Italics: Indicates Highest Welfare Scenario Pair with Toll Policy, Stable under
repeated game equilibrium; Double-Underline Italics: Indicates Highest Welfare Scenario Pair
Examining Table 4, we can find the Nash equilibrium solution to the one-shot game, that is, the
solution where Ji cannot improve its position given what Jj is doing, and vice versa, for welfare
maximizing jurisdictions. The tolls from the non-cooperative local welfare maximizing
scenario produce the Nash equilibrium. For all Jj policies, Ji maximizes welfare by choosing
this policy, and similarly for Jj. However, a number of scenario pairs, denoted in italics, have
higher overall welfare: both jurisdictions together would be better off if somehow they could
choose any of those pairs. Assuming toll policies, welfare would be maximized by each
jurisdiction choosing the lower tolls of cooperative toll-setting, while overall, a [tax, tax]
scenario pair (with no tolls) has the highest overall welfare.
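
The Nash equilibrium search described above can be reproduced mechanically. The following is a minimal sketch, assuming the payoff pairs are transcribed correctly from the welfare maximizing payoff table (rows are Ji's policy, columns are Jj's policy); the function and variable names are hypothetical and are not part of the original model.

```python
# Payoffs (to J_i, to J_j) transcribed from the welfare maximizing payoff table.
# Policy order for both rows (J_i) and columns (J_j):
POLICIES = ["Pi-NonCoop", "W-NonCoop", "W-Coop", "Tax"]
WELFARE = [
    [(636, 636), (1049, 699), (1822, 577), (2226, 535)],
    [(699, 1049), (1153, 1153), (1901, 951), (2322, 883)],
    [(577, 1822), (951, 1901), (1567, 1567), (1914, 1455)],
    [(535, 2226), (883, 2322), (1455, 1914), (1777, 1777)],
]


def pure_nash(payoffs):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    n = len(payoffs)
    found = []
    for r in range(n):
        for c in range(n):
            # Row player cannot do better by switching rows within column c.
            row_best = all(payoffs[r][c][0] >= payoffs[k][c][0] for k in range(n))
            # Column player cannot do better by switching columns within row r.
            col_best = all(payoffs[r][c][1] >= payoffs[r][k][1] for k in range(n))
            if row_best and col_best:
                found.append((r, c))
    return found


equilibria = pure_nash(WELFARE)
# Cell with the highest combined welfare, regardless of stability.
best_total = max(
    ((r, c) for r in range(4) for c in range(4)),
    key=lambda rc: sum(WELFARE[rc[0]][rc[1]]),
)

for r, c in equilibria:
    print("Nash equilibrium:", POLICIES[r], POLICIES[c], WELFARE[r][c])
print("Highest total welfare:", POLICIES[best_total[0]], POLICIES[best_total[1]])
```

With these values the only pure-strategy equilibrium is the pair of welfare maximizing non-cooperative tolls, while the [tax, tax] pair has the highest combined welfare, consistent with the discussion above.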
Table 4: Payoffs for Profit Maximizing Jurisdictions

Ji \ Jj                   Π-Non-Coop.     W-Non-Coop.      W-Coop.            Tax
                                          = Π-Coop.        = Cost Recovery
Π-Non-Coop.               [424, 424]*     [699, 350]       [1246, 0]          [1521, -169]
W-Non-Coop. = Π-Coop.     [350, 699]      [577, 577]       [951, 0]           [1161, -279]
W-Coop. = Cost Recovery   [0, 1246]       [0, 951]         [0, 0]             [0, -459]
Tax                       [-169, 1521]    [-279, 1161]     [-459, 0]          [-561, -561]
note: *: Indicates Nash Equilibrium in One-Shot Game; Italics: Indicates Higher Profit Scenario Pair; Underline
Italics: Indicates Highest Stable (non-cooperative repeated game) Profit Scenario Pair; Double-Underline Italics:
Indicates Highest Profit Scenario Pair
Similarly, examining Table 5, where both jurisdictions are profit maximizing, we find that the
Nash equilibrium is to employ the tolls assuming profit-maximizing non-cooperative toll-
setting. Again, a number of scenario pairs have higher overall payoffs.
Table 5: Payoff Accruing to Jurisdictions: Ji Welfare Maximizing, Jj Profit Maximizing

Ji \ Jj                   Π-Non-Coop.     W-Non-Coop.      W-Coop.            Tax
                                          = Π-Coop.        = Cost Recovery
Π-Non-Coop.               [636, 424]      [1049, 350]      [1822, 0]          [2226, -169]
W-Non-Coop. = Π-Coop.     [699, 699]*     [1153, 577]      [1901, 0]          [2322, -279]
W-Coop. = Cost Recovery   [577, 1246]     [951, 951]       [1567, 0]          [1914, -459]
Tax                       [535, 1521]     [883, 1161]      [1455, 0]          [1777, -561]
note: *: Indicates Nash Equilibrium in One-Shot Game; Italics: Indicates Higher Payoff Scenario Pair; Double-
Underline Italics: Indicates Highest Payoff Scenario Pair
Combining the matrices from Table 4 and Table 5, shown in Table 6, we consider the
payoffs where one jurisdiction is welfare maximizing Ji and the other Jj is profit maximizing.
In this case, for a one-shot non-cooperative equilibrium game, Ji chooses the welfare
maximizing non-cooperative tolls while Jj chooses the profit maximizing non-cooperative tolls.
Most of the other scenario pairs produce higher total payoffs, indicating gains from cooperation
or a repeated game.
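
The claim that most other scenario pairs yield a higher total payoff than the one-shot equilibrium can be checked by direct enumeration. This is a sketch under the assumption that the mixed-objective payoffs are transcribed correctly; a few cells that are garbled in the printed table were taken from the corresponding cells of the welfare and profit tables, and the names used here are hypothetical.

```python
# Payoffs (to J_i, to J_j): J_i (rows) maximizes welfare, J_j (columns) profit.
# Row and column order: Pi-NonCoop, W-NonCoop, W-Coop, Tax.
MIXED = [
    [(636, 424), (1049, 350), (1822, 0), (2226, -169)],
    [(699, 699), (1153, 577), (1901, 0), (2322, -279)],
    [(577, 1246), (951, 951), (1567, 0), (1914, -459)],
    [(535, 1521), (883, 1161), (1455, 0), (1777, -561)],
]
NASH = (1, 0)  # row W-NonCoop, column Pi-NonCoop: the starred cell

nash_total = sum(MIXED[NASH[0]][NASH[1]])
# Collect every other cell whose combined payoff exceeds the equilibrium total.
higher = [
    (r, c)
    for r, row in enumerate(MIXED)
    for c, cell in enumerate(row)
    if (r, c) != NASH and sum(cell) > nash_total
]
others = len(MIXED) * len(MIXED[0]) - 1

print(f"Total payoff at the one-shot equilibrium: {nash_total}")
print(f"{len(higher)} of {others} other scenario pairs have a higher total payoff")
```

Only two pairs fall below the equilibrium total here: mutual profit maximizing non-cooperative tolls and the [tax, tax] pair, the latter because the profit maximizing jurisdiction records a loss under taxation.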
INFINITELY REPEATED GAME
Tables 4, 5, and 6 represent a number of payoffs, but at their heart lies a complex prisoner's
dilemma, with multiple cooperative and non-cooperative strategies. The tables show that the
Nash equilibrium solution does not have the highest overall payoff. In a repeated game, the
payoff maximizing solution may also be an equilibrium when some mechanism to enforce
cooperation is in place. Cooperation has two advantages. First, cooperation protects local
citizens from the negative effects of other jurisdictions' pricing policies. Second, cooperation
eliminates the finance externality that reduces demand for local roads from non-local residents
and thereby hurts profits. Other mixed policies (alternating [Tax, Toll] and [Toll, Tax] for
instance) may also achieve higher results, especially since they reduce collection costs and the
negative effects of a serial monopoly relative to a single monopoly (Chamberlin 1933).
Enforcement mechanisms include the ability to "punish" and "reward" neighbors in a repeated
game, a government in the case of many players (jurisdictions), or a negotiated treaty, contract,
or compact.
This dissonance between individual and collective payoffs in a one-time game may disappear in
a repeated game. While both the one-shot and the finitely repeated prisoner's dilemma give
unique solutions, the indefinitely repeated prisoner's dilemma does not ensure a unique
solution. The "Folk Theorem" demonstrates that in infinitely and indefinitely repeated games,
any of the potential payoff pairs in repeated games can be obtained as a Nash equilibrium with
a suitable choice of strategies by the players. There are always multiple equilibria in an
indefinitely repeated game, though some strategies have higher collective payoffs than others.
Given various discount rates, different solutions will result in the highest repeated game payoff.
The question is how cooperation between jurisdictions can be achieved. A mechanism that can
result in strategic cooperation without actual negotiation is the enforcement available in
repeated games. In an indefinitely repeated game, one jurisdiction's behavior can be
disciplined by another. Cheating on an agreement (for instance tolling when taxing was agreed
to) by jurisdiction Ji in one round (year) can be punished in the next period by jurisdiction Jj,
which would also toll, thereby hurting the payoff to jurisdiction Ji. This section applies the
mathematics underlying repeated games, and computes the necessary discount factors for
cooperation to be stable between rational jurisdictions.
To begin we will examine the conventional two strategy one-shot game. Consider the
representation in Table 7 (after Taylor 1987) of the payoffs for two strategies of the two player
prisoner's dilemma game, where the traditional prisoner's dilemma cooperate strategy is
associated with tax and the defect strategy with non-cooperative toll-setting. (A similar
construction could be made between either of these two policies and a cooperative toll-setting
policy). As noted above non-cooperative toll-setting is a Nash equilibrium in this one-shot
game. The letters w, x, y, and z are used to denote the payoffs in this section as shown in the
table.
Table 6: Welfare of Boundary Crossing Trips on Infinite Road Covered by Two Welfare-
Maximizing Jurisdictions

Ji \ Jj                  Tax                      Non-Cooperative Tolls
Tax                      [x, x] = [1777, 1777]    [z, y] = [883, 2322]
Non-Cooperative Tolls    [y, z] = [2322, 883]     [w, w] = [1153, 1153]

where: y > x > w > z; numeric values indicate payoffs from the model
Payoffs from repeated games (or a supergame) can be thought of as the summation of a series
of payoffs from one-shot games, discounted so that the present period's game is more valuable
than the next and so on. If we define a discount factor for jurisdiction i, a_i (and a discount rate
1 - a_i), then we can compute the supergame payoff (X) from a strategy which results in the
payoff x on every turn as X = x(a_i + a_i^2 + a_i^3 + ...), or X = x(a_i / (1 - a_i)), and similarly for any
other payoffs (w, y, z). It should be noted that 1 > a_i > 0, and other values are invalid
(suggesting either that future payoffs are more valuable than the present if a_i > 1, or that future
payoffs are negative in value if 0 > a_i). It should also be noted that the discount factor can vary
for different jurisdictions.
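
The closed form for the supergame payoff can be checked against a long truncated sum. The sketch below is illustrative only: the function names are hypothetical, and the discount factor of 0.9 and per-round payoff of 1777 (the [tax, tax] welfare payoff) are chosen simply as an example.

```python
def supergame_payoff(per_round, a):
    """Closed form of per_round * (a + a**2 + a**3 + ...) for 0 < a < 1."""
    if not 0 < a < 1:
        raise ValueError("discount factor must satisfy 0 < a < 1")
    return per_round * a / (1 - a)


def truncated_payoff(per_round, a, rounds):
    """Direct summation of the first `rounds` discounted payoffs."""
    return sum(per_round * a ** t for t in range(1, rounds + 1))


closed = supergame_payoff(1777, 0.9)
approx = truncated_payoff(1777, 0.9, 500)
print(f"closed form: {closed:.2f}, 500-round sum: {approx:.2f}")
```

Because the series starts at the first power of the discount factor, a constant payoff of x per round is worth x times a / (1 - a), which is nine times the per-round payoff when the discount factor is 0.9.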
Strategies in a sequence of games can be formulated which result in stable equilibria for each
player and higher payoffs. We will consider four supergame strategies: tax on every round
(X∞), toll on every round (T∞), conditionally tax with initial trust (B), and conditionally tax with
initial distrust (B'). The first conditional strategy (B), also called tit-for-tat, begins by
cooperating (imposing a tax) on turn 1, and then on all subsequent turns does what the other
player did in the previous turn. A variation on this strategy (B') is also tit-for-tat, but begins by
defecting (imposing a toll) on turn 1, and then does what the other player did in the previous turn.
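
To make the four supergame strategies concrete, the following minimal sketch plays any two of them against each other and records the sequence of moves. The function names and the string labels for the moves are assumptions introduced for this illustration only.

```python
TAX, TOLL = "tax", "toll"


def always_tax(own_history, other_history):
    """Tax on every round, regardless of the other player."""
    return TAX


def always_toll(own_history, other_history):
    """Toll on every round, regardless of the other player."""
    return TOLL


def strategy_b(own_history, other_history):
    """B (tit-for-tat): tax on turn 1, then copy the other's last move."""
    return other_history[-1] if other_history else TAX


def strategy_b_prime(own_history, other_history):
    """B': toll on turn 1, then copy the other's last move."""
    return other_history[-1] if other_history else TOLL


def play(strategy_i, strategy_j, rounds):
    """Return the list of (move of J_i, move of J_j) over the given rounds."""
    hist_i, hist_j = [], []
    for _ in range(rounds):
        # Both players choose simultaneously from the histories so far.
        move_i = strategy_i(hist_i, hist_j)
        move_j = strategy_j(hist_j, hist_i)
        hist_i.append(move_i)
        hist_j.append(move_j)
    return list(zip(hist_i, hist_j))


print(play(strategy_b, strategy_b, 4))              # mutual taxation throughout
print(play(strategy_b_prime, strategy_b, 4))        # the two alternate tax and toll
print(play(strategy_b_prime, strategy_b_prime, 4))  # mutual tolling throughout
```

The output shows why [B', B] is described as an alternating policy pair and why [B', B'] is in practice the same as both jurisdictions tolling on every round.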
We can conclude that in the repeated game, the strategy pair of both jurisdictions choosing to
toll on every round, independent of what the other players are doing, [T∞, T∞], is an equilibrium.
Neither player can improve their position if the other plays T∞. However, this is not necessarily
the best equilibrium. The strategy of taxing every round, again independent of what the other
players are doing (X∞), is never an equilibrium. If your opponent is playing X∞, there is always
a gain possible from any other strategy. The conditional supergame strategies, where the policy
employed by one jurisdiction depends on what other jurisdictions did on a previous turn, are
more complicated.
We can reformulate the game in terms of supergame strategies, shown in Table 8. The three
supergame strategies which are sometimes equilibria (B, B', T∞) can be played by jurisdictions Ji
and Jj. The cells in the table show which conditions (of Table 9) must hold for the supergame
strategy pair to be a repeated game equilibrium. It can be shown (Taylor 1987) that the results shown
in the first column of Table 9 hold when the conditions in the second column bear out.
Table 7: Conditions for Supergame Strategies to be Equilibria

        B                       B'                      T∞
B       (1) & (2)               (3) & rev. (2)          Never equilibrium
        for Ji, Jj              for Ji, Jj
        [1 > a_i > 0.60]        [0.60 > a_i > 0.23]
B'      (3) & rev. (2)          (4) & rev. (3)          (4) & rev. (3)
        for Ji, Jj              for Ji, Jj              for Jj
        [0.60 > a_i > 0.23]     [0.30 > a_i > 0]        [0.30 > a_j > 0]
T∞      Never equilibrium       (4) & rev. (3)          Always equilibrium
                                for Ji
                                [0.30 > a_i > 0]
Note: rev. denotes reversing the > in the condition (i.e. making it <). Conditions are defined in the table of
conditions and results that follows. [ ] indicates results of conditions for game
Table 8: Conditions for Supergame Strategies, and Results from Equations Above

Result                                        Condition                     Value of RHS
(1) B is superior to T∞ if                    a_i ≥ (y - x) / (y - w)       0.46
(2) B is superior to B' if                    a_i > (y - x) / (x - z)       0.60
(3) B' is superior to T∞ if                   a_i > (w - z) / (y - w)       0.23
(4) Mutual B' is stable if the reverse of     a_i ≤ (w - z) / (x - z)       0.30
    condition (3) holds and
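
The right-hand-side values in the table of conditions can be recomputed from the four prisoner's dilemma payoffs. The sketch below assumes x = 1777 (the [tax, tax] welfare payoff), y = 2322, w = 1153, and z = 883, and it assumes the ratio forms (y - x)/(y - w), (y - x)/(x - z), (w - z)/(y - w), and (w - z)/(x - z) for conditions (1) through (4). These forms follow from comparing discounted payoff streams of the strategies and reproduce the printed values, but parts of the printed fractions are illegible, so they should be read as a reconstruction rather than a quotation.

```python
# Payoffs for welfare maximizing jurisdictions (assumed transcription):
# x = both tax, y = toll while the other taxes,
# w = both toll non-cooperatively, z = tax while the other tolls.
x, y, w, z = 1777, 2322, 1153, 883
assert y > x > w > z  # ordering required for a prisoner's dilemma

thresholds = {
    1: (y - x) / (y - w),  # B superior to tolling on every round
    2: (y - x) / (x - z),  # B superior to B'
    3: (w - z) / (y - w),  # B' superior to tolling on every round
    4: (w - z) / (x - z),  # upper bound for mutual B' to be stable
}

# The printed values appear to be truncated, not rounded, to two decimals.
truncated = {k: int(v * 100) / 100 for k, v in thresholds.items()}

for k in sorted(thresholds):
    print(f"condition ({k}): {thresholds[k]:.4f} -> {truncated[k]:.2f}")
```

Truncating each ratio to two decimals gives 0.46, 0.60, 0.23, and 0.30, matching the final column of the table.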
The final column of Table 9 gives the value associated with the right hand side of the condition
in the table. Applying those conditions to the strategy pairs of Table 8, we get the solution to
the repeated game equilibria, shown by the range of discount factors given in brackets in that
table. We assume that if there are multiple equilibria in the game, jurisdictions will choose
the one which results in the highest welfare to them so long as it results in the highest welfare
to other players. Just as in the one-shot game, if there is one stable equilibrium which does
provide the highest welfare to all players, it can be anticipated to be chosen. We see that several
policy pairs are valid (repeated game equilibria). Significantly, for discount factors in the
range 1 > a_i > 0.60 (or discount rates between 0% and 40%, where typical government
interest rates were well under 10% in the United States in the 1990s), mutual cooperation [B, B]
is a stable equilibrium, and since it has the highest payoff, we can assume that it would be the
selected equilibrium.
This alternating policy pair [B', B] or [B, B'] emerges as stable for the range of discount
factors: 0.60 > aj > 0.23 (or discount rates between 40% and 77%). Implicitly this assumes that
toll-booths can be constructed and removed at no loss, or at least result in no charge during the
off-turn, though the extent to which this is true is empirical. A similar policy is for one
jurisdiction to always play cooperate and the other defect, so long as revenues are shared
equally between them. Whether this can actually be enforced depends on the institutional
arrangements between the jurisdictions. However, if we assume that these jurisdictions can
cooperate at that level, it is unclear why they would select the alternating policy pair unless it
had a higher payoff.
A range of discount factors (0.30 > a_i > 0) (discount rates between 70% and 100%) allows the
policy pair of [B', B'] to be stable, which in practice is the equivalent of mutual defection
[T∞, T∞]. Similarly, [T∞, B'] and [B', T∞] are stable when one or the other jurisdiction has such a
low discount factor (0.30 > a_i > 0). These policies are also the equivalent of mutual defection.
This exercise can be undertaken for other profit and welfare maximizing policy couplets. The
key point to take away is that cooperative equilibria are stable for a wide variety of realistic
interest rates for indefinitely and infinitely repeated games.
SUMMARY AND CONCLUSIONS

This paper examined the question of what happens when jurisdictions have the opportunity to
establish toll-booths at the frontier separating them. Clearly, tolls are more likely at frontiers
than at internal locations if only because a greater percentage of the toll falls on non-residents.
Nevertheless, for larger jurisdictions, frontier toll-booths still raise nearly half their revenue
from residents.
If welfare-maximizing jurisdictions behave non-cooperatively, they are likely to toll; however,
if they can arrange to cooperate, they will employ lower tolls or agree not to toll. Cooperation
is easier the fewer jurisdictions involved. A border between two large jurisdictions essentially
involves traffic from only those two jurisdictions. However, that same border along small
jurisdictions will serve traffic from many different jurisdictions.
If all jurisdictions hope to maximize profit, they will toll, even if they do cooperate. However
if they cooperate (by means such as forming a single toll or road authority), they will charge
lower tolls and even eliminate one toll-booth between them (so that they share revenue while
lowering operating costs). Profit maximization is more likely under private sector management
than public sector. So if tolling is a desired policy outcome, privatization will be more likely to
achieve it than public control.
There are several ways the analysis could be extended. First is the inclusion of congestion
costs. Congestion pricing is often cited as the main benefit from road pricing, but its benefits
cannot be understood with the model in the absence of delay due to excess demand. Second,
this paper has assumed that travelers are identical except in their reservation price. Congestion
pricing is most meaningful when demand is heterogeneous, that is, different travelers have
different values of time and differ in their disutility from congestion. Third, all fixed costs
were neglected. This simplifies the analysis, particularly under cost recovery behavior, but is
not necessarily a realistic approach.
REFERENCES
Axelrod, R. (1984). The Evolution of Cooperation. Basic Books, New York.

Bernstein, D. and J. Muller (1993). Understanding the competing short-run objectives of peak
period road pricing. Transportation Research Record 1395 pp 122-128
Chamberlin, E. (1933). The Theory of Monopolistic Competition: A Re-orientation of the
Theory of Value. Harvard University Press, Cambridge MA.
de Palma, A. and R. Lindsey (1998) Private Toll Roads: A Dynamic Equilibrium Analysis
presented at Western Regional Science Association Meeting Feb. 1998, Monterey CA.
Downs, A. (1994). Stuck in Traffic. Brookings Institution, Washington DC.
Dupuit, J. (1849). "On Tolls and Transport Charges" reprinted in International Economic
Papers 1962 11 7-31
Gittings, G. (1987). Some Financial, Economic, and Social Policy Issues Associated with Toll
Finance. Transportation Research Record 1102 20-30
Gomez-Ibanez, J. and J. Meyer (1993). Going Private: The International Experience with
Transport Privatization. The Brookings Institution, Washington DC.
Hargreaves-Heap, S. and Y. Varoufakis (1995). Game Theory: A Critical Introduction.
Routledge. New York.
Keeler, T. and K. Small (1977). Optimal Peak-load Pricing, Investment, and Service Levels on
Urban Expressways. Journal of Political Economy 85:1 1-25
Klein, D. (1990). The Voluntary Provision of Public Goods? The Turnpike Companies of Early
America. Economic Inquiry. March 1990.
Levinson, D. (1998). On Whom the Toll Falls: A Model of Network Finance University of
California at Berkeley Dissertation.
Levinson, D. and D. Gillen (1998). The Full Cost of Intercity Highway Transportation.
Transportation Research D 3:4 207-223,
Levinson, D. and A. Kumar (1994). Multimodal Trip Distribution: Structure and Application,
Transportation Research Record 1466 124-131
Mohring, H. (1970). The Peakload Problem with Increasing Returns and Pricing Constraints.
American Economic Review. 60: 693-705
Newell, G. (1980). Traffic Flow on Transportation Networks. MIT Press, Cambridge, MA.
Osborne, M. and A. Rubinstein (1994). A Course in Game Theory. MIT Press, Cambridge,
MA.
Poole, R. and Y. Sugomoto. (1994). Congestion Relief Toll Tunnels, Transportation Quarterly
48:2 115-134.
Rapoport, A. (1970). N-Person Game Theory: Concepts and Applications. The University of
Michigan Press, Ann Arbor, MI.
Roth, G. (1996) Roads in a Market Economy. Avebury Technical Press, Aldershot UK.
Small, K., Winston, C. and Evans, C. (1989). Road Work. Brookings Institution. Washington,
DC.
Small, K. (1983). The Incidence of Congestion Tolls on Urban Highways. Journal of Urban
Economics. 13: 90-111.
Taylor, M. (1987). The Possibility of Cooperation. Cambridge University Press, Cambridge
UK.
Transportation Research Board. (1994) Curbing Gridlock: Peak Period Fees to Relieve Traffic
Congestion: Special Report 242. Washington, DC.
Verhoef, E., P. Nijkamp and P. Rietveld (1996). Second Best Congestion Pricing: The Case of
an Untolled Alternative. Journal of Urban Economics 40 279-302.
Vickrey, W. (1963). Pricing in Urban and Suburban Transport. American Economic Review. 53
452-465.
Vickrey, W. (1969). Congestion Theory and Transport Investment. American Economic
Review 59 251-60
Viton, Philip. (1981). Optimal Tolls on the Bay Bridge. Journal of Transport Economics and
Policy. 15 185-204.
Viton, Philip. (1990). Private Roads. Journal of Urban Economics. 37 260-289.
Von Neumann, J. and O. Morgenstern. (1944). Theory of Games and Economic Behavior.
Princeton University Press, Princeton, NJ.
Figure 2: Tolls as Network Cost Changes by Scenario
[Plot: Tolls vs. Average Collection Cost per Crossing; horizontal axis: Theta, Average Cost to Jurisdiction ($/Crossing)]
Figure 3: Tolls as Collection Cost Changes by Scenario
[Plot: Tolls vs. Average Cost per VKT; horizontal axis: Phi, Average Cost to Jurisdiction per vkt; series: W-NONC, Π-NONC, W-COOP]
Figure 4: Tolls as Average Trip Length Changes by Scenario
[Plot: Tolls vs. Average Trip Length; horizontal axis: Average Trip Length (1/Psi) in Jurisdiction (km); series include W-COOP]
Figure 5: Welfare and Profit as Tolls Change by Scenario
[Plot: Welfare and Profits vs. Tolls; horizontal axis: Tolls ($); series: Welfare, Profits; annotations: Welfare Maximizing Solutions (Cooperative, NonCooperative), Profit Maximizing Solutions (Cooperative, NonCooperative)]
i To quantify the importance of frontiers, of 133 major countries existing prior to the fall of the Soviet Union, there
were 500 international boundaries between them, with each boundary containing multiple crossings (source:
author's calculations). This does not include sub-national frontiers (state, provincial, county, or city boundaries,
for instance).
ii By symmetry, the consumers' surplus in each direction is identical, and by symmetric trip tables, half the flow in
each direction is made by residents; therefore we only need to compute the total consumers' surplus in one
direction rather than half in both directions.
iii The Nash equilibrium conditions state that when all jurisdictions are identical, each jurisdiction will try to
achieve the highest welfare for itself, recognizing that other jurisdictions will do the same. However, in an
indefinitely repeated prisoner's dilemma game, strategies which enforce cooperation by punishing "defection" can
be employed to maximize overall welfare.
iv In an infinitely repeated games context, this is the best result that jurisdictions can attain over the long
term, and though other solutions are also equilibria, no other solution improves on this one overall (though a single
jurisdiction raising tolls, violating the equal tolls provision, may have a higher individual welfare or profit).
CARPOOLING AND PRICING IN A MULTILANE

HIGHWAY WITH HIGH-OCCUPANCY-VEHICLE
LANES AND BOTTLENECK CONGESTION
Hai-Jun Huang
School of Management,
Beijing University of Aeronautics and Astronautics, Beijing 100083, P.R. China
Hai Yang
Department of Civil & Structural Engineering,
The Hong Kong University of Science and Technology, Kowloon, Hong Kong
ABSTRACT
This paper examines the departure time choices of commuters facing early or late arrival penalties
and the evolution of the equilibrium queue in a multilane highway with high-occupancy-vehicle
(HOV) lanes and a bottleneck, using deterministic equilibrium mode choice models with
elastic demand. Optimal uniform and time-varying tolls are sought to achieve the optimal mode
split and to eliminate the queue behind the bottleneck. Road tolling schemes in either the absence or
presence of HOV lanes and an 'equal toll' constraint are derived for either a first-best or a
second-best social optimum. The resultant changes in departure time, person-delay, total social cost
and total benefit are compared by numerical examples. Our results show that, in the absence of
HOV lanes, an anonymous toll for all vehicles (independent of the number of occupants of the
vehicles) should be charged to obtain a first-best social optimum. In the presence of
carpooling lanes, however, pricing for a first-best social optimum requires differentiating
the toll across segregated lanes. If the 'equal tolls' constraint is imposed, a weighted average
of the marginal external congestion costs of the carpooling and non-carpooling commuters
should be set for a second-best social optimum. A time-varying tolling scheme is also
determined to eliminate the queue behind the bottleneck, and the resulting equilibrium modal split
is ascertained. Our observations have strong practical implications for the combined application of
HOV lanes and congestion pricing.
1. INTRODUCTION
Carpooling is usually cheaper than either using a car alone or using mass transit because
expenses are split between two or more riders and there is no walk to or wait for scheduled public
transportation. The comfort level of carpooling is basically the same as that of the private
vehicle, but the need to own a special car for regular travel to work, with all the attendant
costs, is greatly reduced. From the viewpoint of government authorities, carpooling can result
in considerable reductions in traffic congestion, air pollution, noise levels and parking space by
decreasing the number of vehicles. Carpooling has therefore become a well developed mode of
transportation in many countries. According to the 1975-1976 national surveys conducted by the U.S.
Bureau of the Census in 41 urban areas, 20 to 30% of American workers who commuted in a
vehicle were carpoolers (BOC, 1979). In the United Kingdom, 12% of total trip miles were
travelled by passengers in non-household cars, while only 10% were travelled on local stage buses
(Bonsall, 1981).
In the 1990s, however, the use of carpooling as a travel mode declined; for example, in
America the share of carpooling to work was only 13.4% nationwide in 1990 (FHWA, 1993).
One reason is that in some ways carpooling is inferior to both driving alone and public
transit riding (for instance, carpooling increases travel time because of the need to
pick up and deliver carpool members; carpoolers suffer a loss of independence
and privacy; and for many people the anonymity of transit riding may be a more comfortable
social climate than carpooling). Another important reason is the lack of organized and efficient
programs to encourage people to carpool (Collura, 1994; Giuliano et al., 1990). Ferguson
(1990) and Willson and Shoup (1990) found that employer-based programs (e.g.,
employers reducing or removing parking subsidies) were efficient approaches to promoting
carpooling behaviour.
Considering the second reason above, techniques for giving priority to high-occupancy
vehicles are growing in importance. These techniques include providing carpools (and
vanpools and buses) with reserved street and expressway lanes. High-occupancy-vehicle
(HOV) lanes have now become fashionable worldwide as a cost-efficient method of addressing
growing traffic congestion problems and have been introduced in many cities and areas such as
Beijing, Hong Kong and California. Once an HOV lane is established, carpools (and vanpools
and buses) are provided with both travel time savings and more reliable travel
times. These two benefits serve as incentives for commuters to choose a higher occupancy
mode. The total vehicle demand for scarce road capacity is thereby reduced and the person-
movement efficiency of the roadway is thus increased. In some areas, additional incentives
such as reduced toll charges for carpools have been used to further encourage individuals to
use a high-occupancy mode. Road pricing can encourage carpool commuting because
the riders of a carpool can share the toll charge and so cut their individual cost.
Therefore, these policies of HOV lanes and toll differentiation would provide a cost-effective
way to reduce traffic congestion (O'Sullivan, 1993; Turnbull et al., 1991).
It is thus of strong interest to investigate the benefits of these individual or combined
procedures for alleviating traffic congestion. In fact, some attempts have been made to
evaluate the priority scheme, and specifically to determine whether the benefits from a priority
scheme are likely to be substantial, and how they compare to those from marginal-cost pricing.
Mohring (1979) developed a class of highly simplified models to obtain a rough quantification
of the benefits of reserved bus lanes and marginal cost pricing on ordinary streets with
exclusive bus lanes. Small (1977, 1983) addressed the same problem by using a disaggregate
logit model of individual choices among alternative modes for work trips and a deterministic
queuing model behind a single-point bottleneck. Through extensive numerical simulations they
found that the priority measure for buses is effective in motivating people to shift to the public
transit mode and thereby reducing urban highway congestion, but that such policies yield little
additional benefit if auto tolls are charged. Dahlgren (1998) examined
whether constructing an HOV lane will necessarily reduce person-delay more than constructing
a general-purpose lane. Her logit model of individual choices takes into
account the dynamic nature of the travel time differential between HOV lanes and general lanes, as
well as the initial delays in all lanes. Applying the model to typical situations,
she found that adding an HOV lane can provide substantial reductions in person-delay if the initial
delays are long; otherwise adding a general lane is more effective. However, all of her analysis
and development is conducted without the introduction of congestion pricing on the highway
with HOV lanes.
While the above three studies provide some interesting insights into road priority service
schemes, they are limited in the following respects: (1) they assume that
transportation demand is given and fixed; (2) the penalty for a commuter's early or late
arrival at the workplace is not considered; (3) shifts in departure time are left out.
Constructing an HOV lane or a general-purpose lane, or converting an existing lane to an HOV
lane, will reduce the average person-delay by increasing traffic capacity directly or indirectly,
which will in turn induce new trips. Each new trip represents a benefit to the trip maker.
Hence, it may be inappropriate to use a model with an assumed fixed travel demand in
evaluating the overall benefit gained from an HOV lane policy. Moreover, because the priority
scheme is an efficient way to manage traffic flow during the peak period, its effects on
travelers' choice of departure and arrival times should not be omitted. In addition, it is
meaningful to investigate how to determine time-varying tolls that remove the queue behind the
bottleneck and how to differentiate the tolls according to vehicle occupancy to achieve an
optimal modal split. In these respects, Huang (1998) investigated, for a highway with a
bottleneck, the behavior of work commuters in choosing between carpooling and driving alone
under several tolling policies, but HOV lanes and demand elasticity were not
considered. Recently, Yang and Huang (1999) dealt with the elastic demand case in a multilane
highway with or without HOV lanes, but they did not consider bottleneck congestion and
schedule delay in a dynamic manner.
This paper builds upon Yang and Huang (1999) and Huang (1998). Using deterministic
equilibrium mode choice models with elastic demand, we determine the departure time choices of
commuters with early or late arrival penalties and the evolution of the equilibrium queue in a
multilane highway with HOV lanes and a bottleneck. In particular, we investigate the optimal
uniform and time-varying tolls that achieve an optimal mode split and eliminate the queue behind
the bottleneck. Road tolling schemes in either the absence or presence of HOV lanes are derived
for either a first-best or a second-best social optimum, and the resultant changes in departure time,
person-delay, total social cost and total benefit are compared by numerical examples. In
Section 2, all assumptions made in the paper are given and the main results on the classical
bottleneck problem are reviewed. Section 3 considers the case without HOV lanes. After
discussing the benchmark non-toll equilibrium, an anonymous toll for all vehicles (independent
of their number of occupants) for obtaining a first-best social optimum is derived. Section 4
treats the case with HOV lanes. A dynamic or time-varying tolling scheme, which can
be used to eliminate any queue behind the bottleneck, is also studied in each of these two
cases. Section 5 provides numerical demonstrations to show the effects generated by
HOV lanes, pricing schemes, and their combinations. Section 6 concludes the paper.
2. BASIC ASSUMPTIONS AND BOTTLENECK MODELS

Suppose a multilane highway connects a residential area and a workplace, and a
certain number of identical individuals must commute to work on the highway every day.
Without loss of generality, we suppose there are two travel modes: commuters
can either drive alone to work (solo driver) or share a car with one other person to form a
two-person carpool. For simplicity, we consider two-person carpools only; it is
straightforward to incorporate carpools with three or more occupants. An economic analysis
of the size of the carpool can be found in Lee (1984). Let $N_1$ and $N_2$ denote the numbers of
non-carpooling and carpooling commuters, respectively, with $N_1 + N_2 = N$, where $N$ is the
total number of commuters. Clearly, the number of vehicles running on the highway is
$N_1 + 0.5N_2$.
Let $l$ be the number of traffic lanes on the highway, and let $l_1$ and $l_2$ be, respectively, the
numbers of normal lanes for non-carpooling vehicles and carpool lanes for carpooling vehicles only
(hereinafter, carpool lanes are used instead of HOV lanes since the high-occupancy mode consists
of carpools only in our study). Naturally, $l_1 + l_2 = l$. We assume that there is a bottleneck at the
entry point of the highway and, as a result, each traffic lane has a deterministic capacity of $s$
vehicles per unit time. Travel is uncongested except at the bottleneck, with a constant time $t_a$ from
passing the bottleneck to arriving at the workplace for all types of lanes.
Now we formulate the private travel cost functions for the two types of commuters, taking the
solo-driving mode as the example for deriving the detailed formulae. If the departure rate of
non-carpooling vehicles from the residential area (i.e., the arrival rate at the bottleneck) exceeds
the total capacity $l_1 s$ of the normal lanes, a queue develops and the actual traffic flow equals the
capacity. Thus, the capacity constraint is a flow constraint, while the queue discipline is
first-in, first-out (FIFO). Let $u_1(t)$ be the departure rate at time $t$; the vehicle queue length
$q_1(t)$ evolves according to $dq_1(t)/dt = u_1(t) - l_1 s$ for $q_1(t) > 0$. The travel time from the
residential area to the workplace is $t_a + q_1(t)/(l_1 s)$, where $q_1(t)/(l_1 s)$ is the time spent waiting
in the queue. A non-carpooling commuter's total travel cost depends on travel time and schedule
delay (time-early or time-late in arriving at the workplace) if tolling is not introduced. To simplify,
we assume a linear travel cost function
$$
C_1(t) =
\begin{cases}
\alpha\,[t_a + q_1(t)/(l_1 s)] + \beta\,[t^* - (t + t_a + q_1(t)/(l_1 s))], & t \in [t_{1q}, t_{10}],\\
\alpha\,[t_a + q_1(t)/(l_1 s)] + \gamma\,[(t + t_a + q_1(t)/(l_1 s)) - t^*], & t \in [t_{10}, t_{1q'}],
\end{cases}
\tag{1}
$$
where $\alpha$ = the unit cost of travel time, $\beta$ ($\gamma$) = the unit cost of schedule delay time-early
(time-late), $t^*$ = the official work starting time, $t_{1q}$ = the departure time at which the queue begins,
$t_{1q'}$ = the departure time at which the queue ends, and $t_{10}$ = the departure time at which a
commuter arrives at the workplace on time, i.e., $t_{10} + t_a + q_1(t_{10})/(l_1 s) = t^*$. In accordance with
empirical results by Small (1982), we assume that $\gamma > \alpha > \beta$.
In choosing when to leave home, individuals face a trade-off between travel time and schedule
delay. At equilibrium, no non-carpooling commuter can find a departure time that
reduces her travel cost, taking all other commuters' departure times as fixed. In other words,
$dC_1(t)/dt = 0$ for all departure times actually used. Hence, we have
$$
\frac{dq_1(t)}{dt} =
\begin{cases}
\beta l_1 s/(\alpha - \beta), & t \in [t_{1q}, t_{10}],\\
-\gamma l_1 s/(\alpha + \gamma), & t \in [t_{10}, t_{1q'}],
\end{cases}
$$
and, since $dq_1(t)/dt = u_1(t) - l_1 s$, the departure rate is
$$
u_1(t) =
\begin{cases}
\alpha l_1 s/(\alpha - \beta), & t \in [t_{1q}, t_{10}],\\
\alpha l_1 s/(\alpha + \gamma), & t \in [t_{10}, t_{1q'}].
\end{cases}
$$
Integrating $dq_1(t)/dt$ yields
$$
q_1(t) =
\begin{cases}
(t - t_{1q})\,\beta l_1 s/(\alpha - \beta), & t \in [t_{1q}, t_{10}],\\
-(t - t_{10})\,\gamma l_1 s/(\alpha + \gamma) + (t_{10} - t_{1q})\,\beta l_1 s/(\alpha - \beta), & t \in [t_{10}, t_{1q'}].
\end{cases}
\tag{2}
$$
The piecewise linear function $q_1(t)$ is nonnegative within the time interval $[t_{1q}, t_{1q'}]$ and
zero otherwise. The first and last non-carpooling commuters (i.e., those who depart at $t_{1q}$ and
$t_{1q'}$) face no queue and have the same private travel cost. The exit rate from the bottleneck
during the rush hour $[t_{1q}, t_{1q'}]$ is $l_1 s$. Combining these facts and the definition of $t_{10}$, we can
determine the following three unknowns:
$$t_{1q} = t^* - \delta N_1/(\beta l_1 s) - t_a, \tag{3.1}$$
$$t_{1q'} = t^* + \delta N_1/(\gamma l_1 s) - t_a, \tag{3.2}$$
$$t_{10} = t^* - \delta N_1/(\alpha l_1 s) - t_a, \tag{3.3}$$
where $\delta = \beta\gamma/(\beta + \gamma)$. Therefore, the generalized private travel cost perceived by each
non-carpooling commuter, without tolling, is
$$C_1 = \delta N_1/(l_1 s) + \alpha t_a. \tag{4}$$
Other results are:
The total cost of the $l_1$-lane system = $N_1 C_1 = \delta (N_1)^2/(l_1 s) + N_1 \alpha t_a$
The marginal social cost = $\partial(N_1 C_1)/\partial N_1 = 2\delta N_1/(l_1 s) + \alpha t_a$
The marginal externality = marginal social cost - private travel cost = $\delta N_1/(l_1 s)$
The maximum queue length in vehicles = $q_1^* = q_1(t_{10}) = \delta N_1/\alpha$.
It is also easy to show that, in the total cost, half of $\delta (N_1)^2/(l_1 s)$ is contributed by the total
waiting time cost and the other half by the total schedule delay cost. The bottleneck model
highlights the costs of traveling at inconvenient times, so that the $\alpha$-related terms for waiting
times are represented by $\delta$-related components in the private travel cost, marginal social cost and
marginal externality.
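The closed-form results above can be checked numerically. The following is a minimal sketch, assuming the cost parameters quoted later in the numerical section ($\gamma = 4.2$, $\alpha = 1.6$, $\beta = 0.8$, $s = 5$, $t_a = 55$); the number of solo drivers (400) and the work start time (minute 540) are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
def bottleneck_equilibrium(n_veh, lanes, s, alpha, beta, gamma, t_star, t_a):
    """No-toll equilibrium for one lane group, following (3.1)-(3.3) and (4)."""
    delta = beta * gamma / (beta + gamma)
    cap = lanes * s  # bottleneck capacity of the lane group (veh/min)
    return {
        "delta": delta,
        "cap": cap,
        "t_q": t_star - delta * n_veh / (beta * cap) - t_a,       # queue begins
        "t_q_end": t_star + delta * n_veh / (gamma * cap) - t_a,  # queue ends
        "t_0": t_star - delta * n_veh / (alpha * cap) - t_a,      # on-time departure
        "cost": delta * n_veh / cap + alpha * t_a,                # private cost (4)
        "q_max": delta * n_veh / alpha,                           # longest queue
    }


def queue_length(t, eq, alpha, beta, gamma):
    """Piecewise linear queue q(t) from (2); zero outside the rush hour."""
    if t <= eq["t_q"] or t >= eq["t_q_end"]:
        return 0.0
    grow = beta * eq["cap"] / (alpha - beta)      # queue growth rate (early arrivals)
    shrink = gamma * eq["cap"] / (alpha + gamma)  # queue decay rate (late arrivals)
    if t <= eq["t_0"]:
        return (t - eq["t_q"]) * grow
    return (eq["t_0"] - eq["t_q"]) * grow - (t - eq["t_0"]) * shrink


# Cost parameters from the numerical section; n_veh and t_star are hypothetical.
ALPHA, BETA, GAMMA = 1.6, 0.8, 4.2
eq = bottleneck_equilibrium(n_veh=400, lanes=4, s=5, alpha=ALPHA, beta=BETA,
                            gamma=GAMMA, t_star=540.0, t_a=55.0)
print(eq["cost"], eq["q_max"], queue_length(eq["t_0"], eq, ALPHA, BETA, GAMMA))
```

The queue evaluated at the on-time departure should coincide with $\delta N_1/\alpha$, and the rush hour should last $N_1/(l_1 s)$, which gives two quick consistency checks on the formulas.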
On the carpool lanes for carpooling vehicles, note that the number of vehicles is $0.5N_2$. Using
the same method introduced above, we obtain
$$t_{2q} = t^* - 0.5\delta N_2/(\beta l_2 s) - t_a, \tag{5.1}$$
$$t_{2q'} = t^* + 0.5\delta N_2/(\gamma l_2 s) - t_a, \tag{5.2}$$
$$t_{20} = t^* - 0.5\delta N_2/(\alpha l_2 s) - t_a, \tag{5.3}$$
and the generalized private travel cost perceived by each carpooling commuter
$$C_2 = 0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta, \tag{6}$$
where $\Delta$ is a constant that lumps the additional cost (positive or negative) for individual
occupants of carpools compared to solo drivers. Note that $\Delta$ may be seen to consist of the
extra cost for collection and distribution of riders, any other undesirable features such as
scheduling inconvenience, and the out-of-pocket savings resulting from equal sharing of
mileage, gasoline cost, parking charges and so on. Finally, the maximum queue length,
measured in number of vehicles and occurring at time $t_{20}$, is $q_2^* = q_2(t_{20}) = 0.5\delta N_2/\alpha$.
In this paper, we employ a single demand function to endogenize the total number of
commuters $N$. Let $B(N)$ denote the marginal benefit or demand function of commuting trips,
which is assumed to decline with commuting demand, $dB(N)/dN < 0$. Thus, we have in effect
assumed both an overall or external demand elasticity and an internal demand
elasticity. The external demand elasticity reflects the extent to which auto commuters
efficiently leave the highway system altogether due to pricing, and the internal demand
elasticity reflects the extent to which commuters divide themselves efficiently between
carpooling and non-carpooling modes for travel. The problem of interest in this paper is to
reconcile two partially conflicting goals: achieving the right total number of road users, and
allocating these users appropriately between the two travel modes and across departure
times.
3. CARPOOLING AND PRICING IN THE ABSENCE OF HOV LANES

In this section, we investigate three kinds of equilibria under non-tolling, uniform tolling and
time-varying tolling policies, respectively. These three equilibria are obtained without
dedicated lanes for carpools (i.e., $l_1 = l$ and $l_2 = 0$). So, the flows of carpooling and
non-carpooling vehicles are not separated by lanes. If no external interference exists, all vehicles of
the two modes leave home during the same time interval $[t_q, t_{q'}]$ and are well-distributed on the
highway since their vehicular queue length curves have identical slopes.
3.1 No Toll
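If no road toll is collected, the trade-off equilibrium between travel time and schedule delay for
each mode leads to
$$t_q = t^* - \delta(N_1 + 0.5N_2)/(\beta l s) - t_a, \tag{7.1}$$
$$t_{q'} = t^* + \delta(N_1 + 0.5N_2)/(\gamma l s) - t_a, \tag{7.2}$$
$$t_0 = t^* - \delta(N_1 + 0.5N_2)/(\alpha l s) - t_a, \tag{7.3}$$
$$C_1 = \delta(N_1 + 0.5N_2)/(l s) + \alpha t_a, \tag{8}$$
$$C_2 = \delta(N_1 + 0.5N_2)/(l s) + \alpha t_a + \Delta, \tag{9}$$
and the maximum queue length $q^* = q(t_0) = \delta(N_1 + 0.5N_2)/\alpha$.
The equilibrium between the travel costs of the two modes is characterized by
$$
\begin{cases}
N_1 = N \text{ and } N_2 = 0, & \text{if } \Delta > 0,\\
N_1 = 0 \text{ and } N_2 = N, & \text{if } \Delta < 0,\\
N_1 \text{ and } N_2 \text{ take any positive values s.t. } N_1 + N_2 = N \text{ and } C_1 = C_2 = B(N), & \text{if } \Delta = 0,
\end{cases}
\tag{10}
$$
where the total demand is determined by equalizing travel cost and marginal trip benefit, i.e., it
is determined by
$\delta N_1/(l s) + \alpha t_a = B(N)$ with $N_1 = N$ if $\Delta > 0$; by
$\delta(0.5N_2)/(l s) + \alpha t_a + \Delta = B(N)$ with $N_2 = N$ if $\Delta < 0$;
while it is not determined uniquely when $\Delta = 0$.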
Evidently, the equilibrium solution associated with $\Delta = 0$ might not closely coincide with
empirical observations. In reality, commuters will generally prefer the solo-driving
mode unless some strong incentives are provided for carpooling (Teal, 1987). So the
real situation is approximately characterized by the solution $N_1 = N$ and $N_2 = 0$ when no
powerful factors motivate carpooling. Note that the mode share at the equilibrium state given
by (10) is independent of the first two terms of both travel cost functions (i.e., $C_1$ and $C_2$), and an
explicit modal split cannot be obtained when $\Delta = 0$. The reason is that the riders of carpools
can divide the out-of-pocket costs (toll and fuel cost), but they cannot do the same with the
travel time and schedule delay. Therefore, very little can be expected in terms of traffic congestion
reduction or public benefit increase through self-forming of carpools without any external
intervention. This observation suggests that certain public policies should be made to promote
carpooling, including discriminatory road regulations such as introduction of special lanes and
reduced toll for carpools to be discussed herein.
3.2 Uniform Tolling
Now we consider the impacts of road-use pricing on total benefits and mode choice. It is well
known that pricing will encourage the carpool commuting mode because riders of a carpool can
share their toll charge and therefore lower their individual cost. We now seek a uniform
or anonymous toll which maximizes the total benefit of the system. The total benefit is
given by the area under the demand function curve minus the total social cost of the system.
The total social cost equals aggregate schedule delay costs plus aggregate queuing and moving
costs. Note that the uniform toll changes neither the distribution of commuters' departure
time choices nor the travel costs caused by schedule delay, queuing and moving losses.
That is, each solo-driving and carpooling commuter bears the travel costs given by (8)
and (9), respectively, in addition to the tolls determined below. Thus, we have
$$
\begin{aligned}
\max\ TB &= \int_0^N B(x)\,dx - C_1 N_1 - C_2 N_2\\
&= \int_0^N B(x)\,dx - (N_1 + N_2)\,[\delta(N_1 + 0.5N_2)/(l s) + \alpha t_a] - N_2 \Delta
\end{aligned}
\tag{11}
$$
with $N_1 + N_2 = N$ and $N_1 \ge 0$, $N_2 \ge 0$. It is easy to show that the problem (11) is strictly
concave and then a unique solution exists. The first-order optimality conditions include
$$C_1 + \delta N/(l s) = B(N), \tag{12.1}$$
$$C_2 + \delta N/(2 l s) = B(N). \tag{12.2}$$
Equation (12) represents an equilibrium leading to a social optimum, for which a tolling
scheme must be implemented. Each rider of a carpool should pay an amount of $\delta N/(2ls)$, which
is the externality generated by an additional carpooling commuter joining the highway system,
so a two-person vehicle should be charged a toll amounting to $\delta N/(ls)$. This is exactly the
same toll that should be paid by a solo-driving commuter, as shown in the second term of
(12.1). Hence, we conclude that in the absence of carpool lanes in a multilane highway, a
uniform or anonymous toll for all vehicles should be charged for achieving a social optimum.
Now we further investigate the properties of the equilibrium solutions. Subtracting (12.1)
from (12.2), we obtain $\Delta = \delta N/(2ls)$. So, the optimal uniform toll for all vehicles is $\tau = 2\Delta$. This
means that at the social optimum the marginal reduction of the system's total social cost when
forming an additional carpool equals twice the extra carpooling cost, i.e., $\delta N/(ls) = 2\Delta$.
Suppose $B(0) > \alpha t_a$, which guarantees the overall demand $N > 0$. Using (12) with the definition
of equilibrium we can show: $N_1 = 0$ leads to $\Delta \le \delta N/(2ls)$ and $N_2 = 0$ to $\Delta \ge \delta N/(2ls)$. However,
$\Delta = \delta N/(2ls)$ is not a sufficient condition for having an interior point solution (i.e., both $N_1$ and
$N_2$ positive). Rearranging (12.1) and (12.2) with $N_1 + N_2 = N$, we have
$$N_1 = 2ls\,[B(\bar N) - \alpha t_a - 3\Delta]/\delta > 0 \quad\text{and}\quad N_2 = 2ls\,[-B(\bar N) + \alpha t_a + 4\Delta]/\delta > 0,$$
where $\bar N = 2\Delta l s/\delta$. Hence
$$3\Delta < B(\bar N) - \alpha t_a < 4\Delta$$
must hold for obtaining an interior point solution. The correct solution route is
(1) Compute $\bar N = 2\Delta l s/\delta$.
(2) Calculate the positive solution $N_1^*$ and $N_2^*$ as just given, with $N^* = \bar N$, if
$3\Delta < [B(\bar N) - \alpha t_a] < 4\Delta$ holds.
(3) Otherwise, solve the equation $C_1 + \delta N/(ls) = 2\delta N_1/(ls) + \alpha t_a = B(N)$ to get $N^*$, then
$N_1^* = N^*$ and $N_2^* = 0$ if $[B(\bar N) - \alpha t_a] < 3\Delta$; or solve the equation $C_2 + \delta N/(2ls) =
\delta N_2/(ls) + \alpha t_a + \Delta = B(N)$ to get $N^*$, then $N_1^* = 0$ and $N_2^* = N^*$ if
$[B(\bar N) - \alpha t_a] > 4\Delta$.
(4) Finally, we use equation (7) to compute the time unknowns as well as the maximum
queue length with the total demand and modal share just obtained.
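The solution route in steps (1) to (3) can be sketched in code. The sketch below assumes an affine marginal benefit $B(N) = K(N_{\max} - N)$, the form adopted later in Section 5.2; the slope values $K = 0.3$ and $K = 0.2$ are hypothetical, the function name is illustrative, and the threshold used for the all-carpool branch ($B(\bar N) - \alpha t_a > 4\Delta$) is taken as the complement of the other two cases. It only reproduces the first-order conditions and the case distinction as stated; it does not check second-order conditions.

```python
def uniform_toll_route(l, s, delta, alpha, t_a, Delta, K, N_max):
    """Follow steps (1)-(3) for the assumed affine demand B(N) = K * (N_max - N)."""
    c = delta / (l * s)                            # cost slope per vehicle
    n_bar = 2 * Delta * l * s / delta              # step (1)
    surplus = K * (N_max - n_bar) - alpha * t_a    # B(n_bar) - alpha * t_a
    if 3 * Delta < surplus < 4 * Delta:            # step (2): interior candidate
        n1 = 2 * l * s * (surplus - 3 * Delta) / delta
        n2 = 2 * l * s * (4 * Delta - surplus) / delta
    elif surplus <= 3 * Delta:                     # step (3): everyone drives alone
        # solves 2 c N + alpha t_a = K (N_max - N)
        n1, n2 = (K * N_max - alpha * t_a) / (K + 2 * c), 0.0
    else:                                          # step (3): everyone carpools
        # solves c N + alpha t_a + Delta = K (N_max - N)
        n1, n2 = 0.0, (K * N_max - alpha * t_a - Delta) / (K + c)
    toll = c * (n1 + n2)                           # vehicle toll delta * N / (l s)
    return n1, n2, toll


# delta = 0.8 * 4.2 / (0.8 + 4.2) = 0.672 with the paper's cost parameters.
print(uniform_toll_route(l=4, s=5, delta=0.672, alpha=1.6, t_a=55.0,
                         Delta=10.0, K=0.3, N_max=1000.0))
```

With the interior branch active, the returned toll should equal $2\Delta$ and both conditions (12.1) and (12.2) should hold, which is what the assertions accompanying this sketch verify.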
3.3 Time-Varying Tolling
For the bottleneck problem with identical vehicles, it is known that there exists a
time-varying tolling scheme which eliminates any queue on the road (except the case in which the
bottleneck capacity is queue length-dependent, see Huang and Yang (1996) and Yang and
Huang (1997)). In our problem, there are two types of vehicles: the non-carpooling vehicles
with parameters $(\alpha, \beta, \gamma)$ and the carpooling vehicles with parameters $(2\alpha, 2\beta, 2\gamma)$. According to
the results by Arnott et al. (1992, 1994), the queue can be eliminated by a
time-varying tolling scheme, under which the vehicles are rationally allocated to
departure time slots so as to minimize their travel costs. Because the carpooling commuters
take advantage of sharing the toll, they choose the middle of the rush hour, i.e.,
they are willing to pay a higher vehicular toll but a lower shared individual toll. The main results
when time-varying tolling is introduced are summarized below.
The timing of the rush hour, $t_q$ and $t_{q'}$, is the same as given in (7.1) and (7.2), with the
specific total demand and modal share determined in this subsection. The queue that appears under
non-tolling and uniform tolling does not exist now and is completely replaced by a fine toll
(vehicular toll), which is a piecewise linear curve with four slopes of magnitude $\beta$, $2\beta$, $2\gamma$ and
$\gamma$. Carpooling vehicles commute in the middle of the rush hour, $[t_{12}, t_{21}]$, and solo-driving
vehicles depart from home during $[t_q, t_{12}]$ and $[t_{21}, t_{q'}]$, where
$$t_{12} = t^* - 0.5\delta N_2/(\beta l s) - t_a, \tag{13.1}$$
$$t_{21} = t^* + 0.5\delta N_2/(\gamma l s) - t_a. \tag{13.2}$$
At the equilibrium state, the generalized travel costs of each non-carpooling and carpooling
commuter, including the toll components, are
$$C_1 = \delta(N_1 + 0.5N_2)/(l s) + \alpha t_a, \tag{14}$$
$$C_2 = \delta(0.5N_1 + 0.5N_2)/(l s) + \alpha t_a + \Delta, \tag{15}$$
respectively. Note the difference between (9) and (15). Eqn (15) is obtained by summing
the early-arrival cost and the toll paid by a carpooling commuter departing at time $t_{12}$, as
well as $(\alpha t_a + \Delta)$, i.e., $[t^* - (t_{12} + t_a)]\beta + 0.5\beta(t_{12} - t_q) + (\alpha t_a + \Delta)$. The total social cost of the
highway system, with no queuing costs now, is
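$$
\begin{aligned}
TSC ={}& \int_{t_q}^{t_{12}} (t^* - t - t_a)\,\beta l s\,dt + \int_{t_{12}}^{t^* - t_a} (t^* - t - t_a)\,2\beta l s\,dt + \int_{t^* - t_a}^{t_{21}} (t + t_a - t^*)\,2\gamma l s\,dt\\
&+ \int_{t_{21}}^{t_{q'}} (t + t_a - t^*)\,\gamma l s\,dt + (N_1 + N_2)\alpha t_a + N_2 \Delta\\
={}& \delta N_1 N/(2ls) + \delta(0.5N_2)^2/(ls) + (N_1 + N_2)\alpha t_a + N_2 \Delta.
\end{aligned}
\tag{16}
$$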

Assume that we aim at maximizing the total benefit of the system. We solve the following
optimization problem:
$$\max\ TB = \int_0^N B(x)\,dx - [\delta N_1 N/(2ls) + \delta(0.5N_2)^2/(ls) + (N_1 + N_2)\alpha t_a + N_2 \Delta] \tag{17}$$
with $N_1 + N_2 = N$ and $N_1 \ge 0$, $N_2 \ge 0$. Suppose the maximum occurs at an interior point of the
feasible region; then the following first-order optimality conditions are obtained:
$$\delta(N_1 + 0.5N_2)/(ls) + \alpha t_a = C_1 = B(N), \tag{18.1}$$
$$\delta(0.5N_1 + 0.5N_2)/(ls) + \alpha t_a + \Delta = C_2 = B(N), \tag{18.2}$$
and $N_1 + N_2 = N$. Subtracting (18.1) from (18.2) yields $N_1 = 2\Delta l s/\delta$. If $N_2 > 0$, then (18.2)
holds, which means $N$ is the solution of equation (18.2), i.e., $N = [B(N) - \alpha t_a - \Delta](2ls/\delta)$. Moreover,
$N > N_1$, which leads to $B(N) - \alpha t_a > 2\Delta$. Therefore, an interior point solution exists if the
$N$-solution of (18.2) is larger than $2\Delta l s/\delta$; otherwise $N_2 = 0$ and $N_1 = N$, which is the solution of
(18.1). The case with corner solution $N_1 = 0$ and $N_2 = N$ is unlikely to occur when $\Delta > 0$.
Comparing $N_1 = 2\Delta l s/\delta$ generated by the time-varying toll to $\bar N = 2\Delta l s/\delta$ by the uniform toll, we find
that the total demand under the time-varying tolling scheme is larger ($> 2\Delta l s/\delta$
since $N > N_1$), the number of vehicles increases ($> 2\Delta l s/\delta$), and the rush-hour interval is
wider. These observations hold only when interior point solutions exist for both tolling
regimes.
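The fine dynamic toll with respect to vehicles is
$$
\tau(t) =
\begin{cases}
B(N) - [\beta(t^* - t - t_a) + \alpha t_a], & t \in [t_q, t_{12}],\\
2B(N) - [2\beta(t^* - t - t_a) + 2\alpha t_a + 2\Delta], & t \in [t_{12}, t^* - t_a],\\
2B(N) - [2\gamma(t + t_a - t^*) + 2\alpha t_a + 2\Delta], & t \in [t^* - t_a, t_{21}],\\
B(N) - [\gamma(t + t_a - t^*) + \alpha t_a], & t \in [t_{21}, t_{q'}],
\end{cases}
\tag{19}
$$
and 0 otherwise, where the time unknowns are determined by (7) and (13). Substituting the
values of these time unknowns into (19), we have $\tau(t_q) = \tau(t_{q'}) = 0$, $\tau(t_{12}) = \tau(t_{21}) = \delta N_1/(ls)$, and
the highest point $\tau(t^* - t_a) = \delta N/(ls)$. So, the fine toll is a piecewise linear curve without jumps.
The revenue generated from (19) is the area under this curve times the capacity $ls$.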
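The shape of the fine toll can be illustrated with a short sketch that builds the curve from its breakpoints and slopes ($\beta$, $2\beta$, $-2\gamma$, $-\gamma$) rather than from $B(N)$, which is an equivalent description once the timings (7) and (13) are used. The modal split ($N_1 = 300$, $N_2 = 200$) and the work start time are hypothetical inputs, not equilibrium values, and the function names are illustrative.

```python
def toll_breakpoints(n1, n2, l, s, beta, gamma, t_star, t_a):
    """Return (t_q, t_12, peak, t_21, t_q_end) from (7.1), (7.2), (13.1), (13.2)."""
    delta = beta * gamma / (beta + gamma)
    cap = l * s
    peak = t_star - t_a                  # departure time that arrives exactly on time
    veh = n1 + 0.5 * n2                  # number of vehicles
    return (peak - delta * veh / (beta * cap),
            peak - 0.5 * delta * n2 / (beta * cap),
            peak,
            peak + 0.5 * delta * n2 / (gamma * cap),
            peak + delta * veh / (gamma * cap))


def fine_toll(t, n1, n2, l, s, beta, gamma, t_star, t_a):
    """Vehicular toll at departure time t; piecewise linear, zero outside rush hour."""
    t_q, t_12, peak, t_21, t_q_end = toll_breakpoints(
        n1, n2, l, s, beta, gamma, t_star, t_a)
    if t <= t_q or t >= t_q_end:
        return 0.0
    if t <= t_12:                        # early solo drivers, slope beta
        return beta * (t - t_q)
    at_t12 = beta * (t_12 - t_q)
    if t <= peak:                        # early carpools, slope 2 * beta
        return at_t12 + 2 * beta * (t - t_12)
    at_peak = at_t12 + 2 * beta * (peak - t_12)
    if t <= t_21:                        # late carpools, slope -2 * gamma
        return at_peak - 2 * gamma * (t - peak)
    return gamma * (t_q_end - t)         # late solo drivers, slope -gamma


args = dict(n1=300, n2=200, l=4, s=5, beta=0.8, gamma=4.2, t_star=540.0, t_a=55.0)
for t in toll_breakpoints(**args):
    print(round(t, 2), round(fine_toll(t, **args), 4))
```

Evaluating the toll at the breakpoints should reproduce the values $0$, $\delta N_1/(ls)$ and $\delta N/(ls)$ quoted in the text, which confirms that the curve has no jumps.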
4. CARPOOLING AND PRICING IN THE PRESENCE OF HOV LANES

In the presence of carpool lanes in a multilane highway, commuters will make mode choice
decisions by trading the extra cost $\Delta$ for reduced travel time and schedule delay, or trading the
extra cost $\Delta$ for both reduced travel time and schedule delay and equal sharing of the toll charge if
a tolling scheme is implemented. We first investigate the benchmark equilibrium when
special lanes are provided for carpools but no road toll is collected.
4.1 No Toll
Note that (4) and (6) give the individual travel costs of non-carpooling and carpooling
commuters, respectively, when no road toll is collected. We write the condition for achieving
an equilibrium between these two costs as follows:
$$C_1 = \delta N_1/(l_1 s) + \alpha t_a = B(N), \tag{20.1}$$
$$C_2 = 0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta = B(N). \tag{20.2}$$
If a solution with $N_1 > 0$ and $N_2 > 0$ exists, the above two equalities must hold, which
gives $\delta[(N_1/l_1 s) - (0.5N_2/l_2 s)] = \Delta$. Then, with $N_1 + N_2 = N$, we have
$N_1 = [l_1 l_2/(l_1 + 2l_2)]\,[(N/l_2) + (2\Delta s/\delta)]$, and $N$ is the solution of equation (20.1) with $N_1$ so
derived. We denote this $N$-solution as $\hat N$ and conclude that a solution with $N_1 > 0$ and $N_2 > 0$
exists if and only if $B(\hat N) - \alpha t_a > \Delta$. Moreover, $N_2 = 0$ and $N_1 = N$, which is the solution of
(20.1), if $B(\hat N) - \alpha t_a \le \Delta$. It is easy
to show that the case with corner solution $N_1 = 0$ and $N_2 = N$ is unlikely to occur when $\Delta > 0$.
Comparing the above results with the discussion in Section 3.1, we find that in the presence
of HOV lanes a positive $N_2$ may exist although $\Delta > 0$; hence the introduction of HOV lanes
does promote carpooling, even through self-forming. Next, we study the effects caused by
several road tolling schemes in the presence of HOV lanes.
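As a worked illustration of the no-toll equilibrium with carpool lanes, the sketch below solves (20.1) and (20.2) in closed form for the affine marginal benefit $B(N) = K(N_{\max} - N)$ used later in Section 5.2. The slope $K = 0.2$ and the lane split (3 normal lanes, 1 carpool lane) are hypothetical choices, the function name is illustrative, and the closed form for $N$ follows from substituting the expression for $N_1$ into (20.1).

```python
def no_toll_with_hov(l1, l2, s, delta, alpha, t_a, Delta, K, N_max):
    """Return (N1, N2) at the no-toll equilibrium; assumes l2 > 0 and affine B(N)."""
    lanes = l1 + 2 * l2
    # Interior candidate: (20.1) with N1 = [l1 l2 / (l1 + 2 l2)] (N / l2 + 2 Delta s / delta)
    n = (K * N_max - alpha * t_a - 2 * Delta * l2 / lanes) / (K + delta / (s * lanes))
    if K * (N_max - n) - alpha * t_a > Delta:      # existence condition for N2 > 0
        n1 = (l1 * l2 / lanes) * (n / l2 + 2 * Delta * s / delta)
        return n1, n - n1
    # Corner: nobody carpools, all commuters use the l1 normal lanes
    n = (K * N_max - alpha * t_a) / (K + delta / (l1 * s))
    return n, 0.0


n1, n2 = no_toll_with_hov(l1=3, l2=1, s=5, delta=0.672, alpha=1.6, t_a=55.0,
                          Delta=10.0, K=0.2, N_max=1000.0)
print(n1, n2, n2 / (n1 + n2))
```

At an interior solution both modes carry the same generalized cost, equal to the marginal benefit of the last trip.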
4.2 Differential Tolling
Assume that a static (time-invariant) toll regime is introduced on the multilane highway and that this
regime may differentiate the toll across segregated lanes. Since the toll is not time-varying, the
formulae for computing rush hours and private travel costs (including schedule delay and
queuing costs only) in Section 2, i.e., (3)-(6), still hold in the current situation. Therefore, we can
directly consider the following social optimum problem:
$$
\begin{aligned}
\max\ TB &= \int_0^N B(x)\,dx - C_1 N_1 - C_2 N_2\\
&= \int_0^N B(x)\,dx - N_1(\delta N_1/(l_1 s) + \alpha t_a) - N_2(0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta)
\end{aligned}
\tag{21}
$$
with $N_1 + N_2 = N$ and $N_1 \ge 0$, $N_2 \ge 0$. The first-order optimality conditions include
$$C_1 + \delta N_1/(l_1 s) = B(N), \tag{22.1}$$
$$C_2 + \delta N_2/(2 l_2 s) = B(N), \tag{22.2}$$
where $C_1$ and $C_2$ are given by (4) and (6), respectively. The interpretation of (22.1) and (22.2)
is that the marginal benefit of an extra commuter ought to equal his (her) marginal social
cost (i.e., the left-hand sides of the equations). The externalities in (22.1) and (22.2), i.e.,
$\delta N_1/(l_1 s)$ and $\delta N_2/(2 l_2 s)$, include the schedule delay costs of an extra commuter added to either
end of the rush hour, plus the increase in aggregate queuing costs.
From (22.2), we can see that a rider of a carpool should pay a toll equal to $\delta N_2/(2 l_2 s)$, and
therefore the optimal toll charge for a carpooling vehicle amounts to $\delta N_2/(l_2 s)$. On the other
hand, from (22.1) the optimal toll for a non-carpooling vehicle is $\delta N_1/(l_1 s)$. Since in general
$\delta N_2/(l_2 s) \ne \delta N_1/(l_1 s)$, a differential tolling scheme must be introduced for achieving
a social optimum in the presence of carpooling lanes on the highway. Note that here the social
optimum requires only toll differentiation across segregated lanes rather than across
vehicles according to their number of riders. The actual magnitude of toll differentiation will
depend on the allocation of highway lanes as shown in (22). This theoretical observation
strongly supports the current practice of differential charging schemes. Namely, a reduced toll
is charged for carpools to encourage commuters to use the high-occupancy vehicle mode; this
scheme is equivalent to setting more HOV lanes and thereby creating a lower combined marginal
queuing and schedule delay cost, $\delta N_2/(l_2 s)$.
Now we check the corner solutions. If $N_1 = 0$ and $N_2 = N$, then we have $(0 + \alpha t_a) + 0 \ge B(N)$
and $(0.5\delta N/(l_2 s) + \alpha t_a + \Delta) + \delta N/(2 l_2 s) = B(N)$, i.e., $\delta N/(l_2 s) + \Delta \le 0$. Clearly, this case is unlikely to
occur in normal situations where the extra cost for carpooling $\Delta > 0$. If, on the other hand, $N_1 =
N$ and $N_2 = 0$, we have $\Delta - \delta N_1/(l_1 s) - \delta N_1/(l_1 s) \ge 0$, in which the second term is the additional
travel cost due to queuing and schedule delay on normal lanes and the third term represents
the toll charged to solo drivers. Therefore this case is very likely to occur if the additional
carpooling cost $\Delta$ is high and/or there is no marginal-cost pricing. Of course, carpool lanes are
not needed in this case.
If $N_1 > 0$ and $N_2 > 0$, the two equalities (22.1) and (22.2) must hold, which gives
$\delta[(2N_1/l_1 s) - (N_2/l_2 s)] = \Delta$ and then $N_1 = [l_1 l_2/(l_1 + 2l_2)]\,[(N/l_2) + (\Delta s/\delta)]$, and $N$ is the solution
of equation (22.1) with $N_1$ so derived. Denote this $N$-solution as $\hat N$. We conclude that a
solution with $N_1 > 0$ and $N_2 > 0$ exists if and only if $B(\hat N) - \alpha t_a > \Delta$. Otherwise, $N_2 = 0$ and $N_1
= N$, which is the solution of the equation $2\delta N/(l_1 s) + \alpha t_a = B(N)$.
Comparing the two equations for the total demand under the no-toll and differential toll
regimes, we find that the total demand under the differential toll is less than that in the
no-toll situation if interior point solutions exist in both situations, but the proportion of
carpooling commuters is higher. Finally, the time unknowns $t_{1q}$, $t_{10}$, $t_{1q'}$, $t_{2q}$, $t_{20}$ and $t_{2q'}$ are
determined by (3) and (5) with the $N_1$-value and $N_2$-value obtained above.
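The first-best differential tolls can be computed along the same lines. The sketch below again assumes the affine marginal benefit $B(N) = K(N_{\max} - N)$ of Section 5.2 with a hypothetical slope $K = 0.2$ and a hypothetical lane split of 3 normal lanes and 1 carpool lane; the function name is illustrative. It solves (22.1) and (22.2) in closed form and returns the per-vehicle tolls $\delta N_1/(l_1 s)$ and $\delta N_2/(l_2 s)$.

```python
def differential_toll(l1, l2, s, delta, alpha, t_a, Delta, K, N_max):
    """Return (N1, N2, toll_solo, toll_carpool); assumes l2 > 0 and affine B(N)."""
    lanes = l1 + 2 * l2
    # Interior candidate: (22.1) with N1 = [l1 l2 / (l1 + 2 l2)] (N / l2 + Delta s / delta)
    n = (K * N_max - alpha * t_a - 2 * Delta * l2 / lanes) / (K + 2 * delta / (s * lanes))
    if K * (N_max - n) - alpha * t_a > Delta:      # existence condition for N2 > 0
        n1 = (l1 * l2 / lanes) * (n / l2 + Delta * s / delta)
        n2 = n - n1
    else:                                          # corner: 2 delta N / (l1 s) + alpha t_a = B(N)
        n1 = (K * N_max - alpha * t_a) / (K + 2 * delta / (l1 * s))
        n2 = 0.0
    return n1, n2, delta * n1 / (l1 * s), delta * n2 / (l2 * s)


n1, n2, toll_solo, toll_carpool = differential_toll(
    l1=3, l2=1, s=5, delta=0.672, alpha=1.6, t_a=55.0, Delta=10.0, K=0.2, N_max=1000.0)
print(n1, n2, toll_solo, toll_carpool)
```

Under these assumptions the two vehicle tolls differ, in line with the observation that the social optimum requires toll differentiation across the segregated lanes.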
4.3 Common Uniform Tolling
We have shown in Section 4.2 that the first-best pricing scheme calls for toll
differentiation between carpooling and non-carpooling vehicles for optimal regulatory use of
the multilane highway. If for whatever reason it is impossible to implement the first-best tolling
scheme, the subsequent question of interest is how to find a common uniform toll that offers
the second-best solution for optimizing the traffic flows under road use discrimination.
Assume that we aim at maximizing the total benefit as given by (21) by setting an optimal
uniform toll, $\tau$, for all vehicles. The equilibrium conditions can be written as:
$$\delta N_1/(l_1 s) + \alpha t_a + \tau = B(N), \tag{23.1}$$
$$0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta + \tau/2 = B(N). \tag{23.2}$$
The following Lagrangian can be formed:
$$
\begin{aligned}
L ={}& \int_0^N B(x)\,dx - N_1(\delta N_1/(l_1 s) + \alpha t_a) - N_2(0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta)\\
&+ \lambda_1[B(N) - \delta N_1/(l_1 s) - \alpha t_a - \tau] + \lambda_2[B(N) - 0.5\delta N_2/(l_2 s) - \alpha t_a - \Delta - \tau/2],
\end{aligned}
\tag{24}
$$
where $\lambda_1$ and $\lambda_2$ are the multipliers associated with (23.1) and (23.2), respectively. It can be
easily shown that the first-order optimality conditions of (24) generate:
$$\tau = w_1(\delta N_1/(l_1 s)) + w_2(0.5\delta N_2/(l_2 s)), \tag{25}$$
where
$$
w_1 = 2\,\frac{\delta/(l_2 s) - B'(N)}{\delta/(l_1 s) + 2\delta/(l_2 s) - B'(N)}
\quad\text{and}\quad
w_2 = 2\,\frac{\delta/(l_1 s) + B'(N)}{\delta/(l_1 s) + 2\delta/(l_2 s) - B'(N)}.
\tag{26}
$$
In (25), the terms $\delta N_1/(l_1 s)$ and $0.5\delta N_2/(l_2 s)$ are the marginal external costs of a non-carpooling
commuter and a carpooling commuter, respectively. Consequently, the optimal common
uniform toll turns out to be a weighted sum of the marginal external costs associated with the
two classes of commuters. The actual magnitude of this uniform toll will be influenced by all
parameters used in the problem, since they affect the total demand and modal share.
If overall demand is perfectly inelastic, $B'(N) \to -\infty$, then we obtain $w_1 = 2.0$ and $w_2 = -2.0$, and
thus (25) becomes
$$\tau = 2[(\delta N_1/(l_1 s)) - (0.5\delta N_2/(l_2 s))]. \tag{27}$$
In this case, as pricing has no effect on overall demand but solely on mode split, the
problem is reduced to one of the optimal assignment of commuters to the two modes. The
resultant toll should be set at twice the difference between the marginal external costs of
non-carpooling and carpooling commuters (not vehicles) in the second-best optimum.
We also note that when demand is completely inelastic, the common uniform toll given in (27)
can still achieve a first-best social optimum, as do the first-best differential tolls. This can be
verified by directly minimizing total social cost subject to the total fixed demand. The resulting
optimality or equilibrium conditions prove to be identical, irrespective of uniform or
differential tolls. However, as long as demand is not completely inelastic and the toll remains
non-differentiated, it will be generally impossible to realize the first-best situation where both
the overall demand effect and mode split are optimized. Finally, substituting the toll expression
(25) into (23.1) and (23.2), respectively, the equilibrium solution $(N_1^*, N_2^*, N^*)$ with $N_1^* + N_2^*
= N^*$ can be obtained for given marginal benefit functions. That is, $N$ is the solution of the
equation
$$N(bc - ad) + \alpha t_a(b + c - a - d) + \Delta(b - a) = B(N)\,(b + c - a - d),$$
where
$a = \delta(1 + w_1)/(l_1 s)$, $b = 0.5\delta w_2/(l_2 s)$,
$c = 0.5\delta w_1/(l_1 s)$, $d = 0.5\delta(1 + 0.5w_2)/(l_2 s)$.
4.4 Time-Varying Tolling
Assume that two time-varying tolling devices are introduced on the general lanes
and the carpooling lanes, respectively. The queue on both types of lanes is eliminated and replaced
by the time-varying toll, as pointed out in Section 3.3. We first compute the total social cost of
the system:
$$
\begin{aligned}
TSC ={}& \int_{t_{1q}}^{t^* - t_a} (t^* - t - t_a)\,\beta l_1 s\,dt + \int_{t^* - t_a}^{t_{1q'}} (t + t_a - t^*)\,\gamma l_1 s\,dt + \int_{t_{2q}}^{t^* - t_a} (t^* - t - t_a)\,2\beta l_2 s\,dt\\
&+ \int_{t^* - t_a}^{t_{2q'}} (t + t_a - t^*)\,2\gamma l_2 s\,dt + (N_1 + N_2)\alpha t_a + N_2 \Delta,
\end{aligned}
\tag{28}
$$
where all time unknowns are given by (3) and (5).
Let the time-varying tolling scheme maximize the total benefit, i.e.,
$$\max\ TB = \int_0^N B(x)\,dx - \delta N_1 N_1/(2 l_1 s) - \delta(0.5N_2)^2/(l_2 s) - (N_1 + N_2)\alpha t_a - N_2 \Delta \tag{29}$$
with $N_1 + N_2 = N$ and $N_1 \ge 0$, $N_2 \ge 0$. Suppose the maximum occurs at an interior point of the
feasible region; then the following first-order optimality conditions are obtained:
$$\delta N_1/(l_1 s) + \alpha t_a = C_1 = B(N), \tag{30.1}$$
$$0.5\delta N_2/(l_2 s) + \alpha t_a + \Delta = C_2 = B(N), \tag{30.2}$$
and $N_1 + N_2 = N$. Subtracting (30.1) from (30.2) yields $N_1 = [2 l_1 l_2/(l_1 + 2l_2)]\,[(N/2l_2) + (\Delta s/\delta)]$,
and $N$ is the solution of equation (30.1) with $N_1$ so derived. Denote this $N$-solution as $\hat N$. We
conclude that a solution with $N_1 > 0$ and $N_2 > 0$ exists if and only if $B(\hat N) - \alpha t_a > \Delta$.
Otherwise, $N_2 = 0$ and $N_1 = N$, which is the solution of the equation $\delta N/(l_1 s) + \alpha t_a = B(N)$. In fact,
equation (30) is identical to (20). So the time-varying toll regime in the presence of
HOV lanes generates the same total demand and the same modal share as the no-toll
regime in the presence of carpooling lanes.
The time-varying tolls with respect to vehicles, on general lanes and carpooling lanes,
respectively, are
$$
\tau_1(t) =
\begin{cases}
B(N) - [\beta(t^* - t - t_a) + \alpha t_a], & t \in [t_{1q}, t^* - t_a],\\
B(N) - [\gamma(t + t_a - t^*) + \alpha t_a], & t \in [t^* - t_a, t_{1q'}],
\end{cases}
$$
$$
\tau_2(t) =
\begin{cases}
2B(N) - [2\beta(t^* - t - t_a) + 2\alpha t_a + 2\Delta], & t \in [t_{2q}, t^* - t_a],\\
2B(N) - [2\gamma(t + t_a - t^*) + 2\alpha t_a + 2\Delta], & t \in [t^* - t_a, t_{2q'}],
\end{cases}
$$
and 0 otherwise. These two curves reach their highest point at time $t^* - t_a$ and decline to zero at
$t_{1q}$, $t_{1q'}$ and $t_{2q}$, $t_{2q'}$, respectively. For each type of traffic lane, the toll is exactly equal to
what the queuing time costs would be with no toll, so the toll completely replaces the queue (now
zero) as the mechanism which rations commuters' departure times (from home).
Note that in the presence of HOV lanes the two classes of commuters travel on different types of
traffic lanes and are charged by different tolling devices under the pricing policy studied above.
This is similar to the differential tolling policy across segregated lanes proposed in Section
4.2, but here the scheme is dynamic rather than static.
5. NUMERICAL EXAMPLES
In this section, we use numerical examples to validate the theoretical observations already
made in this paper and to gain some insights into the characteristics of the combined operation of
carpooling and congestion pricing. The basic parameters are: $(\gamma, \alpha, \beta)$ = (4.2, 1.6, 0.8)
($/min), $l$ = 4 lanes, $s$ = 5 (veh/min), $t_a$ = 55 minutes, $\Delta$ = 10 ($/carpooling commuter). The
total number of lanes is assumed to be 4, and four cases with 0, 1, 2 and 3 reserved carpool lanes
are considered, respectively.
5.1 The Fixed Demand Case
Here, in the first instance, we assume that the overall demand $N$ is given, but varied
parametrically rather than determined by a demand function. This will facilitate our discussion
of the impact of the degree of traffic congestion on carpooling behaviour. As the overall demand
is given a priori, we focus on the total social cost for performance comparison.
Figure 1 shows the number of commuters joining carpools versus the total number of
commuters under various combined policies of pricing and HOV lanes. In the case without
a carpool lane and without a toll (the no-intervention case), the number of carpoolers is constantly
zero. This is characterized by the solution (10) with $\Delta > 0$. However, when optimal bottleneck
congestion pricing is introduced, the number of carpooling commuters increases sharply with total
demand (or the level of congestion) because commuters can save on their commuting costs by
sharing toll charges, especially at higher levels of congestion or toll charge. Time-varying
tolling cannot induce as many carpoolers as uniform tolling does because its charge level is
lower (see, e.g., Figure 5), so that the cost saving through sharing the toll is not pronounced. In the
presence of a carpool lane, the toll charge will induce more commuters to join carpools, and the
more carpool lanes there are, the more carpooling commuters there are. The joint implementation of
a carpool lane and pricing (except time-varying pricing) encourages more commuters to switch
to the carpooling mode in comparison with a carpool lane only, but not necessarily in
comparison with congestion pricing only. Note that, as shown before, time-varying tolling
causes the same modal split as non-tolling in the presence of HOV lanes, and with
fixed overall demand the second-best uniform toll leads to the first-best social optimum.
Figure 2 shows the relative change in total cost versus the total number of commuters under
various schemes. The relative change is defined as the ratio of the total cost change with
respect to the 'do-nothing' case (without carpool lane, no toll charge). When demand is within a
certain range (100 < $N$ < 600), introducing a carpool lane (either with or without congestion
pricing, except time-varying tolling) actually increases total cost (positive change ratio), but
uniform tolling with zero HOV lanes has no impact. For time-varying tolling, the more
carpool lanes there are, the larger the relative change ratio. As demand grows beyond a certain
threshold, all schemes lead to a reduction (negative change ratio) in total cost. These reductions
are achieved by encouraging commuters to participate in carpooling and thus reducing
vehicular flow. It is noteworthy that, excluding time-varying tolling, the first-best uniform
pricing with zero HOV lanes always results in the most significant cost reduction. These
observations demonstrate that if commuters are charged appropriately in a first-best fashion,
the introduction of a carpool lane is not necessarily beneficial, and can even be harmful (when demand is
lower). But this should be interpreted with care, because introducing a carpool lane alone can still
achieve a significant reduction in total cost at higher levels of commuter demand. Among all
schemes, time-varying tolling with or without carpool lanes yields the most significant
reduction in total cost.
5.2 The Elastic Demand Case
In the case of elastic overall demand, the schemes studied in this paper (with or without
carpool lanes, with or without pricing) perform differently from one another in total
demand, modal split, toll charge and total benefit. The slope of the marginal benefit (or inverse
demand) curve is an important factor behind these differences. To gain some insight
into this aspect, we adopt the following affine inverse demand function: $B(N) = -K(N - N^{max})$,
where $N^{max} = 1000$ and $K$ is the slope. One advantage of this affine function is that $B'(N) = -K$
(const), which makes it easier to investigate the impact of the slope of the demand function.
Note that a larger value of K implies that demand is less sensitive to the marginal trip
benefit and thus the final realized demand will be higher. Figure 3 confirms this.
In addition, Table 1 lists the total demands numerically, which makes the
differences easier to discern. Time-varying tolling alone results in the highest total
demand, followed by the scheme with 3 carpool lanes under time-varying tolling or no tolling. In the
presence of carpool lanes, the more the carpool lanes the higher the final total demand. For an
identical number of carpool lanes, no tolling and time-varying tolling yield
the highest total demand, followed by uniform tolling (second-best) and then differential tolling
(first-best).
Table 1: Total demand implemented under various schemes

               0 HOV lane                    1 HOV lane                    2 HOV lanes                   3 HOV lanes
K       No toll   Uniform Time-var.   No toll   Uniform   Differ.   No toll   Uniform   Differ.   No toll   Uniform   Differ.
0.1       89.82     71.77     89.82     82.87     63.29     61.36     71.77     51.19     49.65     51.19     32.54     33.85
0.2      479.45    419.16    479.45    476.02    425.60    434.48    473.62    430.28    439.54    471.85    433.84    441.76
0.3      635.49    595.24    637.63    636.32    587.97    604.16    636.89    595.51    611.47    637.31    601.15    614.77
0.4      719.56    696.49    724.57    721.51    678.77    696.46    722.85    686.45    703.31    723.83    692.13    706.52
0.5      772.11    753.37    777.86    774.37    736.78    754.20    775.91    744.00    760.24    777.02    749.31    763.18
0.7      834.24    820.61    839.84    836.45    806.62    822.25    837.95    812.75    826.91    839.03    817.21    829.31
1.0      882.35    872.68    887.10    884.23    861.68    874.59    885.50    866.51    877.95    886.41    870.02    879.80
2.0      940.20    935.29    943.08    941.35    929.03    936.84    942.11    931.79    938.52    942.66    933.79    939.54

Column headings: Uniform = uniform toll; Time-var. = time-varying toll; Differ. = differential toll.
Note: The time-varying tolling policy generates the same results as the no-toll policy in the presence
of HOV lanes.
Figure 4 shows the proportion of carpooling commuters in total demand. The
carpooling behaviour under the alternative schemes, except common uniform tolling (second-best),
is basically the same as that shown in Figure 1. This is because the larger the value of
the slope K, the higher the total demand. In the presence of HOV lanes, common uniform
tolling results in the largest proportion, followed by differential tolling and then
time-varying tolling.
Figure 5 depicts the average toll charges versus the slope K. The average toll is defined as the
total toll revenue divided by the total number of vehicles (not
commuters). The slope K affects the average toll charges in two ways: first, the
charge under all schemes except common uniform tolling (with HOV lanes) increases with
the value of K; second, the discrepancy in toll charges among different pricing schemes (not
different HOV lane schemes) becomes more pronounced as K increases. The average toll
charges are not very sensitive to the number of carpool lanes.
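As a rough illustration of this definition, the sketch below recomputes the average toll from the K = 0.3 figures reported in Table 2. It assumes two commuters per carpool vehicle, which is not restated in this section but is consistent with the tabulated revenues; the function and variable names are illustrative only.

```python
def vehicle_count(solo_commuters, carpool_commuters, occupancy=2):
    """Number of vehicles, assuming `occupancy` commuters per carpool (an assumption)."""
    return solo_commuters + carpool_commuters / occupancy


def average_toll(total_revenue, solo_commuters, carpool_commuters, occupancy=2):
    """Average toll = total toll revenue / total number of vehicles (not commuters)."""
    return total_revenue / vehicle_count(solo_commuters, carpool_commuters, occupancy)


# Table 2, K = 0.3, no carpool lane, uniform toll:
# revenue 7993.2, 204.08 solo commuters, 391.16 carpooling commuters.
uniform_no_lane = average_toll(7993.2, 204.08, 391.16)

# Table 2, K = 0.3, one carpool lane, differential toll:
# revenue 9515.8, 397.42 solo commuters, 190.54 carpooling commuters.
differential_one_lane = average_toll(9515.8, 397.42, 190.54)

print(round(uniform_no_lane, 2), round(differential_one_lane, 2))
```

Under the two-person assumption, the first value reproduces the 20.00 $/veh toll of note (a) in Table 2, and the second lies between the two lane-specific tolls of note (b).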
Figure 6 displays the relative change in total social benefit versus the slope K under various
schemes. The relative change in total social benefit is defined as the overall welfare
gain relative to the 'do-nothing' case. The following observations can be made.
Within a certain range of slope K (> 0.3), all schemes lead to relatively higher social welfare
improvement. As demand approaches complete inelasticity ($K \to \infty$), the impacts of all schemes
on social welfare diminish and eventually disappear. As demand approaches perfect
elasticity ($K \to 0$), by contrast, introducing carpool lanes (with or without pricing, except
time-varying pricing) reduces social welfare. The social welfare gains under time-varying tolling
depend on the number of carpool lanes: the more the HOV lanes, the smaller the social
welfare gains. It is noteworthy that first-best pricing alone and time-varying tolling alone
always lead to social welfare improvement.
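The relative-change measure can be illustrated with the total benefit (TB) values for K = 0.3 in Table 2. The sketch below reads the definition as (scheme value minus do-nothing value) divided by the do-nothing value, which is an assumption about the exact normalisation; the dictionary keys are illustrative labels.

```python
def relative_change(scheme_value, do_nothing_value):
    """Relative change of a scheme with respect to the 'do-nothing' case."""
    return (scheme_value - do_nothing_value) / do_nothing_value


# Total benefit (TB) at K = 0.3, taken from Table 2.
do_nothing_tb = 60577.4  # no carpool lane, no toll
schemes_tb = {
    "0 HOV lane, uniform toll": 61139.5,
    "0 HOV lane, time-varying toll": 67376.4,
    "1 HOV lane, no toll": 60735.3,
    "2 HOV lanes, no toll": 60844.3,
}

changes = {name: relative_change(tb, do_nothing_tb) for name, tb in schemes_tb.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.4f}")
```

At this value of K every listed scheme shows a positive relative change, with time-varying tolling alone giving by far the largest gain (about 11%).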
Finally, Table 2 gives the detailed numerical results for K = 0.3. The data in
this table agree with the figure-based analyses above. Also of interest
are the rush-hour intervals and the maximum queue lengths. All non-tolling
schemes generate nearly the same overall demand, but those with more carpool lanes
have shorter rush-hour intervals and lower maximum queue lengths. In the presence of HOV
lanes, the differential tolling scheme (first-best) with 2 HOV lanes, compared with 1 HOV
lane, results in a narrower rush hour and a shorter queue on the general lanes, and a narrower
rush hour and a longer queue on the carpool lanes. The same pattern is observed
for the common uniform tolling schemes (second-best). When comparing
uniform tolling alone with the schemes containing HOV lanes (except time-varying tolling),
we find that $q_1^* + q_2^* > q^*$, but $\max\{q_1^*, q_2^*\} < q^*$. Note that the maximum queue
lengths $q_1^*$ and $q_2^*$ may occur at different times.
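The queue-length pattern can be checked directly against the K = 0.3 entries of Table 2. The sketch below is only a numerical check under one reading of the table: the first queue row for the HOV-lane schemes is taken as the general lanes and the second as the carpool lanes, and only the differential and uniform tolling schemes are compared with uniform tolling alone. All names are illustrative.

```python
# Maximum queue length under uniform tolling without a carpool lane (Table 2).
q_uniform_alone = 167.86

# (general-lane queue, carpool-lane queue) for the tolled HOV-lane schemes (Table 2).
with_hov_lanes = {
    "1 lane, differential toll": (166.92, 40.01),
    "1 lane, uniform toll": (165.11, 44.32),
    "2 lanes, differential toll": (104.21, 72.96),
    "2 lanes, uniform toll": (101.66, 77.58),
}


def satisfies_queue_pattern(q_general, q_carpool, q_reference):
    """True when q1 + q2 > q but max(q1, q2) < q."""
    combined_longer = q_general + q_carpool > q_reference
    each_shorter = max(q_general, q_carpool) < q_reference
    return combined_longer and each_shorter


results = {
    name: satisfies_queue_pattern(q1, q2, q_uniform_alone)
    for name, (q1, q2) in with_hov_lanes.items()
}
print(results)
```

All four schemes satisfy both inequalities: the two queues together are longer than the single queue under uniform tolling alone, yet neither exceeds it on its own.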
Table 2: Numerical results when K = 0.3

                      Without carpool lane                      With 1 carpool lane                             With 2 carpool lanes
                   No toll Uniform (a)   Time-var.     No toll Differ. (b) Uniform (c)   Time-var.     No toll Differ. (d) Uniform (e)   Time-var.
N                   635.49      595.24      637.63      636.32      587.97      604.16      636.32      636.89      595.51      611.47      636.89
N1                  635.49      204.08      595.24      471.08      397.42      393.12      471.08      311.50      248.11      242.05      311.50
N2                    0.00      391.16       42.39      165.24      190.54      211.04      165.24      325.39      347.41      369.42      325.39
tq                   18.31       28.21       19.11           -           -           -           -           -           -           -           -
tq'                  50.08       48.20       49.93           -           -           -           -           -           -           -           -
t1q                      -           -           -       18.62       22.74       22.99       18.62       18.83       24.16       24.67       18.83
t1q'                     -           -           -       50.02       49.24       49.19       50.02       49.98       48.97       48.87       49.98
t2q                      -           -           -       31.12       28.99       27.27       31.12       31.33       30.41       29.48       31.33
t2q'                     -           -           -       47.64       48.05       48.38       47.64       47.60       47.78       47.96       47.60
q*                  266.91      167.86        0.00           -           -           -           -           -           -           -           -
q1*                      -           -           -      197.85      166.92      165.11        0.00      130.83      104.21      101.66        0.00
q2*                      -           -           -       34.70       40.01       44.32        0.00       68.33       72.96       77.58        0.00
TSC                69492.6     64285.7     62926.4     69425.1     63162.6     65193.0     63536.8     69378.3     64071.2     66026.1     64339.2
TB                 60577.4     61139.5     67376.4     60735.3     61372.0     61303.6     66623.6     60844.3     61387.5     61330.6     65883.4
Total revenue          0.0      7993.2      6391.4         0.0      9515.8      6552.2      5888.3         0.0      8191.9      5246.3      5039.1

Column headings: Uniform = uniform toll; Differ. = differential toll; Time-var. = time-varying toll.
Notes: (a) Toll = 20.00 ($/veh); (b) Toll for non-carpooling vehicles = 17.80 ($/veh), toll for carpooling
vehicles = 25.61 ($/veh); (c) Toll = 13.14 ($/veh); (d) Toll for non-carpooling vehicles = 16.67 ($/veh),
toll for carpooling vehicles = 23.35 ($/veh); (e) Toll = 12.29 ($/veh).
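The tabulated benefits can be related to the affine inverse demand function of Section 5.2. The sketch below assumes that this function reads B(N) = -K (N - N_max) with N_max = 1000, and that the reported TB equals the area under the inverse demand curve up to the realized demand minus the total social cost (TSC). Both are assumptions made for this illustration, although they reproduce the no-carpool-lane entries of Table 2 to within rounding.

```python
N_MAX = 1000.0  # assumed demand intercept of the affine inverse demand function


def gross_benefit(total_demand, slope_k, n_max=N_MAX):
    """Integral of B(n) = -K (n - n_max) from 0 to N, i.e. K (n_max N - N^2 / 2)."""
    return slope_k * (n_max * total_demand - 0.5 * total_demand ** 2)


def net_benefit(total_demand, total_social_cost, slope_k, n_max=N_MAX):
    """Gross user benefit minus total social cost (assumed reading of TB)."""
    return gross_benefit(total_demand, slope_k, n_max) - total_social_cost


# (N, TSC, reported TB) for K = 0.3 from Table 2, schemes without a carpool lane.
rows = {
    "no toll": (635.49, 69492.6, 60577.4),
    "uniform toll": (595.24, 64285.7, 61139.5),
    "time-varying toll": (637.63, 62926.4, 67376.4),
}

computed = {name: net_benefit(n, tsc, 0.3) for name, (n, tsc, _) in rows.items()}
for name, (_, _, tb_reported) in rows.items():
    print(f"{name}: computed {computed[name]:.1f}, reported {tb_reported:.1f}")
```

The small remaining differences (well below one unit) are compatible with the two-decimal rounding of N in the table.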
In summary: 1) The most efficient scheme depends on the elasticity of
demand or the level of traffic congestion (refer to the fixed demand case), the number of
carpool lanes and the evaluation criterion (number of carpoolers, total cost, total benefit,
average toll, queue length and so on). 2) Time-varying tolling schemes, with or without HOV
lanes, can greatly reduce total cost and improve total benefit, but are not satisfactory in
promoting carpooling. 3) Reserving one or more traffic lanes for carpooling vehicles can
attract more commuters to carpool, but has little effect in reducing total cost and
improving social welfare. 4) The number of carpool lanes, or
the capacity reserved for carpooling vehicles, needs to be optimized; otherwise, the provision of HOV
lanes might be disastrous. Such a problem takes a bi-level structure in which the upper level
maximizes the total benefit of the system by choosing the number of carpool lanes and the lower
level represents an equilibrium modal split.
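The bi-level structure can be sketched by enumeration when the number of candidate lane configurations is small. In the illustration below the lower level is not solved; it is replaced by a lookup of the total benefit (TB) values reported for K = 0.3 in Table 2, and the function names are hypothetical. A full implementation would compute the equilibrium modal split for each lane choice instead.

```python
# Stand-in for lower-level results: TB at K = 0.3 from Table 2,
# indexed by pricing policy and number of carpool lanes.
TB_BY_POLICY = {
    "no toll": {0: 60577.4, 1: 60735.3, 2: 60844.3},
    "uniform toll": {0: 61139.5, 1: 61303.6, 2: 61330.6},
    "time-varying toll": {0: 67376.4, 1: 66623.6, 2: 65883.4},
}


def lower_level_total_benefit(policy, carpool_lanes):
    """Placeholder for the equilibrium modal split: returns TB for a lane choice."""
    return TB_BY_POLICY[policy][carpool_lanes]


def best_lane_count(policy, candidate_lanes=(0, 1, 2)):
    """Upper level: choose the lane count that maximises total benefit."""
    return max(candidate_lanes, key=lambda lanes: lower_level_total_benefit(policy, lanes))


best = {policy: best_lane_count(policy) for policy in TB_BY_POLICY}
print(best)
```

With these K = 0.3 values the enumeration selects two carpool lanes when there is no toll or a uniform toll, and no carpool lane under time-varying tolling, in line with the observation that HOV lanes reduce the welfare gains of time-varying tolling.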
There are many more possibilities than those reported
above. Besides the slope of the demand function, the length of the highway, the lane capacity, the
extra carpooling cost, the value of time and the schedule delay penalty are certainly important
factors that affect the aforementioned results and observations. Changing these parameters
may lead to different results. Future work should investigate the
sensitivity of the models presented in this paper to the values of their parameters.
6. CONCLUSIONS
In this article we studied carpooling and congestion pricing in the presence or absence of
carpool lanes on a multilane highway with bottleneck congestion, using deterministic
equilibrium models. This is certainly not an issue of mere academic interest but one that is
being encountered nowadays in some large urban areas such as Beijing and California. Our
findings may therefore be highly relevant for the design and operation of combined
congestion pricing and HOV lanes on a highway. However, it should be pointed out that the
models presented in this paper are very simple from the viewpoint of practice. They are based
on a number of simplifying assumptions. Their value therefore lies in the insights they offer.
Moreover, the methodology used in this paper can be extended to deal with more complex
situations to obtain more illuminating results.
We investigated theoretically how different charging policies should be implemented under
alternative road regulations. In the absence of HOV lanes, an anonymous toll for all vehicles
should be charged to obtain a first-best social optimum. In the presence of carpool
lanes, however, pricing for a first-best social optimum requires differentiating the toll
across segregated lanes; if the 'equal tolls' constraint is imposed, a weighted average
anonymous toll should be set, which leads to a second-best social optimum. We also studied the
equilibrium modal split under a time-varying tolling scheme, which can be used to eliminate any
queue behind the bottleneck.
Acknowledgements: This research was partly financed by the Natural Science Foundation
Committee of China through the project 79770006 and the Hong Kong Research Grants
Council through a RGC-CERG Grant (HKUST719/96E).
REFERENCES
Arnott, R., de Palma, A. and Lindsey, R. (1992) Route choice with heterogeneous drivers and
group-specific congestion costs. Regional Science & Urban Economics, 22, 71-102.
Arnott, R., de Palma, A. and Lindsey, R. (1994) The welfare effects of congestion tolls with
heterogeneous commuters. Journal of Transport Economics and Policy, 28, 139-161.
Bonsall, P. (1981) Car sharing in the United Kingdom. Journal of Transport Economics and
Policy, 15, 35-44.
Braid, R.M. (1996) Peak-load pricing of a transportation route with an unpriced substitute.
Journal of Urban Economics, 40, 179-197.
Bureau of the Census. (1979) The journey to work in the United States: 1975, USA DoT.
Collura, J. (1994) Evaluating ridesharing programs: Massachusetts experience. ASCE Journal
of Urban Planning and Development, 120, 28-47.
Dahlgren, J. (1998) High occupancy vehicle lanes: Not always more effective than general
purpose lanes. Transportation Research-A, 32, 99-114.
Ferguson, E. (1990) The influence of employer ridesharing programs on employee mode
choice. Transportation, 17, 179-207.
FHWA (1993) Journey-to-work trends in the United States and its major metropolitan areas,
1960-1990. Research Report for the Federal Highway Administration, USA DoT.
Giuliano, G., Levine, D.W. and Teal, R.F. (1990) Impact of high occupancy vehicle lanes on
carpooling behaviour. Transportation, 17, 159-177.
Huang, H.J. and Yang, H. (1996) Optimal time-dependent and location-dependent road
pricing on a congested network with parallel routes and elastic demand. In: Proceedings of
the 13th International Symposium on Transportation and Traffic Theory (Edited by Lesort
J.-B.), pp 479-500, Elsevier Science Ltd, Pergamon.
Huang, H.J. (1998) Carpooling and tolling in a highway with a bottleneck. In: Proceedings of
the 3rd International Conference on Management (electronic media), China Higher
Education Press Beijing and Springer-Verlag Berlin Heidelberg.
Lee, L.W. (1984) The economics of carpools. Economic Inquiry, 22, 128-135.
Mohring, H. (1979) The benefits of reserved bus lanes, mass transit subsidies, and marginal
cost pricing in alleviating traffic congestion. In: Current Issues in Urban Economics
(edited by P. Mieszkowski and M. Straszheim), The Johns Hopkins University Press,
Maryland, pp. 165-195.
O'Sullivan, A. (1993) Urban Economics (second edition), IRWIN, Boston, MA.
Small, K.A. (1977) Priority lanes on urban radial freeways: An economic-simulation model.
Transportation Research Record, 637, 8-13.
Small, K.A. (1982) The scheduling of consumer activities: Work trips. American Economic
Review, 72, 467-479.
Small, K.A. (1983) Bus priority and congestion pricing on urban expressways. In: Research in
Transportation Economics 1 (edited by T.E. Keeler), JAI Press, pp. 27-74.
Teal, R.F. (1987) Carpooling: Who, how and why. Transportation Research, 21A, 203-214.
Turnbull, K., Stokes, R.W. and Henk, R.H. (1991) Current practices in evaluating freeway
HOV lanes facilities. Transportation Research Record, 1299, 63-73.
Verhoef, E., Nijkamp, P. and Rietveld, P. (1995) Second-best regulation of road transport
externalities. Journal of Transport Economics and Policy, 29, 147-167.
Verhoef, E., Nijkamp, P. and Rietveld, P. (1996) Second-best congestion pricing: The case of
an untolled alternative. Journal of Urban Economics, 40, 279-302.
Willson, R.W. and Shoup, D.C. (1990) Parking subsidies and travel choices: Assessing the
evidence. Transportation, 17, 141-157.
Yang, H. and Huang, H.J. (1997) Analysis of the time-varying pricing of a bottleneck with
elastic demand using optimal control theory. Transportation Research-B, 31, 425-440.
Yang, H. and Huang, H.J. (1999) Carpooling and congestion pricing in a multilane highway
with high-occupancy-vehicle lanes. Transportation Research-A, 33, 139-155.
[Figure 1: Impact of bottleneck pricing on carpooling under fixed total demand. Horizontal axis: total number of commuters. One curve per combination of number of HOV lanes (0 to 3) and toll scheme (no toll, uniform, differential, dynamic).]
[Figure 2: Relative change in total cost under fixed total demand (versus 'do-nothing' case). Horizontal axis: total number of commuters.]
[Figure 3: Total demand implemented under pricing or unpricing schemes. Horizontal axis: value of K.]
[Figure 4: Proportion of carpoolers in total demand under pricing and unpricing schemes. Horizontal axis: value of K.]
[Figure 5: Average toll ($/veh) under elastic demand. Horizontal axis: value of K.]
[Figure 6: Relative change in total social benefit under elastic demand (vs. 'do-nothing' case). Horizontal axis: value of K.]
BALANCE OF DEMAND AND SUPPLY OF PARKING SPACES
William H.K. Lam, M.L. Tam
Department of Civil & Structural Engineering, The Hong Kong Polytechnic University
Hung Hom, Kowloon, Hong Kong, P.R. China
Hai Yang
Department of Civil Engineering, The Hong Kong University of Science & Technology
Clear Water Bay, Kowloon, Hong Kong, P.R. China
and
S.C. Wong
Department of Civil Engineering, The University of Hong Kong
Pokfulam Road, Hong Kong, P.R. China
ABSTRACT
Whether the demand for and supply of parking spaces should be balanced has long been a
controversial issue. This paper investigates the effects of balancing parking demand and supply.
A bilevel programming model is proposed to determine the minimum supply of parking spaces
required, taking into account the elasticity effects of road congestion and parking delays on
parking demand. The lower-level problem is an equilibrium trip distribution/assignment
problem with variable parking delay, while the upper-level problem is to minimise the total
number of parking spaces in traffic zones by considering motorists' route and destination
choice behaviour and satisfying the attraction-end parking demand constraints. A sensitivity
analysis based heuristic algorithm is developed to solve the proposed bilevel problem and is
illustrated with a numerical example.
INTRODUCTION
Parking is a common problem for most motorists in urban areas. Due to the inherent
uncertainty associated with many of the attributes of public car parks (Saloman, 1986; Polak
and Axhausen, 1990), a high proportion of motorists travelling within central city areas must
search for a car park. A shortage of parking space increases the searching time for an available
parking space and hence induces traffic congestion and environmental pollution problems. It
also causes illegal roadside parking and traffic accidents. As a result, there may be a reduction
of road space. On the other hand, excess parking spaces in some areas may represent a waste of
resources and also induce traffic on the roads. Such parking problems are caused by inaccurate
prediction of car ownership/usage and parking supply. Parking controls are seen as an important
means to regulate and restrain car use, and to develop parking facilities according to the
estimated level of car ownership and parking demand (Coombe et al., 1997).
Car parking is an issue of significance both at the local and at the strategic level of planning.
Parking policy and supply play a major role in the management of transportation systems in
urban areas. Although the policies that govern the provision and operation of parking facilities
are recognised to have an important bearing on the operation of urban transport systems,
decisions have often not been properly integrated with other elements of transport system
analysis.
A comprehensive review of parking models (Young et al., 1991) identified a number of
modelling approaches that have been used to understand and replicate parking choice
behaviour. These choice models are used to investigate the demand for parking within a given
supply situation. The models mainly concentrate on the choice of parking location (Ergun,
1971; Hunt, 1988) and the impact of parking on mode choice (Feeney, 1989). The effects of
parking cost and access time and a number of socio-economic variables on parking location
behaviour are studied in their models. Thompson and Richardson (1998) developed a parking
search model to better understand the parking choice behaviour.
The second approach is to formulate the parking problem as an allocation problem, in which a
fixed number of arrivals is allocated onto the parking lots on the basis of a measure of the
relative attractiveness of each element of the parking lots. Optimisation models are developed
to determine the optimal location of parking spaces in a way that minimises the total walking
distance for all parkers (Oppenlander and Dawson, 1988). The constraint models adopt the
principle that parkers will look for a satisfactory parking space rather than an optimal one
(Young, 1982). Gravity models for parking allocation (Bullen, 1982) provide estimates of the
interchange of trips between particular origins and destinations. Their behavioural basis offers
a number of advantages over the optimisation and constraint models, particularly for those trips
where the parking location decision would affect the choice of destination. However, the
vehicle trip production and attraction (i.e. parking demand) may not be fixed if the effects of
road congestion and accessibility on trip production and attraction are taken into account.
Nour Eldin et al. (1981) developed a model that differs from the traditional assignment model by
considering the interaction between parking supply and vehicular traffic assigned to urban streets.
The links included all parking facilities in the road network, and a capacity correction factor was
incorporated to take illegal parking into account. Gur and Beimborn (1984) developed an
equilibrium assignment model for the analysis of the parking process in dense urban areas. The model
could provide estimates of parking impedance for each destination zone in the study area and
the level of use of each parking location in the area. Both models nevertheless assumed that
parking demand is constant and fixed.
Bifulco (1993) studied a model, which consists of a supply model, a demand model and a
supply/demand model. On the supply side, network-based and proper functions are introduced
to simulate the attributes (e.g. parking access time, searching time) related to parking choices.
The demand side of the model consists of a stochastic choice model in which a steady-state
parking demand by time period is allocated onto the network and parking spaces. The
connection between two successive time periods is mainly the parking occupancies carried over
to the next period.
The traffic assignment/allocation models proposed by Nour Eldin et al. (1981) and Gur and
Beimborn (1984) both adopt the traditional deterministic user equilibrium (UE)
conditions, while Bifulco's model (1993) uses stochastic user equilibrium (SUE) assignment.
However, they all assume a fixed origin-destination (O-D) demand. In general, the target O-D
matrix should vary with the future development of each urban area and the
traffic conditions of the study network. To overcome this shortcoming, a combined trip
distribution/assignment (CDA) model (Evans, 1976; Lam and Huang, 1992) can be adopted to
incorporate both destination and route choices of motorists.
Lam et al. (1998) evaluated the parking demand in Hong Kong using a stated preference survey
and examined the influence of parking space availability on mode choice. A parking demand
model has also been developed to forecast the future demand for parking facilities in different
districts of Hong Kong. It was found that there was a shortfall in parking facilities provision,
which is consistent with the results obtained by Lam and Tam (1997). Lam and Tam (1997)
pointed out that an under-estimation of parking demand would be obtained if the standard
transport model was used to predict car ownership in Hong Kong. In order to assess maximum
car ownership under network constraints, a bilevel programming model was developed by Tam
and Lam (1999). The growth potential of car ownership can be determined by the bilevel
programming model under the constraints of road capacities and parking supply.
It is found that the previous work mainly concentrates on parking behaviour and deals with the
allocation of parking demand for a fixed parking supply. In strategic transport planning, the
number of parking spaces should be supplied in response to elastic parking demand. A balance
of demand and supply of parking spaces seems to be important to motorists. The balance of
demand and supply of parking spaces can reduce the searching time for an available parking
space, and hence reduce the total network travel time. Although increasing the number of
parking spaces can reduce parking search-time, it would induce traffic and decrease the
utilisation rate of a parking space. Thus, a balance of demand and supply of parking spaces
would lead to a reduction in total network travel time and full utilisation of parking space.
This paper makes three extensions of the existing parking models. Firstly, it is the first bilevel
programming model that balances the parking demand and supply coherently. Secondly, the
CDA model is adopted to incorporate the destination and route choices of motorists
simultaneously. Thus, the O-D travel demand is not fixed. Finally, the vehicle trip production
and attraction ends are elastic to traffic congestion and availability of parking spaces.
This paper is organised as follows. Sections 2 and 3 present a description of the assumptions
and notation adopted in this paper, respectively. In Section 4, a bilevel programming model is
presented, which aims to balance the number of parking spaces demanded and supplied in the
study network. In this model, the number of car trips attracted to a destination (i.e. the parking
demand) is affected by the delay in searching for a parking space at the destination.
Section 5 proposes a heuristic sensitivity analysis based algorithm for solving the model. A
numerical example is presented to illustrate the application of the proposed model and the
solution algorithm in Section 6. Three scenarios with balanced and unbalanced parking
demand/supply are investigated using the numerical example.
ASSUMPTIONS
Generally, parking spaces can be classified as private or public. Home-end parking
demand is assumed to be catered for by private parking spaces that are known and fixed.
Any excess home-end parking demand is treated as pre-occupancy of the
public parking spaces. On the other hand, public users cannot use the private car parks. In
this paper, attention is given to attraction-end parking. Thus, the public parking space is the
key decision variable in the proposed model. In the following sections, parking spaces
(demand and supply) refer to public parking spaces only. For simplicity, a single
user category is adopted in this paper. Further assumptions are also made to
facilitate the presentation of the essential ideas without loss of accuracy or leading to
erroneous conclusions.
As the proposed model is intended for strategic planning of parking supply, the
following assumptions are made throughout the study:
(a) Travel times on road links are continuous, strictly increasing functions of link flows. The
link travel time functions are assumed to be differentiable and separable. These
assumptions ensure the uniqueness of the solution to the network equilibrium problem
if it exists (Sheffi, 1985). The capacity constraint effect is not considered in this paper, but the
overflow delay is incorporated into the link travel time function (Bell and Iida, 1997).
(b) Drivers have sufficient and perfect network information to make routing decisions in a
user equilibrium manner (Sheffi, 1985).
(c) The study period is assumed to be a one-hour (unit time) period, such as the morning peak
hour. It is known that the morning peak hour is usually the most critical period of a
typical weekday, and all car trips are taken to be home-based work trips to work places. It is
also assumed that no round trips occur during the one-hour study period.
(d) The network is assumed to be fixed with constant road capacity in terms of passenger car
units per hour (pcu/hr).
(e) The population, number of job places and number of cars in each traffic zone are given and fixed.
(f) As the study period is a one-hour time-slice of a weekday, parking spaces are allowed to be
occupied at the beginning of the study period. It should be noted that the proposed model
could be extended to a time-dependent dimension, because the model allows
parking spaces to be occupied at the beginning of the one-hour study period and to be
carried over to the next one-hour period (Bifulco, 1993).
(g) As the proposed model is for planning purposes, each car must occupy one parking space
at the destination during the study period, i.e. no illegal parking is allowed.
(h) When a car trip is attracted to a traffic zone, it represents a car entering the traffic zone and
looking for a parking space. Thus, trip attraction is equivalent to the parking demand.
(i) The zonal trip production by car is assumed to be a function of the number of people living
in a zone, the number of cars owned by the residents in the zone, and an accessibility
measure for producing car trips. The accessibility measure reflects the degree of ease or
difficulty in making trips from each production zone (Leake and Huzayyin, 1980; Bruton,
1985).
(j) The zonal trip attraction by car is assumed to be a function of the number of job places in a
zone, the number of parking spaces available in that zone and an accessibility measure for
attracting car trips (i.e. ease or difficulty of making trips to each attraction zone).
(k) The accessibility measure for trip production is affected by the number of car trips
attracted and the generalised travel time from an origin zone to a destination zone (Leake
and Huzayyin, 1980; Ortuzar and Willumsen, 1994), while the accessibility measure for
trip attraction is influenced by the number of car trips produced and the generalised travel
time by O-D pair.
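Assumption (a) can be illustrated with a link travel time function that is continuous, strictly increasing and differentiable, and that does not enforce a hard capacity limit. The sketch below uses the widely known BPR form as a stand-in; this section does not state which functional form the paper adopts, so the function, its coefficients (0.15 and 4) and the link data are assumptions for illustration only.

```python
def bpr_travel_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """BPR-type link travel time (hrs); alpha and beta are conventional defaults."""
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


# Hypothetical link: 0.2 hr free-flow travel time, 1800 pcu/hr capacity.
flows = [0.0, 600.0, 1200.0, 1800.0, 2400.0]
times = [bpr_travel_time(v, 0.2, 1800.0) for v in flows]

for v, t in zip(flows, times):
    print(f"flow {v:6.0f} pcu/hr -> travel time {t:.4f} hr")
```

The travel time rises monotonically with flow and stays finite when the flow exceeds capacity (the last entry), so over-capacity conditions appear as additional delay rather than as an infeasible assignment.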
NOTATION
Sets
A: the set of links in the network
W: the set of O-D pairs
R: the set of paths in the network
$R_w$: the set of paths between O-D pair $w \in W$
I: the set of origin zones (or production zones)
J: the set of destination zones (or attraction zones)
Vectors
v: a vector of all link flows
T: a vector of O-D demand by car (lower-level decision variables)
h: a vector of the number of parking spaces supplied (upper-level decision variables)
g: a vector of O-D travel times (journey times)
d: a vector of parking delays
Variables
$v_a$: flow (pcu/hr) on link $a \in A$
$f_r$: flow (pcu/hr) on path $r \in R$
$c_a(v_a)$: travel time (hrs) on link $a \in A$
$\delta_{ar}$: entry is 1 if path $r \in R$ uses link $a$, and 0 otherwise
$t_{ij}$: travel demand between O-D pair $(i, j)$
$O_i$: trip production by car at origin zone $i$ (pcu/hr)
$D_j$: trip attraction by car at destination zone $j$ (pcu/hr)
Dj: balanced trip attraction by car at destination zone j (pcu/hr)
z,: accessibility measures for producing car trips in origin zone i that measures the
expected maximum utility of travel on the road network as perceived from origin i
z'j\ accessibility measures for attracting car trips in destination zone j that measures the
expected maximum utility of travel on the road network as perceived to destination/
hf. the number of parking spaces supplied in destination zone/
gy. O-D travel time (journey time) (hrs)
d -(D ) : parking delay at destination zone/
Constants
$S_a$: capacity (pcu/hr) of link $a \in A$
$c_a^0$: free-flow travel time (hrs) of link $a \in A$
$q_i$: population in zone $i$
$e_j$: employment in zone $j$
$u_i$: the number of cars owned by the residents in zone $i$
$h_j^{occ}$: pre-occupied parking spaces in zone $j$
$h_j^{max}$: upper limit of supply of parking spaces in zone $j$
$d_{0j}$: free-flow parking access time (hrs) in zone $j$
$F_j$: parking charge in zone $j$ (expressed in terms of equivalent time (hrs))
$\alpha$: a dispersion parameter for the gravity-type trip distribution model
$\beta_0$: a pre-determined parameter that measures the additional number of trips that would be
generated from a given origin $i$ if the zonal accessibility increased by unity
$\beta_1$: a pre-determined parameter that measures the additional number of trips that would be
generated from a given origin $i$ if the zonal population increased by unity
$\beta_2$: a pre-determined parameter that measures the additional number of trips that would be
generated from a given origin $i$ if the zonal car ownership increased by unity
$\gamma_0$: a pre-determined parameter that measures the additional number of trips that would be
attracted to a given destination $j$ if the zonal accessibility increased by unity
$\gamma_1$: a pre-determined parameter that measures the additional number of trips that would be
attracted to a given destination $j$ if the zonal employment increased by unity
$\gamma_2$: a pre-determined parameter that measures the additional number of trips that would be
attracted to a given destination $j$ if the number of zonal available parking spaces
increased by unity
$\theta$: a pre-determined parameter that reflects the sensitivity of the utility of travel between
any given O-D pair $w$ to changes in the network's performance
NETWORK EQUILIBRIUM PROBLEM WITH ELASTIC TRIP PRODUCTION AND ATTRACTION ENDS
The problem of minimising the total supply of parking spaces while satisfying the elastic parking
demand can be formulated as the following bilevel programming problem:
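$$\text{(Upper-level)} \quad \text{Minimise} \ \sum_{j \in J} h_j \tag{1}$$
subject to
$$\sum_{i \in I} t_{ij}(\mathbf{h}) \le h_j - h_j^{occ}, \quad j \in J \tag{2}$$
$$h_j^{occ} \le h_j \le h_j^{max}, \quad j \in J \tag{3}$$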
where the equilibrium O-D demand $t_{ij}(\mathbf{h})$, $i \in I$, $j \in J$, is obtained by solving the following
network equilibrium trip distribution/assignment problem:
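$$\text{(Lower-level)} \quad \text{Minimise} \ \sum_{a \in A} \int_0^{v_a} c_a(x)\,dx + \frac{1}{\alpha} \sum_{i \in I} \sum_{j \in J} t_{ij} (\ln t_{ij} - 1) + \sum_{j \in J} \int_0^{D_j} d_j(y)\,dy \tag{4}$$
subject to
$$\sum_{r \in R_w} f_r = t_{ij}, \quad w = (i, j), \ i \in I, \ j \in J \tag{5}$$
$$\sum_{j \in J} t_{ij} = O_i, \quad i \in I \tag{6}$$
$$\sum_{i \in I} t_{ij} = \bar{D}_j, \quad j \in J \tag{7}$$
$$v_a = \sum_{r \in R} f_r \delta_{ar}, \quad a \in A \tag{8}$$
$$f_r \ge 0, \quad r \in R \tag{9}$$
$$t_{ij} \ge 0, \quad i \in I, \ j \in J \tag{10}$$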
The trip production and attraction in equations (6) and (7) are defined as below:
$$O_i = \beta_0 z_i + \beta_1 q_i + \beta_2 u_i, \quad i \in I \tag{11}$$
$$D_j = \gamma_0 z'_j + \gamma_1 e_j + \gamma_2 (h_j - h_j^{occ}), \quad j \in J \tag{12}$$
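and the accessibility variables are defined as follows (Safwat and Magnanti, 1988):
$$z_i = \max\Big\{0, \ \ln \sum_{j \in J} D_j \exp\big(-\theta (g_{ij} + d_j)\big)\Big\}, \quad i \in I \tag{13}$$
$$z'_j = \max\Big\{0, \ \ln \sum_{i \in I} O_i \exp\big(-\theta (g_{ij} + d_j)\big)\Big\}, \quad j \in J \tag{14}$$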
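The trip-end and accessibility definitions can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the log-sum form follows the description of the accessibility measures given above, the default parameter values are those listed in Table 2, and the journey times and delays in the usage lines are made-up numbers rather than results of the model.

```python
import math

def accessibility(sizes, journey_times, delays, theta):
    """Log-sum accessibility: max{0, ln sum_k size_k * exp(-theta * (g_k + d_k))}."""
    total = sum(s * math.exp(-theta * (g + d))
                for s, g, d in zip(sizes, journey_times, delays))
    return max(0.0, math.log(total)) if total > 0 else 0.0

def trip_production(z, population, cars, beta0=125.60, beta1=0.0063, beta2=0.46):
    """O_i = beta0 * z_i + beta1 * q_i + beta2 * u_i (parameter values from Table 2)."""
    return beta0 * z + beta1 * population + beta2 * cars

def trip_attraction(z_prime, employment, supply, pre_occupied,
                    gamma0=122.40, gamma1=0.0072, gamma2=0.53):
    """D_j = gamma0 * z'_j + gamma1 * e_j + gamma2 * (h_j - h_j_occ)."""
    return gamma0 * z_prime + gamma1 * employment + gamma2 * (supply - pre_occupied)

# Hypothetical inputs: three destinations seen from one origin.
attractions = [2300.0, 2300.0, 2300.0]   # D_j (pcu/hr), illustrative only
journey_times = [0.3, 0.5, 0.8]          # g_ij (hrs), illustrative only
parking_delays = [0.17, 0.17, 0.17]      # d_j (hrs), illustrative only

z_low_theta = accessibility(attractions, journey_times, parking_delays, theta=0.2)
z_high_theta = accessibility(attractions, journey_times, parking_delays, theta=1.0)
production_zone_1 = trip_production(z_low_theta, population=35_000, cars=1_800)
print(z_low_theta, z_high_theta, production_zone_1)
```

With the same travel times, a larger $\theta$ gives a smaller accessibility value and therefore fewer produced trips, which is the mechanism the sensitivity tests later in the paper examine.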
The BPR (Bureau of Public Roads) link travel time function is used:
$$c_a(v_a) = c_a^0 \left[ 1 + 0.15 \left( \frac{v_a}{S_a} \right)^4 \right], \quad a \in A \tag{15}$$
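As a minimal sketch, the BPR function can be coded as below. The coefficients 0.15 and 4 are the commonly quoted BPR defaults and are an assumption here; the function name is hypothetical.

```python
def bpr_travel_time(flow, free_flow_time, capacity, coeff=0.15, power=4.0):
    """BPR link travel time (hrs): c0 * (1 + coeff * (v / S) ** power).

    coeff and power are the usual BPR defaults (assumed, not taken from the paper).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + coeff * (flow / capacity) ** power)

# A link with a free-flow time of 0.15 hrs and a capacity of 1500 pcu/hr (as in Table 1).
time_empty = bpr_travel_time(0.0, 0.15, 1500.0)
time_at_capacity = bpr_travel_time(1500.0, 0.15, 1500.0)
print(time_empty, time_at_capacity)
```

At a flow equal to capacity the travel time is 15% above the free-flow value, and it grows rapidly beyond that point because of the fourth-power term.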
It is noted in equations (11) and (12) that the sensitivity of accessibility measures (Ortuzar and
Willumsen, 1994) and the effect of social factors on trip production and attraction are
incorporated in the proposed model. The merit of elastic trip production and attraction ends is
the capability of reflecting the responses of motorists to traffic congestion and availability of
parking facilities. These include changing the time of day at which a journey is made and
switching to an alternative mode or not making the journey.
As more car trips are attracted to a destination, the parking demand at that destination
increases and the time required to search for a parking space becomes longer. Hence, the
parking delay is not fixed, but is a function, $d_j = d_j(D_j)$, of the number of car trips
attracted to destination $j$. The parking delay function is strictly increasing in trip attraction.
The generalised cost for parking delay can be considered as the free-flow parking access time,
$d_{0j}$, plus the searching time for an available parking space, plus the parking fee. The equivalent
time of the parking fee, $F_j$, can be calculated using a pre-determined value of time. The
free-flow access time is the minimum time taken to reach the parking facilities in a given zone
under free-flow conditions. A complicated searching time function for an available parking
space was adopted by Bifulco (1993). In this paper, a simplified searching time function (16) is
adopted. In order to calibrate the parameters of equation (16), the searching time was first
calculated by Bifulco's search-time cost function with various parameter values. The values of
the parameters in equation (16) were then obtained by regression analysis based on the
calculated data.
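$$\text{searching time} = 0.31 \left( \frac{D_j}{h_j - h_j^{occ}} \right)^{4.03}, \quad j \in J \tag{16}$$
Hence, the parking delay function (in hrs) becomes:
$$d_j(D_j) = d_{0j} + 0.31 \left( \frac{D_j}{h_j - h_j^{occ}} \right)^{4.03} + F_j, \quad j \in J \tag{17}$$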
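The parking delay can be sketched as a small Python function: access time, plus a searching time that grows with the ratio of demand to available spaces, plus the parking charge in equivalent hours. The coefficient 0.31 appears in the text; the exponent 4.03 and the use of available spaces (supply minus pre-occupied spaces) in the ratio are assumptions of this sketch, and the function name is hypothetical.

```python
def parking_delay(demand, supply, pre_occupied, access_time, charge,
                  coeff=0.31, power=4.03):
    """Parking delay (hrs) = access time + searching time + parking charge.

    The searching time is coeff * (demand / available) ** power, where
    available = supply - pre_occupied. The exponent is an assumed value.
    """
    available = supply - pre_occupied
    if available <= 0:
        raise ValueError("no parking spaces available to visitors")
    searching_time = coeff * (demand / available) ** power
    return access_time + searching_time + charge

# Zone 1 data: access time 0.020 hrs and charge 0.080 hrs (Table 2).
delay_initial = parking_delay(2360.10, 3500.0, 80.0, 0.020, 0.080)    # Table 3 values
delay_balanced = parking_delay(2351.55, 2431.55, 80.0, 0.020, 0.080)  # Table 4 values
print(delay_initial, delay_balanced)
```

When demand equals the available supply the ratio is one, so the searching time is 0.31 hrs whatever the exponent; the exponent only matters away from the balanced point.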
The lower-level problem (4)-(10) is a standard combined trip distribution/assignment (CDA)
problem that can be solved by a convex-combination method for given $\mathbf{h}$ and trip production
and attraction ends. The traffic flow $v_a$ obtained in the lower-level problem represents the
equilibrium flow on link $a \in A$ when the number of parking spaces in zone $j$ is $h_j$. The
balanced trip attractions ($\bar{D}_j$) are used to adjust the total trip attractions so that they
equal the sum of trip productions. By solving the CDA problem, the equilibrium link
flows, path flows, O-D travel patterns, journey times and parking delays are obtained. The
resultant journey time $g_{ij}$ and parking delay $d_j$ are used to update the accessibility measures
using equations (13) and (14). These results are then fed into the upper-level problem (1)-(3)
to determine the minimum total number of parking spaces required subject to the parking
demand constraints.
A new set of parking spaces $h_j$ by traffic zone $j$ is obtained by solving the upper-level
problem. This set of values is then applied to equation (12) and the parking delay
function (17) for solving the lower-level problem again. This process is repeated until the
desired convergence is achieved (see the proposed algorithm below). Figure 1 shows the
flow chart of the mechanism of the proposed model.
The attraction trips (representing the number of cars entering a destination zone) should be less
than or equal to the number of parking spaces so that the attraction-end parking demand is
fulfilled. It is assumed that illegal parking is not permitted. As it is further assumed that no
round trips are made during the one-hour peak period, the cars entering a destination zone are
classified as visitors and cannot occupy the parking spaces owned by the residents in that zone.
In general, the number of parking spaces built in each traffic zone should be bounded by an
upper limit due to the limited land supply and/or parking standards.
The parameters in equations (11) and (12) are to be estimated for trip production and attraction,
respectively. The general form of accessibility measures can be expressed as a function of the
generalised travel time between zones $i$ and $j$, and the size of activity in zone $i$ or $j$ (Leake and
Huzayyin, 1979; Ortuzar and Willumsen, 1994). In equations (13) and (14), the sizes of activity
in zones $i$ and $j$ are defined as the trip production and attraction in the last iteration, respectively.
$\theta$ is a parameter that reflects the sensitivity of accessibility to journey time. An increase in
accessibility would generate higher demand for travel. Since accessibility is inversely related to
parameter $\theta$, changes in $\theta$ tend to have a greater effect on accessibility to work activities
than to non-work activities (Dalvi and Martin, 1976). Thus, a small value should be
adopted for work journeys. Sensitivity tests for various values of $\theta$ are carried out in the
numerical example.
The dispersion parameter $\alpha$ for trip distribution in equation (4) reflects the sensitivity of
trip distribution to the journey time from an origin to a destination. An increase in $\alpha$ shifts
the O-D travel demand and/or trip length frequency towards shorter journey times. The effect of
$\alpha$ is examined together with the sensitivity tests on parameter $\theta$.
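The role of the dispersion parameter can be illustrated with a small sketch. An entropy term of the form in objective (4) leads to a gravity-type distribution in which demand decays exponentially with generalised time. The code below uses a simplified, production-constrained version only (it ignores the balancing of attractions), and the journey times are hypothetical.

```python
import math

def destination_shares(generalised_times, alpha):
    """Share of trips from one origin to each destination, proportional to exp(-alpha * time)."""
    weights = [math.exp(-alpha * t) for t in generalised_times]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical generalised times (hrs) from one origin to three destinations.
times = [0.3, 0.5, 0.8]

shares_low_alpha = destination_shares(times, alpha=0.1)
shares_high_alpha = destination_shares(times, alpha=0.7)
print(shares_low_alpha)
print(shares_high_alpha)
```

With the larger value of $\alpha$ the nearest destination receives a larger share of the trips, which is the shift towards shorter journey times described above.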
Figure 1: Flowchart of the Proposed Model

SENSITIVITY ANALYSIS BASED ALGORITHM
The parking demand constraint (2) in the upper-level problem involves the nonlinear and
implicit function of decision variable h. Therefore, local linear approximations using Taylor's
formula are implemented based on the derivatives of the O-D demand function with respect to
the number of parking spaces by traffic zone. The derivative information is obtained by
implementing the method of sensitivity analysis (Tobin and Friesz, 1988; Yang, 1997). The
resulting linear programming problem can then be solved using the well-known simplex
method.
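The linear approximation used in the parking demand constraint (2) can be derived as below:
$$t_{ij}(\mathbf{h}) \approx t_{ij}(\mathbf{h}^*) + \nabla_{\mathbf{h}} t_{ij}(\mathbf{h}^*)^{\mathsf{T}} (\mathbf{h} - \mathbf{h}^*), \quad i \in I, \ j \in J \tag{18}$$
where $\nabla_{\mathbf{h}} t_{ij}$ can be obtained by the method of sensitivity analysis for the network equilibrium
problem, $\mathbf{h}^*$ is the solution at the current iteration and $t_{ij}(\mathbf{h}^*)$ is the corresponding equilibrium
O-D travel demand. Equation (18) is then applied to equation (2) to form a set of simple linear
constraints.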
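As a worked illustration of this step, assume that the parking demand constraint requires the trips attracted to zone $j$ not to exceed the spaces available to visitors, $h_j - h_j^{occ}$ (an assumption about the exact form of constraint (2), based on the description in the text). Replacing each demand by its first-order Taylor expansion about the current solution $\mathbf{h}^*$ gives

$$\sum_{i \in I} \Big[ t_{ij}(\mathbf{h}^*) + \nabla_{\mathbf{h}} t_{ij}(\mathbf{h}^*)^{\mathsf{T}} (\mathbf{h} - \mathbf{h}^*) \Big] \le h_j - h_j^{occ}, \quad j \in J,$$

which, after collecting the terms in $\mathbf{h}$, is linear in the decision variables:

$$h_j - \sum_{i \in I} \nabla_{\mathbf{h}} t_{ij}(\mathbf{h}^*)^{\mathsf{T}} \mathbf{h} \ \ge \ h_j^{occ} + \sum_{i \in I} \Big[ t_{ij}(\mathbf{h}^*) - \nabla_{\mathbf{h}} t_{ij}(\mathbf{h}^*)^{\mathsf{T}} \mathbf{h}^* \Big], \quad j \in J.$$

Together with the linear objective and the bounds on $h_j$, this is a linear programme in $\mathbf{h}$ at each iteration.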
The mechanism of the solution algorithm is an iterative process between the upper-level and
lower-level problems. The proposed sensitivity analysis based (SAB) algorithm can be
described as follows:
SAB Algorithm:
Step 0. Determine an initial number of parking spaces $\mathbf{h}^{(0)}$ and trip productions and
attractions by cars. Set $k = 0$.
Step 1. Solve the lower-level combined trip distribution/assignment problem (4)-(10)
for given $\mathbf{h}^{(k)}$, and hence get $\mathbf{v}^{(k)}$, $\mathbf{T}^{(k)}$, $\mathbf{g}^{(k)}$ and $\mathbf{d}^{(k)}$.
Step 2. Calculate the accessibility measures using equations (13) and (14), and hence
find the new trip productions and attractions by equations (11) and (12).
Step 3. Calculate the derivative $\nabla_{\mathbf{h}} \mathbf{T}^{(k)}$ using the sensitivity analysis method.
Step 4. Formulate local linear approximations of the upper-level parking constraint (2)
using the derivative information, and solve the resulting linear programming
problem to obtain the new number of parking spaces $\mathbf{h}^{(k+1)}$.
Step 5. If $|h_j^{(k+1)} - h_j^{(k)}| < \omega$ for all $j \in J$ then stop, where $\omega$ is a pre-determined error
tolerance and is set to be 0.0001 in this paper. Otherwise let $k := k + 1$ and return
to Step 1.
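The iteration can be illustrated on a deliberately simplified, single-zone toy problem. In the sketch below the lower-level equilibrium and the sensitivity analysis are replaced by a hypothetical closed-form demand response and its derivative (`toy_demand` and `toy_demand_slope` are invented for illustration and are not the paper's model). With one variable, the linear programme of Step 4 reduces to taking the smallest supply that satisfies the linearised constraint.

```python
import math

def toy_demand(available):
    """Hypothetical attracted demand (pcu/hr) as a function of available spaces."""
    return 1000.0 + 300.0 * math.log(1.0 + available / 100.0)

def toy_demand_slope(available):
    """Derivative of toy_demand with respect to the number of available spaces."""
    return 300.0 / (100.0 + available)

def balance_supply(initial_supply, pre_occupied, max_supply, tol=1e-4, max_iter=50):
    """Successive linearisation: find the smallest supply h with demand <= h - pre_occupied."""
    h = initial_supply
    for iteration in range(1, max_iter + 1):
        available = h - pre_occupied
        demand = toy_demand(available)       # stand-in for Steps 1 and 2
        slope = toy_demand_slope(available)  # stand-in for Step 3
        if slope >= 1.0:
            raise RuntimeError("linearised constraint is unbounded for this toy response")
        # Step 4: demand + slope * (h_new - h) <= h_new - pre_occupied,
        # so the minimal feasible h_new is:
        h_new = (demand - slope * h + pre_occupied) / (1.0 - slope)
        h_new = min(max(h_new, pre_occupied), max_supply)  # bounds on supply
        # Step 5: stop when the change in supply is below the tolerance.
        if abs(h_new - h) < tol:
            return h_new, iteration
        h = h_new
    raise RuntimeError("no convergence within max_iter iterations")

supply, iterations = balance_supply(initial_supply=3500.0, pre_occupied=80.0, max_supply=5000.0)
print(supply, iterations)
```

At convergence the toy demand equals the spaces available to visitors, which mirrors the balanced outcome of Scenario (1). In the full model each zone's demand depends on the supply in every zone, so Step 4 requires a genuine linear programme rather than this one-variable shortcut.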
NUMERICAL EXAMPLE
A numerical example is presented to illustrate how to use the proposed model to minimise the
supply of parking spaces while satisfying the elastic parking demand. The example is designed to
demonstrate the advantages of balancing the demand and supply of parking spaces using the
proposed model. Three scenarios are studied: (1) "Demand" = "Supply"; (2) "Demand" > "Supply"; and
(3) "Demand" < "Supply". The effects of the sensitivity of accessibility measures
(parameter $\theta$) on the total network travel time defined by equation (19) and on the supply of
parking spaces are investigated together with various values of the dispersion parameter $\alpha$.
$$\text{Total network travel time} = \sum_{a \in A} c_a(v_a) \, v_a + \sum_{j \in J} d_j(D_j) \, D_j \tag{19}$$
The example road network shown in Figure 2 consists of 4 zone centroids (represented by
dotted line circles and connected to road nodes by parking links), 6 road nodes and 14 one-way
links. The link travel time data are presented in Table 1, and the data for parking, trip production
and attraction and the parameter values are given in Table 2. The value of the dispersion parameter
$\alpha$ is assumed to be 0.1 for the gravity-type trip distribution model in this example. It should be
borne in mind that the parking spaces (demand and supply) used in this example refer to
public parking spaces.
Figure 2: Example Network

Table 1: Link Travel Time Data for the Example Network

Link number, a    Free-flow travel time, $c_a^0$ (hrs)    Capacity, $S_a$ (pcu/hr)
1, 3              0.15                                    1500
2, 8              0.15                                    800
4, 6              0.15                                    1500
5, 10             0.10                                    1500
7, 13             0.15                                    800
9, 11             0.15                                    1500
12, 14            0.15                                    1500
Table 2: Trip Production, Attraction and Parking Data for the Network

                                                        Zone 1    Zone 2    Zone 3    Zone 4
Population, $q_i$ ($10^3$)                              35        52        48        30
Employment, $e_j$ ($10^3$)                              32        21        27        20
Number of cars, $u_i$ ($10^3$)                          1.8       2.5       2.2       1.6
Free-flow parking access time, $d_{0j}$ (hrs)           0.020     0.035     0.030     0.025
Parking charge, $F_j$ (hrs)                             0.080     0.065     0.053     0.086
Upper limit of parking spaces supplied, $h_j^{max}$     5000      5000      5000      5000
Pre-occupied parking spaces, $h_j^{occ}$                80        120       75        50
$\beta_0$                                               125.60    125.60    125.60    125.60
$\beta_1$                                               0.0063    0.0063    0.0063    0.0063
$\beta_2$                                               0.46      0.46      0.46      0.46
$\gamma_0$                                              122.40    122.40    122.40    122.40
$\gamma_1$                                              0.0072    0.0072    0.0072    0.0072
$\gamma_2$                                              0.53      0.53      0.53      0.53
$\theta$                                                0.50      0.50      0.50      0.50
Scenario (1): Parking Demand = Parking Supply
Firstly, the balance of parking demand and supply is studied using the proposed model. As the
number of parking spaces supplied would affect the searching time for an available parking
space, which influences the desire for trip making and the distribution of travel patterns,
parking demand at each traffic zone would then be changed. Subsequently, the change of zonal
accessibility would lead to a change in trip production and attraction. Tables 3 and 4 show the
demand and supply of parking spaces and the total network travel time of the example network
under the initial and balanced scenarios, respectively. Since the balance of parking demand and
supply is obtained, the utilisation of parking spaces is 100%. The resultant link flows are
compared in Table 5 for the cases with initial and optimum solutions.
Table 3: The Results at Initial Conditions

                                               Zone 1     Zone 2     Zone 3     Zone 4     Total
(a) Parking demand, $D_j$                      2360.10    2284.74    2335.07    2307.19    9287.10
(b) Total parking supply, $h_j$                3500       3500       3500       3500       14000
(c) Pre-occupied parking spaces, $h_j^{occ}$   80         120        75         50         325
(d) Available parking spaces, (b) - (c)        3420       3380       3425       3450       13675
(a)/(d)                                        0.69       0.68       0.68       0.67       0.68
(e) Total network travel time (hrs)                                                        4585.11

Parking demand was calculated by equation (12).
Table 4: The Results of Scenario 1

                                               Zone 1     Zone 2     Zone 3     Zone 4     Total
(a) Parking demand, $D_j$                      2351.55    2189.47    2276.31    2195.29    9012.62
(b) Total parking supply, $h_j$                2431.55    2309.47    2351.32    2245.28    9337.62
(c) Pre-occupied parking spaces, $h_j^{occ}$   80         120        75         50         325
(d) Available parking spaces, (b) - (c)        2351.55    2189.47    2276.32    2195.28    9012.62
(a)/(d)                                        1.00       1.00       1.00       1.00       1.00
(e) Total network travel time (hrs)                                                        6609.15
Scenario (2): Parking Demand > Parking Supply
In the second scenario, the parking supply is reduced to 2000 parking spaces for each traffic
zone. If parking demand is greater than the number of parking spaces available, the parking
delay would be increased. Thus, the journey time would also increase. The accessibility
measures would reflect the increase of travel time. The desire of trip making would then be
reduced and car traffic suppressed. However, the parking demand would not be reduced
significantly in the morning peak hours. The resulting parking demand and supply are shown
in Table 6 together with the total network travel time. It is found that the overall parking
demand exceeds the available parking spaces by 17% and the total network travel time is
increased significantly by 35% as compared to the balanced scenario i.e. Scenario (1).
Table 5: The Equilibrium Link Flow and Flow/Capacity Ratio

                      Initial solution               Optimum solution
Objective function    14000.00                       9337.62

Link a    $v_a$ (pcu/hr)    $v_a/S_a$    $v_a$ (pcu/hr)    $v_a/S_a$
1         1423.40           0.95         1366.99           0.91
2         755.50            0.94         742.09            0.93
3         1530.51           1.02         1521.07           1.01
4         1579.81           1.05         1520.96           1.01
5         1568.19           1.04         1516.00           1.01
6         1759.60           1.17         1735.37           1.16
7         848.40            1.06         804.18            1.01
8         829.59            1.04         830.48            1.04
9         1615.21           1.08         1545.07           1.03
10        1495.51           1.00         1455.67           0.97
11        1579.56           1.05         1534.22           1.02
12        1458.79           0.97         1391.10           0.93
13        704.93            0.88         668.51            0.84
14        1350.47           0.90         1319.92           0.88
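As a cross-check of how the total network travel time is put together, the sketch below sums link travel time multiplied by link flow and parking delay multiplied by parking demand, using the optimum link flows of Table 5, the link data of Table 1 and the Scenario 1 zone data of Tables 2 and 4. The BPR coefficients (0.15 and 4) and the exponent 4.03 in the searching time are assumptions of this sketch, and the function names are hypothetical.

```python
# Link data from Table 1: link -> (free-flow time in hrs, capacity in pcu/hr)
LINK_DATA = {}
for links, free_flow_time, capacity in [
    ((1, 3), 0.15, 1500.0), ((2, 8), 0.15, 800.0), ((4, 6), 0.15, 1500.0),
    ((5, 10), 0.10, 1500.0), ((7, 13), 0.15, 800.0), ((9, 11), 0.15, 1500.0),
    ((12, 14), 0.15, 1500.0),
]:
    for link in links:
        LINK_DATA[link] = (free_flow_time, capacity)

# Equilibrium link flows (pcu/hr), "Optimum solution" column of Table 5
FLOWS = {1: 1366.99, 2: 742.09, 3: 1521.07, 4: 1520.96, 5: 1516.00, 6: 1735.37,
         7: 804.18, 8: 830.48, 9: 1545.07, 10: 1455.67, 11: 1534.22, 12: 1391.10,
         13: 668.51, 14: 1319.92}

# Scenario 1 zone data: (demand, supply, pre-occupied, access time, parking charge)
ZONES = [
    (2351.55, 2431.55, 80.0, 0.020, 0.080),
    (2189.47, 2309.47, 120.0, 0.035, 0.065),
    (2276.31, 2351.32, 75.0, 0.030, 0.053),
    (2195.29, 2245.28, 50.0, 0.025, 0.086),
]

def link_time(flow, free_flow_time, capacity):
    """BPR travel time with the usual 0.15 and 4 coefficients (assumed)."""
    return free_flow_time * (1.0 + 0.15 * (flow / capacity) ** 4)

def parking_delay(demand, supply, pre_occupied, access_time, charge):
    """Access time + searching time + charge; the exponent 4.03 is an assumption."""
    return access_time + 0.31 * (demand / (supply - pre_occupied)) ** 4.03 + charge

def total_network_travel_time(flows, zones):
    """Sum of link time x link flow plus parking delay x parking demand (veh-hrs)."""
    on_links = sum(link_time(v, *LINK_DATA[a]) * v for a, v in flows.items())
    at_parking = sum(parking_delay(*zone) * zone[0] for zone in zones)
    return on_links + at_parking

total = total_network_travel_time(FLOWS, ZONES)
print(round(total, 2))
```

Under these assumptions the result comes out very close to the 6609.15 hrs reported for Scenario 1 in Table 4, with parking delay accounting for more than half of the total.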

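Table 6: The Results of Scenario 2

                                               Zone 1     Zone 2     Zone 3     Zone 4     Total
(c) Pre-occupied parking spaces, $h_j^{occ}$   80         120        75         50         325
(a)/(d)                                        1.19       1.16       1.17       1.14       1.17
(e) Total network travel time (hrs)                                                        8918.26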
Since illegal parking is not allowed in the model for planning purposes, an artificial shadow
link (Lam and Zhang, 1999) can be introduced at each zone centroid so as to store the excess
demand vehicles. The excess demand vehicles then cruise around on the shadow links until
parking spaces become available in the next time slice. The waiting time on the shadow links
corresponds to the parking delay that is included in the model.
Scenario (3): Parking Demand < Parking Supply
In the third scenario, the parking supply is increased to 2500 parking spaces for each traffic
zone. When the motorists can easily find a parking space at their destination zones, the parking
delay is reduced. As the travel time is decreased, the change of zonal accessibility measures
would lead to a change in trip production and attraction. As a result, car traffic is induced and
the parking demand increases correspondingly. In this scenario, the available parking spaces
can still cater for the increase in parking demand. The results of parking demand and supply,
and total network travel time are shown in Table 7. It is found that the available parking supply
exceeds the total parking demand by 7% and the total network travel time is reduced by 10%
when compared with the one in Scenario (1).

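Table 7: The Results of Scenario 3

                                               Zone 1     Zone 2     Zone 3     Zone 4     Total
(c) Pre-occupied parking spaces, $h_j^{occ}$   80         120        75         50         325
(a)/(d)                                        0.95       0.93       0.94       0.92       0.93
(e) Total network travel time (hrs)                                                        5952.68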
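The percentage changes quoted for Scenarios (2) and (3) follow directly from the reported total network travel times, taking the balanced Scenario (1) value of 6609.15 hrs as the base:

$$\frac{8918.26 - 6609.15}{6609.15} \approx 0.349 \quad (\text{an increase of about } 35\%), \qquad \frac{6609.15 - 5952.68}{6609.15} \approx 0.099 \quad (\text{a reduction of about } 10\%).$$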
Overall Results
The overall results of the three scenarios show that the total network travel time decreases as
the supply of parking spaces increases. The total network travel time is greatest in the
scenario in which parking demand is greater than parking supply, and second greatest in the
scenario in which parking demand equals supply (i.e. the balanced scenario). The least
total network travel time is found in the scenario in which parking demand is less than parking
supply. However, in this scenario the excess parking spaces represent a waste of resources.
Figure 3 shows the total network travel time during the 1-hour study period for various
combinations of parking demand and supply, in which the dotted line represents the balanced
scenario (i.e. parking demand = parking supply). The results can be classified into three
categories, which are discussed below.
Figure 3: Total Network Travel Time (veh-hr) during the 1-hour Study Period for
Various Parking Demand and Supply (horizontal axis: parking supply, i.e. number of parking
spaces supplied, from 6000 to 12000; vertical axis: total network travel time)
Category 1. Parking demand = Parking supply
This category is a reference point for comparing the total network travel time with the
following two categories. It is observed that the total network travel time is 6300 veh-hrs when
parking demand and supply both equal 9000 parking spaces (i.e. the balanced scenario).
Category 2. Parking demand > Parking supply
In this category, it is found that the total network travel time increases sharply when parking
demand is greater than supply. This is because the parking delay is greatly increased by the
shortage of parking spaces. When the total supply of parking spaces is 9000 and the total
parking demand is 10000, the total network travel time is 40.75% higher than in the balanced
scenario. If the parking demand increases to 11000, the total network travel time is 99.35%
higher. Moreover, if the parking demand increases to 12000, the total network travel time is
181.95% higher. The rate of change in total network travel time thus increases dramatically
with the parking demand.
Category 3. Parking demand < Parking supply
In this category, the reduction in the total network travel time is comparatively limited when
the parking demand is less than supply. Compared with the balanced scenario with 9000 parking
spaces, a decrease of parking demand to 8000 leads to a 27.89% reduction in total network
travel time. If the parking demand is decreased to 7000 parking spaces, the total network travel
time is reduced by 46.88%. If the parking demand is decreased to 6000 parking spaces, a 60.03%
reduction in total network travel time is obtained. The rate of reduction in total network travel
time gradually decreases as the parking demand is reduced.
The results of the above three categories show that excess parking spaces cannot reduce the
total network travel time efficiently. On the other hand, a shortage of parking spaces would
increase the total network travel time significantly. Thus, balanced parking demand and supply
appears to be the most effective arrangement.
Sensitivity Tests on Parameter $\theta$
In order to assess the impact of the sensitivity of accessibility measures to journey time on the
minimum parking supply and total network travel time, a set of sensitivity tests has been
carried out with different values of parameter $\theta$. For varying values of $\theta$, the total network
travel time and the optimum total number of parking spaces supplied are plotted in Figures 4 and
5, respectively, together with different values of $\alpha$ (dispersion parameter for trip distribution).
It can be seen in Figures 4 and 5 that an increase in the value of $\theta$ results in a decrease in both
the total network travel time and the optimum total number of parking spaces supplied. This is
because the sensitivity to journey time increases with $\theta$ and hence there is a reduction in the
accessibility for trip production and attraction. Thus, the number of car trips made decreases, as
does the parking demand. Therefore, the corresponding parking supply can be reduced. Since
the travel demand decreases, there is less traffic on the roads and this leads to a reduction in the
total network travel time.
Figure 4: Sensitivity of Accessibility Parameter on Total Network Travel Time (horizontal
axis: sensitivity parameter on accessibility measures, $\theta$; vertical axis: total network travel
time; one curve for each value of $\alpha$)
Figure 5: Sensitivity of Accessibility Parameter on Optimum Total Number of Parking
Spaces Supplied (horizontal axis: sensitivity parameter on accessibility measures, $\theta$)
It can be seen in Figure 4 that the total network travel time decreases as the value of
parameter $\alpha$ increases. This is because, when $\alpha$ increases, the distribution of trips from an
origin shifts towards destinations with shorter journey times. Hence, the total network travel
time decreases. Figure 5 shows that the optimum number of parking spaces supplied is
practically independent of the value of parameter $\alpha$. Although the distribution of trips is
changed, the total number of car trips attracted to a destination is almost unchanged in this
example (with symmetric characteristics). Thus, the parking demand, and hence the supply of
parking spaces, is more or less the same for various values of $\alpha$.
Effectiveness
A cost analysis should be made to demonstrate the trade-off between the reduction in total
network travel time and the number of parking spaces supplied. It is supposed that the cost of
parking supply is proportional to the number of parking spaces provided to meet the parking
demand. On this basis, a measure of effectiveness of additional parking supply can be
defined as below:
$$\%\ \text{Effectiveness} = \frac{\text{Percentage reduction in total network travel time}}{\text{Number of additional parking spaces supplied}} \tag{20}$$
It is expected that the measure of effectiveness would decrease as the number of parking
spaces supplied increases. Suppose that the parking demand is 11000 and the reference parking
supply is 6000. Figure 6 shows the variation of the percentage reduction in total network travel
time against the number of additional parking spaces supplied.
Figure 6: Percentage Reduction in Total Network Travel Time by Additional Number
of Parking Spaces Supplied (horizontal axis: number of additional parking spaces supplied,
in thousands; curve shown for parking demand = 11000)
The results shown in Figure 6 indicate that the percentage reduction in total network travel time
falls by 15.26 percentage points, from 23.19 to 7.93, when the additional parking spaces are
increased from 1000 to 5000 at the destination zones (i.e. up to the balanced scenario). On the
other hand, the percentage reduction in total network travel time falls by only 5.62 percentage
points, from 7.93 to 2.31, if the parking spaces are further increased from 5000 to 9000. It can
be seen in Table 8 that the % effectiveness decreases as the number of parking spaces increases.
The highest effectiveness (0.052% per parking space) was found with the first 500
additional parking spaces.
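Table 8: % Effectiveness of Parking Spaces

Number of additional    Percentage reduction in      % Effectiveness    Regime
parking spaces          total network travel time
500                     25.89                        0.052              Demand > Supply
1000                    23.19                        0.046              Demand > Supply
1500                    20.68                        0.041              Demand > Supply
2000                    18.33                        0.037              Demand > Supply
2500                    16.16                        0.032              Demand > Supply
3000                    14.15                        0.028              Demand > Supply
3500                    12.33                        0.025              Demand > Supply
4000                    10.68                        0.021              Demand > Supply
4500                    9.22                         0.018              Demand > Supply
5000                    7.93                         0.016              Demand = Supply
5500                    6.80                         0.014              Demand < Supply
6000                    5.82                         0.012              Demand < Supply
6500                    4.98                         0.010              Demand < Supply
7000                    4.26                         0.009              Demand < Supply
7500                    3.65                         0.007              Demand < Supply
8000                    3.13                         0.006              Demand < Supply
8500                    2.69                         0.005              Demand < Supply
9000                    2.31                         0.005              Demand < Supply
9500                    1.99                         0.004              Demand < Supply
10000                   1.72                         0.003              Demand < Supply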
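The % effectiveness column of Table 8 can be reproduced with a few lines of Python. The tabulated values are recovered when each row's percentage reduction is divided by the 500-space step between rows, which suggests (as an interpretation, not a statement from the paper) that each reduction refers to the latest block of 500 additional spaces.

```python
# Rows of Table 8: (additional spaces, % reduction in travel time, % effectiveness)
TABLE_8 = [
    (500, 25.89, 0.052), (1000, 23.19, 0.046), (1500, 20.68, 0.041), (2000, 18.33, 0.037),
    (2500, 16.16, 0.032), (3000, 14.15, 0.028), (3500, 12.33, 0.025), (4000, 10.68, 0.021),
    (4500, 9.22, 0.018), (5000, 7.93, 0.016), (5500, 6.80, 0.014), (6000, 5.82, 0.012),
    (6500, 4.98, 0.010), (7000, 4.26, 0.009), (7500, 3.65, 0.007), (8000, 3.13, 0.006),
    (8500, 2.69, 0.005), (9000, 2.31, 0.005), (9500, 1.99, 0.004), (10000, 1.72, 0.003),
]

STEP = 500  # spaces added between consecutive rows of the table

def effectiveness(percentage_reduction, spaces_added=STEP):
    """Percentage reduction in total network travel time per additional parking space."""
    return percentage_reduction / spaces_added

computed = [round(effectiveness(reduction), 3) for _, reduction, _ in TABLE_8]
reported = [value for _, _, value in TABLE_8]
print(computed == reported)
```

The computed values fall monotonically from the first block of 500 spaces to the last, in line with the diminishing returns described in the text.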
CONCLUSIONS
This paper proposes a bilevel programming model to balance the demand and supply of parking
spaces. The proposed model takes into account the route and destination choices of motorists
simultaneously, in which the O-D travel demand is not fixed. The vehicle trip production and
attraction ends are elastic to traffic congestion, availability of parking spaces and variable
parking delays. A sensitivity analysis based algorithm has been developed for determining the
total minimum number of parking spaces required to cater for the elastic parking demand in the
study area.
The numerical example is presented to illustrate the application of the proposed model. The
effects of balanced and unbalanced parking demand/supply are investigated in the three
scenarios. The results show that the total network travel time is greatest when parking demand
is greater than parking supply. The least total network travel time is obtained in the scenario
that parking demand is less than parking supply, but the parking resource becomes inefficient.
The total network travel time in the balanced scenario lies in between, and the parking space is
fully utilised. The results also show that excess parking spaces supplied cannot reduce the total
network travel time efficiently. On the other hand, a shortage of parking spaces would increase
the total network travel time dramatically. Thus, a balance of parking demand and supply leads
to the most effective result on reduction of the total network travel time. The sensitivity tests of
the two parameters for the accessibility measures and trip distribution are also carried out to
examine their effects on the number of parking spaces supplied and the total network travel
time.
As the degree of sensitivity of journey time on accessibility measures should be different for
different trip purposes, further study is required to extend the proposed model to simulate
multi-user classes. Various types of parking spaces such as off-street and on-street for private
cars and goods vehicles can also be considered in the model. The proposed model can also be
extended to the time-dependent case because the model allows for parking spaces to be
occupied at the beginning of the one-hour study period and to be carried over to the next hour
period. The number of parking spaces supplied can then be determined at the most critical
period with the greatest parking demand, although parking demand may be less than parking
supply in the other periods throughout the day.
ACKNOWLEDGEMENTS
The work described in this paper was substantially supported by a grant from the Research
Committee of The Hong Kong Polytechnic University (Project No. G-V152).
REFERENCES
Bell, M.G.H. and Y. Iida (1997). Transportation Network Analysis. John Wiley & Sons,
Chichester.
Bifulco, G.N. (1993). A stochastic user equilibrium assignment model for the evaluation of
parking policies. European Journal of Operational Research, 71, 269-287.
Bruton, M.J. (1985). Introduction to Transportation Planning. Hutchinson, London.
Bullen, A.G.R. (1982). Development of computerized analysis of alternative parking
management policies. Transportation Research Record, 845, 31-37.
Coombe, D., P. Guest, G. Schofield and A. Skinner (1997). The effects of parking control
strategies in the city of Bristol. Traffic Engineering and Control, 38, 204-208.
Dalvi, M.Q. and K.M. Martin (1976). The measurement of accessibility: some preliminary
results. Transportation, 5, 17-42.
Ergun, G. (1971). Development of downtown parking model. Highway Research Record, 369,
118-134.
Evans, S.P. (1976). Derivation and analysis of some models for combining trip distribution and
assignment. Transportation Research, 10, 37-57.
Feeney, B.P. (1989). A review of the impact of parking policy measures on travel demand.
Transportation Planning and Technology, 13, 229-244.
Gur, Y.J. and E.A. Beimborn (1984). Analysis of parking in urban centres: Equilibrium
assignment approach. Transportation Research Record, 957, 55-62.
Hunt, J.D. (1988). Parking location choice: insights and representation based on observed
behaviour and hierarchical logit modelling formulation. Institute of Transportation
Engineers, 58th Annual Meeting, Vancouver, Canada.
Lam, W.C.H., R.Y.C. Fung, S.C. Wong and C.O. Tong (1998). The Hong Kong parking
demand study. Proceedings of the Institution of Civil Engineers, Transport, 129, 218-
227.
Lam, W.H.K. and H.J. Huang (1992). A combined trip distribution and assignment model for
multiple user classes. Transportation Research B, 26B, 275-287.
Lam, W.H.K. and M.L. Tam (1997). Why standard modelling and evaluation procedures are
inadequate for assessing traffic congestion measures. Transport Policy, 4, 217-223.
Lam, W.H.K. and Y. Zhang (1999). Capacity-constrained traffic assignment in networks with
residual queues. ASCE Journal of Transportation Engineering (forthcoming).
Leake, G.R. and A.S. Huzayyin (1979). Accessibility measures and their suitability for use in
trip generation models. Traffic Engineering and Control, 20, 566-572.
Leake, G.R. and A.S. Huzayyin (1980). Importance of accessibility measures in trip production
models. Transportation Planning and Technology, 6, 9-20.
Nour Eldin, M.S., T.Y. El-Reedy and H.K. Ismail (1981). A combined parking and traffic
assignment model. Traffic Engineering and Control, 22, 524-530.
Oppenlander, J.C. and R.F. Dawson (1988). Optimal location and sizing of parking facilities.
Institute of Transportation Engineers, 58th Annual Meeting, Vancouver. Technical
Paper 428.
Ortuzar, J. de D. and L.G. Willumsen (1994). Modelling Transport. John Wiley & Sons,
Chichester.
Polak, J. and K. Axhausen (1990). Parking search behaviour: a review of current research and
future prospects. Transport Studies Unit. Working Paper 540, Oxford University.
Safwat, K.N.A. and T.L. Magnanti (1988). A combined trip generation, trip distribution, modal
split, and trip assignment model. Transportation Science, 18, 14-30.
Salomon, I. (1986). Towards a behavioural approach to city centre parking. Cities, August, 200-
208.
Sheffi, Y. (1985). Urban Transportation Networks: Equilibrium Analysis with Mathematical
Programming Methods. Prentice-Hall, Englewood Cliffs, NJ.
Tam, M.L. and W.H.K. Lam (1999). Maximum car ownership under constraints of road
capacity and parking space. Transportation Research A (forthcoming).
Thompson, R.G. and A.J. Richardson (1998). A parking search model. Transportation
Research A, 32, 159-170.
Tobin, R.L. and T.L. Friesz (1988). Sensitivity analysis for equilibrium network flows.
Transportation Science, 22, 242-250.
Yang, H. (1997). Sensitivity analysis for the elastic-demand network equilibrium problem with
applications. Transportation Research B, 31B, 55-70.
Yang, H., M.G.H. Bell and Q. Meng (1999). Modeling the capacity and level of service of
urban transportation networks. Transportation Research (forthcoming).
Young, W. (1982). The development of an elimination-by-aspects model of residential location
choice. PhD. Thesis, Department of Civil Engineering, Monash University.
Young, W., R.G. Thompson and M.A.P. Taylor (1991). A review of urban car parking models.
Transport Reviews, 11, 63-84.
CHAPTER 10
TRAVELLER SURVEY AND TRANSIT PLANNING
The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore, all progress
depends on the unreasonable man. (George Bernard Shaw)
Only those who will risk going too far can possibly find out how far one
can go.
Better ask twice than to lose your way once.
THE ROLE OF LIFESTYLE AND ATTITUDINAL CHARACTERISTICS
IN RESIDENTIAL NEIGHBORHOOD CHOICE
Michael N. Bagley, University of California, Davis, USA
and
Patricia L. Mokhtarian, University of California, Davis, USA
ABSTRACT
This paper investigates the importance of attitudinal and lifestyle variables to residential
neighborhood choice for 492 residents of three San Francisco Bay Area neighborhoods. One
neighborhood, North San Francisco (N = 155), was classified as traditional, whereas the other
two, Concord (N = 165) and San Jose (N = 172), were classified as suburban. Separate factor
analyses identified 10 attitudinal dimensions and 11 lifestyle dimensions. Mean factor scores
for the three neighborhoods differed significantly for most of the factors. For example,
consistent with expectations, the mean scores on the pro-high density, pro-environment, pro-
pricing, and pro-alternatives attitudinal factors were significantly higher for residents of
traditional NSF than for the suburban residents. On lifestyle dimensions, NSF residents were
significantly more likely to be culture-lovers, and less likely to be nest-builders and altruists,
than the suburbanites. These seven factors, together with three sociodemographic variables
(number of children under age 16, number of vehicles, and years lived in the Bay Area - all
positively associated with the suburban neighborhoods), were significant in the final binary
logit model of residential neighborhood choice. The adjusted ρ² for the model was 0.52, and
the collective contribution of the attitudinal/lifestyle factors provides support for the
usefulness of this approach to residential choice modeling. In particular, it is suggested that
this approach will help illuminate the policy-relevant question as to whether observed
differences in travel behavior are induced by the land use configuration of the neighborhood
itself, or are derived from intrinsic propensities for different travel choices. Evidence is
mounting that the second hypothesized mechanism is stronger: that is, that as an explanation
for travel behavior, neighborhood type tends to act as a proxy for the "true" explanatory
variables with which it is strongly associated, namely attitudinal and lifestyle predispositions.
1. INTRODUCTION
It seems self-evident that residential location decisions profoundly influence urban travel
patterns, but the precise nature of that influence is not completely understood. For example,
numerous empirical studies (see, e.g., Frank and Pivo, 1994; Rutherford et al., 1996; and
Kitamura et al., 1997) have demonstrated that living in higher-density, mixed-use
("traditional") neighborhoods is associated with fewer vehicle trips and smaller distances
traveled compared to living in typical low-density suburban environments. These encouraging
results have supported a growing movement (known as New Urbanism) to use land use
planning and design as a tool for reducing travel. But whether the land use configuration
"causes" the observed travel patterns, or whether people with different a priori travel
propensities select themselves into residential neighborhoods which support those propensities, is
impossible to determine from residential location and travel data alone. The difference is
important, since if a selection bias is at work, policies designed to induce travel changes
through land use changes may not have the expected or desired effect.
To understand the extent to which travel-related predispositions influence residential location,
it is necessary to collect data on individuals' attitudes and lifestyle preferences. To our
knowledge, only one previous study (Prevedouros, 1992) has rigorously measured personality
characteristics and analyzed their association with choice of residential neighborhood type
(without developing an explicit choice model). This paper measures a different and more
extensive set of lifestyle and attitudinal factors, and employs them together with travel and
demographic characteristics as explanatory variables in a model of residential neighborhood
choice. We demonstrate that lifestyle and attitudinal factors contribute significant explanatory
power beyond the types of variables normally included in models of residential choice, such
as distance to work and shopping (Louviere and Timmermans, 1990), commute mode
(Horowitz, 1995), and car ownership (Weisbrod et al., 1980).
The rest of this paper is organized as follows. Section 2 describes the study background,
including discussion of the data and the definition of the residential choice dependent variable
used in this research. The next section compares sociodemographic, travel, and residential
characteristics across the three neighborhoods studied. Section 4 develops and analyzes the
attitudinal and lifestyle variables. Findings from a binary logit model of residential choice are
presented in Section 5. The final section provides conclusions and directions for future
research.
2. STUDY BACKGROUND
The data used for this study were originally collected for a land use-travel behavior project
sponsored by the California Air Resources Board in 1992. A sizeable amount of micro-scale
data on land use, the roadway network, and public transit was obtained from site surveys of
five San Francisco Bay Area neighborhoods (selected sections of approximately one square
mile within the cities or areas of Concord, Pleasant Hill, North San Francisco, South San
Francisco, and San Jose). In addition, demographic, socioeconomic, attitudinal, lifestyle, and
travel-related data were collected through mail-out surveys and travel diaries completed by
residents in the same neighborhoods. The main objective of the original study was to examine
the impacts of neighborhood type (i.e., land use) and individual attitudes on travel behavior.
Thus, variability of neighborhood type was important in the development of regression-based
travel-behavior models (Kitamura et al., 1997). To obtain this variability, the five
neighborhoods were chosen to represent relatively extreme values in terms of key factors
describing land use type: public transit accessibility, land use mix, residential density, and
employment mix.
Though the original study was aimed at understanding the influence of land use and attitudes
on travel behavior, the data set contains an abundance of information germane to residential
choice modeling, including the data on attitudes and lifestyles (described in Section 4), which
is not normally obtained in such studies. For a more detailed description of the data and the
study, refer to Kitamura et al. (1994).
A respondent in this study lives in one of five neighborhoods, each of which could be
considered an indicator of residential choice. Indeed, some residential choice studies, such as
Lerman (1975) and Horowitz (1995), take census tracts or other location indicators as the
dependent variable. However, to develop residential choice models that are robust and
transferable, the generic characteristics of a neighborhood are of greater interest than a specific
geographic location itself. The trait of "traditionalness" is the defining dimension chosen for
this study (though many other traits such as aesthetic appeal could be used).
Two of the five neighborhoods had mixed characteristics, of both suburban and traditional
neighborhoods, while the remaining three represented the two extremes of a traditionalness
dimension relatively well. North San Francisco (NSF) was characterized by high residential
density, a large number of businesses (470), a grid-like street pattern, high public transit
accessibility (21 bus routes), and numerous wide sidewalks. The Concord neighborhood was
described by low residential density, a moderate number of businesses (149), a mixed
curvilinear/grid-like street pattern, low public transit accessibility (3 bus routes), and
discontinuous sidewalk paths. The San Jose neighborhood also exhibited low residential
density, the fewest businesses (96), a grid street pattern containing numerous discontinuities,
low public transit accessibility (5 bus routes), and few sidewalks.
Since the goals of this study suggested highlighting as sharp a contrast as possible between
the two neighborhood types, the two "hybrid" neighborhoods were dropped from the analysis
reported here (they were retained for other analyses involving a continuous traditionalness
measure itself as the dependent variable). The dependent variable for the residential choice
models presented in Section 5 is a binary variable classifying the respondent's residential
location as traditional or suburban.
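For reference, a binary dependent variable of this kind is commonly modeled with the standard binary logit form sketched below, together with one common definition of the adjusted ρ² statistic quoted in the abstract. The symbols x (explanatory variables), β (coefficients), K (number of estimated parameters), and the log-likelihood notation are assumptions introduced here for illustration; the paper's own specification is the one presented in Section 5, and the baseline it uses for ρ² is not stated in this section.

```latex
% Standard binary logit, assuming a linear-in-parameters utility difference \beta^{\top} x
P(\text{traditional} \mid x) = \frac{1}{1 + \exp(-\beta^{\top} x)}, \qquad
P(\text{suburban} \mid x) = 1 - P(\text{traditional} \mid x)

% Adjusted rho-squared relative to the zero-coefficient (equal-shares) model:
% \mathcal{L}(\hat{\beta}) = log-likelihood at the estimates, \mathcal{L}(0) = log-likelihood with \beta = 0
\bar{\rho}^{2} = 1 - \frac{\mathcal{L}(\hat{\beta}) - K}{\mathcal{L}(0)}
```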
The final data set used here contains 492 cases from the three neighborhoods: North San
Francisco (traditional, N=155), Concord (suburban, N=165), and San Jose (suburban, N=172).
Nearly all of the variables have very little missing data (2% or less) due to an initial screening
of cases for study use that required full survey data from respondents on attitudinal and
lifestyle questions. Variables based on trip diary data had the highest percentage of missing
data, but were still valuable. The next two sections highlight the associations between
neighborhood type and respondent sociodemographics, attitudes, and lifestyles.
3. DESCRIPTION OF THE SAMPLE
Tables 1 and 2 present key sociodemographic, travel, and residential characteristics of the
respondents, by neighborhood of residence. Table 1 shows that, on average, respondents
across the three neighborhoods are similar in some ways: they tend to be college-educated,
have managerial or professional occupations, have moderate household incomes, have similar
numbers of workers in the household, and be similarly likely to be female. This commonality
across neighborhoods is valuable for this study in that it reduces the influence of confounding
factors often found in research on travel behavior and/or residential location. In other words,
it allows the focus to be placed on other factors involved in the decision-making process
instead of variables like income, education (a proxy for factors such as awareness or
availability of job opportunities), gender, or number of workers - variables which have been
found in other studies (e.g., Madden, 1981; Timmermans et al., 1992) to influence residential
choice. For example, respondents in (traditional) North San Francisco could have chosen to
live in either of the other two (suburban) neighborhoods, and variables like income or
awareness are likely not to be the main factors in choosing the neighborhood.
Table 1: Sociodemographic Characteristics of the Sample (N = 492)
Variable                                NSF (N=155)¹    CON (N=165)¹    SJ (N=172)¹
Occupation²: number (percent)
  Manager/administrator³                25 (17%)        19 (12%)        29 (18%)
  Professional/technical³               48 (33%)        56 (35%)        66 (41%)
  Administrative support³               22 (15%)        21 (13%)        21 (13%)
  Retired                               16 (10%)        35 (21%)        34 (20%)
Household composition: mean (standard deviation)
  Household size                        1.83 (0.90)     2.45 (1.09)     2.72 (1.11)
  No. people over 16⁴                   1.75 (0.80)     1.98 (0.66)     2.16 (0.74)
  No. people under 16⁵                  0.11 (0.39)     0.49 (0.89)     0.57 (0.92)
  No. full-time workers                 0.98 (0.70)     0.94 (0.72)     1.00 (0.71)
  No. workers (part- or full-time)      1.19 (0.61)     1.12 (0.73)     1.26 (0.77)
Personal characteristics: mean (standard deviation)
  Age⁶                                  43.7 (14.2)     54.1 (14.8)     52.2 (13.9)
  Education category⁷,¹¹                4.32 (1.23)     3.58 (1.24)     3.82 (1.28)
  Female⁸,¹²                            81 (53%)        83 (50%)        83 (48%)
  Household income category⁹,¹¹         6.14 (1.45)     6.32 (1.26)     6.44 (1.44)
  Years lived in Bay Area¹⁰             19.7 (17.1)     35.2 (18.2)     32.3 (15.3)
¹ Sample sizes vary according to missing data levels and/or applicable cases. However, unless otherwise noted,
sample sizes for NSF, CON, and SJ are 155, 165, and 172 respectively.
² Not all job type categories are presented, and consequently, percentages will not sum to 100%.
³ N=147 (NSF), N=161 (CON), N=162 (SJ); ⁴ N=149 (NSF), N=161 (CON), N=172 (SJ);
⁵
⁷
⁹ N=152 (NSF), N=161 (CON), N=169 (SJ); ¹⁰ N=155 (NSF), N=163 (CON), N=171 (SJ).
¹¹ Education and household income were collected as ordinal variables. For education, a value of 4 represents
completion of a 4-year degree, and for income, a value of 6 represents $35,001 to $50,000 a year.
¹² The values given for this variable are the number and percentage of the sample that are female.
Table 2: Travel and Residential Characteristics of the Sample (N = 492)
Variable                                      NSF (N=155)¹     CON (N=165)¹     SJ (N=172)¹
General travel information: mean (standard deviation)
  No. of vehicles                             1.35 (0.92)      2.15 (1.01)      2.37 (1.08)
  No. of vehicles/driver                      0.91 (0.55)      1.07 (0.50)      1.09 (0.54)
  Commute distance² (1-way, miles)            6.70 (11.10)     15.98 (14.86)    14.05 (11.76)
  Person trips/day (any mode,                 4.75 (3.08)      4.16 (2.43)      4.15 (2.14)
    3-day average)³
Mode choices: number (percent)
  Use public transit⁹                         124 (80%)        107 (65%)        83 (48%)
  Use bike or walk mode⁴,⁹                    79 (61%)         40 (28%)         17 (11%)
Residential characteristics: mean (standard deviation)
  Home size (square feet)⁵                    1304 (825)       1527 (483)       1678 (398)
  No. of bedrooms⁶                            2.02 (1.09)      2.98 (0.70)      3.51 (0.63)
  Home value category⁷,¹⁰                     5.58 (1.25)      3.76 (0.80)      4.53 (0.62)
  Monthly rent category⁸,¹⁰                   3.42 (1.15)      2.94 (0.97)      3.47 (1.23)
Most important reasons for choosing current neighborhood: number (percent)
selecting this among their top three reasons
  Housing cost                                85 (55%)         101 (61%)        103 (60%)
  Close to shops and services                 57 (37%)         31 (19%)         43 (25%)
  Close to work                               53 (34%)         53 (32%)         29 (17%)
  Good school                                 8 (5%)           26 (16%)         36 (21%)
¹ Sample sizes will vary according to missing data levels and/or applicable cases. However, unless
otherwise noted, sample sizes for NSF, CON, and SJ are 155, 165, and 172 respectively.
²
⁴
⁶
⁸ N=104 (NSF), N=17 (CON), N=17 (SJ).
⁹ Binary variable equal to one if respondent used public transit or bike/walk, respectively, during the
three-day travel diary period, or indicated in a section describing his/her "most common trips" that at
least one such trip involved transit or bike/walk.
¹⁰ Home value and monthly rent were collected as ordinal categorical variables. Reference points for
each category include: 4 (home value ranging from $180,001 to $250,000), 5 (home value ranging
from $250,001 to $375,000), 2 (monthly rent ranging from $351 to $500), and 3 (monthly rent ranging
from $501 to $700), respectively.
However, there are also important sociodemographic differences among the neighborhoods.
Not surprisingly, households in the two suburban neighborhoods are significantly larger and
have more children, on average, than households in traditional NSF. Respondents in Concord
and San Jose are also significantly older and have lived in the Bay Area far longer than NSF
residents. Thus, the latter group is consistent with the stereotype of the young, upwardly
mobile professional single-person or dual-career household with no children - the population
segment most likely to be attracted to an urban environment such as that in NSF. It is
important to stress, however, that while objective demographic characteristics may be
correlated with attitudes, attitudes are expected to be a more reliable indicator of residential
choice. Two households may have identical demographic characteristics, but different attitudes
prompting them to make different choices. Conversely, each neighborhood (representing an
"identical" residential choice) is heterogeneous, containing households with a diversity of
demographic characteristics.
Table 2 presents some travel and residential characteristics of the sample, by neighborhood.
Again true to form, NSF households own significantly fewer vehicles per driver and use transit
and non-vehicular modes far more often than households in suburban Concord and San Jose.
(The higher transit use for Concord compared to San Jose is likely due to the BART rail rapid
transit station near the western boundary of the neighborhood). Average person trip rates are
significantly higher for NSF residents, and their commute distances are less than half as long,
compared to respondents in the other two neighborhoods.
As for residential characteristics, as expected, North San Francisco residences are smaller but
more expensive, on average, than residences in the suburban neighborhoods (except that the
average rented residence in NSF is about as expensive as its larger counterpart in San Jose).
As shown by the footnotes indicating sample sizes, only a third of the NSF respondents owned
their homes, compared to 90% for each of the other two neighborhoods.
Except for "housing cost", which was the most commonly-selected reason in all three
neighborhoods, respondents' primary reasons for choosing their current neighborhood generally
differed across location - in expected ways. Being close to shops and services was more
important to people living in the traditional neighborhood than to those in the suburban
neighborhoods, while having good schools was significantly more important to San Jose
respondents than to North San Francisco respondents (who most often had no school-age
children). Proximity to work was an important factor to San Jose residents only half as often
as it was to residents of the other two neighborhoods. Pairwise t-tests between the
neighborhoods for each of the variables showed that all differences between neighborhoods
were significant at the 95% confidence level except for "housing cost" (no differences across
all neighborhoods) and "close to work" (no difference between NSF and Concord).
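As an illustration of how such a pairwise comparison can be checked from published summary statistics alone, the following minimal Python sketch computes a two-sample t statistic for the number of vehicles per household in NSF versus Concord, using the means, standard deviations, and sample sizes reported in Table 2. The paper does not say which variant of the t-test was used; the unequal-variance (Welch) form and the function name welch_t are assumptions made here.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1          # squared standard error of the first mean
    v2 = sd2 ** 2 / n2          # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Table 2, number of vehicles: NSF 1.35 (0.92), N=155; CON 2.15 (1.01), N=165
t, df = welch_t(1.35, 0.92, 155, 2.15, 1.01, 165)
print(f"t = {t:.2f}, df = {df:.0f}")
```

A statistic whose magnitude exceeds roughly 1.96 corresponds to significance at the 95% confidence level for samples of this size.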
4. ATTITUDE AND LIFESTYLE MEASUREMENT
To measure attitudes and lifestyles, two factor analyses were performed on responses to
numerous survey items related to personal views and activities. Factor analysis is a statistical
technique for extracting a small number of fundamental dimensions (factors) from a large set
of intercorrelated variables measuring various aspects of those dimensions. After
experimenting with various factor extraction and rotation options, principal components
analysis (PCA) and oblique rotation solutions were selected in both cases, with the number of
factors chosen based on the eigenvalue-one and interpretability rules of thumb (Rummel,
1970). Tables 3 and 4 present the largest pattern matrix loadings for the final factor solutions.
The loading of lifestyle or attitude i on factor j represents factor j's unique contribution to the
variance in variable i. In practice, these loadings will generally lie between -1 and 1, and the
higher the magnitude of the loading, the more strongly variable i is associated with factor j.
Ratings on the original lifestyle/attitude variables were linearly combined to create
standardized scores on each factor for each case, where the contribution of each variable to
the factor score is approximately proportional to the loading of that variable on the same
factor. Mean values for each of the attitudinal and lifestyle factor scores by neighborhood are
presented in Figures 1 and 2 respectively, where the factors are arranged roughly in order of
degree of significant difference across neighborhood (based on a one-way analysis of variance
on each factor score).
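The extraction and scoring steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the SPSS procedure used in the study: it extracts principal components of the item correlation matrix, retains those with eigenvalues above one, and forms standardized component scores. It omits the oblique rotation applied in the paper, so the loadings it returns are unrotated component loadings rather than pattern-matrix loadings. The function name pca_factors and the synthetic item weights are hypothetical.

```python
import numpy as np

def pca_factors(X):
    """Principal components of the correlation matrix, kept by the eigenvalue-one rule."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)      # standardize each item
    R = np.corrcoef(Z, rowvar=False)                      # item correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)                  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]                     # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                                  # eigenvalue-one rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])  # item-component correlations
    scores = Z @ eigvecs[:, keep] / np.sqrt(eigvals[keep])  # unit-variance scores
    explained = eigvals[keep].sum() / eigvals.sum()       # share of total variance
    return loadings, scores, explained

# Synthetic example: six items driven by two latent dimensions (hypothetical weights)
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
weights = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
X = latent @ weights.T + rng.normal(scale=0.6, size=(500, 6))

loadings, scores, explained = pca_factors(X)
print(loadings.shape, round(float(explained), 3))
```

The explained share returned here is the analogue of the "percent of total variance" figures quoted for the factor solutions in Sections 4.1 and 4.2.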
4.1 Attitude Measurement
Responses on a 5-point Likert-type scale (strongly disagree to strongly agree) to 39 attitudinal
statements relating to urban life (covering topics such as urban transportation, the environment,
and housing) were factor analyzed with SPSS (Norusis, 1990). Table 3 shows the largest
pattern-matrix loadings for the final 10-factor solution, which accounts for 49.1% of the total
variance in the attitude data. One-way ANOVAs performed on each factor indicated that
mean scores on all but the last (pro-transit) factor differed significantly by neighborhood.
However, for brevity, only the four factors significant in the binary logit residential choice
models presented in Section 5 will be discussed below. For a more detailed discussion of all
ten factors, see Kitamura et al. (1994 and 1997).
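A one-way ANOVA of the kind applied to each factor score can be sketched as below. The data are synthetic: group sizes follow the three neighborhood samples and the group means follow the pro-high density means reported in this section (0.57, -0.49, -0.33), but the within-group standard deviation of 0.9 is a hypothetical value chosen only to make the example run. The function name one_way_anova_f is likewise an assumption.

```python
import numpy as np

def one_way_anova_f(groups):
    """F statistic and (between, within) degrees of freedom for a one-way ANOVA."""
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), all_values.size
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, (k - 1, n - k)

rng = np.random.default_rng(0)
groups = [
    rng.normal(0.57, 0.9, size=155),   # NSF (synthetic draws)
    rng.normal(-0.49, 0.9, size=165),  # Concord (synthetic draws)
    rng.normal(-0.33, 0.9, size=172),  # San Jose (synthetic draws)
]
f_stat, dfs = one_way_anova_f(groups)
print(f"F{dfs} = {f_stat:.1f}")
```

With 2 and 489 degrees of freedom, an F statistic above roughly 3.0 indicates that the mean scores differ across neighborhoods at the 5% level.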
Table 3: Strongest Pattern Matrix Loadings for Attitudinal Factor Scores¹
Statement Loading
Factor 1: Pro-High Density
I need to have space between me and my neighbors -0.75
I would only live in a multiple family unit (apartment, condo, etc.) as a last resort -0.69
It's important for children to have a large backyard for playing -0.67
High-density residential development should be encouraged 0.55
Factor 2: Pro-Environment
Environmental protection costs too much -0.78
Environmentalism hurts minority and small businesses -0.75
People and jobs are more important than the environment -0.73
Environmental protection is good for California's economy 0.73
Stricter vehicle smog control laws should be introduced and enforced 0.47
Factor 3: Pro-Pricing
I would be willing to pay a toll to drive on an uncongested road 0.76
We should raise the price of gasoline to reduce congestion and air pollution 0.38
Traffic congestion will take care of itself because people will make adjustments -0.25
Factor 4: Pro-Alternatives
Having shops and services within walking distance of my home would be important to me 0.50
Vehicle emissions increase the need for health care 0.49
I use public transportation when I cannot afford to drive 0.44
We should provide incentives to people who use electric or other clean-fuel vehicles 0.42
More lanes should be set aside for carpools and buses 0.39
Factor 5: Pro-Driving
Driving allows me to get more done 0.74
Driving allows me freedom 0.71
I would rather use a clean-fuel car than give up driving 0.66
Factor 6: Pro-Drive Alone
I like someone else to do the driving -0.70
I am not comfortable riding with strangers 0.62
Ridesharing saves money -0.49
Factor 7: Pro-Growth
We need to build more roads to help decrease congestion 0.59
Too many people drive alone -0.48
Too much agricultural land is consumed for housing -0.44
Getting stuck in traffic doesn't bother me too much 0.39
Factor 8: Work-Driven
I like to spend most of my time working 0.73
When things are busy at work, I get more done by cutting back on personal time 0.71
Factor 9: Time-Satisfied
I would like to have more time for leisure -0.74
I feel that I am wasting my time when I have to wait -0.69
Getting stuck in traffic doesn't bother me too much 0.48
Factor 10: Pro-Transit
Public transportation is unreliable -0.70
It costs more to use public transportation than it does to drive a car -0.59
Buses and trains are pleasant to travel in 0.57
I can read and do other things when I use public transportation 0.43
¹ Some lower and secondary factor loadings are presented when they help improve interpretation of
the factors.
Figure 1: Mean Attitudinal Factor Scores by Neighborhood
[Figure not reproduced; legible axis labels: Pro-Environment, Pro-Pricing, Pro-Alternatives]
The pro-high density factor is based on statements such as, "I need to have space between me
and my neighbors" (loading = -0.75) and "High-density residential development should be
encouraged" (loading = 0.55). It is hypothesized that a person who has a high score on this
factor will be more likely to prefer a residence in a high-density area. As expected, the mean
score on this factor for North San Francisco was much higher (0.57) than for Concord (-0.49)
or San Jose (-0.33), indicating that respondents in the traditional neighborhood are more
favorable toward high-density development than respondents in the suburban neighborhoods
of this study.
The pro-environment factor is defined by statements such as "environmental protection costs
too much" (loading = -0.78) and "stricter vehicle smog control laws should be introduced and
enforced" (loading = 0.47). An individual who is very environmentally sensitive may be more
likely to live in a traditional neighborhood, as this type of neighborhood uses land more
efficiently and facilitates the use of transportation modes other than the automobile. The mean
factor scores shown in Figure 1 support this hypothesis, with the ranking by neighborhood the
same as for the pro-high density factor although all means are less extreme for this factor than
for the first.
Pro-pricing and pro-alternatives are factors relating to regulations and policies concerning
transportation and the environment. The pro-pricing factor is characterized by statements such
as "I would be willing to pay a toll to drive on an uncongested road" (loading = 0.76) and
"We should raise the price of gasoline" (loading = 0.38). The pro-alternatives factor is
somewhat heterogeneous, but generally relates to the provision of alternatives to gasoline-
powered automobile travel, including statements such as, "We should provide incentives to
people who use electric or other clean-fuel vehicles" (loading = 0.42) and "More lanes should
be set aside for carpools and buses" (loading = 0.39). It is hypothesized that an individual
who favors policies supporting more environmentally-efficient forms of travel will be more
likely to live in a traditional neighborhood. Figure 1 supports this hypothesis, showing that,
on average, NSF residents scored significantly more highly on both these factors than did
residents of the two suburban neighborhoods.
These four attitudes collectively point to major differences between North San Francisco
respondents' and Concord and San Jose respondents' views of land use and the environment. It
is important to understand that this statistically significant difference in and of itself does not
imply a particular direction of causality. Do people with different attitudes choose to live in
different neighborhoods, or do the different neighborhoods in which people live engender
different attitudes? Although the latter direction of influence (residential location causes
attitudes) may well occur over the long run, we can only assert that the former relationship
(attitudes cause residential location) appears to be more plausible as the stronger direction of
influence in the short term. Resolving this question more completely would require a
longitudinal study of how the attitudes of residents of different types of neighborhoods change
with the length of time that they live in those neighborhoods.
4.2 Lifestyle Measurement
Lifestyle was measured based on the responses to three questions in the survey: 1) "What
types of subjects did you read last month (check all that apply)?", having 30 possible choices
plus "other"; 2) "What best describes the way you spent last weekend (check as many as
apply)?", having 19 choices plus "other"; and 3) "From the following lists, check all that you
have done within the last 12 months", having 57 possible responses in four categories labeled
outdoors/sports, entertainment/events, travel, and do it yourself/education/hobbies, plus "other"
responses for each category. Discarding the "other" responses resulted in a total of 106 binary
variables representing a diverse set of lifestyle activities.
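The construction of the binary lifestyle variables can be sketched as follows. The option counts per question (30, 19, and 57) are taken from the text; the question keys, the function names, and the example respondent are hypothetical and serve only to show how "check all that apply" answers become 0/1 indicators.

```python
# Number of checkbox options per survey question, excluding "other" (counts from the text)
QUESTION_OPTIONS = {
    "read_last_month": 30,
    "last_weekend": 19,
    "done_last_12_months": 57,
}

def encode_checklist(checked, n_options):
    """Turn a set of checked option indices (0-based) into a list of 0/1 indicators."""
    return [1 if i in checked else 0 for i in range(n_options)]

def encode_respondent(answers):
    """Concatenate the indicators for all three questions into one row."""
    row = []
    for question, n_options in QUESTION_OPTIONS.items():
        row.extend(encode_checklist(answers.get(question, set()), n_options))
    return row

# Hypothetical respondent: two reading topics, one weekend activity, three yearly activities
example = {
    "read_last_month": {0, 4},
    "last_weekend": {2},
    "done_last_12_months": {10, 11, 56},
}
row = encode_respondent(example)
print(len(row), sum(row))
```

Each respondent thus contributes one row of 30 + 19 + 57 = 106 indicators, which is the input to the factor analysis described next.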
Factor analysis was performed on these 106 variables (although this procedure is more
commonly conducted on variables that are at least approximately continuous, Rummel (1970)
points out that any data whatsoever can be factor-analyzed). The final eleven-factor solution
shown in Table 4 explains 29.4% of the total variance in the activity data, indicating that a
considerable amount of the total variance in lifestyle indicators falls outside the 11-dimensional
space spanned by the identified factors. Based on one-way ANOVAs on each factor, mean
scores on the first six factors of Table 4 and Figure 2 differed significantly by neighborhood.
Again, for brevity, only the three lifestyle factor scores that were significant in the models of
Section 5 will be discussed below.
The lifestyle factor that differs most significantly across neighborhoods is heavily defined by
activities such as: "attended a concert/symphony" (loading = 0.49), "attended the ballet"
(loading = 0.46), and "attended the theater" (loading = 0.39). Hence, this is labeled the
"culture-lover" factor. It is hypothesized that people with a culture-oriented lifestyle will be
more likely to choose a residence that is accessible to many cultural activities (most likely in
or near the high-density urban core). This hypothesis is supported by Figure 2, showing that
North San Francisco residents have a much higher mean score on the culture-lover factor
(0.65) than do residents of the two suburban neighborhoods Concord and San Jose (-0.35 and
-0.33, respectively).
Table 4: Strongest Pattern Matrix Loadings for Lifestyle Factor Scores¹
Activity Description² Loading
Factor 1: Culture-Lover
Attended a concert or symphony 0.49
Attended the ballet 0.46
Read material on art or architecture 0.44
Attended the theater 0.39
Factor 2: Altruist
Read material on religion 0.61
Spent last weekend participating in religious activities 0.58
Volunteered to help the community 0.53
Spent last weekend doing volunteer work 0.52
Participated in community events 0.43
Factor 3: Nest-Builder
Read material on home improvement 0.65
Read material on gardening 0.57
Made house improvements myself 0.57
Put in a flower or vegetable garden 0.55
Spent last weekend doing yard work 0.53
Factor 4: Relaxer
Spent last weekend reading 0.56
Spent last weekend at home relaxing 0.55
Spent last weekend shopping 0.48
Spent last weekend doing chores 0.47
Factor 5: Traveler
Traveled to another country 0.47
Took a cruise 0.43
Visited another state 0.37
Visited a wildlife refuge 0.35
Factor 6: Adventurer
Went hunting 0.53
Used an off-road vehicle 0.51
Went to a shooting range 0.47
Participated in a motor cross 0.41
Factor 7: Fun-Seeker
Went to a zoo 0.54
Read children's stories 0.51
Visited an aquarium 0.45
Visited an amusement park 0.36
Factor 8: Homebody
Read materials on women's issues 0.65
Read material on fashion 0.60
Sewed (made clothes, quilts, etc.) 0.58
Read material on cooking or recipes 0.56
Did needlework or embroidery 0.54
Read material on decorating 0.50
Factor 9: Outdoor Enthusiast
Visited a national park or historic site 0.64
Visited a state park or historic site 0.60
Visited a local park or historic site 0.56
Went hiking or backpacking or camping 0.51
Visited a beach 0.49
Factor 10: Athlete
Participated in a sports event 0.64
Played tennis or golf 0.59
Attended a professional sports event 0.57
Read material on sports or exercise or health 0.55
Spent last weekend outdoors participating in sports 0.48
Factor 11: Hobbyist
Read material on science or nature 0.56
Read material on the environment 0.50
Read material on the outdoors 0.45
Read material on history 0.44
Read material on photography 0.41
Read material on humor 0.39
Read material on pets 0.32
Spent last weekend doing hobbies 0.29
¹ Some lower and secondary factor loadings are presented when they help improve interpretation of
the factors.
² The time frame for these activities is as follows: "Read material on..." within the past month; all
other activities occurred within the past 12 months except where noted to have taken place the past
weekend.
The next significant factor is characterized by activities such as: "read material on home
improvement" (loading = 0.65), "made house improvements myself" (loading = 0.57), and
"spent last weekend doing yard work" (loading = 0.53). This factor has been named
"nest-builder" as it refers to a lifestyle that involves many home-related activities. A person having
a high score on this factor is hypothesized to be more likely to live in a low-density
neighborhood, where homes and lots are larger and home-ownership is higher. As anticipated,
respondents from Concord and San Jose had higher average scores on this factor than did
respondents from North San Francisco.
Figure 2: Mean Lifestyle Factor Scores by Neighborhood (factors shown: Culture-lover, Nest-builder, Altruist, Hobbyist)

The third factor found significant in the residential choice models was labeled "altruist", based
on heavily-loading statements such as: "Spent last weekend on religious activities" (loading
= 0.58), "volunteered to help the community" (loading = 0.53), and "participated in
community events" (loading = 0.43). Although we did not have a prior hypothesis about how
this factor would differ by neighborhood, Figure 2 shows that San Jose residents scored most
highly on this factor, NSF residents scored most negatively, and Concord residents were nearly
neutral. While the differences are statistically significant according to the one-way ANOVA,
the spread between the highest and lowest mean is smaller than for the other two lifestyle
factors discussed. It may be that San Jose residents are marginally more conservative and
hence - perhaps - more inclined to be religious; it may be that their marginally larger
households, greater presence of children, and higher incomes give them somewhat more motivation
and means to participate in community activities (for example, joining the Parent-Teacher Association,
or a "stay-at-home" parent volunteering time to a charitable organization).
In identifying differences in lifestyle by neighborhood, it is again important to examine the
question of the direction of causality. The stronger direction of causality here may be more
debatable than in the case of attitudes, since neighborhood type could clearly influence the
activities undertaken. Do "culture-lovers" live in the urban core partly to have ready access
to cultural events, or do they live there for other reasons but, after the fact, are induced to take
advantage of their greater proximity to those events? Do "nest-builders" engage in home-
improvement activities mainly because their large suburban home and yard (which they've
chosen for other reasons) require them to do so, or do those who enjoy engaging in home-
improvement activities choose to live in such a neighborhood while those who do not enjoy
them choose to live in a higher-density, lower-maintenance residence such as a condo or
apartment?
Although we acknowledge the "residential location causes lifestyle" link to be potentially
important, we suggest that the reverse direction is still quite plausible in this context. The
specific definition of the lifestyle variables used here supports their interpretation as indicators
of predisposition - that is, as causes rather than effects. For the most part, the variables
represent activities which would be relatively accessible to everyone in a large metropolitan
area, regardless of their specific neighborhood type. The fact that the time frame for 57 of
the activities was "within the last year" even further allows for rough equality of opportunity
across neighborhood types. The 30 variables identifying subjects the respondents read about
within the last month are likely to reflect intrinsic interests, and again the subjects would be
equally available to residents of all neighborhood types.
5. RESIDENTIAL CHOICE MODELS
A binary logit model of residential choice with dependent variable alternatives "suburb" (1)
and "traditional neighborhood" (0) was estimated on the data from the three neighborhoods
(N=492). More than 60 measures representing travel, residence, employment, attitude,
lifestyle, and sociodemographic characteristics were evaluated for inclusion as explanatory
variables in the model. In addition to t- and chi-square tests and analysis of variance, stepwise
procedures of adding and removing variables (singly or in groups) were conducted to select
the final, "best" model shown in Table 5.
With an adjusted $\rho^2$ statistic of 0.52 (compared to 0.10 for the market share model containing
only a constant), the overall model goodness-of-fit is respectable. The negative sign for the
constant term indicates that unmeasured variables favor the choice of a traditional
neighborhood on average. The remaining ten significant variables all had the expected signs,
and fall into three categories: sociodemographic variables, attitude factor scores, and lifestyle
factor scores.
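As a rough illustration of how such a binary logit model turns explanatory variables into a choice probability, the sketch below uses the full-model coefficients reported in Table 5. It assumes the usual normalization in which the utility of the "traditional neighborhood" alternative is zero; the dictionary keys, the function name and the example respondent are hypothetical and introduced only for this illustration.

```python
import math

# Full-model coefficients as reported in Table 5 (Suburb = 1, Traditional = 0).
# The key names are hypothetical labels chosen for this sketch.
COEFFS = {
    "constant": -1.36,
    "people_under_16": 0.83,
    "vehicles": 0.77,
    "years_in_bay_area": 0.03,
    "pro_pricing": -0.58,
    "pro_environment": -0.29,
    "pro_high_density": -0.84,
    "pro_alternatives": -0.63,
    "altruist": 0.32,
    "culture_lover": -0.76,
    "nest_builder": 0.57,
}


def suburb_probability(respondent):
    """Binary logit probability of choosing a suburban neighborhood.

    `respondent` maps variable names to values; omitted variables count as 0.
    Assumes the traditional-neighborhood utility is normalized to zero.
    """
    utility = COEFFS["constant"]
    for name, value in respondent.items():
        utility += COEFFS[name] * value
    return 1.0 / (1.0 + math.exp(-utility))


# Hypothetical respondent: two children, two vehicles, ten years in the Bay Area,
# and neutral (zero) scores on every attitude and lifestyle factor.
example = {"people_under_16": 2, "vehicles": 2, "years_in_bay_area": 10}
print(round(suburb_probability(example), 3))
```

With all variables at zero the probability is driven by the negative constant alone, which is one way to read the statement that unmeasured variables favor a traditional neighborhood on average.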
The three sociodemographic variables - number of people under age 16, number of vehicles,
and number of years lived in the Bay Area - are all positively associated with choosing a
suburban neighborhood. The appeal of the suburbs (larger homes and yards, perceived better
schools and safer environment) as a place to raise a family needs no further explanation. The
association of higher numbers of vehicles with a suburban residence is consistent with findings
from the travel behavior/land use literature (see, e.g., Cervero, 1996; Rutherford et al., 1996),
although some mutual causality could certainly be at work here (the move to a suburban home
with attached garage and less transit availability may necessitate and/or facilitate the
acquisition of additional cars). The last significant sociodemographic variable, "years lived
in the Bay Area", can be considered a life cycle proxy. Specifically, it is hypothesized that
people who have large values for this variable are older and more likely to have (or have had)
children living at home. Even if a household is now in the empty nest stage and no longer
needs the four bedroom home near good schools, inertia may keep it in that location which
was optimal in the past. Nijkamp et al. (1993) hypothesized that life cycle is the
"predominant explanatory factor in residential relocation decisions".
Seven out of the 10 significant variables in the model are attitudinal (4) and lifestyle (3) factor
scores, demonstrating the considerable explanatory power of these types of variables. As
would be expected from Figure 1 and the discussion in Section 4.1, the signs on the four
attitudinal variables are all negative, indicating that people scoring highly on the pro-pricing,
pro-environment, pro-high density, and pro-alternatives factors are significantly more
likely to live in a traditional neighborhood. The signs for the three lifestyle factor scores are
consistent with Figure 2 and the discussion in Section 4.2: those scoring highly on the culture-lover
factor are more likely to live in a traditional neighborhood, whereas those scoring highly
on the nest-builder and altruist factors are more likely to be suburbanites. From the final
log-likelihoods presented in Table 5, chi-squared tests on the inclusion of each block of variables
can be performed. The result is that the addition of each block of variables is significant to
the final model, but that the attitudinal/lifestyle block contributes more explanatory power than does the
sociodemographic block.

Table 5: Relative Effects of Sociodemographic Characteristics and Attitude/Lifestyle
Factors in a Binary Logit Residential Choice Model
(Dependent Variable: Suburb = 1 and Traditional Neighborhood = 0)
S = Sociodemographic, A = Attitude Factor, L = Lifestyle Factor. A dash marks a variable excluded from that model.

Variable Name and Type          Full Model                  Attitude and Lifestyle      Sociodemographic
                                                            Factors Excluded            Characteristics Excluded
                                       β       t  (p-val.)         β       t  (p-val.)         β       t  (p-val.)
Constant                           -1.36   -3.49    (0.00)     -2.52   -7.66    (0.00)      1.02    7.24    (0.00)
Number of people under 16 (S)       0.83    3.50    (0.00)      0.91    4.77    (0.00)         -
Number of vehicles (S)              0.77    4.43    (0.00)      1.01    6.83    (0.00)         -
Years lived in Bay Area (S)         0.03    3.38    (0.00)      0.05    6.59    (0.00)         -
Pro-pricing (A)                    -0.58   -3.34    (0.00)         -                       -0.63   -4.10    (0.00)
Pro-environment (A)                -0.29   -1.87    (0.06)         -                       -0.35   -2.40    (0.02)
Pro-high density (A)               -0.84   -4.86    (0.00)         -                       -0.94   -6.00    (0.00)
Pro-alternatives (A)               -0.63   -3.70    (0.00)         -                       -0.60   -3.85    (0.00)
Altruist (L)                        0.32    2.23    (0.03)         -                        0.28    2.17    (0.03)
Culture-lover (L)                  -0.76   -4.91    (0.00)         -                       -0.86   -5.95    (0.00)
Nest-builder (L)                    0.57    3.83    (0.00)         -                        0.69    5.14    (0.00)
Number of observations               492                         492                         492
Initial log-likelihood           -341.03                     -341.03                     -341.03
Log-likelihood at convergence    -151.95                     -219.04                     -181.46
ρ²                                  0.55                        0.36                        0.47
Adjusted ρ²                         0.52                        0.35                        0.44
χ²                                378.16                      243.98                      319.16
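The block tests mentioned above can be sketched directly from the log-likelihoods in Table 5. The snippet below is a minimal illustration, not the authors' own computation: it forms the likelihood ratio statistic for each excluded block and compares it with standard 5% chi-squared critical values (7.815 for 3 degrees of freedom, 14.067 for 7). The function and constant names are hypothetical.

```python
# Log-likelihoods as reported in Table 5.
LL_INITIAL = -341.03        # initial log-likelihood
LL_FULL = -151.95           # full model
LL_NO_ATT_LIFE = -219.04    # attitude and lifestyle factors excluded (7 variables)
LL_NO_SOCIODEM = -181.46    # sociodemographic characteristics excluded (3 variables)

# Standard 5% critical values of the chi-squared distribution, keyed by degrees of freedom.
CHI2_CRITICAL_5PCT = {3: 7.815, 7: 14.067}


def lr_statistic(ll_restricted, ll_unrestricted):
    """Likelihood ratio statistic, -2 * (LL_restricted - LL_unrestricted)."""
    return -2.0 * (ll_restricted - ll_unrestricted)


def rho_squared(ll_model, ll_initial=LL_INITIAL):
    """Unadjusted likelihood ratio index, 1 - LL(model) / LL(initial)."""
    return 1.0 - ll_model / ll_initial


lr_att_life = lr_statistic(LL_NO_ATT_LIFE, LL_FULL)   # block of 7 factor scores
lr_sociodem = lr_statistic(LL_NO_SOCIODEM, LL_FULL)   # block of 3 sociodemographics

print(round(lr_att_life, 2), lr_att_life > CHI2_CRITICAL_5PCT[7])
print(round(lr_sociodem, 2), lr_sociodem > CHI2_CRITICAL_5PCT[3])
print(round(rho_squared(LL_FULL), 2))
```

Both statistics exceed their critical values by a wide margin, and the statistic for the attitudinal/lifestyle block is the larger of the two, which is in line with the conclusion drawn in the text.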
6. CONCLUSIONS AND DIRECTIONS FOR FURTHER RESEARCH
This paper investigated the importance of attitudinal and lifestyle variables to residential
neighborhood choice for residents of three San Francisco Bay Area neighborhoods. It was found
that attitudinal measures on topics such as residential density and transportation pricing
policies varied significantly between residents of traditional and suburban neighborhoods.
Similarly, measures of respondents' lifestyles, based on participation in activities such as
cultural events and home improvement, were found to be strongly associated with
neighborhood choice. Evaluation of the relative contributions of sociodemographic and
attitudinal/lifestyle blocks of variables suggested that the latter category carried greater
explanatory power.
This effort is currently being extended in two important ways. The first way involves the
development of an individual-specific, continuous indicator of neighborhood "traditionalness",
and the use of that measure as the dependent variable in a residential choice model. The
dichotomous classification of traditional or suburban used in the current work masks
considerable variation within neighborhood, introduces "noise" into the model, and therefore
weakens the explanatory power of the model. For example, an individual's observed attributes
(explanatory variables) may lead to the prediction that she would choose a traditional
neighborhood. In fact she may live in suburban Concord (an "incorrect" prediction based on
the binary dependent variable used here), but in an apartment complex near the BART station
and shops/services (consistent with her observed explanatory variables). An individual-specific
measure of the level of traditionalness she experiences - a measure which could vary
considerably even within the small geographic areas used in this study - should be more
closely related to the explanatory variable attributes she possesses than the binary variable
which is identical for all respondents in a given neighborhood, and hence should result in a
goodness of fit even better than the reasonably high adjusted $\rho^2$ of 0.52 obtained for the best
model of this study.
The second important extension of this study is to embed a neighborhood choice model into
a system of simultaneous equations (see, e.g., Kain, 1962; Waddell, 1993). A number of
interdependencies among neighborhood choice, travel behavior, attitudes/lifestyle, and
sociodemographic variables can be identified. The best way to account for these relationships
of (potentially) mutual causality is through structural equations modeling. The authors have
developed a conceptual model of these interrelationships; work is currently underway to
operationalize the structural equations model it implies, using the available data.
Other extensions are also possible and potentially useful. For example, sample segmentation
by variables such as presence of children, number of workers, and vehicle availability may
improve future models of residential choice by increasing the homogeneity of the resulting
segments. Finally, when collection of new data is possible, development of attitudinal and
lifestyle variables specifically oriented toward a residential choice context may be even more
fruitful than the measures used here, which were taken from a study of travel behavior.
Further, as mentioned in Section 4.1, the most rigorous approach to disentangling relationships
of mutual causation requires the collection of longitudinal data.
In summary, we believe the approach presented in this paper constitutes an important direction
for residential choice modeling. While attitudinal and lifestyle variables are not easy to
forecast and hence these models may not yet readily translate into impacts on regional travel
demand, they still offer useful policy insight. Examining the impact of such variables on
residential choice sheds light on the issue of whether observed differences in travel behavior
are induced by the land use configuration of the neighborhood itself, or are derived from
intrinsic propensities for different travel choices. Which one of those two claims is true has
important implications for new urbanism policies intended to reduce travel through land use
planning. In reality, both behavioral mechanisms are likely to be in effect to varying degrees,
and only the most sophisticated analysis techniques (structural equations modeling of
longitudinal data on residential location, lifestyle/attitudes, travel behavior, and demographics)
can properly distinguish the effects of each. Using more limited methods, however, suggestive
evidence is mounting that the second hypothesized mechanism is stronger: that is, that as an
explanation for travel behavior, neighborhood type tends to act as a proxy for the "true"
explanatory variables with which it is strongly associated, namely attitudinal and lifestyle
predispositions.
ACKNOWLEDGEMENTS
The data analyzed in this paper were collected under Contract Number A132-103 with the
California Air Resources Board. The authors wish to express their thanks to the many
important contributors to this study, including Ryuichi Kitamura, Laura Laidet, Carol
Buckinger, and Fred Gianelli of UC Davis; and Terry Parker, Fereidun Feizollahi, and Anne
Geraghty of the Air Resources Board. Discussions with Susan Handy and Kelly Clifton of
the University of Texas, Austin and Ilan Salomon of the Hebrew University of Jerusalem have
been helpful to the development of this work.
REFERENCES
Cervero, Robert (1996). Traditional neighborhoods and commuting in the San Francisco Bay
Area. Transportation 23, 373-394.
Frank, Lawrence and Pivo, Gary (1994). Impacts of mixed use and density on utilization of
three modes of travel: single-occupant vehicle, transit, and walking. Transportation
Research Record 1466, 44-52.
Horowitz, Joel L. (1995). Modeling choices of residential location and mode of travel to
work. In The Geography of Urban Transportation, second edition, edited by Susan
Hanson, 219-239. New York: Guilford Press.
Kain, J.F. (1962). A Contribution to the Urban Transportation Debate; An Econometric
Model of Urban Residential and Travel Behavior. Paper number P-2667 of the Rand
Corporation, Santa Monica, Calif.
Kitamura, Ryuichi, Laidet, Laura, Mokhtarian, Patricia L., Buckinger, Carol, and Gianelli,
Fred (1994). Land Use and Travel Behavior. Research Report UCD-ITS-RR-94-27,
Institute of Transportation Studies, University of California, Davis.
Kitamura, Ryuichi, Mokhtarian, Patricia L., and Laidet, Laura (1997). A micro-analysis of
land use and travel in five neighborhoods in the San Francisco Bay Area.
Transportation 24(2), 125-158.
Lerman, Steven R. (1975). A Disaggregate Behavioral Model of Urban Mobility Decisions.
PhD dissertation, Department of Civil Engineering, MIT, Cambridge, Massachusetts.
Louviere, Jordan and Timmermans, Harry (1990). Hierarchical information integration
applied to residential choice behavior. Geographical Analysis 22(2), 127-144.
Madden, J. F. (1981). Why women work closer to home. Urban Studies 18, 181-194.
Nijkamp, P., Wissen, L. V., and Rima, A. (1993). A household life cycle model for
residential relocation behavior. Socio-Economic Planning Science 27(1), 35-53.
Norusis, Marija J. (1990). SPSS/PC+ Advanced Statistics 4.0. Chicago: SPSS Inc.
Prevedouros, Panos (1992). Associations of personality characteristics with transport behavior
and residence location decisions. Transportation Research 26A, 381-391.
Rummel, R. J. (1970). Applied Factor Analysis. Evanston, IL: Northwestern University
Press.
Rutherford, G. Scott, McCormack, Edward, and Wilkinson, Martina (1996). Travel impacts
of urban form: implications from an analysis of two Seattle area travel diaries.
Presented at the Travel Model Improvement Program Conference on Urban Design,
Telecommuting, and Travel Behavior. Williamsburg, Virginia. October 27-30.
Timmermans, H., Borgers, A., and Oppewal, H. (1992). Residential choice behavior of dual
earner households: a decompositional joint choice model. Environment and Planning
24A, 517-533.
Waddell, Paul (1993). Exogenous workplace choice in residential location models: Is the
assumption valid? Geographical Analysis 25(1), 65-82.
Weisbrod, G., Lerman, S., and Ben-Akiva, M. (1980). Tradeoffs in residential location
decisions: transportation versus other factors. Transport Policy and Decision Making
1, 13-26.
PLANNING OF SUBWAY TRANSIT SYSTEMS
S. C. Wirasinghe, Department of Civil Engineering, The University of Calgary, Canada

U. Vandebona, University of New South Wales, Australia
ABSTRACT
The planning of underground transit systems (subways, metros) must explicitly account for
construction, operating and fleet costs as well as the level of service provided to passengers.
We conjecture that passengers' priority is to have stations located close to their origins and
destinations to minimize access costs, because the access component of a trip is made at low speed
compared to the high-speed travel on the subway itself. Furthermore, subway routes can be relatively
unrestricted because of advanced tunnelling technologies. Consequently, two sub-optimization
problems are described and solved: (i) location of stations to minimize the sum of station costs and
access costs, and (ii) choice of routes, based on given station locations, to minimize the sum of
passenger travel time, transfer costs and subway fleet, operating and construction costs. The planning
of operationally feasible networks and the resulting total system cost are
discussed.
INTRODUCTION
It is important to plan subway (metro) systems in a cost effective manner because of the
massive investment that is needed to build them. In this paper, we propose a method for
planning subway networks that considers both the operating cost and the capital cost of the
system as well as the level of service provided to passengers.
When surface urban rail networks are being planned, the routes are chosen to satisfy (in the
conventional jargon) "travel desire lines" along "identifiable corridors". The choice of the
routes is significantly affected by the existing urban infrastructure and the availability of rights-
of-way. The stations are subsequently located on the chosen routes. Some ideas regarding
station location on a chosen surface route are given in Hurdle and Wirasinghe (1980) and
Wirasinghe and Ghoneim (1981).
The planning of underground urban rail (subway or metro) networks is discussed here. This
paper attempts to establish the mathematical foundation applicable in station location and route
layout determination. The method proposed here relies on graph theory principles and
mathematical optimisation techniques to resolve the above issues.
A worldwide survey of over forty subways by Wirasinghe and Vandebona (1988) and a recent
study by Gendreau et al. (1995) have revealed that a clear documented methodology for
planning subway networks does not exist. In general, the methods used are similar to those for
surface networks, even though subway routes are restricted to a much lesser degree by the
existing urban infrastructure. For example, it is now technically feasible to tunnel at depths of
70 m for metro construction without disturbing the structures on the ground (Public Innovation
Abroad, 1989).
A comprehensive overview and review of the existing approaches to locating rapid transit lines
is given by Gendreau et al. (1995). However, they do not distinguish between surface and
underground systems. They observed that ... "the primary goal of rapid transit is to improve
mobility by providing shorter travel times", that ... "cost is an overriding consideration", that ...
"the overall network structure must be examined (though) much research remains to be done"
and that ... "some feasibility studies do not address the question of ridership." They point out
the shortcomings of the median-shortest-path approach of Current et al. (1988) to the planning
of transportation networks, in particular that it does not account for the "demand for trips", and
that such a criterion cannot be handled by standard network design models. They also indicate
that a "tabu search" approach to be described in a forthcoming article may be more promising.
The basic theory behind our approach to subway network planning is that passengers desire
stations that are close to the origin and destination of their journey because of the relatively
slow speed of access modes compared to the speed of subway trains. Once they are in the
subway system, they want low waiting times and as few transfers as possible (typically no
more than two). Passengers are not too concerned regarding the actual routing of their trip
within the subway network as long as their travel times are low. Typically they are unaware of
their geographical location while en-route. A key insight is that passengers first look for
"routes" close to their origin and destination when they access surface transit systems and
subsequently choose a "station"; however, typically, they first locate the origin and destination
"stations" when travelling by subway systems and subsequently choose the "routes", mainly for
the purpose of locating transfer points.
What we propose is to reverse the conventional planning approach by locating the subway
stations first and then connecting them by an appropriate network. This is a feasible approach
to the planning of underground networks because routes are for the most part unrestricted by
other infrastructure. Consequently, explicit tradeoffs are made possible between (i) the costs of
gaining access to and from stations and the cost of stations, in relation to the station location
problem and (ii) the cost of travel time (waiting, riding and transfer time) within the system
and the cost of constructing and operating the network in relation to the route planning
problem. This approach is particularly appropriate when a large network with many stations
covering a metropolitan area is being planned as opposed to a single "corridor". In the first
part of this paper, we will examine the problem of locating the stations and choosing a basic
network that connects those stations. In the second part we shall discuss the planning of the
route network for operational feasibility and minimal cost.
It is important to mention that there is a general lack of literature about mathematical
principles covering subway station and network planning aspects. Previous research efforts
have focused on identification of the effects of different planning options. For example,
Musso and Vuchic (1988) have examined five groups of metro network characteristics:
network size and form (e.g. number of closed loop routes), network topology (e.g. fraction of
network with overlapping lines), relationship between network and city (e.g. percent of urban
area within walking distance of a station), service measures (e.g. space-km offered per day)
and utilization indicators (e.g. passengers per year per network length) and proposed that they
be used to evaluate and compare existing and proposed networks. The network has to be fully
defined prior to the above evaluation. Our normative approach is complementary to that of
Musso and Vuchic (1988). The few nearly minimum total cost networks that result from our
methodology can be further evaluated and compared using their approach.
LOCATION OF STATIONS
Wirasinghe and Vandebona (1987) have acknowledged that station location and route
alignment are inter-related planning steps. Figure 1a displays the interactions among demand,
station location and route alignment. For computational simplicity, though, what is proposed is
a step-by-step process where station location is determined first and the network configuration
is determined next, as shown in Figure 1b. This model is developed as a supply model and
therefore does not attempt to forecast the effect on transit demand. However, that may be
achieved by integration of a suitable transport and land use interaction model.
Figure 1. Sequence of Planning Steps: (a) idealized planning framework (demand pattern, station location and route alignment interacting); (b) computational model (station location followed by route alignment)

The number of stations and their locations are determined by balancing the station construction
cost, the station operating cost and the cost related to the increase of travel time caused by
trains stopping at stations (which increase with the number of stations), and the cost of gaining
access to and from the stations (which decreases with the number of stations). The basic
"location theory" for placing the stations is based on a theory proposed by Newell (1973) and
further refined by Erlenkotter (1989) that determines the optimal shed-area of a facility.
Consider a shed-area $A(x,y)$ served by a station located at the centroid $(x,y)$ of the area, so as to
minimize the access costs of subway riders. Let the subway trip-ends within the shed-area that
use access-mode $i$ be $p_i(x,y)$ per unit area per day. Therefore, the total subway trip-ends per
unit area per day in the shed-area for all access modes is given by:
$$p(x,y) = \sum_i p_i(x,y) \tag{1}$$
Thus the total number of trip ends in the shed per day, i.e. the passenger demand, is given by:
$$P(x,y) = p(x,y)\,A(x,y) \tag{2}$$
Station Construction and Operating Costs
Suppose the sum of discounted construction and operating cost per day for a station at $(x,y)$ is
given by the linear function:
$$z_s(x,y) = \gamma_s(x,y) + c(x,y)\,a_s(x,y) \tag{3}$$
where
$\gamma_s(x,y)$ = fixed cost of a station per day at $(x,y)$,
$a_s(x,y)$ = total floor area of the station at $(x,y)$,
and
$c(x,y)$ = marginal cost per unit area of a station at $(x,y)$.
The results from a survey of subways by the authors are utilized to estimate the station size
represented by $a_s(x,y)$ as a function of the total daily trip-ends at a station. A functional form
that fits well the North American and Asian data for the total floor area of all levels of a station
served by a single route is
$$a_s(x,y) = a_m + a'\,e^{-t/\theta}\,P^{\alpha}(x,y,t) \tag{4}$$
where
$a_m$ = minimum area of a station, in m²,
$t$ = age of the station in years at the time of data collection,
$P(x,y,t)$ = total daily trip-ends measured at the station at age $t$,
and
$a'$, $\alpha$ and $\theta$ are constant coefficients.
Ideally one would prefer to obtain a relationship between station area and the design-year
demand for travel in terms of trip ends per day. However, the available data represent the
current demand for travel. The explanatory variable product $P^{\alpha}(x,y,t)\,e^{-t/\theta}$ is a compromise that
approximately represents the demand at the time of opening of the station.
Multiple linear regression analysis using the logarithmic form of Equation 4:
$$\ln[a_s(x,y) - a_m] = \alpha \ln P(x,y,t) - (1/\theta)\,t + \ln a' \tag{5}$$
has revealed that the hypotheses $\alpha = 0.5$, $a' = 50$ and $\theta = 10$ could not be rejected at the 5%
level of significance. The value of $a_m$ that gave the highest $r^2$ value of 0.69 was found by
systematic elimination to be 1300 m².
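To make the fitted relationship concrete, the following minimal sketch evaluates Equation (4) with the coefficient values quoted above (minimum area 1300 m², a' = 50, exponent 0.5, age constant 10 years). The function name and the example inputs are hypothetical and serve only as an illustration.

```python
import math

# Coefficient values quoted in the text for Equation (4).
A_MIN = 1300.0   # minimum station area a_m, in square metres
A_PRIME = 50.0   # coefficient a'
ALPHA = 0.5      # exponent on daily trip-ends
THETA = 10.0     # age constant, in years


def station_floor_area(daily_trip_ends, age_years):
    """Total station floor area (m^2) from Equation (4).

    daily_trip_ends: trip-ends per day measured at the station at the given age.
    age_years: age of the station in years at the time of measurement.
    """
    return A_MIN + A_PRIME * math.exp(-age_years / THETA) * daily_trip_ends ** ALPHA


# Hypothetical example: 10,000 daily trip-ends at a new station and at a 10-year-old one.
print(station_floor_area(10_000, 0))
print(round(station_floor_area(10_000, 10), 1))
```

The square-root exponent means that, in this fitted form, quadrupling the trip-ends only doubles the variable part of the floor area.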
Combining Equations (3) and (4) gives
$$z_s(x,y) = \gamma_s(x,y) + c(x,y)\,a_m + c(x,y)\,a'\,e^{-t/\theta}\,P^{\alpha}(x,y,t) \tag{6}$$
where $P(x,y,t)$ can be interpreted as the demand at the station in trip ends per day in the design
year, $t$ years into the future.
Train Stopping Costs
The train stopping time consists of the time lost during deceleration, acceleration, door opening
and closing plus passenger boarding and alighting. Let the operating cost per train for the time
lost during deceleration, acceleration, and door opening and closing be $\gamma_T$. If there are $N$ trains per
day, the related daily cost is
$$N\gamma_T \tag{7}$$
However, if the daily train operating cost during passenger boarding and alighting time at a
station is taken to be proportional to the daily demand p(x,y)A(x,y), the first derivative with
respect to A(x,y) of that cost taken per unit-area of the shed will be zero. Since we plan to do
just that, i.e. to find the A(x,y) that will minimize the total cost per unit shed-area, we disregard
the cost related to boarding and alighting time from further consideration in this section.
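The reasoning above can be written out in one line. In the sketch below, $\beta$ is a hypothetical proportionality constant introduced only for illustration; it does not appear in the original formulation.

```latex
% Boarding/alighting cost assumed proportional to daily demand p(x,y) A(x,y),
% with a hypothetical constant \beta. Per unit shed-area it no longer depends on A:
\[
  \frac{\beta\,p(x,y)\,A(x,y)}{A(x,y)} = \beta\,p(x,y),
  \qquad
  \frac{\partial}{\partial A(x,y)}\bigl[\beta\,p(x,y)\bigr] = 0 .
\]
```

Because this term is constant with respect to $A(x,y)$, it shifts the cost per unit area but not the shed-area at which that cost is minimized.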
Access Distances and Costs
Passenger behaviour in selection of the origin and destination stations, as well as the choice of
access paths and modes has an impact on the computation of the access cost component. It is
customary to assume that passengers access the nearest station and use the shortest access path.
This model is based on those assumptions.
If we neglect the effect of differential fares and assume that passengers travel to/from the
nearest stations, the average distance traveled within a station-shed of area $A(x,y)$ is shown by
Newell (1973) to be $k\,A^{1/2}(x,y)$, where $k$ is a network configuration factor related to the shed
shape and the street layout pattern within the shed. The $k$ factors for both simple and complex
networks are reported by Erlenkotter (1989).
Let the cost per unit distance traveled by access mode $i$ per passenger (consisting of passenger
time cost and vehicle operating cost) be $\gamma_i$. Using a street network parameter $k(x,y)$, the access
cost for passengers in the shed of a station located centroidally at $(x,y)$ is
$$C_A = \sum_i \gamma_i\,p_i(x,y)\,A(x,y)\,[k(x,y)\,A^{1/2}(x,y)] \tag{8}$$
where $k(x,y)\,A^{1/2}(x,y)$ is the average access distance.
Total Station Related Costs
The total of the station related costs given by Equations (6), (7) and (8) is
$$C(x,y) = k(x,y)\,A^{3/2}(x,y)\,\gamma(x,y)\,p(x,y) + \gamma_s^*(x,y) + \phi(x,y)\,[p(x,y)\,A(x,y)]^{\alpha} \tag{9}$$
where the fixed cost at $(x,y)$ is
$$\gamma_s^*(x,y) = \gamma_s(x,y) + c(x,y)\,a_m + N\gamma_T, \tag{9a}$$
the weighted average access cost per unit distance at $(x,y)$ is
$$\gamma(x,y) = \Big[\sum_i p_i(x,y)\,\gamma_i\Big] \Big/ p(x,y), \tag{9b}$$
and
$$\phi(x,y) = c(x,y)\,a'\,e^{-t/\theta}. \tag{9c}$$
From Equation (9), we can determine the cost per unit area. If $p(x,y)$ is assumed to vary
slowly with $(x,y)$, the total cost, which is the integral of the cost per unit area over the shed, is
minimized when the integrand is minimized at each $(x,y)$. Dividing Equation (9) by $A(x,y)$ and
using the first order condition for a minimum with $\alpha = 1/2$, we obtain:
$$\tfrac{1}{2}\,k(x,y)\,\gamma(x,y)\,p(x,y) / \rho(x,y) = \tfrac{1}{2}\,\phi(x,y)\,p^{1/2}(x,y) + \gamma_s^*(x,y)\,\rho^{1/2}(x,y) \tag{10}$$
where
$$\rho(x,y) = A^{-1}(x,y) \tag{11}$$
is the station density per unit area in the shed area $A(x,y)$.
The optimum station density $\rho^*(x,y)$ can be obtained in closed form if necessary by
transforming Equation (10) to a cubic equation in $\rho^{1/2}(x,y)$. Further, $\gamma_s^*(x,y)$, which contains most of
the station construction cost in Equation (9), is generally substantially larger than the third term,
the variable portion of the station cost. Neglecting that term allows a simplified solution to
Equation (10):
$$\rho^*(x,y) = \Big[\tfrac{1}{2}\,k(x,y)\,\gamma(x,y)\,p(x,y) \big/ \gamma_s^*(x,y)\Big]^{2/3} \tag{12}$$
The above solution has a form similar to that proposed by Newell (1973) and Erlenkotter
(1989) for the general warehouse location problem.
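A minimal numerical sketch of Equation (12) follows. It assumes the parameter values listed in Table 2 (k = 0.4, access cost 1.50 $/km, marginal station cost 0.05 $/m²/day, 160 trains per day, stopping cost 2.50 $ per stop, fixed station cost 3000 $/day) and the minimum station area of 1300 m² from the regression, and it assembles the fixed cost as in Equation (9a). Whether the tabulated Calgary results were produced with exactly these inputs is an assumption of this illustration, and the function and variable names are hypothetical.

```python
def optimal_station_density(k, gamma, p, gamma_s_star):
    """Simplified optimum station density per unit area, Equation (12)."""
    return (0.5 * k * gamma * p / gamma_s_star) ** (2.0 / 3.0)


# Parameter values taken from Table 2 and the regression result (assumed inputs).
K = 0.4            # network parameter
GAMMA = 1.50       # weighted average access cost, $/km
C = 0.05           # marginal station cost, $/m^2/day
A_MIN = 1300.0     # minimum station area, m^2
N_TRAINS = 160     # trains per day
GAMMA_T = 2.50     # stopping delay cost, $ per stop
GAMMA_S = 3000.0   # fixed station cost, $/day

# Fixed cost per station per day, Equation (9a).
gamma_s_star = GAMMA_S + C * A_MIN + N_TRAINS * GAMMA_T

# Zone 1 of Table 1: 4377 trip ends over 27.5 sq. km.
trip_ends, area_km2 = 4377, 27.5
p_zone1 = trip_ends / area_km2                      # trip ends per sq. km
rho_zone1 = optimal_station_density(K, GAMMA, p_zone1, gamma_s_star)
stations_zone1 = rho_zone1 * area_km2               # stations required in the zone

print(round(rho_zone1, 3), round(stations_zone1, 2))
```

Under these assumptions the density for zone 1 comes out close to the 0.058 stations per sq. km listed in Table 1; small differences in the station count are to be expected from rounding and from the neglected variable-cost term.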
SUBWAY STATION LOCATIONS FOR THE CITY OF CALGARY

Table 1 shows the passenger demand values and the area of each transportation zone for the
City of Calgary. The transit passenger demand is obtained from forecast values for 38
transportation zones in Calgary. The optimum station density is computed from Equation (12).
The last column in Table 1 shows the number of stations for each transportation zone obtained
by multiplying the optimum station density of the zone by the area of the particular zone.
Table 2 shows a sensitivity analysis of the relevant parameters for the model application to the
City of Calgary. The parameter values selected are also shown in the Table 2.
Table 1. Optimum Number of Stations per Zone
Transportation Zone #   Trip ends   Area (sq. km)   Station density (per sq. km)   Number of stations
1 4377 27.5 0.058 1.61
2 1506 17.2 0.039 0.68
3 3314 21.7 0.057 1.23
4 2780 5.3 0.128 0.68
5 1239 10.4 0.048 0.50
6 1349 4.7 0.086 0.40
7 1757 5.0 0.099 0.49
8 2639 45.5 0.030 1.37
9 1005 6.7 0.056 0.38
10 1452 8.9 0.059 0.53
11 1687 3.2 0.128 0.41
12 1877 5.7 0.094 0.54
13 320 35.9 0.009 0.31
14 1862 10.1 0.064 0.65
15 5264 40.5 0.051 2.07
16 1517 7.8 0.067 0.52
17 3114 9.5 0.094 0.89
18 3200 19.6 0.059 1.16
19 107 42.1 0.004 0.16
20 1398 52.7 0.018 0.95
21 3203 1.6 0.303 0.50
22 32187 1.5 1.459 2.19
23 471 2.2 0.071 0.16
24 3455 1.8 0.297 0.54
25 2289 2.3 0.197 0.45
26 326 2.7 0.049 0.13
27 794 3.2 0.077 0.25
28 2241 16.9 0.052 0.88
29 1112 3.3 0.095 0.32
30 926 3.5 0.081 0.29
31 3497 13.3 0.081 1.08
32 3230 12.9 0.079 1.01
33 2933 21.0 0.053 1.12
34 2111 12.3 0.061 0.75
35 2118 23.8 0.040 0.95
36 1914 7.4 0.080 0.59
37 3105 18.9 0.060 1.13
38 1166 18.8 0.031 0.59
Table 2. Sensitivity of the Number of Stations to Input Parameters
Parameter  Brief definition          Value  Units                            Change in number of stations
                                                                             at +10% change  at +100% change
k          Network parameter         0.4    -                                +6.5%*          +58.0%*
γ          Access cost               1.50   $/km                             +6.5%*          +59.0%*
C          Marginal station cost     0.05   $/m²/day                         -0.7%           -6.5%
N          Number of trains per day  160    per day                          -1.5%           -14.0%
γ_T        Stopping delay cost       2.50   $ per stopping/train             -1.5%           -14.0%
γ_s        Fixed station cost/day    3000   $/day                            -4.0%           -28.0%
p(x,y)     Transit demand                   trip ends per unit area per day  +6.5%*          +58.0%*
* proportional to (parameter)^(2/3)
The next step is to find the actual station locations. For this purpose, unit areas within the urban
area are regrouped into new zones such that the total station requirement in each new zone becomes one.
The station requirement in each new zone is computed from the station densities of the unit
areas included in the particular zone. Stations are placed at the centroid of each of the new
zones.
Figure 2 shows transportation zones and the optimum number of stations from Table 1 suitable
for each transportation zone in the City of Calgary. The proposed subway station locations
shown are determined based on the rezoning structure which allows the location of exactly one
station at the center of the particular zone. Figure 2 also shows the existing surface LRT
network in Calgary and the proposed extensions. It can be noticed that the LRT stations are
constrained by the available surface right of way and are generally situated at the boundaries of
transportation zones whereas the estimated subway station locations are close to the centers of
the catchment areas.
OPTIMIZATION OF ROUTE RELATED COSTS

The construction and maintenance cost of subway tunnels is proportional to the total length of
the subway routes if the geology and tunnel depths are uniform and all sections are treated as
single tunnel sections. Analysis sections presented later describe the methodology to
incorporate overlapping routes that have multiple tunnels. Further, it can be shown trivially
that the train operating cost is also proportional to the total route length, if all the subway
routes in the network operate with equal headways, equal speeds and equal train sizes and the
operating cost is based on distance travelled. The same is true of the fleet cost if turn-around
times are neglected. Therefore, the total construction, operating and fleet cost can be
considered to be a linear function of the network length, as a first approximation.
Consequently, the minimization of the costs is achieved by minimizing the total network
length. This minimization problem amounts to finding the minimum spanning tree (MST) for
a given geographic distribution of nodes. For a limited number of nodes, the solution to the
MST problem can be readily obtained by visual observation. We have made use of an
algorithm for the MST suggested by Kruskal (1956), to develop a computer program which
would accept digitised station location input and provide the optimum route network. The
algorithm, which has a unique solution for any distribution of station locations, is based on the
observation that the shortest possible link emanating from a station should be on the minimum
spanning tree. Among the links that are not yet part of the MST, the link with the smallest
length that does not form a cyclic route with those already in the network is chosen for
inclusion.
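The link selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes straight-line (Euclidean) link lengths between digitised station coordinates, and the function name and the four sample stations are hypothetical.

```python
import math
from itertools import combinations


def kruskal_mst(stations):
    """stations: dict name -> (x, y). Returns the tree as a list of (length, a, b)."""
    parent = {s: s for s in stations}

    def find(s):
        # Follow parent pointers to the representative of the component of s.
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    # All candidate links, shortest first.
    links = sorted(
        (math.dist(stations[a], stations[b]), a, b)
        for a, b in combinations(stations, 2)
    )
    tree = []
    for length, a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link does not close a cycle
            parent[root_a] = root_b
            tree.append((length, a, b))
    return tree


# Hypothetical station coordinates in km.
example = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (5.0, 1.0)}
tree = kruskal_mst(example)
print(tree)
print("total length:", sum(length for length, _, _ in tree))
```

For n stations the resulting tree always has n - 1 links; here the links A-B, B-C and C-D are retained.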
Figure 2. Application of the Station Location Model to City of Calgary
(legend: Existing LRT Network; Proposed LRT Extensions; Locations for Metro Stations)

We have used the MST algorithm to examine several existing subway networks. The existing
route networks and the optimum route network suggested by the minimum link cost networks
have only marginal differences due mostly to stations being located after the choice of routes in
the existing network.
The existing route layout and the minimum total link cost network for Sao-Paulo are shown in
Figure 3 as an example. The two networks have many similarities. However, the
differences between them provide an insight into the "operational convenience" objectives of
the transit systems.
For example, compare the number of stations in the Sao-Paulo system that have an odd number
of links emanating from them. In the minimum total link cost network there are 20 such
stations. In the existing network there are only 12 such stations.
When there is an odd number of links emanating from a station, one route should terminate at
the particular station. Hence the minimum number of routes in the system is half of the
number of stations having an odd number of links. Each additional route in the system
increases the operational difficulties (from the operator's point of view) as well as the
(unproductive) turn around activities. Therefore it is desirable for the operator to minimize the
number of stations which have an odd number of links, to minimize the number of routes in the
network.
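The counting argument above can be illustrated with a short sketch. The function name and the sample network are hypothetical; the rule applied is the one stated in the text, namely that the minimum number of routes is half the number of stations with an odd number of links.

```python
from collections import Counter


def minimum_routes(links):
    """links: iterable of (station_a, station_b). Returns (route count, odd stations)."""
    degree = Counter()
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    odd = sorted(s for s, d in degree.items() if d % 2 == 1)
    # The number of odd-degree stations in any network is even,
    # so the integer division below is exact.
    return len(odd) // 2, odd


# Hypothetical network: a hub X with three branches.
routes, odd_stations = minimum_routes([("X", "P"), ("X", "Q"), ("X", "R")])
print(routes, odd_stations)
```

By the same rule, the 20 odd stations of the minimum total link cost network for Sao-Paulo imply at least 10 routes, whereas the 12 odd stations of the existing network imply at least 6.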
Figure 3. Sao-Paulo Subway Network
(legend: Minimum Total Link Cost Network; Existing Network; Existing Stations)

In a number of existing subways, there are some segments with multiple route corridors.
Particularly in low headway operations, multiple routes in a single corridor require dedicated
tracks to increase safety, and minimize possible delays and operator workload necessary to
manage merge and diverge operations. The total number of routes can be reduced by
introducing multiple route corridors at the expense of increased construction cost of tunnels.
Figure 4a shows a hypothetical network with three routes. By multiple tunnelling between
stations A and B, it is possible to reduce the total number of routes to two (Figure 4b). It
should also be noted that passengers travelling between Stations 1 and 2 have two transfers
removed when the network is changed from that of Figure 4a to that of Figure 4b.
The multiple route corridor converts a pair of stations with an odd number of links to stations
with an even number of links (for example, Stations A and B in Figure 4). The existence of
multiple route corridors at the expense of multiple tunnelling also indicates the desirability of
minimizing the number of stations with an odd number of links emanating from a station.
Circular routes also remove a pair of stations which have an odd number of links at the expense
of constructing one of the longest links in the circular route. Furthermore, the circular route
eliminates the turn around requirement for vehicles and thereby increases the operational
convenience.
The above observations show that the minimum total link cost network may have to be
modified to suit operational considerations.
Figure 4. Examples of a multiple route corridor: (a) Three routes configuration; (b) Two routes
configuration (legend: Route 1; Route 2; Route 3)

OPTIMIZATION WITH REFERENCE TO PASSENGER TRAVEL DISTANCE

The minimum total link cost network obtained in the previous section does not make any
reference to the passenger demand distribution. However, it is evident that, under certain
demand distributions, the minimum total link cost solution may force a large number of
passengers to travel excessive distances compared to their direct path. An optimum solution
which accounts for the sum of the total link cost (which in turn accounts for cost components
described before) and the cost of travel time of passengers is pursued by the algorithm
described below. At this stage, it is assumed that the cost of transfer time of passengers is
negligible. However, a method for including transfer time of passengers is described in a later
section.
The algorithm begins with a network which has all stations linked to each other. Then, links
are systematically removed until the minimum combined cost network is obtained. The heart
of the algorithm is the method of computing the combined cost. The steps involved in
computing the combined cost are as follows:
1. Set up a trial network.
2. Compute the minimum travel time between all pairs of stations by using the Revised
Cascade Algorithm (Floyd, 1962) to find the minimum path and the corresponding distance
between all pairs of stations.
3. Compute the cost of total passenger travel time with reference to the origin-destination
demand distribution for the given set of stations, which can be generated using standard
transportation planning techniques. It is assumed that none of the links is over-congested
and that passengers select the fastest travel path between the particular pair of origin station
and destination station.
4. Compute total link cost.
5. Compute the combined cost given by the sum of the values obtained in steps 3 and 4 above.
By an iterative procedure, now seek the link, the removal of which would maximize the
savings of the combined cost. The removal of that particular link provides the next trial
network. This procedure is repeated until no further savings can be achieved by removal of any
link in the network.
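The link removal procedure can be sketched as below. This is an illustration under simplifying assumptions, not the authors' implementation: the all-pairs shortest paths are computed with the standard Floyd-Warshall triple loop rather than the Revised Cascade Algorithm, travel cost is taken as proportional to distance, and a trial network that disconnects any origin-destination pair is rejected. All names and the three-station example are hypothetical.

```python
import math


def all_pairs_shortest(nodes, links):
    """Floyd-Warshall on an undirected network. links: dict {(a, b): length}."""
    dist = {a: {b: (0.0 if a == b else math.inf) for b in nodes} for a in nodes}
    for (a, b), length in links.items():
        dist[a][b] = dist[b][a] = min(dist[a][b], length)
    for m in nodes:
        for a in nodes:
            for b in nodes:
                if dist[a][m] + dist[m][b] < dist[a][b]:
                    dist[a][b] = dist[a][m] + dist[m][b]
    return dist


def combined_cost(nodes, links, demand, link_cost, travel_cost):
    """Steps 2 to 5: total link cost plus cost of passenger travel."""
    dist = all_pairs_shortest(nodes, links)
    travel = 0.0
    for (o, d), trips in demand.items():
        if math.isinf(dist[o][d]):
            return math.inf  # trial network disconnects an O-D pair
        travel += trips * dist[o][d]
    return link_cost * sum(links.values()) + travel_cost * travel


def prune_links(nodes, links, demand, link_cost, travel_cost):
    """Repeatedly remove the link whose removal saves the most combined cost."""
    links = dict(links)
    best = combined_cost(nodes, links, demand, link_cost, travel_cost)
    while True:
        candidate = None
        for key in links:
            trial = {k: v for k, v in links.items() if k != key}
            cost = combined_cost(nodes, trial, demand, link_cost, travel_cost)
            if cost < best:
                best, candidate = cost, key
        if candidate is None:
            return links, best
        del links[candidate]


nodes = ["A", "B", "C"]
full = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.5}
low = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10}
high = {("A", "B"): 400, ("B", "C"): 400, ("A", "C"): 400}
low_net, low_cost = prune_links(nodes, full, low, link_cost=100.0, travel_cost=1.0)
high_net, high_cost = prune_links(nodes, full, high, link_cost=100.0, travel_cost=1.0)
print(sorted(low_net), low_cost)
print(sorted(high_net), high_cost)
```

With the low demand the longest link is dropped and a tree remains; with the high demand every link is retained, so the network keeps a closed loop.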
Figure 5a shows the minimum total link cost network generated for the station locations
proposed under a particular demand forecast for the City of Calgary. Compare the network in
Figure 5a with the networks shown in Figures 5b and 5c, which are the results of minimizing
the combined cost of the total link cost and the cost of the passenger travel time. The network
in Figure 5b is based on a value of travel time of $5 per hour per passenger, whereas the
network in Figure 5c is based on a value of travel time of $50. Both networks are based on
a construction cost of tunnels set at 20 M$ per km. Figures 5b and 5c show that, as
expected, the total route length increases when the monetary value of the passenger travel time
is increased. Although circular routes are disallowed in the minimum total link cost network,
the minimization of the combined cost may yield a network with circular routes (Figure 5c).
Figure 5. Minimum Total Cost Network for Optimum Station Locations: (a) Minimum total link
cost; (b) Value of time = $5; (c) Value of time = $50
ESTIMATION OF THE TOTAL COST

Networks generated using the above two methods (i.e. the minimum total link cost network and the
minimum combined cost network) provide a reasonable starting network for planning
purposes. However, any given network can be organized to operate on a number of different
routing arrangements, and each system will result in a different total cost when the monetary
value of passenger transfer times is also included. Although transfers are a healthy feature of
the subway operation, the route system should be designed to minimize the transfers for a
given network configuration.
The total cost will vary according to the number of transferring passengers using the network.
The number of transferring passengers is dependent on the route arrangements adopted by the
operator. Therefore, computation of total cost allows the comparison of different routing
arrangements as well.
For comparison purposes, an annualized cost, Ca, can be computed using the following
formulation.
$$C_a = \sum_{i\,=\,\text{all links}} d_i\left[\gamma_c + \gamma_o + D_i\,\gamma_i\right] + \sum_{s\,=\,\text{all stations}} \left[\gamma_t\,D^*_s\,T_s\right] \qquad (13)$$
where
$d_i$ = length of link i
$\gamma_c$ = annualized construction cost per unit length of track
$\gamma_o$ = annualized operating cost, including fleet cost, per unit length of track
$\gamma_i$ = value of riding time per passenger per unit distance
$\gamma_t$ = value of transferring time per passenger
$D_i$ = passenger demand in link i per day
$D^*_s$ = number of transferring passengers per day at station s
$T_s$ = average transferring time at station s
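Equation (13) is a sum over links plus a sum over stations, which can be evaluated as in the sketch below. The function and argument names are hypothetical and the numbers are invented for illustration; the units of the inputs are assumed to have been made consistent beforehand.

```python
def annualized_cost(links, transfers, gamma_c, gamma_o, gamma_ride, gamma_t):
    """Evaluate Equation (13).

    links:     list of (link length, passenger demand on the link)
    transfers: list of (transferring passengers at a station, mean transfer time)
    """
    link_term = sum(
        length * (gamma_c + gamma_o + demand * gamma_ride)
        for length, demand in links
    )
    transfer_term = sum(
        gamma_t * passengers * transfer_time
        for passengers, transfer_time in transfers
    )
    return link_term + transfer_term


# Hypothetical inputs in arbitrary but consistent units.
example_links = [(2.0, 1000.0), (3.0, 500.0)]
example_transfers = [(200.0, 0.1)]
total = annualized_cost(
    example_links, example_transfers,
    gamma_c=10.0, gamma_o=2.0, gamma_ride=0.01, gamma_t=5.0,
)
print(total)
```

Here the link term is 2(12 + 10) + 3(12 + 5) = 95 and the transfer term is 5 x 200 x 0.1 = 100.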
In the computer program METRO developed to estimate the total cost of the system, the cost
of construction of links is computed accounting for multiple tunneling sections in the network
if there are any multiple route corridors. The fleet cost required for the operation of the
network is estimated using the planned headway and assuming equal size vehicles on all
routes. The crew cost is also incorporated into the fleet cost. An allowance is provided for the
turn around time requirements at termini of non circular routes. The cost of travel time and the
cost of transfer time experienced by passengers are also computed. The minimum travel time
path, the total travel time and the total transfer time between all station pairs are computed
using the Revised Cascade Algorithm mentioned before.
Each station served by a particular route is denoted by an individual node in the program.
Initially, the nodes are connected such that their connectivity corresponds to the specified route
layout. The distance between the nodes corresponds to the cost of travel time between stations
represented by the particular nodes. Then, the program accounts for the cost of transfer time at
stations by connecting the nodes representing a particular station which serves more than a
single route. The distance between the nodes in the transfer link is equal to the cost of mean
transfer time at the particular station. Figure 6 schematically shows the manner in which the
nodes are configured to handle transferring passengers. Consider a network consisting of two
routes. One route connects stations A,B,C,D,E and F. The other route connects stations
G,H,C,D,J and K. The stations C and D are transfer stations common to both routes. The
program assigns two nodes per station for stations C and D as shown in the figure. The distance
between C1 and C2 as well as the distance between D1 and D2 are made equal to the cost of
mean transfer time between routes at the stations C and D.
Once the cost of travel time between node pairs is computed, the total cost due to the particular
pair of stations is computed by multiplying the cost of travel time by the passenger travel
demand between the station pair. When one of the stations in an origin-destination pair is a
transfer station, there are a number of nodes representing the particular transfer station. The
cost of travel time between such a pair of stations is the minimum cost of travel time between
the combinations of pairs that can be made up with the nodes representing the stations. For
example, the cost of travel time between A and D is the minimum of the cost of travel time
between the node pair (A,D1) and (A,D2). In the above example, the node pair (A,D1) would
yield the cost of travel time between the stations A and D.
Figure 6. Schematic Representation of the Node Arrangement for Computation of the
Cost of Passenger Travel Time
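The node arrangement can be sketched in code as follows. This is an illustration only: each station receives one node per route serving it, transfer links join the nodes of the same station, and shortest paths are found with Dijkstra's algorithm rather than the Revised Cascade Algorithm used in METRO. The unit ride cost and the transfer cost of 3 are hypothetical values.

```python
import heapq
from collections import defaultdict


def build_graph(routes, ride_cost, transfer_cost):
    """One node (station, route index) per station served by each route."""
    graph = defaultdict(dict)
    serving = defaultdict(list)  # station -> nodes representing it
    for r, stations in enumerate(routes):
        for s in stations:
            serving[s].append((s, r))
        for a, b in zip(stations, stations[1:]):
            cost = ride_cost(a, b)
            graph[(a, r)][(b, r)] = cost
            graph[(b, r)][(a, r)] = cost
    for s, nodes in serving.items():
        for u in nodes:
            for v in nodes:
                if u != v:
                    graph[u][v] = transfer_cost(s)  # transfer link
    return graph, serving


def cheapest_from(graph, sources):
    """Dijkstra's algorithm started simultaneously from all source nodes."""
    dist = {n: 0.0 for n in sources}
    heap = [(0.0, n) for n in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist


def station_cost(graph, serving, origin, destination):
    """Minimum cost over all node pairs representing the two stations."""
    dist = cheapest_from(graph, serving[origin])
    return min(dist.get(n, float("inf")) for n in serving[destination])


routes = [list("ABCDEF"), list("GHCDJK")]
graph, serving = build_graph(
    routes, ride_cost=lambda a, b: 1.0, transfer_cost=lambda s: 3.0
)
print(station_cost(graph, serving, "A", "D"))
print(station_cost(graph, serving, "A", "K"))
```

In this example the trip from A to D needs no transfer and costs three ride links, while the trip from A to K costs five ride links plus one transfer.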
The following cost components are presented in the annual cost form by taking into account
the expected life of the relevant component and a discount rate provided by the planner:
1. Total cost of construction and maintenance of the links.
2. Total fleet cost.
3. Total cost of passenger in-vehicle time.
4. Total cost of passenger transfer time.
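The text does not state which annualization formula is used. One common choice, shown here purely as an assumption, is the capital recovery factor, which converts a capital cost into an equal annual payment over the expected life at a given discount rate. The function names and the 5 percent rate are hypothetical.

```python
def capital_recovery_factor(rate, life_years):
    """Equal annual payment per unit of capital over `life_years` at `rate`."""
    if rate == 0:
        return 1.0 / life_years
    growth = (1.0 + rate) ** life_years
    return rate * growth / (growth - 1.0)


def annualized(capital_cost, rate, life_years):
    """Annual cost equivalent of a one-off capital cost."""
    return capital_cost * capital_recovery_factor(rate, life_years)


# A $5 M asset with a 25 year life at an assumed 5% discount rate, in $M per year.
annual_cost = annualized(5.0, rate=0.05, life_years=25)
print(f"{annual_cost:.3f}")
```

At a zero discount rate the factor reduces to straight-line division of the capital cost by the expected life.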
In the computer program METRO mentioned before, the planner can interactively select
alternative routing patterns to compute the total cost of the network. The description of a new
route network system can be input to the program in a few minutes, because the nodes are
displayed on the computer screen and the planner draws the required routes on the computer
display by using a few simple commands. For a network consisting of 30 stations, the program
generally requires less than five minutes to compute the above cost components.
The significance of the ability to include the analysis of feasible networks can be demonstrated
using the example of the City of Calgary. It is evident that the cost of a subway is exorbitant
for a city of the size of Calgary. However, the following example is included to demonstrate a
simple application of the model.
Figure 5 showed the minimum total link cost network and networks which account for the
effect of passenger travel time. Three networks (Figure 7) with different routing patterns which
are deemed feasible are analysed. The cost parameters in $M per year are summarized in
Table 3 for comparison purposes. The value of passenger travel time is assumed to be $10 per
hour, whereas the value of passenger transfer time is assumed to be $30 per hour. It is also
assumed that trains operate at 5 minute headways. A train is considered to be worth $5 M, and
has a useful life of 25 years. Although the cost of the network and the cost of the vehicles are
similar in the networks in Figures 7a and 7b, the cost of passenger travel time is sensitive to
the differences in the network. The cost of passenger travel time is comparatively low in the
network in Figure 7c. However, that particular network costs more to construct and operate.
Figure 7. Examples of Operationally Feasible Subway Networks for the City of Calgary
(panels a, b and c)
Table 3. Comparison of Annual Cost of Alternative Networks
Network        Figure 7a  Figure 7b  Figure 7c
Link Cost      107        105        122
Fleet Cost     12         12         13
Travel Time    137        130        121
Transfer Time  25         29         27
Total          281        276        283
Note: Costs shown in million dollars per year.

SUMMARY
The route selection procedure for a subway network can be summarized by the following six
steps:
1. Collect data related to geography, demand and relevant economic parameters.
2. Determine optimum station locations.
3. Determine the minimum total link cost network.
4. Determine the network which would minimize the sum of the total link cost and the cost
related to passenger travel time.
5. Modify the above networks (if required) to account for operational convenience and set up
feasible networks and routing systems.
6. Select a network by comparing the total cost of construction and maintenance of tunnels,
fleet cost and the total monetary value of passenger travel time and transfer time.
The computer package METRO which follows the above steps is developed for the purpose of
planning new subway systems and the expansions of existing networks.
The proposed method is made feasible by the separation of the subway planning problem into
two sub-problems related to station location and route selection respectively and related sub-
optimizations.
ACKNOWLEDGEMENTS
This research is supported in part by the Natural Sciences and Engineering Research Council of
Canada under Grant No. A4711 and by a University of Calgary Research Fellowship. Thirty
one subway systems which responded to the world wide survey are also acknowledged.
BIBLIOGRAPHY
De Leuw, Cather & Co. (1977). Study of Subway Station Design and Construction. National
Technical Information Service PB-268-894, U.S. Department of Commerce,
Washington, D.C.
Current, J., C. ReVelle and J. Cohon (1988). The Minimum Covering/Shortest Path Problem.
Decision Sci., 19:3, 490-503.
Erlenkotter, D. (1989). The General Optimal Market Area Model. Annals Op. Res., 18:1, 45-70.
Floyd, R.W. (1962). Algorithm 67 Shortest Path. Communications ACM, 5, 345.
Hurdle, V.F. and S.C. Wirasinghe. (1980). Location of Rail Stations for Many to One Travel
Demand and Several Feeder Modes. J. Adv. Transportation, 14, 29-46.
Gendreau, M., G. Laporte, and J.A. Mesa (1995). Locating Rapid Transit Lines. J. Adv.
Transportation, 29:2, 145-162.
Kruskal, J.B. Jr. (1956). On the Shortest Spanning Subtree of a Graph and the Travelling
Salesman Problem. Proc. Am. Math. Soc., 7, 48-50.
Musso, A. and V.R. Vuchic (1988). Characteristics of Metro Networks and Methodology for
their Evaluation. Paper 870489 of 67th Annual TRB Meeting, Washington, D.C.
Newell, G.F. (1973). Scheduling, Location, Transportation and Continuum Mechanics: Some
Applications to Optimization Problems, SIAM J. Ap. Math., 25:3, 346-360.
Nock, O.S. (1973). Underground Railways of the World. A & C Black Ltd., London.
Public Innovation Abroad (1989). Subway 200 Feet Deep in the Heart of Tokyo. Pub. Innov.
Abroad, 89:6, 5.
Schutz, F., H. Berglund, and H. Von Heland (1965). Technical Description of the Stockholm
Underground Railway. SVR: S. Forlags Ab, Stockholm.
Vuchic, V.R. and G.F. Newell (1968). Rapid Transit Interstation Spacing for Minimum Travel
Time. Trans. Sci., 2, 303-339.
Wirasinghe, S.C. and N.S. Ghoneim (1981). Spacing of Bus-Stops for Many to Many Travel
Demand. Trans. Sci., 15, 210-221.
Wirasinghe, S.C., and D. Szplett (1984). An Investigation of Passenger Interchange and Train
Standing Time at LRT Stations: (ii) Estimation of Standing Time. J. Adv.
Transportation, 18:1, 13-24.
Wirasinghe, S.C. and U. Vandebona (1987). Some Aspects of the Location of Subway
Stations and Routes. 4th International Symposium on Locational Decisions, Namur,
Belgium.
Wirasinghe, S.C. and U. Vandebona (1987). Subway Station Location and Network Design.
Research Report CE88-3, Department of Civil Engineering, The University of Calgary,
Calgary, Canada.
SCHEDULING RAIL TRACK MAINTENANCE TO MINIMISE OVERALL DELAYS
Andrew Higgins, CSIRO, Brisbane, Australia.
Luis Ferreira and Maree Lake, School of Civil Engineering, Queensland University of
Technology, Australia.
ABSTRACT
In Australian freight operations, maintenance costs comprise between 25 and 35 percent of
total train operating costs. Therefore, it is important that the track maintenance planning
function is undertaken in an effective and efficient manner. This paper focuses on the
development of a model designed to help resolve the conflicts between train operations
and the scheduling of maintenance activities. The model involves scheduling
maintenance activities to minimise disruptions to train services and reduce maintenance
costs. The main applicability of such a model is as a decision support tool for track
maintenance planners and train planners.
The track maintenance scheduling problem, which involves the allocation of
maintenance activities to time windows and crews to activities, is formulated as an
integer programming model. The objective is to minimise a weighted combination of
expected interference delays and prioritised finishing time of activities. Minimising the
first component will ensure a minimum interference between track maintenance activities
and scheduled trains when either are delayed. The heuristic solution is obtained in two
steps. Firstly, an initial solution is generated, scheduling each of the activities in turn,
where the latter are ordered in terms of the importance of finishing time. Each activity is
selected and allocated to an available permissible work crew. If there are no available
permissible work crews when the activity is chosen, the earliest finishing crew will be
selected. The second stage uses the tabu search heuristic.
The model presented here was applied to an 89 km track corridor on the eastern coast of
Australia. The schedule constructed using tabu search has a 7 percent reduction in
objective function value as compared to the schedule constructed manually. The model
was also used to demonstrate the effects of activity schedule and maintenance resource
changes. A four day planning horizon was used for which the model was used to test
proposed changes. Increasing the time window by moving less important trains was
shown to reduce potential delays significantly.
1. INTRODUCTION
Although Australian rail systems have achieved considerable productivity gains in the
last decade, unit operating costs still fall well short of world's best practice. In 1993/94
track maintenance productivity lagged behind such benchmarks by an estimated $A80
million. This represents 16 percent of potential operating cost savings available if world's
best practice is achieved (Bureau of Industry Economics, 1995).
In Australian freight operations, maintenance costs comprise between 25 and 35 percent of
total train operating costs. Therefore, it is important that the track maintenance planning
function is undertaken in an effective and efficient manner. This applies to short-term
planning such as daily scheduling of activities; as well as the medium to long-term
planning of required maintenance activities. Infrastructure provision is increasingly seen
as a separate business to be managed, planned and owned by a different entity. Such
vertical separation is seen as providing the vehicle that will improve accountability and
profitability within the rail industry. In addition, separate control and management of
track is designed to encourage rail transport competition by allowing new train operators
to gain access to the right-of-way (Ferreira, 1997).
Efficient maintenance planning requires up-to-date, locally relevant decision support
tools. This paper focuses on the development of a model designed to help resolve the
conflicts between train operations and the scheduling of maintenance activities. The
model involves scheduling maintenance activities to reduce disruptions to train services;
maintenance costs; and the amount of time a given track segment has a level of service
below a specified benchmark, (e.g. speed restrictions imposed on trains due to poor track
condition). This latter aspect is particularly important when contractual obligations exist
between track providers and rail carriers relating track performance to access charges.
The main applicability of such a model is as a decision support tool for track
maintenance planners and train planners. The need for such a model is more pressing
under single track train operations, where trains can only pass or overtake each other at
specified locations (sidings). A model to optimise a given train schedule with respect to
train related operating costs, including the risk of delays, has been put forward by
Higgins et al. (1996). However, such a model does not incorporate track maintenance
activities in the decision process.
The paper is organised as follows: Section 2 provides a brief outline of the types of track
maintenance activities which need to be scheduled; this is followed in Section 3 by a
statement of the problem and a discussion of past work on the scheduling of track
maintenance; Section 4 provides a detailed description of the model developed; Section 5
shows the results of applying the model to a case-study; and finally some conclusions are
offered in Section 6.
2. MAINTENANCE ACTIVITIES
Track maintenance covers all the measures for preserving and re-establishing the nominal
condition, as well as the measures for determining and assessing the actual condition in a
technical system (Kooran, 1992). This section details the following main types of
maintenance: rail grinding; rail replacement; tamping; track stabilisation; ballast
injection; and sleeper replacement.
Rail Grinding: This consists of grinding machines travelling along the track with
grinding stones, which are rotating stones or stones oscillating longitudinally, to abrade
the rail's surface. Rail grinding is conducted to correct rail corrugations, fatigue and
metal flow and to re-profile the rail.
Rail Replacement: This may be conducted to upgrade the track to a higher gauge rail or
to replace the same gauge rail due to defects, wear or derailment damage.
Tamping: This is conducted to correct longitudinal profile, cross level and alignment of
track. A number of sleepers at a time are lifted to the correct level with vibrating
tamping tines inserted into the ballast.
Track Stabilisation: Track stabilisers vibrate the track in the lateral direction with a
vertical load to give controlled settlement. Tamping and compacting ballast underneath
sleepers reduces the lateral resistance of the track. Track stabilisation can restore the
lateral resistance to the original level.
Ballast Injection (Stoneblowing): Ballast injection, or stoneblowing, is conducted to
correct longitudinal profile. The process introduces additional stones to the surface of the
existing ballast bed, while leaving the stable compact ballast bed undisturbed (Esvald,
1989).
Sleeper Replacement: In almost all types of sleeper defects, remedial action is not
possible and the sleeper requires replacement. Defective sleepers can result in the rail
losing the correct gauge, which can cause rollingstock derailments.
3. TRACK MAINTENANCE SCHEDULING

This section addresses the short term scheduling of maintenance to minimise the conflict
between train operations and track maintenance, once the maintenance to be conducted
has been determined. The problem of maintenance planning in the medium to long-term,
including track degradation modelling, has been addressed by Ferreira and Murray
(1997).
The Problem
Planned train schedules under single track operations outside urban areas are likely to be
subjected to significant variations on a daily basis. This is due to a number of reasons
including missed 'meets' and 'passes' (locations where conflicts between opposing trains
are resolved); locomotive and rollingstock failures; and late departures from the origins.
Given such uncertainty in the times when specific trains are likely to occupy a given
track link, it is necessary to schedule maintenance activities so that any potential conflict
with train services is minimised. In addition, the maintenance activities themselves are
subjected to uncertainty in duration, due to the longer or shorter times to complete each
task. Such uncertainty may lead to disruptions to train services and hence unreliability of
train arrivals. A model to estimate such unreliability has been proposed by Ferreira and
Higgins (1996).
The overall problem consists of scheduling a given set of maintenance activities over a
network of rail links such that:
(a) Train delays are kept as low as possible, having regard to the importance of each
train. A pre-defined train hierarchy is used to reflect the fact that some trains may carry
more time sensitive freight than others. From a market share and business perspective
such trains are more 'important' to the operator and therefore they should be free from
the risk of maintenance delays as far as possible;
(b) For specified long-distance train services, maintenance activities should not be
allowed to affect train schedules so as to produce cumulative delays at the destinations
above a given level. This aspect of maintenance scheduling was found to be particularly
relevant by Aspebakken et al. (1991) when developing a graphical based system;
(c) The direct costs associated with maintenance activities are kept as low as possible;
(d) Maintenance activities take place according to an a priori set of priority weights;
and
(e) Maintenance crews are assigned to activities according to a specified area of coverage
for each crew.
Increasing Importance of Effective Scheduling
Contractors have become involved with a broader range of work across the rail
engineering spectrum, from design and construction to maintenance and operations
(Kramer, 1998). With this increase of outsourcing, it becomes important to effectively
schedule the maintenance windows, as contractual arrangements will govern the
allowable maintenance times. Penalties are likely to be accrued if the maintenance
windows are not as required, resulting in a trade off between customer service and
maintenance costs. A maintenance window is a planned pre-organisation of train services
for a set time or over a set period, to ensure track possession for prescribed infrastructure
works (Griffiths, 1991). The length of maintenance window planned is not necessarily
the hours that the maintenance gangs actually receive. The difference between the
planned and actual length is train and workblock conflict related (Szymkowiak, 1991).
That is, delays to trains reduce the size of the maintenance window.
In an attempt to coordinate maintenance and train operations, maintenance has been
scheduled at night, outside peak train times or at weekends. However, this is not
always feasible due to other considerations in the scheduling of maintenance, such as
noise restrictions, safety aspects or personnel rostering.
Past Work
At Burlington Northern, work on coordinating maintenance and train operations was
conducted with the aim of notifying customers several weeks in advance when delays
due to maintenance could be expected (Aspebakken et al., 1991). The Service
Maintenance Planning System (SMP) was created to allow visualisation of the interaction
between trains and maintenance work for specific corridors. Time/distance diagrams and
the locations of any conflicts between trains and maintenance are displayed. The system
does not schedule the maintenance activities or maintenance crews, instead highlighting
the conflicts between the maintenance and trains. Basic priority rules are used to make
adjustments to the timetable. However, this does not necessarily optimise the train and
maintenance schedule with respect to overall costs or reliability.
Research into the goals, constraints and necessary outputs of a maintenance scheduler has
been undertaken, in particular detailing the program Intelligent Maintenance Scheduler
(IMS) developed by Ruffing and Marvel (1994). The goal of this system was reported to
be maximising the work window size and minimising train delays, with constraints
including labour agreements, train service contracts, physical limitations and
maintenance activity characteristics. The required outputs listed were the specific track
occupancy times of the maintenance crews and the resultant train delays. Train delays
are considered to be the direct and indirect time increases to the train schedule for
maintenance to be carried out. This research does not take into account the unforeseen
delays of trains or maintenance on the day of operation and maintenance crews are not
assigned to activities.
4. PROBLEM FORMULATION
Definition of Terms and Notation Used
Expected interference delay (EID): This is the average delay that both the train schedule and
maintenance activities would suffer due to unforeseen events which may occur to a train
or activity. For this paper, unforeseen events follow a known discrete distribution.
Prioritised finishing time: It is desirable to have a maintenance activity finished as soon
as possible. This is particularly so for those activities which result in the track being at a higher
standard. Minimising the prioritised finishing time requires scheduling the maintenance
activities (with different priorities) so as to have the finishing time prior to as many
scheduled trains as possible.
Time window: A group of time intervals which are not broken up by a scheduled train.
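The time window definition can be illustrated with a short Python sketch. It assumes, purely for illustration, that each link carries a list of flags marking which time intervals contain a scheduled train; the function name `time_windows` and the sample data are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def time_windows(train_occupied: List[bool]) -> List[Tuple[int, int]]:
    """Return (first, last) interval indices of every maximal run of
    train-free intervals on one link. Indices are inclusive."""
    windows = []
    start = None
    for t, occupied in enumerate(train_occupied):
        if not occupied and start is None:
            start = t                      # a new window opens
        elif occupied and start is not None:
            windows.append((start, t - 1)) # a scheduled train closes the window
            start = None
    if start is not None:                  # window still open at end of horizon
        windows.append((start, len(train_occupied) - 1))
    return windows


# Hypothetical nine-interval horizon; True marks an interval with a scheduled train.
occupied = [False, False, True, False, False, False, True, True, False]
print(time_windows(occupied))  # [(0, 1), (3, 5), (8, 8)]
```

Under this representation, a long window such as (3, 5) can hold an activity without interruption, whereas an activity spread over several windows would be treated as discontinuous.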
Objective Function
As both trains and maintenance activities are subject to unexpected delays, the
objective function consists of three weighted components, which minimise: expected
interference delays (EID) due to delayed trains overlapping the maintenance schedule;
EID due to delayed maintenance activities overlapping the train schedule; and prioritised
finishing time of maintenance activities. In presenting the analytical objective function
and constraints, the following notation and parameters are first defined:
$I$  Set of maintenance crews which are allocated to the track corridor
$K$  Set of track links which make up the track corridor
$U$  Set of scheduled trains on the corridor
$J$  Set of activities which are to be completed in time horizon $T$
$V_k$  Activities that are scheduled on link $k \in K$ during time horizon $T$
$L_j$  Crews permissible to work on maintenance activity $j \in J$
[symbol illegible in source]  1 if maintenance crew $i \in I$ is available at time interval $t \in T$; 0 otherwise
[symbol illegible in source]  1 if link $k \in K$ is available for maintenance at time interval $t \in T$; 0 otherwise
$B_j$  Number of time intervals required to carry out maintenance activity $j \in J$
given that the work is continuous
$F_j$  Number of extra time intervals required for maintenance activity $j \in J$
due to discontinuous maintenance
$M_{i,j_1,j_2}$  Minimum travel time intervals for maintenance crew $i \in I$ to travel from
the track link where activity $j_1 \in J$ is carried out to that link where
activity $j_2 \in J$ is carried out
$C1_{ijt}$  Cost of maintenance crew $i \in I$ working on activity $j \in J$ at time interval
$t \in T$
$C2_j$  Importance of activity $j \in J$ in terms of finishing time
$SC_{kt}$  Number of scheduled trains which pass through link $k \in K$ before time
$t \in T$
$h$  Priority weight of activity finish time relative to EID between the train
schedule and maintenance activities
The 0-1 decision variables are defined as follows:
$X_{ij}$ = 1 if track maintenance crew $i \in I$ is assigned to activity $j \in J$; 0 otherwise
$Y_{jt}$ = 1 if work is carried out on activity $j \in J$ in time interval $t \in T$; 0 otherwise
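The objective function for the model is as follows:

$$\min Z = \sum_{k \in K} \sum_{u \in U} \sum_{j \in V_k} \sum_{w} f(u,w)\, P_{uj}\, Y_{j,\,O_{uk}+w} \;+\; \sum_{k \in K} \sum_{u \in U} \sum_{\substack{j \in V_k \\ O_{uk} > t_j}} fc(j,\, O_{uk} - t_j)\, P_{uj} \;+\; h \sum_{k \in K} \sum_{j \in V_k} C2_j\, SC_{k,\,t_j} \qquad (1)$$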
where:
$t_j = \max_{t \in T} (t \cdot Y_{jt})$, i.e. finishing time of activity $j \in J$
$U$ = set of scheduled trains
$O_{uk}$ = time interval in which train $u \in U$ is scheduled to enter link $k \in K$
$P_{uj}$ = priority weight when there is an overlap between train $u \in U$ and activity
$j \in J$ for which either is delayed. This priority is a combination of that
associated with train $u \in U$ and activity $j \in J$.
$f(u,w)$ = probability that train $u \in U$ is delayed $w$ time intervals
$fc(j,w)$ = probability that activity $j \in J$ is delayed $w$ time intervals.
The first component of objective function (1) measures the expected interference delay
(or expected amount of weighted overlap) due to delayed trains overlapping scheduled
maintenance activities. That is, if train $u \in U$ was delayed $w$ time intervals on entry to
link $k \in K$ and an activity $j \in J$ was scheduled (with crew $i \in I$) for this time interval
(i.e. $Y_{j,\,O_{uk}+w} = 1$), then the weighted amount of overlap is equal to $P_{uj}$. For all possible
train delays, the expected amount of overlap (i.e. EID) between train $u \in U$ and activity
$j \in J$ is $\sum_{w} f(u,w) \cdot P_{uj} \cdot Y_{j,\,O_{uk}+w}$. Therefore, the first component of (1) is achieved by
summing over all track links and scheduled trains. In the second component of (1), which
covers delayed maintenance activities overlapping scheduled train services, if activity $j \in J$
is delayed $w = O_{uk} - t_j > 0$, then there is an overlap with train $u \in U$ with weight $P_{uj}$
and probability of occurrence $fc(j,\, O_{uk} - t_j)$. In the third part of objective function (1),
that is minimising the weighted completion time of the maintenance activities,
$C2_j \cdot SC_{k,\,t_j}$ is the priority of activity $j \in J$ multiplied by the number of scheduled trains
which are prior to the finishing time of activity $j \in J$ on link $k \in K$.
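As a numerical illustration of the first component, the following Python sketch evaluates the expected weighted overlap for a single train and a single activity on one link. The function name `train_eid`, the data layout and the sample numbers are assumptions made for this example only; they are not taken from the paper.

```python
from typing import Dict, Set


def train_eid(entry_interval: int,
              priority_weight: float,
              delay_prob: Dict[int, float],
              work_intervals: Set[int]) -> float:
    """Expected weighted overlap between one train u and one activity j:
    the sum over w of f(u, w) * P_uj * Y_{j, O_uk + w}.

    entry_interval  -- O_uk, scheduled entry interval of the train on the link
    priority_weight -- P_uj
    delay_prob      -- f(u, w), mapping delay w (in intervals) to its probability
    work_intervals  -- the intervals t for which Y_jt = 1
    """
    return sum(prob * priority_weight
               for w, prob in delay_prob.items()
               if entry_interval + w in work_intervals)


# Hypothetical data: the train is due at interval 10 and the activity is
# worked in intervals 11 and 12, so delays of 1 or 2 intervals cause overlap.
f_u = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}
print(train_eid(10, 2.0, f_u, {11, 12}))  # (0.25 + 0.125) * 2.0 = 0.75
```

Summing this quantity over all links, trains and activities would give the first component of (1) under these assumptions.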
Constraints
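The model is subject to the following constraints:

[constraint illegible in source] where $k$ is such that $j \in V_k$ (2)

$$\sum_{t \in T} Y_{jt} = B_j + \sum_{t \in T} \mathrm{abs}(Y_{jt} - Y_{j,t-1}) \cdot \frac{F_j}{2} - F_j \quad \forall j \in J \qquad (3)$$

[left-hand side illegible in source] $\le$ [right-hand side illegible in source] $\quad \forall i \in I,\ t \in T$ (4)

[constraint illegible in source] (5)

$$\min \{\, z \ \text{s.t.}\ X_{i,j_1} \cdot Y_{j_1,t} + X_{i,j_2} \cdot Y_{j_2,t+z} = 2 \,\} \ge M_{i,j_1,j_2} \quad \text{if } j_1 \ne j_2 \qquad (6)$$

$\sum_{i \in I} \sum_{j \in J} \sum_{t \in T}$ [summand and budget bound illegible in source] (7)

$$\max_{t} (t \cdot Y_{j_1,t}) < \min_{t} (t \cdot Y_{j_2,t}) \quad \text{if activity } j_1 \text{ must be completed before the commencement of activity } j_2 \qquad (8)$$

$$X_{ij} = 0 \quad \text{if } i \notin L_j \qquad (9)$$

$$X_{ij},\ Y_{jt} = 0 \text{ or } 1 \quad \forall i \in I,\ j \in J,\ t \in T \qquad (10)$$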
Constraint (2) identifies links which are not allowed to be assigned maintenance during
certain time intervals. This applies when the track is unavailable due to scheduled
trains. Constraint (3) enforces the required number of time intervals for which an activity
is worked on. The second part of the right hand side on this constraint extends the
number of time intervals required for an activity if it is disturbed by a scheduled train. If
work on an activity is continuous then this value is equal to 0 (i.e.
$\sum_{t \in T} \mathrm{abs}(Y_{jt} - Y_{j,t-1}) \cdot F_j / 2 = F_j$). Some activities will cause the track link to be
unusable until the activity is complete. For these activities which cannot be disturbed by
a scheduled train, $F_j$ is set to a large value. The availability of maintenance crews in
each time interval is represented by constraint (4). Activities are prevented from being
worked in parallel on the same link using constraint (5). For example, both re-railing and
tamping cannot be carried out at the same time and on the same track link. Constraint (6)
ensures the minimum number of time intervals between the finishing time of activity $j_1$
and the start time of $j_2$ is $M_{i,j_1,j_2}$. Such a constraint is required for a maintenance crew to
have enough time to travel between links at different parts of the track corridor and to
prepare for the next activity. Constraint (7) ensures that the maximum operating costs for
the maintenance crews does not exceed the allocated cost budget. For this formulation,
cost is considered as a constraint rather than a component to be minimised. It can be
made a soft constraint by attaching a penalty to the amount by which the budget is
exceeded. Constraint (8) ensures any necessary ordering between activities. For example,
visual inspection may only be carried out after sleepers are replaced. Permissible
maintenance crews for individual links are represented by constraint (9). Constraint (10)
ensures the decision variables are binary.
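The effect of constraint (3) can be checked with a small Python sketch. It assumes that $Y_{jt}$ is zero before the first and after the last interval of the horizon, which the text does not state explicitly; the function names and sample schedules are hypothetical.

```python
from typing import List


def required_intervals(y: List[int], base: int, extra: int) -> float:
    """Right-hand side of constraint (3):
    B_j + sum_t abs(Y_jt - Y_j,t-1) * F_j / 2 - F_j.

    y     -- 0-1 work indicators Y_jt for one activity over the horizon
    base  -- B_j, intervals needed when the work is continuous
    extra -- F_j, extra intervals incurred by each interruption
    """
    padded = [0] + list(y) + [0]   # assumption: no work outside the horizon
    switches = sum(abs(padded[t] - padded[t - 1]) for t in range(1, len(padded)))
    return base + switches * extra / 2 - extra


def satisfies_constraint_3(y: List[int], base: int, extra: int) -> bool:
    """True when the intervals worked equal the intervals required."""
    return sum(y) == required_intervals(y, base, extra)


# Continuous work: one start and one stop, so no extra intervals are needed.
print(satisfies_constraint_3([0, 1, 1, 1, 1, 0], base=4, extra=1))  # True
# One interruption: two starts and two stops, so B_j + F_j = 5 intervals are needed.
print(satisfies_constraint_3([1, 1, 0, 1, 1, 1], base=4, extra=1))  # True
print(satisfies_constraint_3([1, 1, 0, 1, 1, 0], base=4, extra=1))  # False
```

Each start or stop of work contributes 1 to the sum of absolute differences, so a schedule split into n blocks needs B_j + (n - 1) F_j intervals in total.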
Solving the Model
The model presented here, which contains a large number of decision variables for a real
life problem, is suited to local search type heuristics due to the ease in constructing an
efficient and meaningful search neighbourhood. The heuristic applied to find a
near-optimal solution is the Tabu Search (TS) heuristic (Glover, 1990; Glover et al., 1993).
The TS escapes local optimal solutions by allowing up-hill (non-improving) moves to be
performed when no down-hill (improving) moves are available. For the maintenance
scheduling problem, a move can be to swap attributes of two maintenance activities (with
probability P); or to shift an activity to a different set of (available) time windows (with
probability 1-P). At each iteration of the TS, the neighbourhood (or part of it) is
searched, and the best non-tabu move found in the search is applied. A move is
tabu if it is one of the L most recent moves applied. The tabu status is overridden
(the aspiration criterion) if the resulting solution is better than any found up to that stage of the search.
Depending on the problem at hand, the general TS can be extended so as to promote
various forms of diversification and intensification. For the application of this model, the
search is re-initialised after LOCIT iterations by replacing the current solution with the
best found so far. The tabu search is terminated after GLOBIT iterations. Before the TS
is applied an initial solution is found by scheduling each of the activities in order of
prioritised finishing time (third part of objective function (1)). Activities are allocated to
available permissible work crews. The budget constraint is ignored at this stage.
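The search loop described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: it searches the full swap neighbourhood of an activity ordering (the shift move and the random choice between moves with probability P are omitted), and it uses a hypothetical stand-in cost, the priority-weighted finishing time for a single crew, in place of objective function (1). The names `tabu_search`, `weighted_finish_cost`, `DURATIONS` and `WEIGHTS` are assumptions; `tenure`, `locit` and `globit` play the roles of L, LOCIT and GLOBIT.

```python
import itertools
from collections import deque
from typing import Callable, List, Sequence, Tuple

# Hypothetical data: durations (time intervals) and priority weights of four activities.
DURATIONS = [16, 4, 12, 6]
WEIGHTS = [1, 3, 2, 2]


def weighted_finish_cost(order: Sequence[int]) -> float:
    """Stand-in objective: sum of weight * finishing time when one crew
    works through the activities in the given order without breaks."""
    clock, total = 0, 0
    for j in order:
        clock += DURATIONS[j]
        total += WEIGHTS[j] * clock
    return total


def tabu_search(initial: Sequence[int],
                cost: Callable[[Sequence[int]], float],
                tenure: int = 5, locit: int = 20, globit: int = 100
                ) -> Tuple[List[int], float]:
    current = list(initial)
    best, best_cost = list(current), cost(current)
    tabu = deque(maxlen=tenure)              # the `tenure` most recent moves
    for iteration in range(1, globit + 1):
        chosen = None
        for a, b in itertools.combinations(range(len(current)), 2):
            candidate = list(current)
            candidate[a], candidate[b] = candidate[b], candidate[a]
            c = cost(candidate)
            # Aspiration: a tabu move is allowed only if it beats the best so far.
            if (a, b) in tabu and c >= best_cost:
                continue
            if chosen is None or c < chosen[0]:
                chosen = (c, (a, b), candidate)
        if chosen is None:
            break                            # every move is tabu
        c, move, current = chosen            # applied even if non-improving
        tabu.append(move)
        if c < best_cost:
            best, best_cost = list(current), c
        if iteration % locit == 0:
            current = list(best)             # re-initialise from the best found
    return best, best_cost


print(tabu_search([0, 1, 2, 3], weighted_finish_cost))  # ([1, 3, 2, 0], 114)
```

On this toy instance the search reaches the ordering with the smallest weighted finishing time after two moves. In the full model, each neighbour would instead be evaluated with objective (1) and checked against constraints (2) to (10).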
5. MODEL TESTING: AN APPLICATION
The Test Problem
The track corridor chosen to test the model is 89km long and stretches between Gympie
and Maryborough on the Queensland north coast railway, Australia. On the busiest day
of the week, over 30 trains are scheduled to travel along this track. There are four
different types of trains scheduled along this track, namely: heavy freight (HF); fast
freight (FF); diesel hauled passenger trains (P); and fast passenger trains (FP). The track
corridor, which contains 13 sidings, has two sets of maintenance crews allocated to it.
It is assumed that one set is responsible for the track links between Gympie and Thebine,
while the other is allocated to links between Thebine and Maryborough. It is assumed
that a crew allocated to the Gympie - Thebine corridor may work on the other track links,
but at 1.5 times the normal cost. A maintenance activity cannot be scheduled in a time
interval which contains a scheduled train.
The list of activities for each track link along with the minimum time intervals to
complete each are shown in Table 1. For the base problem under study, the priorities of
these activities are scaled so that the delay due to train disturbances will be about 5
percent of the entire objective value. This means that the activities will be predominantly
scheduled so as to minimise activity finishing time. Delays due to disturbances will only
be significant if the solution technique is to decide between two schedules with similar
total activity finishing time, but vastly different disturbance delays.
Table 1: Number of time intervals required to complete each activity

              Number of time intervals
Track link    Activity 1    Activity 2    Activity 3
 1            16            4
 2            12            9
 3            6
 4            7             8
 5            9             7             6
 6            6             8
 7            5             6
 8            4
 9            10            10
10            9             10
11            14            6             4
12            6
13            12            10
Using the result of Higgins (1996), train delays are found to follow a negative
exponential distribution with a mean delay of 0.30 hours. For the purposes of this
example, delay distributions for maintenance activities are assumed to be the same as those for
trains. If an activity was discontinuous due to scheduled trains, the extra time required
was assumed to be 0.2 hours for all activities. That is, if an activity stopped 5 times due
to a scheduled train, an extra time interval was added to its duration.
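Because the model works with a discrete delay distribution over hourly time intervals, the negative exponential distribution has to be discretised. The paper does not state how this was done; the Python sketch below shows one plausible approach, assigning to each interval the probability that the delay falls within it and folding the remaining tail into the last interval. The function name and the choice of four intervals are assumptions.

```python
import math
from typing import List


def discretised_exponential(mean_delay: float, interval: float, n: int) -> List[float]:
    """Probability that an exponentially distributed delay falls in interval w,
    for w = 0 .. n-1. The tail beyond n intervals is added to the last bin
    so that the probabilities sum to 1 (an assumption of this sketch)."""
    rate = 1.0 / mean_delay
    probs = [math.exp(-rate * w * interval) - math.exp(-rate * (w + 1) * interval)
             for w in range(n)]
    probs[-1] += math.exp(-rate * n * interval)
    return probs


# Mean delay of 0.30 hours and hourly intervals, as in the test problem.
f = discretised_exponential(mean_delay=0.30, interval=1.0, n=4)
print([round(p, 4) for p in f])
```

With a mean delay of 0.30 hours, about 96 percent of the probability mass falls within the first hourly interval, so overlaps caused by delays of two or more intervals carry very little weight in the expected interference delay.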
It is assumed that maintenance crews take 2 hours to travel the length of the entire corridor.
Since this corridor contains 12 track links in series, it is estimated that the number of
hourly time intervals required for crews to travel between two links is equal to the
number of links separating them divided by 5. A change in maintenance crews during an
activity is disallowed. It is assumed that work can be carried out on an activity at any time of
the day with equal costs. The base case contains six maintenance crews (three for
Gympie - Thebine, three for Thebine - Maryborough).
Model Results
The best solution found for the base case, using the tabu search is illustrated in Figure 1
for the allocation of activities to time windows. This is a time-distance train scheduling
diagram for which the horizontal dashed lines represent the intermediate sidings. The 72
to 96 hour section of planning horizon in this figure contains a large number of scheduled
trains, making the available time windows very few in number and short. It would not be
practical to allocate activities during this period unless they are urgent. Table 2 shows the
start and finish times for each maintenance activity on each track link.
A schedule was constructed manually by ordering activities in terms of priority and
allocating them in turn to the cheapest available work crew. The schedule constructed
using tabu search has a 7 percent reduction in objective function value as compared to the
schedule constructed manually.
The model was also used to demonstrate the effects of activity schedule and maintenance
resource changes. The first sensitivity test involved changing the train schedule as shown
in Figure 2. Trains marked 1 and 2 (Figure 1) are rescheduled to positions 1a and 2a
(Figure 2) respectively, so as to increase the time windows earlier in the schedule. This
allows fewer train interruptions for activities, lower expected delay due to disruptions and
earlier completion time of activities. Illustrated in Figure 2 is the schedule obtained using
the tabu search heuristic. There is a 2 percent reduction in prioritised maximum finishing
time and an 18 percent reduction in expected delays due to train and activity
disturbances.
Maintenance planners may also wish to know how many crews should be available at a
given time. Too few crews will prevent activities from being completed in the allocated
time horizon. Too many will cause a large increase in maintenance costs with very little
improvement in the maintenance schedule. The costs and benefits of modifying the
number of work crews have been analysed and the results are shown in Table 3. Results
shown are in terms of the objective function as a percentage of the base case of 6 crews.
When reducing the number of crews to below 6, there is a major increase in the
prioritised finishing time of activities, while more than six crews results in only minor
improvements. Fewer than 5 crews gives an infeasible solution for a four day planning
horizon.
A test was carried out to assess the effects of modifying the priority of activity finishing
time with respect to expected interference delay. The results are displayed in Table 4.
The first column represents the importance ratio of prioritised activity finishing time with
respect to expected interference delay for which the base problem has a ratio of 1. This
ratio is halved at each row of Table 4.
The greatest fluctuation in prioritised finishing time of activities and expected
disturbance delay is when the importance ratio is between 0.125 and 0.0312. When the
ratio is less than 0.675, expected disturbance delay is very dominant in the objective
function. This is an important result since it is difficult for maintenance planners to
determine the cost of having activities finish early relative to delays due to train and
maintenance activity disturbances. Planners can use such a result to obtain the desired
balance in the objective function.
Table 2: Maintenance Activity Schedule - Base Case
Start and Finish Times (hrs in 24 hr clock)

              Activity 1    Activity 2    Activity 3
Track link    S     F       S     F       S     F
 1            34    67      1     5
 2            16    32      1     11
 3            11    21
 4            11    22      25    35
 5            18    29      30    41      7     16
 6            16    22      3     10
 7            24    31      3     8
 8            7     10
 9            2     14      40    59
10            24    36      42    64
11            39    70      28    36      2     5
12            33    40
13            39    60      16    27
S = Start time; F = Finish time
Table 3: Sensitivity Analysis: Changing the Number of Maintenance Crews
Objective Function as Percent of Base Case

Number of crews    Prioritised activity    Overall objective
                   finishing time          function
5                  113.0                   111.1
6 (Base Case)      100.0                   100.0
7                  98.8                    98.5
8                  96.1                    95.7
Table 4: Results: Changes to the Importance of Finishing Time and EID
Objective Function as Percent of Base Case

Importance Ratio    Prioritised Finishing Time    Expected Disturbance Delay
1 (Base Case)       100                           100
0.5                 100.5                         97.1
0.25                100.8                         87.1
0.125               102.7                         81.2
0.0625              108.1                         72.8
0.0312              114.5                         66.0
0.0156              116.3                         64.4
0.0078              117.1                         63.8
6. CONCLUSIONS AND FURTHER RESEARCH

There has been little published research which aims to optimise both train and track
maintenance operations simultaneously. Past work has either altered the train schedule to
accommodate track maintenance using priority rules or held the train schedule as a fixed
input and assigned the maintenance crews to the activities.
The scheduling of maintenance activities and allocation of crews to these are traditionally
performed using the experience and knowledge of planners. Since maintenance costs are
such a large proportion of total operating costs, there is a need for the development of
operations research tools aimed at assisting the maintenance planners. This paper has
presented a model aimed at improving rail maintenance decisions. The objective was a
combination of minimising the prioritised finishing time of each activity, as well as
expected interference to (and from) scheduled trains.
The tabu search heuristic technique was used to find a solution to the large 0-1 integer
program. Such a technique was suitable since the neighbourhood was easily defined by
swapping the order of activities or maintenance crews; or by shifting an activity to a
different time window.
The model was applied to an 89 km track corridor on the eastern coast of Australia. A four
day planning horizon was used for which the model was used to test proposed changes
including rescheduling trains and changing the number of maintenance crews. Increasing
the time window by moving less important trains was shown to reduce potential delays
significantly.
The model is mainly aimed at providing for an 'off-line' planning function. However, the
same model could be used by local track managers and train planners in real-time so that
adjustments could be made to a planned schedule of activities in the light of unplanned
train services or train cancellations. Such a system would need to be integrated into a
train dispatching real-time database.
As a part of further research, an ideal planning process would be to schedule maintenance
activities at the same time as constructing the train schedule. As proposed by Ruffing and
Marvel (1994), instead of constructing and optimising a train plan to minimise delays due
to train conflicts and unplanned events, a train plan can be constructed to maximise time
windows available for possible maintenance activities. A long continuous time window is
less likely to be disrupted by trains (or vice versa) than many short time windows split by
scheduled trains.
REFERENCES
Aspebakken, J. I., Galen, G. L. and Stroot, R. E. (1991). Service maintenance planning
on high densities, heavy haul traffic routes. Heavy Haul Workshop, International Heavy
Haul Conference, Vancouver, Canada.
Bureau of Industry Economics (1995). Rail freight 1995: International Benchmarking.
Report 95/22. AGPS, Canberra.
Esveld, C. (1989). Modern Railway Track. MRT-Productions, West Germany.
Ferreira, L. (1997). Rail track infrastructure ownership: Investment and operational
issues. Transportation, 24 (2), 183-200.
Ferreira, L. and Higgins, A. (1996). Modelling reliability of train arrival times. Journal of
Transportation Engineering, American Society of Civil Engineers (ASCE), 122 (6), 414-420.
Ferreira, L. and Murray, M. (1997). Modelling rail track deterioration and maintenance:
current practices and future needs. Transport Reviews, 17 (3), 207-221.
Higgins, A., Kozan, E. and Ferreira, L. (1996). Optimal scheduling of trains on a single
line track. Transportation Research - Part B, 30B (2), 147-161.
Higgins, A. (1996). Optimisation of train schedules to minimise transit time and
maximise reliability. PhD Thesis, School of Mathematical Sciences, Queensland
University of Technology, Brisbane, Australia.
Glover, F. (1990). Tabu Search: A Tutorial. Interfaces, 20, 74-79.
Glover, F., Taillard, E. and de Werra, D. (1993). A User's Guide to Tabu Search. Annals
of Operations Research, 41 (1), 3-28.
Griffiths, B. (1991). Implementation of a Maintenance Window Programme in an
Operating Suburban Network. Preprints of the 1991 International Heavy Haul
Workshop, International Heavy Haul Association, Vancouver, Canada, pp. 195-202.
Koogan, N. A. (1992). Track Maintenance at the Netherlands Railways. Ninth
International Rail Track Conference, Perth, Australia.
Kramer, J. (1998). Outsourcing in Rail Engineering: Contracting's Role. Railway Track
and Structures, April 1998, pp. 23.
Ruffing, J. A. and Marvel, B. (1994). An Analysis of the Scheduling of Work Windows
for Railroad Track Maintenance Gangs. Proceedings of the 1994 International Heavy
Haul Workshop, pp. 1-4.
Szymkowiak, J. A. (1991). Proactive Management of Service and Maintenance Conflicts.
Preprints of the 1991 International Heavy Haul Workshop, International Heavy Haul
Association, Vancouver, Canada, pp. 121-131.
INDEX
Assignment, Deterministic 397

Assignment, Dynamic 283, 301, 327, 517, 535
Assignment, Stochastic 257, 351, 373, 397
Bottlenecks (of Traffic) 107, 125, 147, 665
Cell Representation (of Traffic) 81, 327, 555
Congested Traffic 147, 445, 685
Cost Functions 3 87, 621, 665
Environment 735
Equilibrium Analysis 27, 51, 173, 257, 301, 351, 397,
445, 489, 707
Flow-Density-Speed Relationships 3, 81, 125, 147
Fuzzy Logic 535
Merging Models 173
Multilane Analysis 3, 27, 147, 685
Neural Networks 419
Origin-Destination (OD) Analysis 257, 397, 419,445
Optimisation 445, 621, 645, 665, 759, 779
Parking (demand and supply) 707
Pedestrians 235
Prediction (of Traffic) 419, 471
Public Transport 685, 759, 779
Queueing 107,517,685
Rail transport 779
Residential neighbourhood 735
Road Networks 257, 283, 535, 621, 645
Road Pricing 664
Road Safety 191,213
Route Guidance 535,577,621
Scheduling 779
Seat-belt 213
Simulation (of Traffic) 3, 213, 373, 517, 535, 555, 577,
601
Sociodemographic Variables 735
Traffic Flow Models 3, 27, 51, 81, 107, 125, 555
Traffic Control 577,601,621,645
Traffic Data 107, 147, 191, 373
Traffic Signals 445, 489, 645
