
Lecture Notes in Computer Science

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Moshe Y. Vardi
Rice University, Houston, TX, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany

4488

Yong Shi, Geert Dick van Albada,
Jack Dongarra, Peter M.A. Sloot (Eds.)

Computational Science – ICCS 2007

7th International Conference
Beijing, China, May 27-30, 2007
Proceedings, Part II


Volume Editors
Yong Shi
Graduate University of the Chinese Academy of Sciences
Beijing 100080, China
E-mail: yshi@gucas.ac.cn
Geert Dick van Albada
Peter M.A. Sloot
University of Amsterdam, Section Computational Science
1098 SJ Amsterdam, The Netherlands
E-mail: {dick, sloot}@science.uva.nl
Jack Dongarra
University of Tennessee, Computer Science Department
Knoxville, TN 37996-3450, USA
E-mail: dongarra@cs.utk.edu

Library of Congress Control Number: 2007927049


CR Subject Classification (1998): F, D, G, H, I.1, I.3, I.6, J, K.3, C.2-3
LNCS Sublibrary: SL 1 Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-540-72585-7 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-72585-5 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
SPIN: 12065738
06/3180
543210

Preface

The Seventh International Conference on Computational Science (ICCS 2007)
was held in Beijing, China, May 27-30, 2007. This was the continuation of previous
conferences in the series: ICCS 2006 in Reading, UK; ICCS 2005 in Atlanta,
Georgia, USA; ICCS 2004 in Krakow, Poland; ICCS 2003, held simultaneously at
two locations, in Melbourne, Australia and St. Petersburg, Russia; ICCS 2002
in Amsterdam, The Netherlands; and ICCS 2001 in San Francisco, California,
USA. Since the first conference in San Francisco, the ICCS series has become
a major platform to promote the development of Computational Science. The
theme of ICCS 2007 was "Advancing Science and Society through Computation."
It aimed to bring together researchers and scientists from mathematics
and computer science as basic computing disciplines, researchers from various
application areas who are pioneering the advanced application of computational
methods to sciences such as physics, chemistry, life sciences, and engineering,
arts and humanitarian fields, along with software developers and vendors, to
discuss problems and solutions in the area, to identify new issues, and to shape
future directions for research, as well as to help industrial users apply various
advanced computational techniques.

During the opening of ICCS 2007, Siwei Cheng (Vice-Chairman of the Standing
Committee of the National People's Congress of the People's Republic of
China and the Dean of the School of Management of the Graduate University
of the Chinese Academy of Sciences) presented the welcome speech on behalf of
the Local Organizing Committee, after which Hector Ruiz (President and CEO,
AMD) made remarks on behalf of international computing industries in China.

Seven keynote lectures were delivered by Vassil Alexandrov (Advanced Computing
and Emerging Technologies, University of Reading, UK) - Efficient Scalable
Algorithms for Large-Scale Computations; Hans Petter Langtangen (Simula
Research Laboratory, Lysaker, Norway) - Computational Modelling of Huge
Tsunamis from Asteroid Impacts; Jiawei Han (Department of Computer Science,
University of Illinois at Urbana-Champaign, USA) - Research Frontiers
in Advanced Data Mining Technologies and Applications; Ru-qian Lu (Institute
of Mathematics, Chinese Academy of Sciences) - Knowledge Engineering
and Knowledge Ware; Alessandro Vespignani (School of Informatics, Indiana
University, USA) - Computational Epidemiology and Emergent Disease Forecast;
David Keyes (Department of Applied Physics and Applied Mathematics,
Columbia University) - Scalable Solver Infrastructure for Computational Science
and Engineering; and Yves Robert (École Normale Supérieure de Lyon, France)
- Think Before Coding: Static Strategies (and Dynamic Execution) for Clusters
and Grids. We would like to express our thanks to all of the invited and keynote
speakers for their inspiring talks. In addition to the plenary sessions, the conference
included 14 parallel oral sessions and 4 poster sessions. This year, we
received more than 2,400 submissions for all tracks combined, out of which 716
were accepted.
This includes 529 oral papers, 97 short papers, and 89 poster papers, spread
over 35 workshops and a main track. For the main track we had 91 papers (80
oral papers and 11 short papers) in the proceedings, out of 360 submissions. We
had some 930 people doing reviews for the conference, with 118 for the main
track. Almost all papers received three reviews. The accepted papers are from
more than 43 different countries and 48 different Internet top-level domains.
The papers cover a wide range of topics in computational science and related
areas, from multiscale physics to wireless networks, and from graph theory
to tools for program development.
We would like to thank all workshop organizers and the Program Committee
for the excellent work in maintaining the conference's standing for high-quality
papers. We would like to express our gratitude to the staff and graduates of the
Chinese Academy of Sciences Research Center on Data Technology and Knowledge
Economy and the Institute of Policy and Management for their hard work in
support of ICCS 2007. We would like to thank the Local Organizing Committee
and Local Arrangements Committee for their persistent and enthusiastic work
towards the success of ICCS 2007. We owe special thanks to our sponsors, AMD,
Springer, the University of Nebraska at Omaha, USA, and the Graduate University
of the Chinese Academy of Sciences, for their generous support.
ICCS 2007 was organized by the Chinese Academy of Sciences Research Center
on Data Technology and Knowledge Economy, with support from the Section
Computational Science at the Universiteit van Amsterdam and the Innovative
Computing Laboratory at the University of Tennessee, in cooperation with the
Society for Industrial and Applied Mathematics (SIAM), the International Association
for Mathematics and Computers in Simulation (IMACS), the Chinese
Society for Management Modernization (CSMM), and the Chinese Society of
Optimization, Overall Planning and Economical Mathematics (CSOOPEM).
May 2007

Yong Shi

Organization

ICCS 2007 was organized by the Chinese Academy of Sciences Research Center on Data Technology and Knowledge Economy, with support from the Section Computational Science at the Universiteit van Amsterdam and Innovative
Computing Laboratory at the University of Tennessee, in cooperation with the
Society for Industrial and Applied Mathematics (SIAM), the International Association for Mathematics and Computers in Simulation (IMACS), and the Chinese
Society for Management Modernization (CSMM).

Conference Chairs
Conference Chair - Yong Shi (Chinese Academy of Sciences, China / University
of Nebraska at Omaha, USA)
Program Chair - Dick van Albada (Universiteit van Amsterdam,
The Netherlands)
ICCS Series Overall Scientific Co-chair - Jack Dongarra (University of Tennessee,
USA)
ICCS Series Overall Scientific Chair - Peter M.A. Sloot (Universiteit van
Amsterdam, The Netherlands)

Local Organizing Committee


Weimin Zheng (Tsinghua University, Beijing, China) - Chair
Hesham Ali (University of Nebraska at Omaha, USA)
Chongfu Huang (Beijing Normal University,
Beijing, China)
Masato Koda (University of Tsukuba, Japan)
Heeseok Lee (Korea Advanced Institute of Science and Technology, Korea)
Zengliang Liu (Beijing University of Science and Technology, Beijing, China)
Jen Tang (Purdue University, USA)
Shouyang Wang (Academy of Mathematics and System Science, Chinese
Academy of Sciences, Beijing, China)
Weixuan Xu (Institute of Policy and Management, Chinese Academy of Sciences,
Beijing, China)
Yong Xue (Institute of Remote Sensing Applications, Chinese Academy of
Sciences, Beijing, China)
Ning Zhong (Maebashi Institute of Technology, Japan)
Hai Zhuge (Institute of Computing Technology, Chinese Academy of Sciences,
Beijing, China)


Local Arrangements Committee


Weixuan Xu, Chair
Yong Shi, Co-chair of events
Benfu Lu, Co-chair of publicity
Hongjin Yang, Secretary
Jianping Li, Member
Ying Liu, Member
Jing He, Member
Siliang Chen, Member
Guanxiong Jiang, Member
Nan Xiao, Member
Zujin Deng, Member

Sponsoring Institutions
AMD
Springer
World Scientific Publishing
University of Nebraska at Omaha, USA
Graduate University of Chinese Academy of Sciences
Institute of Policy and Management, Chinese Academy of Sciences
Universiteit van Amsterdam

Program Committee
J.H. Abawajy, Deakin University, Australia
D. Abramson, Monash University, Australia
V. Alexandrov, University of Reading, UK
I. Altintas, San Diego Supercomputer Center, UCSD
M. Antolovich, Charles Sturt University, Australia
E. Araujo, Universidade Federal de Campina Grande, Brazil
M.A. Baker, University of Reading, UK
B. Balis, Krakow University of Science and Technology, Poland
A. Benoit, LIP, ENS Lyon, France
I. Bethke, University of Amsterdam, The Netherlands
J.A.R. Blais, University of Calgary, Canada
I. Brandic, University of Vienna, Austria
J. Broeckhove, Universiteit Antwerpen, Belgium
M. Bubak, AGH University of Science and Technology, Poland
K. Bubendorfer, Victoria University of Wellington, Australia
B. Cantalupo, DATAMAT S.P.A, Italy
J. Chen, Swinburne University of Technology, Australia
O. Corcho, University of Manchester, UK
J.C. Cunha, Univ. Nova de Lisboa, Portugal


S. Date, Osaka University, Japan


F. Desprez, INRIA, France
T. Dhaene, University of Antwerp, Belgium
I.T. Dimov, ACET, The University of Reading, UK
J. Dongarra, University of Tennessee, USA
F. Donno, CERN, Switzerland
C. Douglas, University of Kentucky, USA
G. Fox, Indiana University, USA
W. Funika, Krakow University of Science and Technology, Poland
H.J. Gardner, Australian National University, Australia
G. Geethakumari, University of Hyderabad, India
Y. Gorbachev, St. Petersburg State Polytechnical University, Russia
A.M. Goscinski, Deakin University, Australia
M. Govindaraju, Binghamton University, USA
G.A. Gravvanis, Democritus University of Thrace, Greece
D.J. Groen, University of Amsterdam, The Netherlands
T. Gubala, ACC CYFRONET AGH, Krakow, Poland
M. Hardt, FZK, Germany
T. Heinis, ETH Zurich, Switzerland
L. Hluchy, Institute of Informatics, Slovak Academy of Sciences, Slovakia
A.G. Hoekstra, University of Amsterdam, The Netherlands
W. Hoffmann, University of Amsterdam, The Netherlands
C. Huang, Beijing Normal University, Beijing, China
M. Humphrey, University of Virginia, USA
A. Iglesias, University of Cantabria, Spain
H. Jin, Huazhong University of Science and Technology, China
D. Johnson, ACET Centre, University of Reading, UK
B.D. Kandhai, University of Amsterdam, The Netherlands
S. Kawata, Utsunomiya University, Japan
W.A. Kelly, Queensland University of Technology, Australia
J. Kitowski, Inst.Comp.Sci. AGH-UST, Cracow, Poland
M. Koda, University of Tsukuba, Japan
D. Kranzlmüller, GUP, Joh. Kepler University Linz, Austria
B. Kryza, Academic Computer Centre CYFRONET-AGH, Cracow, Poland
M. Kunze, Forschungszentrum Karlsruhe (FZK), Germany
D. Kurzyniec, Emory University, Atlanta, USA
A. Lagana, University of Perugia, Italy
J. Lee, KISTI Supercomputing Center, Korea
C. Lee, Aerospace Corp., USA
L. Lefevre, INRIA, France
A. Lewis, Griffith University, Australia
H.W. Lim, Royal Holloway, University of London, UK
A. Lin, NCMIR/UCSD, USA
P. Lu, University of Alberta, Canada
M. Malawski, Institute of Computer Science AGH, Poland


M. Mascagni, Florida State University, USA


V. Maxville, Curtin Business School, Australia
A.S. McGough, London e-Science Centre, UK
E.D. Moreno, UEA-BENq, Manaus, Brazil
J.T. Moscicki, CERN, Switzerland
S. Naqvi, CoreGRID Network of Excellence, France
P.O.A. Navaux, Universidade Federal do Rio Grande do Sul, Brazil
Z. Nemeth, Computer and Automation Research Institute, Hungarian Academy
of Science, Hungary
J. Ni, University of Iowa, USA
G. Norman, Joint Institute for High Temperatures of RAS, Russia
B. Ó Nualláin, University of Amsterdam, The Netherlands
C.W. Oosterlee, Centrum voor Wiskunde en Informatica, CWI, The Netherlands
S. Orlando, Università Ca' Foscari, Venice, Italy
M. Paprzycki, IBS PAN and SWPS, Poland
M. Parashar, Rutgers University, USA
L.M. Patnaik, Indian Institute of Science, India
C.P. Pautasso, ETH Zürich, Switzerland
R. Perrott, Queen's University, Belfast, UK
V. Prasanna, University of Southern California, USA
T. Priol, IRISA, France
M.R. Radecki, Krakow University of Science and Technology, Poland
M. Ram, C-DAC Bangalore Centre, India
A. Rendell, Australian National University, Australia
P. Rhodes, University of Mississippi, USA
M. Riedel, Research Centre Juelich, Germany
D. Rodríguez García, University of Alcalá, Spain
K. Rycerz, Krakow University of Science and Technology, Poland
R. Santinelli, CERN, Switzerland
J. Schneider, Technische Universität Berlin, Germany
B. Schulze, LNCC, Brazil
J. Seo, The University of Manchester, UK
Y. Shi, Chinese Academy of Sciences, Beijing, China
D. Shires, U.S. Army Research Laboratory, USA
A.E. Solomonides, University of the West of England, Bristol, UK
V. Stankovski, University of Ljubljana, Slovenia
H. Stockinger, Swiss Institute of Bioinformatics, Switzerland
A. Streit, Forschungszentrum Jülich, Germany
H. Sun, Beihang University, China
R. Tadeusiewicz, AGH University of Science and Technology, Poland
J. Tang, Purdue University, USA
M. Taufer, University of Texas El Paso, USA
C. Tedeschi, LIP-ENS Lyon, France
A. Thandavan, ACET Center, University of Reading, UK
A. Tirado-Ramos, University of Amsterdam, The Netherlands


P. Tvrdik, Czech Technical University Prague, Czech Republic


G.D. van Albada, Universiteit van Amsterdam, The Netherlands
F. van Lingen, California Institute of Technology, USA
J. Vigo-Aguiar, University of Salamanca, Spain
D.W. Walker, Cardiff University, UK
C.L. Wang, University of Hong Kong, China
A.L. Wendelborn, University of Adelaide, Australia
Y. Xue, Chinese Academy of Sciences, China
L.T. Yang, St. Francis Xavier University, Canada
C.T. Yang, Tunghai University, Taichung, Taiwan
J. Yu, The University of Melbourne, Australia
Y. Zheng, Zhejiang University, China
W. Zheng, Tsinghua University, Beijing, China
L. Zhu, University of Florida, USA
A. Zomaya, The University of Sydney, Australia
E.V. Zudilova-Seinstra, University of Amsterdam, The Netherlands

Reviewers
J.H. Abawajy
D. Abramson
A. Abran
P. Adriaans
W. Ahn
R. Akbani
K. Akkaya
R. Albert
M. Aldinucci
V.N. Alexandrov
B. Alidaee
I. Altintas
K. Altmanninger
S. Aluru
S. Ambroszkiewicz
L. Anido
K. Anjyo
C. Anthes
M. Antolovich
S. Antoniotti
G. Antoniu
H. Arabnia
E. Araujo
E. Ardeleanu
J. Aroba
J. Astalos

B. Autin
M. Babik
G. Bai
E. Baker
M.A. Baker
S. Balfe
B. Balis
W. Banzhaf
D. Bastola
S. Battiato
M. Baumgarten
M. Baumgartner
P. Beckaert
A. Belloum
O. Belmonte
A. Belyaev
A. Benoit
G. Bergantz
J. Bernsdorf
J. Berthold
I. Bethke
I. Bhana
R. Bhowmik
M. Bickelhaupt
J. Bin Shyan
J. Birkett

J.A.R. Blais
A. Bode
B. Boghosian
S. Bolboaca
C. Bothorel
A. Bouteiller
I. Brandic
S. Branford
S.J. Branford
R. Braungarten
R. Briggs
J. Broeckhove
W. Bronsvoort
A. Bruce
C. Brugha
Y. Bu
K. Bubendorfer
I. Budinska
G. Buemi
B. Bui
H.J. Bungartz
A. Byrski
M. Cai
Y. Cai
Y.Q. Cai
Z.Y. Cai


B. Cantalupo
K. Cao
M. Cao
F. Capkovic
A. Cepulkauskas
K. Cetnarowicz
Y. Chai
P. Chan
G.-L. Chang
S.C. Chang
W.A. Chaovalitwongse
P.K. Chattaraj
C.-K. Chen
E. Chen
G.Q. Chen
G.X. Chen
J. Chen
J. Chen
J.J. Chen
K. Chen
Q.S. Chen
W. Chen
Y. Chen
Y.Y. Chen
Z. Chen
G. Cheng
X.Z. Cheng
S. Chiu
K.E. Cho
Y.-Y. Cho
B. Choi
J.K. Choi
D. Choinski
D.P. Chong
B. Chopard
M. Chover
I. Chung
M. Ciglan
B. Cogan
G. Cong
J. Corander
J.C. Corchado
O. Corcho
J. Cornil
H. Cota de Freitas

E. Coutinho
J.J. Cuadrado-Gallego
Y.F. Cui
J.C. Cunha
V. Curcin
A. Curioni
R. da Rosa Righi
S. Dalai
M. Daneva
S. Date
P. Dazzi
S. de Marchi
V. Debelov
E. Deelman
J. Della Dora
Y. Demazeau
Y. Demchenko
H. Deng
X.T. Deng
Y. Deng
M. Mat Deris
F. Desprez
M. Dewar
T. Dhaene
Z.R. Di
G. di Biasi
A. Diaz Guilera
P. Didier
I.T. Dimov
L. Ding
G.D. Dobrowolski
T. Dokken
J.J. Dolado
W. Dong
Y.-L. Dong
J. Dongarra
F. Donno
C. Douglas
G.J. Garcke
R.P. Mundani
R. Drezewski
D. Du
B. Duan
J.F. Dufourd
H. Dun

C. Earley
P. Edmond
T. Eitrich
A. El Rhalibi
T. Ernst
V. Ervin
D. Estrin
L. Eyraud-Dubois
J. Falcou
H. Fang
Y. Fang
X. Fei
Y. Fei
R. Feng
M. Fernandez
K. Fisher
C. Fittschen
G. Fox
F. Freitas
T. Friesz
K. Fuerlinger
M. Fujimoto
T. Fujinami
W. Funika
T. Furumura
A. Galvez
L.J. Gao
X.S. Gao
J.E. Garcia
H.J. Gardner
M. Garre
G. Garsva
F. Gava
G. Geethakumari
M. Geimer
J. Geiser
J.-P. Gelas
A. Gerbessiotis
M. Gerndt
S. Gimelshein
S.G. Girdzijauskas
S. Girtelschmid
Z. Gj
C. Glasner
A. Goderis


D. Godoy
J. Golebiowski
S. Gopalakrishnan
Y. Gorbachev
A.M. Goscinski
M. Govindaraju
E. Grabska
G.A. Gravvanis
C.H. Grelck
D.J. Groen
L. Gross
P. Gruer
A. Grzech
J.F. Gu
Y. Guang Xue
T. Gubala
V. Guevara-Masis
C.H. Guo
X. Guo
Z.Q. Guo
L. Guohui
C. Gupta
I. Gutman
A. Haffegee
K. Han
M. Hardt
A. Hasson
J. He
J. He
K. He
T. He
J. He
M.R. Head
P. Heinzlreiter
H. Chojnacki
J. Heo
S. Hirokawa
G. Hliniak
L. Hluchy
T.B. Ho
A. Hoekstra
W. Homann
A. Hoheisel
J. Hong
Z. Hong

D. Horvath
F. Hu
L. Hu
X. Hu
X.H. Hu
Z. Hu
K. Hua
H.W. Huang
K.-Y. Huang
L. Huang
L. Huang
M.S. Huang
S. Huang
T. Huang
W. Huang
Y. Huang
Z. Huang
Z. Huang
B. Huber
E. Hubo
J. Hulliger
M. Hultell
M. Humphrey
P. Hurtado
J. Huysmans
T. Ida
A. Iglesias
K. Iqbal
D. Ireland
N. Ishizawa
I. Lukovits
R. Jamieson
J.K. Jan
P. Janderka
M. Jankowski
L. Jantschi
S.J.K. Jensen
N.J. Jeon
T.H. Jeon
T. Jeong
H. Ji
X. Ji
D.Y. Jia
C. Jiang
H. Jiang


M.J. Jiang
P. Jiang
W. Jiang
Y. Jiang
H. Jin
J. Jin
L. Jingling
G.-S. Jo
D. Johnson
J. Johnstone
J.J. Jung
K. Juszczyszyn
J.A. Kaandorp
M. Kabelac
B. Kadlec
R. Kakkar
C. Kameyama
B.D. Kandhai
S. Kandl
K. Kang
S. Kato
S. Kawata
T. Kegl
W.A. Kelly
J. Kennedy
G. Khan
J.B. Kido
C.H. Kim
D.S. Kim
D.W. Kim
H. Kim
J.G. Kim
J.H. Kim
M. Kim
T.H. Kim
T.W. Kim
P. Kiprof
R. Kirner
M. Kisiel-Dorohinicki
J. Kitowski
C.R. Kleijn
M. Kluge
A. Knüpfer
I.S. Ko
Y. Ko


R. Kobler
B. Koblitz
G.A. Kochenberger
M. Koda
T. Koeckerbauer
M. Koehler
I. Kolingerova
V. Korkhov
T. Korkmaz
L. Kotulski
G. Kou
J. Kozlak
M. Krafczyk
D. Kranzlmüller
B. Kryza
V.V. Krzhizhanovskaya
M. Kunze
D. Kurzyniec
E. Kusmierek
S. Kwang
Y. Kwok
F. Kyriakopoulos
H. Labiod
A. Lagana
H. Lai
S. Lai
Z. Lan
G. Le Mahec
B.G. Lee
C. Lee
H.K. Lee
J. Lee
J. Lee
J.H. Lee
S. Lee
S.Y. Lee
V. Lee
Y.H. Lee
L. Lefevre
L. Lei
F. Lelj
A. Lesar
D. Lesthaeghe
Z. Levnajic
A. Lewis

A. Li
D. Li
D. Li
E. Li
J. Li
J. Li
J.P. Li
M. Li
P. Li
X. Li
X.M. Li
X.S. Li
Y. Li
Y. Li
J. Liang
L. Liang
W.K. Liao
X.F. Liao
G.G. Lim
H.W. Lim
S. Lim
A. Lin
I.C. Lin
I-C. Lin
Y. Lin
Z. Lin
P. Lingras
C.Y. Liu
D. Liu
D.S. Liu
E.L. Liu
F. Liu
G. Liu
H.L. Liu
J. Liu
J.C. Liu
R. Liu
S.Y. Liu
W.B. Liu
X. Liu
Y. Liu
Y. Liu
Y. Liu
Y. Liu
Y.J. Liu

Y.Z. Liu
Z.J. Liu
S.-C. Lo
R. Loogen
B. Lopez
A. López García de Lomana
F. Loulergue
G. Lu
J. Lu
J.H. Lu
M. Lu
P. Lu
S. Lu
X. Lu
Y.C. Lu
C. Lursinsap
L. Ma
M. Ma
T. Ma
A. Macedo
N. Maillard
M. Malawski
S. Maniccam
S.S. Manna
Z.M. Mao
M. Mascagni
E. Mathias
R.C. Maurya
V. Maxville
A.S. McGough
R. Mckay
T.-G. MCKenzie
K. Meenal
R. Mehrotra
M. Meneghin
F. Meng
M.F.J. Meng
E. Merkevicius
M. Metzger
Z. Michalewicz
J. Michopoulos
J.-C. Mignot
R. Mikusauskas
H.Y. Ming


G. Miranda Valladares
M. Mirua
G.P. Miscione
C. Miyaji
A. Miyoshi
J. Monterde
E.D. Moreno
G. Morra
J.T. Moscicki
H. Moshkovich
V.M. Moskaliova
G. Mounie
C. Mu
A. Muraru
H. Na
K. Nakajima
Y. Nakamori
S. Naqvi
S. Naqvi
R. Narayanan
A. Narjess
A. Nasri
P. Navaux
P.O.A. Navaux
M. Negoita
Z. Nemeth
L. Neumann
N.T. Nguyen
J. Ni
Q. Ni
K. Nie
G. Nikishkov
V. Nitica
W. Nocon
A. Noel
G. Norman
B. Ó Nualláin
N. O'Boyle
J.T. Oden
Y. Ohsawa
H. Okuda
D.L. Olson
C.W. Oosterlee
V. Oravec
S. Orlando

F.R. Ornellas
A. Ortiz
S. Ouyang
T. Owens
S. Oyama
B. Ozisikyilmaz
A. Padmanabhan
Z. Pan
Y. Papegay
M. Paprzycki
M. Parashar
K. Park
M. Park
S. Park
S.K. Pati
M. Pauley
C.P. Pautasso
B. Payne
T.C. Peachey
S. Pelagatti
F.L. Peng
Q. Peng
Y. Peng
N. Petford
A.D. Pimentel
W.A.P. Pinheiro
J. Pisharath
G. Pitel
D. Plemenos
S. Pllana
S. Ploux
A. Podoleanu
M. Polak
D. Prabu
B.B. Prahalada Rao
V. Prasanna
P. Praxmarer
V.B. Priezzhev
T. Priol
T. Prokosch
G. Pucciani
D. Puja
P. Puschner
L. Qi
D. Qin

H. Qin
K. Qin
R.X. Qin
X. Qin
G. Qiu
X. Qiu
J.Q. Quinqueton
M.R. Radecki
S. Radhakrishnan
S. Radharkrishnan
M. Ram
S. Ramakrishnan
P.R. Ramasami
P. Ramsamy
K.R. Rao
N. Ratnakar
T. Recio
K. Regenauer-Lieb
R. Rejas
F.Y. Ren
A. Rendell
P. Rhodes
J. Ribelles
M. Riedel
R. Rioboo
Y. Robert
G.J. Rodgers
A.S. Rodionov
D. Rodríguez García
C. Rodriguez Leon
F. Rogier
G. Rojek
L.L. Rong
H. Ronghuai
H. Rosmanith
F.-X. Roux
R.K. Roy
U. Rüde
M. Ruiz
T. Ruofeng
K. Rycerz
M. Ryoke
F. Safaei
T. Saito
V. Sakalauskas


L. Santillo
R. Santinelli
K. Sarac
H. Sarafian
M. Sarfraz
V.S. Savchenko
M. Sbert
R. Schaefer
D. Schmid
J. Schneider
M. Schoeberl
S.-B. Scholz
B. Schulze
S.R. Seelam
B. Seetharamanjaneyalu
J. Seo
K.D. Seo
Y. Seo
O.A. Serra
A. Sfarti
H. Shao
X.J. Shao
F.T. Sheldon
H.Z. Shen
S.L. Shen
Z.H. Sheng
H. Shi
Y. Shi
S. Shin
S.Y. Shin
B. Shirazi
D. Shires
E. Shook
Z.S. Shuai
M.A. Sicilia
M. Simeonidis
K. Singh
M. Siqueira
W. Sit
M. Skomorowski
A. Skowron
P.M.A. Sloot
M. Smolka
B.S. Sniezynski
H.Z. Sojka

A.E. Solomonides
C. Song
L.J. Song
S. Song
W. Song
J. Soto
A. Sourin
R. Srinivasan
V. Srovnal
V. Stankovski
P. Sterian
H. Stockinger
D. Stokic
A. Streit
B. Strug
P. Stuedi
A. Stümpel
S. Su
V. Subramanian
P. Suganthan
D.A. Sun
H. Sun
S. Sun
Y.H. Sun
Z.G. Sun
M. Suvakov
H. Suzuki
D. Szczerba
L. Szecsi
L. Szirmay-Kalos
R. Tadeusiewicz
B. Tadic
T. Takahashi
S. Takeda
J. Tan
H.J. Tang
J. Tang
S. Tang
T. Tang
X.J. Tang
J. Tao
M. Taufer
S.F. Tayyari
C. Tedeschi
J.C. Teixeira

F. Terpstra
C. Te-Yi
A.Y. Teymorian
D. Thalmann
A. Thandavan
L. Thompson
S. Thurner
F.Z. Tian
Y. Tian
Z. Tianshu
A. Tirado-Ramos
A. Tirumala
P. Tjeerd
W. Tong
A.S. Tosun
A. Tropsha
C. Troyer
K.C.K. Tsang
A.C. Tsipis
I. Tsutomu
A. Turan
P. Tvrdik
U. Ufuktepe
V. Uskov
B. Vaidya
E. Valakevicius
I.A. Valuev
S. Valverde
G.D. van Albada
R. van der Sman
F. van Lingen
A.J.C. Varandas
C. Varotsos
D. Vasyunin
R. Veloso
J. Vigo-Aguiar
J. Villà i Freixa
V. Vivacqua
E. Vumar
R. Walentkynski
D.W. Walker
B. Wang
C.L. Wang
D.F. Wang
D.H. Wang


F. Wang
F.L. Wang
H. Wang
H.G. Wang
H.W. Wang
J. Wang
J. Wang
J. Wang
J. Wang
J.H. Wang
K. Wang
L. Wang
M. Wang
M.Z. Wang
Q. Wang
Q.Q. Wang
S.P. Wang
T.K. Wang
W. Wang
W.D. Wang
X. Wang
X.J. Wang
Y. Wang
Y.Q. Wang
Z. Wang
Z.T. Wang
A. Wei
G.X. Wei
Y.-M. Wei
X. Weimin
D. Weiskopf
B. Wen
A.L. Wendelborn
I. Wenzel
A. Wibisono
A.P. Wierzbicki
R. Wismüller
F. Wolf
C. Wu
C. Wu
F. Wu
G. Wu
J.N. Wu
X. Wu
X.D. Wu

Y. Wu
Z. Wu
B. Wylie
M. Xavier Py
Y.M. Xi
H. Xia
H.X. Xia
Z.R. Xiao
C.F. Xie
J. Xie
Q.W. Xie
H. Xing
H.L. Xing
J. Xing
K. Xing
L. Xiong
M. Xiong
S. Xiong
Y.Q. Xiong
C. Xu
C.-H. Xu
J. Xu
M.W. Xu
Y. Xu
G. Xue
Y. Xue
Z. Xue
A. Yacizi
B. Yan
N. Yan
N. Yan
W. Yan
H. Yanami
C.T. Yang
F.P. Yang
J.M. Yang
K. Yang
L.T. Yang
L.T. Yang
P. Yang
X. Yang
Z. Yang
W. Yanwen
S. Yarasi
D.K.Y. Yau


P.-W. Yau
M.J. Ye
G. Yen
R. Yi
Z. Yi
J.G. Yim
L. Yin
W. Yin
Y. Ying
S. Yoo
T. Yoshino
W. Youmei
Y.K. Young-Kyu Han
J. Yu
J. Yu
L. Yu
Z. Yu
Z. Yu
W. Yu Lung
X.Y. Yuan
W. Yue
Z.Q. Yue
D. Yuen
T. Yuizono
J. Zambreno
P. Zarzycki
M.A. Zatevakhin
S. Zeng
A. Zhang
C. Zhang
D. Zhang
D.L. Zhang
D.Z. Zhang
G. Zhang
H. Zhang
H.R. Zhang
H.W. Zhang
J. Zhang
J.J. Zhang
L.L. Zhang
M. Zhang
N. Zhang
P. Zhang
P.Z. Zhang
Q. Zhang


S. Zhang
W. Zhang
W. Zhang
Y.G. Zhang
Y.X. Zhang
Z. Zhang
Z.W. Zhang
C. Zhao
H. Zhao
H.K. Zhao
H.P. Zhao
J. Zhao
M.H. Zhao
W. Zhao

Z. Zhao
L. Zhen
B. Zheng
G. Zheng
W. Zheng
Y. Zheng
W. Zhenghong
P. Zhigeng
W. Zhihai
Y. Zhixia
A. Zhmakin
C. Zhong
X. Zhong
K.J. Zhou

L.G. Zhou
X.J. Zhou
X.L. Zhou
Y.T. Zhou
H.H. Zhu
H.L. Zhu
L. Zhu
X.Z. Zhu
Z. Zhu
M. Zhu
J. Zivkovic
A. Zomaya
E.V. Zudilova-Seinstra

Workshop Organizers
Sixth International Workshop on Computer Graphics and Geometric
Modelling
A. Iglesias, University of Cantabria, Spain
Fifth International Workshop on Computer Algebra Systems and
Applications
A. Iglesias, University of Cantabria, Spain,
A. Galvez, University of Cantabria, Spain
PAPP 2007 - Practical Aspects of High-Level Parallel Programming
(4th International Workshop)
A. Benoit, ENS Lyon, France
F. Loulergue, LIFO, Orléans, France
International Workshop on Collective Intelligence for Semantic and
Knowledge Grid (CISKGrid 2007)
N.T. Nguyen, Wroclaw University of Technology, Poland
J.J. Jung, INRIA Rhône-Alpes, France
K. Juszczyszyn, Wroclaw University of Technology, Poland
Simulation of Multiphysics Multiscale Systems, 4th International
Workshop
V.V. Krzhizhanovskaya, Section Computational Science, University of
Amsterdam, The Netherlands
A.G. Hoekstra, Section Computational Science, University of Amsterdam,
The Netherlands


S. Sun, Clemson University, USA


J. Geiser, Humboldt University of Berlin, Germany
2nd Workshop on Computational Chemistry and Its Applications
(2nd CCA)
P.R. Ramasami, University of Mauritius
Efficient Data Management for HPC Simulation Applications
R.-P. Mundani, Technische Universität München, Germany
J. Abawajy, Deakin University, Australia
M. Mat Deris, Tun Hussein Onn College University of Technology, Malaysia
Real Time Systems and Adaptive Applications (RTSAA-2007)
J. Hong, Soongsil University, South Korea
T. Kuo, National Taiwan University, Taiwan
The International Workshop on Teaching Computational Science
(WTCS 2007)
L. Qi, Department of Information and Technology, Central China Normal
University, China
W. Yanwen, Department of Information and Technology, Central China Normal
University, China
W. Zhenghong, East China Normal University, School of Information Science
and Technology, China
GeoComputation
Y. Xue, IRSA, China
Risk Analysis
C.F. Huang, Beijing Normal University, China
Advanced Computational Approaches and IT Techniques in
Bioinformatics
M.A. Pauley, University of Nebraska at Omaha, USA
H.A. Ali, University of Nebraska at Omaha, USA
Workshop on Computational Finance and Business Intelligence
Y. Shi, Chinese Academy of Sciences, China
S.Y. Wang, Academy of Mathematical and System Sciences, Chinese Academy
of Sciences, China
X.T. Deng, Department of Computer Science, City University of Hong Kong,
China


Collaborative and Cooperative Environments


C. Anthes, Institute of Graphics and Parallel Processing, JKU, Austria
V.N. Alexandrov, ACET Centre, The University of Reading, UK
D. Kranzlmüller, Institute of Graphics and Parallel Processing, JKU, Austria
J. Volkert, Institute of Graphics and Parallel Processing, JKU, Austria
Tools for Program Development and Analysis in Computational
Science
A. Knüpfer, ZIH, TU Dresden, Germany
A. Bode, TU Munich, Germany
D. Kranzlmüller, Institute of Graphics and Parallel Processing, JKU, Austria
J. Tao, CAPP, University of Karlsruhe, Germany
R. Wismüller, FB12, BSVS, University of Siegen, Germany
J. Volkert, Institute of Graphics and Parallel Processing, JKU, Austria
Workshop on Mining Text, Semi-structured, Web or Multimedia
Data (WMTSWMD 2007)
G. Kou, Thomson Corporation, R&D, USA
Y. Peng, Omnium Worldwide, Inc., USA
J.P. Li, Institute of Policy and Management, Chinese Academy of Sciences, China
2007 International Workshop on Graph Theory, Algorithms and Its
Applications in Computer Science (IWGA 2007)
M. Li, Dalian University of Technology, China
2nd International Workshop on Workflow Systems in e-Science
(WSES 2007)
Z. Zhao, University of Amsterdam, The Netherlands
A. Belloum, University of Amsterdam, The Netherlands
2nd International Workshop on Internet Computing in Science and
Engineering (ICSE 2007)
J. Ni, The University of Iowa, USA
Workshop on Evolutionary Algorithms and Evolvable Systems
(EAES 2007)
B. Zheng, College of Computer Science, South-Central University for
Nationalities, Wuhan, China
Y. Li, State Key Lab. of Software Engineering, Wuhan University, Wuhan, China
J. Wang, College of Computer Science, South-Central University for
Nationalities, Wuhan, China
L. Ding, State Key Lab. of Software Engineering, Wuhan University, Wuhan,
China


Wireless and Mobile Systems 2007 (WMS 2007)


H. Choo, Sungkyunkwan University, South Korea
WAFTS: WAvelets, FracTals, Short-Range Phenomena – Computational Aspects
and Applications
C. Cattani, University of Salerno, Italy
C. Toma, Politehnica, Bucharest, Romania
Dynamic Data-Driven Application Systems - DDDAS 2007
F. Darema, National Science Foundation, USA
The Seventh International Workshop on Meta-synthesis and
Complex Systems (MCS 2007)
X.J. Tang, Academy of Mathematics and Systems Science, Chinese Academy of
Sciences, China
J.F. Gu, Institute of Systems Science, Chinese Academy of Sciences, China
Y. Nakamori, Japan Advanced Institute of Science and Technology, Japan
H.C. Wang, Shanghai Jiaotong University, China
The 1st International Workshop on Computational Methods in
Energy Economics
L. Yu, City University of Hong Kong, China
J. Li, Chinese Academy of Sciences, China
D. Qin, Guangdong Provincial Development and Reform Commission, China
High-Performance Data Mining
Y. Liu, Data Technology and Knowledge Economy Research Center, Chinese
Academy of Sciences, China
A. Choudhary, Electrical and Computer Engineering Department, Northwestern
University, USA
S. Chiu, Department of Computer Science, College of Engineering, Idaho State
University, USA
Computational Linguistics in Human-Computer Interaction
H. Ji, Sungkyunkwan University, South Korea
Y. Seo, Chungbuk National University, South Korea
H. Choo, Sungkyunkwan University, South Korea
Intelligent Agents in Computing Systems
K. Cetnarowicz, Department of Computer Science, AGH University of Science
and Technology, Poland
R. Schaefer, Department of Computer Science, AGH University of Science and
Technology, Poland


Networks: Theory and Applications


B. Tadic, Jozef Stefan Institute, Ljubljana, Slovenia
S. Thurner, COSY, Medical University Vienna, Austria
Workshop on Computational Science in Software Engineering
D. Rodríguez, University of Alcalá, Spain
J.J. Cuadrado-Gallego, University of Alcala, Spain
International Workshop on Advances in Computational
Geomechanics and Geophysics (IACGG 2007)
H.L. Xing, The University of Queensland and ACcESS Major National Research
Facility, Australia
J.H. Wang, Shanghai Jiao Tong University, China
2nd International Workshop on Evolution Toward Next-Generation
Internet (ENGI)
Y. Cui, Tsinghua University, China
Parallel Monte Carlo Algorithms for Diverse Applications in a
Distributed Setting
V.N. Alexandrov, ACET Centre, The University of Reading, UK
The 2007 Workshop on Scientific Computing in Electronics
Engineering (WSCEE 2007)
Y. Li, National Chiao Tung University, Taiwan
High-Performance Networked Media and Services 2007 (HiNMS
2007)
I.S. Ko, Dongguk University, South Korea
Y.J. Na, Honam University, South Korea

Table of Contents – Part II

Resolving Occlusion Method of Virtual Object in Simulation Using


Snake and Picking Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
JeongHee Cha, GyeYoung Kim, and HyungIl Choi

Graphics Hardware-Based Level-Set Method for Interactive


Segmentation and Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Helen Hong and Seongjin Park

Parameterization of Quadrilateral Meshes . . . . . . . . . . . . . . . . . . . . . . . . . . .


Li Liu, CaiMing Zhang, and Frank Cheng

17

Pose Insensitive 3D Retrieval by Poisson Shape Histogram . . . . . . . . . . . .


Pan Xiang, Chen Qi Hua, Fang Xin Gang, and Zheng Bo Chuan

25

Point-Sampled Surface Simulation Based on Mass-Spring System . . . . . . .


Zhixun Su, Xiaojie Zhou, Xiuping Liu, Fengshan Liu, and Xiquan Shi

33

Sweeping Surface Generated by a Class of Generalized Quasi-cubic


Interpolation Spline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benyue Su and Jieqing Tan

41

An Artificial Immune System Approach for B-Spline Surface


Approximation Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Erkan Ülker and Veysi İşler

49

Implicit Surface Reconstruction from Scattered Point Data with


Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jun Yang, Zhengning Wang, Changqian Zhu, and Qiang Peng

57

The Shannon Entropy-Based Node Placement for Enrichment and


Simplification of Meshes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Vladimir Savchenko, Maria Savchenko, Olga Egorova, and
Ichiro Hagiwara

65

Parameterization of 3D Surface Patches by Straightest Distances . . . . . . .


Sungyeol Lee and Haeyoung Lee

73

Facial Expression Recognition Based on Emotion Dimensions on


Manifold Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Young-suk Shin

81

AI Framework for Decision Modeling in Behavioral Animation of


Virtual Avatars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A. Iglesias and F. Luengo

89


Studies on Shape Feature Combination and Efficient Categorization of


3D Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Tianyang Lv, Guobao Liu, Jiming Pang, and Zhengxuan Wang

97

A Generalised-Mutual-Information-Based Oracle for Hierarchical


Radiosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jaume Rigau, Miquel Feixas, and Mateu Sbert

105

Rendering Technique for Colored Paper Mosaic . . . . . . . . . . . . . . . . . . . . . .


Youngsup Park, Sanghyun Seo, YongJae Gi, Hanna Song, and
Kyunghyun Yoon

114

Real-Time Simulation of Surface Gravity Ocean Waves Based on the


TMA Spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Namkyung Lee, Nakhoon Baek, and Kwan Woo Ryu

122

Determining Knots with Quadratic Polynomial Precision . . . . . . . . . . . . . .


Zhang Caiming, Ji Xiuhua, and Liu Hui

130

Interactive Cartoon Rendering and Sketching of Clouds and Smoke . . . . .

Eduardo J. Álvarez, Celso Campos, Silvana G. Meire,
Ricardo Quirós, Joaquín Huerta, and Michael Gould

138

Spherical Binary Images Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Liu Wei and He Yuanjun

146

Dynamic Data Path Prediction in Network Virtual Environment . . . . . . .


Sun-Hee Song, Seung-Moon Jeong, Gi-Taek Hur, and Sang-Dong Ra

150

Modeling Inlay/Onlay Prostheses with Mesh Deformation Techniques . . .


Kwan-Hee Yoo, Jong-Sung Ha, and Jae-Soo Yoo

154

Automatic Generation of Virtual Computer Rooms on the Internet


Using X3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Aybars Uğur and Tahir Emre Kalaycı

158

Stained Glass Rendering with Smooth Tile Boundary . . . . . . . . . . . . . . . . .


SangHyun Seo, HoChang Lee, HyunChul Nah, and KyungHyun Yoon

162

Guaranteed Adaptive Antialiasing Using Interval Arithmetic . . . . . . . . . .


Jorge Flórez, Mateu Sbert, Miguel A. Sainz, and Josep Vehí

166

Restricted Non-cooperative Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Seth J. Chandler

170

A New Application of CAS to LaTeX Plottings . . . . . . . . . . . . . . . . . . . . . .


Masayoshi Sekiguchi, Masataka Kaneko, Yuuki Tadokoro,
Satoshi Yamashita, and Setsuo Takato

178


JMathNorm: A Database Normalization Tool Using Mathematica . . . . . .


Ali Yazici and Ziya Karakaya

186

Symbolic Manipulation of Bspline Basis Functions with Mathematica . . .


A. Iglesias, R. Ipanaque, and R.T. Urbina

194

Rotating Capacitor and a Transient Electric Network . . . . . . . . . . . . . . . . .


Haiduke Sarafian and Nenette Sarafian

203

Numerical-Symbolic Matlab Program for the Analysis of


Three-Dimensional Chaotic Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Akemi Gálvez

211

Safety of Recreational Water Slides: Numerical Estimation of the


Trajectory, Velocities and Accelerations of Motion of the Users . . . . . . . .
Piotr Szczepaniak and Ryszard Walentyński

219

Computing Locus Equations for Standard Dynamic Geometry


Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Francisco Botana, Miguel A. Abánades, and Jesús Escribano

227

Symbolic Computation of Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Andres Iglesias and Sinan Kapcak

235

Dynaput: Dynamic Input Manipulations for 2D Structures of


Mathematical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Deguchi Hiroaki

243

On the Virtues of Generic Programming for Symbolic Computation . . . .

Xin Li, Marc Moreno Maza, and Eric


Schost

251

Semi-analytical Approach for Analyzing Vibro-Impact Systems . . . . . . . .

Algimantas Čepulkauskas, Regina Kulvietienė,
Genadijus Kulvietis, and Jūratė Mikučionienė

259

Formal Verification of Analog and Mixed Signal Designs in


Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Mohamed H. Zaki, Ghiath Al-Sammane, and Sofiène Tahar
Efficient Computations of Irredundant Triangular Decompositions with
the RegularChains Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Changbo Chen, Francois Lemaire, Marc Moreno Maza,
Wei Pan, and Yuzhen Xie
Characterisation of the Surfactant Shell Stabilising Calcium Carbonate
Dispersions in Overbased Detergent Additives: Molecular Modelling
and Spin-Probe-ESR Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Francesco Frigerio and Luciano Montanari

263

268

272


Hydrogen Adsorption and Penetration of Cx (x=58-62) Fullerenes with


Defects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Xin Yue, Jijun Zhao, and Jieshan Qiu

280

Ab Initio and DFT Investigations of the Mechanistic Pathway of


Singlet Bromocarbenes Insertion into C-H Bonds of Methane and
Ethane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
M. Ramalingam, K. Ramasami, P. Venuvanalingam, and
J. Swaminathan

288

Theoretical Gas Phase Study of the Gauche and Trans Conformers of


1-Bromo-2-Chloroethane and Solvent Eects . . . . . . . . . . . . . . . . . . . . . . . .
Ponnadurai Ramasami

296

Dynamics Simulation of Conducting Polymer Interchain Interaction


Eects on Polaron Transition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jose Rildo de Oliveira Queiroz and Geraldo Magela e Silva

304

Cerium (III) Complexes Modeling with Sparkle/PM3 . . . . . . . . . . . . . . . . .


Alfredo Mayall Simas, Ricardo Oliveira Freire, and
Gerd Bruno Rocha

312

The Design of Blue Emitting Materials Based on Spirosilabifluorene


Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Miao Sun, Ben Niu, and Jingping Zhang

319

Regulative Eect of Water Molecules on the Switches of


Guanine-Cytosine (GC) Watson-Crick Pair . . . . . . . . . . . . . . . . . . . . . . . . . .
Hongqi Ai, Xian Peng, Yun Li, and Chong Zhang

327

Energy Partitioning Analysis of the Chemical Bonds in mer-Mq3


(M = AlIII , GaIII , InIII , TlIII ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Ruihai Cui and Jingping Zhang

331

Ab Initio Quantum Chemical Studies of Six-Center Bond Exchange


Reactions Among Halogen and Halogen Halide Molecules . . . . . . . . . . . . .
I. Noorbatcha, B. Arifin, and S.M. Zain

335

Comparative Analysis of the Interaction Networks of HIV-1 and Human


Proteins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kyungsook Han and Byungkyu Park

339

Protein Classification from Protein-Domain and Gene-Ontology


Annotation Information Using Formal Concept Analysis . . . . . . . . . . . . . .
Mi-Ryung Han, Hee-Joon Chung, Jihun Kim,
Dong-Young Noh, and Ju Han Kim
A Supervised Classifier Based on Artificial Immune System . . . . . . . . . . . .
Lingxi Peng, Yinqiao Peng, Xiaojie Liu, Caiming Liu, Jinquan Zeng,
Feixian Sun, and Zhengtian Lu

347

355


Ab-origin: An Improved Tool of Heavy Chain Rearrangement Analysis


for Human Immunoglobulin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Xiaojing Wang, Wu Wei, SiYuan Zheng, Z.W. Cao, and Yixue Li
Analytically Tuned Simulated Annealing Applied to the Protein
Folding Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Juan Frausto-Solís, E.F. Román, David Romero,
Xavier Soberón, and Ernesto Liñán-García
Training the Hidden Vector State Model from Un-annotated Corpus . . . .
Deyu Zhou, Yulan He, and Chee Keong Kwoh
Using Computer Simulation to Understand Mutation Accumulation
Dynamics and Genetic Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
John Sanford, John Baumgardner, Wes Brewer, Paul Gibson, and
Walter ReMine


363

370

378

386

An Object Model Based Repository for Biological Pathways Using


XML Database Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Keyuan Jiang

393

Protein Folding Simulation with New Move Set in 3D Lattice Model . . . .


X.-M. Li

397

A Dynamic Committee Scheme on Multiple-Criteria Linear


Programming Classification Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Meihong Zhu, Yong Shi, Aihua Li, and Jing He

401

Kimberlites Identification by Classification Methods . . . . . . . . . . . . . . . . . .


Yaohui Chai, Aihua Li, Yong Shi, Jing He, and Keliang Zhang

409

A Fast Method for Pricing Early-Exercise Options with the FFT . . . . . . .


R. Lord, F. Fang, F. Bervoets, and C.W. Oosterlee

415

Neural-Network-Based Fuzzy Group Forecasting with Application to


Foreign Exchange Rates Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Lean Yu, Kin Keung Lai, and Shouyang Wang

423

Credit Risk Evaluation Using Support Vector Machine with Mixture of


Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Liwei Wei, Jianping Li, and Zhenyu Chen

431

Neuro-discriminate Model for the Forecasting of Changes of Companies'


Financial Standings on the Basis of Self-organizing Maps . . . . . . . . . . . . . .
Egidijus Merkevicius, Gintautas Garsva, and Rimvydas Simutis

439

A New Computing Method for Greeks Using Stochastic Sensitivity


Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Masato Koda

447


Application of Neural Networks for Foreign Exchange Rates Forecasting


with Noise Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wei Huang, Kin Keung Lai, and Shouyang Wang

455

An Experiment with Fuzzy Sets in Data Mining . . . . . . . . . . . . . . . . . . . . .


David L. Olson, Helen Moshkovich, and Alexander Mechitov

462

An Application of Component-Wise Iterative Optimization to


Feed-Forward Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Yachen Lin

470

ERM-POT Method for Quantifying Operational Risk for Chinese


Commercial Banks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Fanjun Meng, Jianping Li, and Lijun Gao

478

Building Behavior Scoring Model Using Genetic Algorithm and Support


Vector Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Defu Zhang, Qingshan Chen, and Lijun Wei

482

An Intelligent CRM System for Identifying High-Risk Customers:


An Ensemble Data Mining Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kin Keung Lai, Lean Yu, Shouyang Wang, and Wei Huang

486

The Characteristic Analysis of Web User Clusters Based on Frequent


Browsing Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Zhiwang Zhang and Yong Shi

490

A Two-Phase Model Based on SVM and Conjoint Analysis for Credit


Scoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kin Keung Lai, Ligang Zhou, and Lean Yu

494

A New Multi-Criteria Quadratic-Programming Linear Classification


Model for VIP E-Mail Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Peng Zhang, Juliang Zhang, and Yong Shi

499

Efficient Implementation of an Optimal Interpolator for Large Spatial


Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Nargess Memarsadeghi and David M. Mount

503

Development of an Efficient Conversion System for GML Documents . . .


Dong-Suk Hong, Hong-Koo Kang, Dong-Oh Kim, and Ki-Joon Han

511

Effective Spatial Characterization System Using Density-Based


Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Chan-Min Ahn, Jae-Hyun You, Ju-Hong Lee, and Deok-Hwan Kim

515

MTF Measurement Based on Interactive Live-Wire Edge Extraction . . . .


Peng Liu, Dingsheng Liu, and Fang Huang

523


Research on Technologies of Spatial Configuration Information


Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Haibin Sun
Modelbase System in Remote Sensing Information Analysis and Service
Grid Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Yong Xue, Lei Zheng, Ying Luo, Jianping Guo, Wei Wan,
Wei Wei, and Ying Wang


531

538

Density Based Fuzzy Membership Functions in the Context of


Geocomputation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Victor Lobo, Fernando Bação, and Miguel Loureiro

542

A New Method to Model Neighborhood Interaction in Cellular


Automata-Based Urban Geosimulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Yaolong Zhao and Yuji Murayama

550

Artificial Neural Networks Application to Calculate Parameter Values


in the Magnetotelluric Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Andrzej Bielecki, Tomasz Danek, Janusz Jagodzi
nski, and
Marek Wojdyla
Integrating Ajax into GIS Web Services for Performance
Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Seung-Jun Cha, Yun-Young Hwang, Yoon-Seop Chang,
Kyung-Ok Kim, and Kyu-Chul Lee
Aerosol Optical Thickness Retrieval over Land from MODIS Data on
Remote Sensing Information Service Grid Node . . . . . . . . . . . . . . . . . . . . . .
Jianping Guo, Yong Xue, Ying Wang, Yincui Hu, Jianqin Wang,
Ying Luo, Shaobo Zhong, Wei Wan, Lei Zheng, and Guoyin Cai

558

562

569

Universal Execution of Parallel Processes: Penetrating NATs over the


Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Insoon Jo, Hyuck Han, Heon Y. Yeom, and Ohkyoung Kwon

577

Parallelization of C# Programs Through Annotations . . . . . . . . . . . . . . . .


Cristian Dittamo, Antonio Cisternino, and Marco Danelutto

585

Fine Grain Distributed Implementation of a Dataflow Language with


Provable Performances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Thierry Gautier, Jean-Louis Roch, and Frederic Wagner

593

Efficient Parallel Tree Reductions on Distributed Memory


Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kazuhiko Kakehi, Kiminori Matsuzaki, and Kento Emoto

601

Efficient Implementation of Tree Accumulations on Distributed-Memory


Parallel Computers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kiminori Matsuzaki

609


SymGrid-Par: Designing a Framework for Executing Computational


Algebra Systems on Computational Grids . . . . . . . . . . . . . . . . . . . . . . . . . . .
Abdallah Al Zain, Kevin Hammond, Phil Trinder, Steve Linton,
Hans-Wolfgang Loidl, and Marco Costanti

617

Directed Network Representation of Discrete Dynamical Maps . . . . . . . . .


Fragiskos Kyriakopoulos and Stefan Thurner

625

Dynamical Patterns in Scalefree Trees of Coupled 2D Chaotic Maps . . . .


Zoran Levnajic and Bosiljka Tadic

633

Simulation of the Electron Tunneling Paths in Networks of


Nano-particle Films . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Milovan Šuvakov and Bosiljka Tadić

641

Classification of Networks Using Network Functions . . . . . . . . . . . . . . . . . .


Makoto Uchida and Susumu Shirayama

649

Effective Algorithm for Detecting Community Structure in Complex


Networks Based on GA and Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Xin Liu, Deyi Li, Shuliang Wang, and Zhiwei Tao

657

Mixed Key Management Using Hamming Distance for Mobile Ad-Hoc


Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Seok-Lae Lee, In-Kyung Jeun, and Joo-Seok Song

665

An Integrated Approach for QoS-Aware Multicast Tree Maintenance . . .


Wu-Hong Tsai and Yuan-Sun Chu

673

A Categorial Context with Default Reasoning Approach to


Heterogeneous Ontology Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Ruliang Xiao and Shengqun Tang

681

An Interval Lattice Model for Grid Resource Searching . . . . . . . . . . . . . . .


Wen Zhou, Zongtian Liu, and Yan Zhao

689

Topic Maps Matching Computation Based on Composite Matchers . . . . .


Jungmin Kim and Hyunsook Chung

696

Social Mediation for Collective Intelligence in a Large Multi-agent


Communities: A Case Study of AnnotGrid . . . . . . . . . . . . . . . . . . . . . . . . . .
Jason J. Jung and Geun-Sik Jo

704

Metadata Management in S-OGSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Oscar Corcho, Pinar Alper, Paolo Missier, Sean Bechhofer,
Carole Goble, and Wei Xing
Access Control Model Based on RDB Security Policy for OWL
Ontology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Dongwon Jeong, Yixin Jing, and Doo-Kwon Baik

712

720


Semantic Fusion for Query Processing in Grid Environment . . . . . . . . . . .


Jinguang Gu
SOF: A Slight Ontology Framework Based on Meta-modeling for
Change Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Li Na Fang, Sheng Qun Tang, Ru Liang Xiao, Ling Li, You Wei Xu,
Yang Xu, Xin Guo Deng, and Wei Qing Chen
Data Forest: A Collaborative Version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Ronan Jamieson, Adrian Haffegee, Priscilla Ramsamy, and
Vassil Alexandrov
NetODrom – An Example for the Development of Networked
Immersive VR Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Christoph Anthes, Alexander Wilhelm, Roland Landertshamer,
Helmut Bressler, and Jens Volkert

XXXI

728

736

744

752

Intelligent Assembly/Disassembly System with a Haptic Device for


Aircraft Parts Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Christiand and Jungwon Yoon

760

Generic Control Interface for Networked Haptic Virtual


Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Priscilla Ramsamy, Adrian Haffegee, and Vassil Alexandrov

768

Physically-Based Interaction for Networked Virtual Environments . . . . . .


Christoph Anthes, Roland Landertshamer, and Jens Volkert

776

Middleware in Modern High Performance Computing System


Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Christian Engelmann, Hong Ong, and Stephen L. Scott

784

Usability Evaluation in Task Orientated Collaborative Environments . . .


Florian Urmetzer and Vassil Alexandrov

792

Developing Motivating Collaborative Learning Through Participatory


Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Gustavo Zurita, Nelson Baloian, Felipe Baytelman, and
Antonio Farias

799

A Novel Secure Interoperation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Li Jin and Zhengding Lu

808

Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale


Shared Memory Multiprocessors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Karl Fürlinger, Michael Gerndt, and Jack Dongarra

815

Analysis of Linux Scheduling with VAMPIR . . . . . . . . . . . . . . . . . . . . . . . . .


Michael Kluge and Wolfgang E. Nagel

823


An Interactive Graphical Environment for Code Optimization . . . . . . . . .


Jie Tao, Thomas Dressler, and Wolfgang Karl

831

Memory Allocation Tracing with VampirTrace . . . . . . . . . . . . . . . . . . . . . . .


Matthias Jurenz, Ronny Brendel, Andreas Knüpfer,
Matthias Müller, and Wolfgang E. Nagel

839

Automatic Memory Access Analysis with Periscope . . . . . . . . . . . . . . . . . .


Michael Gerndt and Edmond Kereku

847

A Regressive Problem Solver That Uses Knowledgelet . . . . . . . . . . . . . . . .


Kuodi Jian

855

Resource Management in a Multi-agent System by Means of


Reinforcement Learning and Supervised Rule Learning . . . . . . . . . . . . . . .
Bartłomiej Śnieżyński

864

Learning in Cooperating Agents Environment as a Method of Solving


Transport Problems and Limiting the Effects of Crisis Situations . . . . . . .
Jaroslaw Kozlak

872

Distributed Adaptive Design with Hierarchical Autonomous Graph


Transformation Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Leszek Kotulski and Barbara Strug

880

Integration of Biological, Psychological, and Social Aspects in


Agent-Based Simulation of a Violent Psychopath . . . . . . . . . . . . . . . . . . . . .
Tibor Bosse, Charlotte Gerritsen, and Jan Treur

888

A Rich Servants Service Model for Pervasive Computing . . . . . . . . . . . . . .


Huai-dong Shi, Ming Cai, Jin-xiang Dong, and Peng Liu

896

Techniques for Maintaining Population Diversity in Classical and


Agent-Based Multi-objective Evolutionary Algorithms . . . . . . . . . . . . . . . .
Rafal Drezewski and Leszek Siwik

904

Agents Based Hierarchical Parallelization of Complex Algorithms on


the Example of hp Finite Element Method . . . . . . . . . . . . . . . . . . . . . . . . . .
M. Paszy
nski

912

Sexual Selection Mechanism for Agent-Based Evolutionary


Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rafal Drezewski and Krzysztof Cetnarowicz

920

Agent-Based Evolutionary and Immunological Optimization . . . . . . . . . . .


Aleksander Byrski and Marek Kisiel-Dorohinicki

928

Strategy Description for Mobile Embedded Control Systems Exploiting


the Multi-agent Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Vilem Srovnal, Bohumil Hor
ak, V
aclav Sn
asel, Jan Martinovic,
Pavel Kr
omer, and Jan Platos

936

Table of Contents Part II

XXXIII

Agent-Based Modeling of Supply Chains in Critical Situations . . . . . . . . .


Jaroslaw Kozlak, Grzegorz Dobrowolski, and Edward Nawarecki

944

Web-Based Integrated Service Discovery Using Agent Platform for


Pervasive Computing Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Kyu Min Lee, Dong-Uk Kim, Kee-Hyun Choi, and Dong-Ryeol Shin

952

A Novel Modeling Method for Cooperative Multi-robot Systems Using


Fuzzy Timed Agent Based Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Hua Xu and Peifa Jia

956

Performance Evaluation of Fuzzy Ant Based Routing Method for


Connectionless Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Seyed Javad Mirabedini and Mohammad Teshnehlab

960

Service Agent-Based Resource Management Using Virtualization for


Computational Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sung Ho Jang and Jong Sik Lee

966

Fuzzy-Aided Syntactic Scene Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Marzena Bielecka and Marek Skomorowski

970

Agent Based Load Balancing Middleware for Service-Oriented


Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jun Wang, Yi Ren, Di Zheng, and Quan-Yuan Wu

974

A Transformer Condition Assessment System Based on Data Warehouse


and Data Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Xueyu Li, Lizeng Wu, Jinsha Yuan, and Yinghui Kong

978

Shannon Wavelet Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Carlo Cattani

982

Wavelet Analysis of Bifurcation in a Competition Model . . . . . . . . . . . . . .


Carlo Cattani and Ivana Bochicchio

990

Evolution of a Spherical Universe in a Short Range Collapse/Generation


Interval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Ivana Bochicchio and Ettore Laserra

997

On the Dierentiable Structure of Meyer Wavelets . . . . . . . . . . . . . . . . . . . 1004


Carlo Cattani and Luis M. S
anchez Ruiz
Towards Describing Multi-fractality of Trac Using Local Hurst
Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1012
Ming Li, S.C. Lim, Bai-Jiong Hu, and Huamin Feng
A Further Characterization on the Sampling Theorem for Wavelet
Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1021
Xiuzhen Li and Deyun Yang

XXXIV

Table of Contents Part II

Characterization on Irregular Tight Wavelet Frames with Matrix


Dilations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1029
Deyun Yang, Zhengliang Huan, Zhanjie Song, and Hongxiang Yang
Feature Extraction of Seal Imprint Based on the Double-Density
Dual-Tree DWT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1037
Li Runwu, Fang Zhijun, Wang Shengqian, and Yang Shouyuan
Vanishing Waves on Semi-closed Space Intervals and Applications in
Mathematical Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1045
Ghiocel Toma
Modelling Short Range Alternating Transitions by Alternating
Practical Test Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1053
Stefan Pusca
Dierent Structural Patterns Created by Short Range Variations of
Internal Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1060
Flavia Doboga
Dynamic Error of Heat Measurement in Transient . . . . . . . . . . . . . . . . . . . . 1067
Fang Lide, Li Jinhai, Cao Suosheng, Zhu Yan, and Kong Xiangjie
Truncation Error Estimate on Random Signals by Local Average . . . . . . . 1075
Gaiyun He, Zhanjie Song, Deyun Yang, and Jianhua Zhu
A Numerical Solutions Based on the Quasi-wavelet Analysis . . . . . . . . . . . 1083
Z.H. Huang, L. Xia, and X.P. He
Plant Simulation Based on Fusion of L-System and IFS . . . . . . . . . . . . . . . 1091
Jinshu Han
A System Behavior Analysis Technique with Visualization of a
Customers Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1099
Shoichi Morimoto
Research on Dynamic Updating of Grid Service . . . . . . . . . . . . . . . . . . . . . . 1107
Jiankun Wu, Linpeng Huang, and Dejun Wang
Software Product Line Oriented Feature Map . . . . . . . . . . . . . . . . . . . . . . . . 1115
Yiyuan Li, Jianwei Yin, Dongcai Shi, Ying Li, and Jinxiang Dong
Design and Development of Software Conguration Management Tool
to Support Process Performance Monitoring and Analysis . . . . . . . . . . . . . 1123
Alan Cline, Eun-Pyo Lee, and Byong-Gul Lee
Data Dependency Based Recovery Approaches in Survival Database
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1131
Jiping Zheng, Xiaolin Qin, and Jin Sun

Table of Contents Part II

XXXV

Usage-Centered Interface Design for Quality Improvement . . . . . . . . . . . . . 1139


Chang-Mog Lee, Ok-Bae Chang, and Samuel Sangkon Lee
Description Logic Representation for Requirement Specication . . . . . . . . 1147
Yingzhou Zhang and Weifeng Zhang
Ontologies and Software Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1155
Waralak V. Siricharoen
Epistemological and Ontological Representation in Software
Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1162
J. Cuadrado-Gallego, D. Rodrguez, M. Garre, and R. Rejas
Exploiting Morpho-syntactic Features for Verb Sense Distinction in
KorLex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1170
Eunryoung Lee, Ae-sun Yoon, and Hyuk-Chul Kwon
Chinese Ancient-Modern Sentence Alignment . . . . . . . . . . . . . . . . . . . . . . . . 1178
Zhun Lin and Xiaojie Wang
A Language Modeling Approach to Sentiment Analysis . . . . . . . . . . . . . . . 1186
Yi Hu, Ruzhan Lu, Xuening Li, Yuquan Chen, and Jianyong Duan
Processing the Mixed Properties of Light Verb Constructions . . . . . . . . . . 1194
Jong-Bok Kim and Kyung-Sup Lim
Concept-Based Question Analysis for an Ecient Document Ranking . . . 1202
Seung-Eun Shin, Young-Min Ahn, and Young-Hoon Seo
Learning Classier System Approach to Natural Language Grammar
Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1210
Olgierd Unold
Text Retrieval Oriented Auto-construction of Conceptual
Relationship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1214
Yi Hu, Ruzhan Lu, Yuquan Chen, and Bingzhen Pei
Filtering Methods for Feature Selection in Web-Document Clustering . . . 1218
Heum Park and Hyuk-Chul Kwon
A Korean Part-of-Speech Tagging System Using Resolution Rules for
Individual Ambiguous Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1222
Young-Min Ahn, Seung-Eun Shin, Hee-Geun Park,
Hyungsuk Ji, and Young-Hoon Seo
An Interactive User Interface for Text Display . . . . . . . . . . . . . . . . . . . . . . . 1226
Hyungsuk Ji and Hyunseung Choo
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1231

Resolving Occlusion Method of Virtual Object in


Simulation Using Snake and Picking Algorithm
JeongHee Cha, GyeYoung Kim, and HyungIl Choi
Information and media institute, School of Computing, School of Media,
Soongsil University, Sangdo 5 Dong, DongJak Gu, Seoul, Korea
pelly@vision.ssu.ac.kr, {gykim1,hic}@ssu.ac.kr

Abstract. For realistic simulation, it is essential to register the two worlds,


calculate the occlusion realm between the real world and the virtual object, and
determine the location of the virtual object based on the calculation. However,
if the constructed map is not accurate or the density is not sufficient to estimate
the occlusion boundary, it is very difficult to determine the occlusion realm. In
order to solve this problem, this paper proposes a new method for calculating
the occlusion realm using the snake and picking algorithms. First, the wireframe generated from the DEM was registered with the CCD image using visual clues to acquire 3D information of the experimental realm, and the 3D information was then calculated at the points where the occlusion problem arises for a moving target. The validity of the proposed approach in an environment where partial occlusion occurs has been demonstrated by an experiment.
Keywords: Occlusion, Snake, Picking, DEM, Augmented Reality, Simulation.

1 Introduction
Augmented reality is an area of technology that has originated in virtual reality. While
virtual reality offers a virtual world in which users are completely immersed,
augmented reality offers virtual objects on the basis of real world images. At present,
augmented reality technology is being researched and applied to various areas
including the military, medicine, education, construction, games, and broadcasting.
This paper studies the development of a realistic simulated training model that displays virtual targets in the input images of a CCD camera mounted on a tank and determines the occlusion areas generated as virtual objects are created and moved along a path according to a scenario. Augmented reality has three general characteristics: image registration, interaction, and real time [1]. Image registration refers to matching the locations of the real-world objects that users watch with the related virtual objects, and real time refers to performing image registration and interaction in real time. Interaction implies that the combination of virtual objects and the objects in real images must be harmonized with the surrounding environment in a realistic manner; it covers the determination of occlusion areas according to the changed location or line of sight of the observer and the re-rendering of virtual objects after detection of collisions. However, to solve the problems of


occlusion such as the hiding of farther virtual objects by closer objects and the
covering of objects in real images by other objects, the two worlds must be accurately
coordinated and then the depth of the actual scene must be compared with the depth
of virtual objects[2][3]. But if the accuracy or density of the created map is
insufficient to estimate the boundary of occlusion area, it is difficult to determine the
occlusion area. To solve this problem, first, we created a 3D wireframe using the
DEM of the experiment area and then registered it using CCD camera images and visual clues. Second, to solve the occlusion problem by accurately estimating the boundary regardless of the density of the map, this paper also proposes a method to obtain the reference 3D information of the occlusion points using the Snake algorithm and the Picking algorithm, and then to infer the 3D information of the other boundary points using the proportional relations between the 2D image and the 3D DEM. Third, to improve processing speed, we suggest a method that compares the MER (Minimum Enclosing Rectangle) of the objects in the camera's angle of vision with the MER of the virtual target. Fig. 1 shows the proposed system framework.

Fig. 1. Proposed System Framework

2 Methodology
2.1 Formation of Wireframe Using DEM and Registration with Real Images
Using Visual Clues
The topographical information DEM (Digital Elevation Model) is used to map the
real world coordinates to each point of the 2D CCD image. DEM has information on
the latitude and longitude coordinates expressed in X and Y and heights in fixed
interval. The DEM used for this experiment is a grid-type DEM which had been
produced to have the height information for 2D coordinates in 1M interval for the
limited experiment area of 300 m x 300 m. The DEM data are read to create a mesh


with the vertexes of each rectangle and a wireframe with 3D depth information, as shown in Fig. 2 [4][5]. This is overlaid on the sensor image to check the registration, and visual clues are used to move the image up, down, left or right as shown in Fig. 3, thus reducing error. Based on this initial registered location, the location changes caused by
movement of vehicles were frequently updated using GPS (Global Positioning
System) and INS (Inertial Navigation System).

Fig. 2. Wireframe Creation using DEM

Fig. 3. Registration of Two Worlds using Visual Clues

2.2 Extracting the Outline of Objects and Acquiring 3D Information


The Snake algorithm [6][7] is a method of finding the outline of an object by repeatedly moving the snake vertexes input by the user in the direction that minimizes an energy function. The energy function is shown in Expression (1). As the energy function is calculated over a discrete space, the parameters of each energy term are the coordinates of each vertex in the image. In Expression (1), $v(s)$ is the snake point, and $v(s) = (x(s), y(s))$, where $x(s)$ and $y(s)$ refer to the x and y positions of the snake point in the image. Also, $\alpha$, $\beta$ and $\gamma$ are weights; this paper uses $\alpha = 1$, $\beta = 0.4$, and $\gamma = 2.0$, respectively.

$E_{snake} = \int_0^1 \left( \alpha\, E_{cont}(v(s)) + \beta\, E_{curve}(v(s)) + \gamma\, E_{image}(v(s)) \right) ds$    (1)

J. Cha, G. Kim, and H. Choi

The first term is the energy function that represents the continuity of the snake vertexes surrounding the occlusion area, and the second term is the energy function that controls the smoothness of the curve forming the snake; its value increases with the curvature, enabling the detection of corner points. Lastly, $E_{image}$ is an image feature function. All energy functions are normalized to have a value between 0 and 1. As shown in Table 1, this algorithm extracts the outline by repeatedly performing an energy minimization step that sets a 3 x 3 pixel window at each vertex $v(i)$, finds the position where the energy is minimized in consideration of the continuity between the previous and next vertexes, the curvature, and the edge strength, and then moves the vertex to that position.
Table 1. Snake Algorithm
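As a rough illustration of the greedy minimization summarized in Table 1, a single iteration over the snake vertexes might be sketched as follows. The 3 x 3 search window and the weights alpha = 1, beta = 0.4, gamma = 2.0 follow the text; the edge-strength image, the omission of the per-term normalization, and all function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def greedy_snake_step(vertices, edge_strength, alpha=1.0, beta=0.4, gamma=2.0):
    """One greedy pass: move each vertex inside a 3x3 window to the position
    minimizing continuity + curvature + image energy (normalization omitted)."""
    n = len(vertices)
    # average spacing between successive snake vertexes (closed contour)
    mean_dist = np.mean(np.linalg.norm(
        np.diff(vertices, axis=0, append=vertices[:1]), axis=1))
    new_vertices = vertices.copy()
    for i in range(n):
        prev_v, next_v = new_vertices[(i - 1) % n], vertices[(i + 1) % n]
        best, best_e = vertices[i], np.inf
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                cand = vertices[i] + np.array([dx, dy], dtype=float)
                e_cont = abs(mean_dist - np.linalg.norm(cand - prev_v))  # continuity term
                e_curv = np.linalg.norm(prev_v - 2 * cand + next_v)      # curvature term
                y, x = int(cand[1]), int(cand[0])
                # strong edges should lower the energy; modulo keeps indices in bounds
                e_img = -edge_strength[y % edge_strength.shape[0],
                                       x % edge_strength.shape[1]]
                e = alpha * e_cont + beta * e_curv + gamma * e_img
                if e < best_e:
                    best, best_e = cand, e
        new_vertices[i] = best
    return new_vertices
```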

2.3 Acquisition of 3D Information Using the Picking Algorithm


In order to acquire the 3D information of the extracted vertexes, this paper used the
Picking algorithm which is a well-known 3D graphics technique[8]. It finds the
collision point with the 3D wireframe created by DEM that corresponds to the points
in 2D image and provides the 3D information of the points. The picking search point
is the lowest point of the vertexes of the objects extracted from the 2D image. The
screen coordinate system, a rectangular area containing the figure that has been projection-transformed during 3D image rendering, must be converted to the viewport coordinate system in which the actual 3D topography exists, so that the position indicated on the screen can be picked. First, the viewport-to-screen conversion matrix is used to obtain the conversion formula from the 2D screen to the 3D projection window, and then the ray is extended gradually from the projection window to the ground surface to obtain the collision point between the point to search and the ground surface. Fig. 4 shows an example of picking the collision point between the ray and the DEM. The lowest point of the occlusion area, indicated by an arrow, is the reference point to search, and it provides the actual position of the 2D image point in 3D space.
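The picking search itself can be pictured as marching a ray through the grid DEM until it falls below the terrain height. The following Python sketch assumes a 1 m grid with the y axis pointing up and a fixed marching step; the construction of the ray from the 2D screen point is omitted, and none of the names come from the paper.

```python
import numpy as np

def pick_dem(origin, direction, dem, cell_size=1.0, step=0.5, max_dist=500.0):
    """March a picking ray (3D origin, 3D direction) through a grid DEM
    (heights indexed as dem[row, col]) and return the first terrain hit."""
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    while t < max_dist:
        p = origin + t * direction
        col, row = int(p[0] / cell_size), int(p[2] / cell_size)  # x -> col, z -> row
        if 0 <= row < dem.shape[0] and 0 <= col < dem.shape[1]:
            if p[1] <= dem[row, col]:          # ray has reached the terrain surface
                return np.array([p[0], dem[row, col], p[2]])
        t += step
    return None                                 # no collision within max_dist
```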


(a)occlusion candidate (b)matching ref.point and DEM (c)3D information extraction


Fig. 4. 3D information Extraction using Collision point of Ray and DEM

2.4 Creation of 3D Information Using Proportional Relational Expression


The collision point, or reference point, has 3D coordinates in the DEM, but the other snake vertexes indicating the object outline cannot obtain 3D coordinates because they do not have a collision point. Therefore, this paper suggests obtaining a proportional relation between the 2D image and the 3D DEM using the collision reference point, and then deriving the 3D coordinates of the other vertexes. Fig. 5 shows the proportional relation between 2D and 3D vertexes. In Fig. 5, $S_m$ is the center of the screen, $S_B = (S_{xB}, S_{yB})$ is the reference point of the snake vertexes (the lowest point), and $S_k = (S_{xk}, S_{yk})$ is a vertex other than the reference point. $P_B = (P_{xB}, P_{yB}, P_{zB})$ is the 3D point corresponding to $S_B$, $P_k = (P_{xk}, P_{yk}, P_{zk})$ is a point other than the reference point, and $P_m$ is the projection point, through the center of the screen, of the straight line to $P_B$. Let $t = \overrightarrow{P_o P_B}$, $t_m = \overrightarrow{P_o P_m}$, $\theta_B$ be the angle between $t$ and $t'$, and $\phi_B$ be the angle between $t'$ and $t_m$, where $t'$ is the projection of $t$ onto the xz plane.

Fig. 5. Proportional Relation of the Vertex in 2D and 3D

To get $P_m$, which lies on the line through the center of the screen, using the coordinates of the reference point obtained above, $t'$ must be obtained first. As the value of $t$ is given by the picking ray, the given $t$ and $P_{yB}$ are used to get $\theta_B$, and $t'$ is obtained from it as in Expression (2).

$\theta_B = \sin^{-1}\!\left(\frac{P_{yB}}{|t|}\right), \quad |t'| = |t| \cos(\theta_B)$    (2)

To get $t_m$, the angle $\phi_B$ between $t'$ and $t_m$ is obtained from Expression (3), and $|t_m|$ follows from $\phi_B$:

$\phi_B = \tan^{-1}\!\left(\frac{P_{xB}}{|t'|}\right), \quad |t'| = |t_m| \cos(\phi_B), \quad |t_m| = \frac{|t'|}{\cos(\phi_B)}$    (3)

Because $|t_m| = P_{zm}$, we have $P_m = (0, 0, |t_m|)$.

Now we can express the relation between the 2D screen view in Fig. 5 and the 3D space coordinates, and this can be used to get $P_k$, which corresponds to each 2D snake vertex:

$S_B : P_B = S_k : P_k, \quad S_{xB} : P_{xB} = S_{xk} : P_{xk}, \quad P_{xk} = \frac{P_{xB}\, S_{xk}}{S_{xB}}, \quad S_{yB} : P_{yB} = S_{yk} : P_{yk}, \quad P_{yk} = \frac{P_{yB}\, S_{yk}}{S_{yB}}$    (4)

Consequently, we can get $P_k = (P_{xk}, P_{yk})$, which is the 3D space point corresponding to each snake vertex to search.
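Under the assumption that the screen coordinates are measured relative to the screen center $S_m$, as in Fig. 5, the per-vertex estimate of Eq. (4) reduces to a pair of ratios; a minimal sketch (names are illustrative):

```python
def estimate_3d_vertex(S_B, P_B, S_k):
    """Apply the proportional relation of Eq. (4): scale the 2D offsets of a
    snake vertex by the ratio observed at the reference (collision) point."""
    SxB, SyB = S_B            # 2D reference point, relative to screen center
    PxB, PyB = P_B[0], P_B[1]  # 3D coordinates of the reference point
    Sxk, Syk = S_k            # 2D snake vertex to be lifted to 3D
    Pxk = PxB * Sxk / SxB
    Pyk = PyB * Syk / SyB
    return Pxk, Pyk
```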


2.5 Creation of Virtual Target Path and Selection of Candidate Occlusion
Objects Using MER (Minimum Enclosing Rectangle)
To test the proposed occlusion-resolving algorithm, we created the movement path of
a virtual target and determined the changes of the direction and shape of the target as well as its 3D position. First, the beginning and end points of the target set by the instructor were saved, the angle between these two points was calculated, and the direction and shape of the target were updated in accordance with the change of the angle. Further, the remaining distance was calculated using the speed and time of the target, and the 3D coordinates corresponding to the position after movement were determined. We also suggest a method of improving processing speed by comparing the MER (Minimum Enclosing Rectangle) of each object in the camera's angle of vision with the MER of the virtual target, because the relational operations between the virtual target and all objects extracted from the image for occlusion processing take much time. The MER (Minimum Enclosing Rectangle) of an object refers to the


minimum rectangle that can enclose the object; an object is selected as an occlusion candidate when its MER in the camera image overlaps the MER of the virtual target. In addition, the distance between object and virtual target is obtained
using the fact that the determined object and virtual target are placed more or less in a
straight line from the camera, and this value was used to determine whether there
exists an object between the virtual target and the camera.
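The MER pre-test is an axis-aligned rectangle overlap check. A minimal sketch follows; the rectangle representation as (x_min, y_min, x_max, y_max) and the function names are assumptions for illustration.

```python
def mer_overlaps(a, b):
    """Axis-aligned MER overlap test; each MER is (x_min, y_min, x_max, y_max)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def occlusion_candidates(object_mers, target_mer):
    """Keep only the scene objects whose MER overlaps the virtual target's MER."""
    return [i for i, m in enumerate(object_mers) if mer_overlaps(m, target_mer)]
```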

3 Experimental Results
Fig. 6 (left) shows the movement path of the virtual target set by the trainee, and Fig. 6 (right) shows the various virtual target appearances created to display the target changing as it moves through the image.

Fig. 6. Moving Route Creation (left) and Appearance of Virtual Object as it Moved (right)

Fig. 7 shows the virtual target moving along the path, frame by frame. We can see that as the frames advance, occlusion occurs between the tank and the object.

Fig. 7. Experimental Results of Moving and Occlusion

Table 2 compares the case of using the snake vertexes to select the objects in the image that are compared with the virtual target against the case of using the proposed MER. With the proposed method, the processing speed increased by a factor of 1.671, which contributed to the performance improvement.



Table 2. Speed Comparison

Method            Total frames   Used objects   Time (sec)   Frames per sec.
Snake vertexes    301            10             112          2.687
MER (proposed)    301            10             67           4.492

4 Conclusions
To efficiently solve the problem of occlusion that occurs when virtual targets are moved along a specified path over an actual image, we created a 3D virtual world using the DEM and registered it with the camera images using visual clues. Moreover, the Snake algorithm and the Picking algorithm were used to extract an object close to its original shape in order to determine the 3D information of the points to be occluded. To increase the occlusion processing speed, this paper also used the 3D information of the MER of the object, and proved the validity of the proposed method through experiment. In the future, more research is required on a more accurate extraction method for the occlusion area that is robust against illumination, as well as on the improvement of operation speed.

Acknowledgement
This work was supported by the Korea Research Foundation Grant funded by the
Korean Government(MOEHRD)(KRF-2006-005-J03801).

References
[1] Bimber, O. and Raskar, R.,Spatial Augmented Reality: A Modern Approach to Augmented
Reality, Siggraph 2005, Los Angeles USA
[2] J. Yong Noh and U. Neumann. Expression cloning. In SIGGRAPH'01, pages 277-288,
2001.
[3] E. Chen. Quicktime VR-an image-based approach to virtual environment navigation. Proc.
of SIGGRAPH, 1995.
[4] Lilian Ji, Hong Yan, "Attractable snakes based on the greedy algorithm for contour
extraction", Pattern Recognition 35, pp.791-806 (2002)
[5] Charles C. H. Lean, Alex K. B. See, S. Anandan Shanmugam, "An Enhanced Method for
the Snake Algorithm," icicic, pp. 240-243, First International Conference on Innovative
Computing, Information and Control - Volume I (ICICIC'06), 2006
[6] Wu, S.-T., Abrantes, M., Tost, D., and Batagelo, H. C. 2003. Picking and snapping for 3d
input devices. In Proceedings of SIBGRAPI 2003, 140-147.

Graphics Hardware-Based Level-Set Method


for Interactive Segmentation and Visualization
Helen Hong1 and Seongjin Park2
1

Division of Multimedia Engineering, College of Information and Media,


Seoul Women's University, 126 Gongreung-dong, Nowon-gu, Seoul 139-774, Korea
2
School of Computer Science and Engineering, Seoul National University,
San 56-1 Shinlim-dong, Kwanak-gu, Seoul 151-741 Korea
hlhong@swu.ac.kr, sjpark@cglab.snu.ac.kr

Abstract. This paper presents an efficient graphics hardware-based method to


segment and visualize level-set surfaces at interactive rates. Our method is composed of a memory manager, a level-set solver, and a volume renderer. The memory manager, which runs on the CPU, generates the page table, inverse page table and available page stack, and processes the activation and inactivation of pages. The level-set solver computes only voxels near the iso-surface. To run efficiently on GPUs, the volume is decomposed into a set of small pages. Only those pages with non-zero derivatives are stored on the GPU. These active pages are packed into a large 2D texture, and the level-set partial differential equation (PDE) is computed directly on this packed format. The memory manager helps manage the packing of the active data. The volume renderer performs volume rendering of the original data simultaneously with the evolving level set on the GPU. Experimental results using two chest CT datasets show that our graphics hardware-based level-set method is much faster than a software-based one.
Keywords: Segmentation, Level-Set, Volume rendering, Graphics hardware,
CT, Lung.

1 Introduction
The level-set method is a numerical technique for tracking interfaces and shapes[1].
The advantage of the level-set method is that one can perform numerical computations involving curves and surfaces on a fixed Cartesian grid without having to parameterize these objects. In addition, the level-set method makes it easy to follow
shapes which change topology. All these make the level-set method a great tool for
modeling time-varying objects. Thus, deformable iso-surfaces modeled by the level-set method have demonstrated great potential in visualization for applications such as segmentation, surface processing, and surface reconstruction. However, the use of level sets in visualization is limited by their high computational cost and reliance on significant parameter tuning.


Several methods have been suggested to accelerate the computation. Adalsteinsson and Sethian [2] proposed the narrow band method, which only computes the points near the front at each time step and is thus more efficient than the standard level-set approach. However, the computational time is still large, especially when the image size is large. Paragios and Deriche introduced the Hermes algorithm, which propagates the front in a small window each time to achieve a much faster computation. Sethian [3] presented a monotonically advancing scheme; it is restricted to a one-directional speed term, and the front's geometric properties are omitted. Unfortunately, the stop criterion has to be decided carefully so that the front will not exceed the boundary. Whitaker [4] proposed the sparse-field method, which introduces a scheme in which updates are calculated only on the wavefront, and several layers around that wavefront are updated via a distance transform at each iteration.
To overcome these limitations of software-based level-set methods, we propose an efficient graphics hardware-based method to segment and visualize level-set surfaces at interactive rates.

2 Level-Set Method on Graphics Hardware


Our method is composed of a memory manager, a level-set solver and a volume renderer, as shown in Figure 1. First, in order to help manage the packing of the active data, the memory manager generates the page table, inverse page table and available page stack, and processes the activation and inactivation of pages. Second, the level-set solver computes only voxels near the iso-surface, like the sparse-field level-set method. To run efficiently on GPUs, the volume is decomposed into a set of small pages. Third, the volume renderer performs volume rendering of the original data simultaneously with the evolving level set.
2.1 Memory Manager
Generally, the size of texture memory in graphics hardware is rather small. Thus, a large volume medical dataset with over 1000 slices of 512 x 512 images cannot be loaded into texture memory in its entirety. In order to overcome this limitation, only the level-set pages near the iso-surface, called active pages, are loaded. In this section, we propose an efficient method to manage these active pages.
Firstly, main memory on the CPU and texture memory on the GPU are divided into pages. Then the data structure shown in Fig. 2 is generated. In order to exchange the corresponding page numbers between main memory and texture memory, two tables are generated: the page table, which converts a main-memory page number to the corresponding texture-memory page number, and the inverse page table, which converts a texture-memory page number to the corresponding main-memory page number. In addition, the available page stack is generated to manage empty pages in texture memory.


Fig. 1. The flow chart of our method on graphics hardware

Fig. 2. Data structure for memory management



In the level-set method, the pages containing the front change as the front grows or shrinks. To manage these pages, activation and inactivation are performed as shown in Fig. 3. The activation process occurs when the evolving front enters an inactive page of texture memory. In this process, main memory requests a new texture-memory page from the available page stack, and the top page of the stack is popped, as shown in Fig. 3(a). The inactivation process occurs when the evolving front leaves an active page of texture memory. As shown in Fig. 3(b), main memory requests the removal of the active page from texture memory, and the removed page is pushed onto the available page stack.
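A CPU-side sketch of this bookkeeping, mirroring the page table, inverse page table and available page stack of Fig. 2, might look as follows (Python; the class and method names are illustrative, not the authors' code):

```python
class PageManager:
    """Tracks which main-memory pages are resident in texture memory."""
    def __init__(self, num_texture_pages):
        self.page_table = {}            # main-memory page  -> texture page
        self.inverse_page_table = {}    # texture page      -> main-memory page
        self.available = list(range(num_texture_pages))  # stack of free texture pages

    def activate(self, main_page):
        """Front entered an inactive page: pop a free texture page and map it."""
        if main_page in self.page_table:
            return self.page_table[main_page]
        tex_page = self.available.pop()
        self.page_table[main_page] = tex_page
        self.inverse_page_table[tex_page] = main_page
        return tex_page

    def deactivate(self, main_page):
        """Front left a page: unmap it and push the texture page back on the stack."""
        tex_page = self.page_table.pop(main_page)
        del self.inverse_page_table[tex_page]
        self.available.append(tex_page)
```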

Fig. 3. The process of page activation and inactivation: (a) page activation process; (b) page inactivation process

During level-set computation on the GPU, the partial differential equation is computed using information from the current pixel and its neighbors. When an interior pixel of a texture-memory page is referenced, the PDE can be calculated without preprocessing. When a boundary pixel of a page is referenced, the neighboring page must be consulted to obtain the neighbor pixel's information. However, it is difficult to obtain such information during PDE calculation on the GPU. In this case, a vertex buffer is created on the CPU to save the locations of the current and neighbor pixels. For this, we define nine different cases as shown in Fig. 4. For the 1st, 3rd, 5th and 7th vertices, the two pages neighboring them are consulted to obtain and save the location of the neighbor pixel in the vertex buffer along with the location of the current pixel. For the 2nd, 4th, 6th and 8th vertices, one neighboring page is consulted. For the 9th vertex, the


location of the current pixel is saved to the vertex buffer without consulting a neighboring page. The location of the neighbor pixel is calculated using the page table and inverse page table as in Eq. (1).

Fig. 4. Nine different cases for referring neighbor page

$T_{num} = \frac{T_{addr}}{PageSize}$
$M_{num} = InversePageTable(T_{num})$
$neighbor(M_{num}) = M_{num} + neighborOffset$
$neighbor(T_{num}) = PageTable(neighbor(M_{num}))$    (1)

where $T_{num}$ is the page number in texture memory, $T_{addr}$ is the page address in texture memory, $M_{num}$ is the page number in main memory, and the page size is defined as 16 x 16.
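Expressed in code, the address translation of Eq. (1) could look like the following sketch, which reuses the illustrative PageManager tables from the earlier sketch; the 16 x 16 page size follows the text, while the linear addressing is an assumption:

```python
PAGE_SIZE = 16 * 16   # texels per page, as defined in the text

def neighbor_texture_page(pm, tex_addr, neighbor_offset):
    """Translate a texture address to the texture page holding its neighbor,
    following Eq. (1): texture page -> main page -> neighbor -> texture page."""
    tex_page = tex_addr // PAGE_SIZE
    main_page = pm.inverse_page_table[tex_page]
    neighbor_main = main_page + neighbor_offset
    return pm.page_table[neighbor_main]
```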
2.2 Level-Set Solver
The efficient solution of the level-set PDE relies on updating only those voxels that are on or near the iso-surface. The narrow band and sparse-field methods achieve this by operating on sequences of heterogeneous operations. For instance, the sparse-field method keeps a linked list of active voxels on which the computation is performed. However, the maximum efficiency for computing on a GPU is achieved when homogeneous operations are applied to each pixel, and applying different operations to individual pixels in a page burdens CPU-to-GPU message passing. To run efficiently on GPUs, our level-set solver applies the heterogeneous operations according to the nine cases defined during creation of the vertex buffer.
Fig. 5 shows how the vertex buffer is transferred to the GPU during vertex shading; it is divided into the apex cases (1st, 3rd, 5th, 7th), the edge cases (2nd, 4th, 6th, 8th) and the inner part (9th). Sixteen vertex buffers, which include the locations of four different apex points for the apex case, eight end points for the edge case, and four apex points for the inner case, are transferred. Then the level-set computations are performed using Eqs. (2) and (3).
$D(I) = \epsilon - | I - T |$    (2)

$\frac{\partial \phi}{\partial t} = | \nabla \phi |\, D(I)$    (3)

where $I$ is the intensity value of the image, $D(I)$ is the speed function, $\phi$ is the level-set value, and $T$ and $\epsilon$ are the average intensity value and standard deviation of the region being segmented, respectively.
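A per-voxel version of the update in Eqs. (2) and (3) can be sketched as follows on a dense grid (Python/NumPy). The packed-page GPU layout, proper upwind differencing and the time step are simplified or assumed here:

```python
import numpy as np

def level_set_step(phi, image, T, eps, dt=0.1):
    """One explicit update of phi_t = |grad(phi)| * D(I), with D(I) = eps - |I - T|.
    Central differences via np.gradient stand in for the GPU kernel."""
    speed = eps - np.abs(image - T)                # Eq. (2)
    gx, gy, gz = np.gradient(phi)
    grad_mag = np.sqrt(gx**2 + gy**2 + gz**2)
    return phi + dt * grad_mag * speed             # Eq. (3), forward Euler step
```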

Fig. 5. The process of efficient level-set operation in GPU

2.3 Volume Renderer


The conventional software-based volume rendering techniques, such as ray-casting and shear-warp factorization, cannot visualize level-set surfaces at interactive rates. Our volume renderer performs texture-based volume rendering of the original data on graphics hardware simultaneously with the evolving level set. Firstly, the updated level-set values in texture memory are saved to main memory through the inverse page table. Then texture-based volume rendering is applied to visualize the original volume together with the level-set surfaces. For efficient memory use, we use only two channels, for the intensity value and the level-set value, instead of four RGBA channels. Then proxy geometry is generated using parallel projection. Finally, we map the three-dimensional texture memory on the GPU onto the proxy geometry. The slices mapped onto the proxy geometry are rendered using compositing modes that include maximum intensity projection.

3 Experimental Result
All our implementation and tests were performed using a general computer equipped with an Intel Pentium 4 2.4 GHz CPU and 1 GB of memory. The graphics hardware was an ATI Radeon 9600 GPU with 256 MB of memory. The programs are written in the DirectX shader language. Our method was applied to each unilateral lung of two chest CT datasets to evaluate its accuracy and processing time. The volume resolution of each unilateral lung is 512 x 256 x 128. For packing active pages, the size of the 2D texture memory is 2048 x 2048. Figs. 6 and 7 show how accurately our method segments in two and three dimensions. The segmented lung boundary is presented in red. In Fig. 7, the original volume with the level-set surfaces is visualized using maximum intensity projection.

Fig. 6. The results of segmentation using our graphics hardware-based level-set method

Fig. 7. The results of visualizing the original volume with level-set surfaces

We have compared our technique with a software-based level-set method under the same conditions. Table 1 shows a comparison of the total processing time using the two different techniques. The total processing time includes the times for performing page management and level-set computation. As shown in Table 1, our method is over 3.4 times faster than the software-based level-set method. In particular, our method for computing the level-set PDE is over 14 times faster than that of the software-based method.
Table 1. The comparison results of total processing time using two different techniques (sec)

Method                           Dataset   Page manager   Level-set solver   Total processing time   Average
Proposed graphics                AL        0.38           0.068              0.45
hardware-based method            AR        0.37           0.066              0.44                    0.44
                                 BL        0.38           0.073              0.45
                                 BR        0.36           0.067              0.43
Software-based method            AL        0.54           0.93               1.47
                                 AR        0.55           0.94               1.49                    1.48
                                 BL        0.55           0.94               1.49
                                 BR        0.54           0.93               1.47
L: left lung, R: right lung

4 Conclusion
We have developed a new tool for interactive segmentation and visualization of level-set surfaces on graphics hardware. Our memory manager helps manage the packing of the active data. A dynamic, packed texture format allows the efficient processing of time-dependent, sparse GPU computations. While the GPU updates the level set, it renders the surface model directly from this packed texture format. Our method was over 3.4 times faster than the software-based level-set method. In particular, our method for computing the level-set PDE was over 14 times faster than that of the software-based method. The average total processing time of our method was 0.6 seconds, and the computation for memory management took up most of this total processing time. Experimental results show that our solution is much faster than previous optimized solutions based on software techniques.

Acknowledgement
This study is supported in part by Special Basic Research Program grant R01-2006000-11244-0 under the Korea Science and Engineering Foundation and in part by
Seoul R&D Program.

References
1. Osher S, Sethian J.A., Front propagating with curvature dependant speed: algorithms based
on Hamilton-Jacobi formulation, Journal of Computational Physics, Vol. 79 (1988) 12-49.
2. Adalsteinsson D., Sethian J.A., A fast level set method for propagating interfaces, Journal of
Computational Physics (1995) 269-277.
3. Sethian J.A., A fast marching level set method for monotonically advancing fronts, Proc.
Natl. Acad. Sci. USA Vol. 93 (1996) 1591-1595.
4. Whitaker R., A level-set approach to 3D reconstruction from range data, International Journal of Computer Vision (1998) 203-231.

Parameterization of Quadrilateral Meshes


Li Liu 1, CaiMing Zhang 1,2, and Frank Cheng 3
1 School of Computer Science and Technology, Shandong University, Jinan, China
2 Department of Computer Science and Technology, Shandong Economic University, Jinan, China
3 Department of Computer Science, College of Engineering, University of Kentucky, USA
liuli_790209@163.com

Abstract. Low-distortion parameterization of 3D meshes is a fundamental


problem in computer graphics. Several widely used approaches have been
presented for triangular meshes. But no direct parameterization techniques are
available for quadrilateral meshes yet. In this paper, we present a
parameterization technique for non-closed quadrilateral meshes based on mesh
simplification. The parameterization is done through a simplify-project-embed
process, and minimizes both the local and global distortion of the quadrilateral
meshes. The new algorithm is very suitable for computer graphics applications
that require parameterization with low geometric distortion.
Keywords: Parameterization, mesh simplification, Gaussian curvature,
optimization.

1 Introduction
Parameterization is an important problem in Computer Graphics and has applications
in many areas, including texture mapping [1], scattered data and surface fitting [2],
multi-resolution modeling [3], remeshing [4], morphing [5], etc. Due to its importance
in mesh applications, the subject of mesh parameterization has been well studied.
Parameterization of a polygonal mesh in 3D space is the process of constructing a
one-to-one mapping between the given mesh and a suitable 2D domain. Two major
paradigms used in mesh parameterization are energy functional minimization and the
convex combination approach. Maillot proposed a method to minimize the norm of
the Green-Lagrange deformation tensor based on elasticity theory [6]. The harmonic
embedding used by Eck minimizes the metric dispersion instead of elasticity [3].
Lévy proposed an energy functional minimization method based on orthogonality and
homogeneous spacing [7]. Non-deformation criterion is introduced in [8] with
extrapolation capabilities. Floater [9] proposed shape-preserving parameterization,
where the coefficients are determined by using conformal mapping and barycentric
coordinates. The harmonic embedding [3,10] is also a special case of this approach,
except that the coefficients may be negative.
However, these techniques are developed mainly for triangular mesh
parameterization. Parameterization of quadrilateral meshes, on the other hand, is
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 1724, 2007.
Springer-Verlag Berlin Heidelberg 2007


actually a more critical problem because quadrilateral meshes, with their good properties, are preferred over triangular meshes in finite element analysis.
Parameterization techniques developed for triangle meshes are not suitable for
quadrilateral meshes because of different connectivity structures.
In this paper, we present a parameterization technique for non-closed quadrilateral
meshes through a simplify-project-embed process. The algorithm has the following
advantages:(1) the method provably produces good parameterization results for any
non-closed quadrilateral mesh that can be mapped to the 2D plane; (2) the method
minimizes the distortion of both angle and area caused by parameterization; (3) the
solution does not place any restrictions on the boundary shape; (4) since the
quadrilateral meshes are simplified, the method is fast and efficient.
The remaining part of this paper is organized as follows. The new model and the
algorithm are presented in detail in Section 2. Test results of the new algorithm are
shown in Section 3. Concluding remarks are given in Section 4.

2 Parameterization
Given a non-closed quadrilateral mesh, the parameterization process consists of four
steps. The first step is to get a simplified version of the mesh by keeping the boundary
and interior vertices with high Gaussian curvature, but deleting interior vertices with
low Gaussian curvature. The second step is to map the simplified mesh onto a 2D
domain through a global parameterization process. The third step is to embed the
deleted interior vertices onto the 2D domain through a weighted discrete mapping.
This mapping preserves angles and areas and, consequently, minimizes angle and area
distortion. The last step is to perform an optimization process of the parameterization
process to eliminate overlapping. Details of these steps are described in the
subsequent sections.
For a given vertex v in a quadrilateral mesh, the one-ring neighbouring vertices of
the vertex v are the vertices that share a common face with v . A one-ring
neighboring vertex of the vertex v is called an immediate neighboring vertex if this
vertex shares a common edge with v . Otherwise, it is called a diagonally neighboring
vertex.
2.1 Simplification Algorithm
The computation process, as well as the distortion, may be too large if the entire
quadrilateral mesh is projected onto the plane. To speed up the parameterization and
minimize the distortion, we simplify the mesh structure by reducing the number of
interior vertices but try to retain a good approximation of the original shape and
appearance. The discrete curvature is one of the good criteria of simplification while
preserving the shape of an original model.
In spite of the extensive use of quadrilateral meshes in geometric modeling and
computer graphics, there is no agreement on the most appropriate way to estimate
geometric attributes such as curvature on discrete surfaces. By thinking of a


quadrilateral mesh as a piecewise linear approximation of an unknown smooth


surface, we can try to estimate the curvature of a vertex using only the information
that is given by the quadrilateral mesh itself, such as the edge and angles. The
estimation does not have to be precise. To speed up the computation, we ignore the
effect of diagonally neighboring vertices, and use only immediate neighboring
vertices to estimate the Gaussian curvature of a vertex, as shown in Fig.1-(a). We
define the integral Gaussian curvature $K = K_v$ with respect to the area $S = S_v$ attributed to $v$ by

$K = \int_S \kappa \, dS = 2\pi - \sum_{i=1}^{n} \theta_i$    (1)

where $\theta_i$ is the angle between two successive edges. To derive the curvature from the integral values, we assume the curvature to be uniformly distributed around the vertex and simply normalize by the area

$\bar{K} = \frac{K}{S}$    (2)

where $S$ is the sum of the areas of the adjacent faces around the vertex $v$. Different ways of defining the area $S$ result in different curvature values. We use the Voronoi area, which sums up the areas of vertex $v$'s local Voronoi cells. To determine the areas of the local Voronoi cells restricted to a triangle, we distinguish obtuse and non-obtuse triangles as shown in Fig. 1. In the latter case they are given by

$S_A = \frac{1}{8}\left( \| v_i v_k \| \cot(\alpha_i) + \| v_i v_j \| \cot(\beta_i) \right)$    (3)

For obtuse triangles,

$S_B = \frac{1}{8} \| v_i v_k \| \tan(\alpha_i), \quad S_C = \frac{1}{8} \| v_i v_j \| \tan(\beta_i), \quad S_A = S - S_B - S_C$    (4)

A vertex deletion means the deletion of a vertex with low Gaussian curvature and
the incident edges. During the simplification process, we can adjust the tolerance
value to control the number of vertices reduced.
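A direct transcription of Eqs. (1) and (2) for a single vertex might look as follows. The mesh is assumed to be given as 3D vertex positions plus the ordered immediate one-ring of each vertex, and the simple triangle-fan area below is a crude stand-in for the Voronoi area of Eqs. (3) and (4):

```python
import numpy as np

def gaussian_curvature(v, ring):
    """Discrete Gaussian curvature at vertex v: (2*pi - sum of incident angles) / area.
    `ring` is the ordered list of immediate neighbour positions (3D) around v."""
    v = np.asarray(v, dtype=float)
    ring = [np.asarray(p, dtype=float) for p in ring]
    angle_sum, area = 0.0, 0.0
    for a, b in zip(ring, ring[1:] + ring[:1]):
        e1, e2 = a - v, b - v
        cosang = np.dot(e1, e2) / (np.linalg.norm(e1) * np.linalg.norm(e2))
        angle_sum += np.arccos(np.clip(cosang, -1.0, 1.0))
        # one third of each incident triangle's area, a rough stand-in for the Voronoi area
        area += 0.5 * np.linalg.norm(np.cross(e1, e2)) / 3.0
    return (2.0 * np.pi - angle_sum) / area
```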

Fig. 1. Voronoi area. (a) Voronoi cells around a vertex; (b) Non-obtuse angle; (c) Obtuse angle.


2.2 Global Parameterization

Parameterizing a polygonal mesh amounts to computing a correspondence between


the 3D mesh and an isomorphic planar mesh through a piecewise linear mapping. For the simplified mesh $M$ obtained in the first step, the goal here is to construct a mapping between $M$ and an isomorphic planar mesh $U$ in $R^2$ that best preserves the intrinsic characteristics of the mesh $M$. We denote by $v_i$ the 3D position of the i-th vertex in the mesh $M$, and by $u_i$ the 2D position (parameterized value) of the corresponding vertex in the 2D mesh $U$.
The simplified polygonal mesh M approximates the original quadrilateral mesh,
but the angles and areas of M are different from the original mesh. We take the edges
of the mesh M as springs and project vertices of the mesh onto the parameterization
domain by minimizing the following edge-based energy function

$\frac{1}{2} \sum_{\{i,j\} \in Edge} \frac{1}{\| v_i - v_j \|^{\,r}} \, \| u_i - u_j \|^2, \quad r \geq 0$    (5)

where $Edge$ is the edge set of the simplified mesh. The coefficients can be chosen in different ways by adjusting $r$. This global parameterization process is performed on a simplified mesh (with fewer vertices), so it is different from the global parameterization and the fixed-boundary parameterization of triangular meshes.
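Minimizing Eq. (5) with the boundary vertices held fixed leads to a linear system in the free 2D coordinates. A small dense sketch follows (Python/NumPy); fixing the boundary this way and the default choice r = 1 are assumptions for illustration, not prescriptions from the paper:

```python
import numpy as np

def spring_parameterize(vertices, edges, boundary_uv, r=1.0):
    """Solve the spring energy of Eq. (5): each interior u_i is a weighted average of
    its neighbours with weights 1 / ||v_i - v_j||^r; boundary u_i are held fixed.
    boundary_uv maps vertex index -> fixed (u, v) position."""
    n = len(vertices)
    A = np.zeros((n, n))
    b = np.zeros((n, 2))
    for i in range(n):
        if i in boundary_uv:          # identity row pins the boundary vertex
            A[i, i] = 1.0
            b[i] = boundary_uv[i]
    for i, j in edges:
        w = 1.0 / np.linalg.norm(np.asarray(vertices[i]) - np.asarray(vertices[j])) ** r
        for p, q in ((i, j), (j, i)):
            if p not in boundary_uv:  # only interior rows receive spring terms
                A[p, p] += w
                A[p, q] -= w
    return np.linalg.solve(A, b)      # row i holds the 2D parameter value u_i
```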
2.3 Local Parameterization

After the boundary and interior vertices with high Gaussian curvature are mapped onto the 2D plane, the vertices with low curvature are embedded back onto the parameterization plane. This process has a great impact on the result of the parameterization; hence, it should preserve as many of the intrinsic qualities of the mesh as possible. We need to define what is meant by intrinsic qualities for a discrete mesh. In the following, minimal distortion means the best preservation of these qualities.
2.3.1 Discrete Conformal Mapping
Conformal parameterization preserves angular structure, and is intrinsic to the geometry and stable with respect to small deformations. To flatten a mesh onto a two-dimensional plane so that it minimizes the relative distortion of the planar angles with respect to their counterparts in the 3D space, we introduce an angle-based energy function as follows

$E_A = \sum_{j \in N(i)} \frac{\cot\alpha_{ij} + \cot\beta_{ij}}{4} \, \| u_i - u_j \|^2$    (6)

where $N(i)$ is the set of immediate one-ring neighbouring vertices, and $\alpha_{ij}$, $\beta_{ij}$ are the left and opposite angles of $v_i$, as shown in Fig. 2-(a). The coefficients in the

formula (6) are always positive, which reduces the overlapping in the 2D mesh. To
minimize the discrete conformal energy, we get a discrete quadratic energy in the
parameterization and it depends only on the angles of the original surface.
2.3.2 Discrete Authalic Mapping
Authalic mapping preserves the area as much as possible. A quadrilateral mesh in 3D space usually is not flat, so we cannot get an exact area for each quadrilateral patch. To minimize the area distortion, we divide each quadrilateral patch into four triangular parts and preserve the areas of these triangles respectively. For instance, in Fig. 2-(b) the quadrilateral patch $v_i v_j v_k v_{j+1}$ is divided into the triangles $v_i v_j v_{j+1}$, $v_i v_j v_k$, $v_i v_k v_{j+1}$ and $v_j v_k v_{j+1}$, respectively. This changes the problem of quadrilateral area preservation into that of triangular area preservation.
The mapping resulting from the energy minimization process has the property of preserving the area of each vertex's one-ring neighbourhood in the mesh, and can be written as follows

$E_X = \sum_{j \in N(i)} \frac{\cot\gamma_{ij} + \cot\delta_{ij}}{2\, \| v_i - v_j \|^2} \, \| u_i - u_j \|^2$    (7)

where $\gamma_{ij}$, $\delta_{ij}$ are the corresponding angles of the edge $(v_i, v_j)$ as shown in Fig. 2-(c). The parameterization derived from $E_X$ is easily obtained, and the way to solve this system is similar to that of the discrete conformal mapping, but the linear coefficients now are functions of local areas of the 3D mesh.

Fig. 2. Edge and angles. (a) Edge and opposite left angles in the conformal mapping; (b) Quadrilateral mesh divided into four triangles; (c) Edge and angles in the authalic mapping.

2.3.3 Weighted Discrete Parameterization


Discrete conformal mapping can be seen as an angle preserving mapping which
minimizes the angle distortion for the interior vertices. The resulting mapping will
preserve the shape but not the area of the original mesh. Discrete authalic mapping is
area preserving which minimizes the area distortion. Although the area of the original


mesh would locally be preserved, the shape tends to be distorted since the mapping
from 3D to 2D will in general generate twisted distortion.
To minimize the distortion and get better parameterization results, we define linear
combinations of the area and the angle distortions as the distortion measures. It turns
out that the family of admissible, simple distortion measures is reduced to linear
combinations of the two discrete distortion measures defined above. A general
distortion measure can thus always be written as

$E = q E_A + (1 - q) E_X$    (8)

where q is a real number between 0 and 1. By adjusting the scaling factor q ,


parameterizations appropriate for special applications can be obtained.
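Because both $E_A$ and $E_X$ are quadratic with per-edge coefficients, the blend in Eq. (8) amounts to blending those coefficients before solving the same kind of linear system as in the global step. A minimal sketch, assuming the conformal and authalic coefficients of Eqs. (6) and (7) have already been computed per edge:

```python
def blend_edge_weights(conformal, authalic, q=0.5):
    """Combine per-edge conformal and authalic coefficients as in Eq. (8):
    w_ij = q * w_ij_A + (1 - q) * w_ij_X, for 0 <= q <= 1."""
    return {edge: q * conformal[edge] + (1.0 - q) * authalic[edge] for edge in conformal}
```

The blended weights could then take the place of the spring weights in the solver sketched in Section 2.2.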
2.4 Mesh Optimization

The above parameterization process does not impose restrictions, such as convexity, on the given quadrilateral mesh. Consequently, overlapping might occur in the projection
process. To eliminate overlapping, we optimize the parameterization mesh by
adjusting vertex location without changing the topology. Mesh optimization is a local
iterative process. Each vertex is optimized for a new location in a number of
iterations.
Let $u_i^q$ be the location of the parameterization value $u_i$ after $q$ iterations. The optimization process to find the new location in each iteration is given by the following formula

$u_i^q = u_i^{q-1} + \lambda_1 \left( \sum_{j=1}^{n} \frac{u_j^{q-1} - u_i^{q-1}}{n} \right) + \lambda_2 \left( \sum_{k=1}^{n} \frac{u_k^{q-1} - u_i^{q-1}}{n} \right), \quad 0 < \lambda_1 + \lambda_2 < 1$    (9)

where $u_j$, $u_k$ are the parameterization values of the immediate and diagonal neighbouring vertices respectively. It is found that vertex optimization in the order of "worst first" is very helpful. We define the priority of a vertex as follows

$\delta = \lambda_1 \left( \sum_{j=1}^{n} \frac{\| u_j^{q-1} - u_i^{q-1} \|}{n} \right) + \lambda_2 \left( \sum_{k=1}^{n} \frac{\| u_k^{q-1} - u_i^{q-1} \|}{n} \right)$    (10)

The priority is simply computed based on shape metrics of each parameterization


vertex. The vertex with the worst quality is assigned the highest priority. Through experiments, we find that more iterations are needed if vertices are instead processed in "first come, first served" order. Besides, we must point out that the optimization process is local: we only optimize overlapping vertices and their one-ring vertices, which minimizes the distortion and better preserves the parameterization results.
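One pass of the local untangling step of Eq. (9), visiting vertices in the "worst first" order suggested by Eq. (10), could be sketched as follows. The values lambda_1 = 0.4 and lambda_2 = 0.2 are illustrative choices satisfying 0 < lambda_1 + lambda_2 < 1, and the priority is approximated here by the magnitude of the proposed displacement:

```python
import numpy as np

def optimize_pass(uv, immediate, diagonal, overlapped, lam1=0.4, lam2=0.2):
    """uv: (n, 2) array of parameter positions; immediate/diagonal map a vertex
    index to lists of neighbour indices; overlapped lists the flagged vertices.
    Each flagged vertex is moved toward the centroids of its neighbours (Eq. 9)."""
    def displacement(i):
        d1 = np.mean([uv[j] for j in immediate[i]], axis=0) - uv[i]
        d2 = np.mean([uv[k] for k in diagonal[i]], axis=0) - uv[i]
        return lam1 * d1 + lam2 * d2
    order = sorted(overlapped, key=lambda i: -np.linalg.norm(displacement(i)))  # worst first
    for i in order:
        uv[i] = uv[i] + displacement(i)
    return uv
```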

3 Examples
To evaluate the visual quality of a parameterization we use the checkerboard texture
shown in Fig. 3, where the effect of the scaling factor q in Eq. (8) can be found. In


fact, when q is equal to 0 or 1, the weighted discrete mapping reduces to the discrete conformal mapping and the discrete authalic mapping, respectively. Since few parameterization methods exist for quadrilateral meshes, the weighted discrete mapping is compared with the discrete conformal mapping and the discrete authalic mapping of quadrilateral meshes, obtained with q = 0 and q = 1 in Eq. (8), respectively.
Fig. 3-(a) and (e) show the sampled quadrilateral meshes. Fig. 3-(b) and (f) show
the models with a checkerboard texture map using discrete conformal mapping with
q = 0 . Fig.3-(c) and (g) show the models with a checkerboard texture map using
weighted discrete mapping with q = 0.5 . Fig. 3-(d) and (h) show the models with a
checkerboard texture map using discrete authalic mapping with q = 1. It is seen that the results using the weighted discrete mapping are much better than the ones using discrete conformal mapping and discrete authalic mapping.


Fig. 3. Texture mapping. (a) and (e) Models; (b) and (f) Discrete conformal mapping , q=0; (c)
and (g) Weighted discrete mapping , q=0.5; (d) and (h) Discrete Authalic mapping , q=1.

The results demonstrate that a medium value of q (about 0.5) gives a smoother parameterization and minimal distortion energy, and that the closer q gets to 0 or 1, the larger the angle and area distortions are.

4 Conclusions
A parameterization technique for quadrilateral meshes based on mesh simplification and weighted discrete mapping has been presented. Mesh simplification


reduces computation, and the weighted discrete mapping minimizes angle and area
distortion. The scaling factor q of the weighted discrete mapping provides users with the flexibility of obtaining parameterizations appropriate for special applications, with different levels of smoothness and distortion.
The major drawback in our current implementation is that the proposed approach
may contain concave quadrangles in the planar embedding. It is difficult to make all
of the planar quadrilateral meshes convex, even though we change the triangular
meshes into quadrilateral meshes by deleting edges. In future work, we will focus on using a better objective function to obtain better solutions and on developing a good solver that can keep the planar meshes convex.

References
1. Levy, B.: Constrained texture mapping for polygonal meshes. In: Fiume E, (ed.):
Proceedings of Computer Graphics. ACM SIGGRAPH, New York (2001) 417-424
2. Alexa, M.: Merging polyhedron shapes with scattered features. The Visual Computer. 16
(2000): 26-37
3. Eck, M., DeRose, T., Duchamp, T., Hoppe, H., Lounsbery, M., Stuetzle, W.:
Multiresolution analysis of arbitrary meshes. In: Mair, S.G., Cook, R.(eds.): Proceedings
of Computer Graphics. ACM SIGGRAPH, Los Angeles (1995) 173-182
4. Alliez, P., Meyer, M., Desbrun, M.: Interactive geometry remeshing. In: Proceedings of
Computer Graphics.ACM SIGGRAPH, San Antonio (2002) 347-354
5. Alexa, M.: Recent advances in mesh morphing. Computer Graphics Forum. 21(2002)
173-196
6. Maillot, J., Yahia, H., Verroust, A.: Interactive texture mapping. In: Proceedings of
Computer Graphics, ACM SIGGRAPH, Anaheim (1993) 27-34
7. Levy, B., Mallet, J.: Non-distorted texture mapping for sheared triangulated meshes. In:
Proceedings of Computer Graphics, ACM SIGGRAPH, Orlando (1998) 343-352
8. Jin, M., Wang, Y., Yau, S.T., Gu, X.: Optimal global conformal surface parameterization.
In: Proceedings of Visualization, Austin (2004) 267-274
9. Floater, M.S.: Parameterization and smooth approximation of surface triangulations.
Computer Aided Geometric Design.14(1997) 231-250
10. Lee, Y., Kim, H.S., Lee, S.: Mesh parameterization with a virtual boundary. Computers &
Graphics. 26 (2002) 677-686

Pose Insensitive 3D Retrieval by Poisson Shape


Histogram
Pan Xiang 1, Chen Qi Hua 2, Fang Xin Gang 1, and Zheng Bo Chuan 3
1 Institute of Software, Zhejiang University of Technology
2 Institute of Mechanical, Zhejiang University of Technology
3 College of Mathematics & Information, China West Normal University
1,2 310014, Zhejiang; 3 637002, Nanchong, P.R. China
panxiangid@yahoo.com

Abstract. With the rapid increase of available 3D models, content-based 3D retrieval is attracting more and more research interest. The histogram is the most widely used structure for constructing 3D shape descriptors. Most existing histogram-based descriptors, however, do not remain invariant under rigid transforms. In this paper, we propose a new kind of descriptor called the Poisson shape histogram. The main advantage of the proposed descriptor is that it is not sensitive to rigid transforms, and it remains invariant under rotation as well. To extract the Poisson shape histogram, we first convert the given 3D model into a voxel representation. Then, a Poisson solver with a Dirichlet boundary condition is used to get a shape signature for each voxel. Finally, the Poisson shape histogram is constructed from the shape signatures. Retrieval experiments on the shape benchmark database have proven that the Poisson shape histogram can achieve better performance than other similar histogram-based shape representations.
Keywords: 3D shape matching, Pose-Insensitive, Poisson equation, Histogram.

1 Introduction
Recent developments in modeling and digitizing techniques have led to a rapid increase in the number of 3D models. More and more 3D digital models can be accessed freely from the Internet or from other resources. Users can save design time by reusing existing 3D models. As a consequence, the central question has changed from "How do we generate 3D models?" to "How do we find them?" [1]. An urgent problem right now is how to help people find the 3D models they need accurately and efficiently in model databases or on the web. Content-based 3D retrieval, which aims to retrieve 3D models by shape matching, has become a hot research topic.
In content-based 3D retrieval, histogram-based representations have been widely used for constructing shape features [2]. A histogram-based representation needs a shape signature, and the chosen signature largely determines the quality of the descriptor. It should be invariant to transformations such as translation, scaling, rotation and rigid transforms. Some rotation-invariant shape signatures, such as curvature and distance, have been used for content-based 3D retrieval. Those

shape signatures are independent of 3D rotation. However, little research has focused on extracting shape signatures that are invariant under rigid transforms; existing rotation-invariant signatures are often sensitive to them.
In this paper, we propose a new shape signature called the Poisson shape measure. It remains almost invariant not only under rotation, but also under rigid transforms. The proposed signature is based on Poisson theory which, as one of the most important PDE theories, has been widely used in computer vision, computer graphics, analysis of anatomical structures and image processing [3-5]. However, it has not previously been used to define a 3D shape signature for content-based 3D retrieval. The construction of the Poisson shape histogram can be summarized as follows: the given 3D model is first converted into a voxel representation; then a Poisson solver with Dirichlet boundary conditions is used to obtain a shape signature for each voxel; finally, the Poisson shape histogram is constructed from these shape signatures. A comparative study shows that the Poisson shape histogram achieves better retrieval performance than other similar histogram descriptors.
The remainder of the paper is organized as follows: Section 2 provides a brief review of related work. Section 3 discusses the Poisson equation and its relevant properties. Section 4 discusses how to construct the Poisson shape histogram. Section 5 provides the experimental results for content-based 3D retrieval. Finally, Section 6 concludes the paper and recommends some future work.

2 Related Work
Previous shape descriptors can be classified into two groups by their characteristics: structural representations and statistical representations. The method proposed in this paper belongs to the statistical group. This section therefore gives a brief review of statistical shape description for content-based 3D retrieval. For more details about structural descriptors and content-based 3D retrieval, please refer to the survey papers [6-8].
For statistical representations, the most common approach is to first compute geometric signatures of the given model, such as normals, curvature or distances, and then use the extracted signatures to build a histogram. Existing shape signatures for 3D shape retrieval can be grouped into two types: rotation-invariant signatures and rotation-variant signatures. For the latter, rotation normalization is performed prior to the extraction of the shape signatures.
Rotation-variant shape signatures
The Extended Gaussian Image (EGI) defines a shape feature by the normal distribution over the sphere [9]. An extension of EGI is the Complex Extended Gaussian Image (CEGI) [10], which combines distance and normal information in the descriptor. Shape histograms defined on shells and sectors around a model centroid capture the point distribution [11]. Transform-based shape features can be seen as a post-processing of the original shape signatures, and often achieve better retrieval accuracy than the

original shape signatures. Vranic et al. perform a spherical harmonics transform of the point distribution of the given model [12]. Chen et al. considered the idea that two models are similar if they look similar from different view angles, and hence extracted transform coefficients in 2D image spaces instead of in 3D space [13]. Transform-based 3D retrieval often achieves better performance than histogram-based methods, but is more computationally costly.
Rotation-invariant shape signatures
This kind of shape signature is robust against rotation transforms. Shape distributions use measures over the surface, such as distance, angle and area, to generate histograms [14]. The angle and distance distribution (AD) integrates normal information into the distance distribution [15]. The generalized shape distribution combines local and global shape features for 3D retrieval. The shape index defined by curvature is adopted as the MPEG-7 3D shape descriptor [16]. The Radius-Angle Histogram extracts the angle between the radius and the normal for the histogram [17]. The local diameter shape function computes the distance from the surface to the medial axis [18]. It has characteristics similar to the Poisson measure proposed in this paper; the extraction of the local diameter shape function, however, is very time-consuming (it requires nearly two minutes on average to construct the histogram).

3 Poisson Equation
The Poisson equation arises in gravitation and electrostatics, and is fundamental in mathematical physics. Mathematically, the Poisson equation is a second-order elliptic partial differential equation defined as:

\Delta U = -1    (1)

where \Delta is the Laplacian operator. The Poisson equation assigns every internal point a value. By this definition, the Poisson equation is somewhat similar to the distance transform, which assigns to every internal point a value that depends on the relative position of that point within the given shape and reflects its minimal distance to the boundary. The Poisson equation, however, differs greatly from the distance transform: it can be interpreted as placing a set of particles at a point and letting them move in a random walk until they hit the contour, measuring the mean time required for a particle to hit the boundary. That is to say, the Poisson equation considers each internal point to be affected by more than one boundary point, and is therefore more robust than the distance transform.
The Poisson equation has several properties that are useful for shape analysis:
1. Rotation invariance. The Poisson equation is independent of the coordinate system over the entire domain (a volume in 3D, a region in 2D). This makes signatures defined by the Poisson equation robust against rotation.
2. Relation to geometric structure. The Poisson equation is correlated with the geometry of the structure, which gives the signature a clear geometric meaning.


3. Rigid-transform invariance. Similar to the geodesic distance, the Poisson equation is strongly robust under rigid transforms.
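As an illustration only (not the implementation used in this paper, which relies on the sparse solver described in Section 4), the following Python sketch approximates the solution of Eq. (1) with zero Dirichlet boundary values on a voxel occupancy grid by Jacobi iteration; the function name, grid size and iteration count are illustrative choices.

```python
import numpy as np

def poisson_signature(occ, iters=2000, h=1.0):
    """Approximate U with Laplacian(U) = -1 inside the shape (occ == True)
    and U = 0 outside (Dirichlet boundary), using Jacobi iteration."""
    U = np.zeros(occ.shape, dtype=float)
    inside = occ.astype(bool)
    for _ in range(iters):
        # average of the six axis neighbours
        avg = (np.roll(U, 1, 0) + np.roll(U, -1, 0) +
               np.roll(U, 1, 1) + np.roll(U, -1, 1) +
               np.roll(U, 1, 2) + np.roll(U, -1, 2)) / 6.0
        U_new = avg + (h * h) / 6.0          # discrete Laplacian(U) = -1
        U = np.where(inside, U_new, 0.0)     # keep exterior at 0 (Dirichlet)
    return U

# toy example: a solid 20^3 cube inside a 32^3 grid
occ = np.zeros((32, 32, 32), dtype=bool)
occ[6:26, 6:26, 6:26] = True
U = poisson_signature(occ)
print(U[occ].min(), U[occ].max())   # larger values deeper inside the shape
```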

4 Poisson Shape Histogram and Matching
Following the definition of the Poisson equation, this section discusses how to construct the Poisson shape histogram and how to compute similarity.
The Poisson equation assigns each internal point a value, whereas most 3D models use a boundary representation such as a mesh. The given mesh model is therefore first converted into a 3D discrete grid (48x48x48). The voxelization algorithm used in this paper is based on the Z-buffer [19]; its efficiency is independent of the object complexity, and it can be implemented efficiently. The voxelization also normalizes the scale of the given model.
Suppose the voxelized model is represented by a finite voxel set V_i, i = 1, 2, ..., N, where N is the total voxel count. The TAUCS package [20] is then used as the Poisson solver. After that, for each voxel V_i we obtain a Poisson shape signature, denoted by P_i. The construction of the Poisson shape histogram can be summarized in the following steps:
1) For the signature set P_i, i = 1, 2, \ldots, N, compute its mean value \mu and variance \sigma^2.

2) For each P_i, perform Gaussian normalization:

P_i' = \frac{P_i - \mu}{3\sigma}    (2)

3) For the normalized set P_i', construct a histogram containing 20 bins, denoted by H = \{H_1, H_2, \ldots, H_i, \ldots, H_{20}\}.
For two histograms, we use the L1 metric to measure their dissimilarity:

Dis_{1,2} = \sum_{i=1}^{20} \left| H_{1,i} - H_{2,i} \right|    (3)

where H_1 and H_2 denote the Poisson shape histograms of the two models. A larger value means the two models are more dissimilar.
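Assuming the per-voxel signatures P_i are already available, steps 1)-3) and the L1 dissimilarity of Eq. (3) can be sketched in Python as follows; mapping the normalized values into [0, 1] before binning is our assumption and is not stated explicitly in the paper.

```python
import numpy as np

def poisson_histogram(P, bins=20):
    """Steps 1)-3): Gaussian-normalize the signatures and build a histogram."""
    P = np.asarray(P, dtype=float)
    mu, sigma = P.mean(), P.std()
    Pn = (P - mu) / (3.0 * sigma + 1e-12)      # Gaussian normalization, Eq. (2)
    Pn = np.clip((Pn + 1.0) / 2.0, 0.0, 1.0)   # map roughly into [0, 1] (assumption)
    H, _ = np.histogram(Pn, bins=bins, range=(0.0, 1.0))
    return H / H.sum()                         # relative frequencies

def l1_dissimilarity(H1, H2):
    """Eq. (3): L1 distance between two histograms."""
    return float(np.abs(np.asarray(H1) - np.asarray(H2)).sum())
```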
Section 3 discussed the properties of the Poisson equation and showed that it is insensitive to rigid transforms. Figure 1 gives the Poisson shape histograms of horses under different rigid transforms. The Poisson shape histogram remains almost invariant under different rigid transforms (the small differences are due to voxelization error). As a comparison, the D2 shape distribution differs greatly between the two models.


Fig. 1. Histogram descriptors for the above models (Upper: horses under different rigid transforms. Lower: on the left, the Poisson shape histograms of the two models; on the right, their D2 shape distributions. The difference between the Poisson shape histograms is very minor, while the difference between the D2 shape distributions is very obvious).

5 Experiment
Experiments were carried out to test the retrieval performance of the Poisson shape histogram. All experiments were performed on an Intel Pentium 1.86 GHz machine with 512 MB of memory. The test models come from the Princeton Shape Benchmark (PSB) [21]. It contains 1814 mesh models classified into two groups of 907 models each: a training set, used to tune the retrieval parameters, and a testing set, used to compare the retrieval performance of different shape descriptors. The benchmark also provides several evaluation criteria for retrieval precision. Here we use the precision-recall curve, which is widely used in information retrieval, to measure retrieval accuracy. We first report the time needed to construct the Poisson shape histogram, and then compare its retrieval accuracy with that of similar histograms.
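As a reference for how such a curve is obtained, a minimal sketch of the per-query precision-recall computation is given below; it assumes the retrieved models are ranked by increasing histogram dissimilarity and is not tied to any particular descriptor.

```python
import numpy as np

def precision_recall(ranked_labels, query_label, num_relevant):
    """Precision/recall along a ranked retrieval list for one query.
    ranked_labels: class labels of the retrieved models, best match first."""
    hits = 0
    precision, recall = [], []
    for k, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
        precision.append(hits / k)
        recall.append(hits / num_relevant)
    return np.array(recall), np.array(precision)
```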
For content-based 3D retrieval, feature extraction should be fast; this is very important, especially for practical applications. The cost of building the Poisson shape histogram consists of the following steps: voxelization, the Poisson solve and histogram construction. Voxelization takes about 0.07 s per model, and histogram construction takes almost no time. The time for the Poisson solve depends on the number of voxels; Table 1 shows the cost for models with different voxel counts.
On average, the cost of computing the Poisson shape histogram is about 0.6 s, whereas generating the D2 shape distribution takes about 0.8 s.
Next, we compare the retrieval performance of the Poisson shape histogram (PSH) with two other histogram-based shape descriptors: the 3D shape spectrum (3DS) and the D2 distance (D2). Figure 2 gives the precision-recall curves for

Table 1. The cost of the Poisson solver for different voxel counts

Voxels in model    Poisson solver (s)
8624               1.1
6832               0.7
4500               0.4
2306               0.2

Fig. 2. The precision-recall curves for different histogram-based descriptors

Fig. 3. Some retrieval results (for each row, the leftmost model is the query model, and the other three models are the ones most similar to it; note that models under different rigid transforms are retrieved correctly).


the three shape descriptors. It shows that the Poisson shape histogram achieves the best retrieval precision. Some retrieval results are shown in Figure 3; note that models under different rigid transforms are retrieved correctly.

6 Conclusion and Future Work

This paper proposed a new 3D shape descriptor called the Poisson shape histogram, which uses the Poisson equation as its main mathematical tool. The encouraging characteristic of the Poisson shape histogram is that it is insensitive to rigid transforms, and it remains rotation invariant as well. The retrieval experiments have shown that the Poisson shape histogram achieves better retrieval precision than other similar histogram-based 3D shape descriptors.
As a histogram, the main drawback of the Poisson shape histogram is that it can only capture global shape features and cannot support partial matching. By its definition, however, the Poisson shape signature is only affected by local neighbors, which suggests that the Poisson measure can also represent local shape features. As future work, we will investigate partial matching based on the Poisson equation.
Acknowledgments. This work was supported by the Natural Science Foundation of Zhejiang Province (Grant Nos. Y106203, Y106329). It was also partially funded by the Education Office of Zhejiang Province (Grant No. 20051419) and the Education Office of Sichuan Province (Grant No. 2006B040).

References
1. T. Funkhouser, P. Min, and M. Kazhdan, A Search Engine for 3D Models. ACM Transactions on Graphics, (2003)(1): 83-105.
2. Ceyhun Burak Akgül, Bülent Sankur, Yücel Yemez, et al., A Framework for Histogram-Induced 3D Descriptors. European Signal Processing Conference (2006).
3. L. Gorelick, M. Galun, and E. Sharon: Shape representation and classification using the Poisson equation. CVPR, (2004): 61-67.
4. Y. Yu, K. Zhou, and D. Xu, Mesh Editing with Poisson-Based Gradient Field Manipulation. ACM SIGGRAPH, (2005).
5. H. Haider, S. Bouix, and J. J. Levitt, Characterizing the Shape of Anatomical Structures
with Poisson's Equation. IEEE Transactions on Medical Imaging, (2006). 25(10):
1249-1257.
6. J. Tangelder and R. Veltkamp. A Survey of Content Based 3D Shape Retrieval Methods. in
International Conference on Shape Modeling. (2004).
7. N. Iyer, Y. Kalyanaraman, and K. Lou. A reconfigurable 3D engineering shape search
system Part I: shape representation. in CDROM Proc. of ASME 2003. (2003). Chicago.
8. Benjamin Bustos, Daniel Keim, Dietmar Saupe, et al., An experimental effectiveness comparison of methods for 3D similarity search. International Journal on Digital Libraries,
(2005).
9. B. Horn, Extended Gaussian Images. Proceeding of the IEEE, (1984). 72(12): 1671-1686.


10. S. Kang and K. Ikeuchi. Determining 3-D Object Pose Using The Complex Extended
Gaussian Image. in International Conference on Computer Vision and Pattern Recognition. (1991).
11. M. Ankerst, G. Kastenmuller, H. P. Kriegel, et al. 3D Shape Histograms for Similarity
Search and Classification in Spatial Databases. in International Symposium on Spatial
Databases. (1999).
12. D. Vranic: 3D Model Retrieval. Ph.D. Thesis, University of Leipzig (2004).
13. D. Y. Chen, X. P. Tian, and Y. T. Shen, On Visual Similarity Based 3D Model Retrieval.
Computer Graphics Forum (EUROGRAPHICS'03), (2003). 22(3): 223-232.
14. R. Osada, T. Funkhouser, B. Chazelle, et al. Matching 3D Models with Shape Distributions. in International Conference on Shape Modeling and Applications. (2001).
15. R. Ohbuchi, T. Minamitani, and T. Takei. Shape-Similarity Search of 3D Models by Using
Enhanced Shape Functions. in Theory and Practice of Computer Graphics. (2003).
16. T. Zaharia and F. Preteux. 3D Shape-based Retrieval within the MPEG-7 Framework. in
SPIE Conference on Nonlinear Image Processing and Pattern Analysis. (2001).
17. Xiang Pan, Yin Zhang, Sanyuan Zhang, et al., Radius-Normal Histogram and Hybrid
Strategy for 3D Shape Retrieval. International Conference on Shape Modeling and Applications, (2005): 374-379.
18. Ran Gal, Ariel Shamir, and Daniel Cohen-Or, Pose Oblivious Shape Signature. IEEE
Transactions of Visualization and Computer Graphics, (2005).
19. E. A. Karabassi, G. Papaioannou , and T. Theoharis, A Fast Depth-buffer-based Voxelization Algorithm. Journal of Graphics Tools, (1999). 4(4): 5-10.
20. S. Toledo, TAUCS: A Library of Sparse Linear Solvers. Tel-Aviv University, 2003.
http://www.tau.ac.il/~stoledo/taucs.
21. P. Shilane, K. Michael, M. Patrick, et al. The Princeton Shape Benchmark. in International
Conference on Shape Modeling. (2004).

Point-Sampled Surface Simulation Based on Mass-Spring System

Zhixun Su1,2, Xiaojie Zhou1, Xiuping Liu1, Fengshan Liu2, and Xiquan Shi2

1 Department of Applied Mathematics, Dalian University of Technology, Dalian 116024, P.R. China
zxsu@comgi.com, xiaojiezhou66@gmail.com, xpliu@comgi.com
2 Applied Mathematics Research Center, Delaware State University, Dover, DE 19901, USA
{fliu,xshi}@desu.edu

Abstract. In this paper, a physically based simulation model for point-sampled surfaces is proposed based on a mass-spring system. First, a Delaunay based simplification algorithm is applied to the original point-sampled surface to produce a simplified point-sampled surface. Then the mass-spring system for the simplified point-sampled surface is constructed by using tangent planes to address the lack of connectivity information. Finally, the deformed point-sampled surface is obtained by transferring the deformation of the simplified point-sampled surface. Experiments on both open and closed point-sampled surfaces illustrate the validity of the proposed method.

1 Introduction

Point based techniques have gained increasing attention in computer graphics. The main reason for this is that the rapid development of 3D scanning devices has facilitated the acquisition of point-sampled geometry. Since point-sampled objects neither have to store nor to maintain globally consistent topological information, they are more flexible than triangle meshes for handling highly complex or dynamically changing shapes. In point based graphics, point based modeling is a popular field [1,4,9,13,15,21], in which physically based modeling of point-sampled objects is still a challenging area.
Physically based modeling has been investigated extensively in the past two decades. Due to their simplicity and efficiency, mass-spring systems have been widely used to model soft objects in computer graphics, for example in cloth simulation. We introduce the mass-spring system to point-sampled surface simulation. A Delaunay based simplification algorithm is applied to the original point-sampled surface to produce the simplified point-sampled surface. By using the tangent plane and projection, the mass-spring system is constructed locally for the simplified point-sampled surface. Then the deformed point-sampled surface is obtained by transferring the deformation of the simplified point-sampled surface.
The remainder of the paper is organized as follows. Related work is introduced in Section 2. Section 3 explains the Delaunay based simplification algorithm.

Section 4 describes the simulation of the simplified point-sampled surface based on the mass-spring system. Section 5 introduces the transfer of displacements to the original point-sampled surface. Some experimental results are shown in Section 6. A brief discussion and conclusion is presented in Section 7.

2 Related Work

Point-sampled surfaces often consist of thousands or even millions of points sampled from an underlying surface. Reducing the complexity of such data is one of the key processing techniques. Alexa et al. [1] described the contribution value of a point by estimating its distance to the MLS surface defined by the other sample points, and the point with the smallest contribution is removed repeatedly. Pauly et al. [14] extended mesh simplification algorithms to point clouds and presented clustering, iterative and particle simulation simplification algorithms. Moenning et al. [12] devised a coarse-to-fine uniform or feature-sensitive simplification algorithm with a user-controlled density guarantee. We present a projection based simplification algorithm, which is more suitable for the construction of the mass-spring system.
Point based surface representation and editing are popular fields in point based graphics. Alexa et al. [1] presented the now standard MLS surface, in which the surface is defined as the stationary set of a projection operator. Later, Shachar et al. [4] proposed a robust moving least-squares fitting with sharp features for reconstructing a piecewise smooth surface from a potentially noisy point cloud. The displacement transfer in our method is similar to a moving least squares projection. Zwicker [21] presented the Pointshop3D system for interactive editing of point-based surfaces. Pauly et al. [15] introduced Boolean operations and free-form deformation of point-sampled geometry. Miao et al. [10] proposed a detail-preserving local editing method for point-sampled geometry based on the combination of normal geometric details and position geometric details. Xiao et al. [19,20] presented efficient filtering and morphing methods for point-sampled geometry. Since the pioneering work of Terzopoulos and his co-workers [18], significant research effort has been devoted to physically based modeling for meshes [5,16]. Recently, Guo and Qin et al. [2,6,7,8] proposed a framework for physically based morphing, animation and simulation. Müller et al. [13] presented point based animation of elastic, plastic and melting objects based on continuum mechanics. Clarenz et al. [3] proposed a framework for processing point-based surfaces via PDEs. In this paper, the mass-spring system is constructed directly for the simplified point-sampled surface. The idea of the present method is similar to [17]; they studied curve deformation, while we focus on point-sampled surface simulation.

3 Simplification of Point-Sampled Surface

The point-sampled surface consists of n points P = {p_i ∈ R^3, i = 1, ..., n} sampled from an underlying surface, either open or closed. Since the normal at any point can be estimated by the eigenvector of the covariance matrix that corresponds to the smallest eigenvalue [14], without loss of generality we can assume that the normal n_i at point p_i is known as input. Traditional simplification algorithms retain more sample points in regions of high frequency and fewer sample points in regions of low frequency, a property called adaptivity. However, adaptivity does not necessarily give good results for simulation. An example is shown in Fig. 1: 1a) shows the sine curve and the simplified curve, a force F is applied at the middle of the simplified curve, and 1b) shows that the simulation based on the simplified curve produces the wrong deformation. We present a Delaunay based simplification algorithm that is suitable for simulation and convenient for the construction of the mass-spring system.

a) Sine curve (solid line) and the simplified polyline (solid line)    b) Deformation of the simplified polyline under an applied force F

Fig. 1. The effect of simplification on the simulation

For p_i ∈ P, the index set of its k nearest points is denoted by N_i^k = {i_1, ..., i_k}. These points are projected onto the tangent plane at point p_i (the plane passing through p_i with normal n_i), and the corresponding projection points are denoted by q_i^j, j = 1, ..., k. A 2D Delaunay triangulation is computed on the k + 1 projection points. There are two possible cases: 1) p_i is not on the boundary of the surface, 2) p_i is on the boundary of the surface, as shown in Fig. 2. Suppose that there are m points {q_i^{j_r}, r = 1, ..., m} which are connected with p_i in the triangle mesh; the union of the triangles that contain p_i is denoted by R_i, and its diameter by d_i. In either case, if d_i is less than the user-defined threshold, p_i is removed. This process is repeated until the desired number of points is reached or the diameter d_i of each point exceeds the threshold. The resulting simplified point set is denoted by S = {s_j, j = 1, ..., n_s}, and s_j is called a simulation point. It is important to select a proper value of k: too small a k may impair the quality of the simplification, while too big a k increases the computational cost. In our experiments, a preferable k lies in the interval [10, 20]. A sketch of this per-point test is given below.
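A minimal sketch of the per-point test, using SciPy's k-d tree and 2D Delaunay triangulation, follows; the function name and the choice of tangent-plane basis are illustrative, not the authors' code.

```python
import numpy as np
from scipy.spatial import cKDTree, Delaunay

def region_diameter(points, normals, i, k=15):
    """Project the k nearest neighbours of points[i] onto its tangent plane,
    triangulate, and return the diameter d_i of the region R_i around it."""
    tree = cKDTree(points)
    _, idx = tree.query(points[i], k=k + 1)        # includes the point itself
    n = normals[i] / np.linalg.norm(normals[i])
    # local orthonormal basis (e1, e2) of the tangent plane at points[i]
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(n, a); e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    d = points[idx] - points[i]
    q = np.stack([d @ e1, d @ e2], axis=1)         # 2D projections; q[0] is p_i
    tri = Delaunay(q)
    # triangles incident to the projected centre point (index 0 in q)
    incident = tri.simplices[(tri.simplices == 0).any(axis=1)]
    verts = q[np.unique(incident)]                 # vertices of region R_i
    diffs = verts[:, None, :] - verts[None, :, :]
    return float(np.linalg.norm(diffs, axis=-1).max())
```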

4 Simulation Based on Mass-Spring System

4.1 Structure of the Springs

Since no explicit connectivity information is known for the simplified point-sampled surface, a traditional mass-spring system [16] cannot be applied directly. Here the stretching and bending springs are constructed based on the region R_i corresponding to s_i. For s_i ∈ S, the vertices of the region R_i are


a) Delaunay triangulation for case 1)

b) Delaunay triangulation for case 2)

Fig. 2. Delaunay triangulation of the projection points on the tangent plane

{q_i^{j_r}, r = 1, ..., m}, which are the projection points of {s_i^{j_r}, r = 1, ..., m}, and we assume that the q_i^{j_r} are sorted counter-clockwise. The stretching springs link s_i and s_i^{j_r}, and the bending springs connect s_i^{j_r} and s_i^{j_{r+2}} (Fig. 3). This process is applied to each point of S, and the structure of the springs is obtained consequently.

a) Stretching springs for case 1) and 2) b) Bending springs for case 1) and 2)
Fig. 3. The spring structures (dashed lines)

4.2 Estimation of the Mass

The mass of s_i is needed for the simulation. Note that in a region of low sampling density a simulation point s_i represents a large mass, whereas it represents a smaller mass in a region of higher sampling density. Since the area of the region R_i reflects the sampling density, the mass of s_i can be estimated by

m_i = \frac{1}{3}\, \rho\, S_{R_i}    (1)

where S_{R_i} is the area of region R_i and \rho is the mass density of the surface.
4.3 Forces

According to Hooke's law, the internal force F^s(S_{i,j}) of the spring S_{i,j} linking two mass points s_i and s_j can be written as

F^s(S_{i,j}) = k^s_{i,j} \left( \| I_{i,j} \| - l^0_{i,j} \right) \frac{I_{i,j}}{\| I_{i,j} \|}    (2)

where k^s_{i,j} is the stiffness of spring S_{i,j}, I_{i,j} = s_j - s_i, and l^0_{i,j} is the natural length of spring S_{i,j}.


In dynamic simulation, a damping force is often introduced to increase stability. In our context, the damping force is

F^d(S_{i,j}) = k^d_{i,j} (v_j - v_i)    (3)

where k^d_{i,j} is the damping coefficient, and v_j and v_i are the velocities of s_j and s_i.
Applying external forces to the mass-spring system yields realistic dynamics. The gravitational force acting on a mass point s_i is given by

F^g = m_i g    (4)

where m_i is the mass of the mass point s_i and g is the acceleration of gravity. A force that connects a mass point to a point r_0 in world coordinates is given by

F^r = k^r (r_0 - s_i)    (5)

where k^r is the spring constant. Similar to [18], other types of external forces, such as the effect of a viscous fluid, can be introduced into our system.
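A minimal sketch of how the internal and external forces of Eqs. (2)-(4) could be accumulated is given below; the array layout and function names are our own assumptions.

```python
import numpy as np

def spring_forces(x, v, springs, ks, kd, rest):
    """Accumulate Hooke (Eq. 2) and damping (Eq. 3) forces for each spring.
    x, v: (N,3) positions and velocities; springs: list of index pairs (i, j)."""
    F = np.zeros_like(x)
    for (i, j), k_s, k_d, l0 in zip(springs, ks, kd, rest):
        I = x[j] - x[i]
        L = np.linalg.norm(I)
        if L < 1e-12:
            continue
        f = k_s * (L - l0) * (I / L) + k_d * (v[j] - v[i])  # force acting on s_i
        F[i] += f
        F[j] -= f
    return F

def gravity(masses, g=np.array([0.0, 0.0, -9.81])):
    """Eq. (4): gravitational force on every mass point."""
    return masses[:, None] * g
```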
4.4 Simulation

The mass-spring system is governed by Newton's law. For a mass point s_i,

F_i = m_i a_i = m_i \frac{d^2 x_i}{dt^2}    (6)

where m_i, x_i and a_i are the mass, displacement and acceleration of s_i, respectively. A large number of integration schemes can be applied to Eq. (6). Explicit schemes are easy to implement and computationally cheap, but stable only for small time steps. In contrast, implicit schemes are unconditionally stable at the cost of computation and memory consumption. We use the explicit Euler scheme for simplicity in our system.
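A sketch of one explicit Euler step for Eq. (6) could look as follows (naming is illustrative):

```python
def euler_step(x, v, forces, masses, dt):
    """One explicit Euler step of Eq. (6): a_i = F_i / m_i."""
    a = forces / masses[:, None]
    x_new = x + dt * v            # positions from the current velocities
    v_new = v + dt * a            # velocities from the current accelerations
    return x_new, v_new
```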

5 Deformation of the Original Point-Sampled Surface

The deformation of the original point-sampled surface can be obtained from the deformation of the simplified point-sampled surface. Let us consider the x-component u of the displacement field u = (u, v, w). Similar to [13], we compute the displacement of p_i from the simulation points in its neighborhood. Since the simulation points are sampled from an underlying surface, a moving least squares fit may become singular due to coplanarity if used to compute the displacement directly. We therefore treat the tangent plane at p_i as the reference domain. The simulation points s_i^j, j = 1, ..., k in the neighborhood of p_i are projected onto the reference plane, with corresponding projection points q_i^j, j = 1, ..., k, and (\bar{x}_j, \bar{y}_j), j = 1, ..., k are the coordinates of q_i^j in the local coordinate system with origin p_i. Let the x-component u be given by

u(\bar{x}, \bar{y}) = a_0 + a_1 \bar{x} + a_2 \bar{y}    (7)


The parameters a_l, l = 0, 1, 2 are obtained by minimizing

E_i = \sum_{j=1}^{k} w(r_j) \left( u_j - a_0 - a_1 \bar{x}_j - a_2 \bar{y}_j \right)^2    (8)

where r_j is the distance between p_i and q_i^j, and w(\cdot) is a Gaussian weighting function w(r_j) = \exp(-r_j^2 / h^2). Then u_i = u(0, 0) = a_0. The components v and w are computed similarly. Since the shape of the point-sampled surface changes due to the displacements of the sample points, the normals of the underlying surface change as well; they can be recomputed by the covariance analysis mentioned above. The point sampling density also changes due to the deformation, so we use the resampling scheme of [1] to maintain the surface quality.
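A minimal sketch of this weighted least-squares transfer (Eqs. (7)-(8)) for one point p_i is given below; it assumes the tangent-plane coordinates of the neighbouring simulation points have already been computed.

```python
import numpy as np

def transfer_displacement(xy, u, h):
    """Weighted least-squares fit u(x,y) = a0 + a1*x + a2*y (Eqs. 7-8);
    xy: (k,2) tangent-plane coordinates of the neighbouring simulation points,
    u:  (k,)  their displacement components.  Returns u(0,0) = a0."""
    r2 = (xy ** 2).sum(axis=1)
    w = np.exp(-r2 / h ** 2)                  # Gaussian weights w(r_j)
    A = np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])
    W = np.sqrt(w)[:, None]
    a, *_ = np.linalg.lstsq(W * A, np.sqrt(w) * u, rcond=None)
    return a[0]                               # value of the fit at the origin p_i
```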

6 Experimental Results

We implemented the proposed method on a PC with a Pentium IV 2.0 GHz CPU and 512 MB RAM. Experiments were performed on both closed and open surfaces, as shown in Fig. 4. The sphere was downloaded from the PointShop3D website and is composed of 3203 surfels. For the modeling of the hat, the original point-sampled surface was sampled from the lower part of a sphere, and a stretching force acting on the middle of the point-sampled surface produces the hat. We also produced another interesting example, the logo CGGM 07; both examples are produced by applying forces to the point-sampled surfaces (Fig. 5). The simplification and the construction of the mass-spring system can be performed as preprocessing, and the simplified surface contains far fewer simulation points than the original point-sampled surface, so the simulation is very efficient. The performance of the simulation is reported in Table 1. The main computational cost is the transfer of the displacement from the simplified surface to the original point-sampled surface and the normal computation of the deformed point-sampled surface. Compared to the global parameterization in [7], the local construction of the mass-spring system makes the simulation more efficient. The continuum-based method [13] models volumetric objects, while our method can deal with both volumetric objects, via their boundary surfaces, and sheet-like objects.
Table 1. The simulation time

Number of simulation points     85    273   327
Simulation time per step (s)    0.13  0.25  0.32


a) The deformation of a sphere


b) The modeling of a hat

Fig. 4. Examples of our method

Fig. 5. The CGGM 07 logo

7 Conclusion

As an easily implemented physically based method, mass-spring systems have been investigated deeply and used widely in computer graphics. However, they could not be used for point-sampled surfaces due to the lack of connectivity information and the difficulty of constructing the mass-spring system. We solve the problem of constructing a mass-spring system for a point-sampled surface based on projection and present a novel mass-spring based simulation method for point-sampled surfaces. A Delaunay based simplification algorithm facilitates the construction of the mass-spring system and ensures the efficiency of the simulation method. Further study will focus on simulation with adaptive topology. The automatic determination of the simplification threshold should also be investigated, to ensure a suitable tradeoff between accuracy and efficiency.

Acknowledgement
This work is supported by the Program for New Century Excellent Talents in
University grant (No. NCET-05-0275), NSFC (No. 60673006) and an INBRE
grant (5P20RR01647206) from NIH, USA.

References
1. Alexa M., Behr J., Cohen-Or D., Fleishman S., Levin D., Silva C. T.: Computing
and rendering point set surfaces. IEEE Transactions on Visualization and Computer Graphics 9(2003) 3-15
2. Bao Y., Guo X., Qin H.: Physically-based morphing of point-sampled surfaces.
Computer Animation and Virtual Worlds 16 (2005) 509 - 518

40

Z. Su et al.

3. Clarenz U., Rumpf M., Telea A.: Finite elements on point based surfaces, Proceedings of Symposium on Point-Based Graphics (2004)
4. Fleishman S., Cohen-Or D., Silva C. T.: Robust moving least-squares fitting with sharp features. ACM Transactions on Graphics 24 (2005) 544-552
5. Gibson S.F., Mirtich B.: A survey of deformable models in computer graphics.
Technical Report TR-97-19, MERL, Cambridge, MA, (1997)
6. Guo X., Hua J., Qin H.: Scalar-function-driven editing on point set surfaces. IEEE
Computer Graphics and Applications 24 (2004) 43 - 52
7. Guo X., Li X., Bao Y., Gu X., Qin H.: Meshless thin-shell simulation based on
global conformal parameterization. IEEE Transactions on Visualization and Computer Graphics 12 (2006) 375-385
8. Guo X., Qin H.: Real-time meshless deformation. Computer Animation and Virtual
Worlds 16 (2005) 189 - 200
9. Kobbelt L., Botsch M.: A survey of point-based techniques in computer graphics.
Computer & Graphics, 28 (2004) 801-814
10. Miao Y., Feng J., Xiao C., Li H., Peng Q.: Detail-preserving local editing for point-sampled geometry. H.-P. Seidel, T. Nishita, Q. Peng (Eds), CGI 2006, LNCS 4035 (2006) 673-681
11. Miao L., Huang J., Zheng W., Bao H. Peng Q.: Local geometry reconstruction
and ray tracing for point models. Journal of Computer-Aided Design & Computer
Graphics 18 (2006) 805-811
12. Moenning C., Dodgson N. A.: A new point cloud simplification algorithm. Proceedings 3rd IASTED Conference on Visualization, Imaging and Image Processing, Benalmádena, Spain (2003) 1027-1033
13. Müller M., Keiser R., Nealen A., Pauly M., Pauly M., Gross M., Alexa M.: Point based animation of elastic, plastic and melting objects, Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2004) 141-151
14. Pauly M., Gross M., Kobbelt L.: Efficient simplification of point-sampled surfaces. Proceedings of IEEE Visualization (2002) 163-170
15. Pauly M., Keiser R., Kobbelt L., Gross M.: Shape modeling with point-sampled
geometry. ACM Transactions on Graphics 22(2003) 641-650
16. Provot X.: Deformation constraints in a mass-spring model to describe rigid cloth
behavior. Proc of Graphics Interface (1995) 147-154.
17. Su Z., Li L., Zhou X.: Arc-length preserving curve deformation based on subdivision. Journal of Computational and Applied Mathematics 195 (2006) 172-181
18. Terzopoulos D., Platt J., Barr A., Fleischer K.: Elastically deformable models.
Proc. SIGGRAPH (1987) 205-214
19. Xiao C., Miao Y., Liu S., Peng Q.: A dynamic balanced flow for filtering point-sampled geometry. The Visual Computer 22 (2006) 210-219
20. Xiao C., Zheng W., Peng Q., Forrest A. R., Robust morphing of point-sampled
geometry. Computer Animation and Virtual Worlds 15 (2004) 201-210
21. Zwicker M., Pauly M., Knoll O., Gross M.: Pointshop3d: An interactive system for
point-based surface editing. ACM Transactions on Graphics 21(2002) 322- 329

Sweeping Surface Generated by a Class of Generalized Quasi-cubic Interpolation Spline

Benyue Su1,2 and Jieqing Tan1

1 Institute of Applied Mathematics, Hefei University of Technology, Hefei 230009, China
2 Department of Mathematics, Anqing Teachers College, Anqing 246011, China
bysu1219@yahoo.com.cn

Abstract. In this paper we present a new method for modeling interpolation sweep surfaces with a C^2-continuous generalized quasi-cubic interpolation spline. Once some key positions and orientations and some points to be passed through by the spine and initial cross-section curves are given, the corresponding sweep surface can be constructed with the introduced spline function, without computing control points inversely as in the B-spline and Bezier methods, and without solving an equation system as in the case of the cubic polynomial interpolation spline. A local control technique is also proposed for sweep surfaces using a scaling function, which allows the user to change the shape of an object intuitively and effectively. On the basis of these results, some examples are given to show how the method is used to model interesting surfaces.

1 Introduction

Sweeping is a powerful technique for generating surfaces in CAD/CAM, robot motion design, NC machining, etc. There has been abundant research on the modeling of sweeping surfaces and their applications. Hu and Ling ([2], 1996) considered the swept volume of a moving object, which can be constructed from the envelope surfaces of its boundary; in their study, these envelope surfaces are the collections of the characteristic curves of the natural quadric surfaces. Wang and Joe ([13], 1997) presented sweep surface modeling by approximating a rotation minimizing frame; the advantages of this method lie in the robust computation and the smoothness along the spine curves. Jüttler and Maurer ([5], 1999) constructed rational representations of sweeping surfaces with the help


This work was completed with the support by the National Natural Science Foundation of China under Grant No. 10171026 and No. 60473114, and in part by the Research Funds for Young Innovation Group, Education Department of Anhui Province
under Grant No. 2005TD03, and the Anhui Provincial Natural Science Foundation
under Grant No. 070416273X, and the Natural Science Foundation of Anhui Provincial Education Department under Grant No. 2006KJ252B, and the Funds for Science
& Technology Innovation of the Science & Technology Department of Anqing City
under Grant No. 2003-48.




of the associated rational frames of PH cubic curves and presented sufficient conditions ensuring G^1 continuity of the sweeping surfaces. Schmidt and Wyvill ([9], 2005) presented a technique for generating implicit sweep objects that supports direct specification and manipulation of the surface with no topological limitations on the 2D sweep template. Seong, Kim et al. ([10], 2006) presented an efficient and robust algorithm for computing the perspective silhouette of the boundary of a general swept volume. In computer graphics, many advanced techniques using sweeping surfaces ([1], [3], [4]) have been applied to deformation, NC simulation, motion tracing and animation, including human body modeling and cartoon animation. Yoon and Kim ([14], 2006) proposed an approach to freeform deformation (FFD) using sweeping surfaces, where a 3D object is approximated with sweep surfaces and shape deformations are easy to control with a small number of sweep parameters. Li, Ge and Wang ([6], 2006) introduced a sweeping function and applied it to surface deformation and modeling, where the surface can be pulled or pushed along a trajectory curve.
In the process of constructing a sweep surface, the hard work in modeling is to present simple objects and refine them towards the desired shapes, where the construction of the spine and cross-section curves and the design of the moving frame ([8]) are very important. The Frenet frame, the generalized translation frame and the rotation-minimizing frame, et al. ([4], [5], [7], [13], [14]), can all be applied to solve these problems.
In general, the spine curve can be represented by Bezier and B-spline methods. These, however, make it difficult to calculate the data points conversely in order to interpolate given points. The main contribution of this paper is the development of a new method based on a class of generalized quasi-cubic interpolation spline. This approach has the following features:
The spine and cross-section curves are C^2 continuous and pass through points given by the user, without calculating the control points conversely as in the B-spline and Bezier methods and without solving an equation system as in the case of the cubic polynomial interpolation spline.
A local control technique is proposed via the defined spline. It is implemented flexibly and effectively in computer-human interaction.
The moving frame is smooth and can be established in association with the spine curve uniformly using our method.
The rest of this paper is organized as follows: A C^2-continuous generalized quasi-cubic interpolation spline is introduced in Sect. 2. We present a new method for sweep surface modeling by the generalized quasi-cubic interpolation spline in Sect. 3. Some examples of shape modeling by the introduced method are given in Sect. 4. Finally, we conclude the paper in Sect. 5.


2 C^2-Continuous Generalized Quasi-cubic Interpolation Spline

Definition 1. [11] Let b_0, b_1, b_2, \ldots, b_{n+2}, (n \geq 1), be given control points. Then a generalized quasi-cubic piecewise interpolation spline curve is defined as

p_i(t) = \sum_{j=0}^{3} B_{j,3}(t)\, b_{i+j}, \quad t \in [0, 1], \; i = 0, 1, \ldots, n-1,    (1)

where the four blending functions

B_{j,3}(t), \quad j = 0, 1, 2, 3,    (2)

are trigonometric polynomials in t, \sin\frac{\pi t}{2}, \cos\frac{\pi t}{2}, \sin\pi t and \cos\pi t with constant rational coefficients such as 1/4, 1/2 and 3/4; their explicit expressions are given in [11].
From (2) we know that B_{i,3}(t), (i = 0, 1, 2, 3), possess properties similar to those of the B-spline basis functions except for the positivity property. Moreover, p_i(t) interpolates the points b_{i+1} and b_{i+2}. That is,

p_i(0) = b_{i+1}, \quad p_i(1) = b_{i+2},    (3)

From (1) and (2), we can also get

p_i'(0) = (\sqrt{2} - 1)(b_{i+2} - b_i), \quad p_i'(1) = (\sqrt{2} - 1)(b_{i+3} - b_{i+1}),
p_i''(0) = \frac{\sqrt{2}}{4}(b_i - 2 b_{i+1} + b_{i+2}), \quad p_i''(1) = \frac{\sqrt{2}}{4}(b_{i+1} - 2 b_{i+2} + b_{i+3}).    (4)

So

p_i(1) = p_{i+1}(0), \quad p_i^{(l)}(1) = p_{i+1}^{(l)}(0), \quad l = 1, 2, \; i = 0, 1, \ldots, n-2.    (5)

Therefore, the continuity of the quasi-cubic piecewise interpolation spline curves is established up to second derivatives. Besides this property, the quasi-cubic piecewise interpolation spline curves also possess symmetry, geometric invariability and other properties; the details can be found in our other paper ([11]).

3 Sweep Surface Modeling

Given a spine curve P(t) in space and a cross-section curve C(\theta), a sweep surface W(t, \theta) can be generated by

W(t, \theta) = P(t) + R(t)\,(s(t) \otimes C(\theta)),    (6)

where P(t) is the spine curve, R(t) is an orthogonal matrix representing a moving frame along P(t), and s(t) is a scaling function. Geometrically, the sweep surface W(t, \theta)


is generated by sweeping C(\theta) along P(t) with the moving frame R(t). The cross-section curve C(\theta) lies in 2D or 3D space and passes through the spine curve P(t) during sweeping.
So the key problems in sweep surface generation are to construct the spine and cross-section curves P(t), C(\theta) and to determine the moving frame R(t).
Given initial cross-sections C_j(\theta) moving along a spine curve P_i(t), each given position is associated with a local transformation R_i(t) acting on C_j(\theta). The sweep surface is generated by interpolating these key cross-sections at the special positions given by the user:

W_{i,j}(t, \theta) = P_i(t) + R_i(t)\,(s_i(t) \otimes C_j(\theta))
             = \begin{pmatrix} x_i(t) \\ y_i(t) \\ z_i(t) \end{pmatrix}
             + \begin{pmatrix} r_{11,i}(t) & r_{12,i}(t) & r_{13,i}(t) \\ r_{21,i}(t) & r_{22,i}(t) & r_{23,i}(t) \\ r_{31,i}(t) & r_{32,i}(t) & r_{33,i}(t) \end{pmatrix}
               \begin{pmatrix} s_{x_i}(t)\, C_{x_j}(\theta) \\ s_{y_i}(t)\, C_{y_j}(\theta) \\ 0 \end{pmatrix},    (7)

where s(t) is the scaling function, which can be used to change the shapes of the cross-sections to achieve local deformations.
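Equation (7) is a pointwise evaluation; a generic sketch that samples W(t, θ) on a grid, with P, R, s and C supplied as callables (their construction from the interpolation spline is described in the next subsection), could look as follows. The function name and calling convention are our assumptions.

```python
import numpy as np

def sweep_surface(P, R, s, C, ts, thetas):
    """Sample W(t, theta) = P(t) + R(t) @ (s(t) * C(theta))  (Eqs. 6-7).
    P(t)->(3,), R(t)->(3,3) orthogonal, s(t)->(3,) scaling, C(theta)->(3,)."""
    W = np.empty((len(ts), len(thetas), 3))
    for a, t in enumerate(ts):
        p, r, sc = P(t), R(t), s(t)
        for b, th in enumerate(thetas):
            # each row of W is one transformed copy of the cross-section
            W[a, b] = p + r @ (sc * C(th))
    return W
```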
3.1 The Construction of Spine and Cross-Section Curves

From the above discussion, we know that once some places (points) that the cross-sections will pass through are given, a spine curve can be constructed to interpolate these points as follows:

P_i(t) = (x_i(t), y_i(t), z_i(t))^T = \sum_{j=0}^{3} B_{j,3}(t)\, b_{i+j}, \quad t \in [0, 1], \; i = 0, 1, \ldots, n-1,    (8)

where b_i, i = 0, 1, \ldots, n+2, (n \geq 1), are points (positions) given by the user, and B_{j,3}(t), (j = 0, 1, 2, 3), are the generalized quasi-cubic piecewise interpolation spline basis functions.
Similarly, if the explicit expressions of the cross-section curves are unknown in advance but the cross-section curves are known to pass through some given points, then we can define the cross-section curves by

C_j(\theta) = (C_{x_j}(\theta), C_{y_j}(\theta), 0)^T = \sum_{k=0}^{3} B_{k,3}(\theta)\, q_{j+k}, \quad \theta \in [0, 1], \; j = 0, 1, \ldots, m-1,    (9)

where q_j, j = 0, 1, \ldots, m+2, (m \geq 1), are points (positions) given by the user.
In order to improve the flexibility and local deformation of the interpolation sweeping surfaces, we introduce scaling functions defined by

s_i(t) = (s_{x_i}(t), s_{y_i}(t), 0)^T = \sum_{j=0}^{3} B_{j,3}(t)\, s_{i+j}, \quad t \in [0, 1], \; i = 0, 1, \ldots, n-1,    (10)

where s_i = (\bar{s}_i, \tilde{s}_i, 0)^T, i = 0, 1, \ldots, n+2, (n \geq 1). The \bar{s}_i and \tilde{s}_i are n+3 nonnegative real numbers respectively, called scaling factors, and B_{j,3}(t), (j = 0, 1, 2, 3), are the generalized quasi-cubic piecewise interpolation spline basis functions.
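Since Eqs. (8)-(10) all share the same form, a single helper can build any of the three curves from its control points once the blending functions of Eq. (2) are available as a callable; the sketch below is illustrative and takes those blending functions as an input rather than implementing them.

```python
import numpy as np

def piecewise_curve(points, blend):
    """Return a callable curve built as in Eq. (8)/(9)/(10):
    segment i interpolates points[i+1] and points[i+2];
    blend(t) must return the four values B_{0,3}(t) .. B_{3,3}(t)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts) - 3                      # number of segments
    def curve(u):                         # global parameter u in [0, n]
        i = min(int(u), n - 1)
        t = u - i
        B = np.asarray(blend(t))
        return B @ pts[i:i + 4]
    return curve
```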

3.2 The Moving Frame

In order to interpolate the special orientations of the key cross-sections, we need to find a proper orthogonal matrix sequence R(t) as a series of moving frames, such that R(t) interpolates the given orthogonal matrices at the times t = t_i. The interpolation problem therefore lies in R(t_i) = R_i, where R_i are the given orthogonal matrices at t = t_i.
For the given positions of the moving frames (P_i, R_{x_i}, R_{y_i}, R_{z_i}), i = 0, 1, \ldots, n-1, we interpolate the translation parts P_i by the generalized quasi-cubic interpolation spline introduced in the above section, and we can also interpolate the three orthogonal coordinate axes (R_{x_i}, R_{y_i}, R_{z_i}) in the same way by the generalized quasi-cubic interpolation spline (Fig. 1(a)). Namely,

R_i(t) = (R_{x_i}(t), R_{y_i}(t), R_{z_i}(t))^T = \sum_{j=0}^{3} B_{j,3}(t)\,(R_{x_{i+j}}, R_{y_{i+j}}, R_{z_{i+j}})^T, \quad t \in [0, 1], \; i = 0, 1, \ldots, n-1,    (11)

Fig. 1. (a) The moving frames at different positions; the dashed line is the spine curve. (b) The sweep surface associated with an open cross-section curve.

Notes and Comments. Since (R_{x_i}(t), R_{y_i}(t), R_{z_i}(t)) defined by Eq. (11) usually does not form an exact orthogonal coordinate system at t \neq t_i, we renew it by Schmidt orthogonalization or by an approximation of the orthogonal frame with a controllable error. We can also convert the corresponding orthogonal matrices into quaternion form, interpolate these quaternions by (11) in the same way, and finally obtain the exact orthogonal coordinate system by the inverse conversion.
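A minimal sketch of the Schmidt (Gram-Schmidt) re-orthogonalization mentioned above, applied to one interpolated frame, could look as follows; completing the third axis by a cross product is our design choice.

```python
import numpy as np

def reorthogonalize(Rx, Ry, Rz):
    """Gram-Schmidt step that turns three interpolated (nearly orthogonal)
    axis vectors back into an exact orthonormal frame."""
    ex = Rx / np.linalg.norm(Rx)
    ey = Ry - (Ry @ ex) * ex
    ey /= np.linalg.norm(ey)
    ez = np.cross(ex, ey)                 # completes a right-handed frame
    return np.column_stack([ex, ey, ez])  # orthogonal matrix R(t)
```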
From (7), (8) and (11), we know that for fixed \theta = \theta^*,

W_{i,j}(t, \theta^*) = \sum_{k=0}^{3} B_{k,3}(t)\,\big(b_{i+k} + R_{i+k}(s_{i+k} \otimes \bar{q}_j)\big),    (12)

where \bar{q}_j = q_j(\theta^*), and for fixed t = t^*,


W_{i,j}(t^*, \theta) = \hat{P}_i + \hat{R}_i \sum_{k=0}^{3} B_{k,3}(\theta)\,(\hat{s}_i \otimes q_{j+k}),    (13)

where \hat{P}_i = P_i(t^*), \hat{R}_i = R_i(t^*) and \hat{s}_i = s_i(t^*).
Since the q_j are constant vectors, we find that the curves W_{i,j}(t, \theta^*) are C^2-continuous, and the points on the curves W_{i,j}(t, \theta^*) can be obtained by applying stretching, rotation and translation transformations to the point \bar{q}_j.
The cross-section curve W_{i,j}(t^*, \theta) at t = t^* can likewise be obtained by applying stretching, rotation and translation transformations to the initial cross-section curve C_j(\theta).
Moreover, by computing the first and second partial derivatives of W_{i,j}(t, \theta), we get

\frac{\partial^l W_{i,j}(t, \theta)}{\partial t^l} = P_i^{(l)}(t) + \frac{d^l}{dt^l}\big(R_i(t)(s_i(t) \otimes C_j(\theta))\big), \qquad
\frac{\partial^l W_{i,j}(t, \theta)}{\partial \theta^l} = R_i(t)\big(s_i(t) \otimes C_j^{(l)}(\theta)\big), \quad l = 1, 2.    (14)

Then W_{i,j}(t, \theta) is C^2-continuous with respect to t and \theta by (5) and (14).

4 The Modeling Examples

Example 1. The interpolation points of the spine curve are given by b_0 = (0, 0, 1), b_1 = (0, 0, 1), b_2 = (1, 0, 2.5), b_3 = (2, 0, 3), b_4 = (3, 0, 3), b_5 = (4, 0, 2) and b_6 = (4, 0, 2). Suppose the initial cross-section curve passes through the points (\cos\frac{(i-1)\pi}{6}, \sin\frac{(i-1)\pi}{6}), i = 1, 2, \ldots, 13. The rotation angles at the four key positions are 0, \pi/3, \pi/2 and 2\pi/3, respectively. The scaling factors are selected as \bar{s}_i = \tilde{s}_i \equiv 1. Then we obtain the sweeping interpolation surface shown in Fig. 2(a) and Fig. 3.

Fig. 2. The four key positions of the cross-section curve during sweeping. (a) is the figure in Example 1 and (b) is the figure in Example 2.

Example 2. The interpolation points of the spine curve are given by b_0 = (0, 0, 0), b_1 = (0, 0, 1), b_2 = (2, 0, 2.5), b_3 = (4, 0, 3), b_4 = (6, 0, 3), b_5 = (8, 0, 2) and b_6 = (8, 0, 2). The initial cross-section curve interpolates the points (\cos\frac{(i-1)\pi}{6}, \sin\frac{(i-1)\pi}{6}), i = 1, 2, \ldots, 13. The rotation angles at the four key positions are 0, \pi/6, \pi/4 and \pi/2, respectively. The scaling factors are chosen to be \bar{s}_i = \tilde{s}_i = \{1.4, 1.2, 1, 0.8, 0.6, 0.4, 0.2\}. Then we obtain the sweeping interpolation surface shown in Fig. 2(b) and Fig. 4.

Fig. 3. The sweep surface modeled in Example 1; (b) is the section plane of figure (a)

Fig. 4. The sweep surface modeled in Example 2; (b) is the section plane of figure (a)

Example 3. The interpolation points of the spine curve and the rotation angles are the same as in Example 2. The initial cross-section curve interpolates the points q_0 = (3, 1), q_1 = (2, 2), q_2 = (1, 1), q_3 = (1, 2), q_4 = (2, 1), q_5 = (3, 2). The scaling factors are chosen to be \bar{s}_i = \tilde{s}_i \equiv 1. Then we obtain the sweeping interpolation surface with an open cross-section curve, as shown in Fig. 1(b).

5 Conclusions and Discussions

As described above, we have presented a new method for constructing interpolation sweep surfaces with the C^2-continuous generalized quasi-cubic interpolation spline. Once some key positions and orientations and some points to be passed through by the spine and initial cross-section curves are given, we can construct the corresponding sweep surface with the introduced spline function. We have also proposed a local control technique for sweep surfaces using a scaling function, which allows the user to change the shape of an object intuitively and effectively.
Note that, in many other applications of sweep surfaces, the cross-section curves are defined on circular arcs or spherical surfaces, etc. In such cases we can construct the cross-section curves by the circular trigonometric Hermite interpolation spline introduced in our other paper ([12]).
On the other hand, in order to avoid a sharp acceleration of the moving frame, we can use the chord length parametrization in the generalized quasi-cubic interpolation spline.


In future work, we will investigate real-time applications of surface modeling based on the sweep method and the interactive feasibility of controlling the shape of freeform 3D objects.

References
1. Du, S. J., Surmann, T., Webber, O., Weinert, K.: Formulating swept profiles for five-axis tool motions. International Journal of Machine Tools & Manufacture 45 (2005) 849-861
2. Hu, Z. J., Ling, Z. K.: Swept volumes generated by the natural quadric surfaces. Comput. & Graphics 20 (1996) 263-274
3. Hua, J., Qin, H.: Free form deformations via sketching and manipulating the scalar fields. In: Proc. of the ACM Symposium on Solid Modeling and Application, 2003, pp 328-333
4. Hyun, D. E., Yoon, S. H., Kim, M. S., Jüttler, B.: Modeling and deformation of arms and legs based on ellipsoidal sweeping. In: Proc. of the 11th Pacific Conference on Computer Graphics and Applications (PG 2003), 2003, pp 204-212
5. Jüttler, B., Maurer, C.: Cubic Pythagorean hodograph spline curves and applications to sweep surface modeling. Computer-Aided Design 31 (1999) 73-83
6. Li, C. J., Ge, W. B., Wang, G. P.: Dynamic surface deformation and modeling using rubber sweepers. Lecture Notes in Computer Science 3942 (2006) 951-961
7. Ma, L. Z., Jiang, Z. D., Chan, Tony K.Y.: Interpolating and approximating moving frames using B-splines. In: Proc. of the 8th Pacific Conference on Computer Graphics and Applications (PG 2000), 2000, pp 154-164
8. Olver, P. J.: Moving frames. Journal of Symbolic Computation 36 (2003) 501-512
9. Schmidt, R., Wyvill, B.: Generalized sweep templates for implicit modeling. In: Proc. of the 3rd International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia, 2005, pp 187-196
10. Seong, J. K., Kim, K. J., Kim, M. S., Elber, G.: Perspective silhouette of a general swept volume. The Visual Computer 22 (2006) 109-116
11. Su, B. Y., Tan, J. Q.: A family of quasi-cubic blended splines and applications. J. Zhejiang Univ. SCIENCE A 7 (2006) 1550-1560
12. Su, B. Y., Tan, J. Q.: Geometric modeling for interpolation surfaces based on blended coordinate system. Lecture Notes in Computer Science 4270 (2006) 222-231
13. Wang, W. P., Joe, B.: Robust computation of the rotation minimizing frame for sweep surface modeling. Computer-Aided Design 23 (1997) 379-391
14. Yoon, S. H., Kim, M. S.: Sweep-based Freeform Deformations. Computer Graphics Forum (Eurographics 2006) 25 (2006) 487-496

An Artificial Immune System Approach for B-Spline Surface Approximation Problem

Erkan Ülker1 and Veysi İşler2

1 Selçuk University, Department of Computer Engineering, 42075 Konya, Turkey
eulker@selcuk.edu.tr
2 Middle East Technical University, Department of Computer Engineering, 06531 Ankara, Turkey
isler@ceng.metu.edu.tr

Abstract. In surface fitting problems, the selection of knots in order to get an optimized surface for a shape design is a well-known problem. For large data, it needs to be dealt with by optimization algorithms that avoid possible local optima and at the same time reach the desired solution in an iterative fashion. Many computational intelligence optimization techniques, such as evolutionary optimization algorithms, artificial neural networks and fuzzy logic, have already been successfully applied to the problem. This paper presents an application of another computational intelligence technique known as Artificial Immune Systems (AIS) to the surface fitting problem based on B-Splines. Our method can determine the appropriate number and locations of knots automatically and simultaneously. Numerical examples are given to show the effectiveness of our method. Additionally, a comparison between the proposed method and the genetic algorithm is presented.

1 Introduction
Since B-spline curve fitting for noisy or scattered data can be considered a non-linear optimization problem with a high level of computational complexity [3, 4, 6], non-deterministic optimization strategies should be employed. Here, methods taken from computational intelligence offer promising results for the solution of this problem. By computational intelligence techniques, as used in this paper, we mean strategies inspired by numerically based Artificial Intelligence systems such as evolutionary algorithms and neural networks. One of the most conspicuous and promising approaches to this problem is based on neural networks. Previous studies mostly focused on traditional surface approximation [1]; the first application of neural networks to this field appeared in [15]. Later on, studies using Kohonen networks [8, 9, 12], Self-Organizing Maps [13, 14] and functional networks [5, 7, 10] extended the study of surface design. Evolutionary algorithms are based on natural selection for multi-objective optimization. Most of the evolutionary optimization techniques, such as the Genetic Algorithm (GA) [3, 6, 17], Simulated Annealing [16] and Simulated Evolution [17, 18, 19], have been applied to this problem successfully.

This paper presents the application of one of the computational intelligence techniques, called the Artificial Immune System (AIS), to the surface fitting problem using B-Splines. Individuals are formed by treating knot placement candidates as antibodies, and the continuous problem is solved as a discrete problem as in [3] and [6]. Using the Akaike Information Criterion (AIC), an affinity criterion is defined and the search proceeds from good candidate models towards the best model in each generation. The proposed method can determine the placement and number of knots automatically and simultaneously. In this paper, numerical examples are given to show the effectiveness of the proposed method. Moreover, a comparison between the proposed method and the genetic algorithm is presented.

2 B-Spline Surface Approximation

In mathematical terms, geometry fitting can be formulated as the minimization of the fitting error under some accuracy constraints. A typical error measure for parametric surface fitting is

    Q_2 = Σ_{i=1}^{N_x} Σ_{j=1}^{N_y} w_{i,j} { S(x_i, y_j) − F_{i,j} }^2                (1)

Surface fitting from sample points is also known as surface reconstruction. This
paper applies a local fitting and blending approach to this problem. The readers can
refer to [1] and [2] for details. A B-Spline curve, C (u), is a vector valued function
which can be expressed as:
    C(u) = Σ_{i=0}^{m} N_{i,k}(u) P_i ,   u ∈ [u_{k−1}, u_{m+1}]                         (2)

where P_i are the control points (vectors) and N_{i,k} are the normalized B-spline basis functions of order k, defined by the recursion

    N_{i,1}(u) = 1  if u ∈ [u_i, u_{i+1}),  and 0 otherwise,

    N_{i,k}(u) = (u − u_i)/(u_{i+k−1} − u_i) · N_{i,k−1}(u)
               + (u_{i+k} − u)/(u_{i+k} − u_{i+1}) · N_{i+1,k−1}(u)                      (3)

where the u_i are knots forming the knot vector U = {u_0, u_1, ..., u_m}. A B-Spline surface is defined as

    S(u, v) = Σ_{i=0}^{m} Σ_{j=0}^{n} N_{i,k}(u) N_{j,l}(v) P_{i,j}                      (4)

As can be seen from the above equations, a B-spline surface is uniquely determined by its degree, knot values and control points, and the surface is formed by these parameters. In surface reconstruction problems the input is a set of unorganized points, so the degree of the surface, the knots and the control points are all unknown. In equation (3) the knots appear not only in the numerators but also in the denominators of the fractions, so a spline surface given as in equation (4) is a nonlinear function of the knots. Assume that


the data to be fitted are given on the mesh points of a rectangular domain D = [a,b] × [c,d] in the x-y plane. Then the following expression can be written [3]:

    F_{i,j} = f(x_i, y_j) + ε_{i,j} ,   (i = 1, 2, ..., N_x ;  j = 1, 2, ..., N_y).      (5)

In this equation, f(x,y) is the underlying function of the data, N_x and N_y are the numbers of data points in the x and y directions, respectively, and ε_{i,j} is a measurement error. Equation (4) is fitted to the given data by least squares using equation (5). The parameterization of the B-spline curve in Equation (2) and of the surface in Equation (4) must be produced by one of the uniform, chordal or centripetal parameterization methods. Then the sum of squared residuals is calculated by Equation (1); the lower index of Q_2 denotes the dimension of the data. The objective to be minimized in the B-spline surface fitting problem is the function in Equation (1), and the variables of the objective function are the B-spline coefficients and the interior knots. The B-spline coefficients are linear parameters, whereas the interior knots are nonlinear parameters, since S(u,v) is a nonlinear function of the knots. This minimization problem is known as a multi-modal optimization problem [4].
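
To make these definitions concrete, the following Python sketch (an illustration under stated assumptions, not the authors' implementation) evaluates the Cox–de Boor recursion of equation (3), the tensor-product surface of equation (4) with scalar height coefficients standing in for the control points, and the residual Q_2 of equation (1); the function and variable names are chosen for the example only.

    import numpy as np

    def bspline_basis(i, k, u, knots):
        # Order-k basis N_{i,k}(u), the recursion of equation (3).
        if k == 1:
            return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
        val = 0.0
        d1 = knots[i + k - 1] - knots[i]
        d2 = knots[i + k] - knots[i + 1]
        if d1 > 0.0:
            val += (u - knots[i]) / d1 * bspline_basis(i, k - 1, u, knots)
        if d2 > 0.0:
            val += (knots[i + k] - u) / d2 * bspline_basis(i + 1, k - 1, u, knots)
        return val

    def surface_height(u, v, coeff, ku, kv, knots_u, knots_v):
        # Tensor-product surface of equation (4); coeff[i, j] plays the role of P_{i,j}.
        s = 0.0
        for i in range(coeff.shape[0]):
            nu = bspline_basis(i, ku, u, knots_u)
            if nu == 0.0:
                continue
            for j in range(coeff.shape[1]):
                s += nu * bspline_basis(j, kv, v, knots_v) * coeff[i, j]
        return s

    def q2(coeff, data, grid_u, grid_v, ku, kv, knots_u, knots_v, w=1.0):
        # Sum of squared residuals of equation (1) over the N_x x N_y grid.
        err = 0.0
        for a, u in enumerate(grid_u):
            for b, v in enumerate(grid_v):
                err += w * (surface_height(u, v, coeff, ku, kv, knots_u, knots_v)
                            - data[a, b]) ** 2
        return err

In practice the linear coefficients coeff would be obtained by linear least squares for a fixed set of interior knots; that inner problem is exactly what the immune algorithm of Section 3 wraps its knot search around.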

3 B-Spline Surface Approximation by Artificial Immune System


AIS emerged in the 1990s as a new approach that combines a variety of biologically based computational methods such as Artificial Neural Networks and Artificial Life. AIS have been used in very diverse areas such as classification, learning, optimization, robotics and computer security [11]. The following components are needed to construct an AIS: (i) a representation of the system parts, (ii) a mechanism to compute the interaction of the system parts with each other and with the environment, and (iii) adaptation procedures. Different methods have been employed for each of these components in the algorithms developed so far. We decided that the Clonal Selection algorithm was the best suited to the purpose of our study.
A distance criterion is used as the measure of the degree of mutual interaction between Antigen and Antibody. If an antibody and an antigen are represented as Ab = <Ab_1, Ab_2, ..., Ab_L> and Ag = <Ag_1, Ag_2, ..., Ag_L>, respectively, the Euclidean distance between Ab and Ag is calculated as

    D = sqrt( Σ_{i=1}^{L} (Ab_i − Ag_i)^2 )                                              (6)


The B-spline surface fitting problem is to find a B-spline surface that approximates a target surface within a certain tolerance. Assume that the object surface is given as an N_x × N_y grid of ordered, dense points in 3D space, and that the knots of the B-spline surface to be fitted form an n_x × n_y grid that is a subset of the N_x × N_y grid. The degrees of the curves, m_x and m_y, are entered by the user. The given number of points N_x × N_y is assigned to L, the dimension of the Antigen and the Antibody. Each bit of an Antibody or Antigen is also called a molecule and corresponds to a data point. If the value of a molecule is 1, a knot is placed at the corresponding data point; otherwise no knot is placed. If the given points lie in the intervals [a,b] and [c,d], the n_x × n_y knots defined in this region are called interior knots. The initial population consists of K Antibodies with L molecules each. The molecules are set to 0 or 1 randomly.
For the recognition (response against the Antigen) process, the affinity of an Antibody for the Antigen is calculated as in Equation (7), which uses the Antibody–Antigen distance and the AIC that is preferred as the fitness measure in [3] and [6]:

    Affinity = 1 − (AIC / Fitness_avrg)                                                  (7)

In Equation (7), Fitness_avrg is the arithmetic average of the AIC values of all Antibodies in the population, calculated as follows. If the AIC value of an individual is greater than Fitness_avrg, its Affinity is set to zero (Affinity = 0) in Equation (7).

    Fitness_avrg = ( Σ_{i=1}^{K} AIC_i ) / K                                             (8)

Here K is the size of the population and AIC_i is the fitness measure of the i-th antibody in the population. The AIC is given by

    AIC_2 = N_x N_y log_e Q_2 + 2 { (n_x + m_x)(n_y + m_y) + n_x + n_y }                 (9)
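
As a small illustration of equations (7)-(9), the sketch below computes the AIC of each antibody from its residual Q_2 and converts it into an affinity; it assumes the ratio reading of equation (7) given above and positive AIC values, and the function names are illustrative only.

    import numpy as np

    def aic(q2_value, nx, ny, n_x, n_y, m_x, m_y):
        # Equation (9): AIC_2 = Nx*Ny*log(Q2) + 2{(n_x+m_x)(n_y+m_y) + n_x + n_y}.
        return nx * ny * np.log(q2_value) + 2.0 * ((n_x + m_x) * (n_y + m_y) + n_x + n_y)

    def affinities(aic_values):
        # Equations (7)-(8): affinity = 1 - AIC / Fitness_avrg, clamped to zero
        # whenever an antibody's AIC exceeds the population average.
        aic_values = np.asarray(aic_values, dtype=float)
        fitness_avg = aic_values.mean()
        return np.clip(1.0 - aic_values / fitness_avg, 0.0, None)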

The antibody that is the ideal solution, i.e. the exact complement of the Antigen, is the one whose affinity value is nearest to 1 in the population (in fact, in the memory). The Euclidean distance between the ideal Antibody and the Antigen is zero; in that case the problem becomes surface interpolation rather than surface approximation.
In order to apply the clonal selection algorithm to this problem, some modifications must be made to the original algorithm. The following is a step-by-step description of the modified algorithm (a sketch of the resulting main loop is given after the list).
1. Enter the data points to be fitted (in the form of an N_x × N_y grid).
2. Enter the control parameters.
3. Build the initial Antibody population with random molecules.
4. If the population is being built for the first time, create the memory array (save all antibodies).
5. Otherwise, update the antibody population and the memory cells and develop the memory.
6. For each antibody, compute the corresponding B-spline surface, fit it to the given data and calculate the sum of squared residuals (Q_2).
7. For each antibody in the population, calculate its AIC value, and calculate the average AIC value of the population.
8. For each antibody, calculate the affinity.
9. Choose the best antibodies according to the affinity, i.e. the interaction of every antibody with the antigen; the number of clones will be K − variety.
10. Produce the matured antibody population by changing the molecules of the clones in proportion to their affinity values.
11. Implement mutation according to the mutation rate.
12. Produce new antibodies according to the variety ratio.
13. If the iteration limit is not reached or the antigen is not fully recognized, go to step 5.
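
The sketch below shows how steps 3-13 might be organized in code. It is a schematic reading of the modified clonal selection procedure, not the authors' implementation: the helper evaluate(antibody) is assumed to fit a B-spline to the knots encoded by the antibody and return its AIC (for instance via the q2 and aic sketches given earlier), the hypermutation rate is an illustrative choice, and the parameter defaults mirror Table 2.

    import numpy as np

    rng = np.random.default_rng(0)

    def clonal_selection(evaluate, L, pop_size=20, variety=6, memory_size=40,
                         generations=500, mutation_rate=0.0):
        # Antibodies are binary strings of length L: a 1 places a knot at that data point.
        pop = rng.integers(0, 2, size=(pop_size, L))
        memory = pop.copy()                                   # step 4
        for gen in range(generations):                        # step 13: iteration limit
            aic_vals = np.array([evaluate(ab) for ab in pop]) # steps 6-7
            aff = np.clip(1.0 - aic_vals / aic_vals.mean(), 0.0, None)   # step 8
            order = np.argsort(-aff)                          # step 9: best antibodies first
            clones = pop[order[:pop_size - variety]].copy()
            # step 10: change molecules in proportion to affinity (weaker clones change more)
            for c, a in zip(clones, aff[order[:len(clones)]]):
                flips = rng.random(L) < mutation_rate + 0.05 * (1.0 - a)  # step 11 included
                c[flips] ^= 1
            newcomers = rng.integers(0, 2, size=(variety, L)) # step 12: variety
            pop = np.vstack([clones, newcomers])
            # step 5: keep the lowest-AIC (highest-affinity) antibodies in the memory
            memory = np.vstack([memory, pop])
            memory = memory[np.argsort([evaluate(ab) for ab in memory])[:memory_size]]
        return memory[0]                                      # best antibody found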


4 Experimental Results
In order to evaluate the proposed AIS-based automatic knot placement algorithm, five bivariate test functions were used (see Table 1). These functions were constructed to have unit standard deviation and a non-negative range. Since the antibodies with the highest affinity values are kept in the memory in the AIS architecture, the memory antibody with the highest affinity in each generation is reported in the results. To evaluate performance and approximation speed, the genetic algorithm suggested by Sarfraz et al. [6, 17] and the algorithm proposed in this study were compared; the knot ratio and the operation that immobilizes important points in the knot chromosome are discarded in their algorithm. The developed program allows the user to enter the B-spline surface orders. To test the quality of the proposed model, the Root Mean Square (RMS) error was calculated for M and N values from 5 to 10 on 400 (20x20) training data points from the five surfaces defined above. The initial population is evolved for 500 generations. Increasing the number of generations improves the fit (the error decreases), and the slope of the approximation curves indicates how much further improvement can still be expected in the following generations. Table 2 shows the parameter settings of the GA and AIS optimization runs. The RMS errors between the point clouds and the surfaces modeled from the best chromosome in the GA and from the antibodies in the memory population of the AIS are given in Table 3 for four test functions (Surface I - Surface IV).
The analyses were carried out for all surfaces in Table 1. The M x N mesh was chosen randomly for Surface II - Surface IV. For Surface II, shown in Table 3, the choice of M and N corresponds to MxN = 8x8; similarly, the choices of M and N for Surfaces III and IV correspond to MxN = 9x9 and MxN = 10x10, respectively.
Table 1. Five test functions for the bivariate setting

    f1(x1, x2) = 10.391 { (x1 − 0.4)(x2 − 0.6) + 0.36 }
    f2(x1, x2) = 24.234 { r^2 (0.75 − r^2) },  r^2 = (x1 − 0.5)^2 + (x2 − 0.5)^2
    f3(x1, x2) = 42.659 { 0.1 + x̄1 (0.05 + x̄1^4 − 10 x̄1^2 x̄2^2 + 5 x̄2^4) },  x̄1 = x1 − 0.5,  x̄2 = x2 − 0.5
    f4(x1, x2) = 1.3356 [ 1.5 (1 − x1) + e^(2 x1 − 1) sin( 3π (x1 − 0.6)^2 ) + e^(3 (x2 − 0.5)) sin( 4π (x2 − 0.9)^2 ) ]
    f5(x1, x2) = 1.9 [ 1.35 + e^(x1) sin( 13 (x1 − 0.6)^2 ) + e^(−x2) sin( 7 x2 ) ]
Table 2. Parameter set

    Parameter         AIS                           GA
    Mesh size         20x20                         20x20
    Population size   20                            20
    String length     200 (antibody cell length)    200 (chromosome gene length)
    Mutation rate     None                          0.001
    Crossover         None                          0.7
    Variety           6 (30%)                       6 (30%)
    Memory size       40                            None
    Generations       500                           500
    B-spline order    Random and user defined       Random and user defined


Table 3. RMS (x10^-2) values of AIS and GA methods for 400 data points from Surface I to Surface IV for different MxN

    Gen.   Surface I (7x7)    Surface II (8x8)   Surface III (9x9)   Surface IV (10x10)
           G.A.    A.I.S.     G.A.    A.I.S.     G.A.     A.I.S.     G.A.    A.I.S.
    1      8.26    9.34       3.64    3.69       10.10    10.25      8.21    8.57
    10     7.99    7.14       3.70    3.59       11.25    9.57       8.76    8.02
    25     8.22    5.88       4.25    3.42       9.67     8.71       8.36    7.61
    50     8.72    5.61       3.91    2.86       10.40    7.93       8.53    7.44
    100    8.34    5.53       3.72    2.71       9.97     7.48       8.78    7.22
    200    8.72    5.58       4.01    2.01       10.50    7.06       9.30    7.10
    300    8.99    5.29       4.60    1.58       10.61    7.03       7.99    7.07
    400    7.63    5.23       3.52    1.52       10.38    7.03       8.45    7.00
    500    8.86    5.23       3.99    1.50       10.60    7.00       8.57    6.95

Table 4. RMS (x10^-2) values of AIS and GA methods for 400 data points from Surface V for different MxN. (a) AIS, and (b) GA.

    (a) AIS
    M \ N    5        6        7        8        9        10
    5      8.6015   8.0171   7.9324   7.0035   7.3263   7.0629
    6      8.4809   8.4179   8.3722   7.5269   7.3554   7.1965
    7      7.6333   7.2749   7.4622   7.2034   6.9558   6.1804
    8      7.8404   6.8614   6.4382   6.4288   6.7375   6.0138
    9      7.9077   7.8398   6.9039   6.9028   6.8971   5.7911
    10     8.0664   6.7625   7.1614   6.3575   6.9640   6.3637

    (b) GA
    M \ N    5        6        7        8        9        10
    5      10.501   9.6512   9.1281   9.7179   9.8944   9.6573
    6      10.238   9.8221   9.3189   9.5761   7.7725   8.5993
    7      9.9913   9.4922   8.9494   8.2377   8.1184   8.1649
    8      10.013   8.6365   8.6247   8.2134   7.6657   7.8947
    9      10.020   9.1249   8.7523   8.1843   7.3076   7.4484
    10     9.3970   9.1297   8.4642   8.2721   7.6331   7.3366

Table 5. Fitness and RMS statistics of GA and AIS for Surface V

                      A.I.S. (x10^-2)                       G.A. (x10^-2)
    Generations  Best   Best   Max.    Avrg.  Avrg.    Best   Best   Max.    Avrg.  Avrg.
                 RMS    Fitn.  RMS     Fitn.  RMS      RMS    Fitn.  RMS     Fitn.  RMS
    1            8.84   806    27.98   1226   16.3     8.82   804    26.70   1319   17.6
    10           7.70   695    8.811   767    8.43     7.96   722    12.43   961    10.8
    25           7.37   660    7.793   682    7.57     9.69   879    30.38   1085   12.9
    50           6.74   589    7.357   641    7.20     7.93   719    22.33   940    10.6
    100          6.03   500    6.711   511    6.12     8.01   727    10.86   891    9.87
    200          5.92   485    6.085   492    5.97     9.26   843    12.58   925    10.2
    300          5.91   484    5.965   488    5.94     7.69   694    29.60   1043   12.3
    400          5.86   477    5.918   481    5.89     8.47   772    11.95   922    10.2
    500          5.79   467    8.490   488    5.95     7.93   719    13.31   897    9.95

Table 4 refers to Surface V. The RMS error was calculated for M and N values from 5 to 10 on 400 (20x20) training data points. As the reader can see, the errors obtained for the different M and N values are reasonable, and the best choice corresponds to MxN = 9x10 in Table 4. For the MxN knot configuration that gives the best fit, the proposed algorithm was also compared with the GA-based algorithm of Sarfraz et al. with respect to approximation speed. The outputs of the programs were recorded at selected generations during the training period. The best and average fitness values of the individuals and antibodies at the corresponding generations are given in Table 5. The convergence plots of the proposed AIS approach and of the GA approach over all generations are given in Fig. 1; the bold line and the dotted line represent the maximum fitness and the average fitness values, respectively.
Fig. 1. Parameter optimization based on GA and AIS with respect to the generations (two panels, G.A. and A.I.S.; axes: Fitness versus Generation; curves: Max, Avg., Best)

5 Conclusion and Future Work


This paper presents an application of another computational intelligence technique, known as Artificial Immune Systems (AIS), to the surface fitting problem using B-splines. In this study, the original problem was converted, as in [3] and [6], into a discrete combinatorial optimization problem and solved accordingly. It has been clearly shown that the proposed AIS algorithm is very useful for finding good knots automatically. The suggested method can determine the number and placement of the knots simultaneously. No subjective parameters, such as an error tolerance, a regularity (order) factor or well-chosen initial knot locations, are required.
There are two basic requirements on each B-spline surface in the iterations for guaranteeing an appropriate approximation in B-spline surface fitting: (1) its shape must be similar to the target or object surface; (2) its control points must be scattered, or its knot points determined, appropriately. The technique presented in this paper is shown to reduce the necessity of the second requirement.
In this study, the Clonal Selection Algorithm of AIS is applied to the surface reconstruction problem and several new ways of surface modeling are developed. The large potential of this approach has been shown. For a given set of 3D data points, AIS assists in choosing the most appropriate B-spline surface degree and knot points. The authors will use other AIS techniques to improve the proposed method in future studies; the positive or negative effects of these techniques will be investigated and compared. Additionally, NURBS surfaces will be used to improve the suggested algorithm. This extension is especially important with regard to the complex optimization of the NURBS weights.


Acknowledgements
This study has been supported by the Scientific Research Projects of Selçuk University (Turkey).

References
1. Weiss, V., Andor, L., Renner, G., Varady, T., Advanced surface fitting techniques,
Computer Aided Geometric Design Vol. 19, p. 19-42, (2002).
2. Piegl, L., Tiller, W., The NURBS Book, Springer Verlag, Berlin, Heidelberg, (1997).
3. Yoshimoto, F., Moriyama, M., Harada, T., Automatic knot placement by a genetic
algorithm for data fitting with a spline, Proc. of the International Conference on Shape
Modeling and Applications, IEEE Computer Society Press, pp. 162-169, (1999).
4. Goldenthal, R., Bercovier, M. Design of Curves and Surfaces by Multi-Objective
Optimization, April 2005, Leibniz Report 2005-12.
5. Iglesias, A., Echevarría, G., Gálvez, A., Functional networks for B-spline surface
reconstruction, Future Generation Computer Systems, Vol. 20, pp. 1337-1353, (2004).
6. Sarfraz, M., Raza, S.A., Capturing Outline of Fonts using Genetic Algorithm and Splines,
Fifth International Conference on Information Visualisation (IV'01) , pp. 738-743, (2001).
7. Iglesias, A., Gálvez, A., A New Artificial Intelligence Paradigm for Computer-Aided
Geometric Design, Lecture Notes in Artificial Intelligence Vol. 1930, pp. 200-213, (2001).
8. Hoffmann, M., Kovács, E., Developable surface modeling by neural network,
Mathematical and Computer Modelling, Vol. 38, pp. 849-853, (2003)
9. Hoffmann, M., Numerical control of Kohonen neural network for scattered data
approximation, Numerical Algorithms, Vol. 39, pp. 175-186, (2005).
10. Echevarría, G., Iglesias, A., Gálvez, A., Extending Neural Networks for B-spline Surface
Reconstruction, Lecture Notes in Computer Science, Vol. 2330, pp. 305-314, (2002).
11. Engin, O., Döyen, A., Artificial Immune Systems and Applications in Industrial Problems,
Gazi University Journal of Science 17(1): pp. 71-84, (2004).
12. Boudjema, F., Enberg, P.B., Postaire, J.G., Surface Modeling by using Self Organizing Maps
of Kohonen, IEEE Int. Conf. on Systems, Man and Cybernetics, vol. 3, pp. 2418-2423, (2003).
13. Barhak, J., Fischer, A., Adaptive Reconstruction of Freeform Objects with 3D SOM Neural
Network Grids, Journal of Computers & Graphics, vol. 26, no. 5, pp. 745-751, (2002).
14. Kumar, S.G., Kalra, P. K. and Dhande, S. G., Curve and surface reconstruction from points: an
approach based on SOM, Applied Soft Computing Journal, Vol. 5(5), pp. 55-66, (2004).
15. Hoffmann, M., Várady, L., and Molnar, T., Approximation of Scattered Data by Dynamic
Neural Networks, Journal of Silesian Inst. of Technology, pp, 15-25, (1996).
16. Sarfraz, M., Riyazuddin, M., Curve Fitting with NURBS using Simulated Annealing,
Applied Soft Computing Technologies: The Challenge of Complexity, Series: Advances in
Soft Computing, Springer Verlag, (2006).
17. Sarfraz, M., Raza, S.A., and Baig, M.H., Computing Optimized Curves with NURBS Using
Evolutionary Intelligence, Lect. Notes in Computer Science, Volume 3480, pp. 806-815,
(2005).
18. Sarfraz, M., Sait, Sadiq M., Balah, M., and Baig, M. H., Computing Optimized NURBS Curves
using Simulated Evolution on Control Parameters, Applications of Soft Computing: Recent
Trends, Series: Advances in Soft Computing, Springer Verlag, pp. 35-44, (2006).
19. Sarfraz, M., Computer-Aided Reverse Engineering using Simulated Evolution on NURBS, Int.
J. of Virtual & Physical Prototyping, Taylor & Francis, Vol. 1(4), pp. 243-257, (2006).

Implicit Surface Reconstruction from Scattered
Point Data with Noise

Jun Yang1,2, Zhengning Wang1, Changqian Zhu1, and Qiang Peng1

1 School of Information Science & Technology, Southwest Jiaotong University, Chengdu, Sichuan 610031, China
2 School of Mechanical & Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China
yangj@mail.lzjtu.cn, {znwang, cqzhu, pqiang}@home.swjtu.edu.cn

Abstract. This paper addresses the problem of reconstructing an implicit function from point clouds with noise and outliers acquired with 3D scanners. We introduce a filtering operator based on the mean shift scheme, which shifts each point to a local maximum of a kernel density function, resulting in suppression of noise of different amplitudes and removal of outliers. The clean data points are then divided into subdomains using an adaptive octree subdivision method, and a local radial basis function is constructed at each octree leaf cell. Finally, we blend these local shape functions together with a partition of unity to approximate the entire global domain. Numerical experiments demonstrate the robust and high quality performance of the proposed method on a great variety of 3D reconstructions from point clouds containing noise and outliers.
Keywords: filtering, space subdivision, radial basis function, partition of unity.

1 Introduction
Interest in point-based surfaces has grown significantly in recent years in the computer graphics community, due to the development of 3D scanning technologies and to the riddance of connectivity management, which greatly simplifies many algorithms and data structures. Implicit surfaces are an elegant representation for reconstructing 3D surfaces from point clouds without explicitly having to account for topology issues. However, when the point set data generated by range scanners (or laser scanners) contain large noise, and especially outliers, some established methods often fail to reconstruct the surfaces of real objects.
There are two major classes of surface representations in computer graphics: parametric surfaces and implicit surfaces. A parametric surface [1, 2] is usually given by a function f (s, t) that maps some 2-dimensional (possibly non-planar) parameter domain into 3-space, while an implicit surface typically comes as the zero-level isosurface of a 3-dimensional scalar field f (x, y, z). Implicit surface models are popular since they can describe complex shapes with capabilities for surface and volume modeling, and complex editing operations are easy to perform on such models.

Moving least squares (MLS) [3-6] and radial basis functions (RBF) [7-15] are two popular 3D implicit surface reconstruction methods.
Recently, RBF has attracted more attention in surface reconstruction. It is identified as one of the most accurate and stable methods for solving scattered data interpolation problems. Using this technique, an implicit surface is constructed by calculating the weights of a set of radial basis functions such that they interpolate the given data points. From the pioneering work [7, 8] to recent research, such as compactly-supported RBF [9, 10], fast RBF [11-13] and multi-scale RBF [14, 15], the established algorithms have been able to generate more and more faithful models of real objects over the last twenty years; unfortunately, most of them are not feasible for the approximation of unorganized point clouds containing noise and outliers.
In this paper, we describe an implicit surface reconstruction algorithm for noisy scattered point clouds with outliers. First, we define a smooth probability density kernel function reflecting the probability that a point p is a point on the surface S sampled by a noisy point cloud. A filtering procedure based on mean shift is used to move the points along the gradients of the kernel functions to the maximum probability positions. Second, we reconstruct a surface representation of the clean point set implicitly, based on a combination of two well-known methods, RBF and partition of unity (PoU). The filtered domain of discrete points is divided into many subdomains by an adaptively error-controlled octree subdivision, on which local shape functions are constructed by RBFs. We blend the local solutions together using a weighted sum over the local subdomains. As will be shown, our algorithm is robust and produces high quality results.

2 Filtering
2.1 Covariance Analysis
Before introducing our surface reconstruction algorithm, we describe how to perform
eigenvalue decomposition of the covariance matrix based on the theory of principal
component analysis (PCA) [24], through which the least-square fitting plane is
defined to estimate the kernel-based density function.
Given the set of input points {p_i}_{i∈[1,L]}, p_i ∈ R^3, the weighted covariance matrix C for a sample point p_i is determined by

    C = Σ_{j=1}^{L} (p_j − p̄_i)(p_j − p̄_i)^T θ( ‖p_j − p̄_i‖ / h ) ,                     (1)

where p̄_i is the centroid of the neighborhood of p_i, θ is a monotonically decreasing weight function, and h is the adaptive kernel size for the spatial sampling density. Consider the eigenvector problem

    C e_l = λ_l e_l .                                                                    (2)

Since C is symmetric and positive semi-definite, all eigenvalues λ_l are real-valued and the eigenvectors e_l form an orthogonal frame, corresponding to the principal components of the local neighborhood.


Assuming λ_0 ≤ λ_1 ≤ λ_2, it follows that the least squares fitting plane H(p): (p − p̄_i) · e_0 = 0 through p̄_i minimizes the sum of squared distances to the neighbors of p_i. Thus e_0 approximates the surface normal n_i at p_i, i.e., n_i = e_0; in other words, e_1 and e_2 span the tangent plane at p_i.
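
A compact numpy sketch of this covariance analysis is given below; it assumes a Gaussian weight for θ and leaves the neighborhood search and the kernel size h to the caller (the helper name pca_frame is illustrative only).

    import numpy as np

    def pca_frame(neighbors, h):
        # Weighted covariance of eqs. (1)-(2); returns (centroid, e0, e1, e2),
        # where e0 approximates the surface normal and e1, e2 span the tangent plane.
        p_bar = neighbors.mean(axis=0)
        d = neighbors - p_bar
        w = np.exp(-np.sum(d * d, axis=1) / (2.0 * h * h))      # decreasing weight theta
        C = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0)
        evals, evecs = np.linalg.eigh(C)                        # eigenvalues in ascending order
        return p_bar, evecs[:, 0], evecs[:, 1], evecs[:, 2]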

2.2 Mean Shift Filtering


Mean shift [16, 17] is one of the robust iterative algorithms in statistics. Using this algorithm, the samples are shifted to the most likely positions, which are local maxima of a kernel density function. It has been applied in many fields of image processing and visualization, such as tracking, image smoothing and filtering.
In this paper, we use a nonparametric kernel density estimation scheme to estimate an unknown density function g(p) of the input data. A smooth kernel density function g(p) is defined to reflect the probability that a point p ∈ R^3 is a point on the surface S sampled by a noisy point cloud. Inspired by the previous work of Schall et al. [21], we measure the probability density function g(p) by considering the squared distance of p to the plane H(p) fitted to a spatial k-neighborhood of p_i as

    g(p) = Σ_{i=1}^{L} g_i(p) = Σ_{i=1}^{L} Φ_i( ‖p − p_pro‖ ) G_i( ‖p_pro − p_i‖ ) { 1 − [ (p − p_i) · n_i / h ]^2 } ,    (3)

where Φ_i and G_i are two monotonically decreasing weighting functions that measure the spatial distribution of the point samples in the spatial domain and in the range domain, respectively, and are thus adaptive to the local geometry of the point model. The weight function could be either a Gaussian kernel or an Epanechnikov kernel; here we choose the Gaussian function e^(−x^2/2). The point p_pro is the orthogonal projection of a sample point p onto the least-squares fitting plane. Positions p close to H(p) are assigned a higher probability than more distant positions.
The simplest method to find the local maxima of (3) is to use a gradient-ascent
process written as follows:
    ∇g(p) = Σ_{i=1}^{L} ∇g_i(p) = − (2 / h^2) Σ_{i=1}^{L} Φ_i( ‖p − p_pro‖ ) G_i( ‖p_pro − p_i‖ ) [ (p − p_i) · n_i ] n_i .    (4)

Thus the mean shift vectors are determined as


    m(p) = p − [ Σ_{i=1}^{L} Φ_i( ‖p − p_pro‖ ) G_i( ‖p_pro − p_i‖ ) ( (p − p_i) · n_i ) n_i ] / [ Σ_{i=1}^{L} Φ_i( ‖p − p_pro‖ ) G_i( ‖p_pro − p_i‖ ) ] .    (5)

Combining equations (4) and (5) we get the resulting iterative equation of mean shift filtering

    p_i^{j+1} = m(p_i^j) ,   p_i^0 = p_i ,                                               (6)

where j is the iteration number. In our algorithm, g(p) satisfies the condition

    g(p_2) − g(p_1) > ∇g(p_1) · (p_2 − p_1) ,   p_1 ≠ 0, p_2 ≠ 0 ,                       (7)


thus g(p) is a convex function with finitely many stable points in the set U = {p_i | g(p_i) ≤ g(p_i^1)}, which results in the convergence of the sequence {p_i^j, i = 1,...,L, j = 1, 2,...}. In our experiments we stop the iterative process when ‖p_i^{j+1} − p_i^j‖ ≤ 5 × 10^(−3) h is satisfied; each sample usually converges in fewer than 8 iterations. Due to the clustering property of our method, groups of outliers usually converge to a set of isolated points sparsely distributed around the surface samples. These points are characterized by a very low spatial sampling density compared to the surface samples. We use this criterion to detect outliers and remove them using a simple threshold.
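
One possible realization of the filtering step of equations (3)-(6) is sketched below. It assumes Gaussian kernels for Φ and G with a common bandwidth h, a fixed k-neighborhood, and per-point planes obtained with the pca_frame helper sketched in Section 2.1; it is a simplified illustration, not the authors' code.

    import numpy as np

    def gauss(x, sigma):
        return np.exp(-(x * x) / (2.0 * sigma * sigma))

    def mean_shift_filter(points, planes, h, k=200, tol=5e-3, max_iter=8):
        # planes[i] = (centroid_i, normal_i) of the plane fitted to the neighborhood of points[i].
        filtered = points.copy()
        for idx in range(len(points)):
            p = points[idx].copy()
            for _ in range(max_iter):
                d2 = np.sum((points - p) ** 2, axis=1)
                nbr = np.argsort(d2)[:k]                      # summation set of eq. (5)
                num, den = np.zeros(3), 0.0
                for i in nbr:
                    c_i, n_i = planes[i]
                    p_pro = p - np.dot(p - c_i, n_i) * n_i    # projection of p onto H_i
                    w = gauss(np.linalg.norm(p - p_pro), h) * \
                        gauss(np.linalg.norm(p_pro - points[i]), h)
                    num += w * np.dot(p - points[i], n_i) * n_i
                    den += w
                if den == 0.0:
                    break
                p_new = p - num / den                         # eq. (5): m(p)
                done = np.linalg.norm(p_new - p) <= tol * h   # stopping criterion of the text
                p = p_new
                if done:
                    break
            filtered[idx] = p
        return filtered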

3 Implicit Surface Reconstruction


3.1 Adaptive Space Subdivision
In order to avoid solving a dense linear system, we subdivide the whole set of input points filtered by mean shift into slightly overlapping subdomains. An adaptive octree-based subdivision method introduced by Ohtake et al. [18] is used for the space partition.
We define the local support radius R = α d_i for the cubic cells generated during the subdivision, where d_i is the length of the main diagonal of the cell. Each cell should contain between T_min and T_max points. In our implementation, α = 0.6, T_min = 20 and T_max = 40 have provided satisfying results.
A local max-norm approximation error is estimated according to the Taubin distance [19],

    ε = max_{ ‖p_i − c_i‖ < R } | f(p_i) | / ‖ ∇f(p_i) ‖ .                               (8)

If ε is greater than a user-specified threshold ε_0, the cell is subdivided, and a local neighborhood function f_i is built for each leaf cell.
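
The recursion just described might look as follows in code; the cell representation, the fit_local callback (assumed to return a local implicit function and its gradient, as in Section 3.2) and the stopping details are illustrative assumptions.

    import numpy as np

    ALPHA, T_MIN, T_MAX = 0.6, 20, 40

    def subdivide(center, edge, points, fit_local, eps0, leaves):
        # Recursively split a cubic cell until the Taubin error of eq. (8) is below eps0.
        R = ALPHA * np.sqrt(3.0) * edge                 # alpha times the cell diagonal
        inside = points[np.linalg.norm(points - center, axis=1) < R]
        if len(inside) <= T_MAX:
            f, grad = fit_local(inside)                 # local RBF of Section 3.2
            err = max((abs(f(p)) / np.linalg.norm(grad(p)) for p in inside), default=0.0)
            if err <= eps0 or len(inside) <= T_MIN:
                leaves.append((center, R, f))
                return
        for dx in (-0.25, 0.25):                        # centers of the eight children
            for dy in (-0.25, 0.25):
                for dz in (-0.25, 0.25):
                    subdivide(center + edge * np.array([dx, dy, dz]), edge / 2.0,
                              points, fit_local, eps0, leaves)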
3.2 Estimating Local Shape Functions
Given the set of N pairwise distinct points Ω = {p_i}_{i∈[1,N]}, p_i ∈ R^3, filtered by the mean shift algorithm, and the set of corresponding values {v_i}_{i∈[1,N]}, v_i ∈ R, we want to find an interpolant f : R^3 → R such that

    f(p_i) = v_i .                                                                       (9)

We choose f(p) to be a radial basis function of the form

    f(p) = π(p) + Σ_{i=1}^{N} λ_i φ( |p − p_i| )                                         (10)

where π(p) = Σ_k c_k π_k(p), with {π_k(p)}_{k∈[1,Q]} a basis of the space of real-valued polynomials in 3 variables of degree at most m, of dimension Q = (m+3 choose 3), the degree m depending on the choice of φ; φ is a basis function, the λ_i are real-valued weights, and |·| denotes the Euclidean norm.


There are many popular choices for the basis function: biharmonic φ(r) = r, triharmonic φ(r) = r^3, multiquadric φ(r) = (r^2 + c^2)^(1/2), Gaussian φ(r) = exp(−c r^2), and thin-plate spline φ(r) = r^2 log(r), where r = |p − p_i|.
As we have an under-determined system with N + Q unknowns and N equations, the so-called natural additional constraints on the coefficients λ_i are added in order to ensure orthogonality, so that

    Σ_{i=1}^{N} λ_i π_k(p_i) = 0 ,   k = 1, ..., Q .                                     (11)

Equations (9), (10) and (11) may be written in matrix form as

    [ A    Π ] [ λ ]     [ v ]
    [ Π^T  0 ] [ c ]  =  [ 0 ] ,                                                         (12)

where A = ( φ(|p_i − p_j|) ), i, j = 1,...,N, Π = ( π_k(p_i) ), i = 1,...,N, k = 1,...,Q, λ = (λ_i), i = 1,...,N, and c = (c_k), k = 1,...,Q. Solving the linear system (12) determines λ_i and c_k, and hence f(p).
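
A minimal numpy sketch of this local fit, using the triharmonic basis φ(r) = r^3 and a linear polynomial part (m = 1, Q = 4), is shown below; the function name and the choice of basis are assumptions made for the example.

    import numpy as np

    def fit_rbf(points, values):
        # Solve the augmented system (12) for the weights lambda and the polynomial
        # coefficients c of f(p) = pi(p) + sum_i lambda_i * phi(|p - p_i|), phi(r) = r^3.
        n = len(points)
        r = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
        A = r ** 3                                     # phi(|p_i - p_j|)
        P = np.hstack([np.ones((n, 1)), points])       # polynomial basis 1, x, y, z
        M = np.vstack([np.hstack([A, P]),
                       np.hstack([P.T, np.zeros((4, 4))])])
        rhs = np.concatenate([values, np.zeros(4)])
        sol = np.linalg.solve(M, rhs)
        lam, c = sol[:n], sol[n:]

        def f(p):                                      # evaluate the fitted local function
            return c[0] + c[1:] @ p + np.sum(lam * np.linalg.norm(points - p, axis=1) ** 3)
        return f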

Fig. 1. A set of locally defined functions is blended by the PoU method. The resulting function (solid curve) is constructed from four local functions (thick dashed curves) with their associated weight functions (dashed-dotted curves).

3.3 Partition of Unity


After suppressing the high frequency noise and removing the outliers, we divide the global domain Ω = {p_i}_{i∈[1,N]} into M slightly overlapping subdomains {Ω_i}_{i∈[1,M]} with Ω ⊆ ∪_i Ω_i, using an octree-based space partition method. On this set of subdomains {Ω_i}_{i∈[1,M]} we construct a partition of unity, i.e., a collection of non-negative functions {φ_i}_{i∈[1,M]} with limited support and with Σ_i φ_i = 1 on the entire domain Ω. For each subdomain Ω_i we construct a local reconstruction function f_i based on RBF to interpolate the sampled points. As illustrated in Fig. 1, four local functions f_1(p), f_2(p), f_3(p) and f_4(p) are blended together by the weight functions φ_1, φ_2, φ_3 and φ_4; the solid curve is the final reconstructed function.
An approximation of a function f(p) defined on Ω is then given by a combination of the local functions

    f(p) = Σ_{i=1}^{M} f_i(p) φ_i(p) .                                                   (13)

The blending functions are obtained from any set of smooth functions by the normalization procedure

    φ_i(p) = w_i(p) / Σ_{j=1}^{M} w_j(p) .                                               (14)

The weight functions w_i must be continuous at the boundaries of the subdomains Ω_i. Tobor et al. [15] suggested that the weight functions w_i be defined as the composition of a distance function D_i : R^n → [0,1], with D_i(p) = 1 at the boundary of Ω_i, and a decay function θ : [0,1] → [0,1], i.e. w_i(p) = θ(D_i(p)). More details about D_i and θ can be found in Tobor's paper.
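
The blend of equations (13)-(14) can be sketched as follows; spherical subdomains (center, radius, local function), such as the leaves produced by the subdivision sketch above, and the particular smooth decay used as the example weight are assumptions, not the exact weighting of [15].

    import numpy as np

    def pou_blend(p, cells):
        # Global f(p) of eq. (13): a normalized weighted sum of the local fits f_i
        # over the overlapping subdomains; cells is a list of (center_i, radius_i, f_i).
        weights, values = [], []
        for center, radius, f_i in cells:
            d = np.linalg.norm(p - center)
            if d < radius:
                w = (1.0 - (d / radius) ** 2) ** 2   # example decay, vanishing on the boundary
                weights.append(w)
                values.append(f_i(p))
        if not weights:
            return None                              # p lies outside every subdomain
        weights = np.asarray(weights)
        return float(np.dot(weights / weights.sum(), values))    # eq. (14)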
Table 1. Computational time measurements for mean shift filtering and RBF+PoU surface
reconstructing with error bounded at 10-5. Timings are listed as minutes:seconds.
model
Pinput
Pfilter
Tfilter
Toctree
Trec

Bunny
362K
165K
9:07
0:02
0:39

Dragon head
485K
182K
13:26
0:04
0:51

(a)

(b)

Dragon
2.11M
784K
41:17
0:10
3:42

(c)

Fig. 2. Comparison of implicit surface reconstruction based on RBF methods. (a) Input noisy
point set of Stanford bunny (362K). (b) Reconstruction with Carrs method [11]. (c)
Reconstruction with our method in this paper.

4 Applications and Results


All results presented in this paper are performed on a 2.8GHz Intel Pentium4 PC with
512M of RAM running Windows XP.
To visualize the resulting implicit surfaces, we used a pure point-based surface
rendering algorithm such as [22] instead of traditionally rendering the implicit
surfaces using a Marching Cubes algorithm [23], which inherently introduces heavy
topological constraints.
Table 1 presents computational time measurements for filtering and reconstructing
of three scan models, bunny, dragon head and dragon, with user-specified error
threshold 10-5 in this paper. In order to achieve good effects of denoising we choose a
large number of k-neighborhood for the adaptive kernel computation, however, more
timings of filtering are spent . In this paper, we set k=200. Note that the filtered points
are less than input noisy points due to the clustering property of our method.
In Fig. 2 two visual examples of the reconstruction by Carrs method [11] and our
algorithm are shown. Carr et al. use polyharmonic RBFs to reconstruct smooth,


manifold surfaces from point cloud data, and their work is considered an excellent and successful contribution in this field. However, because of its sensitivity to noise, the reconstructed model in the middle of Fig. 2 shows spurious surface sheets. The quality of our reconstruction is highly satisfactory, as illustrated on the right of Fig. 2, since a mean shift operator is introduced to deal with the noise in our algorithm.
To illustrate the influence of the error threshold on reconstruction accuracy and smoothness, we reconstructed the scanned dragon model with two different error thresholds, as demonstrated in Fig. 3.

Fig. 3. The error threshold controls reconstruction accuracy and smoothness of the scanned dragon model consisting of 2.11M noisy points. (a) Reconstructing with error threshold at 8.4x10^-4. (c) Reconstructing with error threshold at 2.1x10^-5. (b) and (d) are close-ups of the rectangular areas of (a) and (c), respectively.

5 Conclusion and Future Work


In this study, we have presented a robust method for implicit surface reconstruction from scattered point clouds with noise and outliers. The mean shift method filters the raw scanned data, and then the PoU scheme blends the local shape functions defined by RBF to approximate the whole surface of the real object.
We are also investigating several other directions of future work. First, we are trying to improve the space partition method; the Volume-Surface Tree [20], an alternative hierarchical space subdivision scheme providing efficient and accurate surface-based hierarchical clustering via a combination of a global 3D decomposition at coarse subdivision levels and a local 2D decomposition at fine levels near the surface, may be useful. Second, we are planning to combine our method with feature extraction procedures in order to adapt it to processing very incomplete data.

References
1. Weiss, V., Andor, L., Renner, G., Varady, T.: Advanced Surface Fitting Techniques.
Computer Aided Geometric Design, 1 (2002) 19-42
2. Iglesias, A., Echevarría, G., Gálvez, A.: Functional Networks for B-spline Surface
Reconstruction. Future Generation Computer Systems, 8 (2004) 1337-1353
3. Alexa, M., Behr, J., Cohen-Or, D., Fleishman, S., Levin D., Silva, C. T.: Point Set
Surfaces. In: Proceedings of IEEE Visualization. San Diego, CA, USA, (2001) 21-28
4. Amenta, N., Kil, Y. J.: Defining Point-Set Surfaces. ACM Transactions on Graphics, 3
(2004) 264-270

64

J. Yang et al.

5. Levin, D.: Mesh-Independent Surface Interpolation. In: Geometric Modeling for Scientific
Visualization, Spinger-Verlag, (2003) 37-49
6. Fleishman, S., Cohen-Or, D., Silva, C. T.: Robust Moving Least-Squares Fitting with
Sharp Features. ACM Transactions on Graphics, 3 (2005) 544-552
7. Savchenko, V. V., Pasko, A., Okunev, O. G., Kunii, T. L.: Function Representation of
Solids Reconstructed from Scattered Surface Points and Contours. Computer Graphics
Forum, 4 (1995) 181-188
8. Turk, G., O'Brien, J.: Variational Implicit Surfaces. Technical Report GIT-GVU-99-15,
Georgia Institute of Technology, (1998)
9. Wendland, H.: Piecewise Polynomial, Positive Definite and Compactly Supported Radial
Functions of Minimal Degree. Advances in Computational Mathematics, (1995) 389-396
10. Morse, B. S., Yoo, T. S., Rheingans, P., Chen, D. T., Subramanian, K. R.: Interpolating
Implicit Surfaces from Scattered Surface Data Using Compactly Supported Radial Basis
Functions. In: Proceedings of Shape Modeling International, Genoa, Italy, (2001) 89-98
11. Carr, J. C., Beatson, R. K., Cherrie, J. B., Mitchell, T. J., Fright, W. R., McCallum, B. C.,
Evans, T. R.: Reconstruction and Representation of 3D Objects with Radial Basis
Functions. In: Proceedings of ACM Siggraph 2001, Los Angeles, CA , USA, (2001) 67-76
12. Beatson, R. K.: Fast Evaluation of Radial Basis Functions: Methods for Two-Dimensional
Polyharmonic Splines. IMA Journal of Numerical Analysis, 3 (1997) 343-372
13. Wu, X., Wang, M. Y., Xia, Q.: Implicit Fitting and Smoothing Using Radial Basis
Functions with Partition of Unity. In: Proceedings of the 9th International Computer-Aided Design and Computer Graphics Conference, Hong Kong, China, (2005) 351-360
14. Ohtake, Y., Belyaev, A., Seidel, H. P.: Multi-scale Approach to 3D Scattered Data
Interpolation with Compactly Supported Basis Functions. In: Proceedings of Shape
Modeling International, Seoul, Korea, (2003) 153-161
15. Tobor, I., Reuter, P., Schlick, C.: Multi-scale Reconstruction of Implicit Surfaces with
Attributes from Large Unorganized Point Sets. In: Proceedings of Shape Modeling
International, Genova, Italy, (2004) 19-30
16. Comaniciu, D., Meer, P.: Mean Shift: A Robust Approach toward Feature Space Analysis.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 5 (2002) 603-619
17. Cheng, Y. Z.: Mean Shift, Mode Seeking, and Clustering. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 8 (1995) 790-799
18. Ohtake, Y., Belyaev, A., Alexa, M., Turk, G., Seidel, H. P.: Multi-level Partition of Unity
Implicits. ACM Transactions on Graphics, 3 (2003) 463-470
19. Taubin, G.: Estimation of Planar Curves, Surfaces and Nonplanar Space Curves Defined
by Implicit Equations, with Applications to Edge and Range Image Segmentation. IEEE
Transaction on Pattern Analysis and Machine Intelligence, 11 (1991) 1115-1138
20. Boubekeur, T., Heidrich, W., Granier, X., Schlick, C.: Volume-Surface Trees. Computer
Graphics Forum, 3 (2006) 399-406
21. Schall, O., Belyaev, A., Seidel, H-P.: Robust Filtering of Noisy Scattered Point Data. In:
IEEE Symposium on Point-Based Graphics, Stony Brook, New York, USA, (2005) 71-77
22. Rusinkiewicz, S., Levoy, M.: Qsplat: A Multiresolution Point Rendering System for Large
Meshes. In: Proceedings of ACM Siggraph 2000, New Orleans, Louisiana, USA, (2000)
343-352
23. Lorensen, W. E., Cline, H. F.: Marching Cubes: A High Resolution 3D Surface
Construction Algorithm. Computer Graphics, 4 (1987) 163-169
24. Hoppe, H., DeRose, T., Duchamp, T., McDonald, J., Stuetzle, W.: Surface Reconstruction
from Unorganized Points. In: Proceedings of ACM Siggraph92, Chicago, Illinois, USA,
(1992) 71-78

The Shannon Entropy-Based Node Placement
for Enrichment and Simplification of Meshes

Vladimir Savchenko1, Maria Savchenko2, Olga Egorova3, and Ichiro Hagiwara3

1 Hosei University, Tokyo, Japan
vsavchen@k.hosei.ac.jp
2 InterLocus Inc., Tokyo, Japan
savchenko.m.aa@m.titech.ac.jp
3 Tokyo Institute of Technology, Tokyo, Japan
egorova.o.aa@m.titech.ac.jp, hagiwara@mech.titech.ac.jp

Abstract. In this paper, we present a novel, simple method based on the idea of exploiting the Shannon entropy as a measure of the inter-influence relationships between neighboring nodes of a mesh in order to optimize node locations. The method can be used in a pre-processing stage for subsequent studies such as finite element analysis, by providing better input parameters for these processes. Experimental results are included to demonstrate the functionality of our method.
Keywords: Mesh enrichment, Shannon entropy, node placement.

Introduction

Construction of a geometric mesh from a given surface triangulation has been discussed in many papers (see [1] and the references therein). Known approaches are guaranteed to pass through the original sample points, which is important in computer aided design (CAD). However, the results of triangulations depend drastically on the uniformity and density of the sampled points, as can be seen in Fig. 1. Surface remeshing has become very important today for CAD and computer graphics (CG) applications. Complex and detailed models can be generated by 3D scanners, and such models have found a wide range of applications in CG and CAD, particularly in reverse engineering. Surface remeshing is also very important for technologies related to engineering applications such as finite element analysis (FEA). Various automatic mesh generation tools are widely used for FEA. However, all of these tools may create distorted or ill-shaped elements, which can lead to inaccurate and unstable approximations. Thus, improvement of the mesh quality is an almost obligatory preprocessing step for mesh data in FEA. Recently, sampled point clouds have received much attention in the CG community for visualization purposes (see [2], [3]) and CAD applications (see [4], [5], [6]). A noise-resistant algorithm [6] for reconstructing a watertight surface from point cloud data presented by Kolluri et al. ignores undersampled regions; nevertheless, some examples suggest that undersampled areas need improvement by surface retouching or enrichment algorithms. In some
applications it is useful to have various, for instance simpler, versions of the original complex models, according to the requirements of the application, especially in FEA. Besides the deterioration in the accuracy of calculations, speed may also be sacrificed in some applications. Simplification of a geometric mesh, as considered here, involves constructing mesh elements that are optimized to improve the elements' shape quality, using the aspect ratio (AR) as the quality measure.

Fig. 1. Surface reconstruction of a technical data set. (a) Cloud of points (4100 scattered points are used). (b) Triangulation produced by a Delaunay-based method (N triangular elements: 7991, N points: 4100).

In this paper, we present an attempt at enrichment of mesh vertices according to an AR-based entropy, which is the analog of the Shannon entropy [7]; we call it A-entropy. We progressively adapt the newly arriving points by performing elementary interpolation operations, as proposed by Shepard [8] (see also [9] for more references), generating new point instances until an importance function If (in our case, a scalar which specifies the ideal sampling density) matches some user-specified value. A-entropy is also used to optimize node coordinates during the simplification process (edge collapsing).
Recently, a wide range of papers has addressed the problem of remeshing triangulated surface meshes; see, for instance, [10], [11] and the references therein, where surface remeshing based on surface parameterization and subsequent lifting of height data was applied. However, the main assumption is that geometric details are captured accurately in the given model. Nevertheless, as can be seen from the Darmstadt benchmark model (the technical data set shown in Fig. 1), a laser scanner often produces non-uniform samples, which leads to under-sampling, or the mesh may have holes corresponding to deficiencies in the point data. In theory, the problem of surface completion does not have a solution when the plane of the desired triangulation is not planar, and it especially begins to go wrong when the plane of triangulation is orthogonal to features in the hole boundary (so-called crenellation features); see a good discussion of the problem in [12].
Let us emphasize that our approach is different from methods related to the reconstruction of surfaces from scattered data by interpolation methods based, for instance, on minimum-energy properties (see, for example, [13]). In our case, an
approximation of the original surface (a triangular mesh) is given. In some applications it is important to preserve the initial mesh topology. Thus, our goal is to insert new points in domains where the If function does not satisfy the user-specified value. The main contribution of the paper is a novel vertex placement algorithm, which is discussed in detail in Section 2.

Vertex Placement Algorithm

The approach is extremely simple and, in principle, user dependent. In analogy to hole filling, the user defines an area where enrichment may be done, that is, the user selects a processing area with no crenellations. In practice, the whole object surface can be considered as the processing area (as has been done in the example shown in Fig. 1).

Fig. 2. Scheme of a new point generation. p1 , p2 , pi are points of the neighborhood.


The dashed line is a bisector of the empty sector, g is the generated point.

After that the algorithm works as follows:


1. Define a radius of sampling Rs as an analog of the point density; for instance, for the technical data set the radius is equal to 1. This can be done automatically by calculating the average point density.
2. For each point of the user-defined domain, select the K nearest points p that lie in the Rs neighborhood. If K is less than (or equal to) the user-predefined number of neighborhood points (in our experiments, 6) and the maximum angle of the empty area is larger than (or equal to) the user-predefined angle (in our experiments, 90°), generate a new point g with the initial guess provided by the bisector of the empty sector, as shown in Fig. 2.
3. Select a new neighborhood of the point g; it can be slightly different from the initial set of points. This is done in the tangent plane (projective plane) defined by the neighborhood points.
4. Perform a local Delaunay triangulation.
5. Find the points forming the set of triangles with the common node g (a star), as shown in Fig. 3(a). Calculate the new placement of the center g of the star using the technique described below (Fig. 3(b)).


6. On the lifting stage, calculate the local z-coordinate of g by Shepard interpolation. In our implementation, we use the compactly supported radial basis function [14] as the weight function (a small sketch of this step is given right after this list).
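
The following sketch illustrates the lifting of step 6: the local z-coordinate of the generated point g is obtained by Shepard interpolation of the neighbors' heights, here using Wendland's compactly supported function (1 − r)^4 (4r + 1) [14] as the weight; the function names and the support radius are assumptions made for the example.

    import numpy as np

    def wendland(r):
        # Wendland's C2 compactly supported radial function, zero for r >= 1.
        r = np.clip(r, 0.0, 1.0)
        return (1.0 - r) ** 4 * (4.0 * r + 1.0)

    def shepard_lift(g_xy, neighbors_xy, neighbors_z, support):
        # Shepard interpolation of the local z-coordinate at g (step 6 above).
        d = np.linalg.norm(neighbors_xy - g_xy, axis=1)
        w = wendland(d / support)
        if w.sum() == 0.0:
            return float(neighbors_z[np.argmin(d)])   # fall back to the nearest neighbor
        return float(np.dot(w, neighbors_z) / w.sum())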

Fig. 3. (a) An initial star. (b) The final star.

The key idea of the fifth step is to progressively adapt the newly created points through a few iterations; that is, an area with low sampling density will be filled in accordance with the points generated in the previous steps. In order to obtain a good set of coordinates for the new (approximated) points, we need a measure of the "goodness" of the triangulations arising from the randomly arriving points. It is natural to use a mesh quality parameter, the AR of the elements of a star, as such a measure. In the case of a triangular mesh, the AR can be defined as the ratio of the maximum edge length to the minimum edge length of an element. Nevertheless, according to our experiments it is much better to use an information measure M_i (M_i is the AR of the i-th triangle of the star) associated with a point g of the star (Fig. 3), in analogy with the Shannon entropy [7], which defines the uncertainty of a random variable and can be a natural measure for the criterion used in the enrichment algorithm. Shannon defined the entropy of an ensemble of messages: if there are N possible messages that can be sent in one package, and message m is transmitted with probability p_m, then the entropy is

    S = − Σ_{m=1}^{N} p_m log(p_m) .                                                     (1)

Intuitively, we can use an AR-based entropy with respect to the point g as follows:

    S = − Σ_{i=0}^{N} (M_i / M_t) log(M_i / M_t) ,                                       (2)

where M_t is the summed AR value of the star and N is the number of faces of the star. From the statistical point of view, a strict definition of the Shannon entropy for a mesh, which we denote as A-entropy and use in our algorithm, is obtained as follows: consider a discrete random variable ξ with distribution
The Shannon Entropy-Based Node Placement

x1 x2 ... xn
p1 p2 ... pn

69


()

where probabilities pi =P{=xi }. Then divide an interval 0 x < 1 into such


intervals i that the length of i equals to pi . Random variable is dened as
= xi , when i has distribution (). Suppose we have a set of empirically
received numbers 1 = a1 ,...,n = an written in its increasing order, where ai
is the AR of the i-th element of the neighborhood with the point g as a center.
Let these numbers dene a division of an interval a1 x < an into i = ai
- ai1 . In our case, the parameter ai has its minimal value equal to 1, which
is not necessarily achieved in given sampling data. Constructing the one-to-one
correspondence between 1 x < an and 0 x < 1 , the following probabilities
can be written:
    p_1 = (a_1 − 1)/(a_n − 1) ,   p_2 = (a_2 − a_1)/(a_n − 1) ,   ... ,   p_n = (a_n − a_{n−1})/(a_n − 1) .

Thus, we can define a random variable with the following distribution:

    ( a_1  a_2  ...  a_n )
    ( p_1  p_2  ...  p_n )

Its probability values are used in formula (3) for the A-entropy:

    A = − Σ_{i=1}^{N} p_i log(p_i) ,   p_i = (a_i − a_{i−1}) / (a_n − 1) ,   a_0 = 1.    (3)

The value A of the A-entropy depends on the coordinates of the center of the star (point g in Fig. 3). Thus, the problem of maximizing the value A is reduced to the problem of finding the new coordinates of this center (Fig. 3(b)) and is treated as an optimization problem. For solving this optimization problem we use the downhill simplex method of Nelder and Mead [15].
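
To illustrate the optimization, the sketch below computes the A-entropy of a star from the aspect ratios of its triangles, formula (3), and maximizes it over the position of the star center with the Nelder–Mead method via scipy.optimize.minimize; the representation of the star as an ordered ring of neighbor points and the helper names are simplifying assumptions.

    import numpy as np
    from scipy.optimize import minimize

    def aspect_ratio(tri):
        # AR = longest edge / shortest edge of a triangle given by three points.
        e = [np.linalg.norm(tri[i] - tri[(i + 1) % 3]) for i in range(3)]
        return max(e) / min(e)

    def a_entropy(center, ring):
        # Formula (3): probabilities from the sorted ARs a_1 <= ... <= a_n, with a_0 = 1.
        a = np.sort([aspect_ratio([center, ring[i], ring[(i + 1) % len(ring)]])
                     for i in range(len(ring))])
        denom = a[-1] - 1.0
        if denom <= 0.0:
            return 0.0                        # all triangles have AR = 1: nothing to improve
        p = np.diff(np.concatenate(([1.0], a))) / denom
        p = p[p > 0.0]
        return float(-(p * np.log(p)).sum())

    def optimize_center(center0, ring):
        # Maximize A by minimizing -A with the downhill simplex (Nelder-Mead) method [15].
        res = minimize(lambda c: -a_entropy(c, ring), center0, method="Nelder-Mead")
        return res.x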

Experimental Results

In practice, as can be seen in Fig. 4, the implementation of the algorithm discussed above leads to a reasonable surface reconstruction of areas with an initially low sampling density (see Fig. 1). The number of scattered points in the initial model is 4100; after enrichment the number of points increased to 12114. To decrease the number of points we then simplify this model, and the final number of points is 5261.
Our triangular mesh simplification method uses predictor-corrector steps: candidates for edge collapsing are predicted according to a bending energy [16], with a subsequent correction of the point placement in the simplified mesh. Let us note that the main idea of our simplification approach is to provide minimal local surface deformation during an edge collapse operation.


Fig. 4. The mechanical data set. (a) Mesh after enrichment. (b) Mesh after simplification. (c) Shaded image of the final surface.

At each iteration step:
- candidate points for an edge collapse are defined according to a local decimation cost of the points belonging to a shaped polygon;
- after all candidates have been selected, we contract the edge, choosing an optimal vertex position by using A-entropy according to the fifth step of the algorithm (see Section 2).
To detail the features of the proposed point placement scheme, Fig. 5 presents the results of applying well-known, even classical, surface simplification algorithms (tools can be found in [17]) and our method. We show a fragment of the Horse model (the initial AR value is equal to 1.74; here and further on, the average value of the AR is used) after mesh simplification produced by the different simplification techniques.

Fig. 5. Mesh fragments of the Horse model after simplification (13% of the original elements) using: (a) the progressive simplification method; (b) the method based on a global error bound; (c) the method based on a quadric error metric; (d) our method

The volume difference between the initial model and the one simplified by our technique is 0.8%; the final AR value is equal to 1.5. The global error bound method demonstrates the worst results: the volume difference is 1.3% and the final AR value is equal to 2.25. From the point of view of model visualization and volume preservation, the best method is, without any doubt, the one based on the quadric error metric, see [18]. However, there is a tradeoff between attaining a


high quality surface reconstruction and minimization of the AR. As a result, the final AR value is equal to 2 and many elongated and skinny triangles can be observed in the mesh.

Concluding Remarks

In this paper, we introduce the notion of an AR-based entropy (A-entropy), which is the analog of the Shannon entropy, and we describe an enrichment technique and a mesh quality improvement technique based on this notion.
The mesh quality improvement in the presented simplification technique can be compared with smoothing methods based on averaging of coordinates, such as Laplacian smoothing [19] or the angle-based method of Zhou and Shimada [20]. These methods have an intrinsic drawback: they may create inverted triangles, and in some non-convex domains nodes can be pulled outside the boundary. Using the entropy-based placement in the simplification algorithm decreases the possibility that a predicted point creates an inverted triangle, but does not guarantee that such an event never occurs. However, performing the operations in the tangent plane makes it sufficiently easy to avoid the creation of inverted triangles. Interpolation based on the Shepard method produces excessive bumps; in fact, this is a well-known feature of the original Shepard method. More sophisticated local interpolation schemes, such as [21] and others, can be implemented to control the quality of the interpolation. Matters related to feature-preserving shape interpolation have to be considered in the future. We have mentioned that it might seem natural to use the AR (the mesh quality parameter) of the elements of a star as a measure for providing a reasonable vertex placement. Nevertheless, we would like to emphasize that, according to our experiments, in many cases this alone does not lead to a well-founded estimate of the point g. It is a more rational approach to use the Shannon entropy as a measure of the inter-influence relationships between the neighboring nodes of a star to calculate optimal positions of vertices. We can see in Fig. 5 that the shapes of the mesh elements after applying our method differ significantly from the results of the other simplification methods. The meshes in Fig. 5(a, b, c) are more suitable for visualization than for numerical calculations. Improvement of meshes of very low initial quality, for instance the Horse model simplified by the global error bound method, takes many iteration steps to attain an AR value close to our result, and after applying Laplacian smoothing to the model shown in Fig. 5(b) the shape of the model is strongly deformed. After applying Laplacian smoothing (300 iteration steps) to the Horse model simplified by the quadric error metric method, the AR and the volume difference between the original model and the improved one become 1.6 and 5.2%, respectively. The examples demonstrated above show that the mesh produced by our method is closer to a computational mesh and can be used for FEA in any field of study dealing with isotropic meshes.


References
1. Frey P. J.: About Surface Remeshing. Proc.of the 9th Int.Mesh Roundtable (2000)
123-136
2. Alexa, M., Behr, J., Cohen-Or, D., Fleishman, S., Levin, D., Silva, C. T.: Point
Set Surfaces. Proc. of IEEE Visualization 2001 (2002) 21-23
3. Pauly, M., Gross, M., Kobbelt, L.: Efficient Simplification of Point-Sampled Surfaces. Proc. of IEEE Visualization 2002 (2002) 163-170
4. Hoppe, H., DeRose, T., Duchamp, T., McDonald, J.,Stuetzle, W.: Surface Reconstruction from Unorganized Points. Proceedings of SIGGRAPH 92 (1992) 71-78
5. Amenta, N., Choi, S., Kolluri, R.: The Powercrust. Proc. of the 6th ACM Symposium on Solid Modeling and Applications (2001) 609-633
6. Kolluri, R., Shewchuk, J.R., O'Brien, J.F.: Spectral Surface Reconstruction From
Noisy Point Clouds. Symposium on Geometry Processing (2004) 11-21
7. Blahut, R.E.: Principles and Practice of Information Theory. Addison-Wisley
(1987)
8. Shepard, D.: A Two-Dimensional Interpolation Function for Irregularly Spaced
Data. Proc. of the 23th Nat. Conf. of the ACM (1968) 517-523
9. Franke, R., Nielson, G.,: Smooth Interpolation of Large Sets of Scattered Data.
Journal of Numerical Methods in Engineering 15 (1980) 1691-1704
10. Alliez, P., de Verdiere, E.C., Devillers, O., Isenburg, M.: Isotropic Surface Remeshing. Proc.of Shape Modeling International (2003)49-58
11. Alliez, P., Cohen-Steiner, D., Devillers, O., Levy, B., Desburn, M.: Anisotropic
Polygonal Remeshing. Inria Preprint 4808 (2003)
12. Liepa, P.: Filling Holes in Meshes. Proc. of 2003 Eurographics/ACM SIGGRAPH
symp.on Geometry processing 43 200-205
13. Carr, J.C., Mitchell, T.J., Beatson, R.K., Cherrie, J.B., Fright, W.R., McCallumn,
B.C., Evans, T.R.: Filling Holes in Meshes. Proc.of SIGGRAPH01 (2001) 67-76
14. Wendland, H.: Piecewise Polynomial, Positive Dened and Compactly Supported
Radial Functions of Minimal Degree. AICM 4 (1995) 389-396
15. Nelder, J.A., Mead, R.: A simplex Method for Function Minimization. Computer
J. 7 (1965) 308-313
16. Bookstein, F.L.: Morphometric Tools for Landmarks Data. Cambridge University
Press (1991) Computer J. 7 (1965) 308-313
17. Schroeder, W., Martin, K., Lorensen,B.: The Visualization Toolkit. Ed.2 Prentice
Hall Inc. (1998)
18. Garland, M.: A Multiresolution Modeling: Survey and Future Opportunities. Proc.
of EUROGRAPHICS, State of the Art Reports (1999)
19. Bossen, F.J., Heckbert, P.S.: A Pliant Method for Anisotropic Mesh Generation.
Proc. of the 5th International Meshing Roundtable (1996) 63-74
20. Zhou, T., Shimada, K.: An Angle-Based Approach to Two-dimensional Mesh
Smoothing. Proc.of the 9th International Meshing Roundtable (2000) 373-384
21. Krysl, P., Belytchko, T.: An Ecient Linear-precision Partition of Unity Basis
for Unstructured Meshless Methods. Communications in Numerical Methods in
Engineering 16 (2000) 239-255

Parameterization of 3D Surface Patches by Straightest Distances
Sungyeol Lee and Haeyoung Lee
Hongik University, Dept. of Computer Engineering,
72-1 Sangsoodong Mapogu, Seoul Korea 121-791
{leesy, leeh}@cs.hongik.ac.kr

Abstract. In this paper, we introduce a new piecewise linear parameterization of 3D surface patches which provides a basis for texture mapping, morphing, remeshing, and geometry imaging. To lower distortion when flattening a 3D surface patch, we propose a new method to locally calculate straightest distances with cutting planes. Our new and simple technique demonstrates competitive results to the current leading parameterizations and will help many applications that require one-to-one mapping.

1 Introduction

A 3D mesh parameterization provides a piecewise linear mapping between a 3D surface patch and an isomorphic 2D patch. It is a widely used or required operation for texture-mapping, remeshing, morphing or geometry imaging. Guaranteed one-to-one mappings that only require a linear solver have been researched and many algorithms [4,5,11,8,10] were proposed. To reduce the inevitable distortions when flattening, a whole object is usually partitioned into several genus-0 surface patches. Non-linear techniques [19] have also been presented with good results in some applications, but they require more computational time than linear methods.
Geodesics on meshes have been used in various graphics applications such as parameterization [10], remeshing [14,20], mesh segmentation [20,6], and simulations of natural phenomena [16,9]. Geodesics provide a distance metric between vertices on meshes while the Euclidean metric cannot. The straightest geodesic path on meshes was introduced by Polthier and Schmies [15] and used for parameterization by [10]. However, their straightest geodesics may not be defined between a source and a destination and require special handling of the swallow tails created by conjugate vertices [16] and of triangles with obtuse angles [9].
In this paper we present a new linear parameterization of 3D surface patches. Our parameterization improves upon [10] by presenting a new way to locally calculate straightest geodesics. Our method demonstrates visually and statistically competitive results to the current leading methods [5,10], as shown in Figures 1, 3, 5, and Table 1.



(a) By Floater's (dist. 1.26)  (b) By Ours (dist. 1.20)  (c) By Ours with a fixed boundary  (d) By Ours with a measured boundary

Fig. 1. Comparisons with texture-mapped models, Hat and Nefertiti: (a) results from Floater's [5] with a distortion of 1.26. (b) is by our new parameterization with a distortion of 1.20, less than by Floater's. The distortion is measured by the texture stretch metric [19]. (c) is by ours with a fixed boundary and (d) is also by ours with a measured boundary. We can see much less distortion in (d) than in (c).

1.1 Related Work

Parameterization. There has been an increased need for a parameterization for texture-mapping, remeshing, morphing or geometry imaging. Many piecewise linear parameterization algorithms [4,5,11,8,10] were proposed. Generally the first step for parameterization is mapping boundary vertices to a fixed position. Usually the boundary is mapped to a square, a circle, or any convex shape while respecting the 3D-to-2D length ratio between adjacent boundary vertices. The positions of the interior vertices in the parameter space are then found by solving a linear system. The linear system is generated with coefficients in a convex combination of the 1-ring neighbors of each interior vertex. These coefficients characterize geometric properties such as angle and/or area preserving.
Geodesic Paths. There are several algorithms for geodesic computations on meshes, mostly based on shortest paths [13,1,7], and they have been used for remeshing and parameterization [20,14]. However, special processing for triangles with obtuse angles is still required. A detailed overview of this approach can be found in [12]. Another approach is to compute the straightest geodesic path. Polthier and Schmies first introduced an algorithm for the straightest geodesic path on a mesh [15]. Their straightest geodesic path is uniquely defined with an initial condition, i.e., a source vertex and a direction, but not with boundary conditions, i.e., a source and a destination. A parameterization by straightest geodesics was first introduced in [10]. They used locally calculated straightest geodesic distances for a piecewise linear parameterization. Our parameterization improves upon [10] by presenting a new way to calculate straightest geodesics.

2 Our Parameterization by Straightest Distances

A 3D mesh parameterization provides a piecewise linear mapping between a


3D surface patch and an isomorphic 2D patch. Generally the piecewise linear


parameterization is accomplished as follows: for every interior vertex $V_i$ of a mesh, a linear relation between the $(u_i, v_i)$ coordinates of this point and the $(u_j, v_j)$ coordinates of its 1-ring neighbors $\{V_j\}_{j \in N(i)}$ is set up in the form:

$$\sum_{j \in N(i)} a_{ij} (U_j - U_i) = 0 \qquad (1)$$

where $U_i = (u_i, v_i)$ are the coordinates of vertex $V_i$ in the parameter space, and $a_{ij}$ are the non-negative coefficients of matrix A. The boundary vertices are assigned to a circle, or any convex shape, while respecting the 3D-to-2D length ratio between adjacent boundary vertices. The parameterization is then found by solving the resulting linear system AU = B. A is sparse because each line in the matrix A contains only a few non-zero elements (as many as the number of its neighbors). A preconditioned bi-conjugate gradient (PBCG) method [17] is used to iteratively solve this sparse linear system.
As long as the boundary vertices are mapped onto a convex shape, the resulting mapping is guaranteed to be one-to-one. The core of this shape-preserving parameterization is how to determine the non-negative coefficients $a_{ij}$. In this paper, we propose a new algorithm to determine these coefficients.
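To make the setup of equation (1) concrete, the following sketch (our own illustration, not the authors' code; the function name and the use of SciPy's BiCGSTAB routine in place of the PBCG solver of [17] are assumptions) assembles the sparse system with pinned boundary vertices and solves for the interior (u, v) coordinates.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_parameterization(n_vertices, boundary_uv, coeffs):
    """Solve A U = B for the (u, v) coordinates of all vertices.

    boundary_uv : dict {vertex index: (u, v)} for boundary vertices on a convex shape
    coeffs      : dict {i: {j: a_ij}} non-negative weights of the 1-ring neighbors of i
    """
    A = sp.lil_matrix((n_vertices, n_vertices))
    B = np.zeros((n_vertices, 2))
    for i in range(n_vertices):
        if i in boundary_uv:                      # boundary vertices are pinned
            A[i, i] = 1.0
            B[i] = boundary_uv[i]
        else:                                     # sum_j a_ij (U_j - U_i) = 0
            A[i, i] = -sum(coeffs[i].values())
            for j, a in coeffs[i].items():
                A[i, j] = a
    A = A.tocsr()                                 # sparse row storage for the solver
    U = np.zeros((n_vertices, 2))
    for k in range(2):                            # solve for u and v separately
        U[:, k], _ = spla.bicgstab(A, B[:, k])
    return U
```

The design follows the text directly: each interior row encodes the convex combination of its 1-ring, each boundary row is an identity equation fixing the vertex at its convex-shape position, and the resulting sparse nonsymmetric system is solved iteratively.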
2.1 Our Local Straightest Distances

The core of this piecewise linear parameterization is finding the non-negative coefficients $a_{ij}$ in equation (1). Our new parameterization proposes to determine these coefficients by using locally straightest paths and distances with local cutting planes. The work by Lee et al. [10] uses the local straightest geodesics of Polthier and Schmies [15] for these coefficients; however, the tangents of the straightest geodesics in this previous method are determined by the Gaussian curvatures at vertices and may not be intuitively straightest, especially when the Gaussian curvature is not equal to $2\pi$. In Figure 2, $V_{ps}$ is determined by having the same left and right angle at $V_i$ as in [10], while $V_{our}$ is determined intuitively straightest by our local cutting plane.
Our new method for local straightest paths and distances works as follows. As shown in Figure 2, a base plane B is created locally at each interior vertex. To preserve shape better, the normal $Normal_B$ of the base plane B is calculated by area-weighted averaging of the neighboring face normals of $V_i$ as shown in equation (2), and normalized afterwards.

$$Normal_B = \sum_{j \in N(i)} w_j\, Normal_j \qquad (2)$$

In this way, we found that the distortion is lower than with a simply averaged normal of the neighboring faces. A local cutting plane P passing through $V_i$ and $V_j$ is also calculated. Two planes intersect in a line as long as they are not parallel. Our cutting plane P pierces a neighboring face (for example the j-th neighboring face) on the mesh. Therefore there is a line segment which is the straightest path by

Fig. 2. Our new local straightest path: For each interior vertex $V_i$, a local base plane B and a cutting plane P through $V_i$, $V_j$ are created. A local straightest path is computed by cutting the face $V_i V_k V_l$ with P. The intersection $V_j'$ is computed on the edge $V_k V_l$ and connected to $V_i$ to form a local straightest path. $V_{ps}$ is determined by Polthier and Schmies [15] and $V_{our}$ is determined by our new method.

Fig. 3. Results by our new parameterization: models are Nefertiti, Face, Venus, Man,
Mountain from the left to the right

our method. There may be multiple line intersections where the plane P pierces multiple neighboring faces. As future work, we will explore how to select a line segment.
A local straightest path is computed by intersecting the face $V_i V_k V_l$ with the cutting plane P. The tangent $a$ of this intersecting line segment $V_j V_j'$ can be easily calculated from the normal $Normal_j$ of the face $V_i V_k V_l$ and the normal $Normal_P$ of the cutting plane P as follows:

$$a = Normal_j \times Normal_P \qquad (3)$$

Then, the intersection vertex $V_j'$ is computed on the edge $V_k V_l$ and connected to $V_i$ for the local straightest path $V_j V_i V_j'$. Finally, barycentric coordinates for the weights of $V_j$, $V_k$, $V_l$ are computed, summed, normalized and then used to fill up the matrix A. Figure 3 shows the results of our new parameterization.
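A minimal sketch of this construction follows (our own illustration, not the authors' implementation; the function name, and the assumption that the cutting plane P contains the base-plane normal, are ours). It computes the cutting-plane normal, the tangent of equation (3) as a cross product, and the intersection point $V_j'$ on the edge $V_k V_l$.

```python
import numpy as np

def local_straightest_point(vi, vj, vk, vl, normal_b):
    """Intersect the cutting plane through Vi, Vj (assumed to contain Normal_B)
    with the opposite face Vi Vk Vl; return the tangent a and Vj' on edge Vk Vl."""
    # Normal of the cutting plane P spanned by (Vj - Vi) and the base-plane normal.
    normal_p = np.cross(vj - vi, normal_b)
    normal_p /= np.linalg.norm(normal_p)

    # Normal of the pierced face Vi Vk Vl.
    normal_f = np.cross(vk - vi, vl - vi)
    normal_f /= np.linalg.norm(normal_f)

    # Tangent of the intersection line segment, a = Normal_f x Normal_P (eq. 3).
    a = np.cross(normal_f, normal_p)

    # Vj' is where the edge Vk Vl crosses the cutting plane:
    # solve dot(Vk + t (Vl - Vk) - Vi, normal_p) = 0 for t (expected in [0, 1]).
    t = np.dot(vi - vk, normal_p) / np.dot(vl - vk, normal_p)
    vj_prime = vk + t * (vl - vk)
    return a, vj_prime
```

From $V_j'$, barycentric weights of $V_j$, $V_k$, $V_l$ with respect to the path endpoint can then be accumulated into the matrix A, as described above.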
2.2 Discussion

Floater's [5] is considered the most widely used parameterization, and LTD's [10] also used a straightest geodesic path algorithm based on [15]. Therefore we compare our method to these two existing parameterizations.
The visual results achieved by our new parameterization are shown in Figure 3. The distortion with the texture-stretch metric of [19] is also measured and shown in Table 1. Notice that our parameterization produces competitive results to the current leading linear parameterizations, and with a measured boundary it produces even better results. The previous algorithms and the distortion metric ($L^2$-norm, the mean stretch over all directions) are all implemented by us.

3 Measured Boundary for Parameterization

As shown in Figure 4 (b) and (c), and in the 1st and 2nd figures of Figure 3, high distortion always occurs near the boundary. To reduce this high distortion, we attempt to derive a boundary by our straightest geodesic path algorithm.
An interior source vertex S can be specified by a user or calculated as a center vertex of the mesh with respect to the boundary vertices. A virtual edge is defined as an edge between S and a vertex on the boundary. The straightest paths and distances of the virtual edges to every vertex on the boundary are measured as follows:
1. Make virtual edges connecting S to every boundary vertex of the mesh.
2. Map each virtual edge onto the base plane B by a polar map, which preserves the angles between virtual edges, as in [4]. The normal of the base plane B is calculated as previously mentioned in Section 2.
3. Measure the straightest distance of each virtual edge on B from S to each boundary vertex with the corresponding cutting planes.
4. Position each boundary vertex at the corresponding distance from S on B.
5. If the resulting boundary is non-convex, change it to a convex one. Find the edges having minimum angle with the consecutive edge (i.e., the concave part of the boundary) and move the boundary vertex to form a convex shape.
In the linear system AU = B, the boundary vertices in B are simply set to the measured positions $(u_i, v_i)$, and the inner vertices to (0, 0). Then the PBCG method mentioned in Section 2 is used to find the positions in the parameterized space.
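A minimal sketch of steps 1-4 is shown below (our own reading of the procedure; the convexification of step 5 is omitted, and the routine `straightest_distance` is assumed to be provided by the method of Section 2). Boundary positions are expressed relative to S placed at the origin of the base plane B.

```python
import numpy as np

def measured_boundary(boundary_ids, angles, straightest_distance):
    """Place boundary vertices on the base plane B by a polar map.

    boundary_ids           : boundary vertices in order around the patch
    angles[i]              : angle of the i-th virtual edge around S, already
                             normalized so that all angles sum to 2*pi
    straightest_distance(b): straightest distance from S to boundary vertex b
    """
    positions = {}
    theta = 0.0
    for b, ang in zip(boundary_ids, angles):
        d = straightest_distance(b)                   # step 3
        positions[b] = np.array([d * np.cos(theta),   # step 4
                                 d * np.sin(theta)])
        theta += ang                                  # polar map preserving angles (step 2)
    return positions
```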
Figure 4 (d) and (e) clearly shows the utility of our straightest geodesic paths with the simple models Testplane on the top and Testplane2 on the bottom. With a circular boundary, previous parameterizations [5,10] produce the same results in (b) for the two different models. In (c), there is also high distortion in the texture-mapping using (b). Our straightest path algorithm contributes to deriving two distinct measured boundaries and results in very low distortion in (d) and much better texture-mapping in (e).


(a) Models  (b) Circular boundary  (c) Textured by (b)  (d) Measured boundary  (e) Textured by (d)

Fig. 4. Comparisons between parameterizations with a fixed boundary and a measured boundary by our new method: With a circular boundary, previous parameterizations [5,10] produce the same results in (b) for the two different models in (a). Notice in (c) that there is a lot of distortion in the texture-mapping using the results in (b). Our straightest path algorithm contributes to creating a measured boundary that reduces distortion, with distinct results in (d) and much better texture-mapping in (e).

Fig. 5. Results by our new parameterization with different boundaries. Models are Face in the two left columns and Venus in the two right columns. The tip of the nose of each model is chosen as S.

Results with more complex models are demonstrated in Figure 5. Notice that there is always a high level of distortion near the fixed boundary but a low level of distortion near the measured boundary produced by our method. The straightest distances to the boundary vertices actually depend on the selection of the source vertex S. We simply use a vertex centered on the mesh with respect to the boundary as the source S. As future work, we will explore how to select the vertex S.

4 Results

The visual results of our method are shown in Figures 1, 3, and 5. The statistical results comparing our parameterization with other methods are listed in Table 1. Notice that visually and statistically our method produces results competitive with the previous methods.
Table 1. Comparisons of distortion measured by the texture stretch metric [19]: The boundary is fixed to a circle. Combined with measured boundaries by our straightest path algorithm, our new parameterization in the 6th column produces better results than the current leading methods.

Models     No. of Vertices  Floater's [5]   LTD's [10]      Our Param.      Our Param.
                            (fixed bound.)  (fixed bound.)  (fixed bound.)  (measured bound.)
Nefertiti        299            1.165           1.165           1.164           1.146
Man             1208            1.244           1.241           1.240           1.226
Face            1547            1.223           1.222           1.221           1.334
Venus           2024            2.159           2.162           2.168           1.263
Mountain        2500            1.519           1.552           1.550           1.119

The computational complexity of our algorithm is linear in the number of vertices, i.e., O(V). The longest processing time among our models in Table 1 is 0.53 sec, required for the Mountain model, which has the highest number of vertices. The processing time is measured on a laptop with a Pentium M 2.0 GHz and 1 GB RAM.

5 Conclusion and Future Work

In this paper, we introduce a new linear parameterization by locally straightest distances. We also demonstrate the utility of our straightest path algorithm to derive a measured boundary for parameterizations with better results.
Future work will extend the utility of our straightest path algorithm by applying it to other mesh processing techniques such as remeshing, subdivision, or simplification.

Acknowledgement
This work was supported by grant No. R01-2005-000-10120-0 from Korea Science
and Engineering Foundation in Ministry of Science & Technology.


References
1. Chen, J., Han, Y.: Shortest Paths on a Polyhedron; Part I: Computing Shortest Paths. Int. J. Comp. Geom. & Appl. 6(2), 1996
2. Desbrun, M., Meyer, M., Alliez, P.: Intrinsic Parameterizations of Surface Meshes. Eurographics 2002 Conference Proceedings, 2002
3. Floater, M., Gotsman, C.: How To Morph Tilings Injectively. J. Comp. Appl. Math., 1999
4. Floater, M.: Parametrization and Smooth Approximation of Surface Triangulations. Computer Aided Geometric Design, 1997
5. Floater, M.: Mean Value Coordinates. Comput. Aided Geom. Des., 2003
6. Funkhouser, T., Kazhdan, M., Shilane, P., Min, P., Kiefer, W., Tal, A., Rusinkiewicz, S., Dobkin, D.: Modeling by Example. ACM Transactions on Graphics, 2004
7. Kimmel, R., Sethian, J.A.: Computing Geodesic Paths on Manifolds. Proc. Natl. Acad. Sci. USA, Vol. 95, 1998
8. Lee, Y., Kim, H., Lee, S.: Mesh Parameterization with a Virtual Boundary. Computers and Graphics 26 (2002), 2002
9. Lee, H., Kim, L., Meyer, M., Desbrun, M.: Meshes on Fire. Computer Animation and Simulation 2001, Eurographics, 2001
10. Lee, H., Tong, Y., Desbrun, M.: Geodesics-Based One-to-One Parameterization of 3D Triangle Meshes. IEEE Multimedia, January/March (Vol. 12, No. 1), 2005
11. Meyer, M., Lee, H., Barr, A., Desbrun, M.: Generalized Barycentric Coordinates to Irregular N-gons. Journal of Graphics Tools, 2002
12. Mitchell, J.S.B.: Geometric Shortest Paths and Network Optimization. In Handbook of Computational Geometry, J.-R. Sack and J. Urrutia, Eds., Elsevier Science, 2000
13. Mitchell, J.S.B., Mount, D.M., Papadimitriou, C.H.: The Discrete Geodesic Problem. SIAM J. of Computing 16(4), 1987
14. Peyre, G., Cohen, L.: Geodesic Re-meshing and Parameterization Using Front Propagation. In Proceedings of VLSM'03, 2003
15. Polthier, K., Schmies, M.: Straightest Geodesics on Polyhedral Surfaces. Mathematical Visualization, 1998
16. Polthier, K., Schmies, M.: Geodesic Flow on Polyhedral Surfaces. Proceedings of the Eurographics-IEEE Symposium on Scientific Visualization '99, 1999
17. Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes in C, 2nd edition. Cambridge University Press, New York, USA, 1992
18. Riken, T., Suzuki, H.: Approximate Shortest Path on a Polyhedral Surface Based on Selective Refinement of the Discrete Graph and Its Applications. Geometric Modeling and Processing 2000 (Hong Kong), 2000
19. Sander, P.V., Snyder, J., Gortler, S.J., Hoppe, H.: Texture Mapping Progressive Meshes. Proceedings of SIGGRAPH 2001, 2001
20. Sifri, O., Sheffer, A., Gotsman, C.: Geodesic-based Surface Remeshing. In Proceedings of the 12th Intnl. Meshing Roundtable, 2003

Facial Expression Recognition Based on Emotion Dimensions on Manifold Learning
Young-suk Shin
School of Information and telecommunication Engineering, Chosun University,
#375 Seosuk-dong, Dong-gu, Gwangju, 501-759, Korea
ysshin@chosun.ac.kr

Abstract. This paper presents a new approach to recognizing facial expressions in various internal states using manifold learning (ML). The manifold learning of facial expressions reflects the local features of facial deformations such as concavities and protrusions. We developed a representation of facial expression images based on manifold learning for feature extraction of facial expressions. First, we propose a zero-phase whitening step for illumination-invariant images. Second, a facial expression representation from locally linear embedding (LLE) was developed. Finally, classification of facial expressions in emotion dimensions was performed on a two-dimensional structure of emotion with a pleasure/displeasure dimension and an arousal/sleep dimension. The proposed system maps facial expressions in various internal states into the embedding space described by LLE. We explore the locally linear embedding space as a facial expression space in the continuous dimensions of emotion.

1 Introduction
A challenging task in automatic facial expression recognition is to detect the change of facial expressions in various internal states. Facial expressions are continuous because the expression image varies smoothly as the expression changes. The variability of expression images can be represented as subtleties of manifolds, such as concavities and protrusions, in the image space. Thus automatic facial expression recognition has to detect subtleties of manifolds in the expression image space, and continuous dimensions of emotion are also required because the expression images consist of several other emotions and many combinations of emotions.
The dimensions of emotion can overcome the problem of a discrete recognition space because the discrete emotions can be treated as regions in a continuous space. The two most common dimensions are arousal (calm/excited) and valence (negative/positive). Russell argued that the dimensions of emotion can be applied to emotion recognition [1]. Peter Lang has assembled an international archive of imagery rated by arousal and valence with image content [2]. To recognize facial expressions in various internal states, we worked with dimensions of emotion instead of basic emotions or discrete emotion categories. The dimensions of emotion proposed here are the pleasure/displeasure dimension and the arousal/sleep dimension.


Many approaches [3, 4, 5, 6, 7] for representing facial expression images have been proposed, such as optic flow, EMG (electromyography), geometric tracking methods, Gabor representations, PCA (Principal Component Analysis) and ICA (Independent Component Analysis). In a recent study, Seung and Lee [8] proposed treating image variability as low-dimensional manifolds embedded in image space. Roweis and Saul [9] showed that the locally linear embedding algorithm is able to learn the global structure of nonlinear manifolds, such as the pose and expression of an individual's face. But there have been no reports on how the intrinsic features of the manifold contribute to facial expression recognition over various internal states.
We explore the global structure of nonlinear manifolds for various internal states using the locally linear embedding algorithm. This paper develops a representation of facial expression images based on locally linear embedding for feature extraction of various internal states. This representation consists of two steps, described in Section 3. First, we present a zero-phase whitening step for illumination-invariant images. Second, a facial expression representation from locally linear embedding is developed. A classification of facial expressions in various internal states is presented on the emotion dimensions, pleasure/displeasure and arousal/sleep, using a 1-nearest neighborhood classifier. Finally, we discuss the locally linear embedding space and the facial expression space on the dimensions of emotion.

2 Database on Dimensions of Emotion


The face expression images used for this research were a subset of the Korean facial expression database based on the dimension model of emotion [10]. The dimension model explains that the emotion states are not independent of one another and are related to each other in a systematic way. This model was proposed by Russell [1]. The dimension model also has cultural universals, as shown by Osgood, May & Miron and Russell, Lewicka & Niit [11, 12].
The data set with the dimension structure of emotion contained 498 images, 3 females and 3 males, each image of 640 by 480 pixels. Expressions were divided into two dimensions according to the study of internal states through the semantic analysis of words related with emotion by Kim et al. [13] using 83 expressive words. The two dimensions of emotion are the Pleasure/Displeasure dimension and the Arousal/Sleep dimension. Each female and male expressor posed 83 internal emotional state expressions when the 83 words of emotion were presented. 51 experimental subjects rated the pictures on the degree of expression in each of the two dimensions on a nine-point scale. The images were labeled with a rating averaged over all subjects. Examples of the images are shown in Figure 1. Figure 2 shows a result of the dimension analysis of 44 emotion words related to internal emotion states.

Fig. 1. Examples from the facial expression database in various internal states


Fig. 2. The dimension analysis of 44 emotion words related to internal emotion states

3 Facial Expression Representation from Manifold Learning


This section develops a representation of facial expression images based on locally linear embedding for feature extraction. This representation consists of two steps. In the first step, we perform a zero-phase whitening step for illumination-invariant images. In the second step, a facial expression representation from locally linear embedding is developed.
3.1 Preprocessing
The face images used for this research were centered using coordinates for eye and mouth locations, and then cropped and scaled to 20x20 pixels. The luminance was normalized in two steps. First, the rows of the images were concatenated to produce 1 x 400 dimensional vectors. The row means are subtracted from the dataset, X. Then X is passed through the zero-phase whitening filter, V, which is the inverse square root of the covariance matrix:

$$V = E\{XX^T\}^{-1/2}, \qquad Z = XV \qquad (1)$$

This indicates that the mean is set to zero and the variances are equalized to unit variance. Secondly, we subtract the local mean gray-scale value from each sphered patch. Through this process, Z removes much of the variability due to lighting. Fig. 3(a) shows original images before preprocessing and Fig. 3(b) shows images after preprocessing.
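The whitening of equation (1) can be sketched in a few lines of NumPy (our own illustration, not the authors' code; the small constant added to the eigenvalues is an assumption for numerical stability).

```python
import numpy as np

def zero_phase_whiten(X):
    """Zero-phase (ZCA) whitening of the N x 400 data matrix X, following eq. (1)."""
    X = X - X.mean(axis=1, keepdims=True)            # subtract the row means
    C = (X.T @ X) / X.shape[0]                       # covariance estimate E{x x^T}
    w, E = np.linalg.eigh(C)                         # C = E diag(w) E^T
    V = E @ np.diag(1.0 / np.sqrt(w + 1e-8)) @ E.T   # inverse square root C^(-1/2)
    Z = X @ V                                        # sphered data
    Z -= Z.mean(axis=1, keepdims=True)               # subtract the local mean of each patch
    return Z
```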


Fig. 3. (a) original images before preprocessing (b) images after preprocessing

3.2 Locally Linear Embedding Representation


The locally linear embedding algorithm [9] preserves the local neighbor structure of the data in both the embedding space and the observation space, and maps a given set of high-dimensional data points into a surrogate low-dimensional space.
Similar expressions on the continuous dimensions of emotion can exist in a local neighborhood on the manifold, and the mapping from the high-dimensional data points to the low-dimensional points on the manifold is very important for dimensionality reduction. LLE can overcome the problem of nonlinear dimensionality reduction, and its algorithm does not involve local minima [9]. Therefore, we applied the locally linear embedding algorithm to feature extraction of facial expressions.
The LLE algorithm is used to obtain the corresponding low-dimensional data Y of the training set X. X is a D by N matrix consisting of N data items in D dimensions; Y, a d by N matrix, consists of the d < D dimensional embedding data for X. The LLE algorithm can be described as follows.
Step 1: compute the neighbors of each data point in X.
Step 2: compute the weights W that best reconstruct each data point from its neighbors, minimizing the cost in eq. (2) under two constraints:

$$\varepsilon(W) = \sum_i \Big| x_i - \sum_{j=1}^{K} W_{ij}\, x_{ij} \Big|^2 \qquad (2)$$

First, each data point $x_i$ is reconstructed only from its neighbors, enforcing $W_{ij} = 0$ if $x_i$ and $x_j$ are not in the same neighborhood. Second, the rows of the weight matrix have a sum-to-one constraint, $\sum_j W_{ij} = 1$. Under these constraints the optimal weights $W_{ij}$ are computed by least squares. K is the number of nearest neighbors per data point.

Step 3: compute the vectors Y best reconstructed by the weights W, minimizing the quadratic form in eq. (3) via its bottom nonzero eigenvectors:

$$\Phi(Y) = \sum_i \Big| y_i - \sum_j W_{ij}\, y_{ij} \Big|^2 \qquad (3)$$

This optimization is performed subject to constraints. Considering that the cost $\Phi(Y)$ is invariant to translation in Y, the constraint $\sum_i y_i = 0$ removes this degree of freedom by requiring the coordinates to be centered on the origin. Also, $\frac{1}{N}\sum_i y_i y_i^T = I$ avoids the degenerate solution Y = 0. Therefore, eq. (3) can be rewritten as an eigenvector decomposition problem:

$$\Phi(Y) = \sum_i \Big| y_i - \sum_{j=1}^{k} W_{ij}\, y_{ij} \Big|^2 = \| (I - W)\,Y \|^2 = Y^T (I - W)^T (I - W)\, Y \qquad (4)$$

The optimal solution of eq. (3) is given by the smallest eigenvectors of the matrix $(I - W)^T (I - W)$. The eigenvector whose eigenvalue is zero is discarded, because discarding the eigenvector with eigenvalue zero enforces the constraint term. Thus we need to compute the bottom (d+1) eigenvectors of the matrix.
Therefore we obtain the corresponding low-dimensional data set Y in the embedding space from the training set X. Figure 4 shows facial expression images reconstructed from the bottom (d+1) eigenvectors corresponding to the d+1 smallest eigenvalues discovered by LLE, with K=3 neighbors per data point. In particular, the first eight components (d=8) discovered by LLE represent the features of facial expressions well. Facial expression images of various internal states are mapped into the embedding space described by the first two components of LLE (see Fig. 5). From Figure 5, we can explore the structural nature of facial expressions in various internal states in the embedding space modeled by LLE.
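The three steps above can be reproduced with a short NumPy routine; the sketch below is our own minimal implementation (the regularization of the local covariance is an assumption, and a library routine such as scikit-learn's LocallyLinearEmbedding could be used instead). Here X is arranged with one data item per row.

```python
import numpy as np

def lle(X, K=3, d=8, reg=1e-3):
    """Minimal LLE following steps 1-3: X is (N, D); returns the embedding Y of shape (N, d)."""
    N = X.shape[0]
    # Step 1: K nearest neighbors of each point (Euclidean distance).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :K]

    # Step 2: reconstruction weights with rows summing to one.
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[neighbors[i]] - X[i]                  # shifted neighbors
        C = Z @ Z.T                                 # local covariance (K x K)
        C += reg * np.trace(C) * np.eye(K)          # regularization (assumed constant)
        w = np.linalg.solve(C, np.ones(K))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: bottom eigenvectors of M = (I - W)^T (I - W);
    # the eigenvector with eigenvalue ~0 (the constant vector) is discarded.
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:d + 1]
```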

Fig. 4. Facial expression images reconstructed from bottom (d+1) eigenvectors (a) d=1,
(b) d=3, and (c) d=8


Fig. 5. 318 facial expression images of various internal states mapped into the embedding space
described by the first two components of LLE

The further a point is away from the center point, the higher the intensity in the displeasure and arousal dimensions. Near the center point, facial expression images of various internal states coexist.

4 Result and Discussion


Facial expression recognition in various internal states with features extracted by the LLE algorithm was evaluated by a 1-nearest neighborhood classifier on the two-dimensional structure of emotion with the pleasure/displeasure dimension and the arousal/sleep dimension. 252 images were used for training and 66 images excluded from the training set were used for testing. The 66 test images include 11 expression images of each of six people. The class label to be recognized consists of four sections of the two-dimensional structure of emotion. Fig. 6 shows the sections of each class label.
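The classification step itself is a plain 1-nearest-neighbour rule in the LLE feature space; a minimal sketch follows (our own illustration with assumed function and variable names; the class labels C1-C4 of the training images are taken as given).

```python
import numpy as np

def classify_1nn(train_features, train_labels, test_features):
    """Assign each test sample the class label (C1-C4) of its nearest
    training sample in the LLE embedding space (Euclidean distance)."""
    predictions = []
    for f in test_features:
        nearest = np.argmin(np.linalg.norm(train_features - f, axis=1))
        predictions.append(train_labels[nearest])
    return np.array(predictions)
```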
Table 1 gives results of facial expression recognition by the proposed algorithm on the two dimensions of emotion and shows a part of the whole set. The recognition rate on the test set was 90.9% in the Pleasure/Displeasure dimension and 56.1% in the Arousal/Sleep dimension. In Table 1, the first column indicates the emotion words of the 11 expression images used for testing, the second and third columns give the values of the test data on the bipolar dimensions, the fourth column indicates the class label (C1, C2, C3, C4) of the test data, and the classification results of the proposed algorithm are shown in the fifth column.

[Figure 6 depicts the two-dimensional structure of emotion, with pleasure (0) to displeasure (10) on the horizontal axis and sleep (0) to arousal (10) on the vertical axis, divided into four class regions: C3 (pleasure/arousal), C2 (displeasure/arousal), C4 (pleasure/sleep) and C1 (displeasure/sleep).]
Fig. 6. The class region on two dimensional structure of emotion


Table 1. Results of facial expression recognition by the proposed algorithm (Abbreviations: P-D, pleasure/displeasure; A-S, arousal/sleep)
Emotion word (person)   P-D     A-S     Class label of test data   Class label recognized by proposed algorithm
pleasantness (a)        1.40    5.47    3                          3
depression (a)          6.00    4.23    1                          1
crying (a)              7.13    6.17    2                          2
gloomy (a)              5.90    3.67    1                          1
strangeness (a)         6.13    6.47    2                          1
proud (a)               2.97    5.17    3                          1
confident (a)           2.90    4.07    4                          3
despair (a)             7.80    5.67    1                          1
sleepiness (a)          6.00    1.93    4                          1
likable (a)             2.07    4.27    4                          3
delight (a)             1.70    5.70    3                          3
gloomy (b)              6.60    3.83    1                          2
strangeness (b)         6.03    5.67    2                          4
proud (b)               2.00    4.53    4                          3
confident (b)           2.47    5.27    4                          1
despair (b)             6.47    5.03    2                          2
sleepiness (b)          6.50    3.80    1                          1
likable (b)             1.83    4.97    4                          4
delight (b)             2.10    5.63    3                          4
boredom (b)             6.47    5.73    2                          3
tedious (b)             6.73    4.77    1                          1
jealousy (b)            6.87    6.80    2                          2

This paper explores two problems. One is a new approach to recognizing facial expressions in various internal states using the locally linear embedding algorithm. The other is the structural nature of facial expressions in various internal states in the embedding space modeled by LLE.


Regarding the first problem, the recognition results in each dimension with the 1-nearest neighborhood classifier were a significant 90.9% in the Pleasure/Displeasure dimension and 56.1% in the Arousal/Sleep dimension. The two-dimensional structure of emotion appears as a stable structure for facial expression recognition; the Pleasure/Displeasure dimension is more stable than the Arousal/Sleep dimension. In the second case, facial expressions in the continuous dimensions of emotion showed a cross structure in the locally linear embedding space. The further a point is away from the center point, the higher the intensity in the displeasure and arousal dimensions. From these results, we can see that the facial expression structure on the continuous dimensions of emotion is very similar to the structure represented by the manifold model.
Thus our results suggest that the relationships of facial expressions in various internal states can be captured by the manifold model. In future work, we will consider learning invariant manifolds of facial expressions.
Acknowledgements. This work was supported by the Korea Research Foundation
Grant funded by the Korean Government (KRF-2005-042-D00285).

References
1. Russell, J.A.: Evidence of convergent validity on the dimension of affect. Journal of Personality and Social Psychology, 30 (1978) 1152-1168
2. Lang, P.J.: The emotion probe: Studies of motivation and attention. American Psychologist, 50(5) (1995) 372-385
3. Donato, G., Bartlett, M., Hager, J., Ekman, P., Sejnowski, T.: Classifying facial actions. IEEE PAMI, 21(10) (1999) 974-989
4. Schmidt, K., Cohn, J.: Dynamics of facial expression: Normative characteristics and individual differences. Intl. Conf. on Multimedia and Expo, 2001
5. Pantic, M., Rothkrantz, L.J.M.: Towards an Affect-Sensitive Multimodal Human-Computer Interaction. Proc. of the IEEE, 91, 1370-1390
6. Shin, Y., An, Y.: Facial expression recognition based on two dimensions without neutral expressions. LNCS 3711 (2005) 215-222
7. Bartlett, M.: Face Image Analysis by Unsupervised Learning. Kluwer Academic Publishers (2001)
8. Seung, H.S., Lee, D.D.: The manifold ways of perception. Science 290 (2000) 2268-2269
9. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290 (2000) 2323-2326
10. Bahn, S., Han, J., Chung, C.: Facial expression database for mapping facial expression onto internal state. '97 Emotion Conference of Korea (1997) 215-219
11. Osgood, C.E., May, W.H., Miron, M.S.: Cross-Cultural Universals of Affective Meaning. Urbana: University of Illinois Press (1975)
12. Russell, J.A., Lewicka, M., Niit, T.: A cross-cultural study of a circumplex model of affect. Journal of Personality and Social Psychology, 57 (1989) 848-856
13. Kim, Y., Kim, J., Park, S., Oh, K., Chung, C.: The study of dimension of internal states through word analysis about emotion. Korean Journal of the Science of Emotion and Sensibility, 1 (1998) 145-152

AI Framework for Decision Modeling in Behavioral Animation of Virtual Avatars
A. Iglesias1 and F. Luengo2
1 Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, 39005, Santander, Spain
2 Department of Computer Science, University of Zulia, Post Office Box #527, Maracaibo, Venezuela
iglesias@unican.es, fluengo@cantv.net

Abstract. One of the major current issues in Artificial Life is the decision modeling problem (also known as goal selection or action selection). Recently, some Artificial Intelligence (AI) techniques have been proposed to tackle this problem. This paper introduces a new Artificial-Intelligence-based framework for decision modeling. The framework is applied to generate realistic animations of virtual avatars evolving autonomously within a 3D environment and being able to follow intelligent behavioral patterns from the point of view of a human observer. Two examples of its application to different scenarios are also briefly reported.

1 Introduction

The realistic simulation and animation of the behavior of virtual avatars emulating human beings (also known as Artificial Life) has attracted much attention during the last few years [2,5,6,7,8,9,10,11,12,13]. A major goal in behavioral animation is the construction of an intelligent system able to integrate the different techniques required for the realistic simulation of the behavior of virtual humans. The challenge is to provide the virtual avatars with a high degree of autonomy, so that they can evolve freely, with minimal input from the animator. In addition, this animation is expected to be realistic; in other words, the virtual avatars must behave according to reality from the point of view of a human observer.
Recently, some Artificial Intelligence (AI) techniques have been proposed to tackle this problem [1,3,4,8]. This paper introduces a new Artificial-Intelligence-based framework for decision modeling. In particular, we apply several AI techniques (such as neural networks, expert systems, genetic algorithms, K-means) in order to create a sophisticated behavioral system that allows the avatars to take intelligent decisions by themselves. The framework is applied to generate realistic animations of virtual avatars evolving autonomously within a 3D environment and being able to follow intelligent behavioral patterns from the point of view of a human observer. Two examples of the application of this framework to different scenarios are briefly reported.



The structure of this paper is as follows: the main components of our behavioral system are described in detail in Section 2. Section 3 discusses the
performance of this approach by means of two simple yet illustrative examples.
Conclusions and future lines in Section 4 close the paper.

2 Behavioral System

In this section the main components of our behavioral system are described.
2.1 Environment Recognition

In the first step, a virtual world is generated and the virtual avatars are placed within it. In the examples described in this paper, we have chosen a virtual park and a shopping center, carefully chosen environments that exhibit lots of potential object-avatar interactions. In order to interact with the 3D world, each virtual avatar is equipped with a perception subsystem that includes a set of individual sensors to analyze the environment and capture relevant information. This analysis includes the determination of distances and positions of the different objects of the scene, so that the agent can move in this environment, avoid obstacles, identify other virtual avatars and take decisions accordingly. Further, each avatar has a predefined vision range (given by a distance threshold value determined by the user); hence, an object is considered to be visible only if the distance from the avatar to the object is less than this threshold value; otherwise, the object becomes invisible.
All this information is subsequently sent to an analyzer subsystem, where it is processed by using a representation scheme based on genetic algorithms. This scheme has proved to be extremely useful for pattern recognition and identification. Given a pair of elements A and B and a sequence j, there is a distance function that determines how near these elements are. It is defined as

$$dist(j, A, B) = \frac{1}{k}\sum_{i=1}^{k} |A_i^j - B_i^j|,$$

where $A_i^j$ denotes the i-th gene at sequence j of the chromosome A, and k denotes the number of genes of such a sequence. Note that we can think of sequences in terms of levels in a tree: the sequence j is simply the level j down the tree at which it appears, with the top of the tree as sequence 1. A and B are similar at sequence (or level) j if dist(j, A, B) = 0. Note that this hierarchical structure implies that an arbitrary object is nearer to the one minimizing the distance at earlier sequences. This simple expression provides a quite accurate procedure to classify objects at a glance, by simply comparing them sequentially at each depth level.
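A minimal sketch of this distance and of the level-by-level comparison is given below (our own illustration; representing a chromosome as a list of sequences, each a list of numeric genes, is an assumption).

```python
def sequence_distance(j, A, B):
    """dist(j, A, B) = (1/k) * sum_i |A_i^j - B_i^j| over the k genes of sequence j.

    A and B are chromosomes represented as lists of sequences (tree levels),
    each sequence being a list of numeric genes of equal length.
    """
    genes_a, genes_b = A[j], B[j]
    k = len(genes_a)
    return sum(abs(a - b) for a, b in zip(genes_a, genes_b)) / k

def similar_at_levels(A, B, depth):
    """Compare two elements sequentially at each depth level: they are
    'similar at level j' when the distance at that level is zero."""
    return [sequence_distance(j, A, B) == 0 for j in range(depth)]
```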
2.2 Knowledge Acquisition

Once new information is attained and processed by the analyzer, it is sent to the
knowledge motor. This knowledge motor is actually the brain of our system. Its
main components are depicted in Figure 1(left). Firstly, the current information


Fig. 1. (left) Knowledge motor scheme; (right) goal selection subsystem scheme

is temporarily stored in the knowledge buffer until new information is attained. At that time, the previous information is sent to the knowledge updater (KU), the new one being stored in the knowledge buffer, and so on. This KU updates both the memory area and the knowledge base.
The memory area is a neural network applied to learn from data (in our problem, the information received from the environment through the perception subsystem). In this paper we consider unsupervised learning, and hence we use an autoassociative scheme, since the inputs themselves are used as targets. To update the memory area, we employ the K-means least-squares partitioning algorithm for competitive networks, which are formed by an input and an output layer connected by feed-forward connections. Each input pattern represents a point in the configuration space (the space of inputs) where we want to obtain classes. This type of architecture is usually trained with a winner-takes-all algorithm, so that only those weights associated with the output neuron with the largest value (the winner) are updated. The basic algorithm consists of two main steps: (1) compute cluster centroids and use them as new cluster seeds, and (2) assign each chromosome to the nearest centroid. The basic idea behind this formulation is to overcome the limitation of having more data than neurons by allowing each neuron to store more than one data item at the same time.
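The two-step update can be sketched as follows (our own illustration with assumed names; it is the plain K-means loop described above, where each centroid plays the role of an output neuron that may store several data items).

```python
import numpy as np

def kmeans_memory_update(data, centroids, n_iter=10):
    """Two-step K-means partitioning used to update the memory area:
    assign each chromosome to the nearest centroid, then recompute the
    centroids as the new cluster seeds."""
    assignments = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        # Step 2: assign each data item (chromosome) to the nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=-1)
        assignments = np.argmin(dists, axis=1)
        # Step 1: recompute each centroid as the mean of its assigned items.
        for c in range(len(centroids)):
            members = data[assignments == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return centroids, assignments
```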
The knowledge base is actually a rule-based expert system, containing both concrete knowledge (facts) and abstract knowledge (inference rules). Facts include complex relationships among the different elements (relative positions, etc.) and personal information about the avatars (personal data, schedule, hobbies or habits), i.e. what we call the avatars' characteristic patterns. Additional subsystems for tasks like learning, coherence control, action execution and others have also been incorporated. This deterministic expert system is subsequently modified by means of probabilistic rules, for which new data are used in order to update the probability of a particular event. Thus, the neuron does not exhibit a deterministic output but a probabilistic one: what is actually computed is the probability of a neuron storing a particular data item at a particular time. This probability is continuously updated in order to adapt our recalls to the most recent data. This leads to the concept of reinforcement, based on the fact that the repetition of a particular event over time increases the probability of recalling it.


Of course, some particular data are associated with high-relevance events whose influence does not decrease over time. A learning rate parameter introduced in our scheme is intended to play this role.
Finally, the request manager is the component that, on the basis of the information received from the previous modules, provides the information requested by the goal selection subsystem described in the next section.
2.3 Decision Modeling

A central issue in behavioral animation is the adequate choice of appropriate mechanisms for decision modeling. Those mechanisms take a decision about which is the next action to be carried out from a set of feasible actions. The fundamental task of any decision modeling module is to determine a priority-sorted list of goals to be performed by the virtual agent. A goal's priority is calculated as a combination of different internal states of the avatar (given by mathematical functions not described in this paper because of space limitations) and external factors (which determine the goals' feasibility).
Figure 1(right) shows the architecture of our goal selection subsystem, comprised of three modules and a goal database. The database stores a list of arrays (associated with each of the goals available at each time) comprised of: the goal ID, its feasibility rate (determined by the analyzer subsystem), the priority of the goal, the wish rate (determined by the emotional analyzer), the time at which the goal is selected, and its success rate.
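For illustration, one entry of such a goal database could be represented as follows (a sketch with field names of our own choosing, not the authors' data structure).

```python
from dataclasses import dataclass

@dataclass
class GoalRecord:
    """One array of the goal database described above (field names are ours)."""
    goal_id: int
    feasibility: float    # determined by the analyzer subsystem
    priority: float       # computed by the intention planning module
    wish_rate: float      # in [0, 100], updated by the emotional analyzer
    selected_at: float    # time at which the goal was selected
    success_rate: float
```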
The emotional analyzer (EA) is the module responsible for updating the wish rate of a goal (regardless of its feasibility). Such a rate takes values in the interval [0, 100] according to some mathematical functions (not described here) that simulate human reactions in a very realistic way (as shown in Section 3).
The intention planning (IP) module determines the priority of each goal. To this aim, it uses information such as the feasibility and wish rates. From this point of view, it is rather similar to the intention generator of [13], except that decisions in that system are exclusively based on rules. This module also comprises a buffer to store temporarily those goals interrupted for a while, so that the agent exhibits a certain persistence of goals. This feature is especially valuable to prevent avatars from the oscillatory behavior that appears when the current goal changes continuously.
The last module is the action planning (AP), a rule-based expert system that gets information from the environment (via the knowledge motor), determines the sequence of actions to be carried out in order to achieve a particular goal, and updates the goals' status accordingly.
2.4 Action Planning and Execution

Once the goals and priorities are defined, this information is sent to the motion subsystem to be transformed into motion routines (just as the orders of our brain are sent to our muscles) and then animated in the virtual world. Currently, we


Fig. 2. Example 1: screenshots of the virtual park environment

have implemented routines for path planning and obstacle avoidance. In particular, we have employed a modification of the A* path finding algorithm, based on the idea of preventing path recalculation until a new obstacle is reached. This simple procedure has yielded substantial savings in time in all our experiments. In addition, sophisticated collision avoidance algorithms have been incorporated into this system (see the examples described in Section 3).
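The replanning idea can be sketched as a thin wrapper around any A* routine (our own illustration; `astar`, `grid.is_blocked` and the `agent` fields are placeholders, not functions of the actual system): the cached path is reused until the next cached step is blocked by a newly detected obstacle.

```python
def follow_path(agent, goal, grid, astar):
    """Advance the agent one step, recomputing the A* path only when the
    next cached step is blocked by a newly discovered obstacle."""
    if not agent.path or agent.path[-1] != goal:
        agent.path = astar(agent.position, goal, grid)     # initial plan
    if agent.path and grid.is_blocked(agent.path[0]):      # new obstacle reached
        agent.path = astar(agent.position, goal, grid)     # replan from here
    if agent.path:
        agent.position = agent.path.pop(0)                 # take the next cached step
    return agent.position
```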

3 Two Illustrative Examples

In this section, two illustrative examples are used to show the good performance of our approach. The examples are available from Internet at the URLs:
http://personales.unican.es/iglesias/CGGM2007/samplex.mov (x = 1, 2).
Figure 2 shows some screenshots from the first movie. In picture (a) a woman and her two children go into the park. The younger kid runs after some birds. After failing to capture them, he gets bored and joins his brother. Then, the group moves towards the wheel avoiding the trees and the seesaw (b). Simultaneously, other people (the husband and a girl) enter the park. In (c) a kid is playing with the wheel while his brother gets frustrated after expecting to play with the seesaw (in fact, he was waiting for his brother beside the seesaw). After a while, he decides to join his brother and play with the wheel anyway. Once her children are safely playing, the woman relaxes and goes to meet her husband, who is seated on a bench (d). The girl is seated in front of them, reading a newspaper. Two more people go into the park: a man and a kid. The kid goes directly towards the playground, while the man sees the girl, becomes attracted by her and decides to sit down on the same bench, looking for a chat. As she does not want to chat with him, she stands up and leaves. The new kid goes to play with the wheel while the two brothers decide to play with the seesaw. The playground has two seesaws, so each brother goes towards the nearest one (e). Suddenly, they realize they must use the same one, so one brother changes his trajectory and moves towards the other seesaw. The mother is coming back in order to look after her children. Her husband also comes behind her and they


Fig. 3. Temporal evolution of the internal states (top) and of the available goals' wishes (bottom) for the second example in this paper

start to chat again (f). The man on the bench is now alone and getting upset, so he decides to take a walk and look for the girl again. Simultaneously, she starts to do physical exercises (g). When the man realizes she is busy and hence will likely not pay attention to him, he changes his plans and walks towards the couple, who are still chatting (g). The man realizes they are not interested in chatting with him either, so he finally leaves the park.
It is interesting to point out that the movie includes a number of remarkable motion and behavioral features. For instance, pictures (a)-(b)-(g) illustrate several of our motion algorithms: pursuit, obstacle avoidance, path finding, interaction with objects (wheel, seesaw, bench) and other avatars, etc. People in the movie exhibit a remarkable ability to capture information from the environment and change their trajectories in real time. On the other hand, they also exhibit a human-like ability to realize what is going on with others and change their plans accordingly. Each virtual avatar has no previous knowledge of either the environment or the other avatars, as happens in real life when people enter a new place for the first time or meet new people.
The second scene consists of a shopping center in which the virtual avatars can perform a number of different actions, such as eat, drink, play videogames, sit down to rest and, of course, do shopping. We consider four virtual avatars: three kids and a woman. The pictures in Figure 3 are labelled with eight numbers indicating the different simulation milestones (the corresponding animation screenshots for those time units are displayed in Figure 4): (1) at the initial


Fig. 4. Example 2: screenshots of the shopping center environment

step, the three kids go to play with the videogame machines, while the woman moves towards the eating area (indicated by the tables in the scene). Note that the internal state with the highest value for the avatar analyzed in this work is the energy, so the avatar is going to perform some kind of dynamic activity, such as playing; (2) the kid keeps playing (and the energy level keeps going down) until his/her satisfaction reaches its maximum value. At that time, the anxiety increases, and the avatar's wish turns towards performing a different activity. However, the goal "play videogame" still has the highest wish rate, so it remains in progress for a while; (3) at this simulation step, the anxiety reaches a local maximum again, meaning that the kid is getting bored of playing videogames. Simultaneously, the goal with the highest value is "drink water", so the kid stops playing and looks for a drink machine; (4) at this time, the kid reaches the drink machine, buys a can and drinks. Consequently, the internal state function "thirsty" decreases as the agent drinks, until the status of this goal becomes "goal attained"; (5) once this goal is satisfied, the goal "play videogames" is the new current goal, so the kid comes back towards the videogame machines; (6) however, the energy level is very low, so the goal "play videogames" is interrupted, and the kid looks for a bench to sit down and have a rest; (7) once seated, the energy level goes up again and the goal "have a rest" does not apply anymore; (8) since the previous goal "play videogames" is still in progress, the agent comes back and plays again.
Figure 3 shows the temporal evolution of the internal states (top) and the goals' wishes (bottom) for one of the kids. Similar graphics can be obtained for the other avatars (they are not included here because of space limitations). The picture on the top displays the temporal evolution of the five internal state functions (valued on the interval [0, 100]) considered in this example, namely energy, shyness, anxiety, hunger and thirsty. On the bottom, the wish rate (also valued on the interval [0, 100]) of the feasible goals ("have a rest", "eat something", "drink water", "take a walk" and "play videogame") is depicted.


4 Conclusions and Future Work

The core of this paper is the realistic simulation of the human behavior of virtual avatars living in a virtual 3D world. To this purpose, the paper introduces a behavioral system that uses several Artificial Intelligence techniques so that the avatars can behave in an intelligent and autonomous way. Future lines of research include the determination of new functions and parameters to reproduce human actions and decisions and the improvement of both the interaction with users and the quality of graphics. Financial support from the Spanish Ministry of Education and Science (Project Ref. #TIN2006-13615) is acknowledged.

References
1. Funge, J., Tu, X. Terzopoulos, D.: Cognitive modeling: knowledge, reasoning and
planning for intelligent characters, SIGGRAPH99, (1999) 29-38
2. Geiger, C., Latzel, M.: Prototyping of complex plan based behavior for 3D actors,
Fourth Int. Conf. on Autonomous Agents, ACM Press, NY (2000) 451-458
3. Granieri, J.P., Becket, W., Reich, B.D., Crabtree, J., Badler, N.I.: Behavioral control for real-time simulated human agents, Symposium on Interactive 3D Graphics,
ACM, New York (1995) 173-180
4. Grzeszczuk, R., Terzopoulos, D., Hinton, G.: NeuroAnimator: fast neural network
emulation and control of physics-based models. SIGGRAPH98 (1998) 9-20
5. Iglesias A., Luengo, F.: New goal selection scheme for behavioral animation of
intelligent virtual agents. IEICE Trans. on Inf. and Systems, E88-D(5) (2005)
865-871
6. Luengo, F., Iglesias A.: A new architecture for simulating the behavior of virtual
agents. Lectures Notes in Computer Science, 2657 (2003) 935-944
7. Luengo, F., Iglesias A.: Framework for simulating the human behavior for intelligent virtual agents. Lectures Notes in Computer Science, 3039 (2004) Part I:
Framework architecture. 229-236; Part II: Behavioral system 237-244
8. Monzani, J.S., Caicedo, A., Thalmann, D.: Integrating behavioral animation techniques. EUROGRAPHICS2001, Computer Graphics Forum 20(3) (2001) 309-318
9. Raupp, S., Thalmann, D.: Hierarchical model for real time simulation of virtual
human crowds. IEEE Trans. Visual. and Computer Graphics. 7(2) (2001) 152-164
10. Sanchez, S., Balet, O., Luga, H., Dutheu, Y.; Autonomous virtual actors. Lectures
Notes in Computer Science 3015 (2004) 68-78
11. de Sevin, E., Thalmann, D.: The complexity of testing a motivational model of action selection for virtual humans, Proceedings of Computer Graphics International,
IEEE CS Press, Los Alamitos, CA (2004) 540-543
12. Thalmann, D., Monzani, J.S.: Behavioural animation of virtual humans: what kind
of law and rules? Proc. Computer Animation 2002, IEEE CS Press (2002)154-163
13. Tu, X., Terzopoulos, D.: Articial shes: physics, locomotion, perception, behavior.
Proceedings of ACM SIGGRAPH94 (1994) 43-50

Studies on Shape Feature Combination and Efficient Categorization of 3D Models
Tianyang Lv1,2, Guobao Liu1, Jiming Pang1, and Zhengxuan Wang1
1

College of Computer Science and Technology, Jilin University, Changchun, China


College of Computer Science and Technology, Harbin Engineering University, Harbin, China
raynor1979@163.com

Abstract. In the field of 3D model retrieval, the combination of different kinds of shape features is a promising way to improve retrieval performance, and the efficient categorization of 3D models is critical for organizing models. This paper proposes a combination method that automatically decides the fixed weight of each shape feature. Based on the combined shape feature, the paper applies cluster analysis to efficiently categorize 3D models according to their shape. The standard 3D model database, the Princeton Shape Benchmark, is adopted in the experiments, and our method shows good performance not only in improving retrieval but also in categorization.

Keywords: Shape-based 3D model retrieval; feature combination; categorization; clustering.

1 Introduction

With the proliferation of 3D models and their wide spread through the Internet, 3D model retrieval emerges as a new field of multimedia retrieval and has great application value in industry, the military, etc. [1]. Similar to the studies in image or video retrieval, research in 3D model retrieval concentrates on content-based retrieval [2], especially shape-based retrieval. The major problem of shape-based retrieval is extracting a model's shape feature, which should satisfy several good properties, such as rotation invariance, the ability to represent various kinds of shapes, and describing similar shapes with similar features.

Although many methods for extracting shape features have been proposed [3], research shows that none is the best for all kinds of shapes [4, 5, 6, 7]. An effective way to solve this problem is to combine different shape features [5, 6, 7]. The critical step of the combination is determining the weights of the shape features. For instance, ref. [5] determines the fixed weight based on the user's experience, which in turn rests on numerous experiments; it also decides a dynamic weight based on the retrieval result and the categorization of 3D models.

However, these methods have shortcomings: they need user experience to decide the appropriate fixed weight and cannot assign a weight to a new feature; and computing the dynamic weight is too time consuming, while its performance is only a little better than the fix-weighted way.


Moreover, it is still an open problem to categorize 3D models. Nowadays, the categorization of 3D models depends on manual work, as in the Princeton Shape Benchmark (PSB) [4]. Even if the drawback of being time consuming is set aside, the manual way also leads to the following mistakes: first, models with similar shapes are classified into different classes, like the One Story Home class and the Barn class of PSB; second, models with apparently different shapes are classified into the same class, like the Potted Plant class and the Stair class. Table 1 shows the details. This is because humans categorize 3D models according to their semantics in real life, instead of their shape.
Table 1. Mistakes of manual categorization of PSB (model images of the One Story Home, Barn, Potted Plant and Stair classes)

To solve these problems, the paper conducts research in two aspects: first, we analyze the influence of the value of the weight on the combination performance and propose a method that automatically decides the value of the fixed weight; second, we introduce an efficient way to categorize 3D models based on the clustering result.

The rest of the paper is organized as follows: Section 2 introduces the automatic combination method; Section 3 states the categorization based on the clustering result; Section 4 gives the experimental results on PSB; and Section 5 summarizes the paper.

2 An Automatic Decision Method of Features' Fixed Weights

When combining different shape features with fixed weights, the distance d_com between model q and model o is computed as follows:

    d_com(q, o) = \sum_{i=1}^{l} w_i \frac{d_i(q, o)}{\max(d_i(q))}                    (1)

where l is the number of different shape features, w_i is the fixed weight of the i-th shape feature, d_i(q, o) is the distance between q and o under the i-th shape feature vector, and max(d_i(q)) is the maximum distance between q and the other models under the i-th feature. Previous research shows that the Manhattan distance performs better than the Euclidean distance, so the paper adopts the Manhattan distance in computing d_i(q, o).
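As an illustration of Formula (1), the following Python sketch computes d_com for one query model against a feature database; the function and variable names (combined_distance, query_feats, db_feats) are ours, not part of the original method description, and the Manhattan metric and per-feature normalisation follow the definitions above.

    import numpy as np

    def combined_distance(query_feats, db_feats, weights):
        """Fixed-weight feature combination of Formula (1).

        query_feats : list of l vectors, one per shape feature, for the query q.
        db_feats    : list of l matrices (n_models x dim_i), one per shape feature.
        weights     : list of l fixed weights w_i.
        Returns an array holding d_com(q, o) for every model o in the database.
        """
        d_com = np.zeros(db_feats[0].shape[0])
        for w_i, q_i, F_i in zip(weights, query_feats, db_feats):
            d_i = np.abs(F_i - q_i).sum(axis=1)   # Manhattan distance under feature i
            d_com += w_i * d_i / d_i.max()        # normalise by the maximum distance of q
        return d_com

Models are then ranked by increasing d_com.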


In this paper, four kinds of feature extraction methods are adopted and 5 sets of feature vectors are obtained from PSB. The details are as follows: (1) the shape feature extraction method based on the depth buffer [11], termed DBD, which obtains a feature vector with 438 dimensions; (2) the method based on EDT [12], termed NEDT, which obtains a vector with 544 dimensions; (3) the method based on spherical harmonics [13], which obtains two sets of vectors with 32 and 136 dimensions, termed RSH-32 and RSH-136 respectively; (4) the method performing the spherical harmonic transformation on the voxelized models, termed SHVF, which obtains a feature vector with 256 dimensions.
We conduct experiments on PSB to analyze the influence of the value of w_i on the combination performance. Table 2 evaluates the performance of combining any two out of the 5 different features of PSB; the criterion R-precision is adopted [8]. When the weight of each feature is equal, good cases, like combining DBD and NEDT, co-exist with bad cases, like combining DBD and RSH-32. But if the fixed weights (4:4:2:1:1) decided according to our experience are adopted, the performance is much better.
Table 2. Combination performance comparison under different fixed weights (R-precision; left block: equal weights, right block: experience weights 4:4:2:1:1)

    w_i       | DBD    NEDT   RSH-136  RSH-32  SHVF  | DBD    NEDT   RSH-136  RSH-32  SHVF
    +DBD      | 0.354  --     --       --      --    | 0.354  --     --       --      --
    +NEDT     | 0.390  0.346  --       --      --    | 0.390  0.346  --       --      --
    +RSH-136  | 0.364  0.376  0.292    --      --    | 0.372  0.378  0.292    --      --
    +RSH-32   | 0.283  0.286  0.258    0.168   --    | 0.351  0.343  0.279    0.168   --
    +SHVF     | 0.308  0.308  0.286    0.204   0.201 | 0.360  0.350  0.299    0.204   0.201

This experiment shows that an appropriate value of w_i can greatly improve the combination performance. Although a w_i decided by experience performs well, it has limitations: it is time consuming to obtain and hard to generalize.

Thus, it is necessary to decide w_i automatically. To accomplish this task, we suppose that if a feature is the best for most models, its weight should be the highest; if one feature is the best for a model, its weight is increased by 1/N, where N is the total number of models. For a set of classified 3D models, we follow the winner-take-all rule: if the i-th feature is the best for the j-th class C_j of models, w_i is increased by n_j/N, where n_j is the size of C_j.
Finally, the automatic decision formula of the fixed weight w_i is stated as follows:

    w_i = \sum_{j=1}^{nClass} f_i(C_j) \frac{n_j}{N}                    (2)

where nClass is the number of classes of models, and f_i(C_j) = 1 iff the R-precision of the i-th feature is the highest for C_j, otherwise f_i(C_j) = 0. Note that \sum_{i=1}^{l} w_i = 1.

Obviously, the proposed method can automatically decide the fixed weight for a new shape feature by re-computing Formula (2). During this process, the weights of the existing features are also adjusted.
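A minimal sketch of Formula (2), assuming we already know, for every class, which feature achieves the highest R-precision; the function name and inputs are hypothetical.

    def automatic_fixed_weights(best_feature_per_class, class_sizes, n_features):
        """Automatic fixed weights of Formula (2).

        best_feature_per_class[j] : index i of the feature whose R-precision is
                                    the highest for class C_j (winner-take-all).
        class_sizes[j]            : n_j, the number of models in class C_j.
        Returns [w_1, ..., w_l]; the weights sum to 1 when the classes cover all N models.
        """
        N = float(sum(class_sizes))
        w = [0.0] * n_features
        for j, i in enumerate(best_feature_per_class):
            w[i] += class_sizes[j] / N            # add n_j / N to the winning feature
        return w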

3 Efficient Categorization of 3D Models Based on Clustering Result
As an unsupervised technique, cluster analysis is a promising candidate for categorizing 3D models. It is good at discovering how the feature vectors concentrate without prior knowledge. Since models with similar features are grouped together and their features reflect their shape, the clustering result can be considered a categorization of 3D models based on their shape. Ref. [10] performs research in this field; however, it relies on just one kind of shape feature, so the clustering result is highly sensitive to the performance of that shape feature.

In contrast, this paper adopts the proposed fix-weighted feature combination method and achieves a much better and more stable shape descriptor of a 3D model. The distance among models is computed according to Formulas (1) and (2).
The X-means algorithm is selected to analyze the shape feature set of the 3D models. X-means is an important improvement of the well-known K-means method. To overcome the high dependency of K-means on the pre-decided number k of clusters, X-means requires, but does not restrict itself to, the parameter k. Its basic idea is that, in each iteration of clustering, it splits the center of a selected cluster into two children and decides whether a child survives. During clustering, the formula BIC(c|x) = L(x|c) - k(m+1)/2 * log n is used to decide the appropriate moment to stop clustering, where L(x|c) is the log-likelihood of the dataset x according to the model c, and m is the dimensionality of the data. In this way, the appropriate number of clusters can be decided automatically.
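The splitting decision can be sketched as follows; this is our own minimal version of the BIC score quoted above, assuming a shared spherical-Gaussian model for the clusters (the usual X-means formulation), not the authors' exact implementation.

    import numpy as np

    def bic_score(X, labels, centers):
        """BIC(c|x) = L(x|c) - k(m+1)/2 * log(n) for a clustering of X.

        A shared spherical variance is used for the log-likelihood L(x|c);
        all clusters are assumed non-empty.
        """
        n, m = X.shape
        k = len(centers)
        sq_err = sum(((X[labels == j] - centers[j]) ** 2).sum() for j in range(k))
        var = sq_err / (n * m)                        # pooled spherical variance (MLE)
        log_l = -0.5 * n * m * (np.log(2 * np.pi * var) + 1.0)
        for j in range(k):
            n_j = (labels == j).sum()
            log_l += n_j * np.log(n_j / n)            # mixture-weight term
        return log_l - 0.5 * k * (m + 1) * np.log(n)

X-means keeps the split of a cluster into two children only when it increases this score.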
Although X-means is efficient in classifying models according to their shape, mistakes still exist in the clustering result for two reasons:

(1) Due to the complexity and diversity of models' shapes, it is very difficult to describe all shapes. The combination of different shape features can partially solve this problem, but it still has its limits.
(2) X-means may make clustering mistakes. The clustering process ensures that most data, but not every item, is clustered into the right group.

Thus, we introduce human correction to fix the mistakes that lie in the clustering result. To avoid mistakes caused by manual intervention, like those in Table 1, we restrict the user to deleting some models from a cluster or deleting a whole cluster. The pruned models are considered wrongly categorized and are labeled as unclassified.

Finally, the refined clustering result is treated as the categorization of the 3D models.


In comparison with pure manual work, the categorization based on the clustering result is much more efficient and objective. The clustering technique not only shows the number of classes according to the models' shape, but also states the members of each class.

4 Experiment and Analysis


We conduct a series of experiments on the standard 3D model database, the Princeton Shape Benchmark, which contains 1,814 models. The 5 sets of shape feature vectors introduced in Section 2 are used for combination.

First, we analyze the performance of the automatic fix-weighted combination. According to Formula (2), the automatic fixed weights for the 5 features are DBD = 0.288, NEDT = 0.373, RSH-136 = 0.208, RSH-32 = 0.044 and SHVF = 0.087. Table 3 states the R-Precision and the improvement after combining any two features for PSB. Compared with Table 2, the performance of the automatic fix-weighted combination is much better. The highest improvement is 24% = (0.208 - 0.168)/0.168, while the best combination improves by 9.6% = (0.388 - 0.354)/0.354.
Table 3. The performance of combining two features based on the automatic fixed weight (R-Precision/Improvement)

              DBD           NEDT          RSH-136       RSH-32       SHVF
    +DBD      0.354/--      --            --            --           --
    +NEDT     0.388/+9.6%   0.346/--      --            --           --
    +RSH-136  0.368/+4.8%   0.378/+9.3%   0.292/--      --           --
    +RSH-32   0.360/+1.7%   0.353/+2.0%   0.298/+2.1%   0.168/--     --
    +SHVF     0.356/+0.6%   0.350/+1.2%   0.302/+3.4%   0.208/+24%   0.201/--

Figure 1 shows the Precision-Recall curves along with the R-Precision of the 5 features, the combination of the 5 features based on equal fixed weights (Equal Weight), the combination using the fixed weights (4:4:2:1:1) decided by experience (Experience Weight), and the combination adopting the proposed automatic fixed weights (Automatic Weight). It can be seen that the proposed method is the best under all criteria. It achieves the best R-Precision, 0.4046, which is much better than that of Equal Weight, 0.3486, and also slightly better than Experience Weight, 0.4021. Its performance is improved by 14.5% over the best single feature, DBD.

After combining the 5 features based on the proposed method, we adopt X-means to analyze PSB, and 130 clusters are finally obtained. Scanning these clusters, we found that most clusters are formed by models with similar shapes, like clusters C70, C110, C112 and C113 in Table 4. However, mistakes also exist, such as C43 in Table 4. After analyzing the combined features of the wrongly grouped models, we find that the mistakes are mainly caused by the shape feature, not by the clustering.


Fig. 1. Performance comparison adopting Precision-Recall and R-Precision

Table 4. Detail of some result clusters of PSB (model images of clusters C70, C110, C112, C113 and C43)

Then, we select three students who had never worked with these models to refine the clustering result. At least two of them must reach an agreement for each deletion. In less than two hours, including the time spent on arguments, they labeled 202 models out of 1,814 as unclassified, viz. 11.13%, and 6 clusters out of 130 were pruned, viz. 4.6%. Obviously, the clustering result is a valuable reference for categorizing 3D models. Even if the refinement time is included, the categorization based on the clustering result is much faster than pure manual work, which usually costs days and is exhausting.

5 Conclusions

The paper proposes a combination method that automatically decides the fixed weights of different shape features. Based on the combined feature, the paper categorizes 3D models according to their shape. The experimental results show that the proposed method performs well not only in improving retrieval performance but also in categorization. Future work will concentrate on the study of clustering ensembles to achieve a more stable clustering result for 3D models.

Acknowledgements
This work is sponsored by Foundation for the Doctoral Program of the Chinese Ministry of Education under Grant No.20060183041 and the Natural Science Research
Foundation of Harbin Engineering University under the grant number HEUFT05007.

References
[1] T. Funkhouser, et al.: A Search Engine for 3D Models. ACM Transactions on Graphics, 22(1), (2003) 85-105
[2] Yubin Yang, Hui Li, Qing Zhu: Content-Based 3D Model Retrieval: A Survey. Chinese Journal of Computers, Vol. 27, No. 10, (2004) 1298-1310
[3] Chenyang Cui, Jiaoying Shi: Analysis of Feature Extraction in 3D Model Retrieval. Journal of Computer-Aided Design & Computer Graphics, Vol. 16, No. 7, July (2004) 882-889
[4] Shilane, P., Min, P., Kazhdan, M., Funkhouser, T.: The Princeton Shape Benchmark. In: Proceedings of Shape Modeling International 2004 (SMI'04), Genova, Italy, June 2004, 388-399
[5] Feature Combination and Relevance Feedback for 3D Model Retrieval. The 11th International Conference on Multi Media Modeling (MMM 2005), 12-14 January 2005, Melbourne, Australia. IEEE Computer Society (2005) 334-339
[6] Ryutarou Ohbuchi, Yushin Hata: Combining Multiresolution Shape Descriptors for Effective 3D Similarity Search. Proc. WSCG 2006, Plzen, Czech Republic, (2006)
[7] Atmosukarto, I., Wee Kheng Leow, Zhiyong Huang: Feature Combination and Relevance Feedback for 3D Model Retrieval. Proceedings of the 11th International Multimedia Modelling Conference, (2005)
[8] R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval. Addison-Wesley, (1999)
[9] Dan Pelleg, Andrew Moore: X-means: Extending K-means with Efficient Estimation of the Number of Clusters. In: Proc. 2000 Int. Conf. on Data Mining, (2000)
[10] Tianyang Lv, et al.: An Auto-Stopped Hierarchical Clustering Algorithm for Analyzing 3D Model Database. The 9th European Conference on Principles and Practice of Knowledge Discovery in Databases. In: Lecture Notes in Artificial Intelligence, Vol. 3801, 601-608
[11] M. Heczko, D. Keim, D. Saupe, D. Vranic: Methods for similarity search on 3D databases. Datenbank-Spektrum, 2(2):54-63, (2002). In German
[12] H. Blum: A transformation for extracting new descriptors of shape. In: W. Wathen-Dunn (ed.), Proc. Models for the Perception of Speech and Visual Form, Cambridge, MA, November 1967. MIT Press, 362-380
[13] Kazhdan, M., Funkhouser, T.: Harmonic 3D shape matching. In: Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH Technical Sketch, San Antonio, Texas, (2002)

A Generalised-Mutual-Information-Based Oracle
for Hierarchical Radiosity
Jaume Rigau, Miquel Feixas, and Mateu Sbert
Institut d'Informàtica i Aplicacions
Campus Montilivi P-IV, 17071-Girona, Spain
jaume.rigau|miquel.feixas|mateu@ima.udg.edu

Abstract. One of the main problems in the radiosity method is how to discretise a scene into mesh elements that allow us to accurately represent illumination. In this paper we present a new refinement criterion for hierarchical radiosity based on the continuous and discrete generalised mutual information measures between two patches or elements. These measures, derived from the generalised entropy of Havrda-Charvát-Tsallis, express the information transfer within a scene. The results obtained improve on the ones based on kernel smoothness and Shannon mutual information.

1 Introduction

The radiosity method solves the problem of illumination in an environment with diffuse surfaces by using a finite element approach [1]. The scene discretisation has to represent the illumination accurately while trying to avoid unnecessary subdivisions that would increase the computation time. A good meshing strategy will balance the requirements of accuracy and computational cost.

In the hierarchical radiosity algorithms [2] the mesh is generated adaptively: when the constant radiosity assumption on a patch is not valid for the radiosity received from another patch, the refinement algorithm will subdivide it into a set of subpatches or elements. A refinement criterion, called an oracle, informs us whether a subdivision of the surfaces is needed, bearing in mind that the cost of the oracle should remain acceptable. In [3,4], the difficulty in obtaining a precise solution for the scene radiosity has been related to the degree of dependence between all the elements of the adaptive mesh. This dependence has been quantified by the mutual information, which is a measure of the information transfer in a scene.

In this paper, a new oracle based on the generalised mutual information [5], derived from the generalised entropy of Havrda-Charvát-Tsallis [6], is introduced. This oracle is obtained from the difference between the continuous and discrete generalised mutual information between two elements of the adaptive mesh and expresses the loss of information transfer between two patches due to the discretisation. The results obtained show that this oracle improves on the kernel smoothness-based [7] and the mutual information-based [8,9] ones, confirming the usefulness of the information-theoretic approach in dealing with the radiosity problem.


2 Preliminaries

2.1 Radiosity

The radiosity method uses a finite element approach, discretising the diffuse environment into patches and considering the radiosities, emissivities and reflectances constant over them. With these assumptions, the discrete radiosity equation [1] is given by

    B_i = E_i + \rho_i \sum_{j \in S} F_{ij} B_j,                    (1)

where S is the set of patches of the scene, B_i, E_i, and ρ_i are respectively the radiosity, emissivity, and reflectance of patch i, B_j is the radiosity of patch j, and F_ij is the patch-to-patch form factor, defined by

    F_{ij} = \frac{1}{A_i} \int_{S_i} \int_{S_j} F(x, y) \, dA_y \, dA_x,                    (2)

where A_i is the area of patch i, S_i and S_j are, respectively, the surfaces of patches i and j, F(x, y) is the point-to-point form factor between x ∈ S_i and y ∈ S_j, and dA_x and dA_y are, respectively, the differential areas at points x and y. Using Monte Carlo computation with area-to-area sampling, F_ij can be calculated:

    F_{ij} \approx A_j \frac{1}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} F(x, y),                    (3)

where the computation accuracy depends on the number of random segments between i and j (|S_ij|).
To solve the system (1), a hierarchical refinement algorithm is used. The efficiency of this algorithm depends on the choice of a good refinement criterion. Many refinement oracles have been proposed in the literature (see [10] for details). For comparison purposes, we review here the oracle based on kernel smoothness (KS), proposed by Gortler et al. [7] in order to drive hierarchical refinement with higher-order approximations. When applied to constant approximations, this refinement criterion is given by

    \rho_i \max\{F_{ij}^{max} - F_{ij}^{avg},\; F_{ij}^{avg} - F_{ij}^{min}\}\, A_j B_j < \epsilon,                    (4)

where F_ij^max = max{F(x, y) | x ∈ S_i, y ∈ S_j} and F_ij^min = min{F(x, y) | x ∈ S_i, y ∈ S_j} are, respectively, the maximum and minimum radiosity kernel values estimated by taking the maximum and minimum value computed between pairs of random points on both elements, and F_ij^avg = F_ij/A_j is the average radiosity kernel value.
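As a concrete reading of criterion (4), the following Python sketch evaluates the KS test for a pair of elements from random point pairs; point_form_factor stands for F(x, y) and, like the other names, is our own placeholder.

    def ks_refine(rho_i, A_j, B_j, point_pairs, point_form_factor, eps):
        """Kernel-smoothness refinement test (4) estimated from random point pairs.

        Returns True when the criterion is violated, i.e. the pair (i, j)
        should be subdivided.
        """
        values = [point_form_factor(x, y) for (x, y) in point_pairs]
        f_avg = sum(values) / len(values)             # estimate of F_ij / A_j
        smoothness = max(max(values) - f_avg, f_avg - min(values))
        return rho_i * smoothness * A_j * B_j >= eps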
2.2 HCT Entropy and Generalised Mutual Information

In 1967, Havrda and Charvát [6] introduced a new generalised definition of entropy. In 1988, Tsallis [11] used this entropy in order to generalise the Boltzmann-Gibbs entropy in statistical mechanics.


Definition 1. The Havrda-Charvát-Tsallis entropy (HCT entropy) of a discrete random variable X, with |X| = n and p_X as its probability distribution, is defined by

    H_\alpha(X) = k \, \frac{1 - \sum_{i=1}^{n} p_i^\alpha}{\alpha - 1},                    (5)

where k is a positive constant (by default k = 1) and α ∈ R \ {1} is called the entropic index.

This entropy recovers the Shannon discrete entropy when α → 1, H_1(X) = -\sum_{i=1}^{n} p_i \ln p_i, and fulfils good properties such as non-negativity and concavity.
On the other hand, Taneja [5] and Tsallis [12] introduced the generalised mutual information.

Definition 2. The generalised mutual information between two discrete random variables (X, Y) is defined by

    I_\alpha(X, Y) = \frac{1}{1 - \alpha} \left( 1 - \sum_{i=1}^{n} \sum_{j=1}^{m} \frac{p_{ij}^\alpha}{p_i^{\alpha - 1} q_j^{\alpha - 1}} \right),                    (6)

where |X| = n, |Y| = m, p_X and q_Y are the marginal probability distributions, and p_XY is the joint probability distribution between X and Y.

The transition of I_α(X, Y) to the continuous generalised mutual information is straightforward. Using entropies, an alternative form is given by

    I_\alpha(X, Y) = H_\alpha(X) + H_\alpha(Y) - (1 - \alpha) H_\alpha(X) H_\alpha(Y) - H_\alpha(X, Y).                    (7)

Shannon mutual information (MI) is obtained when α → 1. Some alternative forms of the generalised mutual information can be found in [13].
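For reference, the two discrete measures (5) and (6) can be evaluated directly from a joint probability table; this sketch assumes strictly positive probabilities and uses our own function names.

    import numpy as np

    def hct_entropy(p, alpha, k=1.0):
        """Havrda-Charvat-Tsallis entropy (5) of a discrete distribution p."""
        return k * (1.0 - np.sum(p ** alpha)) / (alpha - 1.0)

    def generalised_mi(p_xy, alpha):
        """Generalised mutual information (6) from the joint distribution p_xy."""
        p_x = p_xy.sum(axis=1)                        # marginal of X
        q_y = p_xy.sum(axis=0)                        # marginal of Y
        ratio = p_xy ** alpha / np.outer(p_x, q_y) ** (alpha - 1.0)
        return (1.0 - ratio.sum()) / (1.0 - alpha)

In the limit alpha -> 1 both expressions tend to the corresponding Shannon quantities.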

3 Generalised Mutual Information-Based Oracle

We will see below how the generalised mutual information can be used to build a refinement oracle within a hierarchical radiosity algorithm. Our strategy is based on estimating the discretisation error from the difference between the continuous and discrete generalised mutual information (6) between two elements of the adaptive mesh. The discretisation error based on Shannon mutual information was introduced by Feixas et al. [8] and applied to hierarchical radiosity with good results.

In the context of a discrete scene information channel [4], the marginal probabilities are given by p_X = q_Y = {a_i} (i.e., the distribution of the relative areas of the patches, a_i = A_i/A_T, where A_T is the total area of the scene) and the joint probability is given by p_XY = {a_i F_ij}. Then,
Definition 3. The discrete generalised mutual information of a scene is given by

    I_\alpha = \frac{1}{1 - \alpha} \left( 1 - \sum_{i \in S} \sum_{j \in S} \frac{(a_i F_{ij})^\alpha}{(a_i a_j)^{\alpha - 1}} \right) = \sum_{i \in S} \sum_{j \in S} \delta_\alpha(a_i F_{ij}, a_i a_j),                    (8)

where the last equality is obtained using 1 = \sum_{i \in S} \sum_{j \in S} a_i a_j and \delta_\alpha(p, q) = \frac{q - p^\alpha q^{1 - \alpha}}{1 - \alpha}.

This measure quantifies the discrete information transfer in a discretised scene. The term δ_α(a_i F_ij, a_i a_j) can be considered as an element of the generalised mutual information matrix I_α, representing the information transfer between patches i and j.
To compute I_α, the Monte Carlo area-to-area sampling (3) is used, obtaining for each pair of elements

    \hat{I}_{\alpha\, ij} = \delta_\alpha(a_i F_{ij}, a_i a_j) \approx \frac{1}{1 - \alpha} \frac{A_i A_j}{A_T^2} \left( 1 - \left( \frac{A_T}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} F(x, y) \right)^{\alpha} \right).                    (9)

The information transfer between two patches can be obtained more accurately using the continuous generalised mutual information between them. From the discrete form (8) and using the pdfs p(x) = q(y) = 1/A_T and p(x, y) = \frac{1}{A_T} F(x, y), we define

Definition 4. The continuous generalised mutual information of a scene is given by

    I_\alpha^c = \int_S \int_S \delta_\alpha\!\left( \frac{1}{A_T} F(x, y), \frac{1}{A_T^2} \right) dA_y \, dA_x.                    (10)

This represents the continuous information transfer in a scene. We can split the integration domain, and for two surface elements i and j we have

    I_{\alpha\, ij}^c = \int_{S_i} \int_{S_j} \delta_\alpha\!\left( \frac{1}{A_T} F(x, y), \frac{1}{A_T^2} \right) dA_y \, dA_x,                    (11)

which, analogously to the discrete case, expresses the information transfer between two patches.
Both continuous expressions, (10) and (11), can be solved by Monte Carlo integration. Taking again area-to-area sampling (i.e., the pdf \frac{1}{A_i A_j}), the last expression (11) can be approximated by

    \hat{I}_{\alpha\, ij}^c \approx A_i A_j \frac{1}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} \delta_\alpha\!\left( \frac{1}{A_T} F(x, y), \frac{1}{A_T^2} \right) = \frac{1}{1 - \alpha} \frac{A_i A_j}{A_T^2} \left( 1 - \frac{1}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} \left( A_T F(x, y) \right)^{\alpha} \right).                    (12)

Now, we define

Definition 5. The generalised discretisation error of a scene is given by

    \Delta_\alpha = I_\alpha^c - I_\alpha = \sum_{i \in S} \sum_{j \in S} \Delta_{\alpha\, ij},                    (13)

where Δ_{α ij} = I^c_{α ij} - I_{α ij}.

While Δ_α expresses the loss of information transfer in a scene due to the discretisation, the term Δ_{α ij} gives us this loss between two elements i and j. This difference is interpreted as the benefit to be gained by refining and can be used as the basis of the new oracle.

From (13), using (9) and (12), we obtain

    \Delta_{\alpha\, ij} \approx A_i A_j A_T^{\alpha - 2} \, \frac{1}{1 - \alpha} \, \delta_{ij},                    (14)

where

    \delta_{ij} = \left( \frac{1}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} F(x, y) \right)^{\alpha} - \frac{1}{|S_{ij}|} \sum_{(x,y) \in S_{ij}} F(x, y)^{\alpha}.                    (15)

According to the radiosity equation (1), and in analogy to classic oracles like KS, we consider the oracle structure ρ_i κ B_j < ε, where κ is the geometric kernel [14]. We propose to take the generalised discretisation error between two patches as the kernel (κ = Δ_{α ij}) for the new oracle based on generalised mutual information (GMI_α). To simplify the expression of this oracle, we multiply the inequality by the scene constant A_T^{2-α}(1 - α).

Definition 6. The hierarchical radiosity oracle based on the generalised mutual information is defined by

    \rho_i A_i A_j \delta_{ij} B_j < \epsilon.                    (16)
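A minimal sketch of how the test (16) can be evaluated with the same Monte Carlo samples used for the form factor; it is restricted to the subextensive case α < 1 highlighted in the results, and all names are ours.

    def gmi_refine(rho_i, A_i, A_j, B_j, form_factor_samples, alpha, eps):
        """Generalised-mutual-information refinement test (16), for alpha < 1.

        form_factor_samples : F(x, y) values for the random point pairs in S_ij.
        delta_ij follows (15): (mean F)^alpha - mean(F^alpha).
        Returns True when the pair (i, j) should be subdivided.
        """
        n = len(form_factor_samples)
        mean_f = sum(form_factor_samples) / n
        mean_f_alpha = sum(f ** alpha for f in form_factor_samples) / n
        delta_ij = mean_f ** alpha - mean_f_alpha     # non-negative for alpha < 1
        return rho_i * A_i * A_j * delta_ij * B_j >= eps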

4 Results

In this section, the GMI_α oracle is compared with the KS and MI ones. Other comparisons, with a more extended analysis, can be found in [14]. All oracles have been implemented on top of the hierarchical Monte Carlo radiosity method.

In Fig. 1 we show the results obtained for the KS (a) and GMI_α oracles with their Gouraud shaded solutions and meshes. In the GMI_α case, we show the results obtained with the entropic indexes 1 (b) (note that GMI_1 = MI) and 0.5 (c). For the sake of comparison, adaptive meshes of identical size have been generated with the same cost for the power distribution: around 19,000 patches and 2,684,000 rays, respectively. To estimate the form factor, the number of random lines has been fixed to 10.


Fig. 1. (a) KS and GMI_α (entropic indexes (b) 1 and (c) 0.5) oracles. By columns, (i) Gouraud shaded solution of view1 and (ii) mesh of view2 are shown.

In Table 1, we show the Root Mean Square Error (RMSE) and Peak Signal-to-Noise Ratio (PSNR) measures for the KS and GMI_α (for 5 different entropic indexes) oracles for the test scene. These measures have been computed with respect to the corresponding converged image, obtained with a path-tracing algorithm with 1,024 samples per pixel in a stratified way. For each measure, we consider a uniform weight for every colour channel (RMSE_a and PSNR_a) and a perceptual one (RMSE_p and PSNR_p) in accordance with the sRGB system.

Observe in view1, obtained with GMI_α (Fig. 1.i.b-c), the finer details of the shadow cast on the wall by the chair on the right-hand side, and also the better-defined shadow on the chair on the left-hand side and the one cast by the desk. In view2 (Fig. 1.ii) we can also see how our oracle outperforms KS, especially in the much better defined shadow of the chair on the right. Note the superior quality of the mesh created by our oracle.


Table 1. The RMSE and PSNR measures of the KS and GMI_α oracles applied to the test scene of Fig. 1, where the KS and GMI_{0.5,1} results are shown. The oracles have been evaluated with 10 random lines between each two elements.

                         view1                              view2
    oracle     RMSEa   RMSEp   PSNRa   PSNRp     RMSEa   RMSEp   PSNRa   PSNRp
    KS         13.791  13.128  25.339  25.767    15.167  14.354  24.513  24.991
    GMI1.50    11.889  11.280  26.628  27.084    13.046  12.473  25.821  26.211
    GMI1.25    10.872  10.173  27.405  27.982    11.903  11.279  26.618  27.086
    GMI1.00     9.998   9.232  28.133  28.825    10.438   9.709  27.758  28.387
    GMI0.75     9.555   8.786  28.526  29.254    10.010   9.257  28.122  28.801
    GMI0.50     9.370   8.568  28.696  29.473     9.548   8.740  28.533  29.300

Fig. 2. GMI_{0.50} oracle: (i) Gouraud shaded solution and (ii) mesh are shown

Table 2. The RMSE and PSNR measures of the GMI_α oracle applied to the scene of Fig. 2, where the GMI_{0.5} result is shown. The oracle has been evaluated with 10 random lines between each two elements.

    oracle     RMSEa   RMSEp   PSNRa   PSNRp
    GMI1.50    16.529  15.530  23.766  24.307
    GMI1.25    15.199  14.145  24.494  25.119
    GMI1.00    14.958  13.844  24.633  25.306
    GMI0.75    14.802  13.683  24.724  25.407
    GMI0.50    14.679  13.573  24.797  25.477

In general, the improvement obtained with the GMI_α oracle is significant. Moreover, its behaviour shows a tendency to improve towards subextensive entropic indexes (α < 1). To observe this tendency, another test scene is shown in Fig. 2 for an entropic index of 0.5. Its corresponding RMSE and PSNR measures are presented in Table 2. The meshes are made up of 10,000 patches with 9,268,000 rays to distribute the power, and we have kept 10 random lines to evaluate the oracle between elements.


5 Conclusions

We have presented a new generalised-mutual-information-based oracle for hierarchical radiosity, calculated from the difference between the continuous and discrete generalised mutual information between two elements of the adaptive mesh. This measure expresses the loss of information transfer between two patches due to the discretisation. The objective of the new oracle is to reduce this loss of information, obtaining an optimum mesh. The results achieved improve on the classic methods significantly, being better even than the version based on the Shannon mutual information. In all the tests performed, the best behaviour is obtained with subextensive indexes.

Acknowledgments. This report has been funded in part with grant numbers IST-2-004363 of the European Community - Commission of the European Communities, and TIN2004-07451-C03-01 and HH2004-001 of the Ministry of Education and Science (Spanish Government).

References
1. Goral, C.M., Torrance, K.E., Greenberg, D.P., Battaile, B.: Modelling the interaction of light between diffuse surfaces. Computer Graphics (Proceedings of SIGGRAPH 84) 18(3) (July 1984) 213-222
2. Hanrahan, P., Salzman, D., Aupperle, L.: A rapid hierarchical radiosity algorithm. Computer Graphics (Proceedings of SIGGRAPH 91) 25(4) (July 1991) 197-206
3. Feixas, M., del Acebo, E., Bekaert, P., Sbert, M.: An information theory framework for the analysis of scene complexity. Computer Graphics Forum (Proceedings of Eurographics 99) 18(3) (September 1999) 95-106
4. Feixas, M.: An Information-Theory Framework for the Study of the Complexity of Visibility and Radiosity in a Scene. PhD thesis, Universitat Politècnica de Catalunya, Barcelona, Spain (December 2002)
5. Taneja, I.J.: Bivariate measures of type α and their applications. Tamkang Journal of Mathematics 19(3) (1988) 63-74
6. Havrda, J., Charvát, F.: Quantification method of classification processes. Concept of structural α-entropy. Kybernetika (1967) 30-35
7. Gortler, S.J., Schröder, P., Cohen, M.F., Hanrahan, P.: Wavelet radiosity. In Kajiya, J.T., ed.: Computer Graphics (Proceedings of SIGGRAPH 93). Volume 27 of Annual Conference Series. (August 1993) 221-230
8. Feixas, M., Rigau, J., Bekaert, P., Sbert, M.: Information-theoretic oracle based on kernel smoothness for hierarchical radiosity. In: Short Presentations (Eurographics 02). (September 2002) 325-333
9. Rigau, J., Feixas, M., Sbert, M.: Information-theory-based oracles for hierarchical radiosity. In Kumar, V., Gavrilova, M.L., Tan, C., L'Ecuyer, P., eds.: Computational Science and Its Applications - ICCSA 2003. Number 2669-3 in Lecture Notes in Computer Science. Springer-Verlag (May 2003) 275-284
10. Bekaert, P.: Hierarchical and Stochastic Algorithms for Radiosity. PhD thesis, Katholieke Universiteit Leuven, Leuven, Belgium (December 1999)
11. Tsallis, C.: Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics 52(1/2) (1988) 479-487
12. Tsallis, C.: Generalized entropy-based criterion for consistent testing. Physical Review E 58 (1998) 1442-1445
13. Taneja, I.J.: On generalized information measures and their applications. In: Advances in Electronics and Electron Physics. Volume 76. Academic Press Ltd. (1989) 327-413
14. Rigau, J.: Information-Theoretic Refinement Criteria for Image Synthesis. PhD thesis, Universitat Politècnica de Catalunya, Barcelona, Spain (November 2006)

Rendering Technique for Colored Paper Mosaic


Youngsup Park, Sanghyun Seo, YongJae Gi, Hanna Song,
and Kyunghyun Yoon
CG Lab., CS&E, ChungAng University,
221, HeokSuk-dong, DongJak-gu, Seoul, Korea
{cookie,shseo,yj1023,comely1004,khyoon}@cglab.cse.cau.ac.kr
http://cglab.cse.cau.ac.kr

Abstract. The work presented in this paper shows a way to generate colored paper mosaics using computer graphics techniques. Two tasks need to be done to generate a colored paper mosaic. The first one is to generate the colored paper tiles and the other is to arrange the tiles. A Voronoi Diagram and Random Point Displacement have been used in this paper to come up with the shape of the tiles, and the energy value that a tile has depending on its location is the factor used to determine its best positioning. This paper focuses on representing the overlap among tiles, the maintenance of the edges of the input image, and various shapes of tiles in the final output image by solving the two tasks mentioned above.

Keywords: Colored paper mosaic, Tile generation and Tile arrangement.

1 Introduction

A mosaic is an artwork formed by lots of small pieces called tiles. It can be expressed in many different ways depending on the type and the position of the tiles. Photomosaics[1] builds a big image from small square image tiles that are laid out on a grid pattern; a distinctive output is derived from the process of combining multiple images into one image. While Photomosaics arranges tiles in a grid pattern, Simulated Decorative Mosaic[2] arranges tiles in the direction of the edges of the input image. This shows a pattern similar to that found in the ancient Byzantine period. This pattern can also be found in Jigsaw Image Mosaics[3]; the only difference is the use of variously shaped image tiles instead of single-colored square tiles. In this paper, we show how colored paper mosaic, among the various styles of mosaic artworks, can be represented using computer graphics techniques.

To generate a colored paper mosaic, the following two issues need to be taken care of. The first issue is to decide on the shape of the colored paper tiles and the second one is to arrange the colored paper tiles. A Voronoi Diagram[9] and Random Fractals have been used in this paper to come up with the shape of the colored paper tiles. The problem with using the Voronoi Diagram alone is that it makes the form of the tiles too plain, since it generates only convex polygons. Therefore the method presented in this paper uses predefined data of colored paper as a database



like Photomosaics. It then creates a small piece of colored paper tile by repeatedly clipping a Voronoi polygon from the data of a colored paper. Many different shapes of tiles, including concave polygons, can be expressed since a tile is made by repetitive tearing of one colored paper. The energy value that a colored paper tile has depending on its location is calculated to find the best positioning of the tile: the location that has the biggest sum of energy values is defined as the best position. Tiles are placed at the point where the summation of energy values is the biggest, by being moved and rotated toward the nearest edge.
1.1 Related Work

Existing mosaic studies focus on the selection, the generation, and the arrangement of tiles. We compare the existing studies by classifying them into two groups.

The studies of the first group focus on the selection and the arrangement of tiles, since they use fixed or predefined shapes of tiles. Photomosaics[1] creates an image formed with various small image tiles. It is an algorithm that lays out images selected from a database in a grid pattern, and it proposes an effective method of tile selection from the database. But it is hard to keep the edges of the image since the tiles in Photomosaics are all square. In the study of Simulated Decorative Mosaic[2], Hausner reproduces the pattern and techniques used in the Byzantine era by positioning single-colored square tiles in the direction of the edges of the input image. It uses Centroidal Voronoi Diagrams (CVD) and Edge Avoidance to arrange tiles densely. Jigsaw Image Mosaics (JIM)[3] shows an extended technique that uses arbitrary shapes of image tiles, while Simulated Decorative Mosaic uses single-colored square tiles; it solves the tile arrangement with an Energy Minimization Framework.

The studies of the second group propose methods only for the generation of tiles. Park[5] proposes a passive colored paper mosaic generating technique in which the shape and arrangement of tiles are all decided by the user's input. The proposed method uses the Random Fractal technique for generating torn-shaped colored paper tiles. However, it gives the user too much work to do. To solve the problem the passive technique has, automatic colored paper mosaic[6] using the Voronoi Diagram was proposed. The majority of the work is done by the computer, and the only part the user needs to do is to input a few parameters. It reduces the heavy load of work on the user side; however, it cannot maintain the edges of the image since it arranges tiles without considering the edges. In order to solve this problem, another technique[7] was suggested, which arranges the Voronoi sites using a Quad-Tree and clips the tiles along the edges of the image once they go outside an edge. Even though this technique can keep the edges of images, it cannot express the real texture and shape of colored paper tearing, since the polygons created using the Voronoi Diagram are convex and the polygons do not overlap. Therefore, the existing studies do not show various shapes of tiles or the overlap among them.

2 Preprocessing

2.1 Data Structure of Colored Paper

The data structure of colored paper is organized in two layers that contain information such as the texture image and the vertices, as shown in Figure 1. The upper layer represents the visible part of the colored paper that has the color value, and the lower layer represents the white paper exposed on the torn portions. Defining the data structure of colored paper in advance gives two benefits. The first is that it can express various shapes of colored paper tiles, such as concave polygons in addition to convex ones, because the previously used colored paper is stored in the buffer and polygon clipping with the Voronoi Diagram is repeated as necessary. The other is that different types of paper mosaic can easily be produced by modifying the data structure: if images of magazines, newspapers and so on are used instead of colored paper, it is possible to come up with a paper mosaic like a collage. A minimal sketch of this two-layer structure is given after Figure 1.

Fig. 1. The data structure of colored paper object
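A minimal sketch of the two-layer colored paper object in Python; the class and field names are ours and only mirror the description above.

    from dataclasses import dataclass
    from typing import List, Tuple

    Vertex = Tuple[float, float]

    @dataclass
    class PaperLayer:
        """One layer of a colored paper object: its outline and its texture image."""
        vertices: List[Vertex]        # polygon outline of the layer
        texture: str                  # texture image used when rendering this layer

    @dataclass
    class ColoredPaper:
        """Two-layer colored paper: the visible colored layer on top of the
        white layer that is exposed along torn edges."""
        upper: PaperLayer             # colored, visible part
        lower: PaperLayer             # white paper shown on torn portions
        color: Tuple[int, int, int]   # RGB value of the colored side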

2.2 Image Segmentation

First, the necessary image processing operations[11], such as blurring, are performed on the input image, and the image is divided into several regions that have similar color in LUV space by using the Mean-Shift image segmentation technique[8]. We call such a region a container, and the proposed mosaic algorithm is performed per container. However, Mean-Shift segmentation can create small containers. If mosaic processing were performed at this stage, the colored paper tiles would not be attached to these small containers, resulting in lots of grout spaces in the result image, as shown in Figure 4. Therefore, another step is needed to integrate these small containers. To give flexibility and to allow individual expression, the process of integrating small containers is controlled by the user's input.

3 The Generation of Colored Paper Tile

3.1 Determination of Size and Color

To determine the size and the color of a tile, the initial position where the tile is attached is determined in advance by the Hill-Climbing algorithm[4]. The Hill-Climbing algorithm keeps changing the position until the function value converges to an optimal point. Since, in real life, big tiles are normally applied first, starting from the boundary, rather than small ones, the function is determined as in equation 1 with the following two factors: size and boundary. The size factor is defined by D(x, y), the minimum distance between pixel (x, y) and the boundary, and the boundary factor is defined by D(x, y) - D(i, j), the sum of differences with the neighboring pixels. The position that has the largest value of L(x, y) is regarded as the initial position:

    L(x, y) = \sum_{i=x-1}^{x+1} \sum_{j=y-1}^{y+1} \left( D(x, y) - D(i, j) \right) \, [x \neq i \,\&\, y \neq j]                    (1)
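The initial-position search can be sketched as follows; D is the distance map defined above, the bracket in equation (1) is read here as excluding the centre pixel, and the function names are ours.

    import numpy as np

    def L(D, x, y):
        """Objective of equation (1): sum of D(x, y) - D(i, j) over the neighbours."""
        total = 0.0
        for i in range(x - 1, x + 2):
            for j in range(y - 1, y + 2):
                if (i, j) != (x, y):
                    total += D[x, y] - D[i, j]
        return total

    def hill_climb(D, x, y):
        """Move to the best neighbouring pixel until L(x, y) stops improving."""
        while True:
            neighbours = [(i, j) for i in range(x - 1, x + 2)
                                 for j in range(y - 1, y + 2)
                                 if 0 < i < D.shape[0] - 1 and 0 < j < D.shape[1] - 1]
            best = max(neighbours, key=lambda p: L(D, *p))
            if L(D, *best) <= L(D, x, y):
                return x, y
            x, y = best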

The size of the colored paper tile is determined by the distance from the boundary. First, we divide the boundary pixels into two groups: the first group has smaller values than the y of the initial position, and the second group has larger values than y. Then, between the two minimum distance values of the groups, the smaller value is set as the minimum size and the larger value as the maximum size.

A colored paper whose color is similar to the one at the initial position is selected. Colored paper is defined as single-colored. First, a square area of the tile size is built around the initial position, and then the average RGB color value in that area is selected.
3.2 Determination of Shape

There are two steps to determine the shape of a colored paper tile. The first one is to determine the overall outline of the tile to be clipped and the other is to express the torn effect. The Voronoi Diagram is applied to decide the overall outline of the tile. First, the area of the colored paper is divided into several grids according to the size of the tile to be torn. Then, a Voronoi diagram is created by placing an individual Voronoi site in each segment, as shown in Figure 2(b). The generated Voronoi diagram contains multiple Voronoi polygons, so it needs to be decided which polygon among them to clip. Considering the fact that people start to tear from the boundary of the paper in real mosaic work, a polygon located near the boundary is torn first. Since there is always a vertex on the boundary of the colored paper, as shown in Figure 2(c), one of the polygons that contain such a vertex is randomly chosen. Once the outline of the tile is determined by the Voronoi polygon, it is necessary to apply the torn effect to the boundary of the determined outline. This torn effect is produced by applying Random Point Displacement, one of the Random Fractal techniques, to each layer of the colored paper individually. The Random Point Displacement algorithm is applied to the edges of the selected Voronoi polygon that do not overlap the boundary of the colored paper. The irregularity of the torn surface and the white-colored portion can be expressed by continuously perturbing a random point of each edge in the perpendicular direction. Lastly, the Voronoi polygon modified by the Random Point Displacement algorithm is clipped, as shown in Figure 2(d).
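The torn effect on one polygon edge can be sketched with the usual midpoint-displacement recursion; roughness, depth and the function name are our own parameters, not values from the paper.

    import random

    def displace_edge(p0, p1, roughness, depth):
        """Recursively perturb the midpoint of edge p0-p1 along its normal to
        imitate a torn paper boundary (Random Point Displacement)."""
        if depth == 0:
            return [p0, p1]
        (x0, y0), (x1, y1) = p0, p1
        dx, dy = x1 - x0, y1 - y0
        length = (dx * dx + dy * dy) ** 0.5
        nx, ny = -dy / length, dx / length          # unit normal of the edge
        offset = random.uniform(-roughness, roughness) * length
        mid = ((x0 + x1) / 2.0 + nx * offset, (y0 + y1) / 2.0 + ny * offset)
        left = displace_edge(p0, mid, roughness * 0.5, depth - 1)
        right = displace_edge(mid, p1, roughness * 0.5, depth - 1)
        return left[:-1] + right                    # avoid duplicating the midpoint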


Fig. 2. The process of paper tearing: (a) colored paper, (b) Voronoi diagram, (c) torn effect, (d) clipping

4 The Arrangement of Colored Paper Tile

There are two things to consider when arranging colored paper tiles. The first one is to maintain the edges of the input image and the other is to get rid of empty spaces among tiles or between a tile and the edge of the image. To maintain the edges of the input image, a technique similar to the Energy Minimization Framework of Jigsaw Image Mosaics is used in this paper. An energy function depending on the position of the tile is defined first, and its sum E(x, y) is calculated as in equation 2:

    E(x, y) = P_i - P_o - P_t,
    P_i = T_max/2 - D(x, y)      where (x, y) \in C and (x, y) \notin T
    P_o = W_o \cdot D(x, y)      where (x, y) \notin C
    P_t = W_t \cdot D(x, y)      where (x, y) \in T                    (2)

P_i, P_o and P_t in the expression above correspond to the pixels located inside the container C, outside the container, and on the area overlapped with other tiles T, respectively, and W_o and W_t are weight values depending on the location of the pixel. The bigger the sum E(x, y) is, the better the position is for maintaining the edges of the input image. Therefore, the tile needs to be placed where the sum of E(x, y) is the greatest. To get rid of empty spaces among tiles and between a tile and the edge of the image, the tile is moved and rotated in the direction of the nearest edge. This movement and rotation continues until the sum of E(x, y) from equation 2 converges or does not get any bigger.

Fig. 3. Positioning of colored paper tile: (a) the best case, (b) the worst case, (c) less overlapping, (d) edge keeping

Figure 3 shows four different situations of tile arrangement. Figure 3(b) shows the case where the tile is positioned outside the edge of the image; the tile's location needs to be corrected since this prevents the tile from keeping the edge of the image. Two tiles overlap too much in Figure 3(c), and this also needs to be modified. Figure 3(d) shows the optimal arrangement of the tile. We can control this by adjusting the values of W_o and W_t.
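For illustration, the per-position energy of equation (2) can be evaluated as below; the container test, the overlap test and the distance map D are assumed to be available, and every name here is our own.

    def placement_energy(tile_pixels, in_container, overlapped, D, T_max, W_o, W_t):
        """Energy E(x, y) of equation (2) for one candidate tile position.

        tile_pixels  : pixels (x, y) covered by the tile at this position.
        in_container : predicate telling whether a pixel lies inside the container C.
        overlapped   : predicate telling whether a pixel is already covered by a tile (T).
        """
        P_i = P_o = P_t = 0.0
        for (x, y) in tile_pixels:
            if overlapped(x, y):
                P_t += W_t * D[x, y]                # overlap with already placed tiles
            elif in_container(x, y):
                P_i += T_max / 2.0 - D[x, y]        # pixel inside the container
            else:
                P_o += W_o * D[x, y]                # pixel outside the container
        return P_i - P_o - P_t

The tile is then moved and rotated toward the nearest edge as long as this value keeps increasing.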

5 Results

Figures 4, 5 and 6 show the result images rendered by setting the tile size for the source image between 4 and 100. The result shown in Figure 4 is the colored paper mosaic obtained by applying only the segmentation algorithm to the source image. Grout spaces appear wherever a segmented region is smaller than size 4, since the minimum tile size is set to 4. These smaller containers have to be integrated into a nearby container in order to get rid of the grout spaces. The result of the colored paper mosaic including the container integration step is shown in Figure 5: the grout spaces visible in Figure 4(a) have disappeared. Also, lots of small segments are removed by the integration, so the number of very small tiles is reduced. We can additionally apply a texture effect to the result image by using texture mapping, a height map[10], and alpha blending, as shown in Figure 6. By adding these effects, the mosaic image becomes more realistic.

Fig. 4. The examples that have lots of grout spaces

Fig. 5. The result of colored paper mosaic

Fig. 6. The result of colored paper mosaic with height map

6 Discussion and Future Work

The work presented in this paper shows a new method to generate colored paper tiles with computer graphics techniques. The difference of this paper is that it can maintain the edges of the input image and express various shapes of tiles and overlaps among tiles. These achievements are shown in Figures 4, 5 and 6.

The proposed method has some problems. First, too many small tiles are filled in between large tiles in the results. This is because grout spaces appear between the tile and the edge during the process of arranging the tile. It harms the quality of the result image, so it needs to be improved; another step that considers the edges of the image during tile generation is necessary. This additional step will reduce the generation of grout spaces among tiles or between the tiles and the edges of the image. Second, the performance of the whole process is very low, since the tile arrangement is performed per pixel. Therefore, the GPU or other algorithms need to be applied to improve the performance.

This paper also has some benefits. First, the proposed method can express various shapes of tiles and overlapping between tiles. Second, if other types of paper, like newspaper, are used instead of colored paper, it is possible to come up with another type of mosaic, like a collage. It is easy to express other types of mosaic in computer graphics by modifying the data structure, if a more detailed and elaborate tile selection algorithm is applied.

References
1. Silver, R. and Hawley, M. (eds.): Photomosaics, New York: Henry Holt, 1997
2. Alejo Hausner: Simulating Decorative Mosaics, SIGGRAPH 2001, pp. 573-580, 2001
3. Junhwan Kim, Fabio Pellacini: Jigsaw Image Mosaics, SIGGRAPH 2002, pp. 657-664, 2002
4. Chris Allen: A Hillclimbing Approach to Image Mosaics, UW-L Journal of Undergraduate Research, 2004
5. Young-Sup Park, Sung-Ye Kim, Cheung-Woon Jho, Kyung-Hyun Yoon: Mosaic Techniques using color paper, Proceeding of KCGS Conference, pp. 42-47, 2000
6. Sang-Hyun Seo, Young-Sup Park, Sung-Ye Kim, Kyung-Hyun Yoon: Colored Paper Mosaic Rendering, In SIGGRAPH 2001 Abstracts and Applications, p. 156, 2001
7. Sang-Hyun Seo, Dae-Uk Kang, Young-Sup Park, Kyung-Hyun Yoon: Colored Paper Mosaic Rendering Based on Image Segmentation, Proceeding of KCGS Conference, pp. 27-34, 2001
8. D. Comaniciu, P. Meer: Mean shift: A robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 603-619, May 2002
9. Mark de Berg, M. van Kreveld, M. Overmars and O. Schwarzkopf: Computational Geometry: Algorithms and Applications, Springer, pp. 145-161, 1997
10. Aaron Hertzmann: Fast Paint Texture, NPAR 2002, 2002
11. Rafael C. Gonzalez and Richard E. Woods: Digital Image Processing, 2nd Edition, published by Prentice Hall, 2002

Real-Time Simulation of Surface Gravity Ocean Waves Based on the TMA Spectrum

Namkyung Lee1, Nakhoon Baek2,*, and Kwan Woo Ryu1

1 Dept. of Computer Engineering, Kyungpook National Univ., Daegu 702-701, Korea
namklee@hotmail.com, kwryu@knu.ac.kr
2 School of EECS, Kyungpook National Univ., Daegu 702-701, Korea
oceancru@gmail.com
http://isaac.knu.ac.kr/~hope/tma.htm
Abstract. In this paper, we present a real-time method to display ocean surface gravity waves for various computer graphics applications. Starting from a precise surface gravity wave model in oceanography, we derive its implementation model, and our prototype implementation shows more than 50 frames per second on Intel Core2 Duo 2.40GHz PCs. Our major contributions are the improvement of the expressive power of ocean waves and the provision of more user-controllable parameters for various wave shapes.

Keywords: Computer graphics, Simulation, Ocean wave, TMA.

1 Introduction

Realistic simulation of natural phenomena is one of the interesting and important issues in computer graphics related areas, including computer games and animations. In this paper, we focus on ocean waves, for which there are many research results but not yet a complete solution[1].

Waves on the surface of the ocean are primarily generated by winds and gravity. Although ocean waves include internal waves, tides, edge waves and others, it is clear that we should display at least the surface gravity waves on the computer screen to finally represent the ocean. In oceanography, there are many research results that mathematically model the surface waves in the ocean. Simple sinusoidal or trochoidal expressions can approximate a simple ocean wave. Real-world waves are a composite of these simple waves, called wave trains.

In computer graphics, we can classify the related results into two categories. The first one uses fluid dynamics equations in a way similar to that used in the scientific simulation field. There are a number of results with the capability of obtaining realistic animations of complex water surfaces[2,3,4,5,6]. However, these results are hard to apply to large scenes of water such as oceans, mainly due to their heavy computation.

The other category is based on the ocean wave models from oceanography and consists of three approaches. The first group uses the Gerstner swell model. Fournier[7] concentrated on the shallow water waves and surf along a shore line. He started from parametric equations and added control parameters to simulate

Corresponding author.




various shapes of shallow water waves, but not for large-scale ocean scenes and/or deep-water ocean waves. More complex parametric equations to represent the propagation of water waves were introduced by Gonzato[8]. This model is well suited for modeling the propagating wave front, but its equations are too complex for large-scale ocean waves.

Another group regards the ocean surface as a height field with a prescribed spectrum based on experimental observations from oceanography. Mastin[9] introduced an effective simulation of wave behavior using the Fast Fourier Transform (FFT). The height field is constructed through the inverse FFT of the frequency spectrum of real-world ocean waves. It can produce complex wave patterns similar to real-world ocean waves. Tessendorf[10] showed that dispersive propagation can be managed in the frequency domain and that the resulting field can be modified to yield trochoidal waves. However, the negative aspect of FFT-based methods is homogeneity: we cannot handle any local properties such as refraction, reflection, and others.

The last one is the hybrid approach: the spectrum synthesized by a spectral approach is used to control the trochoids generated by the Gerstner model. Hinsinger[11] presented an adaptive scheme for the animation and display of ocean waves in real time. It relied on a procedural wave model which expresses surface point displacements as sums of wave trains. In this paper, we aim to construct an ocean wave model with the following characteristics:

- Real-time capability: Applications usually display a large-scale ocean scene, and some special effects may be added to the scene, so we need to generate the ocean waves in real time.
- More user-controllable parameters: We provide more parameters to generate a variety of ocean scenes, including deep and shallow oceans, windy and calm oceans, etc.
- Focus on the surface gravity waves: Since we target the large-scale ocean, minor details of the ocean wave are not our major interest. In fact, minor details can easily be super-imposed on the surface gravity waves, if needed.

In the following sections, we present a new hybrid approach to finally get a real-time surface gravity wave simulation. Since it is a kind of hybrid approach, it can generate large-scale oceans without difficulty, and it works in real time, so it can be used with computer-generated animations or other special effects. Additionally, we use a more precise wave model and have more controllable parameters, including the depth of the sea, the fetch length, the wind speed, and so on, in comparison with previous hybrid approaches. We start from the theoretical ocean wave models in the following section and build up our implementation model. Our implementation results and conclusions follow.

2 The Ocean Wave Model

The major generating force for waves is the wind acting on the interface between the air and the water. From the mathematical point of view, the surface is made up of many sinusoidal waves generated by the wind, and they travel through the ocean. One of the fundamental models for the ocean wave is the Gerstner swell model, in which the trajectory of a water particle is expressed as a circle of radius r around its reference location at rest, (x_0, z_0), as follows[11]:

    x = x_0 + r \sin(\omega t - k x_0)
    z = z_0 + r \cos(\omega t - k z_0),                    (1)

where (x, z) is the actual location at time t, ω = 2πf is the pulsation with the frequency f, and k = 2π/λ is the wave number with respect to the wavelength λ.
Equation (1) shows a two-dimensional representation of the ocean wave, assuming that the x-axis coincides with the direction of wave propagation. The surface of an ocean is actually made up of a finite sum of these simple waves, and the height z of the water surface at the grid point (x, y) at time t can be expressed as:

z(x, y, t) = \sum_{i=1}^{n} A_i cos( k_i (x cos\theta_i + y sin\theta_i) - \omega_i t + \varphi_i ),        (2)

where n is the number of wave trains, A_i is the amplitude, k_i is the wave number, \theta_i is the direction of wave propagation on the xy-plane and \varphi_i is the phase. In Hinsinger[11], all these parameters were selected manually, and thus the user may have difficulties selecting proper values for them.
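To make Equation (2) concrete, the height field can be evaluated directly as a sum of wave trains. The sketch below is ours, not the authors' implementation; the grid size and the example wave-train values are hypothetical.

```python
import numpy as np

def height_field(x, y, t, A, k, theta, omega, phi):
    """Eq. (2): z = sum_i A_i cos(k_i (x cos(theta_i) + y sin(theta_i)) - omega_i t + phi_i)."""
    z = np.zeros_like(x, dtype=float)
    for Ai, ki, thi, wi, phii in zip(A, k, theta, omega, phi):
        z += Ai * np.cos(ki * (x * np.cos(thi) + y * np.sin(thi)) - wi * t + phii)
    return z

# Hypothetical example: a 200 x 200 grid and three manually chosen wave trains.
xs, ys = np.meshgrid(np.linspace(0.0, 100.0, 200), np.linspace(0.0, 100.0, 200))
z = height_field(xs, ys, t=0.0,
                 A=[0.5, 0.2, 0.1], k=[0.10, 0.25, 0.40],
                 theta=[0.0, 0.3, -0.2], omega=[1.0, 1.5, 2.0], phi=[0.0, 1.0, 2.0])
```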
In contrast, Thon[12] used a spectrum-based method to find reasonable parameter sets. They used the Pierson-Moskowitz (PM) model[13], which empirically expresses a fully developed sea in terms of the wave frequency f as follows:

E_{PM}(f) = \frac{0.0081 g^2}{(2\pi)^4 f^5} \exp\left[ -\frac{5}{4} \left( \frac{f_p}{f} \right)^4 \right],

where E_{PM}(f) is the spectrum, g is the gravity constant and f_p = 0.13 g / U_{10} is the peak frequency depending on the wind speed U_{10} at a height of 10 meters above the sea surface.
Although Thon used the PM model to produce some impressive results, the PM model itself assumes an infinite ocean depth and thus may fail for shallow sea cases. To overcome this drawback, the JONSWAP model and the TMA model were introduced. The JONSWAP (Joint North Sea Wave Project) model[14] was developed for fetch-limited seas such as the North Sea and is expressed as follows:

E_{JONSWAP}(f) = \frac{\alpha g^2}{(2\pi)^4 f^5} \exp\left[ -\frac{5}{4} \left( \frac{f_p}{f} \right)^4 \right] \gamma^{\exp\left[ -\frac{(f/f_p - 1)^2}{2\sigma^2} \right]},

where \alpha is the scaling parameter, \gamma is the peak enhancement factor, and \sigma is evaluated as 0.07 for f \le f_p and 0.09 otherwise. Given the fetch length F, the frequency at the spectral peak f_p is calculated as follows:
f_p = 3.5 \left( \frac{g^2}{U_{10} F} \right)^{0.33}.


The TEXEL, MARSEN and ARSLOE (TMA) model[15] extends the JONSWAP model to include the water depth h as one of its implicit parameters as follows:

E_{TMA}(f) = E_{JONSWAP}(f) \Phi(f, h),

where \Phi(f, h) is the Kitaigorodskii depth function:

\Phi(f, h) = \frac{1}{s(f_h)} \left( 1 + \frac{K}{\sinh K} \right)^{-1},

with f_h = f \sqrt{h/g}, K = 2 (f_h)^2 s(f_h) and s(f_h) = \tanh^{-1}[(2 f_h)^2 h].
The TMA model shows good empirical behavior even with a water depth of 6 meters. Thus, it is possible to represent waves on the surface of lakes or small ponds, in addition to ocean waves. Additionally, it also includes the fetch length as a parameter, inherited from the JONSWAP model. Thus, the expressive power of the TMA model is much greater than that of the PM model previously used by other researchers. We use this improved wave model to finally achieve more realistic ocean scenes with more user-controllable parameters.
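The following Python sketch shows one way to evaluate the spectra described above. It is our reading of the formulas, not the authors' code: the default alpha and gamma values are typical JONSWAP choices, and the depth factor uses a commonly cited piecewise approximation of the Kitaigorodskii function rather than the exact expression in the text.

```python
import math

G = 9.81  # gravity constant (m/s^2)

def pm_spectrum(f, U10):
    """Pierson-Moskowitz spectrum E_PM(f), with peak frequency fp = 0.13 g / U10."""
    fp = 0.13 * G / U10
    return 0.0081 * G**2 / ((2 * math.pi)**4 * f**5) * math.exp(-1.25 * (fp / f)**4)

def jonswap_spectrum(f, U10, F, alpha=0.0081, gamma=3.3):
    """JONSWAP spectrum for a fetch-limited sea; the default alpha (scaling) and
    gamma (peak enhancement) are typical values, assumed here."""
    fp = 3.5 * (G**2 / (U10 * F))**0.33            # peak frequency from the fetch length F
    sigma = 0.07 if f <= fp else 0.09
    r = math.exp(-((f / fp - 1.0)**2) / (2.0 * sigma**2))
    return (alpha * G**2 / ((2 * math.pi)**4 * f**5)
            * math.exp(-1.25 * (fp / f)**4) * gamma**r)

def depth_factor(f, h):
    """Piecewise approximation of the Kitaigorodskii depth function (an assumption;
    the paper uses the exact form), with w_h = 2*pi*f*sqrt(h/g)."""
    wh = 2.0 * math.pi * f * math.sqrt(h / G)
    if wh <= 1.0:
        return 0.5 * wh * wh
    if wh < 2.0:
        return 1.0 - 0.5 * (2.0 - wh)**2
    return 1.0

def tma_spectrum(f, U10, F, h):
    """TMA spectrum: the JONSWAP spectrum modulated by the water-depth factor."""
    return jonswap_spectrum(f, U10, F) * depth_factor(f, h)
```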

The Implementation Model

To derive implementation-related expressions, we need to extend the spectrum of the TMA model to the two-dimensional world as follows[14]:

E(f, \theta) = E_{TMA}(f) D(f, \theta),

where D(f, \theta) is a directional spreading factor that weights the spectrum at angle \theta from the downwind direction. The spreading factor is expressed as follows:

D(f, \theta) = N_p^{-1} \cos^{2p}\left( \frac{\theta}{2} \right),

where p = 9.77 (f/f_p)^{\mu}, N_p = 2^{1-2p} \pi \, \Gamma(2p+1) / \Gamma^2(p+1) with Euler's Gamma function \Gamma, and \mu = 4.06 if f < f_p, and \mu = -2.34 otherwise.
For more convenience in the implementation, we derive evaluation functions for the parameters, including frequency, amplitude, wave direction, wave number and pulsation. The frequency of each wave train is determined from the peak frequency f_p and a random offset to simulate the irregularity of the ocean waves. Thereafter, the pulsation and the wave number are naturally calculated from their definitions.
According to random linear wave theory[16,17,18,19,20], the directional wave spectrum E(f, \theta) is given by

E(f, \theta) = \Psi(k(f), \theta) \, k(f) \, \frac{dk(f)}{df},        (3)


where k(f) = 4\pi^2 f^2 / g and \Psi(k(f), \theta) is the wave number spectrum. The product of the second and the third factors in Equation (3) can be computed as:

k(f) \, \frac{dk(f)}{df} = \frac{32 \pi^4 f^3}{g^2}.

This allows us to re-write Equation (3) as follows[17]:

E(f, \theta) = \Psi(k(f), \theta) \, \frac{32 \pi^4 f^3}{g^2}.

From random linear wave theory[17,19], the wave number spectrum \Psi(k(f), \theta) can be approximated as:

\Psi(k(f), \theta) = \frac{\beta}{4\pi^2} A(f)^2,

where \beta is a constant. Finally, the amplitude A(f) of a wave train is evaluated as:

A(f) = \sqrt{ \frac{E(f, \theta) \, g^2}{8 \beta \pi^2 f^3} } = \sqrt{ \frac{E_{TMA}(f) \, D(f, \theta) \, g^2}{8 \beta \pi^2 f^3} }.
Using all these derivations, we can calculate the parameter values for Equation (2). Then we evaluate the height at each grid point (x, y) to construct a rectangular mesh representing the ocean surface.
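Putting the pieces together, one possible way to derive the wave-train parameters of Equation (2) from the spectrum is sketched below, reusing G and tma_spectrum from the earlier sketch. The +/-30% frequency jitter, the value of the constant beta and the rejection sampling of directions are our assumptions; only the formulas for D(f, theta), A(f) and k(f) follow the text.

```python
import math
import random

def spreading(f, theta, fp):
    """Directional spreading D(f, theta) = cos^{2p}(theta/2) / N_p, as in the text."""
    mu = 4.06 if f < fp else -2.34
    p = 9.77 * (f / fp)**mu
    Np = 2.0**(1.0 - 2.0 * p) * math.pi * math.gamma(2.0 * p + 1.0) / math.gamma(p + 1.0)**2
    return math.cos(theta / 2.0)**(2.0 * p) / Np

def wave_trains(U10, F, h, n=20, beta=1.0, rng=None):
    """Sample n wave trains as (amplitude, wave number, direction, pulsation, phase)."""
    rng = rng or random.Random(0)
    fp = 3.5 * (G**2 / (U10 * F))**0.33
    trains = []
    for _ in range(n):
        f = fp * rng.uniform(0.7, 1.3)                  # random offset around the peak
        while True:                                     # rejection-sample a direction
            theta = rng.uniform(-math.pi / 2, math.pi / 2)
            if rng.uniform(0.0, spreading(f, 0.0, fp)) <= spreading(f, theta, fp):
                break
        E = tma_spectrum(f, U10, F, h) * spreading(f, theta, fp)
        A = math.sqrt(E * G**2 / (8.0 * beta * math.pi**2 * f**3))   # A(f) from the text
        omega = 2.0 * math.pi * f                       # pulsation
        k = (2.0 * math.pi * f)**2 / G                  # k(f) = 4 pi^2 f^2 / g
        trains.append((A, k, theta, omega, rng.uniform(0.0, 2.0 * math.pi)))
    return trains
```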

Implementation Results

Figures 1, 2 and 3 show some outputs from the prototype implementation. We implemented the ocean wave generation program based on the TMA model presented in the previous section. It uses the plain OpenGL library and does not use any multi-threading or hardware-based acceleration techniques. At this time, we focused on the expressive power of our TMA model-based implementation, and thus our prototype implementation lacks some acceleration and optimization factors. Even so, it shows more than 50 frames per second on a PC with an Intel

(a) wind speed 3 m/s, water depth 5 m    (b) wind speed 3 m/s, water depth 100 m
Fig. 1. Ocean waves with different water depths: even with the same wind speed, different water depths result in very different waves. We use a fetch length of 5 km for these images.


(a) wind speed 3 m/s, water depth 100 m    (b) wind speed 6 m/s, water depth 100 m
Fig. 2. Ocean waves with different wind velocities: changes in wind speed generate calmer or choppier waves. A fetch length of 10 km is used for each of these images.

Fig. 3. An animated sequence of ocean waves (frames (a)-(l))


Core2 Duo 6600 2.40 GHz processor and a GeForce 7950GT-based graphics card. We expect that the frame rate will be much better in the next version.
In Figure 1, we can control the depth of the ocean to show very different waves even with the same wind speed and the same fetch length. In particular, changes in the water depth are handled only by the TMA model, while the previous PM model cannot handle them. Figure 2 shows the effect of changing the wind speed. As expected, a high wind speed generates choppier waves. Figure 3 is a sequence of images captured during the real-time animation of a windy ocean. All examples are executed with a mesh resolution of 200 × 200. More examples are on our web page, http://isaac.knu.ac.kr/~hope/tma.htm.

Conclusion

In this paper, we present a real-time surface gravity wave simulation method, derived from a precise ocean wave model from oceanography. We started from the TMA model, a precise ocean wave model which, to our knowledge, has not previously been used in a graphics implementation. Since we use a more precise ocean wave model, users can control more parameters to create various ocean scenes. The two major improvements of our method in comparison with previous work are:
- Enhanced expressive power: Our method can display visually plausible scenes even for shallow seas.
- Improved user controllability: Our method provides more parameters, such as fetch length and depth of water, in addition to the wind velocity.
We implemented a prototype system and showed that it can generate animated sequences of ocean waves in real time. We plan to integrate our implementation into large-scale applications such as games, maritime training simulators, etc. Some detailed variations of the ocean waves can also be added to our implementation with minor modifications.

Acknowledgements
This research was supported by the Regional Innovation Industry Promotion Project conducted by the Ministry of Commerce, Industry and Energy (MOCIE) of the Korean Government (70000187-2006-01).

References
1. Iglesias, A.: Computer graphics for water modeling and rendering: a survey. Future Generation Comp. Syst. 20(8) (2004) 1355-1374
2. Enright, D., Marschner, S., Fedkiw, R.: Animation and rendering of complex water surfaces. In: SIGGRAPH 02. (2002) 736-744
3. Foster, N., Fedkiw, R.: Practical animation of liquids. In: SIGGRAPH 01. (2001) 23-30


4. Foster, N., Metaxas, D.N.: Realistic animation of liquids. CVGIP: Graphical Models and Image Processing 58(5) (1996) 471-483
5. Foster, N., Metaxas, D.N.: Controlling fluid animation. In: Computer Graphics International 97. (1997) 178-188
6. Stam, J.: Stable fluids. In: SIGGRAPH 99. (1999) 121-128
7. Fournier, A., Reeves, W.T.: A simple model of ocean waves. In: SIGGRAPH 86. (1986) 75-84
8. Gonzato, J.C., Le Saec, B.: On modelling and rendering ocean scenes. J. of Visualization and Computer Animation 11(1) (2000) 27-37
9. Mastin, G.A., Watterberg, P.A., Mareda, J.F.: Fourier synthesis of ocean scenes. IEEE Comput. Graph. Appl. 7(3) (1987) 16-23
10. Tessendorf, J.: Simulating ocean water. In: SIGGRAPH 01 Course Notes. (2001)
11. Hinsinger, D., Neyret, F., Cani, M.P.: Interactive animation of ocean waves. In: SCA 02: Proceedings of the 2002 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. (2002) 161-166
12. Thon, S., Dischler, J.M., Ghazanfarpour, D.: Ocean waves synthesis using a spectrum-based turbulence function. In: Computer Graphics International 00. (2000) 65
13. Pierson, W., Moskowitz, L.: A proposed spectral form for fully developed wind seas based on the similarity theory of S.A. Kitaigorodskii. J. Geophysical Research (69) (1964) 5181-5190
14. Hasselmann, D., Dunckel, M., Ewing, J.: Directional wave spectra observed during JONSWAP 1973. J. Physical Oceanography 10(8) (1980) 1264-1280
15. Bouws, E., Günther, H., Rosenthal, W., Vincent, C.L.: Similarity of the wind wave spectrum in finite depth water: Part 1. Spectral form. J. Geophysical Research 90 (1985) 975-986
16. Crawford, F.: Waves. McGraw-Hill (1977)
17. Krogstad, H., Arntsen, Ø.: Linear Wave Theory. Norwegian Univ. of Sci. and Tech. (2006) http://www.bygg.ntnu.no/ oivarn/
18. Seyringer, H.: Nature wizard (2006) http://folk.ntnu.no/oivarn/hercules ntnu/LWTcourse/
19. Sorensen, R.: Basic Coastal Engineering. Springer-Verlag (2006)
20. US Army Corps of Engineers Internet Publishing Group: Coastal Engineering Manual Part II (2006) http://www.usace.army.mil/publications/engmanuals/em1110-2-1100/PartII/PartII.htm

Determining Knots with Quadratic Polynomial Precision

Zhang Caiming 1,2, Ji Xiuhua 1, and Liu Hui 1
1 School of Computer Science and Technology, University of Shandong Economics, Jinan 250014, China
2 School of Computer Science and Technology, Shandong University, Jinan 250061, China

Abstract. A new method for determining knots in parametric curve interpolation is presented. The determined knots have quadratic polynomial precision in the sense that an interpolation scheme which reproduces quadratic polynomials would reproduce parametric quadratic polynomials if the new method is used to determine knots in the interpolation process. Testing results on the efficiency of the new method are also included.
Keywords: parametric curves, knots, polynomials.

Introduction

The problem of constructing parametric interpolating curves is of fundamental importance in CAGD, CG, scientific computing and so on. The constructed curve is often required to have a good approximation precision as well as the shape suggested by the data points.
The construction of an ideal parametric interpolating curve requires not only a good interpolation method, but also an appropriate choice of the parameter knots. In parametric curve construction, the chord length parametrization is a widely accepted and used method to determine knots [1][2]. Two other useful methods are the centripetal model[3] and the adjusted chord length method ([4], referred to as Foley's method). When these three methods are used, the constructed interpolant can only reproduce straight lines. In [5], a new method for determining knots is presented (referred to as the ZCM method). The knots are determined using a global method. The determined knots can be used to construct interpolants which reproduce parametric quadratic curves if the interpolation scheme reproduces quadratic polynomials.
A new method for determining knots is presented in this paper. The knots associated with the points are computed by a local method. The determined knots have quadratic polynomial precision. Experiments show that the curves constructed using the knots from the new method generally have better interpolation precision.
The remaining part of the paper is arranged as follows. The basic idea of the new method is described in Section 2. The choice of knots by constructing a

parametric quadratic interpolant to four data points is discussed in Section 3. The comparison of the new method with four other methods is performed in Section 4. Conclusions and future work are given in Section 5.

Basic Idea

Let P_i = (x_i, y_i), 1 \le i \le n, be a given set of distinct data points which satisfies the condition that for any point P_i, 1 < i < n, there are at least two sets of four consecutive convex data points which include it. As an example, for the data points in Figure 3, the point P_{i+1} belongs to the two sets of consecutive convex data points {P_{i-2}, P_{i-1}, P_i, P_{i+1}} and {P_i, P_{i+1}, P_{i+2}, P_{i+3}}, respectively.
The goal is to construct a knot t_i for each P_i, 1 \le i \le n. The constructed knots satisfy the following condition: if the set of data points is taken from a parametric quadratic polynomial, i.e.,

P_i = A i^2 + B i + C,    1 \le i \le n,        (1)

where A = (a_1, a_2), B = (b_1, b_2) and C = (c_1, c_2) are 2D points, then

t_i - t_{i-1} = \alpha (i - (i-1)),    1 < i \le n,        (2)

for some constant \alpha > 0.


Such a set of knots ti , 1 i n, is known to have a quadratic polynomial precision. Obviously, using the knots satisfying equation (2), an interpolation scheme which reproduces quadratic polynomials will reproduce parametric
quadratic polynomials.
Following, the basic idea in determining the knots ti , 1 i n, will be described. If the set of data points is taken from a parametric quadratic polynomial,
P () = (x(), y()) dened by
x() = X2 2 + X1 + X0 ,
y() = Y2 2 + Y1 + Y0 ,

(3)

then, there is a rotating transformation to transform it to the following parabola


form, as shown in Figure 1:
y = a1 t2 + b1 t + c1 ,
x = t

(4)

Then the knots ti , 1 i n can be dened by


ti = x
i ,

i = 1, 2, 3, , n,

which has a quadratic polynomial precision. Assume that the following transformation
x
= x cos 2 + y sin 2
y = x sin 2 + y cos 2
transforms P () (3) to the parabola (4), then we have the following theorem 1.



Fig. 1. A standard parabola in the \bar{x}-\bar{y} coordinate system

Theorem 1. If the set of data points is taken from a parametric quadratic polynomial P(u) (3), then the knots t_i, i = 1, 2, 3, \ldots, n, which have quadratic polynomial precision, can be defined by

t_1 = 0,
t_i = t_{i-1} + (x_i - x_{i-1}) \cos 2\theta + (y_i - y_{i-1}) \sin 2\theta,    i = 2, 3, \ldots, n,        (5)

where

\sin 2\theta = -X_2 / \sqrt{X_2^2 + Y_2^2},
\cos 2\theta = Y_2 / \sqrt{X_2^2 + Y_2^2}.        (6)

Proof. In the \bar{x}\bar{y} coordinate system, it follows from (3) that

\bar{x} = (X_2 u^2 + X_1 u + X_0) \cos 2\theta + (Y_2 u^2 + Y_1 u + Y_0) \sin 2\theta,
\bar{y} = -(X_2 u^2 + X_1 u + X_0) \sin 2\theta + (Y_2 u^2 + Y_1 u + Y_0) \cos 2\theta.        (7)

If \sin 2\theta and \cos 2\theta are defined by (6), then the first expression of (7) becomes

u = \frac{\bar{x} - (X_0 \cos 2\theta + Y_0 \sin 2\theta)}{X_1 \cos 2\theta + Y_1 \sin 2\theta} = A(\bar{x} - B),        (8)

with A = 1/(X_1 \cos 2\theta + Y_1 \sin 2\theta) and B = X_0 \cos 2\theta + Y_0 \sin 2\theta. Substituting (8) into the second expression of (7) and rearranging, a parabola is obtained, which is defined by

\bar{y} = a_1 \bar{x}^2 + b_1 \bar{x} + c_1,

where a_1, b_1 and c_1 are defined by

a_1 = (Y_2 \cos 2\theta - X_2 \sin 2\theta) A^2,
b_1 = -2 a_1 B + (Y_1 \cos 2\theta - X_1 \sin 2\theta) A,
c_1 = a_1 B^2 - (Y_1 \cos 2\theta - X_1 \sin 2\theta) A B + Y_0 \cos 2\theta - X_0 \sin 2\theta.

Thus, t_i can be defined by \bar{x}_i; i.e., the knots t_i, i = 1, 2, 3, \ldots, n, can be defined by (5), which has quadratic polynomial precision.
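As a small worked illustration of Theorem 1 (our sketch, not the authors' code; the sign convention follows our reading of (6)), the knots can be accumulated directly from the leading coefficients (X_2, Y_2) of the quadratic:

```python
import math

def knots_from_quadratic(points, X2, Y2):
    """Eq. (5)-(6): t_1 = 0, t_i = t_{i-1} + dx*cos(2theta) + dy*sin(2theta)."""
    norm = math.hypot(X2, Y2)
    sin2t, cos2t = -X2 / norm, Y2 / norm
    knots = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        knots.append(knots[-1] + (x1 - x0) * cos2t + (y1 - y0) * sin2t)
    return knots

# Example: points sampled from P(u) = (u^2 + u, 2u^2 - u), i.e. X2 = 1, Y2 = 2.
pts = [(u * u + u, 2 * u * u - u) for u in range(5)]
print(knots_from_quadratic(pts, X2=1.0, Y2=2.0))  # consecutive differences are constant
```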


The discussion above shows that the key point of determining knots is to construct the quadratic polynomial P(u) = (x(u), y(u)) (3) using the given data points. This is discussed in Section 3.

Determining Knots

In this section, we first discuss how to construct a quadratic polynomial with four points, and then discuss the determination of knots using the quadratic polynomial.
3.1 Constructing a Quadratic Polynomial with Four Points

Let Q_i(s) be a parametric quadratic polynomial which interpolates P_{i-1}, P_i and P_{i+1}. Q_i(s) can be defined on the interval [0, 1] as follows:

Q_i(s) = \phi_1(s)(P_{i-1} - P_i) + \phi_2(s)(P_{i+1} - P_i) + P_i,        (9)

where

\phi_1(s) = \frac{(s - s_i)(s - 1)}{s_i}, \qquad \phi_2(s) = \frac{s(s - s_i)}{1 - s_i},        (10)

and 0 < s_i < 1.


Expressions (9) and (10) show that four data points are needed to determine a parametric quadratic polynomial uniquely.
Let P_j = (x_j, y_j), i-1 \le j \le i+2, be four points of which no three lie on a straight line. The point P_{i+2} will be used to determine s_i in (10).
Without loss of generality, the coordinates of P_{i-1}, P_i, P_{i+1} and P_{i+2} are supposed to be (0, 1), (0, 0), (1, 0) and (x_{i+2}, y_{i+2}), respectively, as shown in Figure 2. In this xy coordinate system, Q_i(s) defined by (9) becomes

x = s(s - s_i)/(1 - s_i),
y = (s - s_i)(s - 1)/s_i.        (11)

Let s_{i+2} be the knot associated with the point (x_{i+2}, y_{i+2}). As the point (x_{i+2}, y_{i+2}) is on the curve, we have

x_{i+2} = s_{i+2}(s_{i+2} - s_i)/(1 - s_i),
y_{i+2} = (s_{i+2} - s_i)(s_{i+2} - 1)/s_i.        (12)

It follows from (12) that

s_{i+2} = x_{i+2} + (1 - x_{i+2} - y_{i+2}) s_i.        (13)

Substituting (13) into (12), one gets the following equation:

s_i^2 + A(x_{i+2}, y_{i+2}) s_i + B(x_{i+2}, y_{i+2}) = 0,        (14)



Fig. 2. P_{i+2} is in the dotted region

where

A(x_{i+2}, y_{i+2}) = -\frac{2 x_{i+2}}{x_{i+2} + y_{i+2}},
B(x_{i+2}, y_{i+2}) = \frac{(1 - x_{i+2}) x_{i+2}}{(1 - x_{i+2} - y_{i+2})(x_{i+2} + y_{i+2})}.

As s_{i+2} > 1, the root of (14) is

s_i = \frac{1}{x_{i+2} + y_{i+2}} \left( x_{i+2} - \sqrt{\frac{x_{i+2} \, y_{i+2}}{x_{i+2} + y_{i+2} - 1}} \right).        (15)

It follows from (9)-(10) that if the given data points are taken from a parametric quadratic polynomial Q(t), then there is a unique s_i satisfying 0 < s_i < 1 that makes the curve Q_i(s) (9) pass through the given data points. Since s_i is determined uniquely by (15), Q_i(s) is equivalent to Q(t).
Substituting s_{i+2} > 1 into (11), one obtains

x_{i+2} > 1 \quad and \quad y_{i+2} > 0;        (16)

that is, the point (x_{i+2}, y_{i+2}) should be in the dotted region in Figure 2.
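A small sketch of this step (ours, under the reconstruction above): the four points are first expressed in the canonical frame of Figure 2 and then s_i is obtained from (15). The affine normalization helper is our addition; the paper simply assumes the canonical coordinates.

```python
import math

def canonical_coords(p_im1, p_i, p_ip1, q):
    """Express q in the affine frame where P_{i-1}, P_i, P_{i+1} map to (0,1), (0,0), (1,0)."""
    ax, ay = p_ip1[0] - p_i[0], p_ip1[1] - p_i[1]     # axis mapped to (1, 0)
    bx, by = p_im1[0] - p_i[0], p_im1[1] - p_i[1]     # axis mapped to (0, 1)
    det = ax * by - ay * bx
    qx, qy = q[0] - p_i[0], q[1] - p_i[1]
    return (qx * by - qy * bx) / det, (ax * qy - ay * qx) / det

def compute_si(p_im1, p_i, p_ip1, p_ip2):
    """Solve Eq. (14) for s_i via Eq. (15); Eq. (16) requires x_{i+2} > 1 and y_{i+2} > 0."""
    x2, y2 = canonical_coords(p_im1, p_i, p_ip1, p_ip2)
    assert x2 > 1.0 and y2 > 0.0, "P_{i+2} must lie in the region of Eq. (16)"
    return (x2 - math.sqrt(x2 * y2 / (x2 + y2 - 1.0))) / (x2 + y2)
```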
3.2 Determining Knots

After s_i has been determined, Q_i(s) (9) can be written as

x_i(s) = X_{i,2} s^2 + X_{i,1} s + X_{i,0},
y_i(s) = Y_{i,2} s^2 + Y_{i,1} s + Y_{i,0},        (17)

where

X_{i,2} = \frac{x_{i-1} - x_i}{s_i} + \frac{x_{i+1} - x_i}{1 - s_i},
X_{i,1} = -\frac{(x_{i-1} - x_i)(s_i + 1)}{s_i} - \frac{(x_{i+1} - x_i) s_i}{1 - s_i},
X_{i,0} = x_{i-1},
Y_{i,2} = \frac{y_{i-1} - y_i}{s_i} + \frac{y_{i+1} - y_i}{1 - s_i},
Y_{i,1} = -\frac{(y_{i-1} - y_i)(s_i + 1)}{s_i} - \frac{(y_{i+1} - y_i) s_i}{1 - s_i},
Y_{i,0} = y_{i-1}.        (18)


It follows from Theorem 1 that for i = 2, 3, \ldots, n-2, the knot interval t_{j+1} - t_j = \delta_j^i between P_j and P_{j+1}, j = i-1, i, i+1, can be defined by

\delta_j^i = (x_{j+1} - x_j) \cos\theta_i + (y_{j+1} - y_j) \sin\theta_i,    j = i-1, i, i+1,        (19)

where \cos\theta_i and \sin\theta_i are defined by (cf. (6))

\sin\theta_i = -X_{i,2} / \sqrt{X_{i,2}^2 + Y_{i,2}^2},
\cos\theta_i = Y_{i,2} / \sqrt{X_{i,2}^2 + Y_{i,2}^2}.
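Continuing the sketch above, (18)-(19) reduce to a few lines of code (ours; the sign convention for sin theta_i and cos theta_i follows our reconstruction of (6), and s_i is assumed to come from the previous subsection):

```python
import math

def knot_intervals(p_im1, p_i, p_ip1, p_ip2, si):
    """Leading coefficients X_{i,2}, Y_{i,2} of Q_i(s) (Eq. (18)) and the three
    intervals delta_{i-1}^i, delta_i^i, delta_{i+1}^i of Eq. (19)."""
    X2 = (p_im1[0] - p_i[0]) / si + (p_ip1[0] - p_i[0]) / (1.0 - si)
    Y2 = (p_im1[1] - p_i[1]) / si + (p_ip1[1] - p_i[1]) / (1.0 - si)
    norm = math.hypot(X2, Y2)
    sin_t, cos_t = -X2 / norm, Y2 / norm
    pts = [p_im1, p_i, p_ip1, p_ip2]
    return [(b[0] - a[0]) * cos_t + (b[1] - a[1]) * sin_t for a, b in zip(pts, pts[1:])]
```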
Based on the definition (19), for the pair P_1 and P_2 there is one knot interval, \delta_1^2; for the pair P_2 and P_3 there are two knot intervals, \delta_2^2 and \delta_2^3; for the pair P_i and P_{i+1}, 3 \le i \le n-2, there are three knot intervals, \delta_i^{i-1}, \delta_i^i and \delta_i^{i+1}; the knot intervals for the pairs P_{j-1} and P_j, j = n-1, n, are similar. Now the knot interval \Delta_i for the pair P_i and P_{i+1}, i = 1, 2, \ldots, n-1, is defined by

\Delta_1 = \delta_1^2,
\Delta_2 = \delta_2^2 + \epsilon_2^2,
\Delta_i = \delta_i^i + 2\epsilon_i^1 \epsilon_i^2 / (\epsilon_i^1 + \epsilon_i^2),    i = 3, 4, \ldots, n-3,
\Delta_{n-2} = \delta_{n-2}^{n-2} + \epsilon_{n-2}^1,
\Delta_{n-1} = \delta_{n-1}^{n-2},        (20)

where

\epsilon_i^1 = |\delta_i^i - \delta_i^{i-1}|,
\epsilon_i^2 = |\delta_i^i - \delta_i^{i+1}|.

If the given set of data points is taken from a parametric quadratic polynomial, then \epsilon_i^1 = \epsilon_i^2 = 0. The terms \epsilon_2^2, 2\epsilon_i^1 \epsilon_i^2 / (\epsilon_i^1 + \epsilon_i^2) and \epsilon_{n-2}^1 are corrections to \Delta_2, \Delta_i, i = 3, 4, \ldots, n-3, and \Delta_{n-2}, respectively.

Fig. 3. Example 1 of the data points

For data points as shown in Figure 3, where the data points change their convexity, the knot interval between P_i and P_{i+1} is defined by the combination of \delta_i^{i-1} and \delta_i^{i+1}, i.e., by

\Delta_i = (\delta_i^{i-1} + \delta_i^{i+1})/2.        (21)

For data points as shown in Figure 4, the knot intervals are determined by subdividing the data points at the point P_{i+1} into two sets of data points. The first

Fig. 4. Example 2 of the data points

set of data points ends at P_{i+1}, while the second set of data points starts at P_{i+1}. If P_{i-1}, P_i and P_{i+1} are on a straight line, then we set t_i - t_{i-1} = |P_{i-1} P_i| and t_{i+1} - t_i = |P_i P_{i+1}|; this choice makes the quadratic polynomial Q_i(t) which passes through P_{i-1}, P_i and P_{i+1} a straight line with the magnitude of the first derivative being constant. Such a straight line is the most well-defined curve one can get in this case.

Experiments

The new method has been compared with the chord length, centripetal, Foley and ZCM methods. The comparison is performed by using the knots computed by these methods in the construction of a parametric cubic spline interpolant. For brevity, the cubic splines produced using these methods are called the chord spline, centripetal spline, Foley's spline, ZCM spline and new spline, respectively. The data points used in the comparison are taken from the following ellipse:

x = x(\tau) = 3 \cos(2\pi\tau),
y = y(\tau) = 2 \sin(2\pi\tau).        (22)

The comparison is performed by dividing the interval [0, 1] into 36 sub-intervals to define data points, i.e., \tau_i is defined by

\tau_i = (i + \lambda \sin((36 - i) i))/36,    i = 0, 1, 2, \ldots, 36,

where 0 \le \lambda \le 0.25.
To avoid the maximum error occurring near the end points (x_0, y_0) and (x_{20}, y_{20}), the tangent vectors of F(\tau) at \tau = 0 and \tau = 1 are used as the end conditions to construct the cubic splines.
The five methods are compared by the absolute error curve E(t), defined by

E(t) = \min\{|P(t) - F(\tau)|\} = \min\{|P_i(t) - F(\tau)|, \ \tau_i \le \tau \le \tau_{i+1}\},    i = 0, 1, 2, \ldots, 19,

where P(t) denotes one of the chord spline, centripetal spline, Foley's spline, ZCM spline or new spline, P_i(t) is the corresponding part of P(t) on the subinterval [t_i, t_{i+1}], and F(\tau) is defined by (22). For the point P(t), E(t) is the shortest distance from the curve F(\tau) to P(t).


Table 1. Maximum absolute errors

Error        chord     centripetal   Foley     ZCM       New
λ = .0       5.29e-5   5.29e-5       5.29e-5   5.29e-5   5.29e-5
λ = .05      1.67e-4   3.71e-3       2.39e-3   1.58e-4   1.60e-4
λ = .10      3.17e-4   8.00e-3       5.33e-3   2.93e-4   2.89e-4
λ = .15      5.08e-4   1.30e-2       8.88e-3   4.58e-4   4.37e-4
λ = .20      7.41e-4   1.86e-2       1.31e-2   6.55e-4   6.04e-4
λ = .25      1.02e-3   2.49e-2       1.79e-2   8.86e-4   7.88e-4

The maximum values of the error curve E(t) generated by these methods are shown in Table 1. The five methods have also been compared on data points which divide [0, 1] into 18, 72, etc. subintervals. The results are basically similar to those shown in Table 1.

Conclusions and Future Work

A new method for determining knots in parametric curve interpolation is presented. The determined knots have quadratic polynomial precision. This means that, from the approximation point of view, the new method is better than the chord length, centripetal and Foley's methods in terms of the error evaluation in the associated Taylor series. The ZCM method also has quadratic polynomial precision, but it is a global method, while the new method is a local one.
The new method works well on data points whose convexity does not change sign; our next work is to extend it to data points whose convexity changes sign.

Acknowledgments. This work was supported by the National Key Basic Research 973 Program of China (2006CB303102) and the National Natural Science Foundation of China (60533060, 60633030).

References
1. Ahlberg, J. H., Nilson, E. N. and Walsh, J. L., The theory of splines and their applications, Academic Press, New York, NY, USA, 1967.
2. de Boor, C., A practical guide to splines, Springer Verlag, New York, 1978.
3. Lee, E. T. Y., Choosing nodes in parametric curve interpolation, CAD, Vol. 21, No. 6, pp. 363-370, 1989.
4. Farin, G., Curves and surfaces for computer aided geometric design: A practical guide, Academic Press, 1988.
5. Zhang, C., Cheng, F. and Miura, K., A method for determining knots in parametric curve interpolation, CAGD 15 (1998), pp. 399-416.

Interactive Cartoon Rendering and Sketching of Clouds and Smoke

Eduardo J. Álvarez1, Celso Campos1, Silvana G. Meire1, Ricardo Quirós2, Joaquín Huerta2, and Michael Gould2
1 Departamento de Informática, Universidad de Vigo, Spain
ccampos@ei.uvigo.es
2 Departamento de Lenguajes y Sistemas Informáticos, Universitat Jaume I, Spain
{quiros, huerta, gould}@lsi.uji.es

Abstract. We present several techniques to generate clouds and smoke with cartoon and sketching styles, obtaining interactive speed for the graphical results. The proposed method abstracts the visual and geometric complexity of the gaseous phenomena using a particle system. The abstraction process is made using implicit surfaces, which are later used to calculate the silhouette and obtain the resulting image. Additionally, we add detail layers that improve the appearance and provide the sensation of greater volume for the gaseous effect. Finally, we also include in our application a simulator that generates smoke animations.

1 Introduction
The automatic generation of cartoons requires the use of two basic techniques in expressive rendering: a specific illumination model for this rendering style and the visualization of the objects' silhouettes. This style is known as cartoon rendering, and its use is common in the production of animation films and in the creation of television contents. The use of cartoon rendering techniques in video games is also growing, as they can produce more creative details than techniques based on realism.
There are several techniques to automatically calculate the silhouette (outline) and cel-shading [1][2][3]. Shadowing and self-shadowing, along with the silhouettes, are fundamental effects for expressing volume, position and limits of objects. Most of these techniques require general meshes, and they do not allow the representation of amorphous shapes, which are modeled by particle systems, as in the case of clouds and smoke.
Our objective is to create cartoon vignettes for interactive entertainment applications, combining cartoon techniques with a particle system simulator that allows the representation of amorphous shapes such as clouds and smoke. Special attention should be paid to the visual complexity of this type of gaseous phenomena; therefore we use implicit surfaces in order to abstract and simplify this complexity [4][5]. To obtain the expressive appearance, we introduce an algorithm that enhances silhouette visualization within a cartoon rendering. For the simulation of smoke, we use a particle system based on Selle's [6] hybrid model.

2 Previous Work
Clouds are important elements in the modeling of natural scenes, both when we want to obtain high-quality images and for interactive applications. Clouds and smoke are gaseous phenomena that are very complicated to represent because of several issues: their fractal nature, the intrinsic difficulty of their animation and local illumination differences.
The representation of cloud shapes has been treated with three different strategies: volumetric clouds (explicit [7] or procedural [8]), billboards [9][10], and general surfaces [12][13]. The volume-based approach, in spite of the improvements in graphics hardware, is not yet possible at interactive speed because of the typical scene size and the level of detail required to represent the sky.
The impostor and billboard approach is the most widely used solution in video games and, although the results are suitable, their massive use slows down the visualization due to the great number of pixels that must be rendered.
On the other hand, the use of general surfaces allows efficient visualization; however, it generates overly coarse models for representing volumetric forms. Bouthors [11] extends Gardner's model [12][13] by using a hierarchy of almost-spherical particles related to an implicit field that defines a surface. This surface is later rendered to create a volumetric characteristic that provides realistic clouds.
In expressive rendering, relevant works on gaseous phenomena are scarce in the literature. The first works published in this field are from Di Fiore [14] and Selle [6], who tried to create streamlined animations of these phenomena. The approach of Di Fiore combines a variant of second-order particle systems to simulate the gaseous effect movement using 2D billboards drawn by artists, which are called basic visualization components.
Selle introduces a technique that facilitates the animation of cartoon-rendered smoke. He proposes to use a particle system whose movement is generated with the method presented by Fedkiw [15] for the simulation of photorealistic smoke. To achieve the expressive appearance, each particle is rendered as a disc in the depth buffer, creating a smoke cloud. In a second iteration of the algorithm, the silhouette of the whole smoke cloud is calculated by reading the depth buffer and applying the depth differences. This method obtains approximately one image per second and has been used by Deussen [16] for the generation of illustrations of trees.
McGuire [17] presents an algorithm for the real-time generation of cartoon-rendered smoke. He extends Selle's model by incorporating shading, shadows, and nailboards (billboards with depth maps). Nailboards are used to calculate intersections between smoke and geometry, and to render the silhouette without using the depth buffer. The particle system is based on work recently presented by Selle, Rasmussen, and Fedkiw [18], which introduces a hybrid method that generates synergies using Lagrangian vortex particle methods and Eulerian grid-based methods.

3 Scene Modeling
The rendering process necessarily requires an abstraction and simplification of the motif. This is evident in the generation of hand-drawn sketches, even more so when representing gases. By means of several strokes the artist adds detail to the scene, creating a convincing simplification of the object representation which can be easily recognized by the viewer. Our method provides the user with complete freedom to design the shape and the appearance of the cloud.
In a first approach, we propose to model clouds as static elements in the scene, the same way it normally happens in animation films. The process of modeling clouds begins with the user defining the particles p_i that comprise the cloud, each one having a center c_i, a radius r_i and a mass m_i.
Once the set of particles is defined, we perform the simplification and abstraction of the geometric model of the clouds. To calculate the implicit surface described by the whole particle set, we use the density function proposed by Murakami and Ichihara [19], and later used by Luft and Deussen [5] for the real-time illustration of plants in a watercolor style.
The influence of a particle p_i at a point q is described by a density function D_i(q) defined as:

D_i(q) = \left( 1 - \frac{\|q - c_i\|^2}{r_i^2} \right)^2        (1)

for \|q - c_i\| \le r_i, and D_i(q) = 0 otherwise.
In our model we include in the density function the mass m_i of each particle, which allows the user to weigh the influence of each particle in the calculation of the implicit surface. The modified density function is expressed as:

D_i(q) = m_i \left( 1 - \frac{\|q - c_i\|^2}{r_i^2} \right)^2        (2)

The implicit surface is generated from the summation of the density functions of the set:

F(q) = \sum_i D_i(q) - T.        (3)

Therefore, the implicit surface F(q) = 0 is defined by those points q where the summation of the density functions equals the threshold T. The radius r_i and the mass m_i of the particles, as well as the threshold T, are chosen empirically as they depend on the number and density of particles. Finally we triangulate the implicit surface and then we optimize it according to the level of subdivisions s_i chosen by the user. Fig. 1 and Table 1 provide a comparison between different levels of simplification.
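A direct transcription of (1)-(3) (our sketch; the particle data and threshold below are placeholders) shows how a point is classified against the implicit surface before triangulation:

```python
def density(q, c, r, m):
    """Eq. (2): D_i(q) = m_i * (1 - ||q - c_i||^2 / r_i^2)^2 inside the radius, 0 outside."""
    d2 = sum((qk - ck)**2 for qk, ck in zip(q, c))
    return m * (1.0 - d2 / (r * r))**2 if d2 <= r * r else 0.0

def field(q, particles, T):
    """Eq. (3): F(q) = sum_i D_i(q) - T; the implicit surface is F(q) = 0."""
    return sum(density(q, c, r, m) for (c, r, m) in particles) - T

# Placeholder example: two particles; a marching-cubes style mesher would sample field().
particles = [((0.0, 0.0, 0.0), 1.0, 8.0), ((0.8, 0.0, 0.0), 0.7, 2.0)]
inside = field((0.4, 0.0, 0.0), particles, T=0.05) >= 0.0
```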

Fig. 1. Abstraction and simplification of two cloud models (a), (b)


Table 1. Comparison of the triangle counts and the parameters used for the implicit surfaces

Figure   #particles   #tri implicit surface   T      mi   si
1(a)     12           2360                    0.05   8    22
1(b)     6            352                     0.6    2    11

4 Rendering
Using an implicit surface allows us to calculate the silhouette and to apply an illumination model for rendering. For silhouette detection and to achieve the cartoon appearance we use our previously published method [4]. Next, we describe the proposed method and discuss the visual results obtained thus far.
The detection algorithm allows silhouette extraction as an analytical representation at interactive frame rates. As opposed to the methods proposed by other authors [14][17][6], the analytical description of the silhouette can be used to create new textured polygons which improve the appearance of the model. Our system allows us to define the height and the scale of the new polygons that form the silhouette.
The main drawback of this algorithm is that we need to remove those polygons that have been generated for interior hidden edges of the original polygon. A simple solution to this problem draws the mesh during a second iteration of the algorithm, once the hidden polygons have been removed, as shown in Fig. 2, left.

Fig. 2. Composing the final image for silhouette-based rendering

Finally, we select the texture to apply to the polygons of the silhouette and the background image to compose the final image. In Fig. 2, right, we show the sketch of a cloud using this technique.
The illumination model used for the cartoon rendering allows a maximum of 32 gradations, which are applied to the mesh generated from the implicit surface as a 1D texture. The process of obtaining the final image is similar to the one described previously; however, in this case we do not use the mask but instead the polygonal mesh textured with the cartoon style, as shown in Fig. 3, left.


Fig. 3. Left, cartoon rendering image. Right, cartoon rendering with transparency.

Given the nature of gaseous phenomena, it may be interesting to be able to define transparency levels at the same time that cartoon rendering is applied. In this case it is necessary to generate the mask of the cloud and to introduce it in a third step, as shown in Fig. 3, right.

5 Details Layer for Clouds

Once the general aspect of the cloud is defined, it may be interesting to incorporate a greater level of detail to improve its appearance and to provide the sensation of greater volume. For this purpose, we propose to calculate a second implicit surface.
The calculation of the second implicit surface is made from the particles p_i defined by the user in the scene modeling process (Section 3). We reduce the value of the radius r_i and the mass m_i of each particle and we apply the density function D_i(q) again, creating an inner cloud.
We use this new implicit surface to calculate its silhouette. Since the positions of the particles used for its creation are the same for both surfaces, the second surface as well as its silhouette will initially be contained within the first surface.
Our system allows the user to independently modify the calculation parameters of both surfaces, making it feasible to triangulate both surfaces with different numbers of polygons. Moreover, the height and scale parameters of the silhouette can also be changed for each surface. Thus the polygons that form the silhouette of the inner surface may be visible and cover part of the outer surface, enhancing the appearance of the final image. The result obtained for the example cloud can be observed in Fig. 4.

Fig. 4. A cloud with two layers of detail


Because the second surface is only necessary to add detail through the outline of the silhouette, and it is inside the outer surface, it is not necessary to visualize it nor to use it as a mask.

6 Smoke Simulation
Each particle of real smoke has very little mass or volume. Therefore, the smoke simulation is, in fact, the simulation of the instability of the air that contains the smoke particles. Expressive rendering aims at obtaining, first of all, a convincing shape of the object. In the case of the amorphous shape of smoke, as with clouds, we use a particle set that is the base for calculating the surface used for rendering this effect.
Cloud models can be static; however, in the case of smoke it is necessary to have a dynamic particle system. Our model uses a simplified version of the proposal made recently by Selle et al. [18] for the particle system. It allows the user to adjust the parameters pertaining to wind, turbulence, environmental forces and vortices, among others.
The positions of the particles are calculated interactively according to the initial configuration defined by the user. Once the new positions are computed, we recalculate the implicit surface using the method described in Section 3. Then we calculate the silhouette and we render it as described in the previous sections.
In the real world, smoke particles dissipate according to their speed. Although speed is a more objective criterion, it is more convincing to base the animation on time. This approach allows us to keep the number of particles steady during the simulation process. In this way the speed of the smoke visualization process remains more or less constant.

Fig. 5. Time evolution of cartoon smoke

Fig. 6. Time evolution of sketch smoke


7 Results
The results obtained show a convincing imitation of hand-drawn sketches and drawings, although our approach is not strict in its physical foundations. We have given priority to the visual appearance with the purpose of simplifying the amount of information to represent while keeping the overall aspect and the capability of the user to identify the amorphous objects. With our approach we have obtained good results for interactive model rendering. Still, to obtain high-resolution images intended for printing with good quality, we must optimize the developed algorithms.
The performance of our method has been demonstrated on a PC platform, with an AMD Athlon 64 X2 3800+ processor and a GeForce 7950 512 MB graphics card, running Windows XP. Once we calculate the geometry of the objects to render, we set up different shape parameters for the clouds.
Different models have been created and different parameters have been applied, which entails the need to execute a different number of iterations of the algorithm according to the desired target.
Table 2. Rendering times of clouds and smoke

Figure   #particles   #tri implicit surface    #iterations   si   fps
2        6            352                      2             22   283
3(a)     47           2464                     2             21   88
3(b)     21           4724                     3             20   65
4        9            1928                     3             21   154
5        150          1600 < #tri < 2100       2             8    38
6        200          1400 < #tri < 1700       2             7    48

8 Conclusions and Future Work

We present several techniques that allow the representation of clouds and smoke with cartoon rendering and sketching. In contrast to the existing methods to date, our method provides results at interactive frame rates. The appearance of the gaseous phenomena is very stylized and incorporates a greater level of detail depending on the user preferences; the user can change several parameters affecting the results.
The temporal cost can be further improved by programming our functions on the GPU, which would also allow greater realism in the smoke simulation process. Also, it would be interesting to incorporate a behavior model to generate cloud particles with the purpose of generating animated sequences of their movement and metamorphosis. Finally, this method could also be enhanced by introducing multiresolution features that would improve the massive application of gaseous effects in computer graphics.

Acknowledgements
This work was partially supported by grant 05VI-1C02 of the University of Vigo, grant TIN2005-08863-C03 of the Spanish Ministry of Education and Science and by STREP project GameTools (IST-004363).


References
1. J. Buchanan and M. Sousa. The edge buffer: a data structure for easy silhouette rendering. In Proceedings of NPAR 00, (2000) 39-42
2. L. Markosian, M. Kowalski, D. Goldstein, S. Trychin, and J. Hughes. Real-time nonphotorealistic rendering. In Proceedings of SIGGRAPH 97, (1997) 415-420
3. R. Raskar and M. Cohen. Image precision silhouette edges. In Proceedings of I3D, (1999) 135-140
4. C. Campos, R. Quirós, J. Huerta, E. Camahort, R. Vivó, J. Lluch. Real Time Tree Sketching. Lecture Notes in Computer Science, Springer Berlin / Heidelberg, vol. 0302-9743, (2004) 197-204
5. T. Luft and O. Deussen. Real-Time Watercolor Illustrations of Plants Using a Blurred Depth Test. In Proceedings of NPAR 06, (2006)
6. A. Selle, A. Mohr and S. Chenney. Cartoon Rendering of Smoke Animations. In Proceedings of NPAR 04, (2004) 57-60
7. T. Nishita, E. Nakamae, Y. Dobashi. Display of clouds taking into account multiple anisotropic scattering and sky light. In Proceedings of SIGGRAPH 96, (1996) 379-386
8. J. Schpok, J. Simons, D. S. Ebert, C. Hansen. A real-time cloud modeling, rendering, and animation system. Symposium on Computer Animation 03, (2003) 160-166
9. Y. Dobashi, K. Kaneda, H. Yamashita, T. Okita, T. Nishita. A simple, efficient method for realistic animation of clouds. In Proceedings of ACM SIGGRAPH 00, (2000) 19-28
10. M. J. Harris, A. Lastra. Real-time cloud rendering. Computer Graphics Forum 20, 3, (2001) 76-84
11. A. Bouthors and F. Neyret. Modeling clouds shape. In Proceedings of Eurographics '04, (2004)
12. G. Y. Gardner. Simulation of natural scenes using textured quadric surfaces. In Computer Graphics, Proceedings of SIGGRAPH 84, 18, (1984) 11-20
13. G. Y. Gardner. Visual simulation of clouds. In Computer Graphics, SIGGRAPH 85, Barsky B. A., 19, (1985) 297-303
14. F. Di Fiore, W. Van Haevre, and F. Van Reeth. "Rendering Artistic and Believable Trees for Cartoon Animation", Proceedings of CGI 2003, (2003)
15. R. Fedkiw, J. Stam, and H. W. Jensen. Visual simulation of smoke. In Proceedings of SIGGRAPH 01, ACM Press, (2001) 15-22
16. O. Deussen and T. Strothotte. Computer-generated pen-and-ink illustration of trees. In Proceedings of SIGGRAPH 00, (2000) 13-18
17. M. McGuire, A. Fein. Real-Time Rendering of Cartoon Smoke and Clouds. In Proceedings of NPAR 06, (2006)
18. A. Selle, N. Rasmussen, R. Fedkiw. A vortex particle method for smoke, water and explosions. ACM Trans. Graph., (2005) 910-914
19. S. Murakami and H. Ichihara. On a 3D display method by metaball technique. Transactions of the Institute of Electronics, Information and Communication Engineers J70-D, 8, (1987) 1607-1615

Spherical Binary Images Matching


Liu Wei and He Yuanjun
Department of Computer Science and Engineering, Shanghai Jiaotong University,
Shanghai, 200240, P.R. China
liu-wei@sjtu.edu.cn

Abstract. In this paper a novel algorithm is presented to match spherical binary images by measuring the maximal superposition degree between them. Experiments show that our method can match spherical binary images in a more accurate way.
Keywords: spherical binary image, matching, icosahedron, subdivision.

1 Introduction
Spherical images play an important role in optics, spatial remote sensing, computer science, etc. Also, in the computer graphics community, spherical images frequently find applications in photorealistic rendering[1], 3D model retrieval[2], virtual reality[3] and digital geometry processing[4-5].
Compared with the various algorithms for planar image matching and retrieval, there is nearly no analogue for spherical images up to now. But on some occasions, one cannot avoid the task of matching spherical images. Since spherical binary images (SBIs) are used in the majority of cases, we focus our attention on them. In this paper we propose an effective method to match SBIs by measuring the maximal superposition degree between them.

2 Our Method
In our research, we assume that the similarity between SBIs can be measured by the maximal superposition degree between them, which also accords with the visual perception of human beings. The spherical surface is a finite but unbounded region; it has no border information and no start or end, which baffles the matching. The key to this problem is to obtain the result in a finite search space, where the error is certain to decrease as the search space increases. That is to say, we need to be able to explicitly control the error according to practical requirements. In this paper, we divide the SBIs into a number of equivalent regions and compare the difference between the corresponding regions of two SBIs, the sum of which gives the superposition in one orientation. Then we rotate one SBI around its center with a given rule and make another analogous calculation. After a finite number of comparisons, their similarity can be obtained by choosing the case with the most superposition.

As we know, regular polyhedra are uniform and have facets which are all of one kind of regular polygon, and thus good tessellations may be found by projecting regular polyhedra onto the unit sphere after bringing their center to the center of the sphere. A division obtained by projecting a regular polyhedron has the desirable property that the resulting cells all have the same shapes and areas. Also, all cells have the same geometric relationship to their neighbors. So if we adopt a regular polyhedron as the basis to divide the SBIs, their distribution will also satisfy the requirements of uniformity and isotropy. Since the icosahedron has the most facets of all five types of regular polyhedra, we adopt it as the basis in our realization.
The division of an SBI has a random orientation, which makes it difficult to compare. For simplification, we first investigate the method to compare two SBIs in the same orientation. As the SBIs have been divided into 20 equivalent spherical triangles, our method is to subdivide each into four smaller ones, according to the well-known geodesic dome construction, several times (Fig. 1) to form more compact trigonal grids. If the gray value of the SBI at the position of a grid's center is 1, we tag this grid with 1, and otherwise 0. Then we count the superposed grids with the same tags and calculate their proportion as the metric for similarity.

Fig. 1. The grids for an SBI in three resolutions

Suppose that the spherical icosahedron basis has been subdivided N times and let m be the number of superposed grids with the same tags; then the similarity S between the SBIs in this orientation can be defined as

S = \frac{m}{20 \cdot 4^N}.
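For two SBIs already discretized on the same oriented grid, S is a one-line computation (our sketch; the tag arrays are assumed to list one 0/1 value per triangular cell):

```python
def similarity_same_orientation(tags_a, tags_b, N):
    """S = m / (20 * 4^N), where m counts superposed cells with equal tags."""
    cells = 20 * 4**N
    assert len(tags_a) == len(tags_b) == cells
    m = sum(1 for a, b in zip(tags_a, tags_b) if a == b)
    return m / cells
```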

Comparison of SBIs in the same orientation only reflects the superposition degree in one situation, and therefore, to obtain an all-around matching, we have to search more orientations to get the actual similarity. The variation of orientation is achieved through two steps in our realization:
Firstly, we rotate the icosahedron basis of one SBI around its center to obtain another posture which is superposed with the prior one. Because the icosahedron has a symmetry group (SG) of rank 60, we need only rotate it 60 times to get the most superposition in the group, that is, the maximum superposition in one SG: S_G = \max S_i, i = 0, 1, \ldots, 60. Fig. 2 shows three orientations in an SG of an SBI as an example. Notice that the SBI is rotated along with the basis, too.


Fig. 2. Three orientations in one symmetry group, from which we can observe that the trigonal meshes are superposed (N = 2)

Secondly, although rotations and comparison in one symmetry group give a nearly approximate matching, the problem has not been completely solved yet. An undoubted fact is that the relation between the SBI and its icosahedron basis is randomly fixed at the beginning, which may create a non-optimal discretization of the SBI for matching. To solve this problem, we experiment on other relations, which may alter the shape or distribution of the discrete SBIs. Our method is to rotate the icosahedron basis while keeping the SBI fixed to obtain another SG. To ensure that all SGs are distributed uniformly and able to cover different angles to solve the rotation problem effectively, we adopt the relaxation mechanism proposed by Turk[6], which intuitively has each direction of an SG push around the other ones on the sphere by repelling neighboring directions; the most important step is to choose a repulsive force and a repulsive radius for the interval.
Suppose we need L different SGs; there are in total 60L rotations between the two SBIs. The average maximum error of the rotation angle A for two SBIs in longitude and latitude can be roughly estimated using the following formula:

\frac{360}{A} \times \frac{180}{A} = 60L \;\Rightarrow\; A = \sqrt{\frac{1080}{L}}.        (1)

The calculation is acceptable. Then we can decide the actual similarity between the SBIs as S_{max} = \max S_G^j, j = 0, 1, \ldots, L-1. Also, we can easily analyze and conclude that the whole time complexity of our algorithm is O(L \cdot 4^N).
Foremost, we evaluate the retrieval performance of these combinations. Assume that the query SBI belongs to the class Q containing k SBIs. The performance of each combination of parameters (N, L) can be evaluated using the percentage of the SBIs from the class Q that appear in the top (k-1) matches. As the query SBI is excluded from the computation, the success rate is 100% if (k-1) SBIs from the class Q appear in the top (k-1) matches. In our experiments, two parameters need to be decided: N and L. To balance all the influencing factors, we test 18 cases to decide the most appropriate combination, in which N ∈ {2, 3, 4} and L ∈ {10, 11, 12, 13, 14, 15}. Thus the number of grids for an SBI ranges from 320 and 1280 to 5120.


In the experiments, we test various kinds of SBIs; from each kind we choose five SBIs as queries and test the results. Table 1 lists the average performances of the 18 cases, from which we find that when N = 3 and L = 15 the algorithm obtains the best performance. Of course, since there is nearly no acknowledged benchmark and there are no related reports on SBI matching up to now, our experiments and performance results can only be regarded as an attempt.
Table 1. Average performances (%)

N \ L   10     11     12     13     14     15
2       31.3   33.4   33.7   34.1   34.9   35.4
3       34.1   34.9   35.4   36.3   37.1   37.9
4       33.2   34.1   34.5   35.7   36.4   37.2

As for the results, we give a brief analysis. Generally speaking, discretization at a high resolution will approximate the original SBIs in a more accurate way, but too fine a division will cause an explosion of information and data and also result in a general loss of description; as a result, a compromise must be considered. In addition, the performance is certain to improve along with the increase of L, as a bigger L leads to more adequate matching.

3 Conclusion and Further Work

In this paper, a tentative method is proposed to match spherical binary images based on the superposition degree. Experiments show fair performance, which preliminarily validates the idea. As for further work, global descriptors or feature vectors analogous to those for planar images should be extracted in advance to further support off-line matching for mass retrieval.
Acknowledgements. This research has been funded by the National Natural Science Foundation of China (60573146).

References
1. Sing, K.: A Survey of Image-based Rendering Techniques. Technical Report Series, CRL 97/4, Cambridge Research Laboratory
2. Kazhdan, M., Funkhouser, T.: Harmonic 3D shape matching. Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH Technical Sketch, Texas, (2002)
3. Shigang, L., Norishige, C.: Estimating Head Pose from Spherical Image for VR Environment. The 3rd IEEE Pacific Rim Conference on Multimedia, Morioka, (2002), 1169-1176
4. Wu, Y., He, Y., Tian, H.: Relaxation of spherical parametrization meshes. The Visual Computer, (2005), 21(8): 897-904
5. Rhaleb, Z., Christian, R., Hans, S.: Curvilinear Spherical Parameterization. The 2006 IEEE International Conference on Shape Modeling and Applications, Matsushima, 11-18
6. Turk, G.: Generating Textures on Arbitrary Surfaces Using Reaction-Diffusion. Computer Graphics, (1991), 25(4): 289-298

Dynamic Data Path Prediction in Network Virtual Environment

Sun-Hee Song, Seung-Moon Jeong, Gi-Taek Hur1, and Sang-Dong Ra2
Digital Contents Cooperative Research Center, Dongshin University
1 Dept. of Digital Contents, Dongshin University
2 Dept. of Computer Engineering, Chosun University
shsong@dsu.ac.kr

Abstract. This research studies real-time interaction and dynamic data shared through 3D scenes in networked virtual environments. In a distributed virtual environment with a client-server structure, consistency is maintained by the exchange of state information; as jerks occur due to packet delay when update messages of dynamic data exchanges are broadcast in a disorderly way, the network bottleneck is reduced by predicting the movement path using the Dead-reckoning algorithm. The shared dynamic data of the 3D virtual environment is implemented using the VRML EAI.
Keywords: net-VE, Dead-reckoning, Consistency.

1 Introduction
A Net-VE (Networked Virtual Environment)[1][2] is a system that connects distributed networks with virtual reality technology and offers a 3D space in which distributed multi-users cooperate and interact through real-time networking.
Consistency in a distributed virtual environment[3] with a client-server structure is maintained by the continuous exchange of state information among the distributed clients. The periodic transfer of state information causes network traffic overhead. The precise way for network users to know the others' states is to transfer packets by hand-shaking for each frame, which incurs a synchronization overhead and decreases the speed.
Based on the roles by which the dynamic data of the distributed multi-users is processed through multi-casting communication via the client-server and peer-to-peer servers, the network system in this study is composed of a message server and an application server, and it distributes the service load by allocating real-time data to the dynamic data server and non-real-time data to the static data server. When a new client connects to a 3D scene of the networked virtual space, it interpolates the prior location with the Dead-reckoning[4] path prediction algorithm of DIS (Distributed Interactive Simulation) to maintain consistency and presents the dynamic data sharing scene of the 3D virtual space.

2 Dynamic Data Path Prediction

2.1 Path Prediction Using the Dead-Reckoning Algorithm

When the current location x(t) is known, the location at time t + \Delta t, after movement for one cycle interval from time t at an average velocity, can be calculated as in expression (1). Based on the location of the shared object in expression (1), the object location at the current time can be estimated. The previous location is interpolated if the error between the estimated and the actual state values exceeds the predetermined threshold.

x(t + \Delta t) = x(t) + \dot{x} \Delta t,
y(t + \Delta t) = y(t) + \dot{y} \Delta t,
\dot{x} = V \cos\theta, \quad \dot{y} = V \sin\theta,        (1)

where V is the average velocity over the time [t, t + \Delta t] and \theta is the average direction angle over the time [t, t + \Delta t].

Fig. 1. Dead-reckoning convergence

Fig. 1 shows the Dead-reckoning convergence process. We can get more precise estimates by increasing the order of the estimate function in expression (2), but this results in more complex calculation. Therefore, we use second-order functions, i.e., first and second derivatives. We adjust the threshold of the Dead-reckoning convergence and control the state information transfer rate. The client that receives the discrete dynamic-data state information creates a continuously shared state using the shared state location convergence expression (2):

x(t) = x(t_0) + (t - t_0) \frac{dx(t)}{dt}\Big|_{t=t_0} + \frac{(t - t_0)^2}{2} \frac{d^2 x(t)}{dt^2}\Big|_{t=t_0} + \cdots        (2)

Fig. 2 shows the measurement of the actual location, the convergence location and the estimated location error when the path of the dynamic data, from the initial value (x, y, θ) = (1.5, 1.8, 70.0) to the value (xn, yn, θn) = (4.62, 5.64, 70.0), is set with velocity 2.4, acceleration 0, time stamp 2.0, and DR Intervals 0.75, 0.50, 0.10 and 0.05. When the actual location in Table 1 and the location prediction error by convergence width are measured at the point (xn, yn, θn) = (4.62, 5.64, 70.0), the estimated error rate gets smaller and it becomes possible to predict a location closer to the actual path when the Dead-reckoning convergence width is adjusted between 0.05 and 0.5, as shown in Fig. 2. As the location prediction interpolation error is 0 or -0.01 in (b) DR Interval = 0.10 and (c) DR Interval = 0.05, the dynamic data movement is not perceived at the client rendering. Although real-time rendering becomes more feasible as consistency increases, an interval of 0.10 makes it possible to send the location change information of the shared object to the other clients while maintaining a proper transfer rate, because the server load and frequent updates cause network bandwidth delays.

Fig. 2. Position Prediction of Dead-reckoning Convergence: (a) DR Interval = 0.50, (b) DR Interval = 0.10, (c) DR Interval = 0.05

The location interpolation consists of checking the error between the estimated and the real values and interpolating the previous location when the error exceeds the predetermined threshold of the Dead-reckoning convergence. If the threshold is large, the average transfer rate of state information is low, but the error of the shared state gets bigger. If the threshold is small, the average transfer rate and bandwidth usage increase, but the error of the shared state gets smaller. Pt0 and Vr denote the ESPDU location and velocity, respectively. Expression (3) interpolates the previous location of the entity linearly from the initial location at time t0, the time-stamped velocity, and the location estimate at t1:

Pt1 = Pt0 + Vr (t1 − t0).                                  (3)
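As an illustration of how a client might combine the Dead-reckoning estimate of expression (1) with the threshold test described above, the following Python sketch shows one possible implementation; the class name, method names and threshold parameter are illustrative assumptions, not part of the paper's VRML/EAI implementation.

import math

class DeadReckoning:
    """First-order Dead-reckoning estimator with a convergence threshold.
    Names and structure are illustrative, not taken from the paper."""

    def __init__(self, x, y, theta, velocity, threshold=0.10):
        self.x, self.y, self.theta = x, y, theta
        self.v = velocity
        self.threshold = threshold          # DR convergence width

    def estimate(self, dt):
        # Expression (1): extrapolate the last known state over dt.
        dx = self.v * math.cos(math.radians(self.theta))
        dy = self.v * math.sin(math.radians(self.theta))
        return self.x + dx * dt, self.y + dy * dt

    def update(self, actual_x, actual_y, dt):
        # Compare the extrapolated and the actual state; only when the error
        # exceeds the threshold is a new state packet needed and the stored
        # location interpolated toward the actual one.
        est_x, est_y = self.estimate(dt)
        error = math.hypot(actual_x - est_x, actual_y - est_y)
        if error > self.threshold:
            self.x, self.y = actual_x, actual_y
            return True                      # broadcast an update message
        self.x, self.y = est_x, est_y        # keep following the predicted path
        return False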

3 Conclusion
The dynamic data whose path is predicted by the Dead-reckoning algorithm interpolates the previous location with an interpolation node, transfers the shared-object state information, and maintains consistency with the other clients.
In the networked 3D virtual space, the movement path was predicted using the Dead-reckoning algorithm at the client buffer, because the congested broadcast of interaction and state information caused network delay and jerks. When the error between the estimated and the actual state values exceeded the threshold based on the shared-object location, the prior location was interpolated using the Dead-reckoning estimate function and the ESPDU packet of the DIS was multicast.
Fig. 3 is the 3D scene output through the client rendering engine in the network virtual space. The actual path of the dynamic data agent_A is the 'Actual Path', and the Dead-reckoning estimated location path is the 'DR path'. Because the dynamic data would move abruptly when the user who received the shared state updates the information, it does not jump directly to the client cache value, but moves along the 'Interpolation path' by the convergence interval.

Fig. 3. Dead-reckoning Apply of 3D Graphics Scene

References
[1] Singhal, S. and Zyda, M., 1999. Networked Virtual Environments: Design and Implementation, ACM Press [ISBN 0-201-32557-8].
[2] Bouras, C., Triantafillou, V. and Tsiatsos, T., 2001. Aspects of collaborative learning environment using distributed virtual environments. In Proceedings of ED-MEDIA, Tampere, Finland, June 25-30, pp. 173-178.
[3] Bouras, C., Psaltoulis, D., Psaroudis, C. and Tsiatsos, T., 2003. Multi user layer in the EVE distributed virtual reality platform. In Proceedings of the Fifth International Workshop on Multimedia Networks Systems and Applications (MNSA 2003), Providence, Rhode Island, USA, May 19-22, pp. 602-607.
[4] Cai, W., Lee, F.B.S., Chen, L., An auto-adaptive Dead-reckoning algorithm for distributed interactive simulation, in: Proceedings of the Thirteenth Workshop on Parallel and Distributed Simulation, 1999, pp. 82-89.

Modeling Inlay/Onlay Prostheses with Mesh Deformation Techniques

Kwan-Hee Yoo1, Jong-Sung Ha2, and Jae-Soo Yoo3

1
Dept. of Computer Education, Chungbuk National University, Korea
khyoo@chungbuk.ac.kr
2
Dept. of Game and Contents, Woosuk University, Korea
jsha@woosuk.ac.kr
3
School of EECE, Chungbuk National University, Korea
yjs@chungbuk.ac.kr

Abstract. This paper presents a method for effectively modeling the outer surfaces of inlay/onlay prostheses restoring teeth that are partially destroyed. We exploit two 3D mesh deformation techniques, direct manipulation free-form deformation (DMFFD) [9] and multiple wires deformation (MWD) [10], together with three kinds of information: standard teeth models, scanned mesh data from the plaster cast of a patient's tooth, and a functionally guided plane (FGP) measuring the occlusion of the patient's teeth. Our implementation can design inlay/onlay prostheses by setting up various parameters required in dentistry while visualizing the generated mesh models.
Keywords: Prostheses modeling, inlay/onlay, mesh deformation.

1 Introduction
Many artificial teeth prostheses are composed of cores and crowns [1]: the cores directly contact the abutment to increase the adhesive strength to the crowns, which are revealed to outside sight when the artificial prostheses are put in. Inlays/onlays can be regarded as a kind of single crown, used for reconstructing a single tooth that is partially destroyed. In general, a tooth adjoins adjacent teeth and also contacts teeth on the opposite side when the upper and lower jaws occlude. The adjoining surfaces on the adjacent side are called adjacent surfaces, and the contact surfaces on the opposite side during occlusion are called occlusal surfaces. An inlay is a prosthesis fabricated when small dental caries or established prostheses are on the two surfaces, while an onlay is a prosthesis fabricated when the cusp on the tongue side is sound but other parts are destroyed.
In modeling inlays/onlays with CAD/CAM and computer graphics techniques, the most important issue is how to model their 3D shapes exactly as dentists want to form them. That is, the adhesive strength to the abutment must be maximized. Furthermore, appropriate adjacency with neighboring teeth and accurate occlusal strength against the opposite tooth have to be guaranteed. Previous research on modeling inlays/onlays can be divided into two categories: 2D image-based [2-5] and 3D mesh-based [6,7].

Our method adopts a mesh-based modeling approach similar to the GN-1 system [7]. In this paper, however, unlike the GN-1 system's reliance on a side view from 3D scanners for producing mesh models, an inlay/onlay is modeled by dividing its surface into two parts: an inner surface adhering to the abutment and an outer surface revealed to outside sight. The inner surfaces of inlays/onlays are modeled as in the results of Yoo et al. [8] with the 2D Minkowski sum: a new model is computed as the expansion of a terrain model with expansion values given by users. This paper focuses on modeling the outer surface, which is the union of two subparts, the adjacent and occlusal surfaces, by deforming the corresponding standard tooth according to the inherent features of each tooth.

2 Modeling the Outer Surfaces of Inlays/Onlays

The standard teeth models include the information of axes and geometric features for all teeth in the upper and lower jaws. First, the standard teeth are transformed and aligned to the patient's teeth by referencing the arrangement information of the former, such as adjacent points, tongue-side points and lingual-side points, and the positional information of the latter. Then, adjacent surfaces are generated by applying the technique of direct manipulation free-form deformation (DMFFD) [9] to a standard tooth considering the contact points. On the other hand, occlusal surfaces are generated by applying the technique of multiple wires deformation (MWD) [10] to the two corresponding polygonal lines that are extracted, respectively, from a standard tooth and the FGP.
The DMFFD [9] is an extended version of free-form deformation (FFD) [11] that directly controls points on the mesh for the deformation. For an arbitrary point X and a set P of control points, the deformation equation is defined in the matrix form X = BP. Here the matrix B is obtained from the B-spline blending functions with the three parametric values determined from the given X. The transformed point X' is then represented as B(P + ΔP), that is, ΔX = BΔP. For moving a given point X by the amount ΔX, the amount ΔP by which the control points must move can be computed inversely as

ΔP = B⁺ ΔX.                                  (1)

In the above equation, B⁺ is a pseudo-inverse of the matrix B. If we apply FFD to X with the computed ΔP, X is transformed into X'. Hence, it is possible to deform a mesh intentionally if we apply FFD to all vertices of the mesh after computing ΔP for each vertex with the same method.
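As a minimal numerical sketch of equation (1), assuming a B-spline blending matrix B has already been assembled for the constrained points (the names B, delta_X and the use of NumPy are assumptions for illustration, not the authors' implementation), the control-point displacement can be obtained with a pseudo-inverse:

import numpy as np

def dmffd_control_displacement(B, delta_X):
    """Solve delta_P = B+ * delta_X (equation (1)).
    B: (k, n) blending matrix for k constrained points over n control points.
    delta_X: (k, 3) desired displacements of the constrained mesh points.
    Returns (n, 3) displacements to apply to the FFD control points."""
    B_pinv = np.linalg.pinv(B)            # Moore-Penrose pseudo-inverse B+
    return B_pinv @ delta_X

# Toy usage: one constrained point influenced by four control points.
B = np.array([[0.1, 0.4, 0.4, 0.1]])
delta_X = np.array([[0.0, 0.0, 0.5]])     # push the point 0.5 along z
delta_P = dmffd_control_displacement(B, delta_X)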
The deformation technique of multiple wires deformation (MWD) [10] is used for more naturally deforming the wired curves representing the geometric features of cusp, ridge, fissure, and pit, which are extracted after scanning the FGP. A wired curve is represented as a tuple <W, R, s, r, f>, where W and R are free-form parametric curves that are identical in the initial state, s is a scalar for adjusting the radial size of the curve circumference, r is a value representing the range of effect around the curve, and f is a scalar function f: R+ → [0,1]. The function f guarantees at least C¹ continuity and satisfies the properties f(0) = 1, f(x) = 0 for x ≥ 1, and f'(1) = 0. Our implementation uses the C¹-continuous function f(x) = (x² − 1)², x ∈ [0,1], as in [10]. As R is deformed into W, an arbitrary point p near R is deformed accordingly. Let pR be the point on R nearest to p, and pW the corresponding point on W. Then pR and pW have the same curve parameter value. When W is deformed, the point p moves to p' as

p' = p + (pW − pR) f(x).                                  (2)

In Equation (2), f(x) depends on three parameters R, p, and r, where r represents the range of effect. Generally, we define x = ||p − pR|| / r. We can also move the point p to p' by deforming W with an expansion parameter s that changes the wire size:

p' = p + (s − 1)(p − pR) f(x) + (pW − pR) f(x).                                  (3)

Clearly, the above equation has the property that the expansion parameter s moves the point p in the direction p − pR. This principle of wire deformation is extended for deforming multiple wires. Let Δpi be the variation of p when the wire Wi is deformed. Then, the point p' obtained by deforming all wires Wi, i = 1, …, n, is written as

p' = p + ( ∑i=1..n Δpi fi(x)^m ) / ( ∑i=1..n fi(x)^m ).                                  (4)

The parameter m is used for locally controlling the shapes of the multiple wires, i.e., it controls the effects of Wi and si during the deformation. For example, the effects of Wi and si increase rapidly with increasing m when fi(x) approaches 1.
In modeling the occlusal surfaces, Ri is the curve interpolating all points of the geometric feature lines of cusp, ridge, fissure, and pit. The wired curve Wi corresponding to Ri is determined by more involved computations: for each segment Li of the polygonal lines, we compute the intersection line segment Li' between the FGP and a z-axis-parallel plane passing through Li, cut Li' so that it has the same x- and y-coordinates as the end points of Li, and finally obtain a curve interpolating all points of the cut line segments. Since the two curves interpolate the same number of points, we can get the parametric value of the curves for any point on Wi. In our implementation, we use the Catmull-Rom curve for the interpolating curves, and the function f suggested by Singh et al. [10]. Our implementation assigns the values 1, 5, and 1 to si, ri and m, respectively. For all points p on the standard tooth, W, and R, we compute pR and pW and then obtain Δpi with Equations (2) and (3). By applying Equation (4) to the Δpi of all wired curves and f, we get the finally deformed point p'.
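The following Python sketch illustrates the single-wire step of equations (2)-(3) and the multi-wire blend of equation (4); the sampled-curve representation, the nearest-point search and all function names are simplifying assumptions for illustration, not the authors' implementation.

import numpy as np

def f(x):
    # C1-continuous falloff used in the paper: f(x) = (x^2 - 1)^2 on [0, 1], else 0.
    return (x * x - 1.0) ** 2 if x < 1.0 else 0.0

def wire_offset(p, R_pts, W_pts, s, r):
    """Displacement of point p for one wire (equations (2)-(3)).
    R_pts, W_pts: sampled reference and deformed wire curves, same parameterization."""
    d = np.linalg.norm(R_pts - p, axis=1)
    i = int(np.argmin(d))                 # nearest sample on R
    pR, pW = R_pts[i], W_pts[i]
    fx = f(d[i] / r)
    return (s - 1.0) * (p - pR) * fx + (pW - pR) * fx, fx

def deform_point(p, wires, m=1.0):
    """Blend the per-wire displacements as in equation (4).
    wires: list of tuples (R_pts, W_pts, s, r)."""
    num, den = np.zeros(3), 0.0
    for R_pts, W_pts, s, r in wires:
        dp, fx = wire_offset(p, R_pts, W_pts, s, r)
        num += dp * fx ** m
        den += fx ** m
    return p if den == 0.0 else p + num / den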


3 Experiments and Future Works

Our system for modeling inlays/onlays is implemented using Microsoft Foundation Classes (MFC) 6.0 and the OpenGL graphics library on a PC. Fig. 1 illustrates the designed outer surface for an onlay.

Fig. 1. Designing the outer surface for an onlay: (a) a standard tooth model, (b) a scanned FGP model, (c) a finally designed onlay, and (d) an onlay put in on the abutment

For more accurate modeling, several parameters can be set up through a simple interface while the designed inlays/onlays are visualized. Currently, our implementation obtains the adjacent points manually for designing the adjacent surfaces of inlays/onlays. In the future, an automatic method for determining such adjacent points needs to be developed. It is another research subject to simulate the teeth occlusion by using the FGP and the geometric features of teeth.
Acknowledgements. This work was partially supported by the Korea Research
Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2006D00413).

References
1. Yoon, C.G., Kang, D.W., Chung, S.M.: State-of-arts in Fixed Prosthodontics, Jongii Press,
Korea (1999)
2. Myszkowski, K., Savchenko, V.V., Kunii, T.L.: Computer modeling for the occlusal
surface of teeth, Proc. of Conf. on Computer Graphics International (1996)
3. Savchenko, V.V., Schmitt, L.M.: Reconstructing occlusal surfaces of teeth using a genetic
algorithm with simulated annealing type selection, Proc. of Solid Modeling (2001) 39-46
4. Yoo, K.Y., Ha, J.S.: User-Steered Methods for Extracting of Geometric Features in 3D
Meshes, Computer-Aided Design and Applications, Vol. 2 (2005)
5. Sirona Corporation, http://www.sirona.com (1985)
6. Nobel Digital Process Corporation, Procera Systems, Nobel Digital Process Corporation,
Sweden (2001)
7. GC Corporation, GN-I Systems, GC Corporation, Japan (2001)
8. Yoo, K.Y., Ha, J.S.: An Effective Modeling of Single Cores prostheses using Geometric
Techniques, Journal of Computer-Aided Design, Vol. 37, No. 1 (2005)
9. Hsu, W.M., Hughes, J.F., Kaufman, H.: Direct manipulation of free-form deformations, In
Computer Graphics (SIGGRAPH '92), Vol. 26 (1992) 177-184
10. Singh, K., Fiume, E.: Wires: A Geometric Deformation Techniques, SIGGRAPH (1998)
11. Sederberg, T., Parry, S.: Free-form deformation of solid geometric models, In Computer
Graphics (SIGGRAPH86) (1986) 151-160

Automatic Generation of Virtual Computer Rooms on the Internet Using X3D

Aybars Ugur1 and Tahir Emre Kalaycı2

1
Ege University, 35100 Bornova-Izmir, Turkey
aybars.ugur@ege.edu.tr
2
tahir.kalayci@ege.edu.tr
http://yzgrafik.ege.edu.tr

Abstract. In this paper, some natural links between virtual reality and interactive 3D computer graphics are specified, and Web3D technologies, especially VRML and X3D, are briefly introduced. A web-based tool called EasyLab3D was designed and implemented using X3D. This tool automatically generates 3D virtual computer rooms that can be navigated in. It is useful for introducing departments, companies and organizations that have computer laboratories, and for planning new computer rooms. Finally, state-of-the-art technologies and methods in the development of automatic 3D scene and model generation tools are discussed.

Introduction

Sherman and Craig [1] define VR (Virtual Reality) as a medium composed of interactive computer simulations that sense the participant's position and actions and replace or augment the feedback to one or more senses, giving the feeling of being mentally immersed or present in the simulation. According to them, the key elements of experiencing VR are a virtual world, immersion, sensory feedback, and interactivity. It is the use of computer graphics systems in combination with various display and interface devices to provide the effect of immersion in an interactive 3D computer-generated environment. We call such an environment a virtual environment [2].
High-quality interactive 3D content on the web without bandwidth and platform limitations allows internet users to become fully immersed in realistic 3D worlds. Many Web3D technologies have been developed to give people real-time, three-dimensional and interactive computer graphics on the web.
In this study, we designed and implemented an X3D-based tool (EasyLab3D) that automatically generates virtual computer rooms on the web and lets users navigate them, using state-of-the-art technologies.

Extensible 3D (X3D)

Web3D is simply 3D graphics on the web. VRML-NG (X3D) arose in 1999 as a result of efforts to carry 3D to all environments. X3D, developed by the Web3D Consortium, is the ISO standard for real-time 3D computer graphics. XML was adopted as the syntax for X3D in order to solve a number of real problems of VRML. According to Yumetech president Alan Hudson, the reasons to use XML as the syntax are to interoperate with the Web and to incorporate new graphics technologies in a standardized way (http://www.xml.com/pub/a/2003/08/06/x3d.html). The main X3D features are extensions to VRML (e.g., Humanoid Animation, NURBS, GeoVRML, etc.), the ability to encode the scene using an XML syntax as well as the Open Inventor-like syntax of VRML97, and enhanced application programmer interfaces.
Blais et al. [3] present a Web3D application for military education and training. Patel et al. [4] describe an innovative system designed for museums to create, manage and present multimedia-based representations of museum artifacts in virtual exhibitions both inside and outside museums. Barbieri et al. [5] developed a computer science virtual museum that can be visited online and that also contains simple interactive games illustrating basic principles of computer science. Some other projects based on X3D are explained in [6] and [7].

Automatic Computer Room Generation Tool

We developed the web-based Automatic Virtual Computer Room Generation Tool EasyLab3D (http://yzgrafik.ege.edu.tr/projects/easylab3d/) using Java and Xj3D. Features such as navigation in 3D room scenes and realistic presentation are important for introducing computer laboratories to visitors or internet users. The tool can also be used for designing new labs by providing 3D previews. The 3D model of a computer laboratory (Ege Lab) is generated using EasyLab3D in a few seconds (Fig. 1).

Fig. 1. Ege Lab in our department generated by EasyLab3D (Perspective View)

The 3D models (table/desk, room) required by the program were developed using Flux Studio 2.0 Beta (http://www.mediamachines.com/make.php). All models were developed as prototypes to reduce the file size
and to use an object many times without coding the same things repeatedly. The created model prototypes are stored in web space (http://yzgrafik.ege.edu.tr/elab/modeller/) for online use.
Scenes are generated using our layout algorithms; prototype instances are created with the calculated transformations and put in a temporary file on the fly. Thus the user can save the generated scene as a file and publish it on a web site. The user can also examine these files later using other browsers and plug-ins. Scenes are created on the fly and shown in the Xj3D browser using the Java programming language. Xj3D makes it possible to change X3D scenes dynamically using SAI (Scene Access Interface, http://www.xj3d.org/tutorials/general_sai.html) technology. Java Web Start technology (http://java.sun.com/products/javawebstart/) is used to make the program easier to run and to download all required libraries automatically.
Automatic generation algorithms for the most popular rectangular computer room layouts are implemented in this project. Computers are placed on the tables in the same order. Computer Room Layout 1 has equally sized gaps between computer tables, while Layout 2 includes one corridor and has two computer blocks, one at the left side of the room and one at the right side, as shown in Fig. 2. Room parameters given by the user are width (sizeX), length (sizeY), height and distance to the board (gapY, minimum feasible distance between the board and the tables). Table/desk parameters are width (sizeX), length (sizeY) and height. Other parameters are Desk Count (room.tableCount), Sitting Gap (table.gapY, feasible distance between tables in the Y axis so that a computer user works easily) and Desk Gap (table.gapX, gap between tables in the X axis).

Fig. 2. Computer Room Layout 1 (on the left) and Layout 2 (on the right)

A Java-like algorithm generates the 3D model of a computer room for Layout 1:

// Calculating table capacity of the room width (X)
room.tCountX = (room.sizeX + table.gapX) / (table.sizeX + table.gapX);
// Calculating table capacity of the room length (Y)
room.usableLength = room.sizeY - room.gapY;
room.tCountY = (room.usableLength + table.gapY) / (table.sizeY + table.gapY);
// Maximum number of tables that fit; clamp the requested count
room.maxTableCount = room.tCountX * room.tCountY;
if (room.tableCount > room.maxTableCount)
    room.tableCount = room.maxTableCount;
// Placement
foreach (table in room) calculatePosition(table);
Tables and computers exceeding the room capacity are not inserted into the scene, as specified in the algorithm.
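The pseudocode above leaves calculatePosition unspecified; the following Python sketch shows one plausible grid placement for Layout 1 under the stated parameters. The function name, argument structure and the row-major ordering are assumptions for illustration, not the authors' implementation.

def layout1_positions(room, table):
    """Return (x, y) positions of table origins for Layout 1.
    room and table are dicts holding the parameters named in the text
    (sizeX, sizeY, gapY, tableCount, gapX, ...)."""
    count_x = int((room["sizeX"] + table["gapX"]) // (table["sizeX"] + table["gapX"]))
    usable_y = room["sizeY"] - room["gapY"]
    count_y = int((usable_y + table["gapY"]) // (table["sizeY"] + table["gapY"]))
    n = min(room["tableCount"], count_x * count_y)   # clamp to room capacity

    positions = []
    for k in range(n):
        i, j = k % count_x, k // count_x             # column, row in the grid
        x = i * (table["sizeX"] + table["gapX"])
        y = room["gapY"] + j * (table["sizeY"] + table["gapY"])
        positions.append((x, y))
    return positions

# Example: a 10 x 8 m room with 1.2 x 0.8 m desks.
room = {"sizeX": 10.0, "sizeY": 8.0, "gapY": 2.0, "tableCount": 20}
table = {"sizeX": 1.2, "sizeY": 0.8, "gapX": 0.5, "gapY": 1.0}
print(layout1_positions(room, table))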

Conclusion

New-generation 3D languages, APIs and key Web3D technologies (X3D, Java 3D) offer possibilities for creating and manipulating 3D objects and scenes easily. X3D provides support for inserting 3D shapes, text and objects into a scene, for transforming objects, for modifying object attributes, and for other operations such as grouping, animation and illumination. X3D also exploits XML's advantages such as rehostability, page integration, integration with next-generation web technologies, and extensive tool-chain support. Developments in Web3D technologies and authoring tools are important for the future of VR.
An increasing number of easy-to-use web-based 3D graphics projects like EasyLab3D will make 3D graphics a natural part of the web and will improve web quality. A 3D interactive model of a computer lab conveys more than a series of 2D pictures and is also more enjoyable and realistic. Companies and organizations can use this tool to generate 3D models of their labs by giving only a few parameters. Internet users can access and navigate these lab models. The automatic generation tool is also useful for the interior design of new labs by providing 3D previews.

References
1. Sherman, W.R., Craig, A.B.: Understanding Virtual Reality: Interface, Application, and Design. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2002)
2. Sowizral, H.A., Deering, M.F.: The Java 3D API and virtual reality. IEEE Computer Graphics and Applications 19 (1999) 12-15
3. Blais, C., Brutzman, D., Horner, D., Nicklaus, S.: Web-based 3D technology for scenario authoring and visualization: The SAVAGE project. In: I/ITSEC. (2001)
4. Patel, M., White, M., Walczak, K., Sayd, P.: Digitisation to presentation - building virtual museum exhibitions. In: VVG. (2003) 189-196
5. Barbieri, T., Garzotto, F., Beltrame, G., Ceresoli, L., Gritti, M., Misani, D.: From dust to stardust: A collaborative 3D virtual museum of computer science. In: ICHIM (2). (2001) 341-345
6. Hetherington, R., Farrimond, B., Presland, S.: Information rich temporal virtual models using X3D. Computers & Graphics 30 (2006) 287-298
7. Yan, W.: Integrating web 2D and 3D technologies for architectural visualization: applications of SVG and X3D/VRML in environmental behavior simulation. In: Web3D '06: Proceedings of the eleventh international conference on 3D web technology, New York, NY, USA, ACM Press (2006) 37-45

Stained Glass Rendering with Smooth Tile Boundary

SangHyun Seo, HoChang Lee, HyunChul Nah, and KyungHyun Yoon

ChungAng University,
221, HeokSuk-dong, DongJak-gu, Seoul, Korea
{shseo,fanpanic,hcnah,khyoon}@cglab.cse.cau.ac.kr
http://cglab.cse.cau.ac.kr

Abstract. We introduce a new glass tile generation method for simulating stained glass, using a region segmentation algorithm and cubic spline interpolation. We apply Mean Shift segmentation to a source image to extract the shapes of glass tiles. We merge regions according to user input and use morphological operations to remove invalid shapes. To shape the glass tiles, we apply cubic spline interpolation and obtain the leading and regions with smooth boundaries. Next, we re-segment the regions using the spline curves. Finally we apply transformed colors to each region to create the whole set of glass tiles.

Introduction

This study aims to make stained glass images that look as if they were produced manually by artists, using a 2D image as the input. Stained glass rendering is a field of NPR (Non-Photorealistic Rendering) and is very different from traditional realistic rendering. While the rendering primitive of realistic rendering is a pixel, that of stained glass rendering is a region, a collection of pixels. Therefore the output image may vary according to the size and the shape of the regions. In this paper, we introduce a new method to create glass tiles. Stained glass is made by cutting and pasting glass, so the unit of stained glass is a glass tile. As conventional algorithms create glass tiles simply using region segmentation, they cannot achieve a stained-glass feeling when the region segmentation is not appropriate. In order to resolve this problem, we interpolate the boundaries between the regions and re-segment each segmented area to create the regions that compose the glass tiles.

Related Work

Although many different studies have been performed since research on NPR was started by Strothotte [1], only recently have there been many attempts to simulate stained glass using computer technology.

In Photoshop, the Stained Glass filter is one of them. The Stained Glass filter of Photoshop basically makes the image using a Voronoi diagram with random Voronoi sites (Hoff [5]).
Mould [2] approached the stained glass problem in a different way. His method divides the input image using color-based region segmentation, and then smooths the segmented regions by applying morphological operations. However, the regions created by Mould's [2] method are far from formative structures.

3 Glass Tile Generation for the Stained Glass

3.1 Region Generation of Glass Tiles

It is very important to extract segments that can be expressed as glass tiles. In this study, we used the Mean Shift segmentation algorithm (Comaniciu [4]) to generate the basic segmented regions from which the glass tiles are created (Fig. 1(b)). Additionally, as regions with unexpected shapes can result from the segmentation algorithm, we allow the user to merge regions through input (Fig. 1(c)).

Fig. 1. Stained glass rendering process (a) Input image (b) Mean shift segmentation
(c) Region merge (d) Morphological operation (e) Region boundary interpolation (f)
Region re-division (g) Rendered image after color transform

3.2 Interpolation of the Region Boundaries

In order to refine each segment, we used morphological operations (Fig. 1(d)) and cubic spline interpolation (Fig. 1(e)), because the extracted regions include improperly shaped regions created by the Mean Shift segmentation algorithm and rough, noise-like boundaries. Although Mould [2] tried to resolve these two problems using morphological operations only, this study uses them to remove the improperly shaped segments and uses cubic spline interpolation to smooth the rough boundaries.


Next, to smooth the boundaries of the cleaned-up regions, we applied cubic spline interpolation. To calculate the cubic spline interpolation we need to select control points. To select a control point, we defined a control-point distance and then selected the point on the boundary at the pre-set distance from the current control point (Fig. 2(a)).

Fig. 2. (a)Search control points for spline interpolation (b)Region re-segmentation process (c)Re-segmentation result

We applied the same method to create the leading that fills the gaps between glasses in actual stained glass during the interpolation process: the parts that did not form a segment are extracted as leading by creating and applying spline curves smaller than the basic segments.
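As a small illustration of this boundary-smoothing step, the sketch below picks control points at a fixed arc-length spacing along a closed region boundary and fits a periodic cubic spline through them; the use of SciPy's CubicSpline, the spacing parameter and the sampling density are assumptions for illustration, not the authors' implementation.

import numpy as np
from scipy.interpolate import CubicSpline

def smooth_boundary(boundary, spacing=10.0, samples=400):
    """boundary: (N, 2) array of ordered pixel coordinates of a closed region.
    Returns a densely sampled smooth closed curve through control points
    chosen roughly every `spacing` pixels of arc length."""
    d = np.linalg.norm(np.diff(boundary, axis=0), axis=1)
    arc = np.concatenate(([0.0], np.cumsum(d)))           # arc length at each vertex
    picks = np.searchsorted(arc, np.arange(0.0, arc[-1], spacing))
    ctrl = boundary[np.unique(picks)]
    ctrl = np.vstack([ctrl, ctrl[0]])                      # close the loop
    t = np.linspace(0.0, 1.0, len(ctrl))
    cs_x = CubicSpline(t, ctrl[:, 0], bc_type="periodic")
    cs_y = CubicSpline(t, ctrl[:, 1], bc_type="periodic")
    u = np.linspace(0.0, 1.0, samples)
    return np.stack([cs_x(u), cs_y(u)], axis=1)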
3.3 Re-segmentation of the Region

To obtain the formative beauty seen in actual stained glass, we re-segment large regions (Fig. 1(f)). In the re-segmentation process, we generate a random curve and re-segment the region along that curve. First, we select a random point in the region and identify two points on the boundary near the selected point to use as control points. Based on the three selected points, we create a curve using cubic spline interpolation (Fig. 2(b)). We limit the number of control points to three because using too many points on a segment would bend the curve too much, and it is inappropriate to apply an over-bent curve to stained glass rendering (Fig. 2(c)).
3.4 Determination of the Colors for Each Region

Mould [2] converted the colors in the input image to colors that could have been used in the Middle Ages to designate the colors of the glass tiles. We applied the same method in this study. Through this process, we obtain strong color contrast effects (Fig. 1(g)).


Conclusion

In this study, we created smooth glass tile shapes similar to those in actual stained glass by interpolating the boundaries between the segments, and then created the frame-shaped leading with irregular thickness that is found between the glass tiles. Additionally, we created the formative characteristic of composing a meaningful segment by gathering small glass tiles through re-segmentation of the segments. We also expressed strong color contrast through the color conversion process (Fig. 3). To emphasize the formative shapes, we highlighted the boundaries before re-segmentation with thick lines. As stained glass is mainly used in windows, we applied a round light source effect to the image to obtain lighting effects. Fig. 3 shows the comparison between the image after the light source effect application and the image from Mould [2]. Actual stained glass is made of colored glass.

Fig. 3. Result Images

References
1. Thomas Strothotte and Stefan Schlechtweg: Non-Photorealistic Computer Graphics: Modeling, Rendering and Animation, (2002), Morgan Kaufmann, ISBN: 1-558-6078-70
2. David Mould: A Stained Glass Image Filter. In the proceedings of the 14th EUROGRAPHICS Workshop on Rendering, pp. 20-25
3. Grodecki, L., Brisac, C.: Gothic Stained Glass. Thames and Hudson, London, (1985)
4. Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Machine Intell, 24, 4 (2002), 603-619
5. Hoff, K., Keyser, J., Lin, M., Manocha, D. and Culver, T.: Fast Computation of Generalized Voronoi Diagrams Using Graphics Hardware. In the proceedings of SIGGRAPH 99: 277-286
6. Gonzalez, Woods: Digital Image Processing, Addison Wesley, (1993)
7. Adam Finkelstein, Marisa Range: Image Mosaics, Technical Report of Princeton Univ., (1998)

Guaranteed Adaptive Antialiasing Using Interval Arithmetic

Jorge Florez, Mateu Sbert, Miguel A. Sainz, and Josep Vehí

Institut d'Electrònica, Informàtica i Automàtica
Universitat de Girona, Girona 17071, Spain
jeflorez@eia.udg.es

Abstract. Interval arithmetic has been used to create guaranteed intersection tests in ray tracing algorithms. Although those algorithms improve the reliability of the visualization of implicit surfaces, they do not provide an alternative to point sampling inside the pixel. In this paper, we develop an interval adaptive antialiasing algorithm (IAA) by studying the coherence of sets of rays crossing a pixel (instead of individual rays) in order to detect variations over the hit surface. This method allows us to obtain better visualizations than traditional interval ray tracing algorithms.
Introduction

Ray tracing of implicit surfaces suffers from accuracy problems, related to thin features that disappear when some special surfaces are rendered. This occurs because computers cannot guarantee the robustness of floating-point operations during the intersection test [3,6]. Many authors have proposed reliable ray tracing algorithms that perform guaranteed intersection tests based on interval arithmetic [2,7,8].
However, those authors do not propose a reliable way to reduce aliasing in the visualization of the surfaces. An alternative is to use adaptive sampling [9]. In this technique, rays are traced for every corner of the pixel. If the values are too different, the pixel is subdivided and new rays are traced at the new corners. Because it is still possible to miss thin parts of the surface, this method uses bounding boxes for small objects. If a ray intersects a bounding box, the sampling rate is increased to guarantee that view rays do not miss the object. Although effective in most cases, this technique does not work very well with long thin objects [4].
Other approaches are based on gathering information from adjacent rays, such as cone tracing [1] and beam tracing [5]. The main disadvantage of those proposals is that they require computationally complex intersection tests.
This paper introduces a method called interval adaptive antialiasing (IAA) with the following characteristics:
- The method examines areas of the pixel instead of points, as in point sampling. Interval arithmetic is used to guarantee that small parts of the surface inside the pixel are not missed. The whole set of rays that covers an area of the pixel is treated as a single ray.
- The information obtained from sets of rays is studied to determine whether the area covered by the rays presents too much variation over the surface.
- The method does not require bounding boxes to detect small features, as adaptive sampling does. Also, the complexity of the intersection test is almost the same as in traditional interval ray tracing.

Interval Adaptive Antialiasing (IAA)

The intersection between the implicit function f(x, y, z) = 0 and a ray defined by

x = sx + t(xp − sx);  y = sy + t(yp − sy);  z = sz + t(zp − sz)

is defined by the function

f(sx + t(xp − sx), sy + t(yp − sy), sz + t(zp − sz)),

where (sx, sy, sz) are the coordinates of the origin or view point, (xp, yp, zp) are the values of a point on the screen, and t indicates the magnitude in the direction of the ray. If the parameter t is replaced with an interval T, a set of real values can be evaluated instead of a single value. To cover pixel areas instead of points, the real values xp and yp on the screen must be considered as interval values too. The function including the new interval values can be defined as follows:

F(Xp, Yp, T) = f(sx + T(Xp − sx), sy + T(Yp − sy), sz + T(zp − sz))                                  (1)

To perform the evaluation with equation (1), the intervals Xp and Yp must be fixed to a range of values inside the pixel, and a bisection process must be started over the parameter T. Every interval generated by the subdivision of T is evaluated to determine whether the set of rays intersects the surface. If the set of rays does not intersect any part of the implicit surface, then the result of the evaluation of equation (1) does not contain zero (0 ∉ F(Xp, Yp, T)). Otherwise, it is possible that one or more rays in that pixel area intersect the surface. In that case, the parameter T must be subdivided until machine precision is reached.
To save the values of T near the intersection of the set of rays, the following process is performed: when 0 ∈ F(Xp, Yp, T) and F(Xp, Yp, T.Inf) > 0, the infimum of the result (the least positive value) is saved in a vector. Also, if 0 ∈ F(Xp, Yp, T) and F(Xp, Yp, T.Sup) < 0, the maximum value is saved in another vector. When the subdivision process is over, the smallest of the positive points and the biggest of the negatives are taken to create the interval of the final value of T.
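The following Python sketch illustrates this interval evaluation and bisection of T for one concrete implicit surface (a sphere); the tiny interval class, the choice of surface and the stopping tolerance are illustrative assumptions, not the authors' implementation.

class Interval:
    """Minimal closed-interval arithmetic: +, -, * and containment of zero."""
    def __init__(self, lo, hi):
        self.lo, self.hi = min(lo, hi), max(lo, hi)
    def __add__(self, o):  return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):  return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))
    def contains_zero(self):  return self.lo <= 0.0 <= self.hi

def num(c):  # embed a real number as a degenerate interval
    return Interval(c, c)

def F_sphere(Xp, Yp, T, s=(0.0, 0.0, -5.0), zp=0.0, radius=1.0):
    # Interval extension of f(x,y,z) = x^2 + y^2 + z^2 - r^2 along the ray bundle (eq. (1)).
    X = num(s[0]) + T * (Xp - num(s[0]))
    Y = num(s[1]) + T * (Yp - num(s[1]))
    Z = num(s[2]) + T * (num(zp) - num(s[2]))
    return X * X + Y * Y + Z * Z - num(radius * radius)

def hits(Xp, Yp, T, eps=1e-3):
    """Bisection over T: True if the ray bundle may hit the surface."""
    if not F_sphere(Xp, Yp, T).contains_zero():
        return False                 # 0 not in F => certainly no intersection
    if T.hi - T.lo < eps:
        return True
    mid = 0.5 * (T.lo + T.hi)
    return hits(Xp, Yp, Interval(T.lo, mid), eps) or hits(Xp, Yp, Interval(mid, T.hi), eps)

# Example: a pixel-sized area of the image plane in front of the sphere.
print(hits(Interval(-0.05, 0.05), Interval(-0.05, 0.05), Interval(0.0, 2.0)))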
The interval T is used to detect variations over the surface in the following way: using T, the interval values of X, Y and Z are calculated. Those values correspond to the set of all the intersections of the set of rays. Also, the interval normal is calculated using the derivative of the function F'(X, Y, Z), which is the same derivative as that of the implicit function using the interval values of X, Y, Z. Finally, the interval dot product between the set of normals and the view rays is calculated. If the width of the interval containing the dot products between the set of rays and the normals is bigger than a predefined threshold, or if the surface is not monotonic for the values of T, the surface varies too much in the evaluated area.
The interval adaptive antialiasing is performed in every pixel as follows: the whole area of the pixel is evaluated using the process described in Section 2.1. If the surface varies too much inside the pixel, the pixel is divided into four subpixels and the process is repeated on them. Otherwise, the pixel or subpixel evaluated is shaded using the average of the normals. If the pixel was divided, the average of the shade values of the subpixels is used to obtain the final shade value of the pixel.

Experimentation and Results

The IAA method was tested on the surfaces presented in Figure 1. The comparisons were performed between an adaptive algorithm using a traditional interval ray tracing algorithm and our interval adaptive algorithm. Figures 1a and 1b show a twist with a shadow. The problems in the visualization in (a) occur because shadow rays miss the thin details of the twist. The visualization of Figure 1a takes 27 minutes; Figure 1b takes 20 minutes. The time difference is because IAA detects pixels without too much variation inside using only one intersection test, whereas the traditional ray tracing algorithm traces at least four rays for every pixel.

Fig. 1. Experimentation images. (a) Fine details of the shadow are not well visualized using traditional interval ray tracing. (b) Using IAA, those details are better visualized. (c) A blobby surface rendered by the IAA algorithm. (d) A tri-trumpet surface, in which some sections appear separated although interval arithmetic is used for the intersection test, as is shown in (d). Using IAA, the surface is rendered correctly (f).

Conclusions

In this paper we have presented an interval adaptive antialiasing method (IAA) for the interval ray tracing of implicit surfaces. It can be adapted to the traditional interval algorithms used to ray trace implicit surfaces without increasing the complexity of the intersection tests. The proposed technique also generates better visualization results than methods based on point sampling, especially for surfaces with thin features. IAA is completely based on interval arithmetic, which guarantees the reliability of the algorithm. As future work, we plan to apply our method to reflections and refractions; in this paper, sets of rays are traced only for view and shadow rays.

Acknowledgements
This work has been partially funded by the European Union (European Regional Development Fund) and the Spanish Government (Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica, Ministerio de Ciencia y Tecnología) through the co-ordinated research projects DPI2002-04018-C02-02, DPI2003-07146-C02-02, DPI2004-07167-C02-02, DPI2005-08668-C03-02 and TIN2004-07451-C03-01 and by the government of Catalonia through SGR00296.

References
1. Amanatides, J.: Ray Tracing with Cones. Computer Graphics 18 (1984) 129-135
2. Capriani, O., Hvidegaard, L., Mortensen, M., Schneider, T.: Robust and efficient ray intersection of implicit surfaces. Reliable Computing 1(6) (2000) 9-21
3. Flórez, J., Sbert, M., Sainz, M., Vehí, J.: Improving the interval ray tracing of implicit surfaces. Lecture Notes in Computer Science 4035 (2006) 655-664
4. Genetti, J., Gordon, D.: Ray Tracing With Adaptive Supersampling in Object Space. Graphics Interface (1993) 70-77
5. Heckbert, P., Hanrahan, P.: Beam Tracing Polygonal Objects. Computer Graphics 18 (1984) 119-127
6. Kalra, D., Barr, A.: Guaranteed ray intersection with implicit surfaces. Computer Graphics (Siggraph proceedings) 23 (1989) 297-306
7. Mitchell, D.: Robust ray intersection with interval arithmetic. Proceedings on Graphics Interface '90 (1990) 68-74
8. Sanjuan-Estrada, J., Casado, L., García, I.: Reliable Algorithms for Ray Intersection in Computer Graphics Based on Interval Arithmetic. XVI Brazilian Symposium on Computer Graphics and Image Processing (2003) 35-44
9. Whitted, T.: An Improved Illumination Model for Shaded Display. Communications of the ACM 23 (1980) 343-349

Restricted Non-cooperative Games


Seth J. Chandler
University of Houston Law Center
schandler@uh.edu

Abstract. Traditional non-cooperative game theory has been an extraordinarily powerful tool in modeling biological and economic behavior, as well as the effect of legal rules. And, although it contains plausible concepts of equilibrium behavior, it does not contain a theory of dynamics as to how equilibria are to be reached. This paper on Restricted Non-Cooperative Games inserts dynamic content into traditional game theory and thus permits modeling of more realistic settings by imposing topologies that restrict the strategies available to players. It uses Mathematica to show how the payoff array used in conventional game theory, coupled with these strategy topologies, can construct a "game network", which can be further visualized, analyzed, and "scored" for each of the players. The paper likewise uses Mathematica to analyze settings in which each player has the ability to engineer its own strategy topology and suggests other potential extensions of Restricted Non-Cooperative Games.1
Keywords: non-cooperative game theory, Mathematica, law, Nash Equilibrium, game network, New Kind of Science, directed graphs.

Introduction

In conventional non-cooperative game theory, each player can see and can instantaneously select any element of its strategy set in response to the other players' strategy selections [1]. In real settings, however, the strategies available to a player at any given time will often be a function of the strategy it selected at a prior time [2]. It may, for example, be possible to change only one aspect of a strategy at a time. Alternatively, as in work done earlier by the author in "Foggy Game Theory" [3], the strategies may be placed in some cyclic topology and only changes within some distance of the current strategy are permitted. Sometimes these constraints on the dynamics of strategy selection may be the result of external circumstances or cognitive limitations on the part of the player; other times they may be deliberately engineered by the player itself [4]. Either way, however, the result is to overlay the strategies with a network connecting them (a topology) in which some strategies are connected and others are not.2

1
Mathematica code used to create the framework is available from the author on request.

From Strategy Topologies and Payoff Arrays to Game Networks

The left panel of Figure 1, produced using Mathematica's GraphPlot command, shows a sample strategy network sD for an imaginary driver who has strategies labeled A through E (each strategy representing, perhaps, some combination of care and frequency). Notice that while the driver can continue the strategy of B, once it abandons that strategy it cannot return. This is the sort of realism permitted by this extension of conventional game theory. Notice further that the strategies differ in their ability to immediately access other strategies: D can access C and E immediately, while A can access A and E immediately. Another player (perhaps a pedestrian) might, for example, have strategies labeled 1-3, again perhaps some combination of care when walking and frequency of walks. The pedestrian's strategy network sP is shown in the right panel of Figure 1, in which the dashed arrows show the strategy connections.

Fig. 1. Strategy topologies for driver and pedestrian

We can now use Mathematica's structural operations to create a new network (directed graph or digraph) that is the graph Cartesian product of the networks sD and sP. Thus, given the strategy topologies shown in Figure 1, if the existing strategy combination is C1, the next strategy combinations could be A1, C1, D1 or E1 (if the driver moves) or C1, C2 or C3 (if the pedestrian moves).
In conventional game theory, the players get different payoffs depending on what strategy combination is selected. So too here.

2
All strategies have at least one outgoing edge, though that edge can be a self-loop. Otherwise a player would not know what to do on its next move. One can imagine a yet more general case in which the strategies available to each player are a function not simply of the strategy employed by that particular player on the prior move but of the strategy combination used on the prior move or the move history. Conventional non-cooperative game theory may be thought of as a special case of restricted game theory in which each player has a complete graph as its strategy topology.
172

S.J. Chandler

Fig. 2. Game network for driver and pedestrian based on payoff array

I assume that players are greedy (and perhaps not terribly clever) in that, in selecting their next move, they choose the one whose associated strategy combination (given the complementary strategy selections of the other players) offers them the highest payoff.3 This modification results in a thinning of the network created above so that, in an n-player game, only n edges can generally emerge from each node. Each edge represents the best move for one of the n players.
To use more formal notation, if there are n players in some restricted game and the strategy topology (graph) of player i ∈ {1, …, n} is denoted as si, and the set of strategy combinations in the restricted game is S (= s1 × … × sn), then the moves potentially available to player i from some strategy combination u may be written as Equation 1, where V is a function listing the vertices of a graph and E is a function listing the edges of a graph. One can then write the moves potentially available to all players from u as Equation 2 and the set of moves potentially available to all players from all strategy combinations as Equation 3.

3
More elaborate behavioral assumptions are certainly possible. Professor Steven Brams, for example, relies on a similar dynamic structure in his celebrated Theory of Moves [2]. He assumes, however, that the players can foresee and optimally negotiate an entire tree (acyclic directed network) of moves among strategy combinations. Restricted Non-Cooperative Game Theory avoids some of the issues associated with the construction of trees in the Theory of Moves in that, among other things, there are no "terminals." Instead, players confront a cyclic network of strategy combinations.


M[u, s1, …, sn, i] = { {{u, v}, i} : u, v ∈ S, {ui, vi} ∈ E[si], and uj = vj for all j ≠ i }        (1)

M[u, s1, …, sn] = ⋃i=1,…,n M[u, s1, …, sn, i]        (2)

M[s1, …, sn] = ⋃u∈S M[u, s1, …, sn]        (3)

One now uses the payoffs to narrow the set of moves to a subset of the set of potential moves. A plausible way of doing so is, as in conventional game theory, to create a payoff function mapping each strategy combination to a set of payoffs, one for each of the players. One can conveniently represent that function as the n × |s1| × … × |sn| array P (dimensionality n + 1), where |s| assumes its conventional meaning of the cardinality of a set s. Pi is in turn the n-dimensional array in which element {u1, …, un} represents the payoff to the ith player resulting from strategy combination u. One can then denote the restricted game G as having the following set of moves:

G[P, s1, …, sn] = { {{u, v}, i} ∈ M[s1, …, sn] : Pi[v] ≥ max{ Pi[m] : {{u, m}, i} ∈ M[u, s1, …, sn, i] } }        (4)
This mathematical formalism is visualized in Figure 2. The top panel represents the payoffs associated with each strategy combination in the sample driver-pedestrian game. The bottom panel shows the new "game network." Moves by the driver are shown as medium-width solid lines, while moves by the pedestrian are thick dashed lines. Very faint, thin, dotted lines represent moves that were potentially available but were discarded because they were not the best move for either player.
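As a rough illustration of equations (1)-(4) outside Mathematica, the following Python sketch builds a restricted game's move set from per-player strategy digraphs and a payoff array; the data structures and names (adjacency dictionaries, a payoff dict keyed by strategy combination and player) are assumptions for illustration, not the author's Mathematica framework.

from itertools import product

def game_network(strategy_graphs, payoff):
    """strategy_graphs: list of adjacency dicts, one per player
                        (strategy -> set of strategies reachable on the next move).
       payoff: dict mapping (strategy_combination, player_index) -> payoff value.
       Returns the retained move set {((u, v), i)} of equation (4)."""
    n = len(strategy_graphs)
    S = list(product(*[g.keys() for g in strategy_graphs]))    # all strategy combinations
    moves = set()
    for u in S:
        for i in range(n):
            # Equation (1): player i may change only its own coordinate,
            # and only along an edge of its strategy topology.
            options = [u[:i] + (w,) + u[i + 1:] for w in strategy_graphs[i][u[i]]]
            best = max(payoff[(v, i)] for v in options)
            for v in options:                                   # equation (4): keep greedy best moves
                if payoff[(v, i)] >= best:
                    moves.add(((u, v), i))
    return moves

# Tiny two-player example (player 1: strategies A, B; player 2: strategies 1, 2).
g1 = {"A": {"A", "B"}, "B": {"B"}}          # once B is played, A is unreachable
g2 = {"1": {"1", "2"}, "2": {"1", "2"}}
payoff = {((a, b), i): (ord(a) + 2 * ord(b) + 3 * i) % 7
          for a in g1 for b in g2 for i in range(2)}
print(sorted(game_network([g1, g2], payoff)))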

Scoring Game Networks

We can now assign a score to each of the players from the game network described above. A player's score is simply a weighted average of the payoffs the player receives from each strategy combination. A plausible weighting scheme assumes that a random walk is taken by the players on the game network (starting from some random node) and that the weights are thus the stationary probabilities of being at each node (strategy combination). Thus, if there were, as in some conventional non-cooperative games, a single strategy combination to which all the players inexorably converged regardless of the "starting point" (a classic Nash Equilibrium), that strategy combination would receive a weight of one. If there were several such nodes, each would receive a weight corresponding to the size (number of nodes) of its basin of attraction. These weights can be computed readily by creating a Markov transition matrix associated with the game network and then computing the stationary values.4 The scores can be normalized by dividing each score by the score that would result if all nodes of the network were weighted equally. In formal notation, the normalized score for player i in game g = G[P, s1, …, sn] is equal to

( ∑u∈V[g] wu[g] Pi[u] ) / ( (1/|V[g]|) ∑u∈V[g] Pi[u] ),        (5)

where wu[g] is the weight accorded strategy combination u in game g. In our sample driver-pedestrian game, the normalized score is 1.6345 for the driver and 1.214 for the pedestrian.5

4
Mathematica has iterative constructs such as Nest or built-in functions such as Eigenvectors that make this process quite simple.
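Continuing the earlier sketch, stationary weights and normalized scores could be computed from the move set as follows; this is again a hedged illustration in which the uniform random-walk transition model and the lazy power-iteration routine stand in for the author's Mathematica Nest/Eigenvectors approach.

import numpy as np

def normalized_scores(moves, S, payoff, n_players, iters=2000):
    """moves: set {((u, v), i)} from game_network; S: list of strategy combinations.
    Builds a Markov chain in which each retained move out of a node is taken with
    equal probability, approximates the stationary distribution, and returns the
    normalized score of expression (5) for each player."""
    index = {u: k for k, u in enumerate(S)}
    T = np.zeros((len(S), len(S)))
    for u in S:
        targets = [v for ((uu, v), i) in moves if uu == u]
        for v in targets:
            T[index[u], index[v]] += 1.0 / len(targets)
    w = np.full(len(S), 1.0 / len(S))
    for _ in range(iters):
        w = 0.5 * (w + w @ T)          # lazy walk: same stationary weights, avoids oscillation
    scores = []
    for i in range(n_players):
        p = np.array([payoff[(u, i)] for u in S])
        scores.append((w @ p) / p.mean())   # divide by the equal-weight score
    return scores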

Fig. 3. Modification of payoff array generates new game network and scores

Just as in conventional game theory one can study how changes in the strategy combination-payoff mapping alter the Nash Equilibria and/or the payoffs at the equilibria, in restricted non-cooperative game theory one can study how changes in the strategy combination-payoff mapping alter the game network, which in turn alters the players' scores. Figure 3 shows the new payoff array (top panel) that results from requiring the driver to pay the pedestrian an amount of 0.5 if strategy combination E1 is employed, and the resulting new game network (bottom).6 The new normalized scores are 1.49 for the driver and 1.60 for the pedestrian.

5
In traditional game theory, the ability to find at least one "Nash equilibrium" for all games and the accompanying payoffs is preserved by permitting the players to use probabilistic strategies [1] and to then indulge contested assumptions about the behavior of players under uncertainty. Probabilistic strategy selection is not permitted here.

A New Kind of Science Approach to Game Networks

With this framework in place we can also undertake studies not possible in conventional non-cooperative game theory. We can examine how, given a payoff matrix, changing the strategy topologies affects the associated game network, which in turn affects the weights received by each strategy combination, which in turn affects the scores received by each player. We can examine properties of the game network itself, such as the lengths of its maximal cycles. We can also, in effect, create a metagame in which the strategies are not just things such as A or B but also choices about whether to permit oneself to transition from A to B. Physical sabotage of otherwise existing transition possibilities can create such restrictions; so can economic relationships with third parties such that various strategy transitions become sufficiently unprofitable (dominated) and thus disregarded.
I now begin to examine this last proposition systematically using ideas of Stephen Wolfram's A New Kind of Science involving consideration of very simple cases and enumeration of possibilities [6]. Consider a game of n players indexed over i, with each player having |s| strategies available to it. Each player now has (2^|s| − 1)^|s| possible strategy topologies. This is so because each strategy can connect with the other strategies in 2^|s| ways, but one of those ways, the empty set, is prohibited, as each player must always have a next move, and there are |s| strategies to be considered. There are thus ∏i=1,…,n (2^|s| − 1)^|s| strategy combination topologies that can exist. Although the number of strategy topologies can readily become quite large, if there are two players and each player has three strategies available to it, each player has 343 strategy topologies and there are 117649 strategy combination topologies that can be enumerated, along with an identical number of associated game networks.7 It is well within the ability of the framework we have created and today's computers to extract the maximal cycle lengths and the scores for each of these game networks.
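The counting argument above is easy to check by brute force; the short Python sketch below enumerates the non-empty successor sets for each strategy and confirms the (2^|s| − 1)^|s| count for |s| = 3 (the enumeration style is an illustrative assumption, not the author's Mathematica code).

from itertools import combinations, product

def strategy_topologies(strategies):
    """Yield every strategy topology: for each strategy, a non-empty set of successors."""
    non_empty = [frozenset(c)
                 for r in range(1, len(strategies) + 1)
                 for c in combinations(strategies, r)]   # 2^|s| - 1 choices per strategy
    for choice in product(non_empty, repeat=len(strategies)):
        yield dict(zip(strategies, choice))

topologies = list(strategy_topologies(["A", "B", "C"]))
print(len(topologies))            # 343 = (2**3 - 1)**3
print(len(topologies) ** 2)       # 117649 strategy combination topologies for two players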
I can create a random payo array and then create adjacency list representations for each possible strategy topology. I can then invoke a command that
creates a losslessly compressed representation of the game network for all pairs
of strategy topologies. On a 2006 Macbook Pro, the computation for all game
networks of n = 2, |s| = 3 takes about 30 minutes and produces an expression consuming ve megabytes. (Mathematica permits a monitor to be attached
6 Legal rules often change payoff arrays in just the fashion shown here by requiring one
player in a "game" to pay another player in a "game" if certain strategy combinations
are employed.[5] Total payoffs are generally conserved within a strategy combination,
unless there are transaction costs or payoffs to non-parties, in which event the total
payoff can diminish.
7 If there were three players and each player had four strategies, there would be
129746337890625 game networks to be considered, which shows a limitation of a
"brute force" approach on contemporary computers.


to the computation to watch its progress and the data can be stored for later
examination.)
With this listing of all possible networks and a decompression algorithm, I
can then compute the lengths of each maximal cycle for each of the 343² = 117649 game
networks. It ends up being most common for game networks to have the potential to cycle through at most three strategy combinations before returning
to a previously played strategy combination. Maximal cycles of 7, 8 and even
9 strategy combinations prove possible, however. Indeed, we can focus on the
smaller number of game networks that show complex dynamics with long cycles in which, depending on the sequence of player moves, there is at least the
potential for many strategy combinations to be pursued before returning to a
previously played combination.
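As a hedged illustration only (this is not the author's code), the maximal cycle length of a single game network can be computed with the built-in graph functions of current Mathematica versions (Graph and FindCycle, which postdate the 2006-era setup described above); the name maxCycleLength is hypothetical.

(* Hedged sketch: longest directed cycle in one game network, given as edge rules
   between strategy combinations. *)
maxCycleLength[edges_List] :=
  Max[0, Length /@ FindCycle[Graph[edges], Infinity, All]]

(* Example: maxCycleLength[{1 -> 2, 2 -> 3, 3 -> 1, 3 -> 4}] returns 3,
   the length of the single directed cycle 1 -> 2 -> 3 -> 1. *)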
Alternatively, I can take the game networks and compute a mapping between
pairs of strategy topologies and the scores for each player. This computation takes
only 10 minutes, much of which is spent in decompressing the game networks.
One can do several things at this point. One can examine which strategy
topology tends to have the highest average score for a particular payoff array.
Figure 4 shows the results of this experiment. It shows that both players tend to
do best when they have significant choices available to them regardless of their
current strategy choice. "Pre-commitment strategies," which attempt to restrict
strategy selections, tend not to do well when one does not know the strategy
topology of the opposing player.

[Figure 4 consists of two panels, one for Player 1 and one for Player 2.]
Fig. 4. Strategy topologies yielding highest average payoffs

One can also examine the character of any pure traditional Nash equilibrium
for the "meta-game" created by this process in which the "strategies" are now
strategy topologies and the payoffs are the "scores." When one runs this experiment on the sample game shown above, one finds that there are eight Nash
Equilibria.8 Figure 5 shows a sample Nash equilibrium.9
8 All the equilibria have the following characteristics: the second player always chooses
strategy topologies in which it must move to strategy 1 no matter what strategy it
has played before; the first player never permits a move to strategy 3 no matter what
player 1 does and no matter what it has done on any prior move.
9 One could create meta-metagames by imagining that the players can not only alter
their strategy topologies but also their ability to transition from among strategy
topologies. Because this would create over 5.7 × 10^70831 possible game networks,
however, any exhaustive study of the possibilities is, for the foreseeable future, impractical.

Fig. 5. A sample Nash Equilibrium set of strategy topologies and the associated game
network

Conclusion

Mathematica successfully creates a useful and flexible framework for the study
of n-player Restricted Non-Cooperative Games in which the players have potentially different strategy topologies. The paper does not purport to study this
extension of game theory exhaustively. The intent is to develop a general set of
tools from which further study can be profitably pursued.
Acknowledgments. The author thanks Jason Cawley of Wolfram Research,
Professor Darren Bush of the University of Houston Law Center and Professor
Steven Brams of New York University for extraordinarily helpful comments on
a draft of this paper, as well as Yifan Hu of Wolfram Research for designing
Mathematicas GraphPlot package used extensively here.

References
1. Gintis, H.: Game theory evolving a problem-centered introduction to modeling
strategic behavior. Princeton University Press, Princeton, N.J. (2000)
2. Brams, S.J.: Theory of moves. Cambridge University Press, Cambridge [England] ;
New York, NY, USA (1994)
3. Chandler, S.J.: Foggy game theory and the structure of legal rules. In: Tazawa, Y.
(ed.): Symbolic Computations: New Horizons. Tokyo Denki University Press, Tokyo
(2001) 31-46
4. Dixit, A.K.: Strategic Behavior in Contests. American Economic Review 77 5 (1987)
891-898
5. Baird, D.G., Gertner, R.H., Picker, R.C.: Game theory and the law. Harvard University Press, Cambridge, Mass. (1994)
6. Wolfram, S.: A new kind of science. Wolfram Media, Champaign, IL (2002)

A New Application of CAS to LATEX Plottings


Masayoshi Sekiguchi, Masataka Kaneko, Yuuki Tadokoro, Satoshi Yamashita,
and Setsuo Takato
Kisarazu National College of Technology,
Kiyomidai-Higashi 2-11-1, Chiba 292-0041, Japan
masa@kisarazu.ac.jp

Abstract. We have found a new application of a Computer Algebra System (CAS): KETpic, which has been developed as a macro package for
a CAS. One of the aspects of its philosophy is CAS-aided visualization in
LATEX documents. We aim to extend KETpic to other CASs, and derive
necessary conditions from the basic idea for CASs to accept it, i.e., I/O
functions with external files and manipulation of numerical or string data.
Finally, we describe KETpic for Maple as a successful example. By using
KETpic we can draw fine pictures in LATEX documents.
Keywords: CAS, LATEX.

Introduction

In many cases, mathematicians or mathematics teachers, as well as other scientists, need to prepare good illustrations or educational materials. In general, a CAS
gives us a set of highly accurate numerical data. Therefore, it is quite natural
to utilize a CAS for the purpose of creating fine pictures. CASs support beautiful
and impressive graphics and some of them can output the picture in graphical
formats (EPS, JPEG, GIF, BMP, etc.). However, the authors have not been satisfied with the printed matter obtained as a direct output from CASs, as well as
from CAD (Computer Aided Design) systems or from data/function-plotting programs
like Gnuplot. The reason is that mathematical lettering in their pictures is not
clear. We need to optimize their outputs so as to be usable for mathematical
textbooks or academic papers.
On the other hand, LATEX has a quality in lettering high enough to satisfy
us sufficiently but no abilities of symbolic or numerical computation (see [5]).
It has the picture environment and the ability to display graphical data files in
EPS format. By using Tpic, a graphical extension of LATEX, we can draw various
pictures based on 2D numerical data (see [2,4]). However, it is cumbersome to
handle numerical data directly and to generate tpic special commands. It is better to write a program generating tpic special commands from numerical plotting
data. The program can be either a stand-alone piece of software or a macro package for
a CAS.
The authors have developed KETpic for Maple, a Maple macro package. It
generates tpic special commands, and enables us to draw complicated but 2D

mathematical objects in LATEX documents at the highest accuracy. Detailed description of KETpic for Maple is given in [7]. Recently we organized a project
in which we aim to extend KETpic to other CASs. We consider necessary conditions for CASs to accept KETpic. Philosophy of designing KETpic includes a
basic idea, CAS-aided visualization in LATEX documents which we call CAS-aided
LATEX plottings. The requirements will be naturally derived from the basic idea.
Section 2 is devoted to the construction of necessary conditions for a CAS to
generate graphic files which we can include in LATEX documents. In Section 3, we
show that Maple satisfies the requirements, describe how KETpic realizes the
idea, and illustrate its outputs.

Requirements for CAS-Aided LATEX Plottings

In order to realize the idea CAS-aided LATEX plottings, we decided to develop


a macro package for a known CAS. We did not select other ways of development:
a new CAS which is designed to realize the idea or another individual software
which calls a kernel of CAS as an external computing engine. We believe that it
is best to develop a macro package for a known CAS.
Hereafter, we suppose that a standard CAS is equipped with abilities of symbolic or numerical computing, programming, and generating graphical images.
In addition, we require the following necessary conditions for a CAS.
R1. Loadability of macro packages from external files,
R2. Writability of numerical data and strings with formats on text files,
R3. Accessibility to raw numerical data in 2D/3D coordinates,
R4. Ability of manipulating numerical values or strings to generate graphic
codes, e.g., tpic special commands, PostScript, or EPS.

If a CAS satisfies writability without formats instead of the condition R2,
it can write a sequence of raw data in a text format. In this case, it is necessary
to translate the unformatted data into formatted data. The translation may
be done by a post-processor. The condition R4 and the ability of programming
enable us to handle a lot of data collectively or iteratively, and to optimize the
outputs.
For KETpic, we have chosen to generate tpic special commands. Our choice,
Tpic, allows us to obtain rich graphical expressions. We believe that Tpic is best
because it is widespread. Unfortunately one previewer, Mxdvi on Mac OS X, does
not support Tpic. We offer a particular version of KETpic for that previewer. It
generates eepic [4] commands instead of tpic special commands, and is downloadable from our web site [6]. Another way to provide rich expressions is to
generate PostScript or EPS (Encapsulated PostScript) files. Many versions of
LATEX allow EPS files to be inserted in documents. This is still realistic if we are
familiar with the grammar of the EPS format.
A graph with many curves or items is a powerful expression in mathematical documents. Producing the graph becomes easier by collective or iterative
operations, e.g., list processing, DO-loop, WHILE-loop, and so on.


Optimization of graphics means fine-tuning of outputs and customizing of
graph accessories, e.g., tickmarks, labels of axes, and legends of curves, which
make pictures more appealing. For fine-tuning we have added commands to
KETpic, by which we can draw various hatchings, dashed lines, and projections of 3D objects. For customizing of graph accessories we have added other
commands to KETpic. These operations are realized by the programming facilities of CASs.

3 A Successful Example: KETpic for Maple

3.1 Maple Satisfies the Requirements

The condition R1 is satisfied by the read command provided by Maple. We
can use it as follows.
> read UsersFolder/ketpicw.m;
The condition R2 is satisfied by the Maple commands fopen, fclose, and fprintf.
They define the KETpic commands openfile and closefile. The usage is as
follows.
> openfile(UsersFolder/figure1.tex):
...
> closefile():
These also satisfy the condition R4. KETpic commands openpicture,
closepicture, and setwindow return \begin{picture} and \end{picture}
with option indicating its window size and a unit length. For instance, the
following set of commands,
> setwindow(0..5,-1.5..1.5):
> openpicture("1cm"):
...
> closepicture():
returns a set of commands of the picture environment as follows.
{\unitlength=1cm%
\begin{picture}%
(5.00,3.00)(0.00,-1.50)%
...
\end{picture}}%
The following Maple command gives plotting data to a variable g1.
> g1:=plot(sin(x),x=0..5):
If we execute the command ending with a semicolon instead of a colon in the
command line above, we can see the internal expression of g1, which takes a list
format.


> g1:=plot(sin(x),x=0..5);
g1:=PLOT(CURVES([[0.,0.],...
...,[5.,-0.958924274663138453]],COLOUR(RGB,1.0,0.,0.)),...))
This operation satisfies the condition R3. This internal expression can be constructed through DO-loop operations and string manipulation. The Maple command
op returns the n-th operand of its argument.
> op(g1);
CURVES(...),COLOUR(...),AXESLABELS(...),VIEW(...)
> op(3,g1);
AXESLABELS(...)
> op(1,op(1,g1))
[[0.,0.],...,[5.,-0.958924274663138453]]
> op(1,op(1,op(1,g1)))
[0.,0.]
Maple commands sscanf and convert can translate characters into numerical
data, and vice versa. Other commands cat, substring, and length can be used
for string manipulations, concatenating, and so on. One of drawing commands
of KETpic is drwline. The usage is as follows.
> drwline(g1):
This command returns a set of tpic special codes as follows.
> drwline(g1);
\special{pa 0 0}\special{pa 43 -43}...\special{pa 164 -160}%
...
\special{pa 1844 394}...\special{pa 1969 378}%
\special{fp}%
%
Commands for hatching area, drawing dashed curves, and customizing graph
accessories are as follows.
> setax("","\\xi","","\\eta","","",""):
> g2:=plot(cos(x),x=0..5):
> g3:=hatchdata(["ii"],[3,0],[g1,"s"],[g2,"n"]):
> drwline(g3):
> dashline(g1,g2):
> expr([2,1],"ne","\\displaystyle
\\int_{\\pi/4}^{5\\pi/4}(\\sin\\xi-\\cos\\xi)d\\xi"):

Double backslash \\ returns a single backslash because the single backslash is a control code in Maple. The first command setax defines the axes, the origin, and their
names. In this case, the name of the horizontal axis is ξ, and the vertical one
η. The command hatchdata returns a set of stripes inside a closed curve obtained


from g1 and g2. Its third argument [g1,"s"] indicates a region in the south of
curve g1. Similarly the fourth argument [g2,"n"] indicates a region. The first
argument ["ii"] indicates the inside areas of them. The second argument [3,0]
defines a reference point. The command dashline(g1,g2) returns plotting data
of g1 and g2 with dashed lines. The last command line puts a legend
∫_{π/4}^{5π/4} (sin ξ − cos ξ) dξ at a point whose position is slightly different from (2, 1) to the northeast. The resulting figure is given in Fig. 1.

[Fig. 1 shows the hatched region between the two curves together with the legend ∫_{π/4}^{5π/4} (sin ξ − cos ξ) dξ.]
Fig. 1. Output of 3.1

3.2 Special Functions or Functions Defined by Integrals

Using Maple, we can call special functions, calculate their values, and plot
graphs. The Chi-square distribution is defined by means of the gamma function Γ(x) as
follows:

    f_n(x) = x^{n/2−1} e^{−x/2} / (2^{n/2} Γ(n/2)).    (1)

Curves of the distributions can be obtained by Maple, and can be included in


this document by KETpic (see Fig. 2 (left)). The corresponding sequence of
KETpic commands are given below.
Fig. 2. Chi-square distributions for degrees of freedom n = 1, 2, . . . , 9 (left) and their
corresponding definite integrals (right)


> f:=(n,x)->x^(n/2-1)*exp(-x/2)/2^(n/2)/GAMMA(n/2);
> tmp:=[]:
for i from 1 to 9 do
tmp:=[op(tmp),plot(f(i,x),x=0..20)]:
od:
g4:=display(tmp):
The internal expression of g4 is
PLOT(CURVES(...),CURVES(...),...,CURVES(...))
One can find the definition of the Chi-square distribution in the first line. This is
an advantage of CAS. Another advantage is an iterative operation, which one can
find in the last three lines. To define a function, one can use an integral form.
The following function is the definite integral of the Chi-square distribution (see
Fig. 2 (right)):

    F_n(x) = ∫_0^x f_n(t) dt.    (2)

The corresponding sequence of KETpic commands are given below.


> F:=(n,x)->int(f(n,t),t=0..x);
> tmp:=[]:
for i from 1 to 9 do
tmp:=[op(tmp),plot(F(i,x),x=0..20)]:
od:
g5:=display(tmp):
3.3 Curves Defined by Implicit Functions or with Parameters

Using Maple, we can draw a curve defined by implicit functions. In general, contours are obtained in the same way. Fig. 3 (left) shows contours of the following
Coulomb potential,

    φ(x, y) = 1/√((x + 1)² + y²) + 1/√((x − 1)² + y²),    (3)

where the two electric charges are placed at (±1, 0). The corresponding sequence of KETpic commands is given below.
> g6:=contourplot(((x+1)^2+y^2)^(-1/2)+((x-1)^2+y^2)^(-1/2),
x=XMIN..XMAX,y=YMIN..YMAX,grid=[100,100],
contours=[3,2,1.5,1.2,0.95]):
There are no technical difficulties in this case. As well, it is not difficult to plot
parametric curves. Conformal mappings of complex functions consist of a set of
parametric curves. Fig. 3 (right) shows a conformal mapping of the following
complex function:

    g(z) = 1/z.    (4)


In this case, we emphasize the different images of Re(z) and Im(z). The images
of Im(z) in Fig. 3 (right) are plotted in bold curves. We explain the technique
briefly. The corresponding sequence of KETpic commands is given below.
> g7:=conformal(1/z,z=-1-I..1+I):
> g8:=[]:
for i from 1 to 11 do
g8:=[op(g8),op(i,g7)]:
od:
> g9:=[]:
for i from 12 to 22 do
g9:=[op(g9),op(i,g7)]:
od:
The first line simply gives plotting data to a variable g7. The second argument in conformal, z=-1-I..1+I, indicates a range of z, i.e., |Re(z)| ≤ 1 and
|Im(z)| ≤ 1. The value of g7 consists of 11 (default) curves of g(Im(z)) and 11
curves of g(Re(z)), in this order. The second (resp. third) line collects the first
(resp. remaining) 11 curves and saves them in a variable g8 (resp. g9). We obtain Fig. 3 (right) by writing g8 with the doubled width and g9 with the default
width into a text file. The corresponding KETpic commands are as follows.
> drwline(g8,2):
> drwline(g9):
The option 2 after g8 in the first command gives the multiplier of the line
width.
[Axes in Fig. 3 (right): Re(z) and Im(z).]
Fig. 3. Contours of a Coulomb potential in (3) (left) and a conformal mapping of a
complex function in (4) (right)


Conclusions and Future Work

We have clarified the requirements R1 to R4 for a CAS to accept our
new application, CAS-aided LATEX plottings. We suppose the CAS to be standard,
which means the CAS is equipped with abilities of symbolic and
numerical computing, programming, and showing graphical images.
Our first example is a macro package for Maple, which we call KETpic for
Maple. The package is able to produce accurate and richly expressive pictures
with a minimal input and a reasonable effort. KETpic for Maple is available
on major platforms: Windows, Macintosh or Linux. Its minimal configuration
is a combination of Maple V release 5 (see [1,3]) and a DVI driver supporting
Tpic (see [4]). Anyone interested in KETpic can download the latest version
with its command reference and some examples from our web site [6], completely free of charge.
KETpic is powerful for creating LATEX plottings but is relatively weak in the
following respects. First, it does not support a GUI. Therefore, users might have
difficulties handling KETpic. However, this is a necessary consequence of the text-based user interface of a CAS, which realizes accurate plottings. Second, curve
fitting is one of the remaining problems of KETpic, because GUI environments are
the best for fitting a curve. Third, at present, KETpic is not good at 3D drawings,
especially surface plottings. Finally, there are no versions for other CASs. We
have several plans to extend KETpic to other CASs, e.g., Mathematica or free
CASs. We are developing a project to improve KETpic. In addition, we are
preparing its user manual.

References
1. Char, B.W. and Geddes, K.O., et al: Maple V Library Reference Manual, (1991),
Springer-Verlag.
2. Goossens, M., Rahtz, S. and Mittelbach, F.: The LATEX Graphics Companion,
(1997), Addison-Wesley.
3. Heal, K.M., Hansen, M.L. and Rickard, K.M.: Maple V Learning Guide, (1996),
Springer-Verlag.
4. Kwok, K.: EEPIC: Extensions to epic and LATEX Picture Environment Version 1.1,
1988, http://www.ntg.nl/doc/kwok/eepic.pdf
5. Mittelbach, F. and Goossens, M., et al: The LATEX Companion, (2004), AddisonWesley.
6. Sekiguchi, M.: http://www.kisarazu.ac.jp/~masa/math/
7. Sekiguchi, M., Yamashita, S. and Takato, S.: Development of a Maple Macro Package Suitable for Drawing Fine TEX-Pictures, (2006), Lecture Notes in Computer
Science 4151 (eds. A. Iglesias & N. Takayama), pp.2434, Springer-Verlag.

JMathNorm: A Database Normalization Tool


Using Mathematica
Ali Yazici1 and Ziya Karakaya2
1

Computer Engineering Department, TOBB University of Economics & Technology,


Ankara - Turkey
aliyazici@etu.edu.tr
2
Computer Engineering Department, Atilim University, Ankara - Turkey
ziya@atilim.edu.tr

Abstract. This paper is about designing a complete interactive tool,
named JMathNorm, for relational database (RDB) normalization using Mathematica. It is an extension of the prototype developed by the
same authors [1], with the inclusion of Second Normal Form (2NF) and
Boyce-Codd Normal Form (BCNF) in addition to the existing Third
Normal Form (3NF) module. The tool developed in this study is complete and can be used for real-time database design as well as an aid
in teaching fundamental concepts of DB normalization to students with
limited mathematical background. JMathNorm also supports the interactive
use of modules for experimenting with the fundamental set operations such
as closure and full closure, together with modules to obtain the minimal
cover of the functional dependency set and to test an attribute for a candidate key. JMathNorm's GUI is written in Java and utilizes
Mathematica's JLink facility to drive the Mathematica kernel.

Introduction

Design of an RDB system consists of four main phases, namely, (i) determination
of user requirements, (ii) conceptual design, (iii) logical design, and finally, (iv)
physical design [2]. During the conceptual design phase, a set of business rules is
transformed into a set of entities with a set of attributes and relationships among
them. The Extended Entity Relationship (EER) modeling tool can be utilized for the
graphical representation of this transformation. The entity set in the EER model
is then mapped into a set of relation schemas {R1, R2, R3, ..., Rn} where each Ri
represents one of the relations of the DB schema. A temporary primary key is
designated, and a set of functional dependencies (FDs) among the attributes of
each schema is established as an outcome of this phase.
As a side product of the logical design phase, each Ri is transformed into
well-formed groupings such that one fact in one group is connected to other
facts in other groups through relationships [3]. The ultimate aim of this article is
to perform this rather mechanical transformation process, called normalization,
efficiently in an automatic fashion.



Commercial DB design tools do not provide a complete solution for automatic
normalization, and existing normalization tools for the purpose require high-level
programming skills and complex data structures. Two such implementations in
the Prolog language are discussed in [4,5]. Another study on automatic transformation is given in [6], in which UML is used to access the Object Constraint Language
(OCL) to construct expressions that encode FDs using classes at a meta-level.
An alternative approach to normalization is given in [3] that focuses on addressing FDs to normalize a DB schema in place of relying on the formal definitions
of normal forms. The impact of this method on IS/IT students' perceptions is
also measured in the same study. It appears that this approach is only useful
for small sets of FDs in a classroom environment and in particular not suited
for automatic normalization. A web-based tool for automatic normalization is
given in [7], which can normalize a DB schema up to 3NF for a maximum of 10
FDs only.
This article is an extension of the work [1] and discusses a complete normalization tool called JMathNorm which implements 2NF, 3NF, and BCNF using the
abstract algorithms found in the literature [2,9,13]. JMathNorm's normalization
modules are written in Mathematica [8] using the basic list/set operations, the
user interface is designed using the Java language, and finally, execution of the Mathematica modules is accomplished by employing Mathematica's Java Link (JLink)
utility. The design approach in this study is similar to the Micro tool given in
[9]. However, JMathNorm provides additional aspects for educational purposes
and is implemented efficiently without using any complex data structures such
as pointers.
The remainder of this article is organized as follows. Section 2 briefly reviews
DB normalization and some of the basic functions used in the normalization
algorithms. In Section 3 the Mathematica implementation of the BCNF algorithm is
given. The JMathNorm tool is demonstrated in Section 4. Remarks about the tool
and a discussion of future work are provided in the final section.

A Discussion on Normalization Algorithms

A functional dependency (FD) is a constraint about sets of attributes of a relation Ri in the DB schema. A FD between two sets of attributes X and Y,
denoted by X → Y, specifies that there exists at most one value of Y for every
value of X (the determinant) [2,10,11]. In this case, one asserts that X determines Y
or Y is functionally dependent on X.
For example, for a DB schema PURCHASE-ITEM = {orderNo, partNo,
partDescription, quantity, price}, with PK = {orderNo, partNo}, using a set of
business rules one can specify the following FDs:
FD1: {orderNo, partNo} → {partDescription, quantity, price}
FD2: partNo → partDescription
For a given schema, other FDs among the attributes can be inferred from
Armstrong's inference rules [2]. Alternatively, for an attribute set X, one can


deduce the others, known as the X closure, X⁺, determined by X, based on the FD
set F of the schema. Set closure is one of the fundamental functions for the
normalization algorithms and will be referred to as ClosureX [1] in the sequel.
FullClosureX, X⁺⁺, is yet another function similar to ClosureX which returns
all attributes that are fully dependent on X with respect to the FD set. This function
is used to remove partial dependencies for transforming a relation into 2NF. An
algorithm for the full closure function is given below [9]:
Algorithm FullClosureX (X: attribute set; F: FD set): return closure in tempX;
1. tempX := X;
2. repeat
     oldX := tempX;
     for each FD Y → Z in F do
       if Y ⊆ tempX then
         if not(Y ⊆ X) then tempX := Z ∪ tempX
         else if Y = X then tempX := Z ∪ tempX;
   until (length(oldX) = length(tempX));
3. return tempX;
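As a hedged illustration only (this is not the JMathNorm code), the algorithm above can be written in Mathematica with the same list/set primitives the paper relies on (Union, Complement, Sort, Length); the FD set is assumed here to be a list of pairs {Y, Z} standing for Y → Z, and the attribute names in the example are hypothetical.

FullClosureX[X_List, F_List] := Module[{tempX = X, oldX},
  While[True,
    oldX = tempX;
    Do[
      With[{Y = F[[i, 1]], Z = F[[i, 2]]},
        If[Complement[Y, tempX] === {},                      (* Y is contained in tempX *)
          If[Complement[Y, X] =!= {} || Sort[Y] === Sort[X], (* skip proper subsets of X *)
            tempX = Union[tempX, Z]]]],
      {i, Length[F]}];
    If[Length[oldX] === Length[tempX], Break[]]];
  tempX]

(* Example with hypothetical attribute names:
   FullClosureX[{"A"}, {{{"A"}, {"B"}}, {{"A", "B"}, {"C"}}}] returns {"A", "B", "C"} *)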
Given a set of FDs F, an attribute B is said to be extraneous [11] in X → A
with respect to F if X = ZB, X ≠ Z, and A ∈ Z⁺. A set of FDs H is called a
minimal cover [1,5] for a set F if each dependency in H has exactly one attribute
on the right-hand side, if no attribute on the left-hand side is extraneous, and
if no dependency in H can be derived from the other dependencies in H. Actually, the calculation of a minimal cover consists of the Elimination of Extraneous
Attributes followed by the Elimination of Redundant Dependencies. The normalization algorithms considered in this study make use of the minimal cover of
a given set of FDs. Moreover, they are computationally efficient with at most
O(n²) operations, where n is the number of FDs in the schema.
Normalization is a step-by-step process to transform the DB schema into a set
of subschemas. For the normal forms used in this study (2NF, 3NF and BCNF)
this is achieved by decomposing each Ri into a set of relations by removing
certain kinds of redundancies in the relation. Lack of normalization in a DB
schema causes update anomalies [13] which may destroy the integrity of the DB.
If a relation has no repeating groups, it is said to be in the first normal form
(1NF). In this study, it is assumed that all relations do satisfy this condition. A
relation is in the second normal form (2NF) if no part of a PK determines nonkey
attributes of the relation. Note that, for the example above, because of FD2, the
relation is not in 2NF. 3NF relations prohibit transitive dependencies among their
attributes. And, finally, in Boyce-Codd Normal Form (BCNF) a nonkey attribute
cannot determine a prime attribute (any part of the PK).
A 2NF algorithm with the attribute preservation property is given in [9].
JMathNorm uses a slightly modified version of this algorithm to remove partial
dependencies and hence transform the DB schema into 2NF. Bernstein's synthesis algorithm [1,12] is implemented to provide 3NF relations directly for a
given set of attributes and a set of FDs F. The original dependencies are preserved;
however, the lossless join property [1] is not guaranteed by this algorithm.


In certain 3NF DB schemas, a FD from a nonprime attribute into a prime one


may exist. Boyce-Codd Normal Form (BCNF) of a 3NF relation is achieved by
removing such dependencies. A sketch of the BCNF algorithm with the lossless
join properties [2,12] is given below.
Algorithm BCNF (R: attribute set in 3NF; F: FD set): return Q in BCNF;
1. D := R;
2. while there is a left-hand side X of a FD X → Y in F do
     if X → Y violates BCNF then
       decompose R into two schemas Rm := D − Y; and Rn := X ∪ Y;
3. return Q := Rm ∪ Rn;
The function violatesBCNF tests if a given FD violates the BCNF condition by
calculating the X closure. If it includes all the attributes of R then R does
not violate the BCNF constraint; otherwise R violates the constraint and needs
to be decomposed into Rm and Rn as given above.

3 Mathematica Implementation

3.1 BCNF with Mathematica

In Fig. 1, a use case diagram is given to demonstrate the functions and modules
used in the tool. The tasks in Fig. 1 are effectively implemented as Mathematica modules by utilizing only Mathematica's list structure and the well-known set operations
[8]. These operations are Union[], Complement[], Intersection[], MemberQ[],
Extract[], Append[], Length[], and Sort[].
A FD set F of a schema is represented by two lists, one for the left-hand
sides (FL), and the other for the right-hand sides of F (FR). Obviously, the
order of attributes in such a list is important and should be maintained with care
throughout the normalization process.
For the example above, the FD set is represented in Mathematica as follows:
FL = {{orderNo, partNo}, {orderNo, partNo}, {orderNo, partNo}, partNo}
FR = {partDescription, quantity, price, partDescription}
Accordingly, FL[[i]] → FR[[i]], for i = 1, 2, 3, 4, as specified by the FD set F.
As an illustration, the Mathematica code for the BCNF algorithm is given
below. Given a FD set and a 3NF relation R, the BCNF algorithm first looks for a
BCNF violation using the function violatesBCNF. When found, it returns in Q
two subrelations satisfying the BCNF constraint.
BCNF[FL_, FR_, R_] := Module[{i, X, D, Q, DIF, REL}, D = R; Q = {};
For[i = 1, i <= Length[FL], i++,
If[Length[FL[[i]]] > 1, X = Sort[FL[[i]]], X = {FL[[i]]}];
flag = violatesBCNF[FL, FR, X, FR[[i]], U];


If[flag == 1, REL = Union[X, {FR[[i]]}];


Q = Union[Q, {REL}]; RC = Complement[R, {FR[[i]]}];
DIF = Intersection[R, RC]; Q = Union[Q, {DIF}];];];Return[Q];];
violatesBCNF[FL_,FR_,X_,Y_,R_]:=Module[{XP, flag},
XP=Sort[ClosureX[FL,FR,X]];
If[XP==Sort[U], flag=0,flag=1];Return[flag];];

[Fig. 1 is a use case diagram in which the normalization modules 2NF, 3NF, and BCNF use the functions FullClosureX, ClosureX, IsItaKey, violatesBCNF, Minimal Cover, Elimination of Extraneous Attributes, and Elimination of Redundant Dependencies.]
Fig. 1. Use Case Diagram for Normalization Modules

An example of a relation with a BCNF violation and its decomposition by the
code above is given below. Consider a relation CLIENT-INTERVIEW = {clientno,
interviewdate, interviewtime, staffno, roomno} with the following FDs:
FD1: {clientno, interviewdate} → {interviewtime, staffno, roomno}
FD2: {staffno, interviewdate, interviewtime} → clientno
FD3: {staffno, interviewdate} → roomno
In this relation {clientno, interviewdate} and {staffno, interviewdate} are both
candidate keys and share a common attribute. And the BCNF constraint is violated
by FD2. The result of running the BCNF module by providing the required
parameters produces the following decomposition Q:
Q = {{interviewdate, roomno, staffno},
     {clientno, interviewdate, interviewtime, staffno}}

JMathNorm User Interface

An interactive tool, JMathNorm, with a GUI written in Java is designed to
implement the system given in Fig. 1. Each algorithm is implemented as a Mathematica module. The JLink (Java Link) facility of Mathematica is utilized to load the


Fig. 2. JMathNorm's menu options

Fig. 3. Dialog box to define FDs

Mathematica kernel and execute these modules as required. JMathNorm starts
with a dialogue box asking for the relevant path of the Mathematica kernel. The Mathematica functions for the normalization are loaded afterwards. Consequently,
only the calling statement of those modules is passed to Mathematica to receive
a result string. The result returned is just the set representation in a string. This
string is then parsed into the desired data structure. In JMathNorm, results are
stored in Java's Vector data structure.
The interface offers a menu-driven interaction with the system. The main and
Operations submenus are displayed in Fig. 2. JMathNorm's FD pull-down menu
can be used to set up a new set of FDs, open an existing one, and save or edit FDs
using a data entry dialog box. One can experiment with basic normalization
tasks, namely, set closure, set full closure, elimination of redundant attributes,


Fig. 4. A sample run for 3NF decomposition

elimination of redundant dependencies, testing for a primary key and obtaining
a minimal cover by utilizing the Basic Operations submenu. These set-theoretic
operations form the basis of all of the normalization algorithms discussed in the
preceding sections. Moreover, because of their symbolic nature, manual verification of
the result returned from each of them is rather cumbersome. JMathNorm overcomes this problem by providing a verification mechanism as a background for
teaching normalization theory effectively in a classroom environment. Database
schemas can be transformed into the required normal form directly from the
NForm submenu. As a result of normalization, the original relations of the DB
schema are decomposed into subrelations, which are displayed systematically by
the Results submenu. In Fig. 3, the dialog box for defining FDs is shown. A
sample run to decompose the relation into 3NF is displayed in Fig. 4.

Tests and Discussions

Several benchmark tests found in the literature were successfully applied with
varying numbers of FDs having different initial normal forms. The normalization
algorithms used in the tool possess at most quadratic time complexity in the number of FDs and are computationally effective.
JMathNorm was also used in a classroom environment during a Database Systems course offered to about 25 third-year computer engineering majors during
the Spring semester of the 2006-07 academic year. Students were requested to form
project teams and design a medium-size database system involving 8-10 relations. During the design process, they ended up normalizing the relational
schema. Students usually preferred using JMathNorm to support or validate the
normalization process. It was reported that each team used JMathNorm on average four times. In addition to the use in the project, students utilized the
tool to understand the normalization process and the underlying theory based
on the set-theoretic operations discussed earlier.


In the course evaluation forms, the majority of the students indicated that
the tool was quite useful to check their manual work in studying the normalization algorithms and to normalize schemas for the database design project of the
course.
The modules of JMathNorm were written in Mathematica utilizing only basic
list/set operations as the fundamental data structure. These operations, empowered by the symbolic nature of Mathematica, resulted in an effective normalization tool. Currently, it does not have the ability to create SQL statements for
the normalized schema. A table creation facility geared towards a specific DBMS
is to be added to JMathNorm.
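Purely as an assumption on our part about what such a facility could look like (it is not part of JMathNorm; the function name createTableSQL and the VARCHAR(50) column type are hypothetical), a basic CREATE TABLE string for one decomposed relation could be generated in Mathematica as follows:

(* Hedged sketch: turn a relation name and its attribute list into a minimal SQL statement. *)
createTableSQL[name_String, attrs_List] :=
  "CREATE TABLE " <> name <> " (" <>
    StringJoin[Riffle[(# <> " VARCHAR(50)") & /@ attrs, ", "]] <> ");"

(* createTableSQL["CLIENT_INTERVIEW", {"clientno", "interviewdate", "interviewtime"}]
   returns a single CREATE TABLE string for that relation. *)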

References
1. Yazici, A. and Karakaya, Z.: Normalizing Relational Database Schemas Using
Mathematica, LNCS, Springer-Verlag, Vol.3992 (2006) 375-382.
2. Elmasri, R. and Navathe, S.B.: Fundamentals of Database Systems, 5th Ed., Addison Wesley (2007).
3. Kung, H. and Case, T.: Traditional and Alternative Database Normalization Techniques: Their Impacts on IS/IT Students Perceptions and Performance, International Journal of Information Technology Education, Vol.1, No.1 (2004) 53-76.
4. Ceri, S. and Gottlob, G.: Normalization of Relations and Prolog, Communications
of the ACM, Vol.29, No.6 (1986)
5. Welzer, W., Rozman, I. and Gyrks, J.G.: Automated Normalization Tool, Microprocessing and Microprogramming, Vol.25 (1989) 375-380.
6. Akehurst, D.H., Bordbar, B., Rodgers, P.J., and Dalgliesh, N.T.G.: Automatic
Normalization via Metamodelling, Proc. of the ASE 2002 Workshop on Declarative
Meta Programming to Support Software Development (2002)
7. Kung, H-J. and Tung, H-L.: A Web-based Tool to Enhance Teaching/Learning
Database Normalization, Proc. of the 2006 Southern Association for Information
Systems Conference (2006) 251-258.
8. Wolfram, S.: The Mathematica Book, 4th Ed., Cambridge University Press (1999).
9. Du, H. and Wery, L.: Micro: A Normalization Tool for Relational Database Designers, Journal of Network and Computer Applications, Vol.22 (1999) 215-232.
10. Manning, M.V.: Database Design, Application Development and Administration,
2nd. Ed., McGraw-Hill (2004).
11. Diederich, J. and Milton, J.: New Methods and Fast Algorithms for Database
Normalization, ACM Trans. on Database Systems, Vol.13, No.3(1988) 339-365.
12. Ozharahan, E.: Database Management: Concepts, Design and Practice, Prentice
Hall (1990).
13. Bernstein, P.A.: Synthesizing Third Normal Form Relations from Functional Dependencies,
ACM Trans. on Database Systems, Vol.1, No.4 (1976) 277-298.

Symbolic Manipulation of Bspline Basis


Functions with Mathematica
A. Iglesias1 , R. Ipanaque2 , and R.T. Urbina2
1

Department of Applied Mathematics and Computational Sciences,


University of Cantabria, Avda. de los Castros,
s/n, E-39005, Santander, Spain
iglesias@unican.es
2
Department of Mathematics, National University of Piura,
Urb. Miraores s/n, Castilla, Piura, Per
u

Abstract. Bspline curves and surfaces are the most common and most
important geometric entities in many fields, such as computer design and
manufacturing (CAD/CAM) and computer graphics. However, to our
knowledge no computer algebra package so far includes specialized symbolic
routines for dealing with Bsplines. In this paper, we describe a
new Mathematica program to compute the Bspline basis functions symbolically. The performance of the code along with a description of the
main commands is discussed by using some illustrative examples.

Introduction

Bspline curves and surfaces are the most common and most important geometric
entities in many fields, such as computer design and manufacturing (CAD/CAM)
and computer graphics. In fact, they have become the standard for computer representation, design and data exchange of geometric information in the automotive,
aerospace and ship-building industries [1]. In addition, they are very intuitive,
easy to modify and manipulate, thus allowing the designers to modify the shape
interactively. Moreover, the algorithms involved are quite fast and numerically
stable and, therefore, well suited for real-time applications in a variety of fields,
such as CAD/CAM [1,7], computer graphics and animation, geometric processing [5], artificial intelligence [2,3] and many others.
Although there is a wealth of powerful algorithms for Bsplines (see, for instance, [6]), they usually operate in a numerical way. Surprisingly, although there
is a large collection of very powerful general-purpose computer algebra systems,
none of them includes specific commands or specialized routines for dealing with
Bsplines symbolically. The present work is aimed at bridging this gap. This paper
describes a new Mathematica program for computing Bspline basis functions in
a fully symbolic way. Because these basis functions are at the core of almost any
algorithm for Bspline curves and surfaces, their efficient manipulation is a critical
step we have accomplished in this paper. The program is also able to deal with
Bspline curves and surfaces. However, this paper focuses on the computation of
Bspline basis functions because of limitations of space. The program has been



implemented in Mathematica v4.2 [8] although later releases are also supported.
The program provides the user with a highly intuitive, mathematical-looking
output consistent with Mathematicas notation and syntax [4].
The structure of this paper is as follows: Section 2 provides some mathematical background on Bspline basis functions. Then, Section 3 introduces the new
Mathematica program for computing them and describes the main commands
implemented within. The performance of the code is also discussed in this section
by using some illustrative examples.

Mathematical Preliminaries

Let T = {u_0, u_1, u_2, . . . , u_{r−1}, u_r} be a nondecreasing sequence of real numbers
called knots. T is called the knot vector. The i-th Bspline basis function N_{i,k}(t)
of order k (or equivalently, degree k − 1) is defined by the recurrence relations

    N_{i,1}(t) = 1 if u_i ≤ t < u_{i+1}, and 0 otherwise,    i = 0, 1, 2, . . . , r − 1    (1)

and

    N_{i,k}(t) = ((t − u_i)/(u_{i+k−1} − u_i)) N_{i,k−1}(t) + ((u_{i+k} − t)/(u_{i+k} − u_{i+1})) N_{i+1,k−1}(t)    (2)

for k > 1. Note that the i-th Bspline basis function of order 1, N_{i,1}(t), is a piecewise
constant function with value 1 on the interval [u_i, u_{i+1}), called the support of
N_{i,1}(t), and zero elsewhere. This support can be either an interval or reduce to
a point, as the knots u_i and u_{i+1} must not necessarily be different. If necessary, the
convention 0/0 = 0 in eq. (2) is applied. The number of times a knot appears in
the knot vector is called the multiplicity of the knot and has an important effect
on the shape and properties of the associated basis functions. Any basis function
of order k > 1, N_{i,k}(t), is a linear combination of two consecutive functions of
order k − 1, where the coefficients are linear polynomials in t, such that its order
(and hence its degree) increases by 1. Simultaneously, its support is the union of
the (partially overlapping) supports of the former basis functions of order k − 1
and, consequently, it usually enlarges.
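For readers who want to experiment before Section 3, the recurrence (1)-(2) can be prototyped directly in a few lines of Mathematica. The following is a minimal sketch under stated assumptions: it is independent of the authors' package, the name bspline is hypothetical, it assumes a numeric knot vector, and it uses Piecewise/PiecewiseExpand from current Mathematica versions.

(* N_{i,1}(t): indicator of [u_i, u_{i+1}); knots is {u_0, ..., u_r}, 1-indexed in Mathematica *)
bspline[i_, 1, knots_List, t_] :=
  Piecewise[{{1, knots[[i + 1]] <= t < knots[[i + 2]]}}, 0]

(* N_{i,k}(t) from eq. (2), with the 0/0 = 0 convention for repeated knots *)
bspline[i_, k_, knots_List, t_] := Module[{d1, d2, a, b},
  d1 = knots[[i + k]] - knots[[i + 1]];      (* u_{i+k-1} - u_i *)
  d2 = knots[[i + k + 1]] - knots[[i + 2]];  (* u_{i+k} - u_{i+1} *)
  a = If[d1 == 0, 0, (t - knots[[i + 1]])/d1];
  b = If[d2 == 0, 0, (knots[[i + k + 1]] - t)/d2];
  PiecewiseExpand[a bspline[i, k - 1, knots, t] + b bspline[i + 1, k - 1, knots, t]]]

(* e.g. bspline[0, 2, {1, 2, 3, 4, 5}, t] should give the hat function -1 + t on [1, 2),
   3 - t on [2, 3), and 0 elsewhere, matching Out[6] below. *)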

Symbolic Computation of Bspline Basis Functions

This section describes the Mathematica program we developed to compute the
Bspline basis functions in a fully symbolic way. For the sake of clarity, the program will be explained through some illustrative examples.
The main command, Ni,k[knots,var], returns the i-th Bspline basis function
of order k in the variable var associated with an arbitrary knot vector knots,
as defined by eqs. (1)-(2). For instance, eq. (1) can be obtained as:
In[1]:=N0,1[{ui ,ui+1 },t]
Out[1] := Which[t < u_i, 0, u_i ≤ t < u_{i+1}, 1, t ≥ u_{i+1}, 0]


where the output consists of several couples (condition, value) that reproduce the
structure of the right-hand side of eq. (1). The command Which evaluates those
conditions and returns the value associated with the first condition yielding True.
Our command PiecewiseForm displays the same output with an appearance closer to eq. (1):
In[2]:=PiecewiseForm[%]
Out[2] :=
    0    t < u_i
    1    u_i ≤ t < u_{i+1}
    0    t ≥ u_{i+1}
This output shows the good performance of these commands to handle fully
symbolic input. Let us now consider a symbolic knot vector of length 4 such as:
In[3]:=Array[x,4]
Out[3] := {x(1), x(2), x(3), x(4)}
Now, we compute the basis functions up to order 3 for this knot vector as
follows:
In[4]:=Table[Table[Ni,k[%,t] // PiecewiseForm,{i,0,3-k}],{k,1,3}]
Out[4] :=
Order 1:
    N0,1(t) = 1 for x(1) ≤ t < x(2), 0 otherwise
    N1,1(t) = 1 for x(2) ≤ t < x(3), 0 otherwise
    N2,1(t) = 1 for x(3) ≤ t < x(4), 0 otherwise
Order 2:
    N0,2(t) = (t − x(1))/(x(2) − x(1)) for x(1) ≤ t < x(2), (x(3) − t)/(x(3) − x(2)) for x(2) ≤ t < x(3), 0 otherwise
    N1,2(t) = (t − x(2))/(x(3) − x(2)) for x(2) ≤ t < x(3), (x(4) − t)/(x(4) − x(3)) for x(3) ≤ t < x(4), 0 otherwise
Order 3:
    N0,3(t) = (t − x(1))²/((x(2) − x(1))(x(3) − x(1))) for x(1) ≤ t < x(2),
              (t − x(1))(x(3) − t)/((x(3) − x(1))(x(3) − x(2))) + (t − x(2))(x(4) − t)/((x(4) − x(2))(x(3) − x(2))) for x(2) ≤ t < x(3),
              (x(4) − t)²/((x(4) − x(2))(x(4) − x(3))) for x(3) ≤ t < x(4),
              0 otherwise
Note that, according to eq. (2), the i-th basis function of order k is obtained
from the i-th and (i + 1)-th basis functions of order k − 1. This means that
the number of basis functions decreases as the order increases, and conversely.
Therefore, for the set of basis functions up to order 3 we compute the N_{i,k}, with
i = 0, . . . , 3 − k for k = 1, 2, 3. The whole set exhibits a triangular structure of
embedded lists in Out[4], one for each hierarchical level (i.e. for each order value).
The knot vectors can be classified into three groups. The first one is the
uniform knot vector; in it, each knot appears only once and the distance between


Fig. 1. (top-bottom, left-right) Bspline basis functions for the uniform knot vector
{1, 2, 3, 4, 5} and orders 1, 2, 3 and 4 respectively

consecutive knots is always the same. As a consequence, each basis function is


similar to the previous one but shifted to the right according to such a distance.
To illustrate this idea, let us proceed with a numerical knot vector so that the
corresponding basis functions can be displayed graphically. We compute the basis
functions of order 1 for the uniform knot vector {1, 2, 3, 4, 5}:
In[5]:=Table[Ni,1[{1,2,3,4,5},t] //PiecewiseForm,{i,0,3}]
Out[5] :=
    N0,1(t) = 1 for 1 ≤ t < 2, 0 otherwise
    N1,1(t) = 1 for 2 ≤ t < 3, 0 otherwise
    N2,1(t) = 1 for 3 ≤ t < 4, 0 otherwise
    N3,1(t) = 1 for 4 ≤ t < 5, 0 otherwise
From (2) we can see that the basis functions of order 2 are linear combinations
of these step functions of order 1 (shown in Figure 1 (top-left)). The coefficients
of such a linear combination are linear polynomials as well, so the resulting basis
functions are actually piecewise linear functions (see Fig. 1 (top-right)):
In[6]:=Table[Ni,2[{1,2,3,4,5},t] //PiecewiseForm,{i,0,2}]
Out[6] :=
    N0,2(t) = −1 + t for 1 ≤ t < 2, 3 − t for 2 ≤ t < 3, 0 otherwise
    N1,2(t) = −2 + t for 2 ≤ t < 3, 4 − t for 3 ≤ t < 4, 0 otherwise
    N2,2(t) = −3 + t for 3 ≤ t < 4, 5 − t for 4 ≤ t < 5, 0 otherwise
Similarly, the basis functions of order 3 are linear combinations of the basis
functions of order 2 in Out[6] according to (2):


In[7]:=Table[Ni,3[{1,2,3,4,5},t] //PiecewiseForm,{i,0,1}]
Out[7] :=
    N0,3(t) = (−1 + t)²/2 for 1 ≤ t < 2, −11/2 + 5t − t² for 2 ≤ t < 3, (4 − t)²/2 for 3 ≤ t < 4, 0 otherwise
    N1,3(t) = (−2 + t)²/2 for 2 ≤ t < 3, −23/2 + 7t − t² for 3 ≤ t < 4, (5 − t)²/2 for 4 ≤ t < 5, 0 otherwise
Note that we obtain two piecewise polynomial functions of degree 2 (i.e. order 3), displayed in Fig. 1(bottom-left), both having a similar shape but shifted
by length 1 with respect to each other. Finally, there is only one basis function of
order 4 for the given knot vector (the piecewise polynomial function of degree 3
in Fig. 1(bottom-right)):
In[8]:=N0,4[{1,2,3,4,5},t] //PiecewiseForm
Out[8] :=
    N0,4(t) = (−1 + t)³/6 for 1 ≤ t < 2,
              (31 − 45t + 21t² − 3t³)/6 for 2 ≤ t < 3,
              (−131 + 117t − 33t² + 3t³)/6 for 3 ≤ t < 4,
              (5 − t)³/6 for 4 ≤ t < 5,
              0 otherwise
One of the most exciting features of modern computer algebra packages is
their ability to integrate symbolic, numerical and graphical capabilities within
a unified framework. For example, we can easily display the basis functions of
Out[5]-Out[8] on the interval (1, 5):
In[9]:=Plot[Table[Ni,#[{1,2,3,4,5},t],{i,0,4-#}]
//Evaluate,{t,1,5},PlotStyle->Table[Hue[(i+1)/(5-#)],
{i,0,4-#}],DisplayFunction->Identity]& /@ Range[4];
In[10]:=Show[GraphicsArray[Partition[%,2],
DisplayFunction->$DisplayFunction]]
Out[10] := See Figure 1
A qualitatively different behavior is obtained when any of the knots appears
more than once (this case is usually referred to as a non-uniform knot vector).
An example is given by the knot vector {0, 0, 1, 1, 2, 2, 2}. In this case, the basis
functions of order 1 are given by:
In[11]:=Table[Ni,1[{0,0,1,1,2,2,2},t] // PiecewiseForm,{i,0,5}]
Out[11] :=
    N0,1(t) = 0 everywhere
    N1,1(t) = 1 for 0 ≤ t < 1, 0 otherwise
    N2,1(t) = 0 everywhere
    N3,1(t) = 1 for 1 ≤ t < 2, 0 otherwise
    N4,1(t) = 0 everywhere
    N5,1(t) = 0 everywhere


Note that the knot spans involving the same knot (t = 0, t = 1 or t = 2)
at both ends reduce to a single point. This causes some basis functions (N0,1,
N2,1, N4,1 and N5,1 in Out[11]) to be zero. This behavior continues until the order
reaches the multiplicity value of the multiple knot minus 2. For instance, there
is an identically null basis function of order 2, namely N4,2:
In[12]:=Table[Ni,2[{0,0,1,1,2,2,2},t] // PiecewiseForm,{i,0,4}]
Out[12] :=
    N0,2(t) = 1 − t for 0 ≤ t < 1, 0 otherwise
    N1,2(t) = t for 0 ≤ t < 1, 0 otherwise
    N2,2(t) = 2 − t for 1 ≤ t < 2, 0 otherwise
    N3,2(t) = −1 + t for 1 ≤ t < 2, 0 otherwise
    N4,2(t) = 0 everywhere
The basis functions of order 3 become:
In[13]:=Table[Ni,3[{0,0,1,1,2,2,2},t] // PiecewiseForm,{i,0,3}]

0
t<0
0
t<0

2t 2t2 0 t < 1
t2
0t<1

,
,
2

0
1

t
<
2
(2
+
t)
1

t
<
2

0
t

2
0
t

Out[13] :=
0
t<0
0
t < 0

0
0

t
<
1
0
0

t<1

2
2

4
+
6t

2t
1

t
<
2
(1
+
t)
1

t
<
2

0
t2
0
t2
Multiple knots do influence the shape and properties of basis functions; for
instance, each time a knot is repeated, the continuity of the basis functions whose
support includes this multiple knot decreases. In particular, the continuity of
N_{i,k} at an interior knot is C^{k−m−1} [6], m being the multiplicity of the knot. To
illustrate this fact, we compute the unique basis function of order 6:
In[14]:=(f6=N0,6[{0,0,1,1,2,2,2},t]) // PiecewiseForm
Out[14] :=
    N0,6(t) = (1/8)(10 − 7t) t⁴ for 0 ≤ t < 1,
              (1/8)(2 − t)³ (23t² − 32t + 12) for 1 ≤ t < 2,
              0 otherwise
As we can see, m = 2 for the knot t = 1 and hence N0,6 is C 3 -continuous at
this point. This implies that its third derivative, given by:
In[15]:=(f63=D[f6,{t,3}])//Simplify //PiecewiseForm
Out[15] :=
    (15/2)(4 − 7t) t for 0 ≤ t < 1
    (15/2)(−23t² + 68t − 48) for 1 ≤ t < 2


Fig. 2. (left) 6th-order basis function; (right) its third derivative

Fig. 3. Bspline curve and its control polygon (the set of segments connecting the control
points) for: (left) a non-periodic knot vector; (right) a uniform knot vector

is still continuous but no longer smooth (the continuity of tangent vectors is lost
at this point). Figure 2 displays both the basis function of order 6 (on the left)
and its third derivative (on the right):
In[16]:=Plot[#,{t,0,2},PlotStyle->{RGBColor[1,0,0]},
PlotRange->All]& /@ {f6,f63}
Out[16] := See Figure 2
The most common case of non-uniform knot vectors consists of repeating the
end knots as many times as the order, while interior knots appear only once
(such a knot vector is called a non-periodic knot vector). In general, a Bspline
curve does not interpolate any of the control points; interpolation only occurs
for non-periodic knot vectors (the Bspline curve does interpolate the end control
points) [6,7]. To illustrate this property, we consider the BSplineCurve command
(whose input consists of the list of control points pts, the order k, the knot vector
knots and the variable var), defined as:
In[17]:=BSplineCurve[pts_List,k_,knots_List,var_]:=
Module[{bs,n=Length[pts]},bs=Table[Ni,k[knots,var],{i,0,n-1}];
bs.pts // Simplify];
For instance, let us consider a set of 2D control points and two different knot
vectors (a non-periodic vector kv1 and a uniform knot vector kv2) and compute
the Bspline curve of order 3:


In[18]:=cp={{0,0},{2,-1},{4,9},{6,10},{8,5}};
In[19]:={kv1,kv2}={{0,0,0,1,2,3,3,3},{1,2,3,4,5,6,7,8}};
In[20]:=BSplineCurve[cp,3,#,t]& /@ {kv1,kv2};
In[21]:=MapThread[Show[Graphics[{RGBColor[1,0,0],Line[pts]}],
ParametricPlot[#1 //Evaluate,#2,PlotRange->All,
PlotStyle->RGBColor[0,0,1],DisplayFunction->Identity],
PlotRange->All,Frame->True,
DisplayFunction->$DisplayFunction]&,{%,{{t,0,3},{t,3,6}}}];
In[22]:=Show[GraphicsArray[%]]
Out[22] := See Figure 3
The curve interpolates the end control points in the first case, while no control
points are interpolated in the second case at all. For graphical purposes, the
support of the Bspline curves is restricted to the points such that Σ_{i=0}^{r−k} N_{i,k}(t) = 1.

The next input computes the graphical support for the curves in Fig. 3:
In[23]:= Σ_i Ni,3[#,t]& /@ {kv1,kv2} // PiecewiseForm
Out[23] :=
    For kv1: 1 for 0 ≤ t < 1, 1 for 1 ≤ t < 2, 1 for 2 ≤ t < 3, 0 otherwise
    For kv2: (t − 1)²/2 for 1 ≤ t < 2, (−t² + 6t − 7)/2 for 2 ≤ t < 3, 1 for 3 ≤ t < 4, 1 for 4 ≤ t < 5, 1 for 5 ≤ t < 6, −t²/2 + 6t − 17 for 6 ≤ t < 7, (t − 8)²/2 for 7 ≤ t < 8, 0 otherwise

This result makes evident that the Bspline curves in Fig. 3 must be displayed
on the intervals (0, 3) and (3, 6) respectively (see the last line of In[21]).
Acknowledgements. This research has been supported by the Spanish Ministry of Education and Science, Project Ref. #TIN2006-13615.

References
1. Choi, B.K., Jerard, R.B: Sculptured Surface Machining. Theory and Applications.
Kluwer Academic Publishers, Dordrecht/Boston/London (1998)
2. Echevarría, G., Iglesias, A., Gálvez, A.: Extending neural networks for B-spline
surface reconstruction. Lecture Notes in Computer Science, 2330 (2002) 305-314
3. Iglesias, A., Echevarría, G., Gálvez, A.: Functional networks for B-spline surface
reconstruction. Future Generation Computer Systems, 20(8), (2004) 1337-1353
4. Maeder, R.: Programming in Mathematica, Second Edition, Addison-Wesley, Redwood City, CA (1991)
5. Patrikalakis, N.M., Maekawa, T.: Shape Interrogation for Computer Aided Design
and Manufacturing. Springer Verlag (2002)


6. Piegl, L., Tiller, W.: The NURBS Book (Second Edition). Springer Verlag, Berlin
Heidelberg (1997)
7. Rogers, D.F.: An Introduction to NURBS. With Historical Perspective. Morgan
Kaufmann, San Francisco (2001)
8. Wolfram, S.: The Mathematica Book, Fourth Edition, Wolfram Media, Champaign,
IL & Cambridge University Press, Cambridge (1999)

Rotating Capacitor and a Transient Electric


Network
Haiduke Saraan1 and Nenette Saraan2
1

The Pennsylvania State University


University College
York, PA 17403
has2@psu.edu
2
Penn State Shock Trauma Center
The Milton S. Hershey Medical Center
Hershey, PA 17033
nsarafian@hmc.psu.edu

Abstract. The authors designed a rotating parallel-plate capacitor; one
of the plates is assumed to turn about the common vertical axis through
the centers of the square plates. We insert this capacitor in series
with a resistor, forming an RC circuit. We analyze the characteristics of
charging and discharging scenarios on two different parallel tracks. On
the first track we drive the circuit with a DC power supply. On the second
track, we drive the circuit with an AC source.
The analyses of the circuits encounter non-linear differential equations. We format the underlying equations into their generic forms. We
then apply Mathematica's NDSolve [1] to solve them. This work is an example showing how, with the help of Mathematica, one is able to augment
the scope of traditional studies.
Keywords: Mathematica, Electric Network, Geometry.

Introduction and Motivation

It is a far-fetched concept to think about a transient electrical circuit and incorporate its characteristics into a discrete and abstract geometrical problem. The
authors have even taken the initiative one step further, relating these two basic concepts to the kinematics of mechanics. In other words, this article shows
how these three discrete concepts are brought together and molded into one
coherent and unique project. To accomplish this, one needs to think creatively;
Mathematica is the tool of choice helping to explore the possibilities. This article,
including the Introduction, is composed of eight sections. In Section 2, we apply
Mathematica to evaluate the overlapping area of two squares rotating about
their common vertical axis. In Section 3 we incorporate the rotational kinematics
and consider two different scenarios: 1) a symmetrical, uniform rotation; and 2)
an asymmetrical, accelerated rotation.
In Sections 4-7, we view the overlapping squares as two parallel metallic
plates that are separated by a gap, forming a parallel-plate capacitor. Since the


204

H. Saraan and N. Saraan

Since the area of the overlapping plates determines the capacitance of the capacitor, the rotating plates make the capacitor a variable one. The technical literature, particularly Mathematica-based articles and reports, lacks one such view. It is the ultimate objective of this project to analyze the response of the electrical circuits to the kinematics of the rotating plates.
Specifically, in this article, we address the modifications of the basic responses of electrical circuits composed of a resistor connected in series with our designed, time-dependent capacitor. In particular, we analyze the characteristics of the RC circuits driven with DC as well as AC sources. In conjunction with our analysis, in Section 8 we close the article suggesting a few related research-flavored circuit analysis projects.

Analysis

Figure 1 shows two identical overlapping squares. The bottom square, designated with non-primed vertices, is fastened to the xy coordinate system. The top square, designated with primed vertices, is rotated counterclockwise about the common vertical axis through the common origin O by an angle θ. The squares have side length L, and the rotation angle θ is the angle between the semi-diagonals $OP_1$ and $OP_1'$.
To evaluate the overlapping area of these two squares we evaluate the area of the trapezoid oabco; the overlapping area then equals four times the latter. The intersecting points of the rotated sides of the top square with the sides of the bottom one are labeled a, b, and c. Utilizing the coordinates of these points, the area of the trapezoid is the sum of the areas of the two triangles abc and oac.

Fig. 1. Display of two rotated squares. The bottom square is fastened to the xy coordinate system; the top square is rotated counterclockwise by θ radians.


To evaluate the coordinates of a, b, and c we write the equations for the slanted lines, $P_4'P_1'$ and $P_1'P_2'$, and intersect them with the sides of the bottom square. Intersection of the former with $P_4P_1$ and $P_1P_2$ gives the coordinates of a and b, respectively. Similarly, the intersection of the latter with $P_1P_2$ yields the coordinates of c. These are:
$$a = \left(\tfrac{L}{2},\ \tfrac{L}{2}\tan\tfrac{\theta}{2}\right),\qquad b = \left(\tfrac{L}{2}\,\frac{1-\tan\frac{\theta}{2}}{1+\tan\frac{\theta}{2}},\ \tfrac{L}{2}\right),\qquad c = \left(-\tfrac{L}{2}\tan\tfrac{\theta}{2},\ \tfrac{L}{2}\right).$$
To evaluate the areas of the needed triangles, we convert the above coordinates into Mathematica code. The inserted 1s in the third position of the coordinates are for further calculations.

o = {0, 0, 1}
a[L_, θ_] = {L/2, L/2 Tan[θ/2], 1}
b[L_, θ_] = {L/2 (1 - Tan[θ/2])/(1 + Tan[θ/2]), L/2, 1}
c[L_, θ_] = {-L/2 Tan[θ/2], L/2, 1}

We define two auxiliary functions,

abc[L_, θ_] = {a[L, θ], b[L, θ], c[L, θ]}
oac[L_, θ_] = {o, a[L, θ], c[L, θ]}

The needed areas are

areaABC[L_, θ_] = 1/2 Det[abc[L, θ]]
areaOAC[L_, θ_] = 1/2 Det[oac[L, θ]]
areaOABCO[L_, θ_] = areaOAC[L, θ] + areaABC[L, θ]
We divide the overlapping area by the area of the square, L², and plot its normalized values as a function of the rotation angle θ. Figure 2 shows that the normalized area starts and ends at the same value. Its value after a π/4 radian turn drops to about 83% of the maximum value. The plot, as one anticipates, is symmetric about π/4.
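As a minimal illustration of this plot (a sketch of our own, assuming unit side length L = 1 and the area definitions given above), the curve of Fig. 2 can be reproduced with

Plot[4 areaOABCO[1, t], {t, 0, Pi/2},
  AxesLabel -> {"rotation angle (rad)", "normalized area"}]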

Fig. 2. The normalized values of the overlapping area of the squares as a function of the rotation angle θ

Modes of Mechanical Rotations

In this section we extend the analysis of Section 2. Here, instead of viewing the rotation as a discrete and purely geometrical concept, we view it as a kinematic process. We set the rotation angle θ = ωt; that is, we introduce the continuous time parameter t. For ω = 2π/T with the period T = 4 s, we explore


the uniform rotation. For the asymmetrical case, we consider a rotation with a constant angular acceleration. According to θ = ½αt², rotating the square by π/2 rad in one second yields α = π rad/s². The corresponding normalized overlapping areas are displayed in Fig. 3, produced with
Show[GraphicsArray[{UniformRotation, AcceleratedRotation}],
  DisplayFunction -> $DisplayFunction]
Fig. 3. The graphs are the normalized values of the overlapping areas for (a) a uniform rotation with ω = π/2 rad/s (left graph) and (b) a uniform angular acceleration with α = π rad/s² (right graph)

Electrical Networks

Now we consider an RC series circuit. One such circuit, driven by a DC power supply, is shown in Figure 4. The circuit is composed of two loops. Throwing the DPDT (Double-Pole Double-Throw) switch to the a's position charges the capacitor, while setting the switch to the b's position discharges the charged capacitor.
As we pointed out in the Introduction, in this section we view the overlapping squares as two parallel metallic plates that are separated by a gap forming

Fig. 4. The schematics of a DC-driven RC circuit. Throwing the DPDT switch onto the a's charges the capacitor, while throwing the switch onto the b's discharges the charged capacitor.


a parallel-plate capacitor. Since the capacitance of a parallel-plate capacitor is proportional to the overlapping area of the plates, the continuous rotation of the plates makes the capacitance time-dependent. It is the objective of this section to analyze the characteristic responses of one such time-dependent capacitor in the charging and discharging processes.

Characteristics of Charging and Discharging a DC-Driven RC Circuit with Time-Dependent, Uniformly Rotating Plates

For the charging process we apply the Kirchhoff circuit law [2]; this gives

$$\frac{dQ}{dt} + \frac{1}{\tau}\,\frac{A_0}{A(t)}\,Q(t) - \frac{1}{\tau} = 0 , \qquad (1)$$

For the sake of convenience, we assume $VC_0 = 1$, where $C_0$ is the capacitance of the parallel-plate capacitor with the plates completely overlapped; $Q(t)$ and $A(t)$ are the capacitor's charge and the overlapping area at time $t$, respectively; $A_0$ is the area of one of the squares; and $\tau = RC_0$ is the time constant of the circuit. For a constant capacitor $A(t) \equiv A_0$, and eq. (1) yields the standard solution $Q(t) = 1 - e^{-t/\tau}$. In this equation the maximum charge is normalized to unity.
For the rotating plates, however, eq. (1) does not have an analytic solution. We apply Mathematica's NDSolve along with an appropriate initial condition and solve the equation numerically; this yields Q(t). We graphically compare its characteristics vs. the characteristics of an equivalent constant-capacitor RC circuit; see Fig. 5.
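A minimal sketch of this step (our own illustration, not the authors' code), assuming the uniform rotation θ = ωt with ω = π/2 rad/s, unit squares so that A0/A(t) = 1/(4 areaOABCO[1, ωt]), the time constant τ = 1/6 s used later in the text, and eq. (1) written as above:

tau = 1/6;  omega = Pi/2;                      (* time constant in s, angular speed in rad/s *)
ratio[t_] := 1/(4 areaOABCO[1, omega t]);      (* A0/A(t) for unit squares *)
sol = NDSolve[{Q'[t] + ratio[t]/tau Q[t] - 1/tau == 0, Q[0] == 0}, Q, {t, 0, 1}];
Plot[{1 - Exp[-t/tau], Q[t] /. First[sol]}, {t, 0, 1}]   (* constant vs. rotating capacitor *)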
Similarly, we analyze the characteristics of the discharging process. Equation (1) for the corresponding discharging process is $\frac{dQ(t)}{dt} + \frac{1}{\tau}\frac{A_0}{A(t)}\,Q(t) = 0$. This equation for a constant capacitor, $A_0 = A(t)$, yields $\frac{dQ(t)}{dt} + \frac{1}{\tau}\,Q(t) = 0$, and gives $Q(t) = e^{-t/\tau}$. For the rotating capacitor, however, its solution is $Q(t) = e^{-\frac{1}{\tau}\int_0^t \frac{A_0}{A(t')}\,dt'}$. To evaluate the latter we apply Mathematica's NIntegrate. This yields the needed values. The results are displayed in Fig. 5.
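The corresponding evaluation for the discharge could be sketched as follows (again only an illustration, reusing ratio and tau from the sketch above; the table of points corresponds to the dots in Fig. 5):

Qdis[t_?NumericQ] := Exp[-NIntegrate[ratio[s], {s, 0, t}]/tau]   (* rotating capacitor *)
Show[Plot[Exp[-t/tau], {t, 0, 1}],
     ListPlot[Table[{t, Qdis[t]}, {t, 0.05, 1, 0.05}]]]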
Fig. 5. Display of charging, discharging and the overlapping area of the uniformly rotating plates. For the first two graphs from left to right, the outer and the inner curves/dots represent the constant and time-dependent capacitors, respectively. The far right graph is borrowed from Fig. 3.

It is interesting to note that the charging and discharging circuits respond


differently to the time-varying capacitors; the impact of the time-dependent


capacitor is more pronounced for the former. Moreover, for the chosen time constant τ = 1/6 s, although the constant capacitor reaches its plateau within one second, it appears the variable capacitor requires a longer time span.

Characteristics of Charging and Discharging a DC-Driven RC Circuit with Time-Dependent, Accelerated Rotating Plates

One may also comfortably apply the analysis of the previous section to generate the characteristic curves associated with the uniformly accelerated rotating plates. The Mathematica codes may easily be modified to yield the needed information. The modified codes along with the associated graphic outputs are summarized in Fig. 6.
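A minimal sketch of the kind of modification involved (our own, assuming the accelerated rotation θ = ½αt² with α = π rad/s², and reusing tau and the area function from above):

ratioAcc[t_] := 1/(4 areaOABCO[1, Pi t^2/2]);  (* theta = (1/2) alpha t^2 with alpha = Pi rad/s^2 *)
solAcc = NDSolve[{Q'[t] + ratioAcc[t]/tau Q[t] - 1/tau == 0, Q[0] == 0}, Q, {t, 0, 1}];
Plot[Q[t] /. First[solAcc], {t, 0, 1}]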

Fig. 6. Display of charging, discharging and the overlapping area of the uniformly accelerated rotating plates. The graph codes are the same as in Fig. 5. The far right graph is borrowed from Fig. 3.

To form an opinion about the characteristics of the charging curve for the variable capacitor, one needs to view it together with the far right graph. The rotating plates in this case are accelerated, so that for identical time intervals the overlapping area at the beginning of the run is greater than the overlapping area at the end of the interval. The effects of the asymmetrical rotation are most clearly visible at the tail of the curve. Similar to the uniform rotation (see the second plot of Fig. 5), the impact of the non-uniform rotation on the discharge circuit is negligible.

Characteristics of Charging and Discharging an AC-Driven RC Circuit with a Time-Dependent Capacitor

In this section we analyze the charging and the discharging characteristics of an RC series circuit driven with an AC source. Schematically speaking, this implies that in Fig. 4 we replace the DC power supply with an AC source. For this circuit, Kirchhoff's law yields

$$\frac{dQ}{dt} + \frac{1}{\tau}\,\frac{A_0}{A(t)}\,Q(t) - \frac{1}{\tau}\sin(2\pi f t) = 0 , \qquad (2)$$


In this equation f is the frequency of the signal, and the voltage amplitude is set to one volt.
Equation (2) is a non-trivial, non-linear differential equation. To solve eq. (2), we apply NDSolve along with the corresponding initial condition. The response of the circuit is compared to the equivalent circuit with a constant capacitor.
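A sketch of this step under the same assumptions as before, reusing ratio and tau, and taking an illustrative frequency f = 0.5 Hz (the specific value is our assumption, chosen below 1 Hz where the text notes the differences are pronounced):

f = 0.5;   (* Hz, assumed for illustration *)
solAC = NDSolve[{Q'[t] + ratio[t]/tau Q[t] - Sin[2 Pi f t]/tau == 0, Q[0] == 0}, Q, {t, 0, 1}];
Plot[Q[t] /. First[solAC], {t, 0, 1}]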

Fig. 7. Plot of the charge vs. time for the AC driver. The outer, inner and dashed curves are the capacitor's charge for the constant capacitor and for the time-dependent capacitors with uniform and accelerated rotations, respectively.

Utilizing the Mathematica code, one may analyze the frequency sensitivity of the circuit. As the result of one such analysis, we observe that the differences between these characteristics are pronounced provided the frequencies are set to less than 1 Hz.

Conclusions

As indicated in the Introduction, the authors have proposed a unique research project that brings together three different subject areas: geometry, mechanics, and electrical networks. Mathematica, with its flexible and easy-to-use capabilities, is chosen as the ideal tool to analyze the project and address the what-if scenarios. As pointed out in the text, some of the derived results are intuitively justified; for the hard-to-predict cases, we applied Mathematica to analyze the problem and to form an opinion. As an open-ended, research-oriented question, one may attempt to modify the presented analysis along with the accompanying codes to investigate the response of parallel RC circuits. It would also be complementary to our theoretical analysis to manufacture a rotating capacitor to supply experimental data.


References
1. S. Wolfram, The Mathematica Book, 5th Ed., Cambridge University Publication,
2003.
2. D. Halliday, R. Resnick, and J. Walker, Fundamentals of Physics, 7th Ed, New York:
John Wiley and Sons, 2005.

Numerical-Symbolic Matlab Program for the Analysis of Three-Dimensional Chaotic Systems

Akemi Gálvez

Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, E-39005, Santander, Spain
akemi.galvez@postgrado.unican.es

Abstract. In this paper, a new numerical-symbolic Matlab program for the analysis of three-dimensional chaotic systems is introduced. The program provides the users with a GUI (Graphical User Interface) that allows us to analyze any continuous three-dimensional system with a minimal input (the symbolic ordinary differential equations of the system along with some relevant parameters). Such an analysis can be performed either numerically (for instance, the computation of the Lyapunov exponents, the graphical representation of the attractor or the evolution of the system variables) or symbolically (for instance, the Jacobian matrix of the system or its equilibrium points). Some examples of the application of the program to analyze several chaotic systems are also given.

Introduction

The analysis of chaotic dynamical systems is one of the most challenging tasks in computational science. Because these systems are essentially nonlinear, their behavior is much more complicated than that of linear systems. In fact, even the simplest chaotic systems exhibit a wealth of different behaviors that can only be fully analyzed with the help of powerful hardware and software resources. This challenging issue has motivated an intensive development of programs and packages aimed at analyzing the range of different phenomena associated with chaotic systems.
Among these programs and packages, those based on computer algebra systems (CAS) have been receiving increasing attention during the last few years. Recent examples can be found, for instance, in [2,3,5] for Matlab, in [4,7,9,10,11,12,16] for Mathematica and in [17] for Maple, to mention just a few. In addition to their outstanding symbolic features, the CAS also include optimized numerical routines, nice graphical capabilities and, in a few cases such as Matlab, the possibility to generate appealing GUIs (Graphical User Interfaces).
In this paper, the abovementioned features have been successfully applied to generate a new numerical-symbolic Matlab program for the analysis of three-dimensional chaotic systems. The program provides the users with a GUI that allows us to analyze any continuous three-dimensional system with a minimal input (the symbolic ordinary differential equations of the system along with some

relevant parameters). Such an analysis can be performed either numerically (for instance, the computation of the Lyapunov exponents, the graphical representation of the attractor or the evolution of the system variables over time) or symbolically (for instance, the Jacobian matrix of the system or its equilibrium points). This paper describes the main components of the system as well as some of its most remarkable features. Some examples of the application of the program to analyze several chaotic systems are also given.

Program Architecture and Implementation

The program introduced in this paper is comprised of four different components:

1. A set of numerical libraries containing the implementation of the commands and functions designed for the numerical tasks. They have been generated using the native Matlab programming language, taking advantage of the wealth of numerical routines available in this system. Usually, these Matlab routines provide full control over a number of different options (such as the absolute and relative error tolerances, stopping criteria and others) and are fully optimized to offer the highest level of performance. In fact, this is one of the major strengths of the program and one of the main reasons to choose Matlab as its programming environment.
2. A set of symbolic routines and functions. They have been implemented using the Symbolic Math Toolbox, which provides access to several Maple routines for symbolic tasks.
3. The graphical commands for representation tasks. The powerful graphical capabilities of Matlab exceed those commonly available in other CAS such as Mathematica and Maple. Although our current needs do not require applying them to their full extent, they spare the users the tedious and time-consuming task of implementing many routines for graphical output by themselves. Some nice viewing features such as 3D rotation, zooming in and out, coloring and others are also automatically inherited from the Matlab windowing system.
4. A GUI. Matlab provides a mechanism to generate GUIs by using the so-called guide (GUI development environment). This feature is not commonly available in many other CAS so far. Although its implementation requires, for complex interfaces, a high level of expertise, it allows the end users to apply the program with minimal knowledge and input, thus facilitating its use and dissemination.
Regarding the implementation, this program has been developed by the author in Matlab v6.0 on a Pentium IV processor at 2.4 GHz with 512 MB of RAM. However, the program supports many different platforms, such as PCs (with Windows 9x, 2000, NT, Me and XP) and UNIX workstations. Figures in this paper correspond to the PC platform version. The graphical tasks are performed by using the Matlab GUI for the higher-level functions (windowing, menus, or input) while the built-in Matlab graphics commands are applied for


rendering purposes. The numerical kernel has been implemented in the native
Matlab programming language, and the symbolic kernel has been created by
using the commands of the Symbolic Math Toolbox.

Some Illustrative Examples

In this section we show some applications of the program through some illustrative examples.
3.1 Visualization of Chaotic Attractors

This example is aimed at showing the numerical and graphical features of the program. Figure 1 shows a screenshot of a typical session for the visualization of chaotic attractors. The different windows involved in this task have been numbered for the sake of clarity: #1 indicates the main window of the program, from where the other windows show up when invoked. The workflow is as follows: firstly, the user inputs the system equations (upper part of window #1), which are expressed symbolically. At this stage, only continuous three-dimensional flows, described by a system of ordinary differential equations (ODEs), are considered. For instance, in Fig. 1 we consider Chua's circuit, given by:
$$\begin{cases} x' = \alpha\,[\,y - x - f(x)\,] \\ y' = x - y + z \\ z' = -\beta\,y \end{cases} \qquad (1)$$

where

$$f(x) = b\,x + \tfrac{1}{2}(a - b)\,\bigl[\,|x + 1| - |x - 1|\,\bigr] \qquad (2)$$

is the 3-segment piecewise-linear characteristic of the nonlinear resistor (Chua's diode), and α, β, a and b are the circuit parameters. Then, the user declares the system parameters and their values. In our example, we consider α = 8.9, β = 14.28, a = -1.14 and b = -0.71, for which the system exhibits chaotic behavior
[1]. In order to display the attractor and/or the evolution of the system variables over time, some kind of numerical integration is required. The lower part of window #1 allows the user to choose among different numerical integration methods [15], including the classical Euler and 2nd- and 4th-order Runge-Kutta methods (implemented by the author) along with some more sophisticated methods from the Matlab kernel such as ode45, ode23, ode113, ode15s, ode23s, ode23t and ode23tb (see [14] for details). Some input required for the numerical integration (such as the initial point and the integration time) is also given at this stage. By pressing the 'Numerical Integration settings' button, window #2 appears and some additional options (such as the absolute and relative error tolerances, the initial and maximum stepsize and refinement, the computation speed and others) can be set up. Once chosen, the user proceeds to the graphical representation stage, where he/she can display the attractor of the dynamical system and/or the evolution of any of the system variables over time. Such variables can


Fig. 1. Screenshots of the Matlab program for the analysis of chaotic systems: general
setup for the visualization of chaotic attractors

be depicted on the same or on different axes and windows. The 'Graphical Representation settings' button opens window #3, where different graphical options such as the line width and style, markers for the equilibrium points, and others (including some coloring options leading to window #4) can be defined. The final result is the graphical output shown in window #5, where the double scroll attractor is displayed.


Fig. 2. Symbolic computation of the Jacobian matrix for the Lorenz system

3.2 Symbolic-Numerical Analysis of Chaotic Systems

An appealing feature of this program is the possibility to analyze the chaotic systems either symbolically or numerically. Figure 2 shows an example for the well-known Lorenz system [13], given by:

$$\begin{cases} x' = \sigma\,(y - x) \\ y' = R\,x - y - x z \\ z' = x y - b z \end{cases} \qquad (3)$$

where σ, R and b are the system parameters. The program includes a module for the computation of the Jacobian matrix and the equilibrium points of any three-dimensional flow. The Jacobian matrix is a square matrix whose entries are the partial derivatives of the system equations with respect to the system variables. If no value for the system parameters is provided, the computation is performed symbolically and the corresponding output depends on those system parameters. Figure 2 shows the Jacobian matrix for the Lorenz system, which depends not only on the system parameters but also on the system variables. Otherwise, the computations are performed numerically.
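For reference, the symbolic Jacobian of system (3), which is the kind of output displayed in Fig. 2, is

$$J(x, y, z) = \begin{pmatrix} -\sigma & \sigma & 0 \\ R - z & -1 & -x \\ y & x & -b \end{pmatrix},$$

and, for $R > 1$, the equilibrium points of (3) are $(0, 0, 0)$ and $\bigl(\pm\sqrt{b(R-1)},\ \pm\sqrt{b(R-1)},\ R-1\bigr)$.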


Fig. 3. Numerical computation of the equilibrium points and the Lyapunov exponents
of the Lorenz system

For instance, once some parameter values are given (σ = 10, R = 60 and b = 8/3 in this example), the Lyapunov exponents (LE) of the system can be numerically computed. To this purpose, a numerical integration method is applied. Figure 3 shows the window in which the different options for this numerical integration process can be chosen (left window), along with the graphical representation of the three Lyapunov exponents over time (right window). As shown in the figure, the numerical values of these LE are 1.4, 0.0022 and -15, respectively. Roughly speaking, the LE are a generalization of the eigenvalues for nonlinear flows. They are intensively applied to analyze the behavior of nonlinear systems, since they indicate whether small displacements of trajectories are along stable or unstable


Fig. 4. (left) Attractor and equilibrium points of the Lorenz system analyzed in Figure 3; (right) evolution of the system variables over the time

directions. In particular, a negative LE indicates that the trajectory evolves along the stable direction for this variable (and hence regular behavior for that variable is obtained), while a positive value indicates chaotic behavior. Because in our example we find a positive LE, the system exhibits chaotic behavior. This fact is evidenced in Figure 4 (left), where the corresponding attractor of the Lorenz system for our choice of the system parameters is displayed. The figure also displays the equilibrium points of the Lorenz system for our choice of the system parameters. Their corresponding numerical values are shown in the main window of Figure 3. Finally, Figure 4 (right) shows the evolution of the system variables over time from t = 0 to t = 200.

Conclusions and Further Remarks

In this paper, a new numerical-symbolic Matlab program for the analysis of three-dimensional continuous chaotic systems has been introduced. The system allows the user to compute the Jacobian matrix, the equilibrium points and the Lyapunov exponents of any chaotic three-dimensional flow, as well as to display graphically the attractor and/or the system variables. Some examples of the application of the program have also been briefly reported. Future work includes the extension of this program to the case of discrete systems, the implementation of specialized routines for the control of chaos [6,7,8] and the synchronization of chaotic systems [10,11]. This research has been supported by the Spanish Ministry of Education and Science, Project Ref. #TIN2006-13615.


References
1. Chua, L.O., Komuro, M., Matsumoto, T.: The double-scroll family. IEEE Transactions on Circuits and Systems, 33 (1986) 1073-1118
2. Dhooge, A., Govaerts, W., Kuznetsov, Y.A.: Matcont: A Matlab package for numerical bifurcation analysis of ODEs. ACM Transactions on Mathematical Software, 29(2) (2003) 141-164
3. Dhooge, A., Govaerts, W., Kuznetsov, Y.A.: Numerical continuation of fold bifurcations of limit cycles in MATCONT. Proceedings of CASA'2003. Lecture Notes in Computer Science, 2657 (2003) 701-710
4. Gálvez, A., Iglesias, A.: Symbolic/numeric analysis of chaotic synchronization with a CAS. Future Generation Computer Systems (2007) (in press)
5. Govaerts, W., Sautois, B.: Phase response curves, delays and synchronization in Matlab. Proceedings of CASA'2006. Lecture Notes in Computer Science, 3992 (2006) 391-398
6. Gutiérrez, J.M., Iglesias, A., Güémez, J., Matías, M.A.: Suppression of chaos through changes in the system variables through Poincaré and Lorenz return maps. International Journal of Bifurcation and Chaos, 6 (1996) 1351-1362
7. Gutiérrez, J.M., Iglesias, A.: A Mathematica package for the analysis and control of chaos in nonlinear systems. Computers in Physics, 12(6) (1998) 608-619
8. Iglesias, A., Gutiérrez, J.M., Güémez, J., Matías, M.A.: Chaos suppression through changes in the system variables and numerical rounding errors. Chaos, Solitons and Fractals, 7(8) (1996) 1305-1316
9. Iglesias, A.: A new scheme based on semiconductor lasers with phase-conjugate feedback for cryptographic communications. Lecture Notes in Computer Science, 2510 (2002) 135-144
10. Iglesias, A., Gálvez, A.: Analyzing the synchronization of chaotic dynamical systems with Mathematica: Part I. Proceedings of CASA'2005. Lecture Notes in Computer Science, 3482 (2005) 472-481
11. Iglesias, A., Gálvez, A.: Analyzing the synchronization of chaotic dynamical systems with Mathematica: Part II. Proceedings of CASA'2005. Lecture Notes in Computer Science, 3482 (2005) 482-491
12. Iglesias, A., Gálvez, A.: Revisiting some control schemes for chaotic synchronization with Mathematica. Lecture Notes in Computer Science, 3516 (2005) 651-658
13. Lorenz, E.N.: Journal of Atmospheric Sciences, 20 (1963) 130-141
14. The Mathworks Inc.: Using Matlab. Natick, MA (1999)
15. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes (2nd edition). Cambridge University Press, Cambridge (1992)
16. Sarafian, H.: A closed form solution of the run-time of a sliding bead along a freely hanging slinky. Proceedings of CASA'2004. Lecture Notes in Computer Science, 3039 (2004) 319-326
17. Zhou, W., Jeffrey, D.J., Reid, G.J.: An algebraic method for analyzing open-loop dynamic systems. Proceedings of CASA'2005. Lecture Notes in Computer Science, 3516 (2005) 586-593

Safety of Recreational Water Slides: Numerical Estimation of the Trajectory, Velocities and Accelerations of Motion of the Users

Piotr Szczepaniak and Ryszard Walentyński

Silesian University of Technology, Faculty of Civil Engineering, ul. Akademicka 5, PL-44-100 Gliwice, Poland
Piotr.Szczepaniak@polsl.pl, Ryszard.Walentynski@polsl.pl
http://www.kateko.rb.polsl.pl

Abstract. The article briefly shows how to estimate the safety of recreational water slides by numerical analysis of the motion of the users. The following are presented: a mathematical description of a typical water slide's geometry, a simplified model of a sliding person, a model of contact between the user and the inner surface of the slide, equations of motion written for a rigid body with 6 degrees of freedom and, finally, some sample results compared to the limitations set by the current European Standard.

Keywords: water slide, safety, water park, motion, dynamics, finite difference, numerical integration, modeling, Mathematica.

Introduction

Water slides are one of the most popular facilities in water parks. They are built all over the world. One of the most important problems associated with them is safety. It is not well recognized, either mathematically or technically. There are very few scientific papers concerning the problem. The most complete publication [1] deals with the mathematical model of the water sliding process under the assumption that the user has constant contact with the inner surface of the chute. Actually, the most dangerous situations happen when the user loses contact and cannot control the ride. The next problem is acceleration, often called G-load. The human body, and especially the brain and heart, is sensitive to acceleration that significantly exceeds the gravity acceleration g. Several accidents in a variety of countries have resulted in severe injuries and even death.
The contemporary practice, due to the lack of a design methodology, consists in testing by a water slides expert after finishing the construction; however, this expert is described only as a 'fit person, dressed in bathing suit' [2]. This is not acceptable from the point of view of modern engineering philosophy, which requires prior analysis based on a mathematical model, which should be verified experimentally.

Fig. 1. Basic elements of water slides

Typical Geometry of Water Slides

Most water slides (especially those built high over the ground level) are constructed of two basic types of elements: the straight ones, being just a simple cylinder, and the curved ones, having the shape of a slice of a torus. All these elements are equipped on both ends with flanges that allow connecting the successive parts of the slide with screws (see Figs. 1 and 2). Because the resulting axis of the chute consists of straight lines and circular arcs, it cannot be described by a single simple equation; some kind of interval-wise (piecewise) function is needed.
Fig. 2. A sample shape of a chute

The best choice seems to be the InterpolatingFunction within the Mathematica system. To obtain a parametric equation of the axis, one must first calculate a set of coordinates of discrete points lying on the axis at small intervals. This can be done using the procedures described in [3] or with the aid of any software for 3-D graphics. The next step is to build an InterpolatingFunction for each of the coordinates separately and finally join them into one vector function axis[l]. Having this function, one can get the parametric equation of the surface of the slide, surface[l,phi,radius], using the following Mathematica code:


surface[l_, phi_, radius_] := axis[l] +
  radius * Module[{vecda, lda13, lda12},
    vecda = axis'[l];                (* tangent vector of the axis *)
    lda13 = Sqrt[Sum[vecda[[i]]^2, {i, 3}]];
    lda12 = Sqrt[Sum[vecda[[i]]^2, {i, 2}]];
    {{vecda[[1]]/lda13, -vecda[[2]]/lda12, -(vecda[[1]]*vecda[[3]])/(lda13*lda12)},
     {vecda[[2]]/lda13,  vecda[[1]]/lda12, -(vecda[[2]]*vecda[[3]])/(lda13*lda12)},
     {vecda[[3]]/lda13, 0, lda12/lda13}} . {0, Cos[phi], Sin[phi]}]
Within the above code l denotes the position of the current cross-section, measured along the axis; phi and radius are the cylindrical coordinates of the points creating the cross-section. Figure 2 has been created with this code and the ParametricPlot3D command.
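A sketch of how axis[l] itself could be assembled from the sampled points (our own illustration; pts, the list of sampled axis points, and dl, the sampling interval, are assumed names that do not appear in the paper):

dl = 0.1;                                      (* assumed arc-length step between sampled points *)
fx = Interpolation[Table[{(i - 1) dl, pts[[i, 1]]}, {i, Length[pts]}]];
fy = Interpolation[Table[{(i - 1) dl, pts[[i, 2]]}, {i, Length[pts]}]];
fz = Interpolation[Table[{(i - 1) dl, pts[[i, 3]]}, {i, Length[pts]}]];
axis = Function[l, {fx[l], fy[l], fz[l]}];     (* pure function, so that axis'[l] above yields the tangent *)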

Model of a Sliding Person

The next task is to create a model of the human body. It is obvious that a complete bio-mechanical model, with all its degrees of freedom (DOF) and parameters, would be best, but unfortunately it is almost impossible to predict the values of the dimensions, mass, moments of inertia and stiffness of all parts of the user's body, especially at the design stage of the construction process. That is why a simpler model is needed.
To create it, one can notice that sliding people are quite stiffened and that the fastest users touch the surface of the chute only with their heels and blade-bones. This allows us to replace the sliding person by a rigid body, constrained by 3 unilateral supports located in the vicinity of the previously mentioned parts of the body (spheres at the vertices of the triangle representing the body in Fig. 3). Such a body has 6 DOFs: translations x_i and rotations φ_i around the 3 axes of the local (ξ_i) or global (X_i) coordinate system, and it is subjected to the influence of the following forces: gravity F_G, normal contact forces F_Ni and friction forces F_Ti, shown in Fig. 4. The vectors of these forces can be calculated using the following formulae:

0 if (ui < 0) [(k ui + c ui u i ) < 0]
F Ni =
,
(1)
(k ui + c ui u i ) ni otherwise
F Ti = |F Ni |

v Ti
,
|v Ti |

FG = mg ,
F Sum =

3

i=1

(F Ni + F Ti ) + F G ,

(2)
(3)
(4)


Fig. 3. Replacement of a user by a model of a rigid body

Fig. 4. Model of contact between the moving body (grey circles) and the inner surface
of the chute (dotted line). Dot (black ) in the upper left corner denotes the center of
the current cross-section (axis of the slide).

$$\boldsymbol{M}_{Sum} = \sum_{i=1}^{3} \boldsymbol{p}_i \times (\boldsymbol{F}_{Ni} + \boldsymbol{F}_{Ti}) , \qquad (5)$$

where:
i - number of the current zone of contact,
k, c - constants of the quasi viscous-elastic model of the human body,
u_i - deflection of the zone of contact,
n_i - unit vector, normal to the surface of the slide,
μ - coefficient of friction,
v_Ti - tangent component of the velocity vector,
m - mass of the user,
g - vector of gravitational acceleration,
F_Sum - summary force,
M_Sum - summary moment of forces.
At this stage the main problem is calculating u_i and n_i. The best way to solve it is to find on the axis of the slide those points that are nearest to the current

positions of the centers of the contact spheres, because vectors ni must lie on
lines connecting these pairs of points. It can be easily done with the Mathematica
FindMinimum command.
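A possible sketch of this search for one contact sphere (our own illustration; P denotes the sphere's current centre and lguess a starting value for the arc-length parameter, both assumed given):

nearestOnAxis[P_, lguess_] := Module[{s, l0, d},
  l0 = s /. Last[FindMinimum[(axis[s] - P).(axis[s] - P), {s, lguess}]];
  d = P - axis[l0];
  {l0, d/Sqrt[d.d]}   (* arc-length parameter of the nearest axis point and the unit vector towards P *)
]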

Equations of Motion

The applied equations of motion are based on the well-known Newton's laws of motion [4]:

$$\frac{d^2}{dt^2}\bigl[m\,\boldsymbol{x}(t)\bigr] = \boldsymbol{F}_{Sum} , \qquad (6)$$

$$\frac{d}{dt}\bigl[\boldsymbol{K}(t)\bigr] = \boldsymbol{M}_{Sum} , \qquad (7)$$

$$\boldsymbol{K}(t) = \boldsymbol{A}(t)\,\boldsymbol{J}\,\boldsymbol{\omega}(t) , \qquad (8)$$

$$\boldsymbol{A}(t) = \begin{pmatrix} \cos\varphi_3 & -\sin\varphi_3 & 0\\ \sin\varphi_3 & \cos\varphi_3 & 0\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\varphi_2 & 0 & \sin\varphi_2\\ 0 & 1 & 0\\ -\sin\varphi_2 & 0 & \cos\varphi_2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0\\ 0 & \cos\varphi_1 & -\sin\varphi_1\\ 0 & \sin\varphi_1 & \cos\varphi_1 \end{pmatrix} , \qquad (9)$$

(writing $\varphi_i$ for $\varphi_i(t)$),

$$\boldsymbol{J} = \begin{pmatrix} J_1 & 0 & 0\\ 0 & J_2 & 0\\ 0 & 0 & J_3 \end{pmatrix} , \qquad (10)$$

$$\boldsymbol{\omega}(t) = \frac{d\varphi_1(t)}{dt}\begin{pmatrix}1\\0\\0\end{pmatrix} + \frac{d\varphi_2(t)}{dt}\begin{pmatrix}0\\ \cos\varphi_1(t)\\ -\sin\varphi_1(t)\end{pmatrix} + \frac{d\varphi_3(t)}{dt}\begin{pmatrix}-\sin\varphi_2(t)\\ \sin\varphi_1(t)\cos\varphi_2(t)\\ \cos\varphi_1(t)\cos\varphi_2(t)\end{pmatrix} , \qquad (11)$$

where:
K(t) - vector of moment of momentum (angular momentum),
A(t) - matrix of transformation from the local (rotating) to the global (fixed) coordinate system,
J - tensor of main moments of inertia,
ω(t) - vector of angular velocity.
As one can see, the equations of motion are so complicated that it is impossible to obtain an analytical solution. In fact, even NDSolve, Mathematica's numerical solver of differential equations, does not work here, due to the usage of the FindMinimum command within the equations of motion (see Sect. 3). So a special code had to be written; it follows an algorithm based on a combination of the Taylor series and multi-step methods [5]:


Input -> totalTime, deltaT, initialcoordinates, initialvelocity,
         function acceleration[coordinates, velocities];
steps = totalTime / deltaT;
coord[[1]] = initialcoordinates;
vel[[1]] = initialvelocity;
acc[[1]] = acceleration[coord[[1]],vel[[1]]];
coord[[2]] = coord[[1]] + vel[[1]] * deltaT
+ 0.5 acc[[1]] * deltaT^2;
vel[[2]] = vel[[1]] + acc[[1]] * deltaT;
Do[acc[[i]] = acceleration[coord[[i]],vel[[i]]];
coord[[i+1]] = 2 * coord[[i]] - coord[[i-1]]
+ acc[[i]] * deltaT^2;
vel[[i]] = 0.5 (coord[[i+1]] - coord[[i-1]]) / deltaT,
{i, 2, steps + 1}];
Output -> lists: coord[[i]], vel[[i]], acc[[i]].
Of course, each single element of the output lists coord[[i]], vel[[i]] and acc[[i]] is a 6-dimensional vector. This code seems to be stable, and sufficient precision is reached with deltaT = 0.002 [second].
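For concreteness, a compact Mathematica transcription of the scheme above might look as follows (our own sketch of the published pseudocode, not the authors' production code; acceleration must be a user-supplied function of the 6-dimensional coordinate and velocity vectors):

integrate[acceleration_, x0_, v0_, totalTime_, deltaT_] :=
 Module[{steps = Round[totalTime/deltaT], coord, vel, acc},
  coord = ConstantArray[0., {steps + 2, 6}]; vel = coord; acc = coord;
  coord[[1]] = x0; vel[[1]] = v0;
  acc[[1]] = acceleration[coord[[1]], vel[[1]]];
  coord[[2]] = coord[[1]] + vel[[1]] deltaT + 0.5 acc[[1]] deltaT^2;
  vel[[2]] = vel[[1]] + acc[[1]] deltaT;
  Do[
   acc[[i]] = acceleration[coord[[i]], vel[[i]]];
   coord[[i + 1]] = 2 coord[[i]] - coord[[i - 1]] + acc[[i]] deltaT^2;   (* central-difference step *)
   vel[[i]] = 0.5 (coord[[i + 1]] - coord[[i - 1]])/deltaT,
   {i, 2, steps + 1}];
  {coord, vel, acc}]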

Sample Results

Some sample results were obtained for the slide shown in Fig. 2. The trajectory of the motion and the axis of the slide are shown in Fig. 5. The beginning is at the point (0,0); the body first goes along the direction of the X2 axis, then turns left and follows the curved axis of the slide, slightly swinging. More

Fig. 5. Top view of the axis of the slide (dotted line) and the trajectory of motion (bold
line)


Fig. 6. Sample results (description in text)

Fig. 7. Results of the G-load calculations. Shades of grey denote different values of G.

detailed results can be read from Fig. 6. There are presented: the G-load acting on the moving body,

$$G = \frac{|\boldsymbol{F}_{Sum} - \boldsymbol{F}_{G}|}{9.81\,m} , \qquad (12)$$


the value of the velocity, and the first and second angular coordinates (longitudinal and transversal rotations around the ξ1 and ξ2 axes; see Fig. 3), where the length of the chute is marked on the horizontal axes of these charts.
The current design code [2] sets some limitations on the values of the permissible G-load for safety reasons. It says that G ≤ 2.6 g is safe, and 2.6 g < G ≤ 4.0 g is acceptable, but only for less than 0.1 second. Within the presented example these limitations are kept, but sometimes it is hard to do so. Especially the second condition can cause some problems when designing a very steep and fast water slide, where the high speed generates a huge centrifugal force at each bend.

Conclusions

Numerical modelling of motion is a good method of checking the geometry of water slides during the design process. It allows one to estimate the excitement level of the ride, which depends on speed and acceleration, and it points out the dangerous parts of the slide, where the value of the G-load is too high (exceeds 2.6 or 4.0 g). The applied algorithms are user-friendly, and the results of the computations can be presented in many forms, from pure numbers to pictures like Fig. 7 and animations of the motion.
Acknowledgments. This paper has been supported by the Polish Committee of Scientific Research (grant No. 0416/T02/2006/31).

References
1. Joo, S.-H., Chang, K.-H.: Design for the safety of recreational water slides. Mech. Struct. & Mach. 29 (2001) 261-294
2. European Standard EN 1069-1:2000 Water slides of 2 m height and more - Part 1: Safety requirements and test methods. CEN, Bruxelles (2000)
3. Szczepaniak, P.: Zjeżdżalnie wodne. Obliczanie geometrii zjeżdżalni i modelowanie ruchu użytkownika (Water slides. Calculating the geometry and modelling the motion of a user). MSc thesis. Silesian University of Technology, Gliwice (2003)
4. Borkowski, Sz.: Mechanika ogólna. Dynamika Newtonowska (General Mechanics. Newton's Dynamics). 2nd edn. Silesian University of Technology Press, Gliwice (1995)
5. Burden, R.L., Faires, J.D.: Numerical Analysis. 5th edn. PWS Publishing Company, Boston (1993)

Computing Locus Equations for Standard Dynamic Geometry Environments

Francisco Botana, Miguel A. Abánades, and Jesús Escribano

Departamento de Matemática Aplicada I, Universidad de Vigo, Campus A Xunqueira, 36005 Pontevedra, Spain
fbotana@uvigo.es
Ingeniería Técnica en Informática de Sistemas, CES Felipe II (UCM), 28300 Aranjuez, Spain
mabanades@cesfelipesegundo.com
Departamento de Sistemas Informáticos y Computación, Universidad Complutense de Madrid, 28040 Madrid, Spain
escribano@sip.ucm.es

Abstract. GLI (Geometric Locus Identifier), an open web-based tool to determine equations of geometric loci specified using Cabri Geometry and The Geometer's Sketchpad, is described. A geometric construction of a locus is uploaded to a Java Servlet server, where two computer algebra systems, CoCoA and Mathematica, following the Groebner basis method, compute the locus equation and its graph. Moreover, an OpenMath description of the geometric construction is given. GLI can be efficiently used in mathematics education, as a supplement to the locus functions of the standard dynamic geometry systems. The system is located at http://nash.sip.ucm.es/GLI/GLI.html.

Keywords: Interactive geometry, Automated deduction, Locus, OpenMath.

Introduction

Dynamic geometry programs are probably the most used computer tools in mathematics education today, from elementary to pre-graduate level. At the same time, computer algebra systems are widely employed in the learning of scientific disciplines, generally starting from high school. Nevertheless, the top-ranked programs in both fields have evolved separately: although almost all of the computer algebra systems offer some specialized library for the study of geometry (cf. Mathematica, Maple, Derive, ...), none of them use the dynamic paradigm, that is, in their geometric constructions free elements cannot be graphically dragged making dependent elements behave accordingly. On the other side, standard dynamic geometry environments, such as Cabri Geometry [11] and The Geometer's Sketchpad [10], are self-contained: whenever, albeit rarely, they need some symbolic computation, they use their own resources.
Some attempts at connecting both types of systems have been reported, mainly coming from academia. Besides the above mentioned geometric libraries, specialized packages for geometry using the symbolic capabilities of computer algebra

systems exist (see, for instance, [1]). Nevertheless, they lack the dynamic approach, being more an environment for exact drawing than a tool for interactive experiments. We can also cite [12] as an interesting work, where a dynamic geometry system is developed inside Mathematica through an exhaustive use of MathLink.
The approach of using computer algebra algorithms in dynamic geometry has been more fruitful. Two ways have been devised for dealing with this cooperation. Some systems incorporate their own code for coping with algebraic techniques in geometry ([14,8,9], ...), while other systems emphasize reusing software ([2,15,17], ...). Both strategies have been partially successful in solving some of the three main points in dynamic geometry, that is, the continuity-determinism dilemma, the proof and discovery abilities, and the complete determination of loci (see [6] for an extensive study of the subject).
This paper describes a web-based resource allowing the remote discovery of equations of algebraic loci specified through the well-known Cabri and The Geometer's Sketchpad environments. Moreover, OpenMath has been chosen as the communicating language, allowing other OpenMath-compliant systems to make use of the tool.

Numerical vs. Symbolic Loci

An astonishing feature of dynamic geometry systems is their ability to draw the path of an object dependent on another one while this last element is dragged along a predetermined path. The trajectory of the first object is called its locus. Since their earliest versions, both Cabri and The Geometer's Sketchpad have offered graphical loci generation through their trace mode. Roughly speaking, the strategy consists of sampling the path of the dragged object and constructing, for each sample, the position of the point generating the locus. An interpolation of these support points returns the locus as a graphically continuous object on the screen. The heuristics used for the interpolation produce anomalous loci in some border cases, and this strategy is not well suited to producing the equation of the obtained loci. Reacting against this drawback, the newest version of Cabri incorporates a tool for computing approximate algebraic expressions of loci. Although this new feature is not documented, as is usual in the market considerations of the Cabrilog company, the algorithm is sketched in [16] as follows: 1) random selection of about one hundred locus supporting points, and 2) calculation of the (n + 1)(n + 2)/2 coefficients of the bivariate locus equation of degree n, from n = 1 onwards and using a system of equations derived from the support points, until they approximately satisfy the locus equation.
There is no doubt about the interest of functions returning equations of loci, as Cabri does, although in an approximate way. Nevertheless, no comment is made on the exactness of the equation (hence inducing an inexpert user to take it as an accurate one). Furthermore, some loci are described by equations of different degree if the user exploits the dynamic character of the environment. We have shown in [3] the superior performance of symbolic approaches for dealing with

loci: our proposal also obtains the equations of algebraic loci and rules out some anomalous results produced by standard systems. The equations are sound (since the underlying Groebner bases method is), and the method is fast enough to be integrated in a dynamic environment, as tested with GDI (see [4], where the detailed algorithm is described).

User Interface and Architecture

Although conceived in the spirit of a plug-in for Cabri and The Geometer's Sketchpad, simplicity and convenience of use were the guiding lines in the design of GLI. Consequently, its interface (http://nash.sip.ucm.es/GLI/GLI.html) has been designed to look like a simple web page (see Figure 1). The user makes use of an applet to create a text file that is uploaded to the server. The equation and graph of the locus are then displayed in a new browser window.

Fig. 1. User interface

The Cabri or The Geometer's Sketchpad file with the original specification of the locus undergoes a double translating process. First, the original file is processed as a text file producing an XML codification of an OpenMath element. OpenMath is an extensible standard for representing the semantics of mathematical objects (see [13]). This OpenMath description is then translated into webDiscovery code, which is the description that the final application is designed

to interpret. webDiscovery is a web application developed by Botana (see [5]) capable of performing a wide variety of discovery tasks, whose kernel has been appropriately modified to be integrated in GLI as the final computational tool.
Unlike the files generated by Cabri, the files directly generated by The Geometer's Sketchpad are coded; a JavaSketchpad .htm file has to be used instead.
The decision to make the whole translating process available to the user was intended, on the one hand, as a testimonial statement to the computational community, where lack of transparency stands in the way of attempts to use the available tools. On the other hand, the OpenMath description of the geometric locus is made available to other OpenMath-compliant geometric systems.
GLI is based on webMathematica [18], a Java servlet technology allowing remote access to the symbolic capabilities of Mathematica. Once the user has created and uploaded the appropriate text file, a Mathematica Server Page is launched, reading the file and initializing variables. An initialization file for CoCoA [7], containing the ideal generated by the appropriate defining polynomials, is also written out, and CoCoA, launched by Mathematica, computes a Groebner basis for this ideal. The returned factors are classified as points, lines, conics or general curves. Although Mathematica provides an implementation of the Groebner basis algorithm, CoCoA has been chosen mainly due to the experience of better performance in several appropriate examples. Additionally, the Mathematica graphing abilities are used to plot the locus.

4 Examples

4.1 An Ellipse

The old exercise of drawing an ellipse using a pencil, two pins, and a piece of string is frequently proposed as the first one when young students begin practising loci computation.

Fig. 2. The ellipse as a locus


It is well known that both environments, Cabri and The Geometer's Sketchpad, return a surprising answer when simulating the construction, giving just half of the ellipse. Figure 2 shows the construction made in The Geometer's Sketchpad (inside the square) and the answer returned by GLI. The plot range in the graph produced by GLI depends on ad-hoc computations of the graph size made by Mathematica. Despite being an independent process, the computation of an optimum plot range can be directed by a line of code in the webDiscovery description of the task. A default but modifiable value has been included in all webDiscovery descriptions.
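To give the flavour of the Groebner-basis elimination behind this kind of answer (a toy reconstruction of our own, not GLI's actual code; foci at (±1, 0) and a string length of 4 are assumed), a computer algebra system can eliminate the two distance variables directly:

polys = {d1^2 - ((x + 1)^2 + y^2), d2^2 - ((x - 1)^2 + y^2), d1 + d2 - 4};
GroebnerBasis[polys, {x, y}, {d1, d2}]
(* eliminating d1, d2 yields a scalar multiple of 3 x^2 + 4 y^2 - 12, i.e. the full ellipse, not half of it *)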
4.2 Limaçon of Pascal

Given a fixed point P3 and a variable one P6 on a circle, the limaçon of Pascal is the locus of points P14 such that P3, P6 and P14 are collinear and the distance from P6 to P14 is a constant, specified in Figure 4 by the segment P8P9 (see http://www-groups.dcs.st-and.ac.uk/~history for a historical reference).

Fig. 3. The limaçon of Pascal in Cabri

As in the case of the preceding subsection, computing the locus of P14 gives just a part of it (Figure 3, left). It is necessary to compute the locus of the other intersection, P14', in order to get the whole locus (Figure 3, right). It seems that Cabri is making some extra assumptions, for instance, that the point P14 is inside the circle, whereas the imposed constraints subsume both points in just one.
Regarding the locus equation, Cabri returns two equations (instead of one!):

0.04x²y + 0.04y³ + 0.22x² + 0.19xy + 0.82y² + 0.87x + 5.07y + 10 = 0
0.04x²y + 0.04y³ + 0.17x² + 0.20xy + 0.95y² + 0.77x + 5.62y + 10 = 0.
Plotting these equations with Mathematica we get the curves in Figure 4, left, while the curve returned by our system GLI is shown at the right. The equation of the limaçon, a quartic, is included as a factor of the solution (see Figure 5). The extraneous factor of the circle is due to a degenerate condition: note that,

Fig. 4. Plots of the limaçon

since P6 is a variable point on the circle, it can coincide with P3 , so reducing


the locus to a circle centered at P3 and with radius P8 P9 . The generation of this
factor could be avoided in GLI by adding the condition NotEqual(P3,P6).

Fig. 5. A fragment of the equation returned for the limaçon

4.3 A Simple Locus, Different Answers

We will use a very simple example to illustrate the different behavior of standard systems and the one proposed here. Let us consider a line P1P2 with a variable point on it, P4, taken as the center of a circle with radius P5P6. Compute the locus of a point P9 bound to the circle when P4 moves along the line. The locus found by The Geometer's Sketchpad is, as expected, a line parallel to P1P2 (Figure 6). Nevertheless, GLI answers that 'The locus is (or is contained in) the whole plane'. This is due to the different treatment of point P9: in standard systems its definition is taken not just as a point on the circle, but other constraints not explicitly given are assumed. However, from a strictly algebraic point of view, P9 is any point on the circle. So, when the circle moves, the locus is a non-linear part of the plane, and our non-semialgebraic approach detects this fact, answering as stated.
Note that if Cabri is used and the first line is defined through a point and a graphically specified slope, the translation of the construction will fail, since the current version of GLI does not support this Cabri primitive. When working with a non-allowed locus file, an error message will appear in the corresponding text area in the applet. The user is then instructed to inspect the JAVA console


Fig. 6. A simple locus

to see a sequential description of the translation process, from which one can determine which primitive, not admitted by the current version of GLI, has been used.
4.4 Extending the Scope of Loci Computations

Cabri and The Geometer's Sketchpad can only find loci of points that have been effectively constructed in their systems, that is, points which parametrically depend on another one. Hence, points which are implicitly defined by simultaneous multiple conditions cannot be used for generating loci. A classical result such as the Wallace-Steiner theorem (stating that the locus of points X whose orthogonal projections onto the sides of a given triangle are collinear is the triangle's circumcircle) cannot be discovered unless the user previously knows the result! The symbolic kernel of GLI has been designed to be easily modified to support this type of implicit loci. Further versions of GLI are under development to support an extended class of computable loci.

Conclusion

The work presented here, although of small magnitude, shows the possibilities of the interconnection between dynamic geometry and computer algebra systems. Moreover, we think that the generalization of broadband internet connections will make remote access to applications the main general trend, of which GLI is a perfect example. The decision to make OpenMath the communication language between the systems involved could be seen as secondary from a computational point of view. However, the use of a standard semantic representation of mathematical objects is, as we see it, the main challenge in the computational community; GLI wants to be an example of that too. Moreover, the use of OpenMath as the intercommunicating language opens the door to further connections with different geometry-related systems. As future work, a twofold ongoing research effort is being conducted to extend GLI's domain, to other Dynamic Geometry Systems on the one hand, and to non-polynomial equations and inequalities on the other. This latter extension of GLI will allow a considerable increase in the set of possible relations between the different geometric elements and hence of its applications.


Acknowledgments. This work has been partially supported by UCM research group ACEIA and research grants MTM2004-03175 (Botana) and MTM2005-02865 (Abánades, Escribano) from the Spanish MEC.

References
1. Autin, B.: Pure and applied geometry with Geometrica. Proc. 8th Int. Conf. on Applications of Computer Algebra (ACA 2002), 109-110 (2002)
2. Botana, F., Valcarce, J.L.: A dynamic-symbolic interface for geometric theorem discovery. Computers and Education, 38(1-3), 21-35 (2002)
3. Botana, F.: Interactive versus symbolic approaches to plane loci generation in dynamic geometry environments. Proc. I Int. Workshop on Computer Graphics and Geometric Modeling (CGGM 2002), LNCS, 2330, 211-218 (2002)
4. Botana, F., Valcarce, J.L.: A software tool for the investigation of plane loci. Mathematics and Computers in Simulation, 61(2), 141-154 (2003)
5. Botana, F.: A Web-based intelligent system for geometric discovery. Proc. I Int. Workshop on Computer Algebra Systems and Applications (CASA 2003), LNCS, 2657, 801-810 (2003)
6. Botana, F., Recio, T.: Towards solving the dynamic geometry bottleneck via a symbolic approach. Proc. V Int. Workshop on Automated Deduction in Geometry (ADG 2004), LNAI, 3763, 92-110 (2006)
7. Capani, A., Niesi, G., Robbiano, L.: CoCoA, a system for doing Computations in Commutative Algebra. Available via anonymous ftp from: cocoa.dima.unige.it
8. Gao, X.S., Zhang, J.Z., Chou, S.C.: Geometry Expert. Nine Chapters, Taiwan (1998)
9. http://www.geogebra.at
10. Jackiw, N.: The Geometer's Sketchpad v 4.0. Key Curriculum Press, Berkeley (2002)
11. Laborde, J.M., Bellemain, F.: Cabri Geometry II. Texas Instruments, Dallas (1998)
12. Miyaji, C., Kimura, H.: Writing a graphical user interface for Mathematica using Mathematica and MathLink. Proc. 2nd Int. Mathematica Symposium (IMS'97), 345-352 (1997)
13. http://www.openmath.org/
14. Richter-Gebert, J., Kortenkamp, U.: The Interactive Geometry Software Cinderella. Springer, Berlin (1999)
15. Roanes-Lozano, E., Roanes-Macías, E., Villar, M.: A bridge between dynamic geometry and computer algebra. Mathematical and Computer Modelling, 37(9-10), 1005-1028 (2003)
16. Schumann, H.: A dynamic approach to simple algebraic curves. Zentralblatt für Didaktik der Mathematik, 35(6), 301-316 (2003)
17. Wang, D.: GEOTHER: A geometry theorem prover. Proc. 13th International Conference on Automated Deduction (CADE 1996), LNCS, 1104, 166-170 (1996)
18. http://www.wolfram.com/products/webmathematica/index.html

Symbolic Computation of Petri Nets

Andrés Iglesias¹ and Sinan Kapcak²

¹ Department of Applied Mathematics and Computational Sciences, University of Cantabria, Avda. de los Castros, s/n, E-39005, Santander, Spain
iglesias@unican.es
http://personales.unican.es/iglesias
² Department of Mathematics, Izmir Institute of Technology, Urla, Izmir, Turkey
sinankapcak@iyte.edu.tr

Abstract. Petri nets have been receiving increasing attention from the scientific community during the last few years. They provide the users with a powerful formalism for describing and analyzing a variety of information processing systems such as finite-state machines, concurrent systems, multiprocessors and parallel computation, formal languages, communication protocols, etc. Although the mathematical theory of Petri nets has been intensively analyzed from several points of view, the symbolic computation of these nets is still a challenge, particularly for general-purpose computer algebra systems (CAS). In this paper, a new Mathematica package for dealing with some Petri nets is introduced.

1 Introduction

Petri nets (PN) are receiving increasing attention from the scientific community during the last few years. Most of their interest lies in their ability to represent a number of events and states in a distributed, parallel, nondeterministic or stochastic system and to simulate accurately processes such as concurrency, sequentiality or asynchronous control [1,3]. Petri nets provide the users with a very powerful formalism for describing and analyzing a broad variety of information processing systems both from the graphical and the mathematical viewpoints. Since their inception in the early 1960s, they have been successfully applied to many interesting problems including finite-state machines, concurrent systems, multiprocessors and parallel computation, formal languages, communication protocols and many others.
Although the mathematical fundamentals of Petri nets have been analyzed by using many powerful techniques (linear algebraic techniques to verify properties such as place invariants, transition invariants and reachability; graph analysis and state equations to analyze their dynamic behavior; simulation and Markov-chain analysis for performance evaluation, etc.), and several computer programs for PN have been developed so far, the symbolic computation of these nets is still a challenge, particularly for general-purpose computer algebra systems (CAS).



In this paper, a new Mathematica package for dealing with some Petri nets is introduced. The structure of this paper is as follows: Section 2 provides a gentle introduction to the basic concepts and definitions on Petri nets. Then, Section 3 introduces the new Mathematica package for computing them and describes the main commands implemented within. The performance of the code is also discussed in this section by using some illustrative examples. Conclusions and further remarks close the paper.

2 Basic Concepts and Definitions

A Petri net (PN) is a special kind of directed graph, together with an initial
state called the initial marking (see Table 1 for the mathematical details). The
graph of a PN is a bipartite graph containing places {P1 , . . . , Pm } and transitions
{t1 , . . . , tn }. Figure 1 shows an example of a Petri net comprised of three places
and six transitions. In graphical representation, places are usually displayed as
circles while transitions appear as vertical rectangular boxes. The graph also
contains arcs either from a place Pi to a transition tj (input arcs for tj ) or from
a transition to a place (output arcs for tj ). These arcs are labeled with their
weights (positive integers), with the meaning that an arc of weight w can be
understood as a set of w parallel arcs of unity weight (whose labels are usually
omitted). In Fig. 1 the input arcs from P1 to t3 and P2 to t4 and the output arc
from t1 to P1 have weight 2, the rest having unity weight.

Fig. 1. Example of a Petri net comprised of three places and six transitions

A marking (state) assigns to each place Pi a nonnegative integer, ki. In this case, we say that Pi is marked with ki tokens. Graphically, this idea is represented by ki small black circles (tokens) in place Pi. In other words, places hold tokens to represent predicates about the world state or internal state. All markings are denoted by vectors M of length m (the total number of places in the net) such that the i-th component of M indicates the number of tokens in place Pi. From now on, the initial marking will be denoted by M0. For instance, the initial marking (state) for the net in Figure 1 is {2, 1, 0}.


Table 1. Mathematical definition of a Petri net

A Petri net (PN) is an algebraic structure PN = (P, T, A, W, M0) comprised of:
- a finite set of places, P = {P1, P2, . . . , Pm},
- a finite set of transitions, T = {t1, t2, . . . , tn},
- a set of arcs, A, either from a place to a transition (input arcs) or from a transition to a place (output arcs): A ⊆ (P × T) ∪ (T × P),
- a weight function: W : A → IN^q (with q = #(A)),
- an initial marking: M0 : P → IN^m.

If PN is a finite capacity net, we also consider:
- a set of capacities, C : P → IN^m,
- a finite collection of markings (states), Mi : P → IN^m.

The dynamical behavior of many systems can be expressed in terms of the system states of their Petri net. Such states are adequately described by the changes of markings of a PN according to a firing rule for the transitions: a transition tj is said to be enabled if each input place Pi of tj is marked with wi,j tokens, where wi,j is the weight of the arc from Pi to tj. For instance, transitions t2, t3 and t5 are enabled, while transitions t4 and t6 are not. Note, for example, that transition t4 has weight 2 while place P2 has only 1 token, so the arc from P2 to t4 is disabled. If transition tj is enabled, it may or may not be fired (depending on whether or not the event represented by such a transition occurs). A firing of transition tj removes wi,j tokens from each input place Pi of tj and adds wj,k tokens to each output place Pk of tj, wj,k being the weight of the arc from tj to Pk. In other words, if transition tj is fired, all input places of tj have their input tokens removed and a new set of tokens is deposited in the output places of tj according to the weights of the arcs connecting those places and tj. For instance, transition t3 removes two tokens from place P1 and adds one token to place P2, thus changing the previous marking of the net.
A transition without any input place is called a source transition. Note that source transitions are always enabled. In Figure 1 there is only one source transition, namely t1. A transition without any output place is called a sink transition. The reader will notice that the firing of a sink transition removes tokens but does not generate new tokens in the net. Sink transitions in Figure 1 are t2, t4 and t6. A couple (Pi, tj) is said to be a self-loop if Pi is both an input and an output place for transition tj. A Petri net free of self-loops is called a pure net. In this paper, we will restrict ourselves exclusively to pure nets.
Some PN do not put any restriction on the number of tokens each place can hold. Such nets are usually referred to as infinite capacity nets. However, in most practical cases it is more reasonable to consider an upper limit to the number of tokens for a given place. That number is called the capacity of the place.


If all places of a net have finite capacity, the net itself is referred to as a finite capacity net. All nets in this paper will belong to this latter category. For instance, the net in Figure 1 is a finite capacity net, with capacities 2, 2 and 1 for places P1, P2 and P3, respectively.
If so, there is another condition to be fulfilled for any transition tj to be enabled: the number of tokens at each output place of tj must not exceed its capacity after firing tj. For instance, transition t1 in Figure 1 is initially disabled because place P1 has already two tokens. If transitions t2 and/or t3 are applied more than once, the two tokens of place P1 will be removed, so t1 becomes enabled. Note also that transition t3 cannot be fired initially more than once, as the capacity of P2 is 2.
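To make the enabling and firing rules concrete, the following short sketch (in Python, not the Mathematica package introduced in the next section) encodes the net of Fig. 1 using the same triplet convention as that package, namely (place, transition, weight) with negative weights for input arcs, and checks a transition against both the token condition and the capacity condition.

# A minimal sketch of the firing rule for the finite capacity net of Fig. 1;
# arcs are triplets (place, transition, weight): negative weights are input
# arcs (tokens removed), positive weights are output arcs (tokens added).
places = {"p1": 2, "p2": 2, "p3": 1}               # place -> capacity
arcs = [("p1", "t1",  2), ("p1", "t2", -1), ("p1", "t3", -2),
        ("p2", "t3",  1), ("p2", "t4", -2), ("p2", "t5", -1),
        ("p3", "t5",  1), ("p3", "t6", -1)]
transitions = ["t1", "t2", "t3", "t4", "t5", "t6"]

def fire(marking, t):
    """Return the marking after firing t, or None if t is not enabled."""
    new = dict(marking)
    for p, tr, w in arcs:
        if tr == t:
            new[p] += w                            # w < 0 removes, w > 0 adds
    if all(0 <= new[p] <= places[p] for p in places):
        return new
    return None                                    # disabled transition

def enabled(marking):
    return [t for t in transitions if fire(marking, t) is not None]

m0 = {"p1": 2, "p2": 1, "p3": 0}
print(enabled(m0))                                 # ['t2', 't3', 't5']
print(fire(m0, "t3"))                              # {'p1': 0, 'p2': 2, 'p3': 0}

With the initial marking {2, 1, 0}, the sketch reports t2, t3 and t5 as enabled and reproduces the marking changes discussed above.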

3 The Mathematica Package for Petri Nets

In this section a new Mathematica package for dealing with Petri nets is introduced. For the sake of clarity, the main commands of the package will be described by means of their application to some Petri net examples. In particular, in this paper we will restrict ourselves to the case of pure and finite capacity nets, a kind of net with many interesting applications. We start our discussion by loading the package:
In[1]:= <<PetriNets
According to Table 1, a Petri net (like that in Figure 1 and denoted onwards as net1) is described as a collection of lists. In our representation, net1 consists of three elements: a list of couples {place, capacity}, a list of transitions and a list of arcs from places to transitions along with their weights:
In[2]:= net1={{{p1,2},{p2,2},{p3,1}},{t1,t2,t3,t4,t5,t6},
{{p1,t1,2},{p1,t2,-1},{p1,t3,-2},{p2,t3,1},
{p2,t4,-2},{p2,t5,-1},{p3,t5,1},{p3,t6,-1}}};
Note that the arcs are represented by triplets {place, transition, weight}, where positive values for the weights mean output arcs and negative values denote input arcs. This notation is consistent with the fact that output arcs add tokens to the places while input arcs remove them. Now, given the initial marking {2, 1, 0} and any transition, the FireTransition command returns the new marking obtained by firing such a transition:
In[3]:= FireTransition[net1,{2,1,0},t2]
Out[3]:= {1,1,0}
Given a net and its initial marking, an interesting question is to determine whether or not a transition can be fired. The EnabledTransitions command returns the list of all enabled transitions for the given input:
In[4]:= EnabledTransitions[net1,{2,1,0}]
Out[4]:= {t2,t3,t5}
The FireTransition command allows us to compute the resulting markings obtained by applying these transitions onto the initial marking:


In[5]:= FireTransition[net1,{2,1,0},#]& /@ %
Out[5]:= {{1,1,0},{0,2,0},{2,0,1}}
Note that, since transition t1 cannot be fired, an error message is returned:
In[6]:= FireTransition[net1,{2,1,0},t1]
Out[6]:= FireTransition: Disabled transition: t1 cannot be fired for the given net and the {2,1,0} marking.

Fig. 2. The reachability graph for the Petri net net1 and the initial marking {2, 1, 0}

From Out[4] and Out[5], the reader can easily realize that the successive application of the EnabledTransitions and FireTransition commands allows us to obtain all possible markings and all possible firings at each marking. However, this is a tedious and time-consuming task to be done by hand. Usually, such markings and firings are graphically displayed in what is called a reachability graph. The next input returns the reachability graph for our Petri net and its initial marking:
In[7]:= ReachabilityGraph[net1,{2,1,0}]
Out[7]:= See Figure 2
Figure 2 can be interpreted as follows: the outer column on the left provides the list of all possible markings for the net. Their components are sorted from the left to the right according to the standard lexicographic order. For any marking, the row in front gives the collection of its enabled transitions. For instance, the enabled transitions for the initial marking {2, 1, 0} are {t2, t3, t5} (as expected from Out[4]), while they are {t1, t4, t6} for {0, 2, 1}.


Fig. 3. Example of a Petri net comprised of five places and six transitions

Fig. 4. Reachability graph for the Petri net in Figure 3

Given a marking and one of its enabled transitions, you can determine the output marking of firing such a transition by simply moving up/down in the transition column until reaching the star symbol: the marking in that row is the desired output. By this simple procedure, results such as those in Out[5] can readily be obtained.
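For completeness, the same markings and firings can be enumerated programmatically by a plain breadth-first search over markings. The sketch below (in Python, reusing the fire and enabled functions from the sketch at the end of Section 2) illustrates what a reachability-graph computation does; it is not the ReachabilityGraph command itself.

# Enumerate every reachable marking together with its enabled transitions and
# the successor marking each firing produces.
from collections import deque

def reachability(m0):
    key = lambda m: tuple(m[p] for p in sorted(m))   # marking as a tuple
    graph, queue = {}, deque([m0])
    while queue:
        m = queue.popleft()
        if key(m) in graph:
            continue
        successors = {t: fire(m, t) for t in enabled(m)}
        graph[key(m)] = {t: key(s) for t, s in successors.items()}
        queue.extend(successors.values())
    return graph

for marking, firings in reachability(m0).items():
    print(marking, firings)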


Fig. 5. Reachability graph after modifying the weight of the arc from P5 to t6

A second example of a Petri net is shown in Figure 3. This net, comprised of five places and six transitions, has many more arcs than the previous example. Consequently, its reachability graph, shown in Figure 4, is also larger. The Mathematica codes for defining the net and getting this graph are similar to those for the first example and, hence, have been intentionally omitted.
The net in Figure 3 exhibits a number of remarkable features: for instance, places P1, P2 and P5 have more than one output transition, leading to nondeterministic behavior. Such a structure is usually referred to as a conflict, decision or choice. On the other hand, this net has no source transitions. This fact is reflected in the reachability graph, which has a triangular structure: entries appear only below the diagonal. As opposed to this case, the net in Figure 1 has one single source transition (namely, t1), the only element above the diagonal in its reachability graph.
It is worthwhile to mention that the place P1 has only input arcs, meaning that its number of initial tokens can only decrease, but never increase. This means that the capacity of P1 might be less without affecting current results.


On the other hand, the reachability graph in Figure 4 has some markings no transitions can be applied onto. Examples of such markings are {1, 3, 3, 2, 0}, {1, 2, 3, 2, 0} or {0, 4, 3, 1, 0} (although they are not the only ones). They are sometimes called end markings. Note that no end markings appear in the first net of this paper. Note also that the transition t6 is never fired (it never appears in the graph of Figure 4). By simply decreasing the weight of the arc from P5 to t6 to unity, the transition becomes enabled, as shown in the new reachability graph depicted in Figure 5.

4 Conclusions and Further Remarks

In this paper, a new Mathematica package for dealing with finite capacity Petri nets has been introduced. The main features of the package have been discussed by its application to some simple yet illustrative examples. Our future work includes the application of this package to real problems, the extension to other cases of Petri nets, the implementation of new commands for the mathematical analysis of these nets and the characterization of the possible relationship (if any) with the functional networks and other networked structures [2,4,5].
This research has been supported by the Spanish Ministry of Education and Science, Project Ref. #TIN2006-13615. The second author also thanks the financial support from the Erasmus Program of the European Union for his stay at the University of Cantabria during the period this paper was written.

References
1. Murata, T.: Petri nets: Properties, analysis and applications. Proceedings of the IEEE, 77(4) (1989) 541-580
2. Echevarría, G., Iglesias, A., Gálvez, A.: Extending neural networks for B-spline surface reconstruction. Lecture Notes in Computer Science, 2330 (2002) 305-314
3. German, R.: Performance Analysis of Communication Systems with Non-Markovian Stochastic Petri Nets. John Wiley and Sons, Inc., New York (2000)
4. Iglesias, A., Gálvez, A.: A New Artificial Intelligence Paradigm for Computer-Aided Geometric Design. Lecture Notes in Artificial Intelligence, 1930 (2001) 200-213
5. Iglesias, A., Echevarría, G., Gálvez, A.: Functional networks for B-spline surface reconstruction. Future Generation Computer Systems, 20(8) (2004) 1337-1353

Dynaput: Dynamic Input Manipulations for 2D Structures of Mathematical Expressions

Deguchi Hiroaki

Kobe University, 3-11 Tsurukabuto, Nada-ku, Kobe 657-8501, Japan
deg@main.h.kobe-u.ac.jp
http://wwwmain.h.kobe-u.ac.jp/MathBB/

Abstract. This paper describes a prototype of an input interface of GUI for mathematical expressions. Expressions are treated as sets of objects. To handle 2D structures, we have added new areas, called peripheral areas, to the objects. Based on new operations for these areas, the system provides a dynamic interactive environment. The internal data structure for the interface is also presented in this paper. The tree structure of the objects is very simple, and has high potentiality for being converted to a variety of formats. Using this new dynamic input interface, users can manipulate 2D structures directly and intuitively, and they can get mathematical notations in the desired format.
Keywords: GUI, input interface, direct manipulation, text entry, pen-based computing.

1 Introduction

How to handle mathematical expressions on the screens of computers has been discussed since the 1960s [1]. While many systems have been proposed, there are no easy-to-use systems for beginners. Users must choose a not-too-bad (wrong) one from the existing systems. One of the reasons for this situation is that the APIs of the existing systems are not designed adequately in consideration of handling 2D structures.
For example, a cursor (caret) of GUI text editors has 2D coordinates, but the information is not used effectively. A cursor is placed on, before, or after one of the characters. Because of text's linear structure, the position of a cursor can be converted to 1D information. Therefore, although the cursor of GUI text editors looks like a 2D cursor, it is a 1D cursor in practice. And many systems which handle mathematical expressions are based on this type of cursor model.
1.1 Input Interfaces of Computer Algebra Systems

Computer algebra systems handle mathematical expressions, but their input interfaces are generally based on template models, such as a word processor's equation editor, which is treated like a sub-program. An equation editor with a
template model is able to handle mathematical expressions by using templates of 2D structures. And the input interface of the system is designed under the influence of a paradigm based on the cursor model as mentioned above.
Templates for 2D structures have a box structure, each box of which is designed for cursor-model-based input interfaces. Box structures can be nested inside other boxes. 2D structures are constructed from such nested boxes. Usually these boxes are displayed on the screen. There are symbols users would like to display, and boxes these systems have to display, on the screen. Such systems are not intuitive, because of these boxes users don't necessarily wish to display. Therefore, the user interfaces of many computer algebra systems are not suitable to handle 2D structures.
As for input devices, template-based systems require both pointing devices and keyboards. Keyboards are required for inputting texts, and pointing devices are required for selecting templates of 2D structures. It is not easy to use when users are forced to use two or more devices.
1.2 The Present Work

The goal of this work is to provide an easy-to-use environment to novice users, such as students studying mathematics of primary (or higher) education. In our former research [2,3,4,5], we have developed MathBlackBoard as a user interface of computer algebra systems. In MathBlackBoard, only a pointing device is required, and keyboards are not necessarily indispensable devices. And mathematical expressions located in the editing area of MathBlackBoard can be dragged and dropped. The user interface of MathBlackBoard is easy to use, at least for students of junior high schools [6,8].
This paper describes a new input interface [7] which replaces template-based interfaces. In this paper, we introduce a dynamic input interface and its data structure. The interface has patents pending in Japan. The new version of MathBlackBoard (Fig. 1) has been developed as a prototype of the interface. In Section 2 of this paper, the new model of GUI operations is shown, and operations of existing systems and of the new system are discussed. In Section 3, the data structure for the dynamic input interface is described. Finally, Section 4 describes a conclusion.

2 GUI Operations

Using computers with a GUI, users can manipulate objects on screens directly. Especially, drag and drop operations are intuitive. However, it is hard to say that they are effectively utilized in interfaces treating mathematical expressions. For example, in some systems, to connect a selection to a target, users can select and drag the selection, but can drop it only onto prepared boxes related to the target. These boxes have to be prepared by using templates before drag and drop operations.


Fig. 1. MathBlackBoard

2.1 Drag and Drop

In general, drag and drop means to drag an object and drop it in different places or onto other objects. Each object on GUI screens has its own selection area which is used to be selected or to be dropped onto. And when a dragged object is released by manipulation of pointing devices, all of the selection areas are scanned. If the pointer lies inside the selection area of any object, the dragged object is dropped onto such a pointed object. Otherwise, the dragged object is dropped at the place marked by the pointer.
Using these operations, users can drag and drop expressions onto a prepared box or into an insertion point which is used in the linear structure of texts. Insertion points have 1D information (or information convertible to 1D) of their position, and are used with text stream, where text includes characters and symbols. When a dragged object is released, all of the boxes are scanned. And then, if a box is pointed, all of the insertion points in the box are scanned.
In the model of templates and their box structures, that is used with drag and drop operations generally, mathematical expressions are constructed from nested boxes which contain other boxes or text stream as contents.
2.2 Drag and Draw

As mentioned above, existing drag and drop operations and template models are not suitable for the 2D structure of mathematical expressions, because a mathematical formula is constructed as box-structured texts.
To extend the drag and drop operation, we have added new elements to objects. The most important element added is an area for the target of drag and drop operations. Notice that the areas are used to be targets of drag and drop, not to be box-structured containers. These newly added areas are called peripheral areas.

Fig. 2. Left: A symbol object x with a selection area, peripheral areas, and a baseline. Right: Base points and their link relations.

The left hand of Fig. 2 shows the symbol object x with M as a selection area, R0-R5 as peripheral areas, and BL as a baseline.
By using objects with peripheral areas, users can drag mathematical expressions and drop them onto the peripheral area of the target, for connecting the selection to the target at the desired position. Each object of the system has information of its connected child-nodes. And it has information of the location and display size of its child-nodes. The right hand of Fig. 2 shows the symbol object x with P0-P5 as base points. Each base point of P0-P5 is related to each peripheral area of R0-R5. Base points are used as link points to connect their child-nodes. For each base point, information of the link relation for its child-node is configured (right hand of Fig. 2). The size of a dotted-line square means the child-node's display size, and the relation between a dotted-line square and the related base point means the relative point of the child-node.
The operations for objects with these new elements are able to be explained as drag and drop operations. But, in this case, drag and drop is an operation between the selection and the peripheral area of the target. It is not an operation between the selection and the target itself. Therefore it could be a new GUI operation. In case the selection is dropped onto peripheral areas, the GUI operation is called drag and draw. The drag and draw operation is an operation between the selection and the target. In this case, drag and draw means to drag the selection and draw it near to the target. The system with drag and draw operations is suitable for mathematical expressions, because objects are associated with other objects in the nature of 2D structures by using peripheral areas and their related elements.
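As an illustration of the hit test behind drag and draw, the Python sketch below builds six peripheral rectangles around a target's selection area and reports which one the pointer is released in. The region names R0-R5 are taken from Fig. 2, but their geometric layout and the superscript/subscript roles assigned to them here are assumptions made only for this sketch, not MathBlackBoard's actual configuration.

from dataclasses import dataclass

@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float
    def contains(self, px, py):
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h

def peripheral_areas(sel: Rect):
    """Six regions around the selection area (hypothetical layout)."""
    w, h = sel.w, sel.h
    return {
        "R0": Rect(sel.x - w, sel.y - h, w, h),   # upper left
        "R1": Rect(sel.x + w, sel.y - h, w, h),   # upper right (e.g. superscript)
        "R2": Rect(sel.x + w, sel.y,     w, h),   # right
        "R3": Rect(sel.x + w, sel.y + h, w, h),   # lower right (e.g. subscript)
        "R4": Rect(sel.x - w, sel.y + h, w, h),   # lower left
        "R5": Rect(sel.x - w, sel.y,     w, h),   # left
    }

def hit_test(target_selection: Rect, pointer):
    """Return the peripheral area the pointer lies in, or None (plain drop)."""
    px, py = pointer
    for name, area in peripheral_areas(target_selection).items():
        if area.contains(px, py):
            return name
    return None

print(hit_test(Rect(100, 100, 20, 30), (125, 95)))   # 'R1' with this geometry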
2.3 Dynaput Operation

The drag and draw operations are suitable to handle 2D structure of mathematical expressions. And there are other elements to extend GUI operations for
beginners. The new elements are feedbacks from drag operations and feedbacks
from draw operations.


Feedbacks from drag operations should show what object is dragged. Such information helps novice users to know which object is selected and is being dragged. MathBlackBoard is a system which has provided this kind of information from the early stages of development.
Feedbacks from draw operations also show useful information, the categories of which are where the selection object will be connected to, and to which size the selection object will be changed. With information from feedbacks, users can check the place where the dragged object will be connected to, or the preview of its size, before the dragged object is released by manipulation of pointing devices. Thus users can check the results of various cases by moving pointing devices, without carrying out the determination operation, which in general is to release the pressed button of the pointing device. If the preview is what the user wished, the determination operation is to be performed by manipulating pointing devices. After the determination, the selection is connected to the target, and is placed at the desired position with the previewed size.
The environment with these two kinds of feedbacks provides an interactive dynamic input interface. In the dynamic input interface of MathBlackBoard, operations include not only an inputting aspect but also an editing aspect. Users can input expressions and edit expressions in the same way. For example, the input operation in MathBlackBoard is performed as follows: drag an object in the palette area of Fig. 1 and drop it onto other objects, peripheral areas, or the blackboard area of Fig. 1. And the edit operation is performed as follows: drag an object in the blackboard area and drop it onto other objects, peripheral areas, or the blackboard area.
Since the same operations in the user interface mean both inputting and editing, as described above, they could be new GUI operations. Because the operations are not simple input operations but dynamic input operations, the operations are called dynaput instead of input. Dynaput is a coined word combining dyna (from dynamic) and put (from input or put).

3 Data Structure

3.1 Layout Tree

Symbol objects for dynaputting are structured as a tree. Each node has information of numbered directions for linking to child-nodes (left hand of Fig. 3). In Fig. 3, the thick lines mean the default direction. In this case, the default direction is 2. The right hand of Fig. 3 shows a layout tree of the following expression:

f(x) = Σ_{i=0}^{n} a_i x^i

The structure of the layout tree is based on the layout of symbols. Mathematical semantic representation is not used in the layout tree structure. Our strategy for handling expressions on computer screens is that what it means appears when you need. The selected tree is traversed when the user invokes a command.


Fig. 3. Left: Numbered directions. (The thick line means the default direction.) Right:
An example of layout tree. (The numbers beside lines mean directions.)

The layout trees can be converted easily to presentation markup languages, such as TeX and MathML. Fig. 4 shows tree traversal and TeX output. The child-node in the default direction is visited after all of the child-nodes in other directions are visited. The parent-node performs pre/post processes. For example, when it has a child-node in the non-default directions, the parenthesis { is outputted before the visit, and } is outputted after all child-nodes in the direction are visited. The outputs of other symbols (such as Σ) depend on conditions of the content of the symbol object and the direction number of the child-node.

Fig. 4. Tree traversal and TeX output

The MathML (Presentation Markup) output is similar in method to the TeX output. A part of a MathML output example is as follows:
<munderover>
<mo>&sum;</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>0</mn>
</mrow>
<mrow>
<mi>n</mi>
</mrow>
</munderover>


Layout tree traversal with outputting text characters is the method common to both. The tasks for outputting are simple because of the simple structure of layout trees. Thus layout trees are converted to presentation markup languages easily.
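The following small sketch (in Python, not MathBlackBoard's implementation) illustrates this kind of traversal: non-default-direction child-nodes are emitted first inside braces, the default-direction child last. The particular direction-to-TeX mapping (1 for superscript, 3 for subscript) is an assumption made only for the example.

# Each node stores its symbol and its child-nodes keyed by direction number;
# direction 2 is the default direction (the next symbol on the baseline).
class Node:
    def __init__(self, content, children=None):
        self.content = content
        self.children = children or {}   # direction number -> Node

DEFAULT = 2
PREFIX = {1: "^", 3: "_"}                # hypothetical direction-to-TeX mapping

def to_tex(node):
    out = node.content
    for d, child in sorted(node.children.items()):
        if d != DEFAULT:                 # non-default directions first, braced
            out += PREFIX.get(d, "") + "{" + to_tex(child) + "}"
    if DEFAULT in node.children:         # default direction visited last
        out += to_tex(node.children[DEFAULT])
    return out

# a_i x^2 laid out as: 'a' with subscript 'i' and baseline successor 'x',
# which in turn carries the superscript '2'.
tree = Node("a", {3: Node("i"), 2: Node("x", {1: Node("2")})})
print(to_tex(tree))                      # a_{i}x^{2}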
3.2 Binary Tree of Symbol Objects

Layout trees can be converted to mathematical expressions for computer algebra systems to evaluate, in another way. Layout trees are converted to binary trees of symbol objects before converting to expressions for evaluation.
At first, a layout tree is parsed into reverse Polish notation. The layout tree is scanned from the root in the default direction. An example of the reverse Polish notation of the layout tree in Fig. 3 is as follows:

[f][x][apply function][Σ][a][x][i][power][invisible times][apply][=]

As to Σ and a, only child-nodes of direction 2 are visited in that scan process. The ignored child-nodes are called after the process, if needed. That is decided by the parser. Temporary objects (such as power or apply) are generated in the scan process.
And then, a binary tree of the symbol objects (left hand of Fig. 5) is constructed from the reverse Polish notation. The right hand of Fig. 5 shows another example. The parser can decide the meaning of f(x).

Fig. 5. Left: An example of binary tree. Right: A left sub tree of another example.

Finally, the binary tree of the symbol objects is traversed, and outputs which depend on the contents of the symbol objects are generated. Notice that the ignored child-nodes of Σ and a are called when the binary tree is traversed.
The binary trees of symbol objects are used internally and temporarily in the process of the converting. Users can get mathematical notations in the desired format after the process, because binary trees are able to be converted to various notations of mathematical expressions including infix, prefix, and postfix notations.
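A generic sketch of the remaining step, building a binary tree from the reverse Polish notation and reading it back in infix form, is given below (in Python). It handles only a few binary temporary objects; the real parser also deals with function application, Σ, and the ignored child-nodes, so this is an illustration rather than the MathBlackBoard converter.

BINARY_OPS = {"power": "^", "invisible times": "*", "=": "="}

class BTree:
    def __init__(self, token, left=None, right=None):
        self.token, self.left, self.right = token, left, right

def rpn_to_tree(tokens):
    """Each binary operator pops its two operands from the stack."""
    stack = []
    for tok in tokens:
        if tok in BINARY_OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(BTree(tok, left, right))
        else:
            stack.append(BTree(tok))
    return stack.pop()

def to_infix(node):
    if node.left is None:                 # a leaf symbol
        return node.token
    return "(" + to_infix(node.left) + BINARY_OPS[node.token] + to_infix(node.right) + ")"

print(to_infix(rpn_to_tree(["a", "x", "i", "power", "invisible times"])))
# prints (a*(x^i)), i.e. the a x^i part of the example above in infix form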


4 Conclusion

The new dynamic input interface has been developed. We have added new elements to the objects of the system. For such objects, new GUI operations have been defined. Drag and draw is one of the drag and drop operations. In case the selection is dragged and drawn near to the target, the operation is called drag and draw. The exact actions of the operations are to drag over the peripheral areas and to drop onto these areas.
Dynaput has also been defined as operations which include input and edit, and provide a dynamic input interface by previewing results. To provide dynaput operations, the interface should have drag and draw operations and peripheral areas for symbol objects. Using dynaput operations, users can input and edit mathematical expressions intuitively by using only pointing devices.
In this user interface, mathematical expressions are constructed from tree-structured symbol objects. Boxes as containers and text stream as contents, which are used in other systems with templates, are not used. The structure of the layout tree is based on the 2D structure of symbols on computer screens, and mathematical meanings of objects are ignored temporarily.
The presentation markup outputs, such as TeX and MathML, are generated easily because of the simple structure of layout trees. Mathematical notations for computer algebra systems' evaluations are also converted from layout trees via tree transformation. What it means appears when you need is our strategy for handling 2D structures of expressions.
Using this new input interface with the operation dynaput, users can manipulate mathematical formulas intuitively, and users are able to acquire mathematical expressions in various formats.
A demo video of MathBlackBoard is available on the following URL:
http://wwwmain.h.kobe-u.ac.jp/MathBB/

References
1. Kajler, N., Soiffer, N.: A Survey of User Interfaces for Computer Algebra Systems. J. Symb. Comput. 25(2) (1998) 127-159
2. Matsushima J.: An Easy-to-Use Computer Algebra System by Using Java. Master Thesis, Kobe University (1998) [in Japanese]
3. Deguchi H.: Blackboard Applet. Journal of Japan Society for Symbolic and Algebraic Computation 9(1) (2002) 32-37 [in Japanese]
4. Deguchi H.: MathBlackBoard. Journal of Japan Society for Symbolic and Algebraic Computation 11(3,4) (2005) 77-88 [in Japanese]
5. Deguchi H.: MathBlackBoard as User Interface of Computer Algebra Systems. Proceedings of the 10th Asian Technology Conference in Mathematics (2005) 246-252
6. Deguchi H.: A Practical Lesson Using MathBlackBoard. Journal of Japan Society for Symbolic and Algebraic Computation 12(4) (2006) 21-30 [in Japanese]
7. Deguchi H.: A Dynamic Input Interface for Mathematical Expressions. Proceedings of the Human Interface Symposium 2006 (2006) 627-630 [in Japanese]
8. Deguchi H., Hashiba H.: MathBlackBoard as Effective Tool in Classroom. ICCS 2006, Part II, Springer LNCS 3992 (2006) 490-493

On the Virtues of Generic Programming for Symbolic Computation

Xin Li, Marc Moreno Maza, and Éric Schost

University of Western Ontario, London N6A 1M8
{xli96, moreno, schost}@scl.csd.uwo.ca

Abstract. The purpose of this study is to measure the impact of C level code polynomial arithmetic on the performances of AXIOM high-level algorithms, such as polynomial factorization. More precisely, given a high-level AXIOM package P parametrized by a univariate polynomial domain U, we have compared the performances of P when applied to different Us, including an AXIOM wrapper for our C level code.
Our experiments show that when P relies on U for its univariate polynomial computations, our specialized C level code can provide a significant speed-up. For instance, the improved implementation of square-free factorization in AXIOM is 7 times faster than the one in Maple and very close to the one in MAGMA. On the contrary, when P does not rely much on the operations of U and implements its private univariate polynomial operations, then P cannot benefit from our highly optimized C level code. Consequently, code which is poorly generic reduces the speed-up opportunities when applied to highly efficient and specialized polynomial data-types.
Keywords: Generic programming, fast arithmetic, efficient implementation, high performance, polynomials.

1 Introduction

Generic programming, and in particular type constructors parametrized by types and values, is a clear need for implementing computer algebra algorithms. This has been one of the main motivations in the development of computer algebra systems and languages such as AXIOM [10] and Aldor [15] since the 1970s. AXIOM and Aldor have a two-level object model of categories and domains which allows the implementation of algebraic structures (rings, fields, . . . ) and their members (polynomial domains, fields of rational functions, . . . ). In these languages, the user can implement domain and category constructors, that is, functions returning categories or domains. For instance, one can implement a function UP taking a ring R as parameter and returning the ring of univariate polynomials over R. This feature is known as categorical programming.
Another goal in implementing computer algebra algorithms is that of efficiency. More precisely, it is desirable to be able to realize successful implementations of the best algorithms for a given problem. Sometimes, this may sound contradictory with the generic programming paradigm. Indeed, efficient
implementations often require specialized data-structures (e.g., primitive arrays of machine words for encoding dense univariate polynomials over a finite field). High performance was not the primary concern in the development of AXIOM. For instance, until recently [11], AXIOM had no domain constructor for univariate polynomials with dense representation.
The MAGMA [2,1] computer algebra system, developed at the University of Sydney since the 1990s, has succeeded in providing both generic types and high performance. As opposed to many previous systems, a strong emphasis was put on performance: asymptotically fast state-of-the-art algorithms are implemented in MAGMA, which has become a de facto reference regarding performance. MAGMA's design uses the language of universal algebra as well. Users can dynamically define and compute with structures (groups, rings, . . . ) that belong to categories (e.g., permutation groups), which themselves belong to varieties (e.g., the variety of groups); these algebraic structures are first-class objects. However, some aspects of categorical programming available in AXIOM are not present: users cannot define new categories; the interfacing with C does not seem possible either.
In this paper, we show that generic programming can contribute to high performance. To do so, we first observe that dense univariate and multivariate polynomials over finite fields play a central role in computer algebra, thanks to modular algorithms. Therefore, we have realized highly optimized implementations of these polynomial data-types in C, Aldor and Lisp. This work is reported in [5] and [12].
The purpose of this new study is to measure the impact of our C level code polynomial arithmetic on the performances of AXIOM high-level algorithms, such as factorization. More precisely, given a high-level AXIOM package P (or domain) parametrized by a univariate polynomial domain U, we have compared the performances of P when applied to different Us, including an AXIOM wrapper for our C level code.
Our experiments show that when P relies on U for its univariate polynomial computations, our specialized C level code can provide a significant speed-up. On the contrary, when P does not rely much on the operations of U and implements its private univariate polynomial operations, then P cannot benefit from our highly optimized C level code. Consequently, code which is poorly generic reduces the speed-up opportunities when applied to highly efficient and specialized polynomial data-types.

2 Software Overview

We present briefly the AXIOM polynomial domain constructors involved in our experimentation. Then, we describe the features of our C code that play a central role in this study: finite field arithmetic and fast univariate polynomial arithmetic. We notably discuss how the choice of special primes enables us to obtain fast algorithms for reduction modulo p.

2.1 AXIOM Polynomial Domain Constructors

Let R be an AXIOM Ring. The domain SUP(R) implements the ring of univariate polynomials with coefficients in R. The data representation of SUP(R) is sparse, that is, only non-zero terms are encoded. The domain constructor SUP is written in the AXIOM language.
The domain DUP(R) implements exactly the same operations as SUP(R). More precisely, these two domains satisfy the category UnivariatePolynomialCategory(R). However, the representation of the latter domain is dense: all terms, null or not, are encoded. The domain constructor DUP was developed in the AXIOM language, see [11] for details.
Another important domain constructor in our study is PF: for a prime number p, the domain PF(p) implements the prime field Z/pZ.
Our C code is dedicated to multivariate polynomials with dense representation and coefficients in a prime field. To make this code available at the AXIOM level, we have implemented a domain constructor DUP2 wrapping our C code. For a prime number p, the domains DUP2(p) and DUP(PF(p)) implement the same category, that is, UnivariatePolynomialCategory(PF(p)).
2.2 Finite Field Arithmetic

The implementation reported here focuses on some special small finite fields. By a small finite field, we mean a field of the form K = Z/pZ, for p a prime that fits in a 26-bit word (so that the product of two elements reduced modulo p fits into a double register). Furthermore, the primes p we consider have the form k·2^ℓ + 1, with k a small odd integer (typically k ≤ 7), which enables us to write specific code for integer Euclidean division.
The elements of Z/pZ are represented by integers from 0 to p - 1. Additions and subtractions in Z/pZ are performed in a straightforward way: we perform integer operations, and the result is then reduced modulo p. Since the result of additions and subtractions is always in -(p - 1), . . . , 2(p - 1), modular reduction requires at most a single addition or subtraction of p; for the reduction, we use routines coming from Shoup's NTL library [9,14].
Multiplication in Z/pZ requires more work. A standard solution, present in NTL, consists in performing the multiplication in double-precision floating-point registers, computing numerically the quotient appearing in the Euclidean division by p, and finally deducing the remainder.
Using the special form of the prime p, we designed the following faster approximate Euclidean division, that shares similarities with Montgomery's REDC algorithm [13]; for another use of arithmetic modulo special primes, see [4]. Let thus Z be in {0, . . . , (p - 1)^2}; in actual computations, Z is obtained as the product of two integers less than p. The following algorithm computes an approximation of the remainder of kZ by p, where we recall that p has the form k·2^ℓ + 1:
1. Compute q = ⌊Z/2^ℓ⌋.
2. Compute r = k(Z - q·2^ℓ) - q.


Proposition 1. Let r be as above and let r0 < p be the remainder of kZ by p. Then r ≡ r0 mod p and r = r0 - λp, with 0 ≤ λ < k + 1.

Proof. Let us write the Euclidean division of kZ by p as kZ = q0·p + r0. This implies that

q = q0 + ⌊(q0 + r0) / (k·2^ℓ)⌋

holds. From the equality qp + r = q0·p + r0, we deduce that we have

r = r0 - λp with λ = ⌊(q0 + r0) / (k·2^ℓ)⌋.

The assumption Z ≤ (p - 1)^2 enables us to conclude that λ < k + 1 holds.

In terms of operations, this reduction is faster than the usual algorithms, which rely on either Montgomery's REDC or Shoup's floating-point techniques. The computation of q is done by a logical shift; that of r requires a logical AND (to obtain Z - q·2^ℓ), and a single multiplication by the constant k. Classical reduction algorithms involve 2 multiplications, and other operations (additions and logical operations). Accordingly, in practical terms, our approach turned out to be more efficient.
There are however drawbacks to this approach. First, the algorithm above does not compute Z mod p, but a number congruent to kZ modulo p (this multiplication by a constant is also present in Montgomery's approach). This is however easy to circumvent in several cases, for instance when doing multiplications by precomputed constants (this is the case in FFT polynomial multiplication, see below), since a correcting factor k^(-1) mod p can be incorporated in these constants. The second drawback is that the output of our reduction routine is not reduced modulo p. When results are reused in several computations, errors accumulate, so it is necessary to perform some error reduction at regular time steps, which slows down the computations.
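The following Python sketch (not the authors' C code) spells out the approximate reduction for one illustrative prime of the required form, p = 7·2^26 + 1, and checks Proposition 1 on random products of two residues.

import random

# An illustrative special prime p = k*2**l + 1 (469762049, a prime).
k, l = 7, 26
p = k * 2**l + 1

def approx_reduce(Z):
    """Return r with r congruent to k*Z modulo p and r = r0 - lam*p, 0 <= lam < k+1."""
    q = Z >> l                        # q = floor(Z / 2**l), a logical shift
    return k * (Z & (2**l - 1)) - q   # r = k*(Z - q*2**l) - q: one AND, one multiply by k

# Sanity check of Proposition 1 on random products of two residues.
for _ in range(1000):
    a, b = random.randrange(p), random.randrange(p)
    Z = a * b
    r = approx_reduce(Z)
    r0 = (k * Z) % p
    assert (r - k * Z) % p == 0       # r is congruent to k*Z modulo p
    assert 0 <= r0 - r < (k + 1) * p  # r = r0 - lam*p with 0 <= lam < k + 1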
2.3 Polynomial Arithmetic

For polynomial multiplication, we use the Fast Fourier Transform (FFT) [6, Chapter 8], and its variant the Truncated Fourier Transform [8]. Indeed, since we work modulo primes p of the form k·2^ℓ + 1, Lemma 8.8 in [6] shows that Z/pZ admits 2^ℓ-th primitive roots of unity, so that it is suitable for FFT multiplication for output degrees up to 2^ℓ - 1.
Both variants feature an O(d log(d)) asymptotic complexity; the latter offers a smoother running time, avoiding the usual abrupt jumps that occur at powers of 2 in classical Fast Fourier Transforms.
Using fast multiplication enables us to write a fast Euclidean division for polynomials, using Cook-Sieveking-Kung's approach through power series inversion [6, Chapter 9]. Recall that this algorithm is based on the remark that the quotient q in the Euclidean division u = qv + r in K[x] satisfies

rev_{deg u - deg v}(q) = rev_{deg u}(u) · rev_{deg v}(v)^(-1) mod x^(deg u - deg v + 1),


where rev_m(p) denotes the reverse polynomial x^m p(1/x). Hence, computing the quotient q is reduced to a power series division, which itself can be done in time O(d log(d)) using Newton's iteration [6, Chapter 9].
Newton's iteration was implemented using middle product techniques [7], which enable us to reduce the cost of a direct implementation by a constant factor (these techniques are particularly easy to implement when using FFT multiplication, and are already described in this case in [14]).
Our last ingredient is GCD computation. We implemented both the classical Euclidean algorithm, as well as its faster divide-and-conquer variant using so-called Half-GCD techniques [6, Chapter 11]. The former features a complexity in O(d^2), whereas the latter has cost in O(d log(d)^2), but is hindered by a large multiplicative constant hidden by the big-O notation.
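A compact Python sketch of the Cook-Sieveking-Kung division described above is given below. It is only an illustration of the technique, not the C implementation: multiplication is left as schoolbook multiplication (the real code uses the FFT-based product), and the prime is the same illustrative special prime as before.

# Polynomials are lists of coefficients mod p, lowest degree first, with a
# nonzero leading coefficient.
p = 469762049   # an illustrative FFT prime of the form k*2**l + 1

def mul(a, b, n=None):
    """Product of a and b mod p, optionally truncated mod x**n."""
    m = len(a) + len(b) - 1 if n is None else min(n, len(a) + len(b) - 1)
    c = [0] * m
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < m:
                c[i + j] = (c[i + j] + ai * bj) % p
    return c

def series_inverse(f, n):
    """Inverse of f modulo x**n by Newton iteration: g <- g*(2 - f*g)."""
    g = [pow(f[0], -1, p)]            # modular inverse (Python 3.8+)
    k = 1
    while k < n:
        k = min(2 * k, n)
        t = mul(f[:k], g, k)          # f*g mod x**k
        t = [(-c) % p for c in t]
        t[0] = (t[0] + 2) % p         # 2 - f*g
        g = mul(g, t, k)
    return g

def quotient(u, v):
    """Quotient of u by v (deg u >= deg v), via the reversal identity above."""
    d = len(u) - len(v)               # deg u - deg v
    ru, rv = u[::-1], v[::-1]         # rev(u), rev(v)
    q = mul(ru[:d + 1], series_inverse(rv, d + 1), d + 1)
    return q[::-1]                    # reverse back to obtain q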
2.4 Code Connection

Open AXIOM is based on GNU Common Lisp (GCL), GCL being developed in C [12]. We follow the GCL developers' approach to integrate our C level code into GCL's kernel. The crucial step is converting the different polynomial data representations between AXIOM and the ones in our C library via the GCL level. The overhead of these conversions may significantly reduce the effectiveness of our C implementation. Thus, a good understanding of the data structures in AXIOM and GCL is a necessity to establish an efficient code connection.

3 Experimentation

In this section, we compare our specialized domain constructor DUP2 with our generic domain constructor DUP and the standard AXIOM domain constructor SUP. Our experimentation takes place in the polynomial rings:

Ap = Z/pZ[x],    Bp = (Z/pZ[x]/m)[y],

for a machine word prime number p and an irreducible polynomial m ∈ Z/pZ[x]. The ring Ap can be implemented by any of the three domain constructors DUP2, DUP and SUP applied to PF(p), whereas Bp is implemented by either DUP or SUP applied to Ap. In both Ap and Bp, we compare the performances of factorization and resultant computations provided by these different constructions. These experimentations serve two goals.
(G1) When there is a large proportion of the running time which is spent in computing products, remainders, quotients, GCDs in Ap, we believe that there are opportunities for significant speed-up when using DUP2 and we want to measure this speed-up w.r.t. SUP and DUP.
(G2) When there is a little proportion of the running time which is spent in computing products, remainders, quotients, GCDs in Ap, we want to check whether using DUP2, rather than SUP and DUP, could slow down computations.


For computing univariate polynomial resultants over a field, AXIOM runs the package PseudoRemainderSequence implementing the algorithms of Ducos [3]. This package takes R: IntegralDomain and polR: UnivariatePolynomialCategory(R) as parameters. However, this code has its private divide operation and does not rely on the one provided by the domain polR. In fact, the only non-trivial operation that will be run from polR is addition! Therefore, if polR has a fast division with remainder, this will not benefit the resultant computations performed by the package PseudoRemainderSequence. Hence, in this case, there are very few opportunities for DUP2 to provide a speed-up w.r.t. SUP and DUP.
For square-free factorization over a finite field, AXIOM runs the package UnivariatePolynomialSquareFree. It takes RC: IntegralDomain and P: UnivariatePolynomialCategory(RC) as parameters. In this case, the code relies on the operations gcd and exquo provided by P. Hence, if P provides fast GCD computations and fast divisions, this will benefit the package UnivariatePolynomialSquareFree. In this case, there is a potential for DUP2 to speed up computations w.r.t. SUP and DUP.
Fig. 1. Resultant computation in Z/pZ[x] (time [sec] vs. degree; SUP(FP(p)) vs. DUP2(FP(p)))

Fig. 2. Square-free factorization in Z/pZ[x] (time [sec] vs. degree; SUP(FP(p)) vs. DUP2(FP(p)))

We start the description of our experimental results with resultant computations in Ap = Z/pZ[x]. As mentioned above, this is not a good place for obtaining a significant performance gain. Figure 1 shows that computations with DUP2 are just slightly faster than those with SUP. In fact, it is satisfactory to verify that using DUP2, which implies data-type conversions between the AXIOM and C data-structures, does not slow down computations.
We continue with square-free factorization and irreducible factorization in Ap. Figure 2 (resp. Figure 3) shows that DUP2 provides a speed-up ratio of 8 (resp. 7) for polynomials with degrees of about 9000 (resp. 400). This is due to the combination of the fast arithmetic (FFT-based multiplication, fast division, Half-GCD) and the highly optimized code of this domain constructor.
In the case of irreducible factorization, we could have obtained a better ratio if the code was more generic. Indeed, the irreducible factorization over finite fields in AXIOM involves a package which has its private univariate polynomial arithmetic, leading to a problem similar to that observed with resultant computations. The package in question is ModMonic, parametrized by R: Ring and Rep: UnivariatePolynomialCategory(R), which implements the Frobenius map.
Fig. 3. Irreducible factorization in Z/pZ[x] (time [sec] vs. degree; SUP(FP(p)) vs. DUP2(FP(p)))

Fig. 4. Resultant computation in (Z/pZ[x]/m)[y] (time [sec] vs. total degree; SUP(SUP(FP(p))), DUP(DUP(FP(p))), DUP(DUP2(FP(p))))

Fig. 5. Irreducible factorization in (Z/pZ[x]/m)[y] (time [sec] vs. total degree; SUP(SUP(FP(p))), DUP(DUP(FP(p))), DUP(DUP2(FP(p))))

Fig. 6. Square-free factorization in Z/pZ[x] (time [sec] vs. total degree; AXIOM-Apr-06, Magma-2.11-2, Maple-9.5)
We conclude this section with our benchmarks in Bp = (Z/pZ[x]/m)[y]. For resultant computations in Bp the speed-up ratio obtained with DUP2 is better than in the case of Ap. This is because the arithmetic operations of DUP2 (addition, multiplication, inversion) perform better than those of SUP or DUP. Finally, for irreducible factorization in Bp, the results are quite surprising. Indeed, AXIOM uses Trager's algorithm (which reduces computations to resultants in Bp, irreducible factorization in Ap and GCDs in Bp) and, based on our previous results, we could have anticipated a good speed-up ratio. Unfortunately, the package AlgFactor, which is used for algebraic factorization, has its private arithmetic. More precisely, it re-defines Bp with SUP and factorizes the input polynomial over this new Bp.

4 Conclusion

The purpose of this study is to measure the impact of our C level specialized implementation for fast polynomial arithmetic on the performances of AXIOM high-level algorithms. Generic programming is well designed in the AXIOM system. The experimental results demonstrate that by replacing a few important operations in DUP(PF(p)) with our C level implementation, the original AXIOM univariate polynomial arithmetic over Z/pZ has been sped up by a large factor in general. For algorithms such as univariate polynomial square-free factorization over Z/pZ, the improved AXIOM code is 7 times faster than the one in Maple and very close to the one in MAGMA (see Figure 6).

References
1. W. Bosma, J. J. Cannon, and G. Matthews. Programming with algebraic structures: design of the Magma language. In ISSAC'94, pages 52-57. ACM Press, 1994.
2. The Computational Algebra Group in the School of Mathematics and Statistics at the University of Sydney. The MAGMA Computational Algebra System for Algebra, Number Theory and Geometry. http://magma.maths.usyd.edu.au/magma/.
3. L. Ducos. Optimizations of the subresultant algorithm. J. of Pure Appl. Alg., 145:149-163, 2000.
4. T. Färnqvist. Number theory meets cache locality: efficient implementation of a small prime FFT for the GNU Multiple Precision arithmetic library. Master's thesis, Stockholms Universitet, 2005.
5. A. Filatei, X. Li, M. Moreno Maza, and É. Schost. Implementation techniques for fast polynomial arithmetic in a high-level programming environment. In ISSAC'06, pages 93-100. ACM Press, 2006.
6. J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 1999.
7. G. Hanrot, M. Quercia, and P. Zimmermann. The middle product algorithm, I. Appl. Algebra Engrg. Comm. Comput., 14(6):415-438, 2004.
8. J. van der Hoeven. The truncated Fourier transform and applications. In ISSAC'04, pages 290-296. ACM Press, 2004.
9. http://www.shoup.net/ntl. The Number Theory Library. V. Shoup, 1996-2006.
10. R. D. Jenks and R. S. Sutor. AXIOM, The Scientific Computation System. Springer-Verlag, 1992.
11. X. Li. Efficient management of symbolic computations with polynomials, 2005. University of Western Ontario.
12. X. Li and M. Moreno Maza. Efficient implementation of polynomial arithmetic in a multiple-level programming environment. In A. Iglesias and N. Takayama, editors, ICMS 2006, pages 12-23. Springer, 2006.
13. P. L. Montgomery. Modular multiplication without trial division. Math. of Comp., 44(170):519-521, 1985.
14. V. Shoup. A new polynomial factorization algorithm and its implementation. J. Symb. Comp., 20(4):363-397, 1995.
15. S. M. Watt, P. A. Broadbery, S. S. Dooley, P. Iglio, S. C. Morrison, J. M. Steinbach, and R. S. Sutor. A first report on the A# compiler. In ISSAC'94, pages 25-31. ACM Press, 1994.

Semi-analytical Approach for Analyzing Vibro-Impact Systems

Algimantas Cepulkauskas1, Regina Kulvietiene1, Genadijus Kulvietis1, and Jurate Mikucioniene2

1 Vilnius Gediminas Technical University, Sauletekio 11, Vilnius 2040, Lithuania
{algimantas cepulkauskas, regina kulvietiene, genadijus kulvietis}@gama.vtu.lt
2 Kaunas University of Technology, Kestucio 27, Kaunas 44025, Lithuania
jumik@ktu.lt

Abstract. A semi-analytical approach, combining the features of analytical and numerical computations, is proposed. Separating the linear and nonlinear parts of the equations of motion, the harmonic balance method and computer algebra have been synthesized for obtaining analytical solutions of the nonlinear part, whereas the linear part was solved by numerical methods. On the basis of this technique, the numerical investigation of abrasive treatment process dynamics has been performed and regimes ensuring the most effective treatment process determined.

1 Introduction

Mechanical systems exhibiting impacts, so-called impact oscillators in the English literature at the present time, or vibro-impact systems in the Russian literature, are strongly nonlinear or piecewise linear, due to sudden changes in the velocities of vibrating bodies at the instant of impact or friction forces when the velocity of motion changes its polarity. Their practical significance is considerable and the investigation of the motion of such systems was begun about fifty years ago [7].
Several methods of theoretical analysis were developed and different models of impacts were presented [1,6,7]. The method of fitting, which uses the Newton restitution coefficient, seems to be the most important. It is accurate and applicable under the assumption that the time during an impact is negligible. It can solve certain assumed periodic impact motion and its stability, but this procedure can be realized in the explicit form only for simple mechanical systems, simple types of periodic motion, and for undamped impactless motion.
As usual, the solution method must correlate with the type of motion equations and concurrently with the character of the initial mechanical system. The harmonic balance method was chosen for the investigation of a vibratory abrasive treatment process [2,8] as it is easily applied to systems where calculations are made by computer algebra methods and, in this case, considerably less computer memory is needed than by other methods [3]. As a result, we obtain a nonlinear algebraic equation system with many very short expressions.



So it is possible and expedient to process the results by numerical methods. The analytic calculation system VIBRAN is extremely effective in this case. Since an adequate method for solving the motion equations of dynamically complicated systems has to contain a large amount of analytic as well as numerical calculations, there must be a strong program connection among them. The analytic calculation system VIBRAN was selected in order to ensure this connection [3]. The system is designed to generate subroutines in FORTRAN, and to select and reject excessive operations while generating programs in accordance with analytic expressions. Besides, this system stands out by the flexibility of its input-output and the unique means for operations with sparse matrices.
Implementation of Computer Algebra

Consider a system of s degrees of freedom with the Lagrangian function [5]:

    L = L(q_i, \dot{q}_i, t),   i = 1, 2, \ldots, s,                      (1)

where L is the Lagrangian function, q_i, \dot{q}_i, t are the generalized coordinates and velocities of the system and time, and s is the number of degrees of freedom.
The equations of motion of such a system are [4]:

    \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F_{q_i}.                      (2)
These equations can be divided into linear and nonlinear parts by the formal replacements L = L_L + L_N and F_{q_i} = F_{Li} + F_{Ni}. The equations of motion may now be expressed in the form:

    \left[\frac{d}{dt}\frac{\partial L_L}{\partial \dot{q}_i} - \frac{\partial L_L}{\partial q_i} - F_{Li}\right] + \left[\frac{d}{dt}\frac{\partial L_N}{\partial \dot{q}_i} - \frac{\partial L_N}{\partial q_i} - F_{Ni}\right] = 0,                      (3)

where F_{q_i}, F_{Ni}, F_{Li} are the generalized force and its nonlinear part (polynomial with respect to the generalized coordinates and periodic in time, or a Fourier expansion of time) and linear part, and L_N, L_L are the nonlinear and linear parts of the Lagrangian function, respectively.
The linear part can be formalized for numerical analysis without difficulty, and special VIBRAN programs were used to analyze the nonlinear part of the system. The proposed method provides shorter expressions for analytical computation and allows the analysis of systems of higher order. After some well-known transformations, the equations of motion can be rewritten in the matrix form:

    [M]\{\ddot{q}\} + [B]\{\dot{q}\} + [C]\{q\} = \{H(q, \dot{q}, t)\} + \{f(t)\},                      (4)

where f(t) is a periodic function and H(q, \dot{q}, t) is the nonlinear part of the system, calculated by a special VIBRAN program.
The solution of the above system can be expressed, using the harmonic balance method, in the form [3,4]:

    \{q\} = \{A_0\} + \{A_1\}\cos(\omega t) + \{A_2\}\sin(\omega t) + \ldots,                      (5)


where {A_i} are the unknown vectors that can be found from nonlinear algebraic equations.
According to the harmonic balance method, these equations for the first three vector coefficients are, in matrix form:

    [U]\{A\} - \{f\} - \{H(A)\} = \{0\},                      (6)

where f_i are the coefficients of the Fourier expansion of the function f(t). Equations for the other harmonics can be found analogously by the VIBRAN program. The expressions of H_i and the required derivatives are obtained in closed form using computer algebra techniques and the FORTRAN code generation procedure. Special modifications were made to the terms with dry friction, and the integration procedure was developed following the Malkin method [7].
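To make the structure of the resulting nonlinear algebraic system concrete, the following Python sketch builds the harmonic balance residuals for a simplified single-degree-of-freedom oscillator with dry friction. This is an illustrative stand-in, not the VIBRAN-generated FORTRAN code, and all parameter values are hypothetical:

```python
import numpy as np
from scipy.optimize import fsolve

# illustrative single-DOF system:  m*x'' + b*x' + c*x = f0*cos(w*t) - F*sign(x')
m, b, c, F, f0, w = 1.0, 0.2, 4.0, 0.3, 1.0, 1.5

def residuals(A):
    """Residuals of the balance equations for x(t) = A0 + A1*cos(w t) + A2*sin(w t)."""
    A0, A1, A2 = A
    t = np.linspace(0.0, 2 * np.pi / w, 400, endpoint=False)
    x = A0 + A1 * np.cos(w * t) + A2 * np.sin(w * t)
    xd = -A1 * w * np.sin(w * t) + A2 * w * np.cos(w * t)
    xdd = -(w ** 2) * (A1 * np.cos(w * t) + A2 * np.sin(w * t))
    friction = F * np.tanh(50.0 * xd)          # smoothed sign() of the velocity
    r = m * xdd + b * xd + c * x + friction - f0 * np.cos(w * t)
    # projecting the residual on {1, cos(w t), sin(w t)} gives the balance equations
    return [np.mean(r), np.mean(r * np.cos(w * t)), np.mean(r * np.sin(w * t))]

A0, A1, A2 = fsolve(residuals, [0.0, 0.1, 0.1])   # solve the nonlinear algebraic system
```

The three projections play the role of the vector equation (6); in the paper this system is generated symbolically by VIBRAN and then processed numerically in FORTRAN.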

Application in the Investigation of Abrasive Treatment Process Dynamics

A new method for the treatment of miniature ring-shaped details was proposed in [8], in which the internal surface of the detail is treated as well as the external one. During the vibratory treatment process the working-medium particles constantly strike each other. As a result, slight scratches and crater-shaped crevasses occur on the surface of the treated details and form the surface micro-relief. In this way, abrasive friction and impacts on the treated details perform the treatment process.
The equations of motion describing the dynamics of the vibratory abrasive treatment process are [5]:

    m_1\ddot{x}_1 + b\dot{x}_1 + b_k(\dot{x}_1 - \dot{x}_a) + (c + c_1)x_1 = c\xi(t) - F_m(x_1) - F_1\,\mathrm{sign}(\dot{x}_1 - \dot{x}_a) + b\dot{\xi}(t),
                                                                                                                           (7)
    m_a\ddot{x}_a + b\dot{x}_a + b_k(\dot{x}_a - \dot{x}_1) + c\,x_a = c\xi(t) + F_2\,\mathrm{sign}(\dot{x}_1 - \dot{x}_a) + b\dot{\xi}(t),
where m_1 is the mass of the components treated; m_a is the mass of the abrasive particles; the mechanical properties of the load are described by the elasticity c and the working-medium resistance b; F_1, F_2 are the dry-friction forces between a component and the abrasive; b_k is the viscous resistance; the elasticity of the magnetic field is described by a stiffness coefficient c_1; and the detail is additionally excited by a variable component of the magnetic field F_m(x_1), whose stiffness properties were obtained experimentally by the least-squares method.
The kinematic excitation of the loaded vessel is \xi(t) = A \sin \omega t.
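As a cross-check of harmonic balance results, a direct time-domain integration of a system of the form (7) is straightforward. The Python sketch below uses purely illustrative parameter values and replaces the experimentally fitted magnetic force Fm(x1) by a linear term cm*x1; both simplifications are assumptions made only for this example:

```python
import numpy as np
from scipy.integrate import solve_ivp

# illustrative parameters only (not the experimentally identified values)
m1, ma, b, bk, c, c1, cm, F1, F2, A, w = 1.0, 0.3, 0.05, 0.02, 2.0, 0.5, 0.1, 0.05, 0.05, 0.01, 10.0

def rhs(t, y):
    x1, v1, xa, va = y
    xi, xid = A * np.sin(w * t), A * w * np.cos(w * t)   # kinematic excitation and its rate
    sgn = np.tanh(100.0 * (v1 - va))                     # smoothed sign of the relative velocity
    a1 = (c * xi + b * xid - b * v1 - bk * (v1 - va) - (c + c1) * x1 - cm * x1 - F1 * sgn) / m1
    aa = (c * xi + b * xid - b * va - bk * (va - v1) - c * xa + F2 * sgn) / ma
    return [v1, a1, va, aa]

sol = solve_ivp(rhs, (0.0, 20.0), [0.0, 0.0, 0.0, 0.0], max_step=1e-3)
x1_steady = sol.y[0][sol.t > 10.0]   # discard the transient before inspecting amplitudes
```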
The analytic expressions generated by the VIBRAN program complete the analytic part of the calculation of H(A) [3]. The corresponding derivatives are very simple, and there are only 25 nonzero terms.
All the properties describing the dynamic behaviour of the process are investigated numerically, and the corresponding program is written in FORTRAN. For this purpose, in order to calculate the factors of the analytic expressions and their partial derivatives, two FORTRAN subroutines have


been generated: one for compiling a dictionary and another for calculating the expressions themselves. In addition, the program created by applying the harmonic balance method to systems of differential equations provides, besides the equations from which the amplitudes and constant components are found, the derivatives of these equations with respect to the unknowns. In this way, one or more criteria for a subsequent numerical parameter optimization may be calculated.

Conclusions

On the basis of solving nonlinear differential equations by the harmonic balance method, combined with the analytic calculation system VIBRAN, an investigation method for nonlinear systems with a dry friction effect has been created. This method combines the advantages of analytic calculation methods and computer algebra. It rests on the principle of parallel analytic-numerical calculation, in which analytic rearrangements are applied only to the nonlinear part of the system, while the linear part of the system is solved numerically. The proposed method provides shorter expressions for analytic computation and allows the analysis of systems of higher order.

References
1. Baron, J.M.: Abrasive and Magnetic Treatment of Details and Cutters. Mashinostrojenie, St. Petersburg (in Russian) (1986)
2. Blekhman, I.I.: Forming the Properties of Nonlinear Mechanical Systems by Means of Vibration. Proc. IUTAM/IFToMM Symposium on Synthesis of Nonlinear Dynamical Systems, Solid Mechanics and Its Applications, Vol. 73. Kluwer Academic Publ., Dordrecht (2000) 1-13
3. Cepulkauskas, A., Kulvietiene, R., Kulvietis, G.: Computer Algebra for Analyzing the Vibrations of Nonlinear Structures. Lecture Notes in Computer Science, Vol. 2657. Springer-Verlag, Berlin Heidelberg New York (2003) 747-753
4. Klymov, D.O., Rudenko, V.O.: Metody kompiuternoj algebry v zadachah mechaniki. Nauka, Moscow (in Russian) (1989)
5. Kulvietiene, R., Kulvietis, G., Fedaravicius, A., Mikucioniene, J.: Numeric-Symbolic Analysis of Abrasive Treatment Process Dynamics. Proc. Tenth World Congress of the Theory of Machines and Mechanisms, Vol. 6, Oulu, Finland (1999) 2536-2541
6. Lewandowski, R.: Computational Formulation for Periodic Vibration of Geometrically Nonlinear Structures, Part 1: Theoretical Background. Int. J. Solid Structures 34 (15) (1997) 1925-1947
7. Malkin, I.G.: Some Problems of Nonlinear Oscillation Theory. Gostisdat, Moscow (in Russian) (1956)
8. Mikucioniene, J.: Investigation of Vibratory Magnetic Abrasive Treatment Process Dynamics for Miniature Details. Ph.D. Thesis, KTU, Technologija, Kaunas (1997)

Formal Verification of Analog and Mixed Signal Designs in Mathematica

Mohamed H. Zaki, Ghiath Al-Sammane, and Sofiène Tahar
Dept. of Electrical & Computer Engineering, Concordia University
1455 de Maisonneuve W., Montreal, Quebec, H3G 1M8, Canada
{mzaki,sammane,tahar}@ece.concordia.ca

Abstract. In this paper, we show how symbolic algebra in Mathematica can be used to formally verify analog and mixed signal designs. The verification methodology is based on combining induction and constraint solving to generate correctness proofs for the system with respect to given properties. The methodology has the advantage of avoiding the exhaustive simulation usually employed in verification. We illustrate this methodology by proving the stability of a ΔΣ modulator.
Keywords: AMS Designs, Formal Verification, Mathematica.

1 Introduction
With the latest advancements in semiconductor technology, the integration of digital, analog and mixed-signal (AMS) designs into a single chip has become possible and has led to
the development of System on Chip (SoC) designs. One of the main challenges of SoC
designs is the verification of AMS components, which interface digital and analog parts.
Traditionally, analyzing the symbolically extracted equations is done through simulation [1]. However, exhaustive simulation of all possible scenarios is impossible,
and hence it cannot guarantee the correctness of the design. In contrast to simulation,
formal verification techniques aim to prove that a circuit behaves correctly for all possible input signals and initial conditions and that none of them drives the system into
an undesired behavior. In fact, existing formal methods [2] are time bounded, where
verification is achieved only on a finite time interval. We overcome this limitation by
basing our methodology on mathematical induction, hence any proof of correctness of
the system is time independent. In this paper, we show how symbolic algebra in Mathematica can be used to formally verify the correctness of AMS designs. We illustrate
our methodology by applying it to prove the stability of a ΔΣ modulator [3].
The proposed verification methodology is based on combining induction and constraint solving to generate a correctness proof for the system. This is achieved in two phases, modeling and verification, as shown in Figure 1. Starting with an AMS description (digital part and analog part) and a set of properties, we extract, using symbolic
simulation, a System of Recurrence Equations (SRE) [4]. These are combined recurrence relations that describe each property in terms of the behavior of the system. The SRE is used in the verification phase along with an induction-based proof with constraints
defined inside Mathematica (details can be found in [4]). If a proof is obtained, then the

property is verified. Otherwise, we provide counterexamples for the non-proved properties using reachability criteria. If the counterexample is realistic (a strong instance), then we have identified a problem (bug) in the design; otherwise the counterexample is
spurious (weak instance) and should be eliminated from the verification process.

Fig. 1. Overview of the methodology: the AMS description (analog and digital parts) and the properties are symbolically simulated into a System of Recurrence Equations (SRE); an inductive proof with constraints either verifies the property (verified true) or yields a counterexample, which the counterexample analyzer classifies as a weak instance (fed back as constraint refinement) or a strong instance (verified false).

2 Implementation in Mathematica
An SRE is a system of the form X_i(n) = f_i(X_j(n - γ)), (j, γ) ∈ ε_i, n ∈ Z, where f_i(X_j(n - γ)) is a generalized If-formula (see [5] for a complete definition). The set ε_i is a finite non-empty subset of {1, ..., k} × N, and the integer γ is called the delay. A property is a relation of the form P = quanta(X, cond, expr), where quanta ∈ {∀, ∃}, X is a set of variables, cond is a logical proposition formula constructed over X, and expr is an If-formula that takes values in the Boolean domain B.
Proving Properties. Mathematical induction is then used to prove that a property P(n) holds for all nonnegative integers n ≥ n0, where n0 is the time point after which the property should be True:
- Prove that P(n0) is True.
- Prove that ∀n > n0, P(n) ⟹ P(n + 1).
The induction algorithm is implemented in Mathematica using functions like Reduce, Assuming and Refine. It tries to prove a property of the form quanta(X, cond, expr), and otherwise it gives a counterexample using FindCounterExample:
If Prove(quanta(X, cond, expr)) = True then
    Return(True)
else
    FindCounterExample(cond ∧ ¬expr, var)
Finding Counterexamples. The basic idea is to find particular variable values for
which the property is not satisfied. This is implemented using the Mathematica function FindInstance[expr, vars, assum]. It finds an instance of vars that makes expr True


if an instance exists, and gives {} if it does not. The result is of the form {{v1 → instance1, v2 → instance2, ..., vm → instancem}}, where vars = {v1, v2, ..., vm}. FindInstance can find instances even if Reduce cannot give a complete reduction. For example, the Mathematica command FindInstance[x^2 - a y^2 == 1 && x > y, {a, x, y}] returns {{a → 1/2, x → 1/2, y → 1}}.
We need to ensure that the instance is reachable by the SRE before considering it as a counterexample. For example, suppose we verify the recurrence equation Un = Un-1 + 1 against the property ∀n > 0. Pn := Un > 0. FindInstance transforms Pn into an algebraic problem and gives the instance Un-1 → -2. However, this instance will never be reached by Un for U0 = 0. Depending on reachability, we distinguish two types of SRE instances:
Strong instance: it is given as a combination of the design input values. The counterexample is then always reachable.
Weak instance: it is a combination of input values and recurrence variable values. In this case, there is no guarantee that the counterexample is reachable.
If the recurrence equations are linear and the conditions of the If-formulas are monotone, then we can search directly for a reachable strong instance. We can solve these equations in Mathematica using the function RSolve[{Eq1, Eq2, ...}, Xi(n), n]. It returns an explicit solution of the SRE {Eqn} in terms of time relations, where the time n is an explicit parameter. We use RSolve to identify a time point at which a desired behavior is reached.

3 First-Order ΔΣ Modulator
ΔΣ modulators [3] are used in designing data converters. A ΔΣ modulator is stable if the integrator output remains bounded for a bounded input signal. Figure 2 shows a first-order one-bit ΔΣ modulator with two quantization levels, +1V and -1V. The quantizer input signal y(n) should remain between -2V and +2V in order not to overload the quantizer. The SRE of the ΔΣ modulator is:
y(n) = y(n - 1) + u(n) - v(n - 1)
v(n - 1) = IF(y(n - 1) > 0, 1, -1)
Fig. 2. First-order ΔΣ modulator (input u[n], integrator output y[n], one-bit feedback v[n], with Z-1 delay elements)
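For intuition, the SRE above can also be iterated numerically. The short Python sketch below is only an illustration (the paper's verification is symbolic, in Mathematica); it checks the bound of Property 1 for a constant in-range input:

```python
def simulate_modulator(u, steps=100000, y0=0.0):
    """Iterate the first-order SRE: y(n) = y(n-1) + u(n) - v(n-1),
    with the one-bit quantizer v(n-1) = 1 if y(n-1) > 0 else -1."""
    y = y0
    for _ in range(steps):
        v = 1.0 if y > 0 else -1.0   # quantizer applied to the previous integrator output
        y = y + u - v                # integrator update
        assert abs(y) <= 2.0, "Property 1 bound |y(n)| <= 2 violated"
    return y

simulate_modulator(0.7)   # |u| <= 1 and |y(0)| <= 1: the bound holds on every step
```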

The stability is expressed by the following properties:

Property 1. |u| ≤ 1 ∧ |y(0)| ≤ 1 ⟹ |y(n)| ≤ 2. This ensures that the modulator will always be stable when started from admissible initial conditions. In Mathematica, to prove the property at time n we write:


in[1]:= Reduce[
         ForAll[{u, y[n-1]}, And[-1 < u < 1, -2 < y[n-1] < 2],
           And[(-1 + u + y[n-1] <= 2), (1 + u + y[n-1] >= -2)]], {u, y[n-1]}, Reals]
out[1]:= True

Property 2. |u| > 1 ∧ |y(0)| ≤ 1 ⟹ |y(n)| > 2. If the input to the modulator does not conform to the stability requirement in Property 1, then the modulator will be unstable:
in[1]:= FindInstance[And[1 < u, 1 > y > 0, (-1 + u + y > 2)], {u, y}]
out[1]:= {u → 7/2, y → 1/2}

As y = 1/2 is already a valid state for y[n], the instance is weak. We refine the instance by adding it to the constraints list and restart the proof:
in[1]:= Assuming[And[u == 7/2, 1 > y > 0], Refine[(-1 + u + y > 2)]]
out[1]:= True

Thus, the instance u → 7/2 is a strong instance for any y[n].

Property 3. |u| ≤ 1 ∧ |y(0)| > 2 ⟹ ∃n0 > 0. ∀n > n0. |y(n)| < 2. If the input of the quantizer is distorted and causes the modulator to be temporarily unstable, the system will return to the stable region and stay stable afterwards; that is, there exists an n0 for which the modulator will be stable for all n > n0. RSolve is used along with FindInstance to search for this n0. We have two cases: y[n - 1] > 0 and y[n - 1] < 0. In Mathematica, to prove the property we write:
in[1]:= Eq = y[n+1] == (-1 + u + y[n]);
        RSolve[Eq && y[0] == a, y[n], n]
out[1]:= y[n] → a - n + n u
in[2]:= Reduce[a + n + n u > -2 && u > -1 && a ∈ Reals, n]
out[2]:= a ∈ Reals && u > -1 && n > (-2 - a)/(1 + u)
in[3]:= FindInstance[a < -2 && n > 2 && 1 > u > 0.5 && n > (-2 - a)/(1 + u), {a, u, n}]
out[3]:= {a → -5.5, u → 0.75, n → 4}

Thus, we have found a time value which provides a proof for the property: n > (-2 - a)/(1 + u). As the property is formalized using the existential quantifier, it is enough to find one instance: n0 → 4.

4 Conclusions
We have presented how Mathematica can be used efficiently to implement a formal verification methodology for AMS designs. We used the notion of SRE as a mathematical model that can represent both the digital and analog parts of a design. The induction-based technique traverses the structure of the normalized properties and provides a correctness proof or, otherwise, a counterexample. Our methodology overcomes the time-bound limitations of conventional exhaustive methods. Additional work is needed in


order to integrate the methodology in the design process, like the automatic generation
of the SRE model from design descriptions given in HDL-AMS languages.

References
1. Gielen, G.G.E., Rutenbar, R.A.: Computer-aided Design of Analog and Mixed-signal Integrated Circuits. Proceedings of the IEEE, Volume 88 (2000) 1825-1852
2. Zaki, M.H., Tahar, S., Bois, G.: Formal Verification of Analog and Mixed Signal Designs: Survey and Comparison. In: NEWCAS '06, Gatineau, Canada, IEEE (2006)
3. Schreier, R., Temes, G.C.: Understanding Delta-Sigma Data Converters. IEEE Press-Wiley (2005)
4. Al-Sammane, G., Zaki, M.H., Tahar, S.: A Symbolic Methodology for the Verification of Analog and Mixed Signal Designs. In: DATE '07, Nice, France, IEEE/ACM (2007)
5. Al-Sammane, G.: Simulation Symbolique des Circuits Décrits au Niveau Algorithmique. PhD thesis, Université Joseph Fourier, Grenoble, France (2005)

Efficient Computations of Irredundant Triangular Decompositions with the RegularChains Library

Changbo Chen1, François Lemaire2, Marc Moreno Maza1, Wei Pan1, and Yuzhen Xie1

1 University of Western Ontario, London N6A 1M8, Canada
2 Université de Lille 1, 59655 Villeneuve d'Ascq Cedex, France

Abstract. We present new functionalities that we have added to the RegularChains library in Maple to efficiently compute irredundant triangular decompositions. We report on the implementation of different strategies. Our experiments show that, for difficult input systems, the computing time for removing redundant components can be reduced to a small portion of the total time needed for solving these systems.
Keywords: RegularChains, quasi-component, inclusion test, irredundant triangular decomposition.

Introduction

Efficient symbolic solving of parametric polynomial systems is an increasing


need in robotics, geometric modeling, stability analysis of dynamical systems
and other areas. Triangular decomposition provides a powerful tool for these
systems. However, for parametric systems, and more generally for systems in
positive dimension, these decompositions have to face the problem of removing
redundant components. This problem is not limited to triangular decompositions
and is also an important issue in other symbolic decomposition algorithms such
as those of [9,10] and in numerical approaches [7].
We study and compare different criteria and algorithms for deciding whether
a quasi-component is contained in another. Then, based on these tools, we obtain
several algorithms for removing redundant components in a triangular decomposition. We report on the implementation of these different solutions within the
RegularChains library [5].
We have performed extensive comparisons of these approaches using well-known problems in positive dimension [8]. Our experiments show that the removal of the redundant components is never a bottleneck. Moreover, we have developed a heuristic inclusion test which provides very good running times and which fails very rarely in detecting an inclusion. We believe that we have obtained an efficient solution for computing irredundant triangular
decompositions.



Inclusion Test of Quasi-components

In this section we describe our strategies for the inclusion test of quasi-components, based on the RegularChains library. We refer to [1,6,5] for the notion of a regular chain and its related concepts, such as initials, saturated ideals, quasi-components, and the related operations.
Let T, U ⊂ K[X] be two regular chains. Let hT and hU be the respective products of their initials. We denote by sat(T) the saturated ideal of T. We discuss how to decide whether the quasi-component W(T) is contained in W(U) or not. An unproved algorithm for this inclusion test is stated in [4]; it appeared not to be satisfactory in practice, since it relies on normalized regular chains, which tend to have much larger coefficients than non-normalized regular chains, as verified experimentally in [2] and formally proved in [3].
Proposition 1. The inclusion W(T) ⊆ W(U) holds if and only if both of the following statements hold:

(C1) for all p ∈ U we have p ∈ √(sat(T)),
(C2) we have W(T) ∩ V(hU) = ∅.
If sat(T) is radical, then condition (C1) can be replaced by:
(C1') for all p ∈ U we have p ∈ sat(T),
which is easier to check. Checking (C2) can be approached in different ways, depending on the computational cost that one is willing to pay. The RegularChains library provides an operation Intersect(p, T) returning regular chains T1, ..., Te such that we have
V(p) ∩ W(T) ⊆ W(T1) ∪ ... ∪ W(Te) ⊆ V(p) ∩ \overline{W(T)}.
A call to Intersect can be seen as relatively cheap, since Intersect(p, T) exploits the fact that T is a regular chain. Checking
(Ch) Intersect(hU, T) = ∅
is a good criterion for (C2). However, when Intersect(hU, T) does not return the empty list, we cannot conclude. To overcome this limitation, we rely on Proposition 2 and the operation Triangularize of the RegularChains library. For a polynomial system F, Triangularize(F) returns regular chains T1, ..., Te such that V(F) = \overline{W(T1)} ∪ ... ∪ \overline{W(Te)}.
Proposition 2. The inclusion W(T) ⊆ W(U) holds if and only if both of the following statements hold:

(C1) for all p ∈ U we have p ∈ √(sat(T)),
(C2') for all S ∈ Triangularize(T ∪ {hU}) we have hT ∈ √(sat(S)).
This provides an effective algorithm for testing the inclusion W(T) ⊆ W(U). However, the cost of computing Triangularize(T ∪ {hU}) is clearly higher than that of Intersect(hU, T), since the former operation cannot take advantage of the fact that T is a regular chain.
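Proposition 2 translates directly into a certified inclusion test. The Python sketch below is only a language-neutral illustration: the callables in_radical_sat, triangularize and initial_product are assumed oracles standing for the corresponding RegularChains operations, not the library's actual Maple API:

```python
def is_included_certified(T, U, in_radical_sat, triangularize, initial_product):
    """Decide whether W(T) is contained in W(U), following Proposition 2.

    in_radical_sat(p, S) -- True iff p lies in the radical of sat(S)
    triangularize(polys) -- regular chains S1..Se describing V(polys)
    initial_product(S)   -- product of the initials of the regular chain S
    """
    # (C1): every polynomial of U must vanish on W(T)
    if not all(in_radical_sat(p, T) for p in U):
        return False
    # (C2'): W(T) must avoid the hypersurface V(h_U)
    h_T, h_U = initial_product(T), initial_product(U)
    return all(in_radical_sat(h_T, S) for S in triangularize(list(T) + [h_U]))
```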


Removing Redundant Components

Let F ⊂ K[X] and let T = T1, ..., Te be a triangular decomposition of V(F), that is, a set of regular chains such that V(F) = \overline{W(T1)} ∪ ... ∪ \overline{W(Te)}. We aim at removing every Ti such that there exists Tj, with i ≠ j, and W(Ti) ⊆ W(Tj). Based on the results of Section 2, we have developed the following strategies for testing the inclusion W(T) ⊆ W(U); the removal loop built on top of them is sketched after this list.
heuristic-no-split: It checks whether (C1') and (Ch) hold. If both hold, W(T) ⊆ W(U) has been established; otherwise no conclusion can be made.
heuristic-with-split: It tests the conditions (C1) and (Ch). Checking (C1) is achieved by means of the operation Regularize [5,6]: for a polynomial p and a regular chain T, Regularize(p, T) returns regular chains T1, ..., Te such that we have
W(T) ⊆ W(T1) ∪ ... ∪ W(Te) ⊆ \overline{W(T)},
and for each 1 ≤ i ≤ e the polynomial p is either zero or regular modulo sat(Ti). Therefore, condition (C1) holds if and only if for all Ti returned by Regularize(p, T) we have p ≡ 0 mod sat(Ti).
certified: It checks conditions (C1) and (C2'). If both hold, then W(T) ⊆ W(U) has been established. If at least one of the conditions (C1) or (C2') does not hold, then the inclusion W(T) ⊆ W(U) does not hold either.
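Once an inclusion test is fixed, the removal step itself is a simple pairwise filter, as sketched below in Python (a schematic stand-in for the Maple implementation, with is_included standing for any of the tests above). Since the heuristic tests only ever fail by not detecting an inclusion, using them here can only leave extra components, never discard a needed one:

```python
def remove_redundant(components, is_included):
    """Drop every quasi-component contained in another one, keeping a single
    representative when two components mutually include each other."""
    kept = []
    for T in components:
        if any(is_included(T, U) for U in kept):
            continue                                        # T is redundant
        kept = [U for U in kept if not is_included(U, T)]   # drop components T subsumes
        kept.append(T)
    return kept
```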
The following well-known polynomial systems can be found at [8]. For each of them, the zero set has dimension at least one. Table 1 and Table 2 report the number of components and the running time of the different approaches for these input systems, based on which we make the following observations:
1. The heuristic removal without split performs very well. First, for all examples except sys 8, it discovers all redundant components. Second, for all examples except sys 8, its running time is a relatively small portion of the solving time (third column of Table 1).
2. Theoretically, the heuristic removal with split can eliminate more redundancies than the other strategies.
Table 1. Triangularize without removal, and certified removal (Proposition 2)

Sys  Name            Triangularize (no removal)   Certified (Proposition 2)
                       #RC      time(s)             #RC      time(s)
 1   genLinSyst-3-2     20        1.684               17        1.182
 2   Butcher            15        9.528                7        0.267
 3   MacLane           161       12.733               27        7.144
 4   neural              10       14.349                4        8.948
 5   Vermeer              6       27.870                5       58.396
 6   Liu-Lorenz          23       29.044               16      121.793
 7   chemical             7       71.364                5        7.727
 8   Pappus             393       37.122              120      141.702
 9   Liu-Lorenz-Li       22     1796.622                9       96.364
10   KdV572c11s21        41     8898.024                7        6.980


Table 2. Heuristic removal, without and with split, followed by certification

      Heuristic (C1')+(Ch)    Certification        Heuristic (C1)+(Ch)    Certification
      (without split)         (deterministic)      (with split)           (deterministic)
Sys    #RC    time(s)          #RC    time(s)       #RC    time(s)         #RC    time(s)
 1      17     0.382            17     1.240         17     0.270           17     1.214
 2       7     0.178             7     0.259          7     0.147            7     0.325
 3      27     3.437            27     8.470         27     3.358           27     8.239
 4       4     1.881             4     8.353          4     6.429            4    14.045
 5       5     0.771             5    60.108          8    54.455            8   109.928
 6      16     1.937            16   123.052         18    96.492           18   203.937
 7       5     0.243             5     7.828          5     5.180            5    12.842
 8     124    42.817           120   135.780        124    48.756          120   148.341
 9       9     8.186             9   101.668         10   105.598           10   217.837
10       7     4.878             7     6.688          7     5.881            7     7.424

Indeed, it can discover that a quasi-component is contained in the union of two others, while these three components are pairwise non-inclusive.
3. In practice, the heuristic removal with split does not discover more irredundant components than the heuristic removal without split, except for systems 5 and 6. However, its running time overhead is large.
4. The direct deterministic removal is also quite expensive on several systems (5, 6, 8). Unfortunately, the heuristic removal without split, used as a pre-cleaning process, does not really reduce the cost of a certified removal.

References
1. P. Aubry, D. Lazard, and M. Moreno Maza. On the theories of triangular sets. J. Symb. Comp., 28(1-2):105-124, 1999.
2. P. Aubry and M. Moreno Maza. Triangular sets for solving polynomial systems: A comparative implementation of four methods. J. Symb. Comp., 28(1-2):125-154, 1999.
3. X. Dahan and É. Schost. Sharp estimates for triangular sets. In ISSAC '04, pages 103-110. ACM, 2004.
4. D. Lazard. A new method for solving algebraic systems of positive dimension. Discr. App. Math., 33:147-160, 1991.
5. F. Lemaire, M. Moreno Maza, and Y. Xie. The RegularChains library. In Ilias S. Kotsireas, editor, Maple Conference 2005, pages 355-368, 2005.
6. M. Moreno Maza. On triangular decompositions of algebraic varieties. Technical Report TR 4/99, NAG Ltd, Oxford, UK, 1999. http://www.csd.uwo.ca/moreno.
7. A.J. Sommese, J. Verschelde, and C.W. Wampler. Numerical decomposition of the solution sets of polynomial systems into irreducible components. SIAM J. Numer. Anal., 38(6):2022-2046, 2001.
8. The SymbolicData Project. http://www.SymbolicData.org, 2000-2006.
9. D. Wang. Elimination Methods. Springer, 2001.
10. G. Lecerf. Computing the equidimensional decomposition of an algebraic closed set by means of lifting fibers. J. Complexity, 19(4):564-596, 2003.

Characterisation of the Surfactant Shell Stabilising Calcium Carbonate Dispersions in Overbased Detergent Additives: Molecular Modelling and Spin-Probe-ESR Studies

Francesco Frigerio and Luciano Montanari
ENI R&M, SDM-CHIF, via Maritano 26, 20097 San Donato Milanese, Italy
{francesco.frigerio, luciano.montanari}@eni.it

Abstract. The surfactant shell stabilising the calcium carbonate core in overbased
detergent additives of lubricant base oils was characterised by computational and
experimental methods, comprising classical force-field based molecular simulations and spin-probe Electron Spin Resonance spectroscopy. An atomistic model
is proposed for the detergent micelle structure. The dynamical behaviour observed during diffusion simulations of three nitroxide spin-probe molecules into
micelle models could be correlated to their mobility as determined from ESR
spectra analysis. The molecular mobility was found to be dependent on the
chemical nature of the surfactants in the micelle external shell.

1 Introduction
The lubrication of modern internal combustion engines requires the addition of specific additives to the base oils to improve the overall performance (minimization of
corrosion, deposits and varnish formation in the engine hot areas) [1]. Calcium sulphonates are the most widely used metallic detergent additives. They are produced by
sulfonation of synthetic alkylbenzenes. The simplest member would be a neutral alkylbenzene sulphonate with an alkyl solubilizing group approximately C18 to C20 or
higher to provide adequate oil solubility. In addition to metallic detergents such as the
neutral sulphonate, modern oil formulations contain basic compounds which provide
some detergency. Their main function, however, is to neutralize acid and to prevent
corrosion from acid attack. It is economically advantageous to incorporate as much
neutralizing power in the sulphonate molecule as possible: excess base in the form of
calcium carbonate can be dispersed in micelles [2, 3] to produce the so-called overbased sulphonates. Dispersions of calcium carbonate stabilized by calcium sulphonates have been characterized [4] using different techniques: electron microscopy [5],
ultracentrifugation [6], and neutron scattering [7]. SAXS results show that overbased
calcium sulphonates appear as polydisperse micelles having an average calcium carbonate core radius of 2.0 nm with a standard deviation of 0.4 nm [8]. The overbased
calcium sulphonates form reverse micelles in oil, consisting of amorphous calcium

carbonate nanoparticles surrounded by sulphonate surfactants. The polar heads (sulphonate) are attached to the metal core, while the hydrocarbon tails of hydrophobic
nature stabilize the colloidal particle in the non-polar oil medium. By coupling three surface analysis techniques (XPS, XANES and ToF-SIMS), it was observed that some residual calcium hydroxide is present in the micellar core, located mainly in the outer region of the core [9]. ToF-SIMS shows that the molecular structures of the detergent molecules are in good agreement with the micelle synthesis data; little is still known, however, about the physical nature of the surfactant shell.
The compactness of the surfactant shell could play an important role in preventing negative consequences of the interaction of the carbonate core with other additives used in the oil formulation, or with water molecules; such interactions cause separation of the calcium carbonate by precipitation. In this study the molecular dynamics within the
surfactant shell was probed in a combined computational and experimental approach
by a small nitroxide such as TEMPO (2,2,6,6-tetramethylpiperidine-N-oxyl) and two
nitroxide-labelled fatty acids (5- and 16-doxyl-stearic acids). In fact, nitroxides are
known to exhibit ESR spectra that depend on the mobility of the spin-probe and the
micro-viscosity of the probe environment. They have been used to evaluate the microstructure of the absorbed layer of surfactants and polymers at the solid-liquid interface
[10-14] and also inside composite polymers [15, 16].
Detailed three-dimensional models were previously proposed for small overbased
micelles containing various classes of surfactants [17-21]. Experimental measurements collected in our laboratory (data not shown) pointed to a couple of important
micelle features: a flat disk shape for the inner core and a tightly packed outer shell.
The core is mainly composed of amorphous calcium carbonate and it is surrounded by
a distribution of surfactant molecules, arranged as a single layer with polar groups
contacting the core surface. The diffusion of TEMPO and labelled fatty acids through
the overbased micelle surfactant shell was simulated by a classical molecular mechanics methodology. The force-field based simulations protocols were applied to detailed
atomistic models, which were built to cover only the central portion of the micelle structure while containing all its essential features. The slow molecular motions of the stable surfactant layer around the rigid inorganic core could be reproduced by performing molecular dynamics calculations. Furthermore, the movements of the nitroxide spin-probes were studied by an approach combining forced diffusion and constraint-free
molecular dynamics. Stable locations of such small molecules could be defined for
each micelle model under investigation.
It is assumed that the polar head groups of the probe molecules (nitroxide for
TEMPO and carboxylic for fatty acids) tend to be placed on the surface of the calcium
carbonate cores. ESR spectra give information about the viscosity of the local environment at different distances from carbonate surface (at the boundary for TEMPO,
while at 5- and 16- carbon positions for the two spin-labelled fatty acids). In our laboratory different surfactant molecules were used in the synthesis of overbased detergents with high calcium carbonate content, expressed as Total Base Number (the
amount of soluble colloidal carbonate and of metal oxide and hydroxide, measured as
equivalent milligrams of KOH per gram of sample [22]). Three overbased detergents
with TBN=300 and a mixture of mono- and di-alkyl-benzene-sulphonate were
analyzed.


2 Experimental Methods
An approach combining computer graphics and atomistic simulations [17, 19]
produced a detailed model for the central portion of the overbased micelle structure.
The essential model features (thickness and internal structure of the core, concentration and location of excess hydroxide ions, density and molecular arrangement of the
shell) were inferred from analytical determinations performed on the detergent micelles that were produced in our laboratories (data not shown). Different relative concentrations of the surfactants were used to build the external shell of three micelle
models, referred to as model a, b, c throughout this paper. The starting molecular and
ionic building blocks were selected and manually manipulated within the InsightII
[23] graphical interface in order to set up initial atomic distributions. A partial micelle
model was built as an inorganic slab surrounded by two surfactant layers, one on top
of each of its largest surfaces. Three-dimensional periodic boundary conditions were
constantly applied during simulations in order to avoid model truncation effects.
Atomic parameters were assigned from the pcff force field [24]. Afterwards an amorphous calcium carbonate core (with a small concentration of hydroxide ions) and a
tight surfactant shell were generated by applying stepwise Monte Carlo docking and
molecular dynamics [25] to limited portions of the micelle models. These simulations
were performed by using InsightII and Discover [23], respectively. Along this part of
the model building process, the uninvolved atoms were kept frozen. Nitroxide spinprobes were then added to the system assembly, after subjecting the obtained micelle
models to extensive energy relaxation. The starting configurations contain a small
cluster of spin-probe molecules, packed in a single layer contacting the micelle surfactant shell. Respectively, 14 molecules of TEMPO, 6 of 5-doxyl-stearic acid and 6
of 16-doxyl-stearic acid were used. The forced diffusion procedure available within
InsightII [23] was carefully tailored to suit the overbased micelle model features. Its
application followed the thorough energy minimisation of each one of the starting
system configurations. Since the molecular motions within tightly packed assemblies
are very slow, they were accelerated by adding a directional force for a very short
time period at the beginning of the simulations, with the effect of gently pushing the
nitroxide spin-probes toward the micelle core. In this way the extremely long process
of generating spontaneous but unfrequent diffusion pathways could be avoided and
the simulations were concentrated on the more interesting task of studying the small
molecule motions throughout the micelle surfactant shell. The potential energy of the
system and the relative distances between the nitroxide groups of the spin-probes and
the micelle core center were derived by analysing the trajectories collected during the
following free molecular dynamics simulations. These values were finally compared
to the results from equivalent simulations, performed without the previous application
of the forced diffusion protocol and used as a reference (non-diffusive) state.
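The trajectory analysis itself reduces to simple geometry once the frames are available. The Python sketch below assumes plain coordinate arrays already exported from the simulation package (the array names and layout are ours, not the InsightII/Discover API) and computes the per-frame distance between a spin-probe nitroxide group and the centre of the micelle core, to be plotted against the potential energy as in Figs. 1-3:

```python
import numpy as np

def probe_core_distances(probe_xyz, core_xyz):
    """probe_xyz: (n_frames, 3) positions of the nitroxide N-O midpoint.
    core_xyz: (n_frames, n_core_atoms, 3) positions of the carbonate core atoms.
    Returns, for every frame, the distance between the probe and the geometric
    centre of the core (same length units as the input coordinates)."""
    core_centres = core_xyz.mean(axis=1)                 # (n_frames, 3)
    return np.linalg.norm(probe_xyz - core_centres, axis=1)
```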
All spin-probe molecules (Aldrich Chemie) were diluted at 0.3 mM concentration
into a mixture of SN150/SN500 (produced by ENI according to ASTM standard
specification) lubricant bases (2/1 by weight). The overbased detergents were dissolved (at 30% by weight) into the spin-probe/lubricant solutions. The ESR spectra
were collected with a Bruker ESP 300E spectrometer conditioned at a temperature
of 50 °C.


3 Results
The diffusion pathways of TEMPO and of the 5- and 16- spin-probe labelled stearic acids were followed and compared in the three model-built partial micelles (identified as a, b, c), which share an identical inorganic core but differ in the molecular distribution of surfactants in the external shell.
Fig. 1. Plot (left): distance (Å) vs. E_potential (kcal mol-1) from the simulation of TEMPO into micelle models a (grey), b (blue), c (magenta). Pictures (right): simulation boxes with 14 molecules (orange, van der Waals) diffused into models a (green), b (yellow), c (cyan), composed of the inorganic core (van der Waals) and surfactant layers (ball and stick).

The force constant application produced diffusive pathways during dynamics trajectories (Fig. 1) for the micelle models containing TEMPO. The average penetration
depth of the TEMPO nitroxide group into the micelle model is shortest and most energetically unfavourable (Fig. 1, left) within model a, while the results are slightly
better with model b and at their best within model c. The potential energy cost paid for the production of TEMPO diffusive pathways appears high and generally increases with the force constant value. For a fast comparison among the micelle models, only
very short trajectories were analysed. Further, longer molecular dynamics simulations
(data not shown) completely release all strain accumulated during the first part of the
Fig. 2. Plot (left): distance (Å) vs. E_potential (kcal mol-1) from the simulation of 5-doxyl-stearic acid into micelle models a (grey), b (blue), c (magenta). Pictures (right): simulation boxes with 6 molecules (orange, van der Waals) diffused into models a (green), b (yellow), c (cyan), composed of the inorganic core (van der Waals) and surfactant layers (ball and stick).


diffusive process, while the final configurations do not differ significantly. One of the
final, energetically relaxed TEMPO diffusion configurations is depicted for each of
the three models a, b, c (Fig. 1, right). While most spin-probe molecules are located within the surfactant shell, only a few of them come into contact with the carbonate core.
These results can be compared to the dynamic behaviour of 5-doxyl-stearic acid
(Fig. 2). The average equilibrium penetration depth towards the micelle core (Fig. 2,
left) is similar to that observed with TEMPO, but the generally lower potential energy cost reveals an easier diffusion through the surfactant shell. However, the structure of the two spin-probes is different: the stearic acid bears a nitroxide group laterally grafted to its long tail, therefore the reported distances from the micelle core
(Fig. 2) apply to a longer molecule than in the case of TEMPO (Fig. 1). However, the
order of spin-probe diffusion efficiencies through the three micelle models is again
found as: a < b < c. Three of the resulting relaxed configurations are reported (Fig. 2,
right). Differently from TEMPO, only a small fraction of the labelled stearic acid
molecules is able to reach a deep location into the surfactant shell.
Fig. 3. Plot (left): distance (Å) vs. E_potential (kcal mol-1) from the simulation of 16-doxyl-stearic acid into micelle models a (grey), b (blue), c (magenta). Pictures (right): simulation boxes with 6 molecules (orange, van der Waals) diffused into models a (green), b (yellow), c (cyan), composed of the inorganic core (van der Waals) and surfactant layers (ball and stick).

Comparable results were obtained from the analysis of the 16-doxyl-stearic acid
dynamics trajectories (Fig. 3). The average equilibrium penetration depth (Fig. 3, left)
plotted against the potential energy cost still reveals some differences: model c is
slightly favoured over b, and the latter over a. Limited penetration into the surfactant shell is generally observed, compared to 5-doxyl-stearic acid. This can be attributed to the nitroxide group being located further away from the spin-probe polar head in 16-doxyl-stearic acid. Three energetically relaxed configurations, resulting from the
interaction of a cluster of 16-doxyl-stearic acid with models a, b, c, are reported
(Fig. 3, right). As previously evidenced, the penetration of the surfactant shell by
these spin-probe molecules is limited and they do not get in close contact with the
core surface, differently from what happens with TEMPO.
The ESR spectra recorded for spin-probe molecules in solutions containing overbased micelles a, b and c are presented in Fig. 4, 5, and 6. Frequently such spectra
show the superimposition of two components, an isotropic triplet produced by freely
moving molecules and an anisotropic feature typical of a species located in a rigid


environment. The characteristic shape of that anisotropic signal originates from the
rotation of cylindrical molecules. Under the adopted conditions the 2A// hyperfine parallel coupling could be measured on most recorded spectra; its values are reported in Table 1.

Fig. 4. ESR spectra of a solution of TEMPO in base lubricant oil containing, respectively,
overbased micelles a (top), b (middle), c (bottom)

The results obtained with TEMPO (Fig. 4) clearly show the two components described above. The 2A// values (Table 1) for the three overbased micelles are quite similar to each other, though a somewhat lower rigidity is suggested for model a (Fig. 4,
top). The aspect of ESR spectrum for model b (Fig. 4, middle) is dominated by the isotropic signal, revealing a lower population of the anisotropic species, compared to the
other two overbased micelle solutions.

Fig. 5. ESR spectra of a solution of 5-doxyl-stearic acid in base lubricant oil containing, respectively, overbased micelles a (top), b (middle), c (bottom)

The 2A// values (Table 1) measured for 5-doxyl-stearic acid (Fig. 5) reveal a less
rigid environment around the nitroxide group, as compared to TEMPO. This is
slightly more evident for models b and c (Fig. 5, middle and bottom).

Fig. 6. ESR spectra of a solution of 16-doxyl-stearic acid in base lubricant oil containing, respectively, overbased micelles a (top), b (middle), c (bottom)


With the 16-labelled stearic acid spin-probe (Fig. 6) a 2A// value could be measured (Table 1) only in the spectrum recorded for micelle a (Fig. 6, top), whereas in the
other cases no hyperfine coupling could be detected by the peak analysis.
Table 1. Hyperfine parallel coupling 2A// values measured from the ESR spectra of overbased micelle solutions with three different spin-probes

Spin-Probe    TEMPO   5-doxyl-stearic acid   16-doxyl-stearic acid
micelle a      72.9          66.3                     69
micelle b      74.5          64.5                      -
micelle c      74.1          63.5                      -

The differences in mobility observed by comparing the ESR spin-probe spectra mainly ensue from the different distances of the spin labels from the polar surface of the micelle core. The carboxyl group of both labelled fatty acids was strongly attracted to it but could not reach it through the surfactant shell. On the contrary, TEMPO was able to penetrate deeply and produced the highest 2A// values. The larger coupling value for the 5-doxyl spin-probe in micelle a is due to peculiar alkyl-chain features of that surfactant shell. This effect is observed only with the labelled fatty acids and becomes more evident when the nitroxide sits at a large distance from the micelle core, as in 16-doxyl-stearic acid: the 2A// value is higher than for 5-doxyl-stearic acid in micelle a, while in micelles b and c the mobility is comparable to that of a free molecule in solution.

4 Discussion
The development of new generations of surfactants for lubricant oil additives requires
an accurate characterisation of the reverse micelle structure of the overbased detergents [26]. In this study a combination of experimental and computational results helped define a correlation between the chemical nature of the stabilising surfactant
shell and the environmental rigidity imposed upon diffusing small molecules. The
surfactant shell compactness, responsible for the remarkable stabilisation of the
strongly basic micelle core in a non-polar environment, can be reasonably distinguished from shell viscosity. The first property is commonly attributed to a tight
packing of surfactant aromatic moieties [19, 21], while the second is mainly influenced by the molecular features of their alkyl chains. The ESR spectra analysis contributed a quantitative measurement of the micelle shell viscosity, revealing a subtle
modulation of the mobility experienced by spin-probe molecules in different locations
throughout the surfactant shell. The diffusion of the spin-probes was found to depend both on their molecular shape and on the location of their polar groups along the structure. The molecular dynamics simulations of this process provided a pictorial description of the
surfactant shell viscosity effects on diffusion. Moreover, its distance dependence from
the micelle core was quantitatively confirmed. Small molecules like TEMPO were
able to penetrate into a highly rigid environment next to the core surface, while labelled fatty acids were shown to fill the available room among surfactant alkyl chains,
further away from the core. Compared with TEMPO, the nitroxide grafted next to the
fatty acid polar group (5-doxyl-stearic acid) experienced higher mobility. When


attached to the apolar end (16-doxyl-stearic acid), the molecular freedom was found to be as high as in solution, and an environmental rigidity effect was revealed only by the peculiar surfactant shell structure of micelle a. In conclusion, the previously defined structural features of overbased reverse micelles [2, 17-21, 26] have been further detailed by this study, with a view to developing improved performance as detergent additives of lubricant base oils.

References
1. Liston, T. V., Lubr. Eng. 48 (1992) 389-397
2. Roman, J.P., Hoornaert, P., Faure, D., Biver, C., Jacquet, F., Martin, J.M., J. Coll. Interface Sci. 144 (1991) 324-339
3. Bandyopadhyaya, R., Kumar, R., Gandhi, K.S., Langmuir 17 (2001) 1015-1029
4. Hudson, L.K., Eastoe, J., Dowding, P.J., Adv. Coll. Interface Sci. 123-126 (2006) 425-431
5. Mansot, J.L. and Martin, J.B., J. Microsc. Spectrosc. Electron., 14 (1989) 78
6. Tricaud, C., Hipeaux, J.C., Lemerle, J., Lubr. Sci. Technol. 1 (1989) 207
7. Markovic, I., Ottewill, R.H., Coll. Polymer. Sci. 264 (1986) 454
8. Giasson, S., Espinat, D., Palermo, T., Ober, R., Pessah, M., Morizur, M.F., J. Coll. Interface Sci. 153 (1992) 355
9. Cizaire, L.; Martin, J.M.; Le Mogne, Th., Gresser, E., Coll. Surf. A 238 (2004) 151
10. Berliner, L.J.: Spin Labeling, Theory and Applications, Academic Press, New York (1979)
11. Dzikovski, B.G., Livshits, V.A., Phys. Chem. Chem. Phys. 5 (2003) 5271
12. Wines, T.H., Somasundaran, P., Turro, N.J., Jockusch, S., Ottaviani, M.F., J. Coll.
Interface Sci. 285 (2005) 318
13. Kramer, G., Somasundaran, P., J. Coll. Interface Sci. 273 (2004) 115
14. Tedeschi, A.M., Franco, L., Ruzzi, M., Padano, L., Corvaja, C., D'Errico, G., Phys. Chem. Chem. Phys. 5 (2003) 4204
15. Maddinelli, G., Montanari, L., Ferrando, A., Maestrini, C., J. Appl. Polym. Sci. 102 (2006)
2810
16. Randy, B, Rabek, J.F.: ESR Spectroscopy in Polymer Research, Springer-Verlag, New
York (1977)
17. Tobias, D.J., Klein, M.L., J. Phys. Chem. 100 (1996) 6637-6648
18. Griffiths, J.A., Bolton, R., Heyes, D.M., Clint, J.H., Taylor, S.E., J. Chem. Soc. Faraday
Trans. 91 (1995) 687-696
19. Griffiths, J.A., Heyes, D.M., Langmuir, 12 (1996) 2418-2424
20. Bearchell, C.A., Danks, T.N., Heyes, D.M., Moreton, D.J., Taylor, S.E., Phys. Chem.
Chem. Phys. 2 (2000) 5197-5207
21. Bearchell, C.A., Heyes, D.M., Moreton, D.J., Taylor, S.E., Phys. Chem. Chem. Phys. 3
(2001) 4774-4783
22. Arndt, E.R., Kreutz, K.L., J. Coll. Interface Sci. 123 (1988) 230
23. Accelrys, Inc., San Diego
24. Hill, J.-R., Sauer, J., J. Phys Chem. 98 (1994) 1238-1244
25. Allen, M.P., Tildesley, D.J.: Computer simulations of liquids, Clarendon, Oxford (1987)
26. Hudson, L.K., Eastoe, J., Dowding, P.J., Adv. Colloid Interface Sci. 123-126 (2006) 425-431

Hydrogen Adsorption and Penetration of Cx (x=58-62) Fullerenes with Defects

Xin Yue1, Jijun Zhao2,*, and Jieshan Qiu1,*

1 State Key Laboratory of Fine Chemicals, Carbon Research Laboratory, School of Chemical Engineering, Center for Nano-Materials and Science, Dalian University of Technology, Dalian, 116024, China
2 State Key Laboratory of Materials Modification by Laser, Electron, and Ion Beams, School of Physics and Optoelectronic Technology and College of Advanced Science and Technology, Dalian University of Technology, Dalian, 116024, China
zhaojj@dlut.edu.cn, jqiu@dlut.edu.cn

Abstract. Density functional theory calculations were performed to investigate the endohedral and exohedral adsorption of a H2 molecule on the classical and nonclassical fullerenes Cx (x=58, 59, 60, 62) with seven-, eight-, and nine-membered rings. The adsorption energies are within 0.03 eV in magnitude, and the molecule-fullerene interaction is of the van der Waals type. Penetration of a H2 molecule through the different fullerene cages was discussed and the corresponding energy barriers were obtained. We find that the existence of large holes reduces the penetration barrier from 12.6 eV for a six-membered ring on the perfect C60 cage to about 8 eV for seven-membered rings and to about 5 eV for eight-membered rings.

1 Introduction
Soon after the discovery of carbon fullerenes, it was found that a variety of atoms
and molecules can be incorporated into the hollow carbon cages to form endohedral
complex structures, which lead to new nanoscale materials with novel physical and
chemical properties [1-3]. Endohedral fullerenes are not only of scientific interest
but are of technological importance for their potential usage in various fields such
as molecular electronics [4], magnetic resonance imaging [5], quantum computer
[6-9], and nuclear magnetic resonance (NMR) analysis [10, 11]. On the other hand,
tremendous efforts have been devoted to the hydrogen storage in carbon
nanostructures such as nanotubes [12]. Thus, the study of endohedral fullerene complexes with an encapsulated H2 molecule is of interest from several perspectives.
In order to obtain an endohedral fullerene complex with a hydrogen molecule encapsulated inside, the surface of the fullerene cage must be opened to provide a sufficiently large orifice to let the H2 molecule penetrate. Murata et al. investigated
* Corresponding authors.



the synthesis, structure, and properties of novel open-cage fullerenes with heteroatoms on the rim of the orifice [13], as well as the feasibility of inserting small atoms or molecules through the orifice of an open-cage C60 derivative. Hatzimarinaki et al. reported a novel methodology for the preparation of five-, seven-, and nine-membered fused rings on C60 fullerene [14].
Recently, molecular hydrogen was successfully placed inside open-cage
fullerenes [13, 15-21]. Murata et al. [16] reported the first syntheses and X-ray
structures of organic and organometallic derivatives of C60 and the usage of the
encapsulated molecular hydrogen as a magnetic shielding probe. After the
encapsulation of H2, the endohedral cages were then closed through a molecular-surgery method on a gram scale with up to 100% H2 incorporation [20]. Stimulated by this experimental progress, ab initio computational studies have been reported for the endohedral H2@C60 complex. Slanina et al. performed theoretical calculations of the encapsulation energy using modified Perdew-Wang and Becke functionals (MPWB1K) [22]. Shigeta et al. studied the dynamic charge fluctuation of an endohedral fullerene with H2 [23].
In addition to the opening and closing of fullerene cages via chemical
approaches, as-prepared defect fullerene cages with large holes are also possible [24-26]. For example, Qian et al. detected a pronounced peak of C62- in the LD-FTMS mass spectrum and performed a DFT calculation of the C62 cage with one 4MR [24]. Deng et al. observed the odd-numbered clusters C59 in laser desorption ionization of C60 oxides [26]. Accordingly, ab initio calculations have been carried out for the geometries, energies, and stabilities of such defective fullerene cages [27-29]. Hu et al. computed fullerene cages with large holes [27, 28]. Lee studied the structure and stability of the defective fullerenes C59, C58 and C57 [29].
Despite the existing theoretical efforts, to the best of our knowledge there is no ab initio calculation on the hydrogen adsorption and encapsulation in defect fullerenes. These nonclassical fullerenes with seven-membered rings (7MR), eight-membered rings (8MR), and so on, may serve well as model systems for the open-cage fullerenes obtained by other methods. Thus, it would be interesting to study the relationship between the size of the orifice ring and the barrier for an H2 molecule penetrating from the outside to the inside of the fullerene. In this paper, we address these issues by conducting DFT calculations on the adsorption and penetration of a H2 molecule on C60 and on nonclassical fullerenes with 7MR, 8MR, and 9MR.

2 Computational Methods
All-electron DFT calculations were carried out employing the generalized gradient approximation (GGA) with the PW91 functional [30] and the double numerical plus polarization (DNP) basis set, as implemented in the DMol program [31]. Self-consistent field (SCF) calculations were carried out with a convergence criterion of 10^-6 a.u. on the total energy. To ensure high-quality results, the real-space global orbital cutoff radius was chosen to be as high as 5.0 Å. It is known that the DFT method


within the GGA approximation is usually insufficient for describing the weak van der Waals (vdW) interaction. However, a recent DFT calculation of the hydrogen adsorption on carbon and boron nitride nanotubes [32] demonstrated that the PW91 functional can roughly reproduce the strength of the vdW interaction between a H2 molecule and a C6H6 benzene molecule obtained from highly accurate HF-MP2 calculations.

3 Results and Discussion


In this work, we considered eight fullerene cages, including perfect C60 and defect fullerenes. The configurations of the defect fullerene cages were taken from Ref. [29] for C58 and C59 with 7MR, 8MR, and 9MR, and from Ref. [24] for C62 with a 4MR. On the one hand, cages with a vacancy defect (unsaturated atom) were created by removing one atom from C60, such as C59_4-9 (with one 4MR and one 9MR) and C59_5-8 (with one 5MR and one 8MR). On the other hand, topological defects including larger rings (7MR and 8MR) or the smaller 4MR were created on the fullerene cages of C58, C60, and C62. For C60, we considered perfect C60 (Ih) as well as a C60 cage with two 7MR (along with one 4MR), which is denoted as C60_4-7-7. For C59, the complex with the cage containing one 4MR and one 9MR is denoted as H2@C59_4-9, and that with one 5MR and one 8MR as H2@C59_5-8. For C58, the cage with two 5MR and one 7MR is denoted as H2@C58_5-5-7, the cage with two 4MR, one 8MR, and one 5MR as H2@C58_4-4-8(5), and the cage with two 4MR, one 8MR, and one 6MR as H2@C58_4-4-8(6).
At the beginning, the eight fullerene cages were optimized at the PW91/DNP level.
A hydrogen molecule was then placed at the center of each cage as the initial
configuration of the endohedral complex, and these endohedral H2@CX complexes were
fully optimized. The optimized structures are shown in Figure 1. Moreover, exohedral
adsorption of the H2 molecule on these eight cages was also considered. The
adsorption energy of the hydrogen molecule is defined as the difference between the
total energy of the H2-cage complex and the sum of the total energies of the isolated
H2 molecule (EH2) and the fullerene cage (Ecage). Hence, the adsorption energies for
endohedral (Eendo) and exohedral (Eexo) adsorption are computed from

Eendo = Eendo-H2 - Ecage - EH2    (1)

Eexo = Eexo-H2 - Ecage - EH2    (2)
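As a minimal numerical check of Eqs. (1) and (2), the short Python sketch below evaluates both adsorption energies for the C62 cage from the total energies listed in Table 1 (in Hartree), using the conversion 1 Hartree = 27.2114 eV:

  # Evaluate Eqs. (1) and (2) for the C62 cage with the Table 1 energies.
  HARTREE_TO_MEV = 27.2114 * 1000.0   # 1 Hartree in meV

  E_H2 = -1.1705707         # isolated H2 molecule (Hartree)
  E_cage = -2362.357426     # empty C62 cage
  E_endo_H2 = -2363.528743  # H2@C62, endohedral complex
  E_exo_H2 = -2363.528763   # H2 adsorbed on the outside of C62

  E_endo = (E_endo_H2 - E_cage - E_H2) * HARTREE_TO_MEV
  E_exo = (E_exo_H2 - E_cage - E_H2) * HARTREE_TO_MEV
  print(f"Eendo = {E_endo:.1f} meV, Eexo = {E_exo:.1f} meV")
  # prints Eendo = -20.3 meV, Eexo = -20.9 meV, matching Table 1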

To study the penetration behavior of an H2 molecule from the endohedral site to the
exohedral site, we first adjusted the orientation of the central H2 molecule to be
perpendicular to the largest hole on the surface of the fullerene cage. Single-point
energies of the H2-cage complexes (H2@C60, H2@C60_4-7-7, H2@C59_4-9, H2@C59_5-8,
H2@C58_5-5-7, H2@C58_4-4-8(5), and H2@C58_4-4-8(6)) were then computed along the
penetration path by gradually moving the H2 molecule from the cage center to the
outside of the cage through the largest hole, in steps of 0.3 Å up to a maximum
distance of 9 Å from the cage center. The main theoretical results are summarized
in Table 1 and Figure 2.
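A sketch (Python/NumPy) of how such a rigid scan can be generated is shown below; the cage-center and hole-center coordinates are placeholder values, and the single-point energies themselves come from the PW91/DNP calculations described above.

  import numpy as np

  def scan_positions(cage_center, hole_center, step=0.3, max_dist=9.0):
      """H2 center-of-mass positions along the cage-center -> hole-center
      direction, spaced by `step` (in angstrom), as used for the rigid scan."""
      direction = hole_center - cage_center
      direction = direction / np.linalg.norm(direction)
      distances = np.arange(0.0, max_dist + 1e-9, step)
      return [cage_center + d * direction for d in distances], distances

  # hypothetical geometry: cage centered at the origin, hole center on +z
  positions, d = scan_positions(np.zeros(3), np.array([0.0, 0.0, 3.5]))
  print(len(positions), "single-point geometries, spacing =", round(d[1] - d[0], 2), "angstrom")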

Fig. 1. Optimized configurations of H2@C62, H2@C60, H2@C60_4-7-7, H2@C59_4-9, H2@C59_5-8, H2@C58_5-5-7, H2@C58_4-4-8(5), and H2@C58_4-4-8(6)
Table 1. Total energies of the eight optimized cages of perfect and defect fullerenes CX (X = 58, 59, 60, 62) and of the corresponding endohedral complexes H2@CX. The total energy of an H2 molecule from our DFT calculation at the same level is -1.1705707 Hartree. Also listed are the endohedral adsorption energy (Eendo) and exohedral adsorption energy (Eexo) for H2 on the CX cages, as well as the energy barrier for penetration of H2 through the largest hole on each cage.

Cage           Eendo-H2 (Hartree)   Eexo-H2 (Hartree)   E(CX) (Hartree)   Eendo (meV)   Eexo (meV)   Barrier (eV)
C62            -2363.528743         -2363.528763        -2362.357426      -20.3         -20.9        -
C60            -2287.401287         -2287.401230        -2286.216007      -17.7         -16.2        12.6
C60_4-7-7      -2287.273601         -2287.273788        -2286.102568      -12.6         -17.7        7.9
C59_4-9        -2249.062584         -2249.063739        -2247.892223        5.7         -26.3        9.1
C59_5-8        -2249.095940         -2249.097171        -2247.925697        8.9         -24.6        5.2
C58_5-5-7      -2211.063665         -2211.064968        -2209.893787       18.8         -16.6        8.3
C58_4-4-8(5)   -2211.012397         -2211.013340        -2209.842018        5.2         -20.5        4.6
C58_4-4-8(6)   -2210.892028         -2210.894158        -2209.722625       31.8         -26.2        5.2

The total energy difference between perfect C60 and the defect C60_4-7-7 cage is 3.47 eV.
In other words, formation of two 7MR and one 4MR on perfect C60 requires 3.47 eV, while
a previous calculation found that formation of two 7MR on a (6,6) carbon nanotube
requires 2.74 eV [33]. The total energy of C59_5-8 is lower than that of C59_4-9 by
0.91 eV, close to the theoretical value of 0.89 eV obtained by Lee et al. at the
B3LYP/6-31G* level [29]. For C58, C58_5-5-7 is more stable than C58_4-4-8(5) by 1.40 eV
and than C58_4-4-8(6) by 4.67 eV, rather close to the previous results of 1.34 eV and
4.77 eV by Lee et al. [29].
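These relative stabilities follow directly from the E(CX) column of Table 1; a small Python check (1 Hartree = 27.2114 eV):

  # Relative stabilities of the defect cages from the E(CX) values of Table 1.
  HARTREE_TO_EV = 27.2114
  E = {
      "C59_4-9": -2247.892223, "C59_5-8": -2247.925697,
      "C58_5-5-7": -2209.893787, "C58_4-4-8(5)": -2209.842018,
      "C58_4-4-8(6)": -2209.722625,
  }
  print((E["C59_4-9"] - E["C59_5-8"]) * HARTREE_TO_EV)        # ~0.91 eV
  print((E["C58_4-4-8(5)"] - E["C58_5-5-7"]) * HARTREE_TO_EV) # ~1.41 eV
  print((E["C58_4-4-8(6)"] - E["C58_5-5-7"]) * HARTREE_TO_EV) # ~4.66 eV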


As shown in Table 1, for all the cases studied the exohedral adsorption of an H2
molecule on the surface of a fullerene cage is exothermic, with Eexo ranging from -16.6
to -26.3 meV. The exohedral adsorption energy of the H2 molecule is insensitive to the
atomic configuration of the fullerene cage. Experimentally, the adsorption energy of an
H2 molecule on the graphite surface is -42 meV. It is known that GGA usually
underestimates vdW-type surface adsorption energies [34]. Therefore, the present
GGA calculations might somewhat underestimate the adsorption energy of the H2
molecule.
In contrast, the endohedral adsorption is either exothermic or endothermic,
with Eendo ranging from -12.6 to 31.8 meV. The incorporation of an H2 molecule in the
C60 (perfect or defect) and C62 cages is exothermic, while encapsulation of an H2
molecule in the C58 and C59 cages is endothermic. This finding can be roughly
understood from the difference in the interior space of the fullerene cages; in other
words, the C60 and C62 cages are larger and have more space for the encapsulation of
the H2 molecule. In a previous study [22], the best estimate of the encapsulation
energy for H2@C60 was at least 173 meV.
The energy barriers for the penetration of the H2 molecule through the largest hole of
the different fullerene cages are presented in Table 1, and the corresponding
single-point energies along the penetration paths are shown in Figure 2. First of all,
in Figure 2 we find that all the energy paths for H2 penetration are smooth and each
shows a clear maximum, which corresponds to the energy barrier given in Table 1.
Among them, the energy barrier for penetrating the six-membered ring of the C60 cage
is the highest, i.e., 12.6 eV, whereas the energy barrier for penetrating the
eight-membered ring of the C58_4-4-8(5) cage is the lowest, i.e., 4.6 eV. The energy
barriers for the other cages with an 8MR, C59_5-8 and C58_4-4-8(6), are close, both
being 5.2 eV. For the defect fullerene cages with a 7MR, C60_4-7-7 and C58_5-5-7,
the energy barriers are around 8 eV. In other words, the penetration barrier is
reduced from 12.6 eV for the 6MR of the perfect C60 cage to about 8 eV for a 7MR and
to about 5 eV for an 8MR. However, it is interesting that the penetration barrier
through the largest hole studied, the 9MR of the C59_4-9 cage, is relatively high,
i.e., 9.1 eV.
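The barrier values quoted above are simply the maxima of the single-point energy profiles relative to the cage-center configuration; a minimal sketch (Python/NumPy, with an illustrative made-up profile) of this extraction:

  import numpy as np

  def penetration_barrier(distances, energies_eV):
      """Barrier = maximum energy along the path, referenced to the first
      point (H2 at the cage center), as in Figure 2."""
      rel = np.asarray(energies_eV) - energies_eV[0]
      i = int(np.argmax(rel))
      return rel[i], distances[i]

  # illustrative profile only, loosely shaped like the H2@C60 curve of Figure 2
  d = np.arange(0.0, 9.0 + 1e-9, 0.3)
  e = 12.6 * np.exp(-((d - 3.5) ** 2) / 0.8)
  barrier, d_max = penetration_barrier(d, e)
  print(f"barrier ~ {barrier:.1f} eV at d ~ {d_max:.1f} angstrom")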
To summarize, the encapsulation and penetration of an H2 molecule on the perfect and
defect C60-type cages were investigated using density functional theory at the
PW91/DNP level. Fullerene cages CX with X = 58, 59, 60, 62 containing 7MR, 8MR, and
9MR were considered. The interaction of an H2 molecule with the fullerene cages is
relatively weak and of vdW type. The exohedral adsorption of the H2 molecule on the
surface of the fullerene cages is exothermic, while the endohedral adsorption is
exothermic for C60 and C62 and endothermic for the C58 and C59 cages. The penetration
barrier from the endohedral to the exohedral site is significantly reduced, from
12.6 eV for the 6MR of the perfect cage to about 8 eV for a 7MR and to about 5 eV for
an 8MR on defect cages. However, these reduced energy barriers for 7MR and 8MR are
still too high for an H2 molecule to penetrate under ambient conditions. Finally, it
is worth pointing out that the present calculations focus on the physisorption and
penetration of the H2 molecule; possible chemisorption of the H2 molecule and the
corresponding transition states were not considered.

Fig. 2. Relative energy (eV) of the H2@CX complexes as a function of the distance (Å) of H2 from the cage center along the path towards the center of the largest ring: (a) H2@C60 and H2@C60_4-7-7; (b) H2@C59_4-9 and H2@C59_5-8; (c) H2@C58_5-5-7, H2@C58_4-4-8(5), and H2@C58_4-4-8(6). The zero of energy is set to the total energy with H2 at the center of each cage.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (No. 29976006), the Natural Science Foundation of Liaoning Province of China (No. 9810300701), the Program for New Century Excellent Talents in University of China, and the Ministry of Education of China.

References
1. Funasaka, H., Sugiyama, K., Yamamoto, K., Takahashi, T.: Magnetic Properties of Rare-Earth Metallofullerenes, J. Phys. Chem. 99 (1995) 1826-1830
2. Diener, M. D., Alford, J. M.: Isolation and Properties of Small-Bandgap Fullerenes, Nature 393 (1998) 668-671
3. Boltalina, O. V., Ioffe, I. N., Sorokin, I. D., Sidorov, L. N.: Electron Affinity of Some
Endohedral Lanthanide Fullerenes, J. Phys. Chem. A 101 (1997) 9561-9563
4. Kobayashi, S., Mori, S., Iida, S., Ando, H., Takenobu, T., Taguchi, Y., Fujiwara,
Taninaka, A., Shinohara, A. H., Iwasa, Y.: Conductivity and Field Effect Transistor of
La2@C80 Metallofullerene, J. Am. Chem. Soc. 125 (2003) 8116-8117
5. Kato, H., Kanazawa, Y., Okumura, M., Taninaka, A., Yokawa, T., Shinohara, H.:
Lanthanoid Endohedral Metallofullerenols for MRI Contrast Agents, J. Am. Chem. Soc.
125 (2003) 4391-4397
6. Harneit, W.: Fullerene-Based Electron-Spin Quantum Computer, Phys. Rev. A 65 (2002)
032322-032327
7. Suter, D., Lim, K.: Scalable Architecture for Spin-based Quantum Computers with a
Single Type of Gate, Phys. Rev. A 65 (2002) 052309-052313
8. Twamley, J.: Quantum-Cellular-Automata Quantum Computing with Endohedral
Fullerenes, Phys. Rev. A 67 (2003) 052318-052329
9. Morton, J. J. L., Tyryshkin, A. M., Ardavan, A., Porfyrakis, K., Lyon, S. A., Briggs, G. A. D.: High Fidelity Single Qubit Operations Using Pulsed Electron Paramagnetic Resonance, Phys. Rev. Lett. 95 (2005) 200501
10. Saunders, M., Cross, R. J., Jiménez-Vázquez, H. A., Shimshi, R., Khong, A.: Noble Gas Atoms Inside Fullerenes, Science 271 (1996) 1693-1697
11. Saunders, M., Jiménez-Vázquez, H. A., Cross, R. J., Mroczkowski, S., Freedberg, D. I., Anet, F. A. L.: Probing the Interior of Fullerenes by 3He NMR Spectroscopy of Endohedral 3He@C60 and 3He@C70, Nature 367 (1994) 256-258
12. Ding, R. G., Lu, G. Q., Yan, Z. F., Wilson, M. A.: Recent Advances in the Preparation and
Utilization of Carbon Nanotubes for Hydrogen Storage, J. Nanosci. Nanotech. 1 (2003)
7-29
13. Murata, Y., Murata, M., Komatsu, K.: Synthesis, Structure, and Properties of Novel Open-Cage Fullerenes Having Heteroatom(s) on the Rim of the Orifice, Chem. Eur. J. 9 (2003) 1600-1609
14. Maria H., Michael O.: Novel Methodology for the Preparation of Five-, Seven-, and Nine-Membered Fused Rings on C60, Org. Lett. 8 (2006) 1775-1778
15. Murata, M., Murata, Y., Komatsu, K.: Synthesis and Properties of Endohedral C60 Encapsulating Molecular Hydrogen, J. Am. Chem. Soc. 128 (2006) 8024-8033
16. Murata, Y., Murata, M., Komatsu, K.: 100% Encapsulation of a Hydrogen Molecule into
an Open-Cage Fullerene Derivative and Gas-Phase Generation of H2@C60, J. Am. Chem.
Soc. 125 (2003) 7152-7153
17. Carravetta, M., Murata, Y., Murata, M., Heinmaa, I., Stern, R., Tontcheva, A., Samoson,
A., Rubin, Y., Komatsu, K., Levitt, M. H.: Solid-State NMR Spectroscopy of Molecular Hydrogen Trapped Inside an Open-Cage Fullerene, J. Am. Chem. Soc. 126 (2004) 4092-4093


18. Iwamatsu, S.I., Murata, S., Andoh, Y., Minoura, M., Kobayashi, K., Mizorogi, N., Nagase,
S.: Open-Cage Fullerene Derivatives Suitable for the Encapsulation of a Hydrogen
Molecule, J. Org. Chem. 70 (2005) 4820-4285
19. Chuang, S. C., Clemente, F. R., Khan, S.I., Houk, K. N., Rubin, Y.: Approaches to Open
Fullerenes: A 1,2,3,4,5,6-Hexaadduct of C60, Org. Lett. 8 (2006) 4525-4528
20. Komatsu, K., Murata, M., Murata, Y.: Encapsulation of Molecular Hydrogen in Fullerene
C60 by Organic Synthesis, Science 307 (2005) 238-240
21. Komatsu, K., Murata, Y.: A New Route to an Endohedral Fullerene by Way of
σ-Framework Transformations, Chem. Lett. 34 (2005) 886-891
22. Slanina, Z., Pulay, P., Nagase, S.: H2, Ne, and N2 Energies of Encapsulation into C60 Evaluated with the MPWB1K Functional, J. Chem. Theory Comput. 2 (2006) 782-785
23. Shigeta, Y., Takatsuka, K.: Dynamic Charge Fluctuation of Endohedral Fullerene with Coencapsulated Be Atom and H2, J. Chem. Phys. 123 (2005) 131101-131104
24. Qian, W., Bartberger, M. D., Pastor, S. J., Houk, K. N., Wilkins, C. L., Rubin, Y.: C62, a Non-Classical Fullerene Incorporating a Four-Membered Ring, J. Am. Chem. Soc. 122 (2002) 8333-8334
25. O'Brien, S. C., Heath, J. R., Curl, R. F., Smalley, R. E.: Photophysics of Buckminsterfullerene and Other Carbon Cluster Ions, J. Chem. Phys. 88 (1988) 220-230
26. Deng, J. P., Ju, D. D., Her, G. R., Mou, C. Y., Chen, C. J., Han, C. C.: Odd-Numbered Fullerene Fragment Ions from C60 Oxides, J. Phys. Chem. 97 (1993) 11575-11577
27. Hu, Y. H., Ruckenstein, E.: Ab Initio Quantum Chemical Calculations for Fullerene Cages with Large Holes, J. Chem. Phys. 119 (2003) 10073-10080
28. Hu, Y. H., Ruckenstein, E.: Quantum Chemical Density-Functional Theory Calculations of the Structures of Defect C60 with Four Vacancies, J. Chem. Phys. 120 (2004) 7971-7975
29. Lee, S. U., Han, Y.K.: Structure and Stability of the Defect Fullerene Clusters of C60: C59,
C58, and C57, J. Chem. Phys. 121 (2004) 3941-3492
30. Perdew, J. P., Wang Y.: Accurate and Simple Analytic Representation of the Electron-Gas
Correlation Energy, Phys. Rev. B 45 (1992) 13244-13249
31. Delley, B.: An All-Electron Numerical Method for Solving the Local Density Functional
for Polyatomic Molecules, J. Chem. Phys. 92(1990) 508-517;
32. Zhou, Z., Zhao, J. J., Chen, Z. F., Gao, X. P., Yan, T. Y., Wen, B. P., Schleyer, P. v. R.: Comparative Study of Hydrogen Adsorption on Carbon and BN Nanotubes, J. Phys. Chem. B 110 (2006) 13363-13369
33. Zhao, J. J., Wen, B., Zhou, Z., Chen, Z. F., Schleyer, P. v. R.: Reduced Li Diffusion Barriers in Composite BC3 Nanotubes, Chem. Phys. Lett. 415 (2005) 323-326
34. Zhao, J. J., Buldum, A., Han, J., Lu, J. P.: Gas molecule adsorption in carbon nanotubes
and nanotube bundles, Nanotechnology 13 (2002) 195-200

Ab Initio and DFT Investigations of the Mechanistic Pathway of Singlet Bromocarbenes Insertion into C-H Bonds of Methane and Ethane

M. Ramalingam1, K. Ramasami2, P. Venuvanalingam3, and J. Swaminathan4

1 Rajah Serfoji Government College, Thanjavur-613005, India
km_ramalingam@yahoo.co.in
2 Nehru Memorial College, Puthanampatti-621007, India
3 Bharathidasan University, Tiruchirapalli-620024, India
4 Periyar Maniammai College of Technology for Women, Vallam-613403, India

Abstract. The mechanistic pathways of singlet bromocarbene (1CHBr and 1CBr2)
insertions into the C-H bonds of methane and ethane have been analysed at the
ab initio (HF and MP2) and DFT (B3LYP) levels of theory using the 6-31g (d, p)
basis set. The QCISD//MP2/6-31g (d, p) level predicts higher activation barriers.
NPA, Mulliken, and ESP charge analyses have been carried out along the minimal
reaction path by the IRC method at the B3LYP and MP2 levels for these reactions.
The occurrence of the TSs in either the electrophilic or the nucleophilic phase and
the net charge flow from alkane to carbene in the TS have been identified through
NBO analyses.

Keywords: bromocarbenes; ab initio; DFT; insertions; IRC.

1 Introduction
Carbenes and halocarbenes are known as reactive intermediates with intriguing
insertion, addition, and rearrangement reactions. How structural factors (bond angle,
electron withdrawal by induction, and electron donation by resonance) influence the
relative stabilities of these states is still under scrutiny [1]. Synthetic organic
chemistry [2], organometallic chemistry [3], and other areas, principally the
Arndt-Eistert chain homologation procedure, the Reimer-Tiemann reaction (formylation
of phenols), cyclopropanation of alkenes [4] and subsequent rearrangements [5], ketene
and allene preparation [6], synthesis of strained ring systems, ylide generation and
subsequent rearrangements, cycloaddition reactions [7], and photoaffinity labeling [8],
are among the fields in which carbenes and halocarbenes find wide application.
Among the different types of reactions of singlet carbenes, the highly characteristic
concerted insertion reactions into Y-H bonds (Y = C, Si, O, etc.), involving a
three-center cyclic transition state [9], seem to be important in synthetic organic
chemistry [2]. In the halocarbenes, the halogens interact with the carbenic carbon
through oppositely operating electronic effects, mesomeric (+M) donation and inductive
(-I) withdrawal. On this basis, the electrophilicity of carbenes has been reported to
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 288-295, 2007.
© Springer-Verlag Berlin Heidelberg 2007


decrease with increasing bromination, resulting in a substantially higher activation
barrier [10]. Interestingly, both electrophilic and nucleophilic character of carbenes
has been encountered in insertion reactions [11]. Hence, the focal theme of this
investigation is the characterization of these two features in terms of the quantum of
charge transferred between the reactants during the course of the reaction, and the
determination of the energetics, reaction enthalpies, and activation barriers for the
singlet bromocarbene insertion reactions into the C-H bonds of methane and ethane. If
we monitor the total charge on the carbene moiety as the reaction progresses (by
following the intrinsic reaction coordinate, IRC [12]), we should be able to detect a
turning point signifying the end of the first, electrophilic phase and the onset of
the second, nucleophilic phase. In order to confirm the two-phase mechanism, we carry
out this charge-versus-reaction-path probe for the insertion reactions into the C-H
bonds of the said alkanes. In this study we investigate the reactions CBrX + HY with
X = H, Br and Y = CH3, C2H5. The rapidity of carbene reactions has challenged
experimental techniques; hence this theoretical ab initio quantum mechanical
investigation.
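A minimal sketch of this charge-versus-IRC probe (Python/NumPy, with made-up charge data), assuming the total carbene-moiety charge has already been extracted at each IRC point from the population analyses:

  import numpy as np

  def turning_point(irc, carbene_charge):
      """Minimum of the charge/IRC curve: end of the electrophilic phase
      and onset of the nucleophilic phase."""
      i = int(np.argmin(carbene_charge))
      return irc[i], carbene_charge[i]

  # made-up data shaped like a typical charge/IRC curve (charge in a.u.)
  irc = np.linspace(-2.0, 1.5, 36)
  charge = 0.05 * (irc + 0.4) ** 2 - 0.35
  s_min, q_min = turning_point(irc, charge)
  print(f"turning point at IRC = {s_min:.2f}, carbene charge = {q_min:.3f} a.u.")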

2 Computational Details
Geometries of the reactants, transition states, and products were first optimized at
the HF/6-31g (d, p) level using the Gaussian 03W suite of programs [13]. The resulting
HF geometries were then re-optimized at the MP2 and B3LYP [14-18] levels. The standard
6-31g (d, p) basis set [19, 20] was adopted in all calculations for a better treatment
of the 1,2-hydrogen shift during the insertion process. Further single-point energy
calculations were performed at the QCISD level on the MP2-optimized geometries of the
species on the lowest-energy reaction pathway [21]. All stationary points found,
except those obtained at the QCISD level, were characterized as either minima or
transition states (TSs) by computing the harmonic vibrational frequencies: TSs have a
Hessian index of one, while minima have a Hessian index of zero. All TSs were further
characterized by animating the imaginary frequency in MOLDEN [22] and by intrinsic
reaction coordinate (IRC) analyses. The calculated vibrational frequencies were used
to compute thermodynamic parameters such as the reaction enthalpies. The intrinsic
reaction coordinate analyses were carried out for the transition structures obtained
at the MP2 level [23]. Mulliken charges [24], NPA charges [25], and charges derived by
fitting the electrostatic potential (ESP) [26] were computed along the reaction path.
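As an illustration of the characterization step, the short sketch below (Python) counts imaginary modes, assuming the harmonic frequencies are available as a list in which imaginary frequencies are reported as negative wavenumbers, the usual convention of quantum-chemistry packages:

  def hessian_index(frequencies_cm1):
      """Number of imaginary modes (reported as negative wavenumbers):
      0 for a minimum, 1 for a transition state."""
      return sum(1 for f in frequencies_cm1 if f < 0.0)

  def classify(frequencies_cm1):
      n = hessian_index(frequencies_cm1)
      return {0: "minimum", 1: "transition state"}.get(n, f"order-{n} saddle point")

  print(classify([-412.7, 103.2, 245.9]))  # hypothetical TS: one imaginary mode
  print(classify([98.4, 210.5, 344.1]))    # hypothetical minimum: none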

3 Results and Discussion


The C-H bonds of methane and ethane undergo insertion reactions with 1CHBr/1CBr2,
forming mono-/dibromoalkanes. The reactants first form a pre-reactive complex, which
proceeds through a concerted transition state that then develops into the product.
The energy profile diagram for the insertion reactions of 1CHBr and 1CBr2 into methane
is shown in Fig. 1, in which the energies of the complex, the transition state, and
the product are given with reference to the reactants.
Fig. 1. Energy profiles (relative energy in kcal/mol, with respect to the reactants, along the reaction coordinate: reactants, pre-reactive complex, TS, product) for 1CHBr + CH4 → CH3-CH2Br (dashed line) and 1CBr2 + CH4 → CH3-CHBr2 (solid line) at MP2/6-31g**

The optimized geometries of the TSs located on the reaction pathways for the 1CHBr and 1CBr2 insertion reactions are presented in Fig. 2.

Fig. 2. Geometrical parameters (distances in Å) and barriers of the transition states for 1CHBr and 1CBr2 insertion into the C-H bonds of methane and ethane at the B3LYP and MP2 levels (MP2 values in parentheses)

3.1 Singlet Bromocarbenes Insertion into Methane and Ethane


The B3LYP, MP2, and QCISD results alone have been taken for discussion in this
investigation, since HF overestimates the activation barriers [27, 28]. The
B3LYP/6-31g (d, p) activation energies for the insertions of 1CHBr and 1CBr2 into a
C-H bond of methane are 4.28 (TS-1) and 20.42 (TS-2) kcal/mol, respectively. The MP2
value for the 1CHBr insertion is ca. 1 kcal/mol higher, and that for the 1CBr2
insertion ca. 3 kcal/mol lower, than the corresponding B3LYP values. Replacement of
hydrogen by bromine in 1CHBr decreases its electrophilicity [29] and deactivates, to
a certain extent, the electrophilic attack by the carbene moiety in the first phase of
the insertion. The barrier heights therefore increase dramatically, from 4.28 to
20.42 kcal/mol at B3LYP and from 5.36 to 17.36 kcal/mol at MP2, for methane. The
barriers computed at the QCISD/6-31g (d, p)//MP2/6-31g (d, p) level are 9.68 kcal/mol
and 23.93 kcal/mol for 1CHBr and 1CBr2 insertion into methane. The TSs are first-order
saddle points, as determined by numerical vibrational frequencies.
In the case of ethane, the barrier heights for 1CHBr insertion are 1.47 and
3.27 kcal/mol at the B3LYP and MP2 levels, respectively (TS-3). These values increase
to 15.49 and 12.41 kcal/mol (TS-4), correspondingly, for the 1CBr2 insertion. The
relevant geometrical parameters of the transition states for the 1CHBr and 1CBr2
insertions into methane and ethane are shown in Fig. 2 and Tables 1 and 2. The TS
for 1CBr2 insertion into methane (TS-2) comes much later along the reaction
coordinate than that for 1CHBr insertion (TS-1), as reflected in the C2-H3 bond
distances of 1.430 (1.345) and 1.274 (1.202) Å and the charges on H3 of 0.279 (0.278)
and 0.255 (0.216), respectively. A similar trend is observed for the singlet
bromocarbene insertions into ethane.
Table 1. Geometrical parameters (distances in Å), barriers and heats of reaction (ΔHr) in kcal/mol at the TSs of 1CHBr with alkanes at B3LYP (MP2)/6-31g (d, p)

alkane     rC1H1           rC1C2           rC2H1           Ea             qct             ΔHr
methane    1.276 (1.370)   2.280 (2.336)   1.273 (1.186)   4.28 (5.36)    0.252 (0.187)   -87.41 (-96.40)
ethane     1.292 (1.490)   2.337 (2.479)   1.260 (1.151)   1.47 (3.27)    0.267 (0.130)   -89.13 (-99.31)

qct: quantum of charge transfer from alkane to carbene at the TS

Table 2. Geometrical parameters (distances in Å), barriers and heats of reaction (ΔHr) in kcal/mol at the TSs of 1CBr2 with alkanes at B3LYP (MP2)/6-31g (d, p)

alkane     rC1H1           rC1C2           rC2H1           Ea              qct             ΔHr
methane    1.175 (1.199)   2.233 (2.239)   1.424 (1.344)   20.42 (17.36)   0.322 (0.382)   -68.95 (-80.85)
ethane     1.160 (1.190)   2.280 (2.271)   1.470 (1.364)   15.49 (12.41)   0.362 (0.402)   -71.27 (-84.86)

qct: quantum of charge transfer from alkane to carbene at the TS


3.2 Energetics
In general, the activation barrier depends upon the polarity of the C-H bond of the
alkane and the type of bromocarbene (1CHBr or 1CBr2) being inserted. This statement
draws support from the fact that the electron pair on the carbene carbon involved in
bonding with the C-H bond of the alkane is increasingly stabilized with the degree of
bromination, which inhibits bond formation because the electron pair on the carbene
carbon becomes less available. The NBO analysis [30] quantifies this in terms of the
energies of the electron pairs on 1CHBr and 1CBr2, -0.4057 (-0.5595) and
-0.4535 (-0.6019) a.u., respectively, at B3LYP (MP2) with the 6-31g (d, p) basis set.
The enthalpies of the insertion reactions of 1CHBr and 1CBr2 into methane are
-87.41 (-96.40) and -68.95 (-80.85) kcal/mol, and those for ethane are -89.13 (-99.31)
and -71.27 (-84.86) kcal/mol, at the B3LYP (MP2) levels, respectively. The reaction
enthalpies (Tables 1 and 2) show that the insertion reactions are exothermic,
indicating that the transition states analyzed resemble the reactants rather than the
products [31]. The proximity of the transition states to the reactants varies with the
degree of bromination of the methylene. Irrespective of the level of theory (B3LYP or
MP2), the insertions of 1CHBr form the transition states earlier than those of 1CBr2,
as revealed by the exothermicity values.
3.3 Transition State Geometries
A scrutiny of the bond-breaking and bond-forming steps, corresponding to C2-H3 and
C6-H3 respectively, during the insertion process reveals that it is a concerted
reaction. The formation of the C6-H3 bond occurs earlier than that of the C2-C6 bond
in the TS, in terms of the bond distances (Tables 1 and 2). The C6-H3, C2-H3, and
C2-C6 bond distances in the TSs of the 1CBr2 insertion reactions confirm a late
transition state in comparison with the corresponding values for the 1CHBr insertion
reactions. In order to improve the accuracy of the energetic parameters, single-point
computations at the QCISD level have also been carried out, and the values are listed
in Tables 1 and 2. The barrier heights predicted at the QCISD level are higher than
the MP2 values for both methane and ethane.
3.4 NBO Analyses
NBO analyses of the charge distribution in the transition states give some insight
into the insertion reactivity. For all the transition states, second-order
perturbative analyses were carried out for all possible interactions between filled
Lewis-type NBOs and empty non-Lewis NBOs. These analyses show that the interaction
between the C2-H3 bond of the alkane and the empty p orbital of the carbenic carbon
(σC-H → pC) and the charge transfer from the lone pair of the carbenic carbon to the
antibonding orbital of C2-H1 (nC → σ*C-H) give the strongest stabilization. Finally,
we observed a net charge flow from the alkane moiety to the inserting carbene moiety.
The quantum of charge transfer from alkane to carbene, supporting the donor-acceptor
interaction in the transition states, has been collected in Tables 1 and 2 for all the
insertion reactions at both the B3LYP and MP2 levels. The inverse relationship between
the quantum of charge transfer and the activation barriers reveals that, for
favorable insertion, the nucleophilicity of the alkane should be enhanced either
sterically or electronically. This correlation holds good for the reactions analysed
in this investigation.
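This trend can be checked directly against the B3LYP entries of Tables 1 and 2; a short Python illustration:

  # Within each carbene, the alkane with the larger charge transfer (qct)
  # shows the smaller barrier (Ea); B3LYP values from Tables 1 and 2.
  data = {
      "1CHBr": {"methane": (4.28, 0.252), "ethane": (1.47, 0.267)},
      "1CBr2": {"methane": (20.42, 0.322), "ethane": (15.49, 0.362)},
  }
  for carbene, row in data.items():
      ea_me, q_me = row["methane"]
      ea_et, q_et = row["ethane"]
      print(carbene, "- ethane has larger qct:", q_et > q_me,
            "and smaller Ea:", ea_et < ea_me)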
3.5 IRC - Charge Analyses
The total charge on the carbene moiety along the IRC for the insertion reactions into
methane and ethane, calculated by the Mulliken [24], NPA [25], and ESP [26] methods,
is shown in Fig. 3. We show the density functional (B3LYP) plots of the charge on the
carbene moiety in addition to the MP2 plots, which serve as our ab initio standard.
Fig. 3. Charge on the carbene moiety (a.u.) along the IRC for (a) 1CHBr + CH4, (b) 1CBr2 + CH4, (c) 1CHBr + C2H6, and (d) 1CBr2 + C2H6: NPA, Mulliken, and ESP charge analyses. Markers indicate the transition states and the turning points. The electrophilic-phase region lies to the right of the turning point and the nucleophilic-phase region to the left.

We discuss first the insertion reactions with methane, 1CBrX (X = H, Br) + CH4. The
charge/IRC curves for these reactions are shown in Fig. 3. These two reactions provide
clear evidence for the two-phase mechanism, in that there is a distinct turning point
(minimum) in all the charge/IRC curves for the two Hamiltonians (MP2 and B3LYP),
regardless of the model used to compute the atomic charges. For the 1CHBr insertion
(Fig. 3a), the charge minimum occurs after the transition state (TS), whereas with
1CBr2 (Fig. 3b) the minimum occurs just before the TS. Thus, for the 1CHBr insertion
the TS lies within the first, i.e., electrophilic, phase, whereas for 1CBr2 the TS is
reached at the starting point of the nucleophilic phase. This indicates that the TS
for insertion of 1CHBr into the C-H bond of methane occurs much earlier along the
reaction coordinate than does the TS for the corresponding 1CBr2 insertion. This
indication is fully supported both by the TS geometries, for example, the C-H bond
undergoing insertion is much shorter in the 1CHBr TS (1.202 Å) than in the 1CBr2 TS
(1.345 Å) (Fig. 2), and by the heats of reaction and barrier heights (Tables 1 and 2),
which are more negative and much smaller, respectively, for 1CHBr (-96.40 and
5.36 kcal/mol) than for 1CBr2 (-80.85 and 17.36 kcal/mol). This is in agreement with
the Hammond postulate [32]. From the viewpoint of reactivity, it may be said that the
vacant p orbital on 1CHBr is more available than that on 1CBr2, thus facilitating the
initial electrophilic phase of the reaction. In other words, reactivity increases in
the order 1CBr2 < 1CHBr. The MP2 and B3LYP plots agree in the overall shape and depth
of the curves, although the turning points (minima) in the B3LYP plots are less
pronounced. The NPA and ESP curves are identical in shape at the MP2 level.
In the case of ethane, 1CBrX (X = H, Br) + C2H6, the positions of the turning points
and the charge/IRC curves for these insertions at the MP2 level are also shown in
Fig. 3. Unlike the 1CHBr insertion into methane, TS-3 occurs at the turning point
(Fig. 3c), i.e., in neither the electrophilic nor the nucleophilic phase. For 1CBr2
insertion into ethane at MP2 (Fig. 3d), however, the TS is observed at the starting
point of the nucleophilic phase, conforming to the belated TS formation in comparison
with the TS for insertion of 1CHBr (Fig. 3c). In general, the nucleophilic phase
dominates 1CBr2 insertions, whereas the electrophilic phase dominates 1CHBr insertions.

4 Summary
The singlet bromocarbene insertions into the C-H bonds of methane and ethane have been
analyzed, and the influence of bromine on the transition states, energetics,
geometrical parameters, etc., has been investigated at both the B3LYP and MP2 levels
of theory using the 6-31g (d, p) basis set. For the bromocarbenes, the B3LYP, MP2, and
QCISD levels of theory predict activation barriers of different heights, varying both
with the extent of bromination and with the type of alkane. The NBO analyses have been
carried out with a view to analyzing the charge transfer processes during the
insertion reactions. The charge/IRC plots provide clear evidence for the two-phase
mechanism, namely an electrophilic phase followed by a nucleophilic phase, for the
insertions of both 1CHBr and 1CBr2 into the C-H bonds of methane and ethane. The
B3LYP functional used in this work gives the same picture of the investigated
insertion reactions as the more traditional MP2 method for both geometries and heats
of reaction.

References
1. Irikura, K. K., Goddard, W. A., Beauchamp, J. L.: J. Am. Chem. Soc. 114 (1992) 48
2. Kirmse, W.: Carbene Chemistry. 2nd Edn. Academic Press, New York (1971)
3. Fischer, E. O., Maasbol, A.: Angew. Chem., Int. Ed. Engl. 3 (1964) 580
4. Salaun, J.: Chem. Rev. 89 (1989) 1247
5. Brookhart, M., Studabaker, W. B.: Chem. Rev. 87 (1987) 411
6. Walbrick, J. M., Wilson, J. W., Jones, W. M.: J. Am. Chem. Soc. 90 (1968) 2895
7. Padwa, A., Hornbuckle, S. F.: Chem. Rev. 91 (1991) 263
8. Baldwin, J. E., Jesudason, C. D., Moloney, M., Morgan, D. R., Pratt, A. J.: Tetrahedron 47 (1991) 5603
9. von E. Doering, W., Prinzbach, H.: Tetrahedron 6 (1959) 24
10. Russo, N., Sicilia, E., Toscano, M.: J. Chem. Phys. 97 (1992) 5031
11. Dobson, R. C., Hayes, D. M., Hoffmann, R.: J. Am. Chem. Soc. 93 (1971) 6188
12. Fukui, K.: J. Phys. Chem. 74 (1970) 4161
13. Gaussian 03, Revision C.02, Gaussian, Inc., Wallingford CT (2004)
14. Lee, C., Yang, W., Parr, R. G.: Phys. Rev. B 37 (1988) 785
15. Becke, A. D.: Phys. Rev. A 38 (1988) 3098
16. Miehlich, B., Savin, A., Stoll, H., Preuss, H.: Chem. Phys. Lett. 157 (1989) 200
17. Becke, A. D.: J. Chem. Phys. 98 (1993) 5648
18. Becke, A. D.: J. Chem. Phys. 104 (1996) 1040
19. Francl, M. M., Pietro, W. J., Hehre, W. J., Binkley, J. S., Gordon, M. S., DeFrees, D. J., Pople, J. A.: J. Chem. Phys. 77 (1982) 3654
20. Hariharan, P. C., Pople, J. A.: Chem. Phys. Lett. 66 (1972) 217
21. Pople, J. A., Head-Gordon, M., Raghavachari, K.: J. Chem. Phys. 87 (1987) 5968
22. Schaftenaar, G., Noordik, J. H.: J. Comput. Aided Mol. Design 14 (2000) 123
23. Gonzalez, C., Schlegel, H. B.: J. Phys. Chem. 94 (1990) 5523
24. Mulliken, R. S.: J. Chem. Phys. 23 (1955) 1833
25. Reed, A. E., Carpenter, J. E., Weinhold, F., Curtiss, L. A.: Chem. Rev. 88 (1988) 899
26. Breneman, C. M., Wiberg, K. B.: J. Comput. Chem. 11 (1990) 361
27. Ramalingam, M., Ramasami, K., Venuvanalingam, P., Sethuraman, V.: J. Mol. Struct. (Theochem) 755 (2005) 169
28. Bach, R. D., Andres, J. L., Su, M. D., McDouall, J. J. W.: J. Am. Chem. Soc. 115 (1993) 5768
29. Gilles, M. K., Lineberger, W. C., Ervin, K. M.: J. Am. Chem. Soc. 115 (1993) 1031
30. Glendening, E. D., Reed, A. E., Carpenter, J. E., Weinhold, F., Curtiss, L.: Chem. Rev. 88 (1988) 899; NBO Version 3.1
31. Carpenter, J. E.: Ph.D. Thesis, University of Wisconsin, Madison, WI (1987)
32. Hammond, G. S.: J. Am. Chem. Soc. 77 (1955) 334

Theoretical Gas Phase Study of the Gauche and Trans Conformers of 1-Bromo-2-Chloroethane and Solvent Effects

Ponnadurai Ramasami

Faculty of Science, Department of Chemistry, University of Mauritius, Réduit, Republic of Mauritius
p.ramasami@uom.ac.mu
www.pages.intnet.mu/ramasami/

Abstract. This is a systematic gas phase study of the gauche and trans conformers of
1-bromo-2-chloroethane. The methods used are second-order Møller-Plesset perturbation
theory (MP2) and density functional theory (DFT). The basis set used is 6-311++G(d,p)
for all atoms, and the functional used for the DFT method is B3LYP. G2/MP2 and CCSD(T)
calculations have also been carried out using the MP2-optimised structures. The
results indicate a preference for the trans conformer. The energy difference between
the trans and gauche conformers (ΔEtg) and the related rotational thermodynamics are
reported. The MP2/6-311++G(d,p) energy difference (ΔEtg) for 1-bromo-2-chloroethane is
7.08 kJ/mol. The conformers of 1-bromo-2-chloroethane have also been subjected to
vibrational analysis. The study has been extended to investigate solvent effects using
Self-Consistent Reaction Field methods. The structures of the conformers are not much
affected by the solvents, but the energy difference (ΔEtg) decreases with increasing
polarity of the solvent. The results from the different theoretical methods are in
good agreement.

1 Introduction
100 years ago, Bischoff found that rotation about the C-C single bond in ethane is not
completely free [1]. Owing to this hindered internal rotation, 1,2-disubstituted
ethanes are the simplest molecules showing conformational isomerism, leading to gauche
and trans conformers. It is generally found that the trans conformer is more stable
than the gauche form, due to steric hindrance in the gauche conformation [2].
Theoretical calculations of the energy difference between the trans and gauche
conformers (ΔEtg) have been actively pursued for over 40 years, as these are important
parameters in the conformational analysis of molecules [3].
In previous communications, energy differences (ΔEtg) have been calculated for
1,2-dihaloethanes (XCH2CH2X, X = F, Cl, Br and I) [4] and for 1-fluoro-2-haloethanes
(FCH2CH2X, X = Cl, Br and I) [5] in the gas phase. These studies indicate that, except
for 1,2-difluoroethane, the trans conformer is more stable than the gauche conformer.
The energy difference (ΔEtg) increases with the size of the halogen. The atypical
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 296-303, 2007.
© Springer-Verlag Berlin Heidelberg 2007


behaviour of 1,2-difluoroethane has been associated with the gauche effect [6-10], but
the latter is not observed for the 1-fluoro-2-haloethanes. Solvent effects have also
been explored for 1,2-dichloroethane and 1,2-dibromoethane [11]. That study indicates
that an increase in the solvent polarity decreases the energy difference (ΔEtg). It is
worth pointing out that the literature on theoretical studies of solvent effects is
limited [11,12], although the polarity of the solvent is known to affect conformational
equilibria [13].
As part of a continuing series of studies on internal rotation [4-5,11],
1-bromo-2-chloroethane is the target of this work. 1-Bromo-2-chloroethane, being a
1,2-disubstituted ethane, can exist as gauche (C1 symmetry) and trans (Cs symmetry)
conformers, as illustrated in Figure 1.

Fig. 1. Gauche (C1 symmetry) and trans (Cs symmetry) conformers of 1-bromo-2-chloroethane

These gauche and trans conformers of 1-bromo-2-chloroethane have been studied with a
view to obtaining (i) the optimised structural parameters, (ii) the energy difference
(ΔEtg), and (iii) the related thermodynamic properties for torsional rotation. Apart
from energy calculations, the conformers of 1-bromo-2-chloroethane have also been
subjected to vibrational analysis. Solvent effects have also been explored using
Self-Consistent Reaction Field (SCRF) methods [14], with the dielectric constant of
the solvent varying from 5 to 40. The results of the present investigation are
reported herein; to the best of our knowledge, no such investigation has been reported
previously.

2 Calculations
All calculations have been carried out using the Gaussian 03W [15] program suite, and
GaussView 3.0 [16] has been used for visualising the molecules. The calculations have
been carried out using second-order Møller-Plesset perturbation theory (MP2) and
density functional theory (DFT). The basis set used is 6-311++G(d,p) for all atoms,
and the functional used for the DFT method is B3LYP. Each conformer was first
optimised, and the optimised structure was then used for a frequency calculation with
the same method and basis set. G2/MP2 and CCSD(T) calculations have also been carried
out using the MP2/6-311++G(d,p) optimised structures. For all converged structures,
frequency calculations were carried out to ensure that each conformation corresponds
to a minimum. The SCRF methods used are the Isodensity Model (SCRF=IPCM) [14] and the
Self-Consistent Isodensity Model (SCRF=SCIPCM) [14]. The MP2/6-311++G(d,p) gas phase
optimised structures have been used for the single-point calculations with the
Isodensity Model, and full geometry optimisations at B3LYP/6-311++G(d,p) have been
carried out for the Self-Consistent Isodensity Model.

3 Results and Discussion


The optimised structural parameters of interest for the two conformers of
1-bromo-2-chloroethane are summarised in Table 1. Analysis of Table 1 allows some
conclusions to be drawn. Firstly, for both conformers the predicted C-Cl and C-Br bond
lengths are longer from the B3LYP calculation, although the C-C and C-H bond lengths
are nearly the same. Secondly, the CCCl and CCBr bond angles are larger for the gauche
conformer than for the trans conformer, which can be explained in terms of the greater
steric repulsion between the halogen atoms in the gauche conformation; these bond
angles are also larger from the B3LYP calculation, whereas for the trans conformer
they are nearly the same for both methods. Thirdly, the torsional angle ClCCBr of the
gauche conformer is larger from the B3LYP calculation. Lastly, the moments of inertia
are generally greater from the MP2 calculation, with IA > IB > IC.
Table 1. Optimised structural parameters for the gauche and trans conformers of 1-bromo-2-chloroethane using 6-311++G(d,p) as the basis set

                        B3LYP                   MP2
Parameter               Gauche      Trans       Gauche      Trans
r(C-Cl)/Å               1.809       1.821       1.776       1.783
r(C-C)/Å                1.511       1.512       1.512       1.512
r(C-Br)/Å               1.973       1.980       1.937       1.941
r(C-H)/Å                1.090       1.087       1.090       1.089
∠(CCCl)/°               113.2       109.0       112.3       109.0
∠(CCBr)/°               113.3       109.2       112.6       109.5
τ(ClCCBr)/°             70.4        180.0       68.0        180.0
Dipole moment/D         2.826       0.013       3.067       0.015
IA/GHz                  8.870       28.689      8.862       28.909
IB/GHz                  1.427       0.961       1.509       0.991
IC/GHz                  1.283       0.941       1.347       0.970


The energies of the gauche and trans conformers of 1-bromo-2-chloroethane are given in
Table 2. These energies have been obtained after full geometry optimisation, verified
by frequency calculations. G2/MP2 and CCSD(T) energies are also given in Table 2; as
part of the G2/MP2 and CCSD(T) calculations, the MP2/6-311+G(3df,2p) and
MP3/6-311++G(d,p) energies are included as well. The energy difference (ΔEtg) and the
related rotational thermodynamic parameters are also summarised in Table 2. A glance
at Table 2 clearly shows that the trans conformer is more stable. The energy
difference (ΔEtg) predicted using the B3LYP method is greater than that from the MP2
method for the same basis set. The free energy difference (ΔGtg) can be used to
estimate the relative percentages of the trans and gauche conformers; at 298 K, the
percentage of the trans conformer is generally greater than 90%. At this stage, it is
interesting to compare the energy difference (ΔEtg) of the unsymmetrical
1-bromo-2-chloroethane with those of the symmetrical 1,2-dichloroethane and
1,2-dibromoethane; the MP2/6-311++G(d,p) values for these compounds are 6.08 and
8.79 kJ/mol respectively [4].
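A rough Python estimate of that population, treating the trans/gauche system as a simple two-state equilibrium and using the ΔGtg(298 K) values listed in Table 2 below:

  from math import exp

  R = 8.314e-3  # gas constant in kJ/(mol K)

  def percent_trans(dG_tg_kJmol, T=298.15):
      """Two-state estimate of the trans population from the trans-gauche
      free-energy difference (trans taken as the more stable conformer)."""
      K = exp(-dG_tg_kJmol / (R * T))  # gauche/trans ratio
      return 100.0 / (1.0 + K)

  for dG in (5.88, 6.26):  # Gtg(298 K) values from Table 2 (kJ/mol)
      print(f"dG = {dG} kJ/mol -> {percent_trans(dG):.1f}% trans")

Both values give a trans population just above 90%, consistent with the statement above.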
Table 2. Calculated energies and rotational thermodynamic parameters for the conformers of 1-bromo-2-chloroethane

Method                    Gauche (Hartrees)  Trans (Hartrees)   ΔEtg       ΔHtg (0 K)  ΔHtg (298 K)  ΔGtg (298 K)
                                                                (kJ/mol)   (kJ/mol)    (kJ/mol)      (kJ/mol)
B3LYP/6-311++G(d,p)       -3113.0202578      -3113.0234642      8.42       7.65        8.01          5.88
MP2/6-311++G(d,p)         -3111.0127999      -3111.0154953      7.08       6.45        6.86          6.26
MP2/6-311+G(3df,2p)       -3110.6755106      -3110.6778141      6.05       -           -             -
G2/MP2                    -3110.7491862      -3110.7515898      6.31       -           -             -
MP3/6-311++G(d,p)         -3110.5697388      -3110.5725913      7.49       -           -             -
CCSD(T)/6-311++G(d,p)     -3110.5944679      -3110.5972661      7.35       -           -             -

The gauche and trans conformers of 1-bromo-2-chloroethane have also been subjected to
vibrational analysis. The calculated frequencies are reported in Table 3 and the
simulated spectra are illustrated in Figure 2. The 18 modes of vibration span the
irreducible representations Γvib = 18A of the C1 point group for the gauche conformer
and Γvib = 11A′ + 7A″ of the Cs point group for the trans conformer. All 18
fundamentals of the gauche and trans conformers have been assigned appropriately. The
values indicate that the predictions at the MP2 level of theory are systematically
larger than those at the B3LYP level. Since the steric interaction between the atoms
is greater in the gauche than in the trans conformer, the CCCl and CCBr bending modes
have higher frequencies in the gauche conformation than in the trans conformation. The
bending vibrational modes of the CH2 group are in the order scissoring > wagging >
twisting > rocking. The bending modes of the CH2 group attached to the bromine atom
are at lower frequencies than those of the CH2 group attached to the chlorine atom,
which can be explained on the basis of the reduced mass of the CH2 group when attached
to the bromine atom. However, the stretching vibrational modes of the CH2 groups
bonded to the two halogen atoms are reversed in terms of frequency. The calculated
frequencies for 1-bromo-2-chloroethane are in agreement with literature values
obtained experimentally [17].
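The systematic offset between the calculated harmonic frequencies and the experimental values can be summarised by effective scale factors (literature/calculated ratios); a small Python illustration using a few gauche modes from Table 3:

  # Effective scale factors (literature / calculated) for a few gauche modes
  # of Table 3; frequencies in cm-1, intensities omitted.
  modes = [  # (B3LYP, MP2, literature [17])
      (1469.1, 1481.0, 1428),  # CH2 scissoring
      (1038.2, 1077.7, 1025),  # CC stretch
      (658.4, 724.2, 664),     # CCl stretch
      (555.0, 610.6, 571),     # CBr stretch
  ]
  scale_b3lyp = sum(lit / b3 for b3, mp2, lit in modes) / len(modes)
  scale_mp2 = sum(lit / mp2 for b3, mp2, lit in modes) / len(modes)
  print(f"mean scale factor: B3LYP {scale_b3lyp:.3f}, MP2 {scale_mp2:.3f}")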


Table 3. Calculated frequencies (cm-1) of the conformers of 1-bromo-2-chloroethane and their assignments

Gauche conformer
B3LYP            MP2              Literature [17]   Assignment
3158.9 (1.4)     3206.7 (1.7)     3010              CH2 a str
3138.2 (1.2)     3190.4 (1.4)     3010              CH2 a str
3090.1 (12.3)    3131.8 (13.9)    2960              CH2 s str
3071.8 (8.9)     3120.2 (9.6)     2960              CH2 s str
1469.1 (3.0)     1481.0 (1.0)     1428              CH2 scis
1461.7 (10.8)    1471.7 (10.6)    1428              CH2 scis
1338.6 (25.4)    1382.1 (24.3)    1299              CH2 wag
1296.5 (57.3)    1337.9 (45.2)    1260              CH2 wag
1211.4 (3.2)     1244.7 (3.0)     1190              CH2 twist
1146.6 (1.5)     1179.2 (2.0)     1127              CH2 twist
1038.2 (1.8)     1077.7 (1.3)     1025              CC str
928.7 (9.1)      968.6 (6.9)      923               CH2 rock
868.5 (23.1)     898.7 (18.5)     856               CH2 rock
658.4 (26.0)     724.2 (13.8)     664               CCl str
555.0 (13.9)     610.6 (8.7)      571               CBr str
382.0 (8.7)      397.1 (5.5)      385               CCCl deform
241.6 (1.4)      251.0 (0.9)      251               CCBr deform
94.1 (0.6)       106.6 (0.5)      107               Torsion

Trans conformer
B3LYP            MP2              Literature [17]   Assignment
3113.3 (7.7)     3151.1 (10.4)    2960              CH2 s str
3102.3 (1.9)     3142.4 (1.5)     2960              CH2 s str
1490.9 (1.5)     1499.7 (0.04)    1446              CH2 scis
1486.8 (5.6)     1493.9 (5.1)     1444              CH2 scis
1320.9 (2.7)     1372.4 (2.9)     1284              CH2 wag
1237.0 (49.6)    1272.9 (49.1)    1203              CH2 wag
1060.7 (1.0)     1100.8 (1.2)     1052              CC str
716.3 (30.4)     803.3 (24.5)     726               CCl str
616.2 (70.1)     688.6 (40.6)     630               CBr str
242.2 (0.9)      259.1 (0.7)      251               CCCl deform
191.1 (7.8)      201.0 (6.4)      202               CCBr deform
3185.5 (0.9)     3229.5 (1.1)     3010              CH2 a str
3159.9 (0.4)     3207.5 (0.4)     3010              CH2 a str
1292.6 (0.02)    1314.6 (0.02)    1259              CH2 twist
1219.3 (2.6)     1158.3 (3.1)     1111              CH2 twist
975.6 (0.3)      1011.3 (0.5)     961               CH2 rock
771.1 (3.3)      785.0 (2.2)      763               CH2 rock
109.2 (6.0)      119.0 (5.1)      123               Torsion

Values in parentheses are infrared intensities in km/mol.
For the trans conformer, the first 11 frequencies are of A′ symmetry and the last 7 are of A″ symmetry.
For the gauche conformer, all 18 frequencies are of A symmetry.

This study has also been extended to solvent effects. The structures of the conformers
are not much affected by the polarity of the solvents. The effects of the solvent on
the energies of the gauche and trans conformers and on the energy difference (ΔEtg)
are summarised in Table 4 and illustrated in Figure 3. The solvent effects are small
but can be calculated, and an increase in the polarity of the solvent decreases the
energy difference (ΔEtg); in the more polar solvents, the decrease in the energy of
the more polar gauche conformer is larger than that of the trans conformer.

Fig. 2. Simulated vibrational spectra (B3LYP and MP2) of the gauche and trans conformers of 1-bromo-2-chloroethane (wavenumbers from 3500 to 500 cm-1; intensities in arbitrary units)

Fig. 3. Energy difference (ΔEtg, kJ/mol) for 1-bromo-2-chloroethane as a function of the dielectric constant of the solvent, from the MP2 and B3LYP calculations


Table 4. Energies and energy differences (ΔEtg) for the conformers of 1-bromo-2-chloroethane in solvents with different dielectric constants

        MP2                                                   B3LYP
ε       Gauche (Hartrees)  Trans (Hartrees)   ΔEtg (kJ/mol)    Gauche (Hartrees)  Trans (Hartrees)   ΔEtg (kJ/mol)
5       -3110.5351096      -3110.5326210      6.53             -3113.0262618      -3113.0242279      5.34
10      -3110.5356523      -3110.5332507      6.31             -3113.0268009      -3113.0250547      4.58
15      -3110.5358531      -3110.5334889      6.21             -3113.0269959      -3113.0253597      4.30
20      -3110.5359575      -3110.5336138      6.15             -3113.0270967      -3113.0255187      4.14
25      -3110.5360216      -3110.5336906      6.12             -3113.0271582      -3113.0256163      4.05
30      -3110.5360649      -3110.5337428      6.10             -3113.0271996      -3113.0256822      3.98
35      -3110.5360961      -3110.5337807      6.08             -3113.0272295      -3113.0257298      3.94
40      -3110.5361197      -3110.5338094      6.07             -3113.0272520      -3113.0257657      3.90

ε: dielectric constant
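The magnitude of the solvent effect can be summarised from the ΔEtg columns of Table 4 together with the gas-phase values of Table 2; a short Python sketch:

  # Lowering of Etg with solvent polarity, from the Etg columns of Table 4
  # and the gas-phase values of Table 2 (all in kJ/mol).
  eps = [5, 10, 15, 20, 25, 30, 35, 40]
  dE_mp2 = [6.53, 6.31, 6.21, 6.15, 6.12, 6.10, 6.08, 6.07]
  dE_b3lyp = [5.34, 4.58, 4.30, 4.14, 4.05, 3.98, 3.94, 3.90]
  gas_phase = {"MP2": 7.08, "B3LYP": 8.42}
  for label, series in (("MP2", dE_mp2), ("B3LYP", dE_b3lyp)):
      drop = gas_phase[label] - series[-1]
      print(f"{label}: {gas_phase[label]} (gas) -> {series[-1]} (eps = {eps[-1]}), "
            f"lowered by {drop:.2f} kJ/mol")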

4 Conclusions
This theoretical study has led to the determination of the optimised structural
parameters, the energy difference (ΔEtg), and the related thermodynamic parameters for
1-bromo-2-chloroethane. The results indicate a preference for the trans conformer in
both the gas and solution phases. The calculated frequencies of the conformers are in
agreement with literature values. The energy difference (ΔEtg) decreases as the
solvent becomes more polar. The results of this study may be used as reference data
for the conformers of 1-bromo-2-chloroethane.

Acknowledgements
The author is grateful to the anonymous reviewers for their comments, which helped improve the manuscript. The author acknowledges the facilities provided by the University of Mauritius.

References
1. Orville-Thomas W.J.: Internal Rotation in Molecules. Wiley, New York, 1974
2. Dixon D.A., Matsuzawa N., Walker S.C.: Conformational Analysis of 1,2-Dihaloethanes:
A Comparison of Theoretical Methods. J. Phys. Chem. 96 (1992) 10740-10746
3. Radom L., Baker J., Gill P.M.W., Nobes R.H., Riggs N.V.: A Theoretical Approach to
Molecular Conformational Analysis. J. Mol. Struc. 126 (1985) 271-290
4. Ramasami P.: Gauche and Trans Conformers of 1,2-Dihaloethanes: A Study by Ab initio
and Density Functional Theory Methods. Lecture Series on Computer and Computational
Sciences. Vol. 1, Brill Academic Publishers, The Netherlands (2005) 732-734
5. Ramasami P.: Gas Phase Study of the Gauche and Trans Conformers of 1-Fluoro-2-Haloethanes CH2F-CH2X (X=Cl, Br, I) by Ab initio and Density Functional Methods: Absence of Gauche Effect. Lecture Notes in Computer Science. Vol. 3993, Springer (2006) 153-160


6. Tavasli M., O'Hagan D., Pearson C., Petty M.C.: The Fluorine Gauche Effect. Langmuir Isotherms Report the Relative Conformational Stability of (+/-)-Erythro- and (+/-)-Threo-9,10-Difluorostearic Acids. Chem. Commun. 7 (2002) 1226-1227
7. Briggs C.R., Allen M.J., O'Hagan D., Tozer D.J., Slawin A.M., Goeta A.E., Howard J.A.: The Observation of a Large Gauche Preference when 2-Fluoroethylamine and 2-Fluoroethanol Become Protonated. Org. Biomol. Chem. 2 (2004) 732-740
8. Banks J.W., Batsanov A.S., Howard J.A.K., O'Hagan D., Rzepa H.S., Martin-Santamaria S.: The Preferred Conformation of α-Fluoroamides. J. Chem. Soc., Perkin Trans. 2. 8 (1999) 2409-2411
9. Wiberg K.B., Murcko M.A., Laidig E.K., MacDougall P.J.: Origin of the Gauche Effect in Substituted Ethanes and Ethenes. The Gauche Effect. J. Phys. Chem. 96 (1992) 6956-6959 and references therein
10. Harris W.C., Holtzclaw J.R., Kalasinsky V.F.: Vibrational Spectra and Structure of 1,2-Difluoroethane: Gauche-Trans Conformers. J. Chem. Phys. 67 (1977) 3330-3338
11. Sreeruttun R.K., Ramasami P.: Conformational Behaviour of 1,2-Dichloroethane and 1,2-Dibromoethane: 1H-NMR, IR, Refractive Index and Theoretical Studies. Physics and Chemistry of Liquids. 44 (2006) 315-328
12. Wiberg K.B., Keith T.A., Frisch M.J., Murcko M.: Solvent Effects on 1,2-Dihaloethane
Gauche/Trans Ratios. J. Phys. Chem. 99 (1995) 9072-9079
13. McClain B.L., Ben-Amotz D.: Global Quantitation of Solvent Effects on the Isomerization
Thermodynamics of 1,2-Dichloroethane and trans-1,2-Dichlorocyclohexane. J. Phys.
Chem. B 106 (2002) 7882-7888
14. Foresman J.B., Keith T.A., Wiberg K.B., Snoonian J., Frisch M.J.: J. Phys. Chem. 100,
(1996) 16098-16104 and references therein
15. Gaussian 03, Revision C.02, Frisch M.J., Trucks G.W., Schlegel H.B., Scuseria G.E.,
Robb M.A.,. Cheeseman J.R, Montgomery J.A., Jr., Vreven T., Kudin K.N., Burant J.C.,
Millam J.M., Iyengar S.S., Tomasi J., Barone V., Mennucci B., Cossi M., Scalmani G.,
Rega N., Petersson G.A., Nakatsuji H., Hada M., Ehara M., Toyota K., Fukuda R., Hasegawa J., Ishida M., Nakajima T., Honda Y., Kitao O., Nakai H., Klene M., Li X., Knox
J.E., Hratchian H.P.,. Cross J.B, Bakken V., Adamo C., Jaramillo J., Gomperts R., Stratmann R.E., Yazyev O., Austin A.J., Cammi, R. Pomelli C., Ochterski J.W., Ayala P.Y.,
Morokuma K., Voth G.A., Salvador P., Dannenberg J.J., Zakrzewski V.G., Dapprich S.,
Daniels A.D., Strain M.C., Farkas O., Malick D.K., Rabuck A.D., Raghavachari K.,
Foresman J.B., Ortiz J.V., Cui Q., Baboul A.G., Clifford S., Cioslowski J., Stefanov B.B.,
Liu G., Liashenko A., Piskorz P., Komaromi I., Martin R.L., Fox D.J., Keith T., Al-Laham
M.A., Peng C.Y., Nanayakkara A., Challacombe M., Gill P.M.W., Johnson B., Chen W.,
Wong M.W., Gonzalez C., and Pople J.A., Gaussian, Inc., Wallingford CT, 2004.
16. GaussView, Version 3.09, R. Dennington II, T. Keith, J. Millam, K. Eppinnett, W. L.
Hovell, R. Gilliland, Semichem, Inc., Shawnee Mission, KS, 2003.
17. Shimanouchi T.: Tables of Molecular Vibrational Frequencies Consolidated Volume I.
National Bureau of Standards (1972) 1-160

Dynamics Simulation of Conducting Polymer Interchain Interaction Effects on Polaron Transition

Jose Rildo de Oliveira Queiroz and Geraldo Magela e Silva

Institute of Physics, University of Brasília, 70.917-970, Brasília, Distrito Federal, Brazil
{magela,rildo}@fis.unb.br
http://www.fis.unb.br

Abstract. Effects of interchain interaction on the polaron-bipolaron transition in
conjugated polymers are investigated. We use the Su-Schrieffer-Heeger model combined
with the Pariser-Parr-Pople model, modified to include interchain interaction and an
external electric field. We study the dynamics within the Time-Dependent Unrestricted
Hartree-Fock approximation. We find that removing an electron from interacting
conducting polymer chains bearing a single positively charged polaron leads to the
direct transition of the polaron to a bipolaron state. The transition produced is a
single-polaron to bipolaron transition whose excitation spectrum explains the
experimental data. We also find that, depending on how fast the electron is removed,
a structure containing a bipolaron coupled to a breather is created.
Keywords: Polaron, Dynamics, Interchain Interaction, Transition.

Introduction

Properties of organic light-emitting diodes, transistors, and lasers are due to
conjugated polymers [1,2]. Their semiconductor properties are related to the nonlinear
electronic response of the coupled electron-lattice system [3]. These nondegenerate
ground state π-electron materials are able to form, through the electron-lattice
interaction, self-localized electronic states called polarons and bipolarons.
Bipolarons and polarons are thought to play the leading role in determining the charge
injection, optical, and transport properties of conducting polymers [4]. Bipolarons
and polarons are self-localized particle-like defects associated with characteristic
distortions of the polymer backbone and with quantum states deep in the energy gap due
to strong electron-lattice coupling. A polaron has spin 1/2 and an electric charge
±e, whereas a bipolaron is spinless with a charge ±2e.
A critical problem in the understanding of these materials is the consistent
description of the dynamics of the mechanisms of creation, stability, and transition
of polarons to bipolarons.
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 304-311, 2007.
© Springer-Verlag Berlin Heidelberg 2007



UV-Vis-NIR spectroscopy studies on poly(p-phenylene vinylene), combined with a follow-up of the kinetics of doping with iodine vapor, were reported and interpreted as direct observations of the formation of polaronic charge carriers.[1] However, by following different doping levels with I2, bipolaron formation is identified as well, showing that polarons and bipolarons coexist in the oxidized polymer. These results corroborate the findings of Steinmüller et al.,[5] where the evolution of the gap states of bithiophene, as a model system for polythiophene, at different n-doping levels was followed by ultraviolet photoemission spectroscopy (UPS) and electron-energy-loss spectroscopy (EELS).
The polaron-bipolaron transition problem was explicitly addressed by Cik et al. in poly(3-dodecyl thiophene) in connection with temperature changes.[6] They found that when the sample was heated and subsequently cooled, there was an amplification of the diamagnetic inter- and intra-chain bipolarons. The study of polypyrrole by Kaufman et al.[7] using optical-absorption spectroscopy and ESR also pointed out that the metastable states possess spin, while the stable states do not.
Many efforts have been devoted to describing the polaron-bipolaron conundrum theoretically. Electronic structure calculations,[8] extensions of the Su-Schrieffer-Heeger model,[9,10] the Pariser-Parr-Pople model,[11] as well as combinations of them,[12] have been used to determine the relative prevalence of each excited state in various regimes. Several different approaches[9,12,13,14] point to the bipolaron system being more stable than the polaron system when dopants are taken into account.
Two mechanisms have been put forward to explain the transition from polaron to bipolaron states: polaron recombination into a bipolaron,[6,7,15] where the bipolaron is generated when polarons with the same electric charge meet each other; and the single-polaron to bipolaron transition,[1,13,16] where the polaron structure is transformed by the addition of one extra charge.
Here, we report the results of dynamical calculations on the polaron-bipolaron transition mechanism with interacting chains. We use the Su-Schrieffer-Heeger model[17] modified to include the Coulomb interaction via an extended Hubbard model, Brazovskii-Kirova (BK) symmetry-breaking terms, the action of an external electric field, and interchain interactions.[12] The time-dependent equations of motion for the lattice sites and the pi-electrons are numerically integrated within the Time-Dependent Hartree-Fock approximation.
Stafström et al. have used a similar approach to treat polaron migration between chains (ref. [20]). Nevertheless, they did not consider the electron Coulomb interaction and symmetry-breaking terms. Furthermore, open-end boundary conditions were used.
In agreement with UV-Vis-NIR spectroscopy,[1] UPS and EELS measurements,[5] our theoretical studies indicate that the single-polaron to bipolaron transition is the preferred mechanism of polaron-bipolaron transition in conjugated polymers.
We find that a breather mode of oscillation is created in the lattice around the bipolaron in connection with the transition. The breather amplitude is


associated with how fast the extra charge is added to the system. Moreover, the
created bipolaron is trapped by the breather.

Model

A SSH-Extended Hubbard type Hamiltonian, modified to include an external electric field and interchain interaction, is considered. The Hamiltonian is given by

    H = H_1 + H_2 + H_{\mathrm{int}}                                                        (1)

where

    H_j = -\sum_{i,s}\left( t_{j,i,i+1}\, C^{\dagger}_{j,i+1,s} C_{j,i,s} + \mathrm{H.c.} \right)
          + U \sum_{i}\left( n_{j,i,\uparrow} - \tfrac{1}{2} \right)\left( n_{j,i,\downarrow} - \tfrac{1}{2} \right)
          + V \sum_{i}\left( n_{j,i} - 1 \right)\left( n_{j,i+1} - 1 \right)
          + \frac{K}{2}\sum_{i} y_{j,i}^{2} + \frac{M}{2}\sum_{i} \dot{u}_{j,i}^{2},
          \qquad j = 1, 2                                                                    (2)

and

    H_{\mathrm{int}} = -\sum_{i=p}^{q}\sum_{s} t_{\perp}\left( C^{\dagger}_{1,i,s} C_{2,i,s} + C^{\dagger}_{2,i,s} C_{1,i,s} \right)          (3)
                       \; - \; \sum_{s} V_{p}\left( C^{\dagger}_{1,m,s} C_{1,m,s} + C^{\dagger}_{1,m+1,s} C_{1,m+1,s} \right).               (4)

Here C^{\dagger}_{i,s} (C_{i,s}) is the creation (annihilation) operator of a pi-electron with spin s at the i-th lattice site, n_{i,s} \equiv C^{\dagger}_{i,s} C_{i,s} is the number operator, and n_i = \sum_s n_{i,s}. y_n \equiv u_{n+1} - u_n, where u_n is the displacement of the n-th CH group from its equilibrium position in the undimerized phase. The hopping term is t_{j,n,n+1} = \exp(-i\gamma A)\,[(1 + (-1)^{n}\delta_0)\, t_0 - \alpha y_{j,n}], where t_0 is the transfer integral between nearest-neighbor sites in the undimerized chains, t_{\perp} is the hopping integral between sites with the same index on different chains from site p to site q, \alpha is the electron-phonon coupling, and \delta_0 is the BK symmetry-breaking parameter. M is the mass of a CH group, K is the spring constant of a sigma-bond, and U and V are the on-site and nearest-neighbor Coulomb repulsion strengths, respectively. \gamma \equiv ea/(\hbar c), where e is the absolute value of the electronic charge, a the lattice constant, and c the light velocity. The relation between the time-dependent vector potential A and the uniform electric field E is given by E = -(1/c)\dot{A}. We use as parameters the commonly accepted values for conjugated polymers: t_0 = 2.5 eV, t_{\perp} = 0.075 eV, K = 21 eV/Å^2, \alpha = 4.1 eV/Å, U = 0 to 1.8 t_0, V = U/2, \delta_0 = 0.05 t_0, V_p = 0.2 eV, a = 1.22 Å, and a bare optical phonon energy \hbar\omega_Q = \hbar\sqrt{4K/M} = 0.16 eV.[19]
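As a quick numerical cross-check of the parameter set above, the short sketch below (Python; it assumes M is the mass of a single CH group, 13 amu, which the paper does not state explicitly) evaluates the bare phonon energy from K = 21 eV/Å^2 and reproduces the quoted 0.16 eV.

    import numpy as np

    # Physical constants (SI)
    eV = 1.602176634e-19            # J
    angstrom = 1e-10                # m
    amu = 1.66053906660e-27         # kg
    hbar = 1.054571817e-34          # J s

    K = 21.0 * eV / angstrom**2     # spring constant, eV/A^2 -> N/m
    M = 13.0 * amu                  # assumed mass of one CH group (12 for C + 1 for H)

    omega_Q = np.sqrt(4.0 * K / M)  # bare optical phonon frequency, rad/s
    print(hbar * omega_Q / eV)      # ~0.16 eV, matching the quoted value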


The dynamics of the lattice part is obtained from the Euler-Lagrange equations, and the Schrödinger equation of motion for the pi-electrons is solved within the unrestricted time-dependent Hartree-Fock approximation. It should be pointed out that both equations depend explicitly on the occupation number of the one-particle electronic states.[12]
In order to perform the dynamics, an initial self-consistent state is prepared by solving the equations of motion for the lattice and the pi-electrons simultaneously.[20] Periodic boundary conditions are considered. The initial state is taken in equilibrium (E = 0); therefore, we have \dot{u}_n = 0 for all n in the initial state.
The equations of motion are solved by discretizing the time variable with a step \Delta t. The time step \Delta t is chosen so that the changes of u_i(t) and A(t) during this interval are always very small on the electronic scale.[12]
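To make the time-stepping scheme concrete, the sketch below shows one way such a coupled lattice/electron integration can be organized. It is a deliberately simplified illustration, not the paper's method: it treats a single spinless SSH chain with U = V = 0, no electric field and no interchain or impurity terms, starts from a guessed (not self-consistent) dimerization, and evolves the occupied one-electron states by expanding them in the instantaneous eigenstates at each step while the lattice is advanced with a simple explicit update.

    import numpy as np

    # --- Simplified parameters (paper values; spin, Hubbard and field terms omitted) ---
    N     = 60            # sites in the chain
    t0    = 2.5           # eV, bare hopping
    alpha = 4.1           # eV/Angstrom, electron-phonon coupling
    K     = 21.0          # eV/Angstrom^2, sigma-bond spring constant
    M     = 13.0 * 103.64 # CH mass (13 amu) in eV fs^2/Angstrom^2
    hbar  = 0.6582119     # eV fs
    dt    = 0.004         # fs (100,000 steps span 400 fs in the paper)

    def hamiltonian(u):
        """One-electron SSH Hamiltonian with periodic boundary conditions."""
        y = np.roll(u, -1) - u            # y_n = u_{n+1} - u_n
        t = t0 - alpha * y                # bond hopping integrals
        h = np.zeros((N, N))
        for n in range(N):
            h[n, (n + 1) % N] = -t[n]
            h[(n + 1) % N, n] = -t[n]
        return h, y

    # Guessed initial state: small alternating (Peierls) dimerization, zero velocities.
    u = 0.04 * (-1.0) ** np.arange(N)
    v = np.zeros(N)

    h, y = hamiltonian(u)
    eps, phi = np.linalg.eigh(h)
    psi = phi[:, : N // 2]                # occupied one-electron states (half filling)

    for step in range(1000):
        # Bond charge B_n = 2 Re sum_occ psi*(n) psi(n+1); it gives the electronic force.
        B = 2.0 * np.real(np.sum(np.roll(psi, -1, axis=0) * psi.conj(), axis=1))
        F = alpha * (B - np.roll(B, 1)) + K * (np.roll(u, 1) - 2 * u + np.roll(u, -1))

        # Lattice: explicit integration of the Newton/Euler-Lagrange equations.
        v += (F / M) * dt
        u += v * dt

        # Electrons: expand in instantaneous eigenstates and apply phase factors.
        h, y = hamiltonian(u)
        eps, phi = np.linalg.eigh(h)
        psi = phi @ (np.exp(-1j * eps * dt / hbar)[:, None] * (phi.conj().T @ psi))

    # Order parameter y_i(t), analogous to the quantity plotted in the paper's Fig. 2.
    print(np.round(y[:10], 4))

The full treatment of the paper adds spin, the Hubbard U and V terms, the Brazovskii-Kirova term, the vector potential and the second chain, but the loop structure (force from the instantaneous bond charges, lattice step, electronic phase evolution) is the part this sketch is meant to illustrate.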

Simulation Results

One more hole is injected into polymer chains already bearing positively charged polarons. Since charged excitation defects can be created by quite different means (photoexcitation, chemical doping or direct charge injection via an electronic device), we performed simulations where the extra electron is taken from the system during different time intervals (T). We varied T from 0 to 100 fs. The shorter time intervals simulate photoexcitation and direct charge injection. The longer time intervals account for the different impurity addition procedures associated with chemical doping. The electron is taken from the highest occupied level using the following expression:

    \mathrm{OF}(t) = \frac{1}{2}\left[ 1 + \cos\!\left( \frac{\pi\,(t - t_i)}{T} \right) \right]                    (5)

for t between t_i and t_i + T. Here, t_i is the time when the hole injection begins and OF(t) is the occupation number of the Fermi level.
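The ramp of Eq. (5) can be written directly as a short helper. In this sketch the function name is illustrative, and holding the occupation at 1 before t_i and at 0 after t_i + T is an assumption implied, but not stated, by the text.

    import math

    def fermi_level_occupation(t, t_i, T):
        """Occupation OF(t) of Eq. (5): 1 before t_i, cosine ramp down to 0 over T."""
        if t <= t_i:
            return 1.0
        if t >= t_i + T:
            return 0.0
        return 0.5 * (1.0 + math.cos(math.pi * (t - t_i) / T))

    print(fermi_level_occupation(50.0, t_i=20.0, T=80.0))   # 0.691..., mid-ramp value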
We have considered two interacting polymer chains with N = 60 sites each, containing initially two positively charged polarons in all simulations. We use a mean charge density \bar{\rho}_i(t), derived from the charge density \rho_i(t), and the order parameter y_i(t) [y_i(t) = u_{i+1}(t) - u_i(t)] to analyze the simulations.[19] The dynamics of the system is followed during 100,000 time steps spanning 400 fs.
A smooth transition of one of the polarons to a bipolaron, in its respective
chain, is obtained after the adiabatic removal (T > 80 fs) of the third electron.
Figure 1 shows the time evolution of the energy levels neighboring and inside
the energy gap. It can be seen that the energy levels associated with the polaron
move toward the middle of the gap, assuming a bipolaron conformation. The small oscillations of the levels are due to lattice oscillations induced by the hole injection perturbation.
Figure 2 presents the bond-length order parameter of chains 1 and 2. It should be noted that we use periodic boundary conditions; therefore, the order parameter of chain 1 (Fig. 2(a)) represents a polaron centered around site 1 (it begins at site 45, extends to site 60, and continues from site 1 to site 15).



[Figure 1 plots the energy levels (eV) inside and around the gap as a function of time (0-400 fs).]
Fig. 1. Time evolution of energy levels inside and around the gap in an adiabatic transition. The spin-up levels are shown. The system changes from the polaron level configuration (t < 80 fs) to the bipolaron level configuration (t > 100 fs).

Positively charged polarons repel each other and stay as far apart from each other as possible. The polaron-bipolaron transition occurs in chain 2. This clear transition happens in chain 2 as an apparent spontaneous symmetry breaking. In fact, the presence of an impurity on chain 2 breaks the symmetry and leads to the association of one polaron with it. It is found that the polaron associated with the impurity makes the transition to a bipolaron.
Effects of interchain interaction were addressed by varying the extent of the interacting region (p and q in the Hamiltonian). For the transitions where the two chains interact only over half of their length (p = 31 and q = 60), one polaron stays in the interacting region and the other stays in the non-interacting region, due again to Coulomb repulsion. It is found that the polaron-bipolaron transition happens with the polaron in the interacting region. Therefore, the interchain interaction is also effective in promoting the transition.
Figure 3 presents a very special case where two polarons merge to create a bipolaron. This case corresponds closely to the originally suggested process for the polaron-bipolaron transition.[21] Here, after the hole injection, an exciton appears, lasting for about 200 fs, and then the bipolaron forms in the lattice. Nevertheless, it should be noted that this happens when one chain has a high density of polarons and the other initially has none. It can be clearly seen that two polarons in chain 1 merge into a single bipolaron and another polaron appears in chain 2 due to interchain interaction and Coulomb repulsion.
The fast removal of the third electron (T < 80 fs) leads to the appearance of a breather oscillation mode in the lattice. This breather appears at the bipolaron position. As a matter of fact, the bipolaron is trapped by the breather.[19]


[Figure 2, "Polarons on Neighboring Chains", plots the bond-length order parameter (Å) of chain 1 (a) and chain 2 (b) against the site index (1-60).]
Fig. 2. Evolution of the bond-length order parameter of two neighboring chains. Solid line: initial configuration; dotted line: final configuration. The polaron on chain 1 (Fig. 2(a)) remains stable after the extra electron is adiabatically taken from the system, but the polaron on chain 2 (Fig. 2(b)) makes a transition and becomes a bipolaron.
[Figure 3, "Polarons on the Same Chain", plots the charge density (e) of chain 1 and chain 2 against the site index (1-60).]
Fig. 3. Charge density of the chains corresponding to the simulation of Fig. 4. The initial polarons on chain 1 (solid line) coalesce into a bipolaron (dotted line). There is also the creation of a polaron on chain 2 (dotted line).

This trapping of the bipolaron leads to a reduction of the bipolaron mobility. Furthermore, the breather oscillation frequency could be detected by infrared


spectroscopy, and its presence in association with bipolarons should have effects on the overall conduction properties of the material.
The distinction between adiabatic and non-adiabatic injection effects is thought to be associated with the relaxation processes involved at the electronic level.

Conclusions

Effects of interchain interaction on the transition of polarons to bipolarons in two interacting conjugated polymer chains were investigated. This study was carried out through numerical calculations.
It should be noted that our purpose is a qualitative description of the transition process in conducting polymers in general. The adopted parameter values are mostly those of polyacetylene, because they are well-known values. There was no attempt to fit the parameters to any specific polymer. Nevertheless, there was a remarkable accordance between our results and the experimental values of the subgap transitions obtained in reference [1] for poly(p-phenylene vinylene) doped with I2, FeCl3 and H2SO4. There, for example, the major subgap energy differences for polarons were 1.81 eV for I2, 2.07 eV for H2SO4, and 2.08 eV for FeCl3, whereas the same energy differences in our simulations varied from 1.93 to 2.12 eV.
We present theoretical results pointing to the direct single-polaron to bipolaron transition as the favored mechanism of bipolaron formation. This result is in accordance with previous polaron and bipolaron dynamics calculations,[12] where a pair of polarons did not spontaneously merge to produce a bipolaron.
Since we considered different characteristic time intervals for the hole insertion in the chain, in order to simulate different ways of bipolaron generation (photoproduction, chemical doping or direct charge injection), we obtained different responses from the lattice. It is found that the non-adiabatic electron removal leads to the formation of an associated breather oscillation mode in the chain. Moreover, the breather interacts with the newly formed bipolaron, trapping it around its position. The trapping and depinning of bipolarons from breathers have a direct influence on the mobility of this charge carrier in the chain.
Based on our numerical simulations, we suggest two experimental approaches to better understand the polaron-bipolaron transition mechanism: first, the verification of the presence and quantity of breathers associated with different transition regimes; second, the change in bipolaron mobility due to the trapping effect of breathers.
Acknowledgments. We thank CAPES, FINATEC and CNPq for financial support.


References
1. Fernandes, M. R., Garcia, J. R., Schultz, M. S., and Nart, F. C.: Polaron and bipolaron transitions in doped poly(p-phenylene vinylene) films. Thin Sol. Films 474 (2005) 279.
2. Burroughes, J., Bradley, D. D. C., Brown, A. R., Marks, R. N., Mackay, K., Friend,
R. H., Burn, P. L., and Holmes, A. B.: Light-emitting diodes based on conjugated
polymers. Nature 347 (1990) 539.
3. Jeckelmann, E., and Baeriswyl, D.: The metal-insulator transition in polyacetylene:
variational study of the Peierls-Hubbard model. Synth. Met. 65 (1994) 211.
4. Furukawa, Y.: in P rimary P hotoexcitations in Conjugated P olymers, edited by
N. S. Sariciftci (World Scientic, Singapore, 1997) 496.
5. Steinmüller, D., Ramsey, M. G., and Netzer, F. P.: Polaron and bipolaronlike states in n-doped bithiophene. Phys. Rev. B 47 (1993) 13323.
6. Cik, G., Sersen, F., and Dlhan, L. D.: Thermally induced transitions of polarons
to bipolarons in poly(3-dodecylthiophene). Synth. Met. 151 (2005) 124.
7. Kaufman, J. H., and Colaneri, N.: Evolution of Polaron States into Bipolarons in
Polypyrrole. Phys. Rev. Lett. 53 (1984) 1005.
8. Geskin, V. M., and Bredas, J. -L.: Polaron Pair versus Bipolaron on Oligothiophene
Chains: A Theoretical Study of the Singlet and Triplet States. ChemPhysChem 4
(2003) 498.
9. Saxena, A., Brazovskii, S., Kirova, N., Yu, Z. G., and Bishop, A. R.: Stability of
bipolarons in conjugated polymers. Synth. Met. 101 (1999) 325.
10. Xie, S-J., and Mei, L-M.: Transition between bipolaron and polaron states in doped
heterocycle polymers. Phys. Rev. B 50 (1994) 13364.
11. Yao, K. L., Han, S. E., and Zhao, L.: The polaron and bipolaron states of
poly(phenylene vinylene). J. Chem. Phys. 114 (2001) 6437.
12. e Silva, G. M.: Electric-field effects on the competition between polarons and bipolarons in conjugated polymers. Phys. Rev. B 61 (2000) 10777.
13. Irle, S., and Lischka, H.: Combined ab initio and density functional study on polaron to bipolaron transitions in oligophenyls and oligothiophenes. J. Chem. Phys.
107 (1997) 3021.
14. Bredas, J. L., Scott, J. C., Yakushi, K., and Street, G. B.: Polarons and Bipolarons
in polypyrrole: Evolution of the band structure and optical spectrum upon doping.
Phys. Rev. B 30 (1984) 1023.
15. Farias, G. A., da Costa, W. B., and Peeters, F. M.: Acoustical polarons and bipolarons in two dimensions. Phys. Rev. B 54 (1996) 12835.
16. Verbist, G., Peeters, F. M., and Devreese, J. T.: Large bipolarons in two and three
dimensions. Phys. Rev. B 43 (1991) 2712.
17. Su, W. P., Schrieffer, J. R., and Heeger, A. J.: Soliton excitations in polyacetylene. Phys. Rev. B 22 (1980) 2099; 28 (1983) 1138.
18. Johansson, A., and Stafström, S.: Polaron Dynamics in a System of Coupled Conjugated Polymer Chains. Phys. Rev. Lett. 86 (2001) 3602.
19. Lima, M. P., and e Silva, G. M.: Dynamical evolution of polaron to bipolaron in
conjugated polymers. Phys. Rev. B 74 (2006) 224304.
20. Pinheiro, C. da S., and e Silva, G. M.: Use of polarons and bipolarons in logical
switches based on conjugated polymers. Phys. Rev. B 65 (2002) 94304.
21. Moses, D., Wang, J., Yu, G., and Heeger, A. J.: Temperature-Independent Photoconductivity in Thin Films of Semiconducting Polymers: Photocarrier Sweep-Out
Prior to Deep Trapping. Phys. Rev. Lett 80 (1998) 2685.

Cerium (III) Complexes Modeling with Sparkle/PM3


Alfredo Mayall Simas1, Ricardo Oliveira Freire1, and Gerd Bruno Rocha2
1

Departamento de Química Fundamental, CCEN, UFPE, 50590-470 Recife, PE, Brazil


{simas, rfreire}@ufpe.br
2
Departamento de Química, CCEN, UFPB, 58.059-970 João Pessoa, PB, Brazil
gbr@quimica.ufpb.br

Abstract. The Sparkle/PM3 model is extended to cerium(III) complexes. The validation procedure was carried out using only high quality crystallographic structures (R factor < 0.05), for a total of thirty-seven Ce(III) complexes. The Sparkle/PM3 unsigned mean error, for all interatomic distances between the Ce(III) ion and the directly coordinating oxygen or nitrogen atoms, is 0.080 Å, a level of accuracy equivalent to the Sparkle/AM1 figure of 0.083 Å. Moreover, their accuracy is similar to what can be obtained by present-day ab initio effective core potential full geometry optimization calculations on such lanthanide complexes.
Keywords: Cerium, Sparkle Model, PM3, Lanthanide.

1 Introduction
The motivation for research on Ce(III) has been mainly focused on developing
materials for phosphor and scintillator applications [1].
Recently, we introduced Sparkle/AM1 [2], a new paradigm for semiempirical
quantum chemical calculations on lanthanide complexes. Sparkle/AM1 lanthanides
function as new elements to the semiempirical quantum chemistry molecular orbital
model AM1 [3]. That is, when a lanthanide complex is calculated, the lanthanide is
represented by a sparkle, whereas the ligands are modeled by AM1.
The Sparkle model assumes that the angular effects of the f orbitals are negligible and does not take them into account. The sparkle model replaces the lanthanide(III) ion by a Coulombic charge of +3e superimposed on a repulsive exponential potential of the form exp(-αr), which accounts for the size of the ion; provides three electrons to the orbitals of the ligands; adds two Gaussian functions to the core-core repulsion energy term; and includes the lanthanide atomic mass. Thus, the sparkle model assumes that the lanthanide trications behave like simple ions, without any angular steric properties.
Indeed, Sparkle/AM1 was mainly designed to predict geometries of lanthanide
complexes at a level of accuracy useful for complex design. Recent research on
lanthanide complexes has in fact indicated that Sparkle/AM1 coordination polyhedron
geometries are comparable to, if not better than geometries obtained from the best
contemporary ab-initio full geometry optimization calculations with effective core


potentials [4]. Besides, Sparkle/AM1 calculations are hundreds of times faster [2],
and have been recently employed for the study of quantum yields of luminescence for
some complexes [5]-[9].
PM3 [10],[11] was introduced in 1989 as a more accurate semiempirical model,
giving lower average errors than AM1 [3], mainly for the enthalpies of formation.
PM3 also became very popular [12]. More recently, Stewart completed the
parameterization of PM3 to all non-radioactive elements of the main group, excluding
the noble gases, thus largely amplifying its usefulness [13].
In order to broaden the range of applications of our sparkle model, we advance, in
the present article, Sparkle/PM3 parameters for the calculation of Ce(III) complexes
to complement the Sparkle/AM1 parameters that have already been published for
Ce(III) ion [14].

2 Parameterization Procedure
The parameterization procedure used for Ce(III) was essentially the same as the one
described in our previous work on Sparkle/AM1 for Ce(III)[14]. Accordingly, we
only used high quality crystallographic structures (R-factor < 5%) taken from the
"Cambridge Structural Database 2003" (CSD) [15]-[17], having found a total of 37
structures of complexes of Ce(III). As training sets, we used the same three subsets of
15 complexes each, already chosen for the parameterization of Sparkle/AM1 for the
same ions[14].
The Sparkle/PM3 parameters found for the cerium(III) ion are shown in Table 1.

Table 1. Parameters for the Sparkle/PM3 model for Ce(III)

    Parameter             Sparkle/PM3 - Ce(III)
    GSS                   58.5701153062
    ALP                   2.5665085968
    a1                    1.8026688761
    b1                    7.5971870028
    c1                    1.8009003439
    a2                    0.1319892158
    b2                    9.6116040841
    c2                    3.0613741124
    EHEAT (kcal.mol-1)    944.7
    AMS (amu)             140.1150

The heat of formation of the Ce(III) ion in Sparkle/PM3 was obtained by adding the first three ionization potentials of cerium to its heat of atomization.
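As a rough cross-check of the EHEAT value in Table 1, the sketch below adds approximate literature values (assumptions used here for illustration; they are not quoted in the paper) for the heat of atomization of cerium and its first three ionization energies, and the sum lands close to the tabulated 944.7 kcal/mol.

    # Approximate literature values (assumptions for illustration, not from the paper)
    EV_TO_KCAL = 23.0605                 # kcal/mol per eV
    heat_of_atomization = 101.0          # kcal/mol, enthalpy of formation of Ce(g), approx.
    ip1, ip2, ip3 = 5.539, 10.85, 20.198 # eV, first three ionization energies of Ce, approx.

    eheat = heat_of_atomization + (ip1 + ip2 + ip3) * EV_TO_KCAL
    print(round(eheat, 1))               # ~944.7 kcal/mol, close to the value in Table 1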

3 Results and Discussion


As geometry accuracy measures, we used the average unsigned mean error for each
complex i, UMEi, defined as:


    \mathrm{UME}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} \left| R^{\mathrm{CSD}}_{i,j} - R^{\mathrm{calc}}_{i,j} \right|                    (1)

where ni is the number of ligand atoms directly coordinating the lanthanide ion. Two
cases have been examined: (i) UME(Ln-L)s involving the interatomic distances Rj
between the lanthanide central ion, Ln, and the atoms of the coordination polyhedron,
L, important to complex design; and (ii) UMEs of all the edges of the pyramids, that
is, of the interatomic distances Rj between the lanthanide central ion and the atoms of
the coordination polyhedron, as well as all the interatomic distances Rj between all
atoms of the coordination polyhedron. Table S1 of the supplementary material
presents the UME(Ce-L)s and UMEs for both Sparkle/PM3 and Sparkle/AM1 for the thirty-seven cerium(III) complexes.
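A UME of this kind is straightforward to compute once the crystallographic and calculated coordination distances are paired up. The sketch below (function, variable names and the sample distances are illustrative, not taken from the paper or from any Sparkle code) implements Eq. (1) for a single complex.

    from typing import Sequence

    def unsigned_mean_error(r_csd: Sequence[float], r_calc: Sequence[float]) -> float:
        """Eq. (1): mean absolute deviation between CSD and calculated distances (Angstrom)."""
        if len(r_csd) != len(r_calc) or not r_csd:
            raise ValueError("need two equally sized, non-empty distance lists")
        return sum(abs(a - b) for a, b in zip(r_csd, r_calc)) / len(r_csd)

    # Hypothetical Ce-L distances (Angstrom) for a nine-coordinate complex
    r_csd  = [2.45, 2.48, 2.52, 2.55, 2.57, 2.60, 2.63, 2.66, 2.70]
    r_calc = [2.41, 2.50, 2.49, 2.60, 2.52, 2.66, 2.58, 2.72, 2.74]
    print(round(unsigned_mean_error(r_csd, r_calc), 3))   # -> 0.044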
Table 2 presents unsigned mean errors for both Sparkle/PM3 and Sparkle/AM1 for
various types of distances in the Ce(III) complexes considered. Results indicate that
the two models are essentially equivalent. Distances between the cerium (III) ion and
its directly coordinated ligand atoms are predicted with higher accuracy than either
the distances between two Ce(III) ions in dilanthanide compounds, or the distances
between atoms of the faces of the coordination polyhedron. Luckily that is so, because
radial lanthanide ion-ligand atom distances are far more important for luminescent
complex design [18].
Table 2. Sparkle/AM1 and Sparkle/PM3 unsigned mean errors (Å) for all types of sets of distances involving the central cerium(III) ion, Ce, and the ligand atoms of the coordination polyhedron, L, for the thirty-seven Ce(III) complexes considered

    Distances                 Sparkle/AM1   Sparkle/PM3
    Ce-Ce                     0.212         0.212
    Ce-O                      0.081         0.078
    Ce-N                      0.073         0.067
    L-L                       0.208         0.190
    Ce-L and Ce-Ce            0.083         0.080
    Ce-L, Ce-Ce and L-L       0.174         0.155

Assuming that the sparkle model is a good representation of the lanthanide ion, as well as of its interactions with the ligands, the distribution of these UMEs should be random around a mean, whose value can be used as a measure of the accuracy of the model. Since the UMEs are positive, defined on the domain (0, ∞), they should follow the gamma distribution, which has the probability density function g(x; k, θ), where x > 0 stands for the UMEs, k > 0 is the shape parameter, and θ > 0 is the scale parameter of the gamma distribution. The expected value of the gamma distribution is simply kθ. The shape and scale parameters were estimated with the method of maximum likelihood in order to obtain the gamma distribution fit of the UME data. The quality of the gamma distribution fit can be assessed via the one-sample non-parametric Kolmogorov-Smirnov [19] test. For the hypothesis that the UME values follow a gamma distribution not to be rejected at the usual level of 5%, the p-value of the test statistic must thus be larger than 0.05.
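The same maximum-likelihood fit and Kolmogorov-Smirnov check can be reproduced with SciPy. A minimal sketch follows, assuming the UME values are already available as an array; the numbers below are made up for illustration and are not the paper's Table S1 data.

    import numpy as np
    from scipy import stats

    # Hypothetical UME(Ce-L) values in Angstrom (illustrative only)
    umes = np.array([0.05, 0.06, 0.07, 0.07, 0.08, 0.08, 0.09, 0.10, 0.11, 0.13])

    # Maximum-likelihood gamma fit; location fixed at zero since UMEs live on (0, inf)
    k, loc, theta = stats.gamma.fit(umes, floc=0.0)

    # One-sample Kolmogorov-Smirnov test against the fitted distribution
    ks_stat, p_value = stats.kstest(umes, "gamma", args=(k, loc, theta))

    print(f"shape k = {k:.2f}, scale theta = {theta:.4f}, mean = {k * theta:.3f}")
    print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")  # accept fit if p > 0.05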


Figure 1 presents a gamma distribution fit of the UME(Ce-L)s for Sparkle/PM3. Superimposed on the fit, a histogram of the data, with the number of bars chosen to best adjust the histogram to the curve obtained from the gamma distribution fit, is also presented, so that the reader can check the regions where the actual UMEs really occurred. The p-value of the gamma distribution fit for Sparkle/PM3 is 0.062, above the 0.05 value, thus attaching statistical significance to the fit and, by extension, to the Ce(III) Sparkle/PM3 model as well.
[Figure 1 is a histogram of the number of complexes versus UME(Ce-L) (Å, 0.00-0.50) for Ce(III)/Sparkle/PM3, with the fitted gamma density superimposed; annotated fit values: k = 5.61, θ = 0.0142, p-value = 0.062, mean = 0.080.]
Fig. 1. Probability densities of the gamma distribution fits of the UME(Ce-L)s for the Ce(III) Sparkle/PM3 model, superimposed on histograms of the same data for all 37 Ce(III) complexes considered; k is the shape parameter and θ is the scale parameter of the gamma distribution; the p-value is a measure of the significance of the gamma distribution fit; and the mean is the expected value of the fitted gamma distribution, which is set equal to the arithmetic mean value of the 37 UME(Ce-L)s.

Recently, an exhaustive study was carried out by our research group on the coordination polyhedron geometry prediction accuracies of ab initio effective core potential (ab initio/ECP) calculations [4]. The study consisted of full geometry optimization calculations on dozens of complexes of various lanthanide ions, the largest containing 164 atoms, varying both the basis set (STO-3G, 3-21G, 6-31G, 6-31G*, and 6-31+G) and the method (HF, B3LYP, and MP2 full). The notable conclusion was that RHF/STO-3G/ECP appears to be the most efficient model chemistry in terms of coordination polyhedron crystallographic geometry predictions from isolated lanthanide complex ion calculations. Contrary to what would normally be expected, either an increase in the basis set or the inclusion of electron correlation, or both, consistently enlarged the deviations and worsened the quality of the predicted coordination polyhedron geometries.


[Figure 2 is a bar chart of UME(Ce-L) (Å, 0.00-0.40) for six representative Ce(III) complexes (CSD codes AFURUO, ZUNMAW, HIXWEQ, FILKEQ, PEKWEH, TADKEP, covering β-diketone, nitrate, monodentate, bidentate, tridentate and polydentate ligands); average UME(Ce-L)s: Sparkle/PM3 0.067, Sparkle/AM1 0.072, RHF/STO-3G/ECP 0.072.]
Fig. 2. Unsigned mean errors, UME(Ce-L)s, involving only the interatomic distances Rj between the cerium central ion and the atoms of the coordination polyhedron (in Å), obtained from the Sparkle/PM3, Sparkle/AM1 and RHF/STO-3G/ECP ab initio calculations of the ground state geometries, for each of the six representative cerium(III) complexes, identified by their respective Cambridge Structural Database codes.
[Figure 3 is the analogous bar chart for the full-coordination-polyhedron UMEs (Å, 0.00-0.40) of the same six Ce(III) complexes; average UMEs: Sparkle/PM3 0.105, Sparkle/AM1 0.119, RHF/STO-3G/ECP 0.171.]
Fig. 3. Unsigned mean errors, UMEs (in Å), involving the interatomic distances Rj between the cerium central ion and the atoms of the coordination polyhedron, as well as the interatomic distances Rj between all atoms of the coordination polyhedron, obtained from the Sparkle/PM3, Sparkle/AM1 and RHF/STO-3G/ECP ab initio calculations of the ground state geometries, for each of the six representative cerium(III) complexes, identified by their respective Cambridge Structural Database codes.


For Ce(III) we chose six of these complexes to have their geometries fully optimized with the RHF/STO-3G/ECP model chemistry. The chosen complexes were selected to be representative of the various classes of ligands (β-diketones, nitrates, monodentates, bidentates, tridentates, and polydentates) present in the validation set (see Fig. S2 in the supplementary material).
Figures 2 and 3 present the average UME(Ce-L) and UME values for Sparkle/PM3, Sparkle/AM1 and RHF/STO-3G/ECP full geometry optimizations of the six complexes considered. Clearly, all three model chemistries are comparable, with Sparkle/PM3 being on average slightly superior to Sparkle/AM1, which is, in turn, superior to RHF/STO-3G/ECP for the prediction of the geometries of the whole coordination polyhedra.

4 Conclusion
The most accurate ab initio effective core potential full geometry optimization calculations that can nowadays be carried out on cerium(III) complexes, of a size large enough to be of relevance to present-day research, exhibit the same accuracy as either the Sparkle/PM3 or the Sparkle/AM1 model. Our results do indicate that the Sparkle model is an accurate and statistically valid tool for the prediction of coordination polyhedra of lanthanide complexes.
More importantly, the ability to perform a screening of many different putative structures of lanthanide complexes in a combinatorial manner, made possible by both Sparkle/PM3 and Sparkle/AM1, may prove to be of importance for complex design research.
Acknowledgments. We appreciate the support from CNPq (Brazilian agency), from the Instituto do Milênio de Materiais Complexos, and from the Cambridge Crystallographic Data Centre for the Cambridge Structural Database.
Supplementary Material Available: Instructions and examples on how to
implement the Ce(III) Sparkle/PM3 model in Mopac93r2. Parts of the codes of
subroutines Block.f, Calpar.f and Rotate.f that need to be changed, as well as their
modified versions for Ce(III). Examples of Mopac93r2 crystallographic geometry
input (.dat) and optimized geometry summary output (.arc) files from Sparkle/PM3
calculations for the Ce(III) complex GIFCUT10 and for the dicerium complex
XEXCUY. Tables of UME(Ce-L)s and UMEs for both Sparkle/PM3 and Sparkle/AM1
for Ce(III). Figure with gamma distribution fits of the UME data for both
Sparkle/PM3 and Sparkle/AM1 models.

References
1. Weber, M.J., Lecoq, P., Ruchti, R.C., Woody, C., Yen, W.M., Zhu, R.Y., Scintillator and
Phosphor Materials, Materials Research Society Symposium Proceedings Materials
Research Society, Pittsburgh (1994) Vol.348.
2. Freire, R.O., Rocha, G.B., Simas, A.M. Inorg Chem 44 (2005) 3299.


3. Dewar, M.J.S., Zoebisch, E.G., Healy, E.F., Stewart, J.J.P. J. Am. Chem. Soc. 107 (1985)
3902.
4. Freire, R.O., Rocha, G.B., Simas, A.M. J. Mol. Model, 12 (2006) 373.
5. Lima, P.P., Ferreira, R.A.S., Freire, R.O., Paz, F.A.A., Fu, L.S., Alves, S., Carlos, L.D.,
Malta, O.L. Chem Phys Chem 7 (2006) 735.
6. Pavithran, R., Kumar, N.S.S., Biju, S., Reddy, M.L.P., Junior, S.A., Freire, R.O. Inorg.
Chem. 45 (2006) 2184.
7. Faustino, W.M., Malta, O.L., Teotonio, E.E.S., Brito, H.F., Simas, A.M., de Sá, G.F. J. Phys. Chem. A 110 (2006) 2510.
8. dos Santos, E.R., dos Santos, M.A.C., Freire, R.O., Junior, S.A., Barreto, L.S., de
Mesquita, M.E. Chem. Phys. Lett. 418 (2006) 337.
9. Pavithran, R., Reddy, M.L.P., Junior, S.A., Freire, R.O., Rocha, G.B., Lima, P.P. Eur. J.
Inorg. Chem. 20 (2005) 4129.
10. Stewart, J.J.P. J. Comput. Chem. 10 (1989) 209.
11. Stewart, J.J.P. J. Comput. Chem. 10 (1989) 221.
12. Stewart, J.J.P. in: Encyclopedia of Computational Chemistry, P. v. R. Schleyer (editor-in-chief), John Wiley & Sons Ltd, Athens, USA (1998).
13. Stewart, J.J.P. J. Mol. Model, 10 (2006) 155.
14. Freire, R.O., do Monte, E.V., Rocha, G.B., Simas, A.M. J. Organomet. Chem. 691 (2006) 2584.
15. Allen, F.H. Acta Crystallogr. B, 58 (2002) 380.
16. Bruno, I.J., Cole, J.C., Edgington, P.R., Kessler, M., Macrae, C.F., McCabe, P., Pearson,
J., Taylor, R. Acta Crystallogr. B, 58 (2002) 389.
17. Allen, F.H., Motherwell, W.D.S. Acta Crystallogr. B, 58 (2002) 407.
18. de Sá, G.F., Malta, O.L., Donegá, C.M., Simas, A.M., Longo, R.L., Santa-Cruz, P.A., da Silva Jr., E.F. Coord. Chem. Rev. 196 (2000) 165.
19. William, J.C. Practical nonparametric statistics ed. John Wiley & Sons, New York.

The Design of Blue Emitting Materials Based on


Spirosilabifluorene Derivatives
Miao Sun, Ben Niu, and Jingping Zhang
Faculty of Chemistry, Northeast Normal University, Changchun 130024, China
zhangjp162@nenu.edu.cn

Abstract. Ground state geometries and electronic properties of four experimentally reported spirosilabifluorene derivatives are calculated by the HF(DFT)/6-31G* method. Their first excited state geometries are investigated using the CIS/6-31G* method. The absorption and emission spectra are evaluated by TD-B3LYP/6-31G* and show an excellent agreement with the experimental data. The "CH"/N substituted spirosilabifluorene derivatives are also investigated. Compared to the pristine molecule, no significant change of the emission wavelengths is found for the "CH"/N substituted derivatives. Furthermore, we find that the performance and the optical properties of these derivatives can be improved through oligomerization or addition of a phenyl group.
Keywords: spirosilabifluorene, TD-DFT, HF, emission, oscillator strength.

1 Introduction
Organic light-emitting diodes (OLEDs) have been the subject of intensive investigation since they were reported by Tang et al. [1] Major challenges remain, including the need to develop a blue electroluminescent emitter material, which is essential for the development of a full color display based on the color changing medium technology or on white emission. [2, 3] OLEDs based on silicon derivatives exhibit excellent performance, with external electroluminescence quantum efficiencies close to the theoretical limit for a fluorescent material. [4] Compared with spirobifluorene, spirosilabifluorene may have an improved morphological stability in films. [5] Such compounds have also been extensively studied as charge transport materials, and the frontier molecular orbitals give the main contribution to the charge-transfer interactions. [6, 7]
Recently, a series of aryl-substituted 9,9'-spiro-9-silabifluorene (SSF) derivatives, 2,2'-di-tert-butyl-7,7'-diphenyl-9,9'-spiro-9-silabifluorene (1a), 2,2'-di-tert-butyl-7,7'-di(pyridin-2-yl)-9,9'-spiro-9-silabifluorene (1b), 2,2'-di-tert-butyl-7,7'-di(biphenyl-4-yl)-9,9'-spiro-9-silabifluorene (1c), and 2,2'-di-tert-butyl-7,7'-bis(2,2'-bipyridin-6-yl)-9,9'-spiro-9-silabifluorene (1d), were prepared by Lee et al. [4] The absorption spectrum of each of the novel spiro-linked siloles was found to show a significant red shift relative to that of the corresponding carbon analogue. [8] The corresponding solid-state films were reported to exhibit intense violet-blue emission (λPL = 398-415 nm).


In this contribution, we have investigated the optical and electronic properties of the spirosilabifluorene derivatives (1a-1d) with the aim of giving a further explanation of the experimental results, and then of designing new photoluminescent materials based on selected approaches for spirosilabifluorene based materials. The "CH"/N substitution approach, which has been cited as an efficient approach for tuning the emitting color in the case of Mq3 (M = Al, [9, 10] Ga [9] and q = 8-hydroxyquinoline) based fluorescent materials, has not yet been exploited for spirosilabifluorene based materials. Thus, "CH"/N substituted spirosilabifluorene derivatives are investigated. With the aim of investigating the ending groups for enhancing the oscillator strength (f), we design derivatives with and without t-Bu and phenyl groups.

2 Computational Details
The structures of the spirosilabifluorene derivatives (1a-1d) and the designed derivatives are provided in Figure 1. For the sake of comparison, spirosilabifluorene (SSF) was used as the parent molecule. Two SSF derivatives with phenyl (1e) and t-Bu (1f) groups were chosen to investigate the ending group effect on f. The other substituted models used in our calculations, obtained by a systematic substitution of "CH" by N atoms at positions 4, 5, 6, and 7, were labeled as shown (see Figure 1(II)). The 4-, 5-, 6-, and 7-di-substituted compounds on both the A and B moieties were considered for 2a-2d, respectively. The 4,4'-, 5,5'-, 6,6'-, and 7,7'-substituted compounds (3a-3d) were considered for di-substitution on the same A moiety of SSF. Furthermore, the dimeric form of 3d is shown in Figure 1(III).

[Figure 1 sketches the molecular structures: (I) Rx substitution on the SSF skeleton defines 1a-1f (1a: R1 = R3 = aryl, R2 = R4 = t-Bu; 1e: R1 = R3 = aryl, R2 = R4 = H; 1f: t-Bu groups only; 1b, 1c and 1d carry the pyridyl, biphenyl and bipyridyl substituents, respectively) and SSF itself (R1 = R2 = R3 = R4 = H); (II) "CH"/N substitution defines 2a-2d (X4 = N, X5 = N, X6 = N, X7 = N on both the A and B moieties) and 3a-3d (X4 = X4' = N, X5 = X5' = N, X6 = X6' = N, X7 = X7' = N on the A moiety); (III) the dimer of 3d.]
Fig. 1. Molecular structures of the investigated derivatives (t-Bu = tert-butyl)


All the geometry optimizations for the ground state (S0) were performed using the ab initio Hartree-Fock (HF) [11] and B3LYP [12] methods with the 6-31G* [13] basis set. The low-lying excited-state (S1) structures were then optimized using configuration interaction with single excitations (CIS) [14] with the 6-31G* basis set, and the optical spectra were calculated using TD-B3LYP [15] with the 6-31G* basis set to obtain estimates including some account of electron correlation. All the calculations were performed using the Gaussian 03 package. [16]
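For readers who wish to reproduce this kind of workflow, the sketch below writes a pair of Gaussian-style input decks for one molecule (a ground-state B3LYP/6-31G* optimization and a TD-B3LYP/6-31G* vertical-spectrum job). The file names, charge/multiplicity and the placeholder geometry are illustrative assumptions, not data from the paper, and the route lines should be checked against the Gaussian 03 documentation before use.

    def gaussian_input(route: str, title: str, charge: int, mult: int, atoms) -> str:
        """Assemble a minimal Gaussian-style input deck (blank-line layout per the manual)."""
        coords = "\n".join(f"{sym:2s} {x:12.6f} {y:12.6f} {z:12.6f}" for sym, x, y, z in atoms)
        return f"{route}\n\n{title}\n\n{charge} {mult}\n{coords}\n\n"

    # Placeholder geometry (illustrative only, NOT an optimized SSF structure)
    molecule = [("Si", 0.0, 0.0, 0.0), ("C", 1.87, 0.0, 0.0), ("C", -1.87, 0.0, 0.0)]

    with open("ssf_opt.gjf", "w") as f:   # S0 optimization
        f.write(gaussian_input("# opt b3lyp/6-31g*", "SSF ground-state optimization", 0, 1, molecule))

    with open("ssf_td.gjf", "w") as f:    # vertical absorption at the optimized geometry
        f.write(gaussian_input("# td(nstates=3) b3lyp/6-31g*", "SSF TD-B3LYP spectrum", 0, 1, molecule))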

3 Results and Discussions


3.1 Comparison of Computational Methods and Optical Characteristics for
1a-1d
3.1.1 Geometries of Ground States
More detailed insights into the nature of the excited states responsible for the optical properties can be gained by looking at the computed frontier molecular orbitals (FMOs), when they are calculated on a reliable molecular geometry. The optimized structures for 1a-1d show that the bond lengths and bond angles do not vary appreciably between the different calculation methods. These derivatives keep their mutually perpendicular π-systems, and the torsion angles within the silafluorene fragment are less than 1°. Apparently, the major differences are related to the torsion angles between the substituents and the silafluorene fragments, according to the HF and B3LYP results. The dihedral angles between the silafluorene fragment and the phenyl/pyridyl substituents are 44°/26° (1a/1b) at the HF level, and 37°/17° (1a/1b) at the B3LYP level. This indicates that the hydrogen-free pyridine ring twists less than the phenyl group. Note that the lowest energy conformer of bipyridyl has the nitrogen atoms in a trans conformation. The dihedral angles between the outer and inner aryl rings of the biphenyl and bipyridyl substituents are -45° (1c) and 4° (1d) at the HF level, and -37° (1c) and 3° (1d) at the B3LYP level. Therefore, the twisting in 1b and 1d is reduced as compared with 1a and 1c.
3.1.2 Absorption Properties
To investigate the effect of electron correlation on the computed energies and spectra, calculations using the TD-B3LYP method are carried out on both the HF/6-31G* and B3LYP/6-31G* optimized S0 geometries. For the sake of comparison, and to estimate the accuracy of the theoretical level applied in this work, we list both the experimental [4] and theoretical [17] maximum absorption wavelengths (λabs) for 1a-1d in Table 1. From these results, one may find that the calculated λabs values at the TD-B3LYP//HF/6-31G* level are in excellent agreement with the experimental results, with less than 7 nm of deviation. This gives credit to the computational approach, and reliable geometries may be predicted at this level, which can therefore be applied to the systems under investigation. Our results indicate that λabs does not change significantly in 1a-1d (it stays within the range 314-322 nm), with only a slight blue shift found for 1a, while 1b-1d are nearly identical.


Table 1. Computed absorption and emission wavelengths (λabs and λem, nm), oscillator strengths (f), assignments and coefficients

    Absorption   Calc(a)  f(a)    Assignment   Coefficient  Exp(c)  Calc(b)  f(b)    Calc(d)
    1a           314      0.200   HOMO→LUMO    0.65         307     328      0.200   294
    1b           322      0.221   HOMO→LUMO    0.67         317     337      0.180   309
    1c           320      0.420   HOMO→LUMO    0.64         315     338      0.474   303
    1d           322      0.267   HOMO→LUMO    0.66                 337      0.270   319
    1e           312      0.137   HOMO→LUMO    0.66
    1f           299      0.059   HOMO→LUMO    0.63

    Emission     Calc(a)  f(a)    Assignment   Coefficient  Exp(c)
    1a           385      0.473   HOMO→LUMO    0.65         383
    1b           387      0.585   HOMO→LUMO    0.64         382
    1c           397      0.939   HOMO→LUMO    0.67         393
    1d           389      0.658   HOMO→LUMO    0.69
    1e           382      0.375   HOMO→LUMO    0.63
    1f           364      0.147   HOMO→LUMO    0.64

    (a) HF/6-31G* optimized S0 geometries; (b) B3LYP/6-31G* optimized S0 geometries;
    (c) experimental data from ref. 4; (d) theoretical results from ref. 17.

3.1.3 Excited State Structures
Compared with the ground state structures, some of the bonds in the excited state equilibrium geometries are lengthened, while others are shortened. All of the significant changes are located on one of the two identical (A or B) moieties, the other one remaining practically unaffected. The central silole ring is revealed to be the most distorted, with the C2-C3 and C2'-C3' bond lengths lengthened by ca. 0.06 Å upon excitation. The S1-optimized C3-C3' bond lengths, in the range from 1.40 to 1.41 Å, are found to be shortened by ca. 0.09 Å compared with their corresponding S0 values. The dihedral angles between the silafluorene fragment and the phenyl/biphenyl rings decrease for 1a and 1c in the S1 states. Owing to the very small torsion angles (below 1°) between the ortho-hydrogen-free pyridine and the silafluorene fragment for 1b and 1d in the S1 states, effectively coplanar configurations allow better orbital overlap. Moreover, these molecules maintain their two mutually perpendicular π systems in the S1 states.
3.1.4 Emission Properties
In Table 1 we summarize the calculated first singlet excitation energies with the corresponding oscillator strengths at the TD-B3LYP//CIS/6-31G* level. The results show that, compared to 1a and 1b, λem of 1c is red-shifted by 10 nm, with an intense violet-blue emission wavelength at 397 nm, in excellent agreement with the experimental data. The energy gaps of 1a, 1b, and 1d do not change significantly (they remain within the range 3.60-3.55 eV), compared with that of 1c (3.48 eV). This is consistent with the slight red shift of 1c and the nearly identical wavelengths of 1a, 1b and 1d. The oscillator strength of 1c is about twice as large as those of the others. This may be due to the efficient conjugation from the external phenyl group in S1, obtained by decreasing the torsional angle between the external and inner phenyl rings, which corresponds to some amount of orbital distribution on the external ring (see Figure 2). It also leads to a


red-shifted wavelength for 1c. On the other hand, the pyridine and bipyridyl rings in 1b and 1d have no significant influence on λem compared with 1a, which is consistent with the FMO patterns in S1 for 1a, 1b, and 1d: the FMOs are mainly located on the silafluorene fragments and their adjacent aryl rings, with the external aryl rings hardly contributing to them (see Figure 2).

Fig. 2. TD-B3LYP/6-31G* electronic density contours of the orbitals involved in transitions for
1a-1d

3.1.5 Oscillator Strength
To investigate the role of the aryl substituent, we carried out the same calculations for SSF, 1e, and 1f. From the results listed in Table 1, one can find that the predicted f values are in the increasing order 1f < 1e < 1a for both absorption and emission. The oscillator strengths of 1e (with the bulky aromatic substituent) are close to those of 1a (with both the aryl and t-Bu substituents), in contrast with the t-Bu substituted derivative 1f. Thus, it can be concluded that the phenyl group plays a very important role in enlarging the oscillator strength.
3.2 Optimized Geometries and Optical Properties of the "CH"/N Substituted Derivatives
3.2.1 Optimized Geometries of the S0 and S1 States
In general, the optimization results in S0 do not reveal any significant change in the geometry of the skeleton of the parent molecule (SSF) when "CH" is substituted by N. In all cases, the silafluorene fragments maintain their mutually perpendicular π systems. Compared with the skeleton of SSF, the bond lengths and bond angles change mainly at the nitrogen and its neighboring carbon atoms. The C-N bond lengths are predicted to be ca. 1.321 Å, and the C-C-N and N-C-C angles are larger (by ca. 1.32-5.40°) than the corresponding C-C-C angle in SSF. However, the C-N-C bond angles become smaller (by ca. 2.76-4.48°) than the corresponding C-C-C ones in SSF. The optimization results for the S1 states show the same feature as for 1a-1d: upon the S0-S1 transition, the main change is localized on one of the two identical (A or B) moieties, the other parts remaining practically unaffected.


3.2.2 Emission Properties


Figure 3 shows the electronic density contours of the FMOs for SSF, 2a, and 3a in the S1 states, as representatives of the systems under investigation. Both the HOMO and the LUMO spread over the π-conjugated backbone, as those of 1a-1d do. The HOMOs and LUMOs are, respectively, antibonding and bonding between the carbon atoms bridging the two phenyl groups within the silafluorene fragment. The same bonding-antibonding pattern is also observed between adjacent atoms in the same aryl ring.

Fig. 3. TD-B3LYP/6-31G* electronic density contours of the LUMO (upper panel) and HOMO
(lower panel) for spirosilabifluorene, 2a and 3a

The calculated emission wavelengths (λem), oscillator strengths (f), transitions, corresponding assignments, and the main CI expansion coefficients of the derivatives of interest are listed in Table 2. For all "CH"/N substituted derivatives, the emitting states are mainly of LUMO→HOMO character. As there is only a slight change in the energy gaps (Eg) between HOMO and LUMO for the investigated systems, only a slight change is also found for λem, the maximum change being 13 nm (in 2c and 3d).
Table 2. Computed emission wavelengths (λem), oscillator strengths (f), transitions, assignments and coefficients

         λem (nm)   f       Transition   Assignment   Coefficient
    2a   357        0.093   S1→S0        HOMO→LUMO    0.64
    2b   367        0.097   S1→S0        HOMO→LUMO    0.64
    2c   347        0.090   S1→S0        HOMO→LUMO    0.64
    2d   361        0.070   S1→S0        HOMO→LUMO    0.63
    3a   362        0.100   S1→S0        HOMO→LUMO    0.64
    3b   363        0.100   S1→S0        HOMO→LUMO    0.64
    3c   363        0.100   S2→S0        HOMO→LUMO    0.64
    3d   373        0.012   S2→S0        HOMO→LUMO    0.68

3.3 Optical Properties of the Dimer
It is believed that a larger f can also be reached in the case of oligomers. Moreover, both the absorption and emission wavelengths are red-shifted in the case of oligomers. [18] In this work, we considered the dimeric form of 3d linked through positions 6 and 6' (shown in Figure 1(III)). On the basis of the same approach, the λabs values for the first three states are predicted to be 346 nm (f = 0.0000), 346 nm (f = 0.0072) and 329 nm (f = 0.6828), respectively. In comparison, the λabs values of 3d corresponding to the first three states are

328 nm (f = 0.000), 323 nm (f = 0.000) and 307 nm (f = 0.000). It thus becomes clear that both λabs and f are increased in the dimeric form. The λem values for the first three excited states are predicted to be 405 nm (f = 1.184), 395 nm (f = 0.000) and 395 nm (f = 0.008), respectively, while those of 3d are 380 nm (f = 0.000), 373 nm (f = 0.012) and 353 nm (f = 0.042). It can be seen that both λabs/λem and f are increased in the dimer form. In general, larger oscillator strengths correspond to more intense fluorescence emission; therefore, this approach may contribute to improving the properties of emitting materials.
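To connect the computed oscillator strengths with fluorescence intensity in a rough, order-of-magnitude way, one can estimate a radiative lifetime from f and the emission energy via the standard two-level relation f ≈ 1.499 A/ν̃² (A in s⁻¹, ν̃ in cm⁻¹). The sketch below uses the dimer values quoted above; it ignores vibrational structure and environment effects and is illustrative only, not part of the paper's analysis.

    # Rough radiative-lifetime estimate from an oscillator strength.
    def radiative_lifetime_ns(wavelength_nm: float, f: float) -> float:
        nu_tilde = 1.0e7 / wavelength_nm     # emission energy in cm^-1
        a_coeff = f * nu_tilde**2 / 1.499    # spontaneous-emission rate, s^-1
        return 1.0e9 / a_coeff               # lifetime in ns

    print(round(radiative_lifetime_ns(405.0, 1.184), 2))   # dimer S1 emission -> ~2.1 ns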

4 Conclusions
Calculations have been performed on a novel class of spirosilabifluorenes and some of their derivatives. Some light is shed on a series of experimentally available derivatives and on "CH"/N substituted derivatives. Our calculation results for 1a-1d are in good agreement with the experimental results. The similar optical properties of 1a-1d are ascribed to the identical FMO distribution pattern and similar energy gaps between the FMOs. Compared to the pristine molecule, no significant change of the emission wavelengths is found for the "CH"/N substituted derivatives. According to our results, phenyl groups and oligomerization may play a very important role in enlarging the oscillator strengths, which will increase the fluorescence intensity. Thus, the emitting properties may be improved by adding aryl substituents or by oligomerization. Furthermore, the dimeric forms of the designed "CH"/N substituted derivatives are expected to be promising candidates for blue emitting materials.
Acknowledgment. Financial supports from the NSFC (No.50473032) and Outstanding
Youth Project of Jilin Province are gratefully acknowledged.

References
1. Tang, C. W., Van Slyke, S. A.: Organic electroluminescent diodes. Appl. Phys. Lett.
51(1987) 913-915.
2. Kulkarni, A. P., Tonzola, C. J., Babel, A., Jenekhe, S. A. Electron Transport Materials for
Organic Light-Emitting Diodes. Chem. Mater. 16 (2004) 4556-4573.
3. Tu, G., Zhou, Q., Cheng, Y., Wang, L.X.: White electroluminescence from polyfluorene chemically doped with 1,8-naphthalimide moieties. Appl. Phys. Lett. 85 (2004) 2172-2174.
4. Lee, S. H., Jang, B. B., Kafafi, Z. H.: Highly Fluorescent Solid-State Asymmetric Spirosilabifluorene Derivatives. J. Am. Chem. Soc. 127 (2005) 9071-9078.
5. Ohshita, J., Lee, K. H., Hamamoto, D., Kunugi, Y., Ikadai, J., Kwak, Y. W.: Synthesis of
Novel Spiro-condensed Dithienosiloles and the Application to Organic FET. Chem. Lett.
33(2004) 892-893.
6. Schneider, D., Rade, T., Riedl, T., Dobbertin, T., Werner, O.: Deep blue widely tunable
organic solid-state laser based on a spirobifluorene derivative. Appl. Phys. Lett. 84(2004),
4693-4695.
7. Yin, S. W., Yi, Y.P., Li, Q. X., Yu, G., Liu, Y. Q., Shuai, Z. G.: Balanced Carrier Transports
of Electrons and Holes in Silole-Based Compounds-A Theoretical Study. J. Phys. Chem. A.
110(2006) 7138-7143.


8. Luke, V., Plszegir, T., Milota, F., Sperling, J., Kauffmann, H. F.: Density Matrix Analysis,
Simulation, and Measurements of Electronic Absorption and Fluorescence Spectra of
Spirobifluorenes. J. Phys. Chem. A. 110(2006) 1775-1782.
9. Gahungu, G., Zhang, J. P.: CH/N substituted mer-Gaq3 and mer-Alq3 derivatives; an
effective approach for the tuning of emitting color. J. Phys. Chem. B. 109 (2005)
17762-17767.
10. Van Slyke, S. A., Bryan, P. S., Lovencchio, F. V.; US Patent 1990, 5, 150, 006.
11. McWeeny, R., Diercksen, G.: Self-Consistent Perturbation Theory. II. Extension to Open Shells. J. Chem. Phys. 49 (1968) 4852-4856.
12. Becke, A. D.: Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98 (1993) 5648-5652.
13. Binkley, J. S., Pople J. A., Hehre, W. J.:Self-Consistent Molecular Orbital Theory for
Excited States. J. Am. Chem. Soc. 102 (1980) 939-947.
14. Foresman, J. B., Head-Gordon, M., Pople, J. A., Frisch, M. J.: Toward a Systematic
Molecular Orbital Theory for Excited States. J. Phys. Chem. 96 (1992) 135-149.
15. Stratmann, R. E., Scuseria, G. E., Frisch, M. J.: An efficient implementation of
time-dependent density-functional theory for the calculation of excitation energies of large
molecules. J. Chem. Phys. 109(1998) 8218-8224.
16. Frisch, M. J., et al. Gaussian, Inc., Pittsburgh PA, 2003. Gaussian 03, Revision B.03.
17. Yang, G. C., Su, Z. M., Qin, C. S.: Theoretical Study on the Second-Order Nonlinear Optical
Properties of Asymmetric Spirosilabifluorene Derivatives. J. Phys. Chem. A. 110 (2006)
4817-4821.
18. Yang, L., Feng, J. K., Ren, A. M.: Theoretical Study on Electronic Structure and Optical
Properties of Phenothiazine-Containing Conjugated Oligomers and Polymers. J. Org. Chem.
70 (2005) 5987-5996.

Regulative Effect of Water Molecules on the Switches of


Guanine-Cytosine (GC) Watson-Crick Pair
Hongqi Ai1, Xian Peng1, Yun Li1, and Chong Zhang2
1

School of Chemistry and Chemical Engineering, University of Jinan,


Jinan, 250022, P.R. China
Chm_aihq@ujn.edu.cn
2
Department of Chemistry and Technology, Liaocheng University,
Liaocheng, 252059, P.R. China
Zhangchong@sdu.edu.cn

Abstract. To regulate the switch effect induced by NH- anion substitution at the H8 site of the guanine-cytosine (GC) Watson-Crick pair, and to investigate the actual state in solution, GNH-C derivatives with 1-9 attached water molecules are probed in the present paper. Results show that the switch effect, i.e., the deformation of the hydrogen bonding between the GNH- and C pair, is reduced stepwise as the number of discrete water molecules around the NH- group increases. Thus one can control the switch action of GNH-C by regulating the number of water molecules around the NH- group.

1 Introduction
Hydrogen bonding in DNA base pairs plays a vital role, and it has therefore long been a subject of study for theoretical chemists working on biological systems [1-13]. These extensively studied hydrogen bonds (H-bonds) are mainly located between guanine and cytosine (GC pair) [6-8,11-13] and between adenine and thymine (AT pair) [1, 4, 12]. Recently, the Bickelhaupt group [1] reported the interesting observation that, when the H8 site of the guanine species or the H6 site of the cytosine species in the GC pair is substituted by an anion or cation, the H-bonds between G and C deform and take on a switch action. Similar actions are also reported in other works [6-8, 11-13]. The action can be described as follows from the two models in Fig. 1. The O6···H4'-N4' distance shortens, while the N1-H1···N3' and N2-H2···O2' distances, especially the latter, are elongated when H8 is substituted by an anion. On the other hand, the change trend of the above three distances reverses when the H6 of cytosine is substituted by a cation (see GC in Fig. 1). This action looks like a switch.
In the present paper, we probe how to regulate the switch by attaching water molecules stepwise around the substituted anion, based on one of the Bickelhaupt et al. models, NH- substituted GC (GNH-C). The optimized structure of GNH-C is also displayed in Fig. 1.
The geometry optimizations and frequency calculations of all these hydrated complexes (1-9 hydrates) are performed at the B3LYP/6-31G* [14] level. The serial numbers of the atoms in GC and GNH-C are shown in Fig. 1. All calculations are carried out with the GAUSSIAN 03 package of programs, Revision C.02 [15].


[Figure 1 shows the optimized structures of GC and GNH-C with the atom numbering used in the text (unprimed labels on the guanine moiety, primed labels on the cytosine moiety; in GNH-C the H8 position carries the NH- group).]
Fig. 1. B3LYP/6-31G*-optimized GC and GNH-C

2 Results and Discussions


Table 1 lists the distances of the three H-bonds, O6···H4'-N4', N1-H1···N3' and N2-H2···O2', of the optimized GC and GNH-C in the gas phase. The results reveal that the H-bonding distances are in good agreement with those of Ref. [1], although the Ref. [1] values, obtained at the BP86/TZ2P level, are somewhat smaller than our results due to the difference in calculation methods.
Table 1. The distances of the hydrogen bonds in GC and GNH-C (in Å)

    Base pair   O6···H4'-N4'      N1-H1···N3'       N2-H2···O2'
    GC          2.818, 2.73(a)    2.951, 2.88(a)    2.934, 2.87(a)
    GNH-C       2.660, 2.60(a)    3.001, 2.90(a)    3.174, 3.00(a)

    (a) Values in italics are from ref. [1].

2.1 Hydration Schemes of GNH-C and the Regulation Effect of Hydration
The switch action of GNH-C derives from the electronic effect of the substituted anion [1-2,6-8]; thus hydration around the anion site (the NH- group) can effectively regulate the action of the corresponding complex and is therefore selected here. The essential principle for attaching these water molecules is to keep a balance between low-energy minima of the hydrated complexes and a larger recovery of the three H-bond distances. The structural coordinates and relative energies of each hydrate series for the (1-9) hydrated complexes are given in the supporting information (SP).
The change trend of the three H-bonds with the number of attached water molecules is displayed in Fig. 2. Note that the distances displayed in the figure are those of the most stable complex GNH-CnW1 of each hydrate series. The distances of the three H-bonds of GC and GNH-C, the two non-hydrated complexes, are shown on a vertical line in the figure and correspond to the 0 position of the horizontal axis. For each of the three plots, there are two points on the vertical line corresponding to the 0 position. The top point of the black plot corresponds to the O6···H4'-N4' distance in GC, while the bottom point corresponds to the same distance in GNH-C. Thus the separation between the two points indicates the shortening of the O6···H4'-N4' distance of GC induced by the NH- anion. The following points reveal that this shortening is recovered by gradually increased hydration. The top points at the 0 position


in the red and green plots, respectively, correspond to the distances N1-H1···N3' and N2-H2···O2' of GNH-C, while the two bottom points correspond to the same distances in GC. Thus the differences within the same plot indicate the elongation of the distance N1-H1···N3' or N2-H2···O2' induced by the anion in GC. Stepwise hydration also makes these differences decrease, and the octahydrated case appears to reach the optimal recovery.
[Plot of the three H-bonding distances O6⋯H4'-N4', N1-H1⋯N3' and N2-H2⋯O2' (2.60-3.20 Å) against the number of attached water molecules]

Fig. 2. The change of the three H-bonding distances with the number of attached water molecules

3 Conclusions
GNH-C complexes bound with 1-9 water molecules are probed at the B3LYP/6-31G* level.
The results reveal that the regulation effect of local hydration is obvious. At 8 water
molecules the regulation effect becomes optimal, although the deformation of the H-bonds has
not completely recovered to the state of neutral GC. The regulation effect improves
gradually as the number of locally (anion-site) hydrating water molecules increases. This trend
is clear and can be observed directly in Fig. 2, even though more stable hydrated structures
may have been missed for each hydrate series discussed in the paper.
Acknowledgement. This work is supported by the NSFC (20573047) and the doctoral
start-up foundation of Jinan University (B0418).

References
1. Guerra, C. F., van der Wijst, T., Bickelhaupt, F. M.: Supramolecular Switches Based on the
Guanine-Cytosine (GC) Watson-Crick Pair: Effect of Neutral and Ionic Substituents. Chem.
Eur. J. 12 (2006) 3032-3042.
2. Meng, F., Wang, H., Xu, W., Liu, C.: Theoretical Study of GC+/GC Base Pair Derivatives.
Chem. Phys. 308 (2005) 117-123.
3. Hunter, K. C., Wetmore, S. D.: Hydrogen-bonding Between Cytosine and Water: Computational Evidence for a Ring-opened Complex. Chem. Phys. Lett. 422 (2006) 500 - 506.
4. Zendlová, L., Hobza, P., Kabeláč, M.: Potential Energy Surfaces of the Microhydrated
Guanine-Cytosine Base Pair and its Methylated Analogue. ChemPhysChem. 7 (2006) 439-447.


5. Mirzaei, M., Elmi, F., Hadipour, N. L.: A Systematic Investigation of Hydrogen-Bonding


Effects on the 17O, 14N, and 2H Nuclear Quadrupole Resonance Parameters of Anhydrous
and Monohydrated Cytosine Crystalline Structures: A Density Functional Theory Study. J.
Phys. Chem. B 110 (2006) 10991-10996.
6. Chen, E. S., Chen, E. C.M.: A Proposed Model for Electron Conduction in DNA Based upon
Pairwise Anion Stacking: Electron Affinities and Ionization Potentials of the Hydrogen
Bonded Base Pairs. Bioelectrochemistry and Bioenergetics 46 (1998) 15-19.
7. Richardson, N. A., Wesolowski, S. S. Schaefer, III, H. F.: Electron Affinity of the
Guanine-Cytosine Base Pair and Structural Perturbations upon Anion Formation. J. Am.
Chem. Soc. 124 (2002) 10163-10170.
8. Li, X., Cai, Z., Sevilla, M. D.: Investigation of Proton Transfer within DNA Base Pair Anion and
Cation Radicals by Density Functional Theory (DFT). J. Phys. Chem. B 105 (2001) 10115-10123.
9. Chen, E. C. M., Chen, E. S.: Negative Ion Mass Spectra, Electron Affinities, Gas Phase
Acidities, Bond Dissociation Energies, and Negative Ion States of Cytosine and Thymine.
J. Phys. Chem. B 104 (2000) 7835-7844.
10. Schiedt, J., Weinkauf, R., Neumark, D.M., Schlag, E.W.: Anion Spectroscopy of Uracil,
Thymine and the Amino-oxo and Amino-hydroxy Tautomers of Cytosine and Their Water
Clusters. Chem. Phys. 239 (1998) 511-524.
11. Gu, J., Wang, J., Leszczynski, J.: H-Bonding Patterns in the Platinated Guanine-Cytosine
Base Pair and Guanine-Cytosine-Guanine-Cytosine Base Tetrad: an Electron Density
Deformation Analysis and AIM Study. J. Am. Chem. Soc. 126 (2004) 12651-12660.
12. Sponer, J., Sabat, M., Burda, J. V., Leszczynski, J., Hobza, P.: Interaction of the
Adenine-Thymine Watson-Crick and Adenine-Adenine Reverse-Hoogsteen DNA Base
Pairs with Hydrated Group IIa (Mg2+, Ca2+, Sr2+, Ba2+) and IIb (Zn2+, Cd2+, Hg2+) Metal
Cations: Absence of the Base Pair Stabilization by Metal-Induced Polarization Effects. J.
Phys. Chem. B 103 (1999) 2528-2534.
13. Muñoz, J., Sponer, J., Hobza, P., Orozco, M., Luque, F. J.: Interactions of Hydrated Mg2+
Cation with Bases, Base Pairs, and Nucleotides. Electron Topology, Natural Bond Orbital,
Electrostatic, and Vibrational Study. J. Phys. Chem. B 105 (2001) 6051-6060.
14. (a) Becke, A. D.: Density-Functional Thermochemistry. III. The Role of Exact Exchange. J.
Chem. Phys. 78 (1993) 5648-5652. (b) Becke, A. D.: Density-Functional Exchange-Energy
Approximation with Correct Asymptotic Behavior. Phys. Rev. A 38 (1988) 3098-3100. (c)
Lee, C., Yang, W., Parr, R.G.: Development of the Colle-Salvetti Correlation-Energy
Formula into a Functional of the Electron Density. Phys. Rev. B. 37 (1988) 785-789.
15. Gaussian 03, Revision C.02, Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E.,
Robb, M. A., Cheeseman, J. R., Montgomery, J. A., Vreven, Jr., T., Kudin, K. N., Burant, J. C.,
Millam, J. M., Iyengar, S. S., Tomasi, J., Barone, V., Mennucci, B., Cossi, M., Scalmani, G.,
Rega, N., Petersson, G. A., Nakatsuji, H., Hada, M., Ehara, M., Toyota, K., Fukuda, R.,
Hasegawa, J., Ishida, M., Nakajima, T., Honda, Y., Kitao, O., Nakai, H., Klene, M., Li, X.,
Knox, J. E., Hratchian, H. P., Cross, J. B., Adamo, C., Jaramillo, J., Gomperts, R., Stratmann, R.
E., Yazyev, O., Austin, A. J., Cammi, R., Pomelli, C., Ochterski, J. W., Ayala, P. Y.,
Morokuma, K., Voth, G. A., Salvador, P., Dannenberg, J. J., Zakrzewski, V. G.., Dapprich, S.,
Daniels, A. D., Strain, M. C., Farkas, O., Malick, D. K., Rabuck, A. D., Raghavachari, K.,
Foresman, J. B., Ortiz, J. V., Cui, Q., Baboul, A. G., Clifford, S., Cioslowski, J., Stefanov, B. B.,
Liu, G., Liashenko, A., Piskorz, P., Komaromi, I., Martin, R. L., Fox, D. J., Keith, T.,
Al-Laham, M. A., Peng, C. Y., Nanayakkara, A., Challacombe, M., Gill, P. M. W., Johnson, B.,
Chen, W., Wong, M. W., Gonzalez, C., and Pople,J. A. Gaussian, Inc., Wallingford CT, 2004.

Energy Partitioning Analysis of the Chemical Bonds in
mer-Mq3 (M = Al(III), Ga(III), In(III), Tl(III))

Ruihai Cui1,2 and Jingping Zhang3

1 Department of Applied Chemistry, Harbin Institute of Technology, Harbin 150001, China
2 Department of Chemistry, Harbin University, Harbin 150080, China
3 Faculty of Chemistry, Northeast Normal University, Changchun 130024, China
zhangjp162@nenu.edu.cn

Abstract. Geometries of the ground states of mer-tris(8-hydroxyquinolinato)metal complexes (Mq3, M = Al3+, Ga3+, In3+, Tl3+) are optimized by the B3LYP/6-31G(d)
method. The bonding interactions between the metal fragment Mq2 and each
single ligand q have been analyzed with the energy decomposition scheme. The
calculated results suggest that the HOMO and LUMO distribution fashion can
be traced back to, respectively, the lowest electrostatic attraction energy between
the A quinolate ligand and Mq2 and the highest orbital interaction energy between
the B ligand and Mq2.
Keywords: energy partitioning analysis, Mq3, FMO distribution fashion.

1 Introduction
Tris(8-hydroxyquinolinato)aluminum, Alq3, is the milestone for the development of
the organic light-emitting diode (OLED) and was used in the first OLED [1]. The
majority of work on the molecular properties of metal quinolates carried out thus far
has concerned the ground-state (S0) [2-5] and first-excited-state (S1) [6,7] characteristics of
Alq3. Mq3 has two geometric isomers, the facial (fac-Mq3) and meridional (mer-Mq3)
forms. The energy partitioning analysis (EPA) method has recently been used in
systematic investigations of the nature of the chemical bond in main-group and
transition metal compounds [8]. In order to explore the differences among the
individual ligands from the chemical-bonding point of view, the energy partitioning method
has been applied to the series of complexes mer-Mq3 shown in Fig. 1, continuing our
previous studies [9,10].

2 Methodology
The calculations for mer-Mq3 (Fig. 1) described here were carried out by the same
methods as described previously [9,10]. The structure of mer-Mq3 in the S0 state was
optimized using the B3LYP functional with the 6-31G(d) basis set for C, H, O, N, Al(III)
and Ga(III), and the Stuttgart RLC ECP for In(III) and Tl(III). In order to explore
the differences among the individual ligands from the chemical-bonding point of view,
EPA was applied to the series of complexes.

Fig. 1. The geometry of mer-Mq3 with labels A-C for three q ligands (M=Al, Ga, In, Tl)

3 Results and Discussion

3.1 The Optimized Structure Analyses of S0 for mer-Mq3

The optimized structures of these complexes are listed in Table 1 (for the Al and Ga
complexes see our previous results [9]). For each complex, the O-M bond length in
ligand A is shorter than those in ligands B and C. The O-M bond lengths are shorter
than the N-M ones and increase with atomic number as O-Al < O-Ga < O-In < O-Tl.
The O-M-O and N-M-N bond angles decrease with atomic number as
O(N)-Al-O(N) > O(N)-Ga-O(N) > O(N)-In-O(N) > O(N)-Tl-O(N).
Table 1. Selected optimized bond lengths (Å) and bond angles (°) for mer-Mq3 (M = In(III), Tl(III))
by B3LYP/6-31G(d)

           Inq3-A   Inq3-B   Inq3-C   Tlq3-A   Tlq3-B   Tlq3-C
O-M        2.063    2.080    2.090    2.156    2.180    2.190
N-M        2.250    2.270    2.220    2.400    2.400    2.340
Oa-M-Nb             166.68                     162.11
Ob-M-Oc             157.08                     151.69
Na-M-Nc             166.30                     162.72
Oa-M-Na             77.61                      74.26
Ob-M-Nb             76.39                      73.48
Oc-M-Nc             77.32                      74.69

3.2 The Frontier Molecular Orbitals (FMO) Analyses of S0 for mer-Mq3

The FMO plots of mer-Mq3 are shown in Figure 2 (for the Al and Ga complexes see our
previous results [9]). For each complex, the HOMO is mainly localized on one ligand
(labeled A), while the two next occupied orbitals are predominantly localized on the
other ligands (C and B for HOMO-1 and HOMO-2, respectively). The lowest unoccupied
molecular orbital (LUMO) is found to be mainly localized on ligand B; the two others
(LUMO+1 and LUMO+2) are mainly localized on the other two ligands (C and A),
respectively, for Alq3 and Tlq3, while they are almost uniformly localized on the same
two ligands (C and A) for Gaq3 and Inq3.
[Orbital isosurface plots: HOMO-2, HOMO-1, HOMO, LUMO, LUMO+1 and LUMO+2 for mer-Inq3 (A) and mer-Tlq3 (B)]

Fig. 2. FMOs for the ground state (S0) of mer-Mq3 (M = In (A), Tl (B)) with small core
(isocontour value 0.05)

3.3 The Energy Decomposition Analyses for mer-Mq3 in S0 States

Table 2 gives the most important results of the bonding analysis for the interaction
between each quinolate ligand and the corresponding Mq2 fragment of the investigated
systems.
Table 2. The ETS analysis for Mq2+ + q− complexes at BP86/TZ2P (kcal/mol) for the ground
states of mer-Mq3

           ΔEint     ΔEPauli   ΔEelstat   ΔEorb     ΔEelstat %   ΔEorb %
Alq2-qA    -192.42   152.46    -215.25    -129.62   62.41        37.59
Alq2-qB    -190.80   157.09    -219.11    -128.78   62.98        37.02
Alq2-qC    -194.37   158.00    -221.48    -130.89   62.85        37.15
Gaq2-qA    -169.60   192.67    -222.57    -139.70   61.44        38.56
Gaq2-qB    -169.11   199.14    -229.06    -139.19   62.21        37.39
Gaq2-qC    -172.25   199.00    -230.73    -140.52   62.15        37.85
Inq2-qA    -164.49   185.09    -224.02    -125.56   64.08        35.92
Inq2-qB    -163.38   181.18    -224.59    -119.97   65.18        34.82
Inq2-qC    -166.92   184.15    -228.34    -122.73   65.04        34.96
Tlq2-qA    -148.20   180.50    -196.28    -132.42   59.72        40.28
Tlq2-qB    -148.56   176.16    -197.74    -126.98   60.89        39.11
Tlq2-qC    -156.01   180.42    -202.14    -134.28   60.08        39.92

The results given in Table 2 show the same tendency for all investigated
complexes. The interaction energy ΔEint between the two fragments increases in the
sequence Alq2-qL < Gaq2-qL < Inq2-qL < Tlq2-qL. Moreover, the dominance of ΔEelstat%
(59.72-65.18%) over ΔEorb% (34.82-40.28%) suggests that the metal-ligand interactions
have a larger electrostatic than covalent character. For each complex, the
electrostatic interaction energy (ΔEelstat) between the qA and MqBqC fragments is weaker
than those of qB-MqAqC and qC-MqAqB, which may result in the HOMO localizing
on the A ligand in the S0 state. The orbital interaction term ΔEorb between the qB and MqAqC
fragments is higher than those of qA-MqBqC and qC-MqAqB, which may correspond to
the localization of the LUMO on the B ligand.
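Assuming the standard energy partitioning used in the ETS/EPA scheme (the decomposition itself is not spelled out in this paper), the columns of Table 2 are related by

ΔEint = ΔEelstat + ΔEPauli + ΔEorb,   ΔEelstat% = |ΔEelstat| / (|ΔEelstat| + |ΔEorb|) × 100.

As a check against Table 2, for Alq2-qA: −215.25 + 152.46 − 129.62 = −192.41 ≈ −192.42 kcal/mol, and 215.25/(215.25 + 129.62) × 100 ≈ 62.4%, in line with the tabulated 62.41%.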

4 Conclusions
The common HOMO and LUMO distribution fashion of the mer-Mq3 systems can be
traced back to, respectively, the lowest electrostatic attraction energy between the A
quinolate ligand and Mq2 and the highest orbital interaction energy between the B ligand
and Mq2. Our results suggest that EPA may be a powerful approach for rationalizing the
HOMO and LUMO distribution patterns of complexes consisting of a metal ion and
three identical ligands with C1 symmetry.
Acknowledgment. Financial support from the NSFC (No. 50473032) and
NCET-06-0321 is gratefully acknowledged.

References
1. Tang, C. W., Van Slyke, S. A.: Organic electroluminescent diodes. Appl. Phys. Lett. 51
(1987) 913-915
2. Curioni, A., Boero, M., Andreoni, W.: Alq3: Ab Initio Calculations of Its Structural and
Electronic Properties in Neutral and Charged States. Chem. Phys. Lett. 294 (1998)
263-271
3. Curioni, A., Andreoni, W.: Metal-Alq3 complexes: The nature of the chemical bonding. J.
Am. Chem. Soc. 121 (1999) 8216-8220
4. Martin, R. L., Kress, J. D., Campbell, I. H., Smith, D. L.: Molecular and solid-state
properties of tris-(8-hydroxyquinolate)-aluminum. Phys. Rev. B 61 (2000) 15804-15811.
5. Stampor, W., Kalinowski, J., Marconi, G., Marco, P. Di, Fattori, V., Giro, G.:
Electroabsorption study of excited states in tris 8-hydroxyquinoline aluminum complex.
Chem. Phys. Lett. 283 (1998) 373-380.
6. Halls, M. D., Schlegel, H. B.: Molecular orbital study of the first excited state of the
OLED material tris (8-hydroxyquinoline) aluminum (III). Chem. Mater. 13 (2001)
2632-2640
7. Sugimoto, M., Sakaki, S., Sakanoue, K., Newton, M. D.: Theory of emission state of
tris(8-quinolinolato)aluminum and its related compounds. J. Appl. Phys. 90 (2001)
6092-6097
8. Lein, M., Frenking, G.: The nature of the chemical bond in the light of an energy
decomposition analysis. Theory and Applications of Computational Chemistry: The First
40 Years, C.E. Dykstra, G. Frenking, K.S. Kim, G.E. Scuseria (Eds), Elsevier,
Amsterdam, 291 (2005).
9. Zhang, J. P., Frenking, G.: Quantum chemical analysis of the chemical bonds in tris
(8-hydroxyquinolinato)aluminum as a key emitting material for OLED. J. Phys. Chem. A
108 (2004) 10296-10301.
10. Zhang, J. P., Frenking, G.: Quantum chemical analysis of the chemical bonds in
Mq3 (M = Al, Ga) as emitting material for OLED. Chem. Phys. Lett. 394 (2004) 120-125.

Ab Initio Quantum Chemical Studies of Six-Center Bond
Exchange Reactions Among Halogen and Halogen Halide
Molecules

I. Noorbatcha1, B. Arifin2, and S.M. Zain3

1 Department of Biotechnology Engineering, Faculty of Engineering, International Islamic
University, Jalan Gombak, 53100 Kuala Lumpur, Malaysia
ibrahiman@iiu.edu.my
2 Faculty of Science, University Technology MARA, 40450 Shah Alam, Malaysia
3 Department of Chemistry, University Malaya, 50603 Kuala Lumpur, Malaysia

Abstract. The possibility of six-center bond exchange reactions among halogen
and halogen halide molecules is confirmed using ab initio quantum chemical
methods. The following reactions were studied: (Cl2)2 + Br2 → 2 BrCl + Cl2;
(Cl2)2 + HI → HCl + ICl + Cl2; and (HI)2 + F2 → HF + HI + IF. The
energy barriers for these reactions were found to be very low (< 7 kcal/mol).
These results are consistent with the molecular beam results.
Keywords: Ab initio, six-center bond exchange reaction, halogen, halogen
halide.

1 Introduction
There are only a few reaction mechanisms which postulate a termolecular reaction
among three diatomic molecules [1]. However, no termolecular reaction is
available for which we have both experimental and theoretical evidence [2-5]. Definitive
experimental evidence for a termolecular reaction through a six-centered transition
state (TS) during bond exchange among halogen molecules is available
from molecular beam experiments. Dixon and Herschbach [6] found that the following
reactions:

(Cl2)2 + Br2 → 2 BrCl + Cl2        (R1)
(Cl2)2 + HI → HCl + ICl + Cl2      (R2)

could proceed at collision energies as low as 3 kcal mol-1. A similar reaction,

(HI)2 + F2 → HF + HI + IF          (R3)

has been observed by Durana and McDonald [7]. However, no theoretical calculations
are available in support of these experiments. It would be interesting to know whether
theoretical calculations can support the above experimental findings involving halogens
and hydrogen halides. In the present work we have carried out ab initio quantum
mechanical calculations for the existence of a six-center TS for the reactions R1-R3,
and find that our calculations confirm the presence of a six-centered TS for these
reactions.
Table 1. Calculated energy barriers (Eb) and vibrational frequencies (ν) of the transition states

Cl4Br2
Method             Eb (kcal/mol)   ν (cm-1)
HF/6-311G*         93.80           546.80i
HF/MIDI!           76.91           664.30i
HF/cc-pVDZ         87.44           613.01i
HF/cc-pVTZ         92.42           522.32i
MP2/MIDI!          32.86           249.52i
MP2/cc-pVDZ        42.10           272.83i
B3LYP/6-311G*      27.95           277.24i
B3LYP/MIDI!        5.70            235.79i
B3PW91/MIDI!       7.30            238.19i
BHandH/MIDI!       7.11            368.50i
B3LYP/cc-pVDZ      25.31           268.20i
B3LYP/cc-pVTZ      30.34           285.16i

Cl4HI
Method             Eb (kcal/mol)   ν (cm-1)
HF/6-311G**        76.30           314.13i
HF/MIDI!           72.85           689.83i
HF/cc-pVTZ         75.87           261.90i
HF/cc-pVQZ         74.49           224.35i
HF/LANL2DZ         56.0            70.33i
MP2/MIDI!          32.98           851.98i
MP2/cc-pVTZ        30.18           555.36i
B3LYP/6-311G**     21.99           427.86i
B3LYP/MIDI!        4.85            359.71i
B3PW91/MIDI!       6.20            361.72i
BHandH/MIDI!       7.05            657.03i
B3LYP/cc-pVTZ      26.11           471.34i
B3LYP/cc-pVQZ      26.50           438.63i
B3LYP/LANL2DZ      5.73            71.65i

H2I2F2
Method             Eb (kcal/mol)   ν (cm-1)
HF/6-311G**        54.35           1035.00i
HF/MIDI!           55.75           1766.82i
HF/cc-pVTZ         57.17           643.65i
HF/cc-pVQZ         54.91           628.78i
HF/LANL2DZ         48.28           1483.38i
MP2/MIDI!          8.55            227.75i
MP2/cc-pVTZ        3.79            416.40i
B3LYP/MIDI!        -3.21           274.79i
B3PW91/MIDI!       -0.38           205.41i
BHandH/MIDI!       -0.37           684.24i
B3LYP/cc-pVTZ      -1.24           44.34i
B3LYP/LANL2DZ      -5.49           47.08i


[Transition-state structure diagrams for Cl4Br2 (top), Cl4HI (middle) and H2I2F2 (bottom), annotated with bond lengths and bond angles]
Fig. 1. Transition state geometries for R1 (top), R2 (middle) and R3 (bottom). The upper entries in each geometry were obtained using HF/cc-pVTZ and the lower entries using B3LYP/cc-pVTZ. The bond lengths are in Å and the angles in degrees.

2 Results and Discussion

All electronic structure calculations were done using Gaussian 98 [8]. Geometries
of the TS were optimized at the Hartree-Fock (HF) level of theory using the 6-311G*,
MIDI!, cc-pVDZ, cc-pVTZ, cc-pVQZ (for H, Cl and Br), SDB-cc-pVTZ, SDB-cc-pVQZ
(for I only) and LANL2DZ basis sets [9-10]. Second-order Moller-Plesset
perturbation theory (MP2) and the density functional theory (DFT) methods
B3LYP, B3PW91 and BHandH [10] were also used to optimize the geometries with
the above basis sets. The TS were verified by checking for the presence of only one
imaginary frequency (ν) and by performing intrinsic reaction coordinate (IRC) calculations.
The energy barrier (Eb) for the reactions was obtained after zero-point energy
corrections.
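The paper does not write the zero-point-corrected barrier out explicitly; under the usual convention it is

Eb = [E(TS) + ZPE(TS)] − [E(reactants) + ZPE(reactants)],

where E is the electronic energy of each optimized structure and ZPE its harmonic zero-point vibrational energy (the imaginary mode of the TS contributes nothing to its ZPE).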
The calculated Eb and ν values are listed in Table 1. The HF energy barriers are
found to be much higher than the MP2 and DFT barriers, emphasizing the
significance of correlation energy in cyclic systems. The MIDI! basis set gives
lower Eb values than the 6-311G* basis set for both the HF and DFT methods.
It is significant to note that all the DFT methods predict low barrier heights in the
range of 5-7 kcal/mol for R1 and R2, which is in good agreement with the experimental
results. However, for R3 the DFT methods yield negative Eb values. These negative
values could be due to the tendency of the DFT methods to underestimate energy
barriers, or to numerical errors associated with the absence of pruned grids for
iodine atoms in the DFT methods as implemented in Gaussian 98. They are probably
not due to relativistic effects, as the use of the relativistically corrected
LANL2DZ basis set also leads to negative Eb values. The flatness of the potential
energy surface around the transition state region also makes it difficult to locate the TS
accurately. Under this scenario, it can be safely assumed that R3 has a near-zero energy
barrier.
The TS structures obtained using the HF/cc-pVTZ and B3LYP/cc-pVTZ methods are
given in Figure 1. All the TS geometries were found to be planar. They look more like
a bulging triangle than the hexagon expected for six-centered reactions, with the heavier
atoms at the corners of the triangle and the lighter atoms at the middle of its bulging sides.

3 Conclusion
The present ab initio calculations clearly confirm the possibility of the six-centered exchange
reactions (Cl2)2 + Br2 → 2 BrCl + Cl2; (Cl2)2 + HI → HCl + ICl + Cl2; and
(HI)2 + F2 → HF + HI + IF with low activation barriers. These results are in agreement
with the molecular beam experimental results.
Supporting Information Available: Energies and geometries of the TS for all the
methods.

References
1. Laidler, K.J.: Chemical Kinetics. McGraw-Hill, New York (1965)
2. Rabinowitz, M.J.; Gardiner, W.C.: Chem. Phys. Lett. 124 (1986) 63-67
3. (a) Wright, J.S. Can. J. Chem. 53 (1975) 549-555 (b) Wright, J.S. Chem. Phys. Lett. 6
(1970) 476-481
4. (a) Dixon, D.A.; Stevens, R.M.; Herschbach, D.A.: Faraday Discuss. Chem. Soc. 62
(1977) 110-126 (b) Heidrich, D.; Van Eikema Hommes, N.J.R.; Schleyer, P. v. R.: J.
Comput. Chem. 14 (1993) 1149-1153 (c) Janoschek, R.; Kalcher, J.: Int. J. Quant. Chem.
38 (1990) 653-664
5. NoorBatcha, I.; Thareja, S.; Sathyamurthy, N.: J. Phys. Chem. 91 (1987) 2171-2173
6. (a) Dixon, D.A.; Herschbach, D.R.: J. Am. Chem. Soc. 97 (1975) 6268-6270 (b) Dixon,
D.A.; Herschbach, D.R.: Ber. Bunsenges. Phys. Chem. 145 (1977) 145-151
7. Durana, J.F.; McDonald, J.D.: J. Am. Chem. Soc. 98 (1976) 1289-1291
8. Gaussian 98, Revision A.7,. Frisch, M. J.; et. al.: Gaussian, Inc., Pittsburgh PA(1998)
9. http://www.emsl.pnl.gov:2080
10. Gaussian 98 Users Reference, Gaussian, Inc., Pittsburgh PA(1998)

Comparative Analysis of the Interaction Networks of
HIV-1 and Human Proteins

Kyungsook Han* and Byungkyu Park

School of Computer Science and Engineering, Inha University, Inchon 402-751, Korea
khan@inha.ac.kr, bpark@inhaian.net
* Corresponding author.

Abstract. Various interactions of human immunodeficiency virus type 1 (HIV-1)
proteins with those of the host cell are known, but a large-scale network
encompassing all known interactions of HIV-1 proteins and host cell proteins
has not been analyzed or visualized. This is partly because large-scale
interaction data are not readily available from public databases and individual
studies report small-scale interactions. NCBI recently released a database of
all known interactions of HIV-1 proteins and human proteins. Now, a
challenging task is to analyze all the interactions in a systematic way and to
identify biologically meaningful interaction groups or patterns. This paper
presents the development of a web-based system (http://hiv1.hpid.org) for
visualizing and analyzing the large-scale interactions between HIV-1 and
human proteins, together with a comparative analysis of the interactions. The whole
interaction network contains 1,768 interactions of 65 different types with 810
human proteins. The comparative analysis of the interaction networks identified
several interesting interaction groups.

1 Introduction
Twenty years following its discovery, human immunodeficiency virus type 1 (HIV-1)
remains a major threat to public health and a challenge for drug development [1].
Various interactions of HIV-1 proteins with those of the host cell are known, but a
large-scale network encompassing all known interactions of HIV-1 proteins and host
cell proteins has not been analyzed or visualized. This is partly because large-scale
interaction data are not readily available from public databases and individual
studies report small-scale interactions.
NCBI recently released a database of all known interactions of HIV-1 proteins and
human proteins (http://www.ncbi.nlm.nih.gov/RefSeq/HIVInteractions/). Now, a
challenging task is to analyze all the interactions in a systematic way and to identify
biologically meaningful interaction groups or patterns. We have developed a web-based
system (http://hiv1.hpid.org) for systematically analyzing the large-scale
interactions between HIV-1 and human proteins and for comparatively analyzing the
interactions. Comparative analysis of the interactions between HIV-1 and human
proteins using the system revealed several interesting interaction groups. This is the



first online system for interactively visualizing and analyzing the interactions of
HIV-1 and human proteins. This paper presents the system and some analysis results.

2 Methods
We have constructed a web-based HIV-1 database (http://hiv1.hpid.org) for the
analysis of the experimental data on the interactions between HIV-1 and human
proteins. The experimental data were extracted from NCBI at
http://www.ncbi.nlm.nih.gov/RefSeq/HIVInteractions. The whole interaction network
contains 1,768 interactions between 11 HIV-1 proteins and 810 human proteins
(Fig. 1). All the 65 types of protein-protein interactions shown in Table 1 were
included in our analysis.

Fig. 1. The whole interaction network between 11 HIV-1 proteins and 810 human proteins.
This network contains 1768 interactions of 65 different types. HIV-1 proteins are displayed in
yellow in the networks.

Different HIV-1 proteins have different numbers of human proteins as their
interacting partners (Fig. 2). The HIV-1 tat protein has 667 human protein partners, the
largest number of known human protein partners for a single HIV-1 protein.
Table 2 shows the number of human proteins interacting with each HIV-1 protein,
the interactions that share human protein partners with other HIV-1 proteins, and the
interactions of the 'interact with' type that share human protein partners with other
HIV-1 proteins. The numbers in the last row (total) represent the number of
non-redundant occurrences (that is, duplicate occurrences of the same entity are counted
only once). The first and second columns consider all types of interactions,
and the third column considers the interactions of the 'interact with' type. The number
of interactions differs from the number of proteins in the second
column because different types of interactions are counted more than once for the
same pair of proteins. The human proteins interacting with an HIV-1 protein are
visualized by our program InterViewer [2, 3] as star-shaped networks
centered at the HIV-1 protein. We performed pairwise comparison of the interaction
networks to find interaction groups.
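The interaction groups discussed in Section 3 are, in effect, sets of human proteins with identical HIV-1 partner sets. A minimal Python sketch of this grouping step (the pair list below is a small hypothetical sample, not the NCBI data, and the code is not the InterViewer implementation):

from collections import defaultdict

# (HIV-1 protein, human protein) pairs -- a hypothetical toy sample
interactions = [
    ("tat", "NP_002736"), ("rev", "NP_002736"), ("vif", "NP_002736"),
    ("tat", "NP_002737"), ("rev", "NP_002737"), ("vif", "NP_002737"),
    ("tat", "NP_004125"), ("vpr", "NP_004125"),
]

# Collect the set of HIV-1 partners of each human protein
partners = defaultdict(set)
for hiv, human in interactions:
    partners[human].add(hiv)

# An interaction group = human proteins sharing exactly the same partner set
groups = defaultdict(list)
for human, hiv_set in partners.items():
    groups[frozenset(hiv_set)].append(human)

for hiv_set, humans in groups.items():
    if len(humans) > 1:                      # report only non-trivial groups
        print(sorted(hiv_set), "->", sorted(humans))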
Table 1. Types of interactions between HIV-1 and human proteins

1. acetylated by         18. downregulates           35. isomerized by         52. releases
2. acetylates            19. enhanced by             36. modified by           53. relocalizes
3. activated by          20. enhances                37. modifies              54. repressed by
4. activates             21. exported by             38. modulated by          55. requires
5. antagonized by        22. fractionates with       39. modulates             56. sequesters
6. associates with       23. imported by             40. myristoylated by      57. stabilizes
7. binds                 24. inactivates             41. phosphorylated by     58. stimulated by
8. co-localizes with     25. incorporates            42. phosphorylates        59. stimulates
9. competes with         26. increases               43. polymerizes           60. suppressed by
10. complexes with       27. induces                 44. promotes binding to   61. suppresses
11. cooperates with      28. influenced by           45. protects              62. synergizes with
12. degraded by          29. inhibited by            46. recruited by          63. transported by
13. degrades             30. inhibits                47. recruits              64. ubiquitinated by
14. dephosphorylates     31. inhibits induction of   48. redistributes         65. upregulates
15. depletes             32. inhibits release of     49. regulated by
16. depolymerizes        33. interacts with          50. regulates
17. deregulates          34. interferes with         51. regulates import of

Table 2. The number of human proteins interacting with HIV-1 proteins, the interactions that
share human protein partners with other HIV-1 proteins, and the interactions of the 'interact
with' type that share human protein partners with other HIV-1 proteins

HIV-1 protein   Human protein   Shared          Shared protein   Shared 'interact with'
                partners        interactions    partners         interactions
capsid          22              27              14               3
nucleocapsid    20              37              19               0
gag matrix      64              98              46               0
p6              11              11              5                0
gag Pr55        27              47              22               1
pol             69              114             61               2
rev             55              56              33               1
tat             667             291             178              47
vif             56              98              47               39
vpr             128             157             93               4
vpu             18              24              13               1
total           810             960             204              98


3 Results and Discussion

From the comparative analysis of the HIV-1 interaction networks, we identified several
interaction groups of human proteins. Human proteins in the same interaction group
always have the same set of HIV-1 proteins as their interacting partners. There are a
total of 23 interaction groups: 2 in network A, 1 in network B, 7 in network C, and 13 in
network D (Fig. 3). Each interaction group of human proteins is shown in a red round
box, and the interacting HIV-1 proteins are shown in yellow round boxes.
The network in Fig. 3A shows 2 groups of human proteins with 5 interacting
HIV-1 proteins. Two human proteins in one group, NP_002736 and NP_002737, have the
same interacting partners among the HIV-1 proteins: p6, tat, rev, vif, and matrix.
NP_002736 and NP_002737 are mitogen-activated protein kinase (MAPK) 1
and 3, respectively. MAPK has been shown to regulate HIV-1 infectivity by
phosphorylating vif [4]. Phosphorylation of vif by a serine/threonine protein kinase(s)
plays an important role in regulating HIV-1 replication and infectivity. The gag-derived
protein p6 of HIV-1 plays a crucial role in the release of virions from the
membranes of infected cells [5]. The three human proteins in the group {NP_002256,
NP_002261, NP_002262} in Fig. 3A have the same interacting partners: tat, rev,
integrase, matrix, and vpr. NP_002256, NP_002261, and NP_002262 are karyopherin
beta 1, transportin 1, and RAN binding protein 5, respectively.
Fig. 3B shows an interaction group of 8 human proteins {NP_002145, NP_002146,
NP_004125, NP_005336, NP_005337, NP_005338, NP_006588, NP_068814}, all of which
interact with vpr, matrix, tat, and gag. The 8 human proteins are members of the
Hsp70 protein family and, by controlled binding and release, facilitate the folding,
oligomeric assembly-disassembly, and intracellular transport of protein complexes.
Fig. 3C shows 7 interaction groups of human proteins, each with the same set of 3
HIV-1 protein partners. Among these, the group of 40 human
proteins interacting with integrase, vif and tat is the largest. The network in Fig. 3D
shows 13 interaction groups of human proteins, in which the human proteins of the same
group share the same set of 2 interacting partners. The group interacting
with tat and vpr is the largest, containing 42 human proteins.
Fig. 4 shows an interaction network of HIV-1 and human proteins consisting of
interactions of the 'interact with' type only. The network shows 98 interactions between
11 HIV-1 proteins and 49 human proteins. The human proteins shared by the vif and
tat genes (enclosed in a red box in the network) are proteasome (prosome, macropain)
subunits of the highly ordered ring-shaped 20S core structure.
Fig. 5 shows the number of interacting human proteins of each HIV-1 protein and
the interactions of the 'interact with' type that share human protein partners with
other HIV-1 proteins. The HIV-1 tat protein has the largest number of interacting human
proteins as well as the largest number of interactions of the 'interact with' type.
In summary, 6 of the total of 810 human proteins (0.7%) interact with exactly 5
HIV-1 proteins (the interaction groups in Fig. 3A), 12 human proteins (1.5%) interact
with exactly 4 HIV-1 proteins (Fig. 3B), 81 human proteins (10%) interact with
exactly 3 HIV-1 proteins (the interaction groups in Fig. 3C), 105 human proteins
(13%) interact with exactly 2 HIV-1 proteins (Fig. 3D), and 606 human proteins
(74.8%) interact with only one HIV-1 protein. There is no human protein that
interacts with more than 5 HIV-1 proteins.
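(As a consistency check on these counts, 6 + 12 + 81 + 105 + 606 = 810, i.e. every human protein in the network falls into exactly one of these bins.)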


Fig. 2. The interaction networks between HIV-1 and human proteins, in which HIV-1 proteins
and human proteins are displayed in yellow and white nodes, respectively. (A) HIV-1 capsid
protein interacting with 22 human proteins. (B) HIV-1 nucleocapsid protein interacting with 20
human proteins. (C) HIV-1 matrix protein interacting with 64 human proteins. (D) HIV-1 p6
protein interacting with 11 human proteins. (E) HIV-1 gag Pr55 protein interacting with 27
human proteins. (F) HIV-1 integrase protein interacting with 69 human proteins. (G) HIV-1 rev
protein interacting with 55 human proteins. (H) HIV-1 tat protein interacting with 667 human
proteins. (I) HIV-1 vif protein interacting with 56 human proteins. (J) HIV-1 vpr protein
interacting with 128 human proteins. (K) HIV-1 vpu protein interacting with 18 human
proteins. All the networks are star-shaped networks centered at an HIV-1 protein.


Fig. 3. The interaction groups of human proteins identified from the analysis of the interactions
between HIV-1 and human proteins. Proteins in the same interaction group always have the same
set of interacting partners. There are a total of 23 interaction groups: 2 in network A, 1 in network
B, 7 in network C, and 13 in network D. Each interaction group of human proteins is shown in
a red round box, and interacting HIV-1 proteins are shown in yellow round boxes.


Fig. 4. Interaction network of HIV-1 and human proteins, consisting of interactions of the
'interact with' type only. The network shows 98 interactions between 11 HIV-1 proteins and 49 human
proteins. The human proteins shared by the vif and tat genes (enclosed in a red box in the
network) are proteasome (prosome, macropain) subunits of the highly ordered ring-shaped 20S
core structure.
[Bar chart (logarithmic scale): for each HIV-1 protein, the number of interacting human proteins and the number of 'interact with' interactions that share human protein partners with other HIV-1 proteins]
Fig. 5. The number of interacting human proteins of each HIV-1 protein (yellow bars) and the
interactions of the 'interact with' type that share human protein partners with other HIV-1
proteins (orange hatched bars)


4 Conclusion
Most of the interactions between viruses and host cells are complex and are not
fully understood despite substantial discoveries in recent years. Investigating
virus-host interactions is important for understanding viral replication and for
identifying new targets with potential for drug intervention. We have constructed a
web-based HIV-1 database (http://hiv1.hpid.org) for the comparative analysis of the
experimental data on the interactions between HIV-1 and human proteins.
Comparative analysis of the interactions between HIV-1 and human proteins using the
system revealed several interesting interaction groups. Our work can be extended in
several directions. First, biological experiments can be performed to identify
functions or other biological properties common to all proteins in the same interaction
group. Second, our study analyzed the interactions between HIV-1 and human
proteins, but can be expanded to include the interactions between human proteins.
Finally, our system should be updated to include the interaction data for the HIV-1 nef
protein, which was added to the HIV-1 database of NCBI very recently. We
believe that this is the first online system for the comparative analysis of the
interaction networks of HIV-1 and human proteins and that it is a valuable tool for
scientists in the field of protein-protein interactions and HIV/AIDS research.

Acknowledgements
This work was supported by the Korea Research Foundation Grant funded by the
Korean Government (KRF-2006-D00038) and in part by MOST (KOSEF) through
the Systems Bio-Dynamics Research Center.

References
1. Trkola, A.: HIV-host interactions: vital to the virus and key to its inhibition. Current
Opinion in Microbiology 7 (2004) 555-559
2. Han, K., Ju, B., Jung, H.: WebInterViewer: Integrated Framework for Visualizing and
Analyzing Molecular Interaction Networks. Nucl. Acids Res. 32 (2004) W89-W95
3. Ju, B.-H., Han, K.: Complexity Management in Visualizing Protein Interaction Networks.
Bioinformatics 19 (2003) i177-i179
4. Yang, X., Gabuzda, D.: Regulation of human immunodeficiency virus type 1 infectivity by
the ERK mitogen-activated protein kinase signaling pathway. Journal of Virology 73 (1999)
3460-3466
5. Müller, B., Patschinsky, T., Kräusslich, H.G.: The Late-Domain-Containing Protein p6 Is the
Predominant Phosphoprotein of Human Immunodeficiency Virus Type 1 Particles. Journal
of Virology 76 (2002) 1015-1024

Protein Classification from Protein-Domain and
Gene-Ontology Annotation Information Using
Formal Concept Analysis

Mi-Ryung Han1, Hee-Joon Chung1, Jihun Kim1, Dong-Young Noh2,3,*,
and Ju Han Kim1,4,*

1 Seoul National University Biomedical Informatics (SNUBI),
2 Department of Surgery,
3 Cancer Research Institute,
4 Human Genome Research Institute,
Seoul National University College of Medicine, Seoul, Korea
{gene0309,joonny96,djdoc,dynoh,juhan}@snu.ac.kr
* Corresponding authors

Abstract. There are a number of different attributes describing the ontology of proteins, such as protein structure, biomolecular interaction, cellular location, and protein domains, which represent the basic evolutionary
units that form proteins. In this paper, we propose a mathematical approach, formal concept analysis (FCA), which abstracts from
attribute-based object descriptions. Based on this theory, we present an
extended version of the algorithm for computing a concept lattice, the tripartite
lattice. By analyzing the tripartite lattice, we extract proteins
that are related to domains and gene ontology (GO) terms, from the bottom nodes to the top of the lattice. In summary, using tripartite lattices, we
classified proteins from their protein domain composition together with their describing
gene ontology (GO) terms.

1 Introduction

The theory of concept (or Galois) lattices (Wille, 1984) provides a natural and
formal approach to discovering and representing concept hierarchies (Carpineto et al.,
1993). Conceptual data processing (also widely known as formal concept analysis) has become a standard technique in data and knowledge processing and
has been applied to data visualization, data mining, information retrieval (using ontologies) and knowledge management. Concept (or Galois) lattice analysis
represents patterns of intersection and inclusion among dual subsets of two sets
of discrete elements (i.e. objects and attributes) (Mische et al., 2000).
Since concepts are necessary for expressing human knowledge, any knowledge
management process benefits from a comprehensive formalization of concepts.
Formal concept analysis (FCA) offers such a formalization by mathematizing
the concept of a concept as a unit of thought constituted of two parts: extension
and intension. If data are small, as compared with databases in bioinformatics,



formal concept data analysis shows how this abstract technique can unfold and
better interpret biological topics. Using formal concept data analysis, we
can focus on exploratory data analysis with meaningful concept relationships.
Therefore, in this paper, we approach protein classification using a new extension
of lattice analysis, tripartite lattices (Fararo et al., 1984; Mische et al., 2000),
which is based on formal concept analysis (FCA).
We use proteins, protein domains and Gene Ontology (GO) (Ashburner et al.,
2000) terms, and show the intersections and inclusions among them by proposing
tripartite lattices. Protein domains represent the basic evolutionary units that
form proteins. Multi-domain proteins can be made from combinations of single domains, and proteins with two or more domains constitute the majority of proteins
in all organisms studied. Furthermore, domains that co-occur in proteins are
more likely to display similar function or localization (Mott et al., 2002) than
domains in separate proteins. Therefore we can classify similar protein functional
groups from protein domain composition.

2 Methods

2.1 Formal Concept Analysis

We briefly introduce the basic notions of Formal Concept Analysis (Ganter et al., 1999).
Definition 1. A formal context is a triple of sets (G, M, I), where G is called
a set of objects, M is called a set of attributes, and I ⊆ G × M. The inclusion
(g, m) ∈ I is read: object g has attribute m.
Table 1. Example data set. Formal context of Proteins (G) and GO terms describing
the proteins (M).

Proteins (G)   DNA repair   Protein binding   Protein amino acid phosphorylation   ATPase activity
Protein 1      X                              X                                    X
Protein 2      X            X
Protein 3      X                                                                   X
Protein 4                                     X
Protein 5                   X                 X
Protein 6      X                              X                                    X
In table 1, six proteins (G) are annotated with four GO terms (M) using
gene ontology information: {(g, m): (Protein 1, DNA repair, Protein amino acid
phosphorylation, ATPase activity), (Protein 2, DNA repair, Protein binding),
(Protein 3, DNA repair, ATPase activity), (Protein 4, Protein amino acid phosphorylation), (Protein 5, Protein amino acid phosphorylation, Protein binding),
(Protein 6, DNA repair, Protein amino acid phosphorylation, ATPase activity)}


Definition 2. For A ⊆ G and B ⊆ M: A′ = {m ∈ M | ∀g ∈ A (gIm)}, B′ =
{g ∈ G | ∀m ∈ B (gIm)}.
That is, a formal context defines a duality between a subset of proteins,
denoted A, and a subset of GO terms, denoted B. Here, A′ is the set of GO
terms common to the proteins in A, and B′ is the set of proteins which have all
GO terms in B. For example, {Protein 1, Protein 3}′ = {DNA repair, ATPase
activity}, {Protein 1}′ = {DNA repair, Protein amino acid phosphorylation,
ATPase activity}, and {DNA repair, ATPase activity}′ = {Protein 1, Protein 3,
Protein 6}, as shown in Table 1.
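As a concrete illustration of Definition 2, the two derivation operators can be computed directly from the cross table of Table 1; the short Python sketch below (our own illustration, not part of the BioLattice program) reproduces the examples just given.

# Formal context of Table 1: each protein mapped to its GO terms
context = {
    "Protein 1": {"DNA repair", "Protein amino acid phosphorylation", "ATPase activity"},
    "Protein 2": {"DNA repair", "Protein binding"},
    "Protein 3": {"DNA repair", "ATPase activity"},
    "Protein 4": {"Protein amino acid phosphorylation"},
    "Protein 5": {"Protein amino acid phosphorylation", "Protein binding"},
    "Protein 6": {"DNA repair", "Protein amino acid phosphorylation", "ATPase activity"},
}

def prime_objects(A):
    # A' = GO terms common to every protein in the (non-empty) set A
    return set.intersection(*(context[g] for g in A))

def prime_attributes(B):
    # B' = proteins annotated with every GO term in B
    return {g for g, terms in context.items() if B <= terms}

print(prime_objects({"Protein 1", "Protein 3"}))
# {'DNA repair', 'ATPase activity'}
print(prime_attributes({"DNA repair", "ATPase activity"}))
# {'Protein 1', 'Protein 3', 'Protein 6'}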
Definition 3. A formal concept of a formal context (G, M, I) is a pair (A, B),
where A ⊆ G, B ⊆ M, A′ = B, and B′ = A. The set A is called the extent,
and the set B is called the intent of the concept (A, B). The concepts of a given
context are naturally ordered by the subconcept-superconcept relation defined by
(A1, B1) ≤ (A2, B2) :⇔ A1 ⊆ A2 (⇔ B2 ⊆ B1).
In our case, (A1, B1) = ({Protein 1}, {DNA repair, Protein amino acid
phosphorylation, ATPase activity}) and (A2, B2) = ({Protein 1, Protein 3}, {DNA
repair, ATPase activity}) in Table 1. The subconcept-superconcept relation is then
given by {Protein 1} ⊆ {Protein 1, Protein 3} and {DNA repair, ATPase activity}
⊆ {DNA repair, Protein amino acid phosphorylation, ATPase activity}.
We constructed a two-mode binary matrix according to the associations between
the six proteins and four gene ontology terms shown in Table 1 (entering X into
the matrix whenever a particular protein is annotated with a gene ontology
term). Each protein has more than one explicit GO term. The basic lattice
procedure applies two algebraic operations, intersection and inclusion, to a
two-mode incidence matrix (Mische et al., 2000). First, all possible intersections
between the rows of the two-mode matrix are calculated, generating all possible
intersecting subsets of GO terms and their mapping proteins. The complete set (the
vector containing all X) is then added to complete the array of subsets, showing
which subsets are included in larger subsets. This dual ordering of sets of proteins
and GO terms constitutes the lattice, which can be visualized as a line diagram in
which nodes representing subsets are linked to the nodes representing the larger
subsets in which they are included.
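A brute-force version of this intersection procedure is easy to state in code; the sketch below (a toy illustration on the Table 1 context, not the BioLattice implementation) closes the row intents under intersection, adds the complete attribute set, and recovers the concepts.

GO = ["DNA repair", "Protein binding", "Protein amino acid phosphorylation", "ATPase activity"]
rows = {                      # incidence rows of Table 1 (protein -> GO terms)
    "Protein 1": {GO[0], GO[2], GO[3]},
    "Protein 2": {GO[0], GO[1]},
    "Protein 3": {GO[0], GO[3]},
    "Protein 4": {GO[2]},
    "Protein 5": {GO[1], GO[2]},
    "Protein 6": {GO[0], GO[2], GO[3]},
}

# Close the row intents under pairwise intersection and add the full attribute set
intents = {frozenset(r) for r in rows.values()} | {frozenset(GO)}
changed = True
while changed:
    changed = False
    for a in list(intents):
        for b in list(intents):
            if (a & b) not in intents:
                intents.add(a & b)
                changed = True

# Each closed intent B yields a concept (B', B), with B' the proteins having all of B
concepts = [({g for g, r in rows.items() if b <= r}, set(b)) for b in intents]
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), "<->", sorted(intent))

For the context of Table 1 this produces nine concepts, matching the nine nodes of the lattice in Fig. 1.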
In this paper, all matrices were run through the BioLattice program designed
by Jihun Kim. The output of the BioLattice program is interpreted with four different
color-coded concepts; the concept lattice is decomposed into four sub-structures
based on a core-periphery model. (Red: the core structure is defined as the maximal
sublattice, according to size, in which every element is an upper bound of an atom.
Green: except for the core, all lower bounds of the core elements are communicating
elements. Yellow: the independent (background) structure is defined as each sublattice
whose atoms equal its coatoms. Gray: the other parts of the concept lattice are defined
as the peripheral structure.)


G = { Protein 1, Protein 2, Protein 3, Protein 4, Protein 5, Protein 6 }


M = { DNA repair, Protein binding, Protein amino acid phosphorylation, ATPase activity }
I = { (Protein 1, DNA repair), (Protein 1, Protein amino acid phosphorylation), (Protein 1, ATPase activity), (Protein 2, DNA repair), (Protein 2, Protein
binding). . .}

[Line diagram of the concept lattice: Top, concepts C2-C8 and Bottom, labeled with the four GO terms (DNA repair, Protein binding, Protein amino acid phosphorylation, ATPase activity) and with Proteins 1-6]

Fig. 1. Bipartite lattice: the concept lattice of the context in Table 1 (the letter C stands
for concept)

Figure 1 presents a nine-node lattice diagram based upon the protein-by-GO-term
matrix (6 × 4). This lattice diagram can be read in two directions, beginning
at the top or the bottom. Here, {Protein 1, Protein 6} is a superconcept of
{Protein 4} and {Protein 3} because {Protein 4} and {Protein 3} are described
by a subset of the attributes describing {Protein 1, Protein 6}.

2.2 Tripartite Lattice: Interpenetrations Among Three Distinct Sets of Elements

Fararo et al. introduce tripartite structural analysis and show how bipartite
lattices can be extended to tripartite lattices. They use persons, groups, and
organizations; or persons, cultural systems, and social systems (Fararo et al.,
1984). We take their analysis a step further to show the intersections and
inclusions among three sets of interpenetrating biological elements (i.e. proteins,
domains, GO terms). In this paper, tripartite lattices (theoretically generalizable
to the k-partite level) show the interpenetration among three two-mode matrices:
we use proteins by domains (PD), proteins by GO terms (PG), and domains by GO
terms (DG).
Let P, G and D denote the sets of Proteins, GO terms and Domains, respectively,
and let nx denote the number of entities in the set X (where X stands for
P, G or D). Rxy denotes the nx × ny matrix in which Rxy(i, j) = 1 if the ith
element of X is linked to the jth element of Y, and Rxy(i, j) = 0 otherwise.
The matrix specifying the relationship between sets X and Y is the transpose of
that representing the relationship between Y and X:

Rxy = Ryx^T (and Rxx = 0, for each X)
Table 2. Matrix structure of a tripartite lattice

          GO     Protein   Domain
GO        0      RGP       RGD
Protein   RPG    0         RPD
Domain    RDG    RDP       0

Table 2 shows a symmetric matrix whose upper-right blocks are the transposes of the
three lower-left blocks. There are no within-set relationships.
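The assembly of this symmetric block matrix is straightforward; the following sketch (hypothetical sizes and random incidence data, for illustration only) builds it with NumPy in the row/column order GO, Protein, Domain of Table 2.

import numpy as np

n_p, n_g, n_d = 4, 3, 2                          # toy numbers of proteins, GO terms, domains
R_pg = np.random.randint(0, 2, size=(n_p, n_g))  # proteins x GO terms
R_pd = np.random.randint(0, 2, size=(n_p, n_d))  # proteins x domains
R_dg = np.random.randint(0, 2, size=(n_d, n_g))  # domains x GO terms

# Block matrix of Table 2; diagonal blocks are zero (no within-set relations)
M = np.block([
    [np.zeros((n_g, n_g)), R_pg.T,               R_dg.T              ],
    [R_pg,                 np.zeros((n_p, n_p)), R_pd                ],
    [R_dg,                 R_pd.T,               np.zeros((n_d, n_d))],
])
assert np.array_equal(M, M.T)   # Rxy = Ryx^T makes the whole matrix symmetric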

3 Results

3.1 Bipartite Lattice

Dataset. Exploratory data analysis is first performed using protein lists from
the Krebs TCA cycle pathway and the Citrate TCA cycle pathway. We test bipartite
lattice analysis on these pathways because they are among the most investigated
pathways. We use ArrayXPath, which provides publicly available major pathway
resources including KEGG, GenMAPP, BioCarta and PharmGKB Pathways
(Chung HJ et al., 2004). For further analysis, we have created a repository of
proteins, domains and gene ontology (GO) terms, using SwissProt/TrEMBL for proteins,
InterPro for domains and Gene Ontology for GO terms. In this paper, we use
these resources to extract object and attribute information and to perform formal
concept analysis. The initial letter of a protein ID is P or Q, and a domain
ID starts with the three capital letters IPR.


Analysis of Bipartite Lattice. We construct three two-mode binary matrices (see Figure 2 (a), (b), (c)). These bipartite lattices show the relationships
between protein and protein domain, protein and protein-annotated GO term,
and domain and domain-annotated GO term, respectively (the concept lattices of
the matrices are not shown here). Through these lattices, we can point out which
domains are common to several proteins, which GO terms are common to
several proteins, or which GO terms are common to several domains. However,
the limitation of bipartite lattice analysis is that it presents only an abstract
overview of the relations between two sets of elements. For example, the Krebs TCA cycle
and Citrate TCA cycle pathway-related proteins P36957 and Q02218 have the same
GO term 'energy pathways'. If we want to know the domain composition of those
proteins, we have to search two lattices with the same protein IDs (protein and
protein domain; protein and protein-annotated GO term). Then we can find
that these proteins have different domains: {P36957: IPR003016, IPR000069,
IPR011053} and {Q02218: IPR011603, IPR005475}. For further information, we have
to search domain-related GO terms in a different concept lattice.
Fig. 2. Input matrix with attribute and object. (a) Two-mode binary matrix of Protein
domain and Protein. (b) Two-mode binary matrix of GO term and Protein. (c) Two-mode
binary matrix of GO term and Protein domain. Row elements are objects and
column elements are attributes.

Therefore, a bipartite lattice does not show us how the protein and domain
elements come together with particular GO terms. If one more set of biological
elements is added to the bipartite lattice, giving what is called a tripartite lattice, we
can extract more compact and concrete information with three sets of biological
elements (proteins, domains, GO terms). With the tripartite lattice, we can
explore domain-related proteins and their common GO terms simultaneously.

3.2 Tripartite Lattice

Dataset. We use pathway crosstalk to select protein lists from randomly sampled
pathways (see Figure 3(a)) (Chung HJ et al., 2005). By random sampling,
we choose a group of pathways from the pathway crosstalk graph with a fixed window
size (see Figure 3(b)). The random sampling approach is used to obtain more
accurate estimates of data statistics. Domains and GO terms are then extracted
using the protein lists from the three randomly sampled pathways below. We use distinct
domains, proteins and GO terms to build the matrix (see Table 2). The GO terms
include both domain-annotated GO terms and protein-annotated GO terms.


Fig. 3. Selection of specific pathways. (a) Pathway crosstalk: calculating the pairwise similarity
matrix between each pair of pathways and applying a multi-dimensional scaling method
created the global crosstalk graph of major biological pathways. Yellow nodes represent BioCarta, green nodes GenMAPP, red nodes KEGG, and blue nodes PharmGKB
Pathways. (b) Three pathways are selected by random sampling with our fixed window
size. (Three neighboring pathways are chosen because we want as many
common protein attributes as possible.) This is shown in the transparent red rectangular
region (BioCarta/Hs IGF-1 Signaling Pathway, BioCarta/Hs Insulin Signaling Pathway, BioCarta/Hs Inhibition of Cellular Proliferation by Gleevec Pathway).

Analysis of Tripartite Lattice. Formal concept analysis is performed using
a symmetric matrix of proteins, domains and GO terms. Among the most
specific objects, all four proteins have the domain IPR000719 in common. Listed below
are the four proteins and their domain compositions:
{P06213: IPR000719, IPR001245, IPR008266, IPR003961, IPR006212, IPR009030}
{P45983: IPR000719, IPR008351, IPR003527, IPR002290, IPR008271}
{Q13233: IPR000719, IPR008271, IPR002290, IPR007527}
{P27361: IPR000719, IPR008349, IPR003527, IPR002290, IPR008271}
By searching ENZYME (the enzyme nomenclature database), we find that P45983
(mitogen-activated protein kinase 8), Q13233 (mitogen-activated protein kinase kinase
kinase 1) and P27361 (mitogen-activated protein kinase 3) share the synonym
EC 2.7.1.37. P06213 (insulin receptor) has the synonym EC 2.7.1.112, which is similar
to EC 2.7.1.37 in that both enzymes transfer a phosphate from a high-energy phosphate
such as ATP to an organic molecule.
<Reaction catalysed>
EC 2.7.1.37: ATP + a protein → ADP + a phosphoprotein
EC 2.7.1.112: ATP + a protein tyrosine → ADP + a protein tyrosine phosphate
Therefore, we can classify the above four proteins as a similar protein functional
group. In addition, the function of each protein is described by GO terms in the
tripartite lattice. So we can say that domains that co-occur in proteins are more
likely to display similar function or localization.

4 Discussion

We have investigated the tripartite lattice, a new extension of lattice analysis, to
show the interpenetrations among proteins, domains and GO terms. By analyzing


the bipartite lattices, we extracted an abstract overview of the relations between
two sets of biological elements. However, this analysis did not show us how
protein and domain elements come together with particular GO terms. Using
the tripartite lattice, we classified proteins from their protein domain composition
together with their describing GO terms. Because these proteins have similar functions, we
can extract concrete information from the tripartite lattice, in that domains which
co-occur in proteins are more likely to show similar function or localization, as
shown in the GO term descriptions.
By applying the extended version of the algorithm for computing a concept lattice
mentioned above, proteins are classified according to their functions. However, the graphical
representations of tripartite lattices are quite complex and difficult to read for large data
sets. Therefore, we may consider interactive simplification of the obtained concepts by
merging conceptual hierarchies as a future step.

Acknowledgement
This study was supported by a grant from Korea Health 21 R&D Project, Ministry
of Health and Welfare, Republic of Korea (A060711), a grant from Korea Health
21 R&D Project, Ministry of Health & Welfare, Republic of Korea (01-PJ3-PG601GN07-0004), and a grant from Korea Health 21 R&D Project, Ministry of Health
& Welfare, Republic of Korea (0412-MI01-0416-0002).

References
1. Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis,
A.P., Dolinski, K., Dwight, S.S., Eppig, J.T. et al. (2000) Gene Ontology: tool for the
unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25-29
2. Carpineto, C., Romano, G. (1993) GALOIS: An order-theoretic approach to conceptual
clustering. Proceedings of the 10th International Conference on Machine Learning,
Amherst. pp. 33-40
3. Chung HJ, Kim M, Park CH, Kim J, Kim JH. (2004) ArrayXPath: mapping and visualizing
microarray gene expression data with integrated biological pathway resources
using Scalable Vector Graphics. Nucleic Acids Res. Jul 1;32:W460-W464.
4. Chung HJ, Park CH, Han MR, Lee S, Ohn JH, Kim J, Kim JH, Kim JH. (2005)
ArrayXPath II: mapping and visualizing microarray gene expression data with biomedical
ontologies and integrated pathway resources using Scalable Vector Graphics.
Nucleic Acids Res.
5. Fararo, T. J., Doreian, P. (1984) Tripartite structural analysis: Generalizing the
Breiger-Wilson Formalism. Social Networks. 6, 141-175.
6. Ganter, B., Wille, R. (1999) Formal Concept Analysis: Mathematical Foundations.
Springer, Heidelberg.
7. Mische, A., Pattison, P. (2000) Composing a civic arena: Publics, projects, and social
settings. Poetics. 27, 163-194.
8. Mott, R., Schultz, J., Bork, P., Ponting, C. P. (2002) Predicting protein cellular
localization using a domain projection method. Genome Res. 12, 1168-1174.
9. Wille, R. (1984) Line diagrams of hierarchical concept systems. International
Classification. 2, 77-86.

A Supervised Classifier Based on Artificial Immune System
Lingxi Peng1,2, Yinqiao Peng1, Xiaojie Liu2, Caiming Liu2, Jinquan Zeng2, Feixian Sun2, and Zhengtian Lu2
1 School of Information, Guangdong Ocean University, Zhanjiang 524025, China
manplx@163.com, liuxiaojie8@126.com
2 College of Computer Science, Sichuan University, Chengdu 610065, China

Abstract. Artificial immune recognition system (AIRS) has been convincingly proved to be a highly effective classifier, which has been successfully applied to pattern recognition and other fields. However, two shortcomings limit its further application: one is the huge size of the evolved memory cells pool, and the other is its low classification accuracy. In order to overcome these limitations, a supervised artificial immune classifier, UCAIS, is presented. The implementation of UCAIS is as follows: first, a pool of memory cells is created; then, the B-cell population is evolved and the memory cells pool is updated until the stopping criterion is met; finally, classification is accomplished by majority vote of the k nearest memory cells. Compared with AIRS, UCAIS not only reduces the huge size of the evolved memory cells pool, but also improves the classification accuracy on four well-known datasets: the Iris dataset, the Ionosphere dataset, the Diabetes dataset, and the Sonar dataset.
Keywords: machine learning; artificial immune system; supervised classification.

1 Introduction
In the last twenty years there has been a great deal of interest in exploiting the known properties of the immune system as metaphorical inspiration for computational problem solving. Exciting results have been obtained in network intrusion detection, pattern recognition, combinatorial optimization, machine learning, and other areas [1-4].
In the machine learning field, De Castro and Von Zuben's work examined the role of the clonal selection process within the immune system and went on to develop an unsupervised learning algorithm known as CLONALG [5]. This work was extended by employing the metaphor of immune network theory, which led to the aiNet algorithm. Timmis et al. developed a resource-limited artificial immune network [6]. All these models reported good benchmark results for cluster extraction and exploration, and indicated that the immune system may be an excellent source of machine learning methods.


Building on these previous works, in particular the ideas of CLONALG and resource limitation, Watkins presented the artificial immune recognition system (AIRS) algorithm in 2001 [7] and its revision in 2002 [8]. Watkins convincingly demonstrated not only that artificial immune systems could be used for supervised learning but also that AIRS was a highly effective classifier [7-8].
Subsequently, AIRS has been successfully and widely applied in many fields. Zhang et al. found that AIRS's performance was better than that of traditional K-means, ISODATA, fuzzy clustering, and SOM on remote sensing imagery classification [9]. Xu et al. found that AIRS's performance was better than that of a neural network on weather forecasting [10]. Polat et al. found that AIRS's performance was better than that of MLP and PNN on medical diagnosis [11].
However, first of all, the classification accuracy of AIRS is still lower than that of some traditional methods [7-8]. Secondly, at the training stage of AIRS, the size of the evolved memory cells pool is huge, with small distances among memory cells. After a new candidate memory cell joins the memory cells pool, AIRS only calculates the affinity between the matching memory cell and the candidate memory cell to judge whether the matching memory cell should be discarded; the affinity between the matching memory cell and any other memory cell is not calculated. For the Ionosphere dataset, the Diabetes dataset, and the Sonar dataset, the size of the evolved memory cells pool exceeds 75% of the whole training dataset [7-8]. Finally, AIRS adopts [0,1]^n feature vectors to describe antigens and memory cells [7-8] and matches them with the Euclidean distance, which does not weight the features. Moreover, existing approaches to feature weighting mainly rely on manually chosen factors and lack a mathematical basis.
In order to solve the above problems of AIRS, this paper proposes a supervised classifier based on the artificial immune system (UCAIS). The experimental results show that UCAIS not only reduces the size of the evolved memory cells pool, but also improves the classification accuracy compared with AIRS.
The rest of the paper is organized as follows. In Section 2, the UCAIS classifier is presented. In Section 3, simulations and experimental results are provided. Finally, Section 4 contains our summary and conclusions.

2 Proposed Classifier
Broadly, the training principle of UCAIS is based upon the mammalian immune system's response to an invading antigen. In a mammalian immune system, the system generates B-cells, which respond to an invader and its presenting characteristics, and through mutation these B-cells develop greater and greater affinity for the antigen. Most B-cells have a short lifetime, but a small proportion of them become memory cells, which are longer-lived. These memory cells in nature enable a mammalian immune system to respond rapidly to a second invasion by a previously encountered threat. In UCAIS, the antigens are the training data. When a training example is presented, an initial population of B-cells is mutated; the resulting B-cells with high


affinity to the training data continue to propagate, producing larger numbers of mutated clones, while those with less affinity produce fewer offspring and even die out.
Before the UCAIS classifier is given, let us establish the notational conventions first. Let MC represent the set of memory cells, and let mc represent one memory cell, where mc ∈ MC. Let AB represent the set of ARBs (Artificial Recognition Balls) [9], and let ab represent one B-cell, where ab ∈ AB. Let AG represent the set of antigens, and let ag represent one antigen, where ag ∈ AG. Let mc.c, ab.c, and ag.c represent the class of a given memory cell, B-cell, and antigen, respectively, where mc.c ∈ C = {1, 2, ..., nc}, ab.c ∈ C = {1, 2, ..., nc}, and ag.c ∈ C = {1, 2, ..., nc}; nc is the number of classes in the dataset. Let ab.stim represent the stimulation of a given ag to the ARB ab. Let TotalResource represent the allowed total resources of the system. Let mc.f, ab.f, and ag.f represent the feature vectors of a given memory cell, B-cell, and antigen, respectively. Let mc.fi, ab.fi, and ag.fi represent the ith value of mc.f, ab.f, and ag.f, respectively.

Definition 1. Let MCc represent the set of memory cells of the cth class, such that MCc ⊆ MC = MC1 ∪ MC2 ∪ ... ∪ MCnc. If mc.c = c, then mc ∈ MCc.
Definition 2. Let ABc represent the ARB set of the cth class, such that ABc ⊆ AB = AB1 ∪ AB2 ∪ ... ∪ ABnc. If ab.c = c, then ab ∈ ABc.
Definition 3. Let AGc represent the antigen set of the cth class, such that AGc ⊆ AG = AG1 ∪ AG2 ∪ ... ∪ AGnc. If ag.c = c, then ag ∈ AGc.
Definition 4. MMCD (Minimum Memory Cell Distance): the minimum allowed distance between memory cells of the same class in the memory cells set.

MMCD is used to control the size of the memory cells pool, in order to reduce the huge size of the memory cells pool observed in AIRS. After the match memory cell joins the memory cells pool, if the affinity between the match memory cell and any other memory cell is smaller than MMCD, the match memory cell is discarded from the memory cells set.
During training, the first step is to initialize a pool of memory cells. Then, the B-cell population is evolved and the memory cells pool is updated until the stopping criterion is met; the resulting memory cells may be thought of as generalizations of the training instance vectors. The classifier includes the following five processes.
2.1 Initialization
The initialization stage normalizes the feature vectors of the training data and seeds the memory cells pool if desired. The process is as follows.
1) Normalize the feature vectors of the training data.
2) Randomly select an antigen from the training set AG and add it to the memory cells pool.
As the class of an antigen changes, the degree of variation of each feature varies. Specifically, the larger the variation of a feature, the larger the weight given to that feature. The weights serve to reduce the size of the memory cells pool and to improve the classification accuracy. For the training set AG, suppose there are m antigens and that each feature vector consists of n features. Let ag[j] represent the jth antigen, and let


AG.fi denote the average value of the ith feature and SDi the standard deviation of the ith feature, where

SDi = ( (1/(m-1)) Σ_{j=1}^{m} (ag[j].fi - AG.fi)^2 )^(1/2) .

Let vi = SDi / AG.fi denote the variation coefficient of the ith feature, and let ωi denote the weight of the ith feature, defined by formula (1):

ωi = vi / Σ_{i=1}^{n} vi                                                (1)

The calculation of affinity and stimulation is defined by formula (2) and formula (3), respectively. When the parameters are mc and ag, formulas (2) and (3) give the affinity and stimulation of mc and ag; when the parameters are ab and ag, they give the affinity and stimulation of ab and ag; and when the parameters are two memory cells, formula (2) gives the affinity of the two memory cells.

Affinity(x, y) = Σ_{i=1}^{n} ωi |x.fi - y.fi| / n                       (2)

Stimulation(x, y) = 1 - Affinity(x, y)                                  (3)
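To make the weighting and matching step concrete, the following is a minimal Python sketch of formulas (1)-(3) as reconstructed above. The function names and the NumPy-based implementation are ours, and the weighted per-feature distance inside affinity() is an assumption based on the weight definition in formula (1), not code from the paper.

```python
import numpy as np

def feature_weights(antigens):
    """Formula (1): weight each feature by its variation coefficient
    v_i = SD_i / mean_i, normalised so the weights sum to one."""
    antigens = np.asarray(antigens, dtype=float)   # shape (m, n)
    means = antigens.mean(axis=0)
    sds = antigens.std(axis=0, ddof=1)             # uses the 1/(m-1) estimator
    v = sds / means
    return v / v.sum()

def affinity(x, y, w):
    """Formula (2): weighted average per-feature distance, in [0, 1]
    for feature vectors normalised to [0, 1]."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sum(w * np.abs(x - y)) / len(x))

def stimulation(x, y, w):
    """Formula (3): stimulation is the complement of affinity."""
    return 1.0 - affinity(x, y, w)

# tiny usage example with made-up training antigens
train = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
w = feature_weights(train)
print(w, stimulation([0.1, 0.9], [0.3, 0.7], w))
```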

2.2 Clone and Mutation of ARB

UCAIS then learns from each training antigen. The first step is to identify the memory cell which has the same class as the antigen and which is most stimulated by the antigen ag, according to formula (4). If there are no memory cells of the same class as ag in the memory cells pool, ag itself is added to the pool as mcmatch.

mcmatch = ag, if MCag.c = ∅; otherwise mcmatch = arg max_{mc ∈ MCag.c} stimulation(ag, mc)     (4)

Once the mcmatch with the highest stimulation is identified, it generates new ARBs. Let NumClones = clonalRate * hyperClonalRate * stimulation(mcmatch, ag). The hyperClonalRate and clonalRate are integer values set by the user. The clonalRate determines how many clones are produced by ARBs and memory cells; a typical value is 10. The hyperClonalRate is a multiplier which ensures that a hypermutating memory cell produces more new cells than a standard ARB. UCAIS creates NumClones new clones of mcmatch, where each feature of a clone can be mutated according to the stimulation value: the higher the normalized stimulation value, the smaller the allowed range of mutation. The mutated clones and mcmatch are then joined into the set AB.
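A rough sketch of this matching and cloning step, under the reading of formula (4) given above, is shown below; the dictionary representation of cells, the helper names, and the uniform-weight stimulation used for brevity are our own simplifications rather than the authors' implementation.

```python
import random

def stimulation(x, y):
    # formula (3) with uniform feature weights, kept inline for brevity
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def find_mc_match(ag, memory_cells):
    """Formula (4): the most stimulated memory cell of the antigen's class,
    or the antigen itself if that class has no memory cells yet."""
    same_class = [mc for mc in memory_cells if mc["c"] == ag["c"]]
    if not same_class:
        mc = {"f": list(ag["f"]), "c": ag["c"]}
        memory_cells.append(mc)
        return mc
    return max(same_class, key=lambda mc: stimulation(ag["f"], mc["f"]))

def clone_and_mutate(mc_match, ag, clonal_rate=10, hyper_clonal_rate=2):
    """Create NumClones mutated copies of mc_match; the more stimulated the
    match, the more clones it gets and the smaller its mutation range."""
    stim = stimulation(ag["f"], mc_match["f"])
    num_clones = int(clonal_rate * hyper_clonal_rate * stim)
    mutation_range = 1.0 - stim
    clones = []
    for _ in range(num_clones):
        f = [min(1.0, max(0.0, v + random.uniform(-mutation_range, mutation_range)))
             for v in mc_match["f"]]
        clones.append({"f": f, "c": mc_match["c"], "stim": 0.0, "resources": 0.0})
    return clones
```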


2.3 Competition for Resources

The number of resources in the system is a fixed value, TotalResource. For each ab ∈ AB, if ab.c = ag.c, then ab is allocated a number of resources. The principle of resource allocation is that a B-cell which is highly stimulated by the given ag owns more resources, and the total resources allocated to B-cells cannot exceed TotalResource. The competitive allocation of resources results in the death of the B-cells owning the fewest resources; the goal is to control the number of B-cells. This process includes the following steps.
1) If ab is of the same class as the antigen ag, find the maximum and minimum stimulation among all the ARBs in the set AB by formula (3). For each ab ∈ AB, normalize its stimulation by formula (5).

ab.stim = (ab.stim - minstim) / (maxstim - minstim),  iff ab.c = ag.c             (5)

2) Calculate the resources of each ab based on its stimulation level by formula (6).

ab.resources = ab.stim * clonalRate,  iff ab.c = ag.c                             (6)

3) Sum all resources. If the sum of the resources just allocated to the ARBs exceeds the allowance TotalResource, resources are removed from the least stimulated ARBs first.
4) Meanwhile, all surviving ARBs are allowed to generate mutated clones.
5) A stopping criterion is calculated by formula (7) at this point. It is met if the average stimulation level of the ARBs of the same class as the antigen ag is above a stimulation threshold value set by the user. If the stopping criterion has not been met, repeat from step 1).
s = ( Σ_{j=1}^{|ABi|} abj.stim ) / |ABi|,  where abj ∈ ABi and abj.c = ag.c       (7)
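The resource-competition pass of formulas (5)-(7) can be sketched as follows; the data layout and function name are hypothetical, and ARBs are assumed to carry stim and resources fields filled by the cloning step.

```python
def compete_for_resources(ab_pool, ag_class, clonal_rate=10,
                          total_resources=500, stim_threshold=0.8):
    """One resource-competition pass over the ARBs of the antigen's class
    (formulas (5)-(7)): normalise stimulation, allocate resources, trim the
    least stimulated ARBs, and report whether the stopping criterion is met."""
    same = [ab for ab in ab_pool if ab["c"] == ag_class]
    if not same:
        return ab_pool, True
    lo = min(ab["stim"] for ab in same)
    hi = max(ab["stim"] for ab in same)
    for ab in same:
        ab["stim"] = (ab["stim"] - lo) / (hi - lo) if hi > lo else 1.0  # formula (5)
        ab["resources"] = ab["stim"] * clonal_rate                      # formula (6)
    excess = sum(ab["resources"] for ab in same) - total_resources
    for ab in sorted(same, key=lambda a: a["stim"]):                    # weakest first
        if excess <= 0:
            break
        taken = min(ab["resources"], excess)
        ab["resources"] -= taken
        excess -= taken
    survivors = [ab for ab in ab_pool
                 if ab["c"] != ag_class or ab["resources"] > 0]
    kept = [ab for ab in same if ab["resources"] > 0]
    avg_stim = sum(ab["stim"] for ab in kept) / len(kept) if kept else 0.0  # formula (7)
    return survivors, avg_stim >= stim_threshold
```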

2.4 Consolidating and Controlling the Memory Cells Pool

If the new candidate for the memory cells pool, mccand, defined by formula (8), is a better fit for the presented antigen than the best existing memory cell, mcmatch, it is added to the pool, MCag.c = MCag.c ∪ {mccand}. Moreover, if the affinity between mccand and mcmatch is less than MMCD, then mccand actually replaces mcmatch in the memory cells pool and mcmatch is discarded. The affinity between mcmatch and each other memory cell is also calculated; if it is lower than MMCD, mcmatch is likewise discarded from the memory cells set.

mccand = arg max_{ab ∈ ABag.c} stimulation(ag, ab)                                (8)
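A possible reading of this consolidation step in Python is sketched below; mc_cand, mc_match and the weight vector w are assumed to come from the previous steps, and the exact replacement policy is our interpretation of the text above rather than the authors' code.

```python
def consolidate_memory(memory_cells, mc_match, mc_cand, ag, mmcd, w):
    """Sketch of Section 2.4: add mc_cand when it fits the antigen better than
    mc_match, then use MMCD to discard memory cells that crowd each other."""
    def affinity(x, y):
        # formula (2) with weights w; smaller means closer
        return sum(wi * abs(a - b) for wi, a, b in zip(w, x, y)) / len(x)

    def stim(x, y):
        return 1.0 - affinity(x, y)

    if stim(ag["f"], mc_cand["f"]) > stim(ag["f"], mc_match["f"]):
        memory_cells.append(mc_cand)              # MC_ag.c = MC_ag.c U {mc_cand}
        too_close_to_cand = affinity(mc_cand["f"], mc_match["f"]) < mmcd
        too_close_to_other = any(
            mc is not mc_match and mc is not mc_cand and mc["c"] == mc_match["c"]
            and affinity(mc["f"], mc_match["f"]) < mmcd
            for mc in memory_cells)
        if (too_close_to_cand or too_close_to_other) and mc_match in memory_cells:
            memory_cells.remove(mc_match)         # mc_match is discarded
    return memory_cells
```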


2.5 Classification Process

Classification of test data is accomplished by majority vote of the k memory cells nearest to the presented test antigen. The user can set the value of k according to the number of classes in the dataset.
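The classification step then reduces to a weighted nearest-neighbour vote, for example as in the following sketch (our own helper names, not the authors' code):

```python
from collections import Counter

def classify(test_f, memory_cells, w, k=3):
    """Majority vote of the k memory cells most stimulated by the test antigen."""
    def stim(mc):
        # stimulation = 1 - weighted average feature distance (formulas (2)-(3))
        return 1.0 - sum(wi * abs(a - b)
                         for wi, a, b in zip(w, test_f, mc["f"])) / len(test_f)
    nearest = sorted(memory_cells, key=stim, reverse=True)[:k]
    return Counter(mc["c"] for mc in nearest).most_common(1)[0][0]
```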

3 Experiments
UCAIS has been used to classify four benchmark datasets taken from the UCI repository [12]: the Fisher iris flowers dataset, the Pima diabetes dataset, the Ionosphere dataset, and the Sonar dataset. These four datasets are widely used for testing classification algorithms. The sizes of the training datasets and the user-assignable parameters, including the stimulation threshold, clonalRate, hyperClonalRate, TotalResource, and the k-value, are set to 0.8, 0.2, 1, 500, and 3, respectively, the same as in the literature [7-8]. The experiments adopt formula (2) and formula (3) to calculate affinity and stimulation, respectively, with the parameter set to 2.

Fig. 1. Relationship of the MMCD setting and average classification accuracy

Fig. 2. Relationship of the MMCD setting and the size of the evolved memory cells pool

In order to test the effect of the MMCD setting on the size of the evolved memory cells pool and on the ACA (Average Classification Accuracy), we carried out experiments on the four datasets. The results are shown in Fig. 1 and Fig. 2. Fig. 1 shows that ACA first increases as MMCD increases; for the four datasets, satisfactory ACA is obtained when MMCD is 0.05, 0.03, 0.03, and 0.03, respectively, and at these settings the sizes of the evolved memory cells pool are relatively small. However, as MMCD continues to increase, ACA decreases. Fig. 2 illustrates that the size of the evolved memory cells pool decreases as MMCD increases. The results show that a proper setting of MMCD not only reduces the size of the evolved memory cells pool, but also helps to improve the ACA.
UCAIS reduces the size of the memory cells pool. We compared the size of the evolved memory cells pool of UCAIS with those of AIRS1 [7], the first version of AIRS, and AIRS2 [8], the revision of AIRS1; both AIRS1 and AIRS2 have been widely and successfully applied in many fields. The results are shown in Fig. 3, which illustrates that


the pool of UCAIS is the smallest of the three. Fig. 4 presents the percentage reduction of the evolved set of memory cells for AIRS1, AIRS2 and UCAIS, from which it can be seen that UCAIS achieves the highest reduction on all four datasets. Compared with AIRS2, which reduces the size of the memory cells pool on the former three datasets, the percentage reductions achieved by UCAIS (and its improvements over AIRS2) are 79.6% (+5.6%), 70% (+18%), 79.6% (+19.6%), and 38% (+31%), respectively.

Fig. 3. Comparison of the size of the evolved memory cells pool

Fig. 4. Comparison of the percentage reduction of the evolved memory cells pool

In order to show that UCAIS improves classification accuracy, we compare UCAIS, AIRS1, AIRS2, and the results collected by Duch [13], who publishes the results of applying a large number of classifiers to many benchmark classification datasets, including the four used here. Table 1 [7,13] presents the comparison of classification accuracy. The symbol '-' means that no result is available for the corresponding method on that dataset. Due to space limitations, many methods with lower classification accuracy are not listed. Indeed, on every benchmark dataset UCAIS outperforms a number of well-respected classifiers.
Table 1. The comparison of classification accuracy

                         Classification accuracy
Method        Iris       Ionosphere    Diabetes    Sonar
UCAIS         98.2%      96.9%         78.3%       92.3%
SSV           98.0%      -             -           -
C-MLP2LN      98.0%      -             -           -
3-NN          96.0%      96.7%         -           -
Logdisc       -          -             77.7%       -
IncNet        -          -             77.6%       -
SVM           -          93.2%         -           90.4%
AIRS2         -          95.6%         74.2%       84.9%
AIRS1         96.7%      94.9%         75.8%       84.0%
4 Conclusion
A supervised classifier named UCAIS is presented in this paper. Compared with
AIRS, UCAIS not only reduces the huge size of evolved memory cells pool, but also


improves the classification accuracy. Future work will apply UCAIS to more real-world problems. UCAIS is a general algorithm and can be used in other fields: for example, if the antigen set is taken to be a set of specified patterns, or the normal status of a network, UCAIS can be used for pattern recognition, anomaly detection, and other tasks.

Acknowledgments
This work was supported by 863 High Tech Project of China under Grant NO.
2006AA01Z435, the National Natural Science Foundation of China under Grant
No.60373110, No.60573130, and NO.60502011.

References
1. Li, T.: Computer Immunology. Publishing House of Electronics Industry Beijing (2004)
2. Klarreich E. Inspired by Immunity. Nature, vol. (415) (2002) 468-470
3. Li T., An immune based dynamic intrusion detection model. Chinese Science Bulletin,
vol. 50(22) (2005) 2650-2657
4. Li, T.: An immunity based network security risk estimation. Science in China Ser. F Information Sciences, vol. 48(5) (2005) 798-816
5. De Castro, L. N., F. Von Zuben.: The clonal selection algorithm with engineering applications. in Proc. of Genetic and Evolutionary Computation Conference. USA: Morgan
Kaufman Publishers, (2000) 36-37
6. De Castro, L. N., J. Timmis.: An Artificial Immune Network for Multimodal Optimisation.
Congress on Evolutionary Computation. Part of the World Congress on Computational Intelligence, (2002) 699-704
7. Watkins, L. Boggess.: A Resource Limited Artificial Immune Classifier. Proceedings of
Congress on Evolutionary Computation, Berlin Heidelberg: Springer Verlag, (2002)
926-931
8. Watkins, J. Timmis, L. Boggess.: Artificial Immune Recognition System (AIRS): An Immune-Inspired Supervised Learning Algorithm. Genetic Programming and Evolvable Machines, vol. 5 (3) (2004) 291-317
9. Zhong, Y.F., Zhang, L.P., Huang B., Li P.X.: An unsupervised artificial immune classifier
for multi/hyperspectral remote sensing imagery. IEEE Transactions on Geosciences and
Remote Sensing, vol. 44(2) (2006) 420-431
10. Xu, C.L., Li, T., Huang X.M.: Artificial Immune Algorithm Based System for Forecasting
Weather, Journal of Sichuan University(Engineering Science Edition), vol. 37(5) (2005)
125-129
11. Polat, K., Sahan, S., Kodaz, H., Gunes, S.: A new classification method to diagnosis liver
disorders: supervised artificial immune system (AIRS). in Proc. of the IEEE 13th Signal
Processing and Communications Applications,New Yok (2005) 169-174
12. Blake, C.L., Merz, C.J.: UCI Repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html (1998)
13. W. Duch.: Datasets used for classification: Comparison of results. http://www.phys.uni.torun.pl/kmk/projects/datasets.html (2002)

Ab-origin: An Improved Tool of Heavy Chain Rearrangement Analysis for Human Immunoglobulin
Xiaojing Wang1,2, Wu Wei1,2, SiYuan Zheng2, Z.W. Cao1,*, and Yixue Li1,2,*
1 Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai, China
2 Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences; Graduate School of the Chinese Academy of Sciences, 320 YueYang Road, Shanghai 200031, China
zwcao@scbit.org, yxli@scbit.org

Abstract. An improved tool to explore the origin of human immunoglobulin


heavy chains, named Ab-origin, has been developed. It can analyze in detail the
V-D-J joints from the rearranged sequence by searching against germline
databases. In addition to the known information about antibody recombination, an appropriate scoring system and a restriction on the search location are incorporated to improve computing performance. When compared with the newly developed software SoDA, Ab-origin performed much better in both accuracy and stability, with accuracies 2, 7 and 1 percentage points higher for V, D and J, respectively. Though only the human heavy chain is taken as an example here, the algorithm is also suited to the analysis of all immunoglobulin and TCR sequences.
Keywords: antibody diversity, V(D)J recombination.

1 Introduction
To protect ourselves from intruders, our immune system produces antibody proteins
which are able to recognize and neutralize foreign substances, namely antigens. In
response to the different varieties of antigens encountered over a humans lifetime, B
cells need to make thousands of millions of different antibodies deriving from the
limited immunologic information encoded in human genome[1]. It is estimated that
the human body can produce at least 108 different antibodies[2], as a homo-dimmer of
heavy and light peptide chains, each of them containing a unique variable region. In
contrast to the huge diversities of unique regions of antibodies, the variable region of
immunologic protein is only encoded by combination of three kinds of gene
segments: variable (V), diversity (D) and joining (J) fragments (V and J segments
only in the case of light chain). Taking heavy chain as an example, all the possible
variable regions is only encoded by gene groups of 51 V genes, 27 D genes and 6 J
genes in human chromosome 14 [2].
Obviously, human beings have gained a most important and amazing immunological mechanism to generate the vast diversity of antibodies during the long

Corresponding author.



history of evolution[1]. But how can the human body produce such a huge number of antibodies from a limited number of gene groups? Several mechanisms in vivo have been revealed to answer this question, such as combinatorial V-(D)-J genetic joining, junctional flexibility, somatic hypermutation and combinatorial association of light and heavy chains[2]. It is notable that the random recombination of the V, (D), J genetic segments plays the most critical role[1], not only creating diversity at the level of 10^5 (10^3 for the light chain), but also providing the basic structural frame for further development of the diversity of the antibody variable region. In this sense, analysis of the V-(D)-J recombination process could help to facilitate antibody engineering for potential therapeutic and research applications.
With the accomplishment of the human genome project, the genetic information of immunoglobulins (Igs) has been collected into public databases[3], which makes bioinformatics analysis of the V-D-J junction possible. Several tools have been developed that try to trace back the genetic coding from mature immunoglobulin sequences, including some pioneering work which applied alignment methods[4, 5] or consecutive matches for D segment matching[6]. Recently, a new tool based on dynamic programming, named SoDA, was established, which is intended to process sequences in batch[7].
Although these existing tools produced some positive results, their methods were too complex. At present, BLAST has proved to be a very successful and efficient tool for sequence alignment, which motivated us to tackle the task with this handy and powerful tool. However, BLAST is a program for general sequence analysis, while antibody sequences have their own specialities, so it is indispensable to encode these specialities into BLAST by setting appropriate parameters.
In this paper, a similar tool named Ab-origin was set up based on the BLAST[8] algorithm, which has been widely accepted as a powerful tool for sequence alignment and allows custom parameter settings according to specific situations. Ab-origin was developed in the JAVA language and runs on a Linux server. To better model the natural process of antibody maturation induced by antigen affinity, unconfirmed events such as D-D fusion[5, 9] and insertion/deletion during somatic hypermutation are excluded after checking against the related references [2, 10].

2 Method
2.1 Germline Database
Sequences of V, D, J germline genes of human immunoglobulin heavy chains were
collected from IMGT[3] database. After removing the partial genes, the numbers
of the V, D and J alleles are 187, 34, and 12 respectively. V germline sequences
vary from 288 to 305 nucleotides in length, D vary from 11 to 37 and J from 48
to 63.


2.2 Principle
To our knowledge, V, D and J gene segments assemble through a site-specific recombination reaction which is thought to be a random assortment[11]. The selections of the V, D and J segments during the rearrangement process are independent. So, according to the Bayes rule,

P(V,D,J|Q) = P(V)P(D)P(J)/P(Q) ,                                        (1)

where Q is the query sequence, i.e., the mature antibody sequence we need to analyze, and V, D, and J represent the three germline segments, respectively. P(V,D,J|Q) is the probability of finding the correct V, D and J genes of the given sequence Q. This is the real situation from germline to mature antibody sequence in our body; but when we decipher the mature sequence, the identification of the D region always depends on the locations of the V and J segments found in the query sequence. So the formula changes to

P(V,D,J|Q) = P(V)P(D|V,J)P(J)/P(Q) .                                    (2)

Given an observed sequence Q, P(Q) is a constant. As a result, maximizing P(V), P(D|V,J) and P(J) separately will also maximize the conditional probability P(V,D,J|Q). The probability of each of V, D and J is defined as a function of the alignment score, following BLAST[12].
2.3 Algorithm
2.3.1 Search for V and J
Ab-origin calls the NCBI BLAST software to find the V gene segment in the library with the highest similarity to the query sequence. Since we do not consider insertion/deletion events, Ab-origin performs the alignment in a gap-forbidden style; moreover, it uses a smaller word size and a specific scoring system of +5 for a match and -6 for a mismatch, according to the similarity between germline and query sequences. For the J segment, the best hit is found using the same method as for the V segment. The probability is defined as a function of the expected value obtained from the BLAST search[12].
2.3.2 Search for D
As illustrated above, D germline sequences are shorter than the V and J germlines and vary greatly in length. In the recombination, the D segment is further shortened by deletions at both ends and heavily modified by somatic hypermutation, as the whole D is located in the CDR3 region. Because of its short length, false positive matches have a higher probability[12], and in some cases no hit is found at all. For this reason, the BLAST algorithm may not be effective enough to assess the D segment accurately; instead we use an algorithm extended from BLAST.
We define the search space as Q(V_end-5, J_start+5). The algorithm to find the D germline in a query sequence is as follows:


For all D germlines do
    For every site n of the query sequence from V_end-5 to J_start+5 do
        For every site from n to n+D_length do
            If the nucleotide of the query sequence equals the one of the germline sequence
                score = score + 5;
            Else
                score = score - 4;
            End if.
        End for.
    End for.
End for.
Then the scores are sorted to find the best one. To filter out stochastic matches, hits whose number of matching nucleotides is less than half the length of the D germline are discarded, and a rigorous penalty score of -4 is adopted.
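A compact Python rendering of the scan described by the pseudocode and the filtering rule above is given below; the germline dictionary, the sequence strings and the toy example are illustrative only and not data from the paper.

```python
def best_d_match(query, d_germlines, v_end, j_start, match=5, mismatch=-4):
    """Exhaustive ungapped scan for the best-scoring D germline inside
    query[v_end-5 : j_start+5]; hits matching fewer than half of the
    germline's nucleotides are discarded as stochastic matches."""
    lo, hi = max(0, v_end - 5), min(len(query), j_start + 5)
    best = None                                   # (score, germline name, start)
    for name, germ in d_germlines.items():
        for start in range(lo, hi):
            window = query[start:start + len(germ)]
            if len(window) < len(germ):
                break
            matches = sum(q == g for q, g in zip(window, germ))
            if matches < len(germ) / 2:           # filter stochastic matches
                continue
            score = matches * match + (len(germ) - matches) * mismatch
            if best is None or score > best[0]:
                best = (score, name, start)
    return best

# usage with toy sequences (not real germlines)
print(best_d_match("ACGTACGTGGGTACCCATGC",
                   {"D-toy": "GGGTACCC"}, v_end=6, j_start=18))
```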
2.3.3 Identification of N and P Region
We search for the short palindromic sequence (P region) at the exact margins of the V, D and J regions, i.e., stretches exactly reverse-complementary to the corresponding V, D and J germlines, respectively. During the V(D)J joining process, a region of non-template nucleotides may be added by a terminal deoxynucleotidyl transferase (TdT) catalyzed reaction, namely the N-region[2]. Ab-origin defines this region as the parts of the query sequence left over after the previous assignments.
2.4 Validation
To test Ab-origin, a simulation program was developed to generate artificial sequences of heavy chain variable regions. By randomly choosing V, D, J germline segments, the program simulates V-(D)-J rearrangement and, to be more realistic, it cuts 0 to 5 nucleotides randomly from either side of the V-D and D-J joints, mimicking the imprecise joining of the coding sequences [2]. Subsequently, up to 15 N-nucleotides are randomly chosen and added to both the D-J and V-D joints. By introducing point mutations independently at each site, somatic hypermutation is simulated with a transition rate twice the transversion rate[10]. Five different mutation rates were set, ranging from 2% to 10% in steps of 2%, corresponding to different phases of antibody affinity maturation[2]. At each mutation rate, 1000 artificial sequences were generated. Finally, the sequences containing a termination codon (TAG, TGA, TAA) in the current open reading frame (ORF) were removed. A rough sketch of such a generator is shown below.
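As referenced above, this is only a rough sketch of the validation generator; the germline pools, the 2:1 transition/transversion reading, and the reading frame used for the stop-codon check are our assumptions.

```python
import random

STOP_CODONS = {"TAG", "TGA", "TAA"}

def simulate_rearrangement(v_pool, d_pool, j_pool, mutation_rate=0.04):
    """Pick random V, D, J germlines, trim 0-5 nt at each side of the V-D and
    D-J joints, insert up to 15 random N-nucleotides at both joints, then apply
    point mutations with transitions twice as likely as transversions."""
    transitions = {"A": "G", "G": "A", "C": "T", "T": "C"}

    def trim(seq, left, right):
        return seq[(random.randint(0, 5) if left else 0):
                   len(seq) - (random.randint(0, 5) if right else 0)]

    def n_region():
        return "".join(random.choice("ACGT") for _ in range(random.randint(0, 15)))

    seq = (trim(random.choice(v_pool), False, True) + n_region() +
           trim(random.choice(d_pool), True, True) + n_region() +
           trim(random.choice(j_pool), True, False))
    mutated = []
    for base in seq:
        if random.random() < mutation_rate:
            if random.random() < 2 / 3:   # transitions twice as likely (one reading of the 2:1 rate)
                mutated.append(transitions[base])
            else:
                mutated.append(random.choice([b for b in "ACGT"
                                              if b != base and b != transitions[base]]))
        else:
            mutated.append(base)
    seq = "".join(mutated)
    # discard sequences with an in-frame stop codon (frame 0 assumed here)
    has_stop = any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))
    return None if has_stop else seq
```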


3 Results and Discussions


Among the existing tools, only SoDA, IMGT/V-QUEST and VDJsolver can accept
batch submission. SoDA has shown several advantages compared to other tools such
as JOINSOLVER and V-QUEST, e.g. capacity of batch analysis, better results[7].
Thus we use simulated sequences to compare Ab-origin with SoDA for testing the
performance of Ab-origin.
After filtration, the numbers of remaining simulated sequences without termination codons at the five mutation rates ranging from 2% to 10% in steps of 2% are 353, 301, 243, 146, and 124, respectively. These sequences were analyzed by Ab-origin and SoDA using the same germline database from IMGT[3]. The results are shown in Table 1. At each mutation rate, Ab-origin identified more V, D, and J segments correctly than SoDA did. The average accuracy ratios for the V, D and J segments are up to 95%, 81% and 98%, respectively, compared to 93%, 74% and 97% for SoDA. Among these, the performance in determining the D segment shows the most notable improvement: out of 353, 301, 243, 146 and 124 analyzed sequences, Ab-origin correctly identified 292, 241, 193, 118 and 103, respectively, an improvement of 7% in accuracy compared to SoDA.
Table 1. Results of five sets of simulated sequences with different mutation rates*

Mutation rate (%)          2            4            6            8            10
Total number               353          301          243          146          124
Correct pickup         SoDA   Ab    SoDA   Ab    SoDA   Ab    SoDA   Ab    SoDA   Ab
V                       339   347    276   287    232   234    134   138    113   117
D                       276   292    235   241    178   193    111   118     84   103
J                       350   351    291   293    241   241    140   142    119   122
* The numbers are the correct identifications made by the two programs at the five mutation rates. An identification is counted as correct only if the reported gene segment is exactly the one used in the simulation, not allowing even for mismatched alleles. a: Ab, Ab-origin.

In addition, we compared the error rates of the V, D and J segments at each mutation rate (Fig. 1). The figure shows a higher error rate for the D segments than for the V or J segments. The J segments were found to have the lowest error rates, which is due to the limited choices in the antibody combinations and is consistent with previous studies[6, 7, 9]. Furthermore, all error rates of Ab-origin are lower than those of SoDA at every mutation level (with the exception of the J segments at the 6% mutation rate), and the variance of the error rates across mutation rates is also smaller (standard deviations of 0.016, 0.016 and 0.010 for Ab-origin versus 0.023, 0.043 and 0.016 for SoDA for the V, D and J segments, respectively). In summary, Ab-origin has better performance in both accuracy and stability; in contrast, SoDA tends to have a higher error rate at higher mutation rates, which is also mentioned in its publication[7].
In the real case of natural antibody maturation process, the imprecise
recombination of V-D and D-J segments may lead to the loss of several nucleotides in


the D extremity, and meanwhile random insertion of N-nucleotides may occur between the V-D and D-J joints. In addition, different D gene segments may have identical sequences, such as IGHD5-5*01 and IGHD5-18*01, or IGHD4-4*01 and IGHD4-11*01, resulting in some unexpected inference mistakes. As a result, it is very difficult to recognize the D segment source. In contrast, the proportion of V and J affected by the recombination is much smaller, making the error rates in identifying the D segment significantly higher than those for both the V and J segments.

Fig. 1. Comparison of error rates between the results from Ab-origin (dashed line) and SoDA
(solid line) at five mutation rates

In general, Ab-origin has better performance at every mutation level in both accuracy and stability; moreover, compared to SoDA it takes much less time to run (data not shown). Its accuracy and stability are due to effective parameter settings, while the fast running speed is due to the optimization of BLAST in time and memory costs.

4 Conclusions
An improved tool Ab-origin was developed to efficiently identify V, D, and J gene
segments from a rearranged antibody sequence by searching against germline
database using appropriate rules. To evaluate the tool, we compared Ab-origin and
SoDA with a set of artificial antibody sequences which were produced by simulating
the antibody maturation process. The results show Ab-origin not only finds more
correct V, D and J segments, with 2, 7 and 1 percent higher compared to SoDA
respectively, but also reduces the computational cost. Though this paper only take the
human heavy chain for an example, the algorithm established here also suits for the
analysis of all immunoglobulin and TCR sequences in human, even other mammals


which utilize similar antibody production mechanisms. Complementing the previous tools for partitioning rearranged immunoglobulin sequences, Ab-origin may facilitate our understanding of the antibody maturation process and provide theoretical background for antibody engineering for therapeutic and research applications.
Acknowledgments. This work was supported in part by grants from Ministry of
Science and Technology China (2003CB715900, 2004CB720103), National Natural
Science Foundation of China (30500107, 30670953), and Science and technology
commission of Shanghai municipality (04DZ19850, 06PJ14072).

References
1. Maizels N.: Immunoglobulin gene diversification. Annu Rev Genet. Vol. 39. (2005)
23-46.
2. Goldsby R. A., Kindt T. J., Osborne B. A.Kuby J.: Chapter5, Immunology 5e. 5th edn;
(2003).
3. Lefranc M. P.: IMGT, the international ImMunoGeneTics database. Nucleic Acids Res.
Vol. 29. (2001) 207-209.
4. Giudicelli V., Chaume D.Lefranc M. P.: IMGT/V-QUEST, an integrated software
program for immunoglobulin and T cell receptor V-J and V-D-J rearrangement analysis.
Nucleic Acids Res. Vol. 32. (2004) W435-440.
5. Corbett S. J., Tomlinson I. M., Sonnhammer E. L., Buck D.Winter G.: Sequence of the
human immunoglobulin diversity (D) segment locus: a systematic analysis provides no
evidence for the use of DIR segments, inverted D segments, "minor" D segments or D-D
recombination. J Mol Biol. Vol. 270. (1997) 587-597.
6. Souto-Carneiro M. M., Longo N. S., Russ D. E., Sun H. W.Lipsky P. E.: Characterization
of the human Ig heavy chain antigen binding complementarity determining region 3 using
a newly developed software algorithm, JOINSOLVER. J Immunol. Vol. 172. (2004)
6790-6802.
7. Volpe J. M., Cowell L. G.Kepler T. B.: SoDA: implementation of a 3D alignment
algorithm for inference of antigen receptor recombinations. Bioinformatics. Vol. 22.
(2006) 438-444.
8. Altschul S. F., Gish W., Miller W., Myers E. W.Lipman D. J.: Basic local alignment
search tool. J Mol Biol. Vol. 215. (1990) 403-410.
9. Ohm-Laursen L., Nielsen M., Larsen S. R.Barington T.: No evidence for the use of DIR,
D-D fusions, chromosome 15 open reading frames or VH replacement in the peripheral
repertoire was found on application of an improved algorithm, JointML, to 6329 human
immunoglobulin H rearrangements. Immunology. Vol. 119. (2006) 265-277.
10. Odegard V. H.Schatz D. G.: Targeting of somatic hypermutation. Nat Rev Immunol.
Vol. 6. (2006) 573-583.
11. Jung D., Giallourakis C., Mostoslavsky R.Alt F. W.: Mechanism and control of V(D)J
recombination at the immunoglobulin heavy chain locus. Annu Rev Immunol. Vol. 24.
(2006) 541-570.
12. Bedell J., Korf I.Yandell M.: BLAST. O'Reilly; (2003) 360.

Analytically Tuned Simulated Annealing Applied to the Protein Folding Problem
Juan Frausto-Solis1, E.F. Román1, David Romero2, Xavier Soberon3, and Ernesto Liñán-García4
1 ITESM Campus Cuernavaca, Paseo de la Reforma 182-A Col. Lomas de Cuernavaca, 62589, Temixco Morelos, México
juan.frausto@itesm.mx, A00376933@itesm.mx
2 IMASS UNAM
davidr@matcuer.unam.mx
3 IBT UNAM
soberon@ibt.unam.mx
4 Universidad Autónoma de Coahuila
elinan@mail.uadec.mx

Abstract. In this paper a Simulated Annealing algorithm (SA) for solving the
Protein Folding Problem (PFP) is presented. This algorithm has two phases:
quenching and annealing. The first phase is applied at very high temperatures
and the annealing phase is applied at high and low temperatures. The
temperature during the quenching phase is decreased by an exponential
function. We use an efficient analytical method to tune the algorithm parameters. This method allows the temperature to change in accordance with solution quality, which can save large amounts of execution time for PFP.
Keywords: Peptide, Protein Folding, Simulated Annealing.

1 Introduction
The protein folding problem (PFP) is one of the most challenging problems in the bioinformatics area. The protein folding process starts with an initial protein state (i.e., a special configuration of the amino acids' atoms), passes through intermediate states and ends in a final state. The final state is known as the native structure, which is characterized by the minimal energy of the last configuration of the amino acids' atoms. The natural protein folding process is not yet completely understood; the protein follows an unknown path from any conformation to its native structure [1]. It seems that in natural folding the protein does not explore all its possible states [2]. In order to save time, computational folding simulation helps to find the native structure of a given protein while avoiding the generation of all possible states. Ab initio methods are very popular for predicting the final protein conformation. The protein states are characterized by their energy, which depends on the interaction among their atoms. Atomic energies are affected by the positions of atoms, torsion angles and distances among atoms. Force fields are used to measure the configuration energies of a protein;


these include many interactions among atoms, affecting different energies; the most important are: 1) torsional energy; 2) hydrogen-bond energy; 3) nonbonded energy; 4) electrostatic energy. The most popular and successful software systems for calculating force fields are AMBER [3], CHARMM [4], ECEPP/2 [5] and ECEPP/3 [6]. Heuristic methods are used for solving PFP; the most common are genetic algorithms, simulated annealing (SA), neural networks, and tabu search. SA provides excellent solutions [7]-[20] in a short execution time [21][22]; it is an analogy with thermodynamics and the way that liquids freeze and crystallize. The SA parameters must be tuned in order to find good solutions; these parameters are obtained either by an analytical method [27] or by experimentation [31]-[32]. Analytical methods define the parameters with formal models; in experimental methods, on the other hand, the parameters are defined by trial and error. Once SA is tuned, it is executed to obtain very good solutions; during the execution, the temperature changes in accordance with the stochastic equilibrium, which is detected by one of three methods [24]: (1) trial and error, (2) mean and standard deviation, and (3) an accepted-solutions versus proposed-solutions criterion. Recently, a new method was developed [23] to set the cooling-scheme parameters of SA algorithms; this method establishes that both the initial and the final temperature are a function of the maximum and minimum cost increments obtained from the neighborhood structure. The method has been applied to solve NP-hard problems such as the satisfiability problem (SAT) [23][24]. This paper deals with a new SA algorithm for PFP. The proposed algorithm has two phases, named the quenching and annealing phases. The first phase is an analogy of the physical quenching process, which is similar to the annealing process but in which the temperature is quickly decreased until a quasi-thermal equilibrium is reached. In the case of PFP, the energy changes in a chaotic way because it has extreme variations. The quenching phase is applied at very high temperatures, with the temperature decreased by an exponential function. Once quasi-equilibrium is reached by this function, the algorithm starts the annealing phase, which gradually reduces the temperature, adapting the analytical tuning methods [23] to PFP.

2 Analytical Tuning
2.1 Setting Initial and Final Temperatures
Analytical tuning can be helpful for setting the initial temperature. The probability of accepting any new solution is near 1 at high temperatures, so the admitted deterioration of the cost function is maximal. The initial temperature C(1) is associated with the maximum admitted deterioration and the defined acceptance probability. Let Si be the current solution and Sj a newly proposed one, and let Z(Si) and Z(Sj) be the costs associated to Si and Sj; the maximum and minimum deteriorations are denoted ΔZmax and ΔZmin. Then the probability P(ΔZmax) of accepting a new solution with the maximum deterioration is given by (1), and C(1) can therefore be calculated as in (2). In a similar way, the final temperature is established according to the probability P(ΔZmin) of accepting a new solution with the minimum deterioration (see (3)).


exp(-ΔZmax / C(1)) = P(ΔZmax)                                           (1)

C(1) = -ΔZmax / ln(P(ΔZmax))                                            (2)

C(f) = -ΔZmin / ln(P(ΔZmin))                                            (3)

With these parameters, SA is able to find solutions near the optimum or, in some cases, the optimal one. The initial temperature can be extremely high because, according to (2), C(1) is extremely affected by ΔZmax.
2.2 Setting the Markov Chain Length

SA can be devised with constant or variable Markov chains (MC). Let L(k) be the number of iterations at temperature k in the Metropolis loop (ML); it can be set as a multiple of the number of variables of the problem. In SA with constant MC, L(k) is set to a constant for all temperatures; in other implementations, the ML is stopped after a certain number of accepted solutions. On the other hand, analytical methods determine L(k) with a simple Markov model [23]: at high temperatures only a few iterations are required because the stochastic equilibrium is quickly reached; nevertheless, at low temperatures a more exhaustive exploration is needed, so a larger L(k) is used. Let L(1) be L(k) at C(1) and Lmax be the maximum MC length; C(k) is decreased by the cooling function (4), where the parameter α is between 0.7 and 0.99 [21][22], and L(k) is calculated with (5):

C(k + 1) = α C(k)                                                       (4)

L(k + 1) = β L(k)                                                       (5)

In (5), β is the increment coefficient of the MC (β > 1); so L(k+1) > L(k) for any L(1) ≥ 1, and the length of the last MC, L(f), is equal to Lmax. The functions (4) and (5) are applied successively in SA from C(1) to C(f); consequently C(f) and Lmax can be obtained as in (6) and (7).

C(f) = α^n C(1)                                                         (6)

Lmax = β^n L(1)                                                         (7)

In (6) and (7), n is the step number from C(1) to C(f); so we get (8) and (9).

n = (ln C(f) - ln C(1)) / ln α                                          (8)

β = exp( (ln Lmax - ln L(1)) / n )                                      (9)

This tuning approach prevents two situations: a) SA spending a large amount of time on computations even though the stochastic equilibrium has already been reached, or b) SA stopping far away from the equilibrium state. Thus, SA becomes faster than other implementations. As shown, the Metropolis parameters depend only on the definitions of C(1) and C(f) given in Section 2.1. Lmax must be set to a value that allows a good exploration (between 1 and 4 times the neighborhood size, or 63% to 99%) [23].
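The schedule of formulas (8) and (9) can be computed directly, for example as below; the function name is ours, the example reuses the C(1), C(f), L(1) and Lmax values reported in Section 3, and the value of alpha is only one of the settings mentioned in the paper.

```python
import math

def metropolis_schedule(c1, cf, alpha, l1, l_max):
    """Formulas (8) and (9): number of temperature steps n from C(1) to C(f)
    under C(k+1) = alpha*C(k), and the Markov-chain growth factor beta such
    that L(k+1) = beta*L(k) reaches Lmax after the same n steps."""
    n = (math.log(cf) - math.log(c1)) / math.log(alpha)
    beta = math.exp((math.log(l_max) - math.log(l1)) / n)
    return n, beta

# example: C(1)=1.76e25, C(f)=0.001, L(1)=360, Lmax=3600, alpha in [0.7, 0.99]
print(metropolis_schedule(1.76e25, 0.001, alpha=0.95, l1=360, l_max=3600))
```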

3 Implementation
The general cooling scheme was tested with two small proteins (Met5enkephaline and the C-peptide). SMMP [28][29] was used, with the conformation energy evaluated by the ECEPP/2 force field [5]. Neighbor solutions were selected randomly (angles in [-180, 180]), and C(1) and C(f) were calculated using P(ΔZmax) = 0.7 and P(ΔZmin) = 0.3. If the former probability were higher than 0.70 and closer to 1, it would allow excellent exploration, but SA would be inefficient; on the contrary, with lower values SA would have a shorter exploration but would not be able to find a good solution. The initial temperature (C(1) = 1.76x10^25) is extremely high because of the high values of the energies; consequently ΔZmax takes extremely high or low values. C(f) is set to 0.001: at the end of the process a small probability of accepting deteriorations is enough, and after trial and error 0.3 was chosen. In other words, the general cooling scheme establishes the adequate value of C(1) to perform a better stochastic walk. Nevertheless, the cooling function allows the temperature to decrease very fast at the beginning of the process (the chaos phase) and more gradually after the chaos phase. The cooling function in this phase is given by (10), which uses (11) and (12):

C(k + 1) = αk C(k)                                                      (10)

αk = α (1 - γk)                                                         (11)

γk = (γk-1)^2                                                           (12)

In (11) and (12), 0 < γ1 < 1 but close to one (e.g., 0.999); therefore the factor (1 - γk) in (11) gradually converges to one and the cooling function becomes equivalent to (4). The value of γ is changed according to the percentage of accepted solutions in the Markov chain, and its value is different for each implementation. When (1 - γk) reaches one, the quenching phase ends and the annealing phase starts. At the beginning of the quenching process, α is set to 0.7. There are five different tested analytical approaches: one with a constant MC length equal to 3,600 over the whole temperature range, another one with adaptable MC length, and the other three with growing MC length (from L(1)=360 to L(f)=3,600); these variants are obtained with different values of the parameter α, and then equations (8) and (9) are applied to calculate the Metropolis parameters.
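Under the reconstructed reading of formulas (10)-(12) given above (which is our interpretation of the garbled original), the quenching-phase temperatures can be sketched as follows; the function name and step count are ours.

```python
def quenching_temperatures(c1, alpha=0.7, gamma1=0.999, steps=12):
    """Quenching-phase cooling sketch, assuming C(k+1) = alpha*(1-gamma_k)*C(k)
    with gamma_k = gamma_{k-1}**2: cooling is extremely fast at first and then
    approaches the regular annealing rule C(k+1) = alpha*C(k)."""
    temps, c, gamma = [c1], c1, gamma1
    for _ in range(steps):
        c = alpha * (1.0 - gamma) * c
        temps.append(c)
        gamma = gamma ** 2
    return temps

# with C(1)=1.76e25 the first few steps drop the temperature by orders of magnitude
print(quenching_temperatures(1.76e25)[:5])
```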


4 Results
The following implementations were tested and compared: 1) the original SMMP code [26]; 2) experimental tuning with MC of constant length (ESAC); 3) experimental tuning with MC of adaptable length as in [6] (ESAP); 4) experimental tuning with MC of adaptable length as in [6] and low dispersion as in [6] (ESAD); 5) analytical tuning with MC of constant length (ASAC); 6) analytical tuning with MC of adaptable length as in [6] (ASAA); 7) analytical tuning with MC of growing length and regular cooling (α = 0.7, 0.85 and 0.95) (ASAR); 8) analytical tuning with MC of growing length and slow cooling (α = 0.7, 0.85 and 0.98) (ASAS); 9) analytical tuning with MC of growing length and regular/slow cooling (α = 0.7, 0.85, 0.95 and 0.98) (ASARS).
Table 1 shows the results for Met5enkephaline, and Table 2 shows the results for the C-peptide. These tables show the average and standard deviation of the results obtained after thirty runs in each case. Results are displayed in terms of the cost of the final solution and the time required to find it. All the results were validated with Ramachandran plots [30], and all the final configurations have angles within the feasible region. The final configuration of ASAR was also very similar to the one reported in the PDB (Protein Data Bank, www.pdb.org).
Table 1. Met5 Enkephaline: Average of the results, including best and worst solutions

Average of the results. Time (minutes) and Energies (Kcal/mol)
Approach   Average Time   Std. Dev.   Average Energy   Std. Dev.
SMMP       11.9           0.05        -9.1674          2.3145
ESAC       3.1            0.05        -8.9461          2.2017
ESAP       13.4           7.86        -7.8249          1.5978
ESAD       2.1            0.24        -6.9538          1.0404
ASAC       5.5            0.02        -9.8721          0.5233
ASAA       3.6            0.50        -6.2292          2.0609
ASAR       2.2            0.12        -8.0136          1.4801
ASAS       4.5            0.12        -8.7191          1.5968
ASARS      4.0            0.07        -8.1564          1.3745

           Best Solutions              Worst Solutions
Approach   Time    Energy              Time    Energy
SMMP       11.9    -10.3897            11.9    -6.6521
ESAC       3.1     -10.7110            3.1     -4.2083
ESAP       20.8    -10.7032            5.7     -5.2110
ESAD       2.3     -9.3143             1.9     -5.8780
ASAC       5.5     -10.7101            5.5     -6.2117
ASAA       4.6     -10.0857            3.1     -1.9257
ASAR       2.3     -10.6886            2.0     -3.2091
ASAS       4.6     -10.6768            4.4     -7.0253
ASARS      3.9     -10.6462            3.8     -5.4932


Table 2. C-Peptide: Average of the results, including best and worst solutions

Average of the results. Time (minutes) and Energies (Kcal/mol)
Approach   Average Time   Std. Dev.   Average Energy   Std. Dev.
SMMP       237.3          12.91       -97.8376         4.5717
ESAC       13.9           3.22        -76.8528         5.2947
ESAP       319.6          51.15       -88.8809         6.9504
ESAD       177.0          36.00       -73.6233         4.9742
ASAC       37.8           7.06        -82.6164         6.3692
ASAA       70.1           15.62       -78.9747         5.9442
ASAR       18.0           4.70        -77.3030         5.7723
ASAS       37.9           5.58        -80.6623         7.2507
ASARS      37.3           2.54        -80.1416         5.3733

           Best Solutions              Worst Solutions
Approach   Time    Energy              Time    Energy
SMMP       244.2   -101.9443           243.3   -86.7523
ESAC       12.7    -90.4995            12.7    -67.1276
ESAP       278.5   -103.7011           286.8   -76.2635
ESAD       155.8   -80.3128            158.5   -65.1571
ASAC       27.5    -102.7710           -27.9   -74.0381
ASAA       70.3    -95.5888            59.7    -70.1984
ASAR       18.2    -91.9294            21.6    -67.3917
ASAS       48.5    -96.7650            49.4    -70.3337
ASARS      37.5    -95.4636            38.6    -70.3716

When the searching process of these implementations is close to the end, the variables of the problem converge to specific values; Met5enkephaline has nineteen variables in total, of which only seventeen are clearly convergent. We carried out additional experiments using a genetic algorithm, which obtained the worst results: with Met5enkephaline the algorithm reached only -3.5 Kcal/mol on average, and with the C-Peptide an average of -57 Kcal/mol was obtained.

5 Conclusions
An SA algorithm for the protein folding problem is presented in this paper. The algorithm uses extremely high temperatures in a chaos (quenching) phase, allowing the exploration of a bigger portion of the solution space than previous SA approaches; it has two phases, one for the chaos phase, where temperatures are very high, and the other for lower temperatures. The results presented in the paper for two peptides show that the new approach is able to find solutions of better quality than the classical SA algorithms. According to this experimentation, the general cooling scheme presented here obtains very good results. This method is useful for setting the initial temperature of SA applied to the protein folding problem and helps to reach more suitable solutions. It also provides a good technique to save execution time at high temperatures using dynamic Markov chains. The values of the cost function for the best configurations obtained with the analytical implementations


are fairly close to each other, and they are very close to those obtained by experimental tuning procedures. For the quenching phase, a cooling function that gradually decreases the temperature of the system is presented. The most remarkable advantage of this tuning method is the time saved in setting the initial and final temperatures. We are now validating these approaches with larger common proteins, which represents a greater and very interesting challenge.

References
1. Anfinsen, C.: Principles that govern the folding of protein chains. Science 181, (1973)
223 230.
2. Levinthal, C.: Are there pathways for protein folding?. J. Chem. Phys. 65, (1968) 44 45.
3. Ponder, J.: Case, Force fields for protein simulations. Adv. Prot. Chem. 66, (2003) 27 85.
4. Brooks, R., Bruccoleri, R., Olafson, B., States, D., Swaminathan, S., Karplus, M.: A
program for macromolecular energy, minimization, and dynamics calculations. J. Comp.
Chem. 4, (1983) 187 217.
5. Momany, F., McGuire, R., Burgess, A., Scheraga, H.: Energy Parameters in Polypeptide.
VII. Geometric Parameters, Partial Atomic Charges, Nonbonded Interactions, Hydrogen
Bond Interactions, and Intrinsic Torsional Potentials for the Naturally Occurring Amino
Acids. The Journal of Physical Chemistry. Vol 79, No. 22, (1975).
6. Nemethy, G., Gibson, K., Palmer, K., Yoon, C., Paterlini, G., Zagari, A., Rumsey, S.,
Scheraga, H.: Energy parameters in polypeptides. 10. Improved geometrical parameters
and nonbonded interactions for use in the ECEPP/3 algorithm with application to prolinecontaining peptides. J. Phys. Chem. 18, 323. (1992).
7. Morales, L., Garduño, R., Romero, D.: Application of simulated annealing to the multiple minima problem in small peptides. J. Biomol. Str. and Dyn. 8, (1991) 721-735.
8. Morales, L., Garduño, R., Romero, D.: The multiple minima problem in small peptides revisited. The threshold accepting approach. J. Biomol. Str. and Dyn. 9, (1992).
9. Hansmann, U., Okamoto, Y.: Prediction of Peptide Conformation by the Multicanonical
Algorithm. arXiv: cond-mat/9303024 v1, (1993).
10. Okamoto, Y.: Protein Folding Problem as Studied by New Simulation Algorithms. Recent
Research Developments in Pure & Applied Chemistry. Proc. Acad. Sci. USA 1987, 84,
(1998) 6611-6615.
11. Garduño, R., Romero, D.: Heuristic methods in conformational space search of peptides. J. Mol. Str. 308, (1994) 115-123.
12. Simons, K., Kooperberg, C., Huang, E., Baker, D.: Assembly of Protein Tertiary
Structures from Fragments with Similar Local Sequences using Simulated Annealing and
Bayesian Scoring Functions. J. Mol. Biol. 268, (1997) 209 225.
13. Pillardy, J., Czaplewski, C., Liwo, A., Lee, J., Ripoll, D., Kazmierkiewicz, R., Odziej, S.,
Wedemeyer, W., Gibson, K., Arnautova, Y., Saunders, J., Ye, Y., Scheraga, H.: Recent
improvements in prediction of protein structure by global optimization of a potential
energy function. PNAS vol 98. No. 5. (2000) 2329 2333.
14. Hiroyasu, T., Miki, M., Ogura, S., Aoi, K., Yoshida, T., Okamoto, Y., Dongarra, J.:
Energy Minimization of Protein Tertiary Structure by Parallel Simulated Annealing using
Genetic Crossover. Proceedings of 2002 Genetic and Evolutionary Computation
Conference (GECCO 2002) Workshop Program. (2002) 49-51.


15. Vila, J., Ripoll, D., Scheraga, H.: Atomically detailed folding simulation of the B domain
of staphylococcal protein A from random structures. PNAS vol 100. No. 25.
14812 14816.
16. Hung, L., Samudrala, R.: PROTINFO: Secondary and tertiary protein structure prediction.
Nucleic Acids Research, Vol. 31, No. 13: (2003) 3296 3299.
17. Chen, W., Li, K., Liu, J.: The simulated annealing method applied to protein structure
prediction. Third international conference on machine learning and cybernetics, Shanghai.
(2004).
18. Liwo, A., Khalili, M., Scheraga, H.: Ab initio simulations of protein-folding pathways by
molecular dynamics with the united-residue model of polypeptide chains. PNAS 2005,
vol. 102. No. 7. (2004) 2362 2367.
19. Alves, R., Degrve, L., Caliri, A.: LMProt: An Efficient Algorithm for Monte Carlo
Sampling of Protein Conformational Space. Biophysical Journal; ProQuest Medical
Library. 87, 3. (2004).
20. Lee, J., Kim, S., Lee, J.: Protein structure prediction based on fragment assembly and
parameter optimization. Biophisycal Chemestry 115 (2005) 209 214.
21. Kirkpatrick, S., Gelatt, C., Vecchi, M.: Optimization by simulated annealing. Science,
Number 4598, 220, 4598. (1983) 671 680.
22. Cerny, V.: Thermo dynamical approach to the traveling salesman problem: An efficient
simulation algorithm. Journal of Optimization Theory and Applications, 45(1). (1985)
41 51.
23. Sanvicente, H., Frausto, J.: A Method to Establish the Cooling Scheme in Simulated
Annealing Like Algorithms. ICCSA 2004. Springer Verlag. LNCS, ISSN: 0302-9743.
(2004).
24. Sanvicente, H.: Metodología de paralelización del ciclo de temperatura en algoritmos tipo recocido simulado. Tesis doctoral, ITESM Campus Cuernavaca, México. (2003).
25. Sanvicente, H., Frausto, J.: Optimización de los diámetros de las tuberías de una red de distribución de agua mediante algoritmos de recocido simulado. Ingeniería hidráulica en México. Vol XVIII, num. 1, (2003) 105-118.
26. Sanvicente, H., Frausto, J., Imperial, F.: Solving SAT Problems with TA Algorithms
Using Constant and Dynamic Markov Chains Length. AAIM05. Springer Verlag. LNCS,
ISSN: 0302-9743, (2005).
27. Frausto, J., Sanvicente, H., Imperial, F.: ANDYMARK: An analytical method to establish
dynamically the length of the Markov chain in simulated annealing for the satisfability
problem. Lecture Notes in Computer Science, Springer Verlag. LNCS, ISSN:0302-9743,
(2006).
28. Eisenmenger, F., Hansmann, U., Hayryan, S., Hu, C.: SMMP: A modern Package for
Protein Simulation. Comp. Phys. Comm. 138, 192, (2001).
29. Eisenmenger, F., Hansmann, U., Hayryan, S., Hu, C.: An Enhanced Version of SMMP
Open source software package for simulation of proteins. Comp. Phys. Comm, (2006)
174-422,
30. Ramachandran, G. N., Ramakrishnan, C., Sasisekharan, V.: Stereochemistry of
polypeptide chain configuration. J. Mol. Biol.7, (1963) 95 99.
31. Perez Joaquin O., Pazos Rodolfo, Velez Laura, Rodrguez Guillermo: Automatic
Generation of Control Parameters for the Threshold Accepting Algorithm, LNCS 2313,
Springer Verlag, MICAI (2002) 118-127.
32. Perez Joaquin O., Pazos R.A., Romero David, Santaolaya Rene., Rodrguez Guillermo,
Sosa V:,: Adaptive and Scalable Allocation of Data-Objects in the Web, LCNS 2667
Springer Verlag, ICCSA (2003) 134-143.

Training the Hidden Vector State Model from Un-annotated Corpus

Deyu Zhou, Yulan He, and Chee Keong Kwoh
School of Computer Engineering, Nanyang Technological University,
Nanyang Avenue, Singapore 639798
{zhou0063,asylhe,asckkwoh}@ntu.edu.sg

Abstract. Since most knowledge about protein-protein interactions still hides in biological publications, there is an increasing focus on automatically extracting information from the vast amount of biological literature. Existing approaches can be broadly categorized as rule-based or statistically-based. Rule-based approaches require heavy manual effort. On the other hand, statistically-based approaches require large-scale, richly annotated corpora in order to reliably estimate model parameters. This is normally difficult to obtain in practical applications. We have proposed a hidden vector state (HVS) model for protein-protein interactions extraction. The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. State transitions are factored into a stack shift operation similar to those of a push-down automaton followed by the push of a new preterminal category label. In this paper, we propose a novel approach based on the k-nearest-neighbors classifier to automatically train the HVS model from un-annotated data. Experimental results show improved performance over the baseline system with the HVS model trained from a small amount of the annotated data.
Keywords: information extraction, Hidden Vector State Model, protein-protein interactions.

1 Introduction

Understanding protein functions and how they interact with each other gives
biologists a deeper insight into the understanding of the living cell as a complex
machine and of disease processes, and provides targets for effective drug design. To
date, vast knowledge of protein-protein interactions is still locked in full-text journals. As a result, automatically extracting information about protein-protein interactions is crucial to meet the demand of researchers.
Most existing approaches are either based on simple pattern matching or employ parsing methods. Approaches using pattern matching [1] rely on a set
of predefined patterns or rules to extract protein-protein interactions. Parsing-based methods employ either deep or shallow parsing. Shallow parsers [2] break
sentences into non-overlapping phrases and extract local dependencies among
phrases without reconstructing the structure of an entire sentence. Systems based
on deep parsing [3] deal with the structure of an entire sentence and therefore are
potentially more accurate. The major drawback of the aforementioned methods
is that they may require complete manual redesign of grammars or rules in order
to be tuned to dierent domains. On the contrary, statistical models can perform
the protein-protein interactions extraction task without human intervention once
they are trained from annotated corpora. Many empiricist methods [4] have been
proposed to automatically generate the language model to mimic the features of
un-structured sentences. For example, Seymore [5] used a Hidden Markov Model
(HMM) for the task of extracting important fields from the headers of computer
science research papers. In [6], a statistical method based on the hidden vector
state model (HVS) to automatically extract protein-protein interactions from
biomedical literature has been proposed. However, methods of this category
do not perform well, partially due to the lack of large-scale, richly annotated
corpora.
How to learn from both annotated and un-annotated data, i.e. semi-supervised
learning, has been investigated. The proposed methods include EM (expectation-maximization) with generative mixture models [7], self-training [8], co-training
[9], transductive support vector machines, graph-based methods [10] and so
on. Nigam et al. [7] applied the EM algorithm on mixtures of multinomials
for the task of text classification. They showed that classifiers trained from
both the labeled and unlabeled data perform better than those trained solely
from the labeled data. Yarowsky [11] used self-training for word sense disambiguation. Rosenberg et al. [8] applied self-training to object detection from
images. Jones [9] used co-training, co-EM and other related methods for information extraction from text. Blum et al. [10] proposed an algorithm based on
finding minimum cuts in graphs to propagate labels from the labeled data to the
unlabeled data. For a detailed survey on semi-supervised learning, please refer
to [12].
In this paper, we present a novel method to automatically train the HVS
model from un-annotated data. Considering a semantic annotation as a class
label for each sentence, we employ Part-Of-Speech (POS) tagging to convert
the original sentence into a POS tag sequence; we then use the modified k-nearest-neighbors (KNN) classifier to assign a semantic annotation to an unseen sentence based on its POS tag sequence. The rest of the paper is organized
as follows. Section 2 briefly describes the HVS model and how it can be applied to extract protein-protein interactions from the biomedical literature. Section 3 presents the proposed approach for automatically training the HVS model
from un-annotated data. Experimental results are discussed in Section 4. Finally,
Section 5 concludes the paper.

2 The Hidden Vector State Model

The Hidden Vector State (HVS) model [13] is a discrete Hidden Markov Model
(HMM) in which each HMM state represents the state of a push-down automaton
with a finite stack size. Each vector state in the HVS model is in fact equivalent


to a snapshot of the stack in a push-down automaton and state transitions may


be factored into a stack shift by n positions followed by a push of one or more
new preterminal semantic concepts relating to the next input word. Such stack
operations are constrained in order to reduce the state space to a manageable
size. Natural constraints to introduce are limiting the maximum stack depth and
only allowing one new preterminal semantic concept to be pushed onto the stack
for each new input word. Such constraints effectively limit the class of supported
languages to be right branching. The joint probability P (N, C, W ) of a series of
stack shift operations N , concept vector sequence C, and word sequence W can
be decomposed as follows
P(N, C, W) = \prod_{t=1}^{T} P(n_t \mid W_1^{t-1}, C_1^{t-1}) \, P(c_t[1] \mid W_1^{t-1}, C_1^{t-1}, n_t) \, P(w_t \mid W_1^{t-1}, C_1^{t})    (1)

where:
- C_1^t denotes a sequence of vector states c_1 .. c_t; c_t at word position t is a vector of D_t semantic concept labels (tags), i.e. c_t = [c_t[1], c_t[2], .., c_t[D_t]], where c_t[1] is the preterminal concept and c_t[D_t] is the root concept;
- W_1^{t-1} C_1^{t-1} denotes the previous word-parse up to position t-1;
- n_t is the vector stack shift operation and takes values in the range 0, .., D_{t-1}, where D_{t-1} is the stack size at word position t-1;
- c_t[1] = c_{w_t} is the new preterminal semantic tag assigned to word w_t at word
position t.
The details of how this is done are given in [13]. The result is a model which
is complex enough to capture hierarchical structure but which can be trained
automatically from only lightly annotated data.
To train the HVS model, an abstract annotation needs to be provided for each
sentence. For example, for the sentence,
CUL-1 was found to interact with SKR-1, SKR-2, SKR-3, SKR-7, SKR-8 and SKR-10 in
yeast two-hybrid system.

The Annotation is:


PROTEIN NAME(ACTIVATE(PROTEIN NAME))

The HVS model does not require explicit semantic tag/word pairs to be given
in the annotated corpus. All it needs are abstract semantic annotations for training. This means that many sentences might share the same semantic annotation
and they therefore could possibly exhibit similar syntactic structures which
can be revealed through part-of-speech (POS) tagging. Figure 1 gives an example of several sentences sharing the same semantic annotation and their corresponding abbreviated POS tag sequences which were generated by removing
unimportant POS tags from the original POS tag sequences. Here the symbol KEY denotes the protein-protein interaction keyword and PTN denotes the
protein name.


Shared annotation: SS(KEY(PROTEIN NAME(PROTEIN NAME)) SE)

Sentence: The physical interaction of cdc34 and ICP0 leads to its degradation.
Abbreviated POS tag sequence: ACKEY IN PTN CC PTN

Sentence: Finally, an in vivo interaction between pVHL and hnRNP A2 was demonstrated in both the nucleus and the cytoplasm.
Abbreviated POS tag sequence: ACKEY IN PTN CC NN PTN

Sentence: The in vivo interaction between DAP-1 and TNF-R1 was further confirmed in mammalian cells.
Abbreviated POS tag sequence: ACKEY IN PTN CC PTN

Fig. 1. An example of multiple sentences sharing the same annotation

3 Methodologies

In this section, the procedure for training the HVS model from an un-annotated
corpus is described, which employs the k-nearest-neighbors algorithm with POS
sequence alignment.
Considering the semantic annotation as the class label for each sentence, semantic annotation can be converted into a traditional classification problem. Given
a small set of annotated data and a large set of un-annotated data, we would like
to predict the annotations for the sentences from the un-annotated data based
on their similarities to the sentences in the annotated corpus. At the beginning,
full papers are retrieved from MedLine and split into sentences. Protein names
and keywords describing protein-protein interaction are then identified based on
a preconstructed dictionary. After that, each sentence is parsed by a POS tagger
and the POS tag sequence is generated. Finally, based on the POS tag sequence,
the sentence will be assigned an annotation based on the similarity measure to
the existing sentences in the annotated corpus.
The details of each step are described below.
1. Identifying protein names and protein interaction keywords.
Protein names need to be identified first in order to extract protein-protein
interaction information. In our system, protein names are identified based on
a manually constructed dictionary. Since protein interaction keywords play
an important role in a later step, a keyword dictionary describing interaction
categories has been built based on [3].
2. Part-of-speech tagging.
The part-of-speech (POS) tags for each sentence are generated by Brill's
tagger [14]. Brill's tagger can only achieve 83% overall accuracy on
biomedical text since there are many unknown biomedical domain-specific
words. We plan to replace it with the POS tagger trained from the biomedical
domain in our future work.
3. Automatically generating semantic annotations for the un-annotated
sentences.
The semantic annotations for the un-annotated sentences are assigned by
the KNN classifier, as sketched below.
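A compact sketch of this three-step pipeline, with all domain-specific helpers assumed to be provided elsewhere (the function names are illustrative only), is:

```python
def annotate_unlabelled(sentences, tag_entities, pos_tag, knn_classify):
    """Sketch of steps 1-3 above for a batch of un-annotated sentences.

    tag_entities marks protein names and interaction keywords (step 1),
    pos_tag produces the abbreviated POS tag sequence (step 2), and
    knn_classify proposes a semantic annotation or None (step 3).
    """
    annotated = []
    for sentence in sentences:
        marked = tag_entities(sentence)          # step 1
        pos_sequence = pos_tag(marked)           # step 2
        annotation = knn_classify(pos_sequence)  # step 3
        if annotation is not None:
            annotated.append((sentence, annotation))
    return annotated
```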

3.1 k-Nearest-Neighbor Classification with Constraints

Since automatically generating annotations for un-annotated sentences can be
converted into a classification problem, we applied the k-nearest-neighbor (KNN)
algorithm to handle this task. The training data consists of N pairs (x_1, y_1),
(x_2, y_2), ..., (x_N, y_N), where x_i denotes a POS tag sequence and y_i denotes a
semantic annotation. Given a query point x_q, the KNN algorithm finds the k
training points x_(r), r = 1, ..., k, closest in distance to x_q, and then classifies using
majority voting among the k neighbors.
In our implementation, the distance between two POS tag sequences
is derived based on dynamic programming for sequence alignment instead of
the commonly used Euclidean distance. In Section 3.2, we discuss in detail the
distance measure, which is more appropriate for our purpose. Also, instead
of majority voting, some rules are defined to classify a sentence among its k
neighbors, as shown in Figure 2. The reason behind this is that only a small amount
of training data is available here, and majority voting would require a large
amount of training data in order to get reliable results.
Input: a POS tag sequence S, its k-neighbor set KN = {N_1, N_2, ..., N_k}
and their corresponding class labels C_i = Class(N_i), i = 1, ..., k.
Output: S's class label C_S
Algorithm:
1. Sort the k neighbors in descending order based on their respective
similarity score with S and denote the sorted list as N_1, N_2, ..., N_k
2. Count the number of protein names c_pt and protein interaction keywords c_kw of S
3. For i = 1 to k do
a) Count the number of protein names c^i_pt and protein interaction
keywords c^i_kw of N_i
b) If c^i_pt equals c_pt and c^i_kw equals c_kw, assign C_S the class
label of N_i and exit the loop
4. If C_S is not assigned any value, assign C_S NULL
5. Return C_S
Fig. 2. Procedure of classification using KNN
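A possible Python rendering of the procedure in Fig. 2 is given below; the helper names (similarity, class_label, count_proteins, count_keywords) are illustrative and not from the paper.

```python
def knn_with_constraints(query, neighbors, similarity, class_label,
                         count_proteins, count_keywords):
    """Assign a class label to `query` following the rules of Fig. 2.

    `neighbors` are the k nearest POS tag sequences of `query`; the helper
    functions are assumed to be supplied by the caller.
    """
    # Step 1: sort neighbors by decreasing similarity to the query.
    ranked = sorted(neighbors, key=lambda n: similarity(query, n), reverse=True)
    # Step 2: protein-name and interaction-keyword counts of the query.
    c_pt, c_kw = count_proteins(query), count_keywords(query)
    # Step 3: take the best-ranked neighbor whose counts match.
    for n in ranked:
        if count_proteins(n) == c_pt and count_keywords(n) == c_kw:
            return class_label(n)
    # Steps 4-5: no matching neighbor, so no annotation is assigned.
    return None
```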

3.2 POS Sequence Alignment

The similarity between two POS sequences is calculated based on sequence alignment. Suppose a = a_1 a_2 ... a_n and b = b_1 b_2 ... b_m are two POS tag sequences
of length n and m; define S(i, j) as the score of the optimal alignment between
the initial segment from a_1 to a_i of a and the initial segment from b_1 to b_j of b,
where S(i, j) is recursively calculated as follows:
S(i, 0) = 0, \quad i = 1, 2, ..., n    (2)

S(0, j) = 0, \quad j = 1, 2, ..., m    (3)

S(i, j) = \max \{\, 0,\; S(i-1, j-1) + s(a_i, b_j),\; S(i-1, j) + s(a_i, -),\; S(i, j-1) + s(-, b_j) \,\}    (4)

Here s(a_i, b_j) is the score of aligning a_i with b_j and is defined as \log \frac{p(a_i, b_j)}{p(a_i)\, p(b_j)},
where p(a_i) denotes the appearance probability of tag a_i and p(a_i, b_j) denotes the
probability that a_i and b_j appear at the same position in two aligned sequences.
A score matrix can then be built and dynamic programming is used to find
the largest score between the two sequences.
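The recursion in Eqs. (2)-(4) can be implemented with standard dynamic programming, as in the following sketch (illustrative only; s is the log-odds score defined above, with '-' standing for a gap):

```python
def align_score(a, b, s, gap='-'):
    """Optimal alignment score between POS tag sequences a and b (Eqs. 2-4).

    s(x, y) is the caller-supplied substitution/gap score; this plain
    dynamic-programming sketch is not the authors' code.
    """
    n, m = len(a), len(b)
    S = [[0.0] * (m + 1) for _ in range(n + 1)]   # S(i, 0) = S(0, j) = 0
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(0.0,
                          S[i - 1][j - 1] + s(a[i - 1], b[j - 1]),
                          S[i - 1][j] + s(a[i - 1], gap),
                          S[i][j - 1] + s(gap, b[j - 1]))
            best = max(best, S[i][j])
    return best
```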

4 Experiments

To evaluate the efficiency of the proposed methods, Corpus I was constructed based
on the GENIA corpus [15]. GENIA is a collection of research abstracts selected
from the search results of the MEDLINE database with the keywords (MeSH terms)
human, blood cells and transcription factors. These abstracts were then split
into sentences and those containing more than two protein names were kept.
Altogether 2600 sentences were left.
Corpus I was split into two parts. Part I contains 1600 sentences, which
can be further split into two data sets: EL, consisting of 400 sentences with annotations, and EU, consisting of the remaining 1200 sentences without annotations.
Part II consists of 1000 sentences and was used as the test data set.
4.1 Choosing Proper k

The EL data in Part I of Corpus I were split randomly into a training set
and a validation set at the ratio of 9:1. The validation set consists of 40
sentences and the remaining 360 sentences were used as the training set. Experiments were conducted ten times (i.e. Experiments 1, 2, 3, ..., 9 in Figure 3)
with different training and validation sets in each round. In each round, a set of
experiments was conducted with k set to 1, 3, 5, 7. Figure 3 shows the classification precision of KNN with different k values, where precision is defined as
Precision = TP/(TP + FP). Here, TP is the number of sentences that have
been assigned the correct annotations and FP is the number of sentences that
do not get the correct annotations. It can be observed from Figure 3 that the
overall best performance was obtained when k is set to 3.
4.2 Extraction Results

The baseline HVS model was trained on EL from Part I of Corpus I, which
consists of 400 sentences. Sentences from data set EU were then automatically assigned semantic annotations using the KNN method described in


Section 3.1. The HVS model was incrementally trained with these newly added
training data. In total, 187 sentences from the un-annotated training data were
successfully assigned semantic annotations.
The results reported here are based on the values of TP (true positive), FN
(false negative), and FP (false positive). TP is the number of correctly extracted
interactions. (TP+FN) is the number of all interactions in the test set and
(TP+FP) is the number of all extracted interactions. F-score is computed as
2 · Recall · Precision / (Recall + Precision), where Recall is defined as TP/(TP+FN) and Precision is
defined as TP/(TP+FP). All these values are calculated automatically.
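For reference, these evaluation measures can be computed as in the following small sketch (illustrative only):

```python
def precision_recall_fscore(tp, fp, fn):
    """Evaluation measures used above, from TP/FP/FN counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * recall * precision / (recall + precision)
    return precision, recall, f_score
```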
Figure 4 shows the protein-protein interactions extraction performance versus
the number of un-annotated sentences added. It can be observed that, in general,
the F-score value increases as more un-annotated data
from EU are added. The best performance was obtained when adding in the 187 un-annotated
sentences, where the F-score reaches 58.9%.
[Figure 3: precision (y-axis) for k = 1, 3, 5, 7 across the experiments (x-axis); plot not reproduced.]
Fig. 3. Classification precision vs. different k values

[Figure 4: precision, recall and F-score (y-axis) vs. the number of sentences added to the training data, 0-180 (x-axis); plot not reproduced.]
Fig. 4. Performance vs. the amount of added un-annotated sentences

5 Conclusions and Future Work

In this paper, we presented a novel method to automatically train the HVS
model from un-annotated data. Using the modified KNN algorithm, semantic
annotations can be automatically generated for un-annotated sentences. The
HVS model can then be refined with the increasingly added un-annotated data,
and this eventually leads to an increase in the F-measure when used for protein-protein interactions extraction. In future work, we will investigate other semi-supervised learning methods to improve the classification performance. In addition, the current approach can only assign existing semantic annotations to
un-annotated sentences. It would be interesting to incorporate a bootstrapping-style approach to derive semantic annotations not limited to those in the
annotated corpus.


References
1. Minlie Huang, Xiaoyan Zhu, and Yu Hao. Discovering patterns to extract protein-protein interactions from full text. Bioinformatics, 20(18):3604-3612, 2004.
2. J. Pustejovsky, J. Castano, J. Zhang, M. Kotecki, and B. Cochran. Robust Relational Parsing Over Biomedical Literature: Extracting Inhibit Relations. In
Proceedings of the Pacific Symposium on Biocomputing, pages 362-373, Hawaii,
U.S.A., 2002.
3. Joshua M. Temkin and Mark R. Gilder. Extraction of protein interaction information from unstructured text using a context-free grammar. Bioinformatics,
19(16):2046-2053, 2003.
4. Nikolai Daraselia, Anton Yuryev, Sergei Egorov, Svetalana Novichkova, Alexander
Nikitin, and Ilya Mazo. Extracting human protein interactions from MEDLINE
using a full-sentence parser. Bioinformatics, 20(5):604-611, 2004.
5. Kristie Seymore, Andrew McCallum, and Roni Rosenfeld. Learning Hidden Markov
Model Structure for Information Extraction. In AAAI 99 Workshop on Machine
Learning for Information Extraction, 1999.
6. Deyu Zhou, Yulan He, and Chee Keong Kwoh. Extracting Protein-Protein Interactions from the Literature using the Hidden Vector State Model. In International
Workshop on Bioinformatics Research and Applications, Reading, UK, 2006.
7. Kamal Nigam, Andrew K. McCallum, Sebastian Thrun, and Tom M. Mitchell. Text
classification from labeled and unlabeled documents using EM. Machine Learning,
39(2/3):103-134, 2000.
8. Chuck Rosenberg, Martial Hebert, and Henry Schneiderman. Semi-supervised self-training of object detection models. In Seventh IEEE Workshop on Applications
of Computer Vision, 2005.
9. Rosie Jones. Learning to extract entities from labeled and unlabeled text. PhD
thesis, Carnegie Mellon University, 2005.
10. Avrim Blum and Shuchi Chawla. Learning from labeled and unlabeled data using graph mincuts. In Proceedings of the 18th International Conference on Machine
Learning, pages 19-26. Morgan Kaufmann, San Francisco, CA, 2001.
11. David Yarowsky. Unsupervised word sense disambiguation rivaling supervised
methods. In Meeting of the Association for Computational Linguistics, pages
189-196, 1995.
12. Xiaojin Zhu. Semi-supervised learning literature survey. Technical Report 1530,
Computer Sciences, University of Wisconsin-Madison, 2005.
13. Yulan He and Steve Young. Semantic processing using the hidden vector state
model. Computer Speech and Language, 19(1):85-106, 2005.
14. Eric Brill. Some Advances in Transformation-Based Part of Speech Tagging. In
National Conference on Artificial Intelligence, pages 722-727, 1994.
15. J.D. Kim, T. Ohta, Y. Tateisi, and J. Tsujii. GENIA corpus - a semantically annotated
corpus for bio-textmining. Bioinformatics, 19(Suppl 1):i180-i182, 2003.

Using Computer Simulation to Understand Mutation Accumulation Dynamics and Genetic Load

John Sanford1, John Baumgardner2, Wes Brewer3, Paul Gibson4, and Walter ReMine5
1 Dept. Hort. Sci., Cornell University, Geneva, NY, 14456, USA
jcs21@cornell.edu
2 Los Alamos National Laboratory, Los Alamos, NM, USA, retired
3 Computational Engineering, Mississippi State University, MS, USA
4 Dept. Plant, Soil and Agric. Syst., Southern Illinois University, Carbondale, IL, USA
5 Science and Math Dept., Northwestern College, St. Paul, MN, USA

Abstract. Long-standing theoretical concerns about mutation accumulation
within the human population can now be addressed with numerical simulation.
We apply a biologically realistic forward-time population genetics program to
study human mutation accumulation under a wide range of circumstances.
Using realistic estimates for the relevant biological parameters, we investigate
the rate of mutation accumulation, the distribution of the fitness effects of the
accumulating mutations, and the overall effect on mean genotypic fitness. Our
numerical simulations consistently show that deleterious mutations accumulate
linearly across a large portion of the relevant parameter space. This appears to
be primarily due to the predominance of nearly-neutral mutations. The problem
of mutation accumulation becomes severe when mutation rates are high.
Numerical simulations strongly support earlier theoretical and mathematical
studies indicating that human mutation accumulation is a serious concern. Our
simulations indicate that reduction of mutation rate is the most effective means
for addressing this problem.
Keywords: genetic load, Mendel's Accountant, mutation accumulation,
population genetics, simulation.

1 Introduction
The problem of genetic load has concerned geneticists for over 50 years [1][2].
Theoretically, high mutation rates and the natural inefficiencies of selection both appear
to ensure the accumulation of deleterious mutations within the genomes of higher
organisms [3]. These concerns have been accentuated by the apparent reduction of
selection pressures within human populations within historical time frames [4]. All
these earlier concerns were based upon purely theoretical considerations.
Advances in computer science and the increasing power of simulation programs provide us with a new way of understanding the problem of mutation
accumulation. The use of numerical simulation allows us to test empirically previous
mathematical analyses, which are otherwise inherently abstract and difficult to test.


Such simulations allow us to examine in precise detail complex biological scenarios


which otherwise would require extreme simplification and generalization before any
type of mathematical analysis would be possible.

2 The Program
The computer program Mendel's Accountant (hereafter referred to simply as
"Mendel") has been developed to provide a biologically realistic forward-time
numerical simulation of mutation accumulation [5]. This is a highly flexible program
which for the first time effectively models natural mutation distributions,
environmental variance, and improved modeling of linkage/recombination. Mendel is
designed to model sexually reproducing diploid organisms. Mendel tracks individual
mutations in a detailed manner from parents to progeny through many generations.
Mutations are modeled so as to have a continuous range of effects from lethal to
beneficial and to vary in expression from fully dominant to fully recessive. Each
mutation's unique identifier encodes its genotypic fitness effect, whether it is
recessive or dominant, and its location within the genome (the specific linkage block
where it resides within a specific chromosome). This allows realistic treatment of
linkage of mutations along segments of chromosomes. Mutational effects may be
combined either in a multiplicative or additive manner to yield an overall genotypic
fitness for each new offspring.
Mendel is designed to track large numbers of distinct mutations by using a single
four-byte integer word to store the mutation's unique identifier. This allows up to
about four billion different unique mutations to be tracked in a given population. The
number of mutations per individual is limited by the available processor memory and
the population size before selection. As an example, 1.6 GB of memory available for
storing mutation identifiers translates into a maximum of 40,000 mutations in each of
10,000 individuals. Thus, Mendel is effectively an infinite sites model, in contrast
with k-allele and stepwise models that both impose highly restrictive limits on the
number and variety of mutations. Mendel offers the option of tracking only those
mutations whose fitness effect exceeds a user-specified threshold. This threshold
usually is chosen to lie in that region of extremely small mutation effect which is
beyond the reach of selection. Typically, we find that half to two-thirds of all
mutations lie in this un-selectable region. The fitness effects of the untracked
mutations under this option are nevertheless accounted for in the composite fitness
effect of the linkage block where the mutation occurs. This option allows the user to
investigate scenarios that involve two to three times more total mutations than would
be possible otherwise.
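A quick back-of-the-envelope check of the memory example above (purely illustrative):

```python
# One 4-byte integer is stored per tracked mutation identifier.
bytes_for_ids = 1.6e9            # 1.6 GB reserved for mutation identifiers
ids_total = bytes_for_ids / 4    # number of identifiers that fit in memory
per_individual = ids_total / 10_000
print(per_individual)            # 40000.0 mutations per individual
```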
Mendel offers the important option of including the effects of environmental
variation. Environmental variance, specified via a heritability parameter and a non-scaling noise standard deviation, combines with genotypic fitness to yield the
phenotypic fitness. Selection then acts on phenotypic fitness to eliminate that fraction
of the offspring (the population surplus) required to maintain the user-specified
population size. The surplus population is a consequence of the specified fertility, as
implied by the average number of offspring per female. Mendel provides the user the


choice of either natural selection (probability selection) or artificial selection


(truncation selection). Because Mendel is optimized for memory efficiency and speed,
many non-trivial scenarios can be run on a desktop or laptop computer. Moreover,
because Mendel is parallelized with MPI, it readily handles large population sizes and
complex population substructure on cluster computers. Mendel's graphical user
interface is designed to make the specification of a scenario intuitive and simple,
while also providing a variety of visual representations of the program output. Mendel
is therefore a versatile research tool. It is also useful as an interactive teaching
resource for a variety of settings ranging from introductory courses in biology to more
advanced courses in population genetics.

3 Analysis
Mendel's input parameters include: number of offspring per female, mutation rate,
fraction of mutations which are beneficial, fraction of mutations that are recessive,
high-impact mutation threshold, fraction of mutations with effect greater than
threshold (two parameters that specify the distribution of mutation effects), number of
linkage blocks, number of chromosomes, genome size, mutation effect combining
method, heritability of genotypic fitness, type of selection, number of generations, and
population size. Mendel's output report is provided at regular generation intervals and
includes summary statistics on number and types of mutations, mean population
fitness, fitness standard deviation, and related information. In addition, data for each
generation is stored in various files and is also plotted in output figures.
In the example we present below, we employ the following input parameters:
number of offspring per female = 6 (4 surplus offspring selected away), mutation rate
= 10 per offspring, fraction of mutations which are beneficial = 0.01, fraction of
mutations that are recessive = 0.8, high-impact mutation threshold = 0.1, fraction of
mutations with effect greater than threshold = 0.001, number of linkage blocks =
1000, number of chromosomes = 23, genome size = 3 billion, mutation effect
combination method = multiplicative, heritability of genotypic fitness = 0.2, type of
selection = probability, number of generations = 5,000, and population size = 1000.
Although the current human population size is more than six billion, we have found
that population sizes above 1,000 result in only marginal increases in selection
efficiency. It is reasonable to expect that, beyond a certain level, larger population
size will not result in more efficient selection, because of increased environmental
variance.
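For reference, the scenario described above could be written down as a simple configuration record like the following; the key names are descriptive placeholders and do not reflect Mendel's actual input-file syntax.

```python
# Hypothetical encoding of the example scenario described above.
example_scenario = {
    "offspring_per_female": 6,        # 4 surplus offspring selected away
    "mutation_rate": 10,              # new mutations per offspring
    "frac_beneficial": 0.01,
    "frac_recessive": 0.8,
    "high_impact_threshold": 0.1,
    "frac_above_threshold": 0.001,
    "linkage_blocks": 1000,
    "chromosomes": 23,
    "genome_size": 3_000_000_000,
    "effect_combination": "multiplicative",
    "heritability": 0.2,
    "selection_type": "probability",
    "generations": 5000,
    "population_size": 1000,
}
```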
Some of the output from this example is displayed in the following figures. Fig. 1a
shows the mean mutation count per individual plotted with respect to time. A
noteworthy aspect of this figure is a nearly exact linear accumulation of mutations, a
feature we observe consistently across a broad region of parameter space. The slope
of this line is governed primarily by the mutation rate. Selection intensity modifies the
slope of this line only to a limited degree. This is because of the preponderance of unselectable nearly-neutral deleterious mutations (as further described below).


Fig. 1. (a) Mutation count per individual and (b) mean population fitness, plotted for 5,000
generations. (a) shows that deleterious mutations accumulate in close to a strict linear fashion
(reaching 47,730; scale on left). Beneficial mutations also accumulate in a linear manner, but
their lower number results in sampling error fluctuations (reaching 498 scale on right). (b)
shows a progressive decline in population fitness.


Fig. 1b shows an initial non-linear genotypic fitness decline, which soon becomes
essentially a linear decline. We observe this pattern across most of the parameter
space we have explored. Mendel defines an individual's genotypic fitness as 1.0 plus
the combined positive and negative effects of all the individual's mutations. In this
case mutation effects are being combined multiplicatively. We have found that the
slope of this curve (fitness change over time) is determined primarily by three things:
the mutation rate, the average mutational effect, and the selection intensity.
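As an illustration of the two combination modes mentioned above, the following Python sketch combines signed per-mutation effects either multiplicatively or additively; it is only a sketch, not Mendel's actual implementation.

```python
def genotypic_fitness(effects, multiplicative=True):
    """Combine signed per-mutation fitness effects (negative = deleterious).

    Illustrative sketch of the two combination modes described in the text.
    """
    if multiplicative:
        fitness = 1.0
        for e in effects:
            fitness *= (1.0 + e)
        return fitness
    return 1.0 + sum(effects)   # additive combination
```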
Fig. 2 shows the distribution of mutation effects of accumulating deleterious
mutations. Mendel employs a distribution of mutation effects (prior to selection),
which reflects what is found in nature: a continuous distribution essentially
exponential in character. Input parameters such as genome size and the fraction of
high-impact mutations define the exact shape of the mutation-effect distribution
curve. Because of the shape of the mutation-effect curve, lethal mutations will always
be very rare, and a large fraction of deleterious mutations will have near-zero impact.
When strong selection is applied, regardless of the other input parameters, high
impact mutations are consistently eliminated quite effectively, especially the
dominant ones. However, across a wide range of parameter space the bins nearest to

Fig. 2. Distributions of accumulating mutations are shown above. Red bins represent the
expected mutation accumulation when no selection is applied. Blue bins represent actual
accumulation of recessive mutations. Green bins represent actual accumulation of dominant
mutations. The magnitude of each mutation's effect is shown on the x-axis, which is a linear
scale. The bin nearest zero represents mutations which change fitness by a factor between .0001
and .00001. Mutations with a magnitude of less than .00001 were not tracked or plotted.


zero fill at essentially the same rate, regardless of whether or not selection is being
applied. Experimentally, these nearly-neutral mutations are consistently found to be
un-selectable in accordance with mathematical theory [6][7]. Mutations with
intermediate levels of impact accumulate at intermediate rates. The transition zone
between selectable and un-selectable mutations is very wide, especially for recessive
mutations. The actual point at which mutations become un-selectable depends on
numerous input parameters, but is readily apparent in Mendel's output for any given
scenario.
Fig. 3 shows that over time many alleles move toward fixation. The movement
toward fixation is extremely slow for both deleterious and beneficial mutations,
consistent with the mathematical predictions of Haldane [8]. However, over long
periods of time, even with intense selection, a significant number of deleterious
mutations consistently become fixed.
All these findings strongly support previous theoretical and mathematical analyses
[1], [3], [9], [10] which have predicted that deleterious mutation accumulation in the
human population is a very real biological concern.

Fig. 3. Mutant allele frequencies are shown above, with rare alleles (<1%) on the far left, and
fixed or nearly fixed alleles (>99%) on the far right. Deleterious mutations are shown in red,
beneficial mutations are shown in green. In this instance 5,845 deleterious mutations have been
fixed after 5,000 generations. No beneficial mutations were fixed in this example.


4 Conclusions
The program Mendel's Accountant provides a biologically realistic platform for
analyzing the problem of mutation accumulation. This program demonstrates that the
problem of deleterious mutation accumulation is very serious under a wide range of
scenarios and across a vast portion of parameter space. The relentless accumulation of
deleterious mutations is primarily due to the existence of un-selectable nearly-neutral mutations, but the genetic load problem is greatly amplified when mutation
rates are high. Intensified natural selection only marginally slows the accumulation of
deleterious mutations. Preliminary Mendel experiments indicate that the most
effective means of slowing mutation accumulation and reducing a population's
genetic load is by reduction of the mutation rate. This study clearly indicates that
more research is needed. Mendel's Accountant is freely available to users and can be
downloaded at either http://mendelsaccountant.info or http://sourceforge.net/projects/
mendelsaccount.

References
1. Muller, H.J.: Our load of mutations. Amer. J. Human Genetics 2 (1950) 111-176.
2. Wallace, B.: Fifty years of genetic load. J. Hered. 78 (1987) 134-142.
3. Kondrashov, A.S.: Contamination of the genome by very slightly deleterious mutations:
why have we not died 100 times over? J. Theor. Biol. 175 (1995) 583-594.
4. Crow, J.F.: The high spontaneous mutation rate: a health risk? PNAS 94 (1997)
8380-8386.
5. Sanford, J., Baumgardner, J., Gibson, P., Brewer, W., Remine, W.: Mendel's Accountant:
a biologically realistic forward-time population genetics program. SCPE, 8(2) (submitted).
6. Kimura, M.: Model of effectively neutral mutations in which selective constraint is
incorporated. PNAS 76 (1979) 3440-3444.
7. Kimura, M.: Neutral Theory of Molecular Evolution. Cambridge University Press, New
York (1983) 30-31.
8. Haldane, J.B.S.: The cost of natural selection. J. Genetics 55 (1957) 511-524.
9. Muller, H. J.: The relation of recombination to mutational advance. Mutation Research 1
(1964) 2-9.
10. Loewe, L.: Quantifying the genomic decay paradox due to Muller's ratchet in human
mitochondrial DNA. Genetical Research 87 (2006) 133-159.

An Object Model Based Repository for Biological Pathways Using XML Database Technology
Keyuan Jiang
Department of Computer Information Technology, Purdue University Calumet, U.S.A.
jiang@calumet.purdue.edu

Abstract. With the availability of a growing collection of biological pathway


datasets, there is an increasing need for an efficient repository with powerful
search capability to facilitate large-scale analyses of biological pathways. A
novel data store was designed based upon the BioPAX object models, and
implemented with the emerging XML database technology. Our approach
significantly reduces the system complexity while maintaining powerful search
functionalities.
Keywords: Biological pathway datasets, BioPAX, XML database, XML
datatype.

1 Introduction
Biological pathways represent our current understanding of living organisms at the
cellular level. Increasingly, pathway datasets have become an important source for
biomedical research, ranging from the elucidation of biological functions using the
systems biology approach [1] to the discovery of new pharmaceutical targets [2]. The
availability of genomic sequence datasets and advancement of high throughput
laboratory techniques have helped accumulation of a large amount of pathway
datasets. Today, there exist more than 200 publicly accessible pathway databases [3,
4], each with its own data model and access method, leading to data heterogeneity,
incompleteness and inconsistency [5]. The availability of a large amount of pathway
datasets is providing an opportunity for large scale analyses [6] which call for a
consistent and homogeneous representation and an efficient storage of a variety of
pathways and networks for metabolic reactions, signaling transduction, regulatory
pathways and protein-protein interactions.
Representing the processes of cellular machinery, biological pathways are human
constructs describing molecules involved in particular processes and their
relationships. Owing to the nature of pathways, they are typically modeled as
directed graphs [5, 7], and such a treatment facilitates the visualization of pathways
by human eyes. Internally, pathway software systems use graph data structures to
model pathway datasets, and most of them store the datasets in a relational database
management system (DBMS) by taking advantage of the mature and efficient query
mechanisms and storage built into those database engines. However, mapping graph


data structures onto relational models leads to more complex designs of databases and
inefficient queries, and such an approach is not suitable for many applications that
analyze pathway datasets in further elucidating biological functions [5]. After
reviewing and analyzing the existing data models for storing pathway datasets,
Deville et al. [5] suggested that the most appropriate data model is the object-oriented
model for storing biological pathways.
To overcome the data heterogeneity and adopt the object-oriented data model for
biological pathways, we present in this paper our recent work on designing and
implementing a storage system for pathway objects based upon the emerging XML
database technology. The object-oriented model of pathways is based upon the
community-developed BioPAX (Biological Pathways Exchange) standard [8], and
each individual object defined in BioPAX is stored in the XML datatype column to
provide XML fidelity and leverage the XQuery technology performing sophisticated
queries on various BioPAX objects.

2 Design Considerations
Several efforts have been made in order to unify the data format of pathways, notably
BioPAX [8], PSI-MI (Proteomics Standard Initiative Molecular Interactions) [9],
SBML (Systems Biology Markup Language) [10], and CellML (Cell Markup
Language) [11]. SBML and CellML are primarily for describing quantitative
(mathematical) models of cellular networks, and PSI-MI focuses on the molecular
interactions. On the other hand, the BioPAX standard is an object-oriented
representation of biological pathways with high expressiveness capable of
representing various types of interactions and pathways [12]. The BioPAX standard,
documented as ontology, defines components of a pathway as classes (entities). Each
class is made up of a collection of attributes which may be instances of the BioPAX
utility classes. A class may represent a physical object such as an RNA molecule or a
process in a pathway such as a biochemical reaction.
The BioPAX standard also defines a special XML format based upon OWL (Web
Ontology Language) [13] for representing each entity or utility class. This allows each
instance (or object) of a BioPAX class to be represented as an XML document.
For large scaled analyses, it is common to retrieve objects instantiated from the
BioPAX classes such as a biochemical reaction, a small molecular, a pathway, an
external reference to the identifier in the authoritative data source, each having a
biological meaning to some extent. The appropriate granularity of the pathway data
model lies at the level of BioPAX objects for a variety of pathway analyses. In our
design, BioPAX objects were chosen as the storage unit for data model.
Since there are dozens of BioPAX classes defined in the BioPAX standard,
modeling them with the conventional relational model seems awkward. A user or
program may need to retrieve the object by merely specifying its unique ID which is
its URI made up of the concatenation of the value of xmlns:base and that of rdf:ID
according to the W3C specification. This type of retrieval is difficult, if not impossible,
to implement with the relational model, because the query needs to determine at the
runtime which table to query against.


In our design, instead of using the character datatype in the relational model, we
chose the emerging XML datatype, which provides XML fidelity by preserving the
Infoset and XQuery data model, and supports query and update of all aspects of stored
XML documents. In addition, a single XML datatype column allows storing multiple
categories of XML documents, enabling us to store various BioPAX objects in a
single column which facilitates querying objects by their URIs. Leveraging XQuery, a
search on an object with one of its synonyms can be readily achieved without creating
a linking table to associate objects and their synonyms, simplifying the query
statements.
Like any other XML data, BioPAX compliant datasets are hierarchical in nature.
When defined, objects can be embedded in the composing object, and such
embedding can be multiple levels in depth. In order to store various types of objects
in our data store, the XML-based hierarchy was flattened by rearranging the BioPAX
compliant data such that embedded objects were replaced with references and moved
out of the composing object. In other words, all the objects are organized in a single
layer. When the composing object along with all its components is retrieved, all its
components are retrieved by specifying their IDs and placed in the same result set.
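The flattening strategy described above can be illustrated with a small Python sketch operating on a dictionary representation of BioPAX objects; a real implementation would work on the OWL/RDF documents themselves, and the structure used here is hypothetical.

```python
def flatten(obj, store):
    """Register `obj` in `store` keyed by its URI and replace any embedded
    objects with references, so that all objects live in a single layer.

    A toy illustration of the flattening strategy described above.
    """
    uri = obj["rdf:ID"]
    flat = {}
    for key, value in obj.items():
        if isinstance(value, dict) and "rdf:ID" in value:   # embedded object
            flat[key] = {"rdf:resource": flatten(value, store)}
        else:
            flat[key] = value
    store[uri] = flat
    return uri
```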

3 Implementation
Our pathway repository was implemented using Microsoft SQL Server 2005 which
supports the XML datatype, XML indexing, and XQuery. In our database, there is a
single XML column to store all types of BioPAX objects and a primary key column for
object unique IDs: their URIs. All the queries were created as stored procedures in
order to achieve the best performance.
Three different strategies were employed to create queries. 1) For simple queries
such as one that only involves a URI, standard SQL statements are executed. 2) If a
query deals with the internal part of a BioPAX object, for example a synonym, an
XQuery statement is issued. 3) The most sophisticated queries involve retrieving all
the components that make up a particular object (e.g., all the objects making up a
particular pathway). This will require multiple queries to accomplish. In this case, we
use C# to create CLR (Common Language Runtime) stored procedures which
demonstrate high flexibility and performance in manipulating the data stored in a
SQL Server 2005.
To provide public and platform-neutral access to our pathway repository by other
software applications, we developed a SOAP-based Web service called
PathwayGateway service which furnishes a number of predefined access methods. In
addition, the same set of methods is available as HTTP POST and GET methods.
More detailed documentation on these methods as well as the WSDL [16] file can be
found at http://jlab.calumet.purdue.edu/theGateway. We have imported into our
repository a number of pathway datasets retrieved from KEGG [15] and BioCyc
Websites. Currently our system contains the datasets of the following species: Homo
sapiens, Escherichia coli (K-12), and Escherichia coli (O157:H7).


4 Discussions
Our approach of employing XML database technology to store BioPAX objects
provides a much simplified data model in storing biological pathways, significantly
reduces the complexity of system design and the implementation, without any
compromise on the query power. Leveraging the XML database technology, three
different query strategies are explored to provide the best performance and flexibility.
The availability of a large quantity of pathway datasets kept in an efficient data
store will promote future research in large-scale analyses of biological pathways
and development of a new breed of software applications. Examples of future
directions may include (a) the comparative study of pathways through similarity
searches, and (b) the dynamic visualization and exploration of pathways by parsing
the XML-based pathway objects at the runtime.
Acknowledgments. The author wishes to thank Microsoft Research for its support to
this project, Christopher Nash and Qian Wu for their efforts in implementing the
system, and reviewers for their constructive comments.

References
1. Cary, M.P., Bader, G.D. and Sander, C.: Pathway information for systems biology. FEBS
Lett. 579 (2005) 1815-20
2. Fishman, M.C. and Porter, J.A.: Pharmaceuticals: a new grammar for drug discovery.
Nature. 437 (2005) 491-3
3. Pathguide: the Pathway Resource List [http://pathguide.org]
4. Bader, G.D., Cary, M.P. and Sander, C.: Pathguide: a pathway resource list. Nucleic Acids
Res. 34 (2006) D504-6
5. Deville, Y., et al.: An overview of data models for the analysis of biochemical pathways.
Brief Bioinform. 4 (2003) 246-59
6. Sharan, R. and Ideker, T.: Modeling cellular machinery through biological network
comparison. Nat Biotechnol. 24 (2006) 427-33
7. Schaefer, C.F.: Pathway databases. Ann N Y Acad Sci. 1020 (2004) 77-91
8. Luciano, J.S.: PAX of mind for pathway researchers. Drug Disc Today. 10 (2005) 937-42
9. Hermjakob, H., et al.: The HUPO PSI's molecular interaction format--a community
standard for the representation of protein interaction data. Nat Biotechnol. 22 (2004)
177-83
10. Hucka, M., et al.: The systems biology markup language (SBML). Bioinformatics. 19
(2003) 524-31
11. Lloyd, C.M., Halstead, M.D. and Nielsen, P.F.: CellML: its future, present and past. Prog
Biophys Mol Biol, 85 (2004) 433-50
12. Stromback, L., et al.: Representing, storing and accessing molecular interaction data: a
review of models and tools. Brief Bioinform. 7 (2006) 331-8
13. W3C: Web Ontology Language (OWL). [http://www.w3.org/2004/OWL/].
14. BioCyc: BioCyc Database Collection. [http://biocyc.org/].
15. KEGG: Kyoto Encyclopedia of Genes and Genomes. [http://www.genome.jp/kegg/].
16. W3C: Web Service Description Language. [http://www.w3.org/TR/wsdl].

Protein Folding Simulation with New Move Set in 3D Lattice Model

X.-M. Li
Faculty of Computer, Guangdong University of Technology, Guangzhou, Guangdong, 510006
lxmdwj@163.com

Abstract. We present the lowest energy conformations for several large


benchmark problems in the 3D HP model. We found these solutions with MC and
a genetic algorithm using a new move set. The new move set, including rotation and
mirror reflection, is suitable for use in protein folding simulation. Experimental
results show that the new move set can find these best solutions in less time on average and is dramatically superior to the commonly used move set.
Keywords: Protein folding simulation, 3D lattice model, move set.

1 Introduction
The HP model for protein folding was introduced by Dill [1]. In the model a protein
consists of a sequence of amino acids, each labeled as either hydrophobic (H) or hydrophilic (P). The sequence must be placed on a 2D or 3D grid without overlapping
so that the adjacent amino acids in the sequence remain horizontally or vertically
adjacent in the grid. The goal is to minimize the energy, which in the simplest variation corresponds to maximizing the number of adjacent hydrophobic pairs.
Recent theoretical work has focused on approximation algorithms, although these
have not proven helpful for finding minimum energy conformations. Many heuristic
algorithms for finding minimum energy conformations have been explored [2]~[6]. In
this paper we demonstrate the effectiveness of a new local move set with Monte
Carlo and genetic search algorithms on the HP lattice model of the protein folding
problem.
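As a concrete illustration of the HP-model energy just described, the following Python sketch counts non-bonded hydrophobic contacts on the cubic lattice (illustrative only, not the author's code):

```python
def hp_energy(sequence, coords):
    """HP-model energy on the cubic lattice: -1 for every pair of hydrophobic
    (H) residues that occupy adjacent lattice sites but are not sequence
    neighbours.  `coords` is a list of integer (x, y, z) positions.
    """
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):          # skip bonded neighbours
            if sequence[i] == 'H' and sequence[j] == 'H':
                dist = sum(abs(coords[i][k] - coords[j][k]) for k in range(3))
                if dist == 1:                          # adjacent lattice sites
                    energy -= 1
    return energy
```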

2 New Move Set


A move set in a lattice model defines possible conformational changes that can take
place in a unit time. Although there are many different versions of move sets [7]~[8],
the choice of the move set is not completely settled yet. In this paper, we extend the
previously used move set by adding mirror reflection to the move sets. The new move
set contains four different moves: four-bead crankshaft moves, three-bead
flips, rotation symmetry and mirror reflection. Moves that violate excluded volume are
forbidden.


3 Algorithm
The Monte Carlo method for protein folding can be described in the following general
algorithm. (1) Start from a random conformation. (2) From a conformation S1 with
energy E1, make a single random change of the conformation to another conformation
S2 and evaluate its energy E2. (3) If E2 is less than E1, then accept the change to
conformation S2, otherwise decide nondeterministically whether to accept the change
according to the energy increase with the change. Usually the criterion is that the new
conformation is accepted if Rnd < \exp[(E_1 - E_2)/c_k], where Rnd is a random number
between 0 and 1 and c_k is gradually decreased during the simulation to achieve
convergence. If the change was not accepted, then retain the former conformation S_1.
(4) If the stop criterion is not met, then repeat steps (2) to (4).
The algorithm (IMC) we applied here is a special implementation of Monte Carlo
for the 3D lattice model. We choose the new move set as our change in conformation in
protein folding.
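The acceptance rule in step (3) can be sketched in Python as follows (an illustration of the criterion above, not the author's implementation):

```python
import math
import random

def accept_move(e1, e2, ck):
    """Always accept a lower-energy conformation; otherwise accept with
    probability exp[(E1 - E2)/ck], matching the criterion in step (3)."""
    if e2 < e1:
        return True
    return random.random() < math.exp((e1 - e2) / ck)
```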
In the improved genetic algorithm (IGA) for protein folding, each candidate solution
is referred to as a set of coordinate values. The process involved in IGA optimization
problems is based on natural evolution and works as follows:
(1) Randomly generate an initial population of potential solutions. Each individual
must be a legal conformation.
(2) Evaluate the fitness or energy of each solution.
(3) Each conformation of the population evolves independently for some iterations by
the improved Monte Carlo procedure.
(4) According to the crossover probability, select two solutions biased in favor of
fitness.
(5) Crossover the solutions at a random point on the coordinate string to produce
two new solutions.
(6) The lowest energy conformation in the current generation is directly replicated to
the next generation. If the stop criterion is not met, go back to step (2).
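A possible skeleton of the IGA loop in steps (1)-(6), with the problem-specific operators left as caller-supplied functions and the fitness-biased selection of step (4) simplified to uniform sampling, is sketched below; it is illustrative only, not the author's implementation.

```python
import random

def improved_ga(population, energy, mc_refine, crossover,
                generations=300, p_cross=0.85):
    """Skeleton of the IGA loop.  The helpers (legal initial conformations,
    the energy function, Monte Carlo refinement with the new move set, and
    the crossover operator) are assumed to be supplied by the caller."""
    for _ in range(generations):
        population = [mc_refine(ind) for ind in population]       # step (3)
        elite = min(population, key=energy)                        # step (6)
        next_gen = [elite]
        while len(next_gen) < len(population):
            if random.random() < p_cross:                          # step (4), simplified
                a, b = random.sample(population, 2)
                next_gen.extend(crossover(a, b))                   # step (5)
            else:
                next_gen.append(random.choice(population))
        population = next_gen[:len(population)]
    return min(population, key=energy)
```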

4 Experimental Results
Tables 1 and 2 show the results for the 10 different 3D 48mer sequences [9] in the 3D lattice
model with the Monte Carlo and genetic algorithms using the traditional move set and the new
move set, respectively. For all the sequences, 3 independent simulations were carried
out under the same conditions. In the tables, N denotes the number of residues of the
sequence and E denotes the optimal energy determined from the designed structure. The
first item in the tables is the obtained minimum energy; the second item is the number of conformations scanned before that energy value was found.
For the ten 48mer test cases we chose to start with temperature coefficient T=2,
which was cooled by T=0.98T every 10,000 move steps with Monte Carlo. The genetic
algorithm was run for 300 iterations with a population size of 200. For the mutation


stage, the cooling scheme starts with T=2 and is cooled by T=0.97T every 5 generations.
The crossover stage starts with T=0.3 and is cooled by T=0.99T every 5 generations.
Bold items in the tables show the best result per sequence. From Table 1, we know that
Monte Carlo using the new move set acquires lower energy conformations than
using the classical move set except for seq#1 and seq#10, which have the same best results.
Table 2 indicates that the genetic algorithm using the new move set finds lower energy conformations than using the classical move set except for seq#6, which has the same
best result. Because the new move set includes not only the classical move set but also
mirror reflection, it is not an unexpected result that the new move set reaches better energy
conformations than the classical move set. From Table 2 we find that the IGA method finds
the lowest energy conformations for nine of the ten sequences.
Table 1. Results from 3 runs per 3D sequence with MC and IMC

N/E        MC using classical move set               IMC using new move set
           1           2           3                 1           2           3
1(-32)     -30(798)    -29(568)    -31(741)          -30(434)    -30(382)    -31(756)
2(-34)     -29(737)    -28(614)    -29(666)          -30(650)    -30(336)    -31(892)
3(-34)     -29(558)    -29(434)    -32(862)          -31(629)    -32(521)    -33(625)
4(-33)     -29(713)    -30(689)    -30(620)          -31(554)    -30(299)    -31(546)
5(-32)     -29(770)    -29(793)    -29(731)          -30(761)    -30(271)    -30(301)
6(-32)     -27(554)    -28(617)    -29(716)          -28(46)     -29(431)    -30(1,240)
7(-32)     -27(546)    -28(549)    -29(612)          -29(115)    -28(185)    -30(635)
8(-31)     -28(771)    -28(932)    -29(786)          -28(296)    -29(409)    -30(507)
9(-34)     -29(691)    -31(787)    -31(573)          -32(749)    -31(134)    -32(451)
10(-33)    -30(729)    -31(846)    -32(1,021)        -30(315)    -31(568)    -32(421)

Table 2. Results from 3 runs per 3D sequence with GA and IGA

N/E        GA using classical move set                 IGA using new move set
           1            2            3                 1            2            3
1(-32)     -31(1,503)   -30(1,494)   -31(1,451)        -31(482)     -32(750)     -32(685)
2(-34)     -30(1,213)   -30(960)     -31(1,242)        -34(830)     -32(863)     -32(1,192)
3(-34)     -31(1,304)   -32(1,251)   -32(906)          -33(434)     -34(1,077)   -34(535)
4(-33)     -31(1,715)   -30(834)     -31(1,185)        -32(714)     -31(660)     -33(1,212)
5(-32)     -30(950)     -31(1,501)   -31(1,196)        -31(560)     -31(1,101)   -32(831)
6(-32)     -30(873)     -30(1,326)   -31(2,176)        -31(984)     -30(365)     -31(958)
7(-32)     -30(1,538)   -29(1,474)   -30(1,260)        -32(1,091)   -30(458)     -31(715)
8(-31)     -29(1,205)   -29(1,314)   -30(1,300)        -30(1,081)   -31(913)     -31(756)
9(-34)     -31(1,392)   -31(1,256)   -32(2,312)        -31(482)     -32(750)     -32(685)
10(-33)    -31(1,737)   -31(953)     -32(1,345)        -34(830)     -32(863)     -32(1,192)

The results of IGA are summarized and compared with other methods in Table 3. The IGA and IMC entries of Table 3 come from the best results of Table 2 and Table 1, respectively. The MC entries come from reference [9].
From Table 3 we find that the new move set is superior in protein folding simulations. The IMC method finds lower-energy conformations than MC for six of the ten sequences, two sequences reach the same local minima, and only two sequences are inferior to the MC method.

Table 3. Comparison of IMC and IGA with other methods in the 3D lattice model

N/E           IGA           IMC           MC
48-1(-32)     -32(685)      -31(756)      -30
48-2(-34)     -34(830)      -31(892)      -30
48-3(-34)     -34(535)      -33(625)      -31
48-4(-33)     -33(1,212)    -31(546)      -30
48-5(-32)     -32(831)      -30(271)      -30
48-6(-32)     -31(958)      -30(1,240)    -30
48-7(-32)     -32(1,091)    -30(635)      -31
48-8(-31)     -31(756)      -30(507)      -31
48-9(-34)     -34(711)      -32(451)      -30
48-10(-33)    -33(991)      -32(421)      -30

5 Conclusion
We proposed a new move set and carried out comparative studies on the effects of move sets in protein folding in the cubic lattice model. We would like to expand our prototype to handle more challenging protein folding problems. We can conclude that GA and MC simulation of lattice protein folding is highly dependent on the move set used. In the folding simulations, a more flexible move set consistently results in faster folding and lower-energy conformations than the commonly used move set. With appropriate modifications and enhancements, we expect the method to be useful for folding simulations of real protein sequences.

References
1. Dill, K.A.: Theory for the folding and stability of globular proteins. Biochemistry (1985) 24:1501
2. Kirkpatrick, S., Gelatt, C.D., Jr., Vecchi, M.P.: Optimization by simulated annealing. Science (1983) 220:671
3. Unger, R., Moult, J.: Genetic algorithms for protein folding simulations. J. Mol. Biol. (1993) 231:75
4. Konig, R., Dandekar, T.: Improving genetic algorithms for protein folding simulations by systematic crossover. Biosystems (1999) 50:17-25
5. Liang, F.: Evolutionary Monte Carlo for protein folding simulations. J. Chem. Phys. (2001) 115:3374
6. Jiang, T.: Protein folding simulations of the hydrophobic-hydrophilic model by combining tabu search with genetic algorithms. J. Chem. Phys. (2003) 119:4592
7. Shin, J., Oh, W.S.: Study of move set in cubic lattice model for protein folding. J. Phys. Chem. (1998) 102:6405-6412
8. Nunes, N.L., Chen, K., Hutchinson, J.S.: A flexible lattice model to study protein folding. J. Phys. Chem. (1996) 100:10443
9. Yue, K., Fiebig, K.M.: A test of lattice protein folding algorithms. Proc. Natl. Acad. Sci. USA (1995) 92:325

A Dynamic Committee Scheme on Multiple-Criteria Linear Programming Classification Method
Meihong Zhu1,2, Yong Shi1, Aihua Li1, and Jing He1
1 CAS Research Center on Data Technology and Knowledge Economy,
Management School, Graduate University of CAS, Beijing 100080, China
2 Statistics School, Capital University of Economics and Business,
Beijing 100070, China
{zhumh,yshi,lah04b,jinghe}@mails.gucas.ac.cn

Abstract. This paper aims to provide a scheme for effectively and efficiently
finding an approximately optimal example size with respect to a given dataset
when using Multiple-Criteria Linear Programming (MCLP) classification
method. By integrating techniques of both progressive sampling and
classification committee, it designs a dynamic classification committee scheme
for MCLP. The experimental results have shown that our idea is feasible and
the scheme is effective and efficient for exploring an approximately optimal
sample size. The empirical results also help us to further investigate some general properties of MCLP, such as more general function forms reflecting the relationship between accuracy and sample size, and between computing cost and sample size.
Keywords: Classification, Multiple-Criteria Linear Programming, Progressive
Sampling, and Committee.

1 Introduction
In the data mining field, the Multiple-Criteria Linear Programming (MCLP) classification method is an outstanding classification tool [1-5]. But, just like many other classification tools, its computation efficiency is sometimes low when faced with large and high-dimensional datasets. Among research efforts to improve the classification efficiency of MCLP, sampling is an important approach. In practice, MCLP is often executed on a one-off (static) selected sample, whose size is often determined by analysts subjectively. The consequent problem is that we can't judge whether the one-off selected sample is good enough for analysis. If the sample is too small, it can't reflect the original data sufficiently; conversely, if it is too large, the computation cost will be unacceptable. So, how to identify an appropriate sample size
is the key to the success of applying sampling techniques to MCLP. Among various
sampling approaches, progressive sampling is a well-known one for finding an
approximately optimal example size [6, 7].


This paper designs a scheme to effectively and efficiently find an approximately optimal example size for the MCLP classification method with respect to a given dataset. The scheme integrates techniques of both progressive sampling and classification committees. First, a new non-standard arithmetic progressive sampling procedure is implemented to determine the example size at each step. Secondly, at each step of progressive sampling, a classification committee [8-13] is constituted by 9 samples of the same size. Finally, by comparing the accuracies of adjacent committees dynamically and interactively, an approximately optimal example size for MCLP is found. In short, this scheme can be called a Dynamic Committee Scheme on the MCLP classification method.
To illustrate our idea simply, this paper focuses only on two-group classification problems. Two real-life databases with different sizes are applied to verify our idea. The experimental results provide the following conclusions. Firstly, the Dynamic Committee Scheme on MCLP can effectively find the approximately optimal example size for the MCLP classification method with respect to a given dataset, i.e., the approximately optimal example size really exists. Secondly, it can efficiently find the approximately optimal results, i.e., the computation is acceptable. Additionally, these experimental results help us to carry out an in-depth study of the properties of MCLP, such as the relationship between classification accuracy and sample size, between computing cost and sample size, and between computing cost and the number of attributes.
The rest of the paper is organized as follows. Section 2 introduces some background knowledge for the later analyses. Section 3 describes our dynamic classification committee scheme on the Multiple-Criteria Linear Programming
classification method in detail. Section 4 introduces experimental operations and
analyzes corresponding results. The last section concludes our work and presents
future research directions.

2 Background Knowledge
To understand our idea and method, three aspects of background knowledge are
needed.
2.1 Two-Class MCLP Classification Model
The MCLP classification model was first presented by Shi et al. [1]. A two-class MCLP model can be described as follows:
Given a training dataset which has p predictive variables (attributes) a = (a1, ..., ap) and n cases, let Ai = (Ai1, ..., Aip) be the data for the variables in the ith case, where i = 1, ..., n. We want to find the best coefficients of the p variables, denoted by X = (x1, ..., xp)^T, and a boundary value b (a scalar) to separate the existing data into two classes G1 and G2; that is,

    Ai X ≤ b, Ai ∈ G1   and   Ai X ≥ b, Ai ∈ G2                                 (1)


In this expression, AiX is a linear function of the attribute values used to classify each labeled or unlabeled case. To measure the separation of G1 and G2, we define:
α_i = the overlapping degree with respect to the two-class boundary for case Ai;
β_i = the distance of case Ai from its adjusted boundary.
The objective is to minimize the sum of α_i and maximize the sum of β_i simultaneously. This leads to the following classification model (M1):

    Minimize Σ_i α_i  and  Maximize Σ_i β_i                                     (2)
    Subject to:
        Ai X = b + α_i - β_i,   Ai ∈ G1,
        Ai X = b - α_i + β_i,   Ai ∈ G2,

where Ai are given; X and b are unrestricted; and α_i, β_i ≥ 0.


This multi-objective linear programming problem can be transformed into a single-objective model by introducing the compromise solution approach [2,3]. We assume the ideal value of -Σ_i α_i to be α* > 0 and the ideal value of Σ_i β_i to be β* > 0. If -Σ_i α_i > α*, we define the regret measure as -d_α+ = Σ_i α_i + α*; otherwise, it is 0. If -Σ_i α_i < α*, the regret measure is defined as d_α- = α* + Σ_i α_i; otherwise, it is 0. Thus, we have (i) α* + Σ_i α_i = d_α- - d_α+, (ii) |α* + Σ_i α_i| = d_α- + d_α+, and (iii) d_α-, d_α+ ≥ 0. Similarly, we have β* - Σ_i β_i = d_β- - d_β+, |β* - Σ_i β_i| = d_β- + d_β+, and d_β-, d_β+ ≥ 0. So M1 can be changed into M2:

    Minimize d_α- + d_α+ + d_β- + d_β+                                          (3)
    Subject to:
        α* + Σ_i α_i = d_α- - d_α+,
        β* - Σ_i β_i = d_β- - d_β+,
        Ai X = b + α_i - β_i,   Ai ∈ G1,
        Ai X = b - α_i + β_i,   Ai ∈ G2,

where Ai, α*, and β* are given; X and b are unrestricted; and α_i, β_i, d_α-, d_α+, d_β-, d_β+ ≥ 0. If b is given and X is found, we can classify a labeled or unlabeled case by using the linear discriminant AX.
The standard two-class MCLP algorithm is based on M2. It uses linear programming to determine a boundary separating the two classes. Compared with other classification tools, it is simple and direct, free of statistical assumptions, flexible in defining and modifying parameters, and high in classification accuracy [5]. It has been widely used in business fields such as credit card analysis, fraud detection, and so on. But when faced with high-dimensional or very large data, its computing efficiency is sometimes low.
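As an illustration of how M2 can be set up in practice, the following hedged sketch assembles the model as a single linear program and solves it with SciPy's linprog. The variable layout, the values of α* and β*, and the fixed boundary b are illustrative assumptions (the experiments in Section 4 stipulate b = 1); this is a minimal sketch, not the authors' implementation.

import numpy as np
from scipy.optimize import linprog

def mclp_train(A, y, b=1.0, alpha_star=0.1, beta_star=30.0):
    """A: (n, p) data matrix; y: -1 for G1, +1 for G2. Returns the coefficient vector X."""
    n, p = A.shape
    # variable vector z = [X (p) | alpha (n) | beta (n) | d_a-, d_a+, d_b-, d_b+]
    nv = p + 2 * n + 4
    c = np.zeros(nv)
    c[-4:] = 1.0                                       # minimize the sum of the regret measures
    A_eq, b_eq = [], []
    # alpha* + sum(alpha) = d_a- - d_a+
    row = np.zeros(nv); row[p:p + n] = 1.0; row[-4] = -1.0; row[-3] = 1.0
    A_eq.append(row); b_eq.append(-alpha_star)
    # beta* - sum(beta) = d_b- - d_b+
    row = np.zeros(nv); row[p + n:p + 2 * n] = -1.0; row[-2] = -1.0; row[-1] = 1.0
    A_eq.append(row); b_eq.append(-beta_star)
    # Ai X = b + alpha_i - beta_i (G1)  and  Ai X = b - alpha_i + beta_i (G2)
    for i in range(n):
        s = 1.0 if y[i] < 0 else -1.0                  # sign flips between the two groups
        row = np.zeros(nv)
        row[:p] = A[i]; row[p + i] = -s; row[p + n + i] = s
        A_eq.append(row); b_eq.append(b)
    bounds = [(None, None)] * p + [(0, None)] * (2 * n + 4)    # X free, all others >= 0
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=bounds, method="highs")
    return res.x[:p]

# usage: score = A_new @ X; assign a case to G1 if score <= b, otherwise to G2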
2.2 Progressive Sampling
Progressive sampling [7] is a well-known dynamic sampling method that can be used on large datasets. It attempts to maximize model accuracy as efficiently as possible. It is based on the fact that once the sample size is large enough, further increases of the sample size do not increase model accuracy significantly. It is in fact a trade-off between classification accuracy and computing cost. Because of the dynamic interactions of the sampling process, it overcomes the weakness of one-off (static) sampling.
It starts with a small sample n0 and augments the sample size progressively at each step. At each step, a model is built from the current sample and is evaluated. If the resulting model has not reached the user-specified accuracy threshold, the algorithm iterates once again. Generally, the relation between accuracy and the corresponding sample size can be expressed as a learning curve (Figure 1). The convergence point nmin of the curve is the approximately optimal sample size.
In progressive sampling, the two key aspects affecting sampling effectiveness and efficiency are the increasing mechanism of the sample size and the stopping (convergence) criterion of sampling.
As for the first aspect, [6] proposed an arithmetic increasing schedule

    S_a = {n0, n1, n2, ..., nk} = {n0, n0+Δn, n0+2Δn, ..., n0+kΔn}.

[7] presented a geometric increasing schedule

    S_g = {n0, n1, n2, ..., nk} = {n0, a·n0, a^2·n0, ..., a^k·n0}.

[6] drew the conclusion that arithmetic sampling is more efficient than one-off sampling, and [7] verified that geometric sampling is more efficient than arithmetic sampling.
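For concreteness, the two schedules can be written as two small functions; the numbers in the usage lines are arbitrary examples, not values from the paper.

def arithmetic_schedule(n0, delta, k):
    return [n0 + i * delta for i in range(k + 1)]      # S_a = {n0, n0+dn, ..., n0+k*dn}

def geometric_schedule(n0, a, k):
    return [n0 * a ** i for i in range(k + 1)]         # S_g = {n0, a*n0, ..., a^k*n0}

print(arithmetic_schedule(200, 200, 4))   # [200, 400, 600, 800, 1000]
print(geometric_schedule(100, 2, 4))      # [100, 200, 400, 800, 1600]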

Fig. 1. Accuracy vs. sample size

2.3 Classification Committee


To improve the prediction performance of an individual classifier, classification committee techniques are often used in many classification methods. There is much research on committees/multiple classifiers/ensembles. The most popular guidelines for combining individual classifiers are Bagging [8], Boosting [9], Random Subspace [10], and Random Forest [11]. There have been many variations of these guidelines. The combination rules mainly include simple majority (mainly for Bagging), weighted majority (mainly for Boosting), minimum, maximum, product, average, Naive Bayes, and so on [12].
Bagging is based on the bootstrapping [13] and aggregating concepts, so it incorporates the benefits of both approaches. In Bagging, m random independent bootstrap samples of the same size n (n < N) are taken from the original dataset of size N, and then m individual classifiers are developed; finally, a classification committee


is made by aggregating these m member classifiers using a simple majority vote. Bagging can be executed in a parallel way; thus it is efficient.
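A hedged sketch of such a Bagging committee is given below. The decision-tree base learner, the non-negative integer label coding assumed by the voting step, and the fixed random seed are illustrative choices, not part of the description above.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_committee(X, y, m=9, n=None, base=DecisionTreeClassifier):
    rng = np.random.default_rng(0)
    n = n or len(X)
    members = []
    for _ in range(m):
        idx = rng.integers(0, len(X), size=n)            # bootstrap sample (with replacement)
        members.append(base().fit(X[idx], y[idx]))
    return members

def majority_vote(members, X_new):
    # assumes class labels are coded as non-negative integers (e.g. 0 and 1)
    votes = np.array([clf.predict(X_new) for clf in members])      # shape (m, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T.astype(int)])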

3 A Dynamic Committee Scheme on MCLP Classification Method


In this paper, we want to find the approximately optimal sample size nmin for the MCLP classification method with respect to a given dataset. For simplicity, our research focuses only on two-class problems.
In our scheme, a committee is formed by m individual classifiers. The combination mechanism is a variation of the Bagging technique, which takes samples without replacement from the original dataset of size N, with ni << N.
To explain our method conveniently, we define:
acc(ni) = the classification accuracy on the test set obtained by the committee with sample size ni;
c = the user-defined accuracy threshold, i.e., the criterion for stopping sampling.
To explore nmin, we design a progressive sampling scheme (sketched in code below). In this scheme, the sample size is increased in a non-standard arithmetic progressive manner. Initially, sample sizes are arithmetically increased with a fixed increment Δn (see formula (4)). When formula (5) is satisfied, the subsequent three sample sizes are increased by a smaller increment Δn' as shown in formula (6):

    {n0, n1, n2, ..., nk} = {n0, n0+Δn, n0+2Δn, ..., n0+kΔn}                     (4)

    acc(n_{k-1}) - acc(n_{k-2}) ≤ c  and  acc(n_k) - acc(n_{k-1}) ≤ c             (5)

    {n_{k+1}, n_{k+2}, n_{k+3}} = {n_k+Δn', n_k+2Δn', n_k+3Δn'},  where Δn' < Δn   (6)

If formula (7) is satisfied, the sampling process is stopped; otherwise, later sample sizes are still increased by the increment Δn.

    acc(n_{k+1}) - acc(n_k) ≤ c,  acc(n_{k+2}) - acc(n_{k+1}) ≤ c,  and  acc(n_{k+3}) - acc(n_{k+2}) ≤ c   (7)

When the sampling process is stopped, the committee with the highest accuracy among the six sequential committees generated from the six sample sizes {n_{k-2}, n_{k-1}, n_k, n_{k+1}, n_{k+2}, n_{k+3}} is taken as the final committee for prediction.
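The scheme can be sketched as the following loop. The helper committee_accuracy(n), which is assumed to sample n cases without replacement, build the m-member MCLP committee and return its test-set accuracy, is hypothetical and not shown; the handling of the stopping window is also simplified relative to the description above.

def find_optimal_size(committee_accuracy, n0, dn, dn_small, c, n_max):
    sizes, accs, k = [n0], [committee_accuracy(n0)], 0
    while sizes[-1] + dn <= n_max:
        sizes.append(sizes[-1] + dn)                    # formula (4): arithmetic growth by dn
        accs.append(committee_accuracy(sizes[-1]))
        k += 1
        # formula (5): two consecutive improvements at or below the threshold c
        if k >= 2 and accs[-1] - accs[-2] <= c and accs[-2] - accs[-3] <= c:
            base = sizes[-1]
            small = [base + j * dn_small for j in (1, 2, 3)]        # formula (6)
            small_acc = [committee_accuracy(n) for n in small]
            diffs = [small_acc[0] - accs[-1],
                     small_acc[1] - small_acc[0],
                     small_acc[2] - small_acc[1]]
            sizes += small
            accs += small_acc
            if all(d <= c for d in diffs):              # formula (7): stop sampling
                last6 = list(zip(sizes[-6:], accs[-6:]))
                return max(last6, key=lambda t: t[1])   # committee with the highest accuracy
            # otherwise keep growing by dn
    return sizes[accs.index(max(accs))], max(accs)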

4 Empirical Research
To show the effectiveness and efficiency of our design, two real databases with different sizes are used in our experimental research. They are described as follows:
Table 1. Description of the Two Databases

Database         Number of predictive attributes    Size of training set    Size of test set
Credit Card      63                                 4000 (small)            1000
Census Income    43                                 50000 (large)           20000


In our design, we stipulate the parameters b = 1, m = 9, and c = 0.3%-0.5%.
The operating results and graphical representations for the two databases are shown in the following tables and figures. Specifically, "time" in the tables indicates the average time for generating a single member classifier, "acc1" means the accuracy rate for class 1 (the majority class), "acc2" means the accuracy rate for class 2 (the minority class), and "acc" denotes the overall weighted-average accuracy.
4.1 Experiments and Results

As for the credit card dataset, considering its high dimension, we judge that the computing cost is high, so parameter c should be set comparatively high, and Δn and n0 comparatively low. Here c is predefined as 0.3%, n0 as 200, and Δn as 200. The initial scheme is S = {200, 400, 600, 800}. The results show that the absolute improvements of acc for n2 vs. n1 and n3 vs. n2 are both not more than 0.3%, i.e., the results on different sample sizes are stable, so the subsequent sample sizes are changed by the smaller increment Δn' = 100. Then S = {200, 400, 600, 800, 900, 1000, 1100, ...}. The acc improvements for n4 vs. n3, n5 vs. n4, and n6 vs. n5 are all less than 0.3%, while the computing cost grows significantly with the increase of n, so the sampling process is terminated. Among the latter six committees, the committee on n3 is adopted as the final classifier for prediction due to its good performance. Additionally, to depict the time vs. n curve, we add two extra sample points, 700 and 1500, which are marked with the symbol *.
Table 2. Time and Accuracy on Different Sample Sizes for the Credit Card Database

n        time(s)    acc1(%)    acc2(%)    acc(%)
200      1          69.09      82.57      69.7
400      10         71.52      81.14      73.2
600      60         72.00      80.00      73.4
800      210        71.52      84.00      73.7
900      390        70.06      84.57      73.6
1000     660        71.64      81.71      73.4
1100     1200       70.40      85.65      73.1
700*     120        71.39      84.00      73.6
1500*    4800       71.89      83.51      73.8

Fig. 2. Result of acc vs. n for the Credit Card Database
Fig. 3. Result of time vs. n for the Credit Card Database


Similarly, we have the following results with respect to the census database. The approximately optimal sample size is n6 = 1700.
Table 3. Time and Accuracy on Different Sample Sizes for the Census Database

n        time(s)    acc1(%)    acc2(%)    acc(%)
300      8          78.52      85.20      80.89
600      9          78.07      86.00      80.88
900      25         77.89      87.42      81.27
1200     55         78.44      87.25      81.57
1500     143        78.42      86.38      81.25
1600     200        78.57      86.72      81.47
1700     230        78.77      86.69      81.58
1800     290        78.13      87.75      81.54
1900*    340        78.79      86.77      81.62
2500*    940        78.56      86.86      81.51

Fig. 4. Result of acc vs. n for the Census Database
Fig. 5. Result of time vs. n for the Census Database

4.2 Analysis of Results

From the results on the two databases, we have three empirical findings. Firstly, the Dynamic Committee Scheme on MCLP can effectively find the approximately optimal example size for the MCLP classification method with respect to a given dataset, i.e., the approximately optimal example size really exists. Considering the time cost, 800 is enough for the first dataset, which accounts for only 20% of the original training set; 1700 is sufficient for the second dataset, which equals only about 3.5% of the whole training set. Secondly, the scheme can efficiently find the approximately optimal sample size. For example, in the credit card database we can see that 4800 > 1+10+60+210+390+660+1200 = 2531. That is to say, if we draw a single sample with 1500 cases, its computing cost exceeds the whole computing time of the scheme {200, 400, 600, 800, 900, 1000, 1100}, while its accuracy rate improves by only 0.1 percentage points. From the time vs. n curve, we can also see that the time cost is unacceptable when n is too large. Lastly, we find some regularities reflecting the relationships between acc and n, between time and n, and between the number of attributes and time. The time vs. n curve has the form of a power function.


5 Conclusions and Future Efforts


To enhance computation efficiency, sampling is an important strategy in MCLP classification. To ensure the credibility and efficiency of sampling, we integrate progressive sampling and classification committees, and then design a dynamic classification committee scheme for the MCLP classification method. The experimental results have shown that our idea is feasible and the scheme is effective and efficient in exploring an approximately optimal sample size. The empirical results also help us to further investigate some general properties of MCLP. From the above two datasets, we can see that the time vs. n curve has the form of a power function. But we can't precisely characterize the form of the acc vs. n curve due to the limited number of sample points. In our future research, we will further explore more general forms of the functions acc = f(n) and time = g(n).

References
1. Shi, Y., Wise, M., Luo, M., Lin, Y.: Data mining in credit card portfolio management: a multiple criteria decision making approach. In: Koksalan, M., Zionts, S. (eds.): Multiple Criteria Decision Making in the New Millennium, Springer, Berlin (2001) 427-436
2. Shi, Y., Yu, P.L.: Goal setting and compromise solutions. In: Karpak, B., Zionts, S. (eds.): Multiple Criteria Decision Making and Risk Analysis Using Microcomputers, Springer-Verlag, Berlin (1989) 165-204
3. Shi, Y.: Multiple Criteria Multiple Constraint-levels Linear Programming: Concepts, Techniques and Applications. World Scientific Publishing, River Edge, New Jersey (2001)
4. Shi, Y., Peng, Y., Xu, W., Tang, X.: Data Mining via Multiple Criteria Linear Programming: Applications in Credit Card Portfolio Management. International Journal of Information Technology and Decision Making 1 (2002) 131-151
5. Kou, G., Liu, X., Peng, Y., Shi, Y., Wise, M., Xu, W.: Multiple Criteria Linear Programming to Data Mining: Models, Algorithm Designs and Software Developments. Optimization Methods and Software 18 (2003) 453-473
6. John, G., Langley, P.: Static Versus Dynamic Sampling for Data Mining. In: Simoudis, E., Han, J.W., Fayyad, U.M. (eds.): Proceedings of the Second International Conference on Knowledge Discovery in Databases and Data Mining, AAAI/MIT Press (1996) 367-370
7. Provost, F., Jensen, D., Oates, T.: Efficient progressive sampling. In: Proceedings of the Fifth KDDM, ACM Press, New York (1999) 23-32
8. Breiman, L.: Bagging predictors. Machine Learning 24(2) (1996) 123-140
9. Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proceedings of the 13th International Conference on Machine Learning (1996) 148-156
10. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(8) (1998) 832-844
11. Breiman, L.: Random forests. Machine Learning 45(1) (2001) 5-32
12. Freund, Y., Schapire, R.E.: Discussion of the paper "Arcing Classifiers" by Leo Breiman. The Annals of Statistics 26(3) (1998) 824-832
13. Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman & Hall, New York (1993)

Kimberlites Identification by Classification Methods
Yaohui Chai1,2, Aihua Li1,2, Yong Shi1,2, Jing He1,2, and Keliang Zhang3
1 Chinese Academy of Sciences Research Center on Data Technology and Knowledge Economy,
Chinese Academy of Sciences, Beijing 100080, China
2 School of Management, Graduate University of Chinese Academy of Sciences,
Chinese Academy of Sciences, Beijing 100039, China
3 College of Earth Science, Graduate University of Chinese Academy of Sciences,
Chinese Academy of Sciences, Beijing 100039, China
chaiyaohui05@mails.gucas.ac.cn, lah04b@mails.gucas.ac.cn,
yshi@gucas.ac.cn, hejing@gucas.ac.cn

Abstract. Kimberlites identification is a very important task for diamond mining. In the traditional way, geologists draw upon past experience to do this work. Whether the bedrock should be drilled depends on their analysis of rock samples. This method has two disadvantages. First, as the database grows, it becomes more difficult to do this work by manual inspection. Secondly, the accuracy is influenced by the experts' experience, and it scarcely reaches 80 percent on average. So an analytical method for kimberlites identification over large geochemical datasets is demanded. This article applies two methods (SVM and decision tree) to a dataset provided by a mining company. Comparing the performances of these two methods, our results demonstrate that SVM is an effective method for this work.
Keywords: Kimberlites Identification, Classification, Feature Selection, SVM, Decision Tree.

1 Introduction

World natural diamond production for 2004 is estimated at 156 million carats, and it translated into 61.5 billion US dollars in worldwide jewellery sales [1]. Even so, the current level of demand for diamonds of high colour and quality is still not being met by the world's producing diamond mines. Numerous companies are carrying out various phases of diamond exploration in Botswana, which is the world's leading producer of gem-quality diamonds. Due to the extensive


This research has been partially supported by a grant from the National Natural Science Foundation of China (#70621001, #70531040, #70501030, #70472074), #9073020 from NSFB, 973 Project #2004CB720103 of the Ministry of Science and Technology, China, and BHP Billiton Co., Australia.




Kalahari sand cover (and Karoo basalts underneath), sophisticated and innovative sampling and geophysical techniques are required to locate undiscovered kimberlites [2].
The goal of this paper is to build an analytical model for kimberlites identification. Two classification methods are applied to a dataset containing information about rock samples drilled in Botswana. This article is organized as follows. In Section 2, we describe the dataset and its preprocessing steps. Section 3 gives an overview of the two classification methods, SVM and decision tree, and their results are listed in Section 4. Finally, we conclude this work with a summary in Section 5.

2 Data Description and Data Preprocessing

The dataset contains rock sample data from one region of Botswana. The original dataset has 5921 rows of observations and 89 variables, and each observation describes detailed information about one rock sample: its position and its physical and chemical attributes. These variables include numeric and character types. After consulting the experts, we deleted some rows missing important variables and excluded some variables that are irrelevant, redundant, or correlated, such as sample-id and horizon. Then some variables were transformed from character to binary type to satisfy the requirements of the models, such as colour and shape.
After data transformation, the dataset includes 4659 observations and 101 variables.

3 Methods

Data classification is a two-step process [3]. In the first step, a model is built describing a predetermined set of data classes or concepts. This model is constructed by analyzing database tuples described by attributes. Each tuple is assumed to belong to a predefined class, determined by one of the attributes, called the class label attribute. The data used to build the model is called the training set, and this step can be called supervised learning. In the second step, the model is used for classification, and the predictive accuracy of the model is estimated. The dataset used for classification in this step is called the testing set. Once the constructed model is proved to be stable and robust, it can be used to predict new data.
The kimberlites identification for this dataset can be regarded as a four-group classification problem, based on the fact that there are four important kinds of rock in this dataset. We will apply two standard classification methods to this work.

3.1 SVM

The support vector machine [4] is a powerful machine learning method for classification, and it is usually described as an optimization problem as follows:

    Min  (1/2) ||w||^2                                                           (1)

    Subject to:
        y_i ((w · x_i) + b) ≥ 1,   i = 1, 2, ..., l                              (2)

where the training set is T = {(x_1, y_1), ..., (x_l, y_l)} ∈ (X × Y)^l, x_i ∈ X = R^n, y_i ∈ Y = {1, -1}, i = 1, 2, ..., l. Usually, for linearly inseparable sets, we introduce a kernel function and solve the dual problem to obtain the decision function.
Dual problem:

    Max  W(α) = Σ_{i=1}^{l} α_i - (1/2) Σ_{i,j=1}^{l} α_i α_j y_i y_j <x_i, x_j>   (3)

    Subject to:
        α_i ≥ 0,   Σ_{i=1}^{l} α_i y_i = 0,   i = 1, 2, ..., l                    (4)

Here, a kernel is used to compute <x_i, x_j> = <Φ(x_i), Φ(x_j)> = k(x_i, x_j), and the decision function is:

    f(x) = sgn( Σ_{i=1}^{l} α_i* y_i k(x, x_i) + b* )                             (5)

3.2 Decision Tree
A decision tree [5] is a flow-chart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and leaf nodes represent classes or class distributions. The top node in a tree is called the root node.
The algorithm to build a tree is listed as follows:
Function Tree = Decision_Tree_Create(T, A, Y) [3]
Input: training set T, attributes A, and label Y.
Output: a decision tree.
1. Tree = Create_Node(T);
2. If all samples in T belong to the same class, then label the node with that class and return Tree;
3. If the attribute set is empty, then label the node with the most common class in T and return Tree;
4. (X, Values) = Attribute_Selection(T, A, Y);  // select the best attribute X and its split points Values
5. for each V in Values do
6.     Sub_T = the subset of samples in T satisfying the test condition X = V;
7.     Node = Decision_Tree_Create(Sub_T, A - X, Y);
8.     Create_Branch(Tree, Node);
9. end for;
10. return Tree;
3.3 Cross-Validation
There are several ways to estimate classifier accuracy, such as the holdout method, k-fold cross-validation, leave-one-out, and so on. In this article we choose the 10-fold cross-validation method. The initial data are randomly partitioned into 10 mutually exclusive subsets or folds, each of approximately equal size. Training and testing are performed 10 times.
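As a minimal illustration, the same kind of estimate can be obtained with scikit-learn's cross_val_score; using SVC as the classifier here is an assumption standing in for the LIBSVM-based C-SVM used in this paper, and the parameter values are only examples.

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def cv_accuracy(X, y, C=128, gamma=0.0078125):
    clf = SVC(C=C, gamma=gamma, kernel="rbf")
    scores = cross_val_score(clf, X, y, cv=10)   # 10 mutually exclusive folds
    return scores.mean()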

4 Experimental Results

Since this dataset is linearly inseparable, we use C-SVM with the RBF kernel function [6] to classify the rock samples (this model is called C_SVM_1). There are two parameters (C for the objective function and GAMMA for the kernel function) to be selected for this model. The most common and reliable approach is to decide on parameter ranges, and to do an exhaustive grid search over the parameter space to find the best setting [7]. Figure 1 shows this process. It contains contour plots for the training dataset. The different colours of the projected contour plots show the progression of the grid method's best parameter estimate. The final optimal parameter settings are C=128 and GAMMA=0.0078125.
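A hedged sketch of such an exhaustive grid search, using scikit-learn's GridSearchCV instead of LIBSVM's grid.py, is shown below. The exponent ranges are the usual LIBSVM-style defaults and are assumptions, not values taken from this paper.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {
    "C": [2.0 ** e for e in range(-5, 16, 2)],        # 2^-5 ... 2^15
    "gamma": [2.0 ** e for e in range(-15, 4, 2)],    # 2^-15 ... 2^3
}

def grid_search(X, y):
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10, n_jobs=-1)
    search.fit(X, y)
    return search.best_params_, search.best_score_    # e.g. {'C': 128.0, 'gamma': 0.0078125}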

Fig. 1. Parameter selected by grid.py


The decision tree is built with SPSS 12 (SPSS Inc.), whose GUI makes it easy to import the data and export the tree model. The depth of the tree is 3, and 36 nodes are created for this model. There are 12 rules to classify the rock samples. The accuracy is also estimated by the 10-fold cross-validation method. Table 1 shows the accuracy of both methods compared with another method, linear discriminant analysis:
Table 1. The ten-fold cross-validation accuracy for those methods

Methods             C_SVM_1    Decision tree    Linear discriminant    C_SVM_2
Cross validation    95.66%     89.1%            80.1%                  95.04%

The two main approaches take comparable computation time of around 2 minutes, while the SVM has excellent accuracy compared with the decision tree and the linear discriminant. Still, we find that the parameter selection for SVM takes a couple of hours. To reduce the computation time and computational complexity, feature selection is needed. This work can also help the geophysical experts to make the right decision based on fewer rock sample attributes. In this article, we use the F-score [8] as a feature selection criterion, based on the simple rule that the larger the score is, the more discriminative the feature is likely to be.
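For reference, the binary F-score of [8] can be computed as below. How the four-group problem is reduced to two classes (e.g. one-vs-rest) is not specified here, so this sketch only shows the two-class building block.

import numpy as np

def f_scores(X, y):
    """X: (n, p) feature matrix; y: boolean array marking the positive class."""
    pos, neg = X[y], X[~y]
    mean, mp, mn = X.mean(0), pos.mean(0), neg.mean(0)
    num = (mp - mean) ** 2 + (mn - mean) ** 2
    den = pos.var(0, ddof=1) + neg.var(0, ddof=1)      # within-class scatter of each feature
    return num / den                                   # larger score => more discriminative

# rank features and keep the top-k, e.g. the top 44 as in Table 2:
# top44 = np.argsort(f_scores(X, y))[::-1][:44]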
Table 2. The accuracy after feature selection

Features selected    101       88        44
Cross validation     95.66%    95.15%    95.04%

Based on the F-score ranking, we selected 44 features and then applied C-SVM with the RBF kernel for training and prediction. Its accuracy is still above 95 percent. This model (C_SVM_2) takes less time for parameter selection, and the best settings are C=64, GAMMA=0.00195. Through the ten-fold cross-validation, this compromise model is proved to be accurate and stable, so it can be applied to a new geochemical dataset.

5 Conclusion
A sophisticated and innovative method for diamond-bearing kimberlites identification is needed for diamond mining, especially in the area covered by extensive Kalahari sand. When a model is proved to be robust and effective for this work, it will be greatly helpful to the experts in kimberlite discovery. This article applies two methods to this new domain of application. Our results demonstrate that both of these methods have much higher prediction accuracy than the 80 percent suggested by the experts. The tree model is faster than SVM, while SVM provides higher accuracy. A feature selection method is used to


build a compromise model between higher accuracy and less computation time, and its classification accuracy is acceptable.
This paper is not aimed at providing a method to take the place of the experts' decisions. On the contrary, the process developed in this research should be supervised by a veteran expert, and its results must be understandable.

Acknowledgement
The authors received much help from the following people: Dr. Yi Zeng (Exploration and Mining Technology, BHP Billiton Centre, Principal Scientist), Dr. Dongping Wei (College of Earth Science, Graduate University of Chinese Academy of Sciences, Professor), and Zhan Zhang (School of Management, Graduate University of Chinese Academy of Sciences, Masters).

References
1. Diamond Facts 2004/05 NWT Diamond Industry Report. Available at http://www.iti.gov.nt.ca/diamond/diamond facts2005.htm
2. Williams, C., Coller, B., Nowicki, T., Gurney, J.: Mega Kalahari geology: challenges of kimberlite exploration in this medium (2003)
3. Han, J.W., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers (2001)
4. Vapnik, V.: The Nature of Statistical Learning Theory. Springer-Verlag, New York (1995)
5. Wu, C., Landgrebe, D., Swain, P.: The decision tree approach to classification. Technical Report TR-EE-75-17, Laboratory for Applications of Remote Sensing, School of Engineering, Purdue University, West Lafayette, IN, May (1975)
6. MacKay, D.: Introduction to Gaussian processes. In: Neural Networks and Machine Learning (1999)
7. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines (2001). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
8. Chen, Y.-W., Lin, C.-J.: Combining SVMs with Various Feature Selection Strategies. In: Feature Extraction: Foundations and Applications (2005)

A Fast Method for Pricing Early-Exercise Options with the FFT
R. Lord1, F. Fang2, F. Bervoets1, and C.W. Oosterlee2
1 Modeling and Research, Rabobank International, Utrecht, The Netherlands
roger.lord@rabobank.com; frank.bervoets@rabobank.com
2 Delft University of Technology, Delft Institute of Applied Mathematics, Delft, The Netherlands
f.fang@ewi.tudelft.nl; c.w.oosterlee@tudelft.nl

Abstract. A fast and accurate method for pricing early-exercise options in computational finance is presented in this paper. The main idea is to reformulate the well-known risk-neutral valuation formula by recognizing that it is a convolution. This novel pricing method, which we name the CONV method for short, is applicable to a wide variety of payoffs and only requires knowledge of the characteristic function of the model. As such the method is applicable within exponential Levy models, including the exponentially affine jump-diffusion models. For an M-times exercisable Bermudan option, the overall complexity is O(MN log(N)) with N grid points used to discretize the price of the underlying asset. It is also shown that American options can be very efficiently computed by combining Richardson extrapolation with the CONV method.
Keywords: Option pricing, Levy process, Convolution, FFT, Transform.

1 Introduction

When valuing and risk-managing exotic derivatives, practitioners demand fast and accurate prices and sensitivities. As the financial models and option contracts used in practice are becoming increasingly complex, efficient methods have to be developed to cope with such models. Aside from non-standard exotic derivatives, plain vanilla options in many stock markets are actually of the American type. As any pricing and risk management system has to be able to calibrate to these plain vanilla options, it is of the utmost importance to be able to value these American options quickly and accurately.
In the past couple of years a vast body of literature has considered the modeling of asset returns as infinite activity Levy processes, due to the ability of such processes to adequately describe the empirical features of asset returns and at the same time provide a reasonable fit to the implied volatility surfaces observed in option markets. Valuing American options in such models is however far from trivial, due to the weakly singular kernels of the integral terms appearing in the PIDE, as reported in, e.g., [2,6,10,11].



In this paper we present a novel FFT-based method for pricing options with early exercise features. The only requirement of the method is that the characteristic function of the underlying asset is known, which is the case for many exponential Levy models, with the popular exponentially affine jump-diffusion (EAJD) models of [7] as an important subclass. In contrast to the PIDE methods, our method has no difficulty in handling processes of infinite activity, such as the Variance Gamma (VG) or CGMY models. A real benefit of this class of methods is, next to its flexibility, the impressive computational speed, as the FFT algorithm is employed.

2 FFT-Based Methods for Option Pricing in the Literature

All transform methods depart from the risk-neutral valuation formula that, for a European option, reads:

    V(t, S(t)) = e^{-rτ} E[V(T, S(T))],                                          (1)

where E denotes the operator of taking the expectation of a random variable with respect to the risk-neutral probability measure, V denotes the value of the option, r is the risk-neutral interest rate¹, t is the current time point, T is the maturity of the option and τ = T - t. The variable S denotes the underlying asset price.
Since for many models the density is not available in closed form whereas the characteristic function is, a number of papers starting from Heston [9] have attacked the problem via another route. Focusing on a plain vanilla European call option, note that for dividend-protected assets (1) can be written very generally as:

    V(t, S(t)) = S(t)Δ - K e^{-r(T-t)} P(S(T) > K),                              (2)

where P(S(T) > K) is the risk-neutral probability of ending up in-the-money and Δ is the delta of the option, the sensitivity of the option with respect to changes in the stock price. Both P(S(T) > K) and Δ can be recovered by inversion techniques, e.g., by Gil-Pelaez inversion [8]. Carr and Madan [4] considered another approach by directly taking the Fourier transform of the damped option price with respect to k, the logarithm of the strike price. Pre-multiplying the option price with a damping function exp(αk) to ensure the existence of the Fourier transform, Carr and Madan ended up with

    F{e^{αk} V(t, k)}(u) = e^{-rτ} ∫_ℝ e^{iuk} E[(S(T) - e^k)^+] dk
                         = e^{-rτ} φ(u - (α+1)i) / (α² + α - u² + i(2α+1)u),     (3)

where i is the imaginary unit, k is the logarithm of the strike price K and φ is the characteristic function of the log-underlying, i.e., φ(u) = E[e^{iu ln S(T)}]. The methods considered up to here can only handle the pricing of European options.

¹ Throughout the paper we assume that interest rates are deterministic; this assumption can be relaxed at the cost of increasing the dimensionality of some of the methods.


Define the set of exercise dates as T = {t_0, ..., t_M} and assume the exercise dates are equally spaced: t_{k+1} - t_k = Δt. The best known examples of early-exercise options are the American and Bermudan options. American options can be exercised at any time prior to the option's expiry; Bermudan options can only be exercised at certain dates in the future. The Bermudan option price can then be found via backward induction as

    C(t_k, S(t_k)) = e^{-rΔt} E[V(t_{k+1}, S(t_{k+1}))],
    V(t_k, S(t_k)) = max{C(t_k, S(t_k)), E(t_k, S(t_k))},     k = M-1, ..., 0,    (4)

where C denotes the continuation value of the option and V is the option value on the very next exercise date. Clearly the dynamic programming problem in (4) is a successive application of the risk-neutral valuation formula, as we can write the continuation value as

    C(t_k, S(t_k)) = e^{-rΔt} ∫_ℝ V(t_{k+1}, y) f(y|S(t_k)) dy,                   (5)

where f(y|S(t_k)) represents the probability density describing the transition from S(t_k) at t_k to y at t_{k+1}. Based on (4) and (5) the QUAD method was introduced in [1]. The method requires that the transition density is known in closed form. This requirement is relaxed in [12], where the QUAD-FFT method is introduced; the underlying idea is that the transition density can be recovered by inverting the characteristic function. But the overall complexity of both methods is O(MN²) for an M-times exercisable Bermudan option with N grid points used to discretize the price of the underlying asset.

3 The CONV Method

One of the defining properties of a Levy process is that its increments are independent of each other, which is the main premise of the CONV method:

    f(y|x) = f(y - x).                                                            (6)

Note that x and y do not have to represent the asset price directly; they could be monotone functions of the asset price. The assumption made in (6) therefore certainly holds when the asset price is modeled as a monotone function of a Levy process, since one of the defining properties of a Levy process is that its increments are independent of each other. In this case x and y in (6) represent the log-spot price. By including (6) in (5) and changing variables z = y - x, the continuation value can be expressed as

    C(t_k, x) = e^{-rΔt} ∫_ℝ V(t_{k+1}, x + z) f(z) dz,                           (7)

which is a cross-correlation of the option value at time t_{k+1} and the density f(z). If the density function has an easy closed-form expression, it may be beneficial


to compute the integral straightforwardly. However, for many exponential Levy models we either do not have a closed-form expression for the density (e.g. the CGMY/KoBoL model of [3] and many EAJD models), or, if we have, it involves one or more special functions (e.g. the Variance Gamma model).
Since the density is hard to obtain, let us consider taking the Fourier transform of (7). In the remainder we will employ the following definitions for the continuous Fourier transform and its inverse,

    ĥ(u) := F{h(t)}(u) = ∫_ℝ e^{iut} h(t) dt,                                     (8)

    h(t) := F^{-1}{ĥ(u)}(t) = (1/2π) ∫_ℝ e^{-iut} ĥ(u) du.                        (9)
If we dampen the continuation value (7) by a factor exp(αx) and subsequently take its Fourier transform, we arrive at

    e^{rΔt} F{e^{αx} C(t_k, x)}(u) = ∫_ℝ e^{iux} e^{αx} ∫_ℝ V(t_{k+1}, x + z) f(z) dz dx.     (10)

Changing the order of the integrals and the variables by x = y - z, we obtain

    e^{rΔt} F{e^{αx} C(t_k, x)}(u) = ∫_ℝ ∫_ℝ e^{i(u-iα)(y-z)} V(t_{k+1}, y) f(z) dz dy
                                   = ∫_ℝ e^{i(u-iα)y} V(t_{k+1}, y) dy ∫_ℝ e^{-i(u-iα)z} f(z) dz
                                   = F{e^{αy} V(t_{k+1}, y)}(u) φ(-(u - iα)).                 (11)

In the last step we used the fact that the complex-valued Fourier transform of the density is simply the extended characteristic function

    φ(x + yi) = ∫_ℝ e^{i(x+yi)z} f(z) dz,                                         (12)

which is well-defined when φ(yi) < ∞, as |φ(x + yi)| ≤ |φ(yi)|. Inverse Fourier transform and undamping of (11) yield the CONV formula:

    C(t_k, x) = e^{-rΔt-αx} F^{-1}{ F{e^{αy} V(t_{k+1}, y)}(u) φ(-(u - iα)) }(x).  (13)

To value Bermudan options, one can recursively call (13) and (4) backwards in time: first recover the option values on the last early-exercise date; then feed them into (13) and (4) to obtain the option values on the second-to-last early-exercise date; continue the procedure until the first early-exercise date is reached; for the last step, feed the option values on the first early-exercise date into (13), which gives the option values on the initial date.
To value American options, there are two routes to follow: they can be approximated either by Bermudan options with many early exercise dates or by Richardson extrapolation based on only a series of Bermudan options with an increasing number of early exercise dates. In the experiments we evaluated both approaches and compared their CPU time and accuracy. As for the approach via Richardson extrapolation, our choice of scheme is the one proposed by Chang, Chung, and Stapleton [5].
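As an illustration of the idea (not of the exact scheme of [5]), a generic repeated Richardson extrapolation over Bermudan prices with a doubling number of exercise dates can be sketched as follows; the assumed error expansion in powers of 1/M is a simplification.

def richardson_american(bermudan_price, levels=6):
    """bermudan_price(M) returns the M-times exercisable Bermudan price (user-supplied)."""
    A = [[bermudan_price(2 ** i)] for i in range(levels + 1)]       # column 0 of the tableau
    for i in range(1, levels + 1):
        for j in range(1, i + 1):
            # assumes the leading error term shrinks by a factor 2^j per doubling of M
            A[i].append(A[i][j - 1] + (A[i][j - 1] - A[i - 1][j - 1]) / (2 ** j - 1))
    return A[levels][levels]        # most extrapolated value approximates the American price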

4 Implementation

Let us ignore the damping factor in this section, for ease of analysis, and simplify the notation as e^{rΔt} C(x, t_k) → C(x) and V(y, t_{k+1}) → V(y). Suppose that we are only interested in the portion of C(x) that lies in T_L := [-L/2, L/2]. Assume that f(z) ≈ 0 for z outside T_A := [-A/2, A/2]. Both L and A denote positive real numbers. Then we may rewrite the risk-neutral valuation formula as

    C(x) = ∫_ℝ V(x + z) f(z) dz = ∫_{T_A} V(x + z) f(z) dz,                       (14)

which indicates that if values of C(x) are wanted on T_L, then the values of V(y) that we need for the computation lie in T_{A+L} := [-(L+A)/2, (L+A)/2].
Remark 1 (Value of A). When working in the log-stock domain (e.g. x := log(S)), we approximate the standard deviation of the density function by the volatility implied by its characteristic function, and therefore approximate A by 10 times the volatility. This approximation gives good results in a series of experiments.
4.1 Discrete CONV Formula
Recall that functions on compact supports can be represented by their Fourier series; it then follows that we may rewrite V(y) as

    V(y) = Σ_{k∈ℤ} v_k e^{ik(2π/(A+L))y},  with  v_k = (1/(A+L)) ∫_{T_{A+L}} V(y) e^{-ik(2π/(A+L))y} dy.   (15)

Substituting the Fourier series of V(y) in (14) and interchanging the summation and the integration (allowed by Fubini's theorem) results in

    C(x) = Σ_{k∈ℤ} v_k [ ∫_{T_A} f(z) e^{ik(2π/(A+L))z} dz ] e^{ik(2π/(A+L))x},    (16)

where the integration inside the brackets is precisely the definition of the characteristic function at u = k·2π/(A+L). Truncate the series in (16) to yield

    Ĉ(x) = Σ_{k∈Z_N} v_k φ(k·2π/(A+L)) e^{ik(2π/(A+L))x},                          (17)

where Z_N = {n | -N/2 ≤ n < N/2, n ∈ ℤ}. Up to this point, (17) is almost ready for implementation, were v_k to be obtained numerically as well. To recover v_k, quadrature rules are employed. With the composite mid-point rule one obtains

    v_k ≈ (Δy/(L+A)) Σ_{j∈Z_N} e^{-ik(2π/(L+A))y_j} V(y_j),                        (18)

where Δy = (L+A)/N, {y_j := jΔy + y_c | j ∈ Z_N}, and y_c denotes the grid center. Substituting (18) into (17) then yields the discrete version of the CONV formula:

    C_m = (1/N) Σ_{k∈Z_N} e^{iu_k x_m} φ(u_k) Σ_{j∈Z_N} e^{-iu_k y_j} V(y_j),       (19)

where u_k = k·2π/(L+A) and {x_m := mΔy + x_c | m ∈ Z_N} with grid center x_c. Note that the x- and y-grids share the same mesh size, so that the same u-grid can be used in both the inner and the outer summations.
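To make the formula concrete, the following sketch evaluates (19) for a single time step (i.e. a European option) under GBM, ignoring the damping factor as in this section. The put payoff, the grid choice L = A = 10σ√T (cf. Remark 1) and all parameter values are illustrative assumptions; a Bermudan price would apply this step backwards in time together with the maximum in (4).

import numpy as np

def conv_european_put(S0=100.0, K=110.0, T=1.0, r=0.1, sigma=0.2, N=2**12):
    dt = T
    L = A = 10.0 * sigma * np.sqrt(dt)
    dy = (L + A) / N
    j = np.arange(N) - N // 2                      # j, k, m run over Z_N = {-N/2, ..., N/2-1}
    u = 2.0 * np.pi * j / (L + A)                  # u_k = 2*pi*k/(L+A)
    xc = yc = np.log(S0)                           # centre both grids at the log-spot
    y = yc + j * dy
    V = np.maximum(K - np.exp(y), 0.0)             # put payoff V(y) at maturity

    # characteristic function of the log-price increment under GBM over dt
    phi = np.exp(1j * u * (r - 0.5 * sigma**2) * dt - 0.5 * sigma**2 * u**2 * dt)

    sgn = np.where(j % 2 == 0, 1.0, -1.0)          # (-1)^k factors from the shifted grids
    inner = np.exp(-1j * u * yc) * sgn * np.fft.fftshift(np.fft.fft(V))   # sum over j in (19)
    G = np.exp(1j * u * xc) * phi * inner
    C = np.exp(-r * dt) * sgn * np.fft.fftshift(np.fft.ifft(G))           # sum over k in (19)
    return C.real[N // 2]                          # value at x_m = log(S0), i.e. m = 0

print(conv_european_put())   # should be close to the Black-Scholes put value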

4.2 Computational Complexity and Convergence Rate
The pleasing feature of (19) is that both summations can be evaluated rapidly by existing FFT algorithms. Therefore, the overall computational complexity is O(N log(N)) for European options, and O(MN log(N)) for an M-times exercisable Bermudan option.
Meanwhile, it can be proven analytically that the convergence rate of the method is O(1/N²) for both vanilla European and Bermudan options. Though we do not include the error analysis in this paper, the regular point-wise convergence rate of the method can be clearly observed in the experimental results.

5 Numerical Results

By various numerical experiments we aim to show the speed of the computations and the flexibility of the CONV method. Three underlying models are adopted in the experiments: Geometric Brownian Motion (GBM), Variance Gamma (VG), and CGMY. The pricing problems are of Bermudan and American style.
The computer used for the experiments has an Intel Pentium 4 CPU, 2.8 GHz frequency and a total of 1011 MB physical memory. The code is programmed in Matlab.
Results for 10-times exercisable Bermudan options under GBM and VG are summarized in Table 1, where the fast computational speed (e.g. less than 1 second for N = 2^16), the high accuracy (e.g. with only 2^6 grid points the error is already of level 10^-2) and the regular convergence rate (e.g. the convergence rate is 4 for the Bermudan payoff) are shown. Results for American options under VG and CGMY are summarized in Table 2, where P(N/2) denotes the results obtained by approximating the American option values directly by N/2-times exercisable


Table 1. CPU time, errors and convergence rate in pricing a 10-times exercisable Bermudan put under GBM and VG with the CONV method

           GBM: Reference = 10.4795201                VG: Reference = 9.04064611
N = 2^d    time(sec)  absolute error  convergence     time(sec)  absolute error  convergence
6          0.002      9.54e-02        -               0.001      7.41e-02        -
7          0.002      2.44e-02        3.91            0.002      5.42e-03        1.37
8          0.003      6.45e-03        3.78            0.003      2.68e-03        2.02
9          0.010      1.69e-03        3.81            0.006      6.10e-04        4.39
10         0.011      4.47e-04        3.79            0.015      1.38e-04        4.40
11         0.021      1.12e-04        3.97            0.022      3.16e-05        4.38
12         0.043      2.83e-05        3.97            0.042      7.92e-06        3.99
13         0.091      7.09e-06        4.00            0.096      1.99e-06        3.97
14         0.210      1.76e-06        4.04            0.208      5.15e-07        3.88

For GBM: S0 = 100, K = 110, T = 1, σ = 0.2, r = 0.1, q = 0;
For VG: S0 = 100, K = 110, T = 1, σ = 0.12, θ = 0.14, ν = 0.2, r = 0.1, q = 0;
Reference values are obtained by the PIDE method with 4 million grid points.

Table 2. CPU time and errors in pricing American puts under VG and CGMY with the CONV method

           VG: Reference = 0.800873607               CGMY (Y < 1):              CGMY (Y > 1):
                                                     Reference = 0.112171 [2]   Reference = 9.2185249 [13]
           P(N/2)              Richardson            Richardson                 Richardson
N = 2^d    time(sec)  error    time(sec)  error      time(sec)  error           time(sec)  error
7          0.01   4.61e-02     0.03   4.51e-02       0.02   1.37e-02            0.02   5.68e-01
8          0.04   6.47e-03     0.05   1.36e-02       0.04   2.08e-03            0.04   2.78e-01
9          0.11   6.78e-03     0.07   2.69e-03       0.07   4.83e-04            0.08   1.29e-01
10         0.45   5.86e-03     0.14   1.43e-03       0.12   9.02e-05            0.14   8.68e-03
11         1.73   2.87e-03     0.28   2.71e-04       0.26   4.21e-05            0.28   6.18e-04
12         7.18   1.03e-03     0.57   5.76e-05       0.55   2.20e-05            0.59   6.14e-03

For VG: S0 = 100, K = 90, T = 1, σ = 0.12, θ = 0.14, ν = 0.2, r = 0.1, q = 0; the reference value is from a PIDE implementation with about 65K x 16K grid points.
For CGMY (Y < 1): Y = 0.5, C = 1, G = M = 5, S0 = 1, K = 1, T = 1, r = 0.1, q = 0;
For CGMY (Y > 1): Y = 1.0102, C = 0.42, G = 4.37, M = 191.2, S0 = 90, K = 98, T = 0.25, r = 0.06, q = 0.

Bermudan options, and Richardson denotes the results obtained by the 6-times repeated Richardson extrapolation scheme. For the VG model, the extrapolation method turns out to converge much faster and to spend far less time than the direct approximation approach (e.g., to reach the same 10^-4 accuracy, the extrapolation method is more than 20 times faster than the direct-approximation method). For the CGMY model, results obtained with the extrapolation approach are given. They demonstrate that the CONV method can be combined well with the extrapolation technique as well as with any model with a known characteristic function.


6 Conclusions and Future Work

The CONV method, like other FFT-based methods, is quite flexible w.r.t. the choice of asset process and also the type of option contract. It can be applied whenever the underlying follows a Levy process and its characteristic function is known. The CONV method is highly accurate and fast in pricing Bermudan and American options. It can be used for fast option pricing and for parameter calibration purposes.
Future work includes a thorough error analysis and the application of the method to exotic options. Generalization of the method to higher dimensions and combination of the method with the sparse-grid method are also of great interest to us.

References
1. Andricopoulos, A.D., Widdicks, M., Duck, P.W., Newton, D.P.: Universal Option Valuation Using Quadrature. J. Financial Economics 67(3) (2003) 447-471
2. Almendral, A., Oosterlee, C.W.: Accurate Evaluation of European and American Options Under the CGMY Process. To appear in SIAM J. Sci. Comput. (2006)
3. Boyarchenko, S.I., Levendorskii, S.Z.: Non-Gaussian Merton-Black-Scholes theory. Vol. 9 of Advanced Series on Statistical Science & Appl. Probability, World Scientific Publishing Co. Inc., River Edge, NJ (2002)
4. Carr, P.P., Madan, D.B.: Option valuation using the Fast Fourier Transform. J. Comp. Finance 2 (1999) 61-73
5. Chang, C-C., Chung, S-L., Stapleton, R.C.: Richardson extrapolation technique for pricing American-style options. Proc. of 2001 Taiwanese Financial Association, Tamkang University, Taipei, June 2001. Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=313962
6. Cont, R., Tankov, P.: Financial modelling with jump processes. Chapman & Hall, Boca Raton, FL (2004)
7. Duffie, D., Pan, J., Singleton, K.: Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68 (2000) 1343-1376
8. Gil-Pelaez, J.: Note on the inverse theorem. Biometrika 37 (1951) 481-482
9. Heston, S.: A closed-form solution for options with stochastic volatility with applications to bond and currency options. Rev. Financ. Stud. 6 (1993) 327-343
10. Hirsa, A., Madan, D.B.: Pricing American Options Under Variance Gamma. J. Comp. Finance 7 (2004)
11. Matache, A.M., Nitsche, P.A., Schwab, C.: Wavelet Galerkin pricing of American options on Levy driven assets. Working paper, ETH, Zurich (2003)
12. O'Sullivan, C.: Path Dependent Option Pricing under Levy Processes. EFA 2005 Moscow Meetings Paper. Available at SSRN: http://ssrn.com/abstract=673424, Febr. 2005
13. Wang, I., Wan, J.W., Forsyth, P.: Robust numerical valuation of European and American options under the CGMY process. Techn. Report, U. Waterloo, Canada (2006)

Neural-Network-Based Fuzzy Group Forecasting with Application to Foreign Exchange Rates Prediction
Lean Yu1,2, Kin Keung Lai2, and Shouyang Wang1
1 Institute of Systems Science, Academy of Mathematics and Systems Science,
Chinese Academy of Sciences, Beijing 100080, China
{yulean,sywang}@amss.ac.cn
2 Department of Management Sciences, City University of Hong Kong,
Tat Chee Avenue, Kowloon, Hong Kong
{msyulean,mskklai}@cityu.edu.hk

Abstract. This study proposes a novel neural-network-based fuzzy group forecasting model for foreign exchange rates prediction. In the proposed model, some single neural network models are first used as predictors for foreign exchange rates prediction. Then the single prediction results produced by each single neural network model are fuzzified into fuzzy prediction representations. Subsequently, these fuzzified prediction representations are aggregated into a fuzzy group consensus, i.e., an aggregated fuzzy prediction representation. Finally, the aggregated prediction representation is defuzzified into a crisp value as the final prediction result. For illustration and testing purposes, a typical numerical example and three typical foreign exchange rates prediction experiments are presented. Experimental results reveal that the proposed model can significantly improve the prediction performance for foreign exchange rates.
Keywords: Artificial neural networks, fuzzy group forecasting, foreign exchange rates prediction.

1 Introduction
Foreign exchange rate forecasting has been a common research stream in the last few
decades. Over this time, the research stream has gained momentum with the advancement of computer technologies, which have made many elaborate computation
methods available and practical [1]. Due to its high volatility, foreign exchange rates
forecasting is regarded as a rather challenging task. For traditional statistical methods,
it is hard to capture the high volatility and nonlinear characteristics hidden in the
foreign exchange market. As a result, many emerging artificial intelligence techniques, such as artificial neural networks (ANNs), have been widely used in foreign exchange rates forecasting and have obtained good prediction performance. For example, De Matos [2]
compared the strength of a multilayer feed-forward neural network (FNN) with that of
a recurrent network based on the forecasting of Japanese yen futures. Kuan and Liu
[3] provided a comparative evaluation of the performance of MLFN and a recurrent


neural network (RNN) on the prediction of an array of commonly traded exchange


rates. In the article of Tenti [4], the RNN is directly applied to exchange rates forecasting. Hsu et al. [5] developed a clustering neural network (CNN) model to predict
the direction of movements in the USD/DEM exchange rate. Their experimental results suggested that their proposed model achieved better forecasting performance
relative to other indicators. In a more recent study by Leung et al. [6], the forecasting
accuracy of MLFN was compared with the general regression neural network
(GRNN). The study showed that the GRNN possessed a greater forecasting strength
relative to MLFN with respect to a variety of currency exchange rates. Similarly,
Chen and Leung [7] adopted an error correction neural network (ECNN) model to
predict foreign exchange rates. Yu et al. [8] proposed an adaptive smoothing neural
network (ASNN) model by adaptively adjusting error signals to predict foreign exchange rates and obtained good performance.
However, neural networks are a rather unstable learning paradigm: usually, small changes in the training set and/or parameter selection can produce large changes in the predicted output. To remedy this drawback, this paper attempts to utilize a set of neural network predictors to construct a fuzzy group forecasting methodology for foreign exchange rates prediction. In the proposed model, a number of single neural network models are first used as predictors for foreign exchange rates prediction. Then the single prediction results produced by each single neural network model
are fuzzified into some fuzzy prediction representations. Subsequently, these fuzzified
prediction representations are aggregated into a fuzzy group consensus, i.e., aggregated fuzzy prediction representation. Finally, the aggregated fuzzy prediction representation is defuzzified into a crisp value as the final prediction results.
The major aim of this study is to present a new forecasting paradigm called fuzzy
group forecasting that can significantly improve the prediction capability of neural
networks. The rest of this study is organized as follows. Section 2 describes the proposed neural network based fuzzy group forecasting model in detail. For further
illustration, a typical numerical example and three typical foreign exchange rates
prediction experiments are presented in Section 3. Section 4 concludes the study.

2 Neural-Network-Based Fuzzy Group Forecasting Methodology


In this section, a neural-network-based fuzzy group forecasting model is proposed for the time series prediction problem. The basic idea of the fuzzy group forecasting model is to make full use of the group members' knowledge to make a more accurate prediction than any single neural network predictor. For simplicity, this study utilizes three feed-forward neural network group members to construct a fuzzy group forecasting model. Generally speaking, the neural-network-based fuzzy group forecasting consists of four different steps.
Step I: Single Neural Predictor Creation. In order to construct different single neural predictors for a neural network model with the same structure, we use different training sets to create different neural network predictors. In this study the bagging algorithm proposed by Breiman [9] is used to generate the different training sets.
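As a rough illustration of this step, the bagging resampling can be sketched as follows (a minimal Python sketch; the function name, the window length of the lagged inputs and the number of replicates are our own illustrative assumptions, not values prescribed by the paper):

import numpy as np

def bagging_training_sets(series, n_sets=5, window=5, seed=0):
    # Build (lagged inputs, next value) patterns from a univariate series,
    # then draw n_sets bootstrap replicates (sampling with replacement),
    # in the spirit of Breiman's bagging [9].
    rng = np.random.default_rng(seed)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array([series[i + window] for i in range(len(series) - window)])
    sets = []
    for _ in range(n_sets):
        idx = rng.integers(0, len(X), size=len(X))
        sets.append((X[idx], y[idx]))
    return sets

Each single neural predictor would then be trained on one of these resampled sets.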

Step II: Single Predictors Fuzzification. Based on these different training sets, each
neural network can produce some different predictors. Using the different predictors,
we can obtain different prediction results. Because the neural predictor is an unstable learning paradigm, we are required to integrate the different results produced by the different predictors, as noted earlier in Section 1. For these different predictions, we consider them as a fuzzy number for further processing. For example, suppose that the original dataset is used to create k different training sets, i.e., TR_1, TR_2, ..., TR_k, via the bagging algorithm; we can use them to generate k models, i.e., f_1^i, f_2^i, ..., f_k^i, for the ith neural network predictor. Accordingly, k different predictions, i.e., f_1^i(x), f_2^i(x), ..., f_k^i(x), can be generated by the ith neural network predictor in out-of-sample forecasting. In order to make full use of all the information provided by the different predictions, without loss of generality, we utilize the triangular fuzzy number to construct a fuzzy representation of the different predicted results. That is, the smallest, average and largest of the k predictions are used as the left, medium and right membership points: the smallest and largest values are seen as the optimistic and pessimistic predictions, and the average value is considered to be the most likely prediction. Using this fuzzification method, one can make a fuzzy prediction judgment for each time series. More clearly, the triangular fuzzy number in this case can be represented as

\tilde{Z}_i = (z_{i1}, z_{i2}, z_{i3}) = \Big( \min\big[f_1^i(x), f_2^i(x), \ldots, f_k^i(x)\big],\; \big[\textstyle\sum_{j=1}^{k} f_j^i(x)\big]/k,\; \max\big[f_1^i(x), f_2^i(x), \ldots, f_k^i(x)\big] \Big)    (1)

In such a fuzzification processing way, the prediction problem is extended into a fuzzy group forecasting framework.
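A minimal Python sketch of this fuzzification step (the function name is ours; it simply implements Equation (1)):

import numpy as np

def fuzzify(predictions):
    # k point predictions of one predictor -> triangular fuzzy number
    # (z_i1, z_i2, z_i3) = (smallest, average, largest), as in Equation (1).
    p = np.asarray(predictions, dtype=float)
    return p.min(), p.mean(), p.max()

# For instance, the five FNN(5-12-1) predictions quoted later in Section 3.1,
# [7.8309, 7.8292, 7.8302, 7.8385, 7.8278], fuzzify to about (7.8278, 7.8313, 7.8385).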
Step III: Fuzzy Prediction Aggregation. Suppose that there are p different group members, i.e., p different neural network predictors, and let $\tilde{Z} = g(\tilde{Z}_1, \tilde{Z}_2, \ldots, \tilde{Z}_p)$ be the aggregation of the p fuzzy prediction representations, where $g(\cdot)$ is an aggregation function. How to determine the aggregation function, i.e., how to aggregate these fuzzy prediction representations into a group prediction consensus, is an important and critical problem in the fuzzy group forecasting environment. Generally speaking, there are many aggregation operators and rules that can be used to aggregate fuzzy prediction representations. Usually, the fuzzy prediction representations of the p group members are aggregated using a commonly used linear additive procedure, i.e.,

\tilde{Z} = \sum_{i=1}^{p} w_i \tilde{Z}_i = \Big( \sum_{i=1}^{p} w_i z_{i1},\; \sum_{i=1}^{p} w_i z_{i2},\; \sum_{i=1}^{p} w_i z_{i3} \Big)    (2)

where $w_i$ is the weight of the $i$th group forecasting member, $i = 1, 2, \ldots, p$. The weights usually satisfy the following normalization condition:

\sum_{i=1}^{p} w_i = 1    (3)

The key to this aggregation procedure is how to determine the optimal weight wi of
the ith fuzzy prediction representation under fuzzy group forecasting environment.

Often the fuzzy representations of the predictions are largely dispersed and separated.
In order to achieve the maximum similarity, the fuzzy prediction representations
should move towards one another. This is the principle on the basis of which an aggregated fuzzy prediction representation is generated. Relying on this principle, a least-squares aggregation optimization approach is proposed to integrate the fuzzy prediction results produced by different prediction models.
The generic idea of this aggregation optimization approach is to minimize the sum of squared distances from one fuzzy prediction to another and thus bring them into maximum agreement. In particular, the squared distance between $\tilde{Z}_i$ and $\tilde{Z}_j$ can be defined by

d_{ij}^2 = \big( w_i \tilde{Z}_i - w_j \tilde{Z}_j \big)^2 = \sum_{l=1}^{3} \big( w_i z_{il} - w_j z_{jl} \big)^2    (4)

Using this definition, we can construct the following optimization model, which minimizes the sum of the squared distances between all pairs of weighted fuzzy prediction representations:

(P)  Minimize  D = \sum_{i=1}^{p} \sum_{j=1, j \neq i}^{p} d_{ij}^2 = \sum_{i=1}^{p} \sum_{j=1, j \neq i}^{p} \sum_{l=1}^{3} \big( w_i z_{il} - w_j z_{jl} \big)^2    (5)

     Subject to  \sum_{i=1}^{p} w_i = 1    (6)

                 w_i \geq 0, \quad i = 1, 2, \ldots, p    (7)

In order to solve for the optimal weights, constraint (7) is first left out; if the solution turns out to be nonnegative, then constraint (7) is satisfied automatically. Using the Lagrange multiplier method, Equations (5) and (6) lead to the following Lagrangian function:
L(w, \lambda) = \sum_{i=1}^{p} \sum_{j=1, j \neq i}^{p} \Big[ \sum_{l=1}^{3} \big( w_i z_{il} - w_j z_{jl} \big)^2 \Big] - 2\lambda \Big( \sum_{i=1}^{p} w_i - 1 \Big)    (8)

Differentiating (8) with respect to $w_i$, we obtain


\frac{\partial L}{\partial w_i} = 2 \sum_{j=1, j \neq i}^{p} \sum_{l=1}^{3} \big( w_i z_{il} - w_j z_{jl} \big) z_{il} - 2\lambda = 0   for each i = 1, 2, \ldots, p.    (9)

Equation (9) can be simplified as

(p-1)\Big(\sum_{l=1}^{3} z_{il}^2\Big) w_i - \sum_{j=1, j \neq i}^{p} \Big(\sum_{l=1}^{3} z_{il} z_{jl}\Big) w_j - \lambda = 0   for each i = 1, 2, \ldots, p.    (10)

Setting $W = (w_1, w_2, \ldots, w_p)^T$, $I = (1, 1, \ldots, 1)^T$ (where T denotes transpose), $b_{ii} = (p-1)\sum_{l=1}^{3} z_{il}^2$ for $i = 1, 2, \ldots, p$, and $b_{ij} = -\sum_{l=1}^{3} z_{il} z_{jl}$ for $i, j = 1, 2, \ldots, p$, $j \neq i$, we have

B = (b_{ij})_{p \times p} =
\begin{pmatrix}
(p-1)\sum_{l=1}^{3} z_{1l}^2 & -\sum_{l=1}^{3} z_{1l} z_{2l} & \cdots & -\sum_{l=1}^{3} z_{1l} z_{pl} \\
-\sum_{l=1}^{3} z_{2l} z_{1l} & (p-1)\sum_{l=1}^{3} z_{2l}^2 & \cdots & -\sum_{l=1}^{3} z_{2l} z_{pl} \\
\cdots & \cdots & \cdots & \cdots \\
-\sum_{l=1}^{3} z_{pl} z_{1l} & -\sum_{l=1}^{3} z_{pl} z_{2l} & \cdots & (p-1)\sum_{l=1}^{3} z_{pl}^2
\end{pmatrix}    (11)

Using the matrix form and the above settings, Equations (10) and (6) can be rewritten as

BW - \lambda I = 0    (12)

I^T W = 1    (13)

Similarly, Equation (5) can be expressed in matrix form as $D = W^T B W$. Because D is a squared distance, which is usually larger than zero, B should be positive definite and invertible. Using Equations (12) and (13) together, we can obtain

\lambda^* = 1 / (I^T B^{-1} I)    (14)

W^* = B^{-1} I \lambda^* = B^{-1} I / (I^T B^{-1} I)    (15)

Since B is a positive definite matrix, all its principal minors are strictly positive and thus B is a nonsingular M-matrix [10]. According to the properties of M-matrices, we know that $B^{-1}$ is nonnegative. Therefore $W^* \geq 0$, which implies that the constraint in Equation (7) is satisfied.
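The optimal weights in (14)-(15) can be computed directly from the fuzzy predictions; a minimal Python sketch (the function names are ours) is:

import numpy as np

def aggregation_weights(Z):
    # Z: array of shape (p, 3) with p triangular fuzzy predictions.
    # Returns the optimal weight vector W* of Equation (15).
    Z = np.asarray(Z, dtype=float)
    p = Z.shape[0]
    G = Z @ Z.T                        # G[i, j] = sum_l z_il * z_jl
    B = -G + np.diag(p * np.diag(G))   # Equation (11): diagonal (p-1)*G_ii, off-diagonal -G_ij
    ones = np.ones(p)
    Binv_ones = np.linalg.solve(B, ones)        # B^{-1} I
    return Binv_ones / (ones @ Binv_ones)       # Equation (15)

def aggregate(Z, w):
    # Weighted fuzzy group consensus of Equation (2).
    return np.asarray(w) @ np.asarray(Z)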
Step IV: Aggregated Prediction Defuzzification. After completing the aggregation, a fuzzy group consensus can be obtained by Equation (2). To obtain a crisp value for decision-making purposes, we use a defuzzification procedure. According to Bortolan and Degani [11], the defuzzified value of a
triangular fuzzy number $\tilde{Z} = (z_1, z_2, z_3)$ can be determined by its centroid, which is computed by

\bar{z} = \frac{\int_{z_1}^{z_3} x\,\mu_{\tilde{Z}}(x)\,dx}{\int_{z_1}^{z_3} \mu_{\tilde{Z}}(x)\,dx}
        = \frac{\int_{z_1}^{z_2} x\,\frac{x - z_1}{z_2 - z_1}\,dx + \int_{z_2}^{z_3} x\,\frac{z_3 - x}{z_3 - z_2}\,dx}{\int_{z_1}^{z_2} \frac{x - z_1}{z_2 - z_1}\,dx + \int_{z_2}^{z_3} \frac{z_3 - x}{z_3 - z_2}\,dx}
        = \frac{z_1 + z_2 + z_3}{3}    (16)
In this way, a final group forecasting consensus is computed with the above processes. For illustration and verification purposes, an illustrative numerical example and three typical foreign exchange rate experiments are conducted.

3 Experiments
In this section, we first present an illustrative numerical example to explain the implementation process of the proposed fuzzy group forecasting model using the US dollar against Chinese renminbi (USD/RMB) exchange rate series. Then three typical foreign exchange rates, the US dollar against each of the three currencies British pound (GBP), euro (EUR) and Japanese yen (JPY), are used for testing. All four exchange rate series are
obtained from Pacific Exchange Rates Services (http://fx.sauder.ubc.ca/), provided by
Professor Werner Antweiler, University of British Columbia, Vancouver, Canada.
3.1 An Illustrative Numerical Example
Assume that there is a USD/RMB series covering January 1, 2006 to November 30, 2006, and one would like to predict the future USD/RMB exchange rate, e.g., for December 1, 2006. For simplicity, we first apply three standard FNN models with different topological structures to conduct this example; that is, we use three different numbers of hidden neurons to generate three different FNN models. In this example, we try to utilize five different models for prediction; for this purpose, the bagging sampling algorithm [9] is used to create five different training sets. For each FNN model, the five different training sets are used and five different prediction results are produced. With the above assumptions, the three FNN models can produce 15 different prediction results, five per model, as shown below.
FNN (5-09-1) = (7.8211, 7.8321, 7.8451, 7.8122, 7.8247)
FNN (5-12-1) = (7.8309, 7.8292, 7.8302, 7.8385, 7.8278)
FNN (5-15-1) = (7.8082, 7.8199, 7.8208, 7.8352, 7.8393)
where the numbers in parentheses indicate the topological structures of standard FNN
model, for example, (5-12-1) represents five input nodes, twelve hidden nodes and
one output node. Using Equation (1), the predictions of the three FNN models can be
fuzzified into three triangular fuzzy numbers as fuzzy prediction representations, i.e.,

\tilde{Z}_{FNN1} = (z_{11}, z_{12}, z_{13}) = (7.8122, 7.8270, 7.8451)
\tilde{Z}_{FNN2} = (z_{21}, z_{22}, z_{23}) = (7.8278, 7.8313, 7.8385)
\tilde{Z}_{FNN3} = (z_{31}, z_{32}, z_{33}) = (7.8082, 7.8247, 7.8393)
The subsequent task is to aggregate the three fuzzy prediction representations into
a group prediction consensus. Using the above optimization method, we can obtain
the following results:

B = \begin{pmatrix} 367.6760 & -183.9417 & -183.7432 \\ -183.9417 & 368.0916 & -183.8470 \\ -183.7432 & -183.8470 & 367.2971 \end{pmatrix}, \quad
B^{-1} = 10^{3} \times \begin{pmatrix} 2.1439 & 2.1427 & 2.1450 \\ 2.1427 & 2.1415 & 2.1438 \\ 2.1450 & 2.1438 & 2.1461 \end{pmatrix},


W^{*T} = (0.3333, 0.3332, 0.3335), \quad \tilde{Z}^* = \sum_{i=1}^{3} w_i^* \tilde{Z}_i = (7.8161, 7.8277, 7.8410)

The final step is to defuzzify the aggregated fuzzy prediction value into a crisp prediction value. Using Equation (16), the defuzzified value of the final group prediction consensus is calculated as

\bar{z} = (7.8161 + 7.8277 + 7.8410)/3 = 7.8282


According to the data source, the actual value of USD/RMB on December 1, 2006 is 7.8283. By comparison, our fuzzy group prediction is rather promising. For further verification, three typical foreign exchange rates are tested.
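For readers who wish to reproduce these numbers, the following self-contained Python check re-implements Equations (11), (15), (2) and (16) for the three fuzzy numbers above (a sketch, not the authors' code); up to rounding it returns the weights, the aggregated fuzzy number and the defuzzified value reported in this subsection:

import numpy as np

Z = np.array([[7.8122, 7.8270, 7.8451],   # FNN1
              [7.8278, 7.8313, 7.8385],   # FNN2
              [7.8082, 7.8247, 7.8393]])  # FNN3
p = len(Z)
G = Z @ Z.T
B = -G + np.diag(p * np.diag(G))          # Equation (11)
w = np.linalg.solve(B, np.ones(p))
w /= w.sum()                              # Equation (15)
z_star = w @ Z                            # Equation (2)
print(w.round(4), z_star.round(4), round(float(z_star.sum()) / 3, 4))  # Equation (16)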
3.2 Three Foreign Exchange Rates Prediction Experiments
In this subsection, three typical foreign exchange rates are used to test the effectiveness of the proposed neural network-based fuzzy group forecasting model. The data
used here are monthly and they consist of the USD/GBP, USD/EUR and USD/JPY.
We take monthly data from January 1971 to December 2000 as in-sample (training
periods) data sets (360 observations including 60 samples for cross-validations). We
also take the data from January 2001 to November 2006 as out-of-sample (testing
periods) data sets (71 observations), which are used to evaluate the prediction performance based on some evaluation measurements. For evaluation, two typical indicators, the normalized mean squared error (NMSE) [1] and the directional statistic (Dstat) [1], are used. In addition, for comparison purposes, linear regression (LinR), logit regression (LogR) and a single FNN model are used here. In particular, we use ten different FNN models with different structures to construct the fuzzy group forecasting model, and the bagging algorithm [9] is used to create ten different training sets. Accordingly, the results obtained are reported in Table 1.
Table 1. The NMSE and Dstat comparisons for different models

                           GBP                  EUR                  JPY
Models                     NMSE     Dstat (%)   NMSE     Dstat (%)   NMSE     Dstat (%)
Single LinR Model          0.0811   57.75       0.0892   59.15       0.2346   56.33
Single LogR Model          0.0792   63.38       0.0669   67.61       0.1433   70.42
Single FNN Model           0.0767   69.01       0.0663   70.42       0.1782   71.83
Fuzzy Group Forecasting    0.0083   84.51       0.0112   83.10       0.0458   81.69

From Table 1, a clear comparison of the various methods for the three currencies is given via NMSE and Dstat. Generally speaking, the results also indicate that the prediction performance of the proposed neural-network-based fuzzy group forecasting model is better than those of the single neural network model and the linear and logit regression forecasting models for the three main currencies. The main reasons are that (1) aggregating multiple predictions into a group consensus can definitely improve the performance, as Yu et al. [1] revealed; (2) fuzzification of the predictions may generalize the model by processing some uncertainties of forecasting; and (3) as a universal approximator, the neural network might also contribute to the performance improvement.

4 Conclusions
In this study, a new neural-network-based fuzzy group forecasting model is proposed for foreign exchange rates prediction. In terms of the empirical results, we find that across the different models, for the test cases of three main currencies (British pound (GBP), euro (EUR) and Japanese yen (JPY)) and on the basis of different evaluation criteria, our proposed neural-network-based fuzzy group forecasting method performs the best: its NMSE is the lowest and its Dstat is the highest. This indicates that the proposed neural-network-based fuzzy group forecasting model can be used as a promising tool for foreign exchange rates prediction.

Acknowledgements
This work is supported by the grants from the National Natural Science Foundation of
China (NSFC No. 70221001, 70601029), the Chinese Academy of Sciences (CAS
No. 3547600), the Academy of Mathematics and Systems Sciences (AMSS No.
3543500) of CAS, and the Strategic Research Grant of City University of Hong Kong
(SRG No. 7001677, 7001806).

References
1. Yu, L., Wang, S.Y., Lai, K.K.: A Novel Nonlinear Ensemble Forecasting Model Incorporating GLAR and ANN for Foreign Exchange Rates. Computers & Operations Research
32 (2005) 2523-2541
2. De Matos G.: Neural Networks for Forecasting Exchange Rate. M. Sc. Thesis. The University of Manitoba, Canada (1994)
3. Kuan, C.M., Liu, T.: Forecasting Exchange Rates Using Feedforward and Recurrent Neural Networks. Journal of Applied Econometrics 10 (1995) 347-364
4. Tenti, P.: Forecasting Foreign Exchange Rates Using Recurrent Neural Networks. Applied
Artificial Intelligence 10 (1996) 567-581
5. Hsu, W., Hsu, L.S., Tenorio, M.F.: A Neural Network Procedure for Selecting Predictive
Indicators in Currency Trading. In Refenes A.N. (Ed.): Neural Networks in the Capital
Markets. New York: John Wiley and Sons (1995) 245-257
6. Leung, M.T., Chen, A.S., Daouk, H.: Forecasting Exchange Rates Using General Regression Neural Networks. Computers & Operations Research 27 (2000) 1093-1110
7. Chen, A.S., Leung, M.T.: Regression Neural Network for Error Correction in Foreign Exchange Rate Forecasting and Trading. Computers & Operations Research 31 (2004)
1049-1068
8. Yu, L., Wang, S.Y., Lai, K.K. Adaptive Smoothing Neural Networks in Foreign Exchange
Rate Forecasting. Lecture Notes in Computer Science 3516 (2005) 523-530
9. Breiman, L.: Bagging Predictors. Machine Learning 26 (1996) 123-140
10. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. Academic, New York (1979)
11. Bortolan, G., Degani, R.: A Review of Some Methods for Ranking Fuzzy Subsets. Fuzzy
Sets and Systems 15 (1985) 1-19

Credit Risk Evaluation Using Support Vector Machine
with Mixture of Kernel

Liwei Wei 1,2, Jianping Li 1, and Zhenyu Chen 1,2

1 Institute of Policy & Management, Chinese Academy of Sciences, Beijing 100080, China
2 Graduate University of Chinese Academy of Sciences, Beijing 100039, China
{lwwei, ljp, zychen}@casipm.ac.cn

Abstract. Recent studies have revealed that emerging modern machine learning techniques, such as SVM, are advantageous over statistical models for credit risk evaluation. In this study, we discuss the application of the support vector machine with mixture of kernel (SVM-MK) to design a credit evaluation system which can discriminate good creditors from bad ones. Differing from the standard SVM, the SVM-MK uses a 1-norm based objective function and adopts convex combinations of single-feature basic kernels. Only a linear programming problem needs to be solved, which greatly reduces the computational costs. More importantly, it is a transparent model and the optimal feature subset can be obtained automatically. A real-life credit dataset from a US commercial bank is used to demonstrate the good performance of the SVM-MK.
Keywords: Credit risk evaluation, SVM-MK, feature selection, classification model.

1 Introduction
Undoubtedly, credit risk evaluation is an important field in financial risk management. Extant evidence shows that in the past two decades bankruptcies and defaults have occurred at a higher rate than at any other time. Thus, the ability to accurately assess the existing risk and discriminate good applicants from bad ones is crucial for financial institutions, especially for any credit-granting institution such as commercial banks and certain retailers. Due to this situation, many credit classification models have been developed to predict default accurately, and some interesting results have been obtained.
These credit classification models apply a classification technique on similar data
of previous customers to estimate the corresponding risk rate so that the customers
can be classified as normal or default. Some researchers used the statistical models,
such as Linear discriminate analysis [1], logistic analysis [2] and probit regression [3],
in the credit risk evaluation. These models have been criticized for lack of classification precision because the covariance matrices of the good and bad credit classes are
not likely to be equal.
Recently, with the emergence of decision trees [4] and neural networks [5], artificial intelligence (AI) techniques have been widely applied to credit scoring tasks. They have


obtained promising results and outperformed the traditional statistics. But these methods often suffer from local minima and overfitting problems.
The support vector machine (SVM) was first proposed by Vapnik [6]. It has now been proved to be a powerful and promising data classification and function estimation tool. References [7], [8] and [9] applied SVM to credit analysis and obtained some valuable results. But SVM is sensitive to outliers and noise in the training sample and has limited interpretability due to its kernel theory. Another problem is that SVM has a high computational complexity because a large-scale quadratic programming problem must be solved in the parameter iterative learning procedure.
Recently, how to learn the kernel from data has drawn many researchers' attention. Reference [10] draws the conclusion that the optimal kernel can always be obtained as a convex combination of finitely many basic kernels, and some formulations [11], [12] have been proposed to perform the optimization in the manner of convex combinations of basic kernels.
Motivated by the above questions and ideas, we propose a new method named support vector machine with mixture of kernel (SVM-MK) to evaluate credit risk. In this method the kernel is a convex combination of finitely many basic kernels. Each basic
kernel has a kernel coefficient and is provided with a single feature. The 1-norm is
utilized in SVM-MK. As a result, its objective function turns into a linear programming parameter iterative learning procedure and greatly reduces the computational
complexity. Furthermore, we can select the optimal feature subset automatically and
get an interpretable model.
The rest of this paper is organized as follows: section 2 gives a brief outline of
SVM-MK. To evaluate the performance of SVM-MK for the credit risk assessment,
we use a real life credit card data from a major US commercial bank in this test in
section 3. Finally, section 4 draws the conclusion and gives an outlook of possible
future research areas.

2 Support Vector Machine with Mixture of Kernel


Consider a training data set $\{(\vec{x}_i, y_i)\}_{i=1}^{n}$, where $\vec{x}_i \in R^m$ is the $i$th input pattern and $y_i$ is its corresponding observed result, $y_i \in \{+1, -1\}$. In the credit risk evaluation model, $\vec{x}_i$ denotes the attributes of applicants or creditors, and $y_i$ is the observed result of timely repayment. If the applicant defaults on a loan, $y_i = 1$; otherwise $y_i = -1$.
The optimal separating hyper-plane is found by solving the following regularized
optimization problem [6]:

\min J(\omega, \xi) = \frac{1}{2}\|\omega\|^2 + c \sum_{i=1}^{n} \xi_i    (1)

s.t.  y_i\big[\omega^T \phi(\vec{x}_i) + b\big] \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, \ldots, n    (2)


where c is a constant denoting a trade-off between the margin and the sum of total errors. $\phi(\vec{x})$ is a nonlinear function that maps the input space into a higher-dimensional feature space. The margin between the two parts is $2/\|\omega\|$.
errors. ( x ) is a nonlinear function that maps the input space into a higher dimensional feature space. The margin between the two parts is 2

The quadratic optimization problem can be solved by transforming (1) and (2) into
the saddle point of the Lagrange dual function:

\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j k(\vec{x}_i, \vec{x}_j)    (3)

s.t.  \sum_{i=1}^{n} y_i \alpha_i = 0, \quad 0 \leq \alpha_i \leq c, \quad i = 1, \ldots, n    (4)

where $k(\vec{x}_i, \vec{x}_j) = \phi(\vec{x}_i)^T \phi(\vec{x}_j)$ is called the kernel function and $\alpha_i$ are the Lagrange multipliers.


In practice, a simple and efficient choice is to express the kernel function as a convex combination of basic kernels:

k(\vec{x}_i, \vec{x}_j) = \sum_{d=1}^{m} \mu_d\, k(x_{i,d}, x_{j,d})    (5)

where $x_{i,d}$ denotes the $d$th component of the input vector $\vec{x}_i$.

Substituting Equation (5) into Equation (3), multiplying Equations (3) and (4) by $\mu_d$, and letting $\beta_{i,d} = \alpha_i \mu_d$, the Lagrange dual problem changes into:

\max_{\beta} \sum_{i,d} \beta_{i,d} - \frac{1}{2} \sum_{i,j,d=1}^{n,m} \beta_{i,d}\, \beta_{j,d}\, y_i y_j\, k(x_{i,d}, x_{j,d})    (6)

s.t.  \sum_{i,d} y_i \beta_{i,d} = 0, \quad 0 \leq \sum_{d=1}^{m} \beta_{i,d} \leq c, \; i = 1, \ldots, n, \quad \beta_{i,d} \geq 0, \; d = 1, \ldots, m    (7)

The new coefficient $\beta_{i,d}$ replaces the Lagrange coefficient $\alpha_i$. The number of coefficients that need to be optimized is increased from $n$ to $n \times m$. This increases the computational cost, especially when the number of attributes in the dataset is large. The linear programming implementation of SVM is a promising approach to reduce the computational cost of SVM and has attracted some scholars' attention. Based on the above idea, a 1-norm based linear programming is proposed:

\min J(\beta, \xi) = \lambda \sum_{i,d} \beta_{i,d} + \sum_{i=1}^{n} \xi_i    (8)


s.t.  y_i\Big[\sum_{j,d} \beta_{j,d}\, y_j\, k(x_{i,d}, x_{j,d}) + b\Big] \geq 1 - \xi_i, \quad \xi_i \geq 0, \; i = 1, \ldots, n, \quad \beta_{j,d} \geq 0, \; d = 1, \ldots, m    (9)

In Equation (8), the regularized parameter $\lambda$ controls the sparseness of the coefficients $\beta_{i,d}$.

The dual of this linear programming is:

\max \sum_{i=1}^{n} u_i    (10)

s.t.  \sum_{i=1}^{n} u_i\, y_i\, y_j\, k(x_{i,d}, x_{j,d}) \leq 1, \; j = 1, \ldots, n, \; d = 1, \ldots, m, \quad \sum_{i=1}^{n} u_i y_i = 0, \quad 0 \leq u_i, \; i = 1, \ldots, n    (11)

The choice of kernel function includes the linear kernel, polynomial kernel or RBF
kernel. Thus, the SVM-MK classifier can be represented as:

f(\vec{x}) = \mathrm{sign}\Big( \sum_{j,d} \beta_{j,d}\, y_j\, k(x_d, x_{j,d}) + b \Big)    (12)

It can be found that the above linear programming formulation and its dual description are equivalent to those of the approach called "mixture of kernel" [12]. So the new coefficient $\beta_{i,d}$ is called the mixture coefficient, and this approach is named support vector machine with mixture of kernel (SVM-MK). Compared with the standard SVM, which obtains the solution $\alpha_i$ by solving a quadratic programming problem, the SVM-MK obtains the values of the mixture coefficients $\beta_{i,d}$ by solving a linear program, so the SVM-MK model greatly reduces the computational complexity. More importantly, the sparse coefficients $\beta_{i,d}$ give us more choices to extract satisfactory features in the whole space spanned by all the attributes.
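To make the formulation concrete, the following is a minimal Python sketch of the 1-norm linear program (8)-(9) and the classifier (12), using a per-feature Gaussian kernel and SciPy's LP solver. It is only an illustration under our own naming conventions, not the authors' Matlab implementation; in particular, splitting the free bias b into two nonnegative parts is a standard LP device and is not taken from the paper.

import numpy as np
from scipy.optimize import linprog

def feature_kernel(X1, X2, sigma2):
    # k(x_{i,d}, x_{j,d}) for every pair (i, j) and every feature d;
    # returns an array of shape (n1, n2, m).
    diff = X1[:, None, :] - X2[None, :, :]
    return np.exp(-diff ** 2 / (2.0 * sigma2))

def svm_mk_train(X, y, lam=3.0, sigma2=5.45):
    # LP variables: beta (n*m), xi (n), b_plus, b_minus (b = b_plus - b_minus).
    n, m = X.shape
    K = feature_kernel(X, X, sigma2)                               # (n, n, m)
    M = (y[:, None, None] * y[None, :, None] * K).reshape(n, n * m)
    # Constraint (9): y_i(sum beta_{j,d} y_j k + b) >= 1 - xi_i  ->  A_ub z <= -1.
    A_ub = np.hstack([-M, -np.eye(n), -y[:, None], y[:, None]])
    b_ub = -np.ones(n)
    # Objective (8): lambda * sum(beta) + sum(xi); b carries zero cost.
    c = np.concatenate([lam * np.ones(n * m), np.ones(n), [0.0, 0.0]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    z = res.x
    beta = z[:n * m].reshape(n, m)
    b = z[n * m + n] - z[n * m + n + 1]
    return beta, b

def svm_mk_predict(X_new, X, y, beta, b, sigma2=5.45):
    # Classifier of Equation (12).
    K = feature_kernel(X_new, X, sigma2)                           # (n_new, n, m)
    return np.sign(np.einsum('ijd,jd->i', K, beta * y[:, None]) + b)

A feature d can then be regarded as selected when some beta[j, d] is nonzero (up to numerical tolerance), which is one natural reading of the automatic feature selection described above.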

3 Experiment Analysis
In this section, a real-world credit dataset is used to test the performance of SVM-MK. The dataset is from a major US commercial bank. It includes detailed information on 5000 applicants, and two classes are defined: good and bad creditors. Each record consists of 65 variables, such as payment history, transactions, opened accounts, etc. Among the 5000 applicants, the number of bad accounts is 815 and the others are good accounts. Thus the dataset is greatly imbalanced, so we preprocess the data by means of a sampling method and make use of 5-fold cross-validation to guarantee valid results. In addition, three evaluation criteria measure the efficiency of classification:

Type I error = (number of observed good but classified as bad) / (number of observed good)    (13)

Type II error = (number of observed bad but classified as good) / (number of observed bad)    (14)

Total error = (number of false classifications) / (number of evaluation samples)    (15)

3.1 Experiment Result


Our implementation was carried out in Matlab 6.5. Firstly, the data are normalized. In this method the Gaussian kernel is used, and the kernel parameter needs to be chosen. Thus the method has two parameters to be set in advance: the kernel parameter σ² and the regularized parameter λ. The Type I error (e1), Type II error (e2), Total error (e), number of selected features and the best pair (σ², λ) for each fold using the SVM-MK approach are shown in Table 1. For this method, the average Type I error is 24.25%, the average Type II error is 17.24%, the average Total error is 23.22%, and the average number of selected features is 18.
Table 1. Experimental results for each fold using SVM-MK

Fold #     e1 (%)   e2 (%)   e (%)   Optimized σ²   Optimized λ   Selected features
1          20.2     15.7     18.9    4              5             19
2          27.75    15.39    26.3    5.45           3             13
3          24.54    14.29    23.1    5              1             8
4          20.27    13.68    19.5    3              5             25
5          28.49    27.14    28.3    1.2            8             26
Average    24.25    17.24    23.22                                18

Then, we take fold #2 as an example. The experimental results of SVM-MK using various parameters are illustrated in Table 2. The performance is evaluated using five measures: the number of selected features (NSF), the selected specific features, the Type I error (e1), the Type II error (e2) and the Total error (e).
From Table 2, we can draw the following conclusions:
(1) The values of the parameters giving the best prediction results are λ = 3 and σ² = 5.45; the number of selected features is 13, the bad-class error is 15.39% and the total error is 26.3%, which are the lowest errors compared to the other situations. Almost eight out of ten default creditors can be discriminated from the good ones using this model, at the expense of denying a small number of the non-default creditors.
(2) With the increase of λ, the number of selected features gradually increases. A bigger parameter λ makes the values of the coefficient matrix decrease so that the constraints of the dual linear programming can be satisfied. As a


Table 2. Classification error and selected features of various parameters in SVM-MK (fold #2)

10

15

20

3
13
20
32
41
41
54 3,8
3,8,9,10,11 3, 4, 8,9,10,12 3,4,5,6,8,10,12
3,4,5,6,8,10,12,14,15
55 10,11 14,20,24,31 14,15,16,17,20 14,15,16,17,19,20
16,17,19,20,23,24,25
61 14,20,31 38,40,42,47 23,24,25,28,31 23,24,25,28,31,32,33 28,31,32,33,34,37,38

NSF
Selected
specific
features

38,47,52 51,52,53,54
53,55,61 55,60,61

e1
e2
2 =2 e
e1
5
e2
e
e1
5.45 e2
e
e1
10 e2
e
e1
11.4 e2
e
e1
15 e2
e

100
0
87.3
17.8
32.7
18.1
0
100
13.9
0
100
13.9
0
100
13.9
0
100
13.9

92
3. 37
63. 5
38.2
8.6
34.1
27.75
15.39
26.3
98.88
0
86
99.88
0
87.3
100
0
87.3

32,34,38,39,40 34,37,38,39,40,41,42
42,45,47,51,52 43,45,47,48,50,51,52
53,54,55,58
53,54,55,58,59,61,63

14. 52
48.2
19. 2
22
29
23
21
32.4
24.5
19. 4
35. 3
21. 6
26. 62
16. 24
25. 4
32.1
10. 8
29.1

22. 55
41. 45
26.2
4. 2
83.89
17
11.7
100
11.7
2 .3
93.16
11.8
1. 1
94. 87
11. 9
1. 15
92.83
13.9

35. 4
21. 55
32. 45
14.04
29. 9
15. 9
11.7
100
11.7
88
3. 1
77. 8
92
7. 53
79. 94
94. 36
6. 3
81. 6

39,40,41,42,43,45
47,48,50,51,52,53
54,55,58,59,61,63

0.11
98. 29
11. 6
0.68
93.16
11.5
11.7
100
11.7
3. 2
87. 9
11. 9
12.9
29.9
14.9
5. 21
60. 68
11. 7

result, the sparseness of the coefficients β_{i,d} becomes poor in the primal LP problem. That is to say, a bigger parameter λ results in selecting a good many features. But the parameter σ² has no effect on the feature selection. When the parameter λ is equal to 3, only 13 attributes are selected and the best classification results are obtained. It is shown that a reasonable feature extraction can improve the performance of the learning algorithm greatly. These selected specific attributes also help the lender easily draw a conclusion as to the nature of the credit risk existing in the creditors' information. This implies that λ plays a key role in the feature selection, so we must pay more attention to selecting appropriate values for the parameter λ.
(3) When the value of the parameter σ² matches certain values of the parameter λ, we can get promising classification results. In general, there is a trade-off between the Type I and Type II errors, in which a lower Type II error usually comes at the expense of a higher Type I error.

3.2 Comparison of Results of Different Credit Risk Evaluation Models


The credit dataset that we used has an imbalanced class distribution; thus, there are also non-uniform misclassification costs. The cost of misclassifying a sample in the bad class is much higher than that in the good class. So it is quite important that the prior probabilities and the misclassification costs be taken into account in order to


obtain a credit evaluation model with the minimum expected misclassification [13].
When there are only two different populations, the cost function in computing the
expected misclassification is considered as follows:

\mathrm{cost} = c_{21}\,\pi_1\,p(2|1) + c_{12}\,\pi_2\,p(1|2)    (16)

where $c_{21}$ and $c_{12}$ are the corresponding misclassification costs of the Type I and Type II errors, $\pi_1$ and $\pi_2$ are the prior probabilities of good and bad credit applicants, and $p(2|1)$ and $p(1|2)$ measure the probability of making a Type I error and a Type II error. In this study, $p(2|1)$ and $p(1|2)$ are respectively equal to the Type I and Type II errors. The misclassification cost ratios associated with the Type I and Type II errors are respectively 1 and 5 [13]. In order to further evaluate the effectiveness of the proposed SVM-MK credit
evaluation model, the classification results are compared with some other methods
using the same dataset, such as multiple criteria linear programming (MCLP), multiple criteria non-linear programming (MCNP), decision trees and neural network. The
results of the four models are quoted from reference [14]. Table 3 summarizes the
Type I, Type II and Total error of the five models and the corresponding expected
misclassification costs (EMC).
Table 3. Errors and the expected misclassification costs of the five models

Model             e1 (%)   e2 (%)   e (%)    EMC
MCLP              24.49    59.39    30.18    0.51736
MCNP              49.03    17.18    43.84    0.52717
Decision Tree     47.91    17.3     42.92    0.51769
Neural Network    32.76    21.6     30.94    0.40284
SVM-MK            24.25    17.24    23.22    0.30445

The priors of good and bad applicants are set at 0.9 and 0.1 using the ratio of good and bad credit customers in the empirical dataset.
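The EMC column in Table 3 follows directly from Equation (16); a short Python check (error rates copied from Table 3, costs 1 and 5, priors 0.9 and 0.1 as stated above):

c21, c12 = 1.0, 5.0            # misclassification costs of Type I and Type II errors
pi1, pi2 = 0.9, 0.1            # priors of good and bad applicants
errors = {                     # (Type I error, Type II error) in %, from Table 3
    "MCLP": (24.49, 59.39), "MCNP": (49.03, 17.18),
    "Decision Tree": (47.91, 17.3), "Neural Network": (32.76, 21.6),
    "SVM-MK": (24.25, 17.24),
}
for model, (e1, e2) in errors.items():
    emc = c21 * pi1 * e1 / 100 + c12 * pi2 * e2 / 100   # Equation (16)
    print(model, round(emc, 5))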

From Table 3, we can conclude that the SVM-MK model has better credit scoring capability in terms of the overall error, the Type I error on the good class, the Type II error on the bad class and the expected misclassification cost criterion, in comparison with the former four models. Consequently, the proposed SVM-MK model can provide an efficient alternative for conducting credit evaluation tasks.

4 Conclusions
This paper presents a novel SVM-MK credit risk evaluation model. By using the 1-norm and a convex combination of basic kernels, the objective function, which is a quadratic programming problem in the standard SVM, becomes a linear programming parameter iterative learning problem, greatly reducing the computational costs. In practice, it is not difficult to adjust the kernel parameter and the regularized parameter to obtain a satisfactory classification result. Through the practical data experiment, we have obtained good classification results and meanwhile demonstrated that the SVM-MK model performs well in a credit scoring system. We also obtain only a few


valuable attributes that can explain the correlation between credit and the customers' information, so the extracted features can help the lender make correct decisions. Thus the SVM-MK is a transparent model, and it provides efficient alternatives for conducting credit scoring tasks. Future studies will aim at finding the regularities in the parameter settings. Generalizing rules from the features that have been selected is another direction of further work.

Acknowledgements
This research has been partially supported by a grant from National Natural Science
Foundation of China (#70531040), and 973 Project (#2004CB720103), Ministry of
Science and Technology, China.

References:
1. G. Lee, T. K. Sung, N. Chang: Dynamics of modeling in data mining: Interpretive approach to bankruptcy prediction. Journal of Management Information Systems, 16(1999),
63-85
2. J. C. Wiginton: A note on the comparison of logit and discriminate models of consumer
credit behavior. Journal of Financial Quantitative Analysis 15(1980), 757-770
3. B.J. Grablowsky, W. K. Talley: Probit and discriminant functions for classifying credit
applicants: A comparison. Journal of Economic Business. Vol.33(1981), 254-261
4. T. Lee, C. Chiu, Y. Chou, C. Lu: Mining the customer credit using classification and regression tree and multivariate adaptive regression splines. Computational Statistics and
Data Analysis, Vol.50, 2006(4), 1113-1130
5. Yueh-Min Huang, Chun-Min Hung, Hewijin Christine Jiau: Evaluation of neural networks
and data mining methods on a credit assessment task for class imbalance problem. Nonlinear Analysis: Real World Applications 7(2006), 720-747
6. V. Vapnik: The Nature of Statistical Learning Theory. Springer, New York (1995)
7. T. Van Gestel, B. Baesens, J. Garcia, and P. Van Dijcke: A support vector machine approach to credit scoring. Bank en Financiewezen 2 (2003), 73-82
8. Y. Wang, S. Wang, K. Lai, A new fuzzy support vector machine to evaluate credit risk.
IEEE Transactions on Fuzzy Systems, Vol.13, 2005(6), 820-831
9. Wun-Hwa Chen, Jin-Ying Shih: A study of Taiwan's issuer credit rating systems using
support vector machines. Expert Systems with Applications 30(2006), 427-435
10. A. Ch. Micchelli, M. Pontil: Learning the kernel function via regularization. Journal of
Machine Learning Research, 6(2005), 1099-1125
11. G. R.G. Lanckrient, N. Cristianini, P. Bartlett, L. El Ghaoui, M.I. Jordan: Learning the
kernel matrix with semidefinite programming. Journal of Machine Learning Research,
5(2004), 27-72
12. F.R. Bach, G. R.G. Lanckrient, M.I. Jordan: Multiple kernel learning, conic duality and the
SMO algorithm. Twenty First International Conference on Machine Learning, (2004),
41-48
13. D. West: Neural network credit scoring models. Computers and Operations Research,
27(2000), 1131-1152
14. J. He, Y. Shi, W. X. Xu: Classifications of credit cardholder behavior by using multiple
criteria non-linear programming. In Y. Shi, W. Xu, Z. Chen (Eds.) CASDMKM 2004,
LNAI 3327, Springer-Verlag Berlin Heidelberg, (2004), 154-163

Neuro-discriminate Model for the Forecasting of Changes of Companies' Financial Standings on the Basis of Self-organizing Maps

Egidijus Merkevičius 1, Gintautas Garšva 1,2,3, and Rimvydas Simutis 1

1 Department of Informatics, Kaunas Faculty of Humanities, Vilnius University,
Muitinės st. 8, LT-44280 Kaunas, Lithuania
2 Department of Information Systems, Kaunas University of Technology,
Studentu 50, LT-51368 Kaunas, Lithuania
3 Department of Informatics, Engineering and Biomechanics, Lithuanian Academy of Physical Education,
Sporto 6, LT-44221 Kaunas, Lithuania
{egidijus.merkevicius, gintautas.garsva, rimvydas.simutis}@vukhf.lt

Abstract. This article presents a way in which a creditor can predict the trends of a debtor's financial standing. We propose a model for forecasting changes of financial standings. The model is based on self-organizing maps as a tool for prediction, grouping and visualization of large amounts of data. The inputs for training the SOM are financial ratios calculated according to any discriminate bankruptcy model. A supervised neural network automatically increases the accuracy of the forecasts by changing the weights of the ratios.
Keywords: self-organizing maps, Z-score, bankruptcy, prediction, bankruptcy
class, multivariate discriminate model, Altman, Zmijewski, feed-forward neural
network, model.

1 Introduction
Bankruptcy is a process which results in the reorganization of a company in order to repay debts and fulfill other liabilities. Close monitoring of the financial standing of the company is very important in order to prevent possible bankruptcy.
A model forecasting changes in financial standing is presented in this article. The fundamental bankruptcy models (Altman, Zmijewski, etc.) and supervised and unsupervised artificial neural networks are used as the base of this model. The concept of the model is universal: any discriminate bankruptcy model can be used in order to describe the company by one value.
Related works are presented in the first part of the article. The methodology of the model is described in the second part. The third part includes a description and the results of testing the model with actual financial data.


2 Related Work
Bankruptcy is described as the inability or impairment of the ability of an individual or organization to pay their creditors, in other words, default. One of the most important tasks of creditors is to manage credit risk by forecasting changes in financial standing. The early history of research attempts to classify and predict bankruptcy is well documented in [1],[7]. Altman [2], Zmijewski [16], Ohlson [10], Shumway [14] and other authors are the first and fundamental creators of bankruptcy models. The key point of these models is to determine the most important indicators (ratios) and their weights in a discriminate or logistic function. A detailed description of the SOM method is presented in [9].
During the past 15 years, investigations in the area of SOM applications to financial analysis have been carried out. Deboeck described and analyzed most cases in [5]. Martin-del-Prio and Serrano-Cinca generated SOMs of Spanish banks and subdivided those banks into two large groups; the configuration of the banks allowed establishing the root causes of the banking crisis [11].
Based on Kiviluoto's study [8], through visual exploration one can see the distribution of important indicators (i.e. bankruptcy) on the map.
The above authors have estimated only historical and current financial data of the companies and afterwards interpreted it for forecasting bankruptcy of those companies. In this paper, we suggest generating a SOM that can be applied to forecasting bankruptcy classes for companies other than the trained ones.

3 Methodology
In this section we present the Neuro-discriminate model for forecasting of changes of companies' financial standings on the basis of Self-organizing maps (further - the Model). The Model combines various methods - multivariate discriminate analysis, self-organizing maps and a feed-forward supervised neural network - a combination of which makes an original forecasting model. These methods [2], [16], [9], [1] are used in the Model in their original form with no major adjustments, so they are not presented here.
The main concept of the Model is presented in Figure 1. Description of the Model concept:
1. On the basis of bankruptcy models, changes of companies' financial standing are determined (0 = negative changes, 1 = positive changes);
2. The components of the discriminate bankruptcy model are used for training an unsupervised neural network and generating a SOM. Testing of the accuracy of the SOM is executed via calculation of corresponding nodes between the training and testing data.
3. The accuracy of forecasting is improved via changing of the weights. A feed-forward neural network is used in the Model as a tool for changing the weights, where the inputs are test data and the targets are outputs of the trained SOM.


[Figure 1 is a block diagram: financial data feed the multivariate discriminate bankruptcy model; the model's weights parameterize the inputs of the SOM, and a FF ANN produces corrected weights that are fed back to the model.]

Fig. 1. The concept of the Model

The detailed algorithm of the Model is visualized in figure 2.

[Figure 2 is a flow diagram of the algorithm: financial ratios and Z-score classes are calculated for the EDGAR PRO and LT (local companies, MySQL) databases; the SOM is trained on the EDGAR data and its nodes are labelled with the EDGAR and LT bankruptcy classes; corresponding nodes are calculated, and if the accuracy is not acceptable a feed-forward neural network adjusts the weights of the discriminate bankruptcy model, otherwise a status report is produced.]

Fig. 2. Algorithm of proposed Model and methodology of weights adoption

The main steps of the algorithm are as follows:


1. On the basis of the financial statements of the companies taken from the EDGAR PRO Online database (further the related data is named EDGAR) [6], the financial variables (ratios) and the bankruptcy scores (further named Z-scores) are calculated and converted to bankruptcy classes based on the Z-score changes during two years (0 - negative changes, 1 - positive changes).
2. The same calculations are made with data taken from other financial sources, for example, the financial statements of Lithuanian companies (further the related data is named LT), assigning bankruptcy classes (0 - negative changes, 1 - positive changes) in the same way as described above.
3. Data preprocessing is executed. The process consists of normalizing the data set, selecting the map structure and topology, and setting other options like data filter, delay, etc.
4. The SOM is generated on the basis of the EDGAR data. The inputs of the SOM are the Z-score variables and the labels are the bankruptcy classes.
5. The generated SOM is labeled with the bankruptcy classes of the LT companies.
6. The labeled units of the trained SOM are compared with the same units labeled with the LT bankruptcy classes, and the corresponding units are calculated. If the number of corresponding EDGAR and LT labels located on the same SOM unit (the accuracy of prediction) is acceptable, then a status report is generated; otherwise the changing of weights with the feed-forward neural network (further - FF ANN) starts.
7. An attempt to increase the number of corresponding labels is made in order to create a map structure in which the units have the largest number of corresponding labels. For this goal we use the FF ANN, where the inputs are the LT ratios calculated with the discriminate bankruptcy model without weights, the targets are the units of the trained SOM labels which belong to the corresponding LT data, and the initial weights of the ANN are the original weights of the discriminate bankruptcy model. As a result we obtain changed weights, and the weights of the original discriminate bankruptcy model are updated with the changed weights.
8. The next iteration of the presented algorithm starts (steps 1-7). When the performance of the prediction does not change rapidly any more, the algorithm is stopped.
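A minimal Python sketch of steps 4-6 is given below (training the SOM on one data set of ratios, labelling its units by the majority bankruptcy class, and measuring the share of corresponding labels on a second data set). It uses the third-party MiniSom package purely for illustration; the paper itself used the SOM Toolbox for Matlab, and all names and parameter values here are our own assumptions.

import numpy as np
from minisom import MiniSom   # pip install minisom; illustrative substitute for the SOM Toolbox

def train_and_label(train_ratios, train_classes, grid=(10, 10), iters=5000):
    # Step 4: train the SOM on the (weighted) ratios.
    som = MiniSom(grid[0], grid[1], train_ratios.shape[1],
                  sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(train_ratios, iters)
    # Step 5: label each unit with the majority class (0/1) of the records mapped to it.
    votes = {}
    for x, c in zip(train_ratios, train_classes):
        votes.setdefault(som.winner(x), []).append(c)
    unit_label = {u: int(np.mean(cs) >= 0.5) for u, cs in votes.items()}
    return som, unit_label

def prediction_accuracy(som, unit_label, test_ratios, test_classes):
    # Step 6: share of test records whose class matches the label of their SOM unit.
    hits = total = 0
    for x, c in zip(test_ratios, test_classes):
        u = som.winner(x)
        if u in unit_label:
            total += 1
            hits += int(unit_label[u] == c)
    return hits / total if total else 0.0

The weight-correction loop of steps 7-8 would wrap these two functions, retraining after the FF ANN has adjusted the ratio weights.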

The results of this algorithm are as follows:
1. A new SOM with a certain prediction percentage and good visualization of a large amount of data;
2. An original way to use a different financial database in the same bankruptcy model;
3. A new multivariate discriminate model that is based on the original discriminate bankruptcy model with corrected weight variables;
4. An automatic tool (the FF ANN) to update the original weights according to the accuracy of prediction.

The result of the prediction is the most important information for the creditor, showing the trend of a company (positive or negative changes of the company's financial standing).


4 Results of Testing
In this section we present the results of testing the Model. The testing of the proposed Model has been executed using two real financial datasets: companies from the NASDAQ list (further - TRAINDATA), loaded from the EDGAR PRO Online database, and a dataset of Lithuanian companies' financial statements (TESTDATA) provided by one of the Lithuanian banks.
The basis for generating the SOM is TRAINDATA. The calculated bankruptcy ratios are based on the original discriminate bankruptcy model by Zmijewski [16]. The ratios are used as inputs and the changes of the Z-scores during two years are used as labels for the identification of units in the SOM.
Table 1. Characteristics of financial datasets

Dataset                                TRAINDATA                        TESTDATA
Taken from                             EDGAR PRO Online Database        Database of a Lithuanian bank
                                       (free trial)
Period of financial data               7 consecutive periods            2003-2004
Amount of companies                    9364                             776
Count of records                       56184                            776
Count of records after elimination
of missing data                        46353                            767
Number of inputs (attributes)          6
Risk classes of bankruptcy             If the Z-score of the second period is less than the Z-score
                                       of the first period then the risk class is determined as 0,
                                       otherwise 1.
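As an illustration of the risk-class rule in Table 1, a minimal Python sketch is given below; the coefficients are the "weight before" values that appear later in Table 3 (Zmijewski-type weights), and the function names are ours:

def z_score(net_income, total_assets, total_liabilities, st_assets, st_liabilities,
            weights=(-4.336, -4.513, 5.679, 0.004)):
    # Discriminate score from the three ratios of Table 3;
    # weights[0] is the 'no-ratio' (intercept) weight.
    w0, w1, w2, w3 = weights
    return (w0
            + w1 * net_income / total_assets
            + w2 * total_liabilities / total_assets
            + w3 * st_assets / st_liabilities)

def risk_class(z_first_period, z_second_period):
    # Table 1 rule: 0 if the second-period Z-score is less than the first-period one,
    # otherwise 1 (0 = negative changes, 1 = positive changes, as defined above).
    return 0 if z_second_period < z_first_period else 1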
The SOM was trained using the SOM Toolbox for Matlab package [15]. The changing of weights with the FF ANN has been executed using the NNSYSID Toolbox for Matlab [13].
The testing process was as follows:
1. Getting data from the financial sources and putting them into the MySQL database.
2. Filtering missing data.
3. Calculating Z-scores and assigning the risk classes.
4. Dividing the TRAINDATA into two subsets: training and testing data with ratio 70:30.
5. Preprocessing of the training data: normalizing the data set, selecting the structure and topology of the map, setting other options like data filter, delay, etc.
6. Executing the algorithm described in Section 3 until the accuracy of corresponding nodes between TRAINDATA and TESTDATA achieves the desired result.

Figure 3 presents the run of the testing.


Fig. 3. Run of increase of Model performance

Figure 3 shows the rapid increase of the Model performance: testing on the basis of the original weights, the accuracy of prediction reaches 75.69% on 30% of TRAINDATA and 79.15% on TESTDATA; after the changing of weights, at the 11th step of the cycle the accuracy of prediction reaches 87.08% and 84.47%, respectively. The number of iterations is related to step 8 of the algorithm, i.e., when the performance of the prediction does not change rapidly any more, the algorithm is stopped.
Table 2 presents the performance matrix of TESTDATA.
Table 2. Performance matrix of TESTDATA

Actual vs Predicted (Performance Matrix)
                 Predicted (by model)
                 0        1        Total (units)
Actual 0 (%)     73.47    26.53    98
Actual 1 (%)     9.03     90.96    166
Total (%)                          84.47%
Table 3 presents a comparison of the importance of the ratios in the discriminate bankruptcy model before and after the changing of weights. The highest impact on the results comes from the Total liabilities/Total assets and Net income/Total assets ratios. The changing of weights allows seeking the highest accuracy of bankruptcy prediction.


Table 3. Changes of variables' weights before and after the cycle

Name                               Variables (X0 - first period,    Weight      Weight
                                   X1 - second period)              before      after
no-ratio weight                    X0_1                             -4.336      -4.336
Net income/Total assets            X0_2                             -4.513      -5.834
Total liabilities/Total assets     X0_3                              5.679       5.255
S.-t. assets/S.-t. liabilities     X0_4                              0.004      -0.207
no-ratio weight                    X1_1                             -4.336      -4.336
Net income/Total assets            X1_2                             -4.513      -4.545
Total liabilities/Total assets     X1_3                              5.679       4.617
S.-t. assets/S.-t. liabilities     X1_4                              0.004      -0.180
Performance of bankruptcy
prediction (%)                                                       79.15       84.47

5 Conclusions

- The presented Neuro-discriminate model for forecasting of changes of companies' financial standings on the basis of Self-organizing maps also includes multivariate discriminate analysis of bankruptcy and a feed-forward supervised neural network; the combination of these methods makes an original model suitable for forecasting.
- The other authors who studied the capabilities of SOM in the area of bankruptcy have estimated only historical and current financial data of the companies and afterwards interpreted it for forecasting bankruptcy of those companies. We suggest generating a SOM that can be applied to forecasting bankruptcy classes for companies other than the trained ones.
- The presented model works well with real-world data; the tests of the model with the presented datasets showed a prediction accuracy of more than 84%.
- The methodology of the presented model is flexible enough to adopt any dataset because the rules and steps of the methodology algorithm are universal.
- Changing the weights with a supervised neural network allows seeking the highest accuracy of bankruptcy prediction.
- The result of the prediction is the most important information for the creditor, showing the trend of a company (positive or negative changes of the company's financial standing).

References
1. Atiya, A.F.: Bankruptcy prediction for credit risk using neural networks: a survey and new
results. IEEE Transactions on Neural Networks, Vol. 12, No. 4, (2001) 929-935
2. Altman, E.: Financial Ratios, Discrimination Analysis and the Prediction of Corporate
Bankruptcy. Journal of Finance, (1968)

446

E. Merkeviius, G. Garva, and R. Simutis

3. Altman. E.: Predicting Financial Distress of Companies: Revisiting the Z-Score and
ZETA Models. (working paper at http://pages.stern.nyu.edu/~ealtman/Zscores.pdf)
(2000)
4. Deboeck, G.: Financial Applications of Self-Organizing Maps. American Heuristics
Electronic Newsletter, Jan, (1998)
5. Deboeck, G.: Self-Organizing Maps Facilitate Knowledge Discovery In Finance. Financial
Engineering News, (1998)
6. EDGAR Online, Inc. http://pro.edgar-online.com (1995-2006)
7. Galindo, J., Tamayo, P.: Credit Risk Assessment Using Statistical and Machine Learning:
Basic Methodology and Risk Modeling Applications. Computational Economics. (2000),
Vol 15
8. Kiviluoto, K.: Predicting bankruptcies with the self-organizing map. Neurocomputing,
Vol. 21, (1998),191201
9. Kohonen, T.: The Self-Organizing Map. Proceedings of the IEEE, 78:1464-1480
10. Ohlson, J. A.: Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of
Accounting Research (Spring). (1980), 109-131
11. Martin-del-Prio, K., Serrano-Cinca, C.: Self-Organizing Neural Network: The Financial State
of Spanish Companies. In Neural Networks in Finance and Investing. Using Artificial
Intelligence to Improve Real-World Performance. R.Trippi, E.Turban, Eds. Probus
Publishing, (1993), 341-357
12. Merkevičius, E., Garšva, G., Girdzijauskas, S.: A Hybrid SOM-Altman Model for
Bankruptcy Prediction. International Conference on Computational Science (4), Lecture
Notes in Computer Science, 3994, 364-371, (2006), ISSN 0302-9743
13. Nørgaard, M.: Neural Network Based System Identification Toolbox Version 2. Technical
Report 00-E-891, Department of Automation Technical University of Denmark. (2000).
http://kalman.iau.dtu.dk/research/control/nnsysid.html
14. Shumway, T.: Forecasting Bankruptcy More Accurately: A Simple Hazard Model, Journal
of Business, Vol. 74, No. 1 (2001), 101-124
15. Vesanto, J., Himberg, J., Alhoniemi, E., Parhankangas, J.: SOM toolbox for Matlab 5,
Technical report A57 (2000), Helsinki University of Technology, Finland
16. Zmijewski, M. E.: Methodological Issues Related to the Estimation of Financial Distress
Prediction Models. Journal of Accounting Research 24 (Supplement): (1984) 59-82

A New Computing Method for Greeks Using Stochastic Sensitivity Analysis

Masato Koda

Systems and Information Engineering, University of Tsukuba,
1-1-1 Tennou-Dai, Tsukuba, 305-8573 Japan
koda@sk.tsukuba.ac.jp

Abstract. In the risk management of derivative securities, Greeks, i.e. sensitivity coefficients, are important measures of market risk used to evaluate the impact of misspecification of some stochastic model on the expected payoff function. We investigate a new computing method for Greeks based on Malliavin calculus, without resort to a direct differentiation of the complex payoff functions. As a result, a new relation between the Greeks is obtained for the Asian option.
Keywords: Greeks, Malliavin calculus, Monte Carlo simulation.

1 Introduction

We consider a stochastic integral equation in a well-defined Black-Scholes set-up,

S_t = S_0 + \int_0^t r S_\tau\,d\tau + \int_0^t \sigma S_\tau\,dW_\tau,    (1)

where $S_t$ is the price of the underlying asset with $S_0$ denoting the present (initial) value, r denotes the risk-free interest rate, $\sigma$ is the volatility, and $(W_t)_{0 \leq t \leq T}$ is a standard Brownian motion. It should be noted that, for European options, we have a closed-form solution to (1) as $S_T = S_0 \exp(\mu T + \sigma W_T)$, where $\mu = r - \sigma^2/2$, for a fixed expiration or maturity time T.
We are interested in studying how to evaluate the sensitivity with respect to model parameters, e.g., the present price $S_0$, the volatility $\sigma$, etc., of the expected payoff

E[e^{-rT}\Phi(S_T)],    (2)

for an exponentially discounted value of the payoff function $\Phi(S_T)$, where $E[\cdot]$ denotes the expectation operator. The sensitivity of more sophisticated payoff functions, including path-dependent Asian-type options like

E\Big[e^{-rT}\Phi\Big(\frac{1}{T}\int_0^T S_t\,dt\Big)\Big],    (3)

can be treated in a similar manner along the lines that are investigated in this study.
In finance, this is the so-called model risk problem. Commonly referred to as Greeks, sensitivities in the financial market are typically defined as the partial



derivatives of the expected payoff function with respect to underlying model parameters. In general, finite difference approximations are heavily used to simulate Greeks. However, the approximation soon becomes inefficient, particularly when payoff functions are complex and discontinuous.
To overcome this difficulty, Broadie and Glasserman [1] proposed a method to put the differential of the payoff function inside the expectation operator required to evaluate the sensitivity. But this idea is applicable only when the density of the random variable involved is explicitly known. Recently, Fournie et al. [2] suggested the use of Malliavin calculus, by means of integration by parts, to shift the differential operator from the expected payoff to the underlying diffusion (e.g., Gaussian) kernel, introducing a weighting function.
Other examples that are similar to the present study and explored by the present author (e.g., Refs. [3], [7]), but not covered in this paper, are models involving a step function and non-smooth objective functions. In those studies, the stochastic sensitivity analysis technique based on Novikov's identity is used instead of Malliavin calculus.
In this paper, we present a new constructive approach for the computation of Greeks in financial engineering. The present approach enables the simulation of Greeks without resort to direct differentiation of the complex or discontinuous payoff functions.
Greeks without resort to direct dierentiation of the complex or discontinuous
payo functions.

2 Malliavin Calculus

Let R be the space of random variables of the form F = f(W_{t_1}, W_{t_2}, ..., W_{t_n}), where f is smooth and W_t denotes the Brownian motion [6]. For a smooth random variable F ∈ R, we can define its derivative DF = D_t F, where the differential operator D is closable. Since D operates on random variables by differentiating functions in the form of partial derivatives, it shares the familiar chain rule property, D_t(f(F)) = f'(F) D_t F, and others.
We denote by D* the Skorohod integral, defined as the adjoint operator of D. If u belongs to Dom(D*), then D*(u) is characterized by the following integration by parts (ibp) formula:

    E[F D*(u)] = E[ ∫_0^T (D_t F) u_t dt ].    (4)

Note that (4) gives a duality relationship to link the operators D and D*. The adjoint operator D* behaves like a stochastic integral. In fact, if u_t is an adapted process, then the Skorohod integral coincides with the Ito integral, i.e., D*(u) = ∫_0^T u_t dW_t. In general, one has

    D*(F u) = F D*(u) − ∫_0^T (D_t F) u_t dt.    (5)

A heuristic derivation of (5) is demonstrated here. Let us assume that F and G are any two smooth random variables, and u_t a generic process; then by the product rule of D one has

    E[G F D*(u)] = E[ ∫_0^T D_t(GF) u_t dt ] = E[ ∫_0^T G (D_t F) u_t dt ] + E[ ∫_0^T (D_t G) F u_t dt ]
                 = E[ G ∫_0^T (D_t F) u_t dt ] + E[G D*(F u)],

which implies that E[G D*(F u)] = E[G ( F D*(u) − ∫_0^T (D_t F) u_t dt )] for any random variable G. Hence, we must have (5).
In the present study, we frequently use the following formal relationship to remove the derivative from a (smooth) random function f:

    E[f'(X) Y] = E[f(X) H_{XY}],    (6)

where X, Y, and H_{XY} are random variables. It is noted that (6) can be deduced from the integration by parts formula (4), and we have an explicit expression for H_{XY} as

    H_{XY} = D*( Y / ∫_0^T D_t X dt ).    (7)

If higher-order derivatives are involved, then one has to repeat the procedure (6) iteratively.

3 European Options

In the case of a European option whose payoff function is defined by (2), the essence of the present method is that the gradient of the expected (discounted) payoff, ∇E[e^{−rT}Φ(S_T)], is evaluated by putting the gradient inside the expectation, i.e., E[e^{−rT}∇Φ(S_T)], which involves computations of ∇Φ(S_T) = Φ'(S_T)∇S_T and ∇S_T. Further, applying Malliavin calculus techniques, the gradient is rewritten as E[e^{−rT}Φ(S_T)H] for some random variable H. It should be noted, however, that there is no uniqueness in this representation, since we can add to H any random variable that is orthogonal to S_T. In general, H involves Ito or Skorohod integrals.

3.1 Delta

Now we compute Delta, Δ, the first-order partial differential sensitivity coefficient of the expected outcome of the option, i.e., (2), with respect to the initial asset value S_0:

    Δ = ∂/∂S_0 E[e^{−rT}Φ(S_T)] = e^{−rT} E[ Φ'(S_T) ∂S_T/∂S_0 ] = (e^{−rT}/S_0) E[ Φ'(S_T) S_T ].

Then, with X = Y = S_T in (7), we perform the integration by parts (ibp) to give

    Δ = (e^{−rT}/S_0) E[Φ(S_T) H_{XY}] = (e^{−rT}/S_0) E[ Φ(S_T) D*( S_T / ∫_0^T D_t S_T dt ) ].    (8)
Since ∫_0^T D_t S_T dt = ∫_0^T σ S_T D_t W_T dt = σ S_T ∫_0^T 1_{t≤T} dt = σ T S_T, we evaluate the stochastic integral

    H_{XY} = D*( S_T / ∫_0^T D_t S_T dt ) = D*( 1/(σT) ) = D*(1)/(σT) = W_T/(σT)

with the help of (5) applied to u = 1 (a constant process which is adapted, and whose Ito integral yields D*(1) = W_T). Then the final expression for Δ reads

    Δ = ( e^{−rT}/(σ T S_0) ) E[ Φ(S_T) W_T ].    (9)
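As a quick illustration of how (9) is used in practice, the sketch below estimates Δ by averaging Φ(S_T)W_T over simulated terminal values and compares it with the Black-Scholes closed-form delta. The call payoff Φ(s) = max(s − K, 0) and the parameter values are illustrative assumptions, not taken from the paper.

import numpy as np
from scipy.stats import norm

S0, K, r, sigma, T = 100.0, 100.0, 0.1, 0.25, 0.2   # illustrative values
n_paths = 1_000_000
rng = np.random.default_rng(0)

WT = rng.normal(0.0, np.sqrt(T), n_paths)                 # terminal Brownian value
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * WT)   # closed-form solution of (1)
payoff = np.maximum(ST - K, 0.0)                          # assumed call payoff Phi(S_T)

delta_malliavin = np.exp(-r * T) / (sigma * T * S0) * np.mean(payoff * WT)   # estimator (9)

d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
delta_bs = norm.cdf(d1)                                   # Black-Scholes delta for comparison

print(delta_malliavin, delta_bs)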

3.2 Vega

The next Greek, Vega, V, is the sensitivity of (2) with respect to the volatility σ:

    V = ∂/∂σ E[e^{−rT}Φ(S_T)] = e^{−rT} E[ Φ'(S_T) ∂S_T/∂σ ] = e^{−rT} E[ Φ'(S_T) S_T (W_T − σT) ].

Then, utilizing (6) and (7) again with X = S_T and Y = S_T(W_T − σT), we apply the ibp to give

    V = e^{−rT} E[Φ(S_T) H_{XY}] = e^{−rT} E[ Φ(S_T) D*( S_T(W_T − σT) / ∫_0^T D_t S_T dt ) ] = e^{−rT} E[ Φ(S_T) D*( W_T/(σT) − 1 ) ].

So, we evaluate the stochastic integral as

    H_{XY} = D*( W_T/(σT) − 1 ) = (1/(σT)) D*(W_T) − D*(1) = (1/(σT)) D*(W_T) − W_T.

With the help of (5) applied to u = 1 (adapted process) and F = W_T, we have

    D*(W_T) = W_T² − ∫_0^T D_t W_T dt = W_T² − ∫_0^T dt = W_T² − T.

If we bring together the partial results obtained above, we derive the final expression

    V = e^{−rT} E[ Φ(S_T) ( W_T²/(σT) − W_T − 1/σ ) ].    (10)

3.3 Gamma

The last Greek, Gamma, Γ, involves a second-order derivative,

    Γ = ∂²/∂S_0² E[e^{−rT}Φ(S_T)] = (e^{−rT}/S_0²) E[ Φ''(S_T) S_T² ].

Utilizing (6) and (7) with X = S_T and Y = S_T², we obtain after a first ibp

    Γ = (e^{−rT}/S_0²) E[ Φ'(S_T) D*( S_T² / ∫_0^T D_t S_T dt ) ] = (e^{−rT}/S_0²) E[ Φ'(S_T) D*( S_T/(σT) ) ].

With the help of (5) applied to u = 1/(σT) (constant adapted process) and F = S_T, we have

    D*( S_T/(σT) ) = S_T D*(1)/(σT) − (1/(σT)) ∫_0^T D_t S_T dt = S_T ( W_T/(σT) − 1 ).

Then, with repeated application of (6) and (7) with X = S_T and Y = S_T(W_T/(σT) − 1), the second ibp yields

    Γ = (e^{−rT}/S_0²) E[ Φ'(S_T) S_T ( W_T/(σT) − 1 ) ] = (e^{−rT}/S_0²) E[ Φ(S_T) D*( S_T(W_T/(σT) − 1) / ∫_0^T D_t S_T dt ) ].

With the help of (5) as before, we can evaluate the stochastic integral as

    D*( S_T(W_T/(σT) − 1) / ∫_0^T D_t S_T dt ) = (1/(σT)) D*( W_T/(σT) − 1 ) = (1/(σT)) ( W_T²/(σT) − W_T − 1/σ ).

If we combine the results obtained above, the final expression becomes

    Γ = ( e^{−rT}/(σ T S_0²) ) E[ Φ(S_T) ( W_T²/(σT) − W_T − 1/σ ) ].    (11)

Comparing (11) with (10), we find the following relationship between V and Γ:

    Γ = V / (σ T S_0²).    (12)

Since we have closed-form solutions for all the Greeks of European options, we can easily check the correctness of the above results.
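One such check can be sketched as follows. The estimators (10) and (11) share the weight W_T²/(σT) − W_T − 1/σ, so a single simulation yields both Vega and Gamma; comparing them with the Black-Scholes closed-form values also illustrates relation (12). The call payoff and parameters are the same illustrative assumptions as before.

import numpy as np
from scipy.stats import norm

S0, K, r, sigma, T = 100.0, 100.0, 0.1, 0.25, 0.2   # illustrative values
n_paths = 1_000_000
rng = np.random.default_rng(1)

WT = rng.normal(0.0, np.sqrt(T), n_paths)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * WT)
payoff = np.maximum(ST - K, 0.0)                          # assumed call payoff Phi(S_T)

weight = WT**2 / (sigma * T) - WT - 1.0 / sigma           # common weight in (10) and (11)
vega_mc  = np.exp(-r * T) * np.mean(payoff * weight)                        # (10)
gamma_mc = np.exp(-r * T) / (sigma * T * S0**2) * np.mean(payoff * weight)  # (11), consistent with (12)

d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
vega_bs  = S0 * norm.pdf(d1) * np.sqrt(T)                 # closed-form Black-Scholes Vega
gamma_bs = norm.pdf(d1) / (S0 * sigma * np.sqrt(T))       # closed-form Black-Scholes Gamma

print(vega_mc, vega_bs)
print(gamma_mc, gamma_bs)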

4 Asian Options

In the case of an Asian option whose payoff functional is defined by (3), the essence of the present approach is again that the gradient of the expected (discounted) payoff is rewritten as ∇E[e^{−rT} Φ( (1/T) ∫_0^T S_t dt )] = e^{−rT} E[ Φ( (1/T) ∫_0^T S_t dt ) H ], for some random variable H. Different from the European options, however, we do not have a known closed solution in this case.

4.1 Delta

Delta in this case is given by

    Δ = ∂/∂S_0 E[ e^{−rT} Φ( (1/T) ∫_0^T S_t dt ) ] = (e^{−rT}/S_0) E[ Φ'( (1/T) ∫_0^T S_t dt ) (1/T) ∫_0^T S_t dt ].

Utilizing (6) and (7) with X = Y = ∫_0^T S_t dt / T, we may apply the ibp to give

    Δ = (e^{−rT}/S_0) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( Y / ∫_0^T D_t X dt ) ] = (e^{−rT}/S_0) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ∫_0^T S_t dt / (σ ∫_0^T t S_t dt) ) ],

where we have used the relationship ∫_0^T ∫_0^T D_v S_t dv dt = σ ∫_0^T t S_t dt. With the help of (5) applied to u = 1/σ (constant adapted process) and F = ∫_0^T S_t dt / ∫_0^T t S_t dt, we may obtain

    Δ = (e^{−rT}/S_0) E[ Φ( (1/T) ∫_0^T S_t dt ) ( (1/<T>)( W_T/σ + <T²>/<T> ) − 1 ) ],    (13)

where we have used the relationship ∫_0^T ∫_0^T t D_v S_t dv dt = σ ∫_0^T t² S_t dt, and where we defined

    <T> = ∫_0^T t S_t dt / ∫_0^T S_t dt   and   <T²> = ∫_0^T t² S_t dt / ∫_0^T S_t dt.
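A minimal Monte Carlo sketch of the Delta estimator (13) is given below; the time integrals and the weighted averages <T> and <T²> are approximated by sums on the simulation grid. The fixed-strike average payoff Φ(A) = max(A − K, 0) is an assumed example, and the parameters follow Section 5 (r = 0.1, σ = 0.25, T = 0.2, S_0 = K = 100, 252 time steps).

import numpy as np

S0, K, r, sigma, T, n_steps = 100.0, 100.0, 0.1, 0.25, 0.2, 252
n_paths = 50_000
dt = T / n_steps
rng = np.random.default_rng(2)

W = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)   # Brownian paths on the grid
t = dt * np.arange(1, n_steps + 1)
S = S0 * np.exp((r - 0.5 * sigma**2) * t + sigma * W)                     # exact GBM values at grid times

int_S   = S.sum(axis=1) * dt                # approximates int_0^T S_t dt
int_tS  = (t * S).sum(axis=1) * dt          # approximates int_0^T t S_t dt
int_t2S = (t**2 * S).sum(axis=1) * dt       # approximates int_0^T t^2 S_t dt
T1 = int_tS / int_S                         # <T>
T2 = int_t2S / int_S                        # <T^2>
WT = W[:, -1]

A = int_S / T                               # average (1/T) int_0^T S_t dt
payoff = np.maximum(A - K, 0.0)             # assumed fixed-strike Asian call payoff
weight = (WT / sigma + T2 / T1) / T1 - 1.0  # Malliavin weight in (13)

delta = np.exp(-r * T) / S0 * np.mean(payoff * weight)
print(delta)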
4.2 Vega

Vega in this case becomes

    V = ∂/∂σ E[ e^{−rT} Φ( (1/T) ∫_0^T S_t dt ) ] = e^{−rT} E[ Φ'( (1/T) ∫_0^T S_t dt ) (1/T) ∫_0^T (∂S_t/∂σ) dt ]
      = e^{−rT} E[ Φ'( (1/T) ∫_0^T S_t dt ) (1/T) ∫_0^T S_t (W_t − σt) dt ].

As before, with the help of (6) and (7) applied to X = ∫_0^T S_t dt / T and Y = ∫_0^T S_t(W_t − σt) dt / T, we have

    V = e^{−rT} E[ Φ( (1/T) ∫_0^T S_t dt ) D*( Y / ∫_0^T D_t X dt ) ] = e^{−rT} E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ∫_0^T S_t W_t dt / (σ ∫_0^T t S_t dt) − 1 ) ],

which, with the help of (5), yields the following expression

    V = e^{−rT} E[ Φ( (1/T) ∫_0^T S_t dt ) ( ( ∫_0^T S_t W_t dt / (σ ∫_0^T t S_t dt) ) W_T − W_T − ∫_0^T t S_t W_t dt / ∫_0^T t S_t dt + ∫_0^T t² S_t dt ∫_0^T S_t W_t dt / ( ∫_0^T t S_t dt )² − 1/σ ) ].    (14)

4.3 Gamma

Gamma involves a second-order derivative,

    Γ = ∂²/∂S_0² E[ e^{−rT} Φ( (1/T) ∫_0^T S_t dt ) ] = (e^{−rT}/S_0²) E[ Φ''( (1/T) ∫_0^T S_t dt ) ( ∫_0^T S_t dt / T )² ].

Application of (6) with X = ∫_0^T S_t dt / T and Y = ( ∫_0^T S_t dt / T )², and utilizing a close variant of (7) (see Refs. [4], [5]), i.e.,

    H_{XY} = D*( S_t Y / ∫_0^T S_v D_v X dv ),

we may obtain after a first integration by parts

    Γ = (e^{−rT}/S_0²) E[ Φ'( (1/T) ∫_0^T S_τ dτ ) D*( S_t Y / ∫_0^T S_v D_v X dv ) ] = (e^{−rT}/S_0²) E[ Φ'( (1/T) ∫_0^T S_τ dτ ) D*( 2S_t/(σT) ) ],    (15)

where the relation ∫_0^T ∫_0^T S_v D_v S_t dv dt = σ ∫_0^T ∫_0^t S_t S_v dv dt = (σ/2) ( ∫_0^T S_t dt )² is used. Further, we obtain

    Γ = ( 2e^{−rT}/(σ T S_0²) ) E[ Φ'( (1/T) ∫_0^T S_τ dτ ) ∫_0^T S_t dW_t ] = ( 2e^{−rT}/(σ² T S_0²) ) E[ Φ'( (1/T) ∫_0^T S_t dt ) ( S_T − S_0 − r ∫_0^T S_t dt ) ],

which involves (1).
Then, with repeated application of (6) and (7) with X = ∫_0^T S_t dt / T and Y = S_T − S_0 − r ∫_0^T S_t dt, the second integration by parts yields

    Γ = ( 2e^{−rT}/(σ² S_0²) ) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ( S_T − S_0 − r ∫_0^T S_t dt ) / (σ ∫_0^T t S_t dt) ) ]
      = ( 2e^{−rT}/(σ² S_0²) ) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ( S_T − S_0 ) / (σ ∫_0^T t S_t dt) ) ] − ( 2r e^{−rT}/(σ² S_0²) ) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ∫_0^T S_t dt / (σ ∫_0^T t S_t dt) ) ].

With the help of (5) applied to u = 1/σ and F = ( S_T − S_0 ) / ∫_0^T t S_t dt, the present approach yields a brand-new estimate which gives an explicit relationship between Γ and Δ as follows:

    Γ = ( 2e^{−rT}/(σ² S_0²) ) E[ Φ( (1/T) ∫_0^T S_t dt ) D*( ( S_T − S_0 ) / (σ ∫_0^T t S_t dt) ) ] − ( 2r/(σ² S_0) ) Δ
      = ( 2e^{−rT}/(σ² S_0²) ) E[ Φ( (1/T) ∫_0^T S_t dt ) (1/∫_0^T t S_t dt) ( ( W_T/σ + <T²>/<T> )( S_T − S_0 ) − T S_T ) ] − ( 2r/(σ² S_0) ) Δ.    (16)

5 Monte Carlo Simulations of Asian Option

Here, we present the simulation results with parameters r = 0.1, σ = 0.25, T = 0.2 (in years), and S_0 = K = 100 (in arbitrary cash units), where K denotes the strike price. We have divided the entire interval of integration into 252 pieces, representing the trading days in a year.
In Fig. 1, we compare the convergence behavior of Δ with the result obtained by Broadie and Glasserman [1]. The result indicates a fairly good convergence to the steady-state value that is attained at the 10,000th iteration stage in [1].
In Fig. 2, we compare the simulation result of V with the one that is obtained at the 10,000th iteration stage in [1]. This indicates that some noticeable bias may remain in the present Monte Carlo simulation, and further study may be necessary to analyze and reduce the bias involved.
Although we cannot compare the proposed method with others, the present simulations may provide the most detailed and extensive results currently available for Greeks of the Asian option.


Fig. 1. Estimated Delta

Fig. 2. Estimated Vega

6 Conclusions

We have presented a stochastic sensitivity analysis method, in particular a constructive approach for computing Greeks in finance using Malliavin calculus. As a result, a new relation between Γ and Δ is obtained for the Asian option. The present approach may be useful when the random variables are smooth in the sense of stochastic derivatives. It is, however, necessary to further investigate and improve Monte Carlo procedures to reduce the bias involved in the simulation of Vega for Asian-type options and other sophisticated options.

References
1. Broadie, M., Glasserman, P.: Estimating security price derivatives using simulation.
Management Science 42 (1996) 269-285
2. Fournié, E., Lasry, J.M., Lebuchoux, J., Lions, P.L.: An application of Malliavin
calculus to Monte Carlo methods in Finance II. Finance and Stochastics 5 (2001)
201-236
3. Koda, M., Okano, H.: A new stochastic learning algorithm for neural networks.
Journal of the Operations Research Society of Japan 43 (2000) 469-485
4. Koda, M., Kohatsu-Higa, A., Montero, M.: An Application of Stochastic Sensitivity
Analysis to Financial Engineering. Discussion Paper Series, No. 980, Institute of
Policy and Planning Sciences, University of Tsukuba (2002)
5. Montero, M., Kohatsu-Higa, A.: Malliavin calculus applied to finance. Physica A
320 (2003) 548-570
6. Nualart, D.: The Malliavin Calculus and Related Topics. Springer, New York (1995)
7. Okano, H., Koda, M.: An optimization algorithm based on stochastic sensitivity
analysis for noisy objective landscapes. Reliability Engineering and System Safety,
79 (2003) 245-252

Application of Neural Networks for Foreign Exchange


Rates Forecasting with Noise Reduction
Wei Huang1,3, Kin Keung Lai2,3, and Shouyang Wang4
1 School of Management, Huazhong University of Science and Technology,
WuHan, 430074, China
whuang@amss.ac.cn
2 College of Business Administration, Hunan University
Changsha 410082, China
3 Department of Management Sciences, City University of Hong Kong,
Tat Chee Avenue, Kowloon, Hong Kong
{weihuang,mskklai}@cityu.edu.hk
4 Institute of Systems Science, Academy of Mathematics and Systems Sciences
Chinese Academy of Sciences, Beijing, 100080, China
swang@amss.ac.cn

Abstract. Predictive models are generally fitted directly from the original noisy data. It is well known that noise can seriously limit the prediction performance on time series. In this study, we apply nonlinear noise reduction methods to the problem of foreign exchange rates forecasting with neural networks (NNs). The experiment results show that the nonlinear noise reduction methods can improve the prediction performance of NNs, although, based on the modified Diebold-Mariano test, the improvement is not statistically significant in most cases. We may need more effective nonlinear noise reduction methods to improve prediction performance further. On the other hand, this indicates that NNs are particularly well suited to finding underlying relationships in an environment characterized by complex, noisy, irrelevant or partial information. We also find that the nonlinear noise reduction methods work more effectively when the foreign exchange rates are more volatile.

1 Introduction
Foreign exchange rates exhibit high volatility, complexity and noise that result from the elusive market mechanism generating daily observations [1]. It is certainly very challenging to predict foreign exchange rates. Neural networks (NNs) have been widely used as a promising alternative approach for forecasting tasks because of several distinguishing features [2]. Several design factors significantly affect the prediction performance of neural networks [3]. Generally, NNs learn and generalize directly from noisy data, in the belief that they can extract the underlying deterministic dynamics from the noisy data. However, it is well known that a model's generalization performance will be poor unless we prevent the model from over-learning.

Given that most financial time series contain dynamic noise, it is necessary to reduce noise in the data with nonlinear methods before fitting the prediction models. However, not much work has been done [4]. In this study, we employ two nonlinear noise reduction methods to alleviate these problems. The remainder of this paper is organized as follows. In Section 2, we give a brief introduction to the two nonlinear noise reduction methods. Section 3 presents the experiment design. Section 4 discusses the empirical experiment results. Finally, Section 5 offers some concluding remarks.

2 Nonlinear Noise Reduction

Conventional linear filtering in the time or Fourier domain can be very powerful as long as nonlinear structures in the data are unimportant. However, nonlinearity cannot be fully characterized by second-order statistics like the power spectrum. Nonlinear noise reduction does not rely on frequency information to define the distinction between signal and noise. Instead, structure in the reconstructed phase space is exploited. Here we concentrate on two nonlinear noise reduction methods that represent the geometric structure in phase space by local approximation. The former does so to constant order, while the latter uses local linear subspaces plus curvature corrections. Interested readers can refer to the review articles [5-7].
2.1 Simple Nonlinear Noise Reduction (SNL)
SNL replaces the central coordinate of each embedding vector by the local average of this coordinate:

    z_n = (1/|U_n^ε|) Σ_{x_k ∈ U_n^ε} x_k,    (1)

where U_n^ε is the neighborhood formed in phase space containing all points x_k such that ‖x_k − x_n‖ < ε. This noise reduction method amounts to a locally constant approximation of the dynamics and is based on the assumption that the dynamics is continuous.
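A minimal sketch of this locally constant scheme is given below; the sup-norm neighborhood, the embedding dimension and the value of ε are illustrative assumptions.

import numpy as np

def snl_filter(x, dim=7, eps=0.05):
    """Return a noise-reduced copy of the 1-D series x (O(N^2) neighbour search)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - dim + 1
    emb = np.stack([x[i:i + n] for i in range(dim)], axis=1)   # delay vectors in phase space
    mid = dim // 2
    y = x.copy()
    for i in range(n):
        d = np.max(np.abs(emb - emb[i]), axis=1)    # sup-norm distances to all delay vectors
        nbrs = d < eps                              # the neighbourhood U_n^eps (always contains i)
        y[i + mid] = emb[nbrs, mid].mean()          # locally constant approximation, cf. (1)
    return y

# Example usage on log exchange rates (cf. Section 3.4):
# cleaned = snl_filter(np.log(rates))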
2.2 Locally Projective Nonlinear Noise Reduction (LP)
LP rests on the hypothesis that the measured data is composed of the output of a low-dimensional dynamical system and of random or high-dimensional noise. This means that in an arbitrarily high-dimensional embedding space the deterministic part of the data would lie on a low-dimensional manifold, while the effect of the noise is to spread the data off this manifold. If we suppose that the amplitude of the noise is sufficiently small, we can expect to find the data distributed closely around this manifold. The idea of the projective nonlinear noise reduction scheme is to identify the manifold and to project the data onto it.
Suppose the dynamical system forms a q-dimensional manifold containing the trajectory. According to the embedding theorems, there exists a one-to-one image of the attractor in the embedding space, if the embedding dimension is sufficiently high. Thus, if the measured time series were not corrupted with noise, all the embedding vectors S̃_n would lie inside another manifold M̃ in the embedding space. Due to the noise this condition is no longer fulfilled. The idea of the locally projective noise reduction scheme is that for each S_n there exists a correction Θ_n, with ‖Θ_n‖ small, in such a way that S_n − Θ_n ∈ M̃ and that Θ_n is orthogonal on M̃. Of course a projection to the manifold can only be a reasonable concept if the vectors are embedded in spaces which are higher dimensional than the manifold M̃. Thus we have to over-embed in m-dimensional spaces with m > q.
The notion of orthogonality depends on the metric used. Intuitively one would think of using the Euclidean metric. But this is not necessarily the best choice. The reason is that we are working with delay vectors which contain temporal information. Thus even if the middle parts of two delay vectors are close, the late parts could be far away from each other due to the influence of the positive Lyapunov exponents, while the first parts could diverge due to the negative ones. Hence it is usually desirable to correct only the center part of delay vectors and leave the outer parts mostly unchanged, since their divergence is not only a consequence of the noise, but also of the dynamics itself. It turns out that for most applications it is sufficient to fix just the first and the last component of the delay vectors and correct the rest. This can be expressed in terms of a metric tensor P which we define to be

    P_ij = 1 if 1 < i = j < m, and 0 otherwise,    (2)

where m is the dimension of the over-embedded delay vectors.
Thus we have to solve the minimization problem

    min_{Θ_n} Θ_n^t P^{−1} Θ_n
    s.t. a_n^i · (S_n − Θ_n) + b_n^i = 0, for i = q+1, ..., m,
         a_n^i P a_n^j = δ_ij,    (3)

where the a_n^i are the normal vectors of M̃ at the point S̃_n − Θ_n.


3 Experiments Design
3.1 Neural Network Models
In this study, we employ one of the most widely used neural network models, the three-layer back-propagation neural network (BPNN), for foreign exchange rates forecasting. The activation function used for all hidden nodes is the logistic function, while the linear function is employed in the output node. The number of input nodes is a very important factor in neural network analysis of a time series, since it corresponds to the number of past lagged observations related to future values. To avoid introducing a bias in the results, we choose the number of input nodes as 3, 5, 7 and 9, respectively, because neural networks with one input node are too simple to capture the complex relationships between input and output, and it is rarely seen in the literature that the number of input nodes is more than nine. Generally speaking, too many nodes in the hidden layer produce a network that memorizes the input data and lacks the ability to generalize. Another consideration is that as the number of hidden nodes in a network is increased, the number of variables and terms is also increased. If the network has more degrees of freedom (the number of connection weights), more training samples are needed to constrain the neural network. It has been shown that the in-sample fit and the out-of-sample forecasting ability of NNs are not very sensitive to the number of hidden nodes [8]. Parsimony is a principle for designing NNs. Therefore, we use four hidden nodes in this study.
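A sketch of this setup is given below, assuming scikit-learn's MLPRegressor as a stand-in for the three-layer BPNN (logistic hidden units, linear output, four hidden nodes); the lag construction and the data split shown in the commented usage are illustrative.

import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(series, n_inputs):
    """Build rows of n_inputs past values; the target is the next observation."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:len(series) - n_inputs + i] for i in range(n_inputs)], axis=1)
    y = series[n_inputs:]
    return X, y

# Example usage on log-transformed exchange rates (cf. Section 3.4):
# series = np.log(rates)
# for n_inputs in (3, 5, 7, 9):
#     X, y = make_lagged(series, n_inputs)
#     nn = MLPRegressor(hidden_layer_sizes=(4,), activation='logistic',
#                       solver='lbfgs', max_iter=5000)
#     nn.fit(X[:-60], y[:-60])                  # hold out the latest 60 patterns for testing
#     rmse = np.sqrt(np.mean((y[-60:] - nn.predict(X[-60:])) ** 2))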
3.2 Random Walk Model
The weak form of the efficient market theory states that prices always fully reflect the available information; that is, a price is determined by the previous value in the time series because all the relevant information is summarized in that previous value. An extension of this theory is the random walk (RW) model. The RW model assumes not only that all historic information is summarized in the current value, but also that increments (positive or negative) are uncorrelated (random) and balanced, that is, with an expected value equal to zero. In other words, in the long run there are as many positive as negative fluctuations, making long-term predictions other than the trend impossible. The random walk model uses the actual value of the current period to predict the value of the next period as follows:

    ŷ_{t+1} = y_t,    (4)

where ŷ_{t+1} is the predicted value for the next period and y_t is the actual value of the current period.
3.3 Performance Measure

We employ the root mean squared error (RMSE) to evaluate the prediction performance of neural networks as follows:

    RMSE = sqrt( (1/T) Σ_{t=1}^{T} (y_t − ŷ_t)² ),    (5)

where y_t is the actual value, ŷ_t is the predicted value, and T is the number of predictions.
3.4 Data Preparation

From the Pacific Exchange Rate Service provided by Professor Werner Antweiler, University of British Columbia, Canada, we obtain 3291 daily observations of the U.S. dollar against the British Pound (GBP) and the Japanese Yen (JPY), covering the period from Jan 1990 to Dec 2002. We take the natural logarithmic transformation to stabilize the time series. First, we produce the testing sets for each neural network model by selecting the 60 patterns of the latest periods from each dataset. Then, we produce the appropriate training sets for each neural network model from the corresponding remaining data by using the method in [9].

4 Experiments Results
Table 1 shows the prediction performance of the random walk model, which is used as a benchmark of prediction performance for foreign exchange rates. In Tables 2 and 3, the RMSE of the noisy data is the largest, and the RMSE of LP is the smallest in the same row. This indicates that the noise reduction methods actually improve the prediction performance of NNs. Further, LP is more effective than SNL in reducing the noise of exchange rates.
We also apply the modified Diebold-Mariano test [10] to examine whether the two noise reduction methods can improve NNs' financial forecasting significantly. From the test statistic values shown in Tables 4 and 5, only in the prediction of JPY do the NNs with 9 input nodes using data filtered by LP outperform those using noisy data significantly at the 20% level (the rejection of equality of prediction mean squared errors is based on the critical value of Student's t-distribution with 69 degrees of freedom, namely 1.294 at the 20% level). Perhaps there is still noise in the exchange rate time series after applying the noise reduction methods. We look forward to more effective noise reduction methods in the future. On the other hand, it also implies that NNs are useful for extracting the underlying deterministic dynamics from noisy data at the present stage. In addition, the test statistic value of SNL for GBP is less than that of SNL for JPY in the same row. The same pattern is observed between LP for GBP and LP for JPY. JPY is more volatile than GBP; that is, there is more noise in JPY than in GBP to be filtered. So the improvement for JPY after noise reduction is more significant than for GBP.
Table 1. The prediction performance of the random walk models

    RMSE of GBP    RMSE of JPY
    0.005471       0.007508


Table 2. The prediction performance of NNs using data with noise, filtered by SNL and filtered by LP, respectively (GBP)

    #Input nodes   Noisy data   Data filtered by SNL   Data filtered by LP
    3              0.005471     0.005465               0.005461
    5              0.004491     0.004473               0.004457
    7              0.004496     0.004494               0.004467
    9              0.0054671    0.005464               0.00541

Table 3. The prediction performance of NNs using data with noise, filtered by SNL and filtered by LP, respectively (JPY)

    #Input nodes   Noisy data   Data filtered by SNL   Data filtered by LP
    3              0.007438     0.007018               0.006719
    5              0.006348     0.00616                0.005989
    7              0.006358     0.006279               0.006067
    9              0.007293     0.006811               0.006399

Table 4. The test statistic value of equality of prediction errors between noisy and filtered data (GBP)

    #Input nodes   Data filtered by SNL   Data filtered by LP
    3              0.012                  0.021
    5              0.043                  0.093
    7              0.005                  0.087
    9              0.007                  0.129

Table 5. The test statistic value of equality of prediction errors between noisy and filtered data (JPY)

    #Input nodes   Data filtered by SNL   Data filtered by LP
    3              0.904                  1.122
    5              0.391                  0.733
    7              0.162                  0.509
    9              0.879                  1.605
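For reference, a sketch of the modified Diebold-Mariano statistic of [10], of the kind reported in Tables 4 and 5, is given below; squared-error loss and one-step-ahead forecasts (h = 1) are assumed, and the function name is ours.

import numpy as np
from scipy import stats

def modified_dm(e1, e2, h=1):
    """e1, e2: forecast errors of the two competing models on the same test set."""
    e1, e2 = np.asarray(e1, dtype=float), np.asarray(e2, dtype=float)
    d = e1**2 - e2**2                        # loss differential under squared-error loss
    n = len(d)
    dbar = d.mean()
    # autocovariances of d up to lag h-1 (only lag 0 is needed when h = 1)
    gamma = [np.mean((d[k:] - dbar) * (d[:n - k] - dbar)) for k in range(h)]
    var_dbar = (gamma[0] + 2.0 * sum(gamma[1:])) / n
    dm = dbar / np.sqrt(var_dbar)
    correction = np.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)   # Harvey-Leybourne-Newbold factor
    dm_star = dm * correction
    p_value = 2 * (1 - stats.t.cdf(abs(dm_star), df=n - 1))       # compare with t distribution, n-1 dof
    return dm_star, p_value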

5 Conclusions
In this paper, we apply two nonlinear noise reduction methods, namely SNL and LP, to foreign exchange rates forecasting with neural networks. The experiment results show that SNL and LP can improve the prediction performance of NNs, although the improvement is not statistically significant at the 20% level based on the modified Diebold-Mariano test. In particular, LP performs better than SNL in reducing the noise of foreign exchange rates. The noise reduction methods work more effectively on JPY than


on GBP. In future work, we will employ more effective nonlinear noise reduction methods to improve prediction performance further. On the other hand, this indicates that NNs are particularly well suited to finding underlying relationships in an environment characterized by complex, noisy, irrelevant or partial information. In most cases, NNs outperform the random walk model.

Acknowledgements
The work described in this paper was supported by Strategic Research Grant of City
University of Hong Kong (SRG No.7001806) and the Key Research Institute of
Humanities and Social Sciences in Hubei Province-Research Center of Modern
Information Management.

References
1. Theodossiou, P.: The stochastic properties of major Canadian exchange rates. The Financial Review, 29(1994) 193-221
2. Zhang, G., Patuwo, B.E. and Hu, M.Y.: Forecasting with artificial neural networks: the
state of the art. International Journal of Forecasting, 14(1998) 35-62
3. Huang, W., Lai, K.K., Nakamori, Y. and Wang, S.Y.: Forecasting foreign exchange rates
with artificial neural networks: a review. International Journal of Information Technology
& Decision Making, 3(2004) 145-165
4. Soofi, A., Cao, L.: Nonlinear forecasting of noisy financial data. In Modeling and Forecasting Financial Data: Techniques of Nonlinear Dynamics, Soofi, A. and Cao, L. (Eds),
Boston: Kluwer Academic Publishers, (2002) 455-465
5. Davies, M.E.: Noise reduction schemes for chaotic time series. Physica D, 79(1994)
174-192
6. Kostelich, E.J. and Schreiber, T.: Noise reduction in chaotic time series data: A survey of
common methods. Physical Review E, 48(1993) 1752-1800
7. Grassberger, P., Hegger, R., Kantz, H., Schaffrath, C. and Schreiber, T.: On noise reduction methods for chaotic data, CHAOS. 3(1993) 127
8. Zhang, G. and Hu, M.Y.: Neural network forecasting of the British Pound/US Dollar exchange rate. Journal of Management Science, 26(1998) 495-506
9. Huang, W., Nakamori, Y., Wang, S.Y. and Zhang, H.: Select the size of training set for financial forecasting with neural networks. Lecture Notes in Computer Science, Vol. 3497,
Springer-Verlag Berlin Heidelberg (2005) 879-884
10. Harvey, D., Leybourne, S. and P. Newbold: Testing the Equality of Prediction Mean
Squared Errors. International Journal of Forecasting 13(1997) 281-91

An Experiment with Fuzzy Sets in Data Mining


David L. Olson1, Helen Moshkovich2, and Alexander Mechitov
1

University of Nebraska, Department of Management, Lincoln, NE USA 68588-0491


dolson3@unl.edu
2
Montevallo University, Comer Hall, Montevallo, AL USA 35115
MoshHM@montevallo.edu, Mechitov@montevallo.edu

Abstract. Fuzzy modeling provides a very useful tool to deal with human
vagueness in describing scales of value. This study examines the relative error
in decision tree models applied to a real set of credit card data used in the
literature, comparing crisp models with fuzzy decision trees as applied by See5,
and as obtained by categorization of data. The impact of ordinal data is also
tested. Modifying continuous data was expected to degrade model accuracy, but
was expected to be more robust with respect to human understanding. The
degree of accuracy lost by See5 fuzzification was minimal (in fact more
accurate in terms of total error), although bad error was worse. Categorization
of data yielded greater inaccuracy. However, both treatments are still useful if
they better reflect human understanding. An additional conclusion is that when
categorizing data, care should be taken in setting categorical limits.
Keywords: Decision tree rules, fuzzy data, ordinal data.

1 Introduction
Classification tasks in business applications may be viewed as tasks with classes
reflecting the levels of the same property. Evaluating creditworthiness of clients is
rather often measured on an ordinal level as, e.g., {excellent}, {good}, {acceptable},
or {poor} (Ben David et al., 1989.) Applicants for a job are divided into accepted and
rejected, but sometimes there may be also a pool of applicants left for further analysis
as they may be accepted in some circumstances [2], [11]. Different cars may be
divided into groups {very good}, {good}, {acceptable}, {unacceptable}. This type of task is called ordinal classification [5]. The peculiarity of ordinal classification is that data items with better qualities (characteristics) should logically be assigned to better classes: the better the item's characteristics, the closer it is to the class {accepted}. It was shown in [6] that taking into account possible ordinal dependence between attribute values and final classes may lead to a smaller number of rules with the same accuracy, and enables the system to extend the obtained rules to instances not presented in the training data set.
There are many data mining tools available, to cluster data, to help analysts find
patterns, to find association rules. The majority of data mining approaches to
classification tasks, work with numerical and categorical information. Not many data
mining techniques take into account ordinal data features.

Real-world applications are full of vagueness and uncertainty. Several theories on managing uncertainty and imprecision have been advanced, including fuzzy set theory [13], probability theory [8], rough set theory [7] and set pair theory [14], [15]. Fuzzy set theory is used more than the others because of its simplicity and similarity to human reasoning. Although there is a wide variety of different approaches within this field, many view the advantage of the fuzzy approach in data mining as an interface between a numerical scale and a symbolic scale which is usually composed of linguistic terms [4].
Fuzzy association rules described in linguistic terms help increase the flexibility
for supporting users in making decisions. Fuzzy set theory is being used more and
more frequently in intelligent systems. A fuzzy set A in universe U is defined as A = {(x, μ_A(x)) | x ∈ U, μ_A(x) ∈ [0,1]}, where μ_A(x) is a membership function indicating the degree of membership of x to A. The greater the value of μ_A(x), the more x belongs to A. Fuzzy sets can also be thought of as an extension of the
traditional crisp sets and categorical/ordinal scales, in which each element is either in
the set or not in the set (a membership function of either 1 or 0.)
Fuzzy set theory in its many manifestations (interval-valued fuzzy sets, vague sets,
grey-related analysis, rough set theory, etc.) is highly appropriate for dealing with the
masses of data available. This paper will review some of the general developments of
fuzzy sets in data mining, with the intent of seeing some of the applications in which
they have played a role in advancing the use of data mining in many fields. It will
then review the use of fuzzy sets in two data mining software products, and
demonstrate the use of data mining in an ordinal classification task. The results will
be analyzed through comparison with the ordinal classification model. Possible
adjustments of the model to take into account fuzzy thresholds in ordinal scales will
be discussed.

2 Fuzzy Set Experiments in See5


See5, decision tree software, allows users to select options to soften thresholds by selecting a fuzzy option. This option inserts a buffer at boundaries (which is how PolyAnalyst works as well). The buffer is determined by the software based on an analysis of the sensitivity of classification to small changes in the threshold. The treatment from there is crisp, as opposed to fuzzy. Thus, in decision trees, fuzzy implementations seem to be crisp models with adjusted set boundaries.
See5 software was used on a real set of credit card data [10]. This dataset had 6,000 observations over 64 variables plus an outcome variable indicating bankruptcy or not (variables defined in [10]). Of the 64 independent variables, 9 were binary and 3 categorical. The problem can be considered an ordinal classification task, as the two final classes are named GOOD and BAD with respect to financial success. This means that the majority of the numerical and categorical attributes (including binary ones) may be easily characterized by more preferable values with respect to GOOD financial success.
The dataset was balanced to a degree, so that it contained 960 bankrupt outcomes (BAD) and 5040 not bankrupt (GOOD). Winnowing was used in See5, which reduced the number of variables used in models to about 20. Using 50 percent of the data for training, See5 selected 3000 observations at random as the training set, which was then tested on the remaining 3000 observations in the test set. Minimum support on See5 was varied over the settings of 10, 20, and 30 cases. Pruning confidence factors were also varied, from 10% (greater pruning), 20%, 30%, and 40% (less pruning). Data was locked within nominal data runs, so that each treatment of pruning and minimum case settings was applied to the same data within each repetition. Five repetitions were conducted (thus there were 12 combinations repeated five times, or 60 runs). Each run was replicated for original crisp data, original data using fuzzy settings, ordinal crisp data, ordinal data using fuzzy settings, and categorical data (See5 would have no difference between crisp and fuzzy settings). Rules obtained were identical across crisp and fuzzy models, except fuzzy models had adjusted rule limits. For instance, in the first run, the following rules were obtained:
CRISP MODEL:
RULE 1:     IF RevtoPayNov ≤ 11.441
            then GOOD
RULE 2:     IF RevtoPayNov > 11.441 AND
            IF CoverBal3 = 1
            then GOOD
RULE 3:     IF RevtoPayNov > 11.441 AND
            IF CoverBal3 = 0 AND
            IF OpentoBuyDec > 5.35129
            then GOOD
RULE 4:     IF RevtoPayNov > 11.441 AND
            IF CoverBal3 = 0 AND
            IF OpentoBuyDec ≤ 5.35129 AND
            IF NumPurchDec ≤ 2.30259
            then BAD
RULE 5:     ELSE GOOD

The fuzzy model for this data set:

FUZZY MODEL:
RULE 1:     IF RevtoPayNov ≤ 11.50565
            then GOOD
RULE 2:     IF RevtoPayNov > 11.50565 AND
            IF CoverBal3 = 1
            then GOOD
RULE 3:     IF RevtoPayNov > 11.50565 AND
            IF CoverBal3 = 0 AND
            IF OpentoBuyDec > 5.351905
            then GOOD
RULE 4:     IF RevtoPayNov > 11.50565 AND
            IF CoverBal3 = 0 AND
            IF OpentoBuyDec ≤ 5.351905 AND
            IF NumPurchDec ≤ 2.64916
            then BAD
RULE 5:     ELSE GOOD
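For illustration, the crisp rule set above translates directly into code; the sketch below is ours and simply mirrors the listed conditions (each later rule applies only when the earlier ones do not fire, so their implicit preconditions are already satisfied).

def classify_crisp(rec):
    """rec: dict with keys RevtoPayNov, CoverBal3, OpentoBuyDec, NumPurchDec."""
    if rec["RevtoPayNov"] <= 11.441:                                      # Rule 1
        return "GOOD"
    if rec["CoverBal3"] == 1:                                             # Rule 2 (RevtoPayNov > 11.441 here)
        return "GOOD"
    if rec["OpentoBuyDec"] > 5.35129:                                     # Rule 3 (CoverBal3 = 0 here)
        return "GOOD"
    if rec["OpentoBuyDec"] <= 5.35129 and rec["NumPurchDec"] <= 2.30259:  # Rule 4
        return "BAD"
    return "GOOD"                                                         # Rule 5 (default)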

Binary and categorical data are not affected by the fuzzy option in See5. They are
considered already fuzzified with several possible values and corresponding
membership function of 1 and 0.
Models run with the initial data (numeric and categorical scales) in original crisp form, and using See5's fuzzification, were obtained. There were 15 runs averaged for each pruning level, and 20 runs averaged for each minimum case level. The overall line is based on all 60 runs. The average number of crisp rules was 9.4 (fuzzy rules were the same, with different limits). Crisp total error averaged 488.7, while fuzzy total error averaged 487.2.
The number of rules responded to changes in pruning rates and minimum case
settings as expected (the tie between 20 percent and 30 percent pruning rates can be
attributed to data sampling chance). There were no clear patterns in error rates by
treatment. Fuzzy models were noticeably different from crisp models in that they had
higher error rates for bad cases, with corresponding improvement in error in the good
cases. The overall error was tested by t-test, and the only significant differences found
were that the fuzzy models had significantly greater bad error than the crisp models,
and significantly less cheap error. The fuzzy models had slightly less overall average
error, but given the context of credit cards, bad error is much more important. For
data fit, here the models were not significantly different. For application context, the
crisp models would clearly be preferred. The generalizable conclusion is not that crisp
models are better, only that in this case the fuzzy models were worse, and in general
one cannot count on the same results across crisp and fuzzified models.
In this case introducing fuzzy thresholds in the rules did not lead to any significant
results. The usage of small fuzzy intervals instead of crispy thresholds did not
significantly improve the accuracy of the model and did not provide better
interpretation of the introduced interval rules. On the other hand, crisp data was not
significantly better than the fuzzy data.
The same tests were conducted with the relevant binary and categorical variables presented in ordinal form. See5 allows stating that a categorical scale is [ordered], with the presented order of attribute values corresponding to the order of the final classes. The order is not derived from the data but is introduced by the user as a pre-processing step in rule/tree formation.
See5 would not allow locking across data sets, and required a different setup for the ordinal specification, so we could not control for data set sampling across the tests. Some categorical and/or binary variables, such as "Months late", were clearly ordinal and were marked as ordinal for this experiment. Categorical variables with no clear ordinal qualities, such as "State", were left nominal. Crisp rules averaged 7.0, with total error of 487. Fuzzy total error was 482.
The number of rules clearly dropped. Expected response of number of rules to
pruning and minimum case settings behaved as expected, with the one anomaly at 20
percent pruning, again explainable by the small sample size. Total error rates within
ordinal model were similar to the nominal case in the first set of runs, with fuzzy
model total error rates showing up as slightly significant (0.086 error probability) in
the ordinal models.
Comparing nominal and ordinal models, the number of rules was significantly lower for ordinal models (0.010 error probability). There were no significant differences in errors across the two sets except for total error (ordinal models had slightly significantly lower total errors, with 0.087 error probability). This supports our previous finding that using ordinal scales where appropriate leads to a set of more interesting rules without loss in accuracy [6].
The data was categorized into 3 categories for each continuous variable. This in
itself is another form of fuzzification. Twenty five variables were selected based upon
the typical winnowing results of the original data. The same tests were conducted,

466

D.L. Olson, H. Moshkovich, and A. Mechitov

although fuzzification was not used since See5 would have no difference in applying
its fuzzification routine (we did run as a check, but results were always identical).
Average number of rules was just over 6, with average total error 495.7. Results were
much worse than the results obtained in prior runs with continuous and ordinal data.
That is clearly because in the third set of runs, data was categorized manually, while
in the prior two runs See5 software set the categorical limits. The second set of runs
involved data converted to fuzzy form by the See5 software. These are two different
ways to obtain fuzzy categories. Clearly the software can select cutoff limits that will
outperform ad hoc manual cutoffs.

3 Fuzzy Sets and Ordinal Classification Task


Previous experiments showed very modest improvements in the rule set derived from
introducing of fuzzy intervals instead of crisp thresholds for continuous scales using
SEE5. Interpretation of the modified rules was not more friendly or more logical.
Using stable data intervals was in general slightly more robust than using crisp
thresholds. Considering ordinal properties of some categorical/binary attributes led to
a better rule set although this did not change the fuzzy intervals for the continuous
scales. This supports our previous findings [6].
One of the more useful aspects of fuzzy logic may be the orientation toward partitioning continuous scales into a pre-set number of linguistic summaries [12]. In [1] this approach is used to form fuzzy rules in a classification task. The main idea of the method is to use a set of pre-defined linguistic terms for attributes with continuous scales (e.g., "Young", "Middle", "Old" for an attribute "Age" measured continuously). In this approach, a traditional triangular fuzzy number is calculated for each instance of age in the training data set; e.g., age 23 is presented in "Young" with a 0.85 membership function and in "Middle" with a 0.15 membership function (0 in "Old"). Thus the rewritten data set is used to mine interesting IF-THEN rules using linguistic terms.
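A small sketch of this triangular fuzzification is given below. The breakpoints for "Young", "Middle" and "Old" are not stated in the text; the values used here are assumptions chosen so that age 23 receives membership 0.85 in "Young" and 0.15 in "Middle", matching the quoted example from [1].

def triangular(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify_age(age):
    return {
        "Young":  triangular(age, 0, 20, 40),     # assumed breakpoints
        "Middle": triangular(age, 20, 40, 60),
        "Old":    triangular(age, 40, 60, 80),
    }

print(fuzzify_age(23))   # {'Young': 0.85, 'Middle': 0.15, 'Old': 0.0}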
One of the advantages of the proposed approach stressed by the authors is the
ability of the mining method to produce rules useful for the user. In [3], the method
was used to mine a database for direct marketing campaign of a charitable
organization. In this case the domain expert defined appropriate uniform linguistic
terms for quantitative attributes. For example, an attribute reflecting the average
amount of donation (AVGEVER) was fuzzified into Very low (0 to $300), Low
($100 to $500), Medium ($300 to $700), High ($500 to $900) and Very High
(over $700). The analogous scale for frequency of donations (FREQEVER) was
presented as follows: Very low (0 to 3), Low (1 to 5), Medium (3 to 7), High
(5 to 9) and Very High (over 7). Triangular fuzzy numbers were derived from these
settings for rule mining. The attribute to be predicted was called Response to the
direct mailing and included two possible values Yes and No. The database
included 93 attributes, 44 having continuous scales.
Although the application of the method produced a huge number of rules (31,865)
with relatively low classification accuracy (about 64%), the authors argued that for a
task of such complexity the selection of several useful rules by the user of the results
was enough to prove the usefulness of the process. The presented rules found useful
by the user were presented as follows:


Rule 1: IF a donor was enrolled in any donor activity in the past (ENROLL=YES),
THEN he/she will have RESPONSE=YES
Rule 2: IF a donor was enrolled in any donor activity in the past (ENROLL=YES)
AND did not attend it (ATTENDED=NO), THEN he/she will have RESPONSE=YES
Rule 3: IF FREQEVER = MEDIUM, THEN RESPONSE=YES
Rule 4: IF FREQEVER = HIGH, THEN RESPONSE=YES
Rule 5: IF FREQEVER = VERY HIGH, THEN RESPONSE=YES
We infer two conclusions based on these results. First, if obvious ordinal
dependences between final classes of RESPONSE (YES/NO) and such attributes as
ENROLL, ATTENDED, and FREQEVER were taken into account the five rules
could be collapsed into two without any loss of accuracy and with higher levels for
measures of support and confidence: rule 1 and a modified rule 3 in the following
format IF FREQEVER is at least MEDIUM, THEN RESPONSE=YES. Second,
although presented rules are user friendly and easily understandable, they are not as
easily applicable. Overlapping scales for FREQEVER makes it difficult for the user to
apply the rules directly. It is necessary to carry out one more step - agree on the
number where medium frequency starts (if we use initial database) or a level of a
membership function to use in selecting medium frequency if we use the
rewritten dataset. The assigned interval of 3 to 5 evidently includes High
frequency (which does not bother us) but also includes Low frequency which we
possibly would not like to include into our mailing list. As a result a convenient
approach for expressing continuous scales with overlapping intervals at the
preprocessing stage may be not so convenient in applying simple rules.
This presentation of the ordinal classification task allows use of this knowledge to
make some additional conclusions about the quality of the training set of objects.
Ordinal classification allows introduction of the notion of the consistency of the
training set as well as completeness of the training set. In the case of the ordinal
classification task quality of consistency in a classification (the same quality objects
should belong to the same class) can be essentially extended: all objects with higher
quality among attributes should belong to a class at least as good as objects with
lower quality. This condition can be easily expressed as follows: if A_i(x) ≥ A_i(y) for each i = 1, 2, ..., p, then C(x) ≥ C(y).
We can also try to evaluate the representativeness of the training set by forming all possible objects in U (we can do that as we have a finite number of attributes with a small finite number of values in their scales) and checking the proportion of them presented in the training set. It is evident that the smaller this proportion, the less discriminating power we will have for new cases. We can also express the resulting rules in a more summarized form by "lower" and "upper" border instances for each class [6].
Advantages of using ordinal scales in an ordinal classification task do not lessen
advantages of appropriate fuzzy set techniques. Fuzzy approaches allow softening
strict limitations of ordinal scales in some cases and provides a richer environment for
data mining techniques. On the other hand, ordinal dependences represent essential
domain knowledge which should be incorporated as much as possible into the mining
process. In some cases the overlapping areas of attribute scales may be resolved by
introducing additional linguistic ordinal levels. For example, we can introduce an

468

D.L. Olson, H. Moshkovich, and A. Mechitov

ordinal scale for age with the following levels: Young (less than 30), Between
young and medium (30 to 40), Medium (40 to 50), Between medium and old (50
to 60) and Old (over 60). Though it will increase the dimensionality of the problem,
it would provide crisp intervals for the resulting rules.
Ordinal scales and ordinal dependences are easily understood by humans and are
attractive in rules and explanations. These qualities should be especially beneficial in
fuzzy approaches to classification problems with ordered classes and linguistic
summaries in the discretization process. The importance of ordinal scales for data
mining is evidenced by appearance of this option in many established mining
techniques. See5 includes the variant of ordinal scales in the problem description [9].

4 Conclusions
Fuzzy representation is a very suitable means for humans to express themselves.
Many important business applications of data mining are appropriately dealt with by
fuzzy representation of uncertainty. We have reviewed a number of ways in which
fuzzy sets and related theories have been implemented in data mining. The ways in
which these theories are applied to various data mining problems will continue to
grow.
Ordinal data is stronger than nominal data. There is extra knowledge in knowing if
a greater value is preferable to a lesser value (or vice versa). This extra information
can be implemented in decision tree models, and our results provide preliminary
support to the idea that they might strengthen the predictive power of data mining
models.
Our contention is that fuzzy representations better represent what humans mean.
Our brief experiment was focused on how much accuracy was lost by using fuzzy
representation in one application classification rules applied to credit applications.
While we expected less accuracy, we found that the fuzzy models (as applied by See5
adjusting rule limits) usually actually were more accurate. Models applied to
categorical data as a means of fuzzification turned out less accurate in our small
sample. While this obviously cannot be generalized, we think that there is a logical
explanation. While fuzzification will not be expected to yield better fit to training
data, the models obtained by using fuzzification will likely be more robust, which is
reflected in potentially equal if not better fit on test data. The results of these
preliminary experiments indicate that implementing various forms of fuzzy analysis
will not necessarily lead to reduction in classification accuracy.

References
1. Au, W-H, Keith C. C. Chan: Classification with Degree of Membership: A Fuzzy
Approach. ICDM (2001): 35-42
2. David, B. A. (1992): Automated generation of symbolic multiattribute ordinal knowledge-based DSSs: Methodology and applications. Decision Sciences, 23(6), 157-1372
3. Chan, Keith C. C., Wai-Ho Au, Berry Choi: Mining Fuzzy Rules in A Donor Database for
Direct Marketing by a Charitable Organization. IEEE ICCI (2002): 239-246


4. Dubois, D., E. Hüllermeier, H. Prade: A Systematic Approach to the Assessment of Fuzzy


Association Rules. Data Mining and Knowledge Discovery, (2006), July, 1-26
5. Larichev, O.I., Moshkovich, H.M. (1994): An approach to ordinal classification problems.
International Trans. on Operations Research, 82, 503-521
6. Moshkovich H.M., Mechitov A.I., Olson, D.: Rule Induction in Data Mining: Effect of
Ordinal Scales. Expert Systems with Applications Journal, (2002), 22, 303-311
7. Pawlak, Z.: Rough set, International Journal of Computer and Information Sciences.
(1982), 341-356
8. Pearl, J.: Probabilistic reasoning in intelligent systems, Networks of Plausible inference,
Morgan Kaufmann, San Mateo,CA (1988)
9. See5 - http://www.rulequest.com
10. Shi, Y., Peng, Y., Kou, G., Chen, Z.: Classifying credit card accounts for business
intelligence and decision making: A multiple-criteria quadratic programming approach,
International Journal of Information Technology & Decision Making 4:4 December
(2005), 581-599
11. Slowinski, R. (1995): Rough set approach to decision analysis. AI Expert,19-25
12. Yager, R.R.: On Linguistic Summaries of Data, in G. Piatetsky-Shapiro and W.J. Frawley
(Eds.) Knowledge Discovery in Databases, Menlo Park, CA: AAAI/MIT Press, (1991),
347-363
13. Zadeh, L.A.: Fuzzy sets, Information and Control 8 (1965), 338-356
14. Zhao, K.-G.: Set pair analysis a new concept and new systematic approach. Proceedings
of national system theory and regional analysis conference, Baotou (1989) (In Chinese)
15. Zhao, K.-G.: Set pair analysis and its preliminary application, Zhejiang Science and
Technology Press (2000) (In Chinese)

An Application of Component-Wise Iterative


Optimization to Feed-Forward Neural Networks
Yachen Lin
Fidelity National Information Services, Inc.
11601 Roosevelt Blvd-TA76
Saint Petersburg, FL 33716
yachen.lin@fnf.com

Abstract. Component-wise Iterative Optimization (CIO) is a method for dealing with large data in OLAP applications, which can be treated as an enhancement of traditional batch methods such as least squares. The salient feature of the method is that it processes transactions one by one, optimizes estimates iteratively for each parameter over the given objective function, and updates models on the fly. A new learning algorithm can be proposed by applying CIO to feed-forward neural networks with a single hidden layer. It incorporates the internal structure of feed-forward neural networks with a single hidden layer by applying the algorithm CIO in closed-form expressions to update the weights between the output layer and the hidden layer. Its optimal computational property is a natural consequence inherited from the property of the algorithm CIO and is also demonstrated in an illustrative example.

1 Introduction
In recent years, the development of technology has paved the way for the industry to use more sophisticated analytics for making business decisions in on-line analytical processing (OLAP). In the check and credit card processing business, for example, advanced artificial intelligence is widely used to authorize transactions. This is mainly due to the fact that fraud activities nowadays are more aggressive than ever before. Once certain forms of fraud are identified or caught by the industry, new forms appear within a very short period of time. People are convinced that fighting fraud requires a non-stop effort, i.e., constantly updating fraud pattern recognition algorithms and timely incorporating new fraud patterns into the algorithm development process.
The traditional method of least squares presents a challenge for updating a model on the fly of transactions, in both the general linear model and the general non-linear model. This challenge is mainly due to the matrix manipulation required in the implementation of the computation.
In dealing with this challenge, both industry and academic researchers have made substantial efforts. The most successful stories, however, are for the linear case only, and mainly concern improvements to the computational properties of least squares. Since the method known as recursive least squares (RLS) was derived, many variants of


RLS have been proposed for a variety of applications. A method called sliding-window RLS was discussed in many papers such as [2] and [3]. By applying QR decomposition, U-D factorization, and singular value decomposition (SVD), more computationally robust implementations of RLS have been discussed in papers such as [1]. Certainly, these studies have substantially increased the computational efficiency of RLS, and of course of LS algorithms, but they are limited to the linear case only. Furthermore, the core ingredients in the computation of RLS and its variants are still matrix-based, although some matrices are more reusable in the implementation.
In view of the difficulties that traditional least squares have when updating models
on the fly of transactions, a new procedure Component-wise Iterative Optimization
(CIO) was proposed in [10]. Using the new method of CIO, updating models on the
fly of transactions becomes straightforward for both linear and non-linear cases. More
importantly, the method itself yields an optimal solution with the objective function of
sum of squares, in particular, least square estimates when a general linear model is
used. The method CIO can be described as follows.
Let X be a new pattern, F be the objective function, and E = (e_1^(0), ..., e_p^(0))^t ∈ Θ_p be the initial estimates, where Θ_p is the domain of the parameter vector. Given the notation

    e_1^(1) = Arg_OP_{e_1} F(e_1, e_2^(0), ..., e_p^(0), X),

e_1^(1) is an optimal solution for the parameter e_1 over the objective function F, given the sample pattern X and with e_2^(0), ..., e_p^(0) held fixed.

With the above notation, the method CIO can be described by the following procedure. Given the initial estimates E = (e_1^(0), ..., e_p^(0))^t ∈ Θ_p, the method of CIO updates the estimates in the p steps below:

Step 1. Compute e_1^(1) = Arg_OP_{e_1} F(e_1, e_2^(0), ..., e_p^(0), X).

Step 2. Compute e_2^(1) = Arg_OP_{e_2} F(e_1^(1), e_2, e_3^(0), ..., e_p^(0), X) by substituting e_1^(1) for e_1^(0).

...

Step p. Compute e_p^(1) = Arg_OP_{e_p} F(e_1^(1), ..., e_{p-1}^(1), e_p, X) by substituting e_k^(1) for e_k^(0), k = 1, 2, ..., p-1.

After these steps, the initial estimates (e_1^(0), ..., e_p^(0))^t can be updated by (e_1^(1), ..., e_p^(1))^t.


The idea of CIO will be familiar to anyone who knows the Gibbs sampler of [6]; it may in fact be easier to understand the procedure of CIO as a non-Bayesian analogue of Gibbs sampling. The difference is that CIO generates optimal estimates for a given model, whereas the Gibbs sampler, or MCMC more generally, generates samples from some complex probability distribution. It has been shown in [10] that the procedure of CIO converges.
How can the algorithm CIO be applied in the neural network field? It is well known that a useful learning algorithm is developed for generating optimal estimates based on a specific neural network framework or structure. Different forms and structures of f represent different types of neural networks. For example, the architecture of a feed-forward network with a single hidden layer and one output can be sketched in the following figure:

[Figure: a feed-forward network with inputs x_1, x_2, ..., x_p and bias units (+1) feeding a single hidden layer, which in turn feeds one output node.]

However, too many learning algorithms proposed in the literature are just ad hoc in nature, and their reliability and generalization are often demonstrated only on limited empirical studies or simulation exercises. Such analyses usually emphasize a few successful applications and do not necessarily establish a solid base for more general inferences. In studies of this kind, much more theoretical justification is needed. In the present study, a useful learning algorithm is derived from the well-established theory of the algorithm CIO. To establish the link between CIO and neural networks, we can look at any type of neural network as a functional process. A common representation of the processing flow based on neural networks can be written as follows:
    y_i = f(x_{ij}, w_j, j = 0, 1, ..., p) + ε_i                               (1.1)

where ε_i is a random error term, i = 1, 2, ..., n, and w = (w_0, w_1, ..., w_p)^t is the parameter vector.

Information is first summarized into the linear combinations γ_j^t X_i, where X_i = (1, x_{1i}, ..., x_{pi})^t, each γ_j = (γ_{0j}, ..., γ_{pj})^t ∈ R^{p+1}, and Γ = [γ_1, γ_2, ..., γ_m]. Secondly, it is activated by the function φ in the way of φ(Γ^t X_i) = (1, φ(γ_1^t X_i), ..., φ(γ_m^t X_i))^t. Then it is further summarized in the output node by applying new weights to the activation function, i.e. β^t φ(Γ^t X_i), where β = (β_0, β_1, ..., β_m)^t ∈ R^{m+1}. Thus,

    f = β^t φ(Γ^t X_i).

From the above discussion, we can see that feed-forward neural networks with a single hidden layer fit into the general modeling framework of (1.1) very well. In this setting, the targeted or response variable Y can be expressed by the function f = β^t φ(Γ^t X_i). Therefore, one can imagine that the algorithm CIO can be applied to feed-forward neural networks with a single hidden layer, because the function f = β^t φ(Γ^t X_i) is a special case of (1.1).

The rest of the paper is structured in the following way. In the next section, we discuss an application of CIO to feed-forward neural networks with a single hidden layer; the new learning algorithm originating from this application of CIO is discussed there in relative detail. Finally, an illustrative example using the new learning algorithm is given.

2 An Application of CIO to Feed-Forward Neural Networks


Training neural networks to reveal the proper pattern in a data set is a non-trivial task. The performance of a network is often highly associated with the effectiveness of the training algorithm. The well-known back-propagation algorithm [12] is a popular approach to train a multi-layer perceptron, i.e. a feed-forward network, by minimizing the squared errors. Some of its properties have been studied through a number of applications. With the development of high-power computing equipment, many alternative algorithms have been proposed, such as second-order learning algorithms and the classical approaches of Gauss-Newton and Newton-Raphson. [8] gave a learning algorithm that uses the projection pursuit technique to optimize one node at a time; the approach was further developed in [7]. Some other existing optimization approaches have also been used, and comparisons of these algorithms have been conducted for some cases in [13]. [9] proposed a learning algorithm using the Hessian matrix in the recursive update formula, a variation of the second-order learning algorithm.
In training feed-forward neural networks with a single hidden layer, the special structure of the processing flow can be exploited. From the discussion in Section 1, we know that the output can be expressed by the following model:

    y_i = β^t φ(Γ^t X_i) + ε_i,                                                (2.1)

where φ(Γ^t X_i) = (1, φ(γ_1^t X_i), ..., φ(γ_m^t X_i))^t and ε_i is a random error term, i = 1, 2, ..., n. If the objective function is the mean squared error, then training the neural network is equivalent to finding the least-squares estimates of the weight parameters.


Since (2.1) is only a special form of (1.1), given the objective function of the mean squared errors, we can apply the algorithm CIO to train the network. There are two options: (a) apply CIO directly to (2.1), or (b) apply CIO only to β = (β_0, β_1, ..., β_m)^t ∈ R^{m+1}, the vector of weight parameters between the hidden layer and the output layer. Considering the condition of the theorem for CIO, we only discuss option (b) in this paper.
Simple computations for (2.1) based on CIO lead to g'(β_j^(k)(i)) = φ(γ_j^t X_i). Thus, updating the estimates of β = (β_0, β_1, ..., β_m)^t ∈ R^{m+1}, under the condition that Γ = [γ_1, γ_2, ..., γ_m], γ_i = (γ_{0i}, ..., γ_{pi})^t ∈ R^{p+1}, is given, can be done simply by following the procedure below.


Step 0. Initial value: Choose an initial value β^(0) = (β_0^(0), β_1^(0), ..., β_p^(0))^t for β randomly or by certain optimal rules such as least squares.

Step 1. Compute β^(1) = (β_0^(1), β_1^(1), ..., β_p^(1))^t:

Compute β_0^(1): Given a sample pattern (y_i, x_{i1}, ..., x_{ip}), we can find the solution of the equation y_i - g(β_0) = 0, denoted by β_0^(1)(i). Repeat n times for all sample patterns (y_i, x_{i1}, ..., x_{ip}), i = 1, ..., n, then let

    β_0^(1) = (1/n) Σ_{i=1}^{n} β_0^(1)(i).

Compute β_1^(1): First, substitute β_0^(1) for β_0^(0) in the function g(β_1), where g(β_1) = f(β_1, given x_{ij}, β_0^(1), β_k^(0), k ≠ 0, 1, k = 0, 1, ..., p). Then solve the equation y_i - g(β_1) = 0; the solution is denoted by β_1^(1)(i) for the given sample pattern (y_i, x_{i1}, ..., x_{ip}). Repeat n times for all sample patterns (y_i, x_{i1}, ..., x_{ip}), i = 1, ..., n, then let

    β_1^(1) = Σ_{i=1}^{n} (φ(γ_1^t X_i))^2 β_1^(1)(i) / Σ_{i=1}^{n} (φ(γ_1^t X_i))^2.

...

(p) Compute the last component β_p^(1): By the p-1 steps above, the components of β^(1) obtained so far are taken as β_l^(1), l = 0, 1, ..., p-1, in the function g(β_p). Then solve the equation y_i - g(β_p) = 0; the solution is denoted by β_p^(1)(i) for the given sample patterns (y_i, x_{i1}, ..., x_{ip}), i = 1, ..., n, and then let

    β_p^(1) = Σ_{i=1}^{n} (φ(γ_p^t X_i))^2 β_p^(1)(i) / Σ_{i=1}^{n} (φ(γ_p^t X_i))^2.

Step k. Compute β^(k) = (β_0^(k), β_1^(k), ..., β_p^(k))^t: repeat Step 1 k times, then get β^(k).

Let us denote the above procedure by CIO(β; Γ), which means that, given Γ, we update β by CIO(β; Γ). The other group of weight parameters in the network, Γ = [γ_1, γ_2, ..., γ_m], γ_i = (γ_{0i}, ..., γ_{pi})^t ∈ R^{p+1}, can be updated by one of many commonly used iterative procedures, such as Gauss-Newton, Newton-Raphson, and Conjugate Gradient, denoted by CUIP(Γ; β), which means that, given β, we update Γ by CUIP(Γ; β).
Given the above notation, let φ be the activation function, x be the input features, and y be the response; the following listing shows the updating procedure.
Algorithm Neural-CIO(φ, x, y, β^(0), Γ^(0))
1.  β(old) ← β^(0)
2.  Γ(old) ← Γ^(0)
3.  SSR ← Criterion(β^(0), Γ^(0), φ, x, y)
4.  While SSR > δ do:
        β(new) ← CIO(β(old); Γ(old))
        Γ(new) ← CUIP(Γ(old); β(new))
        SSR ← Criterion(β(new), Γ(new), φ, x, y)
5.  Return β(new), Γ(new), SSR
The advantage of the function CIO(β(old); ·) over other available learning algorithms is its closed form, i.e. β(new) ← CIO(β(old); ·). To update the weight parameter vector β, we do not need to apply iterations, since the update is in closed form; therefore, it is computationally more efficient. In the next section, we will show this computational efficiency with a numeric example. The function Criterion(β(new), Γ(new), φ, x, y) can take many forms, such as the mean squared error, the number of iterations, or the lack of training progress.
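As an illustration of the alternating scheme only, the Python sketch below trains a single-hidden-layer network in the spirit of Neural-CIO: a closed-form update of the output weights β given Γ, followed by an iterative CUIP-style refinement of Γ. The use of ordinary least squares as the closed-form β step and a plain gradient step for CUIP are simplifying assumptions, not the exact expressions of the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neural_cio(X, y, m=3, max_iter=200, tol=1e-4, lr=0.05, seed=0):
    """Alternate a closed-form update of beta with an iterative update of Gamma."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])                 # bias input (+1)
    Gamma = rng.normal(scale=0.5, size=(p + 1, m))
    beta, ssr = np.zeros(m + 1), np.inf
    for _ in range(max_iter):
        H = np.hstack([np.ones((n, 1)), sigmoid(Xb @ Gamma)])  # (1, phi(Gamma^t X))
        # CIO-style step: the model is linear in beta, so beta has a
        # closed-form optimum (ordinary least squares used as a stand-in).
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        resid = y - H @ beta
        ssr = float(resid @ resid)                       # SSR criterion
        if ssr < tol:
            break
        # CUIP step (one illustrative choice): a single gradient step on Gamma.
        act = sigmoid(Xb @ Gamma)
        grad = -(Xb.T @ (resid[:, None] * beta[1:][None, :] * act * (1.0 - act)))
        Gamma -= lr * grad / n
    return beta, Gamma, ssr
```

Any of the procedures mentioned above (Gauss-Newton, Newton-Raphson, conjugate gradient) could play the CUIP role in place of the gradient step used here.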

3 An Illustrative Example
This section gives a numeric example of a classification problem using Fisher's famous Iris data. A comprehensive study on applying neural networks to this data set was given in [11]. In the following, both the Neural-CIO and Back-propagation algorithms are implemented and compared over the data in the framework of three


feed-forward neural networks with a single hidden layer, 4-2-1, 4-3-1, 4-4-1, i.e. 2, 3,
and 4 hidden nodes, respectively.
For the data, a sample of three records of the original data can be seen in the following table.
Table 1. Three records of Fisher's Iris data

  Sepal length   Sepal width   Petal length   Petal width   Class
  5.1            3.5           1.4            0.2           Setosa
  7.0            3.2           4.7            1.4           Versicolor
  6.3            3.3           6.0            2.5           Virginica

All measurements are in centimeters. It is well known that the class Setosa can be linearly separated from the other two, while the other two cannot be separated linearly. Thus, we only apply the algorithms to the network that separates these two classes. The data are further transformed using the following formulas, and each record is then assigned a class number of either 0.0999 or 0.0001.
Sepal length = (Sepal length - 4.5) / 3.5;  Sepal width = (Sepal width - 2.0) / 1.5;
Petal length = (Petal length - 3.0) / 3.5;  Petal width = (Petal width - 1.4) / 1.5;
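For reference, the transformation above is a simple rescaling; a short Python sketch is given below (the column order is assumed to be as in Table 1).

```python
def transform(record):
    """Rescale one Iris record (in centimeters) as described above."""
    sepal_length, sepal_width, petal_length, petal_width = record
    return ((sepal_length - 4.5) / 3.5,
            (sepal_width - 2.0) / 1.5,
            (petal_length - 3.0) / 3.5,
            (petal_width - 1.4) / 1.5)

print(transform((7.0, 3.2, 4.7, 1.4)))   # ~ (0.7143, 0.8000, 0.4857, 0.0000)
```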
Table 2. Two records of transformed Fisher's Iris data

  Sepal length   Sepal width   Petal length   Petal width   Class
  0.7143         0.8000        0.4857         0.0000        0.0999
  0.5143         0.8667        0.4286         0.7333        0.0001

Note: There are in total 100 records in the data set, 50 for each class.

In the training process, for both algorithms, the stopping rule is chosen to be the mean squared error falling below a pre-assigned number. The activation function is taken to be of the form φ(x) = (1 + e^(-x))^(-1).
The training results and performance are summarized in the table below.
The results in the table clearly show the advantage of the new learning algorithm: it incorporates the internal structure of feed-forward neural networks with a single hidden layer by applying the algorithm CIO in closed-form expressions to update the weights between the output layer and the hidden layer. Its computational optimality is a natural consequence inherited from the properties of the algorithm CIO, and this point has been further verified in the above illustrative example.

Table 3. Comparison between Neural-CIO and Back-propagation

  Structure   Error    #Misclassification    #Iteration        CPU time (seconds)
                       CIO      Back         CIO      Back     CIO      Back
  4-2-1       0.0002   6        6            2433     9092     68       93
  4-3-1       0.0002   5        6            1935     8665     96       120
  4-4-1       0.0002   5        6            811      6120     60       110

Note: CIO means Neural-CIO and Back means Back-propagation.


References
1. Baykal, B., Constantinids, A.: Sliding window adaptive fast QR and QR-lattice algorithm.
IEEE Signal Process 46 (11), (1998) 2877-2887
2. Belge, M., Miller, E.: A sliding window RLS-like adaptive algorithm for filtering alphastable noise. IEEE Signal Process Letter 7 (4), (2000) 86-89
3. Choi, B., Bien, Z.: Sliding-windowed weighted recursive least-squares method for parameter estimation. Electron Letter 25 (20), (1989) 1381-1382
4. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, Pt. II (1936) 179-188
5. Fletcher, R.: Practical Methods of Optimization, Vol. I: Unconstrained Optimization, Comput. J. 6 (1980) 163-168
6. Geman, S., Geman, D.: Stochastic relaxation, Gibbs distribution and Bayesian restoration
of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, (1984)
721-741
7. Hwang, J. N., Lay, S. R., Maechler, M., Martin, D. M.: Regression modeling in back-propagation and projection pursuit learning. IEEE Trans. on Neural Networks, vol. 5, no. 3 (1994) 342-353
8. Jones, L. K.: A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training. Ann. Statist. Vol. 20 (1992) 608-613
9. Karayiannis, N. B., Venetsanopoulos, A. N.: Artificial Neural Networks: Learning algorithms, Performance evaluation, and Applications KLUWER ACADEMIC
PUBLISHERS(1993).
10. Lin, Y.: Component-wise Iterative Optimization for Large Data, submitted (2006)
11. Lin, Y.: Feed-forward Neural Networks Learning algorithms, Statistical properties, and
Applications, Ph.D. Dissertation, Syracuse University (1996)
12. Rumelhart, D. E., Hinton, E. G., Williams, R. J.: Learning internal representations by error
propagation, Parallel Distributed Processing. Chap. 8, MIT, Cambridge, Mass. (1986)
13. Watrous, R. L.: Learning algorithms for connectionist networks: applied gradient methods
of nonlinear optimization, IEEE First Int. Conf. Neural Networks, San Diego, (1987) II
619-627

ERM-POT Method for Quantifying Operational Risk for Chinese Commercial Banks
Fanjun Meng1 , Jianping Li2, , and Lijun Gao3
1

Economics school, Renmin University of China, Beijing 100872, P.R. China


jmengfan@163.com
2
Institute of Policy & Management, Chinese Academy of Sciences, Beijing 100080,
P.R. China
ljp@casipm.ac.cn
3
Management School, Shandong University of Finance, Jinan 250014, P.R. China
glj963217@163.com

Abstract. Operational risk has become an increasingly important topic for Chinese commercial banks in recent years. Considering the huge operational losses, extreme value theory (EVT) has been recognized as a useful tool in analyzing such data. In this paper, we present an ERM-POT (Exponential Regression Model and Peaks-Over-Threshold) method to measure operational risk. The ERM-POT method leads to bias-corrected estimators and techniques for optimal threshold selection. The experimental results show that the method is reasonable.
Keywords: operational risk; EVT; POT; ERM; VaR.

1 Introduction

Basel II for banking mandates a focus on operational risk. In the Basel framework, operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people and systems, or from external events. Operational risk is one of the most important risks for Chinese commercial banks and has brought them huge losses in recent years. Considering the size of these events and their unsettling impact on the financial community, as well as the growing likelihood of operational risk losses, the analysis, modelling and prediction of rare but dangerous extreme events is very important, and sound quantitative approaches are needed.
EVT has developed very rapidly over the past two decades. It has been recognized as a useful set of probabilistic and statistical tools for the modelling of rare events, and its impact on insurance, finance and quantitative risk management is well recognized [2]. The distribution of operational risk is heavy-tailed




This research has been partially supported by a grant from National Natural Science
Foundation of China (# 70531040) and the President Fund of Institute of Policy and
Management, Chinese Academy of Sciences (CASIPM) (0600281J01).
The Corresponding author.




and high-kurtosis. Considering the nature of operational risk and its unsettling impact, EVT can play a major role in analyzing such data. A fast-growing literature exists; for details, see references [1], [2], [3], [4].
In this paper, we present an ERM-POT method to measure operational risk. In the POT method, how to choose the threshold in an optimal way is a delicate matter: a balance between bias and variance has to be made. In general, the Hill plot and the mean excess function (MEF) can be used to choose the threshold. However, selection of the threshold by these two approaches may produce biased estimates because both are empirical. The ERM-POT method leads to bias-corrected estimators and techniques for optimal threshold selection. With ERM-POT, the optimal threshold is selected, and then the VaR is obtained.
The paper is organized as follows. In Section 2, we give a brief overview of the ERM-POT method. Section 3 is the experiment with ERM-POT. The last part concludes the paper.

2 The ERM-POT Method

Given a threshold u and a heavy-tailed sample X_1, ..., X_n, let N_u denote the number of exceedances X_{i_1}, ..., X_{i_{N_u}} and denote the excesses Y_j = X_{i_j} - u ≥ 0. The distribution of excess values of X over the threshold u is defined by

    F_u(x) := P(X - u ≤ x | X > u) = [F(x + u) - F(u)] / [1 - F(u)].               (1)

For a sufficiently high threshold u, F_u(x) can be approximated by the Generalized Pareto Distribution (GPD) [7]:

    G_{ξ,σ}(x) = 1 - (1 + ξ x / σ)^(-1/ξ),   ξ ≠ 0,
    G_{ξ,σ}(x) = 1 - exp(-x/σ),              ξ = 0,                                 (2)

where G_{ξ,u,σ}(x) := G_{ξ,σ}(x - u), with shape parameter ξ and scale parameter σ. Fit a GPD to the excesses Y_1, ..., Y_{N_u} to obtain estimates ξ̂ and σ̂ with Maximum Likelihood (ML). For x > u, from equations (1) and (2), we estimate the tail of F with

    F̂(x) = 1 - (N_u / n) (1 + ξ̂ (x - u) / σ̂)^(-1/ξ̂),

where F_n(u) is the empirical distribution function at u: F_n(u) = 1 - N_u/n. For a given probability p, inverting F̂(x) then yields the following estimator for high quantiles above the threshold u [7]:

    x̂_p = u + (σ̂/ξ̂) [ (N_u/(np))^ξ̂ - 1 ],  and hence  VaR = u + (σ̂/ξ̂) [ (N_u/(np))^ξ̂ - 1 ].      (3)

Next we choose the threshold u. For simplicity, denote k = N_u. The ERM is proposed for the log-spacings of the order statistics X_{1,n} ≤ X_{2,n} ≤ ... ≤ X_{n,n} [6,7]:

    Z_{j,k} = j log(X_{n-j+1,n} / X_{n-j,n}) ≈ ( γ + b_{n,k} (j/(k+1))^(-ρ) ) f_{j,k},   1 ≤ j ≤ k,   (4)


with {f_{j,k}, 1 ≤ j ≤ k} a pure random sample from the standard exponential distribution, shape parameter γ, real constant ρ ≤ 0, and rate function b. The Hill estimator is given, for k ∈ {1, ..., n-1}, by

    H_{k,n} = (1/k) Σ_{j=1}^{k} (log X_{n-j+1,n} - log X_{n-k,n}).

We choose the threshold by the asymptotic mean squared error (AMSE) of the Hill estimator [7], given by

    AMSE(H_{k,n}) = Avar(H_{k,n}) + Abias^2(H_{k,n}) = γ^2 / k + ( b_{n,k} / (1 - ρ) )^2.          (5)

Similar to the adaptive threshold selection method [6], in the ERM we calculate the estimates γ̂, b̂_{n,k}, ρ̂ with ML for each k ∈ {3, ..., n-1}. We determine AMSE(H_{k,n}) for each k ∈ {3, ..., n-1} and then determine the optimal k as k_opt = argmin_k {AMSE}. Thus we choose X_{n-k_opt,n} as the threshold.
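To illustrate the mechanics, the Python sketch below evaluates the Hill estimator, the AMSE of equation (5) for given estimates, and the POT quantile of equation (3). The GPD fit via scipy.stats.genpareto is an implementation assumption; the actual ERM step of [7], i.e. the maximum-likelihood fit of (γ, b_{n,k}, ρ) in the exponential regression model for each k, is not reproduced here.

```python
import numpy as np
from scipy.stats import genpareto

def hill(x_sorted_desc, k):
    """Hill estimator H_{k,n} from the k largest observations (sorted descending)."""
    logs = np.log(x_sorted_desc[:k + 1])
    return float(np.mean(logs[:k] - logs[k]))

def amse(gamma, b_nk, rho, k):
    """Asymptotic mean squared error of the Hill estimator, equation (5)."""
    return gamma ** 2 / k + (b_nk / (1.0 - rho)) ** 2

def var_pot(losses, u, p):
    """POT high-quantile (VaR) estimator of equation (3) for threshold u."""
    losses = np.asarray(losses, dtype=float)
    excesses = losses[losses > u] - u
    n, n_u = losses.size, excesses.size
    xi, _, sigma = genpareto.fit(excesses, floc=0.0)   # ML fit of the GPD (xi != 0 assumed)
    return u + sigma / xi * ((n_u / (n * p)) ** xi - 1.0)
```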

3 Experiment

3.1 Data Set

For obvious reasons, operational risk data are hard to come by. This is to some extent true for Chinese commercial banks. We collected 204 operational losses of more than 20 banks from public media; the data cover the period from 1994 to 2006. We use a quantile-quantile plot (QQ-plot) to infer the tail behavior of the observed losses. From Fig. 1, the 204 publicly reported operational loss data are heavy-tailed (units: ten thousand RMB).

[Figure: QQ-plot of the ordered operational risk losses (x-axis: Ordered Data, up to about 6x10^6) against Exponential Quantiles (y-axis).]

Fig. 1. QQ plot

The estimators and quantiles are obtained by software S-Plus.


3.2 Results and Analysis

From Table 1, the threshold is u = 43200 ten thousand RMB. Almost 80% of the operational losses are below this value, while most extreme values are beyond it, so the calculated threshold appears reasonable.


Table 1. The results of ERM-POT (units: ten thousand RMB)

  u        ξ̂ (shape)   σ̂ (scale)   VaR_0.95    VaR_0.99   VaR_0.999
  43200    1.505829    116065.6    417740.1    5062954    163319974

The shape parameter ξ characterizes the tail behavior of a distribution: the larger ξ, the heavier the tail. The estimate we obtained, ξ̂ = 1.505829 > 1, shows that the operational losses of Chinese commercial banks are severely heavy-tailed. At the same time, at the 99.9% confidence level, the VaR (= 163319974), excluding expected losses, accounts for nearly 6% of the average assets of Chinese commercial banks from 2003 to 2005. As a consequence, we should pay close attention to operational risk, and useful quantitative and qualitative approaches should be developed for Chinese commercial banks. Finally, compared with the VaR in Lijun Gao [4], where VaR = 136328000, the two results are close to each other.

4 Conclusions

In this paper, we presented an ERM-POT method to measure the operational risk of extremely heavy-tailed loss data. Selection of the threshold by the Hill plot or the MEF may produce biased estimates; the ERM-POT provides a solution to this problem. With ERM-POT, the optimal threshold is selected, and then the VaR is obtained. From Table 1, we see that the threshold is reasonable and the new method is useful.

References
1. Chavez-Demoulin, V., Davison, A.: Smooth extremal models in finance. Journal of the Royal Statistical Society, Series C 54(1) (2004) 183-199
2. Chavez-Demoulin, V., Embrechts, P., Neslehová, J.: Quantitative Models for Operational Risk: Extremes, Dependence and Aggregation. The meeting "Implementing an AMA for Operational Risk", Federal Reserve Bank of Boston (2005)
3. Chernobai, A., Rachev, S.T.: Applying Robust Methods to Operational Risk Modeling. Journal of Operational Risk (Spring 2006) 27-41
4. Gao, L.: The research on operational risk measurement and capital distribution of Chinese commercial banks. Doctorate thesis (2006) (in Chinese)
5. Gao, L., Li, J., Xu, W.: Assessment of the Operational Risk for Chinese Commercial Banks. Lecture Notes in Computer Science 3994 (2006) 501-508
6. Matthys, G., Beirlant, J.: Adaptive threshold selection in tail index estimation. In: Extremes and Integrated Risk Management, Risk Books, London (2000) 37-49
7. Matthys, G., Beirlant, J.: Estimating the extreme value index and high quantiles with exponential regression models. Statistica Sinica 13 (2003) 853-880
8. Neslehová, J., Chavez-Demoulin, V., Embrechts, P.: Infinite-mean models and the LDA for operational risk. Journal of Operational Risk (Spring 2006) 3-25

Building Behavior Scoring Model Using Genetic Algorithm and Support Vector Machines
Defu Zhang1,2, Qingshan Chen1, and Lijun Wei1
1

Department of Computer Science, Xiamen University, Xiamen 361005, China


2
Longtop Group Post-doctoral Research Center, Xiamen, 361005, China
dfzhang@xmu.edu.cn

Abstract. In the increasingly competitive credit industry, one of the most


interesting and challenging problems is how to manage existing customers.
Behavior scoring models have been widely used by financial institutions to
forecast customers' future credit performance. In this paper, a hybrid GA+SVM
model, which uses genetic algorithm (GA) to search the promising subsets of
features and multi-class support vector machines (SVM) to make behavior
scoring prediction, is presented. A real life credit data set in a major Chinese
commercial bank is selected as the experimental data to compare the
classification accuracy rate with other traditional behavior scoring models. The
experimental results show that GA+SVM can obtain better performance than
other models.
Keywords: Behavior Scoring; Feature Selection; Genetic Algorithm; MultiClass Support Vector Machines; Data Mining.

1 Introduction
Credit risk evaluation decisions are crucial for financial institutions due to high risks
associated with inappropriate credit decisions. It is an even more important task today
as financial institutions have been experiencing serious competition during the past
few years. The advantage of using behavior scoring models can be described as the
benefit from allowing financial institutions to make better decisions in managing
existing clients by forecasting their future performance. The decision to be made
include what credit limit to assign, whether to market new products to these particular
clients, and how to manage the recovery of the debt while the account turns bad.
Therefore, new techniques should be developed to help predict credit more accurately.
Currently, researchers have developed many methods for behavior scoring, including modern data mining techniques, which have made a significant contribution to the field of information science [1], [2], [3]. At the same time, with the size of databases growing rapidly, data dimensionality reduction becomes another important factor in building a prediction model that is fast, easy to interpret, cost effective, and

This research has been supported by academician start-up fund (Grant No. X01109) and 985
information technology fund (Grant No. 0000-X07204) in Xiamen University.



generalizes well to unseen cases. Data reduction is performed via feature selection in
our approach. Feature selection is an important issue in building classification
systems. There are basically two categories of feature selection algorithms: feature
filters and feature wrappers. In this paper we adopt the wrapper model of feature
selection which requires two components: a search algorithm that explores the
combinatorial space of feature subsets, and one or more criterion functions that
evaluate the quality of each subset based directly on the predictive model [4].
GA is used to search through the possible combinations of features. GA is an
extremely flexible optimization tool for avoiding local optima as it can start from
multiple points. The input features selected by GA are used to train a Multi-Class
Support Vector Machines (SVM) that extracts predictive information. The trained
SVM is tested on an evaluation set, and the individual is evaluated both on predictive
accuracy rate and complexity (number of features).
This paper is organized as follows. In Section 2, we show the structure of the
GA+SVM model, and describe how GA is combined with SVM. The experimental
results are analyzed in Section 3. Conclusions are provided in Section 4.

2 GA+SVM Model for Behavior Scoring Problems


Firstly, we will give a short overview of the principles of genetic algorithm and
support vector machines. Further details can be found in [5], [6].
In order to use SVM for real-world classification tasks, we should extend typical
two-class SVM to solve multiple-class problems. Reference [7] gives a nice overview
about ideas of multi-class reduction to binary problems.

Fig. 1. A wrapper model of feature selection (GA+SVM)

Our behavior scoring model is a hybrid of the GA and SVM procedures, as shown in Fig. 1. In practice, the performance of a genetic algorithm depends on a number of factors. Our experiments used the following parameter settings: the population size is 50, the maximum number of generations is 100, the crossover rate is 0.9, and the mutation rate is 0.01.
The fitness function has to combine the two different criteria described above to obtain better performance. In this paper we use F_accuracy and F_complexity to denote the two criteria.


F_accuracy: The purpose of this function is to favor feature sets with a high predictive accuracy rate. SVM takes a selected set of features to learn the patterns and calculates the predictive accuracy. The radial basis function (RBF) is used as the kernel function of the SVM. With the selected features, the training data set is randomly split, with a ratio of D_train to D_validation of 2:1. In addition, since the procedure involves randomization, five iterations of the proposed method are used to reduce the effect of the randomized algorithm, and F_accuracy is the average over the five iterations.
F_complexity: This function is aimed at finding parsimonious solutions by minimizing the number of selected features, as follows:

    F_complexity = 1 - (d - 1) / (D - 1),                        (1)

where D is the dimensionality of the full data set and d is the dimension of the selected feature set. We expect that lower complexity will lead to easier interpretability of the solution as well as better generalization.
The fitness function of the GA can be described as follows:

    Fitness(x) = F_accuracy(x) + F_complexity(x).                (2)
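A minimal Python sketch of this fitness evaluation is given below. The use of scikit-learn's SVC with an RBF kernel, the train_test_split helper for the 2:1 split, and the binary feature-mask encoding of a chromosome are assumptions made for illustration; the paper does not specify an implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def fitness(chromosome, X, y, n_repeats=5, seed=0):
    """Fitness = F_accuracy + F_complexity for one feature-selection chromosome."""
    mask = np.asarray(chromosome, dtype=bool)
    d, D = int(mask.sum()), mask.size
    if d == 0:
        return 0.0
    accs = []
    for r in range(n_repeats):                           # five iterations, averaged
        X_tr, X_va, y_tr, y_va = train_test_split(
            X[:, mask], y, test_size=1 / 3, random_state=seed + r)   # 2:1 split
        accs.append(SVC(kernel="rbf").fit(X_tr, y_tr).score(X_va, y_va))
    f_accuracy = float(np.mean(accs))
    f_complexity = 1.0 - (d - 1) / (D - 1)               # equation (1)
    return f_accuracy + f_complexity                     # equation (2)
```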

3 Experimental Results
A credit card data set provided by a Chinese commercial bank is used to demonstrate the effectiveness of the proposed model. The data set covers the most recent eighteen months and includes 599 instances. Each instance contains 17 independent variables. The decision variable is the customer credit: good, bad, or normal. The numbers of good, normal, and bad instances are 160, 225, and 214, respectively.
In this section, GA+SVM is compared with a pure SVM, a back-propagation neural network (BPN), genetic programming (GP) and logistic regression (LR). The split ratio of the training and test data sets is 7:3. In order to compare the proposed method with the other models, five sub-samples are used to compare the predictive accuracy rates of these models. The predictive accuracy rates on the test data set are shown in Table 1. For the first sample, the feature subset selected by GA is shown in Table 2.
Table 1. Predictive accuracy rates of proposed models

  Model     Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   Overall
  GA+SVM    0.8883     0.8994     0.9162     0.8771     0.8883     0.8940
  SVM       0.8771     0.8715     0.8883     0.8492     0.8659     0.8704
  BPN       0.8659     0.8676     0.8892     0.8431     0.8724     0.8676
  GP        0.8827     0.8939     0.9106     0.8827     0.8883     0.8916
  LR        0.8492     0.8659     0.8770     0.8436     0.8715     0.8614

Table 2. Features selected by GA+SVM in Sample 1

  Feature type                        Selected features
  Customer's personal information     Age, Customer type, Education level
  Customer's financial information    Total asset, Average of saving


On the basis of the simulated results, we can observe that the classification accuracy rate of GA+SVM is higher than that of the other models. In contrast with the other models, we consider GA+SVM more suitable for behavior scoring problems for the following reasons. Unlike BPN, which is only suited for large data sets, our model can perform well on small data sets [8]. In contrast with the pure SVM, GA+SVM can choose the optimal input feature subset for the SVM. In addition, unlike conventional statistical models, which require assumptions about the data set and attributes, GA+SVM can perform the classification task without this limitation.

4 Conclusions
In this paper, we presented a novel hybrid GA+SVM model for behavior scoring. Building a behavior scoring model involves the problems of feature selection and model identification. We used GA to search for possible combinations of features and SVM to score customers' behavior. On the basis of the experimental results, we can conclude that GA+SVM obtains higher accuracy in behavior scoring problems.
In future work, we may incorporate other evolutionary algorithms with SVM for feature subset selection. How to select the kernel function, the parameters and the feature subset simultaneously is also a topic for future work.

References
1. Chen, S., Liu, X: The contribution of data mining to information science. Journal of
Information Science. 30(2004) 550-558
2. West, D.: Neural network credit scoring models. Computers & Operations Research.
27(2000) 1131-52
3. Li, J., Liu, J., Xu, W., Shi. Y.: Support Vector Machines Approach to Credit Assessment.
International Conference on Computational Science. Lecture Notes in Computer Science,
Vol. 3039. Springer-Verlag, Berlin Heidelberg New York (2004)
4. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence.
1(1997) 273-324
5. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. 3rd edn.
Springer-Verlag, Berlin Heidelberg New York (1996)
6. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer-Verlag, Berlin
Heidelberg New York (1995)
7. Allwein, E., Schapire, R., Singer, Y.: Reducing multiclass to Binary: A Unifying Approach
for Margin Classifiers. The Journal of Machine Learning Research. 1 (2001) 113-141
8. Nath, R., Rajagopalan, B., Ryker, R.: Determining the saliency of input variables in neural network classifiers. Computers & Operations Research. 8 (1997) 767-773

An Intelligent CRM System for Identifying High-Risk Customers: An Ensemble Data Mining Approach
Kin Keung Lai1, Lean Yu1,2, Shouyang Wang2, and Wei Huang1
1

Department of Management Sciences, City University of Hong Kong,


Tat Chee Avenue, Kowloon, Hong Kong
{mskklai,msyulean}@cityu.edu.hk
2 Institute of Systems Science, Academy of Mathematics and Systems Science,
Chinese Academy of Sciences, Beijing 100080, China
{yulean,sywang,whuang}@amss.ac.cn

Abstract. In this study, we propose an intelligent customer relationship management (CRM) system that uses support vector machine (SVM) ensembles to help enterprise managers effectively manage customer relationships from a risk-avoidance perspective. Different from classical CRM for retaining and targeting profitable customers, the main focus of our proposed CRM system is to identify high-risk customers in order to avoid potential losses. Through experimental analysis, we find that the Bayesian-based SVM ensemble data mining model with diverse components and the choose-from-subspace selection strategy shows the best performance on the testing samples.
Keywords: Customer relationship management, support vector machine, ensemble data mining, high-risk customer identification.

1 Introduction
Customer relationship management (CRM) has become more and more important today due to the intensely competitive environment and the increasing rate of change in the customer market. Usually, most enterprises are interested in knowing who will respond, activate, purchase, or use their products or services. However, customer risk avoidance and management is also a critical component of maintaining profitability in many industries, such as commercial banking and insurance. These businesses are concerned with the amount of risk they take by accepting an individual or a corporation as a customer. The sustainability and profitability of these businesses particularly depend on their ability to distinguish faithful customers from bad ones [1-2]. In order to enable these businesses to take preventive or corrective action immediately, an efficient and reliable model that can accurately identify high-risk customers with a potential default trend is imperative.
In a CRM system focusing on customer risk analysis, a generic approach is to apply a classification technique to data on similar previous customers, both faithful and delinquent, in order to find a relationship between their characteristics and potential default [1-2]. One important ingredient needed to accomplish


this goal is to seek an accurate classifier in order to categorize new or existing customers as good or bad. In the process of customer classification, data mining techniques, especially classification techniques, play a critical role. Some existing techniques for risk identification are described in [1-3].
Recent studies [2-3] found that a unitary data mining technique does not produce consistently good performance, because each data mining technique has its own shortcomings and different suitability. The ensemble data mining technique is therefore an effective way to remedy this drawback. In this study, our aim is to propose a support vector machine (SVM) based ensemble data mining model to identify high-risk customers. The rest of the study is organized as follows. Section 2 describes a four-step SVM ensemble data mining model for CRM to identify high-risk customers in detail. In Section 3, we conduct some experiments with a real-world customer dataset. Finally, some conclusions are drawn in Section 4.

2 The SVM-Based Ensemble Data Mining System for CRM


As noted earlier, ensembling multiple classification models into an aggregated output has been an effective way to improve the performance of data mining [2]. A definition of effective ensemble classifiers was introduced by Hansen and Salamon [4], who stated: "A necessary and sufficient condition for an ensemble of classifiers to be more accurate than any of its individual members is if the classifiers are accurate and diverse." An accurate classifier is one that is well trained and whose error rate is better than a random selection of output classes. Two classifiers are diverse if they make different errors on the same input values. Different from their study [5], which was done using ANN, our tool is SVM, which is a more robust model than ANN.
Generally, the SVM-based ensemble data mining system for high-risk customer identification comprises the following four main steps: data preparation, single classifier creation, ensemble member selection, and ensemble classifier construction, which are described as follows.
The first step of this ensemble data mining system is to prepare the input data into a readily available format. The main task of this phase is to collect related data and to perform the necessary preprocessing. A systematic study of data preparation for complex data analysis was done by Yu et al. [5].
The second step is to create single classifiers. According to the bias-variance-complexity trade-off principle [6], an ensemble model consisting of diverse individual models (i.e., base models) with much disagreement is more likely to have good performance. Therefore, how to generate diverse models is the key to the construction of an effective ensemble model [3]. For the SVM classifier, there are three main ways of generating diverse base models: (a) utilizing different training data sets; (b) changing the kernel function of the SVM, such as the polynomial function or the Gaussian function; (c) varying the SVM model parameters, such as the parameters C and σ². Although there are many ways to create diverse base models, the above approaches are not necessarily independent. In order to guarantee diversity, a selection step is used to pick out some of them and construct an ensemble classifier for CRM.
In order to select diverse classifiers, the subsequent step is to select some independent single models. There are many algorithms, such as principal component analysis (PCA) [7], choose the best (CTB) [8] and choose from subspace (CFS) [8],
for ensemble member selection. PCA is used to select a subset of members from the candidate members using the maximal eigenvalue of the error matrix. The idea of CTB is to select the classifiers with the best accuracy from the candidate members to formulate a subset of all members. CFS is based on the idea that, for each model type, it chooses the model exhibiting the best performance. Readers can refer to [7-8].
After the single classifiers are selected, the ensemble classifier is constructed in the final step. Typically, the majority-voting-based ensemble strategy [4] and the Bayesian-based ensemble strategy [11] are the most popular methods for constructing an accurate ensemble model. Interested readers can refer to [4, 11] for more details.
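The sketch below shows the flavor of steps two and four in Python: diverse base SVMs are generated by varying the model parameters (way (c)), and the selected members are combined by majority voting. scikit-learn and the particular parameter grid are illustrative assumptions, and the PCA/CTB/CFS selection and the Bayesian combination of [7-8, 11] are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

def build_base_classifiers(X_train, y_train,
                           Cs=(0.1, 1.0, 10.0), gammas=(0.01, 0.1, 1.0)):
    """Create diverse base SVMs by varying C and the RBF kernel parameter."""
    return [SVC(C=C, gamma=g, kernel="rbf").fit(X_train, y_train)
            for C in Cs for g in gammas]

def majority_vote(members, X):
    """Combine the selected members by majority voting."""
    votes = np.stack([m.predict(X) for m in members])    # (n_members, n_samples)
    predictions = []
    for column in votes.T:                               # votes for one sample
        values, counts = np.unique(column, return_counts=True)
        predictions.append(values[np.argmax(counts)])
    return np.array(predictions)
```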

3 Experimental Analysis
In this section a real-world credit dataset is used to test the performance of the SVM-based ensemble data mining model. The dataset used in this study is from a financial services company in England, obtained from the accompanying CD-ROM of a book [12]. For space considerations, a detailed description of the data is omitted. For testing purposes, we randomly divide the scaled dataset into two parts: a training set with 800 samples and a testing set with 1071 samples. In particular, the total accuracy [1-2] is used to measure the efficiency of classification. For comparison, a logistic regression-based (LogR) ensemble and an artificial neural network-based (ANN) ensemble are also constructed. Note that the LogR, ANN and SVM ensembles are each constructed with 20 different classifiers. The different experimental results are reported in Table 1.
Table 1. The total accuracy of different ensemble models (%)

  Ensemble   Selection strategy   Majority voting ensemble   Bayesian-based ensemble
  LogR       PCA                  57.65                      60.08
  LogR       CTB                  58.63                      62.29
  LogR       CFS                  59.49                      60.86
  ANN        PCA                  68.63                      73.35
  ANN        CTB                  69.06                      71.24
  ANN        CFS                  68.65                      72.93
  SVM        PCA                  70.75                      76.36
  SVM        CTB                  71.30                      77.68
  SVM        CFS                  75.63                      87.06

From Table 1, we find the following interesting conclusions. (1) Diversity of individual classifiers can improve the performance of the SVM-based ensemble models;
(2) CFS always performs the best among the three strategies of ensemble members
selection; (3) Bayesian-based ensemble strategy always performs much better than
majority voting ensemble strategy. These results and findings also demonstrate the
effectiveness of the SVM-based ensemble models relative to other ensemble approaches, such as logit regression ensemble and neural network ensemble.


4 Conclusions
In this study, we propose an intelligent CRM system that uses SVM ensemble techniques to help enterprise managers effectively manage customer relationship from a
risk avoidance perspective. Through experiment analysis, we can easily find that the
SVM-based ensemble model performs much better than LogR ensemble and ANN
ensemble, indicating that the SVM-based ensemble models can be used as an effective CRM tool to identify high risk customers.

Acknowledgements
This work is supported by the grants from the National Natural Science Foundation of
China (NSFC No. 70221001, 70601029), Chinese Academy of Sciences (CAS No.
3547600), Academy of Mathematics and Systems Sciences (AMSS No. 3543500) of
CAS, and City University of Hong Kong (SRG No. 7001677, 7001806).

References
1. Lai, K.K., Yu, L., Zhou, L.G., Wang, S.Y.: Credit Risk Evaluation with Least Square
Support Vector Machine. Lecture Notes in Computer Science 4062 (2006) 490-495
2. Lai, K.K., Yu, L., Wang, S.Y., Zhou, L.G.: Credit Risk Analysis Using a Reliability-based
Neural Network Ensemble Model. Lecture Notes in Computer Science 4132 (2006)
682-690
3. Lai, K.K., Yu, L., Huang, W., Wang, S.Y.: A Novel Support Vector Metamodel for Business Risk Identification. Lecture Notes in Artificial Intelligence 4099 (2006) 980-984
4. Hansen, L., Salamon, P.: Neural Network Ensemble. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (1990) 993-1001
5. Yu, L., Wang, S.Y., Lai, K.K.: A Integrated Data Preparation Scheme for Neural Network
Data Analysis. IEEE Transactions on Knowledge and Data Engineering 18 (2006) 217-230
6. Yu, L., Lai, K.K., Wang, S.Y., Huang, W.: A Bias-Variance-Complexity Trade-Off
Framework for Complex System Modeling. Lecture Notes in Computer Science 3980
(2006) 518-527
7. Yu, L., Wang, S.Y., Lai, K.K.: A Novel Nonlinear Ensemble Forecasting Model Incorporating GLAR and ANN for Foreign Exchange Rates. Computers & Operations Research
32(10) (2005) 2523-2541
8. Partridge D., Yates, W.B.: Engineering Multiversion Neural-Net Systems. Neural Computation 8 (1996) 869-893
9. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer: New York (1995)
10. Suykens J.A.K., Vandewalle, J.: Least Squares Support Vector Machine Classifiers. Neural Processing Letters 9 (1999) 293-300
11. Xu, L., Krzyzak, A., Suen, C.Y.: Methods of Combining Multiple Classifiers and Their
Applications to Handwriting Recognition. IEEE Transactions on Systems, Man, and
Cybernetics 22(3) (1992) 418-435
12. Thomas, L.C., Edelman, D.B., Crook, J.N.: Credit Scoring and its Applications. Society of
Industrial and Applied Mathematics, Philadelphia (2002)

The Characteristic Analysis of Web User Clusters Based on Frequent Browsing Patterns
Zhiwang Zhang and Yong Shi
School of Information of Graduate University of Chinese Academy of Sciences, Chinese
Academy of Sciences Research Center on Data Technology and Knowledge Economy, Beijing
(100080), China
zzwmis@sohu.com
Chinese Academy of Sciences Research Center on Data Technology and Knowledge Economy,
Graduate University of Chinese Academy of Sciences, Beijing (100080), China
yshi@gucas.ac.cn

Abstract. Web usage mining (WUM) is an important and fast-developing area of web mining. Recently, some enterprises have become aware of its potential, especially for applications in Business Intelligence (BI) and Customer Relationship Management (CRM). Therefore, it is crucial to analyze the behaviors and characteristics of web users so as to use this knowledge for advertising, targeted marketing, increasing competitive ability, etc. This paper provides an analytic method, algorithm and procedure based on suggestions from the literature and the authors' experience from some practical web mining projects. Its application shows that the combined use of frequent sequence pattern (FSP) discovery and the characteristic analysis of user clusters can contribute to improving and optimizing marketing and CRM.
Keywords: WUM, FSP, clustering.

1 Introduction
Data mining has been used by many organizations to extract valuable information from large volumes of data and then use it to make critical business decisions. As for WUM, a lot of work has mainly focused on web user navigation pattern discovery and association analysis, and on user and web page clustering. However, existing work is insufficient in analyzing the characteristics of web user clusters after identifying interesting frequent browsing patterns. In this paper, we first introduce related work in WUM and its analytic steps. We then discuss the three main steps of WUM, taking an educational web server as an example. Our main work here lies in creating a data mart of web usage data, discovering some FSP of web users, providing a method that measures the similarities among different FSP for user clustering, and providing an algorithm and its applications. In the end, a comparison between this algorithm and k-means and Kohonen networks is given.


2 Related Work
2.1 Taxonomy of Web Mining
Web mining involves a wide range of applications that aim at discovering and extracting hidden information from data stored on the Web. Besides, Web mining can be categorized into three different classes: (i) Web content mining, (ii) Web structure mining and (iii) WUM. For detailed surveys of Web mining please refer to [1]. Respectively, Web content mining [1] is the task of discovering useful information available on-line. Web structure mining [1, 2, 3] is the process of discovering the structure of hyperlinks within the Web. WUM is the task of discovering the activities of users while they are browsing and navigating through the Web [1].
2.2 WUM
WUM, from the data mining aspect, is the task of applying data mining techniques to discover usage patterns from Web data in order to understand and better serve the needs of users navigating the Web [1]. As with every data mining task, the process of WUM consists of three main steps: (i) data selection and preprocessing, (ii) pattern discovery and (iii) pattern analysis.

3 Data Selecting and Preprocessing


After we set up a definite business target, it is necessary to extract and select the types of data from the data sources. In general, in WUM we may obtain the following three types of data: (i) web data, generated by visits to a web site; (ii) business data, produced by the respective OLTP, OLAP, DSS and other systems; (iii) meta data, i.e. data describing the web site itself, for example its content and structure.
As discussed above, Web log data can mainly be obtained from internet or intranet resources. In this paper, the web log files are from the DePaul CTI Web server log. The preprocessed and filtered data are based on a random sample of users visiting this site over a 2-week period, containing 13745 sessions and 683 page views from 5446 users. For the analysis, we have developed a data mart to support FSP mining and, further, the characteristic analysis of user clusters. The data model is shown in Fig. 1.
[Figure: data mart tables and their relations — pageview_code (pageviewID, itemReferrer, item, ...), page_view (sessionID, pageviewID1, ..., pageviewIDk, clicks), user_session (sessionID, userID, pageviewID, duration, date), user_register (userID, loginID, loginName, ...), frequent_patterns (userID, support, confidence, consequent, antecedent1, ..., antecedentk).]

Fig. 1. Data mart tables and their relations


4 Frequent Patterns Discovery


4.1 FSP Discovery
In this section, we use the notion of FSP discovery where temporal correlations
among items in user transactions are discovered.
4.2 Results of Experimentation
In this paper we apply FSP discovery to the above dataset and produce the results shown in Fig. 2, with the maximum length being 3 and the top-3 of these FSP numbering 3514.
User Support Confidence Antecedent1 ==> Antecedent2 ==> Consequent
4 0.0204 1.0000
/programs/2002/bachelorcs2002.asp ==> /news/default.asp
5 0.0255 1.0000
/admissions/international.asp ==> /admissions/
4 0.0204 1.0000
/authenticate/login.asp?section=advising&title
=appointment&urlahead=apt_making/makeapts ==> /news/default.asp
7 0.0357 1.0000
/cti/core/core.asp?section=news ==> /news/default.asp
4 0.0204 1.0000
/news/news.asp?theid=586 ==> /news/default.asp
Fig. 2. FSP on Web server log

5 The Characteristics Analysis of Web User Clusters


5.1 Similarity Measures
Given a weighted value set W of the pages in the length-k FSP fp_i, the similarity between the FSP fp_i and fp_j is defined as

    sim(fp_i, fp_j) = Σ_{l=1}^{k} w_l · eq(p_il, p_jl),  where  eq(p_il, p_jl) = 1 if the two pages are the same, and 0 otherwise.

Here eq(·,·) may also be the combined probability distribution function of fp_i and fp_j.
5.2 Algorithm

In this part, we give an algorithm that implements maximum spanning tree (MST) clustering based on the above FSP (MSTCBFSP), as follows:
Input: FP = {fp_1, fp_2, ..., fp_m}, W = {w_1, w_2, ..., w_k}, and a cut threshold. Output: T // set of clusters.
Processing flow: // MST clustering based on FSP.
Step one: Compute the similarities between the different FSP and construct a graph G.
Step two: Build a maximum spanning tree T_max on the graph G.
Step three: Perform clustering analysis according to the MST T_max.
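As one possible realization of these steps, the Python sketch below computes pairwise similarities, builds a maximum spanning tree, and cuts its weakest edges to form clusters. The use of scipy's minimum_spanning_tree on negated weights, the +1 shift that keeps zero-similarity pairs as edges, and the fixed number of clusters are implementation choices assumed here, not prescribed by the paper.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def similarity(fp_i, fp_j, w):
    """sim(fp_i, fp_j) = sum_l w_l * eq(p_il, p_jl) for two length-k FSP."""
    return sum(w_l for w_l, a, b in zip(w, fp_i, fp_j) if a == b)

def mstcbfsp(patterns, w, n_clusters):
    """Cluster FSP by cutting the weakest edges of a maximum spanning tree."""
    m = len(patterns)
    sim = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            sim[i, j] = 1.0 + similarity(patterns[i], patterns[j], w)  # +1 shift
    # A maximum spanning tree is a minimum spanning tree of the negated weights.
    mst = minimum_spanning_tree(csr_matrix(-sim)).toarray()
    edges = [(i, j, -mst[i, j]) for i in range(m) for j in range(m) if mst[i, j] != 0]
    edges.sort(key=lambda e: e[2], reverse=True)         # strongest edges first
    adj = np.zeros((m, m))
    for i, j, _ in edges[: max(0, len(edges) - (n_clusters - 1))]:
        adj[i, j] = adj[j, i] = 1.0                      # drop the weakest edges
    return connected_components(csr_matrix(adj), directed=False)[1]

# Example: three toy length-2 patterns with equal page weights
labels = mstcbfsp([("a", "b"), ("a", "c"), ("d", "e")], w=(0.5, 0.5), n_clusters=2)
```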


5.3 Results of Experimentation

Each algorithm yields five clusters of web users with different characteristics (Table 1) after we run MSTCBFSP, k-means and Kohonen on the above FSP data set.
Table 1. The comparison of the results between MSTCBFSP and k-means, Kohonen

  Algorithm | Cluster    | Percent (User) | Antecedent1 (PageviewID, UserPercent) | Antecedent2 (PageviewID, UserPercent) | Consequent (PageviewID, UserPercent)
  K-means   | Cluster#1  |  3.59%         | 359 100%, 557 0%                      | 67 100%, 1 0%                         | 388 100%, 666 0%
  K-means   | Cluster#2  | 64.54%         | 557 100%, 359 0%                      | 1 100%, 67 0%                         | 388 0%, 666 0%
  K-means   | Cluster#3  | 24.70%         | 557 100%, 359 0%                      | 67 100%, 1 0%                         | 388 100%, 666 0%
  K-means   | Cluster#4  |  6.77%         | 359 100%, 557 0%                      | 1 0%, 67 0%                           | 388 0%, 666 0%
  K-means   | Cluster#5  |  0.40%         | 359 0%, 557 0%                        | 1 0%, 67 0%                           | 388 0%, 666 100%
  Kohonen   | Cluster-1  | 23.90%         | 557 100%, 359 0%                      | 1 0%, 67 0%                           | 388 100%, 666 0%
  Kohonen   | Cluster-2  | 64.94%         | 557 100%, 359 0%                      | 1 100%, 67 0%                         | 388 0%, 666 100%
  Kohonen   | Cluster-3  |  0.80%         | 359 0%, 557 0%                        | 1 0%, 67 100%                         | 388 100%, 666 0%
  Kohonen   | Cluster-4  |  3.59%         | 359 100%, 557 0%                      | 1 0%, 67 100%                         | 388 100%, 666 0%
  Kohonen   | Cluster-5  |  6.77%         | 359 100%, 557 0%                      | 1 0%, 67 0%                           | 388 0%, 666 0%
  MSTCBFSP  | Cluster+1  | 65.74%         | 359 0%, 557 100%                      | 67 0%, 1 100%                         | 388 0%, 666 100%
  MSTCBFSP  | Cluster+2  | 22.71%         | 557 100%, 359 0%                      | 1 0%, 67 100%                         | 388 100%, 666 0%
  MSTCBFSP  | Cluster+3  |  2.39%         | 557 0%, 359 100%                      | 67 100%, 1 0%                         | 388 100%, 666 0%
  MSTCBFSP  | Cluster+4  |  0.40%         | 359 0%, 557 0%                        | 1 0%, 67 0%                           | 388 0%, 666 100%
  MSTCBFSP  | Cluster+5  |  8.76%         | 359 100%, 557 0%                      | 1 0%, 67 0%                           | 388 0%, 666 0%

6 Comparison of the Results and Conclusions


The MSTCBFSP can promptly find a global optimum solution and arbitrarily shaped clusters. In contrast, k-means is apt to find a local optimum solution, does not work on categorical data directly, and can only find convex-shaped clusters. Besides, for the Kohonen map, the explanation of the clustering results is very difficult. In conclusion, MSTCBFSP is the better method. Consequently, we may try to use a fuzzy MSTCBFSP in the future.
Acknowledgements. This research has been partially supported by a grant from
National Natural Science Foundation of China (#70621001, #70531040, #70501030,
#70472074, #9073020), 973 Project #2004CB720103, Ministry of Science and
Technology, China, and BHP Billiton Co., Australia.

References
1. Margaret H. Dunham, Data Mining Introductory and Advanced topics, Prentice Hall (2003)
206-218.
2. Ajith Abraham, Business Intelligence from Web Usage Mining, Journal of Information &
Knowledge Management, Vol. 2, No. 4 (2003) 375-390.

A Two-Phase Model Based on SVM and Conjoint Analysis for Credit Scoring
Kin Keung Lai, Ligang Zhou, and Lean Yu
Department of Management Sciences, City University of Hong Kong, Hong Kong
{mskklai,mszhoulg,msyulean}@cityu.edu.hk

Abstract. In this study, we use least squares support vector machines (LSSVM) to construct a credit scoring model and introduce the conjoint analysis technique to analyze the relative importance of each input feature for making the decision in the model. A test based on a real-world credit dataset shows that the proposed model has good classification accuracy and can help explain the decision. Hence, it is an alternative model for credit scoring tasks.
Keywords: support vector machines, conjoint analysis, credit scoring.

1 Introduction

For financial institutions, the ability to predict whether a loan applicant or existing customer will default is crucial, and an improvement in prediction accuracy can help reduce losses significantly. Most statistical methods and optimization techniques, as well as some new approaches in artificial intelligence, have been used for developing credit scoring models. A comprehensive description of the methods used in credit scoring can be found in a recent survey [1].
Each method has its advantages and disadvantages, so it is difficult to find one model that performs consistently better than other models in all circumstances. In terms of classification accuracy, AI technologies can perform better than traditional methods; however, their black-box property makes it difficult for decision makers to use them with adequate confidence. In this study, we introduce an LSSVM [2] approach with the radial basis function (RBF) kernel and adopt an approach based on the principle of design of experiments (DOE) to optimize the parameters [3]. In addition, the conjoint analysis method is used to calculate the relative importance of each input feature for credit risk evaluation.
The rest of this paper is organized as follows. Section 2 illustrates the basic concepts of LSSVM and the main process of conjoint analysis and describes our method. In Section 3, we use a real-world dataset to test the proposed method and analyze the results with conjoint analysis. Section 4 provides a conclusion of this study.



2 Two-Stage Model Based on SVM and Conjoint Analysis

Given a training dataset {x_k, y_k}_{k=1}^{N}, we can formulate the LSSVM model in feature space as follows [2]:

    min_{w,b,e} J(w, e) = (1/2) w^T w + (C/2) Σ_{k=1}^{N} e_k^2                    (1)

    subject to:  y_k [w^T φ(x_k) + b] = 1 - e_k,  k = 1, ..., N.

The classifier in the dual space takes the form

    y(x) = sign( Σ_{k=1}^{N} α_k y_k K(x, x_k) + b ),                              (2)

where α_k are the Lagrange multipliers. In this study, we chose the kernel function to be the radial basis function (RBF): K(x, x_k) = exp(-||x - x_k||^2 / σ^2).
In the above LSSVM model, there are two parameters to be determined, C and σ². A method inspired by DOE, proposed by Staelin [3], can reduce the complexity sharply relative to the grid search method. The main steps of this approach are as follows:
1. Set initial range for Cand as [C min, C max], [D min, D max],iter = 1;
2. While iterMAXITER do
2.1. According to the pattern as shown in Figure 1, nd the 13 points in the
space [(C min, D min), (C max, D max)], set C Space=C max-C min, D Space=
D max-D min;
2.2. For each of the 13 points which have never been evaluated, carry out the
evaluation: set the parameters according to coordinate of this point, run LSSVM
via k-fold cross validation, the average classication accuracy is the performance
index of this point;
2.3. Choose the point p0(C0, D0) with best performance to be the center, set
new range of the searching space. C Space= C Space/2, D Space=D Space/2. If
the rectangle area with (C0, D0) as the center and C Space, D Space as width
and height exceeds the initial range, adjust the center point until the new search
space is contained within the [(C min, D min),(C max, D max)].
2.4. Set new searching space, C min=C0-C Space/2, C max=C0+C Space/2,
D min=D0-D Space/2, D max=D0+D Space/2, iter = iter +1;
3. Return the point with best performance, use its best parameters to create
a model for classication.
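The sketch below illustrates this coarse-to-fine DOE-style search. It is not the authors' code: the `evaluate` callback (e.g., k-fold cross-validated LSSVM accuracy), the exact 13-point pattern, and all parameter names are assumptions made for illustration.

```python
import numpy as np

def doe_pattern(c_min, c_max, d_min, d_max):
    """13 candidate points: a 3x3 grid plus 4 mid-quadrant points,
    mimicking the kind of pattern sketched in Fig. 1 (hypothetical layout)."""
    cs, ds = np.linspace(c_min, c_max, 3), np.linspace(d_min, d_max, 3)
    pts = [(c, d) for c in cs for d in ds]                      # 9 points
    cq = [(c_min + c_max) / 2 + s * (c_max - c_min) / 4 for s in (-1, 1)]
    dq = [(d_min + d_max) / 2 + s * (d_max - d_min) / 4 for s in (-1, 1)]
    pts += [(c, d) for c in cq for d in dq]                     # 4 more points
    return pts

def doe_search(evaluate, c_rng=(-5, 15), d_rng=(-15, 5), max_iter=5):
    """Coarse-to-fine search over (log2 C, log2 sigma).
    `evaluate(log2C, log2sigma)` must return a cross-validation accuracy."""
    (c_min, c_max), (d_min, d_max) = c_rng, d_rng
    cache, best = {}, (None, -np.inf)
    for _ in range(max_iter):
        for c, d in doe_pattern(c_min, c_max, d_min, d_max):
            if (c, d) not in cache:                  # evaluate each point only once
                cache[(c, d)] = evaluate(c, d)
            if cache[(c, d)] > best[1]:
                best = ((c, d), cache[(c, d)])
        c0, d0 = best[0]
        c_sp, d_sp = (c_max - c_min) / 2, (d_max - d_min) / 2
        # Halve the search box around the best point, clipped to the initial range.
        c0 = min(max(c0, c_rng[0] + c_sp / 2), c_rng[1] - c_sp / 2)
        d0 = min(max(d0, d_rng[0] + d_sp / 2), d_rng[1] - d_sp / 2)
        c_min, c_max = c0 - c_sp / 2, c0 + c_sp / 2
        d_min, d_max = d0 - d_sp / 2, d0 + d_sp / 2
    return best
```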
Fig. 1. Sketch of the two-iteration search with the DOE-based method

Conjoint analysis is a technique with wide commercial use, such as predicting the profitability or market share of a proposed new product and providing insight into how customers make trade-offs between various service features. For an evaluated applicant with feature vector $x$, its utility can be defined by the value $\sum_{k=1}^{N} \alpha_k y_k K(x, x_k) + b$ from the LSSVM model; the larger this value, the lower the possibility of default. All tested applicants can be ranked by their utility. The part-worth model is then selected as the utility estimation model because of its simplicity and popularity. We choose the multiple regression method to estimate the part-worths, with the ranking order of the applicants as the measurement scale for the dependent variable. Decision makers are concerned not only with the classification accuracy of the model but also with the average importance of each attribute, which can be measured by the relative importance of attributes in conjoint analysis. For each model, we can calculate the relative importance of attribute i with the following formula:
$$I_i = \frac{\max_j P_{ij} - \min_j P_{ij}}{\sum_{i=1}^{n} \big(\max_j P_{ij} - \min_j P_{ij}\big)} \times 100\%, \qquad j = 1, \ldots, L_i \qquad (3)$$

where $P_{ij}$ is the part-worth of level $j$ for attribute $i$, $L_i$ is the number of levels for attribute $i$, and $n$ is the number of attributes.
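A minimal sketch of this phase, under the assumption that part-worths are estimated by ordinary least squares on dummy-coded attribute levels with the utility rank as the dependent variable; the function names and data layout are illustrative, not the authors' code.

```python
import numpy as np

def estimate_part_worths(levels, ranks, n_levels):
    """levels: (n_samples, n_attributes) integer level indices;
    ranks: utility ranking used as the dependent variable.
    Dummy-code every level and solve a least-squares regression."""
    n, m = levels.shape
    X = np.zeros((n, sum(n_levels)))
    offsets = np.cumsum([0] + n_levels[:-1])
    for j, off in enumerate(offsets):
        X[np.arange(n), off + levels[:, j]] = 1.0
    beta, *_ = np.linalg.lstsq(X, ranks, rcond=None)
    return [beta[off:off + L] for off, L in zip(offsets, n_levels)]

def relative_importance(part_worths):
    """Relative importance I_i of Eq. (3), in percent, from the
    per-attribute part-worth arrays."""
    ranges = np.array([pw.max() - pw.min() for pw in part_worths])
    return 100.0 * ranges / ranges.sum()
```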

3 Empirical Study

A real-world dataset, the German credit dataset from UCI, is used. It consists of 700 good instances and 300 bad ones. For each instance, 24 input variables describe 19 attributes. Five-fold cross validation is used to evaluate the performance of each group of parameter settings for the LSSVM model. We set MAXITER = 5, the search space of log2 C to [-5, 15], and that of log2 σ to [-15, 5], and finally obtain the optimal parameters C = 2^7.5, σ = 2^5.0. Then we use 10-fold cross validation to test the efficiency of this model, and five repetitions are conducted to reduce the stochastic variability of the model training and testing process. The results are compared with other methods on the same dataset in Table 1.
The proposed method provides high classification accuracy, but a major deficiency of the model is the difficulty of explaining the role of the input features in making the decision. We use conjoint analysis to calculate the relative importance of each feature after ranking the utility of the testing samples. For the 10-fold cross validation run, there are 10 groups of testing samples; Figure 2 illustrates the average relative importance of each feature over the 10 groups of testing samples for the LSSVM+DOE model. From this figure, we can see that only three features of the applicants exceed 8%. Although some of the features have less importance, they all contribute to the decision in the LSSVM+DOE model.


Table 1. Results from different methods on the German credit dataset

Methods                            Classification accuracy (%)   Std. (%)
SVM + Grid Search (a)              76.00                         3.86
SVM + Grid Search + F-score (a)    77.50                         4.03
SVM + GA (a)                       77.92                         3.97
MOE (b)                            75.64                         -
RBF (b)                            74.60                         -
MLP (b)                            73.28                         -
LVQ (b)                            68.37                         -
FAR (b)                            57.23                         -
LS-SVM + DOE                       77.96                         5.83

(a) Results from [4]; (b) results from [5].

Fig. 2. Relative importance of features for German dataset

4 Conclusion

This study proposes a two-phase model for credit scoring. The parameters of the LSSVM model are optimized by a search procedure inspired by design of experiments. Then the decision from the LSSVM model is analyzed by the conjoint analysis method. The relative importance of attributes derived from conjoint analysis gives decision makers some idea about which features of the applicant are important for the model and whether the decision is consistent with their past experience. The results show that the proposed model has good classification accuracy and, to some degree, can help decision makers explain their decisions.

References
1. Thomas, L.C.: A Survey of Credit and Behavioural Scoring: Forecasting Financial Risk of Lending to Consumers. International Journal of Forecasting 16 (2000) 149-172
2. Suykens, J.A.K., Gestel, T.V., Brabanter, J.D., Moor, B.D., Vandewalle, J.: Least Squares Support Vector Machines. World Scientific, Singapore (2002)
3. Staelin, C.: Parameter Selection for Support Vector Machines. Tech. Rep. HPL-2002-354 (R.1), HP Laboratories Israel (2003)
4. Huang, C.L., Chen, M.C., Wang, C.J.: Credit Scoring with a Data Mining Approach Based on Support Vector Machines. Expert Systems with Applications (2006) doi:10.1016/j.eswa.2006.1007.1007
5. West, D.: Neural Network Credit Scoring Models. Computers & Operations Research 27 (2000) 1131-1152

A New Multi-Criteria Quadratic-Programming Linear


Classification Model for VIP E-Mail Analysis
Peng Zhang1,2, Juliang Zhang1, and Yong Shi1
1

CAS Research Center on Data Technology and Knowledge Economy,


Beijing 100080, China
zhangpeng04@mails.gucas.ac.cn, yshi@gucas.ac.cn
2
School of Information Science and Engineering, Graduate University of
Chinese Academy of Sciences, Beijing 100080, China

Abstract. In recent years, classification models based on mathematical programming have been widely used in business intelligence. In this paper, a new Multi-Criteria Quadratic-Programming Linear Classification (MQLC) model is proposed and tested with a VIP E-Mail dataset. This experimental study uses a variant of k-fold cross-validation to demonstrate the accuracy and stability of the model. Finally, we compare our model with the decision tree implemented by the commercial software C5.0. The result indicates that the proposed MQLC model performs better than the decision tree on small samples.
Keywords: VIP E-Mail Analysis, Data Mining, Multi-Criteria Quadratic-Programming Linear Classification, Cross-Validation, Decision Tree.

1 Introduction
Data Mining is a discipline combining a wide range of subjects such as Statistics, Machine Learning, Artificial Intelligence, Neural Networks, Database Technology, and Pattern Recognition [1]. Recently, classification models based on mathematical programming approaches have been introduced in data mining. In 2001, Shi et al. [2, 3] proposed a Multiple Criteria Linear Programming (MCLP) model which has been successfully applied to a major US bank's credit card portfolio management. In this paper, a new Multi-Criteria Quadratic-Programming Linear Classification (MQLC) model is proposed and applied to the VIP E-Mail dataset provided by a major web hosting company in China. As a new model, the performance and stability of MQLC are focal issues. In order to respond to these challenges, this paper conducts a variant of k-fold cross-validation and compares the prediction accuracy of MQLC with the decision tree in the software C5.0. Our findings indicate that MQLC is highly stable and performs well on small samples.
This paper is organized as follows. The next section gives an overview of the MQLC model formulation; the third section describes the characteristics of the VIP E-Mail dataset; the fourth section discusses the process and results of cross validation; the fifth section illustrates the comparison study with the decision tree in the commercial software C5.0; and the last section concludes the paper with a summary of the findings.


2 Formulation of the Multi-Criteria Quadratic-Programming Linear Classification Model
Given a set of $r$ attributes $a = (a_1, \ldots, a_r)$, let $A_i = (A_{i1}, \ldots, A_{ir}) \in R^r$ be one of the sample observations of these attributes. Suppose we predefine two groups $G_1$ and $G_2$; we can select a boundary $b$ to separate these two groups. A vector $X = (x_1, x_2, \ldots, x_r) \in R^r$ can be identified to establish the following linear inequalities [2, 3, 4, 5]:

$A_i X < b,\ A_i \in G_1; \qquad A_i X \ge b,\ A_i \in G_2.$

We define the external measurement $\alpha_i$ to be the overlapping distance from the two-group boundary for record $A_i$. If $A_i \in G_1$ but we misclassify it into $G_2$, or vice versa, $\alpha_i$ equals $|A_i X - b|$. We then define the internal measurement $\beta_i$ to be the distance of record $A_i$ from its adjusted boundary $b^*$. If $A_i$ is correctly classified, the distance $\beta_i$ equals $|A_i X - b^*|$, where $b^* = b + \alpha_i$ or $b^* = b - \alpha_i$. Suppose $f(\alpha)$ denotes the relationship of all overlapping $\alpha_i$, while $g(\beta)$ denotes the aggregation of all distances $\beta_i$. The final absolute catch rates depend on simultaneously minimizing $f(\alpha)$ and maximizing $g(\beta)$; the most common representation is by norm values. Given weights $w_\alpha, w_\beta > 0$ and letting $f(\alpha) = \|\alpha + c_1\|_p^p$ and $g(\beta) = \|\beta + c_2\|_q^q$, the generalized model can be converted into the single-criterion mathematical programming model:

(Normal Model) Minimize $\; w_\alpha \|\alpha + c_1\|_p^p - w_\beta \|\beta + c_2\|_q^q$  (1)

Subject to:
$A_i X - \alpha_i + \beta_i - b = 0,\ A_i \in G_1$
$A_i X + \alpha_i - \beta_i - b = 0,\ A_i \in G_2$
$\alpha_i, \beta_i \ge 0,\ i = 1, \ldots, n$

Based on the Normal Model, $p$, $q$, $c_1$, and $c_2$ can be chosen arbitrarily. When we set $p = 2$, $q = 1$, $c_1 = 1$, and $c_2 = 0$, we can formulate the objective function as $w_\alpha \sum_{i=1}^{n} (\alpha_i + 1)^2 - w_\beta \sum_{i=1}^{n} \beta_i$. After expanding it, we can write the model as follows:

(MQLC Model) Minimize $\; w_\alpha \sum_{i=1}^{n} \alpha_i^2 + 2 w_\alpha \sum_{i=1}^{n} \alpha_i - w_\beta \sum_{i=1}^{n} \beta_i$  (2)

Subject to:
$A_i X - \alpha_i + \beta_i - b = 0,\ A_i \in G_1$
$A_i X + \alpha_i - \beta_i - b = 0,\ A_i \in G_2$
$\alpha_i, \beta_i \ge 0,\ i = 1, \ldots, n$

In the MQLC model, the norm value applied to $\alpha$ being larger than that applied to $\beta$ implies that the penalty for misclassifying $A_i$ is more severe than the reward for enlarging the distance from the adjusted boundary. Since it is possible that $\alpha_i$ is less than 1, in such a scenario the objective value would decrease when we square $\alpha_i$. To make sure the penalty is aggravated when a misclassification occurs, MQLC adds 1 to each $\alpha_i$ before squaring it.
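Because model (2) is a convex quadratic program, it can be prototyped with an off-the-shelf convex solver. The sketch below is not the authors' implementation; it assumes the cvxpy package is available and uses hypothetical argument names (A1 and A2 for the two groups, b for the boundary, and weights chosen so that w_alpha exceeds w_beta).

```python
import numpy as np
import cvxpy as cp

def fit_mqlc(A1, A2, b=1.0, w_alpha=1.0, w_beta=0.5):
    """Solve model (2): A1, A2 are (n1, r) and (n2, r) observation
    matrices for groups G1 and G2; returns the weight vector X."""
    r = A1.shape[1]
    X = cp.Variable(r)
    a1 = cp.Variable(A1.shape[0], nonneg=True)  # alpha_i, A_i in G1
    b1 = cp.Variable(A1.shape[0], nonneg=True)  # beta_i,  A_i in G1
    a2 = cp.Variable(A2.shape[0], nonneg=True)  # alpha_i, A_i in G2
    b2 = cp.Variable(A2.shape[0], nonneg=True)  # beta_i,  A_i in G2
    alpha, beta = cp.hstack([a1, a2]), cp.hstack([b1, b2])
    objective = cp.Minimize(w_alpha * cp.sum_squares(alpha)
                            + 2 * w_alpha * cp.sum(alpha)
                            - w_beta * cp.sum(beta))
    constraints = [A1 @ X - a1 + b1 - b == 0,   # records in G1
                   A2 @ X + a2 - b2 - b == 0]   # records in G2
    cp.Problem(objective, constraints).solve()
    return X.value

# Classification rule from the paper: A_i X < b -> G1, otherwise G2.
```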

3 VIP E-Mail Dataset


Our partner company's VIP E-Mail data are mainly stored in two kinds of repository systems: one is a set of databases manually recorded by employees, which was initially produced to meet the needs of every kind of business service; the other is log files recorded automatically by machines, which contain information about customer logins, E-Mail transactions, and so on. After integrating all these data with the keyword SSN, we acquired a large table with 185 features from the log files and 65 features from the databases. Considering the integrity of the customer records, we eventually extracted two groups of customer records, the current customers and the lost customers, with 4998 records in each group. Combining these 9996 SSNs with the 230 features, we eventually acquired the large table for data mining.

4 Empirical Study of Cross-Validation


In this paper, a variant of k-fold cross-validation is used on the VIP E-Mail dataset; each calculation consists of training with one of the subsets and testing with the other k-1 subsets. We have computed 10 groups of datasets. The accuracy on the 10 training sets is extremely high, with an average accuracy of 88.48% on the lost users and 91.4% on the current users. When we concentrate on the 10 testing sets, the worst and best classification catch rates are 78.72% and 82.17% for the lost customers and 84.59% and 89.04% for the current customers. That is to say, the absolute catch rates of the lost class are above 78% and the absolute catch rates of the current class are above 84%. The result indicates that a good separation of the lost class and the current class is observed with this method.
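A minimal sketch of this reversed cross-validation scheme (train on one fold, test on the remaining k-1 folds), assuming a generic fit/predict classifier interface; fold counts and function names are illustrative only, not the authors' code.

```python
import numpy as np

def reversed_kfold(X, y, fit, predict, k=10, seed=0):
    """Train on a single fold and test on the other k-1 folds,
    the opposite of ordinary k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accuracies = []
    for i, train_idx in enumerate(folds):
        test_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        acc = np.mean(predict(model, X[test_idx]) == y[test_idx])
        accuracies.append(acc)
    return np.array(accuracies)
```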

5 Comparison of MQLC and Decision Tree


The following experiment compares MQLC and the decision tree in the commercial software C5.0. From the results in Table 1, we can see that when the percentage of the training set is increased from 2% to 25%, the accuracy of the decision tree increases from 71.9% to 83.2%, while the accuracy of MQLC increases from 78.54% to 83.56%. The accuracy of MQLC is slightly better than that of the decision tree. Moreover, when the training set is 10% of the whole dataset, the accuracy of MQLC peaks at 83.75%. That is to say, MQLC can perform well even on small samples. The reason may be that MQLC solves a convex quadratic problem, for which the global optimal solution can be obtained easily; the decision tree, on the other hand, merely selects the better tree from a limited set of built trees, which might not be the best tree. In addition, the pruning procedure of the decision tree may further eliminate some better branches. In conclusion, MQLC performs better than the decision tree on small samples.


Table 1. Comparison of MQLC and Decision Tree

Percentage     Decision Tree           MQLC
of training    Training    Testing     Training    Testing
2%             92.7%       71.9%       96.0%       78.54%
5%             94.9%       77.3%       92.8%       82.95%
10%            95.2%       80.8%       90.7%       83.75%
25%            95.9%       83.2%       88.95%      83.56%

6 Conclusion
In this paper, a new Multi-Criteria Quadratic-Programming Linear Classification (MQLC) model has been proposed for classification problems in data mining. 230 features were extracted from the original data source to describe the VIP E-Mail users, and 9996 records were chosen to test the performance of MQLC. The results of cross-validation show that the model is extremely stable over multiple groups of randomly generated training and testing sets. The comparison of MQLC and the decision tree in C5.0 shows that MQLC performs better than the decision tree on small samples. Experiments indicate that the accuracy of MQLC is not affected by the parameters of the objective function; further research will include a mathematical proof of this phenomenon.
Acknowledgments. This research has been partially supported by a grant from
National Natural Science Foundation of China (#70621001, #70531040, #70501030,
#70472074), 973 Project #2004CB720103, Ministry of Science and Technology,
China, NSFB(No.9073020) and BHP Billiton Co., Australia.

References
1. Han, J. W., M. Kamber: Data Mining: Concepts and Techniques. San Diego. CA: Academic
Press (2000) ch.1.
2. Shi, Y., Wise, M., Luo, M. and Lin, Y.: Data mining in credit card portfolio management: a
multiple criteria decision making approach. in Multiple Criteria Decision Making in the
New Millennium, M. Koksalan and S.Zionts, eds., Berlin: Springer (2001) 427-436
3. Shi, Y., Peng, Y., Xu, W., Tang, X.: Data Mining via Multiple Criteria Linear
Programming: Applications in Credit Card Portfolio Management. International Journal of
Information Technology and Decision Making, vol. 1(2002) 131-151.
4. Kou, G., Peng, Y., Shi, Y., Xu, W.: A Set of Data Mining Models to Classify Credit Cardholder Behavior. International Conference on Computational Science (2003) 54-63
5. Peng, Y., Kou, G., Chen, Z., Shi, Y.: Cross-Validation and Ensemble Analyses on Multiple-Criteria Linear Programming Classification for Credit Cardholder Behavior. International Conference on Computational Science (2004) 931-939

Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
Nargess Memarsadeghi1,2 and David M. Mount2
1 NASA/GSFC, Code 588, Greenbelt, MD 20771
Nargess.Memarsadeghi@nasa.gov
2 University of Maryland, College Park, MD 20742
mount@cs.umd.edu

Abstract. Interpolating scattered data points is a problem of wide-ranging interest. One of the most popular interpolation methods in geostatistics is ordinary kriging. The price for its statistical optimality is that the estimator is computationally very expensive. We demonstrate the space and time efficiency and the accuracy of approximating ordinary kriging through the use of covariance tapering combined with iterative methods.
Keywords: Geostatistics, kriging, tapering, iterative methods.

1 Introduction

Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry [1,2]. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging [3]. This is because it is a best unbiased estimator [4,3,5]. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n. This is infeasible for large n. In practice, kriging is solved approximately by local approaches that are based on considering only a relatively small number of points that lie close to the query point [3,5]. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on the local density of the point distribution [5]. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continuous surfaces, it has zero-order continuity along its borders [6]. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical.



Recently a more principled approach to approximating kriging has been proposed, based on a technique called covariance tapering [7]. The problems arise from the fact that the covariance functions that are used in kriging have global support. In tapering, these functions are approximated by functions that have only local support and that possess certain necessary mathematical properties. This achieves greater efficiency by replacing large dense kriging systems with much sparser linear systems. Covariance tapering has been successfully applied to a restriction of our problem, called simple kriging [7]. Simple kriging, however, is not an unbiased estimator for stationary data whose mean value differs from zero. We generalize these results by showing how to apply covariance tapering to the more general problem of ordinary kriging.
Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in the literature for solving large linear systems for interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require O(n^3) time and O(n^2) storage. As Billings et al. suggested, we use an iterative approach [8]. In particular, we use the symmlq method [9] for solving the large but sparse ordinary kriging systems that result from tapering.
The main technical issue that needs to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite [3,10]. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix [7]. Thus, Cholesky factorization [11] could be used to solve their linear systems. They implemented an efficient sparse Cholesky decomposition method. They also showed that if these tapers are used for a limited class of covariance models, the solution of the tapered system converges to the solution of the original system. Matrix A in the ordinary kriging system, while symmetric, is not positive definite. Thus, their approach is not applicable to the ordinary kriging system [10]. After obtaining a sparse ordinary kriging linear system through tapering, we use symmlq to solve it [9].
We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, as well as significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse symmlq method for large ordinary kriging systems. This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
We start with a brief review of ordinary kriging in Section 2. In Section 3 the tapering properties are mentioned. We introduce our approaches for solving the ordinary kriging problem in Section 4. Section 5 describes the data sets we used. Then, we describe our experiments and results in Section 6. Section 7 concludes the paper. The full version of our paper has details that were omitted here [10].


2 Ordinary Kriging

Kriging is an interpolation method named after Danie Krige, a South African mining engineer who pioneered the field of geostatistics [5]. Kriging is also referred to as the Gaussian process predictor in the machine learning domain [12]. Kriging and its variants have been traditionally used in mining and geostatistics applications [4,5,3]. The most commonly used variant is called ordinary kriging, which is often referred to as a BLUE method, that is, a Best Linear Unbiased Estimator [3,7]. Ordinary kriging is considered to be best because it minimizes the variance of the estimation error. It is linear because estimates are weighted linear combinations of available data, and it is unbiased since it aims to have a mean error equal to zero [3]. Minimizing the variance of the estimation error forms the objective function of an optimization problem. Ensuring unbiasedness of the error imposes a constraint on this objective function. Formalizing this
objective function with its constraint results in the following system [10,3,5]:

$$\begin{pmatrix} C & L \\ L^{t} & 0 \end{pmatrix} \begin{pmatrix} w \\ \mu \end{pmatrix} = \begin{pmatrix} C_{0} \\ 1 \end{pmatrix}, \qquad (1)$$

where C is the matrix of the points' pairwise covariances, L is a column vector of all ones of size n, μ is the Lagrange multiplier enforcing the unbiasedness constraint, and w is the vector of weights w_1, ..., w_n. Therefore, the minimization problem for n points reduces to solving a linear system of size (n + 1)^2, which is impractical for very large data sets via direct approaches. It is also important that matrix C be positive definite [10,3]. Note that the coefficient matrix in the above linear system is a symmetric matrix which is not positive definite, since it has a zero entry on its diagonal.
Pairwise covariances are modeled as a function of the points' separation. These functions should result in a positive definite covariance matrix. Christakos [13] showed necessary and sufficient conditions for such permissible covariance functions. Two of these valid covariance functions are the Gaussian and spherical covariance functions (Cg and Cs, respectively). Please see [13,3,5] for details of these and other permissible covariance functions.
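To make system (1) concrete, the sketch below assembles the ordinary kriging coefficient matrix for a small point set with a Gaussian covariance and solves it with a dense solver. It is a minimal illustration of the exact (non-tapered) system, not the authors' code; the sill and range parameter names are assumptions.

```python
import numpy as np

def gaussian_cov(h, sill=1.0, rng=12.0):
    """Gaussian covariance model C_g as a function of separation h."""
    return sill * np.exp(-(h / rng) ** 2)

def ordinary_kriging(points, values, query, cov=gaussian_cov):
    """Solve system (1) exactly for one query point (dense, O(n^3)).
    points: (n, 2) array; values: length-n array; query: length-2 array."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = cov(d)           # C: pairwise covariances
    A[:n, n] = A[n, :n] = 1.0    # L and L^t
    b = np.append(cov(np.linalg.norm(points - query, axis=1)), 1.0)
    sol = np.linalg.solve(A, b)  # (w_1 .. w_n, mu)
    return sol[:n] @ values      # kriging estimate at the query point
```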

3 Tapering Covariances

Tapering covariances for the kriging interpolation problem, as described in [7], is the process of obtaining a sparse representation of the points' pairwise covariances such that the positive definiteness of the covariance matrix as well as the smoothness property of the covariance function are preserved. The sparse representation via tapering is obtained through the Schur product of the original positive definite covariance matrix with another such matrix:

$$C_{tap}(h) = C(h) \cdot C_{\theta}(h). \qquad (2)$$

The tapered covariance matrix, C_tap, is zero for points that are more than a certain distance apart from each other. It is also positive definite since it is the Schur product of two positive definite matrices. A taper is considered valid for a covariance model if it preserves the positive definiteness property and makes the approximate system's solution converge to the original system's solution. The authors of [7] mention a few valid tapering functions. We used the Spherical, Wendland_1, Wendland_2, and Top Hat tapers [7]. These tapers are valid for R^3 and lower dimensions [7]. Tapers need to be as smooth as the original covariance function at the origin to guarantee convergence to the optimal estimator [7]. Thus, for a Gaussian covariance function, which is infinitely differentiable, no taper exists that satisfies this smoothness requirement. However, since the tapers proposed in [7] still maintain the positive definiteness of the covariance matrices, we examined using these tapers for Gaussian covariance functions as well. We use these tapers mainly to build a sparse approximation of our original global system, even though they do not theoretically guarantee convergence to the optimal solution of the original global dense system.
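A sketch of how a compactly supported taper sparsifies the covariance matrix, using the Wendland_1 taper form and SciPy's sparse matrices; the taper range parameter and helper names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy import sparse

def wendland1(h, taper_range):
    """Wendland_1 taper: compactly supported, zero beyond taper_range."""
    t = np.clip(h / taper_range, 0.0, 1.0)
    return (1.0 - t) ** 4 * (4.0 * t + 1.0)

def tapered_cov_matrix(points, cov, taper_range):
    """Build C_tap = C o C_theta as a sparse matrix: only pairs closer
    than taper_range produce nonzero entries (Schur product, Eq. (2))."""
    n = len(points)
    rows, cols, vals = [], [], []
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        near = np.nonzero(d < taper_range)[0]
        rows.extend([i] * len(near))
        cols.extend(near.tolist())
        vals.extend((cov(d[near]) * wendland1(d[near], taper_range)).tolist())
    return sparse.csr_matrix((vals, (rows, cols)), shape=(n, n))
```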

4 Our Approaches

We implemented both local and global methods for the ordinary kriging problem.
Local Methods: This is the traditional and most common way of solving kriging systems. That is, instead of considering all known values in the interpolation process, only points within a neighborhood of the query point are considered. Neighborhood sizes are defined either by a fixed number of points closest to the query point or by points within a fixed radius from the query point. Therefore, the problem is solved locally. We ran our interpolations using both of these local approaches. We defined the fixed radius to be the distance beyond which correlation values are less than 10^-6 of the maximum correlation. Similarly, for the fixed number approach, we used the maximum connectivity degree of the points' pairwise covariances, considering covariance values larger than 10^-6 of the maximum covariance value. Gaussian elimination [14] was used for solving the local linear systems in both cases.
Global Tapered Methods: In global tapered methods we first redefine our points' covariance function to be the tapered covariance function obtained through Eq. (2), where C(h) is the points' pairwise covariance function and C_θ(h) is a tapering function. We then solve the linear system using the symmlq approach, as mentioned in [9]. Note that, while one can use the conjugate gradient method for solving symmetric systems, that method is guaranteed to converge only when the coefficient matrix is both symmetric and positive definite [15]. Since ordinary kriging systems are symmetric but not positive definite, we used symmlq. We implemented a sparse symmlq method, similar to the sparse conjugate gradient method in [16]. In the implementation of [16], matrix elements that are less than or equal to a threshold value are ignored. Since we obtain sparseness through tapering, this threshold value for our application is zero.
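As an illustration of solving the sparse, symmetric but indefinite tapered system iteratively: SciPy does not ship SYMMLQ, so the sketch below uses scipy.sparse.linalg.minres, a related Krylov method that also handles symmetric indefinite matrices; treat it as a stand-in for the paper's solver, with illustrative argument names.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import minres

def solve_tapered_ok(C_tap, c0_tap):
    """Solve the tapered ordinary kriging system (Eq. (1)) iteratively.
    C_tap: sparse (n, n) tapered covariance matrix; c0_tap: length-n vector
    of tapered covariances between the data points and the query."""
    n = C_tap.shape[0]
    L = sparse.csr_matrix(np.ones((n, 1)))
    A = sparse.bmat([[C_tap, L], [L.T, None]], format="csr")  # symmetric, indefinite
    b = np.append(c0_tap, 1.0)
    sol, info = minres(A, b)
    if info != 0:
        raise RuntimeError("MINRES did not converge")
    return sol[:n]  # kriging weights w (last entry is the Lagrange multiplier)
```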
Global Tapered and Projected Methods: This implementation is motivated by numerous empirical results in geostatistics indicating that interpolation weights associated with points that are very far from the query point tend to be close to zero. This phenomenon is called the screening effect in the geostatistical literature [17]. Stein showed conditions under which the screening effect occurs for gridded data [17]. While the screening effect has been the basis for using local methods, there is no proof of this empirically supported idea for scattered data points [7]. We use this conjecture for solving the global ordinary kriging system Ax = b, observing that many elements of b are zero after tapering. This indicates that for each zero element b_i, representing the covariance between the query point and the i-th data point, we have C_{i0} = 0. Thus, we expect the associated interpolation weight, w_i, to be very close to zero. We assign zero to such w_i's and consider solving a smaller system A'x' = b', where b' consists of the nonzero entries of b. We store the indices of the nonzero rows of b in a vector called indices. A' contains only those elements of A whose row and column indices both appear in indices. This method is effectively the same as the fixed-radius neighborhood approach, except that the local neighborhood is found adaptively. There are several differences between this approach and the local methods. One is that we build the global matrix A once, and use the relevant parts of it, contributing to nonzero weights, for each query point. Second, for each query, the local neighborhood is found adaptively by looking at covariance values in the global system. Third, the covariance values are modified.
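A sketch of the projection step just described, continuing the conventions of the previous sketch: rows and columns of the global tapered system are restricted to the indices where the query's tapered covariance vector is nonzero; names and structure are illustrative.

```python
import numpy as np
from scipy.sparse.linalg import minres

def solve_tapered_projected(A, b):
    """A: global (n+1, n+1) tapered ordinary kriging matrix (sparse CSR);
    b: right-hand side for one query. Drop unknowns whose b entry is zero
    (screening-effect conjecture), keeping the last row/column (constraint)."""
    n = A.shape[0] - 1
    indices = np.nonzero(b[:n])[0]      # points with nonzero tapered covariance
    keep = np.append(indices, n)        # always keep the unbiasedness constraint
    A_sub = A[keep][:, keep]            # project rows and columns
    w_sub, info = minres(A_sub, b[keep])
    w = np.zeros(n)
    w[indices] = w_sub[:-1]             # dropped points get zero weight
    return w
```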

5 Data Sets

As mentioned before, we cannot solve the original global systems exactly for very large data sets, and thus cannot compare our solutions with respect to the original global systems. Therefore, we need ground truth values for our data sets. Also, since the performance of local approaches can depend on the data point density around the query point, we would like our data sets to be scattered non-uniformly. Therefore, we create our scattered data sets by sampling points of a large dense grid from both uniform and Gaussian distributions. We generated our synthetic data sets using the Sgems software [18]. We generated values on a (1000 x 1000) grid using the Sequential Gaussian Simulation (sgsim) algorithm of the Sgems software [19,18]. Points were simulated through ordinary kriging with a Gaussian covariance function of range equal to 12, using a maximum of 400 neighboring points within a 24-unit-radius area. Then, we created 5 sparse data sets by sampling 1% to 5% of the original simulated grid's points. This procedure resulted in sparse data sets of sizes ranging from over 9K to over 48K. The sampling was done so that the concentration of points varies in different locations. For each data set, 5% of the sampled points were from 10 randomly selected Gaussian distributions. The rest of the points were drawn from the uniform distribution. Details of the real data tests and results are in our full paper [10].

6 Experiments

All experiments were run on a Sun Fire V20z running Red Hat Enterprise release 3, using the g++ compiler version 3.2.3. Our software is implemented in C++, using the GsTL and ANN libraries [19,20]. GsTL is used to build and solve the linear systems. ANN is used for finding nearest neighbors for the local approaches.
For each input data set we examined various ordinary kriging methods on 200 query points. Half of these query points were sampled uniformly from the original grids. The other 100 query points were sampled from the Gaussian distributions. We tested both local and global methods. Local methods used Gaussian elimination for solving the linear systems, while global tapered methods used sparse symmlq. Running times are averaged over 5 runs.
We examined the methods mentioned in Section 4. Global approaches require the selection of a tapering function. For synthetic data, we examined all tapers mentioned in Section 3. Even though there is no taper as smooth as the Gaussian model that would guarantee convergence to the optimal estimates, in almost all cases we obtained lower estimation errors when using global tapered approaches. As expected, smoother taper functions result in lower estimation errors. Also, results from the tapered and projected cases are comparable to their corresponding tapered global approaches. In other words, projecting the global tapered system did not significantly affect the quality of the results compared to the global tapered approach in our experiments. In most cases, the Top Hat and Spherical tapers performed similarly to each other with respect to the estimation error, and so did the Wendland tapers. The Wendland tapers give the lowest overall estimation errors. Among the Wendland tapers, Wendland_1 has lower CPU running times for solving the systems. Figure 1 shows the results when the Wendland_1 taper was used.
For local approaches, using fixed-radius neighborhoods resulted in lower errors for query points from the Gaussian distribution. Using a fixed number of neighbors seems more appropriate for uniformly sampled query points. Considering the maximum degree of points' covariance connectivity as the number of neighbors to use in the local approach requires extra work and longer running times compared to the fixed-radius approach. The fixed-radius local approach is faster than the fixed-neighborhood approach by 1-2 orders of magnitude for the uniform query points, and is faster by a constant factor up to an order of magnitude for query points from clusters, while giving better or very similar estimates compared to the fixed number of neighbors approach (Tables 1 and 2).
Tapering, used with sparse implementations for solving the linear systems, results in significant memory savings. Table 3 reports these memory savings for the synthetic data to be a factor of 392 to 437.
Table 1. Average CPU Times for Solving the System over 200 Random Query Points

           Local                      Tapered Global
n       Fixed Num  Fixed Radius  Top Hat  Top Hat Proj.  Spherical  Spherical Proj.  W1      W1 Proj.  W2      W2 Proj.
48513   0.03278    0.00862       8.456    0.01519        7.006      0.01393          31.757  0.0444    57.199  0.04515
39109   0.01473    0.00414       4.991    0.00936        4.150      0.00827          17.859  0.0235    31.558  0.02370
29487   0.01527    0.00224       2.563    0.00604        2.103      0.00528          8.732   0.0139    15.171  0.01391
19757   0.00185    0.00046       0.954    0.00226        0.798      0.00193          2.851   0.0036    5.158   0.00396
9951    0.00034    0.00010       0.206    0.00045        0.169      0.00037          0.509   0.0005    0.726   0.00064


Table 2. Average Absolute Errors over 200 Randomly Selected Query Points

           Local                 Tapered Global
n       Fixed Num  Fixed Radius  Top Hat  Top Hat Proj.  Spherical  Spherical Proj.  W1     W1 Proj.  W2     W2 Proj.
48513   0.416      0.414         0.333    0.334          0.336      0.337            0.278  0.279     0.276  0.284
39109   0.461      0.462         0.346    0.345          0.343      0.342            0.314  0.316     0.313  0.322
29487   0.504      0.498         0.429    0.430          0.430      0.430            0.384  0.384     0.372  0.382
19757   0.569      0.562         0.473    0.474          0.471      0.471            0.460  0.463     0.459  0.470
9951    0.749      0.756         0.604    0.605          0.602      0.603            0.608  0.610     0.619  0.637

Table 3. Memory Savings in the Global Tapered Coefficient Matrix

n       (n + 1)^2 (Total Elements)   Stored Elements   % Stored   Savings Factor
48513   2,353,608,196                5,382,536         0.229      437.267
39109   1,529,592,100                3,516,756         0.230      434.944
29487   869,542,144                  2,040,072         0.235      426.231
19757   390,378,564                  934,468           0.239      417.755
9951    99,042,304                   252,526           0.255      392.206

Fig. 1. Left: Average absolute errors over 200 query points. Right: Average CPU time for solving the system over 200 query points, plotted against the number of scattered data points (n), for the Fixed Num, Fixed Radius, Wendland1 Tapered, and Wendland1 Tapered & Projected methods.

7 Conclusion

Solving very large ordinary kriging systems via direct approaches is infeasible for large data sets. We implemented efficient ordinary kriging algorithms by utilizing covariance tapering [7] and iterative methods [14,16]. Furrer et al. [7] had utilized covariance tapering along with sparse Cholesky decomposition to solve simple kriging systems. Their approach is not applicable to the general ordinary kriging problem. We used tapering with a sparse symmlq method to solve large ordinary kriging systems. We also implemented a variant of the global tapered method by projecting the global system onto an appropriately smaller system. Global tapered methods resulted in memory savings ranging from a factor of 4.54 to 437.27. Global tapered iterative methods gave better estimation errors compared to the local approaches. The estimation results of the global tapered method were very close to those of the global tapered and projected method. The global tapered and projected method solves the linear systems order(s) of magnitude faster than the global tapered method.

Acknowledgements
We would like to thank Galen Balcom for his contributions to the C++ implementation of the symmlq algorithm.

References
1. Amidror, I.: Scattered data interpolation methods for electronic imaging systems: a survey. J. of Electronic Imaging 11 (2002) 157-176
2. Alfeld, P.: Scattered data interpolation in three or more variables. Mathematical Methods in Computer Aided Geometric Design (1989) 1-33
3. Isaaks, E.H., Srivastava, R.M.: An Introduction to Applied Geostatistics. Oxford University Press (1989)
4. Journel, A., Huijbregts, C.J.: Mining Geostatistics. Academic Press Inc (1978)
5. Goovaerts, P.: Geostatistics for Natural Resources Evaluation. Oxford University Press, Oxford (1997)
6. Meyer, T.H.: The discontinuous nature of kriging interpolation for digital terrain modeling. Cartography and Geographic Information Science 31 (2004) 209-216
7. Furrer, R., Genton, M.G., Nychka, D.: Covariance tapering for interpolation of large spatial datasets. J. of Computational and Graphical Statistics 15 (2006) 502-523
8. Billings, S.D., Beatson, R.K., Newsam, G.N.: Interpolation of geophysical data using continuous global surfaces. Geophysics 67 (2002) 1810-1822
9. Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. on Numerical Analysis 12 (1975) 617-629
10. Memarsadeghi, N., Mount, D.M.: Efficient implementation of an optimal interpolator for large spatial data sets. Technical Report CS-TR-4856, Computer Science Department, University of Maryland, College Park, MD, 20742 (2007)
11. Loan, C.F.V.: Intro. to Scientific Computing. 2nd edn. Prentice-Hall (2000)
12. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
13. Christakos, G.: On the problem of permissible covariance and variogram models. Water Resources Research 20 (1984) 251-265
14. Nash, S.G., Sofer, A.: Linear and Nonlinear Programming. McGraw-Hill Companies (1996)
15. Shewchuk, J.R.: An intro. to the conjugate gradient method without the agonizing pain. CMU-CS-94-125, Carnegie Mellon University (1994)
16. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C++, The Art of Scientific Computing. Cambridge University Press (2002)
17. Stein, M.L.: The screening effect in kriging. Annals of Statistics 1 (2002) 298-323
18. Remy, N.: The Stanford Geostatistical Modeling Software (S-GeMS). SCRC Lab, Stanford University (2004)
19. Remy, N.: GsTL: The Geostatistical Template Library in C++. Master's thesis, Department of Petroleum Engineering, Stanford University (2001)
20. Mount, D.M., Arya, S.: ANN: A library for approximate nearest neighbor searching. http://www.cs.umd.edu/~mount/ANN/ (2005)

Development of an Efficient Conversion System


for GML Documents
Dong-Suk Hong, Hong-Koo Kang, Dong-Oh Kim, and Ki-Joon Han
School of Computer Science & Engineering, Konkuk University,
1, Hwayang-Dong, Gwangjin-Gu, Seoul 143-701, Korea
{dshong,hkkang,dokim,kjhan}@db.konkuk.ac.kr

Abstract. In this paper, we designed and implemented a Conversion System for


GML documents, and evaluated the performance of the system. The conversion
system that uses the BXML-based binary encoding specification can directly
display geographic information from BXML documents and can convert GML
documents to BXML documents without loss of information, and vice versa.
BXML is generally more effective in scanning cost and space requirement than
GML and coordinate values in the form of BXML can be used directly without
conversion.
Keywords: Conversion System, GML, BXML, OGC, Geographic Information.

1 Introduction
Recently, the OGC (Open GIS Consortium) has presented the GML (Geography Markup Language) specification [3,4] for the interoperability of geographic information. However, since GML documents are encoded in text and their tags are very repetitive, they cause problems such as large data size, slow transmission time, and enormous document scanning cost [6]. The major method to reduce the size of text is compression such as GZIP [1]. However, as data compressed by GZIP must be decompressed into the original GML document, the document scanning cost and the coordinate-value converting cost increase enormously.
OGC has proposed the BXML (Binary-XML) encoding specification [5], which can encode a GML document into a binary XML format and reduce the document size by removing the repetitive tag names and attributes. The BXML encoding specification can also reduce the coordinate-value converting cost for displaying geographic information by encoding coordinate values into binary values.
In this paper, we designed and implemented a conversion system for GML documents. The system uses the BXML-based binary encoding specification proposed by OGC. BXML documents are generally more effective in scanning cost and space requirements than GML documents. In particular, coordinate values in the form of BXML documents can be used directly without conversion. The system can directly display geographic information from BXML documents and can convert GML documents to BXML documents without loss of information, and vice versa. In addition, this paper analyzes the performance results of the system.


2 Related Works
OGC proposed the GML specification for interoperability [3,4]. GML is an XML encoding in compliance with ISO 19118 for the transport and storage of geographic information modeled according to the conceptual modeling framework used in the ISO 19100 series, including both the spatial and non-spatial properties of geographic features [2,3]. GML as presently encoded using plain-text XML [7] has three major performance problems [5]: the text in the XML structure is highly redundant and bulky, making it slow to transfer over the Internet; the lexical scanning of XML is unexpectedly costly; and the conversion of text-encoded numerical coordinate and observation values is also very costly.
BXML was proposed for more effective transmission by reducing the size of an XML document without changing its contents for a limited environment [5]. If an XML file is used as a database of objects with a specific attribute used as a primary key, then it is greatly more efficient to seek randomly and directly to the data tokens that encode those objects. BXML can directly represent raw binary data without any indirect textual-encoding methods, and a backward-compatibility mechanism is provided to enable the translation of raw-binary blocks into an equivalent XML representation.

3 System Development
Fig. 1 shows the architecture of the conversion system for GML documents presented
in this paper.

Fig. 1. System Architecture

This system is divided into a server and a client. The server system is composed of a GML document analysis module to analyze GML documents, an encoding module to encode GML documents into BXML documents, a BXML document analysis module to analyze BXML documents, a decoding module to decode BXML documents into GML documents, a display module that reads the spatial data from BXML documents for display, and a network transmitting module to transmit the data encoded into BXML documents through the network. The client system is composed of a network transmitting module, a BXML document analysis module, and a display module.


In order to convert a GML document into a BXML document, the GML document is analyzed, made into tokens, and encoded into the corresponding binary values based on the code values. Fig. 2 and Fig. 3 show an example of a GML document and the BXML document converted from the GML document of Fig. 2, respectively. In addition, in order to convert a BXML document back into the original GML document, the BXML document is analyzed and the token types are classified.
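To illustrate why binary encoding removes the coordinate-conversion cost, the sketch below packs a GML-style coordinate list into raw IEEE-754 doubles and reads them back without text parsing. The token layout is a simplification for illustration only, not the BXML 0.0.8 wire format.

```python
import struct

def encode_coordinates(coords):
    """Pack (x, y) pairs as a count followed by raw little-endian doubles,
    instead of the text form '<gml:coordinates>x,y x,y ...</gml:coordinates>'."""
    flat = [v for xy in coords for v in xy]
    return struct.pack("<I", len(coords)) + struct.pack(f"<{len(flat)}d", *flat)

def decode_coordinates(buf):
    """Read the values back directly; no string splitting or float() parsing."""
    (n,) = struct.unpack_from("<I", buf, 0)
    flat = struct.unpack_from(f"<{2 * n}d", buf, 4)
    return list(zip(flat[0::2], flat[1::2]))

# Example: round-trip a small polyline.
points = [(126.98, 37.57), (127.01, 37.55)]
assert decode_coordinates(encode_coordinates(points)) == points
```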
The display module extracts the spatial data from a BXML document and displays it on the screen as maps. Fig. 4 shows the screen display generated from the BXML document of Fig. 3.

Fig. 2. GML Document

Fig. 3. BXML Document

Fig. 4. Screen Display

4 Performance Evaluation
This chapter examines the performance evaluation results of the conversion system for GML documents. The GML documents used are 20 Mbytes, 40 Mbytes, 60 Mbytes, and 80 Mbytes in size and are composed of point data, polyline data, and polygon data.
Fig. 5 illustrates the document size when the GML documents are compressed by GZIP and when the GML documents are encoded into BXML documents. As shown in Fig. 5, the compression of the GML document using GZIP achieves a better compression rate. However, GZIP compression does not consider document scanning but solely reduces the data size. Therefore, the compressed GZIP document must be decompressed and restored to its original state before the data contained within can be used. On the other hand, encoding into a BXML document saves more document scanning cost than GZIP compression.
Fig. 6 illustrates the display time of the geographic information of the GML document and the BXML document. As shown in Fig. 6, using the BXML document rather than the GML document achieves faster display time, and the gap between the two document types increases as the file size grows. In the case of the GML document, since all data are encoded as text, the coordinate values must be separated for coordinate value conversion. However, in the case of the BXML document, since its geographic data are encoded in their original data types, faster speed is possible as they can be used without any type conversion process.


Fig. 5. Comparison of Compression Size

Fig. 6. Comparison of Display Time

5 Conclusion
This paper designed and implemented a conversion system for GML documents that can convert GML documents into BXML documents without loss of information, and vice versa. The system can also directly display geographic information from BXML documents. By using the conversion system, it is possible to enhance the transmission speed by reducing the GML document size, and also to enhance the document scanning speed for faster display of the geographic information in BXML documents.

Acknowledgements
This research was supported by the Seoul Metropolitan Government, Korea, under the
Seoul R&BD Program supervised by the Seoul Development Institute.

References
1. IETF RFC 1952: GZIP File Format Specification Version 4.3, http://www.ietf.org/rfc/rfc1952.txt (1996)
2. ISO/TC 211: ISO 19136 Geographic Information - Geography Markup Language (GML) (2004)
3. OpenGIS Consortium, Inc.: Geography Markup Language (GML) Implementation Specification 3.1.0 (2004)
4. OpenGIS Consortium, Inc.: Geography Markup Language (GML) Version 3.1.1 Schemas (2005)
5. OpenGIS Consortium, Inc.: Binary-XML Encoding Specification 0.0.8 (2003)
6. Piras, A., Demontis, R., Vita, E.D., Sanna, S.: Compact GML: Merging Mobile Computing and Mobile Cartography. Proceedings of the 3rd GML and Geo-Spatial Web Service Conference (2004)
7. W3C: Extensible Markup Language (XML) 1.0 (Second Edition), http://www.w3.org/TR/REC-xml (2000)

Effective Spatial Characterization System


Using Density-Based Clustering
Chan-Min Ahn1, Jae-Hyun You2, Ju-Hong Lee3,*, and Deok-Hwan Kim4
1,2,3

Department of Computer science and Engineering, Inha-University, Korea


4
Department of Electronic and Engineering, Inha-University, Korea
1,2
{ahnch1, you}@datamining.inha.ac.kr,
3,4
{juhong, deokhwan}@inha.ac.kr
* Corresponding author.

Abstract. Previous spatial characterization methods do not analyze spatial regions well for a given query, since they focus only on characterization of the user's pre-selected area and do not consider spatial density. Consequently, the effectiveness of the characterization knowledge is decreased in these methods. In this paper, we propose a new hybrid spatial characterization system combining a density-based clustering module with a generalization module which consists of attribute removal generalization and concept hierarchy generalization. The proposed method can generate characteristic rules and applies density-based clustering to enhance the effectiveness of the generated rules.

1 Introduction
Recently, the need for spatial data mining has increased in many applications such as geographic information systems, weather forecasting, and market analysis. Studies on spatial data mining techniques include the following: the spatial characteristic rule, to extract summary information of spatial and non-spatial data; the spatial association rule, to find spatial relationships between data; and spatial clustering, to partition spatial data objects into a set of meaningful subclasses; and so on [6,7,8,9].
In particular, spatial characterization is one of the methods for discovering knowledge by aggregating non-spatial data and spatial data [4]. Many previous data mining systems support spatial characterization methods. Economic Geography and GeoMiner are representative spatial data mining systems which support spatial knowledge discovery [7, 9].
Ester et al. suggest the Economic Geography system based on the BAVARIA database [9]. Economic Geography uses a part of the entire database and performs characterization by using the relative frequency of spatial and non-spatial data. That is, it increases the effectiveness of characterization by using only the relative frequencies of the objects nearest to the target object. The GeoMiner system enhances the DBMiner system to treat both spatial and non-spatial data [7]. This system enables knowledge discovery for association, clustering, characterization, and classification. It also presents both NSD (Non-Spatial data Dominant Generalization) and SD (Spatial data Dominant Generalization) algorithms


for spatial characterization. GeoMiner has the advantage of presenting the appropriate type of discovered knowledge according to the user's needs.
The previous characterization systems have the following problems: first, their characterization methods do not analyze spatial regions well for a given user query, since they only use regions predefined by domain experts. Second, the user must describe the query range directly to reduce the search scope [6, 8].
In this paper, to solve these problems, we propose an effective spatial characterization method combining a spatial data mining module, which extends the generalization technique, with a density-based clustering module. The density-based clustering module enables the system to analyze the spatial region and to reduce the search scope for a given user query. Our characterization method can generate useful characteristic rules and applies density-based clustering to enhance the effectiveness of the generated rules. Effectiveness is measured by information gain, which uses the entropy of the selected data set [1].
The rest of the paper is organized as follows: in Section 2, we describe the proposed spatial characterization method using density-based clustering. In Section 3, we introduce a new characterization system implementing the suggested method. Section 4 contains the results of our experiments. Finally, Section 5 summarizes our work.

2 Spatial Characterization Method Using Density-Based Clustering
In this section, we describe a spatial characterization method using density-based clustering. Our characterization method first performs density-based clustering on spatial data and then extracts summary information by aggregating non-spatial data and spatial data with respect to each cluster.
2.1 Proposed Spatial Characterization Method
Spatial characterization extracts the summary information of a data class with respect to the search scope from the given spatial and non-spatial data. The proposed spatial characterization method consists of the five steps shown in Fig. 1.
The first step is to retrieve task-relevant data in the search scope from the spatial database. Task-relevant data are the spatial and non-spatial data of database tuples related to the user query. After that, we perform density-based clustering with respect to the spatial data of the retrieved task-relevant data. Note that this step is described separately in Section 2.2. As a result of clustering, the search scope is reduced to the objects within the selected spatial clusters. In the third and fourth steps, for the non-spatial attributes of each object in the spatial clusters, generalization is performed until the specified user threshold is satisfied. The proposed generalization module consists of an attribute removal generalization and a concept hierarchy generalization.


Algorithm 1. Spatial characterization algorithm


Given : Concept hierarchy, User threshold
Method :
1. Task-relevant data are retrieved by a user query.
2. Perform density-based clustering with respect to spatial data obtained from step 1.
3. Repeat generalization until a given user threshold is satisfied.
(1) Remove useless non-spatial attribute in task-relevant data.
(2) If a concept hierarchy is available, generalize task-relevant data to high-level concept data.
4. Apply aggregate operation to non-spatial data obtained from step 3.
5. Return spatial characteristic rules.

Fig. 1. Spatial characterization algorithm
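A compact sketch of steps 3-4 of Algorithm 1 (attribute removal plus concept-hierarchy generalization, followed by aggregation into characteristic rules). The dictionary-based concept hierarchy and the threshold handling are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def generalize(tuples, hierarchies, threshold):
    """tuples: list of dicts of non-spatial attributes (one per object in a cluster).
    hierarchies: {attribute: {low_level_value: high_level_concept}}.
    threshold: maximum number of distinct values allowed per attribute."""
    attrs = list(tuples[0].keys())
    for attr in attrs:
        while len({t[attr] for t in tuples}) > threshold:
            mapping = hierarchies.get(attr)
            if mapping is None:
                # Attribute removal: too many distinct values and no hierarchy.
                for t in tuples:
                    del t[attr]
                break
            before = len({t[attr] for t in tuples})
            # Concept-hierarchy generalization: climb one level.
            for t in tuples:
                t[attr] = mapping.get(t[attr], t[attr])
            if len({t[attr] for t in tuples}) == before:
                break  # hierarchy exhausted; stop climbing
    # Aggregation: count identical generalized tuples (characteristic rules).
    return Counter(tuple(sorted(t.items())) for t in tuples)
```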

The attribute removal generalization is used in the process of converting non-spatial task-relevant data into summary information. An attribute is removed in the following cases: (1) the tuple values with respect to the attribute are all distinct, or the attribute has more distinct tuple values than the user threshold; (2) the tuple values with respect to the attribute cannot be replaced with higher-level concept data.
After the attribute removal steps, the non-spatial task-relevant data are converted to summary information using the concept hierarchy. A concept hierarchy is a sequence of mappings from a set of low-level raw data to more general high-level concepts which represent summary information [5]. At the end, the algorithm returns spatial characteristic rules.
This generalization algorithm is the key point of the proposed spatial characterization. Extracting generalized knowledge from a spatial database using spatial characterization requires generalization of both spatial and non-spatial data. Thus, the generalization module performs the generalization task for non-spatial task-relevant data by using the user-defined concept hierarchy.
2.2 Density-Based Spatial Clustering
Spatial data mining uses both non-spatial data and spatial data. A tuple in the database includes a spatial object. A spatial object has the coordinate information of the spatial data related to the non-spatial data. Spatial clustering is used to partition the areas the user wants to search [2, 3, 9, 10]. We use DBSCAN as the density-based spatial clustering module to group spatial areas for spatial characterization, since it can be used as a basic tool for spatial analysis [9, 10].
DBSCAN merges regions with sufficiently high density into clusters. The proposed method supports not only a point type but also a polygon type. We extend DBSCAN with a minmax distance function to support the two types of spatial data. Let an object be a point datum or a polygon datum. In order for objects to be grouped, there must be at least a minimum number of objects, called MinObj, in the ε-neighborhood Ne(p) of an object p, given a radius ε, where p and q are objects and D is a data set. If p and q are point type objects, their distance can be calculated by the Euclidean distance function [9]. Otherwise, their distance is calculated by the minmax distance function [11]. The minmax distance function computes the minimum value of all the maximum distances between objects.

$$Ne(p) = \{ q \in D \mid dist(p, q) \le \varepsilon \} \qquad (1)$$

Fig. 2 illustrates a clustering example of polygon type objects with respect to the map of the provinces in Incheon city. To cluster polygon type objects, we calculate the MBR of each polygon object. After that, we calculate the minmax distance values between the center of the selected MBR and the other MBRs. The remaining steps are the same as the clustering process for point type objects.

Fig. 2. Clustering example of polygon type object

Fig. 3 illustrates the extended DBSCAN algorithm. For this clustering algorithm, we need some parameters. Let SetOfObjects be a set of spatial data, NoOfClusters be the given number of clusters, and r be the radius of the ε-neighborhood.
The complexity of the extended DBSCAN algorithm can be expressed as O(n²), where n is the number of objects.

3 Design and Implementation of Spatial Characterization System


In this section, we describe each component of the proposed spatial characterization system and an application example.
The proposed spatial characterization system consists of a spatial data mining query processor, a spatial data mining module, a spatial database, and a spatial clustering module. We use the SMQL query language for spatial data mining. SMQL is a spatial data mining language used in the GMS spatial database system [13].
Example 1 shows an SMQL query of an application for the proposed system. The task-relevant data in Table 1 are the intermediate query results of the query in Example 1.


Algorithm 2. Extended density-based clustering algorithm


Given : All data in D are unclassified
Input Parameters : SetOfObjects, NoOfClusters, ClassifiedSetOfObjects, radius r, MinObj
Method :
1. Make ClassifiedSetOfObjects empty.
2. Initially, choose any object p from SetOfObjects and select it as a seed object.
3. Search the density-reachable objects from the seed object with respect to the radius r and add them into the neighborhood. To find density-reachable objects, use the Euclidean distance function in the case of a point data type, or the minmax distance function in the case of a polygon data type.
4. Randomly choose any object within the neighborhood satisfying the core object condition and select it as a new seed object.
5. Repeat 3 to 4 until the number of density-reachable objects is less than MinObj.
6. Move the neighborhood objects from SetOfObjects to ClassifiedSetOfObjects.
7. Repeat 2 to 6 until no more objects exist in SetOfObjects or the number of generated neighborhoods is greater than or equal to NoOfClusters.

Fig. 3. Extended density-based clustering algorithm
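As a concrete illustration of step 3 of Algorithm 2, the sketch below (ours, not the authors' code; the object layout and function names are assumptions) tests ε-neighborhood membership with the Euclidean distance for point objects and a minmax-style distance over MBRs for polygon objects.

```python
# Illustrative neighborhood test for Fig. 3 (our sketch). Points are (x, y) tuples;
# polygons are approximated by MBRs given as (xmin, ymin, xmax, ymax).
import math

def euclidean(p, q):
    return math.dist(p, q)

def minmax_dist(center, mbr):
    # Minimum over the MBR's faces of the maximum distance to that face,
    # in the spirit of the MINMAXDIST of [11] for 2-D rectangles.
    xmin, ymin, xmax, ymax = mbr
    corners = [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]
    faces = [(corners[0], corners[1]), (corners[0], corners[2]),
             (corners[1], corners[3]), (corners[2], corners[3])]
    return min(max(euclidean(center, a), euclidean(center, b)) for a, b in faces)

def neighborhood(seed, objects, r, is_point):
    """Objects within the r-neighborhood of `seed` (a point or an MBR center)."""
    if is_point:
        return [o for o in objects if euclidean(seed, o) <= r]
    return [o for o in objects if minmax_dist(seed, o) <= r]
```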

Example 1. SMQL query for characterization with respect to annual_income, education, and age of women who reside in Incheon city.
MINE Characterization as woman_pattern
USING HIERARCHY H_income, H_education, H_age
USING Clustering distance 30
FOR annual_income, education, age
FROM census
WHERE province = "InCheon", gender = "F"
SET distinct_value threshold 20
[Step 1] Given the SMQL query statement in Example 1, the system collects the annual_income, education, age, and coordinate information attribute values of the task-relevant data from the census table in the spatial database. Please refer to [13] for the SMQL query processor.
[Step 2] Clustering is performed by using the coordinate information, which is stored in the object pointer attribute, with respect to the spatial task-relevant data.
[Step 3] In this step, low-level attribute values are replaced with the matching high-level attribute values in the concept hierarchy. The system generalizes the task-relevant data in Table 1 with respect to the annual_income attribute. This generalization increases the opportunity for the aggregation task.


[Step 4] Characterization that aggregates all tuples is performed. Table 2 shows the aggregated data as the result of the SMQL query shown in Example 1.
Table 1. Task-relevant data

id   Annual_income   age   education   object
1    2580            45    Univ        <X,Y>
2    2400            40    Senior      <X,Y>
3    3850            36    Master      <X,Y>
4    1780            27    Univ        <X,Y>

Table 2. A result of spatial characterization

Cluster   annual_income   Age      education   count
C1        Middle          middle   Higher      481
C3        Middle          middle   Secondary   316
C1        Middle high     middle   Higher      156

4 Evaluation and Result


We use a real map of the province of Incheon city. The data contain 8000 objects consisting of points and polygons. We perform an experiment and evaluate the result using entropy.
Entropy [1, 12] is used to measure the purity of target data in information theory. When the data set S is classified into c subsets, the entropy of S is defined as follows:

Entropy(S) = − Σ_{i=1}^{c} p_i log₂ p_i                                  (2)

where S is the selected data set, p_i is the ratio of the set S belonging to class i, and c is the number of subsets into which S is classified.
The average weight of an attribute is defined as:

W = w_{n_i} / T                                                           (3)

where w_{n_i} is the weight of n_i, n_i represents a randomly selected attribute, and T is the total weight.
In order to measure whether the result of spatial characterization reflects the data distribution in the database well, the entropy and the weight of each attribute are used. Thus, the information gain of the result of spatial characterization, using the entropy and the weight of each attribute, is defined as follows:

Gain(G) = E − (Wa × Ea + Wb × Eb)                                         (4)


where E is the total entropy, a is a data set, b is another data set distinct from a, Wa is the average weight of a, Wb is the average weight of b, Ea is the entropy of a, and Eb is the entropy of b.
We apply the information gain to the sample data used in the application of Example 1 in Section 3. Characterization without clustering denotes the previous method [7, 9] using only data generalization, while characterization with clustering denotes the proposed method. Fig. 4 shows the experimental result comparing characterization with clustering and characterization without clustering in terms of information gain.
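A small worked sketch of how equations (2) and (4) combine is shown below. This is our illustration: the class counts are invented, and treating Wa and Wb as the fractions of the data in each subset is an assumption about how the weights are applied.

```python
# Worked illustration of Entropy (2) and Gain (4); the class counts below are invented.
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

counts_all, counts_a, counts_b = [60, 40], [45, 5], [15, 35]   # hypothetical split
E = entropy(counts_all)
Wa = sum(counts_a) / sum(counts_all)          # assumed: weights as fractions of the data
Wb = sum(counts_b) / sum(counts_all)
gain = E - (Wa * entropy(counts_a) + Wb * entropy(counts_b))   # equation (4)
print(round(E, 3), round(gain, 3))
```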

[Figure: bar chart of information gain (y-axis: Gain, 0-1.2) for the Annual_income, Age, and Education attributes, comparing characterization without clustering and characterization with clustering]

Fig. 4. Result of the two spatial characterizations using information gain

The experimental result demonstrates that characterization with clustering is more effective than characterization without clustering with respect to the annual income, age, and education attributes.

5 Conclusion
We propose a new spatial characterization method that generalizes spatial objects, groups them into automatically selected areas by using DBSCAN density-based clustering, and aggregates them on non-spatial attributes. The proposed spatial characterization method has the following properties: first, we use density-based clustering to automatically select the search scope; second, the method can eliminate unnecessary spatial objects using the attribute removal and concept hierarchy generalization operations.
The elimination of unnecessary spatial objects yields useful knowledge with high information gain. The experimental result demonstrates that the performance of the proposed characterization method is better than that of the previous characterization method.


Acknowledgement
This research was supported by the MIC (Ministry of Information and Communication), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Assessment).

References
1. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. ACM Press (2000) pp. 144-149
2. Knorr, E.M., Ng, R.: Robust Space Transformations for Distance-based Operations. SIGKDD, San Francisco, California, USA 2 (2001) pp. 126-135
3. Shaffer, E., Garland, M.: Efficient Adaptive Simplification of Massive Meshes. In: Proceedings of IEEE Visualization 2001 (2001) pp. 127-134
4. Amores, J., Radeva, P.: Non-Rigid Registration of Vessel Structures in IVUS Images. Iberian Conference on Pattern Recognition and Image Analysis, Puerto de Andratx, Mallorca, Spain. Lecture Notes in Computer Science, Springer-Verlag 2652 (2003) pp. 45-52
5. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann (2001) pp. 130-140
6. Han, J., Cai, Y., Cercone, N.: Knowledge Discovery in Databases: An Attribute-Oriented Approach. Proceedings of the 18th VLDB Conference, Vancouver, British Columbia, Canada (1992) pp. 547-559
7. Han, J., Koperski, K., Stefanovic, N.: GeoMiner: A System Prototype for Spatial Data Mining. Proceedings of 1997 ACM-SIGMOD International Conference on Management of Data 26 2 (1997) pp. 553-556
8. Ester, M., Kriegel, H.-P., Sander, J.: Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification. In: Proc. 4th Intl. Symp. on Large Spatial Databases 951 (1995) pp. 67-82
9. Ester, M., Kriegel, H.-P., Sander, J.: Algorithms and Applications for Spatial Data Mining. Geographic Data Mining and Knowledge Discovery, London: Taylor and Francis (2001) pp. 160-187
10. Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Proc. of ACM SIGMOD 3rd International Conference on Knowledge Discovery and Data Mining. AAAI Press (1996) pp. 226-231
11. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest Neighbor Queries. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data, San Jose, CA (1995) pp. 71-79
12. Roy, N., Earnest, C.: Dynamic Action Space for Information Gain Maximization in Search and Exploration. In: Proc. of American Control Conference, Minneapolis 12 (2006) pp. 1631-1636
13. Park, S., Park, S.H., Ahn, C.M., Lee, Y.S., Lee, J.H.: Spatial Data Mining Query Language for SIMS. Spring Conference of The Korea Information Science Society 31 1 (2003) pp. 70-72

MTF Measurement Based on Interactive Live-Wire Edge Extraction

Peng Liu1,2,3, Dingsheng Liu1, and Fang Huang1,3
1 China Remote Sensing Satellite Ground Station, #45, Bei San Huan Xi Road, Beijing, China
2 Institute of Electronics, Chinese Academy of Sciences, #19, Bei Si Huan Xi Road, Beijing, China
3 Graduate University of Chinese Academy of Sciences
{pliu, dsliu, fhuang}@ne.rsgs.ac.cn

Abstract. When we want to measure parameters of the Modulation Transfer Function (MTF) directly from a remote sensing image, sharp edges are usually used as targets. But because of noise, blur, and the complexity of the images, fully automatically locating the expected edge is still an unsolved problem. This paper improves the semi-automatic edge extraction algorithm of live-wire [1] and introduces it into the knife-edge method [8] of MTF measurement in remote sensing images. Live-wire segmentation is a novel interactive algorithm for efficient, accurate, and reproducible boundary extraction that requires minimal user input with a mouse. The image is transformed into a weighted graph with a variety of restrictions. Edge searching is based on dynamic programming with Dijkstra's algorithm [5]. Optimal boundaries are computed and selected at interactive rates as the user moves the mouse starting from a manually specified seed point. In this paper, an improved live-wire model is proposed for measuring the on-orbit Modulation Transfer Function of high spatial resolution imaging satellites. We add a non-linear diffusion filter term to the local cost function to ensure the accuracy of the edge extraction. It de-noises without affecting the shape of the edges during extraction, so that the calculation of the MTF is more reasonable and precise.
Keywords: MTF measurement, interactive Live-wire edge extraction, sharp
edge.

1 Introduction
In order to measure the on-orbit MTF of remote sensing images, the knife-edge method makes use of special targets for evaluating the spatial response, since such targets stimulate the imaging system at all spatial frequencies [8]. The algorithm must determine edge locations with very high accuracy, as in Figure 1.1(a). The ESF (edge spread function) is then differentiated to obtain the LSF (line spread function), as in Figure 1.1(d). The LSF is then Fourier-transformed and normalized to obtain the corresponding MTF, see Figure 1.1(e).
So, as in Figure 1.2(a) and (b), we hope to be able to extract edges arbitrarily and, as in Figure 1.2(c), to acquire the perpendicular profiles easily. But due to the

complexity of the images, fully automatically locating the expected edge is still an unsolved problem. In particular, in most cases the edge in the image may not be straight or regular, so cutting a profile that is perpendicular to the edge is also very difficult. Considering the above reasons, we introduce live-wire [1] into MTF measurement and improve this method of edge detection. Live-wire is an active contour [2] model for efficient, accurate boundary extraction. Optimal boundaries are computed and selected at interactive rates as the user moves the mouse starting from a manually specified seed point. In this paper we enhance the performance of the live-wire model and make it better suited to MTF measurement in remote sensing images.

[Figure 1.1, panels (a)-(e)]

Fig. 1.1. The process from the sharp edges to the MTF

2 Live-Wire Model
The motivation behind the live-wire algorithm is to provide the user with full control over the edge extraction [2]. Initially, the user clicks to indicate a starting point on the desired contour, and then as the mouse is moved it pulls a live-wire behind it along the contour. When the user clicks again, the old wire freezes on the contour, and a new live one starts from the clicked point. The live-wire method poses this problem as a search for shortest paths on a weighted graph. Thus the live-wire algorithm has two main parts: first, the conversion of image information into a weighted graph, and then the calculation of shortest paths in the graph [6],[9]. This paper introduces the algorithm into MTF measurement for the first time and makes some useful improvements.
In the first part of the live-wire algorithm, weights are calculated for each edge in the weighted graph, creating the image forces that attract the live-wire. To produce the weights, various image features are computed in a neighborhood around each graph edge, and then combined in some user-adjustable fashion. The general purpose of the combination is edge localization, but individual features are generally chosen as in [9]. First, features such as the gradient and the Laplacian zero-crossing [3] are used for edge detection. Second, directionality, i.e. the direction of the path, should be taken into consideration locally. Third, training is the process by which the user indicates a preference for a certain type of boundary. To resist the effects of noise, this paper adds a non-linear diffusion filter term, which is described in the following sections.


In the second part of the live-wire algorithm, Dijkstra's algorithm [5] is used to find all shortest paths extending outward from the starting point in the weighted graph. The live-wire is defined as the shortest path that connects two user-selected points (the last clicked point and the current mouse location). This second step is done interactively, so the user may view and judge potential paths and control the extraction as finely as desired. This makes MTF measurement easier.
2.1 Weighted Map and Local Cost
The local costs are computed as a weighted sum of component functions, such as the Laplacian zero-crossing, the gradient magnitude and the gradient direction. Letting l(p, q) represent the local cost on the directed link from pixel p to a neighboring pixel q, the local cost function is

l(p, q) = wZ·fZ(q) + wD·fD(p, q) + wG·fG(q) + wdiv·fdiv(p)               (1)

In formula (1), fZ(q) is the Laplacian zero-crossing, fG(q) is the gradient magnitude, fD(p, q) is the gradient direction and fdiv(p) is the divergence of the unit gradient. wZ, wD, wG and wdiv are their weights, and each w is the weight of the corresponding feature function. The Laplacian zero-crossing is a binary edge feature used for edge localization [7]. The Laplacian image zero-crossings correspond to points of maximal (or minimal) gradient magnitude. Thus, Laplacian zero-crossings represent good edge properties and should therefore have a low local cost. If IL(q) is the Laplacian of an image I at the pixel q, then

fZ(q) = 0 if IL(q) = 0;   fZ(q) = 1 if IL(q) ≠ 0                          (2)

Since the Laplacian zero-crossing creates a binary feature, fZ(q) does not distinguish between strong, high gradient edges and weak, low gradient edges. However, gradient magnitude provides a direct correlation between edge strength and local cost. If Ix and Iy represent the partials of an image I in x and y respectively, then the gradient magnitude G is approximated with G = √(Ix² + Iy²). Thus, the gradient component function is

fG = [max(G − min(G)) − (G − min(G))] / max(G − min(G)) = 1 − (G − min(G)) / max(G − min(G))      (3)
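As a rough illustration of features (2) and (3), the sketch below is ours (not the authors' implementation); it assumes a grayscale image given as a floating-point NumPy array and uses the literal exact-zero form of (2).

```python
# Sketch of the binary Laplacian zero-crossing feature f_Z and the gradient magnitude
# feature f_G of equations (2) and (3); a grayscale float image is assumed.
import numpy as np
from scipy import ndimage

def live_wire_features(image):
    lap = ndimage.laplace(image)
    f_z = (lap != 0).astype(float)            # (2): 0 at exact zero-crossings, 1 elsewhere

    gx = ndimage.sobel(image, axis=1)         # partial derivatives I_x, I_y
    gy = ndimage.sobel(image, axis=0)
    g = np.hypot(gx, gy)                      # G = sqrt(I_x^2 + I_y^2)
    g_shift = g - g.min()
    f_g = 1.0 - g_shift / (g_shift.max() + 1e-12)   # (3): strong edges get low cost
    return f_z, f_g
```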

The gradient direction fD(p, q) adds a smoothness constraint to the boundary by associating a high cost with sharp changes in boundary direction. The gradient direction is the unit vector defined by Ix and Iy. Letting D(p) be the unit vector perpendicular to the gradient direction at the point p (so that D(p) = (Iy(p), −Ix(p))), the formulation of the gradient direction feature cost is:

fD(p, q) = (2/(3π)) { cos⁻¹[D(p)·L(p, q)] + cos⁻¹[L(p, q)·D(q)] }          (4)

D(p) = (Iy(p), −Ix(p)),   D(q) = (Iy(q), −Ix(q))                            (5)

L(p, q) = (1/‖p − q‖) · { (q − p)  if D(p)·(q − p) ≥ 0;   (p − q)  if D(p)·(q − p) < 0 }        (6)

fdiv(p) = ∂/∂x [ Ix / √(Ix² + Iy² + ε) ] + ∂/∂y [ Iy / √(Ix² + Iy² + ε) ]   (7)
Above, "·" in D(p)·L(p, q) of (4) denotes the vector dot product. Here, (5) and (6) define the bi-directional link, or edge vector, between pixels p and q. The link is either horizontal, vertical, or diagonal (relative to the position of q in p's neighborhood). The dot product of D(p) and L(p, q) is kept positive, as noted in [6]. The direction feature cost is low when the gradient directions of the two pixels are similar to each other and to the link between them.
Here, fdiv(p) is the divergence of the unit gradient vector at the point p, and wdiv is the weight of this term. Its function is to de-noise. This term is not present in the original model [6]; we add it in order to de-noise. In (7), ε is a small positive constant that prevents the denominator √(Ix² + Iy² + ε) from being zero. This term comes from the non-linear diffusion filter, first proposed in [4], which has been successful as a de-noising algorithm. fdiv(p) is sensitive to oscillations such as noise but does not penalize step edges. So, at an edge location fdiv(p) is small, and at a noise location fdiv(p) is large. The effect of fdiv(p) will be demonstrated later in the paper.
2.2 Searching for an Optimal Path
As mentioned, dynamic programming can be formulated as a directed graph search for an optimal path. This paper utilizes an optimal graph search similar to that presented by Dijkstra [5]. Further, this technique builds on and extends previous boundary tracking methods in four important ways, as in [6]; the difference in our method is that we add the non-linear diffusion filter to the weighted map so that the search can resist the effect of noise. All these characteristics make MTF measurement easier.
The live-wire 2-D dynamic programming (DP) graph search algorithm is as follows. Figure 2.2(a) is the initial local cost map with the seed point blacked. For simplicity of demonstration, the local costs in this example are pixel based rather than link based and can be thought of as representing the gradient magnitude cost feature. Figure 2.2(b) shows a portion of the cumulative cost and pointer map after expanding the seed point. Note that the diagonal local costs, which have been scaled by Euclidean distance, are not shown in this figure; we do compute them in our method, but for convenience we do not show them. This is demonstrated in Figure 2.2(c), where several points have now been expanded, including the seed point and the next lowest cumulative cost point on the active list. In fact, the Euclidean weighting between the seed and diagonal points makes them more costly than non-diagonal paths. Figures 2.2(d), (e), and (f) show the cumulative cost/direction pointer map at various stages of completion. Note how the algorithm produces a wave-front of active points emanating from the initial start point, which is called the seed point, and that the wave-front grows out faster where the costs are lower.

[Figure 2.2, panels (a)-(f)]

Fig. 2.2. (a)-(f): the process of optimal path searching from one seed point to another. The blacked points are the active points on the live paths.
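A compact sketch of the cumulative-cost expansion described above is given next. It is our illustration, not the authors' code; the graph is assumed to be given as a neighbor function and a dictionary of link costs l(p, q) from equation (1).

```python
# Minimal Dijkstra-style expansion from a seed pixel over precomputed link costs,
# as used by live-wire; link_cost[(p, q)] is assumed to hold l(p, q) of equation (1).
import heapq

def expand_from_seed(seed, neighbors, link_cost):
    """Return cumulative costs and back-pointers for every reachable pixel."""
    cum_cost = {seed: 0.0}
    back_ptr = {}
    active = [(0.0, seed)]                      # the "active list" (priority queue)
    while active:
        cost_p, p = heapq.heappop(active)
        if cost_p > cum_cost.get(p, float("inf")):
            continue                            # stale entry
        for q in neighbors(p):
            new_cost = cost_p + link_cost[(p, q)]
            if new_cost < cum_cost.get(q, float("inf")):
                cum_cost[q] = new_cost
                back_ptr[q] = p                 # direction pointer back toward the seed
                heapq.heappush(active, (new_cost, q))
    return cum_cost, back_ptr

def trace_path(back_ptr, free_point):
    """Follow the pointers from the current mouse position back to the seed."""
    path = [free_point]
    while path[-1] in back_ptr:
        path.append(back_ptr[path[-1]])
    return path[::-1]
```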

Here, the importance of the fdiv(p) term is obvious. In Figure 2.3, (a) shows edge extraction in a smooth, ideal remote sensing image. We can see that the result is satisfying: even though the edge is not straight or regular, the live-wire works well in (a). Next, (b) and (c) are the results of edge detection by the live-wire model without the fdiv(p) term, and they are badly affected by noise. In Figure 2.3(b) and (c) the edge extraction is not optimal for us; where the level of noise is high, the edge extraction is not very accurate. But (d) is the result of live-wire edge extraction with the fdiv(p) term, and we can see that result (d) is not affected by noise and the boundary is in the right position thanks to the fdiv(p) term. The fdiv(p) term is sensitive to noise but almost does not change at edges. So the non-linear diffusion filter operator fdiv(p) enhances the performance of the live-wire model considerably.


[Figure 2.3, panels (a)-(d)]

Fig. 2.3. (a) is the live-wire edge detection on a remote sensing image. (b) and (c) are live-wire edge detection without the fdiv(p) term, and (d) is with the fdiv(p) term.

3 MTF Measurement Based on Live-Wire


Up to now we have acquired the knife-edges, and we must resample the profiles perpendicular to the edges, as in [8]. On one edge we can acquire many profiles, as Figure 3.1(b) shows. In order to check the accuracy of the algorithm, we use a known PSF to convolve the image of Figure 3.1(a) and acquire the blurred image of Figure 3.1(b). The ideal MTF of the known PSF is shown in Figure 3.1(g). Firstly, we

[Figure 3.1, panels (a)-(g)]

Fig. 3.1. MTF measurement in the ideal situation

select the edge image of Figure 3.1(b). Then, we use live-wire to search for the optimal edge as the knife-edge and compute the profiles perpendicular to the edge, as in Figure 3.1(b). Furthermore, we use the minimum mean-square value [8] to compute the edge spread function, as in Figure 3.1(d). Since the image in Figure 3.1(b) is ideal and simple, the ESF should be very accurate. The ESF is then differentiated to obtain the LSF, as in Figure 3.1(e). Then the LSF is Fourier-transformed and normalized to obtain the corresponding MTF, Figure 3.1(f). The ideal MTF of (b) is Figure 3.1(g). Comparing Figure 3.1(f) with Figure 3.1(g), we can see that the live-wire works very well and the result is very precise. There is only a little error in the MTF measurement, and it lies beyond the cut-off frequency: the amplitudes at frequencies from 0 to 20 in (f) and (g) are almost the same, and the error appears only from 20 to 35. These results verify the accuracy of the live-wire method.
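The ESF-to-MTF chain just described can be sketched in a few lines. This is our illustration; a uniformly sampled ESF array is assumed.

```python
# From an edge spread function (ESF) sampled along the profile direction to the MTF:
# differentiate to get the LSF, Fourier-transform, and normalize to the zero frequency.
import numpy as np

def esf_to_mtf(esf, sample_spacing=1.0):
    lsf = np.gradient(esf, sample_spacing)          # LSF = d(ESF)/dx
    spectrum = np.abs(np.fft.rfft(lsf))             # magnitude of the Fourier transform
    mtf = spectrum / spectrum[0]                    # normalize so MTF(0) = 1
    freqs = np.fft.rfftfreq(len(lsf), d=sample_spacing)
    return freqs, mtf
```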
Figure 3.2(a) and (b) show the same remote sensing target image, and (c) and (d) are the MTF results of different methods. In Figure 3.2(c) and (d), the blue solid line is the MTF measured by the method in [8]. In Figure 3.2(c) the red solid line is the MTF measured by our algorithm, while in Figure 3.2(d) the red solid line is the MTF measured by the live-wire algorithm without the improvement. The target image is irregular, but thanks to the accurate edge extraction and more correct profile cutting, the result given by the red solid line based on our improvement is obviously more precise, see Figure 3.2(c). Furthermore, in Figure 3.2(d) the red solid line is the MTF measured by the original live-wire model without the de-noising term fdiv(p). We can see that, because of the noise, the edges are not very accurate, which affects the measurement of the MTF in (d), and the error is obvious. This illustrates that the unimproved live-wire algorithm is not well suited to MTF measurement.

[Figure 3.2, panels (a)-(d)]

Fig. 3.2. MTF measurement based on different methods

[Figure 3.3, panels (a)-(d)]

Fig. 3.3. MTF measurement for a variety of remote sensing images

Figure 3.3 shows MTF measurement using different images. In these figures we can see that, even if the context of the image is complicated and the edge is not very straight, we can still extract the edge accurately, as in Figure 3.3(a). This success is due to the advantages of the live-wire algorithm and the non-linear diffusion filter term added to the live-wire model. Finally, we measure the MTF of the image in Figure 3.3(c), which comes from Google Earth. Live-wire can snap to the edge easily and precisely in this complex image context. The profile perpendicular to the edge is cut and the LSF is computed. Figure 3.3(d) is the MTF of Figure 3.3(c). Because the quality of the image is good, the resulting MTF is relatively ideal.

4 Conclusions
In this paper we propose an improved edge detection model based on the live-wire algorithm to measure the MTF of remote sensing images. A non-linear filter term is added, and these improvements make the model more suitable for MTF measurement. It greatly enhances the performance of the knife-edge method of MTF measurement and makes the measurement more convenient and flexible: we no longer need a straight edge when we want to measure the MTF of the sensors directly. Furthermore, the influence of noise is restrained after the non-linear diffusion filter term is added to the weight map. The profiles that are perpendicular to the edge can be computed simultaneously and accurately. All these advantages help us measure the MTF of the image more accurately in very complicated image contexts. Future work will focus on making use of the improved MTF parameters in the de-convolution of remote sensing images.

References
1. W. A. Barrett and E. N. Mortensen. Interactive live-wire boundary extraction. Medical Image Analysis, 1(4):331-341, 1997.
2. M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active Contour Models. In Proc. of the First Int. Conf. on Computer Vision, London, England, pp. 259-268, June 1987.
3. E. N. Mortensen and W. A. Barrett. Intelligent Scissors for Image Composition. Computer Graphics (SIGGRAPH 95), 191-198, 1995.
4. P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. PAMI 12(7), pp. 629-639, 1990.
5. E. W. Dijkstra. A Note on Two Problems in Connexion with Graphs. Numerische Mathematik, 269-271, 1959.
6. A. X. Falcao, J. K. Udupa, and F. K. Miyazawa. An Ultra-Fast User-Steered Image Segmentation Paradigm: Live-wire on the Fly. IEEE Transactions on Medical Imaging, 19(1):55-61, 2000.
7. A. X. Falcao, J. K. Udupa, S. Samarasekera, and S. Sharma. User-Steered Image Segmentation Paradigms: Live-wire and Live Lane. Graphical Models and Image Processing, 60:233-260, 1998.
8. Taeyoung Choi. IKONOS Satellite on Orbit Modulation Transfer Function (MTF) Measurement using Edge and Pulse Method. Thesis, South Dakota State University, 2002.
9. E. N. Mortensen and W. A. Barrett. Interactive Segmentation with Intelligent Scissors. Graphical Models and Image Processing, 60(5):349-384, 1998.

Research on Technologies of Spatial Configuration Information Retrieval

Haibin Sun
College of Information Science and Engineering,
Shandong University of Science and Technology,
Qingdao 266510, China
Offer sun@hotmail.com

Abstract. The problem of spatial configuration information retrieval is a Constraint Satisfaction Problem (CSP), which can be solved using traditional CSP algorithms. But the spatial data can be reorganized using index techniques like the R-tree, and the spatial data are approximated by their Minimum Bounding Rectangles (MBRs), so spatial configuration information retrieval is actually based on the MBRs and some special techniques can be studied. This paper studies the mapping relationships among the spatial relations for real spatial objects, the corresponding spatial relations for their MBRs, and the corresponding spatial relations between the intermediate nodes and the MBRs in the R-tree.

1 Introduction

Spatial configuration retrieval is an important research topic in content-based image retrieval in Geographic Information Systems (GIS), computer vision, and VLSI design, etc. A user of a GIS system usually searches for configurations of spatial objects on a map that match some ideal configuration or are bound by a number of constraints. For example, a user may be looking for a place to build a house. He wishes to have a house A north of the town where he works, at a distance no greater than 10 km from his child's school B and next to a park C. Moreover, he would like to have a supermarket D on his way to work. Under some circumstances, the query conditions cannot be fully satisfied at all; the users may then need only several optional answers according to the degree of configuration similarity. For the configuration similarity query problem, representation strategies and search algorithms have been studied in several papers [1,2,3].
In the real world, spatial data often have complex geometric shapes. It would be very costly to calculate the spatial relationships between them directly, and much time may be wasted. If N is the number of spatial objects and n the number of query variables, the total number of possible solutions is equal to the number of n-permutations of the N objects: N!/(N − n)!. Using Minimum Bounding Rectangles (MBRs) to approximate the geometric shapes of spatial objects and calculating the relations between rectangles reduces the calculation greatly. So we can divide the spatial configuration retrieval into two steps: firstly, the rectangle combinations for which it is impossible to satisfy the query

conditions will be eliminated, and then the real spatial objects corresponding to the remaining rectangle combinations will be checked using computational geometry techniques. To improve the retrieval efficiency, the index data structure called the R-tree [4] or its variants R+-tree [5] and R*-tree [6] can be adopted.
The next section takes topological and directional relations as examples to study the mapping relationships between the spatial relationships for MBRs and the corresponding relationships for real spatial objects; the last section concludes this paper.

2 Spatial Mapping Relationships

This paper mainly concerns the topological and directional relations for MBRs
and the corresponding spatial relationships for real spatial objects. The ideas
in this paper can be applied to other relationships such as distance and spatiotemporal relations, etc.
2.1 Topological Mapping Relationships

This paper focuses on the RCC8 relations [7] (see Fig. 1) and studies the mapping relationship between the RCC8 relations for real spatial objects and the RCC8 relations for the corresponding MBRs. Let p and q be two real spatial objects, and p′ and q′ be their corresponding MBRs. If the spatial relation between p and q is PO (Partly Overlap), then the possible spatial relation between p′ and q′ is PO (Partly Overlap) or TPP (Tangential Proper Part) or NTPP (Non-Tangential Proper Part) or EQ (Equal) or TPPi (inverse of Tangential Proper Part) or NTPPi (inverse of Non-Tangential Proper Part), which can be denoted by the

[Figure 1: two-dimensional example configurations of regions x and y for x DC y, x EC y, x PO y, x TPP y, x TPPi y, x EQ y, x NTPP y and x NTPPi y]

Fig. 1. Two-dimensional examples for the eight basic relations of RCC8


disjunction form PO(p′, q′) ∨ TPP(p′, q′) ∨ NTPP(p′, q′) ∨ EQ(p′, q′) ∨ TPPi(p′, q′) ∨ NTPPi(p′, q′). To use the R-tree to improve the efficiency of the spatial configuration retrieval, the topological relations in the query condition should first be transformed to the corresponding topological relations for the MBRs, which can be used to eliminate the rectangle combinations that cannot fulfill the constraints from the leaf nodes in the R-tree. The intermediate nodes in the R-tree can also be used to speed up the retrieval process. Let p′′ be the rectangle that encloses p′, i.e. the parent node of leaf node p′ in the R-tree, which is called an intermediate node. Given the spatial relation between p′ and q′, the spatial relation between p′′ and q′ can be derived. For example, from the spatial relation TPP(p′, q′), the spatial relation PO(p′′, q′) ∨ TPP(p′′, q′) ∨ EQ(p′′, q′) ∨ TPPi(p′′, q′) ∨ NTPPi(p′′, q′) can be obtained. It is very interesting that the parents of the intermediate nodes also have the same property. Table 1 presents the spatial relations between two real spatial objects, the possible spatial relations that their MBRs satisfy, and the possible spatial relations between the corresponding intermediate node and the MBR.
Table 1. The spatial relations between two real spatial objects, the possible spatial
relations that their MBRs satisfy and the possible spatial relations between the corresponding intermediate node and the MBR
RCC8
relation RCC8 relation between
between p and q MBRs p and q
DC(p,q)
DC(p,q) EC(p,q)
PO(p,q) TPP(p,q)

NTPP(p,q)

EQ(p,q) TPPi(p,q)
NTPPi(p,q)
EC(p,q)
EC(p,q) PO(p,q)

TPP(p,q)

NTPP(p,q) EQ(p,q)

TPPi(p,q)

NTPPi(p,q)
PO(p,q)
PO(p, q) TPP(p,
q) NTPP(p,q)
EQ(p,q) TPPi(p, q)
NTPPi(p, q)
TPP (p,q)
TPP(p, q) NTPP(p,
q) EQ(p, q)
NTPP (p,q)

TPPi (p,q)
NTPPi (p,q)
EQ(p,q)

RCC8 relation between p and q


PO(p,q)
TPP(p,q)

NTPP(p,q)

EQ(p,q)
TPPi(p,q)
NTPPi(p,q)
EC(p,q) DC(p,q)
EC(p,q)

TPP(p,q)
EQ(p,q)
NTPPi(p,q)

PO(p,q)
NTPP(p,q)
TPPi(p,q)

PO(p, q) TPP(p, q)
NTPP(p, q) EQ(p, q)
TPPi(p, q) NTPPi(p, q)

PO(p, q) TPP(p,
NTPP(p, q) EQ(p,
TPPi(p, q) NTPPi(p,
NTPP(p, q)
PO(p, q) TPP(p,
NTPP(p, q) EQ(p,
TPPi(p, q) NTPPi(p,
EQ(p, q) TPPi(p, q) EQ(p, q) TPPi(p,
NTPPi(p, q)
NTPPi(p, q)
NTPPi(p, q)
NTPPi(p, q)
EQ(p, q)
EQ(p, q) TPPi(p,
NTPPi(p, q)

q)
q)
q)
q)
q)
q)
q)

q)


Based on the above mapping relationships and the R-tree, the candidate MBR combinations can be retrieved efficiently, and then a refinement step is needed to derive the spatial relations among the real spatial objects that the MBRs enclose, which means that the spatial relation between p and q should be derived from the spatial relation between p′ and q′. From the spatial relation between two MBRs, we can derive several possible spatial relations or only one definite spatial relation between the two real spatial objects that the MBRs enclose. In the former case the complex geometry computation must be applied, whereas it can be omitted in the latter case. For example, given the spatial relation NTPPi(p′, q′), we can derive DC(p, q) ∨ EC(p, q) ∨ PO(p, q) ∨ NTPPi(p, q) ∨ TPPi(p, q), so the geometry computation must be adopted to ascertain the spatial relation between p and q. But if we know the spatial relation DC(p′, q′), then the spatial relation DC(p, q) can be derived directly.
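A sketch of this filter-and-refine idea is given below. It is our illustration: the mapping dictionary only encodes the DC shortcut spelled out in the text, and exact_rcc8 stands in for a computational geometry routine.

```python
# Filter-and-refine over candidate MBR pairs. Only the DC shortcut from the text is
# encoded as definite; every other MBR relation falls back to exact geometry.
DEFINITE = {"DC": "DC"}     # DC between MBRs implies DC between the real objects

def refine(mbr_relation, obj_p, obj_q, exact_rcc8):
    """Return the RCC8 relation between the real objects p and q."""
    if mbr_relation in DEFINITE:
        return DEFINITE[mbr_relation]          # no geometry computation needed
    return exact_rcc8(obj_p, obj_q)            # ambiguous: compute on the real geometry
```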
2.2 Direction Mapping Relationships

According to Goyal and Egenhofer's cardinal direction model [?], there are 9 atomic cardinal direction relations (O, S, SW, W, NW, N, NE, E, SE) (see Fig. 2) and in total 218 cardinal direction relations for non-empty connected regions in the Euclidean space ℝ² (illustrated by a 3×3 matrix, see Fig. 3) [8].
There are 36 cardinal direction relations for the non-empty and connected regions' MBRs: O, S, SW, W, NW, N, NE, E, SE, S:SW, O:W, NW:N, N:NE, O:E, S:SE, SW:W, O:S, E:SE, W:NW, O:N, NE:E, S:SW:SE, NW:N:NE, O:W:E, O:S:N, SW:W:NW, NE:E:SE, O:S:SW:W, O:W:NW:N, O:S:E:SE, O:N:NE:E, O:S:SW:W:NW:N, O:S:N:NE:E:SE, O:S:SW:W:E:SE, O:W:NW:N:NE:E, O:S:SW:W:NW:N:NE:E:SE (see Fig. 4). This kind of cardinal

[Figure 2: the nine projection-based partitions NW_A, N_A, NE_A, W_A, O_A, E_A, SW_A, S_A, SE_A around a reference region A, with a second region B overlapping some of them]

Fig. 2. Capturing the cardinal direction relation between two polygons, A and B, through the projection-based partitions around A as the reference object


Fig. 3. 218 cardinal direction relations between two non-empty and connected regions [8]

direction relation has a rectangle shape, so it is also named a rectangle direction relation; otherwise it is called a non-rectangle direction relation.
In the following, we study the mapping relationships between the cardinal direction relations for real spatial objects and the cardinal direction relations for the corresponding MBRs. First of all, we give a definition as follows.
Definition 1. A cardinal direction relation R contains another cardinal direction relation R′ if all the atomic relations in R′ also exist in R.
The mapping relationships from the cardinal direction relations for real spatial objects to the ones for their MBRs can be described using the following theorems.
Theorem 1. If the cardinal direction relation between the real spatial objects p and q is a rectangle direction relation R (see Fig. 4), the cardinal direction relation between their MBRs p′ and q′ is also R; if the cardinal direction relation between the real spatial objects p and q is a non-rectangle direction relation R, the cardinal direction relation between their MBRs p′ and q′ is the rectangle direction relation R′ in Fig. 4 which contains relation R and has the minimum area.
Theorem 1 can be derived by combining Fig. 3 and Fig. 4. Assume that the cardinal direction relation between two real spatial objects p and q is N:NW:W, which obviously is not a rectangle direction relation; from Fig. 4 the rectangle direction relation that contains N:NW:W and has the minimum rectangle area is O:W:NW:N, so the cardinal direction relation between the two MBRs p′ and q′ is O:W:NW:N.
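Theorem 1's construction can be sketched as follows (our illustration, not the paper's code): place the atomic tiles on the 3×3 grid, take their bounding block of rows and columns, and return every tile inside it.

```python
# Minimal rectangle direction relation containing a given cardinal direction relation,
# per Theorem 1: the smallest block of 3x3 tiles covering all atomic tiles of R.
GRID = {"NW": (0, 0), "N": (0, 1), "NE": (0, 2),
        "W":  (1, 0), "O": (1, 1), "E":  (1, 2),
        "SW": (2, 0), "S": (2, 1), "SE": (2, 2)}
TILE = {pos: name for name, pos in GRID.items()}

def min_rectangle_relation(relation):
    """relation is e.g. 'N:NW:W'; returns the tiles of the containing rectangle relation."""
    rows, cols = zip(*(GRID[t] for t in relation.split(":")))
    return {TILE[(r, c)]
            for r in range(min(rows), max(rows) + 1)
            for c in range(min(cols), max(cols) + 1)}

print(min_rectangle_relation("N:NW:W") == {"O", "W", "NW", "N"})   # True, as in the text
```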


[Figure 4: the 36 rectangle cardinal direction relations for MBRs, from the single tiles O, S, SW, W, NW, N, NE, E, SE up to O:S:SW:W:NW:N:NE:E:SE]

Fig. 4. 36 cardinal direction relations for MBRs

Similarly, the mapping relationships from the cardinal direction relations for MBRs to the ones for the possible real spatial objects can be described as follows.
Theorem 2. If the cardinal direction relation R between two MBRs p′ and q′ contains no more than 3 atomic cardinal direction relations (including 3), the corresponding cardinal direction relation between the real spatial objects p and q is also R; otherwise, the possible cardinal direction relations between p and q are the subsets of relation R which can be transformed to relation R when p and q are approximated by p′ and q′.
For example, if the cardinal direction relation between two MBRs is S:SW:SE (including three atomic relations: S, SW, SE), then the cardinal direction relation between the corresponding two real spatial objects is definitely S:SW:SE. If the cardinal direction relation between two MBRs is O:S:SW:W, the possible cardinal direction relations between the two real spatial objects include O:W:SW, W:O:S, SW:S:O, SW:S:W and O:S:SW:W.
Given the cardinal direction relation between the MBRs p′ and q′, the cardinal direction relation between p′′, which is the parent node of p′ in the R-tree, and q′ can be described using the following theorem.
Theorem 3. If the cardinal direction relation between MBRs p′ and q′ is R, the possible cardinal direction relations between p′′ and q′ are the rectangle direction relations containing R.


For example, if the cardinal direction relation between p′ and q′ is O:S:SW:W, the possible cardinal direction relations between p′′ and q′ will be O:S:SW:W, O:S:SW:W:NW:N, O:S:SW:W:E:SE and O:S:SW:W:NW:N:NE:E:SE.

3 Conclusions

This paper has studied the spatial configuration information retrieval problem, which includes the mapping relationships among the spatial relations (topological and directional relations) for real spatial objects, the corresponding spatial relations for their MBRs, and the corresponding spatial relations between intermediate nodes and the MBRs in the R-tree. Based on these results, search algorithms can be designed to solve spatial configuration retrieval problems. The research work of this paper is valuable for information retrieval systems related to spatial data.

References
1. Bergman, L., Castelli, V., Li, C.-S.: Progressive Content-Based Retrieval from Satellite Image Archives. D-Lib Magazine, October 1997. http://www.dlib.org/dlib/october97/ibm/10li.html
2. Gupta, A., Jain, R.: Visual Information Retrieval. Communications of the ACM, May 1997, 40(5): 70-79.
3. Orenstein, J. A.: Spatial Query Processing in an Object-Oriented Database System. In: Proc. of the 1986 ACM SIGMOD International Conference on Management of Data, 1986, pages 326-336.
4. Guttman, A.: R-trees: A Dynamic Index Structure for Spatial Searching. In: Proc. of ACM SIGMOD, 1984, pages 47-57.
5. Sellis, T. K., Roussopoulos, N., Faloutsos, C.: The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. In: Proceedings of the 13th International Conference on Very Large Data Bases, September 1-4, 1987, Brighton, England, pages 507-518.
6. Beckmann, N., Kriegel, H.-P., Schneider, R., Seeger, B.: The R*-tree: An Efficient and Robust Access Method for Points and Rectangles. In: Proceedings of the ACM SIGMOD, 1990, pages 322-331.
7. Randell, D. A., Cui, Z., Cohn, A. G.: A Spatial Logic Based on Regions and Connection. In: Proc. 3rd Int. Conf. on Knowledge Representation and Reasoning, Morgan Kaufmann, San Mateo, 1992, pages 165-176.
8. Cicerone, S., Di Felice, P.: Cardinal directions between spatial objects: the pairwise-consistency problem. Information Sciences, 2004, 164(1-4): 165-188.

Modelbase System in Remote Sensing Information Analysis and Service Grid Node

Yong Xue1,2, Lei Zheng1,3, Ying Luo1,3, Jianping Guo1,3, Wei Wan1,3, Wei Wei1,3, and Ying Wang1,3
1 State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing Applications of Chinese Academy of Sciences and Beijing Normal University, Institute of Remote Sensing Applications, Chinese Academy of Sciences, P.O. Box 9718, Beijing 100101, China
2 Department of Computing, London Metropolitan University, 166-220 Holloway Road, London N7 8DB, UK
3 Graduate School, Chinese Academy of Sciences, Beijing 100049, China
y.xue@londonmet.ac.uk

Abstract. In this article we describe a modelbase system used in the Remote Sensing Information Analysis and Service Grid Node (RISN) at the Institute of Remote Sensing Applications (IRSA), Chinese Academy of Sciences (CAS). The goal of the Node is to make good use of physically distributed resources in the field of remote sensing science, such as data, models and algorithms, and computing resources left unused on the Internet. With the modelbase system, we can organize and manage models better and make full use of them. With this system we can easily update both local and remote models, and we can also add remote modelbases into our modelbase management system. At the same time, we provide interfaces to access and run models from our node. To implement it, we use Oracle to organize and manage models, and use the Java language to connect with the Oracle database and run models on the Condor platform.
Keywords: Modelbase system, Remote sensing information analysis and
Service Grid Node, modelbase management system.

1 Introduction
Research on model management theory began in the 1980s. In 1980, Blanning (1980) first introduced the notion of the modelbase and designed a model query language (MQL), analogous to a database query language, to manage models. Geoffrion (1987) designed the structured modeling language (SML), which introduced structured program design into model building. Muhanna et al. (1988) introduced systems theory into modelbase management systems. Wesseling et al. (1996) designed a dynamic modeling language to support special data structures.
Modelbases can be divided into two kinds according to their models: graph modelbases, whose models are graphs, and arithmetic modelbases, whose models are arithmetic or programs. Kuntal et al. (1995) organized large structural modelbases full of graph models, which gave a good example of a graph modelbase, but an arithmetic modelbase differs according to its application. Liu et al. built a simulation supporting system for a cold rolling process control system based on a modelbase. Li et al. (2002) built water environment models and put them in a modelbase. So we present an arithmetic modelbase that is used in remote sensing applications and can also connect to a Grid environment (the Condor platform) through the RISN at IRSA, CAS, China.
The node is a special node of the Spatial Information Grid (SIG) in China. The node will be introduced in Section 2. The modelbase system and the function of the modelbase in the node will be demonstrated in Section 3. Finally, the conclusion and further development will be addressed in Section 4.

2 Remote Sensing Information Analysis and Service Grid Node


Remotely sensed data is one of the most important spatial information sources, so research on the architectures and technical supports of the RISN is a significant part of the research on SIG. The aim of the Node is to integrate data, traditional algorithms and software, and distributed computing resources, to provide one-stop service to everyone on the Internet, and to make good use of everything pre-existing. The Node is extendable: it may contain many personal computers, supercomputers or other nodes, and it can also be as small as just one personal computer. There are two entries to the Node: 1) a portal implemented by JSP, which users can visit through Internet browsers such as Internet Explorer; 2) a portal implemented by web service and workflow technology, which is special for SIG, so that other Grid resources and Grid systems can be integrated with our node through this portal. The node provides application services such as aerosol optical depth retrieval, land surface temperature retrieval, soil moisture retrieval, surface reflectance retrieval, and vegetation index applications from MODIS data. To manage models, we use a modelbase system in the node. Figure 1 shows the structure of our modelbase system.

3 Modelbase System in Remote Sensing Information Analysis and Service Grid Node
The modelbase system has two main parts: the modelbase file system and the modelbase management system.
3.1 Modelbase File System
The modelbase file system is the place where models are stored. In the RISN, models are stored on several computers, remote computers and remote modelbases, but all models are treated as if they were virtually stored on one computer with the modelbase management system. We achieve a distributed management system with this architecture. The system has another benefit: we can freely add models, either separately or in a remote modelbase, without downloading the models to our computers. In our RISN, we provide the Condor platform to run models on the Grid system, so the architecture conforms to the idea of sharing resources in Grid computing.


Fig. 1. Information registry of models in remote modelbase

3.2 Modelbase Management System


The modelbase management system contains the registry information of the models, the interface with the Condor platform and the interface with the portal.
3.2.1 Registry of Models
We import the notion of the meta-module (Xue et al. 2006) and extend it in our system. In our system, every model is a meta-module, and its information is stored in the table of the modelbase.
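A hypothetical sketch of what such a meta-module registry record might look like is given below; the field names are our invention for illustration, since the paper does not list the actual table columns.

```python
# Hypothetical meta-module registry record; the field names are our invention for
# illustration - the paper does not list the actual columns of the modelbase table.
from dataclasses import dataclass, field

@dataclass
class MetaModule:
    name: str                          # e.g. "aerosol_optical_depth_retrieval"
    location: str                      # local path or URL of the executable / remote modelbase
    input_parameters: list = field(default_factory=list)   # parameters the portal asks for
    output_description: str = ""
    is_remote: bool = False            # stored in a remote modelbase registered with the system

registry = [
    MetaModule("land_surface_temperature", "http://example.org/models/lst",
               ["modis_file"], "LST grid", True),
]
```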
3.2.2 Updating in Remote Modelbase
In our design, we can conveniently add a remote modelbase without downloading its models to our computer. But in order to add the models on a remote computer into our modelbase system, the information of the models needs to be registered, and a method needs to be provided to fetch the models if you want your models to be runnable by others. If you register your modelbase information in our modelbase system, we will provide an interface to connect to your modelbase and make it part of our modelbase.
3.2.3 Models Access and Running on a Grid Computing Platform
As our modelbase system is one part of the RISN, we provide an interface to connect with our portal and workflow.
After selecting the model you want to run, you can run it by pressing the submit button on the portal, or download the model to your own computer according to the information of the model. If you choose to run the model on our Remote Sensing Information Analysis and Service Grid Node, we provide some models that run on the Condor platform, and you need to input the parameters as required.

4 Conclusion and Future Work


Grid technology is a very effective method for remotely sensed data processing services. Through it, the existing remotely sensed models and algorithm resources can be shared like common services in a Grid environment. In this article we describe a modelbase system used in the Remote Sensing Information Analysis and Service Grid Node which can be used on the Condor platform. With the modelbase system, we can organize and manage models better, make full use of the models, and run models on the Condor platform. Currently, models can be added by any authorized user; in the future, models will be added to the modelbase only after being checked by an administrator.

Acknowledgment
This publication is an output from the research projects "Multi-scale Aerosol Optical Thickness Quantitative Retrieval from Remotely Sensing Data at Urban Area" (40671142) and "Grid platform based aerosol monitoring modeling using MODIS data and middlewares development" (40471091) funded by NSFC, China, and the 863 Program "Remote Sensing Information Processing and Service Node" funded by the MOST, China.

References
1. Blanning, R.W.: How Managers Decide to Use Planning Models. Long Range Planning, 13 (1980) 32-35
2. Geoffrion, A.: An Introduction to Structured Modeling. Management Science, 33:5 (1987) 547-588
3. Muhanna, W.A., Roger, A.P.: A System Framework for Model Software Management. TIMS/ORSA Meeting, Washington, DC, 4 (1988)

Density Based Fuzzy Membership Functions in the Context of Geocomputation

Victor Lobo1,2, Fernando Bação1, and Miguel Loureiro1
1 ISEGI - UNL
2 Portuguese Naval Academy
{vlobo, bacao, mloureiro}@isegi.unl.pt

Abstract. Geocomputation has a long tradition of dealing with fuzziness in different contexts, most notably in the challenges created by the representation of geographic space in digital form. Geocomputation tools should be able to address the eminently continuous nature of geo-phenomena and its accompanying fuzziness. Fuzzy Set Theory allows partial memberships of entities to concepts with non-crisp boundaries. In general, the application of fuzzy methods is distance-based and for that reason is insensitive to changes in density. In this paper a new method for defining density-based fuzzy membership functions is proposed. The method automatically determines fuzzy membership coefficients based on the distribution density of the data. The density estimation is done using a Self-Organizing Map (SOM). The proposed method can be used to accurately describe clusters of data which are not well characterized using distance methods. We show the advantage of the proposed method over traditional distance-based membership functions.
Keywords: fuzzy membership, fuzzy set theory, density based clustering,
SOM.

1 Introduction
One of the most challenging tasks in geocomputation has been the need to provide an adequate digital representation for continuous phenomena such as those typically captured in geographic information. The need to define crisp boundaries between objects in geographic space leads to data representations that, while apparently providing a rigorous description, in reality have serious limitations as far as fuzziness and accuracy are concerned. There is an inherent inexactness built into spatial, temporal and spatio-temporal databases, largely due to the artificial discretization of what are most often continuous phenomena [1]. The subtleties that characterize space and time changes in geo-phenomena constitute a problem, as they carry large levels of fuzziness and uncertainty. While fuzziness might be characterized as inherent imprecision which affects indistinct boundaries between geographical features, uncertainty is related to the lack of information [1]. Note that these characteristics are not limited to the spatial representation but also include categorization and attribute data. All these facts lead to the eminently fuzzy nature of

the data used in geocomputation, or, as [2] puts it, "uncertainty is endemic in geographic data". Rather than ignoring these problems and dismissing them as irrelevant, the geocomputation field should be able to devise ways of dealing with them. In many instances this will translate into attributing uncertainty levels to the representations. This way the user will be aware of the limitations involved in the use of the data, and thus be able to intuitively attribute some level of reliability.
Fuzzy Set Theory constitutes a valuable framework for dealing with these problems when reasoning and modeling in geocomputation [2]. Fuzzy Set Theory allows partial memberships of entities to concepts with non-crisp boundaries. The fundamental idea is that while it may not be possible to assign a particular pattern to a specific class, it is possible to define a membership value. In general, the application of automatic fuzzy methods is distance-based. Thus, the (geographic or attribute) distance between a prototype and a pattern defines the membership value of the pattern to the set defined by the prototype. This approach is not only intuitive but also adequate for many different applications. Nevertheless, there is yet another way of approaching the problem, which trades distance for density. In this case, it is not the distance of the pattern to the prototype that governs the membership value, but the pattern density variations between them. This perspective emphasizes discontinuous zones, treating them as potential boundaries. Membership will be a function of the changes in density along the path between the pattern and the prototype. Thus, if the density is constant then the membership value will be high. There are several classical examples in clustering where the relevance of density is quite obvious [3]. In this paper a new method for defining density-based fuzzy membership functions is proposed. The method automatically determines fuzzy membership coefficients based on the distribution density of the data. The density estimation is done using a Self-Organizing Map (SOM). The proposed method can be used to accurately describe data which are not well characterized using distance methods. We show the advantage of the proposed method over traditional distance-based membership functions.

2 Problem Statement
Fuzzy Set Theory was introduced in 1965 by Zadeh [4]. A fuzzy set may be regarded as a set with non-crisp boundaries. This approach provides a tool for representing vague concepts, by allowing partial memberships of entities to concepts. In the context of data analysis, entities are usually represented by data patterns, and concepts are also represented by data patterns that are used as prototypes. Fuzzy membership may be defined using the following formalism. Given:
a set of n input patterns X = {x1, …, xi, …, xn}, where xi = (xi1, …, xij, …, xid)^T ∈ ℝ^d, and each measure xij is a feature or attribute of pattern xi,
a set of k concepts defined by prototypes C = {c1, …, cc, …, ck}, where cc = (cc1, …, ccj, …, ccd)^T ∈ ℝ^d, with k < n,
the fuzzy membership uic of pattern xi to a prototype cc is defined so that:

uic ∈ [0,1],  i = 1, …, n and c = 1, …, k                                 (1)


There are many ways of defining the memberships u_{ic}. These may be divided into two major groups, e.g. [5, 6]:
1- The probabilistic methods, where the sum of memberships of a pattern to all concepts has to add to 1:

\sum_{c=1}^{k} u_{ic} = 1, \quad i = 1, \ldots, n    (2)

Some authors state that these memberships can be interpreted in a relative perspective, e.g. [5], since the membership of a pattern to a given concept depends on the membership of that pattern to all other concepts.
2- The possibilistic methods, where it is only required that:

u_{ic} \geq 0, \quad i = 1, \ldots, n    (3)

While still maintaining the need to satisfy (1), the possibilistic approach relaxes the
constraint imposed by (2). This perspective may be interpreted as an absolute way of
determining the membership of a pattern to a concept, since its computation does not
depend on the membership of that pattern to other concepts.
Different methods of determining fuzzy membership coefficients have been
proposed e.g. [7]. Most of these methods are based on the distance between patterns
and prototypes. However, in some cases distance based methods may not achieve the
best results. A classical example of this is given by Ultsch [3] in the form of two
rings, in which the membership of a data pattern to one of the rings cannot be defined as a distance to any single point.
In this paper a new method for determining fuzzy membership coefficients based
on variations of data pattern density is proposed. In order to compute the density
distribution of the data X in the ℜ^d input space, the use of a Self-Organizing Map (SOM)
is proposed.

3 Density Estimation Using a Self-Organizing Map


A Self-Organizing Map (SOM) is an unsupervised neural network. It was introduced
in 1982 by Kohonen [8], and has been used as a visualization tool for high-dimensional
data, as well as in many different tasks such as clustering, dimension reduction,
classification, sampling, vector quantization and data mining [9].
The basic idea of a SOM is to map a set of data patterns onto a (usually)
2-dimensional grid of neurons or units. That grid forms what is known as the output
space, as opposed to the input space that is the original space where the data patterns
are. When a SOM is trained with a given dataset, its units will tend to spread
themselves in the input space in a way that is proportional to some function of the
density of the data patterns [10]. This means that where data density is high, the SOM
units are close to each other in the input space. In places where data density is low,
SOM units, even if neighbors in the grid (output space), will be far apart from each
other in the input space.
The SOM may be regarded as a graph [11], where the SOM units are the nodes,
and edges connect the units that are neighbors in the output space, i.e., the edges form


the regular grid of the SOM. We may associate to each of these edges a value that
corresponds to the distance, in the input space, between its end nodes. Given two
neighboring SOM units wa and wb, the value associated with the edge E that joins
them is:
E(w_a, w_b) = \| w_a - w_b \|    (4)

In this paper, this graph is named a U-Graph. This U-graph is usually the first step
in the computation of a U-Matrix [12], which is widely used in cluster analysis [13].
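To make this construction concrete, the following sketch (hypothetical code, not the authors' implementation) builds the U-Graph edge values of eq. (4) from a p×q grid of SOM codebook vectors, connecting units that are 4-neighbours in the output space.

import numpy as np

def ugraph_edges(weights):
    """Build U-Graph edges from a SOM codebook of shape (p, q, d).

    Each edge connects two units that are neighbours in the output grid
    (4-neighbourhood) and carries the Euclidean distance between their
    codebook vectors in the input space, as in eq. (4).
    Returns a dict mapping ((row_a, col_a), (row_b, col_b)) -> distance.
    """
    p, q, _ = weights.shape
    edges = {}
    for r in range(p):
        for c in range(q):
            if r + 1 < p:  # vertical neighbour in the output grid
                edges[((r, c), (r + 1, c))] = np.linalg.norm(weights[r, c] - weights[r + 1, c])
            if c + 1 < q:  # horizontal neighbour in the output grid
                edges[((r, c), (r, c + 1))] = np.linalg.norm(weights[r, c] - weights[r, c + 1])
    return edges

# Example: a random 10x8 SOM codebook in a 2-D input space
edges = ugraph_edges(np.random.rand(10, 8, 2))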

4 An Algorithm to Compute Fuzzy Membership Functions


Let X be the set of input patterns {x_1, …, x_i, …, x_n} and W be a U-Graph with p×q nodes w, obtained by training a SOM with those data patterns.
The cost of a path P with e nodes between two nodes a and b is defined as:

C(a, b) = \sum_{j=1}^{e-2} \left| E(w_j, w_{j+1}) - E(w_{j+1}, w_{j+2}) \right|    (5)
where w_1, w_2, ..., w_e are the nodes that form the path. The cost is obtained from the sum of the absolute differences between pairs of adjacent edges on the path P from node a to node b. If the path has fewer than 3 nodes, then the cost is considered to be zero.
If in some area of the SOM the data patterns have a more or less constant density,
the distance between units will also be more or less constant, and thus the cost will be
low. On the other hand, if there are variations in the data pattern density, the SOM
units will be unevenly spaced. This varying distance between nodes will lead to a high
value in the cost function.
Areas where the density is constant may be regarded as a continuous cluster, and
thus all data patterns in that area should have a high membership to a reference
prototype. On the other hand, if there are large variations in the data density, the data
patterns should be considered to form different clusters, and thus the membership of a
pattern to a prototype in another cluster should be low.
To convert low cost to high membership (and vice-versa), the membership of a
pattern xi to a concept cc is calculated using:
u_{ab} = \frac{1}{1 + C(a, b)}    (6)

The procedure to compute the membership of a pattern x_i to concept c_c may now be defined as follows:
1. Train a SOM with the dataset X
2. Compute the U-Graph of that SOM
3. Find the node w_x that maps pattern x_i
4. Find the node w_c that maps concept c_c
5. Find the path between w_c and w_x that minimizes the cost function (5)
6. Compute the membership of x_i to c_c using membership function (6)
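The minimal sketch below (hypothetical code, using a toy U-Graph given as a dict of edge values) illustrates steps 5-6 for a single candidate path: the cost of eq. (5) and the resulting membership of eq. (6). The path search itself is left out; as noted in the next section, the authors use a simulated-annealing heuristic for it.

def edge_value(edges, a, b):
    """Distance in input space between neighbouring SOM units a and b (eq. 4)."""
    return edges[(a, b)] if (a, b) in edges else edges[(b, a)]

def path_cost(path, edges):
    """Cost of a path along the U-Graph (eq. 5): sum of absolute differences
    between consecutive edge values; zero if the path has fewer than 3 nodes."""
    if len(path) < 3:
        return 0.0
    e = [edge_value(edges, path[j], path[j + 1]) for j in range(len(path) - 1)]
    return float(sum(abs(e[j] - e[j + 1]) for j in range(len(e) - 1)))

def membership(path, edges):
    """Density-based membership (eq. 6) between the units at the ends of the path."""
    return 1.0 / (1.0 + path_cost(path, edges))

# Toy U-Graph: three edges in a row, nearly constant spacing followed by a jump.
edges = {((0, 0), (0, 1)): 1.0, ((0, 1), (0, 2)): 1.1, ((0, 2), (0, 3)): 4.0}
print(membership([(0, 0), (0, 1), (0, 2), (0, 3)], edges))  # low: density changes along the path
print(membership([(0, 0), (0, 1), (0, 2)], edges))          # high: nearly constant density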


The first two steps must be done only once, while the others must be repeated for
each data pattern. Step 5 involves solving an optimization problem on a graph. It must
be pointed out that this is not simply the shortest path along the graph, which would
be a trivial problem. The fact that the absolute value of the difference between two
consecutive edges is used, instead of the values of the edges themselves, renders the
traditional shortest path algorithms useless. While this is a serious practical problem,
it is irrelevant for the purpose of this paper. In the tests presented in the next section,
this optimization was performed using a very fast heuristic based on simulated
annealing, that obtains a sub-optimal but still very useful solution.

5 Preliminary Results with Artificial Data


From a set of problems where classical methods behave poorly in fuzzy membership
determination, the one shown in Fig. 1 was chosen. It is composed of two zones, each
one with approximately uniform pattern density distribution, but with numerically
different pattern densities between them. The patterns were placed randomly on both
zones, 1000 patterns on zone A (the left half of the square) and 100 patterns on zone
B (the right half of the square).

Fig. 1. Top left side: Pattern dataset and zone identification. Each zone has approximately
constant pattern density. Top right side: SOM trained with the dataset. Notice the
approximately constant density of SOM units in each zone. On the bottom we present the
defuzzified results of using the classical membership functions (eq.7) on the left, and the
proposed membership functions on the right. Darker patterns have higher membership to.


After training the SOM, more units will end up on zone A, due to the higher
density of patterns, and the opposite happens in zone B (Fig. 1). It is worth noticing that each zone has an approximately constant prototype density, which is a function of the pattern density [10].
To characterize zones A and B, one data pattern from each must be selected as a
prototype, and the membership of all other patterns to those two prototypes must be
computed. It would be desirable that all data patterns of zone A have a high membership to its prototype, and a low membership to the other one (and vice-versa).
Two patterns were chosen as reference prototypes, one pattern from each of the
zones A and B (Fig.1). They were intentionally placed asymmetrically with respect to
the border between zones A and B.
To act as a benchmark against the proposed method, one of the most popular distance-based methods was used [7], which computes the membership of a data pattern x_i to concept c_j as:

u_{ij} = \left[ \sum_{c=1}^{k} \left( \frac{\| x_i - c_j \|}{\| x_i - c_c \|} \right)^{2/(m-1)} \right]^{-1}    (7)
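For comparison, a short sketch of this distance-based benchmark as reconstructed above is given below; the fuzzifier m, the function name and the prototype values are illustrative assumptions, not taken from the paper.

import numpy as np

def distance_membership(x, prototypes, m=2.0, eps=1e-12):
    """Classical distance-based (fuzzy c-means style) memberships of pattern x
    to each prototype, following the benchmark formula (eq. 7)."""
    d = np.array([np.linalg.norm(x - c) + eps for c in prototypes])
    u = np.empty(len(prototypes))
    for j in range(len(prototypes)):
        u[j] = 1.0 / np.sum((d[j] / d) ** (2.0 / (m - 1.0)))
    return u  # probabilistic: the memberships sum to 1 over the prototypes

x = np.array([0.2, 0.5])
prototypes = [np.array([0.0, 0.5]), np.array([1.0, 0.5])]
print(distance_membership(x, prototypes))  # higher membership to the closer prototype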

The values of membership functions computed using this equation and using the
proposed method are presented in Fig. 2. By observing these figures, we may easily
see that:
1- When using probabilistic distance based methods, data patterns far away have
average membership values, and these are approximately the same for all concepts.
This uncertainty may be solved by using a maximum operator, but this defeats the
purpose of fuzzy membership. On the other hand, using the approach proposed in this
paper, even distant patterns are clearly identified as belonging to a cluster provided
the pattern density is constant. The same effect could be obtained by certain clustering algorithms such as single linkage, but these fail in other cases [3].
2- When using distance based methods, the borders between concepts will always
be in the mid points between their respective prototypes. This means that the exact
positioning of these prototypes is critical. If the relative positions of the prototypes
vary slightly, the line separating the classes may change considerably. If the borders
are not straight lines, distance based methods cannot be used to identify them unless
several prototypes are used for each concept (forming stepwise linear borders). In the
method proposed in this paper, the borders will occur whenever there are variations in
data density, regardless of how far from the reference prototype, and borders may
have any shape.
3- When using probabilistic based methods, data patterns on the borders are
considered to have 0.5 membership to each concept. In many cases this is not at all
true, since the border belongs simultaneously to both concepts, with high membership
values. In the method proposed in this paper, the borders have high membership to
both concepts.


Fig. 2. Probabilistic membership functions to c1 (right) and c2 (left) obtained with (eq. 7) on the
top row, and with the proposed method on the bottom

6 Conclusions and Future Work


In this paper a new method for computing fuzzy memberships was proposed. This
method allows membership to be sensibly determined in special cases where distance
based methods fail.
The method requires computing a large SOM and U-Graph with the available data.
This is not a problem, since the algorithm for training SOMs is quite fast and is easily
parallelized [14].
One of the bottlenecks of the proposed algorithm is the optimization of the path along
the U-Graph. This is an interesting problem requiring some research, but fortunately, as
shown in our tests, simple heuristics are quite effective for this kind of problem.
Preliminary results with artificial datasets indicate that the new method is indeed
efficient in characterizing clusters which are adjacent but have different densities.
Although not tested experimentally, it seems clear that the new method will also
characterize correctly any clusters, provided that between them there are areas where
data density is low, since the variation in data density will induce high values in the
cost function, and thus low values in membership. This effect will occur even if the
clusters have very irregular or long shapes.

References
1. Peuquet, D.J., Making Space for Time: Issues in Space-Time Data Representation.
GeoInformatica, 2001. 5(1): p. 11-32.
2. Goodchild, M., Introduction: Special Issue on 'Uncertainty in geographic information
systems'. Fuzzy sets and systems, 2000. 113(1): p. 3-5.


3. Ultsch, A. Clustering with SOM: U*C. in WSOM 2005. 2005. Paris.


4. Zadeh, L.A., Fuzzy Sets. Information and Control, 1965. 8: p. 338-353.
5. Baraldi, A. and P.Blonda, A survey of Fuzzy Clustering Algorithms for Pattern
Recognition - Part I. IEEE Trans. on Systems, Man and Cybernetics - Part B: Cybernetics,
1999. 29: p. 778-785.
6. Krishnapuram, R. and J.M. Keller, A possibilistic approach to clustering. IEEE
Transactions on Fuzzy Systems, 1993. 1: p. 98-110.
7. Bezdek, J.C., et al., Fuzzy models and algorithms for pattern recognition and image
processing. 1999: Kluwer Academic Publishers.
8. Kohonen, T., Self-Organizing Maps. 3rd ed. Information Sciences. 2001, Berlin-Heidelberg: Springer. 501.
9. Vesanto, J., Data Mining Techniques Based on the Self-Organizing Map, in Department of Engineering Physics and Mathematics. 1997, Helsinki University of Technology: Helsinki. p. 71.
10. Cottrell, M., J.C. Fort, and G. Pages, Theoretical Aspects of the SOM algorithm. Neurocomputing, Elsevier, 1998. 21: p. 119-138.
11. Bauer, H.-U. and K.R. Pawelzik, Quantifying the Neighborhood Preservation of Self-Organizing Maps. IEEE Transactions on Neural Networks, 1992. 3: p. 570-579.
12. Ultsch, A. and H.P. Simeon, Exploratory Data Analysis Using Kohonen Networks on Transputers. 1989, Department of Computer Science, University of Dortmund, FRG.
13. Vesanto, J. and E. Alhoniemi, Clustering of the Self-Organizing Map. IEEE Transactions
on Neural Networks, 2000. 11(3): p. 586-600.
14. Bandeira, N. and V. Lobo. Training a Self-Organizing Map distributed on a PVM network.
in IEEE World Conference on Computational Intelligence. 1998. Anchorage, Alaska,
USA.

A New Method to Model Neighborhood Interaction in Cellular Automata-Based Urban Geosimulation
Yaolong Zhao1,2 and Yuji Murayama2
1 Faculty of Land Resource Engineering, Kunming University of Science and Technology, Kunming, Yunnan 650093, P.R. China
yaolongzhao@gmail.com
2 Graduate School of Life and Environmental Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, 305-8572 Japan
mura@atm.geo.tsukuba.ac.jp

Abstract. Local spatial interaction (i.e. neighborhood interaction) between land-use types is an important component in Cellular Automata-based urban geosimulation models. Herein a new method based on the integration of Tobler's First Law of Geography with Reilly's gravity model, coupled with a logistic regression approach, is proposed to model and calibrate the neighborhood interaction. This method is embedded into a constrained CA model to simulate the spatial process of urban growth in the Tokyo metropolitan area. The results indicate that this method captures the main characteristics of neighborhood interactions in the spatial process of urban growth. Further, this method provides an alternative and extensive approach to represent local spatial interactions in bottom-up urban models.
Keywords: Geosimulation, cellular automata, urban growth, GIS.

1 Introduction
One of the most important developments in Geographic Information Science
(GIScience) is the expansion of theories, models, and technologies to effectively
discern and interpret spatiotemporal patterns, relationships, and interactions among
features, activities, processes, and events in geographic domains. In the current era, as rapid changes of urban land-use all over the world have greatly impacted local [1-3] and global environmental changes [4, 5], the issue of modeling the spatial process of urban growth to better understand the mechanism and consequences of urbanization and to explore the extent of future urban land-use change has attracted sweeping attention from scientists with backgrounds in different disciplines, ranging from anthropology to mathematical programming. This issue has also enriched the theory and technology of simulation models of geographic phenomena.
An important component in Cellular Automata (CA)-based urban geosimulation
models is the local spatial interaction between neighborhood land-use types. The
neighborhood interaction is often addressed based on the notion that urban


development can be conceived as a self-organizing system in which natural constraints


and institutional controls (land-use policies) temper the way in which local
decision-making processes produce macroscopic urban form. Different processes can
explain the importance of neighborhood interaction. At large scale, simple mechanisms
for economic interaction between locations were provided by the central place theory
[6] that describes the uniform pattern of towns and cities in space as a function of the
distance that consumers in the surrounding region travel to the nearest facilities. Spatial
interaction between the location of facilities, residential areas and industries has been
given more attention in the work of Krugman [7, 8]. The spatial interaction is explained
by a number of factors that either cause concentration of urban functions (centripetal
forces: economies of scale, localized knowledge spill-over, thick labor markets) and
others that lead to a spatial spread of urban functions (centrifugal forces: congestion,
land rents, factor immobility etc.).
In keeping with the spirit of simplicity, the neighborhood interaction in applications of CA to urban geosimulation models most often adopts either the von Neumann 3×3 (or 5×5) or the Moore 3×3 neighborhood [9-11]. For most physical systems, these are clearly the most appropriate definitions since such systems typically have only local
clearly the most appropriate definitions since such systems typically have only local
causation. However, in the case of human systems like cities, the idea of locality may
be much larger, since people and institutions are aware of their surroundings in a wider
space [12]. Thus it is desirable to define a neighborhood large enough (i.e. extended
neighborhood) to capture the operational range of the local processes being modeled by
CA. White and Engelen (1993) first proposed this kind of neighborhood configuration for exploring CA-based models of urban form evolution [13]. In 1997, White et al. calibrated the neighborhood effect by means of a trial-and-error approach for a geosimulation of the city of Cincinnati [14]. In 2004, this research group proposed an automatic calibration procedure for this kind of neighborhood effect [15].
The objective of this study is to improve the methodology of existing CA models by
proposing a theoretical framework to model and calibrate the neighborhood effect. This
framework aims to assist modelers in the implementation and quantification of
neighborhood interaction in urban geosimulation models.

2 Modeling Neighborhood Interaction


Tobler's First Law of Geography (FLG), "Everything is related to everything else, but near things are more related than distant things", is the fundamental theory in this framework. This law, first proposed in 1970 [16], has brought strong controversy in the geography domain. In 2003, a panel on this law was organized at the AAG meeting in New Orleans. Five famous geographers presented their comments in this panel, and these comments were published in a forum of the Annals of the Association of American Geographers in 2004. Some professors agreed with Tobler, others did not. However, all of them accepted the actual geographic phenomena illustrated by the FLG; the disagreement centered on the word "law". Goodchild especially discussed the validity and usefulness of the FLG in GIScience and geography [17]. Here, the controversy over whether phenomena can be expressed as laws was discarded, and the local knowledge expressed in Tobler's FLG was accepted. It is assumed that the effect of cell states in


the neighborhood area of a developable cell accords with the rule of distance decay
described in the FLG.
The expression of Tobler's FLG is very qualitative, and a distance decay function is needed for representing the law. Herein the idea of Reilly's law of retail gravitation [18] was adopted, which states that "A city will attract retail trade from a town in its surrounding territory, in direct proportion to the population size of the city and in inverse proportion to the square of the distance from the city". Fig. 1 shows one of the extended neighborhood patterns of one developable cell i, which is defined as all cells within a radius of eight cells, an area containing 196 cells. It is assumed that in a cellular environment all the cells in the neighborhood contribute to the conversion of the developable cell i. The contribution of one cell is associated with its own state and the distance to the developable cell i. It can be expressed as follows:
f_{kh} = \pm G_{kh} \frac{A_j}{d_{ji}^{2}}    (1)

where
f_{kh}: contribution of one cell j with land-use k in the neighborhood to the conversion of the developable cell i to land-use h for the next stage,
A_j: area of the cell j,
d_{ji}: the Euclidean distance between the cell j in the neighborhood area and the developable cell i, and
G_{kh}: constant of the effect of land-use k on the transition to land-use h; "+" stands for a positive (attractive) effect and "-" for a repulsive one.
Fig. 2 indicates the scheme of the impact gradient using this function. Notice that this is a modified Reilly's function and no unit problem exists in it.
Then the aggregated effect of the cells in the neighborhood can be expressed as:
F_{kh} = \pm G_{kh} \sum_{j=1}^{m} \frac{A_j}{d_{ji}^{2}} I_{kj}    (2)

Fig. 1. An extended neighborhood pattern

Fig. 2. Scheme of the impact gradient (the impact index of the modified Reilly model decreases as the distance from the developable cell increases)


where
m: total number of the cells in the neighborhood, and
I_{kj}: index of cells; I_{kj} = 1 if the state of cell j is equal to k, and I_{kj} = 0 otherwise.
For one land-use type of one cell, there are just two possible transition results: change or no change. Therefore, a logistic regression approach was selected to calculate the probabilities of the transition of cell i under the neighborhood effect. The contribution of the neighborhood effect to the probability of conversion to land-use h of a cell i (N_ih) is described as a function of the set of aggregated effects of the different land-use types:
\log\left( \frac{N_{ih}}{1 - N_{ih}} \right) = \beta_{0i} + \sum_{k} \beta_{ikh} F_{ikh} = \beta_{0i} + \sum_{k} \beta_{ikh} G_{kh} \sum_{m} \frac{A_m}{d_{mi}^{2}} I_{mk}    (3)

As G_{kh} is a constant, let

\beta'_{0i} = \beta_{0i}, \qquad \beta'_{ikh} = \beta_{ikh} G_{kh}

Then:

\log\left( \frac{N_{ih}}{1 - N_{ih}} \right) = \beta'_{0i} + \sum_{k} \beta'_{ikh} \sum_{m} \frac{A_m}{d_{mi}^{2}} I_{mk}    (4)
where β'_{0i} and β'_{ikh} are the coefficients which should be calibrated.
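As an illustration of eq. (4), the sketch below (hypothetical code with made-up coefficient values and land-use codes, not the calibrated model) computes the neighborhood contribution N_ih for a single cell of a land-use raster.

import numpy as np

def neighborhood_logit(landuse, i, j, beta0, beta, radius=8, cell_area=1.0):
    """Neighborhood contribution N_ih for the cell (i, j) following eq. (4).

    landuse  : 2-D integer array of land-use codes.
    beta0    : intercept beta'_0.
    beta     : dict mapping land-use code k -> coefficient beta'_kh.
    radius   : neighbourhood radius in cells (the paper uses 8).
    cell_area: area A_m of one cell (constant on a regular grid).
    """
    rows, cols = landuse.shape
    logit = beta0
    for m in range(max(0, i - radius), min(rows, i + radius + 1)):
        for n in range(max(0, j - radius), min(cols, j + radius + 1)):
            if (m, n) == (i, j):
                continue
            d2 = (m - i) ** 2 + (n - j) ** 2  # squared distance d_mi^2 in cell units
            if d2 > radius ** 2:
                continue  # keep only cells inside the circular neighbourhood
            k = landuse[m, n]
            logit += beta.get(k, 0.0) * cell_area / d2
    return 1.0 / (1.0 + np.exp(-logit))  # invert the logit to a probability

# Illustrative call with made-up coefficients (codes: 1 = residential, 2 = road)
grid = np.random.randint(0, 3, size=(50, 50))
print(neighborhood_logit(grid, 25, 25, beta0=-2.8, beta={1: 0.56, 2: 0.42}))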

3 Calibration of Neighborhood Interaction


3.1 Study Area and Data Set
The Tokyo metropolitan area was identified as the study area to confirm the approach, as this area possesses an abundant land-use dataset. The dataset is named Detailed Digital Information (10 m grid land-use) Metropolitan Area of Tokyo and was released in 1998 by the Geographical Survey Institute of Japan. The data for 1984 and 1989 were used to calibrate the neighborhood effect.
The dataset distinguishes 10 cell states of land-use and was aggregated from the 10×10 m grid into a 100×100 m grid using the majority rule. The land-use type water represents a fixed feature in the model; that is, this feature is assumed not to change and therefore does not participate in the dynamics, in order to protect the living environment. Two land-use types, forest & wasteland and cropland, are passive features that participate in the land-use dynamics; however, their dynamics are not driven by an exogenous demand for land. They appear or disappear in response to land being taken or abandoned by the active functions. The active functions are four land-use categories which are forced on the CA by exogenously generated demands for land in response to the growth of the urbanized area: vacant, industrial, residential, and commercial land. The other three land-use types, road, public land, and special land, are passively active features whose dynamics are represented in the model. Active and passively active types of land-use are assumed to participate in the neighborhood interaction.
3.2 Calibration of the Neighborhood Interaction
As a statistical analysis technique, logistic regression has to consider problems of spatial statistics, such as spatial dependence and spatial sampling [19, 20], in the


calibration procedure. The integration of both systematic and random sampling method
was adopted to eliminate spatial dependence effect. Firstly, land-use changes were
detected from the data set in 1984 and 1989. Systematic sampling was implemented
and approximately half cells of the changes of every one of four active land-use types
were remained. Then random selection of not changed cells was carried out to create
nearly 1:1 ratio for changed cells and not changed cells. Its total size was 27, 070 cells.
The result of calibration is shown in Table 1.
Table 1. Result of calibration of neighborhood interaction

Factors and test                   Active land-use types
                                   Vacant land   Industrial land   Residential land   Commercial land
Total size of sampling (cells)     11034         1732              11596              2708
Vacant land, h                     1.147         *0.091            0.190              0.158
Industrial land, h                 0.334         1.446             0.262              0.457
Residential land, h                0.103         **                0.562              0.209
Commercial land, h                 0.348         0.727             0.181              1.821
Road, h                            0.199         **                0.421              0.561
Public land, h                     0.198         **                0.199              0.224
Constant β0                        -2.428        -1.988            -2.830             -2.763
Test   PCP (%)                     84.3          87.6              83.6               86.3
       ROC                         0.924         0.937             0.905              0.937

PCP: Percentage Correctly Predicted (0-100%); ROC: Relative Operating Characteristic (0-1).
*: significant at p<0.05; **: not statistically significant; others significant at p<0.001.

Table 1 illustrates that all the PCP values of the four active land-use types are more than 80% and all of the ROC values are more than 0.9, thus showing the goodness of fit of this approach. The test results also indicate the existence of neighborhood interaction in urban land-use changes.
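A calibration of this kind could, for instance, be reproduced with a standard logistic regression routine once the neighbourhood sums of eq. (4) have been computed per land-use type for the sampled cells. The sketch below uses scikit-learn and placeholder data purely for illustration; it is not the authors' calibration code.

import numpy as np
from sklearn.linear_model import LogisticRegression

# X: one row per sampled cell, one column per neighbouring land-use type,
#    each entry holding the aggregated term sum_m (A_m / d_mi^2) I_mk of eq. (4).
# y: 1 if the cell converted to the active land-use h between 1984 and 1989, else 0.
X = np.random.rand(1000, 6)          # placeholder features for illustration only
y = np.random.randint(0, 2, 1000)    # placeholder labels for illustration only

model = LogisticRegression().fit(X, y)
print(model.intercept_)  # estimate of beta'_0
print(model.coef_)       # estimates of beta'_kh, one per land-use type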

4 Simulation and Results


A constrained CA model was put forward to confirm this method. In this model, the transition potentials for each cell are calculated as follows:
{}^{t}P_{ik} = (1 + {}^{t-1}N_{ik})(1 + {}^{t-1}S_{ik})(1 + {}^{t-1}Z_{ik})(1 + {}^{t-1}A_{ik}) \; {}^{t-1}v    (5)

where tP_ik is the potential of the cell i for land-use k at time t; t-1N_ik is the neighborhood effect on the cell i for land-use k at time t-1, which equals the value of N_ih in equation (4); t-1S_ik is the intrinsic suitability of the cell i for land-use k at time t-1; t-1Z_ik is the zoning status of the cell i for land-use k at time t-1; t-1A_ik is the accessibility of the cell i to transportation for land-use k at time t-1; and t-1v is the scalable random perturbation term at time t-1. The four active land-use types were changed by an intervention that is exogenous to the CA model, from the regional systems.
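A minimal sketch of eq. (5) for one cell follows. The form of the stochastic perturbation v is an assumption (the 1 + (-ln r)^alpha term commonly used in constrained CA models), not a detail given in the paper, and all names are illustrative.

import numpy as np

def transition_potential(N, S, Z, A, alpha=1.0, rng=np.random.default_rng(0)):
    """Transition potential of eq. (5) for one cell and land-use k.

    N, S, Z, A : neighbourhood effect, suitability, zoning and accessibility
                 at time t-1 (assumed scaled to [0, 1]).
    alpha      : scale of the stochastic perturbation v; the form
                 v = 1 + (-ln r)**alpha with r uniform in (0, 1) is an assumption.
    """
    v = 1.0 + (-np.log(rng.uniform(1e-9, 1.0))) ** alpha  # scalable random term
    return (1 + N) * (1 + S) * (1 + Z) * (1 + A) * v

print(transition_potential(N=0.7, S=0.5, Z=1.0, A=0.6))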

Fig. 3. Land-use maps in the study area: (a) simulation in 1994; (b) reality in 1994

In this schema the neighborhood factor N makes the city work like a non-linear system, as suitability, accessibility and land-use zoning status are relatively stable over a certain period.
In a learning stage, t-1S_ik, t-1Z_ik, and t-1A_ik were also calibrated using the datasets of 1984 and 1989. The calibrated model was used to simulate the spatial process of urban
growth of the Tokyo metropolitan area from 1989 to 1994. Land-use maps in
simulation and reality are shown in Fig. 3. Note that in order to make the maps clearer,
land-use types of forest & wasteland and cropland have been grouped into
non-urbanized area and other land-use types except water into urbanized area.
A good CA-based model produces results which have all the patterned complexity of the real system [12]. A comparison of the simulated result with the actual data in terms of fractal dimension and spatial metrics was carried out to test the proposed approach. These two indices are excellent for presenting the pattern of a complex system like a city [21]. Table 2 shows the assessment result of the simulated urbanized area in terms of fractal dimension. Areas more than 50 km from Tokyo station were omitted from this table to avoid boundary effects. In order to understand the change of the urbanized-area structure and confirm the ability of this model to capture that change, the fractal dimension of the urbanized area in 1989 is also shown. Table 2 indicates that the urbanized area shows a bifractal structure in the study area. The urbanized area grew more strongly in the second radius zone (16-50 km) than in the first one (0-16 km). The model captures this characteristic well.
Fig. 4 shows the assessment result of the simulation in terms of spatial metrics. Two metrics were used for the assessment: NP (number of patches) and PD (patch density). The values of NP and PD declined from 1989 to 1994, indicating compact growth or conglomeration of the existing urbanized area in the study area. The simulated urbanized area presents the same characteristics. However, if the component of neighborhood interaction is taken out of the model, NP and PD increase sharply. This confirms the utility of the proposed approach in modeling neighborhood interaction.
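For reference, NP and PD can be computed from a binary urbanized-area raster with a connected-component labelling routine. The sketch below uses scipy and is only an illustration of the two metrics, not the implementation used by the authors.

import numpy as np
from scipy import ndimage

def patch_metrics(urban, cell_area_ha=1.0):
    """Number of patches (NP) and patch density (PD, patches per 100 ha) of a
    binary urbanized-area raster, using 8-connectivity for patch membership."""
    structure = np.ones((3, 3), dtype=int)            # 8-neighbour connectivity
    _, num_patches = ndimage.label(urban, structure)  # NP
    total_area_ha = urban.size * cell_area_ha
    patch_density = num_patches / (total_area_ha / 100.0)  # PD per 100 ha
    return num_patches, patch_density

urban = np.random.rand(500, 500) > 0.7   # toy binary map; 100 m cells are 1 ha each
print(patch_metrics(urban))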



Table 2. Assessment of the simulated urbanized area in terms of fractal dimension

Fractal dimension in different radius zones   Reality in 1989   Reality in 1994   Simulation in 1994
In 0-16 km radius                             1.94              1.95              1.95
In 16-50 km radius                            1.45              1.48              1.48

[Figure 4 plots NP (number of patches) and PD (patches per 100 ha) of the urbanized area for reality in 1989, reality in 1994, the simulation in 1994, and the simulation in 1994 without the neighborhood effect.]

Fig. 4. Comparison of the significance of spatial metrics of urbanized area between reality and
simulation in the Tokyo metropolitan area

5 Concluding Remarks
This study developed a novel approach for modeling and calibrating neighborhood interaction in CA-based urban geosimulation. The proposed method provides a theoretical framework for representing the neighborhood effect in CA. The results of the simulation using the Tokyo metropolitan area as a case study indicate that an urban geosimulation model which embeds this method captures well the main characteristics of the spatial process of urban growth. The results also confirmed the utility of this method for representing the dynamics of a complex system.
This approach can be used not only with regular grid cells, but also with irregular cells, as in a vector structure, since it considers the area of the cells and the distance decay. Discussing this issue would be a valuable extension of the current study.

References
1. Lin, G.C.S., Ho, S.P.S.: China's land resources and land-use change: insights from the 1996
land survey. Land Use Policy 20 (2003) 87-107
2. McKinney, M.L.: Urbanization as a major cause of biotic homogenization. Biological
Conservation 127 (2006) 247-260
3. Paul, M.J., Meyer, J.L.: Streams in the urban landscape. Annual Review of Ecology
Systematics 32 (2001) 333-365


4. Grimm, N.B., Grove, J.M., Pickett, S.T.A., Redman, C.L.: Integrated approaches to
long-term studies of urban ecological systems. Bioscience 50 (2000) 571-584
5. Lambin, E.F., Turner, B.L., Geist, H.J., Agbola, S.B., Angelsen, A., Bruce, J.W., Coomes,
O.T., Dirzo, R., Fischer, G., Folke, C., George, P.S., Homewood, K., Imbernon, J.,
Leemans, R., Li, X., Moran, E.F., Mortimore, M., Ramakrishnan, P.S., Richards, J.F.,
Skanes, H., Steffen, W., Stone, G.D., Svedin, U., Veldkamp, T.A., Vogel, C., Xu, J.: The
causes of land-use and land-cover change: moving beyond the myths. Global Environmental
Change 11 (2001) 261-269
6. Christaller, W.: Central Places of Southern Germany (Edition 1966). Prentice Hall, London
(1933)
7. Fujita, M., Krugman, P., Mori, T.: On an evolution of hierarchical urban systems. European
Economic Review 43 (1999) 209-251
8. Krugman, P.: The role of geography in development. International Regional Science
Review 22 (1999) 142-161
9. Batty, M.: Urban evolution on the desktop: simulation with the use of extended cellular
automata. Environment and Planning A 30 (1998) 1943-1967
10. Wu, F.: SimLand: a prototype to simulate land conversion through the integrated GIS and
CA with AHP-derived transition rules. International Journal of Geographical Information
Science 12 (1998) 63-82
11. Yeh, A.G.O., Xia, L.: A constrained CA model for the simulation and planning of
sustainable urban forms by using GIS. Environment and Planning B 28 (2001) 733-753
12. White, R., Engelen, G.: High-resolution integrated modelling of the spatial dynamics of
urban and regional systems. Computers, Environment and Urban Systems 24 (2000)
383-400
13. White, R., Engelen, G.: Cellular automata and fractal urban form: a cellular modelling
approach to the evolution of urban land-use patterns. Environment and Planning A 25 (1993)
1175-1199
14. White, R., Engelen, G., Uljee, I.: The use of constrained cellular automata for
high-resolution modeling of urban land-use dynamics. Environment and Planning B 24
(1997) 323-343
15. Straatman, B., White, R., Engelen, G.: Towards an automatic calibration procedure for
constrained cellular automata. Computers, Environment and Urban Systems 28 (2004)
149-170
16. Tobler, W.: A computer movie simulating urban growth in the Detroit region. Economic Geography 46 (1970) 234-240
17. Goodchild, M.F.: The validity and usefulness of laws in geographic information science and
geography. Annals of the Association of American Geographers 94 (2004) 300-303
18. Reilly, W.J.: The Law of Retail Gravitation. Knickerbocker Press, New York (1931)
19. Irwin, E.G., Geoghegan, J.: Theory, data, methods: developing spatially explicit economic
models of land use change. Agriculture, Ecosystems & Environment 85 (2001) 7-23
20. Cheng, J., Masser, I.: Urban growth pattern modeling: a case study of Wuhan city, PR
China. Landscape and Urban Planning 62 (2003) 199-217
21. Barredo, J.I., Demicheli, L.: Urban sustainability in developing countries' megacities:
modelling and predicting future urban growth in Lagos. Cities 20 (2003) 297-310

Artificial Neural Networks Application to Calculate Parameter Values in the Magnetotelluric Method

Andrzej Bielecki1, Tomasz Danek2, Janusz Jagodziński1, and Marek Wojdyla3

1 Institute of Computer Science, Jagiellonian University, Nawojki 11, 30-072 Kraków, Poland
2 Department of Geoinformatics and Applied Computer Science, Faculty of Geology, Geophysics and Environmental Protection, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Kraków, Poland
3 Geophysical Exploration Company, Jagiellonska 76, 03-301 Warszawa, Poland
bielecki@softlab.ii.uj.edu.pl

Abstract. In this paper the possibility of applying neural networks to data processing in the magnetotelluric method is studied. A modular neural system, consisting of three multi-layer neural networks, is used for obtaining a geoelectric model of the lithosphere based on amplitude and phase MT curves. Cases of two and three flat lithosphere layers are considered.
Keywords: magnetotelluric method, multi-layer neural networks.

Introduction

Magnetotellurics (MT) is one of the most commonly used passive geophysical methods. The method measures fluctuations in the natural electric (E) and magnetic (H) fields in orthogonal directions at the surface of the Earth in order to determine the conductivity structure of the Earth. Apart from its usefulness in deep lithosphere research and hydrocarbon prospecting, precise knowledge of the conductivity structure is of major importance in several respects. For example, it has been shown [4] that sites of the Earth's surface lying close to conductive edges are the most sensitive sites for the recording of low frequency signals that have been found to precede earthquakes [7].
The fundamental theory of magnetotellurics was first propounded by Tikhonov [6] and Cagniard [2]. Magnetotelluric data, called amplitude and phase sounding curves, are usually obtained by processing measured time series which describe electromagnetic variations at a particular point on the Earth's surface. Sounding curves are then interpreted with various inversion methods. The final results are geoelectrical cross-sections (or maps) which are usually correlated with geology.

Data Processing

As a result of MT measurements, time series of the electric (Ex, Ey) and magnetic (Hx, Hy) components are transformed to the spectral domain using the Fast Fourier Transform. These components are connected by a formula which includes the impedance tensor [Z]:

E = Z H    (1)

where

Z = \begin{pmatrix} Z_{xx} & Z_{xy} \\ Z_{yx} & Z_{yy} \end{pmatrix}    (2)

For a horizontally layered Earth, Z_{xx} = Z_{yy} = 0 and Z_{xy} = -Z_{yx} (Cagniard model), so we can obtain the magnetotelluric amplitude and phase curves from:

\rho(f) = \frac{0.2}{f} |Z|^2    (3)

\varphi(f) = \mathrm{Arg}(Z)    (4)
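A small numeric illustration of eqs. (3)-(4) follows; the 0.2|Z|^2/f form assumes the usual field-unit convention (E in mV/km, H in nT), and the values used are arbitrary.

import numpy as np

def mt_curves(Z_xy, f):
    """Apparent resistivity (eq. 3) and phase (eq. 4) from the off-diagonal
    impedance element, for the field-unit convention rho = 0.2 |Z|^2 / f."""
    rho = 0.2 * np.abs(Z_xy) ** 2 / f
    phi = np.angle(Z_xy, deg=True)
    return rho, phi

print(mt_curves(Z_xy=1.5 + 1.5j, f=0.01))  # toy impedance value at 0.01 Hz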

Neural Approach

The applied modular neural system consists of three multi-layer artificial neural networks (ANNs). The first one, ANN1, recognizes how many layers constitute the studied fragment of the lithosphere; only two-layer and three-layer cases are considered. If the number of lithosphere layers is equal to two, then the input vector is passed to the second neural network, ANN2. Otherwise it is passed to the third neural net, ANN3. Networks ANN2 and ANN3 calculate the parameters of the one-dimensional model.
Two different forms of the input vector components [x_1, ..., x_n] for the ANNs were tested. In the first case the vector components were scaled using the following forms:

x_{2i-1} = \frac{\ln \rho(\omega_i)}{k_1}, \qquad x_{2i} = \frac{\varphi(\omega_i)}{k_2}    (5)

where ω is the frequency on a logarithmic scale, ρ is the resistivity, φ is the phase, and k_1 and k_2 are constant values calculated during data preprocessing in such a way that the components of the input vectors x_i belong to the interval [0, 1], i = 1, ..., 55. In the second case an integrated form of the input data was used:


x_{2i-1} = \frac{\int_{\omega_i}^{\omega_{i+1}} \rho(\omega)\, d\omega}{k_3 \int_{\omega_1}^{\omega_{55}} \rho(\omega)\, d\omega}, \qquad x_{2i} = \frac{\int_{\omega_i}^{\omega_{i+1}} \varphi(\omega)\, d\omega}{k_4 \int_{\omega_1}^{\omega_{55}} \varphi(\omega)\, d\omega}    (6)

where i = 1, ..., 54 and the constant values k_3 and k_4 were used for component normalization, as in the case of the input data representation given by formula (5).
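A sketch of the scaled input representation of formula (5) is given below. The way k_1 and k_2 are chosen here (maxima over the curve) is only one possible normalization consistent with the requirement that the components fall in [0, 1]; it is an assumption, as are the toy values.

import numpy as np

def scaled_inputs(rho, phi):
    """Scaled input vector of formula (5): interleave ln(resistivity) and phase
    at the 55 frequencies, each divided by a constant chosen (here: the maximum
    absolute value, an assumption) so that all components fall in [0, 1]."""
    k1 = np.max(np.abs(np.log(rho)))
    k2 = np.max(np.abs(phi))
    x = np.empty(2 * len(rho))
    x[0::2] = np.log(rho) / k1   # x_{2i-1}
    x[1::2] = phi / k2           # x_{2i}
    return x

rho = np.random.uniform(1.0, 1000.0, 55)   # toy apparent resistivity curve
phi = np.random.uniform(10.0, 80.0, 55)    # toy phase curve
x = scaled_inputs(rho, phi)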


Results

Mean absolute percentage error (MAPE) was used as a measure of accuracy, both in the net ANN2 calculating parameters for the two-layer model and in the net ANN3 used for the three-layer model. The error is given by the formula

\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \sum_{j=1}^{m} \left| \frac{y_{ij} - d_{ij}}{d_{ij}} \right|    (7)
where y_ij is the jth component of the output vector of an ANN when the ith input vector is presented, d_ij is the jth component of the correct output vector for the ith input vector, m is the number of output vector components, and n is the number of presented examples.
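A direct transcription of eq. (7) as reconstructed above could look as follows (illustrative code, not the authors' implementation).

import numpy as np

def mape(y, d):
    """Mean absolute percentage error of eq. (7): y and d are (n, m) arrays of
    network outputs and correct outputs for n examples with m components each."""
    y, d = np.asarray(y, float), np.asarray(d, float)
    return 100.0 / y.shape[0] * np.sum(np.abs((y - d) / d))

print(mape([[1.1, 2.2]], [[1.0, 2.0]]))  # 20.0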
Both ANN2 and ANN3 were perceptrons with one hidden layer. In both cases it turned out that twenty hidden neurons was an optimal number. A sigmoidal mapping was used as the activation function of the hidden neurons, and both sigmoidal and linear neurons were tested in the output layer; see Tables 1 and 2. The components of the output signals were scaled so that they belonged to the interval [0, 1]. Back-propagation and momentum were used as training algorithms. Both the scaled representation of the input data (formula (5)) and the integrated representation (formula (6)) were tested. Various learning constants were tested as well. For both networks (ANN2 and ANN3) the training process consisted of 8000 epochs.
In the two-layer case the training set consisted of 1072 elements and the testing set had 358 elements. Results are presented in Table 1.
Table 1. Results for the two-layer model

No   Data     Activation function   Training    Training    MAPE for       MAPE for
              for output layer      parameter   method      testing set    the whole set
1    scaled   sigmoidal             0.2         backprop.   3.30           3.06
2    scaled   linear                0.2         backprop.   2.32           1.98
3    scaled   sigmoidal             0.4         backprop.   2.53           2.48
4    integ.   sigmoidal             0.2         backprop.   7.32           7.34
5    integ.   linear                0.2         backprop.   5.68           6.16
6    scaled   sigmoidal             0.8         backprop.   3.23           3.18
7    scaled   linear                0.2         momentum    1.73           1.65
8    scaled   sigmoidal             0.4         momentum    2.41           2.42

In the three-layer case the training set consisted of 1147 elements and the testing set had 383 elements. Results are presented in Table 2.
The net ANN1, which recognizes to which model (two- or three-layer) given curves belong, consisted of five neurons in the hidden layer and two output neurons. If the first component of the output signal was greater, then the two-layer model was chosen. The network achieved 100% correctness after 700 training epochs.


Table 2. Results for the three-layer model

No   Data     Activation function   Training    Training    MAPE for       MAPE for
              for output layer      parameter   method      testing set    the whole set
1    scaled   linear                0.2         backprop.   4.18           3.57
2    scaled   sigmoidal             0.2         backprop.   4.02           3.40
3    scaled   linear                0.2         momentum    3.89           3.77
4    integ.   linear                0.2         backprop.   4.79           4.46
5    scaled   sigmoidal             0.4         backprop.   4.50           4.10
6    integ.   sigmoidal             0.4         backprop.   4.40           4.13
7    integ.   linear                0.2         momentum    4.72           4.53

Concluding Remarks

The presented investigations concern the possibility of applying artificial neural systems in the MT method. It should be stressed that the presented tests are preliminary and that 2D and 3D cases must still be studied. Nevertheless, the obtained results are promising and can be a contribution to studies concerning the efficiency of modular neural systems ([1], [3], [5]).

References
1. Auda G., Kamel M.: Cooperative modular neural networks for pattern recognition.
Pattern Recognition Let., Vol. 18 (1997) 1391-1398
2. Cagniard L.: Basic theory of the magnetotelluric method of geophysical prospecting.
Geophysics, Vol. 18 (1953) 605-637
3. Marciniak A., Korbicz J.: 2000, Modular neural networks. In: Duch W., Korbicz J.,
Rutkowski L., Tadeusiewicz R. (eds.): Biocybernetics and Biomedical Engineering.
Warszawa, EXIT (2000) 135-178
4. Sarlis, N., Lazaridou, M., Kapiris, P., Varotsos, P.: Numerical model of the selectivity effect and the ΔV/L criterion. Geophysical Research Letters, Vol. 26 (1999) 3245-3248
5. Sharkey A. (ed.): Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems. Springer, Berlin (1999)
6. Tikhonov, A. N.: The determination of the electrical properties of deep layers of the Earth's crust. Dokl. Akad. Nauk SSSR, Vol. 73 (1950) 295-297
7. Varotsos, P., Alexopoulos, K., Nomicos, K., Lazaridou, M.: Official earthquake prediction in Greece. Tectonophysics, Vol. 152 (1988) 193-196

Integrating Ajax into GIS Web Services for Performance Enhancement

Seung-Jun Cha1, Yun-Young Hwang1, Yoon-Seop Chang2, Kyung-Ok Kim2, and Kyu-Chul Lee1,*

1 Department of Computer Engineering, Chungnam National University, Korea
2 Electronics and Telecommunications Research Institute (ETRI), Korea
{junii, yyhwang, kclee}@cnu.ac.kr, {ychang, kokim}@etri.re.kr

Abstract. In GIS (Geospatial Information System) Web Services, SOAP/MTOM shows the best performance when transferring large data between services. SOAP/MTOM uses XOP as its message optimization method, so data serialization and deserialization time is reduced. Additionally, integrating the Ajax (Asynchronous JavaScript and XML) approach into GIS visualization Web Services brings a further performance enhancement, because it provides a more interactive user experience. Ajax, one of the technologies of Web 2.0, appeared after the dot-com bubble burst. The intent is to make web pages feel more responsive by exchanging small amounts of data with the server behind the scenes, so that the entire web page does not have to be reloaded each time the user requests a change.
Keywords: Ajax, Web Services, GIS.

1 Introduction
Over the past decade GIS (Geographic Information Systems) technology has evolved from the traditional model of stand-alone systems to distributed models. Distributed GIS services will be implemented more extensively by using Web Services. But there is one additional problem: large amounts of data need to be moved among users and providers to enable the former to perform their designated tasks. Since geographic data are usually large in size, this exchange becomes more and more difficult despite improvements in communication. Among SOAP message transfer methods, however, SOAP/MTOM [1], which uses XOP for optimizing message transfer, shows the best performance when transferring large data between services.
The bursting of the dot-com bubble in the fall of 2001 marked a turning point for the web. The companies that had survived the collapse seemed to have some things in common, and the web that emerged from this turning point is called Web 2.0. Ajax (Asynchronous JavaScript and XML) is one of the Web 2.0 technologies. It provides asynchronous communication between the client and the
* Corresponding author.


server. Integrating the Ajax approach into GIS visualization Web Services improves the performance of the user interface [11].
This paper describes a model that integrates Ajax into GIS Web Services and defines how to evaluate the performance of Ajax versus non-Ajax GIS Web Services. The evaluation of the performance of Web Services using Ajax against Web Services not using Ajax indicates that the model integrating Ajax into Web Services achieves a clear performance enhancement.
The rest of this paper is organized as follows: a brief discussion of OGC GIS Web Services and Ajax is given in Section 2. Section 3 compares our work with related work, and Section 4 introduces the integrated model. The definition of the performance evaluation and the test results are described in Section 5. Conclusions follow in Section 6.

2 Background
2.1 OGC GIS Web Services
Geographic Information Science has a lot to benefit from the adoption of the service computing model. As mentioned in the introduction, geographic information comes from different and diverse sources and in different formats. This is especially true for environment-related information, which has to combine not only data from different sources but also models and software [8].
Technically, Web Services technologies have provided the necessary standards for applications in different domains to integrate with GIS data and services. Significant accomplishments in GIS Web Services have led to several exemplary map and image services that adhere to Web Services standards and bring terabytes of geospatial data and digital maps to enterprise developers who house no GIS data.
The OGC (Open Geospatial Consortium) has successfully executed efforts for GIS Web Services: its OGC Web Services (OWS) initiative has undergone multiple phases, including the Web Map Server (WMS), Web Feature Server (WFS), Web Coverage Server (WCS), and the OGC Web Service Architecture, which support application developers in integrating a variety of online geoprocessing and location services.
2.2 Ajax(Asynchronous JavaScript and XML)
Ajax is a style of web application development that uses a mix of modern web technologies to provide a more interactive user experience. Ajax is not a single technology; it is an approach to web applications that combines several technologies: JavaScript, HTML, Cascading Style Sheets (CSS), the Document Object Model (DOM), XML and XSLT, and XMLHttpRequest as the messaging protocol [2].
These core technologies forming Ajax are mature [6], well known, and widely used in web applications. Ajax became popular because it has a couple of advantages


for developers of browser-based web applications. It eliminates the stop-start nature of interactions: user interactions with the server happen asynchronously, data can be manipulated without having to render the entire page again and again in the web browser, and requests and responses over the XMLHttpRequest protocol are structured XML documents. This enables developers to easily integrate Ajax applications into Web Services.

3 Related Works
Previous researchers have described applying Web Services to GIS. The paper [3] discusses GIS Web Services for satellite imaging. It notes that there are problems when using GIS Web Services because of the need to send large amounts of data. The paper [7] describes developing GIS visualization Web Services for geophysical applications. Since images and capabilities documents can be too large and transferring these data over the internet is cumbersome, its first priority is researching techniques for improving WMS performance.
Related to Ajax, the paper [6] introduced a model integrating the Ajax approach into GIS Web Services. It describes invoking Web Services in the Ajax model, but it is only theoretical and provides no implementation.
Our study differs from previous studies in that we compared the performance of standard SOAP, SwA/MIME, and SOAP/MTOM; the paper [9] reports the result. When sending large data (raster and vector), SOAP/MTOM shows the best performance. Moreover, there are no previous studies implementing and evaluating GIS Web Services applications using Ajax.

4 Integrated Ajax into GIS Web Services


In Ajax, the client communicates with the server by using the XMLHttpRequest method. XMLHttpRequest provides simple asynchronous access to the server, as in Google Maps. Its performance seems better because web pages using Ajax can appear to load relatively quickly, since the payload coming down is much smaller in size, so the roundtrip time is less than in a classic web application.
Web Services provide integrated environments within a distributed system. When the client makes a request to receive images from the server, the server runs an internal JSP which can access the specific Web Service providing the image files, and then the server saves them to its repository. The client's working process must wait until the response message arrives from the server, because the client requests synchronously.
In the model integrating Web Services with Ajax (Fig. 1), as both Ajax and Web Services are used together, the advantages of both sides apply: the application can integrate Web Services from different platforms and access the server asynchronously. This enhances the user interface and reduces the roundtrip time.


Fig. 1. Integrated Web Services with Ajax
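To make the idea concrete, the sketch below (a hypothetical Python pseudo-server, not the authors' JSP implementation; the tile URL, class and method names are assumptions) caches tiles fetched from the image-providing Web Service and prefetches the surrounding tiles in the background, which is the role the Ajax requests play in the integrated model.

from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical endpoint of an image-providing Web Service (WMS-style URL).
TILE_URL = "http://example.org/tiles?x={x}&y={y}"

class TileCache:
    """Server-side cache: return a tile immediately if it was prefetched,
    otherwise fetch it from the remote Web Service on demand."""

    def __init__(self):
        self.cache = {}
        self.pool = ThreadPoolExecutor(max_workers=4)

    def _download(self, x, y):
        with urlopen(TILE_URL.format(x=x, y=y)) as resp:
            self.cache[(x, y)] = resp.read()

    def prefetch_neighbours(self, x, y, radius=1):
        """Fetch surrounding tiles in the background while the user looks at
        the tiles already on screen (the role played by the Ajax requests)."""
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                if (x + dx, y + dy) not in self.cache:
                    self.pool.submit(self._download, x + dx, y + dy)

    def get_tile(self, x, y):
        if (x, y) not in self.cache:   # cache miss: synchronous round trip
            self._download(x, y)
        return self.cache[(x, y)]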

5 Performance Evaluation Between Using Ajax and Using Non-Ajax in GIS Web Services
When an application is launched, the images to be displayed on the screen are fetched from the server. If the server does not have those images, the server requests them from the specific Web Service. The user may look at the displayed images for a moment. During that time, Ajax is executed, so it fetches the extra surrounding images from the Web Services to the server. When the user drags the screen to watch other images, the already fetched images are displayed faster.
5.1 Evaluation Metrics and Method
5.1.1 Characteristic of Fetching Time in Ajax
While the user looks at the displayed images, the server creates an XMLHttpRequest object and executes it. Extra images are fetched from the Web Services to the server in that operation, so this takes less than the user's waiting time (Fig. 2).

Fig. 2. Evaluation metrics


5.1.2 Image Fetching Time
Displayed images must be fetched from the Web Service regardless of whether Ajax is used. Because the request using Ajax is asynchronous, the image fetching time is less than the fetching time without Ajax (Fig. 2).

5.1.3 User Response Time
When users drag the screen, additional images must be fetched. A non-Ajax application does not hold the images itself, so it accesses the Web Services to fetch images every time. An Ajax application, however, has already fetched the necessary images, so they are displayed faster (Fig. 2).

5.2 Evaluation Environments and Test Data Design


For the test, three computers were used. The configuration of the client system is:
Pentium 4-1.66GHz processor
1Gbytes of memory
Windows XP Pro.
The configuration of the server systems is:
Pentium 3-1GHz *2 (Dual CPU) processors
2Gbytes of memory
RedHat Linux 9
The configuration of the Web Server systems is:
Pentium 4-1.66GHz processor
1Gbytes of memory
RedHat Linux 9
All computers are connected through a commonly used network, and the test data were selected considering generally used data: Google Maps uses 20-Kbyte image tiles, so the tests are performed by receiving 20-Kbyte images.
5.3 Evaluation Results
The fetching time characteristic of Ajax is presented in Table 1. It takes almost 2 seconds for all images (12 tiles) to be fetched, which shows that 2 seconds is a reasonable time to fetch the images while the user waits.
For the image fetching time, we compare Web Services using Ajax with Web Services not using Ajax. Fig. 3 shows that if only one image is fetched, the Ajax version takes longer than the non-Ajax one, because of some overhead in the XMLHttpRequest method. But if multiple images are fetched, the non-Ajax version takes longer, because Ajax can communicate asynchronously.
Table 1. Characteristic of fetching time in Ajax

Number of images   1        3        6        9         12
Ajax               186 ms   475 ms   938 ms   1410 ms   1988 ms

[Figure 3 plots the image fetching time in ms against the number of images for the Ajax and non-Ajax cases.]

Fig. 3. Image fetching time

Fig. 4 shows the user response time. With Web Services using Ajax, the server has already read the extra images and saved them to its repository, so when the client requests images, it can read them directly and the server does not need to connect to the Web Services. However, in the non-Ajax application, when the client requests images the server must connect to the Web Services to fetch the images every time, so it takes longer. The response time is the most important factor for the user interface: if the measured response time is lower, users perceive that the application has good performance. The more images are requested, the larger the gap becomes.
[Figure 4 plots the user response time in ms against the number of images for the Ajax and non-Ajax cases.]

Fig. 4. User response time

6 Conclusion
Our research is aimed at performance enhancement in GIS Web Services. So we have performed evaluation experiments on SOAP and its variants for transmission efficiency. Our tests indicate that SOAP performance is improved when using SOAP/MTOM as the sending protocol, for both raster data and vector data. This is because


SOAP/MTOM uses the XOP optimizing mechanism, so that serialization and deserialization time is reduced. Especially for vector data, a standard SOAP message is made by embedding the vector data as elements or attributes. Although the total message size using SwA/MIME and SOAP/MTOM is bigger than with standard SOAP because of tagging, the roundtrip time is lower, because SwA/MIME and SOAP/MTOM consist of a small SOAP message plus attachments held in boundaries.
In addition to the preceding tests, GIS Web Services performance may be enhanced by using asynchronous technology. We have therefore chosen Ajax, one of the Web 2.0 technologies. Ajax provides asynchronous communication when the client requests the required images from the server. The comparison between plain Web Services and Web Services using Ajax shows that Web Services using Ajax achieve good performance in image fetching and user response time, because the images are fetched beforehand.
These results will contribute to the main integration technology of GIS systems, and they will also help enhance the performance of Web Services dealing with the large data used in GIS systems.

Acknowledgement
This research was supported by the Ministry of Information and Communication,
Korea, under the College Information Technology Research Center Support Program,
grant number IITA-2006-C1090-0603-0031.

References
1. Gudgin, M., et al., 2005, SOAP Message Transmission Optimization Mechanism, (on-line)
http://www.w3.org/TR/soap12-mtom/
2. Jesse James Garrett, Ajax: A New Approach to Web Applications. 2005. 2, http://www.adaptivepath.com/publications/essays/archives/000385.php
3. Kotzinos, D., et al., GIS Web Services and Satellite Imaging as a Framework for
Environmental Monitoring: The Design and Implementation of a Vegetation Indices
Calculation Test Case, Computer Science Department, University of Crete, Greece
4. Mitra, N., 2003, SOAP Version 1.2 Part 0: Primer, (on-line) http://www.w3.org/TR/2003/soap12-part0/
5. Ng, A., et al., 2005, A Study of the Impact of Compression and Binary Encoding on SOAP
performance, Department of Computing, Macquarie University, Australia
6. Sayar, A., et al., 2006, Integrating AJAX Approach into GIS Visualization Web Services,
Community Grids Lab, Indiana University, Bloomington, Indiana
7. Sayar, A., et al., Developing GIS Visualization Web Services for Geophysical
Applications, Community Grids Lab, Indiana University, Bloomington, Indiana
8. Tu, S., et al., 2006, Web Services for Geographic Information System, IEEE internet
Computing, pp. 13-15
9. Seung-Jun, C., et al., A Performance Evaluation of SOAP Variants for GIS Web Services,
ISRS 2006 PORSEC, Vol 2(2006), 615-618
10. Tim O'Reilly, What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software, O'Reilly, 2005. 9
11. Ying, Y., et al., 2004, A Performance Evaluation of Using SOAP with Attachments for e-Science, School of Computer Science, Cardiff University

Aerosol Optical Thickness Retrieval over Land from MODIS Data on Remote Sensing Information Service Grid Node
Jianping Guo1,3, Yong Xue1,2,*, Ying Wang 1,3, Yincui Hu1,3, Jianqin Wang4 ,
Ying Luo1,3, Shaobo Zhong1,3, Wei Wan1,3, Lei Zheng1,3, and Guoyin Cai1,3
1 State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing Applications of Chinese Academy of Sciences and Beijing Normal University, Institute of Remote Sensing Applications, Chinese Academy of Sciences, PO Box 9718, Beijing 100101, China
2 Department of Computing, London Metropolitan University, 166-220 Holloway Road, London N7 8DB, UK
3 Graduate University, Chinese Academy of Sciences, Beijing 100049, China
4 College of Information and Electrical Engineering, China Agricultural University, PO Box 142, Beijing 10083, China
jpguo_irsa@hotmail.com, y.xue@londonmet.ac.uk

Abstract. The signal at the top of the atmosphere will certainly contain information about both the surface and the atmosphere. To derive geophysical parameters from satellite remote sensing images, the atmospheric effects must be decoupled. Aerosol Optical Thickness (AOT), an important aerosol optical property, should be correctly determined to remove the atmospheric effect. The retrieval process is highly time-consuming, and the memory required is too large for a personal computer to run it efficiently. Therefore, to facilitate the process, the SYNTAM model is used to retrieve AOT over a wide range of land, including China and one European area, from MODIS data on the Remote Sensing Information Service Grid Node (RSIN, http://www.tgp.ac.cn) deployed at the Institute of Remote Sensing Applications, Chinese Academy of Sciences. The AOT retrieval results show that the RSIN Grid service is highly efficient and has the potential to be applied to remote sensing parameter inversion.
Keywords: Aerosol Optical Thickness Retrieval, MODIS Data, Remote
Sensing Information Service Grid Node.

1 Introduction
Aerosols are important components of the atmosphere that influence the Earth's energy balance both directly (by scattering and absorbing radiation) and indirectly (by serving as nuclei for cloud formation), and affect the hydrological cycle (IPCC, 2001).
* Corresponding author.



They also affect public health and reduce visibility (Jacovides et al., 1994). Aerosol
particles have heterogeneous spatial and temporal distributions with life spans on the
order of days to weeks (IPCC 2001). Satellite observations of aerosol concentrations
are thought to contribute greatly to reducing the large uncertainty in current estimates of
aerosol-caused radiative forcing (Varotsos et al., 2006).
AOT can be retrieved from a wide range of satellite sensors, including the Moderate Resolution Imaging Spectroradiometer (MODIS), MISR (Multiangle Imaging SpectroRadiometer), AVHRR (Advanced Very High Resolution Radiometer), and POLDER (Polarization and Directionality of the Earth's Reflectances). The operational retrieval of AOT over the sea suggests that the algorithm is mature. Derivation of AOT over land, however, remains subject to great uncertainty owing to the high variability of land surface types. The DDV (dark dense vegetation) method has been proposed to retrieve aerosol properties over dark targets such as water bodies and vegetated areas (Liu et al., 1996).
MODIS is a sensor with the ability to characterize the spatial and temporal characteristics of the global aerosol field. Launched aboard NASA's Terra and Aqua satellites in December 1999 and May 2002, respectively, MODIS has 36 channels spanning the spectral range from 0.41 to 15 μm at three spatial resolutions: 250 m, 500 m, and 1 km. The aerosol retrieval makes use of seven of these channels (0.47-2.1 μm) to retrieve AOT and aerosol properties (Kaufman et al., 1997).
Assuming that the aerosol optical properties are invariable and that the temporal differences between the two satellite overpasses over the same region can be ignored, the Synergy of Terra and Aqua MODIS data (SYNTAM) algorithm is used to retrieve AOT from MODIS (Tang et al., 2005).
Grid computing aggregates heterogeneous resources and provides hardware and software services, supporting application and service composition, workflow expression, scheduling, execution management, and service-level-agreement-based allocation of resources. It has become an enabling environment for data sharing and processing.
Researchers and corporations have developed different types of grid computing platforms. Some support resource pooling or sharing, such as SETI@Home, Condor, and Alchemi, which harness idle CPU cycles from desktop computers in the network; Globus, EU DataGrid, and Gridbus allow sharing of computational and distributed data resources. Guo et al. (2005) proposed a grid-based spatial epidemiology application for scientists from both the biological and spatial information fields. With respect to remote sensing applications based on grid services, Aloisio et al. (2004) proposed a grid architecture, and a grid platform for remote sensing data processing has been developed (Aloisio et al., 2003).
In this paper we focus on the implementation of aerosol optical thickness retrieval on RSIN, developed by the Telegeoprocessing Research Group at the Institute of Remote Sensing Applications (IRSA), Chinese Academy of Sciences (CAS), which provides more than ten geophysical parameter retrieval functions (Luo et al., 2006). The remainder of this paper is organized as follows: in Section 2, the algorithm of AOT retrieval is introduced in detail. The architecture and process of the AOT retrieval service are presented in Section 3. Two AOT retrieval experiments over China and one European

Aerosol Optical Thickness Retrieval over Land from MODIS Data

571

area are performed on RSIN. Finally, some conclusions are drawn about the AOT retrieval implementation by means of the Grid service platform RSIN, and future work is discussed.

2 SYNTAM Model
The Synergy of Terra and Aqua MODIS data (SYNTAM) algorithm is used to retrieve AOT in this paper. The aerosol retrieval model is based on Equation (1) (Xue and Cracknell 1995).
\[
A'_{j,i} = \frac{(A_{j,i}\,b - a_j) + a_j\,(1 - A_{j,i})\,e^{(a_j - b)\,(0.00879\,\lambda_i^{-4.09} + \beta_j\,\lambda_i^{-\alpha_j})\,\sec\theta_j}}
               {(A_{j,i}\,b - a_j) + b\,(1 - A_{j,i})\,e^{(a_j - b)\,(0.00879\,\lambda_i^{-4.09} + \beta_j\,\lambda_i^{-\alpha_j})\,\sec\theta_j}} \qquad (1)
\]

where j = 1, 2 stands for the Terra and Aqua observations, respectively; i = 1, 2, 3 stands for the three visible spectral bands with central wavelengths at 0.47 μm, 0.55 μm and 0.66 μm; λ_i is the central wavelength; A is the Earth's surface reflectance; and A' is the Earth system (top-of-atmosphere) reflectance (Tang et al., 2005).
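For illustration only, the following C# sketch evaluates the right-hand side of the reconstructed Equation (1) for a single band i and overpass j. The function and parameter names are hypothetical, and it assumes that β_j and α_j denote the Ångström turbidity coefficient and wavelength exponent; the sketch mirrors the reconstruction above, not the authors' original code.

using System;

// Illustrative sketch only: evaluates the reconstructed form of Eq. (1).
// All names are hypothetical; beta/alpha are assumed to be the Angstrom
// turbidity coefficient and wavelength exponent for overpass j.
static class SyntamSketch
{
    public static double SystemReflectance(
        double A,       // surface reflectance A_{j,i}
        double a,       // coefficient a_j
        double b,       // coefficient b
        double beta,    // assumed Angstrom turbidity coefficient beta_j
        double alpha,   // assumed Angstrom wavelength exponent alpha_j
        double lambda,  // central wavelength lambda_i (micrometres)
        double theta)   // zenith angle theta_j (radians)
    {
        // Rayleigh term plus aerosol term, scaled by the slant path sec(theta)
        double tau = 0.00879 * Math.Pow(lambda, -4.09) + beta * Math.Pow(lambda, -alpha);
        double e = Math.Exp((a - b) * tau / Math.Cos(theta));
        double numerator = (A * b - a) + a * (1.0 - A) * e;
        double denominator = (A * b - a) + b * (1.0 - A) * e;
        return numerator / denominator;   // the system reflectance A'_{j,i}
    }
}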
The SYNTAM algorithm infers the surface-leaving and path-radiance contributions to the total observed radiance without any assumption about the absolute surface reflectance or the land type, but under the assumption that the wavelength exponent is invariant during the interval between the Terra and Aqua overpass times (10:30 and 13:30 local time, respectively).
The process of AOT retrieval using SYNTAM consists of four steps. First, the input images from both Terra and Aqua are georeferenced and then co-registered. Secondly, radiance calibration is performed to obtain the correct physical parameters, including the sensor zenith angle, the solar zenith angle, and the top-of-atmosphere reflectance. Thirdly, clouds are screened and removed. Finally, all pixel values are passed to SYNTAM as input parameters to derive AOT. The retrieval results may be post-processed and then provided to the end users through the RSIN Grid service.

3 AOT Retrieval Services on RSIN


3.1 Remote Sensing Information Service Grid Node
Remotely sensed data is one of the most important spatial information sources, and so is the research on architectures and technical support for remote sensing information analysis.
RSIN, a significant part of the research on SIG, aims at integrating distributed data, traditional algorithms and software, and computing resources, at providing a one-stop service to everyone on the Internet, and at making good use of everything pre-existing. A node can be very large, containing many personal computers, supercomputers or other

nodes. It can also be as small as just one personal computer. Figure 1 describes the architecture of the remote sensing information analysis and service Grid node. The node is part of the SIG, but it can also provide services independently. There are two entries to the node (Luo et al., 2006):
1. A portal implemented with JSP. Users can visit it through Internet browsers such as Internet Explorer and others.
2. A portal implemented with Web service and workflow technology. It is specific to SIG; other Grid resources and Grid systems can integrate with our node through this portal.

Fig. 1. RSIN web page (URL: http://www.tgp.ac.cn)

3.2 Grid Implementation of AOT Retrieval


Generally speaking, the retrieval process consists of data preparation, data pre-processing, AOT retrieval and post-processing of the results. The Grid service for SYNTAM AOT retrieval is therefore partitioned into four corresponding sub-services, which are invoked one after another. Upon receiving a user's order for the AOT retrieval service via the grid portal, the grid manager initializes an AOT retrieval service and runs the data-searching sub-service to find the MODIS data among the data resources in the Grid pool. The query results are then sent to the data pre-processing service. The pre-processing service sends the pre-processing job to the computing resources and collects the returned results. The pre-processed MODIS data is finally transferred to the SYNTAM processing service. After processing on the computing resources of RSIN, RSIN collects all of the retrieved AOT


results and then post-processes them. The final results are sent to the user via the Grid portal. Meanwhile, the job status is monitored by a grid pool manager.
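The workflow just described can be summarized by the minimal C# sketch below; the class, method and return types (DataSearch, PreProcess, SyntamRetrieval, PostProcess) are illustrative names chosen for this sketch, not the actual RSIN service interfaces.

using System.Collections.Generic;

// Hypothetical sketch of the four chained sub-services of the AOT retrieval Grid service.
class AotRetrievalService
{
    public string Run(string userOrder)
    {
        List<string> modisFiles = DataSearch(userOrder);         // 1. locate MODIS data in the Grid pool
        List<string> preprocessed = PreProcess(modisFiles);      // 2. calibrate, geo-reference, merge/clip
        List<string> aotResults = SyntamRetrieval(preprocessed); // 3. run SYNTAM on the job nodes
        return PostProcess(aotResults);                          // 4. merge and format for the user
    }

    List<string> DataSearch(string order)          { /* query registered metadata */ return new List<string>(); }
    List<string> PreProcess(List<string> files)    { /* submit pre-processing jobs */ return files; }
    List<string> SyntamRetrieval(List<string> f)   { /* submit SYNTAM jobs */ return f; }
    string PostProcess(List<string> results)       { /* merge and transform results */ return "aot-product"; }
}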
In the following paragraphs, data query schemes, data pre-processing, job
management and AOT post-processing are discussed in detail.
3.2.1 Data Query
The MODIS data are distributed on RSIN. Their metadata describe the MODIS data, including spatial range, producer, quality, date and time, processing methods, satellite, and so on. The data-searching service searches the registered metadata using SQL and finds out where the required data are hosted. The query results are then returned to the data pre-processing service.
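As an illustration of such an SQL-based metadata lookup, a query of the following kind could be issued; the table and column names (modis_metadata, acq_date, the bounding-box columns) are invented for this sketch and are not the actual RSIN schema.

// Hypothetical sketch of building a metadata query; names are illustrative only.
static class MetadataQuery
{
    public static string Build(string satellite, string date,
                               double minLon, double maxLon, double minLat, double maxLat)
    {
        return "SELECT host, file_path FROM modis_metadata " +
               $"WHERE satellite = '{satellite}' AND acq_date = '{date}' " +
               $"AND min_lon <= {maxLon} AND max_lon >= {minLon} " +
               $"AND min_lat <= {maxLat} AND max_lat >= {minLat}";
    }
}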
3.2.2 Data Pre-processing
Before the SYNTAM AOT retrieval process, geometric corrections (including calibration, geo-referencing, merging and clipping, etc.) and radiometric calibration should be performed. After that, the 16 input parameters required by SYNTAM are in ASCII format and ready to be passed to the next phase.
Geo-referencing of MODIS data is time-consuming and computationally intensive. Combined with the calibration, the geo-referencing task is submitted to the Grid. The algorithms and the partition strategy can be found in the paper by Hu et al. (2005).
The merging, clipping and format transformation are combined into a single step. At regional or global scale, a partition strategy must be considered if there is no high-powered computer in the Grid pool that could handle the merging process. We apply a dynamic filling method to fulfil the task. First, the requested range is divided into regular pieces according to the number of available computers. The sub-range information and the geo-referenced data are sent to the job nodes in the grid environment. Secondly, the job nodes search the data within the specified range and fill the data into the correct location. After the required 16 parameter files are ready on the job node, the SYNTAM AOT service starts up.
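The partition step of the dynamic filling method described above can be sketched as follows; the BoundingBox type and the even split into latitude bands are assumptions made for this illustration, not the actual partitioning code.

using System.Collections.Generic;

// Hypothetical sketch: split a requested geographic range into regular bands,
// one per available job node, as in the dynamic filling strategy described above.
struct BoundingBox
{
    public double MinLon, MaxLon, MinLat, MaxLat;
    public BoundingBox(double minLon, double maxLon, double minLat, double maxLat)
    { MinLon = minLon; MaxLon = maxLon; MinLat = minLat; MaxLat = maxLat; }
}

static class RangePartitioner
{
    public static List<BoundingBox> Split(BoundingBox request, int availableNodes)
    {
        var pieces = new List<BoundingBox>();
        double bandHeight = (request.MaxLat - request.MinLat) / availableNodes;
        for (int i = 0; i < availableNodes; i++)
        {
            double lo = request.MinLat + i * bandHeight;
            // each piece is sent to one job node together with the geo-referenced data
            pieces.Add(new BoundingBox(request.MinLon, request.MaxLon, lo, lo + bandHeight));
        }
        return pieces;
    }
}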
3.2.3 Job Management and Post-processing
The task is partitioned into many sub-jobs, each identified by a unique Grid job identifier. The job manager monitors the job status (running or idle) during the process. The finished status is reached when the user retrieves all the output files produced by a job. Jobs are check-pointed for later restart. When a chosen resource refuses to accept a job, the job is vacated and waits for the manager to reallocate it to another computing element. The grid manager collects all the results returned from the job nodes, and the results are merged dynamically. After all of the results are merged, the merged files are transformed into the format the user requires, and the transformed files are finally transferred to the user.
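A minimal sketch of the job life cycle described above is given below; the enum values, field names and reallocation step are hypothetical and only illustrate the running/idle/finished/vacated states and the reallocation behaviour.

using System.Collections.Generic;

// Hypothetical sketch of the job states and reallocation described above.
enum JobStatus { Idle, Running, Vacated, Finished }

class GridJob
{
    public string Id;                       // unique Grid job identifier
    public JobStatus Status = JobStatus.Idle;

    // Called when the chosen resource refuses the job: vacate it and let the
    // manager reallocate it to another computing element (restart from checkpoint).
    public void VacateAndReallocate(IList<string> computingElements, int nextIndex)
    {
        Status = JobStatus.Vacated;
        string target = computingElements[nextIndex % computingElements.Count];
        Submit(target);
    }

    void Submit(string computingElement) { Status = JobStatus.Running; }
}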

4 Experiments and Results


MODIS Level 1B data covering a European area (northern France, part of England, and the Netherlands) and China, acquired from NASA on July 2, 2006, and November 3, 2004, respectively, are input to RSIN.


For the AOT retrieval over Europe, Figure 2 shows the preliminary results. The retrieval results are validated against collocated AERONET in-situ data in Western Europe, where 4 AERONET sites are available. Figure 3 compares the AOT measurements at the Oostende AERONET site with the collocated MODIS-based retrievals averaged over the area within 5 km around Oostende. It shows good agreement between the SYNTAM retrievals and the AERONET measurements: the average difference between the SYNTAM-retrieved AOT values and the AERONET measurements is 0.0165.

Fig. 2. AOT retrieval from MODIS data on RSIN for a part of Europe on July 2, 2006

5 Conclusions
The implementation of AOT retrieval on RSIN has been discussed and tested. In order to check for consistent AOD retrievals, validation against 4 AERONET sites has been incorporated. The few examples in this paper show that the retrieved AOD compares favorably with the collocated AERONET sun-photometer measurements.
Our tests are based on RSIN. The experiments are successful, but there are some aspects we should improve in the future. For example, the load balancing on RSIN needs improvement. Moreover, the partition strategy does not consider the differences among the computing elements: when a job is submitted to a node with lower computing ability, the overall efficiency is affected, whereas when there are high-powered computers in the grid pool, it may be more efficient to submit most of the jobs to them. A scheduling scheme should therefore be added so that more capable nodes do more work. Another


aspect we should improve is the data management. In our experiments, the database is file-based. When the data are centralized in one node, the transfer path becomes congested. In the future, a distributed database should be built with a dynamic replica scheme to reduce the pressure on the data source nodes.

Fig. 3. AOT in Oostende (N 51°13'30'', E 02°55'30''), as a function of wavelength

Acknowledgement
This publication is an output from the projects "Multi-scale Aerosol Optical Thickness Quantitative Retrieval from Remotely Sensing Data at Urban Area" (40671142) and "Aerosol fast monitoring modelling using MODIS data and middlewares development" (40471091), funded by NSFC, China. We are grateful to the MODIS team for making available the data used here. Many thanks go to the PI investigators of the AERONET sites used in this paper and to the Sino-EU Dragon program.

References
1. Aloisio, G., Cafaro, M.: A dynamic Earth observation system. Parallel Computing (2003)
1357-1362
2. Aloisio, G., Cafaro, M., Epicoco, I., Quarta, G.: A problem solving environment for
remote sensing data processing. In Proceeding of ITCC 2004: International Conference on
Information Technology: Coding and Computing held in Las Vegas, NV, USA on 5-7
April 2004,Vol.2. 56-61
3. Cannataro, M.: Clusters and grids for distributed and parallel knowledge discovery.
Lecture Notes in Computer Science (2000) Vol. 1823, 708-716
4. Guo Jianping, Yong Xue, Chunxiang Cao, Wuchun Cao, Xiaowen Li, Jianqin Wang,
Liqun Fang: eMicrob: a Grid-based Spatial Epidemiology Application. Lecture Notes in
Computer Science (2005) Vol. 3516, 472-475
5. Hu Yincui, Xue Yong, Tang Jiakui, Zhong Shaobo, Cai Guoyin: Data-parallel
Georeference of MODIS Level 1B Data Using Grid Computing. Lecture Notes in
Computer Science (2005) Vol. 3516, 883-886


6. Intergovernmental Panel on Climate Change: Climate Change 2001: The Scientific Basis (Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change). Cambridge University Press, Cambridge (2001)
7. Jacovides, C.P., Varotsos, C., Kaltsounides, N.A., Petrakis, M., Lalas, D.P.: Atmospheric
turbidity parameters in the highly polluted site of Athens basin. Renewable Energy (1994)
4 (5), 465-470
8. Kaufman, Y.J., Tanré, D., Remer, L.A., Vermote, E., Chu, A., and Holben, B.N.: Remote
sensing of tropospheric aerosol from EOS-MODIS over the land using dark targets and
dynamic aerosol models. J. Geophys. Res. (1997) Vol.102, 17051-17067
9. Luo Ying, Yong Xue, Yincui Hu, Chaolin Wu, Guoyin Cai, Lei Zheng,Jianping Guo, Wei
Wan, Shaobo Zhong: Remote Sensing Information Processing Grid Node with LooseCoupling Parallel Structure. Lecture Notes in Computer Science (2006) Vol. 3991,
876-879
10. Luo Ying, Yong Xue, Chaolin Wu, Yincui Hu, Jianping Guo, Wei Wan, Lei Zheng,
Guoyin Cai, Shaobo Zhong, Zhengfang Wang: A Remote Sensing Application Workflow
and Its Implementation in Remote Sensing Service Grid Node. Lecture Notes in Computer
Science (2006), Vol. 3991, 292 -299
11. Running, S.W., Justice, C.O., Salomonson, V.V., Hall, D., Barker, J., Kaufman, Y.J.,
Strahler, A.H., Huete, A.R., Muller, J.P., Vanderbilt,V., Wan, Z.M., Teillet, P., Carneggie,
D.: Terrestrial remote sensing science and algorithms planned for EOS/MODIS.
International Journal of Remote Sensing (1994), 15(17), 3587 3620
12. Tang Jiakui, Xue Yong, Yu Tong, Guan Yanning: AOT Determination by Exploiting the
Synergy of TERRA and AQUA MODIS (SYNTAM). Remote Sensing of Environment
(2005), 94 (3), 327-334
13. Varotsos, C.A., Ondov J. M., Cracknell A. P., Efstathiou, M.N., Long-range persistence in
global Aerosol Index dynamics. International Journal of Remote Sensing (2006), 27 (16),
3593-3603
14. Xue, Y., Cracknell, A.P.: Operational Bi-Angle Approach to Retrieve the Earth Surface
Albedo from AVHRR data in the Visible Band. International Journal of Remote Sensing
(1995), 16(3), 417-429

Universal Execution of Parallel Processes: Penetrating NATs over the Grid
Insoon Jo1 , Hyuck Han1 , Heon Y. Yeom1 , and
Ohkyoung Kwon2
1 School of Computer Science and Engineering, Seoul National University, Seoul 151-742, Korea
{ischo,hhyuck,yeom}@dcslab.snu.ac.kr
2 Supercomputing Center, KISTI, Daejeon, Korea
okkwon@kisti.re.kr

Abstract. Today, clusters are very important computing resources and many computing centers manage their clusters in private networks. But parallel programs may not work in private clusters. Because hosts in private clusters are not globally reachable, hosts behind different private clusters cannot reach each other directly in order to communicate. It would certainly be a huge loss of resources if private clusters were excluded from the computing for this reason. There has been much research on this issue, but most of it concentrates on user-level relaying because it is a general and easily implementable solution. However, even when well implemented, user-level solutions result in much longer communication latency than kernel-level solutions. This paper adopts a novel kernel-level solution and applies it to MPICH-G2. Our scheme is generally applicable, simple and efficient. The experimental results show that our scheme incurs very little overhead except when small messages are transmitted. That is, it supports a more universal computing environment by including private clusters with remarkably little overhead.

1 Introduction

Grid is a computing paradigm aiming at large-scale resource sharing, and clusters are one of its key components. By sharing globally distributed resources such as data, storage, and applications, it pursues benefits in both cost and efficiency. Many parallel programming models on top of the Grid have been proposed to make full use of the integrated resources. Currently, the Message Passing Interface (MPI) is a de facto standard, and many studies aim to advance MPI.
It is very typical that clusters assign private addresses to their working nodes, and such nodes are unreachable from public networks. Therefore, it is impossible for any two nodes behind different private clusters to communicate directly. It would be a huge loss of resources if private clusters were excluded from the Grid for the reason above. Hence there has been much work on making parallel programs executable over clusters in private networks [1,2,3]. Most of them



relay messages using a separate application, because this is a more general solution than a kernel-level strategy. However, even when well implemented, user-level relaying results in an increase in communication latency, since every message destined for cluster nodes in private networks must pass through a relaying application. This might lead to a considerable decline in throughput, especially when nodes frequently exchange large messages. Although the kernel-level approach is superior to user-level solutions in terms of performance, it is generally not used due to its poor portability. Notice that the latter can be configured at the user level.
The main objective of our study is to devise a portable kernel-level relaying method and to apply it to MPICH-G2, a grid-enabled implementation of MPI. MPICH-G2 identifies each node by hostname and uses it to make point-to-point connections between the computing nodes. Hosts in private networks do not have a globally unique hostname, so connection attempts to them will fail unless both source and destination hosts belong to the same private cluster. We call our scheme MPICH-GU because it is a more universal version that is runnable over private clusters. We measured the communication latency of MPICH-GU and MPICH-G2. We also benchmarked them using the NPB (NAS Parallel Benchmarks) suite [4]. The results show that the performance of our scheme is nearly the same as that of MPICH-G2 except for small messages. In other words, it supports a more universal Grid environment including private clusters with very little overhead. Needless to say, it can remarkably outperform user-level solutions.
The rest of this paper is organized as follows. Section 2 introduces various approaches to traversing private networks. Section 3 details the design and implementation of MPICH-GU. Section 4 presents our experimental results. Section 5 concludes this paper.

2 General Concepts

In this section, we briefly review generally known schemes for implementing inter-host communication across private networks.
2.1 Problem Domain

In general, each host behind a private network does not have a unique address on the public network. Thus it remains anonymous and inaccessible from external hosts. Such unreachability does not trouble outgoing-centric applications such as web browsers.
However, as P2P programs for gaming, video conferencing, or file sharing become widely used, it is inevitable to solve this unreachability. P2P applications essentially need inter-host communication, but incoming connections from outside hosts toward private networks will not be set up because of their NATs [5]. Most NATs function as an asymmetric bridge between a private network and public networks. That is, they permit outgoing traffic but discard incoming traffic


unless it belongs to an existing session initiated from a host within the private network.
There has been some work on this issue [6,7,8], and the currently known schemes are relaying, connection reversal, and UDP hole punching.

[Figure: (a) Relaying: Host A and Host B, each behind a NAT, exchange messages through the public server S over their existing connections. (b) Connection Reversal: Host B tries a direct connection to Host A and fails, requests S to relay the connection request to Host A, and Host A then tries the reversed connection to Host B.]
Fig. 1. Generally Known NAT Traversal Schemes

2.2 Relaying

In the relaying scheme, a special host acts as an intermediary between normal hosts. Suppose that hosts A and B are behind different private networks and both know host S, which must have a public address so as to be reachable from any host. When A wants to send a message to B, it does not attempt a direct connection, because such a connection would certainly be dropped by the other's NAT. As shown in Figure 1(a), it simply sends messages to S via the existing connection between A and S, and S delivers them to B using the already-established connection between S and B. When B wants to communicate, the reverse process occurs.
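A minimal sketch of this user-level relaying idea is shown below in C# (the relay S copies bytes arriving from A onto its connection to B; the opposite direction would be handled symmetrically). It is only an illustration of the mechanism, not code from MPICH-GU or any of the cited systems.

using System.Net.Sockets;

// Minimal sketch of a user-level relay running on host S.
static class Relay
{
    public static void Forward(NetworkStream fromA, NetworkStream toB)
    {
        var buffer = new byte[8192];
        int n;
        while ((n = fromA.Read(buffer, 0, buffer.Length)) > 0)
        {
            // every relayed message crosses user space at S, adding latency
            toB.Write(buffer, 0, n);
        }
    }
}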
Relaying is the most general approach. It works as long as the relaying host is adequately configured. However, it has major shortcomings. First, each host depends heavily on the relaying host, which can become a bottleneck. Secondly, communication latency is increased due to the additional overhead of TCP/IP stack traversal and of each switch between kernel and user mode. It has been reported by Müller et al. [9] that relaying had a higher latency and achieved only about half the bandwidth of kernel-level solutions such as a gateway or NAT.
2.3 Connection Reversal

Connection reversal provides a direct connection only when at least one of the two hosts is globally reachable. As in Figure 1(b), suppose that there are two hosts A and B, and a server S. A is behind a private network, B has a public IP address, and S can reach both of them. A can establish a direct connection to B, but B cannot connect to A because A is not globally reachable. After failing to initiate the connection, B uses S to relay a connection request to A. On receiving such a request, A tries a reversed connection to B. Connection reversal is too restrictive to be generally used, because it does not work at all if the two hosts reside in different private networks.

3 Design and Implementation
3.1 Overall Architecture of MPICH-GU

MPICH-GU consists of three kinds of hierarchical process-managing modules (central manager, cluster manager and local job manager) and of working processes, as shown in Figure 2. Each manager manages lower-level managers or processes and relays control messages between the higher and lower levels. For example, when a user requests a job, the central manager distributes it to its cluster managers, and each cluster manager does the same to its local job managers. A local job manager then forks and executes a working process.

[Figure: the Central Manager controls Cluster Managers, which control Local Job Managers, which manage MPI Processes; managers exchange control messages, while MPI processes exchange MPI messages (penetrating NATs).]
Fig. 2. Overall Architecture of MPICH-GU

The channel establishment of MPICH-GU proceeds as follows. Each MPI process creates a listener socket to accept channel-opening requests from others, and then transmits its protocol information containing the address of the listener. This information is sent to the central manager through the local job manager and the cluster manager in turn. After gathering such information from all processes, the central manager broadcasts it to every process through the cluster managers and local job managers. On receiving it, each process constructs a commworldchannels table, in which the i-th entry is the channel information of the process with rank i. This represents the real context of the individual channel, namely the currently listening address, a pointer to the established connection, a pointer to the sending queue, and so on. When a process has something to send, it attempts a connection to the destination process using the information in this table.
3.2 Penetrating NATs

Our scheme utilizes a common property of NATs. Whenever it receives a message, a NAT decides, based on its table, either to drop it or to redirect it to some host in its network. The NAT will certainly drop the message unless it belongs to an existing session initiated from an inner host. If a host wants to receive all messages sent to it from unconnected hosts, its address should be registered in the NAT table. Therefore, such a host requests its NAT to forward incoming messages, that is, to register its address in the table. Then all messages sent to that address will be redirected to the owner of the address. Notice that any host outside the NAT connects to a host behind the NAT using the address of the NAT.
Our scheme requires a forwarding-enabled NAT as a precondition. Once this condition is satisfied, the following steps should be implemented:
- Every host behind a private network requests forwarding of incoming messages, and such requests are adequately handled by the NAT.
- Each host can get the address information of any peer with which it wants to communicate from the hierarchical process management system.
- Each host can connect to any host using the information obtained from the process management system.
We implemented the MPICH-GU library based on MPICH version 1.2.6 and the Globus Toolkit version 4.0.1. More details of the implementation are as follows. As stated in Section 3.1, at the initial stage each MPI process creates a listener socket and then reads the value of an environment variable named FRONT_NAME. Only for a process behind a private network does it contain the hostname of the machine running its NAT; otherwise it is null. Therefore, if it has a value, the process requests the local job manager to register a globally reachable address linked to its own listener socket.
The cluster manager delivers this request to the NAT and returns the public address back to the requestor. Each process then transmits its protocol information, containing two endpoints consisting of (hostname, TCP port) pairs, to the central manager. For hosts behind private networks, these are the private address and the public address obtained from the NAT; for the others, they are identical public addresses. On receiving this information from the central manager, each process constructs a commworldchannels table in which each entry contains the two endpoints of an individual process.
When a process has something to send, it attempts just one connection, chosen according to the network topology. It first checks the public endpoint of the destination process. If that hostname matches its own FRONT_NAME, both processes belong to the same private network, so it attempts to establish a channel using the private endpoint. Otherwise it uses the public endpoint.
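The endpoint-selection rule just described can be summarized by the following hedged C# sketch; the ChannelEntry record and its field names are hypothetical stand-ins for the entries of the commworldchannels table, not MPICH-GU source code.

// Hypothetical sketch of the endpoint choice described above.
class ChannelEntry
{
    public string PrivateHost; public int PrivatePort;   // listener address inside the cluster
    public string PublicHost;  public int PublicPort;    // address registered at the NAT
}

static class EndpointSelector
{
    // frontName is the sender's FRONT_NAME value (the hostname of its NAT),
    // or null for a publicly addressable host.
    public static (string host, int port) Choose(ChannelEntry dest, string frontName)
    {
        bool samePrivateNetwork = frontName != null && frontName == dest.PublicHost;
        return samePrivateNetwork
            ? (dest.PrivateHost, dest.PrivatePort)   // both behind the same NAT: connect directly
            : (dest.PublicHost, dest.PublicPort);    // otherwise use the forwarded public endpoint
    }
}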


3.3 Discussion

Some might wonder how many NATs support the forwarding scheme. According to the research of Bryan et al. [8], about 82% of the tested NATs support forwarding for UDP, and about 64% support forwarding for TCP. Saikat et al. [10] also showed that TCP forwarding was successful at an average rate of 88% for existing NATs and 100% for pairs of certain common types of NAT. These results show that running parallel processes over private clusters does not require a special mechanism or device, and our scheme can be applied to many parallel systems such as OpenMP [11] and MPJ [12].
Our scheme can manage NAT forwarding entries simply and efficiently. At the initialization phase, each MPI process checks its own network topology. Then, if needed, it requests forwarding via the local job manager. The request is delivered to its NAT by the cluster manager and handled adequately. Notice that if the NAT is based on Linux, well-known commands such as iptables can handle the request. The central manager can also detect abortion or completion of the job. If such events happen, the central manager notifies the cluster managers, and each cluster manager requests its NAT to stop relaying.

4 Experimental Results

We have conducted two kinds of experiments to measure the performance of our scheme. The first measures the communication latency, and the second investigates how the scheme affects performance under a real workload. To grasp the relative performance, we carried out the same experiments using MPICH-G2 as well. We used 7 machines with 850 MHz Pentium III processors and 1 GB RAM, running Linux 2.4.20. The network used in all experiments is Gigabit Ethernet. For the MPICH-GU experiments, we configured two private networked clusters with 3 nodes. For the MPICH-G2 experiments, we used the same machines but assigned them public addresses.
4.1 Communication Performance

To evaluate communication latency, we wrote a simple ping-pong application. It measures the round-trip time for various-sized messages by averaging the time spent to send and receive 20 messages.
As shown in Figure 3(a), the latency of MPICH-GU is very close to that of MPICH-G2. Figure 3(b) shows the round-trip time ratio of MPICH-GU to MPICH-G2. As can be seen from the figure, the bigger the message size, the smaller the latency gap becomes. For example, for messages of size 40 KB MPICH-GU takes 12.5% longer than MPICH-G2, but for messages of size 80 MB the overhead of MPICH-GU is negligible.
4.2 Workload Performance

To evaluate performance under an actual workload, we used the NPB suite, version 3.2. We executed two of its benchmarks, BT and SP, with class A and class B.


[Figure: two panels, (a) Communication Latency and (b) Ratio of Communication Latency]
Fig. 3. Communication Performance. (a) represents communication latency in running applications using MPICH-GU and MPICH-G2. (b) shows their ratio.

Fig. 4. Ratio of Execution Time (MPICH-GU to MPICH-G2)

Since they exchange relatively large messages, they are sufficient to show how MPICH-GU affects workload performance. Our experimental results are shown in Figure 4. As expected, our scheme incurs very little overhead compared with MPICH-G2: at most 4.6%.

5 Conclusion

In this paper, we have presented MPICH-GU, which enables a more universal Grid environment including private clusters. The penetrating scheme that MPICH-GU exploits is not only generally applicable but also a simple kernel-level strategy. By implementing a kernel-level scheme, MPICH-GU can deliver nearly identical performance to MPICH-G2 except when small messages are transmitted. Our experimental results show that it supports private clusters with very little overhead, which is a remarkable result compared with the performance of previously implemented user-level strategies.

References
1. K. Park, S. Park, O. Kwon and H. Park. MPICH-GP: A Private-IP-enabled MPI over Grid Environments. ISPA, 2004.
2. E. Gabriel, M. Resch, T. Beisel and R. Keller. Distributed computing in a heterogeneous computing environment. EuroPVM/MPI, 1998.
3. Jack Dongarra, Graham E. Fagg, Al Geist, James Arthur Kohl, Philip M. Papadopoulos, Stephen L. Scott, Vaidy S. Sunderam and M. Magliardi. HARNESS: Heterogeneous Adaptable Reconfigurable NEtworked SystemS. HPDC, 1998.
4. Michael Frumkin, Haoqiang Jin and Jerry Yan. Implementation of NAS Parallel Benchmarks in High Performance Fortran. NAS Technical Report NAS-98-009, 1998.
5. P. Srisuresh and M. Holdrege. IP network address translator (NAT) terminology and considerations. RFC 2663, 1999.
6. B. Ford, P. Srisuresh and D. Kegel. Peer-to-peer communication across middleboxes. Internet Draft draft-ford-midcom-p2p-01, Internet Engineering Task Force, Work in progress, 2003.
7. J. Rosenberg. Traversal Using Relay NAT (TURN). draft-rosenberg-midcom-turn-04, 2004.
8. Bryan Ford, Pyda Srisuresh and Dan Kegel. Peer-to-Peer Communication Across Network Address Translators. USENIX 2005, pp. 179-192, 2005.
9. M. Müller, M. Hess and E. Gabriel. Grid enabled MPI solutions for Clusters. CCGRID'03, pp. 18-24, 2003.
10. S. Guha and P. Francis. Characterization and Measurement of TCP Traversal through NATs and Firewalls. IMC 2005, 2005.
11. OpenMP Architecture Review Board. OpenMP, http://www.openmp.org.
12. B. Carpenter, V. Getov, G. Judd, T. Skjellum and G. Fox. MPJ: MPI-like Message Passing for Java. Concurrency: Practice and Experience, Volume 12, Number 11, September 2000.

Parallelization of C# Programs Through Annotations
Cristian Dittamo, Antonio Cisternino, and Marco Danelutto
Dept. of Computer Science, University of Pisa,
Largo B. Pontecorvo 3, 56127 Pisa, Italy
{dittamo,cisterni,marcod}@di.unipi.it

Abstract. In this paper we discuss how the extensible meta-data featured by virtual machines, such as the JVM and the CLR, can be used to specify the parallelization aspect of annotated programs. Our study focuses on annotated CLR programs written using a variant of C#; we developed a meta-program that processes these sequential programs in their binary form and generates optimized parallel code. We illustrate the techniques used in the implementation of our tool and provide some experimental results that validate the approach.
Keywords: Code annotations, generative and parallel programming.

1 Introduction

The correct and efficient design of parallel programs requires several different concerns to be considered; these are both difficult to separate during program development and dependent on the target architecture features. Parallel aspects include how the computation is performed using processing elements (such as processes and threads), and how these communicate. These are non-functional aspects of a program, since they do not contribute to defining the computation but only how it is performed. Subscribing to the separation-of-concerns concept typical of Aspect Oriented Programming (AOP) [3], we recognize the importance of using proper tools (meta-data) to program the non-functional aspects related to parallelism exploitation. Meta-data (annotations) can be used by the programmer to suggest consciously how a parallel application can be automatically derived from the code. We will describe a proper meta-program (instead of a compiler) that can efficiently handle such meta-data. This differs from the standard AOP approach, where join points are defined using patterns, making the programmer unaware of the program transformation details that will be applied afterwards. Moreover, in our case the meta-program does not restrict itself to injecting code at the join points (identified by the annotations) as an aspect weaver would do; it also performs a local analysis of the program to make its decisions about how to render the sequential code parallel. Since annotations are explicitly defined in the program code, it is possible to operate on the binary form of the program, allowing the transformation to be adapted to a specific execution environment. Several approaches have been proposed in the past to relieve programmers from the



burden associated with the design of parallel applications: parallel compilers [5], parallel languages [10], parallel libraries [11] and more advanced programming models [12,13]. All these proposals rely on explicit support from the compiler (or from tools involved in the compilation process) to generate a parallel version of the program, and they do not provide any mechanism to avoid substantial code tangling, where the application's core functionality is cross-cut by the concern of parallel execution. In structured parallel programming systems based on the algorithmic skeleton concept, the programmers qualitatively express parallelism directly at the source level, by properly instantiating and composing a set of pre-defined parallelism exploitation patterns/skeletons. The compiler and run-time system is then in charge of adapting the provided skeleton structure to the target architecture features. Our approach is similar, though in our case the code transformation is performed at load time on the binary form of the program, relying on the compilation infrastructure typical of the virtual machine. Programming languages targeting runtimes, such as [1,2], provide the programmers with general mechanisms (extensible meta-data) to insert annotations into programs that can be read at runtime through the reflection APIs. Annotations are ignored by the execution engine, but they can be accessed by meta-programs in order to perform program analysis and manipulation at runtime. Annotations let programmers provide all kinds of non-functional knowledge to the runtime system. In this respect annotations differ from the directives traditionally used by compilation systems such as OpenMP and HPF: the compiler is not responsible for their processing. Prior knowledge about these systems, however, may guide us in defining annotations to express parallel computations, and the semantics that a meta-program will attribute to them. In this paper we present Particular1, a meta-program which reads annotated programs written in [a]C# [9] (an extension of C#, read "annotated C#"). [a]C#'s main contribution is to extend the custom annotation mechanism provided by C# by allowing code blocks inside method bodies to be annotated, in addition to declarations (classes, fields, and methods are examples), as shown in the following [a]C# code:
class MyAnnotAttribute : ACS.CodeAttribute {}
class AnotherAnnotAttribute : ACS.CodeAttribute {}

class Example {
    public static void Main(string[] args) {
        [MyAnnot] { /* Code under the aegis of the MyAnnot attribute */
            [AnotherAnnot] { /* Code inside a nested annotation */ }
        }
        [AnotherAnnot, MyAnnot] /* Single statement */
    }
}

the programmer simply encloses the name of the annotation inside square brackets, as is customary in C#, to annotate a block or a statement. A pre-compilation step must be performed in order to transform ACS code into standard C# code. Annotations can be retrieved at runtime through an extension of the reflection API that returns attributes in the form of a forest (to represent annotations with nested
1 PARallelizaTIon of CUstom annotated intermediate LAnguage programs at Runtime.


scopes), where each node allows obtaining a cursor to the intermediate-language instructions enclosed in the annotation.
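For comparison, the standard C# reflection API already allows declaration-level attributes to be read at runtime, as in the sketch below; the block-level annotations of [a]C# are retrieved through its extended API (the forest of annotations mentioned above), which is not shown here. The TraceAttribute and Worker types are invented for this illustration.

using System;
using System.Reflection;

class TraceAttribute : Attribute {}          // ordinary declaration-level custom attribute

class Worker
{
    [Trace]
    public void Step() { /* ... */ }
}

static class AttributeDemo
{
    static void Main()
    {
        MethodInfo m = typeof(Worker).GetMethod("Step");
        // Standard reflection: attributes attached to declarations can be enumerated at runtime.
        foreach (object attr in m.GetCustomAttributes(typeof(TraceAttribute), false))
            Console.WriteLine("Found " + attr.GetType().Name + " on " + m.Name);
    }
}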
In the rest of this paper we discuss the set of annotations introduced by our system and made available to the programmer for providing hints to our tool. We also explain the strategies adopted for rendering the parallel version of an annotated program. Finally, we provide experimental results on the performance of the generated code.

2 Prototype Framework

Particular defines a set of annotations that provide hints on how to transform a sequential C# program into a parallel one. Since we were interested in exploring the viability of the approach, we decided to use a small and simple set of annotations. After the results we obtained, we believe that more annotations could be used, supporting a wider set of parallel programming paradigms and perhaps leading to better performance. We introduced two annotations, named Parallel and Process. The former denotes the parts of the code subject to parallel execution, and the latter denotes the specific parts that have to be included in an independent control flow (a thread when targeting SMP architectures, or a remote process when targeting clusters/networks of workstations or grids). In general, it is Particular that defines the semantics of the annotations by manipulating the annotated code. The Particular tool should be used before an annotated executable is loaded. This way, it is possible to customize the schema used for generating the parallel version depending on the actual architecture where the program will run. Particular reads an annotated executable program, in its intermediate (bytecode) form, and produces a new parallel program exploiting these annotations. It is important to notice that the entire transformation does not involve the original source program. Our approach is based on the manipulation of binary programs, therefore Particular can adopt strategies depending on the target architecture used for program execution. It may decide how annotations should best be rendered in the final program and what mechanisms should be used (threads, processes, etc.). Since our annotations may be of some help to the execution environment, a Just-In-Time (JIT) compiler could be extended to read these annotations. This extension, however, would require a tighter integration between annotated binaries and the execution environment. The basic parallelism exploitation pattern used to handle Parallel-annotated sections of code is the master/worker paradigm: a master process delivers tasks to be computed to a worker, picked from a worker pool according to a proper scheduling strategy. The worker completely processes the task, computing a result which is eventually given back to the master process. This is repeated as long as there are new tasks to be computed. In our approach, each Process annotation leads to the generation of a master process/thread and of a set of worker processes/threads. Whether processes or threads are generated depends on the kind of target architecture at hand, and generation in this context is a synonym of proper IL code generation. With our framework the developer can use any standard compiler (Microsoft, Rotor, Mono) targeting the CLR. Whenever a


new architecture or parallelism pattern is introduced, only Particular needs to be modified.
Design pattern to achieve parallelism. Our aim is to let the programmer focus on the functional aspects of the application, without having to concentrate at the same time on execution aspects. Annotations serve this purpose, and the user may even be unaware of the implementation details of parallel execution (though he is expected to understand the meaning of the annotations). For each annotation found, the meta-program creates a method containing the IL code of the annotated section (taking care to ensure the appropriate handling of variable accesses); the annotated block is replaced by an asynchronous method call to the corresponding (new) method. Moreover, Particular emits the IL instructions needed to call the (new) method (i.e., to read the actual parameters and to return values). The code and meta-data of every new method are loaded into a new library referenced by the original one.
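In plain C#, the effect of this transformation is roughly the following hand-written sketch (not Particular's generated output): the annotated block becomes a method, and the call site invokes it asynchronously on its own control flow.

using System.Threading;

// Hand-written sketch of what the transformation produces conceptually.
class Transformed
{
    public void Original()
    {
        // ... code before the annotated block ...
        Thread worker = new Thread(DoWork);   // asynchronous call replacing the annotated block
        worker.Start();
        // ... code after the annotated block ...
        worker.Join();
    }

    void DoWork()
    {
        // the IL of the annotated section would be copied here by the meta-program
    }
}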
State Analysis. Using annotations, developers define new scopes inside a method body. Local variables can be either inside or outside a particular scope: when a scope is extruded from its original location, the meta-program should ensure that the state is properly handled, and the portion of the local state of the original method body should be made accessible to all the fragments generated from it. Unfortunately, the intermediate-language form of the code does not provide a notion of scope for local variables, which are flattened into an array, so we have to reconstruct a plausible scope for each of them. When an annotated scope is extruded into a new method m, a new type State is created; State contains a field for each variable that is not local with respect to the annotated scope. The signature of method m is extended with a parameter of type State in order to receive the non-local state of the annotated scope. Therefore, values not local to m are passed to it as input arguments, thus making them local. These arguments are exposed directly in the signature of the extruded methods generated by the tool. The only technical difference is that these variables are accessed with different IL instructions. Another critical issue in dealing with variables that are not local to a scope is that they are potentially accessible by many concurrent scopes, so Particular has to inject IL instructions to synchronize access to them. It is important to note that the programmer must be aware of this behaviour, since the order of accesses to these shared variables can differ from the sequential execution of the original annotated program. After the analysis phase, the tool rewrites the IL instructions of the annotated scope so that references to arguments and local variables are consistent with the original method definition. If an instruction refers to a variable that should be synchronized, additional instructions are injected in order to avoid race conditions; the locking pattern used depends on the specific mechanism employed to render the parallel execution (threads or processes): in the case of threads, locks are obtained using the monitors provided by the CLI infrastructure; in the case of multiple processes, we delegate the synchronization to the master and semaphores are used. At first sight, one might think that the annotations we


have introduced allow only a fixed number of processes to be spawned by the translated code. This is not true: in our parallelization pattern, if we annotate the body of a for loop, the main thread will spawn a worker for each execution of the body. Again, it is the responsibility of the programmer to ensure that the unbounded spawn is preferable, although the master can use schemas that impose an upper bound on the number of workers.
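A sketch of the state handling described above, written directly in C# rather than IL, is shown below; the State type, its field names and the use of lock (the CLI monitor) are illustrative of the mechanism, not the exact code emitted by Particular.

// Illustrative sketch: non-local variables of the annotated scope are collected into a
// State object, passed to the extruded method, and accessed under a monitor.
class State
{
    public int Counter;                              // a variable shared with the enclosing method
    public readonly object SyncRoot = new object();
}

class ExtrudedScope
{
    // The extruded method receives the non-local state as an explicit parameter.
    public static void Run(State state, int localInput)
    {
        int partial = localInput * localInput;       // purely local work needs no synchronization
        lock (state.SyncRoot)                        // monitor guards the shared variable
        {
            state.Counter += partial;
        }
    }
}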
Meta-program implementation. As discussed in the previous sections, the transformation tool processes the program in its binary form. The main advantage of this approach is that the strategy used to render the annotations is deferred to deploy time, or even to runtime2. Therefore, the transformation can take into account the particular geometry of a distributed/parallel system and make specific decisions on how to perform the rendering of the annotations. Another advantage of the approach is that the programmer can rely on the standard development toolchain to develop and debug the program in its sequential form, focusing on the functional aspects of the application. Code generation exploits the facilities provided by the CLI infrastructure in the reflection APIs. The library is responsible for generating the bytes of an executable, allowing the programmer to emit opcodes and arguments in symbolic form; it is also responsible for performing the back-patching required to resolve forward branches. The algorithm adopts a two-step strategy:
step 1. Code Analysis: the analysis of the code discussed previously, where the algorithm takes as input a forest of annotations retrieved through the [a]C# runtime. It returns an array of Masters that has as many elements as there are Parallel blocks inside the annotated method;
step 2. Code Generation: annotated methods are processed and a new assembly is generated with the newer version of each method; the algorithm scans the method body and generates a new method for each worker by invoking the appropriate methods of the actual Master. The algorithm creates all the required types and their members (i.e., fields, methods, constructors, properties, and events) using the reflection APIs (see the sketch below). The code generation strategy is based on copying the instructions of the original program, without any attempt at code optimization. This is not an issue, however, since the JIT compiler already performs optimizations on the intermediate language.
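As an indication of the kind of CLI facility involved in step 2, the following sketch uses System.Reflection.Emit (a standard .NET API) to emit a trivial method at runtime; Particular's actual code generation, which copies the IL of annotated blocks and back-patches branches, is of course more involved than this.

using System;
using System.Reflection.Emit;

static class EmitDemo
{
    static void Main()
    {
        // DynamicMethod lets a meta-program emit IL at runtime without writing a file.
        var dm = new DynamicMethod("AddOne", typeof(int), new[] { typeof(int) });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // load the argument
        il.Emit(OpCodes.Ldc_I4_1);  // load the constant 1
        il.Emit(OpCodes.Add);       // add them
        il.Emit(OpCodes.Ret);       // return the result

        var addOne = (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));
        Console.WriteLine(addOne(41));   // prints 42
    }
}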

3 Experimental Results

To evaluate the effectiveness of our framework, we used Particular to manipulate two computations from two different classes of parallel applications: task farm and static stencil computations. We evaluated the effects of our transformation in terms of the reduction in execution time for the whole application. We used heterogeneous platforms, both in operating systems (Windows and Linux)
2 Virtual machines can dynamically load types, thus a running program can invoke Particular and then load its output.


and architectures (uniprocessor, multiprocessor and dual core). This work should be considered preliminary, since most of the effort in developing Particular was spent on the meta-program infrastructure; we expect that smarter plug-ins for the master task would lead to better performance. The CLI execution environment used is Microsoft .NET CLR v2.0 under Microsoft Windows systems, whereas Mono v1.1.15 is used under the Linux system. We considered two applications, representative of a large number of parallel applications: a task farm (i.e., embarrassingly parallel) application (namely, the parallel rendering of the Mandelbrot set) and a stencil, data-parallel application (namely, Jacobi's method used to approximate the solution of the Laplace finite-difference equation over a matrix of values). For the task farm, the maximum number of workers used is not fixed, and the parallel MAP pattern manages to use the available resources according to a best-effort policy. We rendered images of different sizes, with different color rendering precisions. Graph (a) below shows the efficiency of the multi-threaded version of the renderer, with an image of 4000x4000 points and different precision values as inputs. In order to reduce the communication overhead, each worker computes the color of the points in its region and stores the results in a local queue. Once the computation ends, the worker copies its elements into the global queue. Therefore, synchronization is required only while copying results. Using a dual core processor we reached the best efficiency employing only 2 workers. As expected, increasing the number of workers causes an increase in processor contention and race conditions while accessing the global circular queue. Therefore, the wait time spent on synchronization increases, as does the execution time of the whole application. Using larger precision values does not yield better performance because of the trivial type of computation made in each iteration. Similar results are obtained with the multi-process version of the application. In this case a larger color precision causes the communication costs to rise, thus reducing the overlapping of communication with computation. Graph (b) below shows the efficiency of the multi-process version of the renderer. For our experiments concerning the data-parallel stencil application, we considered different sizes of the input array. Graph (c) below shows the efficiency values obtained when testing the multi-threaded version of the application. As expected, the best performance is obtained on a dual core processor with 2 workers. Splitting the input array into more blocks decreases the time spent in computation and increases the number of race conditions between workers when accessing the shared array. Having a larger input array increases the computation on local elements but at the same time causes an increase in the number of synchronizations required to access the shared array; therefore, the performance obtained decreases when larger arrays are used. Slightly better performance results are obtained with the multi-process version, as shown in graph (d) below. In this case, the Master is responsible for updating the elements with the new values computed by the Workers, hence the elimination of the synchronization hints.
However, the blocks of rows are exchanged between Master and Workers through the network, leading to heavy communication costs.

[Figure: efficiency versus number of workers for four cases: (a) Mandelbrot's multi-threaded version and (b) Mandelbrot's multi-process version (image of size 4000x4000, dwell = 5000, 10000, 20000), and (c) Jacobi's multi-threaded version and (d) Jacobi's multi-process version (matrix size = 1000, 4000, 8000).]

The best performance is obtained with 2 workers and the smallest input array size. When the array size increases, the processing of Laplace's equation cannot overlap communications, forcing the Master to wait for updates without performing any computation.
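The local-buffering scheme described above is language-agnostic. Purely as a hypothetical illustration (a C++ sketch, not the authors' C# code; names such as renderRegion and computeColor are invented here), each worker accumulates its results privately and touches the shared structure exactly once:

  #include <mutex>
  #include <utility>
  #include <vector>

  struct Pixel { int x, y, color; };

  std::vector<Pixel> globalQueue;   // shared result queue (circular in the paper)
  std::mutex         globalLock;

  void renderRegion(const std::vector<std::pair<int,int>>& region) {
      std::vector<Pixel> local;                    // private buffer: no locking
      local.reserve(region.size());
      for (auto [x, y] : region)
          local.push_back({x, y, /* computeColor(x, y) */ 0});
      std::lock_guard<std::mutex> g(globalLock);   // single synchronization point
      globalQueue.insert(globalQueue.end(), local.begin(), local.end());
  }

The point of the sketch is only that contention concentrates on the single merge step and grows with the number of workers, which is consistent with the efficiency drop observed beyond 2 workers.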

Related Work

One of the most interesting approaches to separating functional aspects from parallel code is based on AOP. In [4] an attempt is made using AspectJ to encapsulate parallelization in an aspect separated from the mathematical model concerns. Code annotations have mainly been used to enhance the flexibility and the efficiency of the compiling step and to support new language features, see [14]. Code generation is another widespread application of code annotations. The XDoclet2 tool [15] for Java has been successfully used for performing code generation tasks. This approach to code annotation is based on source code manipulation. Program manipulation with bytecode transformation is a technique that has been employed in several applications, see [8,16].

Conclusions

In this paper we have presented Particular, a meta-program that transforms


annotated programs relying on the reflection capabilities of the CLI infrastructure. Our transformation schema allows deferring the decision on how to parallelize a sequential program, taking into account the particular distributed/parallel architecture where it will be executed. In our approach, programmers can focus on the functional aspects of the problem, relying on the well-established programming


toolchain for developing and debugging. Annotations will be enough to drive our meta-program in the parallelization of the program at a later stage. Although this is a form of AOP, in our case join-points are explicitly provided by the programmer in the form of annotations, and the weaving of the parallel aspect is performed at deploy time on the binary program. The weaving process, moreover, involves analysis of annotated blocks, not only injection of code at join-points. Another important element of our system is the ability to plug in different code generation strategies. So far we have developed the basic strategies based on thread and process execution, though we expect that smarter plug-ins for the master task would lead to even better performance. The results of our tests look promising and encourage us to continue our research.

References
1. .NET Framework Developer Center: The Common Language Runtime (CLR), http://msdn2.microsoft.com/en-us/netframework/aa497266.aspx (2007).
2. T. Lindholm and F. Yellin, The Java(TM) Virtual Machine Specification, 2nd ed., Prentice Hall PTR (1999).
3. T. Elrad, R. E. Filman and A. Bader, Aspect-oriented programming, Communications of the ACM, Vol. 44 (2001).
4. B. Harbulot and J. R. Gurd, Using AspectJ to separate concerns in parallel scientific Java code, Proceedings of the AOSD Conference, Lancaster, UK (2004).
5. T. Brandes and F. Zimmermann, ADAPTOR - A Transformation Tool for HPF Programs, Programming Environments for Massively Parallel Distr. Sys. (1994).
6. C. Dittamo, Sequential program parallelization using annotations (in Italian), Graduation thesis, Dept. Computer Science, Univ. of Pisa, Italy (2006).
7. J. Miller, Common Language Infrastructure Annotated Standard, Addison-Wesley (2003).
8. A. Cisternino, Multi-stage and Meta-programming support in strongly typed execution engines, PhD thesis, Dept. Computer Science, Univ. of Pisa, Italy (2003).
9. W. Cazzola, A. Cisternino and D. Colombo, Freely Annotating C#, Journal of Object Technology, Vol. 4, No. 10, ETH Zurich (2005).
10. High Performance Fortran Forum, High Performance Fortran language specification, version 1.0, Technical Report CRPC-TR92225, Houston, Tex. (1993).
11. OpenMP Forum, OpenMP: A Proposed Industry Standard API for Shared Memory Programming, Technical report (1997).
12. M. Aldinucci, M. Coppola, M. Danelutto, M. Vanneschi and C. Zoccolo, ASSIST as a Research Framework for High-performance Grid Programming Environments (2005).
13. A. Benoit, M. Cole, J. Hillston and S. Gilmore, Flexible Skeletal Programming with eSkel, Proc. 11th International Euro-Par Conference (2005).
14. R. Kirner and P. Puschner, Classification of Code Annotations and Discussion of Compiler-Support for Worst-Case Execution Time Analysis, Proceedings of the 5th Euromicro International Workshop on Worst-Case Execution Time Analysis (WCET'05), Palma, Spain (2005).
15. C. Walls and N. Richards, XDoclet in Action, Manning Publications (2003).
16. H. Masuhara and A. Yonezawa, Run-time Bytecode Specialization: A Portable Approach to Generating Optimized Specialized Code, Proceedings of Programs as Data Objects, Second Symposium, PADO'01 (2001).

Fine Grain Distributed Implementation of a


Dataflow Language with Provable Performances
Thierry Gautier, Jean-Louis Roch, and Frédéric Wagner
MOAIS Project, LIG Lab., INRIA-CNRS, Universités de Grenoble, France
{thierry.gautier,jean-louis.roch,frederic.wagner}@imag.fr

Abstract. Efficient execution of multithreaded iterative numerical computations requires carefully taking data dependencies into account. This paper presents an original way to express and schedule general dataflow multithreaded computations. We propose a distributed dataflow stack implementation which efficiently supports work stealing and achieves provable performance on heterogeneous grids. It exhibits properties such as non-blocking local stack accesses and generation at runtime of optimized one-sided data communications.
Keywords: dataflow, distributed stack, work-stealing, work depth model.

Introduction

Multithreaded languages have been proposed as a general approach to model dynamic, unstructured parallelism. They include data-parallel languages (e.g., NESL [5]), dataflow languages (ID [7]), macro-dataflow languages (Athapascan [10], Jade [15]), languages with fork-join based constructs (Cilk [6]), and languages with additional synchronization primitives (Hood [2], EARTH [11]). Efficient execution of a multithreaded computation on a parallel computer relies on the schedule of the threads among the processors. In work-stealing scheduling [2,1], when becoming idle, a processor steals a ready task (the oldest one) from a randomly chosen victim processor. Usual implementations of work stealing are based on stacks to store, locally on each processor, the tasks still to complete.
Such scheduling has been proven to be efficient for fully-strict multithreaded computations [6,8] while requiring a bounded memory space with respect to a depth-first sequential execution [14]. However, some numerical simulations generate non series-parallel data dependencies between tasks; for instance, iterative finite differences computations have a diamond DAG dependency structure. Such a structure cannot be efficiently expressed in terms of either fully-strict or strict multithreaded computations without adding artificial synchronizations, which may drastically limit the effective degree of parallelism. The Athapascan [10] parallel language enables the description of such recursive multithreaded computations with non series-parallel data dependencies, as described in Section 2.
In this paper, we propose an original extension, named DDS (Section 3), of the stack management in order to handle programs whose data dependencies

do not fit the class of strict multithreaded computations. The key point consists in linking one-sided write-read data dependencies in the stack to ensure constant-time non-blocking stack operations. Moreover, on distributed architectures, data links between stacks are used to implement write-read dependencies as efficient one-sided communications. Those properties enable DDS to implement macro-dataflow languages such as Athapascan with provable performance (Section 4). Section 5 reports experimental performance on classical benchmarks on both cluster and grid architectures, up to a thousand processors, confirming the theoretical performance.

Model for Recursive Dataflow Computations

This section describes the basic set of instructions (abstract machine) used to express a parallel execution as a dynamic dataflow graph. It is based on Athapascan, which models a parallel computation from three concepts: tasks, shared objects and access specifications [10]. Following Jade [15], Athapascan extends Cilk [9] to take into account data dependencies; however, while Jade is restricted to iterative computations, Athapascan includes nested recursive parallelism to take benefit from work stealing.
The programming model. A task represents a non-blocking sequence of instructions: like in ordinary functional programming languages, a task is the execution of a function that is strict in all arguments (no side effect) and makes all result values available upon termination. Tasks may declare new tasks. Synchronization between tasks is performed through the use of write-once shared objects denoted Data. Each task has an access specification that declares how it (and its child tasks) will read and write individual shared objects: the type Data::Read (resp. Data::Write) specifies a read (resp. write) access to the effective parameter. To create a task, a block of memory called a closure is first allocated using AllocateClosure (Figure 1). Then the effective parameters of the task are pushed to the closure, either immediate values or shared objects. For each shared parameter, the access specification is given: either read (push::Read) or write (push::Write). An immediate value parameter is copied using push::Value. Finally, the commit instruction completes the description of the task.
Synchronization between tasks is only related to access specifications. The semantics is lexicographic: statements are lexicographically ordered by ';'. In other words, any read of a parameter with a Read access specification sees the last write according to a lexicographic order called the reference order. Figure 1 is an example of code using Athapascan for the folk recursive computation of Fibonacci numbers: the task Sum reads a, b and writes r.
Spawn tree and reference order. The recursive description of tasks is represented by a tree T, called the spawn tree, whose root is the main task. A node n in T corresponds to a task t, and the successor nodes of n to the child tasks of t. Due to the semantics of Athapascan, the non-preemptive sequential schedule of tasks that follows the depth-first ordering of T is a valid schedule.



void Sum( Data a, Data b, Data r ) {
  r.write( a.read() + b.read() );
}

void Fibo( int n, Data r ) {
  if (n < 2) r.write( n );
  else {
    int r1, r2;
    Task f1 = AllocateClosure( Fibo );
    f1.push( ReadAccess, n-1 );
    f1.push( WriteAccess, r1 );
    f1.commit();
    Task f2 = AllocateClosure( Fibo );
    f2.push( ReadAccess, n-2 );
    f2.push( WriteAccess, r2 );
    f2.commit();
    Task sum = AllocateClosure( Sum );
    sum.push( WriteAccess, r );
    sum.push( ReadAccess, r1 );
    sum.push( ReadAccess, r2 );
    sum.commit();
  }
}

Fig. 1. Fibonacci program with abstract machine instructions (it corresponds to the folk original Athapascan code for Fibonacci in [10], Fig. 3)

This ordering is called the reference order and is denoted by R. According to R, the closures consecutively committed by a task t are executed after the completion of t, in the same order; whereas in a depth-first sequential schedule, a closure is executed just after being committed.
Work-stealing scheduling based on the reference order. The nested structure of the spawn tree enables a depth-first work-stealing scheduling, similar to the DFDeques scheduling proposed in [14] but here based on the reference order R instead of the standard sequential depth-first order. All tasks in the system are ordered according to R in a distributed way. Locally, each processor manages its own deque in which tasks are ordered according to R. When a closure is allocated on a processor, it is pushed on the top of the local deque but, following R, the execution of the current closure continues. When the current closure completes, a new one is popped from the local deque. If this deque is empty, a new closure is stolen from the bottom of the deque of another randomly chosen processor.
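Purely as a hypothetical illustration (a C++ sketch, not the DDS/Kaapi code), the deque discipline just described can be summarized as follows; the mutex below merely stands in for the non-blocking compare-and-swap actually used by DDS (Section 3):

  #include <deque>
  #include <mutex>

  struct Closure;                       // a committed task (see Figure 1)

  class ReadyDeque {                    // one per processor, ordered by R
      std::deque<Closure*> dq;          // front = bottom (oldest), back = top
      std::mutex m;
  public:
      void push(Closure* c) {           // commit: push on top of the local deque
          std::lock_guard<std::mutex> g(m); dq.push_back(c);
      }
      Closure* pop_local() {            // owner: take the most recent closure
          std::lock_guard<std::mutex> g(m);
          if (dq.empty()) return nullptr;
          Closure* c = dq.back(); dq.pop_back(); return c;
      }
      Closure* steal() {                // thief: take the oldest ready closure
          std::lock_guard<std::mutex> g(m);
          if (dq.empty()) return nullptr;
          Closure* c = dq.front(); dq.pop_front(); return c;
      }
  };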

Distributed Implementation: DDS

This section presents the distributed dataflow stack implementation, named DDS, of the abstract machine model presented in Section 2. DDS implements local stacks by allocating contiguous blocks of memory that can store several frames. A frame is related to the execution of a task; it is used to store all closures created by the task, with direct links describing Read or Write data accesses. A new frame is pushed on the stack when a task begins its execution. Tasks are executed according to the reference order R.
Figure 2.a shows the state of the stack during the execution of the recursive computation of the program of Figure 1. Starting from the base stack pointer, the frame related to the task fibo(n,r) is first pushed on the stack. During its execution fibo(n,r) creates three new tasks: fibo(n-1,r1), fibo(n-2,r2) and a sum task to compute r := r1 + r2. The associated closures, including their arguments, are then allocated in the frame. When fibo(n,r) completes, the task fibo(n-1,r1) is popped from the top of the frame and a new frame is allocated for its execution. This new frame is pushed on the stack. When all closures allocated by a task



Base stack

Base stack
Closure link
Frame fibo(n,r)
Frame fibo(n,r)

fibo(n1,r1)

fibo(n1,r1)

n1: int
frame link

r1 : Write Access

fibo(n1,r2)
activation

fibo(n2,r2)
sum(r,r1,r2)

n2: int
r2 : Write Access

Frame fibo(n1,r)

sum(r,r1,r2)
fibo(n2,r1)

shared links

activation

r : Write Access
r1 : Read Access

fibo(n3,r2)

r2 : Read Access
sum(r,r1,r2)

Frame fibo(n1,r)

Frame fibo(n2,r)

fibo(n2,r1)
n2: int

fibo(n3,r1)

r1 : Write Access

fibo(n4,r2)

fibo(n3,r2)
n3: int

sum(r,r1,r2)

r2 : Write Access

Top stack

sum(r,r1,r2)
r : Write Access
r1 : Read Access
r2 : Read Access

(a) Activation frames

Top stack

(b) Shared links

Fig. 2. (a) Stack structure with activation frames. (b) Data ow link.

are completed or stolen, its associated frame is popped and the execution of its
successor according to R can start. In order to manage data dependencies, Read
or Write data accesses are pushed into the closure and linked between closures
according to the reference order (Figure 2.b).
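The layout just described (frames of closures allocated contiguously in a stack, with linked Read/Write accesses) could be pictured, purely as a hypothetical sketch and not the actual Kaapi/DDS data structures, along the following lines:

  #include <cstddef>

  struct Access {                  // one Read or Write access of a closure
      bool     write;              // access mode
      void*    data;               // the shared object
      Access*  link;               // write-read link to the matching access,
  };                               // following the reference order R

  struct Closure {                 // one committed task
      void   (*body)(Closure*);    // task entry point
      enum State { READY, NOT_READY, STOLEN, DONE } state;
      Access*  accesses;           // parameters pushed before commit()
      size_t   n_accesses;
  };

  struct Frame {                   // closures created by one executing task,
      Closure* closures;           // laid out contiguously in the local stack
      size_t   n_closures;
      Frame*   below;              // previously pushed frame
  };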
Distributed work stealing and extended stack management. A new stack is created when a processor becomes idle and steals work from another processor. When the current stack of a thread becomes empty or the current task is not ready, a steal request occurs. In this case, the thief thread first tries to steal a ready closure in another stack: first locally on SMP machines or, when no closures are found, a steal request is sent to a randomly chosen distant processor. The stolen closure is ready to execute, i.e. all its input parameters have been produced. For instance, in Figure 2.a, in the top frame, the closure fibo(n-1,r1) is already completed, the closure fibo(n-2,r2) is ready, while the closure sum(r,r1,r2) is not ready since its input parameter r2 has not been produced. Using the access links, the computation of ready closures is only performed on steal requests. Indeed, since the reference order is a valid sequential schedule, local tasks in a stack are executed without computing the readiness of closures. Following the work-first principle [9], this minimizes the scheduling overhead by transferring cost overruns from local computations to steal operations. In particular, atomicity of local accesses is ensured by non-blocking locks (compare-and-swap instructions).
Once the choice of a victim has been made, a copy of the chosen closure is pushed in a new stack owned by the thief processor. The original closure is marked as stolen. If the thief is a remote processor, the input parameters of the task are copied and sent with the closure. In order to manage the dataflow for output parameters, a signalization task is pushed after the closure copy. This task plays

the role of signaling that the output accesses of the stolen task have been produced, in order to compute the readiness of successors, and of sending the produced data back to the victim processor.
Remark. Since DDS describes all tasks and their dependencies, it stores a consistent global state; this is used in [12] to implement fault-tolerant checkpoint/restart protocols.

Fig. 3. Structure of both the victim (a) and thief (b) stacks. A new task (Send and Signal) is forked into the thief stack. Its role is to send back the result and signal the tasks marked as non-ready that depend on the stolen task.

Theoretical Analysis

This section provides a theoretical analysis of the DDS implementation, resulting in a language-based performance model for Athapascan macro-dataflow parallel programs on heterogeneous grids. To model such an architecture, we adopt the model proposed in [3]. Given p processors, let Π_i(t) be the instantaneous speed of processor i at time t, measured as the number of elementary operations per unit of time; let Π_ave = (Σ_{t=1}^{T} Σ_{i=1}^{p} Π_i(t)) / T be the average speed of the grid for a computation with duration T.
To predict the execution time T on the grid, following [4], we adopt a language-based performance model using work and depth. The work W is the total number of elementary (unit) operations performed; the depth D is the critical path, i.e. the number of (unit) operations for an execution on an unbounded number of processors. Note that D accounts not only for data dependencies among tasks but also for recursive task creations, i.e. the depth of the spawn tree.
The work (and depth) of an Athapascan parallel program includes both the sequential work (W_S) and the cost of task creations, but without considering the scheduling overhead; similarly to a sequential function call, the cost of a

task creation with n unit arguments is τ_fork + n·τ_arg. If the cost of those task creations is negligible compared to W_S, then W ≈ W_S.
Theorem 1. In the DDS implementation, when no steal operation occurs, any local access or modification in any stack is performed in a constant number of operations. Then, τ_fork and τ_arg are constants.
The proof is direct: when no steal operation occurs, each process only accesses its own local stack. Due to the links in the stack and the non-blocking locks, each access is performed in (small) constant time.
Since DDS implements a distributed work-stealing scheduling, a steal operation only occurs when a stack becomes empty or when the current task is not ready. In this case, the process becomes a thief and randomly scans the stacks of the other processes (from their top) to find a ready closure; the resulting overhead is amortized by the work W when D ≪ W. Indeed, steal operations are very rare events, as stated in [2,3], on a grid where processor speed ratios may vary only within a bounded interval.
Theorem 2. With high probability, the number of steal operations is O(p·D) and the execution time T is bounded by T ≤ W/Π_ave + O(p·D/Π_ave).
The proof (not included) is derived from Theorems 6 and 8 in [3]. Then, when D ≪ W, the resulting time is close to the expected optimal one, W/Π_ave.
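To see what this bound gives in the homogeneous case, note that if every processor runs at a fixed speed π (so that Π_ave = p·π), Theorem 2 specializes to the classical work-stealing bound; a sketch in LaTeX notation:

  \[
    T \;\le\; \frac{W}{\Pi_{ave}} + O\!\left(p\,\frac{D}{\Pi_{ave}}\right)
      \;=\; \frac{W}{p\,\pi} + O\!\left(\frac{D}{\pi}\right).
  \]

Thus, when D ≪ W, the first term dominates and the execution time is within a vanishing factor of the optimal W/Π_ave, which is what the cluster and grid experiments below are meant to confirm.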

Experiments

A portable implementation of DDS supporting Athapascan has been realized within the Kaapi C++ library [13].
Results on a cluster. A first set of experiments has been executed on a Linux cluster of 100 PCs (Pentium III, 733 MHz, 256 MB of main memory) interconnected by Fast Ethernet (100 Mbit/s). On this implementation, τ_fork = 0.23 µs and τ_arg = 0.16 µs are observed constant, in accordance with Theorem 1. In the timing results (Table 1): T1 denotes the time, corresponding to W, to execute the benchmark on one processor; Tp the time on p processors; TS the time of the pure C++ sequential version of the benchmark, corresponding to WS. Recursive subtask creation is stopped under a threshold th, below which further recursive calls are performed with a sequential C++ recursive function call; the timing of a leaf task with th = 15 (resp. th = 20) is 0.1 ms (resp. 1 ms). The left and right parts of Table 1 report times respectively for the Fibonacci benchmark with up to 32 processors and for the Knary benchmark with up to 100 processors. Both show scalable performance up to 100 processors, in conformance with Theorem 2.
Results of grid experiments. We present here experimental results computed on the French heterogeneous Grid'5000 platform during the plugtest international contest held in November 2006 (see http://www.etsi.org/plugtests/Upcoming/GRID2006/GRID2006.htm).


Table 1. T1, Tp and TS (in seconds) for the Fibonacci (a) and Knary (b) benchmarks

  (a) Fibonacci        fib(40), th = 15            fib(45), th = 20
   p       Tp     T1/Tp   TS/Tp       Tp      T1/Tp   TS/Tp
   1       9.1    1       0.846       88.2    1       0.981
   4       2.75   3.3     2.8         22.5    3.92    3.84
   8       1.66   5.48    4.6         12.35   7.14    7
   16      1.01   9       7.62        6.4     13.78   13.52
   32      0.99   9.19    7.78        3.7     23.83   23.37

  (b) Knary(35,10), th = 15
   p       Tp        T1/Tp   TS/Tp
   1       2435.28   1       0.984
   8       306.17    7.95    7.83
   16      153.52    15.86   15.61
   32      77.68     31.35   30.86
   64      40.51     60.12   59.18
   100     26.60     91.55   90.13

  NQueens instance   p       T
   21                1000    78 s
   22                1458    502.9 s
   23                1422    4434.9 s

Fig. 4. CPU/network loads and timing reports

On the NQueens challenge (Takaken's sequential code), our implementation in Athapascan on DDS/Kaapi showed the best performance, honored by a special prize. On instance 23, solved in T = 4434.9 s, an idle time of 22.72 s was measured on the 1422 processors; this experimentally verifies Theorem 2 with a maximal relative error of 22.72/4434.9 = 0.63%.
Figure 4 shows the global grid load together with the CPU and network load on one of the clusters composing the grid (a cluster from the Sophia site). These results have been obtained using Grid'5000 monitoring tools during the last hour of execution. Our computations start approximately at 01:50. Different instances of the NQueens problem are executed sequentially. The different graphs show a very good use of CPU resources. At the end of each execution, work stealing occurs, briefly increasing the network load while maintaining efficient CPU usage.

Conclusions

Multithreaded computations may benefit from the description of non-strict data dependencies. In this paper we have presented a novel approach, DDS, to implement efficient work stealing for multithreaded computations with dataflow dependencies. Local stack operations are guaranteed in small and constant time, while most of the overhead is postponed onto infrequent steal operations. This important property enables us to accurately predict the time of a (fine grain) parallel program on a heterogeneous platform where processor speeds vary within a bounded ratio (Theorem 2). Experiments reported on a cluster and a grid infrastructure with 1400 processors showed scalable performance.
Besides, by providing a consistent global state, the DDS implementation enables support for fault tolerance. A perspective of this work is to use fault tolerance to extend Theorem 2 to dynamic grid platforms where speed ratios cannot be considered bounded anymore, e.g. when a processor leaves (resp. enrolls) its speed becomes zero (resp. non-zero). Under a given speed threshold, considering a processor as faulty might be a practical way to ensure the bounded-ratio property.
Acknowledgments. The authors gratefully acknowledge Serge Guelton, Samir Jafar and Rémi Revire for useful discussions and participation in the implementation and experiments.

References
1. U. A. Acar, G. E. Blelloch, and R. D. Blumofe. The data locality of work stealing. Theory Comput. Syst., 35(3):321-347, 2002.
2. N. S. Arora, R. D. Blumofe, and C. G. Plaxton. Thread scheduling for multiprogrammed multiprocessors. Theory Comput. Syst., 34(2):115-144, 2001.
3. M. A. Bender and M. O. Rabin. Online scheduling of parallel programs on heterogeneous systems with applications to Cilk. Theory Comput. Syst., 35(3):289-304, 2002.
4. G. E. Blelloch. Programming parallel algorithms. Commun. ACM, 39(3):85-97, 1996.
5. G. E. Blelloch. NESL: A Nested Data-Parallel Language. Technical Report CMU-CS-93-129, April 1993.
6. R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 37(1):55-69, 1996.
7. D. E. Culler and Arvind. Resource requirements of dataflow programs. In Proceedings of the 15th Annual International Symposium on Computer Architecture, pages 141-150, Honolulu, Hawaii, 1989.
8. P. Fatourou and P. G. Spirakis. Efficient scheduling of strict multithreaded computations. Theory of Computing Systems, 33(3):173-232, 2000.
9. M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Sigplan'98, pages 212-223, 1998.
10. F. Galilée, J.-L. Roch, G. Cavalheiro, and M. Doreille. Athapascan-1: On-line building data flow graph in a parallel language. In IEEE, editor, Pact'98, pages 88-95, Paris, France, October 1998.
11. L. J. Hendren, G. R. Gao, X. Tang, Y. Zhu, X. Xue, H. Cai, and P. Ouellet. Compiling C for the EARTH multithreaded architecture. In IEEE, editor, Pact'96, pages 12-23, Boston, USA, 1996.
12. S. Jafar, T. Gautier, A. W. Krings, and J.-L. Roch. A checkpoint/recovery model for heterogeneous dataflow computations using work-stealing. In LNCS Springer-Verlag, editor, EUROPAR'2005, Lisboa, Portugal, August 2005.
13. MOAIS Team. KAAPI. http://gforge.inria.fr/projects/kaapi/, 2005.
14. G. J. Narlikar. Scheduling threads for low space requirement and good locality. Number TR CMU-CS-99-121, May 1999. Extended version of Spaa'99 paper.
15. M. C. Rinard and M. S. Lam. The design, implementation, and evaluation of Jade. ACM Trans. Programming Languages and Systems, 20(3):483-545, 1998.

Efficient Parallel Tree Reductions on


Distributed Memory Environments
Kazuhiko Kakehi¹, Kiminori Matsuzaki², and Kento Emoto³
¹ Division of University Corporate Relations
² Department of Mathematical Informatics
³ Department of Creative Informatics
University of Tokyo
{kaz,kmatsu,emoto}@ipl.t.u-tokyo.ac.jp

Abstract. A new approach for fast parallel reductions on trees over distributed memory environments is proposed. By employing serialized trees as the data representation, our algorithm has a communication-efficient BSP implementation regardless of the shape of the input. A prototype implementation supports its real efficacy.
Keywords: Tree reduction, parentheses matching, BSP, parallel algorithm.

1 Introduction
Research and development on parallelization have been done intensively for matrices and one-dimensional arrays. Looking at recent trends in applications, another data structure has also been calling for efficient parallel treatment: the tree structure. The emergence of XML as a universal data format, which takes the form of a tree, has magnified the impact of parallel and distributed mechanisms for trees in order to reduce computation time and mitigate the limitation of memory. However, parallel tree computation over distributed memory environments is not as straightforward as it looks.
Consider, as a simple running example, a computation maxPath to find the maximum of the values each of which is a sum of values in the nodes from the root to a leaf. When it is applied to the tree in Fig. 1, the result should be 12, contributed by the path of values 3, -5, 6 and 8 from the root. Parallelization of such a simple computation under distributed memory environments requires consideration of two aspects. The first is the underlying data representation, including its distribution among processors, to guarantee performance for trees of arbitrary shapes. The second is the derivation of the parallel algorithm. As associativity often helps parallelization, we need to exploit a similar property for trees which suitably works with the data representation.
This paper gives a clear solution for parallel tree reductions, whose starting point is the use of serialized forms of trees. Notable examples are the serialized (streamed) representations of XML, or parenthesized numeric expressions, which are obtained by tree traversals. The first problem mentioned above is naturally resolved by this choice, since distribution of serialized data among processors is much simpler than that of trees. As for the second point, we present an efficient parallel algorithm for tree reductions satisfying extended distributivity. As instances of serialized trees, parallelization of parentheses matching problems, which figure out the correspondence between brackets, has
plenty of work ([1,2,3,4,5], to mention a few); our algorithm, which bears a good resemblance to the one under BSP [6], also has a BSP implementation with three supersteps.
This paper is organized as follows. After this Introduction, Sect. 2 describes our tree representation and tree reductions. Section 3 develops the algorithm, which consists of three phases: the first two perform computation along with parentheses matching, and the last reduces a tree of size less than twice the number of processors. Section 4 supports our claims with some experiments. We conclude this paper in Sect. 5, mentioning future directions. We refer the interested reader to our technical report [7] for the details omitted due to the space limitation of this paper.

Fig. 1. A rose tree (upper left), its serialized representation as a sequence of pairs (middle), and another representation according to the depth (lower).

2 Preliminaries
This section describes the underlying frameworks for our parallelization: tree structures, their serialized form, tree homomorphism, and extended distributivity.
2.1 Trees and Their Serialized Representation
We treat trees of unbounded degree (trees whose nodes can have an arbitrary number of subtrees); Fig. 1 shows an example. As explained in the Introduction, our internal representation keeps tree-structured data in a serialized manner. The sequence in the middle of Fig. 1 is our internal representation of the example tree. Like XML serialization or parentheses expressions, it is a combination of a preorder traversal (producing the open elements) and a postorder traversal (producing the close elements afterwards). We assume well-formedness, with which sequences are guaranteed to be parsed back into trees, and without loss of information we simplify close elements to be '/'. For the convenience of later discussion, we assign to each node the information of its depth in the tree. The figure also depicts this presentation according to the depth.


2.2 Tree Homomorphism and Extended Distributivity


We use the framework called tree homomorphism [8], which specifies a recursive tree reduction h using hℓ, ⊗, and an associative operator ⊕ with unit ι⊕:
  h( Node a [t1, ..., tn] ) = a ⊗ ( h(t1) ⊕ ··· ⊕ h(tn) ) ,
  h( Leaf a )              = hℓ(a) .
The computation maxPath mentioned in the Introduction is also a tree homomorphism:
  maxPath( Node a [t1, ..., tn] ) = a + ( maxPath(t1) ↑ ··· ↑ maxPath(tn) ) ,
  maxPath( Leaf a )               = id(a) = a .
Here, id is the identity function, and ↑ returns the bigger of two numbers; its unit is -∞. When it is applied to the tree in Fig. 1, the result should be 12 = 3 + (-5) + 6 + 8.
We can parallelize tree homomorphisms over serialized trees based on list homomorphism [3]. This naive approach, however, suffers from the factor of tree depth. We need an additional property called extended distributivity [9]. This property is explained, by introducing a new operator ⊘ defined as (a, b, c) ⊘ e = a ⊗ (b ⊕ e ⊕ c), as follows: for any triples (au, bu, cu), (al, bl, cl), and any expression e, there exists a triple (a, b, c) which satisfies
  (au, bu, cu) ⊘ ( (al, bl, cl) ⊘ e ) = (a, b, c) ⊘ e .
Efficient parallel reduction requires these computations, as well as ⊗ and ⊕, to be done in constant time. Our running example satisfies these properties, as the following calculation shows:
  (au, bu, cu) ⊘ ( (al, bl, cl) ⊘ e )
  = au + (bu ↑ (al + (bl ↑ e ↑ cl)) ↑ cu)
  = (au + al) + (((-al + bu) ↑ bl) ↑ e ↑ (cl ↑ (-al + cu)))
  = ( (au + al), ((-al + bu) ↑ bl), (cl ↑ (-al + cu)) ) ⊘ e .
For some other examples under these formalizations, see our previous work [9].
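As a concrete illustration only (a C++ sketch under our own naming, not code from the paper), the maxPath instance of extended distributivity can be written with triples of numbers, merging an upper context with a lower one exactly as in the calculation above:

  #include <algorithm>
  #include <climits>

  // A triple (a,b,c) denotes the context  e -> a + max(b, e, c),  i.e. (a,b,c) ⊘ e.
  struct Triple { long a, b, c; };
  const long NEG_INF = LONG_MIN / 2;        // stands for -∞, kept away from overflow

  // (au,bu,cu) ⊘ ((al,bl,cl) ⊘ e) = (au+al, max(bu-al, bl), max(cl, cu-al)) ⊘ e
  Triple merge(const Triple& u, const Triple& l) {
      return { u.a + l.a, std::max(u.b - l.a, l.b), std::max(l.c, u.c - l.a) };
  }

  // Apply a context to the value e of the one remaining subtree.
  long apply(const Triple& t, long e) { return t.a + std::max({t.b, e, t.c}); }

For instance, merge({-5, NEG_INF, 1}, {6, NEG_INF, NEG_INF}) yields the triple (1, -∞, -5), matching the worked example of the second phase in Sect. 3.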

3 Parallel Computation over the Serialized Trees


This section develops a parallel algorithm for tree homomorphisms satisfying extended distributivity. We explain our algorithm in three phases: (1) the first phase applies the tree homomorphism to segments (consecutive subsequences) in each processor as far as possible; (2) after communication among processors, the second phase performs further reduction using extended distributivity, producing as a result a binary tree whose internal nodes are specified as triples and whose size is less than twice the number of processors; finally (3) the third phase reduces the binary tree into a single value. We use N and P to denote the number of tree nodes and of available processors, respectively. During the explanation we assume P = 4. The algorithm resembles the analysis of parentheses matching problems under BSP [5].
First phase. Each processor applies the tree homomorphism to its given segment of size 2N/P. The process is summarized as Routine 1, and it leaves fragments of results


in arrays as_p, bs_p, cs_p and an integer d_p for each processor number p. The array as_p keeps the open elements without their corresponding close element. Each element of as_p can have subtrees before the next one in as_p, and their reduced values are kept in bs_p. Similar treatment is applied to the unmatched close elements, leaving values in cs_p (we remove the unmatched close elements themselves thanks to their absence of values). The integer d_p denotes the shallowest depth in processor p. While the elements of as_p and bs_p are listed in a descending manner, those of cs_p are in an ascending manner; the initial elements of as_p and bs_p and the last one of cs_p are at depth d_p (except for cs_0 and cs_{P-1}, whose last element at depth 0 is always ι⊕ and is therefore eliminated).

Fig. 2. An illustrating example of the first phase: applying the tree homomorphism maxPath to a segment of Fig. 1 (lined and dashed ovals indicate values obtained by reduction with ⊕ and ι⊕ (the unit of ⊕), respectively).
Routine 1. Prepare arrays as, bs, cs (behaving as stacks growing from left to right), and an integer variable d. A temporary t is used. The sequence (d_0, a_0), ..., (d_{n-1}, a_{n-1}) (n ≥ 1) is given.
  Set d ← d_0. If a_0 is '/' then cs ← [ι⊕, ι⊕], as ← [ ], bs ← [ ]; else cs ← [ι⊕], as ← [a_0], bs ← [ι⊕].
  For each i in {1, ..., n-1}:
    if a_i is not '/' (namely a value), then push a_i to as and ι⊕ to bs;
    else if as is empty, then push ι⊕ to cs, and set d ← d_i;
    else pop a' from as and b' from bs;
      if b' = ι⊕ (implying a' is a leaf) then t ← hℓ(a'); else t ← a' ⊗ b';
      if bs is not empty, then pop b'' from bs and push b'' ⊕ t to bs;
      else pop c' from cs and push c' ⊕ t to cs.
  If P ≠ 1 then remove the ι⊕ at the bottom of cs_0 and of cs_{P-1}.
In Fig. 2 we show the case of the illustrated segment, from the 10th element (3,-1) to the 21st (2,-6) of the sequence in Fig. 1. We have, as depicted:
  cs = [-1 + id(4), id(1), id(-5)] = [3, 1, -5],   as = [2, -6],   bs = [-∞, -∞],   d = 1.
Please note that we regard the absence of subtrees as an empty forest, for which the tree homomorphism returns ι⊕, the unit of ⊕ (kept in bs at depths 2 and 3). When we distribute the whole sequence of Fig. 1 evenly among four processors (6 elements each), the results of this phase are shown in Fig. 3, left.
A linear routine produces as, bs, cs and d. The worst case in terms of the size of the results occurs when a sequence of only open elements is given, resulting in two arrays as and bs of length 2N/P each.
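To make the first phase concrete, here is a hypothetical C++ sketch of Routine 1 instantiated for maxPath (our own code and naming, not the authors' implementation); a segment is a sequence of (depth, value) pairs where a close element carries no value:

  #include <algorithm>
  #include <cstddef>
  #include <limits>
  #include <optional>
  #include <vector>

  struct Elem { int depth; std::optional<long> value; };       // no value means '/'
  const long NEG_INF = std::numeric_limits<long>::min() / 2;   // plays the role of -∞

  struct Fragments {             // what the first phase leaves on a processor
      std::vector<long> as;      // unmatched open elements (node values)
      std::vector<long> bs;      // partial max of already-reduced children
      std::vector<long> cs;      // reductions owed to nodes opened elsewhere
      int d;                     // shallowest depth in the segment
  };

  Fragments firstPhase(const std::vector<Elem>& seg) {
      Fragments f;
      f.d = seg.front().depth;
      if (!seg.front().value) { f.cs = {NEG_INF, NEG_INF}; }
      else { f.cs = {NEG_INF}; f.as = {*seg.front().value}; f.bs = {NEG_INF}; }
      for (std::size_t i = 1; i < seg.size(); ++i) {
          if (seg[i].value) {                       // open element
              f.as.push_back(*seg[i].value);
              f.bs.push_back(NEG_INF);
          } else if (f.as.empty()) {                // close with no local open
              f.cs.push_back(NEG_INF);
              f.d = seg[i].depth;
          } else {                                  // close matching a local open
              long a = f.as.back(); f.as.pop_back();
              long b = f.bs.back(); f.bs.pop_back();
              long t = (b == NEG_INF) ? a           // leaf: h_l = id
                                      : a + b;      // internal node: a ⊗ (children)
              if (!f.bs.empty()) f.bs.back() = std::max(f.bs.back(), t);
              else               f.cs.back() = std::max(f.cs.back(), t);
          }
      }
      return f;                  // removal of the bottom ι⊕ of cs_0 and cs_{P-1}
  }                              // is left to the caller, as in Routine 1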

Fig. 3. Triples between two processors (left) and the resulting tree (right) after the second phase. The data fragments after the first phase are:
  proc #0: cs0 = [ ],              as0 = [3, -5, 6, 2], bs0 = [4, -∞, -∞, -∞], d0 = 0
  proc #1: cs1 = [-∞, 8],          as1 = [-1],          bs1 = [4],             d1 = 3
  proc #2: cs2 = [-∞, -∞, 1, -∞],  as2 = [5],           bs2 = [-∞],            d2 = 1
  proc #3: cs3 = [-∞, -4],         as3 = [ ],           bs3 = [ ],             d3 = 0

Second phase. The second phase matches the data fragments kept in each processor into triples, using communication between processors. Later, we reduce consecutive occurrences of triples into a value, or into one triple by extended distributivity.
When we look carefully at Fig. 3, we notice that 3 in as0 at depth 0 now has five parts at depth 1 as its children: the value 4 in bs0, a subtree spanning from processors 0 to 2 whose root is -5 in as0, the value -∞ in cs2, a subtree from processors 2 to 3 whose root is 5 in as2, and the value -4 in cs3. As these subtrees need to be reduced separately, we focus on the leftmost and the rightmost values, in bs0 and cs3 (we leave the value -∞ in cs2 for the time being). We notice that the group of the value 3 in as0 with these two values in processors 0 and 3 forms the triple (3, 4, -4).
Similarly, two elements of as0 at depths 1 and 2, with two elements each of bs0 and cs2 at depths 2 and 3, respectively, form two triples (-5, -∞, 1) and (6, -∞, -∞). The former triple indicates a tree that awaits the result of one subtree specified by the latter. This situation is what extended distributivity takes care of: we can merge two triples (a sequence of triples in general) into one: (-5, -∞, 1) ⊘ ( (6, -∞, -∞) ⊘ e ) = ( (-5 + 6), ((-6 + -∞) ↑ -∞), (-∞ ↑ (-6 + 1)) ) ⊘ e = (1, -∞, -5) ⊘ e for any e.
In this way, a group of data fragments in two processors turns into one triple. Such groups from two adjacent processors are reduced into a single value when there are no missing subtrees in between: instead of treatment using extended distributivity, the values 2 in as0 and -1 in as1 at depth 3, and 5 in as2 at depth 2, with their corresponding values in bs_i and cs_{i+1} (i = 0, 1, 2), turn into the values id(2) = 2, -1 + (4 ↑ -∞) = 3, and id(5) = 5, respectively.
We state the following lemma, which bounds the total number of resulting groups.
Lemma 1. The second phase produces at most 2P - 3 groups.
The proof sketch is as follows. Let R_p be the number of groups among p processors. As Fig. 3 indicates, R_2 = 1, and for p > 2 we derive R_p ≤ 1 + R_j + R_{p-j+1} with 1 < j < p; hence we have R_p ≤ 2p - 3. This lemma guarantees that, after the transactions of data fragments among processors, one processor needs to take care of at most two such groups. The computed value at the shallowest depth d_p in each processor is associated to its right for later computation by ⊕ (8 in cs1, -∞ in cs2; see Fig. 3).
The following Routine 2 figures out the groups among processors. M^{pl,pr}_{[du,dl]} denotes a group between processors pl and pr whose data fragments span from depth du until dl (a '*' in dl indicates everything starting from du); M_{[du,dl]} is inserted as a dummy


group in case the same d appears among more than two consecutive processors, and is assumed to be reduced into ι_⊘, a virtual left unit of ⊘ (namely ι_⊘ ⊘ e = e).
Routine 2. A stack is used, whose top is referred to as (ps, ds). The sequence (p, dp) is given in ascending order of p.
  Push the first pair (0, 0) on the stack.
  For each i in {1, ..., P-1}:
    prepare a variable d' (initially '*');
    while di < ds, produce M^{ps,i}_{[ds,d']}, set d' ← ds and pop from the stack;
    if di = ds, then produce M_{[di,ds]} and M^{ps,i}_{[ds,d']}, and pop from the stack;
    else produce M^{ps,i}_{[di,d']};
    push (i, di).
  Finally eliminate the last mating pair (that is, M_{[0,0]}).

We summarize this phase, which consists of two steps. In the first step, all processors figure out the groups by Routine 2 and allocate the groups by any deterministic rule, sharing their depth information di. This requires O(P) communication and computation cost in total. The second step applies further computation to each group. In the worst case, a processor may send out all the data fragments it holds to the other P - 1 processors, and receive two groups, evaluating each into one triple or value. Since the size of the fragments in one processor and that of each group are at most O(N/P), the cost of data transaction and computation is bounded by O(N/P).
Third phase. This last phase compiles the obtained triples and values and reduces them into a single value. As Fig. 3 shows, the triples and values obtained in the previous phase form a binary tree of size at most 2P - 3 (including dummies such as M_{[0,0]}). We collect the triples and values in one processor, and apply tree reduction in O(P) time.
In summary, we have three communication rounds (two in the second phase, one in the third phase). We conclude this section by stating the following theorem.
Theorem 1. Tree homomorphism with extended distributivity has a BSP implementation with three supersteps, each of at most O(P + N/P) communication cost.

4 Experiments and Discussion


We performed experiments with our prototype implementation in C++ and MPI. The specification of our PC cluster is shown in Table 1, left. We simulated a simple query on rose trees where the local computations ⊗ and ⊕ were matrix-multiplication-like operations over matrices of size 10 × 10. We prepared randomly generated trees of three types, namely (RS) shallow, (RD) deep, and (M) a monadic tree (a tree view of a list). The tree size was constrained by the memory size of a machine. The height of RS came from observations on XML documents [10]. We executed the program over each type using 2^i processors (i = 0, ..., 6).
Table 1 summarizes the results, and Fig. 4 plots them.


Table 1. Specification (left) and execution times (right) of our experiments

  machine   2.8 GHz CPU, 2 GB memory
  network   Gigabit Ethernet
  software  Linux 2.4.21, gcc 2.96, mpich 1.2.7
  data      randomly generated trees of size 2,000,000:
            (RS) with maximum height 7
            (RD) with maximum height of 5,000
            (M)  a monadic tree

  Execution times (in seconds):
   P    dist.   Deep (RD)        Shallow (RS)     Monadic (M)
                comp.   total    comp.   total    comp.   total
   1      -     23.5    23.5     22.8    22.8     N.A.    N.A.
   2    0.48    12.3    12.7     12.6    13.0     N.A.    N.A.
   4    0.56     6.13    6.70     5.81    6.36    60.1    60.7
   8    0.62     3.18    3.80     3.79    4.41    20.9    21.5
   16   0.70     1.81    2.51     1.96    2.65    14.8    15.5
   32   0.77     1.11    1.88     1.04    1.81     8.36    9.14
   64   0.83     0.68    1.52     0.57    1.40     4.47    5.30

Fig. 4. Plots of Table 1 by total execution time (left) and by speedups of computation time (right).

First, it is natural that no difference existed in the cost of initial data distribution. As the plots on the left of Fig. 4 indicate, our algorithm exhibited good scalability for both (RS) and (RD). Results of (M) fell behind the other two: it ran out of memory with up to 2 processors (the result with 4 processors also seems affected), and its execution was around seven times slower than the other cases. The reason is that there are no reducible subtrees in the first phase, which incurs heavy data transactions and costly computation by extended distributivity. It should be noted that the similar BSP algorithm for the all nearest smaller values problem, a generalization of parentheses matching, is shown not to suffer from heavy transactions in the average case [5]. While we still need to develop a similar theoretical argument for our algorithm, the experiments on random trees (RS and RD) suggest that it works efficiently in average cases.
We make a brief comparison with existing approaches to parallel tree computation. One common approach under distributed memory environments is based on list ranking, whose parallel tree contractions require an expensive cost of O(N/P log P) [11,12,13]. Using a technique called m-bridges [14] for initial data distribution, our existing library implements parallel tree contractions in O(N/P + P) over trees kept as linked structures [15,16,9]. In comparison with them, the approach in this paper has two notable advantages. First, it can be coded as simple for-loops over one-dimensional arrays and benefits well from compiler optimizations. We also observed that the high memory locality of the serialized forms brought considerable performance improvements, especially when the required computation was memory-intensive and cache effects become important. Second, the data distribution process in this paper is really small compared to the cost of m-bridges, in which a traversal over the linked structure is involved.


5 Concluding Remarks
In this paper we have developed a new approach to reducing a tree in parallel. The algorithm has been shown to run scalably as well as fast, in theory and in practice, by exploiting serialized trees and a property called extended distributivity.
At the moment we have only analyzed the cost of the worst case. It is an interesting task to analyze average cases, or to go a step further to the theory of heterogeneous cases.
Acknowledgment. This research was partially supported by the Ministry of Education, Culture, Sports, Science and Technology, Grant-in-Aid for Young Scientists (B), 17700026, 2005-2007.

References
1. Berkman, O., Schieber, B., Vishkin, U.: Optimal doubly logarithmic parallel algorithms
based on finding all nearest smaller values. Journal of Algorithms 14 (1993)
2. Prasad, S., Das, S., Chen, C.: Efficient EREW PRAM algorithms for parentheses-matching.
IEEE Transactions on Parallel and Distributed Systems 5(9) (1994)
3. Cole, M.: Parallel programming with list homomorphisms. Parallel Processing Letters 5
(1995)
4. Kravets, D., Plaxton, C.: All nearest smaller values on the hypercube. IEEE Transactions on
Parallel and Distributed Systems 7(5) (1996)
5. He, X., Huang, C.: Communication efficient BSP algorithm for all nearest smaller values
problem. Journal of Parallel and Distributed Computing 16 (2001)
6. Valiant, L.: A bridging model for parallel computation. Communication of the ACM 33(8)
(1990)
7. Kakehi, K., Matsuzaki, K., Emoto, K., Hu, Z.: A practicable framework for tree reduction
under distributed memory environments. Technical Report METR 2006-64, Department of
Mathematical Informatics, University of Tokyo (2006)
8. Skillicorn, D.B.: Parallel implementation of tree skeletons. Journal of Parallel and
Distributed Computing 39(2) (1996)
9. Matsuzaki, K., Hu, Z., Kakehi, K., Takeichi, M.: Systematic derivation of tree contraction
algorithms. Parallel Processing Letters 15(3) (2005) (Original version appeared in Proc. 4th
International Workshop on Constructive Methodology of Parallel Programming, 2004.).
10. Mignet, L., Barbosa, D., Veltri, P.: The XML web: a first study. In Proceedings of the Twelfth International World Wide Web Conference, ACM Press (2003)
11. Mayr, E.W., Werchner, R.: Optimal routing of parentheses on the hypercube. Journal of
Parallel and Distributed Computing 26(2) (1995)
12. Mayr, E.W., Werchner, R.: Optimal tree contraction and term matching on the hypercube
and related networks. Algorithmica 18(3) (1997)
13. Dehne, F., Ferreira, A., Caceres, E., Song, S., Roncato, A.: Efficient parallel graph algorithms
for coarse-grained multicomputers and BSP. Algorithmica 33(2) (2002)
14. Reid-Miller, M., Miller, G.L., Modugno, F.: List ranking and parallel tree contraction. In
Reif, J.H., ed.: Synthesis of Parallel Algorithms. Morgan Kaufmann (1993)
15. SkeTo Project Home Page. http://www.ipl.t.u-tokyo.ac.jp/sketo/
16. Matsuzaki, K., Emoto, K., Iwasaki, H., Hu, Z.: A library of constructive skeletons for
sequential style of parallel programming (invited paper). In: Proceedings of the First
International Conference on Scalable Information Systems, IEEE Press (2006)

Efficient Implementation of Tree Accumulations


on Distributed-Memory Parallel Computers
Kiminori Matsuzaki
Graduate School of Information Science and Technology, University of Tokyo,
7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
kmatsu@ipl.i.u-tokyo.ac.jp

Abstract. In this paper, we develop an efficient implementation of two tree accumulations. In this implementation, we divide a binary tree based on the idea of
m-bridges to obtain high locality, and represent local segments as serialized arrays to obtain high sequential performance. We furthermore develop a cost model
for our implementation. The experimental results show good performance.

1 Introduction
This paper develops an efficient implementation of two tree accumulations for binary trees, upwards accumulation and downwards accumulation, on distributed-memory parallel computers. Tree accumulations are basic manipulations of trees: the upwards accumulation aggregates, for each node, the data at all its descendants, and the downwards accumulation aggregates the data at all its ancestors. These two tree accumulations have been used for solving many tree problems, for example, computing the depths and sizes of all subtrees, and determining maximal independent sets [2].
For parallel tree manipulations, tree contraction algorithms have been studied intensively [3,4,5,6,7]. Tree contraction algorithms are parallel algorithms that reduce a tree into its root by independent removals of nodes. Several tree contraction algorithms have been developed for many parallel computational models such as the EREW PRAM [4], hypercubes [5], and BSP/CGM [6]. Several studies have also clarified that tree accumulations can be implemented in parallel based on tree contraction algorithms [4,8].
We are developing the parallel skeleton library SkeTo [9], which provides parallel manipulations for trees as well as lists and matrices. Compared with the implementations so far, our implementation of tree accumulations has the following three features, which are important for efficient parallel computation on distributed-memory computers.
High locality. Locality is one of the most important properties in developing efficient parallel programs, especially for distributed-memory computers. We adopt m-bridges [10] from basic graph theory to divide binary trees with high locality.
High sequential performance. The performance of the sequential parts is as important as that of the communication parts for efficient parallel programs. We represent a local segment as a serialized array and implement the computations in sequential parts with loops rather than recursive functions.

The full discussion of the paper is given in our technical report [1].




Cost model. We also formalize a cost model of our parallel implementation. The
cost model helps us to divide binary trees with good load balance.
The organization of the paper is as follows. In the following Sect. 2, we introduce
the division and representation of binary trees in our implementation. In Sect. 3, we
develop the implementation of two tree accumulations, and show the cost model. In
Sect. 4, we discuss the optimal division of binary trees based on the cost model, and
show experiment results in Sect. 5. In Sect. 6, we make concluding remarks.

2 Internal Representation of Binary Trees


In efficient parallel programs on distributed-memory parallel computers, we need to divide the data structures into smaller parts to distribute them to the processors. Locality and load balance are the two most important properties of efficient parallel programs. We start the discussion by introducing some graph-theoretic results [10]. Let size_b(v) denote the number of nodes in the subtree rooted at vertex v.
Definition 1 (m-Critical Node [10]). A vertex v is an m-critical node if v is an internal node and for each child v' of v the inequality ⌈size_b(v)/m⌉ > ⌈size_b(v')/m⌉ holds.
The m-critical nodes divide a tree into sets of adjacent nodes (m-bridges) as shown in Fig. 1. Note that the global structure given by the m-bridges also forms a binary tree.
Definition 2 (m-Bridge [10]). An m-bridge is a set of adjacent nodes divided by m-critical nodes, that is, a largest set of adjacent nodes in which m-critical nodes appear only on the ends.
Let N be the number of nodes and P be the number of processors. In previous studies [10,11], we divided a tree into m-bridges using m given by m = 2N/P. Under this division, we obtain at most (2P - 1) m-bridges and each processor deals with at most two m-bridges. Of course this division enjoys high locality, but it has poor load balance since the maximum number of nodes passed to a processor may be 2N/P, which is twice that of the best load-balancing case. In this paper, we divide a tree into m-bridges using a smaller m and assign more m-bridges to each processor. In Sect. 4, we discuss how to adjust m based on the cost model of our implementation.
Fig. 1. An example of m-critical nodes and m-bridges. Left: There are three 4-critical nodes, denoted by the doubly-lined circles. The number in each node denotes the number of nodes in its subtree. Right: There are seven 4-bridges, (a)-(g), each of which is a set of connected nodes.



Fig. 2. Internal representation of divided binary trees. Each local segment is distributed to one of the processors and is not shared. The labels L, N and C denote a leaf, a normal internal node, and a critical node, respectively. Each critical node is included in the parent segment.

Generally speaking, a tree structure is often implemented with pointers or references. In this paper, we represent the tree structure with arrays, as shown in Fig. 2, where the elements are aligned in the order of a preorder traversal. This representation has an advantage in terms of locality: we can reduce cache misses since adjoining elements are aligned one next to another.
In the following, gt denotes the serialized array for the global structure, and seg[i] denotes the serialized array for the ith local segment. We use the functions isLeaf, isNode and isCrit to check whether a node is a leaf, an internal node, or a critical node.
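As a hypothetical sketch only (C++, our own naming, not the SkeTo library types), the serialized segments of Fig. 2 can be represented as flat arrays of tagged nodes:

  #include <vector>

  enum class Tag { Leaf, Node, Crit };          // the labels L, N and C of Fig. 2

  template <class A>
  struct SNode { Tag tag; A value; };           // one element of a serialized array

  template <class A>
  struct DividedTree {
      std::vector<SNode<A>>              gt;    // global structure, preorder
      std::vector<std::vector<SNode<A>>> seg;   // local segments, preorder
  };

  template <class A> bool isLeaf(const SNode<A>& n) { return n.tag == Tag::Leaf; }
  template <class A> bool isNode(const SNode<A>& n) { return n.tag == Tag::Node; }
  template <class A> bool isCrit(const SNode<A>& n) { return n.tag == Tag::Crit; }

In an actual distributed setting each seg[i] would live on the processor that owns it; keeping the elements in preorder is what gives the cache-friendly, loop-based traversals used below.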

3 Implementation and Cost Model of Tree Accumulations


In this section, we show the implementation and the cost model of the tree accumulations on distributed-memory parallel computers. We implement the local computations in the tree accumulations using loops and stacks over the serialized arrays. This is the most significant contribution of this paper, with which the parallel programs reduce cache misses and achieve good speedups even against efficient sequential programs.
Before showing the two accumulations, we introduce several parameters for the cost model. The computational time of a function f executed with p processors is denoted by t_p(f). Parameter N denotes the number of nodes, and P denotes the number of processors. Parameter m is used for m-critical nodes and m-bridges, and M denotes the number of segments after the division. For the ith segment, in addition to the number of nodes L_i, we introduce the parameter D_i indicating the depth of the critical node. Parameter c_α denotes the communication time for a value of type α. We develop the cost model with the overall communication cost.
The cost model for the tree accumulations can be uniformly given as the sum of the maximum cost of local computations and the cost of global computations as follows:
  max_p Σ_{pr(i)=p} (L_i · t_l + D_i · t_d) + M · t_m
where Σ_{pr(i)=p} denotes the summation of costs over the m-bridges assigned to processor p, and t_l, t_d, and t_m are given in terms of the costs of functions and communications. The term (L_i · t_l) indicates the computational time required in sequential computation and the term (D_i · t_d) indicates the overhead caused by parallelism. The last term (M · t_m) indicates the overhead in terms of the global structure.


Upwards Accumulation. Upwards accumulation applies a ternary function to each internal node in a bottom-up manner and returns a tree of the same shape as the input. In Haskell notation, a sequential definition of the upwards accumulation is given as follows.
uAcc :: (α -> β -> α -> α) -> Tree α β -> Tree α α
uAcc k (Leaf a)     = Leaf a
uAcc k (Node l b r) = let l' = uAcc k l; r' = uAcc k r
                      in Node l' (k (root l') b (root r')) r'
Function root returns the root value of the given tree. The upwards accumulation takes N · t_1(k)/2 time by sequential execution.
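For reference, a self-contained executable reading of this specification follows; the concrete Tree datatype and the root function are spelled out here as assumptions about the paper's notation, not as its definitions.

data Tree a b = Leaf a | Node (Tree a b) b (Tree a b)

root :: Tree a a -> a
root (Leaf a)     = a
root (Node _ b _) = b

uAcc :: (a -> b -> a -> a) -> Tree a b -> Tree a a
uAcc _ (Leaf a)     = Leaf a
uAcc k (Node l b r) = let l' = uAcc k l
                          r' = uAcc k r
                      in  Node l' (k (root l') b (root r')) r'

For example, uAcc (\l b r -> l + b + r) replaces every internal node by the sum of its subtree.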
For an efficient implementation of the upwards accumulation, we require four auxiliary functions φ, ψ_n, ψ_l, and ψ_r satisfying the following equations:
\[ k\ l\ b\ r = \psi_n\ l\ (\varphi\ b)\ r \]
\[ \psi_n\ (\psi_n\ x\ l\ y)\ b\ r = \psi_n\ x\ (\psi_l\ l\ b\ r)\ y \]
\[ \psi_n\ l\ b\ (\psi_n\ x\ r\ y) = \psi_n\ x\ (\psi_r\ l\ b\ r)\ y \]
Let the type of the result of φ be β. The intuitive meaning of these auxiliary functions is as follows: the computation on an internal node is lifted up by function φ to some domain and pulled down by function ψ_n from that domain, where a certain kind of associativity holds against functions ψ_l and ψ_r on the domain.
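As a hedged worked example (ours, not from the paper): when k simply sums its three arguments, the conditions above are met by taking φ to be the identity and all three ψ functions to be ternary addition.

-- k l b r = l + b + r satisfies the three equations with:
phi :: Int -> Int
phi b = b

psiN, psiL, psiR :: Int -> Int -> Int -> Int
psiN l b r = l + b + r
psiL l b r = l + b + r
psiR l b r = l + b + r
-- e.g. psiN (psiN x l y) b r == psiN x (psiL l b r) y  for all x, l, y, b, r.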
Under this condition, we implement the upwards accumulation in five steps.
At the first step, we apply the following UACC_LOCAL to each segment to compute the local upwards accumulation. This function puts the intermediate result into the array seg' if a node has no critical node among its descendants. (This value is indeed the result of uAcc.) The function returns the result of the local reduction and the array seg'.
UACC_LOCAL(k, φ, ψ_l, ψ_r, seg)
  stack ← ∅;  d ← 0
  for i ← seg.size − 1 downto 0
    if isLeaf(seg[i]) then  seg'[i] ← seg[i];  push(stack, seg'[i]);  d ← d + 1
    if isNode(seg[i]) then
      lv ← pop(stack);  rv ← pop(stack)
      if d == 0      then  push(stack, ψ_l(lv, φ(seg[i]), rv));  d ← 0
      else if d == 1 then  push(stack, ψ_r(lv, φ(seg[i]), rv));  d ← 0
      else                 seg'[i] ← k(seg[i], lv, rv);  push(stack, seg'[i]);  d ← d − 1
    if isCrit(seg[i]) then  push(stack, φ(seg[i]));  d ← 0
  top ← pop(stack);  return (top, seg')

In the computation of UACC_LOCAL, φ and either ψ_l or ψ_r are applied to each node on the path from the critical node to the root, and k is applied to the other internal nodes. Since the number of internal nodes is half of L_i, we obtain the cost of UACC_LOCAL as
\[ \max_p \sum_{pr(i)=p} \bigl( (L_i/2 - D_i)\, t_1(k) + D_i\,(t_1(\varphi) + \max(t_1(\psi_l), t_1(\psi_r))) \bigr). \]
At the second step, we gather the results of the local reductions into the global structure gt on the root processor. From each leaf segment a value of type α is sent, and from each internal segment a value of type β is sent. Therefore, the communication cost of the second step is given as M/2 · c_α + M/2 · c_β.
At the third step, we compute the upwards accumulation for the global structure gt on the root processor. Function UACC_GLOBAL performs a sequential upwards accumulation using function ψ_n.


UACC_GLOBAL(ψ_n, gt)
  stack ← ∅
  for i ← gt.size − 1 downto 0
    if isLeaf(gt[i]) then  gt'[i] ← gt[i]
    if isNode(gt[i]) then  lv ← pop(stack);  rv ← pop(stack);  gt'[i] ← ψ_n(lv, gt[i], rv)
    push(stack, gt'[i])
  return gt'

In this function, we apply function ψ_n to each internal segment of gt, and thus the cost of the third step is given as M/2 · t_1(ψ_n).
At the fourth step, we send the results of the global upwards accumulation to the processors, where two values are sent to each internal segment and no value is sent to any leaf segment. All the values have type α after the global upwards accumulation, and thus the communication cost of the fourth step is given as M · c_α.
At the last step, we apply function UACC_UPDATE to each internal segment. The two values pushed onto the stack are the values passed in the previous step, and correspond to the two results of the children of the critical node. Note that in the last step we only compute the missing values left in the segment seg'.

UACC_UPDATE(k, seg, seg', lc, rc)
  stack ← ∅;  push(stack, rc);  push(stack, lc);  d ← 0
  for i ← seg.size − 1 downto 0
    if isLeaf(seg[i]) then  push(stack, seg'[i]);  d ← d + 1
    if isNode(seg[i]) ∨ isCrit(seg[i]) then
      lv ← pop(stack);  rv ← pop(stack)
      if d == 0 ∨ d == 1 ∨ isCrit(seg[i]) then
        seg'[i] ← k(seg[i], lv, rv);  push(stack, seg'[i]);  d ← 0
      else  push(stack, seg'[i]);  d ← d − 1
  return seg'

In this step, function k is applied to the nodes on the path from the critical node to the root. Noting that the depth of the critical node in the ith segment is D_i, we can give the cost as
\[ \max_p \sum_{pr(i)=p} D_i\, t_1(k). \]
Summarizing the discussion above, we can give the cost model of the upwards accumulation as follows. Note that the coefficient of L_i is the same as that of the sequential program and is not affected by the introduction of the auxiliary functions.
\[ t_p(\mathrm{uAcc}\ k) = \max_p \sum_{pr(i)=p} \Bigl( L_i\,\frac{t_1(k)}{2} + D_i\,\bigl(t_1(\varphi) + \max(t_1(\psi_l), t_1(\psi_r))\bigr) \Bigr) + M\,\bigl(3 c_\alpha + c_\beta + t_1(\psi_n)\bigr)/2 \]
Downwards Accumulation. We now show the implementation of the downwards accumulation. Downwards accumulation is a top-down computation in which an accumulative parameter c is updated with two functions (gl, gr). The values of the leaves are not used in this computation.

dAcc :: (γ -> β -> γ, γ -> β -> γ) -> γ -> Tree α β -> Tree γ γ
dAcc (gl, gr) c (Leaf a)     = Leaf c
dAcc (gl, gr) c (Node l b r) = let l' = dAcc (gl, gr) (gl c b) l
                                   r' = dAcc (gl, gr) (gr c b) r
                               in  Node l' c r'


The downwards accumulation takes N · (t_1(gl) + t_1(gr))/2 time by sequential execution.
For an efficient parallel implementation, we require auxiliary functions φ_l, φ_r, ψ_u and ψ_d satisfying the following equations. Let δ be the type of the results of φ_l and φ_r.
\[ gl\ c\ n = \psi_d\ c\ (\varphi_l\ n), \qquad gr\ c\ n = \psi_d\ c\ (\varphi_r\ n), \qquad \psi_d\ (\psi_d\ c\ n)\ m = \psi_d\ c\ (\psi_u\ n\ m) \]
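Again as a hedged worked example (ours, not from the paper): when both gl and gr simply add the node value to the accumulator, the conditions hold with φ_l and φ_r the identity and with ψ_d, ψ_u both binary addition.

gl, gr :: Int -> Int -> Int
gl c b = c + b
gr c b = c + b

phiL, phiR :: Int -> Int
phiL n = n
phiR n = n

psiD, psiU :: Int -> Int -> Int
psiD c m = c + m
psiU n m = n + m
-- Check: gl c n == psiD c (phiL n), and psiD (psiD c n) m == psiD c (psiU n m).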

The implementation of the downwards accumulation also consists of five steps. Due to the page limit, we only show an outline of the implementation; see [1] for details.
1. For each internal segment, compute intermediate values corresponding to the path from the root to the critical node. We can implement this step by a reversed loop with a stack. The cost is max_p Σ_{pr(i)=p} D_i · (max(t_1(φ_l), t_1(φ_r)) + 2 t_1(ψ_u)).
2. Gather the local results to the root processor. The communication cost is M · c_δ.
3. Compute the downwards accumulation on the root processor for the global structure by a forward loop with a stack. The cost of this step is given as M · t_1(ψ_d).
4. Distribute the results of the global downwards accumulation to each segment. The communication cost of this step is M · c_γ.
5. For each segment, compute the downwards accumulation starting from the result of the global downwards accumulation. We can implement this computation using a forward loop with a stack. The cost of this step is given as max_p Σ_{pr(i)=p} L_i/2 · (t_1(gl) + t_1(gr)).
The overall cost model for the downwards accumulation is given as follows. Here again the coefficient of L_i is the same as that of the sequential computation.
\[ \max_p \sum_{pr(i)=p} \Bigl( L_i\,\frac{t_1(gl) + t_1(gr)}{2} + D_i\,\bigl(\max(t_1(\varphi_l), t_1(\varphi_r)) + 2\, t_1(\psi_u)\bigr) \Bigr) + M\,\bigl(c_\delta + t_1(\psi_d) + c_\gamma\bigr) \]

4 Optimal Division of Binary Trees Based on Cost Model


Locality and load balance are the two major properties in developing efficient parallel programs. In the division of trees, we enjoy good locality with a larger m, while we enjoy good load balance with a smaller m. Therefore we need to find an appropriate value for m.
The parameters L_i, D_i, and M in the cost model satisfy the following inequalities with respect to m:
\[ L_i \le m, \qquad D_i \le L_i/2 \le m/2, \qquad (N/m - 1)/2 < M < 2N/m - 1. \]

By using a greedy balancing algorithm, with respect to the number of nodes and the depth of the critical node, for distributing the m-bridges, we obtain the following worst-case cost:
\[ \max_p \sum_{pr(i)=p} (L_i\, t_l + D_i\, t_d) + M\, t_m \;\le\; (N/P + m)\,(t_l + t_d/2) + M\, t_m \]
The equality holds if all the m-bridges have the same number of nodes and all the critical nodes are at depth m/2, i.e., for fully ill-balanced trees. With the inequality between M and m, the right-hand side above attains its minimum value for some value of m in the following range:
\[ \sqrt{\frac{t_m\, N}{2 t_l + t_d}} \;<\; m \;<\; 2\sqrt{\frac{t_m\, N}{2 t_l + t_d}} \]

Fig. 3. Execution times and speedups against the sequential program for the balanced, random and ill-balanced trees, as functions of the number of processors, where m = 2 × 10^4
Fig. 4. Execution times for the randomly generated tree when changing the parameter m (left: execution times vs. number of processors for m = 2000, 20000, 200000; right: execution times vs. the value of m, in thousands, for P = 16, 32, 64)

5 Experiment Results
To verify the efficiency of the implementation of tree accumulations, we performed several experiments. We used our PC cluster of uniform PCs, each with a Pentium 4 2.8 GHz CPU and 2 GByte of memory, connected with Gigabit Ethernet. The compiler and MPI library used were gcc 4.1.1 and MPICH 1.2.7, respectively.
We solved the party planning problem [12], which is a generalization of maximal independent sets [2]. The input trees are (1) a balanced tree, (2) a randomly generated tree and (3) a fully ill-balanced tree, each with 16777215 (= 2^24 − 1) nodes. The parameters for the cost model are t_l = 0.18 μs, t_d = 0.25 μs, and t_m = 100 μs on our PC cluster.
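As a quick sanity check (our own arithmetic, rounded to two significant figures), plugging these measured parameters and N = 2^24 − 1 into the range derived in Sect. 4 gives
\[ \sqrt{\frac{t_m\, N}{2 t_l + t_d}} = \sqrt{\frac{100 \times 16777215}{2 \times 0.18 + 0.25}} \approx 5.2 \times 10^4, \qquad 2\sqrt{\frac{t_m\, N}{2 t_l + t_d}} \approx 1.0 \times 10^5, \]
which agrees with the empirically good range for m reported below.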
Figure 3 shows the general performance of the tree accumulations. Each execution time excludes the initial data distribution and the final gathering. The speedups are plotted against a sequential implementation of the tree accumulations. As seen in these plots, the implementation shows not only scalability but also good sequential performance. For the fully ill-balanced tree the implementation performs worse, but this is caused by the factor D_i · t_d (≈ 0.7 · L_i · t_l for this program) introduced for parallelism.
To analyze the behavior in more detail, we performed further experiments changing the value of m. The results are shown in Fig. 4. Roughly speaking, as seen from Fig. 4 (left), the implementation of tree accumulations scales under both large and small m. Figure 4 (right) plots the execution time with respect to the parameter m. The performance gets
worse for too small m or for too large m; good performance is obtained in the range 5 × 10^4 < m < 1 × 10^5 computed from the parameters of the cost model estimated with a small test case.

6 Conclusion
We have developed an efficient implementation of tree accumulations. Not only does our implementation show good performance against efficient sequential programs, but the cost model of the implementation also helps us to divide a tree into segments. The implementation will be available as part of the SkeTo library (the SkeTo library is available at http://www.ipl.t.u-tokyo.ac.jp/sketo/). Our future work is to develop a profiling system that determines an accurate parameter m for dividing trees.
Acknowledgments. This work is partially supported by the Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (B) 17300005, and the Ministry of Education, Culture, Sports, Science and Technology, Grant-in-Aid for Young Scientists (B) 18700021.

References
1. Matsuzaki, K., Hu, Z.: Efficient implementation of tree skeletons on distributed-memory
parallel computers. Technical Report METR 2006-65, Department of Mathematical Informatics, the University of Tokyo (2006).
2. He, X.: Efficient parallel algorithms for solving some tree problems. In: 24th Allerton Conference on Communication, Control and Computing (1986).
3. Miller, G.L., Reif, J.H.: Parallel tree contraction and its application. In: 26th Annual Symposium on Foundations of Computer Science, 21-23 October 1985, Portland, Oregon, USA, IEEE Computer Society (1985).
4. Abrahamson, K.R., Dadoun, N., Kirkpatrick, D.G., Przytycka, T.M.: A simple parallel tree
contraction algorithm. Journal of Algorithms 10(2) (1989).
5. Mayr, E.W., Werchner, R.: Optimal tree contraction and term matching on the hypercube
and related networks. Algorithmica 18(3) (1997).
6. Dehne, F.K.H.A., Ferreira, A., Caceres, E., Song, S.W., Roncato, A.: Efficient parallel graph
algorithms for coarse-grained multicomputers and BSP. Algorithmica 33(2) (2002).
7. Vishkin, U.: A no-busy-wait balanced tree parallel algorithmic paradigm. In: SPAA 2000: Proceedings of the 12th Annual ACM Symposium on Parallel Algorithms and Architectures, July 9-13, 2000, Bar Harbor, Maine, USA, ACM Press (2000).
8. Gibbons, J., Cai, W., Skillicorn, D.B.: Efficient parallel algorithms for tree accumulations.
Science of Computer Programming 23(1) (1994).
9. Matsuzaki, K., Iwasaki, H., Emoto, K., Hu, Z.: A library of constructive skeletons for sequential style of parallel programming. In: InfoScale '06: Proceedings of the 1st International Conference on Scalable Information Systems. Volume 152 of ACM International Conference Proceeding Series, ACM Press (2006).
10. Reif, J.H., ed.: Synthesis of Parallel Algorithms. Morgan Kaufmann Publishers (1993).
11. Matsuzaki, K., Hu, Z., Takeichi, M.: Implementation of parallel tree skeletons on distributed systems. In: The Third Asian Workshop on Programming Languages and Systems, APLAS'02, Shanghai Jiao Tong University, Shanghai, China, November 29 - December 1, 2002, Proceedings (2002).
12. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms. Second
ed. MIT Press (2001).

SymGrid-Par: Designing a Framework for


Executing Computational Algebra Systems on
Computational Grids
Abdallah Al Zain1 , Kevin Hammond2 , Phil Trinder1 , Steve Linton2 ,
Hans-Wolfgang Loidl3 , and Marco Costanti2
1

Dept. of Mathematics and Comp. Sci., Heriot-Watt University, Edinburgh, UK


{ceeatia,trinder}@macs.hw.ac.uk
2
School of Computer Science, University of St Andrews, St Andrews, UK
{kh,sal,costanti}@cs.st-and.ac.uk
3
Ludwig-Maximilians-Universität München, Germany
hwloidl@informatik.uni-muenchen.de

Abstract. SymGrid-Par is a new framework for executing large computer algebra problems on computational Grids. We present the design of SymGrid-Par, which supports multiple computer algebra packages, and hence provides the novel possibility of composing a system using components from different packages. Orchestration of the components on the Grid is provided by a Grid-enabled parallel Haskell (GpH). We present a prototype implementation of a core component of SymGrid-Par, together with promising measurements of two programs on a modest Grid, to demonstrate the feasibility of our approach.

1 Introduction
This paper considers the design of high-performance parallel computational algebra systems targeting computational Grids, undertaken as part of the European Union Framework VI grant RII3-CT-2005-026133 (SCIEnce). Parallelising computational algebra problems is challenging since they frequently possess highly-irregular data- and computational structures. We describe early stages of work on the SymGrid system that will ultimately Grid-enable a range of important computational algebra systems, including at least Maple [12], MuPad [21], Kant [16] and GAP [18]. The SymGrid system comprises two distinct parts: SymGrid-Services allows symbolic computations to access, and to be offered as, Grid services; conversely, SymGrid-Par enables the parallel execution of large symbolic computations on computational Grids. This paper focuses on SymGrid-Par. While there are some parallel symbolic systems that are suitable for either shared-memory or distributed-memory parallel systems (e.g. [13,11,15,19,5]), work on Grid-based symbolic systems is still nascent, and our work is therefore highly novel, notably in aiming to allow the construction



of heterogeneous computations, combining components from different computational algebra systems. In this paper, we introduce the design of SymGrid-Par (Section 2); outline a prototype implementation (Section 3); and present some preliminary results to demonstrate the realisability of our approach (Section 4). In particular, we demonstrate the integration of GRID-GUM with one important computational algebra system, the GAP system for programming with groups and permutations, and show that we can exploit parallelism within a single Grid-enabled cluster. This represents the first step towards a general heterogeneous framework for symbolic computing on the Grid that will eventually allow the orchestration of complete symbolic applications from mixed components written using different computational algebra systems, running on a variety of computer architectures in a geographically dispersed setting, and accessing distributed data and other resources.
2 The SymGrid-Par Middleware Design
Computational algebra has played an important role in notable mathematical developments, for example in the classification of Finite Simple Groups. It is essential in several areas of mathematics which apply to computer science, such as formal languages, coding theory, or cryptography. Applications are typically characterised by complex and expensive computations that would benefit from parallel computation, but which may exhibit a high degree of irregularity in terms of both data- and computational structures. In order to provide proper support for high-performance symbolic computing applications, we therefore use a multi-level approach where parallelism may be exploited within a local cluster (or, indeed, within a single multiprocessor/multicore system), and where individual clusters may then be marshalled into a larger heterogeneous system. In order to allow adequate flexibility, we also provide support for dynamic task allocation, rebalancing and migration. Although we will not discuss it in this paper, our design also allows us to exploit the availability of specific Grid resources, which may not be distributed across the entire computational Grid.
The SymGrid-Par middleware is built on the GRID-GUM [8] Grid-enabled implementation of Glasgow Parallel Haskell (GpH) [25], a well-established semi-implicitly parallel extension to the standard Glasgow Haskell Compiler. GpH provides various high-level parallelism services including support for ultra-lightweight threads, virtual shared-memory management, scheduling support, automatic thread placement, automatic datatype-specific marshalling/unmarshalling, implicit communication, load-based thread throttling, and thread migration. It thus provides a flexible, adaptive environment for managing parallelism at various degrees of granularity, and has therefore been ported to a variety of shared-memory and distributed-memory systems. GRID-GUM replaces the MPI-based low-level communication library in GpH with MPICH-G2, which integrates with standard Globus Toolkit middleware.

[Figure 1 sketches the layered design: computational algebra systems (GAP, Maple, Kant) sit above the CAG interface, which calls into the Grid middleware (GpH/GRID-GUM); the GCA interface in turn calls back into the CASs. Figure 2 shows an OpenMath service request of the form <OMOBJ><OMA><OMS name=FuncName cd=SomeCD/> <OMV name=arg_1/> ... <OMV name=arg_N/> </OMA></OMOBJ>.]

Fig. 1. SymGrid-Par Design Overview

Fig. 2. GCA OpenMath Service Request

Our overall design is shown in Figure 1. SymGrid-Par comprises two interfaces: CAG links computational algebra systems (CASs) to GpH; and GCA
conversely links GpH to these CASs. The purpose of the CAG/GCA interfaces
is to enable CASs to execute on computational Grids, e.g. on a loosely-coupled
collection of Grid-enabled clusters. This is achieved by calling from the CAS to
the Grid-enabled GpH middleware using CAG. This, in turn, calls CAS functions on remote processing elements using GCA.
2.1 GCA Design

The purpose of the GCA interface is to allow CAS functions to be invoked from GpH. In this way, GpH deals with issues of process creation/coordination, and the CAS deals with the algebraic computations. GpH and the CAS run as separate operating-system processes, communicating through shared pipes. Figure 3 shows the design of the GCA interface. Unidirectional pipes connect each GRID-GUM process to the companion CAS process, as shown in Figure 4. Objects that are communicated between the two systems are encoded using the standard OpenMath [6] format for exchanging mathematical data (see Figure 2).
2.2 CAG Design

The CAG interface will support low-effort Grid programming by providing algorithmic skeletons [14] that have been tailored to the needs of computational Grids. Figure 5 shows the standard GAP functions that we believe can form the basis for an initial set of skeletons (ParGAP [15] has also identified a similar set of parallel operations). Here each argument to the pattern is separated by an arrow (->), and may operate over lists of values ([..]), or pairs of values

[Figure 3 shows the GCA interface: on each processor, GpH/GRID-GUM communicates with its peers over MPICH-G2 and, through GCA, with a companion CAS process (GAP, Maple, KANT, ...). Figure 4 shows the GCA design: input and output pipes carrying OpenMath (OM) objects connect GpH to the CAS, with OM encode/decode at each end.]

Fig. 3. GCA Interface

Fig. 4. GCA Design

((..,..)). All of the patterns are polymorphic: a, b, etc. stand for (possibly different) concrete types. The first argument in each case is a function of either one or two arguments that is to be applied in parallel. For example, parMap applies its function argument to each element of its second argument (a list) in parallel, and parReduce will reduce its third argument (a list) by applying the function between pairs of elements, ending with the value supplied as its second argument.
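As a hedged illustration (our own reading of the intended semantics, not code from the paper), the two patterns just described behave, result-wise, like their sequential Haskell counterparts; only the evaluation order is parallel in GpH:

-- Sequential reference reading of the two skeletons described above.
seqMap :: (a -> b) -> [a] -> [b]
seqMap f xs = map f xs                 -- parMap applies f to every list element (in parallel)

seqReduce :: (a -> b -> b) -> b -> [a] -> b
seqReduce f z xs = foldr f z xs        -- parReduce combines elements pairwise, ending with z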

3 The GCA Prototype
The GCA prototype (Figure 6) interfaces GpH with GAP, connecting to a small interpreter that allows the invocation of arbitrary GAP functions, marshalling/unmarshalling data as required. The interface consists of both C and Haskell fragments. The C component is mainly used to invoke operating-system services that are needed to initiate the GAP process, to establish the pipes, and to send and receive commands/results from the GAP process. It also provides support for static memory that can be used to maintain state between calls. The functions GAPEval and GAPEvalN allow GpH programs to invoke GAP functions by simply giving the function name and a list of its parameters as

parMap :: (a -> b) -> [a] -> [b]
parZipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
parReduce :: (a -> b -> b) -> b -> [a] -> b
parMapReduce :: (a -> b -> b) -> (c -> [(d,a)]) -> c -> [(d,b)]

Fig. 5. CAG Functions

GAPEval :: String -> [GAPObject] -> GAPObject
GAPEvalN :: String -> [GAPObject] -> [GAPObject]
String2GAPExpr :: String -> GAPObject
GAPExpr2String :: GAPObject -> String

Fig. 6. GCA Prototype Functions


smallGroupSearch :: Int -> Int -> Int ->
    ((Int,Int) -> (Int,Int,Bool)) -> [(Int,Int)]
smallGroupSearch lo hi chunkSize pred =
  concat (map (ifmatch pred) [lo..hi] `using`
          parListChunk chunkSize)

ifmatch :: ((Int,Int) -> (Int,Int,Bool)) -> Int -> [(Int,Int)]
ifmatch predSmallGroup n =
  [ (i,n) | (i,n,b) <- ((map predSmallGroup [ (i,n) | i <- [1..nrSmallGroups n] ])
                        `using` parListBigChunk 10000), b ]

predSmallGroup :: (Int,Int) -> (Int,Int,Bool)
predSmallGroup (i,n) =
  (i, n, (gapObject2String (gapEval "IntAvgOrder"
            [int2GAPObject n, int2GAPObject i])) == "true")

nrSmallGroups :: Int -> Int
nrSmallGroups n = gapObject2Int (gapEval "NrSmallGroups" [int2GAPObject n])

Fig. 7. The smallGroup Kernel

[Figure 8 shows two GrAnSim activity profiles, plotting the number of running, runnable, fetching, blocked and migrating tasks against time. Left: pfib_mp.mpi 62 +RTS -qp4 -qPg -qP -sstderr, Average Parallelism = 4.0, Runtime = 419992 cycles. Right: smGrpSearch_mp.mpi 1 250 35 +RTS -qp5 -qPg -qP -qe200 -sstderr, Average Parallelism = 2.9, Runtime = 36433 cycles.]

Fig. 8. Left: Activity Profile for (parFib 62) on 4 PEs; right: Activity Profile for smallGroup (1-250) on 5 PEs

GAPObjects. The GAPEvalN function is used to invoke GAP functions returning more than one object. Finally, String2GAPExpr and GAPExpr2String convert GAP objects to/from our internal Haskell format.
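A small hedged usage sketch in the style of the Fig. 7 kernel may make this concrete (the wrapper gapFactorial is hypothetical; gapEval, int2GAPObject and gapObject2Int are the prototype functions used above, and Factorial is a standard GAP library function):

-- Invoke GAP's Factorial from GpH and decode the result back to a Haskell Int.
gapFactorial :: Int -> Int
gapFactorial n = gapObject2Int (gapEval "Factorial" [int2GAPObject n])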

4 Preliminary GCA Prototype Performance Results
We have measured the performance of our GCA prototype using two simple programs: parFib, a parallel benchmark that is capable of introducing large quantities of very fine-grained parallelism; and smallGroup, a group-algebra program that exhibits highly-irregular data-parallelism. In both cases GpH uses GCA to invoke the GAP engine to perform the actual computations, dealing only with the decomposition of the problem, marshalling, etc. The kernel of the smallGroup program is shown in Figure 7. Invocations of GAPEval can be clearly seen in


predSmallGroup and nrSmallGroups, and the associated marshalling of arguments and results is also clearly visible. Our experiments were performed on a Beowulf-style cluster of workstations at Heriot-Watt University, where each processing element (PE) is a 3GHz Celeron processor with 500kB cache, running Linux Fedora Core-2. All PEs are connected through a 100Mb/s fast Ethernet switch with a latency of 142 μs, as measured under PVM 3.4.2. In the following results, each runtime is given as the median of three execution times.
For (parFib 62), the GCA prototype implementation delivers good parallel performance on four processors, requiring 539s in the parallel case, as opposed to a sequential time of 2,559s (a superlinear speedup of 4.75, with an average parallelism of 4.0). Figure 8 (left) shows the corresponding GpH activity profile, where time on the x-axis is plotted against the number of threads in various states, e.g. running or blocked, on the y-axis. Note that the activity profiles record only GpH computations, and do not expose activity in the underlying GAP process, which may be lower or higher. The corresponding results for smallGroup show that on five PEs, the computation of the groups in the interval between 1 and 250 is completed in 37 seconds, compared with a sequential time of 144s, that is, a speedup of 3.9 at an average parallelism of 2.9. Figure 8 (right) shows the corresponding activity profile for the GpH component. In order to estimate the overheads due to parallelisation, we have also measured sequential times for this problem using the standard GAP system. In this case, the problem was executed in 87s; that is, the cost of marshalling/unmarshalling data and context switching between processes amounts to 66%. While this is a relatively high figure, we anticipate that we should be able to reduce this cost by careful tuning of the GRID-GUM runtime system. While these represent very early results for SymGrid-Par, and we therefore now intend to explore performance both on larger numbers of processors and on multiple clusters, and for larger-scale applications, it is clear that real benefit can be obtained for computational algebra problems on a clustered system without major rewriting of the computational algebra system.

5 Related Work
Significant research has been undertaken for specific parallel computational algebra algorithms, notably term re-writing and Gröbner basis completion (e.g. [9,10]). A number of one-off parallel programs have also been developed for specific algebraic computations, mainly in representation theory [2]. There is, however, little if any support for parallelism in the most widely-used CASs such as Maple, Axiom or GAP. As research systems, Maple/Linda-Sugarbush [13] supports sparse modular gcd and parallel bignum systems, with Maple/DSC [11] supporting sparse linear algebra, and ParGAP [15] supporting very coarse-grained computation between multiple GAP processes. There have also been a few attempts to link parallel functional programming languages with computer algebra systems, for example the GHC-Maple interface [5] or the Eden-Maple system [20]. None of these systems is in widespread use at present, however, and none supports the broad range of computational algebra applications we intend to target.
None of these systems is in widespread use at present, however, and none supports the broad range of computational algebra applications we intend to target.


Roch and Villard [23] provide a good general survey of work in the field as of 1997. Even less work has so far been carried out to interface CASs to the Grid. While a number of projects have considered the provision of CASs as Grid services, often extending existing web services frameworks, e.g. GENSS [3], GEMLCA [17], Netsolve/GridSolve [7], Geodise [4], MathGridLink [24] or Maple2G [22], and systems such as GridMathematica [1] allow Grid services to be called from within CASs, there has been very little work on adapting CASs so that they can cooperate as part of a general Grid resource. Key work is in the Maple2G system that is now being developed as part of our SCIEnce project. None of these systems is, however, capable of linking heterogeneous CASs as in SymGrid.

6 Conclusions
This paper has presented the design of SymGrid-Par, a framework for symbolic computing on heterogeneous computational Grids that uniquely enables the construction of complex systems by composing components taken from different symbolic computing systems. The core of SymGrid-Par is a pair of standard interfaces (CAG and GCA) that mediate between the symbolic computing systems and the GpH middleware. We have discussed a prototype GCA implementation and reported promising performance measurements for two simple GAP/GpH programs. In ongoing work funded by the EU FP VI SCIEnce project, we now intend to implement the more sophisticated CAG and GCA interfaces, initially for GAP, and then for Kant, Maple and MuPad. The implementations will be validated on larger symbolic computations and robustified. We also plan to demonstrate the inter-operation of multiple symbolic computing systems. In the longer term we will investigate issues associated with scheduling irregular symbolic computations on computational Grids, and develop more sophisticated parallel skeletons. Finally, we will provide wider access to high-performance symbolic computation by offering SymGrid-Par as a Grid service.

References
1. GridMathematica2, http://www.wolfram.com/products/gridmathematica/.
2. High performance computations in group representation theory. Preprint, Institut für Experimentelle Mathematik, Universität GH Essen, 1998.
3. GENSS, http://genss.cs.bath.ac.uk/index.htm, 2006.
4. Geodise, http://www.geodise.org/, 2006.
5. The GpH-Maple Interface, http://www.risc.uni-linz.ac.at/software/ghc-maple/, 2006.
6. The OpenMath Format,http://www.openmath.org/, 2006.
7. S. Agrawal, J. Dongarra, K. Seymour, and S. Vadhiyar. NetSolve: past, present,
and future; a look at a Grid enabled server. In Making the Global Infrastructure a
Reality, pages 613622. Wiley, 2003.
8. A. Al Zain, P. Trinder, H.-W. Loidl, and G. Michaelson. Managing Heterogeneity
in a Grid Parallel Haskell. J. Scalable Comp.: Practice and Experience, 6(4), 2006.

624

A. Al Zain et al.

9. B. Amrheim, O. Gloor, and W. Küchlin. A case study of multithreaded Gröbner basis completion. In Proc. of ISSAC'96, pages 95-102. ACM Press, 1996.
10. R. Bündgen, M. Göbel, and W. Küchlin. Multi-threaded AC term re-writing. In Proc. PASCO'94, volume 5, pages 84-93. World Scientific, 1994.
11. K. C. Chan, A. Díaz, and E. Kaltofen. A Distributed Approach to Problem Solving in Maple. In Proc. 5th Maple Summer Workshop and Symp., pages 13-21, 1994.
12. B. W. Char and et al. Maple V Language Reference Manual. Maple Publishing,
Waterloo Canada, 1991.
13. B.W. Char. A users guide to Sugarbush - Parallel Maple through Linda. Technical
report, Drexel University, Dept. of Mathematics and Comp. Sci., 1994.
14. M. Cole. Algorithmic Skeletons. In K. Hammond and G. Michaelson, editors,
Research Directions in Parallel Functional Programming, chapter 13, pages 289
304. Springer-Verlag, 1999.
15. G. Cooperman. Parallel GAP: Mature interactive parallel computing. In Groups and Computation III (Columbus, OH, 1999), de Gruyter, Berlin, 2001.
16. M. Daberkow, C. Fieker, J. Klüners, M. Pohst, K. Roegner, M. Schörnig, and K. Wildanger. KANT V4. J. Symb. Comput., 24(3/4):267-283, 1997.
17. T. Delaitre, A. Goyeneche, P. Kacsuk, T. Kiss, G.Z. Terstyanszky, and S.C. Winter.
GEMLCA: Grid Execution Management for Legacy Code Architecture Design. In
Proc. 30th EUROMICRO Conference, pages 305315, 2004.
18. The GAP Group. Gap groups, algorithms, and programming, version 4.2, 2000.
St Andrews, http://www.gap-system.org/gap.
19. W. Küchlin. PARSAC-2: A parallel SAC-2 based on threads. In AAECC-8, volume 508 of Lecture Notes in Computer Science, pages 341-353. Springer-Verlag, 1990.
20. R. Martínez and R. Peña. Building an Interface Between Eden and Maple. In Proc. IFL 2003, Springer-Verlag LNCS 3145, pages 135-151, 2004.
21. K. Morisse and A. Kemper. The Computer Algebra System MuPAD. Euromath
Bulletin, 1(2):95102, 1994.
22. D. Petcu, M. Paprycki, and D. Dubu. Design and Implementation of a Grid Extension of Maple, 2005.
23. L. Roch and G. Villard. Parallel computer algebra. In ISSAC97. Preprint IMAG
Grenoble France, 1997.
24. D. Tepeneu and T. Ida. MathGridLink Connecting Mathematica to the Grid. In
Proc. IMS 04, Ban, Alberta, 2004.
25. P.W. Trinder, K. Hammond, H.-W. Loidl, and S.L. Peyton Jones. Algorithm + Strategy = Parallelism. J. Functional Programming, 8(1):23-60, January 1998.

Directed Network Representation of Discrete


Dynamical Maps
Fragiskos Kyriakopoulos1,2 and Stefan Thurner1
1

Complex Systems Research Group, HNO, Medical University of Vienna, Währinger Gürtel 18-20, A-1090 Vienna, Austria
thurner@univie.ac.at
2
Institute of Theoretical Physics, Johannes Kepler University, Altenbergerstrasse 69, A-4040 Linz, Austria

Abstract. We suggest a network representation of dynamical maps (such as the logistic map) on the unit interval. The unit interval is partitioned into N subintervals, which are associated with nodes of the network. A link from node i to j is defined as a possible transition of the dynamical map from one interval, i, to another, j. In this way directed networks more generally allow phasespace representations (i.e. representations of transitions between different phasespace cells) of dynamical maps defined on finite intervals. We compute the diameter of these networks as well as the average path length between any two nodes. We numerically demonstrate that these network characteristics seem to diverge at the (first) zeros of the Lyapunov exponent and could thus provide an alternative measure to detect the edge of chaos in dynamical systems.
Keywords: network; dynamical maps; diameter; node.

1 Introduction
Complex systems are often defined as statistical systems with long-range interactions between their elements, which are characterized by a remarkable stability and, at the same time, by the ability to adapt. These features have led to the view of complex systems as systems which are prepared at the edge of chaos. The edge of chaos is roughly defined as the set of zeros of the maximum Lyapunov exponent of a dynamical system. It has been realized in the past that at exactly these points interesting dynamics occurs, see e.g. [1]. In practical circumstances one is often confronted with the situation that exact knowledge of a dynamical system is not available and that the only information given is in the form of time series. A long-standing technical question is how to reliably compute Lyapunov exponents from a given finite time series.
The recent boom in network theory has led to a number of ways to characterize networks, such as their degree distribution, clustering coefficients, neighboring connectivity, diameter, average path length, and betweenness, just to name a few [2,3].



The aim of this contribution is to investigate the possibility of taking recent developments in network theory and successfully applying them to the realm of dynamical systems. In particular, the idea is to use networks as a representation of the phasespace of a dynamical system and to see whether, in this representation, it is possible to obtain alternative, maybe practical, ways to gain insight into dynamical systems. Recently there have been several approaches in this direction [4,5,6,7,8], each of them following a slightly distinct philosophy. Froyland et al. [4] propose a method for rigorously computing a metric entropy of dynamical systems based on finite partitions of the phasespace. More recently, Shreim et al. [5] construct directed networks from the statespace of cellular automata. An approach very similar to the one presented here was proposed in [6] and was worked out in some detail in [7]. There they construct an undirected network from the trajectories of the logistic map and study some of the respective network properties. The difference to our aim here is that their results depend on the number of trajectories used and the number of iterations of the map (the length of each trajectory).
The paper is organized as follows: In Section 2 we describe how the directed network can be uniquely constructed from a (one-dimensional) map and how the quantities of interest are defined and computed. In Section 3 we apply this approach to the logistic and the tent map and discuss the results.

2 Method
As a starting point we consider discrete maps on the unit interval [0, 1] of the form x_{t+1} = f(x_t). In a first step we partition the unit interval into N equal disjoint subintervals, I_i ≡ [(i−1)ε, iε], with i = 1, ..., N, and the interval size given by ε ≡ 1/N. To arrive at a network representation of a dynamical map we assign a node to each subinterval. Imagine that at a given time t the value of the map falls into subinterval i, i.e. x_t ∈ I_i. At the next timestep t + 1 the value of the map changes to x_{t+1}, which we imagine to fall into subinterval j, i.e. x_{t+1} = f(x_t) ∈ I_j. We define a (directed) link in the network from node i to node j whenever the map changes from a value in interval i at some time to a value in interval j at the following timestep. Obviously the adjacency matrix A is of dimensions N × N; formally, its elements are defined as a_ij = 1 if there exists x_t ∈ I_i with x_{t+1} = f(x_t) ∈ I_j, and a_ij = 0 otherwise. More algorithmically, one can think of constructing the network in the following way: We start at the first interval, I_1. We ask to which intervals I_j all the elements in I_1 point in the above sense. For all intervals I_j which can be reached from I_1 within one timestep, we create an entry a_{1,j} = 1. We continue with the second interval, etc. In this way the network uniquely maps the dynamics of the map, with a resolution ε. Note that in this construction the number of links does not depend on the maximum number of iterations of the map, T ≡ max(t). This is not the case in [7], where transitions are recorded from actual realizations of the map; obviously the network then depends on T and on the initial condition x_0. A further difference to [7] is that they symmetrize A, which we do not.
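A minimal Haskell sketch of this construction may help (an illustration under the assumption that each subinterval is probed on a fine sampling grid rather than resolved exactly; all names are ours, not from the paper):

import qualified Data.Set as Set

logistic :: Double -> Double -> Double
logistic r x = r * x * (1 - x)

-- Directed edges (i, j): subinterval i reaches subinterval j in one step of f.
edges :: Int -> (Double -> Double) -> [(Int, Int)]
edges n f = Set.toList . Set.fromList $
  [ (i, cell (f x))
  | i <- [1 .. n]
  , s <- [0 .. samples - 1]
  , let x = (fromIntegral (i - 1) + (fromIntegral s + 0.5) / fromIntegral samples)
            / fromIntegral n
  ]
  where
    samples = 100 :: Int                             -- sampling resolution inside each cell
    cell y  = min n (floor (y * fromIntegral n) + 1)

-- Example: edges 1000 (logistic 4) builds the r = 4, N = 1000 network (up to sampling resolution).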


Fig. 1. In-degree distribution (a), out-degree distribution (b) and shortest-distance distribution (c) for the logistic map with r = 4, for N = 1000 subintervals (nodes)

Given the adjacency matrix one can compute the in- and out-degrees k_i^in, k_i^out for each node. The corresponding degree distributions are denoted by P^in(k) and P^out(k). We further compute the distance matrix D, whose element d_ij represents the shortest path from node i to j (in numbers of iterations of the map). If i and j are not connected, d_ij = ∞, and the element is excluded from all further analysis. From D the distance distribution function is given by
\[ P(d) \equiv \frac{1}{N^2} \sum_{i,j} \delta(d_{ij} - d). \]
The average distance (the mean of this distribution) is denoted by d_avg ≡ ⟨d_ij⟩_{(i,j)}; its maximum, the diameter of the network, is d_max ≡ max(d_ij).

3 Results
3.1 Logistic Map
We apply the above procedure to the logistic map x_{n+1} = f(x_n) = r x_n (1 − x_n), and illustrate it with a small partition, N = 4, of the interval [0, 1]. For the particular choice r = 4 this partition produces the following adjacency and distance matrices,


Fig. 2. Adjacency A (a) and distance D (b) matrices of the logistic map for N = 1000 and r = 4. (c) shows D for fixed values of d_ij = 1, 2, 3 only. These look exactly the same as the π/2-rotated 2- and 3-fold compositions of the map, f(f(x)), f(f(f(x))) for r = 4, shown in (d). Plots (e)-(h) show D for r = 0.2, 0.9, 1, 1.2 respectively. White color corresponds to d_ij = ∞.

\[
A = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\qquad
D = \begin{pmatrix} 1 & 1 & 1 & 2 \\ 2 & 2 & 1 & 1 \\ 2 & 2 & 1 & 1 \\ 1 & 1 & 1 & 2 \end{pmatrix}
\qquad (1)
\]
In Fig. 1 we show P^in(k) (a), P^out(k) (b) and the distance distribution function P(d) (c), for r = 4 and N = 1000. It is obvious that the degree distributions are not particularly spectacular and definitely do not follow a Poissonian or power-law distribution. The maximum degrees are k^max,in = 32 and k^max,out = 6 (for k^max,in the probabilities for k > 7 are very small and not discriminable from zero in the plot). The mean degrees are equal, ⟨k^in⟩ = ⟨k^out⟩ ≈ 3.0.


The distance matrix D is computed using the Floyd algorithm [9]. In Fig. 2 (a) and (b) we show the adjacency and distance matrices A, D, respectively. For A, ones are represented by blue pixels, zeros by white. It is intuitively clear that the structure of A has to be identical to the form of the π/2-rotated original map f(x), which is shown in Fig. 2(d). We define the n-fold composition of the map as f^(n) ≡ f(f(··· f(x))). The respective distance matrices are plotted in Fig. 2(c) for n = 1, 2, 3, for fixed values of d_ij = 1, 2, 3. These curves exactly resemble the n-fold compositions shown in (d), modulo rotation.
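For completeness, a hedged sketch of this all-pairs computation (a plain list-based Floyd-Warshall written for clarity rather than speed; the infinity sentinel is an implementation assumption of ours):

import Data.List (foldl')

-- All-pairs shortest path lengths from a 0/1 adjacency matrix; unreachable
-- pairs keep the value 'inf'. The diagonal is not forced to zero, so d_ii is
-- the shortest return time, as in the matrices of Eq. (1).
floydWarshall :: [[Int]] -> [[Int]]
floydWarshall a = foldl' relax d0 [0 .. n - 1]
  where
    n   = length a
    inf = 10 ^ (9 :: Int)
    d0  = [ [ if a !! i !! j == 1 then 1 else inf | j <- [0 .. n - 1] ]
          | i <- [0 .. n - 1] ]
    relax d k =
      [ [ min (d !! i !! j) (d !! i !! k + d !! k !! j) | j <- [0 .. n - 1] ]
      | i <- [0 .. n - 1] ]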
We next check how the quantities k^in and k^out depend on r and N. As seen in Fig. 3, the maximum in-degree exhibits a clear power-law behavior; we obtain k^max,in ~ r^0.49, irrespective of N. Quite differently, the maximum out-degrees resemble step functions. As N becomes larger there is a shift to the left. The average degrees ⟨k^in⟩ and ⟨k^out⟩ depend on r, for the three network sizes N = 100, N = 500, N = 1000, in exactly the same linear way, ⟨k^in⟩ = ⟨k^out⟩ = r/2 + 1, irrespective of network size (not shown).
Fig. 3. Maximum in-degree (a) and out-degree (b) of the logistic map networks as functions of the control parameter r for three network sizes

Finally, we consider the r and N dependence of the average and maximum distances d_avg and d_max. In Fig. 4 we show d_avg as a function of r for several N. The behavior of d_max is very similar (not shown). Both curves show divergences. It is noteworthy that at least the first divergences occur exactly at those values where the Lyapunov exponent vanishes (see lower curve in Fig. 4).
3.2 Tent Map
We carry out the same computations for the tent map x_{n+1} = μ(1 − 2|x_n − 1/2|), with μ ∈ [0, 1]. In Fig. 5 we present the same quantities as for the logistic map. Again the mean degrees grow linearly with μ and are independent of network size. The maximum in-degree, Fig. 5(a), shows a combination of power-law and step-like behavior. The intervals on which k^max,in is constant grow as μ approaches 1. The maximum out-degree, Fig. 5(b), is constant, k^max,out = 2



Fig. 4. Average network distance d_avg of the logistic map as a function of the control parameter r, for three network sizes. The green line shows the Lyapunov exponents of the map.

for μ ∈ (0, 1/2) and jumps to k^max,out = 3 for μ ∈ (1/2, 1). The average and maximum distances are again very similar to each other and behave exactly as in the logistic map: they diverge where the Lyapunov exponent vanishes. Using the standard definition of the Lyapunov exponent for one-dimensional maps,
\[ \lambda = \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \ln(|f'(x_i)|), \]
we obtain λ = ln(|2μ|), which is a known result for the tent map. For μ = 1/2 we have λ = 0. In Fig. 5 (c) and (d) we see that for values of μ near 1/2, d_avg and d_max diverge.

4 Discussion
We suggested a directed network representation of one-dimensional dynamical maps. The structure of the adjacency matrix looks like the original map rotated by π/2. The corresponding networks were analyzed with respect to several known network measures. Maybe the most interesting finding was that both the network diameter and the average path length diverge at the same parameters where the Lyapunov exponents become zero. We have shown this explicitly for the logistic map for the first four vanishing Lyapunov exponents, and for the single vanishing Lyapunov exponent in the tent map. For finite network sizes it becomes difficult to check the validity for higher vanishing Lyapunov exponents. We found, in accordance with [7], that the degree distributions of the networks are not particularly interesting. Finally we note that higher-dimensional maps can be dealt with in exactly the same way.
There are several questions that arise for future work, such as the inverse problem: Given a directed network with some fixed properties, is it possible to construct or estimate a (discrete) map which exhibits these properties? To what level of uniqueness is this possible? One way to solve this problem is to construct


Fig. 5. Maximum in-degrees (a) and out-degrees (b) for the tent map, for network sizes N = 100, N = 500 and N = 1000. Average and maximum distances as functions of μ are shown in (c) and (d), respectively.

maps which are piecewise linear on each subinterval and have a slope equal to the in- or out-degree of the directed graph. Another interesting problem is the systematic application of the presented procedure to higher-dimensional maps and possibly to continuous dynamical systems.
We acknowledge support from the Austrian Science Fund FWF under P17621 and P19132.

References
1. A. Robledo: Critical attractors and q-statistics. Europhysics News 6 (2005) 214-218
2. S. Dorogovtsev, J.F.F. Mendes: Evolution of Networks. Oxford University Press
(2003)
3. A.-L. Barabási: Statistical mechanics of complex networks. Rev. Mod. Phys. 74 (2002) 47
4. G. Froyland, O. Junge, G. Ochs: Rigorous computation of topological entropy with
respect to a nite partition. Physica D 154 (2001) 68-84
5. A. Shreim, P. Grassberger, W. Nadler, B. Samuelsson, J.E.S. Socolar, M. Paczuski: Network analysis of the state space of discrete dynamical systems, cond-mat/0610447 (2006)

632

F. Kyriakopoulos and S. Thurner

6. S. Thurner: Nonextensive statistical mechanics and complex scale-free networks.


Europhysics News 6 (2005) 218-220
7. E.P. Borges, D.O. Cajueiro, R.F.S. Andrade: Mapping dynamical systems onto complex networks. cond-mat/0610820 (2006)
8. S.M. Park, B. J. Kim: Dynamic behaviors in directed networks. Phys. Rev. E 74
(2006) 026114
9. R.W. Floyd: Algorithm 97: Shortest Path. Communications of the ACM 5 (1962)
345

Dynamical Patterns in Scalefree Trees of


Coupled 2D Chaotic Maps
Zoran Levnajić and Bosiljka Tadić
Department for Theoretical Physics, Jožef Stefan Institute,
Jamova 39, SI-1000 Ljubljana, Slovenia
zoran.levnajic@ijs.si, bosiljka.tadic@ijs.si

Abstract. New insights have been gained recently into the interplay between complex network architectures and the collective dynamics of simple elements coupled through them. The usual paradigm for studying systems of this kind is a network of one-dimensional interacting logistic maps, which provides a plausible model for a variety of complex systems. In this work we consider a network of two-dimensional standard maps coupled through a scalefree tree topology. The coupling is solely in the angle coordinate and involves a time delay: this approach is motivated by the node-to-node information-flow view of the collective dynamics. We observe a rich variety of dynamical behavior including self-organized patterns: nodes synchronize in clusters, each having specific motion quantities whose values belong to a discrete set of possible values. We conclude by studying the relationship between the dynamically induced cluster-organization of the nodes and their network structurization.

Introduction
Coupled map systems (CMS) like chains, lattices and networks represent the simplest and most useful paradigm of a complex system: they are conceptually easy to understand and numerically straightforward to implement [1,2]. Following the pioneering work by Kaneko [3], the collective dynamical properties of CMS were studied extensively [4,5,6], yielding many insights into the key mechanisms behind the self-organization of complex systems. Moreover, due to the rich variety of global qualitative behaviors that CMS possess, they have become one of the central tools in complex-phenomena modeling. For instance, many systems of biological interest can be plausibly modeled by CMS [7,8,9]; even the inherent topological details of a given network can be investigated by studying the dynamics of a CMS interacting through it [10,11].
Despite the simplicity of the local dynamics exhibited by a single map (considered to be a node of the network), their collective behavior usually has completely different properties: in particular, for carefully chosen parameter values, networks of coupled chaotic maps not only inhibit local chaos on each node, but manage to fully synchronize in many different ways. The extreme robustness of the collective functioning of many real networks can also be seen as an intrinsic property of the network-coupled dynamics. Various recent works have tried to



model different network processes observed in nature using CMS or similarly simple systems [8,12]: robustness with respect to initial values, the speed of self-organization, or network adjustment to a desired final behavior. Mostly, these works consider one-dimensional discrete maps (logistic or binary maps) to mimic the local dynamics on a single node and focus on the statistics of the emergent behavior.
Even though this approach has yielded many results, these models remain limited to one-dimensional dynamics on each node. However, many networks of interest involve nodes whose local dynamics cannot be modeled by one-dimensional elements: genes, for instance, need more than one degree of freedom to be properly modeled; see [13] for an example of a two-gene system. In general, nodes with larger dimensionality give much richer network behavior, as two-dimensional maps already possess a vast dynamical variety. In this contribution we focus on a network (namely a tree) of coupled 2D maps using the scalefree-tree topology. We observe different collective properties resulting from the particularities of both the coupling and the topology. Quite differently from previous 1D studies, our network synchronizes after a quick transient for very small values of the network coupling, achieving a full cluster-synchronization that qualitatively persists even at higher coupling values. After providing a definition of cluster-synchronized states by using the average-trajectory power spectrum, we study the relationship between the clustering patterns and the underlying dynamical/topological details.

The Tree System Set-Up and the Coupled Dynamics
We grow a scalefree-tree network using the standard procedure of preferential attachment [14] with 1 link per node, to obtain a tree with N = 1000 nodes and a power-law degree distribution with exponent 3. Every node is assumed to have an internal dynamics given by the 2D standard map [0, 1] × R → [0, 1] × R:
\[ x' = x + y + \varepsilon \sin(2\pi x) \pmod 1, \qquad y' = y + \varepsilon \sin(2\pi x). \tag{1} \]
The nodes are coupled through the network edges by a one-step-delay difference in the angle (x) coordinate, so that the complete time-step of the node [i] is
\[ x[i]_{n+1} = (1-\mu)\,x'[i]_n + \frac{\mu}{k_i}\sum_{j\in K_i}\bigl(x[j]_n - x'[i]_n\bigr), \qquad y[i]_{n+1} = (1-\mu)\,y'[i]_n. \tag{2} \]
Here, (′) denotes the next iterate of the (uncoupled) standard map as in (1), n denotes the global discrete time and [i] indexes the nodes (k_i being a node's degree); ε and μ are the standard-map and network-coupling parameters, respectively, and K_i stands for the network neighborhood of the node [i]. The update of each node is therefore the sum of a contribution given by the update of the node itself (the ′ part) plus a coupling contribution given by the sum of differences between the node's x-value and the x-values of the neighboring nodes at the previous iteration. The study is done on a fixed tree and focuses on observing the dynamical

Fig. 1. Examples of single trajectories. Left: a plot for the uncoupled map; middle and right: plots of two typical trajectories (colored differently) for μ = 0.01 and 0.02, respectively. All plots show 10,000 iterations of a randomly chosen node after an initial transient of 10^5 iterations, for random initial conditions from (x, y) ∈ [0, 1] × [−1, 1].

phenomena exhibited by this system as a function of the parameter μ (we will keep ε = 0.9 for the whole study). Of course, the coupling term is somewhat arbitrary; we choose this particular functional form (2) for two reasons: (i) the standard map represents a discrete version of an oscillator, so the coupling in the angle is natural; (ii) the time delay models the fact that in realistic networks information exchange needs a finite time. The standard map [15] is the best-known example of a chaotic system that exhibits almost all known discrete-time motion possibilities as ε changes (ε = 0.9 implying strong chaos). An example of a standard-map trajectory with ε = 0.9 is shown in Fig. 1; its most notable property is chaotic diffusion: regardless of the initial conditions (x, y), the orbits always diffuse unboundedly along the y axis for ε above the critical value of about 0.15.
The characteristic trajectories of the coupled model are also shown in Fig. 1. Observe that the diffusion of the uncoupled map has been inhibited: all the trajectories are contained in a bounded region (band) in the y coordinate; once the transients are gone, they are localized. Even though the motion is still chaotic within the band, localization is clearly the first observed collective effect. Note that all trajectories in Fig. 1 are actually trajectories after the first 10^5 iterations have passed. Since transients are less relevant in this study (we are looking for the asymptotic properties), from here on they will be disregarded.
With a further increase of μ we witness the localization of trajectories in smaller and smaller bands and then finally into a discrete and finite number of points (Fig. 1 right). The motion here suddenly becomes periodic and fully regular, characterized also by the band having a width close to 1. Interestingly, once some nodes start achieving regularity of their motion, all the other nodes do so almost simultaneously. There is only a small range in μ within which all the nodes go from full chaoticity to periodic regularity (to be shown in detail later). This is also visible from Fig. 1, which shows qualitatively different trajectories

obtained for only a small increase in μ. Also, note that all the network-coupling values are orders of magnitude smaller than in similar studies of 1D map networks (e.g. [6],[16]). The synergy between 2D maps is clearly stronger than in the 1D case.
A systematic study of the localized trajectory widths leads to the histograms in Fig. 2 (left). The curves have the shape of a log-normal distribution for larger band widths (up to about 100); at the same time, the process of localization into periodic orbits increases the number of nodes with band width of about 1. This regularization of the network occurs quite abruptly, without a smooth decrease of the band widths; for μ ≳ 0.018 all the nodes, for all choices of initial conditions, become localized into periodic orbits. Since this overall property of the trajectories is robust to initial conditions, we can formalize it by establishing a definition of synchronization using the periodicity of the final steady states in the section to come.

The Average Trajectory and the Fourier Transform Definition of Synchronization

A convenient object for considering the global properties of a network dynamics is the average trajectory (a.t.) of the network [6], defined as

(x̄_n, ȳ_n) = (1/N) Σ_{i=1}^{N} (x_n[i], y_n[i]),                        (3)

which measures the average motion of all the N nodes. When all the nodes exhibit periodic motion the a.t. is also periodic, whereas if some of the nodes are still chaotic the a.t. will be fairly localized but not periodic (Fig. 2 right). This can be quantified through Fourier analysis.
Fig. 2. Left: Band width distribution for the whole network; μ = 0.0001 (red), 0.001 (green), 0.01 (blue). Each histogram is obtained by averaging over 50 sets of network initial conditions. Right: an example of a chaotic (red) and a regular a.t. (green).


Fig. 3. Left: The power spectrum of the time-signal of a chaotic a.t. (red) and a regular a.t. (green). Right: supp(|F_k|²) as a function of μ for ε = 0.9, with s₀ = 10⁻⁴.

the a.t. given as fn = f (


xn , yn ) where f (x, y) is some L2 function dened over
the phase space. The Discrete Fourier Transform of the signal fn is given by
Fk =

M
1 
fn e2ink/M ,
M n=1

(4)

and its power spectrum |Fk |2 characterises the distribution of periodic components within the corresponding motion. A regular periodic orbits spectrum will
therefore be limited to only a few non-zero frequencies whereas a chaotic orbit
will have an (almost) continuous spectrum (Fig.3 left). Since this qualitative
dierence can be quantied by considering supp(|Fk |2 ) (the portion of the domain with non-zero frequencies), we can now dene a collective dynamics as
synchronized if
supp(|Fk |2 ) < s0 ,
(5)
with some very small s0 . This denition is not entirely precise but it is very
useful for the numerical study in this work as it gives a good description of the
nal steady states of our system. Also, supp(|Fk |2 ) can be used as the statistical
order parameter for this system as it sharply decreases to zero when the synchronization starts occurring (Fig.3 right) and stays  0 afterwards. From here
on we will focus on these nal steady states that fulll the condition (5).
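The order parameter used in (5) can be estimated numerically along the following lines; the choice f(x, y) = x for the observable and the numerical cutoff deciding which Fourier modes count as "non-zero" are our own illustrative assumptions, not prescriptions from the text.

import numpy as np

def average_trajectory(x_hist, y_hist):
    # Eq. (3): x_hist and y_hist have shape (M, N) -- M time steps, N nodes.
    return x_hist.mean(axis=1), y_hist.mean(axis=1)

def support_fraction(signal, rel_cutoff=1e-8):
    # Estimate supp(|F_k|^2): fraction of Fourier modes of the time-signal f_n
    # carrying power above a small relative cutoff (an illustrative stand-in
    # for "non-zero frequencies").
    f = np.asarray(signal, dtype=float)
    f = f - f.mean()                                   # drop the trivial k = 0 mode
    power = np.abs(np.fft.rfft(f) / len(f)) ** 2       # |F_k|^2, cf. Eq. (4)
    return np.count_nonzero(power > rel_cutoff * power.max()) / power.size

# the dynamics is called synchronized, Eq. (5), when
# support_fraction(x_bar) < s0 with a very small s0, e.g. s0 = 1e-4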

The Cluster Synchronization

Above a particular threshold coupling value μ_c the tree dynamics reaches a synchronized steady state in which every node has a regular periodic trajectory with a band width close to 1. A detailed analysis leads to the surprising result that the nodes actually synchronize in clusters, each cluster having a common value of the band center. Moreover, the band center values seem to appear only in a discrete set of possible values, as shown in Fig. 4 left, where the motion of the whole network is presented.




Fig. 4. Left: First two iterations of all the nodes (after the transient) for μ = 0.02; every node belongs to only one trajectory-group with a particular color. Right: band width against band center values for all the nodes, with the same μ.

The tree is cluster-synchronized: each cluster is defined by its band center location, which varies by less than 1% for the nodes within a cluster. Band widths (which are now simply twice the amplitude of oscillation) are also discretized, with slightly larger variance, and are close to 1, as expected. This is shown in Fig. 4 right, where we plot band widths against band center locations; note the left-right symmetry of the plot, due to the same symmetry of the initial-value interval. This is one of the regularities of the clusterization, suggesting that it is not a random process. Even though the groups of nodes within a given cluster vary with the initial conditions, the final qualitative cluster structure is extremely robust to the initial conditions.
We next show that the synchronized nodes form patterns with more dynamical regularity. In Fig. 5 left we show histograms of distances between nodes belonging to the same cluster, measured along the supporting tree. For comparison we also show the lengths of the topological shortest paths on the same tree. It appears that the cluster-synchronization affects nodes at distances of 2, 3 and 4, which have statistical weights different from the topological distances on that tree. In this way an intricate structure of interconnected domains appears throughout the network (Fig. 5 right). This emergent property also seems to be robust to variations of the initial conditions.
Another invariant property, although quite common for coupled oscillators, is that every two linked nodes strongly tend to have different oscillation phases when the tree is synchronized. Regardless of the initial conditions, less than 1% of the neighboring nodes synchronize in the same phase. Although clustering effects had already been observed in 1D CMS [17], the clustering observed here is qualitatively different, as it involves 2D systems whose synchronization is more subtle. As for the mechanism behind the clusterization, note that the neighboring nodes tend to oppose their phases due to the time-delay coupling. The difference between a node's x-value and the neighboring nodes' past x-values is constant, which gives the steady oscillation after transients.


Fig. 5. Left: Distribution of the topological distances for a tree with 1000 nodes (bullets) and the distances inside the clusters (other symbols) in synchronized states for μ = 0.017 and different initial conditions. Right: visualization of the tree with five interconnected clusters of synchronized nodes marked by different colors.

The clustering begins as an accumulation of node x-values in a way that keeps the x-difference between neighbors close to 0.5, as this produces steadiness (Fig. 4 left). This results in an analogous accumulation of y-values, as they are changed solely through the x coordinate, which finally gives the cluster-synchronized states that are the invariant orbits of these dynamics.

Conclusions and Outlook

We have shown some preliminary results on the collective dynamics of 2D maps on a scale-free tree network. Our findings include self-organizational properties of the global dynamics achieved after a quick transient. Nodes synchronize into clusters with specific dynamical properties even for very small coupling values. We further demonstrate that the synchronization with 2D maps leads to robust dynamical patterns, whose properties can be related to the underlying tree topology. One direction for further study will be to investigate how these patterned structures can be used in network dynamics related to different applications. More open theoretical questions also arise, in particular regarding the full extent of the time-delay coupling [18,19], which might be essential in models of some realistic networks.
Acknowledgments. This work was supported by the Program P1-0044 of the Ministry of Higher Education, Science and Technology of the Republic of Slovenia.


References
1. K. Kaneko. Theory and applications of coupled map lattices. John Wiley & Sons
New York, 1993.
2. S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D. U. Hwang. Complex
networks: Structure and dynamics. Physics Reports, 424:175, 2006.
3. K. Kaneko. Period-doubling of kink-antikink patterns, quasiperiodicity in antiferro-like structures and spatial intermittency in coupled logistic lattice. Progress
of Theoretical Physics, 72(3):480, 1984.
4. G. Abramson and D. H. Zanette. Globally coupled maps with asynchronous updating. Physical Review E, 58(4):4454, 1998.
5. S. Jalan and R. E. Amritkar. Self-organized and driven phase synchronization in
coupled maps. Physical Review Letters, 90(014101), 2003.
6. P. G. Lind, J. A. C. Gallas, and H. J. Herrmann. Coherence in scale-free networks
of chaotic maps. Physical Review E, 70(056207), 2004.
7. H. Nozawa. A neural network model as a globally coupled map and applications
based on chaos. Chaos, 2(3):377, 1992.
8. F. Li, T. Long, Y. Lu, Q. Ouyang, and C. Tang. The yeast cell-cycle network is
robustly designed. Proceedings of the National Academy of Sciences, 101(14):4781,
2004.
9. B. Tadic, G. J. Rodgers, and S. Thurner. Transport on complex networks:
Flow, jamming & optimization. International Journal of Bifurcation and Chaos,
17(7):n/a, 2007.
10. E. Oh, K. Rho, H. Hong, and B. Kahng. Modular synchronization in complex
networks. Physical Review E, 72(047101), 2005.
11. A. Arenas, A. Díaz-Guilera, and C. J. Pérez-Vicente. Synchronization reveals
topological scales in complex networks. Physical Review Letters, 96(114102), 2006.
12. N. Kashtan and U. Alon. Spontaneous evolution of modularity and network motifs.
Proceedings of the National Academy of Sciences, 102(39):13773, 2005.
13. S. Widder, J. Schicho, and P. Schuster. Dynamic patterns of gene regulation i:
Simple two gene systems. To appear.
14. S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks: From Biological
Nets to the Internet and WWW. Oxford University Press, 2003.
15. B. V. Chirikov. A universal instability of many-dimensional oscillator systems.
Physics Reports, 52(5):263, 1979.
16. F. M. Atay and J. Jost. Delays, connection topology and synchronization of coupled
chaotic maps. Physical Review Letters, 92(14), 2004.
17. S. Jalan, R. E. Amritkar, and C. Hu. Synchronized clusters in coupled map networks. i. numerical studies. Physical Review E, 72(016211), 2005.
18. C. Masoller and A. C. Martí. Random delays and the synchronization of chaotic
maps. Physical Review Letters, 94(134102), 2005.
19. C. P. Li, W. G. Sun, and J. Kurths. Synchronization of complex dynamical networks
with time delays. Physica A, 361(1):24, 2006.

Simulation of the Electron Tunneling Paths in Networks of Nano-particle Films

Milovan Suvakov and Bosiljka Tadic
Department for Theoretical Physics, Jozef Stefan Institute, Box 3000, 1001 Ljubljana,
Slovenia
Milovan.Suvakov@ijs.si, Bosiljka.Tadic@ijs.si
http://www-f1.ijs.si/tadic/, http://www-f1.ijs.si/suvakov/

Abstract. Thin films of nano-particles deposited on substrates are important for new technological applications. Their physical properties depend crucially on particle size and structural inhomogeneity of the deposition. To systematically investigate these dependencies, we apply graph theory and simulations of voltage-driven tunneling conduction. We implement a network model of nano-particle films, based on data of particle positions on a two-dimensional plane. Assuming that connected particles on the network are within the tunneling distance, we then implement continuous-time electron tunneling on the network and show how the current–voltage characteristics depend on the graph structure. We visualize the conduction paths through the network which correspond to the measured currents, both in random and in spatially inhomogeneous films.
Keywords: nano-particles; network; electron tunneling paths.

1 Introduction

In the search for new materials for advanced technology applications [1], metallic nano-particles deposited on a two-dimensional substrate, either as regular arrays [2,3,4] or as thin films [5], are being extensively investigated. Often thin films are grown by spin casting or by methods that involve non-linear processes in which nano-particles move in a liquid, both leading to characteristically inhomogeneous structures that can be varied via the control parameters of the process [6,7,8]. It has been recognized that the physical properties of these nano-particle arrays and thin films depend on the mutual positions of the nano-particles and on global characteristics of the structure [9]. In particular, enhanced non-linear conduction properties have been observed in certain structures of nano-particles [10,11,12,13,14,15].
In order to systematically investigate the topology effects on the conduction in two-dimensional nano-particle structures, here we propose a numerical approach based on graph theory and simulations of the electron tunneling processes. Our approach consists of two steps:



– Mapping of an arbitrary nano-particle array to a graph;
– Implementation of electron tunneling through the graph.
In the first part, as described below, the positions of nano-particles on a two-dimensional substrate are taken as exactly observed. The mapping of the structure to the graph enables a quantitative study of the structure in many details. In the second part, the dynamics of voltage-driven electron tunnelings is strictly linked to the graph structure, and thus its dependence on the structural elements of the graph can be easily traced. We demonstrate by simulations and by graphical means (visualizing the conduction paths) how the current flows through two such graphs, which correspond to two types of films made by random and by correlated deposition of particles.

2 Mapping of Nano-particle Films to Networks

2.1 Particle Deposition

For implementation of our numerical methods we assume that the positions


of nano-particles on the substrate are known, e.g., from STM measurements
or by statistical estimates based on the parameters of the deposition process.
To illustrate our approach, we generate two types of films, shown in Fig. 1, by sequential deposition of particles of a unique size (determined by the grid spacing). In Fig. 1(a) the deposition site for a new particle is randomly selected away from already occupied sites, resulting in a random homogeneous structure.


Fig. 1. Examples of the nano-particle films with (a) random and (b) correlated deposition, at a density of 10% of grid points covered by particles. Network edges are shown corresponding to a radius r of 2 grid points, as discussed in the text.


In contrast to the random deposition, in Fig. 1(b) we apply self-avoiding random-walk dynamics to select the deposition sites. In this case, the non-linear self-avoiding process leads to an inhomogeneous deposition, with inhomogeneity at different scales [16]. In both cases we put N = 1000 particles on a rectangular plane with 100 × 100 grid points (see Fig. 1).
2.2 Emergent Networks

Next we make a nano-particle film network by connecting pairs of nano-particles which are spaced within a small radius r from each other. For the purposes of the tunneling processes that we would like to study on these films, the radius r is selected to coincide with the tunneling distance, which is known for each nano-particle type. In our numerical experiment, we may vary r from one to a few grid points. The emergent networks have a variable, but limited, node connectivity, which reflects the local inhomogeneity of the deposition. For instance, the networks shown in Fig. 1 have the average local connectivity ⟨k⟩ = 9, but with a small dispersion σ_k1 = 2.62 in the random deposition network, whereas σ_k2 = 4.05 in the correlated deposition network. Further topological properties of such networks can be studied using the appropriate algorithms and graph theory [17,18].
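The mapping step can be written down directly from this rule. The sketch below assumes the particle positions are available as an (N, 2) array of grid coordinates and uses a plain O(N²) pair check, which is sufficient for the N = 1000 particles used here; names and defaults are illustrative.

import numpy as np

def film_to_network(positions, r=2.0):
    # Link every pair of particles whose Euclidean distance is at most the
    # tunneling radius r (in grid units); returns the adjacency matrix A.
    pos = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    return ((dist <= r) & (dist > 0.0)).astype(int)   # no self-loops

# degrees of the emergent network and their dispersion:
# A = film_to_network(positions, r=2.0)
# k = A.sum(axis=1); print(k.mean(), k.std())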
In the next section we implement the conduction via microscopic tunneling of
charges through these networks. For this purpose it is important to check that
for a given average particle density (surface coverage) and fixed radius r, the
system is above the geometrical percolation point.

3 Conduction

For the simulations of the tunneling currents driven by voltage differences between two electrodes, the network structure (Fig. 1) is fixed by its adjacency matrix A. We then place the electrodes at the left-most and right-most nodes of the network. By increasing the voltage difference, the conditions for electron tunneling into the system of nano-particles from the side of the higher-voltage electrode are created. Increasing the voltage in small steps, the front of tunneling charges moves towards the low-voltage electrode and eventually touches it when the voltage exceeds a threshold value V_T, which depends on the distance between the electrodes and on other details of the process, described below.
3.1 Tunneling Implementation

The energy of the charged particles on a network of size N is given by the Hamiltonian [10,12,13,19]:

E = ½ Q M⁻¹ Q + Q V^ext + Φ_μ Q_μ = E^(1) + E^(2) + Φ_μ Q_μ,            (1)

with the vector of charges at the nodes (nano-particles) Q ≡ {Q_i}, i = 1, 2, …, N, the matrix of capacitances M, and the potentials of the electrodes Φ_μ,

μ ∈ {+, −, gate}. The external potential is V^ext = M⁻¹ C_μ Φ_μ, where C_μ is the vector of capacitances between the dots and electrode μ. The microscopic structure of the underlying nano-particle array appears through the off-diagonal elements of the matrix M, i.e., via the adjacency matrix: M_ij = A_ij for i ≠ j.
The inter-particle charge transport is a stochastic process in which the tunneling of an electron between nodes i → j at time t is governed by the probability distribution p_ij(t) = Γ_ij exp(−Γ_ij t), with the tunneling rate [11,10]

Γ_ij = (ΔE_ij / e²R) / (1 − exp(−ΔE_ij / k_B T)),                        (2)

which is determined by the energy change ΔE_ij associated with the tunneling process. Here e is the electron charge, R is the quantum resistance and T is the temperature.
Next we calculate the energy changes associated with tunneling along the links of the network. After a single-electron tunneling process from dot a to dot b (Q_i → Q_i′ = Q_i + δ_bi − δ_ai), the change in the interaction-energy term of Eq. (1), ΔE^(1)(a→b) = E^(1)′ − E^(1), can be written as:

ΔE^(1)(a→b) = ½ Σ_ij (Q_i + δ_bi − δ_ai) M⁻¹_ij (Q_j + δ_bj − δ_aj) − ½ Σ_ij Q_i M⁻¹_ij Q_j.   (3)

Using M⁻¹_ij = M⁻¹_ji, (3) becomes:

ΔE^(1)(a→b) = Σ_i Q_i (M⁻¹_ib − M⁻¹_ia) + ½ (M⁻¹_aa + M⁻¹_bb − M⁻¹_ab − M⁻¹_ba).   (4)

Similarly, the change in the second energy term of Eq. (1), which is associated with the interaction with the external potential, ΔE^(2)(a→b) = E^(2)′ − E^(2), becomes

ΔE^(2)(a→b) = Σ_i (Q_i + δ_bi − δ_ai) V_i^ext − Σ_i Q_i V_i^ext = V_b^ext − V_a^ext.   (5)

In addition, the contribution from the tunneling processes between the electrodes and the dots is computed. A tunneling process between an electrode μ and a dot a (Q_i → Q_i′ = Q_i − δ_ai) contributes energy changes in both terms, which are:

ΔE^(1)(a→μ) = ½ Σ_ij (Q_i − δ_ai) M⁻¹_ij (Q_j − δ_aj) − ½ Σ_ij Q_i M⁻¹_ij Q_j = −Σ_i Q_i M⁻¹_ia + ½ M⁻¹_aa,   (6)

and

ΔE^(2)(a→μ) = Σ_i (Q_i − δ_ai) V_i^ext − Σ_i Q_i V_i^ext = −V_a^ext.   (7)


The expressions (4) and (6) can be written in a concise form using the appropriate variables V_c ≡ Σ_i Q_i M⁻¹_ic. We obtain:

ΔE^(1)(a→b) = V_b − V_a + ½ (M⁻¹_aa + M⁻¹_bb − M⁻¹_ab − M⁻¹_ba),          (8)

ΔE^(1)(a→μ) = −V_a + ½ M⁻¹_aa.                                           (9)

In the case C ≪ C_g the off-diagonal elements of the inverse capacitance matrix fall off exponentially, and in the calculation of V_c we can use only nearest-neighbor terms, which speeds up the calculations.
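In code, the concise forms (8) and (9) reduce to a few lines once M⁻¹ and the charge vector are known. The sketch below follows the reconstructed equations above, so the sign conventions should be checked against the original paper before reuse.

import numpy as np

def delta_E1_dot_dot(Minv, Q, a, b):
    # Eq. (8): interaction-energy change for an electron tunneling from dot a to dot b.
    V = Minv.T @ Q                       # V_c = sum_i Q_i * Minv[i, c]
    return V[b] - V[a] + 0.5 * (Minv[a, a] + Minv[b, b] - Minv[a, b] - Minv[b, a])

def delta_E1_dot_electrode(Minv, Q, a):
    # Eq. (9): interaction-energy change for tunneling between dot a and an electrode.
    V = Minv.T @ Q
    return -V[a] + 0.5 * Minv[a, a]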
In the simulations, following an increase of the voltage on one electrode, it takes some transient period until the distribution of charges becomes stationary. We then sample the charges at all nodes and the currents through all links of the network. The relevant quantities are measured in the following units: charge Q [e], voltage V [e/C_g], time t [RC_g] and current I [e/RC_g]. In this paper we focus on certain properties of the current in the limit C ≪ C_g, and we set T = 0, where the tunneling effects are most pronounced, and Φ_gate = Φ_− = 0.
The numerical implementation in C++ is done according to the following steps:
Input: graph, parameters
Calculate capacitance and inverse capacitance matrix
Initialize vectors Q, Vc with zeros
Initialize time t = 0
While (not enough sampled data)
    Calculate vector Vc
    Calculate energy change for all junctions
    For each link: t(i,j) = next random from distribution pij(t)
    Process the tunneling with the smallest time t(i,j)
    Increment time t = t + t(i,j)
    If (relaxation is done)
        Sample data of interest
    End If
End While
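A compact Python transcription of these steps is sketched below for dot-dot junctions only (electrode junctions and the sampling of observables are omitted). It uses the first-reaction scheme of the pseudo-code (drawing an exponential waiting time for every allowed tunneling and executing the earliest one), with a placeholder capacitance matrix and the T = 0 limit of the rate; the sign convention relating the energy change of Eqs. (5) and (8) to the rate of Eq. (2) is an assumption and should be verified against the original paper.

import numpy as np

rng = np.random.default_rng(1)

def energy_change(Minv, Q, Vext, i, j):
    # Total change for moving one electron i -> j: Eq. (8) plus Eq. (5).
    V = Minv.T @ Q
    return (V[j] - V[i] + 0.5 * (Minv[i, i] + Minv[j, j] - Minv[i, j] - Minv[j, i])
            + Vext[j] - Vext[i])

def kmc_step(A, Minv, Q, Vext, t):
    # First-reaction step: draw a waiting time for every allowed tunneling and
    # process the one with the smallest time, as in the pseudo-code above.
    best = None
    for a, b in zip(*np.nonzero(np.triu(A))):
        for i, j in ((a, b), (b, a)):                  # both directions of the link
            gain = -energy_change(Minv, Q, Vext, i, j) # assumed sign convention
            if gain > 0.0:                             # T = 0 limit of Eq. (2), e = R = 1
                dt = rng.exponential(1.0 / gain)       # p(t) = gain * exp(-gain * t)
                if best is None or dt < best[0]:
                    best = (dt, i, j)
    if best is None:
        return t, False                                # no tunneling possible
    dt, i, j = best
    Q[i] -= 1.0                                        # charge update as in Eq. (3)
    Q[j] += 1.0
    return t + dt, True

# toy 3-dot chain with a placeholder (identity) inverse capacitance matrix
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
Minv = np.eye(3)
Q = np.array([1.0, 0.0, -1.0])
Vext = np.array([0.5, 0.0, -0.5])
t, moved = kmc_step(A, Minv, Q, Vext, t=0.0)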
3.2 Conduction Paths and Nonlinear Currents

As mentioned above, driven by the external potential, the charges enter the system from the higher-voltage electrode. When the voltage is small, the charges can screen the external potential, in which case the system may become static and there is no current. For voltages larger than a threshold value V_T the screening does not occur and, after some transient time, a stationary distribution of charges over the dots sets in, with a constant current through the system. In general, experimental measurements of the current-voltage characteristics in nano-particle arrays reveal a non-linear dependence [13] according to the law

I ∝ (V/V_T − 1)^ζ,                                                      (10)



[Figure 2 (see caption below): panel (a) plots I [rel. units] against (V − V_T)/V_T for the random and correlated networks, with power-law fits of slopes 2.3 and 2.4; panel (b) plots the distribution of the normalized link currents I/I_max.]

Fig. 2. Conduction on the networks shown in Fig. 1: (a) Non-linear I-V curves, with solid lines representing the power-law fits according to Eq. (10); (b) Distribution of normalized currents over links. The solid lines represent fits with Eq. (11).

in a range of voltages V > V_T. In Fig. 2(a) we show the results of the simulated current-voltage curves for our two networks. They exhibit a typical non-linear behavior with the exponent ζ > 2. The difference between these two curves appears in the saturation region, visible in the case of the inhomogeneous network for voltages around 30V_T. This saturation can be explained by the existence of a large number of bottlenecks in the inhomogeneous network structure.
For the fixed value of voltage V = 10V_T we demonstrate how the electrons flow through the network by monitoring the currents through each link. All data are collected after a transient (relaxation) period. The distributions of currents through the links are shown in Fig. 2(b) for the two network topologies. In the case of the random network there is a well-defined power-law tail with a large slope of about 4.5 (solid line). For the correlated network we found that the flow distribution can be well fitted with a q-exponential form [20]

P(x) = B₀ [1 − (1 − q) x/x₀]^{1/(1−q)},   x ≡ I/I_max,                   (11)

often found in non-ergodic dynamical systems. Here we find B₀ = 0.22, q = 1.21 and x₀ = 0.03. Applying, for comparison, the same expression to fit the flow data collected on the random structure (dotted line in Fig. 2(b)), we find quite different values: B₀ = 0.1, q = 1.01 and x₀ = 0.12. These values are compatible with an even distribution of flow over the links in the random deposition structure and the absence of links which play a special role in the conduction process.
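For reference, the fit with the q-exponential (11) can be reproduced with standard tools; the binning, the normalization by I_max and the starting values in the sketch below are illustrative choices rather than the procedure actually used for Fig. 2(b).

import numpy as np
from scipy.optimize import curve_fit

def q_exponential(x, B0, q, x0):
    # Eq. (11): P(x) = B0 * [1 - (1 - q) * x / x0]^(1 / (1 - q)), with x = I / Imax.
    return B0 * (1.0 - (1.0 - q) * x / x0) ** (1.0 / (1.0 - q))

def fit_link_currents(link_currents, bins=30):
    # Histogram the normalized link currents and fit Eq. (11) to the distribution.
    x = np.asarray(link_currents, dtype=float)
    x = x / x.max()
    hist, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[1:] + edges[:-1])
    params, _ = curve_fit(q_exponential, centers, hist, p0=(0.2, 1.2, 0.05), maxfev=10000)
    return params                                      # (B0, q, x0)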
The observed quantitative differences in the current flow over these two networks are graphically illustrated by a visualization of the conduction paths with an appropriate Java applet. The images are shown in Fig. 3. Different levels of gray are proportional to the current flow along the network links.


Fig. 3. Conduction paths on the networks shown in Fig. 1: (a) random, (b) correlated structure. The intensity in the gray scale is proportional to the flow over the links.

4 Conclusions

We have implemented a numerical algorithm for the analysis of electron tunneling conduction through nano-particle arrays on a two-dimensional substrate. The inputs to the algorithm are the positions of the nano-particles in the x-y plane, which can be obtained either from appropriate measurements in real systems or from a theoretical model.
We have demonstrated how the algorithm works on two types of networks, corresponding to a random and a correlated deposition of nano-particles. In particular, the flow over the links has been studied quantitatively through statistical distributions and visualized graphically. In both networks we obtain non-linear current–voltage characteristics in agreement with most available experimental data on nano-particle arrays. Furthermore, we have demonstrated that both of these measures appear to depend on the actual structure of the network. Hence, these measures of the current flow can be used as a tool to characterize the film structure–conduction properties.
The graphic visualization algorithm developed in this work (made available for interactive application on the authors' Web site) can be used for the visualization of flow in the general case of planar graphs.

Acknowledgments
We acknowledge financial support by the Marie Curie Research and Training Network MRTN-CT-2004-005728 project and by the programme P1-0044 of the Ministry of Higher Education, Science and Technology (Slovenia).


References
1. T. Bigioni, X. Lin, T. Nguyen, E. Corwin, T. Witten, and H. Jaeger. Kinetically driven self assembly of highly ordered nanoparticle monolayers. Nature Materials, 5:265, 2006.
2. C. I. Duruöz, R. M. Clarke, C. M. Marcus, and J. S. Harris, Jr. Conduction threshold, switching, and hysteresis in quantum dot arrays. Phys. Rev. Lett., 74(16):3237-3240, 1995.
3. Ç. Kurdak, A. J. Rimberg, T. R. Ho, and J. Clarke. Activated transport and scaling behavior in the current-voltage characteristics and Coulomb-blockade oscillations of two-dimensional arrays of metallic islands. Phys. Rev. B, 57(12):R6842-R6845, Mar 1998.
4. A. J. Rimberg, T. R. Ho, and J. Clarke. Scaling behavior in the current-voltage
characteristic of one- and two-dimensional arrays of small metallic islands. Phys.
Rev. Lett., 74(23):47144717, Jun 1995.
5. P. Moriarty. Nanostructured materials. Reports of Progress in Physics, 64:297381,
2001.
6. P. Moriarty, M. D. R. Taylor, and M. Brust. Nanostructured cellular networks.
Phys. Rev. Lett., 89(24):248303, 2002.
7. Z. Konstantinovic, M. G. del Muro, M. Varela, X. Batlle, and A. Labarta. Particle
growing mechanisms in ag-zro2 and au-zro2 granular lms obtained by pulsed laser
deposition. Nanotechnology, 17:4106, 2006.
8. M. Brust, D. Bethell, C. J. Kiely, and D. J. Schiffrin. Self-assembled gold nanoparticle thin films with nonmetallic optical and electronic properties. Langmuir, 14(19):5425-5429, 1998.
9. C. P. Martin, M. O. Blunt, and P. Moriarty. Nanoparticle networks on silicon:
Self-organized or disorganized? Nano Lett., 4(12):2389 2392, 2004.
10. B. K. Ferry and S. M. Goodnick. Transport in Nanostructures. Cambridge University Press, 1997.
11. U. Geigenmüller and G. Schön. Single-electron effects in arrays of normal tunnel junctions. Europhysics Letters, 10:765+, December 1989.
12. N. S. Bakhvalov, G. S. Kazacha, K. K. Likharev, and S. I. Serdyukova. Statics and
dynamics of single-electron solitons in two-dimensional arrays of ultrasmall tunnel
junctions. Physica B Condensed Matter, 173:319328, September 1991.
13. A. A. Middleton and N. S. Wingreen. Collective transport in arrays of small
metallic dots. Phys. Rev. Lett., 71(19):31983201, 1993.
14. M. N. Wybourne, L. Clarke, M. Yan, S. X. Cai, L. O. Brown, J. Hutchison, and
J. F. W. Keana. Coulomb-blockade dominated transport in patterned gold-cluster
structures. Jpn. J. Appl. Phys., 36:77967800, 1997.
15. R. Parthasarathy, X. Lin, K. Elteto, T. F. Rosenbaum, and H. M. Jaeger. Percolating through networks of random thresholds: Finite temperature electron tunneling in metal nanocrystal arrays. Phys. Rev. Lett., 92:076801, 2004.
16. G. F. Lawler. Intersections of Random Walks. Birkhauser-Boston (1996).
17. R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms,
and Applications. Prentice Hall, 1993.

18. M. Suvakov and B. Tadic. Topology of cell-aggregated planar graphs. In: ICCS 2006, Part III, LNCS 3993, p. 1098. Springer, Berlin, 2006.

19. M. Suvakov
and B. Tadic. Transport processes on homogeneous planar graphs with
scale-free loops. Physica A, 372:354361, 2006.
20. C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys., 52:479, 1988.

Classification of Networks Using Network Functions


Makoto Uchida1 and Susumu Shirayama2
1 School of Engineering, the University of Tokyo,
5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8568, Japan
uchida@race.u-tokyo.ac.jp
2 Research into Artifacts, Center for Engineering, the University of Tokyo,
5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8568, Japan
sirayama@race.u-tokyo.ac.jp

Abstract. We propose a new classification of complex networks in association with a function of networks. Networks are considered to be input-output systems, where the initial condition is the input and the evolving dynamics is the output. We study a functional relationship between the input and the output which depends on the network structure. A function of a network is modeled as a spin interaction system driven by Glauber dynamics with arbitrary initial conditions. Through numerical studies, we show a novel classification of networks. The results are applied to examples of real-world networks, which prove the classification to be useful for the analysis of inherent characteristics and for model assumptions for real-world networks.

1 Introduction
Recently, a considerable amount of work has been done on various networks in the real world, such as the Internet, the world wide web, social networks and many others, in order to understand their properties [1, 2, 3]. Statistical properties of networks which are commonly seen in many instances of networks in nature have been revealed; a scale-free degree distribution, relatively short distances between nodes, a clustered structure with a high clustering coefficient and the existence of community structure are examples. A number of theoretical models of complex networks which realize these properties have also been proposed [4, 5, 6, 7].
There are some classes of network structures considered in the context of complex network science, typically known as scale-free networks or small-world networks. Such classes are of use for assuming a model for a real-world network. However, the classification of complex networks is still not a simple problem, since a mathematical representation of real-world networks is almost impossible due to their complicated structures. Because of this difficulty, only a little is known about classes of networks, mainly through basic statistical properties such as degree distributions.
Besides the statistical structure of networks, there are other groups of studies from the point of view of functions of networks. Examples include the evolution of magnetized spin glasses [8, 9], synchronization of coupled oscillators [10] and transport [11] on complex networks. From this perspective, networks are often treated as the structure of an interaction pattern in a system. Supposing the time-progressing dynamics on a complex



network to be the response of the network to an initial condition, one can regard the network as an input-output system, where the initial condition is regarded as the input and the evolving dynamics as the output. If there is a certain relationship between the input and the output of a network, it may represent a new characteristic hidden in the network structure, and networks may be classified in relation to these input-output relations.
We already reported in a previous paper that characteristic behaviors occur in a spin system on networks evolving by Glauber dynamics, corresponding to arbitrarily determined initial conditions [12], and proposed a new method to analyze the structure of networks. In this paper, we study a classification of networks in association with a function of networks, extending our previous work. Through a series of numerical studies using complex network models, we show several classes of networks.

2 Glauber Dynamics on Networks with Arbitrary Initial Conditions
We consider a spin system of zero-temperature Glauber dynamics on complex networks, with spin variables σ = ±1 located at each vertex of a network. The local field h_i(τ) operating on vertex i at time step τ, due to the spins of the neighboring vertices of vertex i, is given by

h_i(τ) = Σ_{j=1}^{N} A_ij σ_j(τ)   (i = 1, …, N),                        (1)

where A_ij is the adjacency matrix, which in this paper is considered to be symmetric. In our previous work, we investigated a similar dynamics as a two-state diffusion process [12]. The model is described as follows:

σ_i(n + 1) = sgn{h_i(n)}   if h_i(n) ≠ 0,
σ_i(n + 1) = σ_i(n)        if h_i(n) = 0.                               (2)

In this model, the spins of the vertices are updated synchronously at each time step n, that is, all vertices update their spins simultaneously as n progresses. In this paper, a general Monte Carlo method is used for the numerical simulations. The spins of individual vertices are updated asynchronously. At each step τ, a vertex is randomly chosen and its spin is updated according to the following rule:

σ_i(τ + 1) = sgn{h_i(τ)}                      if h_i(τ) ≠ 0,
σ_i(τ + 1) = ±1 (each with probability 1/2)   if h_i(τ) = 0.            (3)
The initial condition at τ = 0 is arbitrarily determined according to the centrality of the vertices. Corresponding to the centralities of each vertex, the rN vertices with the largest centrality are assigned the positive spin state at the initial state (σ(0) = +1), while the remaining (1 − r)N vertices are assigned σ(0) = −1. Here, r is the initial fraction of positive spins, and N is the number of vertices in the network. We consider r and the


type of centrality as the input to a network. As the output, we investigate the fraction of positive spins at t = ∞, which is denoted by r∞. We study the r–r∞ relations as the input-output relationships of the system.
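A minimal sketch of this input-output experiment is given below for the degree-centrality ordering; the network size, the fixed number of update sweeps used to approximate t = ∞, and the networkx model generator are illustrative choices, not the settings of Sect. 3.

import numpy as np
import networkx as nx

def r_infinity(G, r, sweeps=100, seed=0):
    # Seed the rN highest-degree vertices with sigma = +1 (degree ordering),
    # then run the asynchronous zero-temperature update of Eqs. (1) and (3).
    rng = np.random.default_rng(seed)
    nodes = sorted(G.nodes(), key=G.degree, reverse=True)
    n_plus = int(r * G.number_of_nodes())
    spin = {v: (1 if i < n_plus else -1) for i, v in enumerate(nodes)}
    node_list = list(G.nodes())
    for _ in range(sweeps * G.number_of_nodes()):
        i = node_list[rng.integers(len(node_list))]
        h = sum(spin[j] for j in G.neighbors(i))        # local field, Eq. (1)
        spin[i] = (1 if h > 0 else -1) if h != 0 else int(rng.choice((-1, 1)))
    return sum(1 for s in spin.values() if s == 1) / G.number_of_nodes()

# one point of the r-r_infinity curve on a small BA network
G = nx.barabasi_albert_graph(1000, 5, seed=0)
print(0.3, r_infinity(G, 0.3))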

3 Numerical Studies
In this section, we explain the numerical results and the classes emerging from the input-output relations of the networks. In addition to the models, we apply the method to several real-world networks to study their classes.
3.1 Network Models and Initial Conditions
In this paper, we deal with several network models: regular lattice, random graph (RA), Watts-Strogatz (WS) model [4] (the mixing parameter is set to p = 0.1), Barabási-Albert (BA) model [5], Klemm-Eguíluz (KE) model [6] (the mixing parameter is set to μ = 0 and μ = 0.1) and Connecting Nearest Neighbor (CNN) model [7]. Each model has its inherent characteristics. Their classes according to the conventional structural classification are briefly given in Table 1. See the related references for further details of the models.
Table 1. Network models and their structural classes

Model                Classes
Lattice              highly-clustered
Random graph         small-world
WS model             small-world, highly-clustered
BA model             scale-free, small-world
KE model (μ = 0)     scale-free, highly-clustered
KE model (μ = 0.1)   scale-free, small-world, highly-clustered
CNN model            scale-free, small-world, highly-clustered

In order to determine the initial input of spins, we consider four different orderings, as mentioned in the previous section: degree centrality, closeness centrality, betweenness centrality and clustering coefficient. See Refs. [4, 13] for details of these centralities.
3.2 Numerical Results and Typical Input-Output Relations
In the numerical studies, the number of vertices of each network is N = 36000, and the average degree is ⟨k⟩ = 10. The networks are always confirmed to be connected, that is, each network is composed of a single connected component. In addition, the numerical results presented in this paper are averaged over tens of realizations.
The results for all networks and initial conditions are listed in Fig. 1. As for the lattice, there is no discrimination of vertices with respect to the centralities, so the initial spins are randomly distributed with fraction r. In our previous paper (the model is represented

652

M. Uchida and S. Shirayama

by Eqn. (2), a certain oscillation mode was observed at n = [12], while no oscillation
occurred in the model by Eqn. 3. However, the r-r relations of both models turned out
to be essentially the same.
From Fig. 1, we find several characteristic patterns in the r–r∞ relations. They are briefly reviewed in the following (Fig. 2). The patterns are classified as step-like function, sigmoid-like function, convex curve with/without critical point, and linear-like function with/without intercept:
Step function. For most cases on the random graph and the BA model, an r–r∞ relation like a step function, as shown in Fig. 2(a), is observed. Positive spins disappear unless the input fraction exceeds a certain fraction r_c; otherwise they prevail in the whole network.
Sigmoid function. In some cases the transition at the critical fraction r_c is rather moderate. In such cases, the r–r∞ relation has a shape like a sigmoid function, as shown in Fig. 2(b).
Convex curve with/without critical point. In other cases, the shape is fairly different. Positive spins disappear when r is less than r_c, while they survive at a fraction higher than r if r > r_c, as shown in Fig. 2(c). In some cases there is no critical point, i.e. r_c ≈ 0, as shown in Fig. 2(c).
Linear function with/without intercept. For another input, the r–r∞ relation forms a linear function, where the spins hardly prevail nor shrink from the initial state, as shown in Fig. 2(d). In some cases r∞|_{r=0} = r₀ > 0, where the positive spins survive in the limit r → 0, as shown in Fig. 2(d).
3.3 Classification of Networks
According to the patterns described above, the input-output relations of the networks can be classified as in Table 2. The classes seem not to be directly related to the conventional structural classes; however, there are some clues for investigating them. Pattern (a) is only observed on the random graph and the BA model. It may be related to low clustering. Patterns (c) and (d), which are seen on highly-clustered networks, are more complicated. The critical point r_c of pattern (c) differs between cases. Pattern (d) is seen for the inputs by closeness centrality and betweenness centrality. They may be related to inherent characteristics of these networks and centralities. Thus, the patterns of the input-output relations of networks can be related to a new classification of networks.
3.4 Application to Real-World Networks
To utilize the classification discussed in the previous section, we apply this analysis and classification to two examples of real-world networks. One is the network of a Japanese social networking service (SNS), mixi, where a node is a user and an edge is a registered friendship. The other is a network of weblog entries, where a node is an individual weblog entry and an edge is a trackback between entries. They can both be recognized as scale-free, small-world and highly-clustered networks based on the conventional classification. The basic statistics (number of vertices N, number of

[Figure 1 (see caption below): a grid of r–r∞ plots, one panel per network model (Lattice with random input; RA, WS, BA, KE (μ = 0), KE (μ = 0.1), CNN) and per initial ordering (degree, closeness, betweenness, clustering); x-axis: Initial Ratio of +1 State, y-axis: Convergent Ratio of +1 State.]

Fig. 1. Numerical results of the r–r∞ relations for each network and ordering of the initial condition. The x-axis and the y-axis denote r and r∞, respectively.

edges m, average path length L and clustering coefficient C) are N = 360802, m = 1904614, L = 5.52 and C = 0.33 for the SNS, and N = 39048, m = 116318, L = 13.22 and C = 0.23 for the weblogs. Their degree distributions are shown in Fig. 3.


Fig. 2. Typical patterns found in the r–r∞ relations

Table 2. Classes of the r–r∞ relations of the network models for different inputs

Model          Degree   Closeness   Betweenness   Clustering
Lattice        (b) (random input)
RA             (a)      (a)         (a)           (a)
WS             (c)      (c)         (c)           (b)
BA             (a)      (a)         (a)           (a)
KE (μ = 0)     (b)      (d)         (d)           (c)
KE (μ = 0.1)   (b)      (d)         (c)           (c)
CNN            (c)      (d)         (c)           (b)


Fig. 3. Degree distributions of the real-world networks (right: SNS, left: Weblogs)


The r–r∞ relations of the real-world networks obtained by the proposed method, and their classes, are shown in Fig. 4 and Table 3. They also have unique classes. As for the SNS network, the results are classified into the same class as the CNN model, except for the input of closeness centrality. In particular, the form of the r–r∞ relationship for the input of clustering coefficient has the same characteristic of discontinuity as that of the CNN model. As for the weblogs network, the linear pattern (d) for the input of closeness centrality, which is seen on the KE network, is observed.
These classes may represent inherent characteristics of the networks which do not appear in the conventional statistical characteristics. These results imply that the inherent structure of the SNS network is well modeled by the CNN model, and that the inherent characteristics of the closeness centrality of the weblogs network are modeled by the KE model. The classes proposed in this paper can thus be useful for the classification and model assumption of real-world networks.

Fig. 4. Numerical results of the r–r∞ relations of the real-world networks

Table 3. Classes of the r–r∞ relations of the real-world networks

Network   Degree   Closeness   Betweenness   Clustering
SNS       (c)      (c)         (c)           (b)
Weblogs   (c)      (d)         (c)           (b)

4 Conclusion
We proposed a new classification of networks using a function of networks. A network is considered to be an input-output system, and the classes are related to a functional relationship between an arbitrary input and the response (output) to that input. As a model of the function, we considered a spin interaction system with arbitrary initial conditions on networks. Numerical studies using complex network models revealed that networks are classified into four fundamental classes. The classes are considered to represent inherent characteristics of networks.


Applying the method to two examples of real-world networks, we have shown that the classes also appear on the real-world networks, and that the classification can be used for model assumption when studying an unknown characteristic of a real-world network.

References
1. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford University Press, Oxford (2003)
2. Newman, M.E.J., Barabasi, A.L., Watts, D.J.: The Structure and Dynamics of Networks.
Princeton Univ. Press (2006)
3. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.U.: Complex networks: Structure and dynamics. Phys. Rep. 424 (2006) 175-308
4. Watts, D.J., Strogatz, S.H.: Collective dynamics of small-world networks. Nature 393
(1998) 440 442
5. Barabasi, A.L., Albert, R.: Emergence of scaling in random networks. Science 286 (1999)
509 512
6. Klemm, K., Eguíluz, V.M.: Highly clustered scale-free networks. Phys. Rev. E 65(036123)
(2002)
7. Vazquez, A.: Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Phys. Rev. E 67(056104) (2003)
8. Castellano, C., Loreto, V., Barrat, A., Cecconi, F., Parisi, D.: Comparison of voter and glauber
ordering dynamics on networks. Phys. Rev. E 71(066107) (2005) 066107
9. Castellano, C., Pastor-Satorras, R.: Zero temperature glauber dynamics on complex networks. J. Stat. Mech. (2006) P05001
10. Arenas, A., Daz-Guilera, A., Perez-Vicente, C.J.: Synchronization reveals topological scales
in complex networks. Phys. Rev. Lett. 96(114102) (2006)
11. Tadic, B., Rodgers, G.J., Thurner, S.: Transport on complex networks: Flow, jamming and
optimization. arXiv:physics/0606166 (2006)
12. Uchida, M., Shirayama, S.: A new analysis method for complex network based on dynamics of spin diffusion. In: ICCS 2006, Part III, LNCS 3993. (2006) 1063-1066
13. Brandes, U.: A faster algorithm for betweenness centrality. Journal of Mathematical Sociology 25(2) (2001) 163 177

Effective Algorithm for Detecting Community Structure in Complex Networks Based on GA and Clustering

Xin Liu1, Deyi Li2, Shuliang Wang1, and Zhiwei Tao1
1 State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China
2 China Institute of Electronic System Engineering, Beijing, 100039, China
tsinllew@gmail.com

Abstract. The study of networked systems has experienced a particular surge of interest in the last decade. One issue that has received a considerable amount of attention is the detection and characterization of community structure in networks, meaning the appearance of densely connected groups of vertices, with only sparser connections between groups. In this paper, we present an approach to the problem of community detection using a genetic algorithm (GA) in conjunction with a clustering method. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex real-world systems with scale-free network structure.
Keywords: complex network, community structure, modularity, clustering.

1 Introduction
Numerous and various complex networks have emerged with the development of science, technology and human society. A property that seems to be common to many networks is community structure [3], the division of network nodes into communities or modules within which connections are dense, but between which they are sparser. Community detection, which can help us simplify functional analysis, is potentially very useful. To evaluate the accuracy of a community structure, Newman and Girvan devised a quantitative measure called modularity Q. It is defined in [4] as:

Q = Σ_i (e_ii − a_i²)                                                   (1)

Q is the sum of the modularity contributions q of all the communities, Q = Σ_i q_i. For each community i, q_i is calculated as e_ii, the fraction of the links that fall within the community, minus a_i², the expected value of the same quantity if the links were distributed at random regardless of the community structure.
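For concreteness, Eq. (1) can be evaluated for a given partition as in the short sketch below (an undirected, unweighted network is assumed; e_ii is the fraction of edges inside community i and a_i is the fraction of edge ends attached to it).

from collections import defaultdict

def modularity(edges, community_of):
    # Eq. (1): Q = sum_i (e_ii - a_i^2) for the partition given by community_of.
    edges = list(edges)
    m = len(edges)
    e = defaultdict(float)          # fraction of edges inside each community
    a = defaultdict(float)          # fraction of edge ends attached to each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            e[cu] += 1.0 / m
        a[cu] += 0.5 / m
        a[cv] += 0.5 / m
    return sum(e[c] - a[c] ** 2 for c in a)

# example: a 4-cycle split into two adjacent pairs has Q = 0
print(modularity([(0, 1), (1, 2), (2, 3), (3, 0)], {0: "A", 1: "A", 2: "B", 3: "B"}))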


Adopting this successful measure as the evaluation function, many optimization-based attempts have been proposed to tackle the module identification problem. Guimerà and Amaral put forward a simulated annealing approach [1]. Duch and Arenas suggested an algorithm based on extremal optimization [7]. Newman proposed in [5] a fast method that creates the hierarchy following an agglomerative strategy. Pujol, Béjar and Delgado recently developed an algorithm [6] that is more efficient while maintaining accuracy. In this paper, we introduce another approach, based on a genetic algorithm (GA) and clustering, which is efficient, sensitive and able to analyze very large networks.

2 The Method of Extracting Communities Based on GA


2.1 Bi-partitioning Strategy
The community detection problem is NP-hard [5]. It is in practice infeasible to carry out an exhaustive search of all possible divisions for systems larger than 30 or 40 vertices. However, by performing a GA, as we will shortly show, one can easily find a fine partition that divides the network into two parts. In order to detect more modules, we adopt a smart scheme. Our approach, with reference to [1], employs repeated division into two: we first divide the network into two parts, then divide those parts, and so forth. When partitioning a subgraph, we first isolate it from the rest of the network, then perform a nested GA. The result of the nested GA is a partition of the subgraph into two new modules, which we accept or reject according to the global modularity. Namely, if we find that the proposed split makes a negative contribution to the total modularity, we leave the corresponding subgraph undivided. When the entire network has been decomposed into indivisible subgraphs in this way, the algorithm ends.
2.2 Dividing a Network into Two Using GA
Given a network (or a subgraph separated from the entire network), we divide it into two parts on the criterion of modularity. So the goal is to:

maximize_{q ∈ H} Q(q)                                                   (2)

where H is the set of all possible bi-partitions. The GA used to solve such an optimization problem consists of several steps, whose implementation is explained as follows.
Problem Encoding
A divisive solution that bi-partitions a network containing k vertices is represented by a binary string:

(num_1, num_2, …, num_k),   num_i = 0, 1   (i = 1, …, k),


where numi = 0 denotes that vertex numi is distributed to the first part, while numi = 1
means the other assignment.


Evaluation Function
We choose modularity Q as the fitness value of the chromosome. Then, the evaluation
function is defined in (1).
Mutation Operation
In this process, each chromosome in the population except the fittest one undergoes a random one-bit flip. If the mutation has improved the fitness value of the chromosome, we adopt it for certain. Otherwise, we accept or reject the mutated string according to a variable probability:

pro = 1 − t / maxgen,                                                   (3)

where t and maxgen denote the current generation number and the maximal generation number, respectively. This mechanism, analogous to simulated annealing, makes the acceptance probability of a deteriorating mutation large initially, so that the space is searched uniformly, and very small at a later stage, so that the search becomes local and can escape local optima.
Local Hill-climbing Operation
We first select the fittest chromosome u in the population, then mutate one of its bits, obtaining u'. Only if the random mutation has improved the fitness value do we replace u by u'; otherwise nothing is done. This operation is significant in that the algorithm depends heavily on it to approach the global optimum when the evolution reaches a later stage.
Termination Condition
The algorithm is terminated when at least one of the following two conditions is satisfied:

1. The fittest chromosome in the population has not been improved for a desired number of generations.
2. The number of the current generation exceeds the limit.
Pseudo-code
Finally, the pseudo-code of our genetic algorithm is described as follows:

program Bi-Partition-a-Network (Output)
begin
  Step 1: current generation t := 0;
  Step 2: initiate pop(t) randomly;
  Step 3: while (termination condition is not satisfied)
    Step 3.1: mutation operation;
    Step 3.2: local hill-climbing operation;
    Step 3.3: eliminate a quarter of the inferior chromosomes
              and replace them by the top 25% ones;
    Step 3.4: t := t + 1;
  Step 4: convert the fittest chromosome into a bi-partition solution;
end.
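The pseudo-code above translates into a compact Python sketch as follows. The population handling follows the operations described in this section (one-bit mutation with the acceptance rule of Eq. (3), hill-climbing on the fittest string, replacement of the worst quarter by the best quarter); for brevity the full fitness evaluation is used everywhere, whereas the paper relies on the cheaper O(l) incremental update, the early-termination test is omitted, and any detail not stated in the text is an illustrative choice.

import random

def bipartition_ga(adj, popsize=100, maxgen=10000, seed=0):
    # Split vertices {0..n-1} into two parts by maximizing the modularity Q.
    rng = random.Random(seed)
    n = len(adj)
    m = sum(len(nb) for nb in adj) / 2.0

    def fitness(chrom):
        # Full evaluation of Q for a bi-partition encoded as a 0/1 string.
        e = [0.0, 0.0]
        a = [0.0, 0.0]
        for u in range(n):
            for v in adj[u]:
                if u < v:
                    if chrom[u] == chrom[v]:
                        e[chrom[u]] += 1.0 / m
                    a[chrom[u]] += 0.5 / m
                    a[chrom[v]] += 0.5 / m
        return e[0] - a[0] ** 2 + e[1] - a[1] ** 2

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(popsize)]
    fit = [fitness(c) for c in pop]
    for t in range(maxgen):
        best = max(range(popsize), key=fit.__getitem__)
        for k in range(popsize):                           # mutation (fittest is skipped)
            if k == best:
                continue
            cand = pop[k][:]
            cand[rng.randrange(n)] ^= 1
            f = fitness(cand)
            if f > fit[k] or rng.random() < 1.0 - t / maxgen:   # acceptance rule, Eq. (3)
                pop[k], fit[k] = cand, f
        cand = pop[best][:]                                # local hill-climbing
        cand[rng.randrange(n)] ^= 1
        f = fitness(cand)
        if f > fit[best]:
            pop[best], fit[best] = cand, f
        order = sorted(range(popsize), key=fit.__getitem__, reverse=True)
        for w, b in zip(order[-popsize // 4:], order[:popsize // 4]):
            pop[w], fit[w] = pop[b][:], fit[b]             # replace the worst 25% by the best 25%
    best = max(range(popsize), key=fit.__getitem__)
    return pop[best], fit[best]

# usage: best_chromosome, Q = bipartition_ga(adjacency_list_of_the_network)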


Complexity Analysis
Some of the parameters used in the GA are listed in Table 1. Concerning the computational time, the most expensive step is the fitness evaluation process. For a given chromosome with string length l, the running time of a fitness evaluation is O(l²). However, if we just flip a single bit of a string, we only need to modify the previous fitness value to fulfill the re-evaluation, whose cost is as low as O(l). This is also the reason for not including a crossover operation in our algorithm, as re-evaluating the offspring produced by crossover entails an unbearable complexity of O(l²). Thus, the complete fitness evaluation of O(l²) complexity is only performed when assessing the first randomized chromosomes. In addition, we perform the O(l) fitness re-evaluation popsize times per generation. Therefore, in the worst case, the complexity of the GA would be:

popsize × O(l²) + (maxgen × popsize) × O(l).                            (4)

If a network with n vertices is ultimately divided into s communities, we would perform the GA 2s − 1 times. Yet, the length of the chromosome, which is n the first time, decreases dramatically rather than staying unchanged. Hence, a rough upper limit of the total computation cost can be estimated as:

2 × popsize × O(n²) + (log₂ s + 2) × (maxgen × popsize) × O(n).         (5)

3 Community Detection in Scale-Free Networks of Large Size


When confronted with large real-world graphs that have thousands or millions of
vertices, the GA becomes impractical. To classify communities in such graphs, we
should exploit their scale-free nature. First, the power-law distribution tells us that
most vertices are of low degree, but a few nodes have a very high degree. Second,
vertices in scale-free networks are not created equal: some are more important
than others. Based on these two views, we propose a method of clustering which
can guide the search for dense modules. Our notion can be expressed vividly as
follows. For each vertex with higher degree, we take it as a center and draw a circle
around it. One can imagine intuitively that if the scope is small enough, all nodes in it
would belong to the same community. In other words, if we allot the centered node to
a certain community, the other vertices in the same circle will be assigned to the same
module too. Thus, we can omit the massive trivial nodes in the same ring, and treat
them as a unified cluster represented by the central vertex.
To realize the above idea, we first find the most connected r·n vertices (hubs),
where r is the ratio of superior nodes and n denotes the size of the
network. Next, we affiliate each vertex with its nearest hub. Upon that, the initial
network G has been preliminarily partitioned into r·n clusters. Then the subsequent
GA only needs to decide, for each cluster, which community it belongs to, with the
chromosome encoded over the representative hubs; hence the efficiency of the algorithm can
be enhanced enormously.
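
A rough Python sketch of this clustering step is given below, assuming the graph is
supplied as an adjacency dictionary; the function name and the tie-breaking by BFS order
are our assumptions rather than the authors' exact procedure.

from collections import deque

def hub_clustering(adj, r):
    # Assign every vertex to its nearest hub (one of the top r*n
    # highest-degree vertices) by a multi-source BFS; ties are broken by
    # BFS order.  `adj` maps each vertex to an iterable of neighbours.
    n = len(adj)
    num_hubs = max(1, int(r * n))
    hubs = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:num_hubs]

    cluster = {h: h for h in hubs}               # vertex -> representative hub
    queue = deque(hubs)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in cluster:                 # first hub wave to reach w claims it
                cluster[w] = cluster[u]
                queue.append(w)
    return cluster                               # about r*n clusters, keyed by hub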
Generally speaking, our method is a joint strategy of GA and clustering, which
allows us to analyze very large networks in reasonable time while maintaining
accuracy. Note that the efficiency and accuracy of the algorithm are affected by the
proportion of hubs we mine. As a matter of fact, if the hubs are excessive, there
would be too many clusters for the algorithm to be efficient. On the contrary, if the
number of hubs is too small, although the algorithm is efficient, the partition would be
ill-constructed. Therefore, the quality-efficiency trade-off of our algorithm is really
subject to the value of r.
On a network with n vertices and m edges, the running time of the clustering operation
depends on r. Specifically, the complexity is between O(n), where r = 1, i.e., all
nodes are hubs, and O(mn), where r is so small that only one hub exists. According to
(5), the sequential divisional algorithm has a complexity of 2 · popsize · O((rn)^2) +
(log2 s + 2) · (maxgen · popsize) · O(rn). If the values of all parameters are set as in
Table 1, the total computational cost would approximately be:

O(mn) + 2×10^5 · (log2 s + 2) · O(n) + 8 · O(n^2) .               (6)

This is the worst-case complexity; in practice the GA usually converges well
before reaching the maximal generation. Therefore, generally speaking, our
algorithm entails a cost of O(n(m+n)), i.e., O(n^2) on a sparse network.
Table 1. Summary of parameters used in the algorithm

Parameter   Meaning                                                        Value
popsize     size of the population                                         100
maxgen      maximal generation number                                      10000
ungen       max number of generations for unimproved fittest chromosome    100
r           fraction of mined hubs                                         0.2
s           number of identified communities                               network dependent

4 Experiments and Results


4.1 Computer-Generated Networks

As a first example of the working of our algorithm, we employ a widely used
computer-generated random graph with known community structure [4]. It consists of
128 vertices divided into four groups of 32. Each vertex has on average kin edges
connecting it to other nodes of the same group and kout edges to members of other
groups, satisfying k = kin + kout = 16. As kout becomes larger and larger, the resulting
community structure becomes weaker and weaker, and thus the graph poses greater
and greater challenges to our algorithm. This time, we set r = 1 to get the best partition
result. We compare our algorithm to the GN algorithm [3], the most frequently
referenced one, and the SA algorithm [1], the most accurate one at present. As Fig. 1
shows, the algorithm performs well, correctly identifying all vertices for values of
kout/k ≤ 0.3. When kout/k approaches 0.4, more than 97.5% of the vertices are still
classified accurately. Only for kout/k ≥ 0.45 does the classification begin to deteriorate
markedly.
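
For reference, such a benchmark graph can be generated along the following lines (a
hedged sketch based on the description above; the edge probabilities are derived from
the expected degrees and this is not the authors' code).

import itertools
import random

def benchmark_graph(frac_out, groups=4, size=32, k=16):
    # Four groups of 32 vertices; each vertex has on average k_in edges
    # inside its group and k_out = frac_out * k edges to the other groups.
    n = groups * size
    k_out = frac_out * k
    k_in = k - k_out
    p_in = k_in / (size - 1)                     # expected intra-group degree = k_in
    p_out = k_out / (n - size)                   # expected inter-group degree = k_out
    group = [v // size for v in range(n)]
    edges = [(u, v) for u, v in itertools.combinations(range(n), 2)
             if random.random() < (p_in if group[u] == group[v] else p_out)]
    return edges, group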

[Fig. 1 plots the fraction of vertices classified correctly (0 to 1) against the fraction
of inter-community edges kout/k (0 to 0.5) in two panels: the left compares the algorithm
described in this paper with the algorithm of Girvan and Newman, the right compares it
with the SA algorithm.]

Fig. 1. The fraction of vertices correctly identified by our algorithm, the GN algorithm and
the SA algorithm in the computer-generated graphs described in the text. Our algorithm, which
significantly outperforms the GN algorithm and is comparable to the SA algorithm in most
cases, is able to reliably identify modules in a network whose nodes have as many as
40% of their connections outside their own module.

Fig. 2. The division results of the karate club network by our algorithm. On the left, we apply
our algorithm directly to the network, obtaining 4 communities with Q = 0.418803, the biggest
modularity reported so far. On the right, the result of the improved algorithm with a threshold
of 0.02 is satisfying: only the number 10 vertex is classified wrongly according to the actual
structure of the network.

4.2 Zachary's Karate Club Network

Now we turn to some applications to real-world networks. The first one is Zachary's
karate club network [4], which is taken from one of the classic studies in sociology.
We apply our algorithm to it and the original network is segmented into 4 parts with
Q = 0.418803 (see Fig. 2, left). In order to avoid the graph being decomposed into more
parts, we set a threshold of 0.02: only if the contribution of the proposed split of the
isolated subgraph to the total modularity is larger than this threshold do we perform the
division. In this way, we carry out our experiment for the second time. The algorithm
works perfectly, finding most of the known split of the network into two groups with
Q = 0.371795; only one vertex, number 10, is classified wrongly (see Fig. 2, right). In
fact, if we distribute the number 10 vertex to the right group, the corresponding Q would
fall to 0.371466. As can be seen from the picture, the number 10 vertex is connected to
the two parts with only one link each. So both assignments could be accepted from the
view of community structure, whether we distribute it to the left part or the right one.


4.3 UML Class Diagram

This time we employ a UML class diagram with 224 vertices and 541 edges. Again
we set r to 1 in the experiment. The program terminates rapidly and segments the original
graph into 10 parts with a modularity of 0.531982. Fig. 3 shows the image of the
network after performing our algorithm.

Fig. 3. The identified 10 communities of UML class diagram

4.4 Other Real-World Networks

In order to further verify the significance of the algorithm proposed in this paper, we
apply it to several other networks of different sizes. In all the tests, we only work with
the biggest connected component, removing all multiple relations and self-referenced
edges. The results given in Table 2 provide a useful quantitative measure of the
success of our algorithm when applied to real-world problems.
Table 2. The results of our algorithm on several other real-world networks. The applied
networks, ranging from 1458 to 28502 vertices, are all of scale-free structure. The algorithm
successfully identifies the communities in reasonable time (the runs were executed on a
Pentium 4 3.0 GHz desktop computer).
Networks                      Num of Vertices   Num of Communities   Q          r     Time(s)
Protein interaction network   1458              12                   0.738382   0.3   5.1
Word association graph        7192              22                   0.434555   0.2   62.4
Co-authorship graph           28502             68                   0.578198   0.1   1724.0

5 Conclusion
In this paper we describe an algorithm, based on repeated subdivisions, to extract
community structure from complex networks. The running time of the algorithm is
O(n^2) on a sparse network, where n is the number of nodes. Experiments show that
our method performs well in both efficiency and accuracy. The innovative aspects of
this paper are as follows. First, we employ the GA and devise some specific tricks to
successfully solve this intractable problem. Second, we exploit the scale-free nature of
real-world complex networks and propose a method, combined with the idea of
clustering, to handle daunting real-world systems of medium and large size in
reasonable time.
Acknowledgments. The authors thank Zhijian Wu, Zengyang Li for their wonderful
suggestions and Baohua Cao, Bin Liu for providing data. This research is supported
by the National Grand Fundamental Research 973 Program (2006CB701305), the
State Key Laboratory of Software Engineering Fund (SKLSE05-15), and the Ministry
Key GIS Laboratory Fund (wd200603).

References
1. Guimerà, R., Amaral, L.A.N.: Cartography of complex networks: modules and universal
roles. Journal of Statistical Mechanics: Theory and Experiment, P02001 (2005)
2. Danon, L., Duch, J., Díaz-Guilera, A., Arenas, A.: Comparing community structure
identification. J. Stat. Mech., P09008 (2005)
3. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks.
Proc. Natl. Acad. Sci. 99, 7821–7826 (2002)
4. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks.
Phys. Rev. E 69, 026113 (2004)
5. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys.
Rev. E 69, 066133 (2004)
6. Pujol, J.M., Béjar, J., Delgado, J.: Clustering algorithm for determining community
structure in large networks. Phys. Rev. E 74, 016107 (2006)
7. Duch, J., Arenas, A.: Community detection in complex networks using extremal
optimization. Phys. Rev. E 72, 027104 (2005)
8. Newman, M.E.J.: Finding community structure in networks using the eigenvectors of
matrices. Phys. Rev. E 74, 036104 (2006)
9. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. 3rd edn.
Springer-Verlag, Berlin Heidelberg New York (1996)
10. Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad.
Sci. USA 103, 8577–8582 (2006)

Mixed Key Management Using Hamming
Distance for Mobile Ad-Hoc Networks
Seok-Lae Lee 1, In-Kyung Jeun 1, and Joo-Seok Song 2
1 KISA, 78 Garak-Dong, Songpa-Gu, Seoul, 138-803, Korea
{sllee,ikjeun}@kisa.or.kr
2 Yonsei University, 134 Shinchon-Dong, Seodaemoon-Gu, Seoul, 120-749, Korea
jssong@emerald.yonsei.ac.kr

Abstract. Unlike fixed networks, mobile ad-hoc networks have diverse
characteristics, such as dynamic topologies, bandwidth-constrained links,
energy-constrained operation, limited physical security, etc. Due to these
characteristics their security requirements are different from those of
fixed networks. This paper presents a method of authenticating the nodes
in the ad-hoc network and securely transmitting information in consideration of the
characteristics of the mobile ad-hoc network. To this end, this paper proposes a
method of combining asymmetric and symmetric cryptography to enhance the
efficiency of secret key management for authentication and secure channel
establishment. In particular, this paper proposes a method that introduces the
concept of Hamming Distance to share the secret keys between the nodes. According
to this proposal, secure communication between the nodes is possible when one node
manages only log2 N secret keys.
Keywords: Hamming Distance, Certificate, Public-key, Ad-hoc Network, Secret Key.

1 Introduction

Mobile ad-hoc networks provide convenient infrastructure-free communications over wireless
channels. Recently, mobile ad-hoc wireless networking technology has been
applied to various applications such as military tactical networks, sensor networks,
and disaster area networks [5]. But it is difficult to implement mobile
ad-hoc networks due to physical constraints, such as the limited power range
and small memory of mobile devices, and security issues [6] such as authentication.
Recently, there have been many efforts to resolve the constraints of mobile ad-hoc
networks with various security issues [7], [8], [9], [11].
In [7], [8], [9], Luo et al. distributed the functions of the certification authority
through a threshold secret sharing method [3], [4] and a scalable multi-signature
mechanism. In [11], Capkun et al. proposed a fully self-organized public-key management
system that doesn't need any trusted authorities in a mobile ad-hoc network.
The authors presented methods by which the authentication mechanism of
the mobile ad-hoc network can act on its own without the help of the centralized



TTP (Trusted Third Party), but since it acts on the basis of the public-key algorithm,
it needs to be improved in terms of actual implementation or efficiency in
consideration of the performance of the nodes.
Accordingly, this paper proposes a method of ensuring the security and reliability
of the mobile ad-hoc network by providing authentication and secure communication
based on symmetric cryptography while acting independently from the centralized TTP.
Besides, the concept of Hamming Distance [1] is introduced to show that it is possible
to securely transmit information between the nodes, though one node only manages
log2 N secret keys. In this paper, N means the number of nodes in the network.
This paper is organized as follows: In Section 2, we explain the concept of
Hamming Distance. In Section 3, we present a secure communication channel
establishing method. In Section 4, we propose the key management method.
In Sections 5 and 6, we propose a path construction algorithm and analyze its
performance.

2 Concepts

First, we explain the basic characteristics of the Hamming Distance between two integers
and the concept of Hamming Distance for reducing the number of secret
keys exchanged between one node and others to log2 N. We use the following
two definitions to simplify our explanation.
The IDs of two nodes, a and b, are defined as aID (∈ ZN) and bID (∈ ZN), and the
Hamming Distance between these IDs is defined as HD(a, b) (or HD).
A node a stores and manages the secret keys exchanged with other nodes.
At this time, node a from the viewpoint of the other nodes is defined as the parent
node (PN) and the other nodes are defined as child nodes (CN). Besides,
one of CN is defined as CNi. Each node meets the following condition:
HD(PN, CNi) = 1.
Theorem 1. For a ∈ ZN, define Ra = {x ∈ ZN | HD(a, x) = 1}; then
|Ra| = log2 N, where ZN = {0, 1, ..., N − 1}, N = 2^m, m > 0.
Proof. For an integer a ∈ ZN, the number of integers whose Hamming Distance
from a is 1 is the same as the number of integers whose Hamming weight (in
comparison with 0) is 1. Generally, the number of integers whose Hamming weight
is r is mCr. Here, m means log2 N and C means combination. Thus, for a number a,
the number of integers whose Hamming Distance from it is 1 is mC1 = m (= log2 N).
In this paper, each node is assigned an element over ZN as the unique ID of
its public-key certificate. This ID is a different concept from the serial number of
the X.509 v3 public-key certificate [10] and is managed by using the extension field
within the certificate. To construct a secure communication path and manage secret
keys efficiently, each node is assigned its own ID by the administrator of the ad-hoc
network. This ID has a particular inter-relationship with some nodes by using HD. As
for this matter, we give explanations in Sections 5 and 6 through a path construction
algorithm and simulation. In addition, the number of nodes that a node has to manage
becomes log2 N.
Figure 1 briefly describes the characteristics explained above. Figure 1 represents
the relationship between integers with HD = 1 for the integer set of N = 8. Namely,
the number of integers whose Hamming Distance from 001 is 1 is 3 (= log2 8),
namely {000, 011, 101}. At this time, if the ID of PN is 001, the IDs of CN
are {000, 011, 101}; that is, R001 becomes {000, 011, 101}. According to
Theorem 1, a PN only stores and manages the secret keys belonging to log2 N nodes.

Fig. 1. The set of IDs with HD = 1
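
The set Ra of Theorem 1 can be enumerated directly by flipping each of the m = log2 N
bits, as the following small Python illustration shows (the function name is ours).

def hd1_neighbours(node_id, m):
    # IDs at Hamming distance 1 from node_id in Z_N (N = 2**m): exactly the
    # m = log2 N integers obtained by flipping a single bit.
    return [node_id ^ (1 << b) for b in range(m)]

# Example of Fig. 1 (N = 8, PN = 001): R_001 = {000, 011, 101}
print([format(x, '03b') for x in hd1_neighbours(0b001, 3)])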

3 Secure Communication Channel

This section uses the concept of Hamming Distance to describe how to establish
a secure channel for securely transmitting information between nodes in the ad-hoc
network. For simplicity's sake, it is assumed that N is 8, and information is
transmitted through the secure channel between two nodes (N000 and N110).
For starters, each node identifies the nodes with HD = 1. In Figure 2, the nodes
with HD = 1 in comparison with N000 are {N001, N010, N100}. To establish a
secure channel between nodes, each node exchanges secret keys with those nodes
with HD = 1. To ensure security at this time, each node uses the public-key of
the other node to encrypt and transmit its secret key. In other words, the nodes
exchanging secret keys with node N000 are {N001, N010, N100}, and the secret
keys are {K(000,001), K(000,010), K(000,100)}. At this time, K(000,001) and
K(001,000) are the same secret key. And node N000 may exchange a different
secret key with each node to enhance security. In this method, all nodes exchange
secret keys with the nodes that satisfy HD = 1. On the assumption that secret
keys were exchanged like this, the method of establishing a secure channel for
securely transmitting information between two nodes will be explained. That is,
a communication path between Sn and Dn in Figure 2 is determined to securely
transmit information from N000 to N110. There are six paths in Figure 2 and one
of these paths is selected to transmit information.

Fig. 2. The concept of a communication path construction

4 Secure Key Management

A node that wants to participate in the ad-hoc network needs to cross-certify
and exchange the encryption key for secure communication with other nodes.
This section proposes a method that combines asymmetric and symmetric
cryptography as the method of managing the secret keys necessary for cross-certification
and encryption in this ad-hoc network. Asymmetric cryptography is used when
the public-key certificate for an ad-hoc network administrator is issued by
the TTP, and when an initial key is installed on each node. The initial key is
used when a node participating in the ad-hoc network exchanges secret keys
with other nodes. Symmetric cryptography is used for authentication and secure
communication between nodes.
In general, in symmetric cryptography, as all nodes must share secret keys with
each other, the number of secret keys managed by one node must be (N − 1), as in
Figure 3(a). However, in this paper, we show that it is possible to achieve secure
communication even if only log2 N secret keys are exchanged, as in Figure 3(b).
This paper divides the key management method into four stages: administrator's
preparation, node initialization under the administrator's control, secret key
exchange, and secure communication.
4.1 Preparation

In the preparation stage, the administrator of an ad-hoc network registers with the
centralized TTP to initialize nodes. In other words, the administrator generates
a public-key pair using his own PC, stores the private-key in his tamper-proofing
device, sends the public-key to the TTP, and gets the certificate issued by the TTP.


Fig. 3. Traditional method vs. Hamming Distance method

4.2 Node Initialization

For a node to participate in an ad-hoc network, it must be identified to the
existing network participants. To this end, the node must have the information
necessary for identifying and confirming itself to the nodes already participating
in the ad-hoc network (authentication information). In the node initialization
stage this authentication information is given to the nodes. Node initialization
is performed by the administrator.
Node initialization is as follows. The administrator connects to the TTP. The TTP
verifies the administrator's public-key certificate (administrator authentication).
A secure channel is established for secure communication between the TTP
and the administrator (secure channel establishment). The administrator generates the
public-key pair corresponding to the node, stores the private-key in the
tamper-proofing area of the node and sends the public-key to the TTP (node's
public-key pair generation). The TTP generates a certificate for the public-key
of the node, and then sends its own certificate and the node's certificate together
to the node (node's certificate issuance). At this time, the TTP's certificate is
used to authenticate other nodes while participating in the ad-hoc network, and
the node's certificate is used to authenticate itself to other nodes. In particular,
the node's certificate in this stage complies with the X.509 v3 certificate profile, but
includes diverse information, such as ID, IP address, device usage, and device
validity date, to restrict the use of the node's certificate.
4.3 Secret Key Exchange

In this stage, the secret keys for secure communication between nodes are shared.
In general, for secure communication between nodes, all nodes must own the secret
keys necessary for data encryption. In sharing these secret keys, it is important
to minimize the impact of the leakage of secret keys in the network to an
intruder and to provide end-to-end security. In the most straightforward method, if the
network size is N, one node needs to exchange secret keys with (N − 1) other nodes, and
bears the burden of managing (N − 1) secret keys. Accordingly, there must be
some trade-off among the network security, the burden of exchanging secret keys
between nodes, and the quantity of secret keys managed by one node.
As the concept of Hamming Distance is introduced, it becomes possible to securely
transmit information in the ad-hoc network as long as secret keys are shared
between those nodes with HD = 1. Figure 3(b) illustrates how secret keys are
shared between nodes. Since one node only needs to manage log2 N secret keys
in this method, the burden on the node is reduced to a considerable extent.

[Fig. 4 presents the path construction algorithm as five numbered steps: (1) initialize
path_length to 0 and PN and path_nodes[0] to Sn; (2) compute the Hamming Distance
between PN and Dn and go to step 5 if it equals 1; (3) randomly select children CN_i of
PN and check whether HD(CN_i, Dn) = 1; (4) otherwise randomly select an unprocessed child
CN_j, record PN in path_nodes[path_length], replace PN by CN_j and return to step 3,
failing if all log2 N children have been processed; (5) output (path_length, path_nodes[])
and stop.]

Fig. 4. The path construction algorithm
4.4 Secure Communication

For secure communication between two nodes, a communication path between
the two nodes must first be established. Section 5 describes how to establish
the path. Once the path is established, the secret keys owned by the nodes on
the respective path are used for communication between the two nodes. That is, secure
communication between two nodes can be explained by means of Figure 2. The
assumption in Figure 2 is that the communication path from N000 to N110 is
{N000 → N010 → N110}. First of all, as N000 and N010 share the secret key
K(000,010), N000 uses the secret key K(000,010) to transmit information to N010,
whereas N010 uses the secret key K(010,110) to transmit information to N110.

5 Path Construction Algorithm Using Hamming Distance

A path construction algorithm refers to finding one path establishing the secure
communication channel from Sn (source node) to Dn (destination node). In Figure 2,
there are six paths from Sn to Dn and only one of these paths is constructed as the
secure communication path from Sn to Dn.
We present an algorithm, illustrated in Figure 4, to search for a path from Sn to
Dn by using the characteristics of Hamming Distance. First, the terminology
used in Figure 4 will be explained briefly. path_nodes represents the path
information from Sn to Dn in the process of constructing a path. The algorithm
proposed in this paper is divided into two parts. One is computing the Hamming
Distance between Dn and the CNi of PN. The other is a process to replace PN
with one node among the CN of PN in case HD ≠ 1. If HD(PN, Dn) = 1 or
HD(CNi, Dn) = 1, the path construction algorithm is successfully terminated.
If HD ≠ 1, PN is replaced by randomly choosing one of its CN. Then, using the CN
of the replaced PN, it again verifies whether HD = 1 is satisfied. By repeating
this computation, a path can be found.
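
Assuming every ID in ZN is present and reachable, the repetition described above can be
sketched in Python as follows; this is a simplified reading of Fig. 4, not the authors'
exact algorithm, and all names are illustrative.

import random

def construct_path(sn, dn, m, max_hops=None):
    # Start from PN = Sn; stop as soon as PN or one of its HD = 1 children
    # is adjacent to Dn, otherwise replace PN by a randomly chosen child
    # and repeat.  Every consecutive pair on the returned path differs in a
    # single bit, so the corresponding pairwise secret keys exist.
    path, pn, visited = [sn], sn, {sn}
    for _ in range(max_hops if max_hops is not None else 4 * m):
        if bin(pn ^ dn).count('1') <= 1:          # PN is Dn or a neighbour of Dn
            if pn != dn:
                path.append(dn)
            return path
        children = [pn ^ (1 << b) for b in range(m)]
        for cn in children:                       # step 3: a child adjacent to Dn?
            if bin(cn ^ dn).count('1') == 1:
                path.extend([cn, dn])
                return path
        candidates = [c for c in children if c not in visited]
        if not candidates:
            return None                           # step 5: fail
        pn = random.choice(candidates)            # step 4: replace PN by CN_j
        visited.add(pn)
        path.append(pn)
    return None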
In general, the path construction algorithm is similar to a routing algorithm in a
mobile ad-hoc network. In this paper, algorithm performance is defined from the
perspective of how accurately a path can be found. Using the algorithm performance
evaluation method proposed in [7], it is defined as follows.
Pb(THD, r, ZN) = |{(Sn, Dn) ∈ ZN × ZN : Sn →suc Dn}| / |{(Sn, Dn) ∈ ZN × ZN : Sn →tri Dn}|   (1)

Here, THD : the path construction algorithm using HD, r : (N − Nr)/N,
Sn →tri Dn : path searching trials, Sn →suc Dn : path construction successes.

6 Simulation Results for the Path Construction Algorithm

The main goal of this simulation is to verify the completeness of our proposed
algorithm in Figure 4. Thus, this simulation intends to analyze the success
probability of path construction in case the number of nodes in the mobile ad-hoc
network is smaller than N (= 2^m). Namely, we analyze the success possibility by
evaluating the performance of the proposed algorithm according to the node reducing
rate r. Figure 5 represents the success probability of constructing a path versus the
node reducing factor r (= {0.1, ..., 0.9}) according to N (= {128, 512, 2048, 8192}).
As seen in Figure 5, a path is constructed with about 97.6% success probability at a
99% confidence interval [2] in case of r = 0.5. If r < 0.5, the success probability
will be higher. As a result, the performance of our algorithm is outstanding.

Fig. 5. Success probability (Pb) vs. node reducing factor (r)

7 Conclusion

We used asymmetric and symmetric cryptography to propose a method of secure
communication between nodes over a mobile ad-hoc network. Besides, the method
based on Hamming Distance was proposed as the method of exchanging keys between
nodes so that one node manages only log2 N secret keys. The performance of our
method was evaluated through simulation. According to the result of the simulation,
the probability of this algorithm successfully finding a path was above 97.9% at a
99% confidence interval, in case r is smaller than 0.5. In addition, our method can
minimize the impact of the leakage of some secret keys on the entire ad-hoc network
as individual nodes own different secret keys. However, our method doesn't guarantee
end-to-end security directly between source and destination nodes. Accordingly,
additional research needs to be made into this area.

References
1. Hamming, R.: Coding and Information Theory. Prentice-Hall (1980)
2. Jain, R.: The Art of Computer Systems Performance Analysis. John Wiley & Sons,
Inc. (1991)
3. Herzberg, A., Jarecki, S., Krawczyk, H., Yung, M.: Proactive Secret Sharing, or: How
to Cope with Perpetual Leakage. In: Advances in Cryptology - Crypto '95 Proceedings,
LNCS Vol. 963 (1995)
4. Frankel, Y., Gemmell, P., MacKenzie, P.-D., Yung, M.: Optimal-Resilience Proactive
Public-Key Cryptosystems. In: IEEE Symp. on Foundations of Computer Science (1997)
5. Fokine, K.: Key Management in Ad Hoc Networks. LiTH-ISY-EX-3322-2002 (2002)
6. Stajano, F., Anderson, R.: The Resurrecting Duckling: Security Issues for Ad-hoc
Wireless Networks. In: Proc. Seventh Int'l Workshop on Security Protocols (1999)
7. Luo, H., Lu, S.: Ubiquitous and Robust Authentication Services for Ad Hoc Wireless
Networks. Technical Report 200030, UCLA Computer Science Department (Oct. 2000)
8. Kong, J., Zerfos, P., Luo, H., Lu, S., Zhang, L.: Providing Robust and Ubiquitous
Security Support for MANET. In: Proc. Ninth Int'l Conf. on Network Protocols (Nov. 2001)
9. Luo, H., Zerfos, P., Kong, J., Lu, S., Zhang, L.: Self-securing Ad Hoc Wireless
Networks. In: Seventh IEEE Symp. on Computers and Communications (2002)
10. Housley, R., Polk, W., Ford, W., Solo, D.: Internet X.509 Public Key Infrastructure
Certificate and CRL Profile. IETF RFC 3280 (April 2002)
11. Capkun, S., Buttyan, L., Hubaux, J.-P.: Self-Organized Public-Key Management for
Mobile Ad Hoc Networks. IEEE Trans. on Mobile Computing, Vol. 2, No. 1 (Jan./Mar. 2003)

An Integrated Approach for QoS-Aware
Multicast Tree Maintenance
Wu-Hong Tsai 1 and Yuan-Sun Chu 2
1 Transworld Institute of Technology, Yunlin, 640 Taiwan
edward@tit.edu.tw
2 Chung Cheng University, Chiayi, 621 Taiwan
chu@ee.ccu.edu.tw

Abstract. Maintaining the quality of a multicast tree in supporting dynamic
membership is critically important for the tree receivers. We propose an integrated
approach that combines core migration with local search to afford the receivers a
consistent-quality service. The main ideas behind this approach are to always migrate
the core to the tree's topological center within a limited overhead and to provide
multiple routes via local search for a new receiver to meet its quality requirements.
Our algorithm aims to meet a predefined upper-bound delay along the path to each
individual receiver and to satisfy the constraint on the inter-destination delay
variation among all receivers. Simulation results show that our proposed algorithm
performs well in terms of the constraints indicated above.
Keywords: integrated approach; core migration; receiver; multicast tree.

1 Introduction

Multicast communication sends the same messages to multiple destinations using
shared links and efficiently uses resources such as bandwidth or buffer space. A
multi-point connection adopted by multicast communication is a virtual topology
which is usually a tree, called a multicast tree. How to construct a multicast tree
that effectively supports multimedia applications such as video-conferencing or
distance learning has been widely researched. In particular, the research question of
how to build a multicast tree that minimizes the total cost of multi-type data
transmission over the network is critically important. The total cost is the aggregated
cost of all edges in the multicast tree, which reflects tree performance. Multicast
trees are expected not only to provide efficient usage of network resources, but also
to support quality requirements, in terms of a real-time delay bound and a bound
on the variation among the delays. For example, the participants in a video-conference
want all video frames to be received smoothly (delay jitter) and
simultaneously (inter-destination delay variation).
The tree shape is an important factor for delivering quality service and is
affected by the category of receiver groups. Based on the nature of group membership,
multicast receiver groups are classified into two categories: static groups
and dynamic groups. With a static group, the tree can be constructed in advance



to fulfill the quality requirements for the group. Yet, overall information about the
network topology and the group size is required prior to tree construction. For a
dynamic group, the tree is dynamically constructed as members join/leave the
group. Due to member movement, the service quality is often degraded, even
though the tree is suitable for the group at first.
Two types of multicast trees, named source-based trees and core-based trees,
have been considered to accommodate quality requirements. In a source-based
tree, when a member requests to receive a specific source's messages, it has to
graft a shortest path onto the multicast tree rooted at that specific source.
And when a member joins multiple sources, multiple trees with shortest paths
are grafted as well. The more the resources of the nodes are consumed, the more
complicated the routing tables of the nodes become. Instead of grafting onto multiple
trees, a core node and a single shared tree, defined to be the union of the member-to-core
shortest paths, is utilized in the core-based tree (CBT) [1]. In CBT, messages
destined for the group are first delivered to the core node, from which they
are distributed along the tree branches to the group members. The shared tree in CBT
reduces the usage of resources and the complexity of the routing tables. Unfortunately,
in CBT, the core location/selection problem becomes another issue which affects
the tree shape and the service quality requirements. Our research aims to provide a
heuristic solution to this important issue.
The rest of this paper is organized as follows. In Section 2, several related
studies are reviewed. In Section 3, we define our research problem and propose a core
selection/migration method that minimizes the inter-destination delay variation of the
multicast tree. Simulation results are presented in Section 4 and some concluding
remarks are given in Section 5.

2 Background

The main issues associated with shared tree construction are core selection, path
setup for new members, and tree maintenance for core migration and member
dynamics. The core location of a shared tree influences the performance of the
tree in terms of total tree cost and the delays experienced by individual receivers.
Optimal core selection is an NP-complete problem and a number of heuristics
have been proposed in previous research. However, most of these proposed
heuristics require a complete topology and detailed membership information prior
to core selection, without taking quality of service (QoS) requirements
into account [2].
Ideally, when a new member requests to join a multicast group, the multicast
routing protocol should have the ability to find a feasible branch that connects
the new member to the available multicast tree if one exists. CBT [1] grafts a new
member to the multicast tree along the unicast routing path from the core to the new
member. This is suitable for best-effort traffic; however, when QoS requirements
are considered, such a shortest-path tree may not be acceptable. A few QoS-aware
multicast routing protocols that use a multiple-branch searching method have
been proposed in Carlberg's and Crowcroft's research [3]. In their study, they
mainly propose multiple candidate paths for a new member to join the existing


tree, and use the GREEDY [4] approach in handling leaving members. In
GREEDY [4], if the node being removed is a leaf node, then the branch of the
tree supporting only that node is pruned; in the case of non-leaf nodes, no
action is taken.
However, in dynamic multicasting, continuous grafting and pruning degrades
the performance of the multicast tree over time [2]. The problem of updating
a multicast tree to accommodate the addition or deletion of nodes of the multicast
group can be modeled as the on-line Steiner problem. The on-line Steiner problem
in networks was first presented in [4] and is an NP-complete problem as well.
In the extreme case, the problem can be solved as a sequence of static multicast
problems by rebuilding the tree at each stage using a static Steiner heuristic.
However, this approach is too expensive in cost and is unsuitable for ongoing
real-time multicast sessions which cannot tolerate the disturbance caused by
excessive changes in the multicast tree after each addition or deletion.
Rouskas and Baldine [5] propose a source-based tree to meet the end-to-end
delay constraints and inter-destination delay variation constraints by extension
of path lengths. Wang et al. [6], based on the shared tree, proposed a distributed
algorithm in which the core does not need global knowledge of the network topology.
Instead, the core only maintains information about the agents (sub-tree roots), and
the information about all the members is distributed among the local agents.

3 Proposed Algorithm

The network is modeled as a simple, undirected, and connected graph G =
(V, E), where V is the set of nodes and E is the set of edges (or links). The nodes
represent the designated routers and the edges represent the network communication
links connecting the routers. An edge e ∈ E connecting two adjacent
nodes u and v will be denoted by (u, v). Each edge e has two non-negative
metrics associated with it: a cost function C(e) = C(u, v) represents the utilization
of the link, and a delay function D(e) = D(u, v) represents the delay that
a packet experiences in passing through that link, including switching, queuing,
transmission and propagation delays. A path P(v0, vn) = (v0, v1, ..., vn) in the
network has two associated characteristics:
C(P(v0, vn)) = Σ_{i=0..n−1} C(vi, vi+1)  and  D(P(v0, vn)) = Σ_{i=0..n−1} D(vi, vi+1).
We denote by VT ⊆ V the set of routers to which hosts that belong to a multicast
group are attached. For simplicity, we call the set VT a multicast group with each
router v ∈ VT as a group member.¹ A multicast tree T = (VT, ET), which is a
subgraph of G that spans all the nodes in VT and has no cycles, has an associated
cost defined as C(T) = Σ_{e ∈ ET} C(e).
Given a tree T and two nodes u and v belonging to this tree, we let
PT(u, v) denote the path between u and v in this tree. Then, the delay and cost
of this path are denoted by D(PT(u, v)) and C(PT(u, v)), respectively. Let nodes
vi and vj be the farthest and the nearest nodes away from u, respectively, where u,
vi, and vj are all in VT. We denote by δ(u) = D(PT(u, vi)) − D(PT(u, vj)) the
inter-destination delay variation of the node u.

¹ Actually the multicast group should be the set of hosts that are directly attached to
routers in VT.
Let R = {r1, r2, ..., rk} be a sequence of requests, where ri is either adding
or removing a destination node to or from the multicast group. Let VTi be the
set of nodes in the multicast group after request ri has been completely handled.
In response to ri, we also let Ti and ui ∈ VTi be the multicast tree spanning VTi
and the core of the tree Ti, respectively.
Two parameters are defined to characterize the quality of the tree as
perceived by the application performing the multicast. These parameters relate the
end-to-end delays along individual core-destination paths to the desired level of
QoS.
Delay tolerance, Δ: Parameter Δ represents an upper bound on the acceptable
end-to-end delay along any path from the core to a destination node.
This parameter reflects the fact that the information carried by multicast
packets becomes stale Δ time units after its transmission at the core.
Inter-destination delay variation tolerance, δ(u): This is the maximum allowed
difference in end-to-end delay between the paths from the node u
(core) to its farthest and nearest nodes which are in the set VT. In essence,
this parameter is used to ensure that no one is left behind and that none is
too far ahead in receiving the same data among the various receivers.
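
As an illustration of how these two quantities can be checked for a candidate core, the
following Python sketch computes the member delays D(PT(u, v)) and their variation on a
tree given as an adjacency dictionary with per-link delays; the data layout and the
function name are our assumptions.

from collections import deque

def core_delays(tree_adj, link_delay, core, members):
    # Accumulate D(P_T(core, v)) for every member v by a traversal of the
    # tree, then report the maximum delay and the variation (max - min).
    # `tree_adj` is the tree's adjacency dict, `link_delay[(u, v)]` the
    # per-link delay (stored for both orientations).
    dist, queue = {core: 0.0}, deque([core])
    while queue:
        u = queue.popleft()
        for w in tree_adj[u]:
            if w not in dist:
                dist[w] = dist[u] + link_delay[(u, w)]
                queue.append(w)
    member_delays = [dist[v] for v in members]
    return max(member_delays), max(member_delays) - min(member_delays)

# A candidate core is acceptable if the maximum delay stays within the delay
# tolerance and the returned variation is the smallest among the candidates.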
Problem 1. Given a network G = (V, E), a delay tolerance Δ, a shared tree
T0, rooted at u0, which is a subgraph of G, and a request sequence R =
{r1, r2, ..., rk}. With respect to each request ri being made, the sequence of
multicast trees {T1, T2, ..., Tk} is constructed, in which Ti spans VTi. After tree Ti
is constructed, find a node ui ∈ VTi that satisfies the following conditions:
1. Tree Ti has minimum inter-destination delay variation. Formally,
δ(ui) = min_{vj ∈ VTi} δ(vj).
2. The delay of every path PTi(ui, v) fulfills the delay tolerance Δ. That is,
D(PTi(ui, v)) ≤ Δ, for all v ∈ VTi.
3. After the above two conditions have been satisfied, the tree Ti has minimum
cost.
This problem is known to be NP-complete since it reduces to the standard
Steiner tree problem [5,7].
Please refer to Appendix A for a detailed description of our approach; we
briefly describe it in the rest of this section. We assume that a shared tree
T, rooted at node u (the core), has been constructed at the beginning. The nodes
that are already on-tree nodes maintain their own table TD. The fields of TD are
Branch, indicating the branch connecting to a neighboring node, and
AccumulatedDelay, indicating the accumulated delay of the farthest leaf node
along Branch. We also assume that each node in the network knows the delays that
messages experience in passing through adjacent branches to neighboring nodes. When a
GRAFT message is sent to an on-tree node vj, it calculates the accumulated delay along
the joining path, and decides whether to send an UPDATE message to its upstream
node or not.


The PRUNE message is only sent by leaf nodes, and is terminated at a node
which has another branch in its TD or is a member itself. When node vj receives
a PRUNE message, it examines its own table TD to verify whether the maximum
delay has changed; if so, an UPDATE message is consequently sent to its upstream
node. On the other hand, a PRUNE message is sent to the upstream node of vj if
there is no remaining record in TD.
According to a newly arrived UPDATE message, a node vj may update the
corresponding record in its TD. If the maximal AccumulatedDelay has changed and
vj is not the core, then an UPDATE message is again sent to its upstream node. The
above process continues until the core u receives an UPDATE message. The
core calculates the difference between the first and second maximal accumulated
delays in TD, and it decides whether the migration process should proceed or not. If
migration is necessary to meet the quality requirements, the core u sends a MIGRATE
message along the branch toward the farthest node. However, each node which
receives a MIGRATE message is just a candidate core. The candidate also examines
the next node along the maximal delay path until no better result can be obtained.
Consequently, the candidate core becomes the new core and sends a NOTIFY
message to all the source nodes of the multicast tree T.
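
The core's migration test can be written compactly; in this illustrative Python
fragment the accumulated delays come from the TD table and the rule follows the
description above and Appendix A (names are ours).

def should_migrate(accumulated_delays, branch_delay_to_farthest):
    # Migration test performed at the core: move toward the farthest member
    # when half the gap between the largest and second-largest accumulated
    # delays exceeds the delay of the branch toward that member.
    delays = sorted(accumulated_delays, reverse=True)
    max_delay = delays[0]
    sec_max = delays[1] if len(delays) > 1 else delays[0]
    return (max_delay - sec_max) / 2 > branch_delay_to_farthest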

4 Simulation Model and Results

In generating random graphs, we adopt the method used by Waxman [4], where
vertices are placed randomly in a rectangular coordinate grid by generating uniformly
distributed values for their x and y coordinates. The probability function
P(u, v) = β · e^(−d(u, v)/(2nα)) and a random number 0 ≤ r < 1 are used to decide
whether an edge exists between nodes u and v or not, where d(u, v) denotes
their Euclidean distance, α and β are tunable parameters, and n is the number
of nodes in the graph. Increasing α increases the number of connections between
far-off nodes and increasing β increases the degree of each node.
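
A minimal Python sketch of this graph generator, using the probability function as
reconstructed above (the grid size and the function name are our assumptions):

import math
import random

def waxman_graph(n, alpha=0.2, beta=0.2, grid=100.0):
    # Nodes get uniform (x, y) coordinates in a rectangular grid; each pair
    # (u, v) becomes an edge with probability beta * exp(-d(u, v)/(2*n*alpha)).
    xy = [(random.uniform(0, grid), random.uniform(0, grid)) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(xy[u], xy[v])
            if random.random() < beta * math.exp(-d / (2 * n * alpha)):
                edges.append((u, v))
    return xy, edges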
To generate addition or deletion requests for our simulation, we have used the
probabilistic model employed in [4]. In a network of size N, let k represent the
number of nodes in the multicast tree. Then, the probability of an add-request
is given by the expression Prob(add) = γ(N − k) / (γ(N − k) + (1 − γ)k), where γ is
a constant in the range (0, 1]. The value of γ determines the equilibrium point at
which the probability of an add or delete request is equally likely. When γ = k/N, the
above expression takes a value of 0.5. The probability that a request will be a
delete-request is given by Prob(del) = 1 − Prob(add).
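
A small Python sketch of this request generator, with the constant written as gamma as
reconstructed above:

import random

def next_request(N, k, gamma=0.3):
    # Prob(add) = gamma*(N - k) / (gamma*(N - k) + (1 - gamma)*k); k is the
    # current multicast group size.
    if k == 0:
        return 'add'
    p_add = gamma * (N - k) / (gamma * (N - k) + (1 - gamma) * k)
    return 'add' if random.random() < p_add else 'delete'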
Our simulation studies were conducted on a set of 100 random networks. Values of
α = 0.2 and β = 0.2 were used to generate networks with average degree ranging
between 3 and 4. Each random network received a total of 30 update requests. These
requests were generated based on the probabilistic model using a value of γ = 0.3.
Two performance measures, namely maximum inter-destination delay variation and cost
of the multicast tree, are used to evaluate the performance of the algorithms. The
performance of the algorithms is also evaluated over two parameters: the number of
network nodes and the multicast group size.

[Fig. 1 shows two panels plotting two quantities, "Ratio of Total Cost of Tree" and
"Ratio of Delay Variation": panel (a) for different numbers of nodes (20 to 120, group
size 30% of nodes) and panel (b) for different group sizes (5% to 30% of 100 nodes).]

Fig. 1. Performance comparisons for inter-destination delay variation

Fig. 1(a) shows the results for different numbers of network nodes, ranging
from 20 to 120 in steps of 10, while the group size is kept constant at 30% of the
network nodes. Fig. 1(b) shows the results for different group sizes, ranging
from 5 to 30 in steps of 5, while the network size is kept constant at 100. It is easy
to see that our algorithm is able to generate solutions with smaller inter-destination
delay variation than the SPT algorithm.

5 Conclusions

In this paper, we provide a solution to multicast tree construction with the
guarantee that the end-to-end delays from the core to the destination nodes
and the variation among these delays are within a given bound. Our solution is
able to minimize the total tree cost as well. The problem of constructing such
trees is NP-complete. In comparison with the SPT algorithm, our heuristic exhibits
good average-case behavior, is easy to implement, and does not need global network
information during tree construction. Our heuristic does not take end-to-end delay
bounds into account since satisfaction of the inter-destination delay variation is
the most important consideration in our strategy. However, our heuristic is flexible
and can be modified to cope with the end-to-end delay bound requirement in many ways,
such as the method proposed in Rouskas's and Baldine's research [5]. The analysis of
the data loss rate during the core migration process should be explored more in
future research.

References
1. Ballardie, T., Francis, P., Crowcroft, J.: Core-based trees (CBT): An architecture
for scalable inter-domain multicast routing. ACM SIGCOMM Computer Communication
Review 23 (1993) 85–95
2. Donahoo, M.J., Zegura, E.W.: Core migration for dynamic multicast routing. In:
Proceedings of the Fifth International Conference on Computer Communications
and Networks (ICCCN '96). (1996) 92–98
3. Carlberg, K., Crowcroft, J.: Building shared trees using a one-to-many joining
mechanism. ACM SIGCOMM Computer Communication Review 27 (1997) 5–11
4. Waxman, B.M.: Routing of multipoint connections. IEEE Journal on Selected Areas
in Communications 6 (1988) 1617–1622
5. Rouskas, G.N., Baldine, I.: Multicast routing with end-to-end delay and delay
variation constraints. IEEE Journal on Selected Areas in Communications 15 (1997)
346–356
6. Wang, T.Y., Wuu, L.C., Huang, S.T.: A scalable core migration protocol for dynamic
multicast trees. Journal of Information Science and Engineering 19 (2003) 479–501
7. Kompella, V., Pasquale, J., Polyzos, G.: Multicasting for Multimedia Applications.
In: Proc. IEEE INFOCOM '92, Florence, Italy (1992) 2078–2085

Appendix A
main()
  waiting for a control message r
  switch r.type
    case GRAFT: join()
    case PRUNE: leave()
    case UPDATE: update()
    case MIGRATE: migrate()
  end r.type
end main

join()
  NewAccumulatedDelay ← r.AccumulatedDelay + DownstreamDelay
  if NewAccumulatedDelay > max(TD.AccumulatedDelay) then
    r.type ← UPDATE
    r.AccumulatedDelay ← NewAccumulatedDelay
    send(upstream(self), r)
  end if
  TD.Branch ← TD.Branch ∪ r.Branch
  TD.Branch(r).AccumulatedDelay ← NewAccumulatedDelay
end join

leave()
  NewAccumulatedDelay ← r.AccumulatedDelay + DownstreamDelay
  if NewAccumulatedDelay == max(TD.AccumulatedDelay) then
    TD.Branch ← TD.Branch − r.Branch
    if TD.Branch ≠ ∅ then
      r.type ← UPDATE
      r.AccumulatedDelay ← max(TD.AccumulatedDelay)
      send(upstream(self), r)
    else
      r.type ← PRUNE
      r.AccumulatedDelay ← NewAccumulatedDelay
      send(upstream(self), r)
    end if
  end if
end leave

update()
  NewAccumulatedDelay ← r.AccumulatedDelay + DownstreamDelay
  OldMaxDelay ← max(TD.AccumulatedDelay)
  if NewAccumulatedDelay > TD.Branch(r).AccumulatedDelay then
    TD.Branch(r).AccumulatedDelay ← NewAccumulatedDelay
  end if
  if OldMaxDelay < max(TD.AccumulatedDelay) then
    if self ≠ core then
      r.type ← UPDATE
      r.AccumulatedDelay ← max(TD.AccumulatedDelay)
      send(upstream(self), r)
    else
      MaxDelay ← max(TD.AccumulatedDelay)
      SecMaxDelay ← second maximum in TD.AccumulatedDelay
      BranchDelay ← the delay of the branch toward the farthest member
      if (MaxDelay − SecMaxDelay) / 2 > BranchDelay then
        r.type ← MIGRATE
        r.AccumulatedDelay ← SecMaxDelay
        send(maxdownstream(self), r)
      end if
    end if
  end if
end update

migrate()
  NewAccumulatedDelay ← r.AccumulatedDelay + UpstreamDelay
  TD.Branch(r).AccumulatedDelay ← NewAccumulatedDelay
  MaxDelay ← max(TD.AccumulatedDelay)
  SecMaxDelay ← second maximum in TD.AccumulatedDelay
  BranchDelay ← the delay of the branch toward the farthest member
  if (MaxDelay − SecMaxDelay) / 2 > BranchDelay then
    r.type ← MIGRATE
    r.AccumulatedDelay ← SecMaxDelay
    send(maxdownstream(self), r)
  else
    r.type ← NOTIFY
    send(source(T), r)
  end if
end migrate

A Categorial Context with Default Reasoning Approach
to Heterogeneous Ontology Integration
Ruliang Xiao 1,2 and Shengqun Tang 1
1 State Key Lab of Software Engineering, Wuhan University, Wuhan 430072, China
2 Department of Information & Management, Hunan Finance & Economics College,
Changsha 410205, China
xiaoruliang@163.com

Abstract. With context unceasingly changing, the available information among
ontologies of different information sources is often heterogeneous. It is crucial
to develop a scalable and efficient ontological formalism. This paper presents a
categorial context-based formalism with default reasoning, in which context
information is extensively considered; from the category theory point of view,
we incorporate default reasoning and make a categorial context extension to
description logics (DL) for heterogeneous ontology integration. The core part
of the formalism is a categorial context based on the DL, which captures and
explicitly represents the information about contexts, and supports
nonmonotonic default reasoning. Based on the formal framework, a prototype is
developed on top of JESS, RACER and the Protégé OWL plugin, which can
integrate different ontologies from multiple distributed sources with context
information using default reasoning.
Keywords: ontology integration, categorial context, default reasoning,
description logics.

1 Introduction
Generally speaking, ontologies are shared models of a domain that encode a view
which is common to a set of different parties, while contexts are local, not shared,
models that encode a party's subjective view of a domain. While the environment
keeps unceasingly changing, the semantic information in a constant ontology cannot
reflect the dynamic context. Context-awareness is one of the fundamental
requirements for fitting the semantics of the unceasingly changing context. With the
increased availability of large and specialized online ontologies, the issues about the
integration of independently deployed ontologies have become even more
prominent. Hence, it is crucial to develop an efficient ontological formalism for
overcoming the semantic heterogeneity of multiple information sources so as to
make the public (shared) and local views harmonious. In the last few years, a lot of
effort has been devoted to this problem [3,4,5,6,7]. In our work, we concentrate on
the integration of heterogeneous metadata information in Semantic Web resources. A
novel categorial context with default reasoning approach is presented here that is
capable of


representing and efficiently dealing with these challenges. The organization of the
paper is as follows: in Section 2, from the category theory and description logic point
of view, we define several conceptions related to context. In Section 3, a categorial
context formalism, CContext-SHOIQ(D+) DL, is proposed, which is an extension of the
description logic (DL). Section 4 constructs nonmonotonic default reasoning based on
CContext-SHOIQ(D+) DL. In Section 5, we describe a prototype system for integrating
context information from multiple distributed sources using default reasoning based
on JESS, RACER and the Protégé OWL plugin. At the end, we discuss related work
and make a conclusion in Section 6.

2 Categorial Context Definition


Category theory is a mathematical theory, which studies objects and morphisms
between them, and may summarize all knowledge and knowledge processing from a
structural angle [8]. The concept of a category embodies some abstract properties of
the composition operator for functions that reasonably must be guaranteed.
A category has systematic characteristics; on the contrary, the SHOIQ(D+) DL,
which is equivalent to the OWL-DL fragment of OWL (Web Ontology Language) [1,2],
is dispersed. A context is structured knowledge that possesses systematic
characteristics. Just as pointed out in the literature [8], there exists a close
correlation between categories and knowledge bases. Hence, we will survey the context
based on the SHOIQ(D+) DL from the point of view of category theory.
Definition 1 (Context). A context is a kind of category, which is given by a binary
form <C, R>, with C a collection of conceptions (objects) in the SHOIQ(D+) DL,
denoted by A, B, C, ..., and R a collection of relations (roles, arrows) in the
SHOIQ(D+) DL, denoted by R, S, .... From the ontological point of view, a categorial
context is itself a mini ontology.
For example, a conception C prefixed by a context (written "context.C") means that C
depends on that context. Supposing the context represents the Cold-War era and C denotes
"super country", the contextualized C takes a concrete meaning: USA and USSR.
Definition 2 (Subcontext). A context U is a subcontext of a context V if
U0 ⊆ V0, where U0, V0 are the conception sets of U and V, respectively;
U1 ⊆ V1, where U1, V1 are the relation (or role) sets of U and V, respectively;
composition and identities in U coincide with those of V.
Definition 3 (Functor). Given two contexts U and V, a functor F: U → V consists of two
operations F0: U0 → V0 and F1: U1 → V1, where U0, U1 are the conception set and role
set of the context U, respectively, and V0, V1 are the conception set and role set of
the context V, respectively, such that:
Provided that f: A → B occurs in the context U, F1(f): F0(A) → F0(B) must be in
the context V.
For every object A in U, F1(idA) = id F0(A).
If g∘f is defined in the context U, then F1(g)∘F1(f) is defined in the context V, and
F1(g∘f) = F1(g)∘F1(f).
Definition 4 (Context Sequence). Provided that there is a series of contexts that
constructs a concrete environment, we call this environment a Context Sequence. It can
be denoted as n.···.2.1, where the j-th component (j = 1, 2, ..., n) is a context. This
kind of context sequence is also called Context-Aware Scenes.
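
Although the formalism is logical rather than computational, the finite case of
Definition 3 can be illustrated with a small Python check; the representation of a
context as an (objects, arrows) pair and the function name are our assumptions, and
only the arrow condition is verified.

def is_functor(obj_map, arrow_map, ctx_u, ctx_v):
    # A context is modelled here as a pair (objects, arrows) with arrows
    # given as name -> (source, target).  The check verifies the first
    # condition of Definition 3: every arrow f: A -> B of U is sent to an
    # arrow F1(f): F0(A) -> F0(B) of V (the identity and composition
    # conditions are omitted for brevity).
    objs_u, arrows_u = ctx_u
    objs_v, arrows_v = ctx_v
    if not all(obj_map[a] in objs_v for a in objs_u):
        return False
    for f, (src, dst) in arrows_u.items():
        g = arrow_map.get(f)
        if g not in arrows_v:
            return False
        if arrows_v[g] != (obj_map[src], obj_map[dst]):
            return False
    return True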

3 Our Formal Framework: CContext-SHOIQ(D+) DL


Based on these definitions of categorial context (CContext), we take the notion of
context as the core mechanism for representing semantics, and combining with the
SHOIQ(D+) DL of OWL, to get an extensive CContext-SHOIQ(D+) DL. We use to
denote a context.
Definition 6. Suppose that is a set of contextual names, a context name within ,
a set of conceptions, a set of roles, . means whole conception set depending
on the context . Similarly, roles within context denotes ., holds a subset of +
with respected to transitive roles, +m, context holds a role name set {R|R}, in order to avoid operating R- -, we add a new kind of role inv(), and a new
transitivable function tran(). Tran(R)=true if and only if R+, or inv(R) +. We say
a role R, within a context , is a simple if the R is not transitive, and it is not
compositive by a transitive role R and others, otherwise R is complex.
General inclusion axioms about role are in possession of as following form:
(1) .R m.S, where R,S.;
(2) 1.R m2. S, where 1,2, 1. R, 2.S. and 1 m2;
(3) 1.R m2. R, where 1, 2, 1.R, 2. R . and 1 m2;
After adding the context conception into SHOIQ(D+), the conception set within
CContext-SHOIQ(D+) DL has the following property: Σ.<C, R> = <Σ.C, Σ.R>.
O stands for nominals (classes whose extension is a single individual) within
SHOIQ(D+) DL, and a single individual a can be denoted as a conception by the
operator {a}. Therefore, every conception within SHOIQ(D+) DL must be a
conception within CContext-SHOIQ(D+) DL:
If Σ.C, Σ.D ∈ C, Σ.R ∈ R, S is a simple role depending on a context, and n is a
non-negative integer, then Σ.C ⊔ Σ.D, Σ1.C ⊔ Σ2.D, Σ.C ⊓ Σ.D, Σ1.C ⊓ Σ2.D, ¬Σ1.C,
∃Σ.R.C, ∀Σ.R.C, ≤n S.C and ≥n S.C are all conceptions.
General inclusion axioms about conceptions have the following forms:
(1) Σ.C ⊑ Σ.D, where Σ ∈ Γ is a categorial context and C, D ∈ Σ.C;
(2) Σ1.C ⊑ Σ2.D, where Σ1, Σ2 ∈ Γ, Σ1.C, Σ2.D ∈ C and Σ1 ⊑ Σ2;
(3) Σ1.C ⊑ Σ2.⊤, where Σ1, Σ2 ∈ Γ, Σ1.C ∈ C and Σ1 ⊑ Σ2.

From the above general inclusion axioms, we see that a TBox is still a set of
general inclusion axioms.
Suppose that I = {a, b, c, ...} is a set of individuals; an assertion must be of the form
a: Σ.C or (a, b): Σ.R, where a, b ∈ I, C ∈ Σ.C, R ∈ Σ.R. An ABox consists of a finite
number of assertions.


Definition 7. A CContext-SHOIQ(D+) DL interpretation is a pair I = (Δ^I, ·^I), where
Δ^I contains a nonempty set of objects (the resources) of I and is called the domain,
and the function (·)^I maps every conception to a subset of Δ^I and every role to a
subset of Δ^I × Δ^I such that, for conceptions C, D, roles R, S, and a non-negative
integer n (#M denotes the cardinality of M):
(1) (Σ.R)^I = (Σ^I.R^I)+, for R ∈ R+
(2) (Σ.R−)^I = {<Σ^I.x, Σ^I.y> | <y, x> ∈ Σ^I.R^I}
(3) (Σ.C ⊓ Σ.D)^I = Σ^I.C^I ∩ Σ^I.D^I
(4) (Σ1.C ⊓ Σ2.D)^I = Σ1^I.C^I ∩ Σ2^I.D^I
(5) (Σ.C ⊔ Σ.D)^I = Σ^I.C^I ∪ Σ^I.D^I
(6) (Σ1.C ⊔ Σ2.D)^I = Σ1^I.C^I ∪ Σ2^I.D^I
(7) (¬Σ1.C)^I = Σ1^I \ Σ1^I.C^I
(8) (∀R.C)^I = {x | ∀y. <x, y> ∈ Σ^I.R^I implies y ∈ Σ^I.C^I}
(9) (∃R.C)^I = {x | ∃y. <x, y> ∈ Σ^I.R^I and y ∈ Σ^I.C^I}
(10) (≤n R.C)^I = {x | #{y | <x, y> ∈ Σ^I.R^I and y ∈ Σ^I.C^I} ≤ n}
(11) (≥n R.C)^I = {x | #{y | <x, y> ∈ Σ^I.R^I and y ∈ Σ^I.C^I} ≥ n}
An interpretation I satisfies a terminology set (TBox) T if and only if Σ1^I.C^I ⊆
Σ2^I.D^I for every general inclusion axiom Σ1.C ⊑ Σ2.D; we then call I a model of T,
written I ⊨ T. For a context Σ, we define Σ^I ⊆ Δ^I, Σ^I.C^I = Σ^I ∩ C^I, and
Σ^I.R^I = Σ^I ∩ R^I.
As far as the ABox is concerned, a single individual a ∈ I is interpreted as a^I ∈ Δ^I,
and the interpretation satisfies assertions as follows:
(1) a: Σ.C if and only if a^I ∈ Σ^I.C^I;
(2) (a, b): Σ.R if and only if (a^I, b^I) ∈ Σ^I.R^I;
(3) a ≠ b if and only if a^I ≠ b^I.
Definition 8. A global interpretation (global-I) for a context sequence Σn.···.Σ2.Σ1 is
(Σn.···.Σ2.Σ1)^global-I = Σn^I ∩ ··· ∩ Σ2^I ∩ Σ1^I; on the other hand, a local
interpretation (local-I) for it is (Σn.···.Σ2.Σ1)^local-I = Σn^I ∪ ··· ∪ Σ2^I ∪ Σ1^I, where
each Σj (j=1,2,...,n) is a context and Σj ∈ Γ.
Definition 9. An interpretation for a context migration (Σn.···.Σ2.Σ1)⇝Σn+1 includes
the global interpretation ((Σn.···.Σ2.Σ1)⇝Σn+1)^global-I and the local interpretation
((Σn.···.Σ2.Σ1)⇝Σn+1)^local-I.
According to the descriptions above, subsumption and satisfiability exist in our
CContext-SHOIQ(D+) DL just as in SHOIQ(D+) DL, and tableau algorithms [1]
remain an important tool for testing conception satisfiability in the new
CContext-SHOIQ(D+) DL. When a context migration takes place, we focus on
realizing default reasoning in the Rules Layer, as described in the next section.

4 Syncretizing Default Reasoning into CContext-SHOIQ(D+)


Currently, the Ontology Layer has reached a certain level of maturity with W3C
recommendations such as OWL-DL Web Ontology Language, and most interest
focuses on the Rules Layer and its integration with the Ontology Layer. Reasoning
with ontology languages will be important in the Semantic Web if applications are to
exploit the heterogeneous ontology integration. We will discuss nonmonotonic default
reasoning[11] for realizing the Rules Layer to facilitate such integration under our
formalism.


4.1 Related Default Reasoning


This section briefly recalls default reasoning. Default logic [11] is one of the
oldest and most studied nonmonotonic logics. It is an extension of classical logic with
default rules. A default theory is a pair (U, V), where U is a set of first-order formulae
and V is a set of defaults of the form
(α : β1, ..., βn) / γ
where α, γ, and βi (i=1,2,...,n) are classical first-order formulae; α is called the
prerequisite, γ is called the consequent of the default, and each βi is a consistency
condition or justification. The default has the following intuitive meaning: if α is
inferable and, for each i (i=1,2,...,n), ¬βi is not inferable, then infer γ.
4.2 Default Reasoning Approach to Categorial Context Functor
Experience in building practical applications has revealed several semantic and
computational shortcomings. On the one hand, SHOIQ(D+) DL, as one of the DLs, is
a fragment of first-order logic. Its semantics is based on the Open World Assumption
(OWA) of classical logic, while default rules are based on a Closed World Assumption
(CWA) imposed by the different semantics of logic programming over categorial
context functors. How can the OWA of SHOIQ(D+) and the CWA of rules be
integrated properly, i.e., how can monotonic and nonmonotonic logical subsystems be
merged from a semantic viewpoint? On the other hand, decidability and complexity of
reasoning are two crucial issues in systems combining DL knowledge bases and
default rules. Since reasoning in DL knowledge bases is decidable and default
reasoning is decidable, our combination syncretizes default reasoning into
CContext-SHOIQ(D+) while preserving decidability and balancing between OWA
and CWA.
Definition 10. Categorial Context Default Reasoning is constructed in the
formalism CContext-SHOIQ(D+) as above. Provided that Γ is a set of context names,
Σ, Φi and Ψi are categorial context names within Γ, C is a set of conceptions, R is a
set of roles, and Σ.X, Φi.Yi, and Ψi.Xi (i=1,2,...,n) are conceptions or roles in C and R
of CContext-SHOIQ(D+), Categorial Context Default Reasoning satisfies a rule of
the following form:
Σ.X ← Φ1.Y1, ..., Φn.Yn : Ψ1.X1, ..., Ψn.Xn
where Φi.Yi is called the prerequisite, Σ.X is called the consequent of the default, and
Ψi.Xi is a consistency condition or justification.
The default has the following intuitive meaning: if every Φi.Yi (i=1,2,...,n) is
derivable and no Ψi.Xi is derivable, then derive Σ.X. That is, if there exists an
interpretation I such that I satisfies Φ1.Y1(x), ..., Φn.Yn(x) (x is a variable) and does not
satisfy any Ψi.Xi(x) (1≤i≤n), then I satisfies Σ.X(x); otherwise, if I satisfies every
Ψi.Xi(x) (1≤i≤n), then I does not satisfy Σ.X(x).
For example, provided that there are three scenes, i.e., three contexts U, V and W,
and the available information comes from the other context names U and V, we take
the current context to be W. To state that a person can speak unless s/he is a dummy,
we can use the default rule


W.CanSpeak(x) ← U.Person(x) : ¬V.Dummy(x).


If there is an individual named John in the domain of individuals, then the closed
default rule is W.CanSpeak(John) ← U.Person(John) : not V.Dummy(John).
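To illustrate how such a default can be executed in the Rules Layer, the sketch below drives the CanSpeak example from Java through the Jess engine; the fact templates Person, Dummy and CanSpeak and the rule text are assumptions made for this example, not part of the prototype, and only basic jess.Rete calls are used.

    import jess.JessException;
    import jess.Rete;

    public class CanSpeakDefault {
        public static void main(String[] args) throws JessException {
            Rete engine = new Rete();
            // Hypothetical fact templates standing in for U.Person, V.Dummy and W.CanSpeak.
            engine.executeCommand("(deftemplate Person (slot name))");
            engine.executeCommand("(deftemplate Dummy (slot name))");
            engine.executeCommand("(deftemplate CanSpeak (slot name))");
            // Default rule: conclude CanSpeak(x) from Person(x) unless Dummy(x) is derivable.
            engine.executeCommand(
                "(defrule can-speak-default" +
                "  (Person (name ?x))" +
                "  (not (Dummy (name ?x)))" +
                "  => (assert (CanSpeak (name ?x))))");
            engine.executeCommand("(assert (Person (name John)))");
            engine.run();   // fires the rule; (CanSpeak (name John)) is asserted
        }
    }

If (Dummy (name John)) were also asserted before running, the negation-as-failure condition would block the rule, mirroring the closed-world reading of the justification.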
Integrating Categorial Context-SHOIQ(D+) DL with nonmonotonic default
reasoning simply means the possibility of writing a hybrid knowledge base containing
a TBox, an ABox, and a set of default reasoning rules. This hybrid syncretizing has
two main directions: loose syncretizing and strict syncretizing. As far as loose
syncretizing is concerned, because the context changes unceasingly, when agents
migrate between context-aware scenes there must be a context functor transition;
hence loose syncretizing usually takes place in the local interpretation for context
migration (see Section 3). Strict syncretizing, in turn, usually takes place in the global
interpretation for context migration (see Section 3).

5 A Context-Aware Prototype System


Based on our CContext-SHOIQ(D+) DL formal framework, we have developed a
prototype system based on the Protégé OWL plugin, the RACER reasoner and JESS
[12,13,14,15] for integrating ontology information as well as default information
from multiple different information sources based on context sequences. Protégé is
an open platform for ontology modeling and knowledge acquisition; the Protégé
OWL plugin supports editing categorial context ontologies and can be connected to
the RACER reasoner. The RACER system implements a highly optimized tableau
calculus for the very expressive SHOIQ(D+) DL and also provides facilities for
satisfiability, subsumption, and instantiation reasoning. JESS, a Java Expert System
Shell, can be used as a default reasoning engine.
[Figure 1 depicts users 1..n, each with an ontology knowledge base, connected
through the Internet to a server that maintains its own ontology knowledge base.]
Fig. 1. An architecture of prototype with different information sources

Figure 1 shows the architecture of the case system. The whole distributed system is
constructed from multiple local information sources, each of which describes and
classifies every user's ontology knowledge base. The interactive process of the
distributed users embodies the context-aware property, which is determined by the
ontology knowledge base of each user domain.
The overall software architecture process in the server is shown in Fig. 2. As far as
reasoning with ontologies and default reasoning rules is concerned, it consists of the
following steps (a rough sketch of how they can be chained follows the list):
(1) Default reasoning rules are edited with the Protégé OWL plugin.
(2) The user OWL ontology is loaded into the Protégé OWL plugin.
(3) The categorial context is extracted by the Categorial Context Manager, assisted
by the Protégé OWL plugin.
(4) After the user OWL ontology passes consistency checking, it is loaded into
RACER.
(5) Concept and role instances of the RACER ABox are translated into JESS facts,
assisted by the Categorial Context Manager, and then processed by the
consistency checker.
(6) The Default Reasoning Rules Manager helps convert the rules OWL ontology
into JESS facts.
(7) JESS facts are changed into RACER assertions.
(8) According to the application requirements, a new ontology is built.
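A possible Java rendering of how these steps could be chained is given below; every interface and method name (OntologyLoader, CategorialContextManager, RacerReasoner, JessBridge, and their methods) is a hypothetical placeholder for the corresponding component in Fig. 2, not an actual Protégé, RACER or JESS API.

    // Hypothetical component interfaces mirroring steps (1)-(8) above.
    interface OntologyLoader { Object loadUserOntology(String path); Object loadDefaultRules(String path); }
    interface CategorialContextManager { Object extractContexts(Object ontology); }
    interface ConsistencyChecker { boolean isConsistent(Object ontology); }
    interface RacerReasoner { void load(Object ontology); Object aboxInstances(); void assertFacts(Object facts); }
    interface JessBridge { Object toFacts(Object aboxInstances, Object contexts); Object applyDefaults(Object facts, Object rules); }

    class IntegrationPipeline {
        Object integrate(OntologyLoader loader, CategorialContextManager ccm,
                         ConsistencyChecker checker, RacerReasoner racer, JessBridge jess,
                         String ontologyPath, String rulesPath) {
            Object rules = loader.loadDefaultRules(rulesPath);            // step (1)
            Object ontology = loader.loadUserOntology(ontologyPath);      // step (2)
            Object contexts = ccm.extractContexts(ontology);              // step (3)
            if (!checker.isConsistent(ontology))
                throw new IllegalStateException("inconsistent user ontology");
            racer.load(ontology);                                         // step (4)
            Object facts = jess.toFacts(racer.aboxInstances(), contexts); // steps (5)-(6)
            Object derived = jess.applyDefaults(facts, rules);
            racer.assertFacts(derived);                                   // step (7)
            return racer;                                                 // step (8): basis for the new ontology
        }
    }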
[Figure 2 components, from bottom to top: user ontologies 1..n, the Protégé OWL
plugin, the Categorial Context Manager, the Consistency Checker, the Default
Reasoning Rules Manager, the Exceptional Handler, RACER, JESS, and the
integrated ontology based on the multiple different ontology sources.]

Fig. 2. Software architecture of the different ontology integration in the server

This context-aware prototype system is combined with default reasoning, and all of
this work is ongoing; we will report other related work in a follow-up paper.

6 Related Works and Conclusion


Several researchers have done much work in the field of ontology integration. Their
work can be classified into three classes: upper ontology, C-OWL mappings, and
E-connections. Wang takes advantage of an upper ontology for modeling context [9].
Bouquet, in reference [3], proposes that a local ontology in C-OWL is considered as a
context; mappings are made of bridge rules that express semantic relations and can be
used for representing modular ontologies and combining different viewpoints. In the
literature [5], Kutz presents an extension of DL (E-connections), and Grau expands
Kutz's method to OWL-DL in the literature [4]. These approaches are mainly
technological, and they provide no context reasoning rules beyond the tableaux
algorithm, whereas our work focuses mainly on the meta-level methodology.
In this paper, CContext-SHOIQ(D+) DL, a novel categorial context formalism with
default reasoning rules, has been proposed. The aim is to contribute to a methodology
of ontology integration. The practice with our prototype system shows that this
research is valuable. This is only the beginning of exploring the context of an
evolving ontology, and many challenges remain for our further efforts.

References
1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The
Description Logic Handbook: Theory, Implementation, and Applications. Cambridge
University Press (2003)
2. Horrocks, I.: Description Logics in Ontology Applications. In: Beckert, B. (ed.): Automated
Reasoning with Analytic Tableaux and Related Methods. LNAI 3702. Springer-Verlag,
Berlin (2005) 2-13
3. Bouquet, P., Giunchiglia, F., van Harmelen, F., et al.: C-OWL: Contextualizing
Ontologies. In: Fensel, D., et al. (eds.): ISWC 2003, LNCS 2870, Springer-Verlag,
Berlin (2003) 164-179
4. Grau, B.C., Parsia, B., Sirin, E.: Working with Multiple Ontologies on the Semantic Web.
In: McIlraith, S.A., et al. (eds.): ISWC 2004, LNCS 3298, Springer-Verlag, Berlin (2004)
620-634
5. Kutz, O., Lutz, C., Wolter, F., Zakharyaschev, M.: E-connections of abstract description
systems. Artificial Intelligence 156(1) (2004) 1-73
6. Tan, P., Madnick, S., Tan, K.-L.: Context Mediation in the Semantic Web: Handling OWL
Ontology and Data Disparity Through Context Interchange. In: Bussler, C., et al. (eds.):
SWDB 2004, LNCS 3372, Springer-Verlag, Berlin (2005) 140-154
7. Schmidt, A.: Ontology-Based User Context Management: The Challenges of Imperfection
and Time-Dependence. In: Meersman, R., Tari, Z., et al. (eds.): OTM 2006, LNCS 4275,
Springer-Verlag, Berlin (2006) 995-1011
8. Lu, R.-Q.: Towards a Mathematical Theory of Knowledge. J. Comput. Sci. & Tech. 20(6)
(2005) 751-757
9. Wang, X.H., Zhang, D.Q., Gu, T., et al.: Ontology Based Context Modeling and Reasoning
using OWL. In: Proceedings of the Second IEEE Annual Conference on Pervasive Computing
and Communications Workshops (PERCOMW 2004)
10. Horrocks, I., Patel-Schneider, P.F.: A proposal for an OWL rules language. In: Proc. of
the 13th International Conference on World Wide Web (WWW 2004) (2004) 723-731
11. Reiter, R.: A Logic for Default Reasoning. Artificial Intelligence 13 (1980) 81-132
12. Haarslev, V., Möller, R.: RACER User's Guide and Reference Manual: Version 1.7.6,
December (2002)
13. Haarslev, V., Möller, R.: Practical Reasoning in Racer with a Concrete Domain for Linear
Inequations. In: Horrocks, I., et al. (eds.): Proceedings of the International Workshop on
Description Logics (DL-2002), April (2002)
14. Eriksson, H.: Using JessTab to Integrate Protégé and Jess. IEEE Intelligent Systems 18(2)
(2003) 43-50
15. Golbreich, C., Imai, A.: Combining SWRL rules and OWL ontologies with the Protégé OWL
Plugin, Jess, and RACER. In: 7th International Protégé Conference, Bethesda (2004)

An Interval Lattice Model for Grid Resource Searching


Wen Zhou 1, Zongtian Liu 1, and Yan Zhao 2
1 School of Computer Engineering and Science, Shanghai University,
2 Sydney Institute of Language and Commerce, Shanghai University,
Shanghai, P.R. China, 200072
{zhouwen, ztliu, zhaoyan87}@shu.edu.cn

Abstract. In practice, information is mostly expressed in interval form, and Formal
Concept Analysis cannot deal with interval information, so research on interval
lattices is an important task. In this paper, an interval lattice model is proposed, and a
Grid resource matching algorithm based on the interval lattice is presented. Finally,
experimental results show that the construction algorithm has reasonable performance
in terms of complexity.
Keywords: interval concept lattice, interval FCA, Grid resource searching.

1 Introduction
Formal Concept Analysis (FCA) is a data analysis technique based on ordered lattice
theory, first introduced by Wille [1]. It defines formal contexts to represent relations
between objects and attributes and interprets the corresponding concept lattice.
To deal with the uncertain and vague information found in practice, interval
analysis may be a better method. Many researchers have studied interval numbers,
e.g. [2-5]. Burusco et al. first introduced interval-valued fuzzy data from the viewpoint
of FCA, producing a model of L-fuzzy concept theory to deal with such data [6]. It
uses some results on multisets and expertons and requires a complete chain to map the
interval values within [0, 1], forming the super-sub relation between two concepts and
hence the lattice.
Scaling [7] is used to generate the concept lattice of a formal context with nominal
attributes. Inspired by this, interval scaling is proposed here to process interval
attributes. Then, based on the interval lattice used to manage resources in the Grid, a
Grid resource matching algorithm is introduced. The rest of the paper is organized as
follows. Section 2 introduces the Interval Formal Concept Analysis model. Section 3
presents the scaling algorithm. Section 4 discusses the resource matching algorithms,
the experiments are in Section 5, and Section 6 concludes.

2 An Interval Formal Concept Analysis Model


This section presents Interval Formal Concept Analysis (IFCA), whose core data
structure is the interval lattice; it incorporates interval attribute scaling based on
attribute decomposition so that FCA gains the capacity to represent interval information.


An interval [a, b] is the set of all real numbers {x: a ≤ x ≤ b}. A natural definition of
arithmetic for intervals represents them as pairs of real numbers [8]. An interval [a, b]
with a = b is called a degenerate interval, which degenerates to the real number a.
Interval decomposition is denoted by ⊕. There are two rules in the decomposition
process:
1. Degenerate intervals do not participate in the decomposition process.
2. The decomposition process does not generate degenerate intervals.
First, the decomposition of two intervals is analyzed. Suppose both [a1, b1] and
[a2, b2] are non-degenerate intervals. Without loss of generality, we assume a1 ≤ a2.
Then
[a1, b1] ⊕ [a2, b2] = I1, I2, I3 = [a1, a2∧b1], [a2, (a2∨b1)∧b2], [(a2∨b1)∧b2, b1∨b2]  (1)
where ∧ and ∨ denote min and max. If an interval Ii = [ai−, ai+] (i=1,2,3) has
ai− ≥ ai+ (ai− and ai+ are the lower and upper bound of Ii respectively), it does not
satisfy rule 2 and is deleted from the result. There are eight conditions on [a1, b1] and
[a2, b2], as shown in Table 1.
Table 1. The decomposition condition table of two intervals

Type  Condition   [a1,b1] ⊕ [a2,b2]          Example
1     A1 B1       [a1,b1]                    [2,5]⊕[2,5]=[2,5]
2     A1 B2       [a1,b1],[b1,b2]            [2,8]⊕[2,12]=[2,8],[8,12]
3     A1 B3       [a1,b2],[b2,b1]            [2,8]⊕[2,5]=[2,5],[5,8]
4     A2 B1 C3    [a1,a2],[a2,b2]            [2,6]⊕[3,6]=[2,3],[3,6]
5     A2 B2 C1    [a1,b1],[a2,b2]            [2,4]⊕[4,6]=[2,4],[4,6]
6     A2 B2 C2    [a1,b1],[a2,b2]            [2,3]⊕[4,6]=[2,3],[4,6]
7     A2 B2 C3    [a1,a2],[a2,b1],[b1,b2]    [2,5]⊕[4,7]=[2,4],[4,5],[5,7]
8     A2 B3 C3    [a1,a2],[a2,b2],[b2,b1]    [2,11]⊕[5,9]=[2,5],[5,9],[9,11]

The conditions A1 and A2 denote a1=a2 and a1<a2 respectively; B1, B2 and B3 denote
b1=b2, b1<b2 and b1>b2; and C1, C2 and C3 denote b1=a2, b1<a2 and b1>a2 respectively.
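A direct implementation of the two-interval decomposition of Eq. (1) can be sketched in Java as follows; the class and method names are ours, but the logic follows the formula [a1, a2∧b1], [a2, (a2∨b1)∧b2], [(a2∨b1)∧b2, b1∨b2] with degenerate candidates removed.

    import java.util.ArrayList;
    import java.util.List;

    record Interval(double lo, double hi) {
        boolean degenerate() { return lo >= hi; }
    }

    class IntervalDecomposition {
        // Decomposes two non-degenerate intervals according to Eq. (1);
        // the caller must pass them so that i1.lo <= i2.lo.
        static List<Interval> decompose(Interval i1, Interval i2) {
            double a1 = i1.lo(), b1 = i1.hi(), a2 = i2.lo(), b2 = i2.hi();
            double mid = Math.min(Math.max(a2, b1), b2);      // (a2 v b1) ^ b2
            Interval[] candidates = {
                new Interval(a1, Math.min(a2, b1)),           // I1 = [a1, a2 ^ b1]
                new Interval(a2, mid),                        // I2 = [a2, (a2 v b1) ^ b2]
                new Interval(mid, Math.max(b1, b2))           // I3 = [(a2 v b1) ^ b2, b1 v b2]
            };
            List<Interval> result = new ArrayList<>();
            for (Interval c : candidates)
                if (!c.degenerate()) result.add(c);           // rule 2: drop degenerate pieces
            return result;
        }
    }

For example, decompose(new Interval(2, 11), new Interval(5, 9)) yields [2,5], [5,9], [9,11], matching type 8 of Table 1.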
The information table (G, M, R), in which the relation R(g, m) (g ∈ G, m ∈ M) has
the value u(g, m), can be represented as a cross-table, as shown in Table 2. The
information table has three objects representing three Grid resources, namely C1, C2
and C3. In addition, it has four attributes, "CPU capacity" (A), "Memory Size" (B),
"Resource Price" (C) and "Working Time" (D), representing four properties of a Grid
resource. The relation between an object and an attribute is represented by a
membership value expressed as an interval.

Table 2. Information table of intervals

      A          B          C              D
C1    [0-2.8]    [0-512]    [200-10000]    [9-12]
C2    [0-8.8]    [0-256]    [300-10000]    [20-24]
C3    [0-1.4]    [0-1024]   [500-10000]    [5-11]


Interval attribute decomposition: for each m ∈ M, the intervals u(g, m) of R(g, m) in
(G, M, R) are decomposed into Wm. The information table can then be transformed
into an interval context by interval attribute scaling.
Definition 1: Interval attribute scaling. After interval attribute decomposition, the
information table (G, M, R) is extended to (G, M, (Wm)m∈M, R, I), where each Wm is
the set of decomposed intervals of u(g, m) for the attribute m (m ∈ M, g ∈ G);
(g, m, w) ∈ I if and only if there exists w ⊆ u(g, m) (w ∈ Wm, where u(g, m) is the
value of R(g, m) in the information table (G, M, R)).
Definition 2: Interval formal context. It is a tuple K := (G, M, (Wm)m∈M, I), where
G is a set of objects, M a set of attributes, and I ⊆ G × {(m, w) | m ∈ M, w ∈ Wm} a
relation such that (g, m, w1) ∈ I and (g, m, w2) ∈ I imply w1 = w2. (g, m, w) ∈ I is read
"object g has value w for attribute m".
The interval formal context shown in Table 3 is obtained from Table 2 by interval
attribute scaling after attribute decomposition.
Table 3. Interval formal context

M      A           B           C           D
W      a1 a2 a3    b1 b2 b3    c1 c2 c3    d1 d2 d3 d4
C1     1  1  0     1  1  0     1  1  1     0  1  1  0
C2     1  1  1     1  0  0     0  1  1     0  0  0  1
C3     1  0  0     1  1  1     0  0  1     1  1  0  0

Definition 3: For A ⊆ G, there are two mappings:
A′ := {w ∈ Wm | ∀g ∈ A, m ∈ M, (g, m, w) ∈ I}
B′ := {g ∈ G | ∀w ∈ Wm, m ∈ B, (g, m, w) ∈ I}
Definition 4: Interval formal concept. C = (A, B) is called an interval formal
concept of K := (G, M, W, I) if and only if A ⊆ G, B ⊆ (MB, WB), MB ⊆ M,
Wm(B) = {wm | wm ∈ B}, A′ = B and B′ = A; A and B are called the extent and intent of
C respectively.
Definition 5: Interval lattice. The concepts of a given context are naturally ordered
by the partial relation defined by C1(A1, B1) ≤ C2(A2, B2) :⇔ A1 ⊆ A2 ⇔
Wm(B2) ⊆ Wm(B1) (∀Wm ∈ B2). The ordered set of all interval formal concepts of
K := (G, M, W, I) is denoted by B(G, M, W, I) and is called the interval concept lattice
of (G, M, W, I). Without ambiguity, the interval concept lattice is simply called the
concept lattice or the lattice.
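Under Definition 5, the subconcept test amounts to an extent-inclusion check in one direction and an intent-inclusion check in the other. A minimal Java sketch is shown below; it simplifies the per-attribute sets Wm(B) to a flat set of (attribute, interval) entries, and the type names are ours.

    import java.util.Set;

    // (A, B): extent = set of object names, intent = set of "(attribute, interval)" entries.
    record IntervalConcept(Set<String> extent, Set<String> intent) {}

    class IntervalLatticeOrder {
        // C1 <= C2 iff extent(C1) is contained in extent(C2) and every intent
        // entry of C2 is already contained in the intent of C1.
        static boolean lessOrEqual(IntervalConcept c1, IntervalConcept c2) {
            return c2.extent().containsAll(c1.extent())
                && c1.intent().containsAll(c2.intent());
        }
    }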


3 The Interval Lattice Construction Principle


Attribute decomposition is the basis of attribute scaling, which forms the interval
context after the decomposition. The attribute decomposition algorithm is extended
from the two-interval decomposition defined by equation (1). According to the results
in Table 1, the decomposition process of two intervals is shown in Table 4; attribute
decomposition of the context is based on this two-interval decomposition method.
Table 4. The two-interval decomposition following the conditions in Table 1

Type No.  [a1,b1] ⊕ [a2,b2]          Wm                  Return               Flag
1         [a1,b1]                    Wm                  -                    true
2         [a1,b1],[b1,b2]            Wm = Wm + [a1,b1]   [b1,b2]              true
3         [a1,b2],[b2,b1]            Wm                  [a1,b2],[b2,b1]      true
4         [a1,a2],[a2,b2]            Wm = Wm + [a1,a2]   [a2,b2]              true
5         [a1,b1],[a2,b2]            Wm = Wm + [a1,b1]   [a2,b2]              false
6         [a1,b1],[a2,b2]            Wm = Wm + [a1,b1]   [a2,b2]              false
7         [a1,a2],[a2,b1],[b1,b2]    Wm = Wm + [a1,a2]   [a2,b1],[b1,b2]      true
8         [a1,a2],[a2,b2],[b2,b1]    Wm = Wm + [a1,a2]   [a2,b2],[b2,b1]      true

[Figure 1 shows the interval lattice built from the context of Table 3: seven concept
nodes #0-#6, each labelled with its extent and its interval intent over the attributes
A, B, C and D.]
Fig. 1. The interval lattice


By attribute decomposition, the decomposed attribute set W is obtained. According
to the attribute scaling definition, the interval context is generated, as in the example
shown in Table 3, and the lattice of Fig. 1 is generated from that context.

4 The Resource Searching Algorithm in Grid Environment


In a Grid environment, one issue is how to effectively manage the Grid resources.
Another is, when a resource request is submitted, how to find the fitting Grid
resources that satisfy the request. The request is given in the form of an interval
set [9]. This is our motivation for proposing the interval lattice model and extending
FCA to IFCA: IFCA is the method used to solve the two issues, with the interval
lattice as the tool to manage the Grid resources, and on this basis the task of resource
searching according to the resource request is carried out.
According to the type of the request, we design two algorithms for searching
fitting Grid resources. One is a bottom-up algorithm, shown in Fig. 2, which is much
quicker for complex requests with many attributes than for simple ones. The other is
a top-down algorithm, which is quicker for simple requests with few attributes than
for complex ones. The judgeRequest procedure judges whether a node in the lattice
matches the request. For reasons of page limit, we only introduce the first method in
this paper.
Algorithm: Bottom-up resource searching algorithm
Input: a request, interval lattice L of Grid resources
Output: matching Grid resources G of the request
Process:
1. match = null; I = null;
2. if (B does not satisfy the request) return match;    // judge the bottom node B of L
   else { B.flag = true; I.add(B); }
3. while (I.size != 0)                                  // there is at least one node in I
   {
     II = null;
     for (each lattice node i in I)
       for (each of i's parent nodes j)
       {
         if (j.flag == null)                            // j.flag denotes whether j satisfies the request
         {
           boolean flag = judgeRequest(request, j);
           if (flag == true) { j.flag = true; II.add(j); }
           else j.flag = false;
         }
         else if (j.flag == true) i.flag = false;
         else continue;
       }
     for (each i in I)
     {
       if (i.flag == false) continue;
       for (each of i's parent nodes j)
         if (j.flag == true) i.flag = false;
       if (i.flag == true) match.add(i);
     }
     if (II.size == 0) break;                           // there are no nodes in II
     if (II.size == 1 && II.get(0) == TopNode) break;   // only the top node remains
     I = II;
   }
4. for (each node i in match)
     if (i.object not in G) G.add(i.object);
5. return G;

Fig. 2. The bottom-up resource searching algorithm

5 Experiments and Discussion


Random data are adopted as test data; all the algorithms were implemented in Java
and run on a PC (CPU PIV 2.8 GHz, memory 512 MB) under Windows XP. To
improve the reliability of the experiments, each experiment with a given set of
parameters was carried out on five randomly generated data sets, and the average of
the five results is taken as the final result.
First, the space consumption of the incremental lattice construction algorithm is
tested. Figure 3 shows the spatial complexity. Curve 1 and Curve 2 show how the
number of lattice nodes changes with the number of objects and the number of
attributes respectively. The horizontal ordinate indicates the numbers of objects and
attributes, which range over {10, 20, 30, 40, 50, 60}, and the vertical ordinate
indicates the size of the lattice.


1.the size of concept lattice according to the number of objects
2.the size of concept lattice according to the number of attributes

Fig. 3. Relations between the size of lattice and the numbers of both objects and attributes


With the increase in the number of objects, the size of the lattice grows with an
exponential tendency, while the number of attributes has only a very slight impact on
the size of the lattice.
The curves confirm that the number of objects has a big influence on the size of
the lattice, whereas the number of attributes does not have such a strong impact on the
resulting lattice. The spatial behaviour of fuzzy lattices is different: in fuzzy FCA [10],
the number of objects has a slight influence on the lattice size, but the number of
attributes has a big impact on it.

6 Conclusion
In this paper, an interval lattice (IL) model is produced that extends FCA to IFCA,
which can deal with the real interval values found in practice and differs from
classical FCA. Future research includes applying this interval lattice as a Grid
resource management tool and using the resource searching algorithm as a resource
matching tool in real Grid environments, using real Grid data to further test the
proposed model and algorithms.

Acknowledgments
The work presented in this paper is supported partially by the National Natural
Science Foundation of China (grant numbers NSFC 60275022 and NSFC 60575035).

References
1. Wille, R.: Restructuring lattice theory: an approach based on hierarchies of concepts,
Reidel, Dordrecht (1982)
2. Young, R.C.: The algebra of many-valued quantities. Annals of Mathematics 104 (1931)
260-290
3. Markov, S., Okumura, K.: The Contribution of T. Sunaga to Interval Analysis and Reliable
Computing. Kluwer Academic Publishers (1999)
4. Moore, R.E.: Interval analysis. Prentice-Hall Englewood Cliffs, NJ (1966)
5. Fernandez, A.J., Hill, P.M.: An interval lattice-based constraint solving framework for
lattices. In: 4th International Symposium on Functional and Logic Programming, Vol. 1722 (1999) 194-208
6. Burusco, A., Fuentes-González, R.: The study of the interval-valued contexts. Fuzzy Sets
and Systems 121 (2001) 439-452
7. Prediger, S., Stumme, G.: Theory-driven logical scaling. International Workshop on
Description Logics, Vol. 22 (1999)
8. Moore, R.E., Bierbaum, F.: Methods and Applications of Interval Analysis. Soc for
Industrial & Applied Math (1979)
9. Buyya, R.: Economic-based Distributed Resource Management and Scheduling for Grid
Computing. Vol. Ph.D.. Monash University Australia (2002)
10. Qiang, Y., Liu, Z.T., et al.: Research on fuzzy concept lattice in knowledge discovery and
a construction algorithm. Acta Electronica Sinica 33 (2005) 350-353

Topic Maps Matching Computation Based on Composite Matchers

Jungmin Kim 1 and Hyunsook Chung 2
1 School of Computer Engineering, Seoul National University, Korea
jmkim@idb.snu.ac.kr
2 Department of Computer Engineering, Chosun University, Korea
hsch@chosun.ac.kr

Abstract. In this paper, we propose a multi-strategic matching approach to find


correspondences between ontologies based on the syntactic or semantic
characteristics and constraints of the Topic Maps. Our multi-strategic matching
approach consists of a linguistic module and a Topic Map constraints-based
module. A linguistic module computes similarities between concepts using
morphological analysis, and language-dependent heuristics. A Topic Map
constraints module takes advantage of several Topic Maps-dependent
techniques such as a topic property-based matching, a hierarchy-based
matching, and an association-based matching. It is not necessary to generate a
cross-pair of all topics from the ontologies because unmatched pairs of topics
can be removed by characteristics and constraints of the Topic Maps. Our
experiments show that the automatically generated matching results conform to
the outputs generated manually by domain experts, which is very promising for
further work.
Keywords: Ontology matching, Topic Maps, multi-strategic matching process.

1 Introduction
In recent years, many approaches for ontology matching have been proposed.
However, all of these earlier approaches for schema or ontology matching focused on
providing various techniques for effective matching and merging of schemas or
ontologies[1]. They were far from efficiency considerations and thus are not suitable
for practical applications based on ontologies of real-world domains [7]. Also, earlier
approaches convert ontologies or schemas of relational databases, object-oriented
databases, and XML to a graph model with only nodes and edges, in order to support
different applications and multiple schema types [2,3,11]. This conversion results in
low efficiency because the characteristics of ontologies that are useful for similarity
computation are overlooked. Another problem with existing matching methods is
that, given two ontologies O1 and O2, each entity in O1 is compared with all entities
in O2; this full scan over O1 and O2 also results in low efficiency.

Corresponding author.



In this paper, we present an approach that exploits features of Topic Maps to
reduce the matching complexity and uses linguistic analysis to improve the matching
performance. Our approach does not require ontologies to be converted into a generic
graph model or the entities of the two ontologies to be fully scanned. Furthermore, our
approach is a composite combination of four matching techniques: name matching,
internal structure matching, external structure matching, and association matching.
This composite matching approach combines the results of the four independently
processed matching techniques to measure the unified similarity of each pair.
To evaluate the quality of our approach, we use as experimental data the philosophy
ontology [5], which is constructed for the Korean philosophy learning domain, the
Wikipedia philosophy ontology, which is constructed from philosophy-related
contents of Wikipedia, and the German literature ontology, which is constructed from
contents on German literature in the Yahoo encyclopedia.
We use three measures, precision, recall, and overall, which derive from the
information retrieval field, to evaluate the quality of our approach. We then evaluate
the approach by computing these three measures over a set of manually determined
matches and a set of matches automatically generated by the matching operations.
Based on the experimental results, we conclude that the matches automatically
generated by our matching operation cover most of the manually determined matches.

2 Related Work
With respect to matching and merging ontologies, there have been a few approaches,
such as PROMPT[9], Anchor-PROMPT[10], Information flow[13], FCA-Merge[2],
QOM[7], and so on.
According to the Topic Maps Reference Model, two Topic Maps can be mapped and
merged only if two topics have an identical subject identity, regardless of their
name-based similarity. But it is not always the case that all topics representing the
semantically same concept have a standard subject identity; furthermore, Topic Maps
can be built whose topics do not have a subject identity.
To overcome this weakness, the SIM (Subject Identity Measure) [6] was used to
measure the similarity between topics based on their name similarity and occurrence
similarity. In the SIM, the processing is only a string comparison of the names of
topics and the resource data of occurrences; the hierarchical structure and associations
in Topic Maps are not considered.
Table 1 summarizes the characteristics of the methods. The abbreviated column
names are Language (L), Patterns (P), Experimental Data (D), Results (R), and
Complexity (C). The Patterns column indicates the matching approaches:
terminological (T), internal structure (IS), external structure (ES), extensional (E), and
instance (I). Our approach, named TM-MAP, is similar to QOM in its use of features
of the data model of an ontology to reduce the complexity of the matching operation;
the difference is that our approach addresses the matching problem of distributed
Topic Maps.

698

J. Kim and H. Chung


Table 1. Comparison of the matching and merging methods

Methods     L            P            D            R          C
PROMPT      Graph        T/ES         HPKB         Merge      O(n^2)
Ctx-Match   Graph        T/E          Toy          Matching   O(n^2)
IF-MAP      Graph        T/I          Toy          Matching   O(n^2)
FCA-Merge   Graph        T/I          Toy          Matching   O(n^2)
QOM         RDF          T/IS/ES/E    Real Onto.   Matching   O(n log n)
TMRM        Topic Maps   T            -            Merge      O(n^2)
SIM         Topic Maps   T/IS         Toy          Matching   O(n^2)
TM-MAP      Topic Maps   T/IS/ES/E    Real Onto.   Merge      O(n log n)

3 Problem Definition
3.1 Topic Maps Data Model
Topic Maps is a technology for encoding knowledge and connecting this encoded
knowledge to relevant information resources. It is used as a formal syntax for
representing and implementing ontologies [4,8]. Topic maps are organized around
topics, which represent subjects of discourse; associations, which represent
relationships between the subjects; and occurrences, which connect the subjects to
pertinent information resources. These entities have different meanings and usages,
so we measure the similarity between entities of the same type rather than between all
entities.
Definition 1. We define a Topic Map model as the following 7-tuple:
TM := (TC, TO, TA, TR, TI, RH, RA)
- TC denotes a set of topic types
- TO denotes a set of occurrence types
- TA denotes a set of association types
- TR denotes a set of role types
- TI denotes a set of instance topics
- RH denotes a set of subsumption hierarchy relations
- RA denotes a set of associative relations
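The 7-tuple of Definition 1 can be captured directly as a small Java data structure; the field names below simply mirror the tuple components and are not a prescribed API.

    import java.util.Set;

    // Relations kept as pairs of topic identifiers.
    record Pair(String from, String to) {}

    // TM := (TC, TO, TA, TR, TI, RH, RA)
    record TopicMap(
            Set<String> topicTypes,         // TC
            Set<String> occurrenceTypes,    // TO
            Set<String> associationTypes,   // TA
            Set<String> roleTypes,          // TR
            Set<String> instanceTopics,     // TI
            Set<Pair> hierarchyRelations,   // RH (subsumption)
            Set<Pair> associativeRelations  // RA
    ) {}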


3.2 Topic Maps Matching Process
Our ontology matching process is composed of the following six steps.
1. The initialization step takes two serialized Topic Maps documents, so-called XTM
(XML Topic Maps) [12], as input and interprets them to build the Topic Maps in
memory. During interpretation, PSI and TopicWord indexes are generated for each
Topic Map.
2. The topic pair generation step creates a reduced number of entity pairs rather than
all entity pairs of the two Topic Maps.
3. The similarity computation step applies a composite combination of matching
techniques to measure the similarity between topics based on linguistic analysis.
Our composite matching approach combines the results of four independently
executed matching algorithms: the name matching operation, the property matching
operation, the hierarchy matching operation, and the association matching operation.
4. The similarity aggregation step aggregates the similarity values of the four
matching operations to generate a combined similarity value for each topic pair.
5. The match candidate selection step automatically chooses match candidates for a
topic by selecting the topics of the other Topic Map with the best similarity value
exceeding a certain threshold.
6. The post-processing step lets domain experts manually correct the errors of the
automatically generated match results.

4 Similarity Computation
Definition 2. A matching function map is defined by the following expression:
map(A, B, D) = map(A.TC, B.TC, D) ∪ map(A.TO, B.TO, D) ∪ map(A.TA, B.TA, D)
∪ map(A.TR, B.TR, D) ∪ map(A.TI, B.TI, D)
A and B are the source Topic Maps and D is a domain-specific term dictionary. The
matching function map(A, B, D) is processed by matching functions over the different
entity types, and it is composed of the following matching operations.
4.1 Name Matching Operation
The name matching operation compares the strings of base names and variant names
of topics. In the terminology field, a single term can refer to more than one concept
and multiple terms can be related to a single concept. The name matching operation
finds multiple terms that refer to the same concept by applying the two main
categories of methods for comparing terms: string-based methods and linguistic
knowledge-based methods. Let x and y be tokens and c the largest common substring
of them. The similarity between two strings based on the token and substring method
is computed by the following expressions, in which xi is the i-th token of string a and
yj is the j-th token of string b. In our morphological analysis, phrases or sentences are
divided into several stems and inflectional endings, which attach to stems and
represent various inflections or derivations in Korean. Thus, in order to improve the
quality of string matching between words, we use word order and ending information,
which classifies corresponding ending groups according to their meaning and usage.
SIMtoken(x, y) = 2|c| / (|x| + |y|)
TS-SIMstring(a, b) = Σi,j SIMtoken(xi, yj) / (|a| × |b|)
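A straightforward Java sketch of the token-level similarity follows, with the largest common substring computed by dynamic programming; the class and helper names are ours.

    class TokenSimilarity {
        // Length of the largest common (contiguous) substring of x and y.
        static int largestCommonSubstring(String x, String y) {
            int best = 0;
            int[][] dp = new int[x.length() + 1][y.length() + 1];
            for (int i = 1; i <= x.length(); i++)
                for (int j = 1; j <= y.length(); j++)
                    if (x.charAt(i - 1) == y.charAt(j - 1)) {
                        dp[i][j] = dp[i - 1][j - 1] + 1;
                        best = Math.max(best, dp[i][j]);
                    }
            return best;
        }

        // SIMtoken(x, y) = 2|c| / (|x| + |y|)
        static double simToken(String x, String y) {
            if (x.isEmpty() && y.isEmpty()) return 1.0;
            return 2.0 * largestCommonSubstring(x, y) / (x.length() + y.length());
        }

        // TS-SIMstring(a, b): average of SIMtoken over all token pairs of a and b.
        static double simString(String[] a, String[] b) {
            if (a.length == 0 || b.length == 0) return 0.0;
            double sum = 0.0;
            for (String xi : a) for (String yj : b) sum += simToken(xi, yj);
            return sum / (a.length * b.length);
        }
    }

For instance, simToken("kant", "kants") gives 2·4/9 ≈ 0.89, since the largest common substring is "kant".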

4.2 Internal Structure Matching Operation


If two topics have m occurrences and n occurrences respectively, the internal
structure-based strategy computes the similarity values of the m×n pairs of
occurrences to measure the similarity between the topics. An occurrence is defined by
an occurrence type and an occurrence value, which is a textual description or a URI
address. For example, the topic Immanuel Kant has an occurrence whose type is
figure and whose value is http://www.encyphilosophy.net/kant/figure.jpg. Thus, the
similarity values of the occurrence types and the occurrence values need to be
combined to determine the internal structure-based similarity value of the paired
topics.

4.3 External Structure Matching Operations


External structure matching measures the similarity between two class topics based
on the combined similarity between their parent and child topics. The following
expression computes the similarity value between two topics based on the similarity
of their hierarchical structure. In this expression, t1 and t2 are topics that have m and n
parent topics and x and y child topics respectively; t1.parenti is the i-th parent topic of
t1 and t2.parentj is the j-th parent topic of t2. We average SIMname and SIMocc of
t1.parenti and t2.parentj to determine a combined similarity value between the parent
topics of t1 and t2. Likewise, t1.childk is the k-th child topic of t1 and t2.childl is the
l-th child topic of t2, and we average SIMname, SIMocc, and SIMH of t1.childk and
t2.childl to produce the combined similarity value SIM between them. In the
expression, w is a weight ranging from 0 to 1; we set different values of w in order to
emphasize the similarity of the parent topics or of the child topics.
SIMH(t1, t2) = (1−w)·(Σi,j SIMname+occ(t1.parenti, t2.parentj) / (|m| × |n|))
             + w·(Σk,l SIM(t1.childk, t2.childl) / (|x| × |y|))

4.4 Association Matching Operation


The association matching operation determines the similarity between association
types. An association type is composed of a set of members, which have their roles in
the relation; thus, the similarity between association types is determined by the
similarities between their members. The following expression measures the similarity
between association types. Given two association types t1 and t2, the similarity value
is computed over the pairs of members; M and N are the numbers of members of the
two association types, mi is the i-th member of t1, mj is the j-th member of t2, ri is the
role of mi and rj is the role of mj.
SIMassoc(t1, t2) = Σi,j SIM(mi, mj)·SIM(ri, rj) / (|M| × |N|), for 1 ≤ i ≤ M, 1 ≤ j ≤ N
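Assuming a topic-level similarity function SIM is already available, the association-type similarity can be sketched as a normalized sum over all member pairs, each weighted by the similarity of the roles they play; the names below are illustrative.

    import java.util.List;
    import java.util.function.BiFunction;

    record Member(String topic, String role) {}

    class AssociationSimilarity {
        // SIMassoc(t1, t2) = sum_i,j SIM(m_i, m_j) * SIM(r_i, r_j) / (|M| * |N|)
        static double simAssoc(List<Member> t1, List<Member> t2,
                               BiFunction<String, String, Double> sim) {
            double sum = 0.0;
            for (Member mi : t1)
                for (Member mj : t2)
                    sum += sim.apply(mi.topic(), mj.topic()) * sim.apply(mi.role(), mj.role());
            return sum / (t1.size() * t2.size());
        }
    }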

5 Experiment
We set up three data groups, group A, group B, and group C, for our experiment. The
oriental philosophy ontology (T1), the modern western philosophy ontology (T2), and
the contemporary western philosophy ontology (T3) are placed in group A, because
these are philosophy domain ontologies created by the same philosophy experts.
Group B includes the Wikipedia philosophy ontology (T4), which is constructed from
philosophy-related contents of Wikipedia. Group C includes the German literature
ontology (T5), which was constructed from the German literature encyclopedia
provided by the Yahoo Korea portal. Table 2 shows the characteristics of our
experimental data.
Table 2. The statistics of the experimental ontologies

                    Group A              Group B   Group C
Ontologies          T1     T2     T3     T4        T5
Max level           11     10     9      9         4
# of Topics         1826   983    1266   417       30
# of Topic types    1379   384    603    182       -
# of Occ. types     86     56     62     13        -
# of Ass. types     47     40     43     7         2
# of Role types     22     15     18     4         2
# of PSIs           653    328    345    -         -

In this work, we use the performance measures of information retrieval, namely
precision, recall, and overall, to measure the performance of our ontology matching
operations. To evaluate the quality of the matching operations, we need the manually
determined match set (R) and the automatically generated match set (P) obtained by
the matching processes. By comparing these match results, we obtain the
true-positive set (I), which contains the correctly identified matches and from which
precision, recall, and overall are computed. Figure 1 shows the experimental results,
which exhibit high recall and precision.

Fig. 1. Experiment results of pairs of Topic Maps

Pairs of ontologies in group A are matched based on the ontology schema layer,
because these ontologies are constructed from the same knowledge domain and by
the same group of experts. These ontologies share a common schema, known as the philosophy
reference ontology, for standardizing and validating them. The pair (T2,T3) of group A
has maximal matches because both ontologies are components of the philosophy
ontology and have some relationships in terms of philosophers, texts, terms, doctrines,
and so on.
In (T1,T4), (T2,T4), and (T3,T4) of groups A and B, most of the matched topics result
from the topic name-based matching operation, because the paired Topic Maps have
topics describing the same philosophers, e.g. Kant, Hume, and Marx, the same texts
of philosophy, e.g. Philosophy of Right, Critique of Pure Reason, and Discourse on
the Method, and the same terms of philosophy, e.g. reason, free will, ideology, and
moral. The recall of the pair of the modern western philosophy and German literature
ontologies, (T2, T5), is 1, because the number of matches between ontologies of
different domains is very low and the matching operations easily find matches based
on topic names, such as Nietzsche, Philosophy of Right, and so on. This pair has a
poor overall of -0.38, in contrast to its recall. This means that domain experts must
make more effort to adopt the automatically generated matches than to determine the
matches manually; in other words, it seems useless to match ontologies between
different knowledge domains.

6 Conclusion
In this paper, we propose a multi-strategic matching approach to determine semantic
correspondences between Topic Maps. Our multi-strategic matching approach takes
advantage of the combination of linguistic module and Topic Maps constraints
including name matching, internal structure matching, external structure matching,
and association matching. By doing this, the system achieves higher match accuracy
than the one of a single match technique.
The experimental results show that the precision of the automatically generated
match set is more than 87% and the recall of the set is more than 90%. This means
that the automatically generated match sets include a large portion of all manually
determined matches.
Matched topics are merged into a new topic or connected by a semantic
relationship to enable ontology-based systems to provide knowledge-related services
on multiple Topic Maps. However, merging or alignment of Topic Maps is not easy
work although we found matches between Topic Maps. Ontology merging approaches
concerning merging issues, such as conflict resolution, ontology evolution, and
versioning will be investigated in the near future.

References
1. Erhard Rahm and Philip A. Bernstein. 2001. A survey of approaches to automatic schema
matching, VLDB Journal, 10(4):334-350.
2. Gerd Stumme and Alexander Maedche. 2001. FCA-Merge: Bottom-up Merging of
Ontologies, In Proceedings of the 17th International Joint Conference on Artificial
Intelligence (IJCAI):225-234.
3. Hong Hai Do and Erhard Rahm. 2002. COMA - a system for flexible combination of
schema matching approaches, In Proceedings of VLDB:610-621.

Topic Maps Matching Computation Based on Composite Matchers

703

4. ISO/IEC JTC1/SC34. 2003. Topic Maps - Reference Model,
URL:http://www.isotopicmaps.org/TMRM/TMRM-latest-clean.html.
5. JungMin Kim, ByoungIl Choi, and HyoungJoo Kim. 2005. Building a Philosophy
Ontology based on Contents of Philosophical Texts, Journal of Korea Information Science
Society, 11(3):275-283.
6. Lutz Maicher. 2004. Merging of Distributed Topic Maps based on the Subject Identity
Measure(SIM) Approach, In Proceedings of Berliner XML Tags:301-307.
7. Marc Ehrig and Steffen Staab. 2004. QOM: Quick ontology matching, In Proceedings of
ISWC:683-697.
8. Michel Biezunski, Michael Bryan and Steven R. Newcomb. 2002. ISO/IEC 13250
TopicMaps, URL:http://y12web2.y12.doe.gov/sgml/sc34/document/0322.htm.
9. Natalya F. Noy and Mark A. Musen. 2000. PROMPT: Algorithm and Tool for Automated
Ontology Merging and Alignment, In Proceedings of the National Conference on Artificial
Intelligence(AAAI):450-455.
10. Natalya F. Noy and Mark A. Musen. 2001. Anchor-PROMPT: Using Non-Local Context
for Semantic Matching, In Proceedings of the Workshop on Ontologies and Information
Sharing at the International Joint Conference on Artificial Intelligence(IJCAI):63-70.
11. Paolo Bouquet, Luciano Serafini, and Stefano Zanobini. 2003. Semantic coordination: A
new approach and an application, In Proceedings of ISWC:130-145.
12. Steve Pepper and Graham Moore. 2001. XML Topic Maps(XTM) 1.0, TopicMaps.Org,
URL:http://www.topicmaps.org/xtm/1.0.
13. Yannis Kalfoglou and W. Marco Schorlemmer. 2003. IF-Map: An Ontology-Mapping
Method Based on Information-Flow Theory, Journal on Data Semantics I: 98-127.

Social Mediation for Collective Intelligence


in a Large Multi-agent Communities:
A Case Study of AnnotGrid
Jason J. Jung and Geun-Sik Jo
Department of Computer and Information Engineering
Inha University, Republic of Korea
j2jung@intelligent.pe.kr, gsjo@inha.ac.kr

Abstract. Collective intelligence is a key issue for efficient collaborations on


semantic grid environment. Through interactions among agents on the semantic
grid, we want to discover the social centralities underlying agent communities. In
this paper, we propose an ontology mapping approach to build collective intelligence
based on social mediation. As a case study, we demonstrate the AnnotGrid platform,
which supports manual annotation tasks on an image dataset.

1 Introduction
Collective intelligence is regarded as one of the important processes for collaborations
between people in order to solve a very complicated problem. They can share their
own knowledge (or experiences) with each other, access the knowledge from the
individual viewpoint, and integrate relevant knowledge with their own. Originally, this
mechanism has been studied in the communities of social science and pedagogical
science. In the context of computational and information science, we want to
automatically build collective intelligence among information systems (e.g., virtual
organizations) and then provide meaningful services (e.g., recommendation) to
people [1]. In particular, the individuals are represented not as human actors but as
software agents working on behalf of the corresponding persons [2]. The agents,
however, have some drawbacks in communicating with others. Such problems are
i) lack of computational
power, ii) lack of information and knowledge, and even more seriously, iii) semantic
heterogeneities.
To overcome these problems, the agents can be reinforced with semantic grid
computing [3]. The semantic grid is an extension of the current grid in which
information and services are given well-defined meaning through machine-processable
descriptions that maximize the potential for sharing and reuse. Existing grid
environments suppose that the power of the agents is simply identical; we are
motivated by this naive assumption, since depending on the ontological structure of
the agents, the semantic information provided during cooperation may differ from
agent to agent.
In this paper, as shown in Fig. 1, we investigate a three-layered architecture
for the semantic grid, composed of social, ontology, and resource layers. We
propose a novel approach to integrate the intelligent reasoning capabilities based on


[Figure 1 depicts ontologies O1-O4 distributed across the social layer, the ontology
layer, and the resource layer.]

Fig. 1. Three-layered semantic grid architecture

social network analysis and co-occurrence analysis (i.e., an extended TF-IDF) of the
resources. The semantics delivered from an agent is applied to measure the strength
of its social ties with neighboring agents, because we have no prior knowledge of the
relationship between two agents A and B on the social layer.
The outline of this paper is as follows. In Sect. 2, we define some notation and
explain our ontology mapping algorithm for collecting the social ties among agents.
Sect. 3 proposes a community-based approach for social mediation in a multi-agent
environment. In Sect. 4, we demonstrate our AnnotGrid environment to support
semantic collaboration. Finally, Sect. 6 gives a conclusion and future work.

2 Constructing Social Network


Now we want to define agents and their community model. Social ties between the
agents are discovered and then aggregated to construct a weighted social network.
The weighted link between two agents is computed in two steps: i) semantic similarity
between the ontologies that they use, and ii) co-occurrence patterns of the resources
that they possess.
Definition 1 (Agent). An agent Ai in a social grid SG is composed of three modules:
i) an inference engine, ii) a set of ontologies Oi, and iii) local resources Ri, where each
element r ∈ Ri is annotated with Oi.
In this paper, the inference engine is regarded as an annotation (or classification)
system for local resources. Thereby, each agent can build his own local ontologies,
and he can also exploit standard ontologies such as OpenCyc [4] and SUMO (1). For
convenience, Oi = {<c, rel, c′> | c, c′ ∈ Annot(rx), rx ∈ Ri}.
(1) Suggested Upper Merged Ontology, http://ontology.teknowledge.com
2.1 Semantic Similarity by Ontology Mapping
We can measure the similarity between two agents, SimO(Ai, Aj), by the similarity
measurement strategy of [5,6], which defines all possible similarities between classes
SimC, relationships SimR, attributes SimA, and instances SimI. For simplicity, we only
need to use SimC. Given a pair of classes from two different ontologies, the similarity
measure SimC takes values in [0, 1]. The similarity SimC between c ∈ OA and
c′ ∈ OB is defined as

SimC(c, c′) = Σ_{F∈N(C)} w_F^C · MSim_Y(F(c), F(c′))   (1)

where N(C) is the set of all relationships in which the classes participate, and the
function F can measure label similarities (it can be replaced by string matching
algorithms, e.g., edit distance). The weights w_F^C are normalized (i.e.,
Σ_{F∈N(C)} w_F^C = 1).
extended to
C
SimC (c , c ) = L
simL (F (c ), F (c ))

S
+ SU
P M SimSU P (SUP(c ), SUP(c ))
C
+ SU B M SimSU B (SUB(c ), SUB(c ))
C
+ SIB
M SimSIB (SIB(c ), SIB(c )).

(2)

where the set functions M SimY is formulated for comparing the two sets of entity
collections. According to the characteristics of the sets, it is differently given by two
equations
max<c,c >P airing(S,S  ) (SimC (c, c ))
,
max (|S|, |S  |)

1 <c,c >P airing(S,S  ) (SimC (c, c ))

M SimSU P,SU B (S, S  ) =


M SimSIB (S, S  ) =

(3)

(4)
max (|S|, |S  |)
where P airing is a simple matching function for generating all possible pairs from
both sets. These equations express either positive or negative influences on semantic
relationships between classes. It means that the more matched super- and subclasses
can imply the corresponding classes are more similar, as shown in Eq. 3. In contrast,
the sibling classes (Eq. 4) reflect the negative effect, because they make the semantics
dispersed and the degree of similarity decreased.
2.2 Co-occurrence Patterns
Finally, we can construct a weighted social network over the set of agents by
referring to the resource layer. In this paper, we assume that when agents annotate a
resource with semantically similar classes, they are strongly correlated, i.e., the
strength of the social tie might be high. Thus, the strength of the social tie between
Ai and Aj is given by

we_ij = Σ_{rx∈Ri∩Rj} ( Σ_{c∈Annot_i(rx), c′∈Annot_j(rx)} SimC(c, c′) / (|Annot_i(rx)| × |Annot_j(rx)|) )   (5)

and the normalized strength is easily computed by w̄e_ij = we_ij / |Ri ∩ Rj|.
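A possible Java rendering of Eq. (5) is sketched below, assuming each agent exposes its annotations as a map from resources to the set of classes used, with the class similarity SimC supplied as a function; all names are our own.

    import java.util.Map;
    import java.util.Set;
    import java.util.function.BiFunction;

    class TieStrength {
        // we_ij summed over the resources annotated by both agents (Eq. 5),
        // then normalized by |Ri ∩ Rj|.
        static double normalizedTie(Map<String, Set<String>> annotI,
                                    Map<String, Set<String>> annotJ,
                                    BiFunction<String, String, Double> simC) {
            double weight = 0.0;
            int shared = 0;
            for (String resource : annotI.keySet()) {
                Set<String> ci = annotI.get(resource);
                Set<String> cj = annotJ.get(resource);
                if (cj == null) continue;              // resource not annotated by both agents
                shared++;
                double sum = 0.0;
                for (String c1 : ci) for (String c2 : cj) sum += simC.apply(c1, c2);
                weight += sum / (ci.size() * cj.size());
            }
            return shared == 0 ? 0.0 : weight / shared;
        }
    }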


Furthermore, there is another issue concerning how to refine the weights between
agents: an inactivation process using conflicting annotation semantics. When SimC
between the annotations of a resource is very low, the strength of the social tie should
be decreased. We leave this issue for future work.

3 Community-Based Social Mediation


A set of agents which is participating to a semantic grid should be organized as agent
communities.
Definition 2 (Agent community). An agent community ACs is represented as ACs =
N , E where N is a set of agents {A1 , A2 , . . . , A|N | }, and E is a set of relations
between the agents
E |N | |N |.
(6)
Each element is represented as eij , weij  E where weight weij [0, 1] indicates the
strength of a social link between two agents Ai and Aj .
Property 1 (Degree of cohesion COH). A set of agents in an agent community AC_s has to be highly cohesive, which is similar to subgroup identification. Based on [7,8], this constraint is formulated as

COH(AC_s) = \frac{\sum_{A_i, A_j \in AC_s} we_{ij}}{{}_{|N|}C_2} \geq \tau_{COH}    (7)

where \tau_{COH} is a predefined threshold value.
Based on the social ties between two agents, we can apply a non-parametric approach, e.g., the nearest-neighborhood method [9]. Extending [10], the task is to maximize a semantic modularity function Q on the social network of agents. Given the number of agent communities S, a social network SG can be partitioned into a set of agent communities (or subgroups) \{AC_1, AC_2, \ldots, AC_S\}. An agent can be involved in more than one community, i.e., an agent in AC_k can also be taken as a member of another community AC_{k'}. Based on Eq. 7, the modularity function Q is formulated as

Q(SG) = \frac{\sum_{i=1}^{S} COH(AC_i)}{S}    (8)

where all possible pairs of agents should be considered. Thus, the optimal formation of SG is discovered when Q(SG) is maximized. For computing this, in this paper, we applied the k-nearest neighborhood method.
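As an illustration of Property 1 and Eq. 8, the sketch below computes the cohesion of a community and the modularity of a partition, assuming a dictionary of pairwise tie strengths; it is not the k-nearest-neighbourhood procedure itself.

from itertools import combinations

def cohesion(community, weight):
    # degree of cohesion COH of one community (cf. Eq. 7); denominator is |N| choose 2
    pairs = list(combinations(community, 2))
    if not pairs:
        return 0.0
    total = sum(weight.get((a, b), weight.get((b, a), 0.0)) for a, b in pairs)
    return total / len(pairs)

def modularity(communities, weight):
    # semantic modularity Q averaged over the communities (cf. Eq. 8)
    return sum(cohesion(c, weight) for c in communities) / len(communities)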
Within a certain AC_i, the centrality of agents can be considered to find out which agent is semantically more powerful, i.e., which agent is the most relevant for making two heterogeneous agents able to communicate and understand each other.
Definition 3 (Centrality CTR). From a set of agents AC_x, the centrality of an agent A_i is given by



CTR(A_i) = \frac{\sum_{A_j \in AC_x} we_{ij}}{|N| - 1}    (9)

where A_j can be in either the same agent community or a different one. Because the network of agents is fully connected, the measure is based on closeness centrality [11] rather than on others (e.g., betweenness and stress centralities).
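A matching sketch of Eq. 9, with the mediator of a community chosen as the agent of highest centrality; the weight dictionary is the same hypothetical structure as in the previous sketches.

def centrality(agent, agents, weight):
    # closeness-style centrality CTR (cf. Eq. 9): total tie strength over |N| - 1
    others = [a for a in agents if a != agent]
    if not others:
        return 0.0
    total = sum(weight.get((agent, a), weight.get((a, agent), 0.0)) for a in others)
    return total / len(others)

def pick_mediator(community, agents, weight):
    # the agent with the highest centrality mediates within its community
    return max(community, key=lambda a: centrality(a, agents, weight))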
Given the agents whose centralities are highest in the corresponding agent communities, we have to repeat the cohesion evaluation process. By repeating this centrality computation, SG is hierarchically built for efficient social mediation. For social mediation in a semantic grid environment, we have to find out whether two agents are involved in the same agent community or not, because the centrality is available only within the same community. There are two cases, i.e., two red arrows denoting communications i) between heterogeneous agents A_1 and B_1, and ii) between A_1 and A_3, as shown in Fig. 2. Within an agent group, the social mediation can be done
Fig. 2. Two kinds of social mediation based on centrality; three agent communities are built, and agents C_1, C_2, and C_3 are assigned the highest centrality in their agent communities, respectively

by the agent whose centrality is highest. For example, we can see that the communication between A_1 and B_1 in AgentCommunity1 can be supported by C_1. On the other hand, for the communication between A_1 and A_3, we need to know which collective intelligence should be integrated. In this case, agents C_1, C_2, and C_3 may cooperate to integrate their semantic information to mediate between A_1 and A_3.

4 Experimentation: A Case Study of AnnotGrid Environment


We have been implementing a semantic grid environment based on multi-agent systems, called AnnotGrid. Sharing a large set of image files is usually regarded as a serious hurdle in distributed environments. In the semantic web era, rather than content-based analysis, a variety of studies have focused on image annotation. The annotations are basically generated by human experts (or end users), who have to put in some description of the images (possibly referring to knowledge-based systems).


The goal of this system is to support the annotation tasks, saving the users' time and increasing the precision of their descriptions. The basic scheme of AnnotGrid is as follows (for more details, see the project web pages2):
1. Building the social grid with initial annotations; this is the initialization step of the semantic grid.
2. Loading a new image r_i.
3. Retrieving a set of existing annotations from other users; the agent can get the existing annotations about r_i from the other agents in both the same agent community and different ones:
   - local searching by the user himself
   - global searching by the social mediator
4. Translating the retrieved annotations; the corresponding social mediators provide a translation service to make the annotations understandable.
To evaluate this system, we invited 32 college students to annotate an image dataset3. Above all, because the class similarities built while annotating the images should become relatively large, we recommended using some standard ontologies such as
- Suggested Upper Merged Ontology (SUMO, http://ontology.teknowledge.com),
- WordNet (http://wordnet.princeton.edu/), and
- CIDOC CRM (CIDOC4 Conceptual Reference Model, http://cidoc.ics.forth.gr/).
Each person had to initially annotate five images randomly selected by his own preferences, so that four communities were built (\tau_{COH} = 0.35). As they kept annotating the images, the students were able to obtain annotations recommended by AnnotGrid and give feedback on whether each recommendation was relevant or not.
Table 1. Experimental results on annotation recommendation by AnnotGrid

Index         User number   Number of annotations     Accuracy of annotation (%)
                            Local       Global        Local       Global
Community1    4             19          97            72.5        78.6
Community2    12            93          124           68.4        81.2
Community3    9             58          137           62.5        75.0
Community4    7             52          94            69.1        81.6
Average                                               68.1        79.1

As shown in Table 1, the global searching facility showed better performance than the local one by approximately 11%, which indicates that our social mediation works properly. In particular, the users in Community1 benefited most from social mediation, even though the number of users was the smallest. In the case of the larger communities (the 2nd, 3rd and 4th communities), the users obtained considerably higher precision. We think that the larger amount of semantic information in these communities makes the ontology matching process more precise.
2 http://intelligent.pe.kr/AnnotGrid/
3 http://intelligent.pe.kr/AnnotGrid/images/
4 http://www.willpowerinfo.myby.co.uk/cidoc/


5 Related Work
There have been various annotation systems, not only for images but also for bookmarks and other multimedia [12,13,14]. In particular, many collaborative tagging systems (folksonomies), like del.icio.us and Bibsonomy (http://www.bibsonomy.org), have been trying to share semantic tags with others. However, they do not have any facilities for analyzing social features or for an automated recommendation process, so people have to inspect directly which tags (or semantics) can be exploited. In the aspect of computing architectures, SOA (service-oriented architecture) systems have been applied to data annotation [15]. In particular, the semantic overlay architecture for P2P networks [16] is very similar to our scheme, because it can also hierarchically organize the linkage structure among peers.

6 Concluding Remarks and Future Work


In conclusion, we claim that the social relationships between agents should be discovered to support efficient cooperation in a semantic grid environment. The proposed method is able to build a hierarchical formation of people in a social network. As a practical experiment, this paper demonstrates the AnnotGrid system, which is capable of recommending semantic information for the manual annotation task of photos.
As future work, we are planning to apply a consensus method [17] to our system, in order to deal with conflicts between ontology mappings.

References
1. Jung, J.J.: Ontological framework based on contextual mediation for collaborative information retrieval. Information Retrieval 10(1) (2007) 85-109
2. Maes, P.: Agents that reduce work and information overload. Communications of the ACM 37(7) (1994) 30-40
3. Gil, Y.: On agents and grids: Creating the fabric for a new generation of distributed intelligent systems. Journal of Web Semantics 4(2) (2006) 116-123
4. Lenat, D., Guha, R.: Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project. Addison-Wesley (1989)
5. Euzenat, J., Valtchev, P.: Similarity-based ontology alignment in OWL-Lite. In de Mantaras, R.L., Saitta, L., eds.: Proceedings of the 16th European Conference on Artificial Intelligence (ECAI). (2004) 333-337
6. Jung, J.J., Euzenat, J.: Measuring semantic centrality based on building consensual ontology on social networks. In: Proceedings of the Workshop on Semantic Network Analysis (co-located with ESWC 2006). (2006)
7. Wasserman, S., Faust, K.: Social Network Analysis. Cambridge University Press (1994)
8. Jung, J.J., Euzenat, J.: From personal ontologies to socialized semantic space. In: Poster of the 3rd European Semantic Web Conference. (2006)
9. Gowda, K.C., Krishna, G.: Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition 10(2) (1978) 105-112
10. Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Physical Review E 69 (2004) 066133


11. Sabidussi, G.: The centrality index of a graph. Psychometrika 31 (1966) 581-603
12. Kiryakov, A., Popov, B., Ognyanoff, D., Manov, D., Kirilov, A., Goranov, M.: Semantic annotation, indexing, and retrieval. In Fensel, D., Sycara, K.P., Mylopoulos, J., eds.: International Semantic Web Conference. Volume 2870 of Lecture Notes in Computer Science., Springer (2003) 484-499
13. Belhajjame, K., Embury, S.M., Paton, N.W., Stevens, R., Goble, C.A.: Automatic annotation of web services based on workflow definitions. In Cruz, I.F., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., Aroyo, L., eds.: Proceedings of the 5th International Semantic Web Conference (ISWC 2006). Volume 4273 of Lecture Notes in Computer Science., Springer (2006) 116-129
14. Jung, J.J.: Exploiting semantic annotation to supporting user browsing on the web. Knowledge-Based Systems xx(x) (to appear) http://dx.doi.org/10.1016/j.knosys.2006.08.003
15. Aurnhammer, M., Hanappe, P., Steels, L.: Augmenting navigation for collaborative tagging with emergent semantics. In Cruz, I.F., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., Aroyo, L., eds.: Proceedings of the 5th International Semantic Web Conference (ISWC 2006). Volume 4273 of Lecture Notes in Computer Science., Springer (2006) 58-71
16. Aberer, K., Cudre-Mauroux, P.: Semantic overlay networks. In Bohm, K., Jensen, C.S., Haas, L.M., Kersten, M.L., Larson, P.A., Ooi, B.C., eds.: Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), ACM (2005) 1367
17. Nguyen, N.T.: Conflicts of ontologies - classification and consensus-based methods for resolving. In Gabrys, B., Howlett, R.J., Jain, L.C., eds.: Proceedings of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2006). Volume 4252 of Lecture Notes in Computer Science., Springer (2006) 267-274

Metadata Management in S-OGSA


Oscar Corcho, Pinar Alper, Paolo Missier,
Sean Bechhofer, Carole Goble, and Wei Xing
School of Computer Science, University of Manchester
Oxford Road, Manchester M13 9PL, United Kingdom
{ocorcho,penpecip,pmissier,seanb,carole,wxing}@cs.man.ac.uk

Abstract. Metadata-intensive applications pose strong requirements for metadata management infrastructures, which need to deal with a large amount of distributed and dynamic metadata. Among the most relevant requirements we can cite those related to access control and authorisation, lifecycle management and notification, and distribution transparency. This paper discusses such requirements and proposes a systematic approach to deal with them in the context of S-OGSA.

1 Introduction

Metadata can be attached to many different types of objects, available in different formats and locations, and can be expressed in a wide range of languages (e.g., natural language, lists of terms, formal languages) and using a wide range of vocabularies (e.g., keyword sets, concept taxonomies). Depending on these aspects, many authors use the terms semantic versus non-semantic, or rich versus non-rich metadata.
Many technologies are available to manage metadata, such as Jena, Sesame, Boca, Oracle-RDF, Annotea, Technorati, etc. They give a good level of support for current metadata-intensive applications. However, they fall short for some of the metadata management requirements that new applications are posing, such as metadata distribution, lifecycle management and authorisation.
This paper starts by providing a set of requirements for advanced metadata management (Section 2), which characterise metadata-intensive applications in domains like bioinformatics, social sciences, engineering, market analysis, etc. Existing approaches and technologies do not meet all of the previous requirements [1]. Hence we make a proposal for managing metadata as first-class resources in distributed systems, so that we can deal with the previous requirements (Section 3). Finally we provide some conclusions and future work.

2 Metadata Management Requirements

In this Section we describe some of the advanced requirements for metadata management that are needed in some metadata-intensive applications.
Metadata should be stored and accessible in a distributed manner. Many systems will need to integrate distributed metadata (and ontologies),

which may be supplied by multiple parties and with different technologies. Most of the current metadata management systems provide repositories designed for centralized use, with metadata consumers and producers acting as local-access clients using specialized APIs. Remote distributed access to metadata and ontologies has received little attention, causing each system to depend on one particular technology, thus reducing interoperability.
A service-oriented approach to metadata and ontology access could improve this situation. However, this should not be done by simply wrapping current metadata management systems with web services that replicate their local APIs, as described below.
Metadata should be accessible with technology-independent service-oriented protocols. Metadata can be available in multiple forms (in RDF, as social tags, in natural language, etc.). Hence we require systems that enable the integration and exploitation of this heterogeneous metadata.
Even in the case of a single form of metadata (e.g. RDF(S)), each technology uses a different abstraction and means to handle it. There is a clear need for a common abstraction for metadata, which would enable the systematic development of services that combine other services and applications with varying levels of semantic capabilities for dealing with and interpreting metadata.
Metadata should evolve together with the resources that it describes and the vocabularies that it uses. Metadata evolves due to different reasons. However, the dynamics of metadata is poorly supported by existing technologies, although some work has been done with respect to the semantics of change and its propagation to the related metadata [2,3].
Metadata should be maintained up-to-date in a cost-effective manner. This includes maximising the automation of different aspects of the knowledge lifecycle, managing the evolution and change of metadata and knowledge models in distributed contexts, and synchronising adequately the evolution of all these related entities by means of notification mechanisms.
Metadata access should be controlled. Metadata owners may want to define the conditions and rules under which others can access their metadata. However, existing technologies vary in their support and granularity for security, which normally means access control only. For instance, Sesame provides built-in user/role-based access control per repository, while Jena has no built-in access-control support since it assumes that this can be provided by the underlying database technology. Moreover, all these access control mechanisms have their own specific APIs and conventions, which creates overhead at the client layer (need for multiple usernames and passwords, or multiple sign-ons).
Securing message exchanges, authentication and access control are well-studied problems in distributed environments, with standards1 and open-source reference implementations2 for global user identification, single sign-on, communication encryption and representation and decision of resource-sharing policies.
1 E.g., XACML: www.oasis-open.org/committees/xacml/, WS-Security: www.oasis-open.org/committees/wss/
2 E.g., Globus GSI: http://www.globus.org/toolkit/docs/4.0/security/


3 Metadata Management in S-OGSA

In this Section we describe the approach for metadata management proposed in the context of the Semantic-OGSA (S-OGSA) architecture [4]. S-OGSA is an extension of the Open Grid Services Architecture [5] for the development of distributed applications that need to use explicit and distributed metadata.
The S-OGSA model is driven by the principle that associations between resources and their metadata should be first-class resources that can be distributed and managed in a service-oriented manner. This gives support to the first two requirements from Section 2, and is the basis for giving support to the rest.
The term Semantic Binding (SB) [6] denotes this new type of resource, which constitutes the core of S-OGSA. Semantic Bindings represent associations between any set of resources and any set of knowledge entities (e.g., ontologies, rule sets, controlled lists of terms). These associations contain metadata, which can be encoded in different formats: RDF, natural language, social tags, etc. Besides, different Semantic Bindings can describe the same set of entities: different tools or persons may create different annotations for the same resource.
The following subsections describe the capabilities associated with Semantic Bindings and how they support our metadata management requirements.
3.1 Semantic Binding Capabilities

In [4] we describe the model used to represent Semantic Bindings, expressed as an ontology that extends the Grid ontology described in [7]. The main properties of a Semantic Binding are the set of resources to which it refers (that is, the resources for which it contains metadata), the set of knowledge entities that the metadata is based on, and the actual metadata that it stores. Other properties (state, creation time, last modification time, etc.) are stored and used for managing lifetime, notification and authorisation mechanisms.
Figure 1 shows the basic operations provided by the Semantic Binding service suite (the Semantic Binding Service, its corresponding Factory and a Metadata Service that unifies the metadata stored by several Semantic Bindings).
- Create. It creates a Semantic Binding, given the described resources, the Knowledge Entities used for the description, and the metadata to store.
- Update Resource and Knowledge Entity References. They allow managing the references to Resources and Knowledge Entities of the Semantic Binding.
- Update Semantic Binding Content. It updates the metadata stored in the Semantic Binding, due to its reannotation or curation.
- Destroy. It destroys the Semantic Binding, together with its content, immediately or at a scheduled point in time.
- Archive. It archives the Semantic Binding content so that it is not active but can be retrieved in case it is needed later (e.g., for provenance).
- Query. It executes a query over the metadata stored by the Semantic Binding. Queries will be sent in a query language that the Semantic Binding supports, and can take into account the knowledge entities to which the Semantic Binding refers or not.
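The operation list can be pictured with the following Python sketch of a Semantic Binding resource; the class and method names only mirror the operations above and are not the WSRF interface of the S-OGSA reference implementation.

import time

class SemanticBinding:
    # in-memory stand-in for a Semantic Binding (SB) resource
    def __init__(self, resources, knowledge_entities, metadata):
        self.resources = set(resources)                  # described resources
        self.knowledge_entities = set(knowledge_entities)
        self.metadata = metadata                         # e.g. a list of RDF-like triples
        self.state = "Valid"
        self.created = self.modified = time.time()

    def update_references(self, resources=None, knowledge_entities=None):
        # manage the references to Resources and Knowledge Entities
        if resources is not None:
            self.resources = set(resources)
        if knowledge_entities is not None:
            self.knowledge_entities = set(knowledge_entities)
        self.modified = time.time()

    def update_content(self, metadata):
        # re-annotation or curation of the stored metadata
        self.metadata = metadata
        self.modified = time.time()

    def archive(self):
        self.state = "Archived"                          # kept for later inspection/provenance

    def destroy(self):
        self.metadata, self.state = None, "Destroyed"

    def query(self, predicate):
        # toy query: return the stored triples satisfying a caller-supplied predicate
        return [t for t in (self.metadata or []) if predicate(t)]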


Fig. 1. Functionality of the Semantic Binding Service

This suite is supported by the S-OGSA reference implementation3, which complies with the WS-Resource Framework [8] (WSRF). It can be deployed on the Globus Toolkit 4 platform4 and in Apache Tomcat5.
Next we describe how we deal with the rest of the requirements, namely lifetime management, metadata change notifications and controlled access to metadata.
3.2 Semantic Binding Lifetime

In some applications, metadata is a dynamic entity subject to frequent changes: the resources and knowledge entities that it refers to can evolve, become suddenly unavailable, be destroyed, etc. Some of these changes may cause metadata to become invalid. Other changes may not have an influence on metadata validity (e.g., the removal of a concept in an ontology for which there are no instances in the stored metadata). Finally, metadata may become invalid after a given period of time, when a new annotation tool has been made available and the resource has to be reannotated, when a metadata curation process is in place, etc.
In order to deal with all these changes in a principled way, S-OGSA defines SBs as stateful resources with a defined lifetime and identifies the states and state transitions that an SB can go through throughout its lifetime. The corresponding state diagram is presented in the next subsection.
Figure 2 shows the state diagram associated with an SB, which includes fundamental states and state transitions, and the external events that cause the transitions. The SB lifetime specification extends WS-ResourceLifetime, a part of WSRF that standardizes the way that resources are destroyed, and defines resource properties for the inspection and monitoring of a resource lifetime. While WS-ResourceLifetime is focused exclusively on resource destruction, we extend it to include any life-changing event that may affect the validity and updates of an SB. Furthermore, the basic state machine presented here can be extended with sub-states if needed in a certain application.
3 Available at the OntoGrid CVS.
4 http://www.globus.org/toolkit/
5 http://tomcat.apache.org/
The explanation of the state transition diagram is as follows. When it is first created, a Semantic Binding SB is in the Valid state. We denote with Res_SB and KE_SB, respectively, the set of Resources and Knowledge entities that are part of the association, and with content_SB the metadata payload within SB.

Fig. 2. State transition diagram for a generic Semantic Binding

State transition events are of the following types:
- Changes in the described resources, denoted by Res_SB \rightarrow Res'_SB.
- Changes in the Knowledge entities, i.e., KE_SB \rightarrow KE'_SB.
- Updates to the SB content: content_SB \rightarrow content'_SB.
Note that the Resources and Knowledge entities can also be destroyed: Res_SB \rightarrow \emptyset, KE_SB \rightarrow \emptyset. In addition to these external events, a content expiration date can also be associated with an SB, so that it gets stale upon expiration.
For a Valid SB, these events cause its transition to one of two possible Validate states, ValidateRes and ValidateKE. These are interim states in which the SB may be invalid and is awaiting re-validation. A re-validation process, either manual or automated, is any procedure that updates any or all of Res_SB, KE_SB, or content_SB, and which results in a decision as to whether the updated entities represent a new valid combination6.
For a ValidateRes SB, such a procedure determines whether the existing metadata can be associated with the new Resources, and provides an update to the references in SB to Res_SB. For example, following a change in a workflow that is described with a piece of metadata, the procedure determines whether the same metadata can be associated with the new workflow.
6 In both cases, the SB goes back to the Valid state in case of successful validation, and to Invalid otherwise.
For a ValidateKE SB, the problem is to determine whether the new ontology can still be used to interpret the old metadata.
Finally, the Archived state indicates that an SB is still available for inspection.
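The lifetime just described can be summarised as a small state machine; the sketch below only encodes the states and transitions named in the text (Valid, ValidateRes, ValidateKE, Invalid, Archived) and leaves the re-validation decision to the caller, so it is an illustration rather than the S-OGSA specification.

class SBLifetime:
    # minimal state machine for a Semantic Binding's lifetime
    def __init__(self):
        self.state = "Valid"                   # an SB starts in the Valid state

    def on_resource_change(self):
        if self.state == "Valid":
            self.state = "ValidateRes"         # awaiting re-validation against the new resources

    def on_knowledge_entity_change(self):
        if self.state == "Valid":
            self.state = "ValidateKE"          # awaiting re-validation against the new ontology

    def on_revalidation(self, successful):
        if self.state in ("ValidateRes", "ValidateKE"):
            self.state = "Valid" if successful else "Invalid"

    def on_expiration(self):
        if self.state == "Valid":
            self.state = "Invalid"             # content became stale (assumed handling)

    def on_archive(self):
        self.state = "Archived"                # still available for inspection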
3.3 Notification of Semantic Binding Changes

Requirement 3 suggests that the metadata-aware services that use SBs should be informed of state changes for those SBs. For this purpose, S-OGSA defines a set of notification mechanisms based on WS-Notification. S-OGSA proposes using a set of pre-defined topics associated with the changes described above, to which any consumer can subscribe.
Services that receive any of these notifications will decide, as part of their business logic, how to react to the changes that they are notified about. The S-OGSA specification does not enforce any specific type of behaviour. However, as part of our future work, we are designing a service, called the SB housekeeping service, which monitors SB lifetime by subscribing to all their topics. This service will be responsible for activating application-dependent re-validation procedures (validation of the SB content, triggering of re-annotation processes, etc.).
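As an illustration only (the actual mechanism is based on WS-Notification), a topic-per-change publish/subscribe scheme might look as follows; the topic names are assumptions, not the S-OGSA topic set.

from collections import defaultdict

class SBNotifier:
    # toy topic-based notifier for Semantic Binding state changes
    TOPICS = ("ResourceChanged", "KnowledgeEntityChanged", "ContentUpdated", "Destroyed")

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, sb_id, details=None):
        # each consumer decides in its own business logic how to react
        for callback in self.subscribers[topic]:
            callback(sb_id, details)

# A housekeeping service would subscribe to every topic, e.g.:
# notifier = SBNotifier()
# for t in SBNotifier.TOPICS:
#     notifier.subscribe(t, lambda sb, d, t=t: print("revalidate", sb, "after", t))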
3.4 Security over Semantic Bindings

The last requirement suggests the need to control the access to metadata in more fine-grained ways than is currently possible with current technology.
S-OGSA provides the framework needed to support security in metadata management. Metadata is a first-class resource; hence standard security mechanisms can be applied to metadata in the same way as is done with other resources, including the possibility of specifying and enforcing access control policies over each SB. Besides, since the reference implementation of S-OGSA can be deployed on top of Globus Toolkit 4, its associated security mechanisms, such as Globus GSI, can also be applied, ensuring both message- and transport-level security, based on standard X.509 end entity and proxy certificates.
With this framework, it is possible to allow or deny access to the different annotations of an object, stored by different SBs, based on the users or groups that have created them and on their access control policies.

4 Conclusions and Future Work

In this paper we have described the requirements posed by some of the existing and envisaged metadata-intensive distributed applications:
- Applications may require their explicit metadata to be encoded in multiple forms, supplied by multiple parties and coming from different contexts. Moreover, metadata may need to be in different physical locations.

- Applications may use heterogeneous metadata storage and query technologies, each of which has specific advantages over the others (e.g. Sesame is good at query performance, Jena has a rich API, etc.). To ease metadata sharing, common technology-independent means for metadata access (metamodels and protocols) become necessary.
- Metadata normally evolves, either because of new metadata generation means or because of the evolution of the resources or knowledge entities that it refers to. Adequate means to deal with this evolution have to be made available.
- Access to metadata may need to be secured with different levels of granularity and different access control policies.
We have shown how the approach followed in S-OGSA complies with the previous requirements without the need to create expensive and difficult-to-maintain ad-hoc solutions for each of them. Basically, S-OGSA proposes to treat metadata as a first-class resource (Semantic Binding) in the application, so that standard resource management and sharing solutions used in the context of distributed applications can be easily applied to it. This includes aspects like service orientation, lifecycle management, security, etc.
The S-OGSA approach is being used in the development of different types of applications, all of which are characterised by being metadata-intensive and by needing support for some of the previous requirements. The S-OGSA reference implementation is being or will be used in several system prototypes: a satellite quality image analysis system [9], an information service for the EGEE Grid [10], and a coral bleaching alert system (Semantic Reef) [11].
Our future work will be devoted to improving our S-OGSA reference implementation, which includes the Semantic Binding Service suite presented in Section 3. We will consider the additional requirements that the early adopters of S-OGSA are providing and create an implementation with industrial standards.
Among the aspects that will be improved, we can cite the following:
- Security. We will provide a set of pre-defined security configurations that cover the most common metadata management security aspects.
- Naming. Our current implementation uses WS-Addressing EndPoint References (EPRs) to identify Semantic Bindings and the Resources that they refer to, and URIs to identify Knowledge Entities. However, there are more ways to identify entities in a distributed environment (URIs, ARK Identifiers, LSIDs). We will implement a metadata identification model using WS-Naming, which builds on WS-Addressing and extends it with URIs.
- Semantic Binding Housekeeping Service. We will build a configurable SB Housekeeping Service that gives support to the most common behaviours for metadata evolution found in applications.
- Support for more forms of metadata. We will give support to several forms of annotation (besides RDF), such as social tags, natural language comments, other ontology languages, etc.


Finally, as part of our future work we will also evaluate our system with respect
to other metadata management systems, in terms of memory consumption, query
execution performance, etc., in distributed settings.

Acknowledgements
This work is supported by the EU FP6 OntoGrid project (STREP 511513)
funded under the Grid-based Systems for solving complex problems, and by
the Marie Curie fellowship RSSGRID (FP6-2002-Mobility-5-006668). We also
thank the other members of the OntoGrid team at Manchester for their helpful
discussions: Ian Dunlop and Ioannis Kotsiopoulos.

References
1. O. Corcho, P. Missier, P. Alper, S. Bechhofer, and C. Goble, "Principled metadata management for next generation metadata-intensive systems," in 4th European Semantic Web Conference (ESWC 2007). Submitted, Innsbruck, Austria, 2007.
2. L. Stojanovic, "Methods and tools for ontology evolution," Ph.D. dissertation, Univ Karlsruhe (TH), 2004.
3. M. Klein, "Change management for distributed ontologies," Ph.D. dissertation, Vrije Universiteit Amsterdam, 2004.
4. O. Corcho, P. Alper, I. Kotsiopoulos, P. Missier, S. Bechhofer, and C. A. Goble, "An Overview of S-OGSA: A Reference Semantic Grid Architecture," Journal of Web Semantics, vol. 4, no. 2, pp. 102-115, 2006.
5. I. Foster, H. Kishimoto, A. Savva, D. Berry, A. Grimshaw, B. Horn, F. Maciel, F. Siebenlist, R. Subramaniam, J. Treadwell, and J. V. Reich, "The Open Grid Services Architecture, Version 1.5," gfd-i.080 ed., GGF, July 2006, http://forge.gridforum.org/projects/ogsa-wg.
6. P. Missier, P. Alper, O. Corcho, I. Kotsiopoulos, I. Dunlop, W. Xing, S. Bechhofer, and C. Goble, "Managing semantic grid metadata in S-OGSA," in Cracow Grid Workshop 2006. Submitted, Cracow, Poland, 2006.
7. M. Parkin, S. van den Burghe, O. Corcho, D. Snelling, and J. Brooke, "The Knowledge of the Grid: A Grid Ontology," in Proceedings of the 6th Cracow Grid Workshop, Cracow, Poland, October 2006.
8. K. Czajkowski, D. Ferguson, I. Foster, J. Frey, S. Graham, I. Sedukhin, D. Snelling, S. Tuecke, and W. Vambenepe, "Web Services Resource Framework (WSRF)," Globus Alliance and IBM, Technical report, March 2005.
9. M. Sánchez-Gestido, L. Blanco-Abruña, M. de los Santos Pérez-Hernández, R. González-Cabrero, A. Gómez-Pérez, and O. Corcho, "Complex data-intensive systems and semantic grid: Applications in satellite missions," in Proceedings of the 2nd IEEE International Conference on e-Science and Grid Computing (e-Science 2006), Amsterdam, The Netherlands, December 2006.
10. W. Xing, O. Corcho, C. Goble, and M. Dikaiakos, "Active ontology: An information integration approach for highly dynamic information sources," in Submitted to ESWC 2007, May 2007.
11. T. S. Myers, "The Semantic Reef: Managing complex knowledge to predict coral bleaching on the Great Barrier Reef," in Proceedings of AusGrid 2007, 2007, http://eprints.jcu.edu.au/1131.

Access Control Model Based on RDB Security Policy for OWL Ontology*

Dongwon Jeong1, Yixin Jing2, and Doo-Kwon Baik2

1 Dept. of Informatics & Statistics, Kunsan National University,
San 68, Miryoung-dong, Gunsan, Jeollabuk-do, 573-701 Korea
djeong@kunsan.ac.kr
2 Dept. of Computer Science & Engineering, Korea University,
Anam-dong, Sungbuk-gu, Seoul, 136-701 Korea
{jing,baik}@software.korea.ac.kr

Abstract. Most information is stored and managed in relational databases, and there has been much research on storing OWL ontologies in them. In this situation, a study on an efficient access control model that uses the relational security model is required. This paper proposes a novel access control model for OWL ontologies in relational database systems. The access control is realized by evaluating queries against an OWL data view. On the one hand, the OWL data view prevents sensitive information from being revealed to unauthorized users. On the other hand, it takes the inference ability of users into account. An empirical study verifies the effectiveness of our approach.

1 Introduction
Web Ontology Language (OWL) is a knowledge description language and is recognized as one of the most important technologies for realizing the Semantic Web [1]. Compared to XML, OWL provides greater expressive power and enables machines to infer new knowledge. There has been much effort on how to build and store OWL knowledge bases (ontologies, documents). However, little research pays attention to the OWL document security issue. This may result in knowledge leakage and also makes it hard to provide high-quality services through secure knowledge access.
This paper contributes a novel OWL security (access control) model that uses the relational database security model; that is, we assume OWL knowledge bases are stored in a relational database. One might ask why we propose an access control model for OWL knowledge bases in relational databases. The reasons are: (1) most information is managed by relational database systems; (2) relational database systems provide a secure and stable access control model.
Therefore, we propose a security model based on relational databases considering this reality. In other words, for the enforcement of access control on OWL knowledge bases, the persistent storage should be taken into consideration. The prevalent
* This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2006-311-D00776).



approaches can be classified into two types: (1) the file-based storage model and (2) the persistent storage model (e.g., RDB-based storage). The file-based storage models, such as Jena [2] and Protege [3], are based on the graph model (triple structure). Jena provides a persistent storage method to manage OWL knowledge in relational databases. However, it still suffers from inefficiency.
To solve this problem, we first define a new database layout to store OWL knowledge efficiently. In this paper, our proposal on OWL ontology data access control is outlined as follows: (1) first, we design an efficient relational data model to store OWL data persistently; our model outperforms Jena2; (2) second, we define a new OWL security model using the relational access control model; (3) finally, we show the evaluation results through experiments.

2 Persistent OWL Storage Model in Relational Databases


2.1 OWL-DL Ontology Model
We briefly review the OWL-DL document model. Without loss of generality, we represent an OWL ontology by Definition 1.
Definition 1. An OWL-DL ontology is composed of (N, P, literal), with
- Predicate P: N \rightarrow N | N \rightarrow literal
- type_n: N \rightarrow {NamedClass, Instance, Non-NamedClass}
- type_p: P \rightarrow {Transitive, Non-Transitive}
- type_p(p_c := subClassOf) = Transitive
We consider the ontology as a graph, where N and P represent nodes and edges respectively. P is a one-to-one mapping from N to N, or from N to literal. In the context of OWL-DL, N is a set of subclasses and instances of RDFNode, while P is a set of subclasses and instances of RDFProperty. The functions type_n and type_p are many-to-one mappings and return the type of an element of N and P respectively. For any n_i in N, its type is either NamedClass, Instance, or Non-NamedClass. NamedClasses are those classes which are given explicit URIs in the OWL document; Non-NamedClass includes, for example, ClassExpression, Restriction and AnonymousClass as defined in the OWL syntax. For any p_i in P, its type is either Transitive or Non-Transitive. If p_i is transitive, p_i(n_1, n_2) \wedge p_i(n_2, n_3) \rightarrow p_i(n_1, n_3). In particular, if p_c is rdfs:subClassOf, its type is Transitive.
In addition, we define two functions to facilitate the query evaluation algorithm: asDomain(n_i) returns the set of properties whose domain is n_i, and asRange(n_i) returns the set of properties whose range is n_i. Take the OWL file in Fig. 1 as an example, which is shown as a graph: RedWine is the subclass of Wine and of a restriction which constrains that the property hasColor refers to Red; type_p(subClassOf) is Transitive, and the other properties are Non-Transitive.
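Definition 1 can be mirrored directly in code; the sketch below keeps the node/edge typing and the two helper functions, and uses the Wine example of Fig. 1 as illustrative data.

class OWLGraph:
    # graph view of an OWL-DL ontology as in Definition 1
    def __init__(self):
        self.node_type = {}   # type_n: node -> NamedClass | Instance | Non-NamedClass
        self.prop_type = {}   # type_p: property -> Transitive | Non-Transitive
        self.edges = []       # triples (subject, property, object)

    def add_edge(self, s, p, o):
        self.edges.append((s, p, o))

    def as_domain(self, n):
        # properties whose domain is n
        return {p for s, p, _ in self.edges if s == n}

    def as_range(self, n):
        # properties whose range is n
        return {p for _, p, o in self.edges if o == n}

g = OWLGraph()
g.node_type.update({"Wine": "NamedClass", "RedWine": "NamedClass", "Red": "Instance"})
g.prop_type.update({"subClassOf": "Transitive", "hasColor": "Non-Transitive"})
g.add_edge("RedWine", "subClassOf", "Wine")
g.add_edge("RedWine", "hasColor", "Red")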

2.2 Permanent Storage Model


The goal of this paper is to define a novel OWL access control model based on the relational database security model. This means that we need an efficient database layout to


store the OWL knowledge base and to provide an efficient way to facilitate access control. First, the named Classes in an OWL document are identified. According to each named Class's definition, its nested Restrictions are saved in different tables. The layout also saves the instances of each named Class. The Property table creates connections between the Classes by specifying property domains and ranges. Through testing with many existing OWL ontologies, the persistent relational model proved capable of storing general OWL data.

Fig. 1. A simple OWL ontology

Fig. 2 shows a metamodel of the relational model for storing OWL data persistently. An OWL ontology basically consists of triples (Subject, Predicate, Object). In this paper, both Subject and Object are described as Concept or Instance; in the OWL specification, they are defined as Class and Individual.
Fig. 2. Metamodel for efficiently storing OWL ontology (tables Concept_Definition, Concept_Property, Concept_Axiom, Concept_Prop_Axiom, Instance_Definition, Instance_Property and Instance_Axiom, connected by involve/specialize/qualify relationships)

3 OWL Security Model


3.1 Definition of Access Control Model
In this section, we define the OWL data access control model. This concept model
provides a general definition.


Definition 2. OWL Access Control Model: M = (P, V, R),
where P is the OWL data in a persistent storage (RDB), V is the set of OWL data views, and R is the set of roles.
V is used to constrain the OWL data that a role can obtain. For each role, there is only one view connected with it. A user can access P through one of the views in V by playing a role. In addition, we define a function that assigns a data view to each role; it is a many-to-one mapping from R to V. According to Definition 2, a framework for the proposed security model is illustrated in Fig. 3.

Fig. 3. A framework for the proposed security model (users issue OWL queries through an authorization layer; the OWL query processor performs parsing and storing (translation) and access control (pruning and rewriting) and, together with an inference engine, issues SQL against the relational database holding the OWL data sets and metadata, including authorization information)

3.2 Views for OWL Ontology


Although an OWL ontology can be successfully ported into a relational database system, the attempt to adopt the relational database security solution encounters challenges. Creating views on the tables can only achieve access control at a very limited level; restricting column access alone is not enough. Access control on OWL data needs to evaluate the query down to individual records such as Wine or RedWine, which requires access control at the record level. The security view on OWL data consists of two parts, securing N and P respectively.
Definition 3. OWL Data View: V = (V_N, V_P),
where V_N: {n_i} \rightarrow {false, (true, transitiveEnd(p_i) = [n_s, n_e])}
and V_P: {p_i} \rightarrow {false, true}.

V_N maps each node n_i in the ontology graph to a security specification for a given user. The specification denotes n_i as inaccessible (false) or accessible (true), the latter along with a constraint on the transitive end. The transitive end constrains the inference ability from the node. From the perspective of a graph, the inference ability can be considered as how far the current node can reach by tracking along a transitive property. The transitiveEnd defines the end of this inference path. [n_s, n_e] denotes the end

nodes where the inference path comes to an end in the two directions. An end node can be Self, another node or Full. Self means the inference stops at the node itself; Full does not restrict the inference ability. V_P denotes the access to a given property p_i. For example, V_N(RedWine) = (true, transitiveEnd(subClassOf) = [Self, Wine]) presents a view constraining that a user can visit RedWine and can only visit its superclasses up to Wine.
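An OWL data view in the sense of Definition 3 can be sketched as two maps; the fragment below shows how a transitive-end constraint bounds how far a subClassOf chain may be followed (only the upward direction of [n_s, n_e] is modelled), with the RedWine example as test data. All names are illustrative.

SELF, FULL = "Self", "Full"

class OWLDataView:
    # OWL data view V = (V_N, V_P) of Definition 3 (sketch)
    def __init__(self, node_rules, prop_rules):
        # node_rules: node -> False or (True, {transitive_prop: (n_s, n_e)})
        # prop_rules: property -> True/False
        self.node_rules = node_rules
        self.prop_rules = prop_rules

    def can_access_node(self, n):
        return bool(self.node_rules.get(n, False))

    def reachable_superclasses(self, n, superclass_of):
        # follow subClassOf upwards, honouring the transitive end declared for n
        rule = self.node_rules.get(n)
        if not rule:
            return []
        _, ends = rule
        stop = ends.get("subClassOf", (SELF, FULL))[1]
        out, cur = [], n
        while cur in superclass_of and stop != SELF:
            cur = superclass_of[cur]
            out.append(cur)
            if cur == stop:              # the inference path ends here
                break
        return out

# V_N(RedWine) = (true, transitiveEnd(subClassOf) = [Self, Wine]): only Wine is visible upwards.
view = OWLDataView({"RedWine": (True, {"subClassOf": (SELF, "Wine")})}, {"subClassOf": True})
supers = {"RedWine": "Wine", "Wine": "PortableLiquid"}
print(view.reachable_superclasses("RedWine", supers))    # ['Wine']; PortableLiquid stays hidden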

4 Experiment and Evaluation


A query is a request for accessing a given piece of content in an ontology. A query pattern is an abstract template, and a (concrete) query requests the desired OWL information by instantiating this template. The query pattern and query are defined in Definition 4 and Definition 5.
Definition 4. Query Pattern P_Q = (\bigwedge_{1 \le i \le |N|} (p_x(?x, n_i) | p_y(n_j, ?x))), where n_i \in N | literal, n_j \in N, and p_x, p_y \in P.

Each element of the conjunction is called a constraint; n_i, n_j, p_x, and p_y are called condition variables, and ?x is the output variable.
Definition 5. A query q = [P_Q] is a value assignment to the condition variables in a query pattern P_Q.
For example, a query "get the instances of Car which are made in USA" can be expressed as q = (instanceOf(?x, Car) \wedge make(USA, ?x)), whose constraints are instanceOf(?x, Car) and make(USA, ?x). When a user issues a query, the query evaluation will first get the OWL view according to the user's role through the view assignment function, and then consult the view. If the role is authorized to access all condition variables and the output variable, the satisfying result is returned. Moreover, the query evaluation is responsible for generating the SQL query to the underlying database.
This paper assumes that the translation from an OWL data query in query-pattern form to database-compliant SQL is possible. The query evaluation algorithm Query_Eva(V, q) is shown in Fig. 4.
As an example, consider the query q = instanceOf(RedWine_1, ?x) along with V_N(RedWine). Since V_N(RedWine) defines that the superclasses of RedWine are visible only up to Wine, and the other content of the ontology is fully accessible, the algorithm returns a graph consisting of RedWine and Wine, without PortableLiquid. If the same query were processed with a normal reasoning engine without the OWL data view, the result would include PortableLiquid. Our proposal thus offers a controllable reasoning approach. Another example query is instanceOf(?x, PortableLiquid). If V_N(PortableLiquid) = (true, transitiveEnd(subClassOf) = [Wine, Self]), the returned graph includes Wine_1, without the instances of RedWine. The evaluated result is illustrated in Fig. 5.


Fig. 4. Query evaluation algorithm

Fig. 5. Query evaluation result
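The evaluation flow described above (look up the role's view, check the condition and output variables, then answer within the view's bounds) can be sketched as follows, reusing the OWLDataView sketch from Section 3.2; this is not the Query_Eva algorithm of Fig. 4, and the backend call answer_within_view is a hypothetical hook for the SQL generation.

def evaluate_query(role, constraints, view_of_role, answer_within_view):
    # constraints: list of (property, subject, object) with '?x' as the output variable
    view = view_of_role[role]                       # role-to-view lookup
    # reject the query if a property or a named condition variable is not visible
    for prop, subj, obj in constraints:
        if not view.prop_rules.get(prop, False):
            return None
        for term in (subj, obj):
            if term != "?x" and not view.can_access_node(term):
                return None
    # conjunction: intersect the bindings of '?x' over all constraints
    result = None
    for constraint in constraints:
        bindings = set(answer_within_view(view, constraint))
        result = bindings if result is None else result & bindings
    return result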

In the experiment, we focused on the effect that the number of query constraints and the transitive properties have on efficiency. To investigate exactly the relationship


between the number of query constraints and the time consumed, we tested queries whose number of constraints ranges from 100 to 1000, and whose constraint properties are all non-transitive.
Fig. 6 shows the query time of the algorithm under different numbers of query constraints. From the figure, we can observe that the time is nearly linearly proportional to the number of constraints. For a given query with a fixed 50 constraints, we assigned to all the constraints the same transitive property and the same inference ability (length of the transitive path in OWL). In ten experiments, we changed the path length from 1 to 10. Fig. 7 shows that the query time is linearly proportional to the length of the transitive path.

Fig. 6. The query time with different constraints

Fig. 7. The query time with different inference ability

5 Related Work
Several tools for persistently storing ontologies have been proposed. The Protege project [3] is one of the precursors for modeling and managing ontologies. Its file-based storage means that whenever an ontology is visited, the ontology has to be imported into memory; when the ontology size is much larger than memory, or the ontology is visited frequently, this approach introduces significant overhead. Instead of storing ontologies in files, Jena2 provides persistent ontology storage in an RDB. In this approach, each RDF statement is stored as a single row in a three-column statement (subject-predicate-object) table [2]. A complete ontology class is defined by multiple RDF statements, so retrieving the complete definition of an ontology class requires combining many statements, which is also time-consuming under frequent querying. To solve these problems, we proposed a new relational table layout for storing ontologies persistently. A number of recent research efforts have considered access control models for XML data. Representative efforts include access control languages, such as the XML Access Control Language (XACL) [4], the eXtensible Access Control Markup Language (XACML) [5] and Author-X [6][7]. These access control languages share the feature that they enforce an object-subject-action-condition oriented policy on queries. [8] introduced an approach to define security views of an XML document for different users. [9] adopts encryption


technology to secure XML documents. None of the above approaches is suitable for access control on OWL. In an XML document, a concept is defined within an element; in OWL, a concept can be defined anywhere in a document, so securing individual elements alone is not enough. On the other hand, an OWL document carries inference ability, and none of the XML-targeted access control approaches considers this issue. This paper proposes a novel approach to overcome these problems.

6 Conclusion
This paper contributes an approach to access control on OWL ontologies. Users get access to an OWL document through a specified view; thus, sensitive information is protected. In addition, the view specifies not only element accessibility but also the inference ability. We also proposed an algorithm to evaluate queries against the view. The underlying persistent storage provides efficient data retrieval. Further study will focus on improving the view definition and refining the evaluation algorithm. The system implementation is also under consideration.

References
[1] OWL Web Ontology Language, W3C Recommendation, February 10, http:// www.w3.org/
TR/owl-features/ (2004)
[2] Wilkinson K., Sayers, C., Kuno, H., and Reynolds, D.: Efficient RDF storage and retrieval
in Jena2. In Proceedings of VLDB Workshop on Semantic Web and Databases (2003)
131-150
[3] Noy, N.F., Fergerson, R.W., and Musen, M.A.: The Knowledge Model of Protégé-2000: Combining Interoperability and Flexibility. In proceedings of 12th International Conference, EKAW 2000, Juan-les-Pins, France, October 2-6 (2000)
[4] Hada, S. and Kudo, M.: XML access control language: Provisional authorization for XML
documents, http://www.trl.ibm.com/projects/xml/xacl/xacl-spec.html.
[5] OASIS, eXtensible Access Control Markup Language (XACML), http://www.oasisopen.org/committees/xacml.
[6] Bertino, E., Castano, S., and Ferrari, E.: On Specifying Security Policies for Web Documents with an XML based Language, In Proceedings of ACM SACMAT2001 (2001)
[7] Bertino, E., Castano, S., Ferrari, E.: Securing XML documents with Author-X, IEEE Internet Computing, Vol. 5, No. 3, May/June (2001) 21-31
[8] Fan, W., Chan, C.-Y., and Garofalakis, M.: Secure XML Querying with Security Views,
SIGMOD 2004, Paris, France, June 13-18 (2004)
[9] Geuer-Pollmann, C.: XML Pool Encryption, ACM Workshop on XML Security, Fairfax,
VA, USA, November 22 (2002)

Semantic Fusion for Query Processing in Grid Environment

Jinguang Gu1,2

1 College of Computer Science and Engineering, Southeast University, Nanjing 210096, China
sam@seu.edu.cn
2 College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China

Abstract. To enable accessing web information at the semantic level, this paper develops a semantic fusion mechanism in a Mediator-Wrapper based grid environment to support ontology-based query planning with GAV-style query requests. It employs a semantic communication language to build mediator sites for different virtual organizations (VOs), and creates global semantic information for the VOs. The procedure of ontology-based semantic fusion is discussed in detail.
1 Introduction

We witness a rapid increase in the number of web information sources that are available online. The World-Wide Web (WWW), in particular, is a popular medium for interacting with such sources [1]. How to integrate and query distributed and heterogeneous information, especially semi-structured and non-structured information, is the problem we need to solve. Data grid technology is the standard means of realizing these needs. However, studies in data grid technology still have the following shortcomings: 1) The flexibility of the grid technology is limited. Taking OGSA-DAI [2] as an example, it only supports a limited set of relational databases and native XML databases, whereas most information on the Internet comes from web-based semi-structured data environments, such as company web applications and XML-based e-commerce platforms; furthermore, OGSA-DAI does not have an effective mechanism for integrating other data sources into the grid environment. 2) The individual nodes in the grid environment may exist in varied semantic environments; different data resources are constructed in accordance with different semantic standards. The present data grids do not take into consideration the semantic heterogeneity among different nodes.
This paper proposes a semantic data grid (SDG for short) service to solve these two problems and discusses the semantic fusion mechanism in detail. SDG employs a mediator-wrapper framework to support different information sources and enable semantic information operations on different grid nodes [3], and it uses a semantic communication language to create the mediator-wrapper structure



dynamically for different virtual organizations (VO for short) on the grid, and creates a semantic fusion list on the mediator site to support Global-as-View (GAV) style query requests. This grid architecture uses a semantic grid adapter service to support semantic operations on the grid. The function of the wrapper of a local grid node is to describe its semantics and its mapping relationships with other nodes; the information sources of these nodes include free and commercial databases, flat file services, web services or web-based applications, HTML files and XML files, and the semantic information of every local grid node is described with a language based on its ontology. The mediator node constructs the global semantics of the local nodes; the semantic communication mechanism between the mediator and wrapper nodes is discussed in the following section. The remainder of this paper is structured as follows. Section 2 discusses the knowledge communication mechanism to support semantic querying and knowledge fusion. Section 3 discusses the procedure of ontology-based semantic fusion. Section 4 summarizes the whole paper.

2 Communication Mechanism with Semantic Grid

It is very important to develop a knowledge communication and coordination mechanism to support ontology fusion and semantic queries on different data grid nodes. This paper employs a Knowledge Communication and Manipulation Language for Semantic Grid, or KCML for short, to support this mechanism, which is an extension of the KGOL [4] language. One function of KCML is to coordinate with each grid node to build the mediator-wrapper architecture dynamically. The other function is to build global knowledge on the mediator and enable semantic queries. The communication language is built on SOAP, follows the XML expression style of SOAP, and supports SOAP over HTTP, HTTPS or other underlying communication protocols. The language can be described as:
KCML ::= Ver | Operation | Sender | Receiver | Language | Content.
The field Ver is kept for extension, showing which language version is used; a new language version is downward compatible, supporting the old communication mechanism. Operation gives the basic communication atoms, which are described next. Content describes what is communicated. Sender defines the sender's information, including user and address (such as IP, e-mail, URL, port). Receiver defines the receiver's information (usually, the receiver is a Web Service or Grid Service), including type (HOST, Web Service or Semantic Web Service), address (such as IP address, e-mail, URL, port; if the receiver is a Web Service, also the service address) and identifier. Language defines which language is used in this communication, e.g., RDF/RDFS, DAML+OIL, OWL, etc. To illustrate the communication atoms, we first define the ontology-based knowledge on the mediators and wrappers.
Definition 1. A knowledge schema is a structure KB := (C_KB, R_KB, I, \iota_C, \iota_R) consisting of (1) two sets C_KB and R_KB, (2) a set I whose elements are called instance identifiers or instances, (3) a function \iota_C: C_KB \rightarrow \wp(I) called concept instantiation, and (4) a function \iota_R: R_KB \rightarrow \wp(I^+) called relation instantiation.
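A KCML message and the knowledge schema of Definition 1 can be written down as plain data structures; the sketch below only mirrors the grammar and the tuple (C_KB, R_KB, I, iota_C, iota_R), with field names taken from the text.

from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

@dataclass
class KCMLMessage:
    # KCML ::= Ver | Operation | Sender | Receiver | Language | Content
    ver: str
    operation: str           # a basic atom (selection, join, ...) or a fusion atom such as Match
    sender: Dict[str, str]   # user, address (IP, e-mail, URL, port)
    receiver: Dict[str, str] # type, address, identifier
    language: str            # RDF/RDFS, DAML+OIL, OWL, ...
    content: str

@dataclass
class KnowledgeSchema:
    # KB := (C_KB, R_KB, I, iota_C, iota_R) of Definition 1
    concepts: Set[str] = field(default_factory=set)                              # C_KB
    relations: Set[str] = field(default_factory=set)                             # R_KB
    instances: Set[str] = field(default_factory=set)                             # I
    concept_inst: Dict[str, Set[str]] = field(default_factory=dict)              # iota_C
    relation_inst: Dict[str, Set[Tuple[str, ...]]] = field(default_factory=dict) # iota_R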


The atoms include basic communication atoms and semantic fusion atoms. Basic communication atoms such as selection, join, union, minus and projection are discussed in papers [5][6]. Semantic fusion atoms represent the semantic matching and fusion procedure.
The mediator node constructs the global semantics of the local nodes based on ontologies via the ontology fusion mechanism [7], which relies on the ontology mapping patterns in the grid environment. The patterns of ontology mapping can be categorized into four kinds: direct mapping, subsumption mapping, composition mapping and decomposition mapping [8]. A mapping can be defined as:
Definition 2. An ontology mapping is a structure M = (S, D, R, v), where S denotes the concepts of the source ontology, D denotes the concepts of the target ontology, R denotes the relation of the mapping and v denotes the confidence value of the mapping, 0 \le v \le 1.
A direct mapping relates ontology concepts in a distributed environment directly, and the cardinality of a direct mapping is one-to-one. A subsumption mapping is a 6-tuple SM = (D_m, R_m, B_m, \sqsubseteq_m, I_m, v), where D_m is a direct mapping expression; R_m is the first target concept, which is the most specialized ontology concept, and the mapping between the source ontology and R_m is denoted as the Root ontology concept mapping; B_m is the last target concept, which is the most generalized ontology concept, and the mapping between the source ontology and B_m is denoted as the Bottom ontology concept mapping; \sqsubseteq_m is the inclusion relation between target ontology concepts; I_m is the inverse mapping. Subsumption mapping is used to denote concept inclusion relations, especially in a multiple IS-A inclusion hierarchy. A composition mapping is a 4-tuple CM = (F_m, A_m, B_m, v), where F_m is a direct mapping expression; A_m is a chaining of role(s) between target ontology concepts; B_m is the last target symbol, which is the node of the chaining target role(s). Composition mapping is used to map one concept to a combined concept. For example, the mapping address = contact(country, state, city, street, postcode) is a composition mapping, in which the concept address is mapped to the combined concepts contact, country, state, city, street, and postcode of the local schema elements. A decomposition mapping is a 4-tuple CM = (A_m, B_m, L_m, v), where A_m is a chaining of role(s) between source ontology concepts; B_m is the last target symbol, which is the node of the chaining source role(s); L_m is a direct mapping expression. Decomposition mapping is used to map a combined concept to one local concept, and an example of decomposition mapping is the reverse of the composition example.
The KCML language must support these mapping patterns between different semantic nodes on the grid. We use the Match atom to support them; it can be defined as M(c, d, r) = {(x, y) | x ∈ ι_C(c) ∧ y ∈ ι_C(d) ∧ (x, y) ∈ ι_R(r)}, where c is a concept different from d and r is the relationship of the mapping.
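Assuming the KnowledgeSchema sketch given earlier, the Match atom can be written as a small set comprehension in Python; this is an illustrative reading of the definition, not the authors' implementation.

def match(kb, c: str, d: str, r: str):
    # M(c, d, r) = {(x, y) | x in iota_C(c), y in iota_C(d), (x, y) in iota_R(r)}
    return {(x, y)
            for x in kb.concept_inst.get(c, set())
            for y in kb.concept_inst.get(d, set())
            if (x, y) in kb.relation_inst.get(r, set())}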
The knowledge stored at the mediator can be described as an ontology fusion connection list, as given in Definition 3. The corresponding fusion connection lists of the mapping patterns are denoted as Fcd, Fcs and Fcc respectively.


Definition 3. A fusion connection is a structure Fc(O1:C1, O2:C2, ..., On:Cn, M), where C1 denotes a concept or concept set of ontology O1, C2 denotes a concept or concept set of ontology O2, and M denotes the mapping patterns between C1, C2, ..., Cn.

The Procedure of Semantic Fusion

The function of ontology fusion is to add a connection tag between concepts that have a direct mapping relationship. In a direct mapping M = (S, D, R, v), the fusion connection adds a connection tag between the elements of S and D; in a subsumption mapping SM = (Dm, Rm, Bm, ⊑m, Im, v), the fusion connection adds a connection tag between the concepts that have the mapping relation Dm; and in a composition mapping CM = (Fm, Am, Bm, v) or decomposition mapping DM = (Am, Bm, Lm, v), the fusion connection adds a connection tag between the concepts that have the relations Fm or Lm respectively. We use Fcd to denote the fusion connection of a direct mapping, Fcs to denote the fusion connection of a subsumption mapping, and Fcc to denote the fusion connection of a composition or decomposition mapping. The fusion connection list is a list whose elements are fusion connections; we denote it as FL.
1. The first step of ontology integration is ontology fusion for direct mapping, which creates the fusion list FL from the mapping lists of the different local ontologies; it is described by Algorithm 1.
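For readers who prefer code to pseudocode, a minimal Python sketch of this direct-fusion step is given here; the attribute names kind, source, target and direct_expr on the mapping objects are assumptions made for illustration, not part of Algorithm 1.

def direct_fusion(mappings):
    # Build the fusion connection list FL from a set of mappings (cf. Algorithm 1).
    fl = []
    for m in mappings:
        if m.kind == "direct":
            fl.append(("Fcd", m.source, m.target, m))
        elif m.kind == "subsumption":
            fl.append(("Fcs", m.source, m.target, m.direct_expr))   # Dm
        elif m.kind == "composition":
            fl.append(("Fcc", m.source, m.target, m.direct_expr))   # Fm
        elif m.kind == "decomposition":
            # The decomposition case reverses the roles of the two ontologies.
            fl.append(("Fcc", m.target, m.source, m.direct_expr))   # Lm
        else:
            raise ValueError("unknown mapping kind: %s" % m.kind)
    return fl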
2. The second step of ontology integration is ontology fusion for complex semantic mapping, which is used to find mappings for concepts that are not in the mapping list. The basic idea of this step is to find the semantic similarity of the mapping relations and to create new mappings between these relations. We consider this step in two situations: one is to find the mapping between two subsumption mapping relations, and the other is to find the mapping between two composition mapping relations or two decomposition mapping relations.
We consider the subsumption mapping situation first. Suppose M1 and M2 are two subsumption mappings; R1 is the mapping relation of M1, C10 is the concept of its source ontology and C1i (1 ≤ i ≤ n) are the concepts of its target ontology; R2 is the mapping relation of M2, C20 is the concept of its source ontology and C2j (1 ≤ j ≤ m) are the concepts of its target ontology. If JS(R1, R2) ≥ 1 − ε, the two mapping relations match each other; we then create a new mapping named M′ and build the fusion connection list FL with Algorithm 2.
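A hedged Python sketch of this step is shown below. The similarity measures JS, MSP and MGC are taken as given (they are defined elsewhere by the authors) and are passed in as callables; the pairwise iteration over the target concepts is a simplification of Algorithm 2.

def subsumption_fusion(m1_targets, m2_targets, js, msp, mgc, eps):
    # Compare the target concepts of two subsumption mappings and collect
    # fusion connections (cf. Algorithm 2).
    fl = []
    for c1, c2 in zip(m1_targets, m2_targets):
        if js(c1, c2) >= 1 - eps:
            fl.append(("Fcd", c1, c2))    # direct fusion connection
        elif msp(c1, c2) >= 1 - eps:
            fl.append(("Fcs", c1, c2))    # subsumption fusion connection
        elif mgc(c1, c2) >= 1 - eps:
            fl.append(("Fcs", c2, c1))    # reversed subsumption fusion connection
    return fl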
We then consider the composition and decomposition mapping situation. These kinds of mappings are more complex because the concatenations between two concepts of the target ontology differ from each other, so we divide the concatenations of target concepts into two kinds. One kind of concatenation carries no meaning: it just states that a concatenation between the concepts exists (most concatenations are probably of this kind). For example, in the mapping address = contact (country, state, city, street, postcode), the concatenations


Algorithm 1. Direct Fusion(SM)

Input: SM is the set of mappings
Output: FL is the fusion connection list

FL ← ∅;
foreach M in SM do
    switch M do
        case M belongs to M
            // if the mapping is a direct mapping.
            FL ← FL + Fcd(O1:C1, O2:C2, M), O1:C1 ∈ S ∧ O2:C2 ∈ D;
        case M belongs to SM
            // if the mapping is a subsumption mapping,
            // and Dm is the direct mapping expression of SM
            let Dm = (S, D, R, v);
            FL ← FL + Fcs(O1:C1, O2:C2, Dm), O1:C1 ∈ S ∧ O2:C2 ∈ D;
        case M belongs to CM
            // if the mapping is a composition mapping,
            // and Fm is the direct mapping expression of CM
            let Fm = (S, D, R, v);
            FL ← FL + Fcc(O1:C1, O2:C2, Fm), O1:C1 ∈ S ∧ O2:C2 ∈ D;
        case M belongs to DM
            // if the mapping is a decomposition mapping,
            // and Lm is the direct mapping expression of DM
            let Lm = (S, D, R, v);
            FL ← FL + Fcc(O2:C2, O1:C1, Lm), O1:C1 ∈ S ∧ O2:C2 ∈ D;
        otherwise
            Errors handler;
    end
end
return FL

among contact, country, state, city, street and postcode are of this kind. We name this kind of concatenation a None-Meanings Concatenation. The other kind of concatenation does carry meaning between the concepts; we name it a Full-Meanings Concatenation. For example, the mapping payment = total (commodity amount * (unit price + tax)) is of this kind. For the first kind of concatenation, we only need to check whether the related concepts satisfy the definition of semantic similarity; if they do, we add a new fusion connection to the fusion connection list. Before we discuss the second kind of concatenation, we introduce the concepts of fusion equivalence and fusion subsumption.
For composition (or decomposition) mappings M1 and M2, let Am1 and Am2 represent the concatenation relationships of M1 and M2 respectively, and let |Am1| and |Am2| denote the number of concepts in Am1 and Am2. Am1 and Am2 are fusion equivalent iff:


Algorithm 2. Subsumption Fusion(M1, M2, ε)

Input: M1, M2 are two subsumption mappings, ε is the threshold of semantic similarity
Output: FL is the fusion connection list

i = 1, FL ← ∅;
foreach C2j in M2 do
    switch the semantic similarity between C1i and C2j do
        case JS(C1i, C2j) ≥ 1 − ε
            // create the direct fusion connection.
            FL ← FL + Fcd(C1i, C2j, M′), C1i ∈ M1;
        case MSP(C1i, C2j) ≥ 1 − ε
            // create the subsumption fusion connection.
            FL ← FL + Fcs(C1i, C2j, M′), C1i ∈ M1;
        case MGC(C1i, C2j) ≥ 1 − ε
            // create the subsumption fusion connection.
            FL ← FL + Fcs(C2j, C1i, M′), C1i ∈ M1;
    end
    i = i + 1;
end
return FL

i) Am1 and Am2 are Full-Meanings Concatenations, and |Am1| = |Am2|;
ii) for every i such that C1i is connected to C1,i+1 via Concatenation1,i+1 and C2i is connected to C2,i+1 via Concatenation2,i+1, it holds that (JS(C1i, C2i) ≥ 1 − ε) ∧ (JS(C1,i+1, C2,i+1) ≥ 1 − ε) ∧ (JS(Concatenation1,i+1, Concatenation2,i+1) ≥ 1 − ε).
We use Am1 ≅ Am2 to denote the fusion equivalence of Am1 and Am2. If |Am1| > |Am2|, suppose |Am1| = m and |Am2| = n, so m > n; if Am1,n denotes the first n concepts of Am1 and Am1,n ≅ Am2, then we say Am1 and Am2 are in fusion subsumption, denoted Am1 ⊒ Am2 (or Am2 ⊒ Am1 if n > m). We use AU1 and AU2 to denote the concepts of Am1 and Am2 respectively whose concatenations are None-Meanings Concatenations, and AN1 and AN2 to denote the concepts whose concatenations are Full-Meanings Concatenations. We use AE1 and AE2 to denote the concepts of AN1 and AN2 that are in a fusion equivalence or fusion subsumption relationship with each other. The fusion connections are computed by Algorithm 3.
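A much-simplified Python sketch of this composition/decomposition case is given below; pairing the Full-Meanings concepts by position stands in for the fusion equivalence/subsumption test of Algorithm 3, and all names are illustrative assumptions.

def composition_fusion(ae1, ae2, au1, au2, js, eps):
    fl = []
    # Full-Meanings concatenations: connect concepts assumed to be fusion
    # equivalent; Algorithm 3 additionally truncates the longer chain when the
    # chains only stand in fusion subsumption.
    for c1, c2 in zip(ae1, ae2):
        fl.append(("Fcc", c1, c2))
    # None-Meanings concatenations: connect every sufficiently similar pair.
    for c2 in au2:
        for c1 in au1:
            if js(c1, c2) >= 1 - eps:
                fl.append(("Fcc", c1, c2))
    return fl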
3. The last step of ontology integration is canonical fusion, which merges the concepts of a fusion connection into one concept if the fusion connection type is Fcd or Fcc, and adds a real relation connection between the concepts if the fusion connection type is Fcs. For example, for the fusion connection Fcd = (C1, C2, M), where C1 and C2 are concepts of different ontologies, the two concepts are merged into one concept (C1, C2) and the whole hierarchy of the concepts is kept. However, not all concepts with the same mapping relation can be merged into one concept; only concepts that have a strong mapping relation can be merged. For example, if the


Algorithm 3. Composition Fusion(M1, M2, ε)

Input: M1, M2 are two composition (decomposition) mappings, ε is the threshold of semantic similarity
Output: FL is the fusion connection list

FL ← ∅;
// Process the Full-Meanings concatenations
foreach Ci ∈ AE1, Cj ∈ AE2 do
    if Ci ≅ Cj then
        // Ci and Cj are fusion equivalent.
        FL ← FL + Fcc(Ci, Cj, M′);
    else
        if |Ci| > |Cj| then
            let Ci′ ⊆ Ci and Ci′ ≅ Cj, FL ← FL + Fcc(Ci′, Cj, M′);
        else
            let Cj′ ⊆ Cj and Ci ≅ Cj′, FL ← FL + Fcc(Ci, Cj′, M′);
        end
    end
end
// Process the None-Meanings concatenations
foreach C2j ∈ AU2 do
    foreach C1i ∈ AU1 do
        if JS(C1i, C2j) ≥ 1 − ε then FL ← FL + Fcc(C1i, C2j, M′)
    end
end
return FL

mappings M(C1, C2, R, v1), M(C2, C3, R, v2) and M(C1, C3, R, v3) satisfy the strong mapping property, we can merge the concepts C1, C2 and C3 into one concept (C1, C2, C3); otherwise, we have to merge them into two concepts (C1, C2) and (C2, C3).

Discussion and Conclusion

The semantic data grid service mechanism presented in this paper wraps various information sources through a semantic fusion mechanism, uses a mediator-wrapper architecture to support heterogeneous data sources, and employs the mediator structure to realize a virtual data grid service which supports a semi-structured information retrieval language.
Because XML is rapidly becoming a language of choice to express, store and query information on the web, other kinds of web information, such as HTML-based web information, can be transformed into XML-based information with annotation technologies. Users can query information with XML languages; XPath-style XML query languages can retrieve information from different grid nodes, and languages such as XQuery and XUpdate are suitable for retrieving
information in distributed integration systems. It is natural to extend the XML query algebra to support semantic querying on the grid; some work has focused on this topic [9].

Acknowledgment
This work was partially supported by the NSF (Natural Science Foundation) of China under grant number 60425206, by the China Postdoctoral Science Foundation under grant number 20060400275, by the Jiangsu Postdoctoral Science Foundation under grant number 0601009B, and by a grant from the NSF of Hubei Province of China under grant number 2005ABA235.

References
1. Levy, A.Y., Rajaraman, A., Ordille, J.J.: Querying heterogeneous information sources using source descriptions. In: Proceedings of the 22nd VLDB Conference, Mumbai, India, Morgan Kaufmann Publishers Inc. (1996) 251-262
2. Antonioletti, M., Atkinson, M., Baxter, R., et al.: The design and implementation of Grid database services in OGSA-DAI. Concurrency and Computation: Practice and Experience 17 (2005) 357-376
3. Wöhrer, A., Brezany, P., Tjoa, A.M.: Novel mediator architectures for Grid information systems. Future Generation Computer Systems 21 (2005) 107-114
4. Zhuge, H., Liu, J.: A Knowledge Grid Operation Language. ACM SIGPLAN Notices 38 (2003) 57-66
5. Gu, J., Xu, B.: XML based Semantic Query Mechanism on Grid. In: ISPA 2006 Workshops, LNCS 4331, Springer Verlag (2006) 532-541
6. Sheng, Q.J., Shi, Z.Z.: A Knowledge-based Data Model and Query Algebra for the Next-Generation Web. In: Proceedings of APWeb 2004, LNCS 3007 (2004) 489-499
7. Gu, J., Zhou, Y.: Ontology Fusion with Complex Mapping Patterns. In: Proceedings of the 10th International Conference on Knowledge-Based, Intelligent Information and Engineering Systems, Bournemouth, United Kingdom, LNCS 4251, Springer Verlag (2006) 738-745
8. Kwon, J., Jeong, D., Lee, L.S., Baik, D.K.: Intelligent semantic concept mapping for semantic query rewriting/optimization in ontology-based information integration systems. International Journal of Software Engineering and Knowledge Engineering 14 (2004) 519-542
9. Gu, J., Hu, B., Zhou, Y.: Semantic Query Planning Mechanism on XML based Web Information Systems. In: WISE 2006 Workshops, Wuhan, China, LNCS 4256, Springer Verlag (2006) 194-205

SOF: A Slight Ontology Framework Based on Meta-modeling for Change Management

Li Na Fang, Sheng Qun Tang, Ru Liang Xiao, Ling Li, You Wei Xu, Yang Xu, Xin Guo Deng, and Wei Qing Chen

State Key Lab of Software Engineering, Wuhan University, 430072, Wuhan, China
dpetfln@sina.com

Abstract. The importance of an efficient E-Government services change management system is increasing due to the evolution of such services. However, most system management tasks are still performed manually, which is error-prone, time-consuming and labour-intensive. We therefore present a Slight Ontology Framework (SOF) to perform semi-automatic change management. The main ideas are the following: first, SOF uses a set of ontologies to describe E-Government services and introduces meta-modeling theory to analyze the features of changes. Second, according to the characteristics of these services, it reduces the description capability of OWL-S and combines it with Business Process Modeling theory to achieve higher flexibility and easier system implementation. Even though we use the E-Government domain as the example, the approach is a general solution for other domains.
Keywords: Meta-ontology, semantic web services, framework.

1 Introduction
The most important challenge for E-Government services is to adapt to the complex changes in their environment, as well as in their internal structures and processes. E-Government services must be concerned with new ways of working efficiently with citizens, enterprises or other administrations. They need continuing change management to handle dynamic modification problems in time. Note that such dynamic refinement changes are not a matter of rewriting the project; they are high-level modifications which must preserve the syntactic and semantic consistency of the changed services. Here, we present an approach to resolve this problem.
This approach uses semantic technology to enrich the current implementation mechanisms of E-Government service processes to support more efficient management. Most current Web Services description and composition languages lack semantic expressivity at the level of business processes [1]. Therefore, specifications of domain-specific constraints, which must be taken into account during process construction, need to be provided, and the business process flow must be defined at the abstract task level without consideration of the details of specific service bindings and execution flow. To model this abstract representation of services, a set of ontologies is used to describe them, besides the
consistency of service descriptions, possible changes, as well as their resolutions. Therefore, the evolution of distributed and independent ontologies [2] is the focus of study in our approach. Even though the approach is presented in the application domain of E-Government services, it is a general solution for other domains that use semantic web services.
The paper is organized as follows: in Section 2, SOF is introduced. The approach is then presented in Section 3 and the partial implementation is given in Section 4. Before the conclusion (Section 6), we present an overview of related work (Section 5).

2 SOF: A Slight Ontology Framework Based on Meta-modeling


In this section, a new ontology framework is given. A set of ontologies is introduced to describe services. A meta-ontology is used to model the dependency between business rules and service implementation, in order to define the business process flow of static services and the propagation of changes.
2.1 Ontologies Used to Model Services
SOF uses the evolution of a meta-ontology to manage change, so a set of ontologies used to model services is presented. We classify them as follows: the meta-ontology, which contains the entities used to describe services; the domain ontology, which contains the specific domain knowledge; the service ontology, which describes the concrete services; the organization ontology, which contains structural concepts of the government organization; and the law ontology, which models the structure of law documents. The relationships between them are shown in Fig. 1.

Fig. 1. The top level ontologies used to describe services

The aim of this management system is to relieve public administrators of a great amount of management tasks, in order to improve maintenance efficiency and reduce cost. Here, we do not focus on dynamic service composition, but on static web services and the precise definition of business rules (i.e. laws), in addition to the dynamic binding of services during the execution of static services. In the following part of this section, we therefore introduce the two major parts of the meta-ontology: the profile part, which is used for service discovery, and the process part, which is used for describing the business process flow.


2.2 The Profile Part


Although the profile of SOF is similar to the profile of OWL-S [3], it is extended in several ways. For example, the property hasReferencedBusinessRule is used to establish the reference between the service descriptions and the business knowledge represented by ontologies. The business rules ontology depends on the application domain knowledge; in this paper, the business rules ontology is the law ontology [4], since the application domain is E-Government.
Secondly, from the point of view of business process modeling, some entities are added to help business personnel establish the business rules ontology (i.e. the service ontologies). For example, the property requires and the concepts human and equipment, which refer to the human resources, software or hardware involved in the implementation of services, are introduced.
Finally, the rules hidden in the knowledge can be translated into a set of constraint rules in the corresponding ontologies to perform the referencing tasks. This is implemented in the following steps: first, the standard metadata of the domain knowledge is transferred to properties of the meta-ontology; then, according to the structure of the domain knowledge, some concepts are introduced for service discovery.
2.3 The Process Part
Similar to the OWL-S process ontology, we consider the process part from the business rule modeling point of view, which distinguishes between services and control constructs. For every service, we define a standard set of properties that includes the name, description, etc.

Fig. 2. The top level ontologies used to describe services.

To satisfy specific requirements such as security, cost and trackability, specific properties are introduced. A service also has inputs and outputs, similar to OWL-S; the difference is that they are defined in the domain ontology. The concept
Reference is added, as well as properties related to it. The properties hasFirst and ConsistOf are provided to describe the composite service flow. The control constructs are defined as sequence, split, join and if-then, according to the specifics of the E-Government domain. Moreover, some properties between services and control constructs are given. The details are shown in Fig. 2.

3 Approach
Change management is discussed in this section as follows: first, changes and consistency are defined; then the propagation of changes, from business rules to services and within services, is introduced.
3.1 Changes and Consistency
To manage change efficiently, meta-modeling is used to analyze the features of the ontologies. Changes are classified into two kinds: basic changes (Add and Sub) and complex (i.e. composed) changes such as Modify, which can be achieved by the composition of basic changes. For example, a ModifyConcept change can be achieved by the sequential composition of a SubConcept change followed by an AddConcept change.
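A small Python sketch (ours, not SOF's API) of how such a composed change can be expressed as a sequence of basic changes; the class and field names are illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class BasicChange:
    op: str        # "Add" or "Sub"
    target: str    # e.g. "Concept", "Input", "BusinessRule"
    value: object

def modify_concept(old_concept, new_concept) -> List[BasicChange]:
    # ModifyConcept = SubConcept(old) followed by AddConcept(new).
    return [BasicChange("Sub", "Concept", old_concept),
            BasicChange("Add", "Concept", new_concept)]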
However, we need to extend the granularity level of changes to allow better management of changes in a service description. Changes should be defined for the concepts of a service and the relations among them, besides the input, output, preconditions etc. of a service. The full set of basic changes is defined in Table 1.
Table 1. The set of basic changes of the ontologies used by SOF

                        Additive changes            Subtractive changes
Service                 AddService                  SubService
ServiceSpecialization   AddServiceSpecialization    SubServiceSpecialization
Input                   AddInput                    SubInput
Output                  AddOutput                   SubOutput
Precondition            AddPrecondition             SubPrecondition
Result                  AddResult                   SubResult
Business Rule           AddBusinessRule             SubBusinessRule
Human                   AddHuman                    SubHuman
Equipment               AddEquipment                SubEquipment
Software                AddSoftware                 SubSoftware
PreviousConnection      AddPreviousConnection       SubPreviousConnection
NextConnection          AddNextConnection           SubNextConnection

To define the consistency of the ontologies in SOF, the consistency of an ontology must first be defined, since the ontologies in SOF are used to model services. Ontology consistency [5] is defined as follows: an ontology is consistent with respect to its model if and only if it preserves the constraints defined for the underlying ontology model.


Moreover, ontologies may include other ontologies. Dependent ontology consistency is defined as follows [6]: a dependent ontology is consistent if the ontology itself and all its included ontologies, observed alone and independently of the ontologies in which they are reused, are ontology consistent.
Finally, we define the consistency of semantic web services as follows: a semantic web service is consistent only if its description is dependent ontology consistent and the additional constraints (C1-C11) are satisfied. The constraints are defined as follows:
C1. Each service has to have a reference to at least one business rule.
C2. Each service has to have at least one human or equipment that controls its
execution.
C3. Each service has to have at least one software component attached to it.
C4. Each service has to have at least one input and one output.
C5. Each service input has to be either the output of some other service or be specified by the end-user.
C6. If the input of a service is the output of another service, then it has to be
subsumed by this output.
C7. If the input of a service subsumes the input of the next service, then its
preconditions have to subsume the preconditions of the next one.
C8. If two services are subsumed by the same service, then their preconditions
have to be disjoint.
C9. If a service specializes another service, one of its parameters (i.e. inputs,
outputs, preconditions or results) has to be different. The difference can be
achieved either through the subsumption relation with the corresponding
counterpart or by introducing a new one.
C10. Inputs, outputs, preconditions and postconditions have to be from the domain
ontology.
C11. Any specialization of the service S1 must always be a predecessor of any
specialization of the service S2, where S1 and S2 are two services defined in the
Meta Ontology and their order is given in advance (i.e. S1 precedes S2).
Semantic web services must obey the constraints defined in this set of consistency constraints. Note that all constraints should be defined formally so that the description of services can be validated automatically; a sketch of such a check is given below.
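The sketch below illustrates, in Python, how the first few constraints could be checked against a service description; the attribute names (business_rules, humans, equipment, software, inputs, outputs) are assumptions made for illustration and are not SOF's actual data model.

def check_constraints(service) -> list:
    violations = []
    if not service.business_rules:
        violations.append("C1: no referenced business rule")
    if not (service.humans or service.equipment):
        violations.append("C2: no human or equipment controls the execution")
    if not service.software:
        violations.append("C3: no software component attached")
    if not (service.inputs and service.outputs):
        violations.append("C4: missing input or output")
    return violations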
3.2 Change Management Process
The aim of the change management process is to find the weak places in the descriptions of services, by considering changes in the business rules and their impact on consistency, and to discover inconsistencies in service descriptions, which can then be repaired to improve the agreement of the ontology with the business rules. A propagation procedure is therefore defined to make managers aware of changed business rules which may
give rise to inconsistencies. To implement this propagation, changes should be formalized and performed automatically.
The system must be able to automatically identify problems in services or their descriptions and rank them according to their importance. Once a problem arises, the system should assist the administrators in identifying its sources, analyze it and propose solutions and, if possible, help them decide which solution should be chosen to resolve it.
The change management process, shown in Fig. 3, is defined as follows: first, a request for a change is represented formally and explicitly. Second, to prevent inconsistencies raised by the requested changes, additional changes are derived to guarantee the consistency of services. Third, the requested changes and the additional changes are performed during the change implementation step. Finally, all dependent knowledge items are found and updated in the change propagation phase.

Fig. 3. Four Steps of the change management process
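The four steps can be read as a small pipeline; the following Python skeleton is only a hedged illustration of that control flow, with placeholder method bodies and names chosen by us.

class ChangeManager:
    def represent(self, request):
        # Step 1: represent the change request formally and explicitly.
        return {"op": request.get("op"), "target": request.get("target")}

    def derive_additional(self, change, services):
        # Step 2: derive additional changes that keep the services consistent.
        return []

    def implement(self, changes, services):
        # Step 3: perform the requested and additional changes.
        services.setdefault("applied", []).extend(changes)

    def propagate(self, change, services):
        # Step 4: find and update all dependent knowledge items.
        pass

    def manage(self, request, services):
        change = self.represent(request)
        self.implement([change] + self.derive_additional(change, services), services)
        self.propagate(change, services)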

4 The Partial Implementation of SOF


SOF is a slight ontology framework to manage changes. A system based on SOF has been developed which is much more than a standard framework for creating, modifying, querying and storing ontology-based descriptions of semantic web services. It supports change evaluation management, which includes modeling services, discovering services, composing services as well as reconfiguring services. The SOF platform is roughly presented in Fig. 4. It can be divided into three layers:
1. Services Application Layer: it realizes the UI applications and provides interfaces to the Service Editor, used to edit the semantic descriptions of E-Government services; the Visual Ontology Editor, used to edit the domain ontology, organization ontology and law ontology; and the Service Registry.
2. Middleware API Layer: it is the core part of the change evaluation management architecture. Fig. 4 shows its main modules: (i) the service management module; (ii) the consistency check module; (iii) the change management module, which is the core part of the middleware API layer and includes the change implementation, change maintenance, change propagation and evaluation submodules; and (iv) the service implementation module.
3. Data Storage Layer: it provides data storage facilities, including storage of service data, ontology data, reasoning data, change data and log data.


Fig. 4. The architecture of the change management system

5 Related Works
In this paper, a slight ontology framework, SOF, is presented to perform semi-automatic change management and thus improve on current management practice. Our approach brings semantic technology and the experience of business rules modeling into SOF. Its aim is to provide continuing change management, which is lacking in Web Services.
In the research on Semantic Web Services and workflow, there are proposals that support dynamic service composition [7][8] and that address extensibility and flexibility [9][10][11], which overlap with some of our ideas. In particular, [9] presents a technique to generate composite services from high-level declarative descriptions and defines formal safeguards for meaningful composition through the use of rules. [11] takes the underlying business process into consideration next to the data and presents a three-level schema architecture for the conceptual design of dynamic web-based information systems that supports workflow tasks and e-commerce transactions besides querying.
There are many graphical tools (ARIS, Adonis) which lay out a process model and draw connections among steps, but these tools lack formal methods for verifying properties of processes. In contrast, our approach allows users to formally specify consistency constraints. Ontologies and rules are used to represent this kind of background knowledge and users' needs. Moreover, our system can check the service profile and propose suggestions for resolving the problems.

6 Conclusions
E-Government systems are subject to continual change. The importance of good change management is nowadays even greater due to the evolution of E-Government services. It is clear that the management of changes in E-Government
must be treated in a more systematic way in order to avoid drawbacks in the long run.
In this paper we present an approach for ontology-based change management. Our approach goes beyond a standard change management process; it is a continual improvement process. The novelty of the approach lies in the use of formal methods for achieving consistency when a problem is discovered and in the formal verification of the service description.
In the future we will continue to improve this system in the following ways: first, in order to improve the efficiency of the system, the system implementation and experimental evaluation will be completed. Second, we will extend our approach by adding a change suggestion function, which can improve services by monitoring their execution and taking into consideration advice from managers or end-users.

References
1. BPEL4WS. http://www-128.ibm.com/developerworks/cn/webservices/ws-theme/ws-bpel/
2. Maedche, A.: Managing multiple and distributed ontologies on the Semantic Web. The VLDB Journal, Vol. 12 (2003) 286-302.
3. OWL-S (2005). http://www.w3.org/Submission/OWL-S/
4. Gangemi, A.: Some Ontological Tools to Support Legal Regulatory Compliance - with a Case Study. Workshop on Regulatory Ontologies, OTM'03 (2003) 607-620.
5. Stojanovic, L., Abecker, A.: The role of semantics in e-government service model verification and evolution. Semantic Web meets eGovernment, AAAI Spring Symposia, AAAI (2006).
6. Maedche, A.: Managing multiple and distributed ontologies on the Semantic Web. The VLDB Journal - Special Issue on Semantic Web, Vol. 12 (2003) 286-302.
7. Arpinar, I.B.: Ontology-driven Web services composition platform. Information Systems and E-Business Management, Vol. 3. Springer Berlin (2005) 175-199.
8. Medjahed, B.: Composing Web services on the Semantic Web. The VLDB Journal, Vol. 12. Springer Berlin (2003) 333-351.
9. Medjahed, B.: Composing Web services on the Semantic Web. The VLDB Journal, Vol. 12. Springer Berlin (2003) 333-351.
10. Duke, A.: Enabling a scalable service-oriented architecture with semantic Web Services. BT Technology Journal, Vol. 23. Springer Netherlands (2005) 191-201.
11. Preuner, G.: A three-level schema architecture for the conceptual design of web-based information systems: from web-data management to integrated web-data and web-process management. World Wide Web, Vol. 3. Springer Netherlands (2000) 125-138.

Data Forest: A Collaborative Version


Ronan Jamieson, Adrian Haffegee, Priscilla Ramsamy, and Vassil Alexandrov
Advanced Computing and Emerging Technologies Centre,
The School of Systems Engineering, University of Reading,
Reading, RG6 6AY, United Kingdom
r.jamieson@reading.ac.uk

Abstract. As we increase our ability to produce and store ever larger amounts of data, it is becoming increasingly difficult to understand what the data is trying to tell us. Not all the data we are currently producing fits easily into traditional visualization methods. This paper presents a new and novel visualization technique based on the concept of a Data Forest. Our Data Forest has been developed to be utilised by virtual reality (VR) systems, since VR is a natural information medium. The approach can easily be adapted for use in collaborative environments. A test application has been developed to demonstrate the concepts involved, and a collaborative version has been tested.
Keywords: Data Forest, collaborative virtual environments, data visualization.

1 Introduction

The search to efficiently extract information and meaning from data has long preoccupied mankind. It has led humans to try many different methods of presenting data, from hieroglyphics to printed tables, but they mainly rely on our primary sense, which is sight. This is due to the fact that humans have a highly developed visual system that has evolved over time from the need to survive (i.e. tracking food sources) to more complex requirements like entertainment (e.g. pattern matching on Rubik's cubes). Humans excel at pattern recognition and are capable of gaining better insight and understanding from it. Based on this ability, we find it possible to make intuitive decisions even if data is incomplete or missing. Any method that uses a 3D object approach will be more effective, because this is the natural and intuitive way in which humans view nearby objects.
Current methods used to extract knowledge from variable data sets that do not fit into traditional structures do not necessarily scale well to large multi-dimensional levels. Therefore a new approach is required that overcomes this problem and utilises the recent advances in computer and graphics technology. A VR Data Forest fulfils this requirement. VR is a discipline that enables users to view large complex data sets in an intuitive and natural manner; for example, a 3D object can be a graphical representation of a particular data value. Immersive
Projection Technology (IPT) systems like CAVEs [3] or powerwalls are becoming increasingly popular VR systems for high-end visualization and analysis of large multi-parameter data sets. The feeling of immersion, coupled with natural forms of interaction, is particularly useful when working with such data sets.
VR is currently being utilised by a wide range of industries to gain insight into their data sets; these industries range from gas and oil (e.g. seismic data) to the medical profession (e.g. pre-surgery planning). Using VR it is possible to network Virtual Environments (VEs) together and create a Collaborative Virtual Environment (CVE). This has been defined as a software system in which multiple users interact with each other in real time, even though those users may be located around the world [11].
This ability to work collaboratively with others has expanded the skill base that can become involved in a project by removing the location barrier: experts from different locations can now efficiently and effectively work on collaborative projects. It is therefore becoming a more common project requirement. The next section describes related work in collaborative environments and data visualization, Section 3 gives a detailed explanation of the concepts involved, Section 4 examines some potential application areas, and Section 5 outlines the conclusions and discusses future work.

2 Related Work

The most common approach to developing CVEs that allow users to interact with other users and data has been to build applications on top of different networking platforms and protocols. In VR this has been done with either COVEN [8], DIVE [1] or CAVERN [6]. COVEN has its roots in DIVE, as it took lessons learned from that effort and other collaborative environments. DIVE uses a peer-to-peer approach to networking the environments; this platform allows for the creation of interactive environments, but uses other software for its rendering. CAVERN has an efficient database/network library, but needs additional applications to create a CVE (e.g. CoAnim [2]), which limits its functionality.
The scientific community has been developing different approaches to analysing and exploring large multi-dimensional data sets for a number of years. These have resulted in highly advanced tool sets for their needs; for example, applications like VTK [9], AVS [12] and OpenDX [7] are available and widely used. Other communities have relied on traditional methods, namely data mining techniques. These rely on statistical techniques, genetic algorithms and neural networks to uncover the relevant data. The data is then usually visualized using 2D graphs, spreadsheets, or quasi-3D structures on a 2D medium (e.g. a desktop monitor or projector). A further disadvantage of data mining techniques is the lack of interaction and rapid feedback to the user. Examples of data mining applications that take advantage of visualization techniques are MineSet [10] and Iris Explorer [4], but they use 2D and quasi-3D structures to represent the data. Furthermore, these methods do not scale well to large multi-dimensional data sets. New methods are required that utilise the advantages of new technology (e.g. VR).


3 A Data Forest

A Data Forest is a graphical representation of complex data that uses the concept of data trees (see Figure 1 (a) and (b)) to represent the data. Each data tree consists of a trunk and a crown; these represent different parameters of the chosen data set. Depending on which type of VR system the user has access to, the user can then walk and/or navigate through the forest. This allows the user to effectively spot trends, discover relationships or uncover unusual patterns in the trees, which can then lead to further examination of the underlying data. To increase the flexibility of this approach, the forest can be either static or dynamic, depending on the nature of the data to be visualized; a dynamic forest allows for the use of time-variant information. A data tree has been chosen because it is a simple object that people are familiar with: people can relate to making decisions based on the size and shape of a tree relative to another tree.

Fig. 1. Data Forest - (a) Flying over the forest; (b) Inside the forest

Up to eight different parameters can be visualized in any one data tree. This number is achieved by matching the physical properties of the tree (i.e. trunk and crown) to different parameters within the data set:

- position of the tree
- height of trunk
- width of trunk
- colour of trunk
- transparency of the trunk
- radius of crown
- colour of crown
- transparency of crown

It is possible to expand the number of parameters even further by using all axes of the tree position and by inverting the tree (i.e. placing the crown below the trunk).
Therefore we do not have to have a flat forest; the trees could be positioned at various heights and depths.
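A data tree can be captured by a very small record type; the following Python sketch lists the eight visual properties named above plus the optional inversion, with field names chosen by us for illustration.

from dataclasses import dataclass
from typing import Tuple

@dataclass
class DataTree:
    position: Tuple[float, float, float]     # placement of the tree in the forest
    trunk_height: float
    trunk_width: float
    trunk_colour: Tuple[float, float, float]
    trunk_alpha: float                        # transparency of the trunk
    crown_radius: float
    crown_colour: Tuple[float, float, float]
    crown_alpha: float                        # transparency of the crown
    inverted: bool = False                    # crown below the trunk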
Graphically a tree is a simple object, as it consists of just two primitive objects, a sphere and a cylinder. Simple objects are desirable because they retain their shape even when a low graphical computation is required (i.e. a tree with a low number of polygons will still resemble a tree object). Therefore, when large numbers of them have to be rendered, it is still possible to retain a fast and smooth environment. This is an important feature that allows our approach to be used by a wider range of hardware platforms. By using VR we are able to develop an interactive VE; this gives us the ability to interact with the data in a more natural manner, especially when using an immersive VR system. From this interaction it is possible to represent different levels/layers of abstraction of the data by creating different Data Forests. Drilling down is a common term in data mining techniques, but we have taken the approach of exposing the roots of the data tree. Users can choose a particular tree that they are interested in using their input device and then, using a 3D menu (see Figure 3(a)), choose to further examine the related data. This gives the users an opportunity to examine in more detail the data that is relevant to the chosen tree (see Figure 3(b)). It is possible to return to the original Data Forest or descend to a lower layer (i.e. assuming there are further layers below the current layer) by using the 3D menu.
Figure 2 (see below) outlines a simple overview of the test application that has been developed. The networking of the Data Forest VEs was a logical step which benefits the users. This has been achieved using the structure proposed in Collaborative Virtual Visualization (CVV) [5], which also allows us to be hardware and platform independent. Following the CVV approach, we represent remote users as avatars and communicate with them via audio and video links. The avatars are mapped to the remote users' positions and orientations, so we are capable of determining gestures (e.g. pointing to a particular tree) and location within the Data Forest. User decisions in each environment (i.e. remote and local) are reflected in the connected environment by the use of event messages.
A data filter can be applied to the input data to check its validity; this filter is in the form of a reference table. Once the data has been verified, a data structure is created for each tree containing all the data that is relevant to that tree (e.g. trunk size, tree name, etc.). These data structures are used in the rendering of the CVE; event messages that allow the environments to be updated to reflect the users' activities are passed over the network. For example, when a user chooses a tree to examine further, an event message is passed to the other remote users' applications with all the relevant information to inform them that a change has occurred, and each application updates its user's VE to reflect this decision. A design aim has been to create a modular application that easily allows specific functionality to be added, depending on the data type to be analysed. An example of this is discussed in the application section concerning stock market data.
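As an illustration only (the actual CVV message format is not specified here), an event message announcing that a user wants to examine a tree could be serialised as follows in Python; the field names are assumptions.

import json

def examine_tree_event(user_id: str, tree_id: str) -> bytes:
    # The remote application uses the message to update its copy of the VE.
    return json.dumps({"type": "EXAMINE_TREE",
                       "user": user_id,
                       "tree": tree_id}).encode("utf-8")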


Fig. 2. Overview of Data Forest Structure

4 Application Areas

There are numerous areas that would benefit from using a Data Forest approach. The following areas were chosen due to the diversity of their data, and further work is taking place in them.

4.1 Stock Market Data

The daily movement of the stock market, with its wide range of dierent parameters produces a vast amount of time dependant data. The ability to analyse this
movement correctly and predict future changes has the potential to generate a
considerable amount of money. Therefore this data is subject to many dierent
traditional visualization techniques (e.g. 2D graphs, spreadsheets). Our approach
is more suited to what would be considered o-line (i.e. non real-time) studies,
but with some of the improvements mentioned in the future work it could then
have the potential to move to real time use.
The test application was developed to visualize this type of data. This example
was chosen to generate the forest, due to the dierent number of parameters that
would be found in the data set. An example of some of the potential parameters
that could be in the data set are share price, volume of trades, whether the share
provides a dividend and how much, the percentage yield, the capital appreciation
of the company, total number of shares in company, etc. Some of these parameters
are highly dynamic (e.g. share price, volume of trades), but others are of a static
nature for a long period of time (e.g. total number of share, amount of dividend).
Therefore it is possible to match the movement of share prices with the changing
shape/growth of the forest.
By utilising the structure proposed in CVV a collaborative version has been
developed, remote users are represented by avatars and changes in one environment are then reected in the other users environment by using event messages.
A text grid is laid over the forest so that it is possible to associate each tree

with a particular company (i.e. a tree with an NGC text label above it refers to the National Grid Company). The forest is laid out on an alphabetical grid to ensure easier navigation if a user is interested in a particular company. The main methods of interaction with the Data Forest environment are the use of 3D menus and an input device (e.g. a wand in an immersive environment). The use of 3D menus allows the user to easily interact with the data and the environment. Currently the menus contain the following functionality:

Stop/Start Simulation. Changing the forest from static mode to a dynamic or simulation mode (i.e. stepping through the time-dependent data). In simulation mode the forest is changed every set time period to reflect the new time-variant data. It is possible to pause/stop the simulation at any time; there is also a step forward/back one time period function. This can be useful when the user discovers an interesting feature in the forest.
Increase/Decrease Size. Scaling the size of the forest allows for different methods of analysis: scaling down gives the users a better overview of the complete forest, while scaling up allows for more natural interaction and exploration of the data (i.e. the trees could be made life-size).
Examine Tree. If the user decides to examine a particular data tree in more detail (i.e. exposing the roots of the tree), all that is required is that the user touches the desired tree with their input device (see Figure 3(a)) and then selects the examine function on the 3D menu. A new forest is rendered based on the data related to that tree. Once the user is finished with this view (see Figure 3(b)), they can return to the previous forest.
Predict Tree. While in the examine mode it is possible to carry out further analysis with modules such as the predict tree function. This calculates an average crown and trunk value for the particular forest and then creates a tree based on these values, which is then added to the forest. Currently this is a very simple mathematical approach, but it could easily be replaced by a more complex algorithm.
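A minimal Python sketch of this averaging step, assuming each tree is a dictionary with the (illustrative) keys trunk_height, trunk_width and crown_radius and that the forest is non-empty:

def predict_tree(forest):
    # Average the trunk and crown values of the current forest and build a
    # new predicted tree from them.
    n = len(forest)
    mean = lambda key: sum(tree[key] for tree in forest) / n
    return {"trunk_height": mean("trunk_height"),
            "trunk_width": mean("trunk_width"),
            "crown_radius": mean("crown_radius")}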
4.2 Network Data

Computer networks are constantly being monitored and analysed to ensure the efficient transfer of data over the network. The data produced requires a method of visualization that allows for time-dependent analysis. By changing the axis of the Data Forest from vertical to horizontal, we can use the forest to represent a computer network. The different parameters that could create a data tree are the different network bandwidths, the number of servers on the system, the number of users connected to a server, the load on a server, and the amount of memory available and consumed at any one time by a server. This could be visualized in real time, as discussed in the previous section, or used for historical analysis. Another use is as a forensic tool to simulate network conditions at the time of a failure, so that the user could navigate around the forest and examine all the factors that could


Fig. 3. (a) User selecting a tree via the 3D menu; (b) The underlying data

have contributed to the problem. Companies are always interested in maximising their network usage; therefore they must visualize their current structure and flows. Functionality similar to that used in the stock market application could be reused here via the 3D menus.

4.3 Marketing Data

Consumers are continually producing data of interest through their shopping habits and preferences, which manufacturing companies collect and store. Advances in methods of gathering this data, like the use of radio frequency ID (RFID) tags on products, have increased the availability and complexity of this data. The data is therefore extremely large and multi-variable. Analysing it is the focus of the marketing departments within these companies. The data is extremely important to them, but it also has a very short life span, due to the short time to market of products and fast-changing customer demand. Therefore any insight into the shopping behaviour of customers is vital. An example of this could be point-of-sale data, which would produce parameters such as the frequency of purchase, cost of purchase, number of associated purchases, profit margin on the purchase, place of purchase and many more. This data is not as time-dependent as the other examples, but analysing the purchasing decisions of customers using traditional methods is complex and not intuitive. Therefore this is another area that could use a Data Forest, with features similar to those developed in the test application.

5 Conclusion and Future Work

This paper has outlined the concept of a Data Forest, a new and novel approach that uses VR to visualise data that does not easily fit traditional methods. A test application has been developed utilising stock market data, and a collaborative version has been successfully incorporated into the collaborative structure proposed by CVV. Our modular approach is flexible
and allows for the reuse of different functions within different contexts; by just changing modules, different types of data sets can be loaded into our environment.
Future work will include incorporating Really Simple Syndication (RSS) feeds directly in the collaborative environment in the form of a scrolling text bar. The option to extract this data from the RSS feed and use it to update the relevant data trees accordingly will be investigated; this would allow real-time information to be incorporated into our Data Forest. The inclusion of virtual maps of the forest for large environments, so that users can see the position of a remote user who is out of visual range and thus search a larger area, will also be researched. We aim to investigate allowing the user to specify how they would like the forest to be arranged, to enable greater ease in searching the forest for particular areas of interest (e.g. in the stock market application the user could group the trees according to industries).

References
1. C. Carlsson et al., DIVE: a platform for multi-user virtual environments, Computers & Graphics, Vol. 17, No. 6, 1993, pp. 663-669.
2. CoAnim, Electronic Visualization Laboratory (EVL), University of Illinois at Chicago, 2005, http://www.evl.uic.edu/cavern/agave/coanim.html.
3. C. Cruz-Neira et al., The CAVE: audio visual experience automatic virtual environment, Communications of the ACM, Vol. 35, No. 6, 1992, pp. 64-72.
4. D. Foulser, Iris Explorer: a framework for investigation, Computer Graphics, 29(2), 1995, pp. 13-16.
5. R. Jamieson et al., Collaborative Virtual Visualization, First Austrian Grid Symposium, Austria, 2005.
6. J. Leigh et al., CAVERN: A Distributed Architecture for Supporting Scalable Persistence and Interoperability in Collaborative Virtual Environments, Journal of Virtual Reality Research, Development and Applications, Vol. 2.2, 1997, pp. 261-296.
7. B. Lucas et al., An architecture for a scientific visualization system, Proceedings of Visualization '92, 1992, pp. 107-114.
8. V. Normand et al., Collaborative Virtual Environments: the COVEN Project, Proc. of the Framework for Immersive Virtual Environments Conf. (FIVE), 1996.
9. W.J. Schroeder et al., The design and implementation of an object-oriented toolkit for 3D graphics and visualization, Proc. of Visualization, ACM, 1996, pp. 93-100.
10. SGI Inc., MineSet: a system for high-end data mining and visualization, Int. Conf. on Very Large Databases (VLDB'96), Bombay, India, 1996, p. 595.
11. S. Singhal et al., Networked Virtual Environments: Design and Implementation, ACM Press, 1999.
12. C. Upson et al., The Application Visualization System: a computational environment for scientific visualization, Computer Graphics and Applications, 9(4), 1989, pp. 30-42.

NetODrom: An Example for the Development of Networked Immersive VR Applications

Christoph Anthes¹, Alexander Wilhelm², Roland Landertshamer¹, Helmut Bressler¹, and Jens Volkert¹

¹ GUP, Institute of Graphics and Parallel Processing, Johannes Kepler University, Altenbergerstraße 69, A-4040 Linz, Austria
canthes@gup.uni-linz.ac.at
² The Visioneers, Fabrikstraße 2, A-4020 Linz, Austria
http://www.thevisioneers.com

Abstract. Developing Networked Virtual Environments (NVEs) is a challenging task, and designers and programmers of such applications can encounter many pitfalls. For a public event, a night of science, a networked Virtual Reality (VR) application had to be designed and implemented. This paper describes the technical and conceptual design of this application, the NetODrom game. Possible pitfalls during the design and implementation of NVEs are pointed out and suggestions on how to overcome these issues are given. A closer look is taken at the problem areas of physics, network communication and graphical effects.

1 Introduction

Networked graphics applications are becoming more and more common nowadays. If we take a closer look at the areas of VR and NVEs, medical visualisation, safety training and architectural design are just a few application areas where users benefit from interaction in a networked environment. Not only the area of VR but also the area of multi-player games opens a vast market for networked graphics applications. The algorithms and distribution mechanisms in these two areas are very similar, but VR provides additional challenges like multi-wall displays and different interaction techniques, and it has tighter constraints regarding responsiveness.
This paper gives an overview of the design of a networked graphics application, taking a racing game as an example. Such a game is an ideal example, since it consists of the common components usually needed in NVEs. The description of the components and the overall design of the NetODrom game should give an idea of how to approach the development of an NVE. It should not be considered a generic approach, but rather an inspiring source of design ideas. Common issues which arise during the development of such a VE will be pointed out and suggestions on how to overcome these obstacles will be given.

The following section examines various design approaches in the field of VEs and NVEs. Afterwards, the conceptual design from an artistic point of view is given. Section 4 describes the application design of the NetODrom game. The following chapters focus on the developed physics engine, the network component and the effects used to enhance immersion and the gaming experience. The final sections conclude the paper and provide an outlook on future developments.

2 Related Work

In general, VR applications can be developed in many ways. Authoring systems allow experts of a given application domain who do not have much programming experience to design VEs using graphical editors. Other systems like DIVE [1], ALICE [2] or AVANGO¹ [3] rely on the use of scripting languages like TCL and Python. Through the use of graphical interfaces or scripting languages the design of a VE is more intuitive, but obvious drawbacks are limited flexibility and a higher need for computational power. Many domain-specific problems have to be solved by using workarounds. In most cases the applications are tailored specifically to fulfil the needs of the given problem domain. To avoid complexity and to allow for more intuitive structuring of a VE, scene graphs like Performer [4] are used.
To allow multiple participants inside a VE, network protocols have to be defined and connection topologies between the different participants have to be chosen. The communication protocols used are closely associated with the underlying topologies. Singhal and Zyda give a detailed introduction to NVE topologies and protocols [5]. A good overview which demonstrates the close connection between multi-user games and NVEs is given in [6].

3 Conceptual Design

In the NetODrom game two players compete in a virtual race. They drive their vehicles on a race track and try to reach the goal before the other player.

3.1 The Principle of Mutually Dependent Interaction

It was expected that users with different levels of experience with games and input devices would have to interact with each other. This would create an unfair situation during the game which could decrease the gaming experience: in a racing game the more experienced player is able to advance much faster than the inexperienced player, which greatly reduces the aspect of challenge. To overcome this problem, a game mechanism had to be found which keeps the players' vehicles in spatial proximity.
¹ Formerly known as AVOCADO.

Our approach guarantees this mutual interaction by coupling the players' vehicles with a rubber band mechanism, which keeps the vehicles below a maximal
distance. This mechanism generates interesting tactical game situations, where
the faster player pulls the slower player along. Both players can apply their
acceleration and braking forces to the other player's vehicle through the rubber band connecting them. The gaming idea is easy to understand and increases the
competition between the players. Even close to the end of the race it is still
possible to gain the lead.
3.2 Technical Implementation of 3D Models

Due to performance reasons the 3D models had to be designed using a low
number of polygons. Materials for the objects can be developed with the use of
photographic references. A distinction between movable, modular objects and
objects of the environment is made. Movable and modular objects (avatars,
vehicles and race track tiles) are textured with a diffuse general lighting situation
which is defined through the use of skylights and ambient occlusion shadows.
This type of lighting generates a strong plasticity and emphasises details. It
implements the common light distribution based on human perception and
is perceived as a balanced light by the observers. For the static objects of the
VE different daylight situations were defined for the different levels of the game.
The objects, background images and architectures are illuminated similarly to the
movable objects to achieve a level of consistency.
To combine the lighting situation with the lighting-neutral materials of the
geometries in common textures, a procedure called texture baking is needed.
This process brings the quality of pre-rendered images into a real-time
application.
The rendering process does not take the camera perspective as a basis for
the texture calculation, but the UVW coordinates of the corresponding 3D object. This
produces a polygon-based object projection of light and material attributes on
a 3D object per polygon. The result is mapped as a self-illuminated texture with
the same UVW coordinates onto the given objects. With that mechanism it is
possible to achieve an image quality which is normally known from rendered
animations and visualisations.

4 Game Architecture

The architecture of the NetODrom NVE consists of a network module, a database module, independent libraries for effects and a highly specific physics engine. A core module combines the independent parts and manages
the application. It acts as glue between the individual components and contains
the management of events. Graphical output is handled via the scene graph
OpenSG, and audio output is generated via the OpenAL audio library. Figure 1
shows the different components and their communication routes. Three different
classes of components can be identified, which are shown in different shades of
grey.


Fig. 1. Overview of the architecture

On the bottom and the left part of the figure the external libraries are
shown. The central part is formed by the actual application core and the modules.
Between the application core and the scene graph, additional modules which
are used to extend the underlying scene graph functionality are shown. These
additional modules are implemented as scene graph nodes since they represent
graphical effects or non-standard geometries, which will be explained in detail in
Section 4.4. Multi-threading is highly desirable for various components. The input device component and the network component should be independent of
the rest of the application and should be polled at certain intervals, to avoid
unnecessary waiting times for the application. If physics is used in a VR application it should be performed on a separate processor if available, since physics
calculation is normally a computationally intense task.
4.1 Structuring the VE

When designing a VE several options arise for the organisation of the scene
graph. Ideally nodes should be easy to cull and the structure should be organised
and stored in a manner that allows the application designer to easily change the
scene layout without changing the hierarchy of the graph or altering the source
code. In our example application, the NetODrom racing game, proprietary text
files define which objects are available in the particular scene and how they
are arranged. A VE as structured in the NetODrom application consists of
tiles, entities and environments. Additionally each player controls a vehicle to
which an avatar and particle systems representing the exhausts of the vehicle
are attached.
One environment represents a region of the race track. The tiles and entities
represent the visible objects in the world. Tiles are used to structure the race
track and can only be arranged on a grid. This approach has been chosen to allow
for an intuitive definition inside the configuration files, easy frustum culling and
efficient physics calculation. The entities do not have any direct technical purpose
and are used to create the atmosphere of the VE. They can be arbitrarily placed
inside an environment.


Fig. 2. The scene graph of the NetODrom application

An overview of the scene graph of the example application
is given in Figure 2. The organisation of a scene graph should in most cases be done
in a way that the different branches of the graph describe certain regions. Ideally
the upper nodes in the graph describe a spatially larger region and their child
nodes are located inside this region. A common mistake is grouping the different
nodes based on their logic, e.g. grouping all tiles or all entities in the scene
under one node. Since most scene graph implementations provide optimisations
like frustum culling based on the node hierarchy, it is recommended to organise
the scene based on the spatial position of nodes in the environment.
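As a rough illustration of this spatial grouping (a hypothetical, stripped-down C++ sketch, not the actual OpenSG structures used in NetODrom), each environment region owns the tiles and entities located inside it, so culling a region node implicitly culls all of its contents:

#include <memory>
#include <string>
#include <vector>

// Minimal scene graph node: children are grouped by spatial region,
// not by object type, so frustum culling can reject whole regions at once.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    Node* addChild(const std::string& childName) {
        auto child = std::make_unique<Node>();
        child->name = childName;
        children.push_back(std::move(child));
        return children.back().get();
    }
};

int main() {
    Node root;
    root.name = "scene";
    // One branch per environment region; tiles and entities live under
    // the region that spatially contains them.
    Node* env0 = root.addChild("environment_0");
    env0->addChild("tile_0_0");
    env0->addChild("tile_0_1");
    env0->addChild("entity_palm_tree");

    Node* env1 = root.addChild("environment_1");
    env1->addChild("tile_1_0");
    env1->addChild("entity_grandstand");
    return 0;
}

Grouping all tiles under a single logical "tiles" node would instead force the culling traversal to test every tile individually.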
4.2 Physics

A physics engine in the context of computer games and VEs simulates the behaviour of rigid bodies, which comprises the following tasks: rigid body motion,
collision detection and response, and constrained interaction of rigid bodies. Typically these calculations are performed in 3D space on simplified proxy geometries to reduce the computational cost. In our sample application we
decided to tailor a highly efficient, problem-specific physics engine. The basic
idea is to perform as much work as possible in 2D. Therefore a heightmap is
calculated based on the surface. In order to perform collision detection in 2D,
obstacles and moving objects on the surface are approximated by 2D primitives.
Heightmap. The heightmap is generated from a set of geometries which is
considered as the surface of the area reachable by the vehicle. The height values are
obtained by performing intersection tests between the surface geometry and rays
cast in the direction of gravity. The surface normals are also kept, as they are
necessary to determine the orientation of objects on the heightmap in 3D. This
procedure is time consuming, but it needs to be performed only when the surface
geometry changes. Queries to the heightmap for the height value or normal at


a given (x, y) position on the surface are performed by simply interpolating the
values of its neighbours.
A limitation of this approach is that each point on the surface can have exactly
one height value.
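A minimal sketch of such a query (hypothetical names; assuming a regular grid with spacing cellSize and row-major storage, which the paper does not specify) bilinearly interpolates the four surrounding samples:

#include <algorithm>
#include <cmath>
#include <vector>

// Minimal heightmap with bilinear interpolation of the four neighbouring samples.
struct HeightMap {
    int width, depth;          // number of samples in x and y
    float cellSize;            // world-space distance between samples
    std::vector<float> height; // row-major: height[y * width + x]

    float at(int x, int y) const {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, depth - 1);
        return height[y * width + x];
    }

    // Interpolated height for an arbitrary (x, y) surface position.
    float query(float x, float y) const {
        float gx = x / cellSize, gy = y / cellSize;
        int ix = static_cast<int>(std::floor(gx));
        int iy = static_cast<int>(std::floor(gy));
        float fx = gx - ix, fy = gy - iy;          // fractional offsets in [0,1)
        float h00 = at(ix, iy),     h10 = at(ix + 1, iy);
        float h01 = at(ix, iy + 1), h11 = at(ix + 1, iy + 1);
        float h0 = h00 + (h10 - h00) * fx;         // interpolate along x
        float h1 = h01 + (h11 - h01) * fx;
        return h0 + (h1 - h0) * fy;                // interpolate along y
    }
};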
Collision Detection. Collision detection aims to determine the intersection of
two objects. Our implementation provides lines and circles as collision primitives.
In the simulation the vehicles are represented by circles and the obstacles on
the surface are approximated by line sets which describe the 2D shapes of the
obstacles. Every possible collision point on the obstacle's mesh is determined
by simulating the vehicle sliding along the obstacle. The result is a line
set which describes the collision shape of the obstacle in 3D. These lines are
then projected into 2D space for use in the collision detection algorithm. Since
the vehicle is mobile and moves over a non-planar surface, its 2D shape can
change over time. To avoid a collision test with changing 2D shapes, our collision
detection algorithm considers the orientation of the vehicles. A more detailed
description of this approach can be found in [7].
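The basic 2D primitive test behind this can be sketched as follows (an illustrative circle-versus-segment test, not the authors' actual implementation): the circle representing the vehicle intersects a line segment of an obstacle outline if the distance from the circle centre to the closest point on the segment is at most the radius.

#include <algorithm>

struct Vec2 { float x, y; };

static float dot(const Vec2& a, const Vec2& b) { return a.x * b.x + a.y * b.y; }

// True if a circle (centre c, radius r) intersects the segment a-b.
bool circleIntersectsSegment(const Vec2& c, float r, const Vec2& a, const Vec2& b) {
    Vec2 ab{ b.x - a.x, b.y - a.y };
    Vec2 ac{ c.x - a.x, c.y - a.y };
    float lenSq = dot(ab, ab);
    // Parameter of the closest point on the segment, clamped to [0,1].
    float t = (lenSq > 0.0f) ? std::clamp(dot(ac, ab) / lenSq, 0.0f, 1.0f) : 0.0f;
    Vec2 closest{ a.x + t * ab.x, a.y + t * ab.y };
    Vec2 d{ c.x - closest.x, c.y - closest.y };
    return dot(d, d) <= r * r;
}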
To allow for efficient physics calculation it is always important to know the
given application domain. Using a standard approach and working with all of
the triangles of the scene would decrease the overall performance enormously. The
representation of the scene should be pre-processed and simplified in advance as
much as possible to allow for simplified calculations in the physics component.
4.3 Network

The network module makes a clear distinction between two types of transmitted
data: continuous movement data and discrete events. The communication topology is a peer-to-peer (p2p) architecture designed for two users. One client starts
the application as a server and the second client has to connect to this server.
Once this connection is established, both clients work on their fully replicated
copy of the data and synchronise the application via messages. For most NVEs we
can identify three different types of messages, which are used for position updates,
scene management and management of the network topology.
The first type of messages describes the transformation of the vehicle in the
VE and other physical properties of the vehicles like speed and orientation. To
avoid additional latencies beyond the network delay, the updates have to
be transmitted as fast as possible. Therefore the continuous flow of movement
data is distributed via UDP. If the latencies introduced by the network are
already too high, it is possible to extrapolate the position of the remote user's
representation, in our case the vehicle with the avatar, by using predictive dead
reckoning algorithms.
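A first-order dead reckoning step can be sketched like this (hypothetical structure and function names; the actual predictor used in NetODrom may differ): the last received position is extrapolated with the last received velocity and the age of the update.

struct Vec3 { float x, y, z; };

struct RemoteVehicleState {
    Vec3   position;    // last received position
    Vec3   velocity;    // last received velocity
    double timestamp;   // time at which the update was generated
};

// First-order dead reckoning: extrapolate the position to the current time.
Vec3 extrapolate(const RemoteVehicleState& s, double now) {
    float dt = static_cast<float>(now - s.timestamp);
    return Vec3{ s.position.x + s.velocity.x * dt,
                 s.position.y + s.velocity.y * dt,
                 s.position.z + s.velocity.z * dt };
}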
A second type of messages consists of discrete events which have to be handled
in a reliable manner. Events can be used for many purposes, e.g. to trigger
animations, to trigger sound effects, or to provide reliable information for the
physics engine. These reliable events guarantee a consistent state of an NVE.


Another message type is used to manage the network topology and is mostly
independent of the application. Typical messages of this kind are the joining or
leaving of a participant of the VE.
An important issue when designing network protocols for NVEs lies in the byte
order, which differs between standard Intel and AMD CPUs and most of the
CPUs used in supercomputers.
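A common way to sidestep the byte-order problem is to convert all multi-byte fields to network byte order before sending and back after receiving, for example with the POSIX htonl/ntohl functions. The sketch below uses a made-up fixed-point message layout purely for illustration:

#include <cstdint>
#include <arpa/inet.h>   // htonl / ntohl (POSIX)

// Hypothetical position update; all fields are 32-bit fixed-point values
// (e.g. millimetres) so they can be byte-swapped explicitly.
struct PositionUpdateWire {
    uint32_t vehicleId;
    uint32_t x, y, z;
};

// Convert to network byte order before sending over the socket.
PositionUpdateWire toWire(uint32_t id, uint32_t x, uint32_t y, uint32_t z) {
    return PositionUpdateWire{ htonl(id), htonl(x), htonl(y), htonl(z) };
}

// Convert back to host byte order after receiving.
void fromWire(const PositionUpdateWire& w,
              uint32_t& id, uint32_t& x, uint32_t& y, uint32_t& z) {
    id = ntohl(w.vehicleId);
    x  = ntohl(w.x);
    y  = ntohl(w.y);
    z  = ntohl(w.z);
}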
4.4 Graphical Effects

Graphical effects are useful to increase the degree of immersion and to attract
users. Particle systems [8] are often used in safety applications to demonstrate the
propagation of smoke or to animate fires and explosions. Flocking [9] algorithms
can be used to create a more dynamic and life-like VE.
In general these effects are not necessary for the simulation of the VE; they
have to be separated from the actual application layer and should be updated
once per frame. Besides their usefulness for enhancing immersion and their ability to create life-like environments, these effects have more in common. They can
cause significant performance issues when they are used in combination with immersive multi-screen displays. Particle systems emit particles from their core and
update the attributes of the set of emitted particles each frame. These updates
have to be synchronised between the different displays. Similar problems arise
with flocking algorithms, where the boids of the flock have to be synchronised
in their position and animation phase. To overcome these problems a collection of
boids or a collection of particles has to be implemented as a single node.
Most of these problems can be avoided by using shared memory architectures
and synchronising the memory access where the changes of the nodes come into
effect. If the multi-wall display is driven by a cluster system, the distribution and
synchronisation of random seeds can be used to synchronise
particle systems and flocking algorithms.
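The idea behind the seed distribution is that every render node runs the same deterministic pseudo-random sequence, so identically seeded emitters produce identical particles on every wall without per-particle network traffic. A minimal sketch (hypothetical, not the NetODrom code):

#include <cstdint>
#include <random>

// Deterministic emitter: every cluster node constructed with the same seed and
// stepped for the same number of frames produces exactly the same particles.
class ParticleEmitter {
public:
    explicit ParticleEmitter(uint32_t sharedSeed) : rng(sharedSeed) {}

    // Called once per synchronised frame on every render node.
    void emit(int count) {
        for (int i = 0; i < count; ++i) {
            // Map the raw generator output to [-1, 1]; mt19937 is fully
            // specified by the standard, so the sequence is identical everywhere.
            float vx = toUnit(rng()), vy = toUnit(rng()), vz = toUnit(rng());
            spawnParticle(vx, vy, vz);
        }
    }

private:
    static float toUnit(uint32_t r) {
        return (r / static_cast<float>(std::mt19937::max())) * 2.0f - 1.0f;
    }
    void spawnParticle(float, float, float) { /* append to the local particle set */ }
    std::mt19937 rng;
};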

5 Conclusions and Future Work

The NetODrom application was demonstrated to the public with great
success on an SGI Onyx, where an active stereo curved screen was used as the output
display. This system was interconnected with a portable Linux-driven PC with a
passive stereo powerwall display.
This paper has identified the necessary core parts of a networked VR application. Suggestions on how to design these components were given with the
example of NetODrom, a networked racing game. Not only technical aspects
like physics or network communication were presented, but also other aspects
like graphical design or effects used to increase immersion were introduced. Many pitfalls that can occur during the development of NVEs have
been shown, and suggestions on how to maneuver around these obstacles were given.
Core parts of the NetODrom application are being used in a framework for
the design of more generic NVEs - the inVRs framework. A brief overview of


the inVRs framework is given in [10]. Furthermore this application will be used
as a demonstration application to show the capabilities of the edutain@grid
middleware.

Acknowledgments
We would like to thank all the people from the Johannes Kepler University and
the University of Reading who helped to implement this application, most
notably: Helmut Garstenauer, Martin Garstenauer, Adrian Haffegee, Marlene
Hochrieser, Franz Keferböck, Stephan Reiter, Christian Wressnegger, and Johannes Zarl. The modelling and design of the VE was done by students of the
Kunstuniversität Linz: Wolfgang Hauer, Clemens Mock, Paul Pammesberger,
Ivan Petrov, Georg-Friedrich Sochurek, Silke Wiesinger and Wolfgang Wögerer.
For additional support for the organization of the public demonstration event
we would like to thank Gerhard Funk, Christa Sommerer, and Friedrich Valach.
The work described in this paper is supported in part by the European Union
through the IST-034601 project edutain@grid.

References
1. Carlsson, C., Hagsand, O.: DIVE - a platform for multi-user virtual environments.
Computers and Graphics 17 (1993) 663-669
2. Conway, M.J., Pausch, R., Gossweiler, R., Burnette, T.: Alice: A rapid prototyping
system for building virtual environments. In: ACM Conference on Human Factors
in Computing Systems (CHI '94). Volume 2., Boston, MA, USA, ACM Press (1994)
295-296
3. Tramberend, H.: Avocado: A distributed virtual reality framework. In: IEEE
Virtual Reality (VR '99), Houston, TX, USA, IEEE Computer Society (1999)
14-21
4. Rohlf, J., Helman, J.: IRIS Performer: A high performance multiprocessing toolkit
for real-time 3D graphics. In: SIGGRAPH, ACM Press (1994) 381-394
5. Singhal, S.K., Zyda, M.J.: Networked Virtual Environments - Design and Implementation. Addison-Wesley Professional (1999)
6. Smed, J., Kaukoranta, T., Hakonen, H.: A review on networking and multiplayer
computer games. TUCS Technical Report No. 454, Turku Centre for Computer Science
(2002)
7. Bressler, H., Landertshamer, R., Anthes, C., Volkert, J.: An efficient physics engine
for virtual worlds. In: Mediaterra '06 (2006)
8. Reeves, W.T.: Particle systems - a technique for modeling a class of fuzzy objects.
ACM Transactions on Graphics 2 (1983) 93-108
9. Reynolds, C.W.: Flocks, herds, and schools: A distributed behavioral model. Computer Graphics 21 (1987) 25-34
10. Anthes, C., Volkert, J.: inVRs - a framework for building interactive networked virtual reality systems. In: International Conference on High Performance Computing
and Communications (HPCC '06), Munich, Germany, Springer (2006) 894-904

Intelligent Assembly/Disassembly System with a Haptic Device for Aircraft Parts Maintenance
Christiand and Jungwon Yoon
Mechanical and Aerospace Engineering and ReCAPT,
GyeongSang National University, Jinju, Gyeongnam 660-701, Korea
tianize@yahoo.com, jwyoon@gnu.ac.kr

Abstract. This paper presents the development of an intelligent Assembly/Disassembly (A/D) system with the utilization of a haptic device. The development is aimed at assisting the A/D process of aircraft parts maintenance
with an intelligent algorithm. A comprehensive methodology is presented in which an
intelligent algorithm generates the best sequence for the assembled/disassembled parts. For this purpose, a genetic algorithm was chosen because of its effectiveness and compatibility in dealing with the A/D sequencing
problem. Furthermore, the A/D process of the parts is also evaluated within the
genetic algorithm, which shows the effectiveness of the algorithm.
Moreover, the concepts of natural and artificial constraints are applied to
generate the path planning of the assembled/disassembled parts. It is believed
that the combination of haptics, an optimized sequence algorithm and intelligent
path planning will increase efficiency for the overall aircraft parts maintenance
process.
Keywords: Haptic, Genetic Algorithm, Natural and Artificial Constraint,
Aircraft Part Maintenance.

1 Introduction
The implementation of Virtual Reality (VR) technologies in fields such as medicine and
A/D is increasing. The improvement of VR technologies promises a bright
future for the efficiency of the A/D process, including the maintenance of aircraft
parts. Since the physical objects are represented in digital form with VR technologies,
users do not need to have a real object for the simulation of the process. From that
point of view, the efficiency can be increased because of the elimination of the real object. Furthermore, the development of VR technologies is also supported by the development of haptic technologies. Nowadays, some haptic devices are available and
widely used, such as the Phantom family [1] from SensAble Corporation and the Omega
x from the Force Dimension Corporation. Haptic devices give touch sensing by a
force-feedback mechanism. Through this mechanism, an operator can feel the resulting
collision forces.

Corresponding authors.


The maintenance process itself benefits from advancements in the VR interface,
since the maintenance process involves many A/D tasks, depending on the number
of parts involved. A maintenance simulation can be held to observe the scheme of
the maintenance task by simulating the process in a virtual environment (VE). Borro et al.
have developed a system for maintainability simulation in aeronautics. They
combined a haptic device with the system and kept the user's movements the same as those
that occur when testing a physical mock-up [2]. Similar work has also been suggested
by Savall et al. [3]. Another line of research focused on the development of a language, such as
the Virtual Fixture Assembly Language (VFAL) [4]. This library can be used to construct various virtual fixture series with graphic and force guidance rules, making the
low-level haptic and graphic rendering details transparent to the developers. Galantucci et al. have researched A/D planning by using fuzzy logic and genetic algorithms
[5]. They utilized a genetic algorithm (GA) to generate the optimal sequence for
the A/D task.
Our research focuses on the development of an A/D system for the aircraft part maintenance process. We propose a combination of the utilization of a haptic device and
an intelligent algorithm for part sequence generation. We utilized a genetic algorithm
as an optimization method for part sequence generation. Even though some
studies have already suggested this for part sequence generation, they
did not consider the utilization of a haptic device. Natural and artificial constraint concepts are applied in this paper. Natural and artificial constraint concepts have been
widely applied in the robotics field. Even though there has been extensive research
[6-7] on constraints in the haptics field, there has not been a detailed study on their
applicability to aircraft part maintenance. This paper describes the development of the
intelligent system for aircraft parts maintenance with haptic information.

2 Description of Aircraft Part Maintenance System


The schematic of the A/D system is shown in Figure 1. The system consists of a haptic
device (Phantom Omni), a Virtual Environment (VE), and a trained operator. The
haptic device can be considered as an apparatus through which an operator inputs the action
for the manipulated object. This device has only three degrees of freedom (3-DOF), so
it cannot generate torque. The communication occurs bi-directionally between the
haptic device and the VE through a FireWire cable. The manipulated objects are scale models of aircraft parts, which were made in CAD software and saved in the
VRML (Virtual Reality Modeling Language) file format. After the object is transferred to the VE, the operator can manipulate the object during the scheme of the
maintenance task.
In future trials, according to the design, the movement made by the operator
will be constrained along the suggested path. As the operator looks at the VE, he/she
will feel the guidance force from the artificial and natural constraints. The artificial and
natural constraints will guide the operator to put the part in the right position, using
the suggested path. The virtual environment was built based on the OpenInventor
graphics library, a collision detection library, a path planning algorithm, and an optimal sequence generation algorithm. Those components work together to make the
aircraft part maintenance system efficient.


Fig. 1. An intelligent A/D system for aircraft parts maintenance: (a) schematic of the A/D system; (b) an operator

3 Aircraft Maintenance System Algorithm


The overall system algorithm is shown in the flowchart in Fig. 2. First of all, the system checks the haptic device through the device initialization, verifies the initial
position of the haptic stylus, and then confirms whether collision detection is being performed or not. This step is aimed at deciding which part will take the first position in
the sequence. This procedure gives the operator the freedom to set the first suitable part
as he/she wishes. Since the collision detection library only retrieves the coordinates of
the colliding polygons and points, the contact configuration needs to be reconstructed
by considering the possible combinations of surface features between two bodies [6].
For collision detection, the functions of the graphics API (Application Programming
Interface) Open Inventor are utilized [8].
After the first part is recognized, we can perform the optimal sequence generation. The genetic algorithm (GA) is applied in this step. The GA begins by defining the optimization variables, the cost function, and the cost
[9]. The cost function is a function that shows the relation between the optimization
variables and the parameter for evaluation of the problem solution. The cost is the difference between the actual solution and the desired solution. Optimization variables are
the variables that should be optimized to reach the optimal solution for the case model.
The optimization variables are arranged in chromosome form. A cluster of chromosomes is called a population. The optimization variables are the sequence order (S),
the direction of each part (O), and the tool used for each part (T).
chromosome = [S | O | T]                                          (1)
where:
S = (s1, s2, ..., sn) = part sequence
O = (o1, o2, ..., on) = part orientation
T = (t1, t2, ..., tn) = tool used for each part
n = number of parts
The evaluation of the disassembly sequence was realized by using the cost function [5]:
Cost function = f(chromosome) = w1 * l + w2 * (N - 1 - o) + w3 * (N - 1 - g)        (2)
where the parameters of the cost function l, o, and g are the maximum length of the
feasible sequence, the number of orientation changes, and the number of gripper
changes, respectively, and each parameter is associated with a weight w.

Fig. 2. System algorithm of the intelligent A/D system

The use of these cost function parameters is quite representative for cases such
as the maintenance task. In the maintenance task, the numbers of orientation changes
and tool changes will affect the efficiency of the overall maintenance process. An analogy
can be made to describe the effect of the cost function parameters on the overall efficiency. A gripper is a tool used to perform one of the A/D tasks, whereas the orientation changes represent how many steps are needed for one task in the A/D task list.
It should be restated that, in this case, the best sequence is the list of parts in assembly order which has the highest cost function value.
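To make the evaluation concrete, the following sketch (hypothetical names, assuming equation (2) with N equal to the number of parts) computes the cost of one chromosome from the counted orientation and gripper changes:

#include <vector>

// One candidate solution: part sequence, per-part orientation and tool choice.
struct Chromosome {
    std::vector<int> sequence;     // S: order of the n parts
    std::vector<int> orientation;  // O: escape direction per part
    std::vector<int> tool;         // T: gripper used per part
};

// Cost according to equation (2): rewards long feasible sequences and
// penalises orientation and gripper changes (higher cost = better sequence).
double costFunction(const Chromosome& c, int feasibleLength,
                    double w1, double w2, double w3) {
    const int n = static_cast<int>(c.sequence.size());
    int orientationChanges = 0, gripperChanges = 0;
    for (int i = 1; i < n; ++i) {
        if (c.orientation[i] != c.orientation[i - 1]) ++orientationChanges;
        if (c.tool[i] != c.tool[i - 1]) ++gripperChanges;
    }
    return w1 * feasibleLength
         + w2 * (n - 1 - orientationChanges)
         + w3 * (n - 1 - gripperChanges);
}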
The ranking of chromosomes is generated in every generation (iteration).
The ranking is made by sorting the cost value of each chromosome. Some portion of
the population will be destroyed to give space for new offspring. This process is called

natural selection. The selection rate (Xrate) determines the number of chromosomes that
still exist after the natural selection process. The selection rate is set to 50%. The
relation between the existing chromosomes (Pkeep) and the population size (P) is
shown in (3).
Pkeep = Xrate * P                                                 (3)
Then, the number of empty spaces, i.e. of disappearing chromosomes, is shown
in (4).
Pdisappear = P - Pkeep                                            (4)
Half of the population will disappear and will be replaced by new offspring. The disappearing chromosomes (Pdisappear) are the chromosomes that have low cost values. In each
generation, the GA operators crossover and mutation are applied to produce new offspring. By applying the GA operators, the fitness of the chromosomes can be increased. Crossover is a process that matches some nodes of one chromosome to
another chromosome. The results of the crossover process are two new offspring. In
this paper, the crossover process is based on the partially matched crossover (PMX)
introduced by Lazzerini et al., with a further advancement called MPMX [10]. Pairs of
parents are randomly selected from the existing chromosomes (Pkeep). Then, three numbers are randomly
selected. Let a, b, and l represent the starting nodes for the first and the second
chromosomes, and the length of the matching portion, respectively. For example, the
chromosomes are SA and SB, with l = 4, a = 3, b = 7, n = 10 (the matched segment
starts at position a in SA and at position b in SB):
SA = [s4 s5 s8 s1 s9 s2 s6 s10 s7 s3 | o4 o5 o8 o1 o9 o2 o6 o10 o7 o3 | t4 t5 t8 t1 t9 t2 t6 t10 t7 t3]
SB = [s7 s3 s2 s8 s5 s6 s9 s1 s4 s10 | o7 o3 o2 o8 o5 o6 o9 o1 o4 o10 | t7 t3 t2 t8 t5 t6 t9 t1 t4 t10]
After the crossover:
CA = [s2 s5 s9 s1 s4 s10 s6 s8 s7 s3 | o2 o5 o9 o1 o4 o10 o6 o8 o7 o3 | t2 t5 t9 t1 t4 t10 t6 t8 t7 t3]
CB = [s7 s3 s4 s10 s5 s6 s8 s1 s9 s2 | o7 o3 o4 o10 o5 o6 o8 o1 o9 o2 | t7 t3 t4 t10 t5 t6 t8 t1 t9 t2]

Then, the offspring (CA and CB) will fill the empty space in the population, replacing
the disappearing chromosomes (Pdisappear). A mutation operator is necessary to increase
the possibility of getting better offspring. Mutation is performed by modifying chromosomes individually. The nodes are selected randomly. If a generation has reached the
cost function threshold, the evolution is stopped. The optimal sequence
for the A/D task is then suggested to the operator. It is worth remarking that the optimal
sequence is computed based on the first part chosen by the operator in the first-part
recognition step.
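As an illustration of such a mutation (a common swap mutation, not necessarily the exact operator used here), two randomly chosen positions are exchanged, which keeps the part sequence a valid permutation:

#include <cstddef>
#include <random>
#include <utility>
#include <vector>

struct Chromosome {
    std::vector<int> sequence, orientation, tool;
};

// Swap mutation: exchanging two randomly chosen positions keeps the part
// sequence a valid permutation while still exploring new orderings.
void mutate(Chromosome& c, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, c.sequence.size() - 1);
    std::size_t i = pick(rng), j = pick(rng);
    std::swap(c.sequence[i], c.sequence[j]);
    // Orientation and tool are tied to the part, so they move with it.
    std::swap(c.orientation[i], c.orientation[j]);
    std::swap(c.tool[i], c.tool[j]);
}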
When provided with sufficient information, the system will generate a guidance
path, which will guide the operator through the best part sequence of the
maintenance task. In order to make the operator sense the guidance force, artificial and natural constraints are applied to the operator along the path. The fundamental idea of using
the force field is that each object in the virtual environment is surrounded by a repulsive
force field [7]. Since we already know the target position of each part, the

force field of the whole environment can be calculated to assist the operator in reaching the
target position. We can prescribe the target position for the aircraft parts. During the assembly task, the target position is the best position for each part in Cartesian coordinates, related to the geometry constraints of the assembled parts. In the disassembly task, the
target position is a safe place in Cartesian coordinates where we will put the
part as its final position. The force applied for the guidance is a summation of
the artificial and natural constraints. The following relation summarizes the force implementation [6]:
Constraint_total(x, y, z) = Constraint_natural(y) + Constraint_artificial(x, z)        (5)

The constraint in the x-z plane is used to guide the object to the prescribed target position, whereas the natural constraint is the constraint that naturally
comes from the object. After one part has reached its target position, a guidance path for the next part is suggested to the operator. This process is repeated
until all parts reach their target positions. The sequence of tasks is based on the results
of the optimal sequence generation.
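A simplified sketch of equation (5) in code (hypothetical; a spring-like attraction toward the target in the x-z plane stands in for the artificial constraint and a surface reaction along y for the natural constraint):

struct Vec3 { float x, y, z; };

// Artificial constraint: pull the part toward its target in the x-z plane.
Vec3 artificialConstraint(const Vec3& pos, const Vec3& target, float stiffness) {
    return Vec3{ stiffness * (target.x - pos.x), 0.0f, stiffness * (target.z - pos.z) };
}

// Natural constraint: keep the part on the supporting surface along y.
Vec3 naturalConstraint(const Vec3& pos, float surfaceHeight, float stiffness) {
    float penetration = surfaceHeight - pos.y;
    return Vec3{ 0.0f, (penetration > 0.0f) ? stiffness * penetration : 0.0f, 0.0f };
}

// Equation (5): the guidance force is the sum of both constraint forces.
Vec3 totalConstraint(const Vec3& pos, const Vec3& target,
                     float surfaceHeight, float kArtificial, float kNatural) {
    Vec3 a = artificialConstraint(pos, target, kArtificial);
    Vec3 n = naturalConstraint(pos, surfaceHeight, kNatural);
    return Vec3{ a.x + n.x, a.y + n.y, a.z + n.z };
}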

4 Experiment and Result


To test the optimal sequence generation, some experiments were conducted. As shown in Fig. 3, an operator attempted to disassemble the assembled part. To accomplish the disassembly task, parameters such as the gripper and the direction of each part were involved in the cost value calculation. While the number of feasible sequences was involved in [5], this parameter was not involved in this experiment.
The possible escape directions of the parts were limited to the +z and -z directions.
Five types of grippers were available.
Table 1. Available grippers

Part   Available Gripper
1      G1, G2, G3
2      G2, G3, G4
3      G1, G3, G4
4      G3, G4, G5
5      G3, G4
6      G5
7      G1, G2, G3
8      G2, G3, G4, G5
9      G1, G2, G3, G4
10     G5

The initial population consisted of 100 individuals. The number of generations was limited to 50. In every generation, the maximum and average cost values were
calculated. Below are the maximal cost values after 100 trials. In every generation, the maximum and average cost values increased (Fig. 3). This indicates that the

solution is getting better from generation to generation. The final solution suggested
for the disassembly sequence is 10-8-6-2-5-9-1-4-7-3. Because the cost function did not
involve the number of feasible sub-sequences or feasible sub-sequence modules,
there is a non-feasible sub-sequence in the suggested solution.

(a) Maximum cost function

(c) Average cost function

(b) Assembled part for experiment

(d) Assembled part drawing for experiment

Fig. 3. Experiment result and picture of assembled part

5 Conclusion
In this paper, we have suggested a framework for an intelligent assembly/disassembly system for the aircraft parts maintenance task. The algorithm and the components of the system which deal with the complexity of the sequence selection by an
optimization method have been described. Some results referring to the selection of the optimal sequence have been
shown. In future work, a sub-sequence module will be involved in the calculation of the cost value to avoid non-feasible sub-sequences appearing in the sequence solution. Furthermore, a novel algorithm and strategy for
haptic utilization has been developed. In other words, this system could be a prototype of an efficient system for the purpose of aircraft part maintenance tasks. Since
this is ongoing research, further developments and inputs are desired to increase
the effectiveness of the system.


Acknowledgement
This work was supported by the Korea Research Foundation Grant funded
by the Korean Government (MOEHRD) (KRF-2005-005-J09902) and by the
NURI Project.

References
[1] http://www.sensable.com/, accessed 26/12/2006
[2] Diego Borro, Joan Savall, Aiert Amundarain, Jorge Juan Gil, Alejandro Garcia-Alonso,
Luis Matey, "A Large Haptic Device for Aircraft Engine Maintainability," IEEE Computer Graphics and Applications, vol. 24, no. 6, pp. 70-74, Nov/Dec 2004.
[3] Joan Savall, Diego Borro, Jorge Juan Gil, Luis Matey, "Description of a Haptic System
for Virtual Maintainability in Aeronautics," IEEE International Conference on Intelligent
Robots and Systems, Lausanne, Switzerland, October 2002.
[4] Alex B. Kuang, Shahram Payandeh, Bin Zheng, Frank Henigman, Christine L.
MacKenzie, "Assembling Virtual Fixtures for Guidance in Training Environments," 12th International Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems (HAPTICS'04), pp. 367-374, 2004.
[5] L. M. Galantucci, G. Percoco, R. Spina, "Assembly and Disassembly Planning by using Fuzzy Logic & Genetic Algorithms," International Journal of Advanced Robotic Systems, Volume 1, 2004.
[6] Daniel Galeano and Shahram Payandeh, "Artificial and Natural Force Constraints in
Haptic-aided Path Planning," IEEE International Workshop on Haptic Audio Visual Environments and their Applications, Ottawa, Canada, 1-2 October 2005.
[7] Dongbo Xiao and Roger Hubbold, "Navigation Guided by Artificial Force Fields," Proceedings of ACM CHI 98 Conference on Human Factors in Computing Systems, vol. 1,
pp. 179-186, ACM SIGCHI, Addison Wesley, April 1998.
[8] Open Inventor 6, Mercury Computer Systems, 2006.
[9] Randy L. Haupt and Sue Ellen Haupt, Practical Genetic Algorithms, 2nd Edition, John
Wiley and Sons, USA, 2004.
[10] Lazzerini, B., Marcelloni, F., "A Genetic Algorithm for Generating Optimal Assembly
Plans," Artificial Intelligence in Engineering, Vol. 14, pp. 319-329, 2000.

Generic Control Interface for Networked Haptic Virtual Environments
Priscilla Ramsamy, Adrian Haffegee, and Vassil Alexandrov
Advanced Computing and Emerging Technologies Centre,
The School of Systems Engineering, University of Reading,
Reading, RG6 6AY, United Kingdom
p.ramsamy@reading.ac.uk

Abstract. As Virtual Reality pushes the boundaries of the human-computer interface, new ways of interaction are emerging. One such technology is the integration of haptic interfaces (force-feedback devices) into
virtual environments. This modality offers an improved sense of immersion compared to that achieved when relying only on audio and visual modalities.
The paper introduces some of the technical obstacles, such as latency and
network traffic, that need to be overcome for maintaining a high degree
of immersion during haptic tasks. The paper describes the advantages
of integrating haptic feedback into systems, and presents some of the
technical issues inherent in a networked haptic virtual environment. A
generic control interface has been developed to seamlessly mesh with existing networked VR development libraries.
Keywords: Virtual Reality, Force Feedback.

1 Introduction

The real power of VR technology is that it allows people to expand their perception of the real world in ways that were previously impossible [1]. Changes
to virtual objects and their interrelationships can be effected with ease, which is
often not feasible with real objects. Through VR, market researchers are now able
to generate precisely the same stimulus conditions for all participants and are
able to modify certain environment variables in real time, thus allowing them
to have more control over the experimental settings. Our study is centred on
exploring shoppers' habits for market research with regards to how they move
about in their search for desired items within the shopping centre. Based on
these results market researchers can provide the relevant foundation for developing effective marketing strategies and advertising campaigns to boost sales. A
random sample of people categorised by age was selected and interviewed
about their experience regarding the efficiency, ease of use and degree of realism
to the physical world that the application had. Analysis of the collected data
showed that respondents experienced difficulty not only in navigating the 3D
environment using the wand but also in using it to actually get hold of certain
objects by pressing buttons. It was then decided to develop a more natural and

intuitive mode of interaction and navigation to better suit the users in such an
environment.
Input devices such as computer keyboards, mice and joysticks can convey the
user's commands to the computer but are unable to provide a natural sense of
touch and feel to the user. It is crucial to understand the nature of touch interaction, since we react to our surroundings based on what and how we perceive
things [2]. The user's sensorial experience in a virtual environment is a fusion of
information received through his perception channels. The results obtained from
previous studies have made it possible to quantify the effectiveness of using haptics to provide a greater sense of presence and immersion. Based on these findings
[3][4][5][6] there is great potential to improve VR applications. The additional information from haptic feedback makes certain tasks such as manipulation much
easier compared to the traditional computer interface [7]. In an effort
to bring more realistic and intuitive experiences into the virtual environment
created, a haptic trolley and an instrumented glove were incorporated into the
VE. The study consists of augmenting an immersive graphic display (CAVE-like
system [8]) with a haptic interface.
The integrated system is multi-frequency, since the haptic loop and the display/rendering loop require different update rates. The haptic loop should be
maintained at an update rate of 1 kHz to avoid force artifacts and because
humans can detect discrete events below this rate. The display loop should
be updated at 30 Hz (minimum) to produce stable images, since the human
visual system has a flicker frequency of around 30-60 Hz. An effective integrated
system of haptics and graphics should ensure the effectiveness of each modality. The user's experience and sense of immersion would be enhanced and would
complement the visual display only if both the haptic and the visual feedback
are consistent with each other. Evidence shows that latency is a limiting factor
in shared haptic environments [9]. Latency creates a lag between when a collision is
detected and when a force is applied to the display device. The latency experienced for local systems is very small (near instantaneous); however, for systems that
are distributed across a network the lag experienced could generate instability
during haptic rendering. A generic control interface has been developed at Reading University which forwards the relevant device information over the network
to both the haptic and the rendering loop. The latency felt is minimal as we are
integrating the haptic display on a local system.
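As an illustration of such a multi-frequency setup (a hypothetical sketch, not the actual Reading implementation), the haptic and rendering loops can run in separate threads at roughly 1 kHz and 60 Hz, exchanging device state and force values through a mutex-protected shared structure:

#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

struct SharedState {
    std::mutex mtx;
    float devicePosition[3] = {0, 0, 0};   // written by the haptic loop
    float feedbackForce[3]  = {0, 0, 0};   // written by the rendering/simulation side
};

std::atomic<bool> running{true};

void hapticLoop(SharedState& s) {                 // ~1 kHz
    using namespace std::chrono;
    while (running) {
        {
            std::lock_guard<std::mutex> lock(s.mtx);
            // read the device, update s.devicePosition, apply s.feedbackForce
        }
        std::this_thread::sleep_for(milliseconds(1));
    }
}

void renderLoop(SharedState& s) {                 // ~60 Hz
    using namespace std::chrono;
    while (running) {
        {
            std::lock_guard<std::mutex> lock(s.mtx);
            // read s.devicePosition, run collision response, write s.feedbackForce
        }
        std::this_thread::sleep_for(milliseconds(16));
    }
}

int main() {
    SharedState state;
    std::thread haptic(hapticLoop, std::ref(state));
    std::thread render(renderLoop, std::ref(state));
    std::this_thread::sleep_for(std::chrono::seconds(1));   // run briefly for the sketch
    running = false;
    haptic.join();
    render.join();
}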
In Section 2 of this paper we give an overview of the software architecture
implemented, as well as the networking protocol used and how its implementation
has been utilised in real time. Section 3 presents the current feedback device
operations developed. Finally, the future work is outlined in Section 4.

2 Software Architecture

Haptic simulations involve a human operator, a force-reflecting device and a virtual environment. Haptic is a term derived from the Greek haptesthai, which
means to come in contact with, and refers to providing the sense of touch with both tactile


(cutaneous) and kinaesthetic (proprioception) feedback [10]. Tactile, or touch,
feedback describes the sensations felt by the skin. Tactile feedback permits users
to feel things such as the texture of surfaces, temperature and vibration. Kinaesthetic force feedback on the other hand enables us to recognise a force that
is applied to our body with the help of sensory cells located at the ends of the
tendons or between the muscle strands. The human operator kinaesthetically
explores the virtual environment by interacting with and grasping an active mechanical device. The device provides a new sensory display modality and presents
the necessary information by exerting controlled forces on the human operator.
Figure 3 shows the mobile platform developed. The device is equipped with an
actuator that produces the relevant forces and a sensor from which the velocity
can be calculated. The motion and position of the device are coupled to a virtual
object in the virtual environment. The virtual object should experience the same
forces and movement as experienced by the physical device.

Fig. 1. Software architecture for multi-processing

Several issues can contribute to limiting the haptic device's capability to render
and provide the relevant forces. To provide the desired results the system should
update in real time; however, to have higher accuracy would require sacrificing
the update speed. Therefore due consideration should be given to the trade-off
between being highly efficient and being highly accurate.
The generic interface in Figure 1 currently forwards the relevant information,
that is velocity, force and position, over the network. These values are then used
by both the visual-rendering and haptic-rendering loops to produce the necessary
visual and haptic feedback. In our implementation the haptic and the graphic
loops have their own databases, which are not shared. This architecture enables
us to run the haptic process and the visual process either on different computer
systems or on the same system. For this multi-processing structure the critical
task is to update both the graphic and the haptic database. The visual


rendering subsystem displays the necessary scenes on the display system and is usually run at rates of up to 80 Hz. For this work the VieGen framework
this is usually run at rates of up to 80Hz. For this work the VieGen framework
as described in [11] was used for control and rendering of the scene.
2.1 Transport Protocol Used

Since the transmission speed has a direct bearing on the update rate of
the simulation engine, the solution chosen was a trade-off to provide maximum
flexibility and high throughput at the expense of the implementation being unreliable. Currently the User Datagram Protocol (UDP) is used
to route data packets over the network. However, based on the fact that the
network currently being used is a closely coupled, controlled network, reliability
due to lost packets is not an issue. Error recovery for lost packets has been
built into the system. The interface is written in the C programming language and
uses stream sockets or connected datagram sockets for sending and receiving
data across the network.
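The connected datagram socket approach can be sketched as follows (POSIX sockets; the address, port and payload are placeholders): after connect() on a UDP socket, send() and recv() exchange datagrams with the fixed peer without specifying its address for every packet.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // UDP socket "connected" to a fixed peer (address and port are placeholders).
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    sockaddr_in peer{};
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(9000);                      // hypothetical port
    inet_pton(AF_INET, "192.168.0.2", &peer.sin_addr);  // hypothetical host

    if (connect(sock, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) < 0) {
        perror("connect"); close(sock); return 1;
    }

    // send()/recv() now route datagrams to and from the connected peer only.
    const char update[] = "position-update";
    send(sock, update, sizeof(update), 0);

    char buffer[512];
    ssize_t n = recv(sock, buffer, sizeof(buffer), 0);  // blocks until a packet arrives
    if (n > 0) std::printf("received %zd bytes\n", n);

    close(sock);
    return 0;
}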
2.2 The Client/Server Architecture

The proposed design is a client/server architecture. The real-time implementation is done over a LAN and consists of three components, namely the client, the
server and the generic control interface. The server component is responsible
for receiving packets from the force-reflecting/haptic device through the xPC
TargetBox. The server is then responsible for forwarding the relevant information over the network. A generic control interface deals with this information
by writing and storing it into shared memory on each system. The flowchart in
Figure 2 depicts the current workings of the server application. This
application unpacks the packets received, which consist of data for both the
instrumented glove and the mobile platform. Once the data is unpacked the information is stored in the relevant structure in shared memory. Calls are made
to the underlying methods in the generic control interface to write to shared
memory. Error checks have been implemented to verify whether memory has
been initialised. If not, a memory segment is initialised and allocated to accommodate the information for both the instrumented glove and the mobile platform
data structures. Since the packets sent over the network are updated at a high
frequency, writing to shared memory should be synchronised to be able to update the information at the same rate. The client component is responsible for
retrieving the relevant information from memory and using it to render scenes
in the virtual environment. The update/refresh rate can be set according to the
application requirements. The server component receives data from the haptic
interface via the network whilst the client interface sends the force values back
to the haptic interface. Both client and server components run on the same host.
As mentioned previously, the motion and position of the haptic trolley are coupled to a virtual trolley object in the virtual environment. The virtual object
should experience the same forces and movement as experienced by the physical
device. Hence, to provide a realistic experience to end-users, the scene rendered


Fig. 2. Server and Client Application Flowchart

should portray the changes in real time, and this is achieved by enabling a high
refresh/update rate. The interface was tested and produced successful results
in providing a realistic, enhanced experience to the end-user. Since the current
network implementation is based on the User Datagram Protocol, less overhead
and network lag is experienced. We are currently experiencing a round trip
time of 1 ms for every packet being sent whilst the grounded trolley is being used
as a navigational tool. The flowchart in Figure 2 depicts the current workings
of the client application.
2.3 Application Development

The shopping application was developed to allow users in the ReaCTor to shop
in the virtual environment by interacting with 3D models of different products
and brands. A more intuitive mode of navigation was provided by integrating the
trolley during user trials. The application implemented made use of the VieGen
framework as described in [11]. VieGen provides several libraries of tools and
utilities that are useful in the development of Virtual Interactive Environments.
The elements making up the framework consist of the interface to the display
hardware, a networking subsystem, scene management, environment simulation
and accompanying utilities, all of which can be used or left out depending on


the requirements. The CAVE Scene Manager component (CSM) offers a generic
abstraction layer to the underlying VR hardware, and this allows the same application to be compiled for a variety of different systems. At run-time a configuration
file is then used to stipulate the required system set-up and mode of operation,
and by using this approach it was possible to run the VieGen framework
on different display systems, ranging from mono desktop systems through powerwalls to immersive CAVE-like systems. This not only implies application interoperability across the entire range of
hardware systems, but it also lends itself to application development and testing
without excessively tying up essential immersive system resources.
The networking subsystem consists of two layers. While the low-level layer provides a consistent, low-latency infrastructure for buffering messages to and from
multiple remote locations, the higher-level layer [9] builds on top of this to provide a hierarchical topology for Distributed Virtual Environments (DVEs). This
topology assumes that the environment is split into domains and controls the
individual users in these domains. One of the users is dynamically and transparently allocated (designated) as the Domain Server, and is consequently used by
the topology to handle connections between the remaining clients. Later these
connections are used to relay information such as position, action events or
user-defined messages. The modules for simulation are run concurrently with
the networking subsystem. This operation forms the basis for establishing an
environment as a domain, and allows its population by a number of computer-generated users. It is then possible for these pseudo-users to perform actions
within the environment to collect statistical data, perform tests and statistical
analysis.

3 Device Operations

In our implementation we have an instrumented tactile glove and a grounded
haptic mobile platform; both are currently being developed at Reading University. The tactile glove can replace or be used in conjunction with the traditional
wand to provide a more realistic and intuitive way of manipulating virtual objects. The generic control interface developed permits function calls to the device
state whereby the relevant information can be retrieved. The existing functions
can retrieve the hand orientation and finger angles. The appropriate feedback
can be rendered based on these retrieved values and the chosen collision detection algorithm. Once a collision is detected with a manipulatable object the
person is able to interact with and manipulate the object. The virtual
object will be attached to the glove only while the grasp aperture (the distance
between thumb and fingers) is less than a critical distance. The critical distance
can be fixed or can vary depending on the size of the object being manipulated.
Figure 3 shows the glove in question being used in the immersive
shopping environment to provide a more intuitive way to manipulate objects in
the environment. The haptic mobile platform is currently being used to replace


Fig. 3. Glove and trolley in use

the traditional joystick and permits the user to navigate within the virtual environment. It could mimic several devices ranging from a bicycle to a trolley or
a hang glider. Information on the device state, the force applied, the displacement
forwards or backwards and the angle of rotation can be retrieved. Figure 3 shows
the trolley being used as a navigational tool in the shopping environment.
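The attachment rule for the glove described above can be sketched as follows (hypothetical structure and function names, not the actual interface): the object stays attached while the measured grasp aperture is below a critical distance, here assumed to scale with the object size.

#include <cmath>

struct GloveState {
    float thumbTip[3];
    float indexTip[3];
};

// Distance between the thumb and index finger tips (the grasp aperture).
float graspAperture(const GloveState& g) {
    float dx = g.thumbTip[0] - g.indexTip[0];
    float dy = g.thumbTip[1] - g.indexTip[1];
    float dz = g.thumbTip[2] - g.indexTip[2];
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// The object remains attached while the aperture is below a critical distance,
// here chosen proportional to the object's size (an assumption for illustration).
bool objectAttached(const GloveState& g, float objectSize) {
    const float criticalDistance = 0.5f * objectSize;
    return graspAperture(g) < criticalDistance;
}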

4 Conclusion and Future Work

Our generic control interface has been successfully developed and permits both
the tactile and force feedback devices to integrate with our immersive system,
and it is being used in ongoing application development. The current work in
progress involves making the generic interface a platform-independent, portable
interface. The generic control interface provides a flexible method of developing
libraries that act as an interface between different input devices and existing
CVEs.
Based on the requirements for stability and fidelity a very high update rate
has been used in our current implementation. However, to minimise transmission
requirements, updates need only be made when there is a change in state. Our
future work will concentrate on implementing such an architecture to enable
improved local interaction and to find the effects on the degree of immersion
and haptic feedback of the system. The magnitude of inconsistencies between the
states of the local copies of the VE will be evaluated to verify the network delay
and the amount of packet loss. Refinement of the current haptic device operations
and incorporating an increased number of vibrotactile stimulators onto each
finger of the glove will also be considered. This would enhance and provide
more functionality to the instrumented glove and would enable a more realistic mode of
interaction in VEs.
The current implementation considers the integration of haptic information to
augment visual displays on a local system. The latency experienced is minimal
in such a situation. However, the work will be extended to provide a haptically


mediated collaborative virtual environment. This will be challenging considering that the users would be geographically distributed and various problems such
as network delay and latency would be introduced.

References
1. Mohler, J.L.: Desktop Virtual Reality for the Enhancement of Visualization Skills.
Journal of Educational Multimedia and Hypermedia (2000) 151-165
2. Srinivasan, M.A., Basdogan, C.: Haptics in Virtual Environments: Taxonomy,
Research Status, and Challenges. Computers and Graphics, Special Issue on Haptic Displays in Virtual Environments, Vol. 21, No. 4 (1997)
3. Burdea, G.C.: Haptics issues in virtual environments. In: Computer Graphics International, Proceedings (2000)
4. Brooks, F.P., Ouh-Young, M., Batter, J.J., Kilpatrick, P.J.: Project GROPE -
Haptic Displays for Scientific Visualization. In: Proc. ACM SIGGRAPH, Dallas,
TX, Aug (1990) 177-185
5. Dinsmore, M., Langrana, N., Burdea, G., Ladeji, J.: Virtual
Reality Training Simulation for Palpation of Subsurface Tumors. In: Virtual
Reality Annual International Symposium (VRAIS) (1997) 54
6. Burdea, G., Patounakis, G., Popescu, V., Weiss, R.E.: Virtual Reality Training
for the Diagnosis of Prostate Cancer. In: IEEE International Symposium on Virtual
Reality and Applications, Atlanta, Georgia, March (1998) 190-197
7. Fischer, A.G., Vance, J.M.: Implementing Haptic Feedback in a Projection Screen Virtual Environment. In: Seventh PHANToM Users Group Workshop,
October (2002) 26-29
8. Cruz-Neira, C., Sandin, D.J., DeFanti, T.A., Kenyon, R.V., Hart, J.C.: The
CAVE: Audio visual experience automatic virtual environment. Communications
of the ACM 35 (1992) 64-72
9. Ottensmeyer, M.P., Hu, J., Thompson, J.M., Ren, J., Sheridan, T.B.: Investigations into
performance of minimally invasive telesurgery with feedback time delays. Presence
9(4) (Aug 2000) 369-382
10. Mokhtari, M., Bernier, F., Lemieux, F., Martel, H., Schwartz, J.M., Laurendeau, D.,
Branzan-Albu, A.: Virtual Environment and Sensori-Motor Activities: Haptic,
Audition and Olfaction. WSCG POSTERS proceedings, Plzen, Czech Republic,
Vol. 12, No. 1-3, February (2004) 2-6
11. Haffegee, A., Jamieson, R., Anthes, C., Alexandrov, V.: Tools for Collaborative
VR Application Development. In: International Conference on Computational
Science, Springer Verlag (2005) 350-358

Physically-Based Interaction for Networked Virtual Environments
Christoph Anthes, Roland Landertshamer, and Jens Volkert
GUP, Institute of Graphics and Parallel Processing
Johannes Kepler University, Altenbergerstrasse 69, A-4040 Linz, Austria
canthes@gup.uni-linz.ac.at

Abstract. Physics engines are gaining importance in the area of computer games to create better visual experiences. In the context of networked Virtual Reality (VR) applications they can be useful to provide
new interaction possibilities. This paper provides an introduction to
the area of physics simulation and its use to create more realistic
and interactive multi-user virtual worlds. Examples of the usefulness of
such physics engines can be found in a variety of training applications.

1 Introduction

In the past years Virtual Reality (VR) has become a useful tool for a variety
of application areas. Networked Virtual Environments (NVEs) allow dislocated
users from all over the globe to communicate and interact in a shared virtual
space.
If we take a look at safety training applications like the SAVE project [1,2],
where the user has to react to hazardous incidents in a petroleum refinery, or the
VETT project, which provides shipboard firefighting training [3], we discover that
most of them are single-user applications or work with two users, a trainer and a
single trainee, where only the trainee enters the VE and gets instructions from
the trainer. In the real world safety training is typically performed in teams,
as hazardous incidents are fought by teams. For that reason it is obvious
that safety training VR applications would profit from multi-user interaction
and collaboration.
Another issue which arises is the fidelity of interaction. Users typically
interact with a Virtual Hand interaction technique. If the representation of
the user's cursor collides with an entity in the VE, interaction can be performed.
The level of interaction is normally reduced to clicking on the entity, which can
for example result in the triggering of an event. This is sufficient to train basic
procedures during incidents. To maximise the training effect in such environments,
physics engines should be incorporated to allow for a more realistic behaviour of
the VE, where the user has to perform the movement of the object manipulation
as he would in the real world. This could be throwing a lever, or pushing down
a virtual door handle in order to pull the door open afterwards.
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 776-783, 2007.
© Springer-Verlag Berlin Heidelberg 2007



The inVRs framework1 is designed to ease the creation of NVEs. Its physics module is used to implement realistic interaction possibilities with the environment. Joints, for example, can interconnect nodes of the scene graph and allow for a simple scene description. Gravity and collision detection become important during manipulation and placement tasks. This makes inVRs an ideal candidate to create NVEs for safety training.
This section has given a brief introduction to the use of physics in training environments. The next section provides an overview of the related work. In the following, the use of physics will be described in the context of the framework. Network communication for the physically-based interaction will be introduced and some example setups will be given. The final section concludes the paper and gives an outlook on future developments.

2 Related Work

The problem domain and related work of this paper span three different areas: physics, network communication and interaction.
A detailed overview of the design of a 3D rigid body physics engine for games is given by David H. Eberly [4]. He describes the mathematical and physical background in detail, while Baraff focuses in his thesis [5] on how physical constraints can be transformed into mathematical problems and how they can be solved.
The commonly available NVEs like DIVE [6], AVOCADO [7] or MASSIVE [8] do not support the use of physics engines for several reasons. Simulating physics in non-client-server topologies is very cumbersome. In a peer-to-peer (p2p) topology, one client would have to be selected to perform the role of a server in order to guarantee a consistent state of the VE. Most existing VR systems incorporate p2p topologies or hybrid approaches to achieve high responsiveness and scalability. Another issue which constrains the use of physics is the computational intensity of such simulations. A good overview of NVEs in general is given by Singhal and Zyda in [9].
A variety of interaction techniques for VEs has been described by Mine [10]. The psychological aspects of collaborative object manipulation have been researched on many levels by Roberts et al. [11]. On the technical level, Broll gives ideas on how to solve concurrent object manipulation [12].
Jorissen and Lamotte combine the use of physics engines and VR to create a platform for collaborative VEs where objects contain the interaction information and behaviour [13]. Their networking topology uses a classic client-server approach, where one server is responsible for the physics update of the whole VE.
Early approaches in scene graphs like VRML and Inventor use sensors to constrain object movement. The use of joints provides similar functionality but offers additional features like the swinging of a door.
1 Pronounced /ɪnˈvɜːs/.


3 Framework Architecture

The inVRs framework consists of modules for navigation, interaction and network communication. These modules are interconnected via a system core, which provides databases to store information about the VE and the users inside the VE as well as managers for the communication between the modules and the databases. A brief overview of the architecture is given by Anthes and Volkert [14]. inVRs provides the possibility for two dislocated users to concurrently manipulate the same geometry at the same time. It supports this collaboration in many ways, including the use of an additional physics module.
One of the important aspects of the framework is the distinction between discrete events, which are distributed via the event manager, and continuous streams of transformation data, which are handled by the transformation manager. A detailed description of the transformation and event handling can be found in [15]. The physics module makes use of these two managers, for example to generate collision events or to change the transformations of the objects in the NVE.

4 Physics Module

To increase the immersion of the users in a VE it is important that the virtual objects show a realistic behaviour. Therefore a physics engine can be used. In the context of VEs it is responsible for the simulation of the behaviour of rigid bodies, which comprises the following tasks:
- rigid body motion
- collision detection and response
- constrained interaction of rigid bodies
Each object in the VE that should be simulated by the physics engine has to be represented by a physics object. For rigid body motion this physics object has to contain the mass and the inertia tensor of the virtual object. With this information the physics simulation can calculate the linear and angular velocities of the object and determine its position and orientation. The velocities of the objects are a result of the forces and torques which act on the physics object. To prevent two physics objects from interpenetrating, the physics engine has to check for collisions. Therefore each physics object needs a description of its shape. In general the triangle mesh of the virtual object could be used for the collision detection, but in real-time simulations the shape of the objects is often approximated by 3D primitives like spheres or boxes to reduce the amount of computation time. When the physics engine encounters a collision, the collision response has to be applied. In this step the physics engine calculates an impulse as a result of the collision. This impulse is then applied to both objects to avoid an interpenetration. For constrained interaction of rigid bodies, physics engines provide constraints or joints. A joint describes a restriction of motion for a physically simulated object. Joints can be used to connect two objects together and/or allow only special relative movement between two objects. A joint can


also be used to restrict the motion of a single object. The possibilities of motion restriction cover the free rotation around a point, the rotation around an axis, the movement along an axis or any combination of these constraints. Each joint can again be restricted in its movement range. This allows a rotation to be limited to a maximum angle or a movement to a maximum distance.
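As an illustration of these tasks, a minimal Java sketch is given below (an illustration only, not code from inVRs or any particular physics engine; all class and field names are hypothetical). It integrates linear rigid body motion from accumulated forces and clamps a joint angle to its configured movement range:

// Minimal rigid-body sketch (hypothetical names, not the inVRs API):
// integrates linear motion from applied forces and clamps a hinge angle
// to its configured limits, as described above.
public class RigidBodySketch {
    double mass = 1.0;                        // mass of the physics object
    double[] position = new double[3];        // world position
    double[] velocity = new double[3];        // linear velocity
    double[] force = new double[3];           // forces accumulated for this step

    double hingeAngle = 0.0;                  // current rotation around the joint axis
    double hingeMin = -90.0, hingeMax = 90.0; // joint limits in degrees

    // advance the simulation by dt seconds
    void step(double dt) {
        for (int i = 0; i < 3; i++) {
            double acceleration = force[i] / mass; // a = F / m
            velocity[i] += acceleration * dt;      // integrate velocity
            position[i] += velocity[i] * dt;       // integrate position
            force[i] = 0.0;                        // clear the accumulator
        }
        // restrict the joint to its configured movement range
        hingeAngle = Math.max(hingeMin, Math.min(hingeMax, hingeAngle));
    }
}

Angular motion would additionally require the inertia tensor mentioned above; it is omitted here to keep the sketch short.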
The physics module of the inVRs framework uses a physics engine to allow constrained interaction with virtual objects. It therefore allows virtual objects to be connected via joints. XML is used to define the properties of the joints and to interconnect nodes of the scene graph. Since each entity in the VE has a unique id, which is stored in a database of the framework, it is possible to interconnect entities based on their ids. The physics module allows thresholds to be defined for each joint, so that events are executed when these thresholds are exceeded. The possible types of thresholds depend on the type of the joint. Examples are a rotation angle around the joint axis or a distance from the untransformed position. The events that should be executed can be defined in the config file. For each joint the user can set activation conditions which describe when the joint should be active and when not. An inactive joint is seen as a fixed connection between two objects. The physics module of the inVRs framework allows a joint to be activated or deactivated when a user interacts with an entity, when other joints are active or not, when a joint exceeds a defined angle or when a joint exceeds a predefined distance. An example of these conditions can be found in Section 6.
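To make the threshold and activation conditions more concrete, the following Java sketch (a simplified, hypothetical illustration following the door example of Section 6; the classes and fields are not the inVRs API) fires an activation event for a dependent joint once a hinge joint has been rotated past a configured threshold:

// Hypothetical sketch of threshold-based joint activation (not inVRs code).
// A hinge joint activates another joint once its rotation angle passes a
// configured threshold, as in the door/handle example of Section 6.
class HingeJoint {
    final int id;
    boolean active;
    double angle;            // current rotation around the joint axis (degrees)
    double activationAngle;  // threshold taken from the XML configuration

    HingeJoint(int id, boolean active, double activationAngle) {
        this.id = id;
        this.active = active;
        this.activationAngle = activationAngle;
    }
}

class JointThresholdChecker {
    // activates 'target' once 'trigger' has been rotated below its threshold
    // (negative angles, as for the door handle in Listing 1.1)
    void update(HingeJoint trigger, HingeJoint target) {
        if (trigger.active && trigger.angle < trigger.activationAngle && !target.active) {
            target.active = true; // the joint is now simulated by the physics engine
            System.out.println("activate joint " + target.id);
        }
    }
}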

5 Network Communication

To synchronise VEs, typically locking mechanisms are used. The philosophy of the inVRs framework is to avoid locking and to keep a loose consistency. The presented approach transmits a continuous stream of transformation data packets via UDP. The transformation data is mostly generated by the tracking systems of the interconnected VR installations. Transformations of objects in the scene can be transmitted via this distribution channel as well if necessary. The network topology of the framework is a peer-to-peer topology, which works with fully replicated databases of the NVE to guarantee a high responsiveness.
Since tracking information can be considered ubiquitous if two or more users are interacting in an NVE, very few additional messages are needed to implement physically-based interaction or concurrent object manipulation. If a local user picks a part of a physically simulated entity, which could be for example a door handle or a vent in a VE, a pick event is sent to the remote users. This message is transferred in a reliable way and consists of a unique user and entity id. Additionally it contains information about the time when the entity was picked and an offset from the origin of the entity to the picking point, which is determined based on the interaction technique of the user (e.g. ray-casting or virtual hand). The remote systems are now notified that the user is connected with the entity. With the information about the offset to the user's picking point and the additional tracking data they are able to perform a local physics simulation.


Once the object is released, a reliable message is sent which stores data about the exact position where the object is dropped, together with an additional timestamp. The remote clients interpolate the position and orientation of the object between the current transformation at the time they receive the message and the transformation at the past point in time stated in the timestamp of the release message. The duration of this interpolation can be set in the system settings of the framework.
Using this type of synchronisation, object manipulation can be performed locally without the need to transmit additional transformations for each simulated object. By using reliable release messages and transforming the object to a final position, the NVE is brought back into a consistent state.
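A simplified sketch of this release handling is given below (hypothetical message fields and class names; the actual inVRs packet layout is not specified in the paper): a remote client receives the reliable release message and blends the entity towards the transmitted final position over the configured interpolation duration.

// Hypothetical sketch of release-message interpolation (not the inVRs wire format).
class ReleaseMessage {
    int userId;               // unique user id
    int entityId;             // unique entity id
    long timestamp;           // time at which the object was released
    double[] finalPosition;   // exact position where the object was dropped
}

class RemoteEntityInterpolator {
    double[] current = new double[3]; // current local position of the entity
    double elapsed = 0.0;

    // blend the local transformation towards the final transformation of the
    // release message over 'duration' seconds; called once per frame with dt
    void interpolate(ReleaseMessage msg, double duration, double dt) {
        elapsed = Math.min(elapsed + dt, duration);
        double t = elapsed / duration;          // blend factor in [0,1]
        for (int i = 0; i < 3; i++) {
            current[i] = (1.0 - t) * current[i] + t * msg.finalPosition[i];
        }
    }
}

Orientation would be interpolated analogously (e.g. by quaternion interpolation); it is omitted here for brevity.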

6 Example Setups

A variety of application areas for physics in a VE arise. An example for the use of joints would be a door. The same mechanisms could be applied to any entities in the VE. They are ideal to implement vents or handles in a safety training application. Furthermore, object manipulation with a virtual hand technique or concurrent object manipulation can be handled by the use of physics.
6.1 A Door in the VE

In this example the scene graph for the door consists of three interconnected nodes: the frame, the actual door and a door handle. The door is attached to the frame with a hinge joint on the z-axis. The rotation around the frame is restricted to +/- 90 degrees. Additionally, a threshold area of an angle of 1 degree is defined in which the door is considered closed. If no user is interacting with the door and it is in that threshold area, it will be rotated to its idle state and will be deactivated. The handle is attached to the door with a hinge joint as well. Its rotational freedom is limited to 45 degrees around the y-axis. Thresholds measure when it is rotated more than 25 degrees. In this case the joint between the frame and the door is activated and the door can be rotated. Figure 1 illustrates the setup of such a door, while Listing 1.1 shows the XML definition of the example joint setup.
In the initial state, frame, door and handle are in passive mode. None of these joints are simulated by the physics module at this stage. The user can grab the handle; once this action has taken place, the joint between handle and door is activated and simulated by the physics module. Once the user has rotated the handle over the threshold, the joint between the door and the frame becomes active. The whole door is now simulated in the physics module. Since the axes of the two joints are independent, no problems can arise during the calculation of the position of the door and its parts; it is always unique.
Using these joints for interaction does not only provide easy mechanisms to create functional elements in a VE. Furthermore, it gives the VE a very realistic


feel, since the user has the possibility to swing doors. It still has to be proven that by actually pulling down a door handle and pulling the door open the user is trained in a better way than by simply clicking on the object.

Fig. 1. A door in the VE

<!-- Door Example -->

<!-- door knob -->
<joint type="hinge" id="1">
  <entities environmentId="0" entityId1="3" entityId2="2" anchorEntity="2"/>
  <anchor xPos="-2" yPos="5" zPos="0"/>
  <axis xDir="0" yDir="0" zDir="1"/>
  <angles min="-45" max="0"/>
  <activeIF entity="1" isGrabbed="1"/>
</joint>

<!-- door body -->
<joint type="hinge" id="2" active="0">
  <entities environmentId="0" entityId1="2" entityId2="1" anchorEntity="2"/>
  <anchor xPos="4" yPos="0" zPos="0"/>
  <axis xDir="0" yDir="1" zDir="0"/>
  <angles min="-90" max="90"/>
  <deactivateIF jointId="2" angle1GT="-1"/>
  <activateIF jointId="1" angle1LT="-25"/>
</joint>

Listing 1.1. Joint definition of a door

6.2 Natural Interaction

To realise natural interaction, tracking systems are incorporated. The movement of the user's input device is directly mapped onto the cursor position and orientation in the VE. The physics module can be used to simulate object properties like gravity.
If gravity is used in the physics simulation of the environment, one client has to act as a master for the simulation. When an object is dropped, gravity and the proper rebound have to be calculated. This type of simulation has to be calculated locally by the master client, who is in control of the object. The transformations of the matrices have to be transferred over the network.
6.3 Concurrent Object Manipulation

Concurrent object manipulation as described by Broll [12] allows two users to manipulate the same attribute of the same virtual object in real time. This type of interaction can be extremely useful in construction scenarios or safety applications. Obstacles could be carried away or building material could be arranged. The VR systems mentioned in Section 2 do not support cooperative manipulation. They lock access to an object exclusively to a single user. Broll suggests combining interaction requests and calculating the resulting object position on one participant's site to keep the system consistent. An alternative approach was developed by Froehlich et al. [16], who incorporate physics to cooperatively manipulate objects during assembly tasks. The approach developed by Froehlich attaches a physically simulated spring between the cursor of the user and the point where the user grabbed the object.
In our case, concurrent object manipulation is detected if two transformations of the same object are detected by the transformation manager. In that case a special merging is introduced, which can be implemented using Froehlich's physics approach. The resulting transformation is applied to the object. Since the immediate input transformations from the local user and the slightly delayed transformations from the remote user, which can still be extrapolated, are available, it is possible to provide a relatively correct and highly responsive representation of the cooperatively manipulated object.
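The merging step can be pictured with the following Java sketch (a simplified, hypothetical rendering of the spring idea from [16]; it is not the inVRs implementation, and all names are made up for illustration): each user's cursor is coupled to its grab point on the object by a spring, and the forces of both springs are accumulated before the physics engine applies them to the object.

// Hypothetical sketch of spring-based merging of two concurrent manipulations
// (simplified from the idea in [16]; not the actual inVRs implementation).
class SpringMerge {
    double stiffness = 50.0; // spring constant coupling a cursor to its grab point

    // accumulates the spring forces of the local and the remote cursor and
    // returns the merged force to be applied to the object by the physics engine
    double[] mergedForce(double[] localGrabPoint, double[] localCursor,
                         double[] remoteGrabPoint, double[] remoteCursor) {
        double[] force = new double[3];
        for (int i = 0; i < 3; i++) {
            force[i] += stiffness * (localCursor[i] - localGrabPoint[i]);
            force[i] += stiffness * (remoteCursor[i] - remoteGrabPoint[i]);
        }
        return force;
    }
}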

7 Conclusions and Future Work

This paper has given an introduction to the use of physics simulation for interaction in NVEs. A physics module for the inVRs framework allows joints to be defined for the interconnection of scene graph nodes. These nodes can be used for highly interactive NVEs. Three types of interaction have demonstrated the use of physics simulation for VEs, especially training scenarios.
Advanced methods for the synchronisation of physics have to be found. An approach similar to synchronising particle systems on multiple displays might be used. The distribution of random seeds for some aspects of the physics calculation might help to simulate parts of the VE locally.


References
1. Haller, M., Holm, R., Volkert, J., Wagner, R.: A VR based safety training in a petroleum refinery. In: Annual Conference of the European Association for Computer Graphics (EUROGRAPHICS 99). (1999)
2. Haller, M., Kurka, G., Volkert, J., Wagner, R.: omVR - a safety training system for a virtual refinery. In: ISMCR 99, Tokyo, Japan (1999)
3. Tate, D.L., Sibert, L., King, T.: Virtual environments for shipboard firefighting training. In: IEEE Virtual Reality Annual International Symposium (VRAIS 97), Albuquerque, NM, USA, IEEE Computer Society (1997) 61-68
4. Eberly, D.H.: Game Physics. The Morgan Kaufmann Series in Interactive 3D
Technology. Morgan Kaufmann (2004)
5. Baraff, D.: Dynamic Simulation of Non-Penetrating Rigid Bodies. PhD thesis, Department of Computer Science, Cornell University, Ithaca, NY 14853-7501, USA (1992)
6. Carlsson, C., Hagsand, O.: Dive - a multi-user virtual reality system. In: IEEE Virtual Reality Annual International Symposium (VRAIS 93), Seattle, WA, USA, IEEE Computer Society (1993) 394-400
7. Tramberend, H.: Avocado: A Distributed Virtual Environment Framework. PhD thesis, Technische Fakultät, Universität Bielefeld (2003)
8. Greenhalgh, C., Benford, S.: Massive: A distributed virtual reality system incorporating spatial trading. In: IEEE International Conference on Distributed Computing Systems (DCS 95), Vancouver, Canada, IEEE Computer Society (1995) 27-34
9. Singhal, S.K., Zyda, M.J.: Networked Virtual Environments - Design and Implementation. Addison-Wesley Professional (1999)
10. Mine, M.R.: Virtual environment interaction techniques. TR95-018, University of North Carolina, Chapel Hill, NC 27599-3175 (1995)
11. Roberts, D.J., Wolff, R., Otto, O., Steed, A.: Constructing a gazebo: Supporting teamwork in a tightly coupled, distributed task in virtual reality. Presence: Teleoperators and Virtual Environments 12 (2003) 644-657
12. Broll, W.: Interacting in distributed collaborative virtual environments. In: IEEE Virtual Reality Annual International Symposium (VRAIS 95), Los Alamitos, CA, USA, IEEE Computer Society (1995) 148-155
13. Jorissen, P., Lamotte, W.: Dynamic physical interaction platform for collaborative
virtual environments. In: CollabTech 2005, Tokyo, Japan (2005)
14. Anthes, C., Volkert, J.: inVRs - a framework for building interactive networked virtual reality systems. In: International Conference on High Performance Computing and Communications (HPCC 06), Munich, Germany, Springer (2006) 894-904
15. Anthes, C., Landertshamer, R., Bressler, H., Volkert, J.: Managing transformations and events in networked virtual environments. In: International MultiMedia Modeling Conference (MMM 07), Singapore, Springer (2007)
16. Fröhlich, B., Tramberend, H., Beers, A., Agrawala, M., Baraff, D.: Physically-based manipulation on the responsive workbench. In: IEEE Virtual Reality (VR 00), New Brunswick, NJ, USA, IEEE Computer Society (2000) 5-12

Middleware in Modern High Performance Computing System Architectures
Christian Engelmann, Hong Ong, and Stephen L. Scott
Computer Science and Mathematics Division,
Oak Ridge National Laboratory, Oak Ridge, TN 37831-6164, USA
{engelmannc,hongong,scottsl}@ornl.gov
http://www.fastos.org/molar

Abstract. A recent trend in modern high performance computing


(HPC) system architectures employs lean compute nodes running a
lightweight operating system (OS). Certain parts of the OS as well as
other system software services are moved to service nodes in order to
increase performance and scalability. This paper examines the impact of
this HPC system architecture trend on HPC middleware software solutions, which traditionally equip HPC systems with advanced features,
such as parallel and distributed programming models, appropriate system resource management mechanisms, remote application steering and
user interaction techniques. Since the approach of keeping the compute
node software stack small and simple is orthogonal to the middleware
concept of adding missing OS features between OS and application, the
role and architecture of middleware in modern HPC systems needs to
be revisited. The result is a paradigm shift in HPC middleware design,
where single middleware services are moved to service nodes, while runtime environments (RTEs) continue to reside on compute nodes.
Keywords: High Performance Computing, Middleware, Lean Compute
Node, Lightweight Operating System.

Introduction

The notion of middleware in networked computing systems stems from certain deficiencies of traditional networked operating systems (OSs), such as Unix and its derivatives, e.g., Linux, to seamlessly collaborate and cooperate. The concept of concurrent networked computing and its two variants, parallel and distributed computing, is based on the idea of using multiple networked computing systems collectively to achieve a common goal. While traditional OSs contain networking features, they lack parallel and distributed programming models, appropriate system resource management mechanisms, remote application steering and


This research is sponsored by the Mathematical, Information, and Computational Sciences Division; Office of Advanced Scientific Computing Research; U.S. Department of Energy. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725.

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 784-791, 2007.
© Springer-Verlag Berlin Heidelberg 2007



user interaction techniques, since traditional OSs were not originally designed as parallel or distributed OSs. Similarly, traditional OSs also do not differentiate between various architectural traits, such as heterogeneous distributed or massively parallel.
Since the emergence of concurrent networked computing, there have been two different approaches to deal with these deficiencies. While one approach adds missing features to an existing networked OS using middleware that sits in between the OS and applications, the other approach focuses on adding missing features to the OS by either modifying an existing networked OS or by developing a new OS specifically designed to provide the needed features. Both approaches have their advantages and disadvantages. For example, middleware is faster to prototype due to its reliance on existing OS services, while OS development is a complex task which needs to deal with issues that have already been solved in existing OSs, such as hardware drivers.
Software development for high performance computing (HPC) systems is always at the forefront with regard to both approaches. The need for efficient, scalable distributed and parallel computing environments drives the middleware approach as well as the development of modified or new OSs. Well-known HPC middleware examples are the Parallel Virtual Machine (PVM) [1], the Message Passing Interface (MPI) [2], the Common Component Architecture (CCA) [3], and the Grid concept [4]. Examples of modifications of existing OSs for HPC include the Beowulf Distributed Process Space (BProc) [5], cluster computing toolkits, like OSCAR [6] and Rocks [7], as well as a number of Single System Image (SSI) solutions, like Scyld [8] and Kerrighed [9]. Recent successes in OSs for HPC systems are Catamount on the Cray XT3/4 [10] and the Compute Node Kernel (CNK) on the IBM Blue Gene/L system [11].
A runtime environment (RTE) is a special middleware component that resides within the process space of an application and enhances the core features of the OS by providing additional abstraction (virtual machine) models and respective programming interfaces. Examples are message passing systems, like PVM and implementations of MPI, but also component frameworks, such as CCA, dynamic instrumentation solutions, like Dyninst [12], as well as visualization and steering mechanisms, such as CUMULVS [13].
This paper examines a recent trend in HPC system architectures toward
lean compute node solutions and its impact on the middleware approach.
It describes this trend in more detail with regards to changes in HPC hardware
and software architectures and discusses the resulting paradigm shift in software
architectures for middleware in modern HPC systems.

Modern HPC System Architectures

The emergence of cluster computing in the late 90s not only made scientific computing affordable to everyone using commercial off-the-shelf (COTS) hardware, it also introduced the Beowulf cluster system architecture [14,15] (Fig. 1) with its single head node controlling a set of dedicated compute nodes. In this



Fig. 1. Traditional Beowulf Cluster System Architecture

Fig. 2. Generic Modern HPC System Architecture

architecture, head node, compute nodes, and interconnects can be customized to their specific purpose in order to improve efficiency, scalability, and reliability. Due to its simplicity and flexibility, many supercomputing vendors adopted the Beowulf architecture either completely in the form of HPC Beowulf clusters or in part by developing hybrid HPC solutions.
Most architectures of today's HPC systems have been influenced by the Beowulf cluster system architecture. While they are designed based on fundamentally different system architectures, such as vector, massively parallel processing (MPP), or single system image (SSI), the Beowulf cluster computing trend has led to a generalized architecture for HPC systems. In this generalized HPC system architecture (Fig. 2), a number of compute nodes perform the actual parallel computation, while a head node controls the system and acts as a gateway to users and external resources. Optional service nodes may offload specific head node responsibilities in order to improve performance and scalability. For further improvement, the set of compute nodes may be partitioned (Fig. 3), tying individual service nodes to specific compute node partitions. However, a system's architectural footprint is still defined by its compute node hardware and software configuration as well as the compute node interconnect.
System software, such as OS and middleware, has been influenced by this trend as well, but also by the need for customization and performance improvement. Similar to the Beowulf cluster system architecture, system-wide management and gateway services are provided by head and service nodes. However, in contrast to the original Beowulf cluster system architecture with its fat compute nodes running a full OS and a number of middleware services, today's HPC systems typically employ lean compute nodes (Fig. 4) with a basic OS and only a small


Fig. 3. Generic Modern HPC System Architecture with Compute Node Partitions

Fig. 4. Traditional Fat vs. Modern Lean Compute Node Software Architecture

amount of middleware services, if any middleware at all. Certain OS parts and middleware services are provided by service nodes instead.
The following overview of the Cray XT4 [16] system architecture illustrates this recent trend in HPC system architectures.
The XT4 is the current flagship MPP system of Cray. Its design builds upon a single processor node, or processing element (PE). Each PE is comprised of one AMD microprocessor (single, dual, or quad core) coupled with its own memory (1-8 GB) and a dedicated communication resource. The system incorporates two types of processing elements: compute PEs and service PEs. Compute PEs run a lightweight OS kernel, Catamount, that is optimized for application performance. Service PEs run standard SUSE Linux [17] and can be configured for I/O, login, network, or system functions. The I/O system uses the highly scalable Lustre™ [18,19] parallel file system. Each compute blade includes four compute PEs for high scalability in a small footprint. Service blades include two service


PEs and provide direct I/O connectivity. Each processor is directly connected to the interconnect via its Cray SeaStar2™ routing and communications chip over a 6.4 GB/s HyperTransport™ path. The router in the Cray SeaStar2™ chip provides six high-bandwidth, low-latency network links to connect to six neighbors in the 3D torus topology. The Cray XT4 hardware and software architecture is designed to scale steadily from 200 to 120,000 processor cores.
The Cray XT4 system architecture with its lean compute nodes is not an isolated case. For example, the IBM Blue Gene/L solution also uses a lightweight compute node OS in conjunction with service nodes. In fact, the CNK on the IBM Blue Gene/L forwards most supported POSIX system calls to the service node for execution using a lightweight remote procedure call (RPC).
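The general function-shipping idea can be sketched as follows (a deliberately simplified Java illustration; this is not the actual CNK or service node protocol, and the call number, message layout and class names are hypothetical): the compute node packs the system call identifier and its arguments into a message, sends it to its service node, and blocks until the result is returned.

// Hypothetical sketch of lightweight system-call forwarding to a service node
// (illustrative only; not the Blue Gene/L CNK protocol).
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

class SyscallForwarder {
    static final int SYS_WRITE = 4; // illustrative call number

    private final Socket serviceNode;

    SyscallForwarder(Socket serviceNode) {
        this.serviceNode = serviceNode;
    }

    // ship a write() request to the service node and wait for its result
    int write(int fd, byte[] buffer) throws IOException {
        DataOutputStream out = new DataOutputStream(serviceNode.getOutputStream());
        out.writeInt(SYS_WRITE);     // which system call
        out.writeInt(fd);            // argument: file descriptor
        out.writeInt(buffer.length); // argument: payload length
        out.write(buffer);           // argument: payload
        out.flush();

        DataInputStream in = new DataInputStream(serviceNode.getInputStream());
        return in.readInt();         // return value computed on the service node
    }
}
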
System software solutions for modern HPC architectures, as exemplified by the Cray XT4, need to deal with certain architectural limitations. For example, the compute node OS of the Cray XT4, Catamount, is a non-POSIX lightweight OS, i.e., it does not provide multiprocessing, sockets, and other POSIX features. Furthermore, compute nodes do not have direct attached storage (DAS); instead they access networked file system solutions via I/O service nodes.
The role and architecture of middleware services and runtime environments in modern HPC systems needs to be revisited as compute nodes provide fewer capabilities and scale up in numbers.

Modern HPC Middleware

Traditionally, middleware solutions in HPC systems provide certain basic services, such as a message passing layer, fault tolerance support, and runtime reconfiguration, and advanced services, like application steering mechanisms, user interaction techniques, and scientific data management. Each middleware layer is typically an individual piece of software that consumes system resources, such as memory and processor time, and provides its own core mechanisms, such as network communication protocols and plug-in management. The myriad of developed middleware solutions has led to the "yet another library" and "yet another daemon" phenomena, where applications need to link many interdependent libraries and run concurrently with service daemons.
As a direct result, modern HPC system architectures employ lean compute nodes using lightweight OSs in order to increase performance and scalability by reducing the compute node OS and middleware to the absolutely necessary. Basic and advanced middleware components are placed on compute nodes only if their function requires it, otherwise they are moved to service nodes. In fact, middleware becomes an external application support, which compute nodes access via the network. Furthermore, single middleware services on service nodes provide support for multiple compute nodes via the network. They still perform the same role, but in a different architectural configuration. While middleware services, such as daemons, run on service nodes, RTEs continue to run on compute nodes either partially by interacting with middleware services on service nodes or


completely as standalone solutions. In both cases, RTEs have to deal with existing limitations on compute nodes, such as missing dynamic library support.
While each existing HPC middleware solution needs to be evaluated regarding its original primary purpose and software architecture before porting it to modern HPC system architectures, new middleware research and development efforts need to take into account the described modern HPC system architecture features and the resulting HPC middleware design requirements.

Discussion

The described recent trend in HPC system architectures toward lean compute node solutions significantly impacts HPC middleware solutions. The deployment of lightweight OSs on compute nodes leads to a paradigm shift in HPC middleware design, where individual middleware software components are moved from compute nodes to service nodes depending on their runtime impact and requirements. The traditional interaction between middleware components on compute nodes is replaced by interaction of lightweight middleware components on compute nodes with middleware services on service nodes.
Functionality. Due to this paradigm shift, the software architecture of modern
HPC middleware needs to be adapted to a service node model, where middleware
services running on a service node provide essential functionality to middleware
clients on compute nodes. In partitioned systems, middleware services running
on a partition service node provide essential functionality to middleware clients
on compute nodes belonging to their partition only. Use case scenarios that
require middleware clients on compute nodes to collaborate across partitions are
delegated to their respective partition service nodes.
Performance and Scalability. The service node model for middleware has
several performance, scalability, and reliability implications. Due to the need of
middleware clients on compute nodes to communicate with middleware services
on service nodes, many middleware use case scenarios incur a certain latency and
bandwidth penalty. Furthermore, central middleware services on service nodes
represent a bottleneck as well as a single point of failure and control.
Reliability. In fact, the service node model for middleware is similar to the Beowulf cluster architecture, where a single head node controls a set of dedicated compute nodes. Similarly, middleware service offload, load balancing, and replication techniques may be used to alleviate performance and scalability issues and to eliminate single points of failure and control.
Slimming Down. The most intriguing aspect of modern HPC architectures is
the deployment of lightweight OSs on compute nodes and resulting limitations
for middleware solutions. While the native communication system of the compute node OS can be used to perform RPC calls to service nodes in order to
interact with middleware services, certain missing features, such as the absence
of dynamic linking, are rather hard to replace.


Service-Oriented Middleware Architecture. However, the shift toward the service node model for middleware also has certain architectural advantages. Middleware services may be placed on I/O nodes in order to facilitate advanced I/O-based online and/or realtime services, such as application steering and visualization. These services require I/O pipes directly to and from compute nodes.
Data stream processing may be performed on compute nodes, service nodes,
and/or on external resources. System partitioning using multiple I/O nodes may
even allow for parallel I/O data streams.

Conclusion

This paper describes a recent trend in modern HPC system architectures toward
lean compute node solutions, which aim at improving overall system performance
and scalability by keeping the compute node software stack small and simple. We
examined the impact of this trend on HPC middleware solutions and discussed
the resulting paradigm shift in software architectures for middleware in modern
HPC systems. We described the service node model for modern HPC middleware
and discussed its software architecture, use cases, performance impact, scalability
implications, and reliability issues.
With this paper, we also try to engage the broader middleware research and
development community beyond those who are already involved in porting and
developing middleware solutions for modern HPC architectures. Based on many
conversations with researchers, professors, and students, we realize that not many
people in the parallel and distributed system research community are aware of
this trend in modern HPC system architectures.
It is our hope that this paper provides a starting point for a wider discussion
on the role and architecture of middleware services and runtime environments
in modern HPC systems.

References
1. Geist, G.A., Beguelin, A., Dongarra, J.J., Jiang, W., Manchek, R., Sunderam,
V.S.: PVM: Parallel Virtual Machine: A Users Guide and Tutorial for Networked
Parallel Computing. MIT Press, Cambridge, MA, USA (1994)
2. Snir, M., Otto, S., Huss-Lederman, S., Walker, D., Dongarra, J.: MPI: The Complete Reference. MIT Press, Cambridge, MA, USA (1996)
3. SciDAC Center for Component Technology for Terascale Simulation Software
(CCTTSS):
High-Performance Scientific Component Research: Accomplishments and Future Directions. Available at http://www.cca-forum.org/db/news/
documentation/whitepaper05.pdf (2005)
4. Kesselman, C., Foster, I.: The Grid: Blueprint for a New Computing Infrastructure.
Morgan Kaufmann Publishers, San Francisco, CA, USA (1998)
5. Hendriks, E.: BProc: The Beowulf distributed process space. In: Proceedings of
16th ACM International Conference on Supercomputing (ICS) 2002, New York,
NY, USA (2002) 129-136


6. Hsieh, J., Leng, T., Fang, Y.C.: OSCAR: A turnkey solution for cluster computing.
Dell Power Solutions (2001) 138-140
7. Papadopoulos, P.M., Katz, M.J., Bruno, G.: NPACI Rocks: Tools and techniques
for easily deploying manageable Linux clusters. In: Proceedings of IEEE International Conference on Cluster Computing (Cluster) 2001, Newport Beach, CA,
USA (2001)
8. Becker, D., Monkman, B.: Scyld ClusterWare: An innovative architecture for
maximizing return on investment in Linux clustering. Available at http://www.
penguincomputing.com/hpcwhtppr (2006)
9. Morin, C., Lottiaux, R., Vallée, G., Gallard, P., Utard, G., Badrinath, R., Rilling, L.: Kerrighed: A single system image cluster operating system for high performance computing. In: Lecture Notes in Computer Science: Proceedings of European Conference on Parallel Processing (Euro-Par) 2003. Volume 2790., Klagenfurt, Austria (2003) 1291-1294
10. Brightwell, R., Kelly, S.M., VanDyke, J.P.: Catamount software architecture with
dual core extensions. In: Proceedings of 48th Cray User Group (CUG) Conference
2006, Lugano, Ticino, Switzerland (2006)
11. Moreira, J., Brutman, M., Castanos, J., Gooding, T., Inglett, T., Lieber, D., McCarthy, P., Mundy, M., Parker, J., Wallenfelt, B., Giampapa, M., Engelsiepen, T.,
Haskin, R.: Designing a highly-scalable operating system: The Blue Gene/L story.
In: Proceedings of International Conference on High Performance Computing, Networking, Storage and Analysis (SC) 2006, Tampa, FL, USA (2006)
12. Buck, B.R., Hollingsworth, J.K.: An API for runtime code patching. Journal of
High Performance Computing Applications (2000)
13. Kohl, J.A., Papadopoulos, P.M.: Efficient and flexible fault tolerance and migration
of scientic simulations using CUMULVS. In: Proceedings of 2nd SIGMETRICS
Symposium on Parallel and Distributed Tools (SPDT) 1998, Welches, OR, USA
(1998)
14. Sterling, T.: Beowulf cluster computing with Linux. MIT Press, Cambridge, MA,
USA (2002)
15. Sterling, T., Salmon, J., Becker, D.J., Savarese, D.F.: How to Build a Beowulf:
A Guide to the Implementation and Application of PC Clusters. MIT Press,
Cambridge, MA, USA (1999)
16. Cray Inc., Seattle, WA, USA: Cray XT4 Computing Platform Documentation.
Available at http://www.cray.com/products/xt4 (2006)
17. Novell Inc.: SUSE Linux Enterprise Distribution. Available at http://www.novell.
com/linux (2006)
18. Cluster File Systems, Inc., Boulder, CO, USA: Lustre Cluster File System.
Available at http://www.lustre.org (2006)
19. Cluster File Systems, Inc., Boulder, CO, USA: Lustre Cluster File System Architecture Whitepaper. Available at http://www.lustre.org/docs/whitepaper.pdf
(2006)

Usability Evaluation in Task Orientated Collaborative Environments
Florian Urmetzer and Vassil Alexandrov
ACET Centre, The University of Reading, Reading, RG6 6AY
{f.urmetzer, v.n.alexandrov}@reading.ac.uk

Abstract. An evaluation of usability is often neglected in the software development cycle, even though it was shown in the past that a careful look at the usability of a software product has an impact on its adoption. With the recent rise of software supporting collaboration between multiple users, it is obvious that usability testing will become more complex because of the multiplication of user interfaces and their physical distribution. The need for new usability testing methodologies and tools for their support under these circumstances is one consequence. This paper widens the methodologies of usability evaluation to computing systems supporting the solving of a collaborative work task. Additionally, details of a distributed screen recording tool are described that can be used to support usability evaluation in a collaborative context.
Keywords: collaboration, groupware, evaluation, usability, testing, HCI.

1 Introduction
Collaborations are well known to improve the outcome of projects in industry as well as in academia. The theory is based on the availability of specialist knowledge and workforce through different collaborators [1]. Software tools to support any form of
collaboration are widely used in scientific as well as in business applications. These
include tools to enhance communications between collaborators, like for example
Access Grid [2] or portals enabling text based exchange of information [3].
Additionally there are tools to support virtual organizations in their work tasks. These
tools have the aim to enable two or more individuals to work in distributed locations
towards one work task using one computing infrastructure. An example is the
Collaborative P-Grade portal [4] or distributed group drawing tools as described in
[5]. The P-Grade portal enables distributed users to make contributions to one
workflow at the same moment in time. Therefore users can actively change, build and
enhance the same workflow in an informed fashion using advantages of collaborative
working models. Similarly the distributed group drawing tool enables multiple users
to draw and share the drawing via networks.
In this paper, first the state of the art in usability evaluation methods will be looked
at, detailing the current methods for usability testing. Then an example of a
collaborative system will be given to provide an understanding of the terminology
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 792-798, 2007.
© Springer-Verlag Berlin Heidelberg 2007


collaborative software. The authors will argue that current methods have to be revised
under the new collaborative software paradigm. Finally the authors will introduce a
multiple screen recorder, which is currently under development. This recorder may
help to enable usability evaluation in collaborative environments and therefore enable
a revision of the current methodologies, namely observational studies, under the new
collaborative software paradigm.

2 Methods for Usability Evaluation and Testing


Usability testing has been defined as the systematic way to observe real users
handling a product and gathering data about the interaction with the product. The tests
can determine if the product is easy or difficult to use for the participants [6].
Therefore the aim of enquiries into usability is to uncover problems concerning the
usage of existing interfaces by users in realistic settings. The outcomes of these tests
are suggestions to improve the usability and therefore recommendations to change the interface design [7].
Historically, usability tests have been conducted at the end of software projects to enable the measurement of the usability of the project outcome and to improve the interface. One example is automatic user interface testing, where software evaluates a programmed specification of a user interface with the help of an algorithm. These automatic methods have however been described to have scaling problems when evaluating complex and highly interactive interfaces [7].
In recent years usability testing has changed to the user centred paradigm and is
therefore implemented more and more into the software design cycle. The testing
involving target users however is still an important part of the enquiry [8], [9], [10].
Two examples are the use of design mock-ups and the use of walkthroughs.
For example Holzinger [11] used drawn paper mock-ups of interfaces to elicit
information from potential users of a virtual campus interface. Paper mock-ups were
described to be used by the researchers to allow the user to interact in single and
group interviews before the interface was programmed. The researchers described the
process and methods applied in the project as useful for identifying the exact look and
feel of an interface and the requirements of the users' usability needs.
A second example for a user centric design approach is the usability walkthrough.
Using focus groups, the walkthrough ensures a quality interface and the efficiency of
the software production. This process is indicated to include users as well as product
developers and usability experts in a group setting during the design of the software.
This means that all attendees discuss the functionality and the design of the software
before the programming has started [12], [13].
The two example methods in use to define the design of interfaces are however
stressed to be additions to tests with users using the interfaces after the programming
was done [7]. In usability evaluations, after the software is programmed, most
commonly observational methods are chosen to determine the use of interfaces. For
example the researcher would observe chosen participants performing tasks while
using the tested interface. The tasks should be related to the everyday jobs that need
to be performed by the users when using the software, for example the question, can


a user send e-mails using newly developed e-mail software? However as Dumas [6]
points out the tasks to be performed should be a set of problem tasks, instead of tasks
that are seen to be rather easy. This is seen as a way of making sure that the interface
is not verified, but critically analysed.
While the user is performing the number of tasks provided, the tester would typically ask the participant to speak aloud what he is thinking during the execution of the task. This gives the tester an insight into the participant's thoughts and problems; for example, a participant would say "how do I send this e-mail now", indicating that he is looking for the send button. This method is named think aloud [14], [16]. An extension of the think aloud is to make two people work on one interface and make them communicate about their actions. This is seen as more natural than just talking to oneself [6].
A technology-supported extension of usability testing is remote usability testing. This is described as enabling the testing of participants in their natural surroundings, without the need to travel to the participants. The researchers in [16] have described their study as involving a video conferencing system, a shared whiteboard and shared application software on the participant's side. This video conferencing software will then enable the observer on his side to see the image of the participant through the video conferencing system and to see the computer screen using the shared application software. The shared whiteboard would be used to give instructions and tasks to the test participants.
In a more recent publication the researchers have used remote usability testing using Microsoft NetMeeting to share the test participant's screen and software called SnagIt to capture the shared content from the observer's computer screen. In this particular case speaker phones on the participant's side and a tape recorder on the observer's side were used to capture think aloud comments from the participant. The tasks to be performed during the test were given to the user in written form via postal mail. The major outcome of the study was the finding that remote usability testing is as accurate as face-to-face testing [17].

Fig. 1. A remote usability test using observational methods: the researcher is observing the participant using the software via the internet and a phone network. The phone is used for think aloud protocols and the participant's screen is shared using screen sharing software.


2.1 The Collaborative Challenge


With CoWebs (collaborative web spaces, e.g. wikis) it is possible to move further away from a text-only medium and to integrate collaborative multimedia, allowing the group editing of, for example, formulas [18]. Supporting the CoWeb theory is the advent of virtual organisations in the Grid paradigm. Individuals can form virtual collaborations and therefore share information and resources by electronic means across organisations [19].
An example of a tool produced under these paradigms is the Collaborative P-Grade Portal [4]. This portal enables multiple users to make contributions to one workflow at the same moment in time. This is possible through a central server, where every participant in the collaboration logs in and receives an editable representation of the workflow. The workflow jobs are lockable by only one user in the collaboration. Through the locking, the job becomes editable for the user and is shown as locked to the other users (see Fig. 2). This mechanism allows users to actively change, build and enhance the same workflow while being in different geographic locations.

Fig. 2. Example of the P-Grade Portal's central server sharing a workflow. The locks can be seen in the interfaces, indicating that the nodes are not editable.

This shift from single-user applications to collaborative applications needs to be reflected in the methodology of usability testing. At present it is not possible to run any form of usability test in these collaborative settings. However, it would be highly interesting to report such findings and therefore to gather more detailed information on the use of, and the requirements for, such systems.


3 Enabling Usability Evaluation in Collaborative Environments


The discussion above has shown that there are developed methods to enable usability
evaluation when single interfaces are used. However there is need for tools to
investigate multiple interfaces. This will have an important impact on the ability to
conduct user interface evaluation and therefore to uncover problems concerning the
usage of existing interfaces in realistic settings when using collaborative software.
Therefore the researchers of this paper have proposed a software allowing distributed
usability evaluation using observational methods.
The software allows the recording of chosen screen content and cameras, including
the sound, of clients involved in a collaborative task. Therefore the clients screen
content is captured and transferred over the network to the recording server where the
video is stored to disc.

Fig. 3. Test computers are capturing screens and cameras and transfer the images and audio to a
recording server. From there the usability tester can playback the recorded material.

The capturing of screens has been realized using the Java Media Framework. The area of the screen to be captured can be defined using x and y locations on the screen. Therefore not the whole screen has to be captured; only the part of the screen showing the application of interest to the researchers is captured.
Before the video is transferred, a timestamp is added, so that it is possible to synchronize the replay of the different clients at a later stage.
The transport is organized using RTP (Real-time Transport Protocol) from the client to the server. Finally the video is written to disc on the server.
A researcher interested in analyzing the gathered material can then play the material back from the server to a single client. This will then transfer the video, synchronized, to one client.
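The capture step can be illustrated with the following Java sketch (a simplified stand-in for the described tool; it uses java.awt.Robot instead of the Java Media Framework, omits the RTP transport, and all class names are hypothetical): a chosen screen region, defined by its x and y location and size, is grabbed and paired with a timestamp so that the replays of different clients can later be synchronized.

// Simplified illustration of region capture with a timestamp.
// This is not the Java Media Framework/RTP pipeline of the described tool;
// java.awt.Robot is used here only to show the region-capture idea.
import java.awt.AWTException;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.image.BufferedImage;

class RegionCapture {
    private final Robot robot;
    private final Rectangle region; // x, y, width, height of the shared area

    RegionCapture(int x, int y, int width, int height) throws AWTException {
        this.robot = new Robot();
        this.region = new Rectangle(x, y, width, height);
    }

    // grabs one frame of the chosen region and attaches a capture timestamp
    TimestampedFrame captureFrame() {
        BufferedImage frame = robot.createScreenCapture(region);
        return new TimestampedFrame(System.currentTimeMillis(), frame);
    }
}

class TimestampedFrame {
    final long timestampMillis;  // used later to synchronize the client replays
    final BufferedImage image;

    TimestampedFrame(long timestampMillis, BufferedImage image) {
        this.timestampMillis = timestampMillis;
        this.image = image;
    }
}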

4 Future Directions and Conclusion


The future tasks in the project will be grouped into three areas.
First, more functionality will be added to the user interfaces of the clients as well as to the researcher's interface. On the client side a choice of whether a camera and/or


audio is shared has to be added in the graphical user interface. These selections then
have to be fed back to the user in a smaller window on the screen to enable the user to
gain more control over the shared material. Additionally an option has to be created,
where the user can choose the size of the transferred video. This will prevent the
overload of networks and of the personal computer of the participant. The researcher
has to have the option to choose the video feeds during playback and to see the video feeds during recording. This should enable the researcher to intervene in the test when needed.
Second, it should be possible to add a tag to the recorded material. Tagging is the adding of information to the recorded material, which can be in text form as well as in graphical form. This should enable scientists to describe moments in time and to code the material for further analysis. This tool is seen as important by the researchers, given that during the analysis it will be hard to keep track of where usability problems occur, because the attention of the researcher has to be multiplied by the number of participants. Similar tagging mechanisms have been used in meeting replays [20].
Finally, a messenger structure allowing textual communication between the tester and the participant has to be added. This may only be text based, however it could be voice based or even video based. However, intensive testing of network overload and other issues has to be done beforehand.
In conclusion, the researchers have shown in a proof of concept [21] that it is technically possible to enable the recording of multiple screens and cameras on one server. This proof of concept is now programmed as a one-server, multiple-client application. This application will enable researchers to conduct observational studies in distributed collaborative computing environments and therefore to evaluate user interfaces.
The current methods used for such interface evaluations have been described in this paper in great detail, and the new development is described as moving from the classical observational study of one user to the observation of groups of users and their screens. At this point it is taken as proved that the observation of users via networked screen and camera recording is possible; however, tests are being conducted at present to see how effective the new methodologies are from a human factors point of view.

References
1. Dyer, J.H., Collaborative advantage: winning through extended enterprise supplier
networks. 2000, Oxford: Oxford University Press. xii, 209.
2. AccessGrid. Access Grid: a virtual community. [Internet] 2003 [cited 2006 13 March
2003]; http://www-fp.mcs.anl.gov/fl/accessgrid/].
3. Klobas, J.E. and A. Beesley, Wikis: tools for information work and collaboration. Chandos
information professional series. 2006, Oxford: Chandos. xxi, 229 p.
4. Lewis, G.J., et al., The Collaborative P-GRADE Grid Portal. 2005.
5. Saul, G., et al., Human and technical factors of distributed group drawing tools. Interact.
Comput., 1992. 4(3): p. 364-392.
6. Dumas, J.S. and J. Redish, A practical guide to usability testing. Rev. ed. 1999, Exeter:
Intellect Books. xxii, 404.
7. Mack, R.L. and J. Nielsen, Usability inspection methods. 1994, New York; Chichester:
Wiley. xxiv, 413.


8. Carroll, C., et al., Involving users in the design and usability evaluation of a clinical
decision support system. Computer Methods and Programs in Biomedicine, 2002. 69(2):
p. 123.
9. Nielsen, J., Usability engineering. 1993, Boston; London: Academic Press. xiv, 358.
10. Faulkner, X., Usability engineering. 2000, Basingstoke: Macmillan. xii, 244.
11. Holzinger, A., Rapid prototyping for a virtual medical campus interface. Software, IEEE,
2004. 21(1): p. 92.
12. Bias, R., The Pluralistic Usability Walkthrough: Coordinated Empathies, in Usability
Inspection Methods, J. Nielsen and R.L. Mack, Editors. 1994, Wiley & Sons, Inc: New
York.
13. Urmetzer, F., M. Baker, and V. Alexandrov. Research Methods for Eliciting e-Research
User Requirements. in Proceedings of the UK e-Science All Hands Meeting 2006. 2006.
Nottingham UK: National e-Science Centre.
14. Thompson, K.E., E.P. Rozanski, and A.R. Haake. Here, there, anywhere: remote usability
testing that works. in Proceedings of the 5th conference on Information technology
education table of contents. 2004. Salt Lake City, UT, USA: ACM Press, New York, NY,
USA.
15. Monty, H., W. Paul, and N. Nandini, Remote usability testing. interactions, 1994. 1(3): p.
21-25.
16. Thompson, K.E., E.P. Rozanski, and A.R. Haake, Here, there, anywhere: remote usability
testing that works, in Proceedings of the 5th conference on Information technology
education. 2004, ACM Press: Salt Lake City, UT, USA.
17. Dieberger, A. and M. Guzdial, CoWeb - Experiences with Collaborative Web Spaces, in
From Usenet to CoWebs: interacting with social information spaces, C. Lueg and D.
Fisher, Editors. 2003, Springer: London. p. x, 262 p.
18. Foster, I., C. Kesselman, and S. Tuecke, The Anatomy of the Grid: Enabling Scalable
Virtual Organizations. International Journal on Supercomputing Applications, 2001. 15(3).
19. Buckingham Shum, S., et al. Memetic: From Meeting Memory to Virtual Ethnography &
Distributed Video Analysis. in Proc. 2nd International Conference on e-Social Science.
2006. Manchester, UK: www.memetic-vre.net/publications/ICeSS2006_Memetic.pdf.
20. Urmetzer, F., et al. Testing Grid Software: The development of a distributed screen
recorder to enable front end and usability testing. in CHEP 06. 2006. India, Mumbai.

Developing Motivating Collaborative Learning Through Participatory Simulations

Gustavo Zurita1, Nelson Baloian2, Felipe Baytelman1, and Antonio Farias1

1 Management Control and Information Systems Department - Business School, Universidad de Chile
gnzurita@fen.uchile.cl, fbaytelmanp@fen.uchile.cl, anfari@fen.uchile.cl
2 Computer Science Department - Engineering School, Universidad de Chile
nbaloian@dcc.uchile.cl

Abstract. Participatory simulations are collaborative group learning activities whose goals are to improve teaching and learning, increasing motivation inside the classroom by engaging the learner in games simulating a certain system they have to learn about. They have already been applied to students at primary and secondary educational levels; however, there are still no experiences reported with higher-level students, although there are many learning subjects to which this technique can be applied. This paper presents the implementation of a framework-like tool for supporting learning activities in a business school with undergraduate students using mobile devices over an ad-hoc network.
Keywords: Handhelds. Collaborative Learning. Participatory Simulation. Gestures. Sketches. Freehand-input based. Learning and Motivation.

1 Introduction
Any experienced teacher knows that without the proper motivation for students to engage in a learning experience, even the best designed experiences will be unsuccessful. Dick and Carey [8] state: "Many instructors consider the motivation level of learners the most important factor in successful instruction." Motivation is not only important because it is a necessary causal factor of learning, but because it mediates learning and is a consequence of learning as well [20]. In other words, students who are motivated to learn will have greater success than those who are not. Participatory Simulation aims for students to have rich conceptual resources for reasoning about, and thoughtfully acting in, playful and motivational spaces, and thus to become more easily and highly engaged in the subject matter [11]. It uses the availability of mobile computing devices to give each student the capability of simple data exchanges among neighboring devices [19], [4]. These devices enable students to act as agents in simulations in which overall patterns emerge from local decisions and information exchanges. Such simulations enable students to model and learn about several types of phenomena [4], including those related to economics [4], [9].
Some research groups have implemented collaborative learning participatory simulations with handhelds and infrared beaming [16], and it has been found that these kinds of activities
provide various advantages for teaching and learning: (a) they introduce an effective instructional tool and have the potential to impact student learning positively across curricular topics and instructional activities [18], (b) they increase motivation [12], [4], and (c) they generate positive effects in engagement, self-directed learning and problem-solving [12].
Although a handheld's most natural data-entry mode is the stylus, most currently available handheld applications adopt the PC application approach that uses widgets instead of freehand-input-based paradigms (via touch screens) and/or sketching [6].
This paper presents a tool for implementing collaborative learning participatory simulations, with two general research goals: (a) to propose a conceptual framework for specifying and developing participatory simulation applications, and (b) to determine the feasibility of using it in undergraduate curricular contexts, both in terms of intended and actualized learning outcomes, particularly in the management area. An instance of the framework is described. Its implementation is simple, lightweight and fully based on wirelessly interconnected handhelds forming an ad-hoc network.

2 Related Work
A learning participatory simulation is a role-playing activity that helps to explain the coherence of complex and dynamic systems. The system maps a problem of the real world to a model with a fixed number of roles and rules. Global knowledge and patterns emerge in participatory simulations from local interactions among users, who make decisions and come to understand their impact through analysis and observation while doing the activity and/or at its end.
Researchers are highly interested in collaborative learning participatory simulations because these simulations appear to make very difficult ideas around distributed systems and emergent behavior more accessible to students [19], motivating the learning process in a playful social space [4].
Various systems using different hardware devices have already been implemented:
A futures trading simulation described in [2] enhances the learning process of concepts such as price discovery, the open outcry trading method, trading strategies of locals and brokers, and the impact of interest rates on the treasury bond futures contract.
Thinking Tags [1] uses small nametag-sized computers that communicate with each other. It was used to teach high-school students in a simulation of virus propagation and asked them to determine the rules of the propagation [5].
NetLogo [17] is an environment for developing learning participatory simulations for PCs. Simulations can be re-played, analyzed and compared with previous ones. An extension called HubNet [19] supports PCs and mobile devices for input and output.
Klopfer et al. [12] showed that the newer and more easily distributable version of Participatory Simulations on handhelds was equally as capable as the original Tag-based simulations in engaging students collaboratively in a complex problem-solving task. They feel that handheld technology holds great promise for promoting collaborative learning, as teachers struggle to find authentic ways to integrate technology into the classroom in addition to engaging and motivating students to learn science.

A collaborative learning participatory simulation in the form of a stock exchange was designed for master's students in financial theory, using architectures based on a server and clients running on desktop PCs or laptops as well as on PDAs [13].
The SimCafé experiments belong to the sociological approach, aiming at validating and consolidating models [9], [4]. In this approach, participants are stakeholders, and the witnesses of the emergence are domain experts, usually social scientists.
Based on the literature mentioned above, we have identified that no system has yet been proposed or implemented for handhelds in a wireless ad-hoc network using a pen-based interface as the main metaphor for user interaction.

3 Developing a Framework
According to [7] and [20], some factors, based on empirical evidence, that enhance motivation are:

Involve the learner. Learners must be involved in the ownership of the goals, strategies and assessment of that with which they are to be engaged. The students must feel that they are in control of their own learning.
Respond positively to questions. Responding positively to questions posed by students can enhance intrinsic motivation. Furthermore, consideration should be given to what the learner brings to the classroom: experiences, interests, and other resources.
Options and choices about the learning environment and the various curriculum components (persons, time, content, methods, etc.) must be available.
Simulating the reality. Whatever the expected learning outcomes, there must be a direct connection with the real world outside the classroom.
The shifting of responsibility for learning from the teacher to the student is fundamental to both content fulfillment and learner motivation.
Feedback and reinforcement are essential to good teaching and effective learning. When learners are given positive reinforcement, a source of motivation is tapped. Evaluation should be based on the task, rather than on comparison with the performance of other students.
Collaboration among learners is a very potent way in which an individual learner forms an interpretation of the environment and develops understanding in motivational spaces of social interactions.

Collaborative learning applications based on participatory simulations are able to meet the requirements listed above. In order to generate, design and implement them, the teacher must define learning goals, artifacts to be exchanged, behavior variables and parameters, and rules and roles for playing the simulation. In order to configure the system for a collaborative learning participatory simulation, the teacher may set up transferable objects, their behavior parameters, rules and participant roles. Then, the teacher explains the goal of the activity to the students, also describing objects, rules and roles, and how these concepts are represented in their handhelds. Rules, roles and goals should be designed to achieve a high level of social interaction between students, negotiation instances, and competition to encourage an active and motivated stance [13]. A startup process must ensure students will play an active and dynamic
role. This should be based on defining trading activities between students, including negotiation and exchange of objects, which are supported by the handhelds. These conditions depend on each particular application and may involve the following aspects: (a) type of exchange objects, (b) exchange amounts, (c) trade conditions, (d) parameters before and after the exchange, and (e) exchange objects. If students require assistance, our framework allows the teacher to wirelessly give them feedback and assessment. The teacher can (a) observe the simulation state of each participant's device and (b) modify such state in order to resolve the student's inquiry.

Fig. 1. Conceptual framework

Once the simulation is done, the teacher must guide the students' understanding of the activity. In this way, the students will construct the learning objective together, analyzing different stages of the activity.

4 Applications Implemented Using the Framework


We have implemented a lightweight platform for supporting the implementation of participatory simulation applications based on the framework proposed in Section 3. Using this platform we have successfully implemented two applications for the scenarios proposed in the previous section. The platform is a collection of Java classes which can be extended to implement the desired scenario in a very fast and easy way. They allow the definition of new roles, new products and the rules which will govern the simulation. It also offers implementations of interfaces for assigning roles and exchanging goods, which should be extended to implement the details of the desired application. The platform also implements all the necessary actions to discover other applications of the same class running on handhelds within the ad-hoc network and opens the necessary communication channels to exchange data between them.
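Since the platform's code is not listed in the paper, the following minimal sketch only illustrates the kind of extension points described above (goods with behavior variables and rules governing exchanges between roles); all class, interface and method names are hypothetical and not taken from the actual platform.

```java
// Hypothetical sketch of how a scenario might extend framework-style classes;
// the API shown here is assumed for illustration, not the platform's real one.
import java.util.Map;

// A transferable good with behavior variables such as "original price" or "product expected life".
class TransferableGood {
    final String name;
    final Map<String, Double> variables;

    TransferableGood(String name, Map<String, Double> variables) {
        this.name = name;
        this.variables = variables;
    }
}

// A rule deciding whether a proposed exchange between two roles is allowed.
interface ExchangeRule {
    boolean allows(String offeringRole, String receivingRole, TransferableGood good, double price);
}

// A concrete scenario rule: only vendors may sell to customers, and only at a non-negative price.
class TrustBuildingRule implements ExchangeRule {
    @Override
    public boolean allows(String offeringRole, String receivingRole, TransferableGood good, double price) {
        return offeringRole.equals("vendor") && receivingRole.equals("customer") && price >= 0.0;
    }
}
```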
4.1 Trust Building Simulation
This application is aimed at supporting the learning of concepts like reputation and trust by undergraduate students of business schools. In the simulated situation the roles of
vendors and customers are implemented. Customers are required to maintain a certain basket of products, which they acquire from vendors. Products have a certain lifespan which is known at the moment of purchase. Customers have to replace them when they expire. The simulation assigns a random lifetime to the products around the expected one. If a product fails before the lifetime offered by the vendor, customers may claim a money refund or a product replacement. Vendors can advertise their products freely, offering a longer lifetime to attract customers or a shorter one to gain customers' trust. They may refuse to refund the money or replace the product in order to make a better profit. In the real world, customers' trust in companies is built by repeated interaction. A positive evaluation is usually generated when product quality is satisfactory or, even, when the company reacts appropriately after a client's complaint about bad products (or services). When the simulation finishes, students must analyze these results and draw conclusions about how clients' trust impacts the companies' profit.

Fig. 2. a) Teacher drags a student icon into the vendor area to assign him a new role. b) Teacher can
create new goods using free-hand drawings.

Setup phase: To assign roles to the students (customer or vendor) the teacher uses the "activity administration" mode. Students without roles are displayed in the middle of the screen over a white area. The right area of the handheld screen (Figure 2.a) belongs to vendors and the left to consumers. The teacher assigns roles by drag-and-dropping the icon of a student into the desired area. Since in this application goods may be anything, they are created by the teacher by drawing a sketch and surrounding it with a rectangle. This produces a "goods icon", displaying an awareness of successful creation and showing a reduced icon of the original sketch at the bottom of the screen. Additional "goods icons" may then be created, as seen in Figure 2.b.
Double-clicking on a "goods icon" will open a window for defining default variables for that type of good. In this application, instance variables are "original price", "production time" and "product expected life". Once goods have been created, their icons will show up in "activity administration" mode. The teacher assigns goods to participants by dragging the goods icons over the vendor icons to allow them to produce this item, or over consumer icons to ask them to acquire this item.

Simulation phase: The simulation starts with vendors offering their products verbally and customers looking for the most convenient offer in terms of price and lifetime. Once a customer faces a vendor, they face their handhelds towards each other in order to activate the IrDA communication. This enables the customer to receive information about the vendor's reputation and allows customer and vendor to make the transaction. Figure 3 shows the three steps required to complete the transaction. When the handhelds face each other, the top region of the screen is converted into the negotiation area. The vendor drags the product to this area, and it appears in the buyer's negotiation area; the buyer accepts it by dragging it to the area of owned products. The clients keep information about the reputation of each vendor as a number ranking the vendor. At the beginning it has no value. The client can set and alter this number afterwards according to the interactions they had with the vendor and also by asking other customers for the opinion they have of a vendor.

Fig. 3. Three steps in the trade process. The vendor offers a product by dragging the object, a customer accepts it, and the vendor's stock and the customer's requirements/acquired lists get updated.
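As an aside, the per-vendor reputation number kept by each client, as described above, could be represented with a simple bookkeeping class; this sketch is purely illustrative and its names are hypothetical, not taken from the application.

```java
// Illustrative sketch of the customer-side reputation ranking described in the text;
// the concrete representation is assumed, not taken from the application.
import java.util.HashMap;
import java.util.Map;

public class ReputationBook {
    // A vendor appears only after a first interaction or a peer recommendation.
    private final Map<String, Integer> ranking = new HashMap<>();

    // Update the score after a transaction (e.g. +1 for a kept promise, -1 for a refused refund).
    public void recordInteraction(String vendorId, int delta) {
        ranking.merge(vendorId, delta, Integer::sum);
    }

    // Adopt another customer's opinion when no own experience exists yet.
    public void adoptOpinion(String vendorId, int peerScore) {
        ranking.putIfAbsent(vendorId, peerScore);
    }

    // Unknown vendors have no value at the beginning, as described in the text.
    public Integer scoreOf(String vendorId) {
        return ranking.get(vendorId);
    }
}
```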

4.2 Stock Market Simulation


This application is about learning how offer and demand are impacted by expectations and speculation. This is learnt by simulating a stock market situation. The only role present in this simulation is that of the investor, who has to buy and sell shares trying to make a profit. The teacher takes part in the simulation, introducing changes in the scenario by varying the overall prices of the companies. She can also participate by offering and buying shares in order to create unexpected situations simulating speculation. After the simulation, students and teacher can analyze the reactions of the simulated market.

Setup phase: In this scenario there is no role assignment action, since all participants have the same role. The goods are now the shares of different companies that the investors can buy and sell. The teacher creates the different shares in the same way as in the previous application. Every investor receives an initial amount of shares and money.
Simulation phase: The simulation starts by letting the investors offer and buy their shares. Figure 4 a) shows the interface of the application. Students can see the amount of shares they hold and their value, and, through a pull-down menu, a small graph with the history of the values. Shares are traded using IrDA detection. Figure 4 b) shows the three steps necessary to transfer shares among students when they agree on the price and amount of the shares to be traded. When the handhelds of buyer and seller face each other, the region at the top of the screen is converted into the trade area. The seller drags the object representing the share to this area, triggering at the buyer's handheld a dialog for entering the amount of shares and the money. Then the data of both is updated accordingly.

Fig. 4. a) The student's interface. b) The selling-buying sequence.

5 Discussion and Future Work


First results of this ongoing work have shown us that mobile technology is a suitable approach for implementing participatory simulations. In fact, one of the most motivating factors of this kind of learning activity is the face-to-face interaction students can have with each other. Technology plays a very subtle yet important role, letting the social interaction be at the center of the experience. On the other hand, we found that the platform is really a helpful tool for supporting the development of applications implementing participatory simulations and other games that are based on the exchange of artifacts between the participants. The development time required for subsequent applications can be reduced to less than a third of the original time. We believe that the most significant contribution of the work reported here is to provide a conceptual framework for applications of collaborative learning
participatory simulations, which is easy to adapt to many kinds of subject-matter content and to undergraduate curricular integration, encouraging the adoption of learner-centered strategies. The teachers who pre-evaluated the application suggest that the same technologies and ideas could be used across many subject matter areas. The design of effective learning environments in our conceptual framework includes (a) a learner-centered environment (learners construct their own meanings), (b) a knowledge-centered environment (learners connect information into coherent wholes and embed information in a context), (c) an assessment-centered environment (learners use formative and summative assessment strategies and feedback), and (d) community-centered environments (learners work within collaborative learning norms). The next phase of our investigations will develop and explore more subject-specific applications and learning and motivational measures at the student level. We are also working on developing an application which can let the teacher define a participatory simulation application without having to program a single line, only defining the roles, products and rules for exchanging products. In the current platform, a language for defining these rules which could be used to generate the application is missing.
Acknowledgments. This paper was funded by Fondecyt 1050601.

References
1. Andrews, G., MacKinnon, K.A., Yoon, S.: Using Thinking Tags with Kindergarten Children: A Dental Health Simulation, Journal of Computer Assisted Learning, 19(2), (2003), 209-219.
2. Alonzi, P., Lange, D., Betty, S.: An Innovative Approach in Teaching Futures: A Participatory Futures Trading Simulation, Financial Practice and Education, 10(1), (2000), 228-238.
3. Castro, B., Weingarten, K.: Towards experimental economics, J. of Political Economy, 78, (1970), 598-607.
4. Colella, V.: Participatory simulations: Building collaborative understanding through immersive dynamic modeling, The Journal of the Learning Sciences, 9, (2000), 471-500.
5. Colella, V., Borovoy, R., Resnick, M.: Participatory simulations: Using computational objects to learn about dynamic systems, Conf. on Human Factors in Computing Systems, (1998), 9-10.
6. Dai, G., Wang, H.: Physical Object Icons Buttons Gesture (PIBG): A new Interaction Paradigm with Pen, Proceedings of CSCWD 2004, LNCS 3168, (2005), 11-20.
7. Dev, P.: Intrinsic motivation and academic achievement, Remedial & Special Education, 18(1), (1997).
8. Dick, W., Carey, L.: The systematic design of instruction (4th ed.), Longman, New York, (1996).
9. Guyot, P., Drogoul, A.: Designing multi-agent based participatory simulations, Proceedings of the 5th Workshop on Agent Based Simulations, (2004), 32-37.
10. Hinckley, K., Baudisch, P., Ramos, G., Guimbretiere, F.: Design and Analysis of Delimiters for Selection-Action Pen Gesture Phrases in Scriboli, Proceedings of CHI 2005, ACM, (2005), 451-460.
11. Klopfer, E., Yoon, S., Perry, J.: Using Palm Technology in Participatory Simulations of Complex Systems: A New Take on Ubiquitous and Accessible Mobile Computing, Journal of Science Education and Technology, 14(3), (2005), 285-297.
12. Klopfer, E., Yoon, S., Rivas, L.: Comparative analysis of Palm and wearable computers for Participatory Simulations, Journal of Computer Assisted Learning, 20, (2004), 347-359.
13. Kopf, S., Scheele, N., Winschel, L., Effelsberg, W.: Improving Activity and Motivation of Students with Innovative Teaching and Learning Technologies, International Conference on Methods and Technologies for Learning (ICMTL), WIT Press, (2005), 551-556.
14. Landay, J., Myers, B.: Sketching interfaces: Toward more human interface design, IEEE Computer, 34(3), (2001), 56-64.
15. Long, A., Landay, J., Rowe, L.: PDA and Gesture Use in Practice: Insights for Designers of Pen-based User Interfaces, retrieved December 2006 from http://bmrc.berkeley.edu/research/publications/1997/142/clong.html
16. Soloway, E., Norris, C., Blumenfeld, P., Fishman, R., Marx, R.: Devices are Ready-at-Hand, Communications of the ACM, 44(6), (2001), 15-20.
17. Tisue, S., Wilensky, U.: NetLogo: A simple environment for modeling complexity, International Conference on Complex Systems, (2004).
18. Vahey, P., Crawford, V.: Palm Education Pioneers Program: Final Evaluation Report, SRI International, Menlo Park, CA, (2002).
19. Wilensky, U., Stroup, W.: Learning through participatory simulations: Network-based design for systems learning in classrooms, Proceedings of CSCL '99, Mahwah, NJ, (1999), 667-676.
20. Wlodkowski, R.J.: Enhancing adult motivation to learn, Jossey-Bass, San Francisco, (1985).

A Novel Secure Interoperation System


Li Jin and Zhengding Lu
Department of Computer Science & Technology,
Huazhong University of Science & Technology, Wuhan 430074, China
jessiewelcome@126.com

Abstract. Secure interoperation for a distributed system, such as a multi-domain system, is a complex and challenging task. The reason for this is that more and more abnormal requests and hidden intrusions make static access control policies fragile. The access control model presented in this paper approaches this problem by proposing the concept of a "trust-level", which involves a statistical learning algorithm, an adaptive calculating algorithm and a self-protecting mechanism to evaluate a dynamic trust degree and realize secure interoperation.
Keywords: Secure Interoperation; Trust-level; Adaptive.

1 Introduction
Traditional access control systems are defined as granting or denying requests to restrict access to sensitive resources; their main purpose is protecting the system from ordinary mistakes or known intrusions. The development of network and distributed technology has brought a major revolution to the information security area. Sharing information without sacrificing privacy and security has become an urgent need.
The emergence of Trust Management has promised a novel way to solve these problems. Many researchers have introduced the idea of trust to improve a system's security degree. It usually involves a system's risk evaluation or a user's historical reputation to decide whether they are trustworthy or not. This is insufficient to deal with such unexpected situations as dishonest network activities, identity spoofing or authentication risk. Many researchers have amply discussed the importance of trust in a dynamic access control system and reached many achievements. However, how to turn the abstract concept of trust into a numeric value has been insufficiently discussed. The motivation of this paper is drawn from this idea.
To solve these problems, we introduce a new quantitative concept, the "trust-level", into access control policies and develop a novel Adaptive Secure Interoperation System using Trust-Level (ASITL). In Section 2, we discuss some related achievements in secure interoperation and trust management in recent years. We describe the overall architecture and working flow of the ASITL in Section 3. The trust evaluation module is discussed in Section 4. We also present an interesting example in Section 5. Concluding remarks are given in Section 6.

2 Related Works
Several research efforts have been devoted to the topic of trust strategies in secure interoperation and trust management.
Ninghui Li and John C. Mitchell proposed RT [1], which combines the strengths of role-based access control and trust-management systems. RT has been developed into a systematic theory. Some trust services, such as trust establishment, negotiation, agreement and fulfillment, were reported in [2], [3]. Although they identified many security factors that might influence the trust degree, they did not propose a formalized metric to quantify it. Furthermore, Elisa B. et al. discussed secure knowledge management, focusing on confidentiality, trust, and privacy [4], and Kilho S. et al. presented a concrete protocol for anonymous access control that supported compliance to the distributed trust management model [5], both of which represent novel achievements in this area. However, the trust evaluation methods have not been essentially developed.
Compared with the above research efforts, our framework leads to a number of advantages in a secure interoperation environment.
A dishonest user cannot deceive the authentication server to reach his true intention. Once a dishonest network activity is detected, the user's trust-level will be decreased and he will not be granted any further privileges. Therefore, many potential risks can be efficiently avoided.
To gain a higher trust-level, a user seeking any advanced service has to submit correct certificates and obey the rules all the time.
With a statistical learning algorithm, even a new intrusion or an unknown event can be learned and added to the abnormal events DB.

3 Framework for ASITL


ASITL is a self-adaptive secure interoperation system. Different from traditional authentication systems, it involves a dynamic trust evaluation mechanism. It consists of three main parts: the certificate authentication mechanism, the authentication server and the trust evaluation module. Each part has its own duty, as follows:
Certificate authentication mechanism. It verifies the validity of certificates against the certificates DB.
Authentication server. With the certificate authentication mechanism, the access control policies and, if necessary, the trust evaluation module, it decides whether or not to grant a request of the current user.
Trust evaluation module. This module includes two parts: the abnormal judging mechanism and the trust-level calculating algorithm. The abnormal judging mechanism involves a self-adaptive statistical learning algorithm which uses a probabilistic method to assign a class to an abnormal event. The trust-level calculating mechanism defines a mathematical model to calculate a trust-level value from an abnormal event's kind and its occurrence number.

With the above three main parts, we can describe a typical secure session in ASITL:
Firstly, a user sends a request to the authentication server. According to the user's request and the access control policies, the authentication server asks for some necessary certificates.
Secondly, the user submits the needed certificates. If the certificates satisfy the policies, the user's request will be transmitted to the corresponding application servers, and the secure interoperation is finished. Otherwise, the authentication server sends further authentication requirements to the user and the trust evaluation module starts to work at once.
Thirdly, the user has to send other certificates once more to further prove his identity. The authentication server continues to authenticate the current user and, meanwhile, constantly updates the user's trust-level.
Finally, if the user's trust-level falls below the system's threshold value, the current session is canceled and the user is banned from the system for a predefined time-out period. Otherwise, the user has to continue submitting more certificates to verify his identity until his certificates satisfy the request and the access control policies.
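To make the session flow concrete, the following self-contained sketch replays the loop described above with toy data; the certificate model, the fixed penalty factor and all names are assumptions made for illustration and are not part of the ASITL implementation.

```java
// Simplified sketch of the session loop described above; the data model (string
// certificate names, a per-failure trust penalty) is assumed, not taken from the paper.
import java.util.Set;

public class SessionSketch {

    static final double THRESHOLD = 0.1;       // lowest acceptable trust-level
    static final double PENALTY_FACTOR = 0.7;  // assumed trust decay per failed round

    public static void main(String[] args) {
        Set<String> required = Set.of("C1", "C2");     // certificates demanded by the policy
        String[][] rounds = { {"C1"}, {"C1", "C2"} };  // certificates presented in each round

        double trustLevel = 1.0;                       // no prior abnormal events
        boolean granted = false;

        for (String[] round : rounds) {
            Set<String> presented = Set.of(round);
            if (presented.containsAll(required)) {     // policies satisfied: forward the request
                granted = true;
                break;
            }
            trustLevel *= PENALTY_FACTOR;              // trust evaluation lowers the level
            if (trustLevel < THRESHOLD) {              // below threshold: cancel session, ban user
                break;
            }
        }
        System.out.println(granted ? "request forwarded to application server" : "session canceled");
    }
}
```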

4 Trust Evaluation
To maintain consistency and simplicity, the authentication server generates a user ID and maintains a history record for each user. Generally, a trust observation period is a fixed interval of time, i.e., a threshold defined by the system, or a period between two audit-related events.
4.1 Basic Definitions
Each network event has its own features. In the network security area, researchers often abstractly divide a network event into many significant features. Similarly, common kinds of features can be grouped into a class, which might be related to a certain kind of network event. Before describing the trust evaluation process in detail, we give the basic definitions and theorems.
Definition 1: Every network event contains a set of intrinsic features. When we analyze a network event, some of these features are essential and some of them are irrelevant. We call the essential ones key features, or simply features.
Definition 2: A feature can be assigned to a mixture of one or more topic kinds, named classes, which are associated with different kinds of network events.
Theorem 1: Suppose that
1. an event E in the abnormal events DB can be described with a feature set F;
2. all features f ∈ F are mutually exclusive and are associated with one or more of a set of classes C_k;
3. a suspicious event E_i is observed through a feature set F_J = {f_1, f_2, ..., f_j, ..., f_J}.
Then the index I of the most probable event E_i is given by

I = \arg\max_i \sum_{j=1}^{J} \left[ \log p(C_{f(j)} \mid E_i) - \log p(C_{f(j)}) \right]   (1)

where p(X) denotes the probability of event X and C_{f(j)} is the class to which feature f_j is assigned.

4.2 Working Flow of Abnormal Judging

With the above definitions and theorem, we realize the self-adaptability of the abnormal judging mechanism as follows:
Step 1: Initialize the events training set by extracting general features from a large amount of abnormal events and learning to deduce some basic rules from the current abnormal feature set.
Step 2: Receive an abnormal event which needs to be classified.
Step 3: Extract the features and send them to the Event Induction module.
If the event can be assigned to a known class, its abnormal kind is transferred to the trust-level calculating module.
Otherwise, the unknown features are sent back to the training set and the current feature rules are updated for the next judging process.

4.3 Trust-Level Calculating

Using the determinations made by the abnormal judging module, the trust-level calculating algorithm updates the current user's trust-level and feeds back a quantitative trust value to the authentication server.
Definition 3: A user's trust-level is defined as follows:

T_u = \frac{1}{m} \sum_{k=1}^{m} \alpha_k^{l_k}   (1 ≤ k ≤ m)   (2)

T_u denotes the trust-level of user u, and m is the number of abnormal event kinds. α_k is the influence rate of event kind k, which is a real number between 0 and 1, and l_k is the occurrence number of event kind k. Consequently, T_u lies in the range [0, 1]. Every l_k starts at 0, so that T_u = 1 when there has been no prior interaction between the user and the authentication server (that is, for unknown users).
Suppose there are 3 kinds of abnormal events E_1, E_2, E_3 and their trust rates are α_1 = 0.9, α_2 = 0.7, α_3 = 0.5. The following tables show the trend of T_u as the different kinds of events are detected. Assuming that l_1, l_2 and l_3 all increase at the same rate, we can observe an interesting result: the more event kinds are involved and the larger l_k becomes, the faster T_u decreases.

Table 1. T_u with E_1 increasing

l1  l2  l3  T_u
0   0   0   1.0000
3   0   0   0.9097
5   0   0   0.8635
7   0   0   0.8261
9   0   0   0.7958

Table 2. T_u with E_1 and E_2 increasing

l1  l2  l3  T_u
0   0   0   1.0000
3   3   0   0.6907
5   5   0   0.5862
7   7   0   0.5202
9   9   0   0.4759

Table 3. T_u with E_1, E_2 and E_3 increasing

l1  l2  l3  T_u
0   0   0   1.0000
3   3   3   0.3990
5   5   5   0.2633
7   7   7   0.1895
9   9   9   0.1432
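The values in Tables 1-3 follow directly from equation (2); the short sketch below recomputes the Table 3 column as a sanity check (it is only an illustration of the formula, not code from the paper).

```java
// Recomputes trust-level values from equation (2): T_u = (1/m) * sum_k alpha_k^l_k,
// with the influence rates alpha = {0.9, 0.7, 0.5} used in the example.
public class TrustLevelSketch {

    static double trustLevel(double[] alpha, int[] l) {
        double sum = 0.0;
        for (int k = 0; k < alpha.length; k++) {
            sum += Math.pow(alpha[k], l[k]);
        }
        return sum / alpha.length;
    }

    public static void main(String[] args) {
        double[] alpha = {0.9, 0.7, 0.5};
        // Rows of Table 3: all three event kinds increase at the same rate.
        for (int n : new int[]{0, 3, 5, 7, 9}) {
            System.out.printf("l1=l2=l3=%d  Tu=%.4f%n", n, trustLevel(alpha, new int[]{n, n, n}));
        }
        // Prints 1.0000, 0.3990, 0.2633, 0.1895, 0.1432, matching Table 3.
    }
}
```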

5 An Example
Assume there is a file access control system. According to the sensitivity S of the files, all files can be divided into three classes A, B, and C (S_A > S_B > S_C). To maintain the security levels of this system, we define three different certificates C_1, C_2 and C_3 (C_3 > C_2 > C_1). Different combinations of certificates grant different privileges.
Table 4. File access control policies

File class   Trust-level        Certificates    History Records
A            0.7 ≤ TL < 1.0     C1, C2, C3      0.7 ≤ AVGTL
B            0.4 ≤ TL < 0.7     C1, C2          0.4 ≤ AVGTL
C            0.1 ≤ TL < 0.4     C1              0.1 ≤ AVGTL

Access control policies are defined in Table 4. There are three kinds of abnormal events E_1, E_2, E_3, and the access control policy defines the lowest threshold of T_u, named T_uT, as 0.1000. Furthermore, E_1, E_2, E_3 and their trust rates α_1, α_2, α_3 are described as follows:
Certificate_Error_Event: a user presents a needless certificate. Although it is valid, it is not the one that the authentication server asked for. This event may indicate a certificate mistake by the user. The trust influence rate of this event is α_1 = 0.9.
Certificate_Invalidation_Event: a user presents an expired, damaged or revoked certificate. This event may indicate an attempted network fraud. Its trust influence rate is α_2 = 0.7.
Request_Overflow_Event: a user sends an abnormally large number of requests. This event may indicate an attempted DoS attack or a virus intrusion. The trust influence rate of this event is α_3 = 0.5.

Jimmy wants to access some files and sends a request with his identity certificate to the authentication server. To demonstrate the secure mechanism of ASITL, we assume three different possible situations:
Jimmy is a malicious intruder: He does not have a valid certificate at all. From the beginning, he continually sends expired or damaged certificates to the authentication server. Certificate_Invalidation_Event is detected constantly and its occurrence number increases fast. When the occurrence number reaches a threshold amount, Request_Overflow_Event may also be detected. Once Jimmy's TL is below 0.1, he will be banned by the system, and the final TL with his ID will be recorded in the history record. If this result takes place more than five times, the user ID will be recorded in the Black List.
Jimmy is a potentially dangerous user: He only has a valid certificate C_1, so his privilege only allows him to access the files of Class C. But his true intention is the more sensitive files of Class A or Class B. In order to accumulate a good reputation, he maintains a high TL (0.4 ≤ TL < 1.0) and AVGTL (0.4 ≤ AVGTL) by validly accessing Class C files with certificate C_1. However, once he presents an invalid certificate C_2 or C_3, Certificate_Invalidation_Event is triggered and his TL decreases fast. Although Jimmy has built a high TL and a good history record by dealing with the less sensitive Class C files, his potential target of the more sensitive Class A or B files can never be reached.
Jimmy is a normal user: He sends a file request and the corresponding valid certificate to the authentication server. If his certificate is suited to the privilege of the request, and his TL and history records satisfy the access control policies, he will pass the authentication and his request will be responded to by the application server.

6 Conclusions and Future Work


The ASITL, which can supply secure interoperation for multiple security domains, is guided by a set of desiderata for achieving a fine-grained access control system. In
this paper, we introduce a variable value, the trust-level, to reflect a user's trust degree. Based on this value, ASITL dynamically evaluates the user's trust degree and responds to requestors through the judgment of new suspicious events. Furthermore, ASITL can make sure that all security measures have been completed before sensitive information is exchanged.
In future work, we would like to extend our work to some new areas. We need to find more efficient learning algorithms to shorten the responding period; neural network algorithms or similar methods might be involved. Moreover, we can further optimize the cooperation among the modules in the system to enhance its performance. Finally, trust evaluation for the authentication server and users' privacy issues also need to be investigated.

References
1. Li, N., Mitchell, J., Winsborough, W.: RT: A role-based trust-management framework. In: Proceedings of the 3rd DARPA Information Survivability Conference and Exposition (DISCEX III), Washington (2003) 201-212
2. Li Xiong, Ling Liu. PeerTrust: Supporting Reputation-Based Trust for Peer-to-Peer
Electronic Communities. IEEE Transactions on Knowledge and Data Engineering, Vol.16,
No.7 (2004) 843-857
3. Bhavani Thuraisingham: Trust Management in a Distributed Environment. In: Proceedings
of the 29th Annual International Computer Software and Application Conference, vol.2
(2005) 561-572
4. Elisa Bertino, Latifur R. Khan, Ravi Sandhu. Secure Knowledge Management:
Confidentiality, Trust, and Privacy. IEEE Transactions on Systems, Man, and Cybernetics.
Vol. 36, No.3 (2006) 429-438
5. Kilho Shin, Hiroshi Yasuda. Provably Secure Anonymous Access Control for
Heterogeneous Trusts. In: Proceedings of the First International Conference on Availability,
Reliability and Security (2006) 24-33

Scalability Analysis of the SPEC OpenMP Benchmarks on Large-Scale Shared Memory Multiprocessors

Karl Fürlinger1,2, Michael Gerndt1, and Jack Dongarra2

1 Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München
{fuerling, gerndt}@in.tum.de
2 Innovative Computing Laboratory, Department of Computer Science, University of Tennessee
{karl, dongarra}@cs.utk.edu

Abstract. We present a detailed investigation of the scalability characteristics of the SPEC OpenMP benchmarks on large-scale shared memory multiprocessor machines. Our study is based on a tool that quantifies four well-defined overhead classes that can limit scalability, for each parallel region separately and for the application as a whole.
Keywords: SPEC, Shared Memory Multiprocessors.

1 Introduction

OpenMP has emerged as the predominant programming paradigm for scientific applications on shared memory multiprocessor machines. The OpenMP SPEC benchmarks were published in 2001 to allow for a representative way to compare the performance of various platforms. Since OpenMP is based on compiler directives, the compiler and the accompanying OpenMP runtime system can have a significant influence on the achieved performance.
In this paper we present a detailed investigation of the scalability characteristics of the SPEC benchmarks on large-scale shared memory multiprocessor machines. Instead of just measuring each application's runtime for increasing processor counts, our study is more detailed: we measure four well-defined sources of overhead that can limit scalability, and we perform the analysis not only for the overall program but also for each individual parallel region separately.
The rest of this paper is organized as follows. In Sect. 2 we provide a brief overview of the SPEC OpenMP benchmarks and their main characteristics. In Sect. 3 we describe the methodology by which we performed the scalability analysis and the tool which we used for it. Sect. 4 presents the results of our study, while we discuss related work in Sect. 5 and conclude in Sect. 6.

2 The SPEC OpenMP Benchmarks

The SPEC OpenMP benchmarks come in two variants. The medium variant (SPEC-OMPM) is designed for up to 32 processors, and the 11 applications contained in this suite were created by parallelizing the corresponding SPEC CPU applications. The large variant (SPEC-OMPL) is based on the medium variant (with code modifications to increase scalability), but two applications (galgel and ammp) have been omitted and a larger data set is used.
Due to space limitations we omit a textual description of the background, purpose, and implementation of each application; please refer to [7] for such a description. Instead, Table 1 lists the main characteristics of each application with respect to the OpenMP constructs used for parallelization (suffix m denotes the medium variant, while suffix l denotes the large variant of each application).

Table 1. The OpenMP constructs used in each of the applications of the SPEC OpenMP benchmark suite. [The table layout was lost in extraction; for each application (wupwise m/wupwise l, swim m, swim l, mgrid m/mgrid l, applu m, applu l, galgel m, equake m, equake l, apsi m, apsi l, gafort m/gafort l, fma3d m, fma3d l, art m/art l, ammp m) it gives the number of BARRIER, LOOP, CRITICAL, LOCK, PARALLEL, PARALLEL LOOP, and PARALLEL SECTIONS constructs used for parallelization.]

3 Scalability Analysis Methodology

We performed the scalability study with our own OpenMP profiling tool, ompP [4,5]. ompP delivers a text-based profiling report at program termination that is meant to be easily comprehensible by the user. As opposed to standard subroutine-based profiling tools like gprof [6], ompP is able to report timing data and execution counts directly for various OpenMP constructs.

In addition to giving flat region profiles (number of invocations, total execution time), ompP performs overhead analysis, where four well-defined overhead classes (synchronization, load imbalance, thread management, and limited parallelism) are quantitatively evaluated. The overhead analysis is based on the categorization of the execution times reported by ompP into one of the four overhead classes. For example, time in an explicit (user-added) OpenMP barrier is considered to be synchronization overhead.
Table 2. The timing categories reported by ompP for the different OpenMP constructs and their categorization as overheads by ompP's overhead analysis. (S) corresponds to synchronization overhead, (I) represents overhead due to imbalance, (L) denotes limited parallelism overhead, and (M) signals thread management overhead.

Construct           seqT   execT   bodyT   exitBarT   enterT   exitT
MASTER
ATOMIC                     (S)
BARRIER                    (S)
USER REGION
LOOP                                       (I)
CRITICAL                                              (S)      (M)
LOCK                                                  (S)      (M)
SECTIONS                                   (I/L)
SINGLE                                     (L)
PARALLEL                                   (I)        (M)      (M)
PARALLEL LOOP                              (I)        (M)      (M)
PARALLEL SECTIONS                          (I/L)      (M)      (M)
Table 2 shows the details of the overhead classification performed by ompP. This table lists the timing categories reported by ompP (execT, enterT, etc.) for the various OpenMP constructs (BARRIER, LOOP, etc.). S, I, L, and M indicate the overhead class to which a reported time is attributed. A detailed description of the motivation for this classification can be found in [5].
A single profiling run with a certain thread count gives the overheads according to the presented model for each parallel region separately and for the program as a whole. By performing the overhead analysis for increasing thread numbers, scalability graphs as shown in Fig. 1 are generated by a set of perl scripts that come with ompP. These graphs show the accumulated runtimes over all threads; the Work category is computed by subtracting all overheads from the total accumulated execution time. Note that a perfectly scaling code would give a constant total accumulated execution time (i.e., a horizontal line) in this kind of graph if a fixed dataset is used (as is the case for our analysis of the SPEC OpenMP benchmarks).
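To make this bookkeeping concrete, the following sketch shows how per-region overhead times summed over all threads could be rolled up into the four classes, with Work as the remainder; it only illustrates the methodology and is not ompP's or its perl scripts' actual code.

```java
// Illustrative sketch: accumulate overhead times (summed over all threads) for one
// parallel region and compute the Work category as the remainder; not ompP code.
import java.util.EnumMap;
import java.util.Map;

public class OverheadSketch {

    enum Overhead { SYNC, IMBAL, LIMPAR, MGMT }

    static double work(double totalAccumulatedTime, Map<Overhead, Double> overheads) {
        double sum = 0.0;
        for (double t : overheads.values()) {
            sum += t;
        }
        return totalAccumulatedTime - sum;
    }

    public static void main(String[] args) {
        Map<Overhead, Double> o = new EnumMap<>(Overhead.class);
        o.put(Overhead.SYNC, 12.0);    // e.g. time in explicit barriers (seconds, all threads)
        o.put(Overhead.IMBAL, 40.0);   // e.g. time in the implicit exit barrier of a loop
        o.put(Overhead.LIMPAR, 0.0);
        o.put(Overhead.MGMT, 8.0);

        // For a fixed dataset, a perfectly scaling region keeps the total constant as the
        // thread count grows; growing overheads then show up as a shrinking Work share.
        System.out.println("Work = " + work(500.0, o) + " s");
    }
}
```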

4 Results

We have analyzed the scalability of the SPEC benchmarks on two cc-NUMA machines. We ran the medium size benchmarks from 2 to 32 processors on a 32-processor SGI Altix 3700 Bx2 machine (1.6 GHz, 6 MByte L3-Cache), while the tests with SPEC-OMPL (from 32 to 128 processors, with increments of 16) have been performed on a node of a larger Altix 4700 machine with the same type of processor. The main differences to the older Altix 3700 Bx2 are an upgraded interconnect network (NumaLink4) and a faster connection to the memory subsystem.
Every effort has been made to ensure that the applications we have analyzed are optimized like production code. To this end, we used the same compiler flags and runtime environment settings that have been used by SGI in the SPEC submission runs (this information is listed in the SPEC submission reports), and we were able to achieve performance numbers that were within the range of variation to be expected from the slightly different hardware and software environment.
The following text discusses the scalability properties we were able to identify in our study. Due to space limitations we cannot present a scalability graph for each application or even for each parallel region of each application. Fig. 1 shows the most interesting scalability graphs of the SPEC OpenMP benchmarks we have discovered. We also have to limit the discussion to the most interesting phenomena visible and cannot discuss each application.
Results for the medium variant (SPEC-OMPM):
wupwise m: This application scales well from 2 to 32 threads; the most significant overhead visible is load imbalance increasing almost linearly with the number of threads used (it is less than 1% for 2 threads and rises to almost 12% of aggregated execution time for 32 threads). Most of this overhead is incurred in two time-consuming parallel loops (muldoe.f 63-145 and muldeo.f 63-145).
swim m: This code scales very well from 2 to 32 threads. The only discernible overhead is a slight load imbalance in two parallel loops (swim.f 284-294 and swim.f 340-352), each contributing about 1.2% overhead with respect to the aggregated execution time for 32 threads.
mgrid m: This code scales relatively poorly (cf. Fig. 1a). Almost all of the application's 12 parallel loops contribute to the bad scaling behavior with increasingly severe load imbalance. As shown in Fig. 1a, there appears to be markedly reduced load imbalance for 32 and 16 threads. Investigating this issue further we discovered that this behavior is only present in three of the application's parallel loops (mgrid.f 265-301, mgrid.f 317-344, and mgrid.f 360-384). A source-code analysis of these loops reveals that in all three instances, the loops are always executed with an iteration count that is a power of two (which ranges from 2 to 256 for the ref dataset). Hence, thread counts that are not powers of two generally exhibit more imbalance than powers of two.

[Figure 1: eight stacked-overhead scalability graphs, panels (a) mgrid m, (b) applu m, (c) galgel m, (d) galgel m (lapack.f90 5081-5092), (e) equake m (quake.c 1310-1319), (f) swim l, (g) mgrid l, (h) applu l; each panel shows the Work, Sync, Imbal, Limpar, and Mgmt categories.]
Fig. 1. Scalability graphs for some of the applications of the SPEC OpenMP benchmark suite. Suffix m refers to the medium size benchmark, while l refers to the large scale benchmark. The x-axis denotes processor (thread) count and the y-axis is the accumulated time (over all threads) in seconds.


applu m: The interesting scalability graph of this application (Fig. 1b) shows super-linear speedup. This behavior can be attributed exclusively to one parallel region (ssor.f 138-209) in which most of the execution time is spent (this region contributes more than 80% of total execution time); the other parallel regions do not show a super-linear speedup. To investigate the reason for the super-linear speedup we used ompP's ability to measure hardware performance counters. By common wisdom, the most likely cause of super-linear speedup is the increase in overall cache size that allows the application's working set to fit into the cache for a certain number of processors. To test this hypothesis we measured the number of L3 cache misses incurred in the ssor.f 138-209 region, and the results indicate that, in fact, this is the case. The total number of L3 cache misses (summed over all threads) is at 15 billion for 2 threads and at 14.8 billion for 4 threads. At 8 threads the cache misses reduce to 3.7 billion, and at 12 threads they are at 2.0 billion, from where on the number stays approximately constant up to 32 threads.
galgel m: This application scales very poorly (cf. Fig. 1c). The most significant sources of overhead that are accounted for by ompP are load imbalance and thread management overhead. There is also, however, a large fraction of overhead that is not accounted for by ompP. A more detailed analysis of the contributing factors reveals that in particular one small parallel loop contributes to the bad scaling behavior: lapack.f90 5081-5092. The scaling graph of this region is shown in Fig. 1d. The accumulated runtime for 2 to 32 threads increases from 107.9 to 1349.1 seconds (i.e., the 32-thread version is only about 13% faster in wall-clock time than the 2-processor execution).
equake m: Scales relatively poorly. A major contributor to the bad scalability is the small parallel loop at quake.c 1310-1319. The contribution to the wall-clock runtime of this region increases from 10.4% (2 threads) to 23.2% (32 threads). Its bad scaling behavior (Fig. 1e) is a major limiting factor for the application's overall scaling ability.
apsi m: This code scales poorly from 2 to 4 processors, but from there on the scaling is good. The largest identifiable overheads are imbalances in the application's parallel loops.
Results for the large variant (SPEC-OMPL):
wupwise l: This application continues to scale well up to 128 processors. However, the imbalance overhead already visible in the medium variant increases in severity.
swim l: The dominating source of inefficiency in this application is thread management overhead, which dramatically increases in severity from 32 to 128 threads (cf. Fig. 1f). The main source is the reduction of three scalar variables in the small parallel loop swim.f 116-126. At 128 threads more than 6 percent of the total accumulated runtime is spent in this reduction operation. The time for the reduction is actually larger than the time spent in the body of the parallel loop.
mgrid l: This application (cf. Fig. 1g) shows a similar behavior as the medium variant. Again lower numbers are encountered for thread counts that are powers
of two. The overheads (mostly imbalance and thread management), however, dramatically increase in severity at 128 threads.
applu l: Synchronization overhead is the most severe overhead of this application (cf. Fig. 1h). Two explicit barriers cause most of this overhead, with severities of more than 10% of total accumulated runtime each.
equake l: This code shows improved scaling behavior in comparison to the medium variant, which results from code changes that have been performed.

5 Related Work

Saito et al. [7] analyze the published results of the SPEC-OMPM suite on large
machines (32 processors and above) and describe planned changes for the then
upcoming large variant of the benchmark suite.
A paper of Sueyasu [8] analyzes the scalability of selected components of
SPEC-OMPL in comparison with the medium variant. The experiments were
performed on a Fujitsu Primepower HPC2500 system with 128 processors. A
classication of the applications into good, poor, and super-linear is given and
is more ore less in line with our results. No analysis on the level of individual
parallel regions is performed and no attempt for a overhead classication is made
in this publication.
The work of Aslot et al. [1] describes static and dynamic characteristics of
the SPEC-OMPM benchmark suite on a relatively small (4-way) UltraSPARC
II system. Similar to our study, timing details are gathered on the basis of individual regions and a overhead analysis is performed that tries to account for the
dierence in observed and theoretical (Amdahl) speedup. While the authors of
this study had to instrument their code and analyze the resulting data manually,
our ompP tool performs this task automatically.
Fredrickson et al. [3] have evaluated, among other benchmark codes, the performance characteristics of seven applications from the OpenMP benchmarks on
a 72 processor Sun Fire 15K. In their ndings, all applications scale well with
the exception of swim and apsi (which is not in line with our results, as well as,
e.g. [7]). This study also evaluates OpenMP overhead by counting the number
of parallel regions and multiplying this number with an empirically determined
overhead for creating a parallel region derived from an execution of the EPCC
micro-benchmarks [2]. Compared to our approach, this methodology of estimating the OpenMP overhead is less exible and accurate, as for example it does
not account for load-imbalance situations and requires an empirical study to
determine the cost of a parallel region. Note that in our study all OpenMP-related overheads are accounted for, i.e., the work category does not contain any
OpenMP-related overhead.

Conclusion and Future Work

We have presented a scalability analysis of the medium and large variants of the SPEC OpenMP benchmarks. The applications show widely different scaling behavior, and we have demonstrated that our tool ompP can give interesting,
detailed insight into this behavior and can provide valuable hints towards an
explanation for the underlying reason. Notably, our scalability methodology encompasses four well-defined overhead categories and offers insights into how the
overheads change with increasing numbers of threads. Also, the analysis can
be performed for individual parallel regions and, as shown by the examples, the
scaling behavior can be widely different. One poorly scaling parallel region can
have an increasingly detrimental influence on an application's overall scalability
characteristics.
Future work is planned along two directions. Firstly, we plan to exploit ompP's
ability to measure hardware performance counters to perform a more detailed
analysis of memory access overheads. All modern processors allow the measurement of cache-related events (misses, references) that can be used for this
purpose. Secondly, we plan to exploit the knowledge gathered in the analysis
of the SPEC benchmarks for an optimization case study. Possible optimizations
suggested by our study include the privatization of array variables, changes to
the scheduling policy of loops, and avoiding the use of poorly implemented
reduction operations.

References
1. Vishal Aslot and Rudolf Eigenmann. Performance characteristics of the SPEC
OMP2001 benchmarks. SIGARCH Comput. Archit. News, 29(5):31–40, 2001.
2. J. Mark Bull and Darragh O'Neill. A microbenchmark suite for OpenMP 2.0. In
Proceedings of the Third Workshop on OpenMP (EWOMP '01), Barcelona, Spain,
September 2001.
3. Nathan R. Fredrickson, Ahmad Afsahi, and Ying Qian. Performance characteristics of OpenMP constructs, and application benchmarks on a large symmetric
multiprocessor. In Proceedings of the 17th ACM International Conference on Supercomputing (ICS 2003), pages 140–149, San Francisco, CA, USA, 2003. ACM
Press.
4. Karl Fürlinger and Michael Gerndt. ompP: A profiling tool for OpenMP. In Proceedings of the First International Workshop on OpenMP (IWOMP 2005), Eugene,
Oregon, USA, May 2005. Accepted for publication.
5. Karl Fürlinger and Michael Gerndt. Analyzing overheads and scalability characteristics of OpenMP applications. In Proceedings of the Seventh International Meeting
on High Performance Computing for Computational Science (VECPAR '06), Rio de
Janeiro, Brazil, 2006. To appear.
6. Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. gprof: A call graph
execution profiler. SIGPLAN Not., 17(6):120–126, 1982.
7. Hideki Saito, Greg Gaertner, Wesley B. Jones, Rudolf Eigenmann, Hidetoshi
Iwashita, Ron Lieberman, G. Matthijs van Waveren, and Brian Whitney. Large
system performance of SPEC OMP2001 benchmarks. In Proceedings of the 2002
International Symposium on High Performance Computing (ISHPC 2002), pages
370–379, London, UK, 2002. Springer-Verlag.
8. Naoki Sueyasu, Hidetoshi Iwashita, Kohichiro Hotta, Matthijs van Waveren, and
Kenichi Miura. Scalability of SPEC OMP on Fujitsu PRIMEPOWER. In Proceedings of the Fourth Workshop on OpenMP (EWOMP '02), 2002.

Analysis of Linux Scheduling with VAMPIR


Michael Kluge and Wolfgang E. Nagel
Technische Universität Dresden, Dresden, Germany
{Michael.Kluge,Wolfgang.Nagel}@tu-dresden.de

Abstract. Analyzing the scheduling behavior of an operating system
becomes more and more interesting because multichip mainboards and
multi-core CPUs are available for a wide variety of computer systems.
Those systems can range from a few CPU cores to thousands of cores. Up
to now there is no tool available to visualize the scheduling behavior of a
system running Linux. The Linux kernel has a unique implementation
of threads: each thread is treated as a process. In order to be able to
analyze scheduling events within the kernel, we have developed a method
to dump all information needed to analyze process switches between
CPUs into files. These data will then be analyzed using the VAMPIR
tool. Traditional VAMPIR displays will be reused to visualize scheduling
events. This approach makes it possible to follow processes as they switch between
CPUs as well as to gather statistical data, for example the number
of process switches.

Introduction

The VAMPIR [7] tool is widely used to analyze the behavior of parallel (MPI,
OpenMP and pthreads) as well as sequential programs. This paper will demonstrate how the capabilities of VAMPIR can be used to analyze scheduling events
within the Linux kernel. These events are gathered by a Linux kernel module
that has been developed by the authors. This development has been motivated
by a scheduling problem of an OpenMP program that will be used within this
paper to demonstrate the application of the software.
Linux itself is an operating system with growing market share in the HPC
environment. Linux has its own way of implementing threads. A thread is nothing
more than a process that shares some data with other processes. Within the
Linux kernel there is no distinction between a thread and a process. Each thread
also has its own process descriptor. So within this paper the terms thread and
process do not differ much. Although we will talk about OpenMP threads,
those threads are also handled by the Linux Kernel as normal processes when
we are talking about scheduling.
The first section gives a short overview of the state of the art in monitoring the Linux kernel. The next section is dedicated to our Linux kernel module
and the output to OTF [1]. The third section will show how various VAMPIR displays that have been designed to analyze time lines of parallel programs
or messages in MPI programs can be reused for a visual analysis of Linux
scheduling events. This paper is closed by a short summary and an outlook.

Analyzing Scheduling Events in the Linux Kernel

Analyzing scheduling events is an interesting piece within the whole field of performance analysis due to effects that can be traced back to a specific process
placement or cache thrashing. Within this paper we are referring to a multiprogramming environment. This means that multiple programs run in parallel
on a given set of CPUs. The processes associated with these programs are not
pinned to a specific CPU. Therefore, the scheduler is free to place the processes
as needed onto available CPUs.
We have identified two main approaches to analyze the scheduling behavior
of a specific system. The first idea is to instrument the kernel scheduler itself
to monitor its actions. This would have the advantage of having insight into
scheduler decisions. Another idea is an indirect approach. If the CPU number
that a process is running on over time is traced, as well as information about the
process state (running or suspended), the priority, the nice value, interactivity,
etc., one can also show strengths and weaknesses of the scheduler. All the
information needed for the second approach is available within the process
descriptor in the Linux kernel. The information needed for the first approach is
available only within the scheduler implementation and not globally
in the kernel. In contrast, the list of current tasks and their properties is
available everywhere within the kernel.
There is no existing tool we have found that is able to gather the information described above. The Linux Trace Toolkit [5] collects information about
processes but does not have any information about the CPU number a process
is running on. There are tools that are able to instrument a kernel (like KernInst
[11] or KTau [8]) that require a unique program to be written and put into the
kernel. Monitoring the /proc file system [9] would be a solution but cannot
provide the fine granularity needed. For AIX, the AIX kernel trace facilities can
be used to gather data about various events in the kernel [2].
For really fine-grained monitoring of the process-to-CPU mapping and the
process state, we decided to try a different way.

Tracing Scheduling Events

Our approach utilizes the second idea from the section above. Because the information about all tasks on the system is available at each point in the Linux
kernel, the main idea is to write a kernel module that dumps the information
needed at specifiable time intervals. Some kind of GUI or automatic analysis
tool could later be used to analyze this data. The advantage of a kernel module
is the ability to load and unload the module as needed as well as the short time
for recompilation after a change in the source code because the kernel itself is
not being touched [4].
The design of the kernel module is as follows. A kernel thread is created and
inspects all given threads at an adjustable time interval. The minimum time
between two inspections is the so called kernel frequency which can be chosen

at kernel setup as 100, 250 or 1000 ticks per second. The kernel module is
given a particular process id (PID) to watch. It will inspect this PID and all its
children and will dump all needed information (actually CPU and process state)
to the relayfs [12] interface. This way the user is able to select all processes
or any part of the current process tree. One problem here is processes that
get reparented. If a process that still has child processes finishes, those child
processes will get the init process (1) as parent. If not all processes but a specific
subset is traced, these child processes will vanish from the trace at this point in
time.
The kernel module itself can be started, configured and stopped via an interface that has been made available through the sysfs file system. So the kernel
module can stay within the kernel without generating overhead when nothing
needs to be measured.
To dump the data gathered to the disk, a relatively new part of the Linux
kernel, relayfs, is used. relayfs is a virtual file system that has been designed
for efficient transfer of large amounts of data from the kernel to user space.
It uses one thread per CPU to collect data through a kernel-wide interface and
to transfer the data. The data is collected inside relayfs within sub-buffers. Only
full sub-buffers are transferred to the user space. On the user side, one thread
per CPU is running to collect the full sub-buffers and to write the data to a file
(one file per CPU). This thread is sleeping until it gets a signal from the kernel
that a full sub-buffer is available. This approach is scalable and disturbs the
measured tasks as little as possible.
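A heavily simplified sketch of this sampling scheme is shown below, assuming a 2.6-era kernel. The record layout (struct sched_sample), the channel variable and the fixed 10 ms interval are illustrative assumptions rather than the authors' actual module, and the relay and kthread interfaces differ between kernel versions.

/* Hypothetical sketch of the sampling kernel thread (2.6-era kernel APIs). */
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/relay.h>
#include <linux/delay.h>
#include <linux/jiffies.h>

struct sched_sample {                 /* fixed-size record per task and sample */
    unsigned long long timestamp;
    pid_t pid;
    int   cpu;                        /* CPU the task is currently placed on */
    long  state;                      /* runnable, sleeping, ... */
};

static struct rchan *sched_chan;      /* relay channel, opened during module init */

static int sampler_thread(void *unused)
{
    struct task_struct *p;
    struct sched_sample s;

    while (!kthread_should_stop()) {
        rcu_read_lock();
        for_each_process(p) {
            /* the real module restricts this to the watched PID and its children */
            s.timestamp = jiffies;
            s.pid   = p->pid;
            s.cpu   = task_cpu(p);
            s.state = p->state;
            relay_write(sched_chan, &s, sizeof(s));
        }
        rcu_read_unlock();
        msleep(10);                   /* adjustable sampling interval */
    }
    return 0;
}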
In summary, the kernel module currently supports the following features:
- enable/disable tracing from user space on demand
- tracing of user-selectable processes or tracing the whole system
- changing parameter settings from user space (via sysfs)

Using VAMPIR to Analyze Scheduling Events

Now we have a collection of events that describes which process has been on
which CPU in which state in the system at different timestamps. The amount
of data can become very large and needs a tool to be analyzed. Some kind of
visual and/or automatic analysis is needed here. There are basically two different
things that we want to analyze from those trace files:
1. number of active processes on each CPU
2. following the different processes (and their current states) on the CPUs over
time
As threads and processes are basically treated the same way by the Linux
kernel, the hierarchical structure between all processes/threads is also known at
this point. For the first application it is possible to count the tasks in the state
runnable on each CPU. To actually be able to view this data, the following
approaches have been identified:


- each CPU is mapped to what VAMPIR recognizes as a process
- task switches can be shown as a one-byte message between the associated CPUs (processes)
- forks and joins can also be shown as messages
- the number of forks, joins, and task switches per kernel tick are put into counters (a minimal conversion sketch is given below)
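The sketch below illustrates this mapping for a single task switch; the otf_* helpers are hypothetical stand-ins (here they only print) for calls into the OTF writer, so they are neither the real OTF API nor the authors' converter, and the sample layout is an assumption.

#include <stdio.h>

/* Hypothetical stand-ins for the OTF writer calls used by the converter. */
static void otf_message(unsigned long long t, int from_cpu, int to_cpu, int tag)
{
    printf("msg   t=%llu %d -> %d tag=%d len=1\n", t, from_cpu, to_cpu, tag);
}
static void otf_enter(unsigned long long t, int cpu, int runnable)
{
    printf("enter t=%llu cpu=%d function \"%d\"\n", t, cpu, runnable);
}
static void otf_leave(unsigned long long t, int cpu)
{
    printf("leave t=%llu cpu=%d\n", t, cpu);
}

enum { MAX_CPUS = 64, TAG_SWITCH = 1 };
static int runnable_on[MAX_CPUS];     /* current runnable count per CPU */

typedef struct { unsigned long long time; int pid; int cpu; } sample_t;

/* Called for two consecutive samples of the same task. */
static void handle_samples(const sample_t *prev, const sample_t *cur)
{
    if (prev->cpu == cur->cpu)
        return;                       /* no switch, nothing to draw */

    /* a one-byte "message" from the old CPU line to the new one */
    otf_message(cur->time, prev->cpu, cur->cpu, TAG_SWITCH);

    /* update the "number of runnable tasks" state function on both CPUs */
    otf_leave(cur->time, prev->cpu);
    otf_enter(cur->time, prev->cpu, --runnable_on[prev->cpu]);
    otf_leave(cur->time, cur->cpu);
    otf_enter(cur->time, cur->cpu, ++runnable_on[cur->cpu]);
}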
By using different message tags for forks, joins and task switches, different
colors can be used within VAMPIR to make the display even clearer. The
filter facilities of VAMPIR can be used to analyze CPU switches, forks or joins
independently. Due to the zooming feature of VAMPIR (which updates each
open display to the actual portion of the time line that is selected) it is possible
to analyze the scheduling behavior over time.
At the beginning, all CPUs (processes for VAMPIR) enter a function called
0. When the first process is scheduled onto a CPU, it will leave this function and
enter a function called 1. By following this idea we can have a very informative display of the number of runnable processes on the different CPUs. By
looking at VAMPIR's time line and the counter time line in parallel we already
get a good feeling on what was happening on the system.
For following the processes over different CPUs, this scheme needs to be extended to have not only one VAMPIR process line per CPU but multiple
process lines per CPU, onto which the real processes will be placed. Those lines
will be called streams on that CPU from now on. In this scenario, processes that
enter a specific CPU will be placed on a free stream. So for each process one
or two virtual functions were defined for VAMPIR. One is always needed and
denotes that on one stream a specific process ID is present. This can further
be extended to have distinct virtual VAMPIR functions for the two states of a
process (running/not running). In the second case we can generate a leave event
for one virtual function and an enter event to the other virtual function on the
same stream when a process switches its state.
The idea of modeling task switches as messages allows the use of VAMPIR's
Message Statistics window to analyze how many processes switched from one
CPU to another and how often this took place for each CPU (from-to) pair.

OTF Converter

To be able to analyze the collected data with VAMPIR a tool is needed to convert
the data dumped by the kernel module to a trace file. We have chosen to utilize
the OTF library due to its easy handling. Within relayfs the data obtained by
the kernel thread is dumped to the file that is associated with the CPU where
the kernel thread is running at this point in time. The converter has been
written to serialize all events within these files after the program run and to
follow the tasks when they jump between the CPUs. It generates OTF output
with all necessary information like process names and state, CPU utilization
together with various counters.


The example we will look at in the next section creates about 1 GB of
trace data in total from all CPUs. This example runs for about 6 minutes on
8 CPUs. The conversion to the OTF file format takes about one minute and
results in OTF files between 100 and 120 MB.

Example

Our example is derived from a problem observed on our Intel Montecito test
system. It has 4 Dual Core Itanium 2 CPUs running at 1.5 GHz (MT disabled).
The multiprogramming capabilities of a similar system (SGI Altix 3700) have
been investigated with the PARbench Tool [6], [3], [10]. One result here has been
that an OpenMP parallelized program that is doing independent computation in
all threads all the time (without accessing the memory) behaves unexpectedly
in an overload situation. We put eight sequential tasks and eight parallel tasks
(which open eight OpenMP threads each) on eight CPUs. So we have 72 active
threads that all need CPU time and do hardly any memory access. The algorithm
used in each task is the repeated (100000 times) calculation of 10000 Fibonacci
numbers. The sequential version takes about 2 seconds to run. The OpenMP
parallel program exists in two flavors. The first flavor has one big parallel section;
100000 × 10000 numbers are calculated in one block. The second implementation
opens and closes the OpenMP parallel section 100000 times to calculate 10000
Fibonacci numbers. One parallel task with 8 parallel threads also needs 2 seconds
for both flavors. In the overload situation, all 72 threads ran concurrently
on the system. If we used the first OpenMP implementation, all 72 tasks/threads
exited after about 20 seconds (+/- 1 second). If we use the second flavor, the
eight sequential tasks exit after 2 to 3 seconds and the parallel tasks exit after
45 to 50 seconds.

Table 1. Wall time in seconds of the sequential and parallel program versions in different execution environments

program      big OpenMP block   small OpenMP block
                                busy waiting   yield CPU
sequential   19–21              2–3            8–16
parallel     19–21              45–50          21–23
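The two flavors can be sketched as follows. This is a hypothetical reconstruction in C; the PARbench sources are not reproduced here, and the Fibonacci routine only stands in for the memory-free computation described above.

#include <omp.h>

#define OUTER 100000
#define COUNT 10000

static volatile unsigned fib_sink;

static void fib_block(int count)          /* compute 'count' Fibonacci numbers */
{
    unsigned a = 0, b = 1;
    for (int i = 0; i < count; i++) { unsigned t = a + b; a = b; b = t; }
    fib_sink = a;                         /* keep the compiler from removing the loop */
}

void flavor_big_block(void)               /* one big parallel section, one sync at the end */
{
    #pragma omp parallel
    for (int r = 0; r < OUTER; r++)
        fib_block(COUNT);
}

void flavor_small_blocks(void)            /* enters/leaves the parallel region 100000 times */
{
    for (int r = 0; r < OUTER; r++) {
        #pragma omp parallel
        fib_block(COUNT);
    }
}

With the Intel OpenMP runtime, setting export KMP_BLOCKTIME=0 makes waiting threads yield the CPU immediately instead of busy waiting for the default 200 ms, which corresponds to the "yield CPU" column of Table 1.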
We found the explanation for the different behavior after the investigation
with our tool. It is the fact that for the first flavor the tasks do not synchronize.
On an overloaded system the tasks get out of sync easily. The default behavior of
the OpenMP implementation at a synchronization point is a busy wait for 200 ms
and a call to sleep() afterwards. That way the OpenMP threads for the first flavor synchronize just once and they use their full timeslice to do calculation. In
the second flavor the parallel tasks spend part of their time slice busy waiting. By setting the busy waiting time to 0 via export KMP_BLOCKTIME=0


Fig. 1. Screenshot of the full time line; note the three different phases

this can be improved. The sequential tasks exit after 8 to 16 seconds and the
parallel tasks need between 21 and 23 seconds. The numbers are compiled in
Table 1.
The VAMPIR screenshot for the scheduling time line for all three runs is given
in Figure 1. All switches of a task from one CPU to another are marked by a
(blue) line. From the beginning of the time line to about 1:50 min the run for the
one big OpenMP block has taken place. Afterwards, the OpenMP busy waiting
example is executed. As the last example, from about 5:30 minutes to the end
of the time line, the run with disabled busy waiting is shown. Figure 2 shows all
switches from/to all CPUs. By zooming in and looking into the different parts of
the time line, the following facts could be collected for the three different runs:
1. After spawning all the processes the system is balanced after a relatively
short period of time. The load on the individual CPUs is well balanced.
Almost no rescheduling occurs during this period of time.
2. For the second run the balancing of the system takes much longer. During
the whole second run every few seconds there are some scheduling events
where tasks switch between CPUs. The reason for this is that some tasks
get suspended (after the busy wait time has elapsed) and the system needs
to be re-balanced afterwards.
3. The third case again is very different. Tasks get suspended and awakened
very often, thus the CPU utilization jitters a lot (due to the short OpenMP
regions and no busy waiting). For that reason the system never gets well
balanced but due to the fact that there are no CPU cycles spent busy waiting
this scenario has a shorter wall time than the second one.


Fig. 2. Screenshot of all process switches

Conclusion

The work presented has two main results. First of all, we designed a convenient
measurement environment to collect scheduling events from the Linux kernel (a
kernel module + relayfs). And we reused VAMPIR's capabilities for a different
purpose. Traditional displays from VAMPIR have been reinterpreted for our purposes and do provide very useful information to analyze the scheduling behavior
of a Linux system. A test case has been investigated and the underlying problem
has been identied.
For the future there are various opportunities to follow. One very interesting
idea is to correlate this information with a traditional program trace to be able
to follow effects like cache thrashing or other things that are only analyzable by
looking at the whole system and not only at a single program trace
obtained in user space.
This work has also shown that short OpenMP sections in an overload situation
on Linux are counterproductive. With busy waiting disabled this can be improved.
This way the OpenMP threads sleep while waiting at a barrier. For this there
is a possibility that the Linux kernel classifies these threads as interactive and
starts to shorten their timeslice.
The authors want to thank their colleagues Andreas Knüpfer, Holger Brunst,
Guido Juckeland and Matthias Jurenz for useful discussions and a lot of ideas.


References
1. Andreas Knüpfer, Ronny Brendel, Holger Brunst, Hartmut Mix, and Wolfgang
E. Nagel. Introducing the Open Trace Format (OTF). In Vassil N. Alexandrov,
Geert Dick van Albada, Peter M.A. Sloot, Jack Dongarra, Eds., Computational
Science – ICCS 2006: 6th International Conference, Reading, UK, May 28-31,
2006. Proceedings, volume II of Lecture Notes in Computer Science. Springer Berlin
/ Heidelberg.
2. IBM. http://publib16.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/trace_facility.htm.
3. M.A. Linn. Eine Programmierumgebung zur Messung der wechselseitigen Einflüsse von Hintergrundlast und parallelem Programm. Technical Report Jül-2416,
Forschungszentrum Jülich, 1990.
4. Robert Love. Linux Kernel Development (German translation). ISBN 3-8273-2247-2. Addison-Wesley, 1st edition, 2005.
5. Mathieu Desnoyers and Michel R. Dagenais. Low Disturbance Embedded System Tracing with Linux Trace Toolkit Next Generation. http://ltt.polymtl.ca,
November 2006.
6. W.E. Nagel. Performance evaluation of multitasking in a multiprogramming environment. Technical Report KF-ZAM-IB-9004, Forschungszentrum Jülich, 1990.
7. Wolfgang E. Nagel, Alfred Arnold, Michael Weber, Hans-Christian Hoppe, and
Karl Solchenbach. VAMPIR: Visualization and Analysis of MPI Resources. In
Supercomputer 63, Volume XII, Number 1, pages 69–80, 1996.
8. A. Nataraj, A. Malony, A. Morris, and S. Shende. Early Experiences with KTAU
on the IBM BG/L. In Proceedings of the Euro-Par 2006 Conference, LNCS 4128,
pages 99–110. Springer, 2006.
9. Red Hat Documentation. http://www.redhat.com/docs/manuals/linux/RHL-7.3-Manual/ref-guide/ch-proc.html, November 2006.
10. Rick Janda. SGI Altix: Auswertung des Laufzeitverhaltens mit neuen PARBench-Komponenten. Diplomarbeit, Technische Universität Dresden, June 2006.
11. Ariel Tamches and Barton P. Miller. Using dynamic kernel instrumentation for
kernel and application tuning. International Journal of High Performance Computing Applications, 13(3), 1999.
12. Tom Zanussi et al. relayfs home page. http://relayfs.sourceforge.net, November
2006.

An Interactive Graphical Environment for Code Optimization
Jie Tao1, Thomas Dressler2, and Wolfgang Karl2
1
Institut für Wissenschaftliches Rechnen
Forschungszentrum Karlsruhe
76021 Karlsruhe, Germany
jie.tao@iwr.fzk.de
2
Institut für Technische Informatik
76128 Karlsruhe, Germany
karl@ira.uka.de

Abstract. Applications usually do not show satisfactory initial performance and require optimization. This kind of optimization often covers a complete process, starting with gathering performance data, followed by performance visualization and analysis, up to bottleneck finding and code modification. In this paper we introduce DECO (Development Environment for Code Optimization), an interactive graphical interface that enables the user to conduct this whole process within a single
environment.
Keywords: Performance tools, visualization, cache optimization.

Introduction

General-purpose architectures are not tailored to applications. As a consequence, most applications do not show high performance when initially running on certain machines. Therefore, applications often have to be optimized to achieve the expected performance metrics.
This kind of optimization, however, is a quite tedious task for users and, as a consequence, various tools have been developed to provide support. First, users
need a performance analyzer [1] or a visualization tool [7] capable of presenting
the execution behavior and performance bottlenecks. These tools have to rely on
profilers [4], counter interfaces [3], or simulators [6] to collect performance data.
Hence, tools for data acquisition are required. In addition, users need platforms
to perform transformations on the code. These platforms can be integrated in
the analysis tool but usually exist as an individual toolkit.
In summary, users need the support of a set of different tools. Actually, most
tool vendors provide a complete toolset to help the user conduct, step by step,
the optimization process, from understanding the runtime behavior to analyzing
performance hotspots and detecting optimization objects. Intel, for example, has
developed VTune Performance Analyzer for displaying critical code regions and

Thread Profiler for presenting thread interaction and contention [5]. The Center
for Information Services and High Performance Computing at the University of
Dresden developed Vampir [2] for performance visualization and Goofi [8] for
supporting loop transformation in Fortran programs.
Similarly, we implemented a series of tools for cache locality optimization. This
includes a cache visualizer [9] which demonstrates cache problems and the reason
for them, a data profiler which collects information about global cache events, a
cache simulator [10] for delivering runtime cache activities, and a pattern analysis
tool [11] for discovering the affinity among memory operations.
By applying these tools to optimize applications, we found that it is inconvenient to use them separately. Therefore, we developed the program development
environment DECO in order to give the user a single interface for conducting the
whole process of code tuning. A more challenging motivation for us to develop
DECO is the user requirement of a platform to work with their programs. This
platform must provide all necessary functionality for building and running an
application. The third reason for a program development environment is that
users need to compare the execution time or other performance metrics of different runs with both unoptimized and optimized code versions. It is more flexible
if users can acquire this information within a single view than having to switch
across several windows.
These features are general and can be applied directly or with slight extension
by other tool developers to build a development environment for their own need.
For our purpose with cache locality optimization DECO has an additional property: it visualizes the output of the pattern analysis tool and maps the access
pattern to the source code.
The rest of the paper is organized as follows. We first describe the design and
implementation of DECO in Section 2. In Section 3 we present its specific feature
for cache locality optimization, together with some initial optimization results
with small codes. In Section 4 the paper is concluded with a short summary and
some future directions.

DECO Design and Implementation

The goal of DECO is to provide program developers with an easy-to-use environment for step-by-step running the program, analyzing the performance, conducting
optimization, executing the program again, and then studying the impact of
optimizations. This is actually a feedback loop for continuous optimization. Users
repeat this loop until satisfactory performance is achieved. To give the user a
simple view, DECO provides all needed functionality for the whole loop in a
single window.
As depicted in Figure 1, this window consists of a menu bar and three visualization areas. The menu bar is located at the top of the window, with items
for file operation, view configuration, project creation, executable building, and
performance analysis. The left field under the menu bar is a code editor, where
the program source is displayed in different colors for a better overview and an


Fig. 1. Main window of the program development environment

easy logic analysis. Next to the editor is a subwindow for demonstrating runtime
results, such as execution time, cache miss statistics, communication overhead
(e.g. for MPI programs), and analysis of specic constructs (e.g. parallel and
critical regions in OpenMP programs). In addition, the visualization of access
patterns, for our special need of cache locality optimization, is combined with
this subwindow. The output and error reports delivered during the run of applications or tools are displayed at the bottom.
Building Projects. DECO targets realistic applications, which are usually
comprised of several files including headers and a Makefile. Hence, applications are
regarded as projects within DECO. Projects must be created the first time
and then can be loaded to the DECO environment using corresponding options
in the menu bar.
Figure 2 shows two sample subwindows for creating a project. First, users have
to give the project a name and the path where the associated files are stored.
Then users can add the files that need analysis and potentially optimization
into the project by simply selecting the objects from the files at the given path, as
illustrated in the top picture of Figure 2. It is allowed to concurrently open a set
of files, but only one file is active for editing.
After modification of the source program, users can generate an executable of
the application using the menu Compile, which contains several options like make and
make clean, etc. Commands for running the application are also included.
Tool Integration. DECO allows the user to add supporting tools into the
environment. To insert a tool, information about the name, the path, and configuration parameters has to be provided. For this, DECO lists the potential parameters that could be specified as command line options for a specific toolkit. In
case of the cache simulator, for example, it allows the user to specify parameters
for the frontend, the simulator, and the application.
Using this information, DECO builds an execution command for each individual tool and combines this command with the corresponding item in the
menu Analysis of the main window. This means tools can be started via menu


Fig. 2. Creating a new project

choice. While one tool is running, users can configure another tool or edit the
applications.
Execution Results. To enable a comparative study of different program versions, DECO visualizes the output of applications. Depending on the application,
this output can be execution time, overhead for parallel programs, statistics on
memory accesses, or time for synchronization of shared memory applications.
The concrete example in Figure 1 visualizes the statistics on cache hits and
misses.

Visualizing the Access Pattern

As mentioned, a specific feature of DECO for our cache locality optimization is
the ability to depict access patterns in the source code. This leads the user
directly to the optimization object and more importantly shows the user how
to optimize. In the following, we first give a brief introduction of the pattern
analysis tool, and then describe the visualization in detail.
Pattern Analyzer. The base of the analyzer [11] is a memory reference trace
which records all runtime memory accesses. By applying algorithms, often used
in bioinformatics for pattern recognition, it finds the affinity and regularity between the references, like access chains and access strides. The former is a group
of accesses that repeatedly occur together but target different memory locations. This information can be used to perform data regrouping, a strategy for
cache optimization, which packs successively requested data into the same cache
block by defining the corresponding variables contiguously in the source code.
The latter is the stride between accesses to neighboring elements of an array.
This information can be used to guide prefetching, another cache optimization
strategy, because it tells which data is next needed.
Additionally, the pattern analyzer delivers information on whether an access
is a cache hit or a cache miss. For a cache miss, it further calculates the push
back distance, which shows the number of steps a missed access must be put back


in order to achieve a cache hit. It is clear that users can use this information to
change the order of memory accesses for a better cache hit ratio. However, it is
difficult to apply this information in its original form based on virtual addresses
and numbers. Therefore, DECO provides this specific property of mapping the
pattern to the program.
Visualization. Using DECO, patterns are displayed within the source code but
also in a separate window for a deeper insight. Figure 3 is an example of the
access chain. The window on the right side lists all detected chains with detailed
description, including the ID, number of elements, frequency of occurrence, and
the initial address. It is also possible to observe more detailed information of a
single chain, for example, all accesses contained and the miss/hit feature of each
access. This detailed information helps the user decide whether the grouping
optimization is necessary.
On the left side access chains are demonstrated in the source code. By clicking
an individual chain in the right window, the corresponding variables and the
location of references are immediately marked in the program. Users can utilize

Fig. 3. Presentation of access chains in two windows

Fig. 4. Visualization of access stride


this information to move the declarations of the associated variables into a single
code line.
Similarly, the push-back distance is also combined with the program. For a
selected variable, DECO marks the first access to it in the code. Then the remaining
accesses are marked one by one with different colors presenting their hit/miss
feature. In case of a cache miss, the position where the reference has to be issued
to avoid the miss is also marked with a color. Based on this information users
can decide if it is possible to shift an access. This optimization improves the
temporal locality of single variables and is especially effective for variables in
loops.
The access stride for arrays can be observed in three different forms. First,
an initial overview lists all detected strides with information about start
address, stride length, and the number of occurrences. A further view displays
individual strides in detail, with descriptions of all references holding this stride.
The third form is a diagram. The lower side of Figure 4 is an example. Within
this diagram, all array elements holding the stride are depicted with small blocks.
Each small block represents a byte; hence, an element usually consists of several
blocks, depending on the type of the array. For example, the concrete diagram
in Figure 4 actually demonstrates an array of 4-byte elements and a stride of
length one. The first block of each element is colored in order to exhibit the
hit/miss feature. This helps the user reorganize the array structure or change
the sequence of accesses to it, if a large number of misses has been introduced
by this array. The size of blocks can be adjusted using the spin-box at the right
corner above the diagram. This allows the user to either observe the global access
characteristics of the complete array or focus on a specic region of elements.
Sample Optimization. We show two examples to demonstrate how to use this
special feature of DECO for cache optimization. The first code implements a
simple scheduler for user-level threads. With this scheduler the threads release
the CPU resources by themselves, rather than being evicted after the timeslice.
To save registers on a thread swap, a stack is managed for each thread. The
current position in the stack has to be recorded in order to restore the registers
when the thread acquires the timeslice again. We use a struct to store this stack
pointer together with the thread's ID and the stack. On a thread switch the stack pointer
has to be updated, where the pointer of the old thread has to be written back
and the pointer of the new thread is fetched. The declaration of the struct and the
code for stack pointer update is depicted in Figure 5.
We perform 1000 switches between four threads with only the operation of
processor release. The trace analysis detects five access chains that repeat more
than 1000 times. Observing the chains in more detail, we find that among all
accesses to the associated addresses, in four of the five chains only three hits exist
each. Switching to the marked code lines in the source, we further detect that
the corresponding accesses are performed for stack pointer update. This leads
us to move the definition of stackPointer out of the struct and define all four
pointers together. In this case, the number of L1 misses reduces from 4011 to 9.
This significant improvement is the result of grouping.
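Since the scheduler source appears only as a screenshot (Fig. 5), the following is a hypothetical before/after sketch of the regrouping step; the identifiers and the stack size are assumptions, not the original code.

/* Hypothetical sketch of the regrouping optimization. */
#define NUM_THREADS 4
#define STACK_SIZE  1024

/* Before: the frequently updated stack pointer is buried in each thread's struct,
   so the four pointers touched on every switch live in four different cache blocks. */
struct thread_before {
    int            id;
    unsigned char *stackPointer;
    unsigned char  stack[STACK_SIZE];
};
struct thread_before threads_before[NUM_THREADS];

/* After: the stack pointers are defined together, so all four share one cache
   block and the switch code hits in L1 (4011 misses reduced to 9 in the paper's run). */
struct thread_after {
    int           id;
    unsigned char stack[STACK_SIZE];
};
struct thread_after threads_after[NUM_THREADS];
unsigned char *stackPointer[NUM_THREADS];   /* grouped, contiguous in memory */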


Fig. 5. Source code of the thread scheduler

The second example performs matrix addition. The working set contains three
matrices, which are declared one after another. The initial run of this code reported
12288 L1 misses without any hits. The trace analysis detected three strides of
length one, each corresponding to a single matrix. In principle, for such a stride
many accesses should be cache hits because with a size of 64 bytes one cache line
can hold several elements. The only explanation is that there exists a mapping
conflict between all three matrices and this conflict results in the eviction of a
matrix block out of cache before the other elements can be used.
To eliminate this conflict, an efficient way is to insert a buffer between two
matrices so that the mapping behavior of the second matrix is changed. As the
L1 used is a 2-way set-associative cache, meaning that two blocks mapping to the same set can reside in the cache simultaneously,
we need only to alter the mapping behavior of one matrix. Here, we put a buffer
of one cache line between the second and the third matrix. In this case, only 769
misses have been observed.
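A hypothetical sketch of this padding fix is shown below; the matrix size is an assumption, and whether the linker keeps the global arrays adjacent is toolchain-dependent, so this only illustrates the idea of shifting the third matrix by one cache line.

#define N     64
#define LINE  64                      /* cache line size in bytes */

static double a[N][N];
static double b[N][N];
static char   pad[LINE];              /* one cache line between b and c shifts   */
static double c[N][N];                /* c's mapping so a, b, c no longer collide */

void matrix_add(void)
{
    (void)pad;                        /* only there to change the address layout */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = a[i][j] + b[i][j];
}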

Conclusions

This paper introduces a program development environment that helps the user
perform code optimization. This environment was implemented because we noted
that code optimization requires the support of several tools and it is inconvenient
to use them separately. Hence, our environment integrates different tools into a
single realm. In addition, it provides a flexible interface for users to work with
their programs and to study the impact of optimizations. A more specific feature
of this environment is that it directly shows access patterns in the source code.
This allows the user to locate the optimization object and choose the strategy.

References
1. D. I. Brown, S. T. Hackstadt, A. D. Malony, and B. Mohr. Program Analysis
Environments for Parallel Language Systems: The TAU Environment. In Proc. of
the Workshop on Environments and Tools For Parallel Scientific Computing, pages
162–171, May 1994.


2. H. Brunst, H.-Ch. Hoppe, W. E. Nagel, and M. Winkler. Performance Optimization


for Large Scale Computing: The Scalable VAMPIR Approach. In Computational
Science – ICCS 2001, International Conference, volume 2074 of LNCS, pages 751–760, 2001.
3. J. Dongarra, K. London, S. Moore, P. Mucci, and D. Terpstra. Using PAPI For
Hardware Performance Monitoring On Linux Systems. In Linux Clusters: The
HPC Revolution, June 2001.
4. J. Fenlason and R. Stallman. GNU gprof: The GNU Profiler. Available at
http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html_mono/gprof.html.
5. Intel Corporation. Intel Software Development Products. Available at
http://www.intel.com/cd/software/products/asmo-na/eng/index.htm.
6. P. S. Magnusson and B. Werner. Efficient Memory Simulation in SimICS. In
Proceedings of the 8th Annual Simulation Symposium, Phoenix, Arizona, USA,
April 1995.
7. B. P. Miller, M. D. Callaghan, J. M. Cargille, J. K. Hollingsworth, R. B. Irvin,
K. L. Karavanic, K. Kunchithapadam, and T. Newhall. The Paradyn parallel
performance measurement tool. IEEE Computer, 28(11):37–46, November 1995.
8. R. Mueller-Pfefferkorn, W. E. Nagel, and B. Trenkler. Optimizing Cache Access:
A Tool for Source-To-Source Transformations and Real-Life Compiler Tests. In
Euro-Par 2004, Parallel Processing, volume 3149 of LNCS, pages 72–81, 2004.
9. B. Quaing, J. Tao, and W. Karl. YACO: A User Conducted Visualization Tool
for Supporting Cache Optimization. In High Performance Computing and Communications: First International Conference, HPCC 2005. Proceedings, volume 3726
of Lecture Notes in Computer Science, pages 694–703, Sorrento, Italy, September
2005.
10. J. Tao and W. Karl. CacheIn: A Toolset for Comprehensive Cache Inspection.
In Proceedings of ICCS 2005, volume 3515 of Lecture Notes in Computer Science,
pages 182–190, May 2005.
11. J. Tao, S. Schloissnig, and W. Karl. Analysis of the Spatial and Temporal Locality
in Data Accesses. In Proceedings of ICCS 2006, number 3992 in Lecture Notes in
Computer Science, pages 502–509, May 2006.

Memory Allocation Tracing with VampirTrace


Matthias Jurenz, Ronny Brendel, Andreas Knüpfer,
Matthias Müller, and Wolfgang E. Nagel
ZIH, TU Dresden, Germany

Abstract. The paper presents methods for instrumentation and measurement


of applications' memory allocation behavior over time. It provides some background about possible performance problems related to memory allocation as
well as to memory allocator libraries. Then, different methods for data acquisition
and representation are discussed. Finally, memory allocation tracing integrated in
VampirTrace is demonstrated with a real-world HPC example application from
aerodynamical simulation and optimization.
Keywords: Tracing, Performance Analysis, Memory Allocation.

1 Introduction
High Performance Computing (HPC) aims to achieve optimum performance on high-end platforms. The achievable performance is always limited by one or more resources,
like available processors, floating point throughput or communication bandwidth. Memory is another important resource, in particular for data intensive applications [8].
This paper presents methods for instrumentation and measurement of programs'
memory allocation behavior. The integration into VampirTrace [9] provides additional
information for trace-based performance analysis and visualization.
The rest of the first section discusses the influence of memory allocation on performance, introduces memory allocators and references some related work. The following
Sections 2 and 3 show various instrumentation approaches and ways of representing
the result data. In Section 4 the approach is demonstrated with a real-world application
example before the final Section 5 gives conclusions and an outlook on future work.
1.1 Impact of Memory Allocation on Application Performance
Memory consumption may cause notable performance effects on HPC applications, in
particular for data intensive applications with very large memory requirements. There
are three general categories of memory allocation related performance issues:
- memory requirements as such
- memory management overhead
- memory accesses
Firstly, excessive memory requirements by applications exceeding the available resources may lead to severe performance penalties. An oversized memory allocation

request might make an application fail or cause memory paging to secondary storage.
The latter might happen unnoticed but brings severe performance penalties.
Secondly, memory management may cause unnecessary overhead - see also
Section 1.2. This might be either due to frequent allocation/deallocation or due to memory placement which can cause so-called memory fragmentation.
Thirdly, accesses to allocated memory may cause performance problems. In general,
tracing is unsuitable for tracking single memory accesses because of the disproportionate
overhead. This is covered by many tools already by means of hardware counters [4].
1.2 Memory Allocators
The memory allocator implements the software interface for applications to request
memory. For example, in C it contains the malloc, realloc and free functions
and a few more. Only the memory allocator communicates with the operating system
kernel to actually request additional memory pages in the virtual address space.
Usually, memory allocators handle small requests differently than medium or large
requests. This is done for two reasons. Firstly, because the operating system can partition memory only in multiples of the virtual memory page size. And secondly, to reduce
run-time overhead for small requests.
Reserving a whole page for very small requests would cause a lot of overhead. Thus,
multiple small requests can be placed in the same page. This page can only be released
again, if all memory requests in it have been freed (memory fragmentation). However,
the memory allocator can re-use freed parts for following requests with matching sizes.
All allocators add a small amount of management data to the memory blocks delivered. Usually, this is one integer (or a few) of the machine's address size, see Section
2.3 below. Furthermore, memory allocators will pre-allocate a number of pages from
the operating system for small requests. By this means, they avoid issuing an expensive
system call for every single small request. This provides a notable performance optimization, especially in multi-threaded situations [1].
Memory allocators come as part of a platform's system library (libc) or as external
libraries that can be linked to an application explicitly. Well known examples are the
Lea allocator and the Hoard allocator library [1, 2].
1.3 Related Work
There is a large number of memory debugging tools available that check memory allocation/deallocation or even memory accesses. While such tools serve a different purpose
than the presented work, they use similar techniques to intercept the allocation calls.
One typical and small example is the ccmalloc library for debugging memory management and accesses. It uses dynamic wrapper functions to intercept memory management function calls [3]. The MemProf tool allows memory usage profiling [14]. It provides summary information about memory allocation per function. Valgrind is a very
powerful memory debugger and profiler [15]. It offers more sophisticated debugging
techniques to detect bugs in memory allocation/deallocation as well as invalid accesses.
Unlike the previous examples, Valgrind depends on simulation of target applications.


Even though this is highly optimized, it causes a notable slowdown of target applications
which is well acceptable for debugging but not for parallel performance analysis.
Besides debugging tools, there are performance analysis tools focusing on memory
allocation. The TAU tool set offers special operation modes for profiling and tracing
of memory allocation [13]. It employs preprocessor instrumentation for memory management calls and takes advantage of platform specific interfaces (Cray XT3). Also, it
introduces the memory headroom metric for remaining heap memory, see Section 2.3.
Other existing performance measurement and analysis tools focus on memory access
behavior. This is done either with summary information provided by PAPI counters [4]
or with even more detailed information which are achievable by simulation only [6].

2 Instrumentation of Allocation/De-Allocation Operations


Instrumentation is the process of inserting measurement code into a target application.
There are various general instrumentation techniques. They can be applied at source
code level, at compilation or linking time or at binary level. The different approaches
vary in terms of platform dependence, in terms of measurement overhead and even in
terms of expressiveness. Below, some approaches are discussed with special focus on
analysis of memory allocation.
2.1 The proc File System
Firstly, there are interfaces to query general process information. One of the most commonly known is the proc file system. It provides system wide information as well as
process specific data. In /proc/meminfo the total available memory of the system
is given among other details. This might not always be the amount of memory actually
available to a process because of user limits, machine partitioning or other means.
In /proc/<PID>/statm there is information about the memory usage of the
process with the given PID. It provides current memory size of heap and stack in multiples of the page size (usually 4 KB). This interface is widely supported but is rather
slow in terms of query speed. For our experiments we use it optionally to determine
total memory consumption.
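A minimal sketch of such a query is shown below, reading the process' own statm entry with the field layout documented in proc(5); it is illustrative only and not the VampirTrace code.

#include <stdio.h>
#include <unistd.h>

/* Returns the total program size in bytes, or 0 on error. */
unsigned long long statm_total_bytes(void)
{
    unsigned long size = 0, resident = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f)
        return 0;
    /* first two fields: total program size and resident set size, in pages */
    if (fscanf(f, "%lu %lu", &size, &resident) != 2)
        size = 0;
    fclose(f);
    return (unsigned long long)size * (unsigned long long)sysconf(_SC_PAGESIZE);
}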
2.2 The mallinfo Interface
The mallinfo interface, which provides detailed allocation statistics, is provided by
many allocators. Although it returns all desired information, is platform independent
and has fast query speed, it is unusable for our tool. Unfortunately, it uses 32 bit
integers by design, thus being incapable of reporting about memory intensive applications in 64 bit address space.
2.3 Autonomous Recording
An autonomous approach can be implemented by intercepting all memory allocation
operations via wrapper functions1. Wrappers evaluate the functions' arguments, e.g.
1

Wrapper functions replace calls to a target function but then call it themselves. Besides the original purpose
of the function, additional tasks like logging, measurement, checking, etc. can be issued.


memory size in malloc(), which are essential for recording memory consumption.
However, for some calls the corresponding memory sizes are unknown, e.g. for free().
It is necessary to infer the size of the deallocated memory area at this point in order to
correctly update the record. Explicitly storing this information would create memory
overhead. Fortunately, memory managers do provide it. Usually, it is stored as an integer at the position right in front of the very memory area. It can be accessed either by
the malloc_usable_size function or directly. Note that this deals with the memory allocated, which is greater than or equal to the amount requested. For consistency,
our approach always reports the amount of memory actually used, including memory
management overhead. This is greater than or equal to the user application's requests.
Internally, the amount of memory currently allocated is stored in thread-private variables within the measurement infrastructure. Multiple threads in the same process will
record their memory consumption separately. This avoids unnecessary thread synchronization. Below, there is a short discussion of different ways to intercept all memory
management calls like malloc, realloc, free, etc.
Pre-Processor Instrumentation. Compiler pre-processors can be used to replace all
function calls in the source code with given names by alternative (wrapper) functions.
Those need to be provided by the trace library.
With this approach it is possible to miss certain allocation operations hidden in library calls with inaccessible sources. This can cause falsified results. For example,
memory allocated within third party libraries might be freed by the user application
directly. Therefore, this approach is not advisable without special precautions.
Malloc Hooks. The GNU glibc implementation provides a special hook mechanism
that allows intercepting all calls to allocation and free functions. This is independent
from compilation or source code access but relies on the underlying system library.
Unlike the previous method, this is suitable for Fortran as well. It is very useful to
guaranty a balanced recording, i.e. all allocation, re-allocation and free operations are
captured. Similar mechanisms are provided by other libc implementations, e.g. on SGI
IRIX.
This approach requires changing internal function pointers in a non-thread-safe way!
It requires explicit locking if used in a multi-threaded environment. Nevertheless, this
is the default way for MPI-only programs in our implementation.
Library Pre-Load Instrumentation. An alternative technique to intercept all allocation functions uses pre-loading of shared libraries. Within the instrumentation library
there are alternative symbols for malloc, realloc, free, etc. which are preferred
over all later symbols with the same name. The original memory management functions
can be accessed explicitly by means of the dynamic loader library. This approach requires support for dynamic linking, which is very common but not available on all platforms (e.g. IBM's BlueGene/L). Unlike the previous approach, it can be implemented
in a thread-safe manner without locking.
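A minimal sketch of this pre-load idea is shown below; it is compiled into a shared object and activated via LD_PRELOAD. The record_counter() call is a hypothetical hand-off to the trace library, and the real VampirTrace wrappers additionally cover realloc and friends, guard against re-entrancy during dlsym, and attach time stamps.

#define _GNU_SOURCE
#include <dlfcn.h>
#include <malloc.h>     /* malloc_usable_size */
#include <stddef.h>

static void *(*real_malloc)(size_t) = NULL;
static void  (*real_free)(void *)   = NULL;
static __thread size_t cur_alloc = 0;        /* bytes currently allocated (per thread) */

static void record_counter(size_t value) { (void)value; /* hypothetical trace hand-off */ }

void *malloc(size_t size)
{
    if (!real_malloc)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    void *p = real_malloc(size);
    if (p)
        cur_alloc += malloc_usable_size(p);  /* actually used size, incl. allocator padding */
    record_counter(cur_alloc);
    return p;
}

void free(void *p)
{
    if (!real_free)
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    if (p)
        cur_alloc -= malloc_usable_size(p);  /* size recovered from the allocator */
    record_counter(cur_alloc);
    real_free(p);
}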
Exhausting Allocation. There is another remarkable strategy to detect the available
amount of heap memory, the so-called memory headroom. It counts the free heap memory and infers the consumed memory (assuming the total amount of memory


is constant) [13]. Free heap memory is estimated by exhaustively allocating as much
memory as possible (with an O(log n) strategy). Presumably, this approach has a big
impact on run-time overhead as well as the memory manager's behavior. Furthermore,
it will fail with optimistic memory managers like the default Linux one.
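For illustration, a hedged sketch of such a probing routine follows; it is not the TAU implementation, and, as noted above, it overestimates the headroom on optimistic (overcommitting) memory managers such as the Linux default.

#include <stdlib.h>

/* Rough estimate of the remaining heap: grab blocks until malloc fails,
   halving the block size on failure, then release everything again. */
static size_t estimate_headroom(void)
{
    enum { MAX_BLOCKS = 1 << 16 };
    static void *blocks[MAX_BLOCKS];
    int n = 0;
    size_t chunk = (size_t)1 << 30;   /* start with 1 GB probes */
    size_t total = 0;

    while (chunk >= 4096 && n < MAX_BLOCKS) {
        void *p = malloc(chunk);
        if (p) { blocks[n++] = p; total += chunk; }
        else   { chunk /= 2; }        /* O(log n) shrinking on failure */
    }
    while (n > 0)                     /* give the memory back immediately */
        free(blocks[--n]);
    return total;
}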
2.4 Measurement Overhead
Overhead from instrumentation can occur in terms of additional memory consumption
and in terms of run-time. There is almost no memory overhead from allocation tracing
with the exception of one counter variable (64 bit integer) per process.
The run-time overhead of malloc hooks is not detectable within the given accuracy
of measurement of 5%. It was tested on a single-processor AMD Athlon64 machine
with Linux OS and with a pathological worst-case test. It issues 20,000,000 allocation
requests of differing sizes without accessing the memory or any real computation. See
Figure 1 for run-time results including wall clock time, system time and user time.

[Figure: y-axis time [s], 0–250; bars for runs without hooks, with hooks, and the averages]

Fig. 1. Run time overhead for instrumentation. The figure compares time without and with instrumentation for ten test runs with a pathological program. It shows total wall clock time (top),
system time (middle) and user time (bottom) as well as the respective average values.

The test covered instrumentation only; no trace records were generated, neither in
memory buffers nor in actual trace files. The total tracing overhead may slightly increase
due to the additional records which makes the records buffer flush slightly more often.

3 Representation of Result Data


There is a common conception of how various trace file formats store trace information.
All employ so-called trace records as atomic pieces of information. Representation
(syntax) and expressiveness (semantics) vary between the existing formats, though [11].
There are two obvious options for how to represent memory allocation information. Firstly,
to extend (a) trace format(s) by a novel record type with specified semantics. Secondly,
to re-use an existing record type which is suitable to transport the information.


3.1 Novel Record Types


One or more new record types could be introduced specifically tailored towards memory allocation information. It might be special flavors of enter and leave record types
enhanced by allocation information or special record types carrying only allocation information, that are to be used in addition to existing enter and leave records. There may
be several sub-types for allocation, re-allocation and de-allocation or a common one.
Both ways share the same major advantage and disadvantage. On the one hand, they
allow expressing more semantics, i.e. the presence of memory allocation information
can reliably be determined by a specific record type. On the other hand, they would
require any trace analysis tools to adapt explicitly to the new record types. We regard
this as a major obstacle for general acceptance by performance analysis tool developers.
3.2 Generic Performance Counters
The memory allocation over time can be mapped to ordinary performance counter records [4, 9]. Performance counter records are designed to represent scalar attributes over discrete time, which is very suitable for memory allocation information. Usually, for every enter and leave event there needs to be a sample with a matching time stamp for every active counter. This results in two samples per function call per counter. Only in this way can the counter values be associated with function calls.
This is unfavorable for memory allocation information, because it changes rather infrequently. Instead, the memory allocation counter is only updated on actual changes. Every update of this counter specifies a constant value for memory consumption from the current time until the next sample (see footnote 2). Thus, the memory consumption is represented as a step function (piecewise constant) with very low overhead in terms of extra records.
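As a sketch (the record layout below is an assumption, not the actual OTF/VampirTrace definition), writing a counter sample only when the value changes is what yields the piecewise-constant representation:

/* Sketch: emit an allocation-counter record only on actual changes. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t timestamp;      /* time of the change                */
    uint64_t value;          /* total allocated bytes from now on */
} counter_record;            /* hypothetical record layout        */

static uint64_t last_value;

void record_allocation_counter(FILE *trace, uint64_t now, uint64_t total)
{
    if (total != last_value) {               /* update only on change */
        counter_record rec = { now, total };
        fwrite(&rec, sizeof rec, 1, trace);  /* stand-in for the trace buffer */
        last_value = total;
    }
}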

4 Application Example
The example that memory allocation tracing was tested with originates from an aerodynamics application. The sequential application consists of two components, one for simulation and one for optimization of the simulation results. The first part is the flow solver TAUij from the German Aerospace Center (DLR), which is a quasi-2D version of TAUijk [2]. It solves the quasi-2D Euler equations around airfoils and computes the aerodynamic coefficients for drag, lift and pitching moment.
The second part uses the ADOL-C software package for Automatic Differentiation (AD). ADOL-C attaches to TAUij by overloading variable types and operators and provides mechanisms to evaluate various kinds of derivatives of the original program's computation. Here, it is used to compute the gradient of the drag coefficient with respect to the input variables controlling the wing geometry. The gradient is then used for optimizing the wing shape, which involves iterated simulation and gradient evaluation. See [5, 7, 12] for more details and background information.
In the example the so-called reverse mode of AD is applied to compute the gradient with respect to a large number of input variables. Basically, this requires storing all intermediate results of the computation in a so-called memory tape, which is traversed in reverse order in the stage of gradient computation.
Footnote 2: This behavior can be announced by setting a corresponding flag in the counter definition [10].


Fig. 2. Vampir process timeline with counters MEM TOT ALLOC and MEM TOT PAGES

There are some optimizations for fixpoint iteration algorithms, but it still causes excessive memory requirements [7].
This is shown in the Vampir screenshot in Figure 2. The test run consumes 15 GB of memory (MEM TOT ALLOC). The exact amount is neither constant nor easily predictable. The analysis revealed that the actual usage is only 5.4 GB (MEM TOT PAGES).
The experiments involved additional PAPI counters, which showed that the memory-intensive sections of the application suffer from high level-three cache miss rates due to the linear traversal of the memory tape.

5 Conclusion and Outlook


The paper discussed various methods for memory allocation tracing for HPC applications. They have been implemented in the VampirTrace measurement system [9] and used with a real-world example application. It was able to provide some insight into the program's run-time behavior that was not obviously visible without this approach.
Future work will focus on alternative ways to query kernel statistics about memory pages assigned to processes. We would like to decrease the run-time overhead. Special kernel modules might be a solution; this is currently being attempted for I/O tracing.
We will apply memory allocation tracing to more HPC programs in the course of performance analysis to learn more about typical memory management behavior. The convenient integration into VampirTrace allows this by adding a mere run-time option.

Acknowledgments
We'd like to thank Carsten Moldenhauer and Andrea Walther from IWR, TU Dresden, as well as Nicolas R. Gauger and Ralf Heinrich from the Institute of Aerodynamics and Flow Technology at DLR Braunschweig for their support with the example application.


References
[1] E.D. Berger, K.S. McKinley, R.D. Blumofe, and P.R. Wilson. Hoard: A Scalable Memory Allocator for Multithreaded Applications. In Proc. of ASPLOS-IX, Cambridge, MA, 2000.
[2] E.D. Berger, B.G. Zorn, and K.S. McKinley. Reconsidering custom memory allocation. In Proc. of OOPSLA'02, New York, NY, USA, 2002. ACM Press.
[3] Armin Biere. ccmalloc. ETH Zurich, 2003. http://www.inf.ethz.ch/personal/projects/ccmalloc/.
[4] S. Browne, J. Dongarra, N. Garner, G. Ho, and P. Mucci. A Portable Programming Interface for Performance Evaluation on Modern Processors. The International Journal of High Performance Computing Applications, 14(3):189-204, 2000.
[5] N. Gauger, A. Walther, C. Moldenhauer, and M. Widhalm. Automatic differentiation of an entire design chain with applications. In Jahresbericht der Arbeitsgemeinschaft Strömungen mit Ablösung (STAB), 2006.
[6] M. Gerndt and T. Li. Automated Analysis of Memory Access Behavior. In Proceedings of HIPS-HPGC 2005 and IPDPS 2005, Denver, Colorado, USA, Apr 2005.
[7] A. Griewank, D. Juedes, and J. Utke. ADOL-C: A package for the automatic differentiation of algorithms written in C/C++. ACM Trans. Math. Softw., 22:131-167, 1996.
[8] G. Juckeland, M. S. Müller, W. E. Nagel, and St. Pflüger. Accessing Data on SGI Altix: An Experience with Reality. In Proc. of WMPI-2006, Austin, TX, USA, Feb 2006.
[9] Matthias Jurenz. VampirTrace Software and Documentation. ZIH, TU Dresden, Nov 2006. http://www.tu-dresden.de/zih/vampirtrace/.
[10] Andreas Knüpfer, Ronny Brendel, Holger Brunst, Hartmut Mix, and Wolfgang E. Nagel. Introducing the Open Trace Format (OTF). In Proc. of ICCS 2006: 6th Intl. Conference on Computational Science, Springer LNCS 3992, pages 526-533, Reading, UK, May 2006.
[11] Bernd Mohr. Standardization of event traces considered harmful: or is an implementation of object-independent event trace monitoring and analysis systems possible? In Environments and Tools for Parallel Scientific Computing, pages 103-124, 1993.
[12] S. Schlenkrich, A. Walther, N.R. Gauger, and R. Heinrich. Differentiating Fixed Point Iterations with ADOL-C: Gradient Calculation for Fluid Dynamics. In Proc. of HPSC 2006.
[13] S. Shende, A. D. Malony, A. Morris, and P. Beckman. Performance and memory evaluation using TAU. In Proc. of the Cray User Group Conference (CUG 2006), 2006.
[14] Owen Taylor. MemProf. http://www.gnome.org/projects/memprof/.
[15] Valgrind.org. Valgrind, 2006. http://valgrind.org/info/about.html.

Automatic Memory Access Analysis with Periscope

Michael Gerndt and Edmond Kereku
Technische Universität München
Fakultät für Informatik I10, Boltzmannstr. 3, 85748 Garching
gerndt@in.tum.de

Abstract. Periscope is a distributed automatic online performance analysis system for large-scale parallel systems. It consists of a set of analysis agents distributed on the parallel machine. This article presents the support in Periscope for analyzing inefficiencies in the memory access behavior of applications. It applies data-structure-specific analysis and is able to identify performance bottlenecks due to remote memory accesses on the Altix 4700 ccNUMA supercomputer.
Keywords: Performance analysis, supercomputers, program tuning, memory access analysis.

1 Introduction

Performance analysis tools help users in writing efficient codes for current high performance machines. Since the architectures of today's supercomputers with thousands of processors expose multiple hierarchical levels to the programmer, program optimization cannot be performed without experimentation.
To tune applications, the user has to carefully balance the number of MPI processes vs. the number of threads in a hybrid programming style, has to distribute the data appropriately among the memories of the processors, has to optimize remote data accesses via message aggregation, prefetching, and asynchronous communication, and, finally, has to tune the performance of a single processor.
Performance analysis tools can provide the user with measurements of the program's performance and thus can help in finding the right transformations for performance improvement. Since measuring performance data and storing those data for further analysis is not a very scalable approach in most tools, most tools are limited to experiments on a small number of processors. To investigate the performance of large experiments, performance analysis has to be done online in a distributed fashion, eliminating the need to transport huge amounts of performance data through the parallel machine's network and to store those data in files for further analysis.
Periscope [5] is such a distributed online performance analysis tool. It consists of a set of autonomous agents that search for performance bottlenecks in a

subset of the application's processes and threads. The agents request measurements from the monitoring system, retrieve the data, and use the data to identify performance bottlenecks. The types of bottlenecks searched for are formally defined in the APART Specification Language (ASL) [1,2].
The focus of this paper is on Periscope's support for analyzing the memory access behavior of programs. Periscope searches not only for bottlenecks related to MPI and OpenMP, but also for inefficiencies in the memory accesses. Novel features are the support for identifying data-structure-related bottlenecks and remote memory access bottlenecks in ccNUMA systems.
The next section presents work related to the automatic performance analysis approach in Periscope. Section 3 presents Periscope's architecture and its special features for memory access analysis. Section 5 presents several case studies. Section 6 gives a short summary.

2 Related Work

Several projects in the performance tools community are concerned with the automation of the performance analysis process. Paradyn's [9] Performance Consultant automatically searches for performance bottlenecks in a running application by using a dynamic instrumentation approach. Based on hypotheses about potential performance problems, measurement probes are inserted into the running program. Recently MRNet [10] has been developed for the efficient collection of distributed performance data.
The Expert [12] tool developed at Forschungszentrum Jülich performs an automated post-mortem search for patterns of inefficient program execution in event traces. Potential problems with this approach are large data sets and long analysis times for long-running applications, which hinder the application of this approach on larger parallel machines.
Aksum [3], developed at the University of Vienna, is based on source code instrumentation to capture profile-based performance data, which is stored in a relational database. The data is then analyzed by a tool implemented in Java that performs an automatic search for performance problems based on JavaPSL, a Java version of ASL.

3 Architecture

Periscope consists of a graphical user interface based on Eclipse, a hierarchy of analysis agents and two separate monitoring systems (Figure 1).
The graphical user interface allows the user to start up the analysis process and to inspect the results. The agent hierarchy performs the actual analysis. The node agents autonomously search for performance problems which have been specified with ASL. Typically, a node agent is started on each SMP node of the target machine. This node agent is responsible for the processes and threads on that node. Detected performance problems are reported to the master agent that communicates with the performance cockpit.


Fig. 1. Periscope currently consists of a GUI based on Eclipse, a hierarchy of analysis agents, and two separate monitoring systems

The node agents access a performance monitoring system for obtaining the performance data required for the analysis. Periscope currently supports two different monitors: the Peridot monitor [4], developed in the Peridot project, focusing on OpenMP and MPI performance data, and the EP-Cache monitor [8], developed in the EP-Cache project, focusing on memory hierarchy information.
The node agents perform a sequence of experiments. Each experiment lasts for a program phase, which is defined by the programmer, or for a predefined amount of execution time. Before a new experiment starts, an agent determines a new set of hypothetical performance problems based on the predefined ASL properties and the already found problems. It then requests the necessary performance data for proving the hypotheses and starts the experiment. After the experiment, the hypotheses are evaluated based on the performance data obtained from the monitor.

4 Monitoring Memory Accesses

The analysis of the application's memory access behavior is based on the EP-Cache monitor. It allows access to the hardware counters of the machine for evaluating properties such as the example for L1-cache misses in Figure 2.
The example demonstrates the specification of performance properties with ASL. The performance property shown here identifies a region with a high L1-cache miss rate. The data model, also specified in ASL, contains a class SeqPerf which contains a reference to a program region, a reference to a process, and a number of cache-related and other metrics. The instance of SeqPerf available in the property is called the property's context. It defines the program region, e.g., a function, and the process for which the property is tested.
A novel feature of the EP-Cache monitor is its data structure support. The node agent can request measurements for specific data structures via the



PROPERTY LC1ReadMissesOverMemRef(SeqPerf s){
CONDITION:
s.lc1_data_read_miss/s.read_access > 0.01;
CONFIDENCE:
1;
SEVERITY:
s.lc1_data_read_miss/s.read_access; }
Fig. 2. This performance property identifies a significant L1-cache miss rate
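Outside ASL, evaluating this property is a simple ratio test; the following C sketch mirrors the SeqPerf fields used in Fig. 2 (everything else, including the function name, is an illustrative assumption):

/* Sketch: evaluation of LC1ReadMissesOverMemRef for one region/process pair. */
typedef struct {
    double lc1_data_read_miss;   /* L1 data read misses in the region */
    double read_access;          /* read accesses in the region       */
} SeqPerf;

/* Returns the severity (the miss rate) if the CONDITION of Fig. 2 holds,
 * and 0.0 otherwise; the CONFIDENCE is always 1. */
double lc1_read_misses_over_mem_ref(const SeqPerf *s)
{
    double rate = s->lc1_data_read_miss / s->read_access;
    return (rate > 0.01) ? rate : 0.0;
}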

Measurement Request Interface (MRI) [6]. The node agent specifies a request via the MRI Request function. Its parameters determine, for example, that L1-cache misses are to be measured for array ACOORD in a specific parallel loop in subroutine FOO. The request would be generated for a new experiment after the property LC1MissRate was proven in the previous experiment.
To enable the node agent to specify such detailed requests, it requires static information about the program, e.g., it has to know that array ACOORD is accessed in that loop. This program information is generated by the source-to-source instrumenter developed for Periscope. It generates the information in an XML-based format called the Standard Intermediate Program Representation (SIR) [11]. This information is read by the agents when the analysis is started.
At runtime, the MRI request for measuring cache misses for a data structure is translated, based on the debug information in the executable, into the associated range of virtual addresses. The measurement is only possible if the hardware counters of the target processor support such restricted counting. Our main target architecture is the Altix 4700, which was installed at the Leibniz Computing Centre in Munich. It consists of 4096 Itanium 2 processors, which have a very extensive set of hardware counters. On Itanium 2, such measurements can be restricted to specific address ranges and thus can be executed without program intrusion.
Another feature of the Itanium 2's hardware counters exploited by Periscope is its support for counting memory accesses that last more than a given number of clock cycles. One of these events is DATA_EAR_CACHE_LAT8, for example. This event returns the number of memory operations with a memory access latency of more than eight cycles. Similar events return the number of operations with a memory access latency greater than LAT4, LAT16, ..., up to LAT4096, increasing by powers of 2.
The counters can be used to identify those memory references that go to local memory on the Altix 4700 processor or to remote memory. Since the Altix is a ccNUMA system, all the processors can access all the memory but, for efficiency reasons, most of the accesses should go to the processor's local memory. Thus it is very important to identify non-local access behavior. Edmond Kereku specified appropriate ASL performance properties for such situations in his dissertation [7].
Based on elementary properties such as the LC1Miss property and similar properties for individual data structures and for remote memory accesses,


higher-level properties can be deduced. For example, the Property UnbalancedDMissRateInThreads shown in Figure 3 gives information on the behavior across
multiple threads.
PROPERTY TEMPLATE UnbalancedDMissRateInThreads
<float MissRateParFunc(MRIParPerf, int)>
(MRIParPerf mpp){
LET
const float threshold;
float mean = mean_func_t(MissRateParFunc, mpp);
float max = max_func_t(MissRateParFunc, mpp);
float min = min_func_t(MissRateParFunc, mpp);
float dev_to_max = max - mean;
float dev_to_min = mean - min;
float max_exec_time=MAX(mpp.parT[tid] WHERE tid IN mpp.nThreads);
IN
CONDITION : MAX(dev_to_max, dev_to_min) / mean > threshold;
CONFIDENCE : 1;
SEVERITY
: MAX(dev_to_max, dev_to_min) / mean * max_exec_time;
}
PROPERTY UnbalancedDMissRateInThreads<LC1DReadMissRateParFunc>
UnbalancedLC1DReadMissRatePar;
PROPERTY UnbalancedDMissRateInThreads<LC2DReadMissRateParFunc>
UnbalancedLC2DReadMissRatePar;
PROPERTY UnbalancedDMissRateInThreads<LC3DReadMissRateParFunc>
UnbalancedLC3DReadMissRatePar;
Fig. 3. Performance properties with a similar specification can be expressed via property templates. This template specifies an unbalanced miss rate across threads in a parallel OpenMP program.

The template UnbalancedDMissRateInThreads is used to create multiple properties for the different cache levels. The property template is parameterized by a function, which is replaced in the specifications of the individual properties with a function returning the appropriate miss rate, i.e., for the L1, L2, and L3 caches, respectively.
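The condition and severity of this template reduce to a deviation-from-mean test over the per-thread miss rates; a C sketch under the stated assumptions (arrays of per-thread miss rates and parallel execution times):

/* Sketch: check of UnbalancedDMissRateInThreads for one cache level. */
double unbalanced_miss_rate_severity(const double *miss_rate,
                                     const double *par_time,
                                     int n_threads, double threshold)
{
    double sum = 0.0, max = miss_rate[0], min = miss_rate[0];
    double max_exec_time = par_time[0];
    for (int t = 0; t < n_threads; t++) {
        sum += miss_rate[t];
        if (miss_rate[t] > max) max = miss_rate[t];
        if (miss_rate[t] < min) min = miss_rate[t];
        if (par_time[t] > max_exec_time) max_exec_time = par_time[t];
    }
    double mean = sum / n_threads;
    double dev  = (max - mean > mean - min) ? (max - mean) : (mean - min);

    if (dev / mean > threshold)              /* CONDITION of the template */
        return dev / mean * max_exec_time;   /* SEVERITY                  */
    return 0.0;                              /* property does not hold    */
}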

5 Application Experiments

We used Periscope to analyze applications on the Itanium SMP nodes of the InfiniBand cluster at LRR/TUM. Table 1 shows the list of properties which we searched for during our experiments, the threshold set in the condition of the properties, and the number of regions on which the property holds for each of the test applications.


Table 1. The list of properties searched on the test applications running on the InfiniBand cluster

Property            Threshold   LU   FT   SWIM   PGAUSS
LC2DMissRate            5%        0   14      0        5
LC2DReadMissRate        5%        0    3      0        5
LC3DMissRate            2%       13    3      7        0
LC3DReadMissRate        2%        1    3      2        0

We did not count the data structures, only the code regions. The properties hold for a region if the cache miss rate is higher than 5% for LC2 and higher than 2% for LC3. Looking at the table, we can conclude that FT and PGAUSS have more L2 cache problems, while LU and SWIM have more L3 cache problems.
We present here the results of Periscope's automatic search for the LU decomposition application in the form of search paths. The paths start from the region where the search began and go down to the deepest subregion or data structure for which the property holds. We omit the severities associated with each evaluated property; instead, we provide the measured miss rates along each search path's regions and data structures. The search path which generated the property with the highest severity is marked. Please note that in some of the tests the region with the highest miss rate does not necessarily have the property with the highest severity. The severity also depends on the region's execution time, which for presentation reasons is not shown here.
The results of the automatic search for LU are the following search paths; for each path the measured LC3 miss rates and, where the read-miss property holds, the LC3 read miss rates are given.

Search path 1 (LC3 miss rates 0.038, 0.029, 0.029; LC3 read miss rate 0.029):
  Application Phase (USER REGION, ssor.f, 109)
  (PARALLEL REGION, ssor.f, 120)
  (WORKSHARE DO, ssor.f, 126)
  rsd (DATA STRUCTURE, ssor.f, 4)

Search path 2 (LC3 miss rates 0.038, 0.037, 0.030, 0.030, 0.040):
  Application Phase (USER REGION, ssor.f, 109)
  (PARALLEL REGION, ssor.f, 120)
  (LOOP REGION, ssor.f, 149)
  jacld (CALL REGION, ssor.f, 156)
  jacld (SUB REGION, jacld.f, 5)
  (WORKSHARE DO, jacld.f, 39)
  u (DATA STRUCTURE, jacld.f, 5)

Search path 3 (LC3 miss rates 0.038, 0.037, 0.052, 0.055, 0.025):
  Application Phase (USER REGION, ssor.f, 109)
  (PARALLEL REGION, ssor.f, 120)
  (LOOP REGION, ssor.f, 149)
  blts (CALL REGION, ssor.f, 165)
  blts (SUB REGION, blts.f, 4)
  (WORKSHARE DO, blts.f, 75)
  v (DATA STRUCTURE, blts.f, 4)

Search path 4, highest severity (LC3 miss rates 0.038, 0.038, 0.051, 0.054, 0.025):
  Application Phase (USER REGION, ssor.f, 109)
  (PARALLEL REGION, ssor.f, 120)
  (LOOP REGION, ssor.f, 184)
  buts (CALL REGION, ssor.f, 200)
  buts (SUB REGION, buts.f, 4)
  (WORKSHARE DO, buts.f, 75)
  v (DATA STRUCTURE, buts.f, 4)

Search path 5 (LC3 miss rates 0.038, 0.039, 0.031, 0.062; LC3 read miss rates 0.031, 0.062):
  Application Phase (USER REGION, ssor.f, 109)
  (PARALLEL REGION, ssor.f, 120)
  (WORKSHARE DO, ssor.f, 221)
  u (DATA STRUCTURE, ssor.f, 4)
  rsd (DATA STRUCTURE, ssor.f, 4)

The search for L2 cache problems did not detect any problem but, as the results show, LU has L3 cache problems. In addition to LC3DMissRate, our search refined to the property LC3DReadMissRate too. As the search paths show, LC3DReadMissRate does not hold on the majority of the regions where LC3DMissRate was proven. This means that most of the L3 cache problems in LU are write-related problems.
The search path that discovered the most severe problem refined from the application phase to subroutine buts. The data structure v is the source of the problem. The most important data structures of LU are u and rsd. In fact, the variable v is the local name for rsd, which is passed to subroutine buts as a parameter.

6 Summary

This paper presented the automatic memory access analysis support in Periscope. Periscope's analysis automatically runs several experiments during a single program execution to incrementally search for performance bottlenecks. If no repetitive program phases are marked by the user, the application can even be restarted automatically to perform additional experiments.
Periscope uses this approach to search for data-structure-related memory access bottlenecks as well as for remote memory access bottlenecks in ccNUMA architectures. Due to the limited number of performance counters of current processors, multiple experiments are required to evaluate all the performance hypotheses for critical program regions.
The overall overhead of the analysis depends on the frequency at which measurements are taken. In our tests, the regions to be analyzed were outer loops or parallel regions with significant runtime, so that accesses to the performance counters did not introduce significant overhead.


The strategies applied in refining towards more detailed performance bottlenecks can be found in [7] and will be published elsewhere.

Acknowledgments. This work is funded by the Deutsche Forschungsgemeinschaft under Contract No. GE 1635/1-1.

References
1. T. Fahringer, M. Gerndt, G. Riley, and J. Träff. Knowledge specification for automatic performance analysis. APART Technical Report, www.fz-juelich.de/apart, 2001.
2. T. Fahringer, M. Gerndt, G. Riley, and J.L. Träff. Specification of performance problems in MPI programs with ASL. International Conference on Parallel Processing (ICPP'00), pp. 51-58, 2000.
3. T. Fahringer and C. Seragiotto. Aksum: A performance analysis tool for parallel and distributed applications. Performance Analysis and Grid Computing, Eds. V. Getov, M. Gerndt, A. Hoisie, A. Malony, B. Miller, Kluwer Academic Publishers, ISBN 1-4020-7693-2, pp. 189-210, 2003.
4. K. Fürlinger and M. Gerndt. Peridot: Towards automated runtime detection of performance bottlenecks. High Performance Computing in Science and Engineering, Garching 2004, pp. 193-202, Springer, 2005.
5. M. Gerndt, K. Fürlinger, and E. Kereku. Advanced techniques for performance analysis. Parallel Computing: Current & Future Issues of High-End Computing (Proceedings of the International Conference ParCo 2005), Eds: G.R. Joubert, W.E. Nagel, F.J. Peters, O. Plata, P. Tirado, E. Zapata, NIC Series Volume 33, ISBN 3-00-017352-8, pp. 15-26, 2006.
6. M. Gerndt and E. Kereku. Monitoring Request Interface version 1.0. TUM Technical Report, 2003.
7. E. Kereku. Automatic Performance Analysis for Memory Hierarchies and Threaded Applications on SMP Systems. PhD thesis, Technische Universität München, 2006.
8. E. Kereku and M. Gerndt. The EP-Cache automatic monitoring system. International Conference on Parallel and Distributed Systems (PDCS 2005), 2005.
9. B.P. Miller, M.D. Callaghan, J.M. Cargille, J.K. Hollingsworth, R.B. Irvin, K.L. Karavanic, K. Kunchithapadam, and T. Newhall. The Paradyn parallel performance measurement tool. IEEE Computer, Vol. 28, No. 11, pp. 37-46, 1995.
10. Ph. C. Roth, D. C. Arnold, and B. P. Miller. MRNet: A software-based multicast/reduction network for scalable tools. SC2003, Phoenix, November 2003.
11. C. Seragiotto, H. Truong, T. Fahringer, B. Mohr, M. Gerndt, and T. Li. Standardized Intermediate Representation for Fortran, Java, C and C++ programs. APART Working Group Technical Report, Institute for Software Science, University of Vienna, October 2004.
12. F. Wolf and B. Mohr. Automatic performance analysis of hybrid MPI/OpenMP applications. 11th Euromicro Conference on Parallel, Distributed and Network-Based Processing, pp. 13-22, 2003.

A Regressive Problem Solver That Uses Knowledgelet


Kuodi Jian
Department of Information and Computer Science
Metropolitan State University
Saint Paul, Minnesota 55106-5000
Kuodi.jian@metrostate.edu

Abstract. This paper presents a new idea of how to reduce search space by a
general problem solver. The general problem solver, Regressive Total Order
Planner with Knowledgelet (RTOPKLT), can be used in intelligent systems or
in software agent architectures. Problem solving involves finding a sequence of
available actions (operators) that can transfer an initial state to a goal state.
Here, a problem is defined in terms of a set of logical assertions that define an initial state and a goal state. With too little information, reasoning and learning systems cannot perform well in most cases. On the other hand, too much information can cause the performance of a problem solver to degrade, in both accuracy and efficiency. Therefore, it is important to determine what information is relevant and to organize this information in an easy-to-retrieve manner.
Keywords: Software Agent, Artificial Intelligence, Planning.

1 Introduction
In recent years, as computing power has increased steadily, computers have been used in a wide variety of areas that need some intelligence. All of these intelligent systems need a problem solver. The problems that require a problem solver include: (1) generating a plan of action for a robot, (2) interpreting an utterance by reasoning about the goals a speaker is trying to achieve, (3) allocating the use of resources, and (4) automatically writing a program to solve a problem. The heart of a problem solver is a planner. There are many kinds of planners [1]. In this paper, we present a regressive planner that uses knowledgelets. A planner is complete if it is able to find every solution, given that solutions exist and that the planner runs to completion. There is a trade-off between response time and completeness. We can improve the response time of a planner by using some criteria to prune the search space, but that will cause the loss of some solutions to the problem. On the other hand, if a planner searches every possible space, then the search space grows exponentially with the number of operators, and the response time will deteriorate to the point of being intolerable for many problems of reasonable size. This paper offers a solution that gives us both quick response time and completeness of solutions. The planner presented in this paper has the following characteristics:

- It has fast response time, since the planner searches only relevant operators;
- It can use an existing relational database to store pieces of knowledge, called knowledgelets;
- It is built on the proven regressive idea and is conceptually simple and complete;
- It is general, which means that it can be used in different areas with little or no changes.

The rest of the paper is organized as follows: Section 2 gives the description of the knowledgelet; Section 3 presents the regressive general problem solver, RTOPKLT, and its application to a planning problem; and Section 4 is the summary.

2 Knowledgelet
A knowledgelet is an object that describes the relevant knowledge for solving a particular problem. By using knowledgelets, the search time of a problem solver can be greatly reduced. A knowledgelet consists of slots of data structures and can be stored in most existing databases (object databases will be the best fit).

2.1 Structure of Knowledgelet

A knowledgelet constrains the world states by giving only the subsets of operators (actions) that are relevant to the problem at hand. The search space can be an open world space or a Local Closed World (LCW) space [2]. In an LCW space, we assume that we know everything; if something is not positively stated, we conclude that it is false.

Slot1: Name of the knowledgelet
Slot2: Domain name : context description
Slot3:
    Goal1: statement
        Initial condition1: partial solution.
        Initial condition2: partial solution.
        ...
    Goal2: statement
        Initial condition1: partial solution.
        Initial condition2: partial solution.
        ...
    ...
Slot4:
    Language building blocks: LCW or open world
    Operator set

Fig. 1. The structure of a knowledgelet


The benefit of using LCW is that we need not record all of the world states. Figure 1 is the diagram of a knowledgelet.
From the diagram, we see that a knowledgelet has a name. This name can be used by a database as a key to store it. A knowledgelet is like a blob of knowledge with a domain name (which can also be used as a key) and a context that describes the domain name. We all know that the same words carry different meanings in different contexts; the domain name and its context resolve this ambiguity.
In slot3, there are goals that can be searched. Under each goal, there is a set of partial solutions to the goal. Each partial solution is associated with an initial state. This means that the same goal can be achieved from different starting points. For example, in a block world, the goal of block C on top of block B, which is in turn on top of block A, can be achieved either from the initial state (on A B), (on C Table) or from the initial state (on A Table), (on B Table), (on C Table). A partial solution is a partial plan (containing an ordered sequence of operators) that will change the world from an initial state to the goal state. To the regressive planner, this partial plan can be viewed as one big operator [3].
In slot4, there is a field that contains the language type and a field that contains the available operator set. There are two language types: the open world assumption and the Local Closed World (LCW) assumption. The LCW language building blocks form an LCW lingo in that particular domain; the open world language building blocks form an open world lingo in that particular domain. The language building blocks include sentences, axioms, rules, and constraints. One example of the constraints is domain rules. Domain rules are the guiding principles that operate in a particular domain. They may specify a particular sequence of operators (for example, that one operator must precede another, etc.). If there is no existing partial plan in slot3 that matches a goal, the planner will construct a plan from the available operators in slot4 with the help of the language lingo in the language type field.
The search starts from slot1 and proceeds through slot2, slot3, and slot4. By searching for a partial solution first, we improve the response time, since we do not have to construct everything from scratch.
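For illustration only, the slot structure of Figure 1 could be rendered as a data structure roughly as follows; all type and field names are assumptions made for this sketch, since the paper does not give a concrete layout:

/* Sketch of the knowledgelet slots of Fig. 1; names and types are
 * illustrative assumptions, not taken from the paper. */
typedef struct {
    char *initial_condition;    /* an initial state description             */
    char *partial_solution;     /* ordered operator sequence for that state */
} solution_entry;

typedef struct {
    char           *statement;  /* Slot3: a goal that can be searched       */
    solution_entry *solutions;  /* partial solutions, one per initial state */
    int             n_solutions;
} goal_entry;

typedef struct {
    char       *name;           /* Slot1: knowledgelet name (database key)  */
    char       *domain;         /* Slot2: domain name                       */
    char       *context;        /* Slot2: context description               */
    goal_entry *goals;          /* Slot3: goals with partial solutions      */
    int         n_goals;
    int         lcw;            /* Slot4: 1 = LCW assumption, 0 = open world*/
    char      **operators;      /* Slot4: available operator set            */
    int         n_operators;
} knowledgelet;

A record like this maps directly onto a relational or object database, with the name and domain fields serving as keys, as described above.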

3 Regressive General Total Order Planner, RTOP


In artificial intelligence, problem solving involves finding a sequence of actions (operators) that results in a goal state. Many real-world problems can be cast into planning problems in terms of an initial state, a set of goal state conditions, and a set of relevant legal operators; each legal operator is defined in terms of preconditions and effects. The preconditions of an operator must be satisfied before the operator can be applied. All the reachable states comprise the world space. A world state in a world space can be represented by state variables and can only be changed by legal operators.
There are basically two ways to search through a world space. One is called progression and the other is called regression. A progressive planner searches


forward from the initial world state until it finds the goal state. A regressive planner searches backward from the goal state until it finds the initial state. As the search progresses, both the forward search algorithm and the backward search algorithm need to choose non-deterministically the next operator to consider. The planner discussed in this paper belongs to the regressive type. Since the answer returned by our planner is a totally ordered (represented by "<<") sequence of operators, our planner is called the Regressive Total Order Planner with Knowledgelet (RTOPKLT).
In this planner, a returned plan is represented by a two-tuple <A, TOLO>. TOLO is the total order of the operators selected by the planner. Internally, a plan is represented by a three-tuple <A, O, L>. A is a set of operators drawn from the available operator pool Λ, O is a set of partial-ordering (represented by "<") constraints over A, and L is a set of causal links. For example, if A = {A1, A2, A3} then O might be the set {A1 < A3, A2 < A3}. A good way of ensuring that the different operators introduced for different goals won't interfere with each other is to record the dependencies among operators explicitly within a partial plan. To record these dependencies, we use a data structure called a causal link [4]. A causal link is a data structure with three fields: the first two fields contain pointers to plan operators (the link's producer, Ap, and its consumer, Ac); the third field is a proposition or conjunct of propositions, Q, which is both an effect of Ap and a precondition of Ac. We can write such a causal link as Ap -(Q)-> Ac and store a plan's links in the set L. Figure 2 is the algorithm for the regressive planner.
==========================================================
Algorithm: RTOPKLT
<A, TOLO> RTOPKLT_Algorithm (InitialState, Goal, KnowledgeletName, DomainName)
(a) Use either KnowledgeletName or DomainName to search the existing knowledgelets.
    If there is an applicable knowledgelet in the system and a partial plan exists
    for the given InitialState and Goal
        Return the plan directly
    Else if there is an applicable knowledgelet in the system and a partial plan does
    not exist for the given InitialState and Goal
        1. Construct A0, A∞, the initial A, O, L, and agenda, and get Λ from the
           knowledgelet. Every added operator Ai satisfies the following ordering:
           A0 < Ai < A∞
        2. Go to step (b), passing it a null plan, the initial agenda, and the operator pool Λ.
    Else if there is no applicable knowledgelet in the system
        Return no answer.
(b) Call subroutine RTOP(<A, O, L>, agenda, Λ).
(c) Total order: TOLO = total ordering of O.
(d) Termination: return <A, TOLO>.
RTOP(<A, O, L>, agenda, Λ)


(1) Termination: if the agenda is empty, return <A, TOLO>.
(2) Goal selection: let <Q, Aneed> be a pair on the agenda (by definition Aneed ∈ A and Q is a precondition of Aneed).
(3) Operator selection: Let Aadd be the action of choosing an operator that has Q as one of its postconditions (an operator can be chosen from Λ, or an operator already in A that can be consistently ordered prior to Aneed). The choice of an operator can be implemented in such a way that the chosen operator achieves the maximum number of goals in the agenda.
    Let L = L ∪ {Aadd -(Q)-> Aneed}, and let O = O ∪ {Aadd < Aneed}.
    If Aadd is newly instantiated, then A = A ∪ {Aadd} (otherwise A is unchanged).
    If no such operator exists, then return failure. For each condition, keep all other choices as backtrack points.
(4) Update goal set: Let agenda = agenda - {<Q, Aneed>}. If Aadd is newly instantiated, then for each conjunct, Qi, of its precondition add <Qi, Aadd> to the agenda.
(5) Causal link protection: For every operator Aadd that might threaten a causal link Ap -(S)-> Ac ∈ L, choose a consistent ordering constraint, either
        Demotion: add Aadd < Ap to O, or
        Promotion: add Ac < Aadd to O.
    If neither constraint is consistent, then return failure.
(6) Recursive invocation: RTOP(<A, O, L>, agenda, Λ).

==========================================================
Fig. 2. Regressive generic total order planner, RTOPKLT

In the above algorithm, there are four parameters. The InitialState is the starting place where the planner begins to work. The Goal is the final state. The parameters KnowledgeletName and DomainName are used by the planner to search the existing knowledgelets. RTOPKLT starts with a search for a knowledgelet based on the knowledgelet name or the domain name. Based on the search result, it either returns a plan immediately or invokes the subroutine RTOP. The return type is a two-tuple data structure: A is the set of operators included in the executable plan, and TOLO is the total order constraint on A. The total ordering constraint TOLO is guaranteed to be an action sequence that solves the planning problem [5].
The first parameter to subroutine RTOP is a three-tuple data structure: A is the set of operators chosen for the partial plan, O is the set of ordering constraints, and L is the set of causal link constraints. The second parameter is an agenda of goals that need to be supported by links. Each item on the agenda is represented as a pair <Q, Ai>, where Q is a precondition or a conjunct of the precondition of Ai. The third parameter is Λ, which is the pool of operators selected by a knowledgelet. To achieve simplicity, we introduce two dummy operators: A0 and A∞. A0, the start operator, has no precondition and has the initial state as its effect; A∞, the end operator, has no effect and has the goal as its precondition. Subroutine RTOP always starts with a null plan: A = {A0, A∞}, ordering constraint O = {A0 < A∞}, and causal link set L = {}. Then,


RTOP makes non-deterministic choices of operators until all the conjuncts of every operator's precondition have been supported by causal links and all the threatened links have been protected from possible interferences. In the rest of the paper, we use "~" to mean that a condition is not true.
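A minimal sketch of the internal plan representation <A, O, L> described above (illustrative only; all names are assumptions, and operators are referred to by index, with index 0 standing for A0):

/* Sketch of the <A, O, L> partial plan with causal links. */
#define MAX_ITEMS 64

typedef struct {
    int producer;               /* index of Ap in the operator array A       */
    int consumer;               /* index of Ac                               */
    const char *proposition;    /* Q: effect of Ap and precondition of Ac    */
} causal_link;                  /* written in the text as Ap -(Q)-> Ac       */

typedef struct {
    int before, after;          /* ordering constraint: A[before] < A[after] */
} ordering;

typedef struct {
    const char *operators[MAX_ITEMS];  /* A: chosen operators                */
    int         n_operators;
    ordering    order[MAX_ITEMS];      /* O: partial-ordering constraints    */
    int         n_order;
    causal_link links[MAX_ITEMS];      /* L: causal links                    */
    int         n_links;
} partial_plan;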
3.1 Example of Applying the RTOPKLT
Let's use an example to illustrate how the RTOPKLT planner works. We use the following simple but illustrative planning problem. The following is the problem statement:
Example 1 (block world)
This problem involves a robot operating in a closed room. There are three blocks of equal size, labeled A, B, and C. There is a table that supports blocks. The robot may be commanded to move blocks one at a time, placing them on the table or on top of other blocks. Blocks may only be stacked one atop the other, and the robot can easily reach all of the blocks. Suppose that the exact positions of the blocks on the table are not important and that there is enough room for all the blocks to be resting on the table at the same time. The initial state is: A and B are on the table; C is on top of A. The goal state is: A is on top of B; B is on top of C; and C is on the table.

[Figure 3.a: the null plan, with the Start operator (effects: (on c a), (clear b), (clear c), (on a table), (on b table)) linked to the End operator (preconditions: (on a b), (on b c), (clear a), (on c table)). Figure 3.b: block diagrams of the initial state (C on A; A and B on the table) and the goal state (A on B, B on C, C on the table).]

Fig. 3. The null plan, initial state, and goal state for the block world example

Figure 3.b is the visual representation of the initial state and the goal state. RTOPKLT starts by constructing a null plan, as shown in Figure 3.a.
3.2 Solution to the Problem When There Is a Partial Plan
If a solution exists in the applicable knowledgelet, then RTOPKLT will return it immediately. Step (a) will return a plan as a two-tuple <A, TOLO>. One possible answer is: the operator set A contains {move-C-from-A-to-Table, move-B-from-Table-to-C, move-A-from-Table-to-B}, and TOLO contains {move-C-from-A-to-Table << move-B-from-Table-to-C << move-A-from-Table-to-B}.


3.3 Solution to the Problem When There Is No Partial Plan


If no solution exists in the applicable knowledgelet, then RTOPKLT will build a plan from scratch by calling the subroutine RTOP. When making the initial call, RTOPKLT passes in a null plan (shown in iteration 1 of Table 1) as parameters. The parameters are <A, O, L>, agenda, and Λ. The values of each variable in a null plan are the following: A = {A0, A∞}, O = {A0 < A∞}, L = {}, agenda = {<(on A B), A∞>, <(on B C), A∞>, <(on C Table), A∞>}, and the operator pool Λ contains the relevant set shown in the first part of Table 1. The rest of Table 1 shows the computation results for each iteration of applying RTOPKLT to Example 1.
Table 1. The values of variables for each iteration

Operator pool Λ:
  A0:  <pre: none> <post: (on C A), (clear B), (clear C), (on A Table), (on B Table)>
  A∞:  <pre: (clear A), (on A B), (on B C), (on C Table)> <post: none>
  A1:  <pre: (clear C), (on C A)> move-C-from-A-to-Table <post: ~(on C A), (clear A), (on C Table)>
  A2:  <pre: (clear B), (clear C), (on B Table)> move-B-from-Table-to-C <post: (on B C), ~(clear C), ~(on B Table)>
  A3:  <pre: (clear A), (clear B), (on A Table)> move-A-from-Table-to-B <post: (on A B), ~(clear B), ~(on A Table)>
  A4:  <pre: (clear A), (on A C)> move-A-from-C-to-Table <post: (on A Table), (clear C), ~(on A C)>
  A5:  <pre: (clear C), (clear B), (on C Table)> move-C-from-Table-to-B <post: (on C B), ~(clear B), ~(on C Table)>
  A6:  <pre: (clear B), (clear A), (on B Table)> move-B-from-Table-to-A <post: (on B A), ~(clear A), ~(on B Table)>
  Note: for every operator Ai, we have A0 < Ai < A∞.

Iteration 1:
  A      = {A0, A∞}
  O      = {A0 < A∞}
  L      = none
  Agenda = <(clear A), A∞>, <(on A B), A∞>, <(on B C), A∞>, <(on C Table), A∞>

Iteration 2:
  A      = {A0, A∞, A2}
  O      = {A0 < A∞, A0 < A2, A2 < A∞}
  L      = A2 -(on B C)-> A∞, A0 -(clear B)-> A2, A0 -(clear C)-> A2, A0 -(on B Table)-> A2
  Agenda = <(clear A), A∞>, <(on A B), A∞>, <(on C Table), A∞>

Iteration 3:
  A      = {A0, A∞, A2, A3}
  O      = {A0 < A∞, A0 < A2, A2 < A∞, A0 < A3, A3 < A∞, A2 < A3}
  L      = A2 -(on B C)-> A∞, A0 -(clear B)-> A2, A0 -(clear C)-> A2, A0 -(on B Table)-> A2, A3 -(on A B)-> A∞
  Agenda = <(clear A), A∞>, <(on C Table), A∞>, <(clear A), A3>, <(clear B), A3>, <(on A Table), A3>

Iteration 4:
  A      = {A0, A∞, A2, A3}
  O      = {A0 < A∞, A0 < A2, A2 < A∞, A0 < A3, A3 < A∞, A2 < A3}
  L      = A2 -(on B C)-> A∞, A0 -(clear B)-> A2, A0 -(clear C)-> A2, A0 -(on B Table)-> A2, A3 -(on A B)-> A∞, A0 -(clear B)-> A3, A0 -(on A Table)-> A3
  Agenda = <(clear A), A∞>, <(on C Table), A∞>, <(clear A), A3>

Iteration 5:
  A      = {A0, A∞, A2, A3, A1}
  O      = {A0 < A∞, A0 < A2, A2 < A∞, A0 < A3, A3 < A∞, A2 < A3, A0 < A1, A1 < A∞, A1 < A2}
  L      = A2 -(on B C)-> A∞, A0 -(clear B)-> A2, A0 -(clear C)-> A2, A0 -(on B Table)-> A2, A3 -(on A B)-> A∞, A0 -(clear B)-> A3, A0 -(on A Table)-> A3, A0 -(clear C)-> A1, A0 -(on C A)-> A1, A1 -(clear A)-> A∞, A1 -(clear A)-> A3, A1 -(on C Table)-> A∞

Iteration 6:
  A    = {A0, A∞, A2, A3, A1}
  TOLO = A0 << A1 << A2 << A3 << A∞

Initially, the planner contains only the null plan, as shown in iteration 1 of the above table. At this point, the agenda is: <(clear A), A∞>, <(on A B), A∞>, <(on B C), A∞>, <(on C Table), A∞>. There are four choices for the immediate goal. As far as completeness is concerned, it does not matter which one is chosen first. If the first condition taken from the agenda is (on B C) and the operator chosen to support this condition, Aadd, is move-B-from-Table-to-C, then all the values of the parameters are updated to the values shown in iteration 2. Aadd has three preconditions; they can either be supported by conditions in the initial state or be put into the agenda. In this instance, all the preconditions are supported by the initial condition. We record this fact by adding three causal links to L.
For iteration 3, suppose the planner selects the goal (on A B) from the agenda and decides to instantiate a new operator A3, move-A-from-Table-to-B, as Aadd. If operator A3 is put before operator A2, we will get a threat situation. We resolve the threat by promotion, ordering the threatening operator after the link's consumer. In iteration 6, the planner returns a totally ordered plan as a two-tuple: <A = {A0, A1, A2, A3, A∞}, TOLO = {A0 << A1 << A2 << A3 << A∞}>. The total order uses the notation "<<" to indicate the absolute order among the chosen operators. In this iteration, the agenda is empty and the algorithm terminates.

References
1. McDermott, D.: The 1998 AI planning systems competition, AI Magazine 21 (2) (2000)
35-55.
2. Bertino, E., Provetti, A., and Salvetti, F.: "Local Closed-World Assumptions for Reasoning
about Semantic Web Data", In Proceedings of the APPIA-GULP-PRODE Conference on
Declarative Programming (AGP 03), pages 314-323, 2003.


3. Gordon, A.: "The Representation of Planning Strategies", Artificial Intelligence 153 (1-2)
(2004) 287-305.
4. Akman, V., Erdogan, S., Lee, J., Lifschitz, V., Turner, H.: "Representing the Zoo World and
the Traffic World in the Language of the Causal Calculator", Artificial Intelligence 153
(1-2) (2004) 105-140.
5. Wilkins, D.: Practical Planning: Extending the Classical AI Planning Paradigm, Morgan
Kaufmann, San Mateo, California, 1988.
6. Gray, R.: Agent Tcl: A flexible and secure mobile-agent system, In Mark Diekhans and
Mark Roseman, editors, Proceedings of the Fourth Annual Tcl/Tk Workshop (TCL 96),
Monterey, California, July 1996.

Resource Management in a Multi-agent System by Means of Reinforcement Learning and Supervised Rule Learning

Bartłomiej Śnieżyński
AGH University of Science and Technology, Institute of Computer Science
Kraków, Poland
Bartlomiej.Sniezynski@agh.edu.pl

Abstract. In this paper two learning methods are presented: reinforcement learning and supervised rule learning. The former is a classical approach to the learning problem in multi-agent systems. The latter is, to the author's knowledge, a novel approach, which has several advantages. Both methods are used for resource management in a multi-agent system. The environment is the Fish Banks game, in which agents manage fishing companies. Both learning methods are applied to generate a ship allocation strategy. In this article the system architecture and learning processes are described, and experimental results comparing the performance of the implemented types of agents are presented.
Keywords: Machine learning, multi-agent systems, resource management.

1 Introduction

Resource allocation is a vital problem in multi-agent systems. Agents try to realize their goals as well as possible, whereas cooperation may be necessary to avoid exhausting the resources and causing a crisis situation. One of the main problems in the development of such systems is designing an appropriate strategy for resource allocation.
Applying learning algorithms allows such problems to be overcome. One can implement an agent that is not perfect but improves its performance. This is why the term machine learning has appeared in the context of agent systems for several years. There are many learning methods that can be used to generate knowledge or strategy. Choosing an appropriate one that fits a given problem is sometimes a difficult task.
In multi-agent systems the most common technique is reinforcement learning [1]. It allows a strategy to be generated for an agent in a situation where the environment provides some feedback after the agent has acted. The feedback takes the form of a real number representing a reward, which depends on the quality of the action executed by the agent in a given situation. The goal of the learning is to maximize the estimated reward.

Symbolic, supervised learning is not so widely used in multi-agent systems. There are many methods belonging to this class that generate knowledge from data. Here a rule induction algorithm is used. It generates a rule-based classifier, which assigns a class to a given example. As input it needs examples, in which the class is assigned by some teacher. In this paper it is shown that it is possible to generate such examples in the case of learning resource allocation. Using this method instead of reinforcement learning has several advantages.
As the environment, the Fish Banks game is used [2]. It is a simulation in which agents run fishing companies that must decide how much, and where, to fish.
In the following sections, learning in multi-agent systems is described, the developed system and both learning methods are presented, and experimental results are analyzed.

2 Learning in Multi-agent Systems

The problem of learning in multi-agent systems may be considered as a union of research on multi-agent systems and on machine learning. Machine learning focuses mostly on research on an isolated process performed by one intelligent module. The multi-agent approach concerns systems composed of autonomous elements, called agents, whose actions lead to the realization of given goals. In this context, learning is based on the observation of the influences of activities performed to achieve the goal by the agent itself or by other agents. Learning may proceed in a traditional, centralized manner (one learning agent) or in a decentralized manner. In the second case more than one agent is engaged in the learning process [1].
So far, agent systems with learning capabilities have been applied in many domains: to train agents playing in the RoboCup Challenge [3], to adapt user interfaces [4], to take part in agent-based computational economics simulations [5], and to analyze distributed data [6].

3 System Description

The Fish Banks game was originally designed for teaching people effective cooperation in using natural resources [7]. However, it also suits multi-agent systems very well [2,8]. The game is a dynamic environment providing all necessary resources, action execution procedures, and time flow represented by game rounds. Each round consists of the following steps: ships and money update, ship auctions, trading session, ship orders, ship allocation, fishing, and fish number update.
Agents represent players that manage fishing companies. Each company aims at collecting maximum assets, expressed by the amount of money deposited in a bank account and the number of ships. The company earns money by fishing at fish banks. The environment provides two fishing areas: coastal and deep-sea. Agents can also keep their ships at the port. The cost of deep-sea fishing is the highest. The cost of staying at the port is the lowest, but such a ship does not


catch fish. Initially, it is assumed that the number of fish in both banks is close to the banks' maximal capacity. Therefore, at the beginning of the game deep-sea fishing is more profitable.
Usually exploitation overcomes birth, and after several rounds the fish number can decrease to zero. It is a standard case of the tragedy of the commons [9]. It is then more reasonable to keep ships at the harbor; therefore, companies should change their strategies.
3.1 Agents

Four types of agents are implemented: a reinforcement learning agent, a rule learning agent, a predicting agent, and a random agent. The first one uses a learned strategy to allocate ships, the second one uses rules induced from experience to classify actions and choose the best one, the third agent type uses previous fishing results to estimate the values of different allocation actions, and the last one allocates ships randomly.
All types of agents may observe the following aspects of the environment: the arrival of new ships bought from a shipyard, the money earned in the last round, the ship allocations of all agents, and the fishing results for the deep-sea and inshore areas. All types of agents can execute the following two types of actions: order ships and allocate ships.
The order ships action is currently very simple. It is implemented in all types of agents in the same way. At the beginning of the game every agent has 10 ships. Every round, if it has fewer than 15 ships, there is a 50% chance that it orders two new ships.
Ship allocation is based on the method used in [2]. The allocation action is represented by a triple (h, d, c), where h is the number of ships left in the harbor, and d and c are the numbers of ships sent to the deep-sea and coastal areas, respectively. Agents generate a list of allocation strategies for h = 0%, 25%, 50%, 75%, and 100% of the ships that belong to the agent. The rest of the ships (r) is partitioned; for every h the following candidates are generated:
1. All: (h, 0, r), (h, r, 0) - send all remaining ships to the deep sea or the coastal area,
2. Check: (h, 1, r-1), (h, r-1, 1) - send one ship to the deep sea or the coastal area and the rest to the other,
3. Three random actions: (h, x, r-x), where 1 ≤ x < r is a random number - allocate the remaining ships in a random way,
4. Equal: (h, r/2, r/2) - send an equal number of ships to both areas.
The random agent allocates ships using one of the candidates chosen at random. The predicting agent uses the following formula to estimate the value of an action:

v(a) = income(a) + ξ · ecology(a),          (1)

where income(a) represents the prediction of the income under the assumption that in this round the fishing results will be the same as in the previous round, ecology(a) represents the ecological effects of the action a (the value is low if fishing is performed in an area with a low fish population), and ξ represents the importance of the ecology factor.

begin
    if it is the first game then Q := 0;
    a1 := random action; execute a1;
    foreach round r > 1 do
        xr := current state;
        if random() < the probability of exploration then ar := random action;
        else ar := arg max_a Q(a, xr);
        qr := Q(ar, xr);
        execute ar; xr+1 := state after action execution;
        i := income (money earned by fishing - costs);
        Qmax := max_a Q(a, xr+1);
        Δ := i + γ · Qmax - qr;
        Q(ar, xr) := Q(ar, xr) + β · Δ;
    end
end
Fig. 1. Algorithm used by the reinforcement learning agent

3.2 Learning Agents

The reinforcement learning agent chooses an action at random in the first round. In the following rounds, the action with the highest predicted value (Q) is chosen. Q is a function that estimates the value of an action in a given state:

Q : A × X → ℝ,          (2)

where A = {(h, d, c) : h, d, c ∈ {0%, 25%, 50%, 75%, 100%}, d + c = 1} is the set of all possible ship allocation actions, and X = {(dc, cc) : dc ∈ {1, 2, ..., 25}, cc ∈ {1, 2, ..., 15}} is the set of states, which represent the catch in both areas in the previous round. As we can see, the set of ship allocation actions is different from that of the other agents. The Q-learning algorithm [10] is used to update the Q function after each round (beginning from the second one, in order to have catch data). The algorithm of the agent is presented in Fig. 1.
At the beginning, Q is initialized as a constant function 0. To provide sufficient exploration, in game number g a random action is chosen with probability 1/g (all actions then have the same probability). Thus either a random action or the best action (according to the Q function) is chosen and executed. The income, which represents the feedback from the environment, is calculated, and Q is updated taking into account the reward. The parameter γ ∈ [0, 1] represents the importance of future rewards, and β ∈ (0, 1) is used to control the speed of change of Q.
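A compact C sketch of this Q-update (illustrative only; the state and action encodings, table sizes and the epsilon-greedy helper are assumptions, not the Prolog implementation described in Section 3.3):

#include <stdlib.h>

#define N_STATES  (25 * 15)   /* deep-sea catch (1..25) x coastal catch (1..15) */
#define N_ACTIONS 25          /* assumed enumeration of the allocation triples  */

static double Q[N_STATES][N_ACTIONS];          /* initialized to 0 */

/* One Q-learning update after executing action a in state x, earning income i
 * and observing the next state x_next (gamma: future-reward weight,
 * beta: speed of change of Q). */
void q_update(int x, int a, double i, int x_next, double gamma, double beta)
{
    double q_max = Q[x_next][0];
    for (int b = 1; b < N_ACTIONS; b++)        /* max over actions in x_next */
        if (Q[x_next][b] > q_max)
            q_max = Q[x_next][b];

    double delta = i + gamma * q_max - Q[x][a];
    Q[x][a] += beta * delta;
}

/* Exploration as in the text: a random action with probability eps (1/g in
 * game number g), otherwise the action with the highest Q value. */
int choose_action(int x, double eps)
{
    if ((double)rand() / RAND_MAX < eps)
        return rand() % N_ACTIONS;
    int best = 0;
    for (int a = 1; a < N_ACTIONS; a++)
        if (Q[x][a] > Q[x][best])
            best = a;
    return best;
}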
The rule learning agent also randomizes actions in the first game, but in the following games it chooses the action with the highest rating. The action rating is generated using rules, which are stored in the knowledge base (KB). Rules allow the allocation to be classified as good or bad, taking into account the allocation parameters and the environment parameters (the fish catch at the deep sea and at the coastal area in the previous round). The algorithm of the agent is presented in Fig. 2.


begin
    if it is the first game then KB := ∅; T := ∅;
    foreach round r do
        if it is the first game or round then a := random action;
        else a := action with the highest rating using KB;
        execute a; observe actions of other agents and results;
        T := T ∪ {(best-action-data, good), (worst-action-data, bad)};
    end
    KB := generate knowledge from training examples T;
end
Fig. 2. Algorithm used by the rule learning agent

Every action a gets a value v according to the formula:

v(a) = λ · good(a) − bad(a),    (3)

where good(a) and bad(a) are the numbers of rules that match the action and the current environment parameters, with consequence good and bad, respectively, and λ is a weight representing the importance of rules with consequence good. If there is more than one action with the same value, the one occurring earlier in the list is chosen. As a consequence, actions with a smaller h are preferred.
Training examples are generated from the agent's observations. Every round the learning agent stores the ship allocations of all agents, together with the fish catch in the previous round. The action of the agent with the highest income is classified as good, and the action of the agent with the lowest income is classified as bad. If in some round all agents get the same income, no action is classified and, as a consequence, none of them is used in learning.
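The rating step of formula (3) and the generation of training examples can be summarised by the following Python sketch. Rules are represented simply as predicates over the action and the previous catch, and names such as Rule, rate_action and collect_examples are illustrative assumptions; in the actual system the rules are AQ21 attributional rules stored in a Prolog knowledge base.

from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Rule:
    matches: Callable[[tuple, tuple], bool]   # (action, previous catch) -> bool
    consequence: str                          # "good" or "bad"

def rate_action(action, prev_catch, kb: List[Rule], lam: float = 1.0) -> float:
    # v(a) = lambda * good(a) - bad(a), formula (3)
    good = sum(1 for r in kb if r.consequence == "good" and r.matches(action, prev_catch))
    bad = sum(1 for r in kb if r.consequence == "bad" and r.matches(action, prev_catch))
    return lam * good - bad

def collect_examples(observations: List[Tuple[tuple, float]], prev_catch) -> list:
    # observations: (action of an agent, income it obtained) for one round
    incomes = [inc for _, inc in observations]
    if max(incomes) == min(incomes):
        return []                             # equal incomes: nothing is classified
    best = max(observations, key=lambda o: o[1])[0]
    worst = min(observations, key=lambda o: o[1])[0]
    return [((best, prev_catch), "good"), ((worst, prev_catch), "bad")]

# Example: a single "bad" rule, keeping 25-75% of the ships at the harbor
# when the previous deep-sea catch is at least 16.
kb = [Rule(lambda a, c: a[0] in (25, 50, 75) and c[0] >= 16, "bad")]
print(rate_action((50, 25, 25), (18, 10), kb))    # -> -1.0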
At the end of each game the learning agent uses the training examples T, which were generated during all games played so far, to learn a new classifier (KB), which is used in the next game.
To support rule induction the AQ21 program is used [11]. It is machine learning software that generates attributional calculus rules for given examples. In the standard mode it produces a complete and consistent description of the data, but it can also provide rules that are not complete and/or consistent. The main advantage of this program is that the generated knowledge is easy for a human to interpret, which makes the experimental results easier to check and can be useful in the Fish Bank application to teach people. Of course, other methods of rule (or even classifier) learning can be used.
3.3 Implementation

The software used in the experiments is written in Prolog, using the Prologix compiler [12]. Every agent is a separate process and can be executed on a separate machine. Agents communicate with the environment using a Linda blackboard.


Prologix is an extension of BinProlog that has many powerful knowledge-based extensions (e.g., the agent language LOT, Conceptual Graphs and KIF support).

Experimental Results

Three series of experiments were performed to test how learning influences the agents' performance. Four agents took part in every series. Each series consisted of a sequence of 15 games and was repeated ten times.
In the first series there were three random agents and one reinforcement learning agent (with γ = 1 and β = 0.1). The performance of the agents, measured as the balance at the end of every game, is presented in Fig. 3-a.
In the second series there were three random agents and one rule learning agent (with weight λ = 1). The performance of these agents is presented in Fig. 3-b.
In the third series one rule learning agent (λ = 1), one predicting agent and two random agents were used. The performance of the agents is presented in Fig. 3-c.
In all experiments the average balance of both types of learning agents increases with the agents' experience, while the performance of the predicting and random agents decreases slightly (because of the learning agents' competition). The reinforcement learning agent was a little worse than the rule learning agent, but tuning its parameters (β, γ) and taking into account the actions of other agents during learning should increase its performance.
Examples of learned rules are presented in Fig. 4. Capital letters represent variables that can be unified with any value. The predicate member checks whether its first argument belongs to the list that is its second argument. It is used to represent an internal disjunction (an expression of the form x = v1 ∨ v2 ∨ . . . ∨ vn). The remaining predicates represent the catch in the previous round (prevCatchDeep, prevCatchCoastal) and the action (harbor, deep_coastal_ratio).
These rules (in the form of clauses) can be interpreted in the following way. Clause (a): it is a bad decision to keep 25, 50, or 75 percent of the ships at the harbor if the previous catch at the deep sea is greater than or equal to 16 and the previous catch at the coastal area is 10. Clause (b): it is a good decision to send 100% of the ships to the deep sea, or 75% to the deep sea and 25% to the coastal area, if the previous catch at the deep sea is greater than or equal to 18 and smaller than or equal to 21, and the previous catch at the coastal area is smaller than or equal to 10.
The experimental results show that the rule learning agent's performance increases rapidly at the beginning of the learning process, when the generated rules are used instead of a random choice. Next it increases slowly, because new examples do not contain any significant knowledge. The performance stabilizes at the end of the process.
As we can see in Fig. 3-c, the predicting agent performs better than the learning agent. This suggests that there is space for improvement of the learning method. Further research is necessary to check whether it is possible to learn such a good strategy.


(Charts omitted: panels (a), (b) and (c) plot the balance at the end of each game [K$] against the game number, 1–15.)

Fig. 3. Comparison of the performance of the reinforcement learning agent (RLA), the rule learning agent (LA) and other agents using a random strategy of ship allocation (RA1, RA2, RA3) or prediction (PA); values for the learning and predicting agents are presented with the standard deviation
(a)
rate(bad) :-
    harbor(B),
    member(B, [25,50,75]),
    prevCatchDeep(C),
    C >= 16,
    prevCatchCoastal(10).

(b)
rate(good) :-
    deep_coastal_ratio(B),
    member(B, [100%-0%, 75%-25%]),
    prevCatchDeep(C),
    C >= 18,
    C =< 21,
    prevCatchCoastal(D),
    D =< 10.

Fig. 4. Examples of rules (in the form of Prolog clauses) learned by the agent

Conclusion and Further Research

As we can see, both learning algorithms can be applied to learning resource allocation in a multi-agent system. Their performance is much better than that of a random strategy, but there is still space for improvement. Both techniques have some advantages and disadvantages.
These two methods use different knowledge representations. Reinforcement learning uses the action value function, which is difficult to analyze, especially in the case of a large domain. Rules are usually much easier to interpret (unless there are too many of them). Therefore, if the learned knowledge is to be analyzed by a human, rule induction seems to be a better choice.
A disadvantage of reinforcement learning is the necessity of tuning its parameters (β, γ, and the exploration method). This choice has a high impact on the results. What is more, due to the necessary exploration, the algorithm's performance is less stable (there is a high variance).
On the other hand, reinforcement learning works well even if the reward is delayed. Additionally, it does not need information about other agents' actions. Hence it is more universal. Rule learning can be modified not to use this data, but this will probably result in slower learning. This will be tested in the future.
Other future works will concern applying other learning algorithms and cooperation learning. Applying both methods at the same time to different aspects also seems to be an interesting issue.

References
1. Sen, S., Weiss, G.: Learning in multiagent systems. In Weiss, G., ed.: A Modern Approach to Distributed Artificial Intelligence. The MIT Press (1999)
2. Kozlak, J., Demazeau, Y., Bousquet, F.: Multi-agent system to model the fishbanks game process. In: The First International Workshop of Central and Eastern Europe on Multi-agent Systems (CEEMAS'99), St. Petersburg (1999)
3. Kitano, H., Tambe, M., Stone, P., Veloso, M., Coradeschi, S., Osawa, E., Matsubara, H., Noda, I., Asada, M.: The RoboCup synthetic agent challenge 97. In: International Joint Conference on Artificial Intelligence (IJCAI97), Nagoya, Japan (1997) 24–29
4. Lashkari, Y., Metral, M., Maes, P.: Collaborative interface agents. In: AAAI (1994) 444–449
5. Tesfatsion, L.: Agent-based computational economics: Growing economies from the bottom up. Artificial Life 8 (1) (2001) 55–82
6. Stolfo, S.J., Prodromidis, A.L., Tselepis, S., Lee, W., Fan, D.W., Chan, P.K.: JAM: Java agents for meta-learning over distributed databases. In: KDD (1997) 74–81
7. Meadows, D., Iddman, T., Shannon, D.: Fish Banks, LTD: Game Administrator's Manual. Laboratory of Interactive Learning, University of New Hampshire, Durham, USA (1993)
8. Sniezynski, B., Kozlak, J.: Learning in a multi-agent approach to a fish bank game. In: Multi-Agent Systems and Applications IV: Proc. of CEEMAS 2005. Volume 3690 of Lecture Notes in Computer Science (2005) 568–571
9. Hardin, G.: The tragedy of the commons. Science 162 (1968) 1243–1248
10. Watkins, C.J.C.H.: Learning from Delayed Rewards. PhD thesis, King's College, Cambridge (1989)
11. Wojtusiak, J.: AQ21 User's Guide. Reports of the Machine Learning and Inference Laboratory, MLI 04-3. George Mason University, Fairfax, VA (2004)
12. A. Majumdar, P. Tarau, J.S.: Prologix: User's guide. Technical report, VivoMind LLC (2004)

Learning in Cooperating Agents Environment as a Method of Solving Transport Problems and Limiting the Effects of Crisis Situations

Jaroslaw Kozlak

Department of Computer Science, AGH-UST, Al. Mickiewicza 30, 30-059 Krakow, Poland
kozlak@agh.edu.pl

Abstract. The realisation of transport requests is an important cost factor for companies. For this reason the construction of optimal and effective transport planning and scheduling, offering the best use of the means of transport, is hugely important. One of the transport problems researched at present is the PDPTW. In this work, the PDPTW problem is extended by adding changeable and uncertain travel times between given locations, which are then examined and learnt by the dispatching company system. The changeable travel times are the result of traffic jams forming and propagating. The applied multi-agent approach allows us to consider the problem both on the level of the whole company and on the local level analysed by a particular vehicle. Another aspect taken into consideration is traffic patterns, whose goal is to facilitate finding optimal routes for the observed traffic conditions. These patterns contain information about the dependencies between the traffic states on particular routes.
Keywords: Multi-agent systems, transport problems, learning, traffic jams.

Introduction

Vast amounts of money from the budgets of many companies are spent on costs related to transport. In the case of some goods or services, these costs constitute the dominant element. A lot of companies offer shipping services or services on demand, where it is necessary that the vehicle arrives at the destination in a given period of time. Additionally, and especially in the case of motor transport, it is difficult to predict the exact times of travel between locations. This may be a consequence of changing traffic volume, traffic jams or other current unpredicted events. Therefore, the development of software tools which make route planning and predicting travel times possible takes on a very high significance. Such tools also make it possible to limit the use of means of transport and the total traversed distance, and to ensure that the time and capacity constraints can be met.
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 872–879, 2007.
© Springer-Verlag Berlin Heidelberg 2007



A lot of different transport problems, like the TSP, VRPTW or PDPTW in static and dynamic versions, have been widely researched. The basis of the current work constitutes a dynamic version of the PDPTW (Pickup and Delivery Problem with Time Windows). The PDPTW consists of the realisation of a set of transport requests by a fleet of vehicles. In the PDPTW, it is assumed that the fleet of vehicles has to perform a set of transport requests described by the periods of time when the operations of pickup and delivery may take place, the locations where the goods are picked up and where they must be delivered, as well as the quantity of commodities. In the dynamic version of the problem the requests arrive during the running of the system, the problem solving algorithm must be able to be started repeatedly, the variable constraints related to the flow of time and the performed pickup and delivery operations have to be taken into consideration, and the movement of the vehicles has to be modelled.
In the version of the problem analysed in this work, several extensions to the dynamic PDPTW were introduced. They are based on assessing uncertain and changeable travel times between locations, which are additionally a result of the formation of traffic jams and their propagation onto neighbouring arcs. The problem, extended in such a way, better suits the situations we may encounter when planning the realisation of transport requests in practice. To solve the problem, the multi-agent approach has been chosen.
The layout of this paper is as follows. Section 2 contains an overview of the work being carried out within the domains of transport planning and multi-agent systems. Section 3 presents a model of a multi-agent system for planning transport requests, taking into consideration communication between agents, the modelling and spread of traffic jams, and traffic patterns. Section 4 contains an overview of the experiments carried out and the advantages offered by communication between agents as well as by traffic patterns. Section 5 contains conclusions and plans for future research.

State of the Art

Different approaches are applied to solve transport planning and scheduling problems. One can distinguish two major kinds of problems: static and dynamic. To solve the static problems nowadays, different heuristic approaches are usually used. They are based on algorithms such as evolutionary algorithms, simulated annealing, tabu search [5], squeaky wheel [6] or the ant colony approach. To solve the dynamic problem, metaheuristics based on tabu search [4] or multi-agent approaches are used. The multi-agent approach exploits methods such as Contract-Net, simulated trading [1,3,2] or DCSP [8].
The multi-agent approach facilitates an analysis of problems which consider additional elements that take part in the planning of transport requests in practice, but which are not present in the classical PDPTW. These aspects are, for example, modelling traffic jams [3], taking into consideration different kinds of


freight and the construction of the transport unit (composed of trucks and trailers adjusted to the given needs [2]), as well as problem decentralisation [9].
The next aspect analysed in this paper is the problem of uncertainty related to the travel times through the arcs. This problem is examined in particular by systems designed for city traffic modelling. A very popular approach, which is the basis for different multi-agent realisations, is the Nagel-Schreckenberg model [7]. Taking advantage of these systems and possessing information about city traffic, it is possible to estimate travel times on any given part of the route.

Model

The model presented in this work, besides solution optimisation methods, focuses on learning the travel times and recognising traffic-jam patterns in the road network. The model of the system consists of a set of agents (a dispatcher and vehicles) and of the environment where the agents function (Fig. 1).

Fig. 1. System layers model

3.1 Environment

The environment is represented by the transport network, which is a graph TN = (V, E), where V is a set of nodes (Vi) and E is a set of edges (Ei). Each node is described by its coordinates (xi, yi) and the probability pgi of a request generation taking this given node as a pickup or delivery point. Each edge Ei is described by the nodes Vj, Vj′ being the starting and ending nodes of the edge, the distance djj′ between these nodes and a probability distribution function Ti which represents the travel times through the edge. The probability distribution function may vary depending on the traffic volume tvi on this edge. The traffic volume changes depending on the time of day or as a result of random events which cause the build-up of traffic (accidents, breakdowns, closing of roads).

3.2 Agent-Dispatcher

The goal of the agent-dispatcher is to receive requests from customers and to assign these requests to the vehicles, according to the goal function. Additionally, the agent-dispatcher manages the common knowledge of the enterprise concerning currently appearing traffic jams and patterns of traffic jam distribution.
The agent-dispatcher AC is represented by the following parameters (gAC, R, Rf, Ra, Rn, Rr, KAV, Ar, TJ, TP), where: gAC is the goal function of the agent; R is the set of arrived requests; Rf, Ra, Rn, Rr are the sets of requests realised, assigned to agent-vehicles to be realised, not yet assigned for realisation to agent-vehicles, and rejected; KAV is the set of information concerning the vehicles belonging to the company; Ar is the set of offers of request realisations; TJ, TP is the knowledge about current traffic jams and traffic patterns.
The knowledge bases TJ and TP are made available as an external blackboard, to which access is allowed for all agents, the agent-dispatcher as well as the agent-vehicles. However, operations on them are performed by the dispatcher on the basis of data acquired from the agent-vehicles.
The dispatcher may take into consideration different criteria for the choice of a vehicle, reflecting the preferences represented by the goal function, such as: the smallest possible estimated costs related to the realisation of the request by the vehicle, or an even distribution of requests among vehicles. The agent-dispatcher performs the following actions:
– actions related to the allocation of requests: receipt of a request, forwarding the request to the agent-vehicles, receipt of offers from the agent-vehicles, analysis of the offers (the choice of the best offer of action realisation, clearing the set of offers for this request realisation and sending an answer of acceptance or rejection to the agents which sent the propositions of request realisation), acceptance of a request for realisation or request rejection, receipt of information that a request has already been completed;
– actions related to the management of the knowledge base about current traffic jams (receipt of information and table update);
– actions related to the management of the knowledge base about traffic patterns (receipt of information and table update).
3.3 Agent-Vehicles

The goal of an agent-vehicle is to negotiate transport requests for realisation, to determine, using an insertion algorithm, the best positions in its travel route for adding new request points, to realise the requests, and to explore the road network in order to improve the quality of the information about travel times. An agent-vehicle AVk is represented by an n-tuple containing Vk, Rk, Rpk, Rdk, Pk, Prk, pk, gk, ckmax, ck, sk, AT, ATk, rk and a function assigning request points to route points, where:
– Vk ∈ V is the actual location (the node where the agent is at present or to which it is currently heading);


– Rk, Rpk, Rdk are the sets of requests assigned for realisation, picked up, and already realised (after delivery);
– Pk is the planned route, represented as a sequence <pkj> where pkj = Vj;
– Prk is the planned travel route consisting only of pickup and delivery points;
– the assignment function maps the request points to be performed at given locations to the points of the travel route;
– pk is a special point in the travel route, describing the current position of the vehicle;
– gk is the goal function describing the functioning of the agent;
– ckmax is the maximum capacity of the vehicle;
– ck is the current load; ck should be lower than or equal to ckmax;
– sk is the strategy used by the agent to estimate the travel times between the nodes and to take into consideration the uncertainty of travel times (average value of estimations, given probability of successful plan realisation);
– AT is the knowledge concerning the estimated travel times through the arcs;
– ATk is the knowledge concerning the travel times through the given arcs, adjusted by the travel times coming from the agent's experiences (its own times of travel through the arcs during its movements) and information obtained from other agents;
– rk is a parameter describing the inclination to take a risk: for a small value of this parameter the agent chooses the fastest known routes, for larger values the agent tries to explore the road network and to supplement the data about less explored arcs.
The quality of the solution gk for a given vehicle AVk is described by Eq. (1):

gk = α · Σni=1 pi − β · c − γ · Σni=1 (fip + fid)    (1)

where: n is the number of requests to be realised by the vehicle, pi is the gain after the realisation of request number i, c is the cost caused by the total distance travelled, fip, fid are the penalties caused by lateness at the pickup and delivery points of request i, and α, β, γ are weighting coefficients.
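A direct transcription of Eq. (1) in Python is given below; the field names and the default weights are illustrative assumptions, not values from the paper.

# Route quality: weighted gains minus travel cost and lateness penalties, as in Eq. (1).
def route_quality(requests, distance_cost, alpha=1.0, beta=1.0, gamma=1.0):
    # requests: list of dicts with 'gain', 'pickup_penalty', 'delivery_penalty' (assumed fields)
    gains = sum(r["gain"] for r in requests)
    penalties = sum(r["pickup_penalty"] + r["delivery_penalty"] for r in requests)
    return alpha * gains - beta * distance_cost - gamma * penalties

# Example: two requests, a travel cost of 30 and one late delivery.
print(route_quality(
    [{"gain": 40, "pickup_penalty": 0, "delivery_penalty": 5},
     {"gain": 25, "pickup_penalty": 0, "delivery_penalty": 0}],
    distance_cost=30))   # -> 30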
The agent-vehicle performs the following actions:
– actions related to request allocation: construction of a route for a given set of requests, taking into consideration the information about travel times and the parameter describing the inclination to take a risk; route estimation; estimation of the submitted proposition (it is described as the difference between the goal function of the agent after request acceptance and before request acceptance; the agent may also state that it is not able to realise the request); submitting an offer of request realisation to the agent-company; acceptance or rejection of a request;
– actions related to vehicle movement: movement of the vehicle, pickup realisation, delivery realisation;
– actions related to updating its own knowledge bases and those of other agents: modification of the route because of a traffic jam, modification of the knowledge base about travel times through the arcs ATk, sending messages to other agent-vehicles with information on current travel times through a given arc, observation of existing traffic jams, acquiring information about traffic patterns.

3.4 Traffic Jams

It is assumed that each arc has a maximum allowed capacity cmaxij and an actual traffic volume cij. Additionally, the arcs coming out from a given node are given weightings reflecting how the traffic coming into this node is divided among the arcs coming out from it. If the actual traffic reaches a value equal to or greater than the maximum allowed capacity, a traffic jam appears on the arc, which also slows down the movement of the vehicles modelled on it. A traffic jam may also be a result of a randomly generated crisis situation which increases the actual traffic volume on the arc. According to the rules of the flow of traffic to neighbouring arcs, traffic jams may propagate.
If a vehicle reaches the start of an arc on which a traffic jam has appeared (it may be recognized by observing the relationship between the current traffic parameter and the maximum allowed capacity on the arc), then the vehicle attempts to construct a diversion route. The route construction algorithm is initiated once more, but in this case omitting the arc where the traffic jam has occurred. If this construction of a diversion is successful, the vehicle continues along the new, modified route.
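An illustrative sketch of this rerouting step is given below: when the next arc on the planned route has reached its maximum allowed capacity, the route is rebuilt with that arc excluded. A plain Dijkstra search over an adjacency dictionary stands in for the vehicle's route construction algorithm; all names are assumptions.

import heapq

def shortest_route(graph, start, goal, banned=frozenset()):
    # graph: {node: {neighbour: travel_time}}; returns a list of nodes or None.
    queue, seen = [(0.0, start, [start])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, t in graph.get(node, {}).items():
            if (node, nxt) not in banned and nxt not in seen:
                heapq.heappush(queue, (cost + t, nxt, path + [nxt]))
    return None

def maybe_divert(graph, route, traffic, capacity):
    # If the next arc on the route is jammed (traffic >= capacity), try a detour.
    here, nxt = route[0], route[1]
    if traffic[(here, nxt)] >= capacity[(here, nxt)]:
        detour = shortest_route(graph, here, route[-1], banned={(here, nxt)})
        return detour if detour is not None else route
    return route

# Example: the arc A-B is jammed, so the vehicle takes A-C-B instead.
graph = {"A": {"B": 5, "C": 2}, "C": {"B": 1}, "B": {}}
print(maybe_divert(graph, ["A", "B"],
                   traffic={("A", "B"): 10}, capacity={("A", "B"): 10}))  # ['A', 'C', 'B']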
The distributions of travel times through the arcs are not known to the agents carrying out the planning. The vehicles and the dispatcher do not know the distributions on the given roads; they may only know some estimations and gain additional knowledge from their own experience or from experience acquired by other vehicles. An agent may acquire information concerning the travel times between the nodes using the following methods:
– as a result of traversing a given arc and measuring the time required. In this case the estimated travel time between the nodes i and j is changed and takes a value equal to t(n+1)ij = α · t(n)ij + β · tij, where tij is the currently measured travel time from i to j, and α and β are weighting parameters, equal in the simple case to n/(n + 1) and 1/(n + 1), where n + 1 is the number of measurements taking the last one into consideration;
– as a result of obtaining information about travel times through the arc from another cooperating agent. In this case, a new travel time estimation is calculated in a similar manner as described previously and taken into account during the planning process.
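The incremental update above is a running average; the short Python sketch below shows it with α = n/(n + 1) and β = 1/(n + 1), applied in the same way to the agent's own measurements and to times reported by other vehicles. The class name is an illustrative assumption.

class TravelTimeEstimate:
    def __init__(self):
        self.value = None   # current estimate t_ij^(n)
        self.n = 0          # number of measurements taken into account

    def update(self, measured_time):
        if self.value is None:
            self.value = measured_time
        else:
            alpha = self.n / (self.n + 1)
            beta = 1 / (self.n + 1)
            self.value = alpha * self.value + beta * measured_time
        self.n += 1
        return self.value

estimate = TravelTimeEstimate()
for t in (12.0, 15.0, 9.0):      # own traversals and times reported by other vehicles
    estimate.update(t)
print(estimate.value)            # arithmetic mean of the measurements, here 12.0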
3.5 Traffic Patterns

The table concerning the traffic patterns is managed by the agent-dispatcher. Rows and columns are identified by the numbers of the arcs ej and ei. For each cell of this table, lists are stored containing the gathered parameters describing the results of the observation of the traffic states on these arcs measured within the tolerated time period: whether a traffic jam was observed on the arc ei (yes or no), whether a traffic jam was observed on the arc ej (yes or no), and the time of observation. This information is delivered to the agent-dispatcher, which afterwards makes it accessible to particular agent-vehicles. An agent-vehicle observing or obtaining information about a


traffic jam appearing on any given arc may verify whether a traffic jam appearing on this arc was often associated with the appearance of traffic jams on other arcs contained in its planned route. If this is the case, the agent makes an attempt to construct a diversion route also omitting the arcs threatened by the appearance of traffic jams.
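One possible concrete form of this pattern table is sketched below in Python: it stores jam co-occurrence observations per pair of arcs and lets a vehicle check whether a jam on one arc was frequently accompanied by a jam on another. All names and the 0.5 threshold are illustrative assumptions, not taken from the paper.

from collections import defaultdict

class TrafficPatterns:
    def __init__(self):
        # (arc_i, arc_j) -> list of (jam_on_i, jam_on_j, time) observations
        self.table = defaultdict(list)

    def record(self, arc_i, jam_i, arc_j, jam_j, time):
        self.table[(arc_i, arc_j)].append((jam_i, jam_j, time))

    def jam_correlated(self, jammed_arc, other_arc, threshold=0.5):
        # Was a jam on other_arc often observed together with a jam on jammed_arc?
        obs = self.table[(jammed_arc, other_arc)]
        together = [1 for (ji, jj, _) in obs if ji and jj]
        return bool(obs) and len(together) / len(obs) >= threshold

# A vehicle observing a jam on arc "e3" can then also avoid arcs whose jams
# were frequently associated with it when rebuilding its diversion route.
patterns = TrafficPatterns()
patterns.record("e3", True, "e7", True, time=10)
patterns.record("e3", True, "e7", False, time=25)
avoid = [a for a in ["e7", "e9"] if patterns.jam_correlated("e3", a)]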

Experiments

The experiments were conducted with the goal of estimating the quality of the system functionality depending on the selected behaviours of the agents. In particular, the following aspects were considered:
– different frequencies of messages exchanged between agent-vehicles, where the messages contained information about travel times through the arcs; the other option taken into consideration was the lack of exchange of this kind of information;
– the application of traffic patterns which describe the dependencies between the state of traffic on different pairs of arcs; the other option was not to take this element into consideration.

Fig. 2. Vehicle groups' income: a) different message exchange frequencies (Group 1) and no message exchange (Group 2); b) with and without patterns

The vehicles in the system were divided into two groups. The vehicles in the first group exchange information about the travel times through the given arcs. The vehicles in the second group do not use this mechanism. As a method of estimating the quality of the activities, the total income of the vehicles in each group is considered. The experiments proved (Fig. 4) that for almost every frequency of message exchange, the group which used this mechanism obtained better results. The only exception was the case when the communication had the lowest frequency. In this case the exchanged information may not be current and so has a limited influence on the improvement of the accuracy of the traffic description on the arc.
The results obtained for the vehicles using the patterns proved to give a higher financial income than for those that did not. Because the quantities of performed requests were similar in each case, the increase of income in the cases with communication and with the patterns is caused by a more accurate estimation of travel times, which limits lateness and the consequent financial penalties.

Conclusions

A system for solving the transport problem defined as an extension of the dynamic PDPTW with uncertain travel times was realised. The experiments showed the advantages resulting from the additional exchange of information among agents concerning travel times and from the application of traffic patterns. Future work on the system will have as its goal the improvement of the quality of the obtained solutions and better verification. In particular, an elaborate set of tests based on the benchmarks for the static PDPTW is planned. The use of larger graphs representing the road networks of a real city (e.g. roads in Krakow) is also planned.
Kind thanks to the Computer Science students at AGH-UST and the trainees who contributed to the work on the pilot version of the system, and especially to M. Kwiecie, J. Rachwalik and Ch. Elmer.

References
1. Bachem, A., Hochstattler, W., Malich, M.: Simulated Trading – A New Parallel Approach for Solving Vehicle Routing Problems. Proc. of Int. Conf. Parallel Computing: Trends and Applications, 1994.
2. Burckert, H.-J., Fischer, K., Vierke, G.: Transportation scheduling with holonic MAS – the TELETRUCK approach. Third International Conference on Practical Applications of Intelligent Agents and Multiagents (PAAM'98), 1998.
3. Fischer, K., Müller, J.P., Pischel, M.: Cooperative Transportation Scheduling: an Application Domain for DAI. Applied Artificial Intelligence, vol. 10, 1996.
4. Gendreau, A., Guertin, F., Potvin, J.Y., Séguin, R.: Neighborhood search heuristics for a dynamic vehicle dispatching problem with pick-ups and deliveries. Rapport technique CRT-98-10, Université de Montréal, 1998.
5. Li, H., Lim, A.: A Metaheuristic for the Pickup and Delivery Problem with Time Windows. Proceedings of the 13th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'01), USA, 2001.
6. Lim, H., Lim, A., Rodrigues, B.: Solving the Pickup and Delivery Problem using Squeaky Wheel Optimization with Local Search. Proceedings of the Americas Conference on Information Systems, AMCIS 2002, USA.
7. Nagel, K., Schreckenberg, M.: A cellular automaton model for freeway traffic. J. Physique I, 2(12), 1992.
8. Neagu, N., Dorer, K., Calisti, M.: Solving Distributed Delivery Problems with Agent-Based Technologies and Constraint Satisfaction Techniques. Distributed Plan and Schedule Management, 2006 AAAI Spring Symposium, The AAAI Press, USA.
9. Dorer, K., Calisti, C.: An Adaptive Solution to Dynamic Transport Optimization. Proceedings of the AAMAS'05 industry track, Utrecht, The Netherlands, 2005.

Distributed Adaptive Design with Hierarchical Autonomous Graph Transformation Systems

Leszek Kotulski (1) and Barbara Strug (2)

(1) Department of Automatics, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow, Poland
(2) Department of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Reymonta 4, Krakow, Poland
kotulski@agh.edu.pl, uistrug@if.uj.edu.pl

Abstract. In this paper a graph transformation system with the parallel derivation is used to model the process of distribution and adaptation
for computer aided design. It is based on earlier research in formal language theory, especially graph grammars, and distributed models. The
motivation for the ideas presented here is given and possible ways of
application are described. A connection to the multi-agent model is also
presented.
Keywords: Graph transformations, grammar systems, design.

Introduction

The distributed model of computing is becoming more and more popular, especially with the rapid development of the Internet and the availability of distributed development platforms. Such a model seems to be very useful in the domain of computer-aided design. Many design problems can be divided into a number of tasks, each of them carried out by an autonomous agent, either only occasionally exchanging pieces of information or contributing its capabilities to a common object being designed. Yet no formal model of such cooperative distributed design has been proposed so far.
Real world objects being designed are usually complex and contain a number of different sub-elements inter-related in different ways. To represent such complex objects in different domains of computer science, graphs are very often used [15]. Their ability to represent the structure of an object as well as the different types of relations between its parts makes them particularly useful in computer aided design. The process of designing objects thus requires a method of generating graphs.
A new approach to graph generation in the computer aided design domain, proposed in this paper, uses cooperation and distribution as a basis for generating designs with the application of graph grammars. It is based on earlier research in the domain of the application of the theory of formal languages to computer aided design [6,7] and on the distributed model of computation [5,15,11]. In particular, graph grammars [1,6,8,13] and grammar systems [2,3,4] were the inspiration for this research.
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 880–887, 2007.
© Springer-Verlag Berlin Heidelberg 2007



Graph Structures in the Design Process

One of the most useful representations in computer aided design systems is based on graphs. The process of designing can then be seen as generating graphs representing some objects. In such a situation a method for the generation of these graphs is needed. Moreover, such a method must guarantee that the graphs it produces are valid with respect to the design problem being solved. In other words, each graph generated by a chosen method must be interpretable as a design. Usually it must also satisfy some design criteria or constraints. Graph transformations are a method fulfilling all the above requirements.
In real world problems the objects being designed are complex and consist of many parts which in turn can contain subparts. Thus the graphs representing such an object can become large structures, difficult to maintain, update and analyze. Yet in many situations design decisions dealing with one part of the design are independent from the rest of it. Thus it seems reasonable to distribute the graph representing such a design into a number of local graphs, each representing a part of the design, and a master graph containing the overall structure of the design. Each local graph can be generated by its own graph transformation system. A graph transformation system can thus be considered a set of autonomous systems cooperating in parallel with the user (designer), and thus behaving like an agent.
Using such an approach, two major problems inherent to graph-based representations in design problems, namely the size of the graphs and the number of transformation rules, can be reduced.
Moreover, there is nothing to prevent a local graph representing a part of a design from being a master one, either for other graphs representing subparts of this part (which introduces a hierarchy inside the graph) or for the common (replicated) part of the graph structure. The second case will be discussed in the paper in a more detailed way.
The distribution of the representation solves the problems mentioned above, but it also generates new ones. As each graph represents a part of a design, there can exist elements that are represented in different subgraphs. Thus a need arises to synchronize any modification of their environment. For example, in a house design a system may be expected to design a floor layout of a house with the whole infrastructure and furnishing; each room can be considered a valid part of the design and the master graph would contain the structure of the house layout and the common elements. The way furniture is placed in one room is independent from whatever happens to be placed in other rooms, thus each room can be designed by an autonomous agent. Yet some common elements exist. Doors can be one of such elements, as they are shared by two rooms, and thus changing their position must be synchronized inside both graphs representing these rooms; thus a need for communication between agents arises.
Moreover, there are some limited resources to be placed in the house. For example, it may be specified that two TV sets must be placed in the house with some additional constraints, like: the distance from a TV to a seat is at least 2 meters but no more than five. Thus the master unit broadcasts the resource and waits for responses from all rooms. Each local agent responds by specifying a measure of satisfying the constraints, and the master unit decides in which room the TV will be placed, but the exact position is decided by the local system (agent). Let us, however, note that the centralization of some data and decisions inside the master unit can create a bottleneck effect and decrease the effectiveness of the system, so we will sketch the possibility of direct cooperation of the distributed components. The decision on both the distribution of the design data structures and the decomposition of the design process generates a need to find a method of synchronizing and negotiating the work on replicated descriptions of elements and the placement of limited resources.
In the paper we use for this purpose the concept of the Derivation Control Environment, presented in the next chapter, that controls the graph transformation
with the help of productions of the grammar associated with it.

Derivation Control Environment

The main objectives of introducing the Derivation Control Environment (DCE) are:
– the support for local derivation of the graph,
– the representation of users' reactions,
– the synchronization of both the users' reactions with the proper action on the local graph and the derivation of the local cooperation.
Thus the DCE consists of two types of units: the local Derivation Control Diagrams (DCD) and the local Users' Environments (UE). The definition of the communication mechanism is one of the most important properties of the distributed environment. We assume that both types of units are able to serve messages in the form of requests that have a unique name and carry the values of the attributes associated with this type of request (see [11] for a formal definition).
We demand from the communication environment that a sent request be delivered to the destination unit in a finite time. We assume that each request has at least two attributes defined: sender and time stamp, defining respectively the unit and the time of the generated request. Let RS denote the set of all possible requests; any finite subset of RS will be called a request set.
We assume that a UE is characterized only by the ability to generate (internal) requests for services to the DCD and to service (external) requests required by DCDs. A DCD unit is responsible both for modifying the maintained local graph as a reaction to the appearance of some requests and for generating sequences of requests to other units in order to synchronize its actions with theirs. So, it should introduce:


1. a notion for describing some constraints in applying a grammar production in a given context (as an analogy to the OCL correlated with algebraic transformations [5]);
2. the ability to synchronize, at the level of a single production, in order to wait until some specified sequence of events appears;
3. the ability to designate some sequence of productions associated with a given event.
Let us start to formalize these demands.
Definition 1. A derivation control diagram is a six-tuple S = (N, I, F, T, Σ, Wait), where:
– N is the set of control points,
– I ⊆ N and F ⊆ N are, respectively, the set of starting control points and the set of final control points,
– T is a set of transitions of the form (k, q, P, SF), where:
  – k, q are control points,
  – P is either a production or an empty symbol when no production is associated with this transition,
  – SF is a semantic action,
– Σ = {σk} is the set of selectors,
– Wait = {Waitk} is the set of synchronizing functions.
The synchronizing function Waitk delays the evaluation of the selector σk until the condition it specifies is fulfilled. The selector σk points to a production, a semantic action and a destination control point associated with the chosen transition. Initially an activity is associated with each starting point; when a transition is fired the activity is moved to the destination point. If the source control point is a starting one then the activity is also replicated to it; if the destination node is a final one the activity is cancelled. We assume that for any k the successful evaluation of the synchronizing function Waitk and of the selector σk is made in a critical section over the graph G and the set of requests. These assumptions imply not only the correctness of the sequential evaluation of the conditions defined inside a control point, but also exclude a busy form of waiting during the evaluation of the awaiting synchronizing conditions. As in the Ada language, in the worst case all the synchronizing conditions are evaluated after any modification of G or of the request set. If no transition is fired to modify G or the request set, the evaluation is suspended until some new request r appears (i.e., the request set is extended with {r}). The semantic function SF (associated with the transition):
– enriches the external request set (requesting some actions either of the designing system or of the user),
– removes the request that is serviced from the request set,
– evaluates the parameters of the right-hand side graph of the production P; production P is applied to the current graph G and a new graph H is created.


More intuitively, a DCD can be interpreted as a graph connecting control points (see Fig. 1), inside each of which both the synchronizing function and the selector, choosing one of the transitions from one control point to another (drawn as an edge), are sequentially evaluated. During such a transition both the production Pi is applied and the semantic action SFi is executed.

Fig. 1. Illustration of DCD
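The following Python sketch is one possible reading of Definition 1 as a data structure with a single evaluation step; it is an illustration under simplifying assumptions (requests as plain values, productions and semantic actions as callables), not the authors' formal semantics, and it omits starting-point replication and final points.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Transition:
    source: str                                  # control point k
    target: str                                  # control point q
    production: Optional[Callable] = None        # P, or None for the empty symbol
    semantic_action: Optional[Callable] = None   # SF

@dataclass
class DCD:
    active: set        # currently active control points
    wait: dict         # Wait_k : (graph, requests) -> bool
    select: dict       # sigma_k : (graph, requests) -> a Transition leaving k

    def step(self, graph, requests):
        # Evaluate one active control point whose synchronizing condition holds.
        for k in list(self.active):
            if not self.wait.get(k, lambda g, r: True)(graph, requests):
                continue                            # Wait_k suspends evaluation of sigma_k
            t = self.select[k](graph, requests)     # selector chooses the fired transition
            if t.production is not None:
                graph = t.production(graph)         # apply production P to the local graph
            if t.semantic_action is not None:
                t.semantic_action(graph, requests)  # SF may add or remove requests
            self.active.discard(k)
            self.active.add(t.target)               # activity moves to the destination point
            return graph
        return graph

# Example: one control point "idle" whose selector fires a self-loop transition
# servicing the oldest pending request.
dcd = DCD(active={"idle"},
          wait={"idle": lambda g, reqs: bool(reqs)},
          select={"idle": lambda g, reqs: Transition("idle", "idle",
                                                     semantic_action=lambda g2, r2: r2.pop(0))})
graph, requests = {}, ["remove_door"]
graph = dcd.step(graph, requests)    # services "remove_door"; activity stays at "idle"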

The Derivation Control Environment (DCE) is responsible for controlling both the modification of the local graphs and the cooperation of the whole distributed system. Let us note that a local graph will be modified when a new request appears, initially from some UE (representing a user decision). Next requests can be generated as an effect of the semantic actions executed by a DCD during the service of these actions, in order to enable cooperation between DCDs. There are two particular reasons for the coordination of local activities:
– firstly, some graph nodes can be replicated in several local graphs (describing the same element), so any action modifying their state (e.g. adding or removing an edge) needs the cooperation of all these graph environments;
– secondly, one can need some information about the global state of the system, so it has to ask the other DCDs to help in finding the global answer.
In the next section, these problems will be discussed in the context of adaptation in design.

Synchronization and Adaptation in Design

In the approach presented in this paper, we use labelled and attributed graphs to represent the structures of design objects, as they allow us to express multi-argument relations between object parts.


Fig. 2. Example representing a house layout

Attributing is used to represent features of objects and relations between them. Attributes represent properties (for example size, position, colour or material) of the elements corresponding to a given node or edge.
The graphs representing designs may be dynamically generated by means of so-called graph grammars [15]. Intuitively, such a grammar is a system of graph transformations, which consists of formal rules called productions. Each production is composed of a left-hand side and a right-hand side. The left-hand side is replaced by the right-hand one only if it appears in the graph to be transformed. In each step a production is applied to the current graph. The sequence of such replacements, which starts from the initial graph, is called a derivation process. Yet in real world design problems it is difficult, if not impossible, to define such an order of application a priori.
Designing is an incremental process of applying changes, evaluating and correcting designs. Moreover, it must adapt to changes in design requirements that are not always known before the process starts. Note that it is very important that a DCD is able to designate the derivation process by responding to events rather than following a predefined route. Each event generates one or more requests sent to other DCDs. For example, adding a fireplace to a single room may require moving doors, which belong to two different rooms, and thus appropriate requests must be generated; both DCDs (controlling the graphs representing these rooms) must react to them. Let us consider this example in detail. Figure 2 depicts a graph representing a part of a house layout during a design process. It consists of three local graphs, representing three rooms. The replicated (shared) elements are placed in the gray part of each graph and connected by dashed lines (which means that a connection is established by the appropriate attribute(s) used in the request). Now a designer wants to add a fireplace to room 2. Such an operation is local to room 2 and is done by applying an appropriate production locally. Local application of productions has been widely and thoroughly presented in other papers [6], so it will be skipped here. While doing it the designer finds that the best location for the fireplace collides with the doors. So a decision is taken to remove the doors. A production Pr is applied that performs this action (shown


(Diagrams of the productions Padd and Premove omitted.)

Fig. 3. Productions for adding and removing doors

in Fig. 3a). But the doors are shared by room 2 with room 3, so the request remove_door(...) must be sent to the DCD controlling the graph describing room 3. This DCD should react to the appearance of the remove_door(...) request by applying the production Pr to remove the door from room 3. In both cases the removed doors may be the only door in the room. Checking the existence of correct communication from the room can be made in our model with the help of a semantic action. If the communication stays correct then the current thread of control can be moved to the starting point; otherwise it is moved to the part of the DCDs responsible for negotiating the addition of different doors in other walls. Such doors are added by applying the production shown in Fig. 3b.

Conclusions

In this paper we have shown that it is possible to decompose a complex design problem into a number of tasks. These tasks can be performed independently by agents working according to local grammars and cooperating when a need for updating shared resources arises. In the presented simple example all three graphs are derived with the use of the same grammar (thus all agents follow the same rules), but there is no such requirement in the theoretical model we propose. Thus each graph may be derived by a different grammar. Moreover, the grammars may belong to different classes. The only requirement is that each grammar contains productions for updating the shared elements.
In papers [11,12] it was shown that such an approach can be realized as a multi-agent system. Each graph transformation system can be considered to be an agent whose set of possible actions depends on the set of productions of the given GTS.

References
1. Borkowski, A., Grabska, E., Nikodem, P., Strug, B.: Searching for Innovative Structural Layouts by Means of Graph Grammars and Evolutionary Optimization. Proc. 2nd Int. Structural Eng. and Constr. Conf., Rome (2003).
2. Csuhaj-Varju, E., Dassow, J., Kelemen, J., Paun, Gh.: Grammar Systems. A Grammatical Approach to Distribution and Cooperation. Topics in Computer Mathematics 8. Gordon and Breach Science Publishers, Yverdon (1994).
3. Csuhaj-Varju, E.: Grammar systems: A short survey. Proceedings of Grammar Systems Week 2004, 141–157, Budapest, Hungary, July 5–9, 2004.
4. Dassow, J., Paun, Gh., Rozenberg, G.: Grammar systems. In: Salomaa, A., Rozenberg, G. (eds.): Handbook of Formal Languages, volume 2, chapter 4, pp. 155–213. Springer-Verlag, Berlin-Heidelberg (1997).
5. Ehrig, H., Taentzer, G.: Graphical representation and graph transformations. ACM Comput. Surv. 31(3):9 (1999).
6. Grabska, E.: Theoretical Concepts of Graphical Modelling. Part One: Realization of CP-graphs. Machine GRAPHICS and VISION 2(1), pp. 3–38 (1993).
7. Grabska, E.: Graphs and designing. Lecture Notes in Computer Science 776 (1994).
8. Grabska, E., Nikodem, P., Strug, B.: Evolutionary Methods and Graph Grammars in Design and Optimization of Skeletal Structures. 11th International Workshop on Intelligent Computing in Engineering, Weimar (2004).
9. Grabska, E., Strug, B.: Applying Cooperating Distributed Graph Grammars in Computer Aided Design. Parallel Processing and Applied Mathematics, PPAM 2005, Lecture Notes in Computer Science, Springer (2006).
10. Grabska, E., Grzesiak-Kopec, K., Lembas, J., Lachwa, A., Slusarczyk, G.: Hypergraphs in design. Computational Imaging and Vision, Kluwer (to appear).
11. Kotulski, L.: A Model of Software Generation in a Distributed Environment by Graph Grammars (in Polish). Habilitation Thesis, Jagiellonian University Publishing, ISBN 83-233-1391-1, Krakow (2000).
12. Kotulski, L.: Supporting Software Agents by the Graph Transformation Systems. Proc. of International Conference on Computational Science, LNCS 3993, pp. 887–890, Springer (2006).
13. Nikodem, P., Strug, B.: Graph Transformations in Evolutionary Design. ICAISC 2004, Lecture Notes in Computer Science, vol. 3070, pp. 456–461, Springer (2004).
14. Paun, Gh., Salomaa, A. (eds.): Grammatical Models of Multi-agent Systems. Gordon and Breach, Amsterdam (1999).
15. Rozenberg, G.: Handbook of Graph Grammars and Computing by Graph Transformations, vol. 1: Foundations. World Scientific, London (1997).

Integration of Biological, Psychological, and Social Aspects in Agent-Based Simulation of a Violent Psychopath
Tibor Bosse, Charlotte Gerritsen, and Jan Treur
Vrije Universiteit Amsterdam, Department of Artificial Intelligence
De Boelelaan 1081a, NL-1081 HV, Amsterdam, The Netherlands
{tbosse,cg,treur}@few.vu.nl
http://www.few.vu.nl/~{tbosse, cg, treur}

Abstract. In the analysis of criminal behaviour, a combination of biological,


psychological and social aspects may be taken into account. Dynamical
modelling methods developed in recent years often address these aspects
separately. This paper contributes an agent-based modelling approach for
behaviour of a certain criminal type, the violent psychopath, in which these
aspects are integrated in one dynamical system. It is shown how within a certain
social context, an interaction between biological factors and cognitive and
emotional factors can lead to a crime committed when an opportunity is
perceived.
Keywords: simulation, violent behaviour, BDI-models, integration of
biological, psychological and social aspects.

1 Introduction
Analysis of criminal behaviour is a central issue in Criminology. Such an analysis
involves different types of aspects, such as biological, psychological, and social
aspects, and their mutual interactions; e.g., [2, 5, 8, 10, 12]. Usually such analyses are
made by criminologist researchers or practitioners in a nonexact manner, without
using any formalisation or computer support.
In recent years, much progress has been made in biological, cognitive, and social
complex dynamical systems modelling within areas such as Artificial Intelligence,
Computational Biology and Computational Cognitive Science. The methods
developed in these areas usually address one of the disciplines separately. However,
when an integrated modelling approach is applied, this opens the perspective to
address the analysis of criminal behaviour in more exact, formalised and computer
supported manners. Thus, the way is paved to a more solid basis and computer
support for simulation and analysis in the area of Criminology. The research
discussed here explores this potential. It identifies on the one hand useful knowledge
from the literature in Criminology and the different disciplines underlying it e.g.,
[6, 8, 10], and on the other hand it exploits dedicated agent-based and dynamical
systems modelling techniques [3]. The aim is, by combining these, to develop an
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 888–895, 2007.
© Springer-Verlag Berlin Heidelberg 2007


integrated computer-supported method for the analysis of certain criminal behaviour


types, in this case the violent psychopath, as described in the literature.
To model all of the aspects mentioned above in an integrated manner, both
qualitative aspects (e.g., beliefs, desires, intentions, certain brain deviations, and the
presence of certain agents), and quantitative aspects (e.g., hormone levels, distances
and time durations) have to be addressed. The modelling language LEADSTO [3]
fulfils these desiderata. In LEADSTO, basic (atomic) state properties can have a
qualitative, logical format, such as an expression desire(d1), expressing that desire d1
occurs, or a quantitative, numerical format such as an expression has_value(x, v) which
expresses that variable x has value v. Such atomic state properties can be combined to
more complex state properties by taking conjunctions by means of the logical
operator and. Based on these state properties, dynamic properties can be expressed
as follows. Let α and β be state properties of the form 'conjunction of ground atoms or negations of ground atoms'. In the LEADSTO language the notation α →→e, f, g, h β means:

If state property α holds for a certain time interval with duration g,
then after some delay (between e and f) state property β will hold for a certain time interval of length h.

In this paper, Section 2 discusses a specific type of criminal used as a case study:
the violent psychopath. In Section 3 the simulation model based on LEADSTO is
presented. Section 4 discusses some of the simulation results for the case study.
Finally, Section 5 is a discussion about the approach and its possible applications.

2 Case Study: Violent Psychopath


The case study made in this paper focuses on a specific kind of violent offender: the
violent psychopath, e.g., [2, pp. 87-120; 8, pp. 123-183; 9; 10; 12, pp. 193-207]. In
this section, this type of criminal is briefly introduced, by subsequently addressing
psychological, social and biological aspects.
Psychological Aspects. Psychopaths do not show feelings like the rest of us. They
lack the normal mechanisms of anxiety arousal, which ring alarm bells of fear in most
people. Confronted with trial and danger, even their skin does not sweat and become
clammy like the skin of normal people [8, p.157]. Violent psychopaths, who are
almost always males, can be described as predators and are usually proud of it. Their
kind of violence is similar to predatory aggression, that is accompanied by minimal or
no sympathetic arousal and is planned, purposeful, and without emotion. This is
correlated with a sense of superiority; they like to exert power and have unrestricted
dominance over others, ignoring their needs and justifying the use of whatever they
feel compelling to achieve their goals and avoid adverse consequences for their acts.
An important trigger for psychopathic violent behaviour is the use of drugs and/or
alcoholism. They are more likely to turn to drink and drugs and their brain reacts in a
different way to the effects of drugs and alcohol. For a psychopath, using drugs or
alcohol can become a compulsion and, through a genetic and neurological
mechanism, result in violent behaviour [8, p.201].
Social Aspects. Psychopaths are characterised by a disregard for social obligation and
a lack of concern for the feelings of others. They display pathological egocentricity,

890

T. Bosse, C. Gerritsen, and J. Treur

shallow emotions, lack of insight and remorse, anxiety or guilt in relation to their
antisocial behaviour. They are usually callous, manipulative individuals, incapable of
lasting friendship and of love. They use charm, manipulation, intimidation and
violence to control others and to satisfy their own selfish needs. Lacking in
conscience and in feelings for others, they violate social norms and expectations
without the slightest sense of guilt or regret.
Biological Aspects. Psychopaths have a specific deviation in the brain: the frontal
lobes are disconnected from the limbic area. The frontal lobes are the area of the brain
that is concerned with conscience, guilt and remorse and is the residence of our
morality. The limbic area generates feelings [8, p.157]. Because of the disconnection,
psychopaths cannot express their emotions in terms of feeling. They know the
difference between right and wrong, but the difference does not matter for them. It is
hard for a psychopath to understand or imagine the pain of other people [8, p.158].
Furthermore, violent psychopaths have a high level of testosterone, which makes
them more aggressive in their behaviour, and low levels of serotonin, which makes
them easily bored and impulsive, and stimulates them to seek sensation. Once they
reach adulthood, their condition is incurable. However, only a fraction of psychopaths
develops into violent criminals [8, p. 267].

3 The Integrated Simulation Model


In this section, the integrated simulation model that has been developed is described
in more detail. The agent-based model has been built by composing four submodels
for different aspects. The first submodel is a model that bases the preparation and
performing of actions on beliefs, desires and intentions (BDI); e.g., [11]. Some of
the relations within this BDI-model (as depicted by arrows in graphical format in
Figure 1) can be specified in formal LEADSTO format as follows:
desire(d1) ∧ belief(satisfies(a1, d1)) →→0.2, 0.2, 1, 1 intention(a1)
intention(a1) ∧ belief(opportunity_exists_for(a1)) →→0.2, 0.2, 1, 1 to_be_performed(a1)

[Figure: observed stimuli lead to a belief in a reason and a belief in an opportunity; a desire together with the belief in a reason generates an intention, which together with the belief in an opportunity generates the action]

Fig. 1. Structure of the Generic BDI-model


Note that the beliefs used here both depend on observed stimuli, as shown in
Figure 1. Furthermore, ∧ stands for the conjunction operator (and) between the atomic
state properties (depicted in the graphical format by an arc connecting two (or more) arrows).
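As an illustration of how such temporal rules can be executed, the sketch below simulates LEADSTO-style rules on a discrete time grid, using the informal reading given earlier (if the antecedent holds for a duration g, then after a delay of roughly e the consequent holds for a duration h). This is a simplified approximation written for this text, not the LEADSTO simulation environment of [3]; the atom names and the parameters 0.2, 0.2, 1, 1 are taken from the example rules above, everything else is an assumption.

# Simplified executor for rules of the form  antecedent ->>_{e,f,g,h} consequent:
# if all antecedent atoms held during the last g time units, the consequent atom
# is made to hold from t+e until t+e+h (the delay upper bound f is ignored here).
class LeadsToRule:
    def __init__(self, antecedent, consequent, e, f, g, h):
        self.antecedent = set(antecedent)
        self.consequent = consequent
        self.e, self.f, self.g, self.h = e, f, g, h

rules = [
    LeadsToRule({"desire(d1)", "belief(satisfies(a1,d1))"}, "intention(a1)", 0.2, 0.2, 1, 1),
    LeadsToRule({"intention(a1)", "belief(opportunity_exists_for(a1))"},
                "to_be_performed(a1)", 0.2, 0.2, 1, 1),
]

def simulate(persistent_atoms, rules, t_end, dt=0.2):
    steps = int(t_end / dt)
    # atoms given as input are assumed to hold during the whole trace
    trace = [set(persistent_atoms) for _ in range(steps + 1)]
    for i in range(steps + 1):
        for r in rules:
            g_steps = int(r.g / dt)
            if i >= g_steps and all(r.antecedent <= trace[j] for j in range(i - g_steps, i + 1)):
                start = i + int(r.e / dt)
                for j in range(start, min(start + int(r.h / dt) + 1, steps + 1)):
                    trace[j].add(r.consequent)
    return trace

trace = simulate({"desire(d1)", "belief(satisfies(a1,d1))",
                  "belief(opportunity_exists_for(a1))"}, rules, t_end=4.0)
print([i for i, atoms in enumerate(trace) if "to_be_performed(a1)" in atoms])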
A second submodel is used to determine desires, needed as input for the BDI-model. This submodel incorporates various biological and psychological aspects and
their interactions. The biological and psychological aspects involved are of different
types. On the one hand there are qualitative aspects, such as anatomical aspects
concerning brain deviations (e.g., the absence of certain connections). On the other
hand there are quantitative aspects, such as biochemical aspects concerning
testosterone levels. To model these, both logical relations (as in qualitative modelling)
and numerical relations (as in differential equations) have been integrated in one
modelling framework.
A third submodel determines how observations lead to beliefs in an opportunity as
needed as input for the BDI-model. The notion of opportunity is based on criteria as
indicated in the Routine Activity Theory by [5]: a suitable target and the absence of a
suitable guardian. This was specified by the following dynamic property in
LEADSTO format:
∀a1,a2:AGENT ∀l:LOCATION
observes(a1, agent_of_type_at_location(a1, criminal, l)) ∧
observes(a1, agent_of_type_at_location(a2, passer_by, l)) ∧
[ ∀a3:AGENT not observes(a1, agent_of_type_at_location(a3, guardian, l)) ]
→→ belief(opportunity(assault))
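Read operationally, this property says that the criminal agent comes to believe in an opportunity whenever it observes itself and a passer-by at its current location and observes no guardian there. A small Python rendering of this check is given below; the function name and the tuple layout are illustrative assumptions, not taken from the authors' implementation.

# Routine Activity Theory opportunity check: suitable target present, no guardian observed.
def believes_opportunity(observations, self_agent, location):
    # observations: set of (agent, agent_type, location) tuples observed by self_agent
    here = [(a, t) for (a, t, l) in observations if l == location]
    criminal_here = any(a == self_agent and t == "criminal" for a, t in here)
    target_present = any(t == "passer_by" for _, t in here)
    guardian_present = any(t == "guardian" for _, t in here)
    return criminal_here and target_present and not guardian_present

obs = {("agent1", "criminal", "G"), ("agent5", "passer_by", "G")}
print(believes_opportunity(obs, "agent1", "G"))  # True, i.e. belief(opportunity(assault))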

A fourth submodel is a physical and social environment model of the world,
involving an environment (modelled here as a graph with 8 nodes, called locations A
through H) in which a number of agents move around and sometimes meet at the same
location. One of the agents represents the criminal type that is considered (violent
psychopath), the other agents represent potential victims (passers-by) and guardians.
At some locations a suitable target can be found, for example an agent that looks rich
and/or weak. However, as also the guardians are moving around, such targets may be
protected, whenever at the same location a guardian is observed by the criminal. This
models the aspect of social control. To model the dynamics of the agents moving
around in the environment, a number of dynamic properties are used that relate
successive states to each other.
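Such an environment submodel can be pictured as a small graph of locations over which the agents are repositioned at every step. The sketch below is only an illustration: the paper states that there are 8 locations A through H and gives the agents' initial positions, but the edges of the graph and the random movement policy used here are assumptions.

import random

# Hypothetical location graph (the paper does not list the edges).
LOCATIONS = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"], "D": ["B", "F"],
    "E": ["C", "G"], "F": ["D", "H"], "G": ["E", "H"], "H": ["F", "G"],
}

# Initial positions as in the example trace: criminal at A, guardians at C and E,
# passers-by at F and G.
positions = {"agent1": "A", "agent2": "C", "agent3": "E", "agent4": "F", "agent5": "G"}

def step(positions):
    # Successive-state relation: every agent moves to a randomly chosen neighbouring location.
    return {agent: random.choice(LOCATIONS[loc]) for agent, loc in positions.items()}

for t in range(1, 6):
    positions = step(positions)
    print(t, positions)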
Note that the first three submodels describe the internal processes of the agent
representing the criminal type considered: the violent psychopath. The other
submodel describes the physical and social environment.

4 An Example Simulation Trace


A number of simulation traces have been generated for the behaviour of the agent
representing the considered violent psychopath criminal type under different
circumstances. In this section, one specific simulation trace is described in detail. For
this trace, the following initial state properties have been chosen for the violent
psychopath agent: the testosterone level during pregnancy is high, the basic adrenalin
level is medium (value 5), the basic level of serotonin is low (value 3), the basic level
of oxytocine is low, and the person's thymus gland is not developed properly.


Moreover, the brain is configured for the following characteristics: a sensitivity for
alcohol, a high anxiety threshold (value 8), a high excitement threshold (value 8), a
low positive emotional attitude towards others and a low negative emotional attitude
towards others. In addition, some inputs for the model over time are provided:
initially there is a rather neutral stimulus present, which is neither very dangerous nor
exciting (both aspects have value 2), alcohol is used during the whole scenario, no
Prozac is taken and no Ritalin is taken. Finally, the initial characteristics of the
environment are: the violent psychopath agent is at location A, there is a guardian at
location C, and one at location E and there is a passer-by at location F, and one at
location G.
In Figure 2, the results of the chosen example simulation trace (run) for the
behaviour of the violent psychopath agent are shown. In these pictures, time is on the
horizontal axis; the state properties are on the vertical axis. A dark box on top of
the line indicates that the property is true during that time period, and a lighter box
below the line indicates that the property is false. Figure 2 depicts the
biological/psychological aspects of the behaviour of the violent psychopath agent
within the example simulation trace, such as change of serotonin levels, and the
generation of beliefs, desires and intentions. Due to space limitations the
environmental aspects are not shown in Figure 2.
As can be seen in Figure 2, the initial settings mentioned above lead to the
following characteristics in the psychopath criminal type agent: the insulin level is
high, the anxiety threshold is high (value 8), the excitement threshold is high (value
10), the person is sensitive for alcohol, and his emotional attitude towards others, both
positive and negative, is low. In addition, he has a low neural self. This leads to a low
me-other differentiation, a low empathy, and eventually, a low theory of mind. His
high initial level of testosterone leads to a high current level of testosterone, which in
turn leads to high aggressiveness. Moreover, the medium level of initial adrenalin
leads to a medium current level of adrenalin (value 5). This current level of adrenalin,
combined with a low level of oxytocine, leads to a low desire to cope with anxiety and a
low desire to ignore anxiety. Furthermore, the low initial serotonin level (value 3)
leads to a low current level of serotonin (also value 3). This current level of serotonin
combined with sensitivity for alcohol and taking alcohol leads (at time point 2) to a
decreased level of serotonin (value 0). The current serotonin level also leads to an
increased excitement threshold (from value 10 to value 13). The high insulin level
leads to a low blood sugar level and a high impulsiveness.
When the excitement threshold is higher than the strength of the observed stimuli,
then the violent psychopath agent will become bored. This leads to the desire for
actions with strong stimuli at time point 4. As a result of this desire and several other
characteristics mentioned above, the agent eventually (time point 5) develops a desire
for an action that is characterised by the following aspects: a low theory of mind, high
aggressiveness, a low desire to cope with anxiety, a low desire to ignore anxiety, a
high desire for actions with strong stimuli, high impulsiveness, a low positive attitude
towards others and a low negative attitude towards others. This desire, combined with
the belief that assaulting someone will lead to the satisfaction of such a desire, leads
to the intention to assault someone at time point 6.


Fig. 2. The example simulation trace: biological and psychological aspects

In the meantime, the criminal agent has started to move around in the world. In
total, in this example trace, which was kept simple for reasons of presentation, there
are 5 agents in the world: agent 1 represents the considered criminal type (i.e., the
violent psychopath agent described in Figure 2), agents 2 and 3 are guardians, and
agents 4 and 5 are passers-by (i.e., potential victims). These agents are moving
through the world. For example, agent 1 starts at location A (time point 0), then
moves to location B (time point 4), and so on. When the criminal agent meets a
passer-by without a guardian present then the criminal agent will believe that there is
an opportunity to assault the passer-by. There is an opportunity to assault a passer-by
at time point 15. This opportunity has arisen because agent 1 is at location G and
agent 5 is also at this location, and there are no guardians present (agents 2 and 3 are
respectively at location E and H). At time point 43 there is another opportunity for
agent 1 to assault someone. This is because agent 1 is at location H together with
agent 5, and agents 2 and 3 are not present.
In Figure 2, the psychopath agent's beliefs about opportunities are also depicted.
When such a belief is present, together with the intention to assault someone, the
actual action to assault the passer-by is performed. This happens twice in the trace: at
time point 16 and 44. Finally, note that, when a violent psychopath agent assaults
someone, this significantly raises the level of stimuli he experiences. The values of
these stimuli are shown in the bottom part of Figure 2. When this value passes his
excitement threshold, he will stop being bored. As a consequence, also his desire for
actions with strong stimuli will be fulfilled, and his desire and intention for an action
that is characterised by this desire (among others) will disappear. However, after a
while, the increased value of the experienced stimuli will gradually decrease and the
psychopath agent will be bored again. This will lead to new desires, new intentions,
and eventually (at time point 44), to a new assault.

5 Discussion
This paper proposes a method to analyse the behaviour of certain criminal types as
described in the literature, based on integrated dynamic modelling. As a case study,
this method has been applied to analyse the behaviour of the violent psychopath
criminal type. It has been found that the model indeed shows the behaviour as
described for this criminal type. The model takes into account a cognitive modelling
approach to the preparation of actions based on beliefs, desires and intentions (BDI)
in a more or less standard manner e.g., [11]. However, for this standard BDI-model,
desires and beliefs about opportunities are required as input. Concerning the former,
additional biological, cognitive, and emotional aspects have been used as a basis to
generate desires. For the latter, additional social aspects have been used to generate
beliefs on opportunities based on two specific criteria (suitable target, absence of a
suitable guardian) as indicated by the Routine Activity Theory in [5]. For the generation of
desires various other aspects as described in the literature are taken into account,
varying from specific types of brain deviations, and serotonin and testosterone levels,
to the extent to which me-other differentiation and a theory of mind were developed.
Thus the model integrates biological, cognitive and socially related aspects in the
process of desire generation, as extracted from literature, in particular [6, 8, 10].
These involve both qualitative aspects (such as the anatomy of brain deviations, and
presence or absence of agents at a specific location in the world), and quantitative
aspects (such as distances and time durations in the world and hormone and
neurotransmitter levels).
To achieve the integration of different aspects, the proposed modelling approach
(based on the LEADSTO language) integrates qualitative, logical aspects and
quantitative, numerical aspects. This integration allows one to exploit techniques from
both areas. As the latter type of aspects are fully integrated in the former, this results
in a declarative specification for which automated methods for logical analysis can be
exploited. Conversely, as the former types of aspects are fully integrated in the latter,
a simulation environment is offered that extends the usual possibilities to simulate
dynamical systems using numerical methods, by incorporating qualitative elements.


Only a few papers on the simulation of criminal behaviour can be found in the literature,
and they usually address a more limited number of aspects than the modelling
approach presented in this paper. For example, Brantingham and Brantingham, [4]
discuss the possible use of agent modelling approaches to criminal behaviour in
general, but do not report a specific model or case study. Moreover, in [1] a model is
presented with emphasis on the social network and the perceived sanctions. However,
this model leaves the psychological and biological aspects largely unaddressed. The
same applies to the work reported in [7], where the emphasis is on the environment,
and police organisation.

References
1. Baal, P.H.M. van (2004). Computer Simulations of Criminal Deterrence: from Public
Policy to Local Interaction to Individual Behaviour. Ph.D. Thesis, Erasmus University
Rotterdam. Boom Juridische Uitgevers.
2. Bartol, C.R. (2002). Criminal Behavior: a Psychosocial Approach. Sixth edition. Prentice
Hall, New Jersey.
3. Bosse, T., Jonker, C.M., Meij, L. van der, and Treur, J. (2005). LEADSTO: a Language
and Environment for Analysis of Dynamics by SimulaTiOn. In: Eymann, T. et al. (eds.),
Proc. of the 3rd German Conference on Multi-Agent System Technologies, MATES'05.
LNAI 3550. Springer Verlag, 2005, pp. 165-178. Extended version to appear in
International Journal of Artificial Intelligence Tools, 2007.
4. Brantingham, P. L., & Brantingham, P. J. (2004). Computer Simulation as a Tool for
Environmental Criminologists. Security Journal, 17(1), 21-30.
5. Cohen, L.E. and Felson, M. (1979). Social change and crime rate trends: a routine activity
approach. American Sociological Review, vol. 44, pp. 588-608.
6. Delfos, M.F. (2004). Children and Behavioural Problems: Anxiety, Aggression,
Depression and ADHD; A Biopsychological Model with Guidelines for Diagnostics and
Treatment. Harcourt book publishers, Amsterdam.
7. Melo, A., Belchior, M., and Furtado, V. (2005). Analyzing Police Patrol Routes by
Simulating the Physical Reorganisation of Agents. In: Sichman, J.S., and Antunes, L.
(eds.), Multi-Agent-Based Simulation VI, Proceedings of the Sixth International Workshop
on Multi-Agent-Based Simulation, MABS'05. Lecture Notes in Artificial Intelligence, vol.
3891, Springer Verlag, 2006, pp 99-114.
8. Moir, A., and Jessel, D. (1995). A Mind to Crime: the controversial link between the mind
and criminal behaviour. London: Michael Joseph Ltd; Penguin.
9. Quay, H.C. (1965). Psychopathic Personality: Pathological Stimulation-Seeking. American
Journal of Psychiatry, vol. 122, pp. 180-183.
10. Raine, A. (1993). The Psychopathology of Crime: Criminal Behaviors as a Clinical
Disorder. New York, NY: Guilford Publications.
11. Rao, A.S. & Georgeff, M.P. (1991). Modelling Rational Agents within a BDI-architecture.
In: Allen, J., Fikes, R. and Sandewall, E. (eds.), Proceedings of the Second International
Conference on Principles of Knowledge Representation and Reasoning, (KR91). Morgan
Kaufmann, pp. 473-484.
12. Turvey, B. (1999). Criminal Profiling: an Introduction to Behavioral Evidence Analysis.
Academic Press.

A Rich Servants Service Model for Pervasive Computing
Huai-dong Shi, Ming Cai, Jin-xiang Dong, and Peng Liu
College of Computer Science and Technology, ZheJiang University,
Hangzhou 310027, China
{shd,perryliu}@zju.edu.cn {cm,djx}@cs.zju.edu.cn

Abstract. Pervasive computing has received intensive interest in the past
years and will be a fertile source of challenging research problems in computer
systems for many years to come, but there is still no good model to distinguish
it from other technologies, such as distributed system, mobile computing, etc.
This paper presents a Rich Servants Service (RSS) Model to help researchers
understand pervasive computing, demonstrate the features and evaluate the implementations of pervasive computing systems. The model also exposes several
research problems of pervasive computing for further research. A prototype
based on the RSS Model is designed and demonstrated to prove the interpretation and evaluation capability of the model.
Keywords: pervasive computing, RSS Model, distributed system, mobile computing, operating system, context aware.

1 Introduction
"The most profound technologies are those that disappear. They weave themselves
into the fabric of everyday life until they are indistinguishable from it." Mark Weiser
described his vision of ubiquitous computing in his seminal 1991 article [1].
Pervasive computing involves many related technologies, including mobile
computing, embedded computing and distributed systems. Confusingly, each of
these technologies has some features which are also features of pervasive computing. Many research papers depict pervasive computing as calm computing technology
[2], user-distraction-free computing [3], etc. This exposes the problem that the essence
of pervasive computing is ambiguous, which is an obstacle for further research work.
This problem has attracted the attention of some far-sighted researchers, and in paper [4] a conceptual model for pervasive computing is presented to facilitate discussion and analysis. However, it doesn't show the essence of pervasive computing
and it is more of a logical layered architecture of pervasive computing systems.
In this paper, we present a Rich Servants Service (RSS) Model for pervasive computing to reveal the essence of pervasive computing, distinguish it from analogical
research fields and evaluate the implementations of pervasive computing systems.
The rest of this paper is organized as follows. In Section 2, current research and
the key characteristics of pervasive computing are studied. In Section 3 the RSS
Model for pervasive computing is demonstrated and the related technologies are integrated into it. The interpretation and evaluation capability of the model and the problems exposed by it are described. In section 4 a prototype system spawned from the
model and a demo system are presented. The paper is concluded in section 5 with the
key contributions of this model emphasized.

2 Current Research on Pervasive Computing


The key characteristics of pervasive computing include Scalability, Heterogeneity,
Integration, Invisibility and Perception, Context awareness, Smartness and Context
management [5]. It also has the advantages of Effective Use of Smart Spaces,
Invisibility, Localized Scalability, and Masking Uneven Conditioning [6]. The following is
a summarization of the key characteristics:
Pervasive. The user has access to the same information anytime and anywhere by various methods.
Nomadic. The user and the computing can move anywhere when required.
Invisibility [5]. The user's attention will not be distracted when he is working in the
changing environment. The system deals with the changes silently and smartly.
Perception and smartness. This feature is mainly related to context-aware
computing. The pervasive computing system (including mobile devices) perceives
context, makes timely and context-sensitive decisions and takes proactive actions to
prepare for the user's future requirements, which hides the computer system from the user.
These are the typical features of pervasive computing. Other characteristics, such
as Heterogeneity, Scalability etc., are not features that distinguish pervasive computing from other related computing technologies, but problems to be solved.
A number of projects have been launched for pervasive computing research. MIT's
Oxygen [7], CMU's Aura [3] and Don Norman's Invisible Computer [8] express the
idea of hiding computing around the user. Other projects focus on different characteristics
of pervasive computing, such as Endeavour [9] on fluid software; Portolano [10] on
smart data; Sentient Computing [11] presents some context-aware applications and
smart posters; Cooltown [12] describes a virtual bridge between mobile users and
physical entities and electronic services.
In the pervasive computing area, the whole world is a pervasive computing society integrated through the Internet. People live in and move across smart spaces. Satyanarayanan
[6] discussed some typical problems of pervasive computing, which can be shown in Fig. 1:
what policy is needed to inform the user and provide data in advance, and how data and
processes move across the infrastructure and mobile devices. At the door (5, 7), a privacy
policy and authentication are needed. The required data that is left behind at leaving
time can be sent over the Internet at 15, which leads to research on the management of
users' private data on the Internet. Is it feasible to cache data at some public sites? How do
these sites expose their services and what is the caching policy? How do mobile devices
express their data requests to the infrastructure? When the user travels across smart
spaces, his smart phone self-tunes to adapt to the local environment and finds the potential local services for the user (7). This leads to adaptation strategy research. And the smart
phone must have its Energy Management policy optimized (6). The research problems
found in Fig. 1 are italicized and still many others are implied.


Fig. 1. Physical architecture of pervasive computing

3 RSS Model
The concept of pervasive computing sounds graceful, and features such as invisibility and pervasiveness have been explicitly stated. But many features can also be found
in other technologies such as mobile computing and distributed systems. As the following discussion shows, these
technologies are good ways to solve the problems rather than the leading actor.
Pervasive computing includes these fields but it is not the
simple addition of them. Artificial Intelligence classifies itself in one sentence: to
build a system that has intelligence like a human; the perfect model for AI is human
intelligence. There is an absence of such a counterpart for pervasive computing which
can help us comprehend the essence of pervasive computing. Here we present such an
ideal model, the Rich Servants Service Model, for pervasive computing.
The model presents the ideal pervasive computing world described by Mark Weiser
[1] at a conceptual level. Correspondingly, Fig. 1 presents the physical model of pervasive computing. People living in such an environment are like ancient rich men. A lot of
digital servants (hollow man-flags) reside everywhere around the user (black-solid
man-flag), listen, look, learn and analyze the master's context and take action
proactively.
We define the Rich Servants Service Model as follows. The user in the Rich Servants Service
Model is called the master. A digital entity that provides services for a master is called a
servant. Each master is surrounded by a dynamic set of servants. Servants must be
smart, careful, proactive, inexhaustible, cooperative, faithful, invisible most of the time,
and deployed all around the environment. The services cover everything from the master's daily life to work.

Fig. 2. Rich Servants Service Model

In one word, the model is a satisfied rich man serviced by a world of smart digital servants. This is what makes pervasive computing more than mobile computing or distributed systems. The essence of pervasive computing can be presented
clearly by this model. It has the following features:

– The model expresses all the features under consideration.
– The resulting model is able to accommodate the other necessary technologies.
– One can distinguish pervasive computing from analogical fields by the model.
– One can use this model to explore new research fields, correct ongoing research and evaluate an implementation.
– The model is addressed at a conceptual level, free from any concrete designs or implementations, and can be taken as a solution.
In Fig. 2 there are eight kinds of elements. User stands for the person who lives in
the pervasive computing system. Public space/private space represent the smart
spaces shared by a dynamic/static set of people respectively. Servant is the digital
element which executes the housekeeping work, such as tracking the user's location,
tracing the room temperature and controlling the poster. Servants compose the user interfaces and stay all around smart spaces and users' mobile devices. Manager is part of
the infrastructure; it maintains the servant set and life cycle, does most high-level
analysis, makes decisions, supervises the whole smart space data, and organizes and delivers the context data. It also maintains the set of users and makes decisions on
whether to receive or refuse a foreign user. Process helps the manager to complete its
task and sometimes acts as manager when it communicates with the lowest servant.
Waiter is a special kind of servant who welcomes a coming foreign user and serves
the leaving user. Router carries the data throughout the spaces when the user travels
across smart spaces. The manager traveling with the user may call the router to fetch
or carry data to the user's private space. Router may be issued by the mobile device
manager or some smart space manager at the user's demand. All these elements
communicate with each other through some kind of servant-language. We have
designed a context exchange language, Context-C, for it, which will be demonstrated in
another paper.
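To make this division of responsibilities concrete, the element kinds described above could be outlined as classes. This is only an illustrative sketch at the conceptual level; the class and method names are invented for this outline, and the servant-language Context-C is not modelled here.

# Illustrative outline of the RSS Model element kinds and their core responsibilities.
class Servant:
    # Digital element doing the housekeeping: tracking the user's location,
    # tracing the room temperature, controlling the poster, ...
    def perceive(self, context): ...
    def act_proactively(self, context): ...

class Waiter(Servant):
    # Special servant that welcomes a coming foreign user and serves the leaving user.
    def welcome(self, user): ...
    def see_off(self, user): ...

class Router:
    # Carries data across smart spaces when the user travels, e.g. to the private space.
    def carry(self, data, target_space): ...

class Manager:
    # Part of the infrastructure: maintains the servant set and life cycle, analyses
    # context, supervises smart-space data and admits or refuses foreign users.
    def __init__(self):
        self.servants, self.users = [], []
    def admit(self, user) -> bool: ...
    def dispatch_context(self, context): ...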
RSS Model has strong interpretation power to show the essence of pervasive computing clearly and thoroughly. RSS Model is for pervasive computing what human
intelligence is for AI research. The emphasis in the model is the servant, which is something
human-like. RSS Model uses the real smart human-servant system as an analogy for
pervasive computing, which clearly illustrates the characteristics of pervasive
computing, distinguishes it from distributed systems, mobile computing etc. and
also clarifies their relationships.
Pervasive. As mentioned before, users can get into the information world anywhere and anytime by various methods. RSS Model contains many smart spaces, and
servants are deployed everywhere the user might visit. The valets (servants in the mobile device carried by the master) are aware of the local environment (they communicate with the waiter) and provide the most proper services. To avoid
getting into the design level of RSS Model, here we will not discuss in detail how valets accomplish this task. A lot of communication protocols have already been
invented to do things like this. For example, [13] has discussed how foreign devices
can express their data requirements to a local environment through a profile. In a word,
computing is pervasive in RSS Model.
Nomadic. The user and the computing can move anywhere when required. The user has a
set of valets moving with the master. They cooperate to log out from the old environment, enter a new one and provide continuous services. Waiters at the door help with the
user's movement. For example, at the implementation level the environment and the mobile device communicate by
service discovery mechanisms such as JINI or UPnP.
Invisibility. The user has no sense of the servant system around him because servants
are hidden in physical spaces such as the coffee cup, the TV set, the camera and even the wall.
This is one aspect of invisibility. What's more, all elements are intelligent. Servants
are able to perceive the master's emotional status and the intention expressed by his
actions. All the smart targets are reached by the human-like servants.
Perception and smartness. RSS Model is inherently smart and sensitive. The digital servants are designed to have some human-like intelligence.
When using this model to evaluate implementations of pervasive computing
systems, the more similar a system is to RSS Model, the better a pervasive computing
system it is. A good servant must perceive the master's status and try to provide services
proactively [3].
Due to the strong interpretation power of the model, it distinguishes pervasive
computing from other research fields. Paper [6] discusses the relationship of pervasive computing to distributed systems and mobile computing. It compares distributed
systems with pervasive computing with respect to remote communication, fault tolerance, high
availability, remote information access, and security, which is indirect. However,
when investigating these problems through RSS Model, the difference can be told
more easily. Pervasive computing has its own particular view of these problems.
For example, remote communication in a distributed system is there to aid two
computer systems working together at the system level. In pervasive computing the
communication takes place between the servants to serve their master better. The
former is focused on the interaction of computer systems, and the latter must take place at a
higher level: it focuses on the master, since the computing is around the user. The
latter may be implemented based on the former technology but it is not pervasive
computing itself.
Likewise we could use RSS Model to distinguish pervasive computing from mobile computing. Paper [6] also compares the two research fields. Mobile computing
is focused on mobile networking, mobile information access, support for adaptive
applications, system-level energy saving techniques, and location sensitivity. In pervasive computing we must have mobile computing support for the user wandering
around. Furthermore, for mobile information access, RSS Model can expose new
problems that need to be solved beyond mobile computing. RSS Model shows
that the digital servant must utilize its intelligence to automatically accomplish the
master's data requirements by discovering and analyzing in order to satisfy the master. There
must be research on how to exchange mobile information based on the user's
context.
In a word, RSS Model has a great advantage over other models in grasping the essence of pervasive computing. It can tell the difference between pervasive computing
and other research fields.
There are several research problems revealed by RSS Model.
Pervasive computing involves all aspects of computer systems from operating systems to applications, as many research papers have described. The tri-layer model of
Mühlhäuser [14] covers from gadgets and integration to the UC world. RSS Model implies
these layers. Thinking deeply about RSS Model, we can find many research problems
across these layers spontaneously. Below we'll discuss some of those problems.
Impact on Operating System. Servants in RSS Model are smart, since we define them to
be context-aware, self-tuning, etc., which has a deep impact on the underlying operating system.
Traditional operating system theory has many conclusions that are inferred from the
assumption that an application process is not able to predict its resource requirements and
actions, which was true in the old days. For example, the Banker's algorithm for deadlock
avoidance needs the resource requirements in advance to calculate the deadlock risk.
Virtual memory maintenance predicts the coming page-use sequence with the LRU
algorithm, etc. A servant process in RSS Model, however, takes action according to
the context-based prediction of the user and does act in a predictable manner; the behaviour has
changed. The operating system must change accordingly.
The integration of AI and Pervasive Computing. Digital servants behave like human beings. AI technology will play an important role in pervasive computing.
Intention-oriented Schedule. When the master moves his focus, smart servants will
evaluate the master's intention and change state to meet the master's demands. The system
schedules the action according to the master's intention, even though the whole smart
space behaves in the same way. The problems of how to quantify the user's intention and
what scheduling policy is proper remain to be studied.
There are a lot of other problems that can be found with RSS Model. For example, a
software manufacturer is likely to provide servants that are trained for a particular
customer, and the language used by servants from different countries needs to be defined.


4 Prototype Based on RSS Model


Although RSS Model is a target model to clarify the essence of pervasive computing,
it is useful in implementation. In this prototype the primary elements are the servant (which
belongs to the infrastructure) and the valet (which belongs to the mobile device). To manage the
servants we place a Servant/Valet manager, as well as a user manager to organize users in a
smart space.

Fig. 3. A pervasive computing prototype designed based on RSS Model with the concepts of
RSS Model mapped to a design model directly and perfectly

We choose a mobile agent system to allow the valet to move into the infrastructure and
vice versa. To demonstrate the power of the model, a demo system with a simulated
environment was developed for life and work at the university.
The scenario is that Prof. Li will give a C language class to the students as scheduled from 2:00pm to 4:00pm, Dec 20th, 2006 in room #306 of the 10th teaching
building.
At 1:45pm, Prof. Li finishes editing the presentation for the class and leaves his office for the 10th building. The office is a private space for Prof. Li. The edited presentation will be routed to the target space automatically by his valet. Probably Prof. Li
will cross several public/private spaces before he reaches the building.
When he enters the building's public space, the servant which detects Prof. Li's location will send the information to the manager, and his valet moves to the building's
infrastructure after a proper conversation. The valet works with the local manager to
find the scheduled class information for Prof. Li. Accordingly it asks another type of
valet to move into the infrastructure of room #306, download the presentation and
warm up the projector. These valets are trained well and perform the work proactively. Private information of Prof. Li is concealed from other software entities.
When Prof. Li enters the room, everything is ready for him.
If Prof. Li feels sick after entering the building, he goes to the hospital directly.
The valet that resides in the room will get this information from Prof. Li's manager. It
will print a line on the projector so that all the students know it. The system handles all
the situations smartly and silently.


5 Conclusions
In this paper we present a target model for a better understanding of pervasive computing. The key proposition is that pervasive computing essentially aims at building a
virtual servants' service system for users. The widely deployed digital smart servants
wait in the coffee cup, the TV set and even the wall, carefully perceive the master's intention
and emotion, and provide services in advance based on the information they get or
infer. The master is usually unaware of the existence of the digital servants.
We can use RSS Model to distinguish pervasive computing from other relevant research fields such as distributed systems and mobile computing. It accommodates
these fields into pervasive computing. RSS Model can evaluate an implementation of
pervasive computing to see how much smartness it achieves. It can correct the direction of current
research, find new research problems and provide a primary solution.
The future work includes finding more problems about pervasive computing with
this model and bringing more intelligence into the prototype.

References
1. Weiser, M.: The Computer for the 21st Century. Scientific American, September 1991.
2. Rekimoto, J., et al.: Adding another communication channel to reality: Experience with a
chat-augmented conference. In: Proceedings of the ACM SIGCHI Conference on Human
Factors in Computing Systems, Los Angeles, USA, 1998, pp. 384-391.
3. Garlan, D., Siewiorek, D.P., Smailagic, A., Steenkiste, P.: Project Aura: Toward distraction-free pervasive computing. IEEE Pervasive Computing, 2002, 1(46), pp. 22-31.
4. Ciarletta, L., Dima, A.: A Conceptual Model for Pervasive Computing. IEEE, 2000.
5. Saha, D., Mukherjee, A.: Pervasive Computing: A Paradigm for the 21st Century. IEEE, 2003.
6. Satyanarayanan, M.: Pervasive Computing: Vision and Challenges. IEEE Personal Communications, 2001, 8(4), pp. 10-17.
7. Dertouzos, M.: The future of computing. Scientific American, 1999, 282(3), pp. 52-63.
8. Norman, D.: The Invisible Computer. Cambridge, Mass.: MIT Press, 1999.
9. Brumitt, B., Meyers, B., Krumm, J., Kern, A., Shafer, S.: EasyLiving: Technologies for intelligent environments. In: Handheld and Ubiquitous Computing, Bristol, UK, 2000, pp. 30-36.
10. Esler, M., Hightower, J., Anderson, T., Borriello, G.: Next Century Challenges: Data-Centric Networking for Invisible Computing: The Portolano Project at the University of
Washington. Mobicom '99.
11. Hopper, A.: Sentient Computing. The Royal Society Clifford Paterson Lecture, AT&T
Laboratories Cambridge, Technical Report 1999.12, 1999.
12. Kindberg, T., Barton, J.: A Web-based nomadic computing system. Computer Networks,
35(4), pp. 443-456, March 2001.
13. Cherniack, M., Franklin, M.J., Zdonik, S.: Expressing user profiles for data recharging.
IEEE Personal Communications, Special Issue on Pervasive Computing, 2001, 8(4), pp. 32-38.
14. Mühlhäuser, M.: Ubiquitous Computing and Its Influence on MSE. IEEE, 2000.

Techniques for Maintaining Population Diversity in
Classical and Agent-Based Multi-objective
Evolutionary Algorithms
Rafał Dreżewski and Leszek Siwik
Department of Computer Science
AGH University of Science and Technology, Kraków, Poland
{drezew,siwik}@agh.edu.pl

Abstract. The loss of population diversity is one of the main problems in some
applications of evolutionary algorithms. In order to maintain useful population
diversity some special techniques must be used, like niching or co-evolutionary
mechanisms. In this paper the mechanisms for maintaining population diversity
in agent-based multi-objective (co-)evolutionary algorithms are proposed. The
presentation of techniques is accompanied by the results of experiments and comparisons to classical evolutionary multi-objective algorithms.
Keywords: Agent-based evolutionary computation, maintaining population diversity, sexual selection, flock-based operators.

1 Introduction
Evolutionary Algorithms (EAs) are techniques inspired by the Darwinian model of evolutionary processes observed in nature. They have demonstrated in practice efficiency
and robustness as global optimization techniques. However, sometimes the loss of useful population diversity limits the possibilities of their application in some areas (like,
for example, multi-modal optimization, multi-criteria optimization, dynamic problems,
etc.)
In the case of multi-objective optimization problems the loss of population diversity may
cause the population to locate in areas far from the Pareto frontier, or individuals to be located only in selected areas of the Pareto frontier. In the case of multi-objective
problems with many local Pareto frontiers (defined by Deb in [5]) the loss of population diversity may result in locating only a local Pareto frontier instead of the global one.
In order to avoid such negative tendencies special mechanisms are used, like niching
techniques, co-evolutionary algorithms and sexual selection.
Evolutionary multi-agent systems (EMAS) are multi-agent systems in which there
are three basic mechanisms needed in order to start and maintain evolutionary processes:
limited resources that agents need for all activities and for which they compete, and
agents' abilities to reproduce and die. The basic EMAS model also suffers from the negative
tendency to lose population diversity; however, as we will show in the following
sections, it can be equipped with additional mechanisms and operators which improve
the quality of the obtained results. The general model of co-evolutionary multi-agent system
(CoEMAS) [6] additionally introduces the notions of species, sexes, and interactions
between them. CoEMAS allows modeling and simulation of different co-evolutionary
interactions, which can serve as the basis for constructing techniques of maintaining
population diversity and improving the adaptive capabilities of such systems (for example
see [7]).
In the following sections mechanisms for maintaining useful population diversity
in classical evolutionary multi-objective algorithms are presented. Next, new techniques for (co-)evolutionary multi-agent systems for multi-objective optimization are
proposed. The presentation of the proposed techniques is accompanied by examples
of selected experimental results and comparisons to classical multi-objective evolutionary algorithms.

2 Previous Research on Maintaining Population Diversity in
Evolutionary Multi-objective Algorithms
In order to maintain useful population diversity and introduce speciation (the process of
forming species, i.e. subpopulations located in different areas of the solution space), special techniques are used, like niching mechanisms and co-evolutionary models. Niching techniques are primarily applied to multi-modal optimization problems, but they
are also used in evolutionary multi-objective algorithms. During the years of research
various niching techniques have been proposed [16], which allow niche formation via
the modification of the mechanism of selecting individuals for the new generation (crowding
model), the modification of the parent selection mechanism (fitness sharing technique
or sexual selection mechanism), or restricted application of selection and/or recombination mechanisms (by grouping individuals into subpopulations or by introducing an
environment with some topography in which the individuals are located).
The fitness sharing technique in the objective space was used in Hajela and Lin's genetic
algorithm for multi-objective optimization based on the weighting method [10], by Fonseca and Fleming in their multi-objective genetic algorithm using a Pareto-based ranking
procedure [8], and in the niched Pareto genetic algorithm (NPGA) (during tournament selection, in order to decide which individual wins the tournament) [11]. In the non-dominated sorting genetic algorithm (NSGA) fitness sharing is performed in the decision space, within each set of non-dominated individuals separately, in order to maintain
high population diversity [17]. In the strength Pareto evolutionary algorithm (SPEA) [19]
a special type of fitness sharing (based on the Pareto domination relation) is used in order to
maintain diversity.
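For reference, the classical fitness sharing scheme mentioned above divides an individual's raw fitness by its niche count, computed with a sharing function that decays to zero at the niche radius. The sketch below shows this generic formulation; it is not taken from any of the cited algorithms in particular (each of them applies sharing in its own space and selection step).

def sharing(distance, sigma_share, alpha=1.0):
    # Sharing function: 1 - (d / sigma_share)^alpha inside the niche radius, 0 outside.
    return 1.0 - (distance / sigma_share) ** alpha if distance < sigma_share else 0.0

def shared_fitness(population, raw_fitness, distance, sigma_share):
    # Divide each individual's raw fitness by its niche count (sum of sharing values).
    result = []
    for a in population:
        niche_count = sum(sharing(distance(a, b), sigma_share) for b in population)
        result.append(raw_fitness(a) / niche_count)   # niche_count >= 1, since distance(a, a) = 0
    return result

# Example with one-dimensional individuals: crowded individuals get their fitness reduced.
pop = [0.0, 0.1, 0.15, 2.0]
print(shared_fitness(pop, lambda x: 1.0 / (1.0 + x * x), lambda a, b: abs(a - b), sigma_share=0.5))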
In co-evolutionary algorithms the fitness of each individual depends not only on the
quality of the solution to the given problem but also (or solely) on other individuals' fitness.
This makes such techniques applicable in cases where the fitness function formulation is difficult (or even impossible). Co-evolutionary algorithms are also applicable
in cases when we want to maintain population diversity. Generally, each co-evolutionary technique belongs to one of two classes: competitive or co-operative.
Laumanns, Rudolph and Schwefel proposed a co-evolutionary algorithm with a spatial
graph-like structure and predator-prey model for multi-objective optimization [13]. Deb
introduced a modified algorithm in which predators eliminated prey not only on the basis
of one criterion but on the basis of the weighted sum of all criteria [5]. Li proposed other
modifications to this algorithm [14]. The main difference was that not only predators
were allowed to migrate within the graph, but prey could do it as well. The model of
co-operative co-evolution was also applied to multi-objective optimization ([12]).
Sexual selection results from the co-evolution of female mate choice and male displayed
traits, where females evolve to reduce the direct costs associated with mating and keep them
at an optimal level, and males evolve to attract females to mating (sexual conflict) [9].
Sexual selection is considered to be one of the ecological mechanisms responsible for
biodiversity and sympatric speciation [9].
All the works on sexual selection mechanisms for multi-objective evolutionary algorithms were focused on using this mechanism for maintaining population diversity.
Allenson proposed a genetic algorithm with sexual selection for multi-objective optimization in which the number of sexes was the same as the number of criteria of the given
problem [1]. Lis and Eiben proposed a multi-sexual genetic algorithm (MSGA) for multi-objective optimization [15] in which one sex for each criterion was also used. They
used a special multi-parent crossover operator and the child had the same sex as the parent
that provided most of the genes. Bonissone and Subbu continued work on Lis and Eiben's
algorithm. They proposed additional mechanisms for determining the sex of the offspring
(random and phenotype-based) [3].
Co-evolution of species and sexes is a biological mechanism responsible for biodiversity and sympatric speciation. However, the application of such mechanisms in
evolutionary multi-objective algorithms is still the subject of ongoing research and an
open issue.

3 Introducing Flock-Based Operators into Evolutionary
Multi-agent System
Assuming the classical structure of an evolutionary multi-agent system, one of the ways of
maintaining population diversity and, in the context of multi-objective optimization, of
improving the quality of the Pareto frontier approximation consists in introducing to the
system so-called flock operators, i.e. (in the simplest case) creating a new flock (dividing a
flock into two or n flocks) and merging two (n) flocks into one flock. Taking into account multi-objective optimization goals, such operators can be realized as follows.
During meetings with agents located in the same flock an agent gathers (partial) knowledge about its distance (in the decision variable space or in the objective space) from
other agents. Then, if such a (partial) average distance to other agents is greater than a
configured parameter (to be precise, in the realized system this parameter changes
adaptively), the agent can make a decision about creating a new flock (i.e. dividing the particular flock into two flocks). After making such a decision the agent creates a new flock,
migrates to this new flock from the old one and then initializes its new flock. The initialization process consists in cloning itself and mutating (with a small range, e.g. by
mutating the least significant genes) the cloned descendants. The small range of mutation
ensures (or increases the probability of) sampling the agent's neighborhood, which
is very desirable since the agent creating a new flock stands out in its original flock as a
"strange" agent, i.e. an agent representing a poorly sampled (at least by this very flock)
area of the search space. The decision about the number of new cloned descendants created by an
agent is an autonomous agent's decision of course, but it should ensure that the flock does
not become extinct too early and, on the other hand, that there do not exist in the system
too many similar agents1. In the simplest case the eliminating/merging flocks operator can
be realized as follows: two flocks are merged if their centers of gravity are located
closer than a configured value and the difference between their radii is smaller than a
given parameter2 (both of these parameters can change adaptively).
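A minimal sketch of these two flock operators is given below, under the simplifying assumptions that the distance and radius functions are supplied by the caller, that the adaptive thresholds are passed in as fixed values, and that the centre of gravity is the arithmetic mean of the members' objective values (as in the results reported below); the number of clones is illustrative.

import statistics

def centre(flock, objectives):
    # Centre of gravity: arithmetic mean of the objective values of all flock members.
    vectors = [objectives(a) for a in flock]
    return [statistics.mean(v[i] for v in vectors) for i in range(len(vectors[0]))]

def avg_distance(agent, flock, dist):
    # (Partial) average distance of an agent to the other members of its flock.
    others = [b for b in flock if b is not agent]
    return statistics.mean(dist(agent, b) for b in others) if others else 0.0

def maybe_split(agent, flock, dist, split_threshold, clone_and_mutate, n_clones=12):
    # Flock-creation operator: an agent lying far from its flock mates founds a new
    # flock, initialised with slightly mutated clones of itself.
    if avg_distance(agent, flock, dist) > split_threshold:
        flock.remove(agent)
        return [agent] + [clone_and_mutate(agent) for _ in range(n_clones)]
    return None

def maybe_merge(flock_a, flock_b, objectives, radius, d_centres, d_radii):
    # Flock-merging operator: merge two flocks whose centres of gravity are close and
    # whose radii differ by less than a given amount.
    ca, cb = centre(flock_a, objectives), centre(flock_b, objectives)
    close = sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5 < d_centres
    similar = abs(radius(flock_a) - radius(flock_b)) < d_radii
    return flock_a + flock_b if close and similar else None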

Fig. 1. Selected characteristics: objective dispersion (a,d), decision dispersion (b,e), number
of non-dominated individuals (c,f) obtained during solving ZDT-1 (a,b,c) and ZDT-2 (d,e,f)
problems

To present the influence of the proposed mechanism(s), during the experiments presented
in Fig. 1 and Fig. 2 the operator of creating new flocks was performed in the 100th and 400th
step. The measures presented in these figures should be interpreted as follows: objective dispersion represents the average distance
among individuals measured in the objective space; decision dispersion represents the average distance
among individuals measured in the space of decision variables. As one may notice, introducing flock
operators has a very positive influence on maintaining population diversity (see Fig. 1a, b,
d, e and Fig. 2a, b, d, e) and, in consequence, on the quality of the obtained Pareto frontier
approximation (see Fig. 1c, f and Fig. 2c, f). Because of space limitations, Fig. 1 and Fig. 2 present only the characteristics related to the number of non-dominated
solutions found by the flock-based and the classical EMAS, confirming (to some extent)
that, at least when solving the Zitzler problems ([19]), the flock-based approach allows
obtaining a much more numerous Pareto set in comparison to the classical EMAS
approach.
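The two dispersion measures can be read as plain average pairwise distances; the helper below states this reading explicitly (an assumption of this text, since the paper does not give an explicit formula).

import itertools, math

def dispersion(points):
    # Average pairwise Euclidean distance; applied to objective vectors this gives the
    # objective dispersion, applied to decision vectors the decision dispersion.
    pairs = list(itertools.combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

print(dispersion([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]))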
1 In tests presented below this value varies from twelve to eighteen.
2 In presented results below the center of gravity is measured as the arithmetic mean of objective
function values of all flock members.




Fig. 2. Selected characteristics: objective dispersion (a, d), decision dispersion (b, e), number of
non-dominated individuals (c, f) obtained during solving ZDT-3 (a, b, c) and ZDT-4 (d, e, f)
problems
[Figure: agents of sex A and sex B within the environment; the depicted elements include selection of a partner for reproduction, reproduction, a copy of the child being added to the set of non-dominated solutions, migration, and death]

Fig. 3. CoEMAS with co-evolving sexes

4 Sexual Selection as a Technique for Maintaining Population
Diversity in CoEMAS for Multi-objective Optimization
In order to maintain population diversity in systems based on the model of co-evolution in
multi-agent systems (CoEMAS), mechanisms based on co-evolutionary interactions of
species and sexes may be used. Such mechanisms include, for example, host-parasite,
predator-prey, or co-operative co-evolution of species. Another way to maintain useful
diversity is to apply a sexual selection mechanism; the resulting system is the CoEMAS
with sexual selection (SCoEMAS, see Fig. 3).
The mechanisms used in this system include co-evolution of sexes and sexual selection based on Pareto domination. All agents live within the environment, which has a
graph-like structure. The number of sexes corresponds to the number of criteria (each
sex has its criteria assigned to it and agents that belong to that sex are evaluated with
the assigned criteria). There is one resource defined in the system. The resource can
be possessed by the agents and the environment (there is a closed circulation of resource
in the system). This resource is distributed (proportionally to the fitness values of the
agents) by each node of the graph among the agents that are located in that node.
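A straightforward reading of this distribution step is a fitness-proportional split of the node's resource among the agents located in it; the sketch below is such a reading, with invented names.

def distribute_resource(node_resource, agents, fitness):
    # Give each agent a share of the node's resource proportional to its fitness.
    total = sum(fitness(a) for a in agents)
    if not agents or total == 0:
        return {a: 0.0 for a in agents}
    return {a: node_resource * fitness(a) / total for a in agents}

shares = distribute_resource(100.0, ["a1", "a2", "a3"],
                             fitness={"a1": 2.0, "a2": 1.0, "a3": 1.0}.get)
print(shares)   # a1 receives half of the node's resource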

[Fig. 4 consists of four tables: (a) and (b) list selected configuration parameters (population size, chromosome length, external set size, crossover probability and mutation probability) for the two experiment series; (c) and (d) report the values of the M1, M3 and M2 metrics (the latter for neighbourhood parameters 0.05, 0.2 and 0.6) obtained by SPEA, NSGA and CoEMAS on the Obayashi and Tamaki problems.]

Fig. 4. Comparison of the proposed CoEMAS with sexual selection, SPEA and NSGA algorithms
according to the M1 , M2 and M3 metrics (table a includes selected configuration parameters for
results presented in table c, and table b includes parameters for results presented in table d)

Each time step the agents can migrate within the environment (they lose some resource during the migration). An agent can migrate only to a node connected with
the one within which it is located. The agent chooses the node to which it will migrate on the basis of the amount of resource in that node. When the agent is ready for
reproduction (i.e. the amount of its resource is above the given level) it sends this information to the agents of other sexes located within the same node. The other agents
can respond to this information when they are also ready for reproduction. Next, the
agent which initiated the reproduction process chooses one (or more, depending on
the number of sexes in the system) of the agents of the opposite sex on the basis of the
amounts of their resources (the probability of choosing an agent is proportional to the
amount of its resource). The offspring is created with the use of intermediate recombination and Gaussian mutation [2]. Next, the child is compared to the individuals from
the non-dominated individuals set of the node in which the parents and the child are located. If
none of the individuals from this set dominates the child, then the child is copied to
the set (and all individuals dominated by the child are removed from the set).
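The update of the node's non-dominated set can be sketched as a standard Pareto-archive update; minimisation of all criteria and the representation of individuals as objective vectors are assumptions made for the sketch.

def dominates(a, b):
    # Pareto domination (minimisation): a is no worse in every objective and better in at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, child):
    # Copy the child into the set only if no archived individual dominates it,
    # and remove all archived individuals dominated by the child.
    if any(dominates(a, child) for a in archive):
        return archive
    return [a for a in archive if not dominates(child, a)] + [child]

archive = [(1.0, 4.0), (3.0, 2.0)]
print(update_archive(archive, (2.0, 2.5)))   # mutually non-dominated: child is added
print(update_archive(archive, (0.5, 1.0)))   # child dominates both archived points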
The first experiments, whose results are presented in this section, were aimed at investigating whether SCoEMAS can be applied to multi-objective optimization problems and
whether it works properly (agents do not die off). The proposed co-evolutionary multi-agent
system with sexual selection mechanism for multi-objective optimization has been tested
using, inter alia, the Tamaki and Obayashi test functions [18]. Additionally, results obtained
with the use of SCoEMAS were compared to those obtained by classical evolutionary
algorithms for multi-objective optimization: the niched Pareto genetic algorithm (NPGA)
[5] and the strength Pareto evolutionary algorithm (SPEA) [19].
To compare the proposed approach with the implemented classical algorithms, three metrics M1, M2, and M3 ([19]) were used. These metrics are defined as follows. If A \subseteq X denotes a non-dominated set, \sigma > 0 denotes an appropriately chosen neighborhood parameter and \|\cdot\| denotes the given distance metric, then three functions M1(A), M2(A) and M3(A) can be introduced to assess the quality of A regarding the decision space:

    M_1(A) = \frac{1}{|A|} \sum_{a \in A} \min \{ \|a - x\| \mid x \in X_p \}   (the average distance to the Pareto-optimal set X_p),

    M_2(A) = \frac{1}{|A| - 1} \sum_{a \in A} \left| \{ b \in A \mid \|a - b\| > \sigma \} \right|   (the distribution in combination with the number of non-dominated solutions found), and

    M_3(A) = \sqrt{ \sum_{i=1}^{N} \max \{ \|a_i - b_i\| \mid a, b \in A \} }   (the spread of non-dominated solutions over the set A, where N is the number of objectives).
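A direct transcription of these three metrics may help to read the definitions (a small Python sketch written for this text; the array shapes, the Euclidean norm and the `pareto_sample` of X_p are assumptions of this example):

import numpy as np

def m1(A, pareto_sample):
    """Average distance of the non-dominated set A to (a sample of) the Pareto set X_p."""
    A, P = np.asarray(A, float), np.asarray(pareto_sample, float)
    return float(np.mean([np.min(np.linalg.norm(P - a, axis=1)) for a in A]))

def m2(A, sigma):
    """Distribution: for each solution, count the neighbours farther away than sigma."""
    A = np.asarray(A, float)
    return float(sum((np.linalg.norm(A - a, axis=1) > sigma).sum() for a in A) / (len(A) - 1))

def m3(A):
    """Spread: square root of the summed per-coordinate extents of A."""
    A = np.asarray(A, float)
    return float(np.sqrt(np.sum(A.max(axis=0) - A.min(axis=0))))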
The presented results (fig. 4) show that SPEA is the best of all compared algorithms. It turned out that the proposed SCoEMAS with sexual selection mechanism can be used for multi-objective problems, however more research is needed to obtain better results. The fact that the results were worse than in the case of classical evolutionary multi-objective algorithms stems from the tendency to maintain high population diversity, which could be very useful in the case of hard dynamic and multi-modal multi-objective problems (as defined by Deb [4]).

5 Conclusions
Maintaining population diversity is one of the main problems in some applications of EAs, especially in multi-modal optimization, multi-objective optimization and adaptation in dynamic environments. In the case of multi-objective optimization problems the loss of population diversity may result in locating only some parts of the Pareto frontier, or in locating a local Pareto frontier instead of the global one in the case of multi-modal multi-objective problems.
In this paper an overview of selected techniques and algorithms for maintaining population diversity in (co-)evolutionary multi-agent systems for multi-objective optimization was presented. The proposed mechanisms worked very well from the point of view of maintaining population diversity (and, in consequence, improving the quality of the Pareto frontier approximation). It is worth mentioning here that the presented flock-based operators, as well as the co-evolutionary approach with sexual selection, are only selected examples of the whole range of mechanisms that can be easily introduced into a (co-)evolutionary multi-agent system and that can significantly improve the quality of the obtained solutions. Other mechanisms and models, such as the semi-elitist evolutionary multi-agent system, distributed frontier crowding, the co-evolutionary multi-agent system with host-parasite model, and the co-evolutionary multi-agent system with predator-prey model, should be mentioned, but because of space limitations they are omitted in this paper. Of course, further research is needed in order to improve the proposed mechanisms. It seems that the full potential of these systems could be observed in the case of hard multi-modal multi-objective problems in which many local Pareto frontiers exist. Future research will also include the application of other co-evolutionary mechanisms such as, for example, co-operative co-evolution.

References
1. R. Allenson. Genetic algorithms with gender for multi-function optimisation. Technical Report EPCC-SS92-01, Edinburgh Parallel Computing Centre, Edinburgh, Scotland, 1992.
2. T. Back, D. Fogel, and Z. Michalewicz, editors. Handbook of Evolutionary Computation. IOP Publishing and Oxford University Press, 1997.
3. S. Bonissone and R. Subbu. Exploring the pareto frontier using multi-sexual evolutionary algorithms: An application to a flexible manufacturing problem. Technical Report 2003GRC083, GE Global Research, 2003.
4. K. Deb. Multi-objective genetic algorithms: Problem difficulties and construction of test problems. Evolutionary Computation, 7(3):205-230, 1999.
5. K. Deb. Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, 2001.
6. R. Drezewski. A model of co-evolution in multi-agent system. In V. Marik, et al., editors, Multi-Agent Systems and Applications III, volume 2691 of LNCS, pages 314-323, Berlin, Heidelberg, 2003. Springer-Verlag.
7. R. Drezewski and L. Siwik. Multi-objective optimization using co-evolutionary multi-agent system with host-parasite mechanism. In V. N. Alexandrov, et al., editors, Computational Science - ICCS 2006, volume 3993 of Lecture Notes in Computer Science, pages 871-878, Berlin, Heidelberg, 2006. Springer-Verlag.
8. C. Fonseca and P. Fleming. Genetic algorithms for multiobjective optimization: Formulation, discussion and generalization. In Genetic Algorithms: Proceedings of the Fifth International Conference, pages 416-423. Morgan Kaufmann, 1993.
9. S. Gavrilets. Models of speciation: what have we learned in 40 years? Evolution, 57(10):2197-2215, 2003.
10. P. Hajela and C. Lin. Genetic search strategies in multicriterion optimal design. In Structural Optimization 4, pages 99-107, 1992.
11. J. Horn, N. Nafpliotis, and D. E. Goldberg. A niched pareto genetic algorithm for multiobjective optimization. In Proceedings of the First IEEE Conference on Evolutionary Computation, pages 82-87, Piscataway, New Jersey, 1994. IEEE Service Center.
12. A. Iorio and X. Li. A cooperative coevolutionary multiobjective algorithm using non-dominated sorting. In K. Deb, et al., editors, Genetic and Evolutionary Computation - GECCO 2004, volume 3102-3103 of LNCS, pages 537-548. Springer-Verlag, 2004.
13. M. Laumanns, G. Rudolph, and H.-P. Schwefel. A spatial predator-prey approach to multi-objective optimization: A preliminary study. In A. Eiben, et al., editors, Parallel Problem Solving from Nature - PPSN V, volume 1498 of LNCS. Springer-Verlag, 1998.
14. X. Li. A real-coded predator-prey genetic algorithm for multiobjective optimization. In C. M. Fonseca, et al., editors, Evolutionary Multi-Criterion Optimization, Second International Conference (EMO 2003), Proceedings, volume 2632 of LNCS. Springer-Verlag, 2003.
15. J. Lis and A. E. Eiben. A multi-sexual genetic algorithm for multiobjective optimization. In T. Fukuda and T. Furuhashi, editors, Proceedings of the Third IEEE Conference on Evolutionary Computation, pages 59-64, Piscataway, NJ, 1996. IEEE Press.
16. S. W. Mahfoud. Niching methods for genetic algorithms. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, IL, USA, 1995.
17. N. Srinivas and K. Deb. Multiobjective optimization using nondominated sorting in genetic algorithms. Evolutionary Computation, 2(3):221-248, 1994.
18. D. A. Van Veldhuizen. Multiobjective Evolutionary Algorithms: Classifications, Analyses, and New Innovations. PhD thesis, Graduate School of Engineering of the Air Force Institute of Technology, Air University, 1999.
19. E. Zitzler. Evolutionary algorithms for multiobjective optimization: methods and applications. PhD thesis, Swiss Federal Institute of Technology, Zurich, 1999.

Agents Based Hierarchical Parallelization of Complex Algorithms on the Example of hp Finite Element Method

M. Paszyński

Department of Computer Science
AGH University of Science and Technology,
Al. Mickiewicza 30, 30-059 Cracow, Poland
paszynsk@agh.edu.pl
http://home.agh.edu.pl/~paszynsk

Abstract. The paper presents how the application of agents can improve the scalability of domain decomposition (DD) based parallel codes, where the optimal load balance for some components of the code cannot be achieved only by partitioning the computational domain. The limitation of the DD paradigm, where some highly overloaded pieces of the domain cannot be partitioned into smaller sub-domains, can be effectively overcome by parallelization of the computational algorithm over these pieces. The agents are used to localize such highly loaded unbreakable parts of the domain. Multiple agents are then assigned to each highly loaded part to execute the computational algorithm in parallel. The resulting hierarchical parallelization scheme leads to a significant improvement of the scalability. The proposed agent based hierarchical parallelization scheme has been successfully tested on a very complex hp Finite Element Method (FEM) parallel code, applied for simulating Step-and-Flash-Imprint Lithography (SFIL), resistance heating of an Al-Si billet in a steel die for the thixoforming process, as well as for the Fichera model problem.
Keywords: Hierarchical parallelization, Computational agents, Finite Element Method, hp adaptivity.

Introduction

In general, there are two methodologies for the parallelization of computational codes [4]. The first one is based on the domain decomposition paradigm (DD), where the computational domain is partitioned into sub-domains and the same computational algorithm is executed over each sub-domain. The second one is based on functional decomposition (FD), first decomposing the computation to be performed and then dealing with the data.
For some classes of problems the first one is suitable, whilst for some other classes the second one is better. However, for some complex problems a mixed approach is necessary. For example, let us consider a system which consists of many components working on the same data structure. Most of these components



can be efficiently parallelized by utilizing the domain decomposition paradigm. Unfortunately, some other components interpret these data in a different manner, and there are some unbreakable pieces of data which require very expensive computations.
The paper presents an agent based hierarchical parallelization of such complex algorithms, where the DD paradigm is utilized on the base level, and functional decomposition is applied to resolve highly overloaded unbreakable pieces of the domain. Many agents are applied to localize such highly overloaded pieces of the domain, which cannot be broken into smaller subdomains. The agents utilize functional decomposition to reduce the computational load on these nodes.
The Finite Element Method (FEM) [1] is the most popular and flexible way of solving engineering problems. FEM applications are usually parallelized by utilizing the DD paradigm, since the computational mesh can be partitioned into uniformly loaded subdomains. The hp Finite Element Method [2] is the most complex version of FEM. The computational meshes for hp FEM consist of finite elements with various sizes (thus h stands for the finite element diameter) and with polynomial orders of approximation varying locally on finite element edges, faces and interiors (thus p stands for the order of approximation). A parallel version of the hp FEM has been developed [5], utilizing the domain decomposition paradigm. Most of the components of the system, including decisions about optimal mesh refinements, parallel mesh refinements and mesh reconciliation, scale very well under the domain decomposition paradigm. However, for some other components, like the integration and elimination components of the solver, the computational meshes with high polynomial orders of approximation suffer from the presence of single nodes where the computational cost can be higher than the sum of the costs for all other nodes in the mesh [5]. There is no way of partitioning such single nodes, and the domain decomposition paradigm is not sufficient to provide good scalability for these components. The proposed agent based hierarchical parallelization can effectively improve the scalability of these components.
1.1 Hexahedral 3D hp Finite Element

The reference 3D hexahedral hp finite element presented in Figure 1 consists of 8 vertices, 12 edges, 6 faces and the interior. It is defined as a triple (K̂, X(K̂), Π_K̂), where K̂ is the [0, 1]^3 cube shape geometry of the reference element and X(K̂) is a space of shape functions, defined as a subspace of Q^(px, py, pz) polynomials of order px, py and pz in the ξ1, ξ2, ξ3 spatial variables. The polynomial order of approximation at each vertex is set to 1. In order to be able to glue together on a single computational mesh finite elements with different orders, we associate a possibly different polynomial order of approximation p_i = (p_ih, p_iv) with each finite element face, where h and v stand for the horizontal and vertical orders of approximation, for each of the six faces i = 1, ..., 6. Similarly, we associate with each finite element edge a possibly different polynomial order of approximation p_j, for each of the twelve edges j = 1, ..., 12. Finally, the interior of an element has three polynomial orders of approximation (px, py, pz), one for every spatial direction.


Fig. 1. Left picture: 3D hexahedral hp finite element. Right picture: Graphical notation for various polynomial orders of approximation on element edges and faces.

The graphical notation for denoting different polynomial orders of approximation by different colors is introduced in Fig. 1. Π_K̂ is the projection operator Π_K̂ : X → X(K̂), see [2]. The integration performed by FEM over an arbitrary geometry finite element K, called the physical element, is transferred into the reference element K̂ by performing a change of variables.

Computational Problems

The hierarchical parallelization will be discussed on three engineering problems:
- The 3D model Fichera problem, described in detail in [5].
- The Step-and-Flash Imprint Lithography (SFIL), a patterning process utilizing photopolymerization to replicate the microchip pattern from the template into the substrate [6]. The goal of the hp FEM simulation is to compute the volumetric shrinkage of the feature, modeled by linear elasticity with a thermal expansion coefficient.
- The resistance heating of the Al-Si billet in a steel die for the thixoforming process [7]. The goal of the hp FEM simulation is to compute the heat distribution during the resistance heating process, modeled by the Poisson equation with a Fourier boundary condition of the third type.
To solve each of these problems in an accurate way, a sequence of hp FE meshes is generated by the fully automatic parallel hp-adaptive FEM code [5]. These sequences of meshes, generated in automatic mode (without any user interaction), deliver exponential convergence of the numerical error with respect to the mesh size (number of degrees of freedom).
When the computational problem contains singularities related to either jumps in material data, jumps of prescribed boundary conditions, or complicated geometry, the generated hp meshes are irregular, and may contain very small finite elements with high polynomial orders of approximation, especially in areas close to these singularities.
In the first problem, there is one singularity at the center of the domain [5]. The generated hp mesh contains a single finite element with an interior node with polynomial orders of approximation set to 7 in all three directions, as well as


three finite elements with interior nodes with polynomial orders of approximation set to 6 in two directions and 7 in the third direction. The load representing the computational cost over these elements for the integration and elimination components is much higher than the load over all other elements, see Fig. 2.

Fig. 2. The load imbalance over the optimal hp mesh for the Fichera model problem

Fig. 3. The load imbalance over the optimal hp mesh for the SFIL problem

In the second problem, there is one central finite element with a polynomial order of approximation higher than the orders in all other finite elements. The load over this element is higher than the load over all other finite elements; its order is equal to 5 in two directions and 6 in the third direction, see Figure 3. In the third problem, there are many singularities related to jumps in material data. There are many finite elements with high polynomial orders of approximation and the load distribution is quite uniform, see Figure 4.

Load Balancing Problem

This section presents an overview of the FEM computations performed by the integration and direct solver components over hp finite element meshes. The goal of this


presentation is to derive an estimation of the computational cost over a single hp finite element. All problems presented in the previous section fit into the abstract variational formulation:

    Find u \in u_0 + V such that b(u, v) = l(v) \quad \forall v \in V.    (1)

We seek a solution u from the functional space u_0 + V, where u_0 is the lift of the Dirichlet boundary conditions [3]. Here b and l are problem dependent functionals defined as integrals over the entire domain. The variational formulations are derived from the partial differential equations describing the considered physical phenomena, and (1) is satisfied for all test functions v \in V.

Fig. 4. The load balance over the optimal hp mesh for the resistance heating problem

The problem (1) is solved over a finite dimensional subspace V_{h,p} \subset V [3]:

    Find u_{h,p} \in u_0 + V_{h,p} such that b(u_{h,p}, v_{h,p}) = l(v_{h,p}) \quad \forall v_{h,p} \in V_{h,p}.    (2)

The finite dimensional basis of V_{h,p} is constructed from polynomials of the first order in all directions at all finite element vertices, from polynomials of order {p_{j,K}}_{j=1,...,12, K} over all 12 edges of all finite elements K, from polynomials of order {(p_{ih,K}, p_{iv,K})}_{i=1,...,6, K} over all 6 faces of all finite elements K, and from polynomials of order {(p_{x,K}, p_{y,K}, p_{z,K})}_K over the interiors of all finite elements K. These polynomials are called finite element shape functions and their support is spread over adjacent elements only, see [3] for more details. The global shape functions {e^i_{h,p}}_{i=1,...,N} are obtained by collecting all interior finite element shape functions, and by gluing together edge and face finite element shape functions, respectively. The total number of global shape functions is denoted by N. Thus, the approximated solution u_{h,p} is represented as a linear combination of global shape functions:

    u_{h,p} = u_0 + \sum_{i=1,\dots,N} u^i_{h,p} e^i_{h,p}.    (3)


The coefficients {u^i_{h,p}}_{i=1,...,N} are called global degrees of freedom, and the discretized problem (2) can be rewritten as:

    Find u_{h,p} \in u_0 + V_{h,p} such that \sum_{i=1,\dots,N} u^i_{h,p} \, b(e^i_{h,p}, e^j_{h,p}) = l(e^j_{h,p}) \quad for j = 1, \dots, N.    (4)

To solve problem (4) it is necessary to build and solve the global stiffness matrix {b(e^i_{h,p}, e^j_{h,p})}_{i,j=1,...,N} and the right-hand-side (rhs) vector {l(e^j_{h,p})}_{j=1,...,N}. The global stiffness matrix and the rhs vector are aggregated from local matrices and vectors, related to the restrictions of the integrals {b(e^i_{h,p}|_K, e^j_{h,p}|_K)}_{i,j=1,...,N} and {l(e^j_{h,p}|_K)}_{j=1,...,N} over particular finite elements K.
For hp meshes with high polynomial orders of approximation the construction of these local contributions, performed by the integration component, is very expensive.
Let us consider a single finite element with polynomial orders of approximation in its interior equal to (p1, p2, p3). The size of the local matrix associated with such a finite element is equal to the number of degrees of freedom nrdof of the shape functions of order (p1, p2, p3):

    nrdof = (p1 + 1)(p2 + 1)(p3 + 1).    (5)

Here is the algorithm building the local matrix over such a finite element:
1 for i=1,nint1
2   for j=1,nint2
3     for k=1,nint3
4       for m=1,nrdof
5         for n=1,nrdof
6           aggregate local contribution to the stiffness matrix,
7             associated with m-th and n-th degrees of freedom
8         aggregate local contribution to the rhs vector
9           associated with m-th degree of freedom
where nint1 = p1 + 1, nint2 = p2 + 1 and nint3 = p3 + 1 stand for the number of Gaussian quadrature points necessary to compute the integral of a polynomial of orders (p1, p2, p3). The computational complexity of the integration algorithm is then nint1 * nint2 * nint3 * nrdof * nrdof = (p1+1)(p2+1)(p3+1) * (p1+1)(p2+1)(p3+1) * (p1+1)(p2+1)(p3+1) = (p1+1)^3 (p2+1)^3 (p3+1)^3.
The load balancing is performed based on the computational cost estimation

    load = (p1 + 1)^3 (p2 + 1)^3 (p3 + 1)^3.    (6)

When the hp mesh contains finite elements with a load much higher than the sum of the loads of all other elements, the optimal load balance is such that each expensive finite element is assigned to a separate processor, and some number of other processors is responsible for all other finite elements. For example, in Figures 2 and 3 there are hp meshes balanced in the optimal way into 6 or 4 processors, whilst the total number of available processors is 8 (2 or 4 processors are idle).
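The cost estimate (6) and the resulting assignment strategy can be sketched as follows (a minimal Python illustration; the element representation, the `n_procs` parameter and the greedy bin choice for the remaining elements are assumptions of this sketch, not part of the original code):

def element_load(p1, p2, p3):
    """Computational cost estimate (6) for one hp element."""
    return ((p1 + 1) * (p2 + 1) * (p3 + 1)) ** 3

def assign(elements, n_procs):
    """elements: list of (p1, p2, p3). Expensive elements get their own processor,
    the rest are spread greedily over the remaining processors."""
    loads = sorted(((element_load(*e), e) for e in elements), reverse=True)
    total = sum(l for l, _ in loads)
    # an element is treated as "expensive" if it outweighs all the others together
    expensive = [e for l, e in loads if l > total - l]
    rest = [e for l, e in loads if l <= total - l]
    buckets = [[e] for e in expensive]
    shared = [[] for _ in range(max(1, n_procs - len(expensive)))]
    for e in rest:  # greedy: put each element on the least loaded shared processor
        min(shared, key=lambda b: sum(element_load(*x) for x in b)).append(e)
    return buckets + shared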


Agents Based Hierarchical Parallelization

To overcome the problem of expensive integration over finite elements with very high p, the following agent based strategy is proposed.
Multiple agents are executed to localize the finite elements with the highest polynomial orders of approximation. Multiple agents are then assigned to each high order finite element. All agents assigned to a single high order element execute a parallel version of the integration algorithm. The following line is added in the integration algorithm after the i, j, k loops:
4 if((k+(j-1)*nint3+(i-1)*nint2*nint3) modulo NRA != RANK) continue
where NRA is the total number of agents assigned to the high order node, and RANK is the identifier of the current agent. In other words, the Gaussian quadrature integration loops are cut into parts, and each agent executes some portion of the loops (see the sketch after the list below). There are three phases of communication in this algorithm:
- To localize the finite elements with the highest order, agents must exchange the orders of approximation of the interiors of the hp finite elements.
- Agents must exchange the data necessary to perform integration over the Gaussian quadrature points assigned to each agent. This involves only the material data for a finite element and the geometry of the element, represented as a single double precision Jacobian from the change of variables from a physical finite element K into the reference element K̂ = [0, 1]^3.
- At the end of the loops, agents must sum up the resulting local matrices of size nrdof^2 and vectors of size nrdof.
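The loop-splitting rule above can be illustrated as follows (a minimal Python sketch; the flattening of the (i, j, k) Gauss point index and the names `nra`/`rank` mirror the NRA/RANK quantities of the paper, while the `integrate_point` callback is an assumption of this example):

def my_gauss_points(nint1, nint2, nint3, nra, rank):
    """Yield the Gauss points (i, j, k) handled by agent `rank` out of `nra` agents."""
    for i in range(1, nint1 + 1):
        for j in range(1, nint2 + 1):
            for k in range(1, nint3 + 1):
                flat = k + (j - 1) * nint3 + (i - 1) * nint2 * nint3
                if flat % nra == rank:
                    yield i, j, k

def integrate_local(nint1, nint2, nint3, nra, rank, integrate_point):
    """Each agent accumulates only its share of the quadrature loop;
    the partial local matrices and vectors are summed up across agents afterwards."""
    for point in my_gauss_points(nint1, nint2, nint3, nra, rank):
        integrate_point(point)  # aggregate local stiffness/rhs contributions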

Fig. 5. Execution time of the integration algorithm with and without agents for 8 and
16 processors, for the Fichera model problem

The agent based hierarchical parallelization scheme was utilized to improve the efficiency of the integration component for the presented problems. Exemplary results for the Fichera model problem are presented in Figure 5. There is 1 node with (px, py, pz) = (7, 7, 7) and 3 nodes with (px, py, pz) = (6, 6, 7), compare Figure 2. The domain decomposition, optimal with respect to the load defined by (6), puts each of these nodes into a separate sub-domain, and all other nodes into two other sub-domains. For the 8 processor execution there are 2 processors


idle, whilst for the 16 processor execution there are 10 processors idle. The application of agents to localize highly loaded nodes and to eliminate them by performing parallelization of the loops gives an ideal load balance, see Figure 5. The integration is followed by the forward elimination phase of the parallel multi-frontal direct solver [5]. The cost of elimination over a highly loaded element is nrdof^3 and the same load imbalances as for the integration component occur. It means that we were able to improve the efficiency of the integration and elimination components by 50%, since the elimination is still not well balanced. A further improvement of the efficiency can be achieved by switching to iterative solvers.

Conclusions and Future Work

The agent based hierarchical parallelization, with the DD paradigm on the base level and functional decomposition on highly loaded unbreakable pieces of data, was proposed. The computational agents were utilized to localize such highly loaded unbreakable parts of the domain, and to apply functional decomposition by executing the computational algorithm with parallelization of loops. The methodology has been successfully applied to improve the efficiency of parallel hp FEM computations.
Acknowledgments. The work reported in this paper was supported by Polish MNiSW grant no. 3 TO 8B 055 29.

References
1. Ciarlet, P., The Finite Element Method for Elliptic Problems. North Holland, New York (1994)
2. Rachowicz, W., Pardo, D., Demkowicz, L., Fully Automatic hp-Adaptivity in Three Dimensions. ICES Report 04-22 (2004) 1-52
3. Demkowicz, L., Computing with hp-Adaptive Finite Elements, Vol. I. Chapman & Hall/CRC Applied Mathematics & Nonlinear Science, Taylor & Francis Group, Boca Raton, London, New York (2006)
4. Foster, I., Designing and Building Parallel Programs. www-unix.mcs.aml.gov/dbpp
5. Paszyński, M., Demkowicz, L., Parallel Fully Automatic hp-Adaptive 3D Finite Element Package. Engineering with Computers (2006) in press.
6. Paszyński, M., Romkes, A., Collister, E., Meiring, J., Demkowicz, L., Willson, C. G., On the Modeling of Step-and-Flash Imprint Lithography using Molecular Statics Models. ICES Report 05-38 (2005) 1-26
7. Paszyński, M., Maciol, P., Application of Fully Automatic 3D hp Adaptive Code to Orthotropic Heat Transfer in Structurally Graded Materials. Journal of Materials Processing Technology 177 1-3 (2006) 68-71

Sexual Selection Mechanism for Agent-Based Evolutionary Computation

Rafał Drezewski and Krzysztof Cetnarowicz

Department of Computer Science
AGH University of Science and Technology, Krakow, Poland
{drezew,cetnar}@agh.edu.pl

Abstract. The sexual selection mechanism can be used in evolutionary algorithms in order to introduce and maintain useful population diversity. In this paper a sexual selection mechanism for agent-based evolutionary algorithms is presented. The proposed co-evolutionary multi-agent system with sexual selection is applied to multi-modal optimization problems and compared to classical evolutionary algorithms.
Keywords: agent-based evolutionary computation, sexual selection mechanism.

1 Introduction
Evolutionary algorithms (EAs) are global optimization techniques based on principles
of Darwinian model of evolutionary processes. Although they have been widely, and
with great successes, applied to a wide variety of problems, EAs often suffer from loss
of population diversity. This limits the adaptive capabilities of EAs, may lead to locating
local optima instead of a global one, and limits the possibilities of application of EAs
in some areas (multi-modal optimization and multi-objective optimization are only two
examples). In the case of multi-modal optimization problems an EA without any special
mechanisms will inevitably locate a single solution [7]. Multiple solutions can be found
only after using some special multi-modal optimization techniques (so called niching
techniques [7]). Niching techniques are aimed at forming and stably maintaining subpopulations (species) that are located within the basins of attraction of local optima of
multi-modal problems.
The understanding of species formation processes (speciation) still remains the
greatest challenge for evolutionary biology. The biological models of speciation include
allopatric models (which require geographical separation of sub-populations) and sympatric models (where speciation takes place within one population without physical
barriers) [5]. Sympatric speciation may be caused by different kinds of co-evolutionary
interactions including sexual selection.
The sexual selection mechanism is the result of co-evolution of interacting sexes. Usually one of the sexes evolves to attract the second one to mating, and the second one tries to keep the rate of reproduction, and the costs associated with it, at an optimal level (sexual conflict) [5]. The proportion of the two sexes (females and males) in a population is almost always 1 : 1. This fact, combined with higher female reproduction costs, causes that, in the majority of cases, females choose males in the reproduction process according



to some of the males' features. In fact, different variants of sexual conflict are possible. For example, there can be higher female reproduction costs, equal reproduction costs (no sexual conflict), an equal number of females and males in the population, a higher number of males in the population (when the costs of producing a female are higher than producing a male), or a higher number of females in the population (when the costs of producing a male are higher than producing a female) [6].
Evolutionary multi-agent system (EMAS) is the agent-based realization of evolutionary computation. In such a system three basic mechanisms, which are responsible for initiating and maintaining evolutionary processes, exist: agents are able to reproduce and die, and there exist resources in the environment for which agents compete and which are needed for all their activities. The general model of co-evolution in multi-agent system (CoEMAS) [2] additionally includes the notions of species, sex and relations between species and sexes in an evolutionary multi-agent system. These additional mechanisms can serve as a basis for creating techniques of maintaining useful population diversity and speciation in systems based on the CoEMAS model. Computational systems based on the CoEMAS model have already been applied, with promising results, to multi-modal optimization [3] and multi-objective optimization [4].
In the following sections the previous work on sexual selection as a population diversity maintaining and speciation mechanism for evolutionary algorithms is presented. Next, the co-evolutionary multi-agent system with sexual selection mechanism is described. In such a system two sexes co-evolve: females and males. Female mate choice is based on the values of some important features of the selected individuals. The system is applied to multi-modal function optimization and compared to classical niching techniques.

2 Previous Research on Sexual Selection as a Speciation Mechanism

Sexual selection is considered to be one of the ecological mechanisms responsible for sympatric speciation [5]. Gavrilets [5] presented a model which exhibits three general dynamic regimes. In the first one there is an endless co-evolutionary chase between the sexes, where females evolve to decrease the mating rate and males evolve to increase it. In the second regime females' alleles split into two clusters, both at the optimum distance from the males' alleles, and males get trapped between the two female clusters with relatively low mating success. In the third regime males answer the diversification of females by splitting into two clusters that evolve toward the corresponding female clusters. As a result, the initial population splits into two species that are reproductively isolated.
Todd and Miller [10] showed that natural selection and sexual selection play complementary roles and that both processes together are capable of generating evolutionary innovations and bio-diversity much more efficiently. Sexual selection allows species to create their own optima in fitness landscapes. This aspect of sexual selection can result in rapidly shifting adaptive niches, which allows the population to explore different regions of phenotype space and to escape from local optima. The authors also presented a model of sympatric speciation via sexual selection.


Sanchez-Velazco and Bullinaria [9] proposed gendered selection strategies for genetic algorithms. They introduced a sexual selection mechanism where males are selected on the basis of their fitness value and females on the basis of the so-called indirect fitness. A female's indirect fitness is the weighted average of her fitness value, age, and the potential to produce fit offspring (when compared to her partner). For each gender different mutation rates were used. The authors applied their algorithm to the Traveling Salesman Problem and function optimization.
Sexual selection as a mechanism for multi-modal function optimization was studied by Ratford, Tuson and Thompson [8]. In their technique sexual selection is based on the so-called seduction function. This function gives a low measure when two individuals are very similar or very dissimilar, and a high measure for individuals that are fairly similar. The Hamming distance in genotype space was used as a distance metric for two individuals. The authors applied their mechanism alone and in combination with crowding and a spatial population model. Although in most cases their technique was successful in locating the basins of attraction of multiple local optima in a multi-modal domain, a strong tendency to lose all of them except one after several hundred simulation steps was observed.
As presented here, sexual selection is a biological mechanism responsible for bio-diversity and sympatric speciation. However, it has not been widely used as a mechanism for maintaining population diversity, speciation and multi-modal function optimization in evolutionary algorithms. It seems that sexual selection should introduce open-ended evolution, improve the adaptive capabilities of EAs (especially in dynamic environments) and allow speciation (the formation of species located within the basins of attraction of different local optima of a multi-modal fitness landscape), but this is still an open issue and the subject of ongoing research.

3 Sexual Selection Mechanism for Co-evolutionary Multi-agent System

The system based on the CoEMAS model with sexual selection mechanism (SCoEMAS) can be seen in figure 1. The topography of the environment in which individuals live is a graph with every node (place) connected with its four neighbors. There exists a resource in the environment which is given to the individuals proportionally to their fitness function value. Every action (such as migration or reproduction) of an individual costs some resource.
There are two sexes within the species living in the system: females and males. Reproduction takes place only when individuals have a sufficient amount of resource. The genotypes of all individuals are real-valued vectors. Intermediate recombination and mutation with self-adaptation [1] are used for females and males.
The female's cost of reproduction is higher than the male's, so their mating rate is lower. Each time step males search for reproduction partners (females) in their neighborhood. A female chooses a reproduction partner only if they are both located within the basin of attraction of the same local minimum of the multi-modal fitness landscape. A modified version of the hill-valley function [11] is used in order to check if two individuals are located within the basin of attraction of the same local minimum. Instead of three

[Figure 1 shows agents of sexes A and B in the environment, together with the actions of reproduction, selection of a partner for reproduction, migration and death.]

Fig. 1. Co-evolutionary multi-agent system with sexual selection mechanism

deterministically selected points, ten randomly generated points are used in order to evaluate the hill-valley function value (a sketch of this check follows below). The decision of acceptance is made on the basis of the distance between the female and the male in phenotype space (the Euclidean metric is used). The probability of acceptance is greater for more similar individuals. Also, an operator of grouping individuals into reproducing pairs is introduced. If a female chooses a male for reproduction, they move together within the environment and reproduce during some number of simulation steps.
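The hill-valley test used for mate acceptance can be sketched as follows (a minimal Python illustration of the idea from [11], under the assumptions made in this paper: ten random interior points and minimization of the fitness function `f`; the function and variable names are chosen for this example only):

import random

def same_basin(f, x, y, n_points=10):
    """Hill-valley check: x and y are assumed to lie in the basin of the same
    local minimum if no interior point has a worse (higher) value than both."""
    threshold = max(f(x), f(y))
    for _ in range(n_points):
        t = random.random()                      # random point on the segment x-y
        interior = [xi + t * (yi - xi) for xi, yi in zip(x, y)]
        if f(interior) > threshold:
            return False                         # a "hill" separates x and y
    return True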
The system was applied to multi-modal function optimization and run against four
commonly used test functions.
3.1 Experimental Results
The presented system with sexual selection mechanism was, among others, tested with the use of the standard Rastrigin and Schwefel multi-modal problems (see fig. 2)^1. In order to give a kind of reference point, two other algorithms, standard EMAS and deterministic crowding (DC) [7], were run against the same set of test functions.
The Rastrigin function used in the experiments is given by:

    f_1(x) = 10 n + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2 \pi x_i) \right), \quad x_i \in [-2.5; 2.5] \text{ for } i = 1, \dots, n    (1)

where n is the number of dimensions (n = 2 in all experiments). The function has 25 local minima for x_1, x_2 \in [-2.5; 2.5].
The Schwefel function is given by:

    f_2(x) = \sum_{i=1}^{n} \left( -x_i \sin \sqrt{|x_i|} \right), \quad x_i \in [-500.0; 500.0] \text{ for } i = 1, \dots, n    (2)

^1 The presented CoEMAS was also tested with the use of other multi-modal problems, but because of space limitations this is out of the scope of this paper.
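Transcribed directly, the two test functions look as follows (a small Python sketch written for this text; minimization and the two-dimensional setting used in the experiments are assumptions mirrored from the description above):

import math

def rastrigin(x):
    """Multi-modal Rastrigin function, eq. (1); minimized over [-2.5, 2.5]^n."""
    return 10 * len(x) + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

def schwefel(x):
    """Deceptive Schwefel function, eq. (2); minimized over [-500, 500]^n."""
    return sum(-xi * math.sin(math.sqrt(abs(xi))) for xi in x)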


Fig. 2. Rastrigin (a) and Schwefel (b) test functions

Fig. 3. The number of located local minima neighborhoods of Rastrigin (a) and Schwefel (b) functions by CoEMAS with sexual selection, EMAS, and deterministic crowding technique (average values from 20 experiments)

This is a deceptive function with 62 unevenly distributed local minima for n = 2.
Figure 3 shows the average number of located local minima neighborhoods, computed over 20 experiments. A local minimum neighborhood was classified as located when there were at least three individuals closer than dist_max = 0.05 to the local minimum for the Rastrigin function and dist_max = 10.0 for the Schwefel function. All the experiments were carried out for three techniques: SCoEMAS, EMAS, and DC.
The SCoEMAS performed relatively well when compared to the other techniques. In all cases it formed and stably maintained species during the whole experiment. Although DC quickly located an even greater number of local minima neighborhoods than the other techniques, there was a quite strong tendency to lose almost all of them during the remaining part of the experiment. Simple EMAS, without any niching mechanism, was not able to stably populate more than one local minimum neighborhood. It turned out that in the case of a multi-modal landscape it works just like a simple EA.

Fig. 4. The value of the proportional species sizes indicator in experiments with Rastrigin (a) and Schwefel (b) functions (average values from 20 experiments)

Figure 4 shows the average values of the proportional species sizes indicator npd(t). The npd(t) indicator is defined as follows:

    npd(t) = \sum_{i=1}^{|D^{min}|} g(A^i(t))    (3a)

    g(A^j(t)) = \begin{cases} 1 - \dfrac{\left| |A^j(t)| - n^j_{opt} \right|}{n^j_{opt}} & \text{if } |A^j(t)| \le n^j_{opt} \\[4pt] 1 - \dfrac{|A^j(t)| - n^j_{opt}}{|A(t)| - n^j_{opt}} & \text{if } |A^j(t)| > n^j_{opt} \end{cases}    (3b)

    n^j_{opt} = \frac{f'(x^+_j)}{\sum_{k=1}^{|D^{min}|} f'(x^+_k)} \, |A(t)|    (3c)

where: D^{min} \subset D is the set of local minima of the goal function f, A(t) is the set of agents that exist in the system at time t, x^+_j is the j-th local minimum, A^j(t) is the set of agents that are closer than dist_max to the j-th local minimum at time t, f' = \sigma \circ f is the modified goal function, and \sigma : R \to R_+ is a scaling function which assures that the values of the f' function are greater than 0 and that the local maxima of this function are located in the same places as the local minima of the function f.
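A direct transcription of the indicator may help to read the definition (a Python sketch written for this text; the representation of agents as phenotype vectors, the Euclidean distance and the `f_prime_at_minima` values at the known minima are assumptions of this example):

import numpy as np

def npd(agents, minima, f_prime_at_minima, dist_max):
    """Proportional species sizes indicator (3a)-(3c) for one time step."""
    total = len(agents)
    # n_opt per local minimum, eq. (3c)
    weights = np.asarray(f_prime_at_minima, dtype=float)
    n_opt = weights / weights.sum() * total
    value = 0.0
    for j, x_plus in enumerate(minima):
        # A^j(t): agents closer than dist_max to the j-th local minimum
        a_j = sum(np.linalg.norm(np.asarray(a) - np.asarray(x_plus)) < dist_max
                  for a in agents)
        if a_j <= n_opt[j]:
            value += 1.0 - abs(a_j - n_opt[j]) / n_opt[j]          # eq. (3b), first case
        else:
            value += 1.0 - (a_j - n_opt[j]) / (total - n_opt[j])   # eq. (3b), second case
    return value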
In the case when all sub-populations (species) located within the neighborhoods of local minima are of optimal sizes, the npd(t) indicator has its maximal value (equal to the number of local minima). When some subpopulation sizes are not optimal, the value of this indicator drops. The results presented in fig. 4 confirm that SCoEMAS stably maintains useful population diversity and that the DC technique initially properly distributes individuals over the basins of attraction of the local minima, but then, as time goes on, it loses almost all basins of attraction (the species located within them disappear). Also, earlier observations that EMAS is not able to maintain useful population diversity are fully acknowledged by the results presented in fig. 4.

Fig. 5. The number of individuals in population in experiments with Rastrigin (a) and Schwefel (b) functions (average values from 20 experiments)
The population sizes in EMAS, DC and SCoEMAS during the experiments with the Rastrigin and Schwefel functions are presented in fig. 5. EMAS and SCoEMAS used variable size populations, while the DC population size was fixed (this results from the DC algorithm's assumptions [7]). In the case of the EMAS and SCoEMAS systems the initial population sizes are small and quickly adapt to the difficulty of the problem. It is worth noting that SCoEMAS uses the smallest population in the experiments with both functions, which is a big advantage.
The presented results indicate that a simple EMAS cannot be applied to multi-modal function optimization without introducing special mechanisms such as co-evolution. The DC technique has some limitations: it has a strong tendency to lose the basins of attraction of worse local minima during the experiments (this fact was also previously observed in [12]). CoEMAS with sexual selection is able to form and stably maintain species, but still more research is needed.
4 Summary and Conclusions


The general model of co-evolution in multi-agent system (CoEMAS) extends the basic EMAS model from a single species and sex to multiple interacting species and sexes. On the basis of the CoEMAS model computational and simulation systems may be developed. In this paper a sample computational CoEMAS with sexual selection and the resulting co-evolution of two sexes was presented. This system was applied to multi-modal function optimization. As the presented results clearly show, it properly formed and stably maintained species of agents located within the basins of attraction of the local minima of multi-modal problems. SCoEMAS was able to detect and stably maintain more neighborhoods of local minima than EMAS without a niching mechanism and the deterministic crowding niching technique.
Future research will include the comparison of other variants of sexual conflict (different costs of reproduction for each sex, different costs of producing female and male individuals, resulting in different proportions of individuals of each sex in the population). Also, a more detailed comparison to other classical niching and co-evolutionary techniques and the parallel implementation of systems based on the CoEMAS model with the use of MPI are included in future research plans.

References
1. T. Back, D. Fogel, and Z. Michalewicz, editors. Handbook of Evolutionary Computation. IOP Publishing and Oxford University Press, 1997.
2. R. Drezewski. A model of co-evolution in multi-agent system. In V. Marik, et al., editors, Multi-Agent Systems and Applications III, volume 2691 of LNCS, pages 314-323, Berlin, Heidelberg, 2003. Springer-Verlag.
3. R. Drezewski. Co-evolutionary multi-agent system with speciation and resource sharing mechanisms. Computing and Informatics, 25(4):305-331, 2006.
4. R. Drezewski and L. Siwik. Multi-objective optimization using co-evolutionary multi-agent system with host-parasite mechanism. In V. N. Alexandrov, et al., editors, Computational Science - ICCS 2006, volume 3993 of Lecture Notes in Computer Science, pages 871-878, Berlin, Heidelberg, 2006. Springer-Verlag.
5. S. Gavrilets. Models of speciation: what have we learned in 40 years? Evolution, 57(10):2197-2215, 2003.
6. J. Krebs and N. Davies. An Introduction to Behavioural Ecology. Blackwell Science Ltd, 1993.
7. S. W. Mahfoud. Niching methods for genetic algorithms. PhD thesis, University of Illinois at Urbana-Champaign, Urbana, IL, USA, 1995.
8. M. Ratford, A. L. Tuson, and H. Thompson. An investigation of sexual selection as a mechanism for obtaining multiple distinct solutions. Technical Report 879, Department of Artificial Intelligence, University of Edinburgh, 1997.
9. J. Sanchez-Velazco and J. A. Bullinaria. Gendered selection strategies in genetic algorithms for optimization. In J. M. Rossiter and T. P. Martin, editors, Proceedings of the UK Workshop on Computational Intelligence (UKCI 2003), pages 217-223, Bristol, UK, 2003. University of Bristol.
10. P. M. Todd and G. F. Miller. Biodiversity through sexual selection. In C. G. Langton and T. Shimohara, editors, Artificial Life V: Proceedings of the Fifth International Workshop on the Synthesis and Simulation of Living Systems (Complex Adaptive Systems), pages 289-299. Bradford Books, 1997.
11. R. K. Ursem. Multinational evolutionary algorithms. In P. J. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, and A. Zalzala, editors, Proceedings of the 1999 Congress on Evolutionary Computation (CEC-1999), pages 1633-1640, Piscataway, NJ, USA, 1999. IEEE Press.
12. J.-P. Watson. A performance assessment of modern niching methods for parameter optimization problems. In W. Banzhaf, J. Daida, A. E. Eiben, M. H. Garzon, V. Honavar, M. Jakiela, and R. E. Smith, editors, GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference, volume 1, pages 702-709, San Francisco, CA, 1999. Morgan Kaufmann.

Agent-Based Evolutionary and Immunological Optimization

Aleksander Byrski and Marek Kisiel-Dorohinicki

Department of Computer Science, AGH University of Science and Technology,
Mickiewicz Avn. 30, 30-059 Cracow, Poland
{olekb,doroh}@agh.edu.pl

Abstract. An immunological selection mechanism for evolutionary multi-agent systems is discussed in the paper. It allows for reducing the number of fitness assignments required to get a solution of comparable quality to the classical resource-based selection in EMAS. Experimental studies aim at comparing the performance of the immune-inspired selection with the resource-based one, and also with classical parallel evolutionary algorithms, based on typical multi-modal optimization benchmarks.
Keywords: multi-agent systems, evolutionary computation, artificial immune systems.

1 Introduction
The term evolutionary multi-agent systems (EMAS) covers a range of optimization techniques, which consist in the incorporation of evolutionary processes into a multi-agent system at a population level. The most distinctive for these techniques are the selection mechanisms, based on the existence of non-renewable resources, which are gained and lost when agents perform actions [6].
Even though the idea of EMAS proved to work in a number of tests, it still reveals new features, especially when supported by specific mechanisms borrowed from other optimisation methods [7]. In particular, an immunological approach was proposed as a more effective alternative to the classical resource-based selection used in EMAS [2]. The introduction of immune-based mechanisms may affect various aspects of the system behaviour, such as the diversity of the population and the dynamics of the whole process. From the practical point of view the most interesting effect is the substantial reduction of the number of fitness assignments required to get a solution of comparable quality. This is of vast importance for problems with a high computational cost of fitness evaluation, like reverse problems involving simulation experiments to assess the solution, or hybrid soft computing systems with a learning procedure associated with each assessment (e.g. evolution of neural network architecture for a particular problem) [1].
This paper focuses on the impact of the immune-based approach on the performance of EMAS applied to function optimization in comparison to the resource-based selection used alone. The results obtained for both agent-based approaches are also compared to a classical parallel evolutionary algorithm (evolutionary algorithm with allopatric speciation).



Fig. 1. Structure of the population and energetic selection principle in EMAS

Below, after a short presentation of the basics of evolutionary multi-agent systems


and artificial immune systems, the details of the proposed approach are given. Then the
selected results obtained for the optimization of typical multi-modal, multi-dimensional
benchmark functions are presented.

2 Evolutionary Multi-agent Systems


In evolutionary multi-agent systems, besides the interaction mechanisms typical for agent-based systems (such as communication), agents are able to reproduce (generate new agents) and may die (be eliminated from the system). Inheritance may thus be accomplished by an appropriate definition of reproduction (with mutation and recombination), which is similar to classical evolutionary algorithms.
Unfortunately, selection mechanisms known from classical evolutionary computation cannot be used in EMAS because of the assumed lack of global knowledge (which makes it impossible to evaluate all individuals at the same time) and the autonomy of agents (which causes reproduction to be achieved asynchronously). The resource-based (energetic) selection scheme assumes that agents are rewarded for "good" behaviour and penalized for "bad" behaviour (which behaviour is considered good or bad depends on the particular problem to be solved) [6].
In the simplest case the evaluation of an agent (its phenotype) is based on the idea of agent rendezvous. Assuming some neighbourhood structure in the environment (the simplest case would be population decomposition along with an allopatric speciation model, see fig. 1), agents evaluate their neighbours and exchange energy. Worse agents (considering their fitness) are forced to give a fixed amount of their energy to their better neighbours. This flow of energy causes that, in successive generations, surviving agents should represent better approximations of the solution [4].
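The rendezvous-based energy exchange can be sketched as follows (a minimal Python illustration; the `Agent` class, the fixed `transfer` amount and minimization of fitness are assumptions of this sketch, not the authors' implementation):

import random
from dataclasses import dataclass

@dataclass
class Agent:
    genotype: list
    fitness: float   # lower is better (minimization assumed)
    energy: float

def rendezvous(a, b, transfer=1.0):
    """The worse of two neighbouring agents gives a fixed amount of energy to the better one."""
    worse, better = (a, b) if a.fitness > b.fitness else (b, a)
    amount = min(transfer, worse.energy)
    worse.energy -= amount
    better.energy += amount

def evaluation_step(neighbourhood):
    """Each agent meets a random neighbour within its island and exchanges energy."""
    for agent in neighbourhood:
        others = [x for x in neighbourhood if x is not agent]
        if others:
            rendezvous(agent, random.choice(others))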

3 Artificial Immune Systems


Artificial immune systems, inspired by human immunity, have recently become the subject of increased researchers' interest. Different immune-inspired approaches have been applied to many problems, such as classification or optimization [9].


The most often used algorithms, inspired by the immune system of vertebrates, are based on the clonal and negative selection processes [5].
The algorithm of negative selection corresponds to its biological origin and consists of the following steps:
1. Lymphocytes are created; as yet they are considered immature.
2. The binding of these cells (affinity) to presented self-cells (e.g. good solutions of some problem) is evaluated.
3. Lymphocytes that bind themselves to good cells are eliminated.
4. Lymphocytes that survive are considered mature.
Mature lymphocytes are presented with cells of unknown origin (they may be self or non-self cells), and they are believed to have the ability to classify them [8]. The algorithm is usually used for classification problems.
The algorithm of clonal selection is based on the corresponding biological process. The original approach, used for classification problems, was modified by De Castro and von Zuben [3] in order to adapt it to solving optimization problems. In this approach the lymphocyte-antigen binding is represented by the value of the fitness function (in fact, antigens may be identified with the function optima); a sketch of the algorithm follows the steps below:
1. The population of lymphocytes is randomly created.
2. Every lymphocyte produces a certain number of mutated clones.
3. Every lymphocyte in the population is replaced by the best of its clones (in terms of fitness function value) or retained.
4. Pairs of similar lymphocytes are selected (the Euclidean distance may be used as a similarity measure) and from every pair the better lymphocyte (in terms of fitness function value) is retained. The worse one is replaced by a randomly generated new lymphocyte.
5. Steps 2-4 are repeated until the stop condition is reached.
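The following is a minimal Python sketch of the clonal selection loop described above (the mutation scheme, the number of clones, the pairing threshold and the search bounds are all assumptions of this example, not the configuration used in the paper):

import math
import random

def mutate(x, sigma, bounds):
    """Gaussian mutation clipped to the search bounds."""
    return [min(max(xi + random.gauss(0.0, sigma), bounds[0]), bounds[1]) for xi in x]

def clonal_selection(fitness, dim, pop_size=20, n_clones=5, sigma=0.1,
                     pair_dist=0.5, bounds=(-1.0, 1.0), steps=100):
    """Optimization variant of clonal selection (minimization of `fitness`)."""
    rand_point = lambda: [random.uniform(*bounds) for _ in range(dim)]
    pop = [rand_point() for _ in range(pop_size)]                       # step 1
    for _ in range(steps):                                              # step 5
        # steps 2-3: each lymphocyte is replaced by the best of itself and its clones
        pop = [min([p] + [mutate(p, sigma, bounds) for _ in range(n_clones)], key=fitness)
               for p in pop]
        # step 4: in every pair of similar lymphocytes keep the better one
        for i in range(len(pop)):
            for j in range(i + 1, len(pop)):
                if math.dist(pop[i], pop[j]) < pair_dist:
                    if fitness(pop[i]) <= fitness(pop[j]):
                        pop[j] = rand_point()
                    else:
                        pop[i] = rand_point()
    return min(pop, key=fitness)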

4 Immunological Selection in EMAS


In order to speed up the process of selection in EMAS, based on the assumption that "bad" phenotypes come from "bad" genotypes, a new group of agents (acting as lymphocyte T-cells) may be introduced [2,1]. They are responsible for recognizing and removing agents with genotypes similar to the genotype pattern possessed by these lymphocytes. Another approach may introduce a specific penalty applied by T-cells to recognized agents (a certain amount of the agent's energy is removed) instead of removing them from the system.
Of course there must exist some predefined affinity (lymphocyte-agent matching) function, which may be based on the percentage difference between corresponding genes. The agents-lymphocytes are created in the system after the action of death. The late agent's genotype is transformed into lymphocyte patterns by means of a mutation operator, and the newly created lymphocyte (or group of lymphocytes) is introduced into the system.
In both cases, new lymphocytes must undergo the process of negative selection. In a specific period of time, the affinity of the immature lymphocytes' patterns to "good" agents


Fig. 2. Structure of the population and immunological selection principle in iEMAS

(possessing a relatively high amount of energy) is tested. If it is high (lymphocytes recognize "good" agents as "non-self"), they are removed from the system. If the affinity is low, it is assumed that they will be able to recognize "non-self" individuals ("bad" agents), leaving agents with high energy intact.
The system working according to the above described principles will be called an immunological evolutionary multi-agent system (iEMAS) (see fig. 2).
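An affinity check of the kind mentioned above, together with negative selection of an immature T-cell, might be sketched as follows (a minimal Python illustration; the per-gene relative-difference affinity, the `threshold` parameter and the energy-based notion of a "good" agent are assumptions of this sketch, and the agent objects are assumed to expose `energy` and `genotype` attributes):

def affinity(pattern, genotype, threshold=0.1):
    """True if every gene of `genotype` differs from `pattern` by less than
    `threshold` (relative difference), i.e. the T-cell matches the agent."""
    return all(abs(p - g) <= threshold * max(abs(p), 1e-12)
               for p, g in zip(pattern, genotype))

def negative_selection(tcell_pattern, agents, reproduction_energy, threshold=0.1):
    """An immature T-cell survives only if it does not match any 'good' agent,
    i.e. any agent whose energy allows reproduction."""
    for agent in agents:
        if agent.energy >= reproduction_energy and affinity(tcell_pattern, agent.genotype, threshold):
            return False   # the T-cell would remove good agents, so it is discarded
    return True            # mature: it may now recognize and penalize 'bad' agents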

5 Configuration of the Examined Systems


The systems were designed and implemented using the distributed computational agent platform AgE developed at AGH University of Science and Technology^1.
Much effort was put into configuring the examined systems in such a way that the obtained results are comparable. This was of course a difficult task, because the agent-based and classical approaches are of a different kind. For example, the mechanisms of selection are completely different, because there is no global knowledge in agent systems (therefore the methods of selection are different).
Every algorithm was applied to the optimization of 10-dimensional, well known benchmark functions, i.e. Rastrigin, Ackley, Griewank and De Jong. Real-value encoding was used. Every approach employed allopatric speciation; the populations were decomposed into 3 islands with an initial population of 20 individuals. A discrete crossover operator along with uniform mutation with a small macromutation probability (0.1) were used. Every computation was performed during 10000 iterations.
Specific parameters for every examined algorithm follow.
Evolutionary algorithm
- Tournament selection was used, because of its similarity to the rendezvous evaluation mechanism used in EMAS and iEMAS (the number of competing individuals was 7).

^1 http://age.iisg.agh.edu.pl


- Crossover probability was 0.7; mutation probability was 0.001 with a 0.1 probability of macromutation.
- Every 20 iterations the 5 most different individuals were exchanged among the islands in the process of migration.
EMAS
- The initial amount of an agent's energy is 30. In every evaluation, agents exchanged 1 unit of energy.
- An agent was removed from the system when its energy reached 0.
- An agent was able to reproduce when its energy exceeded the average energy computed for its evolutionary island.
- An agent was able to migrate when its energy exceeded about 110% of the average energy computed for its evolutionary island.
iEMAS
- The pattern contained in a T-cell has the same structure as the agent's genotype. The affinity function evaluates the difference between every element of the genotype and the pattern.
- The energy of a T-cell was 150. The T-cell was removed from the system after its energy reached 0.
- The length of the negative selection process was 30 iterations. An immature T-cell was removed from the system after recognizing an agent whose energy exceeded the minimal energy needed for reproduction.

6 Experimental Results
The plots discussed below show averaged values (with estimates of the standard deviation) obtained for 20 runs of the systems with the same parameters.
In figures 3(a), 3(b), 3(c) and 3(d), the averaged best fitness is presented in consecutive steps of the system activity for EMAS, iEMAS and PEA for all optimization problems. These graphs show that EMAS and iEMAS reached better or comparable results to PEA for three problems (Ackley, De Jong and Rastrigin). PEA turned out to be better only for the Griewank problem. The final results of the optimization may be seen in figure 4(a) and in table 1.
It seems that the introduction of the immunological selection mechanism does not greatly affect the quality of the obtained results. Both systems (EMAS and iEMAS) reached similar sub-optimal values in the observed period of time (continuation of the search would yield better results, however this was not the primary task of this research), although the results obtained for iEMAS were a little worse than for its predecessor.
In figure 4(b), the aggregated count of fitness function calls for the Ackley problem is presented (other problems yielded very similar results, see table 2). It may be clearly seen that the agent-based approaches require far fewer computations of fitness functions than PEA, which makes them suitable for solving problems where the time of fitness computation is significant (e.g. reverse problems). The fitness count result for iEMAS was about 50% better than for EMAS (see table 2).
Fig. 3. The best fitness F(X) (logarithmic scale) in consecutive iterations n (up to 10000) for PEA, EMAS and iEMAS: (a) Ackley problem optimization, (b) De Jong problem optimization, (c) Rastrigin problem optimization, (d) Griewank problem optimization

Table 1. The best fitness for EMAS, iEMAS and PEA in the 10000th iteration (mean ± standard deviation)

Problem     PEA                        EMAS                        iEMAS
Ackley      1.97 ± 0.07                1.38·10^-3 ± 37·10^-6       4.17·10^-3 ± 0.14·10^-3
De Jong     6.49·10^-6 ± 0.96·10^-6    1.31·10^-6 ± 68.6·10^-9     7.9·10^-6 ± 0.59·10^-6
Griewank    7.58·10^-3 ± 2.33·10^-3    0.02 ± 5.05·10^-3           0.11 ± 0.03
Rastrigin   7.81 ± 0.81                0.26·10^-3 ± 16.8·10^-6     4.58·10^-3 ± 1.83·10^-3

Table 2. The aggregated number of fitness function calls for EMAS, iEMAS and PEA in 10000 iterations (mean ± standard deviation)

Problem     PEA        EMAS           iEMAS
Ackley      0.6·10^6   57 840 ± 43    25 596 ± 149
De Jong     0.6·10^6   57 769 ± 43    26 172 ± 158
Griewank    0.6·10^6   58 484 ± 62    30 829 ± 379
Rastrigin   0.6·10^6   57 787 ± 29    27 727 ± 220

Fig. 4. (a) The best fitness F(X) reached by PEA, EMAS and iEMAS in the 10000th iteration for the Ackley, De Jong, Griewank and Rastrigin problems; (b) the aggregated number of fitness function calls #F(X) in consecutive iterations n for the Ackley problem

The relatively small values of the standard deviation displayed in the presented graphs show that the experimental results are reliable for the examined problems.

7 Conclusion
In the paper, immune-based selection mechanisms for evolutionary multi-agent systems were evaluated to show their performance in comparison to energetic ones and to classical parallel evolutionary algorithms. As the experimental results show, immunological selection lowers the cost of the computation by decreasing the number of fitness assignments, while the results remain comparable to those obtained for the system without immunological selection. Additionally, it may be noticed that both agent-based approaches give better solutions than PEA in most of the considered problems.


Strategy Description for Mobile Embedded Control Systems Exploiting the Multi-agent Technology

Vilém Srovnal¹, Bohumil Horák¹, Václav Snášel², Jan Martinovič², Pavel Krömer², and Jan Platoš²

¹ Department of Measurement and Control, FEECS, VŠB-Technical University of Ostrava, 17. listopadu 15, CZ-708 33 Ostrava-Poruba, Czech Republic
{vilem.srovnal, bohumil.horak}@vsb.cz
² Department of Computer Science, FEECS, VŠB-Technical University of Ostrava, 17. listopadu 15, CZ-708 33 Ostrava-Poruba, Czech Republic
{vaclav.snasel, jan.martinovic, pavel.kromer.fei, jan.platos.fei}@vsb.cz

Abstract. Mobile embedded systems are a part of standard applications of distributed system control in real time. An example of a mobile control system is a robotic system. The software part of a distributed control system is realized by decision-making and executive agents. The algorithm of agents' cooperation was proposed with the control agent on a higher level. The algorithms for the agents realized in the robots are the same. A real-time dynamic description of simple strategies and the possibility of learning strategies from game observation are important for discovering opponents' strategies, searching for tactical group movements, and the simulation and synthesis of suitable counter-strategies. For the improvement of game strategy, we are developing an abstract description of the game and propose ways to use this description (e.g. for learning rules and adapting team strategies to every single opponent).

1 Introduction
A typical example of a distributed control system with embedded subsystems is the
controlling of physical robots playing soccer. The selection of this game as a laboratory task was motivated by the fact that the realization of this complicated multidisciplinary task is very difficult. The entire game can be divided into a number of partial
tasks (evaluation of visual information, image processing, hardware and software
implementation of distributed control system, hard-wired or wireless data transmission, information processing, strategy planning and controlling of robots). The task is
a matter of interest for both students and teachers, and allows direct evaluation and
comparison of various approaches. For the improvement of game strategy, we are
developing an abstract description of the game and propose ways to use this description (e.g. for learning rules and adapting team strategies to every single opponent).
We are building upon our previous work - the hardware implementation and basic
control of robots - and we would like to achieve a higher level control of complex
game strategies.


The rest of the paper is organized as follows: First, we briefly describe the basic hardware and software implementation. Then, we describe the representation of the game field using abstract grids. After that, we describe possible game strategies. Using the abstract grids and game strategies, we explain how to learn rules that describe specific game strategies. Particular attention is paid to learning with the help of latent semantic analysis. We conclude with a discussion of the presented approach.

2 Technical Implementation
Embedded systems are represented by two teams (our own and the opponent's) made up of up to 11 autonomous mobile robots. The core of an embedded control system is a digital signal processor, the Motorola DSP56F805. The PWM output of the signal processor is connected to a pair of power H-bridge circuits, which supply a pair of DC drives with integrated pulse encoders. For communication with the higher level of the control system, a communication module with the Nordic nRF2401 control IC is used.
The higher-level control system is represented by a personal computer. At the PC input, a signal that represents a picture of the scene with the robots, scanned by a CCD aerial camera, is entered. At the output, a radio line that transmits commands to all of our own mobile robots is connected.
The software part of the distributed control system is realized by decision-making and executive agents. The algorithm of agents' cooperation was proposed with the control agent on a higher level. The algorithms for the agents realized in the robots are the same. The control agent determines the required behavior of the whole control system as the response to the dynamic behavior of the robots. One's own global strategy in the task and knowledge of prior situations saved in the database of the scene can also be determined. The agent on the higher level controls the other agents [1].
Another task is the transformation that converts the digital picture into the object coordinates (robots and ball in the task of robot soccer) saved in the database of the scene [2]. This database is common to all agents in the control system. The agent structure used was described in [3]. Each agent sees the entire scene and is capable of controlling its own behavior. The basic characteristic of the control algorithm of a subordinate agent is its independence of the number of decision-making agents for the robots on the playground.
Both teams (one's own and the opponent's) have a common goal: to score a goal and not to allow any goals from the opponent. For the successful assertion of one's own game strategy, the extraction and knowledge of the opponent's game strategy is very important. Strategy extraction algorithms draw on the opponent's game strategy database and on the object coordinates of the picture scene.

3 Game Field Description


The game system can be described as up to twice eleven autonomous mobile robots (our own and the opponent's robots), which are situated on a field measuring 280 x 220 cm.


In our approach we are using a software simulator for this robot soccer game [4]. The robots create a very dynamic environment. This environment is scanned by a CCD aerial camera with a sample frequency of (at the present time) up to 50 fps.
The neural net of the control agent in a sensation module processes the picture signal, and the encoded information (position, orientation) is saved in one of the output vectors of the scene database [5]. The scene database is common to all agents. Both agent teams have a common goal: to score a goal and not to concede any goal. For success, it is also advisable to extract the strategy of the opponent team. The extraction and knowledge of the opponent's game strategy is an approach that is known to be successful in other situations as well [6].
Our representation of the game is based on a separation of the game into a logical and a physical part. The logical part includes the selection of the strategy, the calculation of the robots' movements, and the adaptation of rules to the opponent's strategy. The physical part contains the real movement of the robots on the game field and the recognition of the opponent's turn. The logical part is independent of the physical part, because we can calculate the moves of the opponent's robots as well as the moves of our robots. A further advantage of this separation is that the logical part is independent of the size of the game field and the resolution of the camera used. In the logical part, the game is represented as an abstract grid with a very high resolution, which guarantees a very precise specification of the positions of the robots and the ball. However, this very detailed representation of the game field is not suitable for strategy description, because it would require many rules to describe the behavior of the robots. Therefore, for strategy description a so-called strategy grid with a much lower resolution than the abstract grid is used. This simplification of reality is sufficient, because for strategy realization it is not necessary to know the exact position of the robots, only their approximate position (Figure 1). When the physical part is used, we must only transform the coordinates from the abstract grid into coordinates based on the game field size and camera resolution.

Fig. 1. Inner game representation
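The coordinate transformations described above can be sketched as follows. The concrete grid resolutions and the class name GridMapper are illustrative assumptions; only the 280 x 220 cm field size comes from the text, and the conversion to camera pixels would be an analogous scaling.

```java
// Minimal sketch of the mapping between the physical field, the
// high-resolution abstract grid and the coarse strategy grid.
final class GridMapper {
    static final double FIELD_W_CM = 280.0, FIELD_H_CM = 220.0;
    final int abstractW, abstractH;   // e.g. 2800 x 2200 cells (assumed)
    final int strategyW, strategyH;   // e.g. 14 x 11 cells (assumed)

    GridMapper(int aw, int ah, int sw, int sh) {
        abstractW = aw; abstractH = ah; strategyW = sw; strategyH = sh;
    }

    int[] physicalToAbstract(double xCm, double yCm) {
        int ax = (int) Math.min(abstractW - 1, xCm / FIELD_W_CM * abstractW);
        int ay = (int) Math.min(abstractH - 1, yCm / FIELD_H_CM * abstractH);
        return new int[] { ax, ay };
    }

    int[] abstractToStrategy(int ax, int ay) {
        return new int[] { ax * strategyW / abstractW, ay * strategyH / abstractH };
    }

    double[] strategyToPhysical(int sx, int sy) {
        // centre of the strategy cell, converted back to field centimetres
        return new double[] { (sx + 0.5) * FIELD_W_CM / strategyW,
                              (sy + 0.5) * FIELD_H_CM / strategyH };
    }
}
```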


4 Game Strategy
The game strategy can be dynamically changed based on the game progress (i.e. the history and the current position of the players and the ball [7]). The game progress can be divided in time into the following three ground playing classes (GPC): the GPC of the game opening (GPCO), the GPC of movements in the game site (GPCS), and the GPC of the game end (GPCE). The game progress, especially in the GPCS class, can also be divided into the following two game playing situations (GPS):

GPS of attack (GPSA). The interactions of simple behaviours cause the robots to fall into a V-formation where the ball is in motion roughly towards the opponent's goal.
GPS of defence (GPSD). When the ball is not moving roughly towards the opponent's goal, the robots move around it to form an effective barrier and to be in a good position for recovery.

Each GPC has its own movement rules. The classes GPCO and GPCE consist of a finite number of possible movements that are determined by the initial positions of the players and the ball. The class GPCS has a virtually unlimited number of possible movements. The movements are determined by the current game situation (GPS) and by the appropriate global game strategy (GGS in the following). The movement of a particular robot is determined by the current game class and situation, and also by the robot's role. For example, the goalkeeper's task is to prevent the opponent from scoring a goal. His movements are in most cases limited to the goalmouth near the goal line. The preferred movements are in the goal-line direction. The preference for these movements comes from the particular GGS, where the goalkeeper prevents a goal by moving into the position between the central goal point and the ball (or the expected ball position). The preference for other movement directions is created using GPSA, where the movements of the goalkeeper secure kicking the ball out of the defense zone.
It should be noted that the above presented categorization of game play progress is rather a tool for analysis, training and description of initial team habits than a strict differentiation of game situations. During the dynamic strategy adaptation, the rules change and should evolve toward a more suitable anti-strategy against the opponent. So the content of each GPC changes in time, and the particular strategies influence each other and also the strategies belonging to other classes.

5 Basic Description of the Strategy Selection Process

In this section we describe our approach for learning a game strategy from observation. Our goal is to learn an abstract strategy. The main steps of the learning process are:

Transformation of observations into abstract grids,
Transformation of observations into strategy grids,
Learning a strategy based on the observed transitions in the strategy grid.

We adopt the definition of strategy from [8]: "Strategy is the direction and scope of an organization over the long-term, which achieves advantage for the organization through its configuration of resources within a challenging environment..." The application of the strategy for one movement of the players is computed in the following steps:
1. Get the coordinates of the players and the ball from the camera.
2. Convert the coordinates of our players into the strategic grid.
3. Convert the ball and the opponents' positions into the abstract and strategic grids.
4. Choose the goalkeeper and the attacker, exclude them from the strategy and calculate their exact positions.
5. Detect the strategic rule from the opponents' and ball positions.
6. Convert the movement from the strategic grid to physical coordinates.
7. Send the movement coordinates to the robots.

Each strategy is stored in one file and currently consists of about 15 basic rules. Furthermore, the file contains metadata information: the name of the strategy, the algorithm for choosing the strategy, the author responsible for the current strategy, the date of the last modification, the size of the strategic grid, and the strategic rules.
Each strategic rule consists of five records: the rule ID and description (e.g. Rule 1 Attack1), the coordinates of our players in the strategic grid (e.g. .Mine a6 c7 d6 e3 f9), the coordinates of the opponent's players in the strategic or abstract grid (e.g. .Opponent d3 e7 e8 g2 k6), the ball coordinates in the abstract or strategic grid (e.g. .Ball i6), and the strategic or abstract grid positions of the move (e.g. .Move a6 g7 f5 j3 i8). From observation of the opponent's strategy, a new set of rules can be written without the necessity of modifying program code. Furthermore, there is a possibility of automatic strategy (movement) extraction from a running game.
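As an illustration, one rule record in the format quoted above can be parsed into a simple structure, as sketched below. The chess-like reading of a cell such as a6 (column letter, row number) and all class and method names are assumptions of this sketch; a whole strategy file would be a list of such rules plus the metadata lines.

```java
// Hedged sketch: reading one strategic rule in the .Mine / .Opponent / .Ball / .Move format.
import java.util.*;

final class StrategyRule {
    String id;
    List<int[]> mine = new ArrayList<>(), opponent = new ArrayList<>(), move = new ArrayList<>();
    int[] ball;

    static int[] cell(String s) {             // "a6" -> {0, 6} (assumed notation)
        return new int[] { s.charAt(0) - 'a', Integer.parseInt(s.substring(1)) };
    }

    static StrategyRule parse(List<String> lines) {
        StrategyRule r = new StrategyRule();
        for (String line : lines) {
            String[] t = line.trim().split("\\s+");
            switch (t[0]) {
                case ".Mine":     for (int i = 1; i < t.length; i++) r.mine.add(cell(t[i])); break;
                case ".Opponent": for (int i = 1; i < t.length; i++) r.opponent.add(cell(t[i])); break;
                case ".Ball":     r.ball = cell(t[1]); break;
                case ".Move":     for (int i = 1; i < t.length; i++) r.move.add(cell(t[i])); break;
                default:          r.id = line.trim();   // rule ID and description
            }
        }
        return r;
    }
}
```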
There are two main criteria in the rule selection process. The selection depends on the opponents' coordinates, our own coordinates and the ball position. The strategy file contains rules describing three possible formations, suggesting the danger of the current game situation: the opponent's team could be in an offensive, neutral or defensive formation. Furthermore, we need to weigh up the risk of the ball position. Generally, the opponent is not dangerous if the ball is near his goal. The chosen rule is the one with the minimal strategic grid distance from the current situation.
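The selection criterion can be sketched as follows. Manhattan distance on the strategy grid and greedy matching of positions are assumptions of this illustration; the text only requires the rule with the minimal strategic grid distance from the current situation.

```java
// Minimal sketch of rule selection by minimal strategy-grid distance.
import java.util.*;

final class RuleSelector {
    static int distance(int[][] rulePositions, int[][] currentPositions) {
        int total = 0;
        boolean[] used = new boolean[currentPositions.length];
        for (int[] rp : rulePositions) {                 // greedily match each recorded position
            int best = Integer.MAX_VALUE, bestIdx = -1;
            for (int i = 0; i < currentPositions.length; i++) {
                if (used[i]) continue;
                int d = Math.abs(rp[0] - currentPositions[i][0])
                      + Math.abs(rp[1] - currentPositions[i][1]);
                if (d < best) { best = d; bestIdx = i; }
            }
            if (bestIdx >= 0) { used[bestIdx] = true; total += best; }
        }
        return total;
    }

    /** Returns the index of the rule whose recorded layout is closest to the current one. */
    static int select(List<int[][]> ruleLayouts, int[][] currentLayout) {
        int bestRule = 0, bestDist = Integer.MAX_VALUE;
        for (int i = 0; i < ruleLayouts.size(); i++) {
            int d = distance(ruleLayouts.get(i), currentLayout);
            if (d < bestDist) { bestDist = d; bestRule = i; }
        }
        return bestRule;
    }
}
```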
The optimal movements of our robots are calculated by applying the minimal distance from the strategic grid position. The goalkeeper and the attacking player whose distance to the ball is smallest are excluded from the strategic movement, and their new positions are calculated in exact coordinates. To summarize, the strategy management can be described in the following way:

Based on incoming data from the vision system, calculate the abstract and strategy grid coordinates of the players and the ball,
The abstract grid is then used to decide which player has the ball under control,
This player is issued a "kick to" command, which means that it has to try to kick the ball to given strategy grid coordinates,
All other players are given (imprecise) "go to" coordinates. These coordinates are determined by the current game strategy and are determined for each robot individually,
The goalkeeper is excluded from this process since its job is specialized and does not directly depend on the current game strategy.


6 Future Research
The need to learn the opponent's strategy from the game and to decide on an appropriate counter-strategy in response was identified in the previous section. Also, the verification of the created strategy is of notable importance. An off-line (out of the gameplay) verification process validates the strategy and ensures that there are no:

Contradictory rules leading to contradictory game situations,
Extra rules producing, immediately or in more steps, the same game situations.

Such a verified game strategy can improve complex goal-targeted robot behavior in practice. We aim to extend our framework with strategy validating and optimizing components based on genetic algorithms.
Genetic algorithms (GA) are powerful and popular optimization and search algorithms inspired by natural evolution, introduced by John Holland and extended by David Goldberg. The GA is a widely applied and highly successful variant of evolutionary computation [9]. A GA operates over a population of potential solutions encoded into chromosomes. Each chromosome is rewarded with a fitness value expressing its suitability as a solution of the given problem. The workflow of a GA consists of the iterative application of genetic operators on the population of chromosomes (Figure 2). The genetic operators are:

Selection operator: selects the fittest chromosomes from the population to be parents. Through this operator, selection pressure is applied in the population.
Crossover operator: varies chromosomes from one population to the next by exchanging one or more of their subparts.
Mutation operator: random perturbation of the chromosome structure; used for changing chromosomes randomly and introducing new genetic material into the population.

Fig. 2. Evolutionary algorithm

A population of chromosomes in a particular iteration of the evolutionary computation is called a generation. When applying evolutionary optimization to a robot soccer game, a strategy can be seen as a generation of rules encoded as binary chromosomes. The evolution over strategy rules should lead to a new, improved set of rules that will form a better counter-strategy against a particular opponent. The challenge of this approach is to find a suitable rule encoding (that will allow easy application of the genetic operators and respect the nature of the investigated problem) and to discover a useful fitness function to compare rules.
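A generic sketch of such a GA loop over binary-encoded rules is shown below. The placeholder fitness and the particular operator choices are assumptions, since finding the real encoding and fitness is exactly the open problem named above.

```java
// Generic GA loop: selection, one-point crossover and bit-flip mutation
// over binary chromosomes that would encode strategy rules.
import java.util.Random;

final class RuleGA {
    static final Random RND = new Random();

    static double fitness(boolean[] chromosome) {
        // Placeholder: count of set bits. A real fitness would score the rule
        // against recorded game situations.
        int c = 0; for (boolean b : chromosome) if (b) c++; return c;
    }

    static boolean[] tournament(boolean[][] pop) {
        boolean[] a = pop[RND.nextInt(pop.length)], b = pop[RND.nextInt(pop.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    static boolean[] crossover(boolean[] p1, boolean[] p2) {
        int cut = RND.nextInt(p1.length);
        boolean[] child = new boolean[p1.length];
        for (int i = 0; i < p1.length; i++) child[i] = (i < cut) ? p1[i] : p2[i];
        return child;
    }

    static void mutate(boolean[] c, double rate) {
        for (int i = 0; i < c.length; i++) if (RND.nextDouble() < rate) c[i] = !c[i];
    }

    static boolean[][] nextGeneration(boolean[][] pop, double mutationRate) {
        boolean[][] next = new boolean[pop.length][];
        for (int i = 0; i < pop.length; i++) {
            boolean[] child = crossover(tournament(pop), tournament(pop));
            mutate(child, mutationRate);
            next[i] = child;
        }
        return next;
    }
}
```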
The introduced approach to robot soccer strategy is based on the overall knowledge of global game situations. Robot movements are computed after an analysis of the overall positions of all players and the ball. In fact, the robots are being moved by an omniscient operator and are not moving independently. More advanced multi-agent approaches also incorporate agent behaviors, such as ant algorithms, stigmergy algorithms, and multi-agent coordination and control using techniques inspired by the behavior of social insects. In such a game, the robot players will resolve simple or trivial gameplay situations according to their own decisions. This should mirror the instinctive behavior of robotic agents with local knowledge following their individual, simple behavior patterns. A pattern example can be as follows:

The attacker moves towards the ball (when the ball is moving towards the opponent's goal).
The attacker possessing the ball moves towards the competitor's goal.
The defender moves between the ball and its own goal area (when the ball is moving towards its own goal).
The defender moves quickly between the ball and its own goal when the opponent attacks and there are no defenders in position.
The goalie moves towards the ball in an effort to perform a kick-off when the ball is too close to the goal area. Otherwise, he guards the goal.

The strategy concept presented in the previous sections should then be used to resolve complex, non-trivial game situations (like standard situations in a real soccer game) or to incorporate surprising, innovative moves into the game. If robot activity is partly or mostly independent, there will be no need to evaluate the global game situation and search for appropriate moves in every GPS. The gained processor time can be used for improved strategy learning, game strategy optimization, and optimized game strategy applied in certain gameplay situations (the opponent's team loses the ball when attacked by more players; our own team gains an advantage by attacking with more players, posing a greater threat to the opponent's goal). This can lead to notably superior results.
Supplementary out-of-play analyses of the game history, recorded by means of the abstract grid and strategy rules, are used for fine-tuning robot behavior as described above and for the development of strategy rules fitting accurately to a particular opponent.

7 Conclusion
The main goal of the control system is to enable an immediate response in real time. The system response should be faster than the time between two frames from the camera. When the time response of the algorithm exceeds this difference, the control quality deteriorates. The method we described provides fast control. This is achieved by using rules that are fast to process. We have described a method of game representation and a method of learning game strategies from the observed movements of players.


The movements can be observed from the opponent's behaviour or from a human player's behaviour. We believe that the possibility of learning a game strategy that leads to fast control is critical for the success of robotic soccer players. As in chess-playing programs, the game strategies, along with an indication of their success, can be stored in a database and used for subsequent matches.
In the future, we want to use a modular Q-learning architecture [10]. This architecture was used to solve the action selection problem, which specifically selects the robot that needs the least time to kick the ball and assigns this task to it. The concept of the coupled agent was used to resolve conflicts in action selection among robots.

Acknowledgement
The results presented in this paper were supported by the Grant Agency of the Czech Academy of Sciences under project No. 1ET101940418.

References
1. Horák, B., Obitko, M., Smid, J., Snášel, V.: Communication in Robotic Soccer Game. Communications in Computing (2004) 295-301.
2. Holland, O., Melhuish, C.: Stigmergy, self-organisation, and sorting in collective robotics. Artificial Life (2000) 173-202.
3. Srovnal, V., Horák, B., Bernatik, R.: Strategy extraction for mobile embedded control systems apply the multi-agent technology. Lecture Notes in Computer Science, Vol. 3038. Springer-Verlag, Berlin Heidelberg New York (2004) 631-637.
4. FIRA robot soccer, http://www.fira.net/
5. Berry, M.W., Browne, M.: Understanding Search Engines: Mathematical Modeling and Text Retrieval. SIAM Book Series: Software, Environments, and Tools (1999)
6. Slywotzky, A.J., Morrison, D., Moser, T., Mundt, K., Quella, J.: Profit Patterns: 30 Ways to Anticipate and Profit from Strategic Forces Reshaping Your Business (1999)
7. Veloso, M., Stone, P.: Individual and Collaborative Behaviours in a Team of Homogeneous Robotic Soccer Agents. Proceedings of the International Conference on Multi-Agent Systems (1998) 309-316.
8. Johnson, G., Scholes, K.: Exploring Corporate Strategy: Text and Cases. FT Prentice Hall (2001)
9. Mitchell, M.: An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA (1996)
10. Park, K.H., Kim, Y.J., Kim, J.H.: Modular Q-learning based multi-agent cooperation for robot soccer. Robotics and Autonomous Systems, Elsevier, 35 (2001) 109-122.

Agent-Based Modeling of Supply Chains in Critical Situations

Jarosław Kozlak, Grzegorz Dobrowolski, and Edward Nawarecki

Institute of Computer Science, AGH University of Science and Technology, Krakow, Poland
{kozlak,grzela,nawar}@agh.edu.pl

Abstract. Supply chains are discussed in this article. They are viewed as intelligent decentralized systems that meet the agent paradigm. Thus the possibility of critical situations arising in the chains can be analyzed in an agent-based manner. The work is focused on applying an overall methodology dedicated to the discovery of crises and the support of anti-crisis activities. An agent-based model is proposed that incorporates the majority of features of supply chains, so that the modeling of crises can be wide-ranging and easy. As an illustration, some simulation results are presented.

1 Introduction
Intelligent decentralized systems that meet the agent paradigm [2] can be of two types: designed from scratch as multi-agent ones (operating in the virtual world, e.g. network information services, virtual enterprises) and existing in reality as a set of cooperating autonomous subsystems of whatever origin (e.g. transportation systems, industrial complexes) that can be analyzed in an agent-based manner.
Such systems are marked by the possibility of critical situations arising that can be caused by both outer (e.g. undesirable interference or the forces of nature) and inner (e.g. resource deficits, local damage) factors. Such is the source of the idea to embed considerations about crises in the field of multi-agent systems (MAS) [3,6].
As a consequence, in the real case an assumption must be made that the system under consideration has an agent-based conceptualization that can serve as a means for the analysis of its operation, especially in critical situations. If the description is precise enough, a simulation (computerized) model of its behavior can be built. The results of the simulation studies would be scenarios of crisis progress. Investigation of the scenarios would lead to finding a strategy for avoiding the particular crisis or, at least, reducing its effects [9,6].
Some difficulties of crisis identification, evaluation of possible effects and prevention (anti-crisis) actions come from the general features of multi-agent systems (autonomy of the agents' decisions, lack of global information) as well as from their dynamics (consequences appear after an operation in an unpredictable manner).
A set of enterprises whose activities can be characterized as production or providing services, distribution or trading of some raw materials, by-products or market goods, called a supply chain, constitutes a system that can be numbered among those discussed above, and the ideas of agents and multi-agent systems can be useful for solving the managerial problems dealing with it.

The same approach is present in research reported in a number of publications, among which it is worth mentioning here at least [7,4,5,8]. The first one contains a comparative study of the agent-based approach versus some others proposed earlier. Searching for the answer to the question whether agents that construct supply chains can do it better than human decision makers, especially with respect to changing market conditions, is the subject of [4]. In [5] the reader can find a limited overview of the discussed field, oriented to a comparison of different applications of multi-agent systems to modeling and managing the chains. As bases of the reported algorithms the following are mentioned: evolutionary algorithms, special languages (grammars), and various auction (Contract Net) and negotiation schemas.
The paper is organized as follows. The initial part describes the proposed model of a critical situation with a discussion of its most important aspects. Some assumptions and architectural solutions on how to manage the defined critical situations are also proposed. Section 3 introduces the area of application, which is the management of supply chains. Section 4 gives a description of an agent-based model that, following the main methodology assumption, is prepared to carry out simulation studies of crises in the chains. Chosen results of the simulation studies illustrating the approach are presented at the end.

2 Critical Situations in MAS

A critical situation is recognized as a particular state or sequence of states that violate or lead to the violation of the global as well as local (the agents') goals of a system. Thus critical situations can be local (concerning a single agent) or global (involving not only all but also a group of agents). The arising of a local crisis may entail a global one in the future, but the functional abilities of a system very often allow avoiding the consequences at the global level. Such a phenomenon results directly from the basic features of multi-agent systems. One may say that some anti-crisis mechanisms (in the above sense) are already incorporated. On the contrary, the threat of a global crisis usually requires especially invented mechanisms.
A crisis among a group of agents is treated here as a global one because of the similar way of describing the state and the obvious fact that such a crisis must emerge with respect to a partial or side goal of a system.
The above characteristics allow us to define the general conditions of the management of critical situations:
the possibility of observation (monitoring) of the system state based on observation of the agents' states individually,
the adoption of adequate ways of evaluating a state in order to achieve operational criteria of critical situation recognition,
the availability of appropriate anti-crisis mechanisms.
The degree of realization of the above postulates can be regarded as a determinant of the system's immunity against a crisis.


As has been signalled, a flexible-by-nature multi-agent system has some elements of these mechanisms already implemented, either as parts of the agents' algorithms or in the way of communication or organization of the system (or a sub-system).
Let us discuss the conditions first for the case of local critical situations. In the obvious way, an agent monitors his state as well as evaluates it on his own. In each state he determines a set of admissible actions. A significant reduction of this set can be an indication of a crisis. If the agent must consider actions like "do nothing" or self-destruction, it is not only an indication but also a kind of remedy.
Although the application of the agent's strategy is oriented towards a decision, it is also an evaluation of the state. If some ranking of the actions is prepared according to utility coefficients, its values can be used for the formulation of a crisis criterion. Then a decline in utility can be regarded as a sign of a crisis. Finally, if both mechanisms prove to be insufficient, the applied choice function can be augmented with an element intended for monitoring of crises and causing the activation of especially built-in anti-crisis actions.
A similar analysis with respect to global critical situations is a bit harder. This is because of the problem of determining the state of a multi-agent system. The state can be easily defined as the composition of the agents' states, but its calculation is usually operationally impossible. Such a situation comes from the following features of such a system.
There is no synchronization mechanism strong enough to determine the simultaneity of the agents' states.
The system state is highly multi-dimensional, so the high cost of information acquisition should be taken into account.
The agents are autonomous. They usually intend to disclose only as much information as is necessary for the system operation.
In the general case it is assumed that an agent reveals just a sub-space of his state or some evaluation of his state. The restriction of the state is accepted as a report, while the evaluation is regarded as subjective. Of course, the interpretation of the above information is known around the system. It is worthwhile to mention here that a state also comprises information about history, so the evaluation can have a dynamic character.
Putting all descriptions of the agents' states together, possibly in a single place, and regarding them as simultaneous is the only way to construct a description of the state of the whole system. The evaluation, similarly as in the case of a single agent, can be oriented towards critical situations. The adoption of a special shape of the evaluation functions and the appropriate definition of subsets of their values opens the possibility of specialized tracking of the system states. For example, the values can be given by linguistic values: normal, preferred, danger, crisis. In its simplest form, tracking can be just the memorization of monitoring data and their introduction into an evaluation procedure.
Following the ways of defining the global goal pointed out earlier, two kinds of critical situations can be introduced: direct and indirect. Direct means the threat of losing operability of the system as a consequence of the unavailability of some of the agents' actions. The primary cause of an indirect critical situation is a lack of resources (violation of the appropriate balance) that, in turn, gives a deficit of functionality.


3 Structure of Agent-Based Management of Crises


The previous section is the point of departure towards the idea of the structure and functionality of a system devoted to the management of critical situations. The proposed solution is intended to be universal. It also lacks the details that strongly depend on the particular considered system exposed to crises. Before such details are presented for the case of supply chains, let us analyze the general structure shown in figure 1.

Fig. 1. General Structure of Management of Critical Situations

The main elements of the structure are:
rMAS: an agent-based conceptualization of real systems of some kind; the concrete real system that accords with this conceptualization can also be pointed out;
vMAS: an agent-based computerized model of the conceptualization, thus of systems of that kind and, finally, of the concrete one.
The presented structure is a direct consequence of a methodological rule that creates the basis of the majority of research using modeling and simulation. The rule can be formulated here as follows. Pairs of mechanisms, respectively of rMAS and vMAS, are considered. For each pair two statements are true: a real mechanism is adequately modeled, or the virtual one is adequately and effectively implemented in reality. It must be pointed out that this formulation not only comprises the requirements of adequate modeling but also indicates the possibility of searching for such mechanisms based on the model (vMAS) and applying them in reality (rMAS).
Appropriate extraction and flows of data in the described structure are done by two monitoring modules, respectively rMonitor and vMonitor. The former serves as a provider of available information about the real system (rMAS); the latter is designed mainly to aggregate the generated output data of the model (vMAS), which can be regarded as a standard computer science problem. Of course, the amount of the data dictates the application of proper techniques.
As has been assumed, the invented mechanisms are to be implemented. This is represented by the block Management in the general structure, which plays, in fact, a three-fold role.


1. Working out mechanisms of crisis recognition (patterns) or anti-crisis policy which are to be implemented in rMAS at the local level of a single agent.
2. Working out global mechanisms of crisis recognition or anti-crisis policy.
3. Implementation of the global mechanisms or on-line management using previously obtained patterns.
The element Patterns represents all effects of the Management functionality, including the patterns applied in the on-line management.

4 Agent-Based Management of Supply Chains


A set of enterprises whose activities can be characterized as production or providing services, distribution or trading of some raw materials, by-products or market goods is regarded here as a supply chain. Moreover, it is assumed that the enterprises can cooperate with each other and are (or can be) managed as a whole, but only in the sense of some decentralized decision mechanisms.
Such a description indicates the fact that a supply chain constitutes a complex system and, in that case, on the one hand needs very special methods of management and, on the other hand, is fragile in the perspective of various types of crises. The management of the supply chain must take into account first and foremost that the system elements are highly autonomous and keep information about themselves strictly confidential. Thus the managerial methods applied must be of the decentralized kind as well as based on market mechanisms and legal agreements.
The model (vMAS) consists of two types of agents: enterprise agents and client agents. As additional elements, commodities which are produced or processed by the enterprises are also modeled. Enterprise agents have production lines at their disposal that realize the transformation of commodities. Agents of both types are described by three groups of parameters: configuration parameters of the lines (e.g. capacity, production cost, maintenance cost, quality of production); decision parameters that determine the behavior of the agent, together with other parameters describing the agent that can be modified by him (e.g. margin value, several weight coefficients, which are presented in the course of the article); and, finally, parameters that describe the agent as well but change as an indirect consequence of the activities of other agents (e.g. financing, prestige on a market).
The general rule of the model is as follows. In an assumed time regime, client agents place orders for chosen final products. Enterprise-agents that are able to fulfill some of those orders try to construct adequate supply chains. Usually there are several enterprise-agents engaged in a single chain. Finally, as a result of several auctions, the whole system can be organized and the orders fulfilled.
The model assumes a set of parameters that describe a production line. The parameters depend on each other according to the following equation:

k = (a·q · d·w) / (b·c)    (1)

where k is the maintenance cost of a line, q the quality of production, c the production cost, w the capacity, and a, b, d the weight coefficients of the parameters q, c, w, respectively.

Interchange of goods is modeled with a kind of market. Entering the markets, suppliers and customers evaluate the interesting sale and purchase offers. The formulas used in the model can be sketched as in eq. (2), the evaluation of a purchase offer, and eq. (3), the evaluation of a sale offer:

o_b = o_b(p, pw, n, nw, t, tw)    (2)

where p is a representation of the customer's prestige, t the delivery time needed, n the quantity, and pw, tw, nw the weight coefficients of the parameters p, t, n respectively;

o_s = o_s(p, n, q, t, c, pw, qw, tw, cw)    (3)

where p is the supplier's prestige, t the offered delivery time, n the quantity, q the offered quality, c the price, and pw, tw, qw, cw the weight coefficients of the parameters p, t, q, c.
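The paper specifies only the arguments and weights of o_b and o_s, so the weighted-sum forms below (a higher value meaning a more attractive offer) are assumptions of this sketch, not the authors' formulas.

```java
// Possible shape of the offer evaluations from eqs. (2) and (3); the signs
// express that prestige, quantity and quality make an offer more attractive,
// while delivery time and price make it less attractive.
final class OfferEvaluation {

    // eq. (2): evaluation of a purchase offer by a supplier (assumed form)
    static double purchaseOffer(double prestige, double quantity, double deliveryTime,
                                double pw, double nw, double tw) {
        return pw * prestige + nw * quantity - tw * deliveryTime;
    }

    // eq. (3): evaluation of a sale offer by a customer (assumed form); the price
    // is taken for the whole requested quantity, which is one possible reading
    static double saleOffer(double prestige, double quantity, double quality,
                            double deliveryTime, double price,
                            double pw, double qw, double tw, double cw) {
        return pw * prestige + qw * quality - tw * deliveryTime - cw * price * quantity;
    }
}
```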
Next, the best offers are chosen and a kind of agreement is set. It is assumed that all the agreements are realized in turn. The decision (auction) algorithm is the same for the whole model but differs in the parameters of each enterprise-agent. Moreover, there is a mechanism which realizes optimal tuning of the parameters in the course of the simulation. The mechanism models the possibility of forecasting studies by a real enterprise. It is based on the idea of evolutionary computation.

Fig. 2. Simulation model

Whenever an enterprise-agent faces a decision, the main stream of the simulation is suspended and the predicting mechanism is started (see fig. 2). At the beginning, a population of enterprise-agents is generated as a pseudo-random variation (mutation) of the parameters of the agent which represents the enterprise under consideration.


A number of sub-simulations with a given (predicting) horizon are carried out for each agent of the population in turn. Next, each such variant is evaluated on the basis of the value of the enterprise capital gained, and the best one is chosen. Then the main stream is resumed and the best agent can work. The expression assumed in the model to represent the capital of an enterprise is as follows:
u = f(l, c_o, g_b) + g(e)    (4)

where f is a function that describes the changes of the capital, l the maintenance cost that depends on the parameters of a line, c_o the total cost of raw materials, g_b the profit gained as a result of the winning auction, and g(e) the profit that is gained.
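A minimal sketch of the predictive tuning loop described above is given below. The parameter vector, the Gaussian mutation and all names are illustrative assumptions, while the structure (mutate the decision parameters, run bounded sub-simulations, keep the variant with the highest capital u) follows the text.

```java
// Sketch of the predictive mechanism: mutate the enterprise-agent's decision
// parameters, evaluate each variant in a bounded sub-simulation and adopt the
// variant that gains the most capital before resuming the main simulation.
import java.util.Random;
import java.util.function.ToDoubleFunction;

final class PredictiveTuning {
    static final Random RND = new Random();

    /** Mutates decision parameters (e.g. margin, weight coefficients). */
    static double[] mutate(double[] params, double sigma) {
        double[] m = params.clone();
        for (int i = 0; i < m.length; i++) m[i] += RND.nextGaussian() * sigma;
        return m;
    }

    /**
     * @param current       the agent's current decision parameters
     * @param subSimulation runs a sub-simulation over the predicting horizon
     *                      and returns the capital u gained by the variant
     */
    static double[] bestVariant(double[] current, int populationSize, double sigma,
                                ToDoubleFunction<double[]> subSimulation) {
        double[] best = current;
        double bestCapital = subSimulation.applyAsDouble(current);
        for (int i = 0; i < populationSize; i++) {
            double[] candidate = mutate(current, sigma);
            double capital = subSimulation.applyAsDouble(candidate);
            if (capital > bestCapital) { bestCapital = capital; best = candidate; }
        }
        return best;   // the main simulation resumes with these parameters
    }
}
```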
The implementation of the model [1] is an application of a medium degree of complication. It has a layered software architecture. The following layers can be listed to give a general overview of it: the Java Virtual Machine; JADE, the Java Agent DEvelopment platform that conforms to FIPA, the Foundation for Intelligent Physical Agents; JFreeChart, a library for chart programming; VMFr, a framework dedicated to MAS simulation; and Virtual Market, the layer of supply chain modeling in the presence of critical situations. The two upper layers constitute the original part of the software.
The presented model is extensively used as a simulation tool for studies of supply chains. The analyzed problems are: searching for the best configuration of chains, the values of the agents' decision parameters, the applied auction protocols, the granularity of the representation of the chains, modeling the environment and its influence on the chains and, of course, forecasting and discovery of critical situations and simulation-based formation of anti-crisis policies, among others.
As an illustration, the authors decided to show here results dealing with one possible critical situation, which is a drop in the quality of products. The drop can entail a repair of the line. The alteration cost and the parameters of the line are taken into account.

Fig. 3. a) Drop and increase of the quality of production together with changes of the capital in time, b) quality of production in time for two strategies: with or without prediction

The charts that group some chosen experiment results are reproduced in figure 3. Chart a shows the changes in time of the quality of the product under analysis. As a consequence of line deterioration, the quality falls successively. At some moment a repair starts and an increase of the quality follows. The capital of the enterprise shows the same tendency as the quality: initially it decreases, as it is hard to sell a product of poor quality; then things go better.


Chart b is based on the same scenario (crisis), but the influence of the alteration cost is presented. Consideration of the capital allows us to deepen the problem and search for an anti-crisis policy. As the alteration cost is relatively high, the repair is done "incrementally", at minimum cost, just to restore the quality. Then the two strategies are analyzed: a conservative one, in which the repair is initiated when the quality decreases beneath some assumed level, and an active one, with prediction of the parameters of the line.

5 Summary
The considerations of the article concern the application of the agent-based approach to the problem of the management of critical situations. Design assumptions and a proposal of the overall architecture of the (sub-)system dedicated to the discovery of crises and the support of anti-crisis activities are described.
One of the possible applications of the presented approach can be found in the field of supply chains, which operate in a highly dynamic and uncertain environment of economic, social, and political interdependencies that are a kind of generator of critical situations. The considerations are carried out on the basis of an originally invented model which incorporates the majority of features of supply chains.
The simulation experiments shortly presented in the article confirm the main ideas of the approach at the general as well as specific levels. Future work will concentrate on further studies of the model.

References
1. Bogucki, P.: A Platform for Production Planning Problems Solving Using Agent-based Approach (in Polish), M.Sc. thesis, Dept. of Comp. Sci., Univ. of Science and Technology, Krakow, Poland, 2005.
2. Bussman, S., Jennings, N.R., Wooldridge, M.: Multiagent Systems for Manufacturing Control. A Design Methodology, Springer-Verlag, 2004.
3. Dobrowolski, G., Nawarecki, E.: Crisis management via agent-based simulation. In B. Dunin-Keplicz et al., editors, Monitoring, Security, and Rescue Techniques in Multiagent Systems, Advances in Soft Computing, pages 551-562. Springer, 2005.
4. Kimbrough, S.O., Wu, D.J., Zhong, F.: Computers Play the Beer Game: Can Artificial Agents Manage Supply Chains? Decision Support Systems 33(3): 323-333, 2002.
5. Moyaux, T.: Design, Simulation and Analysis of Collaborative Strategies in Multi-Agent Systems: The Case of Supply Chain Management, Ph.D. thesis, Universite Laval, 2004.
6. Nawarecki, E., Kozlak, J., Dobrowolski, G., Kisiel-Dorohinicki, M.: Discovery of crises via agent-based simulation of a transportation system. In M. Pechoucek et al., editors, Multi-Agent Systems and Applications IV, volume 3690 of LNCS, pages 132-141. Springer, 2005.
7. Parunak, V.: Applications of Distributed Artificial Intelligence in Industry. In O'Hare and Jennings, eds., Foundations of Distributed Artificial Intelligence. Wiley, 1994.
8. Swaminathan, J.M., Smith, S.F., Sadeh, N.M.: Modeling Supply Chain Dynamics: A Multiagent Approach. Decision Sciences, Volume 29, Number 3, 1998.
9. Zhao, Z., et al.: Scenario switches and state updates in an agent-based solution to constructing interactive simulation systems. In Proc. of CNDS 2002, pages 3-10, 2002.

Web-Based Integrated Service Discovery Using Agent Platform for Pervasive Computing Environments
Kyu Min Lee, Dong-Uk Kim, Kee-Hyun Choi, and Dong-Ryeol Shin
School of Information and Communication Engineering,
Sungkyunkwan University,
300 Cheoncheon-dong, Jangan-gu, Suwon, Gyeonggi-do 440-746, Korea
{kmlee, tonykim, gyunee, drshin}@ece.skku.ac.kr

Abstract. Although current service discovery protocols provide the same basic
function of service discovery, they differ significantly in architecture, message
exchange pattern, expected operating environment and service representation/description. These differences prevent service interoperability among
protocols. To solve this problem, we propose a web-based integrated service
discovery mechanism, using an agent platform to guarantee scalability of scope
of available services without modifying existing service discovery protocols.
The proposed web-based integrated service discovery mechanism provides users with a wider selection of services, and convenient search methods.

1 Introduction
The increase in the number of different services in pervasive computing environments makes people need service discovery protocols, which help them conveniently find the services they want. Representative service discovery protocols can be classified into those coming from academia (INS, INS/Twine, SSDS, Splendor) and those from industrial standardization bodies (Bluetooth SDP, SLP, UPnP, Jini, Salutation).
Although they provide the same basic function of service discovery, they differ significantly in architecture, message exchange pattern, expected operating environment and service representation/description. These differences prevent service interoperability among protocols.
To solve this problem, several approaches [1] [2] [3] that support interoperability among service discovery protocols have been proposed. These approaches, however, have limitations such as translation loss, high maintenance cost, high management cost of service information, complexity of development, and the need to modify existing services.
In this paper, we propose an integrated service discovery mechanism based on an agent platform and the Web, to guarantee the scalability of the scope of available services without modifying existing service discovery protocols.
The remainder of the paper is organized as follows. Section 2 describes our proposed mechanism and system architecture in detail. Section 3 presents the conclusion.


2 Proposed Mechanism and System Architecture

In this section, we explain the integrated service discovery mechanism and the system architecture in detail.
2.1 Integrated Service Discovery Mechanism
The integrated service discovery mechanism provides users with scalability of services without any modification of existing service discovery protocols. The key design decision of the mechanism is that service discovery protocols multicast the state of services, such as service registration and deregistration. Each protocol has a well-known IP address and port, so an exterior user or agent can detect the state of services by joining the corresponding group, i.e. 239.255.255.250:1900 for UPnP and 239.255.255.253:2427 for mSLP. Figure 1 shows the integrated service discovery mechanism.

Fig. 1. Integrated Service Discovery

The Message Monitoring module listens to the messages about services coming from each service discovery protocol and sends them to the Message Parser. The Message Parser parses them and registers the parsed data with the Directory Facilitator (DF), which is the agent that provides a yellow-page service in the agent platform. Finally, agents can freely use the services registered in the DF. From the agents' point of view, the scope of available services is thereby extended. The following section provides a more detailed explanation of this mechanism.
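A hedged sketch of the listening step is shown below, using only standard java.net multicast calls and the two well-known groups quoted above; the class name and the hand-off to the parser (here just a printout) are assumptions, and the JADE/DF registration step is omitted.

```java
// Join the well-known multicast groups (SSDP for UPnP, mSLP) and hand every
// received datagram over to the parsing stage. Error handling is minimal.
import java.net.*;
import java.nio.charset.StandardCharsets;

public class MessageMonitor implements Runnable {
    private final InetAddress group;
    private final int port;

    public MessageMonitor(String groupAddr, int port) throws UnknownHostException {
        this.group = InetAddress.getByName(groupAddr);
        this.port = port;
    }

    @Override
    public void run() {
        try (MulticastSocket socket = new MulticastSocket(port)) {
            socket.joinGroup(group);
            byte[] buf = new byte[8192];
            while (!Thread.currentThread().isInterrupted()) {
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet);
                String msg = new String(packet.getData(), 0, packet.getLength(),
                                        StandardCharsets.UTF_8);
                // Forward the raw announcement (e.g. ssdp:alive / byebye) to the
                // Message Parser, which would register the service with the DF.
                System.out.println("received " + packet.getLength() + " bytes: "
                                   + msg.split("\r\n", 2)[0]);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        new Thread(new MessageMonitor("239.255.255.250", 1900)).start();   // UPnP/SSDP
        new Thread(new MessageMonitor("239.255.255.253", 2427)).start();   // mSLP
    }
}
```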


2.2 System Architecture


Figure 2 shows the proposed system architecture for web-based integrated service
discovery using the Java Agent DEvelopment Framework (JADE), which is one of
the most popular FIPA-compliant agent platforms.

Fig. 2. System Architecture for Web-based Integrated Service discovery

The system consists of the Agent Management System (AMS) and the DF, which are essential agents of the FIPA-compliant agent platform; the RMA, which monitors the state of agents; the Discovery Agent; and the Service Proxy Agent. The Discovery Agent is needed for heterogeneous service discovery. The function of the Service Proxy Agent is to guarantee collaboration with a web server. The system processing flow starts in the AP-to-Web Communication module, which sends its host's IP address and port number to the web server. When a service search is requested from the web server, the Authentication module judges whether the client using the web server has either view authority or invocation authority. The Service Search module then searches the available services and transmits the results to the web server. The Invocation Transmission module sends an invocation message about the service requested by the client via the web server to the Remote Invocator module.
Figure 3 depicts the conceptual model of the web-based integrated service discovery system. The architecture is composed of three environments. First, there are domains connected with service discovery protocols such as UPnP, SLP and Jini. Second, there is a JADE environment, as shown in Figure 2, which detects the messages of services existing within each service discovery protocol. Finally, there is a web server that supports viewing the available services and invoking them.


Fig. 3. The Conceptual Model

The web server requests the list of available services from the pre-registered agent platform, according to the client's request, and then presents the result as a web page.

3 Conclusion
The main contribution of this paper is an integrated service discovery mechanism for heterogeneous services. It is based on an agent platform and the Web, to guarantee the scalability of the scope of available services without modifying existing service discovery protocols. The proposed integrated service discovery mechanism provides users with a greater variety of services and convenient search methods.

Acknowledgements
This research is supported by the ubiquitous Computing and Network (UCN) Project,
the Ministry of Information and Communication (MIC) 21st Century Frontier R&D
Program in Korea.

References
1. Erik Guttman and James Kempf. Automatic Discovery of Thin Servers: SLP, Jini and the
SLP-Jini Bridge. In Proceedings of the 25th Annual Conference of the IEEE Industrial Electronics Society, pages 722-727, 1999.
2. Hiroo Ishikawa et al. A Framework for Connecting Home Computing Middleware. In Proceedings of the IWSAWC2002, 2002.
3. J. Allard, V. Chinta, S. Gundala and G.G Richard. Jini Meets UPnP: An Architecture for
Jini/UPnP Interoperability. In Proceedings of the 2003 Symposium on Applications and the
Internet, Orlando, 2003.

A Novel Modeling Method for Cooperative Multi-robot Systems Using Fuzzy Timed Agent Based Petri Nets*

Hua Xu¹ and Peifa Jia¹,²

¹ State Key Lab of Intelligent Technology and Systems, Tsinghua University, Beijing, 100084, P.R. China
² Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, P.R. China
{xuhua, dcsjpf}@mail.tsinghua.edu.cn

Abstract. This paper proposes a cooperative multi-robot system (CMRS) modeling method called fuzzy timed agent based Petri nets (FTAPN), which has been extended from fuzzy timed object-oriented Petri nets (FTOPN). The proposed FTAPN can be used to model and illustrate both the structural and dynamic aspects of CMRS. Supervised learning is supported in FTAPN. As a special type of high-level object, the agent is introduced and used as a common modeling object in FTAPN models. The proposed FTAPN can not only be used to model CMRS and represent system aging effects, but can also be refined into an object-oriented implementation easily. At the same time, it can be regarded as a conceptual and practical artificial intelligence (AI) tool for bringing multi-agent systems (MAS) into the mainstream practice of software development.
Keywords: Fuzzy, Agent, Petri nets, Object-oriented, Multi-robot system.

1 Introduction
Characterized by cooperation and high efficiency, cooperative multi-robot systems (CMRS) have emerged as common manufacturing equipment in current industries [1]. Differing from generic control systems, cooperation needs to be considered in the realization of CMRS [1], so system modeling, analysis and refinement always meet with difficulties. CMRS can be regarded as a typical multi-agent system (MAS) in distributed artificial intelligence [2]. For modeling MAS, object-oriented methodology has been tried and some typical agent objects have been proposed, such as the active object [3]. However, agent-based object models still cannot depict the structural and dynamic aspects of MAS, such as cooperation, learning, and temporal constraints [2].
This paper proposes a high-level PN called fuzzy timed agent based Petri net (FTAPN) on the basis of FTOPN [4], and it is organized as follows. Section 2 reviews the concept of FTOPN and extends FTOPN to FTAPN on the basis of the ACTALK model.
* This work is jointly supported by the National Nature Science Foundation (Grant No: 60405011, 60575057) and the China Postdoctoral Foundation for China Postdoctoral Science Fund (Grant No: 20040350078).



ACTALK model. Section 3 uses FTAPN to model a typical CMRS in the wafer etching procedure of the circuit industry and gives some modeling analysis to demonstrate its benefits in modeling MAS. Finally, the conclusion and future work can be found in Section 4.

2 Basic Concepts
2.1 FTOPN
Definition 1. FTOPN is a six-tuple FTOPN = (OIP, ION, DD, SI, R, I), where OIP, ION, DD and SI are the same as those in HOONet [5] and TOPN [6], and
1) R: {OIP} → r, where r is a specific threshold;
2) I is a function of the time v. It evaluates the resulting degree of the abstract object firing.

Fig. 1. The General Structure of FTOPN
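As a rough illustration of how such a six-tuple and its threshold-based firing rule might be represented in software, the following minimal Python sketch is given; the class and attribute names (FTOPNObject, can_fire, etc.) are illustrative assumptions and not part of the original FTOPN definition.

# Minimal sketch of an FTOPN-style abstract object (illustrative names only).
from dataclasses import dataclass, field
from typing import Callable, Dict, Any

@dataclass
class FTOPNObject:
    oip: str                                            # object identification place
    ion: Any = None                                      # internal object net (abstracted here)
    dd: Dict[str, Any] = field(default_factory=dict)     # data dictionary
    si: Dict[str, Any] = field(default_factory=dict)     # static interface
    r: float = 0.5                                       # threshold assigned to the OIP
    i: Callable[[float], float] = lambda v: 1.0          # time-dependent firing degree

    def can_fire(self, v: float) -> bool:
        # The abstract object fires when its evaluated degree reaches the threshold r.
        return self.i(v) >= self.r

robot = FTOPNObject(oip="Agent1", r=0.6, i=lambda v: max(0.0, 1.0 - 0.1 * v))
print(robot.can_fire(2.0))   # degree 0.8 >= 0.6 -> True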

2.2 Agent Object and FTAPN


The active object concept [3] has been proposed to describe a set of entities that cooperate and communicate through message passing. ACTALK is a framework for implementing various active object models within one object-oriented language. In ACTALK, an active object is composed of three component classes: address, activity and activeObject [3].

Fig. 2. The FTOPN Model of ACTALK


The parameters of an FTAPN are usually given beforehand. In general, however, these parameters may not be available and need to be estimated, just like those in FTPN [7]. The estimation is conducted on the basis of experimental data concerning the marking of the input and output places, provided as a discrete time series. More specifically, the marking of the output place(s) is treated as a collection of target values to be followed during the training process; that is, the learning is carried out in a supervised mode with respect to these target data. The learning method is the same as in FTOPN [4].

3 A Modeling Example
3.1 A CMRS Model
In etching tools there is usually a CMRS platform made up of two transferring robots. These two cooperative robots are responsible for transferring an unprocessed wafer from the input lock to the chamber and for fetching the processed wafer to the output lock. Either robot can be used to complete the transferring task at any time. If one robot is transferring a new wafer, the other will conduct the fetching task, so they never conflict with each other. Fig. 3 depicts this CMRS FTAPN model, where two agent objects (ACTALK) are used to represent the two cooperative robots. Fig. 4 depicts the time relevance rules.
(Figure: (a) the agent-based FTAPN model of the CMRS, with places InputLock, Transfer, Fetch and OutputLock, timing intervals [a,b], ..., [a7,b7], and the two Actalk agent objects Agent1 and Agent2; (b) the behavior model in every agent.)

Fig. 3. The FTAPN Model

Fig. 4. The Relevance Rules


4 Conclusions
The cooperative multi-robot system is a typical kind of manufacturing equipment in industry. In order to model, analyze and simulate this kind of CMRS, this paper proposes the fuzzy timed agent based Petri net (FTAPN) on the basis of FTOPN [4] and FTPN [7]. In FTAPN, one of the active object models, ACTALK, is introduced and used as the basic agent object to model CMRS. Every abstract object in FTOPN can be trained and reduced independently according to the modeling and analysis requirements, thanks to the OO concepts supported in FTOPN. The validity of this modeling method has been demonstrated by modeling the CMRS platform in etching tools. FTAPN can not only model complex MAS, but can also be refined into an object-oriented implementation easily. It provides a methodology to overcome the development problems in agent-oriented software engineering. At the same time, it can also be regarded as a conceptual and practical artificial intelligence (AI) tool for integrating MAS into the mainstream practice of software development.

References
[1] Cao, Y.U., Fukunaga, A.S., Kahng, A.B., Meng, F.: Cooperative Mobile Robotics: Antecedents and Directions. Autonomous Robots, 4 (1997) 7-27
[2] Jennings, N.R., Sycara, K., Wooldridge, M.: A Roadmap of Agent Research and Development. Autonomous Agents and Multi-Agent Systems, 1 (1998) 7-38
[3] Guessoum, Z., Briot, J.-P.: From Active Objects to Autonomous Agents. IEEE Concurrency, 7(3) (1999) 68-76
[4] Xu, H., Jia, P.: Fuzzy Timed Object-Oriented Petri Net. Artificial Intelligence Applications and Innovations II - Proceedings of AIAI2005, Berlin Heidelberg New York (2005) 155-166
[5] Hong, J.E., Bae, D.H.: Software Modeling and Analysis Using a Hierarchical Object-oriented Petri Net. Information Sciences, 130 (2000) 133-164
[6] Xu, H., Jia, P.F.: Timed Hierarchical Object-Oriented Petri Net - Part I: Basic Concepts and Reachability Analysis. Proceedings of RSKT2006, Lecture Notes in Artificial Intelligence, Vol. 4062, Berlin Heidelberg New York (2006) 727-734
[7] Pedrycz, W., Camargo, H.: Fuzzy Timed Petri Nets. Fuzzy Sets and Systems, 140 (2003) 301-330

Performance Evaluation of Fuzzy Ant Based Routing Method for Connectionless Networks

Seyed Javad Mirabedini¹ and Mohammad Teshnehlab²

¹ Ph.D. Student of Computer Software, Tehran Markaz Branch of Islamic Azad University, Tehran, Iran
jvd2205@yahoo.com
² Electrical Eng., K.N. Toosi University, Tehran, Iran
teshnehlab@eetd.kntu.ac.ir

Abstract. This paper introduces a novel algorithm called FuzzyAntNet, inspired by swarm intelligence and optimized by fuzzy systems. FuzzyAntNet is a new routing algorithm constructed on the communication model observed in ant colonies and enhanced with fuzzy systems. Two special characteristics of this method are scalability to network changes and the capability to recognize the best route from source to destination with low delay, low traffic and high bandwidth. Using ants (or agents) in this method helps to avoid congestion in data packet transmission: ants walk through paths and intermediate routers and gather information about their delay and congestion; in return, they update the current delay of each visited link and consequently bring the routing probability table of every traversed router up to date. We compare FuzzyAntNet with other routing algorithms such as AntNet and Destination Sequenced Distance Vector (DSDV).
Keywords: Connectionless Networks, Routing Algorithms, Ant Colony System, Fuzzy Systems, Fuzzy Ant Based Routing, Probabilities Table, Fuzzy-delay.

1 Introduction
Routing algorithms in modern networks must address numerous problems. Current
routing algorithms are not adequate to tackle the increasing complexity of such
networks. Centralized algorithms have scalability problems; static algorithms have
trouble keeping up-to-date with network changes; and other distributed and dynamic
algorithms have oscillations and stability problems. One of the applications of Ant
Colony Optimization in network routing is swarm intelligence routing which provides
a promising alternative to these approaches. Swarm intelligence utilizes mobile
software agents for network management. These agents are autonomous entities, both
proactive and reactive, and have the capability to adapt, cooperate and move
intelligently from one location to the other in the communication network. In this
paper we discuss routing algorithms in Section 2, review ant based routing in Section 3, present FuzzyAntNet in Section 4, and finally give the simulation and results in Section 5.


2 Routing Algorithms
Routing decisions can only be made on the basis of local and approximate
information about the current and the future network states, with additional
constraints posed by the network switching and transmission technology [1]. In this
article we focus on wide area networks, that is, irregular topology datagram networks
with an IP-like (Internet Protocol) network layer and a very simple transport layer.
The instance of the communication network is mapped on a directed weighted graph
with N nodes. All the links are viewed as having two parameters characterized by a
bandwidth and a transmission delay, and are accessed following a statistical
multiplexing scheme.

3 Ant Based Routing


In the Ant-based routing algorithm, routing is determined by means of very complex
interactions of forward and backward network exploration ants. The idea behind this
subdivision of agents is to allow the backward ants to utilize the useful information
gathered by the forward ants on their trip from source to destination. Based on this
principle, no node routing updates are performed by the forward ants. The backward
ants inherit this raw data and use it to update the routing table of the nodes [2], [3].
Table 1. Ant-Based Routing Table for Node A

Node A          Neighbor Node
Destination     B       C
E               0.35    0.65
F               0.40    0.60

In Table 1, a probability value $P_{dn}$ expresses the probability of choosing n as the neighbor node when the destination node is d, with the constraint defined in (1):

$$\sum_{n \in N_k} P_{dn} = 1, \qquad d \in [1, N], \quad N_k = \{\mathrm{neighbors}(k)\}. \tag{1}$$
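To make the use of such a table concrete, here is a minimal Python sketch (not from the paper; the dictionary layout and function names are illustrative assumptions) of a per-node probabilistic routing table satisfying constraint (1) and of stochastic next-hop selection:

import random

# Routing table of node A: for each destination, a probability per neighbor.
# Each row sums to 1, as required by constraint (1).
routing_table_A = {
    "E": {"B": 0.35, "C": 0.65},
    "F": {"B": 0.40, "C": 0.60},
}

def choose_next_hop(table, destination):
    # Draw a neighbor at random according to the probabilities P_dn.
    neighbors, probs = zip(*table[destination].items())
    return random.choices(neighbors, weights=probs, k=1)[0]

print(choose_next_hop(routing_table_A, "E"))  # "B" with prob. 0.35, "C" with prob. 0.65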

4 Fuzzy Ant Based Routing Method


In this section we describe our novel algorithm, which we call FuzzyAntNet. FuzzyAntNet is constructed on the stigmergy communication model observed in ant colonies and combined with fuzzy systems [4], [5], [6]. In this algorithm every link between two nodes i and j is addressed as link_ij, and there are two parameters for each link_ij: Delay_ij and Bandwidth_ij. There are five membership functions for the first input variable (Delay_ij), five membership functions for the second input variable (Bandwidth_ij), and nine membership functions for the output variable (Fuzzy-Delay_ij). All membership functions are triangular, because triangular functions suppress noise and, in comparison with Gaussian membership functions, offer practically the same precision with much simpler and cheaper computation. Before applying their values, Delay_ij is normalized to (0, 1) and Bandwidth_ij is normalized to (0, 0.5). The inference engine used is the product (multiplication) engine. Table 2 shows the rule base of the fuzzy system. In this table the values for the amount of goodness, from lowest to highest, are LL (Very Low), LM, LH, ML, MM, MH, HL, HM, and HH (Very High).
Table 2. Rule Base for the FuzzyAntNet Method

X2 (Bandwidth) \ X1 (Delay)    VL      L       M       H       VH
VL (Very Low)                  MM      ML      LH      LM      LL
L (Low)                        MH      MM      ML      LH      LM
M (Medium)                     HL      MH      MM      ML      LH
H (High)                       HM      HL      MH      MM      ML
VH (Very High)                 HH      HM      HL      MH      MM

As shown in Table 2, there are 25 rules in this fuzzy system. Some of them are as follows:

R1: If x1 (Delay) is VL and x2 (Bandwidth) is VL then y (goodness) is MM
R25: If x1 (Delay) is VH and x2 (Bandwidth) is VH then y (goodness) is MM


The output of the fuzzy system is named Fuzzy_Delay^k_ij(t). In the simulation it replaces delay_ij and is computed by (2):

$$\mathrm{Fuzzy\_Delay}^{k}_{ij}(t) = \frac{\displaystyle\sum_{l=1}^{M} y^{l} \prod_{i=1}^{n_f} A_i^{l}(x_i)}{\displaystyle\sum_{l=1}^{M} \prod_{i=1}^{n_f} A_i^{l}(x_i)} \tag{2}$$

where the parameters are:
i: the node the ant is coming from;
j: the node the ant wants to move to;
M: the number of fuzzy rules used (M = 25);
n_f: the number of input variables (n_f = 2);
A_i^l(x_i): the fuzzy membership value of input x_i in rule l;
y^l: the output value associated with rule l.
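As a rough illustration of the inference just described (triangular memberships, product engine, and the weighted average in (2)), the following Python sketch is offered; the membership-function breakpoints and the crisp output values assigned to the goodness labels are illustrative assumptions rather than the exact parameters used by the authors:

import numpy as np

def tri(x, a, b, c):
    # Triangular membership function with support [a, c] and peak at b.
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

# Five evenly spaced triangular sets on [0, 1] for the normalized delay,
# and on [0, 0.5] for the normalized bandwidth (assumed placement).
delay_sets = [(-0.25, 0.0, 0.25), (0.0, 0.25, 0.5), (0.25, 0.5, 0.75),
              (0.5, 0.75, 1.0), (0.75, 1.0, 1.25)]
bw_sets = [(-0.125, 0.0, 0.125), (0.0, 0.125, 0.25), (0.125, 0.25, 0.375),
           (0.25, 0.375, 0.5), (0.375, 0.5, 0.625)]

# Crisp output value assumed for each of the nine goodness labels LL..HH.
goodness = {"LL": 0.1, "LM": 0.2, "LH": 0.3, "ML": 0.4, "MM": 0.5,
            "MH": 0.6, "HL": 0.7, "HM": 0.8, "HH": 0.9}

# Rule table from Table 2: rows = bandwidth level, columns = delay level.
rules = [["MM", "ML", "LH", "LM", "LL"],
         ["MH", "MM", "ML", "LH", "LM"],
         ["HL", "MH", "MM", "ML", "LH"],
         ["HM", "HL", "MH", "MM", "ML"],
         ["HH", "HM", "HL", "MH", "MM"]]

def fuzzy_delay(delay, bandwidth):
    # Product inference engine with centre-average defuzzification, eq. (2).
    num, den = 0.0, 0.0
    for bi, bset in enumerate(bw_sets):
        for di, dset in enumerate(delay_sets):
            w = tri(delay, *dset) * tri(bandwidth, *bset)   # rule firing strength
            num += w * goodness[rules[bi][di]]
            den += w
    return num / den if den > 0 else 0.0

print(fuzzy_delay(0.2, 0.3))   # defuzzified output for a low-delay, high-bandwidth link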
In FuzzyAntNet, nodes launch forward ants at regular intervals. A forward ant keeps track of the visited nodes in a stack J_k and of their associated cost Fuzzy_Delay^n_{j,d}; this cost covers the waiting time in the queue and the transmission delay for each visited node n. The cost Sum_Fuzzy_Delay^n_{j,d} is defined as the sum of all the delay costs from node n to the destination node d. Once the destination d is reached,


then a backward ant is launched, which updates the distance estimation Sum_Fuzzy_Delay^n_{j,d} from node n to d via j as shown in (3):

$$\mathrm{Sum\_Fuzzy\_Delay}^{n}_{j,d}(t) = (1-\eta)\,\mathrm{Sum\_Fuzzy\_Delay}^{n}_{j,d}(t-1) + \eta\,\mathrm{Sum\_Fuzzy\_Delay}^{k}_{n,d}, \tag{3}$$

where $\eta$ is the learning rate, set to 0.7. The routing table probabilities are updated by (4):

$$P^{n}_{j,d}(t) = \frac{\left[1/\mathrm{Sum\_Fuzzy\_Delay}^{n}_{j,d}(t)\right]^{\delta}}{\displaystyle\sum_{l \in \mathrm{Neighbor}(n)} \left[1/\mathrm{Sum\_Fuzzy\_Delay}^{n}_{l,d}(t)\right]^{\delta}}, \tag{4}$$

where the sum runs over the neighbors l of node n and $\delta$ is a non-linearity factor; in this experiment $\delta$ is set to 1.
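A minimal Python sketch of the backward-ant update, assuming the reconstructions of (3) and (4) above (the variable names and data layout are illustrative, not taken from the paper):

# One backward-ant update at node n for destination d, via neighbor j.
ETA = 0.7     # learning rate in (3)
DELTA = 1.0   # non-linearity factor in (4)

def backward_ant_update(sum_delay, probs, n, j, d, reported_cost):
    # (3): exponential update of the estimated cost from n to d via j.
    old = sum_delay[(n, j, d)]
    sum_delay[(n, j, d)] = (1.0 - ETA) * old + ETA * reported_cost

    # (4): renormalize the routing probabilities of node n towards d.
    neighbors = [l for (nn, l, dd) in sum_delay if nn == n and dd == d]
    weights = {l: (1.0 / sum_delay[(n, l, d)]) ** DELTA for l in neighbors}
    total = sum(weights.values())
    for l in neighbors:
        probs[(n, l, d)] = weights[l] / total

# Tiny example with two neighbors of node "n1" for destination "d9".
sum_delay = {("n1", "a", "d9"): 4.0, ("n1", "b", "d9"): 6.0}
probs = {}
backward_ant_update(sum_delay, probs, "n1", "a", "d9", reported_cost=2.0)
print(probs)   # neighbor "a" now gets a higher probability than "b"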

5 Simulation and Results


In our experiments, we compared FuzzyAntNet to a set of state-of-the-art algorithms in a realistic network simulator, constructed with object-oriented programming in C++, and used Network Simulator 2 (ns-2) as the Constant Bit Rate (CBR) traffic generator. Both data packets and ant packets are 512 bytes long. The rate of ant packets (the overhead) for both AntNet and FuzzyAntNet is only one percent. We used a typical network topology called TypicalNet, shown in Fig. 1.
(Figure: a network topology with numbered nodes; the traffic source and destination nodes are marked, and each link is labeled with its delay and bandwidth.)

Fig. 1. TypicalNet. On every link, its delay and bandwidth are shown respectively.

In TypicalNet, the traffic is generated by source node 1 and the destination is
node 9. Standard performance metrics are: Throughput, End-2-End Delay, Packet
Delivery, and Packet Drop Ratio. Comparisons of End-2-End Delay are performed for
DSDV, AntNet, and FuzzyAntNet and the results are shown in Fig. 2.
We also summarized the experimental results in Table 3 which shows that
FuzzyAntNet outperforms other routing methods in all evaluation metrics.


Fig. 2. Simulation results for End-2-End delay. In this software-simulated network during 20
seconds, the routing algorithms have been executed in the same traffic state.
Table 3. Results obtained by the three algorithms DSDV, AntNet, and FuzzyAntNet

Standard Criteria            DSDV (Overhead 0%)   AntNet (Overhead 1%)   FuzzyAntNet (Overhead 1%)
Avg. End-to-End Delay (s)    4.43                 4.36                   2.99
Avg. Throughput (kbps)       280.8                189.2                  375
Packet Delivery Ratio (%)    70                   47                     93
Packet Drop Ratio (%)        30                   53                     -

Conclusions

In this paper we introduced a novel method called FuzzyAntNet, which proved to be a scalable and robust mechanism able to reach stable behavior even in a changing network environment. FuzzyAntNet outperformed the other methods in all metrics in this simulation. It also achieves good utilization of the network, balancing the data packets across the network, which reduces congestion and avoids packet drops.

References
1. Bertsekas, D., Gallager, R.: Data Networks. Prentice-Hall, Englewood Cliffs (1992)
2. Di Caro, G., Dorigo, M.: Ant Colonies for Adaptive Routing in Packet-Switched Communications Networks. In: Proc. PPSN V - Fifth International Conference on Parallel Problem Solving from Nature (1998)
3. Di Caro, G., Dorigo, M.: AntNet: A Mobile Agents Approach to Adaptive Routing. Tech. Rep. IRIDIA/97, Universite Libre de Bruxelles, Belgium (1997)
4. Dubois, D., Prade, H.: Fuzzy Sets and Systems: Theory and Applications, Vol. 18, Filev, D.P. (1996)
5. Mirabedini, S.J., Teshnehlab, M.: Adaptive Neuro Fuzzy for Optimization of Ant Colony System. In: Workshop Proc. EurAsia-ICT Advances in Information and Communication Technology, Shiraz, Iran (2002) 325-329
6. Mirabedini, S.J., Teshnehlab, M.: AntNeuroFuzzy: Optimal Solution for Traveling Salesman Problem Using Ant Colony and Neuro-Fuzzy Systems. In: Proc. ICTIT International Conference Supported by IEEE, Jordan (2004) 305-312

Service Agent-Based Resource Management Using Virtualization for Computational Grid
Sung Ho Jang and Jong Sik Lee
School of Computer Science and Engineering
Inha University
Incheon 402-751, South Korea
ho7809@hanmail.net, jslee@inha.ac.kr

Abstract. The key purpose of the computational grid is to allocate grid resources to grid applications effectively. Existing resource management models with a centralized architecture, like GRAM and the Nimrod-G broker, are liable to generate communication overheads and bottlenecks. Therefore, we propose a service agent-based resource management model using virtualization for the computational grid. The virtualization and agent-based system applied in our model improve the load imbalance of the grid broker and reduce communication overheads. Experiment results demonstrate that the service agent-based resource management model reduces job latency by 25.8% and communication messages by 151.6% compared with the centralized resource management model.
Keywords: Grid Computing, Agent-based System, Resource Management.

1 Introduction
The computational grid [1], the core field of grid computing, enables us to perform high performance computing and large-scale simulation. Its key purpose is to allocate grid resources to grid applications effectively, with high throughput and low latency. However, the effective allocation of grid resources is not an easy task, because distributed grid resources have heterogeneous operating systems and different system performances. Grid resources also have commercial characteristics [2]: grid users pay for grid resource utilization and grid resource providers make profits. In addition, we need to consider the QoS of the whole grid.
Diverse models like GRAM [3] and the Nimrod-G resource broker [4] have been developed for the resource management of the computational grid. GRAM (Globus Resource Allocation Manager), the core component of the Globus Toolkit, was developed to process requests for executing remote applications and for managing active jobs. The Nimrod-G resource broker, a grid application scheduler, is responsible for resource discovery, selection, scheduling, and deployment of computations. However, existing models are based on a centralized architecture that handles all communications between grid resources and grid applications, and they are unable to satisfy the commercial characteristics or to meet the dynamic demand for grid resources due to the


passive resource transaction mechanism. These models are also liable to generate communication overheads and bottlenecks, because communication messages are concentrated on the grid broker and the grid broker is given too many tasks.
Therefore, this paper proposes a service agent-based resource management model using virtualization in order to solve the problems of the centralized resource management model. An agent-based system [5] provides the ability to unify and allocate networked resources in the grid environment and is useful for solving the technical problems of distributed resource management. By applying the agent-based system and virtualization to our model, we can integrate the grid resources of local networks and construct a virtual grid computing system.
This paper is organized as follows. Section 2 proposes the service agent-based
resource management model. Section 3 demonstrates the efficiency of our model with
experiment results. The conclusion of this paper is in Section 4.

2 Service Agent-Based Resource Management Model


This paper proposes the service agent-based resource management model for the
effective resource management of computational grid. As shown in fig. 1, the service
agent-based resource management model consists of five types of components which
are grid user, grid resource provider, grid service agent, central grid broker, and local
grid broker.

Fig. 1. Service Agent-based Resource Management Model

The roles of each component are as follows. The grid user is at the grid application level and uses grid resources to solve its own computing problems, paying for the resources. The grid resource provider (GRP) is at the grid fabric level; it provides commercial grid resources to grid users and makes profits. The central grid broker takes charge of the grid resource registry, which manages the data of the local grid resource groups and discovers, by data query, grid resources suitable for the requirements of grid users. The central grid broker also binds grid users to the local grid broker of the local network whose grid resources satisfy the requirements.


Local grid broker is composed of local resource pool, resource cluster manager and
transaction coordinator. All grid resources connected with local network are stored to
local resource pool. Resource cluster manager divides grid resources of this pool into
classes and constructs several clusters. Transaction coordinator provides a transaction
mechanism by the negotiation of grid user and grid service agent.
The grid service agent, the key component of our model, consists of a resource allocation module, a resource deal module, a synchronization module, a resource monitor module, and a communication agent. Firstly, the grid service agent constructs a virtual computing system by integrating the networked grid resources of a cluster with virtualization techniques [6] such as LUN (Logical Unit Number) mapping and masking. The virtual computing system collects the computing resources attached to the local network into one large computing system. This logical computing system can be allocated to any grid user with adequate computing performance. It can also hide the physical complexity of grid resources and provide the ability to monitor and control distributed grid resources. Secondly, the grid service agent manages metadata about the location, owner, and condition of grid resources and improves on a monolithic update cycle by updating and publishing metadata dynamically whenever the condition of grid resources changes. Thirdly, the grid service agent takes charge of bidding for resource transactions instead of the grid resource providers, because marketing, management and communication would cost a great deal if grid resource providers participated in resource transactions individually. Finally, the grid service agent prioritizes the grid resources of the virtual computing system by performance measures such as the processing time and the number of processors.
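A minimal Python sketch of the last two duties (pooling a cluster into a virtual computing system and prioritizing its resources by performance); the record fields and the scoring rule are illustrative assumptions, not the paper's implementation:

# Illustrative resource records collected by the grid service agent.
cluster_resources = [
    {"host": "node-a", "processors": 8,  "proc_time_s": 0.8, "owner": "GRP-1"},
    {"host": "node-b", "processors": 16, "proc_time_s": 1.2, "owner": "GRP-2"},
    {"host": "node-c", "processors": 4,  "proc_time_s": 0.5, "owner": "GRP-1"},
]

def build_virtual_system(resources):
    # Pool the cluster into one logical system and rank its members:
    # more processors and shorter processing time give a higher priority.
    ranked = sorted(resources,
                    key=lambda r: r["processors"] / r["proc_time_s"],
                    reverse=True)
    total_procs = sum(r["processors"] for r in resources)
    return {"total_processors": total_procs, "priority_order": ranked}

vcs = build_virtual_system(cluster_resources)
print(vcs["total_processors"])                     # 28
print([r["host"] for r in vcs["priority_order"]])  # ['node-b', 'node-a', 'node-c']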

3 Experiments and Performance Evaluation


For the performance evaluation of our model, we implemented it in the DEVS modeling and simulation environment [7] and conducted experiments comparing it with a centralized resource management model in which grid resource providers participate in resource transactions directly.

Fig. 2. (a) Comparison of Job Latency (b) Comparison of Communication Messages (Service
Agent-based Resource Management Model (SARMM) vs. Centralized Resource Management
Model (CRMM))


In our experiments, we assumed that the grid resources of each model are processors with multiple queues and measured the job latency and the number of communication messages of each model. Fig. 2 (a) and (b) illustrate the variation of job latency and communication messages with the number of transactions. As a result, the reduction rates of job latency and communication messages for the service agent-based resource management model over the centralized resource management model are 25.8% and 151.6%. This result demonstrates that our model provides improved QoS with high throughput and reduces communication overheads.

4 Conclusion
This paper proposed the service agent-based resource management model to solve the problems of existing centralized resource management models in the computational grid environment. We applied virtualization and an agent-based system to the service agent-based resource management model in order to improve the load imbalance of the grid broker and reduce communication overheads. Contrary to the centralized resource management model, our model, in which the grid service agent takes charge of dealing and monitoring grid resources instead of the grid resource providers, can generate more resource transactions and profits than existing models. Experiment results demonstrate that the service agent-based resource management model reduces job latency by 25.8% and communication messages by 151.6% compared with the centralized resource management model.
Acknowledgments. This work is supported by INHA UNIVERSITY Research Grant.

References
1. Berman, F., Fox, G., Hey, T.: Grid Computing: Making the Global Infrastructure a Reality. J. Wiley, New York (2003)
2. Subramoniam, K., Maheswaran, M., Toulouse, M.: Towards a Micro-Economic Model for
Resource Allocation in Grid Computing System. In Proceedings of the 2002 IEEE Canadian
Conference on Electrical & Computer Engineering (2002) 782-785
3. Foster, I., Kesselman, C.: The Globus project: a status report. Heterogeneous Computing
Workshop, 1998. (HCW 98) Proceedings. (1998) 4 - 18
4. Buyya, R., Abramson, D., Giddy, J.: Nimrod-G: An Architecture for a Resource
Management and Scheduling System in a Global Computational Grid. The 4th International
Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2000),
IEEE Computer Society Press, USA (2000)
5. Mandutianu, S.: Modeling Agent-Based Systems. Formal Approaches to Agent-Based
Systems: In Proceedings of First International Workshop. FAABS 2000, Springer Verlag,
LNCS Vol. 1871 (2001)
6. Nanda, S., Chiueh, T.: A Survey on Virtualization Technologies. RPE Report, (2005)
7. Zeigler, B.P. (ed.): The DEVS Environment for High-Performance Modeling and
Simulation. IEEE CS & E, Vol. 4, No3 (1997) 61-71

Fuzzy-Aided Syntactic Scene Analysis


Marzena Bielecka¹ and Marek Skomorowski²

¹ Department of Geoinformatics and Applied Computer Science, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow, Poland
bielecka@agh.edu.pl
² Institute of Computer Science, Jagiellonian University, Nawojki 11, 30-072 Krakow, Poland
skomorowski@ii.uj.edu.pl

Abstract. In syntactic pattern recognition a pattern can be described by a graph. The problem of recognition is to determine whether a pattern, represented by a describing graph, belongs to a language L(G) generated by a graph grammar G. The so-called IE graphs are used for pattern description; they are generated by so-called ETPL(k) graph grammars. The purpose of this paper is to present the idea of a new approach to syntactic recognition of fuzzy patterns represented by fuzzy IE graphs, following the example of random IE graphs. This methodology can be used in embodied multi-agent systems for scene analysis.
Keywords: Syntactic pattern recognition, distorted patterns, graph grammars.

Introduction

Agents are entities capable of taking account of what surrounds them. With reference to an embodied cognitive multi-agent system this means, among other things, that the agents are able to analyze the scene they act on. In particular, an agent has a symbolic and explicit representation of the surrounding world ([2], page 17). Syntactic pattern recognition based on graphs is one of the classical approaches to this task.
In a node replacement graph grammar, a node of a derived graph is replaced by a new subgraph, which is connected to the remainder of the graph. A node replacement is controlled by a production of a given graph grammar. An example of a node replacement graph grammar is the ETPL(k) (embedding transformation-preserving, production-ordered, k-left nodes unambiguous) grammar introduced in [3]. The so-called IE (indexed edge-unambiguous) graphs have been defined in [3] for the description of patterns (scenes) in syntactic pattern recognition. Nodes in an IE graph denote pattern primitives. Edges between two nodes in an IE graph represent spatial relations between pattern primitives. An idea of a probabilistic improvement of syntactic recognition of distorted patterns represented


This work was partially supported by the AGH grant number 1010140461.




by graphs is described in [4] and [6]. A random IE graph approach ([4,6]) is proposed for such a description, and an efficient parsing algorithm for IE graphs (with computational complexity O(n^2)) is presented in [6].
In this paper we present the idea of an approach to syntactic recognition based on fuzzy IE graphs, following the example of random IE graphs. Fuzziness allows us to describe properly patterns that cannot be represented unambiguously.

Fuzzy IE Graphs for Fuzzy Patterns Representation

Let us recall the definition of an IE graph ([3]). An indexed edge-unambiguous graph (an IE graph) over $\Sigma$ and $\Gamma$ is a quintuple

$$g = (V, E, \Sigma, \Gamma, \varphi),$$

where
V is a finite, nonempty set of nodes to which indices have been ascribed in an unambiguous way,
$\Sigma$ is a finite, nonempty set of node labels,
$\Gamma$ is a finite, nonempty set of edge labels,
E is a set of edges of the form $(v, \lambda, w)$, $v, w \in V$, $\lambda \in \Gamma$, such that the index of v is less than the index of w,
$\varphi: V \to \Sigma$ is a node labeling function.
Assume that both the labeled objects in the nodes of a graph and the spatial relations are represented by fuzzy sets of the first order with membership functions $\mu_i$ and $\nu_i$, respectively. Let, furthermore, the set of all objects be n-elemental and the set of all spatial relations be k-elemental. Let us define, informally, a fuzzy IE graph as an IE graph in which node labels are replaced by a vector $\mu = [\mu_1, \ldots, \mu_n]$ of values of the membership functions $\mu_i$, $i \in \{1, \ldots, n\}$, and edge labels are replaced by a vector $\nu = [\nu_1, \ldots, \nu_k]$ of values of the membership functions $\nu_j$, $j \in \{1, \ldots, k\}$; see Fig. 1.
The fuzzy measure of an outcome IE graph, obtained from a given fuzzy IE graph, is equal to the value of a T-norm T applied to the chosen components of the node and edge vectors. An axiomatic definition of T-norms is given in [5], Definition 4.22, page 80. Having a fuzzy IE graph R, the fuzzy measure $\mu_r$ of an outcome graph r is calculated as

$$\mu_r = T\left( \mathop{T}_{p=1}^{P} \mu^{p}_{f_r(p)},\; \mathop{T}_{s=1}^{S} \nu^{s}_{g_r(s)} \right),$$

where
p is the number of the regarded node,
s is the number of an edge,
$f_r(p)$ is the chosen component number of the vector $\mu^{p}$, whereas
$g_r(s)$ is the chosen component number of the vector $\nu^{s}$.


If the arithmetic product is used as a T -norm then the presented parsing algorithm (see Section 3) is identical to the random parsing algorithm described
in [6].
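As an illustration of this measure with the arithmetic product taken as the T-norm, consider the following Python sketch (the graph encoding and the sample values are illustrative assumptions):

from math import prod

def outcome_measure(node_choices, edge_choices):
    # node_choices: membership value chosen in each node vector (one per node).
    # edge_choices: membership value chosen in each edge vector (one per edge).
    # With the product T-norm, the measure is simply the product of all of them.
    return prod(node_choices) * prod(edge_choices)

# Example loosely following Fig. 1: pick one interpretation per node and edge.
node_choices = [0.8, 1.0, 0.9, 0.7]   # e.g. b(0.8), a(1.0), c(0.9), d(0.7)
edge_choices = [0.9, 1.0, 0.7]        # e.g. r(0.9), t(1.0), s(0.7)
print(outcome_measure(node_choices, edge_choices))   # 0.8*1.0*0.9*0.7*0.9*1.0*0.7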

(Figure: a fuzzy IE graph whose indexed nodes carry vectors of object membership values, e.g. b(0.8), d(0.3), and whose edges carry vectors of spatial-relation membership values, e.g. {s(0.8), t(0.3)}.)

Fig. 1. An example of a fuzzy IE graph representing an unknown pattern

Parallel Parsing of Fuzzy IE Graphs

Given an unknown pattern represented by a fuzzy IE graph R, the problem of recognition of the pattern under study is to determine whether an outcome IE graph r, obtained from the fuzzy IE graph R, belongs to a graph language L(G) generated by an ETPL(k) graph grammar G. In the proposed parallel and cut-off strategy of fuzzy IE graph parsing, the number of simultaneously derived graphs is equal to a certain number limit. In this case, derived graphs spread through the search tree, but only the best limit graphs, that is, those with the maximum measure value, are expanded.
Let us introduce the following notation: Z is the starting graph of an ETPL(k) graph grammar G, P is the set of productions of the grammar G, and R is the analyzed fuzzy graph. The idea of the proposed parsing algorithm is the following:


1. Apply to the starting graph Z those productions belonging to the set P which are admissible for further derivations. If no such productions exist, stop the parsing algorithm.
2. For the subgraphs obtained in step 1, compute the values of their membership measures using a given T-norm.
3. For further derivation choose a number (limit) of derived subgraphs with the biggest values of the membership measure.
4. To the chosen subgraphs apply those productions from P which are admissible for further derivations. If no such productions exist, stop the parsing algorithm.
5. Repeat steps 3 and 4 until the outcome graph of the fuzzy graph R is obtained.
6. Stop the parsing algorithm.
An example of the introduced algorithm application can be found in [1].
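The cut-off strategy above is essentially a beam search over derivations. The following Python sketch illustrates it under simplifying assumptions (abstract productions and a product-style measure; all names are illustrative and not taken from the parser of [1,6]):

def beam_parse(start, productions, limit, measure, is_outcome, max_steps=50):
    # start: the starting graph Z; productions: callables graph -> list of graphs.
    # measure(graph): fuzzy measure of a derived graph (e.g. product T-norm).
    # is_outcome(graph): True when the outcome graph of the analyzed fuzzy
    # graph R has been reached.  Only the best `limit` candidates are kept.
    frontier = [start]
    for _ in range(max_steps):
        candidates = []
        for g in frontier:
            for prod in productions:
                candidates.extend(prod(g))          # steps 1 and 4
        if not candidates:
            return None                             # no admissible productions
        candidates.sort(key=measure, reverse=True)  # step 2
        frontier = candidates[:limit]               # step 3 (cut-off)
        for g in frontier:
            if is_outcome(g):                       # step 5
                return g
    return None

# Toy usage: "graphs" are strings, a production appends one weighted symbol.
best = beam_parse(
    start="",
    productions=[lambda g: [g + c for c in "ab"]],
    limit=2,
    measure=lambda g: g.count("a") + 0.5 * g.count("b"),
    is_outcome=lambda g: len(g) == 3,
)
print(best)   # "aaa" under this toy measure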

Concluding Remarks

In this paper we have proposed the idea of a new approach to the recognition of fuzzy patterns represented by graphs. To take into account variations of a fuzzy pattern under study, a description of the analyzed pattern based on fuzzy sets of the first order was introduced; the fuzzy IE graph has been proposed here for such a description. It should be stressed that the informal introduction of this class of graphs has been forced by the volume limitation of the paper. The parsing algorithm of computational complexity O(n^2) presented in [3,6] is extended in such a way that fuzzy patterns, represented by fuzzy IE graphs, can be recognized. In the algorithm a T-norm is used for the calculation of the membership measure of the output graphs. This solution makes the algorithm very flexible. In particular, if the arithmetic product is used as the T-norm, the algorithm is the same as the random one described in [6].

References
1. Bielecka, M., Skomorowski, M., Bielecki, A.: Fuzzy-syntactic approach to pattern recognition and scene analysis, under revision.
2. Ferber, J.: Multi-Agent Systems. An Introduction to Distributed Artificial Intelligence. Addison-Wesley, Harlow (1999)
3. Flasinski, M.: On the parsing of deterministic graph languages for syntactic pattern recognition. Pattern Recognition, Vol. 26 (1993) 1-16
4. Flasinski, M., Skomorowski, M.: Parsing of random graph languages for automated inspection in statistical-based quality assurance systems. Machine Graphics and Vision, Vol. 7 (1998) 565-623
5. Rutkowski, L.: Artificial Intelligence Techniques and Methods. PWN, Warszawa (2005) (in Polish)
6. Skomorowski, M.: Use of random graph parsing for scene labeling by probabilistic relaxation. Pattern Recognition Letters, Vol. 20 (1999) 949-956

Agent Based Load Balancing Middleware for Service-Oriented Applications

Jun Wang, Yi Ren, Di Zheng, and Quan-Yuan Wu

School of Computer Science, National University of Defence Technology, Changsha, Hunan, China 410073
junwang@nudt.edu.cn

Abstract. Various kinds of load balancing middleware have already been applied successfully in distributed computing. However, they do not take the service types into consideration, so the different workloads caused by the different services requested by clients remain out of sight. Furthermore, traditional load balancing middleware uses fixed, static replica management and relies on load migration to relieve overload. For complex service-oriented applications, however, the hosts may be heterogeneous and fully decentralized, and load migration is not efficient because of the delay it involves. Therefore, we put forward an agent based autonomic load balancing middleware to support fast response, hot-spot control and balanced resource allocation among different services. Corresponding simulation tests have been implemented and their results indicate that this model and its supplementary mechanisms are suitable for complex service-oriented applications.
Keywords: Web Service, Service-Oriented Applications, Load Balancing,
Adaptive Resource Allocation, Middleware.

1 Introduction
In recent years, with the rapid development of e-business, web based applications have evolved from localized to global, from B2C to B2B, and from a centralized to a decentralized fashion, and many applications are constructed from services. The services are executed using heterogeneous back-end resources such as high performance systems, mass storage systems, database systems, etc. However, the applications may be integrated across the Internet by using these services, and the distributed services and resources must be scheduled automatically, transparently and efficiently. To serve the increasing number of online clients, which transmit a large, often bursty, number of requests, and to constantly provide dependable services of high quality, we must make distributed computing systems more scalable and dependable. Even under high load, the systems must still support the services as usual. Therefore, we must balance the load on the diverse resources to improve resource utilization and system throughput. Currently, load balancing mechanisms can be provided in any or all of the following layers of a distributed system:


Network-based load balancing: This type of load balancing is provided by IP


routers and domain name servers (DNS). However, load balancing at these
layers is somewhat limited by the fact that they do not take into account the
content of the client requests.
OS-based load balancing: At the lowest level of the hierarchy, OS-based load balancing is done by the distributed operating system in the form of low-level system scheduling among processors [3, 4].
Middleware-based load balancing: This type of load balancing is performed
in middleware, often on a per-session or a per-request basis. The key enterprise
applications of the moment such as astronavigation, telecommunication, and
finance all make use of the middleware based distributed software systems to
handle complex distributed applications.

There are different realizations of load balancing middleware. For example, stateless distributed applications usually balance the workload with the help of a naming service [5]. But this scheme supports only static, non-adaptive load balancing and cannot meet the needs of complex distributed applications. For more complex applications, an adaptive load balancing scheme [6, 7, 8] is needed to take the load conditions into account dynamically and to avoid overload on some node. However, traditional load balancing middleware uses fixed, static replica management and load monitoring schemes to relieve overload. For complex service-oriented applications, the hosts may be heterogeneous and fully decentralized, and load migration is not efficient because of the delay it involves. Therefore, we put forward an agent based autonomic load balancing middleware to support fast response, hot-spot control and balanced resource allocation among different services.

2 Architecture of the Load Balancing Middleware


Our middleware directly addresses these problems by providing load balancing for
the service-oriented applications, preventing bottlenecks at the application tier,
balancing the workload among the different services and enabling replication of
service components in a scalable way to provide more access to the high performance
back end resources. The service components are object-based components and they
can be distributed or remotely located in different resources. Our load balancing
service is a system-level service and it is introduced to the application tier by using
IDL [1, 2] interfaces. Figure 1 features the core components in our load balancing
service as follows:
Service Replica Repository: Instances of services need to register with a service group. All the references of the groups are stored in the Service Replica Repository. A service group may include several replicas, and we can add replicas to or remove replicas from the groups. The main purpose of the service group is to provide a view containing simple information about the references to the locations of all replicas registered with the group. The users need not know where a replica is located.
Decision Agent Group: The service decision agent acts as a proxy between the client and the dynamic service. It enables transparency between them without letting the client know about the multiple distributed service replicas. The agent is in charge of getting the best replica for the service, and the client requests then proceed with their normal procedure, such as calling methods of the service replica. The decision agent makes decisions based on the algorithms configured in our load balancing policy, according to the Component Configurator design pattern.
Load Monitor: The load monitor collects load information from every load agent at a certain time interval. The load information should be refreshed at a suitable interval so that the information provided does not expire. The hosts may have different workloads when processing certain client requests, and the workloads of the different hosts may fluctuate in different ways. Therefore we suggest an approach in which the refresh rate can be adjusted according to the workload of the hosts.

Fig. 1. Components of the Load Balancing Middleware

Load Agent: The purpose of the load agent is to provide the load information of the host it resides on when requested by the load monitor. As different services might have replicas on the same host, it is hard to determine what percentage of the resources is being used by which service at a particular moment. Therefore, a general metric is needed to indicate the level of available resources of the machine at a particular moment.
Load Prediction: This module use the machine-learning based load prediction
method where the system minimizes unpredictable behavior by reacting slowly to
changes and waiting for definite trends to minimize over-control decisions.
Replica Management Agent Group: The purpose of these agents is to dynamically manage the replicas to achieve a balanced load distribution among the different services. In order to carry out service replication, we determine parameters such as the initial number of replicas, the maximum number of replicas, when to replicate, where to replicate, etc. As different services have different characteristics, such as processing time, priority and popularity, we coordinate the services in order to avoid the monopolization of resources by popular services and the unfair treatment of high-priority services.
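A minimal Python sketch of how a decision agent of this kind might pick a replica from a service group using the loads reported by the load agents (the data layout, the least-load policy and the replication rule are illustrative assumptions; the middleware itself is CORBA/IDL based):

# Replica groups: service name -> list of (replica reference, reported load 0..1).
service_groups = {
    "OrderService": [("host1:OrderService#1", 0.72),
                     ("host2:OrderService#2", 0.35),
                     ("host3:OrderService#3", 0.90)],
}

def pick_replica(service_name, groups):
    # Least-load policy: return the registered replica with the lowest reported load.
    replicas = groups[service_name]
    return min(replicas, key=lambda ref_load: ref_load[1])[0]

def maybe_replicate(service_name, groups, threshold=0.8, max_replicas=5):
    # Hot-spot control sketch: request one more replica when every existing
    # replica of the service is above the load threshold.
    replicas = groups[service_name]
    overloaded = all(load > threshold for _, load in replicas)
    return overloaded and len(replicas) < max_replicas

print(pick_replica("OrderService", service_groups))     # host2:OrderService#2
print(maybe_replicate("OrderService", service_groups))  # False (host2 is lightly loaded)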
Several complex tests have been completed, and they show that by using the load agent the overhead of monitoring is decreased effectively. Furthermore, as the number of hosts and services grows, the resources can still be allocated efficiently and the workload of the resources can be balanced with the help of the Replica Management Agent Group. Especially for the hot-spot services, sharing the workload with extra replicas is much better than load migration among the overloaded servers.

3 Conclusions
Various kinds of load balancing middleware have already been applied successfully in distributed computing. However, they do not take the service types into consideration, so the different workloads caused by the different services requested by clients remain out of sight. Furthermore, traditional load balancing middleware uses fixed, static replica management and relies on load migration to relieve overload. For complex service-oriented applications, however, the hosts may be heterogeneous and fully decentralized, and load migration is not efficient because of the delay it involves. Therefore, we put forward an agent based autonomic load balancing middleware to support fast response, hot-spot control and balanced resource allocation among different services. Corresponding simulation tests have been implemented and their results indicate that this model and its supplementary mechanisms are suitable for complex service-oriented applications.

Acknowledgements
This work was funded by the National Grand Fundamental Research 973 Program of
China under Grant No.2005cb321804, the National High-Tech Research and
Development Plan of China under Grant No.2004AA112020 and the National Natural
Science Foundation of China under Grant No.60603063.

References
1. Object Management Group: The Common Object Request Broker: Architecture and Specification, 3.0 ed., June 2002
2. Henning, M., Vinoski, S.: Advanced CORBA Programming with C++. Addison-Wesley Longman, Massachusetts (1999)
3. Chow, R., Johnson, T.: Distributed Operating Systems and Algorithms. Addison-Wesley Publishing Company (1997)
4. Rajkumar, B.: High Performance Cluster Computing: Architecture and Systems. ISBN 7-5053-6770-6 (2001)
5. IONA Technologies: Orbix 2000. www.iona-iportal.com/suite/orbix2000.htm
6. Othman, O., O'Ryan, C., Schmidt, D.C.: The Design of an Adaptive CORBA Load Balancing Service. IEEE Distributed Systems Online (2001)
7. Othman, O., Schmidt, D.C.: Issues in the Design of Adaptive Middleware Load Balancing. In: Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers and Tools for Embedded Systems. ACM Press, New York (2001) 205-213
8. Othman, O., O'Ryan, C., Schmidt, D.C.: Strategies for CORBA Middleware-Based Load Balancing. IEEE Distributed Systems Online (2001) http://www.computer.org/dsonline

A Transformer Condition Assessment System Based on Data Warehouse and Data Mining

Xueyu Li¹, Lizeng Wu², Jinsha Yuan¹, and Yinghui Kong¹

¹ Department of Electronics and Communication, North China Electric Power University, No. 204, Qingnian Road, Baoding, 071003, China
² Beijing No. 2 Co-generation Plant, No. 52, Lian Hua Chi Dong Lu, Beijing, China
lxueyu@gmail.com, lizengwu@yahoo.com.cn, yuanjinsha@sohu.com, kongyh@sina.com

Abstract. A framework for a transformer condition assessment system is proposed in this paper. In this system, we use a data warehouse, a multi-agent system and data mining techniques, respectively, to collect transformers' testing data, to design the framework of the software, and to evaluate transformers' conditions. The proposed system prototype has been tested on real transformers with reliable performance. The present framework is open and flexible, so the resulting system is easy to maintain and to develop further.
Keywords: Transformer condition assessment, data mining, multi-agent
system.

1 Introduction
Power utilities are under continuous pressure to reduce maintenance expenditures while maintaining a high level of component reliability. As a result, condition based maintenance (CBM) has been developed to cut down the maintenance cost and increase the level of system and component reliability. In this paper, an open framework for monitoring the state of the main substation equipment is proposed; it uses data warehouse technologies to collect all kinds of data and uses data mining and Open Agent Architecture (OAA) technologies to set up an open architecture.

2 A New Transformer Condition Assessment System


The new overall transformer condition assessment system (TCAS) is a hybrid system
that is composed of a data collection subsystem and a condition analysis subsystem.
The former collects transformers on-site monitoring data, off-line present and
historical testing data, nameplate parameters, and historical operating records into a
data warehouse. The later uses the data in the data warehouse to evaluate the
conditions of all concerned transformers in an electric utility.


2.1 Data Collection Subsystem


Data warehouse is used in the data collection subsystem for data collection. In
addition, an OLAP tool is included in the subsystem to enable easy and efficient data
analysis tasks by the users.
2.2 Condition Analysis Subsystem
2.2.1 Architecture of the Transformer Condition Analysis Subsystem
The transformer condition analysis subsystem includes the following seven application agents:
AA = {AA_Threshold_Alarm, AA_Trend_Analysis, AA_EventTree_Analysis, AA_Cluster_Analysis, AA_Classification_Analysis, AA_Forecasting, AA_Condition_Assessment}
The above application agents work together to evaluate the transformers' conditions.
The knowledge of these agents can be described as follows:
KAAThreshold_Alarm = {dissolved gases concentrations, gases production speeds,
electric testing data, ratio limit, speed limit, electric data limits}
KAATrend_Analysis = {dissolved gases concentrations, gases production speeds,
electric testing data }
KAAEvent Tree_Analysis = {entrance circuit short, continuous high temperature,
sustained overload, environmental abnormity, fault analysis}
KAACluster_Analysis = {dissolved gases concentrations, dissolved gases relative ratios,
electric testing data, grey relation clustering algorithm }
KAAClassification_Analysis = {dissolved gases relative ratios, electric testing data,
Bayesian network classifier }
KAAForecasting = {dissolved gases concentrations, dissolved gases relative ratios,
electric testing data, grey prediction algorithm}
KAACondition_Assessment = {threshold analysis results, trend analysis results, event tree
analysis results, cluster analysis results, classification results, forecasting results,
Bayesian network model}
2.2.2 Application Agents
2.2.2.1 Threshold Alarm Agent. Standard value analysis agent is used to compare the
actual tested value derived from testing data, on-line monitoring data and other data
with standard values of transformer condition.
2.2.2.2 Trend Analysis Agent. The quality indices of a transformer usually change
with time extension. If they change slowly and they are within the ranges of their
standard values, the transformer is in normal condition. If they change sharply,
usually a hidden failure or defect occurs in the transformer even though they are
within the ranges of their standard values.
2.2.2.3 Event Tree Analysis Agent. Event tree analysis and fault tree analysis are the two main methods of transformer fault analysis. When an event such as an external short circuit, a continuously high environmental temperature, or a sustained overload happens, the event tree analysis agent can find it in the recorded data and then perform the condition assessment.


2.2.2.4 Cluster Analysis Agent. Cluster analysis agent is responsible for finding the
testing data's distribution patterns and relations of attributes. Cluster analysis agent
responds to the request of the facilitator agent to make the cluster analysis. The grey
correlation method is used to extract the failure types.
2.2.2.5 Classification Analysis Agent. The naive Bayesian classifier learns from the training data the conditional probability of each attribute Ai given the class label C. Classification is then done by applying Bayes' rule to compute the probability of C given a particular instance of A1, ..., An and predicting the class with the highest posterior probability.
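A compact Python sketch of this classification rule (the attribute names and probability tables are made-up toy values, not transformer data from the paper):

# Toy naive Bayes: P(C | A1..An) is proportional to P(C) * prod_i P(Ai | C).
priors = {"normal": 0.7, "fault": 0.3}
likelihoods = {
    # P(attribute value | class), learned from training data in the real system.
    "normal": {"gas_ratio=low": 0.8, "temp=high": 0.2},
    "fault":  {"gas_ratio=low": 0.3, "temp=high": 0.7},
}

def classify(instance):
    scores = {}
    for c, prior in priors.items():
        score = prior
        for attr in instance:
            score *= likelihoods[c].get(attr, 1e-6)   # small floor for unseen values
        scores[c] = score
    # Predict the class with the highest (unnormalized) posterior.
    return max(scores, key=scores.get), scores

label, scores = classify(["gas_ratio=low", "temp=high"])
print(label, scores)   # compares 0.7*0.8*0.2 = 0.112 with 0.3*0.3*0.7 = 0.063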
2.2.2.6 Forecasting Agent. The forecasting agent responds to the request of the facilitator agent, which provides the agent community with a number of services for routing and delegating tasks and information among agents, to make the transformer failure forecast. In the grey model GM(1,1), stochastic variables are seen as grey variables, the irregular original data are accumulated into a regular series, and then a differential equation model is formed and solved. When grey theory is used to predict transformers' dissolved gas values, the posterior error for each gas can be about 5% or less [11].
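For readers unfamiliar with GM(1,1), the following Python sketch shows the standard procedure just described (accumulate, fit the grey differential equation by least squares, predict, and difference back); the sample gas series is invented for illustration:

import numpy as np

def gm11_forecast(x0, steps=1):
    # x0: original (possibly irregular) series; returns forecasts of x0.
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                                  # accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                       # mean generating sequence
    B = np.column_stack((-z1, np.ones_like(z1)))
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]         # grey parameters
    n = len(x0)
    k = np.arange(n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # solution of the grey ODE
    x0_hat = np.diff(x1_hat, prepend=0.0)               # back to the original series
    return x0_hat[n:]                                   # only the future points

gas_ppm = [10.2, 10.9, 11.8, 12.6, 13.7]   # invented dissolved-gas readings
print(gm11_forecast(gas_ppm, steps=2))     # next two predicted readings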
2.2.2.7 Transformer Condition Assessment Agent. The results of the threshold analysis, trend analysis, event tree analysis, cluster analysis, classification analysis, and parameter forecasting are analyzed comprehensively using a Bayesian network. The transformer condition is divided into five states: excellent, better, normal, worse and fault. The transformer condition node has seven parent nodes, which are all intermediate nodes such as the threshold analysis result node.

3 Applications
We are developing a transformer condition assessment system for an electric utility. The initial results, shown in Table 1, agree with the actual conditions of the transformers.
Table 1. Initial results (E: Excellent, B: Better, N: Normal, W: Worse, F: Fault)

Voltage grade (kV)   Capacity (kVA)   Count of Tr.   E    B    N    W    F
110                  31500            28             11   14   1    1    1
110                  20000            7              6    1    0    0    0
110                  40000            12             7    4    1    0    0
110                  50000            4              3    1    0    0    0
220                  120000           19             17   1    0    0    1
220                  180000           4              4    0    0    0    0


In order to test the diagnosis accuracy of the system, some examples from [2] were tested again, and the results are in excellent agreement with the transformers' actual faults.

4 Conclusions
In this paper, we presented some general guidelines for developing an intelligent transformer condition assessment system to help electric utilities optimize their maintenance activities. The proposed framework is open and flexible, so the resulting system is easy to maintain and to develop further. A data warehouse has been used to integrate all kinds of transformer condition parameters. OAA is employed to compose the multi-agent system that is the main part of the proposed system. Seven application agents are designed to evaluate transformers' conditions synthetically. The initial field test results, obtained from tests of some transformers on a prototype system developed by the authors, have proven that the framework system is able to produce accurate condition assessment results and is promising for further implementation. Moreover, the maintenance and further development of the resulting system is feasible since the present framework is open and flexible.

References
1. Cheng, J., Greiner, R.: Comparing Bayesian Network Classifiers. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI'99). Morgan Kaufmann Publishers (1999) 101-107
2. Wang, M.-H.: A Novel Extension Method for Transformer Fault Diagnosis. IEEE Transactions on Power Delivery, 18(1) (2003) 164-169

Shannon Wavelet Analysis

Carlo Cattani

DiFarma, Universita di Salerno, Via Ponte Don Melillo, 84084 Fisciano (SA), Italy
ccattani@unisa.it

Abstract. In this paper the differentiable structure of the Shannon wavelets is defined and the projection of linear differential operators is given for any order. As an application, the wavelet solution of a heat propagation problem is computed and the contribution of the different scale components is explicitly shown.
Keywords: Shannon wavelet, connection coefficients, heat equation, numerical approximation.

AMS Classification: 42C40, 42C15, 65T60, 35K05.

Introduction

Shannon wavelets are the real part of the so-called harmonic wavelets [2,4,5]. They have a slow decay in the variable space but a very sharp compact support in the frequency (Fourier) domain, being represented therein by box functions. This fact, together with the Parseval equality, has been used to easily compute the inner products and the connection coefficients of the Shannon wavelets (see [2,3]), which are the inner products of the Shannon basis with its derivatives of a given order. These coefficients, also called refinable integrals, are a very useful tool for the analysis of discrete samples [7,8] and for the derivation of the wavelet solution of partial differential equations (in the Petrov-Galerkin approach) [1,6]. In the following, the representation of linear operators in Shannon wavelet bases will be given. As an example, the solution of the heat equation for a localized initial profile [6] will be obtained as a Shannon wavelet series. The main advantages of this approach are that: 1) only by using localized functions, like Shannon wavelets, can the initial profile (and then its evolution in time) be represented by the lowest number of series coefficients; 2) only with this approach can the evolution be split into many scales, some for the low frequency evolution and some for the high frequency evolution; 3) the numerical approach is the simplest one.


Work partially supported by Regione Campania under contract Modelli nonlineari


di materiali compositi per applicazioni di nanotecnologia chimica-biomedica, LR
28/5/02 n. 5, Finanziamenti 2003 and by Miur under contract Modelli non Lineari per Applicazioni Tecnologiche e Biomediche di Materiali non Convenzionali,
Univ. di Salerno, Processo di internazionalizzazione del sistema universitario, D.M.
5 agosto 2004 n. 262 - ART. 23.

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 982989, 2007.
c Springer-Verlag Berlin Heidelberg 2007


Shannon Wavelet Analysis

983

Shannon Wavelets

The dilated and translated instances of the Shannon scaling functions nk (x)
2n/2 (2n x k) and Shannon wavelet functions kn (x) 2n/2 (2n x k) are [3]:

nk (x) = 2n/2 sin (2 x k)

(2n x k)
(1)
1
1
n
n

n/2 sin (2 x k 2 ) sin 2(2 x k 2 )


n

(x)
=
2
.
k
(2n x k 12 )
Their Fourier transform are

2n/2 ik/2n

nk () =
e
(/2n + 3)

2
n/2


n 

n () = 2
ei(k+1/2)/2 (/2n1 ) + (/2n1 )
k
2

(2)

where the characteristic function (), is dened as



1 , 2 < 4
()
0 , elsewhere .
The family of functions {kn (x)} is an orthonormal wavelet basis with respect to
the inner product, dened as


f, g
f (x)g(x)d x ,

which, according to the Parseval equality, can be expressed as

f, g

f (x) g (x)d x = 2



f() g ()d = 2 f, g ,

(3)

where the bar stands for the complex conjugate.


The derivatives of the basis are fundamental tools for the computation of the
n-order moments of the Shannon wavelets and connection coecients:





d
d
()nm
n
m
()nm
n
m

kh
(x) , h (x) , kh = 2
(x), h (x)
(4)
d x k
d x k
where


d
n (x) = (i) kn () ,
d x k

(5)

i.e., according to (2),




d
2n/2 i(k+1/2)/2n 
nk (x) = (i)
e
(/2n1 ) + (/2n1) .

dx
2

(6)

By a direct computation it can be shown (for a sketch of the proof see [3])

984

C. Cattani

Theorem 1. For a given m Z {0},  N, it is

(i) eim d = i (1 |(m)|)


+(m) e

im

+1


+1
+
+1

[1+(m)](2s+1)/2

(1)

s=1

where

!is s+1
+ Cnst.
( s + 1)!|m|s

1 ,m>0

(m) = sign(m) = 1 , m < 0

0 ,m=0 .

(7)

(8)

In particular, taking into account that



e

ik

= (1) =

, k = 2s

1 , k = (2s + 1) , s N

it is
Corollary 1. For a given m Z {0},  N and a, b Z, (a < b) it is

(i) eim d = i (1 |(m)|)


a

+(m)

+1 (b+1 a+1 )


+
+1

+1


!is s+1
(1)[1+(m)](2s+1)/2
[(1)m b bs+1 (1)m a as+1 ]
s
(

s
+
1)!|m|
s=1
(9)

The last formula enables us to compute the connection coecients


Theorem 2. The any order connection coecients (4)2 of the Shannon wavelets
(1)2 are

 2n1 +1
nm
()nm
kh =
i (1 |(h k)|)
(2
1)(1 + (1) )+
+1
+(hk)

+1

s=1

(1)[1+(hk)](2s+1)/2

!is s
(1)s2(h+k) 2ns1
( s + 1)! |h k|s








2+1 (1)4h+s + (1)4k+ 2s (1)3k+h+ + (1)3h+k+s
(10)
nm
respectively, for  1, and (0)nm
.
kh = kh

Shannon Wavelet Analysis

985

Proof: It can be easily shown that (see also [3] theorem 6)


()nm
kh = 0 ,

(n = m)

When n = m, from equation (4) it is [3],


()nn
kh

2(n1)
=
4


4

2

 i(kh)/2
 i(kh)/2
(i ) e
d +
(i ) e
d ,

(11)

that is, taking into account (9), equation (10) follows.


The connection coecients fulll some simmetries, which enable us to restrict
their computation at the lowest scale, according to
Theorem 3. The connection coecients are recorsively given by the matrix at
the lowest scale level:
(n1) ()11
()nn
kh .
kh = 2
Moreover it is
(2+1)nn
(2+1)nn
kh =
hk

(2)nn
(2)nn
kh =
hk .

For the coecients of the scaling functions





d
()
0
0

kh = 2
(x), h (x)
d x k

(12)

it can be also shown that


Theorem 4. The any order connection coecients (4)1 of the scaling functions
0k (x) are
()

kh = i (1 |(h k)|)
+(hk)(1)hk

+1


 [1 + (1) ]
+
2( + 1)

(1)[1+(hk)](2s+1)/2

s=1

!is s
[1+(1)s]
2( s + 1)!|h k|s
(13)

Reconstruction of a Function by Shannon Wavelets

Let f (x) be a given function such that the r.h.s. series expansion
f (x) =


h=

h (x h) +




kn kn (x) ,

(14)

n=0 k=

converges to f (x). If we limit the dilation factor n N < , for a truncated


series, we have the approximation of f (x), given by:

986

C. Cattani

f (x)
=

S


h (x h) +

N
M



kn kn (x)

(15)

n=0 k=M

h=S

with

f (x)(x h)d x ,

kn

f (x)kn (x)d x .

(16)

By re-arranging the many terms of the series (15) with respect to the dierent
scales, for a xed N we have
f (x)
=

S


h (x h) +

N


fn (x)

n=0

h=S

M


fn (x) =

kn kn (x)

(17)

k=M

where fn (x) represent the components of the function f (x) at the scale 0 n
N , and f (x) reults from a multiscale approximation.
Let us compute the approximate representation of the even function
f (x) = e(16x)

/2

(18)

The bottom lenght (i.e. the main part) of the function f (x) is concentrated in
the interval [0.2, 0.2]. With a low scale n = 3 we can have a good approximation
of the function, even with a small number k of translation. In fact, with |k| 5
the absolute value of the approximation error is less than 5%. Thus we can
assume
3
5


(16x)2 /2
e
kn kn (x) ,
(19)
= 0 (x) +
n=0 k=5

with 0 = 0.155663 and

kn

given by (16).

Wavelet Representation of Operators

Let f (x) a function represented by (14). Any linear dierential operator L acting
on f is
S
N
M



Lf (x) =
h L 0h (x) +
kn Lkn (x) ,
n=0 k=M

h=S

and, since
L 0h (x) =

L0k , 0h 0h (x)

Lkn (x) =

i.e.
L 0h (x) =

Lkn , hm hm (x)

m,h


h

() kh , 0h (x)

Lkn (x) =


m,h

we have, as a projection of the operator at the scale N :

m
()nm
kh h (x)

(20)

Shannon Wavelet Analysis

a

987

b
1
t0
t0.005
0.02
 0.2

 0.2
0
0.2
x

0.2

Fig. 1. Surface u(x, t) a) and b) time evolution of the Fourier integral solution

Lf (x) =

S


h=S

h () kh , 0k (x) +

N
M




m
kn ()nm
kh h (x) .

n=0 k=M m,h

As an example, let us consider the one-dimensional heat equation for an innite bar, with normalized physical constants
u
= Lu ,
t

2
,
x2

(21)

and initial condition (f (x) given by (18))


u(x, 0) = f (x) , x , t = 0 .

(22)

The solution of the problem (21)-(22), in terms of Fourier integrals, is:


1
u(x, t) =
2 t



(x )2
f () exp
d ,
4t

t = 0 ,

(23)

which, assuming (18) as initial function, can be easily computed (see Fig. 1)
2
9
1
u(x, t) =
e(16x) /(1+2 t) .
29 t + 1

(24)

Thus in a short time the high peak (of the initial prole) reduces to a smooth
prole. In the wavelet approach, according to the best scale-dilation approximation of the initial function (18), restricted to the interval [1, 1], we can assume
that at the resolution n 3 , |k| 5, the solution of (21) can be expressed as
u(x, t) = 0 (t)(x) +

3 
5

n=0 k=5

kn (t)kn (x)

988

C. Cattani

a

b

t0
t0.005
0.02
0.2

0.2
0
0.2
x

0.2

Fig. 2. Surface of the approximate wavelet solution for u(x, t) a) and b) its time
evolution

so that
0 (t)(x) +

3 
5


kn (t)kn (x) = 0 (t)L(x) +

n=0 k=5

3 
5


kn (t)Lkn (x)

n=0 k=5

By a scalar product with scaling and wavelet basis we get (n = 0, . . . , 3;


k = 5, . . . , 5)

d 0 (t)

dt

= 0 (t)

d2
(x), (x)
d x2

3
3



d kn (t) n
d2

kn (t) 2 kn (x), hm (x) ,


d t k (x), h (x) =
dx
m=0 h=3

i.e., taking into account (12) and (10) at the second order ( = 2), it is

d 0 (t)
2

0 (t)
dt

3
5



d kn (t)

=
(2)nm
kh h (t) ,
dt

(25)
(n = 0, . . . , 3; k = 5, . . . , 5) ,

m=0 h=5

The initial conditions, coincide with the values of the wavelet coecients of
the initial prole, i.e.
0 (0) = f (x), nk (x) = 0.155663 , kn (0) = f (x), kn (x)
are given by integrals (16). Since (25) is a linear system we can easily get the
solution for :
2
(t) = 0.155663e t/3

Shannon Wavelet Analysis

989

as well as for the detail coecients , so that the solution can be (numerically)
computed (see Fig. 2). Comparing the approximate wavelet solution (Fig. 2) with
the exact solution (24) (Fig. 1) it substantially coincides with the Fourier solution
(24) even with a low number of scales. However, it should be noticed that (only)
with the rst approch it is possible to decompose the solution at the dierent
scales (anologously what is done in Fourier series with mode decomposition)
which is impossible to do with the Fourier integral solution (23), (24).
If we limit to the interval [1, 1] we can see that the evolution of the initial
prole can be split into four dierent waves: f (x)
= f0 (x) , f (x)
= f1 (x) , f (x)
=
f2 (x) , f (x)
= f3 (x) , (corresponding to the four scales n = 0, . . . , 3). The
low frequency wave has a low amplitude but is more steady, while the higher
frequency wave (n = 3) has higher amplitude but a quite fast decay.

References
1. C. Cattani. Harmonic Wavelet Solutions of the Schr
odinger Equation. International
Journal of Fluid Mechanics Research, 5, 2003, 110.
2. C. Cattani. Harmonic Wavelets towards Solution of Nonlinear PDE. Computers and
Mathematics with Applications, 50, 8-9, 2005, 11911210.
3. C.Cattani, Connection Coecients of Shannon Wavelets, Mathematical Modelling
and Analysis, vol. 11 (2),(2006), 116.
4. S.V. Muniandy and I.M. Moroz. Galerkin modelling of the Burgers equation using
harmonic wavelets. Phys.Lett. A, 235, 1997, 352356.
5. D.E. Newland. Harmonic wavelet analysis. Proc.R.Soc.Lond. A, 443, 1993,
203222.
6. J.J. Rushchitsky , C. Cattani and Terletska E.V. Wavelet Analysis of a Single Pulse
in a linearly Elastic Composite // International Applied Mechanics. 2005. Volume
41 (4). P. 374-380.
7. C. Toma, An Extension of the Notion of Observability at Filtering and Sampling Devices, Proceedings of the International Symposium on Signals, Circuits and Systems
Iasi SCS 2001, Romania, 233236.
8. G. Toma, Practical Test Functions Generated by Computer Algorithms, Lecture
Notes Computer Science 3482 (2005), 576585.

Wavelet Analysis of Bifurcation in a


Competition Model
Carlo Cattani and Ivana Bochicchio
University of Salerno,
Via Ponte Don Melillo, 84084 Fisciano (SA), Italy
ccattani@unisa.it, ibochicchio@unisa.it

Abstract. A nonlinear dynamical system which describes two interacting and competing populations (tumor and immune cells) is studied
through the analysis of the wavelet coecients. The wavelet coecients
(also called detail coecients) are able to reproduce the behaviour of the
function, and, being sensible to local changes, are strictly related to the
dierentiable properties of the function, which cannot be easily derived
from the numerical interpolation. So the main features of the dynamical
system will be given in terms of detail coecients that are more adapted
to the description of a nonlinear problem.

Introduction

In this paper we consider the nonlinear dynamical system



dx

2(1+)4

xy ,

d = 1 x 1 0 e

dy = 1 xy 2 y + 3 x + 4 5 yx2 ,
d

(1)

which represents the competition between two cell populations [4,6].


The parameters are such that

0 < 1 ,

|1 | 1 , 0 < 2 , |3 | 1 , 0 4 1 ,

0 0 1 , || < 1 .
In this system, which is a generalization of the Lotka-Volterra model, the unknown quantity x(t) represents the numerical density of tumor cells, while y(t)
is the numerical density of lymphocyte population, under conditions x(t) > 0
and y(t) > 0 [6]. Moreover
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 990996, 2007.
c Springer-Verlag Berlin Heidelberg 2007


Wavelet Analysis of Bifurcation in a Competition Model

991

1. 1 is the rate of growth of the tumor population


2. 1 is the aggressive rate of tumor cells
3. 2 is the stimulatory eect of the tumor cells on immune cells
4. 3 , 4 are, in the average, the immune system response
5. 5 the tumor malignancy
6. represents the relative velocity of encounter rates of interacting populations.
Finally 0 is a parameter related to the ability of recognition of the competing population by the immune system. In particular and 0 are coupling
the macroscopic with microscopic system [4]. Small values of 0 , according to
[4], correspond to the maximum learning, i.e. full recognition of the competing
population, whereas 0 = 1 correspond to the minimum learning; the competing
population is not recognized by the immune system.
We investigate the dynamics of this system through the analysis of the wavelet
coecients which give the possibility to focus on singularities, local high frequencies variation, irregular structure and transient phenomena (see also [2]).
A similar analysis was done in some previous papers [2,3] where we studied the
Van Der Pol equation with and without damping (pointing out stable and stable
solutions), where we observed that if the dynamical system is strongly nonlinear,
the detail coecients show signicant jumps. Wavelet coecients strongly depend on local changes, so that when the dynamical system becomes unstable (or
chaotic) many eects appear [8,9]: the amplitude of the detail coecients grows,
the detail coecients gather around some peaks, showing also some randomness
distribution.
The aim of this paper is to investigate these featuring properties of wavelet
coecients for the above system of equations and to focus on what kind of
wavelet coecients give more precise information about the behavior of the
studied dynamical system.
The paper is organized as follows: in Sect. 2 some preliminary denitions about
Haar wavelets and short Haar wavelet transform [1] are given. The Lotka-Volterra
model is introduced in Sect. 3, where we discuss the solutions of the non linear
system through a wavelet analysis.

Short Haar Wavelet Transform

The Haar scaling function (t) is the characteristic function on [0, 1]. By translation and dilation we get the family of functions dened in [0, 1]
n
k (t) 2n/2 (2n t k) ,
(0 n , 0 k 2n 1) ,



k k+1

n
n
1 , t k , k n , n
,

2
2
(2 t k) =

n
0 , t  k .

(2)

992

C. Cattani and I. Bochicchio

The Haar wavelet family {kn (t)} is the orthonormal basis for the L2 ([0, 1])
functions [5]:

kn (t) 2n/2 (2n t k) ,


||kn (t)||L2 = 1 ,

k k + 1/2

1 , t n ,
,

2
2n



(2n t k)

k + 1/2 k + 1

1
,
t

,
,
(0 n , 0 k 2n 1) ,

n
n

2
2

0,
elsewhere .
(3)
Without loss of generality, we restrict ourselves to 0 n , 0 k 2n 1 =
kn [0, 1]. Let Y {Yi }, (i = 0, . . . , 2M 1, 2M = N < , M N), be a
real and square summable time-series Y KN
2 (where K is a real eld);
ti = i/(2M 1), is the regular equispaced grid of dyadic points on the interval
restricted, for convenience and without restriction, to = [0, 1].
Let the set Y = {Yi } of N data be segmented into segments (in general)
of
dierent length. Each segment Y s , s = 0, . . . , 1 is made of ps = 2ms ,
( s ps = N ), data:
Y = {Yi }i=0,...,N 1 =

Y s} ,
{Y

Y s {Ysps , Ysps +1 , . . . , Ysps +ps 1 } ,

s=0

being, in general, ps = pr . The short discrete Haar wavelet transform of Y is


1

(see [1]) W
ps , Y , being explicitly (2ms = ps ,
ps = N )
s=0

1
1

s

ps ,
p

,
Y
=
Y ,

s=0
s=0

1

1



ps ,
ps
ps s

Y
=
W

Y =
W
Y
,
W

s=0
s=0




ms

0(s)
0(s)
1(s)
1(s)
m 1(s)
W
2 Ys
= 0 , 0 , 0 , 1 , . . . , 2mss 1 1 .
Where the discrete Haar wavelet transform is the operator W
N : KN
2
N
2
K
which maps the vector Y into the vector of the wavelet coecients
{ , kn }:
W
N Y = {, 00 , . . . , 2M1
M 1 1 } ,

Y = {Y0 , Y1 , . . . , YN 1 } .

(4)

There follows that, the matrix of the wavelet transform is expressed as a direct
sum of lower order matrices so that the short transform is a sparse matrix [1]. We
want to emphasize that when the short wavelet transform maps short interval
values into a few set of wavelet coecients, it can be considered as a rst order
approximation. Thus giving information about the linear behavior. However,

Wavelet Analysis of Bifurcation in a Competition Model

993

since the wavelet transform maps the original signal into uncorrelated sequences
[7], the short wavelet transform describes for each sequence of detail coecients
its local behavior. When ps = p = N, = 1, the above coincides with the
ordinary wavelet transform. We assume, in the following, ps = p = N/, s =
0, . . . , 1, ( > 1).

System of Competition

Let us consider the competition model (1); by xing some parameters



dx

2(1+0.1)0.2

xy ,

d = x 1 e
(5)

dy = 0.1y + x + 0.2 5 yx2 ,


d

we obtain the numerical solution as in Fig. 1, where we take, as initial conditions,


an initial high numerical density of tumor cells x(0) = 5 and we neglect the
initial number of lymphocytes y(0) = 0. The other parameters are taken as
0 = 1, = 0.1 and, according to (5), we simulate a competition where the
rate of growth of the tumor population is little (1 = 1), the aggressive rate of
tumor cells is neglectable (1 = 0), the stimulatory eect of the tumor cells on
immune cells is weak (2 = 0.1), the immune system response is in the average
3 = 1, 4 = 0.2. Of course with higher values of 4 we will have a stronger
immune system (weaker for smaller values of 4 ).
5 0.01`

5 0.04`

x
0

x
0

5 0.05`

8
5 0.06`

x
0

x
0

Fig. 1. Numerical solution of system (1) with parameters 1 = 1, 1 = 0, 2 =


0.1, 3 = 1, 4 = 0.2, 0 = 1, = 0.1, and initial conditions x(0) = 5, y(0) = 0, in
correspondence of dierent values of 5

994

C. Cattani and I. Bochicchio


5  0.01`

5  0.04`

5
5

t
0

5  0.05`

5  0.06`

t
0

t
0

Fig. 2. Numerical solution (plain x(t), dashed y(t)) of system (1) with parameters
1 = 1 , 1 = 0 , 2 = 0.1 , 3 = 1 , 4 = 0.2, 0 = 1 , = 0.1, and initial conditions
x(0) = 5, y(0) = 0, in correspondence of dierent values of 5

It can be observed (see Fig. 1 and Fig. 2) that with small values of the tumor malignancy 5 , the lymphocyte population y(t) grows, whereas tumor cells
x(t) decrease. When 5 > 0.05 the number of lymphocytes reach a maximum
value and then goes to zero while the tumor cell population grows. It should
be noticed that 5 = 0.05 represents a bifurcation point for the model with
loss of uniqueness of the dierential system, because the curve is not simply
connected (for the presence of a knot). When the number of tumor cells tends
to zero the number of lymphocytes decreases to a particular value, called the
sentinel value. The dynamics of equation (5) has been simulated by using the
Runge Kutta 4-th order method, with the accuracy 106 . We obtain as a numerical solution (0 < t 6), in correspondence of the values of the parameter
1 = 1 , 1 = 0 , 2 = 0.1 , 3 = 1 , 4 = 0.2, 0 = 1 , = 0.1, four sequences
(in correspondence of 5 = 0.01, 5 = 0.04, 5 = 0.05, 5 = 0.06) of 29 = 512
values Y = {Y0 , Y1 , . . . , YN 1 }, with N = 512 and M = 9. Moreover, using
the short Haar wavelet transform, with ps = p = 4, we compare the wavelet
coecients of the two time-series, near the bifurcation value of 5 , i.e. 5 = 0.05
and 5 = 0.06 computed numerically (Fig. 3, 4).

Critical Analysis

The importance of the wavelet analysis is mainly based on the information content of the detail coecients. It can be seen (from Fig. 3 and 4) that only a small
set of detail coecients, namely 00 , 01 , 11 , is already enough to have a good
information about the dynamical system, in any case better than the numerical evaluation (Fig. 3,4 on top). In fact, the detail coecients show some local

Wavelet Analysis of Bifurcation in a Competition Model

995

yt
xt

11

11

01

01

00
00

t
0 12
  
25

3

2

9
  
10

9

5

9

2

Fig. 3. Numerical solution and wavelet coecient of 4-parameters of short Haar transform of the numerical solution x(t) (left) and y(t) (right) of system (1) with parameters
1 = 1 , 1 = 0 , 2 = 0.1 , 3 = 1 , 4 = 0.2, 0 = 1 , = 0.1, and initial conditions
x(0) = 5, y(0) = 0, in correspondence of 5 = 0.05

yt
xt

11

11

01

01

00
00
0 12
  
25

9
  
10

Fig. 4. Numerical solution and wavelet coecient of 4-parameters of short Haar transform of the numerical solution x(t) (left) and y(t) (right) of system (1) with parameters
1 = 1 , 1 = 0 , 2 = 0.1 , 3 = 1 , 4 = 0.2, 0 = 1 , = 0.1, and initial conditions
x(0) = 5, y(0) = 0, in correspondence of 5 = 0.06

maxima and changes which are hidden in the continuous interpolation of the numerical integration. Each detail coecient is able, at each scale but mostly at the
lower scale n = 0, 00 , to reproduce the behaviour of the function, but they are
very sensible to local changes and therefore they can easily describe the intervals
where the function is monotonic. Moreover, being sensible to local changes they

996

C. Cattani and I. Bochicchio

are strictly related to the dierentiable properties of the function, which cannot
be easily derived from the numerical interpolation of the function. In particular,
we can see that they can focus exactly on local maxima of the function, and
on a number of changes higher than the numerical functions x(t) e y(t). Therefore the detail coecients are more adapted to the description of a nonlinear
problem but also show a inexion in the initial growth which is invisible in the
numerical solution. These time spots, where the the detail coecients are zero,
or where the detail coecients have some local maxima (minima) are reported
in Fig. 3,4, and tell us the inversion (inexion) in the population growth. The
positive values of the detail coecients describe the local growth, the negative
values the decreasing of the function. Local maxima (minima) of the detail coefcients dene some inexion which enable us to predict if the phenomenon will
increase in time or decrease.

References
1. Cattani, C.: Haar Wavelet based Technique for Sharp Jumps Classication. Mathematical Computer Modelling 39 (2004) 255279.
2. Cattani, C., Bochicchio, I.: Wavelet Analysis of Chaotic Systems. Journal of Interdisciplinary Mathematics Vol. 9 No. 3 (2006) 445458.
3. Cattani, C., Bochicchio, I.: Clustering Dynamical System by Wavelet. Proceeding
of the International Conference Inverse Problem and Applications Birsk, Russia
(2006) 149159
4. Cattani, C., Ciancio, A.: Hybrid Two Scales Mathematical Tools for Active Particles
Modelling Complex System with Learning Hiding Dynamics. Math. Mod. Meth.
Appl. Sci. Vol. 2 No 17 (2007).
5. Daubechies, I.: Ten lectures on wavelets. CBMS-NSF Regional Conference Series
in Applied Mathematics, SIAM, Philadelphia (1992).
6. DOnofrio, A.: A general framework for modeling tumorimmune system competition and immunotherapy: Mathematical analysis and biomedical inference. Physica
D 208 (2005) 220235.
7. Percival, D. B., Walden, A. T.: Wavelet Methods for Time Series Analysis. Cambridge University Press, Cambridge (2000).
8. Toma, C.: An Extension of the Notion of Observability at Filtering and Sampling
Devices. Proceedings of the International Symposium on Signals, Circuits and Systems Iasi SCS, Romania (2001) 233236.
9. Toma, G.: Practical Test Functions Generated by Computer Algorithms. Lecture
Notes Computer Science 3482 (2005) 576585.

Evolution of a Spherical Universe


in a Short Range Collapse/Generation Interval
Ivana Bochicchio and Ettore Laserra
DMI - Universit`
a di Salerno,
Via Ponte Don Melillo, 84084 Fisciano (SA), Italy
ibochicchio@unisa.it, elaserra@unisa.it

Abstract. We study the nal/initial behavior of a dust Universe with


spatial spherical symmetry. This study is done in proximity of the collapse/generation times by an expansion in fractional Puiseux series. Even
if the evolution of the universe has dierent behaviours depending on the
initial data (in particular on the initial spatial curvature), we show that,
in proximity of generation or collapse time, the Universe expands or collapses with the same behavior.
Keywords: Spherical Universe, Fractional Puiseux series, generation/
collapse time.

Introduction

In this paper we consider an Universe with spatial spherical symmetry


around a physical point O and we analyze its behavior in proximity of the
collapse/generation times. In this analysis we use an expansion of the exact
solution of evolution equations in fractional power series (Puiseaux series).1
In particular we introduce the rst principal curvature 1 of the initial spatial
manifold V3 into the evolution equations and we consider these equations in the
three dierent cases of null, positive and negative principal curvature.
In other words, in a short range of times, it is impossible to distinguish the
evolution of the Universe from the Euclidean case (where 1 = 0).
Moreover this result allows the generalization of some of the results found
in the previous papers [4], [8] in the spatially euclidean case (at least in an a
suitable interval of time), also to the not euclidean case.

Evolution Equations

Since in the following we are going to consider dust universes with spatial spherical symmetry, we want to briey summarize the previous main results in a form
inspired by [3,4].
1

A formal series of the form n=m


an z n/k where m and k are integers such that k 1
is called a Puiseux series or a fractional power series (see e. g. [5,6,7]).

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 9971003, 2007.
c Springer-Verlag Berlin Heidelberg 2007


998

I. Bochicchio and E. Laserra

We will consider a dust system C which generates, during its evolution, a


riemannian manifold, which has locally spatial spherical symmetry around a
physical point O;2 the metric can then be given the form [1,2]:3
ds2 = g dx dx = A2 (t, r)dr2 + B 2 (t, r)(d2 + sin2 d2 ) c2 dt2 ,

(1)

where t is the proper time of each particle, r, , are comoving spherical


coordinates and we can interpret B(t, r) as the intrinsic radius of the Osphere
S(r) at time t [1, Chap. XII, 11 p.411].4
We consider now the initial space-like hypersurface V3 (with equation t = 0)
and call rshells the set of particles with comoving radius r (i.e. the dust initially
distributed on the surface of the geodesic sphere with center at O and radius r
(Osphere) S(r)); in accordance with [3,4] we assign each particle of an rshell
the initial intrinsic radius B(0, r) as radial comoving coordinate r
B(0, r) = r .

(2)

If we put a(r) = A(0, r), the metric of the initial Osphere V3 takes the form:
d 2 = ij dxi dxj = a2 (r)dr2 + r2 (d2 + sin2 d2 ) ,

(3)

where ij gij is the metric


of V3 . If we introduce the rst principal
 tensor 
1
1
curvature of V3 , 1 (r) = r2 1 a(r)2 (see [1, Chap.VII 12 (43) p.205]), into
the TolmannBondi evolution equations [2,4], they become (see [9,10]):5

B  (t,r)

A(t, r) = 1r2 1 (r)

r)2 = 1 (r) r2 c2 + 2 GN m(r)


B(t,
B(t,r)

0 (r)r 2
(t, r) = B  (t,r)B 2 (t,r)

(4)

where (t, r) is the mass density, 0 = (0, r) is the initial mass density, and
m(r) is the socalled Euclidean mass [3,4]
 r
m(r) = 4
0 (s) s2 ds .
(5)
0

The rst principal curvature is very important for studying the geometrical
property of V3 in fact, as underlined in the paper [9,10], it completely determines
its curvature properties.
2

See [1, Chap. XII, 11 p.408] for a precise denition of spherical symmetry around
a point O.
In accordance with [1] (but dierently from [3,4]) the latin indices will vary from 1
to 3, whereas the greek indices will vary from 1 to 4.
At any point B12 represents the gaussian curvature of the geodesic sphere with its
centre at the centre of symmetry O and passing through the point [1, Chap. XII,
11 p.410].
Hereafter a dot will denote dierentiation with respect to t and a prime dierentiation with respect to r.

Evolution of a Spherical Universe

999

Remark 1. In [4] it was demonstrated that given a spherical dust universe, for
each material rshell there exists a corresponding time T (r) at which the dust
distributed on the rshell is collapsed into the symmetry center, so that we have
a function t = T (r) which satises B(T (r), r) = 0.6

Exact Solutions of Evolution Equations in Three


Dierent Cases

Now we will focus our attention on a given single r-shell (that is we will consider
r as a given xed parameter), so we can regard the intrinsic radius B = B(t; r) as
a function of time only, and r , 1 (r) , m(r) as constants. By introducing the new
adimensional function Y (t) = B(t;r)
and the function k(r) = GNrm(r)
(which we
3
r
will consider constant being r is a given xed parameter), equation (4)2 becomes
2k
Y 2 (t) = 1 c2 +
.
Y (t)

(6)

In the following we will put  = |1 | and will consider separately the three
cases 1 = 0, 1 =  > 0 and 1 =  < 0, to get the corresponding exact
solutions. The case 1 = 0 corresponds to the Euclidean case a2 (r) = 1, already
studied in [4]; it is the only case where it is possible to solve (4)2 explicitly for
B. Since 1 (r) = 0, equation (4)2 becomes
2 GN m(r)
B 2 =
B(t, r)

2k
Y 2 (t) =
.
Y (t)

We can solve the previous equation by separating the variables:


1
2 3
2 k dt = Y dY t (r) =
Y2
3
k

(7)

(8)

where we have to choose the plus sign if Universe is initially expanding (B(0;
r) >

0), the minus sign if Universe is initially contracting (B(0;


r) < 0) and (r) is
an arbitrary function of the parameter r.
Remark 2. For t = (r) Y = 0 B = 0 and we know, from remark 2,
that for each r exists a unique instant T (r) which satises B(r, T (r)) = 0, where
T (r) is the time at which the dust distributed on the r-shell is collapsed into the
symmetry center, so we have t = T (r) and consequently (r) T (r).
We can calculate the function (r) T (r) through the initial values B(0; r) =
r Y (0) = 1

1
2
1
2 r3
T (r) =
=
.
(9)
3 k(r)
3 GN m(r)
6

If the initial mass density is constant, T (r) is also constant.

1000

I. Bochicchio and E. Laserra

It is possible to solve equation (8)2 with respect to Y and we can write the
solutions of equations (7) in the form [4]:

23

23
t
t
Y (t, r) = 1
B(t, r) = r 1
.
(10)
T (r)
T (r)
If 1 (r) = (r) > 0, equation (4) becomes
2k
Y 2 =  c2 +
.
Y
Now we can write
Y 2 = 2 k
where h(r) =

1
1

Y
h

dY
=
dt

(11)

2k
h

hY
,
Y

(12)

2 k(r)
(r)c2

> 0. We can solve (12) by separating the variables:




2k
Y
dt =
dY

h
h Y





h
Y
t (r) =
(h Y )Y + h arctan
2k
hY

(13)

(14)

where we have to choose the plus sign if Universe is initially expanding (B(0;
r) >

0), the minus sign if Universe is initially contracting (B(0; r) < 0) and (r) is
an arbitrary function of r.
Remark 3. Also in this case for t = (r) Y = 0 B = 0 then, from Remark
2, t = T (r) and consequently (r) T (r).

h
By substituting Bmax = h(r) r and 2k
= c12  we nd




1
B
t T (r) =
(Bmax B)B + Bmax arctan
.
Bmax B
 r 2 c2
(15)
We can calculate the function (r) T (r) from the initial values B(0; r) = r




1
r
T (r) =
(Bmax r)r + Bmax arctan
. (16)
Bmax r
 r 2 c2
Finally, when 1 (r) = (r) < 0, equation (4) becomes
2k
 c2 Y + 2k
Y 2 =  c2 +
Y 2 =
Y
Y

1 + c2 k Y
dY

= 2k
.
dt
Y
We can solve equation (18) separating the variables:

(17)

(18)

Evolution of a Spherical Universe

t (r) =

1
Y
dt = 

2k 1 + c2  Y
2k

c k Y  2 k + c2 Y  2 k arcsinh( c 2Yk )
3

c3  2

1001

(19)

(20)

where we have to choose the plus sign if Universe is initially expanding (B(0;
r) >

0), the minus sign if Universe is initially contracting (B(0;


r) < 0) and (r) is
an arbitrary function of r.
Remark 4. Also in this case for t = (r) Y = 0 B = 0 then, from
Remark 2, t = T (r) and consequently (r) T (r).
So we can calculate the function (r) from the initial value B(0; r) = r

c k  2 k + c2  2 k arcsinh( c2 
)
k
T (r) =
3
3
c 2

(21)

Study of the Behaviour of the Universe in Proximity


of the Collapse/Generation Times by an Expansion in
Fractional Power Series

Now we want to study the behaviour of the universe in proximity of the collapse/expansion times by an expansion in fractional (Puiseux) series.7
Remark 5. In proximity of the times of generation or collapse the evolution has
the same behaviour apart from its initial geometry. In addition the function
T (r) has approximately the same form in all of the three dierent cases 1 = 0,
1 > 0 and 1 < 0.
4.1

Initial Principal Curvature 1 Positive

We already remarked that it is not possible to solve (14) explicitly with respect
to B, but we can approximate the exact solution by an opportune fractional
power series (or Puiseux series):8





h
Y
(h Y )Y + h arctan
=
(22)
2k
hY
7

In [11] the approximate explicit solution was obtained through an expansion in power
series of the parametric equations, therefore by a double expansion in power series.
It is not possible to expand the second member of (14) in a simple power series
with
respect to Y , but we can develop it in Mac Lauren series with respect to Y thus
obtaining a fractional power series. As it is known the fractional power series are
particular cases of Puiseux series (see e.g. [5]).

1002

I. Bochicchio and E. Laserra

2
= Y
3 k

3
2

Y
5h 2k

5
2

28

h2

2k

7
2

(23)

By truncating the fractional series to the rst term (with precision 3/2),
we nd

1 2
3
t (r) =
Y2 .
(24)
3 k
So in our approximation we found the same expression (8) that characterizes
the case 1 = 0: in proximity of the generation or collapse times, the r-shells
expand or collapse with the same behaviour as in the case 1 = 0 and the
function T (r) has, approximately, the form (9), in agreement with [11].
4.2

Initial Principal Curvature 1 Negative

Also in this case, being not possible to solve (20) explicitly with respect to B,
we can approximate the exact solution by a Puiseux series:

c Y  2 k + c2 Y  k 2 k arcsinh( c 2Yk )
=
(25)
3
c3  2

3
5
7
2Y 2
c2 Y 2
3 c4 2 Y 2

=
3 +
5 +
3 k
10 2 k 2
112 2 k 2

(26)

By truncating the fractional series to the rst term (with precision 32 ), we nd



1 2
3
t (r) =
Y2
(27)
3 k
So in our approximation we found again the same equation that characterizes
the case 1 = 0: in proximity of the generation or collapse times, the r-shells
expand or collapse with the same behavior that in the case 1 = 0. Moreover, also in this case, the function T (r) has, approximately, the form (9) (see
also [11]).

References
1. Levi-Civita, T.: The Absolute Dierential Calculus. Dover Publications Inc. (1926)
2. Bondi, H.: Spherically symmetrical models in general relativity. Monthly Notices
107 (1947) p. 410
3. Laserra, E.: Sul problema di Cauchy relativistico in un universo a simmetria
spaziale sferica nello schema materia disgregata. Rendiconti di Matematica 2 (1982)
p. 283
4. Laserra, E.: Dust universes with spatial spherical symmetry and euclidean initial
hypersurfaces. Meccanica 20 (1985) 267271
5. Amerio, L.: Analisi Matematica con elementi di Analisi Funzionale. Vol. 3 Parte I
Utet

Evolution of a Spherical Universe

1003

6. Siegel, C. L.: Topics in Complex Function Theory. Elliptic Functions and Uniformization Theory. New York: Wiley 1 (1988) p. 98
7. Davenport, J. H., Siret Y., Tournier. E.: Computer Algebra: Systems and Algorithms for Algebraic Computation. 2nd ed. San Diego: Academic Press (1993)
9092
8. Iovane, G., Laserra, E., Giordano, P.: Fractal Cantorian structures with spatial
pseudo-spherical symmetry for a possible description of the actual segregated universe as a consequence of its primordial uctuations. Chaos, Solitons & Fractals
22 (2004) 521528
9. Bochicchio, I., Laserra, E.: Spherical Dust Model in General Relativity. Proceeding
of the International Conference Inverse Problem and Applications , May 22-23
2006, Birsk, Russia, ISBN 5-86607-266-1 (2006) 144148
10. Bochicchio, I., Laserra, E.: Inuence of the Initial Spatial Curvature on the Evolution of a Spatially Spherical Universe (to appear in Mathematical Methods, Physical Models and Simulation)
11. Giordano, P., Iovane, G., Laserra, E.: El Naschie () Cantorian Structures with
spatial pseudo-spherical symmetry: a possibile description of the actual segregated
Universe. Chaos, Solitons & Fractals 31 (2007) 11081117

On the Dierentiable Structure of Meyer


Wavelets
Carlo Cattani1 and Luis M. S
anchez Ruiz2
1

DiFarma, Universit`
a di Salerno, Via Ponte Don Melillo 84084 Fisciano (SA)- Italy
ccattani@unisa.it
2
ETSID-Departamento de Matem
atica Aplicada, Universidad Politecnica de
Valencia, 46022 Valencia, Spain
lmsr@mat.upv.es

Abstract. In this paper the dierential (rst order) properties of Meyer


wavelets are investigated.
Keywords: Meyer Wavelet, Connection coecients, Renable integrals.
AMS-Classication 35A35.

Introduction

Wavelets have widely been studied from a theoretical point of view for its many
interesting properties, mainly related with multiresolution analysis such as generating orthonormal basis in L2 (R) as well as for the fact that they have proven
to be extremely useful in many applications such as image processing, signal
detection, geophysics, medicine or turbulent ows. More mathematically focussed dierential equations and even non linear problems have also been studied
with wavelets. Very often wavelets are compared with the Fourier basis (harmonic functions), however the basic dierence is that the harmonic functions
are bounded in the frequency domain (localized in frequency) while wavelets
are bounded both in space and in frequency. Nevertheless a major drawback
for wavelet theory is the existence of many dierent families of wavelets, giving
some arbitrariness to the whole theory. Among the many families of wavelets
the simplest choice is the one based on Haar functions. Despite their simplicity
Haar wavelets have proven their suitability for dealing with problems in which
piecewise constant functions or functions with sharp discontinuities appear. The
scaling function is the box function dened by the characteristic function [0,1]
and its Fourier transform, up to constants or phase factor, is a function of the


Work partially supported by Regione Campania under contract Modelli nonlineari di materiali compositi per applicazioni di nanotecnologia chimica-biomedica,
LR 28/5/02 n. 5, Finanziamenti 2003, by Miur under contract Modelli non Lineari per Applicazioni Tecnologiche e Biomediche di Materiali non Convenzionali,
Univ. di Salerno, Processo di internazionalizzazione del sistema universitario, D.M.
5 agosto 2004 n. 262 - ART. 23 and by Applications from Analysis and Topology
- APLANTOP, Generalitat Valenciana 2005.

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10041011, 2007.
c Springer-Verlag Berlin Heidelberg 2007


On the Dierentiable Structure of Meyer Wavelets

1005

sin(t)
type
, also called sinc-function. By exchanging the role of the variable
t
and frequency space, i.e. assuming as Fourier transform the box function and the
sinc-function in the space of variable, we can construct the so-called Shannon
wavelets [2]. In a more general approach they can be derived from the real part
of harmonic (complex) wavelets [2] and its scaling function may be obtained by
choosing as such a function in the Fourier space such that it satises various
conditions for a scaling function and then nd the wavelet basis. In fact the
Haar and Shannon systems reverse the roles of the kind of scaling and wavelet
functions. Very recently the connection coecients both of Shannon wavelets
and harmonic wavelets [2] have been explicitly computed at any ordered.
Let us recall that a drawback of the Shannon wavelet in the time domain is
its slow decay which has been improved by smoothing the scaling function in
the frequency space, e.g. by means of the Meyer scaling function that instead
of enjoying a sharp discontinuity, uses a smooth function in order to interpolate
between its 1 and 0 values [4]. In this paper we study the dierentiable structure
of Meyer wavelets and, in particular, its connection coecients in the line of the
aformentioned results obtained for the Harmonic [2] and Shannon wavelet.

Meyers Wavelets

Meyers wavelets are dened in a such a way to avoid the slow decay (of compact
support frequency domain wavelets) in the space domain. In order to do this one
needs a continuous interpolating even function () dened on R, [0, 1] valued
which is proportional to n+1 (see the interpolating polynomials in table 1).
There follows that the Meyer scaling function is given by [4]

, || < 2

3
1 
3
 2
4

()
= cos 2 2
|| 1 ,
||

3
3

0
, || > 4
3

(1)

where (x) (see Table 1) is an interpolating polynomial (see [4]).

Table 1. Interpolating Polynomials

n
0
1
2
3
4
5
6

()

2 (3
 2 )

3 10 15 + 6 2

4 35 84 + 70 2 20 3

5 126 420 + 540 2 315 3 + 70 4

6 462 1980 + 3465 2 3080 3 + 1386 4 252 5

7 1716 9009 + 20020 2 24024 3 + 16380 4 6006 5 + 924 6

1006

C. Cattani and L.M. S


anchez Ruiz

If we dene as a characteristic function



2
4
1 , < || <
() =
3
3
0 , elsewhere
the scaling function can be written as


3
3

()
= ( + 23 ) + 12 ei ( 2 ||1)/2 + ei ( 2 ||1)/2 ()

(2)

(3)

By taking into account that, for the properties of the Fourier transform, it is
1
f(a b) = ei b /a f(/a) ,
a

(4)

there follows that the dilated and translated instances of the scaling function are
def
n t k) = 2n/2 (2
 n k) = 2n/2 2n ei2n k (2
 n )
nk () = 2n/2 (2

that is

n
 n )
nk () = 2n/2 ei2 k (2

(5)

so that the Meyer scaling function at any scale and frequency localization is [4]

2n+1

1
,
||
<

  3 n
 2n+1
2n+2
n
n/2 i2n k

k () = 2
e
cos

|2
|

1
,

||

2
2

3 n+2
3

0
, || >
3
(6)
where (x) is the interpolating polynomial (Table 1). According to (5)-(3) it is
n
nk () = 2n/2 ei2 k (2n + 23 ) +



3
3
1
i ( 2
|2n |1)/2 + ei ( 2
|2n |1)/2 (2n )
e
2

(7)

The scaling function in the time domain is obtained by nding its inverse
Fourier transform


1 4/3 
def 1
it

(t) =
()e d =
() cos (t)d
(8)
2
0

Some Properties of the Characteristic Function

In general for the characteristic function (2) we can set



1 , a < || b
(a,b] () =
0 , elsewhere

(9)

On the Dierentiable Structure of Meyer Wavelets

for which, assuming a b , c d , h > 0 , k > 0, the following properties

(a,b] (h k)
= ( a k, b k] ()

h
h

(a,b] (h)
= (a,b] (h)

(a,b] () + (c,d]()
= (a,b](c,d]() + (a,b](c,d] ()

(a,b] ()(c,d] ()
= (a,b](c,d]()

(a+s,b+s] ()(c+s,d+s] () = (a,b] ()(c,d] () = (a,b](c,d]()

1007

(10)

hold. According to the previous equations, the characteristic function (9) on any
interval (a, b] can be reduced to the basic function (2):


2
(a,b] () =
(b 2a) ,
b>a.
3(b a)
Analogously, according to (10)1 and (2), we can generalize the basic function (2)
to any interval being
4
(h k) = ( 2
() .
3h k, 3h k]

(11)

The product of the characteristic functions, enjoys some nice properties:


Lemma 1. For the product of the characteristic functions we have:
2
2
4 m
2
2 n 4 n ()
(2m + ) (2n ) = ( 2
m
3 2 3 , 3 2 3 ]( 3 2 , 3 2 ]
3
with



2 m 2 4 m 2
2 n 4 n
2 ,
2
2 ,
2 =
3
3
3
3
3
3

m
n1

2m > 1 + 2n+1
2 < + 2

4 m 2 2 n
1
1
n1

+ < 2m < 2n +
3 2 3 , 3 2 2
2
2

= 
2
4
1

n
n
n
m
n

2 ,
2 2 + < 2 < 2 + 1

3
3
2

2
2
4

2m ,
2n 2n + 1 < 2m < 2n+1 + 1

3
3
3

(12)

(13)

being () = 0
Analogously it can be easily shown that
Lemma 2. For the product of the characteristic functions we have:
2
(2m + ) (2n ) = ( 2
m 2 , 4 2m 2 ]( 2 2n , 4 2n ] ()
3 2
3
3
3
3
3
3

(14)

1008

C. Cattani and L.M. S


anchez Ruiz

From the above there follows, in particular

n
n1

) = 0
(2 + ) (2
3

(2n1 + ) (2n ) = 0
3

(15)

Another group of values of characteristic functions useful for the following


computations are given by the following lemmas
Lemma 3. For the following products of the characteristic functions we have:
2 n 4 n ()
(2m ) (2n ) = ( 2
m 4 m
3 2 , 3 2 ]( 3 2 , 3 2 ]

(2m ) (2n ) = ( 2
m , 4 2m ]( 2 2n , 4 2n ] ()
3 2
3
3
3

(16)

Taking into account the previous lemmas and (10)5 we have also
Corollary 1. According to lemma 3 it is
2
2
(2m + )(2n + ) = (2m )(2n ) ,
3
3
and, in particular,

(2n + 2 ) (2n1 + 2 ) = 0
3
3

n
n1
(2 ) (2
)
=0

(17)

First Order Meyer Wavelet

If we take as interpolating function (Table 1) the linear one () = , we get


from (6),

2n+1

1
,
||
<

 3 n  2n+1
2n+2
n
n/2 i2n k

k () = 2
e
sin
|2
|
,

||

3 n+2
3

0
, || >
3

(18)

that is


 n2

2
n
n/2 i2n k
n
n

k () = 2
e
(2 + ) + sin 2
3 || (2 )
3

(19)

From equation (19) it immediately follows that for k = 0 the scaling functions
n0 () are real functions while for k = 0 the functions nk () have also a nontrivial complex part (see Fig. 1). Moreover, the functions n0 () are orthonormal
functions with respect to the inner product on L2 (R):

On the Dierentiable Structure of Meyer Wavelets


00 
1

01 
1

 
2

 
2

1009


2


2

1
1
1 

02 
1

 
2


2

 
2


2

1
1
11 

 
2

12 


2

 
2


2

Fig. 1. The Meyer scaling function in frequency domain (plain the real part)


f, g
=

def

f (t) g (t)dt

(20)

with f (t) , g (t), in L2 (R) and the bar stands for the complex conjugate. By
taking into account the Parseval equality it is
1
f, g
=
2

1  
g
f() g ()d =
f, 
2

In the Fourier domain, it is possible to show that

(21)

1010

C. Cattani and L.M. S


anchez Ruiz

Theorem 1. The scaling functions {0k (t)} are orthonormal (see e.g. [4])
0k (t) , 0h (t)
= kh .
Proof: It is enought to show that, due to (21),
1 0
() , 0h ()
= kh
2 k
From equation (19) and taking into account the denition (20) of the inner
product it is





2
3
ik
0
0


k () , h ()
=
e
( + ) + sin
|| ()
3
4





2
3
eih ( + ) + sin
|| () d
3
4
Since the compact support of the characteristic functions are disjoint: ( +
2
3 ) () = 0, we get





2
3
2
i(kh)
0
0


k () , h ()
=
e
( + ) + sin
|| () d
3
4

i.e. taking into account the denition of the characteristic functions


 2/3
0k () , 0h ()
=
ei(kh) d +
2/3




 2/3
 4/3
3
3
i(kh)
+
e
sin2
|| d +
ei(kh) sin2
|| d
4
4
4/3
2/3
Thus when k = h,
0k () , 0k ()
=
=

2/3

d +
2/3

2/3

sin
4/3




 4/3
3
3
2
d +
sin
d
4
4
2/3

+ + = 2
3
3
3

when k = h, let say k = h + n, it is



1
0k () , 0h ()
= sin
n

2
n
3

 9 + 18 cos

There follows that


0k () , 0h ()
=

2
n
3

9 4n2


2 , k = h
0 , k = h


=0 .

On the Dierentiable Structure of Meyer Wavelets

1011

The Meyer wavelet in the Fourier domain is [4]





 2) + (
 + 2) (/2)

()
= ei/2 (
and, according to (4),



 (/2)

()
= ei/2 e2i + e2i ()
i.e.


 (/2)

()
= 2ei/2 cos(2)()

(22)

From (22) we can easily derive the dilated and translated instances
n
(n1)/2 n+1
kn () = 2n/2+1 ei2 (k+1/2) cos(2n+1 )2n/2 n
0
()
0 ()2

i.e.
n
n+1 () .
kn () = 2(n+1)/2 ei2 (k+1/2) cos(2n+1 )n
0 ()0

From this denition we can easily prove that Meyer wavelets are orthonormal
functions.

References
1. D.E. Newland, Harmonic wavelet analysis, Proc.R.Soc.Lond. A, 443, (1993)
203222.
2. C.Cattani, Harmonic Wavelets towards Solution of Nonlinear PDE, Computers
and Mathematics with Applications, 50 (2005), 1191-1210.
3. H. Mouri and H.Kubotani, Real-valued harmonic wavelets, Phys.Lett. A, 201,
(1995) 5360.
4. I. Daubechies, Ten Lectures on wavelets. SIAM, Philadelphia, PA, (1992).

Towards Describing Multi-fractality of Traffic Using


Local Hurst Function
Ming Li1, S.C. Lim2, Bai-Jiong Hu1, and Huamin Feng3
1

School of Information Science & Technology, East China Normal University,


Shanghai 200062, P.R. China
ming_lihk@yahoo.com, hubjune@gmail.com
2
Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Selanger, Malaysia
sclim@mmu.edu.my
3
Key Laboratory of Security and Secrecy of Information, Beijing Electronic Science and
Technology Institute, Beijing 100070, P.R. China
fenghm@besti.edu.cn

Abstract. Long-range dependence and self-similarity are two basic properties of


network traffic time series. Fractional Brownian motion (fBm) and its increment
process fractional Gaussian noise (fGn) are commonly used to model network
traffic with the Hurst index H that determines both the regularity of the sample
paths and the long memory property of traffic. However, it appears too restrictive
for traffic modeling since it can only model sample paths with the same
smoothness for all time parameterized by a constant H. A natural extension of
fBm is multifractional Brownian motion (mBm), which is indexed by a
time-dependent Hurst index H(t). The main objective of this paper is to model
multi-fractality of traffic using H(t), i.e., mBm, on a point-by-point basis instead
of an interval-by-interval basis as traditionally done in computer networks. The
numerical results for H(t) of real traffic, which are demonstrated in this paper,
show that H(t) of traffic is time-dependent, which not only provide an alternative
evidence of the multifractal phenomena of traffic but also reveal an challenging
issue in traffic modeling: multi-fractality modeling of traffic.
Keywords: Network traffic modeling, fractals, multi-fractals, multifractional
Brownian motion, local Hurst function.

1 Introduction
Experimental observations of long-range dependence (LRD) and self-similarity (SS) of
traffic time series in computer communication networks (traffic for short) were actually
noted before the eighties of last century [1]. The description of traffic in [1] was further
studied and called packet trains during the eighties of last century [2]. However, the
properties of LRD and SS of traffic were not investigated from a view of self-affine
random functions, such as fractional Brownian motion (fBm) or fractional Gaussian
noise (fGn), until the last decade, see e.g. [3], [4], [5], [6], [7], and our previous work
[8], [9], [10], [11], [12].
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10121020, 2007.
Springer-Verlag Berlin Heidelberg 2007

Towards Describing Multi-fractality of Traffic Using Local Hurst Function

1013

Further research of traffic exhibits that traffic has multifractal behavior at small
time-scales. However, the multifractal behavior of traffic is conventionally described
on an interval-by-interval basis by using H(n), where H(n) is the Hurst parameter in the
nth interval for n = 1, 2, , see e.g. [13], [14], [15], [16], [17], [18], and our recent work
[19]. Note that H plays a role in computer networks, see e.g. [20], [21], and our recent
papers [19], [22], [23]. Hence, modeling multi-fractality of traffic becomes a
contemporary topic in traffic modeling. From a practice view, any models of
multi-fractal phenomena of traffic are desired as they may be promising to understand
and or find solutions to some difficult issues, such as simulation of the Internet,
performance evaluation, network security, and so forth, in networking as can be seen
from [15], [19], [34]. [35].
Owing to the fact that a monofractal model utilizes fractional Brownian motion
(fBm) with the constant Hurst index H that characterizes the global self-similarity,
see e.g. [24], we need studying the possible variation of scaling behavior locally. To do
so, fBm can be generalized to multifractional Brownian motion (mBm) by replacing the
constant H with a time-dependent Hurst function H (t ), which is also called the local
Holder exponent see e.g. [26], [27], and our work [28], [29], [30]. In this paper, we
discuss and describe the multi-scale and multi-fractal properties of real traffic based on
H(t). We shall exhibit that H (t ) of traffic change erratically with location t. It is noted

that if H (t ) is allowed to be a random function or a random process, then the mBm is a


multifractal process. We note that H (t ) differs substantially from H(n) on an
interval-by-interval since it can reflect the multifractal behaviors on a point-by-point
basis. To the best of our knowledge, modeling multi-scaled and multi-fractal
phenomena of real traffic using H (t ) is rarely seen.
The rest of paper is organized as follows. We address modelling the multi-fractality
of traffic based on the local Hurst function in Section 2. Discussions are given in
Section 3, which is followed by conclusions.

2 Multi-fractality of Real Traffic


A direct generalization of fBm to multifractional Brownian motion (mBm) can be
carried out by replacing the Hurst index with a function H (t ), satisfying
H :[0, ] (0, 1). This was first carried out independently by Peltier and
Levy-Vehel [27] and Benassi, Jaffard and Roux [31] based on the moving average and
harmonizable definitions respectively. Following [24] and [26], we define mBm X (t )
by Eq. (1), where t > 0 and H :[0, ] (a, b) (0, 1) is a Holder function of
exponent > 0, and B(t ) is the standard Brownian motion. The variance of BH (t ) is
given by Eq. (2), where H2 (t ) =

(2 H (t )) cos( H (t ))
. Since is time-dependent,
H (t )(2 H (t ) 1)

1014

M. Li et al.

2
2 H (t )
it will be desirable to normalize BH (t ) such that E ( X (t ) ) = t
by replacing

X (t ) with X (t )
.

H (t )

X (t ) =

1
( H (t ) + 1/ 2)

(t s)

H (t ) 1/ 2

( s ) H (t ) 1/ 2 dB ( s )

(1)

+ (t s )

H (t ) 1/ 2

dB( s ).

2
2 H (t )
E ( X (t ) ) = H2 ( t ) t
.

(2)

For the subsequent discussion, X (t ) will be used to denote the normalized process.
The explicit expression of the covariance of X (t ) can be calculated by
E [ X (t1 ) X (t2 ) ] =
H ( t ) + H ( t2 )
N ( H (t1 ), H (t2 ) ) t1 1
+ t2

H ( t1 ) + H ( t2 )

t1 t2

H ( t1 ) + H ( t2 )

(3)

where
H (t1 ) + H (t2 )
(2 H (t1 ) H (t2 )) cos

.
N ( H (t1 ), H (t2 ) ) =
H (t1 ) + H (t2 )

( H (t1 ) + H (t2 ) 1)
2

With the assumption that H (t ) is -Holder function such that 0 < inf( H (t ))
sup( H (t )) < (1, ), one may approximate H (t + u ) H (t ) as 0. Therefore, the
local covariance function of the normalized mBm has the following limiting form

E [ X (t + ) X (t )] ~

1
t +
2

2 H (t )

+t

2 H (t )

2 H (t )

) , 0.

(4)

The variance of the increment process becomes

E [ X (t + ) X (t ) ]

}~

2 H (t )

, 0

(5)

implying that the increment processes of mBm is locally stationary. It follows that the
local Hausdorff dimension of the graphs of mBm is given by
dim{ X (t ), t [a, b]} = 2 min{H (t ), t [a, b]}

(6)

for each interval [a, b] R + .


Due to the fact that the Hurst index H is time-dependent, mBm fails to satisfy the
global self-similarity property and the increment process of mBm does not satisfy the

Towards Describing Multi-fractality of Traffic Using Local Hurst Function

1015

stationary property. Instead, standard mBm now satisfies the local self-similarity.
Recall that fBm BH (t ) is a self-similar Gaussian process with BH (at ) and a H BH (t )
having identical finite-dimensional distributions for all a > 0. For a locally self-similar
process, therefore, one may hope that the following expression can provide a
description for the local self-similarity of X (t ) :

X (at ) a H ( t ) X (t ), a > 0,

(7)

where stands for equality in distribution. However, this definition of locally


self-similar property would lead to a situation where the law of X ( s ) depends on

H (t ) when s is far away from t : X ( s ) ( s / t ) H ( t ) X (t ). A more satisfactory way of


characterizing this property is the locally asymptotical self-similarity introduced by
Benassi, Jaffard and Roux [31]. A process X (t ) indexed by the Holder exponent
H (t ) C such that H (t ) :[0, ] (0, 1) for t R and > sup( H (t )) is said to be
locally asymptotically self-similar (lass) at point t0 if
X (t 0 + u ) X (t 0 )
lim
BH ( t0 ) (u )
H ( t0 )

uR

0+

uR

(8)

where the equality in law is up to a multiplicative deterministic function of time and


BH (t0 ) is the fBm indexed by H (t0 ). It can be shown that mBm satisfies such a locally
self-similar property. In passing, the property described by (8) is also analyzed in our
recent work [32] from a view of the Cauchy class.
Based on the local growth of the increment process, one may write a sequence

Sk ( j ) =

m j+k
X (i + 1) X (i) , 1 < k < N ,
N 1 j =0

(9)

where m is the largest integer not exceeding N / k . The local Hurst function H (t ) at
point t = j /( N 1) is then given by
H (t ) =

log( / 2 S k ( j ))
.
log( N 1)

(10)

The function H (t ) in (10) can serve as a numerical model of multi-fractality of


traffic. Now, we select 4 widely-used real-traffic traces in computer networks. They are
DEC-PKT-n.tcp (n = 1, 2, 3, 4) [33]. Fig. n (a) shows their time series, where X (i )
implies the number of bytes in the ith packet (i = 0, 1, 2, ). Fig. n (b) illustrates the
corresponding local Hurst function.
Recall that the mBm is a locally self-similar process. For H (t ) which is a
continuous deterministic function, the resulting mBm is a multi-scale process. On the
other hand, if H (t ) is a random function or a random process, then the mBm is a

1016

M. Li et al.

0.8
H(t)

x(i), Bytes

1000

500

0.75

0.7
0

256

512

768

1024

2048

4096

6144

8192

(a)

(b)

Fig. 1. (a) Traffic time serie s X(i) of DEC-PKT-1.tcp. (b) Local Hurst function of X(i)

0.8
H(t)

x(i), Bytes

2000

1000

0.75

0.7
0

256

512

768

1024

2048

4096

6144

8192

(a)

(b)

Fig. 2. (a) Traffic time series X(i) of DEC-PKT-2.tcp. (b) Local Hurst function of X(i)

0.8
H(t)

x(i), Bytes

1000

500

0.78

0.76
0

256

512

768

1024

2048

4096

6144

8192

(a)

(b)

Fig. 3. (a) Traffic time series X(i) of DEC-PKT-3.tcp. (b) Local Hurst function of X(i)

0.85
H(t)

x(i), Bytes

1000

500

0.8

0.75
0

256

512

768

1024

2048

4096

(a)

6144

8192

(b)

Fig. 4. (a) Traffic time series X(i) of DEC-PKT-4.tcp. (b) Local Hurst function of X(i)

Towards Describing Multi-fractality of Traffic Using Local Hurst Function

1017

Multifractal process. From the above figures, it is obviously seen that H (t ) appears
random. Thus, H (t )s illustrated in Fig. 1 (b) ~ Fig. 4 (b) are numerical models of
multi-fractality of real traffic DEC-PKT-n.tcp (n = 1, 2, 3, 4).

3 Discussions
The previous figures verify that traffic is multi-fractal in nature.
Recall that the Hurst parameter characterizes the burstiness of a process from a
networking point of view. From the above figures, we see that the local Hurst function of a
real-traffic series is time-varying. Thus, a traffic series may have different burstiness at
different points in time. As is known, if the space of a buffer is fully occupied, some packets
will be dropped and retransmitted later when the traffic load becomes light, but
these packets suffer delay. The above H(t)s provide evidence of why, during a
communication process, packets are delayed due to traffic congestion and why
delay is usually a random variable.
Further, the local Hurst functions in Figs. 1(b)-4(b) verify that traffic has strictly
alternating ON/OFF-period behavior, where the term strictly alternating ON- and
OFF-periods implies that 1) the lengths of ON-periods are independent and identically
distributed (i.i.d.) and so are the lengths of OFF-periods, and 2) an OFF-period always
follows an ON-period [16].
The assumption that the Hurst function is continuous implies H(t + \tau) \approx H(t) for
\tau \to 0. Therefore, the normalized mBm has the covariance in the following limiting
form:

E[X(t+\tau) X(t)] = 0.5\left( |t|^{2H(t)} + |t+\tau|^{2H(t)} - |\tau|^{2H(t)} \right), \quad \tau \to 0.        (11)

The variance of the increment process is given by

E\left[ X(t+\tau) - X(t) \right]^{2} = |\tau|^{2H(t)}, \quad \tau \to 0.        (12)

Therefore, one sees that the increment process of mBm is locally stationary.
In practice, a process is causal. Hence, we consider X_+(t), which stands for a causal
process with the starting point t = 0. In this case, one has the one-sided mBm based on
the fractional integral of Riemann-Liouville type as follows [30]:

X_+(t) = \frac{1}{\Gamma(H(t) + 1/2)} \int_0^{t} (t-u)^{H(t)-0.5}\, dB(u).        (13)

For t_1 < t_2, we have the covariance

E[X_+(t_1) X_+(t_2)] = \frac{2\, t_1^{H(t_1)+0.5}\, t_2^{H(t_2)-0.5}}{(2H(t_1)+1)\, \Gamma(H(t_1)+0.5)\, \Gamma(H(t_2)+0.5)}\; {}_2F_1\!\left(0.5 - H(t_2),\, 1;\, H(t_1)+1.5;\, t_1/t_2\right).        (14)


The variance of X_+(t) has a similar form to that of X(t), i.e., it behaves as t^{2H(t)} up to a
deterministic (or random) function of H(t).
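For readers who want to experiment with (13), a minimal Python sketch is given below; it is a crude Riemann-sum discretization of the stochastic integral under an arbitrarily chosen smooth H(t), not the authors' implementation, and the step size, the midpoint rule and the function names are our assumptions.

```python
import numpy as np
from scipy.special import gamma

def rl_mbm(H, N=1024, T=1.0, seed=0):
    """Crude synthesis of one-sided mBm of Riemann-Liouville type, Eq. (13):
    X_+(t) = 1/Gamma(H(t)+1/2) * int_0^t (t - u)^(H(t)-1/2) dB(u)."""
    rng = np.random.default_rng(seed)
    dt = T / N
    dB = rng.normal(scale=np.sqrt(dt), size=N)     # Brownian increments on [0, T]
    t = np.arange(1, N + 1) * dt
    X = np.zeros(N)
    for n in range(N):
        Ht = H(t[n])
        u = t[:n + 1] - 0.5 * dt                   # midpoints, keeps (t - u) > 0
        X[n] = ((t[n] - u) ** (Ht - 0.5)) @ dB[:n + 1] / gamma(Ht + 0.5)
    return t, X

# Example with an arbitrary smooth, time-varying Hurst function
t, X = rl_mbm(lambda s: 0.6 + 0.2 * np.sin(2.0 * np.pi * s), N=2048)
```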


Though the previously discussed H(t) appears time dependent, as can be seen from
Figs. 1-4, its analytic model remains unknown. Clearly, the multi-fractality of traffic
may be quantitatively modelled if analytic models of H(t), either deterministic or
statistical, are achieved. Either would be greatly desired in practical applications such as
traffic pattern recognition, as can be seen from [12], [19]. Finding analytic models of H(t)
is our further aim, which is certainly attractive.

4 Conclusions
We have demonstrated the multi-fractality of real traffic based on local
Hurst functions. The present results show that the local Hurst function H(t) of the
investigated real traffic is time-varying. The significance of the present results is not only
to exhibit multifractal phenomena of traffic on a point-by-point basis, instead of the
interval-by-interval basis conventionally used in computer networks, but, more
importantly, to take the research on the multi-fractality of traffic a step further towards
modeling it.

Acknowledgements
This work was supported in part by the National Natural Science Foundation of China
under the project grant numbers 60573125 and 60672114, by the Key Laboratory of
Security and Secrecy of Information, Beijing Electronic Science and Technology
Institute under the project number KYKF 200606 of the open fund. SC Lim would
like to thank the Malaysia Ministry of Science, Technology and Innovation for the
IRPA Grant 09-99-01-0095 EA093, and Academy of Sciences of Malaysia for the
Scientific Advancement Fund Allocation (SAGA) P 96c.

References
1. Tobagi, F.A.; Gerla, M., Peebles, R.W., Manning, E.G.: Modeling and Measurement
Techniques in Packet Communication Networks. Proc. the IEEE 66 (1978) 1423-1447
2. Jain, R., Routhier, S.: Packet Trains-Measurements and a New Model for Computer
Network Traffic. IEEE Journal on Selected Areas in Communications 4 (1986) 986-995
3. Csabai, I.: 1/f Noise in Computer Network Traffic. J. Phys. A: Math. Gen. 27 (1994)
L417-L421
4. Paxson V., Floyd, S.: Wide Area Traffic: The Failure of Poisson Modeling. IEEE/ACM T.
Networking 3 (1995) 226-244
5. Beran, J., Sherman, R., Taqqu, M. S., Willinger, W.: Long-Range Dependence in Variable
Bit-Rate Video Traffic. IEEE T. Communications 43 (1995) 1566-1579
6. Crovella, E., Bestavros, A.: Self-Similarity in World Wide Web Traffic: Evidence and
Possible Causes. IEEE/ACM T. Networking 5 (1997) 835-846


7. Tsybakov, B., Georganas, N. D.: Self-Similar Processes in Communications Networks.


IEEE T. Information Theory 44 (1998) 1713-1725
8. Li, M., Jia, W., Zhao, W.: Correlation Form of Timestamp Increment Sequences of
Self-Similar Traffic on Ethernet. Electronics Letters 36 (2000) 1168-1169
9. Li, M., Jia, W., Zhao, W.: Simulation of Long-Range Dependent Traffic and a TCP Traffic
Simulator. Journal of Interconnection Networks 2 (2001) 305-315
10. Li, M., Chi, C.-H.: A Correlation-Based Computational Method for Simulating Long-Range
Dependent Data. J. Franklin Institute 340 (2003) 503-514
11. Li, M., Zhao, W., Jia, W., Long, D.-Y., Chi, C.-H.: Modeling Autocorrelation Functions of
Self-Similar Teletraffic in Communication Networks Based on Optimal Approximation in
Hilbert Space. Applied Mathematical Modelling 27 (2003) 155-168
12. Li, M.: An Approach to Reliably Identifying Signs of DDOS Flood Attacks Based on LRD
Traffic Pattern Recognition. Computer & Security 23 (2004) 549-558
13. Cappe, O., Moulines, E., Pesquet, J.-C., Petropulu, A., Yang X.: Long-Range Dependence
and Heavy Tail Modeling for Teletraffic Data. IEEE Sig. Proc. Magazine 19 (2002) 14-27
14. Feldmann, A., Gilbert, A. C., Willinger, W., Kurtz, T. G.: The Changing Nature of Network
Traffic: Scaling Phenomena. Computer Communications Review 28 (1998) 5-29
15. Willinger, W., Paxson, V.: Where Mathematics Meets the Internet. Notices of the American
Mathematical Society 45 (1998) 961-970
16. Willinger, W., Paxson, V., Riedi, R. H., Taqqu, M. S.: Long-Range Dependence and Data
Network Traffic, Long-Range Dependence: Theory and Applications. P. Doukhan, G.
Oppenheim, and M. S. Taqqu, eds., Birkhauser (2002)
17. Abry, P., Baraniuk, R., Flandrin, P., Riedi, R., Veitch, D.:, Multiscale Nature of Network
Traffic. IEEE Signal Processing Magazine 19 (2002) 28-46
18. Nogueira, A., Salvador, P., Valadas, R.: Telecommunication Systems 24 (2003) 339362
19. Li, M.: Change Trend of Averaged Hurst Parameter of Traffic under DDOS Flood Attacks.
Computers & Security 25 (2006) 213-220
20. Tsybakov, B., Georganas, N. D.: On Self-Similar Traffic in ATM Queues: Definitions,
Overflow Probability Bound, and Cell Delay Distribution. IEEE/ACM T. Networking 5
(1997) 397-409
21. Kim, S., Nam, S. Y., Sung, D. K.: Effective Bandwidth for a Single Server Queueing
System with Fractional Brownian Input. Performance Evaluation 61 (2005) 203-223
22. Li, M., Lim, S. C.: Modeling Network Traffic Using Cauchy Correlation Model with
Long-Range Dependence. Modern Physics Letters B 19 (2005) 829-840
23. Li, M.: Modeling Autocorrelation Functions of Long-Range Dependent Teletraffic Series
Based on Optimal Approximation in Hilbert Space-a Further Study. Applied Mathematical
Modelling 31 (2007) 625-631
24. Mandelbrot, B. B.: Gaussian Self-Affinity and Fractals. Springer (2001)
25. Levy-Vehel, J., Lutton, E., Tricot C. (Eds).: Fractals in Engineering. Springer (1997)
26. Peltier, R. F., Levy-Vehel, J.: A New Method for Estimating the Parameter of Fractional
Brownian Motion. INRIA TR 2696 (1994)
27. Peltier, R. F., Levy-Vehel, J.: Multifractional Brownian Motion: Definition and
Preliminaries Results. INRIA TR 2645 (1995)
28. Muniandy, S. V., Lim, S. C.: On Some Possible Generalizations of Fractional Brownian
Motion. Physics Letters A226 (2000) 140-145
29. Muniandy, S. V., Lim, S. C., Murugan, R.: Inhomogeneous Scaling Behaviors in Malaysia
Foreign Currency Exchange Rates. Physica A301 (2001) 407-428


30. Muniandy, S. V., Lim, S. C.: Modelling of Locally Self-Similar Processes Using
Multifractional Brownian Motion of Riemann-Liouville Type. Phys. Rev. E 63 (2001)
046104
31. Benassi, A., Jaffard, S., Roux, D.: Elliptic Gaussian Random Processes. Revista
Mathematica Iberoamericana 13 (1997) 19-90
32. Lim S. C., Li, M.: Generalized Cauchy Process and Its Application to Relaxation
Phenomena. Journal of Physics A: Mathematical and General 39 (2006) 2935-2951
33. http://www.acm.org/sigcomm/ITA/
34. Floyd, S., Paxson, V.: Difficulties in Simulating the Internet. IEEE/ACM T. Networking 9
(2001) 392-403
35. Willinger, W., Govindan, R., Jamin, S., Paxson V., Shenker, S.: Scaling Phenomena in the
Internet: Critically Examining Criticality. Proc. Natl. Acad. Sci. USA 99 (Suppl. 1) (2002)
2573-2580

A Further Characterization on the Sampling Theorem for Wavelet Subspaces
Xiuzhen Li1 and Deyun Yang2
1 Department of Radiology, Taishan Medical University, Taian 271000, China
zhenlixiu@163.com
2 Department of Information Science and Technology, Taishan University, Taian 271000, China
nkuydy@163.com

Abstract. Sampling theory is one of the most powerful results in signal analysis. The
objective of sampling is to reconstruct a signal from its samples. Walter extended the
Shannon sampling theorem to wavelet subspaces. In this paper we give a further
characterization of some shift-invariant subspaces, especially the closed subspaces on
which the sampling theorem holds. For some shift-invariant subspaces with the sampling
property, the sampling functions are explicitly given.
Keywords: Sampling theorem, Wavelet subspace, Wavelet frame,
Shift-invariant subspace.

Introduction and Main Results

Sampling theory is one of the most powerful results in signal analysis. The objective of sampling is to reconstruct a signal from its samples. For example, the
classical Shannon theorem says that for each

f \in B_{1/2} = \{ f \in L^2(R) : \operatorname{supp} \hat{f} \subset [-\tfrac{1}{2}, \tfrac{1}{2}] \},

one has

f(x) = \sum_{k=-\infty}^{\infty} f(k)\, \frac{\sin \pi (x-k)}{\pi (x-k)},

where the convergence is both in L^2(R) and uniform on R, and the Fourier transform is
defined by \hat{f}(\omega) = \int f(x) e^{-2\pi i \omega x}\, dx.
If \varphi(x) = \frac{\sin \pi x}{\pi x}, then \{\varphi(\cdot - k)\}_{k \in Z} is an orthonormal basis for B_{1/2} and B_{1/2} is a
shift-invariant subspace in L^2(R).
The structure of finitely generated shift-invariant subspaces in L^2(R) is studied, e.g.,
in [1]-[6]. There are fruitful results in wavelet theory in the past 20 years (see [7]-[12]).
In [2], Janssen considered the shifted sampling and corresponding aliasing error by
means of the Zak transform. Walter [1] extended the Shannon sampling theorem to
wavelet subspaces. Zhou and Sun [13] characterized the general shifted wavelet
subspaces on which the sampling theorem holds:

This work was supported by the National Natural Science Foundation of China
(Grant No. 60572113).




Proposition 1 ([13]). Let V_0 be a closed subspace in L^2(R) and let \{\varphi(\cdot - n)\}_{n \in Z}
be a frame for V_0. Then the following two assertions are equivalent:

(i) \sum_k c_k \varphi(x - k) converges pointwise to a continuous function for any \{c_k\} \in
l^2 and there exists a frame \{S(\cdot - n)\}_{n \in Z} for V_0 such that

f(x) = \sum_k f(k) S(x - k) for any f \in V_0,

where the convergence is both in L^2(R) and uniform on R.

(ii) \varphi \in C(R), \sup_{x \in R} \sum_k |\varphi(x - k)|^2 < \infty and there exist two positive
constants A, B such that

A \chi_E(\omega) \le \Big| \sum_k \varphi(k) e^{-ik\omega} \Big| \le B \chi_E(\omega), a.e.,

where \chi is the characteristic function and E = \{\omega \in R : \sum_k |\hat{\varphi}(\omega + 2\pi k)|^2 > 0\}.

Moreover, it implies that for S in (i), f(n) = \langle f, \tilde{S}(\cdot - n) \rangle for any f \in V_0,
where \tilde{S} is defined as in the following Proposition 3.
In this paper we characterize some shift-invariant subspaces, especially the closed
subspaces in L2 (R) such that the sampling theorem holds.
Notations. Firstly, we discuss functions in L^2(R). Therefore f = g means that
f(\omega) = g(\omega) for almost every \omega \in R. C(R) is the space of continuous
functions. G_f(\omega) = \sum_k |\hat{f}(\omega + k)|^2; it is easy to see that G_f is defined only
a.e. E_f = \{\omega \in R : G_f(\omega) > 0\} for any f \in L^2(R). \chi_E is the characteristic
function of the set E. f^{*}(\omega) = \sum_n f(n) e^{-i 2\pi n \omega} for any f \in L^2(R) with
\sum_n |f(n)|^2 < \infty. V_f^{0} = \{g : g(\cdot) = \sum_n c_n f(\cdot - n), where the convergence is
in L^2(R) and \{c_n\}_{n \in Z} \in l^2\}. V_f = \overline{\mathrm{span}}\{f(\cdot - n)\}_n, which means that any g \in V_f
can be approximated arbitrarily well in norm by finite linear combinations of
vectors f(\cdot - n). Moreover, V_f is called the shift-invariant subspace generated by
f. Let \mu(\cdot) be the Lebesgue measure on R. For E, F \subset R, E \overset{a.e.}{=} F means that
\mu(E \setminus F) = \mu(F \setminus E) = 0.
For \varphi \in L^2(R), if \{\varphi(\cdot - n)\}_n is a frame (Riesz basis) for V_\varphi then \varphi is called
a frame (Riesz) function.
Definition 1. A closed subspace V in L^2(R) is called a sampling space if there
exists a frame \{\varphi(\cdot - k)\}_{k \in Z} for V such that \sum_k c_k \varphi(x - k) converges pointwise
to a continuous function for any \{c_k\} \in l^2 and

f(x) = \sum_k f(k) \varphi(x - k) for any f \in V,

where the convergence is both in L^2(R) and uniform on R. In this case, \varphi is
called a sampling function on V.
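For orientation, the space B_{1/2} of the Introduction with \varphi(x) = \sin(\pi x)/(\pi x) is the classical example of a sampling space in this sense, with \varphi itself as the sampling function. The short Python sketch below is only a numerical illustration of this (it truncates the sampling series to a finite range, and the test signal and grid are chosen arbitrarily).

```python
import numpy as np

# A signal in B_{1/2}: finite combinations of shifted sincs are bandlimited to [-1/2, 1/2]
def f(x):
    return 2.0 * np.sinc(x - 3.0) - 0.7 * np.sinc(x + 5.0)

k = np.arange(-200, 201)                      # truncated integer sampling grid
samples = f(k)

x = np.linspace(-10.0, 10.0, 2001)
# f(x) = sum_k f(k) * phi(x - k) with phi(x) = sin(pi x)/(pi x) = np.sinc(x)
recon = samples @ np.sinc(x[None, :] - k[:, None])

print(np.max(np.abs(recon - f(x))))           # small truncation error
```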


From the definition, we know that if V is a sampling space then for any f \in V
there exists a function g \in C(R) such that f(x) = g(x), a.e. x \in R. Therefore, in
what follows we assume that all the functions in a sampling space are continuous.
Next we denote the following three sets to characterize the functions with the
sampling property.
P_1 is the subset of L^2(R) \cap C(R) in which each function \varphi satisfies that
\sum_k c_k \varphi(x - k) converges pointwise to a continuous function for any \{c_k\} \in l^2.
P_2 is the set of functions in which each function f satisfies the following three
conditions:

(i) f \ne 0, f \in L^2(R) \cap C(R), \sum_k |f(k)|^2 < \infty and f^{*} is a bounded function;
(ii) \mu(\{\omega : f^{*}(\omega) = 0, G_f(\omega) \ne 0\}) = 0;
(iii) there exist two positive constants A, B such that

A \le \frac{G_f(\omega)}{|f^{*}(\omega)|^2} \le B, a.e. \omega \in E_f.

P_3 = \{\varphi \in P_2 : \sup_{x \in R} \sum_n |\varphi(x - n)| < \infty\}.

For any f \in P_2, it is easy to see that \mu(E_f) > 0 and f^{*}(\omega), G_f(\omega) are well
defined almost everywhere.
Next we list the following propositions to be used in the proof of our results.
Proposition 2 ([3],[5]). Let \varphi \in L^2(R). \varphi is a frame function if and only if
there are constants A, B > 0 such that A \chi_{E_\varphi} \le G_\varphi \le B \chi_{E_\varphi}, a.e.
Especially, if \varphi is a Riesz function, then E_\varphi = R.

Proposition 3 ([3],[13]). Assume that f \in L^2(R) and f is a frame function.
Let

\hat{\tilde{f}}(\omega) = \begin{cases} \hat{f}(\omega)/G_f(\omega), & \text{if } \omega \in E_f \\ 0, & \text{if } \omega \notin E_f \end{cases}

then \{\tilde{f}(\cdot - n)\}_n is a dual frame of \{f(\cdot - n)\}_n in V_f.

Proposition 4 ([13]). Let \varphi \in L^2(R). Then \varphi \in P_1 if and only if the following
holds:
(i) \varphi \in C(R),
(ii) \sum_{k \in Z} |\varphi(x - k)|^2 \le M for some constant M > 0.

Proposition 5 ([5]). V_f = \{g \in L^2(R) : \hat{g} = \tau \hat{f}, where \tau is a function with
period 1 and \tau \hat{f} \in L^2(R)\}.

Proposition 6 ([5]). Let g \in V_f. Then V_g = V_f if and only if \operatorname{supp} \hat{f} \overset{a.e.}{=} \operatorname{supp} \hat{g}.

Now we give the following theorem to check whether V_f is the maximum shift-invariant
subspace in L^2(R).


Theorem 7. Let f \in L^2(R). There exists g \in L^2(R) such that f \in V_g and
g \notin V_f if and only if there exists E \subset [0, 1] with \mu(E) > 0 such that \hat{f}(\omega) = 0
for any \omega \in \bigcup_{k \in Z}(k + E).

For a shift-invariant subspace V_f with the sampling property, how can we find its
sampling functions? For this, we give the following two theorems. Firstly, for any
f \in L^2(R), if f^{*} is well defined, then we define

\hat{f}_p(\omega) = \begin{cases} \hat{f}(\omega)/f^{*}(\omega), & \text{if } f^{*}(\omega) \ne 0, \\ 0, & \text{if } f^{*}(\omega) = 0. \end{cases}        (1)
Theorem 8. If f \in P_2, then \{f_p(\cdot - n)\}_n is a frame for V_f.

Theorem 9. (i) Let f^{*}(\omega) \ne 0, a.e. \omega \in R. If S \in L^2(R) is such that

f(x) = \sum_k f(k) S(x - k),

where the convergence is in L^2(R), then S = f_p.

(ii) For any f \in P_3, let

\hat{S}(\omega) = \begin{cases} \hat{f}(\omega)/f^{*}(\omega), & \text{if } f^{*}(\omega) \ne 0, \\ R(\omega), & \text{if } f^{*}(\omega) = 0, \end{cases}

where R \in L^2(R). Then for any g \in V_f^{0}, g(x) = \sum_n g(n) S(x - n).
Finally, we give the following equivalent characterization on shift-invariant subspaces with sampling property.
Theorem 10. Assume that V is a shift-invariant subspace. Then the following
assertions are equivalent:

(i) V is a sampling space.
(ii) For any \varphi \in C(R) such that \{\varphi(\cdot - n)\}_n is a frame for V, we have

\sup_x \sum_n |\varphi(x - n)|^2 < \infty,        (2)

and there exist positive constants A, B such that

A \chi_{E_\varphi}(\omega) \le |\varphi^{*}(\omega)| \le B \chi_{E_\varphi}(\omega), a.e.        (3)

(iii) There exists \varphi \in C(R) such that \{\varphi(\cdot - n)\}_n is a frame for V and (2),
(3) hold.
(iv) There exists \varphi \in C(R) such that \{\varphi(\cdot - n)\}_n is a frame for V, (2)
holds and

A \|g\|^2 \le \sum_n |g(n)|^2 \le B \|g\|^2, \quad g \in V,        (4)

for some positive constants A, B.
(v) There exists \varphi \in C(R) such that \{\varphi(\cdot - n)\}_n is a frame for V, (2) holds
and \{\sum_l \varphi(k - l)\tilde{\varphi}(\cdot - l)\}_k is a frame for V.


Proof of Main Results

Proof of Theorem 7. [necessary] Assume that g L2 (R), f Vg and g


/ Vf .
By Proposition 5 and Proposition 6, f = g where is a a function with period 1
 > 0. Then there exists E R with (E ) > 0 such that
and (supp
g \ suppf)
g() = 0, f() = 0 for any E . Thus () = 0 for any E . Let
T (E ) = { : [0, 1], there exists k Z such that + k E }.

Then () = 0 for any T (E ). Since E kZ (k + T (E )) and (E ) > 0,
we have (T (E )) > 0. Therefore, if we take E = T (E ), then E satises the
conditions in the theorem.
[suf f iciency] Assume that f L2 (R) and there exists E [0, 1] with

(E) > 0 such that f() = 0 for any kZ k + E. Let

g() =

f(), if
/ E,
1, if E.

Then f() = ()


g () for any R, where


1, if
/ kZ (k + E),
() =
0, if kZ (k + E)
 > 0. Thus we have f Vg
is a function with period 1 and (supp
g \ suppf)
and g
/ Vf .
This completes the proof of Theorem 7.
Proof of Theorem 8. By (1) and f P2 , we have fp () = ()f(), where

1
, if Ef
() = f ()
0, if
/ Ef .

Then
Gfp () =

Gf ()

|f ()|
0,

, if Ef
if
/ Ef .

Since f P2 , there exist positive constants A, B such that AEf Gfp BEf .
However, is a function with period 1. By Proposition 5, fp Vf . It is from
suppf = suppfp and Proposition 6 that Vf = Vfp . Finally, by Proposition 2, fp
is a frame for Vf .
This completes the proof of Theorem 8.

Proof of Theorem 9. [(i)] Since f (x) = k f (k)S(x k) and f () = 0, we


have f() = f ()S(),
S()
= ff()
. Thus S = fp .
()

0
[(ii)] For any g Vf , there exist {ck } l2 such that g() = k ck f ( k),

where the convergence is pointwise and g C(R). Let C() = k ck ei2k ,


C () = C()Ef (). Then there exists {ck } l2 such that C () =


 i2k
. It follows from
k ck e
g() = C()f() = C ()f()


and Proposition 4 that g(x) = k ck f (x k) converges pointwisely in R. By
Theorem 8, fp is a frame for Vf . Then using Proposition 3, fp is a dual frame
of {f ( n)}nZ in Vf . Thus
(g, fp ( n)) =

C ()

Ef

 1/2
f ()   2 i2n
f
()
e
d
=
C ()f ()ei2n d


Gf ()
1/2

ck f (n k) = g(n).

Thus we get g(x) =

g(n)fp (x n). Let 


h() = R(){:f ()=0} . Then


S()
= fp () + 
h(), S = fp + h.
Since g(n) =
Then

k) for n Z, we have g () =

g ()
h() = 0,
g(n)h(x n) = 0.

k ck f (n

i2k

f ().
k ck e

Therefore
g(x) =

g(n)(fp (x n) + h(x n)) =

g(n)S(x n).

This completes the proof of Theorem 9.


For the proof of Theorem 10, we give the following Lemma.
Lemma 1. Assume that P1 is a frame function. If f V saties f() =

b()(),
where b() is a function with period 1 and bounded on E , then f P1 .
Specially, for any frame function V , we have P1 .

Proof. Assume that f V satises f() = b()(),
where b() is a function
with period 1 and bounded on E . Let B() = b()E ().
Since B() is
bounded on [ 21 , 12 ], there exists {Bn } l2 such that B() = n Bn ei2n .
Since P1 and


f() = b()()
= B()(),

we have f C(R) and f (x) = n Bn (x n), where the convergence is both
in L2 (R) and pointwisely.

2
Now using P1 and Proposition 4, we have supxR n |(x n)| < .
Hence

2


 

2
sup
|f (x k)| = sup
Bn (x k n)



x
x
n
k



2




2
= sup
|B()| 
(x k)ei2k  d


x
1/2
k

2
2
sup B()
|(x k)| < .


1/2

(5)

Then by f C(R), (5) and Proposition 4, we get f P1 .


For any frame function V , there exists a function with period 1 such


that ()
= ()(),
then G = | ()|2 G (). By Proposition 2, is bounded
on E . Thus P1 .
This completes the proof of Lemma 1.
Proof of Theorem of 10. (i) (ii) Given any continuous function such
that {( n)}n is a frame for sampling space V . Let be a sampling function

for V . By Proposition 4, supxR n |(x n)|2 < .



Let ()
= b()(),
where b() = k bk ei2k for some {ck } l2 . Then
2
G () = |b()| G (). By Proposition 2, b() is bounded on E . By Lemma 1,
P1 . Then by Proposition 1, we get (ii).
(ii) (iii) It is trivial.
(iii)
 (i) By Proposition 1, there exists a frame {( n)}n for V such that
f (x) = n f (n)(x n) for any f V . By Lemma 1 and Proposition 4, P1 .
Thus we get (i).
(iv) (i) For any n Z, T g = g(n) is a bounded linear functional on V .
Then there exists Sn V such that
g(n) = g, Sn for any n Z, g V.
Let S := S0 and g1 (x) = g(x + 1), for any g V, x R. Then
g1 , S = g1 (0) = g(1) = g, S1 .
Thus for any g V , we have



g(x)S(x 1)dx = g(x + 1)S(x)dx = g1 (x)S(x)dx = g1 (0) = g(1)

= g, S1 = g(x)S1 (x)dx.
Thus S1 (x) = S(x1). Similarly, we have Sn (x) = S(xn) for any n Z, x R.
Therefore
g(n) = g, S( n) for any g V.
Now by (4), S is a frame for V . Let S be dened as Proposition 3. Then
by Lemma 1 and Proposition 4, S is the sampling function for V . Therefore
we get (i).


 n) for any f V0 ,
(i) (iv) Note that in Proposition 1, f (n) = f, S(
 n)}n is a frame for V0 . Thus it is just from Proposition 1.
where {S(


(iv) (v) Assume that there exists a continuous function which is a



2
frame function for V such that supx n |(x n)| < . Then by Lemma 1, we

 l) convergences both in L2 (R)
have  P1 . Therefore qx () := l (x l)(
and uniformly on R.
Now for any g V ,


 k) (x k) for any x R.
g, qx =
g, (
(6)
k

The last series converges both in L2 (R) to g and uniformly to g, qx . Thus


g(x) = g, qx for any x R.

(7)

If (iv) holds, then there exist two positive constants A, B such that

A g 2
| g, qk |2 B g 2 for any g V.
k

 )}k is a frame for V . We get (v).


Thus {qk ()}k = { l (k l)(k
If (v) holds, then {qk ()}k is a frame for V . By (6) and (7), we get (iv).
This completes the proof of Theorem 10.

References
1. Walter,G.: A sampling theorem for wavelet subspaces. IEEE Trans. Inform. Theory
38 (1992) 881884
2. Janssen, A.J.E.M.: The Zak transform and sampling theorems for wavelet subspaces. J. Fourier Anal. Appl. 2 (1993) 315327
3. Boor, C., Devore, R., Ron, A.: The structure of nitely generated shift-invariant
subspaces in L2 (R). J. Func. Anal. 119 (1994) 3778
4. Boor, C., Devore, R., Ron, A.: Approximation from shift-invariant subspaces of
L2 (Rd ). Trans. Amer. Math, Soc. 341 (1994) 787806
5. Benedetto, J.J., Walnut, D.F.: Gabor frames for L2 and related spaces. In:
Benedetto, J.J., Frazier, M. W. (eds.): Wavelets: Mathematics and Applications.
CRC Press, Boca Raton (1993) 136
6. Ron, A., Shen, Z.W.: Frames and stable bases for shift-invariant subspaces of
L2 (Rd ). Can. J. Math. Vol. 5 (1995) 1051-1094
7. Daubechies, I.: Ten lectures on wavelets, Philadalphia, SIAM (1992)
8. Christensen, O.: An Introduction to Frames and Riesz Bases. Birkhauser, Boston (2003)
9. Chui, C., Shi, X.: Orthonormal wavelets and tight frames with arbitrary real dilations, Appl. Comp. Harmonic Anal. 9 (2000) 243264
10. Yang, D., Zhou, X.: Irregular wavelet frames on L2 (Rn ), Science in China Ser. A.
Math. 48 (2005) 277287
11. Yang, D., Zhou, X.: Wavelet frames with irregular matrix dilations and their stability, J. Math. Anal. Appl. 295 (2004) 97106
12. Yang, D., Zhou, X.: Frame wavelets with matrix dilations in L2 (Rn ), Appl. Math.
Letters 17 (2004) 631639
13. Zhou, X.W., Sun, W.C: On the sampling theorem for wavelet subspace. J. Fourier
Anal. Appl. 1 (1999) 347354

Characterization on Irregular Tight Wavelet Frames with Matrix Dilations

Deyun Yang1,2, Zhengliang Huan1, Zhanjie Song3, and Hongxiang Yang1

1 Department of Information Science and Technology, Taishan University, Taian 271000, China
nkuydy@163.com
2 School of Control Science and Engineering, Shandong University, Jinan 250061, China
3 School of Science, Tianjin University, Tianjin 300072, China

Abstract. There are many results in one-dimensional wavelet frame theory in recent years. However, since there are some essential differences in the high-dimensional cases, the classical methods for one-dimensional regular wavelet frames are unsuitable for these cases. In this paper, under some conditions on matrix-dilated sequences, a characterization formula for irregular tight frames of matrix-dilated wavelets is developed. It is based on the regular formulation by Chui, Czaja, Maggioni, and Weiss, and on the recent multivariate results by Yang and Zhou.
Keywords: Irregular frame, Tight wavelet frame, Matrix dilations,
Bessel sequences.

Characterization on Irregular Tight Wavelet Frames with Matrix-Dilations

Assume \psi, \phi \in L^2(R^n), \{A_j\}_{j \in Z} is a real n \times n matrix sequence, and B is a real
n \times n nonsingular matrix. A^{\sharp} is the transpose of A^{-1}. Let

\Psi = \{\psi_{j,k} : j \in Z, k \in Z^n\}, \qquad \Phi = \{\phi_{j,k} : j \in Z, k \in Z^n\},        (1)

where \psi_{j,k}(x) = |\det A_j|^{1/2} \psi(A_j x - Bk). If \{\psi_{j,k} : j \in Z, k \in Z^n\} is a tight
frame for L^2(R^n), then \psi is called a tight frame function with respect to the dilation
sequence \{A_j\}_{j \in Z}. There are some results in high-dimensional wavelet frame
theory in recent years (see [1]-[7]). Let

P(f, g) = \sum_{j \in Z, k \in Z^n} \langle f, \psi_{j,k} \rangle \langle \phi_{j,k}, g \rangle, \qquad f, g \in L^2(R^n).

Now we consider the following subset of L^2(R^n):

B = \{f \in L^2(R^n) : \hat{f} \in L^{\infty}(R^n), \ \hat{f} \text{ has compact support}\}.

The following lemma comes from [2].


This work was supported by the National Natural Science Foundation of China
(Grant No. 60572113).




Lemma 1. Two families \{e_\alpha : \alpha \in A\} and \{\tilde{e}_\alpha : \alpha \in A\} constitute a dual pair
if and only if they are Bessel sequences and satisfy

P(f, g) := \sum_{\alpha \in A} \langle f, e_\alpha \rangle \langle \tilde{e}_\alpha, g \rangle = \langle f, g \rangle,

for all f, g in a dense subset of H.


Given \{A_j\}_{j \in Z} and B, we need the following notations:

\Lambda = \{\alpha \in R^n : \alpha = A_j^{\sharp -1} m, \ (j, m) \in Z \times B^{\sharp}(Z^n)\},

I(\alpha) = \{(j, m) \in Z \times B^{\sharp}(Z^n) : \alpha = A_j^{\sharp -1} m\}.

Assume that \{A_j\}_{j \in Z} satisfies the following conditions: there exist \lambda \in (0, 1),
q \in Z^{+} such that

\|A_j A_{j+1}^{-1}\| \le 1, \quad \|A_j A_{j+q}^{-1}\| \le \lambda \quad \text{for any } j \in Z,        (2)

\|A_{j+1}^{-1} A_j\| \le 1, \quad \|A_{j+q}^{-1} A_j\| \le \lambda \quad \text{for any } j \in Z.        (3)

Then we have
Theorem 2. Suppose that \psi, \phi \in L^2(R^n) have the property that the following
two functions are locally integrable:

\sum_{j \in Z} |\hat{\psi}(A_j^{\sharp} \xi)|^2, \qquad \sum_{j \in Z} |\hat{\phi}(A_j^{\sharp} \xi)|^2.

Then for f, g \in B, P(f, g) converges absolutely. Moreover,

\frac{1}{|\det B|} \sum_{(j,m) \in I(\alpha)} \hat{\psi}(A_j^{\sharp} \xi)\, \overline{\hat{\phi}(A_j^{\sharp}(\xi + A_j^{\sharp -1} m))} = \delta_{\alpha,0},

for a.e. \xi \in R^n and for all \alpha \in \Lambda, if and only if P(f, g) = \langle f, g \rangle for all f, g \in B.
By Theorem 2 and Lemma 1, the following two theorems are the immediate
consequences.
Theorem 3. Assume that \psi, \phi \in L^2(R^n) are such that \Psi and \Phi are two Bessel
sequences, and that

\sum_{j \in Z} |\hat{\psi}(A_j^{\sharp} \xi)|^2, \qquad \sum_{j \in Z} |\hat{\phi}(A_j^{\sharp} \xi)|^2

are locally integrable. Then \Psi and \Phi constitute a pair of dual frames for L^2(R^n)
if and only if

\frac{1}{|\det B|} \sum_{(j,m) \in I(\alpha)} \hat{\psi}(A_j^{\sharp} \xi)\, \overline{\hat{\phi}(A_j^{\sharp}(\xi + A_j^{\sharp -1} m))} = \delta_{\alpha,0},

for a.e. \xi \in R^n and for all \alpha \in \Lambda.


Theorem 4. \psi \in L^2(R^n) satisfies

\frac{1}{|\det B|} \sum_{(j,m) \in I(\alpha)} \hat{\psi}(A_j^{\sharp} \xi)\, \overline{\hat{\psi}(A_j^{\sharp}(\xi + A_j^{\sharp -1} m))} = \delta_{\alpha,0},

for a.e. \xi \in R^n and for all \alpha \in \Lambda, if and only if \{\psi_{j,k}\}_{j,k} is a tight frame with
constant 1 for L^2(R^n).

Proof of Main Results

In fact, we only give the proof of Theorem 2.


Proof of Theorem 2: We rst prove that P (f, g) is absolutely convergent.
Now, let



Gj :=
f, j,k  j,k , f , j Z.
kZn

Using the Parseval identity, it is easy to show


1


 j + B s))d.
Gj =
f()(A
f( + A1
s)(A
j )(
j B
|det B| Rn
n
sZ


Thus, we would like to show that jZ Gj is absolutely convergent. To do so,
it is enough to show that the following two series are absolutely convergent:



 
I :=
f()(A
j )f ()(Aj )d,
Rn

jZ

and


f()(A
j )(

II :=
Rn

 j + B s))d.
f( + A1
s)(A
j B

sZn \{0}

Since



2 
2
 




 j + B s) 1 ((A
s) ),
(Aj )(A
j ) + (Aj + B
2

(4)

f that I is absolutely convergent.


It follows from (4) and the conditions on , ,

On the other hand, for h {, },



2

 

 
s) h(Aj ) d
f ()f ( + A1
j B
jZ,sZn \{0}

=
Rn

Rn

jZ,sZn \{0}


 
2
1
  1  1
 

h() d. (5)
f (Aj )f (Aj ( + B s)) 
|det Aj |

Since f B, it follows from (2) and (3) that there exists constant C0 > 0 such
that for each j Z, Rn , the number of s Zn \ {0} satisfying

1
 1
f(A1
s) = 0
j )f (Aj + Aj B

is less than C0 |det Aj |. Then



jZ,sZn \{0}




1
  1  1

F (A1
f (Aj )f (Aj ( + B s)) C
j ), (6)
|det Aj |
jZ

 2
 
where C = C0 f and F is compact in Rn \ {0}. By (2) and (3), {Aj }jZ is

an MFS (see [4]). Therefore there exists constant K > 0 such that

n
F (A1
j ) K for any R \ {0}.

(7)

jZ

Now, it follows from (5), (6) and (7) that the series II is convergent. Hence, we
can rearrange the series for P (f, f ), to obtain
1






P (f, f ) =
f()f( + )
(A
j )(Aj ( + )) d.
|det B|
n
R
(j,m)I()

Then using the polarization identity for the form P , we get the suciency of
the theorem.
Next, we prove the necessary condition. From the above discussion, we
have, let
P (f, g) = M (f, g) + R(f, g),
where

1
M (f, g) :=
|det B|

and
R(f, g) :=

1
|det B|


\{0}





g()f()
(A
j )(Aj ) d,

Rn

jZ

g()f( + )

Rn



(A
j )(Aj ( + ))d.

(j,m)I()

Now, x 0 Rn \ {0} and let


f1 () =

1
+H (),
(Hk )1/2 0 k

where () is the Lebesgue measure on Rn and


n
Hk = A1
k , = { R : || = 1}.

Then
M (f1 , f1 ) =

1
|det B| (Hk )


j )(A
 j )d
(A

0 +Hk jZ


and
|R(f1 , f1 )|

1
|det B| (Hk )



 
 j ( +)) d
(Aj )(A

\{0}(j,m)I() (0 +Hk )(+0 +Hk )

|det B| (Hk )

\{0} (j,m)I()

\{0} (j,m)I()


2 1/2
 

(Aj ) d

(0 +Hk )(+0 +Hk )


2 1/2


.
(Aj ( + )) d

(0 +Hk )(+0 +Hk )

If (0 + Hk ) (0 + + Hk ) = , then A1
k ( ). Thus for (j, m) I(),

m (Aj A1
(Zn ) = Qj,k .
k ( )) B

Using (2) and (3), there exists a constant c > 0 such that j k + c. However,
|R(f1 , f1 )|


2 1/2


1
 

(Aj ) d
|det B| (Hk )

+H
0
k
jk+c mQj,k \{0}


2 1/2




.
(Aj ()) d

(8)

0 +Hk

jk+c mQj,k \{0}

For the rst factor,



1
(Hk )

jk+c mQj,k \{0}

|det Ak |

jk+c


2
 

(Aj ) d

0 +Hk



1

C det(Aj A1
k ) |det Aj |

jk+c

=C



  2
() d

Aj (0 +Hk )



  2
() d,

(9)

Aj (0 +Hk )

Here, we have used the fact #(Qj,k \ {0}) C(Aj A1


k ( )), which can be
obtained by (2), (3).
Similarly, we can estimate the second factor. Now, using (2) and (3), it is easy
to prove that:
(P1 ) there exist k1 , k2 Z such that the intersection of any k2 sets in {Aj (0 +
Hk )}kk1 ,jk+c is empty.
(P2 ) there exist constants k3 Z, > 1 such that for any k k3 , j k + c,
Aj (0 + Hk ) { : || k3 j |0 |}.


Using (8), (9), (P1 ) and (P2 ), we can get |R(f1 , f1 )| 0 when k . Then,
by Lebesgue theorem, we have


1
j )(A
 j )d
1 = lim
(A
k |det B| (Hk ) 0 +H
k
jZ

1  
 j 0 ),
=
(Aj 0 )(A
|det B|
jZ

which proves our claim for = 0. This also shows that


M (f, g) = f, g ,
for any f, g B.
To complete the proof of our theorem, choose 0 \ {0}, and write
R(f, g) = R1 (f, g) + R2 (f, g),
where
R1 (f, g) =

1
|det B|

1
R2 (f, g) =
|det B|

g()f( + 0 )

Rn



(A
j )(Aj ( + 0 ))d.

(j,m)I(0 )


\{0,0 }

g()f(+)

Rn



(A
j )(Aj ( + ))d.

(j,m)I()


2



Next, let 0 Rn \ {0} be any Lebesgue point of functions jZ (A
j ) and

2

 

(A
)

 . For xed k Z, we dene f2 , g2 as follows:
j
jZ
f2 ( + 0 ) =

1
1
0 +Hk (), g2 () =
+H ().
1/2
(Hk )
(Hk )1/2 0 k

Then, using Lebesgue Theorem, we have


lim R1 (f2 , g2 ) =

1
|det B|



(A
j 0 )(Aj (0 + 0 )).

(j,m)I(0 )

To estimate R2 (f2 , g2 ), we note that if g2 ()f2 ( + ) = 0, then


0 + A1
k ( ).
Since = A1
j m \ {0, 0 }, it follows from (2), (3) that there exist J0 Z
such that for any j J0 , m B (Zn ) \ {0}, A1
/ 0 + Dk , where
j m
Dk = A1
k ( ).


Hence R2 (f2 , g2 ) can be rearranged as


R2 (f2 , g2 )

1
=
|det B|

jJ1 m(Aj 0 +Aj Dk )\{0}

J1

1
+
|det B|



g2 ()f2 ( + )(A
j )(Aj ( + ))d

Rn



g2 ()f2 ( + )(A
j )(Aj ( + ))d

Rn

j=J0 m(Aj 0 +Aj Dk )\{0}

= R2,1 (f2 , g2 ) + R2,2 (f2 , g2 ),


where J1 Z. Using (2) and (3), when k is large enough, for each j(J0 j J1 )
the number of m satisfying m (Aj 0 +Aj Dk ) is a constant which is not related
on k. Thus, by Lebesgue theorem, we have limk R2,2 (f2 , g2 ) = 0.
To estimate R2,1 (f2 , g2 ), we would like to prove that for each > 0 and k
which is large enough, there exists J1 Z such that R2,1 (f2 , g2 ) .
In fact, similar to R(f1 , f1 ), we have
R2,1 (f2 , g2 )


1
|det B| |Hk |

jJ1 m(Aj 0 +Aj Dk )\{0}


2 1/2
 

(Aj ) d

0 +Hk


2 1/2


.
(Aj ) d

0 +Hk

jJ1 m(Aj 0 +Aj Dk )\{0}

Therefore it is enough to estimate just one of these factors.


In fact, x any k which is large enough. Using the conditions (2) and (3),
there exists a constant C such that for any j J1 ,


# (Aj 0 + Aj Dk ) B T (Zn ) 1 + C |det Aj | |det Ak |1 ,
where #() is the number of elements in a given set. Then
1 
|Hk |

jJ1 m(Aj 0 +Aj Dk )\{0}


2
 

(Aj ) d

0 +Hk


2
1 
 

1
(1 + C |det Aj | |det Ak | )
(Aj ) d
|Hk |
0 +Hk
jJ1


2


 1


 

  2
=
(Aj ) d +
() d.
|Hk | 0 +Hk
Aj (0 +Hk )

jJ1

jJ1

Let J1 Z such that


lim


2


(A

)

j 0  < . Then
jJ1


jJ1

1
(Hk )

0 +Hk


2
 

(Aj ) d < /2.


By (P1 ), (P2 ) and Lebesgue theorem, we have






  2
lim
() d = 0.
k

jJ1

Aj (0 +Hk )

Thus for any > 0, there exists J1 such that limk |R2,1 (f2 , g2 )| . Finally,
we obtain

1


(A
j 0 )(Aj (0 + 0 )) = 0, for any 0 \ {0}.
|det B|
(j,m)I(0 )

We complete the proof of Theorem 2.

Remarks

In the proof of Theorem 2, we have used the following result in some places.
Remark 3.1. If (2) and (3) holds, then there exist constants C, q Z+ and
(0, 1) such that
(i) for any p, j Z+ , j pq and Rn , |Aj | C( 1 )p ||.
(ii) for any p, j Z+ , j pq and Rn ,|Aj | C p ||.
(iii) for any j, k Z, j < k and Rn , we have |Ak | C( 1 )

kj
q

|Aj |.

j
Remark 3.2. Assume that A is an expand matrix, S = \1
j= (A ), where
n
is a bounded measurable subset in R , supx,y x y < 1 and the origin is
an interior point of . Let  = S . By Theorem 4, is a tight frame function
with respect to dilation sequence {Aj }jZ . If Di (i = 1, , p) are nonsingular
matrices which is commutative with A, and Asp+i = As+1 Di for any s Z, i =
1, , p, then using Theorem 4 again, it is easy to show that is also a tight
frame function with respect to dilation sequence {Aj }jZ .

References
1. Hernandez, E., Weiss, G.: A First Course on Wavelets. CRC Press, Boca Raton (1996)
2. Frazier, M., Garrigos, G., Wang, K., Weiss, G.: A characterization of functions that
generate wavelet and related expansion. J. Fourier Anal. Appl. 3 (1997) 883906
3. Frazier, M., Jawerth, B., Weiss, G.: Littlewood-Paley theory and the study of function spaces. CBMS Regional Conference Series in Mathematics, 79, AMS, Providence, R1 (1991)
4. Yang, D., Zhou, X.: Wavelet frames with irregular matrix dilations and their stability. J. Math. Anal. Appl. 295 (2004) 97106
5. Yang, D., Zhou, X.: Irregular wavelet frames on L2 (Rn ). Science in China Ser. A
Mathematics 2 (2005) 277287
6. Yang, D., Zhou, X.: Frame wavelets with matrix dilations in L2 (Rn ). Appl. Math.
Letters 17 (2004) 631639
7. Yang, X., Zhou, X.: An extension of Chui-Shi frame condition to nonuniform ane
operations. Appl. Comput. Harmon. Anal. 16 (2004) 148157

Feature Extraction of Seal Imprint Based on the Double-Density Dual-Tree DWT
Li Runwu1, Fang Zhijun1, Wang Shengqian2, and Yang Shouyuan1
1

School of Information Technology, Jiangxi University of Finance & Economics,


Nanchang, China, 330013
lrw2008@gmail.com,fangzhijun@21cn.com, yshouy@sina.com
2
Jiangxi Science & Technology Teacher College, Nanchang, China, 330013
sqwang113@yahoo.com

Abstract. The most important problem in seal imprint verification is to extract
the imprint feature, which should be independent of varying conditions. This paper
proposes a new method to extract the feature of a seal imprint using the double-density
dual-tree DWT, due to its good directional selectivity, approximate shift
invariance and computational efficiency. Information in 16 different directions
is obtained as a seal imprint image is transformed by the double-density dual-tree
DWT. Experimental results show that the directional behaviors of true and false seals
are much different, although their frequency distributions are similar. This method is
stable and computationally efficient for seal imprints.
Keywords: Seal imprint verification, the double-density dual-tree DWT,
feature extraction.

1 Introduction
Seal imprints have been commonly used for personal confirmation in Oriental
countries, and seal imprint verification is the sticking point of document validation.
Therefore, it is highly desirable that large numbers of seal imprints be verified
automatically, speedily and reliably. However, seal imprint verification is a very
difficult problem [1]. Its difficulty comes from two aspects [2]:
(1) The various stamping conditions may affect the quality of the seal imprint
images;
(2) The forgery seal imprint may be very similar to the original seal imprint.
Therefore, it is important to find separate and efficient features of seal imprint.
Over the past years, many studies have been focused on extracting the impression
feature in frequency domain. Gabor filters [3] have been used in texture analysis due
to their good directional selectivity at different frequency scales. Hatipoglu et al. [4]
proposed a new feature extraction method utilizing the dual-tree complex wavelet
transform (DT-CWT).
In the conventional DWT domain, it is difficult to achieve perfect reconstruction
and equal frequency responses. In addition, directional selectivity is poor in DWT


domain. In 2004, Selesnick [5] proposed the double-density dual-tree DWT, which
possesses the properties of the double-density DWT as well as those of the DT-CWT.
In this paper, the double-density dual-tree DWT is employed for decomposing a
seal imprint image into the bandpass sub-images that are strongly oriented at 16
different angles. As higher directional selectivity is obtained, the dominant frequency
channel and orientation of the pattern are detected with a higher precision [5].
This paper is organized as follows. In Section 2, the double-density dual-tree DWT,
its dual-tree filters, and its properties are introduced. The projection operators of
bandpass subimages in the double-density dual-tree DWT are presented in Section 3.
The experimental results are shown in Section 4, followed by the conclusions drawn in
Section 5.

2 The Double-Density Dual-Tree DWT Theory


2.1 The Dual-Tree CWT
In order to have directional selectivity with Mallat's efficient separable filtering, it is
necessary to use complex-coefficient filters. Filters are chosen to be linear phase so
that odd-length highpass filters have even symmetry and the even-length highpass
filters have odd symmetry about their midpoints [7]. Then, the DT CWT comprises
two trees of real filters, A and B, which produce the real and imaginary parts of the
complex coefficients. Fig.1 shows the complete 2-D DT CWT structure over 2
levels.
Each level of the trees produces 6 complex-valued bandpass subimages {D(n,m) ,
n=1,,6}(where m represents the scale.) as well as two lowpass subimages A(1,m)
and A(2,m) on which sub-sequent stages iterate [4]. {D(n,m) , n=1,,6} are strongly
oriented at 15, 45, 75, -15, -45, -75 degrees.
The results of two-level decomposition of the DT CWT are shown in Fig. 2. It is
seen that image subbands from six orientations are obtained as shown above, while
the seal impression image at each level of decomposition contains two parts: the real
parts and the imaginary parts, so there are 8 image subbands in each tree at each level.
Therefore, there are 12 (6×2) image highpass subbands at each level, each of which
is strongly oriented at a distinct angle.
2.2 The Double-Density Dual-Tree DWT
The design of dual-tree filters is addressed in [6], through an approximate Hilbert pair
formulation for the dual wavelets. Selesnick [5] also proposed the double-density
DWT and combined both frame approaches. The double-density complex DWT is
based on two scaling functions and four distinct wavelets, each of which is
specifically designed such that the two wavelets of the first pair are offset from one
other by one half, and the other pair of wavelets form an approximate Hilbert
transform pair [4].


The structure of the filter banks corresponding to the double-density complex


DWT consists of two oversampled iterated filter banks operating in parallel on the
same input data. The double-density DT CWT is shown in Fig.3 [6]. We see that
each tree produces 9 subbands, 8 (hi1 hi2 hi3 hi4 hi5 hi6 hi7 hi8)
of which are strongly oriented at 8 different angles. Then, 16 (8×2) image
subbands (Tree A and Tree B) correspond to 16 different angles at each
level.
Fig.4 is the second-level decomposition results of the double-density dual-tree
DWT.

Fig. 1. 2-D CWT

(a) The real parts of coefficients


(Subimages of each orientation)

(b) The imaginary parts of coefficients


(Subimages of each orientation)

Fig. 2. Second-level decomposition subbands of the seal imprint image

Here we obtain image subbands from 16 orientations, as in the previous case, but the
seal imprint image at each level of decomposition contains two parts (the real parts
and the imaginary parts), each of which comprises 18 sub-images at each level.


Fig. 3. 2-D Double-density Dual-tree DWT

(a) The real parts of coefficients

(b) The imaginary parts of coefficients

Fig. 4. Second-level decomposition subbands in the double-density dual-tree DWT


Therefore, there are 36 sub-images at each level, containing 32 high-frequency
subbands which correspond to 16 different angles at each level. Since
higher directional selectivity is obtained, the dominant frequency channel and
orientation of the pattern are detected with a higher precision.

3 The Projections of High-Frequency Subbands


The identification of seal imprints pays more attention to detail information because the
overall profiles of seal imprints are very similar. We propose a method utilizing the
shift-invariance and greater directional selectivity of the double-density
dual-tree DWT. To reduce the computational complexity and obtain the most effective
features, the feature extraction of the seal imprint utilizes the projections of the
high-frequency subbands in the double-density dual-tree DWT. As an example, the
horizontal-direction projections of the subbands of six different angles are demonstrated
in Fig. 5.
Fig. 5. The horizontal projected vectors of high-frequency subbands



Fig. 6. Feature extraction of projected vectors

To reduce the computational complexity, we search for the three maximum and three
minimum values in the projected vectors of the high-frequency subbands; their
locations are regarded as the features of the seal imprint. The aforementioned process
is shown in Fig. 6. There are 32 high-frequency subbands, which yield 6×32 feature
values of the seal imprint at each level. The detailed process of feature extraction is
illustrated in Fig. 7. As shown in Fig. 4, we can obtain information in 16 different
directions of the seal imprint image by the vertical/horizontal projection of the 32
high-frequency subbands; a code sketch of this procedure is given after Fig. 7.

Fig. 7. Projections of high-frequency subbands and the feature extraction
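The following Python sketch outlines the projection-based feature extraction and the match-ratio comparison described above; it assumes the 32 high-frequency subbands are already available as 2-D arrays from some double-density dual-tree DWT implementation, and the helper names, the tolerance in the match ratio and the use of argsort are our illustrative choices, not details taken from the paper.

```python
import numpy as np

def subband_features(subband, axis=0, n_extrema=3):
    """Project one high-frequency subband along one direction and return the
    locations of its three maxima and three minima (6 values, cf. Fig. 6)."""
    proj = subband.sum(axis=axis)                        # projected vector
    order = np.argsort(proj)
    return np.concatenate([np.sort(order[-n_extrema:]),  # locations of 3 maxima
                           np.sort(order[:n_extrema])])  # locations of 3 minima

def seal_features(subbands):
    """6 x 32 feature values from the 32 high-frequency subbands of one level."""
    return np.stack([subband_features(sb) for sb in subbands])

def match_ratio(feat_a, feat_b, tol=2):
    """Fraction of feature locations that agree within `tol` samples."""
    return float(np.mean(np.abs(feat_a - feat_b) <= tol))

def verify(feat_test, feat_registered, threshold=0.46):
    """Accept the test seal if the match ratio reaches the group threshold."""
    return match_ratio(feat_test, feat_registered) >= threshold
```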

4 Experiment Result
In our experiment, the feature extraction of the seal imprint is performed in the second
level of the double-density dual-tree DWT. The seal imprint images are of size 256×256.
The following images are the three true seal imprints and their corresponding false seal
imprints.
In Fig. 8, we see that the false seals are very similar to the given true seals. The
performance of this method has been investigated using a database containing
100×3 seal imprint images (the three groups are A, B and C respectively, and each contains
50 true seals and 50 false seals), and promising results have been obtained. The
match ratio of the features is used for the classification algorithm. To reduce the FAR
(False Acceptance Rate), we set a threshold of a 46% matching ratio for the A-Group and
B-Group, and a 50% matching ratio for the C-Group. The experimental results are shown
in Table 1.

(a0) true seal imprint    (b0) true seal imprint    (c0) true seal imprint
(a1) false seal imprint   (b1) false seal imprint   (c1) false seal imprint

Fig. 8. Seal verification of three groups


Table 1. Results of seal imprint verification

Group    FAR (%)    FRR (%)    Recognition rate
A        -          -          96%
B        0          6          97%
C        0          14         93%

Annotation: FAR (False Acceptance Rate), FRR (False Rejection Rate)

In Table 1, we can see that the recognition rates of rotundity-seals (A) and ellipse-seals
(B) are higher than that of square-seals (C). As rotundity-seals (A) and ellipse-seals (B)
possess more salient directional features, their features are more highly separated in
comparison with square-seals.

5 Conclusion
In this paper, we proposed a feature extraction method for seal imprints based on the
double-density dual-tree DWT. Through the double-density dual-tree DWT of seal
imprint images, information in 16 different directions is obtained. Although the
frequency distributions of true seals are similar to those of false seals, their directional
behaviors are much different. The method is stable and computationally efficient. The


experimental results demonstrate that the directionally selective features extracted by
this method are appropriate for the verification of seals with similar
structure.
Acknowledgments. This work is supported by the National Natural Science
Foundation of China (No. 60662003, 60462003, 10626029), the Science &
Technology Research Project of the Education Department of Jiangxi Province
(No.2006-231), Jiangxi Key Laboratory of Optic-electronic & Communication and
Jiangxi University of Finance & Economics Innovation Fund.

References
1. Fam T. J., Tsai W H: Automatic Chinese seal identification [J]. Computer Vision Graphics
and Image Processing, 1984, 25(2):311 - 330.
2. Qing H., Jingyu Y., Qing Z.: An Automatic Seal Imprint Verification Approach [J]. Pattern
Recognition, 1995, 28 (8):1251 - 1265.
3. Jain, A.K. and Farrokhnia. F., Unsupervised texture segmentation using Gabor filter,
Pattern Recognition, 1991.24, 1167-1186
4. Hatipoglu S., Mitra S K. and Kingsbury N.: Classification Using Dual-tree Complex Wavelet
Transform. Image Processing and Its Applications, Conference Publication No. 465 @IEE
1999
5. Selesnick I. W.: The Double-Density Dual-Tree DWT in IEEE Transactions on Signal
Processing, 52 (5): 1304-14, May 2004.
6. Selesnick I. W.: Hilbert Transform Pairs of Wavelet Bases, Signal Processing Letters, vol. 8,
no. 6, pp. 170.173, Jun. 2001.
7. Kingsbury N.G.: Image Processing with Complex Wavelets. Phil. Trans. Royal Society
London A, September 1999.

Vanishing Waves on Semi-closed Space Intervals and Applications in Mathematical Physics
Ghiocel Toma
Department of Applied Sciences, Politehnica University, Bucharest, Romania

Abstract. Test-functions (which differ from zero only on a limited interval
and have continuous derivatives of any order on the whole real axis) are
widely used in mathematical theory. Yet less attention was given to
intuitive aspects of the dynamics of such test functions or of similar functions
considered as possible solutions of certain equations in mathematical
physics (such as the wave equation). This study will show that the use of the wave
equation on a small space interval considered around the point of space
where the sources of the generated field are situated can be mathematically
represented by vanishing waves corresponding to a superposition
of travelling test functions. As an important consequence, some directions for
propagating the generated wave appear, and the possibility of
reverse radiation is rejected. Specific applications to other phenomena
involving wave generation (such as the Lorentz formulae describing the
generation of a wave with different features after the interaction with the
observer's material medium) are also presented.
Keywords: vanishing waves, test functions, semiclosed intervals.

Introduction

Test-functions (which differ from zero only on a limited interval and have continuous derivatives of any order on the whole real axis) are widely used in the
mathematical theory of distributions and in the Fourier analysis of wavelets. Yet
such test functions, similar to the Dirac functions, cannot be generated by a differential equation. The existence of such an equation of evolution, beginning
to act at an initial moment of time, would imply the necessity for a derivative
of a certain order to make a jump at this initial moment of time from the zero
value to a nonzero value. But this aspect is in contradiction with the property of
test-functions to have continuous derivatives of any order on the whole real axis,
represented in this case by the time axis. So it results that an ideal test-function
cannot be generated by a differential equation (see also [1]); the analysis has to be
restricted to possibilities of generating practical test-functions (functions similar
to test-functions, but having a finite number of continuous derivatives on the
whole real axis) useful for wavelet analysis. Due to the exact form of the derivatives of test-functions, we cannot apply derivative-free algorithms [2] or algorithms
which can change in time [3]. Starting from the exact mathematical expressions



of a certain test-function and of its derivatives, we must use specific differential
equations for generating such practical test-functions.
This aspect is connected with causal aspects of generating apparently acausal
pulses as solutions of the wave equation, presented in [4]. Thus, such test-functions, considered at the macroscopic scale (that means not as Dirac functions),
can represent solutions for certain equations in mathematical physics (an example being the wave equation). The main consequence of this consists in the
possibility of certain pulses appearing as solutions of the wave equation under
initial null conditions for the function and for all its derivatives and without any
free term (a source term) existing. In order to prove the possibility of acausal pulses
appearing as solutions of the wave equation (not determined by the initial
conditions or by some external forces) we begin by writing the wave equation

\frac{\partial^2 \phi}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 \phi}{\partial t^2} = 0        (1)

for a free string defined on the length interval (0, l) (an open set), where \phi
represents the amplitude of the string oscillations and v represents the velocity
of the waves inside the string medium. At the initial moment of time (the zero
moment) the amplitude \phi together with all its derivatives of first and second
order are equal to zero. From the mathematical theory of the wave equation we
know that any solution of this equation must be a superposition of a direct wave
and of a reverse wave. We shall restrict our analysis to direct waves and consider
a supposed extension of the string on the whole Ox axis, defined by the function

\phi(x, t) = \begin{cases} \exp\left( \dfrac{1}{(x - vt + 1)^2 - 1} \right) & \text{for } |x - vt + 1| < 1 \\ 0 & \text{for } |x - vt + 1| \ge 1 \end{cases}        (2)

where t \ge 0. This function for the extended string satisfies the wave equation
(being a function of x-vt , a direct wave). It is a continuous function, having
continuous partial derivatives of any order for x (, ) and for x 0. For
x (0, l) (the real string)the amplitude and all its derivatives are equal to zero
at the zero moment of time, as required by the initial null conditions for the real
string (nonzero values appearing only for x (2, 0) for t = 0, while on this
interval |x vt + 1| = |x + 1| < 1). We can notice that for t = 0 the amplitude
and its partial derivatives dier to zero only on a nite space interval, this being
a property of the functions dened on a compact set (test functions). But the
argument of the exponential function is x vt ; this implies that the positive
amplitude existing on the length interval (2, 0) at the zero moment of time will
move along the Ox axis in the direction x = +. So at some time moments
t1 < t2 < t3 < t4 < . . . after the zero moment the amplitude will be present
inside the string, moving from one edge to the other. It can be noticed that the
pulse passes through the real string and at a certain time moment tf in (when
the pulse existing at the zero moment of time on the length interval (2, 0) has
moved into the length interval (l, l + 2)) its action upon the real string ceases.
We must point the fact that the limit points x = 0 and x = l are not considered


to belong to the string; but this is in accordance with the rigorous definition of
derivatives (for these limit points, derivatives cannot be defined as related to any
direction around them).
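To make the behaviour of the pulse (2) concrete, the following short Python sketch (an illustrative numerical check; the values of v, l and the grid are arbitrary) evaluates \phi(x, t) on the open interval (0, l) and confirms that it is identically zero there at t = 0, becomes non-zero while the pulse crosses the string, and vanishes again after the pulse has left.

```python
import numpy as np

def pulse(x, t, v=1.0):
    """Direct-wave test-function pulse of Eq. (2)."""
    s = x - v * t + 1.0
    out = np.zeros_like(x, dtype=float)
    inside = np.abs(s) < 1.0
    out[inside] = np.exp(1.0 / (s[inside] ** 2 - 1.0))
    return out

l, v = 5.0, 1.0
x = np.linspace(0.0, l, 501)[1:-1]            # points of the open interval (0, l)

print(np.max(pulse(x, 0.0, v)))               # 0.0: null initial condition on (0, l)
print(np.max(pulse(x, 2.0, v)))               # > 0: the pulse is crossing the string
print(np.max(pulse(x, (l + 3.0) / v, v)))     # 0.0: the pulse has left the string
```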
This point of space (the limit of the open space interval considered) is very
important for our analysis, while we shall extend the study to closed space
intervals. Considering small space intervals around the points of space where
the sources of the generated eld are situated (for example, the case of electrical
charges generating the electromagnetic eld), it will be shown that causal aspects
require the logical existence of a certain causal chain for transmitting interaction
from one point of space to another, which can be represented by mathematical
functions which vanishes (its amplitude and all its derivatives) in certain points
of space. From this point of space, an informational connection for transmitting
the wave further could be considered (instead of a tranmission based on certain
derivatives of the wave). Thus a kind of granular aspect for propagation along
a certain axis can be noticed, suitable for application in quantum theory. As
an important consequence, by a multiscale analysis and the use of non-Markov
systems, some directions for propagating the generated wave will appear and the
possibility of reverse radiation will be rejected. Finally. specic applications for
other phenomena involving wave generation (as the Lorentz formulae describing
the generation of a wave with dierent features after the interaction with the
observers material medium) will be also presented.

Test-Functions for Semi-closed Space Intervals

If we extend our analysis to closed intervals by adding the limit of the space
interval to the previously studied open intervals (for example by adding the
points x = 0 and x = l to the open interval (0, l), we should take into account the
fact that a complete mathematical analysis usually implies the use of a certain
function f (t) dened at the limit of the working space interval (the point of space
x = 0, in the previous example). Other complete mathematical problems for the
wave equation or for similar equations in mathematical physics use functions
f0 (t), fl (t) corresponding to both limits of the working space intervals (the points
of space x = 0 and x = l in the previous example) or other supplementary
functions.
The use of such supplementary functions dened on the limit of the closed
interval could appear as a possible explanation for the problem of generating
acausal pulses as solutions of the wave equation on open intervals. The acausal
pulse presented in the previous paragraph (similar to wavelets) travelling along
the Ox axis requires a certain non-zero function of time f0 (t) for the amplitude of
the pulse for the limit of the interval x = 0. It could be argued that the complete
mathematical problem of generating acausal pulses for null initial conditions on
an open interval and for null functions f0 (t) and fl (t) corresponding to function
(the pulse amplitude) at the limits of the interval x = 0 and x = l respectively, would reject the possibility of appearing the acausal pulse presented in
the previous paragraph. The acausal pulse presented implies non-zero values


for f0 and fl at the limit of the closed interval at certain time moments, which
represents a contradiction with the requirement for these functions f0 and fl to
present null values at any time moment. By an intuitive approach, null external
sources would imply null values for functions f0 and fl and (as a consequence)
null values for the pulse amplitude .
Yet it can easily be shown that the problem of generating acausal pulses on
semi-closed intervals cannot be rejected by using supplementary requirements
for certain functions f(t) defined at one limit of such space intervals. Let us
simply suppose that, instead of the function

\[
\varphi(\tau) =
\begin{cases}
\exp\!\left(\dfrac{1}{(x-vt+1)^2-1}\right), & |x-vt+1| < 1 \\[4pt]
0, & |x-vt+1| \ge 1
\end{cases}
\tag{3}
\]
presented in the previous paragraph, we must take into consideration two functions
φ0 and φl defined as

\[
\varphi_0(\tau) =
\begin{cases}
\exp\!\left(\dfrac{1}{(x-vt+m)^2-1}\right), & |x-vt+m| < 1 \\[4pt]
0, & |x-vt+m| \ge 1
\end{cases}
\tag{4}
\]
and
\[
\varphi_l(\tau) =
\begin{cases}
-\exp\!\left(\dfrac{1}{(x+vt-m)^2-1}\right), & |x+vt-m| < 1 \\[4pt]
0, & |x+vt-m| \ge 1
\end{cases}
\tag{5}
\]

with m selected so that m > 0 and m − 1 > l (so that both functions φ0 and φl have
non-zero values outside the real string and are antisymmetrical with respect to
the point of space x = 0). Since the function φ0 corresponds to a direct wave (its
argument being x − vt) and φl corresponds to a reverse wave (its argument
being x + vt), it results that both functions φ0 and φl arrive at the same time
at the space origin x = 0, the sum of these two external pulses being null there
at all times (the functions φ0 and φl being antisymmetrical, φ0 = −φl). So by requiring
that φ(t) = 0 for x = 0 (the left limit of the semi-closed interval [0, l)) we cannot
reject the mathematical possibility of an acausal pulse appearing on a semi-closed interval.
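As an illustration of this cancellation, a minimal numerical check (assuming the reconstructed forms (4)-(5) above and the illustrative values v = 1, m = 3, which are not taken from the paper) evaluates the superposition φ0 + φl on a time grid: it stays identically null at x = 0 while being non-zero at other points of space.

import numpy as np

def bump(s):
    """Compactly supported bump: exp(1/(s**2 - 1)) for |s| < 1, zero elsewhere."""
    out = np.zeros_like(s, dtype=float)
    inside = np.abs(s) < 1.0
    out[inside] = np.exp(1.0 / (s[inside]**2 - 1.0))
    return out

v, m = 1.0, 3.0                       # illustrative values only (m - 1 > l for a short string)

def phi0(x, t):                       # direct wave, argument (x - v t)
    return bump(x - v * t + m)

def phil(x, t):                       # reverse wave, argument (x + v t), opposite sign
    return -bump(x + v * t - m)

t = np.linspace(0.0, 6.0, 601)
x0 = np.zeros_like(t)                 # the left limit x = 0 of the semi-closed interval
print(np.max(np.abs(phi0(x0, t) + phil(x0, t))))    # 0.0: the superposition vanishes at x = 0
x1 = np.full_like(t, 0.5)
print(np.max(np.abs(phi0(x1, t) + phil(x1, t))))    # non-zero away from the origin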
This pulse is in fact a travelling wave propagating from x = +∞ towards
x = −∞ which vanishes at the point of space x = 0. Moreover, its derivatives
are also equal to zero at this point of space for certain time moments (when
both travelling pulses cease their action at the point of space x = 0). This pulse
is a solution of the wave equation on the semi-closed interval [0, l), and can be
very useful for considering a transmission of interaction on finite space intervals
(in our case the interaction being transmitted from x = l towards x = 0).
From this point of space, at the time moments when the amplitude and all
its derivatives are equal to zero, the interaction can be further transmitted by
considering an informational connection; the mathematical form of the pulse is
changed, a new wave should be generated for the adjacent space interval, and
a mathematical connection for transmitting the interaction further (towards
x = −∞) is not possible while the pulse amplitude and all its derivatives vanish
at these time moments at the point of space x = 0. This aspect implies a step-by-


step transmission of interaction, starting from an initial semi-closed interval (its
open limit corresponding to the source of the field, for example) to other space
intervals. This corresponds to a granular aspect of space suitable for applications
in quantum physics, where the generation and annihilation of quantum particles
should be considered on limited space-time intervals. For this purpose, specific
computer algorithms and memory cells corresponding to each space interval
should be used; the informational connection from one space interval to another
should be represented by computer operations of data initialization, as sketched below.
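A toy sketch of such an interval-by-interval transmission follows (the interval layout, hand-off rule and all names are illustrative assumptions, not an algorithm from the paper): each space interval owns a memory cell that is initialized only when the pulse has died out at the boundary it shares with the previous interval.

# Toy sketch (illustrative assumptions only): interaction is passed from one
# semi-closed interval to the next by initializing that interval's memory cell
# once the pulse has vanished at the shared boundary.
NUM_INTERVALS = 4
memory_cells = [{"active": False, "t_init": None} for _ in range(NUM_INTERVALS)]
memory_cells[0] = {"active": True, "t_init": 0.0}        # interval containing the source

def pulse_vanished_at_boundary(k, t):
    """Placeholder: True once the pulse of interval k has vanished at its right limit."""
    return t >= (k + 1) * 1.0                            # assumed hand-off time per interval

for t in [0.5 * i for i in range(10)]:                   # coarse time stepping
    for k in range(NUM_INTERVALS - 1):
        if memory_cells[k]["active"] and not memory_cells[k + 1]["active"]:
            if pulse_vanished_at_boundary(k, t):
                # "informational connection": data initialization of the adjacent interval
                memory_cells[k + 1] = {"active": True, "t_init": t}

print(memory_cells)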

Aspects Connected with Spherical Waves

A possible mathematical explanation for this aspect consists in the fact that we
have used a reverse wave (an acausal pulse) propagating from x = +∞ towards
x = −∞, which is first received at the right limit x = l of the semi-closed interval
[0, l) before arriving at the point of space x = 0. It can be argued that in the case
of a closed space interval [0, l] we should consider the complete mathematical
problem, consisting of two functions f0(t), fl(t) corresponding to both limits of
the working space interval (the points of space x = 0 and x = l). But in fact the
wave equation corresponds to a physical model valid in three-dimensional
space, of the form
\[
\frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} + \frac{\partial^2 \Phi}{\partial z^2} - \frac{1}{v^2}\frac{\partial^2 \Phi}{\partial t^2} = 0
\tag{6}
\]

and the one-dimensional model previously used is just an approximation. Moreover,
the source of the field is considered at a microscopic scale (quantum particles such as
electrons in the case of the electromagnetic field, for example) and
the field emitted by such elementary particles presents spherical symmetry.
Transforming the previous equation into polar coordinates and supposing that the
function Φ depends only on r (the distance from the source of the field to the
point of space where the emitted field is received), it results
\[
\frac{\partial^2 U}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2 U}{\partial t^2} = 0
\tag{7}
\]
where
\[
U = r\Phi .
\tag{8}
\]
An analysis of the field emitted from the space origin towards a point of space
r = r0 (where the field is received) should be performed on the space interval
(0, r] (a semi-closed interval); the point of space r = 0 cannot be included in the
working interval as long as the solution Φ(r) for the field is obtained by dividing
the solution U(r) of the previous equation (in spherical coordinates) by
r (the denominator of the solution being zero there, some supplementary aspects
connected to limits of functions should be added, but still without considering
a solution for the space origin).
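For completeness, the standard reduction behind (7)-(8) can be written out: for a spherically symmetric field Φ = Φ(r, t) the Laplacian reduces to its radial part, and the substitution U = rΦ turns the three-dimensional equation (6) into the one-dimensional form (7):
\[
\nabla^2 \Phi = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\,\frac{\partial \Phi}{\partial r}\right)
             = \frac{1}{r}\,\frac{\partial^2 (r\Phi)}{\partial r^2},
\qquad
U = r\Phi \;\Longrightarrow\;
\frac{\partial^2 U}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2 U}{\partial t^2} = 0 .
\]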


Thus an asymmetry in the methods required for analyzing phenomena appears. In a
logical manner, by also taking into consideration the free term (corresponding to the
source of the field) situated at the point of space x = 0 (the origin), it results that the
use of functions depending on x − vt (mentioned in the previous paragraph) or r − vt
(for the spherical waves) also represents a limit for the case of a sequence of small
interactions acting as: external source (free term) → changes in the values of the partial
derivatives with respect to the space coordinates → changes in the partial derivatives
of the amplitude with respect to time → changes in the value of the function, so that
the possibility of acausal pulses (not yet observed) appearing is rejected. Such a causal
chain can be represented in mathematical form as a differential equation able to
generate functions similar to test functions, defined as practical test functions only as
an approximation at a greater scale of space-time, for the case when the length and
time intervals corresponding to such equations with finite differences are very small.
Moreover, since a certain direction for the transmission of interaction appears, it
results that the possibility of reverse radiation (a reverse wave generated by points of
space where a direct wave has arrived) should be rejected in a logical manner (a
memory of previous phenomena determining the direction of propagation).
Mathematically, an analysis at a small spatial and temporal scale based on continuous
functions for transmitting interactions from one point of space to another (similar to
continuous wave functions in quantum physics describing the generation and
annihilation of elementary particles) should be described by non-Markov processes
(phenomena which should be analyzed by taking into account the evolution over
a past time interval).

Applications to the Relativistic Transformation of Waves

An application of such non-Markov processes is the analysis of the Lorentz
transformation in special relativity, when a certain wave train interacts with
the observer's material medium. The usual interpretation of special relativity
theory considers that the Lorentz formulae describe the transformation of the
space-time coordinates corresponding to an event when the inertial reference
system is changed. These formulae are considered to be valid at any moment
of time after a certain synchronization moment (the zero moment), irrespective
of the measuring method used. However, there are some problems connected to
the use of mechanical measurements on closed-loop trajectories with the analysis
performed on adjoining small time intervals. For example, suppose that
at the zero moment of time, in a medium with a gravitational field which can
be neglected, two observers begin a movement from the same point of
space, in opposite directions, on circular trajectories having a very great radius
of curvature, so as to meet again after a certain time interval. We can consider
the end of each small time interval as a resynchronization moment, and it results
that time dilation appears on each small time interval. Yet if we consider that
the time intervals measured after a resynchronization procedure can be added to
the previously measured time intervals (the result being considered as related to


the initial time moment), a global time dilation appears. If time is measured
using the age of two plates, it results that the plate in reference system S2 is
older than the one in reference system S1 (having a lower mechanical resistance)
and it can be destroyed by it after both observers stop their circular movements.
However, the same analysis can be made by starting from another set of small
time intervals considered in the other reference system, and it finally results that
the plate in reference system S1 is older than the one in reference system S2
(having a lower mechanical resistance) and it can be destroyed by it after both
observers stop their circular movements. But this result is in logical contradiction
with the previous conclusion, because a plate cannot destroy and at the same
time be destroyed by another one.
A logical attempt at solving this contradiction can be made by considering
that the Lorentz formulae are valid only for electromagnetic phenomena (as in the
case of the transversal Doppler effect) and not for mechanical phenomena or any
phenomena involving memory of previous measurements. Using an
intuitive approach which considers that the Lorentz transformation represents a
physical transformation of a wave train when it interacts with the observer's
material medium, such a logical contradiction can be avoided (see [5], [6] for more
details). Yet the memory of past events cannot be totally neglected. The transformation
of a received wave into another wave which moves along the same
direction, with certain mathematical expressions describing how the space-time
coordinates corresponding to the case when the received wave would not have been
affected by interaction are transformed into the space-time coordinates corresponding
to the transformed wave train (according to the Lorentz formulae valid on
the space and time intervals corresponding to the received wave train), requires a
certain memory of the received wave train necessary for performing the transformation
(otherwise no local time dilation would appear). This aspect is similar
to the requirement of using non-Markov processes for justifying how a certain
direction of propagation for the generated wave appears.

Conclusions

This study has shown that some solutions of the wave equation on semi-closed
space intervals considered around the points of space where the sources of the
generated field are situated (for example, the case of electrical charges generating
the electromagnetic field) can be mathematically represented by vanishing waves
corresponding to a superposition of travelling test functions. It is also shown that
this aspect requires the logical existence of a certain causal chain for transmitting
interaction from one point of space to another. As an important consequence, by
a multiscale analysis and the use of non-Markov systems, certain directions for
propagating the generated wave appeared and the possibility of reverse radiation
was rejected. Specific applications to other phenomena involving wave generation
(such as the Lorentz formulae describing the generation of a wave with different
features after the interaction with the observer's material medium) have also
been presented. Unlike other mathematical problems (the Cauchy problem) based on


long-range dependence (see also [7], where statistical aspects are also taken into
consideration), this study presents aspects connected to short-range interactions.
Asymptotic properties are taken into account for the mathematical problem for
functions having a limit (null denominator at the limit of the working semi-closed
interval), instead of an approach based on relaxation phenomena, as in [8].
In future studies, such aspects will be extended to mathematical models describing
step changes in a certain environment (similar to aspects presented in [9],
[10], with step changes presented in [11]). The aspects presented in this study can
be extended to closed space intervals, by considering that at the initial moment
of time, at one of the spatial limits of the interval, a direct wave and a
reverse wave arrive (antisymmetrical with respect to this point of space), both
represented by sequences of extended Dirac pulses having the same space length
d = L/k (k being an integer and L being the length of the whole interval). As a
consequence, after a certain time interval, a set of oscillations represented by
stationary waves with null derivatives of certain orders at both spatial limits will appear.
Acknowledgment. This work was supported by the National Commission of
Romania for UNESCO, through a pilot grant of international research involving Politehnica University, Salerno University, IBM India Labs and Shanghai
University.

References
1. Toma, C.: Acausal pulses in physics - numerical simulations, Bulgarian Journal of Physics (to appear)
2. Morgado, J.M., Gomes, D.J.: A derivative-free tracking algorithm for implicit curves with singularities, Lecture Notes in Computer Science 3039 (2004), 221-229
3. Federl, P., Prusinkiewicz, P.: Solving differential equations in developmental models of multicellular structures using L-systems, Lecture Notes in Computer Science 3037 (2004), 65-82
4. Toma, C.: The possibility of appearing acausal pulses as solutions of the wave equation, The Hyperion Scientific Journal 4 1 (2004), 25-28
5. Toma, C.: A connection between special relativity and quantum theory based on non-commutative properties and system-wave interaction, Balkan Physics Letters Supplement 5 (1997), 2509-2513
6. Toma, C.: The advantages of presenting special relativity using modern concepts, Balkan Physics Letters Supplement 5 (1997), 2334-2337
7. Li, M., Lim, S.C.: Modelling Network Traffic Using Cauchy Correlation Model with Long-Range Dependence, Modern Physics Letters B 19 (2005), 829-840
8. Lim, S.C., Li, M.: Generalized Cauchy Process and Its Application to Relaxation Phenomena, Journal of Physics A: Mathematical and General 39 (2004), 2935-2951
9. Lide, F., Jinhai, L., Suosheng, C.: Application of VBA in HP3470A Data Acquisition System, Journal of Instrumentation and Measurements 8 (2005), 377-379
10. Lide, F., Wanling, Z., Jinhai, L., Amin, J.: A New Intelligent Dynamic Heat Meter, IEEE Proceedings of ISDA 2006 Conference, 187-191
11. Xiaoting, L., Lide, F.: Study on Dynamic Heat Measurement Method after Step Change of System Flux, IEEE Proceedings of ISDA 2006 Conference, 192-197

Modelling Short Range Alternating Transitions


by Alternating Practical Test Functions
Stefan Pusca
Politehnica University, Department of Applied Sciences, Bucharest, Romania

Abstract. As is known, practical test-functions [1] are very useful
for modeling suddenly emerging phenomena. In this study we try to use some specific
features of these functions for modeling aspects connected with transitions from a
certain steady state to another, with emphasis on the use of short-range alternating
functions. The use of such short-range alternating functions is required by the fact
that in modern physics (quantum physics) all transitions imply the use of certain
quantum particles (field quantization) described using frequencies associated with
their energy. For this reason, a connection between a wave interpretation of
transitions (based on continuous functions) and a corpuscle interpretation of
transitions (involving the creation and annihilation of certain quantum particles)
should be performed using certain oscillations defined on a limited time interval
corresponding to the transition from one steady state to another.
Keywords: transitions, test functions, short range phenomena.

Introduction

As is known, the basic concepts in physics connected with interaction are the
wave and corpuscle concepts. In classical physics the corpuscle term describes
the existence of certain bodies subjected to external forces or fields, and the
wave concept describes the propagation of oscillations and fields. In quantum
physics these terms are closely interconnected: the wave train associated with a
certain particle describes the probability of a quantum corpuscle (an electron or
a photon) appearing; the results of certain measurements performed upon the
quantum particle are described by the proper values (eigenvalues) of the operators
corresponding to the physical quantity to be measured, the action of these operators
having to be considered in a more intuitive manner as well. Certain problems connected
with measurement procedures on closed-loop trajectories in special relativity and
non-commutative properties of operators in quantum physics [2] imply a more
rigorous definition of the measurement method and of the interaction phenomena,
classified from the wave and from the corpuscular aspect of matter, so as to avoid
contradictions generated by terminological cycles [3]. Logical definitions for the class
of measuring methods based on the wave aspect of matter and for the class of
measuring methods based on the corpuscular aspect of matter upon interaction



phenomena, based on considerations about a possible memory of previous measurements
(operators) in the case of a sequence of received pulses, were presented in
[4], trying to obtain expressive pattern classes (similar to those presented in [5]).
As a consequence, aspects connected with the memory of previous measurements
corresponding to the action of systems upon received wave trains have to be taken
into consideration.
Moreover, this aspect implies an intuitive interpretation for the dependence
of the mass of a body on the reference system. Thus, it was shown that for
the case when the Lorentz transformation does not generate a pulse (for example
when the relative speed between the material body and the wave is equal to c,
the speed of light in vacuum), the mass m is equal to ∞, which means that no
interaction due to the received pulse exists. In this manner the notion of infinite
mass is connected with the absence of interaction [8]. So m = ∞ for a body
inside a reference system S shows that we cannot act upon the material body
using wave pulses emitted in system S; however, changes in the movement of
the body (considered in system S) due to other external forces seem to be
allowed. The absence of interaction is also connected with the absence of an estimation
of the space coordinates of the wave source (the general case being presented in
[9]). This aspect can be considered as a suddenly emerging phenomenon, since
the interaction disappears when the relative speed v between the system which
emits the wave and the system which receives it becomes equal to c.
Yet the problem is more complex if a high-energy pulse interacts with a
single or with a small number of elementary (small) particles. In this case the
total energy of the particles (according to the relativistic expression E = mc²) can
be much smaller than the energy of the received pulse which interacts with
them. For a correct analysis (according to the previous considerations) the small
(elementary) particles should be considered as associated wave trains interacting
with a high-energy environment (some scaling aspects [10] appearing). The high-energy
pulse would be much less affected by the interaction, which means that
it is the element performing the transformation; the associated wave train of the
particles would be much more affected by the interaction, being the element
which undergoes the transformation. In the most general case, the study of wave
transformations according to the Lorentz formulae in a certain environment must
be performed in the reference system where the total momentum is zero (by
analogy with the study of collisions in the reference system associated with the
center of mass).
For an improved analysis of phenomena we must find an approach able to connect the
transitions from one steady state of a system to another using
a formalism based on functions defined on limited time intervals. Smooth transitions
(based on practical test-functions, similar to wavelets) were presented in
[2], but that study presented an algorithm for generating smooth transitions
for any derivative of the function f describing the transitions, avoiding alternating
functions. On the contrary, for allowing an approach able to also explain the
creation and annihilation of quantum particles by interactions in modern physics
(defined on an extremely small time interval dt around the interaction moment


of time), the use of such alternating functions is recommended, so that
certain frequencies, usually associated with the energy of quantum particles in modern
physics, appear. So the algorithm presented in [9] should be improved.

Connections with Test Functions

For modeling phenomena connected with wave-train transformation in a certain
environment, we could use the formalism of topological solitary waves with arbitrary
charge [3] or of harmonic wavelets [4]. However, the disappearance of
interaction when the relative speed v equals c implies the absence of certain
state variables at that very moment of time when v = c, when v passes from a
value less than c to a value greater than c; this is similar to aspects connected
with the integration of functions similar to test functions on a working interval [5]
- a certain number of derivatives vanishing at the end of this interval. Specific
features from modeling solitary waves in composite materials [6] could be useful,
avoiding mathematical possibilities of generating acausal pulses [7] (the Lorentz
transformation of a received wave train does not generate any wave without
a certain received wave existing, and it acts instantly). Stochastic aspects of the
Schroedinger equation imply a probability of measuring a certain value for a
physical quantity connected with an associated wave, not a probability of different
associated waves appearing (see [8] for a wavelet analysis of the Schroedinger
equation).
From basic mathematics it is known that the product φ(t)g(t) between a
function g(t) which belongs to the C∞ class and a test-function φ(t) which differs
from zero on (a, b) is also a test-function, because:
a) it differs from zero only on the time interval (a, b) where φ(t) differs from zero (if
φ(t) is null, then the product φ(t)g(t) is also null);
b) the function φ(t)g(t) belongs to the C∞ class of functions, since a derivative
of a certain order k can be written as
\[
\bigl(\varphi(t)g(t)\bigr)^{(k)} = \sum_{p=0}^{k} C_k^p\, \varphi^{(p)}(t)\, g^{(k-p)}(t)
\tag{1}
\]

(a sum of terms represented by a product of two continuous functions).

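The expansion (1) is the Leibniz rule for the k-th derivative of a product; a quick symbolic check (using the analytic branch of the bump function on (−1, 1) and an arbitrary smooth g(t), both chosen here purely for illustration) confirms it term by term:

import sympy as sp

t = sp.symbols('t')
phi = sp.exp(1 / (t**2 - 1))     # analytic branch of the bump test-function on (-1, 1)
g = sp.cos(3 * t)                # an arbitrary C-infinity function (illustrative choice)

k = 4
direct = sp.diff(phi * g, t, k)
leibniz = sum(sp.binomial(k, p) * sp.diff(phi, t, p) * sp.diff(g, t, k - p)
              for p in range(k + 1))
print(sp.simplify(direct - leibniz))   # -> 0, i.e. formula (1) holds term by term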

Yet for practical cases (when phenomena must be represented by differential
equations), the φ(t) test functions must be replaced by practical test functions
f(t) of class Cⁿ on R (for a finite n, considered from now on as representing the order
of the practical test function) having the following properties:
a) f is nonzero on (a, b);
b) f satisfies the boundary conditions f^(k)(a) = f^(k)(b) = 0 for k = 0, 1, ..., n; and
c) f restricted to (a, b) is the solution of an initial value problem (i.e., an ordinary
differential equation on (a, b) with initial conditions given at some point in
this interval).
The generation of such practical test functions is based on the study of differential equations satisfied by the initial test functions, with the initial moment


of time chosen at a time moment close to the t = a moment of time (when the
function begins to present non-zero values).
By using these properties of practical test-functions, we obtain the following
important result for a product f(t)g(t) between a function g(t) which belongs
to the C∞ class and a practical test-function f(t) of order n which differs from zero
on (a, b):
General Property for the Product: The product g(t)f(t) between a function g(t) ∈
C∞ and a practical test function f of order n is a practical test
function of order n.
This is a consequence of the following two properties:
a) the product g(t)f(t) differs from zero only on the time interval (a, b) on which
f(t) differs from zero;
b) the derivative of order k of the product g(t)f(t) is represented by the sum
\[
\bigl(f(t)g(t)\bigr)^{(k)} = \sum_{p=0}^{k} C_k^p\, f^{(p)}(t)\, g^{(k-p)}(t)
\tag{2}
\]

which is a sum of terms representing products of two continuous functions for
any k ≤ n (n being the order of the practical test-function f); only for k > n
can discontinuous functions appear in the previous sum. Integral properties of
practical test functions of a certain order have been presented in [9]. There, it
was shown that the integral of a test function φ(t) (which differs from zero on
the (a, b) interval) is a constant function on the time intervals (−∞, a] and [b, +∞);
it presents a certain variation on the (a, b) time interval, from a constant null
value to a certain quantity corresponding to the final constant value. This
aspect was used for modeling smooth transitions from a certain state to another
when almost all derivatives of a certain function are equal to zero at the initial
moment of time. The absence of interaction at the time moment t_in (when v = c),
considered as the initial moment of time, suggested that all (or a great number) of
the derivatives of the functions x = x(t), y = y(t), z = z(t) (the space coordinates) are
null for t = t_in and present a certain variation on a time interval close to t_in
(when v is no longer equal to c).
In the general case when a function f and a finite number of its derivatives
f^(1), f^(2), ..., f^(n) present variations from null values to values α, α1, α2, ..., αn
on the time interval [−1, 1], a certain function fn which should be added to the
null initial function so as to obtain a variation αn for the derivative of order n
was studied. By multiplying the exponential bump-like function (a test-function
on [−1, 1]) by the variation αn of the derivative of order n and by integrating
this product n + 1 times we obtain:
- after the first integration: a constant value equal to αn at the time moment
t = 1 (since the integral of the bump-like test function on [−1, 1] is equal to 1),
and a null variation on (1, +∞);


- after the second integration (when we integrate the function obtained at the previous
step): a term equal to αn(t − 1) and a term equal to a constant value c_{n-1}
(a constant of integration) on the time interval (1, +∞);
- after the (n + 1)-th integration: a term equal to αn(t − 1)^n/n! and a sum of terms
having the form c_{ni}(t − 1)^i/i! for i ∈ N, i < n (c_{ni} being constants of integration)
on the time interval (1, +∞); and so on. Corrections are due to the fact that the
function fn previously obtained has non-zero variations d_{n-1}, d_{n-2}, ..., d_1 for its
derivatives of order n − 1, n − 2, ..., 1; these values were subtracted from the set
α_{n-1}, α_{n-2}, ..., α_1 before passing to the next step, when the bump-like function
was multiplied by the corrected value α_{n-1} − d_{n-1}. Finally, by integrating this
product n times we obtained in a similar manner a function with a term equal
to α_{n-1}(t − 1)^{n-1}/(n − 1)! and a sum of terms having the form c_{ni}(t − 1)^i/i! for
i ∈ N, i < n − 1 (c_{ni} being constants of integration) on the time interval (1, +∞),
it being noticed that the result obtained after n integrations possesses the derivative
of order n − 1 equal to α_{n-1}, a smooth transition for this derivative from the initial
null value thus being performed. So the second function which must be added to the
initial null function is the integral of order n − 1 of the bump-like function multiplied
by this variation α_{n-1} (denoted f_{n-1}). This second function has a null value
for the derivative of order n, so the result obtained at the first step is not affected.
We must again take care of the fact that the function f_{n-1} previously obtained
has non-zero variations d¹_{n-1}, d¹_{n-2}, ..., d¹_1 for its derivatives of order n − 1, n − 2, ..., 1,
and so we must once again subtract these values from the previously corrected
set α_{n-1} − d_{n-1}, α_{n-2} − d_{n-2}, ..., α_1 − d_1 before passing to the next step. Finally
we obtain all the functions f_{n+1}, f_n, ..., f_1 which represent the terms of the function f
modeling the smooth transition from an initial null function to a function having a
certain set of variations for a finite number of its derivatives on a small
time interval. The procedure can also be applied to functions possessing a finite
number of derivatives within a certain time interval by time reversal (t being
replaced with −t).
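A compact numerical sketch of this construction follows (grid, order n and variation αn below are chosen only for illustration): the normalized bump function is multiplied by the prescribed variation αn and integrated n + 1 times, after which the n-th derivative of the result passes smoothly from 0 to αn.

import numpy as np

x = np.linspace(-1.0, 3.0, 4001)
dx = x[1] - x[0]

bump = np.zeros_like(x)                       # bump-like test function on [-1, 1]
inside = np.abs(x) < 1.0
bump[inside] = np.exp(1.0 / (x[inside]**2 - 1.0))
bump /= np.trapz(bump, x)                     # normalize so that its integral equals 1

alpha_n, n = 2.5, 2                           # prescribed variation of the n-th derivative
fn = alpha_n * bump
for _ in range(n + 1):                        # integrate the product n + 1 times
    fn = np.concatenate(([0.0], np.cumsum(0.5 * (fn[1:] + fn[:-1]) * dx)))

dn = fn.copy()                                # n-th derivative of the result
for _ in range(n):
    dn = np.gradient(dn, dx)
print(dn[0], dn[-1])                          # ~0 at the left end, ~alpha_n after the transition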
The next step consists in considering the previously obtained functions as the argument
of a complex function F. In [10] the similitude between the coefficients appearing in
the partial fraction decomposition and the electric field intensity E depending on the
distance a − b (in electrostatics) was presented. If we write the
decomposition


 


\[
\frac{1}{(x-a)(x-b)} = \frac{1}{a-b}\left(\frac{1}{x-a}\right) - \frac{1}{a-b}\left(\frac{1}{x-b}\right)
\tag{3}
\]
and compare the coefficient 1/(a − b) of each term with the electromagnetic field



\[
E = \frac{Q}{4\pi\varepsilon}\,\frac{1}{a-b}
\tag{4}
\]
for the classical case in electrostatics when at a point situated at x_d = b an
electric field emitted by a body with charge Q, situated at a point
x_s = a, is received (the one-dimensional case) - without taking the sign into consideration -
we can notice that the coefficient 1/(a − b) is also the coefficient of Q/(4πε). This
suggests that such coefficients of 1/(x − a) correspond to certain physical
quantities noticed at the point x = b and associated with a field emitted at the point
x = a. It also suggests that the whole system S_{a,b} should be described as



\[
S_{a,b} = \frac{Q}{4\pi\varepsilon}\,\frac{1}{(x-a)(x-b)}
\tag{5}
\]
and it can be decomposed into phenomena taking place at the point x = a or x =
b by taking into consideration the coefficient of 1/(x − a) or 1/(x − b) from the
partial fraction decomposition. Mathematically, these coefficients c_a, c_b can be
written as
\[
c_a = \lim_{x\to a}\,(x-a)\,S_{a,b}, \qquad c_b = \lim_{x\to b}\,(x-b)\,S_{a,b}
\tag{6}
\]

By simply replacing the coefficients a, b appearing in the denominator expressions with
M1 exp(iF1), M2 exp(iF2) (complex functions with arguments F1, F2 determined
using the previous algorithm) we obtain a smooth transition for the denominators
of the partial fractions involved in the interaction, certain frequencies appearing. This
is an important step for improving the aspects presented in [10], where the necessity
of using functions depending on time for the coefficients appearing in the denominator
expressions of the partial fractions corresponding to physical quantities has
already been mentioned.
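The coefficients (6) are the residues of the partial fraction decomposition (3); a short symbolic computation over the system (5), with Q/(4πε) kept symbolic, reproduces them:

import sympy as sp

x, a, b, Q, eps = sp.symbols('x a b Q varepsilon', positive=True)
S = Q / (4 * sp.pi * eps) / ((x - a) * (x - b))     # the system S_{a,b} of (5)

c_a = sp.limit((x - a) * S, x, a)                   # quantity associated with the point x = a
c_b = sp.limit((x - b) * S, x, b)                   # quantity associated with the point x = b
print(sp.simplify(c_a))                             # Q/(4*pi*varepsilon*(a - b))
print(sp.simplify(c_b))                             # Q/(4*pi*varepsilon*(b - a))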

Conclusions

This study has shown that the use of such short-range alternating functions is
required by the fact that in modern physics (quantum physics) all transitions
imply the use of certain quantum particles (field quantization) described using
frequencies associated with their energy. For this reason, a connection between
a wave interpretation of transitions (based on continuous functions) and a corpuscle
interpretation of transitions (involving the creation and annihilation of certain
quantum particles) has been performed using certain oscillations defined on a
limited time interval corresponding to the transition from one steady state to
another.
Acknowledgment. This work was supported by the National Commission of
Romania for UNESCO, through a pilot grant of international research involving Politehnica University, Salerno University, IBM India Labs and Shanghai
University.

References
1. Toma, G.: Practical test-functions generated by computer algorithms, Lecture Notes in Computer Science 3482 (2005), 576-585
2. Toma, C.: The advantages of presenting special relativity using modern concepts, Balkan Physics Letters Supplement 5 (1997), 2334-2337
3. D'Avenia, P., Fortunato, D., Pisani, L.: Topological solitary waves with arbitrary charge and the electromagnetic field, Differential Integral Equations 16 (2003), 587-604
4. Cattani, C.: Harmonic Wavelets towards Solution of Nonlinear PDE, Computers and Mathematics with Applications 50 (2005), 1191-1210
5. Toma, C.: An extension of the notion of observability at filtering and sampling devices, Proceedings of the International Symposium on Signals, Circuits and Systems Iasi SCS 2001, Romania, 233-236
6. Rushchitsky, J.J., Cattani, C., Terletskaya, E.V.: Wavelet Analysis of the evolution of a solitary wave in a composite material, International Applied Mechanics 40, 3 (2004), 311-318
7. Toma, C.: The possibility of appearing acausal pulses as solutions of the wave equation, The Hyperion Scientific Journal 4 1 (2004), 25-28
8. Cattani, C.: Harmonic Wavelet Solutions of the Schroedinger Equation, International Journal of Fluid Mechanics Research 5 (2003), 1-10
9. Toma, A., Pusca, St., Morarescu, C.: Spatial Aspects of Interaction between High-Energy Pulses and Waves Considered as Suddenly Emerging Phenomena, Lecture Notes in Computer Science 3980 (2006), 839-847
10. Toma, Th., Morarescu, C., Pusca, St.: Simulating Superradiant Laser Pulses Using Partial Fraction Decomposition and Derivative Procedure, Lecture Notes in Computer Science 3980 (2006), 771-779

Different Structural Patterns Created by Short Range Variations of Internal Parameters
Flavia Doboga
ITT Industries, Washington, U.S.A.

Abstract. This paper presents properties of spatial linear systems described by a
certain physical quantity generated by a differential equation. This quantity can be
represented by the internal electric or magnetic field inside the material, by a
concentration, or by similar physical or chemical quantities. A specific differential
equation generates this quantity considering as input the spatial alternating variations
of an internal parameter. As a consequence, specific spatial linear variations of the
observable output physical quantity appear. It is shown that in the case of very
short range variations of this internal parameter, systems described by
a differential equation able to generate a practical test-function exhibit
an output which appears to an external observer in the form of two
distinct envelopes. These can be considered as two distinct structural
patterns located in the same material along a certain linear axis.
Keywords: patterns, short range variations, internal parameters.

Introduction

This paper presents properties of spatial linear systems described by a certain
physical quantity generated by a differential equation. This quantity can be represented
by the internal electric or magnetic field inside the material, by a concentration, or by
similar physical or chemical quantities. A specific differential equation
generates this quantity considering as input the spatial alternating variations
of an internal parameter. As a consequence, specific spatial linear variations of
the observable output physical quantity appear. It is shown that in the case of very
short range variations of this internal parameter, systems described by a differential
equation able to generate a practical test-function exhibit an output
which appears to an external observer in the form of two distinct envelopes.
These can be considered as two distinct structural patterns located in the same
material along a certain linear axis.
In the ideal mathematical case, suddenly emerging pulses should be simulated
using test-functions (functions which differ from zero only on a limited time interval
and possess an infinite number of continuous derivatives on the whole real
axis). However, as shown in [1], such test functions, similar to the Dirac functions,
cannot be generated by a differential equation. The existence of such an equation
of evolution, beginning to act at an initial moment of time, would imply the
necessity for a derivative of a certain order to make a jump at this initial moment



of time from the zero value to a nonzero value. But this aspect is in contradiction
with the property of test-functions to have continuous derivatives of any order
on the whole real axis, represented in this case by the time axis. So it results
that an ideal test-function cannot be generated by a differential equation. For this
reason, the analysis must be restricted to practical test-functions [2], defined
as functions which differ from zero on a certain interval and possess only a finite
number of continuous derivatives on the whole real axis. Mathematical methods
based on difference equations are well known [3], but for a higher accuracy of the
computer simulation specific Runge-Kutta methods in Matlab are recommended.
The physical aspects of dynamical systems able to generate spatial practical
test-functions will be studied for the case when the free term of the differential
equation (corresponding to the internal parameter of the material) is represented
by alternating functions. The shape of the output signal (obtained by numerical
simulations in Matlab based on Runge-Kutta functions) will be analyzed, it being
shown that for very short range alternating inputs an external observer could
notice (in certain conditions) the existence of two distinct envelopes corresponding
to two distinct structural patterns inside the material. This aspect differs from
the oscillations of unstable-type second order systems studied using difference
equations [4], and it also differs from previous studies of the same author [5]
where the frequency response of such systems to alternating inputs was studied
(in conjunction with the ergodic hypothesis).

Equations Able to Generate Periodical Patterns

As is known, a test-function on a spatial interval [a, b] is a function which
is nonzero on this interval and which possesses an infinite number of continuous
derivatives on the whole real axis. For example, the function

\[
\varphi(x) =
\begin{cases}
\exp\!\left(\dfrac{1}{x^2-1}\right), & x \in (-1, 1) \\[4pt]
0, & \text{otherwise}
\end{cases}
\]
is a test-function on [−1, 1]. For a small value of the numerator of the exponent,
an almost rectangular shape of the output is obtained. An example is the case of the
function

\[
\varphi(x) =
\begin{cases}
\exp\!\left(\dfrac{0.1}{x^2-1}\right), & x \in (-1, 1) \\[4pt]
0, & \text{otherwise}
\end{cases}
\]
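A two-line numerical illustration of this remark (grid and sample points chosen only for illustration): with the numerator 0.1 the function stays close to 1 over most of the interval, giving the almost rectangular plateau mentioned above.

import numpy as np

x = np.linspace(-1.0, 1.0, 2001)[1:-1]         # points of the open interval (-1, 1)
phi_standard = np.exp(-1.0 / (1.0 - x**2))     # numerator 1: peak value exp(-1) ~ 0.37
phi_flat = np.exp(-0.1 / (1.0 - x**2))         # numerator 0.1: almost rectangular plateau
print(phi_standard.max(), phi_flat.max())      # ~0.368 versus ~0.905
print(phi_flat[np.abs(x) < 0.8].min())         # the plateau stays above ~0.76 on |x| < 0.8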
Using the expression of φ(x) and of its derivatives of first and second order,
a differential equation which admits as solution the function corresponding to
a certain physical quantity can be obtained. However, a test-function cannot be
the solution of a differential equation. Such an equation of evolution implies a
jump at the initial space point for a derivative of a certain order, and a test-function
must possess continuous derivatives of any order on the whole real axis. So it
results that a differential equation which admits a test-function as solution can
generate only a practical test-function f similar to φ, but having a finite number
of continuous derivatives on the real Ox axis. In order to do this, we must add


initial conditions for the function f (generated by the differential equation) and
for some of its derivatives f^(1) and/or f^(2), etc., equal to the values of the test-function
and of some of its derivatives φ^(1) and/or φ^(2), etc., at an initial space
point x_in very close to the beginning of the working spatial interval. This can
be written in the form
\[
f(x_{in}) = \varphi(x_{in}), \qquad f^{(1)}(x_{in}) = \varphi^{(1)}(x_{in}) \ \text{and/or}\ f^{(2)}(x_{in}) = \varphi^{(2)}(x_{in}), \ \text{etc.}
\tag{1}
\]

If we want to generate spatial practical test-functions f which are symmetrical
with respect to the middle of the working spatial interval, we can choose as the space
origin of the Ox axis the middle of this interval, and so it results that the
function f should be invariant under the transformation x → −x.
Functions invariant under this transformation can be written in the form f(x²)
(similar to aspects presented in [2]), and so the form of a general second order
differential equation generating such functions must be
\[
a_2(x^2)\,\frac{d^2 f}{d(x^2)^2} + a_1(x^2)\,\frac{df}{d(x^2)} + a_0(x^2)\, f = 0
\tag{2}
\]

However, for studying the generation of structural patterns on such a working
interval, we must add a free term corresponding to the internal parameter of
the material (the cause of the variations of the externally observable physical
quantity). Thus, a model for generating a practical test-function using as input
the internal parameter u = u(x), x ∈ [−1, 1], is
\[
a_2(x^2)\,\frac{d^2 f}{d(x^2)^2} + a_1(x^2)\,\frac{df}{d(x^2)} + a_0(x^2)\, f = u
\tag{3}
\]

subject to
\[
\lim_{x \to \pm 1} f^{(k)}(x) = 0 \quad \text{for } k = 0, 1, \dots, n,
\tag{4}
\]

which are the boundary conditions of a practical test-function. For u represented
by alternating functions, we should notice periodical variations of the externally
observable physical quantity f.

Periodical Patterns of Spatial Structures Described by Practical Test-Functions

According to the previous considerations on the form of a differential equation
invariant under the transformation x → −x, a first order system can be written in the form
\[
\frac{df}{d(x^2)} = -f + u
\tag{5}
\]


which converts to
\[
\frac{df}{dx} = -2xf + 2xu ,
\tag{6}
\]

representing a first order dynamical system. For a periodical input (corresponding
to the internal parameter) u = sin 10x, numerical simulations performed
using Runge-Kutta functions in Matlab present an output of an irregular shape
(Figure 1), not suitable for joining together the outputs of a set of adjoining
linear intervals (the value of f at the end of the interval differs in a significant
manner from the value of f at the beginning of the interval). A better form for the
physical quantity f is obtained for variations of the internal parameter described
by the equation u = cos 10x. In this case the output is symmetrical with respect to
the middle of the interval (as can be noticed in Figure 2) and the results obtained
on each interval can be joined together on the whole linear spatial axis, without any
discontinuities appearing. The resulting output would be represented by
alternations of two great oscillations (one at the end of an interval and another
one at the beginning of the next interval) and two small oscillations (around the
middle of the next interval).
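For reproducibility, an equivalent of the described simulation can be sketched in Python: the scipy solver below stands in for the Matlab Runge-Kutta routines reported in the paper, and the initial condition f(−1) = 0 together with the sign convention of (6) as written above are assumptions rather than details stated in the text.

import numpy as np
from scipy.integrate import solve_ivp

def rhs(x, f, u):
    return -2.0 * x * f + 2.0 * x * u(x)       # first order system (6)

for label, u in [("sin(10x)", lambda x: np.sin(10 * x)),
                 ("cos(10x)", lambda x: np.cos(10 * x)),
                 ("cos(100x)", lambda x: np.cos(100 * x))]:
    sol = solve_ivp(rhs, (-1.0, 1.0), [0.0], args=(u,), max_step=1e-3)
    print(label, "f(1) =", round(float(sol.y[0, -1]), 4))

Plotting sol.t against sol.y[0] for the 100x inputs makes the two envelopes discussed in the next section visible.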

Fig. 1. f versus distance for the first order system, input u = sin(10x)

Similar results are obtained for an undamped first order dynamical system,
represented by
\[
\frac{df}{d(x^2)} = u
\tag{7}
\]
which is equivalent to
\[
\frac{df}{dx} = 2xu .
\tag{8}
\]


Fig. 2. f versus distance for the first order system, input u = cos(10x)

Fig. 3. f versus distance for the first order system, input u = sin(100x)

Connection with the Ergodic Hypothesis

When the internal parameter presents very short range variations, some new
structural patterns can be noticed. Considering an alternating input of the form
u = sin(100x), the resulting observable physical quantity f is represented in Figure 3;
for an alternating cosine input represented by u = cos(100x), the resulting output f
is represented in Figure 4. Studying these two graphics, we can notice
the presence of two distinct envelopes. Their shape depends on the phase of the


Fig. 4. f versus distance for the first order system, input u = cos(100x)

input alternating component (the internal parameter) relative to the space
origin. At first sight, an external observer could notice two distinct functions f
inside the same material, along the Ox axis. These can be considered as two
distinct structural patterns located in the same material, generated by a short
range alternating internal parameter u through a certain differential equation
(invariant under the transformation x → −x).

Conclusions

This paper has presented properties of spatial linear systems described by a
certain physical quantity generated by a differential equation. A specific differential
equation generates this quantity considering as input the spatial alternating
variations of an internal parameter. As a consequence, specific spatial linear variations
of the observable output physical quantity appear. It was shown that in
the case of very short range variations of this internal parameter, systems described
by a differential equation able to generate a practical test-function exhibit an
output which appears to an external observer in the form of two distinct envelopes.
These can be considered as two distinct structural patterns located in
the same material along a certain linear axis. By this study, a fundamentally new
interpretation based on spatial aspects has been obtained for graphics previously
obtained for non-linear equations of evolution [5] (the novelty of this study being
thus justified).
Acknowledgment. This research work was guided by Cristian Toma (Politehnica University, Bucharest) and Carlo Cattani (University of Salerno, Italy)
through a pilot grant of international research involving Politehnica University,


Salerno University, IBM India Labs and Shanghai University - supported by the
National Commission of Romania for UNESCO.

References
1. Toma, C.: An extension of the notion of observability at filtering and sampling devices, Proceedings of the International Symposium on Signals, Circuits and Systems Iasi SCS 2001, Romania, 233-236
2. Toma, G.: Practical test functions generated by computer algorithms, Lecture Notes in Computer Science 3482 (2005), 576-584
3. Dzurina, J.: Oscillation of second order differential equations with advanced argument, Math. Slovaca 45 3 (1995), 263-268
4. Zhang, Zh., Ping, B., Dong, W.: Oscillation of unstable type second order nonlinear difference equations, Korean J. Computer and Appl. Math. 9 1 (2002), 87-99
5. Doboga, F., Toma, G., Pusca, St., Ghelmez, M., Morarescu, C.: Filtering Properties of Practical Test Functions and the Ergodic Hypothesis, Lecture Notes in Computer Science 3482 (2005), 563-568

Dynamic Error of Heat Measurement in Transient


Fang Lide, Li Jinhai, Cao Suosheng, Zhu Yan, and Kong Xiangjie
The Institute of Quality and Technology Supervising, Hebei University, Baoding, China,
071051
Leed_amy@yahoo.com.cn

Abstract. According to EN1434 (European Standard) and OIML
R-75 (International Organization of Legal Metrology), there are two methods of
heat measurement, and they are both steady-state methods of test; using them in a
transient, an obvious error will be produced, so more accurate measuring
functions should be sought. In a previously published paper of the author, a
transient functional relationship between heat quantity and time is deduced, the
validity of this function is proved through experimentation, and it is simplified
reasonably, so it can be used in variable flow rate heating systems. In this study, a
comparison of the steady-state method and the dynamic method is presented, and the
errors existing in the steady-state method are analyzed. The conclusion is
that the steady-state methods used in a variable flow heating system will result in
appreciable errors; the errors only exist in the transient, and when the system reaches
steady state the errors disappear. Moreover, the transient time is long in a heating
system, at least 30 minutes, so it is necessary to take some measures to correct the
errors; however, the study shows that the error can be ignored when the flow rate step
change is less than 5 kg/h.
Keywords: Dynamic heat meter, variable flow heating system, step change, flux.

1 Introduction
The accuracy of the measurement model is a continuous goal in measurement instrument
design; moreover, in the last decades saving energy has become an important issue,
due to environmental protection and economic reasons. Every year a significant
part of the energy is utilized for heating, so the primary concern of heating services
today is to accurately measure and charge for the heat consumption.
According to EN1434 (European Standard) and OIML R-75 (International
Organization of Legal Metrology), there are two methods of heat measurement, and
they are both steady-state methods of test. Theory gives the two methods as
\[
Q = \int_{\tau_1}^{\tau_2} G\,\Delta h \; d\tau
\tag{1}
\]
\[
Q = \int_{V_1}^{V_2} k\,\Delta\theta \; dV
\tag{2}
\]


where Q is the quantity of heat given up; G is the mass flow rate of the heat-conveying
liquid passing through the heat meter; Δh is the difference between the specific
enthalpies of the heat-conveying liquid at the flow and return temperatures of the
heat-exchange circuit; τ is time; V is the volume of liquid passed; k, called the heat
coefficient, is a function of the properties of the heat-conveying liquid at the relevant
temperatures and pressure; Δθ is the temperature difference between the flow and
return of the heat-exchange circuit [1,2].
Usually, equation (1) is called the enthalpy difference method and equation (2) the k
coefficient method; they are widely used in all kinds of heat meters [3-8].
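For orientation, the accumulation a meter performs for (1) amounts to a running sum of G·Δh over sampling intervals; the sketch below uses made-up sample values purely to show the bookkeeping, not measurement data.

import numpy as np

dt = 60.0                                    # sampling interval, s
G = np.array([0.020, 0.020, 0.025, 0.025])   # mass flow rate samples, kg/s (made up)
dh = np.array([83.7, 83.9, 84.1, 84.0])      # specific enthalpy difference, kJ/kg (made up)

Q = np.sum(G * dh * dt)                      # discrete form of (1): accumulated heat, kJ
print(round(float(Q), 2), "kJ")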
However, when the flow rate of the system has a step change, the heat dissipating
capacity is a function of flow rate, temperature and time, and, if the transient time interval
between two steady states is long enough, a huge error would be produced by using the
steady equation (1) or (2).
During the author's study on a new heating measurement and control system, a
functional relationship between heat quantity and time was deduced, the validity of the
function was proved through experimentation, and it was simplified reasonably [9], so it
can be used in variable flow heating systems; this article presents a comparison of the
steady heat measurement method and the dynamic heat measurement method.

2 Principle of the Dynamic Heat Meter

When the system flux undergoes a step change, the author puts forward a level-move
(parallel translation) presumption for the temperature distribution curve in the radiator
during the transient. On the basis of this presumption, the function of heat quantity versus
time is deduced, and the validity of the function is proved through experimentation, the
relative error of the function being within 3%. The dynamic heat measurement function in
the transient after the flux changes to a larger value is given by equation (3), and the
opposite function can be expressed as equation (4) [9,10,11,12].

\[
Q_{0\to 1} = cG_0 (t_g - t_n)\bigl(1 - e^{-b_0 F_C}\bigr)
- \frac{(t_g - t_n)\bigl(K_0 + (K_1 - K_0)\tfrac{\tau}{\tau_{s1}}\bigr)}{b_0}
\Bigl(e^{-b_0\bigl(F_C - \tfrac{\tau}{\tau_{s1}} a_{0\to 1}\bigr)} - e^{-b_0 F_C}\Bigr)
+ \frac{(t_g - t_n)\bigl(K_0 + (K_1 - K_0)\tfrac{\tau}{\tau_{s1}}\bigr)}{b_1}
\Bigl(e^{-b_1 F_C} - e^{-b_1\bigl(F_C - \tfrac{\tau}{\tau_{s1}} a_{0\to 1}\bigr)}\Bigr),
\qquad 0 \le \tau \le \tau_{s1}
\tag{3}
\]
\[
Q_{1\to 0} = cG_0 (t_g - t_n)\bigl(1 - e^{-b_0 F_C}\bigr)
+ \frac{(t_g - t_n)\bigl(K_1 - (K_1 - K_0)\tfrac{\tau}{\tau_{s0}}\bigr)}{b_0}
\Bigl(e^{-b_0 F_C} - e^{-b_0\bigl(F_C + \tfrac{\tau}{\tau_{s0}} a_{1\to 0}\bigr)}\Bigr)
- \frac{(t_g - t_n)\bigl(K_1 - (K_1 - K_0)\tfrac{\tau}{\tau_{s0}}\bigr)}{b_1}
\Bigl(e^{-b_1\bigl(F_C + \tfrac{\tau}{\tau_{s0}} a_{1\to 0}\bigr)} - e^{-b_1 F_C}\Bigr),
\qquad 0 \le \tau \le \tau_{s0}
\tag{4}
\]
where a_{0→1} is the maximum rightward parallel-moving distance of the temperature
distribution curve and a_{1→0} the maximum leftward one; b_0, b_1 are coefficients (m⁻¹);
c is the mass specific heat of water; F_C is the total area of the radiator (m²); G_0 is the
flux before the step change and G_1 the flux after it; K is the heat transfer coefficient of
the radiator (W/(m²·°C)), K_0 and K_1 denoting its values before and after the step change;
Q_{0→1} is the dynamic heat quantity of the radiator in the flux-increase transient (W) and
Q_{1→0} the dynamic heat quantity of the radiator in the flux-decrease transient (W); t_n is
the average room temperature (°C); t_g is the feed water temperature of the radiator (°C);
τ_{s1} is the evacuation time of the medium at the flow rate G_1 and τ_{s0} the evacuation
time of the medium at the flow rate G_0.
These two equations are applicable not only to the dynamic course but also to the
steady course (letting τ = 0 for the steady course).
3 Dynamic Heat Measurement Error of Heat Meter in Transient


Comparing equations (3), (4) with equation (1) (or (2)) under different conditions, the heat
measurement error in the transient can be acquired. The following analysis is based on the
assumption that the heat dissipating capacity of the pipeline is too little to be considered and
that the water supply temperature t_g is invariable.
3.1 Errors for the Same Initial Flow Rate with Different Step Changes
The results are presented in Figs. 1 and 2. The curves show that Eq. (1) used in the transient
can produce very large errors. When the system flow rate increases from G_0 to G_1,
the maximum error exists at the beginning of the transient, for at this moment the feed water
temperature is rising but the backwater temperature remains unchanged, so the
temperature difference rises; at the same time, the system flow rate becomes larger. All these
factors cause the value of the heat dissipating capacity calculated with Eq. (1) to be larger than
that with Eq. (3). As time goes on, the error becomes small as the backwater temperature
rises.
At τ = τ_{s1}, the mass of the system finishes updating, the backwater temperature reaches its
real level, and the error disappears. When the system flow rate decreases from G_1 to G_0,
the maximum error exists at the beginning of the transient too, and it is a negative value, for at
this moment the feed water temperature is invariable, and the backwater temperature

Fig. 1. Comparison of average of relative error in step change 5 and 10

Fig. 2. Average of relative error in the same initial flow rate with different step change

also remains unchanged at the beginning, so the temperature difference is invariable; at
the same time, the system flow rate becomes smaller. All these factors cause the calculated
value of the heat dissipating capacity with Eq. (1) to be less than that with Eq. (4). As time goes
on, the errors become small as the backwater temperature descends.
At τ = τ_{s0}, the mass of the system finishes updating, the backwater temperature reaches its
real level, and the error disappears. Moreover, different step changes have different averages
of the relative error: the larger the step change, the larger the average relative error. Fig. 1
only shows two step changes (5 and 10); more test data and comparisons are presented in
Fig. 2.

3.2 Errors for the Same Step Change with Different Initial Flow Rates G_0

The results are shown in Fig. 3 to Fig. 6.

Fig. 3. Average relative error for different initial flow rates and different step changes (flow rate increasing)

Fig. 4. Average relative error for different initial flow rates and different step changes (flow rate decreasing)

It is clear from all the results that the average relative errors increase with increasing
step change. For the same step change, the absolute average relative errors decrease
as the initial flow rate G_0 becomes larger, whether in the transient of the system flow
rate increasing or decreasing. Figs. 3 and 4 show that the average relative errors have the same
trend for different step changes; the average relative error tends to unity at the same step
change ratio, so all data almost exhibit one line in Fig. 5. In the flow-rate-decreasing
transient, all data exhibit the most curvature, particularly at low initial flow rates, see Fig. 6.
These pictures provide some valuable reference for correcting heat measurement functions.
Two ways may be considered for correcting the errors: one is to use the
dynamic equations in the transient, the other is to use the steady equation multiplied by a
coefficient which is a function of the step change ratio.

Fig. 5. Average relative error for different step change ratios and different initial flow rates (flow rate increasing)

Fig. 6. Average relative error for different step change ratios and different initial flow rates (flow rate decreasing)


4 Conclusion
From the discussion above, an obvious error is produced if the heat dissipation of the heating system is calculated with the steady-state equation (1) after a step change of the flow rate.
For a fixed initial flow rate, a small flow-rate step change gives a correspondingly small relative error, while a large step change gives a large one. For a fixed step change, the average relative error decreases as the initial flow rate increases. These errors exist only during the transient, and part of them cancels out because the sign of the relative error is opposite for the two processes of increasing and decreasing flow rate. According to the calculated data shown in the paper, the error can be ignored when the flow-rate step change is less than 5 kg/h; otherwise it must be corrected.
The errors mentioned above exist only in the transient and disappear when the system reaches steady state. However, the transient time of a heating system is long, at least 30 minutes, so it is necessary to take measures to correct them.
To correct the errors, additional routines should be added to the meter software so that the heat meter has functions such as step-change discrimination and timing. In brief, the heat meter should be able to judge the magnitude of the flow-rate step change and apply different formulas in different working states. Two ways may be considered for correcting the errors: one is to use the dynamic equations during the transient; the other is to multiply the steady-state equation by a coefficient that is a function of the step change.
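As an illustration of the second correction approach, the following Python sketch multiplies the steady-state heat value by a coefficient during the transient. The coefficient function correction_factor() and the 30-minute transit time are hypothetical placeholders, not correlations fitted in the paper.

# Hedged sketch of the second correction approach: multiply the steady-state
# heat value by a coefficient that depends on the flow-rate step change ratio.
# correction_factor() is a hypothetical placeholder, not a fitted correlation.

TRANSIENT_TIME_S = 30 * 60      # assumed transit time of the heating system
STEP_THRESHOLD_KG_H = 5.0       # below this step change the error is ignored

def correction_factor(step_ratio, elapsed_s):
    """Hypothetical coefficient; decays to 1 as the transient dies out."""
    if elapsed_s >= TRANSIENT_TIME_S:
        return 1.0
    remaining = 1.0 - elapsed_s / TRANSIENT_TIME_S
    return 1.0 / (1.0 + 0.1 * step_ratio * remaining)

def corrected_heat(q_steady, flow_old, flow_new, elapsed_s):
    """Correct the steady-state heat value during a flow-rate transient."""
    step = abs(flow_new - flow_old)
    if step < STEP_THRESHOLD_KG_H:
        return q_steady                      # error negligible, no correction
    step_ratio = flow_new / flow_old
    return q_steady * correction_factor(step_ratio, elapsed_s)

# Example: a step from 100 kg/h to 150 kg/h, 5 minutes into the transient.
print(corrected_heat(q_steady=1000.0, flow_old=100.0, flow_new=150.0, elapsed_s=300))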


Truncation Error Estimate on Random Signals by Local Average

Gaiyun He1, Zhanjie Song2,*, Deyun Yang3, and Jianhua Zhu4

1 School of Mechanical Engineering, Tianjin University, Tianjin 300072, China
hegaiyun@tju.edu.cn
2 School of Science, Tianjin University, Tianjin 300072, China
zhanjiesong@tju.edu.cn
3 Department of Information Science, Taishan College, Taian 271000, China
nkuydy@163.com
4 National Ocean Technique Center, Tianjin 300111, China
besmile@263.net

Abstract. Since signals are often of random character, random signals play an important role in signal processing. We show that a bandlimited wide sense stationary stochastic process can be approximated by the Shannon sampling theorem on local averages. Explicit truncation error bounds are given.

Keywords: stochastic process, random signals, local averages, truncation error, Shannon sampling theorem.

1 Introduction and the Main Result

The Shannon sampling theorem plays an important role in signal analysis as it provides a foundation for digital signal processing. It says that any bandlimited function f, having its frequencies bounded by \pi W, can be recovered from its sampled values taken at the instances k/W, i.e.,

f(t) = \sum_{k=-\infty}^{+\infty} f\!\left(\frac{k}{W}\right) \mathrm{sinc}(Wt - k),   (1)

where \mathrm{sinc}(t) = \sin(\pi t)/(\pi t) for t \ne 0, and \mathrm{sinc}(0) = 1.


This equation requires values of a signal f that are measured on a discrete set. However, due to its physical limitations, say the inertia, a measuring apparatus may not be able to obtain the exact value of f at the epoch t_k for k = 0, \pm 1, \pm 2, \ldots. Instead, what a measuring apparatus often gives us is a local average of f near t_k for each k. The sampled values defined as local averages may be formulated by the following equation


Corresponding author. Supported by the Natural Science Foundation of China under


Grant (60572113, 40606039) and the Liuhui Center for Applied Mathematics.

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10751082, 2007.
c Springer-Verlag Berlin Heidelberg 2007




\langle f, u_k \rangle = \int f(x)\, u_k(x)\, dx   (2)

for some collection of averaging functions u_k(x), k \in \mathbb{Z}, which satisfy the following properties:

\mathrm{supp}\, u_k \subset [x_k - \delta/2,\; x_k + \delta/2], \qquad u_k(x) \ge 0, \qquad \int u_k(x)\, dx = 1.   (3)

Here u_k, for each k \in \mathbb{Z}, is a weight function characterizing the inertia of the measuring apparatus. In particular, in the ideal case the function is given by the Dirac \delta-function, u_k = \delta(\cdot - t_k), because \langle f, u_k \rangle = f(t_k) is then the exact value at t_k. The local averaging method in sampling was studied in a number of papers [1]-[6] from 1994 to 2006.
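As a small numerical illustration of (1)-(3) (not taken from the paper), the following Python sketch computes local averages over uniform windows around the sample points k/W and feeds them into a truncated version of the series (1) in place of the exact samples; the test signal, W, N and the window width delta are arbitrary demonstration values.

import numpy as np

# Illustrative sketch: reconstruct a bandlimited signal from local averages
# <f, u_k> taken over small uniform windows around the sample points k / W,
# using a truncated Shannon series as in (1). W, N and delta are demo values.

W = 4.0                 # samples are taken at k / W
N = 50                  # truncation index
delta = 0.02            # width of each averaging window

def f(t):
    # a simple bandlimited test signal (np.sinc(x) = sin(pi x)/(pi x))
    return np.sinc(2.0 * t) + 0.5 * np.sinc(2.0 * (t - 1.0))

def local_average(k):
    # uniform averaging function u_k supported on [k/W - delta/2, k/W + delta/2]
    s = np.linspace(k / W - delta / 2, k / W + delta / 2, 201)
    return np.trapz(f(s), s) / delta

def reconstruct(t):
    k = np.arange(-N, N + 1)
    averages = np.array([local_average(ki) for ki in k])
    return np.sum(averages * np.sinc(W * t - k))

t0 = 0.3
print("exact value:   ", f(t0))
print("reconstruction:", reconstruct(t0))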
The associated truncation error of (1) is defined by

R_N^f = f(t) - \sum_{k=-N}^{+N} f\!\left(\frac{k}{W}\right) \mathrm{sinc}(Wt - k) = \sum_{|k|>N} (-1)^k f\!\left(\frac{k}{W}\right) \frac{\sin \pi W t}{\pi (Wt - k)}.   (4)

On the one hand, we cannot sum an infinite number of terms in practice; we can only approximate signal functions by a finite number of terms, and the resulting truncation error has been bounded in a number of papers [7]-[15].
On the other hand, since signals are often of random character, random signals play an important role in signal processing, especially in the study of sampling theorems. For example, in a speech signal the random portion of the function may be white noise or some other distortion in the transmission channel, perhaps given via a probability distribution. So there are a lot of papers on this topic too, such as [16]-[24]. In this paper we give truncation error bounds for random signals by local averages.
Before stating the results, let us introduce some notation. L^p(\mathbb{R}) is the space of all measurable functions on \mathbb{R} for which \|f\|_p < +\infty, where

\|f\|_p := \left( \int |f(u)|^p\, du \right)^{1/p}, \quad 1 \le p < \infty, \qquad \|f\|_\infty := \operatorname*{ess\,sup}_{u \in \mathbb{R}} |f(u)|, \quad p = \infty.

B_{W,p} is the set of all entire functions f of exponential type at most W that belong to L^p(\mathbb{R}) when restricted to the real line [25]. By the Paley-Wiener theorem, a square integrable function f is band-limited to [-W, W] if and only if f \in B_{W,2}.
Given a probability space (\Omega, \mathcal{A}, P) [26], a real-valued stochastic process X(t) := X(t, \omega) defined on \mathbb{R} is said to be stationary in the weak sense if E[X(t)^2] < \infty, t \in \mathbb{R}, and the autocorrelation function

R_X(t, t+\tau) := \int_\Omega X(t, \omega) X(t+\tau, \omega)\, dP(\omega)

is independent of t \in \mathbb{R}, i.e., R_X(t, t+\tau) = R_X(\tau).


A weak sense stationary process X(t) is said to be bandlimited to an interval [-W, W] if R_X belongs to B_{W,p} for some 1 \le p \le \infty.
Now we assume that the u_k given by (3) satisfy the following properties:

i) \mathrm{supp}\, u_k \subset [k/W' - \delta_k',\; k/W' + \delta_k''], where \delta/4 \le \delta_k', \delta_k'' \le \delta/2 are positive constants;

ii) u_k(t) \ge 0, \quad \int u_k(t)\, dt = 1;

iii) m = \inf_{k \in \mathbb{Z}} \{ m_k \}, where m_k := \int_{k/W' - \delta/4}^{k/W' + \delta/4} u_k(t)\, dt.   (5)

In this case, the associated truncation error of the random signal X(t, \omega) is defined by

R_N^X = X(t) - \sum_{k=-N}^{+N} \langle X, u_k \rangle\, \mathrm{sinc}(W't - k),   (6)

where the autocorrelation function of the weak sense stationary stochastic process X(t, \omega) belongs to B_{W,2} and W' > W > 0.
The following results were proved by Belyaev and by Splettstösser in 1959 and 1981, respectively.

Proposition A ([16, Theorem 5]). If the autocorrelation function of the weak sense stationary stochastic process X(t, \omega) belongs to B_{W,2}, then for W' > W > 0 we have

E\left[ \left| X(t,\omega) - \sum_{k=-N}^{N} X\!\left(\frac{k}{W'},\omega\right) \mathrm{sinc}(W't - k) \right|^2 \right] \le \frac{16\, R_X(0)\, (2 + |t| W')^2}{\pi^2 (1 - W/W')^2\, N^2}.   (7)

Proposition B ([17, Theorem 2.2]). If the autocorrelation function of the weak sense stationary stochastic process X(t, \omega) belongs to B_{W,p} for some 1 \le p \le 2 and W > 0, then

\lim_{N \to \infty} E\left[ \left| X(t,\omega) - \sum_{k=-N}^{N} X\!\left(\frac{k}{W'},\omega\right) \mathrm{sinc}(W't - k) \right|^2 \right] = 0.   (8)

For this case, we have the following result.

Theorem C. If the autocorrelation function R_X of a weak sense stationary stochastic process X(t, \omega) belongs to B_{W,2}, then for W' > W > 0, 2/\delta \le N and N \ge 100 we have

E\left[ |R_N^X|^2 \right] \le \left( 14.80\, \|R_X''\|_\infty\, \delta^2 + \frac{32\, R_X(0)\, (2 + |t| W')^2}{\pi^2 (1 - W/W')^2} \right) \left( \frac{\ln N}{N} \right)^2,   (9)

where \{u_k(t)\} is a sequence of continuous functions defined by (5).


Or, in other words,

E\left[ |R_N^X|^2 \right] = O\!\left( \left( \frac{\ln N}{N} \right)^2 \right), \qquad N \to \infty.   (10)
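To make the order of the bound concrete, the following Python sketch evaluates the right-hand side of (9), as reconstructed above, for hypothetical values of R_X(0), ||R_X''||, delta, t, W and W'; the parameter values are illustrative only.

import math

# Illustrative evaluation of the right-hand side of (9); all parameter values
# below are hypothetical, chosen only to show how the bound scales with N.

def truncation_bound(N, R0, R2_inf, delta, t, W, W_prime):
    first = 14.80 * R2_inf * delta ** 2
    second = 32.0 * R0 * (2.0 + abs(t) * W_prime) ** 2 / (
        math.pi ** 2 * (1.0 - W / W_prime) ** 2)
    return (first + second) * (math.log(N) / N) ** 2

for N in (100, 1000, 10000):
    print(N, truncation_bound(N, R0=1.0, R2_inf=1.0, delta=2.0 / N,
                              t=0.5, W=1.0, W_prime=2.0))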

2 Proof of the Main Result

Let us introduce some preliminary results first.


Lemma D ([27]). One has, for q' > 1, 1/p' + 1/q' = 1, and W' > 0,

\sum_{k=-\infty}^{\infty} |\mathrm{sinc}(W't - k)|^{q'} \le 1 + \frac{2 q'}{\pi (q' - 1)} < p'.   (2.1)

Lemma E ([26]). If a stationary stochastic process X(t, \omega), t \in [a, b], is continuous in mean square, and f(t), g(t) are continuous functions on [a, b], then

E\left[ \int_a^b f(s) X(s)\, ds \int_a^b g(t) X(t)\, dt \right] = \int_a^b \!\! \int_a^b f(s)\, g(t)\, R_X(s - t)\, ds\, dt.   (2.2)

Lemma F. Suppose that the autocorrelation function R_X of the weak sense stationary stochastic process X(t, \omega) belongs to B_{W,2}, W > 0, and satisfies R_X''(t) \in C(\mathbb{R}). Let

D\!\left( \frac{j}{W'}; \frac{\delta}{2} \right) := \sup_{|u| \le \delta/2,\ |v| \le \delta/2} \left| R_X\!\left(\frac{j}{W'}\right) - R_X\!\left(\frac{j}{W'} + u\right) - R_X\!\left(\frac{j}{W'} + v\right) + R_X\!\left(\frac{j}{W'} + u + v\right) \right|
= \sup_{|u| \le \delta/2,\ |v| \le \delta/2} \left| \int_0^u \!\! \int_0^v R_X''\!\left( \frac{j}{W'} + s + \tau \right) d\tau\, ds \right|.

Then we have, for r \ge 1 and N \ge 1,

\sum_{j=-2N}^{+2N} \left[ D\!\left( \frac{j}{W'}; \frac{\delta}{2} \right) \right]^r \le (4N+1)\, \left( \|R_X''\|_\infty \right)^r \left( \frac{\delta}{2} \right)^{2r}.   (11)

Proof. Since R_X is even and R_X''(t) \in C(\mathbb{R}), we have

\sum_{j=-2N}^{+2N} \left[ D\!\left( \frac{j}{W'}; \frac{\delta}{2} \right) \right]^r = \left[ D\!\left( 0; \frac{\delta}{2} \right) \right]^r + 2 \sum_{j=1}^{2N} \left[ D\!\left( \frac{j}{W'}; \frac{\delta}{2} \right) \right]^r \le (4N+1)\, \left( \|R_X''\|_\infty \right)^r \left( \frac{\delta}{2} \right)^{2r},

which completes the proof.


Proof of Theorem C. From Proposition A, Proposition B and Lemma E we have

E\left[ |R_N^X|^2 \right] = E\left[ \left| X(t,\omega) - \sum_{k=-N}^{N} \int_{k/W'-\delta_k'}^{k/W'+\delta_k''} u_k(s)\, X(s,\omega)\, ds\ \mathrm{sinc}(W't - k) \right|^2 \right]
\le 2\, E\left[ \left| X(t,\omega) - \sum_{k=-N}^{N} X\!\left(\frac{k}{W'},\omega\right) \mathrm{sinc}(W't - k) \right|^2 \right]
+ 2\, E\left[ \left| \sum_{k=-N}^{N} \left( X\!\left(\frac{k}{W'},\omega\right) - \int_{k/W'-\delta_k'}^{k/W'+\delta_k''} u_k(s)\, X(s,\omega)\, ds \right) \mathrm{sinc}(W't - k) \right|^2 \right].

The first term is estimated by Proposition A. For the second term, expanding the square, writing X(k/W',\omega) = \int u_k(s)\, X(k/W',\omega)\, ds by the normalization in (3), and applying Lemma E to every product of integrals, each summand is expressed through second differences of the autocorrelation function, so that

E\left[ |R_N^X|^2 \right] \le 2\, E\left[ \left| X(t,\omega) - \sum_{k=-N}^{N} X\!\left(\frac{k}{W'},\omega\right) \mathrm{sinc}(W't - k) \right|^2 \right]
+ 2 \sum_{k=-N}^{N} \sum_{j=-N}^{N} D\!\left( \frac{k-j}{W'}; \frac{\delta}{2} \right) |\mathrm{sinc}(W't - k)|\, |\mathrm{sinc}(W't - j)|.

Using Hölder's inequality and Lemma D, we have

\sum_{k=-N}^{N} \sum_{j=-N}^{N} D\!\left( \frac{k-j}{W'}; \frac{\delta}{2} \right) |\mathrm{sinc}(W't - k)|\, |\mathrm{sinc}(W't - j)|
\le \left[ \sum_{k=-N}^{N} \left( \sum_{j=-N}^{N} D\!\left( \frac{k-j}{W'}; \frac{\delta}{2} \right) |\mathrm{sinc}(W't - j)| \right)^{p'} \right]^{1/p'} \left[ \sum_{k=-N}^{N} |\mathrm{sinc}(W't - k)|^{q'} \right]^{1/q'}
\le (p')^{1/q'} \left[ \sum_{k=-N}^{N} \left( \sum_{j=-N}^{N} D\!\left( \frac{k-j}{W'}; \frac{\delta}{2} \right) |\mathrm{sinc}(W't - j)| \right)^{p'} \right]^{1/p'},

where 1/p' + 1/q' = 1. By the Hausdorff-Young inequality [28, page 176] and Lemma F, we have

\left[ \sum_{k=-N}^{N} \left( \sum_{j=-N}^{N} D\!\left( \frac{k-j}{W'}; \frac{\delta}{2} \right) |\mathrm{sinc}(W't - j)| \right)^{p'} \right]^{1/p'}
\le \left[ \sum_{j=-2N}^{2N} D\!\left( \frac{j}{W'}; \frac{\delta}{2} \right)^{r} \right]^{1/r} \left[ \sum_{j=-2N}^{2N} |\mathrm{sinc}(W't - j)|^{s} \right]^{1/s}
\le (4N+1)^{1/r}\, \|R_X''\|_\infty \left( \frac{\delta}{2} \right)^2 \left[ \sum_{j=-\infty}^{\infty} |\mathrm{sinc}(t - j)|^{s} \right]^{1/s},

where 1/s + 1/r - 1 = 1/p'. Let r = \ln N / 2. Noticing that N \ge 100, we have

(4N+1)^{1/r} < \left( \frac{401 N}{400} \right)^{2/\ln N} \le 7.40.

Let s' = 2r/(2r-1) and s = 2r. Then 1/s' + 1/s = 1 and p' = 2r = \ln N, and by Lemma D the factor \left[ \sum_{j=-\infty}^{\infty} |\mathrm{sinc}(t - j)|^{s} \right]^{1/s} is bounded with s = p' = \ln N. Combining the above estimates with Proposition A and the assumption 2/\delta \le N yields

E\left[ |R_N^X|^2 \right] \le \left( 14.80\, \|R_X''\|_\infty\, \delta^2 + \frac{32\, R_X(0)\, (2 + |t| W')^2}{\pi^2 (1 - W/W')^2} \right) \left( \frac{\ln N}{N} \right)^2.

This completes the proof.

References
1. Gröchenig, K., Reconstruction algorithms in irregular sampling, Math. Comput., 59(1992), 181-194.
2. Butzer, P. L., Lei, J., Errors in truncated sampling series with measured sampled
values for non-necessarily bandlimited functions, Funct. Approx. Comment. Math.
26(1998), 18-32.
3. Butzer, P. L., Lei, J., Approximation of signals using measured sampled values and
error analysis, Commun.Appl.Anal., 4(2000), 245-255.
4. Sun, W., Zhou, X., Reconstruction of bandlimited functions from local averages,
Constr. Approx., 18(2002), 205-222.
5. Sun, W., Zhou, X., Reconstruction of bandlimited signals from local averages, IEEE
Trans. Inform. Theory, 48(2002), 2955-2963.
6. Song, Z., Yang, S., Zhou, X., Approximation of signals from local averages, Applied
Mathematics Letters, 19(2006), 1414-1420.
7. Yao, K., Thomas, J. B., On truncation error bounds for sampling representations of band-limited signals, IEEE Trans. Aerosp. Electron. Syst., vol.AES-2(1966), 640-647.
8. Jagerman, D., Bounds for truncation error of the sampling expansion, SIAM J.
Appl. Math., vol.14(1966), 714-723.
9. Brown, J. L., Bounds for truncation error in sampling expansion of band-limited
signals, IEEE Trans. Inform. Theory, vol.IT-15(1969), 440-444.


10. Piper, H. S. Jr., Bounds for truncation error in sampling expansion of nite energy
band-limited signals, IEEE Trans. Inform. Theory, vol.IT-21(1975), 482-485.
11. Piper, H. S. Jr., Best asymptotic bounds for truncation error in sampling expansion
of band-limited functions, IEEE Trans. Inform. Theory, vol.IT-21(1975), 687-690.
12. Butzer, P. L., Engels, W., Scheben, U., Magnitude of the truncation error in sampling expansion of band-limited signals, IEEE Trans. Acoustics, Speech, and Signal
Processing, vol.ASSP-30(6)(1982), 906-912.
13. Butzer, P. L., Engels, W., On the implementation of the Shannon sampling series
for band-limited signals, IEEE Trans. Inform. Theory, vol.IT-29(2)(1983), 314-318.
14. Bucci, O. M., Massa, G. D., The truncation error in the application of sampling series to electromagnetic problems, IEEE Trans. Antennas and Propagation,
vol.36(7)(1988), 941-949.
15. Machiraju, R., Yagel, R. K., Reconstruction error characterization and control: a
sampling Theory approach, IEEE Trans. Visual. Comput. Graphics, vol.2(4)(1996),
364-378.
16. Belyaev, Y. K., Analytic random processes, Theory Probab. Appl. IV(1959)
437-444.
17. Splettstösser, W., Sampling series approximation of continuous weak sense stationary processes, Information and Control 50(1981), 228-241.
18. Balakrishnan, A. V., A note on the sampling principle for continuous signals, IRE
Trans. Inform. Theory IT-3(1957), 143-146.
19. Lloyd, S. P., A sampling theorem for stationary (wide sense) stochastic processes,
Trans. Amer. Math. Soc. 92(1959), 1-12.
20. Stens, R. L. , Error estimates for sampling sums based on convolution integrals,
Information and Control 45(1980), 37-47.
21. Butzer, P.L., Splettstösser, W. and Stens, R. L., The sampling theorem and linear prediction in signal analysis, Jber. d. Dt. Math.-Verein., 90(1988), 1-70.
22. Olenko, A. YA., Pogany, T. K., A precise bound for the error of interpolation of
stochastic processes, Theor. Probability and Math Statist. vol.71(2005), 151-163.
23. Song, Z., Zhou, X., He, G., Error estimate on non-bandlimited random signals by local averages, LNCS 3991(2006), 822-825.
24. Song, Z., Sun, W., Yang, S., Zhu, G., Approximation of Weak Sense Stationary Stochastic Processes from Local Averages, Science in China: Series A Math.
50(4)(2007),457-463.
25. Zayed, A.I., Butzer, P.L., Lagrange interpolation and sampling theorems, in
Nonuniform Sampling, Theory and Practice, Marvasti,F., Ed., Kluwer Academic, 2001, 123-168.
26. Li, Z., Wu, R., A course of studies on stochastic processes, High Education Press,
1987(in chinese).
27. Splettstösser, W., Stens, R. L., Wilmes, G., On the approximation of the interpolating series of G. Valiron, Funct. Approx. Comment. Math. 11(1981), 39-56.
28. Pinsky, M. A., Introduction to Fourier analysis and wavelets, Wadsworth Group, Brooks/Cole, 2002.

A Numerical Solutions Based on the Quasi-wavelet Analysis

Z.H. Huang1, L. Xia2, and X.P. He3

1 College of Computer Science, Chongqing Technology and Business University, Chongqing, P.R. China
zhhuangctbu@yahoo.com.cn
2 College of Science, Chongqing Technology and Business University, Chongqing, 400067, P.R. China
xl@ctbu.edu.cn
3 College of Computer Science, Chongqing Technology and Business University, Chongqing, 400067, P.R. China
jsjhxp@ctbu.edu.cn

Abstract. Taking the approximate equations for long waves in shallow water as an example, a quasi-wavelet discrete scheme is proposed for obtaining numerical solutions of (1+1)-dimensional nonlinear partial differential equations. In the method, the quasi-wavelet discrete scheme is adopted to discretize the spatial derivatives, which yields a system of ordinary differential equations in time. The fourth-order Runge-Kutta method is then employed to discretize the temporal derivative. Finally the quasi-wavelet solution is compared with the analytical solution, and the computations are validated.

Keywords: interval quasi-Shannon; precise integration method; approximate equations of long waves in shallow water.

1 Introduction
In recent years, many methods have been developed for the analytical solution of nonlinear partial differential equations (PDEs). Important examples such as the homogeneous balance method, variable separation methods and Jacobi elliptic function expansion are used to obtain exact solutions and solitary wave solutions of PDEs [1-5]. In general, however, it is quite difficult to study the well-posedness of the solutions in depth. A wavelet is an energy function with localization properties; it is characterized by using information from the nearest neighboring grid points to approximate the differentiation at a point, and is thus much more flexible. The wavelet method was applied to the numerical solution of PDEs by Morlet, Arens, Fourgeau et al. Its applications have become one of the fastest growing research areas.

This work was supported by the Key Science-Technology Project of Chongqing under Grant
NO.CSTC-2005AC2090, and the Science Foundation of Chongqing under Grant NO.CSTC2006BB2249.

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10831090, 2007.
Springer-Verlag Berlin Heidelberg 2007


The Daubechies scaling function has been used to construct a grid-point wavelet method for solving PDEs [6]. However, the Daubechies scaling function converges slowly with respect to mesh refinement and is not cost-effective for achieving high precision. Hence, there is a strong demand for a scheme that can compute numerical solutions of nonlinear partial differential equations efficiently. The scaling function of Shannon's wavelet is given in analytical form. When the Shannon scaling function is multiplied by a Gaussian function [7], a quasi scaling function is obtained; the orthogonal wavelet generalized in this way is called a quasi-wavelet. As a local method it yields highly accurate numerical solutions of nonlinear PDEs.
The numerical solution of PDEs by the quasi-wavelet method was studied in [8]. However, the application of quasi-wavelets to numerical solutions of nonlinear PDEs such as (1+1)-dimensional models, for example the approximate equations for long waves in shallow water, has not been studied. This paper uses the quasi-wavelet method to construct the scaling function. In the method, the quasi-wavelet discrete scheme is adopted to discretize the spatial derivatives, and the fourth-order Runge-Kutta method is adopted to discretize the temporal derivatives. The numerical example shows that the quasi-wavelet method is successful in computing solutions of nonlinear PDEs and helps to improve the precision of the solutions. Finally, the computational results are validated against the analytical solutions.

2 Quasi-wavelet Solutions of the Approximate Equations for Long Waves in Shallow Water

The long wave equations in shallow water were found by Whitham and Broer:

u_t - u u_x - v_x + \frac{1}{2} u_{xx} = 0,   (1)

v_t - (uv)_x - \frac{1}{2} v_{xx} = 0,   (2)

with the initial-boundary conditions

u(a, t) = u_1(t), \qquad u(b, t) = u_2(t),   (3)

v(a, t) = v_1(t), \qquad v(b, t) = v_2(t),   (4)

where x \in [a, b] and t > 0.

2.1 Spatial Discretization of the Long Wave Equations in Shallow Water

The x-coordinate is discretized uniformly; \Delta x = (b - a)/N is the grid spacing, where N + 1 is the total number of grid points in the computational domain [a, b]. The points \{x_i = a + (i-1)\Delta x\}, i = 1, 2, 3, \ldots, N+1, are the discrete sampling points, with x_{i+k} - x_i = k \Delta x. With the function values \{u_i\} and \{v_i\} at the grid point x_i, Eqs. (1)-(2) are represented as follows:

\frac{\partial u_i}{\partial t} = -\frac{1}{2} \frac{\partial^2 u_i}{\partial x^2} + u_i \frac{\partial u_i}{\partial x} + \frac{\partial v_i}{\partial x},   (5)

\frac{\partial v_i}{\partial t} = \frac{1}{2} \frac{\partial^2 v_i}{\partial x^2} + u_i \frac{\partial v_i}{\partial x} + v_i \frac{\partial u_i}{\partial x}.   (6)

Let, for i = 1, 2, 3, \ldots, N+1,

f_i = -\frac{1}{2} \frac{\partial^2 u_i}{\partial x^2} + u_i \frac{\partial u_i}{\partial x} + \frac{\partial v_i}{\partial x},   (7)

g_i = \frac{1}{2} \frac{\partial^2 v_i}{\partial x^2} + u_i \frac{\partial v_i}{\partial x} + v_i \frac{\partial u_i}{\partial x}.   (8)

By (7)-(8), Eqs. (5)-(6) can be expressed as

\frac{d u_i}{dt} = f_i,   (9)

\frac{d v_i}{dt} = g_i.   (10)

2.2 Quasi-wavelet Discrete Form of the Spatial Derivatives of the Long Wave Equations in Shallow Water

To solve Eqs. (9)-(10), the regularized Shannon delta kernel is used, which can dramatically increase the regularity of Shannon's wavelet scaling function (the quasi scaling function) [8]:

\delta_{\Delta,\sigma}(x) = \frac{\sin(\pi x/\Delta)}{\pi x/\Delta} \exp\!\left[ -\frac{x^2}{2\sigma^2} \right],   (11)

where \Delta is the grid spacing and \sigma determines the width of the Gaussian envelope; it can be varied in association with the grid spacing, i.e., \sigma = r\Delta, with r \ge 2/\pi a parameter. The regularized Shannon wavelet is called a quasi-wavelet. Taking Shannon's scaling function as the basis function, the functions f(x) and g(x) in the interval can be expressed as

f(x) = \sum_{k=-\infty}^{\infty} \delta_{\Delta,\sigma}(x - x_k)\, f(x_k),   (12)

g(x) = \sum_{k=-\infty}^{\infty} \delta_{\Delta,\sigma}(x - x_k)\, g(x_k).   (13)

In the discrete singular convolution algorithm, the band-limited functions f(x), g(x) and their derivatives with respect to the coordinate at a grid point x are approximated by a linear sum of the discrete values \{f(x_k)\} and \{g(x_k)\} in the interval:

f^{(n)}(x) \approx \sum_{k=-w}^{+w} \delta^{(n)}_{\Delta,\sigma}(x - x_k)\, f(x_k), \quad n = 0, 1, 2, \ldots,   (14)

g^{(n)}(x) \approx \sum_{k=-w}^{+w} \delta^{(n)}_{\Delta,\sigma}(x - x_k)\, g(x_k), \quad n = 0, 1, 2, \ldots.   (15)

In fact, only the 2w+1 grid points around a grid point \{x_i\} enter the computation; 2w+1 is the total computational bandwidth, which is usually much smaller than the computational domain. Eqs. (14)-(15) are called the quasi-wavelet form of the numerical discretization.
To compute Eqs. (14)-(15) with the regularized Shannon delta kernel, the derivatives \delta^{(1)}_{\Delta,\sigma} and \delta^{(2)}_{\Delta,\sigma} can be given analytically as

\delta^{(1)}_{\Delta,\sigma}(x) = \eta \left[ \frac{\gamma}{x} - \frac{\Delta \lambda}{\pi x^2} - \frac{\Delta \lambda}{\pi \sigma^2} \right], \quad x \ne 0; \qquad \delta^{(1)}_{\Delta,\sigma}(0) = 0,

\delta^{(2)}_{\Delta,\sigma}(x) = \eta \left[ \lambda \left( \frac{\Delta}{\pi \sigma^2 x} + \frac{2\Delta}{\pi x^3} + \frac{\Delta x}{\pi \sigma^4} - \frac{\pi}{\Delta x} \right) - 2\gamma \left( \frac{1}{\sigma^2} + \frac{1}{x^2} \right) \right], \quad x \ne 0; \qquad \delta^{(2)}_{\Delta,\sigma}(0) = -\frac{3\Delta^2 + \pi^2 \sigma^2}{3\Delta^2 \sigma^2},

where \eta = \exp[-x^2/(2\sigma^2)], \lambda = \sin(\pi x/\Delta) and \gamma = \cos(\pi x/\Delta).
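As a small illustration (not from the paper), the following Python sketch implements the second-derivative kernel as written above and uses it in the discrete approximation (14) to estimate u''(x) for a smooth test function; the grid parameters are illustrative.

import numpy as np

# Sketch of the regularized Shannon (quasi-wavelet) kernel and its use in the
# discrete approximation (14) of a second derivative. Grid parameters are
# illustrative; sigma = r * dx with r = 3.2 as in the experiments below.

dx = 1.0
r = 3.2
sigma = r * dx
w = 10                      # half bandwidth: 2w + 1 points are used

def delta2(x):
    """Second derivative of the kernel (11); the x = 0 value is the limit."""
    if x == 0.0:
        return -(3.0 * dx**2 + np.pi**2 * sigma**2) / (3.0 * dx**2 * sigma**2)
    eta = np.exp(-x**2 / (2.0 * sigma**2))
    lam = np.sin(np.pi * x / dx)
    gam = np.cos(np.pi * x / dx)
    return eta * (lam * (dx / (np.pi * sigma**2 * x) + 2.0 * dx / (np.pi * x**3)
                         + dx * x / (np.pi * sigma**4) - np.pi / (dx * x))
                  - 2.0 * gam * (1.0 / x**2 + 1.0 / sigma**2))

# approximate u''(x_i) for u(x) = sin(0.1 x) on a uniform grid
x = np.arange(-200, 201) * dx
u = np.sin(0.1 * x)
i = 210                                        # grid point x = 10
approx = sum(delta2(m * dx) * u[i + m] for m in range(-w, w + 1))
print("quasi-wavelet u''(10):", approx, "  exact:", -0.01 * np.sin(1.0))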


2.3 Temporal Derivative Discretization

A Runge-Kutta scheme is used for the temporal derivatives. The system of ordinary differential equations (9)-(10) is discretized in time by the fourth-order Runge-Kutta method, which can be expressed as follows:

u_i^{n+1} = u_i^n + \frac{\Delta t}{6} \left[ K_{i,1} + 2K_{i,2} + 2K_{i,3} + K_{i,4} \right], \quad i = 1, 2, 3, \ldots, N+1,   (16)

where

K_{i,1} = f_{i,1}^n, \quad K_{i,2} = f_{i,2}^n, \quad K_{i,3} = f_{i,3}^n, \quad K_{i,4} = f_{i,4}^n, \quad i = 1, 2, \ldots, N+1,   (17)

v_j^{n+1} = v_j^n + \frac{\Delta t}{6} \left[ L_{j,1} + 2L_{j,2} + 2L_{j,3} + L_{j,4} \right],   (18)

L_{j,1} = g_{j,1}^n, \quad L_{j,2} = g_{j,2}^n, \quad L_{j,3} = g_{j,3}^n, \quad L_{j,4} = g_{j,4}^n, \quad j = 1, 2, \ldots, N+1,   (19)

where the superscript n is the time level and \Delta t is the time step.
The stages in (16)-(19) are obtained by applying the quasi-wavelet approximations (14)-(15) to the right-hand sides (7)-(8) at the intermediate values. The first stages are

K_{i,1} = f_{i,1}^n = -\frac{1}{2} \sum_{m=-w}^{w} \delta^{(2)}_{\Delta,\sigma}(m\Delta x)\, u_{i+m}^n + u_i^n \sum_{m=-w}^{w} \delta^{(1)}_{\Delta,\sigma}(m\Delta x)\, u_{i+m}^n + \sum_{m=-w}^{w} \delta^{(1)}_{\Delta,\sigma}(m\Delta x)\, v_{i+m}^n,   (20)

L_{j,1} = g_{j,1}^n = \frac{1}{2} \sum_{m=-w}^{w} \delta^{(2)}_{\Delta,\sigma}(m\Delta x)\, v_{j+m}^n + u_j^n \sum_{m=-w}^{w} \delta^{(1)}_{\Delta,\sigma}(m\Delta x)\, v_{j+m}^n + v_j^n \sum_{m=-w}^{w} \delta^{(1)}_{\Delta,\sigma}(m\Delta x)\, u_{j+m}^n.   (24)

The remaining stages K_{i,2}, K_{i,3}, K_{i,4} (Eqs. (21)-(23)) and L_{j,2}, L_{j,3}, L_{j,4} (Eqs. (25)-(28)) are computed from the same quasi-wavelet sums, with the grid values u_{i+m}^n, v_{j+m}^n replaced by the intermediate Runge-Kutta values u_{i+m}^n + \frac{\Delta t}{2} K_{i+m,1}, v_{j+m}^n + \frac{\Delta t}{2} L_{j+m,1} for the second stage, by u_{i+m}^n + \frac{\Delta t}{2} K_{i+m,2}, v_{j+m}^n + \frac{\Delta t}{2} L_{j+m,2} for the third stage, and by u_{i+m}^n + \Delta t\, K_{i+m,3}, v_{j+m}^n + \Delta t\, L_{j+m,3} for the fourth stage.


When t = 0, the values of \{u_i^n\} and \{v_i^n\} for n = 0 are obtained from Eqs. (1)-(4). They can be written as

u_i^0 = u(x_i, 0), \quad i = 1, 2, 3, \ldots, N+1,
v_j^0 = v(x_j, 0), \quad j = 1, 2, 3, \ldots, N+1,

where [-w, +w] is the computational bandwidth. The value of w may be chosen as a constant in order to reduce the computation to a narrower bandwidth.

3 Overall Solution Scheme

In the above, \delta^{(1)}_{\Delta,\sigma} and \delta^{(2)}_{\Delta,\sigma} depend only on the grid spacing \Delta; therefore, once the grid spacing is given, these coefficients need to be computed only once and can be reused during the whole computation. The main steps of the computation are as follows (a code sketch of these steps is given after the list):

Step 1. Use the known initial values u_i^0 (i = 1, 2, 3, \ldots, N+1) and v_j^0 (j = 1, 2, 3, \ldots, N+1), or the values u_i^n and v_j^n of the previous time level (i, j = 1, 2, 3, \ldots, N+1); values outside the computational domain are obtained by extension.
Step 2. From Eqs. (20)-(28), compute the stage values f_{i,1}^n, f_{i,2}^n, f_{i,3}^n, f_{i,4}^n and g_{j,1}^n, g_{j,2}^n, g_{j,3}^n, g_{j,4}^n at the specified grid points.
Step 3. Substituting the results of Step 2 into Eqs. (16)-(18), compute the values u_i^{n+1} and v_i^{n+1}, i, j = 1, 2, 3, \ldots, N+1.
Step 4. Repeat the process from Step 1 to Step 3, setting t = t + \Delta t and n = n + 1 and applying the boundary conditions, until the required time level is reached.
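The following Python sketch is a simplified illustration of the overall scheme: quasi-wavelet spatial derivatives combined with a classical fourth-order Runge-Kutta step for du/dt = f, dv/dt = g. The boundary treatment is reduced to frozen end values, the signs follow the reconstruction of (7)-(8) above, and the parameter values mirror the experiment of Section 4; it is not the authors' implementation.

import numpy as np

# Simplified sketch of the overall scheme (quasi-wavelet + RK4).
a, b, N = -100.0, 100.0, 200
dx = (b - a) / N
r, w = 3.2, 10
sigma = r * dx
x = a + np.arange(N + 1) * dx

def kernel_d1(y):
    if y == 0.0:
        return 0.0
    eta = np.exp(-y**2 / (2 * sigma**2))
    lam, gam = np.sin(np.pi * y / dx), np.cos(np.pi * y / dx)
    return eta * (gam / y - dx * lam / (np.pi * y**2) - dx * lam / (np.pi * sigma**2))

def kernel_d2(y):
    if y == 0.0:
        return -(3 * dx**2 + np.pi**2 * sigma**2) / (3 * dx**2 * sigma**2)
    eta = np.exp(-y**2 / (2 * sigma**2))
    lam, gam = np.sin(np.pi * y / dx), np.cos(np.pi * y / dx)
    return eta * (lam * (dx / (np.pi * sigma**2 * y) + 2 * dx / (np.pi * y**3)
                         + dx * y / (np.pi * sigma**4) - np.pi / (dx * y))
                  - 2 * gam * (1 / y**2 + 1 / sigma**2))

W1 = np.array([kernel_d1(m * dx) for m in range(-w, w + 1)])
W2 = np.array([kernel_d2(m * dx) for m in range(-w, w + 1)])

def derivative(u, weights):
    # quasi-wavelet approximation (14), padded with edge values outside [a, b]
    up = np.pad(u, w, mode="edge")
    return np.array([np.dot(weights, up[i:i + 2 * w + 1]) for i in range(u.size)])

def rhs(u, v):
    ux, uxx = derivative(u, W1), derivative(u, W2)
    vx, vxx = derivative(v, W1), derivative(v, W2)
    f = -0.5 * uxx + u * ux + vx          # Eq. (7), signs as reconstructed above
    g = 0.5 * vxx + u * vx + v * ux       # Eq. (8)
    return f, g

def rk4_step(u, v, dt):
    k1, l1 = rhs(u, v)
    k2, l2 = rhs(u + 0.5 * dt * k1, v + 0.5 * dt * l1)
    k3, l3 = rhs(u + 0.5 * dt * k2, v + 0.5 * dt * l2)
    k4, l4 = rhs(u + dt * k3, v + dt * l3)
    un = u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    vn = v + dt / 6 * (l1 + 2 * l2 + 2 * l3 + l4)
    un[0], un[-1], vn[0], vn[-1] = u[0], u[-1], v[0], v[-1]   # frozen boundaries
    return un, vn

c, dt = 0.5, 0.002
u = c / 2 * (1 + np.tanh(c / 2 * x))             # initial condition (29)
v = c**2 / 4 / np.cosh(c / 2 * x)**2
for _ in range(100):
    u, v = rk4_step(u, v, dt)
print(u[N // 2], v[N // 2])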

4 Comparison Computations
Both the quasi-wavelet numerical solutions and the analytical solutions of Eqs. (1)-(2) are computed. Assume that Eqs. (1)-(2) satisfy the initial-boundary conditions below:

u(x, 0) = \frac{c}{2} \left( 1 + \tanh \frac{c}{2} x \right), \qquad v(x, 0) = \frac{c^2}{4}\, \mathrm{sech}^2 \frac{c}{2} x,   (29)

u(a, t) = \frac{c}{2} \left[ 1 + \tanh\!\left( \frac{1}{2} ac + \frac{c^2}{2} t \right) \right], \qquad v(a, t) = \frac{c^2}{4}\, \mathrm{sech}^2\!\left( \frac{1}{2} ac + \frac{c^2}{2} t \right),   (30)

u(b, t) = \frac{c}{2} \left[ 1 + \tanh\!\left( \frac{1}{2} cb + \frac{c^2}{2} t \right) \right], \qquad v(b, t) = \frac{c^2}{4}\, \mathrm{sech}^2\!\left( \frac{1}{2} cb + \frac{c^2}{2} t \right).   (31)

The analytical solutions of Eqs. (1)-(2) are

u = \frac{c}{2} \left[ 1 + \tanh\!\left( \frac{1}{2} cx + \frac{c^2}{2} t \right) \right], \qquad v = \frac{c^2}{4}\, \mathrm{sech}^2\!\left( \frac{1}{2} cx + \frac{c^2}{2} t \right),

where c is an arbitrary constant.


To analyze and compare the computations, the initial discrete values u_i^0 and v_j^0 are obtained from Eq. (29):

u_i^0 = \frac{c}{2} \left\{ 1 + \tanh \frac{c}{2} \left[ a + (i-1)\Delta x \right] \right\}, \qquad v_j^0 = \frac{c^2}{4}\, \mathrm{sech}^2 \frac{c}{2} \left[ a + (j-1)\Delta x \right], \qquad i, j = 1, 2, 3, \ldots, N+1.

The values of the following time levels are computed from the current u_i^n and v_i^n, i, j = 1, 2, 3, \ldots, N+1. We choose c = 0.5, computational bandwidth w = 10, Gaussian envelope width \sigma = 3.2, computational domain [a, b] = [-100, 100], number of grid intervals N = 200, and time step \Delta t = 0.002. The values computed by the quasi-wavelet method are plotted in Figs. 1-4.
For Eqs. (1)-(2), the figures show excellent agreement between the analytical solutions and the quasi-wavelet numerical solutions.

Fig. 1. u: analytical solution

Fig. 2. u: quasi-wavelet solution (w = 10, \Delta t = 0.002, \sigma = 3.2)

Fig. 3. v: analytical solution

Fig. 4. v: quasi-wavelet solution (w = 10, \Delta t = 0.002, \sigma = 3.2)


5 Conclusion
In this paper, a new quasi-wavelet method for numerical applications is introduced. The numerical solutions it produces are very close to the analytical solutions, and the method is well suited to solving PDEs; it has already been applied with great success to a wide variety of PDEs.

References
1. Whitham, G. B.: Variational methods and applications to water waves. Proc. Roy. Soc. London, 1967, (220A): 6--25.
2. Broer, L. J. F.: Approximate equations for long water waves. Appl. Sci. Res., 1975, (31): 337--396.
3. Kupershmidt, B. A.: Mathematics of dispersive waves. Comm. Math. Phys., 1985, (99): 51--73.
4. Wang, M. L.: A nonlinear function transformation and the exact solutions of the approximate equations for long waves in shallow water. Journal of Lanzhou University (Natural Sciences), 1998, 34(2): 21--25.
5. Huang, Z. H.: On Cauchy problems for the RLW equation in two space dimensions. Appl. Math. and Mech., 2002, 23(2): 169--177.
6. Morlet, J., Arens, G., Fourgeau, E., et al.: Wave propagation and sampling theory and complex waves. Geophysics, 1982, 47(2): 222--236.
7. Wei, G. W.: Quasi wavelets and quasi interpolating wavelets. Chem. Phys. Lett., 1998, 296(3-4): 215--222.
8. Wan, D. C., Wei, G. W.: The study of quasi-wavelets based numerical method applied to Burgers' equations. Appl. Math. Mech., 2000, (21): 1099.

Plant Simulation Based on Fusion of L-System and IFS


Jinshu Han
Department of Computer Science, Dezhou University, Dezhou 253023,China
lihanjinshu@tom.com

Abstract. In this paper, we present a novel plant simulation method based on the fusion of L-system and iteration function system (IFS). In this method, the production rules and the parallel rewriting principle of the L-system are used to simulate and control the development and topological structure of a plant, while IFS fractal graphics with controllable parameters and rich, subtle texture are employed to simulate the plant components, so that the merits of both techniques are combined. Moreover, an improved L-system algorithm based on separated steps is used for the convenient implementation of the proposed fusion method. The simulation results show that the presented method can simulate various plants much more freely, realistically and naturally.

Keywords: Fractal plant; L-system; Iteration function system.

1 Introduction
With the development of computer graphics, the computer simulation of natural plants has become one of the research hotspots in the computer graphics domain and has wide application prospects in video games, virtual reality, botanical garden design, and ecological environment simulation. Nevertheless, at present, commercial image and graphics processing software based on Euclidean geometry can only draw regular geometric shapes and cannot realistically describe natural plants with complex structures, subtle details and irregular geometric shapes. Although complex techniques such as curves or surfaces can be used to approximate such natural objects, they demand large memory and high rendering speed. Fractal geometry, on the other hand, takes irregular geometric shapes as its research objects, and the simulation of natural objects is one of its typical application domains. In plant simulation there are two common methods: the L-system and the iteration function system (IFS). Previous works [1,2,3,4,5] mainly focus on plant simulation using only the L-system or only IFS, and cannot provide realistic and natural representations of plant structures and shapes due to the limitations of each method. Obviously, combining them may be better; however, how to combine them deserves further research. In response to this problem, a novel plant simulation method based on the fusion of L-system and IFS is presented. The simulation results show the feasibility and validity of the presented method.
Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10911098, 2007.
Springer-Verlag Berlin Heidelberg 2007


2 Characteristics of L-System and IFS


An L-system [2] is a rewriting system based on symbols. It defines a complex object by replacing every part of an initial object according to rewriting rules. For a given natural plant, the simulation process is as follows. Firstly, a set of production rules is abstracted according to the shape and development rules of the plant. Secondly, starting from the initial string, a new string is created recursively according to the production rules. Thirdly, the created string is interpreted using the turtle graphics algorithm and the corresponding geometric object is drawn.
In plant simulation, the L-system can simulate the development rules and topological structure of a plant well, which is its main merit. For example, an L-system can freely describe the development rules and structures of branches, leaves, flowers and seeds, because the parallel principle of the L-system is similar to the parallel growth of plants. However, compared with a natural plant, the plant graphics drawn by an L-system lack texture. The key reasons are as follows. Firstly, the plant branches are represented by lines. Secondly, the L-system only considers the self-similarity of the topological branching structure of the whole plant, and does not consider that self-similarity also exists in every component of the plant, such as a bark or a leaf.
IFS [3] represents a plant shape based on the collage theorem. For a given plant image, several sub-images generated by contractive affine transforms are small scaled versions of the original image. They are chosen to cover the original image as accurately as possible, and they are permitted to partially overlap each other. Such a set of contractive affine transforms is written as
transforms are as follow:

W = \{ \mathbb{R}^2 : w_1, w_2, \ldots, w_n \}, \qquad w_i \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r\cos\theta & -q\sin\psi \\ r\sin\theta & q\cos\psi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix}, \quad i = 1, 2, \ldots, n,   (1)

where r, q are the scaling factors in the x and y directions, \theta, \psi are the rotation angles between the sub-images and the original image, and e, f are the translation distances of the sub-images in the x and y directions. In addition, based on the proportions of the sub-image areas, an array of accumulative probabilities associated with W is obtained:

P = \{ p_1, p_2, \ldots, p_n \}, \quad \text{where } p_i > 0, \quad \sum_{i=1}^{n} p_i = 1.   (2)

The above W and P form an IFS. Commonly, the random iteration algorithm is used to draw the IFS fractal graphics. Starting from an initial point (x_0, y_0), an affine transform is selected from W according to P and the new point (x_1, y_1) is computed; then (x_1, y_1) is taken as the initial point and the process is repeated. The orbit of all the points forms a shape similar to the initial image, which is called the attractor of the IFS.
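A minimal Python sketch of the random iteration algorithm described above follows; the affine maps and probabilities are placeholders in the spirit of Table 1 in the appendix and are illustrative only.

import random

# Random iteration (chaos game) sketch for an IFS W = {w_1,...,w_n} with
# probabilities P. Each row: (a, b, c, d, e, f) with
# w(x, y) = (a*x + b*y + e, c*x + d*y + f).
MAPS = [(0.5, 0.5, 0.0, 0.0, 0, 0),
        (0.5, 0.5, 0.0, 0.0, 50, 0),
        (0.5, 0.5, 0.0, 0.0, 50, 50),
        (0.5, 0.5, 0.0, 0.0, 50, 50)]
PROBS = [0.15, 0.35, 0.35, 0.15]

def ifs_attractor(n_points=20000, skip=20):
    x, y = 0.0, 0.0
    points = []
    for i in range(n_points + skip):
        a, b, c, d, e, f = random.choices(MAPS, weights=PROBS, k=1)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= skip:                      # discard the transient points
            points.append((x, y))
    return points

pts = ifs_attractor()
print(len(pts), pts[:3])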
In plant simulation, IFS can represent the subtle texture of a plant well and the implementation program is simple, which is the main merit of IFS. Nevertheless, IFS lacks representation and control of the topological structure of a plant. In other words, IFS simulates a plant only according to the features of the plant's geometric shape, and it cannot represent the physiological features of the plant. Therefore the simulated plants are machine-made and have few differences from each other. The key reason is that, for a plant simulated by IFS, both the whole and the parts have strict self-similarity: the length and number of branches of the trunk obey the same strict self-similarity rules as those of the branches or sub-branches, and the self-similarity rules continue into ever smaller details.
Therefore, it is difficult to simulate a natural plant with rich texture and a complex structure and shape using only the L-system or only IFS.

3 Plant Simulation Method Based on Fusion of L-System and IFS


Based on the above analysis, a novel plant simulation method based on the fusion of L-system and IFS is presented. The basic idea is as follows: the L-system is used to simulate the diverse development rules and topological structure of a plant, and IFS is employed to simulate the textured plant components, such as branches and leaves. Therefore, the merits of both techniques can be combined. On the one hand, through the L-system we can freely control and simulate the development rules and topological structure of the plant. On the other hand, through IFS we can quickly simulate the self-similarity and rich texture of the plant components.
The process of combining the L-system with IFS is similar to that of assembling building blocks. The production rules and the parallel rewriting principle of the L-system are regarded as the building rules, while IFS fractal graphics with controllable parameters and rich subtle texture are regarded as the various building blocks. We can therefore assemble the IFS building blocks according to the L-system rules, so that realistic simulated plant shapes are created. Next, taking Fig. 1 as an example, we introduce the implementation process of the novel method.
Fig. 1 (a) is a leaf created by IFS [6]. It is regarded as a leaf building block. The IFS algorithm of the leaf can be encapsulated as a function defined as leaf(x, y, r, k1, k2), where x, y are the coordinates of the bottom endpoint of the petiole, r is the rotation angle of the whole leaf, and k1, k2 are the scaling factors in the x and y directions. If we change the values of x, y, r, k1, k2, the five controllable parameters, we can create various leaves with different sizes, directions and positions. Obviously, the leaf building block is not fixed; on the contrary, its size, direction and position are changeable and controllable.
Fig. 1 (b) shows branches created by an L-system. The L-system is a formal language which models objects by giving a geometric interpretation to the characters in a string. For instance, the character F is a basic symbol in the L-system; while drawing, F is interpreted as "move forward by the line length, drawing a line". In this paper, however, the character F can be given a new meaning. For example, some F in the character string can be interpreted as "move forward, drawing a leaf by calling the function leaf(x, y, r, k1, k2)". This means that the present values of the coordinate position and drawing direction (angle) are used as the actual parameters of the function leaf(). Therefore an IFS leaf replaces the traditional simple L-system line. At the same time, the next coordinate values and angle are computed to prepare for the next drawing step.
Under the plant growth rules shown in Fig. 1 (b), the IFS leaf building block with controllable parameters shown in Fig. 1 (a) is called, so a plant with leaves at the proper growth positions is created, as shown in Fig. 1 (d).
In order to simulate textured branches, we define a branch building block, shown in Fig. 1 (c). The branch building block is created by IFS [7]. The IFS algorithm of the branch is encapsulated as a function defined as trunk(x, y, r, k1, k2); the meaning of each parameter is similar to that of the function leaf(). Fig. 1 (e) is a modeling plant that uses the L-system to combine IFS leaves and IFS textured branches.

Fig. 1. Process of simulating a plant by combining L-system with IFS: (a) An IFS leaf. (b) A modeling branch with L-system. (c) An IFS textured branch. (d) Plant created with modeling branches and IFS leaves. (e) Plant created with modeling branches, IFS leaves, and IFS textured branches.

From the above process we can see that, in the method based on the fusion of L-system and IFS, IFS is used to create the various graphic building blocks representing the components of a plant, and the set of production rules of the L-system, that is, the building rules, can be modified freely. In effect, the L-system determines the development rules and the whole structure of the plant, while IFS determines the modeling and texture of every basic part of the plant. By combining the merits of both techniques, we can freely simulate the topological structure and subtle texture of a plant.
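The fusion idea can be sketched in code as follows. The rewriting rule, the branching angle and the drawing callbacks are hypothetical placeholders standing in for the IFS building-block functions leaf(x, y, r, k1, k2) and trunk(x, y, r, k1, k2) described above.

import math

# Sketch of the fusion: an L-system controls topology, while the symbols
# F and L are rendered by IFS building blocks (here replaced by stub
# callbacks). The rule and angle are illustrative, not taken from the paper.

AXIOM = "F"
RULES = {"F": "F[+FL][-FL]F"}        # hypothetical branching rule
ANGLE = math.radians(25.0)

def rewrite(axiom, rules, iterations):
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def draw_trunk(x, y, heading, length):      # stand-in for trunk(x, y, r, k1, k2)
    return x + length * math.cos(heading), y + length * math.sin(heading)

def draw_leaf(x, y, heading):               # stand-in for leaf(x, y, r, k1, k2)
    print("leaf at (%.1f, %.1f), angle %.1f deg" % (x, y, math.degrees(heading)))

def interpret(commands, step=10.0):
    x, y, heading = 0.0, 0.0, math.pi / 2
    stack = []
    for ch in commands:
        if ch == "F":
            x, y = draw_trunk(x, y, heading, step)
        elif ch == "L":
            draw_leaf(x, y, heading)
        elif ch == "+":
            heading += ANGLE
        elif ch == "-":
            heading -= ANGLE
        elif ch == "[":
            stack.append((x, y, heading))
        elif ch == "]":
            x, y, heading = stack.pop()

interpret(rewrite(AXIOM, RULES, 2))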

4 Improved L-System Implementation Algorithm


In order to create a realistic and natural modeling plant, we commonly require that any component of the plant, such as a trunk, a branch or a leaf, can be changed freely in width, length, color and growth direction. Therefore, in the implementation of the fusion method, the L-system must be able to freely adjust or control, according to the actual demands, the parameters of the plant components created by IFS.
The implementation algorithm of the traditional L-system is as follows. Firstly, beginning with the initial string, the string is rewritten repeatedly according to the set of production rules to create a long string, until the iteration number is reached. Secondly, the long string is interpreted and drawn. This kind of final one-off drawing process is not flexible enough; it is difficult to control the parameters of the plant components freely. Therefore, this paper presents an improved implementation algorithm of the L-system. The final one-off drawing process is separated into several steps; namely, the string is rewritten once and then interpreted and drawn once, and only the newly created parts are drawn each time. Fig. 2 shows the steps of drawing a plant based on the improved algorithm, where n is the iteration number.
Fig.2 (a) shows the father branches, where n = 1 . In the figure, numbers 1, 2, 3 label
the points called growth points where the first generation sub-branches start to draw.
Once the father branches have been drawn, the position and direction of these growth
points are recorded, preparing for drawing the first generation sub-branches in next step.
Fig.2 (b) shows the first generation sub-branches drawn from the growth points
according to production rules, where n = 2 . At the same time, the growth points of the
second generation sub-branches, labeled by numbers from 1 to 9, are recorded.
Fig.2 (c) shows the third generation sub-branches drawn from growth points
recorded in Fig.2 (b).
Repeat above steps, and a plant with complex shape and topological structure is
created.

Fig. 2. Process of drawing a plant based on the improved L-system algorithm: (a) n = 1. (b) n = 2. (c) n = 3.

The improved algorithm based on separated steps can control the plant drawing process flexibly. For example, during the simulation, according to actual needs we can freely control the number of growth points based on the value of n; we can control and adjust the width, length, color and growth direction of particular branches; and we can also freely decide, based on n, whether the next generation to be drawn is a leaf or a flower. In this way the L-system and IFS are combined into an organic whole and better plant simulation is achieved. A code sketch of this step-separated drawing process is given below.
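The following Python sketch illustrates the separated-steps idea: each step draws only the newly created sub-branches, starting from the growth points recorded in the previous step, so the parameters of each generation can be adjusted before it is drawn. The branching angles and per-generation parameters are hypothetical.

import math

# Sketch of the improved, step-separated algorithm: each step draws only the
# newly created sub-branches from the growth points recorded in the previous
# step. Per-generation length/width choices are hypothetical.

ANGLES = (math.radians(30), 0.0, math.radians(-30))   # three children per tip

def draw_branch(x, y, heading, length):
    # stand-in for drawing one branch segment (a line or an IFS branch block)
    return x + length * math.cos(heading), y + length * math.sin(heading)

def grow(generations=3, length0=40.0):
    # generation 1: the father branch; record its tip as a growth point
    tip = draw_branch(0.0, 0.0, math.pi / 2, length0)
    growth_points = [(tip[0], tip[1], math.pi / 2)]
    print("generation 1: 1 branch drawn, %d growth points" % len(growth_points))

    for n in range(2, generations + 1):
        length = length0 * 0.6 ** (n - 1)      # per-generation parameter
        new_points = []
        for (x, y, heading) in growth_points:  # draw only the new sub-branches
            for da in ANGLES:
                tx, ty = draw_branch(x, y, heading + da, length)
                new_points.append((tx, ty, heading + da))
        print("generation %d: %d branches drawn, %d growth points"
              % (n, len(new_points), len(new_points)))
        growth_points = new_points

grow()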


5 Results
Based on the fusion algorithm of L-system and IFS, many plant simulation experiments have been carried out. During the experiments, a modest random factor is added to every drawing parameter involved in the L-system and IFS. Fig. 3, Fig. 4, and Fig. 5 show some of the typical experimental results.
Fig. 3 shows the simulation results for four maples. In Fig. 3 (a)(c)(d)(e), the development rules of the L-system are used to combine IFS branch building blocks with the IFS maple leaf building block [8] shown in Fig. 3 (b), and random factors are added in the process. The simulation results show that the four maples have a similar natural structure and subtle texture, but their branches and leaves at different ranks differ modestly in width, length, color and growth direction.
Similarly, Fig. 4 and Fig. 5 are also created by the improved method presented in this paper: the development rules of the L-system are used to organize IFS branch building blocks and IFS leaf building blocks into an organic whole.

Fig. 3. Plant simulation results 1: (a) Maple 1. (b) Maple leaf. (c) Maple 2. (d) Maple 3. (e) Maple 4.

6 Conclusion
This paper presents an improved method for plant simulation based on the fusion of L-system and IFS. In this method, the L-system is used to simulate the random development and topological structure of a plant, and IFS is used to simulate the self-similarity and subtle texture of the plant components, so that a more natural and realistic simulated plant can be created.
In addition, based on the improved method presented in this paper, three-dimensional L-systems and lighting effects can be employed to create even more realistic and natural simulated plants.


Fig. 4. Plant simulation results 2: (a) Arbor. (b) An arbor leaf [6]. (c) Bamboo. (d) A bamboo pole. (e) A bamboo leaf.

Fig. 5. Plant simulation results 3: (a) Shrub. (b) A flower.

Acknowledgments
We thank Q.Z Li and anonymous reviewers for their suggestions and comments.

References
1. Jim Hanan: Virtual plants - integrating architectural and physiological models. In: Environmental Modeling & Software 12(1) (1997) 35-42
2. Prusinkiewicz, P.: Modeling of spatial structure and development of plants: a review. In: Scientia Horticulturae 74 (1998) 113-149
3. Slawomir S. Nikiel: True-color images and iterated function systems. In: Computers & Graphics 22(5) (1998) 635-640
4. Xiaoqin Hao: The studies of modeling method of forest scenery for three dimension iterated function system. In: Chinese Journal of Computers 22(7) (1999) 768-773
5. Zhaojiong Chen: An approach to plant structure modeling based on L-system. In: Chinese Journal of Computer Aided Design and Computer Graphics 12(8) (2000) 571-574
6. Sun Wei, Jinchang Chen: The simple way of using iteration function system to obtain the fractal graphics. In: Chinese Journal of Engineering Graphics 22(3) (2001) 109-113
7. Chen Qian, Naili Chen: Several methods for image generation based on fractal theory. In: Chinese Journal of Zhejiang University (Engineering Science) 35(6) (2001) 695-700
8. Huajie Liu: Fractal Art. Electronic edition. Electronic Video Publishing Company of Hunan, Hunan, China (1997) chapter 5.4

Appendix
The related data and IFS codes of the figures in the paper are as follows, where a = r\cos\theta, b = -q\sin\psi, c = r\sin\theta, d = q\cos\psi.
Table 1. Affine transform parameters in Fig.1 (c)

a
0.5
0.5
0.5
0.5

1
2
3
4

b
0.5
0.5
0.5
0.5

c
0.0
0.0
0.0
0.0

d
0.0
0.0
0.0
0.0

e
0
50
50
50

f
0
0
50
50

p
0.15
0.35
0.35
0.15

In Fig.4 (d), p of 3 in Table 1 is changed into 0.15. Other parameters arent


changed.
Table 2. Affine transform parameters in Fig. 4 (e)

      a      b      c      d      e      f      p
1    0.29   0.4   -0.4    0.3    0.28   0.44   0.25
2    0.33  -0.34   0.39   0.4    0.41   0.0    0.25
3    0.42   0.0    0.0    0.63   0.29   0.36   0.25
4    0.61   0.0    0.0    0.61   0.19   0.23   0.25

In Fig. 5 (b), the flower is created recursively based on the following equations:

X_{n+1} = b Y_n + F(X_n), \qquad Y_{n+1} = -X_n + F(X_{n+1}), \qquad F(X) = aX + \frac{2(1-a)X}{1+X^2},   (3)

where a = 0.3, b = 1.0.
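A short Python sketch of the recursion (3) with a = 0.3 and b = 1.0 follows; the number of iterations and the initial point are arbitrary.

# Sketch of the recursion (3) used for the flower in Fig. 5 (b):
#   X_{n+1} = b*Y_n + F(X_n),  Y_{n+1} = -X_n + F(X_{n+1}),
#   F(X) = a*X + 2*(1 - a)*X / (1 + X^2),  with a = 0.3, b = 1.0.

a, b = 0.3, 1.0

def F(x):
    return a * x + 2.0 * (1.0 - a) * x / (1.0 + x * x)

def flower_orbit(n_points=20000, x0=0.1, y0=0.1):
    x, y = x0, y0
    points = []
    for _ in range(n_points):
        x_next = b * y + F(x)
        y_next = -x + F(x_next)
        x, y = x_next, y_next
        points.append((x, y))
    return points

pts = flower_orbit()
print(pts[:3])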

A System Behavior Analysis Technique with Visualization of a Customer's Domain

Shoichi Morimoto

School of Industrial Technology, Advanced Institute of Industrial Technology
1-10-40, Higashi-oi, Shinagawa-ku, Tokyo, 140-0011, Japan
morimoto-syoichi@aiit.ac.jp

Abstract. Object-oriented analysis with UML is an effective process for software development. However, the process depends strongly on the workmanship and experience of software engineers. In order to mitigate this problem, a preceding effort, scenario-based visual analysis, has been proposed. The technique visualizes a customer's domain, thus enabling requirement analyzers and customers to communicate smoothly. The customers themselves can schematize their workflows with the technique. Consequently, the analyzers and customers can easily and exactly derive use case models via the collaborative work. Thus, this paper proposes a technique to advance the analysis further, inheriting these advantages. The extended technique can analyze initial system behavior specifications. The customers can also join and understand system behavior analysis, thus they can easily decide on specifications for the systems to be developed.

Keywords: Activity diagrams, Model-based development.

1 Introduction

Requirement analysis and specification for software are important factors for the success of software development, and because the quality of the analysis affects the quality of the software, it is the most important process. Thus, various analysis techniques have been proposed; in particular, Object-Oriented Analysis (OOA) with UML1 is most widely used to model a domain of customers. After the modeling, customers and developers can understand and analyze the domain and systematically decide the requirement specifications [6]. However, because the developers must fully analyze a domain of customers based on OOA, the quality of the analysis depends on their capability.
In order to mitigate the problem, Scenario-based Visual Analysis, SVA for short, was proposed [3, 2]. In SVA, analyzers and customers can cooperatively analyze requirements in a domain and elicit use cases from very simple workflow scenarios. That is to say, they can easily understand, schematize and analyze the domain in a much simpler manner than using UML.
1 Unified Modeling Language: http://www.omg.org/UML/

Y. Shi et al. (Eds.): ICCS 2007, Part II, LNCS 4488, pp. 10991106, 2007.
c Springer-Verlag Berlin Heidelberg 2007



On the other hand, the difficulty and quality of system behavior analysis in OOA with UML likewise depend on the workmanship and experience of the designers. Therefore, this paper proposes a system behavior analysis technique utilizing the resources which are generated in the SVA process. One can model not only use cases but also system behavior with the technique. Moreover, the technique visualizes a domain of customers, thus enabling designers and customers to communicate smoothly. Consequently, the customers can easily and exactly convey their requirements to the developers.

2 Process of the System Behavior Analysis Technique

We herein explain the process of SVA and the system behavior analysis.

2.1 The Process of the Scenario-Based Visual Analysis

SVA is adaptable to the requirements phase of object-oriented software development; that is, use case diagrams can be obtained via the collaborative work in the process. Use cases and actors are generally found from a conceptual map, named the business bird's-eye view (BEV), by arranging icons which indicate subjects, verbs, and nouns in workflow scenarios. A software tool named SVA editor is also provided to support the operations [3]. Analyzers can systematically perform the analysis and customers can easily join the process. Furthermore, in the last phase of the process, both of them can collaboratively and visually decide which part of the tasks in the scenario should be implemented as software on a BEV.
In SVA, analyzers use workflow scenarios in order to capture a customer's business domain. A BEV is created from the workflow scenarios to obtain a conceptual map of the domain. The BEV is then arranged to clarify the whole structure of the elements which constitute the workflow. Finally, use case diagrams are elicited from the BEV. The process of SVA is performed in the following steps:
will be elicited from the BEV. The process of SVA is performed as following steps;
1.
2.
3.
4.
5.
6.

Customers describe workow scenarios.


Analyzers form an initial BEV from the scenarios.
The analyzers arrange the BEV to cluster verbs.
The analyzers and the customers analyze roles of the subjects.
The analyzers consider the system boundary.
The analyzers obtain use case diagrams.

SVA defines some rules for writing workflow scenarios as follows.

- Use simple sentences with one verb and several nouns, which may involve articles, prepositions, and adjectives. Other parts of speech, such as adverbs, can be used but are not analyzed in SVA.
- Use the active voice; do not use the passive voice. The order of the words is Subject-Verb-Objects.


Since the task statements of the customers are described in the workflow scenarios using simple natural language, the customers can easily confirm the correctness of the contents.
The analyzers form a BEV from each sentence in the workflow scenarios. Rectangle icons and oval icons are used to stand for nouns and verbs, respectively. A subject and the other icons are connected by lines from a verb. The line to the subject is drawn as a solid line and the lines to the other icons are drawn as broken lines.
After all the sentences in the workflow scenarios have been visualized on the BEV, the analyzers synthesize them. The same nouns are shared by every sentence in which they are used. Verbs are not shared, even if the same verbs appear in several sentences.
After the synthesis, the analyzers rearrange the elements so that the BEV becomes legible. This arrangement also produces a semantic relation structure on the conceptual map. During this arrangement the analyzers have to take care to find out semantic clusters of verb icons. This process is necessary for analyzing the roles of subjects in the next step, where the noun icons are analyzed.
In the next step, the analyzers abstract the subjective noun icons as roles of tasks. If a subjective noun icon is connected from several clusters of verbs, the subjective noun has several different roles. In such cases, the single subject is decomposed into multiple roles, each of which is reconnected to the corresponding verb clusters.
After the role analysis, both the analyzers and the customers decide a system boundary on the rearranged BEV. This decision is made by the analyzers and the customers cooperatively. A boxed line is used to indicate what part of the tasks is to be developed as a software system. After the boundary is drawn, the analyzers have to consider which nouns should be moved into the boundary. For example, they must decide whether or not physical objects are implemented in software. In such situations, the editable map, i.e., the BEV, acts as a communication tool between the analyzer and the customer. They have a visual support, i.e., the map, to discuss the scope of the system which is going to be developed.
Generating a use case diagram is the final step of SVA. The way to elicit actors and use cases is very simple. The role icons connecting to the verbs on the system boundary line and located outside of the box become actors. The verbs on the boundary which connect to the actors become use cases.
Further details of the SVA process can be found in the references [3, 2].
2.2 Artifacts of the System Behavior Analysis Technique

In the system behavior analysis of OOA with UML, interaction or activity diagrams are generally used. Interaction diagrams illustrate how objects interact via messages. Activity diagrams are typically used for business process modeling, for modeling the logic captured by a single use case or usage scenario, or for modeling the detailed logic of a business rule [1]. Because activity diagrams are closely related to scenarios in natural language and are suitable for system behavior analysis, the objective of our technique is to design activity diagrams from the results of SVA (i.e., workflow scenarios, a BEV, and use case diagrams).

1102

2.3

S. Morimoto

The Procedure of the System Behavior Analysis Technique

The system behavior analysis technique effectively utilizes the artifacts of SVA. The designers first select a target use case from a use case diagram elicited in SVA. Secondly, the designers extract from the BEV the source verb icon of the selected use case and all noun icons connected with that verb icon. In the third step, the designers and customers cooperatively analyze activities from the extracted icons and from the source sentence of the icons in the workflow scenario. Next, the designers draw partitions of an activity diagram based on the actors connected with the selected use case in the use case diagram. Then the designers place the elicited activities, in chronological order, on the partition drawn from the actor corresponding to each activity's subject. The designers repeat the above steps for all use cases in the use case diagram. The procedure of the analysis is as follows (a minimal sketch of these steps appears after the list).
1. After having used SVA, select a use case in a use case diagram.
2. Extract the source verb icon of the selected use case and all noun icons which
are linked with the verb icon in the BEV.
3. Elicit activities from the extracted icons and the source sentence of the icons in the workflow scenario.
4. Draw partitions from actors which are linked with the selected use case in the
use case diagram.
5. Put the activities on the corresponding partition in chronological order.
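The following Java sketch illustrates steps 4 and 5 of this procedure under the assumption that steps 1-3 (selecting the use case, extracting icons from the BEV, and eliciting activities with the customers) have already been carried out by hand; all type and method names are hypothetical.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of steps 4 and 5: build one partition per linked actor and
// place the elicited activities on it in chronological order.
public class ActivityDiagramSketch {

    record UseCase(String name, List<String> actors) {}

    static Map<String, List<String>> placeActivities(UseCase useCase,
                                                     Map<String, List<String>> activitiesPerActor) {
        Map<String, List<String>> partitions = new LinkedHashMap<>();
        for (String actor : useCase.actors()) {
            partitions.put(actor, activitiesPerActor.getOrDefault(actor, List.of()));
        }
        return partitions;
    }

    public static void main(String[] args) {
        // The use case "make a chart" of Section 3 and the activities elicited in step 3.
        UseCase makeAChart = new UseCase("make a chart", List.of("receptionist"));
        Map<String, List<String>> diagram = placeActivities(makeAChart,
                Map.of("receptionist", List.of(
                        "make a new chart data",
                        "input the information on an application",
                        "input the information on an insurance card")));
        System.out.println(diagram);
    }
}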

3 Application

In order to demonstrate the steps of the process clearly and in detail, we present an actual consulting example. The business domain of this example is a hospital, where a new software system is needed to support the daily tasks. In particular, the staff's demand is to develop an online medical record chart system for doctors.
3.1 Use Case Analysis

The objective of this paper is not to present the use case analysis of SVA itself, so we show only the outline of the SVA phase.
The workflow scenario in Fig. 1 shows the business domain of the first medical examination. The parts surrounded by square brackets denote modifications made in the revision phase of SVA; they were elicited through collaborative discussion between the analyzers and the customers. First, all the workflows were modeled into BEVs. Secondly, the BEVs were synthesized into one diagram. Thirdly, the verb icons were grouped and the roles of the subjective nouns were analyzed in the synthesized BEV. The BEV in Fig. 2 was finally modeled from the workflow scenario. The verb icons were classified into the clusters (A), (B), (C), (D), (E), (F), (G), (H), (I), (J), (K), and (L).

A System Behavior Analysis Technique with Visualization

1103

1. The patient submits the insurance card to the receptionist.


2. The receptionist inquires of the patient the symptom.
3. The receptionist decides the department from the symptom.
4. The receptionist makes the patient ll the application.
5. The receptionist makes the medical chart from the application [and the insurance card].
6. [The receptionist makes the consultation card from the insurance card].
7. The receptionist brings the medical chart to the surgery of the department.
8. The receptionist hands the nurse in the surgery the medical chart.
9. The nurse calls the patient in the waiting room.
10. The patient enters the surgery.
11. The doctor examines the patient.
12. The doctor gives the medical treatment to the patient.
13. The doctor writes the medical treatment to the medical chart.
14. The patient leaves the surgery.
15. The patient goes to the waiting room.
16. The nurse brings the medical chart to the reception.
17. The receptionist calculates the medical bill [from the medical treatment in the medical chart].
18. The receptionist calls the patient in the waiting room.
19. The receptionist charges the patient the medical bill.
20. The patient pays the receptionist the medical bill.
21. The receptionist hands the patient the consultation ticket [and the insurance card].
22. The receptionist puts the medical chart on the cabinet.

Fig. 1. The workflow scenario of the first medical examination

[Figure: the synthesized BEV of the hospital business, showing noun icons such as insurance card, application, medical chart, consultation ticket, doctor's bill, surgery, waiting room, and cabinet; verb icons grouped into the clusters (A)-(L); subjective nouns decomposed into role icons labelled (X), (Y), and (Z) and named roles such as chart making clerk, ticket making clerk, accounting clerk, and calling clerk; and the hospital business boundary and the chart system boundary.]

Fig. 2. The BEV of the hospital business


The subjective nouns linked with two or more verb clusters were decomposed into several roles. Consequently, the use case diagram in Fig. 3 was composed from the chart system boundary in Fig. 2. The use cases are the verb clusters (G), (H), (K), and (L) on the boundary; the actors are the roles (Y) and (Z), which are connected to those verb clusters with solid lines. Two actors and four use cases were elicited and named adequately. The foregoing operations constitute the SVA outline.

(Y)

receptionist
(Z)

Online chart system


(G)
make a chart
(H)
make a consultation
ticket
(K)
calculate a
doctors bill
(L)
write a medical
treatment

doctor
Fig. 3. The use case diagram of the online medical record chart system

3.2 System Behavior Analysis

We analyze the system behavior from the workflow scenario, the BEV, and the use case diagram of Section 3.1.
First, we select the use case "make a chart". This use case was obtained from part (G) of the BEV. The verb icon is "make" in part (G) of Fig. 2, and the noun icons directly linked to it are "insurance card", "application", and "medical chart". These are the first and second steps of the system behavior analysis described in Section 2.3.
Next, consider the source sentence of the icons in the scenario of Fig. 1. The sentence corresponds to "5. The receptionist makes the medical chart from the application and the insurance card." That is, the information of the application and the insurance card is required in order to make a medical chart. Therefore, the activities "input the information on an application" and "input the information on an insurance card" are identified. The nouns "application" and "insurance card" are now used, but the noun "medical chart" is not yet used. The business domain of this example is the first medical examination, so the patient is certainly visiting the hospital for the first time. Because there is no chart for the patient, the receptionist must make a new chart; that is, before the foregoing activities, the activity "make a new chart data" is required. This operation is the third step of the system behavior analysis.
Finally, we draw a partition from the actor linked to the use case, i.e., the receptionist, and put the activities on the partition in chronological order. In consequence, the activity diagram in Fig. 4 is composed.


[Figure: an activity diagram with a single receptionist partition containing the activities "make a new chart data", "input the information on an application", and "input the information on an insurance card".]

Fig. 4. The activity diagram of the use case "make a chart"

[Figure: the activity diagrams of the other use cases, with receptionist and doctor partitions containing activities such as "make a new consultation ticket data", "input the information of a medical chart", "print out a consultation ticket", "calculate a doctor's bill", "print out a receipt", "get a medical chart data", and "input a medical treatment".]

Fig. 5. The activity diagrams of the other use cases

Similarly, we analyze the use case "make a consultation ticket". This use case was obtained from part (H) of the BEV. The verb icon is "make" in part (H) of Fig. 2, and the noun icons directly linked to it are "consultation ticket" and "medical chart". The source sentence corresponds to "6. The receptionist makes the consultation card from the insurance card." The information of the medical chart is required in order to make a consultation ticket; therefore, the activity "input the information on a medical chart" is clarified. Because, as in the analysis above, the patient has no consultation ticket yet, the activity "make a new consultation ticket data" is required. Moreover, following the connections of the icon "consultation ticket" clarifies that the receptionist must hand the patient the actual object, i.e., the ticket. Accordingly, the receptionist must print out the ticket with the online chart system, and the activity "print out a consultation ticket" is identified. Since the third step of the system behavior analysis is group work, the designers can easily obtain the customers' consent about such activities. The other use cases can be analyzed by the same procedure; the resulting diagrams are shown in Fig. 5.
Moreover, the technique is adaptable to business modeling [5]. SVA can
also analyze business use cases. If the boundary for the hospital business is drawn in Fig. 2, the business use case diagram is formed in the same way. Similarly, a business activity diagram can be elicited from the workflow scenario: the subject of each workflow sentence may become a partition of the business activity diagram, and each workflow sentence, excluding its subject clause, may become a business activity.

4 Concluding Remarks

In this paper, we have proposed a system behavior analysis technique that utilizes scenario-based visual analysis. Software designers can obtain activity diagrams by following the process of the technique. Since the technique utilizes scenario-based visual analysis, the designers can understand the customers' business domain. Moreover, since the rules of the scenario description are very simple, customers who fully understand the business domain can write the workflow scenarios themselves. Consequently, both sides reach a common understanding through the group work; that is, they can easily decide specifications for software development.
Henceforth, we will apply the technique to further practical subjects. The activity diagrams of the example were not complex; they use neither conditions nor decisions. Moreover, the technique currently designs only activity diagrams. If class diagrams could be elicited from workflow scenarios, sequence and state machine diagrams could also be designed. In the business bird's-eye view, designers can easily identify classes, because the noun factors in the customers' domain are clarified. However, in order to identify the attributes and methods of classes, it may be necessary to add further information, e.g., a picture; in the case of the example in this paper, a picture of the medical chart would probably be required for class design. We are improving the technique and developing a tool to support this idea.
Acknowledgement. My special thanks are due to Dr. Chubachi for his
advice.

References
1. Ambler, S. and Jeffries, R.: Agile Modeling, Wiley (2002)
2. Chubachi, Y., Kobayashi, T., Matsuzawa, Y., and Ohiwa, H.: Scenario-Based Visual Analysis for Use Case Modeling, IEICE Transactions on Information and Systems, Vol. J88-D1, No. 4 (2005) 813–828 (in Japanese)
3. Chubachi, Y., Matsuzawa, Y., and Ohiwa, H.: Scenario-Based Visual Analysis for Static and Dynamic Models in OOA, Proceedings of the IASTED International Conference on Applied Modelling and Simulation (AMS 2002), ACTA Press (2002) 495–499
4. Jacobson, I., Booch, G., and Rumbaugh, J.: The Unified Software Development Process, Addison-Wesley (1999)
5. Jacobson, I., Ericsson, M., and Jacobson, A.: The Object Advantage - Business Process Reengineering with Object Technology, Addison-Wesley (1996)
6. Kruchten, P.: The Rational Unified Process: An Introduction, Addison-Wesley (2003)

Research on Dynamic Updating of Grid Service


Jiankun Wu, Linpeng Huang, and Dejun Wang
Department of Computer Science, Shanghai Jiaotong University, Shanghai, 200240,
P.R. China
jkwu@sjtu.edu.cn, huang-lp@cs.sjtu.edu.cn, wangdejun@sjtu.edu.cn

Abstract. In complicated distributed systems based on grid environments, grid services lack the ability to be updated at runtime. In the maintenance of such systems it is an urgent issue to support transparent runtime updating of services, especially when services communicate with each other frequently. Based on research on the implementation of grid services and the interaction between them following WSRF [3], this paper introduces a proxy service as the bridge for the interaction between services and thereby supports the runtime dynamic updating of grid services. Grid service updating must happen gradually, and there may be long periods of time when different nodes run different service versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that make it possible to upgrade grid-based systems automatically while limiting service disruption.
Keywords: Grid service, Dynamic updating, Proxy service, Simulation service.

1 Introduction
With changing application requirements and the wide use of the Internet, complicated cross-area and cross-organization applications have developed greatly in various fields, and distributed technology has become the main method for building them. As systems expand day by day, their maintenance and modification become more frequent. Research shows that nearly half of the cost of a complicated distributed system is spent on maintenance. In the traditional software maintenance procedure, services have to be paused, yet some systems, such as banking systems, must provide service continuously 24 hours a day; even a short pause causes great loss. How can this dilemma be resolved? The answer is dynamic updating technology. Software updating is defined as the dynamic behavior, including software maintenance and update, in the life cycle of a software system [6]. Because it maintains the system while the whole system keeps working normally, dynamic updating is significant. Grid computing is the latest achievement in the development of distributed technology; it aims to resolve resource sharing and coordination in WAN distributed environments and to avoid drawbacks such as inadequate computation ability or unbalanced loads [1][7][8]. It is a trend to develop new complicated systems based on grid technology and to transplant current systems into grid environments.


Since an application system based on grid technology consists of services with specific functions, system maintenance is mainly the maintenance of services. As in other distributed systems, service maintenance in a grid application system still faces the problem of having to terminate the service. It is therefore necessary to introduce dynamic updating technology into service maintenance in the grid environment; this is of even more practical significance when the deployed system has been running for a long time.
With current grid technology, if we want to update or modify a working grid service, we must stop it and start a new service. This model cannot switch a running service dynamically: the substitution of a grid service terminates part of the system or produces long delays, especially when services communicate with each other frequently.
A proxy service and a simulation service are introduced in the architecture supporting grid service updating. The proxy service is not only responsible for forwarding service requests but also for contacting the updating component information service through a subscribe/publish style to obtain new version information in time. By introducing the proxy service, the interaction and interfaces of the different services become transparent. The simulation service is responsible for simulating the behavior and state formats between different versions of a service.
This paper presents a flexible and efficient updating method that enables grid-based systems to provide service during updating. We present a new methodology that makes it possible to update grid-based systems while minimizing disruption and without requiring all upgrades to be compatible. The rest of the paper is organized as follows. Section 2 presents and discusses the architecture and related technology supporting grid service updating. Section 3 describes the prototype system and related tests. Finally, a summary and future work are given.

2 Architecture Supporting Grid Service Updating


The architecture models a grid-based system as a collection of grid services. A service has an identity, a collection of methods that define its behavior, and a collection of resources representing its state. Services communicate by sending SOAP messages. A portion of a service's state may be persistent. A node may fail at any point; when the node recovers, the service reinitializes itself from the persistent portion of its state, and when updating, the persistent state may need to change its data format for the new version.
To simplify the presentation, we assume each node runs a single top-level service that responds to remote requests; thus, each node runs a top-level service-proxy grid service. An upgrade moves the system from one version to the next by specifying a set of service updatings, one for each grid service that is being replaced. The initial version has version number one, and each subsequent version has the succeeding version number.


2.1 System Architecture


A class updating has six components: <oldService, newService, TF, SF, pastSimulationService, futureSimulationService>. OldService identifies the service that is now obsolete; newService identifies the service that is to replace it. TF identifies a transform function that generates an initial persistent state for the new service from the persistent state of the old one. SF identifies a scheduling function that tells a node when it should update. PastSimulationService and futureSimulationService identify services for simulation objects that enable nodes to interoperate across versions: a futureSimulationService allows a node to support the new service's behavior before it upgrades, and a pastSimulationService allows a node to support the old service's behavior after it upgrades. These components can be omitted when not needed.
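For illustration, the six components of a class updating might be grouped into a single descriptor as in the following Java sketch; the record and interface names are hypothetical, and optional components are simply left null.

// Minimal sketch of the <oldService, newService, TF, SF,
// pastSimulationService, futureSimulationService> descriptor.
public record ServiceUpdating(
        String oldService,                // the now obsolete service
        String newService,                // the service that replaces it
        TransformFunction tf,             // maps the old persistent state to the new format
        SchedulingFunction sf,            // tells a node when it should update
        String pastSimulationService,     // old behaviour after the upgrade (may be null)
        String futureSimulationService) { // new behaviour before the upgrade (may be null)

    public interface TransformFunction {
        byte[] transform(byte[] oldPersistentState);
    }

    public interface SchedulingFunction {
        boolean shouldUpdateNow(String nodeId, double nodeLoad);
    }
}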

Fig. 1. Updating architecture

2.2 Analysis of Proxy Service Mechanism


The objective of introducing the proxy service is to obtain transparency between services: while a grid service is being updated, the other grid services in the same grid system are not aware of it. The proxy service is not only responsible for forwarding service requests but also for contacting the updating component information service through a subscribe/publish style to obtain new version information in time.
2.3 Version Management
Because the updating does not complete in an instant, it is necessary to support the coexistence of multiple versions at the same time. The simulation service is responsible for simulating the different interfaces and for saving and recovering state between the current version and the old version, and between the current version and the new version of a service.


In order for the proxy service to accurately locate the simulation service of the relevant version, each simulation service has a resource holding version information such as request interfaces, parameter formats, the URL of the software data, and so on.
When a service updating happens, the workflow of the proxy service is as shown in Fig. 3: the proxy service finds the simulation service according to the relevant resource, delivers the service request to that simulation service, and finally returns the result to the service requester.

Fig. 2. Version information in service updating

Fig. 3. Simulation procedure
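The dispatch sketched in Fig. 3 could look roughly as follows in Java; the interfaces are hypothetical stand-ins, since real WSRF services exchange SOAP messages rather than strings.

import java.util.Map;

// Minimal sketch of the proxy service delegating to a version-specific
// simulation service when the request version differs from the current one.
public class ProxyServiceSketch {

    interface GridService {
        String invoke(String request, int requestVersion);
    }

    private final GridService currentService;
    private final int currentVersion;
    private final Map<Integer, GridService> simulationByVersion; // located via version resources

    ProxyServiceSketch(GridService current, int version, Map<Integer, GridService> simulations) {
        this.currentService = current;
        this.currentVersion = version;
        this.simulationByVersion = simulations;
    }

    /** Forwards a request and returns the result to the requester. */
    public String forward(String request, int requestVersion) {
        if (requestVersion == currentVersion) {
            return currentService.invoke(request, requestVersion);
        }
        GridService simulation = simulationByVersion.get(requestVersion);
        if (simulation == null) {
            throw new IllegalStateException("no simulation service for version " + requestVersion);
        }
        return simulation.invoke(request, requestVersion); // past or future simulation service
    }
}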

2.4 Subscribe/Publish Model in Updating Procedure


The subscribe/publish style is adopted for publishing service version change information in order to report updating information more quickly and reduce the network load. The proxy service on every grid node subscribes to the updating component information service for service version change information. When the proxy service receives such information, it takes actions according to the relationship between the current service on the node and the new service. This style is more efficient than the continuous request/report method used in traditional updating systems: it lets the nodes on which grid services are deployed focus on the main computing task without actively querying for new service version information all the time.
As shown in Fig. 4, the proxy service is activated as a basic service when the grid service container starts running. At the same time, the proxy service actively


subscribes to the service version change information of the updating component information service. The new version of a grid service is stored in the grid database, and the URL of the database is held in a resource of the updating component information service. When the proxy service becomes aware of new service information for the current service, it requests GridFTP, using the URL information, to transmit the software data to the current node and deploys it. It reports the intermediate updating states to the updating information service through the request/report method; these states are also represented by a resource that holds the URL of the database storing them.

Fig. 4. Interaction model in updating platform
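The subscribe/publish interaction of Fig. 4 can be sketched as follows; the listener interface and class names are hypothetical stand-ins, since GT4 would realize this with WS-Notification rather than in-process callbacks.

import java.util.ArrayList;
import java.util.List;

// Minimal sketch: proxy services subscribe once and are notified of version
// changes instead of polling the information service.
public class UpdatingNotificationSketch {

    record VersionChange(String serviceName, int newVersion, String softwareDataUrl) {}

    interface VersionChangeListener { void onVersionChange(VersionChange change); }

    // Stands in for the updating component information service.
    static class UpdatingComponentInformationService {
        private final List<VersionChangeListener> subscribers = new ArrayList<>();
        void subscribe(VersionChangeListener listener) { subscribers.add(listener); }
        void publish(VersionChange change) {
            for (VersionChangeListener listener : subscribers) listener.onVersionChange(change);
        }
    }

    public static void main(String[] args) {
        UpdatingComponentInformationService infoService = new UpdatingComponentInformationService();
        // The proxy service on a node subscribes when the container starts.
        infoService.subscribe(change ->
                System.out.println("fetch " + change.softwareDataUrl()
                        + " via GridFTP and deploy " + change.serviceName()
                        + " version " + change.newVersion()));
        // Publishing one notification reaches every subscribed node.
        infoService.publish(new VersionChange("ExampleService", 2, "gridftp://host/db/example-v2"));
    }
}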

2.5 Scheduling of Service Management


We could add filters to the model to determine the subset of nodes that need to upgrade; adding filters is enough to allow restructuring a system in arbitrary ways.
In order to make dynamic grid service updating more efficient, this paper adopts dynamic updating scheduling based on monitoring the load of the nodes. A performance evaluation model is built from the CPU frequency, CPU load, memory capacity, occupied proportion of memory, disk capacity, and occupied proportion of disk. This model makes the updating procedure more efficient and reduces service interruption. We adopt the following formulas to define the evaluation.

Φ = Φ_CPU + Φ_MEM + Φ_DISK    (1)

Φ_CPU = p_CPU_Freq · λ_CPU_Freq + p_CPU_Load · λ_CPU_Load    (2)

Φ_MEM = p_MEM_Cap · λ_MEM_Cap + p_MEM_Occupied · λ_MEM_Occupied    (3)


Φ_DISK = p_DISK_Cap · λ_DISK_Cap + p_DISK_Occupied · λ_DISK_Occupied    (4)

In the above formulas, Φ is the final evaluation parameter, Φ_CPU is the CPU evaluation parameter, Φ_MEM is the memory evaluation parameter, and Φ_DISK is the disk evaluation parameter; each p is the coefficient of the corresponding λ, and each λ is a load parameter derived by the monitor.
Through the evaluation of these parameters, the updating administrator or the updating management system orders the nodes accordingly and selects the lightly loaded nodes, within a specified scope, to update to the new version of the service first. The updating is completed by the same rule until every service in the system has been updated.
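A minimal sketch of this load-based ordering is given below, under the assumption that a lower Φ value indicates a lighter load; the field names and the weighting coefficients are illustrative only.

import java.util.Comparator;
import java.util.List;

// Minimal sketch of formulas (1)-(4) and of ordering nodes so that lightly
// loaded nodes are updated to the new service version first.
public class UpdateSchedulingSketch {

    record NodeLoad(String nodeId, double cpuFreq, double cpuLoad,
                    double memCap, double memOccupied,
                    double diskCap, double diskOccupied) {}

    // Coefficients p for the monitored load parameters (illustrative values).
    static final double P_CPU_FREQ = 0.2, P_CPU_LOAD = 0.3;
    static final double P_MEM_CAP = 0.1, P_MEM_OCCUPIED = 0.2;
    static final double P_DISK_CAP = 0.1, P_DISK_OCCUPIED = 0.1;

    /** The final evaluation parameter Phi of one node, formulas (1)-(4). */
    static double evaluate(NodeLoad n) {
        double phiCpu = P_CPU_FREQ * n.cpuFreq() + P_CPU_LOAD * n.cpuLoad();
        double phiMem = P_MEM_CAP * n.memCap() + P_MEM_OCCUPIED * n.memOccupied();
        double phiDisk = P_DISK_CAP * n.diskCap() + P_DISK_OCCUPIED * n.diskOccupied();
        return phiCpu + phiMem + phiDisk;
    }

    /** Nodes sorted by increasing Phi, i.e., the order in which they are updated. */
    static List<NodeLoad> updateOrder(List<NodeLoad> nodes) {
        return nodes.stream()
                .sorted(Comparator.comparingDouble(UpdateSchedulingSketch::evaluate))
                .toList();
    }
}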
2.6 States Management
State management reorganizes a service's persistent state from the representation required by the old service to that required by the new service, and from the representation of the current service to those required by the past and future simulation services. Thus, client services do not notice that the service has been upgraded, except that clients of the new type may see improved performance and fewer rejected requests, while clients of the old type may see decreased performance and more rejected requests. We adopt checkpointing technology [4][5][10][11] and process migration technology [12] to save the service states and to recover them for the new version of the service.
2.7 Updating Transactions of Grid-Based System
Because the updating procedure may fail, recovery from updating failure must be considered. An updating transaction is adopted to manage such failures. The intermediate updating state is stored in the database through resources of the updating information service.
When a system updating fails, the administrator or the updating management system recovers the system to its original point from the intermediate updating states stored in the database, so that the updating appears never to have happened. The system updating procedure is therefore an atomic transaction [9]. Checkpointing technology [4][5][10][11] and process migration technology [12] are adopted for state saving and state recovery.

3 Prototype and Analysis


In order to validate the method, we built a grid platform infrastructure that supports dynamic updating of grid services. GT4 [2] is adopted as the software platform, and the services are developed conforming to the WSRF [3] specification. Updating scheduling based on monitoring computing resources with WS-MDS [2] makes the updating procedure more efficient by selecting a more optimal subset of grid system nodes to update. The physical environment is shown in Fig. 5.


Fig. 5. Grid environment supporting service updating

4 Summary and Future Work


This paper has presented a dynamic updating method for grid services in a grid environment; a proxy service is introduced to forward service requests, and transparency between services is achieved by introducing it. Coexistence of multiple versions at the same time is supported by introducing a simulation service, which is responsible for simulating interface behavior and transferring state formats between different versions.
For state transfer, we adopt a mature state transfer method used in other updating systems. In the future, we will research a state transfer mechanism better suited to systems constructed from grid services.
Acknowledgments. This paper is supported by Grant 60673116 of the National Natural Science Foundation of China and Grant 2006AA01Z166 of the National High Technology Research and Development Program of China (863).

References
1. Foster, I., Kesselman, C., Tuecke, S., The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International Journal of Supercomputer Applications, 2001.3, Vol. 15(3), pp. 200–222
2. Globus Toolkit 4.0. http://www.globus.org/, 2006.11
3. WSRF-The WS-Resource Framework. http://www.globus.org/wsrf/, 2006.5
4. Michael Hicks. Dynamic Software Updating. PhD thesis, Computer and Information
Science, University of Pennsylvania, 2001


5. Peter Ebraert, Yves Vandewoude, Theo D'Hondt, Yolande Berbers. Pitfalls in unanticipated dynamic software evolution. Proceedings of the Workshop on Reflection, AOP and Meta-Data for Software Evolution (RAM-SE'05), 41–51
6. Yang Fu-qing, Mei Hong, Lu Jian, Jin Zhi. Some Discussion on the Development of Software Technology. Acta Electronica Sinica (in Chinese), 2002, 30(12A):1901–1906
7. I. Foster, C. Kesselman. The Grid: Blueprint for a New Computing Infrastructure. San Francisco: Morgan Kaufmann, 1998
8. Ian Foster, Carl Kesselman, et al. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. http://www.globus.org/reserch/papers/ogsa.pdf
9. Iulian Neamtiu, Michael Hicks, Gareth Stoyle, Manuel Oriol. Practical Dynamic Software Updating for C. Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI 2006), pp. 72–83.
10. G. Bronevetsky, M. Schulz, P. Szwed, D. Marques, and K. Pingali. Application-level checkpointing for shared memory programs. In Proc. ASPLOS, 2004.
11. J. S. Plank. An overview of checkpointing in uniprocessor and distributed systems, focusing on implementation and performance. Technical Report UT-CS-97-372, Computer Science Department, the University of Tennessee, 1997.
12. J. M. Smith. A survey of process migration mechanisms. ACM Operating Systems Review, SIGOPS, 22(3):28–40, 1988.

Software Product Line Oriented Feature Map


Yiyuan Li, Jianwei Yin, Dongcai Shi, Ying Li, and Jinxiang Dong
College of Computer Science and Technology, Zhejiang Univ., Hangzhou 310027, China
zjulyy@yahoo.com.cn, zjuyjw@zju.edu.cn, shidcai@163.com,
cnliying@zju.edu.cn, djx@zju.edu.cn

Abstract. The core idea of software product line engineering is to develop a reusable infrastructure that supports the software development of a family of products. On the basis of domain analysis, feature modeling identifies the commonalities and variability of software products in terms of features, providing an acknowledged abstraction for the various stakeholders. The concept of the feature map is proposed to improve the feature model: it supports customized feature dependencies and constraint expressions and provides the capability to navigate to and locate the resource entities of features. Ontology is introduced as the representation basis for the meta-model of feature maps. By selecting features to construct the reusable infrastructure, the components implementing the features are rapidly located and assembled to produce a family of software products meeting certain dependencies and constraints.
Keywords: Variability, Feature map, Resource navigation, Ontology.

1 Introduction
Currently, software manufacturing suffers from problems such as individually customized requirements and frequent changes in business requirements. As a result, the traditional software development mode, in which a software product is developed specifically for the requirements of a certain application, costs more and offers less efficiency and maintainability; it is hard for this mode to meet the requirements of software development in a large-scale customization environment. The purpose of software production for mass customization is to produce and maintain a family of software products with similar functions, to figure out both their commonalities and their variability, and to manage these features [1]. It represents the trend of the software factory's evolution.
The software product line is an effective way to implement software production for mass customization. It is a set of software systems with common, controllable features. The core idea of software product line engineering is to develop a reusable infrastructure that supports the software development of a family of products [2]. A software product line typically consists of a product line architecture, a set of components, and a set of products [3]. The characteristic of software development applying software product line principles is to maintain common software assets and reuse them during the development process, such as the domain model, software


architecture, process model, components, etc. Each product derives its architecture from the product line architecture, instantiates and configures a subset of the product line components, and usually contains some product-specific code. The instantiated products constitute a family of software products in the domain.
Feature modeling is the mainstream of domain analysis for software product lines. Its main purpose is to identify all commonalities and variability in the software product line; the outputs of feature modeling are all potential products of the product line [4]. FORM [5] is a well-known feature-based development method. The difference between domain products and family products shows the variability of the software product line [2]. The variability point model [6, 7] models the variability of a software product line in four ways. The complex dependency relationships among variability points are presented as first-order expressions [8]. From the viewpoint of software configuration management, the variability management of a software product line can be divided into nine sub-modules along two dimensions [9].
By analyzing the deficiencies of current feature modeling and its description languages, this paper proposes an expanded feature modeling for software product lines, the feature map. It improves the description of feature dependencies and restriction expressions and supports quick navigation to the feature resource artifacts of a software product line in a distributed collaborative development environment. Its meta-model is also presented.

2 Feature Map
A feature is a first-order entity in a domain. It expresses capabilities or specialties owned by systems. It is the only determinate abstraction in the domain and can be understood simultaneously by domain experts, users, and developers. To a certain extent, a feature is an expression of the ontology knowledge of the application domain.
2.1 Deficiency of Feature Model
Feature modeling identifies the commonalities and variability of all products in a software product line through an analysis of domain features and their relationships. A domain reference architecture can be built according to the feature model, and the constituent units of the architecture can be bound to related component entities. However, existing feature models and their description techniques have several deficiencies.
Firstly, each domain may have its own feature interaction relations due to its variety, and these relations are indeterminate. Although existing feature models summarize and analyze the usual feature relations, they cannot fully describe all domain-related feature dependency relations. Secondly, existing feature models tend to be built around the functions of domain systems, forming functional features; they seldom consider non-functional domain features such as performance, cost, and throughput, and they lack effective means of description and expression. Thirdly, domain feature analysis runs through all phases of the software development life cycle and refers to many resource entities such as requirement specifications, design models, and component entities. Existing feature models only discuss the production of software products from the viewpoint of feature selection; they ignore the problem of feature instantiation, including the selection and locating of the resource entities related to domain features. Fourthly, there may exist more than one component entity that implements


the functions presented by a certain feature. Existing feature models ignore the variability brought about by the choice of feature implementation scheme.
It is therefore necessary to expand existing feature models to improve their ability to model and describe feature dependency relationships, non-functional feature constraints, feature resource navigation, and the variability of the domain.
2.2 Definition of Feature Map
This paper proposes the concept of the feature map. It supports feature dependency relationships and restriction expressions and provides the capability of locating and navigating resource entities, so that features can be selected according to specified requirements, resource entities can be located and assembled quickly, and a software product family satisfying the dependency relationships and restriction conditions can be generated.
A feature map can be defined as a tuple FM = (F, A, C, R, φ_A, φ_C, φ_R), where:
- F is the feature set of the feature map;
- A is the feature association set of the feature map;
- C is the feature constraint expression set of the feature map;
- R is the feature resource entity set of the feature map;
- φ_A denotes a mapping from F to the set P(A), i.e., φ_A: F → P(A), where P(A) represents the set of all subsets of A. φ_A meets the following condition:

∀a ∈ A, ∃F′ ⊆ F with |F′| ≥ 2 such that ∀f ∈ F′, a ∈ φ_A(f).

This means that an arbitrary feature can have multiple dependency relationships with other features, and that each feature association involves at least two features.
- φ_C denotes a mapping from F to the set P(C), i.e., φ_C: F → P(C), where P(C) represents the set of all subsets of C. φ_C meets the following condition:

∀c ∈ C, ∃F′ ⊆ F with |F′| ≥ 1 such that ∀f ∈ F′, c ∈ φ_C(f).

That is to say, an arbitrary feature can be restricted by multiple constraint expressions, while each feature constraint can be specified for either a single feature or a set of features.
- φ_R denotes a mapping from F to the set P(R), i.e., φ_R: F → P(R), where P(R) represents the set of all subsets of R. φ_R meets the following condition:

∀f ≠ f′ ∈ F, φ_R(f) ∩ φ_R(f′) = ∅ and ∪_{f ∈ F} φ_R(f) = R.

That is to say, each feature owns its own resource entities.
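Assuming that features, associations, constraints, and resources can be identified by plain strings, the tuple above might be sketched in Java as follows (hypothetical names, not part of the paper's tooling); the three maps play the roles of the mappings into P(A), P(C), and P(R).

import java.util.Map;
import java.util.Set;

// Minimal sketch of the feature-map tuple FM.
public record FeatureMapSketch(
        Set<String> features,                    // F
        Set<String> associations,                // A
        Set<String> constraints,                 // C
        Set<String> resources,                   // R
        Map<String, Set<String>> associationsOf, // feature -> subset of A
        Map<String, Set<String>> constraintsOf,  // feature -> subset of C
        Map<String, Set<String>> resourcesOf) {  // feature -> disjoint subset of R

    /** Checks that every association involves at least two features. */
    public boolean associationsWellFormed() {
        return associations.stream().allMatch(a ->
                associationsOf.values().stream().filter(s -> s.contains(a)).count() >= 2);
    }
}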
It can thus be concluded that the concept of the feature map consists of two parts. On the one hand, the feature map expands existing feature models to construct its infrastructure and foundation by refining the definition of feature dependency relationships in existing feature models and by enriching feature constraint expressions to strengthen the feature configuration relationships. On the other hand, the feature map builds its superstructure by introducing the resource entities of features and providing the capability to rapidly navigate to and locate them. With these two parts tightly combined, by selecting features to construct the reusable infrastructure, the


component entities implementing the features are rapidly located and assembled to produce a family of software products meeting certain dependencies and constraints.
2.3 Meta-model of Feature Map

Features, together with their dependency relationships, constraint expressions, and resource entities, are abstracted as the basic elements of the meta-model. Corresponding to the web ontology language OWL, the modeling elements of the meta-model can be divided into ontology class elements, object property elements, data property elements, and data type elements. Among them, an ontology class element represents a semantic principal; an object property element represents an association relationship among ontology class elements in the form of an object property of an ontology class element, and both its domain and its range are ontology class elements; a data property element represents the non-functional characteristics of an ontology class element, with an ontology class element as its domain and a data type element as its range.

Fig. 1. The Meta Model of Feature Map Based on Ontology

The ontology-based meta-model of the feature map is described in Fig. 1. Feature, FeatureBind, Association, Constraint, and Resource, etc., are defined as ontology classes; restrictsObject, hasResource, playedBy, and hasRole, etc., are defined as ontology object properties to establish the relation network of the semantic principals; and name, param, and location, etc., are defined as data properties to describe the feature properties of the semantic principals. The meanings of the main meta-model elements are described as follows:


Feature: the ontology expression of a feature definition in the feature map; it is a common or variable system characteristic that can be observed externally. A Feature ontology instance is identified by a unique global name.
FeatureBind: the ontology class of feature binding; it is associated with the binding mode and binding time through the bindMode and bindTime object properties, respectively.
Mode: the binding mode of a feature, including the mandatory, optional, or, alternative, and exclude modes. Classified by whether a binding mode is affected by that of other features, mandatory and optional are unary binding modes, while or, alternative, and exclude are multiple binding modes. Classified by the variability of features, only features marked mandatory are common and indispensable, while those marked otherwise are optional features depending on the specific software product.
Time: the binding time of a feature; it is meaningful only for the variable features marked optional, or, alternative, or exclude. Its value can be design time, compile time, implementation time, assembly time, load time, instantiation time, runtime, etc.
Resource: the expression of a feature resource. It marks the software product development phase that produces the resource via the belongsTo object property, which associates it with the Phase ontology class, and it indicates the type of entity object referenced by the resource via the type object property, which associates it with the ResourceType ontology class. The resource type is determined by the phase of software product development. The entities referenced by a resource may be located anywhere in the distributed network environment and can be navigated to by URI through the location object property.
Phase: the stages of software product development, including requirements, design, implementation, test, and maintenance. Although software product line engineering based on feature modeling is macroscopically similar to traditional software engineering oriented to single-product development in how it defines the phases of software development, the two differ dramatically in the concrete approach and the details of each phase [10].
ResourceType: it can be a requirements analysis document, a model/flow design, a component artifact, etc., depending on the phase of software product development during which the resource is produced.
Constraint: the non-functional restrictions on features. A constraint expression consists of a set of parameters, operators, and variables. A Constraint is associated with the Feature ontology class through the restrictsObject object property, which identifies the restricted object. A constraint can be defined on the property set of a single feature, or it can include multiple features as restricted objects and build a feature constraint relationship under the overall restriction.
Association: a relationship between features. It is associated with the AssociationType ontology class through the type object property to determine the relation type, and with the Role ontology class through the hasRole object property to determine the objects referred to by the association. It is built on at least two associated objects.
AssociationType: the type of an association, including composed-of, implemented-by, require, generalization/specialization, and activate. Associations have orientations; composed-of, generalization/specialization, and implemented-by are structural associations, while require and activate are reference associations.


Role: the object referred to by an association. It is associated with the Feature ontology class through the playedBy object property to determine the actual feature that assumes the role, and with the RoleType ontology class via the type object property to indicate the role type. The assignment of the role type determines the orientation of the association.
RoleType: the type of a role. Its actual range is decided by the type of the association accompanying the role.
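For illustration, the main meta-model elements listed above could be rendered as plain Java types instead of OWL classes; the names follow the ontology classes and properties of Fig. 1, but the concrete fields and their types are assumptions.

import java.util.List;

// Minimal sketch of the feature-map meta-model elements.
public class FeatureMetaModelSketch {

    enum Mode { MANDATORY, OPTIONAL, OR, ALTERNATIVE, EXCLUDE }
    enum Time { DESIGN, COMPILE, IMPLEMENTATION, ASSEMBLY, LOAD, INSTANTIATION, RUNTIME }
    enum Phase { REQUIREMENTS, DESIGN, IMPLEMENTATION, TEST, MAINTENANCE }
    enum AssociationType { COMPOSED_OF, IMPLEMENTED_BY, REQUIRE, GENERALIZATION_SPECIALIZATION, ACTIVATE }

    record FeatureBind(Mode bindMode, Time bindTime) {}
    record Resource(Phase belongsTo, String type, String location) {}  // location is a URI
    record Constraint(String expression, List<String> restrictsObject) {}
    record Role(String playedBy, String roleType) {}
    record Association(AssociationType type, List<Role> roles) {}
    record Feature(String name, FeatureBind bind,
                   List<Resource> resources, List<Constraint> constraints,
                   List<Association> associations) {}
}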
The hierarchy of the feature map is built by relationships such as composed-of, generalization/specialization, and implemented-by among features. Common features are represented by setting the binding mode to mandatory, while variable features are established by marking the binding mode as optional, or, alternative, or exclude. On the one hand, dependency and interaction among features are expressed by associations such as implemented-by, require, and activate, and the orientation of an association is determined by the role each feature takes within it. On the other hand, constraint expressions are built on the properties of a single feature or on the property set of a feature group. The resource entities related to the features in each development phase are navigated to in the network environment by location. In this way, the structural associations, dependency associations, and constraint conditions among features are completely established. Meanwhile, by adding instances of AssociationType, RoleType, and ResourceType, the meta-model can describe new associations and locate new resource entities; expansibility is thus available.
The variability of the feature map is represented in several aspects. Firstly, as far as binding mode and binding time are concerned, the former directly determines whether a feature is selected, while the latter determines when the optional features are instantiated. Secondly, relations among features such as require and activate determine whether the other features that have a dependency or interaction association with the present feature will be selected. Thirdly, constraint expressions determine the quantitative constraints on the property set of a single feature or feature group and, furthermore, affect the selection of component entities for feature implementation. Fourthly, on the basis of the navigation and locating of resource entities, software products instantiated by selecting resource entities with the same functions but different implementation plans will have different non-functional characteristics, such as performance and quality of service.

3 Case Study
Figure 2 shows the feature map of a mobile telephone software product line and its mapping to the meta-model. The mobile telephone software product line is composed of functional features such as password protection, game, telephone directory, and browser. Among them, password protection and browser are optional features. Multiple games can be chosen, but with some limitations: because of the small memory capacity, only one of G3 and G4 can be chosen. To be operational, the length of the password should be set to 6, the length of the list in the telephone directory should be no more than 250, and the memory required by the embedded browser should be less than 2 MB. In the process of feature analysis, each functional feature has a related requirements specification, design model, and implementation component. Some functional features, for example G2, even have several implementation schemes.


Functional features such as password protection, game, telephone book, and browser are modeled as Feature ontology instances; whether the selection of a feature is mandatory or optional is modeled with the Mode ontology; the maximum length of the password, the capacity of the telephone book, and the memory consumed by the browser are modeled as Constraint ontology instances; the hierarchical structure of features and the mutually exclusive relationship between G3 and G4 are modeled as Association ontology instances; requirements documents, design models, and component entities are modeled with the ResourceType ontology; and all life-cycle phases of software development are modeled with the Phase ontology. The whole infrastructure of the feature map is constructed by the associations among the ontology instances via object properties, while its superstructure is constructed by modeling the reference to a resource as the location property, which is used to navigate to and locate the resource entities.

Fig. 2. Feature Map and Its Meta-model of Mobile Telephone Software Product Line
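A small, self-contained Java sketch of part of this example follows; the types are hypothetical, and the constraint expressions are kept as plain strings rather than the ontology-based expressions of the meta-model.

import java.util.List;

// Minimal sketch of a fragment of the mobile telephone feature map.
public class MobilePhoneFeatureMapSketch {

    enum Mode { MANDATORY, OPTIONAL, ALTERNATIVE }
    record Constraint(String expression) {}
    record Feature(String name, Mode mode, List<Constraint> constraints, List<Feature> subFeatures) {}

    public static void main(String[] args) {
        Feature password = new Feature("password protection", Mode.OPTIONAL,
                List.of(new Constraint("length(password) == 6")), List.of());
        Feature directory = new Feature("telephone directory", Mode.MANDATORY,
                List.of(new Constraint("listLength <= 250")), List.of());
        Feature browser = new Feature("browser", Mode.OPTIONAL,
                List.of(new Constraint("requiredMemory < 2MB")), List.of());
        // G3 and G4 are mutually exclusive alternatives within the game feature.
        Feature game = new Feature("game", Mode.MANDATORY, List.of(),
                List.of(new Feature("G3", Mode.ALTERNATIVE, List.of(), List.of()),
                        new Feature("G4", Mode.ALTERNATIVE, List.of(), List.of())));
        Feature phone = new Feature("mobile telephone", Mode.MANDATORY, List.of(),
                List.of(password, directory, game, browser));
        System.out.println(phone.subFeatures().size() + " top-level features");
    }
}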

4 Conclusion
The core idea of software product line engineering is to develop a reusable infrastructure that supports the development of a family of software products; it is an efficient way to implement mass-customized software production. Feature modeling is the mainstream of domain analysis for software product lines. It identifies the commonalities and variability of the products of a product line in terms of features to provide an acknowledged abstraction to the various stakeholders. The uncertainty of the variable features determines the variability of the software product line. Existing feature models and their descriptions cannot entirely support the diversity of feature dependencies in


different domains. They do not support the modeling and description of constraint expressions and cannot navigate to and locate resources in a network environment. Moreover, their variability analysis does not consider the alternative component entities that implement the features. In this paper, the concept of the feature map has been proposed to improve the feature model, and ontology has been introduced as the representation basis for its meta-model. The feature map supports customized feature dependencies and constraint expressions and provides the capability to navigate to and locate the resource entities of features. Then, by selecting features to construct the reusable infrastructure, the components implementing the features are rapidly located and assembled to produce a family of software products meeting certain dependencies and constraints. Further work is to refine the feature map through studies and practice, including how to define and describe its related behavioral characteristics and state transfer.

References
1. Charles W. Krueger. Software Mass Customization. BigLever Software, Inc. (2001)
2. Michel Jaring, Jan Bosch. Representing Variability in Software Product Lines: A Case Study. Proceedings of the 2nd International Conference on Software Product Lines (SPLC'02), Springer Verlag LNCS 2379 (2002) 15–36
3. J. Bosch. Design & Use of Software Architectures - Adopting and Evolving a Product-Line Approach. Addison-Wesley (2000)
4. David Benavides, Pablo Trinidad, Antonio Ruiz-Cortes. Automated Reasoning on Feature Models. Proceedings of the 17th International Conference on Advanced Information Systems Engineering (CAiSE'05), Springer Verlag LNCS 3520 (2005) 491–503
5. Kang KC, Kim S, Lee J, Kim K, Shin E, Huh M. FORM: A Feature-Oriented Reuse Method with Domain-Specific Reference Architectures. Annals of Software Engineering (1998) 143–168
6. Jan Bosch, Gert Florijn, Danny Greefhorst. Variability Issues in Software Product Lines. Proceedings of the 4th International Workshop on Software Product Family Engineering (PFE'02), Springer Verlag LNCS 2290 (2002) 13–21
7. Diana L. Webber, Hassan Gomaa. Modeling Variability in Software Product Lines with The Variant Point Model. Elsevier (2003)
8. Marco Sinnema, Sybren Deelstra, Jos Nijhuis, Jan Bosch. COVAMOF: A Framework for Modeling Variability in Software Product Families. Proceedings of the 3rd International Conference on Software Product Lines (SPLC'04), Springer Verlag LNCS 3154 (2004) 197–213
9. Charles W. Krueger. Variation Management for Software Production Lines. Proceedings of the 2nd International Conference on Software Product Lines (SPLC'02), Springer Verlag LNCS 2379 (2002) 37–48
10. Kyo C. Kang, Jaejoon Lee, Patrick Donohoe. Feature-Oriented Product Line Engineering. IEEE Software, Volume 19, Issue 4, July-Aug (2002) 58–65

Design and Development of Software Configuration Management Tool to Support Process Performance Monitoring and Analysis
Alan Cline1, Eun-Pyo Lee2, and Byong-Gul Lee2
1 Ohio State University, Department of Computer Science and Engineering, Columbus, Ohio, USA
acline@carolla.com
2 Seoul Women's University, Department of Computer Science, Seoul, Korea
eun-pyo@hanmail.net, byongl@swu.ac.kr

Abstract. Most SCM tools underestimate their potential for monitoring and reporting the performance of various software process activities and delegate the implementation of such capabilities to other CASE tools. This paper discusses how an SCM tool can be extended and implemented to provide valuable SCM information (e.g., metric data) for monitoring the performance of various process areas. With the extended SCM tool capability, stakeholders can measure, analyze, and report the performance of process activities even without using expensive CASE tools.
Keywords: Software Configuration Management, Process Metric.

1 Introduction
Software Configuration Management (SCM) is a key discipline for the development and maintenance of large and complex software systems [1], [2]. Much research shows that SCM is the most basic management activity for establishing and maintaining the integrity of the software products produced throughout the software life cycle. The activities of SCM include identifying configuration items/units, controlling changes, maintaining the integrity and traceability of the configuration items, and auditing and reporting the configuration management status and results.
Existing configuration management tools support some or a combination of these activities with the help of functions including change control, version control, workspace management, and build/release control [3]. However, most SCM tools underestimate the benefits of using the SCM metric capability for monitoring and measuring other process areas and make its implementation depend on other CASE tools such as project management tools or spreadsheet software [4]. We believe that the SCM tool capability can be extended to provide valuable services and information for monitoring and measuring the performance of various process activities, such as project management, requirements management, or software quality assurance. For example, to monitor the stability of requirements (SR) during


requirements management, some of the primitive metric data from the change control function can be utilized as follows:

SR = NCR / (Σ_i InitialRequirements_i × T_i)    (1)

where NCR represents the total number of change requests on a requirement and T_i is the time interval between the initial check-in and the final release of a requirement.
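Reading formula (1) as a sum over the initial requirements, a minimal Java sketch of the computation might look as follows; the record name, its fields, and the choice of days as the time unit are assumptions rather than part of CMPT.

import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Minimal sketch of the requirements-stability metric SR from change-control data.
public class RequirementsStabilitySketch {

    record RequirementHistory(int initialRequirements, Instant initialCheckIn,
                              Instant finalRelease, int changeRequests) {}

    /** SR = NCR / (sum_i InitialRequirements_i * T_i), with T_i measured in days. */
    static double stability(List<RequirementHistory> histories) {
        int totalChangeRequests = histories.stream()
                .mapToInt(RequirementHistory::changeRequests).sum();
        double denominator = histories.stream()
                .mapToDouble(h -> h.initialRequirements()
                        * Duration.between(h.initialCheckIn(), h.finalRelease()).toDays())
                .sum();
        return totalChangeRequests / denominator;
    }
}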
Current SCM tools rarely provide a mechanism for incorporating the management and utilization of SCM metrics into other process areas. They usually define only a small set of primitive SCM metrics and leave it to the user to utilize them as needed. Besides, current tools have no way of monitoring and measuring the performance of a newly defined process or of anything outside the defined scope. For example, a user might be interested in measuring the effectiveness of the training plan prepared for an organization, or a senior manager might want to create a new process role with a new responsibility and see how the jobs get done. In such situations, the user tends to purchase a separate tool fit to that purpose.
This paper describes the design and development of an SCM tool that can provide measurement, analysis, and reporting of process performance in the broader process areas. The remainder of this paper is organized as follows. Section 2 reviews related work by comparing the capabilities of existing SCM tools with respect to metric utilization. Sections 3 and 4 describe the design and usage of our SCM tool in supporting the metric capability. Finally, the conclusion and future work appear in Section 5.

2 Current Configuration Management Tools


The features of current SCM tools are limited to supporting only a few areas, such as requirements management or project tracking and oversight; covering the entire set of software process activities then requires purchasing specialized tools.
IBM's ClearQuest provides a workflow management capability and enforces process monitoring. However, for this service to be useful, it must be integrated with another tool, Portfolio Manager, for profiling process information from outside [5]. Telelogic's Synergy, integrated with Dashboard, intends to support the project manager's decision making by automating the collection, analysis, and reporting of measurement data; however, the tool focuses only on automating requirements management process data [6]. Borland's StarTeam offers a comprehensive solution that includes integrated requirements management, change management, defect tracking, and project and task management [7]; however, its primary usage is limited to SCM only, and it depends on external CASE tools to utilize the SCM information in other process activities. The Continuus/CM toolset also cannot be used in a complicated process environment because of its lack of process-related information [8], [9]. The study in [10] supports our view by stating that current SCM tools have: 1) no support for the planning function; 2) insufficient support for process-related functions; 3) no support for report and audit mechanisms; and 4) no support for measurement.


3 Development of Configuration Management Process Tool


To lessen the problems of existing SCM tools, we propose the Configuration Management Process Tool (CMPT), which implements features that let the software project team monitor and analyze their process performance by utilizing SCM metrics for each process area. CMPT is implemented in Java and runs on a single-user workstation connected to a remote shared CVS server.
[Figure: the context diagram of the CMPT, showing the data flows between the tool and the roles Upper Management, Trainers, Requirements Manager, Project Manager, QA Manager, Test Manager/Tester, Project Leader/Developer, and Configuration Manager, e.g., documents to be controlled, change requests and requirement elements, defects found and repaired, version requests, and change metrics such as defect MTTR, product stability (MTBF), and schedule variance.]

Fig. 1. Context Diagram of the CMPT

Figure 1 illustrates the scope of CMPT and the data flows into and out of the
system. The requirements manager, for instance, inputs the requirement artifacts for
controlling and monitoring of use cases and object model classes. CMPT can deliver
to the requirements manager status reports containing the number of defects, the
number of changes and change requests, the change owner and date, and the status of
change requests.
3.1 Change Control Process
Among the many features of CMPT, change control provides the base playground
for monitoring and collecting the various process data. The state flow diagram in
Figure 2 represents the invariants, pre-conditions, and post-conditions of the change
control process.
The change control flows are divided into two modes: free text modules and
executable modules. The free text module (FTM) elements (e.g., Proposal, Charter, Requirement
Specification, Design, Test Plan, Training Plan, etc.) can have several states:
REQUEST PROPOSED, APPROVED, COMPLETE, and some intermediate states.
Executable modules (e.g., Change Requests, Defects, Code, Test Cases, and Use
Cases) go through seven states: REQUEST PROPOSED, APPROVED, ASSIGNED,
IMPLEMENTED, TEST READY, TESTED, and COMPLETE. CMPT is
designed to monitor and collect the status information for each of these states to provide
more accurate and richer metrics for other process areas.
[Figure 2 (state flow diagram, omitted) shows how checked-in change requests (CRs), free text modules (FTMs), defects, and incidents move through proposal, CCB approval or rejection, developer/tester/author assignment, implementation, test readiness, testing, CCB revision approval or resubmission, and completion.]

Fig. 2. Flow Items of the SCM Change Process
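To make the state information that CMPT collects at each transition more concrete, the following minimal Python sketch models the seven-state flow of an executable module and timestamps every transition. The state names come from the text above; the transition table, class, and method names are simplifying assumptions, not the CMPT implementation:

from datetime import datetime

# Seven states of an executable module (Change Request, Defect, Code, Test Case, Use Case).
TRANSITIONS = {
    "REQUEST PROPOSED": {"APPROVED"},
    "APPROVED": {"ASSIGNED"},
    "ASSIGNED": {"IMPLEMENTED"},
    "IMPLEMENTED": {"TEST READY"},
    "TEST READY": {"TESTED"},
    "TESTED": {"COMPLETE", "ASSIGNED"},   # assumed loop back after a failed test
    "COMPLETE": set(),
}

class ExecutableModule:
    """Tracks the state changes of one configuration item and timestamps them."""

    def __init__(self, item_id):
        self.item_id = item_id
        self.state = "REQUEST PROPOSED"
        self.history = [(self.state, datetime.now())]

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        # the timestamped history is the raw material for the SCM metrics
        self.history.append((new_state, datetime.now()))

cr = ExecutableModule("CR-17")
for s in ["APPROVED", "ASSIGNED", "IMPLEMENTED", "TEST READY", "TESTED", "COMPLETE"]:
    cr.move_to(s)
print([state for state, _ in cr.history])

Recording a timestamp per transition is what later allows time-interval metrics, such as the time a request spends between ASSIGNED and TESTED, to be derived.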

3.2 User Profiles and Transaction Permissions

CMPT provides an access control mechanism, since it can be utilized by different
participants in various process areas. Table 1 shows the transactions CMPT
supports for each user. The Project Manager inputs project documents to be controlled, and the
test and development schedule, and in return wants change metrics, repair
schedules, development and change statuses, and version change summaries for the
product from CMPT. The Requirements Manager may input requirement elements for
control and monitoring and want CMPT to deliver reports containing the number of
defects, the number of changes and change requests, the change owner and date, and the
status of change requests and repairs. The Test Manager and Tester may want CMPT
to generate defect repair reports (change metrics) and the test schedule variance. Both
Developers and Project Leaders may want CMPT to produce merged product
versions, code files, traceable version links, and change reports. CMPT can provide
the Configuration Manager with version or configuration IDs and configuration status
profiles to help coordinate the various SCM activities. CMPT also provides an
audit mechanism to enforce the CM guidelines and policies as planned. For the QA
Manager, CMPT provides various QA metrics for each process area under the control
of SCM.
Table 1. Example of Each User's Transaction Permissions
[Table omitted: its rows list the transactions (get metric report, check out item, change request, approve change request) and its columns list the user roles (PM, RM, QM, CM, TM, Project Leader/Developer, Tester, Trainer, General Users); the permission marks in the cells did not survive extraction.]
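The role and transaction names below are those listed in Table 1. Since the permission marks themselves were not preserved, the assignments in this small Python sketch are illustrative assumptions only (except that any team member may propose a change request, as stated in Section 4):

ROLES = ["PM", "RM", "QM", "CM", "TM", "ProjectLeader/Developer",
         "Tester", "Trainer", "GeneralUser"]

permissions = {
    "get_metric_report": {"PM", "RM", "QM", "CM", "TM"},                 # assumption
    "check_out_item": {"ProjectLeader/Developer", "Tester", "Trainer"},  # assumption
    "change_request": set(ROLES),                                        # any team member (Sect. 4)
    "approve_change_request": {"CM"},                                    # assumption
}

def is_allowed(role, transaction):
    """Return True if the given role may perform the given transaction."""
    return role in permissions.get(transaction, set())

print(is_allowed("Tester", "check_out_item"))               # True
print(is_allowed("GeneralUser", "approve_change_request"))  # False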

3.3 Process Metric

The key to process metrics analysis is to analyze the actual effort, schedule,
deviation of cost and plan, and defects during the project [11], [12]. Table 2 shows a
sample of such metrics provided in [11]. In CMPT, these metrics can be calculated
from the Cartesian product of the fine-grained SCM events (e.g., check-in, release, check-out, change request, change, change completion, etc.) and scale measurements
(e.g., number of events, frequency of events, time interval, average time, etc.).
Table 2. SCM Process Metrics

Schedule variance: (Actual duration - Planned duration) / Planned duration
Effort variance: (Actual effort - Planned effort) / Planned effort
Size variance: (Actual size - Planned size) / Planned size
Change stability: Number of change requests / Total number of baseline items
Change density: Number of changes for each baseline / Time span between check-in and check-out
Residual change density: Number of changes completed / Number of changes requested
Change distribution in each development phase: Number of changes in each phase / Total number of changes
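As an illustration of how such metrics can be computed from planned/actual figures and change counts, the following minimal Python sketch implements a few of the equations in Table 2. The function names and the sample numbers are hypothetical and are not taken from CMPT:

def schedule_variance(actual_duration, planned_duration):
    # (Actual duration - Planned duration) / Planned duration
    return (actual_duration - planned_duration) / planned_duration

def effort_variance(actual_effort, planned_effort):
    # (Actual effort - Planned effort) / Planned effort
    return (actual_effort - planned_effort) / planned_effort

def change_stability(num_change_requests, num_baseline_items):
    # Number of change requests / Total number of baseline items
    return num_change_requests / num_baseline_items

def residual_change_density(changes_completed, changes_requested):
    # Number of changes completed / Number of changes requested
    return changes_completed / changes_requested

print(round(schedule_variance(23, 20), 2))       # 0.15  (15% late)
print(round(effort_variance(38, 40), 2))         # -0.05 (5% under plan)
print(round(change_stability(12, 48), 2))        # 0.25
print(round(residual_change_density(9, 12), 2))  # 0.75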

4 Work Scenarios of CMPT

This section describes how CMPT can be utilized to produce metrics for monitoring
other process activities. Figure 3 shows that CMPT facilitates a set of transactions and
access permissions associated with each user's role. A user can define and customize
various roles and transactions according to their process conditions and environment.
Fig. 3. Setting SCM Permissions

Figures 4 and 5 show how a user can retrieve or store a configuration item. For
checking out a configuration item (Figure 4), CMPT uses exclusive locking
and shows a lock icon for items that have already been checked out. For checking
in (Figure 5), a user can check in either free-text documents or executable elements.
In CMPT, both the check-in and the check-out events are combined with the scale
measurements (e.g., time, frequency, number, etc.) to produce finer-grained
SCM metrics, such as the number of check-ins or the time of check-out.

Fig. 4. Check-out of Configuration Item

Fig. 5. Check-in of Configuration Item

Any project team member can propose a change request for adding new item(s),
or for replacing or deleting existing item(s), together with the reasons and the expected completion
time (Figure 6). All element types of new items are pre-defined by the Configuration
Manager. Once a Change Request (CR) is proposed, the CCB reviews the CR or
defect for approval or disapproval. If the CR or defect is approved, the Project Manager
and Test Manager assign the request to developers and testers for implementation.


Fig. 6. Create a Change Request Proposal

A user can get a metrics report for a selected configuration item together with a graph
(Figure 7). The graph view can provide a summary of various process performances.
Users can select a report type from predefined types such as configuration status,
history, tracking, or release report. The scale can be chosen from the number of events,
frequency of events, time interval, and average time, and the metrics can be chosen
from check-in, release, check-out, change request, change, and change completion of
configuration items. Figure 7 indicates the frequency of change requests for each use
case. In this case, use case no. 2 shows the highest frequency of change requests.

Fig. 7. Get Metrics Report

5 Conclusion and Future Work

SCM's capability has to be extended to provide valuable services for monitoring other
process or management activities, such as project management or software quality
assurance. Current SCM tools rarely provide such a monitoring capability, nor do they provide
sufficiently useful fine-grained data. This paper described the design and development of
an SCM tool which can provide the measurement, analysis, and reporting capability
needed to monitor other process performance without using expensive CASE tools. The
proposed CMPT tool can:
1. define and customize the access/role control and associated transaction
selection,
2. define and customize the process work flow, and
3. utilize the work flow status information and metrics to provide process
performance information.
Currently, CMPT's process customization capability only applies to change
control. For future work, the customization scheme should be extended and enhanced
to accommodate various process characteristics and project environments. More studies
should also focus on the reporting, including graphing, extraction, and
translation of process metric data.
Acknowledgments. This research was supported by the MIC (Ministry of
Information and Communication), Korea, under the ITRC (Information Technology
Research Center) support program supervised by the IITA (Institute of Information
Technology Advancement) (IITA-2006-(C1090-0603-0032)).

References
1. S. Dart: Concepts in Configuration Management Systems, Proc. Third Int'l Software
Configuration Management Workshop (1991) 1-18
2. D. Whitgift: Methods and Tools for Software Configuration Management, John Wiley and
Sons (1991)
3. A. Midha: Software Configuration Management for the 21st Century, TR 2(1), Bell Labs
Technical (1997)
4. Alexis Leon: A guide to software configuration management, Artech House (2000)
5. IBM Rational: ClearCase, http://www-306.ibm.com/software/awdtools/changemgmt/
(2006)
6. Peter Baxter and Dominic Tavassoli: Management Dashboards and Requirement
Management, White Paper, Telelogic (2006)
7. Borland: Starteam, http://www.borland.com/us/products/starteam/index.html.
8. Continuus Software/CM: Introduction to Continuus/CM, Continuus Software Corporation
(1999)
9. Merant: PVCS, http://www.merant.com/products/pvcs/
10. Fei Wang, Aihua Ren: A Configuration Management Supporting System Based on
CMMI, Proceedings of the First International Multi-Symposiums on Computer and
Computational Sciences, IEEE CS (2006)
11. R. Xu, Y. Xue, P. Nie, Y. Zhang, D. Li: Research on CMMI-based Software Process
Metrics, Proceedings of the First International Multi-Symposiums on Computer and
Computational Sciences, IEEE CS (2006)
12. F. Chirinos and J. Boegh: Characterizing a data model for software measurement,
Journal of Systems and Software, v. (74), Issue 2 (2005) 207-226

Data Dependency Based Recovery Approaches in Survival Database Systems
Jiping Zheng1,2, Xiaolin Qin1,2, and Jin Sun1
1 College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
2 Institute of Information Security, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
{zhengjiping, qinxcs, sunjinly}@nuaa.edu.cn

Abstract. Recovering from malicious attacks in survival database systems is
vital in mission-critical information systems. Traditional rollback and re-execute
techniques are too time-consuming and cannot be applied in survival
environments. In this paper, two efficient approaches - transaction dependency
based and data dependency based - are proposed. Compared to the transaction
dependency based approach, the data dependency based recovery approaches need not
undo innocent operations in malicious and affected transactions; moreover,
benign blind writes on bad data items speed up the recovery process.

1 Introduction
Database security concerns the confidentiality, integrity and availability of data stored in
a database [1]. Traditional security mechanisms focus on protection, especially
confidentiality of the data. But in some mission-critical systems, such as credit card
billing, air traffic control, logistics management, inventory tracking and online stock
trading, the emphasis is on how to survive under successful attacks [2]. These
systems need to provide limited service at all times and focus on database integrity and
availability.
Despite existing protection mechanisms, various kinds of attacks, as well as authorized
users who exceed their legitimate access or abuse the system, make the above systems more
vulnerable. So intrusion detection (ID) was introduced. There are two main
techniques, statistical profiling and signature identification, which can
supplement the protection of database systems by rejecting future access of detected
malicious attackers and by providing useful hints on how to strengthen the defense.
However, there are several inherent limitations of ID [3]: (a) Intrusion detection
makes the system attack-aware but not attack-resistant; that is, intrusion detection
itself cannot maintain the integrity and availability of the database in the face of attacks.
(b) Achieving accurate detection is usually difficult or expensive. The false alarm
rate is high in many cases. (c) The average detection latency in many cases is too long
to effectively confine the damage. Some malicious behaviors cannot be avoided in a
DBMS. So effective and efficient recovery approaches must be adopted after the
detection of malicious attacks. The rest of this paper is organized as follows. A
summary of related work in this area is included in Section 2. In Section 3, recovery
approaches in traditional and survival DBMSs are given. Section 3.1 describes
the database and transaction theoretical model. The transaction logging recovery method is
put forward in Section 3.2. Then data dependency approaches without/with blind
writes are emphasized in Section 3.3 and Section 3.4, respectively. A performance
analysis is put forward in Section 4. Section 5 concludes the paper.

2 Related Work
The traditional and simplest method for recovering a database to a consistent state is
rollback followed by re-execution of the malicious transactions and the ones which are
dependent upon them. This method, while effective, necessitates an undue amount of
work on the part of the database administrator, and requires knowledge of which
transaction was the inappropriate one. Moreover, some benign and innocent transactions need
to be re-executed. In general, this is a relatively poor (inefficient) option and
inadequate for the purposes of most database installations.
In order to overcome the limitations of this simple rollback model, researchers
have investigated various other methods for recovering a database to a consistent
state. In general, there are two basic forms of post-intrusion recovery methods [4]:
transaction based and data dependency based. The difference lies in whether the
system organizes recovery around the transactions recorded in the logs or around the
data items and their interdependencies and associations.
Transaction based recovery methods [5-7], mostly referred to as transaction logging
methods, rely on the ability of an ancillary structure to re-execute transactions
that have been both committed since the execution of the malicious transactions and
affected by those transactions. ODAM [8], and later ITDB [9] and Phoenix [10], are
survival DBMSs developed by Peng Liu et al. and Tzi-cker Chiueh, respectively.
These prototypes are implemented on top of a COTS (Commercial-Off-The-Shelf)
DBMS, e.g., Oracle or PostgreSQL. In these systems, database updates are logged in
terms of SQL-based transactions. ODAM and ITDB identify inter-transaction
dependencies at repair time by analyzing the SQL log and only undo malicious
transactions and the ones affected by them, while Phoenix maintains a run-time inter-transaction dependency graph with selective transaction undo. However, these
systems rely on the ability of the recovery system to correctly determine the
transactions which need to be redone.
Data dependency based recovery methods [11-14] suggest undoing and redoing only
affected operations rather than undoing all operations of affected transactions and
then re-executing them. Panda and Tripathy [12] [13] divide the transaction log file into
clusters to identify affected items for further recovery. Nevertheless, they require that
the log be accessed starting from the malicious transaction till the end in order to
perform damage assessment and recovery.


3 Recovery Approaches in Survival Database Systems

Like the methods mentioned above, our work is based on the assumption that the
attacking transaction has already been detected by intrusion detection techniques. So,
given an attacking transaction, our goal is to determine the affected ones quickly,
stop new and executing transactions from accessing affected data, and then carry
out the recovery process. In our methods, we suppose that the scheduler produces a
strict serializable history and that the log is not modifiable by users. As the transactions
get executed, the log grows with time and is never purged. Also, the log is stored in
secondary storage, so every access to it requires a disk I/O.
3.1 Database and Transaction Theoretical Model
To explain our recovery approaches, we first provide the database and transaction
theoretical model as below [15]:
Definition 1. A database system is a set of data objects, denoted as DB = {x1, x2, ..., xn}.
Definition 2. A transaction Ti is a partial order with ordering relation <i, where
1. Ti ⊆ {ri(x), wi(x) | x is a data object} ∪ {ai, ci};
2. if ri(x), wi(x) ∈ Ti, then either ri(x) <i wi(x) or wi(x) <i ri(x);
3. ci ∈ Ti iff ai ∉ Ti.

Here r, w, a, c relate to the operations of read, write, abort, and commit, respectively.
Definition 3. The (usually concurrent) execution of a set of transactions is modeled
by a structure called a history. Formally, let T = {T1, T2, ..., Tn} be a set of transactions.
A complete history H over T is a partial order with ordering relation <H, where:
1. H = ∪i=1..n Ti;
2. ∪i=1..n <i ⊆ <H.


Two transactions T1 and T2 in a history H usually have one of three relations (assume that T1
begins first), as shown in Figure 1. Figure 1(a) shows that T1 and T2 are overlapped:
during a certain period, between the two dashed lines, there are operations of both T1 and
T2. Figure 1(b) shows that the runtime of T2 is contained within that of T1. In Figure 1(c), T1 and T2 do
not have operations executed at the same time; that is, there is no opportunity for
them to read/write the same data items concurrently.

[Figure 1 omitted: three timelines of transactions T1 and T2.]

Fig. 1. (a) Two transactions T1, T2 are overlapped; (b) T2 begins after T1 begins and ends before T1 ends; (c) T2 begins after T1 ends


3.2 Transaction Logging Method

The transaction logging method relies on the availability of read information in the logs,
and typically this is not a standard feature of commercial database systems; the Oracle
DBMS, for example, does not log read operations in default installations. Existing methods either
construct read logs [8] [9] or maintain inter-transaction dependencies at runtime
[10]. In order to recover from malicious transactions, transaction logging
methods first analyze transaction dependencies in the history. In general, transaction
dependencies are defined as follows [7]:
Definition 4. Transaction Tj is dependent upon transaction Ti in a history H if there
exists a data item x such that Tj reads x after Ti has updated x and there are no
transactions that update x between the time Ti updates x and Tj reads x.
Definition 5. In a history H, transaction Ti affects Tj if the ordered pair (Tj, Ti) is in the
transitive closure of the dependent-upon relation described in Definition 4.
-------------------------------------------------------------------------------------------------------
Input: Serialized history H, malicious transaction B.
Output: A consistent state of the DBMS.
Initialize: write_set={}; temp_write_set={};
undo_transaction_set={B}; temp_undo_transaction_set={}.
Steps:
1. Locate the point in the history H where the malicious transaction B starts.
2. Scan every operation op(x) forward until the end of the history.
2.1 if wB(x) then write_set = write_set ∪ {wB(x)};
/* w, r, a, c relate to the operations of write, read, abort, and commit respectively */
2.2 else
2.2.1 if wTi(x) then temp_write_set = temp_write_set ∪ {wTi(x)};
2.2.2 if rTi(x) then
if Ti ∈ temp_undo_transaction_set skip;
else if op(x) ∈ write_set then
temp_undo_transaction_set = temp_undo_transaction_set ∪ {Ti};
2.2.3 if aTi then
temp_write_set = temp_write_set - temp_write_set|Ti;
/* temp_write_set|Ti denotes the operations of Ti in temp_write_set */
if Ti ∈ temp_undo_transaction_set then
temp_undo_transaction_set = temp_undo_transaction_set - {Ti};
2.2.4 if cTi then
if Ti ∈ temp_undo_transaction_set then
undo_transaction_set = undo_transaction_set ∪ {Ti};
temp_undo_transaction_set = temp_undo_transaction_set - {Ti};
write_set = write_set ∪ temp_write_set|Ti;
temp_write_set = temp_write_set - temp_write_set|Ti;
else temp_write_set = temp_write_set - temp_write_set|Ti;
3. Undo every transaction in undo_transaction_set.
-------------------------------------------------------------------------------------------------------
Fig. 2. Algorithm 1: Transaction based malicious transaction recovery algorithm

Given a history H and the malicious transactions detected previously, transaction
logging recovery methods identify the affected transactions and then undo both the
malicious transactions and those affected by them. An effective and efficient transaction
logging recovery algorithm [5] is shown in Figure 2.
In Figure 2, the transaction based malicious transaction recovery algorithm undoes the
committed malicious transactions and the affected transactions it identifies. If a transaction T
finally aborted, it need not be undone.
In the history H1 below, according to Algorithm 1 shown in Figure 2, the transactions in
undo_transaction_set = {B} → {B, T1} → {B, T1, T4} need to be undone and re-executed.
That is, the recovery process only undoes the malicious transaction B and the affected
transactions T1 and T4.
H1: rB(x) wB(x) rB(u) wB(u) cB rT1(x) wT1(x) rT3(z) wT3(z) cT3 rT1(y) wT1(y) cT1
rT2(y) wT2(y) rT2(v) wT2(v) aT2 rT4(u) wT4(u) rT4(y) wT4(y) rT4(z) wT4(z) cT4
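To make Algorithm 1 concrete, the following Python sketch replays a history encoded as (operation, transaction, item) triples and collects the transactions to undo. It is a simplified re-implementation of the steps of Figure 2 under the stated assumptions (strict serializable history, malicious transaction known), not the authors' code; the encoding and function name are ours. Running it on H1 with B as the malicious transaction yields B, T1 and T4:

H1 = [("r","B","x"), ("w","B","x"), ("r","B","u"), ("w","B","u"), ("c","B",None),
      ("r","T1","x"), ("w","T1","x"), ("r","T3","z"), ("w","T3","z"), ("c","T3",None),
      ("r","T1","y"), ("w","T1","y"), ("c","T1",None),
      ("r","T2","y"), ("w","T2","y"), ("r","T2","v"), ("w","T2","v"), ("a","T2",None),
      ("r","T4","u"), ("w","T4","u"), ("r","T4","y"), ("w","T4","y"),
      ("r","T4","z"), ("w","T4","z"), ("c","T4",None)]

def transactions_to_undo(history, malicious):
    write_set = set()        # items written by the malicious/affected committed transactions
    temp_writes = {}         # per-transaction writes, pending commit
    undo, temp_undo = {malicious}, set()
    for op, t, x in history:
        if t == malicious:
            if op == "w":
                write_set.add(x)
        elif op == "w":
            temp_writes.setdefault(t, set()).add(x)
        elif op == "r":
            if t not in temp_undo and x in write_set:
                temp_undo.add(t)                        # t read dirty data
        elif op == "a":
            temp_writes.pop(t, None)
            temp_undo.discard(t)
        elif op == "c":
            if t in temp_undo:
                undo.add(t)
                write_set |= temp_writes.pop(t, set())  # its writes are dirty too
                temp_undo.discard(t)
            else:
                temp_writes.pop(t, None)
    return undo

print(sorted(transactions_to_undo(H1, "B")))   # ['B', 'T1', 'T4']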

3.3 Data Dependency Based Recovery Approach Without Considering Blind Writes

Definition 6. If the first operation of Ti on a data item x is write(x), whether or not x is dirty,
then Ti is called a blind write transaction and the operation write(x) is a blind write.
In most cases, before transactions update data items, they first read them. But in
some cases, especially in malicious attack environments, various kinds of abnormal
behaviors do not obey this rule, so blind writes are unavoidable. In this section, we
only discuss the situation without blind writes.
-------------------------------------------------------------------------------------------------------
Input: Serialized history H, malicious transaction B.
Output: A consistent state of the DBMS.
Initialize: contaminated_data_set={};
temp_contaminated_data_set={}; read_data_set={}.
Steps:
1. Locate the point in the history H where the malicious transaction B starts.
2. Scan every operation op(x) forward until the end of the history.
2.1 if wB(x) then contaminated_data_set = contaminated_data_set ∪ {x};
2.2 else for each transaction Ti
2.2.1 if rTi(x) then read_data_seti = read_data_seti ∪ {x};
2.2.2 if wTi(y) then
if rTi(x) <i wTi(y) && x ∈ (contaminated_data_set ∩ read_data_seti) ∪ (temp_contaminated_data_seti ∩ read_data_seti) then
temp_contaminated_data_seti = temp_contaminated_data_seti ∪ {y};
else if wTi(x) <i wTi(y) && x ∈ (contaminated_data_set ∩ read_data_seti) ∪ (temp_contaminated_data_seti ∩ read_data_seti) then
temp_contaminated_data_seti = temp_contaminated_data_seti ∪ {y};
2.2.3 if cTi then
contaminated_data_set = contaminated_data_set ∪ temp_contaminated_data_seti;
2.2.4 if aTi then
temp_contaminated_data_seti = {};
read_data_seti = {}.
3. Refresh the data values in contaminated_data_set.
-------------------------------------------------------------------------------------------------------
Fig. 3. Algorithm 2: Data dependency based recovery algorithm without blind writes


Definition 7. Within one transaction or between two transactions, operations influence
each other. There exist read-write and write-write dependencies:
1) read-write dependency: if readi(x) <i writei(y), then y is dependent upon x;
2) write-write dependency: if writei(x) <i writei(y), then y is dependent upon x.
Since read operations cannot update any data item, there is no read-read
dependency. Figure 3 shows the data dependency based recovery algorithm without blind
writes. The algorithm skips read operations because they do not change data values,
and every write operation updates data values that were first read.
In history H1, according to the data dependency based recovery algorithm in Figure 3, only
the data items x, y, u, and z need to be refreshed to correct values.
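A corresponding Python sketch of the data dependency based damage assessment (again a simplified re-implementation of the steps of Figure 3, not the authors' code) confirms this: replaying H1 marks exactly x, u, y and z as contaminated. The history encoding is the same one used in the previous sketch:

# H1 as (operation, transaction, item) triples, as in the previous sketch.
H1 = [("r","B","x"), ("w","B","x"), ("r","B","u"), ("w","B","u"), ("c","B",None),
      ("r","T1","x"), ("w","T1","x"), ("r","T3","z"), ("w","T3","z"), ("c","T3",None),
      ("r","T1","y"), ("w","T1","y"), ("c","T1",None),
      ("r","T2","y"), ("w","T2","y"), ("r","T2","v"), ("w","T2","v"), ("a","T2",None),
      ("r","T4","u"), ("w","T4","u"), ("r","T4","y"), ("w","T4","y"),
      ("r","T4","z"), ("w","T4","z"), ("c","T4",None)]

def contaminated_items(history, malicious):
    contaminated = set()     # globally contaminated data items
    temp, touched = {}, {}   # per-transaction tentative contamination / items touched so far
    for op, t, x in history:
        if t == malicious:
            if op == "w":
                contaminated.add(x)
            continue
        if op == "r":
            touched.setdefault(t, set()).add(x)
        elif op == "w":
            # read-write and write-write dependencies: x depends on every item
            # t touched earlier; it is contaminated if any of those items are
            earlier = touched.get(t, set())
            if earlier & (contaminated | temp.get(t, set())):
                temp.setdefault(t, set()).add(x)
            touched.setdefault(t, set()).add(x)
        elif op == "c":
            contaminated |= temp.pop(t, set())
            touched.pop(t, None)
        elif op == "a":                      # aborted work leaves no damage behind
            temp.pop(t, None)
            touched.pop(t, None)
    return contaminated

print(sorted(contaminated_items(H1, "B")))   # ['u', 'x', 'y', 'z']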
3.4 Data Dependency Based Recovery Approach with Blind Writes
Definition 8. A blind write is called a refresh-write if the operation belongs to a
benign transaction and is not dependent on any contaminated data.
-------------------------------------------------------------------------------------------------------
2.2.2 if bwTi(x) then
if x ∈ temp_contaminated_data_seti ∪ read_data_seti then
temp_contaminated_data_seti = temp_contaminated_data_seti ∪ {x};
else if x ∈ contaminated_data_set ∪ read_data_seti then
contaminated_data_set = contaminated_data_set - {x};
else if wTi(y) then
if rTi(x) <i wTi(y) && x ∈ (contaminated_data_set ∩ read_data_seti) ∪ (temp_contaminated_data_seti ∩ read_data_seti) then
temp_contaminated_data_seti = temp_contaminated_data_seti ∪ {y};
else if wTi(x) <i wTi(y) && x ∈ (contaminated_data_set ∩ read_data_seti) ∪ (temp_contaminated_data_seti ∩ read_data_seti) then
temp_contaminated_data_seti = temp_contaminated_data_seti ∪ {y};
-------------------------------------------------------------------------------------------------------
Fig. 4. Algorithm 3: Modified data dependency based recovery algorithm on step 2.2.2 in Figure 3 with blind writes

If a malicious transaction contains blind writes, Algorithm 2 need not be changed
to handle this situation. If there exists a refresh-write as described in Definition 8, we can
modify step 2.2.2 of Algorithm 2 to obtain the data dependency based recovery algorithm
with blind writes, which is shown in Figure 4. In Figure 4, Algorithm 3 shows that benign
transactions with blind writes can refresh contaminated data values to a correct state.

H2: rT1(x2) rT1(x4) rT1(x5) rT3(x5) wT1(x3) rT1(x4) wT1(x4) cT1 rT2(x3) rT2(x4)
rT2(x5) wT2(x2) wT2(x1) cT2 wT3(x1) wT3(x2) cT3 rT4(x1) wT4(x4) rT4(x3) wT4(x5) cT4

In history H2, wT3(x1) and wT3(x2) are benign blind writes which update the contaminated
data items x1 and x2 to a correct state.

4 Performance Analyses
There are two basic measures to evaluate these recovery systems: one is promptness
and the other is the complexity of any ancillary structures required. These
measurements are not adequate, since data dependency based systems add overhead to
the individual transactions, while transaction logging methods append their overhead
to the recovery process. In spite of the additional data structures used in the data dependency
recovery algorithms, they will be more effective and efficient than transaction based
methods. We use the following equation [14] to evaluate our recovery approaches.
T = ((R × SR + W × SW) / SP) × TP    (1)

where
R: the number of read operations in the history;
W: the number of write operations in the history;
SR: the size of a read operation record in bytes;
SW: the size of a write operation record in bytes;
SP: the size of a page file;
TP: the page access time in milliseconds.
In our model, we use the following system-dependent parameters, shown in Table 1:

Table 1. System-dependent parameters used in data dependency based recovery approaches

SR = 40 bytes; SW = 60 bytes; SP = 1024 bytes; TP = 20 milliseconds (ms)

According to the algorithms shown above and equation (1), we can calculate the
corresponding time consumed by each recovery process, shown in Table 2.
Table 2. Time consumed in different recovery processes

Traditional undo and re-execute method: H1 = 19.531 ms; H2 = 17.188 ms
Transaction based recovery method: H1 = 13.672 ms; H2 = 12.500 ms
Data dependency based approach without blind writes: H1 = 7.813 ms; H2 = /
Data dependency based approach with blind writes: H1 = /; H2 = 3.906 ms

In table 2, we can see that data dependency based approaches are more effective
and efficient than traditional undo and transaction based recovery methods.
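As a consistency check on the first two columns, these figures can be recomputed directly from equation (1) and the parameters of Table 1: H1 contains 10 read and 10 write records in total, of which 7 reads and 7 writes belong to B, T1 and T4, so

T_traditional = ((10 × 40 + 10 × 60) / 1024) × 20 ≈ 19.53 ms and
T_transaction = ((7 × 40 + 7 × 60) / 1024) × 20 ≈ 13.67 ms,

which match the H1 row of Table 2.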

5 Conclusion
In a survival DBMS, fast and accurate recovery from malicious transactions is crucial
for survival under malicious attacks. In this paper, we first propose transaction
logging methods to recover the DBMS to a consistent state. Then data dependency based
recovery approaches without/with blind writes are given. Compared to transaction
based methods, which need read logs and undo whole transactions, the data dependency
approaches only undo malicious and affected operations. In the presence of benign
blind writes, the data dependency approaches need not undo operations on the data items
that have been updated by those benign blind writes.
Acknowledgements. This work has been supported by the National Natural Science
Foundation of China (60673127), High-Technology Research Project of Jiangsu
Province of China (BG2004005) and the Aerospace Science Foundation of China
(02F52033).

References
1. Elisa Bertino, Ravi Sandhu. Database Security-Concepts, Approaches, and Challenges.
IEEE Transactions on Dependable and Secure Computing, 2005, 2(1): 2-19
2. Paul Ammann, Sushil Jajodia, Catherine D. McCollum, Barbara T. Blaustein. Surviving
Information Warfare Attacks on Databases. In: Proceedings of IEEE Symposium on
Research in Security and Privacy, Oakland, California, 1997, 164-174
3. Peng Liu, Architectures for Intrusion Tolerant Database Systems. 18th Annual Computer
Security Applications Conference. San Diego California .December 09 - 13, 2002, 311-320
4. Jeffrey G. Klapheke. Evaluation of Post-Intrusion Database Recovery Methods. Computer
Science Seminar, Rensselaer at Hartford, SD2-T1-1, April 24, 2004.
5. Paul Ammann, Sushil Jajodia, Peng Liu. Recovery from Malicious Transactions. IEEE
Transactions on Knowledge and Data Engineering, 2002, 14(5): 1167-1185
6. Yi Hu, Brajendra Panda. Identification of Malicious Transactions in Database Systems. In:
Proceedings of the Seventh International Database Engineering and Applications
Symposium, 2003, 329-335
7. Peng Liu, Paul Ammann and Sushil Jajodia. Rewriting Histories: Recovering from
Malicious Transactions. Distributed and Parallel Databases Journal, 2000, 8(1): 7-40
8. Pramote Luenam, Peng Liu. ODAR: An On-the-fly Damage Assessment and Repair
System for Commercial Database Applications. In: Proceedings of 15th annual working
conference on Database and application Security, 2001, 239-252
9. Peng Liu. Jiwu Jing, Pramote Luenam, Ying Wang, Lunquan Li, Supawadee Ingsriswang.
The Design and Implementation of a self-Healing Database System. Journal of Intelligent
Information Systems, 2004, 23(3), 247-269
10. Tzi-cker Chiueh, Dhruv Pilania. Design, Implementation, and Evaluation of a Repairable
Database Management System. In: Proceedings of 21st International Conference on Data
Engineering, 2005, 1024-1035
11. Brajendra Panda, Kazi Asharful Haque. Extended Data Dependency Approach: A Robust
Way of Rebuilding Database. In: Proceedings of the 2002 ACM Symposium on Applied
Computing, ACM Press, New York, 2000, 446-452
12. Brajendra Panda, Sani Tripathy. Data Dependency based Logging for Defensive
Information Warfare. In: Proceedings of the 2000 ACM Symposium on Applied
Computing, ACM Press, New York, 2000, 361-365
13. Sani Tripathy, Brajendra Panda. Post-Intrusion Recovery Using Data Dependency
Approach. In: Proceedings of the 2001 IEEE Workshop on Information Assurance and
Security. United States Military Academy, West Point, NY ,June, 2001, 156-160
14. Brajendra Panda, Rajesh Yalamanchili. Transaction Fusion in the Wake of Information
Warfare. In: Proceedings of the 2001 ACM symposium on Applied computing, Las Vegas,
Nevada, United States, 2001, 242 - 247
15. Kun Bai, Hai Wang, Peng Liu. Towards Database Firewalls. In: Proceedings of IFIP
International Federation for Information Processing, LNCS 3654, 2005, 178-192

Usage-Centered Interface Design for Quality Improvement


Chang-Mog Lee1,*, Ok-Bae Chang1, and Samuel Sangkon Lee2,**
1 Division of Electronics and Information Engineering, Chonbuk National University, 664-14 1ga Duckjin-Dong Duckjin-Gu Jeonju, Jeonbuk, 561-756, South Korea
{cmlee, okjang}@chonbuk.ac.kr
2 Department of Computer Science and Engineering, Jeonju University, 1200 3ga, Hyoja-Dong Wansan-Gu Jeonju, Jeonbuk, 560-759, South Korea
samuel@jj.ac.kr
* This work was supported by the second stage of the Brain Korea 21 Project.
** This work was financially supported by Jeonju University.

Abstract. As the application development environment changes rapidly, the importance
of user interface design is increasing. Usually, designers cluster the related objects of an
interface by their own subjective methods. However, an interface designed without particular
rules only adds business inefficiency and complexity for the users of the system. We propose an
object oriented design model that allows for flexible development by formalizing
the user interface prototype in any GUI environment. The visual cohesion of the
user interface is a new set of criteria which has been studied in relation to the
user interface contents, and is founded on the cohesion of the interface as defined
using basic software engineering concepts. The visual cohesion
includes the issue of how each unit is arranged and grouped, as well as the cohesion
of the business events which appear in the programming unit.
Keywords: Usage-centered Interface, Visual Cohesion, Prototype, Interface
Design Model, Object Unit.

1 Introduction
The design of a User Interface (UI), which is fundamental to the convergence of different
customers' requirements and to the communication required to support the
complicated interaction between human beings and computers, requires very comprehensive
and varied knowledge and experience [1]. The design of such a UI requires a
graphics expert, requirement analyzer, system designer, programmer, technology
(description) expert, social behavior expert and other experts, depending on the particular
application [2]. However, it is difficult to realistically engage experts in multiple fields in the design of a UI.
Therefore, it is necessary to research automatic designs for a UI which can meet
the professional requirements of various fields. The use of Visual Cohesion (hereafter
referred to as VC) in the design of a UI helps improve its quality by providing the
designer or developer with a visual prototype prior to the embodiment of the system.
Moreover, VC provides the standards needed to measure the appropriateness of the
layout of the UI and its semantic contents. It is necessary to improve the comprehensibility
of business tasks and the usability of the interface by clustering business
events in such a way that they are semantically related to one another [3]. This paper
looks at the modeling techniques used to improve the VC. The purpose of the
VC in this study is to improve the comprehensibility and usability of a business system
by clustering business events in such a way that they are related to one another
[4]. Therefore, this paper proposes 4 types of objects that can improve the VC of a UI
prototype and discusses the techniques used to produce an object oriented design that
performs the clustering of these objects, as well as the method used to
measure the VC.

2 Related Works
GENIUS [5], JANUS [6], TRIDENT [7], GUIPS [8] are some examples of studies
related to the automatic creation of a user interface. Table 1 sums up the characteristics of recent studies into the automatic creation of a user interface.
Table 1. Comparison of studies into the automatic generation of a user interface

TRIDENT - Domain model: data and task analytical model. Method of design: extraction of the user interface by requirement analysis of task and function. Method of specification: data and function requirement specification. Characteristics: application analysis of the interaction task and the decision on the task attributes by the user interface; activity chain graph (interaction of data and function); deals with the static features of the user interface.

JANUS - Domain model: object model. Method of design: extraction of the user interface from the object model. Method of specification: data structured specification. Characteristics: non-abstraction classes are transmitted to the user interface; attributes and methods that have nothing to do with the user interface are disregarded in the process; dynamic features of the user interface are not covered.

GUIPS - Domain model: object analysis model founded on UML. Method of design: extraction of the user interface from scenarios according to the requirement analysis. Method of specification: data specification of transition objects. Characteristics: user-interaction interface modeling from scenarios; object transition graph (interaction with the interface); prototype creation interface of the user interface.

3 Interface Design Model Based on Classification

The object oriented model suggested in this paper is composed of 4 object models
which can improve the VC of the UI, as shown in Fig. 1: 1) the business event object,
2) the task object, 3) the transaction object, and 4) the form object. This model is used
to calculate the cohesion of the suggested object model and to validate the improvement
in cohesion compared to the existing design models. The detailed objects of the UI are
therefore analyzed in terms of the similarity, relevance and transference of the business
events in the UI in order to perform the clustering of the business events through sets of
objects [9]. This is because the visualization of objects patterned by clustering can lead to an
improvement in the VC of the business events in the UI.


Fig. 1. Hierarchical structure of object oriented design model

3.1 Business Event Object

The design of the business event object is the stage in which the object representing the
User Interface Business Event Object (hereafter referred to as UIBEO) is
designed. The control pattern of the business event, which carries the transference
data of the business event, is designed in the UI. In other words, the design of the
business event includes the design of the business event controls, such as
radio buttons, combo boxes, check buttons, etc. The rules used to design the business
event objects of the UIBEO are as follows.
<Rule 1> A business event whose number of instances is limited to one is a UIBEO.
<Rule 2> If the number of instances that can be fed to one business event is not fixed, it is not a UIBEO.
<Rule 3> An item that can have at most 7 instances is a UIBEO that can use a radio button.
<Rule 4> A business event that can have 8 or more instances is a UIBEO that can use a combo box.
<Rule 5> An item that allows a free choice of instances is a UIBEO that can use a check button.
This improves the cohesion of the business event by effectively modeling the function
of the business event in the UI, and also enhances the reusability of instance data
and the functional cohesion within the UI.
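A minimal Python sketch of how Rules 1-5 could be applied when choosing a control for a business event is given below; the function name and the encoding of "not fixed" as None are illustrative assumptions, not part of the proposed model:

def choose_control(num_instances, multiple_choice=False):
    """Pick a UI control for a business event from its instance count."""
    if num_instances is None:      # Rule 2: instance count not fixed
        return "not a UIBEO"
    if multiple_choice:            # Rule 5: a free choice of instances
        return "check button"
    if num_instances <= 7:         # Rules 1 and 3: up to 7 instances
        return "radio button"
    return "combo box"             # Rule 4: 8 or more instances

print(choose_control(3))           # radio button
print(choose_control(12))          # combo box
print(choose_control(5, True))     # check button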
3.2 Task Object
The clustering of task objects is the design stage in which the objects that represent
the User Interface Task Objects (hereafter referred to as UITO) are created. It is a
clustering stage that enables the user to distinguish sets of task objects by modeling
a block label composed of task units when there are more than 2 input or output
business events transferred when the event occurs. The following rules relate
to the clustering of the set of task objects of the business events that are transferred to a task.
<Rule 1> A business event whose number of instances is limited to one is a UIBEO.
<Rule 2> An output UITO which has more than one record is a StringGrid block.
<Rule 3> A node which has more than one consecutive input business event is a UITO.
<Rule 4> A node which has more than one consecutive output business event is a UITO.
The business events are clustered and labeled by sets of task objects in order to improve
the communicational cohesion and the VC of the business events in the UI.
3.3 Transaction Object
The design of the transaction object is the stage in which the object that represents the
User Interface tRansaction Object (hereafter referred to as UIRO) is created. In other
words, the UIRO is the clustering stage in which a group of business events composed
of input-control-output events is grouped into a set of transaction objects. The
transaction object is created by turning a request (input) and a response (output) of the
user into a block through one suite. The following rules are used for clustering
the set of transaction objects.
<Rule 1> It is necessarily composed of input task - button - output task, and the input
task can be omitted if it overlaps with the previous transaction.
<Rule 2> It can have more than one input task and output task.
<Rule 3> The input task is the beginning of the UIRO, while the output task is the end
of the UIRO.
The design of the transaction object, which is the stage in which the users are provided
with the set of transaction objects, is the method of clustering the transaction
objects that are grouped into 'input task-control-output task' units. This facilitates the
understanding of the users by visualizing the transaction objects of the business events in
the UI. In other words, it makes it easier for the users to understand the set of transactions
in the interface by clustering the 'input task-control-output task' units into one object
unit for the sake of visualization.
3.4 Form Object
The design stage of the form object serves to create the object that represents the User
Interface Form Object (hereafter referred to as UIFO). This stage creates the form
object by dividing the business events into the forms in which they are presented in the
UI. If the number of input/output business events exceeds 20 (a human engineering
criterion), or the output form (or state) is selected by more than one input and it is
necessary to make the user clearly aware of it, as in the case of an Interrupt, the
objects are divided into multiple forms. The following rules are used for clustering the set of
form objects.
<Rule 1> If the input/output objects exceed 20 and belong to different tasks, they are
divided into another form object.
<Rule 2> If the response to the demand is alternative, it is divided into a different
form object.
<Rule 3> If the result of an event is an Interrupt, it is divided into a new form object.
<Rule 4> If it is an abstract object with the same task, it cannot be divided into another
form object even though it exceeds 20 items.
<Rule 5> One task object can be divided into form objects, and transaction objects
gather to become a form object.
The efficient design of the form supports the development of the program and its
maintenance/repair by making it easy to understand the business process and by
reducing the complexity of the software.
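The form-splitting rules above can be read as a simple decision procedure; the following Python sketch is one such reading (the 20-item threshold, the alternative-response case and the Interrupt case come from the text, while the function itself and its flags are illustrative assumptions):

def needs_new_form(num_io_events, same_task, alternative_response, raises_interrupt):
    """Decide whether a group of business events should become a separate form object."""
    if raises_interrupt:                        # Rule 3
        return True
    if alternative_response:                    # Rule 2
        return True
    if num_io_events > 20 and not same_task:    # Rules 1 and 4
        return True
    return False

print(needs_new_form(25, same_task=False, alternative_response=False, raises_interrupt=False))  # True
print(needs_new_form(25, same_task=True,  alternative_response=False, raises_interrupt=False))  # False (Rule 4)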

4 Evaluation of Proposed Model

4.1 Features of Proposed Model
In this section, the existing common design, the structured design, and the object oriented
design proposed in this paper are explained in order to compare these different
design models of the UI. Fig. 2 shows the structure of these evaluation models in
order to facilitate the understanding of the design models.
[Figure 2 omitted: it contrasts the common design (control design only), the structured design (relative groups), and the object oriented design (field, task, and transaction levels).]

Fig. 2. Comparison of the structure of the referenced design model

Business events in the general design model are not subject to clustering; the designer
arranges the business events by him or herself and determines only the controls of the
business events, depending on his or her skill. The structured design model of Constantine
designs the control pattern of the business events like general designers do, but the
designer also groups the business events according to their relevance, so that the user can
understand their relevance to the work involved.


Fig. 3. (a) Generalized design model; (b) Structured design model; (c) Object-oriented design model

The object-oriented design model proposed in this paper applies the concept of
object-oriented design by classifying the business events of the UI into objects (business events,
tasks, transactions, forms). Fig. 3(a) shows the general design model, which
designs only the control pattern of the business events; in other words, it designs only
the control patterns, such as the properties, product codes, product standards, etc. The
structured design model in Fig. 3(b) shows an example in which the control of the
business events is designed and their grouping is performed based on the user information,
product information and order information according to the relevance of the
work involved. Fig. 3(c) is an example of the object-oriented design model visualized
by the multi-dimensional grouping of the 4 objects to which the object oriented design
method is applied (business events, tasks, transactions, forms). The object-oriented
design model enables the user to recognize the object group and sequence of tasks which
are transferred to the database in the UI, as shown in Fig. 2.
4.2 Criteria for VC
The VC of the UI is a new set of criteria which has been studied in relation to the UI
contents, and is founded on the cohesion of the interface as defined using
basic software engineering concepts. Constantine proposed an equation to estimate
the VC [10]. The VC is represented by the ratio of the number of related pairs of
visual business events to the total number of pairs of business events. The VC of a
form or dialogue box is obtained by summation over the groups of all levels:

VC = 100 × ( Σ_l Σ_{i<j in G_l} R_{i,j} ) / ( Σ_l N_l(N_l - 1)/2 )

where N_l represents the number of business events in group l, and R_{i,j} represents the semantic
relevance (with 0 ≤ R_{i,j} ≤ 1) between the business events i and j in each
group. If business events i and j are relevant, R_{i,j} = 1; if no relevance exists, R_{i,j} = 0.

The VC increases when the grouping of relevant business events is good.
The equation used to calculate the VC is applied to the outcome of the designs in
Section 4.1, and the VCs of the designed models are compared and evaluated. The
referenced design models used for the evaluation were designed in the form of
visual prototypes of non-functional screen layouts, and the basic visual properties and
business events were designed with the same numbers in order to ensure an objective
measurement of the result of this experiment. Table 2 shows the outcome of the
calculation of the relevance for each referenced design.
Table 2. Outcome of the calculation of relevance by referenced design model

Generalized design: G1=1, G2=10, G3=10, G4=17; N1=2, N2=5, N3=5, N4=16
Structured design: G1=1, G2=10, G3=10, G4=10, G5=3, G6=10, G7=0; N1=2, N2=5, N3=5, N4=5, N5=3, N6=5, N7=6
Object oriented design: G1=1, G2=10, G3=10, G4=1, G5=3, G6=3, G7=10, G8=2, G9=2, G10=0; N1=2, N2=5, N3=5, N4=2, N5=3, N6=3, N7=5, N8=3, N9=3, N10=3

The number of instances of the product code and product standard were both assumed to be five.
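Assuming the reconstruction of Constantine's equation given above, the VC values reported below can be reproduced from the group sums of Table 2 with a few lines of Python; the grouping of the G and N values is exactly as listed in the table, and the code itself is only an illustration:

def visual_cohesion(G, N):
    """G[l]: sum of pairwise relevance values in group l; N[l]: business events in group l."""
    possible_pairs = sum(n * (n - 1) / 2 for n in N)
    return 100 * sum(G) / possible_pairs

general    = visual_cohesion([1, 10, 10, 17], [2, 5, 5, 16])
structured = visual_cohesion([1, 10, 10, 10, 3, 10, 0], [2, 5, 5, 5, 3, 5, 6])
oo_design  = visual_cohesion([1, 10, 10, 1, 3, 3, 10, 2, 2, 0], [2, 5, 5, 2, 3, 3, 5, 3, 3, 3])
print(round(general), round(structured), round(oo_design))   # 27 75 89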

The VC of the general design model, with N=4 groups (N1=2, N2=5, N3=5, N4=16), was VC=27,
the lowest value. The VC of the structured design model, with N=7 groups (N1=2,
N2=5, N3=5, N4=5, N5=3, N6=5, N7=6), was VC=74, an intermediate value. The
VC of the object oriented design model, with N=10 groups (N1=2, N2=5, N3=5, N4=2,
N5=3, N6=3, N7=5, N8=3, N9=3, N10=3), was VC=89, the highest value. The
VCs of the referenced design models are compared in Table 3.
Table 3. Evaluation of VC of referenced design model

Calculated VC value - General design: 27; Structured design: 75; Object oriented design: 89

It was found that the cohesion of the object oriented design model proposed in this
paper was improved by approximately 14%. The VC provides criteria for reviewing the
quality of the visual prototype and the graphic design of the UI. Moreover, the
VC provides criteria for forecasting the user preference, for evaluating the
easiness and comprehensibility, the degree of response, and the quality of the graphic
layout [10]. The automatic graphic layout of the objects ensures the most efficient
grouping and the highest cohesion according to the modeling rules, regardless of the
skill of the designer.

5 Conclusion
This paper studied the design rules and modeling techniques of a UI that supports the
user on the basis of improved VC. The findings of this study are as follows. First, the
proposed method improves the VC by designing the objects of the UI on the basis of
objects which are functional, consecutive and communicative. Second, it improves the
user preference, easiness, comprehensibility, degree of response, and quality of the
graphic layout on the basis of the improvement of the object based VC. Third, it improves
the communicative, consecutive, and procedural cohesion of business events
on the basis of the clustering of the UI objects. Fourth, it constitutes an object oriented
design method that can improve the comprehensibility of the business process and
the usability of the UI on the basis of the visualization of the object pattern.

References
1. Garcia, E., Sicilia, M.A., Gonzalez, L., Hilera, J.R.:Dialogue-Based Design of Web Usability Questionnaires Using Ontologies. Computer-Aided Design of User Interfaces(2005) 131-144
2. Dix A.:Design of User Interface for Web. Proceedings, User Interface to Data Intensive
System(1999) 2-11
3. Constantine L.L., Biddle R., and Noble J.:Usage-centered Design Engineering: Models for
Integration. IFIP international conference on software engineering(2003) 106-113
4. Leszek A. Maciazek: Requirements Analysis and System Design. Addison Wesley(2001)
244-270
5. H. Balzert: From OOA to GUIs: The Janus System. IEEE Software, Vol. 8, No 9(1996)
43-47
6. F. Bodart, A.-M. Hennebert, J.-M. Leheureux, I. Provot, and J. Vanderdonckt: A Model-based Approach to Presentation: A Continuum from Task Analysis to Prototype. Proceedings of the Eurographics Workshop on Design, Specification, Verification of Interactive
Systems, Carrara, Italy, Focus on Computer Graphics, Springer-Verlag, Berlin(1994)
77-94
7. M. Elkoutbi, I. Khriss, and R.K. Keller: Generating User Interface Prototypes from Scenarios. Proc. of the 4th IEEE International Symposium on Requirements Engineering(1999) 150-158
8. Nerurkar U.: Web User Interface Design, Forgotten Lessons. IEEE Software, Vol. 18, No.
6(2002) 69-71
9. Chidamber S. and Kemerer C.: A Metrics Suite for Objected-Oriented Design. IEEE
Transaction on Software Engineering, Vol. 20, No. 6(1994) 476-493
10. Constantine, L. L.:Visual Coherence and Usability: A Cohesion Metric for Assessing the
Quality of Dialogue and Screen Designs. Proceedings, Sixth Australian Conference on
Computer-Human Interaction, IEEE Computer Society Press(1996)

Description Logic Representation for Requirement Specification
Yingzhou Zhang and Weifeng Zhang
College of Computer, Nanjing Univ. of Posts and Telecom., Nanjing 210003, China
zhangyz@njupt.edu.cn

Abstract. As the size and complexity of many software systems increase,
a greater emphasis needs to be given to capturing and maintaining requirement
knowledge within the software development process. This knowledge can be
captured by the requirement ontology. But requirement analysis systems
must balance expressivity and inferential power against the real demands of
requirement ontology construction, maintenance, performance, and comprehensibility.
Description logics (DLs) possess several features, namely a terminological
orientation, a formal semantics, and efficient reasoning procedures, which offer
an effective tradeoff of these factors. In this paper, we use the description logic
SHIQ to define the objects and their relations, and identify the subsumptions
capturing the constraints and relationships among the objects. We show how the
subsumptions can be used to answer some questions.
1 Introduction and Related Works


The requirement analysis plays an important role in software engineering. But
requirements are often ambiguous, incomplete and redundant; they also change
frequently during the design process due to changes of technology and customers'
objectives [1]. To make sure that different engineers have a common understanding of
the terms to be used, the ontology-oriented requirements approach was presented in
[2 - 4]. An ontology [6, 7] is a collection of definitions of concepts and a shared
understanding [5]. The requirement ontology is a part of a more general ontology that
captures engineering design knowledge [8, 9].
The requirement ontology was formally described in [1] and [2], but the reasoning
abilities of those formalizations were inefficient and undecidable. As requirement information grows in
scale, requirement analysis systems must balance expressivity and inferential
power against the real demands of requirement ontology construction, maintenance,
performance, and comprehensibility. Being one of the decidable fragments of first-order
logic, description logics (DLs) [10-12] possess several features, namely a terminological
orientation, a formal semantics, and efficient reasoning procedures, which offer an
effective tradeoff of these factors [13]. In this paper, we will represent a requirement
specification through a requirement ontology that is described by description logics.

This work was supported in part by the National Natural Science Foundation of China
(60503020), the Natural Science Research Plan for Jiang Su High School (05KJD520151).



With the use of description logics, our approach not only provides a design terminology
that can be easily shared by the corresponding engineers, but also defines the meaning
of the terminology precisely and unambiguously.
The remainder of this paper is organized as follows. In Section 2, we briefly
review why it is necessary to represent a requirement specification as a requirement
ontology. Then we provide a brief introduction to description logics in Section 3. In
Section 4, we use the description logic SHIQ to formally represent the requirement
specification. In Section 5, we show how to use our representation method to answer some
common sense questions. Finally, Section 6 gives the conclusions.

2 Requirement Ontology for Requirement Specification

As we all know, requirements are often ambiguous, incomplete, redundant and
changeable in the software engineering environment. Requirements generated by different
engineers may be inconsistent, since different designers may have different
perspectives on the system. This may bring about situations in which the same term is applied to
different concepts and different terms are used to denote the same entity. A suitable solution
is the use of ontologies, which can make sure that different engineers have a common
understanding of the terms to be used.
Originally, ontology was a concept in philosophy: it describes the essence and
composition of the world, as discussed by philosophers. In computer science, ontology
is mainly used for knowledge representation. It provides a means for knowledge sharing,
which is very much desirable in large-scale knowledge based projects, including
traditional software engineering projects [3]. Nowadays, more and more software
engineers have become interested in studying and using ontology concepts in their research
projects [8].
In [2], five points for using ontology in requirement analysis were presented:
1. Make relations independent knowledge units.
2. Organize the objects in ontologies.
3. Take objects as basic components of ontologies.
4. Let ontologies form their own inheritance hierarchies.
5. Allow ontologies to be nested.

Like them, we are interested in a formal and rigorous approach to the representation of knowledge. But we adopt description logics to define the objects and their attributes, and to identify the axioms capturing the constraints and relationships among the objects.
Before providing a brief introduction to description logics (in the next section), we summarize their advantages for knowledge representation as follows:
1. Translation to first-order predicate logic is usually possible.
2. They are based on formal semantics, including declarative and compositional semantics, with a standard Tarski-style interpretation I = (Δ^I, ·^I).
3. Inference problems are decidable.
4. They are probably the most thoroughly understood set of formalisms in all of knowledge representation.


5. A wide range of logics has been developed, from very simple (no disjunction, no full negation) to very expressive (comparable to DAML+OIL).
6. There is a very tight coupling between theory and practice.

3 Description Logics
Description Logics (DLs) are knowledge representation formalisms that are able to capture virtually all class-based representation formalisms used in Artificial Intelligence, Software Engineering, and Databases. The basic elements of DLs are concepts (also called classes) and roles, which denote sets of objects and binary relations, respectively. Concept expressions and role expressions (in the following simply called concepts and roles) are formed by starting from a set of atomic concepts and atomic roles, i.e., concepts and roles denoted simply by a name, and applying concept and role constructs [13].
The most expressive DL that we refer to in this paper is called SHIQ [14-16]. In such a logic, concepts and roles are formed according to the following syntax:

C, D ::= A | C ⊓ D | C ⊔ D | ¬C | ∃R.C | ∀R.C | ≥ n R.C | ≤ n R.C .   (1)

R, S ::= P | R⁻ | R ⊑ S .   (2)

where A and P denote respectively atomic concepts and atomic roles; C (or D) and R (or S) denote respectively arbitrary concepts and roles, with transitive roles satisfying R⁺ ⊑ R; n denotes a positive integer.
Syntax        Semantics
A             A^I ⊆ Δ^I
P             P^I ⊆ Δ^I × Δ^I
C ⊓ D         C^I ∩ D^I
C ⊔ D         C^I ∪ D^I
¬C            Δ^I \ C^I
∃R.C          {x | ∃y. <x, y> ∈ R^I ∧ y ∈ C^I}
∀R.C          {x | ∀y. <x, y> ∈ R^I → y ∈ C^I}
≥ n R.C       {x | #{y | <x, y> ∈ R^I ∧ y ∈ C^I} ≥ n}
≤ n R.C       {x | #{y | <x, y> ∈ R^I ∧ y ∈ C^I} ≤ n}
R⁻            {<x, y> | <y, x> ∈ R^I}
R ⊑ S         R^I ⊆ S^I

Fig. 1. Formal semantics of SHIQ


The formal semantics of SHIQ is shown in Fig. 1. In Fig. 1, the superscript I refers to a standard Tarski-style interpretation I = (Δ^I, ·^I) [17]; the notation #M denotes the cardinality of the set M.
We also use the following abbreviations to increase readability:

⊤ ≡ A ⊔ ¬A .   (3)

⊥ ≡ A ⊓ ¬A .   (4)

(≤ n R.C) ≡ ¬(≥ (n+1) R.C) .   (5)

(= n R.C) ≡ (≥ n R.C) ⊓ (≤ n R.C) .   (6)

C ⊔ D ≡ ¬(¬C ⊓ ¬D) .   (7)

∀R.C ≡ ¬∃R.¬C .   (8)

In the next section, we will show how to use SHIQ to describe the ontology of software requirements.
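As an informal illustration of the semantics in Fig. 1 (this sketch is ours and not part of the original formalization), the constructors above can be evaluated over a single finite interpretation with a few lines of Python; the function and argument names are invented for the example.

# Illustrative sketch (not from the paper): a naive evaluator for the
# constructors of Fig. 1 over a finite interpretation, making the
# set-theoretic semantics concrete.  Concepts are nested tuples such as
# ("and", C, D), ("exists", "R", C) or ("atleast", 2, "R", C).
def ext(concept, domain, conc_ext, role_ext):
    """Extension of `concept` in the interpretation (domain, conc_ext, role_ext)."""
    if isinstance(concept, str):                    # atomic concept A
        return conc_ext.get(concept, set())
    fillers = lambda x, R: {y for (a, y) in role_ext.get(R, set()) if a == x}
    op = concept[0]
    if op == "not":                                 # complement: domain \ C
        return domain - ext(concept[1], domain, conc_ext, role_ext)
    if op in ("and", "or"):                         # intersection / union
        left  = ext(concept[1], domain, conc_ext, role_ext)
        right = ext(concept[2], domain, conc_ext, role_ext)
        return left & right if op == "and" else left | right
    if op in ("exists", "forall"):                  # existential / value restriction
        C = ext(concept[2], domain, conc_ext, role_ext)
        if op == "exists":
            return {x for x in domain if fillers(x, concept[1]) & C}
        return {x for x in domain if fillers(x, concept[1]) <= C}
    if op in ("atleast", "atmost"):                 # qualified number restrictions
        n, R = concept[1], concept[2]
        C = ext(concept[3], domain, conc_ext, role_ext)
        count = lambda x: len(fillers(x, R) & C)
        return {x for x in domain if (count(x) >= n if op == "atleast" else count(x) <= n)}
    raise ValueError("unknown constructor: %s" % op)

In such a fixed interpretation, a concept C is satisfied exactly when its extension is non-empty and an inclusion C ⊑ D holds when the first extension is contained in the second; a real SHIQ reasoner, of course, decides these questions over all interpretations rather than a single one.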

4 Representation of Requirement Specification and Reasoning in SHIQ
As mentioned above (Sections 1 and 2), high quality requirement ontologies are crucial for software design and development, and their construction, maintenance, performance, and comprehensibility greatly depend on the availability of a well-defined semantics and powerful reasoning tools. Since DLs provide both, they should be ideal technologies to describe requirement specifications. Furthermore, some features of SHIQ make this DL expressive enough to represent requirement ontologies. These features include expressive number restrictions, the formulation of complex terminological axioms, and special roles (such as inverse roles, transitive roles, and subroles) [5].
Subsumptions (e.g., C ⊑ D) are usually called inclusion assertions [13]. A finite set of inclusion assertions is called a TBox (Terminological Box). The requirement ontology can be formalized in a TBox of SHIQ.
To express the requirement ontology in SHIQ, we first restrict the possible worlds by introducing restrictions on the interpretations of the requirement specification. For the example shown in [1]:
"There are three components of a desk spot lamp, namely Heavy_base, Small_head and Short_arm"
we can use the inclusion assertion:

Desk_spot_lamp ⊑ ∃has_component.Heavy_base ⊓ ∃has_component.Small_head ⊓ ∃has_component.Short_arm .   (9)

where has_component is a transitive role (relation).


In addition, a designer introduces a hole as a feature of the arm of the desk spot lamp so that an electrical cord can run through it, and two threaded bars as another feature. This can be expressed as follows:

Short_arm ⊑ ∃has_feature.Hole ⊓ ≥ 2 has_feature.Threaded_bar ⊓ ≤ 2 has_feature.Threaded_bar .   (10)

Then we can define the relevant notions of our software requirement using concept definitions. For example, the two concepts primitive and composite can be defined respectively as:

primitive ≡ ¬(∃has_component.⊤) .   (11)

and

composite ≡ ¬primitive .   (12)

The requirements describe the properties of what is being designed. Primitive requirements often come from the customer expressing his/her wishes. They are often ambiguous, incomplete and redundant; they are also changed frequently during the design process due to changes in technology and in the customers' objectives. Our ontology-oriented requirements approach in SHIQ can cope with such changes conveniently. For example, the following defines the requirement "the weight of the desk spot lamp must be less than 2.0 pound":

Desk_spot_lamp ⊑ ∃has_weight.∃less_than.A1 .   (13)

where the atomic concept A1 denotes "2.0 pound". If the requirement above is changed to "the weight of the desk spot lamp must be within 2.0 ± 0.1 pound", we can define it easily as:

Desk_spot_lamp ⊑ ∃has_weight.∃less_than⁻.A2 ⊓ ∃has_weight.∃greater_than⁻.A3 .   (14)

where A2 and A3 denote respectively "1.9 pound" and "2.1 pound"; less_than⁻ and greater_than⁻ are the inverse roles of less_than and greater_than, respectively.
In addition, the customer may specify that the desk spot lamp should be able to light more than half a square meter of the room. This can also be described easily as:

Desk_spot_lamp ⊑ ∃has_feature.Lighting_feature ⊓ ∃has_Lighting_area.∃less_than⁻.A4 .   (15)

where A4 denotes "0.5 square meter".


In a word, SHIQ (esp. its TBox) can, on the one hand, conveniently represent concept hierarchies. On the other hand, new concepts can be defined easily by combining already given ones. Furthermore, the SHIQ system provides us with various reasoning capabilities that let us deduce implicit knowledge from the explicitly represented knowledge. These capabilities include deciding satisfiability and subsumption of SHIQ-concepts w.r.t. TBoxes and role hierarchies,


checking consistency of a SHIQ-TBox, and checking instances of concepts and roles (for more, please see [12, 14-16]).
For example, the instance assertion
Desk_spot_lamp(x)
means that x belongs to the concept Desk_spot_lamp. From axiom (9) we can obtain the following instances:
has_component(x, y1), has_component(x, y2), has_component(x, y3)
where has_component(x, y1) denotes that the desk spot lamp x has the component y1, and y1, y2, and y3 belong respectively to Heavy_base, Small_head and Short_arm.
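To make the example concrete, the following continuation of our earlier sketch (again ours, not the authors'; the individuals x, y1, y2, y3 are the hypothetical ones used above) encodes the desk spot lamp instances as a small interpretation and checks axiom (9) and definitions (11)-(12) extensionally.

# Continuation of the illustrative sketch: the desk spot lamp instances above,
# with axiom (9) and definitions (11)-(12) checked in this one interpretation.
domain   = {"x", "y1", "y2", "y3"}
conc_ext = {"Desk_spot_lamp": {"x"},
            "Heavy_base": {"y1"}, "Small_head": {"y2"}, "Short_arm": {"y3"}}
role_ext = {"has_component": {("x", "y1"), ("x", "y2"), ("x", "y3")}}

rhs_of_9 = ("and", ("exists", "has_component", "Heavy_base"),
            ("and", ("exists", "has_component", "Small_head"),
                    ("exists", "has_component", "Short_arm")))
# Axiom (9): every Desk_spot_lamp instance also falls under the right-hand side.
assert ext("Desk_spot_lamp", domain, conc_ext, role_ext) <= \
       ext(rhs_of_9, domain, conc_ext, role_ext)

TOP       = ("or", "Heavy_base", ("not", "Heavy_base"))   # top concept via abbreviation (3)
primitive = ("not", ("exists", "has_component", TOP))     # definition (11)
composite = ("not", primitive)                            # definition (12)
print(ext(primitive, domain, conc_ext, role_ext))         # {'y1', 'y2', 'y3'}
print(ext(composite, domain, conc_ext, role_ext))         # {'x'}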

5 Queries in Description Logics


As shown in Section 4, we have used the description logic SHIQ to present the terminology and assertions for an ontology centered on requirements in software engineering. The advantages of our technology based on SHIQ are the convenient expression of concept hierarchies, the ease of defining new concepts by combining already given ones, and the efficient and decidable reasoning capabilities. In addition, the requirement ontology described in SHIQ can be used for answering many common sense questions, by deduction using queries in description logics. In this paper, these queries are non-recursive queries whose predicates are the concepts and relations that appear in the DL knowledge base. For a comprehensive discussion on answering queries in DLs, see [12, 15, 16, 18].
A SHIQ knowledge base (KB) K is a SHIQ-TBox. A query expression q over a SHIQ knowledge base K is a non-recursive query of the form

q(x) ← conj1(x, y1) ∨ … ∨ conjm(x, ym) .   (16)

where conji(x, yi) is a conjunction of atoms; x and yi are the variables appearing in the conjunct; x may itself be a tuple x1, …, xn. Each atom has one of the forms C(t) or R(t1, t2), where t, t1, t2 are variables in x and yi or objects of the knowledge base. A query expression q is interpreted as the set q^I of tuples (c1, …, cn) of constants such that, when substituting each ci for xi, the formula

∃y1. conj1(x, y1) ∨ … ∨ ∃ym. conjm(x, ym) .   (17)

evaluates to true in the given interpretation I.


For the example in [1]: in a complex artifact, we may want to find out where a specific type (or class) of parts is used, i.e., to find out those parts that have a component of that type (let us denote it by T). We can express this as the following query expression:

q(x) ← conj(x, y) .   (18)

conj(x, y) = has_component(x, y) ∧ T(y) .   (19)
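As a small worked illustration (ours, reusing the toy interpretation sketched in Section 4), the query (18)-(19) can be answered by instantiating T with a concrete concept, say Heavy_base, and collecting every x with a has_component-filler in T:

# Illustrative continuation: answering q(x) <- has_component(x, y) AND T(y).
def answer_query(T, domain, conc_ext, role_ext):
    """All x that have at least one has_component-filler belonging to T."""
    T_ext = ext(T, domain, conc_ext, role_ext)
    return {x for (x, y) in role_ext.get("has_component", set()) if y in T_ext}

print(answer_query("Heavy_base", domain, conc_ext, role_ext))   # {'x'}

Note that, over a single fixed interpretation, this is exactly the extension of the concept ∃has_component.T, which is why such non-recursive queries stay within the reasoning services listed in Section 4.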


6 Conclusion
In this paper we used the description logic SHIQ to formally describe an ontology for requirements in the engineering design domain. The advantages of our technology based on SHIQ are the convenient expression of concept hierarchies, the ease of defining new concepts by combining already given ones, and the efficient and decidable reasoning capabilities.
We showed how our requirement ontology addressed some issues concerning requirement knowledge. The ontology provided communication of requirements by defining a well-defined syntax and semantics. It allowed checking for completeness, consistency, and satisfiability. It provided a knowledge base for tools that perform document creation.

References
1. Jinxin, L., Mark, S.F., Taner B.: A Requirement Ontology for Engineering Design. In:
Concurrent Engineering: Research and Application, Vol. 4, No.3. (1996) 279-291
2. Lu, R., Jin, Z., Chen, G.: Ontology-Oriented Requirements Analysis. Chinese Journal of
Software, Vol. 11, No. 8. (2000) 1009-1017
3. Lu, R., Jin, Z.: Formal Ontology: Foundation of Domain Knowledge Sharing and Reusing.
Chinese Journal of Computer Science and Technology, Vol. 17, No. 5. (2002) 535-548
4. Jin, Z.: Ontology-Based Requirements Elicitation. Chinese Journal of Computers, Vol. 23,
No. 5. (2000) 486-492
5. Baader, F., Horrocks, I., Sattler, U.: Description Logics as Ontology Languages for the
Semantic Web. In: Proc. of the International Conference on Conceptual Structures (ICCS
2003). LNAI, 2003. To appear
6. Guarino, N.: Formal Ontology: Conceptual Analysis and Knowledge Representation. International Journal of Human-Computer Studies, Vol. 43, No. 5/6. (1995) 625-640
7. Gruber, T.R.: Towards Principles for the Design of Ontologies Used for Knowledge Sharing. Int. Journal of Human-Computer Studies, Vol. 43, No. 5/6. (1995) 907-928
8. Abran, A., Cuadrado-Gallego, J.J., García-Barriocanal, E., Mendes, O., Sánchez-Alonso, S., Sicilia, M.A.: Engineering the Ontology for the Swebok: Issues and Techniques. In: Calero, C., Ruiz, F., Piattini, M. (eds.): Ontologies for Software Engineering and Software Technology. New York: Springer. (2006) 103-122
9. Sicilia, M.A., Cuadrado-Gallego, J.J., Rodríguez, D.: Ontologies of Software Artifacts and Activities: Resource Annotation and Application to Learning Technologies. 2005 International Conference on Software Engineering Research and Practice, Taipei (Taiwan). (2005) 145-150
10. Calvanese, D., Lenzerini, M., Nardi, D.: Description Logics for Conceptual Data Modeling. In: Logics for Databases and Information Systems. Kluwer Academic Publishers, (1998) 229-264
11. Calvanese, D., Lenzerini, M., Nardi, D.: Unifying Class-based Representation Formalisms. Journal of Artificial Intelligence Research, Vol. 11. (1999) 199-240
12. Calvanese, D., Giacomo, G.D., Lenzerini, M., Nardi, D.: Reasoning in Expressive Description Logics. In: Robinson, A., Voronkov, A. (eds.): Handbook of Automated Reasoning. Elsevier Science Publishers (North-Holland), Amsterdam. (2001) 1581-1634


13. Calvanese, D., Giacomo, G.D., Lenzerini, M.: Description Logics: Foundations for Class-based Knowledge Representation. Proc. of the 17th IEEE Sym. on Logic in Computer Science (LICS 2002). Copenhagen, Denmark (2002) 359-370
14. Horrocks, I., Sattler, U., Tobies, S.: Practical Reasoning for Expressive Description Logics. Proc. of the 6th International Conference on Logic for Programming and Automated
Reasoning. LNAI 1705. (1999) 161-180
15. Horrocks, I., Sattler, U.: Optimised Reasoning for SHIQ. In: Proc. of the 15th Eur. Conference On Artificial Intelligence (ECAI 2002). (2002) 277-281
16. Horrocks, I., Sattler, U., Tobies, S.: Reasoning with Individuals for the Description Logic
SHIQ. Proc. of the 17th Conference on Automated Deduction (CADE-17). LNCS 1831.
(2000) 482-496
17. Calvanese, D., Giacomo, G.D., Lenzerini, M.: A Framework for Ontology Integration.
Proc. of the First Int. Semantic Web Working Symposium (SWWS 2001). (2001) 303-316
18. Calvanese, D., Giacomo, G.D.: Answering Queries Using Views over Description Logics
Knowledge Base. Proc. of the 16th Nat. Conf. on Artificial Intelligence (AAAI 2000).
(2000) 386-391

Ontologies and Software Engineering


Waralak V. Siricharoen
Computer Science Department, School of Science,
University of the Thai Chamber of Commerce (UTCC)
Dindeang, Bangkok, Thailand
lak_waralak@yahoo.com, waralak_von@utcc.ac.th

Abstract. This paper is about using ontologies to identify the objects from a problem domain text description. At the center of object models are the objects within a given problem domain, which are similar to the concepts provided by ontologies. This paper addresses ontologies as the basis of a methodology for object modeling, including available tools, particularly OntoExtract, which can help the conversion process. The paper describes how developers can implement this methodology on the basis of an illustrative example.
Keywords: Ontologies, Software Engineering, Object Models, Artificial Intelligence.

1 Introduction
An ontology, in its more general and well-known definition, is a specification of a conceptualization [9]. Ontologies are described syntactically on the basis of languages such as the eXtensible Markup Language (XML), XML Schema, the Resource Description Framework (RDF), and RDF Schema (RDFS). The object oriented paradigm is the prevailing framework in software engineering, influencing all efforts in information science. Discovering the right objects seems to be the most difficult task in the whole development process. Object oriented software development is well supported by numbers of working methods, techniques, and tools, except for this starting point: object identification and building the related system object model. Converting the text description of the system problem domain and the respective functional requirement specifications into an object model is usually left to the intuition and experience of developers (system analysts). Recently there has been great research interest in applying ontologies for solving the "language ambiguity problem" in either an ontology-driven or an ontology-based approach [9]. This is true for object oriented software engineering, mainly because of the similarity in the principles of the two paradigms. Moreover, object systems, which, similarly to ontologies, represent a conceptualized analysis of a given domain, can be easily reused for different applications [10]. An ontology is a specification of a representational vocabulary for a shared domain of discourse (definitions of classes, relations, functions, and other objects) [1] or, more generally, a specification of conceptualization [2]. The Semantic Web uses ontologies as a tool for easy integration and usage of content by building a semi-structured data model.
To solve the problem of heterogeneity in developing software applications, there is a need for specific descriptions of all kinds of concepts, for example, classes (general things), the relationships that can exist among them, and their properties (or attributes).
The proposed methodology described in this paper is based on five submodels; only two of them, namely the text description model (T-model) and the class (object) model (C-model), are included in the classical object oriented software development process. The other models represent the specific analysis work which developers should do to benefit from using ontologies for the identification of objects. The basic idea is to ensure a suitable transformation of the models from one to another using respective procedures and tools. This paper is structured as follows: section 2 is dedicated to a more detailed description of the models as well as to a discussion of the techniques and tools which can be practically used for model transformation. An illustrative example of a part of an information system for the domain of academic management is used throughout the paper to support the explanations; finally, section 3 summarizes the proposed methodology.

2 The Proposed Methodology


The proposed methodology is based on the integration of five submodels (T-model, O-model, OL-model, I-model, and C-model). Models are inseparable from, and one of the most significant parts of, any methodology. They help developers to better understand complex tasks. Object oriented analysis of a system under development is a good example of such a complex task. The complexity stems from the fact that in object oriented development everything is based on objects, but their identification in a given problem domain is completely left to the intuition of the developer. Fig. 1 shows the basic idea of the models used and the transformation process applied to them. The details of the models are described as follows.
2.1 T-Model: Text Description Model
The starting point of the object identification is the T-model, which represents a concise description of the problem domain where the software system under development will work, written in English. If not available, the T-model is a deliverable from a system analyst's work on the general user requirements for the system functionality. The presumption is that this problem domain description contains the main objects. This text is represented as an ontology description after processing by an ontological engine tool, in our case Corporum OntoExtract [3]. It is a web-based version of Corporum which is able to extract ontologies and represent them in XML/RDF/OIL (by default in RDF Schema), and also to communicate with and negotiate the final format of the to-be-submitted ontology extracted from a specific text [4]. This tool can interpret text, in the sense that it builds ontologies that reflect world concepts as the user of the system sees and expresses them. So at this point in the process, the text is automatically processed and converted into ontologies, which can be done online. To help this process we refer to a tool of conceptualization, an


ontological engine, which, applied to the T-model, generates an ontological description (O-model) of the problem domain at hand.
2.2 O-Model: Ontological Model
The ontology described in RDFS defines the names and relations of the extracted concepts, or object names. RDFS provides a mechanism to define domain-specific properties and classes of resources to which developers may apply those properties [5]. More specifically, an ontology description is written in a recognizable ontology language. Classes are specified with <rdfs:Class>. Subclasses and subproperties are specified using <rdfs:subClassOf> and <rdfs:subPropertyOf>, respectively (the top class defined in the schema is Resource). When a class is a subclass of several superclasses, this is interpreted as a conjunction of superclasses [7]. A class may also be defined as a subclass of other classes if evidence is found that the class is indeed a subclass. A subclass relationship found by this tool is based on information about the term [3].
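As a small hedged illustration of the kind of RDFS description discussed here (our own sketch, not OntoExtract's actual output; the ex: names are invented and the rdflib library is assumed), a class, a subclass and a domain-specific property can be built as follows:

# Minimal RDFS sketch: Person, a Student subclass, and one property.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/academic#")   # hypothetical namespace
g = Graph()
g.bind("ex", EX)
g.add((EX.Person,  RDF.type, RDFS.Class))
g.add((EX.Student, RDF.type, RDFS.Class))
g.add((EX.Student, RDFS.subClassOf, EX.Person))  # <rdfs:subClassOf>
g.add((EX.enrolledIn, RDF.type, RDF.Property))
g.add((EX.enrolledIn, RDFS.domain, EX.Student))  # domain-specific property
print(g.serialize(format="xml"))                 # RDF/XML, as in the O-model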


Fig. 1. The proposed methodology of objects identification using ontologies

An important category exported by the CORPORUM OntoExtract engine is that of cross-taxonomic relations. While a typical ontology often represents a taxonomy,


<isRelated> refers to cross-taxonomic links that may exist within a domain and, if
represented, can make a difference in finding needed information based on context
descriptions. In short, it can identify the possible relations between objects.
2.3 OL-Model: Ontologies Library Model
Data (attributes) and functions (methods, operations) are the two fundamental parts of any object. We could continue in this way, relying on the decision making abilities of the developers, towards the final acceptable object model of the system. However, because of the requirement for decision making, this process can still be characterized as subjective or even intuitive, which was the main reason for proposing these models. To avoid this situation we can recall the most powerful feature of both object and ontology orientation: they allow for a high degree of reusability of their artifacts in different application domains. The idea is very simple: if something is already defined, checked successfully and used in practice, then, perhaps with some adjustments, it can be reused for another developer's needs. This idea is implemented and used broadly in object oriented software engineering through business objects and related patterns, shown in more detail, for example, in Batanov and Arch-int, 2003.
This paper proposes an extension of this idea, introducing the notion of an Ontological Business Object Pattern (OBOP). An OBOP is an ontology-based description of a business object that presumably will be included as a working object in the object oriented software system. It relies on the fact that there is a great number of ontological descriptions of concepts (objects) in different problem domains, already existing and available from ontology library systems such as WebOnto, Ontolingua, the DARPA Agent Markup Language (DAML), Simple HTML Ontology Extensions (SHOE), etc.
What is used of the DAML ontology library and the SHOEntity library is, more specifically, their catalogs of ontologies, which are available in XML, HTML, and DAML formats. Here classes are called categories; these categories constitute a simple is-a hierarchy, while the slots are binary relations [8]. What the developer should do at this phase is to select the suitable ontology for the respective problem domain. Fig. 2 shows an example of how an available ontological description for our particular problem domain can be considered as an OBOP. The representation of ontology specifications is standardized in the form of object descriptions, and this provides a great advantage for software developers. For example, the ontological description shown in Fig. 2 is found in the ontology library and has a structure which can be used by the developer directly, not only as a class hierarchy but as a structured content of the respective classes. Therefore, this description can be considered as an OBOP. Within this pattern the concept (object) "student" possesses exactly the properties (attributes) necessary for the system under development. We can say the same for the root concept (object, class) "person". Moreover, in the ontology the attributes themselves are treated as concepts (objects), just as in object orientation, which means that we can follow and extract the description of all objects that we are interested in within the class hierarchy.
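A minimal sketch (ours, not the paper's tooling) of how such an OBOP carries over into the C-model: the is-a hierarchy becomes class inheritance and the slots become attributes; the attribute names below are hypothetical.

# Hypothetical mapping of the person/student pattern of Fig. 2 into classes.
from dataclasses import dataclass

@dataclass
class Person:                  # root concept of the pattern
    name: str
    email: str

@dataclass
class Student(Person):         # "student" is-a "person" in the OBOP
    student_id: str
    department: str

s = Student(name="A. Student", email="a.student@example.edu",
            student_id="s-001", department="Computer Science")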


Fig. 2. Example of ontological class hierarchy used as a pattern of computer science department
ontology version 1.1 [6]

2.4 I-Model: Integrated Model


From the O-model we capture some objects from the text description; on the other hand, the OL-model provides some relevant existing concepts from the ontology libraries. The I-model plays the important role of mapping the objects that appear in both models into working objects in this particular problem domain. In order to emphasize the necessity of this model, we review what information the developer has up to this point from working with the models described above:
1. The set of objects in the problem domain PD = {O1, O2, O3, ..., Oa} with their names and relationships, extracted from the T-model by an ontological engine (OntoExtract).
2. The set of objects FOE = {O1, O2, O3, ..., Ob} with their names and relationships, as a result of applying an ontological engine.
3. The set of objects BOP = {O1, O2, O3, ..., Od} with their names, relationships (including hierarchical information) and functions, as a result of searching for OBOPs in ontology libraries (DAML and SHOEntity).

Fig. 3 shows the existing situation in graphical form, although far from precisely. Without a doubt all objects are within the system problem domain, but their number is still large and they are defined from different perspectives. The presumption, based on a number of experiments, is that the basic objects, which will play a substantial role in ensuring the system functionality, will appear in the above models regardless of the perspective. This practically means that we can apply a simple integration procedure, the intersection of the above sets, to identify those objects. In Fig. 3 the resulting area is O, or


Fig. 3. Integration Model

O = PD ∩ FOE ∩ BOP
The final decision should be taken by the developer.
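As a tiny illustration of this integration step (our example; the object names are invented placeholders), the procedure is a plain set intersection:

# The integration procedure O = PD ∩ FOE ∩ BOP as a set intersection.
PD  = {"Student", "Course", "Lecturer", "Room", "Timetable"}   # from the T-model
FOE = {"Student", "Course", "Lecturer", "Enrollment"}          # from the ontological engine
BOP = {"Person", "Student", "Course", "Lecturer"}              # from the ontology libraries

O = PD & FOE & BOP            # candidate working objects
print(sorted(O))              # ['Course', 'Lecturer', 'Student']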
2.5 C-Model: Class Model
The O area in Fig. 3, obtained from the I-model, consists of the objects of interest, which will appear in the C-model. The C-model is the goal of the preliminary analysis of object oriented systems. This is the well-known class hierarchy representation, including some initial but significant relationships for the system functionality and the contents of the objects: data and behavior (functions, operations). We stress the word "initial" here to emphasize the fact that the analysis is far from over yet. The developer should continue applying the conventional analysis models, methods and techniques to the C-model, which can lead to substantial changes, including adding new objects, deleting some objects, adding or removing some elements of the included objects, etc. The C-model can be represented graphically using different tools such as Rational Rose (class diagrams), textually using either some natural language or a pseudo programming language, or finally using some highly structured tag-based language.

3 Conclusions
The ultimate goal of the developer's efforts is actually the identification of objects, i.e., the C-model. This is because the C-model should contain the possible objects necessary for the next phases of design and implementation of the software system. The already mentioned problem with "language ambiguity" is now clear: different interpretations of the T-model, without any formal support for the choice of participating objects, would lead to C-models which are quite probably inconsistent, incomplete or inefficient for the further steps of design and implementation. We believe that using ontology as a tool of conceptualization working on the T-model can make (if not fully formal, at least) semi-formal the


process of creating the C-model and in this way help developers in this complex and
imprecise task. This is the major motivation of the work described briefly in this
paper.
The author believes that merging ontologies with the existing methods, techniques, and tools used during the analysis phase of complex object oriented software systems can contribute significantly to reaching better decisions, with a positive effect on all the subsequent phases of the development process. This paper describes a methodology for supporting the analysis phase of object oriented software engineering using ontologies for the identification of system objects. Five submodels are introduced and briefly described in the paper as a part of this methodology. The author also believes that these models and the process of their transformation can help developers of complex object oriented software systems to: (a) transform user requirements (represented as a text description) into an object model of the system under development based on the use of ontologies; (b) improve the effectiveness and efficiency of the existing methodology for high-level system analysis in object oriented software engineering.

References
1. Gruber, T.R.: Towards Principles for the Design of Ontologies Use for Knowledge
Sharing, In Proceedings of IJHCS-1994, Volume 5 No.6 (1994) 907-928
2. Cullot, N., Parent, C., Spaccapietra, S., and Vangenot, C.: Ontologies : A contribution to
the DL/DB Debate. Available online: downloaded on January (2005)
3. Engles, R.: Del 6: CORPORUM OntoExtract ontology extraction tool, On-ToKnowledge: Content-driven knowledge management tools through evolving ontologies. In
IST project IST-1999-1032, On-To-Knowledge (2001)
4. Engles, R.H.P., Bremdal, B. A., and Jones, R.: CORPORUM: a workbench for the
semantic web, In EXML/PKDD workshop, CognIT a.s., (2001)
5. Klein, M.: XML, RDF and relatives, In IEEE Intelligent Systems March/April (2001)
26-28
6. Heflin, J.:SHOE Ontologies, Available online: http://www.cs.umd.edu/projects/plus/
SHOE/onts/cs1.1.html, downloaded on January (2007)
7. Gil, Y., and Ratnakar, V.: A comparison of (semantic) markup languages, In Proceedings
of the 15th International FLAIRS Conference, Special Track on Semantic Web, Pensacola,
FL (2002)
8. Noy, N.F., Sintek, M., Decker, S., et al.: Creating semantic web contents with Protégé-2000. In IEEE Intelligent Systems, March/April (2001) 60-61
9. Deridder, D., and Wouters, B.: The Use of Ontologies as a Backbone for Software
Engineering Tools, Programming Technology Lab, Vrije Universiteit Brussel, Brussels,
Belgium (1999)
10. Swartout, W.: Ontologies, In IEEE Intelligent Systems, January/February (1999) 18-25

Epistemological and Ontological Representation in Software Engineering
J. Cuadrado-Gallego1, D. Rodríguez1, M. Garre1, and R. Rejas2
1 Department of Computer Science, University of Alcalá, Madrid, Spain
2 Department of Computer Science, Francisco de Vitoria University, Madrid, Spain
{jjcg,daniel.rodriguezg,miguel.garre}@uah.es, r.rejas.prof@ufv.es

Abstract. This paper provides an overview of how empirical research can be a valid approach to improving the epistemological foundations and ontological representations in Software Engineering (SE). Despite all the research done in SE, most of the results have not yet been stated as laws, theories, hypotheses or conjectures, i.e., from an epistemological point of view. This paper explores this fact and advocates that the use of empirical methods can help to improve this situation. Furthermore, it is also imperative for SE experiments to be planned and executed properly in order to be epistemologically valid. Finally, this paper presents some epistemological and ontological results obtained from empirical research in SE.
Keywords: epistemological foundation; ontological representation; SE.

1 Introduction

This paper presents empirical software engineering (SE) research from an epistemological and ontological point of view, where the relevant concepts can be defined as follows:
Epistemology is the branch of philosophy that studies the origin, nature, and limits of human knowledge [28]. From an epistemological point of view, the problem is how knowledge is acquired in the SE domain. Holloway [15] has described how knowledge can be acquired by the following epistemological approaches:
- Authority-based epistemology states that truth is given to us by an authority, in the case of SE a human expert [15].
- Reason claims that what is true is that which can be proven using deductive logic. Reason dictates conditional absolute truth; if the premises on which a valid deductive argument rests are known to be true, then the conclusion of the argument must also be true [15].

- Experience claims that what is true is that which can be encountered through the senses. The two variations relevant to this discussion are: (i) anecdotal experience yields possible truth; if something happened for one person, it is possible it may happen to others also; (ii) empirical evidence states that truth is that which can be verified. Empirical evidence provides probable truth; if controlled experiments are designed properly and replicated, then it is highly probable that the results accurately describe reality [15].
Empirical Research is any activity that uses direct or indirect observation as its test of reality. In theoretical research, it is a form of inductive reasoning. It may also be conducted according to hypothetico-deductive procedures [28].
SE is defined by the IEEE [16] as "the application of a systematic, disciplined, quantifiable approach to development, operation, and maintenance of software; that is, the application of engineering to software".
Formal ontologies are engineered artifacts aimed at representing a shared, consensual conceptualization of the knowledge of a given domain [14].
Therefore, epistemology in SE aims to establish the origins, nature and limits of knowledge in SE. Such knowledge should also be represented in a consensual, shared and optimal way, i.e., in an ontology of the SE domain [14]. Taking into account the epistemological approaches indicated above, i.e., authority-based, reasoning and experience, when applied to SE techniques, it is possible to claim that:
- SE has advanced for many years by mainly following authority-based epistemologies, i.e., imposed by expert opinions or by organisations promoting a set of technologies. For example, in [6] there are 25 hypotheses such as "Object models reduce communication problems between analysts and users". As Holloway states [15], this is a weak epistemological foundation on which to base an entire discipline.
- Deductive logic in SE could be the application of traditional engineering techniques in SE.
- Experimental research has received in recent years many favorable announcements as an important epistemological way forward for SE.
Another research work in the area is the one by Aaby [1], which explores the foundations of Software Engineering from the perspectives of ontology, epistemology and axiology.
The organisation of the paper is as follows. Section 2 provides an overview of empirical research in SE. Section 3 discusses epistemology in the context of empirical SE research. Section 4 presents the use of ontologies in SE. Finally, Section 5 concludes the paper.

2 Empirical Research in Software Engineering

The use of empirical methods, closer to the scientific research method, consists of observing the world, proposing a model or theory of behaviour, measuring and


analysing, validating the hypotheses of the model or theory, and, if possible, repeating the experiment [13]. The scientific method of research stands in opposition to advocacy research, where a researcher conceives an idea, analyses the idea, and advocates the idea. Furthermore, researchers have highlighted the positive aspects of empirical research methods (Fenton et al. [9], etc.).
The use of empirical research methods in SE is small compared with other disciplines. For example, Zelkowitz and Wallace [29] analysed over 600 papers in the computer science literature and over one hundred papers from other scientific disciplines in order to (i) analyse whether the computer science discipline validates its theories and (ii) compare it with other disciplines. They found that around 30% of papers did not include experimentation although it was needed, and that only 10% of the papers that included experimentation used controlled experimentation methods. Tichy et al. [27] also analysed over 400 papers in the computer science literature, concluding that 40% of the papers that required empirical validation did not include the required experimentation (compared with only 10% to 15% of papers in other engineering disciplines).
In the last decades, however, the importance of empirical research in SE has grown considerably, as noticed by Zelkowitz and Wallace [29]. As a consequence, there are also numerous books, journals and conferences dedicated totally or partially to disseminating the results of empirical research in software engineering.

3 Epistemology in Empirical Software Engineering Research

Important issues in epistemology include the nature of knowledge, its presuppositions and foundations, and its extent and validity. This describes what things are known and how that knowledge is acquired. Plato's Theaetetus already defined knowledge as "justified true belief". The problem is how to define what we know to be true and to what extent it is true. Modern philosophers such as Gettier [12] state that a belief does not need to be fully justified to be true. Furthermore, Popper states that a law is a hypothesis that has not yet been falsified. For him, a scientific idea can never be proven true, because no matter how many observations seem to agree with it, it may still be wrong. On the other hand, a single contrary experiment can prove a theory forever false. Theorists have reached consensus in defining the meaning of laws, theories, hypotheses and conjectures:
- Laws are generalizations about how things happen. From observations we can generalize about what is expected to happen, but we do not need to have an explanation of why things happen.
- Theories are explanations of laws, i.e., of why things happen.
- A hypothesis is a tentative explanation that accounts for a set of facts and needs to be confirmed by observation.
- Conjectures are statements, opinions, or conclusions based on inconclusive or incomplete evidence.

3.1 Epistemology in Software Engineering

Despite the amount of research performed in empirical software engineering, most of the results have not yet been stated as laws, theories, hypotheses or conjectures, i.e., from an epistemological point of view. An exception could be the work of Lehman and Ramil [20]. They discuss why there is a need to develop a theory of software evolution on a large scale instead of analysing microcosms of software evolution.
Following this approach, Endres and Rombach [6] recently compiled a set of laws, hypotheses and conjectures related to Engineering and Information Systems. Although many of these reported laws are validated only by experience or lessons learned, such a formulation can help other researchers to design and create experiments in such a way that proper research questions are formulated. Experiments can then be designed properly to validate or refute hypotheses or theories.
3.2 Problems with Empirical Software Engineering Research

Not all empirical research in SE has been carried out properly and, as Lehman states, "our laws are certainly weaker than those formulated in the biological sciences...".
In order for empirical research to improve the epistemological value of SE, hypotheses need to be stated properly. In an initial attempt, Kitchenham et al. [18] discuss a set of SE empirical guidelines, highlighting the fact that SE empirical research is still rather poor. In relation to hypotheses, they comment that many hypotheses are shallow hypotheses because they do not reflect an explanatory theory, and as a result those experiments do not increase SE knowledge, i.e., they do not provide value from an epistemological point of view. Also, Fenton et al. [9] state that 5 questions should be asked of empirical research studies:
- Is it based on empirical evaluation?
- Was the experiment designed correctly?
- Is it based on a toy or a real situation?
- Were the measurements used appropriate to the goals of the experiment?
- Was the experiment run for a long enough time?

Problems with empirical research can include the experimental design itself, the samples and the scope (toy vs. real: just as software-development-in-the-small differs from software-development-in-the-large, research-in-the-small may differ from research-in-the-large [9]), and the use of appropriate measures. An example of the postulates of Kitchenham et al. [17] is the necessity of applying the representational theory of measurement to software measurements. Fenton and Pfleeger [10] provide guidelines to define and apply metrics to measure process or product characteristics. Also, the validity of metrics from an empirical and theoretical point of view has been discussed in the literature. The representation condition [10] asserts that a measurement mapping M must map entities into numbers and empirical relations into numerical relations in such a way that the empirical


relations preserve and are preserved by the numerical relations. For example, A is taller than B if and only if M(A) > M(B) [10]. It is a well-known fact that there are many defined SE metrics that do not follow these principles. Examples include McCall's complexity metric, which has no unit.
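As a minimal illustration of the representation condition (our sketch; the people and heights are invented data), a mapping M is admissible for the empirical relation "taller than" only if the relation and the numerical order agree on every pair:

# Checking the representation condition: taller(A, B)  <=>  M(A) > M(B).
heights_cm  = {"Ann": 172, "Bob": 180, "Cal": 165}                  # the mapping M
taller_than = {("Bob", "Ann"), ("Bob", "Cal"), ("Ann", "Cal")}      # empirical relation

def satisfies_representation_condition(M, relation):
    people = list(M)
    return all(((a, b) in relation) == (M[a] > M[b])
               for a in people for b in people if a != b)

print(satisfies_representation_condition(heights_cm, taller_than))  # True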
A correct way to establish the epistemological foundations of software measurement is that a set of metrics must make clear what aspects of quality they are trying to measure and at whom they are directed, because there are different points of view about what quality means (developers, managers, users). In the SE domain, metrics should be based on a range of properly defined quality models, from fixed hierarchical models (Boehm, McCall's FCM) to more flexible approaches such as the Goal-Question-Metric (GQM) [4].
Finally, empirical evaluation is needed to help evaluate, predict, understand, control and improve the software development process or product [3]. Furthermore, Fenton and Pfleeger [10] argue that conducting empirical evaluations is the only way to improve software processes and products.
3.3 Some Epistemological Results of the Empirical Research Approach to SE

Epistemology can help us define assertions and how those assertions in SE can be verified. Also, the concept of ontologies in Software Engineering is aimed at defining factors and relationships that help to clarify what needs to be reported in empirical studies. Epistemological and ontological foundations can help in conducting research and advancing the SE domain without the antinomies that we have at the moment. For example, Lehman and Ramil [20] discuss why there is a need to develop a theory of software evolution, its practical impact and underlying strategies, and outline a strategy for the development of the theory. So far, Lehman is one of the few authors who have defined their work as hypotheses and empirically evaluated them to create laws.
There are also dichotomies in SE. For example, in the Formal Methods domain, Floyd [11] suggests a discussion from a philosophical point of view that can be extrapolated to other SE domains. Formal methods have focused quite a lot on the mathematical aspect. Floyd argues that a change from a product oriented perspective (which regards software as a product) to a process oriented perspective (which views software in connection with human learning) is needed. In SE, the same dichotomy exists in the same context and in others. For example, metrics and quality are usually related either to products or to processes, but not much research has been carried out linking both aspects. Other examples where a long term view of empirical research has contradicted previous research include:
- Regarding estimation, Kitchenham states that "Function Points are not better than Lines of Code (LOC)". Also, Dolado [5] has shown that some modern data mining estimation methods are not better than classical regression.
- The experiment by Shneiderman et al. [24] analysed whether an algorithm is easier to comprehend if presented as a flow chart or as pseudo code. The authors concluded that there is no difference between the two. Years


later, Scanlan [23] highlighted some experimental flaws of the original experiment (not taking time into account, questions that could only be answered from the pseudocode, and too simple a program). Scanlan showed that flowcharts outperform pseudocode under a proper experimental design.
- Etc.
Finally, perhaps the most interesting point for the future of SE from the Popper theses regarding epistemology is that every solution of a problem raises new unsolved problems. A current line of research in empirical SE is the adoption of an evidence-based approach, following the success of the evidence-based paradigm in medicine during the late 80s and early 90s. By analogy with evidence-based medicine, Kitchenham et al. [19] state that the goal of Evidence-based Software Engineering (EBSE) should be "to provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision making process regarding the development and maintenance of software".

4 Ontologies in Software Engineering

In SE, ontologies have been used as a resource to integrate information, to communicate what people have achieved, to adapt the goals of the organization and to support the efficiency of its processes. Ontologies can also be used by applications that require a higher level of formality of definition. For example, the cataloguing of learning resources or the mapping of vocabularies from different information sources require precise definitions, or at least significant characterizations that help in deciding which terms to use in practical situations. Althoff et al. [2] describe an architecture oriented to reusing experience in SE that uses ontologies as the underlying formalism.
From an ontological point of view, empirical research has different considerations. On the one hand, the use of empirical research with the objective of improving the epistemological foundations of SE could help in producing more precise definitions, which in turn will provide more precise and useful ontologies.
The IEEE Standard Glossary of Software Engineering Terminology [16] is a well-known attempt to provide precise characterizations of the main terms in the field. It uses natural language prose, which is useful for efficient communication, but it does not provide a clear demarcation of each of the concepts.
On the other hand, current work creating ontologies based on standards will help in defining entities, parameters and the relationships between them. This in turn will help to define experiments, stating which variables could be considered as dependent or independent variables in the design of an experiment and how they are related. An ISO standard, the Guide to the Software Engineering Body of Knowledge (SWEBOK), provides an agreement on the content of what composes the SE discipline. The SWEBOK project opened new possibilities for ontology engineering in the field of SE, since it represents a shared consensus on the contents of the discipline and provides pointers to relevant literature on each of its concepts; both are important elements in ontology engineering [14,25].


5 Conclusions

This paper provided an overview of empirical research from an epistemological and ontological point of view, linking these concepts in the context of SE. The paper also highlighted the fact that there are few laws, theories, hypotheses or conjectures in SE. This fact is not only related to the lack of research in this area, which has grown considerably in the last decade, but may also be related to the epistemological approximations used. The use of empirical methods following the scientific method of research can help to improve this situation.
Empirical research must be properly defined to be valid from the epistemological point of view. There are many antinomies still operating that need to be solved, taking into account the established empirical and research methods and the initial guidelines [18] that are appearing in the SE domain. Some epistemological and ontological results were presented highlighting these facts.

Acknowledgement
This work has been supported by the project MCYT TIN 2004-06689-C03.

References
1. Aaby, A.A.: The Philosophical Foundations of Software Engineering, Draft available at: http://cs.wwc.edu/aabyan/Articles/SE/.
2. Althoff, K.-D., Birk, A., Hartkopf, S., Müller, W., Nick, M., Surmann, D. and Tautz, C.: Systematic Population, Utilization, and Maintenance of a Repository for Comprehensive Reuse. In: Learning Software Organizations - Methodology and Applications, Springer Verlag, LNCS 1756, (2000) 25-50
3. Basili, V., Selby, R.W. and Hutchens, D.H.: Experimentation in Software Engineering, IEEE Trans. on Soft. Eng. 12(7) (1986) 733-743
4. Basili, V.R., Caldiera, G. and Rombach, H.D.: The Goal Question Metric Paradigm, in Encyclopedia of Software Engineering, John Wiley & Sons, Inc., (1994) 528-532
5. Dolado, J.J. and Fernandez, L.: Genetic Programming, Neural Networks and Linear Regression in Software Project Estimation, in International Conference on Software Process Improvement, Research, Education and Training (INSPIRE'98) (1998) 157-171
6. Endres, A., Rombach, H.D.: A Handbook of Software and Systems Engineering:
Empirical Observations, Laws and Theories. Pearson Addison Wesley, (2003)
7. Briand, L., Bunse, C., Daly, J., Differding, C.: An Experimental Comparison of the Maintainability of Object-Oriented and Structured Design Documents, Empirical Software Engineering 2(3), (1997) 291-312
8. Chidamber, S.R. and Kemerer, C.F.: A metric suite for object oriented design, IEEE Trans. on Soft. Eng. 20(6) (1994) 476-493
9. Fenton, N.E., Pfleeger, S.L. and Glass, R.L.: Science and Substance: A Challenge to Software Engineers. IEEE Software 11(4) (1994) 86-95
10. Fenton, N.E. and Pfleeger, S.L.: Software Metrics: A Rigorous and Practical Approach, 2nd Edition, PWS (1997)


11. Floyd, C.: Theory and Practice of Software Development, Stages in a Dialogue. LNCS Vol. 915, 6th International Joint Conference CAAP/FASE on Theory and Practice of Software Development, (1995) 25-41
12. Gettier, E.: Is Justified True Belief Knowledge?, Analysis 23. Available at: http://www.ditext.com/gettier/gettier.html
13. Glass, R.L.: The Software-Research Crisis. IEEE Software 11(6) (1994) 42-47
14. Gruber, T.: Towards principles for the design of ontologies used for knowledge sharing. Int'l Journal of Human-Computer Studies 43(5/6) (1995) 907-928
15. Holloway, C.M., Epistemology, Software Engineering, and Formal Methods Abstract of Presentation The Role of Computers in LaRC R&D, June 15-16 (1994).
Available at: http://shemesh.larc.nasa.gov/people/cmh/epsefm-tcabst.html
16. IEEE, IEEE Standard Glossary of Software Engineering Terminology, IEEE Std
610.12-1990, (1990)
17. Kitchenham, B., Pfleeger, S.L., Fenton, N.: Towards a Framework for Software Measurement Validation. IEEE Trans. on Soft. Eng. 21(12) (1995) 929-944
18. Kitchenham, B.A., Pfleeger, S.L., Pickard, L.M., Jones, P.W., Hoaglin, D.C., El Emam, K., Rosenberg, J.: Preliminary Guidelines for Empirical Research in Software Engineering. IEEE Trans. on Soft. Eng. 28(8) (2002) 721-734
19. Kitchenham, B.A., Dyba, T., Jorgensen, M.: Evidence-Based Software Engineering. 26th IEEE International Conference on Software Engineering (ICSE'04) (2004) 273-281
20. Lehman, M., Ramil, J.F.: Towards a Theory of Software Evolution and its Practical Impact, Proceedings Int'l Symposium on Principles of Software Evolution, ISPSE 2000, 1-2 Nov, Kanazawa, Japan (2000) 2-11
21. Lenat, D.B.: Cyc: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11) (1995) 33-38
22. Popper, K.R.: Conjectures and Refutations, Routledge and Kegan Paul, (1963)
23. Scanlan, D.A.: Structured Flowcharts Outperform Pseudocode: An Experimental Comparison. IEEE Software 6(5) (1989) 28-36
24. Shneiderman, B.B., Mayer, R., McKay, D., Heller, P.: Experimental Investigations of the Utility of Detailed Flowcharts in Programming. Communications of the ACM 20(6) (1977) 373-381
25. Sicilia, M.A., Garcia, E., Aedo, I., Diaz, P.: A literature-based approach to annotation and browsing of Web resources. Information Research 8(2), (2003) 110
26. Tautz, C. and von Wangenheim, C.G.: REFSENO: A Representation Formalism
for Software Engineering Ontologies, Fraunhofer IESE IESE-015.98 (1998)
27. Tichy, W.F., Lukowicz, P., Prechelt, L., Heinz, E.A.: Experimental evaluation in computer science: a quantitative study, Journal of Systems and Software 28(1) (1995) 9-18
28. Wikipedia: The Free Encyclopaedia. Available: http://en.wikipedia.org/wiki/
Epistemology
29. Zelkowitz, M.V., Wallace, D.: Experimental Validation in Software Engineering. Information and Software Technology 39(11) (1997) 735-743

Exploiting Morpho-syntactic Features for Verb Sense Distinction in KorLex
Eunryoung Lee1, Ae-sun Yoon1, and Hyuk-Chul Kwon2
1 Pusan National University, Korean Language Processing Laboratory
2 Pusan National University, Department of Computer Science and Engineering,
Jangjeon-dong, Geumjeong-gu, 609-735 Busan, S. Korea
{eunryounglee, asyoon, hckwon}@pusan.ac.kr

Abstract. Verb sense distinction is a basic principle in lexical knowledge representation in wordnets. Starting from the results of automatically mapping the English verb WordNet to the Korean verb KorLex, the present study looks for a syntactic and semantic verb sense interface using the morpho-syntactic features of Korean verbs and proposes a fine-grained verb sense distinction to make up for the weak points of WordNet in NLP applications.
Keywords: WordNet, KorLex, Korean wordnet, verb sense distinction, transitivity alternation, Korean middle verbs.

1 Introduction
A Korean verb wordnet, named KorLex, was constructed based on the word-sense mapping of the English WordNet (Princeton, version 2.0, hereafter WN) to Korean1 verbs. During semi-automatic translation, we noticed the following problems: (1) the English verbs of accusative (acc)/inaccusative (inacc) alternation were mapped to two or more morpho-syntactically different Korean lexical entities; (2) two English verb senses were matched to a single Korean middle verb (MV), having both acc/inacc features but considered as having a unique sense in most Korean dictionaries. To cope with these problems, the verb sense distinction in KorLex (KL) should be processed primarily on the basis of coherent linguistic criteria. For this purpose, we explore morpho-syntactic features that are strikingly marked at the lexical level in Korean, and show how to apply them to KL's hierarchical structure.
The paper is structured as follows: Section 2 consists of an overview of related
work in English and in Korean. In Section 3, we look at the translation results of
2,322 verb synonym sets (synsets) and analyze the types of mismatches. Section 4
proposes to allocate the distinguished verb senses according to their morpho-syntactic
properties in the Korean verb hierarchy.
1 Korean is an agglutinative language and the word order, as determined by case markers, is partially free. A verb's noun arguments, including the nominative and the accusative, can be omitted according to the context, and the verb is always located in the final position. Aspectual and modal markers, as well as tense markers, are postpositional morphemes agglutinated to the verb radical.


2 Related Work
The effect of syntactic features on the meaning of verbs has been the subject of theoretical debate in linguistics, and it is still a controversial issue in building a lexical knowledge base or ontology. However, no concrete choice methodology, nor any theoretical settlement of the syntactic-semantic interface for wordnet purposes, has been suggested. Languages differ in which syntactic features of verbs are encoded morphologically, and thus lexical formation and its morphological rules for certain syntactic phenomena should be considered as having a semantic effect, especially in agglutinative languages including Korean.
2.1 Searching for a Syntactic and Semantic Verb Sense Interface
According to [3] and [5], the semantic information of a verb must include both its central meaning and the thematic grid it specifies, both of which determine the syntactic construction of the verb. Even though the more specific syntactic features and semantic roles of English verbs are explored in FrameNet and in VerbNet, it is very difficult to apply such information to a WN-like hierarchical semantic network.
Previous studies on WN's verb sense distinction have attempted to complete WN by integrating syntactic properties into the verbs in order to extend the lexical information. Consequently, questions regarding the means by which syntactic information in a verb lexicon is arranged, and by which verb senses are distinguished, are important issues. In this respect, [9], which groups about 3,000 English verbs according to diathesis alternation patterns, is a pioneering work that also led to a set of studies on verb polysemy.
Among a number of extensive studies on Levin's work, [7] and [12] identify missing verb senses in transitive alternation and test accuracy by mapping Levin's classification to WN. Although current researchers in the field of lexical semantics have concluded that certain syntactic behavior traits of verbs are central to their meaning, it has not been clearly established that diathesis alternation results in a change of meaning. Regarding this subject, [10] argues that case-alternated verbs should be considered logically polysemous and that Levin's verb classes can therefore be used for verb sense disambiguation.
2.2 Case Alternation in Korean
Case alternation in Korean is characterized by (a) a position change of a noun argument accompanying a change of its semantic role, (b) a reduction/extension of the number of noun arguments, and (c) alternation of the case marker. It is generally accepted that case in Korean is determined by the case marker and, according to [4], does not have the same grammatical properties as in English. [7] claims that since case alternation has to be described additionally according to the semantic

property of a noun argument at the phrasal level, it is difficult to consider that the meaning of a verb changes according to its case alternation. In this respect, the MV in Korean might offer a theoretical basis for verb sense distinction in a wordnet: the verb allows both acc and inacc frames on a verb radical, and the acc noun argument in the transitive frame can alternatively be the nominative noun in the intransitive frame. The two syntactic realizations of the MV baljok-hada (launch) allow two different senses, as in the following example.
(1) keu-neun sailoun hakheoi-leul baljogha-yeoss-da
    He-nominative new academic society-acc launch-past-final
    (He launched a new academic society.)

    sailoun hakheoi-ga baljogha-yeoss-da
    new society-nominative launch-past-final
    (A new academic society was launched.)

In the acc frame, the MV has a causative synonymous form, and in the inacc frame, it is passive. At this point, it is useful to explore the sense distinction of MVs to be sub-categorized, both for lexicographical purposes and as a WN-like linguistic resource for NLP applications.

3 Verb Sense Distinction for the KorLex Verb
Semi-automatic translation using a bilingual dictionary does not fully reflect the meanings of English verbs in Korean verbs when they contain both acc and inacc frames at the same time. It is remarkable that pairs of transitive/intransitive verbs and transitive/passive verbs are subject to more refined sense distinction2 and to modification of the hierarchical structure in KL. In KL, a synset is defined as a set of synonymous verb senses that should be interchangeable in at least one context, which means that the syntactic behavior of a verb is considered to be the core element determining its semantic relations in KL.
3.1 Verb Sense Linking
Here, we examine the Korean counterparts of the English {verbs.change} and analyze the mapping result in accordance with transitivity alternation. As for the translation process between the two languages, we first mapped 2,325 English synsets to Korean verbs using an English-Korean dictionary, obtaining 3,184 senses. The automatic mapping process failed to provide a full and correct translation into Korean, so manual verification and correction were necessary.
We chose only 972 English-Korean verb synsets (corresponding to 1,314 verb senses). We did not take into consideration the remaining 1,353 synsets, which were translated into verb phrases. Therefore, for the Korean verbs, we selected 972 synsets
2 See [8] for the sense distinction in WN and KL.


(1,344 verb senses), including only those appearing as entries in the Standard Korean Dictionary.3 Figure 1 illustrates the mapping result.
[Figure 1: a mapping matrix relating English verb synsets (with or without alternating frames) to Korean verbs (one Korean verb, two morphologically different Korean verbs, or one Korean middle verb); the four mapping types are labeled (I)-(IV).]

Fig. 1. Mapping matrix for English and Korean verbs

As shown in Fig. 1, we find four types of mapping results. The two left boxes indicate the syntactic status of the English verb synset: the upper box indicates verbs with acc and inacc frames, and the lower box indicates verbs having no alternating frames.
Type (I) is the most common in our data, numbering up to 1,028 Korean verb senses, among which 129 verb senses were found to be erroneous. Wrong connections and missing word senses were post-processed manually. After correction, since an English verb has two frames, the matching Korean verb set contains two morpho-lexically different verbs. Thus the Korean verb sets found in Type (I) are subject to sense distinction.
According to Type (II), English synsets are mapped to two different verb senses that are either transitive-intransitive (for 65 English synsets) or transitive-passive pairs (for 34 English synsets). To check the translation accuracy for Type (II), we examined whether the gloss of an English synset contained more than two senses. It appears that in most cases the sentence frames distinguish the acc/inacc senses more accurately than the gloss. The Korean verb sets in this type ultimately need to be separated into independent synsets.
According to Type (III), English transitivity-alternated verbs are mapped to a Korean MV that also has two alternating frames according to transitivity alternation. However, only 8 Korean MVs were matched. Regarding Type (III), the translation results require multiple rounds of post-processing refinement. Finally, according to Type (IV), 50 Korean MVs are matched to either an English acc or inacc form, which reflects the necessity of verb sense distinction for Korean MVs. In the following two subsections, 3.2 and 3.3, we discuss the necessity of modifying verb sense distinction in KL, focusing on Type (II) and Type (III).
3.2 Sense Distinction: Transitive/Passive Form
According to Type (II), one English verb synset having alternating frames is mapped to either a transitive/intransitive verb pair (II-a) or a transitive/passive verb pair (II-b).
3 We used the Standard Korean Dictionary, which contains 58,815 Korean verb entries. The English-Korean bilingual dictionary we used contains 14,454 verb entries.


A translated set of verbs is subject to sense distinction in KL for two reasons, as shown in the following examples.
(2) Na-neun gamja-leul jji-n-da
    I-nominative potato-acc steam-present-final
    (I steam the potatoes.)
(3) Gamja-neun soss-eseo bbali jjyeo-ji-n-da
    Potato-nominative pot-locative rapidly steam-passive-present-final
    (The potato steams rapidly in the pot.)

First, jji-da and jjy-eoji-da have different selectional restrictions and syntactic distributions. The passivization of jji-da is realized by -eojida, which is an auxiliary verb. A passive verb in Korean is derived by postpositional morphemes in the following three ways: (1) adhesion of a suffix; (2) for verbs such as <noun+hada (support verb)>, adhesion of morphemes such as deo, -bad, or dangha; (3) adhesion of the auxiliary verb jida. In Korean, by contrast to English, the passive voice does not necessarily presuppose a syntactic alternation; rather, it is realized at the lexical level by the suffixation of a specific morpheme. Providing ten passive sentence structures, [11] shows that Korean passivisation is not a syntax-dependent process and that it is rather a morpho-semantic feature.
Second, following WN's basic assumption, it is not possible in Korean to put those verb forms in one synset. A synset is defined as a set of synonymous words that are interchangeable in some context, but the Korean passive and transitive verb forms cannot be substituted for each other in any context.
3.3 Sense Distinction for Korean Middle Verbs
In Figure 1, Type (III) corresponds to the mapping between an English synset with two alternating frames and a Korean synset with an MV. We believe that in KL, MVs should be treated differently from the way they are treated in the English WN.
As shown in Example (1) in Subsection 2.2, the distributional trait of a Korean MV provides a syntactic basis for semantic distinction. In addition, we think that semantic features, even though they are not necessary and sufficient conditions for all MVs, are inherent to MVs and thus provide linguistically positive arguments for the polysemy of the MV, because (1) the acc frame of an MV is semantically identical to its paraphrased sentence with a causative verb, and (2) the inacc frame of an MV is semantically identical to its paraphrased sentence with a passive verb, especially when the noun argument in the nominative case does not assume the agentive role. The disparity of the semantic properties of MVs in acc/inacc constructions has not been suggested in Korean linguistics as an argument for their sense distinction, since such properties are used only as additional criteria for identifying, on a syntactic basis, middle constructions among other types of transitivity alternation. However, we argue that the sense of an MV varies according to the syntactic construction and that the different semantic relations, such as the synonymy of an MV in the two constructions, are footholds for sense distinction in the KL verb hierarchy.


4 Building the KorLex Verb Hierarchy
This section is devoted to the actual building of the KL verb hierarchy by applying the verb sense distinctions suggested in Subsections 3.2 and 3.3. The verb hierarchy in KL is constituted by the troponymy (hyponymy) relation, which should be adequate for representing the change of meaning according to the syntactic construction, allowing for a fine-grained sense distinction useful for sentence parsing in NLP applications.
4.1 Passive Verb Classification and Its Place in KorLex
In KL, since the verb hierarchy is built according to the WN structure, it is efficient to look for the appropriate place in relation to the top hypernyms and their hyponyms in WN. In Table 1, the top hypernyms of Type A (pairs of acc and inacc concepts) reflect the acc/inacc values independently; {change 0} is inacc and {change 1, alter 1, modify 11} is accusative. Given the 35 top hypernyms of {verb.change}, we can choose an appropriate place in the hierarchies for the Korean passive or transitive verbs by following their syntactic and semantic features. Figure 2 represents the hierarchical structure of {bake 0}.
[Figure 2 (mapping result) shows the hierarchy {change 0} > {change_integrity 0} > {cook 0} > {bake 0} with its Korean counterparts (byeonwha-deoda 0, byeonwha-sikida 0, eoiyangibagguida 0, eoiyaneul-bagguda 0, ikda 0, ikhida 0, guweojida 0, gubda 0). Figure 3 (modification in KorLex) shows the reconstructed hierarchy in which the passive and transitive Korean forms are separated.]

Fig. 2. Hierarchy of {bake 0}

Fig. 3. Reconstruction of the hierarchy for the passive/transitive verb gubda

In the hierarchy of {bake 0}, the hypernyms are constituted of verb senses with the inacc frame. Thus we use this structure to match the passive verb gu-weoji-da (bake-passive-final), and the Korean counterparts of the hypernyms of {bake 0} will all have passive value. Now, we can separately build the transitive and passive forms of gub-da (bake-final), as shown in Figure 3. The structure on the left is the hierarchy of the passive form guweojida and its hypernyms of passive value. The structure on the right represents the modified hierarchy of the transitive verb form and its hypernyms of transitive value. Each counterpart of {bake 0} now has a different semantic relation in KL. As a result, {bake 0} is mapped accurately to its Korean synonym guweojida and {cook 0} is mapped to ikda. The hierarchy on the right also shows the additional nodes by which gubda is represented accusatively.
4.2 Middle Verb Sense Distinction and Its Place in KorLex
KL requires three different processes of MV sense distinction:
(1) separate the two senses of an MV and add them to appropriate existing KL synsets;
(2) separate the two senses and create a node for the new sense in the KL hierarchy;
(3) separate the two senses and create both a node and a new hierarchy in KL.


Process (1) is applied to verbs for which we can find an appropriate synset for the two distinguished KL senses. For example, the sense of the MV gwayeolhada (overheat) was separated into gwayeolhada 0 (be overheated) / gwayeolhada 1 (overheat) and added as a synonym to the existing synset node in KL, as shown in Figure 4.
[Figure 4 shows the mapping result and the modification in KorLex for gwayeolhada: the two distinguished senses gwayeolhada 0 (be overheated) and gwayeolhada 1 (overheat) are added to the appropriate existing synset nodes under the {heat, heat up}/{overheat} hierarchy and its Korean counterparts.]

Fig. 4. Sense distinction and position of gwayeolhada in KL

Process (2) was applied, for example, to the MV baigahada (multiply).


[Figure 5 shows the mapping result and the modification in KorLex for baigahada: under the {increase 2}/{multiply 0, manifold 1} hierarchy and its Korean counterparts (byeonwhasikida 0, jeunggasikida 0, byeonhada 0/byeonwhahada 0, keugiga bakkwida 0, buleonada 0), the senses baigahada 0/baegadoeda 0 and baigahada 1/baigasikida 0 are placed separately.]

Fig. 5. Sense distinction of baigahada in KL

[Figure 6 shows the mapping result and the modification in KorLex for beoleonggeolida: under the {move 3}/{shake 1, agitate 5}/{tremble 1}/{quiver 1, quake 1, palpitate 2} hierarchy and its Korean counterparts (umjikida, heundeullida, heundeulda, tteollida, tteolda), the senses beoleonggeolida 0 and beoleonggeolida 1 are placed separately.]

Fig. 6. Sense distinction of beoleonggeolida in KL

In the mapping result, baigahada (multiply) was mapped to the English synset {multiply 0, manifold 1} and was also subject to sense distinction. We distinguished baigahada (multiply) into baigahada 0 (be multiplied) and baigahada 1 (multiply). baigahada 0 was linked to {multiply 0, manifold 1}, and the new sense baigahada 1 was created as a troponym of buleonada 2, as shown in Figure 5.
Process (3) is the same process but with the addition of a new Korean verb sense to KL. Figure 6 shows process (3) applied to beoleonggeolida (shake vigorously). In consideration of its hypernyms, its sense has to be of inacc value. Therefore, after the sense distinction of beoleonggeolida into beoleonggeolida 0 (be shaken vigorously) and beoleonggeolida 1 (shake vigorously), the latter requires a new set of hypernyms, which is also added to KL.

5 Conclusion
Semi-automatic construction of KL using WN as a pivot requires elaborate linguistic criteria determining the grain size of senses and semantic relations. The KL verb hierarchy is constructed based on the English WN, but its semantic relations and sense distinctions are modified. In this study, focusing on the relatedness of morpho-syntactic features and verb meanings, we showed that the sense distinction can be made manageable by


extracting different pairs of acc/inacc syntactic frames for polysemous verbs in Korean. Moreover, we demonstrated that the morpho-syntactic-based sense distinction can be applied in establishing the semantic relation of hypernymy-troponymy, which has been considered difficult to achieve. We think that this is useful to any verb wordnet applied to NLP.
The problem of sense distinction is directly connected to the question of polysemy and word sense disambiguation. However, for verb sense distinction, linguistically robust criteria are necessary, with or without limitations on generalization, since there is no single method applicable to all lexical acquisition.

Acknowledgment
This work was supported by the Korea Science and Engineering Foundation (KOSEF)
through the National Research Lab. Program funded by the Ministry of Science and
Technology (No. M10400000332-06J0000-33210).

References
1. Cruse, D.A.: Lexical Semantics. Cambridge University Press, Cambridge (1986)
2. Fellbaum, C.: WordNet - An Electronic Lexical Database. MIT Press, Cambridge (1998)
3. Grimshaw, J.: Argument Structure. MIT Press, Cambridge (1990)
4. Hong, J.S.: Nouns in Korean: Sai Kukeosaingwhal, National Institute of the Korean Language vol. 11-3, Seoul (2001), html version
5. Jackendoff, R.S.: Semantic Structures. MIT Press, Cambridge (1990)
6. Kim, M.L.: The Studies of Verbal Classes according to Case Mark-Alternation: Korean Linguistics vol. 25, Association for Korean Linguistics, Seoul (2004) 161-190
7. Kohl, T., Jones, D.A., Berwick, R.C. and Nomura, N.: Representing Verb Alternation in WordNet: WordNet - An Electronic Lexical Database. MIT Press, Cambridge (1998) 153-178
8. Lee, E-R, Yoon, A-S and Kwon, H-C: Passive Verb Sense Distinction in Korean Wordnet: Proceedings of the 3rd International WordNet Conference, GWC, Brno (2006) 211-216
9. Levin, B.: English Verb Classes and Alternations - A Preliminary Investigation. University of Chicago Press, Chicago (1993)
10. Pustejovsky, J.: The Generative Lexicon. MIT Press, Cambridge, London (1998)
11. Yang, J-S: Semantic Analysis of Korean Verb and Linking Theory. Bakijeong, Seoul (1995)
12. Zickus, W.M.: A Comparative Analysis of Beth Levin's English Verb Class Alternations and WordNet's Senses for the Verb Classes HIT, TOUCH, BREAK, and CUT. Proceedings of The Post-Coling 94 International Workshop on Directions of Lexical Research. Beijing, China: Tsinghua University (1994) 66-74

Chinese Ancient-Modern Sentence Alignment


Zhun Lin and Xiaojie Wang
School of Information Engineering, Beijing University of Posts and Telecommunications
Beijing, China, 100876
xjwang@bupt.edu.cn

Abstract. Bi-text alignment is useful for many Natural Language Processing tasks such as machine translation, bilingual lexicography and word sense disambiguation. Most previous research is on pairs of different languages. This paper presents a diachronic alignment of Ancient and Modern Chinese. Because of the long history of Chinese culture and Chinese writing, lots of Ancient Chinese texts are waiting to be translated into Modern Chinese; moreover, the comparative study of Ancient and Modern Chinese is a very important way to understand some characteristics of Modern Chinese. After describing some characteristics of Ancient-Modern Chinese bi-texts, we first investigate some statistical properties of an Ancient-Modern bi-text corpus, including a correlation test of text lengths between the two languages and a distribution test of the length-ratio data. We then pay more attention to n-m (n>1 or m>1) alignment modes, which are prone to mismatch.

1 Introduction
Text alignment is an important task in Natural Language Processing (NLP). It can be
used to support many other NLP tasks. For example, it can be utilized to construct
statistical translation models [1], and to acquire translation examples for
example-based machine translation [2]. It can be helpful in bilingual lexicography [3].
The approaches to text alignment can be classified into two types: statistical-based
and lexical-based. The statistical-based approaches rely on non-lexical information
(such as sentence length, co-occurrence frequency, etc.) to achieve an alignment task
[4]. The method proposed in [5] is based on the assumption that in order for the
sentences in a translation to correspond, the words in them must also correspond. Their
method made use of lexical anchor points to lead an alignment at the sentence level.
It has been shown that different language pairs favor different information in alignment. For example, [6] found that the sentence-length correlation between English and Chinese is not as good as that between English and French. Also, there is less cognate information in the Chinese-English pair than in the English-French pair, while the alignment of the Chinese-Japanese pair can make use of the Hanzi characters commonly appearing in both languages [7]. Currently, most methods rely on either or both of the above two ideas [8]. The approaches combining both length and lexical information, such as [9], seem to represent the state of the art.
Chinese has a long history. It has gradually changed from Ancient Chinese to Modern Chinese. A lot of Ancient Chinese texts are waiting to be translated into
Modern Chinese; Ancient-Modern Chinese alignment will therefore be helpful in both example-based machine translation and statistical machine translation. Although the two languages are very different in some aspects, many characteristics of Modern Chinese can be found in Ancient Chinese [10], so the two have a close relationship. Therefore, the comparative study of Ancient-Modern Chinese may give useful cues for understanding some characteristics of Modern Chinese.
In this paper, we present our current work on Ancient-Modern Chinese sentence alignment. The combination of length-based information and lexicon-based information has proved to be the state-of-the-art approach to bi-text alignment [8], so we also combine these two kinds of information in our alignment algorithm. But for Ancient-Modern pairs, as we will see, there are some special features in alignment. In Ancient-Modern pairs, Hanzi characters, which commonly occur in both languages, are obviously a useful feature for alignment, and they are used to improve sentence alignment significantly.
The remainder of the paper is organized as follows. Section 2 describes some characteristics of Chinese Ancient-Modern bi-text corpora and shows what we can get from sentence-level alignment. Section 3 describes several information sources we use in our approach and how to combine these information sources to improve alignment; there, we investigate the length correlation between the two languages and the distribution of the length ratio between them, and we use a deliberate measure to deal with several different kinds of alignment. We then present several experiments and evaluate the effects of the different information sources in Section 4, with some discussion of the interesting results. Finally we draw some conclusions.

2 Some Characteristics of Ancient-Modern Chinese Bi-texts
Even though Ancient Chinese and Modern Chinese are similar in lots of aspects, there are many sentence alignment modes between them. n-m (n>1 or m>1) modes, which are prone to mismatch, are also common in practice.
Table 1. An example of Ancient-Modern Chinese bi-text: one Ancient Chinese sentence is aligned with two Modern sentences
[The table gives Ancient Chinese and Modern Chinese example sentence groups for the 1:2, 2:1 and 2:2 alignment modes.]


Some examples of various alignment modes are given in Table 1: there are 1-2, 2-1 and 2-2 alignments, respectively. The statistics of our manually aligned data show that these kinds of alignment make up more than 10%, while less than 1.5% of the alignments contain 3 or more sentences in one language.
Both Ancient Chinese and Modern Chinese are composed of Hanzi characters, and many characters are identical in both the ancient and the modern texts. The alignment of Ancient-Modern Chinese pairs can therefore make use of the Hanzi information commonly appearing in both texts.

3 The Approach to Sentence Alignment
We use dynamic programming to find the overall optimal alignment, paragraph by paragraph. We combine both length-based information and Hanzi-based information to measure the cost of each possible alignment between Ancient and Modern Chinese strings.
Before we give the measure function, we first define a few notations. Let s_i denote the i-th sentence in the Ancient Chinese text and |s_i| the number of characters it includes, and let s_ij denote the string from the i-th to the j-th sentence in the Ancient Chinese text. Let t_u denote the u-th sentence in the Modern Chinese text and t_uv the string from the u-th to the v-th sentence. j < i means that s_ij is an empty string, and likewise for t_uv. Let d(i,u;j,v) be the function that computes the cost of aligning s_ij with t_uv. We divide d(i,u;j,v) into three parts, as shown in (1), to reflect the three information sources used to compute the cost:

d(i,u;j,v) = L(i,u;j,v) + M(i,u;j,v) - λ·H(i,u;j,v)                              (1)

where L(i,u;j,v) depends on the length ratio of the paired strings s_ij and t_uv (different length ratios give different values of L); M(i,u;j,v) depends on the alignment mode of the pair, that is, on how many sentences are involved on each side (different n-m modes give different values of M); and H(i,u;j,v) is the contribution from the Hanzi characters common to both strings, weighted by λ.
We first describe how to use the length-based information, and then the Hanzi-based information. We give a deliberate measure for multiple alignments, which are prone to mismatch.
3.1 Length and Mode
To utilize the length information in Ancient-Modern Chinese alignment, we investigate the length correlation between these two languages. We also check whether the normality hypothesis for the length ratio is appropriate for this language pair. Following Gale and Church (1991), we use alignment data at the paragraph level for the correlation and normality tests. 205 paragraphs in our Ancient-Modern corpus were randomly chosen for these tests.


The correlation test examines whether the lengths of Ancient Chinese texts are related to those of their Modern translations, that is, whether longer texts in Ancient Chinese tend to be translated into longer texts in Modern Chinese. Figure 1 shows that the lengths (in characters) of Ancient and Modern Chinese paragraphs are highly correlated.

Fig. 1. Paragraph lengths are highly correlated in Ancient-Modern Chinese bi-texts. The horizontal axis shows the length of Ancient Chinese paragraphs, while the vertical axis shows the lengths of the corresponding Modern Chinese paragraphs.

Fig. 2. The length-ratio data of Ancient-Modern Chinese bi-texts is approximately normal. The horizontal axis shows the length ratio, while the vertical axis shows the frequency of each ratio.

In [4], the distance measure is based on the assumption that each character in one language gives rise to a random number of characters in the other language. Those random variables are assumed to be independent and identically distributed with a normal distribution. They check the English-French pair for this assumption and estimate parameters for the normal distribution. It is not clear whether this assumption holds for the Ancient-Modern Chinese pair. To make use of the same kind of length-based information in


alignment, we therefore investigate whether this assumption holds for the Ancient-Modern Chinese pair. We again use the paragraph-level data for the test, taking the 205 length ratios from the 205 Ancient-Modern Chinese paragraphs as samples. Figure 2 is the histogram of these ratios.
We find that the length-ratio data is approximately normal with mean μ = 1.6 and standard deviation σ = 0.21. Then, following [4], we compute L(i,u;j,v) using equation (2).

L(i,u;j,v) = -100 · log( 2 · (1 - Pr(|δ(|t_uv|, |s_ij|)|)) )                     (2)

To compute the cost for different alignment modes, we aligned 407 Ancient-Modern Chinese sentence pairs manually. We then estimated the probabilities of the different alignment modes from the frequencies with which these modes occur in the data. The result is listed in Table 2.
Suppose that j - i = n means that the string s_ij includes n Ancient Chinese sentences; specially, we use n = -1 to denote that there is no sentence in s_ij. Similarly, v - u = m means that the string t_uv includes m Modern Chinese sentences, and m = -1 means that there is no sentence in t_uv. The cost for mode n-m can then be computed by equation (3):

M(i,u;j,v) = -100 · log(Pr(n-m))                                                 (3)

Table 2. The frequencies of different alignment modes

Mode (n-m)   Frequency f(n-m)   Probability Pr(n-m)
0-1, 1-0             1                0.002
1-1                 350               0.861
1-2                  21               0.052
2-1                  18               0.044
2-2                  11               0.027
Others                6               0.014
Total               407               1

3.2 Hanzi Information
Another source of information is the Hanzi occurring commonly in both Ancient and Modern Chinese. We include a total of 2,366 Hanzi character pairs in a dictionary. Following [7], we use different measures, as in equations (4) and (5), for different alignment modes: equation (5) is for the 2-2 mode, and equation (4) is for the other modes. By including some length-related information in these measures, we can also control the alignment length.

H(i,u;j,v) =
  { 0                              if j < i or v < u      (0-1 or 1-0 mode)
  { h1(i,0;u,0)                    if j = i and v = u     (1-1 mode)
  { h1(i,0;u,0) + h2(i,j;u,0)      if j = i+1 and v = u   (2-1 mode)
  { h1(i,0;u,0) + h3(i,0;u,v)      if j = i and v = u+1   (1-2 mode)             (4)


where h1(i,0;u,0) = |s_i ∩ t_u| / min{|s_i|, |t_u|}; here we use the relative number of commonly occurring Hanzi to measure the similarity between two strings. h2(i,j;u,0) = (|s_j ∩ t_u| - |s_i ∩ s_j ∩ t_u|) / |s_j|, which is the relative number of commonly occurring Hanzi appearing only in s_j and t_u but not in s_i. Similarly, h3(i,0;u,v) = (|s_i ∩ t_v| - |s_i ∩ t_u ∩ t_v|) / |t_v|, which is the relative number of commonly occurring Hanzi appearing only in s_i and t_v but not in t_u. For the 2-2 mode, we use (5) as the measure; the three items on the right of equation (5) are the measures for three different ways of merging s_i and s_j and matching them with t_u and t_v, as in Chinese-Japanese alignment:

H(i,u;j,v) = min{ h1(i,0;u,0) + h2(i,j;u,0) + h1(0,j;0,v),
                  h1(i,0;u,0) + h3(i,0;u,v) + h1(0,j;0,v),
                  h1(i,0;0,v) + h1(0,j;u,0) }                                    (5)
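To make the measures above concrete, the following is a small illustrative Python sketch of h1, h2, h3 and their piecewise combination from equations (4)-(5), treating each sentence as a plain string of Hanzi characters. The function names and the list-of-sentences interface are our own choices for the example, not part of the paper.

def h1(s, t):
    """Relative number of Hanzi shared by strings s and t."""
    a, b = set(s), set(t)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def h2(s_i, s_j, t_u):
    """Relative number of Hanzi occurring in s_j and t_u but not in s_i."""
    if not s_j:
        return 0.0
    return (len(set(s_j) & set(t_u)) - len(set(s_i) & set(s_j) & set(t_u))) / len(set(s_j))

def h3(s_i, t_u, t_v):
    """Relative number of Hanzi occurring in s_i and t_v but not in t_u."""
    if not t_v:
        return 0.0
    return (len(set(s_i) & set(t_v)) - len(set(s_i) & set(t_u) & set(t_v))) / len(set(t_v))

def hanzi_score(anc, mod):
    """H for one candidate pair: anc and mod are lists of sentences
    (0, 1 or 2 on each side), covering the modes of equations (4)-(5)."""
    if not anc or not mod:                      # 0-1 or 1-0 mode
        return 0.0
    if len(anc) == 1 and len(mod) == 1:         # 1-1 mode
        return h1(anc[0], mod[0])
    if len(anc) == 2 and len(mod) == 1:         # 2-1 mode
        return h1(anc[0], mod[0]) + h2(anc[0], anc[1], mod[0])
    if len(anc) == 1 and len(mod) == 2:         # 1-2 mode
        return h1(anc[0], mod[0]) + h3(anc[0], mod[0], mod[1])
    # 2-2 mode, equation (5): minimum over three merge-and-match options
    return min(h1(anc[0], mod[0]) + h2(anc[0], anc[1], mod[0]) + h1(anc[1], mod[1]),
               h1(anc[0], mod[0]) + h3(anc[0], mod[0], mod[1]) + h1(anc[1], mod[1]),
               h1(anc[0], mod[1]) + h1(anc[1], mod[0]))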

This completes the description of the distance measure. We finally use dynamic programming to find the overall least-cost matching. Let D(i,u) be the total distance of aligning the sentences from the first one to the i-th in Ancient Chinese with the sentences from the first one to the u-th in Modern Chinese; the recurrence can then be described by (6):

D(j,v) = min{ D(i-1, u-1) + d(i,u;j,v) }                                         (6)
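The whole procedure can be sketched as follows, under several stated assumptions: the δ normalization inside the length cost, base-10 logarithms, the weight lam standing in for the parameter λ of equation (1), and a simplified character-overlap stand-in for the full H measure of equations (4)-(5). It is an illustration of the dynamic program of recurrence (6), not the authors' implementation.

import math

# Mode probabilities from Table 2; MU and SIGMA from Section 3.1.
MODE_PROB = {(0, 1): 0.002, (1, 0): 0.002, (1, 1): 0.861,
             (1, 2): 0.052, (2, 1): 0.044, (2, 2): 0.027}
MU, SIGMA = 1.6, 0.21

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def length_cost(src_len, tgt_len):
    """L(i,u;j,v): Gale-Church style length cost, equation (2)."""
    if src_len == 0 or tgt_len == 0:
        return 0.0
    delta = (tgt_len - MU * src_len) / (SIGMA * math.sqrt(src_len))  # assumed normalization
    p = 2.0 * (1.0 - norm_cdf(abs(delta)))
    return -100.0 * math.log10(max(p, 1e-12))

def mode_cost(n, m):
    """M(i,u;j,v): cost of the n-m alignment mode, equation (3)."""
    return -100.0 * math.log10(MODE_PROB.get((n, m), 0.014))  # 'Others' as fallback

def hanzi_overlap(src_sents, tgt_sents):
    """Simplified stand-in for H(i,u;j,v): relative shared-character count."""
    src, tgt = set("".join(src_sents)), set("".join(tgt_sents))
    if not src or not tgt:
        return 0.0
    return len(src & tgt) / min(len(src), len(tgt))

def align(ancient, modern, max_n=2, lam=100.0):
    """Dynamic programming over sentence lists, following recurrence (6)."""
    A, B = len(ancient), len(modern)
    INF = float("inf")
    D = [[INF] * (B + 1) for _ in range(A + 1)]
    back = [[None] * (B + 1) for _ in range(A + 1)]
    D[0][0] = 0.0
    for j in range(A + 1):
        for v in range(B + 1):
            if D[j][v] == INF:
                continue
            for n in range(max_n + 1):
                for m in range(max_n + 1):
                    if (n == 0 and m == 0) or j + n > A or v + m > B:
                        continue
                    src, tgt = ancient[j:j + n], modern[v:v + m]
                    cost = (length_cost(sum(map(len, src)), sum(map(len, tgt)))
                            + mode_cost(n, m)
                            - lam * hanzi_overlap(src, tgt))
                    if D[j][v] + cost < D[j + n][v + m]:
                        D[j + n][v + m] = D[j][v] + cost
                        back[j + n][v + m] = (n, m)
    # Recover the alignment path by backtracking.
    path, j, v = [], A, B
    while j > 0 or v > 0:
        n, m = back[j][v]
        path.append((ancient[j - n:j], modern[v - m:v]))
        j, v = j - n, v - m
    return list(reversed(path))

A call such as align(ancient_sentences, modern_sentences) would return a list of (Ancient group, Modern group) pairs covering both texts.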

4 Experiments and Evaluations
We use 205 Chinese Ancient-Modern paragraph pairs in our experiments, which include 1,528 Ancient Chinese sentences and 1,583 Modern Chinese sentences in total. To estimate the parameter λ in equation (1), we aligned another 53 paragraphs, which include 440 Ancient Chinese sentences and 452 Modern Chinese sentences; we use them as training data in our experiments when λ is needed.
The purpose of these experiments was to test a variety of information sources for the sentence alignment of Ancient-Modern Chinese. Table 3 shows the results.
Table 3. How different factors affect the accuracy (+: used, -: not used)

[Table 3 reports the alignment accuracy for each combination of the three information sources H, L and M; the reported accuracies are 80.33, 56.01, 92.43, 79.59, 52.09, 52.50 and 78.70%.]

We find that Hanzi information (H) is the most helpful for alignment: when only one information source is used, H achieves the best accuracy.


It is interesting that we get the best result by combining only H with M, rather than combining all three information sources as is done for other language pairs such as Chinese-Japanese [7]. We think the reason is that our H measure already incorporates enough length-constraint information, while length information itself is not very helpful in Ancient-Modern Chinese alignment.
We can also see from Table 3 that length information gives the worst performance. When only the length information is considered, the alignment tends to match the length ratio of 1.6. When length ratios lie at symmetrical positions of the normal distribution, for example 1.4 and 1.8, they have the same probability. In this case, length information is not helpful in distinguishing the situation in Figure 3, where the 1-1 and 2-1 alignments have the same probability.
Fig. 3. One length ratio is 1.4 (1-1), the other is 1.8 (2-1); they have the same probability

When only the alignment mode information is considered, the alignment is more inclined to match the 1-1 mode. But when there are more sentences left in Modern Chinese, things change; for example, suppose one Ancient Chinese sentence and two Modern Chinese sentences are left. Two alignment sets are possible: one is (1:2), the other is (1:1 and 0:1). The costs of these two sets are listed in Table 4. As we can see, the cost of the (1-1, 0-1) set is larger, so the 1-2 mode is used.
Table 4. Two kinds of alignment: 1-1, 0-1 (or 0-1, 1-1) and 1-2 mode

Mode       Probability Pr(n-m)   Cost
1-1, 0-1   0.861, 0.002          276
1-2        0.052                 128
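As a check, assuming the base-10 logarithm in the mode cost of equation (3), the two costs in Table 4 follow directly from the mode probabilities:

-100 · (log10 0.861 + log10 0.002) ≈ 6.5 + 269.9 ≈ 276
-100 · log10 0.052 ≈ 128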

5 Conclusions
This paper describes Ancient-Modern Chinese alignment at the sentence level, combining both length information and Hanzi character information. We give a detailed description of some characteristics of Ancient-Modern Chinese bi-texts. We check the correlation between the lengths of Ancient Chinese text and Modern Chinese text, and
find that the length-ratio data approximately fits the normality hypothesis. We pay special attention to n-m alignments where n or m is greater than 1, and use a similarity measure based on Hanzi information to obtain better accuracy than the standard one. Experiments show that our proposal is very helpful.
We will extend our program to handle clause alignment in future work. By analyzing the sentence alignments, we find that there are fewer alignment modes in clause alignment than in sentence alignment. Therefore, we can expect to achieve higher accuracy for Ancient-Modern Chinese clause alignment.

References
[1] Peter F. Brown, Jennifer C. Lai, Robert L. Mercer: Aligning Sentences in Parallel Corpora.
Proceedings of 29th Annual Conference of the Association for Computational Linguistics
(1991) 169-176.
[2] Hiroyuki Kaji, Yuuko Kida, Yasutsugu Morimoto: Learning Translation Templates from
Bilingual Text. In Proceedings of the fifteenth International Conference on Computational
Linguistics, Nantes, France (1992) 672-678.
[3] Jörg Tiedemann: Recycling Translations - Extraction of Lexical Data from Parallel
Corpora and their Application in Natural Language Processing, Doctoral Thesis. Studia
Linguistica Upsaliensia 1, ISSN 1652-1366, ISBN 91-554-5815-7.
[4] William A. Gale, Kenneth W. Church: A Program for Aligning Sentences in Bilingual
Corpora. In Proceedings of 29th Annual Conference of the Association for Computational
Linguistics (1991) 177-184.
[5] Martin Kay, Martin Roscheisen: Text-Translation Alignment. Computational Linguistics.
Vol. 19, No. 1 (1993) 121-142.
[6] Wu Dekai: Aligning a Parallel English-Chinese Corpus Statistically with Lexical Criteria.
In Proceedings of the 32st Annual Meeting of the Association for Computational
Linguistics, (1994) 80-87.
[7] Wang Xiaojie, Ren Fuji: Chinese-Japanese Clause Alignment. Computational Linguistics
and Intelligent Text Processing, 6th International Conference (2005) 400-412
[8] Jean Veronis: From the Rosetta stone to the information society - A survey of parallel text
processing. In Parallel Text Processing. J. Veronis (eds.) (2000) 25-47 Kluwer Academic
Publishers. Printed in Netherland.
[9] I. Dan Melamed: Pattern recognition for mapping bitext correspondence. In Parallel Text
Processing, J. Veronis (eds.), Kluwer Academic Publishers. Printed in Netherland.
(2000)25-47
[10] Changxu Sun: Introduce to Ancient-Chinese lexicon, Shanghai dictionary Press (2005)(in
Chinese)

A Language Modeling Approach to Sentiment Analysis

Yi Hu 1, Ruzhan Lu 1, Xuening Li 1,2, Yuquan Chen 1, and Jianyong Duan 1

1 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
2 School of Foreign Studies, Southern Yangtze University, Wuxi, China
{huyi, lu-rz, xuening_li, yqchen, duan_jy}@cs.sjtu.edu.cn

Abstract. This paper presents a language modeling approach to the sentiment detection problem. It captures subtle information in text processing to characterize the semantic orientation of documents as thumb up (positive) or thumb down (negative). To handle this problem, we propose to estimate both a positive and a negative language model from training collections. Tests are done by computing the Kullback-Leibler divergence between the language model estimated from a test document and these two trained sentiment models. We assert the polarity of a test document by observing whether its language model is closer to the trained thumb up model or the thumb down model. When compared with an outstanding classifier, i.e., SVMs, on a movie review corpus, the language modeling approach showed better performance.
Keywords: Sentiment Analysis; Language Modeling; KL-divergence.

1 Introduction
Traditional work on document categorization lies in mapping a document to given topics such as sport, finance and politics [4]. In contrast, recent years have seen a growing interest in non-topical analysis, in which characterizations are sought of the opinions and feelings depicted in documents, rather than just their themes. The problem of classifying a document as thumb up (positive) or thumb down (negative) is called sentiment classification. Labeling documents by such semantic orientations would provide succinct summaries and would be greatly useful in many intelligent information systems. Its immediate applications include mining the Web and blocking junk mail.
Sentiment classification has become a hot direction because of its broad applications, and it has been attempted in different domains such as movie reviews, product reviews, and customer feedback reviews [1][2][6][8]. Some researchers have taken positive or negative word/phrase counting methods into account, determining whether a word/phrase is positive or negative [9]. Other methods classify whole documents into positive and negative by employing machine-learning algorithms. Several learning algorithms are compared in [2], where it was found that Support Vector Machines (SVMs) generally give better results. Their work shows that, generally, these algorithms are not able to achieve accuracies on sentiment classification comparable to those reported for standard topic-based categorization. The reason lies in the many challenging
aspects of this task: intuitively, feelings in natural language are very often expressed in subtle and complex ways, which usually requires knowledge to handle.
This paper presents a language modeling approach to analyzing documents as positive or negative. It emphasizes suitably estimating the thumb up and thumb down language models from training sets, and evaluating a test document, represented by a language model, via its distance from the two sentiment models. We also try SVMs, a powerful discriminative model, for this sentiment analysis.
The rest of the paper is organized as follows. Our method is formalized in Section 2. Section 3 follows with a description of preliminary experiments, and Section 4 gives the conclusion. Note that this paper discusses ongoing work and provides the framework of our idea and initial results, rather than a complete solution.

2 Language Modeling Approach to Sentiment Classification
In this section, we propose a language modeling approach to detecting the semantic orientation of a document. The motivation is simple: the thumb up and thumb down languages are likely to be substantially different, i.e., they follow different language habits. We exploit this divergence in the language models to classify test documents.
The thumb up orientation is represented by a positive language model θ_P, which is a probability distribution over n-grams in the positive collection. Accordingly, a negative language model θ_N represents the language model for the thumb down orientation. A test document generates a language model θ_d. Note that a language model is a statistical model: a probability distribution over language units, indicating the likelihood of observing these units in a language. A document can therefore compare its model with the thumb up or thumb down model using a distance mechanism:

Δ(θ_d; θ_P, θ_N) = Dis(θ_d, θ_P) - Dis(θ_d, θ_N):   < 0 "thumb up", > 0 "thumb down"      (1)

where Dis(p,q) is the distance between two distributions p and q. This formula expresses the classification idea that if Dis(θ_d, θ_P) is smaller than Dis(θ_d, θ_N), the test document d is closer to thumb up; otherwise, if Dis(θ_d, θ_P) is greater than Dis(θ_d, θ_N), thumb down. Note that if Δ(θ_d; θ_P, θ_N) equals zero, the test document is regarded as neutral, but this case is not discussed in our work. In the next subsection, we exploit the Kullback-Leibler divergence as the distance measure.
2.1 Using Kullback-Leibler Divergence for Sentiment Classification
Given two probability mass functions p(x) and q(x), the Kullback-Leibler divergence (or relative entropy) D(p || q) between p and q is defined as

D(p || q) = Σ_x p(x) log ( p(x) / q(x) )                                         (2)


It is easy to show that D(p || q) is always non-negative and is zero if and only if p = q. Even though it is not a true distance between distributions, because it is not symmetric and does not satisfy the triangle inequality, it is still often useful to think of the KL-divergence as a distance between distributions [3]. Formally, the KL-divergence between the probability distribution θ_d and θ_P / θ_N is calculated by:

D(θ_d || θ_P) = Σ_{n-gram} Pr(n-gram | θ_d) log ( Pr(n-gram | θ_d) / Pr(n-gram | θ_P) )
D(θ_d || θ_N) = Σ_{n-gram} Pr(n-gram | θ_d) log ( Pr(n-gram | θ_d) / Pr(n-gram | θ_N) )   (3)

where θ̂ is the model estimated for the real θ, and Pr(n-gram | θ̂) is the probability of an n-gram given the estimated model θ̂.
Once we have a language model to represent the test document and a score based on its distance to the two sentiment language models, we can classify the test document as thumb up or thumb down. Finally, substituting equation (3) into equation (1), we obtain the sentiment classification function:

Δ(θ_d; θ_P, θ_N) = D(θ_d || θ_P) - D(θ_d || θ_N)
                = Σ_{n-gram} Pr(n-gram | θ_d) log ( Pr(n-gram | θ_N) / Pr(n-gram | θ_P) )   (4)

In this study, we only employ word-based unigrams and bigrams as model parameters. Because of the data sparseness problem, higher-order n-grams (n >= 3) are not discussed, even though in theory higher-order n-grams might better approximate the true language model; it is, however, possible to use higher-order n-grams in the same framework. In general, the computation of the above formula involves a sum over all the n-grams that have a non-zero probability according to Pr(n-gram | θ_d). However, when θ is based on a general smoothing technique, the computation assigns a non-zero probability to unseen n-grams according to Pr(n-gram | θ_smooth). We also observe the effect of smoothing in the language modeling of sentiment.
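As an illustration of equations (1) and (4), the following Python sketch scores a test document against two toy unigram sentiment models. The floor value for unseen words and the hand-made model probabilities are assumptions for the example only; the paper estimates the models as described in Section 2.2.

import math
from collections import Counter

def unigram_model(tokens):
    """MLE unigram distribution over a token list (a stand-in for theta_d)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sentiment_score(doc_tokens, pos_model, neg_model, floor=1e-8):
    """Delta(theta_d; theta_P, theta_N) from equation (4):
    sum over n-grams of Pr(ng | d) * log(Pr(ng | N) / Pr(ng | P)).
    Negative scores mean 'thumb up', positive ones 'thumb down'.
    `floor` is an illustrative guard for words unseen in a sentiment model."""
    doc_model = unigram_model(doc_tokens)
    score = 0.0
    for w, p_d in doc_model.items():
        score += p_d * math.log(neg_model.get(w, floor) / pos_model.get(w, floor))
    return score

# Toy usage with hand-made sentiment models (normally estimated from the
# labelled training collections).
pos_model = {"great": 0.40, "plot": 0.30, "boring": 0.05, "film": 0.25}
neg_model = {"great": 0.05, "plot": 0.30, "boring": 0.40, "film": 0.25}
review = "boring plot boring film".split()
label = "thumb up" if sentiment_score(review, pos_model, neg_model) < 0 else "thumb down"
print(label)   # -> thumb down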
2.2 Estimation of Model Parameters
Usually, the real language models (θ_P and θ_N) are unknown. They are estimated by training on two available collections labeled positive and negative, obtaining θ_P and θ_N, respectively; θ_d is obtained similarly from the test document.
To investigate the ability of the language modeling approach, we use two methods to estimate the unigram and bigram distributions: <1> the Maximum Likelihood Estimate (MLE); <2> smoothed estimation for these three language models.
MLE for Unigrams and Bigrams
MLE is widely used for model estimation, so we directly give formula (5) and briefly analyze it as follows.


Pr_ML(w_i | "s") = #(w_i in "s") / #(* in "s"),                          s ∈ {d, P, N}, for unigrams
Pr_ML(w_i | w_{i-1}, "s") = #(w_{i-1} w_i in "s") / #(w_{i-1} in "s"),   s ∈ {d, P, N}, for bigrams    (5)

What we have to explain in (5) is s, which can represent a test document (d), the thumb up collection (P) or the thumb down collection (N). #(n-gram) denotes the number of times the n-gram occurs in the corresponding collection (d, P or N), and * denotes any word. The meanings of these symbols are fixed for the rest of this paper.
The maximum likelihood estimate is unreasonable when the amount of training data is small compared to the model size; it is clearly inaccurate to assign zero probability to unseen n-grams. Smoothing refers to techniques for adjusting the maximum likelihood estimate so as to produce more accurate models.
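A minimal sketch of the unigram and bigram MLE of equation (5) for one collection s, assuming the collection is given as a plain token list; the function name is ours.

from collections import Counter

def mle_models(tokens):
    """Unigram and bigram MLE estimates of equation (5) for one collection s
    (a token list); returns plain probability dictionaries."""
    uni_counts = Counter(tokens)
    total = sum(uni_counts.values())                 # #(* in "s")
    unigram = {w: c / total for w, c in uni_counts.items()}

    bi_counts = Counter(zip(tokens, tokens[1:]))     # #(w_{i-1} w_i in "s")
    hist_counts = Counter(tokens[:-1])               # #(w_{i-1} in "s")
    bigram = {(w1, w2): c / hist_counts[w1] for (w1, w2), c in bi_counts.items()}
    return unigram, bigram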
Dirichlet Prior Smoothing for Unigrams
Dirichlet prior smoothing [10][12] is a linearly interpolated method for the problem of zero probabilities and is suitable for smoothing unigrams. Its purpose is to address the estimation bias due to the fact that a document collection is a relatively small amount of data with which to estimate a unigram model. More specifically, it discounts the MLE appropriately and assigns non-zero probabilities to n-grams not observed in the collection. In terms of the unigram model, the smoothed estimate is

Pr_DP(w | "s") = Pr_μ(w | "s"),        if word w is seen
                 α_s · Pr_ML(w | C),   otherwise                                 (6)

where Pr_μ(w | "s") is the smoothed probability of a word w seen in the collection represented by s, Pr_ML(w | C) is the MLE language model of the whole corpus C, and α_s is a coefficient controlling the probability mass assigned to unseen words, so that all probabilities sum to one. In general, α_s may depend on s. In this study, we exploit the following smoothing formalization,
Pr_μ(w | "s") = ( #(w in "s") + μ · Pr_ML(w | C) ) / ( #(* in "s") + μ )         (7)

and

α_s = μ / (μ + |C|)                                                              (8)

Although Dirichlet prior smoothing is valid in many NLP tasks, in the sentiment classification of the movie review corpus it gives only a slight improvement over simple MLE (see the experiment section).
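A short sketch of the Dirichlet prior smoothed unigram estimator of equations (6)-(8); the token-list interface and the default μ = 450 (the value reported as best in Section 3) are illustrative assumptions.

from collections import Counter

def dirichlet_unigram(collection_tokens, corpus_tokens, mu=450.0):
    """Return a function w -> Pr_DP(w | s) following equations (6)-(8):
    seen words get (#(w in s) + mu * Pr_ML(w | C)) / (#(* in s) + mu),
    unseen words get alpha_s * Pr_ML(w | C)."""
    s_counts = Counter(collection_tokens)
    s_total = sum(s_counts.values())
    c_counts = Counter(corpus_tokens)
    c_total = sum(c_counts.values())

    def p_ml_corpus(w):
        return c_counts.get(w, 0) / c_total

    alpha_s = mu / (mu + c_total)   # coefficient for unseen words, following equation (8)

    def prob(w):
        if w in s_counts:
            return (s_counts[w] + mu * p_ml_corpus(w)) / (s_total + mu)
        return alpha_s * p_ml_corpus(w)

    return prob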

Kneser-Ney Smoothing for Bigrams
Kneser and Ney [5] introduced an extension of absolute discounting in which the lower-order distribution that one combines with the higher-order distribution is built in a novel manner. In their view, a lower-order distribution is a significant factor


in the combined model only when few or no counts are present in the higher-order distribution. Following Kneser-Ney smoothing, Stanley F. Chen and Joshua Goodman [10] mathematically motivate Kneser and Ney's algorithm by selecting the lower-order distribution such that the marginals of the higher-order smoothed distribution match the marginals of the training data. Kneser-Ney smoothing then performs best compared with other smoothing techniques under different conditions [10].
For a bigram model, Chen et al. select a smoothed distribution Pr_KN that satisfies the following constraint on unigram marginals for all w_i:

Σ_{w_{i-1}} Pr_KN(w_{i-1} w_i) = #(w_i) / Σ_{w_i} #(w_i)                         (9)

The left-hand side of this equation is the unigram marginal for w_i of the smoothed bigram distribution Pr_KN, and the right-hand side is the MLE of w_i found in the training data. Therefore, for our smoothing, we assume that the bigram model has the form given in equation (10),

Pr_KN(w_i | w_{i-1}, "s") = max{#(w_{i-1} w_i) - D, 0} / Σ_{w_i} #(w_{i-1} w_i)
                          + ( D / Σ_{w_i} #(w_{i-1} w_i) ) · N_{1+}(w_{i-1} •) · Pr_KN(w_i),   with respect to "s"    (10)

In equation (10), D is the fixed discount for observed bigrams, and D = n_1 / (n_1 + 2·n_2) by Ney's suggestion. Moreover,

Pr_KN(w_i) = N_{1+}(• w_i) / N_{1+}(• •)                                         (11)

For the meanings of the symbols D, N_{1+}, n_1 and n_2, readers can refer to the papers of Kneser and Chen [5][10].
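The interpolated Kneser-Ney bigram estimator of equations (10)-(11) can be sketched as follows, assuming a plain token list and the discount D = n1/(n1 + 2·n2); this is an illustration of the formulas above, not the authors' code.

from collections import Counter

def kneser_ney_bigram(tokens):
    """Return a function (w_prev, w) -> Pr_KN(w | w_prev) following
    equations (10)-(11) with absolute discount D = n1 / (n1 + 2*n2)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigram_hist = Counter(tokens[:-1])                       # #(w_{i-1}) as a history
    continuations = Counter(w for (_, w) in bigrams)          # N1+(. w)
    followers = Counter(w_prev for (w_prev, _) in bigrams)    # N1+(w_prev .)
    total_bigram_types = max(len(bigrams), 1)                 # N1+(. .)

    n1 = sum(1 for c in bigrams.values() if c == 1)
    n2 = sum(1 for c in bigrams.values() if c == 2)
    D = n1 / (n1 + 2.0 * n2) if (n1 + 2 * n2) > 0 else 0.5

    def prob(w_prev, w):
        p_cont = continuations.get(w, 0) / total_bigram_types  # Pr_KN(w), equation (11)
        denom = unigram_hist.get(w_prev, 0)
        if denom == 0:
            return p_cont                      # back off fully for unseen histories
        discounted = max(bigrams.get((w_prev, w), 0) - D, 0.0) / denom
        backoff_weight = D * followers.get(w_prev, 0) / denom
        return discounted + backoff_weight * p_cont

    return prob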

3 Document Set and Experiments
Turney [7][8] found movie reviews to be the most difficult of several domains for the sentiment classification task, reporting an accuracy of 65.83% on a 120-document set (random-choice baseline: 50%). Herein lies the reason we chose movie reviews for study. Our data source was the Internet Movie Database (IMDB) archive of the rec.arts.movies.reviews newsgroup, as adopted in Pang's work [2]. The dataset selected only reviews where the author rating was expressed either with stars or with some numerical value. Ratings were extracted and converted into one of two categories: positive (thumb up) and negative (thumb down).
To avoid domination of the corpus by a small number of prolific reviewers, the corpus imposed a limit of fewer than 20 reviews per author per sentiment category, yielding a corpus of 1,000 negative and 1,000 positive reviews, with a total of more than a hundred reviewers represented. Note that all of these original documents were preprocessed by stemming and stop-word removal in our work.
We designed two experiments to investigate SVMs and our method. The first was to select the most suitable kernel from the linear, polynomial, RBF and sigmoid kernels for
sentiment classification. The second was to compare the performance of our method and the SVMs. To ensure a fair comparison, all the following experiments selected features based on word unigrams and bigrams occurring more than 2 times in the 2,000 reviews; the value of a feature is its occurrence count.
For the two experiments, we split the 2,000 movie reviews into 1,200 training samples (600 positive and 600 negative) and 800 test samples (400 positive and 400 negative), and both methods were evaluated by average accuracy based on 3-fold cross-validation.
SVMs Experiment
We extracted two kinds of input feature sets for the SVMs, i.e., unigrams and bigrams. The following experiments compared the performance of SVMs using the linear, polynomial, RBF and sigmoid kernels, four conventional settings commonly used for text categorization. We used Joachims' SVMlight package [11] for training and testing, with the other parameters of the different kernel functions set to their default values in this package. This experiment aimed at seeing which kernel is more suitable for the sentiment detection problem. Table 1 shows the results of the SVMs on the IMDB corpus when different kernel functions are used.
Table 1. Comparison of four kernel functions on the IMDB training and test sets. The linear kernel achieved the highest performance on both unigram and bigram features.

Features   # of features   Linear   Polynomial   Radial Basis Function   Sigmoid
unigrams       13693        78.21      59.59             50.09            49.25
bigrams        18602        73.42      51.46             51.19            62.00

The best results on the two feature sets come from the SVM using the linear kernel, so our language-model-based method is compared with the SVM using the linear kernel.
Language Modeling Approach Experiment
We evaluated the language modeling approach described in Section 2 on the IMDB collection. As mentioned above, we used unigram and bigram models for evaluation. Table 2 shows the experimental results.
Table 2. Comparison between the language modeling approach and SVMs

Features   # of features   LM-MLE           LM-Smoothing     SVMlight (linear kernel)
unigrams       13693       Uni-MLE  82.02   Uni-DP  84.13            78.21
bigrams        18602       Bi-MLE   61.62   Bi-KN   73.80            73.42

Uni-DP (the smoothed unigram model) performed the best globally on the unigram feature set in Table 2, achieving a significant average improvement of +7.57% over the best SVM result. What surprised us is that even the simple Uni-MLE performed better than the unigram SVM. On the other hand, the experiments on


bigram features showed that the best result of the language modeling approach (Bi-KN) was close to that of the SVM, while the performance of Bi-MLE was poor.
With respect to the effect of the smoothing techniques: <1> the Dirichlet prior smoothed the unigram model with the parameter μ set to 450 (in our experiments, the best result appeared when we set μ = 450). Uni-DP performed well for the sentiment classification task but only slightly improved on Uni-MLE (+2.57%). Although the model based on MLE is inherently poorly estimated, it is not clear that the simple model must be smoothed, since the improvement was limited. This phenomenon led us to consider that it might be better to find a way of paying more attention to some sensitive concepts in order to achieve better performance for sentiment classification. <2> The Kneser-Ney method smoothed the bigram model based on an absolute-discount idea, and it made a great contribution to estimating a better bigram model, leading to a significantly better result than MLE (+19.77%) and obtaining performance comparable to the SVM. The reason might be described as follows: in theory, the higher-order n-gram model more readily approximates the true language model, but given data sparseness, a powerful smoothing mechanism provides an apparent contribution.

4 Conclusion
In this paper, we have presented a new method based on language models for sentiment classification. With this generative sentiment classifier, we represent the thumb up and thumb down semantic orientations with their corresponding language models estimated from the positive and negative collections. When classifying a test document, the distances of its language model from these two sentiment models are compared to determine its sentiment class.
Compared with SVMs, we might conclude as follows from our experimental results: for sentiment classification, when training data is limited, the smoothed low-order model, i.e., the unigram model, can globally do the best. On the other hand, smoothing techniques make a great contribution to the higher-order model, which can then also achieve performance comparable to SVMs.
The experiments showed the potential power of the language modeling approach for this task. We demonstrate that our generative sentiment classifier is applicable by learning the positive and negative semantic orientations efficiently in a supervised manner. This seems to indicate a promising future for the language modeling approach to the sentiment detection problem. We also stress that the approaches we use are not specific to movie reviews and should be easily applicable to other domains when given training data.
The difficulty of sentiment classification is apparent: negative reviews may contain many apparently positive n-grams even while maintaining a strongly negative tone, and the opposite is also common. All classifiers face this difficulty. For the language modeling approach, our future work will focus on finding a good way to estimate better language models, especially higher-order n-gram models, by introducing semantic links between n-grams.
Acknowledgement. This work is supported by NSFC Major Research Program
60496326: Basic Theory and Core Techniques of Non Canonical Knowledge.


References
1. Bo Pang and Lillian Lee: A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. In: Proc. of the 42nd ACL. (2004) 271-278
2. Bo Pang, Lillian Lee and Shivakumar Vaithyanathan: Thumbs up? Sentiment Classification using Machine Learning Techniques. In: Proc. Conf. on EMNLP. (2002)
3. Cover, T. M. and Thomas, J. A.: Elements of Information Theory. Wiley. (1991)
4. Hearst, M. A.: Direction-based text interpretation as an information access refinement. In P. Jacobs (Ed.), Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Mahwah, NJ: Lawrence Erlbaum Associates. (1992)
5. Kneser, R. and Ney, H.: Improved backing-off for m-gram language modeling. In: Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, MI, volume 1. May (1995) 181-184
6. Michael Gamon: Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis. In: Proc. of the 20th International Conference on Computational Linguistics. (2004)
7. Peter D. Turney: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proc. of the ACL. (2002)
8. Peter D. Turney and Michael L. Littman: Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems (TOIS). 21(4), (2003) 315-346
9. Peter D. Turney and Michael L. Littman: Unsupervised learning of semantic orientation from a hundred-billion-word corpus. Technical Report EGB-1094, National Research Council Canada. (2002)
10. S. F. Chen and J. T. Goodman: An Empirical Study of Smoothing Techniques for Language Modeling. Technical Report TR-10-98, Harvard University. (1998)
11. Thorsten Joachims: Making large-scale SVM learning practical. In Bernhard Scholkopf and Alexander Smola, editors, Advances in Kernel Methods - Support Vector Learning, MIT Press. (1999) 44-56
12. Zhai, C. and Lafferty, J.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: Proc. of SIGIR 2001. (2001)

Processing the Mixed Properties of Light Verb Constructions

Jong-Bok Kim1 and Kyung-Sup Lim2

1 School of English, Kyung Hee University, Seoul, Korea 130-701
2 Dept. of English and Tourism, Dongsin University, 520-714, Korea

Abstract. One of the most widely used constructions in Korean is the so-called light verb construction (LVC), involving an active-denoting verbal noun (VN) together with the light verb ha-ta 'do'. This paper first discusses the argument composition of the LVC and the mixed properties of VNs, which have provided a challenge to syntactic analyses assuming a strict version of X-bar theory. The paper shows that the mechanism of multiple classification of category types with systematic inheritance can provide an effective way of capturing these mixed properties. An implementation of the analysis within the LKB (Linguistics Knowledge Building) system proves its feasibility and efficiency.

1 Issues

The first main theoretical and computational issue we encounter in the analysis of the LVC is the status of the light verb and argument composition. One of the main properties of the light verb ha 'do' is that it does not affect the argument structure of the VN (verbal noun) it combines with.1
(1) a. John-i     Mary-eykey   cenhwa(-lul  hayessta)
       John-NOM   Mary-DAT     phone-ACC    did
       'John phoned Mary.'
    b. John-i     Mary-lul     myengtan-ey  chwuka(-lul   hayessta)
       John-NOM   Mary-ACC     list-LOC     addition-ACC  did
       'John added Mary to the list.'

As observed here, it is the type of VN that determines the types of arguments in the given sentence. This has led the literature to the view that the light verb has no argument structure of its own and inherits the argument structure of the theta-transparent VN.

1 The abbreviations used in the glosses and attributes in this paper are acc (accusative), arg (argument), c-cont (constructional content), dat (dative), decl (declarative), lbl (label), loc (locative), ltop (local top), nom (nominative), pl (plural), pred (predicate), pst (past), ind (index), rels (relations), top (topic), etc.

The second main issue concerns the grammatical status of VNs. It is well observed that, in terms of their internal properties, VNs behave like verbs, whereas
in terms of external syntax, they act like nouns. For example, as observed in (1), VNs select their own arguments and assign verbal cases such as ACC, regardless of the light verb's presence. Adverbial modification also supports the verbal properties of VNs: the VN can be modified by an adverb but not by an adjectival element.
(2) catongcha-ul  mikwuk-ey    elyepkey/*elyewun  swuchwul(-ul  hayessta)
    car-ACC       America-LOC  hard/difficult     export-ACC    did
    '(They) exported cars to America with difficulty.'

Another main issue in the LVC comes from syntactic variation. It is well observed that the VN in the true LVC shows frozen effects: it does not undergo relativization, scrambling, clefting, or topicalization. The VN further cannot be wh-questioned or pronominalized:
(3) a. John-i     Bill-eykey  tocaki-lul  senmwul-ul   hayssta
       John-NOM   Bill-DAT    china-ACC   present-ACC  did
       'John gave a china to Bill as a present.'
    b. *John-i Bill-eykey tocaki-lul han senmwul (relativization)
    c. *John-i senmwul-ul Bill-eykey tocaki-lul hayssta. (scrambling)
    d. *John-i Bill-eykey han kes-un senmwul-i-ta (clefting)
    e. *John-i Bill-eykey ku kes-ul hayssni? (pronominalization)
    f. *John-i Bill-eykey mwues-ul hayssni? (wh-question)

Intriguing facts emerge when the VN does not appear with the accusative object. In such cases, the frozen effects disappear: all these syntactic processes are possible.
(4) a. John-i     Bill-eykey  senmwul-ul   hayssta
       John-NOM   Bill-DAT    present-ACC  did
       'John gave a present to Bill.'
    b. John-i Bill-eykey han senmwul (relativization)
    c. John-i senmwul-ul Bill-eykey hayssta. (scrambling)
    d. John-i Bill-eykey han kes-un senmwul-i-ta (clefting)
    e. John-i Bill-eykey ku kes-ul hayssni? (pronominalization)
    f. John-i Bill-eykey mwues-ul hayssni? (question)

There have been various attempts to account for these properties of the LVC. In what follows, we lay out a constraint-based analysis adopting the mechanism of multiple inheritance hierarchies, which enables us to capture the mixed properties as well as other related ones in a much more streamlined manner.

2 A Typed Feature Structure Grammar: KPSG

2.1 Mixed Properties Within a Multiple Inheritance System

Our grammar KPSG (Korean Phrase Structure Grammar), based on the framework of HPSG (Head-driven Phrase Structure Grammar), aims at building a computationally feasible Korean grammar with comprehensive coverage. In the grammar, all linguistic expressions are subtypes of sign, which in turn has lex-sign (lexical sign) and syn-sign (syntactic sign) as its subtypes. Following traditional wisdom, the KPSG takes the basic lexical categories of the grammar (lex-sign) to include verbal, nominal, adverbial, and adnominal as its subtypes, which again are subclassified according to their properties. The following is a simplified hierarchy representing the relevant part:2
(5) [Simplified type hierarchy, rendered as a tree diagram in the original: lex-sign branches into verbal and nominal (among others); the verbal branch contains v-stem, with subtypes such as v-tns-stem and v-free and, further down, v-ind, v-dep, and v-ger; the nominal branch contains n-lxm, with subtypes vn and cn; the type vn is cross-classified under both the verbal and the nominal branches.]
The key to capturing the mixed properties of VNs lies in this cross-classification and multiple inheritance mechanism.3 As seen in the hierarchy, the type vn is declared to be a subtype of both verbal and n-lxm, implying that it will inherit all the constraints of these supertypes. The type verbal is declared to have the value [V +] with a non-empty ARG-ST value, whereas n-lxm has the value [POS noun]. The inheritance mechanism will then ensure that the type vn has at least the information in (6)a. This lexical information is then enriched as in (6)b when each lexical instance inherits all the relevant constraints from its supertypes.4
(6) a. [ vn
         SYN | HEAD [ POS  noun
                      V    +    ]
         ARG-ST < [ ], ... >
         SEM  ...                ]

    b. [ vn
         PHON  kongpwu 'study'
         SYN | HEAD [ POS  noun
                      N    +
                      V    +    ]
         ARG-ST < NP_i, NP_j >
         SEM [ INDEX  s1
               RELS < [ PRED  study-rel
                        ARG0  s1
                        ARG1  i
                        ARG2  j ] > ]   ]
2 The dotted line here indicates the existence of other types between the two types. The type glosses stand for v-ind(ependent), v-dep(endent), and v-ger(undive).
3 The type v-ger covers gerundive verbs like ilk-ess-um 'read-PST-NMLZ', which also display mixed properties. See [1].
4 The semantics we represent here is a simplified version of the flat semantic formalism MRS (Minimal Recursion Semantics). See [2] and [3] for details.
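The effect of this cross-classification can be mimicked with ordinary multiple inheritance; the toy sketch below (illustrative only, not the KPSG/LKB type encoding) shows a vn type ending up with both the [V +] constraint of verbal and the [POS noun] constraint of n-lxm.

```python
class LexSign:
    constraints = {}

class Verbal(LexSign):
    # verbal types are [V +] and come with a (non-empty) ARG-ST
    constraints = {"V": "+", "ARG-ST": "non-empty"}

class NLxm(LexSign):
    # nominal lexemes are [POS noun]
    constraints = {"POS": "noun"}

class VN(Verbal, NLxm):
    # a verbal noun inherits from both supertypes; [N +] is declared locally
    constraints = {**Verbal.constraints, **NLxm.constraints, "N": "+"}

print(VN.constraints)   # {'V': '+', 'ARG-ST': 'non-empty', 'POS': 'noun', 'N': '+'}
```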

As observed here, the system clearly represents why VNs are partly nominal ([N +]) and partly verbal ([V +]), even though in terms of POS they are more like nouns. In addition, by referring to the proper feature value, the grammar is flexible enough to capture other related properties. For example, the KPSG allows an adverb to modify a [V +] element; this predicts the adverbial modification in the LVC discussed in (2). Moreover, since the type vn as a subtype of n-lxm bears [N +] and [POS noun], we naturally predict that VNs act like other nominal elements: they can have case markings attached to them, take the GEN grammatical case, and serve as the head of a relative clause construction like other [POS noun] elements.
2.2 Argument Composition and the Syntax of the LVC

The argument composition properties between the VN and the following light
verb lead us to take the light verb as a kind of auxiliary verb as given in (7):5
(7) [ PHON  ha-ta 'do'
      SYN | HEAD | POS  verb
      ARG-ST < [INDEX i], [LEX +, XARG i] > ]

According to this lexical information, just like an auxiliary verb, the light verb
is syntactically transitive, selecting a subject argument and a VN expression
(lexical or phrasal). The VN forms a well-formed phrase with the light verb in
accordance with the following grammar rule:6
(8) Head-Lex Rule:
    hd-lex-ex [COMPS #2]  ->  #1 [LEX +, COMPS #2],  H [AUX +, COMPS < #1 >]

The Head-Lex Rule specifies that the auxiliary head combines with a [LEX +] complement7 and that the COMPS value of this lexical complement is passed up to the resulting combination. This kind of argument composition differs from previous analyses ([5], [6]) mainly in that the composition happens in syntax rather than in the lexicon. Since the external argument of the light verb is identified with its first argument, this in turn means that the subject of the LVC is determined by the VN.
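A minimal procedural rendering of this composition (a sketch following the statement of (8), with invented dictionary-based signs rather than the grammar's actual feature structures) is the following:

```python
def head_lex_combine(aux_head, lex_complement):
    """Head-Lex Rule, cf. (8): an [AUX +] head combines with a [LEX +] complement,
    and the complement's remaining COMPS list is passed up to the mother."""
    assert aux_head.get("AUX") == "+" and lex_complement.get("LEX") == "+"
    return {
        "TYPE": "hd-lex-ex",
        "HEAD": aux_head["HEAD"],                 # head features come from the light verb
        "COMPS": list(lex_complement["COMPS"]),   # argument composition in syntax
        "DTRS": [lex_complement, aux_head],
    }

# senmwul 'present' (VN) still needs its dative and accusative arguments;
# hayssta 'did' is the auxiliary-like light verb selecting the VN.
vn = {"LEX": "+", "HEAD": "noun", "COMPS": ["NP[dat]", "NP[acc]"]}
light_verb = {"AUX": "+", "HEAD": "verb", "COMPS": ["VN"]}
print(head_lex_combine(light_verb, vn)["COMPS"])  # ['NP[dat]', 'NP[acc]']
```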
To check the feasibility of our grammar equipped with the Head-Lex Rule and other X-bar grammar rules, we implemented this grammar in the LKB (Linguistic Knowledge Building) system (cf. [7]). The LKB system is a grammar and lexicon development environment for use with constraint-based linguistic formalisms such as HPSG.8 The following is the parsed tree and semantic representation of sentences like (3a).

5 The semantic attribute XARG identifies the semantic index of a phrase's external argument, usually the subject of a verb phrase.
6 This rule generates complex predicate constructions such as auxiliary constructions in Korean. See [4].
7 The feature LEX is assigned to non-phrasal expressions such as words and complex predicates.

Fig. 1. Parsed Tree and MRS for (3a)

The tree structure in the small box indicates that the light verb hayssta did
here combines with its VN complement senmwul present, forming a well-formed
hd-lex-ex. This resulting combination also inherits the COMPS value of the VN
in accordance with the Head-Lex Rule in (8). This will then combines with
the argument tocaki china whose resulting VP again combines with the dative
argument Bill-eykey.
The bigger box represents the semantics of the sentence in the MRS (Minimal Recursion Semantics), developed by [3]. The MRS is a framework of computational semantics designed to enable semantic composition using only the
unication of type feature structures. (See [3] and [2]) We can observe that the
parsed MRS provides enriched information of the sentence. The value of LTOP
is the local top handle, the handle of the relation with the widest scope within
the sentence. The INDEX value here is identied with the ARG0 value of the
prpstn m rel (propositional message). The attribute RELS is basically a bag of
elementary predications (EP) each of whose value is a relation.9 Each of the
types relation has at least three features LBL, PRED (represented here as a
8
9

The LKB is freely available with open source (http://lingo.stanford.edu).


The attribute HCONS is to represent quanticational information. See [2].

Processing the Mixed Properties of Light Verb Constructions

1199

type), and ARG0. We can notice that the MRS correctly represents the propositional meaning such that John did the action of giving a china as a present
to Bill.
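Purely for illustration, an MRS of the general shape just described can be written down as a nested structure like the one below; the handles and relation names here are invented placeholders, not the actual output of the parser.

```python
# Schematic MRS-like structure (labels and predicates are illustrative only).
mrs = {
    "LTOP": "h1",                  # handle of the relation with the widest scope
    "INDEX": "e2",                 # identified with the ARG0 of the propositional message
    "RELS": [                      # bag of elementary predications (EPs)
        {"LBL": "h1", "PRED": "prpstn_m_rel", "ARG0": "e2"},
        {"LBL": "h3", "PRED": "named_rel",    "ARG0": "x4", "CARG": "John"},
        {"LBL": "h5", "PRED": "give_rel",     "ARG0": "e2", "ARG1": "x4", "ARG2": "x6", "ARG3": "x7"},
    ],
    "HCONS": [],                   # quantificational (scope) constraints, cf. footnote 9
}
for ep in mrs["RELS"]:
    print(ep["PRED"], ep["LBL"], ep["ARG0"])
```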
2.3 Common Noun Usages

VNs can also be used as common nouns when they take no ACC arguments. For example, the VN-like nouns in (9) are different from the argument-taking VNs even though they combine with the light verb.10
(9) a. John-i     kongpwu-ul   hayessta
       John-NOM   study-ACC    did
       'John studied.'
    b. John-i     Bill-eykey   senmwul-ul    hayssta
       John-NOM   Bill-DAT     present-ACC   did
       'John gave a present to Bill.'
Unlike the true VNs with the feature [N +, V +], these VNs are common nouns with the feature [N +, V -]. As noted in (4), they can also be modified by an adjectival element, and they do not show the frozen effects of VNs. In addition, even though they do not select an ACC argument, they still keep the dative argument Bill-eykey. To capture these relationships, our grammar posits the following lexical rule:
(10) VN-to-CN Lexical Rule:
     [ vn-tr
       ARG-ST < #1, [ ] > ]   ==>   [ cn-vn
                                      HEAD | V  -
                                      ARG-ST < #1 > ]

This lexical rule turns any transitive VN selecting two or more arguments into a CN (cn-vn) with no change in meaning. However, the output no longer has verbal properties, as indicated by the [V -] value. The following illustrates an example of this lexical process:
(11) [ vn-tr
       PHON  senmwul
       SYN | HEAD [ POS  noun
                    N    +
                    V    +    ]
       ARG-ST < NP_i, NP_j, NP_k > ]
     ==>
     [ cn-vn
       PHON  senmwul
       SYN | HEAD [ POS  noun
                    N    +
                    V    -    ]
       ARG-ST < NP_i, NP_k > ]

As noted, the cn-vn loses the [V +] property and becomes a canonical common noun. One thing to note here is that even though the output is a common noun, it still has the identical LEX and semantic values. This output then allows us to generate a structure like the following:

10 All the VNs select a subject and an argument, which are realized as NOM and ACC.
Fig. 2. Parsed Tree and MRS for (4a)

As given in the parsed tree, the light verb hayessta combines with senmwul-ul
present, forming a hd-lex-ex since the former has the LEX feature. The resulting
expression also inherits the COMPS value of senmwul-ul, the DAT argument
Bill-eykey. This is a complement sentence with no argument missing in which
senmwul-ul is a canonical NP that can undergo various syntactic processes as
given in (4). We also can observe that the grammar correctly provides a correct
MRS meaning representation.

3 An Implementation and Its Results

In testing the performance and feasibility of the grammar, we first built our test sets from (1) the SERI Test Suites '97, (2) the Sejong Project Basic Corpus, and (3) self-constructed examples adopted from the literature. The SERI Test Suites ([8]), designed to evaluate the performance of Korean syntactic parsers, consist of a total of 472 sentences (292 test sentences representing the core phenomena of the language and 180 sentences representing different types of predicate). Meanwhile, the Sejong Corpus has about 2,061,977 word instances in 179,082 sentences. Of these, we found a total of 95,570 instances of the combination of a noun (tagged as NNG) with the light verb ha-ta.11 Some of the nouns with the highest frequencies are given here:

    5111  …/NNG + …/XSV  'speak'        3021  …/NNG + …/XSV  'think'
    1730  …/NNG + …/XSV  'begin'         897  …/NNG + …/XS   'need'
     834  …/XR  + …/XSA  'important'     619  …/NNG + …/XSV  'use'
     543  …/NNG + …/XSV  'claim'         528  …/NNG + …/XSV  'begin'

11 The Sejong Corpus thus does not distinguish general nouns from verbal nouns.
Based on the frequency list, we first extracted the 100 most frequently used VNs, and from these VNs we selected 100 simple sentences (one for each VN type) that could show us at least the basic patterns of the LVC. The following shows the results of parsing our test suites:
Corpus Types                 # of S    # of Parsed S   # of LVC Ss   Parsed LVC Ss
SERI Test Suite              472       443 (93.7%)     12            12 (100%)
Self-designed Test Suite     350       330 (94.2%)     100           94 (94%)
Ss from the Sejong Corpus    179,082   -               100           87 (87%)
Total LVC Ss                 -         -               212           190 (89%)

As the table shows, our system correctly parsed about 93 percent of the 472 SERI Test Suite sentences, which include the sentences that the theoretical literature has often discussed. The system also parsed about 94% of the self-designed test sentences, most of which were likewise collected from the major literature on the LVC. As for the Sejong corpus, the system parsed about 87% of its simple sentences. Though the current grammar still needs to be extended to a wider range of authentic corpus data displaying more complex properties of the language, the parsing results indicate that the current grammatical system is feasible enough to capture the mixed properties and offers the possibility of deep processing for such phenomena.

References
1. Kim, J.-B., Yang, J.: Projections from morphology to syntax in the Korean Resource Grammar: implementing typed feature structures. In: Lecture Notes in Computer Science, Volume 2945. Springer-Verlag (2004) 13-24
2. Bender, E. M., Flickinger, D. P., Oepen, S.: The Grammar Matrix: An open-source starter-kit for the rapid development of cross-linguistically consistent broad-coverage precision grammars. In Carroll, J., Oostdijk, N., Sutcliffe, R., eds.: Proceedings of the Workshop on Grammar Engineering and Evaluation at the 19th International Conference on Computational Linguistics, Taipei, Taiwan (2002) 8-14
3. Copestake, A., Flickinger, D., Sag, I., Pollard, C.: Minimal Recursion Semantics: An introduction. Manuscript (2003)
4. Kim, J.-B.: Korean Phrase Structure Grammar. Hankwuk Publishing, Seoul (2004) In Korean.
5. Bratt, E.: Argument composition and the lexicon: Lexical and periphrastic causatives in Korean. PhD thesis, Stanford University (1996)
6. Kim, J.-B.: The Grammar of Negation: A Constraint-Based Perspective. CSLI Publications, Stanford (2002)
7. Copestake, A.: Implementing Typed Feature Structure Grammars. CSLI Publications, Stanford (2002)
8. Sung, W.-K., Jang, M.-G.: SERI Test Suites '95. In: Proceedings of the Conference on Hangul and Korean Language Information Processing. (1997)

Concept-Based Question Analysis for an Efficient Document Ranking

Seung-Eun Shin1, Young-Min Ahn2, and Young-Hoon Seo2

1 Chungbuk National University BK21 Chungbuk Information Technology Center, Cheongju, Chungbuk, 361-763, Korea
  seshin@nlp.chungbuk.ac.kr
2 School of Electrical & Computer Engineering, Chungbuk National University, Cheongju, Chungbuk, 361-763, Korea
  mania@nlp.chungbuk.ac.kr, yhseo@chungbuk.ac.kr

Abstract. This paper describes a concept-based question analysis for efficient document ranking. Our idea is that we can rank documents containing answers more efficiently when we use well-defined concepts, because the concepts occurring in questions with the same answer type are similar. That is, we can retrieve more relevant documents if we know the syntactic and semantic role of each word or phrase in the question. For each answer type, we define a concept rule which contains the core concepts occurring in questions of that answer type. Concept-based question analysis is a process which tags concepts in the morphological analysis result of a user's question, determines the answer type, and extracts untagged concepts from it using a matched concept rule. Empirical results show that our concept-based question analysis can rank documents more efficiently than conventional approaches. Also, the concept-based approach has the additional merits that it is a language-universal model and can be combined with arbitrary conventional approaches.
Keywords: Concept, Concept rule, Question Analysis, Document Ranking, Question Answering.

1 Introduction
Information retrieval (IR) techniques, used to find information quickly and precisely in enormous document collections, have developed rapidly with the growth and commercial application of the Internet. However, we often find that highly ranked documents retrieved by a general web search engine may be far from the user's intention. Therefore, effective retrieval and ranking techniques are needed to provide more relevant documents to users, and question answering (QA) systems are in demand for users' convenience.
The TREC 2005 QA track contained three tasks: the main question answering task, the document ranking task, and the relationship task. The goal of the document ranking task was to create pools of documents containing answers to questions. The task was to submit, for a subset of 50 of the questions in the main task, a ranked list of up to 1000 documents for each question [1].
Current research tends to focus on applying natural language processing (NLP) techniques for efficient document retrieval [2], [3], [4], [5]. However, such approaches cannot effectively reflect the meaning of sentences because they use only index terms extracted by morphological analysis or an n-gram method. The results of all IR systems include many non-relevant documents, because the index cannot naturally reflect the contents of documents and because the queries used in IR systems cannot represent enough of the information in a user's question [6]. That is, the statistical IR model cannot understand a user's intention because it does not consider the semantics of index terms. This is an essential reason for inaccurate IR.
Question analysis in QA determines the answer type corresponding to the question type using named entities, answer type taxonomies, and ontologies such as WordNet, in addition to the question analysis done in IR systems [7], [8], [9]. The number of answer types varies widely, from single digits to a few thousand. A subdivided classification of answer types helps to extract more accurate answers by reducing the number of answer candidates in the answer extraction phase.
This paper describes a concept-based question analysis for efficient document ranking in which concepts, rather than keywords, play an important role in document retrieval. We define concepts that commonly occur in the same type of question, and use those concepts in document ranking to retrieve more relevant documents.

2 Concept-Based Question Analysis


Concept-based question analysis applies NLP techniques to a user's natural language questions and analyzes them semantically for efficient document ranking. For each answer type, we define a concept rule which contains the core concepts occurring in questions of that answer type. A concept is a well-defined semantic component for each answer type.
Let us consider the following questions, whose answer types are author and time.
(Question 1) Who wrote Hamlet?
(Question 2) Who is the author of the novel, The old man and the sea?
(Question 3) When was the American Legion founded?
(Question 4) When was Hong Kong returned to Chinese sovereignty?
We can identify concepts that are commonly used to represent the users' information demand in the questions (Questions 1-2) whose answer type is author: titles of books (Hamlet, The old man and the sea), an interrogative pronoun (who), a noun expressing the author (author), a verb expressing the author (write), and a noun expressing the genre (novel). The concepts commonly used in the questions (Questions 3-4) whose answer type is time are objects of an event (American Legion, Hong Kong), verbs representing an event (found, return), an interrogative pronoun (when), and nouns related to an event (Chinese, sovereignty).
The statistical model extracts the index terms (who, write, Hamlet) from (Question 1) and the index terms (when, American Legion, found) from (Question 3). It then generates queries using these index terms and ranks documents by query-document similarity. As a result, it may rank a document that simply contains many index terms above the truly relevant document that includes a sentence such as "Shakespeare is the author of Hamlet." Moreover, users consider the precision of the top-ranked documents more important than the total precision, because most IR systems return hundreds of thousands of results. To solve this problem, we determine answer types and subtypes of answers, extract concepts from users' questions, and use them for efficient document ranking.
A concept is not the simple meaning of a word but the semantic role of a word or phrase in a sentence, and it is used to represent the user's intention. We defined subtypes of the answer and concepts from 643 questions whose answer type is person, drawn from TREC QA data and the Web, and we constructed a concept dictionary by tagging concepts manually on those questions and expanded it using a synonym dictionary. Table 1 shows a sample of the answer subtypes and concepts. The subtypes of the answer type person are classified into 24 categories, such as author, family, prizewinner, politician, developer, inventor, scholar, entertainer, player, etc. We currently define 125 concepts.
Table 1. Sample of subtypes of the answer and concepts

Subtype of the answer   Concepts
Common                  Nationality, Time, Sex, Person, ...
Author                  Book_Title, Author Noun, Author Verb, ...
Family                  Relationship, Base Person, Relationship Info, ...
Prizewinner             Prize, Prize Noun, Prize Verb, Ceremony/Place, ...
Politician              Position, Event, Organization, Election Noun, ...
Table 2. Sample of concept tags and concept dictionary

Concept        Tag             Concept Dictionary
Clue Adverb    %Adverb         choi-cho (first), ma-ji-mak (last), ...
Time           %Time           -
Who            @Who            nu-gu (who)
Person         @Person         sa-ram, in-mul, bun (person), ...
Author Noun    @Author_N       jeo-ja, jak-ga, geul-sseun-i (author), ...
Author Verb    @Author_V       jeo-sul-ha, jeo-jak-ha, sseu (write), ...
Genre          @Genre          chaig (book), so-seol (novel), su-pil (essay), ...
Relationship   @Relationship   a-deul (son), bu-in (wife), a-beo-ji (father), ...
Base Person    #Base_Person    -
Book_Title     #Book_Title     -

Table 2 is a sample of the concept tags and the concept dictionary. The tag for each concept consists of Property + Concept. Properties are divided into three kinds: '%', '@', and '#'. We are currently constructing the concept dictionary, which includes 2,039 vocabulary entries.
Concepts with the '%' property occur in relatively free positions in a sentence, and so they are not well suited to handling by rules. Some '%' property concepts, such as %Adverb, may be extracted from the concept dictionary, and others, such as %Time, from rules. These concepts play an important role in document ranking even though they are not used in concept rule matching.

Concepts with the '@' property are extracted from the concept dictionary. That is, we tag a word or phrase with an '@' property concept if it is found in the concept dictionary. Concepts with the '#' property are extracted only from concept rules. Proper nouns, such as book titles and person names, belong to the '#' property concepts. '#' property concepts are extracted from the question after a concept rule has been selected using the '@' property concepts.
We manually defined a concept rule for each answer type to extract concepts from users' questions. Such a concept rule is represented as concepts and grammatical morphemes in order to take into account the semantic and syntactic structure of users' questions. Fig. 1 shows the BNF notation of the concept rule and the characteristics of Korean considered in our discussion. A concept rule consists of a list of <word information> elements, each of which consists of concepts and grammatical morphemes according to the answer type.
<Concept Rule> ::= <Word List>
<Word List> ::= <Word Information> | <Word List><Word Information>
<Word Information> ::= ( <Concept> ) | ( <Concept> <Grammatical Morpheme> )
<Concept> ::= @Who|@Person|@Author_N|@Author_V|@Genre|
@Relationship|#Base_Person|#Book_Title|...
<Grammatical Morpheme> ::= jc|jx|jm|etm|co|ef|oj|co+etm
jc : case particle
co : copula
jx : auxiliary particle
etm : adnominal transition ending
jm : adnominal case particle
ef : sentence ending
oj : objective case particle
Fig. 1. BNF notation for the concept rule

The following examples show concept rules for 'author', represented in the extended BNF, and the process of concept-based question analysis for (Question 2).

< Example concept rules for 'author' >
3. (#Book_Title co+etm) (@Genre jm) (@Author_N jx?) (@Who)?
4. (#Book_Title jc) (@Author_V etm) (@Person|@Author_N jx?) (@Who)?
5. (#Book_Title jm) (@Author_N jx?) (@Who)?
< Concept-based question analysis for (Question 2) >
Question: No-in-gua ba-da-ran so-seol-ui jeo-ja-neun nu-gu-ib-ni-gga?
  (Who is the author of the novel, The old man and the sea?)
Morphological analysis result of (Question 2):
  No-in-gua ba-da/nc + i/co + ra-go/ec + ha/pv + neun/etm  so-seol/nc + ui/jm  jeo-ja/nc + neun/jx  nu-gu/np + i/co + b-ni-gga/ef
Word list:
  (No-in-gua ba-da co+etm) (@Genre jm) (@Author_N jx) (@Who co+ef)
Concept rule:
  (#Book_Title co+etm) (@Genre jm) (@Author_N jx?) (@Who)?
Result of concept-based question analysis for (Question 2):
  Answer type: in-mul (Person)
  Subtype of the answer: jeo-ja (Author)
  Book_Title: No-in-gua ba-da (The old man and the sea)
  Genre: so-seol (novel)

We tag concepts in the morphological analysis result of a user's question and construct the concept list. The concept list is then matched against each concept rule. If there is a matched concept rule, we determine the answer type and extract the '#' property concepts using that rule. When several concept rules match, we select the longest one.
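A rough sketch of this tagging-and-matching step is given below; the dictionary entries and the regular-expression encoding of rules are toy stand-ins for the system's actual resources.

```python
import re

# Toy concept dictionary: '@' concepts are looked up here, '#' concepts come from rules.
CONCEPT_DICT = {
    "who": "@Who", "author": "@Author_N", "writer": "@Author_N",
    "write": "@Author_V", "wrote": "@Author_V",
    "novel": "@Genre", "book": "@Genre",
}

def tag_concepts(tokens):
    """Replace dictionary words by their concept tags; other tokens stay as they are."""
    return [CONCEPT_DICT.get(t.lower(), t) for t in tokens]

def match_rule(tagged_tokens, rules):
    """Select the longest concept rule (written as a regex over the tag sequence) that matches."""
    sequence = " ".join(tagged_tokens)
    matched = [r for r in rules if re.search(r, sequence)]
    return max(matched, key=len) if matched else None

rules = [r"@Who .* @Author_V", r"@Who .* @Author_N .* @Genre", r"@Author_N"]
print(match_rule(tag_concepts("Who is the author of the novel".split()), rules))
# -> '@Who .* @Author_N .* @Genre'
```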
If no concept rule matches a user's question, we extract concepts from the question in the following manner. We classified questions whose answer type is person according to whether they contain a verb. A question which includes a verb consists of Event_V (a verb representing an event), Person (a noun representing a person), Property_N (a noun representing a property of a person, such as doctor or author), and Who (an interrogative). A question which does not include a verb consists of Property_N, NP (a noun phrase), and Who. We therefore designed common concept rules according to the syntactic structure of Korean natural language questions, as follows.

Common concept rule for the question which includes a verb


1. (#NP jc) (#Event_V etm) (@Person|@Property_N) (@Who)?
2. (#NP jx) (@Who) (#Event_V ef)
3. (@Who) (#NP jc) (#Event_V ef)
Common concept rule for the question which does not include a verb
1. (#NP jm) (@Property_N) (@Who)?
2. (#NP jc) (@Property_N) (@Who)?
3. (@Property_N) (@Who)?

Although we cannot determine the subtype of the answer in this case, we can still extract concepts using the common concept rules from a user's question for which no specific concept rule applies.

3 Document Ranking
We generate queries that reflect the various syntactic structures used to express the answer and utilize them for efficient document ranking. We treat a document that includes a generated query as a relevant document, which improves the precision of document retrieval. Approaches that do not analyze the user's question semantically can hardly obtain queries whose syntactic structure differs from the user's question, but our approach can generate such queries using the answer type and concepts produced by concept-based question analysis.
We designed query generation concept rules, made up of concepts and grammatical morphemes, in order to generate queries. The query generation concept rules contain the concepts and syntactic structures used to express an answer. We can generate queries using the query generation concept rules, the concepts, the concept dictionary, and the synonym dictionary. The following are examples of query generation concept rules for 'author'.

< Query generation concept rules for 'author' >
3. (Book_Title jm) (Author_N)
4. (Book_Title co+etm) (Genre oj) (Author_V)

We can generate queries using the query generation concept rules and the results of the concept-based question analysis of (Question 2), as below.

< Examples of query generation >
Query generation concept rule 3: (Book_Title jm) (Author_N)
Generated queries: No-in-gua ba-da-ui jeo-ja | jak-ga | geul-sseun-i
  (writer | author of The old man and the sea)
Query generation concept rule 4: (Book_Title co+etm) (Genre oj) (Author_V)
Generated queries: No-in-gua ba-da-ra-neun so-seol-eul jeo-sul-ha | jeo-jak-ha | sseu
  (write | wrote | compose the novel, The old man and the sea)

Our approach can generate queries which have the same meaning as the original question but different structures, as in the example above. The generated queries reflect the syntactic structures of almost all phrases that contain answers to the question. We treat a document that includes a generated query as relevant because the generated queries consist of the concepts and syntactic structures used to express an answer.
Formula (1) is a transformation of the cosine coefficient to determine query-document similarity when our approach is combined with the vector model [10]. We calculate the query-document similarity using formula (1) so that a document which includes a generated query is retrieved as a relevant document.

$$ sim(d_i, q_u) = \begin{cases} \vec{d}_i \cdot \vec{q}_{ge}, & \text{if } \vec{d}_i \cdot \vec{q}_{ge} \neq 0 \\ \dfrac{\vec{d}_i \cdot \vec{q}_u}{|\vec{d}_i|\,|\vec{q}_u|}, & \text{otherwise} \end{cases} \qquad (1) $$

where $d_i$ is a document, $q_u$ is the user's question, $\vec{d}_i$ is the document vector, $\vec{q}_{ge}$ is the generated query vector, and $\vec{q}_u$ is the query vector obtained by query expansion.
We can rank documents by formula (1) because the queries are generated from the concepts and syntactic structures used to express an answer. We can increase the precision of document retrieval by ranking documents which include a generated query in high positions.
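The short sketch below implements this ranking rule over simple bag-of-words vectors (a minimal illustration of formula (1); tokenization, query expansion, and term weighting are assumed to happen elsewhere).

```python
import math
from collections import Counter

def dot(u, v):
    return sum(c * v.get(w, 0) for w, c in u.items())

def norm(u):
    return math.sqrt(sum(c * c for c in u.values())) or 1.0

def similarity(doc_tokens, user_query, generated_query):
    """Formula (1): if the document matches the generated query, score it by that dot
    product; otherwise fall back to the cosine similarity with the (expanded) user query."""
    d, qu, qge = Counter(doc_tokens), Counter(user_query), Counter(generated_query)
    if dot(d, qge) != 0:
        return dot(d, qge)
    return dot(d, qu) / (norm(d) * norm(qu))

doc = "shakespeare is the author of hamlet".split()
print(similarity(doc, ["who", "wrote", "hamlet"], ["author", "of", "hamlet"]))  # 3
```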

4 Experimental Results
We randomly selected 100 questions as a test set from natural language questions whose answer type is person. They were questions actually used for IR on the Web. We measured the precision at N documents. In our experiments, we used Google and Yahoo as the IR systems and used only the top 30 results of these systems for the precision at N documents.
Table 3 shows the accuracy of concept-based question analysis. The accuracy of concept extraction is (the number of concepts extracted correctly) / (the total number of concepts). Table 4 shows the precision at N documents of the document ranking result.
Table 3. Accuracy of concept-based question analysis

                                         Questions to which a       Questions to which a common
                                         concept frame is applied   concept rule is applied
The number of questions                  69                         31
Accuracy of answer type determination    1.000                      0.903
Accuracy of subtype determination        1.000                      (subtype cannot be determined)
Accuracy of concept extraction           0.918                      0.845

Table 4. Precision at N documents of the document ranking result

Micro Averaging Precision   Google   Google + Our approach   Yahoo    Yahoo + Our approach
At 3 docs                   0.584    0.803 (+0.219)          0.580    0.804 (+0.224)
At 5 docs                   0.585    0.770 (+0.185)          0.557    0.743 (+0.186)
At 10 docs                  0.548    0.646 (+0.098)          0.511    0.604 (+0.093)
At 15 docs                  0.523    0.591 (+0.068)          0.478    0.541 (+0.063)
At 20 docs                  0.489    0.541 (+0.052)          0.458    0.506 (+0.048)

Precision at N documents: the percentage of documents retrieved in the top N that are relevant. If the number of documents retrieved is fewer than N, all missing documents are assumed to be non-relevant.

When our approach is applied to Google and Yahoo, the precision at N documents improves by +0.2215 (N=3), +0.1850 (N=5), and +0.0955 (N=10). If our approach were applied to more documents, the precision at N documents could improve beyond the figures in Table 4. In addition, we found that document ranking can be made more efficient by analyzing questions based on concepts, which are comparatively short but fully express a user's intentions.

5 Conclusion and Future Work


In this paper, we proposed a concept-based approach for efficient document ranking. Concept-based question analysis extracts concept components from the morphological analysis result of a user's question, determines the answer type, and generates queries using the extracted concepts. We then rank documents which include a generated query in high positions as relevant documents.
We applied our concept-based question analysis to the document retrieval systems Google and Yahoo and obtained a notable improvement (+0.2215, N=3) in the precision at N documents. Although our experiments were performed in the restricted domain of questions requiring a person name as the answer, our concept-based approach retrieves more relevant documents than conventional approaches. Our approach also has the additional merits that it is a language-universal model and can be combined with arbitrary conventional approaches, in the sense that the concept-based approach is used when the given question can be analyzed with one of the defined concept rules and another approach is used otherwise.
We plan to expand the concept rules to various domains and expect incremental performance improvements.
Acknowledgments. This research was supported by the Ministry of Information and Communication, Korea, under the Information Technology Research Center support program supervised by the Institute of Information Technology Assessment, IITA-2006-(C1090-0603-0046).

References
1. Ellen M. Voorhees and Hoa Trang Dang: Overview of the TREC 2005 Question Answering Track. TREC 2005, (2005)
2. A. T. Arampatzis, T. Tsoris, C. H. A. Koster and Th. P. van der Weide: Phrase-based Information Retrieval. Journal of Information Processing & Management, 34(6), (1998) 693-707
3. Boris V. Dobrov, N. V. Loukachevitch and T. N. Yudina: Conceptual Indexing Using Thematic Representation of Texts. TREC-6, (1997)
4. Jose Perez-Carballo and Tomek Strzalkowski: Natural language information retrieval: progress report. Journal of Information Processing & Management, 36(1), (2000) 155-178
5. C. Zhai: Fast Statistical Parsing of Noun Phrases for Document Indexing. In Proceedings of the Fifth Conference on Applied Natural Language Processing, (1997)
6. S. H. Myaeng: Current Status and New Directions of Information Retrieval Techniques. Communications of the Korea Information Science Society, 24(4), (2004) 6-14
7. A. Ittycheriah, M. Franz, W. Zhu, A. Ratnaparkhi: IBM's Statistical Question Answering System. In 9th Text Retrieval Conference, (2000) 229-334
8. S. Harabagiu, M. Pasca, S. Maiorano: Experiments with open-domain textual question answering. In COLING-2000, (2000) 292-298
9. M. Pasca, S. Harabagiu: High Performance Question/Answering. In 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, (2001) 366-374
10. G. Salton: Automatic Text Processing, Addison-Wesley, (1989)

Learning Classifier System Approach to Natural Language Grammar Induction

Olgierd Unold

Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, Poland
olgierd.unold@pwr.wroc.pl
http://sprocket.ict.pwr.wroc.pl/~unold

Abstract. This paper describes an evolutionary approach to the problem of inferring non-stochastic context-free grammars (CFG) from natural language (NL) corpora. The approach employs the Grammar-based Classifier System (GCS). GCS is a new version of Learning Classifier Systems in which the classifiers are represented by a CFG in Chomsky Normal Form. GCS has been tested on NL corpora, and it provided results comparable to a pure genetic induction approach, but in a significantly shorter time. An efficient implementation of grammar induction is very important for the analysis of large text corpora.

1 Introduction
Syntactic processing, one of the complex tasks in natural language processing (NLP), has always been considered paramount to a wide range of applications, such as machine translation, information retrieval, speech recognition, and the like. Historically, most computational systems for syntactic parsing employ hand-written grammars, consisting of a laboriously crafted set of grammar rules that apply syntactic structure to a sentence. In recent years, however, many research efforts have tried to automatically induce workable grammars from annotated corpora. The process in which a system produces a grammar given a set of corpora is known as grammatical inference or grammar induction [4]. In general, the natural language (NL) corpora may contain both positive and negative examples from the language under study, which is most often described by a context-free grammar (CFG). There are very strong negative results for the learnability of CFGs. Effective algorithms exist only for regular languages, so the construction of algorithms that learn context-free grammars is a critical and still open problem of grammar induction. Many researchers have attacked the problem of grammar induction by using evolutionary methods to evolve (stochastic) CFGs or equivalent pushdown automata [8], but mostly for artificial languages such as bracket languages and palindromes. For surveys of the non-evolutionary approaches to CFG induction see [6].
In this paper we examine NL grammar induction using the Grammar-based Classifier System (GCS) [7], a new model of Learning Classifier System (LCS). In spite of intensive research into classifier systems in recent years [5], there are still only a small number of attempts at evolving grammars using LCS. Bianchi [2] showed, on the basis of experiments with bracket grammars, palindromes, and a toy grammar, a higher efficiency of LCS in comparison with a purely evolutionary approach. Cyre [3] induced a grammar for a subset of natural language using LCS, but comparison with his results is difficult because the corpora he used are protected by trademarks. GCS tries to fill this gap, also bringing grammar induction issues up to date. As was shown in [7], GCS achieves better results than Bianchi's system with reference to artificial grammars. This paper describes the GCS approach to the problem of inferring a non-stochastic CFG from NL corpora.

2 Grammar-Based Classifier System


The GCS operates similarly to the classic LCS but differs from it in (i) the representation of the classifier population, (ii) the scheme for matching classifiers to the environmental state, and (iii) the methods for discovering new classifiers. The population of classifiers has the form of a context-free grammar rule set in Chomsky Normal Form (CNF). This is not actually a limitation, because every CFG can be transformed into an equivalent CNF grammar. CNF allows only production rules of the form A -> a or A -> BC, where A, B, and C are non-terminal symbols and a is a terminal symbol. The first form is an instance of a terminal rewriting rule; these rules are not affected by the genetic algorithm (GA) and are generated automatically when the system meets an unknown (new) terminal symbol. The left-hand side of a rule plays the role of the classifier's action, while the right-hand side is the classifier's condition. All classifiers (production rules) form a population of evolving individuals. In each cycle a fitness-calculating algorithm evaluates the value (adaptation) of each classifier, and the discovery component operates on single classifiers. The CFG is learned from a training set that consists of both syntactically correct and incorrect sentences. A grammar which accepts the correct sentences and rejects the incorrect ones is able to classify previously unseen sentences from a test set. The Cocke-Younger-Kasami (CYK) parser, which operates in O(n^3) time, is used to parse sentences from the corpus. The environment of the classifier system is replaced by the array of the CYK parser. The classifier system matches the rules according to the current environmental state (the state of parsing) and generates an action (or, in GCS, a set of actions) pushing the parsing process toward the complete derivation of the sentence being analyzed. The discovery component in GCS is extended in comparison with a standard LCS. In some cases a covering procedure may be invoked, adding useful rules to the system: it adds productions that allow parsing to continue in the current state of the system. Apart from covering, a GA also explores the space, searching for new, better rules. Classifiers used in parsing positive examples gain the highest fitness values, unused classifiers are placed in the middle, and classifiers that parse negative examples gain the lowest possible fitness values. GCS uses a variant of the GA that chooses two parents in each cycle to produce two offspring. The selection step uses roulette wheel selection. After selection, classical crossover or mutation can occur. The offspring that are created replace existing classifiers based on their similarity, using a crowding technique, which preserves diversity in the population and extends the preservation of dependencies between rules by replacing classifiers with similar ones.
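Since the GCS environment is essentially the CYK chart, a compact reminder of CYK recognition over a CNF rule set may be helpful; the sketch below is a generic implementation (not the GCS-internal one), with rules written as A -> a and A -> B C.

```python
def cyk_parse(tokens, terminal_rules, binary_rules, start="S"):
    """CYK recognition for a CNF grammar.
    terminal_rules: {terminal: {A, ...}} for rules A -> a
    binary_rules:   {(B, C): {A, ...}}   for rules A -> B C
    """
    n = len(tokens)
    # table[i][j] holds the non-terminals deriving tokens[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(terminal_rules.get(tok, ()))
    for span in range(2, n + 1):                 # span length
        for i in range(n - span + 1):            # start position
            for k in range(1, span):             # split point
                for B in table[i][k - 1]:
                    for C in table[i + k][span - k - 1]:
                        table[i][span - 1] |= binary_rules.get((B, C), set())
    return start in table[0][n - 1]

# toy grammar: S -> NP VP, VP -> V NP, NP -> 'noun', V -> 'verb'
terminals = {"noun": {"NP"}, "verb": {"V"}}
binaries = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
print(cyk_parse(["noun", "verb", "noun"], terminals, binaries))  # True
```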

3 The Experiments
Bianchi [2] did not try to use his system to induce a grammar for large NL corpora. However, such an experiment was performed with a pure genetic algorithm and CFG by Aycinena et al. [1]. Their system used a grammar in CNF and a CYK parser, and as corpora an extensive part of various children's books and the Brown linguistic data. The corpora were part-of-speech tagged using a Brill tagger. All English words were then removed, leaving only the tags themselves, and the number of tags was reduced to 7 categories: nouns, pronouns; verbs, helping verbs; adjectives, numerals, possessives; adverbs; prepositions, particles; conjunctions, determiners; other (foreign words, symbols, and interjections).
The corpora were divided into two parts: every third sentence was used for testing the evolved grammar, and the remaining part of each corpus for inducing the grammar. The incorrect sentences were generated randomly from a uniform distribution of lengths from 2 to 15 tags. A comparative set of experiments with GCS was performed on the above NL corpora. Ten independent experiments were run, and the evolution on each training corpus ran for 1,000 generations. The main results of the NL grammar induction with GCS are summarized in Table 1. For 5 corpora the GCS model induced a grammar of higher quality (fitness); for brown_a this value is only slightly lower, and in the remaining 3 cases the estimator's value is lower, but by no more than 5%. The values of the positive estimator are significantly higher for the GCS model in 8 cases (the differences range between 4.2% and 16.2%), while for the brown_e corpus the AKM approach obtained a result that is better by 0.5%. Undoubtedly, the weakest point of the GCS model is the comparison of the negative values: for each corpus the model obtained decidedly higher values of this estimator, with differences ranging from 1% for wizard to 17.3% for the tom corpus. This indicates that during grammar induction the GCS model in a few cases created productions that are too general in comparison with the AKM approach (for 5 corpora the differences do not exceed 7%), so that they also parse a part of the negative sentences. The last parameter that can be compared is the number of evolutionary steps (evals) in which the two approaches found their best solutions. In as many as 6 cases the GCS model did not exceed 50 steps, in the next case it did not exceed 100 steps, and the two longest inductions took only slightly over 500 steps (somewhat over an hour). The AKM approach took, in the best case, 15,500 steps, and for as many as 5 corpora 200,000 steps and, according to the authors, about 60 hours of computation. The GCS model thus proved to be incomparably more effective, being able to find, in the majority of cases, grammars with higher values of the fitness and positive estimators. The grammar evolved for the children corpus shows some interesting linguistic features. There are quite obvious groups like adjective-noun, as well as the rule noun-verb. The model also found bigrams that frequently appear in English, such as noun-adverb, noun-conjunction, verb-adverb, and verb-conjunction. A sentence can start with an article, which is why adding an article at the beginning of a sentence also preserves its correctness. The vast majority of the context-free production rules begin with the start symbol, which suggests the great generality of these rules. On the one hand this makes for an economical way of writing the entire grammar; on the other hand, such versatility also enables the parsing of sentences that do not belong to the language.

Table 1. Comparison of NL grammar induction using the genetic approach (AKM) with GCS. The corpora include a selection of children's books (denoted children, 986 correct and 986 incorrect learning sentences), The Wizard of Oz (wizard, 1540/1540), Alice in Wonderland (alice, 1012/1012), Tom Sawyer (tom, 3601/3601), and five Brown corpora: brown_a (2789/2789), brown_b (1780/1780), brown_c (1099/1099), brown_d (1062/1062), and brown_e (2511/2511). For each learning corpus, the table shows the target language and four sets of results. The first is the best fitness gained by GCS within 10 experiments and by the compared approach; the fitness describes the percentage of sentences (correct and incorrect) recognized correctly. The remaining results for the GCS model refer to the experiment in which the best fitness was obtained. The second result, positive, shows the percentage of correct examples from the training set classified correctly. The third, negative, is the percentage of negative examples classified incorrectly, and the last one indicates the number of generations needed to reach the best fitness (evals).

Corpus     fitness          positive         negative         evals
           GCS     AKM      GCS     AKM      GCS     AKM      GCS    AKM
children   93.2    93.1     98.8    91.8     12.5    5.7      9      200,000
wizard     94.6    90.2     99.3    89.5     10.2    9.2      32     200,000
alice      89.5    92.1     96.8    92.5     17.9    8.4      81     200,000
tom        86.3    92.1     98.4    92.7     25.9    8.6      3      200,000
brown_a    93.8    94.0     98.3    94.1     11.6    6.1      45     48,500
brown_b    94.6    94.0     99.3    94.7     10.2    6.7      506    200,000
brown_c    92.5    87.9     96.7    80.5     11.7    4.7      592    15,500
brown_d    91.6    91.3     97.1    88.2     13.8    5.6      18     45,000
brown_e    89.5    94.0     93.4    93.9     14.5    5.9      38     122,000

References
1. Aycinena, M., Kochenderfer, M.J., Mulford, D.C.: An evolutionary approach to natural language grammar induction. Final project for CS224N: Natural Language Processing. Stanford University (2003)
2. Bianchi, D.: Learning Grammatical Rules from Examples Using a Credit Assignment Algorithm. In: Proc. of The First Online Workshop on Soft Computing (WSC1), Nagoya (1996) 113-118
3. Cyre, W.R.: Learning Grammars with a Modified Classifier System. In: Proc. 2002 World Congress on Computational Intelligence, Honolulu, Hawaii (2002) 1366-1371
4. Gold, E.: Language identification in the limit. Information and Control 10 (1967) 447-474
5. Lanzi, P.L., Riolo, R.L.: A Roadmap to the Last Decade of Learning Classifier System Research. LNAI 1813, Springer Verlag (2000) 33-62
6. Lee, L.: Learning of Context-Free Languages: A Survey of the Literature. Report TR-12-96. Harvard University, Cambridge, Massachusetts (1996)
7. Unold, O.: Playing a toy-grammar with GCS. In Mira, J., Alvarez, J.R. (eds.) IWINAC 2005, LNCS 3562 (2005) 300-309
8. Unold, O.: Context-free grammar induction with grammar-based classifier system. Archives of Control Science, vol. 15 (LI) 4 (2005) 681-690

Text Retrieval Oriented Auto-construction of Conceptual Relationship

Yi Hu1, Ruzhan Lu1, Yuquan Chen1, and Bingzhen Pei1,2

1 Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
2 School of Computer Science and Engineering, Guizhou University, Guiyang, China
{huyi, rz-lu, yqchen, peibz}@cs.sjtu.edu.cn

Abstract. Dependence analysis is usually the key to improving the performance of text retrieval. Compared with the statistical strength of a conceptual relationship, recognizing the type of relation between concepts is more meaningful. In this paper, we explore a bootstrapping method for automatically extracting semantic patterns from a large-scale corpus to identify the geographical 'is part of' relationship between Chinese location concepts. The experiments show that the pattern set generated by our method achieves higher coverage and precision than DIPRE.
Keywords: Conceptual Relationship, Extraction Pattern, Text Retrieval.

1 Introduction
The independence assumption has been widely used in current retrieval models. Although the assumption makes the models easy to build, the independence between words in language is obviously untrue in practice. Naturally, a promising idea for information retrieval based on accurate conceptual relationships has been tried in [4]. A deeper understanding of content is closer to human cognition. Note that information retrieval based on concepts requires recognizing all valid relation types between concepts in order to build a complete conceptual graph of the content and then respond to the user's requirement.
In previous work, unsupervised bootstrapping has been used in many fields of information extraction. With respect to recognizing concept types (weapon names, terrorist organizations, etc.), the efforts in [3][5] are significant. On the other hand, extracting conceptual pairs with certain relationships from English corpora has also been tested [1][2]. This paper likewise proposes a method for constructing relationships with bootstrapping learning, taking the geographical 'is part of' relation as an example.

2 Our Contributions
Our method is implemented based on the idea of DIPRE [2], and we propose a new system called SPG (Semantic Pattern Getter). The contributions of our work are:
1. Introducing the bi-sequence alignment algorithm from bio-informatics to extract multiple common subsequences (MCS), obtaining more flexible context expressions than the single longest common subsequence (LCS) used in DIPRE.
2. Defining a new evaluation metric for pattern confidence, which improves the extraction quality.
2.1 Generating Patterns
A formalized pattern is a five-tuple, i.e., (prefix, middle, suffix, order, confidence), with the same meaning as in [2]. The system captures patterns from the contexts of given seeds, and the procedure for generating patterns in our study is as follows:
Step 1. Find all the occurrences of every seed in the large-scale corpus and record the left, middle, and right strings of each context.
Step 2. Use the bi-sequence alignment algorithm to extract patterns. Each context pair of a seed generates a candidate pattern.
Step 3. Filter the candidate patterns through the pattern validation rules.
Because there is a detailed description in [6], we do not describe the alignment algorithm in detail here. But we can see an instance generated from two contexts:
Context 1: (Company Address: No. 1515, Jiang Pu Rd., Shanghai City, China.)
Context 2: (Company Corresponding Address: No. 549, Guo Shun Rd., Shanghai City, China.)
The bi-sequence alignment algorithm extracts a pattern denoting the 'is part of' relationship between two location concepts: (<ANY_STRING>, NULL, <ANY_STRING> <ANY_STRING>, -1, <Confidence>). In order to keep patterns from being overly generalized, we define the following validation rules to choose new candidate patterns of higher quality.

Rule 1. The prefix or suffix cannot be just a < ANY_STRING >;


Rule 2. Both the most right component of prefix and the most left component of
suffix cannot be an <ANY_STRING>;
Rule 3. The prefix, middle and suffix cannot just be punctuation;
These rules are simple and easy to understand; they reflect our experience in
extracting patterns. Note that the system removes Chinese stop-words such as
"de" and "le". For a context set containing N sentences, the procedure
can generate at most C(N,2) = N(N-1)/2 candidate patterns, but many of them are
removed by the validation rules.
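To make this generation step concrete, the following sketch (our own illustration, not the system's actual code) builds candidate patterns from the contexts of one seed and filters them with Rules 1-3; Python's difflib stands in for the segment-based bi-sequence alignment of [6], and the wildcard handling is an assumption.

# Illustrative sketch only: a simplified stand-in for the SPG pattern generator.
# difflib.SequenceMatcher replaces the bi-sequence alignment of [6]; thresholds
# and wildcard conventions here are assumptions, not the paper's settings.
from difflib import SequenceMatcher
from itertools import combinations

ANY = "<ANY_STRING>"

def gapped_template(s1: str, s2: str) -> str:
    """Collapse two context strings into one template: shared segments are kept,
    each differing stretch is replaced by a single <ANY_STRING> wildcard."""
    parts, last1, last2 = [], 0, 0
    for a, b, size in SequenceMatcher(None, s1, s2).get_matching_blocks():
        if a > last1 or b > last2:          # a stretch present in only one string
            parts.append(ANY)
        if size:
            parts.append(s1[a:a + size])
        last1, last2 = a + size, b + size
    return "".join(parts) or ANY

def is_valid(prefix: str, middle: str, suffix: str) -> bool:
    """Rules 1-3 from the paper, stated over the template strings."""
    if prefix == ANY or suffix == ANY:                    # Rule 1
        return False
    if prefix.endswith(ANY) or suffix.startswith(ANY):    # Rule 2
        return False
    for part in (prefix, middle, suffix):                 # Rule 3
        if part and all(not ch.isalnum() for ch in part):
            return False
    return True

def candidate_patterns(contexts):
    """contexts: list of (left, middle, right) strings around one seed pair.
    Every pair of contexts yields at most one candidate, hence at most C(N,2)."""
    patterns = set()
    for (l1, m1, r1), (l2, m2, r2) in combinations(contexts, 2):
        prefix, middle, suffix = (gapped_template(l1, l2),
                                  gapped_template(m1, m2),
                                  gapped_template(r1, r2))
        if is_valid(prefix, middle, suffix):
            patterns.add((prefix, middle, suffix))
    return patterns

For a context set with N occurrences of a seed, candidate_patterns inspects every pair of contexts once, matching the C(N,2) bound mentioned above.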


2.2 Pattern Confidence
How the pattern confidence is evaluated is the key factor affecting the coverage and
precision of the final pattern set. We define a new confidence measure in (1).

Conf_RlogF(P) = [Ppositive / (Ppositive + Pnegative)] · log2(Pnew + 1)    (1)


Here P denotes the evaluated pattern, Ppositive denotes the number of correct
pairs among all pairs extracted in the current iteration, Pnegative denotes the number of
wrong pairs, and Pnew is the number of newly extracted pairs. This confidence
combines a precision ingredient, Ppositive / (Ppositive + Pnegative), with a recall
ingredient, log2(Pnew + 1), of the pattern P. The larger Pnew and the proportion of
correct pairs among the new ones are, the higher the pattern confidence. If no new
pairs are extracted, the confidence falls back to its initial definition in [7].
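As a minimal worked example of (1) (our own sketch; the counts below are hypothetical, not taken from the experiments):

# Minimal sketch of the confidence in Eq. (1); the function name and the example
# counts are ours, not values from the paper's experiments.
import math

def conf_rlogf(p_positive: int, p_negative: int, p_new: int) -> float:
    """Precision ingredient times the recall ingredient log2(Pnew + 1)."""
    total = p_positive + p_negative
    if total == 0:
        return 0.0
    precision = p_positive / total
    return precision * math.log2(p_new + 1)

# Hypothetical pattern: 9 correct pairs, 1 wrong pair, 3 newly extracted pairs.
print(round(conf_rlogf(9, 1, 3), 3))   # 0.9 * log2(4) = 1.8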

3 Experiments
In this section we report the experimental results of SPG and compare them with those
of DIPRE. We use sub-collections #29 to #39 of the CWT100g (Chinese Web Test
collection with 100 GB of web pages) corpus for training and sub-collections #60 to
#63 for testing. The following pairs are used as initial seeds for the "is part of"
conceptual relationship.
Table 1. Initial Chinese seed list (part, whole)

(Chang Chun, Ji Lin)
(Hei Longjiang, China)
(Hai Dian, Bei Jing)
(Xi An, Shan Xi)
(Jing An, Shang Hai)

The results of experiments by running three iterations can be seen in Table 2.


Table 2. Pairs and patterns in every iteration (columns: iteration; SPG tuples and patterns; DIPRE tuples and patterns — cell values in the original layout: 28, 21, 57, 51, 31, 33, 171, 169, 20, 25)

In the first run, both systems use the same five seeds as initial input, and the
last two runs use the seeds extracted from the previous iteration. The pattern columns
show the number of valid patterns. The results show that SPG extracts
more conceptual pairs than DIPRE from the same sub-collections.
We then consider the ability of SPG to construct the conceptual
relationship between concepts. We apply all the patterns obtained by each system to the
test collection. Because the test collection is fixed, the system that extracts more
correct pairs has the higher recall; the precisions are easily evaluated manually.


SPG has better coverage than DIPRE because it obtains more conceptual
pairs. The precisions also show that SPG does better than DIPRE:
the precision of SPG stays around 90% over the three iterations with only a slight
drop, while DIPRE's precision falls very quickly, reaching 50% in its third iteration.
On the test corpus, the SPG system obtains 1,504 pairs in total, of which
1,358 are correct (precision = 90.29%); DIPRE obtains 588 pairs, of which 384
are correct (precision = 60.20%). The SPG patterns thus perform better.

4 Conclusion
In this paper, we propose a system named SPG for determining whether two concepts
satisfy the "is part of" relation in a given context. We introduce a bi-sequence
alignment algorithm from bio-informatics to capture clearer and more understandable
patterns, and we define a new confidence evaluation method for patterns. After
training, the experimental results show that SPG performs better
than DIPRE in terms of both coverage and precision.
As mentioned above, our study aims at serving a concept-based retrieval model.
The pattern set should therefore support determining the
relationship between two concepts in context; it should be applicable to
many web pages, and its precision should be guaranteed. In the long term,
recognizing accurate relationships will definitely bring IR large benefits, but with
the performance of the current system this remains a hard job. The key points of our
future work lie in developing an advanced SPG system and extending the method proposed
in this paper to the recognition of other relationships.
Acknowledgments. This work is supported by NSFC Major Research Program
60496326: Basic Theory and Core Techniques of Non Canonical Knowledge.

References
1. Agichtein, E., and Gravano, S.: Snowball: Extracting relations from large plain-text
collections. In Proc. of the 5th ACM International Conference on Digital Libraries (2000)
2. Brin, S.: Extracting patterns and relations from the World Wide Web. In Proc. of the 1998
International Workshop on the Web and Databases (1998)
3. Etzioni O., Cafarella M., Downey D., Popescu A.M. et al.: Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison. In Proc.
of the AAAI Conference (2004)
4. Genest, D. and Chenin, M. A Content-Search Information Retrieval Process Based on
Conceptual Graphs. Knowledge and Information Systems Journal. (2005) Vol. 8, 292-309
5. Lin W., Yangarber R., Grishman R.: Bootstrapped Learning of Semantic Classes from
Positive and Negative Examples. In Proc. of the ICML-2003 Workshop on the Continuum
from Labeled to Unlabeled Data, Washington DC (2003)
6. Michael S., B. Morgenstern, and J. Stoye. Divide-and-conquer multiple alignment with
segment-based constraints. Bioinformatics. (2003) 19(2): 189-195
7. Thelen M. and Riloff E.: A Bootstrapping Method for Learning Semantic Lexicon using
Extraction Pattern Contexts. In Proc. of the 2002 Conference on Empirical Methods in
Natural Language Processing (2002)

Filtering Methods for Feature Selection in Web-Document Clustering
Heum Park and Hyuk-Chul Kwon
AI Lab. Dept. of Computer Science, Pusan National University, Busan, Korea
parkheum2@empal.com, hckwon@pusan.ac.kr

Abstract. This paper presents the results of a comparative study of filtering
methods for feature selection in web-document clustering. First, we focused on
feature selection methods based on Mutual Information (MI) and Information
Gain (IG). With those features and feature values, and using MI and IG, we
extracted from documents representative max-value features as well as a
representative cluster for each feature and a representative cluster for each document.
Second, we tested the Max Feature Selection Method (MFSM) with those
representative features and clusters, and evaluated the web-document clustering
performance. However, when document sets yield poor clustering results by
term frequency, we cannot obtain good features using the MFSM with the MI
and IG values. Therefore, we propose new filtering methods, Min Count of
Representative Cluster for a Feature (MCRCF) and Min Count of
Representative Cluster for a Document (MCRCD). In the experiments,
the MFSM showed better performance than was achieved using only term
frequency, MI and IG, and when we applied the new filtering methods for
feature selection (MCRCF, MCRCD), the clustering performance improved
notably. Thus we can assert that these filtering methods are effective means of
feature selection and offer good performance in web-document clustering.
Keywords: feature selection, feature filtering, document clustering.

1 Introduction
Among feature selection methods in web-document clustering, χ2, Mutual
Information (MI), and Information Gain (IG) are widely used [1][2]. However, we
cannot rely on such methods to provide consistently good performance, because web
documents have either a small number of terms or many kinds of terms with high term
frequencies. A small number of terms can be handled by increasing
the weights of feature values, but for documents containing many kinds of terms and
high term frequencies, the document vector space has to be reduced using filtering
methods. Filtering methods are general preprocessing algorithms that do not rely on
any knowledge of the clustering algorithm [3][4][5]. Therefore, we propose new feature filtering
methods that can be applied independently of the clustering application program.
In the present study, we first extracted features and feature values using MI and
IG. With those features and values, we extracted the documents' representative
max-value features as well as a representative cluster for each feature and a
representative cluster for each document. Second, we used the Max Feature Selection
Method (MFSM), which selects features using the max feature values and the
representative cluster number of a feature. We then proposed the Min Count of
Representative Cluster for a Feature (MCRCF) and Min Count of
Representative Cluster for a Document (MCRCD) methods for filtering features.

2 Preliminaries
We used three corpora, the Natural Science directory services of www.empas.com,
www.yahoo.co.kr, and www.naver.com, three well-known Korean portal sites, as
experimental datasets for feature selection and clustering. These documents are
manually classified by indexers, so we could easily evaluate the clustering
performance by comparing the pre-allocated directory with the clustering results.
We selected 9 subdirectories within the directory services; the document sets
contained 1,036, 964 and 1,093 documents respectively, and the term vector
space was a matrix of tf or tf*idf values. After bisecting each document set, we
used one half for feature selection and clustered the other half by applying the
selected features. For clustering and analysis of clustering performance, we used the
Cluto 2.1 clustering toolkit developed by the University of Minnesota [3]. We
compared the clustering results using entropy- and purity-averaged measures.
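For reference, the entropy- and purity-averaged scores used below can be computed as in the following sketch of the standard definitions (our own illustration; it is not code from the Cluto toolkit).

# Sketch of the weighted entropy and purity scores for a clustering result,
# as reported by tools such as Cluto; our own illustration of the definitions.
import math
from collections import Counter

def entropy_purity(assignments, labels):
    """assignments[i]: cluster id of document i; labels[i]: its true class."""
    clusters = {}
    for c, y in zip(assignments, labels):
        clusters.setdefault(c, []).append(y)
    n, q = len(labels), len(set(labels))
    ent = pur = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        ent += (size / n) * (h / math.log2(q) if q > 1 else 0.0)
        pur += (size / n) * (max(counts.values()) / size)
    return ent, pur

# Toy example with 2 classes and 2 clusters.
print(entropy_purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "a"]))

Lower entropy and higher purity indicate better agreement with the manually assigned directories.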
For feature selection, we first calculated the feature values using MI and IG and,
with those features and feature values, allocated each term to only the one category
c that had the maximum score, i.e. MImax = max_i{MI(t, ci)} and
(IGh)max = max_i{IG(t, ci)} (the latter only if avg(IG(t, ci)) exceeds a threshold), and
applied those values in the document matrix. Second, from the feature values FVc(t),
we obtained representative clusters for a feature and representative clusters for a
document. The representative cluster number Cf(t) = max_{i=1..k}{FVi(t)} of a term t
is the number of the cluster with the maximum feature value among all k clusters
(i is the cluster number). Third, we obtained the representative cluster number
Cf(d) = max_{i=1..k}{Counti(Ci(t1:tn))} for each document using Cf(t), where
Counti(Ci(t1:tn)) is the number of terms t1...tn that have Cf(t) in the i-th cluster
for that document.
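The bookkeeping for Cf(t) and Cf(d) can be summarized as follows (our own sketch; the dictionary layout and the sample feature values are hypothetical).

# Our own sketch of the representative-cluster bookkeeping described above:
# Cf(t) is the cluster where term t takes its maximum feature value, and
# Cf(d) is the cluster that most of a document's terms point to via Cf(t).
from collections import Counter

def representative_cluster_of_term(fv_t):
    """fv_t: dict {cluster_id: FV_i(t)} for one term t."""
    return max(fv_t, key=fv_t.get)

def representative_cluster_of_document(doc_terms, fv):
    """doc_terms: terms t1..tn of a document; fv: {term: {cluster_id: FV_i(t)}}."""
    votes = Counter(representative_cluster_of_term(fv[t])
                    for t in doc_terms if t in fv)
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical feature values (e.g. MI or IG) for two terms over three clusters.
fv = {"stock": {0: 0.1, 1: 0.7, 2: 0.2}, "market": {0: 0.0, 1: 0.4, 2: 0.3}}
print(representative_cluster_of_term(fv["stock"]))                   # 1
print(representative_cluster_of_document(["stock", "market"], fv))   # 1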

3 Filtering Methods for Feature Selection


First, we used the MFSM to filter features. We calculated the max feature value
FVi(t)max of each term in the i-th cluster using MI and IG and, with FVi(t)max,
determined the representative cluster number Cf(t) of a term t. We then selected
features in the representative cluster. In expression (1), count(Ci(t)) is the number
of clusters in which term t is counted. Second, with MCRCF, we selected features
using FVi(t)max, Cf(t), and thresholds: if the feature value FVi(t) of term t in the
i-th cluster is greater than a value threshold, and the count of Ci(t) of term t over
all clusters is greater than a count threshold (set to 50%, 40%, or 30% of the total
number of clusters), the feature is removed from the document matrix. For example,
when term t has a feature value FVi(t)max greater than the value threshold (here 0)
in cluster 1 and cluster 3, count(Ci(t)) is 2, and if the count threshold is 3, term t
is selected. Third, with MCRCD, we calculated FVi(t)max and Ci(t) of a term t in the
i-th cluster, as with MCRCF, and obtained the representative cluster number Cf(d) of a
document as in Section 2. To apply the MCRCD method, we require the sum of the feature
values of term t over all clusters, a weight constant W (2, 3, or 4), and count
thresholds (50%, 40%, or 30%). The filtering process includes the following 5 steps;
an illustrative sketch is given after expressions (1)-(3) below.
Step 1: Calculate the feature value FV(t) of term t and FVi(t)max in the i-th cluster.
Step 2: Obtain Ci(t) and Ci(d) using FVi(t)max in the i-th cluster.
Step 3: Find the features Fi whose feature value FV(Cf(d)) is greater than
(1/W)·ΣFVi(Ci(d)) and greater than the value threshold, as in expression (2),
where W is the weight constant. Repeat Step 3 for all terms of a document.
Step 4: Among the features Fi obtained in Step 3, if the count of occurrences Ci(Fi)
of feature Fi over all clusters is greater than the count threshold, as in
expression (3), that feature is removed. Repeat Steps 2-3 for all documents.
Step 5: Apply the remaining features and feature values to the document matrix, and cluster.
Σi=1..k count(Ci(t)) < Threshold (T)                                        (1)

(1/W) · Σi=1..k FVi(Ci(d)) < FV(Cf(d)),  and  FV(Cf(d)) > the value threshold    (2)

Σi=1..k count(Ci(Fi)) < Threshold (T)                                       (3)
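The following sketch illustrates the count-based filter of expression (1) (our own reading of the procedure; the parameter names fv_min and t_max_clusters and the toy values are assumptions, not the tuned settings of Section 4).

# Our own sketch of the count-based filtering implied by expression (1): a term
# survives only if it is "active" in fewer than t_max_clusters of the k clusters.
import math

def count_filter(fv, k, fv_min=0.0, t_max_clusters=None):
    """fv: {term: {cluster_id: FV_i(t)}}; k: number of clusters."""
    if t_max_clusters is None:
        t_max_clusters = math.ceil(0.5 * k)      # e.g. 50% of the clusters
    kept = []
    for term, per_cluster in fv.items():
        # count(Ci(t)): clusters where the term's feature value exceeds fv_min
        active = sum(1 for v in per_cluster.values() if v > fv_min)
        if active < t_max_clusters:              # cf. expression (1)
            kept.append(term)
    return kept

fv = {"the":  {0: 0.9, 1: 0.8, 2: 0.7},   # strong everywhere: filtered out
      "goal": {0: 0.6, 1: 0.0, 2: 0.0}}   # representative of one cluster: kept
print(count_filter(fv, k=3))               # ['goal']

Terms that are representative in many clusters carry little discriminative information, which is why the filter removes them from the document matrix before clustering.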

Fig. 1. Entropies and Purities averaged using TF, MI, IG, and Filtering Methods MFSM with
MI and IG, MCRCF, and MCRCD. The clustering number is 30.

4 Experimental Evaluations
In these experiments, we clustered the document matrix by applying the MFSM with
the MI and IG values. As shown in Fig. 1, the entropies and purities of Empas(1) are
the results of clustering the document set Empas(1) with features selected from
Empas(2). For the Empas and Yahoo document sets, the clustering results
of the MFSM with the MI and IG values (MFSM-MI, MFSM-IG) were better than
those of TF, MI and IG. For the Naver document sets, however, the clustering results
of MI and IG were very poor: because the results by TF already had poor entropy and
purity (0.327, 0.731), we could not obtain good features using the MFSM with the MI
and IG values. When we used the MCRCF and MCRCD methods, the clustering results were
better than those of the MFSM with MI and IG values in all entropies and purities,
not only for the Empas and Yahoo document sets but also for the Naver document sets.
The entropies and purities of the clustering results using MCRCF and MCRCD improved
notably.

5 Conclusion
This paper presents the results of a comparative study on feature filtering for feature
selection in web-document clustering. We applied MI and IG as feature selection methods,
and then the MFSM. For some document sets, the results were better than those
obtained using term frequency (TF). But where the results by TF yielded poor
entropies and purities, we could not obtain good features using MI and IG, or the
MFSM either. We were therefore obliged to find new methods for selecting good
features and achieving good performance. When we applied MCRCF and MCRCD,
we obtained much better performance than with the MFSM using MI and IG;
most notably, in the case of the Yahoo document sets, there was extraordinarily good
performance using MCRCF and MCRCD. Therefore, we can confirm that these
feature-filtering methods offer enhanced clustering performance as well as an
effective means of selective filtering.
Acknowledgments. This work was supported by the Korea Research Foundation
Grant funded by the Korean Government (MOEHRD) (The Regional Research
Universities Program/Research Center for Logistics Information Technology).

References
1. Yiming Yang and Jan O. Pedersen: A comparative study on feature selection in text
categorization. Proceedings of ICML-97, 14th International Conference on Machine
Learning (1997)
2. Gang Wang, Frederick H. Lochovsky: Feature selection with conditional mutual
information maximin in text categorization. Proceedings of the Thirteenth ACM
International Conference (2004) 342-349
3. Ying Zhao and George Karypis: Criterion Functions for Document Clustering: Experiments
and Analysis. Technical Report TR #01-40, Department of Computer Science (2002)
4. Sanmay Das: Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection.
Proceedings of the Eighteenth International Conference on Machine Learning (2001) 74-81
5. Huang Yuan, Shian-Shyong Tseng, Wu Gangshan, Zhang Fuyan: A two-phase feature
selection method using both filter and wrapper. IEEE SMC '99 Conference Proceedings,
1999 IEEE International Conference (1999) vol. 2, 132-136
6. Baranidharan Raman, Thomas R. Ioerger: Instance Based Filter for Feature Selection.
Journal of Machine Learning Research 1 (2002) 1-23

A Korean Part-of-Speech Tagging System Using Resolution Rules for Individual Ambiguous Word

Young-Min Ahn1, Seung-Eun Shin2, Hee-Geun Park1, Hyungsuk Ji3, and Young-Hoon Seo1
1 School of Electrical & Computer Engineering, Chungbuk National University, Korea
nlpmania@paran.com, pinetree@nlp.chungbuk.ac.kr, yhseo@chungbuk.ac.kr
2 BK21 Chungbuk Information Technology Center, Chungbuk National University, Korea
seshin@nlp.chungbuk.ac.kr
3 School of Information & Communication Engineering, Sungkyunkwan University, Korea
jihyungsuk@skku.edu

Abstract. In this paper we present a Korean part-of-speech tagging system
using resolution rules for individual ambiguous words. Our system resolves
lexical ambiguities by common rules, rules for individual ambiguous words, and
a statistical approach. We built resolution rules for each word that has several
distinct morphological analysis results, with a view to enhancing tagging
accuracy. A statistical approach based on the Hidden Markov Model (HMM) is
applied for ambiguous words that are not resolved by the rules. The experiment
on the test set shows that the part-of-speech tagging system has high accuracy
and broad coverage.
Keywords: Part-of-Speech, Tagging, Resolution Rules.

1 Introduction
A morphological analyzer is a basic module for most natural language processing
systems such as natural language understanding, information retrieval, and machine
translation. It analyzes all possible combinations of morphemes
for a given word and generates one or more analysis results for that word. In general,
about 30% of Korean words have two or more analysis results, so resolving these
ambiguities is vital to building a reliable natural language processing system.
Part-of-speech tagging is the module that resolves such morphological ambiguities.
Conventional part-of-speech tagging approaches are classified into the statistical
approach [1], the rule-based approach [2], and the hybrid approach [3].
The Hidden Markov Model (HMM) has been the mainstay of the statistical model for
part-of-speech tagging: it selects the sequence of part-of-speech tags with the highest
probability from the words or morphemes around an ambiguous word. The rule-based
approach resolves ambiguities by predefined rules. Its major advantage is high
accuracy, but unlike the statistical approach it has narrow coverage, so a pure
rule-based approach is scarcely used in resolving morphological ambiguities. The hybrid
approach uses both rules and statistical information with a view to achieving high
accuracy and broad coverage, and most Korean part-of-speech tagging systems adopt it.
This paper presents a part-of-speech tagging system for Korean. The system uses
resolution rules for individual ambiguous words to achieve high accuracy. Each rule has
syntactic and semantic information for an ambiguous word and the words around it. Our
system applies common rules to the test set, followed by resolution rules and statistical
tagging.

2 Building Information for Disambiguation


2.1 Common-Rules

Common-rules are rules for idioms and phrases in common use such as
"halq swu issta" (can), "halq swu epsta" (cannot), and
"haci anh.ul swu epsta" (must). In the first stage, the common-rules are applied to
resolve the ambiguities of words. For example, the tag of "swu" in the above expressions
is determined to be an incomplete noun.

2.2 Resolution Rules for Individual Ambiguous Word


In this study, an ambiguous word is a word that has two or more distinct
morphological analysis results. Ambiguous words were collected and the rules for each
ambiguous word were built from a large corpus. 330 ambiguous words, which amount to
50% of the total ambiguous words, were chosen. Morphemes, part-of-speech tags, word
senses, positions, and/or other information about each ambiguous word and the words
around it were encoded into rules to resolve the ambiguous word. Some rules are shown
in Fig. 1.
Fig. 1. An example of resolution rules for an individual ambiguous word. Each rule maps a condition on the surrounding morphemes and tags to the tag sequence assigned to the ambiguous word (marked @), with a default entry for the remaining cases. Tag abbreviations: e_ = ending, etm = adnominal ending, pa = adjective, pv = verb, mm = indeclinable adjective, ep = prefinal ending, ec = conjunctive ending, np = pronoun, nc = common noun, ef = final ending, jx = auxiliary particle.


2.3 Statistical Information


Information on tag sequences and on the words themselves was extracted from the tagged
corpus, in units of three words. Table 1 shows an example of the statistical
information extracted in this manner.
Table 1. Example of statistical information

Information about a word, "yitnuen":
previous tag sequence | next tag sequence | frequency
nc+jc                 | nb+co+ef          | 20
nc+jc                 | nb+co+ef+jx       | 4
nc+jc                 | nb+co+ep+ef       | 2
nc+jc                 | nb+jc             | 17
nc+jc                 | nb+jx             | 11
nc+jc                 | nc                | 33

Information about a tag sequence, nc+co+ec:
previous tag sequence | next tag sequence | frequency
mm                    | ,                 | 6
mm                    | ?                 | 1
mm                    | EOS               | 1
mm                    | mag               | 8
mm                    | nc                | 1

3 Experiment
For evaluation, sentences from the ETRI corpus and the Sejong corpus were used.
Table 2 shows the information of the test corpus; the rate of ambiguity in
the table is the average number of morphological analysis results of the ambiguous
words. First, we apply the common rules to an ambiguous word. Rules for individual
ambiguous words are applied subsequently when the ambiguity is not resolved; the
first matching rule is applied when the conditions of two or more
rules are satisfied. Statistical information is applied last, when the ambiguity is
still not resolved. Table 3 shows the results of tagging with the rules only, with
statistical information only, and with both.
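The cascade just described can be pictured as follows (a minimal control-flow sketch only; the rule and frequency data structures and the sample entries are our own assumptions, not the system's actual formats).

# Minimal sketch of the described cascade: common rules, then word-specific
# resolution rules (first match wins), then an HMM-style frequency fallback.
def disambiguate(word, left_tags, right_tags,
                 common_rules, word_rules, tag_stats):
    for rule in common_rules:                         # stage 1: common rules
        tag = rule(word, left_tags, right_tags)
        if tag is not None:
            return tag
    for cond, tag in word_rules.get(word, []):        # stage 2: per-word rules,
        if cond(left_tags, right_tags):               # first matching rule wins
            return tag
    candidates = tag_stats.get(word, {})              # stage 3: statistics
    best, best_freq = None, -1
    for tag, contexts in candidates.items():
        freq = contexts.get((left_tags, right_tags), 0)
        if freq > best_freq:
            best, best_freq = tag, freq
    return best

# Toy data: the ambiguous word "X" is tagged "nc" in an nc+jc / nb+jx context.
tag_stats = {"X": {"nc":    {(("nc+jc",), ("nb+jx",)): 11},
                   "pv+ec": {(("nc+jc",), ("nb+jx",)): 2}}}
print(disambiguate("X", ("nc+jc",), ("nb+jx",), [], {}, tag_stats))  # nc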
In Table 3, the average rate of correctness with rules only is 88.68% and the
resolution accuracy of the rules is almost 100%; the average rate of correctness with
statistical information is 93.82%. These results are comparable to the best performance
of conventional Korean statistical part-of-speech tagging systems.
Table 2. The information of the corpora for evaluation

Corpus        | Total number of words | Number of ambiguous words | Rate of ambiguity
ETRI corpus   | 12357                 | 4836                      | 2.66
Sejong corpus | 4911                  | 1761                      | 2.74
Total         | 17628                 | 6597                      | 2.70

Table 3 also shows that the average rate of correctness of our system, which uses both
rules and statistical information, is higher than that of the variants using only rules
or only statistical information. Furthermore, the accuracy of our system is 96.86%,
which is higher than that of other conventional hybrid tagging systems, whose
accuracies vary from 93% to 96%. Our system does not resolve the ambiguities correctly
when two or more consecutive words are all ambiguous.
Table 3. The correctness of each tagging

Corpus        | Tagging with rules | Tagging with statistical information | Tagging with rules and statistical information
ETRI corpus   | 89.67%             | 95.14%                               | 97.74%
Sejong corpus | 87.69%             | 92.50%                               | 95.98%
Total         | 88.68%             | 93.82%                               | 96.86%

4 Conclusion
In this paper, we described a hybrid Korean part-of-speech tagging system using rules
and statistical information. About 50% of the ambiguities are resolved by the rules, and
their accuracy is almost 100%. Statistical tagging based on HMM is applied to the words
that are not resolved by the rules. The correctness of our system was 95.98%, while
that of statistical tagging alone was 93.82%. The system fails to resolve the
ambiguities correctly when consecutive words are all ambiguous.
Acknowledgments. This research was supported by the Ministry of Information and
Communication, Korea under the ITRC, IITA-2006-(C1090-0603-0046).

References
1. Merialdo, B.: Tagging English Text with a Probabilistic Model. Computational Linguistics
20(2) (1994) 155-171
2. Brill, E.: Transformation-Based Error-Driven Learning and Natural Language Processing: A
Case Study in Part-of-Speech Tagging. Computational Linguistics 21(4) (1995) 543-564
3. Tapanainen, P., Voutilainen, A.: Tagging accurately - Don't guess if you know. In:
Proceedings of the 7th Conference of the European Chapter of the Association for
Computational Linguistics (1994) 149-156

An Interactive User Interface for Text Display

Hyungsuk Ji and Hyunseung Choo
School of Information and Communication Engineering, Sungkyunkwan University, Korea
jihyungsuk@skku.edu, choo@ece.skku.ac.kr

Abstract. Studies on the effect of text width on readability have encouraged the use of fixed text-width web/electronic text design. The
drawback of this type of design is the loss of users' interactivity with
regard to text modification. In this paper, we investigate the web design
of the world's top 100 websites and present an alternative interactive
user interface for text display.
Keywords: Readability, Web design, line length, text presentation.

Introduction

A great number of texts are read nowadays on computer screens. With the
increase in the amount of web-based texts, it is becoming increasingly important
to develop a convenient user interface that can enhance the efficiency and reduce
the fatigue when reading via the web [1]. Readability, among others, is one of the
most important factors that affect efficiency and work fatigue, and it is closely
related to the text width or line length of the text.
The research on readability found that the largest acceptable text width for
printed materials is around 90 mm, with some varying results. With regard to the
computer screen environment, the average text width preferred by the subjects
on the computer screen was found to be around 100 mm [2,3]. An experiment
measuring the number of characters showed similar results of 55 characters per
line (cpl) [2] and 55-70 cpl [3]. Based on these studies, web designs with a specific
fixed text width were recommended [3,4]. Influenced by these kinds of studies,
a myriad of web pages are now designed to meet conditions for maximum readability,
and the majority of them have a fixed text-width format.
While general consensus was found with regard to the best readability and
line length in the above experiments, there have always been some inconsistent
results. First, similar readability was found irrespective of the line length
as long as the line length did not exceed a certain limit [2]. Second, the optimum
line length that corresponds to optimum readability differed depending on
researchers and experiments [2,3].
In this study, we propose an alternative user interface that restores the interactive
text-width-selecting feature without going back to the first-generation
web document style.

Web Design on Text Display

The following web design history is a classification of the layout, or the controllability of the text width, in web pages. It is based on the period in which
each characteristic feature was used effectively, not on the point at which the
feature was first introduced.¹
First Generation. The first-generation web design was characterized by its
simplistic layout. The text width was adjustable freely by dragging the borders
of windows with the mouse or by clicking the restore/maximize button.
Second Generation. With the adoption of table tags, partial restrictions
were imposed on users' ability to adjust the text width.
Third Generation. The frame was introduced: the menu and the main content
were separated. Adjusting the width of the menu and the main content was
possible, and users had the ability to hide menus by minimizing their width.
Fourth Generation. With the practical use of Cascading Style Sheets (CSS),
web pages became more flexible from a designer's point of view and more rigid
from a user's point of view. Frames disappeared and it was no longer possible
for users to remove or hide menus.
In the first generation, the adjustability of the text width provided web pages
with the most interactive and free text format as far as text-width modification was concerned. This feature was so typical that even today almost all
word processors call the dynamic text-width layout "web page layout". However,
influenced by the studies on readability and line length and by other design requirements, most web pages today have a fixed text width, and it is impossible for
users to adjust it.
To quantify these trends, we investigated the world's top 100 websites as
chosen by Alexa. The results showed that only 11 web pages have a dynamic
text-width format and about 90% of the world's top 100 websites use fixed text-width
web pages.² Considering the still permitted personalizable options such as font
style, font size, color and global window size of web pages, it is quite surprising
that the most popular web designs do not allow the slightest flexibility with
respect to the text width. This rigidity of web page design has resulted in losing
one of the most fundamental features of web pages and reducing users' active
interaction ability with the computer.
¹ CSS was first proposed in November 1995, which was virtually in the same period
as the previous two (Table was first proposed in March 1995 and Frame in September 1995).
² The 11 websites are: *rkut.com, *ikipedia.org, *mazon, *otolog.com, *mdb.com,
*apidshare.de, *andex.ru, *c2.com, *ourceforge.net, *ama.ru, *igg.com (the first
letter is masked for each item).

Proposed Interface

For the 11 websites where dynamic text-width modification is allowed, resizing the
global window's (the web browser's outer window's) width is required to get the
desired text width. This results in two difficulties. First, in narrowing the global
window's width, users lose access to the far-right menus/tabs, unless the menus/tabs
are detached from the web browser. Second, after narrowing the text width, restoration
is necessary to view other web pages. Since web page reading involves frequent
changes from one website to another, this alternation is by no means convenient.
Further, some websites have no limit on line length, which obliges users to resize
their browser windows for better reading.

Fig. 1. The text width can be adjusted easily with a single mouse click
It is still important to allow users to personalize the web text width. The results
of line-length related studies show that the optimal line length is not identical for
every user, and considerable differences exist depending on the particular study
group [3,5]. Thus, providing users with interactive text-width modification functionality is important in order to give them more personalized control
over the text format. In particular, such an interface would be of benefit when
conducting a specific textual study like analyzing multitext. Multitext analysis
requires both normal reading and comparing two or more adjacent lines, and two
types of text width are necessary for the fulfilment of the task (unlike normal
reading, it is preferable to have a greater line length for line-by-line comparison).
We propose a user interface that ensures users' interactive control in adjusting
the text format and provides a novel scrolling functionality for reading documents.


Special buttons were placed at the top right of the documents so that users can adjust
the content's width with a simple single mouse click (Fig. 1). The
buttons were designed without the use of mouse dragging, since the latter requires
non-trivial concentration and attention. We propose (1) two different text-width
formats independent of the global window's size and (2) a dynamic format, whose
width depends on the global window's size.
The texts in Fig. 1 are parallel verses of the Bible, which were constructed
automatically using the multitext organizing model program that we developed
in parallel with this study. In these multitext documents, the verses having the
same verse number are grouped together, and the arrow buttons in Fig. 1 allow
users to bring the head of these groups to the top of the window, ensuring the
constant positioning of the aligned elements.
Hornbæk and Frøkjær reported that users prefer a fisheye interface, where
unimportant texts are reduced in size, or presented as unreadable, to a linear
interface, where no such distortions are implemented [6]. One drawback of such
a reduced-size form is that the amount of unimportant text cannot be known. Our
proposed scrolling method can be useful in such a case, since it makes it possible
both to keep the full readable length of unimportant text and to skip quickly over a
portion of text. An extra functionality was implemented that enables users to select
or remove certain portions of text or the left-side menu (the third and the fourth
figure in Fig. 1). These features were developed using JavaScript and CSS and can be
readily implemented in existing web documents.
Acknowledgments. This research was supported by the Ministry of Information and Communication, Korea under the ITRC, IITA-2006-(C1090-0603-0046).

References
1. Corry, M.D., Frick, T.W., Hansen, L.: User-centered design and usability testing
of a web site: An illustrative case study. Educational Technology Research and
Development 45(4) (1997) 65-76
2. Dyson, M.C., Haselgrove, M.: The influence of reading speed and line length on
the effectiveness of reading from screen. International Journal of Human-Computer
Studies 54(4) (2001) 585-612
3. Ling, J., van Schaik, P.: The influence of font type and line length on visual search
and information retrieval in web pages. International Journal of Human-Computer
Studies 64(5) (2006) 395-404
4. Davidov, A.: Computer screens are not like paper: typography on the web. In
Sassoon, R., ed.: Computers and Typography. Volume 2. Intellect Books, Bristol,
UK (2002) 21-40
5. Mills, C.B., Weldon, L.J.: Reading text from computer screens. ACM Comput.
Surv. 19(4) (1987) 329-357
6. Hornbæk, K., Frøkjær, E.: Reading patterns and usability in visualizations of electronic documents. ACM Trans. Comput.-Hum. Interact. 10(2) (2003) 119-149

Author Index

Abánades, Miguel A. II-227
Abbate, Giannandrea I-842
Abdullaev, Sarvar R. IV-729
Abdullah, M. I-446
Acevedo, Liesner I-152
Adam, J.A. I-70
Adriaans, Pieter III-191, III-216
Adrianto, Indra I-1130
Agarwal, Pankaj K. I-988
Ahn, Chan-Min II-515
Ahn, Jung-Ho IV-546
Ahn, Sangho IV-360
Ahn, Sukyoung I-660
Ahn, Woo Hyun IV-941
Ahn, Young-Min II-1202, II-1222
Ai, Hongqi II-327
Al-Sammane, Ghiath II-263
Alexandrov, Vassil I-747, II-744,
II-768, II-792
Alfonsi, Giancarlo I-9
Alidaee, Bahram IV-194
Aliprantis, D. I-1074
Allen, Gabrielle I-1034
Alper, Pinar II-712
Altintas, Ilkay III-182

Álvarez, Eduardo J. II-138
An, Dongun III-18
An, Sunshin IV-869
Anthes, Christoph II-752, II-776
Araz, Ozlem Uzun IV-973
Archip, Neculai I-980
Arin, B. II-335
Aristov, V.V. I-850
Arslanbekov, Robert I-850, I-858
Arteconi, Leonardo I-358
Aslan, Burak Galip III-607
Assous, Franck IV-235
Atanassov, E. I-739
Avolio, Maria Vittoria I-866
Awan, Asad I-1205
Babik, Marian III-265
Babuska, I. I-972
Bacão, Fernando II-542

Bae, Guntae IV-417


Bae, Ihn-Han IV-558
Baek, Myung-Sun IV-562
Baek, Nakhoon II-122
Bai, Yin III-1008
Bai, Zhaojun I-521
Baik, Doo-Kwon II-720
Bajaj, C. I-972
Balas, Lale I-1, I-38
Balis, Bartosz I-390
Balogh, Zoltan III-265
Baloian, Nelson II-799
Bang, Young-Cheol III-432
Bao, Yejing III-933
Barabási, Albert-László I-1090
Barabasz, B. I-342
Barrientos, Ricardo I-229
Barros, Ricardo III-253
Baruah, Pallav K. I-603
Bashir, Omar I-1010
Bass, J. I-972
Bastiaans, R.J.M. I-947
Baumgardner, John II-386
Baytelman, Felipe II-799
Bayyana, Narasimha R. I-334
Bechhofer, Sean II-712
Beezley, Jonathan D. I-1042
Bei, Yijun I-261
Bell, M. I-1074
Belloum, Adam III-191
Bemben, Adam I-390
Benhai, Yu III-953
Benkert, Katharina I-144
Bennethum, Lynn S. I-1042
Benoit, Anne I-366, I-591
Bervoets, F. II-415
Bhatt, Tejas I-1106
Bi, Jun IV-801
Bi, Yingzhou IV-1061
Bidaut, L. I-972
Bielecka, Marzena II-970
Bielecki, Andrzej II-558
Black, Peter M. I-980
Bo, Hu IV-522


Bo, Shukui III-898


Bo, Wen III-917
Bochicchio, Ivana II-990, II-997
Bosse, Tibor II-888
Botana, Francisco II-227
Brendel, Ronny II-839
Bressler, Helmut II-752
Brewer, Wes II-386
Brill, Downey I-1058
Brooks, Christopher III-182
Browne, J.C. I-972
Bu, Jiajun I-168, I-684
Bubak, Marian I-390
Bungartz, Hans-Joachim I-708
Burguillo-Rial, Juan C. IV-466
Burrage, Kevin I-778
Burrage, Pamela I-778
Bushehrian, Omid I-599
Byon, Eunshin I-1197
Byrski, Aleksander II-928
Byun, Hyeran IV-417, IV-546
Byun, Siwoo IV-889
Cai, Guoyin II-569
Cai, Jiansheng III-313
Cai, Keke I-684
Cai, Ming II-896, III-1048,
IV-725, IV-969
Cai, Ruichu IV-1167
Cai, Shaobin III-50, III-157
Cai, Wentong I-398
Cai, Yuanqiang III-1188
Caiming, Zhang II-130
Campos, Celso II-138
Cao, Kajia III-844
Cao, Rongzeng III-1032, IV-129
Cao, Suosheng II-1067
Cao, Z.W. II-363
Carmichael, Gregory R. I-1018
Caron, David I-995
Catalyurek, Umit I-1213
Cattani, Carlo II-982, II-990, II-1004
Cecchini, Arnaldo I-567

Cepulkauskas,
Algimantas II-259
Cetnarowicz, Krzysztof II-920
Cha, Jeong-won IV-721
Cha, JeongHee II-1
Cha, Seung-Jun II-562
Chai, Lei IV-98
Chai, Tianfeng I-1018

Chai, Yaohui II-409


Chai, Zhenhua I-802
Chai, Zhilei I-294
Chakraborty, Soham I-1042
Chandler, Seth J. II-170
Chandola, Varun I-1222
Chang, Ok-Bae II-1139
Chang, Jae-Woo III-621
Chang, Moon Seok IV-542
Chang, Sekchin IV-636
Chang, Yoon-Seop II-562
Chaoguang, Men III-166
Chatelain, Philippe III-1122
Chaturvedi, Alok I-1106
Chawla, Nitesh V. I-1090
Che, HaoYang III-293
Chen, Bin III-653
Chen, Bing III-338
Chen, Changbo II-268
Chen, Chun I-168, I-684
Chen, Gang I-253, I-261, III-1188
Chen, Guangjuan III-984
Chen, Guoliang I-700
Chen, Jianjun I-318
Chen, Jianzhong I-17
Chen, Jiawei IV-59, IV-98
Chen, Jin I-30
Chen, Jing III-669
Chen, Juan IV-921
Chen, Ken III-555
Chen, Lei IV-1124
Chen, Ligang I-318
Chen, Liujun IV-59
Chen, Long IV-1186
Chen, Qingshan II-482
Chen, Tzu-Yi I-302
Chen, Weijun I-192
Chen, Wei Qing II-736
Chen, Xiao IV-644
Chen, Xinmeng I-418
Chen, Ying I-575
Chen, Yun-ping III-1012
Chen, Yuquan II-1186, II-1214
Chen, Zejun III-113
Chen, Zhengxin III-852, III-874
Chen, Zhenyu II-431
Cheng, Frank II-17
Cheng, Guang IV-857
Cheng, Jingde I-406, III-890
Cheng, T.C. Edwin III-338

Cheng, Xiaobei III-90
Chi, Hongmei I-723
Cho, Eunseon IV-713
Cho, Haengrae IV-753
Cho, Hsung-Jung IV-275
Cho, Jin-Woong IV-482
Cho, Ki Hyung III-813
Cho, Sang-Young IV-949
Cho, Yongyun III-236
Cho, Yookun IV-905
Choe, Yoonsik IV-668
Choi, Bum-Gon IV-554
Choi, Byung-Uk IV-737
Choi, Han-Lim I-1138
Choi, Hyoung-Kee IV-360
Choi, HyungIl II-1
Choi, Jaeyoung III-236
Choi, Jongsun III-236
Choi, Kee-Hyun II-952
Choi, Myounghoi III-508
Choo, Hyunseung I-668, II-1226,
III-432, III-465, IV-303, IV-336,
IV-530, IV-534, IV-538, IV-550
Chopard, Bastien I-922
Chou, Chung-I IV-1163
Choudhary, Alok III-734
Chourasia, Amit I-46
Chrisochoides, Nikos I-980
Christiand II-760
Chtepen, Maria I-454
Chu, Chao-Hsien III-762
Chu, You-ling IV-1163
Chu, Yuan-Sun II-673
Chuan, Zheng Bo II-25
Chung, Hee-Joon II-347
Chung, Hyunsook II-696
Chung, Min Young IV-303, IV-534,
IV-550, IV-554
Chung, Seungjong III-18
Chung, Tai-Myoung III-1024
Chung, Yoojin IV-949
Cianni, Nathalia M. III-253
Ciarlet Jr., Patrick IV-235
Cisternino, Antonio II-585
Claeys, Filip H.A. I-454
Clark, James S. I-988
Clatz, Olivier I-980
Clegg, June IV-18
Clercx, H.J.H. I-898
Cline, Alan II-1123


Coen, Janice L. I-1042


Cofiño, A.S. III-82
Cole, Martin J. I-1002
Cong, Guodong III-960
Constantinescu, Emil M. I-1018
Corcho, Oscar II-712
Cornish, Annita IV-18
Cortial, J. I-1171
Costa-Montenegro, Enrique IV-466
Costanti, Marco II-617
Cox, Simon J. III-273
Coyle, E. I-1074
Cuadrado-Gallego, J. II-1162
Cui, Gang IV-1021
Cui, Ruihai II-331
Cui, Yifeng I-46
Cui, Yong IV-817
Curcin, Vasa III-204
Cycon, Hans L. IV-761
D'Ambrosio, Donato I-866
Dăescu, Dacian I-1018
Dai, Dao-Qing I-102
Dai, Kui IV-251
Dai, Tran Thanh IV-590
Dai, Zhifeng IV-1171
Danek, Tomasz II-558
Danelutto, Marco II-585
Dang, Sheng IV-121
Dapeng, Tan IV-957
Darema, Frederica I-955
Darmanjian, Shalom I-964
Das, Abhimanyu I-995
Day, Steven I-46
Decyk, Viktor K. I-583
Degond, Pierre I-939
Delu, Zeng IV-283
Demeester, Piet I-454
Demertzi, Melina I-1230
Demkowicz, L. I-972
Deng, An III-1172
Deng, Nai-Yang III-669, III-882
Deng, Xin Guo II-736
Dhariwal, Amit I-995
Dhoedt, Bart I-454
Di, Zengru IV-98
Díaz-Zuccarini, V. I-794
DiGiovanna, Jack I-964
Diller, K.R. I-972
Dimov, Ivan I-731, I-739, I-747


Ding, Dawei III-347


Ding, Lixin IV-1061
Ding, Maoliang III-906
Ding, Wantao III-145
Ding, Wei III-1032, IV-129, IV-857
Ding, Yanrui I-294
Ding, Yong IV-1116
Ding, Yongsheng III-74
Ding, Yu I-1197
Diniz, Pedro I-1230
Dittamo, Cristian II-585
Doboga, Flavia II-1060
Dobrowolski, Grzegorz II-944
Dong, Jinxiang I-253, I-261, II-896,
II-1115, III-1048, IV-725, IV-969
Dong, Yong IV-921
Dongarra, Jack II-815
Dongxin, Lu III-129
Dostert, Paul I-1002
Douglas, Craig C. I-1002, I-1042
Downar, T. I-1074
Drezewski, Rafal II-904, II-920
Dressler, Thomas II-831
Du, Xu IV-873
Du, Ye III-141
Duan, Gaoyan IV-1091
Duan, Jianyong II-1186
Dunn, Adam G. I-762
Dupeyrat, Gerard. IV-506
Efendiev, Yalchin I-1002
Egorova, Olga II-65
Eilertson, Eric I-1222
Elliott, A. I-972
Ellis, Carla I-988
Emoto, Kento II-601
Engelmann, Christian II-784
Eom, Jung-Ho III-1024
Eom, Young Ik IV-542, IV-977
Ertoz, Levent I-1222
Escribano, Jesús II-227
Espy, Kimberly Andrew III-859
Ewing, Richard E. I-1002
Fabozzi, Frank J. III-937
Fairman, Matthew J. III-273
Falcone, Jean-Luc I-922
Fan, Hongli III-563
Fan, Ying IV-98
Fan, Yongkai III-579

Fang, F. II-415
Fang, Fukang IV-59
Fang, Hongqing IV-1186
Fang, Hua III-859
Fang, Li Na II-736
Fang, Lide II-1067
Fang, Liu III-1048
Fang, Yu III-653
Fang, Zhijun II-1037
Fang-an, Deng III-453
Farhat, C. I-1171
Farias, Antonio II-799
Fathy, M. IV-606
Fedorov, Andriy I-980
Fei, Xubo III-244
Fei, Yu IV-741
Feixas, Miquel II-105
Feng, Huamin I-374, II-1012, III-1,
III-493
Feng, Lihua III-1056
Feng, Y. I-972
Feng, Yuhong I-398
Ferrari, Edward I-1098
Fidanova, Stefka IV-1084
Field, Tony I-111
Figueiredo, Renato I-964
Fischer, Rudolf I-144
Fleissner, Sebastian I-213
Flikkema, Paul G. I-988
Flórez, Jorge II-166
Fortes, José A.B. I-964
Frausto-Solís, Juan II-370, IV-981
Freire, Ricardo Oliveira II-312
Frigerio, Francesco II-272
Frolova, A.A. I-850
Fu, Chong I-575
Fu, Hao III-1048
Fu, Qian I-160
Fu, Shujun I-490
Fu, Tingting IV-969
Fu, Xiaolong III-579
Fu, Yingfang IV-409
Fu, Zetian III-547
Fuentes, D. I-972
Fujimoto, R.M. I-1050
Fukushima, Masao III-937
Fürlinger, Karl II-815
Furukawa, Tomonari I-1180
Fyta, Maria I-786

Gallego, Samy I-939
Gálvez, Akemi II-211
Gang, Fang Xin II-25
Gang, Yung-Jin IV-721
Gao, Fang IV-1021
Gao, Liang III-212
Gao, Lijun II-478
Gao, Rong I-1083
Gao, Yajie III-547
Garcia, Victor M. I-152
Gardner, Henry J. I-583
Garre, M. II-1162
Garsva, Gintautas II-439
Gautier, Thierry II-593
Gava, Frederic I-611
Gawroński, P. IV-43
Geiser, Jürgen I-890
Gelfand, Alan I-988
Georgieva, Rayna I-731
Gerndt, Michael II-815, II-847
Gerritsen, Charlotte II-888
Ghanem, Moustafa III-204
Ghattas, Omar I-1010
Gi, YongJae II-114
Gibson, Paul II-386
Gilbert, Anna C. I-1230
Goble, Carole II-712, III-182
Goda, Shinichi IV-142
Goderis, Antoon III-182
Goey, L.P.H. de I-947
Golby, Alexandra I-980
Goldberg-Zimring, Daniel I-980
Golubchik, Leana I-995
Gombos, Daniel I-1138
Gómez-Tato, A. III-637
Gong, Jian IV-809
Gong, Jianhua III-516, III-563
González-Castaño, Francisco J. III-637, IV-466
Gonzalez, Marta I-1090
Gore, Ross I-1238
Goto, Yuichi I-406
Gould, Michael II-138
Govindan, Ramesh I-995
Grama, Ananth I-1205
Gregor, Douglas I-620
Gregorio, Salvatore Di I-866
Gu, Guochang III-50, III-90, III-137,
III-157, III-178
Gu, Hua-Mao III-591


Gu, Jifa IV-9


Gu, Jinguang II-728
Gu, Yanying IV-312
Guan, Ying I-270
Guang, Li III-166
Guang-xue, Yue IV-741
Guensler, R. I-1050
Guermazi, Radhouane III-773
Guibas, L.J. I-1171
Guo, Bo IV-202
Guo, Jiangyan III-370
Guo, Jianping II-538, II-569
Guo, Song III-137
Guo, Yan III-1004
Guo, Yike III-204
Guo, Zaiyi I-119
Guo, Zhaoli I-802, I-810
Gurov, T. I-739
Gutierrez, J.M. III-82
Gyeong, Gyehyeon IV-977
Ha, Jong-Sung II-154
Ha, Pan-Bong IV-721
Haase, Gundolf I-1002
Habala, Ondrej III-265
Hachen, David I-1090
Haegee, Adrian II-744, II-768
Hagiwara, Ichiro II-65
Hall, Mary W. I-1230
Hamadou, Abdelmajid Ben III-773
Hammami, Mohamed III-773
Hammond, Kevin II-617
Han, Houde IV-267
Han, Hyuck II-577, III-26, IV-705
Han, Jianjun I-426, IV-965
Han, Jinshu II-1091
Han, Ki-Joon I-692, II-511
Han, Ki-Jun IV-574
Han, Kyungsook I-78, I-94, II-339
Han, Lu IV-598
Han, Mi-Ryung II-347
Han, Qi-ye III-1012
Han, SeungJo III-829, IV-717
Han, Shoupeng I-1246
Han, Yehong III-444
Han, Youn-Hee IV-441
Han, Young-Ju III-1024
Han, Yuzhen III-911
Han, Zhangang IV-98
Hansen, James I-1138


Hao, Cheng IV-1005


Hao, Zhenchun IV-841
Hao, Zhifeng IV-1167
Hasan, M.K. I-326
Hasegawa, Hiroki I-914
Hatcher, Jay I-1002, I-1042
Hazle, J. I-972
He, Gaiyun II-1075
He, Jing II-401, II-409
He, Jingsha IV-409
He, Kaijian I-554, III-925
He, Tingting III-587
He, Wei III-1101
He, X.P. II-1083
He, Yulan II-378
He, Zhihong III-347
Heijst, G.J.F. van I-898
Hermer-Vazquez, Linda I-964
Hertzberger, Bob III-191
Hieu, Cao Trong IV-474
Hill, Chris I-1155, I-1163
Hill, Judith I-1010
Hinsley, Wes I-111
Hiroaki, Deguchi II-243
Hirose, Shigenobu I-914
Hluchy, Ladislav III-265
Hobbs, Bruce I-62
Hoekstra, Alfons G. I-922
Homann, C. I-1074
Holloway, America I-302
Honavar, Vasant I-1066
Hong, Choong Seon IV-474, IV-590
Hong, Dong-Suk II-511
Hong, Helen II-9
Hong, Jiman IV-905, IV-925, IV-933
Hong, Soon Hyuk III-523, IV-425
Hong, Weihu III-1056
Hong, Yili I-1066
Hongjun, Yao III-611
Hongmei, Liu I-648
Horák, Bohumil II-936
Hose, D.R. I-794
Hou, Jianfeng III-313, III-320, III-448
Hou, Wenbang III-485
Hou, Y.M. III-1164
How, Jonathan I-1138
Hsieh, Chih-Hui I-1106
Hu, Bai-Jiong II-1012
Hu, Jingsong I-497
Hu, Qunfang III-1180

Hu, Ting IV-1029


Hu, Xiangpei IV-218
Hu, Xiaodong III-305
Hu, Yanmei I-17
Hu, Yi II-1186, II-1214
Hu, Yincui II-569
Hu, Yuanfang I-46
Hua, Chen Qi II-25
Hua, Kun III-867
Huajian, Zhang III-166
Huan, Zhengliang II-1029
Huang, Chongfu III-1016, III-1069
Huang, Dashan III-937
Huang, Fang II-523
Huang, Han IV-1167
Huang, Hong-Wei III-1114, III-1180
Huang, Houkuan III-645
Huang, Jing III-353
Huang, Kedi I-1246
Huang, LaiLei IV-90
Huang, Lican III-228
Huang, Linpeng II-1107
Huang, Maosong III-1105
Huang, Minfang IV-218
Huang, Mingxiang III-516
Huang, Peijie I-430
Huang, Wei II-455, II-486
Huang, Yan-Chu IV-291
Huang, Yong-Ping III-125
Huang, Yu III-257
Huang, Yue IV-1139
Huang, Z.H. II-1083
Huang, Zhou III-653
Huashan, Guo III-611
Huerta, Joaquin II-138
Huh, Eui Nam IV-498, IV-582
Huh, Moonhaeng IV-889
Hui, Liu II-130
Hunter, M. I-1050
Hur, Gi-Taek II-150
Hwang, Chih-Hong IV-227
Hwang, Hoyoung IV-889, IV-897
Hwang, Jun IV-586
Hwang, Yuan-Chu IV-433
Hwang, Yun-Young II-562
Ibrahim, H. I-446
Iglesias, Andres II-89, II-194, II-235
Inceoglu, Mustafa Murat III-607
Ipanaque, R. II-194

İnan, Asu I-1, I-38
Iskandarani, Mohamed I-1002
İşler, Veysi II-49
Ito, Kiichi IV-74

Jackson, Peter III-746


Jacob, Robert L. I-931
Jagannathan, Suresh I-1205
Jagodziński, Janusz II-558
Jaluria, Y. I-1189
Jamieson, Ronan II-744
Jang, Hyun-Su IV-542
Jang, Sung Ho II-966
Jayam, Naresh I-603
Jeon, Jae Wook III-523, IV-425
Jeon, Keunhwan III-508
Jeon, Taehyun IV-733
Jeong, Chang Won III-170
Jeong, Dongwon II-720, III-508, IV-441
Jeong, Seung-Moon II-150
Jeong, Taikyeong T. IV-586
Jeun, In-Kyung II-665
Jho, Gunu I-668
Ji, Hyungsuk II-1222, II-1226
Ji, Jianyue III-945
Ji, Youngmin IV-869
Jia, Peifa II-956
Jia, Yan III-717, III-742
Jian, Kuodi II-855
Jian-fu, Shao III-1130
Jiang, Changjun III-220
Jiang, Dazhi IV-1131
Jiang, Hai I-286
Jiang, He III-293, III-661
Jiang, Jianguo IV-1139
Jiang, Jie III-595
Jiang, Keyuan II-393
Jiang, Liangkui IV-186
Jiang, Ming-hui IV-158
Jiang, Ping III-212
Jiang, Shun IV-129
Jiang, Xinlei III-66
Jiang, Yan III-42
Jiang, Yi I-770, I-826
Jianjun, Guo III-611
Jianping, Li III-992
Jiao, Chun-mao III-1197
Jiao, Licheng IV-1053
Jiao, Xiangmin I-334
Jiao, Yue IV-134

Jin, Hai I-434


Jin, Ju-liang III-980, III-1004
Jin, Kyo-Hong IV-721
Jin, Li II-808
Jin, Shunfu IV-210, IV-352
Jing, Lin-yan III-1004
Jing, Yixin II-720
Jing-jing, Tian III-453
Jinlong, Zhang III-953
Jo, Geun-Sik II-704
Jo, Insoon II-577
Johnson, Chris R. I-1002
Jolesz, Ferenc I-980
Jones, Brittany I-237
Joo, Su Chong III-170
Jordan, Thomas I-46
Jou, Yow-Jen IV-291
Jung, Hyungsoo III-26, IV-705
Jung, Jason J. II-704
Jung, Kwang-Ryul IV-745
Jung, Kyunghoon IV-570
Jung, Soon-heung IV-621
Jung, Ssang-Bong IV-457
Jung, Woo Jin IV-550
Jung, Youngha IV-668
Jurenz, Matthias II-839
Kabadshow, Ivo I-716
Kacher, Dan I-980
Kakehi, Kazuhiko II-601
Kalaycı, Tahir Emre II-158
Kambadur, Prabhanjan I-620
Kanaujia, Atul I-1114
Kaneko, Masataka II-178
Kang, Dazhou I-196
Kang, Hong-Koo I-692, II-511
Kang, Hyungmo IV-514
Kang, Lishan IV-1116, IV-1131
Kang, Mikyung IV-401
Kang, Min-Soo IV-449
Kang, Minseok III-432
Kang, Sanggil III-836
Kang, Seong-Goo IV-977
Kang, Seung-Seok IV-295
Kapcak, Sinan II-235
Kapoor, Shakti I-603
Karakaya, Ziya II-186
Karl, Wolfgang II-831
Kasprzak, Andrzej I-442
Kawano, Akio I-914


Kaxiras, Efthimios I-786


Ke, Lixia III-911
Keetels, G.H. I-898
Kempe, David I-995
Kennedy, Catriona I-1098
Kereku, Edmond II-847
Khan, Faraz Idris IV-498, IV-582
Khazanchi, Deepak III-806, III-852
Khonsari, A. IV-606
Ki, Hyung Joo IV-554
Kikinis, Ron I-980
Kil, Min Wook IV-614
Kim, Deok-Hwan I-204
Kim, Beob Kyun III-894
Kim, Byounghoon IV-570
Kim, Byung-Ryong IV-849
Kim, ByungChul IV-368
Kim, ChangKug IV-328
Kim, Changsoo IV-570
Kim, Cheol Min III-559
Kim, Chul-Seung IV-542
Kim, Deok-Hwan I-204, II-515, III-902
Kim, Do-Hyeon IV-449
Kim, Dong-Oh I-692, II-511
Kim, Dong-Uk II-952
Kim, Dong-Won IV-676
Kim, Eung-Kon IV-717
Kim, Gu Su IV-542
Kim, GyeYoung II-1
Kim, H.-K. I-1050
Kim, Hanil IV-660
Kim, Hojin IV-865
Kim, Hyogon IV-709
Kim, Hyun-Ki IV-457, IV-1076
Kim, Jae-gon IV-621
Kim, Jae-Kyung III-477
Kim, Jee-Hoon IV-344, IV-562
Kim, Ji-Hong IV-721
Kim, Jihun II-347
Kim, Jinhwan IV-925
Kim, Jinoh I-1222
Kim, Jong-Bok II-1194
Kim, Jong Nam III-10, III-149
Kim, Jong Tae IV-578
Kim, Joongheon IV-385
Kim, Joung-Joon I-692
Kim, Ju Han II-347
Kim, Jungmin II-696
Kim, Junsik IV-713
Kim, Kanghee IV-897

Kim, Ki-Chang IV-849


Kim, Ki-Il IV-745
Kim, Kilcheon IV-417
Kim, Kwan-Woong IV-328
Kim, Kyung-Ok II-562
Kim, LaeYoung IV-865
Kim, Minjeong I-1042
Kim, Moonseong I-668, III-432, III-465
Kim, Myungho I-382
Kim, Nam IV-713
Kim, Pankoo III-829, IV-660, IV-925
Kim, Sang-Chul IV-320
Kim, Sang-Sik IV-745
Kim, Sang-Wook IV-660
Kim, Sanghun IV-360
Kim, Sangtae I-963
Kim, Seong Baeg III-559
Kim, Seonho I-1222
Kim, Shingyu III-26, IV-705
Kim, Sung Jin III-798
Kim, Sungjun IV-869
Kim, Sung Kwon IV-693
Kim, Sun Yong IV-360
Kim, Tae-Soon III-902
Kim, Taekon IV-482
Kim, Tai-Hoon IV-693
Kim, Ung Mo III-709
Kim, Won III-465
Kim, Yong-Kab IV-328
Kim, Yongseok IV-933
Kim, Young-Gab III-1040
Kim, Young-Hee IV-721
Kisiel-Dorohinicki, Marek II-928
Kitowski, Jacek I-414
Kleijn, Chris R. I-842
Klie, Hector I-1213
Kluge, Michael II-823
Knight, D. I-1189
Knüpfer, Andreas II-839
Ko, Il Seok IV-614, IV-729
Ko, Jin Hwan I-521
Ko, Kwangsun IV-977
Koda, Masato II-447
Koh, Kern IV-913
Kolobov, Vladimir I-850, I-858
Kondo, Djimedo III-1130
Kong, Chunum IV-303
Kong, Xiangjie II-1067
Kong, Xiaohong I-278
Kong, Yinghui II-978

Kong, Youngil IV-685
Koo, Bon-Wook IV-562
Koo, Jahwan IV-538
Korkhov, Vladimir III-191
Kot, Andriy I-980
Kotulski, Leszek II-880
Kou, Gang III-852, III-874
Koumoutsakos, Petros III-1122
Kozlak, Jaroslaw II-872, II-944
Krile, Srecko I-628
Krishna, Murali I-603
Krömer, Pavel II-936
Kryza, Bartosz I-414
Krzhizhanovskaya, Valeria V. I-755
Kuang, Minyi IV-82
Kuijk, H.A.J.A. van I-947
Kulakowski, K. IV-43
Kulikov, Gennady Yu. I-136
Kulvietiene, Regina II-259
Kulvietis, Genadijus II-259
Kumar, Arun I-603
Kumar, Vipin I-1222
Kurc, Tahsin I-1213
Kusano, Kanya I-914
Küster, Uwe I-128
Kuszmaul, Bradley C. I-1163
Kuzumilovic, Djuro I-628
Kwak, Ho Young IV-449
Kwak, Sooyeong IV-417
Kwoh, Chee Keong II-378
Kwon, B. I-972
Kwon, Hyuk-Chul II-1170, II-1218
Kwon, Key Ho III-523, IV-425
Kwon, Ohhoon IV-913
Kwon, Ohkyoung II-577
Kyriakopoulos, Fragiskos II-625
Laat, Cees de III-191
Laclavik, Michal III-265
Laganà, Antonio I-358
Lai, C.-H. I-294
Lai, Hong-Jian III-377
Lai, K.K. III-917
Lai, Kin Keung I-554, II-423, II-455,
II-486, II-494, III-925, IV-106
Landertshamer, Roland II-752, II-776
Lang, Bruno I-716
Lantz, Brett I-1090
Larson, J. Walter I-931
Laserra, Ettore II-997


Laszewski, Gregor von I-1058


Lawford, P.V. I-794
Le, Jiajin III-629
Lee, Bong Gyou IV-685
Lee, Byong-Gul II-1123
Lee, Chang-Mog II-1139
Lee, Changjin IV-685
Lee, Chung Sub III-170
Lee, Donghwan IV-385
Lee, Edward A. III-182
Lee, Eun-Pyo II-1123
Lee, Eung Ju IV-566
Lee, Eunryoung II-1170
Lee, Eunseok IV-594
Lee, Haeyoung II-73
Lee, Heejo IV-709
Lee, HoChang II-162
Lee, Hyun-Jo III-621
Lee, Hyungkeun IV-482
Lee, In-Tae IV-1076
Lee, Jae-Hyung IV-721
Lee, Jaeho III-477
Lee, Jaewoo IV-913
Lee, JaeYong IV-368
Lee, Jang-Yeon IV-482
Lee, Jin-won IV-621
Lee, Jong Sik II-966
Lee, Joonhyoung IV-668
Lee, Ju-Hong II-515, III-902
Lee, Jung-Bae IV-949
Lee, Jung-Seok IV-574
Lee, Junghoon IV-401, IV-449,
I-586, IV-660, IV-925
Lee, Jungwoo IV-629
Lee, K.J. III-701
Lee, Kye-Young IV-652
Lee, Kyu-Chul II-562
Lee, Kyu Min II-952
Lee, Kyu Seol IV-566
Lee, Mike Myung-Ok IV-328
Lee, Namkyung II-122
Lee, Peter I-1098
Lee, Samuel Sangkon II-1139, III-18
Lee, Sang-Yun IV-737
Lee, SangDuck IV-717
Lee, Sang Ho III-798
Lee, Sang Joon IV-449
Lee, Seok-Lae II-665
Lee, Seok-Lyong I-204
Lee, SeungCheol III-709


Lee, Seungwoo IV-905


Lee, Seung Wook IV-578
Lee, Soojung I-676
Lee, SuKyoung IV-865
Lee, Sungyeol II-73
Lee, Tae-Jin IV-336, IV-457, IV-550,
IV-554
Lee, Wan Yeon IV-709
Lee, Wonhee III-18
Lee, Wonjun IV-385
Lee, Young-Ho IV-897
Lee, Younghee IV-629
Lei, Lan III-381, III-384
Lei, Tinan III-575
Lei, Y.-X. IV-777
Leier, Andre I-778
Leiserson, Charles E. I-1163
Lemaire, Francois II-268
Lenton, Timothy M. III-273
Leung, Kwong-Sak IV-1099
Levnajic, Zoran II-633
Li, Ai-Ping III-121
Li, Aihua II-401, II-409
Li, Changyou III-137
Li, Dan IV-817, IV-841
Li, Deng-Xin III-377
Li, Deyi II-657
Li, Fei IV-785
Li, Gen I-474
Li, Guojun III-347
Li, Guorui IV-409
Li, Haiyan IV-961
Li, Hecheng IV-1159
Li, Jianping II-431, II-478, III-972
Li, Jinhai II-1067
Li, Jun III-906
Li, Li III-984
Li, Ling II-736
Li, Ming I-374, II-1012, III-1, III-493
Li, MingChu III-293, III-329
Li, Ping III-440
Li-ping, Chen IV-741
Li, Qinghua I-426, IV-965
Li, Renfa III-571
Li, Rui IV-961
Li, Ruixin III-133
Li, Runwu II-1037
Li, Sai-Ping IV-1163
Li, Shanping IV-376
Li, Shengjia III-299

Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,
Li,

Shucai III-145
Tao IV-166
Weimin III-629
Wenhang III-516
X.-M. II-397
Xiao-Min III-381, III-384
Xikui III-1210
Xin II-251, III-587
Xing IV-701, IV-853
Xingsen III-781, III-906
Xinhui IV-174
Xinmiao IV-174
Xinye III-531
Xiong IV-121
Xiuzhen II-1021
Xue-Yao III-125, III-174
Xuening II-1186
Xueyu II-978
Xuezhen I-430
Yan III-603
Yanhui I-196
Yi III-485
Yih-Lang IV-259
Yiming IV-227, IV-259
Ying II-1115
Yixue II-363
Yiyuan II-1115
Yong III-50, III-157, IV-251
Yuan I-1066
Yuanxiang IV-997, IV-1037,
IV-1124,IV-1171, IV-1175, IV-1179
Li, Yueping III-401
Li, Yun II-327
Li, Yunwei IV-598
Li, Zhiguo I-1114
Li, Zhiyong IV-1183
Li, Zi-mao IV-1045, IV-1147
Liang, L. III-988
Liang, Liang IV-202
Liang, Xiaodong III-334
Liang, Yi I-318
Liang, Yong IV-1099
Lim, Eun-Cheon III-821
Lim, Gyu-Ho IV-721
Lim, Jongin III-1040
Lim, Kyung-Sup II-1194
Lim, S.C. I-374, II-1012, III-1
Lim, Sukhyun I-505
Lim, Sung-Soo IV-889, IV-897
Lin, Jun III-579

Lin, Yachen II-470
Lin, Zefu III-945
Lin, Zhun II-1178
Lin-lin, Ci III-539
Liñán-García, Ernesto II-370
Ling, Yun III-591
Linton, Steve II-617
Liu, Caiming II-355, IV-166
Liu, Dayou I-160
Liu, Dingsheng II-523
Liu, Dong IV-961
Liu, Dongtao IV-701
Liu, E.L. III-1151
Liu, Fang IV-1053
Liu, Feiyu III-1188
Liu, Fengli III-762
Liu, Fengshan II-33
Liu, Gan I-426, IV-965
Liu, Guizhen III-313, III-320, III-362,
III-440, III-457
Liu, Guobao II-97
Liu, Guoqiang III-1062
Liu, Haibo III-90, III-178
Liu, Hailing III-1205
Liu, Han-long III-1172
Liu, Hong III-329
Liu, Hong-Cheu I-270
Liu, Hongwei IV-1021
Liu, Jia I-168, III-677
Liu, Jiangguo (James) I-882
Liu, Jiaoyao IV-877
Liu, Jin-lan III-1008
Liu, Li II-17
Liu, Liangxu III-629
Liu, Lin III-980
Liu, Lingxia III-133
Liu, Ming III-1105
Liu, Peng II-523, II-896, IV-969
Liu, Qingtang III-587
Liu, Qizhen III-575
Liu, Quanhui III-347
Liu, Sheng IV-1068
Liu, Tianzhen III-162
Liu, Weijiang IV-793
Liu, Xiaojie II-355, IV-166
Liu, Xiaoqun IV-841
Liu, Xin II-657
Liu, Xinyu I-1238
Liu, Xinyue III-661
Liu, Xiuping II-33

Liu, Yan IV-59


Liu, Yijun IV-9
Liu, Ying III-685, III-781, IV-18
Liu, Yingchun III-1205
Liu, Yuhua III-153
Liu, Yunling IV-162
Liu, Zejia III-1210
Liu, Zhen III-543
Liu, Zhi III-595
Liu, Zongtian II-689
Lobo, Victor II-542
Lodder, Robert A. I-1002
Loidl, Hans-Wolfgang II-617
Loop, B. I-1074
Lord, R. II-415
Lorenz, Eric I-922
Lou, Dingjun III-401, III-410
Loureiro, Miguel II-542
Lu, Feng III-587
Lu, J.F. III-1228
Lu, Jianjiang I-196
Lu, Jianjun IV-162
Lu, Ruzhan II-1186, II-1214
Lu, Shiyong III-244
Lü, Shunying I-632
Lu, Weidong IV-312
Lu, Yi I-1197
Lu, Yunting III-410
Lu, Zhengding II-808
Lu, Zhengtian II-355
Lu, Zhongyu III-754
Luengo, F. II-89
Łukasik, Szymon III-726
Lumsdaine, Andrew I-620
Luo, Qi III-531, III-583
Luo, Xiaonan III-485
Luo, Ying II-538, II-569
Lv, Tianyang II-97
Ma, Q. I-1189
Ma, Tieju IV-1
Ma, Xiaosong I-1058
Ma, Yinghong III-444
Ma, Yongqiang III-898
Ma, Zhiqiang III-133
Macedo, Autran III-281
Madey, Gregory R. I-1090
Maechling, Philip I-46
Maeno, Yoshiharu IV-74
Mahinthakumar, Kumar I-1058

Mahmoudi, Babak I-964
Majer, Jonathan D. I-762
Majewska, Marta I-414
Majumdar, Amitava I-46
Malekesmaeili, Mani IV-490
Malony, Allen I-86
Mandel, Jan I-1042
Mao, Cunli IV-598
Marchal, Loris I-964
Marin, Mauricio I-229
Markowski, Marcin I-442
Marques, Vinícius III-253
Marsh, Robert III-273
Marshall, John I-1155, I-1163

Martínez-Álvarez, R.P. III-637
Martino, Rafael N. De III-253
Martinovic, Jan II-936
Mascagni, Michael I-723
Matsuzaki, Kiminori II-601, II-609
Mattos, Amanda S. de III-253
Matza, Jeff III-852
Maza, Marc Moreno II-251, II-268
McCalley, James I-1066
McGregor, Robert I-906
McMullan, Paul I-538
Mechitov, Alexander II-462
Meeker, William I-1066
Méhats, Florian I-939
Mei, Hailiang III-424
Meire, Silvana G. II-138
Melchionna, Simone I-786
Meliopoulos, S. I-1074
Melnik, Roderick V.N. I-834
Memarsadeghi, Nargess II-503
Memik, Gokhan III-734
Meng, Fanjun II-478
Meng, Huimin III-66
Meng, Jixiang III-334
Meng, Wei III-299
Meng, Xiangyan IV-598
Merkevicius, Egidijus II-439
Metaxas, Dimitris I-1114
Miao, Jia-Jia III-121
Miao, Qiankun I-700
Michopoulos, John G. I-1180
Mikucioniene, Jurate II-259
Min, Jun-Ki I-245
Min, Sung-Gi IV-441
Ming, Ai IV-522
Minster, Bernard I-46

Mirabedini, Seyed Javad II-960


Missier, Paolo II-712
Mok, Tony Shu Kam IV-1099
Molinari, Marc III-273
Montanari, Luciano II-272
Monteiro Jr., Pedro C.L. III-253
Moon, Jongbae I-382
Moon, Kwang-Seok III-10
Moore, Reagan I-46
Mora, P. III-1156
Morimoto, Shoichi II-1099, III-890
Morozov, I. III-199
Morra, Gabriele III-1122
Moshkovich, Helen II-462
Mount, David M. II-503
Mu, Chengpo I-490
Mu, Weisong III-547
Mulder, Wico III-216
Mun, Sung-Gon IV-538
Mun, Youngsong I-660, IV-514
Müller, Matthias II-839
Munagala, Kamesh I-988
Muntean, Ioan Lucian I-708
Murayama, Yuji II-550
Nagel, Wolfgang E. II-823, II-839
Nah, HyunChul II-162
Nakajima, Kengo III-1085
Nakamori, Yoshiteru IV-1
Nam, Junghyun III-709
Nara, Shinsuke I-406
Narayanan, Ramanathan III-734
Narracott, A.J. I-794
Nawarecki, Edward II-944
Nedjalkov, M. I-739
Nepal, Chirag I-78, I-94
Ni, Jun III-34
Nicole, Denis A. III-273
Niemegeers, Ignas IV-312
Niennattrakul, Vit I-513
Nieto-Yáñez, Alma IV-981
Ning, Zhuo IV-809
Niu, Ben II-319
Niu, Ke III-677
Niu, Wenyuan IV-9
Noh, Dong-Young II-347
Nong, Xiao IV-393
Noorbatcha, I. II-335
Norris, Boyana I-931

Oberg, Carl I-995
Oden, J.T. I-972
Oh, Hyukjun IV-933
Oh, Jehwan IV-594
Oh, Sangchul IV-713
Oh, Sung-Kwun IV-1076, IV-1108
Ohsawa, Yukio IV-74, IV-142
Oijen, J.A. van I-947
Oladunni, Olutayo O. I-176
Oliveira, Suely I-221
Olsen, Kim I-46
Olson, David L. II-462
Ong, Everest T. I-931
Ong, Hong II-784
Oosterlee, C.W. II-415
Ord, Alison I-62
Othman, M. I-326, I-446
Ou, Zhuoling III-162
Ould-Khaoua, M. IV-606
Ouyang, Song III-289
Özışıkyılmaz, Berkin III-734
Pacifici, Leonardo I-358
Paik, Juryon III-709
Palkow, Mark IV-761
Pan, Wei II-268
Pan, Yaozhong III-1069
Pang, Jiming II-97
Pang, Yonggang III-117, III-141
Papancheva, Rumyana I-747
Parashar, Manish I-1213
Parhami, Behrooz IV-67
Park, Ae-Soon IV-745
Park, Byungkyu II-339
Park, Chiwoo I-1197
Park, Dong-Hyun IV-344
Park, Gyung-Leen IV-449, IV-586,
IV-660, IV-925
Park, Hee-Geun II-1222
Park, Heum II-1218
Park, Hyungil I-382
Park, Ilkwon IV-546
Park, Jaesung IV-629
Park, Jeonghoon IV-336
Park, Ji-Hwan III-523, IV-425
Park, JongAn III-829, IV-717
Park, Keon-Jun IV-1108
Park, ManKyu IV-368
Park, Moonju IV-881
Park, Mu-Hun IV-721

Park, Namhoon IV-713


Park, Sanghun I-25
Park, Seon-Ho III-1024
Park, Seongjin II-9
Park, So-Jeong IV-449
Park, Sooho I-1138
Park, Sungjoon III-836
Park, TaeJoon IV-368
Park, Woojin IV-869
Park, Youngsup II-114
Parsa, Saeed I-599
Paszyński, M. I-342, II-912
Pathak, Jyotishman I-1066
Pawling, Alec I-1090
Pedrycz, Witold IV-1108
Pei, Bingzhen II-1214
Pei-dong, Zhu IV-393
Pein, Raoul Pascal III-754
Peiyu, Li IV-957
Peng, Dongming III-859
Peng, Hong I-430, I-497
Peng, Lingxi II-355, IV-166
Peng, Qiang II-57
Peng, Shujuan IV-997
Peng, Xia III-653
Peng, Xian II-327
Peng, Yi III-852, III-874
Peng, Yinqiao II-355
Pflüger, Dirk I-708
Pinheiro, Wallace A. III-253
Plale, Beth I-1122
Platos, Jan II-936
Prasad, R.V. IV-312
Price, Andrew R. III-273
Primavera, Leonardo I-9
Prudhomme, S. I-972
Príncipe, Jose C. I-964
Pu, Liang III-867
Pusca, Stefan II-1053
Qi, Jianxun III-984
Qi, Li I-546
Qi, Meibin IV-1139
Qi, Shanxiang I-529
Qi, Yutao IV-1053
Qiao, Daji I-1066
Qiao, Jonathan I-237
Qiao, Lei III-615
Qiao, Yan-Jiang IV-138
Qin, Jun IV-1045

Qin, Ruiguo III-599
Qin, Ruxin III-669
Qin, Xiaolin II-1131
Qin, Yong IV-67, IV-1167
Qiu, Guang I-684
Qiu, Jieshan II-280
Qiu, Yanxia IV-598
Qizhi, Zhu III-1130
Queiroz, Jose Rildo de Oliveira II-304
Quirós, Ricardo II-138
Ra, Sang-Dong II-150


Rafirou, D. I-794
Rajashekhar, M. I-1171
Ram, Jerey III-244
Ramakrishnan, Lavanya I-1122
Ramalingam, M. II-288
Ramasami, K. II-288
Ramasami, Ponnadurai II-296
Ramsamy, Priscilla II-744, II-768
Ranjithan, Ranji I-1058
Ratanamahatana, Chotirat Ann I-513
Rattanatamrong, Prapaporn I-964
Ravela, Sai I-1147, I-1155
Regenauer-Lieb, Klaus I-62
Rehn, Veronika I-366
Rejas, R. II-1162
ReMine, Walter II-386
Ren, Lihong III-74
Ren, Yi I-462, I-466, II-974
Ren, Zhenhui III-599
Reynolds Jr., Paul F. I-1238
Richman, Michael B. I-1130
Rigau, Jaume II-105
Robert, Yves I-366, I-591
Roberts, Ron I-1066
Roch, Jean-Louis II-593
Rocha, Gerd Bruno II-312
Rodríguez, D. II-1162
Rodríguez-Hernández, Pedro S. III-637, IV-466
Román, E.F. II-370
Romero, David II-370
Romero, Luis F. I-54
Rong, Haina IV-243, IV-989
Rong, Lili IV-178
Rongo, Rocco I-866
Rossman, T. I-1189
Roy, Abhishek I-652
Roy, Nicholas I-1138
Ruan, Jian IV-251
Ruan, Qiuqi I-490
Ruan, Youlin I-426, IV-965
Ryan, Sarah I-1066
Ryu, Jae-hong IV-676
Ryu, Jihyun I-25
Ryu, Jung-Pil IV-574
Ryu, Kwan Woo II-122

Sabatka, Alan III-852


Safaei, F. IV-606
Sainz, Miguel A. II-166
Salman, Adnan I-86
Saltz, Joel I-1213
Sameh, Ahmed I-1205
San-Martín, D. III-82
Sanchez, Justin C. I-964
Sánchez, Ruiz Luis M. II-1004
Sandu, Adrian I-1018, I-1026
Sanford, John II-386
Santone, Adam I-1106
Sarafian, Haiduke II-203
Sarafian, Nenette II-203
Savchenko, Maria II-65
Savchenko, Vladimir II-65
Saxena, Navrati I-652
Sbert, Mateu II-105, II-166
Schaefer, R. I-342
Scheuermann, Peter III-781
Schmidt, Thomas C. IV-761
Schoenharl, Timothy I-1090

Schost, Éric II-251
Schwan, K. I-1050
Scott, Stephen L. II-784
Seinfeld, John H. I-1018
Sekiguchi, Masayoshi II-178
Senel, M. I-1074
Senthilkumar, Ganapathy I-603
Seo, Dong Min III-813
Seo, Kwang-deok IV-621
Seo, SangHyun II-114, II-162
Seo, Young-Hoon II-1202, II-1222
Seshasayee, B. I-1050
Sha, Jing III-220
Shakhov, Vladimir V. IV-530
Shan, Jiulong I-700
Shan, Liu III-953
Shan-shan, Li IV-393
Shang, Weiping III-305
Shanzhi, Chen IV-522

Shao, Feng I-253
Shao, Huagang IV-644
Shao, Xinyu III-212
Shao, Ye-Hong III-377
Shao-liang, Peng IV-393
Sharif, Hamid III-859
Sharma, Abhishek I-995
Sharma, Raghunath I-603
Shen, Huizhang IV-51
Shen, Jing III-90, III-178
Shen, Linshan III-1077
Shen, Xianjun IV-1171, IV-1175,
IV-1179
Shen, Yue III-109, III-555
Shen, Zuyi IV-1186
Shi, Baochang I-802, I-810, I-818
Shi, Bing III-615
Shi, Dongcai II-1115
Shi, Haihe III-469
Shi, Huai-dong II-896
Shi, Jin-Qin III-591
Shi, Xiquan II-33
Shi, Xuanhua I-434
Shi, Yaolin III-1205
Shi, Yong II-401, II-409, II-490, II-499,
III-685, III-693, III-852, III-874,
III-906, III-1062
Shi, Zhongke I-17
Shi-hua, Ma I-546
Shim, Choon-Bo III-821
Shima, Shinichiro I-914
Shin, Byeong-Seok I-505
Shin, Dong-Ryeol II-952
Shin, In-Hye IV-449, IV-586, IV-925
Shin, Jae-Dong IV-693
Shin, Jitae I-652
Shin, Kwonseung IV-534
Shin, Kyoungho III-236
Shin, Seung-Eun II-1202, II-1222
Shin, Teail IV-514
Shin, Young-suk II-81
Shindin, Sergey K. I-136
Shirayama, Susumu II-649
Shiva, Mohsen IV-490
Shouyang, Wang III-917
Shuai, Dianxun IV-1068
Shuang, Kai IV-785
Shukla, Pradyumn Kumar I-310,
IV-1013
Shulin, Zhang III-992

Shuping, Wang III-992


Silva, Geraldo Magela e II-304
Simas, Alfredo Mayall II-312
Simmhan, Yogesh I-1122
Simon, Gyorgy I-1222
Simutis, Rimvydas II-439
Siricharoen, Waralak V. II-1155
Sirichoke, J. I-1050
Siwik, Leszek II-904
Siy, Harvey III-790
Skelcher, Chris I-1098
Skomorowski, Marek II-970
Slota, Damian I-184
Snášel, Václav II-936
Śnieżyński, Bartłomiej II-864
Soberon, Xavier II-370
Sohn, Bong-Soo I-350
Sohn, Won-Sung III-477
Soltan, Mehdi IV-490
Song, Hanna II-114
Song, Huimin III-457
Song, Hyoung-Kyu IV-344, IV-562
Song, Jeong Young IV-614
Song, Jae-Won III-902
Song, Joo-Seok II-665
Song, Sun-Hee II-150
Song, Wang-Cheol IV-925
Song, Xinmin III-1062
Song, Zhanjie II-1029, II-1075
Sorge, Volker I-1098
Souza, Jano M. de III-253
Spataro, William I-866
Spiegel, Michael I-1238
Sreepathi, Sarat I-1058
Srinivasan, Ashok I-603
Srovnal, Vilem II-936
Stafford, R.J. I-972
Stauffer, Beth I-995
Steder, Michael I-931
Sterna, Kamil I-390
Stransky, S. I-1155
Strug, Barbara II-880
Su, Benyue II-41
Su, Fanjun IV-773
Su, Hui-Kai IV-797
Su, Hung-Chi I-286
Su, Liang III-742
Su, Sen IV-785
Su, Zhixun II-33
Subramaniam, S. I-446

Succi, Sauro I-786
Sugiyama, Toru I-914
Suh, W. I-1050
Sui, Yangyi III-579
Sukhatme, Gaurav I-995
Sulaiman, J. I-326
Sun, Dean III-1138
Sun, Feixian II-355
Sun, Guangzhong I-700
Sun, Guoqiang IV-773
Sun, Haibin II-531
Sun, Jin II-1131
Sun, Jun I-278, I-294
Sun, Lijun IV-218
Sun, Miao II-319
Sun, Ping III-220
Sun, Shaorong IV-134
Sun, Shuyu I-755, I-890
Sun, Tianze III-579
Sun, Xiaodong IV-134

Šuvakov, Milovan II-641
Swain, E. I-1074
Swaminathan, J. II-288
Szabó, Gábor I-1090
Szczepaniak, Piotr II-219
Szczerba, Dominik I-906
Székely, Gábor I-906
Tabik, Siham I-54
Tackley, Paul III-1122
Tadic, Bosiljka II-633, II-641
Tadokoro, Yuuki II-178
Tahar, Sofiène II-263
Tak, Sungwoo IV-570
Takahashi, Isao I-406
Takato, Setsuo II-178
Takeda, Kenji III-273
Tan, Guoxin III-587
Tan, Hui I-418
Tan, Jieqing II-41
Tan, Yu-An III-567
Tan, Zhongfu III-984
Tang, Fangcheng IV-170
Tang, J.M. I-874
Tang, Jiong I-1197
Tang, Liqun III-1210
Tang, Sheng Qun II-681, II-736
Tang, Xijin IV-35, IV-150
Tang, Yongning IV-857
Tao, Chen III-953

Tao, Jianhua I-168


Tao, Jie II-831
Tao, Yongcai I-434
Tao, Zhiwei II-657
Tay, Joc Cing I-119
Terpstra, Frank III-216
Teshnehlab, Mohammad II-960
Theodoropoulos, Georgios I-1098
Thijsse, Barend J. I-842
Thrall, Stacy I-237
Thurner, Stefan II-625
Tian, Chunhua III-1032, IV-129
Tian, Fengzhan III-645
Tian, Yang III-611
Tian, Ying-Jie III-669, III-693, III-882
Ting, Sun III-129
Tiyyagura, Sunil R. I-128
Tobis, Michael I-931
Tokinaga, Shozo IV-162
Toma, Ghiocel II-1045
Tong, Hengqing III-162
Tong, Qiaohui III-162
Tong, Weiqin III-42
Tong, Xiao-nian IV-1147
Top, P. I-1074
Trafalis, Theodore B. I-176, I-1130
Treur, Jan II-888
Trinder, Phil II-617
Trunfio, Giuseppe A. I-567, I-866
Tsai, Wu-Hong II-673
Tseng, Ming-Te IV-275
Tsoukalas, Lefteri H. I-1074, I-1083
Tucker, Don I-86
Turck, Filip De I-454
Turovets, Sergei I-86
Uchida, Makoto II-649
Uğur, Aybars II-158
Ülker, Erkan II-49
Unold, Olgierd II-1210
Urbina, R.T. II-194
Uribe, Roberto I-229
Urmetzer, Florian II-792
Vaidya, Binod IV-717
Valuev, I. III-199
Vanrolleghem, Peter A. I-454
Vasenkov, Alex I-858
Vasyunin, Dmitry III-191
Vehí, Josep II-166

Veloso, Rene Rodrigues III-281
Venkatasubramanian, Venkat I-963
Venuvanalingam, P. II-288
Vermolen, F.J. I-70
Vías, Jesús M. I-54
Vidal, Antonio M. I-152
Viswanathan, M. III-701
Vivacqua, Adriana S. III-253
Vodacek, Anthony I-1042
Volkert, Jens II-752, II-776
Vuik, C. I-874
Vumar, Elkin III-370
Waanders, Bart van Bloemen I-1010
Wagner, Frederic II-593
Wählisch, Matthias IV-761
Walentyński, Ryszard II-219
Wan, Wei II-538, II-569
Wang, Aibao IV-825
Wang, Bin III-381, III-384
Wang, Chao I-192
Wang, Chuanxu IV-186
Wang, Daojun III-516
Wang, Dejun II-1107
Wang, Haibo IV-194
Wang, Hanpin III-257
Wang, Honggang III-859
Wang, Hong Moon IV-578
Wang, Huanchen IV-51
Wang, Huiqiang III-117, III-141,
III-1077
Wang, J.H. III-1164, III-1228
Wang, Jiabing I-497
Wang, Jian III-1077
Wang, Jian-Ming III-1114
Wang, Jiang-qing IV-1045, IV-1147
Wang, Jianmin I-192
Wang, Jianqin II-569
Wang, Jihui III-448
Wang, Jilong IV-765
Wang, Jing III-685
Wang, Jinping I-102
Wang, Jue III-964
Wang, Jun I-462, I-466, II-974
Wang, Junlin III-1214
Wang, Liqiang III-244
Wang, Meng-dong III-1008
Wang, Naidong III-1146
Wang, Ping III-389
Wang, Pu I-1090

Wang, Qingquan IV-178
Wang, Shengqian II-1037
Wang, Shouyang II-423, II-455, II-486,
III-925, III-933, III-964, IV-106
Wang, Shuliang II-657
Wang, Shuo M. III-1205
Wang, Shuping III-972
Wang, Tianyou III-34
Wang, Wei I-632
Wang, Weinong IV-644
Wang, Weiwu IV-997, IV-1179
Wang, Wenqia I-490
Wang, Wu III-174
Wang, Xianghui III-105
Wang, Xiaojie II-1178
Wang, Xiaojing II-363
Wang, Xin I-1197
Wang, Xing-wei I-575
Wang, Xiuhong III-98
Wang, Xun III-591
Wang, Ya III-153
Wang, Yi I-1230
Wang, Ying II-538, II-569
Wang, You III-1101
Wang, Youmei III-501
Wang, Yun IV-138
Wang, Yuping IV-1159
Wang, Yunfeng III-762
Wang, Zheng IV-35, IV-218
Wang, Zhengning II-57
Wang, Zhengxuan II-97
Wang, Zhiying IV-251
Wang, Zuo III-567
Wang, Kangjian I-482
Warfield, Simon K. I-980
Wasynczuk, O. I-1074
Wei, Anne. IV-506
Wei, Guozhi. IV-506
Wei, Lijun II-482
Wei, Liu II-146
Wei, Liwei II-431
Wei, Wei II-538
Wei, Wu II-363
Wei, Yi-ming III-1004
Wei, Zhang III-611
Weihrauch, Christian I-747
Weimin, Xue III-551
Weissman, Jon B. I-1222
Wen, Shi-qing III-1172
Wendel, Patrick III-204

Wenhong, Xia III-551
Whalen, Stephen I-980
Whangbo, T.K. III-701
Wheeler, Mary F. I-1213
Wibisono, Adianto III-191
Widya, Ing III-424
Wilhelm, Alexander II-752
Willcox, Karen I-1010
Winter, Victor III-790
Wojdyla, Marek II-558
Wong, A. I-1155
Woods, John I-111
Wu, Cheng-Shong IV-797
Wu, Chuansheng IV-1116
Wu, Chunxue IV-773
Wu, Guowei III-419
Wu, Hongfa IV-114
Wu, Jiankun II-1107
Wu, Jian-Liang III-320, III-389, III-457
Wu, Jianpin IV-801
Wu, Jianping IV-817, IV-833
Wu, Kai-ya III-980
Wu, Lizeng II-978
Wu, Qiuxin III-397
Wu, Quan-Yuan I-462, I-466,
II-974, III-121
Wu, Ronghui III-109, III-571
Wu, Tingzeng III-397
Wu, Xiaodan III-762
Wu, Xu-Bo III-567
Wu, Yan III-790
Wu, Zhendong IV-376
Wu, Zheng-Hong III-493
Wu, Zhijian IV-1131
Wyborn, D. III-1156
Xexeo, Geraldo III-253
Xi, Lixia IV-1091
Xia, Jingbo III-133
Xia, L. II-1083
Xia, Na IV-1139
Xia, Xuewen IV-1124
Xia, ZhengYou IV-90
Xian, Jun I-102
Xiang, Li III-1138
Xiang, Pan II-25
Xiao, Hong III-113
Xiao, Ru Liang II-681, II-736
Xiao, Wenjun IV-67
Xiao, Zhao-ran III-1214

Xiao-qun, Liu I-546


Xiaohong, Pan IV-957
Xie, Lizhong IV-801
Xie, Xuetong III-653
Xie, Yi I-640
Xie, Yuzhen II-268
Xin-sheng, Liu III-453
Xing, Hui Lin III-1093, III-1146,
III-1151, III-1156, III-1205
Xing, Wei II-712
Xing, Weiyan IV-961
Xiong, Liming III-329, III-397
Xiong, Shengwu IV-1155
Xiuhua, Ji II-130
Xu, B. III-1228
Xu, Chao III-1197
Xu, Chen III-571
Xu, Cheng III-109
Xu, H.H. III-1156
Xu, Hao III-289
Xu, Hua II-956
Xu, Jingdong IV-877
Xu, Kaihua III-153
Xu, Ke IV-506
Xu, Ning IV-1155
Xu, Wei III-964
Xu, Wenbo I-278, I-294
Xu, X.P. III-988
Xu, Xiaoshuang III-575
Xu, Y. I-1074
Xu, Yang II-736
Xu, Yaquan IV-194
Xu, You Wei II-736
Xu, Zhaomin IV-725
Xu, Zhenli IV-267
Xue, Gang III-273
Xue, Jinyun III-469
Xue, Lianqing IV-841
Xue, Wei I-529
Xue, Yong II-538, II-569
Yamamoto, Haruyuki III-1146
Yamashita, Satoshi II-178
Yan, Hongbin IV-1
Yan, Jia III-121
Yan, Nian III-806
Yan, Ping I-1090
Yan, Shi IV-522
Yang, Bo III-1012
Yang, Chen III-603

Yang, Chuangxin I-497
Yang, Chunxia IV-114
Yang, Deyun II-1021, II-1029, II-1075
Yang, Fang I-221
Yang, Fangchun IV-785
Yang, Hongxiang II-1029
Yang, Jack Xiao-Dong I-834
Yang, Jianmei IV-82
Yang, Jihong Ou I-160
Yang, Jincai IV-1175
Yang, Jong S. III-432
Yang, Jun I-988, II-57
Yang, Kyoung Mi III-559
Yang, Lancang III-615
Yang, Seokyong IV-636
Yang, Shouyuan II-1037
Yang, Shuqiang III-717
Yang, Weijun III-563
Yang, Wu III-611
Yang, Xiao-Yuan III-677
Yang, Xuejun I-474, IV-921
Yang, Y.K. III-701
Yang, Young-Kyu IV-660
Yang, Zhenfeng III-212
Yang, Zhongzhen III-1000
Yang, Zong-kai III-587, IV-873
Yao, Kai III-419, III-461
Yao, Lin III-461
Yao, Nianmin III-50, III-66, III-157
Yao, Wenbin III-50, III-157
Yao, Yangping III-1146
Yazici, Ali II-186
Ye, Bin I-278
Ye, Dong III-353
Ye, Liang III-539
Ye, Mingjiang IV-833
Ye, Mujing I-1066
Yen, Jerome I-554
Yeo, Sang-Soo IV-693
Yeo, So-Young IV-344
Yeom, Heon Y. II-577, III-26, IV-705
Yi, Chang III-1069
Yi, Huizhan IV-921
Yi-jun, Chen IV-741
Yi, Sangho IV-905
Yim, Jaegeol IV-652
Yim, Soon-Bin IV-457
Yin, Jianwei II-1115
Yin, Peipei I-192
Yin, Qingbo III-10, III-149

Ying, Weiqin IV-997, IV-1061, IV-1124, IV-1179
Yongqian, Lu III-166
Yongtian, Yang III-129
Yoo, Gi-Hyoung III-894
Yoo, Jae-Soo II-154, III-813
Yoo, Kwan-Hee II-154
Yoon, Ae-sun II-1170
Yoon, Jungwon II-760
Yoon, KyungHyun II-114, II-162
Yoon, Seokho IV-360
Yoon, Seok Min IV-578
Yoon, Won Jin IV-550
Yoshida, Taketoshi IV-150
You, Jae-Hyun II-515
You, Kang Soo III-894
You, Mingyu I-168
You, Xiaoming IV-1068
You, Young-Hwan IV-344
Youn, Hee Yong IV-566
Yu, Baimin III-937
Yu, Beihai III-960
Yu, Chunhua IV-1139
Yu, Fei III-109, III-555, III-571
Yu, Jerey Xu I-270
Yu, Lean II-423, II-486, II-494,
III-925, III-933, III-937, IV-106
Yu, Li IV-1175
Yu, Shao-Ming IV-227, IV-259
Yu, Shun-Zheng I-640
Yu, Weidong III-98
Yu, Xiaomei I-810
Yu-xing, Peng IV-393
Yu, Zhengtao IV-598
Yuan, Jinsha II-978, III-531
Yuan, Soe-Tsyr IV-433
Yuan, Xu-chuan IV-158
Yuan, Zhijian III-717
Yuanjun, He II-146
Yue, Dianmin III-762
Yue, Guangxue III-109, III-555, III-571
Yue, Wuyi IV-210, IV-352
Yue, Xin II-280
Yuen, Dave A. I-62
Yuen, David A. III-1205
Zabelok, S.A. I-850
Zain, Abdallah Al II-617
Zain, S.M. II-335
Zaki, Mohamed H. II-263

Zambreno, Joseph III-734
Zand, Mansour III-790
Zapata, Emilio L. I-54
Zechman, Emily I-1058
Zeleznikow, John I-270
Zeng, Jinquan II-355, IV-166
Zeng, Qingcheng III-1000
Zeng, Z.-M. IV-777
Zha, Hongyuan I-334
Zhan, Mingquan III-377
Zhang, Bin I-286, I-995
Zhang, CaiMing II-17
Zhang, Chong II-327
Zhang, Chunyuan IV-961
Zhang, Defu II-482
Zhang, Dong-Mei III-1114
Zhang, Fangfeng IV-59
Zhang, Gexiang IV-243, IV-989
Zhang, Guang-Zheng I-78
Zhang, Guangsheng III-220
Zhang, Guangzhao IV-825
Zhang, Guoyin III-105
Zhang, H.R. III-1223
Zhang, J. III-1093
Zhang, Jing IV-765
Zhang, Jingping II-319, II-331
Zhang, Jinlong III-960
Zhang, Juliang II-499
Zhang, Keliang II-409
Zhang, L.L. III-1164
Zhang, Li III-599
Zhang, Li-fan I-562
Zhang, Lihui III-563
Zhang, Lin I-1026
Zhang, Lingling III-906
Zhang, Lingxian III-547
Zhang, Miao IV-833
Zhang, Min-Qing III-677
Zhang, Minghua III-58
Zhang, Nan IV-35
Zhang, Nevin L. IV-26
Zhang, Peng II-499
Zhang, Pengzhu IV-174
Zhang, Qi IV-1139
Zhang, Ru-Bo III-125, III-174
Zhang, Shenggui III-338
Zhang, Shensheng III-58
Zhang, Sumei III-448
Zhang, Weifeng II-1147
Zhang, Wen III-964, IV-150

Zhang, Xi I-1213
Zhang, Xia III-362
Zhang, XianChao III-293, III-661
Zhang, Xiangfeng III-74
Zhang, Xiaoguang IV-1091
Zhang, Xiaoping III-645
Zhang, Xiaoshuan III-547
Zhang, Xuan IV-701, IV-853
Zhang, Xueqin III-615
Zhang, Xun III-933, III-964, III-1032
Zhang, Y. I-972
Zhang, Y.M. III-1223
Zhang, Yafei I-196
Zhang, Yan I-632
Zhang, Ying I-474
Zhang, Yingchao IV-114
Zhang, Yingzhou II-1147
Zhang, Zhan III-693
Zhang, Zhen-chuan I-575
Zhang, Zhiwang II-490
Zhangcan, Huang IV-1005
Zhao, Chun-feng III-1197
Zhao, Guosheng III-1077
Zhao, Hui IV-166
Zhao, Jidi IV-51
Zhao, Jie III-984
Zhao, Jijun II-280
Zhao, Jinlou III-911
Zhao, Kun III-882
Zhao, Liang III-583
Zhao, Ming I-964
Zhao, Ming-hua III-1101
Zhao, Qi IV-877
Zhao, Qian III-972
Zhao, Qiang IV-1021
Zhao, Qingguo IV-853
Zhao, Ruiming III-599
Zhao, Wen III-257
Zhao, Wentao III-42
Zhao, Xiuli III-66
Zhao, Yan II-689
Zhao, Yaolong II-550
Zhao, Yongxiang IV-1155
Zhao, Zhiming III-191, III-216
Zhao, Zun-lian III-1012
Zheng, Bojin IV-1029, IV-1037,
IV-1171, IV-1179
Zheng, Di I-462, I-466, II-974
Zheng, Jiping II-1131
Zheng, Lei II-538, II-569

Zheng, Rao IV-138
Zheng, Ruijuan III-117
Zheng, SiYuan II-363
Zheng, Yao I-318, I-482
Zheng, Yujun III-469
Zhengfang, Li IV-283
Zhiheng, Zhou IV-283
Zhong, Shaobo II-569
Zhong-fu, Zhang III-453
Zhou, Bo I-196
Zhou, Deyu II-378
Zhou, Jieping III-516
Zhou, Ligang II-494
Zhou, Lin III-685
Zhou, Peiling IV-114
Zhou, Wen II-689
Zhou, Xiaojie II-33
Zhou, Xin I-826
Zhou, Zongfang III-1062

Zhu, Aiqing III-555


Zhu, Changqian II-57
Zhu, Chongguang III-898
Zhu, Egui III-575
Zhu, Jianhua II-1075
Zhu, Jiaqi III-257
Zhu, Jing I-46
Zhu, Meihong II-401
Zhu, Qiuming III-844
Zhu, Weishen III-145
Zhu, Xilu IV-1183
Zhu, Xingquan III-685, III-781
Zhu, Yan II-1067
Zhuang, Dong IV-82
Zienkiewicz, O.C. III-1105
Zong, Yu III-661
Zou, Peng III-742
Zuo, Dong-hong IV-873
Zurita, Gustavo II-799
