
Intelligent Communication Systems

Edited by:
AL-Dahoud Ali
Walid A. Salameh
Linda Smail

ISBN 978-1456392291
Ubiquitous Computing and Communication Journal

Phone: +1-347-4149239 FAX: +1-212-901-6990 info@ubicc.org
UbiCC Journal
Disseminator of Knowledge
Intelligent Communication System

First Edition

Editor
AL-Dahoud Ali
Al-Dahoud is an associate professor at Al-Zaytoonah University, Amman, Jordan. He received his
High Diploma from FON University, Belgrade, in 1986 and his PhD from La Sabianza1/Italy
and Kiev Polytechnic/Ukraine in 1996. He has worked at Al-Zaytoonah University
since 1996. He has served as a visiting professor at many universities in
Jordan and the Middle East, and as a supervisor of master's and PhD degrees in computer
science. He established the ICIT conference in 2003 and has been its program
chair since then. He was the Vice President of the IT committee in the Ministry of
Youth/Jordan in 2005 and 2006. Al-Dahoud was the General Chair of ICITST-2008, June 23–28,
2008, Dublin, Ireland (www.icitst.org).
He has directed and led many projects sponsored by NUFFIC/Netherlands.
His hobby is conference organization, so he has participated in the following conferences as general
chair, program chair, session organizer or member of the publicity committee:
- ICITs, ICITST, ICITNS, DepCos, ICTA, ACITs, IMCL, WSEAS, and AICCSA
Journal activities: Al-Dahoud has worked as editor in chief, guest editor or editorial board member of
the following journals:
Journal of Digital Information Management, IAJIT, Journal of Computer Science, Int. J. Internet
Technology and Secured Transactions, and UBICC.
He has published many books and journal papers, and has participated as a keynote speaker in many
conferences worldwide.

Editorial Board
Walid A. Salameh
Walid A. Salameh is a professor of Computer Science. He received his bachelor's degree from
Yarmuk University, Jordan, in 1984. He received his MSc and PhD in 1987 and
1991 respectively from the Department of Computer Engineering, METU. He has
published more than 62 papers in the areas of neural networks, computer
networks and e-learning paradigms. His recent research interests are in
building sustainable and efficient e-learning paradigms and architectures that
serve the goals of learning outcomes. He is a member of the editorial boards of several journals
and has contributed as a guest editor to several books.
Linda Smail
Linda Smail has been an associate professor at the School of Arts and Sciences at the New York Institute
of Technology since September 2006. She holds a PhD in applied
mathematics from the University of Marne-la-Vallée (Paris-East), France.
Until 2005, she worked as a researcher at the Laboratory of Analysis and
Applied Mathematics at the University of Marne-la-Vallée. Through this
research, which was sponsored by INRIA (the French National Institute for
Research in Computer Science and Control), she has, along with members of the research
team, developed new and more efficient algorithms to compute distributions and conditional
probabilities in large Bayesian networks. She has developed an extensive knowledge of Bayesian
networks, their applications, and all related computing techniques.
Her research interests include graphical models and machine learning, and her recent research
focuses on exact inference algorithms for Bayesian networks, in particular for computing probabilities
and conditional probabilities.
COPYRIGHT © 2010
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other way, and storage in data banks.
Duplication of this publication or parts thereof is permitted only under the provisions of the copyright
law 1965, in its current version, and permission for use must always be obtained from UBICC
Publishers. Violations are liable to prosecution under the copyright law.
UBICC Journal is a part of UBICC Publishers
Typesetting: Camera-ready by author, data conversion by UBICC Publishing Services
UbiCC Journal is indexed by:
• EBSCOhost Online Research Databases
• The Index of Information Systems Journals
• Documents in Computing and Information Science (DOCIS)
• Ulrich's Periodicals Directory
• Directory of Open Access Journals (DOAJ)
• Microsoft LIBRA for Academic Search
For permission to use material from this text, contact us by
Tel +1-347-4149239
Fax +1-212-901-6990
www.ubicc.org
Preface
The book title is taken from the conference title: the first annual International Conference
on Information and Communication System (ICICS09).
The conference received over 160 papers and accepted 90. Of these, 20 were selected as
conference best papers, and 9 are published in this book.
The book consists of 9 chapters that cover the following areas:
- Security And Cryptography
- Mobile Agents
- Evidence Mining
- Navigation Systems
- Software Development
- Vector Quantization
- Security in Internet Communication
- Communication through Expressing
- Intrusion Detection Systems
This book presents a collection of research papers in the areas of computer science and
communication systems. It is suitable for new researchers and PhD students because it contains
up-to-date research in the mentioned fields and many references for related topics.
Al-Dahoud Ali (Ph.D.)

Organization of Book
Intelligent Communication System, 1st Edition consists of 8 chapters:
Chapter 1 - Security Approaches in Internet Communication
Chapter 1 aims to examine security approaches in internet communication. For this purpose, the role of
the coding science, “cryptology”, in providing secure internet communication and the related techniques
are explained within the chapter. Furthermore, as an example of the use of cryptology
techniques, an e-mail application, which was developed to send or receive encrypted e-mail
messages, is introduced in this chapter.
Chapter 2 - Digital Forensics Evidence Mining Tool
This chapter presents an efficient digital forensics mining tool developed to help cybercrime
investigators in evidence collection and analysis by providing various forensically important features.
Chapter 3 - Achieving Unconditional Security by Quantum Cryptography
Chapter 3 presents quantum cryptosystems as a tool to attain unconditional security. It
also describes the well-known protocols used in the field of quantum cryptography.
Chapter 4 - Adaptive Architecture to Protect Mobile Agents
The objective of this chapter is to propose a protocol to protect a mobile agent, based on two
agents: the mobile agent and an investigator agent. The investigator is a prototype of the mobile
agent with no critical code or data. It is created and sent by the mobile agent in order to be
executed first. On its return, the investigator agent is analyzed by the mobile agent to detect any
malicious action. If forbidden actions are detected, the mobile agent makes a new copy
and changes destination. If the actions are doubtful, the mobile agent chooses an adaptation plan and
then migrates. If all actions are authorized, the mobile agent migrates with confidence.
Chapter 5 - Communication through expressing and remixing: Workshop and System
This chapter presents our participatory expressive workshop and an information system to
support it. The aim of this research is to cultivate communication in local communities. Expressing
thoughts is a first step toward communicating with each other, and creating new stories by remixing
others' expressions helps participants to exchange and grasp each other's thoughts. We propose a
workshop program and a model of content circulation, and develop a system to realize them. Our system
supports decomposing and recomposing through automatic draft content generation. We implemented the
system for a workshop in which participants created content based on a format of expression
named photo-attached acrostics. Through observation of the practice, we concluded that our
framework could help content decomposition.
Chapter 6 - IPAC System For Controlling Devices over the Internet
This chapter proposes an appliance control system, named Internet and PC Based Appliance Control
(IPAC), using concepts of parallel port programming. IPAC is designed to control a device from a PC
and from the Internet. It can be applied in any smart infrastructure to automate the device and can
work with almost every type of automation method, whether wired (e.g. LAN) or wireless (e.g.
Bluetooth). The system can be applied in designing smart homes, secure homes, centralized
device control systems, Bluetooth control systems and WAP control systems.
Chapter 7 - Requirements engineering and traceability in agile software development
This chapter discusses problems concerning the conduct of requirements
engineering activities in agile software development. It also suggests some improvements to
address challenges caused by agile requirements engineering practices in large projects, such as
properly identifying and handling critical (including non-functional) requirements, documenting and
managing requirements documentation, and keeping agile teams in contact with outside customers.
Finally, the chapter discusses the requirements traceability problem in agile software development
and suggests some ideas for maintaining the traceability links between agile software artefacts, to help
developers comprehend parts of the system and to keep agile software artefacts consistent
during refactoring.
Chapter 8 - Robust encoding of the FS1016 LSF parameters: Application of the channel
optimised trellis coded vector quantization
This chapter illustrates an optimized trellis
coded vector quantization (OTCVQ) scheme designed for robust encoding of the LSF parameters.
The objective of this system, initially called the "LSF-OTCVQ Encoder", is to achieve a low bit-rate
quantization of the FS1016 LSF parameters. The efficiency of the LSF-OTCVQ encoder (with
weighted distance) was first demonstrated in the ideal case of transmission over a noiseless channel.
We then turned to improving its robustness for real transmission over a noisy
channel. To implicitly protect the transmission parameters of the LSF-OTCVQ encoder
incorporated in the FS1016, we used joint source-channel coding carried out by the channel
optimized vector quantization (COVQ) method. For transmission over a noisy channel,
we show that the new encoding system, called the "COVQ-LSF-OTCVQ Encoder", can
contribute significantly to improving the performance of the FS1016 by ensuring good
coding robustness of its LSF spectral parameters.
Chapter 1
Security Approaches in Internet Communication
Utku KOSE
Afyon Kocatepe University, Turkey
ABSTRACT
In today’s world, information security is an essential factor, which must be taken
into consideration to ensure secure applications and services in information technology.
Since the inception of the information technology concept, there has been
remarkable interest in security approaches that aim to protect information or digital
data. In particular, the rise of the Internet and internet technologies has prompted the
search for newer approaches, methods and techniques that try to provide
secure internet communication sessions for computer users over the Internet. In
this sense, this chapter aims to examine security approaches in internet
communication. For this purpose, the role of the coding science, “cryptology”, in
providing secure internet communication and the related techniques
are explained within the chapter. Furthermore, as an example of the
use of cryptology techniques, an e-mail application, which was developed to
send or receive encrypted e-mail messages, is introduced in this chapter.
Keywords: information security, internet communication, encryption techniques.
INTRODUCTION
Nowadays, security is an important factor associated with
almost all fields of modern life, and it has
been a notable concern of humankind for a long time. Basically, the
term “security” can be defined as “the protection of a person, property or
organization from an attack” [1]. It also has more specific meanings that are
used to describe similar situations, aspects and features in different fields of life.
Additionally, there are different concepts that are derived from the
term “security” and used to define security approaches and techniques in different
fields. The term “information security” is one of these concepts and it is mostly
associated with information technology.
Briefly, information security is described as protecting information or
digital data against any attack that can be performed by using different attacking
technologies, methods and techniques [2]. The popularity and
extent of information security are connected with advancements,
developments and improvements in the field of information technology. In particular,
the rise of the Internet and internet technologies has caused rapid developments
and improvements in information security and shaped its current state.
Today, it is more important than ever to ensure secure web services and web
applications for people who use these technologies to do their work and
communicate with other people all over the world. Providing
secure internet communication between two people has become an especially important
subject that must be taken into consideration for protecting sent or received digital data.
Because of this, there are many different approaches, methods and techniques
that try to provide secure internet communication sessions for people over the
Internet.
This chapter aims to examine the foremost information security approaches,
especially in internet communication. Additionally, the role of cryptology in providing
secure communication sessions and the related methods or techniques
are examined and explained in the chapter. To this end,
the principles of encryption techniques such as private-key encryption and public-key
encryption and their usage in internet communication systems or applications
are explained. As an example of the use of these cryptology techniques, an e-mail
application, which was designed and developed for sending or
receiving encrypted e-mail messages, is also introduced in the chapter. This
application employs a private-key encryption algorithm, which aims to provide an
effective approach to encrypting the mail message and related attachments. By
explaining the structure of this algorithm and the features of the related application, the
chapter tries to give more concrete ideas about security approaches in internet
communication.
The chapter is organized as follows: The second section explains the foremost
approaches, methods and techniques that can be used to ensure secure internet
communication. This section also introduces some systems and programs that can
be used to provide security in communication sessions. The third section then
introduces the coding science, cryptology, and briefly explains the
features of widely used encryption techniques. Next, the fourth section
introduces the related e-mail application and finally, the chapter ends with a
discussion and conclusions section.
ENSURING SECURITY IN INTERNET COMMUNICATION

In order to ensure security in internet communication, many different
approaches, methods and techniques have been introduced over time. Additionally,
different kinds of systems and programs have been designed and developed
to implement these approaches, methods or techniques in the internet
environment. All of these developments and improvements aim to
ensure secure communication over the Internet. At this point, the concept of “secure
communication” must be defined in order to understand the subject better.
Secure communication can be defined as performing communication in
ways that make a third party unable to listen to the communication session [3, 4].
Nowadays, this concept is used in the form “secure internet communication” to describe
the security aspects of internet communication.
Security Approaches in Internet Communication

Security approaches in internet communication can be examined under
different categories and titles. In this section, the foremost approaches, methods
and techniques are considered in order to explain the subject briefly. In
internet communication, security can be categorized under three main titles.
These are [5, 6]:
1 Hiding the content,
2 Hiding individuals of the communication,
3 Hiding the communication environment.
The first title, “Hiding the content”, covers the approaches that enable users
to hide the contents of messages (information or digital data) that are received or sent
during the communication session. The related approaches include “encryption”,
“steganography” and “identity based systems” [5, 6]. Encryption is the
technique of making information or digital data unintelligible by means of special mathematical
functions. Encryption is explained in more detail in Section 3. Steganography
is an approach that is used to hide information or digital data in digital media such as
voice, video and pictures. For an image, this can be done by replacing the least significant bit of
each pixel of the chosen media with a bit of the hidden data [7]. Figure 1 shows a representative
sample that explains steganography. The last approach, identity based systems,
tries to provide secure communication by evaluating each user’s identity. In this way,
the communication channel is open only to trusted users.
Figure 1. A representative sample that explains steganography
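The least-significant-bit idea described above can be illustrated with a minimal Python sketch. It is not taken from the chapter or from [7]. It assumes that the cover image is a flat list of 8-bit grayscale pixel values, and the function names embed and extract are hypothetical, chosen only for this illustration.

```python
def embed(pixels, message):
    """Hide the bits of `message` in the lowest bit of successive pixel values."""
    # Flatten the message into bits, most significant bit of each byte first.
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for idx, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then set it to the message bit.
        stego[idx] = (stego[idx] & ~1) | bit
    return stego


def extract(pixels, n_bytes):
    """Rebuild `n_bytes` bytes from the lowest bits of the first pixels."""
    out = bytearray()
    for byte_idx in range(n_bytes):
        value = 0
        for bit_idx in range(8):
            value = (value << 1) | (pixels[byte_idx * 8 + bit_idx] & 1)
        out.append(value)
    return bytes(out)


# Assumed toy cover: 64 grayscale values standing in for an image.
cover = [(i * 37) % 256 for i in range(64)]
secret = b"hi"
stego = embed(cover, secret)
recovered = extract(stego, len(secret))
# Each pixel changes by at most one intensity level.
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Because only the lowest bit of each value is touched, the stego data differs from the cover by at most one intensity level per pixel, which is why the change is hard to notice visually. A real tool would also have to store the message length and handle colour channels and file formats.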

The second title, “Hiding individuals of the communication”, covers the
approaches that aim to hide the individuals taking part in the communication from third parties or
malicious factors. The term “anonymity” can also be used to describe approaches
under this title. The related approaches include “anonymous systems or
applications” and “trace routing techniques” [5, 6]. Anonymous systems or
applications are software or hardware solutions that enable users to
hide themselves from parties that want to listen to the communication session. For
instance, “anonymous proxies” allow computer users to access the Internet via
substitute addresses in different countries, which makes them harder to trace.
Figure 2 represents a diagram that shows the anonymous proxy approach in simple terms.
In addition to anonymous systems and applications, some special routing
techniques also enable users to become harder to trace.
The last title, “Hiding the communication environment”, refers to the approaches
that allow users to make the communication environment “hidden”. The simplest
way to achieve this is to make the communication environment less popular and
less noticeable on the Internet. Apart from this, a random data flow can also
be created for the communication environment. With a random data flow, the
communication environment becomes harder to detect.
Figure 2. Anonymous proxy approach
Systems and Programs to Ensure Security in Internet Communication

On the market, there are many different types of systems and programs that
are widely used to ensure security during internet communication. These systems
and programs can be categorized under the titles below:
1 Firewalls,
2 Anti Virus-Spyware-Malware Programs,
3 Monitoring Systems,
4 Encryption-Decryption Systems,
5 Secure Messaging (IM) Programs,
6 Secure Conferencing Programs,
7 Other Specific Systems or Programs
Firewalls are programs or hardware systems that are specially
designed and developed to block unauthorized access to a computer system [8].
Nowadays, there are many different firewall programs that can be installed and
used on an operating system. Software companies such as “Agnitum” and
“Check Point Software Technologies Ltd.” develop and provide popular
firewall programs such as “Outpost” and “Zone Alarm”. On the other hand, there are
also more advanced hardware systems that act as firewalls for more advanced
computer systems such as servers.
Anti Virus-Spyware-Malware programs provide security solutions against
dangerous program and code types such as viruses, trojans, spyware and other malware.
Protecting computer systems against these dangerous factors is very
important for ensuring security, especially in internet communication. Today, different
software companies such as “Kaspersky Lab.”, “Eset”, “Symantec”, “McAfee” and
“Trend Micro” provide special programs that combine the functions of Anti Virus-Spyware-Malware
protection mechanisms. Furthermore, there are also more
advanced and effective programs that combine both firewall and Anti Virus-Spyware-Malware
protection mechanisms. These programs are usually called
“Internet Security” programs. As a result of the increasing number of dangerous
programs, code types and other malicious factors, computer users who work
on the Internet frequently often prefer to use “internet security” programs.
Monitoring systems are often used to watch active processes over a network
system. By using this type of system, unwanted activities over the network
can be detected and necessary precautions can be taken against
possible future attacks. In this way, unwanted third parties and malicious
factors in a particular communication session can be detected and removed
promptly. Nowadays, there are many kinds of monitoring systems
developed by different software companies. For instance, “Microsoft” offers a free
network monitoring program named “Microsoft Network Monitor”. Additionally,
another company named “Paessler” works only on network programs and provides
a free network monitoring tool. Finally, “Net Optics” provides many different
advanced monitoring and filtering solutions for communication security.
Another method of achieving secure communication is to use
encryption-decryption systems. On the market, there are hardware and software
based encryption-decryption solutions that try to provide high-level security for
valuable information and stored digital data. Today, encrypting information and
digital data is among the most effective and popular approaches to providing security in
almost all fields of modern life.
By combining different kinds of security approaches, software companies have designed and
developed some secure instant messaging (IM) and conferencing programs.
Some of these programs are also “open source” and
offer “free” and “developing” security solutions for internet communication. For
instance, “Skype” is one of the most popular internet communication programs and
it provides secure voice and chat communication with 128 bit AES and 1024 bit
asymmetrical protocols [6]. On the other hand, “Zfone” is an open source program
that enables users to make secure voice communication. Some popular IM
programs such as “Yahoo Messenger” use secure approaches to provide more
security in their communication services. “WASTE” is another IM program that
uses high strength “end-to-end” encryption and an anonymous network. Unlike
the other IM programs mentioned, WASTE is open source [6].
In addition to the mentioned ones, there are also many more systems or
programs that have been developed to be used for more specific aims. Some of
these systems or programs ensure the security indirectly. Because of this, they
must be used with the support of other security systems or programs.
THE CODING SCIENCE: CRYPTOLOGY AND ENCRYPTION TECHNIQUES
Today, different types of mathematical methods and techniques are used to
send information or digital data from one place to another safely. The related
methods and techniques also enable computer users to store their valuable
information or digital data more securely. These methods and
techniques are studied within “cryptology”. Cryptology is the name of the
science that incorporates both “cryptography” and “cryptanalysis”. Cryptology
can also be called the “coding science”. Cryptography and
cryptanalysis refer to different aspects of cryptology. Because of this, these
terms must be defined in order to understand the main scope of cryptology.
Cryptography is the field of encrypting information or digital data by using
mathematics, computer science and engineering approaches. On the other hand,
cryptanalysis is associated with the study and practice of making encrypted
information or data unencrypted. In this sense, cryptanalysis examines the weak and
strong features of an encryption algorithm designed and developed within
cryptography [9, 10]. In other words, cryptanalysis is the field that is
used to analyze and break safe communication sessions. In order to achieve this,
cryptanalysis employs different types of methods and techniques such as analytic
judgment, applications of mathematical tools and combinations of figure definition.
Figure 3 represents a diagram that shows the fields of cryptology.
Figure 3. Fields of the cryptology
There are three more terms that must be examined within this subject. These
are: cryptographer, cryptanalyst and cryptologist. A cryptographer is a
person whose work or research is based on
cryptography. On the other hand, a person who studies cryptanalysis is called
a cryptanalyst. As can be understood from the previous explanations, both the
cryptographer and the cryptanalyst are cryptologists.
Cryptology is very important for computer users because it provides security for
computer-aided work such as transferring data between computer systems,
designing and developing new computer-based technologies and performing
Internet communication. In today’s world, cryptology is also often used to provide
security in computer-based systems or applications such as e-business, e-marketing,
e-science, e-government and e-signature [11]. At this point, two elements of
cryptology, encryption and decryption techniques, play important roles in determining the
security level provided. Encryption can be defined as transforming
information or digital data with a special function and the encryption key used by
this function. Decryption, on the other hand, is defined as converting the encrypted
information or digital data back to its original, unencrypted form. In encryption work,
two different encryption-key techniques are widely used. These are named
“public-key encryption” and “private-key encryption” respectively. Today, another
approach, called the “Hybrid Cryptosystem”, is also used to combine the
advantages of both public-key and private-key encryption techniques. In order to
give a better idea of cryptology and its effects on Internet communication, the
features of public-key and private-key encryption techniques are explained below.
Public-Key Encryption
Public-key encryption can also be called “asymmetric encryption”.
In this technique, a user who wants to ensure secure
communication needs two different keys, called the “private key”
and the “public key”. Each key has a different role
during the communication. The public key is known by everybody, whereas the private
key is known only by its owner. Encryption is performed with the
public key and decryption is performed with the private key [12,
13]. The public and private keys are connected to each other by
mathematical relations. In other words, encrypted information or digital
data can be decrypted only with the private key that corresponds to the public
key used for encrypting it.
Because of this, it is practically infeasible to decrypt the encrypted information or digital data
with any other private key. In this technique, it is very important that the
user keeps his / her private key hidden from other users, while he / she can share the
public key with other people [14]. Figure 4 represents a diagram that explains a
typical communication session based on the public-key encryption technique.
Figure 4. A typical communication session based on public-key encryption technique
Today, the RSA (Rivest-Shamir-Adleman) algorithm is the most popular approach
that uses the public-key encryption technique. In addition to this algorithm, El
Gamal, PGP (Pretty Good Privacy), Diffie-Hellman key exchange and DSA (Digital
Signature Algorithm) are other widely used approaches that rely on the public-key
encryption technique [14 – 16].
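To make the relation between the public and private keys more concrete, the following is a minimal textbook RSA sketch in Python. It is an illustration only and is not part of the chapter: the tiny primes, the exponent and the function names encrypt and decrypt are assumptions chosen for readability. Real systems use keys of at least 2048 bits together with a padding scheme, and this toy version offers no security.

```python
# Toy key generation with deliberately tiny primes (assumed values).
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)        # Euler's totient of n
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent: inverse of e mod phi (Python 3.8+)


def encrypt(m, public_key):
    """Encrypt the integer m (0 <= m < n) with the public key (e, n)."""
    exponent, modulus = public_key
    return pow(m, exponent, modulus)


def decrypt(c, private_key):
    """Decrypt the integer c with the private key (d, n)."""
    exponent, modulus = private_key
    return pow(c, exponent, modulus)


message = 65                              # a message encoded as a small integer
cipher = encrypt(message, (e, n))         # anyone can do this with the public key
plain = decrypt(cipher, (d, n))           # only the private-key holder can undo it
```

The pair (e, n) can be published, while d must stay secret. The two exponents are linked by the condition that e times d leaves remainder 1 when divided by phi, which is the kind of mathematical connection between the keys that the text refers to.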
Private-Key Encryption
Private-key encryption can also be called “symmetric encryption”.
In this technique, a single key is used for both encrypting and decrypting the
information or digital data [12, 17]. Private-key encryption comes
in two different forms: “block encryption” and “row encryption” (more commonly known as
stream encryption). In block
encryption systems, the original message is separated into fixed-length blocks and
each block is encrypted individually [9]. In this way, a block is mapped to
another fixed-length block over the same alphabet. In the design of block ciphers,
mixing (confusion) and diffusion techniques are used, and these techniques are applied by
using “permutation” and “linear transformation” operations respectively [18]. The
strength of a block encryption algorithm is determined by its S-boxes, the number
of rounds, the use of keys in XOR operations, the block length and the key length. Using
a random key is another important factor in improving the strength of the applied
algorithm [19]. The other approach, row encryption, is a newer form of the permutation
algorithms that were used in the past [9]. Row encryption needs a
long key. Because of this, shift registers with feedback are used to
produce a pseudo-random key. The encrypted message is created by
performing XOR operations between the produced key and the original message. The
receiver must produce the same key in order to decrypt the encrypted
message [9]. Figure 5 represents a diagram that explains a typical communication
session based on the private-key encryption technique.
Figure 5. A typical communication session based on private-key encryption technique
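The second symmetric approach described above, in which the message is XORed with a key produced by a feedback mechanism, can be sketched as follows. This is a hedged illustration rather than the algorithm of any cited source: it assumes a 16-bit linear feedback shift register with a commonly cited tap configuration, and the names lfsr_keystream and xor_with_keystream are hypothetical. A single short LFSR is far too weak for real use.

```python
def lfsr_keystream(seed, n_bytes):
    """Produce n_bytes of key material from a 16-bit Fibonacci LFSR.

    The shared secret is the non-zero 16-bit seed (an assumption of this sketch).
    """
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = bytearray()
    for _byte_idx in range(n_bytes):
        byte = 0
        for _bit_idx in range(8):
            # Feedback bit: XOR of the tapped positions of the register.
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            byte = (byte << 1) | (state & 1)
        out.append(byte)
    return bytes(out)


def xor_with_keystream(data, seed):
    """Encrypt or decrypt: XOR is its own inverse, so one function does both."""
    keystream = lfsr_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


message = b"secret e-mail"
shared_seed = 0xACE1                      # assumed secret known to both sides
ciphertext = xor_with_keystream(message, shared_seed)
recovered = xor_with_keystream(ciphertext, shared_seed)
```

Because sender and receiver regenerate the same key sequence from the same seed, applying the XOR a second time restores the original message. A receiver with a different seed produces a different sequence and recovers only noise.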
Today, the DES (Data Encryption Standard) algorithm is one of the best-known approaches
that use the private-key encryption technique. Additionally, the AES (Advanced
Encryption Standard), IDEA (International Data Encryption Algorithm), Skipjack,
RC5, RC2 and RC4 algorithms are other popular approaches that use the
private-key encryption technique [11, 14 – 16, 19].
The encryption techniques explained above provide different types of security
solutions and approaches for different systems. Because of this, their advantages
and disadvantages must be known in order to choose a suitable technique for a given
system. Private-key encryption is fast, whereas
public-key encryption is slower but does not require a secret key to be shared in advance.
Additionally, private-key
encryption is useful for digital data that is stored on a medium [12]. However, it
may be expensive to ensure security when sharing the private key with other users.
Although public-key and private-key encryption techniques have different
features, both are widely used in different types of applications or systems that
aim to ensure security, especially in Internet communication.
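The trade-off above is the motivation for the hybrid cryptosystem mentioned earlier: the bulk of the message is encrypted with a fast symmetric method, and only the short symmetric key is protected with the slower public-key method. The sketch below is a toy illustration under loud assumptions. It reuses textbook RSA with tiny primes, wraps the session key byte by byte (which is insecure), and uses a SHA-256 based keystream as a stand-in for a real symmetric cipher such as AES. All function names are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair (assumed tiny primes, for illustration only).
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)  # requires Python 3.8+


def keystream(key, n_bytes):
    """Stand-in symmetric keystream: SHA-256 of the key plus a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])


def symmetric_xor(data, key):
    """Fast symmetric step: XOR the data with the keystream (self-inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def hybrid_encrypt(message, public_key):
    exponent, modulus = public_key
    session_key = secrets.token_bytes(16)          # fresh random symmetric key
    # Slow public-key step, applied only to the short session key.
    wrapped = [pow(b, exponent, modulus) for b in session_key]
    return wrapped, symmetric_xor(message, session_key)


def hybrid_decrypt(wrapped, body, private_key):
    exponent, modulus = private_key
    session_key = bytes(pow(c, exponent, modulus) for c in wrapped)
    return symmetric_xor(body, session_key)


message = b"message text and attachments"
wrapped, body = hybrid_encrypt(message, (e, n))
decrypted = hybrid_decrypt(wrapped, body, (d, n))
```

In this arrangement the cost of the public-key operation does not grow with the message size, and the sender never has to deliver a secret key to the receiver through a separate secure channel.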
A SECURE APPLICATION FOR E-MAIL COMMUNICATION

In order to give more concrete ideas about security approaches in internet
communication, a sample e-mail application can be examined in detail. The
developed application enables computer users to encrypt their message text and
attachments and send the encrypted content to the receiver(s) via an easy-to-use
interface. Decryption of the message is done by the receiver(s) with
the same application. The application comes with a simple but sufficiently strong
private-key encryption algorithm to ensure security for sent or received e-mail
messages. Before the encryption algorithm is explained, the
features and interface of the developed application are described.
Using Features of the Application

The e-mail application was designed and developed in the C# programming
language. Object-oriented programming methods and techniques allowed the
developers to create a fast, stable and simple application structure. The
interfaces of the application were built with simple but effective controls.
Coding and design were performed on the Microsoft Visual Studio 2005 platform.
Figure 6 shows a screenshot of the application.
Figure 6. A screenshot from the application
The e-mail application has a user-friendly design and simple controls that
enable computer users to perform the related operations in a short time. The
application comes with three different interfaces, which can be used to perform
different operations related to e-mail communication. The user can switch
between these interfaces by using the provided controls. The first interface
shows the folders of the configured mail address (inbox, sent box, etc.). The
other two interfaces are used for encrypting plain mail messages and decrypting
received encrypted ones. In this way, the same application can be used by both
sender and receiver to ensure secure communication. The working structure of the
developed communication system is shown briefly in Figure 7.
Figure 7. Working structure of the developed communication system
This simple but sufficiently strong application employs a private-key
encryption algorithm, which enables users to encrypt their original mail message
text and related attachments with some basic mathematical functions. In this way,
a secure e-mail communication channel between two users can be realized easily.
To give a better idea of the security aspects of the e-mail application, the
features of this algorithm are explained in detail below.
The Encryption Algorithm

The encryption algorithm provided in the developed application uses several
approaches to transform the original message text into more complex and
different forms of data. First, the whole message text is divided into blocks to
improve the effectiveness of the encryption approach; variable blocks are used
instead of constant blocks. Additionally, the security level of the algorithm is
increased by producing new random numbers for each encryption process. Moreover,
the algorithm requires longer encryption keys for longer message texts, so
longer message texts are encrypted with more complex keys. The most important
feature of the algorithm is that it is based on the private-key encryption
technique. With the support of private-key encryption, the developed algorithm
offers an effective and fast encryption process. Furthermore, it also supports
encrypting large amounts of data.
With the developed algorithm, the encryption process is performed and
completed in four steps. These are:
1. Separation of the original text,
2. Random permutation production method,
3. Key production and bit-level encryption (XOR method),
4. Production of the key and the encrypted data.
These steps are explained in more detail in the next subsection.
Encryption steps
The first step of the encryption process separates the original message text
into a number of text blocks. This number is defined automatically according to
the character count of the text. Separation of the text is performed according
to two rules: If the character count of the original text can be divided into
three and a half or seven, the original text is separated into the character
count. Otherwise, the character count is set to a definite number, which can be
divided into five. In order to achieve this, some space characters are added to
the original message text.
In the next step, the random permutation production method is used to change
the original content of the message text. For this purpose, random numbers are
produced for each block obtained in the first step. The positions of the
characters in each block are changed according to the produced random numbers.
The result is a simply encrypted text created with the permutation method. With
the permutation method, the text characters themselves are preserved; only their
positions are changed [17].
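A minimal sketch of such a per-block permutation is given below. It assumes that the random numbers form a random ordering of the character positions and that this ordering is stored so the permutation can be undone; the function names are hypothetical and `random.Random` is used only for illustration.

```python
import random

def permute_block(block: str, rng: random.Random):
    """Shuffle the characters of one block.

    Returns the scrambled block and the position order that was used,
    which must be kept in order to reverse the permutation.
    """
    order = list(range(len(block)))
    rng.shuffle(order)
    scrambled = "".join(block[i] for i in order)
    return scrambled, order

def unpermute_block(scrambled: str, order) -> str:
    """Put every character back at its original position."""
    original = [""] * len(scrambled)
    for new_pos, old_pos in enumerate(order):
        original[old_pos] = scrambled[new_pos]
    return "".join(original)

rng = random.Random()
scrambled, order = permute_block("HELLO", rng)
restored = unpermute_block(scrambled, order)
```

The scrambled block always contains exactly the same characters as the input, which matches the remark that the permutation preserves the characters and changes only their positions.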
In the third step, the final form of the encrypted message text is obtained.
Random numbers between 0 and 9 are produced, one for each character of the
permuted text, and each character is encrypted at bit level with its randomly
produced key number. During the encryption process, the XOR (eXclusive OR)
method is used: the output bit is “1” if the two input bits are different;
otherwise, the output bit is “0”. As a result, a set of new characters is
obtained for the last form of the message text. Table 1 shows some examples of
the XOR encryption process for different characters.
Table 1. Some examples for the XOR encryption process of different characters.
Character   (binary)    Key   (binary)    Encrypted character   (binary)
Z           01011010    3     00000011    Y                     01011001
A           01000001    7     00000111    F                     01000110
J           01001010    9     00001001    C                     01000011
U           01010101    1     00000001    T                     01010100
M           01001101    5     00000101    H                     01001000
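The rows of Table 1 can be reproduced with a short Python sketch. It assumes, as the binary columns of the table indicate, that the key digit is used as a number (3 is 00000011) rather than as the character code of the digit; the function name is hypothetical.

```python
def xor_with_digit(ch: str, digit: int) -> str:
    """XOR the character code of `ch` with a key digit between 0 and 9."""
    return chr(ord(ch) ^ digit)

# (character, key digit) pairs taken from Table 1
rows = [("Z", 3), ("A", 7), ("J", 9), ("U", 1), ("M", 5)]
encrypted = [xor_with_digit(ch, key) for ch, key in rows]

for (ch, key), enc in zip(rows, encrypted):
    # Show each value together with its 8-bit binary form.
    print(f"{ch} ({ord(ch):08b})  {key} ({key:08b})  {enc} ({ord(enc):08b})")
```

Applying the same digit to an encrypted character returns the original one, so the list of digits serves as both the encryption and the decryption key.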
In the last step, the created encryption/decryption key and the encrypted data
are organized in two separate temporary files, and the contents of these files
are transferred to the application interface to be saved by the user in .txt
format. Figure 8 shows a flowchart that briefly explains each step of the
developed encryption algorithm.
Figure 8. Flowchart of the developed encryption algorithm
By using the explained algorithm steps, the developed application allows users
to encrypt both their mail messages and attachments easily. It is also important
to explain the decryption process of the application.
In the decryption process, the application applies the obtained key file to the
encrypted mail message text. In order to understand the decryption steps better,
the main parts of the key file must be examined first. Figure 9 shows a brief
schema of the key file parts.
Figure 9. Parts of the key file
As can be seen from Figure 9, the key file consists of two different parts,
called “Part 1” and “Part 2” respectively. During the decryption process, the
application first uses “Part 1” to recover the form of the text before the
bit-level encryption (XOR). Afterwards, the original message text is recovered
by using “Part 2”. This original message text is the decrypted text for the
receiver. As a result, the receiver gets the original message in a more secure
way with the help of the mentioned method.
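The two-stage decryption can be sketched as follows. The actual layout of the key file is not given in the text, so this sketch assumes that “Part 1” is the list of XOR key digits and “Part 2” is the position order used by the permutation, and it treats the whole text as a single block for brevity. All names are hypothetical.

```python
def decrypt(cipher: str, part1_digits, part2_order) -> str:
    """Undo the XOR with Part 1, then undo the permutation with Part 2."""
    # Stage 1: reverse the bit-level encryption to get the permuted text.
    permuted = [chr(ord(c) ^ d) for c, d in zip(cipher, part1_digits)]

    # Stage 2: move every character back to its original position.
    original = [""] * len(permuted)
    for new_pos, old_pos in enumerate(part2_order):
        original[old_pos] = permuted[new_pos]
    return "".join(original)

# "YFCTH" is the encrypted column of Table 1; the digits are its key column.
# The position order below (a simple reversal) is an assumed example.
plain = decrypt("YFCTH", [3, 7, 9, 1, 5], [4, 3, 2, 1, 0])
# Stage 1 yields "ZAJUM"; stage 2 reverses it to "MUJAZ".
```

The order of the stages matters: decryption has to undo the encryption steps in reverse, so the XOR is removed before the characters are moved back.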
All of the explained processes are performed by the e-mail application via two
different interfaces, named “Encryption Screen” and “Decryption Screen”
respectively. To give a better idea of how the e-mail application is used, these
interfaces are explained briefly below.
Encryption and Decryption Interfaces of the Application

As mentioned before, the developed e-mail application includes simple and
user-friendly interfaces to provide a fast and easy experience for computer
users. Unlike the e-mail folder interface, the “Encryption Screen” and
“Decryption Screen” have similar interfaces and controls. To encrypt a mail
message text, the user opens the Encryption Screen. Figure 10 shows a
screenshot of the Encryption Screen of the e-mail application.
Figure 10. A screenshot from the Encryption Screen of the e-mail application
To provide a fluent user experience, users can switch between the encryption
and decryption interfaces (screens) by using two buttons located at the top left
of each “screen”. Additionally, users can learn more about the usage of the
related “screen” by using the “Help” button. On the Encryption Screen, the user
can type the original message text in the “Message” field. There are also fields
that correspond to typical e-mail message fields, such as “To” and “From”.
Additionally, the user can add one or more attachments to the message by using
the related controls on this screen. To start the encryption process for the
mail message and related attachments, the “Encrypt” button is used. During the
encryption process, some statistical information can be viewed on the title bar
of this screen. At the end of the process, two .txt files are created for the
produced key and the encrypted data. These files can be saved by the user to any
directory. It is important to use the .txt file type for the encrypted data to
protect its content from foreign characters, which can be produced by other word
processor programs. Moreover, processing time is also lowered by using the .txt
file type for the encrypted data.
After getting the key and the encrypted data, the user can send the message to
the receiver(s) by using the “Send” button located on the Encryption Screen. While
sending the encrypted mail message, restrictions applied by ports or firewalls
may affect the e-mail application. To solve this problem, a remoting
application, hosted by a trusted authority, was developed. With this
application, communication with the authority is performed over port 80 by
using a shaped XML structure that is suitable for the semantic data model.
After receiving the encrypted e-mail, the Decryption Screen of the application
can be used to get the original mail message and its attachments. The Decryption
Screen was designed to be similar to the Encryption Screen. Figure 11 shows a
screenshot of the Decryption Screen of the e-mail application.
Figure 11. A screenshot from the Decryption Screen of the e-mail application
On the Decryption Screen, the receiver can view the encrypted message text
under the “Encrypted Text” title. After choosing the key file, the text can be
decrypted by using the “Decrypt” button. As soon as the decryption process is
finished, the original text is shown in the text field at the bottom of the
Decryption Screen.
The introduced e-mail application provides an effective security solution for
e-mail communication over the Internet. Examining the usage features and
functions of this application gives readers more concrete ideas about security
approaches in today’s Internet communication. There are, of course, many other
kinds of applications and systems that try to ensure security in different
fields of Internet communication.
DISCUSSION–CONCLUSIONS
This chapter explained the foremost information security approaches,
especially in Internet communication. The role of cryptology in securing
communication sessions, and the fields of cryptology relevant to this scope,
were explained in the related sections. To show how cryptology is used in
communication security work, the principles and functions of encryption
techniques such as private-key and public-key encryption were also examined.
An e-mail application that can be used for sending and receiving encrypted
e-mail messages was then introduced to give readers a more concrete idea of how
cryptology and encryption techniques provide security for Internet
communication. The features and functions of this application offer a simple but
sufficiently strong approach that supports the explained subjects about the
security factor in Internet communication.
Today, security is an extremely important subject because information is more
valuable than ever for humankind and great effort is made to protect
“valuable information” from environmental factors. As a result of rapid
developments in technology, more advanced systems that provide better security
solutions for valuable information or digital data are designed and developed
quickly. With these developments, it is expected that the number of attacking
and “security breaking” methods and techniques will decrease over time.
Conversely, however, more attacking and “security breaking” methods and
techniques are designed and developed by malicious people day by day. Because of
this, more research should be performed to keep security precautions one step
ahead of malicious methods and techniques.
Although there are many different, advanced security applications and systems,
the “human factor” is still critical in providing complete security for almost
all fields of modern life. The human factor is the weakest part of even the most
advanced security systems, and it seems that this situation will not change in
the near future. Because of this, people must be trained in the current security
approaches, methods and techniques that can be used to ensure information
security. Moreover, they must also be warned against potential “social
engineering” methods and techniques that exploit the weaknesses of the human
factor. This can be done through the following tasks:
1. Arranging educational seminars or meetings about information security,
2. Attending comprehensive conferences and symposiums about information
security,
3. Following the latest developments and improvements in security
approaches, methods and techniques,
4. Following the latest developments and improvements in attacking and
“security breaking” methods and techniques,
5. Being aware of social engineering methods.
“Getting access to source code…was kind of like the secret ingredient. I wanted to
know what the secret was…”, Kevin David MITNICK
REFERENCES
1. R. Kurtus: What is Security?, Ron Kurtus’ School for Champions, (2002).
[Online] Retrieved April 10, 2010 from:
http://www.school-for-champions.com/security/whatis.htm
2. N. Yalcin, and U. Kose: Sending E-Mail with an Encrypting Algorithm Based
on Private-Key Encryption, In Proceedings of the International Conference on
Information and Communication Systems 2009, pp. 33-37 (2009).
3. D. P. Agrawal, and Q.-A. Zeng: Introduction to Wireless and Mobile Systems,
Thomson, (2005).
4. J. Kurose, and K. Ross: Computer Networking, Addison Wesley, (2003).
5. L. M. Surhone, M. T. Timpledon, and S. F. Markesen: Secure Communication,
Betascript Publishing, (2010).
6. Wikipedia – The Free Encyclopedia: Secure Communication, (2010). [Online]
Retrieved April 13, 2010 from: http://en.wikipedia.org/wiki/Secure_communication
7. S. Sagiroglu, and M. Tunckanat: A Secure Internet Communication Tool,
Turkish Journal of Telecommunications, Vol. 1, No. 1, pp. 1-10 (2002).
8. Wikipedia – The Free Encyclopedia: Firewall (computing), (2010). [Online]
Retrieved April 14, 2010 from: http://en.wikipedia.org/wiki/Firewall_(computing)
9. R. J. Spillman: Classical and Contemporary Cryptology, Prentice Hall, pp. 1-6,
132, 137 (2005).
10. R. A. Mollin: RSA and Public Key Cryptography, Chapman and Hall/CRC,
pp. 1-25, 53 (2003).
11. S. Sagiroglu, and M. Alkan: Electronic Signature in All Respects: E-Signature,
Grafiker Publishing, pp. 2, 8-9, 24, 31 (2005).
12. W. Trappe, and L. C. Washington: Introduction to Cryptography with Coding
Theory, Prentice Hall, pp. 4-6 (2002).
13. M. D. Abrams, S. Jajodia, and H. J. Podell: Information Security: An
Integrated Collection of Essays, Institute of Electrical and Electronics
Engineers, pp. 15, 350-384 (1995).
14. D. R. Stinson: Cryptography Theory and Practice, Chapman and Hall/CRC, pp.
114, 162 (1995).
15. C. Cimen, S. Akleylek, and E. Akyildiz: Mathematics of the Codes:
Cryptography, Middle East Technical University – Center of Society and
Science, (2007).
16. S. Singh: The Code Book: The Science of Secrecy from Ancient Egypt to
Quantum Cryptography, Anchor, (2000).
17. K. Schmeh: Cryptography and Public Key Infrastructure on the Internet,
Heidelberg, pp. 42 (2003).
18. M. T. Sakalli, E. Bulus, A. Sahin, and F. Buyuksaracoglu: Design Techniques
and Power Analysis in Flow Codes, In Proceedings of 9th Annual Academic
Informatics Conference, (2007).
19. S. Andac, E. Bulus, and M. T. Sakalli: Analyzing Strength of Modern Block
Encryption Algorithms, In Proceedings of 2nd Young Researchers Congress of
Engineering Sciences, pp. 87 (2005).
Chapter 2
Digital Forensics Evidence Mining Tool
Khaled Almakadmeh, & Mhammed Almakadmeh

Concordia University, Canada
ABSTRACT
The Internet has created new forms of human interaction through services such
as e-mail, Internet forums and online banking. On the other hand, it has
provided countless opportunities for crimes to be committed. Many digital
techniques have been developed and used to help cybercrime investigators in the
process of evidence collection. In this paper, we develop an efficient digital
forensics mining tool that helps cybercrime investigators in evidence
collection and analysis by providing various forensically important features.
Keywords: Evidence, Digital Forensics, Semantic Search, Cybercrime Investigation
INTRODUCTION
The Internet has provided many services that make life easier for people all
over the world, including e-mail, instant messaging (IM), online banking, and
many other services that most people cannot stop using. However, according to
published statistics, thousands of businesses and government departments, such
as Western Union, Creditcards.com and CD Universe, have been hacked, resulting
in over a billion dollars of damages per year, and these losses are climbing.
This makes the job of law enforcement officers, including cybercrime
investigators, more difficult and complicated because of the large amount of
data that has to be collected and analyzed.
Most cyber criminals use high-technology devices; this requires law enforcement
agencies to have efficient tools and utilities to gather and analyze data from
these devices. These reasons were the primary motivation for conducting our
research in computer forensics and developing our Digital Forensic Evidence
Mining Tool. It is dedicated to helping cybercrime investigators collect and
analyze evidence from suspects’ devices. We have provided features that are
highly needed, helpful and supportive for evidence collection.
Search engines like Google, Yahoo, and many others perform keyword search.
However, cybercrime investigators need to be able to perform a semantically
oriented search. Semantic search [1] provides great flexibility during the
investigation process. For example, the word "cocaine" is not going to be
mentioned frequently in a drug dealer's communications; instead, when an
investigator searches for a word like "cocaine", (s)he expects results that
contain the term cocaine or any related term. Table 1 shows some examples of
terms and their synonyms/hyponyms.
Table 1. Examples of terms and their synonyms/hyponyms

Term            Synonyms/Hyponyms
Cocaine         Blow, Nose Candy, Snow, Crack, Tornado
Bank            Depository, Reserve, Backlog, Stockpile, Deposit, Container,
                Money Resource, Money Box
Investigation   Probe, Inquiry, Enquiry, Research, Investigating
Internet        Net, Cyberspace, System, Electronic Net, Computer Network
Our tool is able to enrich the search with various semantic suggestions that the
investigator can use. While developing our tool we faced many challenges: we had
to take the tool's efficiency, robust functionality, and visualization into
consideration during the whole development cycle. Besides these challenges, our
solution had to be scalable to a large number of files and ready to adopt new
features. In addition, the tool needs to be very responsive; search results
need to be displayed and ready to be processed within a few seconds.
RELATED WORK
In this section, we focus on previous tools and solutions that have been
proposed to help cybercrime investigators. First, we discuss the standalone
utilities used in this field; in subsequent sections we describe how our tool
takes advantage of them by integrating them and providing more customized
features that help cybercrime investigators perform their jobs.
The first utility we use is Google Desktop Search (GDS) [2], provided by Google.
GDS is a desktop search engine that provides full-text search for a wide range
of file types, such as emails, documents of all types, audio files, images, chat
logs, and web pages that the user has visited. What makes it efficient is that
after the initial setup and the first build of the index, indexing occurs only
when the machine is idle. Thus, the machine's performance is not affected. GDS
also stays up to date by monitoring changes to existing files and newly added
files. Last but not least, GDS can find deleted files: it creates cached copies
(snapshots) of all files. These copies are returned in the search results and
can be viewed even if the files have been deleted.
The other utility we use is WordNet [3], a large English lexical database. It
provides nouns, verbs, adjectives and adverbs that are grouped into sets of
cognitive synonyms called “Synsets”. Synsets are interlinked by means of
conceptual-semantic and lexical relations [3]. In [4], indexing with WordNet
Synsets is used to improve text retrieval. We take advantage of this utility to
show the investigator a broad collection of suggestions that (s)he can pass to
GDS. Our developed solution is discussed further in subsequent sections.
PROBLEM STATEMENT
A good problem statement should answer the following questions:
What is the problem?
The investigator needs to be able to query the criminals’ devices to build
knowledge about the information they contain. This knowledge can be used to
provide evidence and/or to prevent future incidents.
Who has the problem?
The intended clients for this solution are cybercrime investigators; they face
difficulties in performing an effective and efficient search of the information
on criminals’ devices.
What is the solution?
A full-featured desktop tool that uses GDS and WordNet to provide semantic
search on a suspect’s computer.
PROPOSED SOLUTION
In this section, we show an overview of our tool’s architecture. Then, we
discuss how each component in the tool contributes to the overall functionality.
After that, we show the use-case and activity diagram of our tool.
System Architecture
The system architecture provides a comprehensive overview of the tool and its
supporting infrastructure. Figure 1 shows the architecture of our tool.
Figure 1. Tool Architecture
Tool Components
The system components are:
Graphical User Interface
WordNet API
Google Desktop SDK
Business Layer
We describe each component from a technical perspective, and explain how
they communicate with each other to handle the submitted task. Then, we present
the implemented features that are of great use to cybercrime investigations.
Graphical User Interface

The Graphical User Interface (GUI) was designed to be simple, intuitive, and
yet very practical. It presents all of the tool's functionalities in a clear and
standard way to minimize the learning curve of the user. The GUI also provides
menus that offer the same functionalities as the main window components;
these menus are intended to help users who are more menu-oriented. Figure 2
shows a screenshot of our tool.
Figure 2. Digital Forensics Evidence Mining Tool
WordNet
For the semantic search functionality, we decided not to search automatically
for all synonyms of the desired term, since this approach would overload the
tool and overwhelm the investigator with a large number of results. Instead, we
designed our tool to search only for the desired term. Figure 3 shows the more
practical, feature-rich suggestion panel. When the forensic investigator enters
a term and hits Enter, the suggestion panel shows a list in the form of a tree
view that contains synonyms, acronyms, sister terms, etc.
Figure 3. Panel shows suggestions for "Cocaine"
In addition, the investigator has further options, such as specifying
whether (s)he wants to look for nouns, verbs, or adjectives that are related to
the previously searched term. Below the suggestion panel there is a definition
window that shows the definition of the word selected in the suggestion panel
and an example of its use. Double-clicking a term in the suggestion panel
initiates a new request to search for that term, and the results are displayed
in a new tab. This on-demand approach keeps the tool responsive, because
additional searches run only for the suggestions the investigator selects.
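The on-demand behaviour of the suggestion panel can be illustrated with a small Python sketch. It does not call the WordNet API; a hard-coded dictionary built from two rows of Table 1 stands in for the lexical database, and the function names are hypothetical.

```python
# Stand-in for WordNet: a tiny lookup table taken from Table 1.
SUGGESTIONS = {
    "cocaine": ["blow", "nose candy", "snow", "crack", "tornado"],
    "investigation": ["probe", "inquiry", "enquiry", "research", "investigating"],
}

def suggest(term: str) -> list:
    """Return related terms to show in the suggestion panel."""
    return SUGGESTIONS.get(term.strip().lower(), [])

def queries_for(term: str, chosen) -> list:
    """Build the list of searches to run.

    Only the entered term is searched automatically; a suggestion is
    searched only if the investigator picked it and it was offered.
    """
    offered = set(suggest(term))
    return [term] + [s for s in chosen if s in offered]

panel = suggest("Cocaine")
searches = queries_for("cocaine", ["snow", "unrelated"])
# searches == ["cocaine", "snow"]
```

Searching only the entered term plus explicitly chosen suggestions keeps the number of queries, and therefore the number of results, under the investigator's control.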
Google Desktop SDK

The Google Desktop Search SDK consists of the following:
Event Schemas: The GDS engine processes event objects sent to it by other
components (Business layer, or even the GUI). An event object consists of the
content data the investigator wants the engine to index and store, as well as
additional meta-information and properties about that content or the event
object. The event schemas specify the allowed event types and the relevant
properties for each event type.
Developer Indexing API: The Developer Indexing API consists of interfaces
used to construct event objects and send them to GDS.
Developer Search API: We only use the Developer Search API. It sends an
HTTP request to Google Desktop Search engine that contains the investigator
search query term. The HTTP response contains the desktop search results in
XML format.
When the investigator submits a search query, actually (s)he generates an
HTTP request that includes a &format=xml parameter.
For example, to search for "Google" you would send something like:
http://127.0.0.1:4664/search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0?q=Google&format=xml
To break this down:
http://127.0.0.1:4664/ is the localhost address and GDS port.
search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0 is the search command and a security token.
?q=Google is the query term(s) parameter.
If the investigator wants to search for more than one term, the terms are
separated with +s. For example, to search for both "Google" and "GDS",
use: ?q=Google+GDS
If the investigator wants to search for a specific phrase, the terms are
separated with +s and the phrase is surrounded with %22s. For example, to search
for the phrase "Google Desktop Search", use: ?q=%22Google+Desktop+Search%22
To search for the two phrases "Google Desktop Search" and "Copyright 2005",
use: ?q=%22Google+Desktop+Search%22+%22Copyright+2005%22
&format=xml specifies that the HTTP response returns the search results in XML
format. By default, an HTTP search response returns only the first ten results.
The developer can change this by appending the &num= parameter, followed by the
maximum number of results to be returned for the query. There is no problem if
this maximum is greater than the total number of search results; only the
actual results are returned, with no null "results".
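The query rules above can be captured in a small helper. The sketch below only builds the request URL in the format described in this section and does not contact GDS; the function name and the placeholder token "TOKEN" are assumptions, and `urllib.parse.quote_plus` from the Python standard library is used to produce the + and %22 encoding.

```python
from urllib.parse import quote_plus

GDS_BASE = "http://127.0.0.1:4664/"  # localhost address and GDS port

def build_search_url(token, terms=(), phrases=(), num=None):
    """Build a GDS search URL for the given terms and exact phrases.

    `token` is the per-installation security token; the value used in
    the examples below is a placeholder, not a real token.
    """
    # Phrases are wrapped in double quotes, which quote_plus encodes as %22;
    # spaces between terms become +.
    parts = list(terms) + [f'"{p}"' for p in phrases]
    query = quote_plus(" ".join(parts))
    url = f"{GDS_BASE}search&s={token}?q={query}&format=xml"
    if num is not None:
        url += f"&num={num}"  # maximum number of results to return
    return url

two_terms = build_search_url("TOKEN", terms=["Google", "GDS"])
two_phrases = build_search_url(
    "TOKEN", phrases=["Google Desktop Search", "Copyright 2005"], num=50
)
```

Letting a library function do the encoding avoids hand-written escaping mistakes when a search term contains spaces, quotes or other reserved characters.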
Business Layer
This component is at the core of our tool. It receives the search terms from the
GUI, interacts with the WordNet component when the investigator wants to search
for a keyword from the suggestion panel, and sends the search term with the
search preferences to the GDS engine. The business layer processes the results
and sends them back to the GUI to be shown to the investigator. This layer acts
as the brain of our tool, where all the processing complexity is kept hidden and
separate from the GUI. It is composed of classes and functions that communicate
with the rest of the components.
Use Case Diagram

The use case diagram [5] gives an abstract view of the functionalities the
investigator can use when working with our tool.
Figure 3. Use case Diagram
Activity Diagram
The activity diagram [5] shows the flow of the program when a search task is
submitted to the tool. As shown in the diagram, the user can specify advanced
search options before executing the search, and can also choose a keyword from
WordNet to run the search again. After the results are shown, the user can
generate a report and save it for later use when presenting the evidence in a
court of law.
Figure 4. Activity Diagram
Applicability
Our tool runs on Windows XP, Vista, and Windows 7. By using the Google Desktop
Search engine, our tool can access the file types that GDS indexes, including
MS Office files, Outlook files, archive files (such as .zip, .rar), email and
web history files.
TOOL FEATURES
Our tool provides a feature-rich environment for the investigator. We provide
many features that help the investigator in evidence analysis and report generation.
Below is a description of all the functionalities our tool provides:
Result Display: By default, search results are displayed in groups of twenty per
page; the previous and next buttons allow the investigator to navigate between
result pages. The total number of results found is shown at the top of the
results page.
Access All File Types: Using the Google Desktop Search engine, our tool can
access the file types that GDS indexes, including MS Office files, Outlook
files, archive files (such as .zip, .rar), and web history files.
Semantic Search: A feature-rich panel suggests many variations of the
keyword and includes a small panel that shows the meaning of each word and a
sample sentence of how it is used.
Multiple Tabs: For each keyword searched a new tab will open, allowing the
investigator to conduct more search processes, and close any unneeded tab.
Advanced Search: Provides more options that allow the tool to filter the
results:
Choose various file types for a more refined search, including the most common
file types, such as text, images, audio/video, archive (zip), and HTML files.
Choose a specific file category, such as email or web, to search only the
specified type of files.
Choose the number of results per page.
Sort the results by relevance: when checked, related files (within the same
directory) are displayed one after another.
Display File Snippet: Allows the investigator to see the searched term within
the file in which it was found.

Display Detailed File Information: like creation date, last access date, last write
date, file attributes, and MD5 Checksum value.
Opening the File in the Appropriate Application: when the file name is double
clicked in the graphical user interface.
Comprehensive Menu: provides the same functionalities to the user if (s)he is
more accustomed to using menus.
Report Generation: allows the selection of multiple files from multiple search
result tabs, to be added and used to generate a report in HTML format. This
report shows for each file: the file title, path, MD5 checksum value, and file
size.
Set the Search Path: to search within a specific directory only.
Calculate & Display the Hash (MD5) of the file to prove the integrity of the seized
evidence.
Help Menu: provides the user with a user manual of how to use the tool
functionalities.
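The MD5 checksum and the per-file report fields listed above can be computed with the Python standard library. This is a minimal sketch, not the tool's own implementation: the function names are hypothetical, and the demonstration runs on a throwaway temporary file rather than on real evidence.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Return the MD5 checksum of a file as a hexadecimal string.

    The file is read in chunks so that large files do not have to fit
    in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def file_report_row(path):
    """Collect the fields listed for each file in the generated report."""
    return {
        "title": os.path.basename(path),
        "path": path,
        "md5": md5_of_file(path),
        "size": os.stat(path).st_size,  # size in bytes
    }

# Demonstration on a temporary file created only for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"seized evidence")
row = file_report_row(tmp.name)
os.remove(tmp.name)
```

Recomputing the checksum later and comparing it with the stored value shows whether the file content has changed since it was seized.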
CONCLUSIONS
In this paper, we developed a Digital Forensics Evidence Mining Tool that is
dedicated to helping cybercrime investigators collect and analyze data from a
suspect’s computer. The solution provides features that are highly needed,
helpful and supportive for evidence collection. We took advantage of some
already developed APIs, such as the Google Desktop Search API and the WordNet
API, to enrich our application. Because requirements in this field change
frequently, our solution is scalable and can be adjusted to adopt future
requirements and features, providing a unique and essential tool for cybercrime
investigators.
REFERENCES
1. R. Guha, Rob McCool, Eric Miller, Semantic search, International World Wide
Web Conference, Proceedings of the 12th international conference on World
Wide Web.
2. Benjamin Turnbull, Barry Blundell, Jill Slay, Google Desktop as a Source of
Digital Evidence.
3. George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and
Katherine Miller, Introduction to WordNet: An On-line Lexical Database.
4. Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan Cigarrain, Indexing with
WordNet synsets can improve text retrieval, UNED, Ciudad Universitaria.
5. G. Booch, J. Rumbaugh, I. Jacobson, Unified Modeling Language User Guide.
Chapter 3
ACHIEVING UNCONDITIONAL SECURITY BY QUANTUM CRYPTOGRAPHY
Mohamed Elboukhari, University Mohamed Ist, Morocco
Mostafa Azizi, University Mohamed Ist, Oujda, Morocco
Abdelmalek Azizi, Academy Hassan II of Sciences & Technology, Morocco
ABSTRACT
Classical cryptography algorithms are based on mathematical functions. The
robustness of a given cryptosystem rests essentially on the secrecy of its
(private) key and on the difficulty of computing the inverse of its one-way
function(s). Unfortunately, there is no mathematical proof that the inverse of a
given one-way function cannot be found efficiently. In recent years, progress in
quantum physics has made it possible to control individual photons and use them
to carry information, and this technological progress can also be applied to
cryptography (quantum cryptography). Quantum cryptography, or Quantum Key
Distribution (QKD), is a method for sharing secret keys whose security can be
formally demonstrated. It exploits the laws of quantum physics in order to carry
out a cryptographic task. Its legitimate users can detect eavesdropping,
regardless of the technology the spy may have. In this study, we present quantum
cryptosystems as a tool to attain unconditional security. We also describe the
well-known protocols used in the field of quantum cryptography.
Keywords: Quantum cryptography, Quantum key distribution, Unconditional security.
INTRODUCTION
The Origin of the Concept of Quantum Computer
In his article [1], Richard Feynman presented an interesting idea illustrating
how a quantum system can be used for computation. The article also described
how the effects of quantum physics could be simulated by such a quantum
computer. Every experiment investigating the effects and laws of quantum physics
is expensive and complicated. Feynman's idea was therefore very interesting,
because such a computer could be used for future research on quantum effects.
A quantum computer is a machine for computation that uses quantum
mechanical phenomena, such as superposition and entanglement, to perform
operations on data. The principle behind quantum computation is that quantum
properties can be exploited to represent data and perform operations on these
data [2]. Later in 1985, it was proved that a quantum computer would be much
more powerful than a classical one [3].
The technology of quantum computers is also very different. A quantum computer
operates on quantum bits (qubits). The laws of quantum mechanics are completely
different from the laws of classical physics. A qubit can exist not only in the
states corresponding to the logical values 0 or 1, as in the case of a classical
bit, but also in a superposition state.
The major difference between quantum and classical computers is related to
memory. The memory of a classical computer is a string of 0's and 1's, and it
can perform calculations on only one set of numbers at a time. The memory of a
quantum computer, by contrast, is a quantum state that can be a superposition of
different numbers. A quantum computer can do an arbitrary reversible classical
computation on all the numbers simultaneously, and a computation performed on
many different numbers at the same time makes the results interfere to give a
single answer.
For example, as in Figure 1, a quantum computer with 4 qubits gives $2^4 = 16$
superposition states. Each state would be classically equivalent to a single list
of four 1's and 0's. Such a computer could operate on $2^4$ states
simultaneously. Eventually, observing the system would cause it to collapse into
a single quantum state corresponding to a single answer, a single list of four
1's and 0's.
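As a small illustration of the counting argument, the following sketch enumerates the classical bit lists that label the basis states of a register. The helper name `basis_states` is hypothetical; the code only lists labels and does not simulate a quantum computer.

```python
from itertools import product


def basis_states(num_qubits):
    """List the classical bit strings labelling the basis states of a register."""
    return ["".join(bits) for bits in product("01", repeat=num_qubits)]


states = basis_states(4)
print(len(states))   # 16, i.e. 2**4
print(states[:4])    # ['0000', '0001', '0010', '0011']
```

A superposition of a 4-qubit register assigns one amplitude to each of these 16 labels, and a measurement returns exactly one of them.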
Some problems arise in the production of quantum computers. Any measurement of
the parameters of a quantum state involves an interaction with the environment
(with other particles, such as particles of light), which changes some
parameters of this quantum state. Likewise, measuring a superposition state
collapses it into a classical state; this is called decoherence. Decoherence is
the major obstacle in producing a quantum computer. If this problem cannot be
solved, a quantum computer will be no better than a silicon one [4].
Figure 1: Effect of four qubits
To make quantum computers powerful, many operations must be performed before
quantum coherence is lost. It may be impossible to construct a quantum computer
that completes its calculations before decoherence sets in. But if one builds a
quantum computer in which the number of errors is low enough, then it is
possible to use an error-correcting code to prevent data loss even when qubits
in the computer decohere.
Hardware is another problem in building quantum computers. Because of some
successful experiments, Nuclear Magnetic Resonance (NMR) technology is the most
popular today. Other designs are based on ion traps and quantum electrodynamics
(QED). All of these methods have significant limitations, and nobody knows what
the architecture of future quantum computer hardware will be [3].
Quantum computing is still in its infancy, and although the concept of quantum
computers remained purely theoretical for a long time, recent developments have
aroused interest. Experiments have been carried out in which quantum
computational operations were executed on a very small number of qubits. Both
practical and theoretical research continues, and many national government and
military funding agencies support quantum computing research to develop quantum
computers for both civilian and national security purposes, such as
cryptanalysis.
If quantum computers become a reality, artificial intelligence is one of the
fields that would benefit. Quantum computers are expected to be much faster for
certain tasks and consequently to perform a large number of operations in a very
short period of time. Such an increase in speed would help computers to learn
faster, even using the simplest methods. High performance would also support the
development of complex compression algorithms, voice and image recognition,
molecular simulations, true randomness and quantum communication. Randomness is
valuable in simulations. Molecular simulations are important for developing
simulation applications for biology and chemistry. Quantum communication has
great benefits in the field of security because both receiver and sender are
alerted when an eavesdropper tries to intercept the signal; quantum techniques
thus make communication more secure. There is currently a lot of research on a
new type of cryptography called quantum cryptography. Quantum cryptography, also
called quantum key distribution (QKD), uses quantum mechanics to guarantee
secure communication. It enables two parties to produce a shared random bit
string known only to them, which can be used as a key to encrypt and decrypt
messages [5].
New Field of Cryptography: Quantum Cryptography

Current public key cryptographic technologies, such as RSA, are based on the
difficulty of factorization. The integer factorization problem is believed to be computationally
infeasible with an ordinary computer for large integers that are the product of only
a few prime numbers (e.g., products of two 300-digit primes) [6]. A quantum
computer by comparison could efficiently solve this problem using Shor's algorithm
[7] to find its factors. This ability would allow a quantum computer to "break" many
of the cryptographic systems in use today, in the sense that there would be a
polynomial time (in the number of digits of the integer) algorithm for solving the
problem of factorisation. In particular, most of the popular public key ciphers are
based on the difficulty of factoring integers (or the related discrete logarithm
problem which can also be solved by Shor's algorithm), including forms of RSA.
These are used to protect secure Web pages, encrypted email, and many other
types of data. Breaking these would have significant ramifications for electronic
privacy and security. The only way to increase the security of an algorithm like
RSA would be to increase the key size and hope that an adversary does not have
the resources to build and use a powerful enough quantum computer.
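To make the dependence of RSA on factoring concrete, here is a toy sketch with deliberately tiny, hypothetical parameters (p = 61, q = 53, e = 17). Brute-force trial division stands in for the factoring step; it is not Shor's algorithm and is only feasible because the modulus is small. The function names are illustrative assumptions.

```python
def trial_division(n):
    """Return a non-trivial factor pair of n by brute force (tiny n only)."""
    factor = 2
    while factor * factor <= n:
        if n % factor == 0:
            return factor, n // factor
        factor += 1
    raise ValueError("n has no non-trivial factors")


def recover_private_exponent(n, e):
    """Recover the RSA private exponent d once n = p * q has been factored."""
    p, q = trial_division(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse (Python 3.8+)


# Toy key: anyone who factors n can decrypt.
n, e = 61 * 53, 17
ciphertext = pow(65, e, n)
d = recover_private_exponent(n, e)
print(pow(ciphertext, d, n))  # 65, the original message
```

The point of the sketch is only that knowledge of the factors yields the private exponent; an efficient factoring method therefore breaks the scheme regardless of how the key was protected.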
A way out of this dilemma would be to use some kind of quantum cryptography.
In the 1970s, Wiesner [8] proposed a method for distributing one-time pad keys
that exploits the laws of physics to detect system intrusion or wiretapping.
Quantum mechanics does not regard measurement as an external and passive
process, but instead as one that changes the internal state of the system.
Because wiretapping and intrusion are acts of measurement, any wiretap or
intrusion during key distribution can be detected. Hence, a quantum cryptosystem
attains unconditional security.
Quantum cryptography is only used to produce and distribute a key, not to
transmit any message data. This key can then be used with any chosen encryption
algorithm to encrypt (and decrypt) a message, which can then be transmitted over
a standard communication channel (classical channel).
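The division of labour described above can be sketched as follows. The one-time pad is chosen here only as an example of "any chosen encryption algorithm", and `secrets.token_bytes` is a stand-in for a key that would, in practice, come from the QKD session; the helper name `xor_bytes` is hypothetical.

```python
import secrets


def xor_bytes(data, key):
    """XOR `data` with `key`; the same call encrypts and decrypts."""
    if len(key) < len(data):
        raise ValueError("a one-time pad key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
key = secrets.token_bytes(len(message))   # stand-in for a QKD-distributed key
ciphertext = xor_bytes(message, key)      # sent over the classical channel
assert xor_bytes(ciphertext, key) == message
```

Only the ciphertext travels over the classical channel; the quantum channel is used solely to agree on `key`.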
The security of quantum cryptography relies on the foundations of quantum
mechanics, in contrast to traditional public key cryptography, which relies on
the computational difficulty of certain mathematical functions. Moreover,
traditional public key cryptography cannot provide any indication of
eavesdropping or guarantee of key security. Quantum key distribution has an
important and unique property: the ability of the two communicating users
(traditionally referred to as Alice and Bob) to detect the presence of any third
party (referred to as Eve) trying to gain knowledge of the key. A third party
trying to eavesdrop on the key must in some way measure it, thus introducing
detectable anomalies. By using quantum superpositions or quantum entanglement
and transmitting information in quantum states over a quantum channel (such as
an optical fiber or free space), a communication system can be implemented which
detects eavesdropping.
STATE OF THE ART OF QUANTUM CRYPTOGRAPHY

For ages, mathematicians have searched for a system that would allow two
people to exchange messages in perfect privacy. Quantum cryptography was born in
the early seventies when Stephen Wiesner wrote the article "Conjugate Coding"
[8], which was rejected by IEEE Information Theory but was eventually published
in 1983 in SIGACT News (15:1 pp. 78-88, 1983). In his paper, Wiesner showed how
to store or transmit two messages by encoding them in two “conjugate
observables”, such as linear and circular polarization of light, so that either,
but not both, may be received and decoded. He illustrated his idea with a design
for unforgeable bank notes.
The ongoing development of quantum cryptosystems thereafter was primarily
the result of the efforts of Charles Bennett and Gilles Brassard. Most quantum
cryptographic key distribution protocols developed during that time were based
on Heisenberg’s Uncertainty Principle and Bell’s Inequality. Others employed
quantum non-locality, such as the cryptosystem developed by Biham et al. [9], in
which users store a particle in the quantum memory of the sending center, so
that users of the same center are assured secure communication. Phoenix et al.
[10] introduced a method of developing a quantum cryptographic network that does
not rely on quantum non-locality. Huttner and Peres employed non-coupled photons
to exchange keys [11], and Huttner et al. also applied a weak correlation to
reduce significantly the level of tapped information [12]. Wiesner used bright
light to construct a quantum cryptosystem [13].
The early quantum cryptosystems developed in the 1980s and 1990s, however,
lacked a thorough analysis of the security of their key distribution protocols.
An eavesdropper in these systems was assumed to be able to use only simple
wiretap methods, whereas quantum mechanics in fact permits more complex ones.
Devising a separate countermeasure for each possible attack is quite difficult,
and numerous researchers have devoted themselves to enhancing system security by
studying key distribution under various attacks.
The first to examine the security of quantum cryptosystems was Lutkenhaus
[14]. In [15,16] Biham and Mor presented a method for resisting collective
attacks. Mayers and Salvail [17], Yao [18] and Mayers [19] based their research
on the BB84 Protocol [20], believing that this method could provide
unconditional security and resist various attacks. In [21], Bennett et al.
examined the security of even–odd bits of quantum cryptography.
Some 20 years after the first Quantum Key Distribution protocols (QKDP) were
developed, a group of scholars asserted that although a quantum cryptosystem
based on a QKDP can achieve unconditional security, its key generation is not
efficient in practice because the qubits transmitted in the quantum channel
cannot all be used. For example, out of 10 qubits, only 5 qubits are used for
key generation. In addition, the key is used as a one-time pad, so its length
must be the same as that of the plaintext, and the number of qubits required far
exceeds the length of the plaintext. The cost of frequently transmitting bulk
messages is therefore much too high. Consequently, the idea of Quantum Secure
Direct Communication (QSDC) was proposed. A QSDC protocol encodes the plaintext
itself into qubits, in place of a key, and transmits the message via the quantum
channel. This reduces the number of qubits used while still enabling automatic
detection of eavesdroppers.
Beige et al. initiated the elaboration of QSDC protocols in 2002 [22]. In
their scheme, the secure message comprises a single photon with two qubit
states; it can be read only after the transmission of an extra classical message
via a public channel for each qubit. Later, Boström and Felbinger developed the
Ping-Pong QSDC Protocol [23], which adopts Einstein–Podolsky–Rosen (EPR) pairs
[24] as the quantum information carriers. In this protocol, the secure messages
are decoded during transmission, and no additional information needs to be
transmitted. A QSDC scheme using batches of single photons that act as a
one-time pad [25] was proposed by Deng et al. in 2004, and in 2005 Lucamarini
and Mancini presented a protocol [26] for deterministic communication without
entanglement. Wang et al. proposed a QSDC approach that uses single photons,
based on the concepts of order rearrangement and block transmission of the
photons [27].
PRINCIPLES USED IN ELABORATION OF QUANTUM KEY DISTRIBUTION PROTOCOLS
Photon and Polarization

In physics, a photon is an elementary particle, the quantum of the
electromagnetic field and the basic "unit" of light and all other forms of
electromagnetic radiation. It is also the force carrier for the electromagnetic
force. Like all elementary particles, photons are governed by quantum mechanics
and exhibit wave-particle duality: they have properties of both waves and
particles. For example, a single photon may be refracted by a lens or exhibit
wave interference, but it also acts as a particle, giving a definite result when
its position is measured.
Polarization is a physical property that emerges when light is considered as an
electromagnetic wave. The polarization direction of a photon can be fixed to any
desired angle with a polarizing filter.
Heisenberg’s Uncertainty Principle
Heisenberg’s Uncertainty Principle states that two complementary quantum
states cannot be measured simultaneously. Heisenberg discovered this principle
while performing a light diffraction experiment. He observed the decoherence of
the system's wave function while measuring the position of photons. A shorter
wavelength corresponds to a more precise position of the photons; as the
wavelength rises, disturbances increase, and the position of the photons becomes
imprecise and uncertain. So, simultaneous measurements of two complementary
quantum states are imprecise, and they alter the system. Therefore, the new
state differs from that before the measurement.
Heisenberg’s Uncertainty Principle is the main principle assuring the security
of early Quantum Key Distribution protocols. An eavesdropper who attempts to tap
into a system to obtain secret information needs to measure the quantum state,
but Heisenberg’s Uncertainty Principle implies that the measurement of a quantum
system affects the entire system. Thus, the legitimate users can monitor any
change to determine the presence of an eavesdropper or a wiretap.
The application of Heisenberg’s Uncertainty Principle and two types of
orthogonal quantum states led Bennett and Brassard [20] to build a key
distribution protocol, named the BB84 Protocol. Bennett [28] also presented a
similar but simpler protocol using non-orthogonal quantum states; it is called
the B92 Protocol.
Bell’s Inequality
In 1935, Einstein et al. questioned the completeness of quantum mechanics [24].
They pointed out that a strong non-classical connection exists between two
separated particles A and B that form an entangled pair. In other words, a very
strong connection is observed when two quantum bits are in an entangled state.
Modifying or measuring the state of one of the quantum bits determines the
corresponding change in the other quantum bits of the entangled state. Even if
they are later widely separated, their behavior remains that of a single entity,
exhibiting a form of non-locality; spatial separation has no impact on the
quantum behavior of the entity. The measurement result of B depends on that of A
and vice versa.
A beautiful result followed in 1964: Bell [29] used classical probability to
show that, when the state of a system is described by classical probabilities,
the correlation functions of measurements on the system must satisfy a
constraint known as Bell’s Inequality. However, in the 1970s many experiments
[30] revealed that the inequality is not satisfied when different bases are used
to measure the separated photons of the entangled pair mentioned in the EPR
paradox. So, entangled quantum states exist whose correlation function cannot be
expressed using classical probability. These quantum states are non-local. For
the researchers who argued against the locality of quantum states, these
findings were an important victory.
PROTOCOLS OF QUANTUM CRYPTOGRAPHY
BB84 Protocol
This protocol [20] was devised by Charles Bennett and Gilles Brassard in 1984.
Its design is based on Heisenberg’s Uncertainty Principle. It is known as BB84
after its inventors and year of publication, and was originally described using
photon polarization states to transmit the information. Any two pairs of
conjugate states can be used for the protocol, and many optical fiber based
implementations described as BB84 use phase encoded states. This protocol is
surely the most famous and most widely implemented quantum cryptography
protocol. Its security against arbitrary eavesdropping strategies was first
proved by Mayers [31], and a simple proof was later given by Shor and Preskill
[32].
The sender and the receiver (Alice and Bob) are connected by a quantum
communication channel which allows quantum states to be transmitted. At present,
there are two means of transporting photons: optical fiber or free space [33].
Recent research is experimenting with the use of atoms and electrons as quantum
particles [34]-[35], and perhaps a novel kind of quantum channel will appear.
The quantum channel may be tampered with by an enemy. By its very nature, this
channel prevents passive monitoring.
In addition, Alice and Bob communicate via a public classical channel, for
example using broadcast radio or the internet. Neither of these channels needs
to be secret; the protocol is designed with the assumption that an eavesdropper
(Eve) has access to both. Eve can interfere in any way with the quantum channel,
whereas the classical channel may be passively monitored but not tampered with
by Eve.
BB84 uses the transmission of single polarized photons (as the quantum
states). There are four photon polarizations, grouped into two different
non-orthogonal bases.
Generally the two non-orthogonal bases are:
- base ⊕ of the horizontal (0°) and vertical (+90°) polarizations. We represent
the base states with the intuitive notation $|0\rangle$ and $|1\rangle$. We have
$\oplus = \{|0\rangle, |1\rangle\}$ (for details about quantum computation
please see [36]).
- base ⊗ of the diagonal polarizations (+45°) and (+135°). The two different
base states are $|+\rangle$ and $|-\rangle$, with
$|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and
$|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. We have
$\otimes = \{|+\rangle, |-\rangle\}$.
In this protocol, the association between the information bit (taken from a
random number generator) and the basis is described in Table 1.
Table 1. Coding scheme for the BB84 protocol.

Bit    ⊕                       ⊗
0      $|0\rangle = a_{00}$    $|+\rangle = a_{10}$
1      $|1\rangle = a_{01}$    $|-\rangle = a_{11}$
The BB84 protocol can be described as follows [37]:

1) Quantum Transmissions (First Phase)
a) Alice chooses a random string of bits $d \in \{0,1\}^n$, and a random string
of bases $b \in \{\oplus, \otimes\}^n$, where $n > N$ ($N$ is the length of the
final key).
b) Alice prepares a photon in quantum state $a_{ij}$ for each bit $d_i$ in $d$
and $b_j$ in $b$ as in Table 1, and sends it to Bob over the quantum channel.
c) With respect to either ⊕ or ⊗, chosen at random, Bob measures each $a_{ij}$
received. Bob’s measurements produce a string $d' \in \{0,1\}^n$, while his
choices of bases form $b' \in \{\oplus, \otimes\}^n$.
2) Public Discussion (Second Phase)
a) For each bit $d_i$ in $d$:
i) Alice over the classical channel sends the value of $b_i$ to Bob.
ii) Bob responds to Alice by stating whether he used the same basis for
measurement. Both $d_i$ and $d_i'$ are discarded if $b_i \neq b_i'$.
b) Alice chooses a random subset of the remaining bits in $d$ and discloses
their values to Bob over the classical channel (over the internet, for example).
If the results of Bob’s measurements for any of these bits do not match the
values disclosed, eavesdropping is detected and communication is aborted.
c) The string of bits remaining in $d$ once the bits disclosed in step 2b) are
removed is the common secret key, $K \in \{0,1\}^N$ (the final key).
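The two phases listed above can be sketched as a classical simulation. This is an illustrative model under stated assumptions, not the authors' implementation: the channel is ideal and noiseless, no eavesdropper is present, a measurement in the preparation basis returns the encoded bit, and a measurement in the other basis returns a uniformly random bit. The names `bb84`, `measure` and `sample_size` are hypothetical.

```python
import random

BASES = ("+", "x")  # "+" stands for the ⊕ basis, "x" for the ⊗ basis


def measure(bit, prep_basis, meas_basis, rng):
    """Same basis: the encoded bit; different basis: a uniformly random bit."""
    return bit if prep_basis == meas_basis else rng.randint(0, 1)


def bb84(n, sample_size, rng):
    # Phase 1: Alice picks bits d and bases b; Bob picks bases b' and measures d'.
    d = [rng.randint(0, 1) for _ in range(n)]
    b = [rng.choice(BASES) for _ in range(n)]
    b_prime = [rng.choice(BASES) for _ in range(n)]
    d_prime = [measure(d[i], b[i], b_prime[i], rng) for i in range(n)]

    # Phase 2a: keep only the positions where both used the same basis.
    kept = [i for i in range(n) if b[i] == b_prime[i]]

    # Phase 2b: publicly compare a random subset of the kept bits.
    sample = set(rng.sample(kept, min(sample_size, len(kept))))
    if any(d[i] != d_prime[i] for i in sample):
        raise RuntimeError("eavesdropping detected, communication aborted")

    # Phase 2c: the undisclosed kept bits form the shared key.
    alice_key = [d[i] for i in kept if i not in sample]
    bob_key = [d_prime[i] for i in kept if i not in sample]
    return alice_key, bob_key


alice_key, bob_key = bb84(n=200, sample_size=20, rng=random.Random(1))
assert alice_key == bob_key
```

On average about half of the positions survive step 2a, which is why n must be chosen larger than the desired key length N.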
To understand the BB84 protocol it is very important to describe how a qubit
is measured in quantum physics. If we have a qubit
$|\mathrm{qubit}\rangle = e|c\rangle + f|g\rangle$, then measuring this state in
the basis $\{|c\rangle, |g\rangle\}$ produces the state $|c\rangle$ with
probability $|e|^2$ and the state $|g\rangle$ with probability $|f|^2$, and of
course $|e|^2 + |f|^2 = 1$ ($|e|^2$ is the absolute square of the amplitude
$e$). So, measuring with the incorrect basis yields a random result, as
predicted by quantum theory. Thus, if Bob chooses the ⊗ basis to measure a
photon in state $|1\rangle$, the classical outcome will be either 0 or 1 with
equal probability because
$|1\rangle = \frac{1}{\sqrt{2}}(|+\rangle - |-\rangle)$; if the ⊕ basis was
chosen instead, the classical outcome would be 1 with certainty because
$|1\rangle = 1|1\rangle + 0|0\rangle$.
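The measurement rule can be checked numerically. The sketch below is an illustration under simple assumptions: the four BB84 states are written as pairs of real amplitudes over $|0\rangle$ and $|1\rangle$, and the probability of an outcome is the squared inner product between the state and the basis vector. The names `KET`, `BASIS` and `outcome_probabilities` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Each state as (amplitude of |0>, amplitude of |1>); all amplitudes are real here.
KET = {"0": (1.0, 0.0), "1": (0.0, 1.0), "+": (S, S), "-": (S, -S)}
BASIS = {"rectilinear": ("0", "1"), "diagonal": ("+", "-")}  # ⊕ and ⊗


def outcome_probabilities(state, basis):
    """Probability of each outcome when `state` is measured in `basis`."""
    probabilities = {}
    for label in BASIS[basis]:
        # Real inner product <label|state>, then its square.
        amplitude = sum(x * y for x, y in zip(KET[label], KET[state]))
        probabilities[label] = amplitude ** 2
    return probabilities


print(outcome_probabilities("1", "rectilinear"))  # {'0': 0.0, '1': 1.0}
print(outcome_probabilities("1", "diagonal"))     # both outcomes close to 0.5
```

The first call reproduces the certain outcome in the ⊕ basis; the second reproduces the equal split in the ⊗ basis.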
To detect Eve, Alice and Bob perform a test for eavesdropping in step 2b) of
the protocol. The idea is that, wherever Alice and Bob’s bases are identical
(i.e. $b_i = b_i'$), the corresponding bits should match (i.e. $d_i = d_i'$). If
not, an external disturbance has occurred or there is noise in the quantum
channel; we assume that all of it is caused by Eve.
Eve can perform several attacks. One possible attack is the intercept-resend
attack, where Eve measures photons sent by Alice and then sends replacement
photons to Bob, prepared in the state she measures. This produces errors in the
key shared between Alice and Bob. As Eve has no knowledge of the polarization of
the photons sent by Alice, she can only guess which basis to measure each photon
in, in the same way as Bob. When she chooses the correct basis, she measures the
correct photon polarization state as sent by Alice, and resends the correct
state to Bob. But if her choice is incorrect, the state she measures is random,
and the state sent to Bob is not the same as the state sent by Alice. If Bob
then measures this state in the same basis Alice used, he gets a random result
instead of the correct result he would get without the presence of Eve. An
illustration of this type of attack is shown in Table 2.
Eve chooses the incorrect basis with probability 0.5, and if Bob measures
this intercepted photon in the basis Alice used he gets a random result, i.e.,
an incorrect result with probability 0.5. The probability that an intercepted
photon generates an error in the key string is then 0.5 × 0.5 = 0.25. If Alice
and Bob publicly compare n of their key bits, the probability that they find a
disagreement and identify the presence of Eve is:
$$P_d = 1 - \left(\frac{3}{4}\right)^n \qquad (1)$$
So to detect an eavesdropper with probability $P_d = 0.9999999...$ Alice and Bob
need to compare n = 72 key bits.
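Equation (1) is easy to evaluate directly. The short sketch below assumes, as the text does, that Eve intercepts every photon, so each compared bit independently reveals her with probability 1/4; the function name `detection_probability` is hypothetical.

```python
def detection_probability(n):
    """P_d = 1 - (3/4)**n: chance that n compared bits expose an intercept-resend attack."""
    return 1 - 0.75 ** n


for n in (1, 10, 30, 72):
    print(n, detection_probability(n))
```

The probability of Eve going unnoticed shrinks by a factor of 3/4 with each compared bit, so at n = 72 it is of the order of one in a billion.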
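Table 2. An example of the intercept-resend attack.

Alice's random bit                     0            1            1            0            1            0            0            1
Alice's random sending basis           ⊕            ⊕            ⊗            ⊕            ⊗            ⊗            ⊗            ⊕
Photon polarization Alice sends        $|0\rangle$  $|1\rangle$  $|-\rangle$  $|0\rangle$  $|-\rangle$  $|+\rangle$  $|+\rangle$  $|1\rangle$
Eve's random measuring basis           ⊕            ⊗            ⊕            ⊕            ⊗            ⊕            ⊗            ⊕
Polarization Eve measures and sends    $|0\rangle$  $|+\rangle$  $|1\rangle$  $|0\rangle$  $|-\rangle$  $|1\rangle$  $|+\rangle$  $|1\rangle$
Bob's random measuring basis           ⊕            ⊗            ⊗            ⊗            ⊕            ⊗            ⊕            ⊕
Photon polarization Bob measures       $|0\rangle$  $|+\rangle$  $|+\rangle$  $|-\rangle$  $|1\rangle$  $|+\rangle$  $|0\rangle$  $|1\rangle$
PUBLIC DISCUSSION OF BASIS
Shared secret key                      0            -            0            -            -            0            -            1
Errors in key                          ✓            -            ✘            -            -            ✓            -            ✓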
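The 25% error rate behind Table 2 can be reproduced with a small Monte Carlo sketch. It assumes an otherwise noiseless channel and an Eve who intercepts every photon and picks her basis uniformly at random; bases are encoded as 0 (⊕) and 1 (⊗), and the function name `intercept_resend_error_rate` is hypothetical.

```python
import random


def intercept_resend_error_rate(num_photons, rng):
    """Fraction of sifted key bits that differ between Alice and Bob under attack."""
    errors = 0
    sifted = 0
    for _ in range(num_photons):
        bit = rng.randint(0, 1)
        alice_basis = rng.randint(0, 1)
        eve_basis = rng.randint(0, 1)
        bob_basis = rng.randint(0, 1)

        # Eve reads the bit correctly only if she guessed Alice's basis.
        eve_bit = bit if eve_basis == alice_basis else rng.randint(0, 1)
        # Bob measures the photon that Eve prepared in her own basis.
        bob_bit = eve_bit if bob_basis == eve_basis else rng.randint(0, 1)

        # Only positions where Alice's and Bob's bases agree enter the key.
        if bob_basis == alice_basis:
            sifted += 1
            errors += bob_bit != bit
    return errors / sifted


print(intercept_resend_error_rate(20000, random.Random(0)))  # close to 0.25
```

The simulated rate fluctuates around 0.25, matching the product 0.5 × 0.5 used in the text.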
B92 Protocol
In 1992, Bennett proposed a protocol for Quantum Key Distribution based on two
nonorthogonal states, known as B92 or the two-state protocol [28]. B92 is
similar to the BB84 protocol but uses only two states instead of four. The B92
protocol is also based on Heisenberg’s Uncertainty Principle.
The B92 protocol has been proven to be unconditionally secure. A remarkable
proof of the unconditional security of B92 is that of Tamaki [38]. This proof
guarantees the security of B92 in the presence of any enemy who can perform any
operation permitted by quantum physics; consequently the security of the
protocol cannot be compromised by future developments in quantum computation.
Other results related to the unconditional security of B92 are discussed in
[39][40].
The use of a quantum channel that Eve (the enemy) cannot monitor without being
detected makes it possible to create a secret key with unconditional security
based on the laws of quantum physics. The presence of Eve is made manifest to
the users of such channels through an unusually high error rate. B92 is a
quantum key distribution (QKD) protocol which uses polarized photons as
information carriers. B92 supposes that the two legitimate users, Alice and Bob,
communicate through two specific channels, to which the enemy also has access:
• A classical channel, which can be public; Eve can listen passively (without
being detected);
• A quantum channel to which (by its nature) Eve cannot listen passively.
The first phase of B92 involves transmissions over the quantum channel, while
the second phase takes place over the classical channel.
To describe B92 we use the same notation as for the description of the BB84
protocol. For simplicity, Figure 2 shows the different states of photons
(polarizations) used in this protocol. Encoding data on photons is
shown in Table 1.
Figure 2: Different states of photons used in B92 protocol.
In the B92 protocol, several steps must be carried out [41]:

1) First phase (Quantum Transmissions)
a) Alice randomly chooses a vector of bits $A \in \{0,1\}^n$, $n > N$ ($N$ is
the length of the final key). If $A_i = 0$ Alice sends to Bob the state
$|0\rangle$ over the quantum channel and if $A_i = 1$, she sends him the state
$|+\rangle$, for all $i \in \{1, \dots, n\}$.
b) Bob in turn creates a random vector of bits $B \in \{0,1\}^n$, $n > N$. If
$B_i = 0$ Bob chooses the basis ⊕ and if $B_i = 1$ Bob chooses the basis ⊗, for
all $i \in \{1, \dots, n\}$.
c) Bob measures each quantum state sent by Alice ($|0\rangle$ or $|+\rangle$) in
the selected basis (⊕ or ⊗). He records the outcome in a test vector
$T \in \{0,1\}^n$: $T_i = 1$ if he obtains $|1\rangle$ or $|-\rangle$, and
$T_i = 0$ otherwise (see Table 3).
2) Second phase (Public Discussion)
a) Over the classical channel, Bob sends $T$ to Alice.
b) Alice and Bob preserve only the bits of the vectors $A$ and $B$ for which
$T_i = 1$. In such a case and in the absence of Eve, we have $A_i = 1 - B_i$,
and the shared raw key is formed by $A_i$ (or $1 - B_i$).
c) Alice chooses a sample of the bits of the raw key and reveals them to Bob
over the classical channel. If there exists $i$ such that $A_i \neq 1 - B_i$,
then Eve is detected and the communication is aborted.
d) The shared secret key $K \in \{0,1\}^N$ is formed by the raw key after
elimination of the samples of step 2c).

Table 3 illustrates how the B92 protocol operates. There are three points to
understand in order to follow the protocol. Firstly, if Bob's test is equal to 0
for a measurement, then Bob does not know what Alice sent to him: if Bob chooses
the basis ⊕ (resp. ⊗), he can obtain $|0\rangle$ (resp. $|+\rangle$) as the
result of his measurement for either quantum state sent by Alice ($|0\rangle$ or
$|+\rangle$). Secondly, if Bob's test is equal to 1, then Bob knows exactly what
Alice sent to him; for example, if Bob chooses the basis ⊗ (resp. ⊕) and obtains
the state $|-\rangle$ (resp. $|1\rangle$), then Alice surely sent him
$|0\rangle$ (resp. $|+\rangle$). Thirdly, in step 2c), Alice and Bob test for
the presence of Eve; the idea is that if there exists $i$ such that $T_i = 1$,
then $A_i = 1 - B_i$. If not, an external disturbance has occurred or there is
noise in the quantum channel; we assume that all of it is caused by Eve.
Table 3: Description of the mechanism of the B92 protocol.

Alice's bit  State sent  Bob's bit  Bob's basis  Bob's result  Probability  Test value
A_i = 0      |0⟩         B_i = 0    ⊕            |0⟩           1            0
A_i = 0      |0⟩         B_i = 0    ⊕            |1⟩           0            -
A_i = 0      |0⟩         B_i = 1    ⊗            |+⟩           1/2          0
A_i = 0      |0⟩         B_i = 1    ⊗            |−⟩           1/2          1
A_i = 1      |+⟩         B_i = 0    ⊕            |0⟩           1/2          0
A_i = 1      |+⟩         B_i = 0    ⊕            |1⟩           1/2          1
A_i = 1      |+⟩         B_i = 1    ⊗            |+⟩           1            0
A_i = 1      |+⟩         B_i = 1    ⊗            |−⟩           0            -
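
The mechanism summarized in Table 3 can be sketched as a short simulation. This is a minimal illustration in Python, not the authors' implementation: the function names, the dictionary `P_TEST_ONE` and the use of a pseudo-random generator in place of a real quantum channel are assumptions made for the example, and neither Eve nor channel noise is modelled.

```python
import random

# Probability that Bob's test value is 1, read off Table 3.
# Keys are (Alice's bit A_i, Bob's bit B_i).
P_TEST_ONE = {
    (0, 0): 0.0,  # |0> measured in the rectilinear basis: always |0>, T = 0
    (0, 1): 0.5,  # |0> measured in the diagonal basis: |-> half of the time
    (1, 0): 0.5,  # |+> measured in the rectilinear basis: |1> half of the time
    (1, 1): 0.0,  # |+> measured in the diagonal basis: always |+>, T = 0
}


def b92_test_value(a, b, rng):
    """Sample Bob's test value T_i for Alice's bit a and Bob's bit b."""
    return 1 if rng.random() < P_TEST_ONE[(a, b)] else 0


def b92_raw_key(n, seed=0):
    """Simulate n transmissions and keep only the positions where T_i = 1."""
    rng = random.Random(seed)
    alice_key, bob_key = [], []
    for _ in range(n):
        a = rng.randint(0, 1)
        b = rng.randint(0, 1)
        if b92_test_value(a, b, rng) == 1:
            alice_key.append(a)      # Alice keeps A_i
            bob_key.append(1 - b)    # Bob keeps 1 - B_i
    return alice_key, bob_key
```

Without an eavesdropper the two lists returned by `b92_raw_key` are identical. On average about one transmission in four contributes a key bit, since $T_i = 1$ requires $A_i \neq B_i$ (probability 1/2) and then occurs with probability 1/2.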
The EPR Protocol
Preliminary
In [42], Artur Ekert devised a quantum protocol based on the properties
of quantum-correlated particles. It uses a pair of particles called an EPR pair.
EPR refers to Einstein, Podolsky and Rosen, who presented a famous
paradox in their 1935 article [24] and thereby challenged the foundations of
quantum mechanics. The authors state that there exist spatially
separated pairs of particles, called EPR pairs, whose states are correlated in such
a way that the measurement of a chosen observable A of one automatically
determines the result of the measurement of A of the other. Since EPR pairs can
be pairs of particles separated by great distances, this strange behavior appears
to be due to “action at a distance.”
It is possible for example to create a pair of photons (each of which we label
below with the subscripts 1 and 2, respectively) with correlated linear polarizations.
An example of such an entangled state is given by:
$$|S\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_1 |1\rangle_2 + |1\rangle_1 |0\rangle_2 \right)$$
Thus, if one photon is measured to be in the state $|0\rangle$, the other, when
measured, will be found to be in the state $|1\rangle$, and vice versa.
To explain the paradox of “action at a distance”, Einstein et al. suppose that
there exist “hidden variables”, inaccessible to experiments. They then state that
such quantum correlation phenomena could be a strong indication that quantum
mechanics is incomplete. In 1964, Bell [29] gave a means of actually testing
local hidden variable (LHV) theories. He demonstrated that all such LHV theories
must satisfy the Bell inequality. On the other hand, quantum mechanics has been
shown to violate the inequality.
EPR Protocol
Unlike the BB84 and B92 protocols, this protocol uses Bell’s inequality to detect
the presence or absence of Eve as a hidden variable. The EPR quantum protocol
is a 3-state protocol. We describe this protocol in terms of the polarization states of
an EPR photon pair.
We use the notation $|\theta\rangle$ to denote the polarization state of a photon linearly
polarized at an angle θ. As the three possible polarization states of our EPR pair,
we choose:
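$$|S_0\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_1 \left|\tfrac{3\pi}{6}\right\rangle_2 + \left|\tfrac{3\pi}{6}\right\rangle_1 |0\rangle_2 \right)$$
$$|S_1\rangle = \frac{1}{\sqrt{2}} \left( \left|\tfrac{\pi}{6}\right\rangle_1 \left|\tfrac{4\pi}{6}\right\rangle_2 + \left|\tfrac{4\pi}{6}\right\rangle_1 \left|\tfrac{\pi}{6}\right\rangle_2 \right)$$
$$|S_2\rangle = \frac{1}{\sqrt{2}} \left( \left|\tfrac{2\pi}{6}\right\rangle_1 \left|\tfrac{5\pi}{6}\right\rangle_2 + \left|\tfrac{5\pi}{6}\right\rangle_1 \left|\tfrac{2\pi}{6}\right\rangle_2 \right)$$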
For each of these states, we choose the following encoding data:

State  |0⟩  |3π/6⟩  |π/6⟩  |4π/6⟩  |2π/6⟩  |5π/6⟩
Bit    0    1       0      1       0       1
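The measurement operators [36] corresponding to this encoding are
respectively:
$$M_0 = |0\rangle\langle 0|, \quad M_1 = \left|\tfrac{\pi}{6}\right\rangle\left\langle\tfrac{\pi}{6}\right|, \quad M_2 = \left|\tfrac{2\pi}{6}\right\rangle\left\langle\tfrac{2\pi}{6}\right|$$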
As with the BB84 and B92 protocols, the EPR protocol has two phases, the
first over a quantum channel and the second over a public channel. The EPR
protocol can be described as follows [43]:
1) Quantum Transmissions (First phase)
First, a state $|S_i\rangle$ is randomly selected from the set of states
$\{|S_j\rangle, 0 \le j \le 2\}$ to create an EPR pair in the selected state
$|S_i\rangle$. One photon of the established EPR pair
is sent to Alice, the other to Bob. Separately and independently, Alice and Bob
each select at random, with equal probability, one of the three measurement
operators $M_0$, $M_1$ and $M_2$. They measure their respective photons with the
selected measurement operators. Alice records her measured bit, and Bob
records the complement of his measured bit. This procedure is repeated as
many times as needed.
2) Public Discussion (Second phase)
Alice and Bob hold a discussion over a public channel to determine the bits
for which they used the same measurement operators. Next, they separate their
respective bit sequences into two subsequences. The first subsequence, called
the raw key, consists of the bits for which they used the same measurement
operators. The second subsequence, called the rejected key, consists of all the
remaining bits.
The purpose of the rejected key is to detect Eve’s presence. Alice and Bob
over the public channel compare their respective rejected keys to determine
whether or not Bell’s inequality is satisfied: if it is, Eve’s presence is detected and if
not, then Eve is absent.
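
The separation into raw key and rejected key can be illustrated as follows. This is a hypothetical Python sketch: the function name `sift` and the list-based representation of the operator choices and recorded bits are assumptions made for the example, not part of the protocol description in [43].

```python
def sift(alice_ops, bob_ops, alice_bits, bob_bits):
    """Split the recorded bits into the raw key and the rejected key.

    alice_ops / bob_ops hold the index (0, 1 or 2) of the measurement
    operator each party chose; alice_bits / bob_bits hold the recorded bits
    (Bob's bits are already complemented, as in the first phase).
    """
    raw_key, rejected_key = [], []
    for op_a, op_b, bit_a, bit_b in zip(alice_ops, bob_ops, alice_bits, bob_bits):
        if op_a == op_b:
            # Same operator on both sides: the position joins the raw key.
            raw_key.append((bit_a, bit_b))
        else:
            # Different operators: keep the operator pair for the Bell test.
            rejected_key.append((op_a, op_b, bit_a, bit_b))
    return raw_key, rejected_key
```

Keeping the operator indices alongside the rejected bits is a design choice of this sketch: the Bell test needs to know which pair of operators produced each rejected position.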
For this specific EPR protocol, Bell’s inequality can be formulated as follows.
We denote by $P(\neq | i, j)$ the probability that two corresponding bits of Alice’s and Bob’s
rejected keys do not coincide, given that the measurement operators chosen by
Alice and Bob are respectively either $M_i$ and $M_j$ or $M_j$ and $M_i$.
We also write:
$$P(= | i, j) = 1 - P(\neq | i, j), \quad \Phi(i, j) = P(\neq | i, j) - P(= | i, j), \quad I = 1 + \Phi(1, 2) - |\Phi(0, 1) - \Phi(0, 2)|.$$
So Bell’s inequality reduces in this case to
$$I \ge 0$$
and for quantum mechanics (i.e., no hidden variables)
$$I = -\frac{1}{2}$$
which is a clear violation of Bell’s inequality.
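
The value $I = -1/2$ can be checked numerically. The Python sketch below rests on an assumption that is not stated in the text: that for an ideal, undisturbed EPR pair the recorded bits differ with probability $\sin^2(\theta_i - \theta_j)$, where $\theta_k = k\pi/6$ is the polarization angle of $M_k$. The function names are likewise hypothetical.

```python
import math

# Polarization angles of the measurement operators M0, M1, M2.
ANGLES = [0.0, math.pi / 6, 2 * math.pi / 6]


def phi_quantum(i, j):
    """Phi(i, j) = P(!= | i, j) - P(= | i, j) under the assumed sin^2 law."""
    p_differ = math.sin(ANGLES[i] - ANGLES[j]) ** 2  # ASSUMPTION, see lead-in
    p_equal = 1.0 - p_differ
    return p_differ - p_equal


def bell_quantity(phi):
    """I = 1 + Phi(1, 2) - |Phi(0, 1) - Phi(0, 2)|."""
    return 1 + phi(1, 2) - abs(phi(0, 1) - phi(0, 2))


def eve_detected(phi):
    """Following the text, Eve is detected when Bell's inequality I >= 0 holds."""
    return bell_quantity(phi) >= 0
```

Under this assumption, `bell_quantity(phi_quantum)` evaluates to -0.5 up to floating-point rounding, so the inequality is violated and no eavesdropper is reported. In practice the function passed as `phi` would be estimated from the frequencies observed in the rejected keys.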

There are other protocols of quantum cryptography. For example, there is the
EPR protocol with a single particle, and there is also a 2-state EPR implementation
of the BB84 protocol; see [44], [45] for details. The paper [46] also
treats the various multiple-state and rejected-data protocols.
CONCLUSION
Quantum cryptography is based on a combination of principles from quantum
physics and information theory, and was made possible thanks to the tremendous
progress in quantum optics and in the technology of optical fibers and of
free-space optical communication. Its security relies on deep theorems in classical
information theory and on a profound understanding of Heisenberg’s
uncertainty principle. Quantum cryptography has made some important contributions to
classical cryptography: privacy amplification [47] and classical bound information
are examples of concepts in classical information theory whose discovery was much
inspired by quantum cryptography. Also, the fascinating tension between quantum
physics and relativity, as illustrated by Bell’s inequality, is not far away. Actually,
despite the huge progress over the recent years, many technological challenges
and open questions remain.
The first technological challenge at present concerns improved detectors
compatible with telecom fibers. Also two other issues concern free space and
quantum repeaters. The first is presently the only way to realize quantum
cryptography over thousands of kilometers using near future technology. The
purpose of the idea of quantum repeaters is to encode the qubits in such a way
that if the error rate is low, then errors can be detected and corrected entirely in
the quantum domain. So, the hope is that such techniques could extend the range
of quantum communication to essentially unlimited distances.
Regarding the open questions, we emphasize three main concerns. First,
complete and realistic analyses of the security issues are still missing. Second,
figures of merit to compare quantum cryptography schemes based on different
quantum systems (with different dimensions, for example) are still awaited. Third,
the delicate question of how to test the apparatuses has not yet received enough
attention.
Quantum cryptography could well be the first application of quantum
mechanics at the single-quantum level. Many experiments have demonstrated that
keys can be exchanged over distances of a few tens of kilometers at rates at least
of the order of a thousand bits per second. There is no doubt that the technology
can be mastered and the question is not whether quantum cryptography will find
commercial applications, but when!
REFERENCES
1. R. Feynman, Simulating physics with computers, International Journal of Theoretical
Physics 21 (6&7) (1982) 467–488.
2. http://qist.lanl.gov:80/qcomp_map.shtml
3. West, J (2000). Quantum Computers. Retrieved December 1, 2002 from
California Institute of Technology, educational website:
http://www.cs.caltech.edu/~westside/quantum-intro.html#qc
4. Daniel, G. (1999). Quantum Error-Correcting Codes. Retrieved on November
31st, 2002 from: http://qso.lanl.gov/~gottesma/QECC.html
5. http://en.wikipedia.org/wiki/Quantum_cryptography
6. Integer Factoring By ARJEN K. LENSTRA - Designs, Codes and
Cryptography, 19, 101–128 (2000) Kluwer Academic Publishers.
http://modular.fas.harvard.edu/edu/Fall2001/124/misc/arjen_lenstra_factoring.
pdf
7. P.W. Shor, Algorithms for quantum computation: discrete logarithm and
factoring, in: Proceedings of the 35th Annual Symposium on the Foundations
of Computer Science, 1994, pp. 124–134.
8. S. Wiesner, Conjugate coding, SIGACT News 15 (1) (1983) 78–88.
9. E. Biham, B. Huttner, T. Mor, Quantum cryptography network based on
quantum memories, Physical Review A 54 (3) (1996) 2651– 2658.
10. S.J.D. Phoenix, S.M. Barnett, P.D. Townsend, K.J. Blow, Multi-user quantum
cryptography on optical networks, Journal of Modern Optics 42 (1995) 1155–
1163.
11. B. Huttner, A. Peres, Quantum cryptography with photon pairs, Journal of
Modern Optics 41 (12) (1994) 2397–2403.
12. B. Huttner, N. Imoto, N. Gisin, T. Mor, Quantum cryptography with coherent
states, Physical Review A 51 (3) (1995) 1863–1869.
13. S. Wiesner, Quantum cryptography with bright light, Manuscript, 1993.
14. N. Lutkenhaus, Security against eavesdropping in quantum cryptography,
Physical Review A 54 (1) (1996) 97–111.
15. E. Biham, T. Mor, Security of quantum cryptography against collective attacks,
Physical Review Letters 78 (11) (1997) 2256–2259.
16. E. Biham, T. Mor, Bounds on information and the security of quantum
cryptography, Physical Review Letters 79 (20) (1997) 4034– 4037.
17. D. Mayers, L. Salvail, Quantum oblivious transfer is secure against all
individual measurements, in: Proceedings of the 3rd Workshop on Physics
and Computation—PhysComp’94, IEEE Computer Society, 1994, pp. 69–77.
18. A.C.-C. Yao, Security of quantum protocols against coherent measurements,
in: Proceedings of the 26th Annual ACM Symposium on the Theory of
Computing, 1995, pp. 67–75.
19. D. Mayers, Quantum key distribution and string oblivious transfer in noisy
channels, in: Advances in Cryptology—CRYPTO’96, LNCS 1109, Springer-
Verlag, 1996, pp. 343–357.
20. C.H. Bennett, G. Brassard, Quantum cryptography: public key distribution and
coin tossing, in: Proceedings of the International Conference on Computers,
Systems & Signal Processing, Bangalore, India, December 10–12, 1984, pp.
175–179.
21. C.H. Bennett, T. Mor, J. Smolin, The parity bit in quantum cryptography,
Physical Review A 54 (4) (1996) 2675–2684.
22. A. Beige, B.-G. Englert, C. Kurtsiefer, H. Weinfurter, Secure communication with
a publicly known key, Acta Physica Polonica A 101 (3) (1999) 357.
23. K. Boström, T. Felbinger, Deterministic secure direct communication using
entanglement, Physical Review Letters 89 (18) (2002) 187902.
24. A. Einstein, B. Podolsky, N. Rosen, Can quantum-mechanical description of
physical reality be considered complete? Physical Review 47 (1935) 777–780.
25. F.-G. Deng, G.L. Long, Secure direct communication with a quantum one-time
pad, Physical Review A 69 (5) (2004) 052319.
26. M. Lucamarini, S. Mancini, Secure deterministic communication without
entanglement, Physical Review Letters 94 (2005) 140501.
27. J. Wang, Q. Zhang, C.-J. Tang, Quantum secure direct communication based
on order rearrangement of single photons, Physics Letters A 358 (4) (2006)
256–258.
28. C.H. Bennett, Quantum cryptography using any two non-orthogonal states,
Physical Review Letters 68 (21) (1992) 3121–3124.
29. J.S. Bell, On the Einstein–Podolsky–Rosen paradox, Physics 1 (1964) 195–
200.
30. J.F. Clauser, Experimental investigation of a polarization correlation anomaly,
Physical Review Letters 36 (1976) 1223–1226.
31. D. Mayers, “Unconditional security in quantum cryptography,” Journal of the
ACM, vol. 48, no. 3, pp. 351–406, May 2001.
32. P. W. Shor and J. Preskill, “Simple proof of security of the BB84 quantum key
distribution protocol,” Phys. Rev. Lett., vol. 85, no. 2, pp. 441–444, July 2000.
33. R. Hughes, J. Nordholt, D. Derkacs, C. Peterson, “Practical free-space
quantum key distribution over 10 km in daylight and at night,” New Journal of
Physics 4 (2002) 43.1–43.14. URL: http://www.iop.org/EJ/abstract/1367-2630/4/1/343/
34. Knight, P (2005). “Manipulating cold atoms for quantum information
processing”. QUPON conference Vienna 2005.
35. Tonomura, A (2005). “Quantum phenomena observed using electrons”.
QUPON conference Vienna 2005.
36. M. Nielsen and I. Chuang. Quantum Computation and Quantum Information.
Cambridge University Press, 2000.
37. M. Elboukhari, M. Azizi, A. Azizi, “Implementation of secure key distribution
based on quantum cryptography”, in Proc. IEEE Int. Conf Multimedia
Computing and Systems (ICMCS’09), page 361 - 365, 2009.
38. Tamaki, K., M. Koashi, and N. Imoto, “Unconditionally secure key distribution
based on two non orthogonal states,” Physical Review Letters 90, 167904
(2003), [preprint quant-ph/0210162].
39. K. Tamaki, N. Lütkenhaus, “Unconditional security of the Bennett 1992
quantum key-distribution over lossy and noisy channel,” Quantum Physics
Archive: arXiv:quant-ph/0308048v2, 2003.
40. K. Tamaki, N. Lütkenhaus, M. Koashi, and J. Batuwantudawe, “Unconditional
security of the Bennett 1992 quantum key-distribution scheme with strong
reference pulse,” Quantum Physics Archive: arXiv:quant-ph/0607082v1, 2006.
41. M. Elboukhari, M. Azizi, A. Azizi, “Security Oriented Analysis of B92 by Model
Checking”, in Proc. IEEE Int. Conf. new technology, mobility and security
(NTMS), page 454-458, 2008.
42. Ekert, Artur K., Quantum cryptography based on Bell’s theorem, Physical
Review Letters, Vol. 67, No. 6, 5 August 1991, pp 661 - 663.
43. S. J. Lomonaco Jr., A quick glance at quantum cryptography,
http://www.cs.umbc.edu/~lomonaco/Publications.html
44. Bennett, Charles H., Gilles Brassard, and N. David Mermin, Quantum
cryptography without Bell’s theorem, Physical Review Letters, Vol. 68, No. 5, 3
February 1992, pp 557 - 559.
45. D’Espagnat, B., Scientific American, November 1979, pp 128 - 140.
46. Blow, K.J., and Simon J.D. Phoenix, On a fundamental theorem of quantum
cryptography, Journal of Modern Optics, 1993, vol. 40, no. 1, 33 - 36.
47. Bennett, C. H., Brassard, G., Crepeau, C. and Maurer, U. M., "Generalized
Privacy Amplification", IEEE Transactions on Information Theory, 1995.
Chapter 4
Adaptive Architecture to Protect Mobile Agents
Nardjes BOUCHEMAL, LIRE Laboratory, Computer Science Department

Ramdane MAAMRI, Mentouri University of Constantine, Algeria
ABSTRACT
Mobile agents are a new paradigm for distributed computation, in which a mobile
agent roams the global Internet in search of services for its owner. The main
problem with this approach is security.
The objective of this paper is to propose a protocol to protect the mobile agent,
based on two agents: the mobile agent and an investigator agent. The investigator
is a prototype of the mobile agent with no critical code or data. It is created and
sent by the mobile agent in order to be executed first. On its return, the investigator
agent is analyzed by the mobile agent to detect any malicious action: if actions are
forbidden, the mobile agent makes a new copy and changes destination. If actions
are doubtful, the mobile agent chooses an adaptation plan and then migrates. If all
actions are authorized, the mobile agent migrates with confidence.
Keywords: Protection, Mobile agent, Adaptability investigation.
INTRODUCTION
Mobile agents are program instances able to migrate from one agent platform
to another [1], [3], [9], thus fulfilling tasks on behalf of a user or another entity.
They consist of three parts: code, data state (e.g. instance variables), and
execution state.
They transport sensitive information such as secret keys, electronic money,
and other private data.
Consequently, security is a fundamental precondition for the acceptance of mobile
agent applications. In other words, we need a program that actively
protects itself against malicious hosts that try to attack the mobile agent in order to
obtain a service without providing payment, to remove private information from
the agent’s memory, or to destroy its code, state or data [16], [27].
We can classify host attacks against mobile agents into three classes [4], [8],
[14]: inspection, modification and replay attacks. Inspection consists of examining
the contents of the agent, or its stream of execution, to retrieve critical information
transported by the mobile agent.
Modification is carried out by replacing some elements of the agent with the aim
of mounting an attack.
Replay attacks are carried out by cloning the agent, then executing the
clone in several configurations to discover the agent’s knowledge.
We also mention denial of service [12], where a malicious host can ignore
service requests, introduce unacceptable delays for critical tasks, refuse to execute
the code of the mobile agent, or terminate it without notice. Other agents waiting for
this agent's answer will then be deadlocked.
Different approaches have been proposed to guarantee agents a trusted execution
on visited hosts, such as tamper-proof hardware [12], function hiding [13], black box
[6], [15] or clueless agents [11].
In this paper we present a proposal to protect the mobile agent, based on
the concepts of cloning and adaptability.
The investigator is a copy of the mobile agent with no critical code or data. The
mobile agent creates its investigator, saves a copy and sends it first to the host. After
execution, the investigator returns to the mobile agent, which compares it with the
saved copy in order to detect possible attacks.
Thus, instead of exposing the mobile agent to a malicious host, we propose to first
send an investigator to probe for possible attacks. This idea minimizes damage
to the mobile agent and allows several types of attacks to be detected and avoided
as soon as they occur.
We discuss how well our proposal protects mobile agents against these attacks
in section 6. We begin with section 2, where we present some works that use one
or several agents to protect a mobile agent, and other works based on adaptability.
Section 3 describes the principle, architecture, and steps of the proposed system.
Section 4 presents some implementation principles. In section 5 we propose an
experiment to compare our approach with two other approaches.
We evaluate the protocol in section 6. Finally, section 7 summarizes our
contribution and describes future work.
Related Work
In this section, we summarize some proposed techniques based on
adaptability, and others based on one or more agents.
A. Adaptability to protect mobile agents
Adaptability concept
Adaptation designates the act of reacting to variations in environment
constraints [20], [21].
Adaptation can be static or dynamic. Static adaptation is done before
execution, according to the available knowledge of the environment.
Dynamic adaptation is based on estimating, during the construction of the
application, the different variations of the environment and defining adaptation
actions. Consequently, it defines adaptability rules.
Works based on Adaptability
S. Hacini et al. [14] proposed an approach which offers the mobile agent the
possibility of modifying its behaviour. This ability makes it unpredictable and
complicates its analysis [2]. The idea is that the mobile agent must verify the
customer's trustworthiness and present an appropriate behaviour.
S. Leriche and J.P. Arcangeli [20], [21] proposed an architecture based on
replaceable micro-components to adapt the mobile agent to various execution
conditions. We use this idea in our architecture to adapt the mobile agent to
different attacks.
B. Works based on other agents

Corradi et al. [6], Yao et al. [25], [26] and Guan et al. [10] used mechanisms
with a TTP (Trusted Third Party) to protect the mobile agent’s data. The TTP records
itinerary information directly or indirectly. The main problem is that they need at
least one TTP, and the mobile agent needs to communicate with it constantly. The
TTP thus becomes a bottleneck and even a single point of failure. Moreover, it is
not easy to find a TTP on the open Internet.
El Rhazi and S. Pierre [7] proposed an approach based on the cooperation of a
sedentary agent running inside a trusted third host. Results show that the protocol
detects several attacks, such as denial of service, incorrect execution and
re-execution of the mobile agent's code. But its major limitation is that the sedentary
agent can do nothing against these attacks, and the mobile agent continues its
migration uselessly, so all hosts visited afterwards (after a code modification attack)
will be considered malicious even if they are not.
Ouardani and S. Pierre [22] proposed a technique, considered an
improvement of [7], based on two sedentary agents. Their role is to follow the
mobile agent step by step by verifying its itinerary, measuring the execution time
to detect various types of attacks, and verifying its state every time.
The Protocol
A. Origin (Military Life)
Our proposal is inspired by the principle of military investigators, or scouts.
Their main mission is to acquire information about opponent territory, to anticipate
dangers and obstacles, then to report on all visited places and on the dangers
or attacks encountered. These reports are handed to the troop leader for analysis.
If he decides to go, he has to choose a policy of adaptation to the nature of the
environment and to the opponent's attacks.
He can ask for reinforcement, or choose to go by air, by boat or by road. The
leader can decide not to go at all if the investigators do not return.
Note that these investigators have no idea of the main mission of their troop; they
only scout the road. Even if they are captured, they hold no sensitive or
critical information.
Thus, the principle consists of first sending a prototype agent, called the
investigator, to the destination host.
The investigator is a copy of the mobile agent: its code is a part of the mobile agent’s
code and does not convey any important knowledge. Investigator data is also a
copy of the mobile agent’s non-sensitive data.
The mobile agent creates its investigator, keeps a copy and sends it to the destination
host to be executed. On its return, the mobile agent compares it with the saved copy
in order to detect any modification of data or code.
B. Architecture components
We use two different entities: the mobile agent (MA) and the investigator agent (CA).
We detail the components of each entity below (Fig. 1):
1) Mobile Agent
The mobile agent contains code, static and dynamic data, and modules: interface,
analyzer, adapter, manager, component library and action library.
Code
Created by the owner, it expresses the mobile agent's knowledge and consists
of several replaceable components.
Static data
Static data are unchangeable data, such as the creator's identity, its digital signature,
data used during the mobile agent’s execution, the estimated investigator runtime (in
order to compare it later with the real execution time), decryption keys to decrypt
the investigator’s report, and other keys to encrypt collected data, partial results and
the mobile agent's itinerary.
Dynamic data
This part is dynamic because it evolves with every migration. It holds the black list
of hosts (BLH), the trusted hosts list (THL) and the doubtful or untrusted hosts list (UHL).
Collected data and partial results are encrypted and saved after every migration
(the decryption key is kept only on the home host to guarantee data confidentiality).
In addition, every visited host is recorded and encrypted by the mobile agent to form
the itinerary.
Investigator’s copy
The mobile agent backs up a copy of the investigator agent (code and data) before
migration towards a site, in order to make the comparison and analysis after its return.
Action library
The action library contains a set of predefined events, divided into three classes:
• Class of allowed actions: it contains all actions that the investigator
can perform without any problem, for example execution, migration, etc.
• Class of forbidden actions: it contains forbidden actions which
can violate the integrity or confidentiality of the investigator, for example
deletion or modification of code.
• Class of doubtful actions: it contains a set of doubtful actions for which the
mobile agent is not sure whether they are malicious, for example a copy-paste
of code.
Interface
We propose an interface between the mobile agent and the investigator. It receives
the data, code and report of the investigator, then sends them to the analyzer.
Analyzer
The role of this part is to analyze the investigator agent: code, data and report. No
modification may be made to the code or static data. Dynamic data or partial results
must be signed by the visited host; otherwise, the host is considered malicious.
The analyzer compares the investigator’s report with the action library:
• If actions are forbidden (the attack is grave), the analyzer informs the manager,
which changes the investigator's destination.
• If actions are doubtful (the attack is avoidable), the analyzer informs the adapter,
which chooses suitable components from the component library and asks the
manager to replace components in the mobile agent’s code in order to adapt it.
• If all actions are allowed, the mobile agent migrates with confidence.
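
The analyzer's three-way decision can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the action names, the `Verdict` enumeration and the rule that an action missing from the library is treated as doubtful are hypothetical choices, not taken from the proposed architecture.

```python
from enum import IntEnum


class Verdict(IntEnum):
    ALLOWED = 0
    DOUBTFUL = 1
    FORBIDDEN = 2


# Hypothetical action library, following the three classes of the text.
ACTION_LIBRARY = {
    "execute": Verdict.ALLOWED,
    "migrate": Verdict.ALLOWED,
    "copy_paste_code": Verdict.DOUBTFUL,
    "delete_code": Verdict.FORBIDDEN,
    "modify_code": Verdict.FORBIDDEN,
}


def analyze_report(report):
    """Return the most severe verdict over all actions in the report."""
    verdict = Verdict.ALLOWED
    for action in report:
        # ASSUMPTION: actions absent from the library are treated as doubtful.
        verdict = max(verdict, ACTION_LIBRARY.get(action, Verdict.DOUBTFUL))
    return verdict
```

Ordering the verdicts by severity lets a single forbidden action dominate the outcome, which matches the rule that the mobile agent migrates with confidence only if all actions are allowed.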
Component Library
It contains replaceable components used by mobile agent during adaptation.
These components are fragments of code chosen by adapter following
investigator’s report.
Adapter
In this section we rely on the architecture proposed by Hacini et al. [14] and
Leriche et al. [21]. The role of the adapter is to adapt the mobile agent according to
the events report, by choosing suitable components and sending them to the manager
in order to change the mobile agent's structure. The adapter contains a set of rules
of the type:
<If Action (i) Then Component (j)>.
For example, if the code has been analysed, the adapter decides to apply a stronger
code protection policy by choosing the obfuscation technique [17]. It thus adapts
the mobile agent to an attack situation by loading the component “obfuscation code”.
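
The rule set of the adapter can be sketched as a simple lookup table. This is a minimal Python illustration: the identifiers `ADAPTATION_RULES`, `choose_components` and the action and component names are hypothetical. Only the pairing of code analysis with the obfuscation component comes from the text; the copy-paste rule mirrors the experiment described later in the chapter.

```python
# Rules of the form <If Action(i) Then Component(j)>.
# ASSUMPTION: the identifiers below are illustrative names only.
ADAPTATION_RULES = {
    "analyse_code": "obfuscation_code",
    "copy_paste_code": "obfuscation_code",
}


def choose_components(report, rules=ADAPTATION_RULES):
    """Return the components to load, in order and without duplicates."""
    chosen = []
    for action in report:
        component = rules.get(action)
        if component is not None and component not in chosen:
            chosen.append(component)
    return chosen
```

The returned list is what the adapter would hand to the manager, which then replaces the corresponding components in the mobile agent's code.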
Manager
Its role is to change the mobile agent's structure to adapt it in case of attacks.
The manager makes a new copy of the investigator and changes its destination in case
of forbidden actions. It adds sites to the black list, the trusted list or the untrusted list.
If there is no attack, the manager is informed by the analyzer and enables the mobile
agent to migrate with confidence.
2) Investigator agent
The investigator is a prototype of the mobile agent with no important code or data.
Our aim is to avoid directly exposing the essential knowledge and sensitive data of
the mobile agent before testing the trustworthiness of the host and identifying
whether it is malicious.
The investigator has static data, such as the creator's digital signature, and dynamic
data, such as partial results collected on the host. The investigator agent also has a
report of all actions performed on the host, and it communicates with its mobile agent
through an interface.
These actions are encoded with an encoding key known only to the mobile agent and
changed for every migration, in order to avoid the risk that a malicious host performs
a forbidden action and then tries to delete it from the action report.
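
The text does not specify how the actions are encoded. One possible way to make deletions from the report detectable is a keyed hash chain in which the key evolves after each entry. The Python sketch below uses this standard technique as a stand-in; it is not the authors' scheme, and the function names and the per-migration key are assumptions.

```python
import hashlib
import hmac


def evolve(key):
    """Derive the next key; the previous key is meant to be discarded."""
    return hashlib.sha256(b"evolve" + key).digest()


def append_action(report, key, action):
    """Append (action, tag) to the report and return the evolved key."""
    tag = hmac.new(key, action.encode(), hashlib.sha256).hexdigest()
    report.append((action, tag))
    return evolve(key)


def verify_report(report, initial_key):
    """Recompute the chain from the initial key known to the mobile agent."""
    key = initial_key
    for action, tag in report:
        expected = hmac.new(key, action.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(tag, expected):
            return False
        key = evolve(key)
    return True
```

Removing an entry from the middle of the report shifts every later entry onto the wrong key, so verification fails. The sketch does not detect removal of the final entries unless the expected number of entries is also known.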
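[Figure 1: block diagram. The MOBILE AGENT contains dynamic data, static data, code, the components library, the actions library, the investigator copy, and the MANAGER, ADAPTER, ANALYZER and INTERFACE modules. The INVESTIGATOR contains data, code and report. Legend: information flux; communication of the code, data and report of the investigator.]
Figure 1. Proposed Architecture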
C. Steps of the protocol

A. First, the mobile agent is on the site of origin H0. Before migrating to a site H1, it
creates an investigator, keeps a copy, then sends it to H1. The mobile agent
estimates the investigator's execution time.
B. On arriving at host H1, the investigator computes an environmental key by
performing several predefined actions on information collected from H1.
The environmental key aims at identifying the host before beginning the execution
[23]. If this key is not valid, the investigator returns to its mobile agent, which
changes the destination. If the key is valid, the investigator begins its execution.
C. The mobile agent analyzes the investigator. We can envisage the following situations.
4.1 The mobile agent measures the elapsed time and compares it with the estimated
execution time. If the estimate is exceeded, the mobile agent concludes that the
investigator has either been “killed” or had its destination modified, and is therefore
"lost". It puts this host in the black list and decides not to migrate.
4.2 If the investigator returns, the mobile agent compares it with the saved copy. If it
finds that the code or static data have been modified, it puts the responsible host in
the black list and decides not to go.
4.3 If there is no modification of the code or static data, the mobile agent verifies
whether the investigator's partial results are signed. The objective is to ensure
non-repudiation, so that a host cannot deny having received the investigator.
At the end of steps 4.1, 4.2 and 4.3, the mobile agent must create another investigator
and send it to another destination.
4.4 If the code and data are not modified and the partial results are signed, the mobile
agent analyzes the investigator’s report and compares its actions with the action library:
• If it finds forbidden actions (such as deletion of code or data), it puts the host in
the black list (BLH), makes another copy of the investigator and sends it to
another destination.
• If it finds doubtful actions (such as copy-paste), it puts the host in the list of doubtful
sites (UHL), chooses suitable components, then changes the structure of its code by
replacing components. The goal is to increase the security level of the mobile agent
in order to avoid the attack.
• If all actions in the report are allowed, the mobile agent registers the host H1 in its
list of trusted hosts (THL), adds it to the itinerary in encrypted form, then migrates.
D. At the end of its mission, the mobile agent returns to its host of origin with all
collected data and the lists of malicious, doubtful and trusted hosts. These lists are
used for future migrations.
E. Before continuing on its way to a host H2, the mobile agent verifies that H2 is
not in its black list, creates a new investigator and repeats the same steps.
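
The checks of steps 4.1 to 4.4 can be summarized in a single decision procedure. The Python sketch below is a hypothetical illustration, not the JADE implementation: the names `HostLists`, `fingerprint` and `decide`, the action names, the use of a SHA-256 digest to compare the investigator with its saved copy, and the blacklisting of a host that returns unsigned results are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field

FORBIDDEN = {"delete_code", "delete_data", "modify_code"}  # hypothetical names
DOUBTFUL = {"copy_paste_code"}                             # hypothetical names


@dataclass
class HostLists:
    blacklist: set = field(default_factory=set)  # BLH
    untrusted: set = field(default_factory=set)  # UHL
    trusted: set = field(default_factory=set)    # THL


def fingerprint(code, static_data):
    """Digest of the investigator's code and static data (both bytes)."""
    return hashlib.sha256(code + b"\x00" + static_data).hexdigest()


def decide(host, lists, saved_fingerprint, returned, estimated_time, elapsed):
    """Return 'redirect', 'adapt_then_migrate' or 'migrate' for one host.

    returned is None when the investigator did not come back; otherwise it is
    a dict with keys 'code', 'static_data', 'results_signed' and 'report'.
    """
    # Step 4.1: investigator lost or killed (estimated time exceeded).
    if returned is None or elapsed > estimated_time:
        lists.blacklist.add(host)
        return "redirect"
    # Step 4.2: code or static data differ from the saved copy.
    if fingerprint(returned["code"], returned["static_data"]) != saved_fingerprint:
        lists.blacklist.add(host)
        return "redirect"
    # Step 4.3: unsigned partial results. ASSUMPTION: the host is blacklisted.
    if not returned["results_signed"]:
        lists.blacklist.add(host)
        return "redirect"
    # Step 4.4: compare the report with the action library.
    actions = set(returned["report"])
    if actions & FORBIDDEN:
        lists.blacklist.add(host)
        return "redirect"
    if actions & DOUBTFUL:
        lists.untrusted.add(host)
        return "adapt_then_migrate"
    lists.trusted.add(host)
    return "migrate"
```

A result of `"redirect"` corresponds to creating another investigator and sending it to a different destination, while `"adapt_then_migrate"` corresponds to replacing components before migrating.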
Implementation
In order to demonstrate the viability of our protocol, prototypes of the mobile and
investigator agents were created.
The current implementation is built on the JADE agent platform (Java Agent
Development framework). The main reason for this choice is that
JADE is one of the best modern agent environments [5]. Furthermore, JADE is
open-source and FIPA compliant [1].
In what follows, we describe some principles of our implementation:
① The mobile agent and the investigator agent reside in JADE containers. In order to
simulate different attacks, we implement an agent, called the Testing
Agent (TA), in the secondary container.
② Its role is to generate attacks on the investigator agent in order to observe the
behaviour of the mobile agent.
③ The mobile agent, investigator agent and testing agent are defined with the Agent
class (package jade.core).
④ All proposed modules (manager, adapter, and analyzer) are defined as Java
classes. The action library and component library are represented by databases.
⑤ All proposed agents must be registered with the DF agent (yellow pages service,
class DFService, package jade.domain).
⑥ Communication between agents is carried out in the FIPA ACL language (class
ACLMessage, package jade.lang.acl). We use the Dummy agent to visualize
messages between agents (Fig. 2).
⑦ All agents’ behaviours are defined with the class Behaviour (package
jade.core.behaviours).
Figure 2. Messages with Dummy Agent
Experiment
In order to validate our protocol, we designed an experiment that consists of
creating three mobile agents (Fig. 3). MobileAgent1 is protected with a
detection-based approach, in which an attack is detected after the return to the home site.
The second agent, MobileAgent2, is protected with a TTP (Trusted Third Party).
The TTP is a server where MobileAgent2 is verified after each migration.
The third agent, MobileAgent3, is protected with the proposed protocol.
Figure 3. Different Containers and Agents Used in the experiment
To simulate different attacks, TA generates behaviours (Fig. 4) classified
in three classes: forbidden behaviours, doubtful behaviours and peaceful
behaviour (i.e. no attacks).
Firstly, the testing agent generates grave attacks (A4: Modification, A5: Kidnapping).
MobileAgent1 cannot detect these attacks before it returns to its host of origin.
Moreover, it can lose track of the responsible host.
The TTP can detect that MobileAgent2 has been modified or kidnapped, but it needs
more time.
MobileAgent3 can easily detect this kind of attack by comparing the investigator
agent with its saved copy and by measuring the investigator's run time. The
malicious host is recorded in the black list.
Secondly, the testing agent generates doubtful attacks: we choose spying on the
investigator code (A3) and the copy-paste action (A2). MobileAgent1 has no way of
knowing that it was copied or analyzed. At best, its owner can detect these
attacks, but too late.
The TTP is able to detect these actions on MobileAgent2 when it verifies it.
However, the TTP needs more time for verification and cannot prevent the attacks.
MobileAgent3 can detect these attacks once it analyzes the investigator report. In
this case, it adapts itself by choosing the obfuscation component, which makes
the code incomprehensible [18].
We observed that the attack was ineffective after adaptation.
Finally, the testing agent does not attack any mobile agent. MobileAgent1 and
MobileAgent2 cannot be sure that the host is trustworthy. MobileAgent3 has this
assurance and can migrate with confidence.
Figure 4. Testing Agent Interface
For every agent, we measured the time needed to detect attacks, given that the
total round-trip time to a host and back to the host of origin is 50 seconds.
The resulting comparison graph is presented in Figure 5.
Note that the larger the number of sites to be visited, the longer the attack
detection time of MobileAgent1 and MobileAgent2.
Figure 5. Comparison between approaches
EVALUATION
Guan et al. [10], Karjoth et al. [19], Yao et al. [25] and Sander et al. [24]
used the following points as security properties of mobile agents. We use some of
them as criteria to analyse our protocol:
• Confidentiality
Analysis of the investigator agent's report reveals whether its code or data were
copied. Moreover, the environmental key guarantees confidentiality, because only
authorized hosts can access first the investigator and then the mobile agent.
• Non-repudiation
The signature of results by the host and the encrypted recording of the itinerary
ensure this property, because no site can deny that it was visited.
• Integrity
It is verified when the investigator agent is analysed by the mobile agent, which
compares its code with the saved copy to detect any corruption or modification.
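The integrity check on the returned investigator can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the use of SHA-256 digests for the comparison, and the max_run_time_s threshold are assumptions; the chapter only states that the code is compared with the saved copy and that the investigator's run time is measured.

```python
import hashlib


def digest(code: bytes) -> str:
    """Fingerprint of the investigator's code (SHA-256 is an assumed choice)."""
    return hashlib.sha256(code).hexdigest()


def investigator_intact(saved_code: bytes, returned_code: bytes,
                        run_time_s: float, max_run_time_s: float) -> bool:
    """Return True only if the code is unchanged and the run time is plausible.

    A differing digest suggests modification; an excessive run time suggests
    that the investigator was held or analysed by the host.
    """
    same_code = digest(saved_code) == digest(returned_code)
    in_time = run_time_s <= max_run_time_s
    return same_code and in_time
```

Comparing digests rather than full copies keeps the saved reference small, at the cost of relying on the collision resistance of the hash function.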
Conclusion
In this paper, we looked at the general problem of mobile agent protection
against malicious hosts and at different proposed works. Then, we presented a
protocol to ensure mobile agent protection. The idea is based on prevention
and requires two agents: a mobile agent, with knowledge and critical data, and a
prototype, with no sensitive code or data. This prototype is called the
investigator agent.
Before migrating to a host, the mobile agent creates an investigator and sends it
first. On its return, the investigator is analyzed in order to detect attacks. If
the investigator was attacked, the mobile agent chooses an adaptation policy,
which consists in replacing components to suit the attack situation. The mobile
agent decides not to migrate at all if the attacks are severe.
We described the proposed architecture and its implementation using JADE.
We discussed the capacity of this protocol to verify various security properties.
Several areas of the work presented here require further investigation; two
particularly interest us. Firstly, we would like to assess the performance of our
proposal in a real case; we will choose e-commerce. Secondly, we would like to
develop JADE mobile agents which are more application specific.
REFERENCES
1. Ametller J, Cucurull J, Mart R, Navarro G, Robles S, “Enabling mobile agents
interoperability through fipa standards”, Lecture Notes in Artificial Intelligence,
CIA 2006, vol. 4149, Springer, Edinburgh, UK, 2006, pp. 388–401.
2. Belaramani N.M, “A component -based software system with functionality
adaptation for mobile computing”. Master’s thesis, 2002.
3. Bernard G., Ismail L, « Apport des agents mobiles à l'exécution répartie »,
Revue des sciences et technologies de l’information, série Techniques et
science informatiques, vol. 21, n 6, p. 771-796, 2002.
4. Biehl I, Meyer B and Wetzel S, “Ensuring the Integrity of Agent-Based
Computations by Short Proofs”. Proceedings of the second International
Workshop on Mobile Agents. LNCS, Vol. 1477, pages 183-194, 1998.
5. Chmiel K, “Agent Technology in Modelling E-Commerce Processes; Sample
Implementation”, Multimedia and Network Information Systems, Volume 2,
Wrocław University of Technology Press, pp. 13-22
6. Corradi A, Montanari R, and Stefanelli.C, “Mobile agents integrity in e-
commerce applications,” in Proc. of the 19th IEEE International Conference on
Distributed Computing Systems Workshop (ICDCS’99), 1999, Austin, Texas:
IEEE Computer Society Press, pp. 59-64
7. El Rhazi A, Pierre S, Boucheneb H, « Secure protocol in mobile agent
environment ». IEEE CCECE 2003, May 4-7, vol. 2, Montreal, pp. 777-80.
8. Farmer W, Guttman J, and Swarup V, “Security for Mobile Agents:
Authentication and State Appraisal”. In Proceedings of the 4th European
Symposium on Research in Computer Science (ESORICS'96), September
1996, pp. 118 - 130.
9. Fuggetta A, Picco G, Vigna G, « Understanding Code Mobility », IEEE
Transactions on Software Engineering, vol. 24, n5, p. 342-361, 1998.
10. Guan H, Meng X, Zhang H, “A forward integrity and itinerary secrecy protocol
for mobile agents,” Wuhan University Journal of Natural Sciences, china,
vol.11, No.6, pp. 1727-1730, 2006.
11. Guessoum Z, “Modèles et architectures agents et de systèmes multi-agents
adaptatifs”. Thèse d'habilitation de l'université de Paris, 2003.
12. Guessoum Z, Ziane M and Faci N, “Monitoring and Organizational Level
Adaptation of Multi-Agent Systems”. AAMAS'04, ACM, New York City pages
514—522, 2004.
13. Hacini S, Guessoum Z and Boufaida Z, “Using a Trust-Based Key to Protect
Mobile Agent Code”. World Enformatika Society, CCIS'2006, Venice, Italy,
pages 326-332, 2006.
14. Hacini S, Cheribi C and Boufaïda Z, "Dynamic Adaptability using Reflexivity for
Mobile Agent Protection", in Transactions On Engineering, Computing And
Technology Enformatika Cairo, Egypte, December 2006.
15. Hohl F, “A framework to Protect Mobile Agents by Using References States”.
ICDCS 1999.
16. Hohl F, “A Model of Attacks of Malicious Hosts against Mobile Agents”. In
Fourth Workshop on Mobile Object Systems (MOS'98): Secure Internet Mobile
Computations http://cuiwww.unige.ch/coopws/ws98 /papers/ hohl.ps, 1998.
17. Hohl, F.: Time limited blackbox security: protecting mobile agents from
malicious hosts. In: Vigna, G. (ed) Mobile agents and security. Lecture Notes
in Computer Science, Vol. 1419, pp. 52–59. Springer, Heidelberg (1998)
18. Jansen W, “Countermeasures for mobile Agent security”, Computer
communications, Special issue on Advance Security Techniques for Network
Protection, Elsevier Science, 2000.
19. Karjoth G, Asokan N, and Gulcu C, “Protecting the computation results of
freeroaming agents,” in Rothermel, K. and Hohl, F. (eds.), in Proc. of the 2nd
International Workshop, Mobile Agents 98.
20. Leriche S, Arcangeli J.P, “Une architecture pour les agents mobiles
adaptables”. Dans Journées Composants JC’2004, Lille 2004.
21. Leriche S, « Architectures à composants et agents pour la conception
d’applications réparties adaptables », Thèse de doctorat, 2006.
22. Ouardani A, Pierre S, Boucheneb H, “A Security protocol for mobile agents
based upon the cooperation on sedentary agents”, Journal of Network and
Computer Applications 2006, pp 1228-12
23. Riordan, J, Schneier, B, “Environment key generation towards clueless
agents”. Lect. Notes Comput, 1998
24. Sander T, Tschudin C, “Protecting Mobile Agent against Malicious Hosts”,
G.Vigna (Ed.), Mobile Agents and Security, Lecture Notes in Computer
Science, Vol. 1419, ©Springer-Verlag Berlin Heidelberg, Berlin, 1998.
25. Yao M, Peng M, and Dawson E, “Using ‘Fair Forfeit’ to Prevent Truncation
Attacks on Mobile Agents,” in Proc. of the 10th Australasian Conference on
Information Security and Privacy ACISP 2005, Brisbane, Australia, pp.158-
169,2005
26. Yao M, Peng K, Matt Henricksen M, Foo E, and Dawson E, “Using
Recoverable Key Commitment to Defend Against Truncation Attacks in Mobile
Agents,” in Proc. of the 5th International Conference on E-Commerce and
Web Technologies (EC-Web 2004), volume 3182 of Lecture Notes in
Computer Science, pp.164-173. Springer-Verlag, 2004
27. Zachary J, “Protecting mobile code in the wild”. IEEE
Chapter 5
Communication through expressing and remixing: Workshop and System
Kosuke Numa and Koichi Hori
The University of Tokyo, Japan
ABSTRACT
This paper presents our participatory expressive workshop and an information
system to support it. Our aim in this research is to cultivate communication in
local communities. Expressing thoughts is a first step towards communicating with
each other, and creating new stories by remixing others' expressions helps
participants exchange and grasp others' thoughts. We propose a workshop program
and a model of content circulation, and develop a system to realize them. Our
system supports decomposing and recomposing by automatic draft content
generation. We implemented the system for a workshop in which participants
created content based on a format of expression named photo-attached acrostics.
Through observation of the practice, we concluded that our framework could help
content decomposition.
Keywords: Participatory workshop, Creative activity support, Content
circulation, Content recomposition, Automatic content generation.
INTRODUCTION
With the rapid spread of mobile devices and the Web, we live in a world of
explosively increasing volumes of information. Through real-time content
publishing in daily life, it has become easier for people in different places to
communicate with each other, however far apart they may be. Despite this
situation (or, should we say, because of it), communication in local communities
is neglected. People in Japanese urban areas often do not know their neighbors
well, much less what their neighbors are thinking. To address this, we developed
a participatory expressive workshop and a support system.
Our aim in this research is to provide opportunities for people in a local area to
discuss their area so that they understand their place and the community more
deeply and more widely.
The first step of communication is to express your thoughts. Remixing expressions
will lead participants to exchange and understand others' opinions. Our workshop
and system are designed together for this purpose. We designed and developed
both an information system and activities to utilize it. Our technological focus
is on a content circulation framework which includes creation (expressing) and
reuse (remixing).
BACKGROUNDS AND APPROACH

Our aim in this research is to cultivate communication in local communities.
People in urban areas sometimes do not have enough opportunities to discuss the
area with their neighbors. A participatory expressive workshop is our approach to
drawing out their stories.
Storytelling and narrative approaches are gaining wide
acceptance in several fields, including psychology, folklore, education, and
therapeutics. People articulate and interpret their temporal experience by telling
stories [1,2]. Storytelling helps people understand and manage their experiences.
In recent years, digital storytelling [3] evolved out of grassroots movements in the
US and the UK. Digital stories are typically produced in a workshop style.
Figure 1. Workshop as a system
A workshop is a specially managed place where people gather and act
collaboratively. In a typical digital storytelling workshop, participants combine
their narrated words and photos to produce short films. The workshop style is key
to helping ordinary people tell their stories. Bruner asserts that a story is
produced from a joint act of a storyteller and hearers [4]. In a workshop,
facilitators and other participants play hearer roles.
In the context of communication, acceptance of stories plays an important
part. Communication is not one-way information transfer. Hearers are asked to
understand and adopt others' opinions. To promote acceptance of stories, our
workshop provides an expression style in which participants remix their own and
others' expressions.
We developed both a workshop program and an information system to support the
workshop. The system first provides basic functions that realize the workshop,
such as storing and showing expressions. In addition, it stimulates participants
with machine-generated remixed content to provide new perspectives.
As a technological background, our work provides a new model of content
circulation, which includes a loop of content creation. The creation of
information is difficult to support directly. Truly “new” information is rarely
produced; usually new combinations of information are devised. In this research,
we propose a framework for recomposition of stored content from a creativity
support research perspective.
In our framework, when a user produces new content, the system shows draft
content that is automatically generated by remixing the user's and others'
content. The user finishes her content by selecting and modifying a draft.
Through this process, we aim to develop an iteratively growing loop of
expressions. In the loop, others' content is taken into a user's newly produced
content, and that content is in turn used in others' content.
In our research, we designed a new format of expression to emphasize the loop
of content creation and recomposition. The format, photo-attached acrostics,
contains pairs of pictures and sentences. This format is easy to take apart into
partial expressions. The details of the format will be shown in section 5.
The workshop we focus on in this research is a participatory, experiential,
group-work-based style of learning and creation. Workshops are held in various
fields: arts such as theatre, handicrafts and music; citizen-participatory town
planning; and learning, such as training in companies and classes in schools.
A workshop is arranged and organized by a facilitator. The facilitator establishes
tasks and prepares a place. Participants work together on the tasks in the place.
A shared place and shared tasks help participants form opinions and produce
expressions. In some cases participants collaborate and in others they compete.
Lave discussed the process of learning, creation, and consensus formation in a
group called a Community of Practice [5], where people share techniques,
interests, or concerns. Commitment to the Community of Practice is activated by
roles that participants are required to play, such as master and apprentice [6].
This theory, Legitimate Peripheral Participation, explains why participatory
workshops gain participants' active commitment. A person who plays a participant
role is asked to carry out tasks based on the program prepared by the facilitator.
Figure 1 illustrates the concept of our proposed workshop as a system. The
core elements of a workshop are participants, facilitator, information system(s),
tasks, and place.
Every person related to a workshop plays a role such as participant or
facilitator. They gather in a place and work on given tasks. The information
system supports people's activities in the workshop. Together, these form a
creativity support system called a workshop; a participant is also a part of the
system. A participant gets new ideas and creates expressions through tasks in a
workshop, and at the same time contributes to others' creation as a part of the
system.
The context in which an expression is placed changes when one gets a
new idea from another participant or from another expression. An information
system, however, can analyze and extract the structures of expressions only from
their surface form, not from people's subjective thoughts. A balanced
combination of outputs from information systems and interaction with other people
will stimulate participants effectively. A workshop, at the same time, requires
participants to produce expressions as tasks in its program.
RELATED WORK
Creativity Support
In the early 1990s, a research area called creativity support emerged.
It addresses questions such as how computers can support human creative activity
and what kinds of creative activity can be supported.
Boden distinguished two sorts of creativity: H-creativity, which indicates
historically new idea/concept formation, and P-creativity, psychologically new
idea/concept formation in human minds [7]. In our research, we aim at P-creativity
support rather than H-creativity support. For ordinary people, our target users,
what they express (the externalization of nebulous internal thoughts) is more
important than how they express it (the surface originality of expressive
techniques).
In psychology, Guilford made the distinction between convergent and
divergent thinking [8]. Our approach emphasizes neither of them in particular,
but if we had to choose, it is closer to divergent thinking. One of our aims is to
support expressing, which seems to be a convergent process; but widening users'
views and freeing them from stuck thinking are more important.
Many creative methods have been proposed, including the KJ method
[9] and brainstorming [10], and many computer systems that support such methods
have been developed [11].
Automatic Content Generation

Our research aims to stimulate users by presenting draft expressions. This
does not mean that the system takes the user's place to “create” expressions; it
just presents candidates. Users decide whether to insert a draft into their
expression, and if so, they select which candidate to add and modify it as they
wish. Letting users place the generated candidates into their own content prompts
them to think deeply about it. Automatic content generation techniques are
nevertheless useful for our purpose.
Bringsjord and his group developed a system called Brutus1, which generates
literary stories [12]. Knowledge bases and grammar rules are programmed in
advance, and it generates quite readable and natural stories. When sufficient
knowledge and enough rules are provided, machines can generate high-quality
unexpected expressions.
AARON, programmed by painter Harold Cohen, is known as painting software
[13]. AARON generates paintings according to parameters given by Cohen. There
is an interesting story: someone asked him “who is the ‘creator’ of the paintings?”
Cohen claimed that AARON does not paint, but Cohen paints using AARON. This
is exactly what we emphasize: a system is a tool for creation. An output of the
system becomes an expression only after it is evaluated and accepted by the user
as her expression. If she is not satisfied, she can modify parameters or edit the
output, and then “create” her work. Here an output is a stimulus for the user.
Multiple document summarization [14, 15] is technically related to our research.
We have not implemented these techniques, but these will be helpful.
Knowledge Models in Knowledge Management

Our research aims to design a new loop of content circulation. In the
knowledge management area, the SECI model [16] is widely known. SECI is the
abbreviation for Socialization, Externalization, Combination, and Internalization,
which are the processes of the knowledge cycle. Shneiderman categorized creative
activities into the following four activities: “collect,” “relate,” “create,” and
“donate” [17]. Ohmukai et al. expanded Shneiderman's model to distinguish an
information activity layer and a communication activity layer [18]. In their
model, called the ICA (Information and Communication Activities) model, two
layers of information activities (“collect,” “create” and “donate”) and
communication activities (“relate,” “collaborate” and “present”) form cycles
related to each other.
Hori and his group developed a cycle model which consists of knowledge
liquidization and crystallization processes [19]. Liquidization is the
decomposition of expressions into units of proper granularity, with every
possible connection among them. Crystallization is the formation of new
expressions from the decomposed partial units based on new relationships within
the context. Our research is based on this concept [20]. In our proposed
framework, a system decomposes and recomposes collected users' expressions.
Figure 2. Proposed framework
Figure 4. Architecture of developed system
CONTENT CIRCULATION FRAMEWORK
Figure 2 illustrates our proposed framework for content circulation. Content
created by multiple users is stored in the database. The recomposing engine
decomposes stored content and generates draft content. Here we aim not to
create complete content but to stimulate users. The support interface shows
drafts to a user, and she edits and finishes her content. These operations are fed
back to the recomposing engine, which then shows other drafts.
We show two levels of interaction loops here. Direct, local interaction
between users and the support system forms the editing and stimulating loop.
Remixing and reusing stored content forms the indirect, global interaction loop.
This model is applicable to various manners of creating and publishing
content. For example, the writing process for papers or blog entries includes an
information collection phase and an editing phase. Of course authors need to add
their own original opinions, but candidate combinations of related information
will help their thinking. The format of expression can vary and is not limited to
text. While we expect that the framework can be applied to any type of content,
we focus on text content in this research. Decomposition and recomposition are
realized by standard text processing techniques. We wanted to focus not on
implementation techniques but on the content circulation framework itself. For
that reason, we held a participatory workshop where participants created their
content and recomposed it into new content.
PHOTO-ATTACHED ACROSTIC WORKSHOP

We designed and organized a workshop as a field practice [21]. In the workshop,
participants create content based on rules. We designed a new format of
expression called photo-attached acrostics to highlight the process of
decomposing and recomposing. An acrostic is “a poem or other writing in an
alphabetic script, in which the first letter, syllable or word of each line,
paragraph or other recurring feature in the text spells out another message1.”
We modified it to include a picture for each sentence. Participants take and
select photos and write sentences whose first letters spell out a given message.
Each sentence and its photo should correspond, and both photos and sentences
should follow a given theme. An example of a photo-attached acrostic is shown in
Fig. 3. The message of the example is “ABCDE.”
In the workshop, participants first create an acrostic using their own photos.
Next, they are divided into groups and collaborate to create new expressions by
remixing their expressions. Collaboration with others will raise new context and
stimulate participants. In the third step, they create expressions by themselves
again, using all pictures used in the former steps.
Figure 3. Example of photo-attached acrostics
Participants are asked to place others' (partial) expressions in their new
expressions. We intend participants to form new opinions and ideas stimulated by
others. At the same time, the workshop facilitator shows other new remixed
acrostics using the developed information system described below.
SUPPORT SYSTEM
The system consists of four parts (Fig. 4): expression database, expression
input interface, expression recomposing engine, and expressing support interface.
It has the same structure as the framework illustrated in Fig. 2, but is modified
to highlight its dataflow.
Users input their works, which are created in a manual, analog manner in the
workshop. The expressing support interface shows draft expressions, which are
generated by the expression recomposing engine (see Figure 6).
The expression recomposing processes are as follows:
Decomposition phase
1. Analyze morphological structures of text.
2. Calculate term relation weights and term weights.
We use term dependency for term relation weights and term attractiveness for
term weights [22]. Term dependency $td(t, t')$ from term $t$ to $t'$ is given by:
$$td(t, t') = \frac{\mathrm{sentences}(t \cap t')}{\mathrm{sentences}(t)} \qquad (1)$$
Here $\mathrm{sentences}(t)$ indicates the number of sentences in which term $t$
appears, and $\mathrm{sentences}(t \cap t')$ is the number of sentences in which
terms $t$ and $t'$ appear at the same time.
Term attractiveness $attr(t)$ of term $t$ is a total of incoming term
dependencies. $T$ is the set of all appearing terms.
$$attr(t) = \sum_{t' \in T,\ t' \neq t} td(t, t') \qquad (2)$$
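As an illustration of Eqs. (1) and (2), the following Python sketch computes term dependency and term attractiveness from sentences that are already split into terms. It is a sketch under stated assumptions rather than the system's code: the function names and the set-of-terms input format are hypothetical, and morphological analysis is assumed to have been done beforehand. Eq. (2) as printed sums td(t, t') over t', whereas the text speaks of incoming dependencies, which could also be read as td(t', t); the incoming flag exposes both readings.

```python
from collections import defaultdict
from itertools import permutations


def term_dependency(sentences):
    """Eq. (1): td(t, u) = sentences containing both t and u / sentences containing t.

    `sentences` is an iterable of term collections, one per sentence.
    """
    count = defaultdict(int)   # number of sentences in which each term appears
    pair = defaultdict(int)    # number of sentences in which an ordered pair co-occurs
    for terms in sentences:
        terms = set(terms)     # count each term once per sentence
        for t in terms:
            count[t] += 1
        for t, u in permutations(terms, 2):
            pair[(t, u)] += 1
    return {(t, u): n / count[t] for (t, u), n in pair.items()}


def attractiveness(td, incoming=False):
    """Eq. (2): sum of term dependencies per term.

    incoming=False sums td(t, t') as the equation is printed;
    incoming=True sums td(t', t), the alternative reading of "incoming".
    Terms that never co-occur with another term get no entry.
    """
    attr = defaultdict(float)
    for (t, u), value in td.items():
        attr[u if incoming else t] += value
    return dict(attr)
```

For three sentences {sea, beach}, {sea, train} and {sea}, the sketch gives td(sea, beach) = 1/3 and td(beach, sea) = 1, which shows that the measure is asymmetric.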
Recomposition phase
1. Extract candidate terms according to their initial letters.
2. Extract photos which include each term in 1.
3. Evaluate photos.
We define the weight $w_t(p)$ of a photo $p$ for term $t$ as follows:
$$w_t(p) = \sum_{t' \in T_p,\ t' \neq t} td(t, t') \cdot attr(t') \qquad (3)$$
For each initial letter, the term candidates, their related terms, and attached
photos are structured.
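The recomposition steps can be illustrated with a short Python sketch. It is not the system's implementation: the function names are hypothetical, $T_p$ is assumed to be the set of terms in the sentence attached to photo $p$ (the text does not define it), and initial letters are compared case-insensitively on plain strings, which is a simplification.

```python
def candidates_for_letter(letter, terms):
    """Step 1: terms whose initial letter matches one letter of the message."""
    return sorted(t for t in terms if t.lower().startswith(letter.lower()))


def photo_weight(term, photo_terms, td, attr):
    """Eq. (3): w_t(p) = sum over t' in T_p, t' != t, of td(t, t') * attr(t').

    `td` maps ordered term pairs to term dependency, `attr` maps terms to
    attractiveness; missing entries count as zero.
    """
    return sum(td.get((term, u), 0.0) * attr.get(u, 0.0)
               for u in photo_terms if u != term)


def rank_photos(term, photos, td, attr):
    """Steps 2 and 3: keep photos whose terms include `term`, highest weight first.

    `photos` maps a photo identifier to the set of terms attached to it.
    """
    scored = [(photo_weight(term, terms, td, attr), pid)
              for pid, terms in photos.items() if term in terms]
    return [pid for _, pid in sorted(scored, reverse=True)]
```

Sorting in descending order corresponds to the rule of choosing the photo with the highest weight; reversing the list gives the lowest-weight rule.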
Figure 5 illustrates an example of the word network extracted in the decomposition
phase. The sizes of the nodes indicate their weights ( attr ) and the distance
between two connected nodes shows the strength of their link ( td ). This figure
just visualizes word relations which are calculated internally in the system; we
did not show the figure to users directly.
Based on this network, we generate candidate expressions and present them
in the creation support interface (Fig. 6). When the user chooses one candidate
initial term in the left textbox, other related terms are presented in the middle
textbox. When she chooses one again, a photo associated with the selected terms
and the rest of the related terms are shown. The user can survey the candidates
presented by the system, select one she likes, and edit it.
In the workshop, a facilitator shows semi-automatically generated expressions,
which are edited according to fixed rules, such as choosing the photos with the
highest weights or the lowest. With these expressions, we aim to stimulate the
participants with machine-generated context. Through this step, we observe the
effects of the expression recomposing engine. After the workshop, we asked the
participants to try the expressing support interface. We evaluate the interface
from this test.
RESULTS AND DISCUSSIONS
The theme of our first practice was “Shonan”, the name of a region along the
coast in central Japan. We called for participation from people related to the
Shonan area, e.g., living, working, or born around it. Through the
workshop, participants are expected to discuss together and get new opinions
about the area.
The workshop was held on 8th and 16th December 2007 in Fujisawa city, the
center of the Shonan area, with nine participants. Most of their occupations were
related to media activities or media literacy: information media-major students,
an elementary school teacher, an art university professor, members of citizens'
television at Shonan, and so on. The youngest was an undergraduate student, and a
retired person was also included. Three were female, and six were male.
The participants were divided into three groups and in total they made 30
photo-attached acrostics from 259 photos. Fig. 7 shows scenes from the workshop.
Through this workshop, we intend participants to exchange their knowledge and
get new ideas through collaboration and competition. Most of the works from the
latter steps were created by remixing others' former works. One participant,
however, did not change his mind in the end. He preferred creating by himself
rather than through collaboration. This shows that our method does not suit
everyone, which seems quite natural.
For the rest of the participants, we found that collaboration in the shared place
was effective. In the workshop, we prepared tasks which consisted of individual
creation and collaborative creation. We expected that participants would change
and expand their way of thinking through these tasks. In fact, the participants
changed their views even more frequently, within the process of creating a single
expression. One participant was not always thinking together with others during
group work: she thought about the task alone; then she and another member of her
group discussed their thoughts together; and then she thought by herself again.
In the first step, the facilitator selected photos and terms based on fixed rules
so that we could observe the effects of the recomposing engine. As a result,
content created by choosing the photos and terms with the highest weights
in Eq. (2) and Eq. (3) (shown at the top of each list of candidates in the
support interface) happened to have a story structure similar to one
participant's. We aimed to form a different context, but produced a similar story.
The facilitator, however, could create many more expressions in much less time.
The outputs of the system were not always new, but the number of outputs was
large enough to stimulate the participants.
Figure 5. Extracted network in the decomposition phase. Whole structure (top)
and partial enlargement (bottom).
Figure 6. Screen image of photo-attached acrostic creation support interface
As the second step, we asked the participants to try the support interface after
the workshop and conducted interviews with them. While positive comments like “I
could easily create new acrostics” were heard, one problem was pointed out: the
system shows candidates for each sentence separately, so connecting sentences
into a story is not sufficiently supported.
CONCLUSION
In this paper, we introduced our participatory expressive workshop and an
information system to support it. We aimed to cultivate citizens' communication in
local communities through expressing and remixing. Based on our model of
content circulation, we developed the workshop program and the system.
Figure 7: Photo-attached acrostic workshop
Our first trial dealt with a particular type of expression. But as we already
mentioned, our framework is applicable to other types of expressions. It is
especially suited to Web content creation. The Web can be a database from which a
system draws others' expressions, and a place where people present the content
they create. The loop of reusing and remixing is native to the Web. We are
planning to develop an application of our framework for blogging.
ACKNOWLEDGEMENT
This work has been supported by a grant from the Japan Science &
Technology Agency under CREST Project.
REFERENCE
1. P. Ricœur: Temps et Récit, Seuil, (1983) [Time and Narrative, University
of Chicago Press, (1984)].
2. R. C. Schank: Tell Me a Story: Narrative and Intelligence, Northwestern
University Press, (1990).
3. J. Lambert: Digital Storytelling: Capturing Lives, Creating Community,
Digital Diner Press, (2002).
4. J. S. Bruner: Acts of Meaning, Harvard University Press, (1990).
5. E. Wenger, R. McDermott, and W. M. Snyder: Cultivating Communities of
Practice. Harvard Business School Press, (2002).
6. J. Lave and E. Wenger: Situated learning: Legitimate peripheral
participation. Cambridge University Press, (1991).
7. M. A. Boden, The Creative Mind: Myths and Mechanisms, New York:
Basic Books, 1991.
8. J. P. Guilford, The Nature of Human Intelligence, New York: McGraw-Hill,
1967.
9. J. Kawakita, “The KJ method: a scientific approach to problem solving,”
Technical report, Kawakita Research Institute, Tokyo, 1975.
10. F. Osborn, Applied Imagination: Principles and Procedures of Creative
Problem-solving, Scribner New York, 1957.
11. J. Munemori and Y. Nagasawa, “Gungen: groupware for a new idea
generation support system,” in Information and Software Technology, Vol.
38, No. 3, pp. 213-220, 1996.
12. S. Bringsjord and D. A. Ferrucci, Artificial Intelligence and Literary
Creativity: Inside the Mind of Brutus a Storytelling Machine, Lawrence
Erlbaum Assoc Inc., 1999.
13. P. McCorduck, Aaron's code, WH Freeman & Co. New York, 1991.
14. Mani and E. Bloedorn, “Multi-document summarization by graph search
and matching,” in Proceedings of Fourteenth National Conference on
Artificial Intelligence (AAAI-97), pp. 622-628, 1997.
15. R. Barzilay, K. R. McKeown, and Michael Elhadad, “Information fusion in
the context of multi-document summarization,” in Proceedings of the 37th
Association for Computational Linguistics, pp. 550-557, 1999.
16. Nonaka and H. Takeuchi, The Knowledge Creating Company, Oxford
University Press, 1995.
17. B. Shneiderman, Leonardo's Laptop: Human Needs and the New
Computing Technologies, MIT Press, 2002.
18. Ohmukai, H. Takeda, M. Hamasaki, K. Numa, and S. Adachi, “Metadata-
driven personal knowledge publishing,” in Proceedings of 3rd International
Semantic Web Conference 2004, pp. 591-604, 2004.
19. Hori, K. Nakakoji, Y. Yamamoto, and J. Ostwald, “Organic perspectives of
knowledge management: Knowledge evolution through a cycle of
knowledge liquidization and crystallization,” In Journal of Universal
Computer Science, Vol. 10, No. 3, 2004.
20. Numa, K. Tanaka, M. Akaishi, and K. Hori, “Activating expression life cycle
by automatic draft generation and interactive creation," in International
Workshop on Recommendation and Collaboration (ReColl'08), 2008.
21. Numa, K. Toriumi, K. Tanaka, M. Akaishi, and K. Hori, “Participatory
Workshop as a Creativity Support System,” in 12th International
Conference on Knowledge-Based and Intelligent Information &
Engineering Systems (KES2008), 2008.
22. M. Akaishi, K. Satoh, and Y. Tanaka, “An associative information retrieval
based on the dependency of term co-occurrence,” in Proceedings of the
7th International Conference on Discovery Science (DS2004), pp. 195-206,
2004.
Chapter 6
IPAC System For Controlling Devices over the Internet
Sandeep Kumar, Aligarh Muslim University, India
Al-Dahoud Ali, Al-Zaytoonah University, Jordan
Archana Gupta, Aligarh Muslim University, India
Himanshu Bhardwaj, Aligarh Muslim University, India
Saiful Islam, Aligarh Muslim University, India
ABSTRACT
Today, because of advances in the computer and electronic sciences, nearly
everything is being automated. Some devices and infrastructures can even
change their behavior according to the situation; these are called
Smart Devices or Smart Infrastructures. The system presented here is designed to
meet the requirement of appliance control in automated or smart infrastructures,
which include homes, offices, industries, and even sophisticated vehicles such as
aeroplanes. Appliance control refers to the process or technique of controlling a
device (including complete machines, mechanical devices, electronic devices,
electrical devices, etc.) by comfortable, luxurious and reliable means
based on some automation method.
Even though a number of standards have been defined for wired and wireless control
and automation of home appliances, including Bluetooth, UPnP, and X10, this field
is still developing. In this chapter we propose an appliance
control system, named Internet and PC Based Appliance Control (IPAC), that
uses the concepts of parallel port programming.
IPAC is designed to control a device from a PC and from the Internet. It can be
applied in any smart infrastructure to automate devices and can work with
almost every type of automation method, whether wired (e.g. LAN) or wireless
(e.g. Bluetooth). The system can be applied in designing smart homes, secure
homes, centralized device control systems, Bluetooth control systems, and WAP
control systems.
Keywords: Home Networking, Smart Homes, Secure Homes, UPnP Devices, X10
protocol, WAP Devices, IR Devices, Bluetooth Devices
INTRODUCTION
Why is home networking not so common?

There are several key problems associated with the creation of home networks.
Some of them are discussed below:
Consumers are unaware of the benefits of the networked or smart home.

At this point in time, most home networks are used to connect PCs for tasks
such as printing and shared Internet connectivity. Consumers still do not see the
other potential benefits, such as on-demand video, enhanced voice
communications, and remote security control. Because of this lack of awareness,
the demand for home networking products is still minimal.
Running additional wires through homes is costly and a hassle for consumers.
In order to counteract this problem, the industry is developing wireless and other
standards which will allow users to interconnect information devices without installation of
new wires.
Technology is too complex for most household users.

Unlike other home electronics, the technology behind home networking is not intuitive
and requires more technological expertise than the average household possesses.
Lack of incentive for Internet providers to push networking technology.

The home providers of broadband Internet (i.e., cable Internet providers and DSL
providers) are currently surviving well enough on the strength of their connectivity service
sales and do not need to push additional products. In addition, these communications
carriers are too busy building network infrastructures and too swamped with customers
demanding their high-speed access to spend time worrying about home networking.
Potential privacy issues.

Because the networked home would enable information to flow out of the home in
ways that households are not accustomed to, privacy could be compromised. Additionally,
the new technology behind information appliances and smart homes could introduce new
security holes not before encountered.
Interface issues.
In smart home test beds, control interfaces have ranged from touch-screen devices to
PDAs. Data on the effectiveness of the various interfaces seems scarce.
These are some of the issues that limit the popularity of home networking.
We now turn our attention to home networking itself.
Imagine a completely networked home, in which every appliance can be
remotely managed [14] from anywhere on the Internet with a simple Web browser
[1][12][17][19]. The general goal of the automatic-home movement is to use
networking technology to integrate the devices, appliances and services found in
homes so that the entire domestic living space can be controlled centrally or
remotely [16].
Figure 1 shows a typical automated home [2].
Figure 1. A typical Automated Home System [2]
The home wiring that advanced home developers are installing typically adds
several thousand dollars to the cost of a new home. It is usually Ethernet or
coaxial cable, or some combination of both, with other technologies in the mix.
The network is designed to make possible the remote operation of appliances
connected to it.
Figure 2. Already available extra wiring for device controlling in Smart
Infrastructure
Other technology developers are generating buzz in this area as well. In June
2008, at the Bluetooth World Congress, vendors were touting the expansion of
wireless networking technology into everything from air conditioners to cable
television boxes. "Bluetooth was originally developed as a wireless technology --
primarily for short-range exchange of data between laptops, PDAs and mobile
phones," said Nick Hun, managing director at TDK Systems, whose Blu2i adapters
are being used in such home applications. But, he noted, when early adapters
were released to industrial engineers at the end of 2002 demand soon proved
overwhelming.
Secure home
It is a highly secure smart home environment in which every device is automated
while maintaining sufficient security. For example, in Figure 3 the home door is locked
by a software lock (using a password) and can be opened only by software
methods.
Figure 3. A software lock in Secure Home Environment [2]
Preliminaries
Summarizing [3][4][5][20][22], the following technologies are used to
create a networking environment in which home appliances work as network
nodes.
Direct Cable
In this method, devices are connected through a serial, parallel, or USB port.
Desktop software is generally also supplied to make device management a
comfortable and easy task.
Figure 4. Direct Cable Connection Method of Home Networking [8]
Bluetooth
This is a cross-device wireless standard created for cell phones and PDAs; it
can link up to eight devices.
Phone Line
Data shares the phone line frequency, and a phone jack is required wherever a
networked device is located. Special cards and drivers are also required.
Ethernet
Connections are made using a hub system and a network card in each device. This
requires driver installation and wiring. It is more expensive, and hardware
conflicts are possible.
Figure 5. Use of telephone media for controlling home devices
Radio Free Network

Uses radio frequency waves to transmit data through walls and doors up to 800 feet.
It requires a network card and can suffer some interference.
AC Network
Uses the power lines and wiring already within the home, connecting the parallel port to
an adapter in an outlet. It is difficult to set up, slow, prone to interference from other
devices, and also expensive [4].
RELATED WORK
At 2Wire's R&D laboratory, researchers are currently (as of 2009) developing
wireless applications to control lighting and home security devices. Software and
other IT companies are not lagging behind in this advancement either. Figure 6
shows a desktop-PC-controlled automated home [2] from INSTEON.
Figure 6. Home controlling desktop software from INSTEON [2]
Related Technologies
UPnP (Universal Plug and Play) Devices
UPnP technology is a distributed,
open networking architecture that employs TCP/IP and other Internet technologies
to enable seamless proximity networking, in addition to control and data transfer
among networked devices in the home, office, and public spaces. Intel software for
UPnP technology helps hardware designers and software developers build easy
connectivity into common electronic devices [4] [6] [9].
X-10 Devices
X10 [7][21] is a communication protocol that allows compatible
products to talk to each other via the existing 110 V electrical wiring in the home. Up to
256 different addresses are available, and each device you use usually
requires a unique address.
Infra-Red Device
IR data transmission is also employed in short-range communication among
computer peripherals and personal digital assistants. Remote controls and IrDA devices
use infrared light-emitting diodes (LEDs) to emit infrared radiation, which is
focused by a plastic lens into a narrow beam.
Bluetooth Devices [10][13]

Bluetooth is a small form factor, low-cost technology that provides low-power,
short-range (up to 10 m) links between mobile PCs, cell phones, printers, or other
devices arranged in ad hoc ‘piconets’ of up to 8 devices. To promote
interoperability between different Bluetooth devices, the Bluetooth Special Interest
Group (SIG) has produced over 400 pages of profiles (published as volume 2 of
the version 1.1 specifications). Bluetooth is a simpler technology than the
popular IEEE 802.11 standards for wireless local area networks. ‘Simpler’
here means fewer and/or less demanding RF semiconductor chips, fewer passive
components, and less complex digital baseband chips.
DESIGN OF IPAC
Organization of IPAC
Figure 7 shows the complete organizational structure of IPAC. Only four devices
are shown in the figure, but the IPAC system can control up to 128 devices
(the reason is explained under Control Word Format).
Functions of Different Units of IPAC
Web Interface (WI)

This is the interface available over the Internet; it appears in the browser
window. Its function is simply to provide an interface to the user over the net
through a web browser, so that the user can access appliances at a distant place
(home or office).
User Interface (UI)

This is the WI counterpart for local users. This is the interface that is used by
the users who own the server.
Figure 7. Organization of IPAC
The UI is broadly similar to the INSTEON desktop software [2] and aims to make
device management a comfortable and easy task from the user’s point of view.
Server
The server runs a web server so that the system can be accessed over the Internet.
The second important part of the server is the database, which stores information
about the status of the different devices.
Tracker
The tracker reads the status entries from the database and generates the proper
control word to be sent to the PC port for generating the proper signals. This is
one of the most important parts of the system.
Control Word Format

For the proposed IPAC system, an 8-bit control word is used to control the
various home appliances. Figure 8 shows the interpretation of the different
bits:
Figure 8. Control Word Format for proposed IPAC System
4 bits are used to address a maximum of 16 rooms. (R-Field)
3 bits are used to address up to 8 devices within a single room. (D-Field)
A single bit is used to set or reset (ON or OFF) the device selected by the R-Field
and D-Field bits. (S-Field)
In this structure we can address up to 128 devices because 7 bits are used
for addressing. All 128 available addresses (devices) are grouped into 16
groups (rooms). Table 1 shows these 16 groups and the corresponding device address
ranges.
Room Number Device Address Range

1 0-7
2 8-15
3 16-23
4 24-31
5 32-39
6 40-47
7 48-55
8 56-63
9 64-71
10 72-79
11 80-87
12 88-95
13 96-103
14 104-111
15 112-119
16 120-127
Table 1. Grouping of 128 addresses into 16 different groups
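The grouping in Table 1 follows directly from the control word layout: room n
owns the eight consecutive addresses starting at 8(n-1). The following Python
sketch regenerates the table under that reading. The function and constant names
are illustrative only and do not come from the IPAC implementation.

```python
ROOMS = 16            # R-field: 4 bits
DEVICES_PER_ROOM = 8  # D-field: 3 bits


def room_address_range(room_no: int) -> range:
    """Return the device addresses owned by a 1-based room number."""
    if not 1 <= room_no <= ROOMS:
        raise ValueError("room number must be in 1..16")
    first = (room_no - 1) * DEVICES_PER_ROOM
    return range(first, first + DEVICES_PER_ROOM)


# Print the same rows as Table 1: room number, then "first-last" address.
for room in range(1, ROOMS + 1):
    addresses = room_address_range(room)
    print(f"{room:>2}  {addresses[0]}-{addresses[-1]}")
```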
Encoding
When converting a status record from the database into a control word, we first
find the binary equivalents of the room and device numbers separately. From the
control word structure it is clear that the room bits must be shifted left by three
positions to place them correctly. We then OR the ‘room bits’ with the ‘device
bits’ to obtain the absolute address. Similarly, if the device is ON we OR the
absolute address with (10000000)_2 to set the status field to ‘1’. If the
device is OFF there is no need to change the MSB, because it is ‘0’ by default.
Note that the address of the first room is ‘0000’, but we call it ‘Room 1’
to keep room numbers in the natural domain. The same holds for device
addresses. So while encoding we decrease the room number and the device number
by 1 before converting them to binary. This is shown in the flowchart in
Figure 9. When implementing the system in a high-level language (HLL), note
that decreasing the room number by 1, converting it to binary, and shifting the
room bits left by 3 is equivalent to multiplying the decreased room number by
8. Similarly, ORing with (10000000)_2 is equivalent to adding 128 in the
decimal number system (only in this particular case, because the MSB of the
address is always 0).
Example
If the room number is 5, the device number is 8, and we want to set this device
to the ON state:
Binary Method
(5-1)_10 = (4)_10 = (00000100)_2 (room bits)
(8-1)_10 = (7)_10 = (00000111)_2 (device bits)
Status = 1
Shifting the room bits left by 3 positions we get (00100000)_2
After ORing with the device bits we get (00100111)_2
Since the status is ON we have to OR with (10000000)_2
After ORing we get (10100111)_2, which is the required
control word.
Denary Method
(5-1)_10 = (4)_10 (corrected room no)
(8-1)_10 = (7)_10 (corrected device no)
Status = 1
Multiplying the corrected room no by 8 we get 4 x 8 = 32
Adding the corrected device number we get 32 + 7 = 39
Since the status is ON we have to add 128
After adding 128 we get (167)_10, which is the required control word.
Since the binary equivalent of (167)_10 is (10100111)_2, we are on the right track.

Figure 9 shows the complete process flowchart.
Figure 9. Flowchart for Encoding the room number, device number and their
associated status into corresponding control word.
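The encoding procedure can be sketched in a few lines of Python. This is only
an illustration of the shift-and-OR steps described above, not the authors'
Visual Basic or ASP code; the function name encode_control_word and the range
checks are assumptions added for the example.

```python
def encode_control_word(room_no: int, device_no: int, on: bool) -> int:
    """Pack a 1-based room number, 1-based device number and status into 8 bits."""
    if not 1 <= room_no <= 16:
        raise ValueError("room number must be in 1..16")
    if not 1 <= device_no <= 8:
        raise ValueError("device number must be in 1..8")
    room_bits = (room_no - 1) << 3    # R-field: shift left by three positions
    device_bits = device_no - 1       # D-field: lowest three bits
    address = room_bits | device_bits  # 7-bit absolute address
    # S-field: set the MSB only when the device is ON.
    return (address | 0b10000000) if on else address


# The worked example from the text: room 5, device 8, status ON.
word = encode_control_word(5, 8, True)
print(word, format(word, "08b"))  # 167 10100111
```

Because the address never uses the MSB, the OR with (10000000)_2 gives the same
result as adding 128, which is the shortcut used in the denary method.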
Poller
This part of IPAC remains running as long as the system (server) is
ON, unless the user exits the IPAC service. At periodic intervals it
executes the tracker, so that any change made to a device
status is propagated to the corresponding device. Because its function is to watch
(poll) the database continuously, it is called the Poller.
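A minimal Python sketch of the poller and tracker loop is given here. It makes
several assumptions that the text does not state: the database is represented
by a callable returning (room, device, status) rows, the parallel port by a
callable that accepts one control word, and only changed statuses are
re-sent. All names are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

StatusRow = Tuple[int, int, bool]  # (room number 1..16, device number 1..8, is ON)


def to_control_word(room_no: int, device_no: int, is_on: bool) -> int:
    """Tracker step: pack one status record into an 8-bit control word."""
    address = ((room_no - 1) << 3) | (device_no - 1)
    return (address | 0x80) if is_on else address


def poll_once(
    read_status: Callable[[], Iterable[StatusRow]],
    write_port: Callable[[int], None],
    last_sent: Dict[Tuple[int, int], bool],
) -> int:
    """Run the tracker once; return how many control words were written."""
    written = 0
    for room_no, device_no, is_on in read_status():
        key = (room_no, device_no)
        if last_sent.get(key) != is_on:  # new device or changed status
            write_port(to_control_word(room_no, device_no, is_on))
            last_sent[key] = is_on
            written += 1
    return written


def run_poller(read_status, write_port, interval_s: float = 1.0,
               cycles: Optional[int] = None) -> None:
    """Call poll_once every interval_s seconds (forever if cycles is None)."""
    last_sent: Dict[Tuple[int, int], bool] = {}
    done = 0
    while cycles is None or done < cycles:
        poll_once(read_status, write_port, last_sent)
        time.sleep(interval_s)
        done += 1


# Stand-ins for the status database and the parallel port (assumptions).
database = [(1, 1, True), (5, 8, False)]
port_log = []
run_poller(lambda: database, port_log.append, interval_s=0.0, cycles=2)
print(port_log)  # [128, 39]
```

Passing the database reader and port writer as callables keeps the sketch
independent of any particular database or port driver.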
Appliance Controller (AC)

This part of IPAC, which is implemented in hardware, is responsible for
interfacing the controlling PC (server) with the electrical devices. It comprises
a 7-to-128 line decoder that is used to address one out of 128 devices (the 7 input
bits are taken from the 7 LSBs of the parallel port). The MSB from the parallel port
represents the data (status, i.e. ON or OFF). With the addition of a relay of the proper
rating we can connect any device. The relay passes the AC signal as long as the
device status associated with it is ON. To maintain the device state continuously
(even in a power failure condition) we have to attach a memory element for storing
the device status; we have used a flip-flop for that.
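The behaviour of the Appliance Controller can be modelled in software to check
control words before any hardware is attached. The Python sketch below is such
a model under stated assumptions: one latch per decoder output stands in for
the flip-flops, and the class and function names are invented for this
illustration.

```python
from typing import List, Tuple


class ApplianceControllerModel:
    """Software model: 7-bit address decoder plus one status latch per device."""

    def __init__(self) -> None:
        self.latches: List[bool] = [False] * 128  # all devices start OFF

    def write(self, control_word: int) -> Tuple[int, bool]:
        """Apply one control word and return the (address, status) it selected."""
        if not 0 <= control_word <= 0xFF:
            raise ValueError("control word must fit in 8 bits")
        address = control_word & 0x7F       # 7 LSBs select the device line
        status = bool(control_word & 0x80)  # MSB carries ON/OFF
        self.latches[address] = status      # latch holds the state for the relay
        return address, status


def split_address(address: int) -> Tuple[int, int]:
    """Map a 7-bit address back to 1-based (room number, device number)."""
    return (address >> 3) + 1, (address & 0x07) + 1


controller = ApplianceControllerModel()
address, status = controller.write(0b10100111)
print(address, status, split_address(address))  # 39 True (5, 8)
```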
Electrical Appliances
These are the home appliances that we are going to control. We control
electrical devices using this system, but with the inclusion of a transducer we
can also control mechanical or electromechanical devices.
SIMULATION
In the preceding sections we laid out and designed our proposed system, IPAC.
In this section we discuss a particular simulation of IPAC to
analyze the results.
Simulation Requirements
The following table gives the detailed requirements for running the simulation of
IPAC:
Platform (Technology)     Purpose
Visual Basic 6.0          User Interface
HTML                      Web Interface
ASP                       For Controlling from Internet
Access 2000               Database Mappings
D-25 Type Parallel Port   Sending Control Signal to Appliance Controller
PC74HC154P                For Decoding Control Word
HD74LS04P                 For Inverting Active Low Output of Decoder
12 LEDs                   To display the Control Signals
Table 2. Simulation Requirements
Simulation Screen Shots

The following screenshots were taken from the simulation:
Figure 10. Status changing by user interface
RESULT AND CONCLUSION

We have designed this system for controlling electrical devices, but the design
can be extended to control mechanical devices. We have simulated the system by
controlling LEDs, and it works properly.
We conclude our system design discussion by noting that parallel port
programming on a PC is also a good tool for controlling home
appliances.
Figure 11. Statuses Changing by Web Interface
Figure 12. Appliance Controller for simulating controlling the LEDs
The major benefit is the cost. The major cost factor is the PC, which
is generally present in every middle-income family. The other major cost
factor is the software, but this is a very long-term asset. The hardware
part (i.e. the Appliance Controller) is just a decoder and a set of flip-flops, so it
costs only approximately $10 USD. The remainder is the relays, and the cost of a relay
depends on its rating, which depends on the device to be controlled. Except for the
relays, all the cost factors are one-time, fixed investments and do not depend on
the number of devices we are going to control.
ACKNOWLEDGEMENT
The authors would like to thank Mr. M. Inamullah (Department of Computer
Engineering) and Mr. Izharuddin (Department of Computer Engineering) for their
help in completing this work.
REFERENCES
1. Gene J. Koprowski, “A Brilliant Future for the Smart Home”, TechNewsWorld,
08/04/03 8:11 AM PT
2. “Solution that Gives you complex control of your House”, SMARTHOME, VOL
118,2009
3. Jennifer Recktenwald’s “Home networks: Connecting more than computers”,
Tech Republic, November 10, 1999
4. Piyush Varshney, “Remote Controlled Home Automation System”, M Tech 3rd
Sem, ZHCET, AMU, 2006
5. Kannan and S. Vijayakumar, ”Smart Home Tested for Disabled People”,TIFAC
CORE, Velammal Engineering College, Chennai, Mobile and Pervasive
Computing-2008
6. HyunRyong Lee, JongWon Kim, "UPnP Protocol Extension for Contents
Sharing among Digital Home Networks", KISS, Vol. 31, No. 2, 2004
7. Phil Kingery, "Digital X-10,"Advanced Control Technologies. Inc, 2002.
8. M. Hashimoto, K. Teramoto, T. Saito, and T. Okamoto, “Home Network
Architecture Considering Digital Home Appliance,” IWNA98.
9. Stefan Knauth, Rolf Kistler, Daniel Käslin and Alexander Klapproth, “UPnP
Compression Implementation for Building Automation Devices”
10. Priyanka Varshney, Nikhil Kumar, Mohd. Rihan,”Implementation of Bluetooth
Based Smart Home for Assisting the Physically Disabled”
11. R. Al-Ali, M. AL-Rousan, “Java-Based Home Automation System,” IEEE
Transactions on Consumer Electronics, vol. 50, no. 2, pp. 498-504, May 2004.
12. Peter M. Corcoran and Joe Desbonnet, “Browser-Style Interfaces To A Home
Automation Network”, IEEE Transactions on Consumer Electronics, Vol. 43,
No. 4, NOVEMBER 1997
13. Kwang Yeol Lee, Jae Weon Choi, “Remote-Controlled Home Automation
System via Bluetooth Home Network”, SICE Annual Conference in Fukui,
August 4-6, 2003, Fukui University, Japan
14. Intark Han, Hong-Shik Park, Youn-Kwae Jeong, and Kwang-Roh Park, “An
Integrated Home Server for Communication, Broadcast Reception, and Home
Automation”, IEEE Transactions on Consumer Electronics, Vol. 52, No. 1,
FEBRUARY 2006
15. John J. Greichen, “Value Based Home Automation For Today's Market”, IEEE
Transactions on Consumer Electronics, Vol. 38, No. 3, AUGUST 1992
16. P.M. Corcoran, F. Papai and A. Zoldi, "User Interface Technologies for Home
Appliances and Networks," IEEE Transactions on Consumer Electronics,
Vol.44, No.3
17. Andreas Rosendahl and J. Felix Hampe, Goetz Botterweck, “Mobile Home
Automation – Merging Mobile Value Added Services and Home Automation
Technologies”, Sixth International Conference on the Management of Mobile
Business (ICMB 2007)
18. Alheraish, “Design and Implementation of Home Automation System,” IEEE
Transactions on Consumer Electronics, vol. 50, no. 4, pp. 1087-1092, Nov.
2004.
19. T. Uemukai, H. Hagino, T. Hara, M. Tsukamoto, K. Harumoto, and S. Nishio,
“A WWW Browsing Method Using a Cellular Phone in a Remote Display
Environment,” Data Processing Conference (99-HI-86, 99-MBL-11), Vol. 99,
No. 97, pp. 51-56, 1999.
20. Park Gwangro, "Trends of Home Network Technologies and Services,"
KRNET 2004, June, 2004.
21. Zhang Yuejun, Wu Mingguang, "The Study and Development on Intelligent
Lighting System Based on X-10 Protocol," China Illuminating Engineering
Journal, vol. 15 No. 1, pp.22-26, Mar.2004.
22. Bill Rose, WJR Consulting Inc. , “Home Networks: A Standards Perspective”,
IEEE Communication Magazine, December 2001
Chapter 7
Requirements engineering and traceability in agile software development
Abdallah Qusef, Rocco Oliveto, Andrea De Lucia
University of Salerno, Italy
ABSTRACT
Finding out, analyzing, documenting, and checking requirements are important
activities in all development approaches, including agile development. This
chapter discusses problems concerning the conduct of requirements
engineering activities in agile software development. We also suggest some
improvements to address challenges caused by agile requirements
engineering practices in large projects, such as properly identifying and handling
critical (including non-functional) requirements, documenting and managing
requirements documentation, and keeping agile teams in contact with outside
customers. Finally, the chapter discusses the requirements traceability problem in
agile software development and suggests some ideas to maintain the traceability
links between agile software artefacts to help developers to comprehend parts of
the system, and to keep the consistency among agile software artefacts during
refactoring.
Keywords: Requirements Engineering, Traceability, Agile Software
Development.
INTRODUCTION
The agile approach is creating a stir in the software development community.
Agile methods are reactions to traditional ways of developing software and
acknowledge the need for an alternative to documentation driven, heavyweight
software development processes [1]. In the implementation of traditional methods,
work begins with the elicitation and documentation of a complete set of
requirements, followed by architectural and high-level design, development, and
inspection. Beginning in the 1990s, some practitioners found these initial
development steps frustrating and, perhaps, impossible [2]. The industry and
technology move too fast, requirements change at rates that swamp traditional
methods [3], and customers have become increasingly unable to definitively state
their needs up front while, at the same time, expecting more from their software.
As a result, several consultants have independently developed methods and
practices to respond to the inevitable change they were experiencing. These Agile
methods are actually a collection of different techniques (or practices) that share
the same values and basic principles. The Agile Manifesto values
"individuals and interaction over processes and tools, working software over
comprehensive documentation, customer collaboration over contract negotiation,
and responding to changes over following a plan" [1].
Requirements Engineering (RE) is the process of establishing the services that
the customer requires from a system and the constraints under which it operates
and is developed. The main goal of a RE process is creating a system
requirements document for knowledge sharing, while Agile Development (AD)
methods focus on face-to-face communication between customers and agile
teams to reach a similar goal. There are several research papers discussing the
relationship between RE and AD (e.g. [4], [5], [6], [7], [8], [9]). They explain some
RE practices in agile methods, compare these practices between agile and
traditional development systems, and examine the problems of AD when it is
dealing with the management of large projects and the control of critical requirements.
This chapter addresses the problem of how (user) requirements can be
captured and specified in the context of agile software development approaches. It
therefore tries to identify how standard RE techniques and processes can be
combined with agile practices and to find solutions to some of the difficulties
related to their work. In addition, this chapter discusses the traceability problem in
agile software development, since the current traceability between agile software
artifacts is ill defined [10]. In particular, we discuss how to solve the traceability
problem by extracting important information from software artifacts to
identify traceability links between them. We also discuss how these links can be
used to improve the decision-making process and help developers during the
refactoring process. Finally, the chapter presents a set of guidelines for agile
requirements engineering.
The chapter is organized as follows. Section 2 sheds light on the benefits and
limitations of agile methodologies in the software development life cycle and
discusses some agile approaches from a requirements engineering perspective.
The agile RE activities are discussed in detail in Section 3, while Section 4 briefly
discusses how the requirement engineering process is performed in two agile
approaches. Section 5 addresses requirements traceability. Section 6 gives
some guidelines and enhancements concerning the efficient application of RE
practices in AD. Finally, Section 7 summarizes our conclusions and future work.
AGILE SOFTWARE DEVELOPMENT

The goal of agile methods is to allow an organization to be agile, but what does
it mean to be Agile? Jim Highsmith says that being Agile means being able to
"Deliver quickly. Change quickly. Change often" [2]. While agile techniques vary in
practices and emphasis, they follow the same principles behind the agile
manifesto [1]:
• The highest priority is to satisfy the customer through early and continuous
delivery of valuable software.
• Welcome changing requirements, even late in development. Agile processes
harness change for the customer's competitive advantage.
• Deliver working software frequently, from a couple of weeks to a couple of
months, with a preference to the shorter timescale.
• Business people and developers must work together daily throughout the
project.
• Build projects around motivated individuals. Give them the environment and
support they need, and trust them to get the job done.
• The most efficient and effective method of conveying information to and within
a development team is face-to-face conversation.
• Working software is the primary measure of progress.
• Agile processes promote sustainable development. The sponsors, developers,
and users should be able to maintain a constant pace indefinitely.
• Continuous attention to technical excellence and good design enhances agility.
• The best architectures, requirements, and designs emerge from self-organizing
teams.
• At regular intervals, the team reflects on how to become more effective, then
tunes and adjusts its behavior accordingly.
• Simplicity--the art of maximizing the amount of work not done--is essential.
Agile development methods have been designed to solve the problem of
delivering high-quality software on time under constantly and rapidly changing
requirements and business environments. Agile methods have a proven track
record in the software and IT industries. Fig. 1 shows that about 69% of
organizations are adopting one or more agile practices for use in general project
management as well as organizational development [11].
In fact, agile development methodologies are used in organizations where
there is no requirements freeze, where an incremental and iterative approach is used for
modeling, and where everyone in the team is an active participant whose input is
welcome. The main benefit of agile software development is that it allows for
an adaptive process, in which the team and the development react to and handle
changes in requirements and specifications, even late in the development process.
Through the use of multiple working iterations, agile
methods allow the creation of quality, functional software with small teams and
limited resources. The proponents of traditional development methods criticize
agile methods for their lightweight documentation and inability to cooperate
within the traditional workflow. The main limitations of agile development are the
following: agile works well only for small to medium-sized teams; agile development
methods do not scale, i.e. due to the number of iterations involved it can be
difficult to understand the current project status; an agile approach
requires highly motivated and skilled individuals, who are not always
available; and the lack of sufficient written documentation in agile methods leads to
information loss when the code is actually implemented. However, with proper
implementation, agile methods can complement and benefit traditional
development methods. Furthermore, it should be noted that traditional
development methods applied in a non-iterative fashion are susceptible to late-stage
design breakage, while agile methodologies effectively address this problem through
frequent incremental builds that accommodate changing requirements. In the
following, some common agile methods are briefly discussed from the
requirements engineering perspective.
Agile Modeling
Agile Modeling (AM) is a new approach for performing modeling activities [12].
It gives developers guidelines, with an agile philosophy as their backbone, on
how to build models that resolve design problems and support
documentation purposes without over-building these models. The aim is to keep the
number of models and the amount of documentation as low as possible. RE techniques are
not explicitly referred to in AM, but some AM practices support RE
techniques such as brainstorming.
Figure 1. Agile development adoption [11]
Feature-Driven Development
Feature-Driven Development (FDD) consists of a minimalist, five-step process
that focuses on the design and building phases [13]. Each step is defined with
entry and exit criteria: developing an overall model, building a features list,
and planning by feature, followed by iterative
design-by-feature and build-by-feature steps. In the first phase, the overall domain
model is developed by domain experts and developers. The overall model consists
of class diagrams with classes, relationships, methods, and attributes. The
methods express functionality and are the base for building a feature list. A feature
in FDD is a client-valued function. The feature list is prioritized by the team and
reviewed by domain members [14]. FDD proposes a weekly
30-minute meeting in which the status of the features is discussed and a report about
the meeting is written.
Dynamic Systems Development Method
Dynamic Systems Development Method (DSDM) was developed in the U.K. in
the mid-1990s. It is an outgrowth of, and extension to, Rapid Application
Development (RAD) practices [15]. The first two phases of DSDM are the
feasibility study and the business study. During these two phases the base
requirements are elicited. Further requirements are elicited during the
development process. DSDM does not insist on certain techniques. Thus, any RE
technique can be used during the development process [7]. DSDM’s nine
principles include active user involvement, frequent delivery, team decision
making, integrated testing throughout the project life cycle, and reversible changes
in development.
Extreme Programming
Extreme Programming (XP) is the best known of the agile approaches.
It is based on the values of simplicity, communication, feedback, and courage [6]. XP
aims at enabling successful software development despite vague or constantly
changing software requirements. XP relies on the way its individual practices
are collected and lined up to function with each other. Some of the main practices
of XP are short iterations with small releases and rapid feedback, close customer
participation, constant communication and coordination, continuous refactoring,
continuous integration and testing, and pair programming [17].
Scrum
Scrum is an empirical approach based on flexibility, adaptability and
productivity [18]. Scrum leaves developers free to choose the specific
software development techniques, methods, and practices for the implementation
process. Scrum has been in use for nearly ten years and has been used to
successfully deliver a wide range of products.
REQUIREMENTS ENGINEERING: AN AGILE DEVELOPMENT PERSPECTIVE
RE is concerned with discovering, analyzing, specifying, and documenting the
requirements of the system. RE activities deserve the greatest care because the
problems introduced into the system during the RE phase are the most expensive to
remove. As shown in Fig. 2, some studies revealed that around 37% of the
problems that occur in the development of challenging systems are related to the
requirements phases [19].
The main difference between traditional and agile development is not whether
to do RE but when to do it. The RE processes in traditional systems focus on
gathering all the requirements and preparing the requirements specification
document before going to the design phase, while agile RE welcomes
changing requirements even late in the development life-cycle.
Agile RE applies the focal values mentioned in the agile manifesto to the RE
process. The processes used for agile RE vary widely depending on the
application domain, the people involved and the organization developing the
requirements. However, this chapter explains the agile RE activities which are:
Feasibility Study, Elicitation and Analysis, Documentation, Validation, and
Management.
Figure 2. Problems of challenging systems [19]
Feasibility Study
The Feasibility Study gives an overview of the target system and decides
whether or not the proposed system is worthwhile. The input of the feasibility study
is an outline description of the system and how it will be used within an organization.
The result should be a short report that recommends whether or not it is worth
carrying on with the RE and AD process. Initially, all relevant
stakeholders have to be identified; in other words, all the right customers who are
related to the development of the system and are affected by its success or failure
must be selected. Then a brainstorming session takes place to share
knowledge and ideas between the agile teams and the "ideal" customers and to answer a number
of questions such as:
Does the system contribute to the high level objectives and the critical requirements
of the organization? In a first step, the high level goals and critical requirements
(functional and non-functional requirements) for the system are defined upfront in order to
determine the scope of the system. These requirements describe the expected business
values to the customer.
Is your organization ready for AD? Each agile method has its own characteristics
and practices that will change the daily work of the organization. Before an organization
selects one of them, it should consider whether or not it is ready for agile development. In
fact, this question is very important and many researchers have tried to answer it, e.g., [11], [20].
For example, Ambler [11] discusses some success factors and questions to be
answered that affect the successful adoption of agile methods.
Can the system be implemented within the given budget? Some contracts do not allow for
changing requirements. "The requirements must be complete before a contract can be
made, which is often found in fixed-priced projects" [5]. In agile projects, where changing
requirements are welcome, contracts are often based on time and expenses and not on
a fixed-price scope. Hence, agile methods use scope-variable price contracts [21]. This
means that the features actually implemented in the system, and their cost, evolve as well.
Therefore, requirements are not specified in detail at the contract level but defined step by
step during the project through a negotiation process between the customer and the
development team [8].
How can the agile activities be integrated with traditional organizational activities already
in place? Some studies suggest tentative models for integrating agile activities with
traditional organizational activities by transferring knowledge from one process to
another, and describe how the traditional team should adapt its activities to suit the
mechanisms of agile teams [22][23].
Requirements Elicitation
In this activity, agile teams work with stakeholders to find out about the
application domain, the services that the system should provide, the system's
operational constraints, and the required performance of the system
(non-functional requirements). The most important techniques used for requirements
elicitation in AD are:
Interviews: "Interviewing is a method for discovering facts and opinions held by
potential stakeholders of the system under development" [6]. There are two types
of interviews: closed interviews, where a predefined set of questions is
answered, and open interviews, where there is no predefined agenda and a
range of issues is explored with stakeholders. In fact, interviews are good for
getting an overall understanding of what stakeholders do and how they might
interact with the system, but they are not good for understanding domain
requirements. All agile methods hold that interviews are an efficient way to
communicate with customers and to increase trust between the two sides.
Brainstorming: this is a group technique for generating new, useful ideas and
promoting creative thinking. Brainstorming can be used to elicit new ideas and
features for the application, to define what project or problem to work on, and to
diagnose problems in a short time. The project manager plays an important role in
brainstorming. He/she determines the duration of the creative session, makes sure that
discussions about particular topics do not escalate, and ensures
that everybody expresses his/her opinion freely. After the creative session has
ended, the topics are evaluated by the team. Also, the connections and
dependencies between the discussed ideas are represented, for example, by
graph visualization, so that conflicts with other requirements are found and
evaluated.
Ethnography: this is an observational technique that can be used to understand
social and organizational requirements [24]. In agile development, ethnography is
particularly effective at discovering two types of requirements: requirements
that are derived from the way in which people actually work, rather
than the way in which process definitions say they ought to work, and
requirements that are derived from cooperation and awareness of other
people's activities. Ethnography is not a complete approach to elicitation, and it
should be used with other approaches such as use case analysis [19][24].
Use Case analysis: this is a scenario-based technique used in UML-based
development which identifies the actors involved in an interaction and describes
the interaction itself. A set of use cases should describe the possible interactions that
will be represented in the system requirements; each use case represents a
user-oriented view of one or more functional requirements of the system [24].
Requirements Analysis
The main task here is to determine whether the elicited requirements are unclear,
incomplete, ambiguous or contradictory, and then to resolve these issues. Conflicts
in requirements are resolved through prioritization and negotiation with stakeholders.
The main techniques used for requirements analysis in agile approaches are:
Joint Application Development (JAD): this is a workshop used to collect
business requirements while developing a system. JAD sessions also include
approaches for enhancing user participation, expediting development, and
improving the quality of specifications [24]. In an agile environment, when
stakeholders' requirements conflict, JAD promotes
the use of a professional facilitator who can help to resolve the conflicts. In addition,
JAD sessions encourage customer involvement and trust in the developed
system.
Modeling: system models are an important bridge between the analysis and the
design process [6]. In an agile environment the pen board (or pin board) is
divided into three sections: models to be implemented, models under
implementation, and models completed. This layout provides a visual
representation of the project status [8]. These models must be documented and
not thrown away.
Prioritization: agile methods specify that the requirements should be treated
like a prioritized stack. The features are prioritized by the customers based
on their business value, and the agile teams estimate the time required to
implement each requirement. The agile team must distinguish "must
have" requirements from "nice to have" requirements; this can be done through
frequent communication with the customers. Fig. 3 shows the requirements
prioritization process: at the beginning of each iteration, there is a requirements
collection and prioritization activity, during which new requirements are identified
and prioritized. This approach helps to identify the most important features inside
the ongoing project. Typically, if a requirement is very important it is scheduled for
implementation in the upcoming iteration; otherwise it is kept on hold. At the
following iteration, the requirements on hold are evaluated and, if they are still
valid, they are included in the list of candidate requirements together with the
new ones. Then, the new list is prioritized to identify the features that will be
implemented; if a requirement is not important enough, it is kept on hold
indefinitely [8].
Figure 3. Requirements prioritization process
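The per-iteration prioritization loop described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the names `Requirement` and `plan_iteration`, the numeric `business_value` score, the `threshold` that separates "very important" requirements from the rest, and the `capacity` of an iteration are all hypothetical and are not prescribed by the process in [8].

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    business_value: int      # assumed score given by the customer; higher = more important
    still_valid: bool = True  # re-checked when an on-hold requirement is re-evaluated


def plan_iteration(new_reqs, on_hold, threshold, capacity):
    """Return (scheduled, held) for the upcoming iteration."""
    # On-hold requirements that are still valid join the new ones as candidates.
    candidates = [r for r in on_hold if r.still_valid] + list(new_reqs)
    # The merged candidate list is prioritized by business value.
    candidates.sort(key=lambda r: r.business_value, reverse=True)
    # Only sufficiently important requirements are scheduled, up to the capacity.
    important = sum(1 for r in candidates if r.business_value >= threshold)
    cut = min(important, capacity)
    # Everything after the cut is kept on hold for the following iteration.
    return candidates[:cut], candidates[cut:]
```

Calling `plan_iteration` once per iteration, and passing the returned `held` list back in as `on_hold` the next time, reproduces the cycle in which a requirement that never becomes important enough simply stays on hold.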
Requirements Documentation
The purpose of requirements documentation is to communicate requirements
(or knowledge sharing) between stakeholders and agile teams. In fact, no formal
requirements specification is produced in agile development methods since agile
focuses on minimal documentation. The features and the requirements are
recorded on story boards, index cards, and paper prototypes like use cases and
data flow diagrams.
The lack of documentation might cause long-term problems for agile teams [6].
In the following, we suggest some recommendations to help agile development teams
manage and implement large projects and projects with critical
requirements:
• The agile team leader assigns two or three members to produce
documentation in parallel with development. These
members will be responsible for handling requirements (functional and
non-functional), for writing and reviewing documentation, and for keeping it
consistent with development. Furthermore, efficient practices like peer interviews
will help to ensure the accuracy and quality of the documentation. Only
two or three members are chosen because resources are limited and the other
members must adhere to the agile manifesto by producing working software rather
than documentation. In addition, we cannot have just one person doing it,
because that violates one of the agile manifesto principles: "Business people and
developers must work together daily throughout the project" [1].
• Using computer-based tools, such as UML modeling and project management tools,
to specify a high-level description of the project and to document, in an
electronic format, certain practices and requirements used in agile projects.
• Developing a reverse engineering process [25] applicable to agile
projects, so that the code can be reverse engineered to produce
documentation using, for example, UML modeling tools.
Requirements Validation
The goal of requirements validation is to ensure that requirements actually
define the system which the customer wants. The requirements validation checks
the consistency, completeness and realism of requirements. The main practices
used for requirements validation in agile approaches are:
Requirements reviews: this is a manual process that involves multiple readers from
both the agile team and the stakeholders checking the requirements against current
organizational standards and organizational knowledge for anomalies and
omissions. In agile projects the requirements reviews must be formal reviews: that is,
the agile team should walk through each requirement with the customers,
and conflicts, errors, extra requirements, and omissions in the requirements should be
formally recorded.
Write tests first: In agile development, testing is also a method for requirements
validation and therefore also part of requirements engineering. In some agile
methods like XP, the requirements are implemented and tested using the Test
Driven Development (TDD) approach. By applying this technique developers
create tests before writing code. The developed code is then refactored as needed
to improve its structure [26]. The TDD supports evolutionary development and
promotes the development of high quality code. The requirement from which the
test case was created is now presented in a form in which it is completely
validated, in the sense that it can be automatically (after each iteration) determined
whether a requirement is implemented by the software or not. This makes the
developers aware for the progress of the project and the state of the current
iteration of the project. Also, it supports the refactoring process to get an improved
design by reduced coupling and strong cohesion [27]. A common misconception is
that all tests are written prior to implementing the code [7]. Rather, TDD contains
short iterations which provide rapid feedback. Code refactoring and unit tests
ensure that emerging code is more simple and readable. In fact, unit tests can be
considered as a live and up-to-date documentation: they represent an excellent
repository for developers trying to understand the system, since they show how
parts of a system are executed.
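A single TDD cycle of this kind might look as follows in Python, using the standard `unittest` module. The user story (orders of 100 or more receive a 10% discount) and the names `DiscountStoryTest` and `discounted_total` are invented for illustration; they do not come from the cited sources.

```python
import unittest


class DiscountStoryTest(unittest.TestCase):
    """Written first: encodes the (hypothetical) user story as executable checks."""

    def test_order_of_100_or_more_gets_discount(self):
        self.assertAlmostEqual(discounted_total(200), 180.0)

    def test_smaller_order_pays_full_price(self):
        self.assertEqual(discounted_total(50), 50)


def discounted_total(amount):
    """Written second: the simplest code that makes the tests above pass."""
    if amount >= 100:
        return amount * 0.9  # hypothetical 10% discount rule
    return amount
```

Saved to a file, the tests can be run with `python -m unittest <file>`. Because the two tests state the requirement in executable form, rerunning them after each iteration or refactoring shows whether the story is still implemented, which is the sense in which unit tests act as live documentation.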
Evolutionary prototyping: a prototype is an initial version of the system.
Evolutionary prototyping starts with a relatively simple system which implements
the most important customer requirements which are best understood and which
have the highest priority. System prototypes allow customers to experiment and
see how the system supports their work (requirements elicitation), and may reveal
errors and omissions in the requirements that have been proposed (requirements
validation). As shown in Fig. 4, the main objective of evolutionary prototyping in
AD is to deliver a working system to customers by focusing on customer
interaction [24].
Acceptance testing: acceptance testing is formal testing conducted by the
customer to ensure that the system satisfies the contractual acceptance criteria.
Acceptance tests do not differ from the automated system tests, except that they
are performed by the customer. Delivering working software to the customer is a
fundamental agile principle; hence the customers create acceptance criteria for
the requirements and test the requirements against these criteria. Since AD is an
incremental process, the customers can give feedback to the developers to
enhance the development of future increments of the system. However, as a
general problem, there are often no formal acceptance tests for non-functional
requirements.
Requirements Management
Understanding and controlling changes to system requirements take place in
this activity. In order for requirements management tools to work efficiently, they
must be able to store requirements, prioritize requirements, track requirement
changes and development progress, and provide a level of requirements
traceability [29], [30].
In agile projects, managers have to create and maintain a framework for the
interaction between the agile teams and the stakeholders, by (i) identifying the
ideal people who can be members of agile teams and ideal customers who can
answer all the developers’ questions correctly, (ii) strengthening the collaboration,
and (iii) negotiating contracts with the customers [8].
Figure 4. Evolutionary prototype process
We believe that agile methods can play an important role in the management
of large projects. The decomposition of the larger parts of the project into smaller
components, called sub-components, lends itself to the employment of more agile
teams. These agile teams can work in other time zones and other countries
provided that frequent communication and self-organization are established. Agile
teams working in parallel on sub-components allow for quick development and
early design. An early design leads to an early review. Consequently, the iterative
schedule and emphasis on delivering the product allows the agile teams to assess
the successes and shortcomings, and plan for the next iteration. Once a specific
agile team has successfully completed a sub-component, the team is available to
work on another component or sub-component. Each of these smaller agile teams
will still be responsible for assigning two members to complete the previously
described documentation which is necessary to satisfy the other stakeholders.
Agile teams should use modern communication tools such as web-based shared team
projects and instant messaging. These tools are useful for keeping in touch with
the customer and other agile teams in order to discuss requirements when they
are not on-site.
In the section on requirements traceability, we discuss requirements
traceability as one of the important aspects of requirements management.
REQUIREMENTS ENGINEERING IN XP AND SCRUM
In the previous section, we discussed how requirements engineering
activities are performed in agile methods in general. Although agile
methods do not have a distinct and explicit RE process, they carry out most of its
activities.
Table 1 summarizes how RE activities are actually implemented in XP and
Scrum, respectively. Here, XP and Scrum are chosen as good
examples of agile methods to show how the RE process is applied in practice.
As we said before, agile RE welcomes changing requirements even late in
the development lifecycle, and it depends on people as the most important
factor for project success. There is no difference between the agile
methods in the requirements elicitation phase, since they all rely on face-to-face
communication between the ideal customers and the agile team. The ideal customers
describe the system requirements as user stories in XP, while in Scrum the
product backlog is formulated to include all described features and requirements.
Then, the analysis of requirements depends on the requirements prioritization
process that prioritizes the requirements according to their importance for the
customer. In fact, all agile methods are based on the requirements prioritization to
implement the most important requirements first. In addition, frequent delivery of
working software allows better understanding and analysis of requirements.
The requirements documentation activity in agile development depends on
face-to-face communication and software source code as a good resource for
knowledge sharing since agile development focuses on minimal documentation.
The features and the requirements in XP are recorded on story boards, index
cards, and paper prototypes. To ensure that the requirements and features
captured are a valid representation of the required system, in AD, frequent
meetings between customers and agile teams can be scheduled. Also, the
customer can run acceptance tests to ensure that delivered functions actually
define the system which he/she wants. In XP, the developers can use the TDD
cycle to validate their work frequently and to refactor their code as needed.
In XP, short increments and incremental planning techniques are used to
manage changes in requirements. A change in requirements may result in
adding and/or deleting user stories. Scrum provides a project management
framework that focuses development into 30-day Sprint cycles in which a specified
set of backlog features is delivered. The core practice in Scrum is the use of
daily 15-minute team meetings for coordination and integration.
Table 1. RE implementation in XP and Scrum

RE activity     XP implementation                 Scrum implementation
--------------  --------------------------------  --------------------------------
Requirements    Requirements elicited as          Product Owner formulates the
Elicitation     stories.                          Product Backlog.
                Customers write user stories.     Any stakeholders can participate
                                                  in the Product Backlog.

Requirements    Not a separate phase.             Backlog Refinement Meeting.
Analysis        Analyze while developing.         Product Owner prioritizes the
                Customer prioritizes the user     Product Backlog.
                stories.                          Product Owner analyzes the
                                                  feasibility of requirements.

Requirements    User stories & acceptance         Face-to-face communication.
Documentation   tests as requirements
                documents.
                Software products as
                persistent information.
                Face-to-face communication.

Requirements    Test Driven Development           Review meetings.
Validation      (TDD).
                Run acceptance tests.
                Frequent feedback.

Requirements    Short planning iteration.         Sprint Planning Meeting.
Management      User stories for tracking.        Items in Product Backlog for
                Refactor as needed.               tracking.
                                                  Changed requirements are added
                                                  to/deleted from the Product
                                                  Backlog.

REQUIREMENTS TRACEABILITY
Requirements traceability refers to "the ability to describe and follow the life of
a requirement, in both a forwards and backwards direction" [31]. In another
definition, requirements traceability refers to the characteristics of a system in
which the requirements are clearly linked to their sources and
to the artifacts created during the system development life cycle based on these
requirements [32]. Traceability can provide important insights into system
development and evolution assisting in both top-down and bottom-up program
comprehension, impact analysis, and reuse of existing software, thus giving
essential support in understanding the relationships existing within and across
software requirements, design, and implementation [32], [33]. The importance of
maintaining traceability links is confirmed on one side by the support provided by
many CASE tools (see, for instance, Rational Requisite Pro; footnote 1) and on the other side
by numerous standards, such as ISO 15504, CMMI, and IEEE 1219-1998 [34],
that consider requirements traceability a "best practice".
Traceability is an important part of traditional software development, but it is not
a standard practice in agile methods. However, maintaining traceability links
between the artefacts produced can also provide important insight in agile
development environments. In particular, by tracing user stories to their test cases and
back, we have a way to validate that a user story is implemented and tested.
Maintaining the dependency links among agile software artefacts
becomes an essential part of maintenance tasks. The artefacts produced during an
agile development process (i.e., requirements, acceptance tests, unit tests and
code) generally change at the same time. Thus, having traceability links between
the produced artefacts makes it possible to map high-level documents, and thus abstract
concepts, to low-level artefacts. This clearly improves software maintainability:
once a maintainer has identified the high-level document (e.g., a requirement)
related to the feature to be changed, traceability helps to locate the code to be
maintained. It also helps testers to see what they need to change when a
user story is changed or removed, and which user stories they need to test.
In addition, traceability links are useful to validate that the system is
implemented correctly, and they give the customer some form of certification that
the system has been tested.
Footnote 1: http://www-01.ibm.com/software/awdtools/reqpro
Dependencies (links) between unit tests and classes under test can help
developers to comprehend the code as the unit test explicitly indicates what the
expected behavior of a class should be for typical usage scenarios. Moreover, the
traceability links between unit tests and related code can be also exploited to
maintain consistency during refactoring process. When refactoring, the developer
must ensure that all unit tests continue to pass, so unit tests might need to be
refactored together with the source code [35]. Indeed, refactoring of the code
should be followed by refactoring of the tests [36]. Many of these dependent test
refactorings could be automated, or at least made easier, if the exact relationships
between the unit tests and the corresponding classes under test were known.
Many techniques have been presented to support traceability
management. These techniques were intended to work with traditional
software development methodologies and were therefore designed under the
assumption that a formal requirements process is in place. In agile software
development the situation is different, because the main development artifact is the
source code. As a result, many researchers have tried to find solutions to this challenge
[37], [38]. For example, in [37] an approach based on the Echo tool is proposed to enable
the scalability of agile requirements elicitation practice. Echo provides a
mechanism that allows for flexible and dynamic creation of content as well as the
supporting traceability structure. However, the Echo tool does not support a multi-user
environment that would enable distributed collaboration. In addition, other agile practices
such as TDD are not supported by this tool. Moreover, in [38] a traceability patterns
framework is proposed as a solution to requirement component traceability in agile
software development, based on the structure of the source code. However, this
framework lacks practical evaluation in real-world industrial systems.
Thus, the support for traceability in contemporary software engineering
environments and tools is not satisfactory, and traceability links between
agile software artefacts are not explicitly maintained [39], [10]. Dependencies
between different artefacts therefore have to be identified manually when
needed. As a result, during the comprehension of existing software systems,
software engineers are required to spend a large amount of effort on synthesizing
and integrating information from various sources to establish links among these
artifacts. This consideration calls for (semi)automatic approaches that support the
developer during the identification of links between software artefacts.
The artefacts produced during software development in agile environment are
usually user stories, unit tests, and code classes. Links between user stories and
unit tests and between unit tests and code classes are enough to support all the
activities described above.
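To make the two kinds of links concrete, the following Python sketch stores story-to-test and test-to-class links and answers the forward and backward tracing questions discussed above. The class `TraceabilityGraph` and its method names are assumptions made for this illustration; they do not correspond to any tool cited in this chapter.

```python
from collections import defaultdict


class TraceabilityGraph:
    """Holds the two link types: user story -> unit tests, unit test -> code classes."""

    def __init__(self):
        self.story_to_tests = defaultdict(set)
        self.test_to_classes = defaultdict(set)

    def link_story_test(self, story, test):
        self.story_to_tests[story].add(test)

    def link_test_class(self, test, cls):
        self.test_to_classes[test].add(cls)

    def classes_for_story(self, story):
        """Forward trace: classes reached from a user story through its tests."""
        return {c
                for t in self.story_to_tests.get(story, ())
                for c in self.test_to_classes.get(t, ())}

    def stories_for_class(self, cls):
        """Backward trace: stories whose tests exercise the given class."""
        return {s
                for s, tests in self.story_to_tests.items()
                if any(cls in self.test_to_classes.get(t, ()) for t in tests)}

    def untested_stories(self, all_stories):
        """Stories with no linked unit test, i.e. not yet validated by tests."""
        return {s for s in all_stories if not self.story_to_tests.get(s)}
```

With these two mappings, a maintainer who starts from a changed story can locate the affected classes with `classes_for_story`, and a tester can find the stories to re-test after a class changes with `stories_for_class`.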
As for the identification of links between user stories and unit tests, approaches
based on Information Retrieval (IR) [40] techniques could be exploited to support
such a task. The rationale behind such approaches is the fact that user stories are
text based and that programmers use meaningful domain terms to define source
code identifiers in their unit tests. Indeed, IR-based methods propose a list of
candidate traceability links on the basis of the similarity between the text contained
in the software artifacts. Such methods are based on the conjecture that two
artifacts having high textual similarity share several concepts, and thus are good
candidates to be traced to each other.
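As an illustration of the IR idea, the sketch below ranks candidate links between user stories and unit tests by the cosine similarity of their term-frequency vectors. The choice of plain term frequencies, the identifier-splitting rule, the default `threshold` of 0.2 and all function names are assumptions made for this example; the IR methods in the cited literature typically add steps such as stop-word removal and term weighting.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split free text and camelCase/snake_case identifiers into lowercase terms."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # "testConvertEuro" -> ["test", "Convert", "Euro"]
        terms += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return [t.lower() for t in terms]


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def candidate_links(stories, tests, threshold=0.2):
    """Return (story, test, score) triples at or above the threshold, best first."""
    pairs = [(s, t, cosine_similarity(s_text, t_text))
             for s, s_text in stories.items()
             for t, t_text in tests.items()]
    kept = [p for p in pairs if p[2] >= threshold]
    return sorted(kept, key=lambda p: p[2], reverse=True)
```

The output is only a ranked list of candidates: a developer still has to confirm or reject each proposed link, which is why such approaches are described as (semi)automatic.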
In the traceability community, promising results have been achieved by applying IR
techniques to recover links between source code and free text documentation.
In particular, IR methods have been used to recover traceability links between manual
pages and source code [33], [41]; between requirements [42]; between several
other types of artifacts [30]; and between unit tests and units under test [10].
However, to the best of our knowledge, no empirical study has been carried out to
evaluate the support given by IR methods in recovering links between user stories
and unit tests.
The scenario is quite different when considering the links between unit tests
and tested classes. In this case, links have to be recovered between structured
artefacts (i.e., source code). Thus, heuristic-based approaches can be exploited
[43]. Some guidelines and naming conventions that describe the testing
environment have been proposed to facilitate the identification of these links. For
example, approaches based on naming conventions identify the tested classes by
analyzing the name of the unit test. Usually, the name of a unit test is obtained from
the name of the tested class followed or preceded by the word "Test". For
instance, the class Converter is tested by the unit test "ConverterTest" (or
"TestConverter"). This approach is very simple, but it assumes a one-to-one
relationship between unit test and tested class, which does not always hold in real
programming practice [44]. In addition, not all developers follow the predefined
naming conventions when naming their unit tests.
In this context, other heuristic-based approaches should be used. Bruntink et
al. [45] show that classes which depend on other classes require more test code
and thus are more difficult to test than independent classes. In order to improve
the testability of complex classes, they suggest using "cascaded test suites",
where a test of a complex class can use the tests of its required classes to set up
the complex test scenario. Source code analysis techniques [46], [47] can be used
to detect and capture the dependencies of unit tests on the related source
code. As an example, backward program slicing [48] uses intraprocedural or
interprocedural control and data flow information to identify the classes that
directly or indirectly affect the computation of the results of the assert statements,
i.e., the statements used to compare the actual outcome with the expected
outcome, in the unit tests.
GUIDELINES FOR AGILE RE

This section introduces some guidelines to improve the performance of
requirements engineering processes in agile environments and to enhance the
quality of requirements:
• Customer Involvement: agile development focuses very strongly on customer
interaction. At the beginning, all relevant ideal stakeholders have to be
identified. Selecting the right customers and prioritizing their respective
requirements is a key issue. The different elicitation practices aim to get as
much knowledge as possible from all stakeholders and resolve
inconsistencies.
• Agile Projects Contracts: at the beginning, the most critical requirements are
expressed by the stakeholders as well as they can, so that the experienced
project leaders can determine an initial cost for agile projects and guess the
cost of later changes.
• Frequent Releases: frequently delivering parts of the system makes it
possible to release expected results to the customers faster in order to get
feedback from them. Hence, the requirements are implemented in an iterative
and incremental fashion.
• Requirements Elicitation Language: use linguistic methods for requirements
elicitation, derived from Natural Language Processing (NLP) [6]. In other
words, requirements are collected using the language of the customer, not a
formal language for requirements specification.
• Non-Functional Requirements (NFR): in agile approaches the handling of NFR
is ill defined [7]. We propose that the customers and agile team leaders arrange
meetings to discuss NFR (and all critical requirements) in the earliest
stages. Once the initial NFR of a project have been identified and documented,
the agile teams can begin with development.
• Smaller agile teams are flexible: smaller agile teams allow continuous and
efficient communication between them and the stakeholders, and the
requirements changes are controlled. Fig. 5 shows that the smaller the agile
teams are, the higher the chances of project success [12].
• Evolutionary requirements: RE in agile methods accommodates changing
requirements even late in the development cycle, but changes to the
requirements must wait until the end of the current iteration. Therefore, agile
development does not spend much time on initial requirements elicitation.
Consequently, this methodology ensures that iterations are consistent with
expectations and that the development process remains organized.
Figure 5. Agile team sizes [11]
• No early documentation: any documents produced in the early stages can
quickly become irrelevant because the agile principles encourage
requirements change. By allocating only 5%-15% of the resources to
requirements, we think the development team can still address shortcomings in
agile development while complying with the agile principles in general.
• Requirements splitting: if the agile team considers a requirement too
complex, this technique helps the customer to divide it into simpler ones. This
helps agile teams to better understand the functionalities requested by the
customer, and helps agile teams to work in parallel with frequent
communication between them. In XP [16], the requirements are written on
story cards, and complex user stories are broken down into smaller stories. Of
course, not all user stories can be divided, since some contain several
sub-requirements or record non-functional requirements. If a story card can be
successfully divided, the original story card is discarded, since it is no longer
needed. All requirements are then included in the union of the new story cards.
• Requirements Traceability: we are persuaded that agile projects would work
better if they included requirements traceability tools together with validation
tools. A good practice would be to identify the traceability links in a TDD
environment. In other words, the traceability links between test cases and
related code should be identified and evolved to control co-changes. In this
way, once the code is refactored, the agile team is able to rebuild the
traceability matrix and determine which test cases need to be
re-run. In particular, the focus should be on the identification of the traceability
links added or deleted after the refactoring process. If the traceability
links between source code and the related unit tests are broken during
refactoring, this may be treated as a warning for a possible code and/or unit test
review [35]. Traceability information between requirements, source code and
unit tests can also be used to drive software development, by identifying
requirements for which unit tests and/or source code have not been
implemented yet. In addition, traceability information can be used to support
refactoring.
CONCLUSION AND FUTURE WORK

The agile methodology manifesto supports a very efficient RE. This chapter
surveyed the real process and activities of agile RE including feasibility study,
elicitation, analysis, documentation, validation, and management. The secret of
the success of agile RE is customer collaboration, good agile developers, and
experienced project managers.
The chapter also provided some recommendations to solve the requirements
documentation problem in agile projects, to make agile methodology suitable for
handling projects with critical (functional and non-functional) requirements, and to allow
agile teams involved in large software projects to work in parallel with frequent
communication between them. In addition, we highlighted some benefits
provided by requirements traceability during software development and
suggested some methods to identify links between artifacts produced in an agile
development environment.
As future work, we plan to conduct industrial case studies to support our ideas,
and to develop a tool that supports the distinction between functional and
non-functional requirements. Supporting traceability in a TDD environment, aiming at
re-establishing traceability after refactoring and using traceability to improve
refactoring, is also part of the agenda for future work.
ACKNOWLEDGEMENT
We would like to thank Mr. Avishek Shrestha for his help, valuable ideas,
and various references.
REFERENCES
1. K. Beck, M. Beedle, A. van Bennekum, A. Cockburn, W. Cunningham, M.
Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries, J. Kern, B. Marick, R.
C. Martin, S. Mellor, K. Schwaber, J. Sutherland, and D. Thomas: Manifesto
for agile software development., [Online]. Available:
http://www.agilemanifesto.org/ .
2. J. Highsmith: Agile software development ecosystems., Boston, MA, USA:
Addison-Wesley Longman Publishing Co., Inc., (2002).
3. C. J. Highsmith, K. Orr: Extreme programming., in Proceedings of E-Business
Application Delivery, pp. 4–17 (2000).
4. A. Eberlein and J. C. S. do Prado Leite: Agile requirements definition: A view
from requirements engineering, in Proceedings of the International Workshop
on Time-Constrained Requirements Engineering (TCRE’02) (2002).
5. R. Goetz and R. C. Endeavour: How agile processes can help in time-
constrained requirements engineering, in Proceedings of the International
Workshop on Time-Constrained Requirements Engineering (TCRE’02) (2002).
6. F. Paetsch, A. Eberlein, and F. Maurer: Requirements engineering and agile
software development, in Proceedings of the Twelfth International Workshop
on Enabling Technologies (WETICE’03). Washington, DC, USA: IEEE
Computer Society, p. 308 (2003).
7. L. Cao and B. Ramesh: Agile requirements engineering practices: An
empirical study., IEEE Softw., vol. 25, no. 1, pp. 60–67 (2008).
8. A. Sillitti and G. Succi: Requirements engineering for agile methods., in
Engineering and Managing Software Requirements. Springer Verlag, pp. 309–
326 (2005).
9. S. Bose, M. Kurhekar, and J. Ghoshal: Agile methodology in requirements
engineering., [Online]. Available: http://www.infosys.com/research/publica-
tions/agile-requirementsengineering.pdf.
10. B. V. Rompaey and S. Demeyer: Establishing traceability links between unit
test cases and units under test., Software Maintenance and Reengineering,
European Conference on, vol. 0, pp. 209–218 (2009).
11. S. Ambler: When does(nt) agile modeling make sense?., [Online]. Available:
http://www.agilemodeling.com/essays/whenDoesAMWork.htm.
12. ——:Agile Modeling: Effective Practices for eXtreme Programming and the
Unified Process. New York, NY, USA: John Wiley & Sons, Inc. (2002).
13. S. R. Palmer and M. Felsing: A Practical Guide to Feature-Driven
Development., Pearson Education, (2001).
14. P. Coad, J. d. Luca, and E. Lefebvre: Java Modeling Color with Uml:
Enterprise Components and Process with Cdrom. Upper Saddle River, NJ,
USA: Prentice Hall PTR, (1999).
15. J. Stapleton: Dsdm: Dynamic systems development method., in TOOLS (29),
p. 406 (1999).
16. K. Beck: Extreme Programming Explained., Addison Wesley, (2000).
17. P. Abrahamsson, O. Salo, J. Ronkainen, and J. Warsta: Agile software
development methods - review and analysis., VTT PUBLICATIONS, Tech.
Rep. 478, (2002).
18. K. Schwaber and M. Beedle: Agile Software Development with Scrum., Upper
Saddle River, NJ, USA: Prentice Hall PTR, (2001).
19. A. Polini: Software requirements., [Online]. Available:
http://www1.isti.cnr.it/polini/lucidiSE/Requirements1.pdf.
20. A. Sidky and J. Arthur: Determining the applicability of agile practices to
mission and life-critical systems., in Proceedings of the 31st IEEE Software
Engineering Workshop (SEW ’07). Washington, DC, USA: IEEE Computer
Society, pp. 3–12 (2007).
21. M. Poppendieck and T. Poppendieck: Lean Software Development: An
Agile Toolkit., Boston, MA, USA: Addison-Wesley Longman Publishing Co.,
Inc., (2003).
22. O. Salo: Systematical validation of learning in agile software development
environment., in Wissensmanagement, pp. 92–96 (2005).
23. O. Salo and P. Abrahamsson: Integrating agile software development and
software process improvement: a longitudinal case study., in ISESE, pp. 193–
202 (2005).
24. I. Sommerville and P. Sawyer: Requirements Engineering: A Good Practice
Guide., New York, NY, USA: John Wiley & Sons, Inc., (2000).
25. E. J. Chikofsky and J. H. Cross II: Reverse engineering and design recovery: A
taxonomy., IEEE Software, vol. 7, no. 1, pp. 13–17, (1990).
26. M. Fowler: Refactoring: Improving the Design of Existing Code., Boston, MA,
USA: Addison-Wesley, (1999).
27. K. Beck and M. Fowler: Planning Extreme Programming., Boston, MA, USA:
Addison-Wesley Longman Publishing Co., Inc., (2000).
28. B. W. Boehm: Verifying and validating software requirements and design
specifications., IEEE Softw., vol. 1, no. 1, pp. 75–88 (1984).
29. L. Delgadillo and O. Gotel: Story-wall: A concept for lightweight requirements
management., in RE, pp. 377–378 (2007).
30. A. D. Lucia, F. Fasano, R. Oliveto, and G. Tortora: Recovering traceability
links in software artifact management systems using information retrieval
methods., ACM Transactions on Software Engineering and Methodology, vol.
16, no. 4, p. 13 (2007).
31. O. Gotel and A. Finkelstein: An analysis of the requirements traceability
problem., in Proc. of 1st International Conference on Requirements
Engineering. Colorado Springs, Colorado, USA: IEEE CS Press, pp. 94–101
(1994).
32. B. Ramesh and M. Jarke: Toward reference models for requirements
traceability., IEEE Transactions on Software Engineering, vol. 27, no. 1, pp.
58–93 (2001).
33. G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo: Recovering
traceability links between code and documentation., IEEE Transactions on
Software Engineering, vol. 28, no. 10, pp. 970–983 (2002).
34. IEEE: IEEE Std 1219-1998: IEEE Standard for Software Maintenance., The
Institute of Electrical and Electronics Engineers, Inc., (1998).
35. J. H. Hayes, A. Dekhtyar, and D. S. Janzen: Towards traceable test driven
development., in Proceedings of the 2009 ICSE Workshop on Traceability in
Emerging Forms of Software Engineering. Washington, DC, USA: IEEE
Computer Society, pp. 26–30 (2009).
36. A. van Deursen and L. Moonen: The video store revisited – thoughts on
refactoring and testing., in Proc. Int’l Conf. eXtreme Programming and Flexible
Processes in Software Engineering (XP), Alghero, Sardinia, Italy, pp. 71–76
(2002).
37. C. Lee, L. Guadagno, and X. Jia: An agile approach to capturing requirements
and traceability., in Proceedings of the 2nd International Workshop on
Traceability in Emerging Forms of Software Engineering, pp. 17–23 (2003).
38. A. Ghazarian: Traceability patterns: an approach to requirement component
traceability in agile software development, in ACS’08: Proceedings of the 8th
conference on Applied computer science. Stevens Point, Wisconsin, USA:
World Scientific and Engineering Academy and Society (WSEAS), pp. 236–
241 (2008).
39. H. M. Sneed: Reverse engineering of test cases for selective regression
testing., in Proceedings of the 8th Working Conference on Software
Maintenance and Reengineering. Washington, DC, USA: IEEE Computer
Society, p. 69 (2004).
40. R. Baeza-Yates and B. Ribeiro-Neto: Modern Information Retrieval., Addison-
Wesley, (1999).
41. A. Marcus and J. I. Maletic: Recovering documentation-to-source-code
traceability links using latent semantic indexing., in Proceedings of 25th
International Conference on Software Engineering. Portland, Oregon, USA:
IEEE CS Press, pp. 125–135 (2003).
42. J. H. Hayes, A. Dekhtyar, and S. K. Sundaram: Advancing candidate link
generation for requirements tracing: The study of methods., IEEE
Transactions on Software Engineering, vol. 32, no. 1, pp. 4–19 (2006).
43. A. De Lucia, F. Fasano, and R. Oliveto: Traceability management for impact
analysis., in Proceedings of Frontiers of Software Maintenance. Beijing, China:
IEEE Press, pp. 21–30 (2008).
44. R. V. Binder: Testing object-oriented systems: models, patterns, and tools.,
Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., (1999).
45. M. Bruntink and A. v. Deursen: Predicting class testability using object-
oriented metrics., in Proceedings of the 4th IEEE International Workshop
Source Code Analysis and Manipulation. Montreal, Canada: IEEE Computer
Society, pp. 136–145 (2004).
46. D. Binkley: Source code analysis: A road map., in FOSE, pp. 104–119 (2007).
47. K. Gallagher and D. Binkley: Program slicing., in Frontiers of Software
Maintenance. Beijing, China: IEEE CS Press, (2008).
48. M. Weiser: Program slicing., IEEE Trans. Software Eng., vol. 10, no. 4, pp.
352–357 (1984).
Chapter 8
Robust encoding of the FS1016 LSF parameters:
Application of the channel optimised trellis coded
vector quantization
BOUZID Merouane
University of Sciences and Technology Houari Boumediene (USTHB), Algeria
ABSTRACT
Speech coders operating at low bit rates require efficient encoding of the
linear predictive coding (LPC) coefficients. Line Spectral Frequency (LSF)
parameters are currently one of the most efficient choices of transmission
parameters for the LPC coefficients. In this paper, we present an optimized trellis
coded vector quantization (OTCVQ) scheme designed for robust encoding of the
LSF parameters. The objective of this system, initially called "LSF-OTCVQ
Encoder", is to achieve a low bit-rate quantization of the FS1016 LSF parameters.
The efficiency of the LSF-OTCVQ encoder (with weighted distance) was first
demonstrated in the ideal case of transmission over a noiseless channel. We then
turned to improving its robustness for real transmission over a
noisy channel. To implicitly protect the transmission parameters of the LSF-OTCVQ
encoder incorporated in the FS1016, we used joint source-channel
coding carried out by the channel optimized vector quantization (COVQ) method.
In the case of transmission over a noisy channel, we show that the new
encoding system, called "COVQ-LSF-OTCVQ Encoder", contributes
significantly to the improvement of the FS1016 performance by
ensuring good coding robustness of its LSF spectral parameters.
Keywords: Source-channel coding, Robust speech coding, LSF parameters.
INTRODUCTION
In speech coding systems, the short-term spectral information of the speech
signal is often modelled by the frequency response of an all-pole filter whose
transfer function is denoted by $H(z) = 1/A(z)$, in which
$A(z) = 1 + a_1 z^{-1} + \ldots + a_p z^{-p}$ [1]. In telephone band speech coding
(300-3400 Hz, $f_e$ = 8 kHz), the
parameters of this filter are derived from the input signal through linear prediction
(LP) analysis of order p = 10. The 10 parameters $\{a_i\}_{i=1,2,\ldots,10}$, known as the
Linear Predictive Coding (LPC) coefficients [1], play a major role in the overall
bandwidth and in preserving the quality of the encoded speech. Therefore, the
challenge in the quantization of the LPC parameters is to achieve transparent
quantization quality [2] with the minimum bit rate while keeping the memory
and computational complexity at a low level.
In practice, the LPC coefficients are not quantized directly because they
have poor quantization properties. Thus, other equivalent parametric
representations have been formulated which convert them into parameters that
are much more suitable for quantization. One of the most efficient representations of the
LPC coefficients is the Line Spectral Frequency (LSF) representation [3]. The LSF parameters
(LSFs), which are related to the zeros of polynomials derived from A(z) [1], exhibit a
number of interesting properties. These properties [2] make them a very attractive
set of transmission parameters for the LPC coefficients. Exploiting these
properties, various coding schemes based on scalar and vector quantization have been
developed for the efficient quantization of the LSF spectral parameters.
Several works showed that vector quantizer (VQ) schemes, such as multistage
VQ [4] and split VQ [2], can achieve the transparent quantization
quality of the LSFs at lower bit rates than schemes based on scalar quantizers
(SQ).
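For reference, the polynomials from which the LSFs are derived are usually taken to be the symmetric and antisymmetric polynomials built from A(z). The block below states this standard construction as a sketch; the symbols P(z), Q(z) and ω_i do not appear in the original text and are introduced here only for illustration.

```latex
\begin{align*}
P(z) &= A(z) + z^{-(p+1)} A(z^{-1}), \\
Q(z) &= A(z) - z^{-(p+1)} A(z^{-1}), \\
A(z) &= \tfrac{1}{2}\,\bigl[P(z) + Q(z)\bigr].
\end{align*}
% When the synthesis filter 1/A(z) is stable, the zeros of P(z) and Q(z)
% lie on the unit circle and interlace; the LSFs are their angles in (0, \pi):
\[
0 < \omega_1 < \omega_2 < \cdots < \omega_p < \pi .
\]
```

The ordering of the ω_i is one of the properties commonly exploited by LSF quantizers, since it gives a simple check that the quantized filter remains stable.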
In this paper, we present an optimized trellis coded vector quantization
(OTCVQ) scheme designed for the efficient and robust coding of LSF parameters.
The aim of this system, initially called "LSF-OTCVQ Encoder", is to
achieve a low bit rate transparent quantization of the LSFs by exploiting the
intra-frame dependence between the closest pairs of LSF parameters. In the case
of ideal transmission over a noiseless channel, we have already shown in [5] that
the LSF-OTCVQ encoder (with weighted distance) achieves good
performance when applied to encode the LSF parameters of the US Federal
Standard FS1016. Indeed, we showed that the LSF-OTCVQ encoder at 27
bits/frame produces a perceptual quality equivalent to that obtained when the LSF
parameters are unquantized.
Subsequently, our interest was drawn to improving the robustness of the LSF-OTCVQ
encoder for real transmission over a noisy channel. In low bit rate
speech coding, the essential objective is to reduce the bit rates of speech
coders while maintaining a good transmission quality. In general, during the
design of speech coding systems, the effects of transmission noise are often
neglected. Redundant channel coding [6] is conventionally used to provide
"explicit" protection of the sensitive parameters of speech coders against channel
errors. According to the separate design approach, suggested by Shannon in his
classical source/channel coding theorems [7], the channel encoder can be
designed separately from the source encoder by adding redundant bits
(error-detecting and error-correcting codes) to the source data. Indeed, robust encoding systems can
be designed according to this separation approach, but at the cost of an increase
in bit rate, transmission delay and coding/decoding complexity.
However, at low bit rates, where the constraints on complexity and delay are very
severe, this kind of channel coding is not recommended. The disadvantages of the
separate design have motivated some researchers to investigate a joint solution to
the source and channel coding optimization problem, so as to reduce the
complexity on both sides while providing performance close to the optimum. For
this purpose, Joint Source-Channel Coding (JSCC) was introduced, in which the
overall distortion is minimized by simultaneously considering the impact of the
transmission errors and the distortion due to source coding [8], [9], [10]. Most of
these works have shown the effectiveness of JSCC in implicitly protecting (i.e.,
without redundancy) the source data while maintaining a constant bit rate and a
reduced complexity.
To implicitly protect the transmission indices of our LSF-OTCVQ encoder
incorporated in the FS1016, we used a JSCC method carried out by Channel
Optimized Vector Quantization (COVQ). We first show how to adapt and
successfully apply the COVQ technique to the robust design of a new encoding system
(called "COVQ-LSF-OTCVQ encoder") in order to implicitly protect some of its
indices. Finally, we generalize the study to the complete protection of all
the indices of the COVQ-LSF-OTCVQ encoder.
An outline of this paper is as follows. In section 2, we briefly review the basics
of vector quantization. In section 3, we describe the design steps of the OTCVQ
encoding system. Examples of comparative results of TCVQ/OTCVQ encoders
are reported in this section. Next, in section 4, we present the joint coding method based on the
COVQ technique. The performance of the COVQ system applied to encode a
memoryless source is presented at the end of that section. The application of the
OTCVQ scheme to encoding the LSF parameters is described in section 5.
Simulation results, obtained with two different distance measures (unweighted and
weighted) in the design and the operation of the LSF-OTCVQ encoder, are
provided. In section 6, we present the application of the LSF-OTCVQ encoder to
quantize the LSF parameters of the FS1016 speech coder. A JSCC-COVQ
method is then used to implicitly protect the LSF-OTCVQ indices for transmission
over a noisy channel. Conclusions are given in section 7.
VECTOR QUANTIZATION
A k-dimensional vector quantizer (VQ) of size L is a mapping Q of the
k-dimensional Euclidean space $\Re^k$ into a finite subset (codebook) $Y = \{y_0, \ldots, y_{L-1}\}$
composed of L codevectors [11]. The design principle of a VQ consists of
partitioning the k-dimensional space of source vectors into L non-overlapping cells
$\{R_0, \ldots, R_{L-1}\}$ and associating with each cell $R_i$ a unique codevector $y_i$.
Coding a sequence of input source vectors by a VQ thus consists of associating
with each source vector x the binary index $i \in \{0, \ldots, L-1\}$ of the codevector $y_i$
whose distance from the input vector is minimal. In general, vector
quantization involves an irreversible loss of information which results in a quality
degradation commonly evaluated by a distortion measure. For a given VQ, the
average distortion is defined by [11]:
$$ D = \frac{1}{k} \sum_{i=0}^{L-1} \int_{x \in R_i} d(x, y_i)\, p(x)\, dx , \tag{1} $$
where p(x) is the k-fold probability density function of the source and $d(x, y_i)$ is the
widely used squared Euclidean distance.
The optimal design of a VQ is based on the principle of searching simultaneously for
the partition $\{R_0, \ldots, R_{L-1}\}$ and the representative codevectors $\{y_0, \ldots, y_{L-1}\}$ that
minimize the average distortion D. To solve this problem, two main necessary
conditions of optimality need to be successively satisfied during the VQ design
process [11]:
1. For a given codebook $Y = \{y_0, y_1, \ldots, y_{L-1}\}$, the optimal partition satisfies:
$$ R_i = \{\, x : d(x, y_i) \le d(x, y_j),\ \forall j \ne i \,\} \tag{2} $$
This is the nearest neighbor optimality condition.
2. Given an encoder partition $\{R_i,\ i = 0, \ldots, L-1\}$, the optimal codevectors $y_i$ are the
centroids of the partition cells $R_i$ (centroid condition):
$$ y_i = \mathrm{Cent}(R_i) = E(X / X \in R_i) \tag{3} $$
Various algorithms for the design of VQ have been developed in the past. The
most popular one is certainly the LBG algorithm [12]. This algorithm (LBG-VQ) is
an iterative application of the two optimality conditions, such that the partition and
the codebook are iteratively updated.
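The alternation between the two optimality conditions can be sketched as below. This is a simplified illustration rather than the exact procedure of [12]: the codebook is initialised with randomly chosen training vectors and a fixed number of iterations is used, both of which are assumptions of this example, and the distortion is the empirical counterpart of Eq. (1) computed on a training set.

```python
import random

def sq_dist(x, y):
    """Squared Euclidean distance d(x, y)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest(x, codebook):
    """Index of the closest codevector (nearest neighbor condition, Eq. (2))."""
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))

def average_distortion(data, codebook):
    """Empirical per-dimension distortion, the sample version of Eq. (1)."""
    k = len(data[0])
    total = sum(sq_dist(x, codebook[nearest(x, codebook)]) for x in data)
    return total / (k * len(data))

def lbg(data, size, iterations=20, seed=0):
    """Alternate partition and centroid updates; returns codebook and distortions."""
    rng = random.Random(seed)
    codebook = [list(x) for x in rng.sample(data, size)]  # assumed initialisation
    history = []
    for _ in range(iterations):
        cells = [[] for _ in range(size)]
        for x in data:                      # partition update, Eq. (2)
            cells[nearest(x, codebook)].append(x)
        for i, cell in enumerate(cells):    # centroid update, Eq. (3)
            if cell:                        # an empty cell keeps its codevector
                codebook[i] = [sum(c) / len(cell) for c in zip(*cell)]
        history.append(average_distortion(data, codebook))
    return codebook, history

# Hypothetical training set: 500 two-dimensional unit-variance Gaussian vectors.
rng = random.Random(0)
training = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
codebook, history = lbg(training, size=8)
```

Because each step can only lower the training distortion or leave it unchanged, `history` is a non-increasing sequence, which is a convenient sanity check for an implementation.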
OPTIMIZED ENCODING SYSTEM BASED ON THE TRELLIS
CODED VECTOR QUANTIZATION
The scalar trellis coded quantization (TCQ) [13] and its generalized version
to vector case (TCVQ) [14], [15] improve upon traditional trellis encoders [16] by
labelling the trellis branches with entire subsets rather than with individual
reproduction levels. This approach, which was motivated by Ungerboeck's
formulation of Trellis Coded Modulation (TCM) [17], uses a structured alphabet
with an extended set of quantization levels.
In this work, we are particularly interested in the TCVQ encoder, whose
structure is quite similar to that of TCQ, with an increase in complexity due to vector
codebook searching [14]. The design of a TCVQ encoder consists of several
interrelated steps. These steps include the selection of the trellis, the construction of the
extended initial codebook, the partitioning of the codebook's codevectors into
subcodebooks (subsets) and the labelling of the trellis branches with these subsets.
Consider the design process of a k-dimensional TCVQ encoder of rate R bits
per sample (bps) used to encode a sequence of source vectors. The S-state trellis
used in TCVQ can be any one of Ungerboeck's amplitude modulation trellises [17].
The extended initial TCVQ codebook is generally designed by the LBG algorithm.
It contains $2^{kR+1}$ codevectors (twice as many as the VQ). However, during the TCVQ
encoding process, only a subset of size $2^{kR}$ of these codevectors may be used to
represent a source vector at any instant of time. According to Ungerboeck's set
partitioning method, the codevectors are then partitioned into four subsets D0, D1,
D2 and D3, each of size $2^{kR-1}$. In our TCVQ encoder design, we used the heuristic
algorithm described in [15] to partition the extended TCVQ codebook. After that,
the subsets are assigned to the trellis branches according to Ungerboeck's rules of
TCM [17]. These rules are meant to ensure that the distortion between the original
and the reconstructed source sequences (under clear channel assumptions) is
close to the minimum.
To encode the sequence of source vectors, the well-known Viterbi algorithm [16]
is used to find a legitimate optimal path through the trellis, which results in
minimum distortion. The TCVQ encoder transmits to the receiver a bit sequence
specifying the corresponding optimal path (sequence of subsets), in addition to a
sequence of (kR−1)-bit codewords necessary to specify the codevectors from the
chosen subsets. At the TCVQ decoder side, the bit sequence that specifies the
selected optimal trellis path is used as the input to the convolutional coder of the
TCVQ system. The output of this coder selects the proper subset Di. The
codewords of the second binary sequence are used to select the correct
codevectors from each subset.
An example of a 4-state scalar TCQ encoder of rate R = 2 bps, used to
encode a memoryless source uniformly distributed on the interval [−A, A],
is illustrated in Fig. 1.
Figure 1. TCQ encoder of rate R = 2 bps: (a) section of the labelled 4-state trellis, (b)
output alphabet levels and partition, (c) TCQ convolutional coder.
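The encoding and decoding steps described above can be sketched for the scalar case of a 4-state TCQ at R = 2 bps. Several details are assumptions of this example because the figure content is not recoverable here: A = 1, the 8 output levels are uniformly spaced, the subsets D0 to D3 are assigned to the levels cyclically, and the branch labelling in `TRELLIS` is one commonly used 4-state labelling, not necessarily the one drawn in Fig. 1.

```python
import random

A = 1.0
# 2^(R+1) = 8 output levels, uniformly spaced on [-A, A] (assumed alphabet).
LEVELS = [A * (2 * i - 7) / 8 for i in range(8)]
# Subsets D0..D3: level i belongs to D(i mod 4), so each subset has 2^(R-1) = 2 levels.
SUBSETS = [[i for i in range(8) if i % 4 == j] for j in range(4)]
# Assumed trellis: state -> [(next_state, subset)] for path bit 0 and path bit 1.
TRELLIS = {0: [(0, 0), (1, 2)], 1: [(2, 1), (3, 3)],
           2: [(0, 2), (1, 0)], 3: [(2, 3), (3, 1)]}

def tcq_encode(samples, start_state=0):
    """Viterbi search; returns ([(path_bit, codeword)], total squared error)."""
    inf = float("inf")
    cost = [inf] * 4
    cost[start_state] = 0.0
    paths = [[] for _ in range(4)]          # survivor path of each state
    for x in samples:
        new_cost, new_paths = [inf] * 4, [None] * 4
        for state in range(4):
            if cost[state] == inf:          # state not reachable yet
                continue
            for bit, (nxt, subset) in enumerate(TRELLIS[state]):
                # Best level inside the subset that labels this branch.
                word = min(range(2),
                           key=lambda w: (x - LEVELS[SUBSETS[subset][w]]) ** 2)
                c = cost[state] + (x - LEVELS[SUBSETS[subset][word]]) ** 2
                if c < new_cost[nxt]:       # keep the cheaper survivor
                    new_cost[nxt] = c
                    new_paths[nxt] = paths[state] + [(bit, word)]
        cost, paths = new_cost, new_paths
    best = min(range(4), key=lambda s: cost[s])
    return paths[best], cost[best]

def tcq_decode(symbols, start_state=0):
    """Follow the path bits through the trellis and pick the levels."""
    state, output = start_state, []
    for bit, word in symbols:
        state, subset = TRELLIS[state][bit]
        output.append(LEVELS[SUBSETS[subset][word]])
    return output

rng = random.Random(0)
samples = [rng.uniform(-A, A) for _ in range(8)]
symbols, total_error = tcq_encode(samples)
reconstruction = tcq_decode(symbols)
```

Each sample costs one path bit plus one bit selecting the level inside the subset, i.e. R = 2 bits in total, although the encoder effectively chooses among 8 levels over time.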
Examples of simulation results for encoding unit-variance memoryless
Gaussian sources using integer and fractional rate TCVQ encoders are
given in tables 1 and 2, respectively. For different rates, results are given in terms
of Signal to Noise Ratio (SNR) in dB, along with the corresponding LBG-VQ
performance and the distortion rate function D(R). Notice that when the rate is
fractional, the dimension k has to be such that kR is an integer.
Table 1. Performance of TCVQ encoding with integer rates for the Gaussian
source (SNR in dB).
Rate    Dim.   TCVQ trellis size (number of states)      LBG-VQ   D(R)
(bps)   k      4       8       16      32      64
1       1      4.64    4.77    4.85    4.93    4.98      4.40     6.02
1       2      4.85    5.03    5.09    5.10    5.21      4.42     6.02
1       3      4.98    5.08    5.12    5.15    5.22      4.49     6.02
1       4      5.05    5.14    5.16    5.18    5.23      4.69     6.02
2       1      10.18   10.31   10.38   10.46   10.51     9.31     12.04
2       2      10.36   10.50   10.57   10.60   10.69     9.70     12.04
2       3      10.59   10.69   10.72   10.75   10.81     10.00    12.04
2       4      10.93   11.02   11.05   11.07   11.12     10.41    12.04
Table 2. Performance of TCVQ encoding with fractional rates for the Gaussian
source (SNR in dB).
Rate    Dim.   TCVQ trellis size (number of states)                  LBG-VQ   D(R)
(bps)   k      4      8      16     32     64     128    256
0.66    6      3.34   3.39   3.41   3.42   3.45   3.48   3.49        3.05     4.01
0.75    4      3.72   3.78   3.80   3.82   3.87   3.90   3.93        3.36     4.51
0.80    5      3.96   4.04   4.07   4.08   4.14   4.18   4.20        3.69     4.82
At the same encoding rate, these results show that the TCVQ outperforms the
TCQ (k = 1). Moreover, the TCVQ allows fractional rates as shown by the
simulation results listed in table 2. We can see also that, for a given rate, the
TCQ/TCVQ performances are higher than those of the conventional SQ/VQ.
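The D(R) column can be checked with a short calculation, assuming it is the SNR implied by the distortion rate function of a memoryless Gaussian source, D(R) = σ² 2^(−2R), which gives about 6.02 R dB. The function name below is hypothetical, and reading the rate listed as 0.66 as R = 2/3 (so that kR = 4 for k = 6) is an assumption of this sketch.

```python
import math

def gaussian_bound_snr_db(rate_bps):
    """SNR in dB implied by D(R) = sigma^2 * 2^(-2R) for a Gaussian source."""
    return 10.0 * math.log10(2.0 ** (2.0 * rate_bps))

# Rates used in tables 1 and 2; 2/3 stands for the rate printed as 0.66.
bounds = {rate: gaussian_bound_snr_db(rate) for rate in (1, 2, 2 / 3, 0.75, 0.8)}
for rate, snr in bounds.items():
    print(f"R = {rate:.2f} bps -> {snr:.3f} dB")
```

The computed values agree with the tabulated D(R) entries to within 0.01 dB, which also indicates how far each encoder remains from the theoretical bound.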
To further improve the TCVQ performance, a training optimization procedure
for the extended TCVQ codebook design was developed [5]. For a given set of training
source vectors, this procedure updates the TCVQ codebook by replacing each
codevector with the average of all the source vectors mapped to this codevector.
This leads to an iterative design algorithm for the overall TCVQ encoder. With
this optimization variant, the algorithm is called the OTCVQ (Optimized Trellis
Coded Vector Quantization) algorithm.
Examples of simulation results for encoding memoryless Gaussian sources
using fractional rate OTCVQ encoders are listed in table 3.
Table 3. Performance of the OTCVQ with fractional rates for the Gaussian
source (SNR in dB).
Rate    Dim.   TCVQ trellis size (number of states)                  LBG-VQ   D(R)
(bps)   k      4      8      16     32     64     128    256
0.66    6      3.41   3.45   3.47   3.48   3.49   3.52   3.53        3.05     4.01
0.75    4      3.81   3.85   3.87   3.89   3.93   3.96   3.97        3.36     4.51
0.80    5      4.08   4.14   4.16   4.17   4.21   4.23   4.25        3.69     4.82
Comparing these results with those given in table 2, we clearly notice the
performance improvements brought by the optimization of the TCVQ codebooks.
JOINT CODING BY THE CHANNEL OPTIMIZED VECTOR
QUANTIZATION
Vector quantization is currently used in various practical applications and, since
some type of channel noise is present in any practical communication system, the
analysis and design of VQs for noisy channels is receiving increasing attention.
In this work, we consider joint source-channel coding (JSCC) associated
specifically with the use of VQ in order to provide an implicit protection to our
quantizers. In particular, we are interested in a category of JSCC in which the
quantizers are optimized by taking into account the channel error probability,
namely channel optimized vector quantization [8], [18].
COVQ system principle: Modified optimality conditions
A channel optimized vector quantizer (COVQ) is a coding scheme that
generalizes the VQ by taking into account the noise present on the
transmission channel. The idea is to exploit the knowledge about the channel in
the codebook design process and the encoding algorithm. Thus, the operations of
source and channel coding are integrated jointly into the same entity by
incorporating the channel characteristics in the design procedure. Indeed, the
LBG-VQ lends itself well to a modification in this sense. The purpose then is to
minimize a modified total average distortion between the reconstructed signal and
the original signal, given the channel noise.
The design of a COVQ encoder is carried out by a VQ version extended to the
noisy case [8], [18]. The COVQ scheme keeps the same VQ block structure
(encoder/decoder, dimension, bit rate). The difference is in the formulation of the
necessary conditions of optimality to minimize a modified expression of the total
average distortion. This new distortion is formulated by considering simultaneously
the distortion due to vector quantization and channel errors [18], [19]:
$$D = \frac{1}{k} \sum_{i=0}^{L-1} \int_{R_i} p(x) \left[ \sum_{j=0}^{L-1} p(j/i) \, d(x, y_j) \right] dx , \qquad (4)$$
where p(j/i) is the channel transition probability, i.e., the probability
that the index j is received given that the index i is transmitted. Comparing
Eq. (4) with Eq. (1), one can easily notice that the two equations are equivalent,
except that Eq. (4) uses a modified distance measure (the term in brackets). It
is the same distance d, weighted by the channel transition
probabilities p(j/i), i, j = 0,..., L−1.
The necessary conditions for COVQ optimality are also derived
in two steps, according to the minimization principle of the modified total average
distortion [8], [18], [19].
For a given codebook Y = {y0,..., yL−1} and by using a squared Euclidean
distance measure, the optimal partition Ri (i = 0,..., L−1) for a noisy channel is such
that:
$$R_i = \left\{ x \in \mathbb{R}^k : \sum_{j=0}^{L-1} p(j/i) \, \|x - y_j\|^2 \le \sum_{j=0}^{L-1} p(j/l) \, \|x - y_j\|^2 , \ \forall l \ne i \right\} \qquad (5)$$
Similarly, the optimum codebook for a fixed partition is given by:
$$y_j = \frac{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} x \, p(x) \, dx}{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} p(x) \, dx} , \quad j = 0, \ldots, L-1. \qquad (6)$$
The codevector yj now represents the centroid of all input vectors that are
decoded into the cell Rj, even if the transmitted index i is different from j.
Equations (5) and (6) are referred to, respectively, as the generalized nearest
neighbour and centroid conditions with a modified distortion measure. The optimal
codevectors for a noisy channel are thus linear combinations of those for the
noiseless case, weighted by the a posteriori channel transition probabilities.
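To make the generalized nearest neighbour condition of Eq. (5) concrete, the following sketch encodes one vector with a toy codebook. The names `covq_encode` and `bsc_matrix` and the 1-D codebook are illustrative assumptions; the transition matrix is that of a binary symmetric channel acting on 2-bit indices, which is the channel model adopted in this chapter.

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def bsc_matrix(n_bits, p):
    """P[i][j] = probability of receiving index j when index i is sent."""
    size = 1 << n_bits

    def prob(i, j):
        d = bin(i ^ j).count("1")          # number of flipped bits
        return (1 - p) ** (n_bits - d) * p ** d

    return [[prob(i, j) for j in range(size)] for i in range(size)]


def covq_encode(x, codebook, P):
    """Pick the index i minimising sum_j P[i][j] * ||x - y_j||^2 (Eq. 5)."""
    size = len(codebook)
    dists = [sq_dist(x, y) for y in codebook]
    costs = [sum(P[i][j] * dists[j] for j in range(size)) for i in range(size)]
    return min(range(size), key=costs.__getitem__)


codebook = [[0.0], [1.0], [0.5], [100.0]]   # toy codevectors for indices 00, 01, 10, 11
x = [0.6]
print(covq_encode(x, codebook, bsc_matrix(2, 0.0)))   # 2: plain nearest neighbour
print(covq_encode(x, codebook, bsc_matrix(2, 0.1)))   # 0: index 2 is avoided
```

With a noisy channel the encoder no longer picks the closest codevector: index 2 (binary 10) is one bit error away from index 3, whose codevector is far from x, so the expected distortion is lower when index 0 is transmitted.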
In our applications, the communication channel considered is a discrete
memoryless channel with finite input and output alphabets. Precisely, we assumed
a memoryless binary symmetric channel (BSC) model with bit error (crossover)
probability p [6], [16]. For codewords (VQ indices) of n bits, the BSC transition
probabilities are described by [9], [19]:
$$p(j/i) = (1 - p)^{\,n - d_H(i,j)} \cdot p^{\,d_H(i,j)} , \qquad (7)$$
where dH (i, j) (0 ≤ dH (i, j) ≤ n) is the Hamming distance between the n-bit binary
codewords represented by the integers i and j.
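As an illustration of Eq. (7), the short sketch below evaluates the BSC transition probabilities for 4-bit indices. The function names are chosen here for illustration only.

```python
def hamming_distance(i, j):
    """Number of bit positions in which the integers i and j differ."""
    return bin(i ^ j).count("1")


def bsc_transition(i, j, n, p):
    """Probability of receiving j when the n-bit index i is sent (Eq. 7)."""
    d = hamming_distance(i, j)
    return (1 - p) ** (n - d) * p ** d


n, p = 4, 0.01
row = [bsc_transition(5, j, n, p) for j in range(1 << n)]
print(sum(row))                    # the row sums to 1 (up to rounding)
print(row[5], row[4], row[6])      # zero, one and two bit errors
```

The three printed probabilities drop by roughly a factor p/(1 − p) for each additional bit error, which is what motivates the single-error approximation discussed next.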
When the channel bit error probability p is sufficiently small, the probability of
multiple bit errors in an index is very small relative to the probability of zero or one
bit error [9], [18], [19]. To simplify the numerical computations, it is often adequate
to consider only the effects of single bit errors on channel codewords. The BSC
channel model can be then approximated by [9]:
$$p(j/i) = \begin{cases} p, & j \in \xi_i , \\ 1 - np, & j = i , \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$
where ξi is the set of all integers j, (0 ≤ j ≤ L −1), such that the binary
representation of j is of Hamming distance one from the binary representation of i.
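The single-error model of Eq. (8) can be sketched in a few lines. The helper names are hypothetical; the comparison with (1 − p)^n simply shows how close the approximation is to the exact no-error probability of Eq. (7) for a small p.

```python
def single_error_neighbors(i, n):
    """The set xi_i: indices that differ from i in exactly one of n bits."""
    return [i ^ (1 << b) for b in range(n)]


def approx_transition(i, j, n, p):
    """Single-bit-error approximation of the BSC transition probability (Eq. 8)."""
    if j == i:
        return 1 - n * p
    if j in single_error_neighbors(i, n):
        return p
    return 0.0


n, p = 4, 0.01
print(single_error_neighbors(5, n))                     # [4, 7, 1, 13]
print(approx_transition(5, 5, n, p), (1 - p) ** n)      # approximate vs exact
```

Each index has only n neighbours, so the inner sums over j shrink from L terms to n + 1 terms.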
In the case where the source distribution is unknown, a long training database of
k-dimensional vectors can be used for the quantizer design. With the
approximation given in Eq. (8), equations (4) and (6) become,
respectively:
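$$D = \frac{1}{N} \sum_{t=0}^{N-1} \frac{1}{k} \sum_{j \in \xi_{i_t}} p(j/i_t) \, d(x_t, y_j) , \qquad (9)$$
and:
$$y_j = \frac{\sum_{i \in \xi_j} p(j/i) \sum_{l : x_l \in R_i} x_l / N}{\sum_{i \in \xi_j} p(j/i) \, |R_i| / N} , \qquad (10)$$
where N is the size of the training base, i_t is the index assigned to the training
vector x_t, and |Ri| denotes the number of training vectors belonging to the cell Ri.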
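A training-set version of the centroid condition can be sketched as below. Two points are assumptions of this illustration: the function takes a full transition matrix `P` (with `P[i][j]` = p(j/i)), so the term i = j is included alongside the neighbouring indices, and an empty denominator leaves the codevector unchanged. The common factor 1/N cancels and is omitted.

```python
def covq_centroids(train, assign, P, codebook):
    """Channel-weighted centroid update computed on a training set."""
    size, dim = len(codebook), len(codebook[0])
    cell_sum = [[0.0] * dim for _ in range(size)]
    cell_count = [0] * size
    for x, i in zip(train, assign):
        cell_count[i] += 1
        for d in range(dim):
            cell_sum[i][d] += x[d]
    new_codebook = []
    for j in range(size):
        # Denominator: expected number of vectors decoded as index j.
        den = sum(P[i][j] * cell_count[i] for i in range(size))
        if den == 0:
            new_codebook.append(list(codebook[j]))   # assumption: keep old vector
            continue
        num = [sum(P[i][j] * cell_sum[i][d] for i in range(size)) for d in range(dim)]
        new_codebook.append([v / den for v in num])
    return new_codebook


train = [[0.0], [10.0]]
assign = [0, 1]
P = [[0.9, 0.1], [0.1, 0.9]]
print(covq_centroids(train, assign, P, [[0.0], [10.0]]))   # close to [[1.0], [9.0]]
```

In this toy case the two codevectors move toward each other, which reduces the damage caused when one index is received in place of the other.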
COVQ encoder design algorithm

The design procedure of the COVQ encoding system is a straightforward
extension of the LBG-VQ algorithm. The two modified
optimality conditions are applied iteratively, so that the partition and the
codevectors are updated by using the modified distortion that includes the channel
transition probabilities [8], [18]. The steps of our version of the COVQ algorithm are detailed in
[20].
We suppose that a set of input vectors is available (training base) and that the
BSC channel error probability ε is given. This channel probability, which is often
called design error probability of COVQ codebook, is considered as an input
parameter in the optimization process. At the beginning, this design parameter is
temporarily set to a low value; it is then gradually increased until it matches the desired
design error probability.
The choice of the initial codebook is very important since it can significantly
impact the final results. In our design, the initial codebook is designed for ε = 0
(i.e., for a noiseless channel). It is obtained by a simple run of the conventional LBG-VQ
algorithm, which converges to a locally optimal codebook. This codebook is
used as the initial codebook of the COVQ algorithm. Then, for each stage of ε, the
algorithm will converge to an intermediate codebook which will be used as initial
codebook of the next stage in the COVQ design process.
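The staged design described above can be outlined as a simple loop. This is only a schematic sketch: the linear schedule for ε, the name `staged_covq_design` and the callable `refine` (standing for one complete COVQ run at a fixed ε) are assumptions, since the exact schedule is not specified here. The toy `refine` below merely shrinks the codevectors so that the example runs on its own.

```python
def staged_covq_design(initial_codebook, target_eps, n_stages, refine):
    """Run the design at increasing eps, each stage starting from the previous result."""
    schedule = [target_eps * (s + 1) / n_stages for s in range(n_stages)]
    codebook = initial_codebook          # e.g. an LBG-VQ codebook (eps = 0)
    for eps in schedule:
        codebook = refine(codebook, eps)
    return codebook, schedule


def toy_refine(codebook, eps):
    """Placeholder for one COVQ run: shrink each codevector by (1 - eps)."""
    return [[(1 - eps) * v for v in y] for y in codebook]


codebook, schedule = staged_covq_design([[1.0], [-1.0]], 0.05, 5, toy_refine)
print(schedule)    # five increasing values ending at the design probability 0.05
```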
The greatest difficulty in the COVQ system design is that the channel error
probability is a parameter in the optimization process. In a real transmission
situation, this parameter is difficult to estimate. It may even vary in time, making
a design for one specific value rather academic. Thus, according to the
practical situation and to the estimates of the real communication channel
characteristics, COVQ encoders can be selected to obtain the highest degree of
robustness.
COVQ encoder performances

We now present numerical results on the performance of the COVQ encoding
system operating over a BSC channel with variable bit error probability p.
Examples of simulation results of COVQ encoders, trained for various values of
the design probability parameter ε (ε = 0.001, 0.005, 0.010 and 0.050), are given in
table 4. These encoders, whose characteristics are k = 2, R = 2 bps and
L = 16, were applied to encode a memoryless Gaussian source. For a comparative
evaluation with the conventional VQ, the performances of the LBG-VQ (designed for a
noiseless channel, ε = 0.000) are also included in the table.
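Simulations of this kind require passing the transmitted indices through a BSC. A minimal sketch of that step is given below, assuming 4-bit indices (L = 16) and Python's standard pseudo-random generator; the function name `bsc_corrupt` and the seed are illustrative choices, not details taken from the experiments.

```python
import random


def bsc_corrupt(index, n_bits, p, rng):
    """Flip each of the n_bits of index independently with probability p."""
    for b in range(n_bits):
        if rng.random() < p:
            index ^= 1 << b
    return index


rng = random.Random(0)
n_bits, p, trials = 4, 0.05, 20000
errors = sum(bsc_corrupt(9, n_bits, p, rng) != 9 for _ in range(trials))
print(errors / trials)            # empirical index error rate
print(1 - (1 - p) ** n_bits)      # theoretical value, about 0.185
```

The decoder then simply outputs the codevector addressed by the received (possibly corrupted) index.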
Table 4. SNR performance comparison (in dB) between COVQ and VQ over a BSC
channel
          Design error probability ε
p         0.000   0.001   0.005   0.01    0.05
0.000     9.686   9.685   9.624   9.537   8.664
0.001     9.584   9.604   9.565   9.481   8.643
0.005     9.292   9.314   9.357   9.332   8.571
0.01      8.927   8.965   9.034   9.179   8.477
0.05      6.824   6.918   7.351   7.608   7.800
0.1       4.650   5.292   5.875   6.801   7.043
0.2       2.518   3.109   3.876   4.752   5.886
In the case of transmissions over noisier channels (higher values of p), the
results indicate that COVQ performs better than LBG-VQ. For example, for a BSC
of p = 0.2, a considerable SNR gain of 3.36 dB was obtained by the COVQ
(trained for ε = 0.05) compared with the LBG-VQ. One notices that, for a given
channel probability p, the COVQ
encoders trained for ε identical or close to p are those which yield the best
performances. However, when the channel is noiseless (p = 0.000), the SNR
performances of the COVQ encoders degrade as the design
parameter ε increases. In this case, the LBG-VQ gives performances comparable to or
better than those of the COVQ. The same remarks hold when the channel error probability is
low (p < 0.005), with a slight performance improvement obtained by COVQ encoders
trained for a low value of the design parameter ε (for example, COVQ for ε = 0.001).
OPTIMIZED-TCVQ FOR LOW-BIT RATE ENCODING OF LSF PARAMETERS
Using the OTCVQ encoding technique, an encoding scheme for the LSF
parameters is presented in this section. The aim of this encoding system, called
"LSF-OTCVQ Encoder" [5], is to efficiently quantize the LSF parameters of one
frame using only the dependencies among the same parameters.
For speech coding applications, the OTCVQ is used in block mode, where
each block corresponds to an LSF vector of size 10. In this work, two-dimensional
(2-D) codebooks (k = 2) are used for encoding the LSF vectors. Thus, each stage in
the trellis diagram is associated with two components of the LSF vector. Hence, there are five
stages in the LSF-OTCVQ trellis, with two branches entering and leaving each
state. Since the LSF parameters have different means and variances, five
extended codebooks are needed to encode an LSF vector.
Since the choice of an appropriate distance measure is an important issue
in the design of any VQ system, we have used another distance measure in the
design and operation steps of the LSF-OTCVQ encoder, namely the
weighted Euclidean distance measure. Based on the properties of the LSF parameters,
several weighted distance measures have been proposed for LSF encoding [2],
[4], [21]. In our applications, we used the weighted squared Euclidean distance
given by:
$$d(f, \hat{f}) = \sum_{i=1}^{10} c_i \, w_i \, (f_i - \hat{f}_i)^2 , \qquad (11)$$
where fi and f̂i are respectively the ith coefficients of the original LSF vector f and of the
quantized LSF vector f̂; ci and wi represent respectively the constant and
variable weights assigned to the ith LSF coefficient.
These weights are meant to provide a better quantization of LSF parameters in the
formant regions. Many weighting functions have been defined to calculate the
variable weight vector w = [w1,…, w10]. In particular, we used the weighting
function known as the inverse harmonic mean (IHM) [21]:
$$w_i = \frac{1}{f_i - f_{i-1}} + \frac{1}{f_{i+1} - f_i} , \qquad (12)$$
where f0 = 0 and f11 = 0.5. The constant weight vector c = [c1,…, c10] is
experimentally determined [2]:
$$c_i = \begin{cases} 1.0, & \text{for } 1 \le i \le 8 \\ 0.8, & \text{for } i = 9 \\ 0.4, & \text{for } i = 10 \end{cases} \qquad (13)$$
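The weighted distance of Eqs. (11) to (13) can be sketched as follows. The sketch assumes LSFs expressed as normalized frequencies between 0 and 0.5 (consistent with f0 = 0 and f11 = 0.5) and computes the IHM weights from the original, unquantized vector; the function names and the evenly spaced test vector are illustrative only.

```python
C = [1.0] * 8 + [0.8, 0.4]          # constant weights c_1..c_10 of Eq. (13)


def ihm_weights(f):
    """Inverse harmonic mean weights of Eq. (12), with f_0 = 0 and f_11 = 0.5."""
    ext = [0.0] + list(f) + [0.5]
    return [1.0 / (ext[i] - ext[i - 1]) + 1.0 / (ext[i + 1] - ext[i])
            for i in range(1, len(f) + 1)]


def weighted_distance(f, f_hat):
    """Weighted squared Euclidean distance of Eq. (11)."""
    w = ihm_weights(f)
    return sum(c * wi * (a - b) ** 2 for c, wi, a, b in zip(C, w, f, f_hat))


f = [0.5 * i / 11 for i in range(1, 11)]     # evenly spaced toy LSF vector
f_hat = [v + 0.01 for v in f]                # same error on every coefficient
print(ihm_weights(f)[0])                     # about 44 for this spacing
print(weighted_distance(f, f_hat))           # about 0.0405
```

Closely spaced LSFs, which typically mark a formant, receive large weights, so errors on them are penalized more heavily.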
The LSF quantizer performances are evaluated by the average spectral
distortion (SD), which is often used as an objective measure of the LSF encoding
performance. This measure correlates well with human perception of distortion.
When calculated discretely over a limited bandwidth, the spectral distortion for
frame i is given, in decibels, by [4]:
$$SD_i = \sqrt{ \frac{1}{n_1 - n_0} \sum_{n=n_0}^{n_1 - 1} \left[ 10 \log_{10} \frac{S(e^{j 2\pi n / N})}{\hat{S}(e^{j 2\pi n / N})} \right]^2 } . \qquad (14)$$
For a speech signal sampled at 8 kHz with a 3 kHz bandwidth, an N = 256 point
FFT is used to compute the original S(e^{j2πn/N}) and quantized Ŝ(e^{j2πn/N}) power
spectra of the LPC synthesis filter, associated with the ith frame of speech. The
spectral distortion is thus computed discretely with a resolution of 31.25 Hz per
sample over 96 uniformly spaced points from 125 Hz to 3.125 kHz. The constants
n0 and n1 in Eq. (14) correspond to 1 and 96 respectively.
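A possible way to evaluate the spectral distortion of one frame is sketched below. Several points are assumptions of this sketch: the measure is taken in its usual root-mean-square form, the LPC polynomial is written A(z) = Σ a_m z^(−m) with a_0 = 1 so that the synthesis filter spectrum is 1/|A|², the spectra are evaluated bin by bin instead of through an FFT, and the default limits n0 = 1 and n1 = 96 are the values stated above.

```python
import cmath
import math


def lpc_power_spectrum(a, n, N=256):
    """Power spectrum 1/|A|^2 of the LPC synthesis filter at DFT bin n."""
    w = 2 * math.pi * n / N
    A = sum(a_m * cmath.exp(-1j * w * m) for m, a_m in enumerate(a))
    return 1.0 / abs(A) ** 2


def spectral_distortion(a, a_hat, n0=1, n1=96, N=256):
    """RMS log-spectral difference in dB over bins n0 .. n1-1."""
    acc = 0.0
    for n in range(n0, n1):
        ratio = lpc_power_spectrum(a, n, N) / lpc_power_spectrum(a_hat, n, N)
        acc += (10 * math.log10(ratio)) ** 2
    return math.sqrt(acc / (n1 - n0))


a = [1.0, -0.9]                                # toy first-order filter
print(spectral_distortion(a, a))               # 0.0 for identical filters
print(spectral_distortion(a, [2.0, -1.8]))     # about 6.02 dB (pure gain mismatch)
```

Averaging this value over all frames of the test set gives the average SD used in the tables of this section.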
Generally, it is accepted that an average SD of about 1 dB indicates that negligible
audible distortion has been incurred during quantization. This value has, in the
past, been suggested for transparent quantization quality and used as a goal in
designing many LPC quantization schemes. In [2], Paliwal and Atal established
that the average SD alone is not sufficient to measure perceived quality. They
introduced the notion of spectral outlier frames. Consequently, we can get
transparent quality if we maintain the following three conditions:
1. The average SD is about 1 dB,
2. The percentage of outlier frames having SD between 2 and 4 dB is less than
2%,
3. No frames must have SD greater than 4 dB.
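The three conditions translate directly into a check on the per-frame SD values. In this sketch, "about 1 dB" is interpreted as an average not exceeding a tolerance `avg_tol` of 1.0 dB, and the 2 to 4 dB range is taken as inclusive; both are assumptions, as are the function names.

```python
def sd_statistics(sd_values):
    """Return (average SD, % of frames in 2-4 dB, % of frames above 4 dB)."""
    n = len(sd_values)
    avg = sum(sd_values) / n
    pct_2_4 = 100.0 * sum(1 for s in sd_values if 2.0 <= s <= 4.0) / n
    pct_gt_4 = 100.0 * sum(1 for s in sd_values if s > 4.0) / n
    return avg, pct_2_4, pct_gt_4


def is_transparent(sd_values, avg_tol=1.0):
    """Check the three transparency conditions listed above."""
    avg, pct_2_4, pct_gt_4 = sd_statistics(sd_values)
    return avg <= avg_tol and pct_2_4 < 2.0 and pct_gt_4 == 0.0


print(is_transparent([0.9] * 99 + [2.5]))      # True: 1 % of outliers in 2-4 dB
print(is_transparent([0.9] * 99 + [4.5]))      # False: one frame above 4 dB
```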
Now, we evaluate the performances of our LSF-OTCVQ encoder operating
at different bit rates. All simulation results reported in this section were obtained by
using four-state trellis and 2-D codebooks. For each encoding rate, 2 bits are thus
assigned to represent the initial state. When the remaining bits cannot be equally
assigned to represent the five 2-D codebooks, fewer bits are used in the last
codebooks, since it is known that human resolution in the higher frequency bands
is less than in the lower frequency bands. We investigated the optimum bit
allocations for the LSF-OTCVQ encoder and found that the bit allocations given in
table 5 yield the best results.
Table 5. Bit allocations of each LSF-OTCVQ trellis stage codebook as a function
of bit rate
                    Bits / stage codebook
Bits / LSF vector   Stage 1   Stage 2   Stage 3   Stage 4   Stage 5
24                  5         5         5         4         3
25                  5         5         5         5         3
26                  6         5         5         5         3
27                  6         6         5         5         3
28                  6         6         6         5         3
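The allocations of table 5 can be checked against the total bit budget: for each rate, the five stage allocations plus the 2 bits of the initial state (four-state trellis) must add up to the number of bits per LSF vector. The dictionary layout below is only an illustrative way of holding the table.

```python
INITIAL_STATE_BITS = 2      # log2(4) for the four-state trellis

# Bits per stage codebook for each total rate (values of table 5).
ALLOCATION = {
    24: (5, 5, 5, 4, 3),
    25: (5, 5, 5, 5, 3),
    26: (6, 5, 5, 5, 3),
    27: (6, 6, 5, 5, 3),
    28: (6, 6, 6, 5, 3),
}


def total_bits(stage_bits):
    """Bits per LSF vector: initial state plus all stage codebooks."""
    return INITIAL_STATE_BITS + sum(stage_bits)


for rate, stage_bits in ALLOCATION.items():
    sizes = [2 ** b for b in stage_bits]      # extended codebook sizes per stage
    print(rate, total_bits(stage_bits), sizes)
```

Each printed line shows the nominal rate, the recomputed total and the resulting codebook sizes; the first two columns agree for every row.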
The speech data used in the experiments of this section consists of
approximately 43 min of speech taken from the TIMIT speech database [22]. To
construct the LSF database, we have used the same LPC analysis function as the
FS1016 speech coder [23]. A 10th-order LPC analysis, based on the autocorrelation
method, is performed on every analysis frame of 30 ms using a Hamming window.
One part of the LSF database, consisting of 75000 LSF vectors, is used for
training and the remaining part, of 11262 LSF vectors (different from the training
set), is used for testing.
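For readers unfamiliar with the autocorrelation method, the following sketch shows a generic 10th-order LPC analysis of one Hamming-windowed frame using the Levinson-Durbin recursion. It is a textbook illustration and not the FS1016 analysis routine: details of the standard are not reproduced, the conversion from LPC coefficients to LSFs is omitted, and the 240-sample synthetic frame (30 ms at 8 kHz) is made up for the example.

```python
import math
import random


def hamming_window(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, order):
    """Autocorrelation lags r[0..order] of the windowed frame."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations; returns A(z) coefficients [1, a1..ap] and error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                       # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a, err


def lpc_frame(frame, order=10):
    win = hamming_window(len(frame))
    x = [s * w for s, w in zip(frame, win)]
    return levinson_durbin(autocorrelation(x, order), order)


rng = random.Random(0)
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.01 * rng.gauss(0, 1)
         for n in range(240)]
a, err = lpc_frame(frame)
print(len(a), err > 0)     # 11 coefficients, positive prediction error
```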
For different bit rates, the performances of the LSF-OTCVQ encoder are
shown in table 6. These results have been obtained by using separately two
different distortion measures (unweighted and weighted distances) in both the
design and the operation of the LSF-OTCVQ encoder.
These comparative results clearly show the improvement of the LSF-OTCVQ
performances obtained by using the weighted distance. The LSF-OTCVQ
encoder designed with the weighted distance needs 27 bits/frame to get transparent
quantization quality. Compared to the encoder designed with the unweighted
distance, it can save about 1-2 bits/frame while maintaining comparable
performances.
Table 6. Performances of the LSF-OTCVQ encoder as a function of bit rate.
EFFICIENT AND ROBUST CODING OF THE FS1016 LSF PARAMETERS: APPLICATION OF THE LSF-OTCVQ
In this section we use the LSF-OTCVQ encoder (with weighted distance) to
quantize the LSF parameters of the FS1016. For the moment, we suppose that the
transmissions are done over an ideal noiseless channel. Recall that the US Federal
Standard FS1016 is a 4.8 kbits/s Code Excited Linear
Prediction (CELP) speech coder [23]. According to the FS1016 standard, the LSF
parameters are originally encoded by an SQ of 34 bits/frame.
For the same test database (11262 LSF vectors), this 34 bits/frame LSF SQ
results in an average SD of 1.72 dB, 25.99 % outliers in the range 2-4 dB, and
0.46 % outliers having SD greater than 4 dB. By comparing these results with
those given in table 6, we can see that the LSF-OTCVQ encoder (for all studied
lower rates) performs better than the 34 bits/frame SQ originally used in the
FS1016. Thus, several bits per frame can be gained by applying the
LSF-OTCVQ in the LSF encoding process of the FS1016.
Subjective listening tests of the 27 bits/frame LSF-OTCVQ encoder were also
performed. Incorporating this encoder in the FS1016, the bit rate for the
quantization of the LSF parameters decreases to 900 bits/s and consequently the
FS1016 operates at a bit rate of 4.57 kbits/s. To carry out these tests, we generated,
for the same original speech signal, three versions of synthetic speech signals: one
with unquantized LSFs and the two others with quantized LSFs using respectively
the 27 bits/frame LSF-OTCVQ encoder and the 34 bits/frame SQ. Subjective
quality evaluations are done here through A-B comparison and MOS (Mean
Opinion Score) tests using 8 listeners. Six sentences from the TIMIT database
(spoken by three male and three female speakers) are used for the subjective
evaluations.
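The bit rates quoted above follow from the 30 ms frame length, as the short calculation below shows. The variable names are illustrative; the only inputs are the frame duration, the original coder rate of 4.8 kbits/s and the two LSF allocations of 34 and 27 bits/frame.

```python
FRAME_MS = 30               # analysis frame length in milliseconds
ORIGINAL_RATE = 4800        # FS1016 bit rate in bits/s


def bits_per_second(bits_per_frame):
    return bits_per_frame * 1000 / FRAME_MS


lsf_rate_new = bits_per_second(27)          # LSF-OTCVQ
lsf_rate_old = bits_per_second(34)          # original scalar quantizer
coder_rate_new = ORIGINAL_RATE - lsf_rate_old + lsf_rate_new

print(lsf_rate_new)                         # 900.0 bits/s
print(round(coder_rate_new / 1000, 2))      # 4.57 kbits/s
```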
The A-B comparison test involves presenting listeners with a sequence of two
speech test signals (A and B). For each sentence, a comparison is done between
the two synthetic signals: one A (or B) with unquantized LSFs and the other B (or
A) with LSFs quantized by the LSF-OTCVQ encoder. The A-B signal pairs are
presented in a randomized order. The listeners choose either one or the other of
the two synthesized versions, or indicate no preference. For the MOS tests, the
listeners were requested to rate each synthetic speech sentence (with LSF-OTCVQ
quantized LSFs) on a scale between 1 (bad) and 5 (excellent). At the end,
the mean opinion score (MOS) is calculated.
Results from the A-B comparison tests show that the majority of the listeners
(58.34 %) have no preference. The mean preference for the speech signal coded with
LSF-OTCVQ quantized LSFs (20.83 %) is identical to that obtained for the speech
signal coded with unquantized LSFs. Roughly, we can conclude that the two
considered versions of coded speech are statistically indistinguishable, i.e., there
are no perceptible differences and the quantization does not contribute to
audible distortion. In terms of MOS, the considered coded version of speech
exhibits a good score of 3.89. This implies that good communications quality and
high levels of intelligibility [2] are obtained using the 27 bits/frame LSF-OTCVQ
encoder in the FS1016.
In addition, in terms of average segmental signal-to-noise ratio (SSNR), the
synthetic speech signals with unquantized LSF parameters gave an average
SSNR of 11.05 dB; with LSF-OTCVQ encoding of LSF parameters, the average
SSNR obtained is 10.31 dB. In the case where LSF parameters are quantized by
the 34 bits SQ, an average SSNR of 9.59 dB was obtained. Thus, a reduction in
coding rate with an improvement of the SSNR-performances of the FS1016 was
obtained by application of the LSF-OTCVQ encoding system.
Robustness of the COVQ-OTCVQ encoder: Transmission over a noisy channel
In a practical communication system, the robustness of the LSF-OTCVQ
encoder must be reinforced so that the encoder is able to cope with
channel errors. In this part, we are interested in the implicit protection of the
encoders by application of the JSCC-COVQ technique. We first show how to
apply the COVQ to the robust design of the LSF-OTCVQ encoder in order to
provide an implicit protection to some of its indices. Finally, we generalize
the study to the full protection of all the indices of the new LSF-OTCVQ encoder
with the COVQ technique.
Design of the LSF-OTCVQ encoder with JSCC-COVQ technique

The design principle of the LSF-OTCVQ encoder optimized for a noisy channel is
based mainly on the design algorithm of the LSF-OTCVQ, modified according to the
basic concept of the COVQ. In the applications, the five extended codebooks of
our new encoding system, denoted "COVQ-LSF-OTCVQ encoder", were
optimized for a design error probability ε = 0.05.
The basic steps of our design algorithm of the 27 bits/frame COVQ-LSF-OTCVQ
encoder are summarized below. Notice that the number of trellis states of
the encoder is always S = 4; consequently 2 bits/frame are necessary to represent
the initial state. The remaining 25 bits are assigned to the 5 codebooks according
to the bit allocation given in table 5. At the beginning, the 5
initial extended codebooks are designed by the LBG-VQ algorithm (ε = 0.000)
using the weighted Euclidean distance. The codebooks of the COVQ-LSF-OTCVQ
encoder are designed using the same training database (75000 LSF vectors).
This base is divided into 5 training subsets of 2-D LSF vector pairs
(LSF 1-2, LSF 3-4, LSF 5-6, LSF 7-8 and LSF 9-10).
Design steps of COVQ-LSF-OTCVQ encoder:
Step 1: Initial design
− Based on the 5 training subsets, use the COVQ (ε = 0.05) algorithm to design
the five (2-D) extended initial codebooks of the encoder.
− Partition each initial codebook into 4 sub-codebooks using the set partitioning
algorithm. Then, label the transitions of each trellis stage with the corresponding
partitioned COVQ-codebook (i.e., COVQ-codebook LSF1-2 for stage 1,…).
− Set a stop threshold α to a very small value.
Step 2: TCVQ coding/decoding process
− For the given LSF vectors training base, find the best possible reproduction LSF
vectors through the trellis by using a modified version of the Viterbi procedure.
− Calculate the average SD between the original and quantized LSF vectors.
Step 3: Termination test
− If the relative decrease of the average SD is below the threshold α, save the 5
optimized codebooks of the COVQ-LSF-OTCVQ encoder and stop.
− Otherwise, update the 5 COVQ-codebooks using a modified version of the
optimization procedure and go to step 2.
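The control flow of steps 2 and 3 can be outlined as the loop below. It is only a skeleton: `encode` and `update` are hypothetical callables standing for the trellis search (returning the assignments and the average SD) and for the codebook optimization, and the scalar "codebook" in the toy example exists solely to make the sketch executable.

```python
def iterate_design(codebooks, encode, update, alpha=1e-4, max_iter=100):
    """Alternate encoding and codebook update until the SD stops decreasing."""
    prev_sd = None
    avg_sd = None
    for _ in range(max_iter):
        assignments, avg_sd = encode(codebooks)               # Step 2
        if prev_sd is not None and prev_sd > 0:
            if (prev_sd - avg_sd) / prev_sd < alpha:          # Step 3: stop test
                break
        codebooks = update(codebooks, assignments)            # Step 3: update
        prev_sd = avg_sd
    return codebooks, avg_sd


# Toy stand-ins: the "codebook" is a single number c, the distortion is c^2 + 1.
toy_encode = lambda c: (None, c * c + 1.0)
toy_update = lambda c, _assignments: c / 2

final, sd = iterate_design(8.0, toy_encode, toy_update)
print(final, sd)      # the distortion settles just above its floor of 1.0
```

The `max_iter` bound is a safeguard added in this sketch; the steps above rely on the threshold α alone.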
In step 2, the TCVQ encoding process of the input LSF vectors consists of finding
the best possible sequence of codevectors (optimal path) through the trellis. This
search is carried out by the Viterbi algorithm with a slight modification of the
distance computation formula. This distance, which must be minimized during the
TCVQ search for the optimal codevector, is formulated as follows:
$$d(f, \hat{f}_i) = \sum_{j \in \xi_i} p(j/i) \, \frac{1}{k} \sum_{m=1}^{k} c_m \, w_m \, \big( f(m) - \hat{f}_j(m) \big)^2 \qquad (15)$$
where k is the dimension of the LSF vectors (k = 2 for LSF pairs) and ξi is the set of
the neighbors j of i such that dH (i, j) = 1. Recall that after the encoding process, the
COVQ-LSF-OTCVQ encoder transmits two binary sequences in addition to the two bits
representing the trellis initial state.
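A sketch of this modified distance for one candidate index is given below, using the single-error channel model of Eq. (8). As written, Eq. (15) sums over the neighbours in ξi only; the sketch also adds the term j = i with probability 1 − np, which is an assumption made here so that the cost reduces to the ordinary weighted distance when p = 0. The function name, the toy codebook and the unit weights are illustrative.

```python
def covq_branch_distance(f, i, codebook, n_bits, p, c, w):
    """Channel-weighted distance between f and the codevector of index i."""
    k = len(f)

    def wdist(y):
        # Weighted squared distance per dimension (inner sum of Eq. 15).
        return sum(c[m] * w[m] * (f[m] - y[m]) ** 2 for m in range(k)) / k

    total = (1 - n_bits * p) * wdist(codebook[i])     # j = i (assumption, see above)
    for b in range(n_bits):
        j = i ^ (1 << b)                              # j in xi_i: one bit flipped
        total += p * wdist(codebook[j])
    return total


codebook = [[0.10, 0.15], [0.12, 0.18], [0.30, 0.35], [0.20, 0.25]]
f = [0.10, 0.15]
ones = [1.0, 1.0]
print(covq_branch_distance(f, 0, codebook, 2, 0.0, ones, ones))    # 0.0
print(covq_branch_distance(f, 0, codebook, 2, 0.05, ones, ones))   # about 0.00203
```

In the Viterbi search, this value would replace the usual branch metric of each trellis transition.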
In this part, we must notice that only the index sequence of the COVQ-LSF-OTCVQ
codevectors (a sequence of 20 bits for the 5 indices) is
protected implicitly by COVQ. This sequence results directly from the COVQ
search procedure through the 5 codebooks of the encoder. On the other hand, the
other binary sequences (initial state, optimal path) are not delivered by the VQ search
process and consequently they are not protected implicitly against channel errors.
Performances of the COVQ-LSF-OTCVQ system: Encoding of the FS1016 LSF parameters
We present now the performances of the 27 bits/frame COVQ-LSF-OTCVQ
encoder (ε = 0.05) applied for the efficient and robust coding of FS1016 LSF
parameters. In these simulations, the channel errors will affect only the
transmission of LSF parameters. For the moment, only the sequences of 20
bits/frame specifying the COVQ-LSF-OTCVQ codevectors indices are transmitted
over a BSC channel of bit error probability p varying between 0 and 0.5.
The database used in the following evaluations is composed of 13.69 s speech
sequences extracted from the test database. Synthesized speech signals of this
base were generated by the FS1016, with objective evaluations in terms of
average SD for the LSF encoders and average SSNR for synthetic speech signals.
The SD performances of the 27 bits/frame systems, LSF-OTCVQ (without
protection) and COVQ-LSF-OTCVQ (ε = 0.05), are reported in table 7.
These results show that when the channel error probability becomes rather
high (p > ε = 0.05), the COVQ yields a significant improvement in the performances
of the LSF-OTCVQ encoder. Without protection, the LSF-OTCVQ incurs a more
severe degradation than the protected LSF encoder. This degradation
appears as a sharp increase in the average SD of the LSF-OTCVQ as well as in
the percentage of outlier frames having SD > 4 dB. Under these conditions, the
COVQ (ε = 0.05) thus gives the LSF-OTCVQ a good robustness
against channel errors by maintaining a reduced and slow increase of the average
SD and of the number of outlier frames (SD > 4 dB).
Table 7. Performance comparisons between COVQ-LSF-OTCVQ/LSF-OTCVQ
encoders of 27 bits/frame: Application to the FS1016 LSF parameters encoding
However, when the transmissions are done over a noiseless channel (p =
0.000) or a slightly disturbed one (p ≤ ε), the performances of the COVQ-LSF-OTCVQ
become suboptimal and compromise the transparent quantization quality.
On the other hand, important observations were made concerning the objective SSNR
performances of the global FS1016 encoder. Indeed, contrary to certain
conclusions made before, the FS1016 SSNR performances (with LSF parameters
coded by COVQ-LSF-OTCVQ) are also remarkable when the channel is slightly
disturbed. The comparative evaluation of the FS1016 objective performances, with
LSFs coded by the LSF-OTCVQ and COVQ-LSF-OTCVQ encoders, is presented in
Fig. 2.
Figure 2. Average SSNR performances of the FS1016 speech coder (average SSNR in dB
versus error probability p, for the FS1016 with LSF-OTCVQ and with COVQ-LSF-OTCVQ).
For error probabilities p ≤ 0.01, these results show that the distortions are
negligible for the two LSF encoding systems. We can conclude that the encoding
system COVQ-LSF-OTCVQ (ε = 0.05) can provide a good implicit protection to the
FS1016 LSF parameters with suboptimal SD-performances when the channel is
slightly disturbed.
COVQ-LSF-OTCVQ encoder with redundant channel coding

Now, we generalize the study to the full protection of all transmission indices
of the 27 bits/frame COVQ-LSF-OTCVQ encoder (ε = 0.05). By adequately
exploiting the bits gained by this encoder, a redundant channel coding is used to
explicitly protect the 7 bits/frame remaining without protection. Since in our
simulations the transmissions are done via a BSC channel under the assumption that
at most one bit error occurs per corrupted index (single error), a simple single
error-correcting code is sufficient to correct all possible single errors
affecting the transmitted sequences of the encoder (the 5 bits of the optimal path and
the 2 bits of the initial state). Notice, of course, that the 20 bits/frame representing
the codevector indices of the optimal path are already protected by COVQ.
To carry out the channel coding of the non-protected 7 bits/frame, we used two
error-correcting Hamming (7, 4, 3) codes belonging to the category of systematic
linear block codes. In this paper, we will not review the design/operation theory of
the Hamming codes, which is generally well documented [6]. These codes were
first conceived to correct only one error per transmission block (single
error-correcting codes). In our design, the two Hamming (7, 4, 3) codes have the
capacity to protect 8 bits by generating together 14 bits. The 27 bits/frame
COVQ-LSF-OTCVQ encoder, with the two Hamming (7, 4, 3) codes, will thus operate at a
rate of 34 bits/frame. This is the same number of bits allocated in the original
coding of the FS1016's LSF parameters. Thus, the global design of the FS1016
with COVQ-LSF-OTCVQ (plus the 2 Hamming codes) coding of the LSF parameters
maintains the speech coder rate at its original value of 4.8 kbits/s.
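For illustration, a systematic Hamming (7, 4) code with single-error correction can be sketched as follows. The particular parity equations, the bit ordering and the zero padding of the 7 side-information bits to 8 bits are assumptions of this sketch; the arrangement actually used in the encoder is not specified here.

```python
# Columns of the parity-check matrix, one per codeword position
# (4 data bits followed by 3 parity bits).
H_COLUMNS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
             (1, 0, 0), (0, 1, 0), (0, 0, 1)]


def hamming74_encode(d):
    """Systematic encoding: 4 data bits followed by 3 parity bits."""
    d1, d2, d3, d4 = d
    return [d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]


def hamming74_decode(r):
    """Correct at most one bit error and return the 4 data bits."""
    r = list(r)
    syndrome = tuple(sum(r[pos] * H_COLUMNS[pos][row] for pos in range(7)) % 2
                     for row in range(3))
    if syndrome != (0, 0, 0):
        r[H_COLUMNS.index(syndrome)] ^= 1     # flip the erroneous position
    return r[:4]


def protect_side_info(bits7):
    """Pad 7 bits to 8 (assumption) and encode them with two (7, 4) blocks."""
    padded = list(bits7) + [0]
    return hamming74_encode(padded[:4]) + hamming74_encode(padded[4:])


side_info = [1, 0, 1, 1, 0, 0, 1]     # 2 initial-state bits + 5 path bits (toy values)
coded = protect_side_info(side_info)
coded[2] ^= 1                         # one channel error in the first block
print(hamming74_decode(coded[:7]) + hamming74_decode(coded[7:]))
# [1, 0, 1, 1, 0, 0, 1, 0]
print(20 + len(coded))                # 34 bits/frame in total
```

The 20 index bits protected by COVQ plus the 14 coded bits give the 34 bits/frame budget stated above.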
The performances of the non-protected LSF-OTCVQ compared with those of
the COVQ-LSF-OTCVQ (ε = 0.05) encoder with Hamming (7, 4, 3) codes are
given in table 8.
Table 8. Performances comparison between the LSF-OTCVQ encoder and the
COVQ-LSF-OTCVQ (ε = 0.05) + Hamming (7, 4, 3) codes
Over the whole range of error probabilities, the results show that the channel
coding by Hamming (7, 4, 3) codes has clearly improved the performances of the
27 bits/frame COVQ-LSF-OTCVQ encoding system. The global system thus has a
good robustness against the errors of the noisy channel. On the other hand, by
comparing these results with those given in table 7, the LSF-OTCVQ encoder
incurs a larger degradation in terms of average SD and outliers. This is due mainly
to the random noise effects on the binary sequences specifying the initial state or
the optimal path.
Concerning the SSNR performances of the global FS1016 (with LSFs coded by
COVQ-LSF-OTCVQ + 2 Hamming (7, 4, 3) codes), the degradations are very low
and even negligible for error probabilities p < 0.01. The SSNR performances of the
FS1016, in the cases with and without LSF protection, are presented in Fig. 3.
Figure 3. Average SSNR performances of the global FS1016 (average SSNR in dB versus
error probability p, for the FS1016 with non-protected LSF-OTCVQ and with
COVQ-LSF-OTCVQ + 2 Ham(7,4)).
CONCLUSION
In this work, an optimized trellis coded vector quantization scheme has been
developed and successfully applied to the efficient and robust encoding of the
FS1016 LSF spectral parameters. In the case of ideal transmissions over a
noiseless channel, objective and subjective evaluation results revealed that the 27
bits/frame LSF-OTCVQ encoder (with weighted distance) produced a
perceptual quality equivalent to that obtained when the LSF parameters are unquantized.
Next, we used a JSCC-COVQ technique to implicitly protect the transmission
indices of the LSF-OTCVQ encoder incorporated in the FS1016. The simulation
results showed that our new COVQ-LSF-OTCVQ encoding system gives
the basic LSF-OTCVQ encoder a good robustness against BSC
channel errors, especially when the transmission error probability is high. To finish
this work, it was necessary to protect all the transmission indices of the
COVQ-LSF-OTCVQ encoder, since only a part of its indices was protected implicitly by
JSCC-COVQ. By adequately using the bits per frame gained by this encoder, a
redundant channel coding by Hamming codes was used to explicitly protect the
remaining unprotected bits. We showed that the COVQ-LSF-OTCVQ
encoder, using the Hamming (7, 4, 3) codes, has contributed significantly to the
improvement of the encoding performances of the FS1016's LSF parameters.
We can conclude that our global COVQ-LSF-OTCVQ encoding system with
Hamming channel codes can ensure an effective and robust coding of the LSF
parameters of the FS1016 operating over a noisy channel.
REFERENCES
1. W.B. Kleijn and K. K. Paliwal, : Speech coding and synthesis, Elsevier Science
B.V., (1995).J.
2. K. K. Paliwal and B.S. Atal : Efficient vector quantization of LPC parameters at
24 bits/frame, IEEE Transactions on Speech and Audio Processing, vol. 1, no.
1, pp. 3-14 (1993). F. R.
3. F. Itakura : Line spectrum representation of linear predictive coefficients of
speech signals", Journal of Acoustical Society of America, vol. 57, p.535
(1975).
4. W. F. LeBlanc, B. Bhattacharya, S. A. Mahmoud and V. Cuperman : Efficient
search and design procedures for robust multi-stage VQ of LPC parameters
for 4 kb/s speech coding, IEEE Transactions on Speech and Audio Processing,
vol. 1, no. 4, pp. 373-385 (1993).
5. M. Bouzid, A. Djeradi and B. Boudraa : Optimized Trellis Coded Vector
Quantization of LSF Parameters: Application to the 4.8 Kbps FS1016 Speech
Coder, Signal Processing, Vol. 85, Issue 9, pp. 1675-1694 (2005).
6. S. Lin : An Introduction to Error-Correcting Codes", Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, USA (1970).
7. C. E. Shannon: A Mathematical Theory of Communication, Bell System
Technical Journal, vol. 27, no. 3 and 4, pp. 379-423 and 623-656 (1948).
8. K. A. Zeger and A. Gersho : Vector quantizer design for memoryless noisy
channels, in Proceedings of the International Conference on Communications
(ICC'88), Philadelphia, pp. 1593-1597 (1988).
9. N. Farvardin : A Study of vector quantisation for Noisy Channels, IEEE
Transactions on Information Theory, vol. 36, n°. 4, pp. 799-809 (1990).
10. S. B. Z. Azami, P. Duhamel and O. Rioul : Combined source-channel coding:
Panorama of methods, CNES Workshop on Data Compression, Toulouse
France (1996).
11. A. Gersho, R. M. Gray : Vector quantization and Signal compression, Kluwer
Academic Publishers, USA (1992).
175
12. Y. Linde, A. Buzo, R. M. Gray : An Algorithm for Vector Quantization Design,

IEEE Transactions on Communications, COM-28, pp. 84-95 (1980).
13. M. W. Marcellin and T. R. Fischer : Trellis coded quantization of memoryless
and Gauss-markov sources, IEEE Trans. on Communications, vol. 38, pp. 83-
93 (1990).
14. T. R. Fischer, M. W. Marcellin and M. Wang : Trellis coded vector
quantization", IEEE Transactions on Information Theory, vol. 37, pp. 1551-
1566 (1991).
15. F. S. Wang and N. Moayeri : Trellis coded vector quantization, IEEE Trans. on
Communications, vol. 40, pp. 1273-1276 (1992).
16. A. J. Viterbi and J. K. Omura : Principles of Digital Communication and
Coding, McGraw-Hill Kogakusha (1979).
17. G. Ungerboeck : Trellis-coded modulation with redundant signal sets, Part I and
II, IEEE Commun. Magazine, vol. 25, pp. 5-21 (1987).
18. N. Farvardin and V. Vaishampayan : On the Performance and Complexity of
Channel-Optimized Vector Quantizers, IEEE Transactions on Information
Theory, vol. 37, n°. 1, pp. 155-159 (1991).
19. D. M. Chiang, L. C. Potter : Vector Quantisation for Noisy Channels: A Guide
to Performance and Computation, IEEE Trans. on Circuits and Systems for
Video Technology, vol. 7, n°. 1, pp. 604-612 (1997).
20. M. Bouzid : Codage conjoint de source et de canal pour des transmissions par
canaux bruités, Doctoral Thesis, Speech Communication, USTHB University,
Algiers (2006).
21. R. Laroia, N. Phamdo and N. Farvardin : Robust and efficient quantization of
speech LSP parameters using structured vector quantizers, Proc. IEEE Int.
Conf. Acoust., Speech and Signal Processing, pp. 641-644 (1991).
22. S. Garofolo et al. : DARPA TIMIT Acoustic-Phonetic Continuous Speech
Database, Technology Building, National Institute of Standards and
Technology (NIST), Gaithersburg (1988).
23. P. Campbell, T. E. Tremain and V. C. Welch : The Proposed Federal Standard
1016 4800 bps Voice Coder: CELP, Speech Technology Magazine, pp. 58-64.
