Edited by:
AL-Dahoud Ali
Walid A. Salameh
Linda Smail
ISBN 978-1456392291
Disseminator of Knowledge
Editor
AL-Dahoud Ali
- ICITs, ICITST, ICITNS, DepCos, ICTA, ACITs, IMCL, WSEAS, and AICCSA
Journal Activities: Al-Dahoud has served as editor-in-chief, guest editor or editorial board member of
the following journals:
Journal of Digital Information Management, IAJIT, Journal of Computer Science, Int. J. Internet
Technology and Secured Transactions, and UBICC.
He has published many books and journal papers, and has participated as a keynote speaker in many
conferences worldwide.
Editorial Board
Walid A. Salameh
Walid A. Salameh is a professor of Computer Science. He received his Bachelor's degree from
Yarmuk University, Jordan, in 1984. His MSc and PhD were received in 1987 and
1991, respectively, from the Department of Computer Engineering, METU. He
has published more than 62 papers in the areas of neural networks, computer
networks and e-learning paradigms. His recent research interests are in
building sustainable and efficient e-learning paradigms and architectures that
serve the goals of learning outcomes. He is a member of the editorial boards of different journals
and has contributed as a guest editor to different books.
Linda Smail
Linda Smail has been an associate professor at the School of Arts and Sciences at the New York Institute
of Technology since September 2006. She holds a PhD in applied
mathematics from the University of Marne-la-Vallée (Paris-East), France.
Her research interests include graphical models and machine learning, and her recent research
focuses on exact inference algorithms for Bayesian networks, specifically for the task of computing
probabilities and conditional probabilities.
COPYRIGHT © 2010
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other way, and storage in data banks.
Duplication of this publication or parts thereof is permitted only under the provisions of the copyright
law 1965, in its current version, and permission for use must always be obtained from UBICC
Publishers. Violations are liable to prosecution under the copyright law.
Tel +1-347-4149239
Fax +1-212-901-6990
www.ubicc.org
Preface
The book title has been chosen from the conference title: the first annual International Conference
on Information and Communication System (ICICS09).
The conference received over 160 papers and accepted 90. Of these, 20 were selected as
conference best papers, and 9 are published in this book.
The book consists of 9 chapters that cover the following areas:
This book presents a collection of research papers in the areas of computer science and
communication systems. It is suitable for new researchers and PhD students because it contains
up-to-date research in the mentioned fields and many references for related topics.
Chapter 4 - Adaptive Architecture to Protect Mobile Agents
The objective of this chapter is
to propose a protocol to protect a mobile agent, based on two agents: the mobile agent and an investigator
agent. The investigator is a prototype of the mobile agent with no critical code or data. It is created
and sent by the mobile agent in order to be executed first. On its return, the investigator agent is analyzed by
the mobile agent to detect any malicious action: if the actions are forbidden, the mobile agent makes a new copy
and changes destination. If the actions are doubtful, the mobile agent chooses an adaptation plan and then
migrates. If all actions are authorized, the mobile agent migrates with confidence.
Chapter 6 - IPAC System For Controlling Devices over the Internet
In this chapter we
propose an appliance control system, named Internet and PC Based Appliance Control
(IPAC), using concepts of parallel port programming. IPAC is designed to control a device from a PC
and from the Internet. It can be applied in any smart infrastructure to automate devices and can
work with almost every type of automation method, whether wired (e.g. LAN) or wireless (e.g.
Bluetooth). This system can be applied in designing smart homes, secure homes, centralized
device control systems, Bluetooth control systems and WAP control systems.
Chapter 8 - Robust encoding of the FS1016 LSF parameters: Application of the channel
optimised trellis coded vector quantization
This chapter illustrates an optimized trellis
coded vector quantization (OTCVQ) scheme designed for robust encoding of the LSF parameters.
The objective of this system, initially called the "LSF-OTCVQ Encoder", is to achieve a low bit-rate
quantization of the FS1016 LSF parameters. The efficiency of the LSF-OTCVQ encoder (with
weighted distance) was first demonstrated in the ideal case of transmission over a noiseless channel. We
were then interested in improving its robustness for real transmissions over a noisy
channel. To implicitly protect the transmission parameters of the LSF-OTCVQ encoder
incorporated in the FS1016, we used joint source-channel coding carried out by the channel
optimized vector quantization (COVQ) method. In the case of transmission over a noisy channel,
we show that the new encoding system, called the "COVQ-LSF-OTCVQ Encoder", contributes
significantly to improving the FS1016 performance by ensuring good
coding robustness of its LSF spectral parameters.
Chapter 1
Utku KOSE
Afyon Kocatepe University, Turkey
ABSTRACT
In today’s world, information security is an essential factor that must be taken
into consideration to ensure secure applications and services in information technology.
Since the inception of the information technology concept, there has been
remarkable interest in security approaches that aim to protect information or digital
data. In particular, the rise of the Internet and internet technologies has prompted a
search for newer approaches, methods and techniques that try to provide
secure internet communication sessions for computer users. In
this sense, this chapter aims to examine security approaches in internet
communication. For this purpose, the role of the coding science “cryptology” in
providing secure internet communication and the related techniques in this scope are
explained within the chapter. Furthermore, as an example of the
use of cryptology techniques, an e-mail application that was developed to
send and receive encrypted e-mail messages is introduced in this chapter.
____ Chapter 1: Security Approaches in Internet Communication
INTRODUCTION
Nowadays, security is an important factor that is associated with
almost all fields of modern life. It is so important that the concept has
been a subject of remarkable interest to humankind for a long time. Basically, the
term “security” can be defined as “the protection of a person, property or
organization from an attack” [1]. But it also has more specific meanings that are
used to define similar situations, aspects and features in different fields of life.
Additionally, there are different concepts that are derived from the
term “security” and used to define security approaches and techniques in different
fields. The term “information security” is one of these concepts, and it is mostly
associated with information technology.
Briefly, information security is described as protecting information or
digital data against any attack that can be performed by using different attacking
technologies, methods and techniques [2]. The popularity and
reach of information security are connected with advancements
in the field of information technology. In fact,
the rise of the Internet and internet technologies has driven rapid developments
and improvements in information security and shaped its current place in
modern life. Today, it is more important than ever to ensure secure web services and web
applications for people who use these technologies to do their work and
communicate with other people all over the world. In particular, providing
secure internet communication between two people has become an important
subject that must be taken into consideration to protect sent or received digital data.
Because of this, there are many different approaches, methods and techniques
that try to provide secure internet communication sessions for people over the
Internet.
This chapter aims to examine the foremost information security approaches,
especially in internet communication. Additionally, the role of cryptology in providing
secure communication sessions and the related methods or techniques that can be
evaluated in this sense are examined and explained in the chapter. To this end,
the principles of encryption techniques such as private-key encryption and public-key
encryption and their use in internet communication systems or applications
are explained. As an example of the use of these cryptology techniques, an e-mail
application that was designed and developed for sending and
receiving encrypted e-mail messages is also introduced in the chapter. This
application employs a private-key encryption algorithm, which aims to provide an
effective approach to encrypting the mail message and related attachments. By
explaining the structure of this algorithm and the features of the related application, the
chapter tries to give more concrete ideas about security approaches in internet
communication.
The chapter is organized as follows: The second section explains the foremost
approaches, methods and techniques that can be used to ensure secure internet
communication. This section also introduces some systems and programs that can
be used to provide security in communication sessions.
The third section introduces the coding science of cryptology and briefly explains the
features of widely used encryption techniques. Next, the fourth section
introduces the related e-mail application, and finally the chapter ends with a
discussion–conclusions section.
Firewalls are programs or hardware systems that are specially
designed and developed to block unauthorized access to a computer system [8].
Nowadays, there are many different firewall programs that can be installed and
used on an operating system. Software companies such as “Agnitum” and
“Check Point Software Technologies Ltd.” develop and provide popular
firewall programs such as “Outpost” and “Zone Alarm”. On the other hand, there are
also more advanced hardware systems that act as firewalls for more advanced
computer systems such as servers.
Anti Virus-Spyware-Malware programs provide security solutions against
dangerous program and code types such as viruses, trojans, spyware and other malware.
Protecting computer systems against these threats is very
important for ensuring security, especially in internet communication. Today, different
software companies such as “Kaspersky Lab.”, “Eset”, “Symantec”, “McAfee” and
“Trend Micro” provide special programs that combine the functions of
Anti Virus-Spyware-Malware protection mechanisms. Furthermore, there are also more
advanced and effective programs that combine both firewall and
Anti Virus-Spyware-Malware protection mechanisms. These programs are usually called
“Internet Security” programs. As a result of the increasing number of dangerous
programs, code types and other malicious factors, computer users who frequently work
on the Internet often prefer to use “Internet Security” programs.
Monitoring systems are often used to watch active processes on a network
system. By using this type of system, unwanted activities on the network
can be detected easily and the necessary precautions can be taken against
possible future attacks. In this way, unwanted third parties and malicious
factors in a particular communication session can be detected and removed
immediately. Nowadays, there are many kinds of monitoring systems
developed by different software companies. For instance, “Microsoft” offers a free
network monitoring program named “Microsoft Network Monitor”. Additionally,
another company named “Paessler” works only on network programs and provides
a free network monitoring tool. Finally, “Net Optics” provides many different
advanced monitoring and filtering solutions for communication security.
Another method of achieving secure communication is to use
encryption-decryption systems. On the market, there are hardware-based and software-based
encryption-decryption solutions that try to provide high-level security for
valuable information and stored digital data. Today, encrypting information and
digital data is the most effective and popular approach to providing security in
almost all fields of modern life.
By combining different kinds of security approaches, software companies have
designed and developed some secure instant messaging (IM) and conferencing
programs. Some of these programs are “open source”, and they
offer “free” and “developing” security solutions for internet communication. For
instance, “Skype” is one of the most popular internet communication programs, and
it provides secure voice and chat communication with 128 bit AES and 1024 bit
asymmetrical protocols [6]. On the other hand, “Zfone” is an open source program
that enables users to make secure voice calls. Some popular IM
programs such as “Yahoo Messenger” use secure approaches to provide more
security in their communication services. “WASTE” is another IM program that
uses high-strength “end-to-end” encryption and an anonymous network. Unlike
most of the others, WASTE is an open source IM program [6].
In addition to those mentioned, there are many more systems or
programs that have been developed for more specific aims. Some of
these systems or programs ensure security only indirectly. Because of this, they
must be used with the support of other security systems or programs.
There are three more terms that must be examined within this subject:
cryptographer, cryptanalyst and cryptologist. A cryptographer is a person
whose work or research is based on
cryptography. A person who studies cryptanalysis is called
a cryptanalyst. As can be understood from the previous explanations, both the
cryptographer and the cryptanalyst are cryptologists.
Cryptology is very important for computer users because it provides security for
computer-aided work such as transferring data between computer systems,
designing and developing new computer-based technologies, and
communicating over the Internet. In today’s world, cryptology is also often used to provide
security in computer-based systems or applications such as e-business, e-marketing,
e-science, e-government and e-signature [11]. Two elements of
cryptology, the encryption and decryption techniques, play important roles in the
level of security provided. Encryption can be defined as transforming
information or digital data with a special function and the encryption key used by
this function. Decryption, on the other hand, is defined as converting the encrypted
information or digital data back to its original, unencrypted form. In encryption work,
two different encryption-key techniques are widely used:
“public-key encryption” and “private-key encryption”. Today, another
approach, called the “Hybrid Cryptosystem”, is also used to combine the
advantages of both public-key and private-key encryption techniques. To give a
clearer idea of cryptology and its effects on Internet communication, the
features of the public-key and private-key encryption techniques are explained next.
Public-Key Encryption
The public-key encryption technique is also called “asymmetric encryption”.
In this technique, a user who wants to ensure secure
communication needs two different keys, called the “private key”
and the “public key”. Each key has a different role
during the communication. The public key is known by everybody, but the private
key is known only by its owner. The encryption process is performed by using the
public key, whereas the decryption process is performed with the private key [12,
13]. The public and private keys are connected by a mathematical
relationship. In other words, encrypted information or digital
data can be decrypted only by using the private key that is connected with the public
key used for encrypting it.
Because of this, the encrypted information or digital data cannot be decrypted
with the help of other private keys. In this technique, it is very important that the
user keeps his or her private key hidden from other users, while the
public key can be shared with other people [14]. Figure 4 represents a diagram that explains a
typical communication session based on the public-key encryption technique.
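As a hedged illustration of the key relationship described above, the following Python sketch uses textbook RSA with deliberately tiny primes. RSA is one well-known public-key scheme and is chosen here only for illustration; the numbers (p = 61, q = 53, e = 17) and the function names are assumptions made for this example, and keys this small offer no real security.

```python
# Toy public-key example (textbook RSA); illustrative only, not secure.
p, q = 61, 53                 # two small primes (assumed for the example)
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)           # may be shared with everybody
private_key = (d, n)          # must be kept secret by its owner


def encrypt(message, key):
    """Encrypt an integer smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher, key):
    """Recover the integer with the matching private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


cipher = encrypt(65, public_key)      # anyone can do this
plain = decrypt(cipher, private_key)  # only the key owner can do this
```

Decrypting with any exponent other than the matching `d` does not, in general, return the original value, which is the property the paragraph above relies on.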
Private-Key Encryption
The private-key encryption technique is also called “symmetric encryption”.
In this technique, only one key is used for both encrypting and decrypting the
information or digital data [12, 17]. The private-key encryption technique comes
in two different approaches: “block encryption” and “row encryption” (the latter
is more commonly known as stream encryption). In block
encryption systems, the original message is separated into fixed-length blocks and
each block is encrypted individually [9]. In this way, a block is mapped to
another fixed-length block from the same alphabet. In the design of block ciphers,
mixing and diffusion techniques are used, and these techniques are applied by
using “permutation” and “linear transformation” operations respectively [18]. The
strength of a block encryption algorithm is determined by its S-boxes, the number
of rounds, the use of keys in XOR operations, the block length and the key length. Using
a random key is another important factor in improving the strength of the applied
algorithm [19]. The other approach, row encryption, is a newer form of the permutation
algorithms that were used in the past [9]. The row encryption technique needs a
long key. Because of this, shift registers with feedback are used to
produce a pseudo-random key. The encrypted message is created by
performing XOR operations with the produced key on the original message. The
receiver must produce the same key in order to decrypt the encrypted
message [9]. Figure 5 represents a diagram that explains a typical communication
session based on the private-key encryption technique.
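The passage describes producing a long key with a feedback mechanism and combining it with the message by XOR. A minimal Python sketch of that idea follows, assuming a 16-bit linear-feedback shift register (LFSR) as the key generator. The tap positions, the seed value and the function names are assumptions for illustration; a bare LFSR is not cryptographically secure and is shown only to make the mechanism concrete.

```python
def lfsr_keystream(seed, length):
    """Generate `length` pseudo-random key bytes from a 16-bit LFSR."""
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a non-zero 16-bit value")
    state = seed
    out = bytearray()
    for _ in range(length):
        byte = 0
        for _ in range(8):
            # Feedback bit from taps 16, 14, 13, 11 (a maximal-length choice).
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            byte = (byte << 1) | (state & 1)
        out.append(byte)
    return bytes(out)


def xor_stream(data, seed):
    """Encrypt or decrypt: XOR is its own inverse under the same keystream."""
    key = lfsr_keystream(seed, len(data))
    return bytes(b ^ k for b, k in zip(data, key))


secret_seed = 0xACE1                      # shared secret (hypothetical value)
cipher = xor_stream(b"hello world", secret_seed)
plain = xor_stream(cipher, secret_seed)   # receiver regenerates the same key
```

Because both sides derive the keystream from the same seed, only the short seed has to be shared, not the long key itself.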
Today, the DES (Data Encryption Standard) algorithm is the most popular approach
that uses the private-key encryption technique. Additionally, the AES (Advanced
Encryption Standard), IDEA (International Data Encryption Algorithm), Skipjack,
RC5, RC2 and RC4 algorithms are other popular approaches that use the
private-key encryption technique [11, 14 – 16, 19].
The explained encryption techniques provide different types of security
solutions and approaches for different systems. Because of this, their advantages
and disadvantages must be known in order to choose a suitable technique for any designed
system. Private-key encryption is fast, whereas
public-key encryption is slower but more trusted. Additionally, the private-key
encryption technique is useful for digital data stored on a medium [12], but it
may be expensive to share the private key securely with other users.
Although public-key and private-key encryption techniques have different
features, both are widely used in different types of applications or systems that
aim to ensure security, especially in Internet communication.
The e-mail application has a user-friendly design and simple controls that
enable computer users to perform the related operations in a short time. The
application comes with three different interfaces, which can be used to perform
different operations related to e-mail communication. The user can switch between these
interfaces by using the controls provided in the application. With the first interface,
the folders of the configured mail address (inbox, sent box, etc.) can be viewed. The
other two interfaces are used for encrypting plain mail messages and
decrypting received encrypted ones. In this way, the same application can be
used by both the sender and the receiver to ensure secure communication.
The working structure of the developed communication system is shown briefly in
Figure 7.
a secure e-mail communication channel between two users can be realized easily.
To give a clearer idea of the security aspects of the e-mail application,
the features of this algorithm must be explained in detail.
Under the next subtitle, these steps are explained in more detail.
Encryption steps
The first step of the encryption process separates the original
message text into a number of text blocks. The number of blocks is
determined automatically according to the character count of the text. The separation of
the text is performed according to two rules: If the character count of the original
text can be divided by three and a half or seven, the original text is separated according to
the character count. Otherwise, the character count is set to a definite number
that can be divided by five. To achieve this, some space characters are
added to the original message text.
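The padding-and-splitting idea in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the exact block-size rules of the original algorithm are not reproduced, the `block_size` parameter (defaulting to 5) and the function name `split_into_blocks` are hypothetical, and the only behaviour taken from the text is that space characters are appended until the length is divisible by the chosen number.

```python
def split_into_blocks(text, block_size=5):
    """Pad `text` with spaces to a multiple of `block_size`, then split it."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    remainder = len(text) % block_size
    if remainder:
        # Space padding, as described in the text above.
        text += " " * (block_size - remainder)
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


blocks = split_into_blocks("SECRET MESSAGE")
# The 14-character message gains one space and yields three 5-character blocks.
```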
In the next step, a random permutation method is used to change the
original content of the message text. For this purpose, random numbers are
produced for each block obtained in the first step. The positions of the
characters in each block are changed according to the produced random numbers. As a
result, a simply encrypted text created
with the permutation method is obtained. With the permutation method, the
characters themselves are preserved, but their positions are
changed [17].
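A minimal Python sketch of such a per-block permutation is given below. The function names and the way the random order is recorded are assumptions for illustration, not the application's actual implementation. Python's `random` module is used for brevity; a real system would want a cryptographically stronger source such as `random.SystemRandom`.

```python
import random


def permute_block(block, rng):
    """Shuffle character positions; return the new block and the order used."""
    order = list(range(len(block)))
    rng.shuffle(order)                       # random permutation of positions
    shuffled = "".join(block[i] for i in order)
    return shuffled, order


def restore_block(shuffled, order):
    """Invert the permutation using the recorded order."""
    original = [""] * len(shuffled)
    for new_pos, old_pos in enumerate(order):
        original[old_pos] = shuffled[new_pos]
    return "".join(original)


rng = random.Random()                        # hypothetical source of randomness
shuffled, order = permute_block("SECRE", rng)
restored = restore_block(shuffled, order)
```

The shuffled block contains exactly the same characters as the input, only in different positions, so the recorded order is all the receiver needs to undo this step.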
In the third step, the final form of the encrypted message text is obtained.
Random numbers between 0 and 9 are produced according to the
character count of the encrypted text, and each character of this text is encrypted
at bit level with these randomly produced key numbers. During the encryption process,
the XOR (eXclusive OR) method is used [According to the XOR method, the result
(output) is “1” if the two inputs are different. Otherwise, the result (output) is “0”]. As
a result, a set of new characters is obtained for the final form of the message text.
Table 1 shows some examples of the XOR encryption process for different
characters.
Table 1. Some examples for the XOR encryption process of different characters.
Character (binary)   Key (binary)    Encrypted character (binary)
Z (01011010)         3 (00000011)    Y (01011001)
A (01000001)         7 (00000111)    F (01000110)
J (01001010)         9 (00001001)    C (01000011)
U (01010101)         1 (00000001)    T (01010100)
M (01001101)         5 (00000101)    H (01001000)
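The rows of Table 1 can be reproduced with a few lines of Python. The helper name `xor_char` is an assumption for this sketch; the key is treated as the numeric value of the digit (for example 3 is 00000011), as the table shows.

```python
def xor_char(char, key):
    """XOR the character code with a key digit (0-9) at bit level."""
    return chr(ord(char) ^ key)


examples = [("Z", 3), ("A", 7), ("J", 9), ("U", 1), ("M", 5)]
for char, key in examples:
    enc = xor_char(char, key)
    # Print character, key and result together with their 8-bit binary forms.
    print(f"{char} ({ord(char):08b})  {key} ({key:08b})  {enc} ({ord(enc):08b})")
```

Applying the same key a second time restores the original character, which is why the same key numbers serve for both encryption and decryption.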
In the last step, the created encryption-decryption key and the encrypted data are
organized in two separate temporary files, and the contents of these files are
transferred to the application interface to be saved by the user in .txt format.
Figure 8 represents a flowchart that briefly shows each step of the
developed encryption algorithm.
By using the explained algorithm steps, the developed application allows users
to encrypt both their mail messages and attachments easily. At this point, it is also
important to explain the decryption process of the application.
In the decryption process, the application uses the obtained key file to decrypt
the encrypted mail message text. In order to understand the decryption steps better,
the main parts of the key file must be examined first. Figure 9 shows a brief schema
of the key file parts.
Figure 9. Parts of the key file
As can be seen from Figure 9, the key file consists of two parts,
called “Part 1” and “Part 2”. During the decryption
process, the application first uses “Part 1” to recover the form of the text
before the bit-level encryption (XOR). Afterwards, the original message text is
recreated by using “Part 2”. At this point, the original message text is the
decrypted text for the receiver. As a result, the receiver can obtain the
original message in a more secure way with the help of the mentioned method.
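To make the two-part key concrete, here is a hedged end-to-end sketch in Python. The key layout is hypothetical: it assumes that “Part 1” holds the XOR digits and “Part 2” holds the per-block permutation orders, which matches the order of operations described above but is not confirmed as the application's actual file format. The block size of 5 and all names are likewise assumptions.

```python
import random


def encrypt(text, block_size=5):
    """Pad, permute each block, then XOR each character with a digit key."""
    text += " " * (-len(text) % block_size)
    rng = random.Random()
    orders, permuted = [], []
    for start in range(0, len(text), block_size):
        order = list(range(block_size))
        rng.shuffle(order)
        orders.append(order)
        permuted.extend(text[start + i] for i in order)
    digits = [rng.randint(0, 9) for _ in permuted]
    cipher = "".join(chr(ord(c) ^ k) for c, k in zip(permuted, digits))
    # Hypothetical key layout: Part 1 holds XOR digits, Part 2 the permutations.
    return cipher, {"part1": digits, "part2": orders}


def decrypt(cipher, key):
    """Undo the XOR with Part 1, then undo the permutations with Part 2."""
    permuted = [chr(ord(c) ^ k) for c, k in zip(cipher, key["part1"])]
    plain = []
    for block_index, order in enumerate(key["part2"]):
        start = block_index * len(order)
        block = [""] * len(order)
        for new_pos, old_pos in enumerate(order):
            block[old_pos] = permuted[start + new_pos]
        plain.extend(block)
    return "".join(plain).rstrip(" ")        # drop the padding spaces


cipher, key = encrypt("Meet me at noon")
message = decrypt(cipher, key)
```

One limitation of this sketch is that stripping the padding also removes any trailing spaces that were part of the original message.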
All of the explained processes are performed by the e-mail application via two
different interfaces. In the application, these interfaces are named the “Encryption
Screen” and the “Decryption Screen”. To give a clearer idea of the
usage of the e-mail application, these interfaces are explained briefly.
Figure 10. A screenshot from the Encryption Screen of the e-mail application
To provide a smooth user experience, users can switch
between the encryption and decryption interfaces (screens) by using two buttons located
on the left and top side of each “screen”. Additionally, users can learn
more about the usage of the related “screen” by using the “Help” button. On the
Encryption Screen, the user can type the original message text in the
“Message” field. There are also fields that correspond to
typical e-mail message fields such as “To” and “From”. Additionally,
the user can add one or more attachments to the message by using the related
controls on this screen. To start the encryption process for the mail
message and related attachments, the “Encrypt” button is used. During the
encryption process, some statistical information can be viewed on the title bar
of this screen. At the end of the process, two .txt files are created for the produced
key and the encrypted data. These files can be saved by the user to any directory.
It is important to use the .txt file type for the encrypted data to protect its
content from foreign characters, which can be produced by word processor
programs. Moreover, processing time is also reduced by using the .txt file type for the
encrypted data.
After getting the key and the encrypted data, the user can send the message to
the receiver(s) by using the “Send” button located on the Encryption Screen. While
the encrypted mail message is being sent to the receiver, restrictions
applied by ports or firewalls may affect the e-mail application. To solve this
problem, a remoting application, which is hosted by a trusted authority, was
developed. With this application, communication with the authority is performed
over port 80 by using a shaped XML structure that is suitable for the
semantic data model.
After receiving the encrypted e-mail, the Decryption Screen of the application
can be used to get the original mail message and its attachments. The Decryption
Screen was designed to be similar to the Encryption Screen. Figure 11 shows a
screenshot from the Decryption Screen of the e-mail application.
Figure 11. A screenshot from the Decryption Screen of the e-mail application
On the Decryption Screen, the receiver can view the encrypted message text
under the “Encrypted Text” title. After choosing the key file, the text can be
decrypted by using the “Decrypt” button. As soon as the decryption process is
finished, the original text is shown in the text field located at the bottom of the
Decryption Screen.
The introduced e-mail application provides an effective and strong security
solution for e-mail communication over the Internet. Importantly,
examining the features and functions of this application enables readers to form
more concrete ideas about security approaches in today’s internet communication.
Of course, there are also many other kinds of applications or systems that try to
ensure security in different fields of internet communication.
DISCUSSION–CONCLUSIONS
This chapter explained the foremost information security approaches,
especially in internet communication. In this sense, the role of cryptology in ensuring
security for communication sessions and its fields that can be examined in this
scope were explained in the related sections. To explain more about the use
of cryptology in communication security work, the principles and functions of
encryption techniques such as private-key encryption and public-key encryption were
also examined. An e-mail application that can be used for sending
and receiving encrypted e-mail messages was also introduced to give readers
a clearer idea of the use of cryptology and encryption techniques in
providing security for internet communication. The features and functions of this
application provide a simple but sufficiently strong approach to support the explained
subjects about the security factor in internet communication.
Today, security is an extremely important subject because
information is now more valuable to humankind, and there is a tremendous
effort to protect “valuable information” from environmental factors. As a result of
rapid developments in technology, more advanced systems that provide
better security solutions for valuable information or digital data are being designed and
developed rapidly. With these developments, it might be
expected that the number of attacking and “security breaking” methods
and techniques would decrease over time. On the contrary, more attacking and
“security breaking” methods or techniques are designed and developed by
malicious people every day. Because of this, more research
should be performed to keep security precautions one step ahead of
malicious methods and techniques.
Although there are many different, advanced security applications and systems,
the “human factor” is still critical in providing complete
security in almost all fields of modern life. The human factor
is the weakest part of even the most advanced security systems, and it seems that this
situation will not change in the near future. Because of this, people must be
trained in current security approaches, methods and techniques that can be
used to ensure information security. Moreover, they must also be warned about
potential “social engineering” methods and techniques that can be used to
exploit the weaknesses of the human factor. This can be done by performing the
following tasks:
1. Arranging educational seminars or meetings about information security,
2. Attending comprehensive conferences and symposiums about information
security,
3. Following the latest developments and improvements in security
approaches, methods and techniques,
4. Following the latest developments and improvements in attacking and
“security breaking” methods and techniques,
5. Being aware of social engineering methods.
“Getting access to source code…was kind of like the secret ingredient. I wanted to
know what the secret was…”, Kevin David MITNICK
REFERENCES
1. R. Kurtus: What is Security?, Ron Kurtus’ School for Champions, (2002).
[Online] Retrieved April 10, 2010 from: http://www.school-for-champions.com/security/whatis.htm
2. N. Yalcin, and U. Kose: Sending E-Mail with an Encrypting Algorithm Based
on Private-Key Encryption, In Proceedings of the International Conference on
Information and Communication Systems 2009, pp. 33-37 (2009).
3. D. P. Agrawal, and Q.-A. Zeng: Introduction to Wireless and Mobile Systems,
Thomson, (2005).
4. J. Kurose, and K. Ross: Computer Networking, Addison Wesley, (2003).
5. L. M. Surhone, M. T. Timpledon, and S. F. Markesen: Secure Communication,
Betascript Publishing, (2010).
6. Wikipedia – The Free Encyclopedia: Secure Communication, (2010). [Online]
Retrieved April 13, 2010 from: http://en.wikipedia.org/wiki/Secure_communication
7. S. Sagiroglu, and M. Tunckanat: A Secure Internet Communication Tool,
Turkish Journal of Telecommunications, Vol. 1, No. 1, pp. 1-10 (2002).
8. Wikipedia – The Free Encyclopedia: Firewall (computing), (2010). [Online]
Retrieved April 14, 2010 from: http://en.wikipedia.org/wiki/Firewall_(computing)
9. R. J. Spillman: Classical and Contemporary Cryptology, Prentice Hall, pp. 1-6,
132, 137 (2005).
10. R. A. Mollin: RSA and Public Key Cryptography, Chapman and Hall/CRC, pp. 1-25,
53 (2003).
11. S. Sagiroglu, and M. Alkan: Electronic Signature in All Respects: E-Signature,
Grafiker Publishing, pp. 2, 8-9, 24, 31 (2005).
12. W. Trappe, and L. C. Washington: Introduction to Cryptography with Coding
Theory, Prentice Hall, pp. 4-6 (2002).
13. M. D. Abrams, S. Jajodia, and H. J. Podell: Information Security: An
Integrated Collection of Essays, Institute of Electrical and Electronics
Engineers, pp. 15, 350-384 (1995).
14. D. R. Stinson: Cryptography Theory and Practice, Chapman and Hall/CRC, pp.
114, 162 (1995).
15. C. Cimen, S. Akleylek, and E. Akyildiz: Mathematics of the Codes:
Cryptography, Middle East Technical University – Center of Society and
Science, (2007).
16. S. Singh: The Code Book: The Science of Secrecy from Ancient Egypt to
Quantum Cryptography, Anchor, (2000).
17. K. Schmeh: Cryptography and Public Key Infrastructure on the Internet,
Heidelberg, pp. 42 (2003).
18. M. T. Sakalli, E. Bulus, A. Sahin, and F. Buyuksaracoglu: Design Techniques
and Power Analysis in Flow Codes, In Proceedings of 9th Annual Academic
Informatics Conference, (2007).
19. S. Andac, E. Bulus, and M. T. Sakalli: Analyzing Strength of Modern Block
Encryption Algorithms, In Proceedings of 2nd Young Researchers Congress of
Engineering Sciences, pp. 87 (2005).
Chapter 2
ABSTRACT
The Internet has created new forms of human interaction through its services, such
as E-mail, Internet forums and online banking services. On the other hand, it has
provided countless opportunities for crimes to be committed. Many digital
techniques have been developed and used to help cybercrime investigators in the
process of evidence collection. In this paper, we present an efficient digital
forensics mining tool that we developed to help cybercrime investigators in
evidence collection and analysis by providing various forensically important features.
INTRODUCTION
The Internet has provided many services that make life easier for people all over
the world, including E-mail, Instant Messaging (IM), online banking, and many
other services that most people cannot stop using. However, according to published
statistics, thousands of businesses and government departments, such as Western
Union, Creditcards.com and CD Universe, have been hacked, resulting in over a
billion dollars of damages per year, and these losses are climbing. This makes the
job of law enforcement officers, including cybercrime investigators, more difficult
and complicated because of the large amount of data that has to be collected and
analyzed.
Most cyber criminals use high-technology devices, so law enforcement agencies
need efficient tools and utilities to gather and analyze data from these devices.
These reasons were the primary motivation for conducting our research in
computer forensics and developing our Digital Forensic Evidence Mining Tool. It is
dedicated to helping cybercrime investigators collect and analyze evidence from
suspects’ devices. We have provided features that are highly needed, helpful and
supportive of evidence collection.
Search engines like Google, Yahoo, and many others perform keyword search.
However, cybercrime investigators need to be able to perform a semantically
oriented search. Semantic search [1] provides great flexibility during the
investigation process. For example, the word "cocaine" is not going to be mentioned
frequently in a drug dealer's communications. Instead, when an investigator
searches for a word like "cocaine", (s)he expects to get results that contain the term
cocaine or any related term. Table 1 shows some examples of terms and
their synonyms/hyponyms.
____ Chapter 2: Digital Forensics Evidence Mining Tool
Our tool is able to enrich the search with various semantic suggestions that the
investigator can use. While developing our tool we faced many challenges: we
had to take into consideration the tool's efficiency, robust functionality, and
visualization during the whole development cycle. Besides these challenges, our
solution had to be scalable to a large number of files and ready to accommodate
new features. In addition, the tool needs to be very responsive: the search results
need to be displayed and ready to be processed within a few seconds.
RELATED WORK
In this section, we focus on previous tools and solutions that have been
proposed to help cybercrime investigators. First, we discuss stand-alone utilities
used in this field. In subsequent sections we describe how our tool takes
advantage of them by integrating them and providing more customized features
that help cybercrime investigators perform their jobs.
The first utility we use is Google Desktop Search (GDS) [2], provided by Google.
GDS is a desktop search engine that provides full-text search for a
wide range of file types, such as emails, documents of all types, audio files,
images, chat logs, and the history of web pages that the user has visited. What
makes it efficient is that after the initial setup and building the index for the first
time, indexing occurs only when the machine is idle. Thus, the machine's
performance is not affected. GDS also stays up to date by monitoring changes to
existing files and newly added files. Last but not least, GDS helps in finding
deleted files: it creates cached copies (snapshots) of all files. These copies are
returned in the search results and can be viewed even if the files have been deleted.
The other utility we use is WordNet [3], a large English lexical database. It
groups nouns, verbs, adjectives and adverbs into sets of
cognitive synonyms called “Synsets”. Synsets are interlinked by means of
conceptual-semantic and lexical relations [3]. In [4], indexing with WordNet Synsets
is used to improve text retrieval. We take advantage of this utility to show the
investigator a broad collection of suggestions that she/he could pass to GDS.
Further discussion of our solution is provided in subsequent sections.
PROBLEM STATEMENT
A good problem statement should answer the following questions:
What is the problem?
The investigator needs to be able to query the criminals’ devices to build
knowledge about what information they contain. This knowledge can be used to
provide evidence, and/or to prevent future incidents.
Who has the problem?
The intended clients for this solution are cybercrime investigators; they face a
problem when performing an effective and efficient search on the information
in criminals’ devices.
What is the solution?
A full featured desktop tool that uses GDS and WordNet to provide semantic
search in a suspect’s computer.
PROPOSED SOLUTION
In this section, we show an overview of our tool’s architecture. Then, we
discuss how each component in the tool contributes to the overall functionality.
After that, we show the use-case and activity diagram of our tool.
System Architecture
The system architecture provides a comprehensive overview of the tool and its
supporting infrastructure. Figure 1 shows the architecture of our tool.
Tool Components
The system components are:
Graphical User Interface
WordNet API
Google desktop SDK
Business Layer
Figure 2. Digital Forensics Evidence Mining Tool
WordNet
For the semantic search functionality, we decided not to automatically search
for all synonyms of the desired term, since this approach would overload the tool
and overwhelm the investigator with a large number of results. Instead, we
designed our tool to search only for the desired term. Figure 3 shows a more
practical, feature-rich suggestion panel. When the forensic investigator enters a
term and hits Enter, the suggestion panel shows a tree view that
contains synonyms, acronyms, sister terms, etc.
In addition, the investigator has further options, such as specifying
whether to look for nouns, verbs, or adjectives that are related to the
previously searched term. Below that panel there is a definition window that
shows the definition of the word selected in the suggestion panel, and an
example of its use. Double-clicking on a term in the suggestion panel initiates a
new request to search for that term, and the results are displayed in a new tab.
This approach keeps our tool working at a high performance level.
search&s=1ftR7c_hVZKYvuYS-RWnFHk91Z0: is the search command and a
security token.
?q=Google: is the query term(s) parameter.
If the investigator wants to search for more than one term, the terms are separated
with +s. For example, to search for both "Google" and "GDS",
use: ?q=Google+GDS.
If the investigator wants to search for a specific phrase, the terms are separated
with +s and the phrase is surrounded with %22s. For example, to search for the
phrase "Google Desktop Search", use: ?q=%22Google+Desktop+Search%22
To search for the two phrases "Google Desktop Search" and "Copyright 2005",
use: ?q=%22Google+Desktop+Search%22+%22Copyright+2005%22.
&format=xml specifies that the HTTP response returns the search results in XML
format. By default, an HTTP search response returns only the first ten results.
The developer can request a different number by appending the &num=
parameter, followed by the maximum number of results to be returned for the query.
There is no problem if the maximum number is greater than the
total number of search results; only the available results are returned, with no
null "results".
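The query conventions above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the tool's actual implementation: the helper name build_gds_query is hypothetical, and only the part of the request from ?q= onward is built, because the host and security token differ for each installation. The sketch relies on Python's standard urllib.parse.quote_plus, which encodes spaces as + and double quotes as %22.

```python
from urllib.parse import quote_plus


def build_gds_query(terms=(), phrases=(), num=None):
    """Build the '?q=...' part of a search request (hypothetical helper).

    terms   -- single keywords, joined with '+'
    phrases -- exact phrases, each wrapped in %22 ... %22
    num     -- optional maximum number of results (&num= parameter)
    """
    # Wrap every phrase in double quotes; quote_plus turns '"' into %22.
    parts = list(terms) + ['"%s"' % phrase for phrase in phrases]
    query = "?q=" + quote_plus(" ".join(parts)) + "&format=xml"
    if num is not None:
        query += "&num=%d" % num
    return query


print(build_gds_query(terms=["Google", "GDS"]))
# ?q=Google+GDS&format=xml
print(build_gds_query(phrases=["Google Desktop Search", "Copyright 2005"], num=50))
# ?q=%22Google+Desktop+Search%22+%22Copyright+2005%22&format=xml&num=50
```

Letting a library routine do the escaping avoids hand-written + and %22 sequences, which are easy to get wrong when a search term itself contains special characters.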
Business Layer
This component is at the core of our tool. It receives the search terms from the
GUI, interacts with the WordNet component when the investigator wants to
search for a keyword from the suggestion panel, and sends the search term
together with the search preferences to the GDS engine. The business layer
processes the results and sends them back to the GUI to be shown to the
investigator. This layer acts as the brain of our tool, where all the processing
complexity is kept hidden and separated from the GUI. It is composed of classes
and functions that communicate with the rest of the components.
Activity Diagram
The activity diagram [5] shows the flow of the program when a search task is
submitted to the tool. As shown in the diagram, the user can specify advanced
search options before executing the search, and can also choose a keyword from
WordNet to run the search again. After the results are shown, the user can
generate a report and save it for later use when presenting the evidence in a
court of law.
Applicability
Our tool runs on Windows XP, Vista, and Windows 7. By using the
Google Desktop Search engine, our tool can access a wide range of file types,
including MS Office files, Outlook files, archive files (such as .zip, .rar), email
and web history files.
TOOL FEATURES
Our tool provides a feature-rich environment for the investigator. We provide
many features that help the investigator in evidence analysis and report generation.
Below is a description of the functionalities our tool provides:
Result Display: By default, search results are displayed in groups of twenty per
page; the previous and next buttons allow the investigator to navigate through the
result pages. The total number of results found is shown at the
top of the results page.
Access All File Types: Using the Google Desktop Search engine, our tool can
access a wide range of file types, including MS Office files, Outlook files, archive
files (such as .zip, .rar), and web history files.
Semantic Search: A feature-rich panel that suggests many variations of the
keyword, including a small panel that shows the meaning of each word and a
sample sentence showing how it is used.
Multiple Tabs: For each keyword searched a new tab opens, allowing the
investigator to conduct more searches and close any unneeded tab.
Advanced Search: Provides more options that allow the tool to filter the
results:
• Choose various file types for a more refined search, including the most common
file types, such as text, images, audio/video, archive (zip), and HTML files.
• Choose a specific file category, like email or web, to search only the specified
type of files.
• Choose the number of results per page.
• Sort the results by relevance: when checked, related files (within the same
directory) are displayed one after another.
Display File Snippet: Allows the investigator to see the searched term within the
CONCLUSIONS
In this paper, we presented a Digital Forensics Evidence Mining Tool that is
dedicated to helping cybercrime investigators collect and analyze data from a
suspect’s computer. This solution provides features that are highly needed, helpful
and supportive of evidence collection. We took advantage of existing APIs, such as
the Google Desktop Search API and the WordNet API, to enrich our application.
Because requirements in this active field keep changing, our solution is scalable
and can be adjusted to accommodate future requirements and features, providing a
unique and essential tool for cybercrime investigators.
REFERENCES
1. R. Guha, Rob McCool, Eric Miller, Semantic search, International World Wide
Web Conference, Proceedings of the 12th international conference on World
Wide Web.
2. Benjamin Turnbull, Barry Blundell, Jill Slay, Google Desktop as a Source of
Digital Evidence.
3. George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and
Katherine Miller, Introduction to WordNet: An On-line Lexical Database.
4. Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan Cigarrain, Indexing with
WordNet synsets can improve text retrieval, UNED, Ciudad Universitaria.
5. G. Booch, J. Rumbaugh, I. Jacobson, Unified Modeling Language User Guide.
Chapter 3
ABSTRACT
Classical cryptography algorithms are based on mathematical functions. The
robustness of a given cryptosystem is based essentially on the secrecy of its
(private) key and the difficulty with which the inverse of its one-way function(s) can
be calculated. Unfortunately, there is no mathematical proof that it is impossible
to find the inverse of a given one-way function. In recent years, progress in
quantum physics has made it possible to control photons, which can be used to
carry information, and this technological progress can also be applied to
cryptography (quantum cryptography). Quantum cryptography, or Quantum Key
Distribution (QKD), is a method for sharing secret keys whose security can be
formally demonstrated. It aims at exploiting the laws of quantum physics in order to
carry out a cryptographic task. Its legitimate users can detect eavesdropping,
regardless of the technology which the spy may have. In this study, we present
quantum cryptosystems as a tool to attain unconditional security. We also describe
the well-known protocols used in the field of quantum cryptography.
INTRODUCTION
The Origin of the Concept of Quantum Computer
In his article [1], Richard Feynman presented an interesting idea illustrating how
a quantum system can be used for computation. The article also
described how the effects of quantum physics could be simulated by such a
quantum computer. Every experiment investigating the effects and laws of quantum
physics is expensive and complicated. Feynman's idea was very interesting
because it can be used for future research on quantum effects.
A quantum computer is a machine for computation that uses quantum
mechanical phenomena, such as superposition and entanglement, to perform
operations on data. The principle behind quantum computation is that quantum
properties can be exploited to represent data and perform operations on these
data [2]. Later in 1985, it was proved that a quantum computer would be much
more powerful than a classical one [3].
The technology of quantum computers is also very different. For its operation, a
quantum computer uses quantum bits (qubits). The laws of quantum mechanics are
completely different from the laws of classical physics. A qubit can exist not only
in the states corresponding to the logical values 0 or 1, as in the case of a classical
bit, but also in a superposition state.
____ Chapter 3: Achieving unconditional security by Quantum Cryptography
The major difference between quantum and classical computers is related to
memory. The memory of a classical computer is a string of 0’s and 1’s, and it can
perform calculations on only one set of numbers at a time, whereas the memory of
a quantum computer is a quantum state that can be a superposition of different
numbers. A quantum computer can perform an arbitrary reversible classical
computation on all these numbers simultaneously; performing a computation on
many different numbers at the same time and then letting the results interfere
yields a single answer.
For example, as in figure 1.1, a quantum computer with 4 qubits gives $2^4 = 16$
superposition states. Each state would be classically equivalent to a single list of 4
1's and 0's. Such a computer could operate on all $2^4$ states simultaneously.
Eventually, observing the system would cause it to collapse into a single quantum
state corresponding to a single answer, a single list of 4 1's and 0's.
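To make the count concrete, the short sketch below lists the classical bit strings that label the states of an n-qubit register. It is purely illustrative: the function name basis_states is our own, and the code only enumerates labels; it does not simulate quantum behaviour.

```python
from itertools import product


def basis_states(n_qubits):
    """Return the 2**n_qubits bit strings labelling an n-qubit register."""
    return ["".join(bits) for bits in product("01", repeat=n_qubits)]


states = basis_states(4)
print(len(states))            # 16
print(states[0], states[-1])  # 0000 1111
```

Each additional qubit doubles the length of this list, which is why a register of modest size already spans a very large number of states.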
Some problems arise in the production of quantum computers. Any kind of
measurement of quantum state parameters involves interaction with the
environment (with other particles, such as particles of light), which changes
some parameters of the quantum state. Also, measuring a quantum state in
superposition collapses it into a classical state; this is called decoherence. The
decoherence problem is the major obstacle in producing a quantum computer. If
this problem cannot be solved, a quantum computer will be no better than a
silicon one [4].
Hardware is another problem in building quantum computers.
Because of some successful experiments, Nuclear Magnetic Resonance (NMR)
technology is the most popular today. Some other designs are based on ion
traps and quantum electrodynamics (QED). All of these methods have significant
limitations, and nobody knows what the architecture of future quantum computer
hardware will be [3].
Quantum computing is still in its infancy. Although the concept of
quantum computers remained purely theoretical for a long time, recent
developments have aroused interest. Experiments have been carried out in which
quantum computational operations were executed on a very small number of
qubits. Both practical and theoretical research continues, and many national
government and military funding agencies support quantum computing research to
develop quantum computers for both civilian and national security purposes, such
as cryptanalysis.
If quantum computers become a reality, artificial intelligence is one of the fields
expected to benefit. Quantum computers are expected to be much faster for certain
tasks and consequently to perform a large number of operations in a very short
period of time. Increasing the speed of operation would help computers to learn
faster even with the simplest methods. High performance would also support the
development of complex compression algorithms, voice and image recognition,
molecular simulations, true randomness and quantum communication. Randomness
is of great interest in simulations. Molecular simulations are important for
developing simulation applications for biology and chemistry. Quantum
communication has great benefits in the field of security, because both receiver and
sender are alerted when an eavesdropper tries to intercept the signal; it thus makes
communication more secure. There is currently a lot of research on a new type of
cryptography called quantum cryptography. Quantum cryptography, also called
quantum key distribution (QKD), uses quantum mechanics to guarantee secure
communication. It enables two parties to produce a shared random bit string known
only to them, which can be used as a key to encrypt and decrypt messages [5].
computational difficulty of certain mathematical functions. Also, traditional public
key cryptography cannot provide any indication of eavesdropping or guarantee of
key security. Quantum key distribution has an important and unique property: the
ability of the two communicating users (traditionally referred to as Alice and
Bob) to detect the presence of any third party (referred to as Eve) trying to gain
knowledge of the key. A third party trying to eavesdrop on the key must in some
way measure it, thus introducing detectable anomalies. By using quantum
superpositions or quantum entanglement and transmitting information in quantum
states over a quantum channel (such as an optical fiber or free space), a
communication system can be implemented which detects eavesdropping.
photons to exchange keys [11], and Huttner et al. also applied a weak correlation
to reduce significantly the level of tapped information [12]. Wiesner used bright
light to construct a quantum cryptosystem [13].
However, the early quantum cryptosystems developed in the 1980s and 1990s
lacked a complete analysis of the security of key distribution protocols.
An eavesdropper in these systems was assumed to be able to adopt only simple
wiretap methods, but quantum mechanics can in practice support more complex
methods. Applying a separate method to handle each possible attack is quite
difficult, and numerous researchers have devoted themselves to enhancing
system security by applying specific key distribution methods under various
attacks.
The first to examine the security of quantum cryptosystems was
Lütkenhaus [14]. In [15,16] Biham and Mor presented a method for handling
collective attacks. Mayers and Salvail [17], Yao [18] and Mayers [19] based their
research on the BB84 protocol [20], believing that this method could provide
unconditional security and resist various attacks. In [21], Bennett et al.
examined the security of even–odd bits of quantum cryptography.
Despite the development of Quantum Key Distribution protocols, after 20
years a group of scholars asserted that although a quantum cryptosystem based on
a QKD protocol (QKDP) can achieve unconditional security, its key generation is
not efficient in practice because the qubits transmitted over the quantum channel
cannot all be used. For example, out of 10 qubits, only 5 are used for
key generation. Also, its key distribution applies the one-time pad method, in which
the length of the key must be the same as that of the plaintext, so the number of
qubits required far exceeds the length of the plaintext. The cost of frequent
transmission of bulk messages is therefore much too high. Consequently, the new
idea of Quantum Secure Direct Communication (QSDC) was proposed. A QSDC
protocol transforms the plaintext into qubits, which replace the key, and transmits
the messages via the quantum channel. This reduces the number of qubits used
while still enabling automatic detection of eavesdroppers.
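The key-length requirement of the one-time pad mentioned above can be illustrated with a small classical sketch. It is an assumption-laden toy, not part of any protocol described here: the function names are ours, and the key comes from Python's standard secrets module rather than from a quantum channel.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Encrypt with a fresh random key exactly as long as the plaintext."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key recovers the plaintext."""
    if len(key) != len(ciphertext):
        raise ValueError("one-time pad key must match the message length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"bulk message"
key, ciphertext = otp_encrypt(message)
print(len(key) == len(message))               # True
print(otp_decrypt(key, ciphertext) == message)  # True
```

Because every message byte consumes one fresh key byte, a key distributed qubit by qubit grows at least as fast as the traffic it protects, which is the cost argument made in the paragraph above.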
Beige et al. initiated the development of QSDC protocols in 2002 [22]. In
their scheme, the secure message comprises a single photon with two qubit
states; it becomes readable only after the transmission of an extra classical
message via a public channel for each qubit. Later, Boström and Felbinger
developed a Ping-Pong QSDC protocol [23] that adopts Einstein–Podolsky–Rosen
(EPR) pairs [24] as the quantum information carriers. In this protocol, the secure
messages are decoded during transmission, and no additional information needs to
be transmitted. A QSDC scheme using batches of single photons that act as a
one-time pad [25] was proposed by Deng et al. in 2004, and in 2005 Lucamarini and
Mancini presented a protocol [26] for deterministic communication without applying
entanglement. Wang et al. proposed a QSDC approach that uses single photons,
whose concept derives from order rearrangement and block
transmission of the photons [27].
probability correlation function to prove and explain that a connection exists
between the correlation functions satisfying Bell’s Inequality when a classical
probability is employed to illustrate the quantum state of a system. However, in
the 1970s many experiments [30] revealed that the inequality cannot be satisfied if
different bases are employed to measure the separated photons of the entangled
pair mentioned in the EPR paradox. So, entangled quantum states exist whose
correlation function cannot be expressed using classical probability. These
quantum states are non-local. For researchers attempting to refute the locality of
quantum states, these findings were an important victory.
BB84 Protocol
This protocol [20] was developed by Charles Bennett and Gilles Brassard in 1984.
Its design is based on Heisenberg’s Uncertainty Principle. Known as BB84 after
its inventors and year of publication, it was originally described using photon
polarization states to transmit the information. Any two pairs of conjugate states
can be used for the protocol, and many optical-fiber-based implementations
described as BB84 use phase-encoded states. This protocol is surely the most
famous and most widely implemented quantum cryptography protocol. The security
of this protocol against arbitrary eavesdropping strategies was first proved by
Mayers [31], and a simpler proof was later given by Shor and Preskill [32].
The sender and the receiver (Alice and Bob) are connected by a quantum
communication channel which allows quantum states to be transmitted. Currently,
there are two means of transporting photons: optical fiber or free space [33].
Recent research is experimenting with the use of atoms and electrons as quantum
particles [34]-[35], and perhaps a novel kind of quantum channel will appear. The
quantum channel may be tampered with by an enemy. By its very nature, this
channel prevents passive monitoring.
In addition, Alice and Bob communicate via a public classical channel, for
example using broadcast radio or the internet. Neither of these channels needs to
- base ⊕ of the horizontal (0°) and vertical (+90°) polarizations; we represent
the base states with the intuitive notation $|0\rangle$ and $|1\rangle$. We have
$\oplus = \{|0\rangle, |1\rangle\}$ (for details about quantum computation please see [36]).
- base ⊗ of the diagonal polarizations (+45°) and (+135°). The two different base
states are $|+\rangle$ and $|-\rangle$, with $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and
$|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. We have $\otimes = \{|+\rangle, |-\rangle\}$.
In this protocol, the association between the information bit (taken from a random
number generator) and the basis is described in Table 1.
Table 1, row for bit 1: $|1\rangle = a_{01}$ (base ⊕), $|-\rangle = a_{11}$ (base ⊗).
a) Alice chooses a random string of bits $d \in \{0,1\}^n$, and a random string of bases $b$.
b) Alice prepares a photon in quantum state $a_{ij}$ for each bit $d_i$ in $d$ and $b_j$ in $b$ as
in Table 1, and sends it to Bob over the quantum channel.
c) With respect to either ⊕ or ⊗, chosen at random, Bob measures each $a_{ij}$
the incorrect basis yields a random result, as predicted by quantum theory. Thus, if
⊕ was chosen instead, the classical outcome would be 1 with certainty because
$|1\rangle = 1\,|1\rangle + 0\,|0\rangle$.
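As a worked illustration of why the wrong basis gives a random result, the outcome probabilities can be computed from the squared overlaps between the prepared state and the states of the measuring base. This is a sketch in our own notation, assuming the standard definitions of the base states ($|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$); the symbol $P$ for a probability is not used in the original text.

```latex
% Photon prepared as |1> and measured in the diagonal base: both outcomes equally likely.
\[
  P(+) = |\langle + | 1 \rangle|^2 = \left|\tfrac{1}{\sqrt{2}}\right|^2 = \tfrac{1}{2},
  \qquad
  P(-) = |\langle - | 1 \rangle|^2 = \left|-\tfrac{1}{\sqrt{2}}\right|^2 = \tfrac{1}{2}.
\]
% Same photon measured in the base it was prepared in: the outcome is certain.
\[
  P(1) = |\langle 1 | 1 \rangle|^2 = 1,
  \qquad
  P(0) = |\langle 0 | 1 \rangle|^2 = 0.
\]
```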
To detect Eve, Alice and Bob perform a test for eavesdropping in step 2b) of
the protocol. The idea is that, wherever Alice and Bob’s bases are identical (i.e.
$b'_i = b_i$), the corresponding bits should match ($d_i = d'_i$).
Table 2. An example of the intercept-resend attack.
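The effect of an intercept-resend attack on BB84 can be explored with a small classical simulation. The sketch below is an illustration under simplifying assumptions, not an implementation from this chapter: bases are coded as 0 and 1, a measurement in the preparation base returns the encoded bit, a measurement in the other base returns a uniformly random bit, and Eve re-sends whatever she measured in her own randomly chosen base. The function names and the seed parameter are ours.

```python
import random


def measure(bit, prep_base, meas_base, rng):
    """Same base: the encoded bit is read; other base: a random bit."""
    return bit if prep_base == meas_base else rng.randint(0, 1)


def bb84(n, eavesdrop, seed=0):
    """Return (number of sifted bits, number of mismatches among them)."""
    rng = random.Random(seed)
    d = [rng.randint(0, 1) for _ in range(n)]        # Alice's bits
    b = [rng.randint(0, 1) for _ in range(n)]        # Alice's bases
    b_bob = [rng.randint(0, 1) for _ in range(n)]    # Bob's bases
    d_bob = []
    for i in range(n):
        bit, base = d[i], b[i]
        if eavesdrop:
            # Eve measures in a random base and re-sends her result in that base.
            eve_base = rng.randint(0, 1)
            bit = measure(bit, base, eve_base, rng)
            base = eve_base
        d_bob.append(measure(bit, base, b_bob[i], rng))
    # Sifting: keep only the positions where Alice's and Bob's bases agree.
    kept = [i for i in range(n) if b[i] == b_bob[i]]
    errors = sum(d[i] != d_bob[i] for i in kept)
    return len(kept), errors


print(bb84(2000, eavesdrop=False))   # no mismatches without Eve
kept, errors = bb84(4000, eavesdrop=True)
print(errors / kept)                 # close to 0.25 with Eve
```

In this model about half of the transmitted photons survive sifting, and Eve's presence shows up as an error rate near 25% in the sifted bits, which is what the comparison of a sample of bits is meant to reveal.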
B92 Protocol
In 1992, Bennett proposed a protocol for Quantum Key Distribution based on two
nonorthogonal states, known as B92 or the two-state protocol [28]. The B92 protocol is
similar to the BB84 protocol but uses only two states instead of four. The B92 protocol
is also based on Heisenberg’s Uncertainty Principle.
The B92 protocol has been proven to be unconditionally secure. A remarkable proof
of the unconditional security of B92 is that of Tamaki [38]. This proof guarantees
the security of B92 in the presence of any enemy who can perform any operation
permitted by quantum physics; consequently, the security of the protocol cannot be
compromised by future developments in quantum computation. Other results
related to the unconditional security of B92 are discussed in [39][40].
The use of a quantum channel that Eve (the enemy) cannot monitor without being
detected makes it possible to create a secret key with unconditional security
based on the laws of quantum physics. The presence of Eve is made manifest
to the users of such channels through an unusually high error rate. B92 is a
quantum key distribution (QKD) protocol which uses polarized photons as
information carriers. B92 supposes that the two legitimate users, Alice and Bob,
communicate through two specific channels, to which the enemy also has access:
• A classical channel, which can be public; Eve can listen passively (without
being detected);
• A quantum channel on which (by its nature) Eve cannot listen passively.
The first phase of B92 involves transmissions over the quantum channel, while
the second phase takes place over the classical channel.
To describe B92 we use the same notation as for the description
of the BB84 protocol. For simplicity, Fig. 4.2 shows the different photon states
(polarizations) used in this protocol. The encoding of data on photons is
shown in Table 1.
a) Alice randomly chooses a vector of bits $A \in \{0,1\}^n$, $n > N$ ($N$ is the length of the
final key). If $A_i = 0$, Alice sends to Bob the state $|0\rangle$ over the quantum channel,
and if $A_i = 1$, she sends him the state $|+\rangle$, for all $i \in \{1, \ldots, n\}$.
b) Bob in turn creates a random vector of bits $B \in \{0,1\}^n$, $n > N$. If $B_i = 0$, Bob
chooses the basis ⊕, and if $B_i = 1$, Bob chooses the basis ⊗, for all $i \in \{1, \ldots, n\}$.
c) Bob measures each quantum state sent by Alice ($|0\rangle$ or $|+\rangle$) in the
selected basis (⊕ or ⊗).
2) Second phase (Public Discussion)
a) Over the classical channel, Bob sends Alice his test vector $T \in \{0,1\}^n$, in which
$T_i = 1$ marks the positions where his measurement gave a conclusive result
($|1\rangle$ in the basis ⊕ or $|-\rangle$ in the basis ⊗).
b) Alice and Bob keep only the bits of the vectors $A$ and $B$ for which $T_i = 1$. In
that case, and in the absence of Eve, we have $A_i = 1 - B_i$, and the shared raw key is
formed by the bits $A_i$ (or $1 - B_i$).
c) Alice chooses a sample of the bits of the raw key and reveals them to Bob over
the classical channel. If there exists an $i$ such that $A_i \neq 1 - B_i$, then Eve is
detected and the communication is aborted.
d) The shared secret key $K \in \{0,1\}^N$ is formed by the raw key after elimination of the
sample bits revealed in step c).
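The two phases can be illustrated with a small classical simulation. The following Python sketch assumes an ideal, noiseless channel with no eavesdropper and models Bob's measurement only through its outcome probabilities; the function name b92_sift is hypothetical and not part of any library.

```python
import random


def b92_sift(n, rng):
    """Simulate one ideal B92 run (no noise, no Eve); return both raw keys."""
    alice_bits = [rng.randrange(2) for _ in range(n)]  # A_i: 0 -> |0>, 1 -> |+>
    bob_bits = [rng.randrange(2) for _ in range(n)]    # B_i: 0 -> basis ⊕, 1 -> basis ⊗

    test = []
    for a, b in zip(alice_bits, bob_bits):
        if a == b:
            # |0> measured in ⊕ (or |+> measured in ⊗) always reproduces the
            # sent state, so the result is never conclusive.
            conclusive = False
        else:
            # |0> measured in ⊗ (or |+> measured in ⊕) gives the conclusive
            # outcome |-> (or |1>) with probability 1/2.
            conclusive = rng.random() < 0.5
        test.append(1 if conclusive else 0)

    # Keep only the positions with T_i = 1; Bob's key bit is 1 - B_i.
    raw_alice = [a for a, t in zip(alice_bits, test) if t == 1]
    raw_bob = [1 - b for b, t in zip(bob_bits, test) if t == 1]
    return raw_alice, raw_bob


if __name__ == "__main__":
    key_a, key_b = b92_sift(1000, random.Random(0))
    print(len(key_a), key_a == key_b)
```

In this idealized model roughly a quarter of the transmitted bits survive sifting, and the two raw keys always agree; any disagreement found in the sample of step c) would therefore point to noise or to Eve.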
The EPR Protocol
Preliminary
In [42], Artur Ekert elaborated a quantum protocol based on the properties
of quantum-correlated particles. It uses a pair of particles called an EPR pair.
EPR refers to Einstein, Podolsky and Rosen, who in their 1935 article [24]
challenged the foundations of quantum mechanics by pointing out a “paradox”.
The authors state that there exist spatially separated pairs of particles, called
EPR pairs, whose states are correlated in such a way that the measurement of a
chosen observable A of one automatically determines the result of the
measurement of A of the other. Since EPR pairs can be pairs of particles
separated by great distances, this strange behavior appears to be due to
“action at a distance.”
It is possible, for example, to create a pair of photons (each of which we label
below with the subscripts 1 and 2, respectively) with correlated linear polarizations.
An example of such an entangled state is given by:
\[ |S\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_1 |1\rangle_2 + |1\rangle_1 |0\rangle_2 \right) \]
Thus, if one photon is measured to be in the state $|0\rangle$, the other, when
measured, will be found to be in the state $|1\rangle$, and vice versa.
To explain the paradox of “action at a distance”, Einstein et al. suppose that
there exist “hidden variables”, inaccessible to experiments. They then state that
such quantum correlation phenomena could be a strong indication that quantum
mechanics is incomplete. In 1964, Bell [29] gave a means of actually testing for
local hidden variable (LHV) theories. He demonstrated that all such LHV theories
must satisfy the Bell inequality. On the other hand, quantum mechanics has been
shown to violate the inequality.
EPR Protocol
Unlike BB84 and B92 protocols, this protocol uses Bell’s inequality to detect
the presence or absence of Eve as a hidden variable. The EPR quantum protocol
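\[ |S_1\rangle = \frac{1}{\sqrt{2}} \left( \left|\tfrac{\pi}{6}\right\rangle_1 \left|\tfrac{4\pi}{6}\right\rangle_2 + \left|\tfrac{4\pi}{6}\right\rangle_1 \left|\tfrac{\pi}{6}\right\rangle_2 \right) \]
\[ |S_2\rangle = \frac{1}{\sqrt{2}} \left( \left|\tfrac{2\pi}{6}\right\rangle_1 \left|\tfrac{5\pi}{6}\right\rangle_2 + \left|\tfrac{5\pi}{6}\right\rangle_1 \left|\tfrac{2\pi}{6}\right\rangle_2 \right) \]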
Bit 0 1 0 1 0 1
Like the BB84 and B92 protocols, the EPR protocol has two phases, the
first over a quantum channel and the second over a public channel. The EPR
protocol can be described as follows [43]:
1) Quantum Transmissions (First phase)
Firstly, a state $|S_i\rangle$ is randomly selected from the set of states
$\{ |S_j\rangle, 0 \le j \le 2 \}$ to create an EPR pair in the selected state $|S_i\rangle$.
One photon of the established EPR pair is sent to Alice, the other to Bob.
Separately and independently, Alice and Bob each select at random, with equal
probability, one of the three measurement operators $M_0$, $M_1$ and $M_2$. They
measure their respective photons with the selected measurement operators. Alice
records her measured bit, and Bob records the complement of his measured bit.
This procedure is repeated as many times as needed.
2) Public Discussion (Second phase)
Alice and Bob hold a discussion over a public channel to determine the bits
for which they used the same measurement operators. Next, they separate their
respective bit sequences into two subsequences. The first subsequence, called the
raw key, consists of the bits for which they used the same measurement
operators. The second subsequence, called the rejected key, consists of all the
remaining bits.
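The separation into raw key and rejected key can be sketched in a few lines of Python. The record layout (operator index of Alice, operator index of Bob, Alice's bit, Bob's recorded bit) and the name split_keys are assumptions made for this illustration only.

```python
def split_keys(alice_ops, bob_ops, alice_bits, bob_bits):
    """Split the measurement records into (raw key, rejected key).

    Each record is a tuple (op_alice, op_bob, bit_alice, bit_bob); Bob's bit
    is assumed to be already complemented, as in the first phase.
    """
    raw, rejected = [], []
    for record in zip(alice_ops, bob_ops, alice_bits, bob_bits):
        op_a, op_b = record[0], record[1]
        # Same measurement operator -> raw key, otherwise -> rejected key.
        (raw if op_a == op_b else rejected).append(record)
    return raw, rejected


if __name__ == "__main__":
    raw, rejected = split_keys([0, 1, 2, 0], [0, 2, 2, 1],
                               [1, 0, 1, 1], [1, 1, 1, 0])
    print("raw key bits:", [r[2] for r in raw])
    print("rejected records:", rejected)
```

Only the operator indices need to be exchanged in public to perform this split; the bits of the raw key are never revealed.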
The purpose of the rejected key is to detect Eve’s presence. Alice and Bob
over the public channel compare their respective rejected keys to determine
whether or not Bell’s inequality is satisfied: if it is, Eve’s presence is detected and if
not, then Eve is absent.
For this specific EPR protocol, Bell’s inequality can be formulated as follows.
We denote by $P(\neq | i, j)$ the probability that two corresponding bits of Alice’s and
Bob’s rejected keys do not coincide, given that the measurement operators chosen
by Alice and Bob are respectively either $M_i$ and $M_j$, or $M_j$ and $M_i$.
We also write:
\[ P(= | i, j) = 1 - P(\neq | i, j), \quad \Phi(i, j) = P(\neq | i, j) - P(= | i, j), \quad I = 1 + \Phi(1, 2) - |\Phi(0, 1) - \Phi(0, 2)| . \]
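As a numerical illustration, the quantity I can be evaluated from the three mismatch probabilities. The Python sketch below rests on two assumptions that are not stated in the text above: that Bell's inequality for this protocol takes the form I ≥ 0, and that for an ideal, undisturbed EPR pair the mismatch probability is the squared sine of the angle between the two measurement directions (operators M_0, M_1, M_2 at angles 0, π/6 and 2π/6). The function names are hypothetical.

```python
import math


def bell_statistic(p_neq):
    """Compute I = 1 + Phi(1,2) - |Phi(0,1) - Phi(0,2)|.

    p_neq maps an operator pair (i, j) to the estimated P(!= | i, j).
    """
    def phi(i, j):
        p = p_neq[(i, j)]
        return p - (1.0 - p)  # P(!=) - P(=)

    return 1.0 + phi(1, 2) - abs(phi(0, 1) - phi(0, 2))


def ideal_p_neq(i, j):
    """Assumed ideal quantum model: P(!= | i, j) = sin^2((j - i) * pi / 6)."""
    return math.sin((j - i) * math.pi / 6) ** 2


if __name__ == "__main__":
    pairs = [(0, 1), (0, 2), (1, 2)]
    quantum = bell_statistic({p: ideal_p_neq(*p) for p in pairs})
    print("ideal quantum value of I:", round(quantum, 6))
    print("inequality I >= 0 satisfied:", quantum >= 0)
```

Under these assumptions the ideal quantum value is negative, so the inequality is violated and no eavesdropper is indicated; statistics that satisfy I ≥ 0 would signal Eve's presence, in line with the rule stated above.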
CONCLUSION
Quantum cryptography is based on a combination of principles from quantum
physics and information theory, and has been made possible by the tremendous
progress in quantum optics and in the technology of optical fibers and of
free-space optical communication. Its security relies on deep theorems in classical
information theory and on a profound understanding of Heisenberg’s
uncertainty principle. Quantum cryptography has made some important
contributions to classical cryptography: privacy amplification [47] and classical
bound information are examples of concepts in classical information theory whose
discovery was much inspired by quantum cryptography. Also, the fascinating
tension between quantum physics and relativity, as illustrated by Bell’s inequality,
is not far away. Nevertheless, despite the huge progress over recent years, many
technological challenges and open questions remain.
The first technological challenge at present concerns improved detectors
compatible with telecom fibers. Two other issues concern free-space links and
quantum repeaters. The former is presently the only way to realize quantum
cryptography over thousands of kilometers using near-future technology. The
idea of quantum repeaters is to encode the qubits in such a way that, if the error
rate is low, errors can be detected and corrected entirely in the quantum domain.
The hope is that such techniques could extend the range of quantum
communication to essentially unlimited distances.
On the side of open questions, we emphasize three main concerns. First,
complete and realistic analyses of the security issues are still missing. Second,
figures of merit to compare quantum cryptography schemes based on different
quantum systems (with different dimensions, for example) are still awaited. Third,
the delicate question of how to test the apparatuses has not yet received enough
attention.
Quantum cryptography could well be the first application of quantum
mechanics at the single-quantum level. Many experiments have demonstrated that
keys can be exchanged over distances of a few tens of kilometers at rates at least
of the order of a thousand bits per second. There is no doubt that the technology
can be mastered and the question is not whether quantum cryptography will find
commercial applications, but when!
REFERENCES
1. R. Feynman, Simulating physics with computers, International Journal of Theoretical
Physics 21 (6&7) (1982) 467–488.
2. http://qist.lanl.gov:80/qcomp_map.shtml
3. West, J (2000). Quantum Computers. Retrieved December 1, 2002 from
California Institute of Technology, educational website:
http://www.cs.caltech.edu/~westside/quantum-intro.html#qc
4. Daniel, G. (1999). Quantum Error-Correcting Codes. Retrieved on November
31st, 2002 from: http://qso.lanl.gov/~gottesma/QECC.html
5. http://en.wikipedia.org/wiki/Quantum_cryptography
6. Integer Factoring By ARJEN K. LENSTRA - Designs, Codes and
Cryptography, 19, 101–128 (2000) Kluwer Academic Publishers.
http://modular.fas.harvard.edu/edu/Fall2001/124/misc/arjen_lenstra_factoring.
pdf
7. P.W. Shor, Algorithms for quantum computation: discrete logarithm and
factoring, in: Proceedings of the 35th Annual Symposium on the Foundations
of Computer Science, 1994, pp. 124–134.
8. S. Wiesner, Conjugate coding, SIGACT News 15 (1) (1983) 78–88.
9. E. Biham, B. Huttner, T. Mor, Quantum cryptography network based on
quantum memories, Physical Review A 54 (3) (1996) 2651– 2658.
10. S.J.D. Phoenix, S.M. Barnett, P.D. Townsend, K.J. Blow, Multi-user quantum
cryptography on optical networks, Journal of Modern Optics 42 (1995) 1155–
1163.
11. B. Huttner, A. Peres, Quantum cryptography with photon pairs, Journal of
Modern Optics 41 (12) (1994) 2397–2403.
12. B. Huttner, N. Imoto, N. Gisin, T. Mor, Quantum cryptography with coherent
states, Physical Review A 51 (3) (1995) 1863–1869.
13. S. Wiesner, Quantum cryptography with bright light, Manuscript, 1993.
14. N. Lutkenhaus, Security against eavesdropping in quantum cryptography,
Physical Review A 54 (1) (1996) 97–111.
15. E. Biham, T. Mor, Security of quantum cryptography against collective attacks,
Physical Review Letters 78 (11) (1997) 2256–2259.
16. E. Biham, T. Mor, Bounds on information and the security of quantum
cryptography, Physical Review Letters 79 (20) (1997) 4034– 4037.
17. D. Mayers, L. Salvail, Quantum oblivious transfer is secure against all
individual measurements, in: Proceedings of the 3rd Workshop on Physics
and Computation—PhysComp’94, IEEE Computer Society, 1994, pp. 69–77.
18. A.C.-C. Yao, Security of quantum protocols against coherent measurements,
in: Proceedings of the 26th Annual ACM Symposium on the Theory of
Computing, 1995, pp. 67–75.
19. D. Mayers, Quantum key distribution and string oblivious transfer in noisy
channels, in: Advances in Cryptology—CRYPTO’96, LNCS 1109, Springer-
Verlag, 1996, pp. 343–357.
20. C.H. Bennett, G. Brassard, Quantum cryptography: public key distribution and
coin tossing, in: Proceedings of the International Conference on Computers,
Systems & Signal Processing, Bangalore, India, December 10–12, 1984, pp.
175–179.
21. C.H. Bennett, T. Mor, J. Smolin, The parity bit in quantum cryptography,
Physical Review A 54 (4) (1996) 2675–2684.
22. A. Beige, B.-G. Englert, C. Kurtsiefer, H. Weinfurter, Secure communication with
a publicly known key, Acta Physica Polonica A 101 (3) (1999) 357.
23. K. Boström, T. Felbinger, Deterministic secure direct communication using
entanglement, Physical Review Letters 89 (18) (2002) 187902.
24. A. Einstein, B. Podolsky, N. Rosen, Can quantum-mechanical description of
physical reality be considered complete? Physical Review 47 (1935) 777–780.
25. F.-G. Deng, G.L. Long, Secure direct communication with a quantum one-time
pad, Physical Review A 69 (5) (2004) 052319.
26. M. Lucamarini, S. Mancini, Secure deterministic communication without
entanglement, Physical Review Letters 94 (2005) 140501.
27. J. Wang, Q. Zhang, C.-J. Tang, Quantum secure direct communication based
on order rearrangement of single photons, Physics Letters A 358 (4) (2006)
256–258.
28. C.H. Bennett, Quantum cryptography using any two non-orthogonal states,
Physical Review Letters 68 (21) (1992) 3121–3124.
29. J.S. Bell, On the Einstein–Podolsky–Rosen paradox, Physics 1 (1964) 195–
200.
30. J.F. Clauser, Experimental investigation of a polarization correlation anomaly,
Physical Review Letters 36 (1976) 1223–1226.
31. D. Mayers, ”Unconditional security in quantum cryptography,” Journal of the
ACM, vol. 48, no. 3, pp. 351–406, May 2001.
32. P. W. Shor and J. Preskill, ”Simple proof of security of the BB84 quantum key
distribution protocol,” Phys. Rev. Lett, vol. 85, no. 2, pp. 441–444, July 2000.
33. R. Hughes, J. Nordholt, D. Derkacs, C. Peterson, ”Practical free-space
quantum key distribution over 10 km in daylight and at night,” New Journal of
Physics 4 (2002) 43.1–43.14. URL: http://www.iop.org/EJ/abstract/1367-2630/4/1/343/
34. Knight, P (2005). “Manipulating cold atoms for quantum information
processing”. QUPON conference Vienna 2005.
35. Tonomura, A (2005). “Quantum phenomena observed using electrons”.
QUPON conference Vienna 2005.
36. M. Nielsen and I. Chuang. Quantum Computation and Quantum Information.
Cambridge University Press, 2000.
37. M. Elboukhari, M. Azizi, A. Azizi, “Implementation of secure key distribution
based on quantum cryptography”, in Proc. IEEE Int. Conf Multimedia
Computing and Systems (ICMCS’09), page 361 - 365, 2009.
38. Tamaki, K., M. Koashi, and N. Imoto, “Unconditionally secure key distribution
based on two non orthogonal states,” Physical Review Letters 90, 167904
(2003), [preprint quant-ph/0210162].
39. Tamaki, K., Lütkenhaus, N., “Unconditional security of the Bennett 1992
quantum key-distribution over lossy and noisy channel,” Quantum Physics
Archive: arXiv:quant-ph/0308048v2, 2003.
40. Tamaki, K., Lütkenhaus, N., Koashi, M., and Batuwantudawe, J., “Unconditional
security of the Bennett 1992 quantum key-distribution scheme with strong
reference pulse,” Quantum Physics Archive: arXiv:quant-ph/0607082v1, 2006.
41. M. Elboukhari, M. Azizi, A. Azizi, “Security Oriented Analysis of B92 by Model
Checking”, in Proc. IEEE Int. Conf. new technology, mobility and security
(NTMS), page 454-458, 2008.
42. Ekert, Artur K., Quantum cryptography based on Bell’s theorem, Physical
Review Letters, Vol. 67, No. 6, 5 August 1991, pp 661 - 663.
43. S. J. Lomonaco Jr., A quick glance at quantum cryptography,
http://www.cs.umbc.edu/~lomonaco/Publications.html
44. Bennett, Charles H., Gilles Brassard, and N. David Mermin, Quantum
cryptography without Bell’s theorem, Physical Review Letters, Vol. 68, No. 5, 3
February 1992, pp 557 - 559.
45. D’Espagnat, B., Scientific American, November 1979, pp 128 - 140.
46. Blow, K.J., and Simon J.D. Phoenix, On a fundamental theorem of quantum
cryptography, Journal of Modern Optics, 1993, vol. 40, no. 1, 33 - 36.
47. Bennett, C. H., Brassard, G., Crepeau, C. and Maurer, U. M., "Generalized
Privacy Amplification", IEEE Transactions on Information Theory, 1995.
Chapter 4
ABSTRACT
Mobile agents are a new paradigm for distributed computation, in which a mobile
agent roams the global Internet in search of services for its owner. The main
problem with this approach is security.
The objective of this paper is to propose a protocol to protect a mobile agent,
based on two agents: the mobile agent and an investigator agent. The investigator
is a prototype of the mobile agent with no critical code or data. It is created and
sent by the mobile agent in order to be executed first. On its return, the
investigator agent is analyzed by the mobile agent to detect any malicious action:
if forbidden actions are found, the mobile agent creates a new copy and changes
destination. If doubtful actions are found, the mobile agent chooses an adaptation
plan and then migrates. If all actions are authorized, the mobile agent migrates
with confidence.
INTRODUCTION
Mobile agents are program instances able to migrate from one agent platform
to another [1], [3], [9], thus fulfilling tasks on behalf of a user or another entity.
They consist of three parts: code, data state (e.g. instance variables), and
execution state.
They transport sensitive information such as secret keys, electronic money,
and other private data.
Consequently, security is a fundamental precondition for the acceptance of
mobile agent applications. In other words, we need a program that actively
protects itself against malicious hosts which try to attack the mobile agent in
order to obtain a service without providing payment, to remove private information
from the agent’s memory, or to destroy its code, state or data [16], [27].
We can classify a host’s attacks against mobile agents into three classes [4], [8],
[14]: inspection, modification and replay attacks. Inspection consists in examining
the contents of the agent, or its stream of execution, to retrieve critical information
transported by the mobile agent.
Modification is carried out by replacing some elements of the agent with the aim
of mounting an attack.
Replay attacks are carried out by cloning the agent, then executing the clone in
several configurations to discover the agent’s knowledge.
We also mention denial of service [12], where a malicious host can ignore
service requests, introduce unacceptable delays for critical tasks, refuse to execute
the code of the mobile agent, or terminate it without notice. Other agents waiting
for the answer of this agent will then be deadlocked.
Different approaches have been proposed to guarantee agents a trusted execution
on visited hosts, such as tamper-proof hardware [12], function hiding [13], black
box [6], [15] or clueless agents [11].
In this paper we explain a proposal to protect mobile agents, based on the
concepts of cloning and adaptability.
____ Chapter 4: Adaptive architecture to protect mobile agents
The investigator is a copy of the mobile agent, with no critical code or data. The
mobile agent creates its investigator, saves a copy and sends it first to the host.
After execution, the investigator returns to the mobile agent, which compares it
with the saved copy in order to detect possible attacks.
Thus, instead of exposing the mobile agent to a malicious host, we propose to
first send an investigator to probe for possible attacks. This idea minimizes damage
to the mobile agent and allows several types of attacks to be detected and avoided
as they occur.
We discuss the protection that our proposal offers mobile agents against these
attacks in section 6. First, section 2 presents some works that use one or several
agents to protect a mobile agent, and other works based on adaptability. Section 3
describes the principle, architecture, and steps of the proposed system.
Section 4 presents some implementation principles. In section 5 we propose an
experiment to compare our approach with two other approaches.
We evaluate the protocol in section 6. Finally, section 7 summarizes our
contribution and describes future work.
Related Work
In this section, we summarize some proposed techniques based on
adaptability, and others based on one or more agents.
A. Adaptability to protect mobile agents
Adaptability concept
Adaptation designates the act of reacting to variations in environmental
constraints [20], [21].
Adaptation can be static or dynamic. Static adaptation is done before
execution, according to the knowledge held about the environment.
Dynamic adaptation is based on estimating, during the construction of the
application, the different variations of the environment and defining the
corresponding adaptation actions. Consequently, it defines adaptability rules.
Works based on Adaptability
S. Hacini et al. in [14] proposed an approach which gives the mobile agent the
possibility of modifying its behaviour. This ability makes it unpredictable and
complicates its analysis [2]. The idea is that the mobile agent must verify the
customer's trustworthiness and present an appropriate behaviour.
S. Leriche and J.-P. Arcangeli [20], [21] proposed an architecture based on
replaceable micro-components to adapt the mobile agent to various execution
conditions. We tried to use this idea in our architecture to adapt the mobile agent
to different attacks.
The Protocol
A. Origin (Military Life)
Our proposal is inspired by the principle of military investigators, or scouts.
investigator’s report and other keys to encrypt collected data, partial results and
mobile agent itinerary.
Dynamic data
This part is dynamic because it evolves with every migration. It contains the black
list of hosts (BLH), the trusted hosts list (THL) and the doubtful or untrusted hosts
list (UHL).
Collected data and partial results are encrypted and saved after every migration
(the decryption key is held only on the home host, to guarantee data confidentiality).
In addition, every visited host is recorded and encrypted by the mobile agent to
form the itinerary.
Investigator’s copy
The mobile agent backs up a copy of the investigator agent (code and data) before
its migration to a site, in order to compare and analyze it after its return.
Action library
The action library contains a set of predefined events, divided into three classes:
• Class of allowed actions: It contains all the actions that the investigator
can perform without any problem, for example execution, migration, etc.
• Class of forbidden actions: This class contains forbidden actions which
can violate the integrity or confidentiality of the investigator, for example the
deletion or modification of code.
• Class of doubtful actions: It contains a set of doubtful actions for which the
mobile agent is not sure whether they can be trusted, for example a copy-paste
of code.
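The three classes can be pictured as a simple lookup table. The Python sketch below is only an illustration: the action names and the choice to treat unknown actions as doubtful are assumptions, not part of the proposed protocol.

```python
from enum import Enum


class ActionClass(Enum):
    ALLOWED = "allowed"
    FORBIDDEN = "forbidden"
    DOUBTFUL = "doubtful"


# Hypothetical action names, chosen to mirror the examples given above.
ACTION_LIBRARY = {
    "execute": ActionClass.ALLOWED,
    "migrate": ActionClass.ALLOWED,
    "delete_code": ActionClass.FORBIDDEN,
    "modify_code": ActionClass.FORBIDDEN,
    "copy_paste_code": ActionClass.DOUBTFUL,
}


def classify(action):
    """Return the class of a reported action.

    Assumption: an action missing from the library is treated as doubtful.
    """
    return ACTION_LIBRARY.get(action, ActionClass.DOUBTFUL)


if __name__ == "__main__":
    for name in ("execute", "delete_code", "copy_paste_code", "unknown"):
        print(name, "->", classify(name).value)
```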
Interface
We propose an interface between the mobile agent and the investigator. It receives
the data, code and report of the investigator, then sends them to the analyzer to be
analyzed.
Analyzer
The role of this part is to analyze the investigator agent: code, data and report. No
modification may be made to the code or static data. Dynamic data or partial results
must be signed by the visited host; otherwise, the host is considered malicious.
The analyzer compares the investigator’s report with the action library:
Component Library
It contains replaceable components used by the mobile agent during adaptation.
These components are fragments of code chosen by the adapter according to the
investigator’s report.
Adapter
In this section we rely on the architecture proposed by Hacini et al [14] and
Leriche et al [21]. The role of the adapter is to adapt the mobile agent according to
the report of events, by choosing suitable components and sending them to the
manager in order to change the mobile agent's structure. The adapter contains a set
of rules of the type:
<If Action (i) Then Component (j)>.
For example, if the code has been analysed, the adapter decides to apply a stronger
code protection policy by choosing the obfuscation technique [17]. It thus adapts
the mobile agent to an attack situation by loading the component “obfuscation code”.
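Rules of this form amount to a mapping from reported actions to components. A minimal Python sketch follows; the rule table, the action names and the function select_components are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical rule table: <If Action (i) Then Component (j)>.
ADAPTATION_RULES = {
    "analyse_code": "obfuscation_code",
    "copy_paste_code": "obfuscation_code",
}


def select_components(report):
    """Return the components to load for a report of actions.

    Components are listed in order of first appearance, without duplicates;
    actions with no matching rule are ignored.
    """
    selected = []
    for action in report:
        component = ADAPTATION_RULES.get(action)
        if component is not None and component not in selected:
            selected.append(component)
    return selected


if __name__ == "__main__":
    print(select_components(["execute", "analyse_code", "copy_paste_code"]))
```

The selected components would then be handed to the manager, which performs the actual replacement.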
Manager
Its role is to change the mobile agent's structure to adapt it in case of attacks.
The manager creates a new copy of the investigator and changes its destination in
case of forbidden actions. It adds sites to the black list, the trusted list or the
untrusted list.
If there is no attack, the manager is informed by the analyzer and enables the mobile
agent to migrate with confidence.
2) Investigator agent
The investigator is a prototype of the mobile agent with no important code or data.
Our aim is to avoid directly exposing the essential knowledge and sensitive data of
the mobile agent before testing the trustworthiness of the host and identifying
whether it is malicious.
The investigator has static data, such as the creator's digital signature, and dynamic
data, such as partial results collected on the host. The investigator agent also has a
report of all actions performed on the host, and it communicates with its mobile
agent through an interface.
These actions are encoded with a key known only to the mobile agent and changed
for every migration, in order to avoid the risk that a malicious host performs a
forbidden action and then tries to delete it from the action report.
[Figure: architecture diagram; recoverable labels: Static data, Dynamic data, Code, Components Lib, MANAGER, ADAPTER, Investigator, INVESTIGATOR, Information Flux, Data, Code, Report.]
The environmental key aims at identifying the host before execution begins
[23]. If this key is not valid, the investigator returns to its mobile agent, which
changes the destination. If the key is valid, the investigator begins its execution.
C. The mobile agent analyzes the investigator. We can envisage the following situations.
4.1 The mobile agent compares the execution time with the estimated time. If the
estimated time is exceeded, the mobile agent concludes that the investigator has
either been “killed” or had its destination modified, and so is "lost". It puts this
host in the blacklist and decides not to migrate.
4.2 If the investigator returns, the mobile agent compares it with the saved copy. If
it finds that the code or static data were modified, it puts the responsible host in the
blacklist and decides not to migrate.
4.3 If there is no modification in the code or static data, the mobile agent verifies
that the investigator's partial results are signed. The objective is to ensure
non-repudiation, so that a host cannot deny having received the investigator.
If a check in steps 4.1, 4.2 or 4.3 fails, the mobile agent must create another
investigator and send it towards another destination.
4.4 If the code and data are not modified, and the partial results are signed, the
mobile agent analyzes the investigator’s report and compares its actions with the
library of actions:
• If it finds forbidden actions (such as deletion of code or data), it puts the host in
the blacklist (BLH), creates another copy of the investigator and sends it towards
another destination.
• If it finds doubtful actions (such as copy-paste), it puts the host in the list of
doubtful sites (UHL), chooses suitable components, then changes the structure of
its code by replacing components. The goal is to increase the security level of the
mobile agent in order to avoid the attack.
• If all the actions in the report are allowed, the mobile agent registers the host H1
in its list of trusted hosts (THL), adds it in encrypted form to the itinerary, then
migrates.
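Steps 4.1 to 4.4 form a short decision procedure, which can be sketched in Python as follows. This is an illustrative reading of the steps, not the authors' implementation: the Investigator record, the action names and the three outcome labels are assumptions, and signature verification is reduced to a boolean flag.

```python
from dataclasses import dataclass, field

FORBIDDEN = {"delete_code", "modify_code"}  # hypothetical action names
DOUBTFUL = {"copy_paste_code"}


@dataclass
class Investigator:
    code: bytes
    static_data: bytes
    report: list = field(default_factory=list)
    results_signed: bool = False  # stands in for a real signature check


def decide(saved, returned, elapsed, estimated):
    """Return 'blacklist', 'adapt' or 'migrate' for one visited host."""
    # 4.1: the investigator did not come back within the estimated time.
    if returned is None or elapsed > estimated:
        return "blacklist"
    # 4.2: code or static data differ from the saved copy.
    if returned.code != saved.code or returned.static_data != saved.static_data:
        return "blacklist"
    # 4.3: partial results must be signed by the visited host.
    if not returned.results_signed:
        return "blacklist"
    # 4.4: compare the reported actions with the action library.
    actions = set(returned.report)
    if actions & FORBIDDEN:
        return "blacklist"  # host goes to BLH, a new investigator is sent elsewhere
    if actions & DOUBTFUL:
        return "adapt"      # host goes to UHL, components are replaced
    return "migrate"        # host goes to THL


if __name__ == "__main__":
    saved = Investigator(b"code", b"signature")
    back = Investigator(b"code", b"signature", ["execute", "copy_paste_code"], True)
    print(decide(saved, back, elapsed=12.0, estimated=50.0))
```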
D. At the end of its mission, the mobile agent returns to its host of origin with all
collected data and the lists of malicious, doubtful and trusted hosts. These lists are
used for future migrations.
E. Before continuing towards a host H2, the mobile agent verifies that H2 is
not in its black list, recreates an investigator and repeats the same steps.
Implementation
In order to show the viability of our protocol, prototypes of the mobile and
investigator agents were created.
The current implementation uses the JADE agent platform (Java Agent
DEvelopment framework). The main reason for this selection is that JADE is
considered one of the best modern agent environments [5]. Furthermore, JADE is
open-source and FIPA compliant [1].
In what follows, we describe some principles of our implementation.
① The mobile agent and the investigator agent run in JADE containers. In order to
simulate different attacks, we implement an agent, called the Testing
Agent (TA), in the secondary container.
② Its role is to generate attacks on the investigator agent in order to observe the
behavior of the mobile agent.
③ The mobile agent, investigator agent and testing agent are defined with the class
Agent (package jade.core).
④ All proposed modules (manager, adapter, and analyzer) are defined as Java
classes. The action library and component library are represented by databases.
⑤ All proposed agents must be registered with the DF agent (yellow pages service,
class DFService, package jade.domain).
⑥ Communication between agents uses the FIPA ACL language (class
ACLMessage, package jade.lang.acl). We use the Dummy agent to visualize
messages between agents (Fig.2).
⑦ All agents’ behaviors are defined with the class Behaviour (package
jade.core.behaviours).
Experiment
In order to validate our protocol, we planned an experiment which consists in
creating three mobile agents (Fig.3). MobileAgent1 is protected with an approach
based on detection, where an attack is detected after the return to the home site.
The second agent, MobileAgent2, is protected with a TTP (Trusted Third Party).
The TTP is a server where MobileAgent2 is verified after each migration.
The third agent, MobileAgent3, is protected with the proposed protocol.
Figure 3. Different Containers and Agents Used in the experiment
too late.
The TTP is able to detect these actions on MobileAgent2 when it verifies it.
However, the TTP needs more time for verification and it cannot prevent attacks.
MobileAgent3 can detect these attacks once it analyzes the investigator's report. In
this case, it adapts itself by choosing the obfuscation component, which consists in
making the code incomprehensible [18].
We note that the attack was ineffective after adaptation.
Finally, when the testing agent does not attack any mobile agent, MobileAgent1 and
MobileAgent2 cannot be sure that the host is trustworthy, whereas MobileAgent3 is
reassured and can migrate with confidence.
We calculate, for every agent, the time necessary to detect attacks, knowing that
the total time to go to the host and return to the host of origin is 50 seconds.
The resulting comparison graph is presented in figure 4.
Note that the more sites there are to visit, the longer MobileAgent1 and
MobileAgent2 take to detect attacks.
Figure 4. Comparison between approaches
EVALUATION
Guan et al. [10], Karjoth et al. [19], Yao et al. [25] and T. Sander et al. [24]
used the following points as security properties of mobile agents; we use some of
them as criteria to analyse our protocol:
• Confidentiality
Analysis of the investigator agent's report reveals whether its code or data were
copied. Moreover, the environmental key guarantees confidentiality, because only
authorized hosts can access first the investigator and then the mobile agent.
• Non-repudiation
The signature of results by the host and the encrypted recording of the itinerary
ensure this property, because no site can deny that it was visited.
• Integrity
It is verified when the investigator agent is analysed by the mobile agent, which
compares its code with the saved copy to detect any corruption or modification.
Conclusion
In this paper, we looked at the general problem of protecting mobile agents
against malicious hosts, and at the different works proposed. Then, we presented a
protocol to ensure the protection of the mobile agent. The idea is based on prevention
and requires two agents: the mobile agent, with knowledge and critical data, and a
prototype, with no sensitive code or data. This prototype is called the investigator
agent.
Before migrating to a host, a mobile agent creates an investigator and sends it
first. On its return, the investigator is analyzed in order to detect attacks. If the
investigator was attacked, the mobile agent chooses an adaptation policy, which
consists in replacing components to suit the attack situation. The mobile agent
decides not to migrate at all if the attacks are severe.
We described the proposed architecture and its implementation using JADE.
We discussed the capacity of this protocol to satisfy various security properties.
Several areas of the work presented here require further investigation; two of
them particularly interest us. Firstly, we would like to assess the performance of
our proposal in a real case; we will choose e-commerce. Secondly, we would like
to develop JADE mobile agents which are more application specific.
REFERENCES
1. Ametller J, Cucurull J, Mart R, Navarro G, Robles S, “Enabling mobile agents
interoperability through fipa standards”, Lecture Notes in Artificial Intelligence,
CIA 2006, vol. 4149, Springer, Edinburgh, UK, 2006, pp. 388–401.
2. Belaramani N.M, “A component -based software system with functionality
adaptation for mobile computing”. Master’s thesis, 2002.
3. Bernard G., Ismail L, « Apport des agents mobiles à l'exécution répartie »,
Revue des sciences et technologies de l’information, série Techniques et
science informatiques, vol. 21, n 6, p. 771-796, 2002.
4. Biehl I, Meyer B and Wetzel S, “Ensuring the Integrity of Agent-Based
Computations by Short Proofs”. Proceedings of the second International
Workshop on Mobile Agents. LNCS, Vol. 1477, pages 183-194, 1998.
5. Chmiel K, “Agent Technology in Modelling E-Commerce Processes; Sample
Implementation”, Multimedia and Network Information Systems, Volume 2,
Wrocław University of Technology Press, pp. 13-22.
6. Corradi A, Montanari R, and Stefanelli.C, “Mobile agents integrity in e-
commerce applications,” in Proc. of the 19th IEEE International Conference on
Distributed Computing Systems Workshop (ICDCS’99), 1999, Austin, Texas:
IEEE Computer Society Press, pp. 59-64
7. El Rhazi A, Pierre S, Boucheneb H, « Secure protocol in mobile agent
environment ». IEEE CCECE 2003, May 4-7, vol.2, Montereal, pp777-80.
8. Farmer W, Guttman J, and Swarup V, “Security for Mobile Agents:
Authentication and State Appraisal”. In Proceedings of the 4th European
Symposium on Research in Computer Science (ESORICS'96), September
1996, pp. 118 - 130.
9. Fuggetta A, Picco G, Vigna G, « Understanding Code Mobility », IEEE
Transactions on Software Engineering, vol. 24, n5, p. 342-361, 1998.
10. Guan H, Meng X, Zhang H, “A forward integrity and itinerary secrecy protocol
for mobile agents,” Wuhan University Journal of Natural Sciences, china,
vol.11, No.6, pp. 1727-1730, 2006.
____ Chapter 4: Adaptive architecture to protect mobile agents
Computer Applications 2006, pp 1228-12
23. Riordan, J, Schneier, B, “Environment key generation towards clueless
agents”. Lect. Notes Comput, 1998
24. Sander T, Tschudin C, “Protecting Mobile Agent against Malicious Hosts”,
G.Vigna (Ed.), Mobile Agents and Security, Lecture Notes in Computer
Science, Vol. 1419, ©Springer-Verlag Berlin Heidelberg, Berlin, 1998.
25. Yao M, Peng M, and Dawson E, “Using ‘Fair Forfeit’ to Prevent Truncation
Attacks on Mobile Agents,” in Proc. of the 10th Australasian Conference on
Information Security and Privacy ACISP 2005, Brisbane, Australia, pp.158-
169,2005
26. Yao M, Peng K, Matt Henricksen M, Foo E, and Dawson E, “Using
Recoverable Key Commitment to Defend Against Truncation Attacks in Mobile
Agents,” in Proc. of the 5th International Conference on E-Commerce and
Web Technologies (EC-Web 2004), volume 3182 of Lecture Notes in
Computer Science, pp.164-173. Springer-Verlag, 2004
27. Zachary J, “Protecting mobile code in the wild”. IEEE
Chapter 5
ABSTRACT
This paper presents our participatory expressive workshop and the information system
that supports it. Our aim in this research is to cultivate communication in local
communities. Expressing thoughts is a first step toward communicating with each other,
and creating new stories by remixing others' expressions helps people exchange and
grasp others' thoughts. We propose a workshop program and a model of content
circulation, and develop a system to realize them. Our system supports
decomposing and recomposing by automatic draft content generation. We
implemented the system for a workshop in which participants created content
in a format of expression named photo-attached acrostics. Through
observation of the practice, we concluded that our framework could help content
decomposition.
INTRODUCTION
With the rapid spread of mobile devices and the Web, we live in a world of
explosively increasing volumes of information. Through real-time content
publishing in daily life, it has become easier for people in different places to
communicate with each other, however far apart they may be. Despite this situation,
or perhaps because of it, communication in local communities is
neglected. People in Japanese urban areas often do not know their
neighbors well, much less what their neighbors are thinking. To address this,
we developed a participatory expressive workshop and a support system.
Our aim in this research is to provide opportunities for people in a local area to
discuss their area so that they understand the place and the community more
deeply and more broadly.
The first step of communication is to express one's thoughts. Remixing expressions
leads participants to exchange and understand others' opinions. Our workshop
and system are designed together for this purpose. We designed and developed
both the information system and the activities that use it. Our technological focus is
on a content circulation framework that includes creation (expressing) and reuse
(remixing).
____ Chapter 5: Communication through expressing and remixing: Workshop and System
usually new combinations of information are devised. In this research, we propose
a framework for recomposition of stored contents from a creativity support
research perspective.
In our framework, when a user produces new content, the system shows draft
contents that are automatically generated by remixing the user's and others'
content. The user finishes her content by selecting and modifying a draft.
Through this process, we aim to develop an iteratively growing loop of
expressions. In the loop, others' content is taken into a user's newly produced
content, and that content is in turn used in others' content.
In our research, we designed a new format of expression to emphasize a loop
of content creation and recomposition. The format, photo-attached acrostics,
contains pairs of pictures and sentences. This format is easy to take apart into
partial expressions. The details of the format will be shown in section 5.
The workshop we focus on in this research is a participatory, experiential,
group-work-based style of learning and creation. Workshops are held in various fields:
arts such as theatre, handicrafts, and music; citizen-participatory town planning;
and learning, such as company training and school classes.
A workshop is arranged and organized by a facilitator. The facilitator establishes
tasks and prepares a place. Participants work together on the tasks in that place.
The shared place and tasks help participants form opinions and produce expressions.
In some cases participants collaborate, and in others they compete.
Lave discussed the process of learning, creation, and consensus formation in a
group called a Community of Practice [5], where people share techniques, interests,
or concerns. Commitment to the Community of Practice is activated by roles
that participants are required to play, such as master and apprentice [6].
This theory, Legitimate Peripheral Participation, explains why participatory workshops
gain participants' active commitment. A person who plays a participant role is
asked to carry out tasks based on the program prepared by the facilitator.
Figure 1 illustrates the concept of our proposed workshop as a system. The
core elements of a workshop are participants, facilitator, information system(s),
tasks, and place.
RELATED WORK
Creativity Support
In the early 1990s, a research area called creativity support emerged.
It addressed questions such as how computers can support human creative activity
and what kinds of creative activity can be supported.
Boden distinguished two sorts of creativity: H-creativity, which indicates
historically new idea/concept formation, and P-creativity, psychologically new
idea/concept formation in human minds [7]. In our research, we aim to support
P-creativity rather than H-creativity. For ordinary people, our target users,
what they express (the externalization of nebulous internal thoughts) is more
important than how they express it (the superficial originality of expressive techniques).
In psychology, Guilford made the distinction between convergent and
divergent thinking [8]. Our approach does not specifically emphasize either, but if
we had to choose, it is closer to divergent thinking. One of our aims is to support
expressing, which seems to be a convergent process; but widening users' views and
freeing users from stuck thinking are more important.
Many creativity methods have been proposed, including the KJ method
[9] and brainstorming [10], and many computer systems that support such methods
have been developed [11].
knowledge management area, the SECI model [16] is widely known. SECI is the
abbreviation for Socialization, Externalization, Combination, and Internalization,
which are the processes of the knowledge cycle. Shneiderman categorized creative
activities into the following four: “collect,” “relate,” “create,” and “donate” [17].
Ohmukai et al. expanded Shneiderman’s model to distinguish an information activity
layer and a communication activity layer [18]. In their model, called the ICA
(Information and Communication Activities) model, two layers of information
activities (“collect,” “create” and “donate”) and communication activities (“relate,”
“collaborate” and “present”) form cycles related to each other.
Hori and his group developed a cycle model which consists of the knowledge
liquidization and crystallization processes [19]. They call the decomposition of
expressions into units of proper granularity, with every possible connection among
them, liquidization. They call the formation of new expressions from the
decomposed partial units, based on new relationships within a context, crystallization.
Our research is based on this concept [20]. In our proposed framework, a system
decomposes and recomposes the collected expressions of users.
CONTENT CIRCULATION FRAMEWORK
Figure 2 illustrates our proposed framework for content circulation. Contents
created by multiple users are stored in the database. The recomposing engine
decomposes stored contents and generates draft contents. Here we aim not to
create complete contents but to stimulate users. The support interface shows
drafts to a user, and she edits and finishes her content. These operations are
propagated to the recomposing engine, which then shows other drafts.
We show two levels of interaction loops here. Direct, local interaction
between users and the support system forms the editing and stimulating loop.
Remixing and reusing stored contents forms the indirect, global interaction loop.
This model is applicable to various manners of creating and publishing
content. For example, the writing process for papers or blog entries includes an
information collection phase and an editing phase. Of course authors need to add
their own original opinions, but candidate combinations of related information will
help their thinking. The format of expression can vary and is not limited to text.
While we expect the framework can be applied to any type of
content, we deliberately focus on text content in this research. Decomposition and
recomposition are realized by standard text processing techniques. We wanted to
focus not on implementation techniques but on the content circulation
framework itself. For that reason, we held a participatory workshop where
participants created their contents and recomposed them into new contents.
Participants are requested to place others’ (partial) expressions in their new
expressions. Our aim is for participants to form new opinions and ideas stimulated by
others. At the same time, the workshop facilitator shows other new remixed
acrostics using the developed information system described below.
SUPPORT SYSTEM
The system consists of four parts (Fig. 4): expression database, expression
input interface, expression recomposing engine, and expressing support interface.
It has the same structure as the framework illustrated in Fig. 2, but is modified to
highlight its dataflow.
Users input their works, which are created in a manual, analog manner in the
workshop. The expressing support interface shows draft expressions, which are
generated by the expression recomposing engine (see Figure 6).
The expression recomposing processes are as follows:
Decomposition phase
Analyze morphological structures of text.
Calculate term relation weights and term weights.
We use term dependency for term relation weights and term attractiveness for
\[ td(t, t') = \frac{\mathrm{sentences}(t \cap t')}{\mathrm{sentences}(t)} \qquad (1) \]
same time.
\[ wt(p) = \sum_{t' \in T_p,\; t' \neq t} td(t, t') \cdot attr(t') \qquad (3) \]
For each initial letter, the term candidates, their related terms, and attached photos
are structured.
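The weights in Equations (1) and (3) can be illustrated with a minimal sketch. It reads Equation (1) as the share of sentences containing t that also contain t′, and takes T_p to be the set of terms attached to photo p; both readings are assumptions. The attractiveness values `attr` are passed in as given numbers, and the sample sentences and values are invented for illustration.

```python
def td(t, t2, sentences):
    """Term dependency: share of sentences containing t that also contain t2."""
    with_t = [s for s in sentences if t in s]
    if not with_t:
        return 0.0
    return sum(1 for s in with_t if t2 in s) / len(with_t)


def wt(t, photo_terms, sentences, attr):
    """Weight of a photo for term t: sum of td * attr over the photo's other terms."""
    return sum(td(t, t2, sentences) * attr.get(t2, 0.0)
               for t2 in photo_terms if t2 != t)


# Each sentence is represented as the set of terms it contains (toy data).
sentences = [{"sea", "surf"}, {"sea", "train"}, {"surf", "beach"}]
attr = {"sea": 1.0, "surf": 0.5, "train": 0.25, "beach": 0.25}

print(td("sea", "surf", sentences))
print(wt("sea", {"sea", "surf", "train"}, sentences, attr))
```

Candidates can then be sorted by these weights so that the highest-weighted terms and photos appear at the top of each list.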
RESULTS AND DISCUSSIONS
The theme of our first practice was “Shonan,” the name of a region along a
coast in central Japan. We called for participation from people related to the
Shonan area, for example those living, working, or born there. Through the
workshop, participants were expected to discuss together and gain new opinions
about the area.
The workshop was held on 8 and 16 December 2007 in Fujisawa city, the
center of the Shonan area, with nine participants. Most of their occupations were
related to media activities or media literacy: information media-major students, an
elementary school teacher, an art university professor, members of a citizens’
television group in Shonan, and so on. The youngest was an undergraduate
student, and a retired person was also included. Three were female, and six were male.
The participants were divided into three groups, and in the end they made 30
photo-attached acrostics from 259 photos. Fig. 7 shows scenes from the workshop.
Through this workshop, we aimed for participants to exchange their knowledge and
get new ideas through collaboration and competition. Most of the works from the
later steps were created by remixing others’ earlier works. One participant,
however, did not change his mind to the end. He preferred creating by himself rather
than through collaboration. This shows that our method does not work for everyone,
which seems quite natural.
For the rest of the participants, we found that collaboration in the shared place
was effective. In the workshop, we prepared tasks that consisted of individual
creation and collaborative creation. We expected that participants would change
and expand their way of thinking through these tasks. In practice, the participants
changed their views even more frequently, within the process of creating a single
expression. One participant was not always thinking together with others during
group work: she thought about the task alone; then she and another member of her
group discussed their thoughts together; and then she thought by herself again.
In the first step, the facilitator selected photos and terms based on certain rules
so that we could observe the effects of the recomposing engine. As a result, a
content created by choosing the photos and terms with the highest weights
in expression (2) and expression (3) (shown at the top of each list of
candidates in the support interface) happened to have a story structure similar to
that of one participant’s content. We aimed to form a different context, but made a
similar story. The facilitator, however, could create many more expressions in much
less time. The outputs of the system were not always new, but the number of outputs
was large enough to stimulate the participants.
Figure 6. Screen image of photo-attached acrostic creation support interface
As the second step, we asked the participants to try the support interface after the
workshop and conducted interviews with them. While positive comments like “I
could easily create new acrostics” were heard, a problem was also pointed out: the
system shows candidates for each sentence separately, so connecting sentences
(making a story) is not supported enough.
CONCLUSION
In this paper, we introduced our participatory expressive workshop and the
information system that supports it. We aimed to cultivate citizens’ communication in
local communities through expressing and remixing. Based on our model of
content circulation, we developed the workshop program and the system.
Our first trial dealt with a particular type of expression. But as we already
mentioned, our framework is applicable to other types of expression. It is
especially suited to Web content creation. The Web can be a database from which a
system draws others’ expressions, and a place where people present the contents
they create. The loop of reusing and remixing is native to the Web. We are
planning to develop an application of our framework for blogging.
ACKNOWLEDGEMENT
This work has been supported by a grant from the Japan Science &
Technology Agency under CREST Project.
REFERENCES
Artificial Intelligence (AAAI-97), pp. 622-628, 1997.
15. R. Barzilay, K. R. McKeown, and Michael Elhadad, “Information fusion in
the context of multi-document summarization,” in Proceedings of the 37th
Association for Computational Linguistics, pp. 550-557, 1999.
16. Nonaka and H. Takeuchi, The Knowledge Creating Company, Oxford
University Press, 1995.
17. B. Shneiderman, Leonardo's Laptop: Human Needs and the New
Computing Technologies, MIT Press, 2002.
18. Ohmukai, H. Takeda, M. Hamasaki, K. Numa, and S. Adachi, “Metadata-
driven personal knowledge publishing,” in Proceedings of 3rd International
Semantic Web Conference 2004, pp. 591-604, 2004.
19. Hori, K. Nakakoji, Y. Yamamoto, and J. Ostwald, “Organic perspectives of
knowledge management: Knowledge evolution through a cycle of
knowledge liquidization and crystallization,” In Journal of Universal
Computer Science, Vol. 10, No. 3, 2004.
20. Numa, K. Tanaka, M. Akaishi, and K. Hori, “Activating expression life cycle
by automatic draft generation and interactive creation," in International
Workshop on Recommendation and Collaboration (ReColl'08), 2008.
21. Numa, K. Toriumi, K. Tanaka, M. Akaishi, and K. Hori, “Participatory
Workshop as a Creativity Support System,” in 12th International
Conference on Knowledge-Based and Intelligent Information &
Engineering Systems (KES2008), 2008.
Chapter 6
ABSTRACT
Today, because of advancements in the computer and electronic sciences,
everything is becoming automated. Some devices and infrastructures are
capable of changing their behavior according to the situation; these are called
Smart Devices or Smart Infrastructures. The system presented here is designed to meet
the requirement of appliance control in automated or smart infrastructures, which
include homes, offices, industries, and even sophisticated vehicles such as
aeroplanes. Appliance control refers to the process or technique of controlling a
device (including complete machines and mechanical, electronic, and
electrical devices) by comfortable and reliable means
based on some automation method.
Even though a number of standards have been defined for wired and wireless control
and automation of home appliances, including Bluetooth, UPnP, and X10, this field
is still developing. In this document we propose an appliance
control system, named Internet and PC Based Appliance Control (IPAC),
that uses concepts of parallel port programming.
IPAC is designed to control a device from a PC and from the Internet. It can be
applied in any smart infrastructure to automate devices and can work with
almost every type of automation method, whether wired (e.g. LAN) or wireless
(e.g. Bluetooth). The system can be applied in designing smart homes, secure
homes, centralized device control systems, Bluetooth control systems, and WAP
control systems.
Keywords: Home Networking, Smart Homes, Secure Homes, UPnP Devices, X10
protocol, WAP Devices, IR Devices, Bluetooth Devices
INTRODUCTION
Running additional wires through homes is costly and a hassle for consumers.
In order to counteract this problem, the industry is developing wireless and other
standards which will allow users to interconnect information devices without installation of
new wires.
____ Chapter 6: IPAC System For Controlling Devices over the Internet
Interface issues.
In smart home test beds, control interfaces have ranged from touch-screen devices to
PDAs. Data on the effectiveness of the various interfaces seems scarce.
These are some of the issues that affect the popularity of home networking.
We now turn our attention to home networking itself.
Imagine a completely networked home, in which every appliance can be
remotely managed [14] from anywhere on the Internet with a simple Web browser
[1][12][17][19]. The general goal of the automatic-home movement is to use
networking technology to integrate the devices, appliances and services found in
homes so that the entire domestic living space can be controlled centrally or
remotely [16].
The following figure shows a typical automated home [2].
Figure 1. A typical Automated Home System [2]
The home wiring that advanced home developers are installing typically adds
several thousand dollars to the cost of a new home. It is usually Ethernet or
coaxial cable, or some combination of both, with other technologies in the mix.
The network is designed to make possible the remote operation of appliances
connected to it.
Other technology developers are generating buzz in this area as well. In June
2008, at the Bluetooth World Congress, vendors were touting the expansion of
wireless networking technology into everything from air conditioners to cable
television boxes. "Bluetooth was originally developed as a wireless technology,
primarily for short-range exchange of data between laptops, PDAs and mobile
phones," said Nick Hun, managing director at TDK Systems, whose Blu2i adapters
are being used in such home applications. But, he noted, when early adapters
were released to industrial engineers at the end of 2002, demand soon proved
overwhelming.
Secure home
It is a smart home environment in which every device is automated
while sufficient security is maintained. For example, in the following figure the
home door is locked by a software lock (using a password) and can be opened only
by software methods.
Preliminaries
Direct Cable
In this method, devices are connected through a serial, parallel, or USB port.
Desktop software is generally also supplied to make device management a
comfortable and easy task.
Figure 4 Direct Cable Connection Method of Home Networking [8]
Bluetooth
This is a cross-device wireless standard created for cell phones and PDAs; it
can link up to eight devices.
Phone Line
Data shares the phone line frequency, and a phone jack is required wherever a
networked device is located. Special cards and drivers are also required.
Ethernet
Connections are made using a hub and a network card in each device. Ethernet
requires driver installation and wiring. It is more expensive, and hardware
conflicts are possible.
RELATED WORK
Related Technologies
UPnP (Universal Plug and Play) Devices
UPnP technology is a distributed,
open networking architecture that employs TCP/IP and other Internet technologies
to enable seamless proximity networking, in addition to control and data transfer
among networked devices in the home, office, and public spaces. Intel software for
UPnP technology helps hardware designers and software developers build easy
connectivity into common electronic devices [4] [6] [9].
X-10 Devices
X10 [7][21] is a communication protocol that allows compatible
products to talk to each other via the existing 110 V electrical wiring in the home.
Up to 256 different addresses are available, and each device usually
requires a unique address.
Infra-Red Device
IR data transmission is also employed in short-range communication among
computer peripherals and personal digital assistants. Remote controls and IrDA
devices use infrared light-emitting diodes (LEDs) to emit infrared radiation, which is
focused by a plastic lens into a narrow beam.
DESIGN OF IPAC
Organization of IPAC
Figure 7 shows the complete organizational structure of IPAC. Only
four devices are shown in the figure, but with the IPAC system we can control up to
128 devices (the reason will become clear in the next section).
This is basically similar to the INSTEON desktop software [2] and aims to make
device management a comfortable and easy task from the user’s point of view.
Server
This runs a web server so that the system can be accessed over the Internet.
The second important part of the server is the database, which stores information
about the status of the different devices.
Tracker
This component reads the status entries from the database and generates the proper
control word to be written to the PC port, which produces the proper signals. This
is one of the most important parts of the system.
In this structure we can address up to 128 devices because we use 7 bits
for addressing. The 128 available addresses (devices) are grouped into 16
groups (rooms). Table 1 shows these 16 groups and the corresponding device address
range.
Encoding
When converting a status record from the database into a control word, we first
find the binary equivalents of the room and the device separately. From the control
word structure it is clear that we have to left-shift the room bits by three positions
to place them correctly. We can then OR the ‘room bits’ and ‘device bits’ to
determine the absolute address. Similarly, we OR the absolute address
with (10000000)2 to set the status field to ‘1’ if the device is ON. If the
device is OFF there is no need to change the MSB, because it is ‘0’ by default.
Note that the address of the first room is ‘0000’, but
we call it ‘Room 1’ to keep room numbers in the natural domain. The same applies
to device addresses. So while encoding we decrease the room number
and device number by 1 before converting them to binary. This is shown in
the flow chart given below. When implementing the system in a high-level language
(HLL), note that decreasing the room number by 1, converting it to binary, and then
shifting the room bits by 3 bits is equivalent to multiplying the room number by 8
after decreasing it by 1. Similarly, ORing with (10000000)2 is equivalent to adding
128 in the decimal number system (only in this particular case, because the MSB of
the address is always 0).
Example
If room number is 5, device number is 8 and we want to set this device in ON
state.
Binary Method
(5-1)10=(4)10=(00000100)2 (room bits)
(8-1)10=(7)10=(00000111)2 (device bits)
Status =1
Shifting room bit left by 3-position we get (00100000)2
After ORing with device bits we get (00100111)2
Since the status is ON so we have to OR with (10000000)2
After ORing we get (10100111)2 which is the required
control word.
Denary Method
(5-1)10=(4)10 (corrected room no)
(8-1)10=(7)10 (corrected device no)
Status =1
Multiplying corrected room no by 8 we get 4X8=32
Adding with corrected device number we get 32+7=39
Since the status is ON so we have to add 128
After adding 128 we get (167)10 which is the required control word.
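The encoding rule and the worked example above can be expressed as a short sketch. The function names `encode` and `decode` are introduced here for illustration, and the limit of 8 devices per room is inferred from the 3-bit device field (128 addresses in 16 rooms); it is an assumption rather than something the text states directly.

```python
def encode(room, device, on):
    """Build the 8-bit control word: status (MSB), 4 room bits, 3 device bits."""
    if not 1 <= room <= 16 or not 1 <= device <= 8:
        raise ValueError("room must be 1..16 and device 1..8")
    address = ((room - 1) << 3) | (device - 1)   # 7-bit absolute address
    return address | 0x80 if on else address     # set the MSB when ON


def decode(word):
    """Recover (room, device, on) from a control word."""
    on = bool(word & 0x80)
    address = word & 0x7F
    return (address >> 3) + 1, (address & 0x07) + 1, on


word = encode(5, 8, True)
print(word, format(word, "08b"))   # 167 10100111, as in the example above
print(decode(word))                # (5, 8, True)
```

Using shifts and masks rather than multiplication and addition keeps the code close to the binary method; the two are equivalent here because the fields never overlap.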
Figure 9. Flowchart for Encoding the room number, device number and their
associated status into corresponding control word.
Poller
This part of IPAC keeps running as long as the system (server) is
ON (unless you exit the IPAC service). At regular intervals it
executes the tracker, so that any change made to a device
status is propagated to the corresponding device. Its function is to watch
(poll) the database continuously, which is why it is called the Poller.
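The interplay of Poller and Tracker can be sketched as follows. This is a minimal sketch under stated assumptions: the `devices` table layout, the `write_port` callback standing in for the parallel-port output, and the choice to send a word only when a status has changed are all hypothetical and introduced here for illustration.

```python
import sqlite3
import time


def control_word(room, device, on):
    """Status in the MSB, then 4 room bits and 3 device bits (1-based inputs)."""
    return ((room - 1) << 3) | (device - 1) | (0x80 if on else 0)


def tracker(db, write_port, last_sent):
    """Read the status rows and emit a control word for every changed device."""
    rows = db.execute("SELECT room, device, status FROM devices").fetchall()
    for room, device, status in rows:
        key = (room, device)
        if last_sent.get(key) != status:
            write_port(control_word(room, device, bool(status)))
            last_sent[key] = status


def poller(db, write_port, interval=1.0, cycles=None):
    """Run the tracker periodically; cycles=None keeps polling indefinitely."""
    last_sent = {}
    done = 0
    while cycles is None or done < cycles:
        tracker(db, write_port, last_sent)
        done += 1
        if cycles is None or done < cycles:
            time.sleep(interval)


# Demo with an in-memory database and a list standing in for the port.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE devices (room INTEGER, device INTEGER, status INTEGER)")
db.execute("INSERT INTO devices VALUES (5, 8, 1)")
sent = []
poller(db, sent.append, interval=0.0, cycles=2)
print(sent)   # [167]: the second cycle finds no change and sends nothing
```

In a real deployment `write_port` would be replaced by a platform-specific routine that writes the byte to the parallel port's data register.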
bits are taken from the 7 LSBs of the parallel port). The MSB from the parallel port
represents the data (the status, i.e. ON or OFF). With the addition of a relay of
proper rating we can connect any device. The relay passes the AC signal as long as
the device status associated with it is ON. To maintain the device state continuously
(even in a power failure condition) we have to attach a memory element that stores
the device status; we have used a flip-flop for this.
Electrical Appliances
These are the home appliances that we are going to control. The system controls
electrical devices, but with the inclusion of a transducer we can also control
mechanical or electromechanical devices.
SIMULATION
In Sections I, II, and III we laid out and designed our proposed
system, IPAC. In this section we discuss a particular simulation of IPAC to
analyze the results.
Simulation Requirements
The following table gives the detailed requirements for running the simulation of
IPAC:
Figure 11. Statuses Changing by Web Interface
The major benefit is the cost. The major cost factor is the PC, which
is generally present in every middle-income family. The other major cost
factor is the software, but this is a very long-term asset. The hardware
part (i.e. the Appliance Controller) is just a decoder and a set of flip-flops, so it
costs only about USD 10. The remainder is the relays, and the cost of a relay depends
on its rating, which in turn depends on the device to be controlled. Except for the
relays, all the cost factors are one-time, fixed investments and do not depend on
the number of devices we are going to control.
ACKNOWLEDGEMENT
The authors would like to thank Mr. M. Inamullah (Department of Computer
Engineering) and Mr. Izharuddin (Department of Computer Engineering) for their
help in completing this work.
REFERENCES
Chapter 7
ABSTRACT
Finding out, analyzing, documenting, and checking requirements are important
activities in all development approaches, including agile development. This
chapter discusses problems concerning the conduct of requirements
engineering activities in agile software development. We also suggest some
improvements to address challenges caused by agile requirements
engineering practices in large projects, such as properly identifying and handling
critical (including non-functional) requirements, documenting and managing
requirements documentation, and keeping agile teams in contact with outside
customers. Finally, the chapter discusses the requirements traceability problem in
agile software development and suggests some ideas to maintain the traceability
links between agile software artefacts to help developers to comprehend parts of
the system, and to keep the consistency among agile software artefacts during
refactoring.
____ Chapter 7: Requirements engineering and traceability in agile software development
INTRODUCTION
The agile approach is creating a stir in the software development community.
Agile methods are reactions to traditional ways of developing software and
acknowledge the need for an alternative to documentation driven, heavyweight
software development processes [1]. In the implementation of traditional methods,
work begins with the elicitation and documentation of a complete set of
requirements, followed by architectural and high-level design, development, and
inspection. Beginning in the 1990s, some practitioners found these initial
development steps frustrating and, perhaps, impossible [2]. The industry and
technology move too fast, requirements change at rates that swamp traditional
methods [3], and customers have become increasingly unable to definitively state
their needs up front while, at the same time, expecting more from their software.
As a result, several consultants have independently developed methods and
practices to respond to the inevitable change they were experiencing. These Agile
methods are actually a collection of different techniques (or practices) that share
the same values and basic principles. The Agile Manifesto states that it values
"individuals and interaction over processes and tools, working software over
comprehensive documentation, customer collaboration over contract negotiation,
and responding to changes over following a plan" [1].
Requirements Engineering (RE) is the process of establishing the services that
the customer requires from a system and the constraints under which it operates
and is developed. The main goal of an RE process is to create a system
requirements document for knowledge sharing, while Agile Development (AD)
methods focus on face-to-face communication between customers and agile
teams to reach a similar goal. Several research papers discuss the
relationship between RE and AD (e.g. [4], [5], [6], [7], [8], [9]). They explain some
RE practices in agile methods, compare these practices between agile and
traditional development systems, and examine the problems AD faces when
managing large projects and controlling critical requirements.
This chapter addresses the problem of how (user) requirements can be
captured and specified in the context of agile software development approaches. It
therefore tries to identify how standard RE techniques and processes can be
combined with agile practices and to find solutions to some of the difficulties
related to their work. In addition, this chapter discusses the traceability problem in
agile software development, since the current traceability between agile software
artifacts is ill-defined [10]. In particular, we discuss how to solve the traceability
problem by extracting some important information from software artifacts to
identify traceability links between them. We also discuss how these links can be
used to improve the decision-making process and help developers during the
refactoring process. Finally, the chapter proposes a set of guidelines for agile
requirements engineering.
The chapter is organized as follows. Section 2 sheds light on the benefits and
limitations of agile methodologies in the software development life cycle and
discusses some agile approaches from a requirements engineering perspective.
The agile RE activities are discussed in detail in Section 3, while Section 4 briefly
discusses how the requirement engineering process is performed in two agile
approaches. Section 5 addresses the requirements traceability. Section 6 gives
some guidelines and enhancements concerning the efficient application of RE
practices in AD. Finally, Section 7 summarizes our conclusions and future work.
• The highest priority is to satisfy the customer through early and continuous
delivery of valuable software.
• Business people and developers must work together daily throughout the
project.
• Build projects around motivated individuals. Give them the environment and
support they need, and trust them to get the job done.
• The most efficient and effective method of conveying information to and within
a development team is face-to-face conversation.
• At regular intervals, the team reflects on how to become more effective, then
tunes and adjusts its behavior accordingly.
changes in requirements and specifications, even late in the development process.
Through the use of multiple working iterations, the implementation of agile
methods allows the creation of quality, functional software with small teams and
limited resources. Proponents of traditional development methods criticize
agile methods for their lightweight documentation and their inability to fit
within the traditional work-flow. The main limitations of agile development are:
agile works well only for small to medium-sized teams; agile development
methods do not scale, i.e. because of the number of iterations involved it can be
difficult to understand the current project status; an agile approach
requires highly motivated and skilled individuals, who are not always
available; and the lack of written documentation in agile methods leads to
information loss when the code is actually implemented. However, with proper
implementation, agile methods can complement and benefit traditional
development methods. Furthermore, it should be noted that traditional
development methods applied in a non-iterative fashion are susceptible to late-stage
design breakage, while agile methodologies address this problem through
frequent incremental builds that accommodate changing requirements. In the
following, some common agile methods are briefly discussed from the
requirements engineering perspective.
Agile Modeling
Agile Modeling (AM) is a new approach for performing modeling activities [12].
It gives developers guidelines on how to build models, using an agile
philosophy as its backbone, that resolve design problems and support
documentation purposes without over-building these models. The aim is to keep the
number of models and the amount of documentation as low as possible. RE techniques are
not explicitly referred to in AM, but some AM practices support some RE
techniques like brainstorming.
Feature-Driven Development
Feature-Driven Development (FDD) consists of a minimalist, five-step process
that focuses on the design and building phases [13]. Each step is defined with entry
and exit criteria: developing an overall model, building a features list, and then
planning-by-feature followed by iterative design-by-feature and build-by-feature
steps. In the first phase, the overall domain
model is developed by domain experts and developers. The overall model consists
of class diagrams with classes, relationships, methods, and attributes. The
methods express functionality and are the basis for building a feature list. A feature
in FDD is a client-valued function. The feature list is prioritized by the team and
reviewed by domain members [14]. FDD proposes a weekly 30-minute
meeting in which the status of the features is discussed and a report about
the meeting is written.
Dynamic Systems Development Method
Dynamic Systems Development Method (DSDM) was developed in the U.K. in
the mid-1990s. It is an outgrowth of, and extension to, Rapid Application
Development (RAD) practices [15]. The first two phases of DSDM are the
feasibility study and the business study. During these two phases the base
requirements are elicited. Further requirements are elicited during the
development process. DSDM does not insist on certain techniques. Thus, any RE
technique can be used during the development process [7]. DSDM’s nine
principles include active user involvement, frequent delivery, team decision
making, integrated testing throughout the project life cycle, and reversible changes
in development.
Extreme Programming
Extreme Programming (XP) is the best known of the agile approaches.
It is based on the values of simplicity, communication, feedback, and courage [6]. XP
aims at enabling successful software development despite vague or constantly
changing software requirements. XP relies on the way its individual practices
are combined and aligned to work with each other. Some of the main practices
of XP are short iterations with small releases and rapid feedback, close customer
participation, constant communication and coordination, continuous refactoring,
continuous integration and testing, and pair programming [17].
Scrum
Scrum is an empirical approach based on flexibility, adaptability and
productivity [18]. Scrum leaves developers free to choose the specific
software development techniques, methods, and practices for the implementation
process. Scrum has been in use for nearly ten years and has been used to
successfully deliver a wide range of products.
Feasibility Study
The Feasibility Study gives an overview of the target system and decides
whether or not the proposed system is worthwhile. The input of the feasibility study
is an outline description of the system and how it will be used within an organization.
The result should be a short report that recommends whether or not it is worth
carrying on with the RE and AD process. Initially, all relevant
stakeholders have to be identified; in other words, the right customers, who are
related to the development of the system and are affected by its success or failure,
must be selected. Then a brainstorming session takes place to share
knowledge and ideas between agile teams and "ideal" customers and to answer a number
of questions like:
Does the system contribute to the high level objectives and the critical requirements
of the organization? In a first step, the high level goals and critical requirements
(functional and non-functional requirements) for the system are defined upfront in order to
determine the scope of the system. These requirements describe the expected business
values to the customer.
Is your organization ready for the AD? Each agile method has its own characteristics
and practices that will change the daily work of the organization. Before an organization
selects one of them, it should consider whether or not it is ready for agile development. In
fact, this question is very important and many researchers have tried to answer it (e.g. [11], [20]).
For example, Ambler [11] discusses some success factors, and questions to be
answered, that affect the successful adoption of agile methods.
Can the system be implemented within the given budget? Some contracts do not allow for
changing requirements. "The requirements must be complete before a contract can be
made, which is often found in fixed-priced projects" [5]. In agile projects, where changing
requirements are welcome, contracts are often based on time and expenses and not on
a fixed-price scope. Hence, agile methods use scope-variable price contracts [21]. This
means that the features actually implemented in the system, and its cost, evolve as well.
Therefore, requirements are not specified in detail at contract level but defined step by
step during the project through a negotiation process between the customer and the
development team [8].
How to integrate the agile activities with traditional organizational activities already
in place? Some researchers suggest tentative models for integrating agile activities with
traditional organizational activities by transferring knowledge from one process to
another, and describe how the traditional team should adapt its activities to suit the mechanisms of
agile teams [22][23].
Requirements Elicitation
In this activity, agile teams work with stakeholders to find out about the
application domain, the services that the system should provide, the system's
operational constraints, and the required performance of the system (non-functional
requirements). The most important techniques used for requirements
elicitation in AD are:
Interviews: "Interviewing is a method for discovering facts and opinions held by
potential stakeholders of the system under development'' [6]. There are two types
of interviews: closed interviews, where a predefined set of questions is
answered, and open interviews, where there is no predefined agenda and a
range of issues is explored with stakeholders. Interviews are good for
getting an overall understanding of what stakeholders do and how they might
interact with the system, but they are not good for understanding domain
requirements. All agile methods say that interviews are an efficient way to
communicate with customers and to increase trust between the two sides.
Brainstorming: this is a group technique for generating new, useful ideas, and
promoting creative thinking. Brainstorming can be used to elicit new ideas and
features for the application, define what project or problem to work on and to
diagnose problems in a short time. The project manager plays an important role in
brainstorming. He/she determines the time of the creative session, makes sure that
discussions about certain topics do not escalate, and makes sure
that everybody expresses his/her opinion freely. After the creative session has
ended, the topics are evaluated by the team. Also, the connections and
dependencies between the discussed ideas are represented by (for example)
graph visualization, so the conflicts with other requirements are found and
evaluated.
Ethnography: this is an observational technique that can be used to understand
social and organizational requirements [24]. In agile development ethnography is
particularly effective at discovering two types of requirements: the first refers to
requirements that are derived from the way in which people actually work rather
than the way in which process definitions say they ought to work, and the second
refers to requirements that are derived from cooperation and awareness of other
people's activities. Ethnography is not a complete approach to elicitation and it
should be used with other approaches such as use case analysis [19][24].
Use Case analysis: this is a scenario based technique used in UML-based
development which identifies the actors involved in an interaction and describes
the interaction itself. A set of use cases should describe possible interactions that
will be presented in the system requirements; each use case represents a user-
oriented view of one or more functional requirements of the system [24].
Requirements Analysis
The main task here is to determine whether the elicited requirements are unclear,
incomplete, ambiguous or contradictory, and then resolve these issues. Conflicts
in requirements are resolved through prioritization and negotiation with stakeholders.
The main techniques used for requirements analysis in agile approaches are:
Joint Application Development (JAD): this is a workshop used to collect
business requirements while developing a system. The JAD sessions also include
approaches for enhancing user participation, expediting development, and
improving the quality of specifications [24]. In an agile environment, when
stakeholders' requirements conflict, JAD promotes
the use of a professional facilitator who can help to resolve the conflicts. In addition,
the JAD sessions encourage customer involvement and trust in the developed
system.
Modeling: system models are an important bridge between the analysis and the
design process [6]. In an agile environment the pen board (or pin board) is
divided into three sections: models to be implemented, models under
implementation, and models completed. This layout provides a visual
representation of the project status [8]. These models must be documented and
not thrown away.
Prioritization: agile methods specify that the requirements should be treated
like a prioritized stack. The features are prioritized by the customers based
on their business value, and the agile teams estimate the time required to
implement each requirement. The agile team must distinguish "must
have" requirements from "nice to have" requirements; this can be done through
frequent communication with the customers. Fig. 3 shows the requirements
prioritization process: at the beginning of each iteration, there is a requirements
collection and prioritization activity, during which new requirements are identified
and prioritized. This approach helps to identify the most important features inside
the ongoing project. Typically, if a requirement is very important it is scheduled for
the implementation in the upcoming iteration; otherwise it is kept on hold. At the
following iteration, the requirements on hold are evaluated and, if they are still
valid, they are included in the list of the candidate requirements together with the
new ones. Then, the new list is prioritized to identify the features that will be
implemented; if a requirement is not important enough, it is kept on hold
indefinitely [8].
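The prioritization cycle described above can be illustrated with a minimal Python sketch. It is not taken from [8]: the Requirement class, the numeric business_value field, the still_valid flag, and the fixed capacity per iteration are all assumptions introduced here to make the loop concrete.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    business_value: int       # assumed: assigned by the customer, higher = more important
    still_valid: bool = True  # assumed: re-checked with the customer at each iteration


def plan_iteration(new_reqs, on_hold, capacity):
    """Return (scheduled, on_hold) for the upcoming iteration.

    capacity is a hypothetical simplification: the number of requirements
    the team expects to implement in one iteration.
    """
    # Requirements kept on hold are re-evaluated; those no longer valid are dropped.
    candidates = [r for r in on_hold if r.still_valid] + list(new_reqs)
    # The combined candidate list is prioritized by customer business value.
    ranked = sorted(candidates, key=lambda r: r.business_value, reverse=True)
    # The most important requirements are scheduled; the rest stay on hold.
    return ranked[:capacity], ranked[capacity:]
```

Calling plan_iteration again at the start of the next iteration, with the returned on-hold list and the newly collected requirements, reproduces the repeated collection and prioritization activity.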
Requirements Documentation
The purpose of requirements documentation is to communicate requirements
(or knowledge sharing) between stakeholders and agile teams. In fact, no formal
requirements specification is produced in agile development methods since agile
focuses on minimal documentation. The features and the requirements are
recorded on story boards, index cards, and paper prototypes like use cases and
data flow diagrams.
The lack of documentation might cause long-term problems for agile teams [6].
In the following, we suggest some recommendations for agile development teams
to help them in managing and implementing large projects and projects with critical
requirements:
The agile team leader assigns two or three members to produce
documentation in parallel with development. These two (or three)
members will be responsible for handling requirements (functional and non-functional),
and for writing, reviewing, and maintaining documentation
consistent with development. Furthermore, efficient practices like peer interviews
will help to ensure the accuracy and quality of the documentation. Only
two or three members are chosen because resources are limited and the other
members must adhere to the agile manifesto value of producing working software rather
than documentation. In addition, we cannot have just one person doing it,
because that violates one of the agile manifesto principles: "Business people and
developers must work together daily throughout the project" [1].
Using computer-based tools like UML modeling and project management tools
to specify a high-level description of the project, and to document certain practices
and requirements used in agile projects in an electronic format.
Developing a reverse engineering process [25] applicable to agile
projects, so that the code can be reverse engineered to produce
documentation using, for example, UML modeling tools.
Requirements Validation
The goal of requirements validation is to ensure that requirements actually
define the system which the customer wants. The requirements validation checks
the consistency, completeness and realism of requirements. The main practices
used for requirements validation in agile approaches are:
Requirements reviews: this is a manual process that involves multiple readers from
both the agile team and the stakeholders checking the requirements against current
organizational standards and organizational knowledge for anomalies and
omissions. In agile projects the requirements reviews must be formal reviews: that is,
the agile team should walk through each
requirement with the customers; conflicts, errors, extras, and omissions in the requirements should be
formally recorded.
Write tests first: In agile development, testing is also a method for requirements
validation and therefore also part of requirements engineering. In some agile
methods like XP, the requirements are implemented and tested using the Test
Driven Development (TDD) approach. By applying this technique developers
create tests before writing code. The developed code is then refactored as needed
to improve its structure [26]. TDD supports evolutionary development and
promotes the development of high quality code. The requirement from which the
test case was created is now presented in a form in which it can be completely
validated, in the sense that it can be determined automatically (after each iteration)
whether a requirement is implemented by the software or not. This makes the
developers aware of the progress of the project and the state of the current
iteration of the project. Also, it supports the refactoring process to get an improved
design with reduced coupling and strong cohesion [27]. A common misconception is
that all tests are written prior to implementing the code [7]. Rather, TDD consists of
short iterations which provide rapid feedback. Code refactoring and unit tests
ensure that emerging code is simpler and more readable. In fact, unit tests can be
considered as a live and up-to-date documentation: they represent an excellent
repository for developers trying to understand the system, since they show how
parts of a system are executed.
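As a small illustration of the write-tests-first idea, the following Python sketch uses the standard unittest module. The user story ("convert a temperature from Celsius to Fahrenheit"), the Converter class, and the helper run_story_tests are hypothetical and chosen only for this example; in a TDD cycle the two tests would be written first, would fail, and the method body would then be added to make them pass.

```python
import unittest


class Converter:
    """Hypothetical class under test, implemented only after the tests below existed."""

    def celsius_to_fahrenheit(self, celsius):
        return celsius * 9 / 5 + 32


class ConverterTest(unittest.TestCase):
    """Executable form of the assumed user story; doubles as up-to-date documentation."""

    def test_freezing_point(self):
        self.assertEqual(Converter().celsius_to_fahrenheit(0), 32)

    def test_boiling_point(self):
        self.assertEqual(Converter().celsius_to_fahrenheit(100), 212)


def run_story_tests():
    """Run the story's tests and report whether the requirement is currently satisfied."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConverterTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Running run_story_tests after each iteration, or after each refactoring step, gives the automatic check on whether the requirement is implemented that the text describes.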
Evolutionary prototyping: a prototype is an initial version of the system.
Evolutionary prototyping starts with a relatively simple system which implements
the most important customer requirements which are best understood and which
have the highest priority. The system prototypes allow customers to experiment to
see how the system supports their work (requirements elicitation), and may reveal
errors and omissions in the requirements which have been proposed (requirements
validation). As shown in Fig. 4, the main objective of evolutionary prototyping in
AD is to deliver a working system to customers by focusing on customer
interaction [24].
Acceptance testing: acceptance testing is formal testing conducted by the
customer to ensure that the system satisfies the contractual acceptance criteria.
The acceptance tests are no different from the automated system tests, but they
are performed by the customer. Delivering working software to the customer is a
fundamental agile principle and hence the customers create acceptance criteria for
the requirements and test the requirements against these criteria. Because AD is an
incremental process, the customers can give feedback to the developers to
enhance the development of future increments of the system. However, as a
general problem there are often no formal acceptance tests for non-functional
requirements.
Requirements Management
Understanding and controlling changes to system requirements take place in
this activity. In order for requirements management tools to work efficiently, they
must be able to store requirements, prioritize requirements, track requirement
changes and development progress, and provide a level of requirements
traceability [29], [30].
In agile projects, managers have to create and maintain a framework for the
interaction between the agile teams and the stakeholders, by (i) identifying the
ideal people who can be members of agile teams and ideal customers who can
answer all the developers’ questions correctly, (ii) strengthening the collaboration,
and (iii) negotiating contracts with the customers [8].
We believe that agile methods can play an important role in the management
of large projects. The decomposition of the larger parts of the project into smaller
components, called sub-components, lends itself to the employment of more agile
teams. These agile teams can work in other time zones and other countries
provided that frequent communications and self organization are established. Agile
teams working in parallel on sub-components allow for quick development and
early design. An early design leads to an early review. Consequently, the iterative
schedule and emphasis on delivering the product allows the agile teams to assess
the successes and shortcomings, and plan for the next iteration. Once a specific
agile team has successfully completed a sub-component, the team is available to
work on another component or sub-component. Each of these smaller agile teams
will still be responsible for assigning two members to complete the previously
described documentation which is necessary to satisfy the other stakeholders.
Agile teams should use modern communications like web-based shared team
projects and instant messaging tools. These tools are useful to keep in touch with
the customer and other agile teams in order to discuss requirements when they
are not on-site.
In Section 5 (Requirements Traceability) we discuss requirements
traceability as one of the important aspects of requirements management.
factor for the project success. There is no difference between different agile
methods in the requirements elicitation phase, since they rely on face-to-face
communication between the ideal customers and the agile team. The ideal customers
describe the system requirements as user stories in XP, while in Scrum the
product backlog is formulated to include all described features and requirements.
Then, the analysis of requirements depends on the requirements prioritization
process that prioritizes the requirements according to their importance for the
customer. In fact, all agile methods are based on the requirements prioritization to
implement the most important requirements first. In addition, frequent delivery of
working software allows better understanding and analysis of requirements.
The requirements documentation activity in agile development depends on
face-to-face communication and software source code as a good resource for
knowledge sharing since agile development focuses on minimal documentation.
The features and the requirements in XP are recorded on story boards, index
cards, and paper prototypes. To ensure that the requirements and features
captured are a valid representation of the required system, in AD, frequent
meetings between customers and agile teams can be scheduled. Also, the
customer can run acceptance tests to ensure that delivered functions actually
define the system which he/she wants. In XP, the developers can use the TDD
cycle to validate their work frequently and to refactor their code as needed.
In XP, short increments and incremental planning techniques are used to
manage changes in requirements. A change in requirements may result in
adding and/or deleting user stories. Scrum provides a project management
framework that focuses development into 30-day Sprint cycles in which a specified
set of backlog features are delivered. The core practice in Scrum is the use of
daily 15-minute team meetings for coordination and integration.
REQUIREMENTS TRACEABILITY
Requirements traceability refers to "the ability to describe and follow the life of
a requirement, in both a forwards and backwards direction'' [31]. In another
to the artifacts created during the system development life cycle based on these
requirements [32]. Traceability can provide important insights into system
development and evolution assisting in both top-down and bottom-up program
comprehension, impact analysis, and reuse of existing software, thus giving
essential support in understanding the relationships existing within and across
software requirements, design, and implementation [32], [33]. The importance of
maintaining traceability links is confirmed on one side by the support provided by
many CASE tools (see for instance Rational RequisitePro, footnote 1) and on the other side
by numerous standards, such as the ISO 15504, CMMI, and IEEE 1219-1998 [34],
that consider requirement traceability as a "best practice".
Traceability is an important part in traditional software development but it is not
a standard practice for the agile methods. However, maintaining traceability links
between the artefacts produced can also provide important insight in an agile
development environment. In particular, by tracing user stories to their test cases and
back, we have a way to validate that a user story is implemented and tested.
Maintaining the dependency links among agile software artefacts
becomes an essential part of maintenance tasks. The artefacts produced during an
agile development process (i.e., requirements, acceptance tests, unit tests and
code) generally change at the same time. Thus, having traceability links between
the produced artefacts allows high-level documents, and thus abstract
concepts, to be mapped to low-level artefacts. This clearly improves software maintainability:
once a maintainer has identified the high-level document (e.g., requirement)
related to the feature to be changed, traceability helps to locate the code to be
maintained. This will also help the testers to see what they need to change when a
user story is changed or removed and which user stories they need to test.
In addition, the traceability links are useful to validate that the system is
implemented correctly and give the customer some form of certification that we
have tested the system.
Footnote 1: http://www-01.ibm.com/software/awdtools/reqpro.
Dependencies (links) between unit tests and classes under test can help
developers to comprehend the code as the unit test explicitly indicates what the
expected behavior of a class should be for typical usage scenarios. Moreover, the
traceability links between unit tests and related code can be also exploited to
maintain consistency during the refactoring process. When refactoring, the developer
must ensure that all unit tests continue to pass, so unit tests might need to be
refactored together with the source code [35]. Indeed, refactoring of the code
should be followed by refactoring of the tests [36]. Many of these dependent test
refactorings could be automated, or at least made easier, if the exact relationships
between the unit tests and the corresponding classes under test were known.
There are many techniques that have been presented to support traceability
management. These techniques have been intended to work with traditional
software development methodologies and therefore designed under the
assumption that a formal requirements process is in place, but in agile software
development the situation is different because the main development artifact is the
source code. As a result, many researchers tried to find solutions for this challenge
[37], [38]. For example, in [37], the Echo tool-based approach is proposed to enable
the scalability of agile requirements elicitation practice. Echo provides a
mechanism that allows for flexible and dynamic creation of content as well as the
supporting traceability structure. However, the Echo tool does not support a multi-user
environment that would enable distributed collaboration. In addition, other agile practices
like TDD are not supported by this tool. Moreover, in [38] the traceability patterns
framework is proposed as a solution to requirement component traceability in agile
software development, depending on the structure of the source code. However, this
framework lacks practical evaluation in real-world industrial systems.
Thus, the support for traceability in contemporary software engineering
environments and tools is not satisfactory. As a result, traceability links between
agile software artefacts are not explicitly maintained [39], [10], and
dependencies between different artefacts have to be manually identified when
needed. During the comprehension of existing software systems,
software engineers are therefore required to spend a large amount of effort on synthesizing
and integrating information from various sources to establish links among these
artifacts. This consideration calls for (semi)automatic approaches supporting the
developer during the identification of links between software artefacts.
The artefacts produced during software development in agile environment are
usually user stories, unit tests, and code classes. Links between user stories and
unit tests and between unit tests and code classes are enough to support all the
activities described above.
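To make the two kinds of links concrete, the following Python sketch stores story-to-test and test-to-class links and follows them in both directions. The TraceLinks class and the artefact identifiers (such as "story:login") are hypothetical names introduced for this illustration, not part of any tool cited in this chapter.

```python
from collections import defaultdict


class TraceLinks:
    """Minimal store for links such as user story -> unit test -> code class."""

    def __init__(self):
        self._forward = defaultdict(set)   # source artefact -> derived artefacts
        self._backward = defaultdict(set)  # derived artefact -> source artefacts

    def link(self, source, target):
        self._forward[source].add(target)
        self._backward[target].add(source)

    @staticmethod
    def _reach(start, edges):
        # Follow links transitively (story -> test -> class, or the reverse).
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def forward(self, artefact):
        """Everything derived from an artefact, e.g. tests and classes of a story."""
        return self._reach(artefact, self._forward)

    def backward(self, artefact):
        """Everything an artefact traces back to, e.g. tests and stories of a class."""
        return self._reach(artefact, self._backward)
```

With such a structure, forward("story:login") would list the tests and classes to revisit when that story changes, and backward on a class would list the tests to re-run and the stories potentially affected when the class is refactored.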
As for the identification of links between user stories and unit tests, approaches
based on Information Retrieval (IR) [40] techniques could be exploited to support
such a task. The rationale behind such approaches is that user stories are
text based and that programmers use meaningful domain terms to define source
code identifiers in their unit tests. Indeed, IR-based methods propose a list of
candidate traceability links on the basis of the similarity between the text contained
in the software artefacts. Such methods are based on the conjecture that two
artefacts with high textual similarity share several concepts, and thus are good
candidates to be traced onto each other.
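As an illustration of the IR-based idea, the following minimal sketch ranks candidate links between user stories and unit tests by the cosine similarity of their TF-IDF vectors. It is not the method of any cited work: the tokenizer, the smoothed IDF weighting, the threshold value and the sample stories and tests are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split camelCase identifiers, lower-case, and keep alphabetic words."""
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one {term: weight} dictionary per document (smoothed IDF, an assumption)."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter(term for c in counts for term in c)
    return [{t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in c.items()}
            for c in counts]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def candidate_links(stories, tests, threshold=0.1):
    """Return (story, test, similarity) triples above the threshold, best first."""
    s_names, t_names = list(stories), list(tests)
    vecs = tfidf_vectors([stories[s] for s in s_names] + [tests[t] for t in t_names])
    s_vecs, t_vecs = vecs[:len(s_names)], vecs[len(s_names):]
    links = [(s, t, cosine(sv, tv))
             for s, sv in zip(s_names, s_vecs)
             for t, tv in zip(t_names, t_vecs)]
    return sorted((l for l in links if l[2] >= threshold), key=lambda l: -l[2])

# Hypothetical artefacts, invented for the example.
stories = {
    "US1": "As a customer I want to convert an amount between currencies",
    "US2": "As an admin I want to delete a user account",
}
tests = {
    "ConverterTest": "testConvertAmount assertEquals converter convertCurrency amount",
    "AccountTest": "testDeleteUserAccount assertTrue admin deleteAccount user",
}
links = candidate_links(stories, tests)
for story, test, sim in links:
    print(story, test, round(sim, 2))
```

The ranked list is only a set of candidates: a developer still has to confirm or reject each proposed link.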
In the traceability community, promising results have been achieved by applying IR
techniques to recover links between source code and free text documentation.
In particular, IR methods have been used to recover traceability links between manual
pages and source code [33], [41]; between requirements [42]; between several
other types of artefacts [30]; and between unit tests and units under test [10].
However, to the best of our knowledge, no empirical study has been carried out to
evaluate the support given by IR methods in recovering links between user stories
and unit tests.
The scenario is quite different when considering the links between unit tests
and tested classes. In this case, links have to be recovered between structured
artefacts (i.e., source code). Thus, heuristic-based approaches can be exploited
[43]. Some guidelines and naming conventions that describe the testing
environment have been proposed to facilitate the identification of these links. For
example, approaches based on naming conventions identify the tested classes by
analyzing the name of the unit test. Usually, the name of a unit test is formed from
____ Chapter 7: Requirements engineering and traceability in agile software development
the name of the tested class followed or preceded by the word "Test". For
instance, the class Converter is tested by the unit test "ConverterTest" (or
"TestConverter"). Although this approach is very simple, it assumes a one-to-one
relationship between unit test and tested class, which does not always hold in real
programming practice [44]. In addition, not all developers follow the predefined
naming conventions when naming their unit tests.
In this context, other heuristic-based approaches should be used. Bruntink et
al. [45] show that classes which depend on other classes require more test code
and thus are more difficult to test than independent classes. In order to improve
the testability of complex classes, they suggest using "cascaded test suites",
where a test of a complex class can use the tests of its required classes to set up
the complex test scenario. Source code analysis techniques [46], [47] can be used
to detect and capture dependencies between the unit tests and the related source
code. As an example, backward program slicing [48] uses intraprocedural or
interprocedural control and data flow information to identify the classes that
directly or indirectly affect the computation of the results of the assert statements,
i.e., the statements used to compare the actual outcome with the expected
outcome, in the unit tests.
• Agile Projects Contracts: at the beginning, the most critical requirements are
expressed by the stakeholders as well as they can, so that experienced
project leaders can determine an initial cost for the agile project and estimate the
cost of later changes.
• Frequent Releases: frequently delivering parts of the system makes it
possible to release expected results to the customers faster in order to get
feedback from them. Hence, the requirements are implemented in an iterative
and incremental fashion.
• Requirements Elicitation Language: use linguistic methods for requirements
elicitation, derived from Natural Language Processing (NLP) [6]. In other
words, requirements are collected using the language of the customer, not a
formal language for requirements specification.
• Non-Functional Requirements (NFR): in agile approaches, the handling of NFR
is ill defined [7]. We propose that customers and agile team leaders arrange
meetings to discuss NFR (and all critical requirements) in the earliest
stages. Once the initial NFR of a project have been identified and documented,
the agile teams can begin development.
• Smaller agile teams are flexible: smaller agile teams allow continuous and
efficient communication between them and the stakeholders, and keep
requirements changes under control. Fig. 5 shows that the smaller the agile
teams are, the higher the chances of project success [12].
• Evolutionary requirements: RE in agile methods accommodates changing
requirements even late in the development cycle, but changes to the
requirements must wait until the end of the current iteration. Therefore, agile
development does not spend much time on initial requirements elicitation.
This ensures that iterations are consistent with
expectations and that the development process remains organized.
related code should be identified and evolved to control co-changes. In this
way, once the code is refactored, the agile team is able to rebuild the
traceability matrix and determine which test cases need to be
re-run. In particular, the focus should be on identifying the traceability
links added or deleted by the refactoring. If the traceability
links between source code and the related unit tests are broken during
refactoring, this may be treated as a warning that the code and/or the unit tests need
review [35]. Traceability information between requirements, source code and
unit tests can also be used to drive software development, by identifying
requirements for which unit tests and/or source code have not been
implemented yet. In addition, traceability information can be used to support
refactoring.
ACKNOWLEDGEMENT
We would like to thank Mr. Avishek Shrestha for his help, valuable ideas,
and various references.
REFERENCES
1. K. Beck, M. Beedle, A. van Bennekum, A. Cockburn, W. Cunningham, M.
Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries, J. Kern, B. Marick, R.
C. Martin, S. Mellor, K. Schwaber, J. Sutherland, and D. Thomas: Manifesto
for agile software development., [Online]. Available:
http://www.agilemanifesto.org/ .
2. J. Highsmith: Agile software development ecosystems., Boston, MA, USA:
Addison-Wesley Longman Publishing Co., Inc., (2002).
3. C. J. Highsmith, K. Orr: Extreme programming., in Proceedings of E-Business
Application Delivery, pp. 4–17 (2000).
4. A. Eberlein and J. C. S. do Prado Leite: Agile requirements definition: A view
from requirements engineering, in Proceedings of the International Workshop
on Time-Constrained Requirements Engineering (TCRE’02) (2002).
5. R. Goetz and R. C. Endeavour: How agile processes can help in time-
constrained requirements engineering, in Proceedings of the International
Workshop on Time-Constrained Requirements Engineering (TCRE’02) (2002).
6. F. Paetsch, A. Eberlein, and F. Maurer: Requirements engineering and agile
software development, in Proceedings of the Twelfth International Workshop
on Enabling Technologies (WETICE’03). Washington, DC, USA: IEEE
Computer Society, p. 308 (2003).
7. L. Cao and B. Ramesh: Agile requirements engineering practices: An
empirical study., IEEE Softw., vol. 25, no. 1, pp. 60–67 (2008).
8. A. Sillitti and G. Succi: Requirements engineering for agile methods., in
Engineering and Managing Software Requirements. Springer Verlag, pp. 309–
326 (2005).
9. S. Bose, M. Kurhekar, and J. Ghoshal: Agile methodology in requirements
engineering., [Online]. Available: http://www.infosys.com/research/publica-
tions/agile-requirementsengineering.pdf.
23. O. Salo and P. Abrahamsson: Integrating agile software development and
software process improvement: a longitudinal case study., in ISESE, pp. 193–
202 (2005).
24. I. Sommerville and P. Sawyer: Requirements Engineering: A Good Practice
Guide., New York, NY, USA: John Wiley & Sons, Inc., (2000).
25. E. J. Chikofsky and J. H. Cross II: Reverse engineering and design recovery: A
taxonomy., IEEE Software, vol. 7, no. 1, pp. 13–17 (1990).
26. M. Fowler: Refactoring: Improving the Design of Existing Code., Boston, MA,
USA: Addison-Wesley, (1999).
27. K. Beck and M. Fowler: Planning Extreme Programming., Boston, MA, USA:
Addison-Wesley Longman Publishing Co., Inc., (2000).
28. B. W. Boehm: Verifying and validating software requirements and design
specifications., IEEE Softw., vol. 1, no. 1, pp. 75–88 (1984).
29. L. Delgadillo and O. Gotel: Story-wall: A concept for lightweight requirements
management., in RE, pp. 377–378 (2007).
30. A. D. Lucia, F. Fasano, R. Oliveto, and G. Tortora: Recovering traceability
links in software artifact management systems using information retrieval
methods., ACM Transactions on Software Engineering and Methodology, vol.
16, no. 4, p. 13 (2007).
31. O. Gotel and A. Finkelstein: An analysis of the requirements traceability
problem., in Proc. of 1st International Conference on Requirements
Engineering. Colorado Springs, Colorado, USA: IEEE CS Press, pp. 94–101
(1994).
32. B. Ramesh and M. Jarke: Toward reference models for requirements
traceability., IEEE Transactions of Software Engineering., vol. 27, no. 1, pp.
58–93 (2001).
33. G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo: Recovering
traceability links between code and documentation., IEEE Transactions on
Software Engineering, vol. 28, no. 10, pp. 970–983 (2002).
34. IEEE: The Institute of Electrical and Electronics Engineers, Inc.: IEEE Std
1219-1998: IEEE Standard for Software Maintenance (1998).
43. A. De Lucia, F. Fasano, and R. Oliveto: Traceability management for impact
analysis., in Proceedings of Frontiers of Software Maintenance. Beijing, China:
IEEE Press, pp. 21–30 (2008).
44. R. V. Binder: Testing object-oriented systems: models, patterns, and tools.,
Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., (1999).
45. M. Bruntink and A. v. Deursen: Predicting class testability using object-
oriented metrics., in Proceedings of the 4th IEEE International Workshop
Source Code Analysis and Manipulation. Montreal, Canada: IEEE Computer
Society, pp. 136–145 (2004).
46. D. Binkley: Source code analysis: A road map., in FOSE, pp.104–119 (2007).
47. K. Gallagher and D. Binkley: Program slicing., in Frontiers of Software
Maintenance. Beijing, China: IEEE CS Press, (2008).
48. M. Weiser: Program slicing., IEEE Trans. Software Eng., vol. 10, no. 4, pp.
352–357 (1984).
Chapter 8
BOUZID Merouane
University of Sciences and Technology Houari Boumediene (USTHB), Algeria
ABSTRACT
Speech coders operating at low bit rates require efficient encoding of the
linear predictive coding (LPC) coefficients. Line Spectral Frequency (LSF)
parameters are currently one of the most efficient choices of transmission
parameters for the LPC coefficients. In this paper, we present an optimized trellis
coded vector quantization (OTCVQ) scheme designed for robust encoding of the
LSF parameters. The objective of this system, initially called the "LSF-OTCVQ
encoder", is to achieve low bit-rate quantization of the FS1016 LSF parameters.
The efficiency of the LSF-OTCVQ encoder (with weighted distance) was first
demonstrated in the ideal case of transmission over a noiseless channel. We then
turned to improving its robustness for real transmission over a
noisy channel. To implicitly protect the transmission parameters of the LSF-OTCVQ
encoder incorporated in the FS1016, we used joint source-channel
coding carried out by the channel optimized vector quantization (COVQ) method.
In the case of transmission over a noisy channel, we will show that the new
encoding system, called the "COVQ-LSF-OTCVQ encoder", can
contribute significantly to improving the FS1016 performance by
ensuring robust coding of its LSF spectral parameters.
INTRODUCTION
In speech coding systems, the short-term spectral information of the speech
signal is often modelled by the frequency response of an all-pole filter whose
transfer function is $H(z) = 1/A(z)$, where $A(z) = 1 + a_1 z^{-1} + \dots + a_p z^{-p}$
[1]. In telephone-band speech coding (300-3400 Hz, sampling frequency $f_e$ = 8 kHz), the
parameters of this filter are derived from the input signal through linear prediction
(LP) analysis of order p = 10. The 10 parameters $\{a_i\}_{i=1,2,\dots,10}$, known as the
Linear Predictive Coding (LPC) coefficients [1], play a major role in the overall
bandwidth and in preserving the quality of the encoded speech. Therefore, the
challenge in the quantization of the LPC parameters is to achieve transparent
quantization quality [2] with the minimum bit rate while keeping the memory
and computational complexity low.
In practice, the LPC coefficients are not quantized directly because they
have poor quantization properties. Thus, other equivalent parametric
representations have been formulated, which convert them into parameters that
are much more suitable for quantization. One of the most efficient representations of the
LPC coefficients is the Line Spectral Frequency (LSF) representation [3]. The LSF parameters
(LSFs), which are related to the zeros of polynomials derived from A(z) [1], exhibit a
number of interesting properties. These properties [2] make them a very attractive
set of transmission parameters for the LPC coefficients. Exploiting these
properties, various coding schemes based on scalar and vector quantization have been
developed for the efficient quantization of the LSF parameters.
Several works showed that the vector quantizer (VQ) schemes, such as multistage
____ Chapter 8: Robust encoding of the FS1016 LSF parameters: Application of the
channel optimised trellis coded vector quantization
VQ [4] and split VQ [2], can achieve transparent quantization quality of the
LSFs at lower bit rates than schemes based on scalar quantization
(SQ).
In this paper, we present an optimized trellis coded vector quantization
(OTCVQ) scheme designed for the efficient and robust coding of LSF parameters.
The aim of this system, initially called the "LSF-OTCVQ encoder", is to
achieve a low bit rate transparent quantization of LSFs by exploiting the intra-
frame dependence between the closest pairs of the LSF parameters. In the case
of ideal transmissions over a noiseless channel, we have already shown in [5] that
the LSF-OTCVQ encoder (with weighted distance) achieves good
performances when applied to encode the LSF parameters of the US Federal
Standard FS1016. Indeed, we have shown that the 27 bits/frame LSF-OTCVQ
encoder produces a perceptual quality equivalent to that obtained when the LSF
parameters are unquantized.
Subsequently, our interest was drawn to improving the robustness of the LSF-OTCVQ
encoder for real transmissions over a noisy channel. In the low bit rate
speech coding domain, the essential objective is to reduce the bit rates of speech
coders while maintaining a good quality of transmission. In general, during the
design of speech coding systems, the effects of transmission noise are often
neglected. Redundant channel coding [6] is conventionally used to provide
"explicit" protection of the sensitive parameters of speech coders against channel
errors. According to the separate design approach, suggested by Shannon in his
classical source/channel coding theorems [7], the channel encoder can be
designed separately from the source encoder by adding redundant bits (error
detecting and correcting codes) to the source data. Indeed, robust encoding systems can
be designed according to this separation approach, but at the cost of an increase
in the transmission bit rate and delay and in the complexity of the coding/decoding.
However, at low bit rates, where the constraints on complexity and delay are very
severe, this channel coding is not recommended. The disadvantages of the separate
design have motivated some researchers to investigate a joint solution to
the source and channel coding optimization problem so that they can reduce the
complexity on both sides, while providing performances close to the optimum. For
these purposes, Joint Source-Channel Coding (JSCC) was introduced in which the
overall distortion is minimized by simultaneously considering the impact of the
transmission errors and the distortion due to source coding [8], [9], [10]. Most of
these works have proved the effectiveness of the JSCC to protect implicitly (i.e.,
without redundancy) source data while maintaining a constant bit rate and a
reduced complexity.
To implicitly protect the transmission indices of our LSF-OTCVQ encoder
incorporated in the FS1016, we used a JSCC method carried out by Channel
Optimized Vector Quantization (COVQ). We will first show how to adapt and
successfully apply the COVQ technique to the robust design of a new encoding system
(called the "COVQ-LSF-OTCVQ encoder") in order to implicitly protect some of its
indices. Finally, we will generalize the study to the complete protection of all
the indices of the COVQ-LSF-OTCVQ encoder.
This paper is organized as follows. In section 2, we briefly review the basics
of vector quantization. In section 3, we describe the design steps of the OTCVQ
encoding system. Examples of comparative results of TCVQ/OTCVQ encoders
are reported in this section. Next, we present the joint coding method based on the
COVQ technique. The performances of the COVQ system applied to encode a
memoryless source are presented at the end of that section. The application of the
OTCVQ scheme to encoding the LSF parameters is described in section 5.
Simulation results, when using two different distance measures (unweighted and
weighted) in the design and the operation of the LSF-OTCVQ encoder, are
provided. In section 6, we present the application of the LSF-OTCVQ encoder to
quantize the LSF parameters of the FS1016 speech coder. A JSCC-COVQ
method is then used to implicitly protect the LSF-OTCVQ indices for transmissions
over a noisy channel. Conclusions are given in section 7.
VECTOR QUANTIZATION
A k-dimensional vector quantizer (VQ) of size L is a mapping Q of k-
OPTIMIZED ENCODING SYSTEM BASED ON THE TRELLIS
CODED VECTOR QUANTIZATION
The scalar trellis coded quantization (TCQ) [13] and its generalized version
to vector case (TCVQ) [14], [15] improve upon traditional trellis encoders [16] by
labelling the trellis branches with entire subsets rather than with individual
reproduction levels. This approach, which was motivated by Ungerboeck's
formulation of Trellis Coded Modulation (TCM) [17], uses a structured alphabet
with an extended set of quantization levels.
In this work, we are particularly interested in the TCVQ encoder, whose
structure is quite similar to that of the TCQ, with an increase in complexity due to vector
codebook searching [14]. The design of a TCVQ encoder consists of several
interrelated steps: selection of the trellis, construction of the extended initial
codebook, partitioning of the codebook's codevectors into
subcodebooks (subsets), and labelling of the trellis branches with these subsets.
Consider the design of a k-dimensional TCVQ encoder of rate R bits
per sample (bps) used to encode a sequence of source vectors. The S-state trellis
used in TCVQ can be any one of Ungerboeck's amplitude modulation trellises [17].
The extended initial TCVQ codebook is generally designed by the LBG algorithm.
It contains $2^{kR+1}$ codevectors (twice as many as the VQ codebook). However, during the TCVQ
encoding process, only a subset of $2^{kR}$ of these codevectors may be used to
represent a source vector at any instant. According to Ungerboeck's set
partitioning method, the codevectors are then partitioned into four subsets $D_0$, $D_1$,
$D_2$ and $D_3$, each of size $2^{kR-1}$. In our TCVQ encoder designs, we used the heuristic
algorithm described in [15] to partition the extended TCVQ codebook. After that,
the subsets are assigned to the trellis branches according to Ungerboeck's rules of
TCM [17]. These rules are meant to ensure that the distortion between the original
and the reconstructed source sequences (under clear channel assumptions) is
close to the minimum.
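The design steps above, combined with a Viterbi search over the trellis, can be sketched as follows for k = 2 and R = 1 bps, i.e. an extended codebook of $2^{kR+1} = 8$ codevectors split into four subsets of $2^{kR-1} = 2$ codevectors. This is only an illustrative sketch: the branch labelling in `TRELLIS` is an assumed 4-state labelling, the codebook is a hand-made grid rather than an LBG-trained one, and the modulo-4 partition is a stand-in for the heuristic partitioning algorithm of [15].

```python
import math

# Assumed 4-state trellis: state -> [(next_state, subset_index), ...].
# Each state has two outgoing branches, labelled either {D0, D2} or {D1, D3}.
TRELLIS = {
    0: [(0, 0), (1, 2)],
    1: [(2, 1), (3, 3)],
    2: [(0, 2), (1, 0)],
    3: [(2, 3), (3, 1)],
}

def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def partition(codebook):
    """Split the extended codebook into D0..D3 (index modulo 4, a placeholder rule)."""
    return [codebook[s::4] for s in range(4)]

def tcvq_encode(vectors, subsets, start_state=0):
    """Viterbi search: return (total distortion, reconstructed vectors)."""
    cost = {s: (0.0 if s == start_state else math.inf) for s in TRELLIS}
    paths = {s: [] for s in TRELLIS}
    for x in vectors:
        new_cost = {s: math.inf for s in TRELLIS}
        new_paths = {s: [] for s in TRELLIS}
        for state, branches in TRELLIS.items():
            if cost[state] == math.inf:
                continue  # state not reachable yet
            for nxt, subset in branches:
                # Best codevector of the subset labelling this branch.
                best = min(subsets[subset], key=lambda y: sq_dist(x, y))
                total = cost[state] + sq_dist(x, best)
                if total < new_cost[nxt]:
                    new_cost[nxt] = total
                    new_paths[nxt] = paths[state] + [best]
        cost, paths = new_cost, new_paths
    end = min(cost, key=cost.get)
    return cost[end], paths[end]

k, R = 2, 1
codebook = [(-1.5, -1.5), (-1.5, 1.5), (1.5, -1.5), (1.5, 1.5),
            (-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
assert len(codebook) == 2 ** (k * R + 1)
subsets = partition(codebook)
source = [(0.4, 0.6), (-1.2, 1.4), (0.6, -0.4)]
distortion, reconstruction = tcvq_encode(source, subsets)
print(distortion, reconstruction)
```

In this sketch each source vector costs one bit to select the branch plus kR − 1 bits to select the codevector inside the branch subset, which gives the kR bits per vector implied by the rate R.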
To encode the source vectors sequence, the well-known Viterbi algorithm [16]
is used to find a legitimate optimal path through the trellis, which results in
Figure 1. TCQ encoder of rate R = 2 bps: (a) section of the labelled 4-state trellis, (b)
output alphabet levels and partition, (c) TCQ convolutional coder.
Gaussian sources using integer and fractional rate TCVQ encoders are
given in tables 1 and 2, respectively. For different rates, results are given in terms
of signal-to-noise ratio (SNR) in dB, along with the corresponding LBG-VQ
performance and distortion rate function D(R). Notice that when the rate is
fractional, the dimension k has to be chosen such that kR is an integer.
Table 1. Performances of TCVQ encoding with integer rates for the Gaussian
source.
Table 2. Performances of TCVQ encoding with fractional rates for the Gaussian
source (SNR in dB).
Rate   Dim.  TCVQ, trellis size (number of states)           LBG-VQ  D(R)
(bps)  k     4     8     16    32    64    128   256
0.66   6     3.34  3.39  3.41  3.42  3.45  3.48  3.49    3.05    4.01
0.75   4     3.72  3.78  3.80  3.82  3.87  3.90  3.93    3.36    4.51
0.80   5     3.96  4.04  4.07  4.08  4.14  4.18  4.20    3.69    4.82
At the same encoding rate, these results show that the TCVQ outperforms the
TCQ (k = 1). Moreover, the TCVQ allows fractional rates, as shown by the
simulation results listed in table 2. We can also see that, for a given rate, the
TCQ/TCVQ performances are higher than those of the conventional SQ/VQ.
To further improve the TCVQ performances, a training optimization procedure
for the extended TCVQ codebook design was developed [5]. For a given set of training
source vectors, this procedure updates the TCVQ codebook by replacing each
codevector with the average of all the source vectors mapped to it.
This leads to an iterative design algorithm for the overall TCVQ encoder. With
this optimization, the algorithm is called the OTCVQ (Optimized Trellis
Coded Vector Quantization) algorithm.
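One pass of this codebook update can be sketched as below. The sketch assumes that the mapping of training vectors to codevectors (the `assignment` list) has already been produced, for instance by the trellis search; the function name, the handling of unused codevectors and the sample data are assumptions made for the example, not details taken from [5].

```python
def update_codebook(codebook, training, assignment):
    """Replace each codevector by the mean of the training vectors mapped to it.

    assignment[t] is the index of the codevector chosen for training[t].
    Codevectors with no assigned training vector are left unchanged (an assumption).
    """
    k = len(codebook[0])
    sums = [[0.0] * k for _ in codebook]
    counts = [0] * len(codebook)
    for x, idx in zip(training, assignment):
        counts[idx] += 1
        for d in range(k):
            sums[idx][d] += x[d]
    return [
        tuple(s / counts[i] for s in sums[i]) if counts[i] else tuple(codebook[i])
        for i in range(len(codebook))
    ]

codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
training = [(0.2, 0.0), (-0.2, 0.4), (1.5, 0.5)]
assignment = [0, 0, 1]
new_codebook = update_codebook(codebook, training, assignment)
print(new_codebook)
```

Alternating this update with a new encoding pass over the training set gives the iterative design loop described above.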
Examples of simulation results for encoding memoryless Gaussian sources
using fractional rate OTCVQ encoders are listed in table 3.
Table 3. Performances of the OTCVQ with fractional rates for the Gaussian
source.
Comparing these results with those given in table 2, we clearly notice the
performance improvements brought by the optimization of the TCVQ codebooks.
COVQ system principle: Modified optimality conditions
A channel optimized vector quantizer (COVQ) is a coding scheme that
generalizes VQ by taking into account the noise present on the
transmission channel. The idea is to exploit knowledge about the channel in
both the codebook design process and the encoding algorithm. Thus, the operations of
source and channel coding are integrated into a single entity by
incorporating the channel characteristics in the design procedure. The
LBG-VQ algorithm lends itself well to such a modification. The purpose then is to
minimize a modified total average distortion between the reconstructed signal and
the original signal, given the channel noise.
The design of a COVQ encoder is carried out by a VQ version extended to the
noisy case [8], [18]. The COVQ scheme keeps the same VQ block structure
(encoder/decoder, dimension, bit rate). The difference is in the formulation of the
necessary conditions of optimality to minimize a modified expression of the total
average distortion. This new distortion is formulated by considering simultaneously
the distortion due to vector quantization and channel errors [18], [19]:
$$D = \frac{1}{k} \sum_{i=0}^{L-1} \int_{R_i} p(x) \left[ \sum_{j=0}^{L-1} p(j/i) \cdot d(x, y_j) \right] dx, \tag{4}$$
where p(j/i) is the channel transition probability, i.e., the probability
that the index j is received given that the index i is transmitted. Comparing
Eq. (4) with Eq. (1), one can easily see that the two equations have the same form,
except that Eq. (4) uses a modified distance measure (the bracketed term). It is
the same distance d, but weighted by the channel transition
probabilities p(j/i), i, j = 0,..., L−1.
The necessary optimality conditions of the COVQ are also derived
in two steps, according to the minimization principle of the modified total average
distortion [8], [18], [19].
For a given codebook $Y = \{y_0, \dots, y_{L-1}\}$ and a squared Euclidean
distance measure, the optimal partition $R_i$ (i = 0,..., L−1) for a noisy channel is such
that:
$$R_i = \left\{ x \in \mathbb{R}^k : \sum_{j=0}^{L-1} p(j/i) \, \|x - y_j\|^2 \le \sum_{j=0}^{L-1} p(j/l) \, \|x - y_j\|^2, \ \forall l \ne i \right\} \tag{5}$$
Similarly, the optimum codebook for a fixed partition is given by:
$$y_j = \frac{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} x \, p(x) \, dx}{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} p(x) \, dx}, \quad j = 0, \dots, L-1. \tag{6}$$
The codevector $y_j$ now represents the centroid of all input vectors that are
decoded into the cell $R_j$, even if the transmitted index i is different from j.
Equations (5) and (6) are referred to, respectively, as the generalized nearest
neighbour and centroid conditions with a modified distortion measure. The optimal
codevectors for a noisy channel are thus linear combinations of those for the
noiseless case, weighted by the a posteriori channel transition probabilities.
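The two conditions can be illustrated on a training set, where the integrals of Eqs. (5) and (6) become sums over the training vectors. The sketch below is a minimal illustration under that assumption; the function names, the matrix layout `P[i][j]` for p(j/i) and the toy scalar example are not taken from the cited works.

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def covq_encode(x, codebook, P):
    """Generalized nearest neighbour: pick i minimizing sum_j p(j/i) ||x - y_j||^2."""
    L = len(codebook)

    def expected_distortion(i):
        return sum(P[i][j] * sq_dist(x, codebook[j]) for j in range(L))

    return min(range(L), key=expected_distortion)

def covq_centroids(training, indices, codebook, P):
    """Generalized centroid: indices[t] is the index transmitted for training[t]."""
    L, k = len(codebook), len(codebook[0])
    sums = [[0.0] * k for _ in range(L)]  # sum of the vectors in each cell R_i
    counts = [0] * L                      # number of vectors in each cell R_i
    for x, i in zip(training, indices):
        counts[i] += 1
        for d in range(k):
            sums[i][d] += x[d]
    new_codebook = []
    for j in range(L):
        den = sum(P[i][j] * counts[i] for i in range(L))
        if den == 0:
            new_codebook.append(tuple(codebook[j]))  # keep unused codevector
            continue
        new_codebook.append(
            tuple(sum(P[i][j] * sums[i][d] for i in range(L)) / den for d in range(k))
        )
    return new_codebook

# Toy example: two scalar codevectors, 10 % chance that an index is flipped.
codebook = [(-1.0,), (1.0,)]
P = [[0.9, 0.1], [0.1, 0.9]]
training = [(-1.0,), (1.0,)]
indices = [covq_encode(x, codebook, P) for x in training]
new_codebook = covq_centroids(training, indices, codebook, P)
print(indices, new_codebook)
```

In this toy case the updated codevectors move from ±1 to ±0.8, i.e. towards the source mean, which is the usual effect of accounting for index errors in the centroid computation.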
In our applications, the communication channel considered is a discrete
memoryless channel with finite input and output alphabets. Precisely, we assumed
a memoryless binary symmetric channel (BSC) model with bit error (crossover)
probability p [6], [16]. For codewords (VQ indices) of n bits, the BSC transition
probabilities are described by [9], [19]:
$$p(j/i) = (1 - p)^{\,n - d_H(i,j)} \cdot p^{\,d_H(i,j)}, \tag{7}$$
where $d_H(i, j)$ ($0 \le d_H(i, j) \le n$) is the Hamming distance between the n-bit binary
codewords represented by the integers i and j.
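Equation (7) translates directly into code. The sketch below builds the full transition matrix for n-bit indices; the function names and the values n = 3 and p = 0.05 are chosen only for the example.

```python
def hamming(i, j):
    """Hamming distance between the binary representations of i and j."""
    return bin(i ^ j).count("1")

def bsc_transition(i, j, n, p):
    """p(j/i) of Eq. (7) for n-bit indices over a BSC with bit error probability p."""
    d = hamming(i, j)
    return (1 - p) ** (n - d) * p ** d

def transition_matrix(n, p):
    """Matrix P with P[i][j] = p(j/i) for all 2**n indices."""
    L = 2 ** n
    return [[bsc_transition(i, j, n, p) for j in range(L)] for i in range(L)]

P = transition_matrix(3, 0.05)
print(P[0][0])      # probability that index 0 is received without error
print(sum(P[0]))    # each row sums to 1 (up to rounding)
```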
When the channel bit error probability p is sufficiently small, the probability of
multiple bit errors in an index is very small relative to the probability of zero or one
bit error [9], [18], [19]. To simplify the numerical computations, it is often adequate
to consider only the effects of single bit errors on channel codewords. The BSC
channel model can be then approximated by [9]:
$$p(j/i) = \begin{cases} p, & j \in \xi_i, \\ 1 - np, & j = i, \\ 0, & \text{otherwise,} \end{cases} \tag{8}$$
where ξi is the set of all integers j, (0 ≤ j ≤ L −1), such that the binary
representation of j is of Hamming distance one from the binary representation of i.
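The single-bit-error approximation of Eq. (8) can be sketched in the same way. The function names and the example values (n = 3, p = 0.01, index 5) are illustrative assumptions.

```python
def single_error_neighbours(i, n):
    """The set xi_i: all n-bit indices at Hamming distance one from i."""
    return [i ^ (1 << b) for b in range(n)]

def approx_transition(i, j, n, p):
    """p(j/i) under the single-bit-error approximation of Eq. (8)."""
    if j == i:
        return 1 - n * p
    if j in single_error_neighbours(i, n):
        return p
    return 0.0

n, p = 3, 0.01
row = [approx_transition(5, j, n, p) for j in range(2 ** n)]
print(sorted(single_error_neighbours(5, n)))
print(row)
```

Each row of the approximate matrix has only n + 1 non-zero entries instead of $2^n$, which is what reduces the numerical cost of the design.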
In the case where the source distribution is unknown, a long training database of
k-dimensional vectors can be used for the quantizer design. With the
approximation given in Eq. (8), equations (4) and (6) are respectively
modified as:
$$D = \frac{1}{N} \sum_{t=0}^{N-1} \frac{1}{k} \sum_{j \in \xi_{i_t}} p(j/i_t) \cdot d(x_t, y_j), \tag{9}$$
and:
$$y_j = \frac{\sum_{i \in \xi_j} p(j/i) \sum_{l:\, x_l \in R_i} x_l / N}{\sum_{i \in \xi_j} p(j/i) \, |R_i| / N}, \tag{10}$$
where N is the size of the training base and $|R_i|$ denotes the number of training
impact the final results. In our design, the initial codebook is designed for ε = 0
(i.e., for a noiseless channel). This amounts to a simple run of the conventional LBG-VQ
algorithm, which converges to a locally optimal codebook. This codebook is
used as the initial codebook of the COVQ algorithm. Then, for each stage of ε, the
algorithm converges to an intermediate codebook, which is used as the initial
codebook of the next stage in the COVQ design process.
The greatest difficulty in the COVQ system design is that the channel error
probability is a parameter of the optimization process. In real transmission
situations, this parameter is difficult to estimate. It may even vary in time, making
a design for one specific value rather academic. Thus, according to the
practical situation and to the estimates of the real communication channel
characteristics, COVQ encoders can be selected to obtain the highest degree of
robustness.
0.01 8.927 8.965 9.034 9.179 8.477
0.05 6.824 6.918 7.351 7.608 7.800
0.1 4.650 5.292 5.875 6.801 7.043
0.2 2.518 3.109 3.876 4.752 5.886
In the case of transmissions over noisier channels (higher values of p), the
results indicate that the COVQ performs better than the LBG-VQ. For example, for a BSC
with p = 0.2, a considerable SNR gain of 3.36 dB was obtained by the COVQ
(trained for ε = 0.05) compared with the LBG-VQ. Note that when the
channel probability p does not match the design probability ε, the COVQ
encoders trained for ε identical or close to p are those which yield the best
performances. However, when the channel is noiseless (p = 0.000), the SNR
performances of the COVQ encoders degrade as the design
parameter ε increases. In this case, the LBG-VQ ensures comparable or better
performances than the COVQ. The same holds when the channel error probability is low (p <
0.005), with a slight performance improvement obtained by COVQ encoders
trained for a low value of the design parameter ε (for example, COVQ for ε = 0.001).
where fi and f̂ i are respectively the ith coefficients of the original f and
where f0 = 0 and f11 =0.5. The constant weight vector c = [c1,…, c10] is
experimentally determined [2]:
$$c_i = \begin{cases} 1.0, & \text{for } 1 \le i \le 8, \\ 0.8, & \text{for } i = 9, \\ 0.4, & \text{for } i = 10. \end{cases} \tag{13}$$
$$SD_i = \sqrt{ \frac{1}{n_1 - n_0} \sum_{n=n_0}^{n_1 - 1} \left[ 10 \log_{10} \frac{S(e^{j 2\pi n / N})}{\hat{S}(e^{j 2\pi n / N})} \right]^2 }. \tag{14}$$
For a speech signal sampled at 8 kHz with a 3 kHz bandwidth, an N = 256 point
FFT is used to compute the original $S(e^{j2\pi n/N})$ and quantized $\hat{S}(e^{j2\pi n/N})$ power
spectra of the LPC synthesis filter associated with the ith frame of speech. The
spectral distortion is thus computed discretely with a resolution of 31.25 Hz per
sample over 96 uniformly spaced points from 125 Hz to 3.125 kHz. The constants
$n_0$ and $n_1$ in Eq. (14) correspond to 1 and 96 respectively.
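A minimal sketch of this computation is given below. It assumes that Eq. (14) is the usual root-mean-square difference between the log power spectra, and it evaluates the LPC spectrum 1/|A|² directly at each frequency bin instead of calling an FFT routine. The function names and the first-order example coefficients are hypothetical; n0, n1 and N default to the values quoted in the text.

```python
import cmath
import math

def lpc_power_spectrum(a, n, N=256):
    """|1 / A(e^{j 2 pi n / N})|^2 for A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p."""
    w = 2 * math.pi * n / N
    A = 1 + sum(c * cmath.exp(-1j * w * (m + 1)) for m, c in enumerate(a))
    return 1.0 / abs(A) ** 2

def spectral_distortion(a, a_hat, n0=1, n1=96, N=256):
    """Spectral distortion in dB between original and quantized LPC filters."""
    total = 0.0
    for n in range(n0, n1):  # bins n0 .. n1 - 1, as in the sum of Eq. (14)
        ratio = lpc_power_spectrum(a, n, N) / lpc_power_spectrum(a_hat, n, N)
        total += (10 * math.log10(ratio)) ** 2
    return math.sqrt(total / (n1 - n0))

a = [-0.9]       # hypothetical original LPC coefficients (order 1 for brevity)
a_hat = [-0.85]  # hypothetical quantized version
print(spectral_distortion(a, a_hat))
```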
Generally, it is accepted that an average SD of about 1 dB indicates that negligible
audible distortion has been incurred during quantization. This value has, in the
past, been suggested for transparent quantization quality and used as a goal in
designing many LPC quantization schemes. In [2], Paliwal and Atal established
that the average SD alone is not sufficient to measure perceived quality. They
introduced the notion of spectral outlier frames. Consequently, we can get
transparent quality if we maintain the following three conditions:
1. The average SD is about 1 dB,
2. The percentage of outlier frames having SD between 2 and 4 dB is less than
2%,
3. No frame has an SD greater than 4 dB.
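The three conditions can be checked mechanically on a list of per-frame SD values. In the sketch below the phrase "about 1 dB" is interpreted as an upper bound `max_average` of 1.0 dB, which is an assumption; the function name and the sample SD values are also invented for the example.

```python
def transparency_report(sd_values, max_average=1.0):
    """Check the three transparency conditions on per-frame SD values (in dB)."""
    n = len(sd_values)
    average = sum(sd_values) / n
    outliers_2_4 = 100.0 * sum(1 for sd in sd_values if 2.0 <= sd <= 4.0) / n
    outliers_4 = 100.0 * sum(1 for sd in sd_values if sd > 4.0) / n
    return {
        "average_sd": average,
        "outliers_2_4_percent": outliers_2_4,
        "outliers_above_4_percent": outliers_4,
        "transparent": average <= max_average and outliers_2_4 < 2.0 and outliers_4 == 0.0,
    }

good = transparency_report([0.9] * 98 + [2.5, 1.1])
bad = transparency_report([0.9] * 98 + [2.5, 4.5])
print(good)
print(bad)
```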
Now, we evaluate the performances of our LSF-OTCVQ encoder operating
at different bit rates. All simulation results reported in this section were obtained
using a four-state trellis and 2-D codebooks. For each encoding rate, 2 bits are thus
assigned to represent the initial state. When the remaining bits cannot be equally
divided among the five 2-D codebooks, fewer bits are used in the last
codebooks, since it is known that the resolution of the human ear in the higher frequency bands
is lower than in the lower frequency bands. We investigated the optimum bit
allocations for the LSF-OTCVQ encoder and found that the bit allocations given in
table 5 yield the best results.
Bits/frame   Bits per stage codebook
             Stage 1   Stage 2   Stage 3   Stage 4   Stage 5
25           5         5         5         5         3
26           6         5         5         5         3
27           6         6         5         5         3
28           6         6         6         5         3
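As a quick arithmetic check, each rate in Table 5 equals the 2 bits of the initial state plus the bits assigned to the five stage codebooks. The short sketch below only restates the table; the variable names are hypothetical.

```python
INITIAL_STATE_BITS = 2  # four-state trellis, as stated in the text

# Bits/frame -> bits per stage codebook (stages 1 to 5), copied from Table 5.
ALLOCATIONS = {
    25: (5, 5, 5, 5, 3),
    26: (6, 5, 5, 5, 3),
    27: (6, 6, 5, 5, 3),
    28: (6, 6, 6, 5, 3),
}


def total_bits(stage_bits):
    """Frame budget: initial-state bits plus all stage codebook bits."""
    return INITIAL_STATE_BITS + sum(stage_bits)


for rate, stage_bits in ALLOCATIONS.items():
    # Each allocation must add up to its nominal rate.
    assert total_bits(stage_bits) == rate
    # Later (higher-frequency) stages never get more bits than earlier ones.
    assert list(stage_bits) == sorted(stage_bits, reverse=True)
    print(rate, stage_bits, total_bits(stage_bits))
```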
Table 6. Performances of the LSF-OTCVQ encoder as a function of bit rate.
quality evaluations are done here through A-B comparison and MOS (Mean
Opinion Score) tests using 8 listeners. Six sentences from the TIMIT database
(spoken by three male and three female speakers) are used for the subjective
evaluations.
The A-B comparison test involves presenting listeners with a sequence of two
speech test signals (A and B). For each sentence, a comparison is done between
the two synthetic signals: one A (or B) with unquantized LSFs and the other B (or
A) with LSFs quantized by the LSF-OTCVQ encoder. The A-B signal pairs are
presented in a randomized order. The listeners choose either one or the other of
the two synthesized versions, or indicate no preference. For the MOS tests, the
listeners were asked to rate each synthetic speech sentence (with LSF-OTCVQ
quantized LSFs) on a scale from 1 (bad) to 5 (excellent). The mean opinion
score (MOS) is then computed as the average of these ratings.
Results from the A-B comparison tests show that the majority of the listeners
(58.84 %) have no preference. The mean preference for the speech signal coded with
LSF-OTCVQ quantized LSFs (20.83 %) is identical to that obtained for the speech
signal coded with unquantized LSFs. We can therefore conclude that the two
versions of coded speech are statistically indistinguishable, i.e., there
are no perceptible differences and the quantization does not contribute to
audible distortion. In terms of MOS, the coded speech obtains a good score of
3.89. This implies that good communications quality and
high levels of intelligibility [2] are obtained using the 27 bits/frame LSF-OTCVQ
encoder in the FS1016.
In addition, in terms of average segmental signal-to-noise ratio (SSNR), the
synthetic speech signals with unquantized LSF parameters gave an average
SSNR of 11.05 dB; with LSF-OTCVQ encoding of the LSF parameters, the average
SSNR obtained is 10.31 dB. When the LSF parameters are quantized by
the 34-bit SQ, an average SSNR of 9.59 dB was obtained. Thus, applying the
LSF-OTCVQ encoding system to the FS1016 both reduces the coding rate and
improves the SSNR performance relative to the 34-bit SQ.
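For reference, an average segmental SNR of this kind is commonly computed as the mean of per-segment SNR values in dB. The sketch below follows that common definition and is an assumption, since the chapter does not give the exact formula: the segment length `frame_len`, the skipping of silent or error-free segments, and the function name are all illustrative choices.

```python
import math


def segmental_snr(reference, decoded, frame_len=240):
    """Mean of per-segment SNRs in dB (common definition, assumed here).

    Segments with zero signal energy or zero error energy are skipped,
    because their SNR in dB is undefined.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy > 0.0 and error_energy > 0.0:
            snrs.append(10.0 * math.log10(signal_energy / error_energy))
    if not snrs:
        raise ValueError("no usable segment")
    return sum(snrs) / len(snrs)


# Hypothetical test signal: a sinusoid and a copy attenuated by 10 %,
# so the error energy is 1 % of the signal energy in every segment.
reference = [math.sin(0.1 * n) for n in range(480)]
decoded = [0.9 * x for x in reference]
ssnr = segmental_snr(reference, decoded)
print(round(ssnr, 2))
```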
Robustness of the COVQ-OTCVQ encoder: Transmission over a noisy
channel
In a practical communication system, the robustness of the LSF-OTCVQ
encoder must be reinforced so that the encoder can cope with
channel errors. In this part, we are interested in the implicit protection of the
encoder by application of the JSCC-COVQ technique. We first show how to
apply COVQ to the robust design of the LSF-OTCVQ encoder in order to
provide implicit protection to some of its indices. We then generalize
the study to the full protection of all the indices of the new LSF-OTCVQ encoder
with the COVQ technique.
− Based on the 5 training subsets, use the COVQ (εc = 0.05) algorithm to design
the five (2-D) extended initial codebooks of the encoder.
− Partition each initial codebook into 4 sub-codebooks using the set partitioning
algorithm. Then, label the transitions of each trellis stage with the corresponding
partitioned COVQ-codebook (i.e.,
COVQ-codebook LSF1-2 for stage 1,…
− Set a stop threshold α to a very small value.
In step 2, the TCVQ encoding process of input LSF vectors consists of finding
the best possible sequence of codevectors (optimal path) through the trellis. This
search is carried out by the Viterbi algorithm with a slight modification of the
distance computation formula. This distance, which must be minimized during the
TCVQ search for the optimal codevector, is formulated as follows:
$$ d(f, \hat{f}_i) = \sum_{j \in \xi_i} p(j/i) \, \frac{1}{k} \sum_{m=1}^{k} c_m w_m \left( f(m) - \hat{f}_j(m) \right)^2 \qquad (15) $$
where k is the dimension of the LSF vectors (k = 2 for LSF pairs) and ξi is the set of
the neighbors j of index i such that dH(i, j) = 1. Recall that after the encoding
process, the COVQ-LSF-OTCVQ encoder transmits two binary sequences in addition
to the two bits representing the trellis initial state.
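The modified distance of Eq. (15) can be sketched as follows. Several points are assumptions rather than statements from the chapter: dH is taken to be the Hamming distance between the binary index labels, p(j/i) is taken to be the transition probability of a binary symmetric channel with crossover probability `eps`, and the weights c_m and w_m multiply the squared difference. All function names are hypothetical.

```python
def bsc_transition_prob(i, j, nbits, eps):
    """p(j/i) for nbits-bit indices sent over a BSC (assumed channel model)."""
    d = bin(i ^ j).count("1")  # Hamming distance between the two labels
    return eps ** d * (1.0 - eps) ** (nbits - d)


def hamming_neighbors(i, nbits):
    """Indices j with Hamming distance exactly 1 from i (the set xi_i)."""
    return [i ^ (1 << b) for b in range(nbits)]


def covq_distance(f, codebook, i, nbits, eps, c, w):
    """Channel-weighted distance between input f and codevector index i."""
    k = len(f)
    total = 0.0
    for j in hamming_neighbors(i, nbits):
        # Weighted squared error between f and the neighbor codevector j.
        weighted = sum(
            c[m] * w[m] * (f[m] - codebook[j][m]) ** 2 for m in range(k)
        ) / k
        total += bsc_transition_prob(i, j, nbits, eps) * weighted
    return total


# Hypothetical 2-bit codebook of 2-D codevectors, unit weights.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
d0 = covq_distance((0.0, 0.0), codebook, 0, 2, 0.05, (1.0, 1.0), (1.0, 1.0))
print(d0)
```

In this toy case the two neighbors of index 0 are indices 1 and 2, each reached with probability 0.05 × 0.95 and each at a mean weighted squared distance of 0.5.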
Note that in this part only the index sequence of the COVQ-LSF-OTCVQ
codevectors (a sequence of 20 bits for the 5 indices) is
protected implicitly by COVQ. This sequence results directly from the COVQ
search procedure through the 5 codebooks of the encoder. The other binary
sequences (initial state, optimal path), on the other hand, are not delivered by the
VQ search process and consequently are not protected implicitly against channel errors.
[Figure: horizontal axis: Error Probability (p), from 0.001 to 0.5.]
For error probabilities p ≤ 0.01, these results show that the distortions are
negligible for the two LSF encoding systems. We can conclude that the
COVQ-LSF-OTCVQ (ε = 0.05) encoding system can provide good implicit protection
to the FS1016 LSF parameters, with suboptimal SD performance, when the channel is
slightly disturbed.
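To give a feel for what these error probabilities mean at the bit level, the sketch below simulates a binary symmetric channel (the conclusion refers to BSC channel errors) and computes the probability that an index of a given length is received without any bit error. The function names and the example values are hypothetical, and independent bit errors are assumed.

```python
import random


def bsc(bits, p, rng):
    """Flip each bit independently with probability p (BSC model)."""
    return [b ^ (rng.random() < p) for b in bits]


def prob_index_intact(nbits, p):
    """Probability that all nbits bits cross the BSC without error."""
    return (1.0 - p) ** nbits


rng = random.Random(0)  # fixed seed so the run is repeatable
sent = [0] * 100000
received = bsc(sent, 0.1, rng)
flip_rate = sum(received) / len(received)  # empirical bit error rate

# Chance that a 20-bit index sequence arrives intact for several p values.
for p in (0.001, 0.01, 0.1):
    print(p, round(prob_index_intact(20, p), 4))
```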
the Hamming codes, which are generally well documented [6]. These codes were
first conceived to correct only one error per transmission block (single
error-correcting codes). In our design, the two Hamming (7, 4, 3) codes can
protect 8 bits by generating together 14 bits. The 27 bits/frame
COVQ-LSF-OTCVQ encoder, with the two Hamming (7, 4, 3) codes, will thus operate at a
rate of 34 bits/frame. This is about the same number of bits as that allocated by the
original coding of the FS1016 LSF parameters. Thus, the global design of the FS1016
with COVQ-LSF-OTCVQ encoding (plus the 2 Hamming codes) of the LSF parameters
maintains the speech coder rate at its original value of 4.8 kbits/s.
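A textbook Hamming (7, 4, 3) encoder and syndrome decoder is sketched below to show how 8 bits become 14 coded bits and how a single error per 7-bit block is corrected. The bit ordering (parity bits at positions 1, 2 and 4) is one standard convention and is an assumption here; the chapter does not specify the layout used, and the function names are hypothetical.

```python
from itertools import product


def hamming74_encode(d):
    """Encode 4 data bits as the 7-bit word [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(r):
    """Correct at most one bit error and return the 4 data bits."""
    r = list(r)
    s1 = r[0] ^ r[2] ^ r[4] ^ r[6]
    s2 = r[1] ^ r[2] ^ r[5] ^ r[6]
    s3 = r[3] ^ r[4] ^ r[5] ^ r[6]
    pos = s1 + 2 * s2 + 4 * s3  # 1-based position of the error, 0 if none
    if pos:
        r[pos - 1] ^= 1
    return [r[2], r[4], r[5], r[6]]


def protect_8_bits(bits8):
    """Two Hamming (7, 4) blocks: 8 input bits give 14 coded bits."""
    return hamming74_encode(bits8[:4]) + hamming74_encode(bits8[4:])


# Exhaustive check: every single-bit error in every codeword is corrected.
all_corrected = True
for data in product((0, 1), repeat=4):
    word = hamming74_encode(data)
    for k in range(7):
        corrupted = list(word)
        corrupted[k] ^= 1
        if hamming74_decode(corrupted) != list(data):
            all_corrected = False
print(all_corrected, len(protect_8_bits([1, 0, 1, 1, 0, 0, 1, 0])))
```

Two simultaneous errors in the same block are not corrected by this code, which is consistent with its description as a single error-correcting code.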
The performance of the unprotected LSF-OTCVQ encoder, compared with that of
the COVQ-LSF-OTCVQ (ε = 0.05) encoder with Hamming (7, 4, 3) codes, is
given in Table 8.
Over the whole range of error probabilities, the results show that channel
coding by Hamming (7, 4, 3) codes clearly improves the performance of the
27 bits/frame COVQ-LSF-OTCVQ encoding system. The global system thus has
good robustness against the errors of the noisy channel. On the other hand,
comparing these results with those given in Table 7, the LSF-OTCVQ encoder
incurs a larger degradation in terms of average SD and outliers. This is mainly due
to the random noise affecting the binary sequences that specify the initial state or
the optimal path.
Concerning the SSNR performance of the global FS1016 (with LSFs coded by
COVQ-LSF-OTCVQ + 2 Hamming (7, 4, 3) codes), the degradations are very low
and even negligible for error probabilities p < 0.01. The SSNR performance of the
FS1016, with and without LSF protection, is presented in Fig. 3.
[Fig. 3: Average SSNR (dB) versus Error Probability (p), from 0.001 to 0.5.]
CONCLUSION
In this work, an optimized trellis coded vector quantization scheme has been
developed and successfully applied to the efficient and robust encoding of the
FS1016 LSF spectral parameters. In the case of ideal transmission over a
noiseless channel, objective and subjective evaluation results revealed that the 27
bits/frame LSF-OTCVQ encoder (with weighted distance) produces perceptual
quality equivalent to that obtained when the LSF parameters are unquantized.
We then used a JSCC-COVQ technique to implicitly protect the transmission
indices of the LSF-OTCVQ encoder incorporated in the FS1016. The simulation
results showed that our new COVQ-LSF-OTCVQ encoding system gives
the basic LSF-OTCVQ encoder good robustness against BSC
channel errors, especially when the transmission error probability is high. To finish
this work, it was necessary to protect all the transmission indices of the
COVQ-LSF-OTCVQ encoder, since only a part of its indices was protected implicitly by
JSCC-COVQ. By making appropriate use of the bits per frame gained by this encoder,
redundant channel coding by Hamming codes was used to explicitly protect the
remaining unprotected bits. We showed that the COVQ-LSF-OTCVQ
encoder, using the Hamming (7, 4, 3) codes, contributes significantly to the
improvement of the encoding performance of the FS1016 LSF parameters.
We can conclude that our global COVQ-LSF-OTCVQ encoding system with
Hamming channel codes can ensure effective and robust coding of the LSF
parameters of the FS1016 operating over a noisy channel.
REFERENCES
1. W. B. Kleijn and K. K. Paliwal: Speech coding and synthesis, Elsevier Science
B.V. (1995).
2. K. K. Paliwal and B. S. Atal: Efficient vector quantization of LPC parameters at
24 bits/frame, IEEE Transactions on Speech and Audio Processing, vol. 1, no.
1, pp. 3-14 (1993).
3. F. Itakura: Line spectrum representation of linear predictive coefficients of
speech signals, Journal of Acoustical Society of America, vol. 57, p. 535
(1975).
4. W. F. LeBlanc, B. Bhattacharya, S. A. Mahmoud and V. Cuperman: Efficient
search and design procedures for robust multi-stage VQ of LPC parameters
for 4 kb/s speech coding, IEEE Transactions on Speech and Audio Processing,
vol. 1, no. 4, pp. 373-385 (1993).
5. M. Bouzid, A. Djeradi and B. Boudraa: Optimized Trellis Coded Vector
Quantization of LSF Parameters: Application to the 4.8 Kbps FS1016 Speech
Coder, Signal Processing, vol. 85, no. 9, pp. 1675-1694 (2005).
6. S. Lin: An Introduction to Error-Correcting Codes, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, USA (1970).
7. C. E. Shannon: A Mathematical Theory of Communication, Bell System
Technical Journal, vol. 27, no. 3 and 4, pp. 379-423 and 623-656 (1948).
8. K. A. Zeger and A. Gersho: Vector quantizer design for memoryless noisy
channels, in Proceedings of the International Conference on Communications
(ICC'88), Philadelphia, pp. 1593-1597 (1988).
9. N. Farvardin: A Study of vector quantisation for Noisy Channels, IEEE
Transactions on Information Theory, vol. 36, no. 4, pp. 799-809 (1990).
10. S. B. Z. Azami, P. Duhamel and O. Rioul: Combined source-channel coding:
Panorama of methods, CNES Workshop on Data Compression, Toulouse,
France (1996).
11. A. Gersho and R. M. Gray: Vector quantization and Signal compression, Kluwer
Academic Publishers, USA (1992).