
PROCEEDINGS OF

THE THIRD INTERNATIONAL CONFERENCE ON


CONTEMPORARY ISSUES IN COMPUTER AND INFORMATION
SCIENCES

THE INSTITUTE FOR ADVANCED STUDIES IN BASIC SCIENCES


(IASBS)

Foreword
We are glad and proud to have been able to hold the International Conference on
Contemporary Issues in Computer and Information Sciences for the third successive year,
in your welcome presence. Besides giving special attention to scientific progress in the
field, the conference aims to bring about better interaction between the different areas of
computer science and everyday life, and regards this connection as essential for the
progress of society. With this approach, the CICIS conference pays direct attention to
several applied aspects of computer and information science. The third conference
concentrates on Graph and Geometrical Algorithms, Intelligent Systems, Bioinformatics,
and IT and Society, in addition to all areas of computer science.
What makes us even prouder is that this conference coincides with the 20th anniversary
of the founding of the Institute for Advanced Studies in Basic Sciences, where outstanding
scientific achievements are carried out in a friendly environment; this would never have
happened without God's assistance and the notable efforts of the directors, teachers,
researchers and students.
The 277 papers we received indicate kind feedback and make us more determined.
Of these, 45 papers (16.24%) were accepted for oral presentation, 77 papers (27.79%) for
poster presentation, and 155 papers were rejected.
Besides IASBS, the Computer Society of Iran, the Iranian branch of the IEEE and the
University of Zanjan have collaborated in and supported this conference, and we hope
this improves the scientific results.
Last but not least, we would like to thank our sponsors for their help and financial
support: the Information Technology and Digital Media Development Center, the Statistics
and Informatics Department of the Sanjesh Organization, Arameh Innovative Researchers,
and Brown Walker Publisher.

Bahram Sadeghi Bigham


General Chair

Contents
Reducing Packet Overhead by Improved Tunneling-based Route Optimization
Mechanism
Hooshiar Zolfagharnasab

Neural Network Learning based on Football Optimization Algorithm


Payam Hatamzadeh and Mohammad Reza Khayyambashi

Evaluating XML Retrieval Systems Using Methods of Averaging Precision and Recall at Rank Cut-offs
Marzieh Javadi and Hassan Naderi

15

Performability Improvement in Grid Computing with Artificial Bee Colony Optimization Algorithm
Neda Azadi and Mohammad Kalantari

19

Security Enforcement with Language-Based Security


Ali Ahmadian Ramaki, Shahin Shirmohammadzadeh Sahraeii and Reza Ebrahimi
Atani

26

Application of the PSO-ANFIS Model for Time Series Prediction of Interior Daylight Illuminance
Hossein Babaee and Alireza Khosravi

30

Evaluating the impact of using several criteria for buffer management in VDTNs
Zhaleh Sadreddini, Mohammad Ali Jabraeil Jamali and Ali Asghar Pourhaji Kazem

36

Improvement of VDTNs Performance with Effective Scheduling Policy


Masumeh Marzaei Afshord, Mohammad Ali Jabraeil Jamali and Ali Asghar
Pourhaji Kazem

40

Classification of Gene Expression Data using Multiple Ranker Evaluators and Neural Network
Zahra Roozbahani and Ali Katanforoush

44

Data mining with learning decision tree and Bayesian network for data
replication in Data Grid
Farzaneh Veghari Baheri, Farnaz Davardoost and Vahid Ahmadzadeh

49

Design and Implementation of a three-node Wireless Network


Roya Derakhshanfar, Maisam M.Bassiri and S.Kamaledin Setarehdan

54

CEA Framework: A Comprehensive Enterprise Architecture Framework for middle-sized company
Elahe Najafi and Ahmad Baraani

58

Thick non-crossing paths in a polygon with one hole


Maryam Tahmasbi and Narges Mirehi

64

A Note on the 3-Sum Problem


Keivan Borna and Zahra Jalalian

69

Voronoi Diagrams and Inversion Geometry


Zahra Nilforoushan, Abolghasem Laleh and Ali Mohades

74

Selection of Effective Factors in Estimating of Costumers Respond to Mobile Advertising by Using AHP
Mehdi Seyyed Hamzeh, Bahram Sadeghi Bigham and Reza Askari Moghadam

80

An Obstacle Avoiding Approach for Solving Steiner Tree Problem on Urban Transportation Network
Ali Nourollah and Fatemeh Ghadimi

84

Black Hole Attack in Mobile Ad Hoc Networks


Kamal Bazargan

89

Improvement of the Modeling Airport Assignment Gate System Using Self-Adaptive Methodology
Masoud Arabfard, Mohamad Mehdi Morovati and Masoud Karimian Ravandi

95

A new model for solving capacitated facility location problem with overall cost
of losing any facility and comparison of Particle Swarm Optimization,
Simulated Annealing and Genetic Algorithm
Samirasadat jamali Dinan, Fatemeh Taheri and Farhad Maleki

100

A hybrid method for collusion attack detection in OLSR based MANETs


Hojjat Gohargazi and Saeed Jalili

104

A Statistical Test Suite for Windows to Cryptography Purposes


R. Ebrahimi Atani, N. Karimpour Darav and S. Arabani Mostaghim

109

An Empirical Evaluation of Hybrid Neural Networks for Customer Churn Prediction
Razieh Qiasi, Zahra Roozbahani and Behrooz Minaei-Bidgoli

114

A Clustering Based Model for Class Responsibility Assignment Problem


Hamid Masoud, Saeed Jalili and S.M.Hossein Hasheminejad

118

A Power-Aware Multi-Constrained Routing Protocol for Wireless Multimedia Sensor Networks
Babak Namazi and Karim Faez

123

Mobile Learning- Features, Approaches and Opportunities


Faranak Fotouhi-Ghazvini and Ali Moeini

127

Predicting Crude Oil Price Using Particle Swarm Optimization (PSO) Based
Method
Zahra Salahshoor Mottaghi, Ahmad Bagheri and Mehrgan Mahdavi

131

Image Steganalysis Based On Color Channels Correlation In Homogeneous Areas In Color Images
SeyyedMohammadAli Javadi and Maryam Hasanzadeh

134

Online Prediction of Deadlocks in Concurrent Processes


Elmira Hasanzade and Seyed Morteza Babamir

138

Fisher Based Eigenvector Selection in Spectral Clustering Using Google's Page Rank Procedure
Amin Allahyar, Hadi Sadoghi Yazdi and Soheila Ashkezari Toussi

146

Imperialist Competitive Algorithm for Neighbor Selection in Peer-to-Peer Networks
Shabnam Ebadi and Abolfazl Toroghi Haghighat

151

Different Approaches For Multi Step Ahead Traffic Prediction Based on Modified ANFIS
Shiva Rahimipour, Mahnaz Agha-Mohaqeq and Seyyed Mehdi Tashakkori Hashemi

156

E-service Quality Management in B2B e-Commerce Environment


Parvaneh Hajinazari and Abbass Asosheh

161

Calibration of METANET Model for Real-Time Coordinated and Integrated Highway Traffic Control using Genetic Algorithm: Tehran Case Study
Mahnaz Aghamohaqeqi, Shiva Rahimipour, Masoud Sa_lian and S.Mehdi Tashakori
Hashemi

165

Designing An Expert System To Diagnose And Propose About Therapy Of Leukemia
Zohreh Mohammad Alizadeh Bakhshmandi and Armin Ghasem Azar

171

A Basic Proof Method For The Verification, Validation And Evaluation Of Expert Systems
Armin Ghasem Azar and Zohreh Mohammad Alizadeh Bakhshmandi

175

Point set embedding of some graphs with small number of bends


Maryam Tahmasbi and Zahra Abdi reyhan

180

On The Pairwise Sums


Keivan Borna and Zahra Jalalian

184

Hyperbolic Voronoi Diagram: A Fast Method


Zahra Nilforoushan, Ali Mohades, Amin Gheibi and Sina Khakabi

187

Solving Systems of Nonlinear Equations Using The Cuckoo Optimization Algorithm
Mahdi Abdollahi, Shahriar Lotfi and Davoud Abdollahi

191

A Novel Model-Based Slicing Approach For Adaptive Softwares


Sanaz Sheikhi and Seyed Morteza Babamir

195

A novel approach to multiple resource discoveries in grid environment


Leyli Mohammad khanli, Saeed Kargar and Hossein Kargar

200

HTML5 Security: Offline Web Application


Abdolmajid Shahgholi, HamidReza Barzegar and G.Praveen Babu

205

Earthquake Prediction by Study on Vital Signs of Animals in Wireless Sensor Network by using Multi Agent System
Media Aminian, Amin Moradi and Hamid Reza Naji

209

Availability analysis and improvement with Software Rejuvenation


Zahra Rahmani Ghobadi and Baharak Shakeri Aski

213

A fuzzy neuro-chaotic network for storing and retrieving pattern


Nasrin Shourie and Amir Homayoun Jafari

219

GSM Technology and security impact


Ahmad Sharifi and Mohsen Khosravi

224

MicTSP: An Efficient Microaggregation Algorithm Based On TSP


Reza Mortazavi and Saeed Jalili

228

Proposing a new method for selecting a model to evaluate effective factors on job production capabilities of central province industrial cooperatives using Data mining and BSC techniques
Peyman Gholami and Davood Noshirvani Baboli

233

A Complex Scheme For Target Tracking And Recovery Of Lost Targets In Cluster-Based Wireless Sensor Networks
Behrouz Mahmoudzadeh and Karim Faez

237

A Measure of Quality for Evaluation of Image Segmentation


Hakimeh Vojodi and Amir Masoud Eftekhary Moghadam

241

An Unsupervised Evaluation Method for Image Segmentation Algorithms


Hakimeh Vojodi and Amir Masoud Eftekhary Moghadam

246

Evaluate and improve the SPEA using fuzzy c-mean clustering algorithm
Pezhman Gholamnezhad and Mohammad mehdi Ebadzadeh

251

Hypercube Data Grid: a new method for data replication and replica consistency in data grid
Tayebeh Khalvandi, Amir Masoud Rahmani and Seyyed Mohsen Hashemi

255

Exploiting Parameters of SLA to Allocate Resources for Bag of Task Applications in Cloud Environment
Masoud Salehpour and Asadollah Shahbahrami

262

Bus Arrival Time Prediction Using Bayesian Learning for Neural Networks
Farshad Bakhshandegan Moghaddam, Alireza Khanteimoory and Fatemeh Forutan
Eghlidi

267

SRank: Shortest Path-Based Ranking in Semantic Network


Hadi Khosravi-Farsani, Mohammadali Nematbakhsh and George Lausen

271

RL Rank: A Connectivity-based Ranking Algorithm Using Reinforcement Learning
Elahe Khodadadian, Mohammad Ghasemzadeh and Vali Derhami

276

YABAC4.5: Yet Another Boosting Approach for C4.5 Algorithm


B.Shabani and H.Sajedi

281

A New Method for Automatic Language Identification In Trilingual documents of Arabic, English, and Chinese with Different Fonts
Einolah Hatami and Karim Faez

286

Clustering in backtracking for solution of N-queen Problem


Samaneh Ahmadi, Vishal Kesri and Vaibhav Kesri

290

An Improved Phone Lattice Search Method for Triphone Based Keyword Spotting in Online Persian Telephony Speech
Maria Rajabzadeh, Shima Tabibian, Ahmad Akbari and Babak Nasersharif

294

Adaptive Gaussian Estimation of Distribution Algorithm


Shahram Shahraki and Mohammad-R. Akbarzadeh-T

300

A New Feature Transformation Method Based On Genetic Algorithm


Hannane Mahdavinataj and Babak Nasersharif

304

Evaluating the performance of energy aware tag anti collision protocols in RFID systems
Milad Haj Mirzaei and Masoud Ghiasbeigi

310

GPS GDOP Classification via Advanced Neural Network Training


H. Azami, S. Sanei and H. Alizadeh

315

Improving Performance of Software Fault Tolerance Techniques Using Multi-Core Architecture
Hoda Banki, Seyed Morteza Babamir, Azam Farokh and Mohamad Mehdi Morovati

320

An Introduction to an Architecture for a Digital-Traditional Museum


Reza Asad Nejhad, Mina Serajian, Mohsen Vahed and Seyyed Peyman Emadi

326

A Comparison of Transform-Domain Digital Image Watermarking Algorithms


Asadollah Shahbahrami, Mitra Abbasfard and Reza Hassanpour

329

Polygon partitioning for minimizing the maximum of geodesic diameters


Zahra Mirzaei Rad and Ali Mohades

336

Automatic Path-oriented Test Case Generation by considering Infeasible Paths


Shahram Moadab, Hasan Rashidi and Eslam Nazemi

340

Control Topology based on delay and traffic in wireless sensor networks


Bahareh Gholamiyan Yosef Abad and Masuod Sabaei

345

Two-stage Layout of workstations in an organization based clustering and using an evolutionary approach
Rana ChaieAsl, Shahriar Lotfi and Reza Askari Moghadam

350

CAB: Channel Available Bandwidth Routing Metric for Wireless Mesh Networks
Majid Akbari and Abolfazl Toroghi Haghighat

355

A PSO Inspired Harmony Search Algorithm


Farhad Maleki, Ali Mohades, F. Zare-Mirakabad, M. E. Shiri and Afsane Bijari

360

Repairing Broken RDF Links in the Web of Data by Superiors and Inferiors
sets
Mohammad Pourzaferani and Mohammad Ali Nematbakhsh

365

Palmprint Authentication Based on HOG and Kullback Leibler


Ma.Yazdani, F. Moayyedi and Mi. Yazdani

370

A Simple and Efficient Fusion Model based on the Majority Criteria for
Human Skin Segmentation
S. Mostafa Sheikholslam, Asadollah Shahbahrami, Reza PR Hasanzadeh and Nima
Karimpour Darav

374

A New Memetic Fuzzy C-Means Algorithm For Fuzzy Clustering


Fatemeh Golichenari and Mohammad Saniee Abadeh

380

Cross-Layer Architecture Design for long-range Quantum Nanonetworks


Aso Shojaie, Mehdi Dehghan Takhtfooladi,Mohsen Safaeinezhad and Ebrahim
SaeediNia

385

Generation And Configuration Of PKI Based Digital Certificate Based On Robust OpenCA Web Interface
Parisa Taherian and Mohammad Hossein Karimi

391

Network Intrusion Detection Using Tree Augmented Naive-Bayes


R. Najafi and Mohsen Afsharchi

396

Dynamic Fixed-Point Arithmetic: Algorithm and VLSI Implementation


Mohammad Haji Seyed Javadi, Hamid Reza Mahdiani and Esmaeil Zeinali Kh.

403

Cost of Time-shared Policy in Cloud Environment


Gh. Dastghibyfard and Abbas Horri

408

Using Fuzzy Classification System for Diagnosis of Breast Cancer


Maryam Sadat Mahmoodi, Bahram Sadeghi Bigham and Adel Najafi-Aghblagh
Rostam Khan

412

Government above the Clouds: Cloud Computing Based Approach to Implement E-Government
Toofan Samapour and Mohsen Solhnia

417

Human Tracking-by-Detection using Adaptive Particle Filter based on HOG and Color Histogram
Fatemeh Rezaei and Babak H.Khalaj

422

Use of multi-agent system approach for concurrency control of transactions in distributed databases
Seyed Mehrzad Almasi, Hamid Reza Naji and Reza Ebrahimi Atani

426

Multi-scale Local Average Binary Pattern based Genetic algorithm (MLABPG) for face recognition
A. Hazrati Bishak and K. faez

430

A Novel Method for Function Approximation in Reinforcement Learning


Bahar Haghighat, Saeed Bagheri Shouraki and Mohsen Firouzi

435

An Intelligent Hybrid Data Mining Method for Car-Parking Management


Sevila Sojudi, Susan Fatemieparsa, Reza Mahini, Parisa YosefZadehfard and
Somayeh Ahmadzadeh

443

Iris Recognition with Parallel Algorithms Using GPUs


Meisam Askari, Reyhane azimi and Hossein Ebrahimpour Komle

448

Improving Performance of Mandelbrot Set Using Windows HPC Cluster and MPI.NET
Azam Farokh, Hoda Banki, Mohamad Mehdi Morovati and Hossein Ebrahimpour
Komle

453

The study of indices and spheres for implementation and development of trade
single window in Iran
Elham Esmaeilpour and Noor Mohammad Yaghobi

458

Web Anomaly Detection Using Artificial Immune System and Web Usage
Mining Approach
Masoumeh Raji, Vali Derhami and Reza Azmi

462

A Fast and Robust Face Recognition Approach Using Weighted Haar And
Weighted LBP Histogram
Mohsen Biglari, F. Mirzaei and H. Ebrahimpour-Komleh

467

An Unsupervised Method for Change Detection in Breast MRI Images based on SOFM
Marzieh Salehi, Reza Azmi and Narges Norozi

473

A new image steganography method based on LSB replacement using Genetic Algorithm and chaos theory
Amirreza Falahi and Maryam Hasanzadeh

478

Providing a CACP Model for Web Services Composition


Parinaz Mobedi and Mehregan Mahdavi

482

Using Collaborative Filtering for Rate Prediction


Sonia Ghiasifard and Amin Nikanjam

487

A New Backbone Formation Algorithm For Wireless Ad-Hoc Networks Based On Cellular Learning Automata
Maryam Gholami, Mohammad Reza Meybodi and Mohammad Reza Meybodi

492

Solving Dominating Set Problem In Unit Disk Graphs By Genetic Algorithms


Azadeh Gholami, Mahmoud Shirazi and Bahram Sadeghi Bigham

498

Conflict Detection and Resolution in Air Traffic Management based on Graph Coloring Problem using Prioritization Method
Hojjat Emami and Farnaz Derakhshan

504

A Review of M-Health Approach for Chronic Disease Management


Marva Mirabolghasemi, N.A.Iahadi, Maziar Mirabolghasemi and Vida Zakerifardi

509

A New IIR Modeling by means of Genetic Algorithm


Tayebeh Mostajabi and Javad Poshtan

514

A New Similarity Measure for Improving Recommender Systems Based on Fuzzy Clustering and Genetic Algorithm
Fereshteh Kiasat and Parham Moradi

518

The lattice structure of Signed chip firing games and related models
A. Dolati, S. Taromi and B. Bakhshayesh

525

Tiling Finite Planes


Jalal Khairabadi, Rebvar Hosseini, Zohreh Mohammad Alizadeh and Bahram
Sadeghi Bigham

528

J2ME And Mobile Database Design


Seyed Rebvar Hosseini, Lida Ahmadi, Bahram Sadeghi Bigham and Jalal
Khairabadi

532

IIR Modeling via Skimpy Data and Genetic Algorithm


Tayebeh Mostajabi and Javad Poshtan

536

Concurrent overlap partitioning, A new Parallel Framework for Haplotype inference with Maximum parsimonious
Mohsen Taheri, Alireza Meshkin and Mehdi Sadeghi

540

A Bayesian Neural Network for Price Prediction in Stock Markets


Sara Amini, Farzaneh Yahyanejad and Alireza Khanteymoori

548

Maintaining the Envelope of an Arrangement Fixed


Marzieh Eskandari and Marjan Abedin

553

Investigating and Recognizing the Barriers of Exerting E-Insurance in Iran Insurance Company According to the Model of Mirzai Ahar Najai (Case Study: Iran Insurance Company in Orumieh City)
Parisa Jafari, Hamed Hagtalab, Morteza Shokrzadeh and Hasan Danaie

557

Identifying and Prioritizing Effective Factors in Electronic Readiness of the Organizations for Accepting and Using Teleworking by Fuzzy AHP Technique (Case Study: Governmental and Semi-Governmental Organizations in Tabriz City)
Morteza Shokrzadeh, Naser Norouzi, Jabrael Marzi Alamdari and Alireza Rasouli

561

Hybrid Harmony Search for the Hop Constrained Connected Facility Location
Problem
Bahareh khazaei, Farzane Yahyanejad, Angeh Aslanian and S. Mehdi Hashemi

566

Gene Selection using Tabu Search in Prostate Cancer Microarray Data


Farzane Yahyanejad, Mehdi Vasighi, Angeh Aslanian and Bahareh khazaei

571

BI Capabilities and Decision Environment in BI Success


Zahra Jafari, Mahmoud Shirazi and Mohammad Hosseion Hayati

575

Computation in Logic and Logic in Computation


Saeed Salehi

580

Rating System for Software based on International Standard Set ISO/IEC 25000
Hassan Alizadeh, Hossein Afsari and Bahram Sadeghi Bigham

584

TOMSAGA: TOolbox for Multiple Sequence Alignment using Genetic Algorithm
Farshad Bakhshandegan Moghaddam, Mahdi Vasighi

589

To enrich the life book of IT specialists through shaping living schema Strategy
based on Balance-oriented Model
Mostafa Jafari

595

Reducing Packet Overhead by Improved Tunneling-based Route Optimization Mechanism
Hooshiar Zolfagharnasab

Department of Computer Engineering


University of Isfahan
Department of IT, Soroush Educational Complex
hoppico@eng.ui.ac.ir

Abstract: The common Mobile IPv6 mechanisms, bidirectional tunneling and route optimization,
suffer from packet overhead when both nodes are mobile. Researchers have proposed methods
that reduce per-packet overhead while remaining compatible with the standard mechanisms. In this
paper, three Mobile IPv6 mechanisms are discussed in terms of efficiency and performance.
Following this discussion, a new mechanism called improved tunneling-based route optimization is
proposed, and a performance analysis of packet overhead shows that the proposed mechanism has
less overhead than the others. The analytical results indicate that improved tunneling-based route
optimization transmits more payload because it sends packets with less overhead.

Keywords: Mobile IP; Route Optimization; Bidirectional Tunneling; Packet Overhead.

Introduction

Mobile IP is a technique that enables nodes to maintain a permanent IP address while they move
between networks [1]. With the Mobile IP protocol, a communication can be established between a
Mobile Node (MN) and a Corresponding Node (CN) regardless of their locations.
The Mobile IP protocol supports transparency above the network layer, including the transport
layer, which covers the maintenance of active TCP connections and UDP port bindings, and the
application layer. Mobile IP is most often found in wireless WAN environments where users need to
carry their mobile devices across multiple LANs with different IP addresses [2], [4]. Mobile IP is
implemented in IPv6 via two mechanisms called bidirectional tunneling and route optimization [1], [8].
In order to enable mobility over IP protocols, the network layer of a mobile device should send
messages to inform other devices about the location and the network it is visiting. Original packets
from the upper layers are embedded in packets containing mobile routing headers. Reducing this
mobility overhead allows more data to be sent with each packet, so mechanisms are needed that
reduce it. In this paper, a new mechanism is proposed that reduces mobility overhead by reusing the
address fields of the IP header twice.

Related Works

Some attempts have been made to improve security and performance in Mobile IP. C. Perkins
proposed a security mechanism for binding updates between CN and MN in [5]. C. Vogt et al. [6]
proposed proactive address testing in route optimization. In another direction, D. Le and J. Chang
suggested reducing bandwidth usage by using a tunnel header instead of the route optimization
headers when both MN and CN are mobile nodes [7].
It should be noted that few papers have focused on bandwidth reduction in Mobile IP, while many
proposals address security and delay. In this paper, we present a new technique that reduces
bandwidth by diminishing the overhead of packets when both MN and CN are mobile nodes.

(Corresponding author: IT Manager at Soroush Educational Complex, Tehran, Iran, Tel: (+98) 912 539-4829)

3 Mobile IPv6

Discussing bidirectional tunneling and route optimization, we review their advantages and
disadvantages. Later, a method presented in [7] is explained that covers some disadvantages of the
standard mechanisms.

3.1 Bidirectional Tunneling

In Bidirectional Tunneling, MN and HA are connected to each other via a tunnel, so signaling is
required to construct the tunnel between MN and HA. Packets sent from CN to MN pass through HA
before being delivered to MN. HA intercepts all packets destined to MN, detecting them by Proxy
Neighbor Discovery [9]. Since MN is not present in the home network, and assuming the tunnel has
been constructed, HA encapsulates each intercepted packet in a new packet addressed to MN's new
care-of address (CoA) and sends it through the tunnel [10]. At the end of the tunnel, the tunneled
packet is de-capsulated by MN's network layer before being surrendered to MN's upper layers.
Similar encapsulation is performed when MN sends packets. The encapsulated packets are
tunneled to HA, which is called reverse tunneling, by adding a 40-byte tunnel header addressed from
MN's CoA to HA. After de-capsulation by HA, the tunneling header is removed and the modified
packet is sent to CN through the Internet.

3.2 Route Optimization (RO)

In the Route Optimization mechanism, packets are transmitted between MN and CN directly [3].
Binding Update (BU) messages are sent not only to HA but also to all connected CNs to bind MN's
current address to its HoA. Each CN has a table called the Binding Cache to keep track of all
corresponding MNs' CoAs and HoAs. A similar table is kept in MN to determine whether a CN uses
bidirectional tunneling or route optimization. It is also important to keep the CN's binding cache up to
date by sending BU messages frequently.
The Route Optimization mechanism uses the Home Address Option header extension to carry
MN's HoA when a packet is sent from MN to CN. Conversely, when a packet is sent from CN to MN,
another header extension called the Type 2 Routing header is used.
In a scenario where both MN and CN are mobile nodes, route optimization can be implemented
too [1]. Since both MN and CN have a HoA and a CoA, packet routing requires both extension
headers to carry enough information for the pair's network layers. Therefore, to transmit a packet
from MN to CN, not only the Home Address Option header but also the Type 2 Routing header
should be filled with the appropriate addresses. Since each extension header is 24 bytes, the total
overhead to transmit a packet between two mobile nodes is 48 bytes.

3.3 Tunneling-based Route Optimization (TRO)

As discussed before, in a scenario where both MN and CN are mobile nodes, the total overhead to
carry a packet between the nodes is 48 bytes in route optimization. To reduce this overhead, D. Le
and J. Chang proposed in [7] a mechanism called Tunneling-based Route Optimization. Like
standard route optimization, TRO constructs a tunnel to transfer packets directly between MN and
CN, but in their proposed method a Tunnel Manager controls the packets. Not only is the tunnel
manager in touch with the binding cache, it also manipulates the packets entering and leaving the
network layer.
When MN's transport layer creates a packet from MN's HoA destined to CN's HoA, the packet is
surrendered to MN's tunnel manager before it is sent. Since the tunnel manager is aware of CN's
mobility, it encapsulates the packet in a new packet addressed from MN's CoA to CN's CoA. The
packet is then sent through the tunnel to CN. At the other side of the tunnel, CN's tunnel manager
de-capsulates the packet, extracting the original packet addressed from MN's HoA to CN's HoA. The
packet is then surrendered to the transport layer, which is still unaware of mobility.
To maintain compatibility with the previous mechanisms, BU messages are changed. Using a flag
called ROT, the tunnel manager decides whether to use tunneling-based route optimization or
standard route optimization [7].
TRO benefits from using a 40-byte tunnel header instead of the 48 bytes of extension headers
used in standard route optimization. The results presented in [7] show that TRO can increase
performance in Mobile IP compared to the standard mechanisms.

Improved Tunneling-based Route Optimization (ITRO)

Further reduction can be achieved, spending less header overhead in the communication between
MN and CN when both are mobile nodes. Each node maintains a binding cache that keeps the
addresses of the other node, so there is no need to send the HoA of the other node in an extension
header: it can be obtained from the binding cache with the help of the CoA included in the packet. In
other words, the header overhead is reduced by using the IPv6 address fields twice, both for Internet
addressing and for mobile addressing. Instead, a tunnel manager must be embedded, not only to
control the binding cache but also to change the packet header. The tunnel manager controls
whether the IPv6 address header is used for Internet addressing or for mobile addressing. Later in
this section, we discuss the Improved Tunneling-based Route Optimization method.

4.1 Protocol Model in End-Points

The Mobile IPv6 protocol should change a little to support overhead reduction. Both nodes should be
equipped with a tunnel manager that controls and changes all packets exchanged between MN and
CN. The tunnel manager should also be allowed to access the binding cache in order to find the
corresponding HoA of a node. Fig. 1 depicts the protocol model in the sender and receiver.

Figure 1: Protocol model for route optimization and packets passing between layers

4.2 Improved Tunneling-based Route Optimization Routing

Below, we discuss two scenarios to explain the proposed method. It should be mentioned that a
tunnel between MN and CN must be initiated first, and BU messages have already been sent so that
the binding caches hold both CoA and HoA.
When MN wants to send a packet to CN, since mobility is transparent to the upper layers, MN's
network layer sets the source of the packet to MN's HoA and the destination to CN's HoA. In the next
step, when the tunnel manager gets the packet, it updates both the source and the destination of the
packet. Since MN is in a foreign network, it changes the source field from its HoA to its CoA. Then,
searching the binding cache with the help of CN's HoA, it finds CN's corresponding CoA and writes it
into the destination address field. The altered packet is sent directly to CN through the tunnel.
When the packet arrives at the other side of the tunnel, CN's tunnel manager manipulates it to
make it ready for the upper layers. First, the packet's destination is changed from CN's CoA to CN's
HoA. Next, the binding cache is searched with MN's CoA to find the corresponding HoA, and the
packet's source is changed from MN's CoA to what has just been found, MN's HoA. Once these
changes are finished, the updated packet is surrendered to the upper layers. Following Fig. 1,
packets sent from MN to CN are addressed as shown in Fig. 2.
The same action is performed when a packet is sent from CN to MN. Since CN's upper layers are
unaware of mobility, a packet is constructed addressed from CN's HoA to MN's HoA. As the packet is
passed to CN's tunnel manager, the destination of the packet is changed, using the binding cache,
from MN's HoA to MN's CoA. Since CN knows its own CoA, the tunnel manager updates the packet's
source from its HoA to its CoA. The packet is then tunneled to MN. Similarly, MN's tunnel manager
changes the packet's destination from MN's CoA to MN's HoA and, after searching the binding
cache, changes the packet's source from CN's CoA to CN's HoA.

Figure 2: Improved tunneling-based route optimization packets, following Fig. 1
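To make the address rewriting above concrete, the following minimal sketch (written for this edition as an illustration, not code from the paper) models the sender- and receiver-side tunnel managers as plain Python functions; a packet is a dictionary of addresses and the binding caches are ordinary dictionaries, all names being hypothetical.

# Illustrative sketch of the ITRO address rewriting of Section 4.2 (hypothetical names).
def sender_rewrite(packet, own_hoa, own_coa, hoa_to_coa):
    """MN-side tunnel manager: replace home addresses with care-of addresses."""
    assert packet["src"] == own_hoa                      # upper layers are unaware of mobility
    packet["src"] = own_coa                              # MN HoA -> MN CoA (foreign network)
    packet["dst"] = hoa_to_coa[packet["dst"]]            # CN HoA -> CN CoA from the binding cache
    return packet

def receiver_rewrite(packet, own_hoa, own_coa, coa_to_hoa):
    """CN-side tunnel manager: restore home addresses before the transport layer."""
    assert packet["dst"] == own_coa
    packet["dst"] = own_hoa                              # CN CoA -> CN HoA for the local upper layers
    packet["src"] = coa_to_hoa[packet["src"]]            # MN CoA -> MN HoA from the binding cache
    return packet

# Example: MN (HoA mn::h, CoA mn::c) sends a packet to CN (HoA cn::h, CoA cn::c).
pkt = {"src": "mn::h", "dst": "cn::h", "payload": b"data"}
pkt = sender_rewrite(pkt, "mn::h", "mn::c", {"cn::h": "cn::c"})
pkt = receiver_rewrite(pkt, "cn::h", "cn::c", {"mn::c": "mn::h"})
assert pkt["src"] == "mn::h" and pkt["dst"] == "cn::h"   # original addressing restored, no extra header bytes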

4.3 Changing BU messages

To maintain compatibility with the other MIPv6 mechanisms, binding messages should change. We
propose to use two flags in order to distinguish three different mechanisms. Called ROT0 and ROT1,
these flags indicate whether route optimization, tunneling-based route optimization or improved
tunneling-based route optimization is used. The routing mechanisms selected by ROT0 and ROT1
are listed in Table 1.

Table 1: Routing mechanism selected by the ROT flags

Mechanism | ROT1 | ROT0
Route Optimization | 0 | 0
Tunneling-based Route Optimization | 0 | 1
Improved Tunneling-based Route Optimization (proposed method) | 1 | 1 or 0

Evaluation

We have evaluated our proposed mechanism by comparison to the three other mechanisms. Since
the improved tunneling-based route optimization mechanism intends to reduce header overhead, the
main comparison metric is the number of bytes consumed to establish mobile communication. We
used relation (1), proposed in [7], to calculate the mobility overhead. It should be noted that the
mobility overhead counts the bytes used to route packets from one mobile node to another, and is
different from the overhead used to route packets through the network layer.

Mobility Overhead Ratio = Mobility Addition Size / Original Packet Size    (1)

Also, in the comparison with the bidirectional tunneling mechanism, the communication time is
considered, defined as the total time for a packet to be delivered from source to destination.
Moreover, packets are assumed to be 1500 bytes, the maximum transmission unit size in Ethernet,
containing the IPv6 packet, extension headers if needed, and the tunneling overhead.

5.1 Comparing to Bidirectional Tunneling

As mentioned before, in bidirectional tunneling, packets from CN are tunneled from HA to MN and
are replied to in the same tunnel from MN to HA (reverse tunneling). Each time a packet is tunneled,
40 additional bytes are used to route the packet to the other side of the tunnel. As a packet is
tunneled twice to reach its destination, 80 bytes are consumed in the two communications. The total
bandwidth used to carry a packet from source to destination is calculated as follows:

Mobility Overhead Ratio = Tunnel Header Size_{HA-MN} / Original Packet Size
                        + Tunnel Header Size_{MN-HA} / Original Packet Size
                        = 40 / (1500 - 40) + 40 / (1500 - 40) = 5.48%    (2)

Also, in bidirectional tunneling, each routing leg takes one Internet routing time [11], because each
node can be anywhere in the Internet. Following Fig. 3, the total delay consists of three Internet
routing times, computed from:

Total time = T_{MN-HA_MN} + T_{HA_MN-HA_CN} + T_{HA_CN-CN} = 3 T_{Internet}    (3)

Figure 3: Comparing delay time for the bidirectional tunneling mechanism and the route optimization based mechanisms

In improved tunneling-based route optimization, since the nodes are connected to each other through
a tunnel, there is no need to tunnel packets twice via HA. Also, the address fields of the packet are
used both for the tunnel and for the IPv6 header. Therefore, a reduction in both overhead and delay
is obtained. The Mobility Overhead Ratio is calculated as follows:

Mobility Overhead Ratio = 0 B_{IPv6 tunnel header} / (1500 - 0) = 0 / 1500 = 0%    (4)

The delay of the proposed mechanism is computed from:

Total time = T_{MN-CN} = T_{Internet}    (5)

This means the Improved Tunneling-based Route Optimization mechanism is more efficient both in
overhead and in delay.

5.2 Comparing to Route Optimization

Although both route optimization and the proposed mechanism construct a tunnel to reduce the delay
and the overhead needed for two mobile nodes to communicate, different overheads are used to
route a packet in the constructed tunnel. When both nodes are mobile, route optimization uses the
Home Address Option and Type 2 Routing extension headers, as depicted in Fig. 4. Since each
extension header is 24 bytes in size, the total mobility header added to the IPv6 packet is 48 bytes,
so the mobility overhead ratio is calculated as follows:

Mobility Overhead Ratio = (24 B_{Type 2} + 24 B_{HoA Option}) / (1500 - 48) = 48 / 1452 = 3.3%    (6)

Figure 4: Route optimization packets, following Fig. 1

Since packets are tunneled directly to each node, one Internet time is required (Equation (5)).
Because improved tunneling-based route optimization uses the address fields of the packet both for
tunneling and for IPv6 routing, as calculated before, it uses 0% of the total packet size. Using the
same tunnel for transmitting packets, the total delay time is the same for both route optimization and
the proposed method.

5.3 Comparing to Tunneling-based Route Optimization

Tunneling-based Route Optimization was proposed not only to decrease communication delay but
also to reduce overhead. It benefits both from the tunneling idea used in bidirectional tunneling and
from the direct connection used in Route Optimization. A 40-byte tunneling header is added to the
IPv6 packet in order to avoid the 48 bytes of extension headers. Fig. 5 shows packets A and B of
Fig. 1 when the tunneling-based route optimization mechanism is used. The mobility overhead ratio
is calculated as follows:

Mobility Overhead Ratio = 40 B_{IPv6 tunnel header} / (1500 - 40) = 40 / 1460 = 2.74%    (7)

Figure 5: Tunneling-based route optimization packets, following Fig. 1

The total delay is equal to one Internet time, because packets pass through a tunnel like the one
used in Route Optimization. Compared with tunneling-based route optimization, the proposed method
has no mobility overhead in the headers used for mobile communication, and its total delay is the
same as that of route optimization.
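As a quick arithmetic check of relations (2), (6), (7) and (4), the short Python sketch below recomputes the mobility overhead ratios of the four mechanisms from the byte counts used in this section (40-byte tunnel header, two 24-byte extension headers, 1500-byte packets); it is only an illustration of the calculations above, not part of the original paper.

MTU = 1500          # assumed packet size (bytes)
TUNNEL = 40         # IPv6 tunnel header (bytes)
EXT = 24            # one mobility extension header (bytes)

def ratio(overhead):
    """Mobility overhead divided by the original packet size, as in relation (1)."""
    return overhead / (MTU - overhead)

mechanisms = {
    "Bidirectional Tunneling": 2 * ratio(TUNNEL),   # the packet is tunneled twice, 40 bytes each time
    "Route Optimization": ratio(2 * EXT),           # HoA option + Type 2 routing header
    "Tunneling-based RO": ratio(TUNNEL),            # one 40-byte tunnel header
    "Improved TRO (proposed)": ratio(0),            # address fields reused, no extra bytes
}
for name, r in mechanisms.items():
    print(f"{name:25s} {100 * r:5.2f} %")           # about 5.48, 3.31, 2.74 and 0.00 %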

Listed in Table 2, the Mobile IPv6 mechanisms are compared to each other. All in all, it is clear that
the proposed method can reduce both the delay and the bandwidth used in the communication of
mobile nodes.

Table 2: Comparison between Mobile IPv6 mechanisms

Mechanism | Packet Overhead (%) | Delay (Internet Time)
Bidirectional Tunneling | 6.6 | 3
Route Optimization | 3.3 | 1
Tunneling-based Route Optimization | 2.74 | 1
Improved Tunneling-based Route Optimization (proposed method) | 0 | 1

Conclusion

In this paper, the performance of the standard Mobile IPv6 routing mechanisms and of
Tunneling-based Route Optimization is analyzed. To reduce packet overhead, we proposed the
Improved Tunneling-based Route Optimization mechanism. In order to maintain compatibility with the
standard mechanisms, not only does the tunnel manager have to be changed, but Binding Update
messages must also be altered. Comparison with Bidirectional Tunneling, Route Optimization and
Tunneling-based Route Optimization shows that the packet overhead of the proposed mechanism is
reduced significantly compared to the previous mechanisms. Therefore, thanks to the lower overhead
of each packet, more data can be transmitted through the network via a Mobile IP communication.

Acknowledgement

I would like to thank Soroush Educational Complex and especially Mr. Adbullah Shirazi for financial
support and assistance. I should also thank Mr. Seyed Morteza Hosseini for preparing the final
version of the PDF using LaTeX 2e.

References

[1] D. Johnson, C. Perkins, and J. Arkko, Mobility Support in IPv6, Internet Draft (work in progress), IETF (2009), [Online] Available: http://tools.ietf.org/id/draft-ietf-mext-rfc3775bis-05.txt.
[2] R. Koodli, Mobile IPv6 Fast Handovers, RFC 5568, IETF (2009), [Online] Available: http://www.ietf.org/rfc/rfc5568.txt.
[3] A. Muhanna, M. Khalil, S. Gundavelli, K. Chowdhury, and P. Yegani, Binding Revocation for IPv6 Mobility, Internet Draft (work in progress), IETF (2009), [Online] Available: http://www.ietf.org/id/draft-ietf-mext-binding-revocation-14.txt.
[4] M. Liebsch, A. Muhanna, and M. Blume, Transient Binding for Proxy Mobile IPv6, Internet Draft (work in progress), IETF (2009), [Online] Available: http://www.ietf.org/id/draft-ietf-mipshop-transient-bcepmipv6-04.txt.
[5] C. Perkins, Securing Mobile IPv6 Route Optimization Using a Static Shared Key, RFC 4449, IETF (2006), [Online] Available: http://www.ietf.org/rfc/rfc4449.txt.
[6] C. Vogt, R. Bless, M. Doll, and T. Kuefner, Early Binding Updates for Mobile IPv6, in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC'05) 3 (2005), 1440-1445.
[7] D. Le and J. Chang, Tunneling-based route optimization for mobile IPv6, in Proceedings of IEEE Wireless Communications, Networking and Information Security (WCNIS) (2010), 509-513.
[8] C. Perkins, IP Mobility Support for IPv4, RFC 3344, IETF (2002), [Online] Available: http://www.ietf.org/rfc/rfc3344.txt.
[9] T. Narten, E. Nordmark, and W. Simpson, Neighbor Discovery for IP Version 6 (IPv6), RFC 4861, IETF (2007), [Online] Available: http://www.ietf.org/rfc/rfc4861.txt.
[10] A. Conta and S. Deering, Generic Packet Tunneling in IPv6 Specification, RFC 2473, IETF (1998), [Online] Available: http://www.ietf.org/rfc/rfc2473.txt.
[11] M. Kalman and B. Girod, Modeling the delays of successively transmitted Internet packets, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME'04), Taipei, Taiwan (2004), 2015-2018.

Neural Network Learning based on Football Optimization Algorithm


Payam Hatamzadeh

Mohammad Reza Khayyambashi

Faculty of Engineering

Faculty of Engineering

Department of Computer Engineering

Department of Computer Engineering

Payam@eng.ui.ac.ir

M.R.Khayyambashi@eng.ui.ac.ir

Abstract: The Football Optimization Algorithm (FOA) is a novel optimization algorithm inspired by the
game of football. Like other evolutionary algorithms, the proposed algorithm starts with an initial
population, called a team. The individuals of the population, called players, are of two types: main
players and substitute players. Teamwork among these players forms the basis of the proposed
evolutionary algorithm. In this paper, the application of the FOA to tuning the parameters of artificial
neural networks (ANNs) is presented as a new evolutionary method of ANN training. A neural network
is trained with FOA, the Imperialist Competitive Algorithm, Particle Swarm Optimization and a Genetic
Algorithm, and the experimental results obtained from these four methods are compared. The results
show that the training and test error of the network trained by the FOA algorithm is reduced in
comparison to the other three methods. Hence, FOA can tune the weight values, and it is believed
that FOA will become a promising candidate for training ANNs.

Keywords: Football Optimization Algorithm; Artificial Neural Network; Evolutionary Algorithms.

Introduction

Artificial neural networks have been developed as generalizations of mathematical models of
biological nervous systems. A first wave of interest in neural networks emerged after the introduction
of simplified neurons by McCulloch and Pitts (1943). Neural networks have the ability to perform tasks
such as pattern recognition, classification, regression and the solution of differential equations, as
demonstrated in [1,2]. The basic processing elements of neural networks are called artificial neurons,
or simply neurons or nodes. In a simplified mathematical model of the neuron, the effects of the
synapses are represented by connection weights that modulate the effect of the associated input
signals, and the nonlinear characteristic exhibited by neurons is represented by a transfer function.
The neuron impulse is then computed as the weighted sum of the input signals, transformed by the
transfer function. The learning capability of an artificial neuron is achieved by adjusting the weights in
accordance with the chosen learning algorithm. The optimization method used to determine the
weight adjustments has a large influence on the performance of neural networks. While gradient
descent is a very popular optimization method, it is plagued by slow convergence and susceptibility to
local minima, as demonstrated in [3]. Therefore, other approaches to improve neural network training
have been introduced, as demonstrated in [4]. These methods include global optimization algorithms
such as the Seeker Optimization Algorithm [5], Genetic Algorithms [6-8], Particle Swarm Optimization
algorithms [9-10], the Imperialist Competitive Algorithm [11] and the Harmony Search Algorithm [12].
In this paper, a new evolutionary algorithm inspired by the game of football, called the Football
Optimization Algorithm, is proposed. The proposed algorithm starts with an initial population called a
team. A team is composed of good passers and mobile players. All the players are divided into two
types: main players and substitute players. Teamwork among the main players is the main part of the
proposed algorithm and is expected to cause the ball to converge to the goal. Teamwork is achieved
when individuals make personal sacrifices to work together for the success of the group. Here, the
neural network is trained by the FOA, ICA, PSO and GA algorithms and their results are compared
with each other. The results indicate that the training and test error of the network trained by the FOA
algorithm is reduced in comparison to the other methods. The rest of this paper is organized as
follows. Section two introduces the proposed algorithm and studies its different parts in detail. The
proposed algorithm for training the neural network is presented in section three. The proposed
algorithm is tested with benchmark problems in section four, and section five concludes the paper.

(Corresponding author, P. O. Box 83139-64841, T: (+98) 913 913-9948)

Football Optimization Algorithm

Figure 1 shows the flowchart of the proposed algorithm. FOA encodes potential solutions to a specific
problem on players and applies teamwork operators to these players. The algorithm is viewed as a
function optimizer, although the range of problems to which it can be applied is quite broad.

Figure 1: Flowchart of the FO algorithm

2.1 Initialize parameters

Table 1 shows the adjustable parameters of the football optimization algorithm.

Table 1: Adjustable parameters of the FO algorithm

Parameter | Description | Value
n | Maximum number of players | [11, ∞)
- | Divide coefficient of players | (0, 1]
- | Number of replacements in entire iterations | [0, n]
- | Number of replacements per iteration | [0, n]
- | Pass coefficient | [0, 1]
- | Velocity coefficient of players | best value [0.5, 2]
- | Spectators effect on players | [0, 1]
- | Spectators effect on parameters | [0, 1]

2.2 Creating a team

The first step in the implementation of any optimization algorithm is to generate an initial population
[13]. In the FO algorithm, a population of players called a team, which encodes candidate solutions to
an optimization problem, evolves towards better solutions; each player is represented by an array.
The population size (n) depends on the nature of the problem, but typically contains several hundreds
or thousands of possible solutions. The algorithm usually starts from a population of randomly
generated individuals that covers the entire range of possible solutions (the search space).

Team = [player_1, player_2, ..., player_n]^T    (1)

in which each player is an array of k parameters:

player_i = [parameter_1, ..., parameter_k]

2.3 Dividing players

Well-organized and well-prepared teams are often seen beating other teams. Before the games,
coaches always train the team, because training can play an important part in a match. After
practice, the most powerful players are selected to form the main team. In this algorithm, the power of
a player is found by evaluating the fitness function. After evaluating the fitness function of the players,
the m most powerful players are selected to form the main players; the remaining s players become
the substitute players. We then have two types of players: main players and substitute players.

mainPlayers = [player_1, ..., player_m]_(m x k),   substitutePlayers = [player_(m+1), ..., player_n]_(s x k)    (2)

in which m is obtained by rounding n multiplied by the divide coefficient of Table 1, and s = n - m.
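As an illustration of steps 2.2 and 2.3 (written for this edition, not code from the paper), the sketch below builds a team as an n-by-k NumPy array of random players and splits it into main and substitute players by fitness; the sphere cost function and the value of the divide coefficient are assumptions made for the example.

import numpy as np

def create_team(n, k, low=-1.0, high=1.0, rng=np.random.default_rng(0)):
    # Eq. (1): a team is an n-by-k array; each row is one player (a candidate solution).
    return rng.uniform(low, high, size=(n, k))

def divide_players(team, fitness, divide_coef=0.5):
    # Eq. (2): the m = round(n * divide_coef) fittest players become the main players.
    m = int(round(len(team) * divide_coef))
    order = np.argsort([fitness(p) for p in team])      # ascending cost, i.e. fittest first
    return team[order[:m]], team[order[m:]]             # (main players, substitute players)

sphere = lambda p: float(np.sum(p ** 2))                # toy cost function to minimise
team = create_team(n=100, k=10)
main, subs = divide_players(team, sphere)
print(main.shape, subs.shape)                           # (50, 10) (50, 10)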

2.4 Giving the ball to a suitable player

From the beginning of each playing period until its end, there is one ball in a football game. Hence,
one player among the existing main players is selected to get possession of the ball. Individual
solutions are selected through a fitness-based process, where fitter solutions (as measured by a
fitness function) are typically more likely to be selected. Other methods rate only a random sample of
the population, as this process may be very time-consuming. Equation (3) defines the selection
method used here, which rates the fitness of each solution, adds a random value (with uniform
distribution U) to it, and preferentially selects the best-ranked solution.

OwnerBall = BestIndex{Rank_1, ..., Rank_m}    (3)

in which

Rank_i = Fitness(player_i) + U(-d, +d)

2.5 Passing the ball to the best player

Passing the ball is a key part of football. The purpose of passing is to keep possession of the ball by
maneuvering it on the ground between different players and to advance it up the playing field. Aside
from having conspicuous advantages, passing is a skill that demands good technical ability not only
from the distributor but from the receiver as well. In this algorithm, passing is a tool with great creative
potential and always has to be directed at a teammate's feet; the pass is considered an offensive
action. Thus, as shown in Figure 2, in each iteration the rank of every player in the population is
evaluated, the best-ranked player is selected from the current main players (based on their fitness),
and parameters are exchanged between the passer and that player.

Figure 2: Exchange of k parameters between two players in the pass operation
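A small sketch of the two operators just described, under the same illustrative conventions as above (cost minimisation, players stored as NumPy rows): the ball owner is chosen by ranking fitness plus uniform noise, as in equation (3), and a pass exchanges a random subset of parameters between the passer and the best-ranked main player, as in Figure 2. The pass coefficient and the noise width d are placeholder values.

import numpy as np
rng = np.random.default_rng(1)

def give_ball(main_players, fitness, d=0.1):
    # Eq. (3): Rank_i = Fitness(player_i) + U(-d, +d); the best-ranked index gets the ball.
    ranks = np.array([fitness(p) for p in main_players]) + rng.uniform(-d, d, len(main_players))
    return int(np.argmin(ranks))                        # minimisation: lowest rank wins

def pass_ball(main_players, passer, receiver, pass_coef=0.3):
    # Exchange a random subset of the k parameters between passer and receiver (Figure 2).
    swap = rng.random(main_players.shape[1]) < pass_coef
    a, b = main_players[passer].copy(), main_players[receiver].copy()
    main_players[passer, swap], main_players[receiver, swap] = b[swap], a[swap]

sphere = lambda p: float(np.sum(p ** 2))
main = rng.uniform(-1, 1, size=(10, 5))
owner = give_ball(main, sphere)
best = int(np.argmin([sphere(p) for p in main]))
pass_ball(main, owner, best)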

2.6 Attacking players into free space

Once a player has passed the ball, the other players do not remain stationary but move into a position
where they can receive the ball and give more options to the player in possession. Moving into free
space is one of the most critical skills that footballers must develop. Players must move off the ball
into space to give an advance the maximum chance of success. Passes into space are feasible when
there is intelligent movement of players to receive the ball and do something constructive with it. In
this algorithm, players move through the search space. The proposed algorithm models this fact by
moving all the players toward the best player; to search different points around the best player, a
random amount of deviation is added to the direction of movement. Figure 3 shows an overview of
this movement; the transfer occurs in a space that is shown as a triangle.

Figure 3: Moving players toward the best player with a random deviation

This movement is shown in equation (4), in which a player moves towards the best player by x units:

x ∈ [0, U(0, d) + ε]    (4)

where U is the uniform (or any other) distribution, ε is a number greater than zero that causes the
players to approach the goal from any side, d is the distance between the best player and the other
player, and the deviation from the original direction is adjusted by a separate parameter.
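The movement operator of equation (4) can be sketched as follows, again only as an illustration: each main player takes a random step of length x towards the best player, with a small random deviation of the direction so that different points around the best player are explored; the step constant eps and the deviation scale are placeholder values standing in for the coefficients of Table 1.

import numpy as np
rng = np.random.default_rng(2)

def move_towards_best(main_players, best_idx, eps=0.1, dev=0.1):
    # Eq. (4) sketch: step length x drawn from [0, U(0, d) + eps], direction slightly deviated.
    best = main_players[best_idx]
    for i in range(len(main_players)):
        if i == best_idx:
            continue
        direction = best - main_players[i]
        d = float(np.linalg.norm(direction))
        if d == 0.0:
            continue
        x = rng.uniform(0.0, d) + eps                        # how far to move towards the best player
        noise = rng.normal(scale=dev, size=direction.shape)  # random deviation from the straight line
        main_players[i] += x * (direction / d + noise)
    return main_players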

2.7 Moving by spectators effect

The impact of spectators upon sport is substantial and varied, and it is one of the reasons for the
success of football teams. Spectators at the stadium and at team practices increase morale and the
sense of responsibility in the football players, and this feeling is transferred among all players and
even the coaches and managers. This movement is shown in figure 4, in which the spectators effect
is modeled by a random change in the players' parameters. In equation (5), m is the number of main
players and k is the number of parameters of each player, and the two spectator-effect coefficients of
Table 1 determine how many parameters and players are affected:

EffectPlayers = k × (spectators effect on players),   EffectParameters = m × (spectators effect on parameters)    (5)

2.8 Substitutes

A number of players may be replaced by substitutes during the course of a football game. Common
reasons for a substitution include injury, tiredness, ineffectiveness, a tactical switch, or time wasting at
the end of a finely poised game. The most tired players are generally substituted, but coaches often
replace ineffective players in order to freshen up the attacking posture and increase their chances of
scoring. In this algorithm, as in football matches, substituting players is used to improve the
conditions; this can vary during the game and has a significant impact on the success of the team.
The number of substitutions must be determined before the algorithm begins, and may be anywhere
between zero and n.
Thus, as shown in Figure 4, in each iteration the fitness of every player in the team is evaluated
and the weakest main player is compared with the best substitute player. If the substitute player is
stronger than the main player, a switch takes place. During the execution, the algorithm can use the
number of replacements to adjust the parameters; for example, if it is very high, the spectators effect
on players must be decreased.

Figure 4: Player replacement by the substitute

2.9 Convergence

The process is repeated until a termination condition is reached. Common terminating conditions
are:
- A solution is found that satisfies minimum criteria (goal).
- A fixed number of iterations is reached.
- The allocated budget (computation time/money) is reached.
- The fitness of the best solution has reached a plateau such that successive iterations no longer produce better results.
- Manual inspection.
- Combinations of the above.
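The substitution step can be sketched in a few lines under the same conventions as the earlier snippets (cost minimisation, players as NumPy rows): whenever the best substitute is fitter than the weakest main player, the two are swapped. This is an illustrative sketch, not the authors' implementation.

import numpy as np

def substitute(main, subs, fitness):
    # Compare the weakest main player with the best substitute and swap them if useful.
    main_cost = np.array([fitness(p) for p in main])
    subs_cost = np.array([fitness(p) for p in subs])
    worst_main, best_sub = int(np.argmax(main_cost)), int(np.argmin(subs_cost))
    if subs_cost[best_sub] < main_cost[worst_main]:
        main[worst_main], subs[best_sub] = subs[best_sub].copy(), main[worst_main].copy()
    return main, subs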

ANN Learning based on FOA

Optimal connection weights can be found by a global search in which the architecture of the neural
network is pre-defined and fixed during the evolution. A three-layered perceptron neural network is
applied, including an input layer, a hidden layer and an output layer, formulated as in (6):

o_p = sum_{i=1..H} w_ip * f( sum_{j=1..n} w_jp * x_j )    (6)

where p denotes the number of the epoch, H denotes the number of neurons in the hidden layer, w
denotes the weights of the network and f denotes the activation function of each neuron, which can
be the sigmoid or tanh. The number of input nodes is set to the number of attributes, the number of
hidden nodes to 10, and there is one node in the output layer. We consider the weights of the network
in the training phase as the variables of an optimization problem, with the Mean Square Error (MSE)
as the cost function; the goal of the proposed algorithm is to minimize this cost function. Figure 6
shows the flowchart of the proposed algorithm. The evolutionary search for connection weights can
be formulated as follows: (1) generate an initial team of N weight vectors and evaluate the MSE of
each, depending on the problem; (2) depending on the fitness and using a suitable selection method,
divide the players into main players and substitute players; (3) apply the football operators to the
players; (4) if the specified number of generations has not been reached, go to step 3; (5) end.
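To connect the algorithm to ANN training, the sketch below (an illustration under assumed layer sizes and data, not the authors' code) treats all weights of a one-hidden-layer perceptron as a single flat vector, i.e. one FOA player, and returns the MSE of formula (6) as that player's cost.

import numpy as np

def mse_cost(player, X, y, n_in, n_hidden):
    # Decode a flat weight vector into a one-hidden-layer MLP and return its Mean Square Error.
    w1 = player[: n_in * n_hidden].reshape(n_in, n_hidden)   # input -> hidden weights
    w2 = player[n_in * n_hidden:].reshape(n_hidden, 1)       # hidden -> output weights
    hidden = 1.0 / (1.0 + np.exp(-X @ w1))                   # sigmoid activation, the f of Eq. (6)
    out = (hidden @ w2).ravel()                              # network output o_p (one output node)
    return float(np.mean((out - y) ** 2))                    # MSE used as the FOA cost function

# Toy usage: 4 inputs, 10 hidden neurons, random data; one player = one flat weight vector.
rng = np.random.default_rng(3)
n_in, n_hidden = 4, 10
X, y = rng.normal(size=(30, n_in)), rng.normal(size=30)
player = rng.uniform(-1, 1, size=n_in * n_hidden + n_hidden)
print(mse_cost(player, X, y, n_in, n_hidden))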


4 Experimental Results

In this paper, the performance of the proposed method is evaluated in comparison to the ICA, PSO
and GA algorithms for training a three-layered perceptron neural network. In the FOA algorithm, the
five adjustable coefficients of Table 1 (other than the number of players) are set to 0.5, 1, 2.5, 0.1 and
0.1, respectively, and the number of players is 100. In the ICA algorithm the parameters are set to 2,
0.5 and 0.2, and the numbers of imperialists and colonies are 10 and 100. In the PSO algorithm, the
parameters c1 and c2 are fixed to 1.5 and the number of particles is 100; with this choice of c1 and
c2, the social and cognitive components are given an equal chance to take part in the search
process. In the GA, the population size is 100, and the mutation and crossover rates are set to 0.03
and 0.5, respectively. The number of iterations is 1000 for all methods. The datasets used for
evaluating the proposed approach are well-known classification datasets available for download from
the UCI repository. Five datasets were selected:

- Wine: data from chemical analysis of wines belonging to 3 classes; 178 samples with 13 attributes.
- Glass: glass component analysis for glass pieces belonging to 7 classes; 214 samples with 9 attributes.
- Statlog (Heart): heart disease data belonging to 2 classes; 270 samples with 13 attributes.
- Vertebral Column: values of six biomechanical features used to classify orthopedic patients into 3 classes (normal, disk hernia or spondylolisthesis); 310 samples with 6 attributes.
- Teaching Assistant Evaluation: evaluations of teaching performance with scores low, medium or high; 151 samples with 5 attributes.

50% of the instances are used for training the neural network and the remaining 50% for testing. The
neural network is trained by the FOA, ICA, PSO and GA algorithms and the results are compared with
each other. An accurate comparison of the four methods is presented using 10-fold experiment
replication. For each classification problem the same topologies are selected, and the minimum cost
function value and the mean cost value versus epochs are reported. The results of these experiments
are presented in Tables 2 and 3. Figure 6 shows the mean test error and the mean train error (false
classification percentage) for each of the four compared optimization methods on the five
classification problems. From the experimental results, it can be seen that in all cases the FOA
performed better.

Figure 5: Training and classification processes

Table 2: Train results for each of the four methods

Dataset | FOA MSE | FOA Precision (%) | ICA MSE | ICA Precision (%) | PSO MSE | PSO Precision (%) | GA MSE | GA Precision (%)
Wine | 0.0509 | 0.9348 | 0.0945 | 0.8587 | 0.0552 | 0.8478 | 0.1783 | 0.4022
Glass | 0.4881 | 0.742 | 0.5762 | 0.5158 | 0.5505 | 0.5965 | 0.6948 | 0.4649
Heart | 0.0887 | 0.8603 | 0.1778 | 0.7647 | 0.1051 | 0.6765 | 0.1999 | 0.7059
Vertebral | 0.2836 | 0.7134 | 0.4148 | 0.6115 | 0.3154 | 0.7061 | 0.4319 | 0.5796
Teaching | 0.2878 | 0.6538 | 0.5264 | 0.4487 | 0.3103 | 0.6026 | 0.5077 | 0.4872

Table 3: Test results for each of the four methods

Dataset | FOA MSE | FOA Precision (%) | ICA MSE | ICA Precision (%) | PSO MSE | PSO Precision (%) | GA MSE | GA Precision (%)
Wine | 0.0224 | 0.9651 | 0.0723 | 0.9186 | 0.3316 | 0.7326 | 0.2350 | 0.3372
Glass | 1.261 | 0.651 | 1.5927 | 0.5300 | 1.4525 | 0.4800 | 2.8823 | 0.4000
Heart | 0.2008 | 0.7463 | 0.2099 | 0.7015 | 0.1874 | 0.5970 | 0.2185 | 0.6642
Vertebral | 0.3649 | 0.6948 | 0.4298 | 0.5294 | 0.4293 | 0.6869 | 0.5301 | 0.5556
Teaching | 1.621 | 0.3562 | 2.600 | 0.2899 | 2.077 | 0.3014 | 3.588 | 0.2110

Figure 6: Mean train (test) error for all methods

Figures 7-10 compare the Mean Square Error (MSE) of the neural network trained by the FOA, ICA,
GA and PSO algorithms on the Teaching dataset, and indicate that the proposed algorithm is trained
considerably better than the other algorithms.

Figure 7: Mean square error for FOA per iteration
Figure 8: Mean square error for ICA per iteration
Figure 9: Mean square error for GA per iteration
Figure 10: Mean square error for PSO per iteration

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

that the proposed algorithm trained very well rather of ball to the goal as expected. In this cooperation,
than the other algorithms.
the ball is moved gradually to the goal and finally best
player takes a shot at the goal. Then, Football Optimization Algorithm uses an evolutionary algorithm
in order to optimize the weights of a neural network.
The FOA method is evaluated on five known classification problems and compared against the state of the
art methods: ICA, PSO and GA. The consideration
of the results showed that the training and test error
of the network trained by the FOA algorithm has been
reduced in comparison to the other three methods. FuFigure 7: Mean square error for FOA per iteration
ture work will consist in modifying some parts of the
algorithm improve the algorithm execution speed.
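Since the paper does not include an implementation, the sketch below is only an illustration of the setup shared by all four trainers: a candidate solution is a flat weight vector for a fixed three-layer perceptron, and its cost is the MSE on the training split, which any population-based optimizer (FOA, ICA, PSO or GA) can then minimize. All names are hypothetical and NumPy is assumed.

import numpy as np

def unpack(w, n_in, n_hid, n_out):
    # split a flat weight vector into the two layer matrices plus biases
    i = 0
    W1 = w[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = w[i:i + n_out]
    return W1, b1, W2, b2

def mlp_forward(w, X, n_in, n_hid, n_out):
    W1, b1, W2, b2 = unpack(w, n_in, n_hid, n_out)
    h = np.tanh(X @ W1 + b1)        # hidden layer
    return np.tanh(h @ W2 + b2)     # output layer

def mse_cost(w, X, Y, n_in, n_hid, n_out):
    # cost used as the fitness of one candidate weight vector ("player")
    pred = mlp_forward(w, X, n_in, n_hid, n_out)
    return np.mean((pred - Y) ** 2)

A population-based trainer keeps a set of such weight vectors, moves them according to its own rules (passes and shots in FOA, assimilation in ICA, velocity updates in PSO, crossover and mutation in GA), and reports the vector with the lowest cost.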

References

[1] T. J. Glezakos, T. A. Tsiligiridis, L. S. Iliadis, C. P. Tsiligiridis, F. P. Maris, and P. K. Yialouris, Feature extraction for time-series data: An artificial neural network evolutionary training model for the management of mountainous watersheds: Lecture Notes in Computer Science, Neurocomputing 73/2009 (2009), 49-59.
[2] T. J. Glezakos, G. Moschopoulou, T. A. Tsiligiridis, S. Kintzios, and C. P. Yialouris, Plant virus identification based on neural networks with evolutionary preprocessing: Lecture Notes in Computer Science, Computers and Electronics in Agriculture 70/2010 (2010), 263-275.
[3] M. Georgiopoulos, C. Li, and T. Kocak, Learning in the feed-forward random neural network: A critical review: Lecture Notes in Computer Science, Performance Evaluation 68/2011 (2011), 361-384.
[4] P. Kordik, J. Koutnik, J. Drchal, O. Kovarik, M. Cepek, and M. Snorek, Meta-learning approach to neural network optimization: Lecture Notes in Computer Science, Neural Networks 23/2010 (2010), 568-582.
[5] C. Dai, W. Chen, Y. Zhu, Z. Jiang, and Z. You, Seeker optimization algorithm for tuning the structure and parameters of neural networks: Lecture Notes in Computer Science, Neural Networks 74/2011 (2011), 876-883.
[6] D. Mantzaris, G. Anastassopoulos, and A. Adamopoulos, Seeker optimization algorithm for tuning the structure and parameters of neural networks: Lecture Notes in Computer Science, Neural Networks 24/2011 (2011), 831-835.
[7] D. Rivero, J. Dorado, J. Rabunal, and A. Pazos, Generation and simplification of Artificial Neural Networks by means of Genetic Programming: Lecture Notes in Computer Science, Neurocomputing 73/2010 (2010), 3200-3223.
[8] A. Sedki, D. Ouazar, and E. Mazoudi, Evolving neural network using real coded genetic algorithm for daily rainfall-runoff forecasting: Lecture Notes in Computer Science, Expert Systems with Applications 36/2009 (2009), 4523-4527.
[9] S. Kiranyaz, T. Ince, A. Yildirim, and M. Gabbouj, Evolutionary artificial neural networks by multi-dimensional particle swarm optimization: Lecture Notes in Computer Science, Neural Networks 22/2009 (2009), 1448-1462.
[10] J. Yu, S. Wang, and L. Xi, Evolving artificial neural networks using an improved PSO and DPSO: Lecture Notes in Computer Science, Neurocomputing 71/2008 (2008), 1054-1060.
[11] M. Abdechiri, K. Faez, and H. Bahrami, Neural Network Learning Based on Chaotic Imperialist Competitive Algorithm: Lecture Notes in Computer Science, 2nd Int. Workshop on Digital Object Identifier Intell (2010), 1-5.
[12] S. Kulluk, L. Ozbakir, and A. Baykasoglu, Self-adaptive global best harmony search algorithm for training neural networks: Lecture Notes in Computer Science, Procedia Computer Science 3/2011 (2011), 282-286.

Evaluating XML Retrieval Systems Using Methods of Averaging


Precision and Recall at Rank Cut-offs
Marzieh Javadi

Department of Computer Engineering, Zanjan Branch, Islamic Azad University, Zanjan, Iran
MarziehJavadi@ymail.com

Hassan NADERI
Faculty of Iran University of Science and Technology (IUST)
naderi@iust.ac.ir

Abstract: Today, with the growing number of XML documents on the web, efforts to develop XML retrieval systems are also growing. As more XML retrieval systems are offered, their performance evaluation becomes more important. In this context, there are several metrics that are used to rank retrieval systems, and most of them extend the definitions of precision and recall.
In this paper, the rankings of XML retrieval systems for the INEX 2010 runs, obtained according to three methods of averaging precision and recall values at specific rank cut-offs, are compared with the results of the MAiP metric that is used for evaluation by INEX.

Keywords: XML Retrieval; Precision; Recall; Evaluation Metrics.

Introduction

Compared with traditional information retrieval, which considers the whole document as the retrievable unit, the XML document structure provides more evaluation challenges. XML retrieval systems work on collections of articles that have been marked up with XML, and retrieve a list of XML elements as the best answers to a user query[1]. With the increasing number of XML retrieval systems, evaluating their performance becomes more important. Most of these evaluation methods use test collections made for this purpose. These test collections usually include a set of documents, user requests and relevance assessments that specify the set of correct answers for each user request[2]. Since 2002, several metrics for the performance evaluation and ranking of XML retrieval systems have been presented, each of which has some disadvantages. In this paper, the influence of precision and recall, the main concepts in evaluating retrieval system efficiency, has been studied at specific rank cut-offs.
In section 2, the part of INEX that is used in this paper is described. In section 3, the evaluation metrics are presented. In section 4, the rankings of XML retrieval systems based on the measures stated in section 3 are compared with each other. In section 5, results are presented.

2   INEX

The INEX project is a large-scale project in the field of XML retrieval, and includes a set of test collections and evaluation methods[3].

2.1   INEX 2010 Data Centric Collection

The INEX collection in this track uses the IMDB collection, built from www.imdb.com. Text files from this site were converted to XML documents. Overall, this collection includes 4,418,102 XML files[4].
Corresponding Author, T: (+98) 5827115


2.2   Topics

In total, 28 topics were selected for this track, reflecting users' information needs. A sample topic from this track is shown in Fig. 1.

Figure 1: INEX 2010 Data Centric Track Topic Sample

2.3   Assessments and Evaluation in the INEX 2010 Data Centric Track

Runs of XML retrieval systems were evaluated with precision, recall, MAP, P@5, P@10, ... in INEX 2010.

3   Evaluation Metrics

In INEX 2010, for a topic, systems should retrieve a ranked list of XML elements that have been detected as relevant to the request. So XML retrieval systems, in addition to ordering XML elements according to their estimated relevance score, should retrieve elements that do not overlap with previously retrieved elements[5].

3.1   Precision and Recall

The amount of retrieved relevant information is measured by the length of the relevant text. So, instead of counting the number of retrieved relevant documents, the amount of relevant text that is retrieved is measured[6].
As shown in equation 1, rsize(e_i) is the amount of relevant text in the element e_i retrieved at rank i. For measuring the amount of relevant text retrieved from e_i, the relevance value function is defined as:

rval(e_i) = rsize(e_i)                                    if overlap(i) = 0
rval(e_i) = rsize(e_i) - \sum_{e_j \in R_i} rval(e_j)     otherwise         (1)

In equation 1, if there is overlap between e_i and some e_j in R_i (the list of elements retrieved before element i), the relevance value of e_j is deducted from the relevance value of e_i.
Precision at rank r is measured as the fraction of relevant information among the retrieved information at rank r, as shown in equation 2:

P@r = ( \sum_{i=1}^{r} rval(e_i) ) / ( \sum_{i=1}^{r} size(e_i) )           (2)

In the above equation, size(e_i) is the size of the element retrieved at rank i, and |L| is the length of the retrieved element list[7].
Recall at rank r is measured as the fraction of the relevant information that is retrieved at rank r, as shown in equation 3:

R@r = (1 / T_rel) \sum_{i=1}^{r} rval(e_i)                                  (3)

In the above equation, T_rel is the total amount of highlighted relevant text for a topic[7].

3.2   MAiP

For each topic, AiP is calculated by averaging the iP scores at 101 standard recall levels:

AiP = (1/101) \sum_{x=0.00,0.01,...,1.00} iP[x]                             (4)

Here, iP at standard recall level x is the highest precision obtained at this or any following recall level. Mean average interpolated precision (MAiP) is calculated by averaging the AiP values over all topics[5]. MAiP for n topics is shown in equation 5:

MAiP = (1/n) \sum AiP                                                       (5)

3.3   F Measure

Precision and recall values can be converted to a single score using the F measure. By comparing the scores obtained from the F measure, it is possible to know which system is capable of retrieving more relevant information without retrieving a significant amount of irrelevant information[7]. The F measure is calculated as in equation 6:

F-measure = (2 \cdot P@r \cdot R@r) / (P@r + R@r)                           (6)

3.4   Arithmetic and Geometric Means

The arithmetic and geometric means of the precision and recall values are calculated as in equations 7 and 8:

ArithmeticMean = (P@r + R@r) / 2                                            (7)


GeometricMean = \sqrt{P@r \cdot R@r}                                        (8)
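As an illustration of the metrics in Section 3 (not the INEX evaluation code itself), the sketch below computes P@r, R@r, the F measure and the two means from a ranked list of retrieved elements, assuming a simplified non-overlapping run where rval(e_i) is simply the amount of relevant text in e_i. All names and numbers are hypothetical.

import math

def precision_at_r(rval, size, r):
    # Eq. (2): retrieved relevant text over retrieved text, at rank r
    return sum(rval[:r]) / sum(size[:r])

def recall_at_r(rval, t_rel, r):
    # Eq. (3): retrieved relevant text over all relevant text, at rank r
    return sum(rval[:r]) / t_rel

def f_measure(p, r):
    # Eq. (6)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def arithmetic_mean(p, r):   # Eq. (7)
    return (p + r) / 2

def geometric_mean(p, r):    # Eq. (8)
    return math.sqrt(p * r)

# toy run: relevant text and element sizes (in characters) per rank
rval = [120, 0, 300, 40, 0]
size = [150, 200, 320, 80, 500]
t_rel = 600   # total highlighted relevant text for the topic
p, r = precision_at_r(rval, size, 5), recall_at_r(rval, t_rel, 5)
print(p, r, f_measure(p, r), arithmetic_mean(p, r), geometric_mean(p, r))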

4   System Rankings with Evaluation Measures

In this section, the rankings of XML retrieval systems for the INEX 2010 runs on the data centric track are computed according to the three methods of averaging precision and recall, and compared with the system ranking obtained from MAiP.
First, we calculated precision, recall and MAiP for the XML retrieval systems of INEX 2010 IMDB. Afterwards, the systems were evaluated by calculating the F measure and the arithmetic and geometric means, and ranked based on these measures. INEX selected 1, 2, 5, 10, 25 and 50 as rank cut-offs, so we use these cut-off points too.
Tables 1, 2 and 3 show the Spearman correlation coefficients calculated from the run orderings of the 27 submitted runs, respectively for the arithmetic mean, the geometric mean and the F measure at the rank cut-offs, for INEX IMDB 2010. The best result of each method is shown in bold. We observe that the results of each of the three averaging methods are strongly correlated with MAiP.

Table 1. Spearman correlation coefficients calculated from the run orderings obtained from the arithmetic mean at rank cut-offs and MAiP

        a@1     a@2     a@5     a@10    a@25    a@50
MAiP    0.87    0.85    0.92    0.93    0.92    0.94

Table 2. Spearman correlation coefficients calculated from the run orderings obtained from the geometric mean at rank cut-offs and MAiP

        G@1     G@2     G@5     G@10    G@25    G@50
MAiP    0.84    0.85    0.93    0.93    0.92    0.93

Table 3. Spearman correlation coefficients calculated from the run orderings obtained from the F measure at rank cut-offs and MAiP

        F@1     F@2     F@5     F@10    F@25    F@50
MAiP    0.80    0.86    0.90    0.90    0.89    0.91
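The comparison reported in Tables 1-3 reduces to a Spearman rank correlation between two orderings of the same runs. The following small sketch (with hypothetical scores, not the INEX data) shows one way such a coefficient can be computed, ranking runs by score and then taking the Pearson correlation of the two rank vectors.

def ranks(scores):
    # rank runs by score, highest first; ties get the average rank
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    r = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    # Pearson correlation of the two rank vectors
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra) ** 0.5
    vb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (va * vb)

# maip[i] and a50[i] would hold the MAiP and arithmetic-mean@50 scores of run i
print(spearman([0.31, 0.28, 0.40, 0.22], [0.35, 0.30, 0.41, 0.20]))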

The graph of Figure 2 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and by the arithmetic mean of precision and recall at rank 50, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.94.

Figure 2: Correlation between run orderings by MAiP and arithmetic mean of precision and recall at rank 50

The graph of Figure 3 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and by the geometric mean of precision and recall at rank 10, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.93.

Figure 3: Correlation between run orderings by MAiP and geometric mean of precision and recall at rank 10

The graph of Figure 4 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and by the F measure at rank 50, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.91.

Figure 4: Correlation between run orderings generated by MAiP and F measure at rank 50
Conclusions and Future Works

In this paper, the rankings of XML retrieval systems based on three metrics have been compared. To measure the amount of correlation between the results obtained from the metrics, Spearman correlation coefficients were used. The results of this study show that the average values of precision and recall at some rank cut-offs are strongly correlated with the MAiP measure. Despite the importance of research on overcoming the weaknesses of existing metrics and of efforts to create new metrics, the results of the simplest definitions are very close to those of the best existing metrics. According to the results shown in the tables of section 4, the arithmetic mean of precision and recall at rank cut-off 50 created the best results; hence it can be an appropriate baseline for comparing the results of metrics created in the future. In the future, we want to expand this research with the Wikipedia collection and more cut-off points.

References

[1] J. Pehcevski and B. Piwowarski, Evaluation Metrics for Semi-Structured Text Retrieval (2009).
[2] M. Lalmas and A. Tombros, INEX 2002-2006: Understanding XML Retrieval Evaluation: DELOS'07 Proceedings of the 1st International Conference on Digital Libraries: Research and Development, Springer, Berlin/Heidelberg (2007), 187-196.
[3] N. Fuhr, N. Govert, G. Kazai, and M. Lalmas, INEX: Initiative for the Evaluation of XML Retrieval: Proceedings of the SIGIR 2002 Workshop on XML and Information Retrieval (2002).
[4] A. Trotman and Q. Wang, Overview of the INEX 2010 Data Centric Track: Lecture Notes in Computer Science, Springer, Berlin/Heidelberg 6932/2011 (2011), 171-181.
[5] J. Kamps, J. Pehcevski, G. Kazai, M. Lalmas, and S. Robertson, INEX 2007 evaluation measures, Springer, Heidelberg 4862 (2008), 24-33.
[6] J. Pehcevski and J.A. Thom, HiXEval: Highlighting XML Retrieval Evaluation. In Advances in XML Information Retrieval and Evaluation: Fourth Workshop of the Initiative of XML Retrieval, INEX 2005, Springer, Berlin/Heidelberg 3977/2006 (2006), 43-57.
[7] J. Pehcevski, Evaluation of Effective XML Information Retrieval, PhD thesis, Chapter 5, pages 149-184, 2006.


Performability Improvement in Grid Computing with


Artificial Bee Colony Optimization Algorithm
Neda Azadi

Mohammad Kalantari

Islamic Azad University of Qazvin

Islamic Azad University of Qazvin

Department of Electrical, IT & computer science

Department of Electrical, IT & computer science

Qazvin, Iran

Qazvin, Iran

Neda.Azadi@qiau.ac.ir

md.kalantari@aut.ac.ir

Abstract: Modeling and evaluating a grid computing environment is very difficult because of its complexity and distributed nature. The present paper studies the evaluation of the performability of grid computing. Here, a tree structure is assumed for the grid, with the RMS at its root. Users give their tasks as well as their requirements to the RMS and finally take back the results from it. The RMS divides the task into smaller parallel subtasks in order to get better performance. Assigning each parallel subtask to several resources also increases its reliability. Analysis of the system by means of reliability and performance measures together is called performability. The performability improvement is directly related to the resource allocation among subtasks. In this paper, we present an algorithm for resource allocation based on the artificial bee colony optimization algorithm. The most important step in optimizing algorithms is to define the objective function that should be solved by the optimizing algorithm; in this paper, the objective function is the performability improvement. Since a tree structure is used in the resource allocation problem, Bayesian logic and graph theory are also used.

Keywords: RMS; performability; Bayesian model; graph theory; optimization; artificial bee colony; swarm intelligence.

Introduction

Grid computing[2] has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. The Open Grid Services Architecture (OGSA[1]) enables the integration of services and resources across distributed, heterogeneous, dynamic virtual organizations and service-provider relationships. This feature of OGSA gives the grid the opportunity to satisfy its users' requirements with the best QoS and performance, so providing for the grid users' requirements is of great importance. A grid user's requirement comes at different levels and is a combination of high performance and reliability together, which is called performability.
The users give their desired levels of performance and reliability requirements to the RMS [3]; the RMS divides the user's task into parallel subtasks and then distributes the subtasks among the available resources according to the resource conditions and the level of the user's requirements. This is resource allocation. After performing the subtasks, the resources give the results back to the RMS and finally they are delivered to the users. Performance, which is mostly interpreted as the execution time, is affected by factors such as the number of available resources, the reliability of the resources and the communication channels [5]. It is evident that performance and reliability affect each other; however, they were evaluated separately in the past [4]. In this paper these two measures are considered simultaneously in a grid with a tree structure.

Corresponding Author, P. O. Box 7153744414, T: (+98) 917 110 0291


If a task is broken into n parallel subtasks, the execution time will decrease. But in a real situation, which is not devoid of failures, any failure in a subtask makes the whole task execution problematic. In order to solve this problem, increase the reliability besides the performance, and create harmony between these two measures, we assign each subtask to several resources. In this way, if a failure occurs, the subtask can be performed by other resources and the probability of flawless accomplishment of the main task increases [5].
In paper [10], the evaluation of performability is studied for a grid with a star structure. The models used in the evaluation of system performance and reliability are queuing networks [12], stochastic Petri nets [14], Bayesian models [16] and Markov models [13]. Each of the above models can be evaluated by analysis or by simulation [15]. Like paper [5], we use the Bayesian method for evaluation.
As mentioned, one way to increase the grid's performability is to optimize the resource allocation among subtasks. In paper [6] this is done with a genetic algorithm [25]. In the present paper we instead make use of the artificial bee colony, since it is simpler and more flexible than the genetic algorithm.
The rest of the paper is organized as follows. Section 2 presents a model for the evaluation of reliability and performance. The artificial bee colony algorithm is explained in the third section. The results of the optimization are presented in section 4, and in the final section a comparison between the genetic optimization algorithm and the artificial bee colony optimization algorithm is presented.

Performance and Reliability Models

Few researches have been done on grid performability, since its complexity challenges model making and evaluation [8]. In this part, the grid is evaluated from a performability point of view. In order to utilize this model, the following hypotheses are needed [5, 6, 9]:

- The requirements are taken into account immediately; therefore, no time is wasted.
- The RMS divides each task into several subtasks.
- The resources are automatically registered in the RMS.
- The resources are particularly assigned to a subtask.
- Each resource has a constant processing speed and rate of failure. Each communication channel has a constant rate of failure and bandwidth.
- The rates of failure of processing resources and communication channels do not change during the activity period.
- Failures in resources and communication channels follow a Poisson process. The probability of failure depends on the time of information transfer and input processing. In fact, the exponential distribution is a general distribution in the reliability analysis of software and hardware components that is both theoretically and practically acceptable [9].
- Failures of resources and communication channels are independent.
- If a failure happens in a resource or communication channel before the outputs are transferred from the resource to the RMS, the whole task encounters failure.
- The resources start processing the subtasks immediately after receiving them. This way, there is no waiting queue at the resource, and consequently there is neither waste of time nor waiting time.
- The whole task is not completely performed unless the results of all subtasks are delivered to the RMS.
- If the information is transferred via several communication channels, the transfer speed is limited by the link with the least bandwidth.
- The RMS is thoroughly reliable and its failure probability is assumed to be zero.
- The time of input transfer depends on the amount of inputs that should be transferred. The time of subtask processing depends on the complexity of the calculation.

According to the assumptions above, when subtask j is assigned to resource i, the processing time is a random variable that can be calculated from the relation

T_{ij} = C_j / x_i

where x_i is the processing speed of resource i and C_j is the computational complexity of subtask j. If data transmission between the RMS and resource i is accomplished through a set of links, where s_i is the bandwidth of the link with minimum bandwidth in that set and a_j denotes the amount of data that should be transmitted for subtask j, then the random time of communication between the RMS and the resource i that executes subtask j can be calculated from the relation

\tau_{ij} = a_j / s_i

For a constant failure rate, the probability that resource i does not fail until the completion of subtask j can be obtained as p_{ij} = e^{-\lambda_i T_{ij}}, where \lambda_i is the failure rate of resource i. Similarly, the probability that the communication channel between the RMS and resource i does not fail until the completion of subtask j is q_{ij} = e^{-\mu_i \tau_{ij}}, where \mu_i is the failure rate of the communication channel between the RMS and resource i.
The random total completion time for subtask j assigned to resource i is T_{ij} + \tau_{ij}, and the probability that it is reached without failure is p_{ij} \cdot q_{ij}.
During the division of a task into parallel subtasks, several combinations for performing the task come into existence. Each combination is called a realization [5]. Each realization performs the task in a deterministic time and with a specific probability. The reliability of a task is defined as the probability that the correct output is produced according to the user's requirements. Depending on whether or not the user has specified an execution time requirement, there are two ways of calculating the reliability:
1. If the user has requested a time limit \theta^* for the execution, the reliability of the task can be calculated through the relation

R(\theta^*) = \sum_{i=1}^{I} Q_i \cdot 1(\theta_i < \theta^*)        (1)

2. If the user has not specified the execution time, the reliability can be calculated as

R(\infty) = \sum_{i=1}^{I} Q_i        (2)

In these relations, I is the number of realizations of performing the task, \theta_i is the execution time of the task by realization i, and Q_i is the probability that the task is performed by realization i in time \theta_i.
The conditional expected service time W is considered to be a measure of performance; it determines the expected service time, given that the service does not fail:

W = (1 / R(\infty)) \sum_{i=1}^{I} \theta_i Q_i        (3)

R(\infty) is defined as the probability that the correct outputs are produced, without respect to the service time.
A tree is composed of the combination of resources and communication channels taking part in the execution of a task. Each tree contains several minimal spanning trees (MST) that guarantee the complete execution of the task by the subtasks. If any composing part of a tree encounters a failure, the whole task is jeopardized. As any task is divided into parallel subtasks, different realizations (MSTs) are obtained. The execution time of any MST is determined by the features of the grid, such as the bandwidth of the communication channels, the processing speed of the performing resources, and so on.
The probability of performing a task by each tree is measured after arranging the MSTs according to their execution times in increasing order. The execution of a task can be transferred to the next MST of the list if the previous MST encounters a failure. According to the model [5] briefly described above, and on the basis of conditional probability, the probability Q_i of performing a task by each MST, used in relations 1 and 3, can be deduced through the following relation:

Q_i = Pr(E_i, \bar{E}_{i-1}, \bar{E}_{i-2}, ..., \bar{E}_1)        (4)

where E_i is the event that MST_i is available and \bar{E}_i is the event that MST_i is not available. A binary search tree can be used to calculate relation 4.
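To make relations (1)-(3) concrete, the following sketch (with hypothetical inputs, not the paper's data) evaluates the reliability and the conditional expected service time of a candidate allocation, given the execution time theta_i and success probability Q_i of each realization (MST):

def reliability(realizations, deadline=None):
    # Eq. (1)/(2): sum of Q_i, optionally only over realizations meeting the deadline
    return sum(q for theta, q in realizations
               if deadline is None or theta < deadline)

def expected_service_time(realizations):
    # Eq. (3): W = (1 / R(inf)) * sum(theta_i * Q_i)
    r_inf = reliability(realizations)
    return sum(theta * q for theta, q in realizations) / r_inf

# each entry is (theta_i, Q_i) for one minimal spanning tree of the allocation
mst = [(35.0, 0.60), (48.0, 0.25), (70.0, 0.10)]
print(reliability(mst, deadline=50), reliability(mst), expected_service_time(mst))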


Optimizing Technique

Resource allocation in grids is a complicated problem [6]. To optimize it, we should make use of meta-heuristic methods [17]. Meta-heuristic algorithms belong to the group of approximate algorithms and have the ability to escape from local optima. The application of Swarm Intelligence (SI) [18] in meta-heuristic algorithms is a method vastly used in complicated problems. SI is a kind of artificial intelligence that is formed according to collective behavior in distributed and self-organized environments [21]. This intelligence is inspired by natural behaviors. Examples of such intelligence are the ant colony optimization algorithm [19], the particle swarm optimization algorithm [24], bee colony optimization algorithms [20] and the cuckoo optimization algorithm.
The artificial bee colony (ABC) algorithm is one of the newest and most applied optimization algorithms because of its simplicity and few control variables, and researchers have paid special attention to it since 2005. In the ABC algorithm, each cycle of the search consists of three steps: moving the employed and onlooker bees onto the food sources and calculating their nectar amounts; and determining the scout bees and directing them onto possible food sources. A food source position represents a possible solution to the problem to be optimized. The amount of nectar of a food source corresponds to the quality of the solution represented by that food source (the fitness function). Onlookers are placed on the food sources using a probability-based selection process: as the nectar amount of a food source increases, the probability with which the food source is preferred by onlookers increases too [21].
The main steps of the algorithm are given below [20]:

1: Initialize Population
2: repeat
3: Place the employed bees on their food sources
4: Place the onlooker bees on the food sources depending on their nectar amounts
5: Send the scouts to the search area for discovering new food sources
6: Memorize the best food source found so far
7: until requirements are met

The selection is controlled by a control parameter called limit. If a solution representing a food source is not improved within a predetermined number of trials, that food source is abandoned by its employed bee, and the employed bee is converted to a scout.

3.1   Description of the problem and its solution

Here we want to use the ABC algorithm to optimize the problem of resource allocation in the grid with the aim of improving its performability. To do this, we respectively set the control variables maxcycle and limit to 100 and 20. The three main stages of the algorithm are performed for 100 cycles in order to reach the optimized solution. The optimized solution is the best distribution of resources among the subtasks to gain the highest degree of reliability.
In each cycle, 10 new solutions are produced. These solutions are, in fact, the food sources around the hive among which the bees look for the best ones. The produced solutions are saved in arrays to be evaluated. The subtask number allocated to each resource is recorded in the solution array. For example, if we have 3 subtasks and 9 execution resources such that the first subtask is allocated to resources 1, 4, 6 and 7, the second subtask to resources 2 and 9, and the third subtask to resources 3, 5 and 8, then the solution array (indexed by resources R1 to R9) would be:

1  2  3  1  3  1  1  3  2

Ten random solutions are produced in the initial phase, and the first solution is presupposed to be the best one so far. Then, over the 100 cycles, the employed bees, onlooker bees and scout bees produce the next solutions in order to reach the optimized one. A new solution is made from an old solution by changing, at a random position of the array, the allocated subtask number to another subtask number. When making new solutions, we should pay attention to the fact that some solutions are not valid; for instance, every subtask must be assigned to at least one resource in each solution, and an invalid solution must be replaced by a new valid one. In each stage of the movement of the employed bees and onlooker bees, the fitness of each solution is evaluated by relation 2, introduced in the second part of the paper, and the best solution found so far is memorized in the global optimum variable. The limit control variable is used to control inappropriate solutions: if a solution does not improve the problem-solving procedure, its limit counter is increased by one, and if the counter goes beyond the determined number, the current solution is considered inappropriate and the scout bees replace it with a new solution.
order to reach the optimized solution. The optimized algorithm of ABC in the resource allocation problem,


4   Evaluating the Work of the Optimizing Algorithm

In order to evaluate the performance of the ABC optimizing algorithm on the resource allocation problem, we need to define a context for performing the algorithm. Since our goal is to compare the result of this optimizing algorithm with the genetic optimizing algorithm, we use the context of paper [6]. As in [6], the task is broken into 3 subtasks by the RMS. The amount of complexity and the amount of transferred data of each subtask are shown in Tables 1 and 2.

Table 1: Amount of Complexity of Each Subtask
SB1   38.94%
SB2   25.44%
SB3   35.62%

Table 2: Amount of Transferred Data of Each Subtask
SB1   250 MB
SB2   350 MB
SB3   400 MB

There are 9 processing resources that are connected together in a tree structure. Figure 1 shows the grid environment; the rates of resource failures and communication channel failures, the information transfer speeds and the resource processing speeds are also exhibited in this figure.

Figure 1: Structure of the evaluating grid [6]

A program for the ABC algorithm was written in Java and executed on a Pentium IV 1.5 GHz processor. A run takes about 1.50 minutes, and this convergence time is better than that of the genetic optimization algorithm. The resulting near-optimal subtask distribution among the nine resources is given in Table 3. The reliability and performance obtained for the near-optimal solution are 0.9799 and 41.10, respectively. As exhibited in the diagram of Figure 2, the reliability resulting from ABC is better than that of the genetic algorithm. This result shows that ABC is a more appropriate solution for resource allocation among the subtasks.

Table 3: Distribution of the Optimal Solution
Subtask   Amount of complexity   Distribution
SB1       38.94                  R1, R3, R7
SB2       25.44                  R2, R5
SB3       35.62                  R4, R6, R8, R9

Figure 2: Diagram of the comparison of the two optimizing algorithms

4.1   The effect of bandwidth on evaluation measures

We have two scenarios for limiting the communication channel when calculating the quality of the resources in the ABC algorithm:
1: the bandwidth of a communication path is taken to be the minimum over the existing channels.
2: the bandwidth of a communication path is taken to be the average over the existing channels.
If we use the latter scenario, the reliability and the performance both increase. Figure 3 shows the effect of using the average communication channel bandwidth.

Figure 3: Diagram of the comparison of the bandwidths' influence


As shown in the above diagram, the reliability of performing the task is increased by using the average bandwidth when limiting the communication channels.

4.2   The effect of the user's requirements on the optimizing algorithm

As previously mentioned, users' requirements for execution time are different. Some tasks should be performed within a time limit, while others are only supposed to be done correctly, without any time constraint. Therefore, the user will be more pleased with the result if the requirements are analyzed in addition to selecting and allocating the resources; this way, the resources and their power will be used in an appropriate manner [7]. So, in this part, we consider the proposed algorithm with respect to the user's requirements.
Imagine that, in the previous context, a user delivers the task along with a constraint on the execution time (a deadline) to the RMS. Such a time limitation directly affects the task's reliability since, in such a situation, the subtasks are only assigned to those realizations that can perform them in a shorter time than the user's deadline. In the diagram shown in Figure 4, the following three conditions are compared for a particular distribution. If the execution time limit is between 40 and 120 seconds, the different conditions can be defined as follows:

First condition: the user's deadline is 70 seconds.
Second condition: the user's deadline is 100 seconds.
Third condition: no deadline is defined.

Figure 4: Diagram of the influence of the user's requirement on reliability

As the above figure shows, tighter time limits in the user's requirements, i.e. less time allowed for task execution, lead to lower reliability.

Conclusion and Future Work

The problem of resource allocation is highly complicated because the complexity and distribution of computational grids exceed those of other distributed environments. It is not possible to optimize such difficult problems with common algorithms; rather, meta-heuristic optimization algorithms are more useful. The most important step in optimizing algorithms is to define the objective function that should be solved by the optimizing algorithm. In this paper, the objective function is the simultaneous increase of the two measures of performance and reliability or, in other words, performability. The grid user delivers his intended task as well as his requirements (optionally) to the RMS. After dividing the task into parallel subtasks, the RMS allocates the best resources to the subtasks using an optimizing algorithm, and in the meanwhile it considers the processing resources and communication channels and their attributes such as speed rate, bandwidth, processing speed and so on. Therefore the task can be performed with the highest performability.
As a future procedure for optimizing the problem of resource allocation in grids, we can use other meta-heuristic algorithms that are inspired by nature, like the artificial immune system, ant colony and particle swarm algorithms, and then compare the results. The exponential distribution is a general distribution in the reliability analysis of hardware and software components, but it has a constant rate, while in a real environment the failure rate is a time-varying parameter. Therefore, the use of another appropriate distribution for failures can be studied in the future.

References

[1] I. Foster, C. Kesselman, and S. Tuecke, The anatomy of the grid: Enabling scalable virtual organizations, International Journal of High Performance Computing Applications 15 (2001), 200-222.
[2] I. Foster, D. Becker, and C. Kesselman, The grid 2: Blueprint for a new computing infrastructure, San Francisco, CA: Morgan-Kaufmann, 2003.
[3] K. Krauter, R. Buyya, and M. Maheswaran, A taxonomy and survey of grid resource management systems for distributed computing, Software-Practice and Experience 32(2) (2002), 135-164.
[4] I. Eusgeld, J. Happe, and P. Limburg, Performability. In: Dependability Metrics, Springer, Berlin/Heidelberg (2008), p. 254.
[5] Y.S. Dai and G. Levitin, Reliability and performance of tree-structured grid services, IEEE Transactions on Reliability 55(2) (2006), 337-349.
[6] Y.S. Dai and G. Levitin, Optimal Resource Allocation for Maximizing Performance and Reliability in Tree-Structured Grid Services, IEEE Transactions on Reliability 56(3) (2007), 444-453.
[7] L. Ramakrishnan and D.A. Reed, Performability Modeling for Scheduling and Fault Tolerance Strategies for Scientific Workflows, HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing.
[8] S. Jarvis, N. Thomas, and A.V. Moorsel, Open issues in grid performability, I.J. of Simulation 5(5) (2005), 3-12.
[9] Y.S. Dai, M. Xie, and K.L. Poh, Reliability Analysis of Grid Computing Systems, Proc. Ninth IEEE Pacific Rim Intl Symp. Dependable Computing (PRDC'02) (2002), 97-104.
[10] G. Levitin and Y.S. Dai, Performance and reliability of a star topology grid service with data dependency and two types of failure, IIE Transactions 39(8) (2007), 783-794.
[11] A. Heddaya and A. Helal, Reliability, Availability, Dependability and Performability: A User-centred View, Technical Report BU-CS-97-011, Boston University (1996).
[12] L. Kleinrock, Queueing Systems, Volume 1: Theory, Wiley (1975).
[13] M. Bernardo and M. Bravetti, Performance Measurement Sensitive Congruences for Markovian Process Algebras, Theoretical Computer Science 290 (2003), 117-160.
[14] M. Ajmone Marsan, G. Balbo, and G. Conte, A Class of Generalized Stochastic Petri Nets for the Performance Evaluation of Multiprocessor Systems, ACM Transactions on Computer Systems 2 (1984), 93-122.
[15] J. Banks, J.S. Carson II, B. Nelson, and D. Nicol, Discrete-event System Simulation, Prentice-Hall, 1999.
[16] J.G.T. Toledano and L.E. Sucar, Bayesian Networks for Reliability Analysis of Complex Systems, Springer-Verlag, Berlin/Heidelberg (1998).
[17] E.G. Talbi, Metaheuristics: from design to implementation, Wiley, 2009.
[18] G. Beni and J. Wang, Swarm Intelligence in Cellular Robotic Systems, Proceedings, NATO Advanced Workshop on Robots and Biological Systems, Italy (1989).
[19] M. Dorigo and T. Stutzle, Ant Colony Optimization, MIT Press, 2004.
[20] D. Karaboga, Artificial bee colony algorithm, Scholarpedia 5(3) (2010), no. 6915.
[21] D. Karaboga, An Idea Based on Honey Bee Swarm for Numerical Optimization, Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department, Turkey (2005).
[22] D. Karaboga and B. Akay, A comparative study of Artificial Bee Colony algorithm, Applied Mathematics and Computation 214(1) (2009), 108-132.
[23] D. Karaboga and B. Akay, A survey: algorithms simulating bee swarm intelligence, Artificial Intelligence Review (2009).
[24] M. Clerc, Particle Swarm Optimization, ISTE, 2006.
[25] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Kluwer Academic Publishers, Boston, MA, 1989.

Security Enforcement with Language-Based Security


Ali Ahmadian Ramaki

Shahin Shirmohammadzadeh Sahraeii

University of Guilan, Rasht, Iran

University of Guilan, Rasht, Iran

Department of Computer Engineering

Department of Computer Engineering

ahmadianalir@msc.guilan.ac.ir

sahraei.shahin@gmail.com

Reza Ebrahimi Atani

University of Guilan, Rasht, Iran


Department of Computer Engineering
rebrahimi@guilan.ac.ir

Abstract: Language-based security is a mechanism for analyzing and rewriting applications so that security policies are guaranteed. By means of such a mechanism, issues like access control can be enforced correctly with a small computing base. Most security problems in software applications were previously handled by the operating system kernel, owing to its small size, in spite of their complexity. Nowadays, because of the increasing size of OS applications and their inherent complexity, this task is fulfilled by newly proposed mechanisms, one of which is security enforcement using programming language techniques to apply security policies to a specific application. Language-based security includes subdivisions such as the in-lined reference monitor, the certifying compiler and improvements to type systems, which are described individually later.

Keywords: security; security policy; programming languages; language-based security.

Introduction

With the growing use of the Internet, the security of mobile code is one of the important challenges in today's computational research. Our increasing dependency on large global networks such as the Internet, receiving their services in order to perform personal routines, spreading information over these networks, and even downloading from this perilous area is potentially susceptible to destructive attacks and may be followed by irrecoverable effects. We still remember pernicious attacks such as Melissa and Happy99, the exhaustive outcomes they caused, and how much careful attention is needed while downloading plug-ins and packages from the Internet[7]. Recent research shows that these types of security issues are on the rise. Today, with respect to the expansion of computational environments, the safety of mobile code is indispensable. For instance, having downloaded an application from the Internet from an unknown source, how can we warrant that it does not carry an unwanted payload which may put system safety at risk? One way to handle this situation is the use of language-based security. In this method, security information about an application programmed in a high-level language is extracted during compilation of the application into a compiled object. The extra security information includes formal proofs, notes about types, or other verifiable documents. The compiled object is created alongside the destination code, and before the main code runs it is automatically examined to warn of type errors or unauthorized acts. The Java bytecode verifier is an example of this approach. The chief challenge is how to create such mechanisms so that, in the first place, they have the desirable performance and, in the second place, they reveal as little as possible to others [1].

Corresponding Author, P. O. Box 41635-3756, F: (+98) 131 6690 271, T: (+98) 131 6690 274-8 (Ex 3017)

In the rest of the paper, the related literature is reviewed


in section 2. Traditional approaches to enforcing security in computer systems are investigated in section 3. In section 4 the language-based security framework is described, the corresponding techniques are explained in section 5, and finally the conclusion is drawn in section 6.

Two Principles in Computer Security

To understand language-based security more accurately, we need to introduce two principles of computer security systems and describe them in detail[6].
I. Principle of Least Privilege (PoLP): while enforcing the policies, each principal should be given the least possible access needed to accomplish its task;
II. Minimal Trusted Computing Base (MTCB): the components that must operate properly to guarantee the properties of the executing system, such as the operating system kernel and the hardware, should be kept small; that is, the mechanism in use should fulfill big tasks while remaining small itself. Smaller and simpler systems have fewer errors and improper interactions, which is quite appropriate for establishing safety.

Traditional Approaches to Applying Security

Traditional methods for the safety issue within computer systems include: I) utilizing the OS kernel as a reference monitor; II) cryptography; III) code instrumentation; IV) trusted compilation. These mechanisms offer a fixed set of preliminary security policies and have little flexibility. In the following we scrutinize them in detail.
I. Utilizing the operating system kernel as a reference monitor: this method is the oldest but the most exhaustive mechanism in use to guarantee security policies in software systems; single actions on data and on critical components of the system are fulfilled through the operating system kernel. The kernel is an indispensable component of the operating system code, accessing vital components and data directly. The rest of the programs are constrained in accessing these data and components, such that the kernel plays the role of a proxy interchanging messages for communication;
II. Cryptography: this method makes it possible to install safety at the level of sensitive data transmission in an unreliable network and to make use of the receiver as a verifier. The power of cryptographic methods is only as strong as their hypotheses. The Digital Encryption Standard (DES) is susceptible to violation given a sufficient amount of damaging code. Cryptography thus cannot guarantee that code downloaded from a network is safe; it is only able to provide a safe transmission channel for this code through the Internet, to avoid intrusions and suspicious interference;
III. Code instrumentation: another approach, practiced by the operating system in some systems, is to inspect the safety of a program from various aspects such as writes, reads and program jumps. Code instrumentation is a process through which the machine code of an executed program is changed so that the main action can be overseen during execution. Such changes are made to the sequence of the program's machine code for two reasons: first, the behaviors of the changed code and the initial code are equal, which shows that the initial code did not violate the safety policy; second, if a violation by the initial code occurs, the changed code is immediately able to handle the situation with two options: either it recognizes the violation, gains control from the system and terminates the destructive process, or it prevents the fatal effects which would otherwise soon affect the system. For instance, suppose a program needs to be run on a machine with certain hardware specifications, and assume the program is loaded within a continuous space of memory addresses [c \cdot 2^k, c \cdot 2^k + 2^k - 1], where c and k are integer numbers. The program is then linked to run and, after the destination code is obtained, indirect memory accesses are rewritten so that their addresses are forced back into this address space before the code in question is run[4];
IV. Trusted compiler: this method is fulfilled by a component known as a trusted compiler. By limiting the code's access, the compiler attempts to generate code which can be trusted. There are two alternatives for the operating system kernel to warrant the reliability of the compiler.
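A common concrete form of the code instrumentation described in item III is software-based fault isolation, where every indirect memory access is rewritten so that its target is forced into the sandbox [c*2^k, c*2^k + 2^k - 1]. The sketch below is only illustrative (in Python rather than machine code, with hypothetical constants) and shows the masking idea:

SEG_BITS = 20                 # k: the sandbox holds 2**k addressable units
SEG_ID   = 0x5                # c: which 2**k-aligned segment the code may touch

def sandbox(addr):
    # force an arbitrary address into [c*2**k, c*2**k + 2**k - 1] by keeping
    # only its low k bits and overwriting the segment bits with c
    return (SEG_ID << SEG_BITS) | (addr & ((1 << SEG_BITS) - 1))

def guarded_store(memory, addr, value):
    # the rewriter would insert the masking step before every indirect store
    memory[sandbox(addr)] = value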

Language-Based Security

In computer systems, a compiler usually translates a program written in a high-level language; the assembler of the destination machine then issues the hex code of the program to the hardware to let it start. The compiler obtains information about programs while compiling them. This information includes variable values, types and other specified information, and it may be analyzed and modified in order to optimize the destination code produced by the compiler. After successful compilation, extra information often remains which can say something about the security of the compiled destination code. For example, if the program is written in a safe language, the type-checking filter must complete successfully before compilation. So, during the compilation process, this security information should also be generated alongside the destination code that will run on the hardware. This information, as a certificate, is created before program execution, and it is checked before the produced destination code is executed, to ensure that the security policies of the specific convention are met. This process is shown in Fig. 1. The term language-based security refers to such extra information, extracted from a program written in a high-level language while compiling it; this extra information package is also called a certificate. When applications are downloaded from the Internet or via any other unsafe channel, this package of extra information is transferred as well. The code consumer is able to invoke a verifier program before running an application, to confirm the certificate and the code, and then run it.

Figure 1: Overview of Language-Based Security

Code providers take advantage of various techniques to produce such a certificate. Some of the most important ones are:
I. Proof-Carrying Code (PCC): the certificate produced by the code provider is a first-order logic proof in which a set of safe conditions for running the code is supplied; the user checks their correctness on the downloaded application before running the code;
II. Typed Assembly Language (TAL): the certificate is a type annotation, such that the verifying process on the user's side inspects the code structure in terms of types;
III. Efficient Code Certification (ECC): in this approach the certificate contains extra information about the destination code, checking concept structures and code objectives according to type-theoretic information.

Language-Based Security Techniques

A reference monitor observes a program's execution and stops the program if it violates the safety policies. Typical examples of reference monitors are operating systems (hardware monitors), interpreters (software monitors) and firewalls. Most of today's safety mechanisms employ a reference monitor.
I. In-lined Reference Monitor (IRM): in the traditional approaches, the mechanism fulfilled by the operating system to supervise a program's flawless execution and the confirmation of the objective safety policies places the reference monitor and the objective system in distinct address spaces. An alternative approach is the in-lined reference monitor; a similar task is performed by SFI, a component that fulfills the safety policy for the objective system by stopping reads, writes and jumps to memory outside a predefined area [3]. One of the methods, thus, is to merge the reference monitor with the objective application. An in-lined reference monitor is specified by the definitions below:
A. Security events: the actions to be handled by the reference monitor.
B. Security status: the information to be stored when a security event occurs, according to which permission to progress is issued.
C. Security updating: the sections of the program that run in response to security events and update the security status.
SASI is the first generation of IRM, shown by research to be an approach that guarantees the policies in question. The first generation is programmed for 80x86 assembly and the second generation for Java [2]. SASI x86, which is compatible with 80x86 assembly, operates on the assembly output of the gcc compiler. The destination code it generates must meet the two conditions below:
A. The program behavior never changes by adding NOPs.


B. Variables and the addresses of branch targets, marked with tags by the gcc compiler, are matched during compilation.
Therefore the first version is comprehensively employed in order to protect the program's memory data. In the second version of IRM, JVML SASI, the program is preserved in terms of type safety. JVML instructions provide information about the program's classes, instances, methods, threads and types, and such information can be utilized by JVML SASI to enforce safety policies in applications [5]. The rewriting component in the IRM mechanism generates a verifying code together with the related destination code from this extra information [10].
II. Type System: the main objective is to prevent the occurrence of errors during execution. Such errors are identified by a type checker. The importance of this is that a high-level program certainly has many variables. If the variables of a programming language always take values within their specified domains, we technically say the language is type safe. For example, assume a variable x in Java is defined as a Boolean; whenever it is set to False, the result of !x (not x) is True. If variables can end up in a condition such that their values lie in an undefined domain, we say the language is not type safe. In such languages we do not meet distinct types but a single global type including all possible values; an action fulfilled on arguments may then produce an arbitrary constant, an error, an exception or an uncertain effect [8]. A type system is the component of a type-safe language that holds the types of all variables, and the types of all expressions are computed during execution. Type systems are employed in order to decide whether a program is well-formed. Type-safe languages are known as explicitly typed if types are part of the syntax, and implicitly typed otherwise.
III. Certifying compiler: a certifying compiler is a compiler that, given a program satisfying a safety policy, generates a certificate as well as destination code that is checkable by machine, i.e. the policies in question can be checked [9].
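As a toy illustration of the in-lined reference monitor idea (not SASI itself, and with hypothetical names and policy), the sketch below merges the security-state check into the monitored operation: the security event is a resource acquisition, the security state is a counter, and the security update either advances the counter or terminates the program when the policy (at most MAX_OPEN simultaneously open files) would be violated.

import sys

MAX_OPEN = 4          # the policy enforced by the in-lined monitor (assumed)
_open_files = 0       # security state kept alongside the application

def monitored_open(path, mode="r"):
    # security event + security update in-lined around the sensitive call
    global _open_files
    if _open_files + 1 > MAX_OPEN:            # policy violation detected
        sys.exit("IRM: policy violated, terminating process")
    _open_files += 1
    return open(path, mode)

def monitored_close(f):
    global _open_files
    _open_files -= 1
    f.close()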

Conclusion

Security in computer systems holds an important place. In traditional approaches, computer system safety is founded on the two principles of least access privilege and a minimal computing base. In such approaches safety is warranted by the operating system and its kernel, where the kernel acts as a proxy for the other processes running on the system. Because of technological advances, the growing complexity of operating systems in terms of tasks, and the increase in kernel code for supporting features such as graphics cards and distributed file systems, newer approaches establish safety in ways that prove to perform well, such as security enforcement using programming language techniques. Such techniques fall under three main categories: the in-lined reference monitor, type systems and certifying compilers, which have been described separately.

References

[1] J.O. Blech and A. Poetzsch-Heffter, A Certifying Code Generation Phase, Proceedings of the Workshop on Compiler Optimization meets Compiler Verification (2007), 65-82.
[2] U. Erlingsson and F.B. Schneider, IRM Enforcement of Java Stack Inspection, In IEEE Symposium on Security and Privacy, Oakland, California (2000), 246-255.
[3] R. Wahbe, S. Lucco, T. Anderson, and S. Graham, Efficient Software-Based Fault Isolation, In Proc. 14th ACM Symp. on Operating System Principles (SOSP) (1993), 203-216.
[4] K. Crary, D. Walker, and G. Morrisett, Typed Memory Management in a Calculus of Capabilities, In Proc. 26th Symp. Principles of Programming Languages (1999), 262-275.
[5] U. Erlingsson and F.B. Schneider, SASI Enforcement of Security Policies: A Retrospective, In Proc. 26th Symp. Principles of Programming Languages (1999), 262-275.
[6] F.B. Schneider, G. Morrisett, and R. Harper, A Language-Based Approach to Security, Lecture Notes in Computer Science (2001), 86-101.
[7] D. Kozen, G. Morrisett, and R. Harper, Language-Based Security, Mathematical Foundations of Computer Science (1999), 284-298.
[8] R. Hahnle, J. Pant, P. Rummer, and D. Walter, Integration of a Security Type System into a Program Logic, Theoretical Computer Science (2008), 172-189.
[9] C. Yiyun, L. Ge, H. Baojian, L. Zhaopeng, and C. Liu, Design of a Certifying Compiler Supporting Proof of Program Safety, Theoretical Aspects of Software Engineering, IEEE (2007), 127-138.
[10] M. Jones and K.W. Hamlen, Enforcing IRM Security Policies: Two Case Studies, Intelligence and Security Informatics, IEEE (2009), 214-216.


Application of the PSO-ANFIS Model for Time Series Prediction of


Interior Daylight Illuminance
Hossein Babaee

Alireza Khosravi

Faculty of Electrical and Computer Engineering

Faculty of Electrical and Computer Engineering

Noushirvani University of Technology

Noushirvani University of Technology

Babol, Iran

Babol, Iran

hbabaee@stu.nit.ac.ir

akhosravi@nit.ac.ir

Abstract: The increasing need for more energy sensitive and adaptive systems for building light
control has encouraged the use of more precise and delicate computational models. This paper
presents a time series prediction model for daylight interior illuminance obtained using optimized
Adaptive Neuro- Fuzzy Inference System (ANFIS). Here the training data is collected by simulation,
using the globally accepted light software Desktop Radiance. The model developed is suitable for
adaptive predictive control of daylight-artificial light integrated schemes incorporating dimming and window shading control. In the ANFIS training process, if the data are clustered first and then fed to ANFIS, the performance of ANFIS is improved. In the clustering process, the radius of the clusters has a strong effect on the performance of the system, so in order to achieve the best performance we need to determine the optimum value of the cluster radius. In this study, particle swarm optimization has been used to determine the optimum value of the radius. Simulation results show that the proposed system has high performance.

Keywords: Particle swarm optimization, Adaptive Neuro-Fuzzy inference system, Radius, Optimization

Introduction

To develop automatic control strategies, and to evaluate the visual and energy performance provided by daylight, requires an accurate prediction of the daylight entering a building [1]. The Daylight Factor (DF) [2], the Daylight Coefficient (DC) [3], Useful Daylight Illuminance (UDI), computer simulations, the average daylight factor, etc. [4] are the various methods adopted for the estimation of interior daylight illuminance. The DF approach has been in practice for the last 50 years and has gained favour because of its simplicity, but it is not flexible enough to predict the dynamic variations in daylight illuminance as the sun position and sky condition change. The DC concept, developed by Tregenza [5], considers the changes in the luminance of the sky elements and offers a more effective way of computing indoor daylight illuminance. As the sky is treated as an array of point sources, the daylight coefficient approach can be used to calculate the reflected sunlight, and it is particularly appropriate for innovative daylight systems with complex optical properties. In the UDI approach, daylight illuminance metrics are based on absolute values of time-varying daylight illuminance over a full year. Recently, Kittler et al. [6] have proposed a new range of 15 standard sky luminance distributions, including five clear, five partly cloudy and five overcast sky types. D.H.W. Li et al. [4] have proposed an average daylight factor concept suitable for all of the above 15 standard skies. This proposition may be a useful paradigm for the planning and design of daylighting systems, but its effectiveness for an automated control strategy remains uncertain, since the type of sky cannot be predicted ahead of time. Time-varying illuminance predictions, as used with meteorological data sets, offer a more realistic account of true daylight conditions than the previously mentioned DF, DC and UDI approaches. ANFIS shows very good learning and prediction capabilities, which makes it an efficient tool to deal with the uncertainties encountered in this venture. A variety of


computer design tools are available for collecting the data required for training the Adaptive Neuro-Fuzzy Inference System. Here, the software Desktop Radiance is used to collect one full year of data with different sky conditions. The interior illuminance level is calculated for a given environment at any time of the year. Instead of using measured illuminance levels, we use the simulated data from a model created with the appropriate design tool. The illuminance levels obtained in this way are used as training data for ANFIS to predict the six-step-ahead values for the model under consideration. Hence, these predicted values indicate how the system is going to behave ahead of a particular time. This paper highlights how ANFIS can be employed to predict future values of the daylight availability. In the ANFIS training process, if the data are clustered first and then fed to ANFIS, the performance of ANFIS is improved. In the clustering process, the radius of the clusters has a strong effect on the performance of the system, so in order to achieve the best performance we need to find the optimum value of the cluster radius using the PSO algorithm. The rest of this paper is organized as follows: Section 2 introduces the Adaptive Neuro-Fuzzy Inference System. The PSO algorithm is presented in Section 3. In Section 4 we describe the experimental settings and the experimental results. The conclusions are in Section 5.

Adaptive Neuro-Fuzzy Inference System (ANFIS)

The adaptive-network-based fuzzy inference system (ANFIS) has been proposed by Jang [7]. The fuzzy inference system is implemented in the framework of adaptive networks using a hybrid learning procedure, whose membership function parameters are tuned using a back-propagation algorithm combined with a least-squares method. ANFIS is capable of dealing with the uncertainty and imprecision of human knowledge. It has a self-organizing ability and an inductive inference function to learn from the data. ANFIS is a multilayer feed-forward network [7]; each node of the network performs a particular function on incoming signals using a set of parameters pertaining to that node. To present the ANFIS architecture, consider two fuzzy rules based on a first-order Sugeno model [8], shown in Figure 1:

Rule 1: IF $x_1$ is $A_1$ and $x_2$ is $B_1$, then $f_1 = p_1 x_1 + q_1 x_2 + r_1$   (1)
Rule 2: IF $x_1$ is $A_2$ and $x_2$ is $B_2$, then $f_2 = p_2 x_1 + q_2 x_2 + r_2$   (2)

Figure 1: Structure of ANFIS [9]

The system has two inputs $x_1$ and $x_2$ and one output $F$. A square node (adaptive node) has parameters that change during training, while a circle node (fixed node) has none. Two membership functions are associated with each input, and the rules are two fuzzy if-then rules of the Takagi-Sugeno type. The key features of the five layers are described as follows; in this presentation, $O_{L,i}$ denotes the output of node $i$ in layer $L$ [9,10].

Layer 1: The nodes in this input layer are adaptive. They define the membership functions of the inputs:
$O_{1,i} = \mu_{A_i}(x_1), \quad i = 1, 2$   (3)
$O_{1,i} = \mu_{B_{i-2}}(x_2), \quad i = 3, 4$   (4)
where $A_i$ and $B_i$ can be any appropriate fuzzy sets in parametric form. The membership functions can be bell-shaped or Gaussian. Parameters in this layer are referred to as premise parameters.

Layer 2: The nodes in this rule layer are fixed. Each node multiplies all incoming signals and sends the product out; the output of each node represents the firing strength of a rule:
$O_{2,i} = w_i = \mu_{A_i}(x_1)\,\mu_{B_i}(x_2), \quad i = 1, 2$   (5)

Layer 3: The nodes in this normalization layer are fixed. They normalize the firing strengths obtained in Layer 2:
$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$   (6)

Layer 4: The nodes in this inference layer are adaptive. The outputs of this layer are the outputs from Layer 3 multiplied by a linear function; parameters in this layer are referred to as consequent parameters:
$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x_1 + q_i x_2 + r_i), \quad i = 1, 2$   (7)
where $p_i$, $q_i$ and $r_i$ are the design parameters (consequent parameters, since they deal with the then-part of the fuzzy rule).

Layer 5: The node in this output layer is fixed. It computes the overall output as the summation of the weighted outputs from Layer 4:
$O_{5,1} = F = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$   (8)

The ANFIS architecture is not unique; some layers can be combined and still produce the same output. There are two sets of parameters in the above fuzzy inference system: the overall output is linear in the consequent parameters but nonlinear in the premise parameters of Layer 1. The hybrid learning algorithm detailed in [9] consists of a forward pass and a backward pass. In the forward pass, the linear parameters are updated using a least-squares estimator (LSE); in the backward pass, error derivatives are calculated for each node, starting from the output end and propagating towards the input end of the network, and the nonlinear parameters are updated by the steepest-descent algorithm [9].

Training of a neuro-fuzzy system has several steps. In the first step, the initial fuzzy sets should be determined; the fuzzy sets define the number of sets for each input variable and their shapes. During training, the whole training dataset is presented to the network, which tries to minimize the error by learning the spatial relationship between the data. A lower error does not always guarantee better performance, because the network may be over-trained.

If the input-output clusters of the training data are found, the cluster information can be used to generate a fuzzy inference system. The rules partition themselves according to the fuzzy qualities associated with each of the data clusters. An important advantage of using a clustering method to find rules is that the resulting rules are more tailored to the input data than they are in a FIS generated without clustering; this reduces the problem of an excessive proliferation of rules when the input data have a high dimension.

The cluster radius indicates the range of influence of a cluster when the data space is considered as a unit hypercube. Specifying a small cluster radius usually yields many small clusters in the data and results in many rules, whereas specifying a large cluster radius usually yields a few large clusters and results in fewer rules.

In this study, in order to further increase the accuracy of the proposed system, we find the optimum value of the cluster radius using PSO. The PSO algorithm is explained in the next section.
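To make the layer computations above concrete, the following minimal Python sketch evaluates the forward pass of a two-rule, two-input first-order Sugeno ANFIS following equations (3)-(8); the Gaussian membership functions and all numerical parameter values are illustrative placeholders, not values taken from the paper.

import numpy as np

def gaussmf(x, c, sigma):
    # Gaussian membership function (one possible choice for Layer 1)
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_forward(x1, x2, premise, consequent):
    # premise:    dict with (center, sigma) pairs for A1, A2 (input x1) and B1, B2 (input x2)
    # consequent: list of (p, q, r) tuples, one per rule
    # Layer 1: membership degrees, Eqs. (3)-(4)
    mu_A = [gaussmf(x1, c, s) for c, s in premise["A"]]
    mu_B = [gaussmf(x2, c, s) for c, s in premise["B"]]
    # Layer 2: firing strengths w_i, Eq. (5)
    w = [mu_A[i] * mu_B[i] for i in range(2)]
    # Layer 3: normalized firing strengths, Eq. (6)
    w_bar = [wi / (w[0] + w[1]) for wi in w]
    # Layer 4: rule outputs weighted by normalized strengths, Eq. (7)
    f = [p * x1 + q * x2 + r for (p, q, r) in consequent]
    weighted = [w_bar[i] * f[i] for i in range(2)]
    # Layer 5: overall output, Eq. (8)
    return sum(weighted)

# Example with placeholder parameters (for illustration only)
premise = {"A": [(0.2, 0.3), (0.8, 0.3)], "B": [(0.2, 0.3), (0.8, 0.3)]}
consequent = [(1.0, 0.5, 0.1), (-0.4, 1.2, 0.0)]
print(anfis_forward(0.4, 0.6, premise, consequent))

In an actual ANFIS, the premise and consequent parameters appearing in this sketch are the quantities adjusted by the hybrid learning procedure described above.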


PSO Algorithm

The basic operational principle of the particle swarm is reminiscent of the behaviour of a group, for example a flock of birds, a school of fish, or the social behaviour of a group of people. Each individual flies in the search space with a velocity which is dynamically adjusted according to its own flying experience and its companions' flying experience, instead of using evolutionary operators to manipulate the individuals as in other evolutionary computational algorithms. Each individual is considered as a volume-less particle (a point) in the N-dimensional search space. At time step $t$, the position of the $i$th particle is represented as $X_i(t) = (x_{i1}, \ldots, x_{iN})$, and the set of positions of the $m$ particles forms the swarm in the multidimensional space. The best previous position (the position giving the best fitness value) of the $i$th particle is recorded and represented as $P_i = (p_{i1}, \ldots, p_{iN})$. The index of the best particle among all the particles in the population (global model) is represented by the symbol $g$; the index of the best particle among all the particles in a defined topological neighbourhood (local model) is represented by the subscript $l$. The rate of movement of the position (velocity) of particle $i$ at time step $t$ is represented as $V_i(t) = (v_{i1}, \ldots, v_{iN})$. The particle variables are manipulated according to the following equations (global model [11]):

$v_{in}(t) = w\,v_{in}(t-1) + c_1\,\mathrm{rand}_1(\cdot)\,(p_{in} - x_{in}(t-1)) + c_2\,\mathrm{rand}_2(\cdot)\,(p_{gn} - x_{in}(t-1))$
$x_{in}(t) = x_{in}(t-1) + v_{in}(t)$   (9)

where $n = 1, \ldots, N$ indexes the dimension, $c_1$ and $c_2$ are positive constants, $\mathrm{rand}_1(\cdot)$ and $\mathrm{rand}_2(\cdot)$ are two random functions in the range [0,1], and $w$ is the inertia weight. For the neighbourhood (lbest) model, the only change is to substitute $p_{ln}$ for $p_{gn}$ in the velocity equation.

This equation in the global model is used to calculate a particle's new velocity according to its previous velocity and the distances of its current position from its own best experience and the group's best experience. The local model calculation is identical, except that the neighbourhood's best experience is used instead of the group's best experience. Particle swarm optimization has been used both for approaches that can be applied across a wide range of applications and for applications focused on a specific requirement. Its attractiveness over many other optimization algorithms lies in its relative simplicity, because only a few parameters need to be adjusted [12,13].
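As a minimal sketch of the global-best update of equation (9), the Python routine below adjusts particle velocities and positions; the fitness function, bounds, inertia weight and coefficient values are placeholders chosen only for illustration (in this study the actual objective is the ANFIS prediction error as a function of the cluster radius).

import numpy as np

def pso_gbest(fitness, dim, n_particles=10, iters=100, w=0.7, c1=2.1, c2=2.1,
              lo=0.0, hi=1.0, vmax=0.5, seed=0):
    # Global-best PSO implementing the velocity/position update of Eq. (9)
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))          # positions
    v = np.zeros((n_particles, dim))                     # velocities
    pbest = x.copy()                                     # personal best positions
    pbest_val = np.array([fitness(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()               # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # v(t) = w*v(t-1) + c1*rand1*(pbest - x) + c2*rand2*(gbest - x)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        v = np.clip(v, -vmax, vmax)
        x = np.clip(x + v, lo, hi)                        # x(t) = x(t-1) + v(t)
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, pbest_val.min()

# Toy usage: minimize a 1-D quadratic standing in for the ANFIS validation error
best, err = pso_gbest(lambda p: (p[0] - 0.3) ** 2, dim=1)
print(best, err)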

Simulation Results

In order to build an ANFIS that can predict $x(t+6)$ from the past values of the daylight levels, the training data format is $[x(t-18), x(t-12), x(t-6), x(t); x(t+6)]$. Training and checking data are shown in Figure 2.

Figure 2: Training and checking data used for ANFIS prediction

As the performance criterion in this study, the area enclosed between the original signal and the signal predicted by ANFIS is used: the more accurate the prediction, the closer the original and predicted signals are to each other, and the smaller the enclosed area.

4.1 Performance without optimization

First we evaluated the performance of the predictor without optimization. Figure 3 shows the prediction error. As Figure 3 suggests, the difference between the original signal and the predicted signal at different times is rather large, approximately 0.05.

Figure 3: Prediction errors without optimization

4.2 Performance with optimization

Next, we apply PSO to find the optimum value of the radius. Table 1 shows the coefficient values used in the PSO algorithm, and Figure 4 shows the resulting prediction error. As a comparison of Figures 3 and 4 implies, optimization significantly reduces the prediction error; the error is at most 1 lux, which does not result in any change in the control signals (very small variations are not acted upon, as that would cause too much fluctuation of the light). So we stopped at this level of performance instead of going for more extensive training. Figure 5 shows the non-linear surface of the Sugeno fuzzy model for the time series prediction problem. We have used the Fuzzy Logic Toolbox of MATLAB to develop the ANFIS model with 4 inputs and a single output.

In Table 2, the area enclosed between the original signal and the predicted signal, with and without optimization, is shown; as this table suggests, the performance of the optimized system is much better than that without optimization. Figure 6 shows the original signal and the signal predicted by the optimized ANFIS; the two signals are very close together.

Table 1: Coefficient values in the PSO algorithm
Number of particles: 10
Error limit: e-10
Acceleration constant: 3
Maximum velocity: 8
Maximum number of iterations: 100
Size of the local neighborhood: 2
Constants c1 = c2: 2.1
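For illustration, the following sketch (assuming Python with NumPy) arranges a sampled illuminance series into the training format [x(t-18), x(t-12), x(t-6), x(t); x(t+6)] described above; the synthetic series stands in for the Desktop Radiance data and is not the paper's dataset.

import numpy as np

def make_dataset(series, lags=(18, 12, 6, 0), horizon=6):
    # Arrange a time series into rows [x(t-18), x(t-12), x(t-6), x(t)] with target x(t+6)
    max_lag = max(lags)
    X, y = [], []
    for t in range(max_lag, len(series) - horizon):
        X.append([series[t - lag] for lag in lags])
        y.append(series[t + horizon])
    return np.array(X), np.array(y)

# Synthetic stand-in for the simulated daylight illuminance series
t = np.arange(500)
illuminance = 500 + 400 * np.sin(2 * np.pi * t / 96) + 20 * np.random.randn(len(t))

X, y = make_dataset(illuminance)
print(X.shape, y.shape)   # each row of X holds the four past values, y the 6-step-ahead target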

Table 2: The area enclosed between the original signal and the predicted signal
Status                         Value
Optimized ANFIS                8.7649e-004
ANFIS without optimization     0.3214

Figure 4: Prediction errors with optimization
Figure 5: Input-output surface view (SURFVIEW) of the ANFIS scheme
Figure 6: Original signal and the signal predicted by the optimized ANFIS

5 Matlab Functions Used For Time Series Prediction

GENFIS2
genfis2 generates a Sugeno-type FIS structure using subtractive clustering and requires separate sets of input and output data as input arguments. When there is only one output, genfis2 may be used to generate an initial FIS for ANFIS training. genfis2 accomplishes this by extracting a set of rules that models the data behaviour. The rule-extraction method first uses the subtractive clustering function to determine the number of rules and the antecedent membership functions, and then uses linear least-squares estimation to determine each rule's consequent equation. This function returns a FIS structure that contains a set of fuzzy rules to cover the feature space.

ANFIS
ANFIS uses a hybrid learning algorithm to identify the membership function parameters of a single-output, Sugeno-type FIS. A combination of least-squares and back-propagation gradient descent methods is used to train the FIS membership function parameters to model a given set of input/output data.

EVALFIS
This performs fuzzy inference calculations. Y = EVALFIS(U,FIS) simulates the FIS for the input data U and returns the output data Y. For a system with N input variables and L output variables, U is an M-by-N matrix, each row being a particular input vector, and Y is an M-by-L matrix, each row being a particular output vector.

Conclusion

The most important advantage of the proposed model is the ability to predict the natural system's behaviour at a future time, which can be used for lighting control. The implementation of the ANFIS model is less complicated than sophisticated identification and optimization procedures. Compared to fuzzy logic systems, ANFIS has an automated identification algorithm and an easier design; in comparison with neural networks, it has fewer parameters and faster adaptation. In order to increase the accuracy of the proposed system, a PSO algorithm is used to determine the optimum value of the cluster radius (which is used in ANFIS training). The non-linear characteristics of daylight systems can be tolerably handled by the proposed system. The prediction could be utilized as an input for the artificial light and shading controls, making it possible to reduce the number of sensors and connections and to improve the performance of the control strategy. The PSO-ANFIS based time series prediction model for daylight interior illuminance is unique and novel, as it is simple, reliable and easily accessible for different room conditions.

References
[1] A.Nabil and J.Mardaljevic, Useful daylight illuminance: a
new paradigm for assessing daylight in building, Lighting
Research & Technology 37 (2005), no. 1, 4159.
[2] DHW Li, CCS Lau, and JC Lam, Predicting daylight Illuminance by computer simulation techniques, Lighting Research & Technology 36 (2003), no. 2, 113119.
[3] P.J Littlefair, Daylight coefficients for practical computation of internal illuminances, Lighting Research & Technology 24 (1992), no. 3, 127135.
[4] DHW Li and GHW Cheung, Average daylight factor for the
15 CIE standard skies, Lighting Research & Technology 38
(2006), no. 1, 137152.
[5] Tregenza PR and Waters IM, Daylight coefficients, Lighting
Research & Technology 15 (1983), 6571.
[6] Kittler R, Darula S, and Perez R, A set of standard skies
characterizing daylight conditions for computer and energy
conscious design, Bratislava, Slovakia, 1998.

[7] J.-S and R. Jang, ANFIS: Adaptive-Network-Based Fuzzy


Inference System, IEEE Transactions on Systems, Man, and
Cybernetics 23 (1993), 665685.
[8] K. Erenturk, ANFIS-Based Compensation Algorithm for
Current-Transformer Saturation Effects, IEEE Transactions on Power Delivery 24 (2009), no. 1.
[9] S. R. Jang and E. Mizutani, Neuro-Fuzzy and soft computation, Prentice Hall, NJ, 1997.
[10] S. R. Jalluri and B. V. S Ram, A Neuro -Fuzzy Controller
for Induction Machines Drives, Journal of Theoretical and
Applied Information Technology 19 (2010), no. 2.
[11] R.C. Eberhart and J. Kennedy, A New Optimizer Using Particle Swarm Theory, Proceedings of the Sixth International
Symposium on Micro Machine and Human Science, Nagoya,
Japan (2005), 3943.
[12] H.babaee and A.khosravi, presented at IEEE Conference,
China (2011).
[13] Hongchao Yin and Wenzhi Dai, Optimal Operational Planning of Steam Power Systems Using an IPSOSA Algorithm,
Journal of Computer and Systems Sciences International 49
(2010), no. 5, 750756.


Evaluating the impact of using several criteria for buffer management


in VDTNs
Zhaleh Sadreddini

Mohammad Ali Jabraeil Jamali

Department of Computer Sciences

Department of Computer Sciences

zh.sadreddini@iaushab.ac.ir

mjamali@itrc.ac.ir

Ali Asghar Pourhaji Kazem


Department of Computer Engineering
apourhajikazem@iaut.ac.ir

Abstract: In Vehicular Delay Tolerant Networks (VDTNs), the optimal use of buffer management policies can improve the overall network throughput. Because several message criteria may need to be considered simultaneously for optimal buffer management, conventional policies fail to support different applications. In this research, we present a buffer management strategy called Multi Criteria Buffer Management (MCBM). This technique applies several message criteria according to the requirements of different applications. We examine the performance of the proposed buffer management policy by comparing it with the existing FIFO and Random policies. For the scenario proposed in this paper, simulation results show that the MCBM policy performs well compared with the existing ones in terms of overall network performance.

Keywords: Buffer management policies, Epidemic routing, Vehicular Delay Tolerant Networks.

Introduction

Vehicular Delay Tolerant Networks (VDTNs) are an application of Delay-Tolerant Networks (DTNs), where the mobility of vehicles is used for connectivity and data communications [1].

Owing to the mobility and high speed of vehicles, an end-to-end path is not available all the time. Therefore, in such networks connections are intermittent and, as a result, the sending of messages encounters delays [2],[3]. In order to overcome the intermittent connectivity, increase the delivery rate of messages and reduce the average latency, store-carry-and-forward patterns are used: messages are stored and forwarded among the network nodes until they reach the final destination. Consequently, given the limited buffer space of the nodes, messages face buffer overhead and are dropped. To overcome this problem, optimal buffer management policies have been presented to increase network performance and improve the efficiency of the network.

In VDTNs we can point out different application scenarios: traffic condition monitoring, collision avoidance, emergency message dissemination, free parking spot information, advertisements, etc. [4],[5],[6]

According to the requirements of different applications, it is possible that multiple major message criteria must be considered simultaneously for optimal buffer management. In addition, different criteria may have different levels of importance and may conflict with each other. However, the existing policies consider only one or two message criteria; as a result they are single-purpose and do not support different applications.

The MCBM technique formulates the buffer management problem as a multi-criteria decision problem. Therefore, different criteria can be applied to manage the buffer according to the requirements of different applications via the MCBM technique [7].


In this article, the Emergency Warning scenario has been considered to demonstrate the improvement of network performance and the ability to respond to the requirements of different applications via the MCBM technique. In this scenario, the delivery of emergency messages has special importance. In the simulations performed, increasing the message delivery rate and accelerating the delivery of emergency messages (such as collision alarm messages) are considered.

2 Existing buffer management policies

2.1 First-In First-Out (FIFO)

FIFO is a straightforward policy which simply orders messages to be forwarded at a contact opportunity based on their receiving time (first-come, first-served basis). When the FIFO dropping policy is enforced, the messages dropped under buffer congestion will be the ones at the head of the queue (drop head) [8].

2.2 Random

In the Random policy, messages are scheduled for transmission in a random order. Moreover, the selection of messages to be dropped is also made in random order [9],[10].

2.3 Approach

According to the requirements of different applications, the Multi Criteria Buffer Management (MCBM) technique applies different and even contradictory criteria for optimal buffer management [7]. For this purpose, using the Multi Criteria Decision Making (MCDM) method, a decision matrix for buffer management is created as in Table 1. In the decision matrix for buffer management, the buffer holds n messages, each message has m different criteria, and the importance degree of the j-th criterion is wj.

Table 1: Decision matrix

In the decision matrix, rij denotes the value of the j-th criterion for the i-th message. According to the requirements of the applications, the values of different criteria have different units and may contradict each other, so normalization should be done to make the units comparable, eliminate the conflict between criteria and equalize the ranges of values.

One of the most common MCDM methods is the WSM (Weighted Sum Model). In this method, the rij values are first normalized; after normalization, the message whose WSM score (the weighted sum of its normalized criterion values) is the highest is selected to send or drop.

The time complexity of this method is O(mn), where m is the number of criteria and n is the number of messages. Since the number of criteria is much smaller than the number of messages currently in the buffer, the complexity is effectively linear in the number of messages.

In order to demonstrate the improvement of the network performance via the MCBM technique, the Emergency Warning scenario has been presented. Simulation results in the next section show the performance of the existing FIFO and Random policies and the MCBM technique in the above-mentioned scenario.
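As an illustrative sketch of the weighted-sum selection step described above, the following Python function normalizes the criterion values of the buffered messages and picks the message with the highest weighted sum; the criterion names, weights and min-max normalization are assumptions made only for this example, not the exact configuration of MCBM.

def wsm_select(messages, weights):
    # messages: list of dicts, e.g. {"id": "m1", "priority": 2, "ttl": 40, "size": 500}
    # weights:  dict mapping criterion name -> importance degree w_j
    criteria = list(weights)
    # min-max normalization so that criteria with different units become comparable;
    # for criteria where smaller values are better, one would invert the normalization
    lo = {c: min(m[c] for m in messages) for c in criteria}
    hi = {c: max(m[c] for m in messages) for c in criteria}
    def norm(m, c):
        return 0.0 if hi[c] == lo[c] else (m[c] - lo[c]) / (hi[c] - lo[c])
    def score(m):
        return sum(weights[c] * norm(m, c) for c in criteria)
    return max(messages, key=score)

msgs = [
    {"id": "m1", "priority": 2, "ttl": 40, "size": 500},
    {"id": "m2", "priority": 0, "ttl": 90, "size": 250},
    {"id": "m3", "priority": 1, "ttl": 10, "size": 750},
]
# hypothetical weights favouring emergency (high-priority) messages
print(wsm_select(msgs, {"priority": 0.6, "ttl": 0.3, "size": 0.1})["id"])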


3 Performance Evaluation

In the Emergency Warning scenario, message priority and the acceleration of the delivery of emergency messages are important issues. We adopt the priority pattern defined in the DTN architecture; therefore, this work considers three traffic priority classes: Bulk, Normal and Expedited (emergency).

The MCBM technique can be compared with different types of buffer management policies; here, we compare our technique with the conventional FIFO and Random policies. For this purpose, a simulation study using the Opportunistic Network Environment (ONE) simulator has been executed [11]. We created a set of extensions for the ONE simulator to support traffic priorities and scheduling and dropping policies for traffic differentiation. The performance metric considered is the delivery rate of messages, per priority class. The next subsections describe the two simulation scenarios and the corresponding performance analysis.

3.1 Simulation Setup

Table 2: Simulation setup

In this study, it is assumed that the delivery of emergency messages is very important and that these messages generate larger volumes of traffic. Thus, messages are generated with sizes uniformly distributed in the ranges of [250 KB, 750 KB] for bulk messages, [500 KB, 1 MB] for normal messages, and [750 KB, 1.5 MB] for emergency messages. In all policies, the creation probability of the emergency priority class is set to 20%. The assessment of the policies is done with the Epidemic routing protocol [12]; Epidemic is a flooding-based routing protocol in which nodes exchange the messages they do not have.

3.2 Performance analysis of the Epidemic routing protocol for the scenario with 20 vehicles

The performance analysis starts with the scenario where only 20 vehicles move across the map roads. Figure 1 shows the delivery probability obtained with the existing FIFO and Random policies and with MCBM. In the Emergency Warning scenario, the MCBM technique improves the emergency message delivery rate even in the worst conditions (low traffic, small buffer size and short TTL); the delivery rate of emergency messages with MCBM is about 6%.

Figure 1: MCBM, FIFO and Random delivery probability with 20 vehicles

3.3 Performance analysis of the Epidemic routing protocol for the scenario with 100 vehicles

When the number of vehicles in the VDTN is increased, the number of contact opportunities increases too, which has a significant effect on improving the delivery rates. According to Figure 2, the delivery rate of emergency messages for the above-mentioned scenario with MCBM is about 20%.


Figure 2: MCBM, FIFO and Random delivery probability with 100 vehicles

Discussion and Future Works

In order to support different types of applications in VDTNs, buffer management policies should be designed so that several criteria can be imposed on them. Therefore, we propose the MCBM technique. In this article, the Emergency Warning scenario has been considered to compare the efficiency of the proposed technique with the conventional FIFO and Random policies. In this comparison we observe that single-purpose buffer management policies are not able to respond to different types of scenarios. The simulation results obtained for the proposed scenario show an improvement of network efficiency in terms of message delivery rate compared with the other policies. In future work, the efficiency of the proposed technique will be studied with different types of application scenarios and several routing protocols. According to the requirements of the applications, we can also apply network criteria in buffer management.

References

[1] V. N. G. J. Soares, F. Farahmand, and J. J. P. C. Rodrigues, A layered architecture for vehicular delay-tolerant networks, ISCC'09, Sousse, Tunisia (2009).
[2] V. N. G. J. Soares, J. J. P. C. Rodrigues, and P. S. Ferreira, Improvement of messages delivery time on vehicular delay-tolerant networks, ICPP, Vienna, Austria (2009).
[3] V. N. G. J. Soares, F. Farahmand, and J. J. P. C. Rodrigues, Evaluating the impact of storage capacity constraints on vehicular delay-tolerant networks, CTRQ, Colmar, France (2009).
[4] D. Niyato and P. Wang, Optimization of the Mobile Router and Traffic Sources in Vehicular Delay Tolerant Network, IEEE (2011).

[5] R. Tatchikou, S. Biaswas, and F. Dion, Cooperative vehicle


collision avoidance using inter-vehicle packet forwarding,
IEEE, MO, USA, (2005).
[6] V. N. G. J. Soares, F. Farahmand, and J. J. P. C. Rodrigues, Scheduling and drop policies for traffic differentiation on vehicular delay-tolerant networks, SoftCOM, Croatia, (2009).
[7] TomGl., Theodor J. Stewart., and H. Thomas, Multi criteria decision making: advances in MCDM models, algorithms, theory, and applications, Springer (1999).
[8] V. N. G. J. Soares, F. Farahmand, and J. J. P. C. Rodrigues, Traffic differentiation support in vehicular delaytolerant Networks, Springer science (2010).
[9] Q. Ayub, S. Rashid, and M. SoperiMohdZahid, Buffer
Scheduling Policy for Opportunitic Networks (2011).
[10] S. Rashid, Q. Ayub, M. SoperiMohdZahid, and A. HananAbdullah, Impact of Mobility Models on DLA (Drop
Largest) Optimized DTN Epidemic routing protocol (2011).
[11] A. Keränen, J. Ott, and T. Kärkkäinen, The ONE simulator
for DTN protocol evaluation, SIMUTools, Rome, (2009).
[12] D. Becker, Epidemic routing for partially connected ad hoc
networks, Duke University, (2000).


Improvement of VDTNs Performance with Effective Scheduling Policy


Masumeh Marzaei Afshord

Mohammad Ali Jabraeil Jamali

Islamic Azad University, Shabestar Branch

Islamic Azad University, Shabestar Branch

Department of Computer Science

Department of Computer Science

Shabestar, Iran

Shabestar, Iran

m.marzaei@gmail.com

m jamali@itrc.ac.ir

Ali Asghar Pourhaji Kazem


Islamic Azad University, Tabriz Branch
Department of Computer Engineering
Tabriz, Iran
a pourhajikazem@iaut.ac.ir

Abstract: In Vehicular Delay Tolerant Networks (VDTNs), buffer management policies affect the performance of the network. Most conventional buffer management policies make decisions based only on message criteria and do not consider features of the environment where the nodes are located. In this paper we propose the Knowledge Based Scheduling (KBS) policy, which makes decisions using two pieces of knowledge: the amount of free space in the receiver node's buffer and the amount of traffic in the segment where the sender node is located. Using simulation, we evaluate the performance of the proposed policy and compare it with the Random and Lifetime desc policies. Simulation results show that our buffer management policy increases the delivery rate and decreases the number of drops significantly.

Keywords: Epidemic Router, Scheduling Policy, Vehicular Delay Tolerant Networks.

Introduction

Delay Tolerant Networks (DTNs) have been introduced for situations where the connectivity between the nodes is sparse. As a result, unlike in traditional mobile ad hoc networks (MANETs), the end-to-end path between a source and a destination will only be available for a brief and unpredictable period of time [7].

Vehicular Delay Tolerant Networks (VDTNs) are an application of DTNs where vehicles are responsible for the communication. In VDTNs, the movement and high velocity of the vehicles lead to short contact durations, intermittent connectivity and a highly dynamic network topology. To overcome these problems, the store-carry-and-forward strategy is used in VDTNs: vehicles store messages in their buffers while connectivity is not available and carry them until a new contact opportunity. This process continues until messages reach the destination.

In order to increase the delivery rate and decrease the average latency in VDTNs, message replication is performed by many routing protocols. The combination of message storage during long periods of time and message replication imposes a high storage overhead on the buffer nodes and reduces the overall performance of the network. Therefore, efficient buffer management policies are required to improve the overall performance of the network. Most conventional buffer management policies make decisions just based on message criteria (such as the size of a message, its time-to-live (TTL), or its number of forwardings).

In this paper, we present an effective buffer management policy, called Knowledge Based Scheduling (KBS), which in addition to considering message criteria forwards a message based on knowledge of the amount of free space in the receiver node's buffer and of the amount of traffic in the segment where the sender node is located. Using simulation, we show that the KBS policy improves the performance of the network.
tivity is not available. They carry messages until a (KBS), that in addition to considering message crite Corresponding

Author, T: (+98) 914 302 3661

40

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

2 Existing Scheduling Policies

A scheduling policy determines the order in which messages should be forwarded at a contact opportunity.

2.1 FIFO (First In-First Out)

The FIFO scheduling policy orders messages to be forwarded at a contact opportunity based on their entry time into the node's buffer.

2.2 Random

The Random scheduling policy forwards messages in a random order.

2.3 Lifetime descending order

The Lifetime descending order (Lifetime desc) policy sorts messages based on their TTL in descending order and, at a contact opportunity, forwards the message with the highest TTL.

3 Proposed policy

The Knowledge Based Scheduling (KBS) policy, in addition to message criteria, considers the neighboring environment of a node and makes decisions using two pieces of knowledge: the amount of free space in the receiver node's buffer and the amount of traffic in the segment where the sender node is located. At a contact opportunity, the KBS policy considers the free space of the receiver node's buffer and forwards a message with a size equal to or smaller than it; therefore it reduces the number of drops. The knowledge of the free space of the receiver node's buffer is obtained with a HELLO-RESPONSE technique: the sender node sends a HELLO message in order to establish communication, and if the receiver node hears the HELLO message, it sends back a RESPONSE message that also contains information about the free space of its buffer [4].

Assume the free space of the receiver node's buffer is 250K. As can be seen in Table 1, in the buffer of the sender node there are multiple messages with a size equal to or smaller than the free space of the receiver node's buffer. In this case, the KBS policy makes its decision based on the amount of traffic in the segment where the sender node is located. Based on the segment traffic, it selects either the message with the least TTL or the one with the highest TTL among the messages with a size equal to or smaller than the free space of the receiver buffer. Therefore the KBS policy gives an opportunity both to messages with low TTL and to messages with high TTL.

Table 1: Buffer space of the sender node
Msgid   Msgsize   MsgTTL
M1      180K      50
M2      450K      120
M3      200K      90
M4      150K      100
M5      300K      70
M6      550K      35

If the segment traffic of the sender node is low or medium (an interval has been defined for low and medium traffic), then among the messages with a size equal to or smaller than the free space of the receiver node's buffer (M1, M3, M4), the message with the highest TTL (M4) is selected to forward. The reason for selecting the message with the highest TTL is that, since the segment traffic is low, contact opportunities in the segment are also few and waiting times in buffers are high, so the possibility of messages with high TTL traversing the current segment is greater than that of messages with low TTL. But if the segment traffic is high (an interval has been defined for high traffic), the message with the least TTL (M1) is selected to forward. In this case, since the segment traffic is high, contact opportunities in the segment are also numerous and waiting times in buffers are low, so it is possible that messages with low TTL traverse the current segment before expiration; as a result, an opportunity is given to messages with low TTL. Knowledge of the segment traffic is obtained using a traffic oracle [3]; based on the Cartesian coordinates of each node, this oracle obtains the related segment and determines the traffic amount, i.e., the number of nodes currently present in that segment.

If the sizes of all messages in the buffer of the sender node are larger than the free space of the receiver node's buffer, the KBS policy makes its decision just by considering the segment traffic of the sender node: when the segment traffic is low or medium, for the above-mentioned reasons, the message with the highest TTL is selected to forward, and when the traffic is high, the message with the least TTL is selected to forward.
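The forwarding decision described in this section can be summarized by the following Python sketch; the message fields mirror the example of Table 1, while the traffic threshold and the function itself are only an illustrative reading of the KBS policy, not code from the paper.

def kbs_select(buffer, receiver_free_space, segment_traffic, high_traffic_threshold=8):
    # buffer: list of dicts with "id", "size" and "ttl" fields
    # receiver_free_space: free buffer space reported in the RESPONSE message
    # segment_traffic: number of vehicles currently in the sender's segment
    # Prefer messages that actually fit in the receiver's buffer
    fitting = [m for m in buffer if m["size"] <= receiver_free_space]
    candidates = fitting if fitting else buffer
    if segment_traffic > high_traffic_threshold:
        # high traffic: many contacts, short waits -> give low-TTL messages a chance
        return min(candidates, key=lambda m: m["ttl"])
    # low or medium traffic: few contacts, long waits -> favour high-TTL messages
    return max(candidates, key=lambda m: m["ttl"])

buffer = [
    {"id": "M1", "size": 180, "ttl": 50},
    {"id": "M3", "size": 200, "ttl": 90},
    {"id": "M4", "size": 150, "ttl": 100},
]
print(kbs_select(buffer, receiver_free_space=250, segment_traffic=4)["id"])  # -> M4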

Simulation setup

In this section, we evaluate our KBS policy and compare it with the Random, FIFO and Lifetime desc scheduling policies. The dropping policy in all policies is drop head [17]. Evaluation is done by simulation using the Opportunistic Network Environment (ONE) simulator [14]. The performance metrics considered are the message delivery probability (measured as the ratio of the delivered messages to the sent messages) and the number of drops.

For the evaluation, the Epidemic routing protocol is used [1]. Epidemic routing is a flooding-based protocol; according to this protocol, when two nodes connect, they send each other the messages which they do not have.

In order to examine the performance of the KBS policy, we use an urban scenario. The simulation area is 6000 m x 6000 m, and we simulate 100 vehicles. The buffer capacity of the vehicles is 20 MByte. Vehicles move with random speeds between 30 and 50 km/h along the shortest available path, and random wait times of vehicles are between 5 and 15 minutes.

Network nodes communicate with each other using a wireless connectivity link with a data transmission rate of 6 Mbps and a transmission range of 30 meters.

The messages are generated using an inter-message creation interval that is uniformly distributed in the range of [5, 20] seconds. Message sizes are uniformly distributed in the range of [500K, 1M]. The TTL of messages is 120 minutes in all simulations, and the simulation time is 12 hours.

In all scenarios we have defined fewer than 6 vehicles in one segment as low traffic, 6 to 8 vehicles as medium traffic, and more than 8 vehicles as high traffic.

Simulation results

Figure 1 shows the comparison of the buffer management policies with respect to the delivery ratio. The KBS policy, by considering the free space of the receiver buffer and forwarding a message based on it, reduces the number of dropped messages and consequently increases the number of delivered messages. Moreover, by making decisions based on the TTL of messages, it can also increase the delivery probability.

Figure 1: KBS, Lifetime desc, Random and FIFO delivery probability

Figure 2 presents the comparison of the buffer management policies with respect to the number of drops. The KBS policy reduces the number of drops at a significant rate, because it forwards messages based on the free space of the receiver node, so the receiver receives messages with the least number of drops.

Figure 2: KBS, Lifetime desc, Random and FIFO number of drops


Conclusion and future works

In this paper the KBS buffer management policy was presented, which in addition to message criteria uses knowledge of the neighboring environment of the nodes. Based on the free space of the receiver node's buffer and the traffic of the segment where the sender node is located, this policy selects a message to forward.

Using simulation, the performance of the KBS policy was compared with the FIFO, Random and Lifetime desc scheduling policies. The results showed that KBS increases the delivery ratio and decreases the number of drops significantly. In future work, we can devise a dropping policy that also considers the neighboring environment of the nodes. Moreover, we can compare the proposed method with other buffer management policies.

References
[1] A. Vahdat and D. Becker, Epidemic routing for partially
connected ad hoc networks, Duke University, Tech. Rep. Cs200006, 2000.
[2] K. Fall, Delay-tolerant network architecture for challenged
internets, In Proc. SIGCOMM (2003).
[3] S. Jain, K. Fall, and R. patre, Routing in delay tolerant
network, In Proc. SIGCOMM (2004).
[4] J. Lebrun, Ch.N. Chuah, D. Ghosal, and M. Zhang,
Knowledge-based opportunistic forwarding in vehicular
wireless ad Hoc Networks, IEEE Conference on Vehicular
Technology 4 (2005), 22892293.
[5] A. Lindgren and K.S. Phanse, Evaluation of queuing policies and forwarding strategies for routing in intermittently
connected networks, IEEE international Conference on
Communication System Software and Middleware (2006),
110.

[6] A. Jindal and K. Psounis, Performance analysis of epidemic


routing under contention, In Proc. IWCMC.
[7] D. Niyato, P. Wang, and J.Ch.M. Toe, Performance Analysis of the Vehicular Delay Tolerant Network (2007).
[8] G. Fathima, R.S.D. Wahidabanu, Singer Y, and Kaelbling
P, Effective buffer management and scheduling of bundles
in delay tolerant networks with finite buffers, In International Conference on Control, Automation, Communication
and Energy Conservation (INCACEC 2009) (2000), 14.
[9] N. Dusit, P. Wang, and J.Ch.M. Teo, Performance Analysis
of the Vehicular Delay Tolerant Network, In proc. NSERC
(2009).
[10] V.N.G.J. Soares, J.J.P.C. Rodrigues, P.S. Ferreira, and
A.M.D. Nogueira, Improvement of message delivery time
on vehicular delay-tolerant networks, In International Conference on Parallel Processing Workshops (2009), 344349.
[11] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues,
Evaluating the impact of storage capacity constraints on
vehicular delay-tolerant networks, In second International
Conference on Communication Theory, Reliability and
Quality of Service (CTRQ 2009) (2009).
[12] V.N.G.J. Soares, F. Farahmand, and J.J.P.C.Rodrigues, A
layered architecture for vehicular delay-tolerant networks,
In IEEE Symposium on Computers and Communications
(ISCC09) (2009), 122127.
[13] S. Kaveevivitchai and H. Esaki, Independent dtns message
deletion mechanism for multi-copy routing scheme, In Sixth
Asian Internet Engineering Conference (AINTEC) (2009).
[14] A. Keranen, J. Ott, and T. Kakkainen, The ONE Simulator for DTN Protocol Evaluation, In SIMUTools: 2nd International Conference on Simulation Tools and Techniques
(2009).
[15] V.N.G.J. Soares, F. Farahmand, and J.J.P.C.Rodrigues,
Traffic differentiation support in vehicular Delay tolerant
Networks, Springer (2010).
[16] A. Krifa and Ch. Barakat, Th. Spyropoulos, Message drop
and scheduling in dtns: Theory and practice, IEEE Transactions on Mobile Computing (2010).
[17] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues,
Performance analysis of scheduling and dropping policies
in vehicular delay-tolerant networks, In International Journal on Advances in Internet Technology 3 (2010), 137145.


Classification of Gene Expression Data using Multiple Ranker


Evaluators and Neural Network
Zahra Roozbahani

Ali Katanforoush

Department of Computer Science, Math. Sci.

Department of Computer Science, Faculty of Math. Sci.

Shahid Beheshti University,Tehran, IRAN

Shahid Beheshti University,Tehran, IRAN

z.roozbahani@mail.sbu.ac.ir

a katanforosh@sbu.ac.ir

Abstract: Samples assayed by high-throughput microarray technologies challenge conventional Machine Learning techniques. The major issue is that the number of attributes (genes) is far greater than the number of samples. In feature selection, we attempt to reduce the number of attributes in order to obtain the most effective genes. In the prediction scheme introduced in this paper, several feature selection methods are combined with an Artificial Neural Network (ANN) classifier. Initially, we exploit various evaluators to measure the association between the gene expression rate and the susceptibility categories. Then we rank the genes based on each of the measures and select a fixed number of top-ranked genes. To assess the performance of this method, we use a Multi Layer Perceptron (MLP) in which the input layer is associated with the genes commonly selected by all evaluators. We consider gene expression samples for Leukemia, Lymphoma and DLBCL to evaluate our method using leave-one-out cross validation. Results show that our approach improves the predictive accuracy compared to other methods.

Keywords: Feature Selection, Artificial Neural Network, Gene Expression, Cancer Classification.

Introduction

Microarray technology can profile the expression level of thousands of genes simultaneously. The resulting profile simply reveals which genes are up- or down-regulated. It plays an important role in the study of specific cancers and of the activation of oncogenic pathways, and in the discovery of novel biomarkers for clinical diagnosis [1]. In practice, classification algorithms are widely adopted to analyze gene expression data.

Artificial Neural Networks (ANN) are widely used in microarray data analysis [2,3]. The great number of genes (relative to the number of samples) makes conventional Machine Learning techniques like ANN impractical. A common approach to resolve this issue is reduction to the most associated genes. This is an important problem, which is referred to as feature selection. Feature selection is one of the most important issues in data mining, machine learning, pattern classification, and so on. Only relevant features are useful for classification, producing better performance and reducing computation cost. It is necessary to take measures to decrease the feature dimension without decreasing the recognition performance; this is called the problem of optimum feature selection [4]. It is also an effective dimensionality reduction technique and an essential preprocessing method for removing noisy features [5]. The basic idea of feature selection methods is to search through the possible combinations of features in the data to find which subset of features works best for pattern recognition. There are at least two advantages in reducing the feature dimension: the time and space complexity of the model are reduced, and redundant correlations are discarded. A successful selection method should produce simple, moderate, less redundant and unambiguous features [6,7]. Generally, feature selection methods are divided into two categories: 1) filter methods and 2) wrapper methods [1]. In filter methods, genes are selected based on their relevance to certain classes. A wrapper method embeds a


gene selection method within a classification algorithm.


The wrapper methods are not as efficient as the filter methods, because the learning algorithm must be run on the original high-dimensional microarray dataset [8].

So far, several soft computing methods such as fuzzy sets [9], rough set theory [10] and neural networks [11,12] have been proposed for gene expression based association studies. All these methods consider only one evaluator, while the prediction accuracy of a classifier is quite sensitive to the selected genes [3]. Therefore we propose a gene selection method in which different evaluators are simultaneously satisfied.

The paper is organized as follows. In Section 2, we review some feature evaluators. Section 3 presents our proposed method, where an ANN classifier is modeled. Section 4 focuses on the experimental results and conclusion.

2 Feature Selection

We study five attribute evaluators with the Ranker search method to find the best set of features. In the feature selection step, two objects should be considered: a feature evaluator and a search method. The evaluator assigns a predictive value to each subset of features. Details of the evaluators and search algorithms are discussed in [13].

Evaluators

A brief description of the evaluators used in this paper is as follows:

GainRatioAttributeEval: measures the gain ratio with respect to the class.
InfoGainAttributeEval: measures the information gain with respect to the class.
OneRAttributeEval: evaluates the worth of an attribute using the OneR classifier.
ReliefFAttributeEval: repeatedly samples an instance and considers the value of the given attribute for the nearest instances of the same and of a different class.
SymmetricalUncertAttributeEval: measures the symmetrical uncertainty with respect to the class.
CfsSubsetEval: subsets of features that are highly correlated with the class while having low intercorrelation are preferred.

All mentioned evaluators are implemented in the Weka package [14]. The only admissible search method for the above evaluators is the Ranker method, which ranks features by their individual evaluations.

3 The Neural Network Classifier

In this section, details of the ANN which we use for the association study on gene expression profiles are discussed. The ANN has three types of layers, namely the input layer, the output layer and the hidden layer, which is intermediate between the input and output layers. Fig. 1 shows a multilayer feed-forward ANN structure. The neurons in two adjacent layers are fully connected, while the neurons within the same layer are not connected.

Figure 1: Multilayer feed-forward ANN structure.

In this paper, each neuron in the input layer is associated with a gene selected in the previous step (feature selection), the number of hidden layers is 1 or 2, and the output layer has just a single neuron. We use four different training algorithms, Resilient Backpropagation (RP), Levenberg-Marquardt (LM), One-Step Secant backpropagation (OSS), and Broyden-Fletcher-Goldfarb-Shanno (BFGS), in the framework of the backpropagation (BP) scheme. We set up the initial weights with random values. The learning procedure iterates until the error (estimated on a validation set) falls under a pre-specified threshold.

In our method, the selection algorithm is implemented in two steps: 1) first, the relevant candidate genes from the initial set of features are selected by each criterion evaluator, and 2) the genes which commonly pass all evaluators' thresholds are selected.
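A minimal sketch of the two-step selection described above (rank the genes with each evaluator, then keep the genes common to all top-k lists) is given below, assuming Python with NumPy; the two scoring functions are generic stand-ins rather than the Weka evaluators listed above, and the value of k is arbitrary.

import numpy as np

def top_k_by_evaluator(score_fn, X, y, k):
    # Return the indices of the k highest-scoring features for one evaluator
    scores = np.array([score_fn(X[:, j], y) for j in range(X.shape[1])])
    return set(np.argsort(scores)[::-1][:k])

def common_genes(evaluators, X, y, k=50):
    # Step 1: rank with every evaluator; step 2: intersect the top-k lists
    selected = [top_k_by_evaluator(f, X, y, k) for f in evaluators]
    return sorted(set.intersection(*selected))

# Toy stand-ins for ranker evaluators (e.g. absolute class correlation)
def abs_corr(col, y):
    return abs(np.corrcoef(col, y)[0, 1])

def mean_diff(col, y):
    return abs(col[y == 1].mean() - col[y == 0].mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))          # 40 samples, 200 "genes"
y = rng.integers(0, 2, size=40)
print(common_genes([abs_corr, mean_diff], X, y, k=20))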

Datasets

To explore the performance of the new gene selection method, three well-known gene expression datasets are considered: the leukemia, the lymphoma and the Diffuse Large B-cell Lymphoma (DLBCL) datasets. These data have received great interest in gene selection and cancer classification research [10,15]. The data are publicly available from www.upo.es/eps/aguilar/datasets.html and datam.i2r.a-star.edu.sg/datasets/krbd . To assess the performance of the classification, we evaluate the ANN once with LOOCV (leave-one-out cross validation) and once again on a particular test dataset without cross validation.
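Assuming a Python environment with scikit-learn, the following sketch shows how an MLP built over the commonly selected genes can be scored with leave-one-out cross validation, as in the assessment just described; the toy data, network size and solver settings are placeholders and do not reproduce the paper's actual training setup.

import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_sel = rng.normal(size=(45, 11))        # samples x commonly selected genes (toy data)
y = rng.integers(0, 2, size=45)          # two tumour classes

clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
# LOOCV accuracy: train on n-1 samples, test on the held-out one, repeat for every sample
acc = cross_val_score(clf, X_sel, y, cv=LeaveOneOut()).mean()
print(f"LOOCV accuracy: {acc:.4f}")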
Table 1: Rank Thresholds of Feature Selection

4 Experimental Results and Discussion

The feature selection evaluators and their rank thresholds for each dataset are shown in Table 1. In the first step of the selection algorithm, the number of selected genes is set to a moderate number, e.g. between 30 and 90. Then we find the minimum number of genes that is shared by all evaluators' criteria. The informative genes found in the datasets are listed in Table 2.

ALL-AML Leukemia
The leukemia data consist of 72 samples, among which are 25 samples of AML and 47 samples of ALL. The number of genes in each sample of this dataset is 7129. The training data consist of 38 samples (27 ALL and 11 AML), and the rest are considered as test data [16]. Using the test data without cross validation, a perfectly accurate classification is observed (Table 3). This also achieves the best leave-one-out (LOOCV) result (98.61%) with the RP training algorithm (Table 4); the LM and OSS methods are the second most accurate classifiers, with accuracies of 95.83% and 94.44% respectively. Our result compares with the result reported in [17], where 1038 genes predict with 91.18% accuracy (10-CV, Bagging and AdaBoost), while we achieved 98.61% (LOOCV) using just 11 genes. We also achieve a higher performance compared to [18], which obtained an accuracy of 98% (LOOCV) using 132 genes.

Lymphoma
This dataset contains 45 samples, 22 of which belong to the germinal center B-like group (GCL) and 23 to the activated B-like group (ACL); the number of genes is 4026. In the lymphoma dataset, RP is the most accurate classifier (accuracy = 97.77%) using leave-one-out cross validation (Table 4). The ANN, regardless of the training algorithm, exactly classified all test samples of lymphoma without cross validation (Table 3). Our results on this dataset and those obtained by SVM and Bayesian networks [19] are tightly close to each other (97.77% vs. 97.87%). It is remarkable that the same result has been obtained by the hyper-box enclosure method [15].

DLBCL


The third dataset contains 58 samples from DLBCL patients and 19 samples from follicular lymphoma (FL) over 7029 genes. Here, RP and BFGS obtained the most accurate results: 100% (estimated on the test data, Table 3) and 96.10% (estimated by LOOCV, Table 4), respectively. It should be noted that the result reported in [20] is more accurate than ours (97.50% vs. 96.10%), but they have not identified any group of genes responsible for DLBCL. Our results compare with the results of the kNN-based method (reported accuracy = 92.71%) [21], where eight genes have been identified to be associated with DLBCL. The hyper-box enclosure method [15] obtains the same accuracy as our multiple-ranker method with ANN.

We are also interested in the effect of the feature reduction on the classification accuracy. We gradually reduce the number of initial genes selected by each evaluator and re-organize the ANN classifier. Fig. 2 shows the trend of the accuracy with respect to the number of initial genes; the numbers of commonly selected genes are shown by bullets on each curve. As shown in Fig. 2, over 90 percent of the lymphoma samples can be perfectly identified by using only one gene (GENE3330X). The same accuracy can also be achieved with four genes for leukemia (M84526_at, X95735_at, U46499_at, L09209_s_at). DLBCL is rather complicated; a reliable classification requires at least seven genes, even more (see Table 2). It should be noted that no subset of six genes or fewer can result in a classification with an accuracy above 90 percent.

The precision of the classification, that is, the ratio of truly predicted samples in each class, is illustrated in Fig. 3. In this step, we consider only the results of the LM algorithm, known as the most efficient training algorithm in our experiments. We have also studied some classifiers other than ANN, such as SMO, KStar and Logistic regression, but no better results have been obtained.

Table 2: Informative Genes Found in Datasets

Table 3: Accuracy of Classification with Test set Using ANN

Table 4: Accuracy of Classification with LOOCV Using ANN

Conclusion

In this paper, a successful gene selection method based on the combination of multiple feature selection methods has been introduced. The selected genes have been used to establish an ANN by which the sample types of gene expression data have been classified. Three public datasets of gene expression have been used to test the performance. Our comprehensive assessment using leave-one-out cross validation has shown the highest prediction accuracy for the proposed approach among gene expression classification algorithms. It suggests that our method can select informative genes for cancer classification.

Figure 2: Accuracy vs. number of common genes (CG).

Figure 3: Class precision of ANN classifier with LM algorithm.

References

[1] R. Kohavi and G.H. John, Wrappers for feature subset selection, Artif. Intell. 97, no. 1/2 (1997), 273-324.
[2] Z. Zainuddin and P. Ong, Reliable multiclass cancer classification of microarray gene expression profiles using an improved wavelet neural network, Expert Systems with Applications 38 (2011), 13711-13722.
[3] L. Nanni and A. Lumini, Wavelet selection for disease classification by DNA microarray data, Expert Systems with Applications 38 (2011), 990-995.
[4] Y. Yang and J.O. Pedersen, A comparative study of feature selection in text categorization, Proceedings of the Fourteenth International Conference on Machine Learning (ICML97) (1997), 412-420.
[5] B. Krishnapuram, A.J. Hartemink, L. Carin, and M.A.T. Figueiredo, A Bayesian approach to joint feature selection and classifier design, IEEE Transactions on Pattern Analysis and Machine Intelligence 26, no. 9 (2004), 1105-1111.
[6] S.B. Dong and Y.M. Yang, Hierarchical web image classification by multi-level features, Proceedings of the First International Conference on Machine Learning and Cybernetics, Beijing (2002), 663-668.
[7] R. Setiono and H. Liu, Feature selection via discretization, IEEE Transactions on Knowledge and Data Engineering 9 (1997), 642-645.
[8] H. Hu, J. Li, H. Wang, and G. Daggard, Combined gene selection methods for microarray data analysis, Proceedings of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Bournemouth, UK (2006), 911.
[9] S.A. Vinterbo, E.Y. Kim, and L. Ohno-Machado, Small, fuzzy and interpretable gene expression based classifiers, Bioinformatics 21, no. 9 (2005), 1964-1970.
[10] L. Sun, D. Miao, and H. Zhang, Gene selection with rough sets for cancer classification, IEEE Fourth International Conference on Fuzzy Systems and Knowledge Discovery, Haikou (2007), 167-172.
[11] J. Khan, J.S. Wei, M. Ringner, L.H. Ladanyi, F. Westermann, F. Berthold, et al., Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, Nature Med 7 (2001), 673-679.
[12] M. Muselli, M. Costacurta, and F. Ruffino, Evaluating switching neural networks through artificial and real gene expression data, Artificial Intelligence in Medicine 45 (2009), 163-171.
[13] Y. Wang and I.V. Tetko, Gene selection from microarray data for cancer classification: a machine learning approach, Comp Biol Chem (2005), 37-46.
[14] M. Hall, G. Holmes, B. Pfahringer, P. Reutemann, et al., The WEKA data mining software: an update, SIGKDD Explorations 11, no. 1 (2009).
[15] O. Dagliyan, F. Uney-Yuksektepe, I.H. Kavakli, and M. Turkay, Optimization based tumor classification from microarray gene expression data, PLoS ONE 6, no. 2, e14579 (2011).
[16] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, et al., C.D. Bloomfield, and E.S. Lander, Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science 286 (1999), 531-537.
[17] A.C. Tan and D. Gilbert, Ensemble machine learning on gene expression data for cancer classification, Appl Bioinformatics 2 Suppl. (2003), 75-83.
[18] M. Okuya, H. Kurosawa, J. Kikuchi, Y. Furukawa, H. Matsui, et al., Upregulation of survivin by the E2A-HLF chimera is indispensable for the survival of t(17;19)-positive leukemia cells, J Biol Chem 285, no. 18 (2010), 5060.
[19] R. Hewett and P. Kijsanayothin, Tumor classification ranking from microarray data, BMC Genomics 9: S21 (2008).
[20] A. Statnikov, C.F. Aliferis, I. Tsamardinos, D. Hardin, et al., A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, BMC Genomics 21 (2005), 631-643.
[21] J.G. Zhang and H.W. Deng, Gene selection for classification of microarray data based on the Bayes error, BMC Bioinformatics 8 (2007), 370.

Data mining with learning decision tree and Bayesian network for
data replication in Data Grid
Farzaneh Veghari Baheri

Farnaz Davardoost

Department of Computer, Khodaafarin Branch,

Department of Computer, Khosroshahr Branch,

Islamic Azad University, Khodaafarin-Iran.

Islamic Azad University, Khosroshahr-Iran.

Farzaneh Veghari@Yahoo.com

Farnaz Davardoost@Yahoo.com

Vahid Ahmadzadeh

Department of Computer,
Payame Noor University,
PO BOX 19395-3697 Tehran, Iran.
Ahmadzadeh.Vahid@Gmail.com

Abstract: Data management is a major problem in Grid environments. A data Grid is composed of thousands of geographically distributed storage resources, usually located under different administrative domains. The size of the data managed by data Grids is continuously growing and has already reached Petabytes. Large data files are replicated across the Data Grid to improve the system performance. In this paper, we improve the data access time and reduce the access latency. A hybrid model is developed by combining a Bayesian network and a learning decision tree. We suppose a hierarchical architecture which has some clusters. This approach detects which data should be replicated. Initially, the algorithm calculates the Entropy of the dataset and then the Gain of every attribute. Finally, the probability of the outcome is calculated with a Bayesian expression and the replication rule is produced. We simulate this approach to evaluate the performance of the proposed hybrid method. The simulation results show that the data access time is reduced.

Keywords: Bayesian Network; Data Replication; Entropy; Gain; Grid; Learning Decision Tree.

Introduction

In recent years, applications such as bioinformatics, climate transition, and high energy physics produce large
datasets from simulations or experiments. Managing
this huge amount of data in a centralized way is ineffective due to extensive access latency and load on
the central server. In order to solve these kinds of
problems, Grid technologies have been proposed. Data
Grids aggregate a collection of distributed resources
placed in different parts of the world to enable users to
share data and resources (Chervenak et al., 2000; Allcock et al., 2001; Foster, 2002; Worldwide Lhc Computing Grid, 2011). Data replication has been used in

database systems and Data Grid systems. Data replication is an important technique to manage large data
in a distributed manner. The general idea of replication
is to place replicas of data at various locations. Learning decision trees and Bayesian networks are widely
used in many areas, such as data mining, classification
systems, and decision support systems and so on.
A decision tree is a model of inductive learning from observation. Decision trees are created from training data in a top-down direction. A learning decision tree is a hierarchical tree structure which is divided based on a single attribute at each internal node. The first stage of a learning decision tree is the root node, which is allocated all the examples from the training set. If all examples belong to the same class, then no other decisions need to be made to partition the examples, and the solution is complete. If the examples at this node belong to two or more classes, then a test is made at the node that will result in a split. The process is repeated for each of the new nodes until a differentiating tree is complete.

Bayesian networks are popular within the artificial intelligence community due to their ability to support probabilistic reasoning from data with uncertainty. A Bayesian Network (BN) is a directed acyclic graph that represents relationships of a probabilistic nature among variables of interest. With a network at hand, probabilistic inference can be conducted to predict the values of some variables based on the observed values of other variables and to find a pattern in training data [1, 2, 3, 4, 5, 6, 7].

In this paper, we present a hybrid model composed of a learning decision tree and a Bayesian network resulting from running the database. We assume a hierarchical architecture of the data Grid system. The proposed architecture is composed of some clusters, and every cluster is composed of some sites. At first, a decision tree based on the ID3 learning algorithm is created; then, a set of decision rules is generated for data replication in the Grid environment. We simulate our method to evaluate the performance of this training method. Providing the replication rule for the data increases the performance of the system, and a more optimal solution is obtained than with the other methods. In summary, the data access time is reduced with the proposed hybrid method. Section 2 of this paper introduces some previous work on data replication. Section 3 explains our proposed method in detail, and in section 4 we evaluate the proposed method. Finally, conclusions are presented in section 5.

Related work

Some recent studies have discussed the problem of replication in data Grids. Some of these works will be surveyed in this section. In [8] six distinct strategies are presented for the multi-tier data Grid. These strategies are as follows:

1. No Replication: in this case only the root node includes the replicas.

2. Best Client: a replica is created for the client which accesses most frequently.

3. Cascading: a replica is created on the path of the best client.

4. Plain Caching: a local copy is stored on initial request.

5. Caching plus Cascading: combines plain caching and cascading strategies.

6. Fast Spread: file copies are stored at each node on the path to the best client.

In [9] the authors discussed a new dynamic replication method in a multi-tier data Grid called predictive hierarchical fast spread (PHFS), which is an extended version of fast spread. Considering spatial locality, PHFS tries to increase locality in accesses by predicting users' subsequent file demands and pre-replicating them beforehand in a hierarchical manner. In PHFS, in order to eliminate the delay of replication on request, data must be replicated in advance by using the concept of predicting future requests, while we use a learning decision tree and a Bayesian network for determining which data should be replicated.

In [10], a hybrid model is advanced by integrating a case-based data clustering procedure and a fuzzy decision tree for medical data classification. A large amount of research has been conducted to study the behavior of a group of medical symptoms. However, the researcher is more interested in discovering potential disease factors. Therefore, they take a different approach by proposing a case-based fuzzy decision tree to diagnose the potential illness symptoms. In [10] the authors used decision trees for medical data classification, while in this paper we use the ID3 learning algorithm in decision trees for data replication in the data Grid.

The Proposed Architecture

The performance of replication strategies is highly dependent on the architecture of the data Grid. One of the basic models is the hierarchical data model, which is also known as multi-tier. In this paper, we assume a hierarchical architecture with 2 tiers; furthermore, our architecture is organized as clusters. This hierarchical architecture is shown in Fig. 1.


Figure 1: Hierarchical architecture for data management

Tier 0 is the broker, which is responsible for replicating data; at tier 1 there are the cluster heads. A cluster head is a central node which manages all nodes of its cluster and monitors their status. At tier 2 there are the users, from whom the requests are inserted.

3.1 Learning Decision Tree

A decision tree is a hierarchical model for supervised learning whereby the local region is identified in a sequence of recursive splits in a smaller number of steps. A decision tree is composed of internal decision nodes and terminal leaves. We want to determine which data must be replicated according to the historical data accesses. There are tables in the broker and the cluster heads, like Table 1, which include the following fields [11].

Table 1: Database in Broker and Cluster Heads
Field name      Description                                        Values
ID              Data identification                                Number
Access number   Number of accesses to the data                     Low, Mid, High
Priority        Importance of the data                             Low, High
Service time    Length of time for allocating the requested data   Low, High
Size of data    Size of the data                                   Low, High

In this paper, we present the basic algorithm for decision tree learning, corresponding approximately to ID3, where the Examples are according to Table 1, the target attribute is data replication, and the attributes are the fields of the table (Access Number, Priority, Service Time, Size of data). The summary of the ID3 algorithm is as follows:

ID3(Examples, Target_attribute, Attributes)
  Create a Root node for the tree
  If all Examples are positive, return the single-node tree Root, with label = +
  If all Examples are negative, return the single-node tree Root, with label = -
  If Attributes is empty, return the single-node tree Root, with label = most common value of Target_attribute in Examples
  Otherwise Begin
    A <- calculate the Gains of all attributes, then select the attribute with the highest Gain
    The decision attribute for Root <- A
    For each possible value vi of A:
      Add a new tree branch below Root, corresponding to the test A = vi
      Let Examples_vi be the subset of Examples that have value vi for A
      If Examples_vi is empty
        Then below this new branch add a leaf node with label = most common value of Target_attribute in Examples
      Else below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes - {A})
  End
  Return Root

The central choice in the ID3 [16] algorithm is selecting which attribute to test at each node in the tree. What is a good quantitative measure of the worth of an attribute? We will define a statistical property, called information Gain. In order to define information Gain precisely, we begin by defining a measure commonly used in information theory, called Entropy. Equation (1) shows the formula for calculating the Entropy:

Entropy(S) = - \sum_{i=1}^{c} p_i \log_2 p_i    (1)

The information gain, Gain(S, A), of an attribute A relative to a collection of examples S is defined as in Equation (2):

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} (|S_v| / |S|) Entropy(S_v)    (2)

where Values(A) is the set of all possible values for attribute A, and S_v is the subset of S for which attribute A has value v (i.e., S_v = {s \in S | A(s) = v}).


Note that the first term in Equation (2) is just the entropy of the original collection S, and the second term is the expected value of the entropy after S is partitioned using attribute A. The expected entropy described by this second term is simply the sum of the entropies of each subset S_v, weighted by the fraction |S_v|/|S| of examples that belong to S_v. Gain(S, A) is therefore the expected reduction in entropy caused by knowing the value of attribute A.
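To make Equations (1) and (2) concrete, the following small Python sketch computes the Entropy and Gain over a toy access-history table with the Table 1 attributes; the example records and attribute values are hypothetical and only illustrate the calculation, they are not the paper's experimental data.

import math
from collections import Counter

def entropy(examples, target):
    """Entropy(S) = -sum_i p_i * log2(p_i) over the target-value distribution."""
    counts = Counter(e[target] for e in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain(examples, attribute, target):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    total = len(examples)
    remainder = 0.0
    for v in set(e[attribute] for e in examples):
        subset = [e for e in examples if e[attribute] == v]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

# Hypothetical access-history records with the Table 1 fields:
history = [
    {"access": "High", "priority": "High", "service_time": "Low",  "size": "Low",  "replicate": "Yes"},
    {"access": "Low",  "priority": "Low",  "service_time": "High", "size": "High", "replicate": "No"},
    {"access": "Mid",  "priority": "High", "service_time": "High", "size": "Low",  "replicate": "Yes"},
    {"access": "Mid",  "priority": "Low",  "service_time": "Low",  "size": "High", "replicate": "No"},
]

# ID3 would place at the root the attribute with the highest Gain:
best = max(["access", "priority", "service_time", "size"],
           key=lambda a: gain(history, a, "replicate"))
print(best, gain(history, best, "replicate"))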

3.2 Bayesian Networks

A Bayesian network can represent the probabilistic relationships between the data and data replication. Given the symptoms, the Bayesian network can be used to compute the probabilities of the use of data in the future. To develop a Bayesian network, we first build a DAG such as a decision tree. Then we specify the conditional probability distributions of each variable. Now we calculate the probability of each attribute: we find the influence factor for all the attribute values. The influence factor gives the dependability of the attribute value on the class label. The formula for the influence factor for a particular class Ci is given in Equation (3):

I(A_j = x_i, C_i) = N(A_j = x_i \wedge C_i) / N(C_i)    (3)

where A_j is the attribute that is currently considered for calculation, j varies from 1..n where n refers to the maximum number of predictive attributes, and k is the maximum number of attribute values for the attribute A_j [12, 13, 14, 15].

Figure 2: Example of Decision Tree

3.3 Extracting rules from the combination of learning decision tree and Bayesian network

We assume that a learning decision tree for the target value data replication is according to Figure 2; each path from the root to a leaf can be written down as a set of IF-THEN rules. The rule base allows knowledge extraction. The rules reflect the main characteristics of the dataset. The decision tree of Figure 2 can be written down as the following set of rules:

No Replication Rule:
IF
{
(Access Number = Low) OR
(Access Number = Mid AND Priority = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of data = High)
}
Then replicate = No

Replication Rule:
IF
{
(Access Number = High) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of data = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of data = Mid)
}
Then replicate = Yes
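A minimal sketch of how these extracted rules and the influence factor of Equation (3) might be applied in code is given below; the record format and field names are hypothetical and only mirror the Table 1 attributes, so this is an illustration rather than the authors' implementation.

def influence_factor(history, attribute, value, class_label, target="replicate"):
    """I(A_j = x_i, C_i) = N(A_j = x_i and C_i) / N(C_i), per Equation (3)."""
    in_class = [e for e in history if e[target] == class_label]
    if not in_class:
        return 0.0
    return sum(1 for e in in_class if e[attribute] == value) / len(in_class)

def should_replicate(access, priority, service_time, size):
    """Apply the IF-THEN rules extracted from the decision tree of Figure 2."""
    if access == "High":
        return True
    if access == "Mid" and priority == "High" and service_time == "High" and size in ("Low", "Mid"):
        return True
    # All remaining paths fall under the No Replication rule.
    return False

# Example: a file accessed a moderate number of times, with high priority,
# long service time and small size, is selected for replication.
print(should_replicate("Mid", "High", "High", "Low"))   # True
print(should_replicate("Low", "High", "High", "Low"))   # False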


Simulations

We evaluate and compare the performance of our approach with the no-replication algorithm under conditions in which the number of clusters varies. Figure 3 illustrates the comparison of access time for 4 clusters, 8 clusters and 12 clusters. As shown in Figure 3, the access time decreases as the number of clusters increases. Comparing the access time of our approach with no replication shows a 22% decrease with 4 clusters, a 27% decrease with 8 clusters and finally a 34% decrease with 12 clusters.

Figure 3: Access time for various cluster numbers

Conclusion

In this paper, we present a hybrid model composed of a learning decision tree and a Bayesian network for the replication strategy in a data Grid. We assume a hierarchical architecture of the data Grid system. The proposed architecture is composed of some clusters, and every cluster is composed of some sites. Finally, a set of decision rules is generated for data replication in the Grid environment. We simulate our method to evaluate the performance of this training method. A more optimal solution is obtained than with the other methods. Providing the replication rule for the data increases the performance of the system. In summary, the data access time is reduced with the proposed hybrid method.

References

[1] J. Zhang, B. Lee, X. Tang, and C. Yeo, A model to predict the optimal performance of the Hierarchical Data Grid, Future Generation Computer Systems 26 (2010).
[2] J. Perez, F. Garcia-Carballeira, J. Carretero, A. Calderon, and J. Fernandez, Branch replication scheme: A new model for data replication in large scale data Grids, Future Generation Computer Systems 26 (2010), 12-20.
[3] R. Chang, C. Lin, and S. His, Accessing data from many servers simultaneously and adaptively in data Grids, Future Generation Computer Systems 26 (2010), 63-71.
[4] N. Mansouri and G. Dastghaibyfard, A dynamic replica management strategy in data Grid, Journal of Network and Computer Applications.
[5] B. Chandra and P.P. Varghese, Fuzzifying Gini Index based decision trees, Expert Systems with Applications 36 (2009), 8549-8559.
[6] T.D. Schneider, Information Theory Primer, 1995. Available from: ftp://ftp.ncifcrf.gov/delila/primer.ps.
[7] K. Sashi and A. Selvadoss Thanamani, Dynamic replication in a data grid using a Modified BHR Region Based Algorithm, Future Generation Computer Systems 27 (2011), 202-210.
[8] K. Ranganathan and I. Foster, Design and evaluation of dynamic replication strategies for a high performance data Grid, in: International Conference on Computing in High Energy and Nuclear Physics, 2001.
[9] L. Mohammad Khanli, A. Isazadeh, and T.N. Shishavan, PHFS: A dynamic replication method to decrease access latency in the multi-tier data grid, Future Generation Computer Systems 27 (2011), 233-244.
[10] C.Y. Fan, P.C. Chang, J.J. Lin, and J.C. Hsieh, A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification, Applied Soft Computing (2010).
[11] L. Mohammad Khanli and F. Veghari Baheri, A hybrid model combining decision tree and Bayesian network for data replication in Grid environment, Journal of Telecommunications 3, no. 1, April 2010.
[12] S.A. Balamurugan and R. Rajaram, Effective solution for unhandled exception in decision tree induction algorithms, Expert Systems with Applications 36 (2009), 12113-12119.
[13] T. Amjad, M. Sher, and A. Daud, A survey of dynamic replication strategies for improving data availability in data grids, Future Generation Computer Systems 28 (2012), 337-349.
[14] N. Xiong, Learning fuzzy rules for similarity assessment in case-based reasoning, Expert Systems with Applications 38 (2011), 10780-10786.
[15] L. Mohammad Khanli, F. Mahan, and A. Isazadeh, Active rule learning using decision tree for resource management in Grid computing, Future Generation Computer Systems 27 (2011), 703-710.
[16] Tom M. Mitchell, Machine Learning, McGraw-Hill Science/Engineering/Math (March 1, 1997).

Design and Implementation of a three-node Wireless Network


For Transferring Patients' Medical Information without Data Collision

Roya Derakhshanfar

Islamic Azad University


Department of Biomedical Engineering, Science and Research Branch
Tehran, Iran
r.derakhshanfar@srbiau.ac.ir

Maisam M.Bassiri
Iran University of Science and Technology
Department of Electrical Engineering
Tehran, Iran
basiri@iust.ac.ir

S.Kamaledin Setarehdan
University of Tehran
Control and Intelligent Processing Center of Excellence, School of ECE, College of Engineering
Tehran, Iran
ksetareh@ut.ac.ir

Abstract: The purpose of this paper is to introduce a method for transmitting patients' data using a wireless network. With this network, the patients' data is first gathered at a central station and from there it is sent to a computer. In the computer, the patients' profiles are created, so that their medical information can be monitored at every moment. The protocol between master and slave provides synchronous data transfer without collision. Another protocol is also provided between the computer and the master in order to collect, save and process the data.

Keywords: Medical devices; Telemedicine; Codevision Software; Wireless networks.

Introduction

Corresponding Author, P. O. Box 1568834911, T: (+98) 21 88400950

Telemedicine was pioneered at the beginning of the 20th century, for example in the field of maritime medicine, by using telecommunication and Morse code [1]. Telemedicine means that patients can receive health services such as prognosis of disease, advance diagnosis, treatment services and therapy procedures at any time and in any geographical area. This is significant in providing better facilities for people living in outlying regions and for people located in deprived regions. The evolution of telemedicine has given birth to a widespread nomenclature, as follows: e-health, telediagnosis, teletreatment, telemonitoring and telehealth. Telemedicine has been employed to surmount distance, and in this way wireless communication devices are reliably helpful. The improvements in telemedicine and communication technologies such as wireless networks generate supporting systems for the management of chronic illnesses such as heart disease and hypertension, and aid physicians in the careful examination of disease in any situation. Nowadays, computer-based systems are used for clinical applications. In the territory of telemedicine, monitoring applica-

tions using wireless networks were developed. The


telemedicine service requires a computer-based system
which can control patients health condition using modern monitor and can transmit, patients profile to assess
as soon as possible. Telemedicine has advanced in the
industrialized countries. This way can be convenient to
reduce of costs and disinclination of the elderly people
for return to the hospital or clinical centers. The purpose of telemedicine is to provide immediate medical
treatment through modern monitors, wireless networks
and telecommunications procedure such as, satellites,
mobile e-health applications and specific medical devices based on the sensors and microcontroller devices,
for the help to the patients. Some articles in the domain of telemedicine, were published. Andreas Lymberis and Silas Olsson, have described the current status of multidisciplinary research and development of
IBC (intelligent biomedical clothing), based on bibliographic research [1]. Anthoula P.Anagnostaki, Sotiris
Pavlopoulos, Efthivoulos Kyriakou and Dimitris Koutsouris, have discussed a novel codification scheme based
on two healthcare informatics standards, the VITAL
and DICOM sup.30, in addressing the robust interchange of waveform and medical data for a home care
application [2]. Edward Mutafungwa, Zhong Zheng,
Jyri Hamalainen, Mika Husso and Timo Korhonen,
have proposed a complementary solution based on the
emerging femtocellular approach for indoor emergency
telemedicine scenarios [3]. A.Yadollahi, Z.Moussavi
and P.Yahampath, have described an adaptive method
for compression of respiratory and swallowing sounds
[4]. Claudio De Capua, Antonella Meduri and Rosario
Morello, have presented an original ECG measurement
system based on web-service-oriented architecture to
monitor the heart health of cardiac patients [5]. Alfonso Prieto-Guerrero, Corinne Mailhes and Francis
Castanie, have described a method based on left-sided
and right-sided autoregressive modeling to interpolate
missing samples in an ECG signal [6]. Tae-Soo Lee,
Joo-Hyun Hong and Myeong-Chan Cho, have implemented a portable and wearable biomedical digital assistant using a personal digital assistant (PDA) and
an ordinary cellular phone [7]. Daniel Lucani, Giancarlos Cataldo, Julio Cruz, Guillermo Villegas and
Sara Wong, have developed a prototype of a portable
ECG-monitoring device for clinical and non-clinical environments as part of a telemedicine system to provide remote and continuous surveillance of patients
[8]. Marco J.Suarez Baron, Juan J.Velasquez, Carlos A.Cifuentes and Luis E.Rodriguez, have introduced
an algorithm to telemedicine intelligent through web
mining and instrumentation wearable [9]. J.Escayola,
I.Martinez, J.Trigo, J.Garcia, M.Martinez-Espronceda,
S.Led and L.Serrano, have described an extended review of the most recent innovative advances in biomedical engineering applied to the standard-based design

for ubiquitous and personal healthcare environments


[10]. A.Bramanti, L.Bonanno, A.Celona, S.Bertuccio,
A.Calisto, P.Lanzafame and P.Bramanti, have identified regional spots as potential territorial stations for
the telemedicine interventions delivery through the use
of GIS (geographical information system), a technology
that is recently considered an important and new component for many epidemiological and health projects
[11]. Cristian Ravariu and Florin Babarada, have offered a web protocol for medical diagnosis, under educational projects, as application to learning, bioinformatics and telemedicine [12]. Yibao Wang, Yang
Liu, Xudong Lu, Jiye An and Huilong Duan, have proposed a simple and complete representation of biosignal
data based on MFER and CDA [13]. Stefano Bonacina
and Marco Masseroli, have designed a web application
that enables different healthcare actors to insert and
browse healthcare data, bio-signals and biomedical images of patients enrolled in a program of cardiovascular
risk prevention [14]. Mohd Fadlee A.Rasid and Bryan
Woodward, have described the design of a processor,
which samples signals from sensors on the patient. It
then transmits digital data over a Bluetooth links to
a mobile telephone that uses the general packet radio
service (GPRS) [15]. This study consists of two slaves and one master. Each slave is used as an interface for receiving a patient's information, and the master is in fact a central station. The patients' information is gathered in the master, through which it is sent to a PC. In this way, the patients' information profiles are created in the PC. This enables doctors to evaluate each patient's medical information at any moment. Here the drug and temperature information enters the central station (master) through each of the slaves, with its characteristic code. Master/slave is a model of communication where one device or process has unidirectional control over one or more other devices. In some systems a master is elected from a group of eligible devices, with the other devices acting in the role of slaves [16].

2 Material and Methods

2.1 Hardware

In this study, HM-T and HM-R have been used as the


transmitter and receiver respectively. HM-R is a receiver module that is easily usable through serial connection to a microcontroller or a PC. This module is
based on FSK modulation which provides longer working distance and less interference in comparison to ASK
technology. For carrying out data transmission, a receiver pair is needed. For this purpose a HM-T trans-


mission module with three pins, two for the module


supply and one for data, has been used. It is enough to
deliver data to this module through serial interface so
that it can send the data via FSK modulation. These
modules can be found in different working frequencies of 315, 433, 868 and 915 MHz. According to the
datasheet, these modules have three options for baud
rate. Considering the necessity of this study to have
noise reduction and achieve maximum efficiency, 4800
bps was chosen as the most suitable baud rate for the
application. In this baud rate, the modules work better and data is received more accurately and precisely.
Baud rate set-up was done through codewizard AVR.
An important point to consider when using these modules is that they go into a sleep mode if there is no data transfer for 70 ms. At first the module is off. When data transfer is started, the module is in its sleep mode. In order to wake it, we have to send some dummy data, and then it is ready to transfer the
main data. The modules work in different frequencies
in master and slave. In master, the HM-T transmission
module and the HM-R receiver module are connected
to the microcontroller with frequencies of 433 MHz and
915 MHz respectively. In slave, the HM-T transmitter
and the HM-R receiver module are connected to the
microcontroller with frequencies of 915 MHz and 433
MHz respectively. The reason for putting the modules
to work with different frequencies in master and slave
is to prevent interference between sending and receiving. The SMT160 is an intelligent temperature sensor with a PWM digital output. Therefore it can be connected to the microcontroller without using an ADC converter. The thermal range of this sensor is between -45 °C and 150 °C, and its output is a rectangular waveform whose duty cycle depends on the temperature. This dependency is given by the following linear equations:

DC = T_1 / (T_1 + T_2)    (1)

t = (DC - 0.320) / 0.0047    (2)

where DC is the duty cycle and t is the temperature in degrees Celsius. For measuring temperature with the SMT160 sensor, it is enough to measure the DC; the temperature is then obtained using equation (2). The easiest way to measure the DC is using a microcontroller. In an AVR microcontroller, the DC can be measured with different methods. In the method used in this study, it is enough to read a pin in the program and increment one of two counters depending on whether the pin is high or low. Using these two counters, the duty cycle is obtained and the rest is calculated as mentioned before. In the slave, an ATmega16 AVR microcontroller has been used.
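As a worked example of equations (1) and (2), the small Python sketch below converts measured high/low durations into a temperature; the function names and sample values are illustrative assumptions, and the 0.320/0.0047 constants are taken from the equations above.

def duty_cycle(t1_high, t2_low):
    """DC = T1 / (T1 + T2), where T1 is the high time and T2 the low time."""
    return t1_high / (t1_high + t2_low)

def temperature_celsius(dc):
    """t = (DC - 0.320) / 0.0047, per equation (2)."""
    return (dc - 0.320) / 0.0047

# Example: with 103 us high and 97 us low, DC = 0.515,
# which corresponds to roughly 41.5 degrees Celsius.
dc = duty_cycle(103, 97)
print(round(temperature_celsius(dc), 1))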
A keyboard is connected to port B and a LCD to port
A of the microcontroller. The temperature sensor is
connected to pin PD4, HM-T transmission module to
TX pin (pin No.15) and HM-R receiver module to RX


pin (pin No.14). In the master, an ATmega16 has been


used as well. The module connections to micro in the
master are the same as ones in the slave. Besides communication with the receiver and transmitter modules,
the master should be capable of communication with
PC through serial interface. Serial interface is provided
via a MAX232 voltage converter IC and a proper serial
cable. The hardware setup is shown in Figure 1.

2.2 Software

The serial interface can be explained in two parts: one between the master and the slave, and the other between the master and the PC.

Figure 1: Hardware setup

In the communication protocol between master and slave, data is gathered in the slaves by the microcontroller and then sent to the master. The request made by the master is as follows: first it sends the code consisting of the star character, the channel number and the square character, in sequence, as the transmission code. Then the slave responds to the master with the code consisting of the star character, the channel number, the drug, the temperature and finally the checksum (CS) and the square character, in sequence. The checksum is calculated as the sum of the ASCII codes of the star character, the channel number, the drug and the temperature. The CS is calculated at both the master and slave sides; an error is detected if there is any difference between the two. The explained protocol is written in CodeVision software for both the master and slave parts.
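The following minimal Python sketch illustrates this master-slave frame format and checksum check; the delimiter characters follow the description above, while the helper names and the single-byte checksum truncation are assumptions made for illustration rather than the exact firmware implementation.

def build_slave_response(channel, drug, temperature):
    """Frame: '*', channel, drug, temperature, checksum, '#' (checksum over the first four fields)."""
    payload = "*" + channel + drug + temperature
    checksum = sum(ord(ch) for ch in payload) % 256   # assumed single-byte checksum
    return payload + chr(checksum) + "#"

def parse_slave_response(frame):
    """Verify delimiters and checksum; return (channel, drug, temperature) or None on error."""
    if not (frame.startswith("*") and frame.endswith("#")):
        return None
    body, received_cs = frame[:-2], frame[-2]
    if sum(ord(ch) for ch in body) % 256 != ord(received_cs):
        return None   # mismatch between master-side and slave-side checksums
    return body[1], body[2:5], body[5:8]

# Example: channel '0', drug code 'D05', temperature '370' (three characters each, as in the PC protocol)
frame = build_slave_response("0", "D05", "370")
print(parse_slave_response(frame))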
In the communication protocol between the master
and the PC, the temperature and drug of each channel
are defined by three characters. Via this protocol the
PC can read the sent data with the corresponding port
and finally display or save it if necessary. This protocol
has been written in Visual Basic.


Results

The built setup works as follows considering both of the


explained protocols: the temperature corresponding to
each of the slaves is displayed against the word temp
on the LCD. The temperatures of the two slaves are
not necessarily equal. The numbers from 1 to 8 have
been assigned to each of the 8 special drugs, one by
one. For selecting a special drug, it is simply enough
to push the assigned number on the keyboard and then
press F1 key to confirm it. The drug information is sent
according to the number and is displayed against the
word drug. For example, if the channel 0 slave needs
the fifth drug, after pressing No.5 on the keyboard and
confirming it by pressing F1, drug number is saved and
displayed in the form of D5 in drug section of the table
on the PC when the default program delay is passed.
This means drug No.5 is requested. This program is
also capable of sending several drugs in sequence. Another feature is also included in this program: if, for any reason, any of the slaves is disconnected and data has not been delivered to the master, the master can recognize it by the channel number and announce the disconnection. For example, if the channel 1 (CH1) slave is disconnected, the message "CH1 disconnected" is shown at the bottom of the table in the exe file.

Conclusion

Today, the development of medical technologies has greatly affected clinical examination and consultation methods in hospitals and health care centers. The appearance of wireless networks for making connections with people in different places and obtaining desirable results has led many hospitals to use different methods in order to computerize their affairs. In this study, transferring the drug and temperature information of two patients without collision was investigated using two slaves. Adding other patients is possible by providing the related slave circuits and extending the program and hardware. We hope that in the near future, patients' electronic profiles will be accessible to doctors and nurses via wireless network communication so that better control and care in health and treatment can be provided.

References

[1] Andreas Lymberis and Silas Olsson, Intelligent biomedical clothing for personal health and disease management: state of the art and future vision, Telemedicine Journal and e-Health 9 (2003), no. 4.
[2] Anthoula P. Anagnostaki, Sotiris Pavlopoulos, Efthivoulos Kyriakou, and Dimitris Koutsouris, A novel codification scheme based on the VITAL and DICOM standards for telemedicine applications, IEEE Trans. Biomed. Eng. 49 (2002), no. 12.
[3] Edward Mutafungwa, Zhong Zheng, Jyri Hamalainen, Mika Husso, and Timo Korhonen, Exploiting femtocellular networks for emergency telemedicine application in indoor environments, IEEE (2010).
[4] A. Yadollahi, Z. Moussavi, and P. Yahampath, Adaptive compression of respiratory and swallowing sounds, IEEE, 28th Annual Int. Conf., New York (2006).
[5] Claudio De Capua, Antonella Meduri, and Rosario Morello, A smart ECG measurement system based on web-service-oriented architecture for telemedicine applications, IEEE Trans. Instrumentation and Measurement 59 (2010), no. 10.
[6] Alfonso Prieto-Guerrero, Corinne Mailhes, and Francis Castanie, Lost sample recovering of ECG signals in e-health applications, IEEE, 29th Annual Int. Conf., Lyon, France (2007).
[7] Tae-Soo Lee, Joo-Hyun Hong, and Myeong-Chan Cho, Biomedical digital assistant for ubiquitous healthcare, IEEE, 29th Annual Int. Conf., Lyon, France (2007).
[8] Daniel Lucani, Giancarlos Cataldo, Julio Cruz, Guillermo Villegas, and Sara Wong, A portable ECG monitoring device with Bluetooth and Holter capabilities for telemedicine applications, IEEE, 28th Annual Int. Conf., New York (2006).
[9] Marco J. Suarez Baron, Juan J. Velasquez, Carlos A. Cifuentes, and Luis E. Rodriguez, An approach to telemedicine intelligent, through web mining and instrumentation wearable, IEEE (2011).
[10] J. Escayola, I. Martinez, J. Trigo, J. Garcia, M. Martinez-Espronceda, S. Led, and L. Serrano, Recent innovative advances in biomedical engineering: standard-based design for ubiquitous p-health, IEEE, 4th Int. Multi-Conference on Computing in the Global Information Technology (2009).
[11] A. Bramanti, L. Bonanno, A. Celona, S. Bertuccio, A. Calisto, P. Lanzafame, and P. Bramanti, GIS and spatial analysis for costs and services optimization in neurological telemedicine, IEEE, 32nd Annual Int. Conf., Buenos Aires, Argentina (2010).
[12] Cristian Ravariu and Florin Babarada, The e-healthcare point of diagnosis implementation as a first instance, IEEE, 1st Int. Conf. on Data Compression, Communications and Processing (2011).
[13] Yibao Wang, Yang Liu, Xudong Lu, Jiye An, and Huilong Duan, A general-purpose representation of biosignal data based on MFER and CDA, Third Int. Conf. on Biomedical Engineering and Informatics (2010).
[14] Stefano Bonacina and Marco Masseroli, A web application for managing data of cardiovascular risk patients, IEEE, 28th Annual Int. Conf., New York (2006).
[15] Mohd Fadlee A. Rasid and Bryan Woodward, Bluetooth telemedicine processor for multichannel biomedical signal transmission via mobile cellular networks, IEEE Trans. Inform. Tech. Biomed. 9 (2005), no. 1.
[16] http://en.wikipedia.org/wiki/master/slave-(technology).
[17] http://www.atmel.com.
[18] http://www.hy-line.de.
[19] http://www.hoperf.com.
[20] http://www.maxim-ic.com.


CEA Framework: A Comprehensive Enterprise Architecture


Framework for middle-sized company

Elahe Najafi

Ahmad Baraani

MSc of Information Technology

Department of Computer Engineering

Research institute for ICT,Tehran,IRAN

Isfahan University,Isfahan,IRAN

enajafi@aut.ac.ir

ahmadb@eng.ui.ac.ir

Abstract: Designing architecture for organizations is a complex and confusing process. It is not obvious from which point you should start and how you can continue to achieve the holistic architectural model of an organization. Using the CEA framework (CEAF), a semantic enterprise architecture framework (EAF), brings a new opportunity for enterprise experts to obtain their enterprise ontology by focusing on one variable at a time without losing the sense of the enterprise as a whole. A number of semantic frameworks like CEAF have been presented by well-known Enterprise Architecture (EA) researchers and experts to date. A significant goal of all of them is to design a transparent enterprise which is as LEAN as possible, able to adapt to and adopt external demands and environment changes. To achieve this goal, CEAF is based on a primitive object named Service. This is a substantial characteristic of CEAF which distinguishes it from the other frameworks presented to date.

Keywords: Enterprise Architecture(EA);Enterprise Architecture Framework(EAF); Service Oriented; Service Oriented Framework; Service Oriented Enterprise Architecture(SOEA) .

Introduction

Enterprise architecture (EA) is an approach that organizations should practice to integrate their business with Information and Communication Technology (ICT). It presents a comprehensive and rigorous solution describing the current and future structure and behaviour of an organization by employing a logical structure. This structure, comprising a comprehensive collection of different views and aspects of the enterprise, is called an EAF. An EAF is a total picture of an organization showing how all organization elements work together to achieve defined business objectives.
Several distinctive EAFs have been proposed so far, but many organizations are struggling with using these frameworks; the main challenge which current EA frameworks face is that using them is a tedious and complex activity.
In this paper we present a new service oriented semantic framework to reduce this challenge. In the remainder of this paper we discuss the related work in section 2; the CEA framework is elaborated in detail in Section 3. Finally, directions of future research are discussed in section 4 to conclude the paper.

Corresponding Author, T: (+98) 913 3264575

1.1 Related Work

Several distinctive EA Frameworks (EAFs) have been proposed from 1980 till now, but many organizations are struggling with using these frameworks. The main challenges of existing frameworks are:

1. Inflexible to deliver services in a sense-and-respond manner.

2. Lack of well-defined alignment between services delivered at every level of the organization.

3. Lack of integration between enterprise goals, actions and resources.

4. Does not support cross-organizational interaction.

5. Heterogeneous models for each cell.

To eliminate these challenges, a number of researchers have tried to use the SO paradigm with EAF for generating EA artifacts [1-6]. These researchers believe that this paradigm re-engineers the enterprise into one that senses the environment rapidly and adapts itself to business challenges and opportunities quickly. Although the scope and coverage of these frameworks differ extensively, they do not completely clarify how the combination of EAF and services can take advantage of services, nor what a well-defined classification schema to support this combination would be. To eliminate the deficiencies of current SOEAF, we suggest a new SO semantic framework named CEA in the next section.

2 CEA Framework

The CEA Framework is a two-dimensional normalized scheme which is the intersection of two classifications: the aspects of the organization and the organization audience perspectives. These views and aspects are elaborated in detail as the CEAF rows and columns in the two next sections.

2.1 CEAF Rows

The CEAF rows show the enterprise from various viewpoints. For describing each row we defined a template comprising the items below and described each row by this template.
Observers: a list of audiences and viewers.
Description: a brief depiction.
Goals: a list of goals targeted by each row.
Critical Questions: a list of questions to be answered by the end of each row.
Organization: a list of the roles and the responsibility of each role corresponding to the COBIT RACI chart. In this field A, R and I stand for Accountable, Responsible and Informed.
Candidate Patterns: a list of patterns and suitable references.
Prerequisite: a list of prerequisite inputs.
Deliverables: a list of deliverables of each row.
Based on: a list of theoretical concepts which each row is based on.

2.1.1 First Row: Service Strategy

Observer: Strategist
Description: Service Strategy provides a foundation for enterprise management. It drives all enterprise activities.
Critical Questions:
What are our business objectives and expectations?
In which domains and to whom do we offer our services (our stakeholders)?
What value do we create for our stakeholders?
What services do we offer to our stakeholders now or plan to offer in the future?
What is the quality and warranty of our services to differentiate our services from rivals?
Who are our service provider partners?
What is the pattern of our activities in our value chain or value network to create value for our stakeholders?
How do we allocate resources efficiently to resolve conflicting demands for shared resources?
Goals:
Thinking about why and what we want done before thinking of how to do it.
Define our scope.
Ensure that the organization is in a position to handle the costs and risks associated with its services.
Set up a foundation for operational effectiveness and distinctive performance.
Organization:
I: Strategists, CEO, CFO, FIO, SOA ESC, SOA BC, AMC, ARB; A: CIO; R: Chief EA Architect
Candidate Patterns: Strategic pattern, Business pattern
Prerequisite: Business strategy
Deliverables: Context Diagram, Service Portfolio, Service Design Requirement
Based on: ITIL Service Strategy

2.1.2 2nd Row: Process Service Design

Observer: Customer, Business Owner.
Description: Process service design covers design principles and methods for converting strategic goals and desires into real orchestration services.
Critical Questions:
Which of our design services realizing our business processes are meaningful for our external stakeholders?
What are the quality, management and operational requirements of each design service which must be addressed as a fundamental part of design?
How can we interact with external service providers and use the services they provide to achieve our IT service targets and business expectations?
What is the pattern of each process service?
Goals:
Design of new or changed services for introduction into the live environment.
Provide a holistic view of all aspects of process design, ensuring that all process services are considered when any individual one changes or is amended.
Provide service alignment.
Organization:
I: Strategists, CFO; A: Chief EA Architect; R: Business analyst, Service architect, Business executive, Business owner
Candidate Patterns: Process pattern
Prerequisite: Service Portfolio, Strategic goals
Deliverables: Service catalogue (High-level Process), Process Service Level Agreement, Operational Level Agreement
Based on: ITIL Service Design, IBM Service Model and SOMA technique, Thomas Erl's Service Oriented Approach

2.1.3 3rd Row: Business Service Design

Observer: Owner and User
Description: In this level we design the business services as components of the process services.
Critical Questions:
Which of our design services realizing business activities are meaningful for internal stakeholders?
What are the quality requirements and constraints of each business service?
What is the pattern of each business service?
Goals:
Design of new or changed business services aligned with process services.
Provide a holistic view of business design.
Provide business services handling business needs.
Organization:
I: Business executive, Business owner; A: Chief EA Architect; R: Business analyst, Service Analyst, Service Designer, Service architect
Candidate Patterns: Business pattern
Prerequisite: Services Catalogue (Business part), Process SLA, OLA
Deliverables: Detailed Service catalogue (Business part), Business Service Level Agreement
Based on: ITIL Service Design, IBM Service Model and SOMA technique, Thomas Erl's Service Oriented Approach

2.1.4 4th Row: IT Service Design

Observer: IT specialist
Description: In this process we design the basic or complex IT services of each business service. These are the components which are loosely coupled, self-contained and stateless.
Critical Questions:
What are our IT services to realize the business processes?
What are the non-functional requirements of each IT service?
What is the design pattern of each IT service?
Goals:
Design of new or changed IT services aligned with business services.
Provide a holistic view of all aspects of system design.
Provide IT services realizing business needs.
Organization:
I: Business analyst; A: Chief EA Architect; R: System analyst, Service Analyst, Service Designer, IT architect, Application Designer, DB Designer
Candidate Patterns: Design and Architectural patterns
Prerequisite: Service catalogue (Business part)
Deliverables: Service catalogue (IT part), Service Solution Architecture, Quality of Services
Based on: ITIL Service Design, IBM Service Model and SOMA technique, Thomas Erl's Service Oriented Approach

2.2 CEAF Columns

The aspects of the enterprise are specified by seven categories defining the CEAF columns. These categories are: Purpose, Policy, Service, Pattern or Practice, Stakeholder, People and Resource. These aspects are described by the template below.
Description: a brief depiction.
Levels: the different elements of each column, shown at different levels.
Deliverables: a list of deliverables of each column.
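Purely as an illustration of this two-dimensional classification (not part of the paper itself), one could model the CEAF grid as a simple cell structure keyed by row and column; the row and column names below follow the sections of this paper, while the data structure and its usage are a hypothetical sketch.

# Rows: audience perspectives; Columns: aspects of the enterprise.
ROWS = ["Service Strategy", "Process Service Design",
        "Business Service Design", "IT Service Design"]
COLUMNS = ["Purpose", "Policy", "Service", "Pattern or Practice",
           "Stakeholder", "People", "Resource"]

# Each CEAF cell holds the artifacts produced for one (row, column) intersection.
ceaf_grid = {(row, col): [] for row in ROWS for col in COLUMNS}

# Hypothetical usage: record a deliverable in one cell of the framework.
ceaf_grid[("Service Strategy", "Service")].append("Service Portfolio")
print(len(ceaf_grid))   # 4 rows x 7 columns = 28 cells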

2.2.1 First Column: Purpose

Description: In this column the goals which we want to achieve by means of the remaining columns are enumerated. These goals are defined at various levels, ranging from the most abstract depiction of the business to a more detailed and measurable set of objectives.
Levels:
Strategic headlines: the business philosophy, the manner in which services are provided, the governing set of beliefs and values, and a sense of purpose shared by the entire organization, expressed as vision, mission and strategic headlines.
Business goals: realistic translations of the abstract strategic headlines. These goals are like mountain peaks that we want to reach in the long term. Achieving these goals ensures that we accomplish the strategic headlines.
Business objectives: the quality format of the business goals, which provides measurable views of the business goals.
IT targets: quantitative goals declared for each quality objective defined in the previous level.
Deliverable: the hierarchical tree between the different types of goals defined in this cell, based on the Balanced Scorecard.

2.2.2 2nd Column: Policy

Description: This column is about the policies of the organization. A policy is a management expectation, intention and condition used to ensure that consistent and appropriate decisions, designs and developments of goals, responsibilities, resources and processes are created in the end. Policies are about the constraints and quality of the different types of services exposed in the different rows.
Levels:
Strategic policies: Strategic policies describe the governance rules that drive strategic decisions. They should be considered to accomplish the strategic mission through well-understood steps by an agreed date and budget. These policies consider any risk, constraint or limitation affecting the business strategy and the quality of delivered services.
Orchestration policies: Orchestration policies address any constraint that exists for the composition and integration of business services together.
Business policies: These policies specify constraints, standards and business rules regarding the operation of services.
IT policies: IT policy is about the quality of IT services. It covers all types of non-functional requirements, like performance, efficiency, security, availability and reliability, which should be addressed by the service-oriented architecture.
Deliverable: Policy Relationship Map

2.2.3 3rd Column: Service

Description: A Service is a loosely coupled, self-contained and stateless component that interacts with other services to accomplish business goals and deliver value to customers. In this column we design services at three levels.
Levels:
Process services: Process services provide the control capabilities required to manage the flow and interactions of multiple services in ways that implement business processes. These services represent long-term workflows or macro flows of business processes, implemented by an orchestration of basic and complex business services.
Business services: Each business service may participate in different process services. These services contain the business micro-logic and are meaningful from the internal business view of the systems.
IT services: IT services are services which handle the technical view of the system. These types of services include the technology solutions and IT constraints to design services. An IT service may be composite or basic.


2.2.4

Training and skills enhancement was needed for personnel and communication management (3) New roles
and responsibilities should be defined (4) the governance structure must be established.
Deliverables: Organization structure, Chain of Authority and Responsibility.

4th Column:Pattern or Practice

Description: This column specifies how we can


achieve defined goals in first column. In this column a
organization specifies a solution pattern comprising a
set of activities and practices that solve common problems in a given context of business success.
Levels:
Strategic pattern:
These patterns leverage best
practices along with a collection of proven architectures
used in different domains which organization offers its
services in. Best practices include methodologies, techniques, guidelines and strategies.
Process pattern: This pattern defines the model of
orchestration between different services. The process
services must do its responsibility stand alone. Each
orchestration between services helps us to accomplish
some parts of our strategic mission.
Business pattern:
Business pattern includes the
patterns matching business service scenarios. Each
business service scenario is minimally mapped to the
activities it supports, the rules it abides by, the messages it transfers, the data warehouses it retrieves data
from and the information it captures, processes, stores
and accesses.
IT Pattern: In IT level we focus on IT solution to
realize business services. In this level we use technical
and design pattern to cover business scenarios.
Deliverables: Pattern Language

2.2.5  5th Column: Stakeholder

Description: In this column we focus on stakeholder management: identifying the organization's stakeholders, categorizing them, and understanding their needs, expectations, responsibilities, authorities and decision rights. These stakeholders are all of the organizations and external people that affect our business and our organization's activities.
Deliverables: Stakeholder Model

2.2.6  6th Column: People

Description: In this column we define all of the organization's workers and committees that participate in defining the EA. By focusing on people we can clarify: (1) the changes needed in the organization structure and in the chains of responsibility, authority and communication; (2) the training and skills enhancement needed for personnel and communication management; (3) the new roles and responsibilities that should be defined; and (4) the governance structure that must be established.
Deliverables: Organization structure, Chain of Authority and Responsibility.

2.2.7  7th Column: Resource

Description: Through the earlier columns we defined our goals and perspectives, the services (and the level of them) we can offer to our stakeholders, the patterns and ways we must follow to achieve the goals, and the human resources needed to accomplish them. The only thing that remains is the other resources. Raw materials, environmental resources, technical resources and financial resources are some examples of the resources which must be considered in this column.
Deliverable: List of resources

Conclusion and Future Work:

In this paper we introduced a comprehensive framework using services as primitive elements. This classification structure shows a holistic view of any organization from different perspectives through seven aspects (Purpose, Policy, Service, Pattern or Best Practice, Stakeholder, People and Resource). Our future work agenda encompasses using this framework in more case studies and presenting a schema which is suitable for companies of every size.

References
[1] D. Harrison and L. Varveris, TOGAF: Establishing Itself as the Definitive Method for Building Enterprise Architectures in the Commercial World (2004).
[2] D. Minoli, Enterprise Architecture A to Z: Frameworks, Business Process Modelling, SOA, and Infrastructure Technology, Auerbach Publications, 2008.
[3] J. Schekkerman, How to Survive in the Jungle of Enterprise Architecture Frameworks: Creating or Choosing an Enterprise Architecture Framework, Trafford Publishing, 2006.

[4] A. Ayed, M. Rosemann, E. Fielt, and A. Korthaus, Enterprise Architecture and the Integration of Service-Oriented Architecture, PACIS 2011 Proceedings, Brisbane, Australia (2011).
[5] A. Nabiollahi, R. A. Alias, and S. Sahibuddin, A Service Based Framework for Integration of ITIL V3 and Enterprise Architecture, Design (2010), 15.


[6] J. Schekkerman, EA and Services Oriented Enterprise (SOE) / Service Oriented Architecture (SOA) and Service Oriented Computing (SOC) (2008).
[7] M. Ibrahim, Service-Oriented Architecture and Enterprise Architecture (2007).


Thick non-crossing paths in a polygon with one hole

Maryam Tahmasbi

Narges Mirehi

Shahid Beheshti University, G.C.,Tehran, Iran1

Shahid Beheshti University, G.C.,Tehran, Iran

Department of Computer Science

Department of Computer Science

m tahmasi@sbu.ac.ir

n.mirehi@mail.sbu.ac.ir

Abstract:
We consider the problem of finding a large number of disjoint paths for unit disks moving amidst
static obstacles. The problem is motivated by the problem of finding shortest non-crossing paths for
aircraft in air traffic management, in which one must determine the shortest path for any aircraft
that can safely move through a domain while avoiding each other and avoiding no-fly zones and
predicted weather hazards. We compute K shortest paths for aircraft in a domain with one hole, where K − 1 pairs of terminals lie on the boundary of the domain, while for the remaining pair one terminal lies on the boundary of the domain and the other on the boundary of the hole. We present a polynomial-time algorithm for solving the problem.

Keywords: K thick paths; minkowski sum; non-crossing paths; simple polygon with one hole; minsum

Introduction

One of the most studied subjects in computational geometry is the shortest path problem [1],[2]. One of the extensions of the geometric shortest path problem is [3]: given a set of obstacles and a pair of points (s, t), find a shortest s–t path avoiding the obstacles. The non-crossing paths problem is an extension of the shortest path problem: given a set of obstacles and K pairs of points (s_k, t_k), find a collection of K non-crossing s_k–t_k paths such that the paths are optimal according to some criterion. The objective may be either to minimize the sum of the lengths of the paths (the minsum version) or to minimize the length of the longest path (the minmax version). A thick path is the Minkowski sum of a curve and the unit disk. Two thick paths are called non-crossing when they are non-intersecting; the thick paths are allowed to share parts of their boundaries with each other, but the interiors of the paths are disjoint [4]. The problem of finding multiple thick paths (the Thick non-Crossing Paths Problem), which we consider in this paper, is an extension of both the shortest non-crossing paths problem [5] and the shortest thick path problem [6].

Thick path planning in geometric domains is an important computational geometry subject with applications in robotics, VLSI routing, air traffic management (ATM), sensor networks, etc. [3].

The input to the problem is a simple polygonal domain, K pairs of terminals (s_k, t_k) that are the sources and sinks of the paths, and one hole/obstacle. K − 1 pairs of terminals lie on the boundary of the domain, and one of the points of the last pair lies on the boundary of the hole. The goal is to find K thick non-crossing paths in the domain that do not intersect the hole, such that the total length of the paths is minimum.

Corresponding Author: T: (+98) 21 29903004

2  Motivation

We are motivated by an application in ATM; similar problems may arise in other coordinated motion planning problems in transportation engineering, e.g., shipping vessels, robotic material handling machines, etc. The polygon P models an airspace through which the aircraft intend to fly. We assume that the aircraft remain at constant altitude (as is often the case during en route flight), so that we can consider the problem to be in a two-dimensional domain. There is an obstacle within P that corresponds to a no-fly zone arising from special use of airspace (military airspace, noise abatement zone, security zone over a city, etc.). We are interested in determining paths for the aircraft from source to sink that can safely be routed through P with optimal total length, while maintaining safe separation from each other and from the obstacle.

Related work

This problem can be viewed as a variation of the FatEdge Graph Drawing Problem (FEDP) [7],[8], which,
in turn, is an extension of the continuous homotopic
routing problem (CHRP) a classical problem in VLSI
design [9],[10],[11],[12]. A related problem is that of
finding shortest paths homotopic to a given collection
of paths [13],[14],[15]. The novelty of our work lies in
considering the problem in simple polygons and polygonal domains; the previous research concentrated on
point obstacles for the paths. Although only point obstacles are considered in CHRP/FEDP, the existing results on FEDP [7],[8] are more general than our result
in some other aspects: the general FEDP receives as
input, an embedding of an arbitrary planar graph and
finds a drawing with the edges of maximum thickness;
we do not answer the question of finding the maximum
separation between the paths. Some heuristics for finding thick non-crossing paths in polygonal domains are
suggested in the VLSI literature [16], but neither complexity analysis nor performance guarantees are given
there. A very restricted version is considered in [17].
In a rectilinear environment, fast algorithms are known
for some special cases of the minsum version [18],[19].
A related problem is considered in [6], where all pairs (s_k, t_k) lie on the boundary of the polygon and the sources/sinks are not allowed to lie on the boundary of the holes. We extend the work in [6] to the case where one of the sinks/sources lies on the boundary of the hole, and compute the K shortest paths in linear time.

Preliminaries

We begin with a formal statement of our problem and a review of some relevant notions and results from previous works [20],[6].

The input is a polygonal domain specified by an outer (simple) polygon P and a hole in it. Let n denote the number of vertices on the boundary of the domain, and let h denote the number of holes, which here equals one. We denote the boundary of P by bdP and the boundary of the hole by Q.

A (thin) path is a simple (non-self-intersecting) source–sink curve in the domain. For w ≥ 0 and S ⊆ R^2, let ⟨S⟩(w) denote the Minkowski sum of S and the disk of radius w centered at the origin. A w-thick path Π is the Minkowski sum of a path π and the disk of radius w: Π = ⟨π⟩(w). The path π is called the reference path for the thick path Π. A 1-thick path is called just a thick path. In this paper we will also use "polygon" to refer to a set whose boundary consists of straight line segments and circular arcs; the complexity of such a polygon is the number of its boundary segments and arcs.

Let P−1 = P \ ⟨bdP⟩(1) be the 1-unit offset of P inside it. We assume that P−1 is still a simple polygon. Let ST = {(s_k, t_k), k = 1, ..., K} be the set of K pairs of points on the boundary of P−1 and the boundary of Q1 (one point lies on the boundary of Q1), and let π_k be an s_k–t_k path within P−1; we call s_k the start and t_k the destination of the kth path. Let Π_k be the thick path within P with π_k as the reference path, i.e. Π_k = ⟨π_k⟩(1). Thick paths Π_1, ..., Π_K are called non-crossing if the interiors of Π_i and Π_j are disjoint for all i ≠ j in {1, ..., K}; we allow the thick paths to share parts of their boundaries with each other. We require that for k = 1, ..., K the s_k–t_k path in the collection is as short as possible given the existence of the other paths. We also assume that the problem instance is feasible, which means that the polygon is wide enough to accommodate the thick paths. The approach of [5] to the problem was as follows.

First, the boundary of the polygon is mapped to the unit circle bdC and the terminals are identified with their images. Then, a chord s_k t_k is drawn between the terminals in every pair (s_k, t_k), k = 1, ..., K − 1 (ignoring the pair whose t_i lies on the hole). If two of the chords cross, then the problem instance is infeasible. Otherwise, the tree of slices T_sl on C, formed by sl(t_1, s_1) together with the slices sl(s_k, t_k) for k = 1, ..., K − 1, is built, in which the root is the whole circle C, the root's immediate children are sl(s_1, t_1) and sl(t_1, s_1), and the parent–child relation is defined by containment of the slices (see Fig. 2; ignore the shaded disk for now; see also [5] for details).

Figure 2: From left to right: an instance of the problem; the mapping of P and the terminals to the unit circle; the tree of slices T_sl.

Let ST_ord = {v_1, ..., v_{2K−1}} be the set {s_1, ..., s_K, t_1, ..., t_K | t_i ≠ t_l} ordered clockwise around bdP−1. Similar to [4], we define the kth depth d_k(u, v). Let v, u be two consecutive points in ST_ord, and consider paths within C from a point on bdC(v, u) to a point on the chord s_k t_k. The kth depth of P−1(v, u), denoted by d_k(v, u), is defined as the minimum, over all such paths, of the number of (other) chords that the path crosses. Let O_k be the set of obstacles obtained by inflating each part of bdP−1 by 2 times its kth depth (arithmetic modulo 2K − 1 is assumed in the indices):

O_k = ∪_{j=1..2K−1} ⟨P−1(v_j, v_{j+1})⟩(2 d_k(v_j, v_{j+1}))

O_k can be found in O(n + K) time [6] by adapting the algorithm for computing the medial axis of a simple polygon in linear time [21].

When there is one sink that lies on the hole boundary

We start by recollecting and extending known results on finding multiple thin non-crossing shortest paths [5]. We consider the case of a simple polygon with one hole where there is a pair (s_i, t_i) with s_i ∈ P and t_i ∈ Q and the other terminals lie on the boundary of P, as shown in Figure 1. In this setting there exists a linear-time algorithm for finding K shortest thin paths [5]; we want to extend this result and find K thick non-crossing optimal paths using the results in [6].

Figure 1: The simple polygon with one hole and the pairs (s_k, t_k).

Let us assume that there is a pair (s_l, t_l) with s_l ∈ P and t_l ∈ Q. We can have two paths from s_l to t_l, one passing above the hole and the other below it (Fig. 3). Once a path between s_l and t_l is routed, there is only one way to route the rest of the pairs in a non-crossing fashion. Furthermore, this path can be routed in two ways, above or below the hole Q; we denote these paths by π_a(s_l, t_l) and π_b(s_l, t_l), respectively. Thus we can compute π_a(s_l, t_l) and π_b(s_l, t_l), solve the problem separately for each case, and choose the solution with minimum total length. In the following, we concentrate on determining non-crossing paths assuming that π_a(s_l, t_l) has been routed; the other case can be treated in the same way. We first consider π_a(s_l, t_l), then we solve the problem for π_b(s_l, t_l) separately.

Figure 3: The paths π_b(s_l, t_l) (left) and π_a(s_l, t_l) (right).


5.1  Algorithm

Data: The simple polygon P with one hole and K pairs (s_i, t_i), where there is one pair (s_i, t_i) with s_i ∈ P and t_i ∈ Q and the other terminals lie on the boundary of P
Result: K thick non-crossing optimal paths
begin
1. First build O_l; the free space for π_l is a splinegon and thus routing π_a(s_l, t_l) inside O_l can be done in linear time [22].
2. Inflate π_a(s_l, t_l) by 2 units.
3. Define the free space that remains after removing the inflated path from P; it is a simple polygon or several connected simple polygons, and its boundary is included in the polygon (Fig. 4).
4. For all i = 1, ..., K, i ≠ l, find the remaining K − 1 thick paths in a simple polygon as in [6] in O(n + K) time: for every path π_k, first build O_k and then route π_k inside O_k in the free space.
end

Similarly we solve the problem again, starting with π_b(s_l, t_l); then we compare the total lengths of the two results and choose the smaller one.

Figure 4 shows a polygon with one hole and a path from s_l to t_l that passes above the hole. On the right, the path is inflated and a simple polygonal free space is created.

Figure 4: Shortest thick path from s_l to t_l and the resulting simple polygonal free space.
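The geometric subroutines used in the steps above (splinegon routing, Minkowski inflation) are beyond a short listing, but the final minsum selection between the two candidate routings can be illustrated with a small, runnable Python sketch; the coordinates below are made up for illustration and only the "compute both totals and keep the smaller" step of the method is shown.

from math import dist

def path_length(path):
    # Length of a polygonal (reference) path given as a list of (x, y) points.
    return sum(dist(p, q) for p, q in zip(path, path[1:]))

def choose_routing(solution_above, solution_below):
    # Given the two candidate sets of reference paths (the pair routed above
    # the hole vs. below it, each followed by the remaining K-1 paths),
    # return the set with minimum total length, as in the last step above.
    total_a = sum(path_length(p) for p in solution_above)
    total_b = sum(path_length(p) for p in solution_below)
    return solution_above if total_a <= total_b else solution_below

# Toy example: two alternative routings of the same two terminal pairs.
above = [[(0, 0), (2, 2), (6, 2), (8, 0)], [(0, 1), (8, 1)]]
below = [[(0, 0), (2, -3), (6, -3), (8, 0)], [(0, 1), (8, 1)]]
print(choose_routing(above, below) is above)   # True: the upper routing is shorter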

5.2  Running time

The time complexity of finding the K thick paths is K(n + K). Since we must compute the K thick paths twice, the exact running time is 2K(n + K).

Conclusions and open problems

Our result can be extended to the case where there is more than one sink that lies on the hole boundary. We leave open the problem of finding K thick non-crossing paths in a polygonal domain with more than one hole, where sources and sinks can lie on the hole boundaries. We conjecture that our approach can be extended to higher dimensions and to other shapes of the moving objects, as long as the motion is purely translational.

References
[1] J. Erickson and A. Nayyeri, Shortest non-crossing walks in the plane, Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2011), 125–128.
[2] E. M. Arkin, J. S. B. Mitchell, and V. Polishchuk, Maximum thick paths in static and dynamic environments, Comput. Geom. 43(3) (2010), 279–294.
[3] J. S. B. Mitchell, Geometric shortest paths and network optimization, Handbook of Computational Geometry, J. Sack and G. Urrutia, editors, Elsevier Science B.V. North-Holland, Amsterdam, pages 633–701, 2000.
[4] V. Polishchuk, Non-crossing paths and minimum-cost continuous flows in geometric domains, Ph.D. thesis, Stony Brook University, available at http://cs.helsinki.fi/valentin.polishchuk/pages/thesis.pdf, August 2007.
[5] E. Papadopoulou, k-pairs non-crossing shortest paths in a simple polygon, Int. J. Comp. Geom. Appl. 9(6) (1999), 533–552.
[6] J. S. B. Mitchell and V. Polishchuk, Thick non-crossing paths and minimum-cost flows in polygonal domains, 23rd ACM Symposium on Computational Geometry (2007), 56–65.
[7] C. A. Duncan, A. Efrat, S. G. Kobourov, and C. Wenk, Drawing with fat edges, in GD'01, Revised Papers from the 9th International Symposium on Graph Drawing, London, UK, Springer-Verlag (2002), 162–177.
[8] A. Efrat, S. Kobourov, M. Stepp, and C. Wenk, Growing fat graphs, in SCG '02: Proceedings of the Eighteenth Annual Symposium on Computational Geometry, New York, NY, USA, ACM Press (2002), 277–278.

[9] R. Cole and A. Siegel, River routing every which way, but
loose, Proc. 25th Annu. IEEE Sympos. Found. Comput. Sci
(1984), 6573.
[10] S. Gao, M. Jerrum, M. Kaufman, K. Mehlhorn, and W. R
u
lling, On continuous homotopic one layer routing, SCG88:
Proc. of the fourth annual symposium on Computational geometry, New York, NY, USA, ACM Press (1988), 392-402.
[11] C. E. Leiserson and F. M. Maley, Algorithms for routing
and testing routability of planar VLSI layouts, Proc. 17th
Annu. ACM Sympos. Theory Comput. (1985), 6978.
[12] F. M. Maley, Single -Layer Wire Routing and Compaction,
MIT Press, Cambridge, MA (1990).
[13] S. Bespamyatnikh, Computing homotopic shortest paths in
the plane, J. Algorithms 49(2) (2003), 284303.


[14] A. Efrat, S.G. Kobourov, and A. Lubiw, Computing homotopic shortest paths efficiently, In Proceedings of the 10th
Annual European Symposiumon Algorithms London, UK,
Springer-Verlag (2002), 411423.
[15] T. Dayan, Rubber-Band Based Topological Router, P.h.D
thesis, UC Santa Cruz, 1997.
[16] C. P. Hsu, General river routing algorithm, Proc. of the
twentieth design automation conference on Design automation (1983), 578583.
[17] A. Aggarwal, M. M. Klawe, S. Moran, P. W. Shor, and R.
Wilber, Geometric applications of a matrix searching algorithm, In Proc. 2nd Annu.ACM Sympos. Comput. Geom.
(1986), 285292.
[18] Y. Kusakari, H. Suzuki, and T. Nishizeki, Finding a shortest pair of paths on the plane with obstacles and crossing areas, in J. S. et al., editors, Algorithms and Computation (1995), 42–51.
[19] J. Takahashi, H. Suzuki, and T. Nishizeki, Finding shortest non-crossing rectilinear paths in plane regions, ISAAC
(1993), 98107.
[20] J.S. B. Mitchell, maximum flows in polyhedral domains, J.
Comput. Syst. Sci. 40 (1990), 88123.
[21] F. Chin, J. Snoeyink, and C. A . Wang, Finding the medial
axis of a simple polygon in linear time, Discrete Comput.
Geom. 21(3) (1999), 405420.
[22] E. A. Souvaine and D. L. Delage, Convex Hull and Voronoi
Diagram of Additively Weighted Points, Proc. 6th Annu.
ACM Sympos. Comput. Geom. (1990), 350359.

A Note on the 3-Sum Problem


Keivan Borna

Zahra Jalalian

Faculty of Mathematical Sicenes and Computer

Faculty of Engineering

Kharazmi University

Kharazmi University

borna@tmu.ac.ir

jalalian@tmu.ac.ir

Abstract: The 3-Sum problem for a given set S of integers asks to find all three-tuples (a, b, c) for which a + b + c = 0. In computational geometry many other problems, such as motion planning, relate to this problem. The complexity of existing algorithms for solving 3-Sum is O(n^2) or a fraction of it. The aim of this paper is to provide a linear hash function and present a fast algorithm that finds all suitable three-tuples in one iteration over S. We also improve the performance of our algorithm by using index tables and dividing S into negative and non-negative parts.

Keywords: 3-Sum; Computational Complexity; Linear Hash Function; Motion Planning.

Introduction

The 3-Sum problem for a given set S of n integers asks


whether there exists a three-tuple of elements from S
that sum up to zero. A problem P is 3-Sum-hard if
every instance of 3-Sum of size n can be solved using
a constant number of instances of P each of O(n) size
and o(n2 ) additional time. One can think of a 3-SUMhard problem in many interesting situations including
incidence problems, separator problems, covering problems and motion planning. Obviously by testing all
three-tuples this problem can be solved in O(n3 ) time.
Furthermore if the elements of S are sorted then we
can use Algorithm 1 with O(n2 ) complexity. It is interesting to mention that Algorithm 1 is essentially the
best algorithm known for 3-Sum and it is believed that
the problem cannot be solved in sub-quadratic time,
but so far this has been proven in some very restricted
models of computation only, such as the linear decision tree model. In fact Erickson [4, 5] proved an Ω(n^2) lower bound in the restricted linear decision tree model.
This model is based on the transpose of the transformation presented in [3] that maps each point (a, b) to the
line y = ax + b and vice-versa. However, the problem
remained unsolved in general for other computational
models.

Data: A sorted array S of n integers
Result: All three-tuples (a, b, c) with a + b + c = 0.
for i = 0, ..., n - 3 do
    j = i + 1; k = n - 1;
    while k > j do
        if S[i] + S[j] + S[k] = 0 then
            print S[i], S[j], S[k];
            j = j + 1;
            k = n - 1;
        end
        if S[i] + S[j] + S[k] > 0 then
            k = k - 1;
        else
            j = j + 1;
        end
    end
end
Algorithm 1: An O(n^2) algorithm for finding all solutions of the 3-Sum problem
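For readers who prefer running code, the following Python sketch is one standard rendering of the same two-pointer scan; it advances both pointers after a reported triple instead of resetting k, which enumerates the same solutions.

def three_sum_sorted(S):
    # All index-distinct triples from a sorted list S with a + b + c == 0,
    # found with the classical O(n^2) two-pointer scan of Algorithm 1.
    n = len(S)
    out = []
    for i in range(n - 2):
        j, k = i + 1, n - 1
        while j < k:
            s = S[i] + S[j] + S[k]
            if s == 0:
                out.append((S[i], S[j], S[k]))
                j += 1
                k -= 1
            elif s > 0:
                k -= 1
            else:
                j += 1
    return out

print(three_sum_sorted([-25, -10, -7, -3, 2, 4, 8, 10]))
# [(-10, 2, 8), (-7, -3, 10)]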
In [1] the authors presented a subquadratic algorithms for 3-Sum. More precisely on a standard word
RAM with -bit words, they obtained a running time
of O(n2 / max{lg 2 n/(lglgn)2 , /lg 2 } + sort(n)). This
method is based on using an almost linear map h that
was already introduced in [2]. In fact for a random odd
integer a on bits, the hash function h maps each x

Corresponding Author, P. O. Box 45195-1159, F: (+98) 26 3455-0899, T: (+98) 26 3457-9600


to the first s = lg bits of a x. In the second section


of [1] for a given , the authors found suitable a, b for
which = a + b. In fact they proved that if = a + b
then h() = h(a) h(b) {0, 1} and if 6= a + b
then h(a) h(b) h() {0, 1, 2} is only true
with small probability, i.e., if h(a) h(b) h()
/
{0, 1, 2} then there is no (a, b) for which = a + b.
Notice that the operator is modulo 2s and we have
h(a) h(b) h(c) h(a + b + c) {0, 1, 2}. Furthermore since multiplication in a is linear, removing the
first s bits of the multiplication makes h to be
non-linear. The reason to use the factor {0, 1, 2} is to
make this map linear. We refer the interested reader to
see [2, 6] for more information about this hash function
which is known as Universal hashing.

In this paper we apply a linear hash function h that uses only one subtraction operation. More precisely, for a sorted array S of length n we define h via h(i) = S[i] − S[1]. Now we construct a new array R of length S[n − 2] − S[1] + 1 for which the relation R[h(i)] = S[i] between the indices and the values of its elements is established. In fact R indicates a set that is created by h, and knowing the value h(i) one can obtain S[i] as S[i] = h(i) + S[1]. Now, applying our algorithm, in only one iteration over S one can find all three-tuples (a, b, c) for which a + b + c = 0.

The organization of this paper is as follows. In Section 2 our proposed algorithm and its complexity analysis are presented. In Section 3 the performance of our algorithm is improved by using index tables instead of arrays and by dividing S into two specific parts. Finally, Section 4 is devoted to some conclusions and future works.

2  Our Algorithm

In this section we present our proposed algorithm. For the ease of the reader, more details about this algorithm will be given through an example. Let S be a sorted array of integers of length n. We first define a hash function h with h(i) = S[i] − S[1]. Then we construct a sorted array R of length sizeR = S[n − 2] − S[1] + 1 initialized with S[0] − 1. Then we allocate the members of S in R via h and the formula R[h(i)] = S[i]. Now let indexa = 0, indexc = n − 1, a = S[indexa], c = S[indexc].

Repeat the following commands while indexa < n − 3:

1. Let indexb = −(a + c) − S[1]. In fact indexb represents the index of b in the array R for which a + b + c = 0.

2. If indexb ≥ 0, indexb < sizeR and R[indexb] ≠ S[0] − 1, then let b = R[indexb]. Now if a + b + c = 0 and b > a and b < c, then we have found a suitable three-tuple. In order to find the other solutions, let indexc = indexc − 1 and do the next repetition of the loop.

3. If (a + c < S[0] and a + c > S[n − 1]) or a ≥ c, then there is no suitable value for b for which a + b + c = 0, and so a must be changed to larger values in S. Thus, in order to find a solution, let indexa = indexa + 1, a = S[indexa], indexc = n − 1 and c = S[indexc], and do the next repetition of the loop.

4. Else if a < c, then c is too large and so c must be changed to smaller values in S. That is, let indexc = indexc − 1 and c = S[indexc].

In the following, Algorithm 2 computes all suitable three-tuples for the 3-Sum problem:

Data: A sorted set S of n integers
Result: All (a, b, c) for which a + b + c = 0.
indexa = 0, indexc = n-1;
a = S[indexa], c = S[indexc];
sizeR = S[n-2] - S[1] + 1;
while indexa < n - 3 do
    indexb = -(a+c) - S[1];
    if 0 ≤ indexb < sizeR and R[indexb] ≠ S[0]-1 then
        b = R[indexb];
        if a + b + c = 0 and c > b > a then
            print a, b, c;
        end
        indexc = indexc - 1;
        c = S[indexc];
    end
    if S[n-1] < a + c < S[0] or a ≥ c then
        indexa = indexa + 1;
        a = S[indexa];
        indexc = n-1;
        c = S[indexc];
    end
    else if a < c then
        indexc = indexc - 1;
        c = S[indexc];
    end
end
Algorithm 2: Our algorithm for finding all solutions for the 3-Sum problem


As an example, if S is an array with the 8 elements −25, −10, −7, −3, 2, 4, 8, 10, then R has 19 elements and the index of each element of it will be computed via S[i] − S[1]. Thus R[0] = −10, R[3] = −7, ..., R[18] = 8, and the other cells will be filled with S[0] − 1 = −26. Hence R = −10, −26, −26, −7, −26, ..., −26, 8.
Now let a (and c) be the first (and last) element of S. That is, a = S[0] = −25, c = S[n − 1] = 10. Let j = −(a + c) − S[1] = −(−25 + 10) − (−10) = 25, and since j is not in the range of indices of R, for this a we cannot find b, c. But since a + c = −15 < 0, we set i = i + 1 = 1, a = S[i] = −10. Then the new value for j is j = −(a + c) − S[1] = −(−10 + 10) − (−10) = 10. Since R[10] = −26, no value for b for which a + b + c = 0 is found. On the other hand, since a + c = 0, we should seek a smaller value of c. Thus let l = l − 1 = 6, c = S[l] = 8. Since j = −(a + c) − S[1] = −(−10 + 8) − (−10) = 12, b = R[12] = 2 and a + b + c = −10 + 2 + 8 = 0; thus we have found a suitable three-tuple. Our algorithm reports the other solution as −7 + (−3) + 10 = 0.
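A minimal runnable Python sketch of the table construction and lookup used above (the sentinel, the offset by S[1], and the query for b = −(a + c)); the function names are ours, and the printed values can be checked against the worked example.

def build_table(S):
    # Direct-address table R for the sorted array S: value v is stored at
    # index v - S[1]; untouched cells keep the sentinel S[0] - 1 (cf. Section 2).
    sentinel = S[0] - 1
    size = S[-2] - S[1] + 1
    R = [sentinel] * size
    for v in S:
        idx = v - S[1]
        if 0 <= idx < size:
            R[idx] = v
    return R, sentinel

def lookup_b(R, sentinel, S, a, c):
    # Return b = -(a + c) if it is present in S (via the table), else None.
    idx = -(a + c) - S[1]
    if 0 <= idx < len(R) and R[idx] != sentinel:
        return R[idx]
    return None

S = [-25, -10, -7, -3, 2, 4, 8, 10]
R, sentinel = build_table(S)
print(len(R), lookup_b(R, sentinel, S, -10, 8), lookup_b(R, sentinel, S, -10, 10))
# 19 2 None   -- matching the worked example above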

One can see that this algorithm finds all the three-tuples a, b, c for which a + b + c = 0 in one iteration over the given array. Furthermore, in this example one can note that if in each loop three conditional statements are going to run (in the middle case), then we obtain 24 statements in total, whereas using Algorithm 1 this amount will be 35. One of the advantages of our algorithm is that collisions in our hash function are impossible; this is because the elements of the set S, and thus of R, do not repeat.

3  Improving the performance of our Algorithm

Two possible limitations of our algorithm are the use of extra memory and the number of comparisons. In this section we provide solutions to overcome these two limitations.

3.1  Using a data file plus an index table instead of an array

In this subsection we improve the performance of our algorithm when the number of elements of S is very large, using a bitmap. In fact, when the size of the array S is large and it would not fit in the main memory, we can use a file located in the auxiliary memory and an index table which is in the main memory. The data in the file are put in blocks, and the index table shows the largest value of the data in each block. In this way we can quickly access the address of the data and check if there is any b for a + c such that a + b + c = 0.

The bit array R is a word in RAM consisting of m := S[n − 2] − S[1] + 1 bits. We initialize all bits of R with zero. In order to map the elements of S into R, we consider R as a word with at least m bits in RAM. Then each bit indicates a number in S: if this bit is one (zero), then we deduce that the number exists (does not exist) in S. Then for the members of S we use the following formulas:

δ = −S[1], h(S[i]) = S[i] + δ, R[h(S[i])] = 1.

As an example, let S = {−25, −10, −7, −3, 2, 4, 8, 10}. If a = −10, c = 8, then b should satisfy b = −(−10 + 8) = 2. Therefore, using the bitmap we refer to the bit number b + δ = 2 + 10 = 12, and if this bit is one we conclude that b exists in S, and henceforth we have found a suitable three-tuple in S.

3.2  Dividing S into two parts

When the number of elements of R is very large and we cannot store S in the main memory, another useful approach can be applied. As a matter of fact, we can put the array R into a file and use two index tables for choosing the values of a and c. Note that in order to have a + b + c = 0, at least one of a, b, c should be negative. Thus we can divide S into two subsets S1 and S2 for choosing negative and non-negative values and store them in the main memory. The following algorithm computes all suitable three-tuples for the 3-Sum problem very quickly. Let midS denote the index of the first non-negative element of S. Let a ∈ S1, c ∈ S2; then using the relation indexb = −(a + c) − S[1] we can decide whether b exists in the file or not. If −(a + c) < S[1] then we have to move a to the next element of S1, or if −(a + c) > S[n − 1] then there is no suitable b for the current values of a, c and we have to take other elements from S1 and S2. As an example, if S = {−25, −10, −7, −3, 2, 4, 8, 10}, then the two index tables are S1 = {−25, −10, −7, −3} and S2 = {2, 4, 8, 10}. Our algorithm proceeds and reports the following two solutions: −10 + 2 + 8 = 0 = −7 + (−3) + 10. In the following we present the improved version of our algorithm for finding all solutions for the 3-Sum problem. Note that since the sets for choosing the values of a and c become smaller, the number of comparisons and thus the running time of our algorithm will decrease.


Data: A sorted set S of n integers
Result: All (a, b, c) for which a + b + c = 0.
indexa = 0, indexc = n-1;
a = S[indexa], c = S[indexc];
sizeR = S[n-2] - S[1] + 1;
while indexa < midS do
    indexb = -(a+c) - S[1];
    if 0 ≤ indexb < sizeR and R[indexb] ≠ S[0]-1 then
        b = R[indexb];
        if a + b + c = 0 and c > b > a then
            print a, b, c;
        end
        indexc = indexc - 1;
        if indexc < midS then
            indexa = indexa + 1;
            a = S[indexa];
            indexc = n-1;
        end
        c = S[indexc];
        continue;
    end
    if S[n-1] < a + c < S[0] or a ≥ c then
        indexa = indexa + 1;
        a = S[indexa];
        indexc = n-1;
        c = S[indexc];
    end
    else if a < c then
        indexc = indexc - 1;
        if indexc < midS then
            indexa = indexa + 1;
            a = S[indexa];
            indexc = n-1;
        end
        c = S[indexc];
    end
end
Algorithm 3: Our improved algorithm for finding all solutions for the 3-Sum problem
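The following runnable Python sketch captures the idea of Section 3.2 (pair a negative a with a non-negative c and test b = −(a + c) by an O(1) membership query) rather than reproducing Algorithm 3's exact pointer bookkeeping; the ordering constraint a < b < c removes duplicate reports.

def three_sum_split(S):
    # 3-Sum via the split of Section 3.2: a runs over the negative part S1,
    # c over the non-negative part S2, and b = -(a + c) is tested by an O(1)
    # membership query.
    S = sorted(S)
    members = set(S)
    S1 = [x for x in S if x < 0]
    S2 = [x for x in S if x >= 0]
    out = []
    for a in S1:
        for c in S2:
            b = -(a + c)
            if b in members and a < b < c:
                out.append((a, b, c))
    return sorted(out)

print(three_sum_split([-25, -10, -7, -3, 2, 4, 8, 10]))
# [(-10, 2, 8), (-7, -3, 10)]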
We conclude our discussions in Sections 2 and 3 in
the following theorem.
Theorem 1: Algorithm 3 computes all solutions for
the 3-Sum problem faster than Algorithms 1 and 2.
Finally we compare our Algorithms 2 and 3 with Algorithm 1. We generated 100 sets, each of size 20, of random integers in the range [−50, 50] and counted the number of operations that each algorithm performs. In Figures 1 and 2 (the outputs of a Java program) the leftmost, the middle and the rightmost histograms count the number of operations that each of Algorithms 1, 2 and 3 performs.

Figure 1: The histogram of amount of operations that


Algorithms 1, 2 and 3 are doing.

In Figure 2 we draw the histograms for the average


amount of operations that each algorithm are doing.

Figure 2: The histogram of the average number of operations that Algorithms 1, 2 and 3 perform in 100 tests.

Discussion and Future Works

In this paper a fast and optimized algorithm for finding all solutions for the 3-Sum problem is presented. Further work on generalizing this algorithm to rational and complex numbers is in progress.

References
[1] I. Baran, E. D. Demaine, and M. Patrascu, Subquadratic algorithm for 3-SUM: Proc. 9th Worksh. Algorithms & Data
Structures, Springer, Berlin/Heidelberg 3668/2005 (2005),
409421.
[2] M. Dietzfelbinger, Universal hashing and k-wise independent
random variables via integer arithmetic without primes: Lecture Notes in Computer Science, Proc. 13th Symposium on
Theoretical Aspects of Computer Sceince (1996), 569580.
[3] H. Edlesbrunner, J. ORourke, and R. Seidel, Constructing
arrangements of lines and hyperplanes with applications:
Lecture Notes in Computer Science, SIAM. J. Comput. 15
(1986), 341363.
[4] J. Erickson, Lower bounds for fundamental geometric problems, PhD thesis, University of California at Berkeley, 1996.


[5] J Erickson, Lower Bounds for Linear Satisfiability Problem:


Chicago Journal of Theoretical Computer Science 8 (1999).


[6] M. N. Wegman and J. L. Carter, New classes and applications of hash functions, Proc. 20th IEEE FOCS (1979),
175182.

Voronoi Diagrams and Inversion Geometry


Zahra Nilforoushan

Abolghasem Laleh

Department of Computer Engineering

Department of Mathematics, Faculty of Science

Kharazmi University, Tehran, Iran

Alzahra University, Tehran, Iran

shadi.nilforoushan@gmail.com

aglaleh@alzahra.ac.ir

Ali Mohades

Faculty of Mathematics and Computer Science


Amirkabir University of Technology, Tehran, Iran
mohades@aut.ac.ir

Abstract: Voronoi diagrams have proven to be useful structures in various fields and are one of
the most fundamental concepts in computational geometry. Although Voronoi diagrams in the plane have been studied extensively, using different notions of sites and metrics, little is known for other geometric spaces. In this paper we are interested in the Voronoi diagram of a set of sites inside a given inversion circle. We study various cases which show some differences between Voronoi diagrams in Euclidean and in inversion geometry. Finally, a special partition of the inversion circle is given, which is proven to be the Voronoi diagram of the inverted points in the inversion circle.

Keywords: Inversion circle,Stereographic Projection, Voronoi diagrams.

Introduction


Given a set of sites and a distance function from a point


to a site, a Voronoi diagram can be roughly described
as the partition of the space into cells that are the locus
of points closer to a given site than any other sites.
Voronoi diagrams belong to the computational geometers favorite structures. They arise in nature and
have applications in many fields of science [4]. Excellent surveys on the background, construction and
applications of Voronoi diagrams can be found in Aurenhammers survey [2] or the book by Okabe, Boots,
Sugihara and Chiu [15]. Naturally the first type of
Voronoi diagrams being considered was the one for
point sites and the Euclidean metric. Subsequent studies considered extended sites such as segments, lines,
curved objects, convex objects, semi-algebraic sets and
various distances like L1, L∞, the hyperbolic metric or any distance defined by a convex polytope as unit ball; see [1, 3, 5, 7–13] for more details.

Consider a circle with center O and radius r. If


point P is not at O, the inverse of P with respect to the circle C is the point P′ lying on the ray OP such that (OP)(OP′) = r². The circle is called the circle of inversion, and point O is the center of inversion (see Figure 1).

Figure 1: The inversion circle.

An inversion effectively turns the circle inside out.

Corresponding Author, Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran, T: (+98) 26 34550002.
Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran.


Every point inside goes outside, every point outside


goes inside, and all the points on the circle itself stay
put. The only thing unaccounted for is the center of
the circle. Let us say that it goes to a point at infinity [6].
The key properties of inversions are that circles map
to either circles and lines and that inversion preserves
the size of angles. Thus, inversions will prove to be
most useful whenever we are dealing with circles and
there are many interesting applications of inversion. In
particular there is a surprising connection to the Circle
of Apollonius. There are also interesting connections to
the mechanical linkages, which are devices that convert
circular motion to linear motion. Finally, as suggested
by the properties of inversion, there is a connection
between inversion and isometries of the Poincaré disk.
In particular, inversion will give us a way to construct
hyperbolic reflections in h-lines.

Definition 1. As to the inversion relation (OP)(OP′) = r² of Figure 1, for a given line l : ax + by + c = 0 with c ≠ 0, we define the left and the right side of l as follows: the right side of l is the side whose points have a distance from the center of inversion greater than the distance of l's points from the center of inversion. We denote this region by lR. The other side of l is the left side of l, and we denote it by lL.
Theorem 1: If the given line l : ax + by + c = 0 with c ≠ 0 does not have any contact with the inversion circle C, then the image of the right side of l under the inversion map t relative to the inversion circle C will be inside of the circle t(l).
Proof: Without loss of generality, let C be the unit circle. Thus by Lemma 1, there is some t : C \ {0} → C such that t(z) = 1/z̄. If c < 0, then lR = {(x, y) : ax + by + c > 0} (see Figure 2 for an example).

Voronoi diagrams have nice properties which motivated us to study if they will be preserved in other
spaces. In this paper, we study the case in inversion
geometry specially in a given inversion circle.

Some properties of inversion


geometry
Figure 2: The figure of Theorem 1.

There are some facts about the inversion:


1: The inverse of a circle ( not through the center of
inversion ) is a circle.
2: The inverse of a circle through the center of inversion is a line.
3: The inverse of a line ( not through the center of
inversion ) is a circle through the center of inversion.
4: A circle orthogonal to the circle of inversion is its
own inverse.
5: A line through the center of inversion is its own inverse.
6: Angles are preserved in inversion.
Lemma 1: An inversion relative to the unit circle is defined by:

t : C \ {(0, 0)} → C  s.t.  t(z) = 1/z̄.

Proof: See [16]. □
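As a quick illustration of the definitions above, the following Python fragment computes the inverse of a point both from the defining relation (OP)(OP′) = r² and, for the unit circle, via the complex form t(z) = 1/z̄ of Lemma 1.

def invert(P, O=(0.0, 0.0), r=1.0):
    # Inverse of point P with respect to the circle with center O and radius r:
    # P' lies on the ray OP with |OP| * |OP'| = r**2 (P must not coincide with O).
    dx, dy = P[0] - O[0], P[1] - O[1]
    d2 = dx * dx + dy * dy
    s = r * r / d2
    return (O[0] + s * dx, O[1] + s * dy)

# Unit circle: (0.5, 0) has |OP| = 0.5, so its inverse lies at distance 2.
print(invert((0.5, 0.0)))           # (2.0, 0.0)
# The same map written as t(z) = 1 / conj(z) on complex numbers:
z = complex(0.5, 0.0)
print(1 / z.conjugate())            # (2+0j)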


Lemma 2: Let A and B be two given regions and let t be the inversion map obtained from Lemma 1; then t(A ∪ B) = t(A) ∪ t(B).
Proof: Note that t is a 1-1 map, and this finishes the proof. □

Note that t(lR) = t({(x, y) : x² + y² < −(ax + by)/c}). Obviously t(lR) is the inside of the circle t(l). If c > 0, then lR = {(x, y) | ax + by + c < 0} and hence t(lR) = t({(x, y) | x² + y² < −(ax + by)/c}). This completes the proof. □
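A small numeric check of the computation in the proof: for a line that misses the unit circle, the inverted sample points all lie on the circle x² + y² + (ax + by)/c = 0, which is the boundary of the set derived above. The particular line below is chosen arbitrarily for the test.

from math import hypot

def invert_unit(x, y):
    # Inversion in the unit circle, t(z) = 1/conj(z), in coordinates.
    d2 = x * x + y * y
    return x / d2, y / d2

# Line l: x + 2y - 4 = 0, i.e. (a, b, c) = (1, 2, -4); it misses the unit circle.
a, b, c = 1.0, 2.0, -4.0
center = (-a / (2 * c), -b / (2 * c))          # center of the circle t(l)
radius = hypot(a, b) / (2 * abs(c))            # radius of the circle t(l)
for x in (-2.0, 0.0, 2.0, 10.0):               # sample points of l
    y = (4.0 - x) / 2.0
    u, v = invert_unit(x, y)
    print(round(hypot(u - center[0], v - center[1]) - radius, 12))  # ~0 each time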

The proof of Theorem 1 enable us to deduce that


lL is mapped into the outside of the circle t(l) in the
inversion circle.
Theorem 2: Let the given line l : ax + by + c = 0 with c ≠ 0 intersect the inversion circle C. Then under the inversion relative to the circle C (if the situation is as in Figure 3) we have the following:
1. The region with light gray color of lR will be mapped
into the region with the same color inside of the circle
t(l) and vise versa.
2. The region with the dark gray color of lR which is outside of the circle C will be mapped into the region with the same color inside of the circle C, and vice versa.
3. The region with the white color of lL which is inside the
circle C, will be mapped into the region with the same


color outside of the circle C, and vice versa.

Proof: By using Lemma 2 and Theorem 1, and applying the fact that an inversion turns the inside of the circle out and vice versa, after some computation the proof is obtained. □

Figure 3: The figure of Theorem 2.

Mapping Voronoi diagram into the inversion circle

In this section we study the image of the Voronoi diagram inside the given inversion circle under the inversion map. We discuss it more precisely step by step. As an immediate result of Section 2, one can see that the image of the given Voronoi diagram under the inversion map relative to the inversion circle C is as follows. For this, let N denote the number of sites.

N = 2. In this case the Voronoi diagram consists of a line l which is the perpendicular bisector of the sites. Now, for mapping this into the given inversion circle, we have to study the two cases in which the line l passes through the center of inversion or not. The case that l is not through the center of inversion has been studied in Theorems 1 and 2. Further, if l is through the center of inversion, then since a line through the center of inversion is its own inverse, and since an inversion turns the circle inside out, we are done.

N = 3. When the number of sites is more than two, we have a Voronoi vertex. Thus we can discuss the cases in which it is inside of the inversion circle or outside of it.

a. When the Voronoi vertex is outside of the inversion circle and none of the Voronoi edges intersect the inversion circle.
In this case, for any given Voronoi diagram with the above property and a given inversion circle, there is a point in the inversion circle that corresponds to the Voronoi vertex. Thus, mapping the sites, edges and regions of the Voronoi diagram into the inversion circle, we will find a region in the inversion circle that contains the two other corresponding regions. It is interesting to know that the mentioned region is the image of that Voronoi region which contains the inversion circle (see Figure 4).

Figure 4: Voronoi vertex is outside of C and none of the Voronoi edges intersect C.

b. When the Voronoi vertex is outside of the inversion circle and there is a Voronoi edge that intersects the inversion circle.
In this case, there is a point in the inversion circle corresponding to the Voronoi vertex, and the image of the Voronoi diagram will divide the inversion circle into three distinct regions (see Figure 5).

Figure 5: Voronoi vertex is outside of C and there is a Voronoi edge intersecting C.

c. When the Voronoi vertex is inside of the inversion circle.
It is clear that in this case the corresponding point to the Voronoi vertex is not inside the inversion circle. Now, since the Voronoi edges intersect the inversion circle, we will obtain the following:


That is, the image of the Voronoi diagram under the


inversion relative to the inversion circle, will be three
curves inside the inversion circle which meet at infinity
(the center of inversion). See Figure 6.

Figure 6: Voronoi vertex is inside C.

The number of sites are more than three.


For this case we can generalize our results mentioned
above. Each Voronoi vertex which is inside the inversion circle has no corresponding point in the inversion
circle. Therefore, the image of the Voronoi diagram inside the inversion circle will miss the images of Voronoi
vertices which are inside of the inversion circle.

4  Special partition

In this section we focus on the image of the given 2-dimensional Voronoi diagram inside the given inversion circle. We first briefly explain stereographic projection.

4.1  Stereographic Projection

Let Σ be the sphere centered at the origin of C and with unit radius; that is, its equator coincides with the unit circle. We now seek to set up a correspondence between points on Σ and points in C (see Figure 7).

Figure 7: Stereographic Projection.

From the north pole N of the sphere Σ, draw the line through the point p in C; the stereographic image of p on Σ is the point where this line intersects Σ. Since this gives us a one-to-one correspondence between points in C and points on Σ, let us also say that p is the stereographic image of that point. Notice that no confusion should arise from this, the context making it clear whether we are mapping C to Σ, or vice versa.

4.2  Stereographic Formulae

In this subsection we recall explicit formulae connecting the coordinates of a point z in C and its stereographic projection on Σ. These formulae are useful in investigating non-Euclidean geometry.

To begin with, let z = x + iy and let (X, Y, Z) be the Cartesian coordinates of its projection on Σ. Here the X and Y axes are chosen to coincide with the x and y axes of C, so that the positive Z-axis passes through N. To make yourself comfortable with these coordinates, note that the equation of Σ is X² + Y² + Z² = 1, the coordinates of N are (0, 0, 1), and similarly S = (0, 0, −1), 1 = (1, 0, 0), i = (0, 1, 0), etc.

Theorem 3: Let the stereographic map from Σ to C be as above; then for any point (X, Y, Z) on Σ its image is

X/(1 − Z) + i·Y/(1 − Z),

and the inverse formula for a given point z = x + iy in C is

( 2x/(x² + y² + 1), 2y/(x² + y² + 1), (x² + y² − 1)/(x² + y² + 1) ).

Proof: See [14]. □

Now change the direction in Figure 7, and let S be the north pole of the Riemann sphere; then we deduce the following:


be the stereographic map with this Theorem 5: For a given set of points as sites in C, the
Let u : C
assumption that S is the north pole, then for the given image of Voronoi diagram of mentioned sites inside the
point (X, Y, Z) on :
given inversion circle C, will be a partition of C which
preserve symmetry. That is, the image of each pair of
X
Y
u(X, Y, Z) =
+i
sites inside C are symmetric with respect to the image
1+Z
1+Z
of corresponding Voronoi edge.
and
Therefore according to Theorem 5 and
2y
x2 + y 2 1
2x
, 2
, 2
) Lemma 4.1 [2], we will obtain the main result of this
u1 (x + iy) = ( 2
2
2
2
x +y +1 x +y +1 x +y +1
paper as follows.
.
Theorem 6: For a given set of points as sites in C, the
image of Voronoi diagram of mentioned sites inside the
Hence in this case we have the followings:
(i) The interior of the unit circle is mapped to the given inversion circle C, is the Voronoi diagram of the
southern hemisphere of . In particular, 0 is mapped inverted point sites in C. That is, the inversion of any
Voronoi diagram in C relative to the given circle C,
to the south pole N .
will give a Voronoi diagram in C.
(ii) Each point on the unit circle is mapped to itself.
(iii) The exterior of the unit circle is mapped to the
northern hemisphere of , except that S is the stereographic image of .

References

4.3

[1] H. Alt and O. Schwarzkopf, The Voronoi diagram of curved


objects, Proc. 11th Annu. ACM Sympos. Comput. Geom.
(1995), 8997.

Main results

By combining Theorem 3 and Corollary ??, the following interesting theorem will be derived:
and denote
Theorem 4: Let P be a given point in C
P 0 = u 1 (P ), then P 0 is the inverse of P with respect to the unit circle.
By using
Proof: Let P (x, y) be a given point in C.
Theorem 3,
1 (x + iy) = (

2y
x2 + y 2 1
2x
,
,
)
x2 + y 2 + 1 x2 + y 2 + 1 x2 + y 2 + 1
2

2y
x +y 1
2x
according to Corollary ??, u( x2 +y
2 +1 , x2 +y 2 +1 , x2 +y 2 +1 )
y
y
x
x
0
= ( x2 +y2 , x2 +y2 ). Thus P = ( x2 +y2 , x2 +y2 ). Therefore it will imply that (OP )(OP 0 ) = 1 and the proof is
done. 

For a given set of sites in C with Euclidean distance


function from a point to a site, a Voronoi diagram can
be described as the partition of C into cells that are the
locus of points closer to a given site than to any other
site. The boundaries of these cells are called Voronoi
edges and each of the Voronoi edges correspond to two
sites. On the other hand, each pair of sites are symmetric with respect to the corresponding Voronoi edge.
As mentioned earlier, the Riemann sphere is a model
so it is obvious that
for extended complex plane C,
the stereographic projection in any case will preserves
symmetry. Therefore for a given Voronoi diagram in C
and inversion circle C in C, it is sufficient to consider a
Riemann sphere whose equator coincides with the inversion circle C. Now the stereographic projection and
Theorem 4 gives the following theorem.


[2] F. Aurenhammer and R. Klein, Voronoi Diagrams, Handbook of Computational Geometry, J. Sack and G. Urrutia,
editors, Elsevier Science Publishers, B.V. North-Holland,
Chapter 5, pages: 201290, 2000.
[3] L. P. Chew and L. Drysdale, Voronoi diagram based on convex distance functions, Proc. 1st Ann. Symp. Comp. Geom.
(1985), 235244.
[4] S. Drysdale, Voronoi Diagrams: Applications from Archaology to Zoology, Regional Geometry Institute, Smith College,
July 19 (1993).
[5] A. Francois, Voronoi diagrams of semi-algebraic sets, Ph.D
Thesis, Department of Computer Science, The University of
British Colombia, January, 2004.
[6] M. J. Greenberg, Euclidean and Non-Euclidean Geometries,
2nd ed., W. H. Freeman & Co., 1988.
[7] M. Karavelas, 2D Segment Voronoi Diagrams, CGAL User
and Reference Manual: All parts, Chapter 43, 20 December,
2004.
[8] M. I. Karavelas and M. Yvinec, The Voronoi Diagram of
Planar Convex Objects, 11th European Symposium on Algorithms (ESA 2003), LNCS 2832 (2003), 337348.
[9] D.-S. Kim, D. Kim, and K. Sugihara, Voronoi diagram of a
circle set from Voronoi diagram of a point set: 2. Geometry,
Computer Aided Geometric Design 18 (2001), 563585.
[10] V. Koltun and M. Sharir, Polyhedral Voronoi diagrams of
polyhedra in three dimensions, In Proc. 18th Annu. ACM
Sympos. Comput. Geom. (2002), 227236.
[11]

, Three dimensional Euclidean Voronoi diagrams of


lines with a fixed number of orientations, In Proc. 18th
Annu. ACM Sympos. Comput. Geom. (2002), 217226.

[12] D. T. Lee, Two-dimensional Voronoi diagrams in the Lp


metric, JASM 27(4) (1980), 604618.
[13] Z. Nilforoushan and A. Mohades, Hyperbolic Voronoi Diagram, ICCSA 2006, LNCS 3984 (2006), 735742.
[14] T. Needham, Visual Complex Analysis, Oxford University
Press Inc., New York, 1998.


[15] A. Okabe, B. Boots, K. Sugihara, and N. Chiu, Spatial tesselations: concepts and applications of Voronoi diagrams,
2nd edition., John Wiley & Sons Ltd., Chichester, 2000.

[16] The Open University Mathematics: Unit 25 Geometry VI


the Kleinian view., The Open University Press, Prepared
by the Course Team, 1984.


Selection of Effective Factors in Estimating of Costumers Respond to


Mobile Advertising by Using AHP
Mehdi Seyyed Hamzeh

Pnu University
Department of Computer Engineering and Information Technology
mehdi seidhamze@yahoo.com

Bahram Sadeghi Bigham


Institute for Advanced Studies in Basic Sciences
Department of Computer Science and Information Technology
b sadeghi b@iasbs.ac.ir

Reza Askari Moghadam


Pnu University
Department of Computer Engineering and Information Technology
askari@pnu.ac.ir

Abstract: This paper presents an application of the analytic hierarchy process (AHP) to the selection of effective factors in estimating customers' response to mobile advertising, and then investigates the most successful factors for one form of mobile communication: short message services (SMS). This method adopts a multi-criteria approach that can be used for the analysis and comparison of mobile advertising. Four criteria were used for evaluating mobile advertising: information services, entertainment, coupons, and location-based services. For each, a matrix of pairwise comparisons between the influencing factors was evaluated. Finally, the aim of this investigation is to gain a better understanding of how companies use mobile advertising in doing business.

Keywords: : Mobile Advertising; E Advertising; Personalization; Analytic Hierarchy Process; Short Message
Services (SMS); Successful Factors

Introduction

Recently, great attention has been directed toward the efficacy of mobile advertisement on cellular phones by scholars and experts. This is due to the peculiarities of the cellular phone which make it different from other media; for instance, we can mention the personalization of advertisements.

With the growth and progress of electronic business, especially on cellular phones, mobile advertisement seems to be successful when the elements which affect customers' attitudes in electronic and wireless situations are well understood and the necessary actions are taken. Several elements affect customers' attitudes; we can mention personal values and inner beliefs, customers' characteristics, technological and media elements, and even the strategies which companies adopt and finally put into the analysis.

Online advertising (ad) is a form of promotion that uses the Internet and World Wide Web for the express purpose of delivering marketing messages to attract customers [3].

Corresponding Author, T: (+98) 911 325-8525


2
2.1

Literature review
Mobile advertising

Short message services (SMS) have become a new technological buzzword in transmitting business to customer messages to such wireless devices as cellular telephones, pagers, and personal data assistants. Many
brands and media companies include text message
numbers in their advertisements to enable interested
consumers to obtain more information [4].Mobile marketing uses interactive wireless media to deliver personalized time- and location-sensitive information promoting goods, services, and ideas, thereby generating value
for all stakeholders [2].Studying interactive mobile services such as SMS and MMS suggests drawing upon
theories in marketing, consumer behavior, psychology
and adoption to investigate their organizational and
personal use [4].
Mobile advertising is predicted to be an important
source of revenue for mobile operators in the future
[9] and has been identified as one of the most promising potential business areas. For instance, in comparison with much advertising in traditional media,
mobile Advertisements can be customized to better
suit a consumers needs and improve client relationship [1].Examples of mobile advertising methods include mobile banners, alerts, and proximity-triggered
advertisements [6].

3
3.1

The AHP

Personalization

Marketers can personalize text messages based on the


consumers local time, location, and preferences, directions to the nearest vegetarian restaurant open at the
time of request. A person may use a mobile device
to receive information but also for a purpose of personal entertainment, it mean if any matter (mobile
SMS/Advertising) will disturb his personal entertainment then he will never like to disclose his personal
information [10].

3.2

Credibility

The credibility of the mobile applications is what


makes their uses frequent. If the user experienced a
problem during the transaction or mobile advertising,
it is certain that it will not use the mobile applications
once more [7].

3.3

2.2

Alternatives

Consumer permission

Corporate advertising often serve as the primary point


of contact, asking consumers for permission to receive
SMS and According to all the experts, advertisers
should have permission and convince consumers to opt
in before sending advertisements [1].

The AHP is one the extensively used multi-criteria decision making (MCDM) methods. The AHP has been
applied to a wide variety of decisions including car purchasing, IS project selection [8], and IS success[5].
The AHP is aimed at integrating different measures 3.4 Consumer control
in to a signal overall score for ranking decision alternative. Its main characteristic is that it is based on pair
wise comparison judgments.
There is a trade-off between personalization and consumer control. Gathering data required for tailoring
In this paper, we discuss one representative the re- messages raises privacy concerns. Corporate policies
lationship between the effective factors in success the must consider legalities such as electronic signatures,
marketing companies and also influence the way the electronic contracts, and conditions for sending SMS
consumer reacts to mobile advertising.
messages [1].


Figure 1: Hierarchical model for selection of effective factors

4
4.1
4.1.1

A case study using AHP

4.2

Applying the AHP method


Breaking down the problem

The first step was to develop a hierarchical structure


of the problem. This classifies the goal and all decision
criteria and variables into three major levels, as depicted in Figure 1. The highest level of the hierarchy
is the overall goal: to select the best influence factors
mobile advertisement. Level 2 represents the criteria
that companies offer the services for their customers.
Level 3 contains the decision alternatives that in mobile ads we consider the types of proposed solutions
which are important to users.

4.1.2

Discussion of results

In the previous section the weight of the criteria with


regard to the purpose and also the weight of the alternatives with regard to the criteria were determined.
Now, the way relative weights are combined for evaluating the final weights to choose and prioritize the
best elements will be explained in Fig2. Since the rate
of compatibility is less than 0.1, it can be concluded
that the group decision has an acceptable compatibility. Therefore, the obtained results consist of personalizing messages in the first rank with the final weight
of 0.484, message credit with the weight of 0.310 in the
second rank. Sending message in an appointed time
with the weight of 0.118 in the third rank, and the
right of message reception on the part of the user with
the weight of 0.089 in the final rank.

Comparative judgments to establish priorities

5
After calculating the weight of the effective elements
in mobile ads in relation to the total designated criteria, we should determine the weight of the criteria. In
other words, the quota of each criterion in determining
the best effective element must be identified. To do
this we need to compare the criteria in pairs. For example, in order to determine the relative importance of
the four major criteria, a 4 4 matrix was formed. Expert Choice provided ratings to facilitate comparison,
these then needed to be incorporated into the decision
making process. After inputting the criteria and their
importance into Expert Choice, the priorities from each
set of judgments were found.

82

Conclusion

Online advertising is a new service in the marketing


industry. An AHP-based methodology was designed
and applied and has proven its potential in helping
decision- makers in supporting in order to specify and
prioritize elements that are effective on mobile business
processes in cellular phone user, in has paid attention
to find out the relation between attitude and effective
element in comparison with mobile advertisement in
cellular phones. Finally we were show the operational
process of hierarchical analysis on the mobile advertisement.

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

Figure 2: Synthesis for Select best influence factors

Refrences
[1] Arno Scharl, Astrid Dickinger, and Jamie Murphy, Diffusion and success factors of mobile marketing, Electronic
commerce research and applications 4 (2005), 159-217.
[2] A.P. Dickinger, A. Haghirian, A. Scharl, and J. Murphy,
A conceptual model and investigation of SMS Marketing,
Thirty-Seventh Hawaii International Conference on System
Sciences (HICSS-37), Hawaii, U.S.A (2004).
[3] Cookhwan Kim, Kwiseok Kwonb, and Woojin Chang, How
to measure the effectiveness of online advertising in online
marketplaces, Expert Systems with Applications 38 (2011),
4234-4243.
[4] David Jingjun Xu, Stephen Shaoyi, and Liao Qiudan Li,
Combining empirical experimentation and modeling techniques: A design research approach for personalized mobile advertising applications, Decision support Systems 44
(2008), 710-724.
[5] E.W.T.Ngai, Selection of web sites for online advertising
using the AHP, Information & Management 40 (2003), 233
242.

[6] G.M. Giaglis, P. Kourouthanassis, and A. Tsamakos, towards a classification framework for mobile location services, in: B.E.Mennecke, T.J. Strader (Eds.), Mobile Commerce: Technology ,theory, and applications, Idea Group
Publishing (2003).
[7] Glin Bykzkan, Determining the mobile commerce user requirements using an analytic approach, Computer Standards & Interfaces 31 (2009), 144-152.
[8] M.J. Schniederjans and RL. Wilson, sing the analytic hierarchy process and goal programming for information system
project selection, Information & Management 20 (1991), 33
342.
[9] DeZoysa and E. Mizutani, Mobile advertising needs to get
personal, tele-communications International 36 (2002),
no. 2.
[10] Thtinen j and B. V. S Ram, Mobile Advertising or Mobile
Marketing a need for new concept?, Conference proceeding
of eBRF, 152164.

83

An Obstacle Avoiding Approach for Solving Steiner Tree Problem on


Urban Transportation Network
Ali Nourollah
Department of Electrical and Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran

Fatemeh Ghadimi
Department of Electrical and Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
F Ghadimi@qiau.ac.ir

Abstract: The Steiner Tree Problem in a graph, which is one of the most well known optimization
problems, is used for finding minimum tree between some Terminal nodes. This problem has various
usages that one of them is routing in the urban transportation network. In these networks, there
are some obstacles that Steiner tree must avoid them. Moreover, as this problem is NP-Complete,
the time complexity of solving it, is very important for make it useable in large networks. In this
article, an obstacle avoiding approach has proposed that can find the near optimum answer of the
Steiner tree problem, in polynomial time. This approach has good rates in comparison with the
others, and it can find the possible near optimum tree, even when there are some obstacles in the
network.

Keywords: Steiner Tree on the Graph; Urban Transportation Network; Free-Form Obstacles; Heuristic Algorithms.

Introduction

graph interconnecting all given terminal nodes that do


not cross obstacles. It has been proven that this sub
graph is absolutely a tree if the given graph has no
negative weight [1]. The only difference between STP
and Minimum Spanning Tree (MST) is the ability of
using some extra nodes called Steiner nodes, in order
to reduction of path cost. This difference has made
STP an NP-Complete Problem.

The Steiner Tree Problem (STP) has several definitions, but in this article, it is considered on a graph.
The STP on a graph has many practical usages such
as global routing and wire length estimation in VLSI
applications, civil engineering and routing on urban
networks, and also multicasting in computer networks.
In 1972, the STP even on the graph, has been
This article focuses on the urban transportation network routing, so in computing Steiner tree, the sug- proven to be NP-Complete [2], so there is no polygested approach should avoid obstacles that may be nomial time solution for it, that can find the optimum
answer. Thus, there is a need for heuristics and approxexisted in this network.
imate approaches instead of exact algorithms. Some of
The urban transportation network is assumed as an these approaches are as follows: MST based algorithms
undirected, weighted graph. The nodes of this graph like algorithms of Takahashi et al. [3] and Wong et al.
are intersects, the edges are roads and the weights are [4] that for finding Steiner tree, they add an edge at
traffic volume. In this graph there can be some poly- each time until all terminals connect together; Nodegons that they are the obstacles, like Tehran restricted based local search algorithms like Dolagh et al. [5]
traffic area. The Steiner Tree Problem that can avoid that find Steiner tree with using local search and idenobstacles is defined as follows: finding the shortest sub tifying proper neighbors; Greedy Randomized Search
Corresponding

Author, T: (+98) 912 7668429

84

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

algorithms [6] that have three phases: a construction 3


Our Proposed Approach
phase, a local search phase and if necessary an updating phase. All of these approaches do not find the optimum answer of the STP, but they find near optimum The algorithm that is suggested in this article called
Obstacle Avoiding Steiner Tree on Urban transportaanswers in polynomial time.
tion Network (OASTUN). It can find Steiner tree on
In this article, we present a new heuristic approach an undirected and weighted graph with avoiding obthat can find near optimum answers of Steiner tree in stacles.
polynomial time and avoid obstacles. This article is
The inputs of this algorithm are the graph G and
organized as follows: In the next Section, some definitions and notations are reviewed. In Section3, our new the set T and also there is an assumption that says
algorithm is explained. In Section4, the experimental there is no isolated terminal in G, even when the obresults are presented and finally in Section5 there are stacles are considered. This algorithm consists of four
phases: the obstacle avoiding phase, the preprocessing
the conclusions.
phase, the first phase and the second phase. Finally,
the outputs of this algorithm are the Steiner tree and
its cost. The time complexity of OASTUN algorithm,
in the worst case is O(n(m + nlogn)), and it can find
near optimum answers while avoiding obstacles.

Definitions and Notations

3.1

Obstacle Avoiding Phase

In this phase of the algorithm, the given graph must be


refining from nodes and edges that are in the obstacle
Problem Definition: The STP asks for the minimum polygons. The inputs of this phase are graph G and
cost sub graph of G spanning T , that can use Steiner obstacles O, and the output of it is the refined graph
nodes. This tree must avoid any obstacles.
G0 . The related pseudo code is Algorithm 1.
Graph Definition: The undirected, weighted graph
G = (V, E, W ) includes of a set of vertices (V ) that
each one has a coordinates, and a set of edges (E) that
each edge is undirected and connects two vertices with
nonnegative weight (W ). Terminal nodes (T ) are in
a subset of graph vertices (T V ) and all the other
vertices are Steiner nodes (S = V \T ). The number of
vertices in V , is n, the number of edges in E, is m and
the number of terminal nodes in T , is r.
Obstacle Definition: On the graph G there can be
some free-form polygons as Obstacles O = (OV, OE)
that each one has a set of nodes (OV ), and a set of
edges (OE). The Steiner tree must avoid these obstacles and uses nodes and edges that are not in these
restricted areas. There is no limitation on the shape
of polygons and the number of them, but no terminal
nodes must be in these areas, and no terminals must
become isolated. This means that from each terminal,
there must be some edges to the other terminals.
Dijkstras algorithm Definition: This algorithm is
used for finding the shortest tree from one node to the
other nodes in a graph. By using Fibonacci Heap for
implementing Dijkstras algorithm, it has O(m+nlogn)
time complexity [7].

Algorithm OASTUN// Obstacle Avoiding Phase


Input. = (OV ,OE ), ( = , , )
Output. ( = , , )
1 for each do
2 if Inside( , ) is true then
3
Remove from and its edges from ;
4 end if
5 end for
6 for each do
7 for each OE do
8
if Intersect( , ) is true then
9
Remove from ;
10
break this inner loop;
11
end if
12 end for
13 end for

Algorithm 1: Pseudo code of Obstacle Avoiding


Phase
Definition 1: Inside(x, A) is a procedure that its
output is true if the node x be inside of the polygon
A and otherwise its false. The algorithm of this procedure, first of all, draws a line from outside of the
polygon A to the node x. If this line intersects with
the edges of the polygon A, for odd times, the node x

85

The Third International Conference on Contemporary Issues in Computer and Information Sciences

is inside of the polygon.

algorithm. Afterward, among all these paths in the


tree, the shortest path from ti to another terminal is
Definition 2: Intersect(A, B) is a procedure that selected and added to the ith cell in the array D. Then
its output is true if the edge A intersects with the edge its edges and nodes are added to P and N .
B otherwise its false. The algorithm of this procedure
is explained in [8].
The second loop (lines 6-20) is repeated for J times
or until no changes occur in the P . This loop is exactly
like the previous loop, but it obtains a shortest path
tree from ti to other nodes in N . If the weight of this
3.2 Preprocessing phase
shortest path is less than the weight of the previous
path for ti in D and it has no repeated edge with the
In this phase in order to reduce the counts of nodes previous path, then its edges and nodes are exchanged
and edges in the graph G0 , those ones that arent nec- with the previous ones in P and N . The number of
essary in Steiner tree computation must be omitted. the variable J, according to the experimental results
Therefore, the Steiner nodes that have less than two has been determined three, and its sufficient. At the
edges (Deg < 2) and their connected edges are omit- end of this phase there is an omission of repeated nodes
ted. The resulted graph of this phase is called G00 and in N and repeated edges in P (lines 21, 22).
the related pseudo code is Algorithm 2.
Algorithm OASTUN // First Phase
Algorithm OASTUN // Preprocessing Phase

Input. , = , ,
Output. ,

Input. , ( = , , )
Output. ( = , , )
// = \
1 for each do
2
if Deg( ) < 2 then
3
Remove from and its edge from ;
4
end if
5 end for

Initialization: = , = , = , = 3.
//P is a set of edges; N is a set of nodes; is an array of size r, J is
//a counter.

Algorithm 2: Pseudo code of Preprocessing Phase

3.3

First Phase

In this phase of the algorithm, the computation of the


Steiner tree is started. The shortest path between each
terminal node to one of the other terminals, which is
the nearest one, is computed in this phase. The inputs
of this phase are graph G00 and the set T and the outputs are the set of edges (P ) and the set of nodes (N )
from the obtained paths. Algorithm 3 is the pseudo
code of this phase.

1 for each do
2 Min{ShrtTree( , };
3 + .edges;
4 + .nodes;
5 end for
6 repeat
7 flag true;
8 -1;
9
for each do
10 Temp Min{ShrtTree( , };
11 if Temp < and Temp.edges .edges then
12
\ .edges;
13
\ .nodes;
14
Temp;
15
+ .edges;
16
+ .nodes;
17
flag false;
18
end if
19 end for
20 until flag=true or =0.
21 Remove all repeated edges in ;
22 Remove all repeated nodes in ;

Definition: ShrtT ree(x, A) is a procedure that its


output is a set of shortest paths from node x to each
Algorithm 3: Pseudo code of First Phase
node in the set A. These paths are obtained by computing the shortest tree that rooted in x and its leaves
are the nodes in the set A. This procedure uses Di3.4 Second Phase
jkstras algorithm to make a short tree, so if there be
more than one path with a same weight for two nodes,
In this phase after examination of the connectivity staone of them is chosen.
tus of the terminals, if there are any isolated trees they
The first loop (lines 1-5) of this pseudo code is ex- should be connected. For this reason, all the edges in
ecuted for each terminal (ti ), and it obtains a shortest P that connected together are put in the same groups.
path tree from ti to other terminals by using Dijkstras Afterward if the number of groups be greater than one,

86

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

three loops are executed. Algorithm 4 is the pseudo


In the third loop (lines 16-22), if there are any
code of this phase.
Steiner nodes in the set H, for each of them, the shortest path is computed. If this path has lower cost than
the previous one, and also it has the connection condiAlgorithm OASTUN // Second Phase
tions, it is replaced with the previous path and the related edges and nodes in P and N are exchanged. The
Input. , = , , , ,
path that has the connection condition doesnt make a
Output. Steiner tree path, Steiner tree Cost
cycle, or it doesnt make terminals to be isolated.
Initialization: = , = .
// C is a set of founded paths; H is a set of selected Steiner
//nodes.

1 Put all which are connected together, in


the same groups;
2 if groups number > 1 then
3 for each do
4 all nodes in with different groups
from ;
5 + ShrtTree( , ;
6 end for
7 while groups number> 1 do
8
Temp Min{ }which is not added yet;
9
if Temp connects two groups then
10
+ Temp.edges;
11
+ Temp.nodes;
12
+ Temp.Steiner nodes;
13
Update groups number;
14
end if
15 end while
16 for each do
17
if has a shorter path to any then
18
if this shorter path has the conditions then
19
Replace it with previous one and update
and ;
20
end if
21
end if
22 end for
23 Delete all repeated edges in ;
24 Delete all repeated nodes in ;
25 end if
26 for each do
27 if Deg < 2 then
28
Remove from and its edge from ;
29 end if
30 end for
31 Compute the summation of costs of all .

Algorithm 4: Pseudo code of Second Phase

At the end of this phase (lines 23- 30), there are


omissions of repeated edges of P , and repeated nodes
of N . Moreover, Steiner nodes with the deg less than
2 are also omitted from N , and their edges from P .
Finally, all the edges in the set P are the edges of the
Steiner tree, and the summation of their weights is the
cost of the Steiner tree.

Experimental Results

We implemented our algorithm in the C# programming language, and all the experiments were performed
in a computer with a 2.50 GHz Intel processor and 3GB
of RAM. This algorithm has been executed on several
data sets such as Beasleys data sets [9] and SteinLib
data sets [10]. Here the results of running the OASTUN algorithm on the set B of Beasleys data set are
shown.
The costs of the resulted Steiner trees from executing OASTUN algorithm on the set B, without running
Obstacle Avoiding phase, are in Table 1. The rate of
this algorithm is computed from the ratio of the cost
of OASTUN to the optimum cost.

Table 1: The results of OASTUN algorithm without


running Obstacle avoiding phase, on the set B
Graph Nodes Edges Terminals Optimum OASTUN
Number Count Count
Count
Cost
Result

In the first loop (lines 3-6), the shortest paths from


each node in N to other nodes of it that they are not
in the same groups, are computed. Afterward, the resulted paths will be added to C.

1B
2B
3B
4B
5B
6B
7B
8B
9B
10 B
11 B
12 B
13 B
14 B
15 B
16 B
17 B
18 B

In the second loop (lines 7-15), until all the separated trees are not joined together, a path with lowest
cost that connects two trees is selected from the C. The
edges and nodes of the selected path are respectively
added to P and N and also if there is any Steiner node
in this path, it is added to set H. In this situation,
the connectivity status of the groups and the number
of isolated groups are updated.

87

50
50
50
50
50
50
75
75
75
75
75
75
100
100
100
100
100
100

63
63
63
100
100
100
94
94
94
150
150
150
125
125
125
200
200
200

9
13
25
9
13
25
13
19
38
13
19
38
17
25
50
17
25
50

82
83
138
59
61
122
111
104
220
86
88
174
165
235
318
127
131
218

82
83
138
59
61
122
111
104
220
86
92
174
170
235
321
132
131
218

Rate
Time
(Opt/MSTG) (h:m:s:ms)
1
1
1
1
1
1
1
1
1
1
1.045
1
1.03
1
1.009
1.039
1
1

0: 0: 0: 10
0: 0: 0: 16
0: 0: 0: 33
0: 0: 0: 15
0: 0: 0: 20
0: 0: 0: 50
0: 0: 0: 28
0: 0: 0: 37
0: 0: 0: 101
0: 0: 0: 54
0: 0: 0: 66
0: 0: 0: 131
0: 0: 0: 69
0: 0: 0: 123
0: 0: 0: 220
0: 0: 0: 125
0: 0: 0: 168
0: 0: 0:350

The Third International Conference on Contemporary Issues in Computer and Information Sciences

When the obstacles are drawn on the main graphs,


according to the shape and position of the obstacles,
the underlying graph will be changed. Therefore, the
Steiner tree and its cost become different from the original graph that had no obstacle.
In the urban transportation network that intersects
are assumed as vertices of the graph and roads are the
edges, the obstacles are the places where drivers cannot pass through them. In Figure1, there are some
samples of graph 16 from data set B, with some obstacles. According to these obstacles, the Steiner tree and
its cost are changed. In this Figure all the Terminal
nodes are shown with filled circles. In Fig.1 (a), the
obtained Steiner tree of graph 16, without considering
obstacles is shown. The cost of this tree is 132, that
its a near optimum cost. In Fig.1 (b) and (c), two freeform obstacles in different positions have drawn. The
cost of the obtained Steiner tree in this condition is respectively 200 and 193. These are the possible shortest
trees that do not pass through the obstacles. In Fig.1
(d), some boundaries are drawn for the graph and the
cost of the resulted Steiner tree is 163.

It is obvious that if there were no obstacles, and


all the edges of the given graph could be used, the result would be closer to the optimum. However, this
approach finds near minimum possible answers, even
while it should avoid obstacles.

Conclusions

Steiner tree problem is an important issue in many


fields. In this article, a new heuristic approach proposed that it could find Steiner trees on the graphs even
when there are obstacles. This algorithm can be used
on the huge graphs such as transportation networks, in
appropriate running time. As there are free-form restricted areas in the urban transportation network that
drivers cannot pass through them, the OASTUN approach considers these free-form obstacles and it finds
the Steiner tree with avoiding them. This algorithm
has polynomial time complexity, and it can find near
optimum answers in good rates in comparison with the
optimum answer and the other works.

Refrences
[1] S. E. Dreyfus and R. A. Wagner, The Steiner Problem in
Graphs, Networks 1 (1972), 195207.
[2] R.M. Karp, Reducibility among Combinatorial Problems,
Complexity of Computer Communications, Plenum Press,
New York (1972), 85103.

(a)

[3] H. Takahashi and A. Matsuyama, An approximate solution


for the Steiner problem in graphs, Math. Jpn. 24 (1980),
573577.

(b)

[4] Y. F. Wu, P. Widmayer, and C. K. Wong, A faster approximation algorithm for the Steiner problem in graphs, Acta.
Info. 23 (1986), 223229.
[5] S. V. Dolagh and D. Moazzami, New Approximation Algorithm for Minimum Steiner Tree Problem, International
Mathematical Forum 6/53 (2011), 26252636.
[6] S. L. Martins, P. M. Pardalos, M. G. C. Resende, and C.
C. Ribeiro, Greedy Randomized Adaptive Search Procedures
For The Steiner Problem In Graphs, AT&T Labs Research,
Technical Report (1998).
[7] S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, Algorithms, Chapter 4, Section 4, 2006.

(c)

[8] M. De Berg, O. Cheong, M. V. Kreveld, and M. Overmars,


Computational Geometry, Third Edition, Springer-Verlag,
Berlin/Heidelberg, Chapter 2, Section 1, 2008.

(d)

Figure 1: The obtained Steiner trees of graph 16 from


data set B. The color of the given graph is white smoke;
the color of obstacles is blue, and the color of obtained
Steiner tree is black.

88

[9] J. E. Beasley, OR-Library: Distributing Test Problems by


Electronic Mail, Operational Research Soc. 41/11 (1990),
10691072.
[10] T. Koch, A. Martin, and S. Voss, SteinLib: An Updated
Library on Steiner Tree Problems in Graphs, ZIB-Report
00-37, Germany (2000).

Black Hole Attack in Mobile Ad Hoc Networks


Kamal Bazargan

University Of Guilan
Department of IT Engineering Trends in Computer Networks
Ka.Bazargan@yahoo.com

Abstract: Ad hoc wireless network includes a set of distributed nodes that are connected with
each other wirelessly. Nodes can be the host computer or router. Nodes directly without any access
point to communicate with each other and have no fixed organization and therefore have been
formed in an arbitrary topology. Each node are equipped with sender and receiver. An important
feature of this network is a dynamic and changing topology .It is result of node mobility. Nodes in
these networks are continually changing its position that it requires a routing protocol that has the
ability to adapt to these changes, to appear.

Keywords: Specialized mobile networks, network security, massive attack of the black hole, the routing protocol,
Black Hole, AODV

Introduction

order to use this network.


Communication between nodes in the ad hoc networks are via radio waves and if another node is a
node in radio range is considered as its neighboring
nodes and not requested the communication between
two nodes that wouldnt in radio range. So can be used
other nodes for communication, so communication between nodes is created based on cooperation and mutual trust between nodes is created. Stimulated nodes,
the wireless communication, lack of defensive lines, the
lack of centralized management to review behavior of
the existing nodes in the network, the dynamic change
of network structure and power constraints of nodes,
provides a good platform for various attacks against
wireless networks. Nodes in these networks that they
work together and exchange information (in fact, work
together based on trust) provides a good opportunity
for attacker that penetrate the network and disrupt
network routing and the elimination of exchange of information on their networks.

Ad hoc network routing and security is the problem


of todays networks. Ad hoc wireless networks are
two types: smart sensor networks and mobile Ad hoc
networks. In Ad hoc sensor networks routing hardware sensor imposed restrictions on the network that
should be considered routing methods, including the
power supply be limited in nodes. And in practice
it is not possible to replace or recharge; the routing
method proposed in this network should be best to
use the available energy, must be informed of the resources so if nodes were not sufficient resources dont
send packet for destination. Autonomous and capable of being adapted to create nodes. Nowadays tend
to use wireless networks is growing day by day, because every person in any place and any time it can
be. Special mobile networks are set of wireless nodes
that can be formed dynamically at any place and at any
time without using any form of network infrastructure.
Most of these nodes play a roll both as the router as
One of the most popular protocol used in these neta node in the same time. This feature has made the works, is AODV protocol that in many studies effects
possibility of establishing networks with fixed structure of the attacks on AODV protocol has been studied.
and is not predefined, such important occasions, such
as military items, earthquakes, floods and the like, in
Corresponding

Author, P. O. Box 45195-1159, T: (+98) 241 424-8299

89

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

AODV using a query cycle makes a route and route


a request. When the source node to route the request
to find the destination node, a node that is currently
has no route to destination in the database, packet is
broadcast the route request across the network, Nodes
that receive this packet, if they have you have the news
of its destination say it and finish the routing otherwise by adding its node number, scrolling messages,
and finally sent to a neighboring node to destination
node And message routing for source node is returned.

3.2

Introduction Targets of Attacks

Black holes are two characteristics: first of all, introduce his path as the shortest route (reliable routes), although this is a false path, with intention to the packet
stopping Secondly, black holes is wasting with the passage of the node of origin to consumption. In Ad hoc
networks routing, AODV protocols is one of the most
popular protocols that are used, Black hole nodes that
more damage to this protocol are in the routing protoThe routing nodes are created problem during the col and cause disturb to the routing protocol.
routing that causes data loss in the network so are
called them malicious nodes or black holes in, this paper is presented solution black holes attack. This way
3.3 Divided Black Hole Nodes
the behavior of nodes in the network decides whether
the target node is malicious or not?
Black hole nodes can be divided into several categories:

1 Nodes that are created problem individually.

AODV Algorithm

2 That a group of nodes (some nodes) are working


together

AODV algorithm (Advanced On-demand Distance


Vector) doesnt data path in the header. Each node
controls RREQ on the table when it is already. If your In another division malicious node can also be divided
table is in the final node, then issued RREP. Other- into two functional parts:
wise, the messages RREQ get broadcast. RREP can
certainly be sent back to the RREQ sender. For an
intermediate node is aware of this issue that he knows
1 Malicious nodes use the received data and remove
whether the route request is newer, are used a sequence
them from their path.
number in RREQ messages. So just in case the RREQ
2 Malicious nodes that consume data received and
sequence number is smaller than the known sequence
then broadcast them in a perverse way (This crenumber, RREP message is issued by the intermediate
ates networks traffic and cause buffers of nodes
nodes.
be used.).

3
3.1

Black Hole Attacks

Types of Black Hole Attacks


Division

Introducing Black Hole Attack

The most dangerous attacks are black hole attacks.


In black hole attacks, the attacker with false news for
shortest path routing, can be attracted network traffic
to own side and then puts it away. Black hole attack is
a severe attack that can easily be used against routing
in mobile phones. Black hole, has been specified a malicious node that it respond falsely path to any request
path without an active route to the destination and in
this way all packets are received. If malicious nodes
work together as a group, very serious injuries will be
created in the network. This type of attack is named
Cooperative black hole attacks.

4.1

Methods of Indivisual Nodes

In these nodes when routing is done from source node,


the node change information of routing table of source
node (reduce HOP (number of nodes to navigate to
the destination) and reduce destination routing time)
Source node choose and that node as path of the data
sending that this node will cause the data consumed
and destroyed. If the black hole introduce itself as the
proper path for all nodes, in this case will cause loss of
all network packets and eventually caused the denial of
service.

90

The Third International Conference on Contemporary Issues in Computer and Information Sciences

4.2

Introduce Making Resistant Tech- is issued, Takes voting place around the process of a
niques in The Black Hole Attack on node. Then, based on opinions issued by the neighbor
node RREP, takes decision that the node is being held
Indivisual Nodes
for bad business.

Some solution is proposed, for single black hole in this


way in the next data to the destination, when intermediate node responds to the RREQ, add to RREP
packets then the source node, sends a request (FREQ)
to the next responsive node and ask about the responsive node and destination path. Using this method,
can be detected reliability of responsive node only if
the next step is reliable. This solution cannot prevent
the attack of the massive black hole in the MANETs.
For example, if the next node cooperates with the responsive node, it will simply respond to FREQ for each
question. Source node trust to the next step and sends
data and response node that this node is a black hole
node. In the next proposed way to prevent attacks on
individual black holes, the proposed method requires
intermediate nodes to send the request route confirm
or CREQ to the destination node in the next HOP.

4.3

Introduce Making Resistant Techniques in The Massive Black Hole


Attack

In this solution with little change in AODV protocol


introduces data routing table (DRI) and with checking
this table can be largely prevent the black hole attacks.
There is a solution to identify black hole nodes in cooperate by adding two bits of information in the routing
data table (DRI) that it will fix the problem somewhat.
Node #
3
6
B2

Data Routing
From
1
1
0

Information
Through
0
1
0

Table 1:

In the above table, the value of 1 means true and


the value 0 is false. From in this table means information that the node sends to the desired node and
Through means the information that gets the desired
node. The table is an example of node 4, that is maintained through = 0, from = 1 means that data sent
from node 4 to node 3, But no data packets has been
found through the node. Node (6) is in the table means
that data is sent from node 4 to node 6 and the data
path of the desired node is found. (I.e., through this
node to the correct data is added), and for node B2 the
value. Inserted by means that are not tied to this data
is sent and received data from node that this method
is presented to detect malicious and new nodes with 2
bits of data.

After that, the next HOP received CREQ. The


memory will search itself route to find a route to destination. If there is a path sends route confirm or
reply CREP to the source node with route information. Source node by comparing the information in the
CREP detects whether RREP is or not. Because added
operation to the routing protocol overhead so overhead
is high. In another method, the source node via finding more than one path, agrees with destination, the
validity of the RREP begun. Source node tries to get
RREP packet more than two nodes. In Ad hoc netAnother solution is presented that can prevent masworks in much the same routes, there are a number of
sive black hole attacks. This solution that is developed
nodes and the Hop is common.
is AODV. These solutions discover a safe route that
When the source node receives the RREP, if the avoids of massive black hole. In this supposed that to
paths to the destination, Hop is common, Source node confirm nodes participate in relation .In this method,
can identify the safe route to the destination node, the to prevent black hole attacks is used from the truth
routing delay is caused because the node must wait to table in which each node has a degree of accuracy that
receive RREP from more than two nodes. We use this is as the size of the node. If nodes accuracy degree is
way to prevent increasing of the routing overhead and zero, this means that the node should be discarded so
delay of routing. The next method when the RREP it is called black hole.

91

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

The Proposed Method

the received information the desired accuracy will be


checked and if it is malicious nodes an alarm message
will broadcast in the network to the target node to
The proposed method is to try to decide whether the be placed in quarantine. The proposed algorithm has
hostile node malicious via nods behavior. Principles been implemented on AODV protocol and for doing its
of the proposed method are as follows:
operations are used from several packages:
1 Recorded information about that node is as follows:
Total data sent to neighboring nodes
Total data received from a neighbor node
The number of responses received from a
neighbor node
2 Sending packets requesting comments from
neighbors about a neighboring node that has
send reply packets
3 Receiving recorded information about the sender
of reply packet in its neighboring nodes

1 Request packet of information about a node: the


packet contains the node ID in question, request
sender ID and time of packet life.
2 Data packets to neighboring nodes about the
node in question: This package includes the number of received data packets received from the
target node, the number of sent packets sent to
the target node, and the number of RREP received packets received from the desired node.
3 Warning packet: This package includes nodes
that are known to be malicious node and should
be in the quarantine list. Alarm package is distributed in the whole network.

4 Review the received information and comments


about the malicious node
The benefits of the proposed method are that the node
begins the poll process when received a RREP packet
6 Removal of the Quarantine node in the routing from an unreliable. I.e. if a node already has proven
process
its integrity (via sending data packets) the survey is
longer than others. This will reduces the overhead of
the proposed algorithm. Secondly, when the informaIn the proposed method, each node in the network data tion requested, the neighboring nodes are also updated
structures is:
to reduce the algorithm overhead.
5 Send risk packet quarantine to a malicious node

1 Each node has a table that is related to its behavior and its of neighboring. Each entry in this
6 Simulations of Black Hole Attable specifies that the neighbor node with the
tacks
specified Id how many data packet send with this
and how many reply packet send this node and
how many data packet the desired node is delivIn this simulation using NS simulation software and the
ered to a neighbor node.
number of healthy nodes and the number of malicious
2 Each node contains a list of nodes that are in nodes, we show the simulation results.
quarantine and will be removed from the routing
process.

6.1
Malicious nodes are nodes that are responding RREQ
packets to send RREP packets to the large number
of data packets delivered to it by the data, but the
minimum data has been sent to neighboring nodes.
When a node receives RREP packet from its neighbor node if the node receives a RREP responding to a
RREQ, be an intermediate node and destination node,
it checks whether the responding node is not the nodes
that are in quarantine. If the node is a malicious node,
the RREP packet is discarded. Otherwise, voting process is performed around the responding node so as to
obtain all the desired node activity. Then based on

Parameters Used in Simulation


Parameter
Simulation software
Simulation time
Number of nodes
Routing Protocol
Traffic model
Stop time
The implementation
Transfer of
The number of malicious nodes

92

value
OPNET
600 sec
50
AODV
CBR
2 sec
600*600 m
250 m
2

The Third International Conference on Contemporary Issues in Computer and Information Sciences

6.2

The Simulated Performance

The average delay of End-to-End: This specifies that


when the source packet reaches to the destination
Packet delivery ratio PDR (Packet Delivery Ratio):
Data packet transmission rate from origin to destination
Routing rates: The maximum data rate transmission
in Routers (Such times: RREQ, RREP, RERR)

6.3

Figure 1: OPNET simulation environment

The Simulation

Simulation
time in
seconds
100
130
160
190
210
240
270
300

The
average
delay
End-to-End
0.003323244
0.003323344
0.003323371
0.003323419
0.003323444
0.003323454
0.003323474
0.003323348

Packet
delivery
rate
2557
2551
4001
4001
4001
4001
4001
4001

Routing
rates
2171
2133
2151
2198
2165
2171
2199
2111

Table 2: Results of simulation: AODV under attack


Figure 2: Packet delivery ratio vs. Simulation Time
Simulation
time in
seconds
100
130
160
190
210
240
270
300

The
average
delay
End-to-End
1.104323644
1.104323644
1.104323644
1.104323654
1.104323654
1.104323664
1.104323664
1.104323664

Packet
delivery
rate
2271
2333
2451
2698
2765
2871
2899
2911

Routing
rates
4950
4950
4950
4950
4950
4950
4950
4950

Table 3: Results of simulation: After removal of the


black hole

The following we will display the simulation using simulation software graph with malicious nodes.
Figure 1: OPNET simulation environment
Figure 2: View increasing the delivered package with
the removal of malicious nodes
Figure 3: Show created delay because of. Attack elimination
Figure 4: View the routing overflow because of attack
elimination.

93

Figure 3: Packet delivery ratio vs. Simulation Time

Conclusion

In this paper described methods to deal with the black


hole. This method is applicable on AODV protocol
that can be easily detected malicious nodes and destroyed them. The main advantage of this method
is that it detects malicious nodes with minimal overhead and puts them in quarantine, we can be used this

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

method in a more ad hoc networks Because of this simplicity and ease of implementation.

ad-hoc networks, IEEE Transactions on Mobile Computing


2 (2003), no. 3, 257269.
[2] H. Deng, W. Li, and D. P. Agrawal, Routing security in ad
hoc networks, IEEE Communications Magazine 40 (2002),
no. 10, 7075.
[3] Karpijoki,
Security
in
Ad
Hoc
http://www.hut.fi/ vkarpijo/netsec00/.

Networks:

[4] Zhou and Z. J. Haas, Securing Ad Hoc Networks, IEEE 13


(1999), no. 6.
[5] Lundberg, Routing Security in Ad-Hoc Networks, Helsinki
University of Technology.
[6] Elizabeth M. Royer and Chai-Keong Toh, A Review of Current Routing Protocols for Ad-Hoc Mobile Wireless.
[7] Charles E. Perkins and Elizabeth M. Royer, Ad-hoc OnDemand Distance Vector (AODV) Routing, Internet.

Figure 4: routing overhead vs. Simulation Time

Refrences
[1] C. Bettstetter, G. Resta, and P. Santi, The node distribution of the random Waypoint mobility model for wireless

[8] David B. Johnson and David A. Maltz, Dynamic Source


Routing in Ad-Hoc Wireless Networks, Mobile Computing.
[9] Izhak Rubin, Arshad Behzad, Huiyo Luo Ruhne Zhang,
and Eric Caballero, TBONE: A Mobile-Backbone Protocol for Ad-Hoc Wireless Networks, In Proceedings of IEEE
Aerospace Conference 6 (2002), 27272740.
[10] Y. Zhang, W. Lee, Huiyo Luo Ruhne Zhang, and Eric Caballero, Intrusion Detection in Wireless Ad-Hoc Networks,
In Proceedings of Mobicom 2000 (2000), 275283.

94

Improvement of the Modeling Airport Assignment Gate System


Using Self-Adaptive Methodology
Masoud Arabfard

Mohamad Mehdi Morovati

Kashan University of Medical Sciences, kashan, Iran

University of Kashan, Kashan, Iran

arabfard-ma@kaums.ac.ir

Department of Computer Engineering


mm.morovati@grad.kashanu.ac.ir

Masoud Karimian Ravandi


Science and Research Branch , Islamic Azad University , Yazd , Iran
Departemant Of Computer
karimianravandi@gmail.com

Abstract: Nowadays, the influence of software on most of the fields such as industry, sciences,
economy, etc is understood significantly. Success of software systems depends on its requirements
coverage. Requirement Engineering explains that the system can do what work in what circumstances. Successful Requirement Engineering depends on exact knowledge of users, customers and
beneficiaries requirements. Airport Assignment Gate System is a System Software which performs
the Gate Assignment Management to aircrafts automatically. This system is used in order to reduce
delays in airline system as well as reducing the delay time for planes which are waiting for landing
or flying. In this paper, the Self-Adaptive Methodology has been used for modeling this system and
with regard to this issue that this system should show different behavior in different conditions.
Self-Adaptive System is a system which is able to change itself at the time of responding to the
changing needs, system and environment. Using this Methodology, this paper attempts to support
the uncertainty and accountability to the needs created in the runtime more than ever.

Keywords: Self-adaptive Software; Run-time Requirements Engineering; KAoS; Uncertainly Management; Goal
Oriented; Airport Assignment Gate System.

Introduction

Nowadays, the software has been influenced most of


the fields such as industry, sciences, economy, etc significantly. Success of software systems depends on
its requirements coverage. Requirement Engineering
explains that the system can do what work in what
circumstances. Successful Requirement Engineering
depends on exact knowledge of users, customers and
beneficiaries requirements. Understanding the concept of system can be used anywhere in software development such as modeling, analysis, negotiation and
documentation of beneficiaries requirements, evaluating the documents provided of requirements, and
Corresponding

Author, T: (+98) 913 2649405

95

management of requirements evolution [1]. In this


paper, a kind of Requirement Engineering named the
Requirement Engineering based on the goal has been
used. In the Goal-based Requirement Engineering, the
main focus is on the goals of system. In fact, the goal
is used in this technology for extracting the requirements, evaluation, structure, documentation, analysis,
and system evolution. According to this viewpoint, the
goal is an instructional explanation of the system concept and the system should achieve it by participating
with the agents. The Agent is an active component
of system and has a particular role in the system. In
the other words, the agent is responsible for satisfying
the needs. On the other hand, the agents in each system show the range of system. Therefore, the goal of

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

system should be shared as the phenomena among the


agents in order that the agents satisfy them by doing
their own tasks[2].We require a language of Requirement Engineering Modeling in order to accomplish the
Requirement Engineering. In this Paper, we have used
the KAoS language for modeling the requirements.
The reason for using the KAoS is the superiority of
this technology compared to other languages. The
cause for this superiority is the definition of system
concept as the hierarchy of objectives and using the
concept Agent [3]. Moreover, this technology was
introduced in [4] with fully object-oriented agents and
this caused that it was used more than before.

feedback circle in order to adapt itself with changes occurred in the runtime (Figure 1). These changes may
have arisen from the system itself (Internal factors) or
the concept of system (External factors). Thus, these
kinds of systems required to scan themselves, detect
the changes, decide to react against the change, and
finally implement the decided action[6].

Airport Assignment Gate System

Gates are the final ports for passengers entry and exit
at the airport. Airport Assignment Gate is the process of selecting and assigning the aircraft to the Gate,
which is used for exact and scheduled assignment and is
considered as one of the important tasks at an airport.
This assignment is connected with a set of arrived and
moved flights, the Gates, which are ready to be assigned, and a set of constraints, which are imposed by
the airlines and airport. Thus, the Assignment Process
may be different due to various circumstances. In order
to create an efficient assignment, the assignment process must be able to cope with the sudden changes in
the operating environment and provide a timely soluFigure 1: Self-Adaptive System Feed Back Loop
tion for satisfying the proactive needs. Therefore, the
Gate Assignment should be quite clear and explicit and
has the ability to cope with the changes [7].By increasing the number of passengers and flight, the complexity
of this process will be increased significantly and the
2 Self-Adaptive System
optimal use of gates will be so important. Furthermore,
as mentioned, due to the sudden changes, which may
Self-Adaptive System is a kind of system which is able be occurred, the system should apply the optimal and
to change itself in the runtime in respond to changing efficient assignment according to the new conditions
needs, system and environment. These kinds of sys- and caused requirements.
tems depend on a variety of aspects like user needs,
features of system, features of environment, etc. The
main feature of these systems is that it reduces partly
Goal and Agent in Assignment
the dependence on human management. In fact, the 4
Self-Adaptive Software assesses its own behavior and
Gate System
changes it if it becomes clear in the assessments that
the system has not done the task, which is assigned
to it, completely and not achieved the desired objec- System Goal is the system final aim which should
tive, or the work can be done with greater efficiency achieve it. Goal can be connected with the life of sysand effectiveness [5].Before creation of Self-Adaptive tem or its scenario. Goal can be displayed as several
Software, the reimplementation and reconfiguration quantities each of which is connected with different feaprocess of system, which was a time consuming and tures. In addition, this goal can be divided into several
costly act, was done by human or his direct manage- sub goals each of which are associated with a feature.
ment in order to respond the occurred changes. There- A Behavioral goal defines a maximum set of system
fore, the research on software, which can automatically permissible behaviors. This kind of goal is divided into
and without human interference adapt itself with the two groups including the Achieve Goal and Maintain
occurred changes in the runtime, became important. Goal. Achieve goal is an objective which indicates the
Self-Adaptive Software was developed as a system with ultimate destination of system and demonstrates the

96

The Third International Conference on Contemporary Issues in Computer and Information Sciences

behavior which finally should be existed in the system.


The norm goal prioritizes among the alternative behaviors of system and determines that which alternative
has much profits and advantages. In the other words,
it is not necessary to have all of them implemented continuously in the system. This type of goal is usually
considered as a criterion for selecting the options in the
system. However, we cannot show whether the target
goal can be satisfied with these conditions or dissatisfied with other conditions. In fact, unlike the behavioral goal, there is no clear understanding in this type
of goal [8].An Agent is an active member of a system
which plays a role in satisfying the goal. What is considered in the Agent Model is not the special features of
agent and its features, but is the role of agent in satisfying the goal [9]. From a functional standpoint, an agent
is a processor which performs a specific function under
the transparent and clear conditions for satisfying the
desired goals. These conditions depend on the certificates and tasks which are defined for each operator in
the operator model. In order to analyze the responsibilities and guiding them to the permissions and tasks,
we require dividing the multiple-agent responsibilities
into the single-agent ones for the low level goals, so
that each goal card is assigned to a software agent for
the requirements and/or an environmental agent for
the expectations. In order to assign a goal to an agent,
the ability of that agent should be considered. These
abilities, as the features of classes corresponding to the
agent, are defined in the object model.

Figure 3: Goal Diagram of System

System Model Simulation

Figure 2 represents the Use Case chart of Airport System. KAoS charts are drawn using this chart and based
on it. The players of this chart are in fact the agents
of goal and responsibility model. Moreover, the cases
of this chart help us in determining the existing methods in the object model and the operators of operator
model. The goal chart of Airport Assignment Gate
System is presented in Figure 3. This chart is designed
based on the goal-based methodology. The numbers
shown in figures 3 and 4 are described as follows.
1. Gate Is Requested
2. Achieve[Getting information If Pilot was requested]
3. Achieve[Checking, Assigning emergency flight If
Information was given]
4. Achieve[Checking capacity, airline, area and
making a decision If emergency was not true]
5. Achieve[Update1 database]
6. Achieve[Inform To pilot]
7. Achieve[Inform pilot to allocator for leaving gate]
8. Achieve[Assigning gate if flight was emergency]
9. Achieve[Assigning gate that is appropriate to
other constraint if flight was not emergency]

Figure 2: Usecase Diagram of Assignement Gate System

97

10. Achieve[Update2 DataBase]

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

After extracting the system goal chart, the agent chart


is extracted from the goal chart. Figure 4 shows the
system agent chart. Then, the object model is extracted using the Use Case chart. Figure 5 represents
this chart. Finally, the performance of system is displayed by developing the operator chart. This chart is
obtained by connecting the extracted charts at the previous stages. Figure 6 shows the operator chart. The
numbers shown in Figure 6 are described a follows.
A. RequestGate
B. AddInfo
Figure 5: Object Diagram of System

C. GetInfo
D. CheckEmergent
E. AssignGateToEmergencyFlight
F. CheckOtherConstraint
G. AssignGateToOtherFlight
H. AddQueue

Related Works

I. Update1Databese
J. InformedPilot
K. LeaveGate
L. FindEmergencyWaitingFlight
M. AssignToEmergencyWaitingFlight
N. FindAppropriateFlight
O. AssignAppropriateFlight
P. Update2Database

In the runtime, the Requirement Engineering is considered as a subset of self-adaptive software engineering
science which only has been studied seriously in recent
years. [10] is one of the works which can solve the
problem of Airport Assignment Gate System. In the
method presented in [10] the ability of functions based
on the past knowledge and experiences for the manual
operation is used for solving this problem, but the algorithms used in this method had more analyzing and
computing power than before. The major problem of
this method was the manual section of operation. Furthermore, in order to optimize the Gate Assignment
in [11], it has been focused on minimizing the distance
passed by passenger between the terminal and gate
assigned to the aircraft. Despite the fact that this
subject is considered as a second-rate problem among
the problems of gate assignment, generally solving
this problem will affect the optimum gate assignment.
Moreover, in order to solve the assignment problem
in [12], the probable flight delay has been focused. In
this method of problem solving, the probable gate assignment model and proactive assignment rules have
been used.Since that this system should be changed according to various conditions, which may be occurred
in the operational environment, and adapt itself with
new circumstances, none of the existing systems have
focused on much supporting of the uncertainty in designing this system. This aim has been achieved in the
method presented in this paper using the self-adaptive
methodology and the modeling language of KAoS requirements.

Figure 4: Agent Diagram

98

The Third International Conference on Contemporary Issues in Computer and Information Sciences

Figure 6: Operator Diagram of System

Conclusions and Future Work

This paper attempted to support the uncertainty more


in designing and implementing the Airport Assignment
Gate System by using the Self-Adaptive Methodology,
and a system, which was much independent and had
the most optimized performance, was designed. According to the charts presented for designing this system, it became obvious that this system adapts itself
with different conditions and performs the best possible performance in any circumstances. Recently, a new
technology named Techne has been developed for
modeling the requirements of Self-Adaptive Software
Systems and the main focus in this technology is on the
more Management of uncertainty in the Self-Adaptive
Software [13]. Due to the shortcomings in this technology such as the lack of development environment
of charts, etc it cannot be considered as a formally
introduced technology yet. But, according to the research conducting in this field, this technology can
be considered as a turning point in the Requirement
Engineering and designing the Self-Adaptive Software
Systems.

[3] A. Uszok, J.M. Bradshaw, and R. Jeffers, KAoS: A policy


and domain services framework for grid computing and semantic web services, Trust Management (2004), 16-26.
[4] J.M. Bradshaw, S. Dutfield, P. Benoit, and J.D. Woolley,
KAoS: Toward an industrial-strength open agent architecture, Software Agents (1997), 375-418.
[5] M. Salehie and L. Tahvildari, Self-adaptive software: Landscape and research challenges, ACM Transactions on Autonomous and Adaptive Systems (TAAS) 4 (2009), no. 2,
14.
[6] B. Cheng, R. de Lemos, H. Giese, P. Inverardi, J. Magee, J.
Andersson, B. Becker, N. Bencomo, Y. Brun, and B. Cukic,
Software engineering for self-adaptive systems: A research
roadmap, Software Engineering for Self-Adaptive Systems,
LNCS (2009), 1-26.
[7] H. Ding, A. Lim, B. Rodrigues, and Y. Zhu, The overconstrained airport gate assignment problem, Computers
and operations research 32 (2005), no. 7, 1867-1880.
[8] A. Dardenne, A. Van Lamsweerde, and S. Fickas, Goaldirected requirements acquisition, Science of computer programming 20 (1993), no. 1-2, 3-50.
[9] P. Donzelli, A goal-driven and agent-based requirements engineering framework, Requirements Engineering 9 (2004),
no. 1, 16-39.
[10] Y. Cheng, A knowledge-based airport gate assignment system integrated with mathematical programming, Computers
and industrial engineering 32 (1997), no. 4, 837-852.
[11] A. Haghani and M.C. Chen, Optimizing gate assignments at
airport terminals, Transportation Research Part A: Policy
and Practice 32 (1998), no. 6, 437-454.

Refrences
[1] M. Jackson, The meaning of requirements, Annals of Software Engineering 3 (2010), no. 1, 5-21.
[2] A. Van Lamsweerde, Requirements engineering: from system goals to UML models to software specifications, Vol. 3,
Wiley, 2009.

99

[12] S. Yan and C.H. Tang, A heuristic approach for airport gate
assignments for stochastic flight delays, European journal
of operational research 180 (2007), no. 2, 547-567.
[13] I.J. Jureta, A. Borgida, N.A. Ernst, and J. Mylopoulos,
Techne: Towards a new generation of requirements modeling languages with goals, preferences, and inconsistency
handling, IEEE (2010), 115-124.

A new model for solving capacitated facility location problem with


overall cost of losing any facility and comparison of Particle Swarm
Optimization, Simulated Annealing and Genetic Algorithm
Samirasadat jamali Dinan

Fatemeh Taheri

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

smrjamali@yahoo.com

Ft.taheri@gmail.com

Farhad Maleki

M. E. Shiri

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

maleki.farhad@gmail.com

shiri@aut.ac.ir

Abstract: Facility location problems arise in a wide variety of practical settings. In this paper we propose a new formulation for the capacitated facility location problem, a development of the general framework that includes the amount of risk for each facility if the other facilities cannot serve its customers. The new formulation is evaluated with three meta-heuristic algorithms, Genetic Algorithm, Particle Swarm Optimization and Simulated Annealing, and finally some numerical examples are provided to show the performance of these algorithms in solving the new problem formulation.

Keywords: Capacitated Facility Location Problem; Genetic Algorithm; Particle Swarm Optimization; Simulated Annealing

Introduction

The facility location problem is a classic combinatorial optimization problem: it determines which of N capacity-constrained facilities should be used, and where, to satisfy the demand of M customers at the lowest sum of fixed and variable costs. The problem is formulated as in Khumawala (1974). Structural properties of the location problems treated here have been studied by, e.g., Leung and Magnanti (1989), Cornuejols, Sridharan and Thizy (1991), Aardal (1992), and Aardal, Pochet and Wolsey (1995), as well as by (Harkness and ReVelle, 2003; Drezner et al., 2002; Canel and Das, 2002; Nozick, 2001; Canel et al., 1996, 2001; Melkote and Daskin, 2001; Giddings et al., 2001; Canel and Khumawala, 1996; Hinojosa et al., 2000; Tragantalerngsak et al., 2000; Avella et al., 1998; Owen and Daskin, 1998; Volgenant, 1996). Consequently, there is now a variety of approaches for solving these problems. The most well known of the general heuristic methods are Particle Swarm Optimization (PSO), Simulated Annealing (SA), and Genetic Algorithms (GA). The popularity of these heuristics has flourished in recent years, and several published studies can be found in the literature where they outperform their tailored counterparts. However, only a few studies provide in-depth comparisons of these three heuristics. In this paper, we compare the relative performance of PSO, SA and GA on the capacitated facility location problem (CFLP). The choice of CFLP is made due to its strategic importance in the design of the supply chain network. Our motivation is to contribute further to the understanding of which of these three heuristics may
be more effective under different circumstances. In the remainder of the paper, we briefly provide references to the pertinent FLP literature. We then present the new formulation of the capacitated facility location problem (CFLP) and the benefit of using it, discuss solving this problem with meta-heuristic methods together with the details of the implementation of PSO, SA and GA, and finally describe the empirical comparison.

Problem Statement

The capacitated facility location problem is a type of FLP with capacity restrictions; the natural extension of the problem is to allow one type of facility with a capacity restriction. Consider locating a number of facilities of the same type at several sites (locations). If a site is selected, a fixed setup cost occurs, which is independent of the facilities installed in it; however, in real life the purchasing price often depends on the purchasing size. It is important to note that the representation selection (matrix vs. vector) as well as the parameter values for PSO, SA and GA described below were determined based on the results of extensive pilot experiments and testing for the problem. In these experiments, the effect of various parameter settings on solution quality and computation time was assessed for each of the meta-heuristics, and the parameter values were set accordingly. The representation selection and parameter values used for the three heuristics are shown in Table 1. In this section, we describe the solution procedure of CFLP with the three heuristics.

Genetic Algorithm:            Mutation Rate 0.05, Crossover 0.85, Iteration 100, POP 40
Simulated Annealing:          1000, Accept Rate 0.09
Particle Swarm Optimization:  Iteration 100

Table 1: Representation selection and parameter values.

We start by giving the mathematical formulation of a general model for capacitated facility location problems. The general model is formulated as in Khumawala (1974) and is defined by:

$$Z = \min \sum_{k\in K}\sum_{j\in J} c_{kj}x_{kj} + \sum_{j\in J} f_j y_j \qquad (1)$$

subject to

$$\sum_{j\in J} x_{kj} = 1 \quad \forall k\in K \qquad (D)$$
$$\sum_{j\in J} s_j y_j \ge \sum_{k\in K} d_k \qquad (T)$$
$$\sum_{k\in K} d_k x_{kj} \le s_j y_j \quad \forall j\in J \qquad (C)$$
$$x_{kj} - y_j \le 0 \quad \forall j\in J,\ k\in K, \qquad 0 \le x_{kj} \le 1, \quad 0 \le y_j \le 1 \qquad (B)$$
$$y_j \in \{0,1\}$$

where K is the set of customers and J the set of potential plant locations; $c_{kj}$ is the cost of supplying customer k's demand $d_k$ from location j; $f_j$ is the fixed cost of operating facility j and $s_j$ its capacity if it is open; the binary variable $y_j$ equals 1 if facility j is open and 0 otherwise; finally, $x_{kj}$ denotes the fraction of customer k's demand met from facility j. The constraints (D) are the demand constraints and the constraints (C) are the capacity constraints. The aggregate capacity constraint (T) and the implied bounds (B) are superfluous; they are, however, usually added in order to sharpen the bound if Lagrangean relaxation of constraints (C) and/or (D) is applied. Without loss of generality it is assumed that $c_{kj}\ge 0\ \forall k,j$, $f_j\ge 0\ \forall j$, $s_j>0\ \forall j$, $d_k\ge 0\ \forall k$, and $\sum_{j\in J}s_j \ge \sum_{k\in K}d_k$. Lagrangean relaxation approaches for the CFLP relax at least one of the constraint sets (D) or (C).

New formulation of CFLP with cost risk

In the new formulation of CFLP we add the overall cost of risk, calculated for each facility:

$$Z = \min \sum_{k\in K}\sum_{j\in J} c_{kj}x_{kj} + \sum_{j\in J} f_j y_j + \sum_{j\in J} R_j \qquad (2)$$
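To make the model concrete, the following is a minimal Python sketch (illustrative only, not the authors' implementation; the names `c`, `f`, `R`, `s`, `d`, `x`, `y` simply mirror the symbols above) that evaluates the risk-augmented objective (2) for a candidate solution and checks constraints (D) and (C):

```python
# Illustrative sketch (not from the paper): evaluate objective (2) and check
# constraints (D) and (C) for a candidate CFLP solution.
def evaluate(c, f, R, s, d, x, y, tol=1e-9):
    """c[k][j]: supply costs, f[j]: fixed costs, R[j]: risk costs,
    s[j]: capacities, d[k]: demands, x[k][j]: served fractions, y[j]: open flags."""
    K, J = len(d), len(f)
    z = sum(c[k][j] * x[k][j] for k in range(K) for j in range(J))      # transport term
    z += sum(f[j] * y[j] + R[j] for j in range(J))                      # fixed + risk terms
    ok_D = all(abs(sum(x[k][j] for j in range(J)) - 1.0) <= tol
               for k in range(K))                                       # (D): demand fully assigned
    ok_C = all(sum(d[k] * x[k][j] for k in range(K)) <= s[j] * y[j] + tol
               for j in range(J))                                       # (C): capacity respected
    return z, ok_D and ok_C
```

Such a routine is only a fitness/feasibility evaluator; the meta-heuristics below would call it for every candidate solution they generate.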

According to $x_{kj}$, y and $d_k$ we can calculate the total amount of demand met by each facility; this vector is $q_j$. We also know the capacity of every facility, so from $s_j$ and $q_j$ we can calculate the remaining capacity of each facility after its demands are met. If, in the absence of a facility, we calculate how much of its demand cannot be satisfied by the remaining capacity of the other facilities, we obtain the risk of losing that facility. In this problem we assume a backup facility without capacity restriction that answers the remaining demands at an expensive transport cost ($2\max(c_{kj})$). We explain how to calculate $R_j$ with a sample.


If we assume the columns of $x_{kj}$ (one column per facility, each listing the fractions for the five customers) are:

facility 1: (0.5, 0.2, 0.3, 0.5, 0.7)
facility 2: (0, 0, 0, 0, 0)
facility 3: (0.1, 0.1, 0.1, 0.1, 0.1)
facility 4: (0, 0, 0, 0, 0)
facility 5: (0.3, 0.3, 0.2, 0, 0.1)
facility 6: (0.1, 0.4, 0.4, 0.4, 0.4)

y shows the open and closed facilities, and $d_k$ (the demand vector) is (5, 20, 10, ...).

$d_k$ has sum 45. The demand satisfied by facility 1 is $19.8 = 5\times0.7 + 20\times0.5 + 10\times0.3 + 4\times0.2 + 5\times0.5$, and doing the same for every facility gives the vector $q_j$.
The capacities of the facilities are $s_j = (20, 26, 10, 10, 30)$.
The remaining capacity of each facility, E, is (0.2, 6.1, 4.8, 15, 20).
The total remaining capacity is 26.7. If we lose facility 1, the other facilities cannot absorb its demands: $26.7 - 0.2 - 19.2 = 7.3$. If this value is negative, we have to satisfy those demands from the unrestricted backup facility at the expensive transport cost; in this way we can decide about the benefit or harm of each facility being open or closed.
Fig. 1 illustrates the performance of CFLP with risk and Fig. 2 illustrates the performance of CFLP without risk when there are 15 facilities and 20 customers.

Figure 1: CFLP with risk (best: 366029.8831, mean: 368287.3626)
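The shortfall check just described can be written as a short Python sketch (our own illustration of the paper's description, not the authors' code; the sign convention is flipped so that a positive shortfall means unmet demand, and the backup transport cost $2\max(c_{kj})$ is passed in as `c_max`):

```python
# Illustrative sketch of the risk check described above (not the authors' code).
def facility_risk(d, x_cols, s, c_max):
    """d: demand per customer, x_cols[j][k]: fraction of customer k served by
    facility j, s: capacities, c_max: the largest transport cost c_kj."""
    J = len(s)
    q = [sum(dk * xk for dk, xk in zip(d, x_cols[j])) for j in range(J)]  # demand served by j
    E = [s[j] - q[j] for j in range(J)]                                   # spare capacity of j
    risk = []
    for j in range(J):
        spare_others = sum(E) - E[j]          # capacity left in the other facilities
        shortfall = q[j] - spare_others       # demand they cannot absorb if j is lost
        risk.append(2 * c_max * shortfall if shortfall > 0 else 0.0)
    return q, E, risk
```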

Figure 2: CFLP without risk (best: 241945.5792, mean: 241945.63)

Empirical comparison

Time-limited results

For the time-limited evaluation, all three heuristics were allowed a maximum time of 200 s and the best solutions from each heuristic were noted. This approach evaluates the efficiency with which the three heuristics reach quality solutions over time. For CFLP, PSO gives the best results in terms of rapidly reaching low-cost solutions, followed by SA and GA, respectively.

Unrestricted results

When all the heuristics were allowed to finish their run according to their parameters, PSO gives the best results in terms of rapidly reaching low-cost solutions, followed by GA and SA, respectively. Figures 3-5 illustrate the performance of CFLP with PSO, GA and SA when there are 15 facilities and 10 customers.

Figure 3: CFLP with risk by PSO (best function value: 311372.8065)

Figure 4: CFLP with risk by GA (best: 210400.9005, mean: 211615.2837)

Figure 5: CFLP with risk by SA

References

[1] B. Fleischmann and A. Klose, Advanced solutions to practical problems, Wiley Publishing, Chapter 1, pages 1-10, 2005.
[2] Z. Drezner and H. W. Hamacher, Facility Location: Applications and Theory, Wiley Publishing, 2005.
[3] K. Aardal, Reformulation of capacitated facility location problems: How redundant information can help, Annals of Operations Research (1998), 289-308.
[4] M. A. Arostegui Jr., S. N. Kadipasaoglu, and B. M. Khumawala, An empirical comparison of tabu search, simulated annealing and genetic algorithms for facilities location problems, Elsevier Science Publishers, Houston, USA (2006).
[5] Z. Lu and N. Bostel, A facility location model for logistics systems including reverse flows: The case of remanufacturing activities, Computers and Operations Research 34 (2007), 299-323.
[6] J. Harkness and C. ReVelle, Facility location with increasing production costs, European Journal of Operational Research (2003), 1-13.


A hybrid method for collusion attack detection in OLSR based MANETs

Hojjat Gohargazi, Tarbiat Modares University (TMU), Faculty of Electrical and Computer Engineering, h.gohargazi@modares.ac.ir
Saeed Jalili, Tarbiat Modares University (TMU), Faculty of Electrical and Computer Engineering, sjalili@modares.ac.ir

Abstract: Due to the lack of infrastructure and routers, Mobile Ad hoc NETworks (MANETs) are vulnerable not only to external attacks but also to internal attacks launched by authorized nodes. The collusion attack is a prevalent attack against the Optimized Link State Routing (OLSR) protocol; in this attack two colluding malicious nodes prevent routes to a target node from being established. In this paper we propose a hybrid (One Class Classification (OCC) and Centroid) method for detecting the collusion attack. For this purpose we adapt OCC methods using a simple distance-based method called Centroid. The results show that this model increases the accuracy of discerning this attack.

Keywords: Anomaly detection; Collusion attack; OLSR; One class classification; MOG.

Introduction

MANETs are wireless networks with mobile nodes and without infrastructure. In these networks there is no dedicated router, so all nodes have to participate in the routing process. Because of this, routing protocols become the backbone of MANETs; however, such cooperation makes the network vulnerable to attacks launched by authorized malicious nodes.
The collusion attack [1] is one of the particular and severe attacks against MANETs based on the OLSR [2] protocol, in which a pair of attacker nodes collude and cooperate to prevent routes to a specific node from being established, so that the node becomes unreachable by the other nodes. Our method to detect this attack is based on a machine learning approach. We use an OCC method that is able to distinguish normal and abnormal behaviours. Also, considering the importance of the collusion attack, the classifier is adapted with a simple distance-based method, called Centroid, to better discern this attack. Another advantage of this method over related work is its ability to detect attacks similar to the Collusion Attack.
The rest of this paper is organized as follows: Section 2 discusses the related work. Section 3 gives an overview of the OLSR protocol and the collusion attack. The proposed method is presented in Section 4. Section 5 shows the experimental results, and finally Section 6 describes future work and concludes the paper.

2 Related work

In [1] the authors proposed a method to detect the collusion attack by including the 2-hop neighbourhood information of each node in its HELLO messages. The method tries to discern the attack based on contradictions in the topology information table; this lets a node obtain information about its 3-hop neighbourhood without the need for TC messages. Although this method can detect the attack, it is difficult to distinguish between topology changes and attacks, which increases false positives.
The method proposed in [3] incorporates an information-theoretic trust framework in OLSR. Nodes collaborate to calculate trust values for each

other. After a certain threshold, a node with a weak trust value is placed on the blacklist. This method needs extra storage to keep a trust value for each node, and furthermore the correctness of the cooperation of neighbour nodes affects the accuracy of the method. The authors in [4] proposed a simple attack-resistant method based on the fact that the Collusion Attack forces the Target to choose only one Multi Point Relay (MPR) node. According to this method, there should be more than one MPR in the MPR set of each node whenever the node has more than one 1-hop neighbour. Because a non-optimal set of MPRs is chosen, this method affects network performance by increasing the traffic overhead.
In [5] the Node Isolation Attack, which is similar to the Collusion attack, has been investigated. This attack involves a single attacker instead of a pair of attackers. In the detection phase of the method proposed in [5], the Target node observes its MPR to check whether it is generating TC messages that include the Target's link information. This method is not suitable for the Collusion Attack because the second attacker, which drops packets, may be outside the Target's range.
All of the methods mentioned above are based on changes in the protocol. They are also proposed against only a single attack and are not suitable even for similar ones. In the literature only [6] uses machine learning algorithms to detect attacks against OLSR, but the Collusion attack is not studied in it. That model is based on ensemble methods and uses a two-class classification algorithm (C4.5).

3 Background

3.1 Optimized Link State Routing

OLSR is one of the four standard routing protocols provided for MANETs. This protocol is proactive: the routes to all nodes are calculated periodically and maintained in the routing table of each node. OLSR is built on two types of messages, HELLO and TC. Every node broadcasts HELLO messages only to its 1-hop neighbourhood at 2-second intervals, including its link, neighbourhood and MPR information. Using the information collected from HELLO messages, each node selects a subset of its 1-hop neighbours called the MPR set. MPRs ensure the delivery of packets received from their selectors to all of their 2-hop neighbours.
After selecting MPRs and informing them of their selectors, every MPR generates and broadcasts TC messages every 5 seconds to propagate topology information across the network. Unlike HELLO messages, TC messages are forwarded and spread, but only by MPRs. Using the topology information obtained from these messages, every node calculates its routing table with a shortest-path computation algorithm.

3.2 Collusion Attack

In OLSR, routes are calculated from the information collected from TC messages. So, in the Collusion attack, the necessary condition to prevent routes to a node from being calculated is that the attacker node be the only MPR of the Target. Attacker1, one of the 1-hop neighbours of the Target, advertises all of the Target's 2-hop neighbours as its own neighbours in its HELLO messages. According to the MPR selection phase of the protocol, Attacker1 thus becomes the Target's MPR. Thereafter, Attacker1 selects Attacker2 as its only MPR. TC messages generated by the Target will be forwarded by Attacker1, and TC messages generated or forwarded by Attacker1 will be forwarded only by Attacker2. However, Attacker2 drops these messages instead of forwarding them. Since the TC messages of the Target and Attacker1 do not reach the other nodes, those nodes will not be able to create any route to the Target.

Proposed method

In this section we describe our method for detecting attacks (especially the Collusion Attack) against OLSR. First, a set of features is needed for collecting data samples. For this purpose we use 20 different features; 16 of them are taken from the features defined in [6] and the others are new ones. The features and their descriptions are listed in figure 1.

Figure 1: features (* features from [6])

Figure 2: Proposed method

As shown in figure 2, the proposed method consists of two phases, Training and Testing. These phases and their parts are discussed in the following.

4.1 Data Scaling

Many OCC methods are sensitive to data scaling, so it matters how the data are scaled. Assuming $X = \{x_1, x_2, ..., x_n\}$ are the data samples, the scaling method we used is
$$x_i^s = (x_i - \mu_T)/\sigma_T \quad \forall i$$
in which $\mu_T$ and $\sigma_T$ are the mean and standard deviation of the training data, respectively.

4.2 Centroid method

This method is proposed to adapt the output of the OCC method to detect the Collusion attack. Assume that $\mu_T$ and $\mu_A$ are the mean of the normal data (i.e., data collected from the network in the absence of any attack) and the mean of the attack data (i.e., data collected during the Collusion attack) used for training. The relative distance of a data sample $x_i$ is then calculated as
$$RD_i = \|x_i - \mu_T\| / \|x_i - \mu_A\| \qquad (1)$$
RD shows how close the sample $x_i$ is to the attack status compared with the normal status; the higher the value of RD, the higher the probability that the Collusion Attack has occurred. As shown in figure 2, a part of this method is performed in Training and a part in Testing.

4.3 Testing and Voting

Combining is performed in the Testing phase. After learning a model and calculating $\mu_T$ and $\mu_A$ in the Training phase, when a sample $x_i$ arrives, to test whether it is normal or attack, first its distance to the model learned by the OCC method is computed ($D_i$). Then, according to equation 1, $RD_i$ is calculated to determine the probability of it being a Collusion Attack. Finally, these two values are combined with a voting mechanism. We use two simple voting functions, mean and maximum, defined as
$$y = \mathrm{mean}(D_i, RD_i)$$
$$y = \max(D_i, RD_i)$$
The experimental results in the next section show that the mean function works better than the maximum.

5 Experiment and Results

To validate our model we simulated a MANET in Network Simulator 2 (NS2) to collect normal and attack datasets. The simulation parameters are as follows:

Number of nodes: 50
Simulation time: 3000 s
Area: 1000 m x 1000 m
Mobility model: RWP
Traffic type: CBR

The simulation was run with six different movement patterns for both the normal and the attack situations. The data gathered from the simulations formed six datasets for each situation; two of them were used for training and the others were used to test the method.
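As an illustration of the pipeline above, the scaling step, the Centroid relative distance of equation (1) and the mean-voting combination can be sketched as follows. This is our own minimal sketch, not the authors' implementation; `occ_distance` stands for whatever distance the chosen OCC model (here MOG) reports, and whether RD is computed on scaled or raw features is our assumption.

```python
import numpy as np

# Illustrative sketch (not the authors' code) of scaling, the Centroid
# relative distance of equation (1), and the mean-voting combination.
def fit_statistics(train_normal, train_attack):
    mu_T = train_normal.mean(axis=0)              # mean of normal training data
    sigma_T = train_normal.std(axis=0) + 1e-12    # avoid division by zero
    mu_A = train_attack.mean(axis=0)              # mean of attack training data
    return mu_T, sigma_T, mu_A

def combined_score(x, mu_T, sigma_T, mu_A, occ_distance):
    """occ_distance: distance of the scaled sample to the learned OCC model."""
    xs = (x - mu_T) / sigma_T                     # data scaling (Section 4.1)
    D = occ_distance(xs)                          # OCC output D_i
    RD = np.linalg.norm(x - mu_T) / np.linalg.norm(x - mu_A)   # equation (1)
    return float(np.mean([D, RD]))                # mean voting; max(D, RD) is the alternative
```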
To show the effect of the proposed model, two measures are used. The Receiver Operating Characteristic (ROC) curve shows the contrast between the Detection Rate (DR) and the False Alarm Rate (FAR): DR is the ratio of detected attack data to all attack data, and FAR is the ratio of normal data detected as attack to all normal data. The other measure is the Area Under the Curve (AUC), which shows the overall superiority of one ROC over another.
We tested our method with an OCC method, Mixture of Gaussians (MOG). This method is not naturally an OCC method, but in [7] it is defined and used in the same way as OCC methods. The results of applying this method purely and in combination are shown in figure 3. Since a FAR higher than 20% is unacceptable, to better represent the effect of the method, in this paper the ROC is drawn only up to FAR equal to 20%. As can be seen, combining MOG with Centroid using the mean function is better than combining with the maximum function or using MOG purely. Figure 4 compares the AUC of the ROCs shown in figure 3.
The ROC is a curve based on a threshold. In this study the threshold is used to decide whether the output value y indicates a Collusion attack or not. Selecting a threshold is therefore a trade-off between DR and FAR: the closer the threshold is to one, the higher DR becomes, but FAR goes up too, while decreasing the threshold towards zero decreases both DR and FAR. As shown in figure 5, selecting 0.60221 as the threshold results in a DR of 78% with a FAR of 10% for MOG-Centroid combined with the mean function. Finally, to reveal the advantage of the model in identifying attacks similar to the Collusion attack, figure 6 compares the ROC curves of the methods applied to Node Isolation attack data.

Figure 6: ROC curves for Node Isolation attack by three methods

6 Conclusions and future work

In this paper we propose a model to adapt OCC methods to detect the Collusion attack against OLSR. For this purpose we used MOG as the OCC method and defined the Centroid method to adapt it. In addition to detecting the Collusion attack, the other advantage of this model is its ability to diagnose attacks similar to the Collusion attack. The results show that the proposed model performs well when the mean function is used for combining. In future work we will focus on defining more features to represent the behaviour of OLSR and on detecting more attacks against this protocol. Defining a method stronger than Centroid can also be future work.

Figure 3: ROC curves for detecting Collusion attack by three methods

Figure 4: Comparing AUC of ROCs in figure 3

Acknowledgement

This research has been supported in part by the Iran Telecommunication Research Center (ITRC).

Figure 5: Selecting the threshold (threshold = 0.60221 gives DR = 78% and FAR = 10%); a higher threshold causes higher DR and FAR, and vice versa

References

[1] B. Kannhavong et al., A Collusion Attack Against OLSR-based Mobile Ad Hoc Networks, Global Telecommunications Conference, GLOBECOM 06, IEEE, 2006, pp. 1-5.
[2] T. Clausen and P. Jacquet, RFC 3626 - Optimized Link State Routing Protocol (OLSR), IETF RFC 3626 (2003), 1-75.
[3] M.N.K. Babu et al., On the prevention of collusion attack in OLSR-based Mobile Ad hoc Networks, Networks, ICON 2008, 16th IEEE International Conference on, 2008, pp. 1-6.
[4] P.L. Suresh et al., Collusion attack resistance through forced MPR switching in OLSR, Wireless Days (WD), 2010 IFIP, 2010, pp. 1-6.
[5] B. Kannhavong et al., A study of a routing attack in OLSR-based mobile ad hoc networks, International Journal of Communication Systems 20 (2007), no. 11, 1245-1261.


[6] J.B.D. Cabrera et al, Ensemble methods for anomaly detection and distributed intrusion detection in Mobile Ad-Hoc
Networks, Information Fusion 9 (2008), no. 1, 96 - 119.


[7] D.M.J. Tax, One-class classification; concept-learning in the absence of counter-examples, Ph.D. thesis, Delft University of Technology (2001).

A Statistical Test Suite for Windows to Cryptography Purposes


R. Ebrahimi Atani, Department of Computer Engineering, Faculty of Engineering, Guilan University, rebrahimi@guilan.ac.ir
N. Karimpour Darav, Department of Computer Engineering, Faculty of Engineering, Lahijan branch, Islamic Azad University, karimpour@liau.ac.ir
S. Arabani Mostaghim, Department of Computer Engineering, Faculty of Engineering, Guilan University, saideh arabani@yahoo.com

Abstract: Encryption has been considered a precious technique to protect information against unauthorized access, and analytical methods have been developed to evaluate cryptographic algorithms. The analysis of statistical tests is one of the methods used by the National Institute of Standards and Technology (NIST). This article introduces a software tool, implemented using the C and JAVA programming languages, for cryptographic purposes.

Keywords: Cryptography; Statistical Tests; Pseudo Random Number Generator (PRNG); JNI

Introduction

Encryption plays a significantly important role in protecting information against unauthorized access, and the use of random numbers in cryptographic applications is increasing notably [1]. For example, keys are generated by utilizing random number generators in order to prevent attackers from guessing them. Hence, generating random numbers is a serious problem, which is addressed by applying random number generators. However, evaluating their quality is far from straightforward and needs some analytical manipulation. For this purpose NIST [2] has provided a set of statistical tests applied to the output of implemented Random Number Generators (RNGs); their results are taken into account as a benchmark for selecting a generator for the desired application [2]. There are also many other applications in which such statistical analysis can be used [1],[2]. Our tool utilizes the C and JAVA [9] programming languages and can be run under the Windows operating system. By exploiting the JNI [7] technique, the C and JAVA programming languages are intertwined to take advantage of the features of both languages.

Random Number Generators

Truly random numbers can only be generated by random physical phenomena. Mouse motions or the timing of keyboard key presses can serve as such phenomena and can be used as RNGs. The use of mathematical functions is another method for generating random numbers in computer systems; these are called Pseudo-Random Number Generators (PRNGs). The functions that generate random numbers and are used in our tool include:

Linear Congruential Generator (LCG): An LCG produces a pseudo-random sequence of numbers $x_1, x_2, x_3, ..., x_n$ based on the equation [5]:


$$x_{i+1} = a\,x_i + b \mod m, \quad i \ge 0 \qquad (1)$$
The parameters of the equation are [2]: a depends on the current state, $x_0 = 2^{31}$, and the constants b and m are $b = 0$, $m = 2^{31} - 1$.

Quadratic Congruential Generator (QCG): A QCG produces a pseudo-random sequence of numbers $x_1, x_2, x_3, ..., x_n$ from the equation [5]
$$x_{i+1} = a\,x_i^2 + b\,x_i + c \mod m, \quad i \ge 0 \qquad (2)$$
According to the constants, the QCG is divided into two sections [2].
For section 1: $a = 1$, $b = c = 0$ and m is a 512-bit prime:
$$x_{i+1} = x_i^2 \mod m, \quad i \ge 0 \qquad (3)$$
For section 2: $a = 2$, $b = 3$, $c = 1$ and $m = 2^{512}$:
$$x_{i+1} = 2x_i^2 + 3x_i + 1 \mod m, \quad i \ge 0 \qquad (4)$$

Cubic Congruential Generator (CCG): This generator produces random numbers by use of a cubic equation [17]:
$$x_{i+1} = a\,x_i^3 + b\,x_i^2 + c\,x_i + d \mod m \qquad (5)$$
With the parameters $a = 1$, $b = c = d = 0$, $m = 2^{512}$, the recurrence applied by the CCG is [2]
$$x_{i+1} = x_i^3 \mod 2^{512}, \quad i \ge 0$$

Exclusive OR Generator: This generator produces random numbers through the recurrence equation [1]
$$x_i = x_{i-1} \oplus x_{i-127}, \quad i \ge 128 \qquad (6)$$
It should be noted that the input parameter (seed) is a 127-bit variable.

Modular Exponentiation Generator: To produce a 512-bit sequence, this generator uses the equation [6]
$$x_{i+1} = a^{y_i} \mod m, \quad i \ge 1$$
where a and m are constant numbers and $y_i$ is a variable [2].

Secure Hash Generator: A Secure Hash Generator (SHA) [18] produces a sequence $x_i$ of b-bit length, where $160 \le b \le 512$.

Blum-Blum-Shub Generator: The output of this generator is based on three independent parameters. Two parameters p and q are prime numbers, whereas the third parameter r is selected at random in the interval $[1, pq - 1]$ such that $\gcd(r, pq) = 1$. Since the parameters p, q and s are time dependent, users are not able to guess or reproduce the same sequences [1].

Micali-Schnorr Generator: The input parameters of this generator are also time dependent; in other words, by changing the state, the input parameters change [1]. Two prime numbers p and q are chosen, and a parameter e is selected such that $1 < e < \phi$, $\phi = (p-1)(q-1)$, $\gcd(e, \phi) = 1$, $n = pq$, $80e \le N = \lfloor \lg n \rfloor + 1$ and $k = \lfloor N(1 - 2/e) \rfloor$. The algorithm of this generator is described in [1].

Set of Statistical Tests

A package of statistical tests is proposed by NIST [2] as a criterion to evaluate the quality of an RNG (or PRNG) and its appropriateness for one or more applications. If a sequence successfully passes all 15 tests, it does not mean that the result is exactly correct, but with high probability the sequence can be accepted. The package consists of the following 15 tests:

Frequency Test: in this test the number of 0s and 1s in a sequence is computed and compared with the expected result. Furthermore, in this test $\chi^2$ is computed from the equation [12],[1]:
$$\chi^2 = \frac{(n_0 - n_1)^2}{n} \qquad (7)$$
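As a concrete illustration of the two pieces above, the following is our own minimal Python sketch, not the NIST reference code: an LCG with the constants stated for equation (1), and the frequency statistic of equation (7). The multiplier value and the use of the low-order bit of each state are illustrative assumptions, since the paper only says that a depends on the current state.

```python
# Minimal sketch (not the NIST reference code): an LCG with the constants above
# and the frequency statistic of equation (7).
def lcg_bits(a, x0=2**31, b=0, m=2**31 - 1, nbits=1000):
    """Generate nbits bits from x_{i+1} = a*x_i + b mod m (low-order bit of each state)."""
    x, bits = x0, []
    for _ in range(nbits):
        x = (a * x + b) % m
        bits.append(x & 1)
    return bits

def frequency_chi2(bits):
    """Equation (7): chi^2 = (n0 - n1)^2 / n."""
    n1 = sum(bits)
    n0 = len(bits) - n1
    return (n0 - n1) ** 2 / len(bits)

print(frequency_chi2(lcg_bits(a=16807)))   # 'a' chosen only for illustration
```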

Frequency Test within a Block: in this test the entire sequence is divided into blocks of K-bit length. The number of blocks is $N = \lfloor n/K \rfloor$, so $N\cdot K \le n$ and the last $n - N\cdot K$ bits are neglected. The test then computes the number of 1s ($n_1$) and 0s ($n_0$) in each block. Finally, if $n_1$ (or $n_0$) is close to the expected value ($K/2$), we can say that the entire sequence is approximately random. Note that for $K = 1$ this test reduces to the frequency test [13].
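A minimal sketch of the block-frequency idea just described (our own illustration; the pass criterion against K/2 is left to the caller):

```python
# Illustrative sketch of the block-frequency idea described above.
def block_frequency(bits, K):
    N = len(bits) // K                   # number of blocks; the trailing bits are ignored
    return [sum(bits[b * K:(b + 1) * K]) / K for b in range(N)]   # ones-fraction per block
```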


Runs Test: let s be the entire sequence of 0s and 1s. A maximal repetition of 0s (or 1s) is called a run of the sequence. If the number of runs in s is the same as expected for a random sequence, the test concludes that the entire sequence is random [14].

Longest Run of Ones: in this test the length of the longest run of 1s is examined; in other words, the maximum length of a run of 1s is computed and then compared with the expected value [14].

Binary Matrix Rank Test: in this test the rank of disjoint sub-matrices of the sequence is considered [11].

Discrete Fourier Transform Test: this test searches the sequence for periodic patterns of bits; in other words, its purpose is to compute the number of repetitions of patterns in the sequence, and the number of repetitions determines the peak heights [3],[4].

Non-overlapping Template Matching Test: there are templates that repeat non-periodically in a sequence. In this test the number of occurrences of these templates in the entire sequence is computed; this number shows whether the entire sequence is random or not [2].

Overlapping Template Matching Test: this test is the same as the former test, except that the examined subsequences overlap each other [2].

Maurer's Test: in this test the number of bits in subsequences of the entire sequence that exactly match what would be expected for a random sequence is considered [2].

Linear Complexity Test: this test is based on Linear Feedback Shift Registers (LFSR). The length of the LFSR is considered; for instance, if the length of the LFSR is too short, it indicates that the entire sequence is non-random [3].

Serial Test: the repeating m-bit subsequences of the sequence s are counted. From these subsequences $\chi^2$ can be computed as [10]
$$\chi^2 = \frac{2^m}{n}\sum_{i\in Z^m}\left(n_i - \frac{n}{2^m}\right)^2 - \frac{2^{m-1}}{n}\sum_{i\in Z^{m-1}}\left(n_i - \frac{n}{2^{m-1}}\right)^2 \qquad (8)$$

Approximate Entropy Test: for this test the number of overlapping k-bit templates in the entire sequence is computed. In fact, this test compares the k-bit and (k+1)-bit subsequences with those of a truly random sequence [15].

Cumulative Sums (Cusum) Test: in this test the sequence is first converted to the digits -1 and 1, and then the maximum of the partial sums $S_k = \sum_{i=0}^{k} (2x_i - 1)$, for $k = 0..n$, is computed. If the maximum of the sums is too large, the sequence cannot be random [2].

Random Excursions Test: for this test eight states (-4, -3, -2, -1, 1, 2, 3, 4) are considered. A cycle in the sequence s begins at 0 and ends at 0. If the number of visits to each state within the cycles is the same as expected, the entire sequence can be considered random [16].

Random Excursions Variant Test: in this test eighteen states are considered and, as in the two previous tests, the cumulative sums are computed; however, in this test the total number of visits to a state within a cycle is what matters [16].

Implementation of the Tool by JNI

In this program an application has been designed for cryptographic purposes using the C and JAVA programming languages. Its Graphical User Interfaces (GUIs) were designed so that a user can easily interact with it; in other words, it is user friendly, and it was designed for Windows operating systems. Since JAVA is a powerful programming language with a rich GUI library [8] and tools [9], we decided to reap its benefits.

Java Native Interface: JNI is one of the best features of the Java programming language and a powerful technique in JAVA. Application programs that use this technique can combine native code written in other programming languages such as C and C++ and utilize it as part of a Java program [7].

In our work we use the JNI technique. First we create a Dynamic Link Library (DLL) [7] file extracted from code written in the C programming language. This code is open source and published by NIST [2].


It can be converted to a DLL file by applying some changes and compiling it with the Visual Studio compiler. The user interface has been designed in Java, using the JNI technique to connect the GUIs to the main core code. As shown in figure 1, in the generator window the generating algorithm must be selected, and then in the next step one or more tests should be selected. When the program ends successfully, a window like the one shown in figure 3 appears.

Figure 1: Generators view
Figure 2: Tests view
Figure 3: Successful finished view

Interpretation of the Output of the Program

Finally, the output of the program is stored in text files: two files, results and states, are created for each test selected by the user. The result.txt file includes the p-values obtained by the test; in other words, the results of the test(s) are p-values from which the user can decide which PRNG is more suitable for an application. The states.txt file contains information provided by each test, and the information stored in this file is specific to each test. FinalAnalysisReport.txt holds brief, summarized information about the tests. The result files can be interpreted according to what is described in the previous sections (see section 3).

Conclusion

In this paper we have used the C and JAVA programming languages to produce an efficient application program for cryptographic purposes. The program consists of 9 mathematical algorithms that produce pseudo-random numbers and 15 statistical tests proposed by the U.S. NIST. These generators and tests were introduced by NIST using the ANSI C programming language. The graphical user interface has been written in the Java programming language using the JNI technique. One of the features of this program is that it can be run on Windows operating systems with highly user-friendly capabilities.

References
[1] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, Inc., Chapter 5, pages 169-190, Chapter 9, pages 321-348, June 1997.
[2] Andrew Rukhin, Juan Soto, James Nechvatal, Miles Smid,
Elaine Barker, Stefan Leigh, Mark Levenson, Mark Vangel,
David Banks, Alan Heckert, James dray, and San Vo, A
Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications: Reports on
Computer Systems Technology, NIST,U. S (April,2010),
(available from: www.csrc.nist.gov).
[3] J. L. Massey and S. Serconek, A fourier transform approach to the linear complexity of nonlinearly filtered sequences: Advances in Cryptology-CRYPTO, Lecture Notes
in Computer Science (1994), 332-340.
[4] J. Stern, Secret linear congruential generators are not cryptographically secure, Proceedings of the IEEE 28th Annual
Symposium on Foundations of Computer Science (1987),
421-426.


[5] H. Krawczyk, How to predict congruential generators, Journal of Algorithms (1992), 527-545.


[6] S. M. Hong, S.Y. Oh, and H. Yoon, New modular multiplication algorithms for fast modular exponentiation, Advances
in CryptologyEUROCRYPT (1996), 166-177.
[7] Sheng. Liang, The JavaT M Native Interface Programmers
Guide and Specification, ADDISON-WESLEY, Jun,1999.

[13] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables, NBS Applied Mathematics Series-55 (available from: http://people.math.sfu.ca/~cbm/aands/toc.htm), December 1972.

[8] H. M. Deitel and P. J. Deitel, The JavaT M How to program


JAVA, Prentice Hall, august,2004.

[14] Anant P. Godbole and Stavros G. Papastavridis, Runs and


patterns in probability, Selected papers (1994).

[9] www.java.sun.com.

[15] A. Rukhin, Approximate entropy for testing randomness,


Journal of Applied Probability 37 (2000).

[10] I.J. Good, The serial test for sampling numbers and other
tests for randomness, Proceedings of the Cambridge Philosophical Society (1953), 276-284.

[16] M. Baron and A. L. Rukhin, Distribution of the Number of


Visits For a Random Walk: Communications in Statistics,
Stochastic Models 15 (1999), 593-597.

[11] I. N. Kovalenko, Distribution of the linear rank of a random matrix, Theory of Probability and its Applications 17
(1972), 342-346.

[17] J. Eichenauer-Herrmann and E. Herrmann, Compound cubic congruential pseudorandom numbers, Computing 59
(1997), 85-90.

[12] Kai Lai Chung and Farid AitSahlia, Elementary Probability


Theory: with Stochastic Processes and an Introduction to
Mathematical Finance, Springer-Verlag New York, Februry
14, 2003.

[18] ANSI X9.30 (PART 2), Public Key Cryptography Using Irreversible Algorithms for the Financial Services Industry:
The Secure Hash Algorithm 1(SHA-1), ASC X9 Secretariat
American Bankers Association (1993).


An Empirical Evaluation of Hybrid Neural Networks for Customer Churn Prediction
Razieh Qiasi, University of Qom, Qom, Iran, Department of Information Technology, raziehghiasi@gmail.com
Zahra Roozbahani, University of Shahid Beheshti, Tehran, Iran, Department of Computer Science, roozbahani2@gmail.com
Behrooz Minaei-Bidgoli, University of Science and Technology, Tehran, Iran, Department of Computer Engineering, minaeibi@cse.mcu.ed

Abstract: Customer churn has become a critical issue, especially in the competitive and mature
telecommunication industry. From economic and risk management perspective, it is important to
understand customer characteristics in order to retain customers. However, few studies have used
hybrid modeling for churn prediction. The main contribution of this paper is to use hybrid neural
networks for churn prediction. The experimental results show that the hybrid model performs better
than single neural network model.

Keywords: churn; customer retention; hybrid data mining; neural networks.

Introduction

As the new markets are developed, competition between companies increases sharply. Since the competition gets hard and telecommunication becomes a selling product, companies encounter to minimize costs,
add value to their services, and guarantee differentiation. Now, the customers can choose their service
providers, so companies pay attention to customer care
in order to keep their position in the market. Under
the hard conditions of competition, companies try to
focus on customers behaviors. Base on the needs of
customers, telecommunication companies decide their
service offers, give a shape to their communication network and in addition change their organizational structure [1]. If a customer ends doing business with a
provider, and join another one, the customer is called a
churner. Churn is a major problem for companies with
many customers, like credit card providers or insurance
companies. In telecommunication industry, the sharp

increase of competition makes customer churn a great


concern for the providers [2]. As In the wireless telephone industry, annual churn rates have been reported
to range from 23.4% [3] to 46% [4]. Churn is closely
related to the retention concept, representing the opposite effect: churn = 1- retention. While the focus
of the retention investigation is to find out why customers stay, churn focuses on the reasons a customer
may leave. In order to effectively manage customer
churn for companies, it is important to build a more effective and accurate customer churn prediction model.
Statistical and data mining techniques are useful to create the prediction models. This paper also focuses on
the use of data mining to predict customer churn. Customer churn prediction models aim to detect customers
with a high propensity to attrite. An accurate segmentation of the customer base allows a company to target
the customers that are most likely to churn in a retention marketing campaign, which improves the efficient
use of the limited resources for such a campaign. Many
studies examined different data mining techniques to


predict customer churn. Researchers showed that hybrid data mining models can improve the performance
of the single clustering or classification techniques individually. In particular, they are composed of two
learning stages [5]. Nevertheless, few studies examine
the performance of hybrid data mining techniques for
customer churn prediction. Therefore, this paper uses
hybrid neural network in order to improve the accuracy
of prediction models. The rest of the paper is organized
as follows. The definition of churn and the summary
of the studies are introduced in Section 2. The data
which is used in the research is described in Section 3,
and the modeling process based on neural network is
presented in Section 4. The conclusion of this paper is
represented in Section 5.

Literature Review

Many highly competitive organizations have understood that retaining existing and valuable customers
is their core managerial strategy to survive in industry. This leads to the importance of churn management. Customer churn means that customers are intending to move their custom to a competing service
provider. Many studies have discussed customer churn
management in various industries, especially in mobile
telecommunications. In order to understand how related work constructs their prediction models, this paper reviews some of the current related studies. ShinYuan Hung et al. (2006) [?6] used decision tree and
neural network techniques for predicting wireless service churn. They understood that both decision tree
and neural network techniques can deliver accurate
churn prediction models. John Hadden et al. (2007)
[7] reviewed some of the most popular technologies that
have been identified for the development of a customer
churn management platform. Kristof Coussement and
Dirk Van den Poel (2008) [8] compared three classification techniques Logistic Regression, Support Vector
Machines and Random Forests to distinguish churners from non-churners. Their reviews show that Random Forests is a viable opportunity to improve prediction performance compared to Support Vector Machines and Logistic Regression which both exhibit an
equal performance. Elen Lima et al. (2009) [9] show
how domain knowledge can be incorporated in the data
mining process for churn prediction, viz. through the
evaluation of coefficient signs in a logistic regression
model, and secondly, by analyzing the decision table
(DT) extracted from a decision tree or rule-based classifier. Dulijana Popovi and Bojana Dalbelo Bai (2009)
[10] presented a model based on fuzzy methods for
churn prediction in retail banking. B.Q. Huang et al.

(2010) [11] In order to improve the prediction rates of


churn prediction in land-line telecommunication service
field, this paper proposes a new set of features with
three new input window techniques. For evaluating
these new features and window techniques, the three
modeling techniques (decision trees and multilayer perceptron neural networks and support vector machines)
are selected as predictors. Their results show that the
new features with the new window techniques are efficient for churn prediction in land-line telecommunication service fields. Afaq Alam Khan et al (2010) [12]
identified the best churn predictors on the one hand
and are evaluated the accuracy of different data mining techniques on the other in ISP industry in I.R.Iran.
Clustering users by their usage features and incorporating cluster membership information in classification
models is another aspect which has been addressed in
this study. V. Vijaya Saradhi, Girish Keshav Palshikar
(2011) [13] reviewed different methods proposed to predict customer churn and have provided a predictive
model for employ churn problem. Pnar Kisioglu, Y.
Ilker Topcu (2011) [1] constructed a model by Bayesian
Belief Network to identify the behaviors of customers
with a propensity to churn in telecommunication industry. According to the results of Bayesian Belief Network, average minutes of calls, average billing amount,
the frequency of calls to people from different providers
and tariff type are the most important variables that
explain customer churn.Guangli Nie et al. (2011) [14]
provided a model to predict customer churn with applying two techniques (logistic regression and decision
tree) using credit card data. The test result shows that
regression performs a little better than decision tree.

3
3.1

Data
Reactive Agents

In this paper we used CRM dataset provided by American telecom companies, which focuses on the task
of customer churn prediction. Database contained a
churn variable signifying whether the customer had left
the company two months after observation or not, and
a set of 75 potential predictor variables which has been
used in a predictive churn model. For the purpose of
this paper 4,000 records are randomly selected that
with ratio 9 to 1 are divided into two test data set and
train data set.


3.2 Noise Reduction

Noise is irrelevant information that causes problems for the subsequent processing steps; therefore, noisy data should be removed. Noise can be removed by finding its location and substituting correct values. For instance, correct values are used to replace incorrect ones; missing values identified by NULL or blank spaces can be replaced by neutral values; only one copy of duplicated data is kept and the others are removed; and outliers can be removed with anomaly detection models.

3.3 Normalization

Normalization changes the scale of the data so that the values map to a small, finite range such as [-1, 1]. Normalization can be done in various ways, such as min-max normalization, Z-score normalization and so on. In this study, the min-max method is used for normalization.

4 Modeling

4.1 Multi-Layer Feed Forward Neural Network (MLFF)

MLFF is one of the most common NN structures, as such networks are simple and effective and have found a home in a wide assortment of machine learning applications. MLFFs are feed-forward NNs trained with the standard back-propagation algorithm. The multi-layer feed forward neural network architecture is shown in fig. 1. These networks have been shown to yield accurate predictions in difficult problems [15].

Figure 1: Multi-layer feed forward neural network architecture

4.2 Combined Neural Network Models

Combining neural network models often results in a prediction accuracy that is higher than that of the individual models. This construction is based on a straightforward approach that has been termed stacked generalization. The stacked generalization concept was formalized by Wolpert [16] and refers to schemes for feeding information from one set of generalizers to another before forming the final predicted value (output). The unique contribution of stacked generalization is that the information fed into the net of generalizers comes from multiple partitionings of the original learning set [17],[18].

4.3 Application of the Combined Neural Network Model

A combined neural network topology is used for the detection of customer churn. The network topology in the first level is an MLPNN with a single hidden layer, and in the second level an MLPNN with two hidden layers. The network has 75 input neurons, equal to the number of feature vectors. We trained the second-level neural network to combine the predictions of the first-level networks; the second-level network has one input, and its target is the same as the targets of the original data. In the first and second levels, training of the neural networks was done in 50 and 100 epochs, respectively. Since the values of the mean square errors (MSEs) converged to small constants, approximately zero, within these epochs, training of the neural networks was successful. In the first-level and second-level analyses, the Levenberg-Marquardt and RP training algorithms were used, respectively.
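Returning to the preprocessing of Section 3.3, the min-max normalization to [-1, 1] can be sketched as follows (a minimal illustration of the standard technique, not code from the paper):

```python
# Illustrative min-max normalization of each feature to [-1, 1] (Section 3.3).
def min_max_scale(X):
    lo = [min(col) for col in zip(*X)]
    hi = [max(col) for col in zip(*X)]
    return [[-1.0 + 2.0 * (v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in X]
```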

Evaluation Model

After building a predictive model, providers want to use these classification models to predict future behavior, so it is essential to evaluate the classifier in terms of performance. First, the predictive model is estimated on a training set. Afterwards, this model is validated on an unseen dataset, the test set. Evaluating the performance on a test set is essential in order to ensure that the trained model is able to generalize well. We can count the number of true positives (TP), true negatives (TN), false positives (FP) (actually negative, but classified as positive) and false negatives (FN) (actually positive, but classified as negative). The sensitivity, specificity and accuracy performance metrics are then given by the following expressions [19]:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{Misclassification\ error} = 1 - \mathrm{Accuracy}$$
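For example, the expressions above can be computed directly from the four counts; the following is a small sketch for illustration, not code from the paper:

```python
# Sketch of the performance metrics defined above.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {"accuracy": accuracy,
            "specificity": tn / (tn + fp),
            "sensitivity": tp / (tp + fn),
            "misclassification_error": 1 - accuracy}
```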

The prediction performance of the proposed model is shown in Fig. 2. As the results show, the hybrid model performs better than the single neural network model.

Figure 2: Prediction performance for the prediction model.

Conclusion

In this study, we developed and used hybrid neural networks for predicting potential churn in wireless telecommunication services. We have tested our hybrid neural network model and compared it with a single neural network model. The results of our experiments indicate that the hybrid neural networks perform better than the single neural network model, but are computationally expensive. However, successful churn management must also include effective retention actions. Managers need to develop attractive retention programs to satisfy those customers. Furthermore, integrating the churn score with the customer segment and applying customer value will also help managers to design the right strategies to retain valuable customers.

References

[1] P. Kisioglu and Y.I. Topcu, Bayesian Belief Network Approach To Customer Churn Analysis: A Case Study On The Telecom Industry Of Turkey, Expert Systems With Applications 37 (2011), 7151-7157.
[2] M. Richeldi and A. Perrucci, Churn Analysis Case Study: Telecom Italia Lab Report, Torino, Italy (2002).
[3] They Love Me, They Love Me Not 17(21) (2000), 38-42.
[4] Standing By Your Carrier: Available From Http://Currentissue.Telophonyonline.Com/ (2002).
[5] M. Lenard, G.R. Madey, and P. Alam, The Design And Validation Of A Hybrid Information System For The Auditors Going Concern Decision, Journal Of Management Information Systems 14(4) (1998), 219-237.
[6] S.Y. Hung, D.C. Yen, and H.Y. Wang, Applying Data Mining To Telecom Churn Management, Expert Systems With Applications 31(5) (2006), 1552.
[7] J. Hadden, A. Tiwari, R. Roy, and D. Ruta, Assisted Customer Churn Management: State-Of-The-Art And Future Trends, Computers & Operations Research 34(10) (2007), 2902-2917.
[8] K. Coussement and D. Van Den Poel, Improving Customer Attrition Prediction By Integrating Emotions From Client/Company Interaction Emails And Evaluating Multiple Classifiers, Expert Systems with Applications 36(3) (2009), 6127-6134.
[9] E. Lima, C. Mues, and B. Baesens, Domain Knowledge Integration In Data Mining Using Decision Tables: Case Studies In Churn Prediction, Journal of the Operational Research Society 60(8) (2009), 1096-1106.
[10] D. Popovic and B.D. Basic, Churn Prediction Model In Retail Banking Using Fuzzy C-Means Algorithm, Informatica 33 (2009), 243-247.

[11] B.Q. Huang, T.M. Kechadi, B. Buckley, G. Kiernan, E.


Keogh, and T. Rashid, New Feature Set With New Window
Techniques For Customer Churn Prediction In Land-Line
Telecommunications, Expert Systems With Applications 37
(2010), 36573665.
[12] A. Alam Khan, S. Jamwal, and M.M. Sepehri, Applying
Data Mining To Customer Churn Prediction In An Internet Service Provider, International Journal Of Computer
Applications 9 (2010), no. 7, 814.
[13] V. Vijaya Saradhi and G. Keshav Palshikar, Employee
Churn Prediction, Expert Systems With Applications 38
(2011), no. 3, 19992006.
[14] G. Nie, W. Rowe, L. Zhang, Y. Tian, and Y. Shi, Credit
Card Churn Forecasting By Logistic Regression And Decision Tree, Expert Systems With Applications 38(12)
(2011), 1527315285.
[15] G.E. Rumelhart, G.E. Hinton, and R.J. Williams, Learning
Internal Representations By Error Propagation: Published
in Book Parallel distributed processing: explorations in the
microstructure of cognition, MIT Press, Cambridge, MA
(1986).
[16] D.H. Wolpert, Stacked Generalization, Neural Networks
5(2) (1992), 241-259.
[17] D. West and V. West, Improving Diagnostic Accuracy Using A Hierarchical Neural Network To Model Decision
Subtasks, International Journal Of Medical Informatics 57
(2000), no. 1, 41-55.
[18] E.D. Ubeyli and I. Gler, Improving Medical Diagnostic Accuracy Of Ultrasound Doppler Signals By Combining Neural Network Models, Computers In Biology And Medicine
35(6) (2005), 533554.
[19] J. Han and M. Kamber, Data mining: Concepts and techniques, The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor, Morgan Kaufmann
Publishers, San Francisco, CA, 2006.


A Clustering Based Model for Class Responsibility Assignment Problem
Hamid Masoud, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, H.Masoud@Modares.ac.ir
Saeed Jalili, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, Sjalili@Modares.ac.ir
S.M.Hossein Hasheminejad, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, SMH.Hasheminejad@Modares.ac.ir

Abstract: Assigning responsibilities to classes is a vital and critical task in the object oriented
software design process and directly affects maintainability, reusability and performance of software
system. In this paper we propose a clustering based model for solving the Class Responsibility
Assignment (CRA) problem. The proposed model is independent of specific clustering method and
has a high extensibility to cover the new features of object oriented software design. The input
of model is collaboration diagrams of analysis phase and its output is the class diagram with high
cohesion and low coupling. To evaluate the proposed model we use four different clustering methods:
X-means, Expectation Maximization (EM), K-means and Hierarchical Clustering (HC). Comparing
the obtained results of clustering methods with the expert design reveals that the clustering methods
yield promising results.

Keywords: Object-oriented analysis and design; Class responsibility assignment (CRA); Clustering.

Introduction

The object-oriented software design process involves several steps, each of which has its own activities. Class Responsibility Assignment (CRA) is one of the important and complex activities in Object-Oriented Analysis and Design (OOAD). Its main goal is to find the optimal assignment of responsibilities (where responsibilities are expressed in terms of methods and attributes) to classes with regard to various aspects of coupling and cohesion, thus leading to a more maintainable and reusable model [1]. CRA is vital not only during the analysis and design phases, but also during maintenance.

There are many methodologies that help recognize the responsibilities of a system [2] as well as assign them to classes [3], but all of them depend greatly on human judgment. On the other hand, the emergence of new responsibilities or changes to existing responsibilities (e.g., responsibilities removed or moved to other classes) cause the model to change, hence reallocation of responsibilities becomes essential. CRA is an onerous task; therefore, having an automated method for it can provide enormous help to designers.

All existing research on the CRA problem uses metaheuristic methods, and there is no method based on clustering techniques. In this paper we address CRA as a clustering problem, making it fit for the application of clustering methods. For this purpose, first, we extract some features from the input collaboration diagrams, then use clustering methods for clustering them and generating the class diagram. Comparing the obtained results of four different clustering methods (X-means, Expectation Maximization (EM), K-means and Hierarchical Clustering (HC)) with the model designed by an expert reveals that the clustering methods yield promising results.

The rest of this paper is organized as follows: Section 2 discusses CRA as a clustering problem. In Section 3, the proposed model is described in detail. The case study used to evaluate the proposed model is described in Section 4. Section 5 presents experimental results. Finally, conclusions and future work are drawn in Section 6.

Related Works

In recent years there has been a dramatic increase in work on Search Based Software Engineering (SBSE). SBSE is an approach to software engineering in which search based optimization algorithms are used to address problems in software engineering [4]. The focus of most research in this area is on software testing [4]. However, there is also a considerable body of research in the software design area [5]. Recently, some researchers have used metaheuristic optimization algorithms to solve the CRA problem. O'Keeffe and Ó Cinnéide [6, 7] use a Simulated Annealing (SA) algorithm to automatically improve the structure of an existing inheritance hierarchy. Bowman et al. [8] study the use of a Multi-Objective Genetic Algorithm (MOGA) in solving the CRA problem; the objective is to optimize the class structure of a system through the placement of methods and attributes. In this study the Strength Pareto approach (SPEA2) is used. Glavas and Fertalj [9] use four different metaheuristic optimization algorithms (a simple Genetic Algorithm (GA), hill climbing, SA, and particle swarm optimization) to solve the CRA problem. They use a responsibility dependency graph as input for the optimization algorithms and use coupling and cohesion metrics for evaluation. Seng et al. [10] use a GA to automatically determine potential refactorings of a class structure and inheritance hierarchy. The mutation and crossover operators used are moving a method from one class to another and moving methods/attributes up/down in an inheritance hierarchy.

CRA as a Clustering Problem

Bowman [8] defines the CRA problem as: CRA is about deciding where responsibilities, under the form of class operations (as well as the attributes they manipulate), belong and how objects should interact (by using those operations).

The CRA problem can simply be mapped to a clustering problem. To show this, first, we define the clustering problem. Consider a set of N d-dimensional data objects O = \{O_1, O_2, \dots, O_N\}, where O_i = (o_{i1}, o_{i2}, \dots, o_{id}) \in \mathbb{R}^d. Each o_{ij} is called a feature (attribute, variable, or dimension) and represents the value of data object i at dimension j. Given O, the set of data objects, the goal of partitional clustering is to divide the data objects into K clusters \{C_1, C_2, \dots, C_K\} that satisfy the following conditions:

a) C_i \neq \emptyset, \quad i = 1, \dots, K
b) \bigcup_{i=1}^{K} C_i = O
c) C_i \cap C_j = \emptyset, \quad i, j = 1, \dots, K \text{ and } i \neq j

In the CRA problem we have a set of methods and attributes (indicating responsibilities) that must be divided between K classes. If we consider each of the classes as a cluster and each of the methods or attributes as a data object, then the CRA problem is converted to the clustering problem. In fact, the main objective of clustering algorithms is to maximize between-cluster separation (i.e., low coupling between clusters) and minimize within-cluster scatter (i.e., high cohesion within clusters). As mentioned above, the coupling and cohesion metrics are the two main goals in the CRA problem.
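To make the mapping concrete, the following Python sketch (our illustration, not code from the paper; the responsibility names and binary feature values are hypothetical) treats each method/attribute as a binary dependency vector and partitions the set with K-means, so that each resulting cluster plays the role of one candidate class and conditions (a)-(c) hold by construction.

# Illustrative sketch (not from the paper): responsibilities as data objects,
# candidate classes as clusters.  Feature values below are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

# Each row is one responsibility (method or attribute); each column is a
# binary dependency feature, e.g. "uses attribute a1", "calls method m2", ...
responsibilities = ["m_open", "m_close", "a_handle", "m_draw", "a_canvas"]
X = np.array([
    [1, 1, 1, 0, 0],   # m_open
    [1, 1, 1, 0, 0],   # m_close
    [1, 1, 0, 0, 0],   # a_handle
    [0, 0, 0, 1, 1],   # m_draw
    [0, 0, 0, 1, 1],   # a_canvas
])

# K clusters play the role of K classes: every responsibility ends up in
# exactly one non-empty cluster.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

for cls in sorted(set(labels)):
    members = [r for r, l in zip(responsibilities, labels) if l == cls]
    print(f"Class {cls}: {members}")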

Proposed Model

As mentioned in Section 1, all existing methods for solving the CRA problem are based on metaheuristic algorithms and there is no method based on clustering techniques. In this paper, we propose a clustering based model for solving the CRA problem. Compared with metaheuristic based methods, the proposed model has several advantages. These advantages are:

- Easy to extend: by extracting new features, according to the application and user priorities, we can easily import new aspects of OOAD into the design of the class diagram. In metaheuristic based methods, by contrast, such an extension causes large changes in the structure of the population members/solution encoding, the fitness function and the operators.

- The ability to use a variety of new and efficient clustering methods without changing the model.

- No need to design a specific fitness function: in metaheuristic based methods, the definition of a specific fitness function for the CRA problem and the weighting of its elements are a complex and onerous task.


Figure 1 shows our model for solving the CRA problem. The proposed model has three main steps: (1) extracting features and generating the data set, (2) clustering the data set, and (3) processing the clustering results and generating the class diagram. These steps are described in the following subsections.

Figure 1: The proposed model for the CRA problem

Table 1: Extracted features from inputs

Acronym   Definition
MAR       Method-Attribute Relation
MMR       Method-Method Relation
RA        Related Attributes
RM        Related Methods
AC        Attribute Complexity
MC        Method Complexity

4.1 Extracting Features

Our model is based on clustering methods, so it is necessary to generate a data set that can be processed by clustering methods. For this purpose we first extract the features shown in Table 1. These features are defined based on the dependencies between responsibilities. In fact, there are two types of dependency: data and functional dependencies. A data dependency is a dependency between a method and an attribute. A functional dependency is a dependency between two methods. Dependencies between two methods can be of four different types [9]: simple call dependency (the source method simply starts the destination method); parameterized call dependency (the source method starts the destination method and sends data); simple call waiting for result (the source method starts the destination method and uses its result); parameterized call waiting for result (the source method starts the destination method sending data and uses its result). In this paper, these dependencies are denoted by S, SP, SR and SPR, respectively. Also, dependencies between a method and an attribute can be of three different types: simple use dependency (the method uses the value of the attribute); modify dependency (the method modifies the value of the attribute); use and modify dependency (the method uses the value of the attribute and modifies it). These dependencies are denoted by U, M and UM, respectively.

MAR and MMR show data and functional dependencies, respectively. MAR and MMR are M x A and M x M matrices, respectively, where M is the number of methods and A is the number of attributes. These features are defined as follows:

mar_{ij} \in \{U, M, UM\}, \quad i = 1, \dots, M, \; j = 1, \dots, A \qquad (1)

mmr_{ij} \in \{S, SP, SR, SPR\}, \quad i, j = 1, \dots, M, \; i \neq j \qquad (2)

RA and RM show semantically related attributes and methods, respectively. RA and RM are A x A and M x M matrices, respectively, and are defined as follows:

ra_{ij} = \begin{cases} 1 & \text{if attributes } i \text{ and } j \text{ are semantically related} \\ 0 & \text{otherwise} \end{cases}, \quad i, j = 1, \dots, A \qquad (3)

rm_{ij} = \begin{cases} 1 & \text{if methods } i \text{ and } j \text{ are semantically related} \\ 0 & \text{otherwise} \end{cases}, \quad i, j = 1, \dots, M \qquad (4)

AC and MC are vectors of length A and M, and show the complexity of attributes and methods, respectively. These features are defined as:

ac_j = AFanIn_j, \quad j = 1, \dots, A \qquad (5)

mc_i = MFanOut_i + MFanIn_i, \quad i = 1, \dots, M \qquad (6)

where MFanOut_i is the number of methods called by M_i, MFanIn_i is the number of methods that call M_i, and AFanIn_j is the number of methods that use or modify A_j.

After extracting the values of these features, they must be processed to generate the final data set. For this purpose, for each method/attribute in the MAR, MMR, RA and RM matrices, its similarity degree with the other elements of the corresponding matrix is calculated and placed in the final data set as a feature. In this paper, the Jaccard [11] binary similarity function is used to calculate the similarity degree of elements.

4.2 Clustering

In this step, the data set generated in the previous step is clustered. The clustering algorithms preferably used for this purpose are dynamic clustering methods. Non-dynamic clustering methods can also be used, in which case the number of clusters should be determined by an expert.

4.3 Processing of Clustering Results

After clustering of the data set, each cluster is considered as a class and the relationships between classes are determined according to the dependencies of their contents. For example, suppose method Mi from Class1 calls method Mj from Class2; in this case there is a relationship between Class1 and Class2.

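As a rough illustration of the data-set generation step of Section 4.1 (this is not the authors' implementation; the small MAR-style matrix below is hypothetical), each row of a binary dependency matrix can be turned into a vector of Jaccard similarities with every other row, and these vectors form the final data set that is clustered in Section 4.2.

# Hypothetical sketch of the Section 4.1 processing step: turn each row of a
# binary dependency matrix (e.g. MAR with U/M/UM collapsed to 0/1) into a
# vector of Jaccard similarities with every other row.
import numpy as np

def jaccard(u, v):
    """Jaccard similarity of two binary vectors (1.0 when both are all-zero)."""
    inter = np.logical_and(u, v).sum()
    union = np.logical_or(u, v).sum()
    return 1.0 if union == 0 else inter / union

def similarity_features(dep_matrix):
    """Return an N x N matrix whose i-th row is the feature vector of element i."""
    n = dep_matrix.shape[0]
    feats = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            feats[i, j] = jaccard(dep_matrix[i], dep_matrix[j])
    return feats

# Example: 4 methods x 3 attributes, 1 = the method uses/modifies the attribute.
mar = np.array([[1, 1, 0],
                [1, 0, 0],
                [0, 1, 1],
                [0, 0, 1]])
print(similarity_features(mar))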

Case Study

In order to validate our model, we performed a case study. For this purpose, the iCoot [12] case study is used. It is a car rental and reservation system and its analysis level class diagram, designed by an expert, is shown in Figure 2. The iCoot case study consists of 18 classes and 75 responsibilities (38 methods and 37 attributes). The maximal possible number of classes in a solution has to be equal to the number of responsibilities, which makes the size of the search space considerably large (75^75).

Figure 2: The class diagram designed by the expert

Experimental Results

We used four different clustering methods, X-means, EM, K-means and HC, for clustering the data set. X-means and EM are dynamic clustering methods and automatically find the number of clusters. K-means and HC are static clustering algorithms, and the number of clusters should be set before running the algorithm; for these algorithms the number of clusters is determined by an expert. We tested 13, 14, 15 and 18 as the number of clusters and saw that the K-means method with 13 clusters achieved better results than the other methods. In this experiment X-means and EM found 14 and 17 clusters, respectively. Figure 3 shows the best design generated by the proposed model. This design is the result of K-means with 13 clusters.

Figure 3: The best design generated by the proposed model

The quality of a software design has mostly been measured with cohesion and coupling, which mostly conform to the quality factors of efficiency and modifiability [5]. Cohesion is a measure of the extent to which the various functions performed by an entity are related to one another [13]. Coupling is the degree of interaction between classes [13]. There are many cohesion and coupling metrics [13]; a low value of coupling and a high value of cohesion are desirable. In this paper, we use Method-Method Coupling (MMC) [8] and Method-Attribute Coupling (MAC) [8] for the measurement of coupling, and Ratio of Cohesive Interactions (RCI) [8] and Tight Class Cohesion (TCC) [8] for the measurement of cohesion. Table 2 shows the values of these metrics for the class diagrams designed by the clustering methods and by the expert. Based on the results shown in Table 2, K-means with 13 clusters obtains the best coupling and cohesion values. Also, compared with the expert design, the results of the other clustering methods are good and have better coupling and cohesion values.

In order to evaluate the effectiveness of the clustering methods, we compared their performance with the single objective GA proposed by Bowman [8]. For this purpose, we ran each method 10 times and report the best and average values obtained. The results provided by the clustering methods compared to the GA are shown in Table 3. Based on the results shown in Table 3, the clustering methods and the GA obtain the same best values for the coupling and cohesion metrics, but the average values of these metrics obtained by the GA are worse than those of the clustering methods. Also, the computational time of the clustering methods is better than that of the GA.
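The paper does not describe the tooling used to run the four clustering algorithms; as a rough, hypothetical illustration, three of them can be approximated with scikit-learn (EM by a Gaussian mixture, HC by agglomerative clustering), while X-means has no direct scikit-learn counterpart and is omitted here.

# Rough illustration (not the authors' setup): clustering the generated data
# set with stand-ins for three of the four evaluated methods.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

# Placeholder for the 75-responsibility feature matrix produced in Section 4.1.
data = np.random.default_rng(0).random((75, 10))

kmeans_labels = KMeans(n_clusters=13, n_init=10, random_state=0).fit_predict(data)
hc_labels = AgglomerativeClustering(n_clusters=13).fit_predict(data)
em_labels = GaussianMixture(n_components=13, random_state=0).fit_predict(data)

for name, labels in [("K-means", kmeans_labels), ("HC", hc_labels), ("EM", em_labels)]:
    print(name, "cluster sizes:", np.bincount(labels))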
Table 2: The values of coupling and cohesion for the clustering methods and the expert design (MAC and MMC measure coupling; RCI and TCC measure cohesion)

Algorithm   #Classes   MAC   MMC   RCI     TCC
X-means     14         22    29    0.137   0.102
EM          17         25    29    0.125   0.102
K-means     18         27    23    0.079   0.272
K-means     15         22    29    0.128   0.201
K-means     14         18    29    0.139   0.35
K-means     13         13    24    0.149   0.35
HC          18         27    27    0.097   0.173
HC          15         25    29    0.125   0.098
HC          14         22    29    0.120   0.35
HC          13         18    29    0.129   0.35
Expert      18         27    29    0.005   0.109

Table 3: The results obtained by the clustering methods and the Genetic algorithm (SD: Standard Deviation)

Method               Metric     Avg ± SD        Best
Genetic Algorithm    Coupling   38.8 ± 1.4      37
                     Cohesion   0.420 ± 0.09    0.499
                     Time       41 s
Clustering Methods   Coupling   37.3 ± 0.9      37
                     Cohesion   0.472 ± 0.08    0.499
                     Time       1 ± 0.5 s

Conclusions and Future Works

Class Responsibility Assignment (CRA) is an important and complex activity in object oriented analysis and design. In this paper, we addressed CRA as a clustering problem and proposed a clustering based model (Figure 1) for solving it. The proposed model has three main steps: (1) extracting features and generating the data set, (2) clustering the data set, and (3) processing the clustering results and generating the class diagram. Four different clustering methods (X-means, EM, K-means and HC) were used to evaluate the proposed model. Comparing the results obtained by the clustering methods with the expert design reveals that the clustering methods yield promising results. On the other hand, comparing the results obtained by the clustering methods with the single objective Genetic algorithm reveals that the clustering methods have a lower computational time and better average values for the coupling and cohesion metrics.

In future work, we intend to use powerful dynamic clustering methods and extend the feature set to support new aspects of software design.

References
[1] L.C. Briand, J. Daly, and J. Wuest, A Unified Framework for Cohesion Measurement in Object-Oriented Systems, Empirical Software Engineering 3 (1998), 65-117.
[2] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, Prentice Hall, 2004.
[3] B. Bruegge and A.H. Dutoit, Object-Oriented Software Engineering, Prentice Hall, 2004.
[4] M. Harman, S.A. Mansouri, and Y. Zhang, Search Based Software Engineering: A Comprehensive Analysis and Review of Trends, Techniques and Applications, King's College London, Technical Report TR-09-03 (2009).
[5] O. Raiha, A Survey on Search-Based Software Design, Computer Science Review 4 (2010), 203-249.
[6] M. O'Keeffe and M. Ó Cinnéide, Towards Automated Design Improvement through Combinatorial Optimization, Proceedings of the Workshop on Directions in Software Engineering Environments (2004).
[7] M. O'Keeffe and M. Ó Cinnéide, Search-Based Refactoring for Software Maintenance, Journal of Systems and Software 81 (2008), 502-516.
[8] M. Bowman, L.C. Briand, and Y. Labiche, Solving the Class Responsibility Assignment Problem in Object-Oriented Analysis with Multi-Objective Genetic Algorithms, IEEE Transactions on Software Engineering 36 (2010), 817-837.
[9] G. Glavas and K. Fertalj, Metaheuristic Approach to Class Responsibility Assignment Problem, Proceedings of the International Conference on Information Technology Interfaces (ITI) (2011), 591-596.
[10] O. Seng, J. Stammel, and D. Burkhard, Search-Based Determination of Refactorings for Improving the Class Structure of Object-Oriented Systems, Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (2006), 1909-1916.
[11] S. Choi, S. Cha, and C.C. Tappert, A Survey of Binary Similarity and Distance Measures, Journal of Systemics, Cybernetics and Informatics 8 (2010), 43-48.
[12] M. Docherty, Object-Oriented Analysis and Design, John Wiley & Sons Ltd, 2005.
[13] G. Gui and P.D. Scott, Coupling and Cohesion Measures for Evaluation of Component Reusability, Proceedings of the International Workshop on Mining Software Repositories (2006), 18-21.

A Power-Aware Multi-Constrained Routing Protocol for Wireless Multimedia Sensor Networks
Babak Namazi

Karim Faez

Amirkabir University of technology

Amirkabir University of technology

Department of Electrical Engineering

Department of Electrical Engineering

b namazi@aut.ac.ir

kfaez@aut.ac.ir

Abstract: Energy efficiency and quality of service (QoS) assurance are challenging tasks in wireless multimedia sensor networks (WMSNs). In this paper, we propose a new power-aware routing protocol for WMSNs supporting multi-constrained QoS requirements, using localized information. For real-time communication we consider both the delay at sender nodes and the queuing delay at the receiver. In order to achieve the reliability requirements and energy efficiency, each node dynamically adjusts its transmission power and chooses nodes that have fewer remaining hops towards the sink. A load balancing approach is used to increase lifetime and avoid congestion. Simulation results show that our protocol can support QoS with less energy consumption.

Keywords: WMSN; Routing; Quality of Service; Power Control; Energy Efficiency.

Introduction

Recent advances in CMOS technology have led to a new derivative of sensor-based networks, namely wireless multimedia sensor networks (WMSNs) [1, 2]. WMSNs consist of a large number of sensor nodes equipped with multimedia sensors, capable of retrieving multimedia information from the environment. Due to this ability, WMSNs are gaining great potential in military situations and other video surveillance systems.

Compared with traditional WSNs, designing a routing protocol for a WMSN is more challenging, considering the energy limitations in transmitting such bandwidth demanding data [3]. In addition, multimedia content needs certain quality of service (QoS) guarantees, such as reliability and real-time delivery. Traffic is also diverse and different flows may have different requirements. In this work we propose a novel QoS routing protocol with energy consideration based on traffic differentiation and localized information.

Many researchers have worked on real-time routing protocols for WSNs. For example, SPEED [4] attempts to choose paths that ensure a fixed speed, considering delay at the sender node. One drawback of
SPEED is that it does not support different latency


requirements; in addition, delay at the receiver is not taken into account. MMSPEED [5] defines multi-speed routing with different routing layers, each supporting a different speed. However, energy efficiency is not addressed directly in either of these protocols. RPAR [6] pioneered the incorporation of energy consumption in real-time
communication. It achieves required end-to-end delay
at low power by dynamically adjusting transmission
power. None of the above mentioned protocols consider
hop count for minimizing the latency. In our protocol
we try to consider both transmission and queuing delay and use a power control approach to guarantee required end-to-end latency, over the least hops towards
the sink.
In order to support QoS in the reliability domain, different approaches have been used in the literature. One of these is to send duplicated packets towards different nodes; MMSPEED uses this approach to achieve a higher packet reception ratio. For more energy efficiency, EARQ [7] selects just two paths and sends the duplicated packet along the alternative path. To avoid congestion near the sink, LOCALMOR [8] uses the single path multi sink approach and sends a copy of the


packet to a secondary sink. All of these methods decrease the lifetime of the network. REP[9] instead, uses
a power allocation protocol to guarantee needed reliability, since increasing transmission power results in
higher SINR. It divides the area into many concentric
coronas and randomly chooses a node from the corona
nearer to sink and increase transmission power until
the requirement is met. We use a novel HELLO message approach to find the exact remaining hops to the
sink and select nodes which have better link quality,
needing less increase in transmission power.
The rest of this paper is organized as follows: Section 2 gives the network model and assumptions. The
proposed protocol is described in section 3, and its
performance is evaluated in section 4. Finally, section
5 concludes the paper.

System Model

In this paper we adopt a WMSN formed by a large number of multimedia sensors randomly deployed in an environment to collect information. All of the sensor nodes have the same specifications, except for the sink node, which has no energy limits. Upon receiving each data packet, nodes can calculate the bit error rate from the SINR of the signal based on the modulation used [10]. We also make the assumption that all of the nodes in the network know the current geographic coordinates of their own and of the sink node, and are stationary during the network lifetime.

Like LOCALMOR, we define four classes of packets based on two criteria: whether they are high or low priority (reliability), and whether they are real-time or not. High priority packets can stand for I frames and low priority packets can be considered as P frame packets in a video stream. The type of the packet is specified at the application layer.

Most transceivers support different transmission power levels. For calculating the energy consumption we use the model proposed in [9]. The energy consumed by a transmitter sending data of size f is:

E^{tx}(P^{tx}) = \frac{8f}{R}\left(P^{cir} + \frac{P^{tx}}{\eta}\right) \qquad (1)

where P^{tx} is the transmission power, P^{cir} is the circuit power, R is the bandwidth and \eta is the conversion efficiency of the power amplifier. The energy consumption of the receiver is:

E^{rx} = \frac{8f}{R} P^{rx} \qquad (2)

in which P^{rx} is the received power. We assume that all of the nodes are capable of changing their transmission power in order to achieve the network QoS requirements.

Protocol Overview

The protocol uses local information and selects the forwarding nodes according to their ability to fulfill the flow's requirements of latency and reliability. It has four components: 1) Neighbor management, which is responsible for gathering neighboring information and managing the routing table. 2) Latency estimation, which calculates the latency of a forwarding node. 3) Reliability estimation, which estimates the quality of the available links and finds the best transmission power. 4) Geographic forwarding, which defines the forwarding policy.

3.1 Neighbor Management

Each node should be aware of its neighboring nodes' status, including their position, remaining energy, quality of the link, remaining hops to the sink (level) and queue state. Like other state-of-the-art localized routing protocols, we use HELLO packets to exchange this needed information, but instead of sending HELLO messages simultaneously by all of the nodes, the sink is the node that initializes the HELLO message containing its information, labeled as level zero. Upon receiving the first HELLO packet, each node is labeled as the next level and broadcasts its information. Using this method all of the nodes know their level and tell it to their neighbors. The sink node does this at fixed intervals and after each reception, nodes add an entry to their routing table, including: the node's distance to the sink, level, remaining energy, speed and required transmission power. We will discuss the last two in more detail later in this section.

3.2 Latency Estimation

Two types of delay may occur in these networks: delay at the sender and delay at the receiver. Delay at sending nodes mostly depends on the MAC parameters used. At the receiving nodes the delay is due to queuing. Propagation delay is usually ignored. In order to estimate the transmission delay each node timestamps the packet after receiving it from the application layer or other nodes, and after receiving the ACK packet at the MAC layer. The transmission delay is computed as follows:

d^{tr} = t^{rec} - t^{sent} - t^{ack} \qquad (3)

in which t^{rec} is the time the packet is received at the routing layer, t^{sent} is the time the node receives the ACK packet and t^{ack} is the time consumed for transmitting an ACK packet at the receiver.

Transmission delay may vary because of changes in network parameters; one reason might be variations in the transmission power level. In order to account for past delays, we use the EWMA method for estimating the transmission delay. The queuing delay, say d^{q}, is computed at the receiver and is exchanged between nodes via HELLO packets. We use the moving average approach for this delay too. Having the different kinds of delay, each node can estimate the velocity of its neighbors and compare it to the required velocity of the packet to be transmitted. The velocity of a neighboring node is computed as below:

v_j = \frac{dis_{id} - dis_{jd}}{d^{tr} + d^{q}} \qquad (4)

in which dis_{id} is the distance between the node itself and the sink node and dis_{jd} is the distance between the neighboring node and the destination. The required velocity is:

v_{req} = \frac{dis_{id} - dis_{jd}}{t_{dl} - t_e} \qquad (5)

where t_{dl} denotes the deadline of the packet and t_e is the delay experienced by the packet up to its reception at this node; it is a part of the packet's header. Each node has a set of its neighbors fulfilling the latency requirement, called fast neighbors, consisting of the nodes meeting the condition v_j > v_{req}. The forwarding node is chosen from this set.

3.3 Reliability Estimation

Considering the reliability requirements, each flow can tolerate a certain bit error rate value. Increasing the transmission power is the solution we use to achieve the required BER. Due to the wireless nature, the transmitted signal can be received by nodes in the range of the sender and this signal can be counted as interference for nodes that are not the packet's destination. On the other hand, increasing the transmission power may yield a higher SINR value, resulting in lower BERs. So finding the most efficient transmission power is very important in our protocol.

In order to achieve the best transmission power level satisfying our reliability needs, we assume all of the nodes have an initial power level. At this initial power level nodes start to transmit HELLO packets. Each node has a table containing the BER of the packets received from a specific source and the power at which each packet was transmitted. The transmission power of a packet is a part of its header. This table is updated after receiving a packet and is sent during HELLO packets.

After the HELLO packets, each node knows what the BER of the packet sent to a specific node was and at which power level it was transmitted. If the BER value is more than the desired value, the node increases its transmission power, and otherwise decreases it in order to have better energy efficiency. During the next interval the packets will be transmitted to that specific node at this level.

The required BER is estimated locally based on the hop count of the packet and the remaining hops to the sink node:

BER_{ij} = \sqrt[h]{BER_{req}} \qquad (6)

in which BER_{ij} is the BER of the packet transmitted from node i to node j, and BER_{req} is the required end-to-end BER. h is the total hop count from the source to the destination and is computed by adding the hop count of the packet and the level of the node. By this method the needed reliability is met locally, and it is expected that after a few HELLO packet intervals, all of the nodes know the required transmission power of both high priority and low priority packets, for all of the nodes in their fast neighbors set.

3.4 Geographic Forwarding

After finding the eligible nodes satisfying the latency requirement and their corresponding transmission power, each node selects the most energy efficient node for the next hop. To do that we consider both the energy cost of forwarding nodes and their residual energy. In addition, selecting just one path may cause energy depletion of the nodes in that path. To avoid this we give a score to all of the eligible nodes based on their residual energy, energy cost and remaining hops to the sink. The nodes having a lower level are selected first and if their number is less than two, nodes having the same level are added to them. These nodes are sorted and chosen based on their score, i.e. higher probability is given to the best neighbors. The energy cost is computed as below:

cost = \frac{E^{tx}}{dis_{id} - dis_{jd}} \qquad (7)

If the energy cost is a small value, it means that the forwarding node is more energy efficient. Using the proposed forwarding policy, the load is distributed over the nodes satisfying the QoS needs and therefore the lifetime of the network is increased and less congestion will happen. Another advantage is that nodes having higher scores have higher priority and are selected more often.
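As an illustration of how Eqs. (4)-(7) combine into a forwarding decision, the following Python sketch (our own, hedged reconstruction; the field names, the fallback rule and the scoring formula are assumptions, since the paper gives no pseudocode) filters the fast-neighbor set and then picks an energy-efficient candidate.

# Illustrative sketch of the forwarding decision (Sections 3.2-3.4).
# Field names and the scoring rule are assumptions, not the authors' code.
from dataclasses import dataclass

@dataclass
class Neighbor:
    dist_to_sink: float    # dis_jd: neighbor's distance to the sink
    level: int             # remaining hops to the sink (from HELLO packets)
    d_tr: float            # EWMA-estimated transmission delay, Eq. (3)
    d_q: float             # queuing delay advertised in HELLO packets
    e_tx: float            # energy needed to reach this neighbor, Eq. (1)
    residual_energy: float

def velocity(my_dist, nb):
    # Eq. (4): progress towards the sink per unit of expected delay.
    return (my_dist - nb.dist_to_sink) / (nb.d_tr + nb.d_q)

def required_velocity(my_dist, nb, t_dl, t_e):
    # Eq. (5): assumes the deadline has not already expired (t_dl > t_e).
    return (my_dist - nb.dist_to_sink) / (t_dl - t_e)

def energy_cost(my_dist, nb):
    # Eq. (7): smaller cost means more progress per unit of transmission energy.
    return nb.e_tx / (my_dist - nb.dist_to_sink)

def choose_next_hop(my_dist, neighbors, t_dl, t_e):
    # Fast-neighbor set: only nodes that can still meet the deadline (v_j > v_req).
    fast = [nb for nb in neighbors
            if velocity(my_dist, nb) > required_velocity(my_dist, nb, t_dl, t_e)]
    if not fast:
        return None
    # Prefer the lowest level (fewest remaining hops); the paper also falls back
    # to same-level neighbors when fewer than two candidates remain.
    best_level = min(nb.level for nb in fast)
    candidates = [nb for nb in fast if nb.level == best_level]
    # Score by residual energy per unit of energy cost (one plausible reading of
    # the load-balancing rule; the exact scoring formula is not given).
    return max(candidates, key=lambda nb: nb.residual_energy / energy_cost(my_dist, nb))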


Simulation Results

To evaluate the performance of the proposed protocol we used the Castalia-3.2 [11] simulator. Castalia is a discrete event simulator designed for simulating wireless sensor networks. The simulation configuration consists of 36 nodes randomly deployed in a 100*100 m2 terrain. The 802.11 MAC protocol (with RTS/CTS packets) is used and a node is selected randomly to send its packet to the sink node. The traffic consists of all four packet classes and the simulation time is 600 seconds. The performance metrics used are average energy consumption, average end-to-end delay and BER. We compare our protocol, hereafter called PMCR (Power-aware Multi-Constrained Routing), with the LOCALMOR protocol over these metrics.

Figure 1: Delay and energy consumption with Tdl = 0.3 s. (a) Delay; (b) Energy consumption.

Fig. 1(a) shows the average end-to-end delay for different BER requirements for high priority packets and Tdl = 0.3 s. The average energy consumption for this situation is shown in Fig. 1(b). It can be seen that our protocol uses less energy than the LOCALMOR protocol.

Figure 2: BER and energy consumption with BERreq = 0.1. (a) BER; (b) Energy consumption.

The packets' BER for different time deadlines is shown in Fig. 2(a) and the corresponding energy consumption is shown in Fig. 2(b). It is obvious that the proposed protocol provides more reliability with less energy consumption.

Conclusion

We proposed a novel localized routing protocol for WMSNs. The protocol takes into account the QoS and traffic diversity needed for transmitting multimedia data in such resource limited networks. Simulation results show that our protocol outperforms other protocols like LOCALMOR in terms of reliability, end-to-end delay and energy consumption.

References
[1] I.F. Akyildiz, T. Melodia, and K.R. Chowdhury, A Survey on Wireless Multimedia Sensor Networks, Computer Networks (Elsevier) 51, no. 4 (2007), 921-960.
[2] S. Misra, M. Reisslein, and G. Xue, A Survey of Multimedia Streaming in Wireless Sensor Networks, IEEE Commun. Surveys Tutorials 10 (2008), 18-39.
[3] S. Ehsan and B. Hamdaoui, A Survey on Energy-Efficient Routing Techniques with QoS Assurances for Wireless Multimedia Sensor Networks, IEEE Commun. Surveys Tutorials (early access) (2011).
[4] T. He, J.A. Stankovic, C. Lu, and T.F. Abdelzaher, A Spatiotemporal Communication Protocol for Wireless Sensor Networks, IEEE Trans. Parallel and Distributed Systems 16, no. 10 (2005), 995-1006.
[5] E. Felemban, C. Lee, and E. Ekici, MMSPEED: Multipath Multi-SPEED Protocol for QoS Guarantee of Reliability and Timeliness in Wireless Sensor Networks, IEEE Trans. Mobile Comput. 5, no. 6 (2006), 738-754.
[6] O. Chipara, Z. He, G. Xing, Q. Chen, X. Wang, C. Lu, J.A. Stankovic, and T.F. Abdelzaher, Real-time Power-aware Routing for Sensor Networks, in Proc. 14th IEEE International Workshop on Quality of Service (IWQoS 2006), New Haven, CT (June 2006).
[7] J. Heo, J. Hong, and Y. Cho, EARQ: Energy Aware Routing for Real-time and Reliable Communication in Wireless Industrial Sensor Networks, IEEE Trans. Industrial Informatics 5 (2009), 3-11.
[8] D. Djenouri and I. Balasingham, Traffic-Differentiation-Based Modular QoS Localized Routing for Wireless Sensor Networks, IEEE Trans. Mobile Computing 10 (2011), 797-809.
[9] K. Lin and M. Chen, Reliable Routing Based on Energy Prediction for Wireless Multimedia Sensor Networks, IEEE GLOBECOM (2010), 1-5.
[10] A.F. Molisch, Wireless Communications, John Wiley and Sons, 2011.
[11] Castalia User Manual, http://castalia.npc.nicta.com.au/, 2011.


Mobile Learning- Features, Approaches and Opportunities


Faranak Fotouhi-Ghazvini
Department of Computer Engineering
University of Qom
faranak fotouhi@hotmail.com

Ali Moeini
Faculty of Engineering
Tehran University
moeini@ut.ac.ir

Abstract: Mobile learning is a new paradigm of learning that takes place in a meaningful context, involves exploration and investigation, and includes opportunities for social dialogue and interaction where learners have access to appropriate resources. The learning process can be supported by the use of the mobile phone in a responsive manner by means of context aware hardware and technologies that facilitate interaction and conversation. This mode of learning can enhance and improve learning, teaching and assessment. In this article we discuss the distinctive features of mobile learning, different approaches to mobile learning in different continents, advancements of portable devices and their implications, and mobile learning in Iran.

Keywords: Mobile Learning; Mobile Games; Game Based Learning; Augmented Reality.

Introduction

During the last decade, mobile learning (m-learning), a new kind of e-learning, has been introduced, in which the power of wireless technologies is used in an educational context. Compared to traditional e-learning it is more personal, always connected to communication tools, portable, cheap and available to the public. The m-learning consumers are mobile and so learning can take place ubiquitously, anywhere. Education protocols which employ this method have aided many aspects of learning such as: motivation [1], autonomy [2], interaction and collaboration [3] and [4], self-esteem [5], social skills [6], accessibility [7], and language acquisition [7]. It has been especially effective in the teaching of disadvantaged students in developing countries [8] and [9].

Distinct Features of M-Learning

M-learning has three main characteristics: (1) mobility, (2) context awareness and (3) the ability to communicate. Sharples [10] defines mobility as (a) mobility in physical space: it is not bound to the classroom; (b) it uses mobile hardware such as Bluetooth, GPS, camera and WiFi, all integrated in a compact portable device; (c) mobility in social space: the learner can form different ad hoc groups during the day for collaborative learning; (d) mobility in time: learning can be distributed over different times according to the learner's preferences.

Another characteristic of mobile learning is being context aware, which means it can collect environmental data simultaneously or at the learner's command to help him/her better analyze and apprehend educational material that depends on the physical world. The data is usually collected using devices such as GPS, compass, Bluetooth, camera, accelerometer and gyroscope.

The next characteristic of m-learning is that it always has available communication tools such as phone calls, SMS, MMS and mobile internet. These features facilitate the process of learning between students and teachers when they are located at different physical spaces.

M-learning across Continents

Europe and Japan are far ahead of other countries vis-à-vis taking advantage of mobile phone features. They have used SMS in mobile commerce, thus forming a rich communication ecosystem with clients. Many m-learning research projects have taken place in Europe [11-15]. These projects have played a major role in shaping and developing mobile learning theories and techniques. On the other hand, the homogeneous mobile communication system in Europe has provided each project with a big market. However, in North America a lack of homogeneity in the implementation of the third generation of mobile communication systems caused the late blooming of m-learning. At present, m-learning applications include game simulation environments that incorporate technologies such as GPS, WiFi and Bluetooth [16].

In Asia, the spread of cheap mobile phones with a high quality of service is rising rapidly and has proven to be a cost effective method of teaching and learning. In Bangkok, text messaging has been used to participate in examinations [17]. In Japan the mobile web has been used for English language learning [18]. In the Philippines, text messaging has been used to teach English, Mathematics and Science [19]. In Taiwan, PDAs have been used for collaborative learning [20] on field trips. In Hong Kong the mobile Web 2.0 has introduced new opportunities to form teaching and learning.

Ambient Insight [22] has predicted that by 2015, the leading consumers of m-learning applications will be America, China, India, Indonesia and Brazil.

In Africa different projects have been carried out with the aid of British universities. For example, [23] and [24] are two projects that have been led by the UK Open University. In these projects PDAs have been used for studying electronic books and playing educational films and audio clips. Text messaging has been used for assessing Kenyan students; this has been devised and implemented by Wolverhampton University [25].

In Australia, m-learning has had a slower pace compared to European countries; currently two projects [26], [27] are designing educational applications for tablets that could be used in schools.

M-learning in the Middle East has had limited accomplishments. However, it is moving towards the use of Java applications and online electronic materials [28].

Current Market Changes of Mobile Devices

Most new mobile phones take advantage of the high speed third generation of mobile systems for video conferencing and web surfing. The storage capacity, processing power and memory of devices have increased considerably. Electronic components such as GPS, accelerometers, compass, high resolution camera and proximity sensors, integrated in a single device, have increased mobile phones' capabilities compared to PCs.

According to Gartner's [29] predictions, in the next few years the Android operating system will become the most popular mobile phone operating system and will dominate over half of the market. iPhone and Nokia Windows will take the second and third place, respectively.

These hardware and software advances have given birth to Augmented Reality (AR), which combines and incorporates real world data in virtual environments simulated on mobile devices. AR adds a real layer of information in the form of text, graphics and voice when the mobile application accesses the mobile phone's camera. AR is capable of transforming educational spaces that did not have any connection to the subject being taught into a dynamic space that simulates an authentic learning environment and increases the learner's motivation [30]. Game based learning has grown in recent years and different studies have proven that these games are suitable for learning [31]. With recent mobile phone advances, many teachers have used mobile games as an important teaching tool in the classroom to increase learners' interest and problem solving [31].

There are other mobile devices in the market such as the Palm PC, iPod, iPad, ebook reader, Nintendo DS and PlayStation. Forrester's analysts believe that tablet PCs, with their small size, consisting only of a flat touch screen with a display of less than 9 inches, are the main players [32] amongst the wide spectrum of mobile devices. The display screen of tablets compared to mobile phones will provide a more complete learning experience and different universities could use tablets instead of printed books as part of their green projects.

Implementation of Mobile Learning in Iran

There are many challenges facing mobile learning in Iran. M-learning has not been officially recognized as a mode of learning by the Ministry of Education. Research has been carried out on a limited scale at universities, mainly theoretical or as MSc projects; however, without secure financial support they have not been able to be implemented on a large scale. Iranian users often have mobile phones with lower capabilities compared to European users; furthermore, Iran uses a second generation mobile communication system which is slower than the most recent generations. However, mobile games seem to be a more practical solution because of their ease of implementation and lower dependency on infrastructure.

There are a few educational games that have been implemented and tested in Iran. They are as follows:

(1) Adventure Quiz Games [36]: this game consisted of an attractive environment and fun characters. Players had a chance to win or lose according to a series of questions asked by the game characters. The game did not result in any cognitive changes. This was mainly due to the fact that the questions did not relate to the game story and were considered annoying [36].

(2) MOBO City [37] was an adventure game in a fantasy world of a computer's motherboard. Electronic components were depicted as different buildings in MOBO City. The main character was a bus that carried data from a starting location, often a computer's electronic port, to a certain location such as the computer's monitor. In this game, the player had to protect the data from computer viruses that were flying in a spaceship. During the game, when the player passed a certain location or achieved a certain goal, they were presented with an appropriate Technical English vocabulary item. This game has been effective in teaching vocabulary meaning [37].

(3) The Detective Alavi was the first AR game implemented in Iran [38]. In this game university students were constantly moving between real and virtual spaces with the help of the mobile phone's graphical interface, two dimensional Quick Response (QR) codes, Bluetooth and the camera. The game took place in a university, under the supervision of the lecturer. The virtual space of the game simulated a computer lab. The game experience was considered highly rich and motivating by students for learning Technical English vocabularies. It has also assisted the lecturer in teaching by presenting the material, involving the students in high level cognitive processes and assessing the students' work using an advanced scoring system [38].

Conclusion

Recent advances in mobile technologies and mobile phone capabilities provide a bright future for mobile learning. Current research has proved that combining game based learning and augmented reality could be very effective in mobile education. These games could result in a more fulfilling experience by placing the students in an authentic environment where they can apply their knowledge in an unthreatening way. Incorporating them in the classroom helps teachers to present and assess the educational material in less time and in a more organized manner.


References
[1] J.L. Shih, C.W. Chuang, and G.J. Hwang, An Inquiry-based Mobile Learning Approach to Enhancing Social Science Learning Effectiveness, Educational Technology and Society 13/4 (2010), 50-62.
[2] C. White, Learner Autonomy and New Learning Environments, Language Learning and Technology 15/3 (2011), 1-3.
[3] J. Attewell, From Research and Development to Mobile Learning: Tools for Education and Training Providers and their Learners, Proceedings of mLearn 2005 (2005), Available from: http://www.mlearn.org.za/CD/papers/Attewell.pdf.
[4] D. Corlettt and M. Sharples, Tablet technology for informal
collaboration in higher education, Proceedings of MLEARN
2004: Mobile Learning anytime everywhere, London, UK:
Learning and Skills Development Agency (2004), 59-62.
[5] M. Hansent, G. Oosthuizen, J. Windsor, I. Dohertyt, S.
Greig, K. McHardy, and L. McCann, Enhancement of Medical Interns Levels of Clinical Skills Competence and SelfConfidence Levels via Video iPods: Pilot Randomized Controlled Trial, Journal of Medical Internet Research 2011
13/1 (2011), e29.
[6] M. Joseph, C. Branch, C. March, and S. Lerman, Key factors mediating the use of a mobile technology tool designed
to develop social and life skills in children with Autistic Spectrum Disorders, Computers and Education 58/1
(2011), 53-62.
[7] F. Fotouhi-Ghazvini, R.A. Earnshaw, A. Moeini, D. Robison, and P.S. Excell, From E-Learning to M-Learning
the use of Mixed Reality Games as a New Educational
Paradigm, The International Journal of Interactive Mobile
Technologies (IJIM) 5/2 (2011), 17-25.


[8] A. Kumar, A. Tewari, G. Shroffi, D. Chittamuru, M. Kam,


and J. Cannyl, An exploratory study of unsupervised mobile
learning in rural India, In Proceedings of the 28th international conference on Human factors in computing systems
(CHI 10) (2010), 743-752.
[9] S.S. Nashr, Blended Mobile Learning in Developing Nations
and Environments with Variable Access: Three Cases., Mobile Information Communication Technologies Adoption in
Developing Countries: Effects and Implications (2010), 91102.
[10] M. Sharples, M. Milradi, I. Arnedillo, and G. Vavoulau,
Mobile Learning: Small Devices, Big Issues, TechnologyEnhanced Learning: Principles and Products/ Springer
Netherlands 2009 (2009), 233-249.
[11] Handler,http://www.eee.bham.ac.uk/handler/default.asp.
[12] MLearn,www.mobilearn.org/mlearn2004/presentations.htm.
[13] M-learning, http://www.m-learning.org/.
[14] Leonardo, http://www.leonardo.org.uk/.
[15] Molenet, http://www.molenet.org.uk/.
[16] E. Klopfer, Augmented Learning: Research and Design of
Mobile Educational Games, The MIT Press, 2008.
[17] K. Whattananarong, An experiment in the use of mobile
phones for testing at King Mongkuts Institute of Technology (2005), http://seameo.org/vl/krismant/mobile04.pdf.
[18] P. Thornton and C. Houser, Using Mobile Phones in Education, 2nd IEE International Workshop on Wireless and
Mobile Technologies in Education. (2004).
[19] A. Ramos, J. Trinona, and D. Lambert, Viability of SMS
technologies for non-formal distance education, Information and Communication Technology for Social Development (2006), 69-80.
[20] P.P. Luo, C.H. Lai, and D. Lambert, Mobile Technology
Supported Collaborative Learning in a Fieldtrip Activity,
Technology Enhanced Learning (2009).
[21] 2011 International Conference on ICT in Teaching and
Learning (15th HK Web Symposium) 11-13 July, 2011 Hong
Kong SAR http://ict2011.com/page.

[26] http://delphian.com.au.
[27] http://www.apac.studywiz.com/.
[28] R. Belwal and S. Belwal, Mobile Phone Usage Behavior of
University Students in Oman, New Trends in Information
and Service Science (2009), 954-962.
[29] P. Christy and H. Stevens, Gartner Says Android
to Command Nearly Half of Worldwide Smartphone
Operating System Market by Year-End 2012 (2011),
http://www.gartner.com/it/page.jsp?id=1622614.
[30] H. Tarumi, Y. Tsujimoto, T. Daikoku, F. Kusunoki, S. Inagaki, M. Takenaka, and T. Hayashi, Balancing virtual and
real interactions in mobile learning, International Journal
of Mobile Learning and Organisation 5/1 (2011), 28-45.
[31] C.L. Holden and J.M. Sykes, University of New Mexico,
USA Leveraging Mobile Games for Place-Based Language
Learning, International Journal of Game-Based Learning
1/2 (2011), 1-18.
[32] J. Johnson, Tablets To Overtake Desktop Sales By 2015, Laptops Will Still Reign (2010), http://www.inquisitr.com/76157/tablets-to-overtake-desktop-sales-by-2015-laptops-will-still-reign.
[33] S. Papert, The Childrens Machine: Rethinking School in the
Age of the Computers, Basic Books, New York, 1993.
[34] C.N. Quinn and R. Klein, Engaging Learning Designing eLearning Simulation Games, Pfeiffer: John Wiley and Sons,
Inc., 2005.
[35] G.a. Gunter, R. F. Kenny, and E.H. Vick, Taking educational games seriously: using the RETAIN model to design endogenous fantasy into standalone educational games,
Journal of Educational Technology Research and Development 56/5 (2008), 511-537.
[36] F. Fotouhi-Ghazvini, A. Moeini, D. Robison, R.A. Earnshaw, and P.S. Excelli, A Design Methodology for Gamebased Second Language Learning Software on Mobile
Phones, Proceedings of Internet Technologies and Applications, Wrexham, North Wales (2009), 609-618.

[22] S.S. Adkins, The Worldwide Market for Mobile Learning Products and Services: 2010-2015 Forecast and Analysis (2010), 1-21, Available from: http://www.ambientinsight.com/Resources/Documents/Ambient-Insight-2010-2015-US-Mobile-Learning-Market-Executive-Overview.pdf.
[23] http://www.open.ac.uk/deep.
[24] http://www.bridges.org/ipaq competition.
[25] www.wlv.ac.uk/.
[37] F. Fotouhi-Ghazvini, R.A. Earnshaw, D. Robison, and P.S. Excell, The MOBO City: A Mobile Game Package for Technical Language Learning, International Journal of Interactive Mobile Technologies 3/2 (2009), 19-24.
[38] F. Fotouhi-Ghazvini, R.A. Earnshaw, D. Robison, A. Moeini, and P.S. Excell, Using a Conversational Framework in Mobile Game based Learning Assessment and Evaluation, Communications in Computer and Information Science, Springer-Verlag Berlin Heidelberg 177 (2011), 200-213.


Predicting Crude Oil Price Using Particle Swarm Optimization (PSO) Based Method
Zahra Salahshoor Mottaghi

Ahmad Bagheri

Faculty of Engineering

Faculty of Engineering

Department of Computer Engineering

Department of Mechanical Engineering

zsalahshoor@msc.guilan.ac.ir

bagheri@guilan.ac.ir

Mehrgan Mahdavi
Faculty of Engineering
Department of Computer Engineering
mahdavi@guilan.ac.ir

Abstract: Oil is a strategic commodity in the entire world. The oil price changes constantly and rapidly, which makes it difficult to predict, so how to predict the future price of oil is one of the major issues in this industry. In this paper, a Particle Swarm Optimization (PSO) based method is proposed to predict the future price of oil for the upcoming 4 months. PSO is a population-based optimization method that was inspired by the flocking behavior of birds and human social interactions. The proposed equation has 13 dimensions and 4 variables; these variables are the prices of petroleum in the past 4 months. The experimental results indicate that the proposed approach can predict the monthly petroleum price with a 3.5 dollar difference on average.

Keywords: Crude Oil Price; Particle Swarm Optimization; Predicting; Forecasting.

Introduction

Prediction is an estimate, or a number of quantitative estimates, about the likelihood of future events, developed by the use of current and past data. Predictions are used as a guide for public and private policies, because decision making is not possible without predictive knowledge. For thousands of years oil has had an important role in people's lives. It is not only the main source of the world's energy, but it is also very hard to find a product that does not need oil in its production or distribution. Hence, predicting the oil price is considered to be a hot topic in this industry. In this paper, the oil price is predicted by a PSO based method.

PSO is one of the intelligent algorithms and is a suitable algorithm for optimization. Kennedy and Eberhart were inspired by the social behavior of birds and fish [1]. This algorithm has good speed and accuracy and can solve engineering problems well. Here, a method based on PSO is used to predict the oil price. The results show this method has good ability in forecasting the medium-term crude oil price.

Many studies have predicted the oil price, such as integrating text mining and neural networks in forecasting the oil price [2]; Junyou proposed a method for forecasting stock prices using PSO-trained neural networks [3]; and Abolhassani introduced a method for forecasting stock prices using PSO-SVM [4].

This paper is organized as follows: the PSO algorithm is described in Section 2. Section 3 presents the proposed PSO based method for predicting the oil price, the evaluation results are given in Section 4 and Section 5 refers to the conclusion.

Particle Swarm Optimization

Particle swarm optimization is a population-based evolutionary algorithm and is similar to other population-based evolutionary algorithms. PSO is motivated by the simulation of social behavior instead of survival of the fittest [1]. In PSO, each candidate solution is associated with a velocity [5]. The candidate solutions are called particles, and the position of each particle is changed according to its own experience and that of its neighbors (velocity). It is expected that the particles will move toward better solution areas. Mathematically, the particles are manipulated according to the following equations:

\vec{v}_i(t+1) = w \vec{v}_i(t) + C_1 r_1 (\vec{x}_{pbest_i} - \vec{x}_i(t)) + C_2 r_2 (\vec{x}_{gbest} - \vec{x}_i(t)) \qquad (1)

\vec{x}_i(t+1) = \vec{x}_i(t) + \vec{v}_i(t+1) \qquad (2)

where x_i(t) and v_i(t) denote the position and velocity of particle i at time step t. r_1 and r_2 are random values between zero and one. C_1 is the cognitive learning factor and represents the attraction that a particle has toward its own success. C_2 is the social learning factor and represents the attraction that a particle has toward the success of the entire swarm. w is the inertia weight, which is employed to control the impact of the previous history of velocities on the current velocity of a given particle. The personal best position of particle i is x_{pbest_i} and x_{gbest} is the position of the best particle of the entire swarm. Here, w is 0.4, and C_1 and C_2 are 2.

The Proposed Method

+w7 xi w8 + w9 xi+3 xi+2 + w10 xi+3 xi+1 + w11 xi+3 xi


+w12 xi+2 xi+1 + w13 xi+3 4 xi+1

(7)
Identifying and applying various parameters influencing oil price from past and present status can be very Epredictedif ourthmonth = w1 xi+3 w2 +w3 xi+2 w4 +w5 xi+1 w6
effective in making accurate predictions. Parameters
such as dollar price and inflation in America can affect
+w7 xi w8 + w9 xi+3 xi+2 + w10 xi+3 xi+1 + w11 xi+3 xi
on the desired issue.
+w12 xi+2 xi+1 + w13 xi+3 xi+2 xi+1 1.9
(8)
In this paper, the monthly oil price from past years
has been used to predict the next 4 months. These w shows the number of dimensions that are obtained
data are divided into two parts, training and testing by PSO algorithm in the training phase and i is the
data into three 4-months periods. Then data normal- number of data. Algorithm are repeated in each stage
ization was performed by formula (3) on the data until 100 times, 36 is the number of particles. Oil price is
they were placed between zero and one. The function predicted by the use of equations (5),(6),(7), and (8)
of PSO algorithm is considered to be the total squared in three periods.
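To make the training loop concrete, the following minimal Python sketch (ours, not the authors' MATLAB implementation; variable names and initialization ranges are assumptions) encodes the 13 weights of equation (5) as a particle, uses the squared error of equation (4) as fitness over the normalized series of equation (3), and applies the update rules (1)-(2) with W = 0.4, C1 = C2 = 2 and 36 particles for 100 iterations.

import numpy as np

def predict_first_month(w, x):
    # Equation (5): x = [x_i, x_{i+1}, x_{i+2}, x_{i+3}] are four past normalized monthly prices.
    xi, xi1, xi2, xi3 = x
    return (w[0]*xi3**w[1] + w[2]*xi2**w[3] + w[4]*xi1**w[5] + w[6]*xi**w[7]
            + w[8]*xi3*xi2 + w[9]*xi3*xi1 + w[10]*xi3*xi
            + w[11]*xi2*xi1 + w[12]*(xi3**4)*(xi1**6))

def fitness(w, prices):
    # Equation (4): total squared error between actual and predicted prices over the series.
    return sum((prices[i+4] - predict_first_month(w, prices[i:i+4]))**2
               for i in range(len(prices) - 4))

def pso_train(prices, n_particles=36, n_iter=100, dim=13, W=0.4, C1=2.0, C2=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, (n_particles, dim))   # particle positions = candidate weight vectors
    v = np.zeros_like(x)                            # particle velocities
    pbest, pbest_f = x.copy(), np.array([fitness(p, prices) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = W*v + C1*r1*(pbest - x) + C2*r2*(gbest - x)   # equation (1)
        x = x + v                                         # equation (2)
        f = np.array([fitness(p, prices) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest

# Usage sketch: normalize raw monthly prices with equation (3), train, then predict the next month.
# prices = (raw - raw.min()) / (raw.max() - raw.min())
# w = pso_train(prices); next_month = predict_first_month(w, prices[-4:])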


Table 1: Actual and predicted values of test data for the proposed method.

Figure 1: The proposed PSO-based method for forecasting the oil price. n is the number of training data.

Experimental Results

In order to evaluate the proposed method, it was applied to an oil data set containing 351 monthly records from 1982 to 2011, which are available at www.ioga.com. MATLAB 2011 was used to implement the method. The number of training data for the first, second and third periods is 339, 343 and 347, respectively. Experimental results for the three periods are shown in Figure 2 and Table 1. The proposed method predicted the monthly oil price with a 3.5 dollar difference on average. The initial population in PSO is selected randomly, so the averages of 4 runs of the algorithm per month are used in these results.

Discussion and Future Work

Oil is important in the international economy, so forecasting oil prices is essential for a country's planning. In this paper, the monthly price of petroleum is forecasted with a new method based on Particle Swarm Optimization. The results revealed acceptable performance of this method. The method can also be used to predict other prices, such as the gold price.

Figure 2: The forecasting result of the proposed method in three periods (actual vs. predicted oil price, in dollars per barrel, by month).

References

[1] J. Kennedy and R.C. Eberhart, Particle swarm optimization, Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, IV (1995), 1942-1948.
[2] Sh. Wang, L. Yu, and K.K. Lai, A novel hybrid AI system framework for crude oil price forecasting, Lecture Notes in Computer Science 3327 (2004), 233-242.
[3] B. Junyou, Stock forecasting using PSO-trained neural networks, Proceedings of the Congress on Evolutionary Computation (2007), 2879-2885.
[4] A.M. Toliyat Abolhassani and M. Yaghobbi, Stock price forecasting using PSO-SVM, 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE) (2010), 352-356.
[5] R.C. Eberhart, R. Dobbins, and P.K. Simpson, Computational Intelligence PC Tools, Morgan Kaufmann Publishers (1996), 233-242.

Image Steganalysis Based On Color Channels Correlation In Homogeneous Areas In Color Images

SeyyedMohammadAli Javadi, Shahed University, Tehran, Iran, sm.javadi@shahed.ac.ir
Maryam Hasanzadeh, Shahed University, Tehran, Iran, hasanzadeh@shahed.ac.ir

Abstract: Steganography is the art of hiding information. Whereas the goal of steganography is the avoidance of suspicion about hidden messages in other data, steganalysis aims to discover and render useless such covert messages. In this article, we propose a new steganalysis method based on the correlation of the color channels in adjacent pixels while omitting the heterogeneous areas of color images. The method is designed to be independent of the steganography method. The results show that it has high accuracy in steganalysis; at low embedding rates it also does better than the well-known WS, SP and RS steganalysis methods.

Keywords: steganography, steganalysis, color channels correlation, homogeneous and heterogeneous areas

Introduction

Steganography is the art of hiding information. Unlike cryptography, which protects the content of information from eavesdropping, steganography techniques are used to conceal the very existence of messages. Since the main goal of steganography is to communicate securely in a completely undetectable manner, an adversary should not be able to distinguish in any sense between cover-objects (objects not containing any secret message) and stego-objects (objects containing a secret message). In this context, steganalysis refers to the body of techniques that are conceived to distinguish between cover-objects and stego-objects [1],[2].
Digital images have a high degree of redundancy in their representation and pervasive applications in daily life, and are therefore appealing for hiding data. As a result, the past decade has seen growing interest in research on image steganography and image steganalysis. Some of the earliest work in this regard was reported by Johnson and Jajodia [3],[4]. They mainly look at palette tables in GIF images and anomalies caused therein by common stego-tools. A more principled approach to LSB steganalysis was presented in [5] by Westfeld and Pfitzmann. They identify Pairs of Values (PoVs), which consist of pixel values that get mapped to one another on LSB flipping.

Fridrich, Du and Long [6] define pixels that are close in color intensity as those differing by not more than one count in any of the three color planes. They then show that the ratio of close colors to the total number of unique colors increases significantly when a new message of a selected length is embedded in a cover image, as opposed to when the same message is embedded in a stego-image. A more sophisticated technique that provides remarkable detection accuracy for LSB embedding, even for short messages, was presented by Fridrich et al. in [7] and is called the RS method. Moreover, other steganalysis methods have been presented, such as WS [8] by Fridrich and Goljan and Sample Pair (SP) analysis [9] by Dumitrescu, Xiaolin and Wang.
Most recent steganalysis methods for color images are based on processing each color channel independently. In this article, we propose a new steganalysis method for detecting stego-images that focuses on the correlation between the color channels in the homogeneous areas of color images.


method. In Section 3, we present our experimental results. Finally, Section 4 concludes the paper.

2 Proposed Method

The proposed method is based on color channel correlation and the omission of heterogeneous areas in the color image, and is designed independently of the steganography method. The basic idea of feature extraction in RGB space is based on [10]. The features are extracted as follows. In the first step, for all pixels in the color image, we compute the differences between the pixel intensity and the intensity of its neighboring pixels in four directions, 0, 45, 90 and 135 degrees (i.e. we compute differences for the three channels, Red, Green and Blue, and produce the vector V = [dR dG dB]^T). Fig. 1 shows a pixel P and its neighbors in these four directions.

Figure 1: Directions of changes around pixel P.

In the second step, the sign of the three components of V is calculated, and then the summation of these signs, which lies between +3 and -3, is computed. We call this value, for a given pixel P, Signv(P). Because natural images have correlation between color channels, we expect the signs of the vector components to be the same in each pixel, so Signv must be +3, -3 or 0 for most of the pixels. We use this fact as a feature for distinguishing stego-images from clear images. Actually, the components of V in a clear image have the same sign, since the correlation between neighboring pixels causes the color intensity values to increase or decrease together in one direction. By embedding a message, this correlation diminishes and we expect some vectors whose components have different signs; this is because the embedding process disregards the color channel correlation. Hence, the first feature is defined as the ratio of the pixels with Signv(P) = +3, -3 or 0 to the total number of pixels in the image:

CF = #{p | Signv(p) = -3 or 0 or +3} / #TotalImagePixels    (1)

In the third step, the feature obtained in the previous steps is improved. To do so, the heterogeneous areas are identified using the following formula (Equation 2) and are not taken into account when calculating CF; in other words, these pixels have no effect on the computed features. We expect no correlation in heterogeneous areas, so the accuracy of the steganalysis method is increased by omitting these pixels from the Signv matrix.

P is heterogeneous if (dR > Thr) and (dG > Thr) and (dB > Thr)    (2)

In the above formula, the threshold is selected adaptively such that n% of the image pixels belong to heterogeneous areas. We set the n parameter to 5 experimentally; it means that the 5% of image pixels that have the least correlation with their neighboring pixels are not used in computing Signv. In the proposed method, four features based on the mentioned correlation are extracted from the image. First we calculate CF in the four directions 0, 45, 90 and 135 degrees:

Diff = [CF_0, CF_45, CF_90, CF_135]    (3)

Then the mean and the variance of the Diff vector are computed:

Feature1 = Mean(Diff)    (4)
Feature2 = Variance(Diff)    (5)

These two values form the first and the second features. Mean is expected to be greater for a clear image than for a stego-image; the opposite holds for Variance. In the next step we embed a random message in the image using LSB replacement and repeat the above operation, but in this situation the image carries a message. The third feature is computed by subtracting the variances of the Diff and Diff' vectors (the vectors obtained before and after embedding):

Diff' = [CF'_0, CF'_45, CF'_90, CF'_135]    (6)
Feature3 = |Variance(Diff) - Variance(Diff')|    (7)

Finally, we embed a random message in the image using LSB replacement once more, repeat the above operation, and calculate the fourth feature as below (a small value E is added to the denominator to avoid division by zero):

Diff'' = [CF''_0, CF''_45, CF''_90, CF''_135]    (8)
Feature4 = ||Mean(Diff'') - Mean(Diff')|| / ||Mean(Diff') - Mean(Diff) + E||    (9)


If an input image has already been tampered with a message, embedding again will not modify the features much, so we expect Feature3 to be close to zero and Feature4 to be close to 1. After extracting the features, a key factor is choosing a classifier; in this article we used a support vector machine (SVM) with a polynomial kernel.
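As an illustration of the feature computation (a NumPy sketch of ours, not the authors' implementation; the exact handling of borders and of the heterogeneous-pixel ratio is an assumption), the following computes the sign-consistency ratio CF of equation (1) in one direction after masking roughly 5% of the pixels as heterogeneous in the spirit of equation (2), and then Feature1 and Feature2 of equations (4)-(5); the remaining features of equations (6)-(9) repeat the same computation on the image after LSB re-embedding. The resulting feature vector would then be fed to a classifier such as an SVM with a polynomial kernel, as used in this article.

import numpy as np

OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}   # neighbor offset per direction

def cf_for_direction(img, direction, hetero_fraction=0.05):
    # img: H x W x 3 RGB array. V = per-channel difference to the neighbor in the given direction.
    dy, dx = OFFSETS[direction]
    neighbor = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    V = neighbor.astype(np.float64) - img.astype(np.float64)
    # Equation (2): a pixel is heterogeneous when all three channel changes exceed an adaptive
    # threshold chosen so that about hetero_fraction of the pixels are excluded.
    smallest_change = np.abs(V).min(axis=2)
    thr = np.quantile(smallest_change, 1.0 - hetero_fraction)
    homogeneous = smallest_change <= thr
    # Equation (1): fraction of homogeneous pixels whose three difference signs sum to +3, -3 or 0.
    sign_sum = np.sign(V).sum(axis=2)
    consistent = np.isin(sign_sum, (-3, 0, 3)) & homogeneous
    return consistent.sum() / homogeneous.sum()

def correlation_features(img):
    diff = np.array([cf_for_direction(img, d) for d in (0, 45, 90, 135)])   # equation (3)
    return diff.mean(), diff.var()                                          # Feature1, Feature2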

Experimental Results

In this section we present the experimental results of our proposed method; the outcomes are compared with the three well-known WS, SP and RS steganalysis methods and with the method suggested in [10]. We downloaded 100 images for the training set, and 50% of these images were embedded using LSB replacement with embedding rates of 10%, 20%, 30%, ..., 100%. To determine the accuracy of the steganalysis methods on the test set, 60 other images were chosen. The training and test processes were carried out under the same conditions for the three well-known WS, SP and RS steganalysis methods, the method suggested in [10], and our proposed method. The steganalysis assessment is based on the confusion matrix [11].

Figure 2: confusion matrix

TPRate = TPs / (TPs + FNs)    (10)
FPRate = FPs / (TNs + FPs)    (11)
AccuracyRate = (TPs + TNs) / (TPs + FNs + TNs + FPs)    (12)
PrecisionRate = TPs / (TPs + FPs)    (13)

In this article, we draw the charts of Equations 10, 11, 12 and 13 for the three well-known WS, SP and RS steganalysis methods, the method suggested in [10], and our proposed method (Fig. 3-6). Regarding these charts, we come to the following conclusions. At low embedding rates (10%, 20%, 30%), where detection is harder, the proposed steganalysis method does better than the other methods. The proposed method also has suitable performance at high embedding rates, and in all cases it does better than the SP steganalysis method. There is little variation in the proposed method when the embedding rate changes, while the other methods show considerable variation; in other words, from low to high embedding rates the total detection rate of the proposed method improves steadily.

Figure 3: TP Rate
Figure 4: FP Rate
Figure 5: Accuracy Rate
Figure 6: Precision Rate
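For reference, equations (10)-(13) translate directly into code; a small Python helper (ours):

def confusion_rates(tp, fp, tn, fn):
    # Equations (10)-(13): TP rate, FP rate, accuracy and precision from confusion-matrix counts.
    return {"tp_rate": tp / (tp + fn),
            "fp_rate": fp / (tn + fp),
            "accuracy": (tp + tn) / (tp + fn + tn + fp),
            "precision": tp / (tp + fp)}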


Conclusion

In this paper, we have proposed a new steganalysis technique based on color channel correlation and the omission of heterogeneous areas in color images. We demonstrated the effectiveness of the proposed approach against LSB replacement. It is shown that our method detects the hidden message very accurately even at low embedding rates, where it also does better than the well-known WS, SP and RS steganalysis methods and the method suggested in [10].

References

[1] J. D. Boissonnat and C. Delage, Essentials of image steganalysis measures, Journal of Theoretical and Applied Information Technology (2010).
[2] T. Morkel, J.H.P. Eloff, and M.S. Olivier, An overview of image steganography, Proceedings of the Fifth Annual Information Security South Africa Conference (ISSA2005), Sandton, South Africa (June/July 2005).
[3] N. F. Johnson and S. Jajodia, Steganalysis: The investigation of hidden information, IEEE Information Technology Conference, Syracuse, USA (1998).
[4] B. Prakash Battula and R. Satya Prasad, Steganalysis of images created using current steganography software, Springer, D. Aucsmith (Ed.): Information Hiding, LNCS 1525, Verlag Berlin Heidelberg (1998), 32-47.
[5] A. Westfeld and A. Pfitzmann, Attacks on steganographic systems, Springer, Information Hiding, LNCS 1768, Verlag Heidelberg (1999), 61-76.
[6] J. Fridrich, R. Du, and M. Long, Steganalysis of LSB encoding in color images, Proceedings of ICME 2000, New York, USA (2000).
[7] J. Fridrich, M. Goljan, and R. Du, Reliable detection of LSB steganography in color and grayscale images, Proc. of the ACM Workshop on Multimedia and Security, Ottawa, CA, October 5 (2001), 27-30.
[8] J. Fridrich and M. Goljan, On estimation of secret message length in LSB steganography in spatial domain, Security, Steganography, and Watermarking of Multimedia Contents VI, E. J. Delp III and P. W. Wong, eds., Proc. SPIE 5306 (2004), 23-34.
[9] S. Dumitrescu, Wu Xiaolin, and Zhe Wang, Detection of LSB steganography via sample pair analysis, Springer-Verlag, LNCS, New York, USA (2003), 355-372.
[10] N. Yousefi, Steganalysis of 24-bit color images, M.Sc. Thesis, Engineering Department, Shahed University (in Persian) (2011).
[11] Bin Li, Junhui He, Jiwu Huang, and Yun Qing Shi, A survey on image steganography and steganalysis, Journal of Information Hiding and Multimedia Signal Processing, Volume 2 (April 2011), 142-172.

Online Prediction of Deadlocks in Concurrent Processes

Seyed Morteza Babamir, Department of Computer Engineering, University of Kashan, Kashan, Iran, Babamir@kashanu.ac.ir
Elmira Hasanzade, Department of Computer Engineering, University of Kashan, Kashan, Iran, elm.hasanzade@grad.kashanu.ac.ir

Abstract: This study addresses an approach to predicting deadlocks in concurrent processes, where the processes are threads of a multithread program. A deadlock occurs when two processes each need a resource held by the other; accordingly both of them will wait for the other forever. Based on the past behavior of the threads of a multithread program, the possibility of deadlock in the future behavior of the threads can be estimated. Predicting future behavior from past behavior calls for a mathematical model, because multithread programs have uncertain behavior. To this end, we consider the past behavior of threads in terms of time series indicating a sequence of time points. Then, we use the past time points in Artificial Neural Networks (ANNs) to predict future time points. The efficiency and flexibility of ANNs in predicting complex behavioral patterns motivated us to use them. In fact, using ANNs to predict and improve the safety of multithread program behavior is the contribution of this study. To show the effectiveness of our model, we applied it to some Java multithread programs that were prone to deadlock. Compared with the actual execution of the programs, about 74% of the deadlock predictions proved correct.

Keywords: Multithread program, Deadlock detection, Artificial Neural Networks, Time series

Introduction

The prevalence of multi-core processors is widely encouraging programmers to use concurrent programming. However, applying concurrency introduces many challenges, and among them deadlock is one of the best-known problems. The origin of deadlock is in sharing exclusive resources between processes or threads. Locking mechanisms are used to share these resources between processes or threads. Locking is a task done by the programmer; because of this, it is an error-prone technique with the potential to cause deadlocks.

Recovering from deadlock is not a cost-efficient solution. Solutions such as (1) restarting the system, (2) killing processes or threads until the deadlock is removed, and (3) preempting some resources from processes are the most common ways of deadlock recovery. Each of these approaches is not cost efficient and in many cases causes performance loss. In addition, using these methods carries serious risks for system integrity. It seems that preventing programs from getting trapped in deadlock is much more suitable. For this reason, some policies have been devised to keep a concurrent system from getting trapped in deadlock; Deadlock Prevention and Deadlock Avoidance are examples of such policies. However, these types of approaches impose many limitations on concurrency and in many cases cause other concurrency problems such as starvation.

Online deadlock detection at runtime has received attention in recent years, because it does not have the limitations of the previous approaches. In general, such techniques allow the system to proceed normally without any limitation; while the program is running, one or more monitors observe the execution of the program and try to find out about the possibility of deadlock in the future.

The prediction of deadlock possibility at runtime is another choice, which we use as the basis of our proposed approach. This paper is organized as follows: Section 2 overviews the related works and the technologies used in this work. The proposed model is discussed in Section 3; the problem definition is in Subsection 3.1 and the model architecture and its components are presented in Subsection 3.2. The model implementation and evaluation results are discussed in Section 4. We conclude the paper in Section 5.

Related Works

Deadlocks are among the best-known problems in concurrent systems and are threats to system safety. Safety says that something bad never happens, and deadlock is an instance of something bad which usually occurs in concurrent systems. One early attempt at dealing with the deadlock problem was to let the program fall into deadlock and then try to recover from it. Recovering from deadlock is not cost efficient and in many cases causes performance loss. In addition, using these methods carries serious risks for system integrity: when a deadlock occurs, its side effects may be manifested in later states of the program and then cause serious problems, and a deadlock can disable the recovery methods so that the system cannot roll back to a safe state automatically.
In recent years, many techniques have been developed to detect potential deadlocks before their real occurrence. Obviously, if we can detect a potential deadlock earlier, we can make decisions about preventing the system from falling into the deadlock or about the policy for recovering from it. In general, we can divide potential deadlock detection techniques into two categories: offline techniques and online (dynamic) techniques. Offline techniques either analyze a simplified model of the program, in most cases by model checking, or analyze the source code of the program, which is an annotation-based task. In the case of model checking, the model and not the real program is analyzed, so it is possible that there are differences between the model and the implementation. The most important disadvantage of model checking is the state explosion problem, which is a serious concern for multithread programs. Annotation-based techniques need the programmer's effort to inject knowledge into the source code. This technique is not useful for legacy code; it also depends on a specific language and is not beneficial for languages that are not typeable [1].

In turn, online techniques are mostly not language dependent and do not need programmer effort. These techniques can be applied to legacy code with minimum effort. These advantages make online techniques a proper choice for potential deadlock detection.
The reason for deadlocks is requesting shared resources in a nested way. That is, a thread or process requests an exclusive shared resource while it holds other resources. These requests are also blocking, which means that if the requested resource is in use by another thread, the requesting thread or process will stop working and wait until the thread which holds the requested resource releases it. To represent these requests and held resources, we can draw a graph which also represents the threads that stop proceeding because of other threads; there is a deadlock in the system if there is a cycle in this graph. Most approaches use this graph to reason about potential deadlocks. This graph can also be drawn in the form of a lock graph, where a lock is a mutual-exclusion constraint placed on each process or thread when it requests a shared resource. In [2] a controller draws an online lock graph of the system and, using some algorithms, finds specific paths named not-guarded SCCs (strongly connected components). These paths have a strong possibility of turning into one or more cycles. The idea is to raise the probability of manifesting really existing deadlocks in a not-guarded SCC by injecting noise. This approach is similar to the Goodlock algorithm [3]. The difference is that Goodlock looks within the scope of one program run, which means that when a cycle in the graph is caused by lock sequences from two different runs, Goodlock cannot detect it [2]. Some extensions of Goodlock have been proposed, such as the one discussed in [4], another form of Goodlock named iGoodlock or informative Goodlock, which reports the potential cycles in a multithread program based on the lock locations in the program. DEADLOCKFUZZER is another technique which combines iGoodlock with a randomized thread scheduler to create real deadlocks with high probability.
In [5] the concept of deadlock immunity has been introduced: the ability of a system to somehow avoid all deadlocks that have happened in the past. When a deadlock happens for the first time, its information is kept in a concept named a context, in order to avoid similar contexts in future runs; in this way immunity against the corresponding deadlocks is achieved. To avoid deadlocks with already seen contexts, the scheduling of threads is changed. Deadlock contexts accumulate in the system; therefore, it can avoid a wider range of deadlocks. However, if a deadlock does not have a pattern similar to an already encountered one, this approach will not avoid it.
Obviously, in all online approaches, some portion of the program is pre-run and, using techniques like noise injection or rescheduling of the threads' runs, it is checked whether it is possible to encounter a deadlock in the future or not [6].


In all of these approaches, exploiting the fact that a multithread program has a nondeterministic nature, together with some other reasoning, a portion of the state space is selected and pre-searched to find out the possibility of deadlock. However, when this portion is large, searching it at runtime is not a trivial task, either in time or in space [7]. To address these issues, it is suitable to use process behavior prediction techniques to predict those parts of the process behavior that are related to deadlock occurrence. Indeed, in this way the overhead of detecting a potential deadlock at runtime becomes a linear function of the cost of the prediction technique.

2.1 Process behavior prediction techniques

In some applications, it is useful to predict the future behavior of applications. In order to apply such techniques, it is necessary to know the application's past behavior and to predict its future behavior. A process behavior can be represented by its execution pattern, also known as the process access pattern [8].
To predict the behavior of a process or thread, the execution trace must be converted into a representative time series. A time series is a set of observations from the past until the present, denoted by s(t - i) where 0 < i < P and P is the number of observations. Time series prediction is the estimation of future observations s(t + i), i = 1...N, where N is the size of the prediction window [9].
The observed behaviors could be the sequence of events performed by a process or thread (for example, disk I/O, CPU activity, network transmissions, grabbing a lock and so on); the equivalent time series then represents these events. In cases where these time series are sequential, the future members of the time series (the equivalent future application events) can be easily determined. Most applications, however, follow complex rules, therefore requiring different approaches for prediction, such as statistical evaluation or artificial neural network techniques [10].
In general, time series prediction techniques can be classified into two categories: statistical techniques and techniques based on advanced tools such as neural networks. Statistical prediction techniques are based on linear or nonlinear predictors. Among the linear ones we find Autoregressive (AR), Moving Average (MA) and combined AR and MA (ARIMA) models [11]. These techniques have some limitations, such as inefficiency for real-world problems, which are mostly complex and nonlinear; they assume that a time series is generated by a linear process. In turn, statistical techniques based on nonlinear predictors, like the threshold predictor, exponential predictor, polynomial predictor and bilinear predictor, were proposed to add more precision to prediction. However, the selection of a suitable nonlinear model and the computation of its parameters is a difficult task for a practical problem in which there is no prior knowledge about the time series under consideration. Moreover, it has been shown that the capability of the nonlinear model is limited, because it is unable to provide long-term prediction [9].
In recent years, artificial intelligence tools have been extensively used for time series prediction [12,13]. In particular, artificial neural networks are frequently exploited for time series prediction problems. A neural network is an information processing system that is capable of treating complex problems of pattern recognition, or of dynamic and nonlinear processes; in particular, it can be an efficient tool for prediction applications. The advantage of neural networks compared to statistical approaches is their ability to learn and then to generalize from their knowledge [14]. Also, neural networks are based on training, and in many cases their prediction results are precise even if the training set has considerable noise [10]. These approaches are much more suitable for real-world problems that do not obey specific rules.
Process behavior prediction techniques have mostly been used in applications to improve performance and utilization algorithms in distributed and concurrent systems. This work is usually done in four steps. In the first step, the application execution is observed using an analysis tool. In the second step, the obtained application behavior is converted into time series. In the third step, the converted behavior (time series) is used to predict some of the next behaviors of the processes. The final step consists of the quantification of the predicted behavior; that is, the predicted future behavior is used in load balancing, caching and prefetching, process migration, thread scheduling and failure prediction algorithms [15]. For example, if we want to use the predicted behavior in a load balancing algorithm, and if the prediction affirms that the process will suffer a transition from execution state S1, characterized by excessive CPU utilization, to state S2, which requires heavy network traffic, the model allows predicting the best course of action for such an operation.


Proposed Model

In this section we summarize our model for detecting potential deadlocks in a multithread program using an artificial neural network. In multithread programs, there are shared resources that are used by threads. When a thread needs a shared resource, it requests it and wants to lock it; if the resource is available the thread takes it, and otherwise it stops proceeding until it can take the resource. Also, when a thread does not need an owned resource any more, it releases the resource. Activities like requesting and releasing a resource, which are issued by threads or processes in a concurrent system, cause deadlocks; therefore, these types of information are valuable for determining the possibility of deadlock. The order of these requests and releases has a direct effect on deadlock occurrence. Thus, if one can predict the future order of requests and releases that will be issued by each thread or process, one can determine the possibility of deadlock. The precision of predicting future requests or releases has a direct effect on the precision of our approach in detecting potential deadlocks.
In the following subsection we define the problem we deal with, and in the next subsection we discuss our model.

3.1 Problem Definition

To detect potential deadlocks, as mentioned, the future order of the requests and releases that will be issued by each thread or process should be predicted. That is, for every process or thread, we should predict the type of action it is going to perform on each shared resource. For example, if the system is in the t-th period of time, for the (t+1)-th period our prediction should be something like: thread_i is going to request resource_j (and, as we know, if it is not available, thread_i will stop proceeding until it can take resource_j), or thread_k is going to release resource_z that it took previously. These types of information are necessary for reasoning about deadlock possibility. To refer to these types of behaviors we introduce the concept of deadlock-wise behavior; the other behaviors of threads or processes are not relevant to detecting potential deadlocks. For example, the deadlock-wise behavior of thread_i between times 0 and t in an execution trace could be something like:
thread_i[0-t] = {Request(resource_j), Request(resource_k), Release(resource_j), Request(resource_l), Request(resource_p), Release(resource_l)}

This type of information can easily be converted into univariate time series, each of which represents the behavior of a dedicated thread toward a dedicated resource in a time interval. Such a time series can be shown in the form of a two-element tuple like (thread_i, resource_j) = {nothing, request, nothing, nothing, nothing, release, nothing, request}, meaning that thread_i does nothing with resource_j in the first period, requests it in the second period, again does nothing with it in the following periods, releases it in the sixth period, and requests it again in the eighth period. Such a set can be written for any thread and any resource, which together make a two-element tuple. Each member of the set takes one of three values: {release, request, nothing}. This set is a univariate time series which can be used for predicting the thread's behavior in the (t+1)-th period of time. Therefore, we will have n x r time series, where n is the number of threads and r is the number of shared resources or locks.
Actually, what we are trying to do is to extract these deadlock-wise behaviors and somehow predict the future deadlock-wise behaviors of the processes or threads.
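The encoding just described can be sketched in a few lines of Python (ours; the event-record format is hypothetical, not the authors' implementation): each (thread, resource) pair gets its own univariate series whose entries are drawn from {nothing, request, release}, one per time period.

from collections import defaultdict

NOTHING, REQUEST, RELEASE = 0, 1, 2   # numeric codes for the three possible series values

def build_series(events, n_periods):
    # events: iterable of (period, thread, resource, action) with action in {"request", "release"}.
    # Returns one univariate series per (thread, resource) pair, as described in Section 3.1.
    series = defaultdict(lambda: [NOTHING] * n_periods)
    for period, thread, resource, action in events:
        series[(thread, resource)][period] = REQUEST if action == "request" else RELEASE
    return dict(series)

# (thread_i, resource_j) = {nothing, request, nothing, nothing, nothing, release, nothing, request}
log = [(1, "thread_i", "res_j", "request"),
       (5, "thread_i", "res_j", "release"),
       (7, "thread_i", "res_j", "request")]
print(build_series(log, 8)[("thread_i", "res_j")])   # -> [0, 1, 0, 0, 0, 2, 0, 1]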


3.2 Model Architecture and Components

For the online prediction of potential deadlocks in multithread programs we propose a model consisting of four components, each with a dedicated task. The architecture of the proposed model is shown in Figure 1, and each component's task is discussed in the following.

Figure 1: Proposed model architecture

Behavior Extraction & Time Series Generation Component: this component has two main parts. The first one (the behavior extractor) is responsible for observing the application execution and extracting the deadlock-wise behaviors online from the observed execution. It sends the extracted deadlock-wise behaviors to the Runtime Lock Tracker Component, which will be explained later. Once a target behavior has been extracted, the second part (the Time Series Generator) converts this behavior into a member of the time series that the behavior belongs to. The result of this part is fed into the Predictor Component.

Runtime Lock Tracker Component: this component takes the extracted online deadlock-wise behaviors and draws and keeps an online lock graph which represents the current lock and thread or process states in the system.

Predictor Component: this component is responsible for predicting the future members of the time series that come from the Behavior Extraction & Time Series Generation Component. As discussed earlier, considering the complex and nonlinear nature of the behavior of concurrent processes or threads, the most suitable way to predict the future members of the time series, which are another representation of these thread or process behaviors, is artificial neural network prediction techniques. At dedicated time intervals, this component takes the time series from the Behavior Extraction & Time Series Generation Component and predicts their future members.

Decision Maker Component: the last component is the Decision Maker Component. It takes the predicted time series and the current lock graph and composes the two together. After that, it uses cycle detection algorithms to conclude about the deadlock possibility. For example, suppose the current system lock graph is as in Figure 2.

Figure 2: Real System Lock Graph

This graph means:
1. thread1 holds resource c and requests resource r, so it is waiting for thread3, which holds resource r;
2. thread2 holds resource a and resource d and requests resource b, so it is waiting for thread4, which holds resource b.

Suppose also that, according to the prediction results coming from the Predictor Component, the Decision Maker Component concludes:
1. thread2 is going to wait for thread1;
2. thread3 is going to wait for thread2;
3. thread2 is going to wait for thread4.

This component, based on what it receives from the runtime lock tracker component and the predictor component, draws a virtual lock graph which composes the information gained from those two components. The composition for our example is the virtual graph of Figure 3.

Figure 3: The Composite Graph

In our example, the composition of the real system lock graph and the predicted future events results in a virtual graph showing that in the next state or period of time a deadlock is possible; therefore, this component reports that a deadlock is predicted in the next state of the system.

All of these components are linked together in a way that allows them to cooperate at runtime. For any program that acquires locks for mutual exclusion in a multithread program, we can use this model to predict the deadlock possibility at runtime.
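The Decision Maker step can be illustrated with a short Python sketch (ours, not the authors' Java code): the wait-for edges of the current lock graph are merged with the predicted waits, and a depth-first search looks for a cycle in the composed graph, which signals a possible deadlock in the next period. The example data reproduces the scenario of Figures 2 and 3.

def has_cycle(edges):
    # edges: dict mapping a thread to the set of threads it waits for (a wait-for graph).
    WHITE, GRAY, BLACK = 0, 1, 2
    nodes = set(edges) | {u for vs in edges.values() for u in vs}
    color = {t: WHITE for t in nodes}

    def dfs(t):
        color[t] = GRAY
        for u in edges.get(t, ()):
            if color[u] == GRAY or (color[u] == WHITE and dfs(u)):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in nodes)

# Current lock graph (Figure 2): thread1 waits for thread3, thread2 waits for thread4.
current = {"thread1": {"thread3"}, "thread2": {"thread4"}}
# Predicted waits from the predictor component: thread2->thread1, thread3->thread2, thread2->thread4.
predicted = {"thread2": {"thread1", "thread4"}, "thread3": {"thread2"}}
# Composed virtual graph (Figure 3) and cycle check.
composed = {t: current.get(t, set()) | predicted.get(t, set()) for t in set(current) | set(predicted)}
print(has_cycle(composed))   # True: thread1 -> thread3 -> thread2 -> thread1, so a deadlock is predicted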


In the next section we discuss the implementation and evaluation of our model on a Java multithread program in which all threads behave randomly.

Implementation and Evaluation Results

4.1 Implementation

We implemented each component separately and then linked them together. In the following, each component is discussed in detail.
The Behavior Extraction & Time Series Generation Component was implemented using Java and the AspectJ compiler. This component takes a Java multithread program and instruments it using AspectJ. What it instruments in the target code is the logic of extracting the deadlock-wise behaviors and converting them into time series. After doing this, any time the targeted multithread code is executed, the behaviors we are interested in are extracted at runtime and converted into time series.
The second component, the runtime lock tracker, was implemented in Java. It takes the online extracted deadlock-wise behaviors from the first component and draws a lock graph.
The third component was implemented using the Time Series Tools of the MATLAB Neural Network Toolbox. We used the Nonlinear Autoregressive (NAR) predictor network in our work. This network predicts each member of a time series using d past values of that series, that is, y(t) = f(y(t-1), ..., y(t-d)). It is a simple network which consists of 3 layers, named the input, output and hidden layers. In addition to the d parameter, the number of nodes in the hidden layer is another important factor in the network configuration, which affects the quality of the predictions; the hidden layer nodes are responsible for the main part of the prediction task, and the proper number of these nodes depends on the type of time series to be predicted. We used n x r of these networks (n is the number of threads and r is the number of shared resources) to predict all the future members of the time series. This is a simple network and its computational complexity is low.
The last component is again a Java program that implements a two-graph composition algorithm and another algorithm that can find cycles in the resulting composed graph. It receives the results of the predictor component and the online lock tracker component and reasons about the possibility of deadlock in the future.
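For illustration, the per-series prediction can be sketched in Python (ours, with scikit-learn's MLPRegressor standing in for the MATLAB NAR network; the rounding of the output to an event code is also our simplification): each series is turned into (d past values -> next value) pairs and one small one-hidden-layer network is fitted per (thread, resource) series, with d and the hidden-layer size corresponding to the parameters examined in Table 1.

import numpy as np
from sklearn.neural_network import MLPRegressor

def lagged_pairs(series, d):
    # Build (y(t-d), ..., y(t-1)) -> y(t) training pairs from one univariate series.
    X = np.array([series[t - d:t] for t in range(d, len(series))])
    y = np.array(series[d:])
    return X, y

def fit_predictor(series, d=3, hidden_nodes=10):
    # One small network per (thread, resource) series, as in the predictor component.
    X, y = lagged_pairs(series, d)
    net = MLPRegressor(hidden_layer_sizes=(hidden_nodes,), max_iter=5000, random_state=0)
    net.fit(X, y)
    return net

def predict_next(net, series, d=3):
    # y(t+1) = f(y(t), ..., y(t-d+1)); clamp and round to the nearest event code {0, 1, 2}.
    raw = net.predict(np.array(series[-d:], dtype=float).reshape(1, -1))[0]
    return int(round(min(max(raw, 0.0), 2.0)))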


4.2 Evaluation Results

We applied our implementation to the online prediction of potential deadlocks for a multithread program written in Java. This is a deadlock-prone multithread program consisting of 20 threads which share 10 resources. In our test-suite program, every thread requests the resources in a random way and releases each one after a random time. Our evaluation consists of two phases: first we evaluate multiple network configurations by training and testing each configuration, and second we evaluate our approach in detecting potential deadlocks.

4.2.1 Selecting network configuration parameters and training the networks

We deployed 200 NAR (Nonlinear Autoregressive) networks for prediction, because of the 20 threads and 10 resources. For the first evaluation phase, we ran our program 250 times and used the information of these runs to train and test the networks. In this part of the work, we examined networks with different values of d (the number of past values of the series used) and different numbers of nodes in the hidden layer. In this way, we selected the best network configuration to apply in the proposed deadlock prediction model. The result of each configuration is shown in Table 1. The overall result of the networks in the case where d is 3 and the number of nodes in the hidden layer is 10 is the best, so we selected this configuration for the predictor component.

Table 1: The results of different network configurations.

Total number of runs | d, with y(t) = f(y(t-1), ..., y(t-d)) | Number of hidden layer nodes | Test set MSE over 200 networks
250 | 1 | 5  | 9.0e-1
250 | 2 | 5  | 4.058e-1
250 | 3 | 5  | 1.81e-1
250 | 4 | 8  | 1.08e-0
250 | 1 | 10 | 4.001e-1
250 | 2 | 10 | 2.089e-0
250 | 3 | 10 | 1.052e-0

4.2.2 Evaluation of the proposed approach in detecting potential deadlocks

After training the networks with the selected configuration, we started the second evaluation phase and tested our model. The obtained results are shown in Figure 4.

Figure 4: The failure rate of potential deadlock prediction

Figure 4 represents the failures in predicting the deadlock possibility using our approach. In our test, we ran the target multithread program 500 times; during these runs deadlock occurred 17 times. Our approach reported 13 of them before their actual occurrence and missed 4 of them; in 3 cases it also reported a false positive. This is a considerably good result for a program that behaves completely randomly.

Conclusion

Techniques for detecting potential deadlocks in a multithread program can be divided into two major categories, online techniques and offline techniques. Online detection techniques have some advantages compared with static techniques; for example, offline techniques suffer from state explosion or require programmer effort to inject knowledge into the code to detect potential deadlocks. Therefore, in recent years online techniques have received a lot of attention. However, the most important weakness of currently used online detection techniques is that they are not cost efficient, either in time or in space.
In this work, we introduced a novel online approach to detect potential deadlocks. This approach is more cost efficient in comparison with other online techniques. In addition, it does not impose the limitations of offline or traditional deadlock detection techniques, like the Banker's algorithm.
The contribution of this work is in using process behavior prediction techniques to reason about deadlock possibility. We first convert the process execution behavior to multiple time series, and next predict the future members of these series. The predicted members are translated back into behaviors; therefore, we obtain the future behaviors of the threads. Using these predicted behaviors we conclude about the deadlock possibility in the future. The rate of true detection of deadlock occurrences depends on the correctness of the predicted behaviors. In the proposed approach, the prediction has been done using neural networks, which are a powerful technique for predicting complex and nonlinear time series.
We used the NAR network, which is a time series predictor network, and trained and evaluated these networks using the information gathered from test runs. The obtained results showed applicable performance. In most multithread programs, each thread's behavior depends on its past behaviors; therefore, NAR networks, which use past information to reason about the future, can be proper networks for predicting the future behaviors of threads. The results shown in Table 1 emphasize this claim. Finally, the network configuration which gave the best results was selected and used in the predictor component of our model. The final results obtained from the model showed that 74% of deadlocks were predicted correctly before occurrence. As we saw in Figure 4, except for a few cases, in most cases the model correctly concluded about the possibility of deadlock occurrence; considering the completely random behavior of the threads, this result is satisfactory. Also, this model does not add any runtime overhead to the program: except for the small amount of instrumentation that we inject into the code to extract the deadlock-wise behaviors, this model runs completely separately, and none of the components have a cost-intensive task, not even the predictor component, which is implemented using a neural network.


References

[1] D. Engler and K. Ashcraft, RacerX: Effective, static detection of race conditions and deadlocks, SOSP (2003).
[2] Y. Nir-Buchbinder, R. Tzoref, and S. Ur, Deadlocks: From exhibiting to healing, Runtime Verification: 8th International Workshop, RV, Budapest, Hungary (2008).
[3] S. Bensalem, J. Fernandez, K. Havelund, and L. Mounier, Confirmation of deadlock potentials detected by runtime analysis, Workshop on Parallel and Distributed Systems: Testing and Debugging (2006).
[4] P. Joshi, C. Park, K. Sen, and M. Naik, A randomized dynamic program analysis technique for detecting real deadlocks, ACM SIGPLAN Conference on Programming Language Design and Implementation, Dublin, Ireland (2009).
[5] H. Jula and G. Candea, A scalable, sound, eventually-complete algorithm for deadlock immunity, 8th International Workshop, RV, Budapest, Hungary (2008).
[6] F. Chen and G. Rosu, Predictive runtime analysis of multithread programs, supported by the joint NSF/NASA.
[7] C. Wang, S. Kundu, M. Ganai, and A. Gupta, Symbolic predictive analysis for concurrent programs.
[8] E. Dodonov and R. F. de Mello, A model for automatic on-line process behaviour extraction, classification and prediction in heterogeneous distributed systems, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid07) (2007).
[9] N. Baccour, H. Kaaniche, M. Chtourou, and M. B. Jemaa, Recurrent neural network based time series prediction: Particular design problems, International Conference on Smart Systems and Devices, Hammamet, Tunisia (2007).
[10] R. Zemouri, D. Racoceanu, and N. Zerhouni, Recurrent radial basis function network for time-series prediction, Engineering Applications of Artificial Intelligence (2003), 453-463.
[11] O. Voitcu and Y. Wong, On the construction of a non-linear recursive predictor, Journal of Computational and Applied Mathematics (2004).
[12] Y. Chen and A. Abraham, Time-series forecasting using flexible neural tree model (2004), 219-235.
[13] C.J. Lin and Y.J. Xu, A self-adaptive neural fuzzy network with group-based symbiotic evolution and its prediction applications, Fuzzy Sets and Systems (2005).
[14] R. Zemouri and P. Ciprian Patic, Recurrent radial basis function network for failure time series prediction, World Academy of Science, Engineering and Technology 72 (2010).
[15] E. Dodonov and R. F. de Mello, A novel approach for distributed application scheduling based on prediction of communication events, Future Generation Computer Systems 26.


Fisher Based Eigenvector Selection in Spectral Clustering Using Google's Page Rank Procedure

Amin Allahyar, Department of Computer Engineering, Ferdowsi University of Mashhad, Amin.Allahyar@stu-mail.um.ac.ir
Hadi Sadoghi Yazdi, Department of Computer Engineering, Ferdowsi University of Mashhad, H-Sadoghi@um.ac.ir
Soheila Ashkezari Toussi, Department of Computer Engineering, Ferdowsi University of Mashhad, Soheila.Ashkezari@stu-mail.um.ac.ir

Abstract: The Ng-Jordan-Weiss (NJW) approach is one of the most widely used spectral clustering algorithms. It uses eigenvectors of the normalized affinity matrix derived from the input data. These eigenvectors are treated as new features of the input data: they retain the structure of the high-dimensional input data but represent it in a lower dimension, so the transformed data can easily be used in regular clustering algorithms. The NJW method uses the eigenvectors with the highest corresponding eigenvalues. However, these eigenvectors are not always the best selection to reveal the structure of the data. In this paper, we use Google's page rank algorithm to replace the unsupervised problem with an approximated supervised problem, and then utilize the Fisher criterion to select the most representative eigenvectors. The experimental results demonstrate the effectiveness of selecting the relevant eigenvectors using the proposed method.

Keywords: Feature/Eigenvector Selection, Fisher Criterion, Spectral Clustering, Google's Page Rank.

Introduction

Spectral clustering techniques [1] originate from spectral graph theory [2] and make use of the spectrum of the similarity matrix of the data to apply dimensionality reduction for clustering. The basic idea is to construct a weighted graph from the input data in such a way that the vertices of the graph are data points, and each weighted edge represents the degree of similarity between the corresponding pair of vertices. The Scott and Longuet-Higgins algorithm [3], the Perona and Freeman algorithm [4], Normalized Cut [5] and NJW [6] are such spectral techniques.
Spectral clustering methods use the eigenvectors of the normalized affinity matrix obtained from the data to carry out data partitioning. In most of these techniques, the value of the corresponding eigenvalue determines the priority of the eigenvectors. For example, to partition data into K clusters, NJW uses the eigenvectors corresponding to the K largest eigenvalues of the normalized Laplacian matrix of the input data. However, this order does not guarantee the selection of the best features to represent the input data [7][8][9]. In this paper, with inspiration from Google's page rank algorithm, the problem is converted to an approximated supervised problem; then the Fisher criterion is applied. Using the score obtained from the Fisher criterion, we propose a new eigenvector selection method to find the relevant eigenvectors that describe the natural groups of the input data.
The rest of the paper is organized as follows. Section 2 contains a brief review of spectral clustering and one of its most popular algorithms, i.e. NJW. Section 3 is dedicated to the related work on eigenvector selection; furthermore, as a requirement for proposing our method, we introduce the Fisher criterion and Google's page rank algorithm.


In Section 4, we propose our new eigenvector selection approach. Section 5 contains the empirical results and Section 6 concludes this article.

2 Preliminaries

2.1 Spectral Clustering

Spectral clustering techniques [1] have a strong connection with spectral graph theory [2]. The term usually refers to graph partitioning based on the eigenvalues and eigenvectors of the adjacency (or affinity) matrix of a graph. Given a set of N points in d-dimensional space, X = {x_1, x_2, ..., x_N} in R^d, we can build a complete weighted undirected graph G(V, A) whose nodes V = {v_1, v_2, ..., v_N} correspond to the N patterns and whose edges, defined through the adjacency matrix A, encode the similarity between each pair of sample points. The adjacency between two data points can be defined as (1):

A_{ij} = e^{-d^2(x_i, x_j) / \sigma^2}    (1)

where d measures the dissimilarity or distance between patterns and the scaling parameter \sigma controls how rapidly the affinity falls off as the distance between x_i and x_j increases. The selection of the tuning parameter \sigma greatly affects the spectral clustering result. The tuning method proposed by Zelnik-Manor and Perona [10] introduces local scaling by selecting a \sigma_i for each data point x_i instead of the fixed scale parameter \sigma. The selection is done using the distance between point x_i and its p-th nearest neighbor; in this way, the similarity matrix is defined as h(x_i, x_j) = e^{-d^2(x_i, x_j) / (\sigma_i \sigma_j)}, with \sigma_i = d(x_i, x_p) and x_p the p-th nearest neighbor of x_i. It should be noted that with this method the result depends on the choice of the parameter p.

2.2 Ng-Jordan-Weiss (NJW) Method

The NJW algorithm [6] aims to find a new representation on the first eigenvectors of the Laplacian matrix using the following steps:

1. Form the affinity matrix A by (1).
2. Compute the degree matrix D and the normalized affinity matrix L_N = D^{-1/2} A D^{-1/2}. The degree matrix D is a diagonal matrix whose element D_{ii} = \sum_{j=1}^{N} a_{ij} is the degree of the point x_i.
3. Compute the first K eigenvectors v_1, v_2, ..., v_K corresponding to the K largest eigenvalues \lambda_1, \lambda_2, ..., \lambda_K of L_N and form the column-wise matrix V = [v_1, v_2, ..., v_K].
4. Renormalize V and form the matrix Y such that all rows have unit length, as Y_{ij} = V_{ij} / \sqrt{\sum_j V_{ij}^2}.
5. Cluster the represented data matrix Y into K clusters via K-means.
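A compact Python sketch (ours) of the procedure above, using the locally scaled affinity of Section 2.1; parameter names and the small regularizing constants are assumptions:

import numpy as np
from sklearn.cluster import KMeans

def njw_cluster(X, K, p=7):
    # Locally scaled affinity (Section 2.1): sigma_i = distance to the p-th nearest neighbor of x_i.
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)    # squared pairwise distances
    sigma = np.sqrt(np.sort(D2, axis=1)[:, p])
    A = np.exp(-D2 / (np.outer(sigma, sigma) + 1e-12))
    np.fill_diagonal(A, 0.0)
    # Step 2: degree matrix and normalized affinity L_N = D^{-1/2} A D^{-1/2}.
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    LN = D_inv_sqrt @ A @ D_inv_sqrt
    # Steps 3-5: top-K eigenvectors, row normalization, K-means.
    vals, vecs = np.linalg.eigh(LN)
    V = vecs[:, np.argsort(vals)[::-1][:K]]
    Y = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(Y)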


scatter is variance of each classs. We use this measure clusters has more connection compared to data reside
in the boundary. Using the Googles page rank one
for evaluating eigenvectors relevance as (2):
can detect which data has most connections. Because
K
of the intrinsic disjunction of these data (as they reside
i,j=1 kxi xj k
(2)
fscore = K
in center of each cluster and focused on a spot) reprei=1 kvar(xi )k
sented in Figure.2 it is very easy to cluster them into
where xi and var(xi ) is the mean and variance of class i separate groups using a popular clustering algorithm
respectively. Whatever the value of this index is higher, such as K-means and it converge in one or two iterthe data points are better separated in classes.
ations. These data is then labeled according to their
clusters. So the problem is converted from an unsupervised feature selection to a supervised one. After this
step we can use a regular fisher criterion to score each
eigenvector individually. After this phase K number
eigenvectors with the highest score is selected for last
phase of spectral clustering procedure. Block diagram
of the proposed approach is shown in Figure.3

Figure 1: Fisher criterion: set W1 of features provide


a better separation in compare to set W2 [16].

3.3

Googles Page Rank Algorithm

Page ranking [17] is an essential task in Google's search process. Important pages should have many links to/from other pages, while those links should themselves be made to other important pages. Google uses a specific spectral page-ranking scheme that utilizes the first eigenvector of an affinity matrix calculated from the graph of all web pages to find the important pages [18]. Using this algorithm, Google can rank millions of web pages in a few hours. The algorithm is briefly described as follows: let A be the affinity matrix in which A_ij is 1 if page i has a link to page j, and let D be the diagonal matrix of degrees, where D_ii = Σ_j A_ij. Let S be the scaled matrix defined as S = D^{-1} A, and let π be the eigenvector of S corresponding to the largest eigenvalue. Then π has dimensionality 1 × n, and the rank of page x is π_x. In other words, if the value of π_x is high (compared to the others), it means that page x is linked to/from many pages.
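As a rough illustration of the ranking step described above, the sketch below (not the paper's implementation) builds S = D^{-1} A and approximates its leading eigenvector by power iteration; the damping factor used in the full PageRank scheme is omitted for brevity.

```python
import numpy as np

def page_rank_scores(A, num_iter=100):
    """Sketch of the spectral ranking: scale the adjacency matrix by the node
    degrees (S = D^-1 A) and approximate the leading eigenvector by power
    iteration; entry x of the result ranks node x."""
    degrees = A.sum(axis=1).astype(float)
    degrees[degrees == 0] = 1.0            # guard against isolated nodes
    S = A / degrees[:, None]               # row-normalised transition matrix
    pi = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(num_iter):
        pi = pi @ S
        pi = pi / np.abs(pi).sum()         # keep the iterate normalised
    return pi
```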

The Proposed Algorithm

The adjacency matrix A is basically a graph which represents the connections between data points. It can be treated as the connections between web pages and fed to the Google ranker, so that the data points (web pages) that have many neighbors (links on the web) can be determined. Typically, data residing in the center of clusters have more connections than data residing on the boundary. Using Google's page rank, one can detect which data points have the most connections. Because of the intrinsic separation of these data points (they reside in the center of each cluster and are concentrated on a spot), as represented in Figure 2, it is very easy to cluster them into separate groups using a popular clustering algorithm such as K-means, which converges in one or two iterations. These data points are then labeled according to their clusters, so the problem is converted from an unsupervised feature selection problem into a supervised one. After this step, we can use the regular Fisher criterion to score each eigenvector individually, and the K eigenvectors with the highest scores are selected for the last phase of the spectral clustering procedure. The block diagram of the proposed approach is shown in Figure 3.

Figure 2: Demonstration of the 20 data points selected from the maximum values of the first eigenvector of the IRIS dataset. Part A shows these points using the 2nd and 3rd eigenvectors, and Part B shows the same data using the 2nd, 3rd, and 9th eigenvectors.

Figure 3: Block diagram of the proposed algorithm.
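Putting the pieces together, a minimal sketch of the proposed selection procedure could look as follows. It reuses the page_rank_scores and fisher_score helpers sketched above; the number of seed points (20, echoing Figure 2) and the use of scikit-learn's KMeans are illustrative assumptions rather than values prescribed by the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_eigenvectors(eigvecs, affinity, n_clusters, n_seeds=20):
    """Sketch of the proposed selection: rank points with the page-rank scores
    of the affinity graph, give the top-ranked (cluster-centre) points
    approximate labels with K-means, then keep the eigenvectors with the
    highest Fisher scores."""
    scores = page_rank_scores(affinity)
    seeds = np.argsort(scores)[-n_seeds:]                  # most connected points
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(eigvecs[seeds])
    per_vector = [fisher_score(eigvecs[seeds][:, [k]], labels)
                  for k in range(eigvecs.shape[1])]
    return np.argsort(per_vector)[-n_clusters:]            # K best eigenvectors
```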


Experimental Result

To investigate the capability of the proposed algorithm, a number of data sets from the UCI repository as well as the MNIST handwritten digit database are used. The comparison is based on NMI, a standard measure for determining the quality of a clustering. Properties of the considered datasets are reported in Table 1, and the NMI results of NJW and the proposed method are reported in Table 2.


To form the affinity matrix, we utilized the method proposed in [10] using the 7th nearest neighbor. By comparing the NMI results (Table 2 and Figure 4), it can be seen that the proposed method achieves a higher NMI except on two datasets, Image and Glass. By analyzing these datasets it can be observed that their input data have very mixed clusters. This suggests that the first K eigenvectors, those related to the largest eigenvalues, are more appropriate when the clusters are strongly mixed together.
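For reference, a small sketch of the affinity construction of [10] with local scaling by the 7th nearest neighbor (the setting used here) might look like this; it is an assumption-level illustration, not the authors' code.

```python
import numpy as np
from scipy.spatial.distance import cdist

def self_tuning_affinity(X, k=7):
    """Sketch of a locally scaled affinity: each point's scale is its distance
    to the k-th nearest neighbour and A_ij = exp(-d_ij^2 / (sigma_i * sigma_j))."""
    D = cdist(X, X)                        # pairwise Euclidean distances
    sigma = np.sort(D, axis=1)[:, k]       # distance to the k-th neighbour
    A = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(A, 0.0)               # no self-affinity
    return A
```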

Table 1: Properties of the selected UCI and MNIST datasets.

Name         Instances   Features   Classes
Iris         150         4          3
Wine         178         13         3
Ionosphere   351         34         2
Breast-w     683         9          2
Soybeans     47          35         4
Glass        214         9          6
Liver        345         6          2
Image        210         19         7
Mnist 58     400         784        2
Mnist 89     400         784        2
Mnist 038    400         784        3
Mnist 1234   400         784        4
Table 2: NMI comparison of NJW and the proposed method.

Name         NJW      Proposed
Iris         91.33    96.67
Wine         97.19    98.91
Ionosphere   69.80    71.35
Breast-w     61.93    74.13
Soybeans     100.00   100.00
Glass        49.53    48.52
Liver        56.81    67.42
Image        64.29    63.13
Mnist 58     84.00    86.31
Mnist 89     86.00    86.55
Mnist 038    82.67    83.02
Mnist 1234   81.13    88.49

Conclusion

In this paper a new approach for selecting the relevant eigenvectors in spectral clustering is proposed. The approach utilizes Google's page rank algorithm to identify the most-connected data points. In the next step, by exploiting their inherent separability, these points receive approximate labels. In the last step, the relevant eigenvectors are selected using the Fisher criterion. For future work, we aim to investigate more indexes for pairwise and individual evaluation of eigenvectors.

Figure 4: Minimum and maximum NMI achieved during 50 runs. The blue columns are NJW and the red columns are the proposed method.

References

[1] N. Cristianini, J. Shawe-Taylor, and J. Kandola, Spectral kernel methods for clustering, Advances in Neural Information Processing Systems 14 (2002), 649-655.
[2] F.R.K. Chung, Spectral graph theory, American Mathematical Society, 1997.
[3] G.L. Scott and H.C. Longuet-Higgins, Feature grouping by relocalisation of eigenvectors of the proximity matrix, Proc. British Machine Vision Conference, 1990, pp. 103-108.
[4] P. Perona and W. Freeman, A factorization approach to grouping, Computer Vision - ECCV'98 (1998), 655-670.
[5] T. Shi, M. Belkin, and B. Yu, Data spectroscopy: Eigenspaces of convolution operators and clustering.
[6] A.Y. Ng, M.I. Jordan, and Y. Weiss, On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems 2 (2002), 849-856.
[7] N. Rebagliati and A. Verri, Spectral clustering with more than K eigenvectors, Neurocomputing (2011).
[8] T. Xiang and S. Gong, Spectral clustering with eigenvector selection, Pattern Recognition 41 (2008), no. 3, 1012-1029.
[9] F. Zhao, L. Jiao, and H. Liu, Spectral clustering with eigenvector selection based on entropy ranking, Neurocomputing 73 (2010), no. 10, 1704-1717.
[10] L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, Advances in Neural Information Processing Systems 17 (2004), 1601-1608.
[11] T. Shi, M. Belkin, and B. Yu, Data spectroscopy: Eigenspaces of convolution operators and clustering, The Annals of Statistics 37 (2009), no. 6B, 3960-3984.
[12] Y. Wang, L. Li, and Ni, Feature selection using tabu search with long-term memories and probabilistic neural networks, Pattern Recognition Letters 30 (2009), no. 7, 661-670.
[13] M.A. Hall, Correlation-based feature selection for machine learning, The University of Waikato, 1999.
[14] S.C. Yusta, Different metaheuristic strategies to solve the feature selection problem, Pattern Recognition Letters 30 (2009), no. 5, 525-534.
[15] X. He, D. Cai, and P. Niyogi, Laplacian score for feature selection, Advances in Neural Information Processing Systems 18 (2006), 507.
[16] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification and Scene Analysis, 2nd ed. (1995).
[17] L. Page, S. Brin, and R. Motwani, The PageRank citation ranking: Bringing order to the web (1999).
[18] A.N. Langville and C.D. Meyer, Google's PageRank and Beyond, Princeton University Press, 2006.

Imperialist Competitive Algorithm for Neighbor Selection in Peer-to-Peer Networks
Shabnam Ebadi

Abolfazl Toroghi Haghighat

Islamic Azad University, Qazvin Branch, Qazvin, Iran

Islamic Azad University, Qazvin Branch, Qazvin, Iran

Department of Information Technology

Department of Information Technology

Shabnam ebadi@yahoo.com

at haghighat@yahoo.com

Abstract: Peer-to-peer (P2P) topology has significant influence on the performance, search efficiency and functionality, and scalability of the application. In this paper, we propose the Imperialist
Competitive Algorithm (ICA) approach to the problem of Neighbor Selection (NS) in P2P Networks.
Each country encodes the upper half of the peer-connection matrix through the undirected graph,
which reduces the search space dimension. The results indicate that ICA usually requires a shorter time to obtain better results than PSO (Particle Swarm Optimization), especially for large-scale problems.

Keywords: Neighbor Selection, Imperialist Competitive Algorithm, peer to peer Network.

Introduction

Peer-to-peer computing has attracted great interest and attention of the computing industry and
gained popularity among computer users and their networked virtual communities [1]. All participants in
a peer-to-peer system act as both clients and servers
to one another, thereby surpassing the conventional
client/server model and bringing all participant computers together with the purpose of sharing resources
such as content, bandwidth, CPU cycles It is no longer
just used for sharing music files over the Internet.
Many P2P systems have already been built for some
new purposes and are being used. An increasing number of P2P systems are used in corporate networks or
for public welfare [2].
A recent survey states that computer users are increasingly downloading large-volume content such as movies and software; 24 percent of Internet users had downloaded a feature-length film online at least once, and there exists a large demand for this category of P2P applications. A new generation of P2P applications serves this purpose, where the top priority is to effectively distribute the content instead of locating it. Examples include BitTorrent, which has seen a significant increase in usage in terms of network traffic and number of users [3]. These applications are well suited to distributing large-volume content, partly because they divide the content into many small pieces and allow peers to exchange those pieces instead of the complete file. Such a mechanism has been demonstrated to improve the efficiency of P2P exchanges. The intuition is that when content is broken into pieces for P2P exchange, it takes a shorter time before a peer can begin to upload to its neighbors while simultaneously downloading from the community. In practical P2P systems, peers often keep a large set of potential neighbors, but only simultaneously upload/download to/from a small subset of them, which we call active neighbors, to avoid excessive connection overhead [4].
The important process that improves the efficiency of distribution is referred to as neighbor selection (NS). NS is the process whereby one or more entities in the P2P network police the system by determining, for each peer, the neighbors (the other peers) that it will connect to for obtaining and/or distributing the content.




It is intuitive to note that the mechanism adopted to decide the neighbors has a strong influence on the distribution efficiency.

A P2P network comprises peers and the connections between these peers. These connections may be directed, may have different weights, and are comparable to a graph with nodes and edges connecting these nodes. Defining how these nodes are connected affects many properties of an architecture that is based on a P2P topology, which significantly influences the performance, search efficiency and functionality, and scalability of a system.

A common difficulty in current P2P systems is caused by the dynamic membership of peer hosts, which results in a constant reorganization of the topology [5]. Koulouris et al. [6] presented a framework and an implementation technique for the flexible management of peer-to-peer overlays. The framework provides means for self-organization to yield an enhanced flexibility in instantiating control architectures in dynamic environments, which is regarded as essential for P2P service access, routing, topology forming, and application-layer resource management. In these P2P applications, a central tracker decides which peer becomes a neighbor to which other peers.

Koo et al. [7] investigated the neighbor-selection process in P2P networks and proposed an efficient single-objective neighbor-selection strategy based on the Genetic Algorithm (GA). Sun et al. [8] proposed a PSO algorithm for neighbor selection in P2P networks. Abraham et al. [9] proposed multi-swarms for neighbor selection in peer-to-peer overlay networks.

In this paper, we propose the Imperialist Competitive Algorithm (ICA) approach to the problem of Neighbor Selection (NS) in P2P networks [10]. The Imperialist Competitive Algorithm is inspired by the socio-political process of imperialism and imperialistic competition. This algorithm (like many optimization algorithms) starts with an initial population. Each individual of the population is called a country. Some of the best countries, those with the minimum cost, are considered as the imperialist states and the rest are the colonies of those imperialist states. All the colonies are distributed among the imperialist countries according to their power. The power of each country is inversely proportional to its cost, which plays the role of the fitness value in a GA.

2 The Imperialist Competitive Algorithm

The Imperialist Competitive Algorithm starts with an initial population (the countries of the world). Some of the best countries in the population are selected to be the imperialists and the rest form the colonies of these imperialists. All the colonies of the initial population are divided among the imperialists based on their power. The power of an empire, which is the counterpart of the fitness value in a GA, is inversely proportional to its cost. After dividing all colonies among the imperialists, the colonies start moving toward their relevant imperialist country [10].

The total power of an empire depends on both the power of the imperialist country and the power of its colonies. We model this fact by defining the total power of an empire as the power of the imperialist country plus a percentage of the mean power of its colonies.

Then the imperialistic competition begins among all the empires. Any empire that is not able to succeed in this competition and cannot increase its power (or at least prevent it from decreasing) will be eliminated from the competition [10].

The imperialistic competition gradually results in an increase in the power of powerful empires and a decrease in the power of weaker ones. Weak empires lose their power and ultimately collapse. The movement of colonies toward their relevant imperialists, along with the competition among empires and the collapse mechanism, will hopefully cause all the countries to converge to a state in which there exists just one empire in the world and all the other countries are colonies of that empire. In this ideal new world, colonies have the same position and power as the imperialist [10].

3 Neighbor-Selection Problem in P2P Networks

Koo et al. model the neighbor-selection problem using an undirected graph and attempt to determine the connections between the peers [7]. Given
a fixed number of N peers, we use a graph G=(V,E)
to denote an overlay network, where the set of vertices
V = {v1 , ..., vN } represents the N peers and the set of
edges E = {eij {0, 1}, i, j = 1, ..., N } represents their
connectivities: eij = 1 if peers i and j are connected,



and e_ij = 0 otherwise; e_ij = e_ji for all i ≠ j and e_ij = 0 when i = j. Let C be the entire collection of content pieces, and denote by {c_i ⊆ C, i = 1, ..., N} the collections of content pieces each peer i has. We further assume that each peer i will be connected to a maximum of d_i neighbors, where d_i < N. The disjointness of contents from peer i to peer j is denoted by c_i \ c_j, which can be calculated as:

c_i \ c_j = c_i − (c_i ∩ c_j)        (1)

where ∩ denotes the intersection operation on sets. This disjointness can be interpreted as the collection of content pieces that peer i has but peer j does not; in other words, it denotes the pieces that peer i can upload to peer j. Moreover, the disjointness operation is not commutative, i.e., c_i \ c_j ≠ c_j \ c_i. We also denote by |c_i \ c_j| the cardinality of c_i \ c_j, which is the number of content pieces peer i can contribute to peer j. In order to maximize the disjointness of content, we want to maximize the number of content pieces each peer can contribute to its neighbors by determining the connections e_ij. Define δ_ij to be sets such that δ_ij = C if e_ij = 1, and δ_ij = ∅ (the null set) otherwise. Therefore we have the following optimization problem [7]:

max_B  Σ_{i=1}^{N} | ∪_{j=1}^{N} (c_i \ c_j) ∩ δ_ij |        (2)

subject to

Σ_{j=1}^{N} e_ij ≤ d_i,   for all i.

4 The Imperialist Competitive Algorithm for NS

In this paper, an optimization algorithm based on modeling the imperialistic competition is used for the NS problem. Each individual of the population is called a country. The population is divided into two groups: colonies and imperialist states. The competition among imperialists to take possession of each other's colonies forms the core of this algorithm and hopefully results in the convergence of the countries to the global minimum of the problem. In this competition the weak empires collapse gradually, until finally there is only one imperialist and all the other countries are its colonies.

First, the random matrix of the whole country population is initialized. In this paper, we propose two methods for NS. In the first method (ICAm), each country is an encoded matrix of symbols representing a solution, which may be feasible or infeasible. In particular, a country in our problem is a matrix of bits, which corresponds to the solution values of the e_ij's in Problem (2). The country is encoded to map each dimension to one directed connection between peers, i.e., the dimension is N × N. The domain of each dimension is limited to 0 or 1, but only the upper half of the matrix is searched, and the values are updated in both halves so that the optimal solution can be reached.

The algorithm randomly generates countries in which each peer i is connected to a maximum of d_i neighbors, so the initial population is already somewhat improved; this makes the implementation of this algorithm faster than other algorithms in reaching an optimal response. In the second method (ICA), each country is an encoded string of symbols representing a solution. In particular, a country in our problem is a string of bits, which corresponds to the solution values of the e_ij's in Problem (2). The country is encoded to map each dimension to one directed connection between peers, i.e., the dimension is N × N. But the neighbor topology in P2P networks is an undirected graph, i.e., e_ij = e_ji for all i ≠ j and e_ij = 0 for all i = j; we therefore set up a search space of dimension D = N(N − 1)/2. The domain of each dimension is limited to 0 or 1.

With a small number of peers, the results of the two methods are similar, but as the number of peers increases, the first method yields better results than the second.
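To make the objective concrete, the following is a minimal sketch (not the authors' code) of how a candidate connection matrix could be evaluated against Problem (2); the function name, the set-based representation of the content collections c_i, and the degree caps d_i are illustrative assumptions.

```python
import numpy as np

def ns_fitness(E, contents, d):
    """Sketch of the objective in Problem (2): for each peer i, take the union
    over its selected neighbours j of the pieces i can upload to j (c_i \\ c_j)
    and sum the cardinalities.  E is a symmetric 0/1 connection matrix,
    `contents` a list of per-peer piece sets, `d` the per-peer degree caps."""
    N = len(contents)
    if any(E[i].sum() > d[i] for i in range(N)):            # constraint of (2)
        return -np.inf
    total = 0
    for i in range(N):
        contributed = set()
        for j in range(N):
            if i != j and E[i, j] == 1:
                contributed |= (contents[i] - contents[j])   # c_i \ c_j
        total += len(contributed)
    return total
```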


Initialize parameters
Initialize random countries(N)
Calculate fitness of countries
Initialize the empires
for i=1 to D do
Assimilate()
Revolution()
Competition()
Calculate fitness of empires
if the end Decades is met or there is just one
empire then
stop and output the best solution, the
fitness
else
go to Assimilate()
end
end
Algorithm 1: Neighbor Selection Algorithm Based
on ICA (N, D)
The main steps in the algorithm are summarized


in the pseudo code shown in Algorithm1. In the algorithm, N is number of peers and D is the total number
of iterations to solve the NS problem.
After initializing the parameters and the random countries based on Problem (2), the initial countries are evaluated. After the empires are initialized, the assimilation step moves the colonies toward their relevant imperialists: in this algorithm, we find the bits in which an imperialist and its colony differ and count them, randomly select some of those bits to change, and then update the colony with the new values.
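A rough sketch of this bit-wise assimilation move, under the assumption that countries are stored as 0/1 NumPy vectors, could look as follows; the fraction of differing bits that is copied (`rate`) is an illustrative parameter, not a value given in the paper.

```python
import numpy as np

def assimilate(colony, imperialist, rate=0.5, rng=np.random):
    """Sketch of assimilation: find the positions where the colony differs from
    its imperialist, randomly pick some of them and copy the imperialist's bits."""
    colony = colony.copy()
    diff = np.flatnonzero(colony != imperialist)
    if diff.size:
        chosen = rng.choice(diff, size=max(1, int(rate * diff.size)), replace=False)
        colony[chosen] = imperialist[chosen]
    return colony
```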
After the revolution step and the evaluation of the empires using Problem (2), after a while all the empires except the most powerful one collapse and all the colonies come under the control of this unique empire. The algorithm stops when the maximum number of decades is reached or only one empire remains.

Figure 1: Performance for the NS (25, 1400, 12)

5 Experimental Studies

This section analyzes and compares the simulation results of PSO and ICA. Given a P2P state S = (N, C, M), N is the number of peers, C is the entire collection of content pieces, and M is the maximum number of peers to which each peer can connect steadily during the session.

Figure 1 illustrates the ICA, ICAm and PSO performance during the search process, versus iteration, for the problem (25, 1400, 12). The specific parameter settings of the algorithms are described in Table 1. As evident, the ICA methods obtained better results much faster than PSO.

Table 1: Parameter settings for the algorithms

ICA, ICAm   NumOfCountries = 50, NumOfInitialImperialists = 5, NumOfDecades = 50, RevolutionRate = 0.3, AssimilationCoefficient = 2
PSO         C1 = 1.5, C2 = 4 - C1, NumOfParticles = 50, MaxIterations = 50

Figure 2 illustrates the ICA, ICAm and PSO performance during the search process, versus iteration, for the problem (30, 1400, 15). The specific parameter settings of the algorithms are described in Table 2. The simulation results of ICA and ICAm are almost similar. As evident, the ICA methods obtained better results much faster than PSO.

Table 2: Parameter settings for the algorithms

ICA, ICAm   NumOfCountries = 80, NumOfInitialImperialists = 8, NumOfDecades = 50, RevolutionRate = 0.3, AssimilationCoefficient = 2
PSO         C1 = 1.5, C2 = 4 - C1, NumOfParticles = 80, MaxIterations = 50

Figure 3 illustrates the ICA, ICAm and PSO performance during the search process, versus iteration, for the problem (40, 1400, 20). The specific parameter settings of the algorithms are described in Table 2. As evident, the ICA methods obtained better results much faster than PSO.



Conclusions

In this paper, we investigated the problem of neighbor selection in peer-to-peer networks using the Imperialist Competitive Algorithm. In the proposed approach, the country encodes the upper half of the peer-connection matrix through the undirected graph, which reduces the dimension of the search space. We evaluated the performance of ICA against PSO. The results indicate that ICA usually requires a shorter time to obtain better results than PSO, especially for large-scale problems. The proposed algorithm could be an ideal approach for solving the NS problem.

References
[1] S. Kwok, P2P searching trends: 2002-2004, Information
Processing and Management 42 (2006), 237-247.

Figure 2: Performance for the NS (30, 1400, 15)

[2] T. Idris, J. Altmann, and P. Smyth, A Market-managed


topology formation algorithm for peer-to-peer files sharing
networks, Lecture Notes in Computer Science 4033 (2006),
61-77.
[3] R. Xia and K. Muppala, A Survey of Bit-Torrent Performance, IEEE Communications Surveys & Tutorials 187
(2010), 119.
[4] H. Zhang and Z. Shao, Optimal Neighbour Selection in
Bit-Torrent-like Peer-to-Peer Networks, Proceeding of the
ACM (2011).
[5] S. Surana, B. Godfrey, K. Lakshminarayanan, R. Karp, and
I.Stoica, Load balancing in dynamic structured peer-to-peer
systems, Performance Evaluation 63 (2006), 217240.
[6] Koulouris T, R Henjes, K Tutschku, and H de Meer, Implementation of adaptive control for P2P overlays, 8IWAN 12
(2003), 1229-1252.
[7] S.G.M. Koo, K. Kannan, and C.S.G. Lee, On neighbourselection strategy in hybrid peer-to-peer networks, Future
eneration Computer Systems 22 (2006), 732-741.

Figure 3: Performance for the NS (40, 1400, 20)

[8] S Sun, A Abraham, G Zhang, and H Liu, A Particle Swarm


Optimization Algorithm for Neighbour Selection in Peer-toPeer Networks, 6th International Conference on Computer
Information Systems and Industrial Management Applications 1 (2007), 166-172.
[9] A. Abraham, H. Liu, and A.E Hassanien, Multi swarms for
neighbour selection in peer-to-peer overlay networks, 2010.



[10] Atashpaz-Gargari E, Imperialist Competitive Algorithm:


An Algorithm for Optimization Inspired by Imperialistic
Competition, IEEE Congress on Evolutionary Computation
(2007), 46614667.

Different Approaches For Multi Step Ahead Traffic Prediction Based on Modified ANFIS
Shiva Rahimipour

Mahnaz Agha-Mohaqeq

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

Rahimipour@aut.ac.ir

m.mohaqeq@aut.ac.ir

Seyyed Mehdi Tashakkori Hashemi


Amirkabir University of Technology
Department of Mathematics and Computer Science
Hashemi@aut.ac.ir

Abstract: In the last two decades, short term prediction of traffic parameters has led to a vast
number of prediction algorithms. Short term traffic prediction systems that operate in real time are
necessary but not sufficient. A prediction system should be able to generate accurate and reliable
multi steps ahead predictions, besides the single step ahead ones. Multi steps ahead predictions
should provide information about the future traffic states with acceptable accuracy in cases of
system failure. This paper presents a comparative study between three different approaches for
multistep ahead forecasting. After a brief discussion about each approach, we apply them for data
gathered from Tehran highways, by modifying the structure of Adaptive Neuro-Fuzzy Inference
System (ANFIS). Finally the results of the comparative study are summarized.

Keywords: short term prediction, multistep ahead, neuro fuzzy.

Introduction

Traffic prediction has many uses in planning, design


and many other operations in the field of Intelligent
Transportation Systems (ITS). Accurate predictions
can reduce the impact of traffic congestion which is
a common problem all over the world. The effectiveness of short term prediction systems that operate in
real time, depends on predicting traffic information in
a timely manner[1]. This means that besides the traffic
conditions met in real time, a prediction system should
not only be able to generate accurate single step ahead
predictions, but to produce reliable multi steps ahead
predictions, in cases of data collection failure. Therefore one crucial issue is the development of models that
provide forecasts more than one step ahead. Multiple
steps ahead forecasting is of utmost importance. Sin-



gle interval prediction algorithms cannot support any


operational decision making mechanisms as they cannot provide a reliable representation of the way traffic might evolve in the following minutes. The literature shows that researchers prefer non-conventional
statistical approaches such as neural networks and nonparametric regression to produce accurate forecasts for
several steps ahead [2]. Multistep ahead predictions
efforts are made using different methods such as neural networks [3], [4], [5], statistical approaches like
ATHENA [6], ARIMA [7] and state-space models [8].
Smith and Demetsky [9] used non-parametric regression to provide forecasts for 4 h ahead in 30-min intervals. Chen et al. [10] predicted traffic conditions 12
steps ahead at 15-min intervals, and Dia [11] predicted
travel time 45 steps ahead at 20-s interval.
Multi step ahead predictions provide the means to generate information on traffics anticipated state with ac-


ceptable accuracy for a significant time horizon in cases


of system failure. The present paper focuses on providing a comparative study between Multi step ahead
prediction approaches applied to traffic parameters collected from Tehran highways. The remainder of the
paper is structured as follows : The main characteristics of two families of prediction strategies, the SingleOutput strategy and the Multi-Output strategy, is presented in the following section. Next, we review ANFIS and its modified structures for implementing the
approaches. Finally, the paper ends with the results of
the comparative study.

2 Multi step ahead prediction approaches

A common problem with traffic forecasting models is the low accuracy of long-term forecasts. The estimated value of a parameter may be reasonably reliable for the short-term future, but for the longer-term future the estimate is likely to become less accurate. There are several possible reasons for this increasing inaccuracy. One reason is that the environment in which the model was developed has changed over time; therefore, the input valid at a given time interval does not in fact have an influence on the output relevant for a time interval quite some distance away in the future. Another reason is that the model itself was not well developed: the inaccuracy arises due to immature training or a lack of appropriate data for training. The trained model may cover the surrounding neighborhood of the data but fail to model cyclic changes of trend or seasonal patterns of the data [12].

Most prediction systems are dependent on data transmission. This means that a continuous flow of volume and occupancy data is necessary for them to operate efficiently. However, it is common for most real-time traffic data collection systems to experience failures [13]. For this reason, a real-time prediction system should be able to generate predictions for multiple steps ahead to ensure its operation in cases of data collection failure.

This section introduces the main characteristics of two families of multiple steps ahead prediction strategies: the Single-Output strategy, which relies on the estimation of a single-output predictor, and the Multiple-Output strategy, which learns from data a multiple-input multiple-output dependency between a window of past observations and a window of future values.

2.1 Multi-input Single-output (MISO) approach

Conventional approaches to multi-step-ahead prediction, like the iterated and direct methods, belong to this family since they both model from historical data a multiple-input single-output mapping. Given a time series of a variable, for example volume V(t), V(t−1), ..., their difference resides in the considered output variable: V(t+1) in the iterated case and the variables V(t+h), h ∈ {1, ..., H}, in the direct case [14].

Iterated method

In this method, once a one-step-ahead prediction V̂(t) is computed at time t, the value is fed back as an input for the following step at t+1:

V̂(t+1) = f(V̂(t), V(t), V(t−1), ...)

In iterated methods, an H-step-ahead prediction


problem is tackled by iterating, H times, a one-stepahead predictor.
Iterated methods may suffer from low performance in
long horizon tasks. This is due to the fact that they
are essentially models tuned with a one-step-ahead criterion and therefore, they dont take the temporal behavior into account appropriately. Moreover, the predictor takes approximated values as inputs instead of
actual observations, which leads to the propagation of
the prediction error [12].
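As an illustration of the iterated strategy, the sketch below feeds each prediction back into the input window of a generic fitted one-step regressor; the helper name and the `order` window length are assumptions for the example, not part of the paper.

```python
import numpy as np

def iterated_forecast(one_step_model, history, horizon, order):
    """Sketch of the iterated strategy: one one-step-ahead predictor is applied
    H times, each prediction being fed back into the input window."""
    window = list(history[-order:])
    forecasts = []
    for _ in range(horizon):
        x = np.array(window[-order:]).reshape(1, -1)
        v_next = float(one_step_model.predict(x)[0])
        forecasts.append(v_next)
        window.append(v_next)              # feed the prediction back as input
    return forecasts
```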

2.1.2

Direct method

The Direct method is an alternative method for longterm prediction. It learns H single output models
where each returns a direct forecast of V(t+h) with h ∈ {1, ..., H}:

V̂(t+h) = f(V(t), V(t−1), ...),   h ∈ {1, ..., H}

In fact it transforms the problem to H distinct parallel problems.


This method does not propagate the prediction errors
but the fact that the H models are learned independently induces a conditional independence of the H estimators V (t+h). This prevents the technique from
considering complex dependencies between the vari-



ables V(t+h) and consequently bias the prediction accuracy. Also direct methods often require higher functional complexity than iterated ones in order to model
the stochastic dependency between two series values at
two distant instants [12].
The reliability of direct prediction models is suspect because the model is forced to predict further ahead [15].
This is the main argument in using iterative models
in multiple steps ahead prediction. On the other hand,
iterative predictions have the disadvantage of using the
predicted value as input that is probably corrupted
[16]. A possible way to overcome this shortcoming is to
move from the single-output to multiple-output modeling.

2.2

Multi-input Multi-output (MIMO)


approach

Both aforementioned cases used multi-input singleoutput techniques to implement the predictors. Singleoutput approaches face some limits when the predictor
is expected to return a long series of future values.
Another possible way for multistep ahead prediction
is to move from the modeling of single-output mapping to the modeling of multi-output dependencies.
This requires the adoption of a multi-output technique
where the predicted value is no more a scalar quantity
but a vector of future values of the time series. This
approach replaces the H models of the direct approach
by one multiple-output model [14].
{V̂(t+H), ..., V̂(t+1)} = f(V(t), V(t−1), ...)

The MIMO method constrains all the horizons to


be predicted with the same model structure, for instance with the same set of inputs, and by using the
same learning procedure. This constraint greatly reduces the flexibility and the variability of the singleoutput approaches and it could produce the negative
effect of biasing the returned model [14].
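For comparison, minimal sketches of the direct and MIMO strategies are given below; `models` is assumed to be a list of H fitted single-output predictors and `multi_output_model` a single fitted multiple-output predictor, so these illustrate the strategies themselves rather than ANFIS.

```python
import numpy as np

def direct_forecast(models, history, order):
    """Direct strategy sketch: one single-output model per horizon h,
    all fed with the same window of past observations."""
    x = np.array(history[-order:]).reshape(1, -1)
    return [float(m.predict(x)[0]) for m in models]        # [V(t+1), ..., V(t+H)]

def mimo_forecast(multi_output_model, history, order):
    """MIMO strategy sketch: a single multiple-output model returns the whole
    vector of future values in one call."""
    x = np.array(history[-order:]).reshape(1, -1)
    return list(multi_output_model.predict(x)[0])          # vector of H values
```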

3 Adaptive Neuro Fuzzy Inference System

Neuro-fuzzy systems combine the advantages of two intelligent methods: neural networks and fuzzy logic. A neural network is capable of self-learning and has many successful applications in the traffic prediction field, while fuzzy logic is well known for its strong prediction capability. The common structure of the Adaptive Neuro-Fuzzy Inference System is shown in Fig. 1.

Figure 1: A common ANFIS structure

An ANFIS network is organized in two parts, like fuzzy systems: the first part is the antecedent part and the second part is the conclusion part, and they are connected to each other by rules in network form. The ANFIS structure, arranged in five layers, can be described as a multi-layered neural network. The first layer executes a fuzzification process, the second layer executes the fuzzy AND of the antecedent part of the fuzzy rules, the third layer normalizes the membership functions (MFs), the fourth layer executes the consequent part of the fuzzy rules, and finally the last layer computes the output of the fuzzy system by summing up the outputs of the fourth layer.

We are going to use this network to apply the iterative method to our data. For the direct approach we have to train H single-output ANFIS models, each returning a direct forecast of V(t+h) with h ∈ {1, ..., H}. When we place as many ANFIS models side by side as required, the structure is called MANFIS (Multiple Adaptive Neuro-Fuzzy Inference System). Here, each ANFIS has an independent set of fuzzy rules, which makes it difficult to capture possible correlations between outputs. MANFIS is used to implement the direct approach. Another structure, used here for the MIMO approach, is called CANFIS (Coactive Adaptive Neuro-Fuzzy Inference System). CANFIS extends the notion of the single-output system, ANFIS, to produce multiple outputs; in short, fuzzy rules are constructed with shared membership values to express correlations between outputs [17].



Data

The set of data (traffic speed/density) employed in this study is collected from certain data collection points every 1 minute, from 7 a.m. to 11 a.m. Our data collection tool was the Aimsun simulator, which reproduces traffic behavior in Tehran. All data were collected from a five-lane, 400-meter section along the East Hemmat highway. The data samples used for training and testing the models are normalized to values between zero and 1. For all models, 80% of the data is used for training and the remaining 20% for testing the model.
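A small sketch of this preparation step (scaling to [0, 1] and an 80/20 split), with illustrative function and variable names, is shown below.

```python
import numpy as np

def prepare_series(series, train_fraction=0.8):
    """Sketch of the data preparation described above: scale the series to
    [0, 1] and split it into training and test portions."""
    series = np.asarray(series, dtype=float)
    scaled = (series - series.min()) / (series.max() - series.min())
    cut = int(train_fraction * len(scaled))
    return scaled[:cut], scaled[cut:]
```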

Consider the series of traffic data which varies as a function of time, V(t). Using the models described in the previous section, the input–output relations for the multistep-ahead prediction are (our goal is to predict one and three steps ahead):

Iterative:
V̂(t+1) = ANFIS(V(t), V(t−1), V(t−2), ...)
V̂(t+2) = ANFIS(V̂(t+1), V(t), V(t−1), ...)
V̂(t+3) = ANFIS(V̂(t+2), V̂(t+1), V(t), ...)

Direct:
V̂(t+1) = ANFIS(V(t), V(t−1), V(t−2), ...)
V̂(t+3) = ANFIS(V(t), V(t−1), V(t−2), ...)
which is in fact:
{V̂(t+1), V̂(t+3)} = MANFIS(V(t), V(t−1), V(t−2), ...)

MIMO:
{V̂(t+1), V̂(t+3)} = CANFIS(V(t), V(t−1), V(t−2), ...)

We repeat these predictions for two traffic parameters, speed and density.

Results

After the training process, the performance of the models has been checked using the test data. The final results are summarized in the following table:

Table 1: MSE in 3-step-ahead prediction

            Direct    Iterative   MIMO
Speed       0.0189    0.0191      1.847e-004
Density     0.0245    0.0370      7.716e-004

As shown in the table, both parameters achieve their minimum error with the MIMO approach. The error increases when the multiple-ANFIS model is used for three-step-ahead prediction (the direct approach), and the iterative approach has the maximum error among all approaches. Figures 2–4 show the time series of the actual versus the predicted speed, three steps ahead, using all three strategies. Solid lines are the real values and dashed lines are the predicted values.

Figure 2: Actual vs predicted speed - Direct approach

Figure 3: Actual vs predicted speed - Iterated approach

Figure 4: Actual vs predicted speed - MIMO approach

Conclusion

In this paper, we implemented three different approaches for multistep-ahead prediction based on ANFIS. The data used to train and check the models were
acquired by Aimsun simulation. The results show that all testing errors are low enough to be accepted, but the MIMO approach implemented by CANFIS clearly shows good performance in terms of simplicity, precision and stability. It can be used in practical projects as an applied short-term prediction model for urban roads.

References

[1] B. L. Smith and R. K. Oswald, Meeting Real-Time Requirements with Imprecise Computations: A Case Study in Traffic Flow Forecasting, Computer Aided Civil and Infrastructure Engineering 18/3 (2003), 201213.
[2] E. I. Vlahogianni, J. C. Goloas, and M. G. Karlaftis, Short
term traffic forecasting: Overview of objectives and methods, Transport Reviews 24/5 (2004), 533-557.
[3] M. S. Dougherty and M. R. Cobbet, Short-term inter-urban
traffic forecasts using neural networks, International Journal of Forecasting 13 (1997), 21-31.
[4] B. Abdulhai, H. Porwal, and W. Recker, Short-term Freeway Traffic Flow Prediction Using Genetically-optimized
Time-delay-based Neural Networks, UCB, UCB-ITSPWP991 (Berkeley, CA (1999).
[5] S. Innamaa, Short-term prediction of traffic situation using MLP-neural networks, Proceedings of the 7th World
Congress on Intelligent Transportation Systems, Turin,
Italy (2000).
[6] M. Danech-Pajouh and M. Aron, ATHENA: a method
for short-term inter-urban motorway traffic forecasting,
Recherche Transport Securite 6 (1991), 1116.
[7] H. Kirby, M. Dougherty, and S. Watson, Should we use neural networks or statistical models for short term motorway
forecasting, International Journal of Forecasting 13 (1997),
45-50.


[8] J. Whittaker, S. Garside, and K. Lindeveld, Tracking and


predicting network traffic process, International Journal of
Forecasting 13 (1997), 5161.
[9] B. L. Smith and M. J. Demetsky, Multiple-interval freeway
traffic flow forecasting, Transportation Research Record
1554 (1996), 136141.
[10] H. Chen, S. Grant-Muller, L. Mussone, and F. Montgomery,
A study of hybrid neural network approaches and the effects
of missing data on traffic forecasting, Neural Computing
and Applications 10 (2001), 277286.
[11] H. Dia, An object-oriented neutral network approach to
short-term traffic forecasting, European Journal of Operational Research 131 (2001), 253261.
[12] H.H. Nguyen and C.W. Chan, Multiple neural networks for
a long term time series forecast, Neural Computing and
Applications 13 (2004), 9098.
[13] A. Stathopoulos and M.G. Karlaftis, A multivariate statespace approach for urban traffic flow modelling and prediction, Transportation Research Part C 11/2 (2003), 121135.
[14] S.B. Taieb, A. Sorjamaa, and G. Bontempi, Multiple-Output
Modelling for Multi-Step-Ahead Time Series Forecasting,
Neurocomputing 73 (2009), 1950-1957.
[15] A.S. Weigend and N.A. Gershenfeld, Time Series Prediction: Forecasting the future and understanding the past:
Santa Fe Institute Studies in the Science of Complexity
(1993).
[16] E.I. Vlahogianni and M.G. Karlaftis, Local and Global Iterative Algorithms for Real-Time Short-term Traffic Flow Prediction, Urban Transport and Hybrid Vehicles book, pages:
192, 2010.
[17] J.S. Roger, C.T. Sun, and E. Mizutani, Neuro-Fuzzy and
Soft Computing A Computational Approach to Learning
and Machine Intelligence, Prentice-Hall, 1997.

E-service Quality Management in B2B e-Commerce Environment


Parvaneh Hajinazari

Abbass Asosheh

Tarbiat Modares University

Tarbiat Modares University

Department of Information Technology Engineering

Department of Information Technology Engineering

Tehran, Iran

Tehran, Iran

p.hajinazari@gmail.com

Asosheh@modares.ac.ir

Abstract: The service oriented architecture (SOA) and its most common implementation, services, enable the enterprises to increase their agility in the face of change, to improve their operating
efficiency, and greatly reduce the cost of doing business in e-commerce environments. However, in
order to have a certain business, the behavior of the services should be guaranteed and these guarantees can be specified by Service Level Agreements (SLAs). In this regard, we present a model to
express SLAs and utilize the business services performance requirements specified as Key Indicators
(KPIs and KQIs) to define SLA parameters. This model can help automate the process of SLA
negotiation and monitoring, and to take actions in case of violations.

Keywords: E-Commerce; Service Oriented Architecture (SOA); Service Level Agreements (SLAs); Key Performance Indicators (KPIs); Key Quality Indicators (KQIs).

Introduction

Nowadays, in the modern global markets of e-Commerce, contracts are made for short-period strategies that may last some days or even less. In such a dynamic environment, enterprises need to respond more effectively and quickly to opportunities in order to remain competitive in the global markets. In this regard, the SOA paradigm is known as the best practice for enterprises because, at a bare minimum, it has the potential to increase agility and transparency and to decrease development and maintenance costs. In this approach, services can be easily assembled to form a collection of autonomous and loosely coupled business processes [1]. Services include unassociated and loosely coupled units of functionality that have no calls to each other. A service performs an action such as filling out an online application for an account. In this context, it is important to ensure that services are executed as expected. For this purpose, each involved actor should be described by both functional and trustworthy capabilities so that qualified services can be aggregated to fulfill the business goal. In addition, the service provider has to specify both non-functional as well as functional properties, in particular technical quality-of-service characteristics such as response time, throughput and availability. In this regard, our research is based on the quality management of business services and utilizes the Key Indicators (KIs) and Service Level Agreements (SLAs) concepts.

The increasing role of SLAs in B2B systems is considerable. In a B2B system, the structure and role of SLAs allow for virtualization of the provider's resources, because B2B collaboration demands not only discovery of application resources based on existing metadata, but also discovery of these resources in the context of agreed contracts, like SLAs. In B2B environments, the SLA mechanism helps service provider and consumer manage risks and expectations, and establish trusted business relationships. Effective SLAs are very important to guarantee business continuity, customer satisfaction and trust. The metrics used to measure and manage performance compliance with SLA commitments are the heart of a successful agreement and a critical long-term success factor. The categorization of SLA metrics facilitates design decisions and helps to identify


responsibilities for critical IT processes in disruption
management during the execution of SLAs. However,
SLA definition is not straightforward from the business
goals of the enterprise [2]. In this regard, this research
uses KIs (KPIs and KQIs) concept as a good solution
to identify the essential metrics [3]. In our work, KPI
is used as business services performance indicators in
order to map to SLA parameters. In this way, target
values for KPIs can be specified in SLAs.

In our work the SLA concept is oriented to the service relationship between service consumer and service
provider in which a set of metrics could be used for
describing levels of quality of service and in order to
guarantee these levels, mechanisms are utilized. This
is in conformity with the definition contained in the
SLA Management Handbook of the TeleManagement
Forum.

The remainder of the paper is organized as follows:


section 2 describes the basic concepts in order to facilitate the understanding of our approach. The proposed
model is presented in section 3. At last, conclusions
and an outlook to our future work are discussed in section 4.

Background

From a business perspective, a generalized statement


of business goals relevant to the scope of the project is
decomposed into sub goals that must be met in order
for the higher-level goals to be met. This hierarchical
decomposition of the goals leads to identify the services
that will help in fulfilling the sub goals. It is also necessary to identify key indicators in order to provide an
objective basis for evaluating the degree to which the
goal has been achieved. Key indicators and target values identified during the process are used to measure,
monitor, and quantify the success of the SOA solution in fulfilling business needs [4]. In this regard, the
Telecommunication Management Forum (TM Forum)
utilized KPIs and KQIs for managing service quality.
KPIs are quantifiable measurements that reflect the
critical successful or unsuccessful factors of a particular service. KPIs represent the performance; thus, they
cannot completely represent end-to-end service quality,
therefore the TM Forum proposed KQIs, which are indicators that provide measurements of a specific aspect
of the performance of the product, product components (services) or service elements, and represent their
data from a number of sources including the KPIs [5].
The TM Forum also defined a hierarchy among KPIs, KQIs, and SLAs (Fig. 1). SLAs, in which provider and consumer define the expected service behavior and its quality, can be defined in terms of service KQIs. Service KQIs use service KPIs as metrics for reporting the performance of the services (target values for the KPIs must be reached within a certain period). An example of the relationship between KPIs and KQIs is shown in Table 1.

Figure 1: KQI, KPI and SLA Relationship [5]

Proposed Model

In this study, we assume the service provider offers the


services with desired functionality, so we focus on services non-functional requirements. The correct management of such requirements directly impacts the success of organizations participating in e-commerce and
also the growth of e-commerce. To achieve these objectives, a number of research contexts need to be explored. First, mechanisms are necessary to synthesize
business services KPIs on the SLA metrics of the web
services that compose the business process. Then, a
good theoretical SLA model is necessary to formally
specify and represent SLA. Finally some approaches
need to be developed to monitor and manage SLAs.
In this regard we model SLA metrics based on performance information of the services that compose the
business process. Our approach relies on the use of ontologies to represent the SLA model. Ontology-based
approaches have been suggested as a solution for semantically describing SLA specifications in a machine
understandable format.
Then an autonomic SLA monitoring system can be
developed using SWRL rules and the Jess rule engine.
The purpose of this system is detecting SLA violations.
In this way, there is a possibility to assure the traceability between KPIs defined over business services and
their target values established in SLAs.



Table 1: KQI and KPI Relationship

Service            Service KQI               Service KPI
Video Conference   Availability              MTBF, MTBR; Loss of Service
                   Speech/Visual Quality     MOS; Loss, Jitter, Delay; Customer Satisfaction
                   Response Time             Response Time
                   Round Trip Delay          OWD, RTT
                   Delay                     OWD, RTT
                   Confidentiality           Physical Access Violations
                   Non-repudiation           Physical Access Violations
                   Interoperability          Interoperability Complaints
                   Connect Time              Connect Time


In this section, we illustrate the ontological model
for representing SLA based on KIs. We adopt UML for
representing the model. At first, as shown in Figure
2, an SLA is a contract between a customer and a service provider over a specific service, since our ontologybased SLA should contain all of these entities. Also,
an SLA has constraints that a service provider has to
guarantee. Therefore, SLA is the main class and aggregates three main classes: Parties, Services and Obligations. Parties class describes consumer and provider
involved in an SLA. It aggregates other classes used
to store information such as name, phone number, address, email and other related data. Services class represents the information about the offered service like
the validity period for the SLA, the service level that
is specified through service parameters and their corresponding KPIs. Obligations class represents the conditions that must be respected by the provider with
respect to the offered service. These conditions are
expressed in terms of KQIs, which are used for defining the terms under which the offered service will be
monitored and evaluated. Preferentially, KQIs are expressed as a function of KPIs. Availability is a KQI
example that can be defined in terms of MTBR and
MTBF KPIs for Helpdesk service. Obligations class
also defines the penalties to be applied when the expressed conditions are offended. We define how a KQI
is calculated in terms of the associated KPIs using
SWRL rules [6].
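As a purely illustrative example of such a derivation (the paper expresses these as SWRL rules, and the exact formula is not given here), an Availability KQI could be computed from the MTBF and MTBR KPIs and checked against an SLA target as follows; the figures are hypothetical.

```python
def availability_kqi(mtbf_hours, mtbr_hours):
    """Assumed illustration: one common way to express an Availability KQI
    from the MTBF and MTBR KPIs (not the paper's own rule)."""
    return mtbf_hours / (mtbf_hours + mtbr_hours)

def violates_sla(kqi_value, target):
    """A KQI below the target agreed in the SLA counts as a violation."""
    return kqi_value < target

# Hypothetical helpdesk-service figures, purely for illustration:
availability = availability_kqi(mtbf_hours=700.0, mtbr_hours=4.0)
print(violates_sla(availability, target=0.995))
```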

Based on our model, we have developed an ontology for representing the SLA specification and presented a Web Ontology Language (OWL) based knowledge base that can be used in an autonomous SLA management system. In our study, Protégé, a free open-source ontology editor [7], is employed together with related plug-ins such as the SWRL tab. In order to implement our ontology, we built the SLA OWL, and then the SWRL rules were added for inferring hidden relationships among KIs or between an SLA and a KI, calculating KQIs from KPIs, and detecting SLA violations. For modeling metric dependencies between services and processes, we focus on metrics which can be measured or calculated at runtime. Examples of such metrics are response time and availability [8].

Conclusion and Future Work

SOA enables the integration of services from various organizations such that the organizations can easily use
the services of other organizations based on specified
standards and setting out contracts under the same standards. However, some external providers may offer services that do not meet the quality attribute requirements of the service consumer organization; therefore defining a Service Level Agreement and establishing
SLA management mechanisms are important factors
when explaining the quality requirements for achieving
mission goals of business and service-oriented environments [9]. The level of service can be specified as target
and minimum which allows customers to be informed
what to expect (the minimum), while providing a measurable (average) target value that shows the level
of organization performance. SLA management allows the enterprises to identify and solve performancerelated problems before the business is being influenced
by these problems.

The KI is a key instrument for evaluating the performance of business services and detecting the state of current and completed processes. In our methodology, KIs are used for mapping business services performance indicators to SLA parameters. With this method one can find the suitable services that satisfy business process performance requirements. In general, being able to characterize SLA parameters has some advantages



for enterprises. First, it allows for more efficient translation of enterprises vision into their business processes,
since those can be designed according to service specifications. Second, it allows the selection and execution
of web services based on business process requirements,
to better fulfill customer expectations. Third, it makes
possible the monitoring of business processes based on
SLA. In order to achieve these purposes, we introduced
a model to manage business services with SLAs that
guarantee a certain quality of performance. In this
regard, we have investigated business services KPI hierarchy based on [5] and proposed the ontology-based
SLAs and SWRL rules that are used for inferring hidden relationships among KPIs and SLAs.

There are a number of restrictions and open areas in this work, which are explained in the following. We considered and classified almost every KPI based on [5]; however, it needs to be verified whether this reflects the actual consumer and service provider criteria. Another restriction concerns violations. Neither party wants the SLA to be violated: consumers want a high level of service for their key business processes, not a payment for an SLA violation, which will never compensate for the loss of business. Similarly, the provider does not want to suffer the loss of market trust and credibility, which may affect many more accounts than the one affected by the SLA violation. Therefore the SLA must be in place to foster a cooperative business approach to common goals [10]. Hence, forecasting SLA violations is more appropriate than just detecting them; this is left for our future work. In spite of the above mentioned restrictions, our ontology-based SLA has some advantages, as it is very easy to extend due to its use of ontologies. Also, SWRL rules, which are used for reasoning, can be defined and modified dynamically without affecting other aspects of the code. However, one of our major research aims for future work is finding a suitable way to forecast SLA violations.

Figure 2: SLA Class Diagram

References

[1] M. P. Papazoglou and W. J. Heuvel, Service oriented architectures: approaches, technologies and research issues, The VLDB Journal 16 (2007), 389-415.
[2] G. Frankova, M. Séguran, F. Gilcher, S. Trabelsi, J. Dorflinger, and M. Aiello, Deriving business processes with service level agreements from early requirements, Journal of Systems and Software, Elsevier 84 (2011), 1351-1363.
[3] E. Toktar, G. Pujolle, E. Jamhour, M. Penna, and M. Fonseca, An XML model for SLA definition with key indicators, IP Operations and Management, Springer 4786 (2007), 196-199.
[4] A. Arsanjani, S. Ghosh, A. Allam, T. Abdollah, S. Ganapathy, and K. Holley, SOMA: A method for developing service-oriented solutions, IBM Systems Journal 47 (2008), 377-396.
[5] The TeleManagement Forum and the Open Group, SLA


Management Handbook, Enterprise Perspective 4 (2004).
[6] I. Horrocks, P. F. Patel-Schneider, A. Allam, H. Boley,
S. Tabet, B. Grosof, and M. Dean, SWRL: A Semantic
Web Rule Language combining OWL and RULEML, W3C
(2004).
[7] Available The protege ontology editor and knowledge acquisition system, http://www.hut.fi/ vkarpijo/netsec00/.
[8] S. Kalepu, S. Krishnaswamy, and S. W. Loke, Verity: a qos
metric for selecting web services and providers, 4th International Conference on Web Information Systems Engineering
Workshops (2003).
[9] P. Bianco, G. A. Lewis, and P. Merson, Service level
agreements in service-oriented architecture environments,
Technical Report CMU/SEI-2008-TN-021 Carnegie Mellon
(2008).
[10] B. Mitchell and P. McKee, SLAs A Key Commercial Tool,
In P. Cunningham and M. Cunningham (eds.), Exploiting the Knowledge Economy- Issues, Applications, Case
Studies, Proc. eChallenges e-2006 Conference, IOS Press 3
(2006).


Calibration of METANET Model for Real-Time Coordinated and Integrated Highway Traffic Control using Genetic Algorithm: Tehran Case Study
Mahnaz Aghamohaqeqi

Shiva Rahimipour

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

m.mohaqeq@aut.ac.ir

rahimipour@aut.ac.ir

Masoud Safilian

S.Mehdi Tashakori Hashemi

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

m.safilian@aut.ac.ir

hashemi@aut.ac.ir

Abstract: This paper employs a previously developed model predictive control (MPC) approach to optimally coordinate variable speed limits and ramp metering along a 2 km section of the Hemmat highway to deal with the problem of rush hour congestion. To predict the evolution of the traffic situation in this zone, an adapted version of the METANET model that takes the variable speed limits into account is used. Before using this traffic model to predict the evolution of the traffic situation, it must be calibrated so that the state variables of the model are in good agreement with the real values. To do this, we use a genetic algorithm. Simulation results show that the genetic algorithm is able to find optimal values for the model parameters, so that the MPC approach results in less congestion, a higher outflow and a lower total time spent in the controlled areas.

Keywords: Model predictive control (MPC); METANET model; calibration; ramp metering; variable speed limit
control; genetic algorithm.

Introduction

The notoriously increasing number of vehicles that use the provided network capacity has led to severe problems in the form of congestion, which results in serious economic and environmental problems, negative impacts on the quality of life and an increasing possibility of accidents. Two complementary approaches for solving the problems caused by motorway congestion are possible without diverting demand to other modes of transportation. The first one is to construct new motorways, i.e. to address the problem by providing additional capacity to the networks. Land availability issues, especially in and around large metropolitan areas, and environmental considerations render this approach unattractive. The second approach is based on the fact that the capacity provided by the existing infrastructure is practically underutilized, i.e. it is not fully exploited [1]. Thus, before building new infrastructure, the full exploitation of the already existing infrastructure by means of dynamic traffic management measures such as ramp metering, reversible lanes, speed limits and route guidance should be ensured.

Ramp metering is the most common way to control traffic conditions on highway networks, by regulating the input flow from the on-ramps to the highway mainstream.

A good overview of the different ramp metering algorithms is found in [2]. However, the effectiveness of this method is reduced when the demand from the on-ramp is high and the traffic in the upstream mainline is getting dense [3]. In such circumstances, ramp metering cannot relieve or even alleviate the congestion by itself, because even a small flow from the on-ramp can cause a breakdown, and congestion will subsequently form, especially where the capacity of the on-ramp is limited. This is because ramp metering only controls the inflow from the on-ramp into the mainline; the collective behavior of the drivers in the highway mainline is not controlled by it. This is why ramp metering alone cannot appropriately control highway traffic in practice, and other control strategies such as variable speed limits are needed.
Variable speed limit control is a particular dynamic traffic management measure that aims to simultaneously improve both traffic safety and traffic performance (e.g., minimizing the total time spent) of a highway network by dynamically computing an optimal set of speeds for the controlled segments and displaying those variable speed limits on variable message signs (VMSs). Variable speed limits attempt to control the collective vehicle speed, i.e. the driver behavior in the mainline, and in this regard are complementary to ramp metering [4]. On the other hand, as shown in [3], placing speed limits just before the on-ramp can help reduce the outflow of the controlled segments, so that some space is left to accommodate the traffic from the on-ramp; in this way, traffic breakdown can be prevented or delayed. These are the motivations for using different control strategies in a coordinated scheme. References [5-8] are examples of works that considered both variable speed limits and ramp metering, which are believed to be the two key tools influencing conditions on congested highways.

As noted above, for a given traffic network a combination of various traffic control strategies has the potential to achieve better performance than when they are implemented separately. Besides, the latest advances in computer and communication technologies have made it feasible and financially viable to implement these automatic control tools in a coordinated scheme to improve real-world traffic conditions [8]. Modern optimal control techniques such as MPC, a model-based optimization control strategy, seem appropriate for this purpose [3]. Thus, we employ a previously developed model predictive control (MPC) approach to find the control settings for a group of controllers in the 2 km section of the Hemmat highway, consisting of a combination of ramp meters and variable speed limit signs, in order to minimize the total time spent (TTS) by all vehicles in this site.

One of the major difficulties in implementing a model-based optimization control strategy is that the model parameters are difficult to calibrate. To address this issue, a genetic algorithm is used to tune the model parameters.

The arrangement of this article is as follows. In Section 2, the basics of the MPC scheme are introduced. In Section 3, the traffic flow model (prediction model) is introduced. The tuning process of the model parameters based on the genetic algorithm is explained in Section 4. In Section 5, the introduced method is applied to the 2-km section of the eastbound Hemmat highway selected as the study network. Section 6 summarizes the main conclusions.

Model Predictive Control

We consider the problem of finding the best control settings for a group of controllers in the study network, consisting of ramp meters and a set of variable speed limit signs. The control objective is to minimize the total time spent (TTS) by all vehicles in the study network. To do this, we use a previously proposed model predictive control (MPC) approach.

The core idea of MPC is its use of a dynamic model to predict the future behavior of the system at each optimization step, in order to avoid making myopic control decisions. In this paper, we have utilized MPC as an online method to optimally control the traffic flow in a part of the Hemmat highway, with the system states being predicted by a macroscopic traffic flow model [8]. We assume that the reader is familiar with the basic ingredients of the MPC approach. Nevertheless, the following paragraphs provide a brief description of the MPC framework introduced in [9].

Consider a traffic network with N controllers over a specific time horizon. The time horizon is divided into P large control intervals, each subdivided into M small intervals (called system simulation steps). It is assumed that over each control interval the control variables are kept the same, whereas the system state changes at every simulation step. Let $k_c$ be the index for the large intervals ($k_c = 1, 2, \ldots, P$) and $k$ the index for all the subintervals ($k = 1, 2, \ldots, MP$) [8]. The transition of the system state can be expressed as follows:

$$x(k + 1) = f(x(k), u(k), d(k))$$
where x(k), u(k), and d(k) are vectors representing the system state, the control decisions, and the disturbance at time k. At each control step $k_c$, a new optimization is performed to compute the optimal control decisions $u(k_c)$, e.g.,

$$\begin{bmatrix} u_1(k_c) & u_1(k_c+1) & \cdots & u_1(k_c+P-1) \\ \vdots & \vdots & \ddots & \vdots \\ u_N(k_c) & u_N(k_c+1) & \cdots & u_N(k_c+P-1) \end{bmatrix}$$

for the time period $[1, 2, \ldots, P]$, in which P is the prediction horizon. To reduce the computational complexity, a control horizon C (C < P) is usually defined to represent the time horizon over which the control signal is considered to be fixed, i.e.,

$$u(k_c) = u(C-1) \quad \text{for } k_c > C.$$

Therefore, for N controllers, the $N \times C$ vector of optimal controls $u(k_c)$ would be

$$\begin{bmatrix} u_1(k_c) & u_1(k_c+1) & \cdots & u_1(k_c+C-1) \\ \vdots & \vdots & \ddots & \vdots \\ u_N(k_c) & u_N(k_c+1) & \cdots & u_N(k_c+C-1) \end{bmatrix}$$

Only the first optimal control signal $u_i(k_c)$, $i = 1, 2, \ldots, N$ (the first column) is applied to the real system; after shifting the prediction and control horizons one step forward and feeding the current observed states of the real system to the model, the process is repeated. This feedback is necessary to correct prediction errors and system disturbances that may cause the real system to deviate from the model prediction. Since we have to work with a nonlinear system (the traffic model), in each control time step $k_c$ a nonlinear program has to be solved to find the $N \times C$ optimal solutions before reaching the next control time step ($k_c + 1$) [8]. For more information about the MPC approach see [10] and the references therein.
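To make the receding-horizon procedure above concrete, the following is a minimal Python sketch of the MPC loop under simplifying assumptions: the dynamics f, the horizons, the bounds and the cost are illustrative placeholders rather than the paper's calibrated METANET network, and a general-purpose optimizer stands in for the nonlinear program solved at each control step.

import numpy as np
from scipy.optimize import minimize

P, C, N = 6, 4, 2                      # prediction horizon, control horizon, controllers

def f(x, u, d):
    # placeholder system dynamics x(k+1) = f(x(k), u(k), d(k))
    return 0.9 * x + 0.1 * u.sum() + d

def predicted_cost(u_flat, x0, demand):
    # simulate P steps ahead; the control signal is frozen after the control horizon C
    u_seq = u_flat.reshape(N, C)
    x, cost = x0, 0.0
    for j in range(P):
        u = u_seq[:, min(j, C - 1)]
        x = f(x, u, demand[j])
        cost += x.sum()                # stand-in for the total time spent (TTS)
    return cost

x = np.ones(3)                         # current observed state of the real system
u_prev = np.zeros(N * C)
for kc in range(20):                   # control steps, e.g. one per minute
    demand = np.full(P, 0.5)           # predicted disturbance d(k) over the horizon
    res = minimize(predicted_cost, u_prev, args=(x, demand),
                   bounds=[(0.0, 1.0)] * (N * C), method="L-BFGS-B")
    u_opt = res.x.reshape(N, C)
    x = f(x, u_opt[:, 0], demand[0])   # apply only the first column to the plant
    u_prev = res.x                     # shift the horizon forward and repeat

Only the first optimized control is applied before the optimization is repeated with the newly observed state, which is exactly the feedback mechanism described above.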

Prediction model

The traffic flow model used here to predict the future behavior of the traffic system is the extended version of the METANET model for speed limits. METANET is a macroscopic traffic model that is discrete in both space and time. The model represents the network by a directed graph, with the links corresponding to highway stretches. A highway link m is divided into $N_m$ segments (indicated by the index i) of length $l_{m,i}$ and with $n_m$ lanes. Each segment i of link m at the time instant t = kT, k = 0, ..., K, is macroscopically characterized by the traffic density $\rho_{m,i}(k)$ (veh/lane/km), the mean speed $v_{m,i}(k)$ (km/h) and the traffic volume $q_{m,i}(k)$ (veh/h). Each link has uniform characteristics, i.e. no on-ramp or off-ramp and no major changes in geometry. The nodes of the graph are placed between links where major changes in road geometry, such as on-ramps and off-ramps, occur. The time step used for simulation is denoted by T [11].

Table 1 describes the notation related to the METANET model [11]. The traffic stream models that capture the evolution of traffic on each segment at each time step are shown in Table 2 [11].

Table 1: Notation used in the METANET model

m - link index
i - segment index
T - simulation step size
k - time step counter
$\rho_{m,i}(k)$ - density of segment i of highway link m
$v_{m,i}(k)$ - speed of segment i of highway link m
$q_{m,i}(k)$ - flow of segment i of highway link m
$N_m$ - number of segments in link m
$n_m$ - number of lanes in link m
$l_{m,i}$ - length of segment i in link m
$\tau$ - time constant of the speed relaxation term
$\kappa$ - speed anticipation parameter (veh/km/lane)
$\nu$ - speed anticipation parameter (km$^2$/h)
$a_m$ - parameter of the fundamental diagram
$\rho_{crit,m}$ - critical density of link m
$V(\rho_{m,i}(k))$ - speed of segment i of link m on a homogeneous highway as a function of $\rho_{m,i}(k)$
$\rho_{max,m}$ - maximum density of link m
$v_{free,m}$ - free-flow speed of link m
$w_o(k)$ - length of the queue on on-ramp o at time step k
$q_o(k)$ - flow that enters the highway at time step k
$d_o(k)$ - traffic demand at origin o at time step k
$r_o(k)$ - ramp metering rate of on-ramp o at time step k
$Q_o$ - on-ramp capacity
$\delta$ - parameter of the speed drop term caused by merging at an on-ramp
$v_{control,m,i}$ - speed limit applied in segment i of link m
$\alpha$ - parameter expressing the disobedience of drivers with the displayed speed limits

Table 2: Link equations and descriptions

Flow-density equation:
$$q_{m,i}(k) = \rho_{m,i}(k)\, v_{m,i}(k)\, n_m$$

Conservation of vehicles:
$$\rho_{m,i}(k+1) = \rho_{m,i}(k) + \frac{T}{l_{m,i}\, n_m}\,[q_{m,i-1}(k) - q_{m,i}(k)]$$

Speed dynamics:
$$v_{m,i}(k+1) = v_{m,i}(k) + \frac{T}{\tau}\,\big(V[\rho_{m,i}(k)] - v_{m,i}(k)\big) + \frac{T}{l_{m,i}}\, v_{m,i}(k)\,[v_{m,i-1}(k) - v_{m,i}(k)] - \frac{\nu T}{\tau\, l_{m,i}}\, \frac{\rho_{m,i+1}(k) - \rho_{m,i}(k)}{\rho_{m,i}(k) + \kappa}$$
Relaxation term: drivers try to achieve the desired speed $V(\rho)$. Convection term: speed decrease (increase) caused by the inflow of vehicles. Anticipation term: speed decrease (increase) as drivers experience a density increase (decrease) downstream.

Speed-density relation (fundamental diagram):
$$V[\rho_{m,i}(k)] = v_{free,m}\, \exp\!\left(-\frac{1}{a_m}\left(\frac{\rho_{m,i}(k)}{\rho_{crit,m}}\right)^{a_m}\right)$$

Origin queuing model:
$$w_o(k+1) = w_o(k) + T\,[d_o(k) - q_o(k)]$$

Ramp outflow equation (the outflow depends on the traffic condition in the mainstream and on the metering rate $r_o(k) \in [0, 1]$):
$$q_o(k) = \min\!\left[d_o(k) + \frac{w_o(k)}{T},\; Q_o\, r_o(k),\; Q_o\, \frac{\rho_{max,m} - \rho_{m,1}(k)}{\rho_{max,m} - \rho_{crit,m}}\right]$$

Speed limit model (the desired speed is the minimum of the speed given by the fundamental diagram and the speed limit displayed on the variable message sign (VMS)):
$$V[\rho_{m,i}(k)] = \min\!\left[v_{free,m}\, \exp\!\left(-\frac{1}{a_m}\left(\frac{\rho_{m,i}(k)}{\rho_{crit,m}}\right)^{a_m}\right),\; (1+\alpha)\, v_{control,m,i}(k)\right]$$

Speed drop caused by merging (if there is an on-ramp, this term is added to the speed dynamics):
$$-\frac{\delta\, T\, q_o(k)\, v_{m,1}(k)}{l_{m,1}\, n_m\, (\rho_{m,1}(k) + \kappa)}$$
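As an illustration of how the link equations of Table 2 are evaluated, the following is a minimal Python sketch of one simulation step for a single link whose first segment receives an on-ramp; all parameter and boundary values are placeholders, not the calibrated Hemmat values.

import numpy as np

T = 10 / 3600.0                        # simulation step (h)
seg_len, lanes = 0.5, 3                # segment length (km), number of lanes
tau, nu, kappa, delta = 18 / 3600.0, 35.0, 13.0, 1.4
a_m, rho_crit, rho_max, v_free = 2.0, 33.5, 180.0, 100.0
Q_o = 1500.0                           # on-ramp capacity (veh/h)

def V(rho, v_ctrl=None, alpha=0.1):
    # fundamental diagram, capped by the displayed speed limit if one is active
    v = v_free * np.exp(-(1.0 / a_m) * (rho / rho_crit) ** a_m)
    return v if v_ctrl is None else np.minimum(v, (1 + alpha) * v_ctrl)

def metanet_step(rho, v, w, d_o, r_o, q_up, v_up, rho_down, v_ctrl=None):
    q = rho * v * lanes                                              # flow-density equation
    q_o = min(d_o + w / T, Q_o * r_o,
              Q_o * (rho_max - rho[0]) / (rho_max - rho_crit))       # ramp outflow
    w_new = w + T * (d_o - q_o)                                      # origin queue model
    q_in = np.concatenate(([q_up + q_o], q[:-1]))                    # inflow per segment
    rho_new = rho + T / (seg_len * lanes) * (q_in - q)               # conservation of vehicles
    v_prev = np.concatenate(([v_up], v[:-1]))
    rho_next = np.concatenate((rho[1:], [rho_down]))
    v_new = (v
             + T / tau * (V(rho, v_ctrl) - v)                        # relaxation term
             + T / seg_len * v * (v_prev - v)                        # convection term
             - nu * T / (tau * seg_len) * (rho_next - rho) / (rho + kappa))  # anticipation term
    v_new[0] -= delta * T * q_o * v[0] / (seg_len * lanes * (rho[0] + kappa))  # merging speed drop
    return rho_new, np.maximum(v_new, 0.0), w_new

rho = np.array([28.0, 30.0, 32.0]); v = np.array([80.0, 75.0, 70.0]); w = 20.0
rho, v, w = metanet_step(rho, v, w, d_o=900.0, r_o=0.6,
                         q_up=5500.0, v_up=85.0, rho_down=35.0, v_ctrl=70.0)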
Calibration of the METANET model parameters

The model calibration procedure aims at enabling the model to represent traffic conditions with sufficient accuracy. The macroscopic model presented in Section 3 includes a number of parameters that reflect particular characteristics of a given highway stretch and depend upon the highway geometry, vehicle characteristics, driver behavior, etc. These parameters should be calibrated to fit a representative set of real data with the maximum possible accuracy. For this purpose the macroscopic traffic simulator Aimsun is used. Data derived from this simulator are used as real-world data to be compared with the model data. The purpose of the calibration is to minimize the difference between the real data and the model data. For this purpose, the genetic algorithm toolbox implemented in Matlab is employed.

A genetic algorithm starts with an initial set of random solutions called a population. Each individual in the population is called a chromosome, representing a solution to the calibration problem. The evolution operation simulates the process of Darwinian evolution to create populations from generation to generation by selection, crossover and mutation operations. The success of the genetic algorithm is founded in its ability to keep the existing parts of a solution that have a positive effect on the outcome [12]. The seven parameters of the METANET model ($v_{free}$, $\tau$, $\nu$, $\kappa$, $\delta$, $a_m$, $\rho_{crit}$) are tuned by the genetic algorithm. To compromise between computation time and precision, 30 individuals are used. After creating a new population, the fitness value is calculated for each member of the population, and the members are then ranked based on the fitness value. The genetic algorithm selects parents from the current population by using a selection probability. Then the reproduction of children from the selected parents occurs by using recombination and mutation. The cycle of evaluation, selection and reproduction terminates when the convergence criterion is met [11].
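The following is a minimal Python sketch of this calibration loop. It assumes a user-supplied simulate(params) routine that runs the METANET model and returns flow, speed and density trajectories, together with reference trajectories from the traffic simulator; the bounds, population size and operator rates are illustrative, and the fitness is the squared-error criterion defined in Section 4.1 below.

import numpy as np

rng = np.random.default_rng(0)
BOUNDS = np.array([[60, 120],   # v_free (km/h)        (illustrative ranges)
                   [1, 60],     # tau (s)
                   [10, 65],    # nu (km^2/h)
                   [5, 60],     # kappa (veh/km)
                   [0.1, 5],    # delta
                   [1, 4],      # a_m
                   [20, 45]])   # rho_crit (veh/km/lane)

def fitness(params, reference, simulate):
    q, v, rho = simulate(params)               # model trajectories
    q_r, v_r, rho_r = reference                # "real" (Aimsun) trajectories
    return np.sum((q - q_r)**2 + (v - v_r)**2 + (rho - rho_r)**2)   # squared-error criterion

def calibrate(reference, simulate, pop_size=30, generations=100, p_mut=0.1):
    lo, hi = BOUNDS[:, 0], BOUNDS[:, 1]
    pop = rng.uniform(lo, hi, size=(pop_size, len(BOUNDS)))
    for _ in range(generations):
        fit = np.array([fitness(ind, reference, simulate) for ind in pop])
        order = np.argsort(fit)                 # rank the population by fitness
        parents = pop[order[:pop_size // 2]]    # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(len(BOUNDS)) < 0.5
            child = np.where(mask, a, b)        # uniform crossover
            mutate = rng.random(len(BOUNDS)) < p_mut
            child = np.where(mutate, rng.uniform(lo, hi), child)    # mutation
            children.append(child)
        pop = np.vstack([parents, children])
    fit = np.array([fitness(ind, reference, simulate) for ind in pop])
    return pop[np.argmin(fit)]                  # best parameter set found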

4.1 Fitness Function

The calibration is an optimization procedure that minimizes the difference between the real data coming from Aimsun and the data produced by the METANET model. In particular, we try to minimize the following objective function:

$$\sum_{h=0}^{N_{samp}} \sum_{(m,i) \in I_{all}} \left[ \left(q^{model}_{m,i}(h) - q^{sim}_{m,i}(h)\right)^2 + \left(v^{model}_{m,i}(h) - v^{sim}_{m,i}(h)\right)^2 + \left(\rho^{model}_{m,i}(h) - \rho^{sim}_{m,i}(h)\right)^2 \right] \quad (1)$$

where $N_{samp}$ is the number of simulation time steps in the entire simulation period and $I_{all}$ is the set of indexes of all pairs of links and segments.

4.2 Results of Model Calibration

For the calibration procedure, one measurement set, corresponding to one weekday from 7 a.m. to 11 a.m., was available from the study site. Our data collection tool was the Aimsun simulator. These data provided flow, speed and density measurements on a ten-second by ten-second basis. The genetic algorithm yields a set of optimal parameters. The summarized outcome of this effort is presented in Table 3.

Table 3: Parameter set for the Hemmat highway
$\rho_{crit,m}$: 32.1646; $v_{free}$ (km/h): 92.1957; $\tau$ (second): 0.08649; $\delta$: 13.839
$\nu$ (km$^2$/h): 31.6307; $\kappa$ (veh/km): 56.0935; $a_m$: 2.425

Based on the set of parameters shown in Table 3, Fig. 1 depicts the speed, density and flow trajectories determined by the calibrated model, compared with the actual measurements. As can be seen in Fig. 1, after calibrating the model parameters the model is properly able to predict the network traffic conditions.

Figure 1: Segment 2, measured versus predicted flow, speed and density; qualitative validation.

Case Study

A 2-km section of the eastbound Hemmat highway was selected as the study network. The Hemmat highway serves a large volume of commuter traffic in both the morning and evening peak periods, leading to heavy recurrent congestion. For these reasons, we consider this 2-km section an ideal study section on which to apply the control framework presented above in order to alleviate serious congestion problems. The network topology and the location of the control equipment and sensors can be seen in Fig. 2.

Figure 2: Candidate traffic network.

The objective function used in this paper is to minimize the TTS spent by all vehicles, defined as

$$TTS = T \sum_{j=k}^{k+P-1} \left[ \sum_{m,i} \rho_{m,i}(j)\, l_{m,i}\, n_m + \sum_{o \in O_{ramp}} w_o(j) \right] + \sum_{j=k}^{k+P-1} \left[ \lambda_{ramp} \sum_{o \in O_{ramp}} \big(r_o(j) - r_o(j-1)\big)^2 + \lambda_{speed} \sum_{i \in I_{speed}} \left(\frac{v_i(j) - v_i(j-1)}{v_{free}}\right)^2 + \lambda_{queue} \sum_{o \in O_{ramp}} \big(\max(w_o - w_{max}, 0)\big)^2 \right] \quad (2)$$

For the MPC system, the optimal prediction and control horizons were found to be approximately 60 and 48 steps, corresponding to 10 and 8 min, respectively. The time step for control updates was set to 1 min, which means that every minute the optimal control must be computed and applied to the traffic system. The simulation results, from 7 a.m. to 9 a.m., for the no-control and MPC cases are shown in Fig. 3 and Fig. 4. The TTS in the no-control case was 2482.1 veh.h. The TTS in the control case was 2192.7 veh.h, which showed an 11.6584% improvement compared with the no-control case.

Conclusions and Future Work

In this paper, a model predictive control approach has been used to address the problem of congestion control in the selected part of the Hemmat highway. The METANET model, which is used for the prediction step of MPC, was calibrated by a genetic algorithm to enable the model to represent traffic conditions with sufficient accuracy. Based on the simulation results, the MPC approach results in less congestion, a higher outflow, and a lower total time spent in the controlled areas. For future work, we will focus on testing the MPC approach on a larger part of Tehran's traffic network, including more traffic controllers, to investigate the efficiency of this method as the number of traffic controllers increases.

Figure 3: Simulation results for the no-control case: segment traffic density, segment traffic speed, segment traffic flow and origin queue length.

Figure 4: Simulation results for the control case: segment traffic density, segment traffic speed, segment traffic flow, origin queue length, optimal ramp metering rates and optimal speed limit values.

References

[1] A. Kotsialos, M. Papageorgiou, C. Diakakii, Y. Pavlis, and F. Middelham, Traffic Flow Modeling of Large-Scale Motorway Networks Using the Macroscopic Modeling Tool METANET, IEEE Transactions on Intelligent Transportation Systems 3 (2002), 282-292.
[2] A. Kotsialos and M. Papageorgiou, Ramp metering: An overview, IEEE Transactions on Intelligent Transportation Systems (2002), 271-281.
[3] A. Hegyi, B. De Schutter, and H. Hellendoorn, Model predictive control for optimal coordination of ramp metering and variable speed limits, Transport. Res. C 13 (2005), 185-209.
[4] X. Lu, T. Qiu, P. Varaiya, R. Horowitz, and S. E. Shladover, Combining Variable Speed Limits with Ramp Metering for Freeway Traffic Control, American Control Conference (2010).
[5] A. Alessandri, A. Di Febbraro, A. Ferrara, and E. Punta, Optimal control of freeways via speed signaling and ramp metering, Control Engineering Practice 6 (1998), 771-780.
[6] C. Caligaris, S. Sacone, and S. Siri, Optimal ramp metering and variable speed signs for multiclass freeway traffic, Proc. of European Control Conference, Kos, Greece (2007).
[7] I. Papamichail, K. Kampitaki, M. Papageorgiou, and A. Messmer, Integrated Ramp Metering and Variable Speed Limit Control of Motorway Traffic Flow, 17th IFAC World Congress, Seoul, Korea (2008).
[8] A. Ghods, L. Fu, and A. Rahimi Kian, An Efficient Optimization Approach to Real-Time Coordinated and Integrated Freeway Traffic Control, IEEE Transactions on Intelligent Transportation Systems (2010).
[9] A. Hegyi, Model predictive control for integrating traffic control measures (2002).
[10] A. Kotsialos, M. Papageorgiou, and A. Messmer, Integrated optimal control of motorway traffic networks, American Control Conf. (ACC), San Diego (1999), 2183-2187.
[11] A. Ghods, A. Rahimi Kian, and M. Tabibi, Adaptive Freeway Ramp Metering and Variable Speed Limit Control: A Genetic-Fuzzy Approach, IEEE Intelligent Transportation Systems Magazine (2009).
[12] D. Goldberg, Genetic Algorithm in Search, Optimization and Machine Learning (1989).


Designing An Expert System To Diagnose And Propose About Therapy Of Leukemia

Armin Ghasem Azar, Zohreh Mohammad Alizadeh Bakhshmandi
Department of Computer and Information Sciences, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran
a.ghasemazar@iasbs.ac.ir, z.alizadeh@iasbs.ac.ir
Corresponding Author, P. O. Box 45195-1159, M: (+98) 914 306-0594, T: (+98) 241 415-5056

Abstract: Expert systems are designed for non-expert individuals with the aim of providing the skills of qualified personnel. These programs simulate the pattern of thinking and the manner of operation of a human expert, so that the operation of the expert system is close to that of a human expert. A variety of expert systems have already been offered in the field of medical science, which in this respect is one of the leading sciences. Leukemia is a very common and serious cancer that starts in blood-forming tissue such as the bone marrow. It causes large numbers of abnormal blood cells to be produced and to enter the blood. Speed is always important in the diagnosis and treatment of Leukemia and in the recovery of patients, but patients sometimes have no access to specialists; for this reason, designing a system with specialist knowledge, one that offers a diagnosis and appropriate treatment to patients, provides timely treatment. In this paper an expert system for the diagnosis of Leukemia is presented, built using the VP-Expert shell.

Keywords: Expert System of Leukemia; Diagnosis; Therapy.

Introduction

With the expanding application of information technology, decision making systems, or more generally computer-based decision making, have become very important. In this regard, expert systems, as one of the fields attributed to artificial intelligence, play the main role. All kinds of decisions in expert systems are taken with the help of computers. Expert systems are knowledge-based systems, and knowledge is their most important part. In these systems, knowledge is transferred from experts in a given science to the computer. Expert systems have been used extensively in various sciences. So far, various expert systems have been designed and presented in areas such as industry, space travel, financial decision making, etc. The use of expert systems has also found its way into the medical world [1]. DENDRAL was presented in 1965 to describe and explain molecular structure [2], MYCIN was presented in 1976 to diagnose infectious diseases [3], and other expert systems that detect acid and electrolyte disorders, train in the management of anaesthesia, or diagnose diseases of internal medicine belong to this category [6].

The purpose of this article is to present an expert system to diagnose and propose practices in the therapy of Leukemia. The issue will first be discussed in more detail, then the stages of system construction and its components will be described, and finally the operation of the designed system will be illustrated with an applied example. A medical expert system is a computer program that offers effective aid in making decisions about the diagnosis of diseases and suggestions on treatment methods. Diagnosis of the disease and prediction of complications are done after the program receives the patient's information. This information is usually transmitted through the patient to the physician. Medical expert systems have features that distinguish them from other medical applications. One aspect of this difference is that these systems mimic the arguments of an expert physician, step by step, in order to achieve accurate results. In most cases, the specialist using this software is aware of these sequential arguments.

Leukemia is one of the most important cancers that human society has been involved with. There is usually no definite sign of Leukemia, and when symptoms appear they are very ambiguous and complex and too similar to the symptoms of flu. An expert system can be designed that diagnoses Leukemia in view of the above symptoms and suggests specific treatments. Using expert software systems has several advantages:

Individuals have fleeting and transient expertise. For example, a person may change his job, become sick, etc., but the computer has permanent expertise;

A person does not have stable expertise. An expert may have holidays, recreation programs, etc., all of which adversely affect the normal function of individuals, but computers are stable and, under the same conditions, offer the same outputs;

Also, expert systems have the ability to be upgraded.

Some other advantages that expert systems can provide include:

High performance;

Full and fast performance time;

Good reliability;

Being understandable;

Flexibility;

Risk reduction;

Durability and survival;

Existence of multiple specialities.

The aim of the project leading to this article is to take advantage of a software system in order to achieve all the benefits of an expert system to diagnose the disease and propose how to treat Leukemia [6, 8].

Figure 1: the relationship between the various components of an expert system [6]

Survey Method

The VP-Expert shell has been used to design the mentioned expert system. This software was presented in 1993 by World Tech Systems Company in America as a tool for developing rule-based expert systems. Its features include [7, 14]:

Ability to create a knowledge base file with a simple table;

Chaining capability to link together multiple knowledge bases;

Automatic generation of the questions whose answers are needed to reach a result;

A relatively diverse set of mathematical functions;

Instructions that ask the expert system to explain its activities during a consultation.

2.1 Stages of system construction

Prototyping is one of the most common design methods used by builders of expert systems. In this method, systems that are not yet ready to be formally delivered are provided to users in order to obtain the necessary feedback, and the necessary modifications are then made to the system. This method involves three stages, Analysis, Design and Implementation, which are repeated together [13]. The prototype method is also used in this article. Therefore, the purposes and objectives of the expert system are first defined, and then the related research and the identification of hardware, software and related experience are gradually carried out. Next, the environment of the expert system is described, the conceptual analysis and design of the system is done, and in fact a kind of feasibility study is performed. In the next stage, the components of the expert system are determined and the software that can support these components is surveyed and selected. Finally, the system is built and the components are put together.

2.2 Components Of Expert System

The expert system for diagnosing and advising about Leukemia, like any expert system, is composed of three main components:

Knowledge base management subsystem;

Interface management subsystem;

Inference engine subsystem.

The schematic view of the components of an expert system is shown in Figure 1. In the following, all three components of the designed system are described [6, 7, 13].

2.3 Knowledge Base Subsystem

The block and Mockler diagrams are used in order to build the knowledge base of the mentioned system. Block diagrams are graphs in which the main tasks of the system are determined; they are very suitable for expressing the relationship between agents and targets. The block diagram related to the diagnosis of Leukemia is, at the first level, composed of three parts: blood test, symptoms of the disease, and time of disease onset. Block diagrams do not help in writing the rules, because they do not have the necessary details for this work. In this regard, a diagram is needed that specifies the relationship between the factors affecting the aim by specifying the questions, rules and recommendations. The first level of the Mockler diagram for the diagnosis of Leukemia is shown in Figure 2. As shown, the questions about the duration of the disease are placed on the straight line, and the options related to the questions can be seen under the same line. After the questions, and the options the user should choose in answer to every question, are determined by drawing the Mockler diagram, the results and the various situations that the user may impose in response to any question can be determined. For this purpose, three decision tables are used: to identify the patient, to deduce the blood test mode, and to deduce the type of symptoms.

Figure 2: Mockler diagram related to the diagnosis of Leukemia [6]

2.4 Inference Engine Subsystem

In rule-based systems, the inference engine works by selecting a rule for testing and checking whether or not the conditions of this rule hold. These conditions may be assessed by questioning the user or may be derived from facts obtained during interviews. When the conditions of a rule are satisfied, the conclusions of that rule become true: the rule is activated, and its result is added to the knowledge base.

2.5 The User Interface Subsystem

The user interface of an expert system should normally have high interaction power, so that the exchange of information takes place in the form of a conversation between an applicant and a human expert [8]. The VP-Expert shell has a user interface in which questions are asked of the user based on the rules of the system's knowledge base; based on the answers the user gives the system, the necessary conclusions are drawn, and at the end a suitable answer is offered to the user. In the next section, the working process of the expert system is described with a practical example.

2.6 Implementation

Consider a man who suddenly presents with vomiting, headache, anemia and splenomegaly, and whose blood test shows that the PLT is 19,000, the WBC is 3,000 units, the RBC is 5, the HCT is 0.30 and the amount of hemoglobin is 11 units. This person plans to investigate the disease status (or lack of it) and its kind with the designed expert system. A view of the user interface and the answer of the designed system is shown in Figure 3. After the diagnosis of the disease, the system suggests ways of treating it.
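To make the inference cycle of Section 2.4 concrete, the following is a minimal Python sketch of forward chaining over attribute/value facts. The two rules shown are hypothetical placeholders for illustration only, not the actual VP-Expert knowledge base of the designed system.

def forward_chain(rules, facts):
    """Fire rules whose conditions hold until no new conclusions are added."""
    facts = dict(facts)
    fired = True
    while fired:
        fired = False
        for conditions, conclusion in rules:
            key, value = conclusion
            if key in facts:
                continue                          # this conclusion was already derived
            if all(facts.get(k) == v for k, v in conditions):
                facts[key] = value                # add the result to the knowledge base
                fired = True
    return facts

# hypothetical screening rules (for illustration only)
rules = [
    ([("plt_low", "yes"), ("wbc_abnormal", "yes")], ("blood_test", "suspicious")),
    ([("blood_test", "suspicious"), ("splenomegaly", "yes")], ("advice", "refer_to_specialist")),
]
print(forward_chain(rules, {"plt_low": "yes", "wbc_abnormal": "yes", "splenomegaly": "yes"}))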

Figure 3: the question asked of the user by the VP-Expert system about the vomiting symptom

Discussion And Conclusion

In this article, an expert system to diagnose and recommend a treatment method for Leukemia was presented. For this purpose, the objectives and targets of the expert system were first defined; then the relevant research was reviewed, the hardware, software and related experience were identified, and the environment of the expert system was described. Then the conceptual design and analysis of the system, in fact a kind of feasibility study, was conducted. In the next step, the components of the expert system were determined, and the VP-Expert shell was chosen as the software that can support those components.

It is noteworthy that one should try to provide systems that can simulate the behaviour of expert people, but this is not always possible. One defect of the designed system is that clinical evaluation is not possible: the system acts only on the basis of the user's responses and cannot verify the correctness of the responses received from the user.

References

[1] Durkin J., Expert Systems: Design and Development, Prentice Hall, New York, 1994.
[2] E. A. Feigenbaum and B. G. Buchanan, DENDRAL and Meta-DENDRAL: Roots of Knowledge Systems and Expert System Applications, Artificial Intelligence 59 (1993), 233-240.
[3] Shortliffe EH, Computer-based Medical Consultations: MYCIN, Elsevier Science Publishers, New York (1976).
[4] Siyadat M and Soltaniyanzadeh H, Hippocampus location in the human brain in MRI process by expert systems, Journal of Engineering, Faculty of Tehran University 341 (2001), 923.
[5] Hatzilygeroudis P, Vassilakos J, and Tsakalidis A, XBONE: A Hybrid Expert System Supporting Diagnosis of Bone Diseases, Proceedings of Medical Informatics Europe 97, London (1997).
[6] Ghazanfari M and Kazemi Z, Expert Systems, Elmo Sanat, Tehran, 2004.
[7] Elahi Sh and Rajabzadeh A, Expert Systems: Intelligent Decision Making Pattern, Bazargani, Tehran, 2004.
[8] Darligton K and Motameni H (Translator), Expert Systems, Olomeh Rayaneh, Tehran, 2003.
[9] Babamohammadi H, Internal Surgery Nursing, Boshra, Tehran, 2009.
[10] Bahadori M, Robbins Pathology, Andisheh Rafi, Tehran, 2006.
[11] Robbins SL, Historical Specificity, Andisheh Rafi, Tehran, 1998.
[12] Shahbazi K, What is a Cancer, Elmo Sanat, Tehran, 2010.
[13] Turban E, Aronson JE, and Liang TP, Decision Support Systems and Intelligent Systems, 7th ed., Prentice Hall, New York, 2005.
[14] Simonovic SP, User Manual of VP Expert: Rule based expert system development tool, Word Tech System, London, 1993.


A Basic Proof Method For The Verification, Validation And Evaluation Of Expert Systems

Armin Ghasem Azar, Zohreh Mohammad Alizadeh Bakhshmandi
Department of Computer and Information Sciences, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran
a.ghasemazar@iasbs.ac.ir, z.alizadeh@iasbs.ac.ir
Corresponding Author, P. O. Box 45195-1159, M: (+98) 914 306-0594, T: (+98) 241 415-5056

Abstract: In the present paper, a basic proof method is provided for the verification, validation and evaluation of expert systems. The result provides an overview of the basic method of formal proof: partition larger systems into small subsystems, prove correctness of the small systems by non-recursive means, and prove that the correctness of all subsystems implies the correctness of the entire system.

Keywords: Expert System; Partition; Non-recursive.

Introduction

An expert system is correct when it is complete, consistent, and satisfies the requirements that express expert knowledge about how the system should behave.
For real-world knowledge bases containing hundreds of
rules, however, these aspects of correctness are hard to
establish. There may be millions of distinct computational paths through an expert system, and each must
be dealt with through testing or formal proof to establish correctness.
To reduce the size of the tests and proofs, one useful
approach for some knowledge bases is to partition them
into two or more interrelated knowledge bases. In this
way the VV&E problem can be minimized [1].

Overview of proofs using partitions

The basic method of proving each of these aspects of correctness is basically the same. If the system is small, a technique designed for proving correctness of small systems should be used. If the system is large, a technique for partitioning the expert system must be applied, and the required conditions for applying the partition to the system as a whole should be proven. In addition, the correctness of any subsystem required by the partition must be ensured. Once this has been accomplished, this basic proof method should be applied recursively to the sub-expert systems.

Once the top level structure of the Knowledge base has been validated, the following criteria must be accomplished to show the correctness of the expert system [6]:

Show that the Knowledge base and inference engine implement the top level structure;

Prove any required relationships among sub-expert systems or parts of the top level Knowledge representation;

Prove any required properties of the sub-Knowledge bases.

2.1 A simple example

To illustrate the basic proof method, Knowledge Base 1 (Table 1) will be proved correct, although this Knowledge base is small enough to verify by inspection.


2.1.1 Knowledge Base 1

Table 1: Knowledge Base 1 [7]
Rule 1: If Risk tolerance = high AND Discretionary income exists = yes then Investment = stocks.
Rule 2: If Risk tolerance = low OR Discretionary income exists = no then Investment = bank account.
Rule 3: If Do you buy lottery tickets = yes OR Do you currently own stocks = yes then Risk tolerance = high.
Rule 4: If Do you buy lottery tickets = no AND Do you currently own stocks = no then Risk tolerance = low.
Rule 5: If Do you own a boat = yes OR Do you own a luxury car = yes then Discretionary income exists = yes.
Rule 6: If Do you own a boat = no AND Do you own a luxury car = no then Discretionary income exists = no.

2.1.2 Illustrations of Knowledge Base 1

Knowledge Base 1 (KB1) has six rules. There are seven variables, each of which can take two possible values. It is, therefore, a seven-dimensional binary problem [5]. Let us focus on Rule 3 to understand the illustrations of KB1. It has two hypotheses and one conclusion. The hypotheses are "Do you buy lottery tickets? = yes" and "Do you currently own stock? = yes". They are associated with the logical operator "or". The consequent is "Risk tolerance = high". This is illustrated in Figure 1. For the two variables of the hypotheses in Rule 3, there are two possible values: yes or no. The number of possible combinations of values for the variables is four. These four combinations appear in Figure 1 as four square regions defined by the closed boundary (defining the domain of the variables) and the line boundaries separating the possible values for each variable. Each square is a Hoffman region. In two dimensions, a Hoffman region is a surface, as shown in this example. In three dimensions, it would be a volume.

Figure 1: Knowledge Base 1 [7]

If the variable "Do you buy lottery tickets" is assigned the value yes, then two of the four regions are relevant; in Figure 1.a they are shown hatched. The two regions corresponding to the hypothesis "Do you currently own stock? = yes" are hatched in Figure 1.b.

The logical operators are "and", "or" and "not". In Figures 1.a and 1.b, the Hoffman regions corresponding to the hypotheses of Rule 3 are hatched. When they are combined with an "and" logical operator, the intersection of the two sets of Hoffman regions is taken; this is shown in Figure 2.a. The intersection in this case is a unique Hoffman region. In Rule 3, an "or" operator connects the two hypotheses; in this case, the union of the two sets of Hoffman regions is taken, as shown in Figure 2.b.

Figure 2: Knowledge Base 1 [7]

Next, the region defined by the logical expression of the hypotheses is labelled with its rule. For Rule 3, the three Hoffman regions are labelled with a circled 3, as shown in Figure 3.a. The consequence of the rule is linked to the label of the region of the hypotheses: in Figure 3.b, an arrow starts at the circled 3 and ends at the value low of the variable Risk.

Figure 3: Knowledge Base 1 [7]

2.2 Step 1 - Determine Knowledge Base structure

To prove the correctness of Knowledge Base 1 (KB1), expert Knowledge can determine that the system represents a 2-step process [3]:

Find the values of some important intermediate variables, such as risk tolerance and discretionary income;

Use these values to assign a type of investment.

KB1 was built using this Knowledge; therefore, it can be partitioned into the following pieces:

A subsystem to find risk tolerance (part of Step 1);

A subsystem to find discretionary income (part of Step 1);

A subsystem to find the type of investment given this information (part of Step 2).

2.3 Step 2 - Find Knowledge Base partitions

To find each of the three subsystems of KB1, an iterative procedure can be followed:

Start with the variables that are goals for the subsystem, e.g., risk tolerance for the risk tolerance subsystem;

Include all the rules that set subsystem variables in their conclusions. For the risk tolerance subsystem, Rules 3 and 4 are included;

Include all variables that appear in rules already in the subsystem and are not goals of another subsystem. For the risk tolerance subsystem, include "Do you buy lottery tickets" and "Do you currently own stocks";

Quit if all rules setting subsystem variables are in the subsystem, or else go to Step 2. For the risk tolerance subsystem, there are no more rules to be added.

Figure 4 below shows the partitioning of KB1 using this method.

Figure 4: Knowledge Base 1 [3]

2.4 Step 3 - Completeness of expert systems

2.4.1 Completeness Step 1 - Completeness of subsystems

The first step in proving the completeness of the entire expert system is to prove the completeness of each subsystem. To this end it must be shown that for all possible inputs there is an output, i.e., that the goal variables of the subsystem are set. This can be done by showing that the OR of the hypotheses of the rules that assign to a goal variable is true [7].
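A minimal Python sketch of this completeness check for the risk tolerance subsystem of Knowledge Base 1 (Rules 3 and 4) is given below: it simply enumerates every combination of input values and verifies that at least one rule hypothesis holds, i.e., that the OR of the hypotheses is a tautology. The encoding of the rules is an illustrative assumption.

from itertools import product

# each rule: (operator joining the conditions, list of (variable, value), conclusion)
rules = [
    ("or",  [("lottery", "yes"), ("own_stocks", "yes")], ("risk_tolerance", "high")),  # Rule 3
    ("and", [("lottery", "no"),  ("own_stocks", "no")],  ("risk_tolerance", "low")),   # Rule 4
]
inputs = ["lottery", "own_stocks"]
values = ["yes", "no"]

def holds(op, conditions, assignment):
    tests = [assignment[var] == val for var, val in conditions]
    return any(tests) if op == "or" else all(tests)

def complete(rules, inputs, values):
    # the subsystem is complete iff for every input some rule hypothesis is satisfied
    for combo in product(values, repeat=len(inputs)):
        assignment = dict(zip(inputs, combo))
        if not any(holds(op, conds, assignment) for op, conds, _ in rules):
            return False, assignment           # a gap: no rule fires for this input
    return True, None

print(complete(rules, inputs, values))         # (True, None) for this subsystem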


2.4.2 Completeness Step 2 - Completeness of the entire system

The results of subsystem completeness are used to establish the completeness of the entire system. The basic argument is to use the results on subsystems to prove that successively larger subsystems are complete. At each stage of the proof there are some subsystems known to be complete; initially, this is the subsystem that concludes the overall goals of the expert system. At each stage of the proof, a subsystem that concludes some of the input variables of the currently-proved-complete subsystem is added to the currently complete subsystem. After a number of steps equal to the number of subsystems, the entire system can be shown to be complete.

2.5 Step 4 - Consistency of the entire system

The first step in proving the consistency of the entire expert system is to prove the consistency of each subsystem. To do this, the user must show that for all possible inputs the outputs are consistent, i.e., that the AND of the conclusions can be satisfied. For example, if an expert system concludes "temperature > 0" and "temperature < 100", the AND of these conclusions can be satisfied. However, if the system concludes "temperature < 0" and "temperature > 100", the AND of these two conclusions has to be false. It is clear that, for the input that produced these two conclusions, it is not possible for all of the system's conclusions to be true at the same time, and thus the system producing these conclusions is inconsistent.

2.5.1 Consistency Step 1 - Find the mutually inconsistent conclusions

The first step in proving consistency is to identify the sets of mutually inconsistent conclusions for each of the subsystems identified in the "Find partitions" step above. Some sets of conclusions are mathematically inconsistent [2]. For example, if a system describes temperature, the set {temperature < 0, temperature > 100} is mathematically inconsistent. Because some sets of conclusions are inconsistent only because of domain expertise, finding all sets of inconsistent conclusions generally requires expert Knowledge. Note that if there are no mutually inconsistent conclusions in the expert system as a whole, then consistency holds by default, and no further consistency proof is necessary.

2.5.2 Consistency Step 2 - Prove consistency of subsystems

If there are inconsistent conclusions in the Knowledge base as a whole, then the next step in proving consistency is to prove the subsystems consistent. This can be done by showing that no set of inputs to a subsystem can result in any of the sets of inconsistent conclusions.

2.5.3 Consistency Step 3 - Consistency of the entire system

The results of subsystem consistency are used to establish the consistency of the entire system. The basic argument is to use the results on subsystems to prove that successively larger subsystems are consistent. At each stage of the proof, there are some subsystems known to be consistent; initially, this is the subsystem that concludes the goals of the expert system as a whole. At each stage of the proof, a subsystem that concludes some of the input variables of the currently-proved-consistent subsystem is added to the currently consistent subsystem. After a number of steps equal to the number of subsystems, the entire system can be shown to be consistent [2].

2.6 Step 5 - Specification satisfaction

In order to prove that KB1 satisfies its specifications, the user must actually know what its specifications are. This is a special case of the general truth that in order to verify and validate, the user must know what a system is supposed to do. Specifications should be defined in the planning stage of an expert system project [4]. To illustrate the proof of specifications, it will be assumed that KB1 is supposed to satisfy: "A financial advisor should only recommend investments that an investor can afford." As with many other aspects of verification and validation, expert Knowledge must be brought to bear on the proof process. For KB1, an expert might say that anyone can afford a savings account. Therefore, the user only has to look at the conditions under which stocks are recommended. However, that same expert would probably say that just having discretionary income does not mean that the user can afford stocks; that judgement should be made on more than one variable. Therefore, it would be reasonable to conclude that KB1 does not satisfy the above specification.


Conclusion

This paper has argued that V&V techniques are an essential part of the Knowledge engineering process, because they offer the only way to judge the success (or otherwise) of a KBS development project. This is equally true in the context of Knowledge management, where V&V techniques tell us whether or not the KBS can be relied upon to accurately embody the Knowledge of the human experts that supplied it.

However, examination of known studies on the effectiveness of existing KBS VV&E techniques has shown that the state of Knowledge in this area is sparse. The way to improve this situation would be to systematically gather data from a representative set of KBS projects and V&V techniques. Without such a study, Knowledge engineering will remain very much an art and, by extension, so will the use of KBS technology in Knowledge management.

It is difficult to generalise our results to all Knowledge based systems and, of course, further evaluations of other applications are necessary to confirm (or challenge) our conclusions. However, since the method we have used minimises the need for experts' interpretation of the faults, we can reasonably conclude that if we use an application of similar size and complexity to GIBUS, we would expect to obtain similar results. Consequently, since our application has a size and a complexity which is representative of actual practice, we would expect that consistency and completeness checking, in addition to testing, would be an effective combination of methods to validate many of the Knowledge based systems actually under development.

References

[1] Ayel M and Laurent J-P, Two different ways of verifying Knowledge-based systems, Validation, Verification and Test of Knowledge-Based Systems, Wiley, New York (1991), 63-76.
[2] Bendou A, A constraint-based test data generator, EUROVAV-95, Saint Badolph, France (1995), 19-29.
[3] Ginsberg A, Knowledge-base reduction: A new approach to checking Knowledge bases for inconsistency & redundancy, AAAI 88 2 (1988), 585-589.
[4] Kirani S, Zualkernan I.A, and Tsai W.T., Comparative Evaluation of Expert System Testing Methods, Computer Science Department, University of Minnesota, Minneapolis 2 (1992), 9230.
[5] Laurent J-P, Proposals for a valid terminology in KBS validation, ECAI-92, Wiley, New York 2 (1992), 829-834.
[6] Lounis R and Ayel M, Completeness of KBS, EUROVAV-95, Saint Badolph, France 2 (1995), 31-46.
[7] O'Leary D, Design, development and validation of expert systems: A survey of developers, Vol. 2, 1991.

Point set embedding of some graphs with small number of bends

Maryam Tahmasbi, Zahra Abdi reyhan
Department of Computer Science, Shahid Beheshti University, G.C., Tehran, Iran
m tahmasi@sbu.ac.ir, z.abdi@mail.sbu.ac.ir
Corresponding Author, T: (+98) 21 299-03004

Abstract: In this paper we study the problem of point-set embedding. We assume that G is a planar graph with n vertices and S is a set of n points in general position in the plane. The problem is to find a planar drawing of G such that each vertex is mapped to one of the points in S, each edge is mapped to a polygonal chain, and the drawing has a small number of bends. In this paper we prove that (1) every wheel has a point set embedding with no bends on a set of points in non-convex position; moreover, if the points are in convex position, then the wheel has a point set embedding with at most one bend; (2) every θ-graph has a point set embedding with at most six bends on a set of points in general position, such that one of its cycles is drawn with straight lines; (3) every k-path graph has a point set embedding on a set of points in general position with at most 2k − 2 bends.

Keywords: point set embedding; wheel; θ-graph; planar drawing; bend; convex hull; k-path graph.

Introduction

The problem of computing a planar drawing of a graph on a given set of points in the plane is a classical subject both in graph drawing and in computational geometry [1]. Let G be a planar graph with n vertices and S a set of n points in the plane; a point set embedding of G on S is a planar drawing of G such that each vertex is mapped to a distinct point of S and each edge is drawn as a polygonal chain. A point set embedding with no bends is called a straight-line point set embedding. There are two versions of the problem: point set embedding with mapping and without mapping. We study the problem for the case that there is no predefined mapping between the vertices of the graph and the points of the set. Given a planar graph G, deciding whether G has a straight-line point set embedding is NP-complete [2]. It is proved that any outerplanar graph has a planar straight-line point set embedding [3],[4] and that any planar graph has a planar embedding on every point set in the plane with at most two bends per edge [5]. The problem has also been studied for planar triangulations [5]. There are other versions of the problem; one of the most studied is the one where a partial drawing of the graph is given [1],[6],[7]. Given a planar graph G = (V, E) and a planar straight-line partial drawing D of G, it was shown that it is NP-hard to decide whether G admits a planar straight-line drawing including D [7].

Wheels

A wheel Wn is a graph consisting of a cycle with n vertices and a vertex, called the center, that is adjacent to all vertices of the cycle. In this section we study the problem of embedding a wheel on a point set S in general position. We study two cases, where the points of S are in convex and in non-convex position, separately.

2.1 Points in non-convex position

In this section we suppose that S is a set of n + 1 points in general, non-convex position.

Theorem 2.1: The graph Wn admits a straight line point set embedding on S.

Proof. Let CH(S) be the convex hull of S, and let pl and pr be the leftmost and rightmost points of CH(S), respectively. Let pl = p1, p2, ..., pk be the clockwise ordering of the points on CH(S), and let q1, ..., qh be the remaining points from right to left. Choose a point qi as the center. We can draw straight lines from qi to all other points. Starting from p1, in each convex region, connect all points except qi in order of point distance. Fig. 1 shows the steps.

Figure 1: (a) The wheel W4. (b) A set of five points in non-convex position. (c) Convex hull of S. (d) Straight line point set embedding of G on S.
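The construction of Theorem 2.1 can be prototyped as follows. This Python sketch takes a simpler route than the region-by-region procedure of the proof: it picks an interior (non-hull) point of S as the center and orders the remaining points angularly around it, which yields a star-shaped cycle and straight, non-crossing spokes. It is an illustration of the statement, not the exact procedure used in the proof.

import math

def cross(o, a, b):
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def convex_hull(pts):
    pts = sorted(pts)                                   # Andrew's monotone chain
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    return half(pts)[:-1] + half(reversed(pts))[:-1]

def embed_wheel(points):
    hull = set(convex_hull(points))
    interior = [p for p in points if p not in hull]
    if not interior:
        raise ValueError("points are in convex position; one bend is needed (Theorem 2.2)")
    center = interior[0]                                # map the wheel's center here
    rest = [p for p in points if p != center]
    rest.sort(key=lambda p: math.atan2(p[1]-center[1], p[0]-center[0]))
    cycle = [(rest[i], rest[(i+1) % len(rest)]) for i in range(len(rest))]
    spokes = [(center, p) for p in rest]
    return center, cycle, spokes                        # straight-line drawing of W_n

center, cycle, spokes = embed_wheel([(0, 0), (6, 0), (6, 5), (0, 5), (2, 2)])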
2.2 Points in convex position

In this section, we describe how to compute a point set embedding of Wn on a set S of n + 1 points in convex (and general) position with at most one bend in total.

Theorem 2.2: The wheel Wn admits a point set embedding on S with exactly one bend.

Proof. Let CH(S) be the convex hull of S and let pl be the leftmost point of CH(S). Let pl = p1, ..., pn+1 be the points on CH(S) in clockwise order. Choose an arbitrary point pi as the center. Since the points are in convex position, we can draw diagonals from pi to all other points of S. Now, we need to connect the points before and after pi on CH(S). For this purpose we place a dummy vertex outside the convex hull, near pi. From this dummy vertex we draw straight line segments to the points before and after pi. Finally, we replace this dummy vertex with a bend. Fig. 2 shows these steps.

Figure 2: (a) A set S of 5 points in convex position. (b) Convex hull of S. (c) A point set embedding of W4 on S with one bend in total.

θ-graph

A θ-graph consists of two vertices p and q and three vertex disjoint paths P1, P2 and P3 that connect p and q. In this section we present an algorithm that draws a θ-graph G on a set of points in general position with at most six bends in total, such that one of its cycles is drawn with straight lines. Let Γ be a drawing of a θ-graph G and v be a vertex of G; the vertex v is visible from below in Γ if the open vertical half line below v does not intersect Γ, and it is visible from above in Γ if the open vertical half line above v does not intersect Γ [1].

Theorem 3.1: Let G be a θ-graph and S be a set of n points in general position. The graph G admits a point set embedding on S in which one of the cycles is drawn with straight line segments and there are six bends in the drawing.

Proof. Let p and q be the degree-three vertices of G and let P1, P2 and P3 be the three paths connecting them in G. Let C1 be the cycle consisting of P1 and P2, and let

C2 be the cycle consisting of P2 and P3. Suppose that C2 has n2 vertices. Let S' be the set of the first n2 vertices of S from below in lexicographic order. We map p to the leftmost point in S', d. Then we map all vertices of C2 to points of S' using the algorithm in [1]. Now we need to map all vertices of P1 from p to q, except p and q, to the points of S'' = S \ S' from left to right. Suppose that q is mapped to a point c in S'. It is enough to connect d to d', the leftmost vertex of S'', and c to c', the rightmost vertex of S''. Let B' and B'' be the bounding boxes of S' and S'', respectively. In order to connect d to d', we need two bends: one is the top left corner of B' and the other is the bottom left corner of B''. For connecting c to c', we distinguish two cases:

(1) If the point c is visible from above, in order to connect c to c', we use a path with three bends located at the following positions: the projection of c on the upper edge of B', the top right corner of B', and the bottom right corner of B''.

(2) If the point c is visible from below, in order to connect c to c', we use four bends located at the following positions: the projection of c on the lower edge of B', the bottom right and top right corners of B', and the bottom right corner of B''.

Figure 3: (a) θ-graph G. (b) Straight line point set embedding of C2 on S'. (c) and (d) Point set embedding of G on S with at most 6 bends.

k-path graph

A k-path graph consists of two vertices p and q and k ≥ 3 vertex-disjoint paths P1, P2, ..., Pk that connect p and q. In this section we present an algorithm that draws a k-path graph G on a set of points in general position with at most 2k - 2 bends in total.

Theorem 4.1: Let G be a k-path graph and S be a set of n points in general position. The graph G admits a point set embedding on S with at most 2k - 2 bends in total.

Proof. Let P1, P2, P3, ..., Pk be the paths of G from p to q in counter-clockwise order around p. For 2 ≤ i ≤ k let ni be the number of vertices in Pi except p and q, and let n1 be the number of vertices in P1.

Let S1 be the set of the first n1 vertices of S from below in lexicographic order. We map p to the leftmost point in S1, h1l, and q to the rightmost point in S1, h1r. Suppose that Si is the set of the first ni vertices of S \ (S1 ∪ ... ∪ Si-1) from below in lexicographic order, for 2 ≤ i ≤ k.

We map all vertices of P1 from p to q to the points of S1 from left to right. Now we need to map all vertices of Pi from p to q, except p and q, to the points of Si from left to right. Let Bi be the bounding box of Si. It is enough to connect p to the leftmost point in Si, hil, and q to the rightmost point in Si, hir, for 2 ≤ i ≤ k. Figure 3 shows the mapping.

In order to connect p to hil, we use a path with one bend located at the following position: we draw a line with maximum positive slope from hil such that it does not cross the boxes Bj, for 1 ≤ j ≤ i; this line is called Li. We also draw a horizontal line from vertex p in box B1 along the left side; similarly, we draw a line below L1(i-1) from vertex p in box B1 at a very short distance along the left side, which is called L1i. The intersection of Li and L1i is named qi. Now we can connect vertex p to hil using the dummy vertex qi.

The rightmost point of the box Bi (hir) is connected to q in a similar way, except that the line corresponding to Li is taken with the minimum negative slope, and the line corresponding to L1i is taken along the right side. Figure 4 shows these connections.

Finally, we replace each dummy vertex with one bend. Thus we can draw a k-path graph with at most 2k - 2 bends.
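The vertex-to-point assignment used in this proof is easy to implement; the bend construction is geometric, but the partition of S into the groups S1, ..., Sk can be sketched as follows (a minimal Python sketch under our own naming and data layout, not the authors' code):

# Sketch of the point-assignment step of Theorem 4.1: split the point set,
# sorted lexicographically from below, into consecutive groups of sizes
# n1, ..., nk and map the vertices of each path P_i onto its group
# from left to right. The bends connecting p and q to the groups are omitted.

def assign_paths_to_points(points, path_sizes):
    """points: list of (x, y); path_sizes: [n1, ..., nk] vertex counts per path."""
    s = sorted(points)                   # lexicographic order (x, then y)
    groups, start = [], 0
    for n_i in path_sizes:
        groups.append(s[start:start + n_i])   # vertices of P_i, left to right
        start += n_i
    return groups

# Example: three paths with 3, 2 and 2 vertices on 7 points.
# assign_paths_to_points([(0,0), (1,3), (2,1), (3,2), (4,0), (5,2), (6,1)], [3, 2, 2])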

Figure 4: (a) Point set embedding of P1, P2 and P3. (b) Point set embedding of G on S with at most 6 bends.

Conclusions and Future Works

In this paper we studied the problem of point set embedding of wheels, θ-graphs and k-path graphs without mapping. We proved that every wheel has a point set embedding with no bends on a set of points in non-convex position. In case the points are in general position, the wheel has a point set embedding with at most one bend. For every θ-graph the number of bends in the point set embedding is at most six, and one of the cycles is drawn with no bend in the resulting drawing. Then we extended the results to k-path graphs and presented an algorithm that computes a point set embedding on a set of points in general position with at most 2k - 2 bends.

Constrained point set embedding of graphs on a set of points has recently been investigated, where it is required to draw a subgraph with straight lines and the remaining parts with a small number of bends. In further work we are going to examine constrained point set embedding with several subgraphs, in a way that the subgraphs are drawn with straight lines and the other parts with a small number of bends.

References

[1] E. Di Giacomo, W. Didimo, G. Liotta, H. Meijer, and S. Wismath, Constrained point-set embedding of planar graphs, GD'07 Proceedings, LNCS 5417 (2008), 360-371.
[2] S. Cabello, Planar embeddability of the vertices of a graph using a fixed point set is NP-hard, J. Graph Algorithms Appl. 10 (2) (2006), 353-363.
[3] N. Castaneda and J. Urrutia, Straight line embeddings of planar graphs on point sets, 8th Canadian Conference on Computational Geometry 9(6) (1996), 312-318.
[4] P. Gritzmann, B. Mohar, J. Pach, and R. Pollack, Embedding a planar triangulation with vertices at specified points, Amer. Math. Monthly 98 (2) (1991), 165-166.
[5] M. Kaufmann and R. Wiese, Embedding vertices at points: Few bends suffice for planar graphs, J. Graph Algorithms Appl. 6 (1) (2002), 115-129.
[6] E. Di Giacomo, W. Didimo, G. Liotta, H. Meijer, and S. Wismath, Point set embeddings of trees with given partial drawings, Comput. Geom. 42 (6-7) (2009), 664-676.
[7] M. Patrignani, On extending a partial straight-line drawing, Internat. J. Found. Comput. Sci. (Special issue on Graph Drawing) 17 (5) (2006), 1061-1069.


On The Pairwise Sums


Keivan Borna

Zahra Jalalian

Faculty of Mathematical Sciences and Computer

Faculty of Engineering

Kharazmi University

Kharazmi University

borna@tmu.ac.ir

jalalian@tmu.ac.ir

Abstract: The aim of this paper is to study two open problems and provide faster algorithms for them. More precisely, for two sets X and Y of numbers of size n and m we first present an O(nm) algorithm to sort the set X + Y = {x + y | x ∈ X, y ∈ Y} of pairwise sums. Then we offer another O(nm) algorithm for finding all pairs (x, y) and (x', y') from X + Y for which x + y = x' + y'. In particular, if X and Y are both of size n, this latter algorithm enables us to know when the set X + Y has n² unique elements.

Keywords: Lower Bounds; Linear Hash Function; Sorting Pairwise Sums.

Introduction

Given two sets of numbers, each of size n, how quickly can the set of all pairwise sums be sorted? In other words, given two sets X and Y, our goal is to sort the set X + Y = {x + y | x ∈ X, y ∈ Y}, cf. [3, Problem 41]. There are several motivations for the problem of finding the required number of comparisons for sorting a set when a partial order on the input set is given. Many authors, including [1, 2], described several geometric problems that are Sorting-(X+Y)-hard. It is known that there is a subquadratic-time transformation from sorting X + Y to each of the following problems: computing the Minkowski sum of two orthogonal-convex polygons, determining whether one monotone polygon can be translated to fit inside another, determining whether one convex polygon can be rotated to fit inside another, sorting the vertices of a line arrangement, or sorting the interpoint distances between n points in R^d. In addition there is an immediate application to multiplying sparse polynomials [4].

In [4] the author presented an algorithm that can sort X + Y using only 8n log n + 2n² comparisons, but the algorithm needs exponential time to choose which comparisons to perform. This exponential overhead was reduced to polynomial time by Kahn and Kim [5], and then to O(n² log n) by Lambert [6] and Steiger and Streinu [7]. These results imply that no superquadratic lower bound is possible in the full linear decision tree model. One motivation of this paper is to present an O(n²) algorithm for sorting X + Y. As a matter of fact, for two sets X, Y of numbers of size n, m, in Section 2 we present an O(nm) algorithm for sorting X + Y. The decision version of this problem is also interesting: does the set X + Y have n² unique elements? This problem, which will be discussed in Section 3, provides another motivation for this paper.

The organization of this paper is as follows. For two sets X, Y of numbers of size n, m, in Section 2 we present an O(nm) algorithm for sorting X + Y. In Section 3 our O(nm) algorithm for finding all pairs (x, y) and (x', y') from X + Y for which x + y = x' + y' is presented. Section 4 is devoted to conclusions.

Sorting X + Y

For two sorted sets X, Y of numbers of size n = SizeX, m = SizeY we first find the array Z = X + Y using an O(nm) algorithm.
Corresponding Author, P. O. Box 45195-1159, F: (+98) 26 3455-0899, T: (+98) 26 3457-9600


Data: Two sorted sets X, Y of numbers of size n = SizeX, m = SizeY
Result: The array Z = X + Y
LowAmountZ = X[0] + Y[0];
HighAmountZ = X[n-1] + Y[m-1];
AlphaZ = -LowAmountZ;
SizeZ = HighAmountZ - LowAmountZ + 1;
for i = 0, ..., SizeZ - 1 do
    Z[i] = X[0] + Y[0] - 1;
end
for i = 0, ..., n - 1 do
    for j = 0, ..., m - 1 do
        indexZ = X[i] + Y[j] + AlphaZ;
        Z[indexZ] = X[i] + Y[j];
    end
end
Algorithm 1: An O(nm) algorithm for computing the array Z = X + Y.

In this algorithm the first (smallest) value of X + Y is X[0] + Y[0] and the last (largest) value is X[n-1] + Y[m-1]. Therefore, the index of X[0] + Y[0] in X + Y will be zero. To determine the address of the other pairwise sums, we have to define a hash function to find the index of each element of the sum. Since the first pairwise sum should go to the zero index, our hash function will be h(x + y) = x + y + α, where α = -(X[0] + Y[0]). The hash function gets the pairwise sum and produces the index in X + Y where the corresponding sum should be inserted. We define an array Z and initialize it with X[0] + Y[0] - 1. Finally we have to delete the elements of X + Y whose value is equal to X[0] + Y[0] - 1 and apply a shift-to-left operation in each step. For this let c be the number of distinct elements of Z = X + Y and let S be an array of size c. The following easy algorithm fills S and thus finds the sorted set X + Y.

Data: The array Z of size c generated in Algorithm 1
Result: The sorted set S = X + Y
j = 0;
for i = 0, ..., c - 1 do
    if Z[i] >= Z[0] then
        S[j] = Z[i];
        j = j + 1;
    end
end
Algorithm 2: The algorithm for obtaining the elements of the set Z generated as an array in Algorithm 1.

For example if X = {-27, 9, 42} and Y = {-28, -17, 15, 16}, the array Z = X + Y will be as follows:

Z[0] = -55, Z[11] = -44, Z[36] = -19, Z[43] = -12, Z[44] = -11, Z[47] = -8, Z[69] = 14, Z[79] = 24, Z[80] = 25, Z[112] = 57, Z[113] = 58

Furthermore, S as the set representation of Z is

S = {-55, -44, -19, -12, -11, -8, 14, 24, 25, 57, 58}.

Our discussions in this section prove the following theorem.

Theorem 1: For two sets X, Y of numbers of size n and m, using Algorithms 1 and 2 one can sort X + Y in O(nm).
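A minimal Python sketch of Algorithms 1 and 2 (our illustration, not the authors' code), assuming integer inputs whose value range is not much larger than nm, since the array Z has one cell per possible sum value:

# Sketch of Algorithms 1 and 2: hash every pairwise sum into an array
# indexed by sum value, then keep the occupied cells in index order.

def sort_pairwise_sums(X, Y):
    """X and Y are sorted lists of integers; returns the sorted set X + Y."""
    low = X[0] + Y[0]                   # smallest pairwise sum
    high = X[-1] + Y[-1]                # largest pairwise sum
    alpha = -low                        # hash offset: h(s) = s + alpha
    empty = low - 1                     # sentinel meaning "no sum hashed here"
    Z = [empty] * (high - low + 1)      # Algorithm 1: fill the hash array
    for x in X:
        for y in Y:
            Z[x + y + alpha] = x + y
    # Algorithm 2: keep only occupied cells; they appear in increasing order.
    return [z for z in Z if z != empty]

# Example from the text:
# sort_pairwise_sums([-27, 9, 42], [-28, -17, 15, 16])
# -> [-55, -44, -19, -12, -11, -8, 14, 24, 25, 57, 58]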
When x + y = x' + y'?

For a while assume that each of X and Y is of the same size n. The following problem is related to sorting the set X + Y:

Does the set X + Y have n² unique elements?

This problem is essentially equivalent to the following problem:

For which x, x' ∈ X and y, y' ∈ Y do we have x + y = x' + y'?

In this section we present a new algorithm to find all pairs with equal sum amount. We first create two arrays Z1, Z2 with the same size (X[n-1] + Y[n-1]) - (X[0] + Y[0]) + 1. Then both arrays Z1 and Z2 are initialized with the amount X[0] + Y[0] - 1. For each pairwise sum from X + Y this algorithm uses a hash function to place the sum value into the corresponding address in Z1 and puts the index of x in the set X at the same address of Z2. But before the sum value is located in its cell, the algorithm checks the cell's content, and if this cell's amount is greater than X[0] + Y[0] - 1, it means that two pairwise sums with equal sum amount have been found.


Data: Two sorted sets X, Y of numbers of size n = SizeX, m = SizeY
Result: All pairs (x, y) and (x', y') from X + Y for which x + y = x' + y'
1. LowAmountZ = X[0] + Y[0];
2. HighAmountZ = X[n-1] + Y[m-1];
3. AlphaZ = -LowAmountZ;
4. SizeZ = HighAmountZ - LowAmountZ + 1;
5. for i = 0, ..., SizeZ - 1 do
    5.1. Z1[i] = X[0] + Y[0] - 1;
    5.2. Z2[i] = -1;
end
6. for i = 0, ..., n - 1 do
    7. for j = 0, ..., m - 1 do
        7.1. indexZ = X[i] + Y[j] + AlphaZ;
        7.2. if Z1[indexZ] > (X[0] + Y[0] - 1) then
            7.2.1. print X[i] + Y[j];
            7.2.2. print X[Z2[indexZ]] + (Z1[indexZ] - X[Z2[indexZ]]);
        end
        7.3. Z1[indexZ] = X[i] + Y[j];
        7.4. Z2[indexZ] = i;
    end
end
Algorithm 3: An O(nm) algorithm for finding all pairs (x, y) and (x', y') from X + Y for which x + y = x' + y'.

In lines 1 and 2, we first determine the minimum and maximum amounts of the set Z := X + Y (and call them LowAmount and HighAmount) and assign HighAmountZ - LowAmountZ + 1 to its size. Then in line 3 we establish a constant for the hash function. In commands 5.1 and 5.2, Z1 and Z2 are initialized with the amounts X[0] + Y[0] - 1 and -1 respectively, because Z2 will be used to keep the x address of each pairwise sum. In line 7.1 we sum X[i] and Y[j] and put the value in Z1. But first, using our hash function, it appoints the cell address of Z at which the sum X[i] + Y[j] has to be placed. Then it compares the content of the obtained address with the value X[0] + Y[0] - 1. If the cell's content is equal to X[0] + Y[0] - 1, it means that there is no such sum in X + Y yet. Otherwise the pairwise sum amount goes to Z1[indexZ] and its x index, which is i, is located at Z2[indexZ], in lines 7.3 and 7.4 respectively. In fact, once the cell's content is not equal to X[0] + Y[0] - 1, it means that there is another x + y sum with the same value, therefore we have to find x' and y'. For this reason, the algorithm goes to Z2[indexZ], which has the address of x', and then with the obtained index it can retrieve x'. Since y' = (x + y) - x', Z1[indexZ] - X[Z2[indexZ]] will be the value of y'.

For example if X = {-17, -13, -12, 5, 19} and Y = {-9, -6, -2, 7, 11, 16}, the output of our algorithm is as follows:

-13 + (-6) = -17 + (-2)
-13 + 7 = -17 + 11
-12 + 11 = -17 + 16
5 + (-6) = -12 + 11
5 + (-2) = -13 + 16

Since the complexity of Algorithm 3 is obviously Θ(mn), we obtain the following theorem.

Theorem 2: For two sets X, Y of numbers of size n and m, Algorithm 3 will find all pairs (x, y) and (x', y') from X + Y for which x + y = x' + y' in O(nm).
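A minimal Python sketch of Algorithm 3 (our illustration, not the authors' code), under the same integer-input assumption:

# Sketch of Algorithm 3: Z1 stores the last sum hashed to a cell,
# Z2 stores the index in X of its x component; a collision means x + y = x' + y'.

def report_equal_sums(X, Y):
    """X and Y are sorted lists of integers; prints every equal-sum collision."""
    low, high = X[0] + Y[0], X[-1] + Y[-1]
    alpha, empty = -low, low - 1
    Z1 = [empty] * (high - low + 1)     # step 5.1: sum values
    Z2 = [-1] * (high - low + 1)        # step 5.2: index of x in X
    for i, x in enumerate(X):           # steps 6-7
        for y in Y:
            idx = x + y + alpha         # step 7.1: hash of the sum
            if Z1[idx] > empty:         # step 7.2: cell already occupied
                x_prev = X[Z2[idx]]
                y_prev = Z1[idx] - x_prev
                print(f"{x} + {y} = {x_prev} + {y_prev}")
            Z1[idx] = x + y             # step 7.3
            Z2[idx] = i                 # step 7.4

# Example from the text:
# report_equal_sums([-17, -13, -12, 5, 19], [-9, -6, -2, 7, 11, 16])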

Discussion and Future Works

For two sets X and Y of numbers of size n and m we presented an O(nm) algorithm for sorting the set X + Y = {x + y | x ∈ X, y ∈ Y} of pairwise sums. We also presented another O(nm) algorithm for finding all pairs (x, y) and (x', y') from X + Y for which x + y = x' + y'. Constructing faster algorithms for these problems is the subject of further progress.

References

[1] A. Hernandez, Finding an o(n² log n) algorithm is sometimes hard, Proc. 8th Canad. Conf. Comput. Geom., Carleton University Press, Ottawa, Canada (1996), 289-294.
[2] G. Barequet and S. Har-Peled, Polygon containment and translational min-Hausdorff-distance between segment sets are 3SUM-hard, Internat. J. Comput. Geom. Appl. 11 (2001), 465-474.
[3] E. D. Demaine, J. S. B. Mitchell, and J. O'Rourke, The Open Problems Project (2010), 191.
[4] M. L. Fredman, How good is the information theory bound in sorting?, Theoret. Comput. Sci. 1 (1976), 355-361.
[5] J. Kahn and J. Han Kim, Entropy and sorting, J. Comput. Sys. Sci. 51 (1995), 390-399.
[6] J. L. Lambert, Sorting the sums (xi + yj) in O(n²) comparisons, Theoret. Comput. Sci. 103 (1992), 137-141.
[7] W. Steiger and I. Streinu, A pseudo-algorithmic separation of lines from pseudo-lines, Inform. Process. Lett. 53 (1995), 295-299.
[8] J. Erickson, Lower bounds for fundamental geometric problems, PhD thesis, University of California at Berkeley, 1996.


Hyperbolic Voronoi Diagram: A Fast Method


Zahra Nilforoushan

Ali Mohades

Department of Computer Engineering

Faculty of Mathematics and Computer Science

Kharazmi University, Tehran, Iran

Amirkabir University of Technology, Tehran, Iran

shadi.nilforoushan@gmail.com

mohades@aut.ac.ir

Amin Gheibi

Sina Khakabi

School of Computer Science

School of Computing Science

Carleton University, Ottawa, Canada

Simon Fraser University, Burnaby, BC, Canada

amin-gheibi@carleton.ca

sinakhm.cs84@aut.ac.ir

Abstract: Voronoi diagrams have useful applications in various fields and are one of the most
fundamental concepts in computational geometry. Although Voronoi diagrams in the plane have
been studied extensively, using different notions of sites and metrics, little is known for other
geometric spaces. In this paper, we present a simple method to construct the Voronoi diagram of
a set of points in the Poincare hyperbolic disk, which is a 2-dimensional manifold with negative
curvature. Our trick is to define and use some well-formed geometric maps which take care of
connection between the Euclidean plane and Poincare hyperbolic disk. Finally, we give a brief
report of our implementation.

Keywords: Computational geometry, Hyperbolic space, Geodesic, Voronoi diagrams.

Introduction

Voronoi diagrams for point-sets in d-dimensional Euclidean space E^d have been studied by a number of people in their original as well as in generalized settings. For a finite set M ⊂ E^d, the (closest-point) Voronoi diagram of M associates each p ∈ M with the convex region R(p) of all points closer to p than to any other point in M. More formally, R(p) = {x ∈ E^d | d(x, p) < d(x, q), ∀q ∈ M - p}, where d denotes the Euclidean distance function. Voronoi diagrams are of importance in a variety of areas other than computer science whose enumeration exceeds the scope of this paper (see for instance Aurenhammer's survey [3] or the book by Okabe, Boots, Sugihara and Chiu [18]). Shamos and Hoey [21] were the first to introduce the planar diagram to computational geometry and also demonstrated how to construct it efficiently. Using a dual correspondence to convex hulls discovered by Brown [5], its higher-dimensional analogues can be obtained using methods in Seidel [20].

As the variety of applications of the Voronoi diagram were recognized, people soon became aware of the fact that many practical situations are better described by some modification than by the original diagram. For example, diagrams under more general metrics [15, 16], for more general objects than points [9, 13], and of higher order [10, 14, 21] have been investigated.

The interesting properties of Voronoi diagrams attracted our attention to a natural question: whether they are satisfied in other spaces, especially on hyperbolic surfaces. Hyperbolic surfaces are characterized by negative curvature, and cosmologists have suffered from a persistent misconception that a negatively curved universe must be the finite 3-D hyperbolic space [23]. Although we do not see hyperbolic surfaces around us,
Corresponding Author, Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran,
Iran, T: (+98) 26 34550002
Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran.


nature nevertheless does possess a few. For example, lettuce leaves and marine flatworms exhibit hyperbolic geometry. There is an interesting idea about the hyperbolic plane by W. P. Thurston: if we move away from a point in the hyperbolic plane, the space around that point expands exponentially [22]. Hyperbolic geometry has found applications in fields of mathematics, physics, and engineering. For example in physics, until we figure out whether or not the expansion of the universe is decelerating, hyperbolic geometry could be the most accurate way to define the geometries of fields. Einstein invented his special theory of relativity based on hyperbolic geometry.

Now we switch to some applications of the Voronoi diagram in hyperbolic spaces. In [19] the authors deal with the Voronoi diagram in simply connected complete manifolds with non-positive curvature, called Hadamard manifolds. They proved that the facets of the Voronoi diagram can be characterized by the hyperbolic Voronoi diagram. They considered that these Voronoi diagrams and their dual structure, the Delaunay triangulation, can be used for mesh generation, computer graphics and color space [6]. Another application of the Voronoi diagram in hyperbolic models is triangulating a saddle surface, which is a part of the triangulation of a general surface. On a general surface, some parts have positive curvature, other parts have negative curvature and other parts are near zero. In such cases, one can divide the surface into some parts and make a triangulation of each part according to its curvature.

A further application of the Voronoi diagram in hyperbolic spaces is devoted to the Farey tessellation, which is studied in [1]. The Teichmüller space for T² is the hyperbolic plane H² = {z = x + iy ∈ C | y > 0}: T²_z can be thought of as the quotient space of R² over the lattice {m.1 + n.z | m, n ∈ Z} ⊂ C. Let X ⊂ H² be the set of all parameters z corresponding to the tori with three equally short shortest geodesics (i.e., tori glued from a regular hexagon). Then the Farey tessellation is nothing but the Voronoi diagram of H² with respect to X.

Such applications motivated us to study Voronoi diagrams on hyperbolic spaces. In [17], the first two authors of this paper have studied the Voronoi diagram in the Poincare hyperbolic disk, where the running time of the proposed algorithm was O(n²). In this paper, we present a new method to compute the Voronoi diagram in the Poincare hyperbolic disk whose expected worst case running time is O(n log n).

This paper is organized as follows. In Section 2, a brief introduction to the Poincare hyperbolic disk is given. Section 3 briefly reports the required maps we used to transfer the Poincare hyperbolic disk to the Euclidean plane R², compute the Voronoi diagram in R² and then transfer it back. Section 4 is devoted to some implementations.

Poincare hyperbolic disk

The Poincare hyperbolic disk is a two-dimensional model for hyperbolic geometry. Therefore it has negative curvature and is defined as the disk D² = {(x, y) ∈ R² | x² + y² < 1}, with hyperbolic metric ds² = (dx² + dy²)/(1 - x² - y²)². See [2] and [12] for details.

The Poincare disk is a model for hyperbolic geometry in which a geodesic (which is like a line in Euclidean geometry) is represented as an arc of a circle whose ends are perpendicular to the disk's boundary (diameters are also permitted). Two arcs which do not meet correspond to parallel rays, arcs which meet orthogonally correspond to perpendicular lines, and arcs which meet on the boundary are a pair of limit rays (see Fig. 1).

Figure 1: Poincare disk and some of its geodesics

The equation of a geodesic of D² is expressed as either

x² + y² - 2ax - 2by + 1 = 0, with a² + b² > 1,

or

ax = by.

Geodesics are basic building blocks for computational geometry on the Poincare disk. The distance of two points is naturally induced from the metric of D²; considering two points z1(x1, y1), z2(x2, y2) ∈ D², the distance between z1 and z2, denoted by d(z1, z2), can be expressed as

d(z1, z2) = ∫ ds  (over the geodesic connecting z1 and z2)  = tanh⁻¹( |z2 - z1| / |1 - z̄1 z2| ).
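This closed form is easy to evaluate numerically; a minimal Python sketch (our illustration, not the authors' code), treating points of D² as complex numbers:

# Sketch: Poincare-disk distance between two sites, following
# d(z1, z2) = arctanh(|z2 - z1| / |1 - conj(z1) * z2|).
import math

def poincare_distance(z1: complex, z2: complex) -> float:
    """Both arguments must lie strictly inside the unit disk."""
    w = abs(z2 - z1) / abs(1 - z1.conjugate() * z2)
    return math.atanh(w)

# Example: two sites near the boundary are much farther apart hyperbolically
# than their Euclidean distance suggests.
# print(poincare_distance(0.9 + 0j, 0.9j))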


Our method

Suppose we are given a set S of n points (representing sites) in D². To construct the Voronoi diagram, we use a combination of four maps to transfer these sites into the Euclidean plane. The maps are defined between four hyperbolic models and the Euclidean plane, denoted by D², S², K², H² and R², respectively. In [7], Cannon et al. have an elegant discussion about these hyperbolic models:

1. D² = {(x, y) : x² + y² < 1}, with ds² = (dx² + dy²)/(1 - x² - y²)²
2. S² = {(x, y, z) : x² + y² + z² = 1, z > 0}, with ds² = (dx² + dy² + dz²)/z²
3. K² = {(x, y) : x² + y² < 1}, with ds² = 4(dx² + dy²)/(1 - x² - y²)²
4. H² = {(x, y, z) : z² - x² - y² = 1, z > 0}, with ds² = dx² + dy² - dz².

The list of maps that we defined and used is given in the following:

(a) A central projection map from the point (0, 0, -1), f1 : D² → S², with
(x, y) ↦ ( 2x/(1 + x² + y²), 2y/(1 + x² + y²), (1 - x² - y²)/(1 + x² + y²) ).

(b) A lifting map f2 : S² → K², with (x, y, z) ↦ (x, y, 1).

(c) A central projection map from the point (0, 0, 0), f3 : K² → H², with
(x, y, 1) ↦ ( x/√(1 - x² - y²), y/√(1 - x² - y²), 1/√(1 - x² - y²) ).

(d) A central projection map from the point (0, 0, 2), f4 : H² → R², with
(x, y, z) ↦ ( 2x/(z - 2), 2y/(z - 2) ).

Fig. 2 is an illustration of the above mentioned spaces and the connecting maps.

Figure 2: An illustration of the combination maps between D² and R²

Now by using any algorithm in [4] for constructing the Voronoi diagram of the transferred sites in R², which has worst case running time complexity O(n log n), the combination of the inverses of the fi allows us to obtain the Voronoi diagram in D². This combination is robust, as the subsequent theorem verifies.

Theorem 1: Let z1 and z2 be two points in R² and J be their bisector. Then f(J) is the bisector of f(z1) and f(z2) in D², where f = f1⁻¹ ∘ f2⁻¹ ∘ f3⁻¹ ∘ f4⁻¹ and the fi (i = 1, 2, 3, 4) are the above mentioned maps.

Proof: Since we use the geodesics in each hyperbolic model and in the Euclidean plane R², by using the corresponding metrics ds² we obtain that the bisector of two given points z1 and z2 in R² will be mapped to the bisector of f(z1) and f(z2) in D², and vice-versa.

As the complexity of the mentioned maps is linear, we conclude that the complexity of our method to compute the Voronoi diagram of a set of sites in D² is O(n log n), using any algorithm with complexity O(n log n) to compute the Voronoi diagram in R² for the sites transferred from D². This yields the following consequence.

The hyperbolic Voronoi diagram can be constructed with an O(n log n) time complexity algorithm.

Implementation

In this section we present our implementation, and discuss its performance in a series of experiments designed to test different aspects of our algorithm and implementation. Our code has been written in C++, and for visualization we have used MATLAB. Our implementation with C++ has three main steps: in the first


step we transfer our points (sites) from the Poincare disk to R². In this step the program reads the coordinates of the points from a file and then uses some methods and functions to transfer them to R². In the second step we work on the transferred points, use Fortune's algorithm and draw the Voronoi diagram of the points. Source code of Fortune's algorithm is available in [11, 24]. The output is the end points of the Voronoi edges in R². In the third step we transfer the end points back to the Poincare disk, using the inverses of the maps defined in the first step. The output is the end points of the Voronoi edges in the Poincare disk. Since we have the formula for a geodesic in the Poincare disk, we can draw the Voronoi edges easily.
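The three steps can also be prototyped quickly outside C++; the sketch below uses scipy's Voronoi routine in place of Fortune's algorithm, and the functions to_euclidean / to_poincare stand for the transfer maps of Section 3 (hypothetical helper names, not part of the paper's code):

# Sketch of the three-step pipeline: map sites out of the Poincare disk,
# build a Euclidean Voronoi diagram, and map the resulting vertices back.
import numpy as np
from scipy.spatial import Voronoi

def hyperbolic_voronoi(sites_in_disk, to_euclidean, to_poincare):
    """sites_in_disk: (n, 2) array of points with x^2 + y^2 < 1;
    to_euclidean / to_poincare: the forward and inverse transfer maps."""
    euclidean_sites = np.array([to_euclidean(p) for p in sites_in_disk])  # step 1
    vor = Voronoi(euclidean_sites)                                        # step 2
    disk_vertices = np.array([to_poincare(v) for v in vor.vertices])      # step 3
    # Edges between the returned vertices are then drawn as geodesic arcs of D^2.
    return disk_vertices, vor.ridge_vertices

# Usage (assuming the two maps are implemented):
# verts, edges = hyperbolic_voronoi(sites, to_euclidean, to_poincare)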
We have used Visual C++ in Microsoft Visual Studio.NET 2005 with .NET Framework 2.0 and MATLAB Ra 2006. All experiments were run on an ASUS
Notebook Z53 j series with 2.0 GHz core 2 duo CPU
and 2 GB DDR2 RAM.
In Fig. 3 the result of our implemented method for five random sites is given.

Figure 3: Result for the nine random sites in R² and D²

Acknowledgments

We would like to thank Professor Dr. R. Klein for reading the first version of the manuscript.

References

[1] S. Anisov, Geometrical spines of lens manifolds, Department of Mathematics, Utrecht University, 2005.
[2] J. W. Anderson, Hyperbolic Geometry, New York, Springer-Verlag, 1999.
[3] F. Aurenhammer, Voronoi Diagrams: a Survey of a Fundamental Geometric Data Structure, ACM Computing Surveys 23(3) (1991), 345-405.
[4] F. Aurenhammer and R. Klein, Voronoi Diagrams, in Handbook of Computational Geometry, J. Sack and G. Urrutia, editors, Elsevier Science Publishers, B.V. North-Holland, Chapter 5, pages 201-290, 2000.
[5] K. Q. Brown, Voronoi diagrams from convex hulls, Inform. Process. Lett. 9 (1979), 223-228.
[6] H. Brettel, F. Vienat, and J. D. Mollonl, Computerized simulation of color appearance for dichromats, Journal of the Optical Society of America 14(10) (1997), 2647-2655.
[7] J. W. Cannon, W. J. Floyd, R. Kenyon, and W. R. Parry, Flavors of Geometry, MSRI Publications 31 (1997), 59-115.
[8] The CGAL User and Reference Manual: All Parts, Release 3.3, 2007.
[9] R. L. Drysdale and D. T. Lee, Generalization of Voronoi diagrams in the plane, SIAM J. Comput. 10 (1981), 73-87.
[10] H. Edelsbrunner, J. O'Rourke, and R. Seidel, Constructing arrangements of lines and hyperplanes with applications, Proc. 20th Ann. IEEE Symp. FOCS (1983), 83-91.
[11] S. Fortune, http://cm.bell-labs.com/who/sjf/index.html.
[12] C. Goodman-Strauss, Compass and Straightedge in the Poincare Disk, Amer. Math. Monthly 108 (2001), 33-49.
[13] D. G. Kirkpatrick, Efficient computation of continuous skeletons, Proc. 20th Ann. IEEE Symp. FOCS (1979), 18-27.
[14] D. T. Lee, On k-nearest neighbor Voronoi diagrams in the plane, IEEE Trans. Comp. C-31, 6 (1982), 478-487.
[15] D. T. Lee, Two-dimensional Voronoi diagrams in the Lp metric, JACM 27(4) (1980), 604-618.
[16] D. T. Lee and C. K. Wong, Voronoi diagrams in L1 (L∞) metrics with two dimensional storage applications, SIAM J. Comput. 9 (1980), 200-211.
[17] Z. Nilforoushan and A. Mohades, Hyperbolic Voronoi Diagram, ICCSA 2006, LNCS 3984 (2006), 735-742.
[18] A. Okabe, B. Boots, K. Sugihara, and N. Chiu, Spatial tessellations: concepts and applications of Voronoi diagrams, Wiley Series in Probability and Statistics, 2000.
[19] K. Onishi and J. Itoh, Voronoi diagram in simply connected complete manifold, IEICE Trans. Fundamentals E85-A, 5 (2002), 944-948.
[20] R. Seidel, A convex hull algorithm optimal for point sets in even dimensions, M.S. thesis, Rep. 81-14, Dep. Computer Science, Univ. of British Columbia, 1981.
[21] M. I. Shamos and D. Hoey, Closest-Point Problems, Proceedings 16th IEEE Symposium on Foundations of Computer Science (1975), 151-162.
[22] W. P. Thurston, Three dimensional Geometry and Topology, Princeton University Press, 1997.
[23] J. R. Weeks, The Shape of Space, CRC, 2nd edition, 2001.
[24] Voronoi Resources, http://www.skynet.ie/~sos/mapviewer/voronoi.php.


Solving Systems of Nonlinear Equations Using The Cuckoo


Optimization Algorithm
Mahdi Abdollahi

Shahriar Lotfi

Aras International Campus. University of Tabriz

University of Tabriz

Department of Computer Sciences

Department of Computer Sciences

m.abdollahi89@ms.tabrizu.ac.ir

shahriar lotfi@tabrizu.ac.ir

Davoud Abdollahi
University of Tabriz
Department of Mathematics
d abdollahi@tabrizu.ac.ir

Abstract: Systems of nonlinear equations arise in a diverse range of sciences such as economics, engineering, chemistry, mechanics, medicine and robotics. For solving systems of nonlinear equations there are several methods, such as Newton-type methods, the Particle Swarm algorithm (PSO) and the Conjugate Direction method (CD), each of which has its own strengths and weaknesses. The most widely used algorithms are Newton-type methods, though their convergence and effective performance can be highly sensitive to the initial guess of the solution supplied to the method. This paper introduces a novel evolutionary algorithm called the Cuckoo Optimization Algorithm, and some well-known problems are presented to demonstrate the efficiency and better performance of this new robust optimization algorithm. In most instances the solutions have been significantly improved, which proves its capability to deal with difficult optimization problems.

Keywords: Systems of Nonlinear Equations; Optimization; Cuckoo Optimization Algorithms; Evolutionary Algorithm.

Introduction

Solving systems of nonlinear equations has always been important in science. Most scientific problems are related to systems of nonlinear equations. As you know, there are two types of systems of equations: the first type is linear and the second type is called nonlinear. There are several methods for the first type, but there are few methods for the second type, and the solution often comes only approximately.

So far, several methods have been presented for solving systems of nonlinear equations. Existing methods have tried to solve such problems in less time and with higher accuracy. The genetic algorithm is used in [1] and the particle swarm algorithm has been improved in [3] for solving systems of nonlinear equations. Among mathematical methods we can point to the Filled Function methods [4].

In this paper, we introduce the Cuckoo Optimization Algorithm (COA) for solving systems of nonlinear equations. The results of the cuckoo optimization algorithm are compared with other methods found in [1], [3] and [4] to illustrate the power and high efficiency of this algorithm.

In Section 2, we briefly overview the COA. In Section 3, how to apply the cuckoo algorithm for solving systems of nonlinear equations is explained. In Section 4, the obtained numerical results are presented as a comparison, and finally in Section 5 we give the conclusions and future works.

Corresponding Author, P. O. Box 51586-49456, F: (+98) 411 669-6012, T: (+98) 914 116-2612


The Cuckoo Optimization Algorithm (COA)

Like other evolutionary algorithms, the proposed algorithm starts with an initial population of cuckoos. These initial cuckoos have some eggs to lay in some host birds' nests. Some of these eggs, which are more similar to the host bird's eggs, have the opportunity to grow up and become a mature cuckoo. Other eggs are detected by the host birds and are killed. The grown eggs reveal the suitability of the nests in that area. The more eggs survive in an area, the more profit is gained in that area. So the position in which more eggs survive is the term that COA is going to optimize [2].

Each cuckoo starts laying eggs randomly in some other host birds' nests within her ELR. After the egg laying process, p% of all eggs (usually 10%), with lower profit values, will be killed.

When young cuckoos grow and become mature, they immigrate to new and better habitats with more similarity of eggs to the host birds and also with more food for new youngsters. To recognize which cuckoo belongs to which group, the K-means clustering method is used (a k of 3-5 seems to be sufficient in simulations).

When each cuckoo moves toward the goal point, it only flies a part of the way and also has a deviation. Each cuckoo only flies λ% of the distance toward the goal habitat and also has a deviation of φ radians. For each cuckoo, λ and φ are defined as follows:

λ ~ U(0, 1),   φ ~ U(-π/6, π/6)    (4)

Due to the fact that there is always equilibrium in the birds' population, a number Nmax controls and limits the maximum number of live cuckoos in the environment.

After some iterations, the whole cuckoo population moves to one best habitat with maximum similarity of eggs to the host birds and also with the maximum food resources. There will be the least egg losses in this best habitat. Convergence of more than 95% of all cuckoos to the same habitat puts an end to the Cuckoo Optimization Algorithm (COA).

To start the optimization algorithm, a candidate habitat matrix of size Npop × Nvar is generated. Then some randomly produced number of eggs is supposed for each of these initial cuckoo habitats. In nature, each cuckoo lays from 5 to 20 eggs. These values are used as the upper and lower limits of egg dedication to each cuckoo at different iterations. Another habit of real cuckoos is that they lay eggs within a maximum distance from their habitat. From now on, this maximum range will be called the Egg Laying Radius (ELR). In an optimization problem with upper limit varhi and lower limit varlow for the variables, each cuckoo has an egg laying radius (ELR) which is proportional to the total number of eggs, the number of the current cuckoo's eggs and also the variable limits varhi and varlow. So ELR is defined as:

ELR = α × (Number of current cuckoo's eggs / Total number of eggs) × (varhi - varlow)    (3)

where α is an integer, supposed to handle the maximum value of ELR.

Solving Systems of Nonlinear Equations With COA

Let the form of systems of nonlinear equations be:

f1(x1, x2, ..., xn) = 0
f2(x1, x2, ..., xn) = 0
...
fn(x1, x2, ..., xn) = 0    (1)

In order to transform (1) into an optimization problem, we use the auxiliary function

min f(x) = Σ_{i=1}^{n} fi²(habitat),    (2)

habitat = (x1, x2, ..., xn).

In order to solve systems of nonlinear equations, it is necessary that the values of the problem variables be formed as an array. In GA and PSO terminologies this array is called "Chromosome" and "Particle Position", respectively, but here in COA it is called "habitat". In an Nvar-dimensional optimization problem, a habitat is an array of size 1 × Nvar, representing the current living position of a cuckoo. The profit of a habitat is obtained by evaluating the fitness function f(x) in equation (2). We should mention that COA maximizes a profit function; to use COA in cost minimization problems, one can simply multiply the profit function by minus one.
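A minimal Python sketch of this transformation (our illustration, not the authors' code); the system is passed as a list of residual functions and the habitat is an ordinary vector:

# Sketch of equation (2): turn a system f_i(x) = 0 into a scalar cost
# that an optimizer such as COA can minimize (or, negated, maximize as profit).
import math

def make_cost(residuals):
    """residuals: list of callables, each taking the habitat vector and returning f_i(x)."""
    def cost(habitat):
        return sum(f(habitat) ** 2 for f in residuals)   # sum of squared residuals
    return cost

# Example with the two-equation system of Problem 2 below:
cost = make_cost([
    lambda x: math.exp(x[0]) + x[0] * x[1] - 1,          # f1(x1, x2)
    lambda x: math.sin(x[0] * x[1]) + x[0] + x[1] - 1,   # f2(x1, x2)
])
# print(cost([0.0, 1.0]))   # = 0.0 at the root (x1, x2) = (0, 1)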
Evaluation and Experimental Results

In this section, the results of applying this algorithm to solve the following problems are offered:


Problem 1 [1]:

cos(2x1) - cos(2x2) - 0.4 = 0
2(x2 - x1) + sin(2x2) - sin(2x1) - 1.2 = 0

-2 ≤ x1 ≤ 2, -2 ≤ x2 ≤ 2

Problem 2 [1]:

f1(x1, x2) = e^{x1} + x1 x2 - 1 = 0
f2(x1, x2) = sin(x1 x2) + x1 + x2 - 1 = 0

-2 ≤ x1 ≤ 2, -2 ≤ x2 ≤ 2

Problem 3 [1]:

0 = x1 - 0.25428722 - 0.18324757 x4 x3 x9
0 = x2 - 0.37842197 - 0.16275449 x1 x10 x6
0 = x3 - 0.27162577 - 0.16955071 x1 x2 x10
0 = x4 - 0.19807914 - 0.15585316 x7 x1 x6
0 = x5 - 0.44166728 - 0.19950920 x7 x6 x3
0 = x6 - 0.14654113 - 0.18922793 x8 x5 x10
0 = x7 - 0.42937161 - 0.21180486 x2 x5 x8
0 = x8 - 0.07056438 - 0.17081208 x1 x7 x6
0 = x9 - 0.34504906 - 0.19612740 x10 x6 x8
0 = x10 - 0.42651102 - 0.21466544 x4 x8 x1

-10 ≤ xi ≤ 10, i = 1 to 10

Problem 4 [3]:

x1³ - 3 x1 x2² - 1 = 0
3 x1² x2 - x2³ + 1 = 0

-1 ≤ x1 ≤ 2, -1 ≤ x2 ≤ 2

Problem 5 (Neurophysiology Application) [1]:

x1² + x3² = 1
x2² + x4² = 1
x5 x3³ + x6 x4³ = 0
x5 x1³ + x6 x2³ = 0
x5 x1 x3² + x6 x4² x2 = 0
x5 x1² x3 + x6 x2² x4 = 0

|xi| ≤ 10

Problem 6 [4]:

0.5 sin(x1 x2) - 0.25 x2/π - 0.5 x1 = 0
(1 - 0.25/π)(exp(2x1) - e) + e x2/π - 2 e x1 = 0

0.25 ≤ x1 ≤ 1, 1.5 ≤ x2 ≤ 2π

The parameters used in COA for the problems are listed in Table 1.

Table 1: Used parameters in the cuckoo algorithm for the problems (columns P1-P6)

Initial pop.: 20, 40, 5, 5, 5, 5
Range of eggs for each cuckoo: [2, 4] for all problems
Number of iterations: 150, 200, 300, 30, 300, 300
Maximum of cuckoos: 250, 300, 500, 1000, 1000, 300
Number of clusters / Egg laying radius: 50 for all problems
Pop. variance: 0.001 for all problems

Table 2 shows the solution obtained for Problem 1. To study the other compared methods in Table 2, see [1], [5], [6] and [7].

Table 2: Results for Problem 1

Method       | (x1, x2)           | (f1, f2)
Newton's     | (0.15, 0.49)       | (-0.00168, 0.01497)
Secant       | (0.15, 0.49)       | (-0.00168, 0.1497)
Broyden's    | (0.15, 0.49)       | (-0.00168, 0.1497)
Effati's     | (0.1575, 0.4970)   | (0.005455, 0.00739)
Evolutionary | (0.15772, 0.49458) | (0.001264, 0.000969)
COA          | (0.1563, 0.4931)   | (-3.2559e-004, 1.2562e-006)

Table 3 shows the solution obtained for Problem 2.

Table 3: Results for Problem 2

Method       | (x1, x2)            | (f1, f2)
Effati's     | (0.0096, 0.9976)    | (0.019223, 0.016776)
Evolutionary | (-0.00138, 1.0027)  | (-0.00276, -0.0000637)
COA          | (-0.00003, 1.00009) | (-0.0000745, 0.0000174)

As is visible in Tables 2 and 3, the solutions obtained for both Problems 1 and 2 are better and have higher accuracy than the solutions obtained by the other methods. Table 4 shows the solution obtained for Problem 3.

Table 4: Results for Problem 3

Evolutionary: (x1, ..., x10) = (0.1224819761, 0.1826200685, 0.2356779803, -0.0371150470, 0.3748181856, 0.2213311341, 0.0697813035, 0.0768058043, -0.0312153867, 0.1452667120); (f1, ..., f10) = (0.1318552790, 0.1964428361, 0.0364987069, 0.2354890155, 0.0675753064, 0.0739986588, 0.3607038292, 0.0059182979, 0.3767487763, 0.2811693568)

COA: (x1, ..., x10) = (0.2482000000, 0.3869000000, 0.2772000000, 0.1908000000, 0.4453000000, 0.1487000000, 0.4266000000, 0.0647000000, 0.3467000000, 0.4119000000); (f1, ..., f10) = (-0.0094474086, 0.0060038145, -0.0011322079, -0.0097329967, 0.0001244906, -0.0000867383, -0.0051325862, -0.0085537600, 0.0008737175, -0.0152687483)

Table 5 shows the solution obtained for Problem 4.

Table 5: Results for Problem 4

Method | (x1, x2)                              | (f1, f2)
PSO    | (1.08421508149135, -0.29051455550725) | (-9.99200722162e-016, 6.77236045021e-015)
COA    | (1.08421508149135, -0.29051455550725) | (-9.99200722162e-016, 6.77236045021e-015)

The solution obtained for Problem 5 is seen in Table


6, and Figure 1 shows the convergence graph. As Table 6 shows, the results of COA are better than those of the evolutionary algorithm.

Table 6: Results for Problem 5

Evolutionary: (x1, ..., x6) = (-0.8078668904, -0.9560562726, 0.5850998782, -0.2219439027, 0.0620152964, -0.0057942792); (f1, ..., f6) = (0.0050092197, 0.0366973076, 0.0124852708, 0.0276342907, 0.0168784849, 0.0248569233)

COA: (x1, ..., x6) = (-1.0000000000, -1.0000000000, -0.0137000000, -0.0138000000, 0.5209000000, -0.5207000000); (f1, ..., f6) = (1.8769e-004, 1.9044e-004, 2.9019e-008, -2.0000e-004, 1.3944e-006, 4.9330e-005)

Figure 1: The convergence chart of Problem 5 (current cost 1.1393e-007 at iteration 300)

Results obtained for Problem 6 are in Table 7.

Table 7: Results for Problem 6

Method          | (x1, x2)                 | (f1, f2)
Filled Function | (0.50043285, 3.14186317) | (-0.00023852, 0.00014159)
COA             | (0.29930000, 2.83660000) | (-0.000071289, 0.000026644)

Figure 2 indicates the stability diagram of Problem 6 over 30 runs. We achieved an acceptable mean and standard deviation of 4.84e-07 and 5.0426e-07, respectively.

Figure 2: The stability chart of Problem 6

Conclusion and Future Works

In this paper, we used the cuckoo optimization algorithm for solving systems of nonlinear equations. Some well known problems were presented to demonstrate the efficiency of finding the best solution using the COA. The proposed method had very good performance and was able to achieve better results, as shown in Tables 2-7. In this algorithm, a gradual evolution in reaching the answer was quite visible; Figure 1 reveals this fact. According to Figure 2, the results have been stable. Therefore, we can say that this algorithm has high performance for solving systems of nonlinear equations and is effective in finding the optimum solutions with high accuracy.

As future work, we are planning to extend COA to solving boundary value problems such as Harmonic and Biharmonic equations. We can also use the normal distribution instead of the uniform distribution to achieve better results. It is noteworthy that the convergence speed could be raised by the use of chaos theory [8].

References

[1] C. Grosan and A. Abraham, A New Approach for Solving Nonlinear Equations Systems, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 38 (3) (May 2008), 698-714.
[2] R. Rajabioun, Cuckoo Optimization Algorithm, Elsevier Applied Soft Computing (2011), 5508-5518.
[3] M. Jaberipour, E. Khorram, and B. Karimi, Particle Swarm Algorithm for Solving Systems of Nonlinear Equations, Elsevier Comput. Math. Appl. 62 (2011), 566-576.
[4] C. Wang, R. Luo, K. Wu, and B. Han, A New Filled Function Method for An Unconstrained Nonlinear Equation, Elsevier J. Comput. Appl. Math. 235 (2011), 1689-1699.
[5] C. G. Broyden, A Class of Methods for Solving Nonlinear Simultaneous Equations, Math. Comput. 19 (92) (Oct. 1965), 577-593.
[6] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, Cambridge, U.K.: Cambridge Univ. Press, 2002.
[7] S. Effati and A. R. Nazemi, A new method for solving a system of the nonlinear equations, Appl. Math. Comput. 168 (2) (2005), 877-894.
[8] H. Bahrami, K. Faez, and M. Abdechiri, Imperialistic Competitive Algorithm Using Chaos Theory for Optimization, 12th International Conference on Computer Modelling and Simulation (2010).


A Novel Model-Based Slicing Approach For Adaptive Softwares


Sanaz Sheikhi

Seyed Morteza Babamir

University of Kashan

University of Kashan

Department of Computer Engineering

Department of Computer Engineering

sheikhi@grad.kashanu.ac.ir

babamir@kashanu.ac.ir

Abstract: Dynamic changes in the operational environments of software and in users' requirements have caused software communities to develop adaptive software. The inherent dynamism of adaptive software makes it complex and error prone, so accomplishing many tasks such as understanding, testing, and analyzing the cohesion and coupling of an adaptive software is difficult and costly. We present a novel approach for slicing an adaptive system whose result can be used to fulfill these tasks more easily and with less cost. The approach uses the Techne model of an adaptive software. Being model-based gives the approach the chance of not being involved in the software code and of working at an abstract level.
Keywords: slicing; adaptive software; model-based; Techne model.

Introduction

Dynamic environments and stakeholders' needs can be well managed with the help of adaptive software. Adaptive software modifies its behavior or structure in response to changes in the environment or in users' requirements. This level of flexibility is accompanied by the risk of more errors and complexity, so the costs of many activities like testing, integration, debugging, and cohesion and coupling analysis increase and they get more difficult.

Slicing techniques can be useful in these cases. They choose some statements (commands) of a software program which affect a predefined set of desirable variables (usually output variables), called the criterion, based on their policy. The result, called a slice, comprises the statements affecting the criterion, and irrelevant statements are removed. It is simpler, and so it is cost effective and easier to analyze.

There are slicing techniques for both the source code and the model of software [1]. In this paper we propose a new approach for slicing adaptive software; to avoid the problems of complex code and application dependence we use a Techne model of an adaptive software. The approach also uses Techne model properties such as preferences and conflicts between the model elements and optionality. In this way the produced slice is also the best way of satisfying the slicing criterion.

The rest of this paper is organized as follows. The next section gives a brief statement of the problem. An introduction to adaptive software and the Techne model is given in Section 3. Concepts of slicing are clarified in Section 4. Section 5 introduces the proposed approach for slicing an adaptive software. The approach is applied to a case study in Section 6. Related works are briefly reviewed in Section 7. Section 8 includes the conclusion and future works.

Corresponding Author

Problem Statement

Adaptive software is inherently complex, and using the usual software engineering approaches for applications like understanding, testing, and cohesion and coupling analysis is difficult, costly and error prone. So new techniques should be used to optimize adaptive software development in order to take advantage of it. Slicing is a reduction technique that

can solve the problem.

ADAPTIVE SOFTWARE

Software systems operate in open, changing and unpredictable environments. So, to be robust, they should be able to adapt to environmental changes as well as to their internal changes and to stakeholders' various requirements [2]. There are many languages to model these systems. Structural or object oriented ones specify a system from its developer's point of view and do not pay attention to the stakeholders of the system; instead, goal based languages are closer to the stakeholders' view and are easier to understand [3, 4].

Techne [5, 6] is a goal based modeling language for adaptive systems. In addition to the general properties of other languages, Techne has unique properties distinguishing it from other languages. The model is in the form of a directed graph. Its nodes represent propositions related to the environment or the stakeholders of the software, such as:

Goal (g): a stakeholder's eligible condition that must be satisfied.

Quality constraint (q): it limits the value of a measurable and well-defined characteristic of the system.

Soft goal (s): it is like a quality constraint, but it limits the value of an ill-defined characteristic of the system.

Domain assumption (k): a proposition that should always be true about the environment of the system, regardless of the system.

Task (t): whatever must be done for the satisfaction of goals, quality constraints and soft goals.

Edges of the graph determine the relations between the nodes. The relations are of four types:

Inference (I): it conveys the conjunction of a set of propositions to satisfy another proposition.

Conflict (C): whenever two nodes are related via a conflict relation, they cannot be satisfied together.

Preference (P): when a proposition is preferred to another one, the former is linked to the latter via a preference relation.

Optional (O): it shows optional propositions.

To clarify the Techne model, we consider the problem of scheduling a meeting. Scheduling can be done automatically, by the use of email and web forms, or manually. Web forms are designed to acquire participants' calendar constraints and to submit requests to modify the meeting date and location. Web form addresses and the invitations are sent to participants by email. The manual approach organizes the meeting via phone calls. A part of the model of the meeting scheduler is depicted in Figure 1.

Figure 1: Techne model of meeting scheduler

SLICING

Program slicing is an approach to choose some pieces of a program affecting a set of desired variables. The result is called a slice. The size of a slice is much smaller than the size of the program, so a slice is a cost effective choice for program testing, program maintenance, cohesion and coupling analysis, program comprehension and many more usages. Almost all slicing methods use a dependency graph of the program. It is a directed graph whose nodes are the statements (commands) of the program and whose edges show either data or control dependencies between the statements [7]. Slicing methods work based on a slicing criterion (V, n): V is a set of variables to be analyzed that reside before line number n in the program source code. They choose all the statements of the program residing before line n that directly or indirectly change the values of the variables in V. They recognize the dependencies and relations between the statements from the program dependency graph.
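As a small illustration of this classic mechanism (not the model-based approach of this paper), a backward slice over a dependency graph is simply reverse reachability from the criterion; a minimal Python sketch, with the graph stored as a map from each statement to the statements it depends on:

# Minimal sketch of backward slicing on a program dependency graph:
# collect every statement the criterion transitively depends on.

def backward_slice(depends_on, criterion):
    """depends_on: dict mapping a statement id to the ids it depends on (data or control)."""
    slice_set, stack = set(), [criterion]
    while stack:
        node = stack.pop()
        if node in slice_set:
            continue
        slice_set.add(node)
        stack.extend(depends_on.get(node, []))   # follow dependencies backwards
    return slice_set

# Example: statement 5 depends on 3 and 4, 4 on 2, 3 on 1.
# backward_slice({5: [3, 4], 4: [2], 3: [1]}, 5) -> {1, 2, 3, 4, 5}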


PROPOSED APPROACH

For slicing a program, either its source code or its model can be used. As the source code of an adaptive software is complicated, we use the model of the adaptive software instead of its code for slicing. We consider the Techne modeling language attributes (preference, being optional) in our slicing policy. The Techne model is in the form of a dependency graph, so we use it as the software dependency graph.

We use the Techne model depicted in Figure 1 to show our slicing approach. In the Techne model all the goals (gi), soft goals (si), domain assumptions (ki), quality constraints (qi) and tasks (ti) are taken as the graph nodes. All the inference (Ii), conflict (ci) and preference (pi) relations between the nodes are taken as the dependency edges.

For slicing, one of the goals, soft goals or quality constraints is picked as the slice criterion, and the following sets should be defined:

concepts is the set of all the graph nodes.

A is a subset of concepts and includes all the goals, soft goals and quality constraints.

Decomposed(ai) ⊆ {B1, ..., Bs}, ai ∈ A, s = |concepts|, Bt = (bn), bn ∈ concepts, t = 1, ..., s. There may be different paths for the satisfaction of ai. In fact, each Bt is a path for the satisfaction of ai and is called a solution.

Conflict(ai): the set of all 2-tuples (bp, bq), where (bp and bq ∈ Ba) or (bp ∈ Ba and bq ∈ Bb, a ≠ b), with Ba and Bb ∈ Decomposed(ai), and bp and bq are in conflict with each other. In the graph, bp and bq are connected to each other via an edge with a conflict relation.

Preference(ai): the set of all 2-tuples (bp, bq) where (bp ∈ Ba and bq ∈ Bb, a ≠ b) and (Ba and Bb ∈ Decomposed(ai)). In the graph, bp is preferred to bq if there is a preference relation from bp to bq.

After preparing the sets, we perform the following steps:

1. Choose one of the members of the set A as the slicing criterion, called ai. Set an empty list called chosen-together-list. Set the flags prefk and prefj to zero.

2. Determine Decomposed(ai), Conflict(ai), Preference(ai).

3. For each Bj ∈ Decomposed(ai) and each bp and bq ∈ Bj, if ((bp, bq) ∈ Conflict(ai)) then remove Bj from Decomposed(ai).

4. If Decomposed(ai) is empty, then there is no path for the satisfaction of ai; return ∅.

5. If Decomposed(ai) has only one member, called BF, then pick it up and go to step 11.

6. If every two members of Decomposed(ai) are in the chosen-together-list then go to step 11. If Decomposed(ai) has more than one member which are not in the chosen-together-list then choose two of them, call them Bk, Bj, insert (Bj, Bk) into the checked-together-list, and:

7. For each bp ∈ Bk, bq ∈ Bj: if (bp, bq) is in Preference(ai) then increase prefk by one unit; if (bq, bp) is in Preference(ai) then increase prefj by one unit.

8. For each member of Bk having the optional label increase prefk by one unit; for each member of Bj having the optional label increase prefj by one unit.

9. Insert the names of Bj and Bk into the checked-together-list.

10. If prefj < prefk then remove Bj from Decomposed(ai). If prefj > prefk then remove Bk from Decomposed(ai). Go to step 4.

11. Each member of Decomposed(ai) can be a path for the satisfaction of ai. It depends on the designer's or programmer's strategy to choose one of them, called BF.

12. Slice(ai) = Slice(b1) ∪ ... ∪ Slice(bF), BF = (b1, ..., bF).

At first the approach chooses a slice criterion (ai), adjusts an empty list called chosen-together-list, and sets two flags (prefk and prefj), presenting the worthiness of two different solutions, to zero in order to compare the solutions and choose one of them. In the second step the different solutions of ai become members of Decomposed(ai), and the conflicts between concepts and the priority of concepts are determined. In the third step, any solution consisting of conflicting propositions is removed from the Decomposed(ai) set. In step four, if the Decomposed(ai) set is empty then the algorithm returns an empty set, meaning that there is no path to


satisfy ai . On the other hand if there is more than


one solution, they are compared in consideration with
the number of optional concepts they cover and preferences existing among their concepts.
Finally one of the remained solutions in the
decomposition(ai ), called BF , is chosen. Eq.(1) determines the slice:
Slice(ai ) = Slice(b1 ) ... Slice(bF ) (1)
BF =( b1 ,...,bF )

7. prefk ++
8. 9. Insert (Bj , Bk ) into checked-together-list
10. prefk > prefj Decomposed(S1 )= {(G1 , Q2
)}
go to 11
11. BF = (G1 , Q2 )

Slice(ai ) contains the most effective elements on


satisfaction of ai and is free of irrelevant elements. SO
it is a suitable choice for analyzing cohesion and coupling, generating test cases for test cases and many
other applications easily and with less cost.

Case Study

S
S
12. slice(S1 )=slice(G1 )S slice(Q2 ) = slice(G1 )
slice(G1 )=slice(T1 ) slice(T2 )
slice(S1 )= { S1 , Q2 , G1 , T1 , T2 }
Result is a slice which is much more smaller,simpler
and cost effective to be analyzed for different applications in comparion with the whole Techne model. It
has less nodes and relations relevant to the way of satisfaction of the criterion. It contains all the elements
that effect satisfaction of the criterion.

Applying the slicing approach to the Techne model of


meeting schedular depicted in Figure 1 with the slicing criterion accommodate late changes leads to creation of the slice depicted in Figure 2. Because of space
limitation, we use the following notation for the model
elements which will be used in this section:
S1 : Accommodate late changes.
G1 : obtain change requests via web form.
Q1 : change requests can be setup up to 6h prior. Figure 2: slice for the criterion accommodate late
changes
Q2 : change requests can be setup up to 3h prior.
T1 : leave web form open up to 3h prior.
T2 : implement web form for change reqs.
T3 : leave web form open up to 6h prior.
The approach steps are :
1. criterion = S1 .
2. Decomposed(S1 )= {(G1 ,Q1 ),(G1 , Q2 )}
Conflict(S1 )=
Preference(S1 )= (Q2 ,Q1 )
3. 4. 5. 6. Bj =(G1 ,Q1 ) , BK =(G1 , Q2 )

Related Work

As far as we have studied, no research has been dedicated to slicing of adaptive system models, but there is some research on slicing of software models, especially UML models. Linzhang [8] considered class diagrams and performs slicing to extract test cases based on a black-box method. Ray [9] used conditioned slicing of class diagrams, but this is not suitable, as class diagrams are static and do not show the system's behavior with regard to data dependencies. To compensate for this handicap, Samuel [1] used the sequence diagram, which is dynamic and shows the system behavior, for slicing and test case generation. He proposes a formula for slicing criterion adequacy and claims that it covers the slicing criterion with the least number of test cases. Bertolino [10] focused on message passing between sequence diagram components and tries to generate test cases covering all the predicates and interactions existing in the sequence diagram.

Discussion and Future Works

We present a novel approach for slicing of an adaptive system. The result of the approach is especially applicable for testing, verifying coupling and cohesion, or understanding a complex adaptive software system. As adaptive software systems are usually complicated, our approach avoids involving their source code and instead uses the adaptive software's models. In this way the work proceeds at an abstract level and avoids the complexity of code details.
The approach is designed based on the Techne model of an adaptive software system. It exploits the model's architecture, which is in the form of a graph, instead of creating a dependency graph. It is also based on the priority and optionality properties of the Techne modeling language. Therefore, the result of the approach can be considered the best solution to satisfy the slice criterion. The produced slice contains exactly those parts of the model affecting the satisfaction of the criterion; hence a huge part of the software irrelevant to the criterion is eliminated and the analysis cost is considerably reduced.
Naturally there is much to be done in this field. The large number of comparisons in the approach increases its cost, and one of our future works is to alter the approach in such a way as to solve this problem. We are also planning to slice the adaptive system model with a dynamic method, which may reduce the complexity of the approach and the size of the slice.


References

[1] P. Samuel and R. Mall, A Novel Test Case Design Technique Using Dynamic Slicing of UML Sequence Diagrams,
e-Informatica Software Engineering Journal 2/1 (2008),
367378.
[2] A. G. Ganek and T. A. Corbi, The dawning of the autonomic computing era, IBM Systems Journal 2/1 (2003),
71-92.
[3] E. Nitto, C. Ghezzi, A. Metzger, M. Papazoglou, and K.
Pohl, A journey to highly dynamic, self-adaptive servicebased applications, Automated Software Engineering Journal/USA 15/3 (2008), 313-317.
[4] Q. Zhu, L. Lin, H. M. Kienle, and H. A. Muller: Characterizing maintainability concerns in autonomic element design,
software maintenance ICSM/Beijing (2008), 197-206.
[5] A. Borgida, N. Ernest, I.J. Jureta, A. Lapouchnian, S.
liaskos, and J Mylopoulos, Techne (another) Requirements
Modeling Language, University of Toronto (2009).
[6] I.J. Jureta, A Borgida, N. Ernest, and J Mylopoulos,
Techne: Towards a New Generation of Requirements Modeling Languages with Goals, Preferences, and Inconsistency
Handling, Proceeding of IEEE International Conferance on
Requirement Engineering,sydney,NSW (2010), 115-124.
[7] D. Binkley, S. Danicic, T. Gyimothy, M. Harman, A. Kiss,
and B. Korel, Theoretical foundations of dynamic program
slicing, Theoretical Computer Science 360/23-41 (2006).
[8] W. Linzhang, Y. Jiesong, Y. Xiaofeng, H. Jun, L. Xuandong,
and Z. Guoliang, Generating test cases from UML activity
diagrams based on gray-box method, Proceedings of the 11th
Asia- Pacific Software Engineering Conference/Washington,
DC, USA (2004).
[9] M. Ray, S. S. Barpanda, and D.P. Mohapatra, Test Case
Design Using Conditioned Slicing of Activity Diagram, International Journal of Recent Trends in Engineering 1/2
(2009), 117-120.
[10] A. Bertolino and F. Basanieri, A practical approach to UML
based derivation of integration tests, Proceedings of 4th International Software Quality Week Europe (2000).

A novel approach to multiple resource discoveries in grid environment


Leyli Mohammad khanli

Saeed Kargar

University of Tabriz

Islamic Azad University,Tabriz Branch

Department of Computer Sciences

Department of Computer

Tabriz, Iran

Tabriz, Iran

l-khanli@tabrizu.ac.ir

saeed.kargar@gmail.com

Hossein Kargar
Islamic Azad University, Science and Research Branch
Department of Computer
Hamedan, Iran
h.kargar.ir@gmail.com

Abstract: In this paper, we propose a method for resource discovery in grid environments which is able to discover the combinational resources required by users as well as single resources. In this method, the idea of combining colors is used for storing and discovering resources. The method uses combinations of colors to describe the characteristics of resources, and users use combinations of colors, or their equivalent codes, to request the resources they need. The method is able to locate the users' required resources with low traffic, discover them by a direct path, detect changes which occur in the system, and update the environment accordingly. The method is simulated in environments of different sizes, and the results show that it generates lower traffic than the other methods and is therefore more effective.

Keywords: Facility Location; Voronoi Diagram; Reactive Agent; Computational Geometry; Artificial Intelligence.

Introduction

A computational grid is a virtual distributed computing environment aimed at establishing an environment for sharing resources over a wide geographical range. The resources being shared in a grid may be heterogeneous, located in different geographical places, belong to different administrative domains, and so on. Finding a resource, or a combination of resources, for executing a particular program is therefore complicated and difficult. The grid system should implement a mechanism that can discover the resources users require in a widespread environment while generating low traffic, and deliver them to the users. These mechanisms are known as resource discovery mechanisms.

The traditional resource discovery mechanisms use methods such as centralized discovery [1-6]. These methods are very effective and efficient at discovering resources in small environments: all available resources in an environment are managed by a central server, and this server can serve the requests sent to it well. However, when the environment grows, these systems run into problems. The amount of information that must be saved and managed on the server, and the number of requests sent to the server to discover resources, increase significantly. This creates a bottleneck at the server and reduces the efficiency of the system. Researchers therefore decided to build systems that do not rely on a central server. These systems are known as distributed systems. In distributed systems, a central server is not responsible for all resources and the system is managed in a distributed manner.

Such systems, which have been presented recently, use a resource discovery tree for discovering resources [7-12], and they provide better efficiency than the previous methods.
In this paper, we propose a method which is able to discover multiple resources for a user simultaneously. We assign a color to each characteristic of a resource in order to store the information of each resource, and we use the combination of the constituent colors for combinational resources. The results of our simulations show the efficiency of this method in grid systems of different sizes.

The remainder of this paper is organized as follows. Section 2 presents an overview of related work. Section 3 explains the proposed algorithm. Section 4 shows the simulation results, and Section 5 concludes the paper and outlines some future research directions.

Related work

The resource discovery problem is one of the most important problems that researchers are trying to solve; different methods have been proposed for it, and we briefly explain some of them in this section.

Among the resource discovery methods, we can point to the method proposed in [13], which uses routing tables. In this method, the whole environment is considered as a combination of routers and resources, and resource discovery is done by routing tables that record the number of routers to each type of resource available in the environment.

Another method is that of Chang et al., which uses a resource discovery tree for finding the user's required resources [7]. This method is also distributed, and each node is responsible for itself and its children's nodes. It improves on the previous methods in many respects.

In our previous work [8], we used the weighted resource discovery tree for discovering resources in a grid environment. This method reduces the resource discovery cost compared with the previous methods.

The method proposed in [11] is a multi-resource discovery method in which resource discovery is performed on a binary tree.

All the aforementioned methods except the last one discover a single resource for the user and are not able to discover multiple resources simultaneously. The most important difference between the method proposed in this paper and [11] and the other methods is that the proposed method can discover multiple resources simultaneously on trees with an arbitrary, desired number of children (not merely on a binary tree). Moreover, we use the combination of colors for the first time in this work.

Our resource discovery and update methods

In this section, first we explain the manner of assigning colors to each of the resources, and then we introduce the resource discovery and update methods.

In this work, for producing the colors we use Color Schemer Studio 2 [14]. We use three main colors, red, green and blue, each represented on a band that darkens from one end to the other, such that values near 255 are lighter and values near 0 are darker. Many colors can be created from the combination of these three colors. Any desired color can be represented by three numbers x, y, z (x, y, z ∈ {1, 2, 3, ..., 256}), each representing the intensity of the red, green and blue components respectively. So white is represented by the code (255.255.255) and black by the code (0.0.0). Our example grid environment, organized on a tree structure, is shown in Figure 1.

As can be seen in this figure, the combinational resources are written in each node. These nodes represent grid sites, each of which has some resources.

Figure 1: An example of a typical grid environment on a resource discovery tree

In this example, we imagined two types of CPU, three types of HDD and two types of RAM in the environment. As mentioned, we represent each type of resource

with a color. In Figure 2, all the imagined resources in our environment are attributed to a unique color. The nodes save the color corresponding to their available resources in a table called the Color Table, which will be introduced below. If a node has a combination of resources, it saves the combination of the colors of those resources.

Figure 2: Existing resources in our grid environment

For example, if one node has the combination resource CPU 3.8 GHz & HDD 2TB, it will use the color resulting from the combination of (255.102.0) (related to CPU 3.8) and (34.0.204) (related to HDD 2T). To obtain the combination of colors, it is enough to calculate the integer part of the average of the component values:
([(255+34)/2].[(102+0)/2].[(0+204)/2]) = (144.51.102)

In Figure 3, three samples of combinational resources are represented together with their color and code. In Figure 4, the color table related to node 1 is shown. As can be seen, the number of rows of the color table of each node equals the number of children of that node. In each row, the color related to the resources available in the corresponding child is written.

Figure 4: A sample of Color Table (belonging to node 1)

3.1 Resource discovery

Now imagine that a user needs the resource CPU 3.8 GHz & HDD 2TB. The user delivers the related color to the nearest node (here, for example, node 7).

Having received this request, node 7 compares it first with its local color and then with the colors available in its table. Since there is no match, it delivers the request to its parent (node 4). Node 4 delivers the request to node 1 in the same way. As shown in Figure 5, node 1 finds a match in row 1, which is related to node 2, and delivers the request to that node; node 2 acts in the same way and sends the request to nodes 5 and 6. In the end, the desired resources are discovered in two nodes for the user (multi-reservation).

Figure 5: A sample of resource discovery in our method

Figure 3: An example of combinational resources
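As an illustration of the color-combination rule described above, the following Python sketch averages RGB codes component-wise and truncates to an integer. Only the two codes quoted in the text are taken from the paper; everything else is an assumption made for the example.

def combine_colors(*colors):
    """Combine RGB color codes by taking the integer average of each component,
    as done for combinational resources (e.g. CPU 3.8 GHz & HDD 2TB)."""
    n = len(colors)
    return tuple(sum(c[i] for c in colors) // n for i in range(3))

# The two codes quoted in the text:
cpu_38 = (255, 102, 0)   # CPU 3.8 GHz
hdd_2t = (34, 0, 204)    # HDD 2TB

print(combine_colors(cpu_38, hdd_2t))  # (144, 51, 102), matching the example above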

As was seen, the proposed method can discover the user's required resource by a direct path on trees of different sizes and with a desired number of children.

Simulation Results

We performed the simulation in the MATLAB environment and show the results in graphs. We considered varied simulations for different sizes of environment; we also supposed that the users request different numbers of resources at any time. Under these assumptions, we obtained results for the tree method with height 4 [7], FRDT [8], MMO [15, 16] and a flooding-based method, which are one-resource methods. One-resource methods are methods which are able to discover just one resource at a time for the user. For these methods, we supposed that they send the user's requests separately and then discover these resources for the user [11]. Also, the method proposed in [11], which is a multi-resource method on a binary tree, is one of the methods we compared with our proposed method.

In the first test, the average number of nodes to which the requests are sent is shown for the case in which every user requests one resource. Here, we compared our method with the other methods. As shown in Figure 6, the average number of visited nodes in our method is lower than in the other methods and is equal to FRDT. This is because, first, in this experiment every user requested just one resource and, second, since our method traverses just a direct path, like FRDT, both visit the same number of nodes. In this test we supposed 300 requests.

Figure 6: Average number of nodes to which requests are forwarded in resource discovery using different approaches

In the next simulations, we established 300 requests in the environment. These tests show the number of visited nodes in grid environments of different sizes for the different methods. We supposed 300 users, each of whom requested a different number of resources. The results are shown in Figures 7 and 8.

As can be seen in all tests, as the size of the environment and the number of resources requested by users increase, our method has better efficiency.

Figure 7: The number of nodes visited by the users' requests during resource discovery when the users request two resources

Figure 8: The number of nodes visited by the users' requests during resource discovery when the users request four resources

Conclusions and future work

This paper presents a distributed and scalable resource discovery method which supports discovering multiple resources in a dynamic grid environment. In this method, the idea of combining colors is used for storing and discovering resources. The simulation results show that this method is an effective and efficient method in grid environments.


In the future, if we can reduce the size of the data saved in the tables, we can improve the method further.

References

[15] Ye Zhu, Junzhou Luo, and Teng Ma, Dividing Grid Service Discovery into 2-stage matchmaking, ISPA 2004, LNCS 3358 (2004), 372-381.
[16] Sanya Tangpongprasit, Takahiro Katagiri, Hiroki Honda, and Toshitsugu Yuba, A time-to-live based reservation algorithm on fully decentralized resource discovery in Grid computing, Parallel Computing 31 (2005).
[1] I. Foster, C. Kesselman, and Globus, A meta-computing infrastructure tool-kit, Int. J. High Perform, Comput. Appl
2 (1997), 115-128.

[17] Muthucumaru Maheswaran, Klaus Krauter, and Teng Ma,


A parameter-based approach to Resource Discovery in Grid
Computing Systems, GRID (2000).

[2] M. Mutka and M. Livny, Scheduling remote processing capacity in a workstation processing bank computing system,
Proc. of ICDCS (1987).

[18] K.I. Karaoglanoglou, H.D. Karatza, and Teng Ma, Resource


Discovery in a dynamical grid based on Re-routing Tables,
Simulation Modelling Practice and Theory 16 (2008), 704720.

[3] C. Germain, V. Neri, G. Fedak, and F. Cappello,


XtremWeb: Building an experimental platform for global
computing, Proc. of IEEE/ACM Grid (2000).
[4] A. Chien, B. Calder, S. Elbert, and K. Bhatia, Entropia:
Architecture and performance of an enterprise desktop grid
system, J. Parallel Distrib. Comput 63 (2003), no. 5.
[5] F. Berman, Adaptive computing on the grid using AppLeS,
TPDS 14 (2003), no. 4.

[19] Simone A. Ludwig and S.M.S. Reyhani, Introduction of semantic matchmaking to Grid computing, J. Parallel Distrib.
Comput 65 (2005), 15331541.
[20] Juan Li and Son Vuong, Grid resource discovery using semantic communities, Proceedings of the 4th International
Conference on Grid and Cooperative Computing, Beijing,
China (2005).

[6] M.O. Neary, S.P. Brydon, P. Kmiec, S. Rollins, P. Capello,


and JavelinCC, Scalability issues in global computing, Future Gener. Comput. Syst. J 15 (1999), no. 56, 659-674.

[21] juan Li and Son Vuong, Semantic overlay network for Grid
Resource Discovery, Grid Computing Workshop (2005).

[7] R-.S Chang and M-.S .Hu, A resource discovery tree using
bitmap for grids, Future Generation Computer Systems 26
(2010), 2937.

[22] Cheng Zhu, Zhong Liu, Weiming Zhang, Weidong Xiao,


Zhenning Xu, and Dongsheng Yang, Decentralized Grid Resource Discovery based on Resource Information Community, Journal of Grid Computing (2005).

[8] L.M Khanli and S. Kargar, FRDT: Footprint Resource


Discovery Tree for grids, Future Gener. Comput. Syst 27
(2011), 148-156.
[9] L.M Khanli, A. Kazemi Niari, and S. Kargar, An Efficient
Resource Discovery Mechanism Based on Tree Structure,
The 16th International Symposium on Computer Science
and Software Engineering (CSSE 2011) (2011), 4853.
[10] Leyli Mohammad Khanli, Saeed Kargar, and Ali Kazemi
Niari, Using Matrix indexes for Resource Discovery in Grid
Environment, The 2011 International Conference on Grid
Computing and Applications (GCA11), Las Vegas, Nevada,
USA (2011), 3843.
[11] Leyli Mohammad Khanli, Ali Kazemi Niari, and Saeed Kargar, A binary tree based approach to discover multiple types
of resources in grid computing, International journal of computer science & Emerging Technology, Sprinter Global Publication E-ISSN: 2044-6004 (2010).
[12] leyli Mohammad Khanli, Ali Kazemi Niari, and Saeed Kargar, Efficient Method for Multiple Resource Discoveries in
Grid Environment, The 2011 International Conference on
High Performance Computing & Simulation (HPCS 2011)
(2011).
[13] R. Raman, M. Livny, and M. Solomon, Matchmaking: distributed resource management for high throughput computing, hpdc, Seventh IEEE International Symposium on High
Performance Distributed Computing (HPDC-798), (1998),
140.
[14] Rajesh. Raman, Matchmaking Frameworks for Distributed
Resource Management, Wisconsin-Maddison, 2001.

204

[23] Fawad Nazir, Hazif Farooq Ahmad, Hamid Abbas Burki,


Tallat Hussain Tarar, Arshad Ali, and Hiroki Suguri, A resource monitoring and management middleware infrastructure for Semantic Resource Grid, SAG 2004, LNCS 3458
(2005), 188-196.
[24] Thamarai Selvi Somasundaram, R.A. Balachandar, Vijayakumar Kandasamy, Rajkumar Buyya, Rajagopalan Raman, N. Mohanram, and S. Varun, Semantic based Grid Resource Discovery and its integration with the Grid Service
Broker, Proceedings of 14th 4 International Conference on
Advanced Computing & Communications, ADCOM (2006),
84-89.
[25] J. Li and S. Vuong, A scalable semantic routing architecture
for Grid resource discovery, 11th Int. Conf. on Parallel and
Distributed Systems, ICPADS05 1 (2005), 29-35.
[26] K Karaoglanoglou and H Karatza, Resource discovery in
a dynamical grid system based on re-routing tables, Simulation Modelling Practice and Theory, Elsevier 16 (2008),
no. 6, 704-720.
[27] Color Schemer Studio 2: http://www.colorschemer.com/.
[28] M. Marzolla, M. Mordacchini, and S. Orlando, Resource discovery in a dynamic environment, Proceedings of the 16th
International Workshop on Database and Expert Systems
Applications, DEXA05 (2005), 356-360.
[29] M.Marzolla, M.Mordacchini, and S.Orlando, Peer-to-peer
systems for discovering resources in a dynamic grid, Parallel Comput 33 (2007), no. 45, 339-358.

HTML5 Security: Offline Web Application


Abdolmajid Shahgholi

HamidReza Barzegar

Jawaharlal Nehru Technological University

Jawaharlal Nehru Technological University

School of Information and Technology

School of Information and Technology

Hyderabad, India

Hyderabad, India

Shahgholi a@hotmail.com

Hr.barzegar@gmail.com

G.Praveen Babu
Jawaharlal Nehru Technological University
School of Information and Technology
Hyderabad, India
pravbob@jntu.ac.in

Abstract: Offline Web Application [7]: Using the HTML5 Offline Web Application feature, web applications are able to work offline. A web application can send an instruction which causes the UA to save the relevant information into the Offline Web Application cache. Afterwards the application can be used offline without needing access to the Internet. Whether or not the user is asked if a website is allowed to store data for offline use depends on the UA. For example, Firefox 3.6.12 asks the user for permission, but Chrome 7.0.517.44 does not ask the user for permission to store data in the application cache. In this case the data will be stored in the UA cache without the user realizing it.

Keywords: Offline Web Application, User Agent, Cache Poisoning

Introduction

Creating web applications which can be used offline was difficult to realize prior to HTML5. Some manufacturers developed complex workarounds to make their web applications work offline. This was mainly realized with UA add-ons the user had to install. HTML5 introduces the concept of Offline Web Applications. A web application can tell the UA which files are needed for working offline. Once loaded, the application can be used offline. The UA recognizes the offline mode and loads the data from the cache. To tell the UA that it should store some files for offline use, the new HTML attribute manifest in the <html> tag has to be used:

<!DOCTYPE HTML>
<html manifest="/cache.manifest">
<body>

The attribute manifest refers to the manifest file which defines the resources, such as HTML and CSS files, that should be stored for offline use. The manifest file has several sections defining the list of files which should be cached and stored offline, the files which should never be cached, and the files which should be loaded in the case of an error. This manifest file can be named and located anywhere on the server; it only has to end with .manifest and be returned by the web server with the content type text/cache-manifest. Otherwise the UA will not use the content of the file for the offline web application cache.
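To make the structure of such a manifest file concrete, the following is a minimal illustrative cache.manifest. The file names listed in it are invented for the example and are not taken from the text; only the section names (CACHE, NETWORK, FALLBACK) follow the HTML5 manifest format described above.

CACHE MANIFEST
# Files to cache for offline use
CACHE:
index.html
style.css
app.js

# Files that always require a network connection
NETWORK:
login.php

# Fallback page served when an uncached resource is requested offline
FALLBACK:
/ /offline.html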


User Agent (UA): The UA represents a web application consumer which requests a resource from a web application provider. This resource is processed by the UA and, depending on the resource, is rendered and displayed by the UA to the end-user. The UA has the capability to establish Hypertext Transfer Protocol (HTTP) [6] connections to a web server, to render HTML/CSS and to execute JavaScript code correctly.

Further, the UA implements the HTML 4.01 and HTML5 standards and their corresponding capabilities such as the Geolocation API or Web Storage.
Web application: The web application is a generic term for the entity providing web resources and is composed of the following three main parts:
Website: The website is composed of several single web resources and is accessible via its URI.
Web server: The web server hosts at least one website. The HTTP(S) connection is established between the UA and the web server. Besides hosting websites, additional resources are also provided by the web server. Other connections, such as Web Socket API connections, are also established between the UA and the web server.
Database: The database stores any kind of data needed for the web application, such as personal information about its users.
Motivation

As seen, many attacks against web applications exist (as of 2010) and the need for security on the Internet grows. Besides the comfort the web provides, security concerns are critical points to be considered. This applies to current web applications but also to future web applications. The threats to web applications described in this section need to be kept in mind when considering HTML5 security issues.

Vulnerabilities

With the introduction of Offline Web Applications, the security boundaries are moved. In web applications prior to HTML5, access control decisions for accessing data and functions were made only on the server side. With the introduction of Offline Web Applications, parts of these permission checks are moved towards the UA. Therefore, implementing protections of web applications solely on the server side is no longer sufficient if Offline Web Applications are used. The target of attacking a web application is not limited to the server side; attacking the client-side part of an Offline Web Application is possible as well. This mainly breaks the requirement of UA protection, but by breaking this security requirement all other security requirements are implicitly endangered as well. For example, if the security requirement of secure caching can be broken, an attacker can include any content in the Offline Web Application cache and use this code to break the other security requirements as well.

Threats and attack scenarios

Spoofing the cache with malicious data was a problematic security issue already prior to HTML5. Cache poisoning was possible with the already existing HTML4 cache directives for JavaScript files or other resources. However, UA cache poisoning attacks were limited. With HTML5 offline applications these cache poisoning attacks are more powerful. The following threats are made worse in HTML5:

Cache Poisoning: It is possible to cache the root directory of a website. Caching of HTTP as well as HTTPS pages is possible. This breaks the security requirements of UA protection and secure caching.

Persistent attack vectors: The offline application cache stays on the UA until either the server sends an update (which will not happen for spoofed contents) or the user deletes the cache manually. However, a problem similar to the one for Web Storage exists in this case: UA manufacturers behave differently when the recent history is deleted. This breaks the security requirement of UA protection.

User Tracking: Storing Offline Web Application details can be used for user tracking. Web applications can include unique identifiers in the cached files and use these for user tracking and correlation. This breaks the security requirement of confidentiality. When the offline application cache is deleted depends on the UA manufacturer.

As already mentioned, cache poisoning is the most critical security issue for offline web applications. Therefore, a possible cache poisoning attack scenario is given in this section, motivated by the ideas of an article from [8]. Figure 1 shows a sequence diagram which illustrates how an attacker can poison the cache of a victim's UA. The victim goes online through an unsecured, malicious network and accesses any page (the page to be poisoned does not necessarily have to be accessed). The malicious network manipulates the data sent to the client and poisons the cache of the UA. Afterwards, the victim goes online through a trusted network and accesses the poisoned website. Then the actual attack happens and the victim loads the poisoned content from the cache.

Figure 1: Cache poisoning attack scenario

1. The victim accesses any.domain.com through a malicious access point (e.g. public wireless).
2. The HTTP GET request is sent through the malicious access point to any.domain.com.
3. Any.domain.com returns the response.
4. The access point manipulates the response from any.domain.com: a hidden Iframe with src=http://www.filebox-solution.com is added to the response which is sent to the UA.
5. This hidden Iframe causes the UA to send a request to www.filebox-solution.com in the background (the user will not notice this request).
6. The request to www.filebox-solution.com is intercepted by the malicious access point, which returns a faked login page including malicious JavaScript. The HTML page contains the cache manifest declaration. The cache.manifest file is configured to cache the root directory of www.filebox-solution.com (the cache.manifest file itself is returned with an HTTP cache header set to expire far in the future).
7. The victim opens his UA in a trusted network and enters www.filebox-solution.com in the address bar. Because of the offline application cache, the UA loads the page from the cache, including the malicious JavaScript. No request is sent to www.filebox-solution.com.
8. After the user has entered the login credentials into the faked login form (offline application), it posts the credentials to an attacker-controlled server (JavaScript code execution).
9. The JavaScript performs the login request to www.filebox-solution.com (from here the steps are optional; they are performed to hide the actual attack from the user).
10. The login request is sent to www.filebox-solution.com.
11. Login successful (the user does not notice the attack performed).

One may argue that a similar kind of attack was also possible with standard HTML cache features. That is correct, but the offline application attack has two advantages:

Caching of the root directory is possible: If the user opens the poisoned website, the UA will not make any request to the network and loads the poisoned content from the cache. If the root directory is cached using HTML4 cache directives, a request to the server is sent as soon as the user clicks refresh (either the server sends an HTTP 304 Not Modified or an HTTP 200 OK, or the page is loaded from the server and not from the cache).

SSL resources can be cached as well: In HTML4, man-in-the-middle attacks were possible, but then the user had to access the website through the unsecured network. With offline application caching, the root of an HTTPS website can be cached; the user does not have to open the website. The user may accept an insecure connection (certificate warning) in an unsecured network because he does not send any sensitive data. The real attack happens when the user is back in his secured network, feels safe and logs in to the poisoned application.

Countermeasures

The threats of persistent attack vectors and cache poisoning cannot be avoided by web application providers; the threats are defined in the HTML5 specification. The way around this problem is to train users to clear their UA cache whenever they have accessed the Internet through an unsecured network, or before they want to access a page to which sensitive data are transmitted. Further, the user needs to learn to understand the meaning of the security warning and only accept Offline Web Applications from trusted sites.


Conclusion

Applications such as e-mail clients, word processing or image manipulation applications will have the capability to run completely in the browser. Making use of HTML5, running these applications completely offline in the browser will also be possible. This provides new ways for malware. Everything the user needs to run an HTML5 web application is an HTML5-supporting browser. This is an ideal target for malware: write once, run everywhere, since HTML5 is platform independent. Malware making use only of JavaScript and HTML5 features may become numerous with the adoption of HTML5. What is new is that the targets of HTML malware will no longer be limited to web application servers but move to the UA as well (besides the problem of exploiting browser vulnerabilities), because HTML5 provides feature-rich capabilities to the UA; malware can even be persisted without exploiting UA vulnerabilities, e.g. in Web Storage. Overall it can be said that making web applications secure solely with technological solutions is a very complex task and cannot be done by all web application providers. Therefore, the end-user is highly responsible for using web applications carefully and only providing personal and sensitive data if a strong trust relationship exists.

References

[1] World Wide Web Consortium (W3C), HTML 4.01 Specification, W3C Recommendation, http://www.w3.org/TR/1999/REC-html401-19991224/ (1999).
[2] World Wide Web Consortium (W3C), XHTML 1.0: The Extensible HyperText Markup Language, http://www.w3.org/TR/xhtml1/ (2000).
[3] World Wide Web Consortium (W3C), HTML5: A vocabulary and associated APIs for HTML and XHTML, http://www.w3.org/TR/html5/ (2007).
[4] M. Pilgrim, HTML5: Up and Running, Sebastopol: O'Reilly Media, 2010.
[5] Web Hypertext Application Technology Working Group (WHATWG), What is the WHATWG?, http://wiki.whatwg.org/wiki/FAQ (2011).
[6] Internet Engineering Task Force, The Internet Society: Hypertext Transfer Protocol HTTP/1.1, http://www.ietf.org/rfc/rfc2616.txt (1999).
[7] World Wide Web Consortium (W3C), Offline Web Applications, http://www.w3.org/TR/offline-webapps/ (1999).
[8] Lavakumar Kuppan and Attack and Defense Labs, Chrome and Safari users open to stealth HTML5 AppCache attack, http://blog.andlabs.org/2010/06/chrome-and-safariusers-open-to-Stealth.Html (2010).


Earthquake Prediction by Study on Vital Signs of Animals in


Wireless Sensor Network by using Multi Agent System
Media Aminian

Amin Moradi

Islamic Azad University

Institute for Advanced Studies in Basic Sciences

Science and Research Branch of Kerman, Iran

Department of Physics

Department of Computer

amin.moradi@iasbs.ac.ir

media.aminian@yahoo.com

Hamid Reza Naji


International Center for Science and High Technology,Kerman,Iran
Department of Computer
hamidnaji@ieee.org

Abstract: We use a multi-agent system architecture in a wireless sensor network (WSN) to predict the occurrence of earthquakes by studying the vital signs of animals. This system uses several agents with different functionalities. CBR methods are applied to analyze and compare the similarity of animal vital signs just before past earthquakes with real-time readings, in order to reduce false alarms. The presented architecture consists of two layers, an interface layer and a regional layer. At the interface layer the interface agents interact with users, and at the regional layer the cluster agents communicate with each other and package the information.
Keywords: Earthquake prediction;WSN;Multi Agent System;CBR.

Introduction

Every year more than 13,000 earthquakes with a magnitude greater than 4.5 occur around the world; hundreds of them are destructive and many people lose their lives [1]. If we could predict them, we would be able to save many lives. Before an earthquake the Earth's crust breaks and gases such as argon and radon are released into the air [2]. Animals are sensitive to these gases, and their behavior and vital signs change in response to them [3]. So we can detect stress in animals by measuring their vital signs. A WSN comprises numerous sensor devices, commonly known as motes, which can contain several sensors to monitor vital signs such as temperature, heart rate, etc. The sensor motes are spatially scattered over a large area, so data collection is difficult in this network. Therefore, we present a multi-layer agent system to increase the efficiency of data collection.

The Proposed architecture

The present environment of collaborative agents in a WSN is described by three entities, as shown in Figure 1. These entities are the web browser, the software agents and the sensor nodes [4]. The web browser is the gateway for the user to receive results in the appropriate format. Agents are the intelligent entities that are able to respond to the user's needs and relieve the user from being the controller of the system [5]. Sensor nodes are physical entities that are able to read temperature, heart rate, etc. from the environment.
The proposed layered system architecture consists of two layers: the interface layer and the regional layer. The


agents in each layer collaborate with each other to achieve their goals. The agents on each layer coordinate with their upper layer to transmit information. At the regional layer, the cluster agents collect the sensory data from the sensors. The cluster agents process and repackage the data and finally push the packets to the interface layer. At the interface layer, the interface agent interacts with the sensor network, receives the packets from the cluster agents and shows them in the appropriate format (text or graphic), and the CBR agent measures the similarity coefficient.

Figure 1: Multi Agent System architecture

2.1 Regional Layer

Cluster agents operate in the regional layer. In the WSN, we have several sensors that are attached to the animal's body and report the animal's vital signs. The sensors attached to the animal's body send their readings to the cluster head. Each cluster head transmits its information to the cluster agents located in the regional layer. These agents receive the information from the cluster heads, process it and repackage it.

2.2 Interface Layer

The interface layer is operated by the interface and CBR agents. The CBR agent maintains a case base implemented in SQL (Figure 2). This agent uses CBR methods to assess the occurrence of earthquakes. It then sends all the information to the interface agent for graphical display.

Figure 2: A case base for animal vital signs

2.3 CBR

In this project, CBR methods and algorithms were used. CBR (Case-Based Reasoning) systems resolve new problems by retrieving similar resolved problems from the case base and reapplying their solutions to the new problem [6]. The first step of a CBR system is case representation: a case must contain the problem description, its solution and its outcome. From this perspective a CBR system can be defined by three iterative steps:

1. Retrieve the cases most similar to the new problem from the case base.
2. Reuse the solutions of these retrieved cases. If necessary, adapt their solutions to resolve the new problem by creating a suitable solution for it.
3. Keep the new solution in the case base in order to use it in the future.

2.4 Similarity Coefficients

Various similarity coefficients have been proposed by researchers in several domains. A similarity coefficient indicates the degree of similarity between object pairs. The methods are shown in Figure 3 [5]. The variables are defined as follows: a is the number of properties present in both cases, the new case and a recorded case in the case base; b is the number of properties present only in the new case; c is the number of properties present only in the recorded case. The steps of a CBR system following an analytical approach can be ordered in five steps:

1. The abnormal vital signs enter the system as a new case.
2. Thanks to the interviews made with the experts, weights have already been attributed to every property.
3. For every recorded case in the base, the similarity coefficient (Sij) between the recorded case and the new case is calculated.
4. Cases which have a similarity coefficient under the similarity limit are eliminated.
5. Cases which are very similar to the new case are retrieved from the case base.

In this project, a calculation formula for the similarity coefficient is developed from the Jaccard model:

ci = ( Σi=1..n Wi·ai ) / ( Σi=1..n (Wi·ai + Wi·bi + Wi·ci) )      (1)

In this formula, a represents the properties found in both cases, the new case and every registered case in the case base; b the properties found only in the new case; c the properties found only in the registered case.

Figure 3: Different ways of calculating the similarity coefficient in the literature

Table 1: Importance degrees and weights belonging to the properties

Property        Degree (1-9)    Weight
Heart rate      9               0.16
Shaking         8               0.15
Temperature     7               0.14
Breath rate     6               0.13
Blood glucose   5               0.11
Urine volume    4               0.9
Calcium         3               0.8
Proteins        2               0.7
Enzymes         1               0.6

Table 2: The values of cat vital signs

Property        Normal range          Effect of stress
Heart rate      60-120 per minute     Increase
Shaking         -                     Increase
Temperature     37-40 C               Decrease
Breath rate     20-23 per minute      Increase
Blood glucose   53-59 mg per cc       Increase
Urine volume    89-109 mg per cc      Increase
Calcium         11-14 mg per cc       Decrease
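For concreteness, the following Python sketch computes a weighted Jaccard-style similarity in the spirit of Eq. (1). Representing a case as the set of abnormal properties it exhibits, and the small set of example weights taken from Table 1, are assumptions made only for illustration.

def weighted_jaccard(new_case, stored_case, weights):
    """Weighted Jaccard-style similarity (cf. Eq. 1): weighted shared properties
    divided by weighted shared plus unshared properties."""
    shared = new_case & stored_case        # a: properties in both cases
    only_new = new_case - stored_case      # b: properties only in the new case
    only_old = stored_case - new_case      # c: properties only in the recorded case

    num = sum(weights.get(p, 0) for p in shared)
    den = num + sum(weights.get(p, 0) for p in only_new | only_old)
    return num / den if den else 0.0

# Illustrative weights from Table 1 (heart rate 0.16, shaking 0.15, ...).
weights = {"heart rate": 0.16, "shaking": 0.15, "temperature": 0.14, "breath rate": 0.13}
new = {"heart rate", "shaking", "breath rate"}
old = {"heart rate", "temperature", "breath rate"}
print(weighted_jaccard(new, old, weights))  # similarity of the new case to a stored case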

2.5 Writing Algorithm

The program compares the similarity of the stored cases with the new case to find the best solution. For that reason, we need a method to calculate similarity coefficients, so we have to determine a weight for the important properties such as heart rate, temperature, calcium, breath rate, etc.; the properties have a given value in normal conditions. Table 1 indicates the importance degrees and weights belonging to the properties. The values and the effect of stress on cats are represented in Table 2 [7].

3 Conclusion


We presented a layered system architecture using agents for wireless sensor networks that can be useful for predicting earthquakes. The network consists of several sensor nodes that are able to sense the vital signs of animals. Since changes in animal vital signs may be due to other factors, such as noise or the entrance of an alien animal into their territory, CBR methods are employed to increase the confidence coefficient. The proposed system has some disadvantages, such as the possibility of sensors detaching from the animal's body or being damaged during the animal's activity. The other disadvantage is the limitation on the number of animals, because for each animal there is a corresponding cluster agent in the regional layer. However, we can design a multi-hop WSN to resolve this problem in the future.

References

[1] United States Geological Survey (USGS), http://earthquake.usgs.gov.
[2] R. Harrison and K. Aplin, Atmospheric electricity coupling between earthquake regions and the ionosphere, Department of Meteorology, University of Reading, Earley Gate, Reading RG6 6BB, UK, Atmospheric and Solar-Terrestrial Physics 72 (2009), 376-381.
[3] E. Buskirk, Unusual animal behavior before earthquakes: A review of possible sensory mechanisms, Reviews of Geophysics 19 (1981), 247-270.
[4] S. Hussain, Collaborative Agents for Data Dissemination in Wireless Sensor Networks, High-Performance Computing in an Advanced Collaborative Environment (2006), 16-23.
[5] P. Katia, Multiagent systems, AI Magazine 19 (1998).
[6] B. Gulcin, Intelligent system applications in electronic tourism, Elsevier 38 (2010), 6586-6598.
[7] M. Cynthia and R. Klein, MERCK Veterinary Manual, MERCK Publishing, Chapter 5, pages 201-290, 2006.


Availability analysis and improvement with Software Rejuvenation


Zahra Rahmani Ghobadi

Baharak Shakeri Aski

Samangan Institute of Higher Education,Amol

Ramsar Azad University

Department of Computer

Department of Computer

m.rah62@gmail.com

baharakshakeriaski@yahoo.com

Abstract: Today, almost everyone in the world is directly or indirectly affected by computer systems. Therefore, there is a great need to look at ways to increase and improve the reliability and availability of computer systems. Software fault tolerance techniques improve these capabilities. One software fault tolerance technique is software rejuvenation, which counteracts software aging. In this paper, we address this technique for applications with one, two and three software versions, then extend the model to n versions and show that more software versions can greatly improve the availability of the application.

Keywords: software rejuvenation; Reliability; Availability; continuous-time Markov process.

Introduction

When software applications run continuously, error conditions accumulate and the result is a degradation of the computer system or even a crash failure. This phenomenon has been reported as software aging. A proactive method to counteract this phenomenon is software rejuvenation.
The causes of software aging are memory leaks, unreleased file locks, file descriptor leaks, data corruption in the operating environment of system resources, etc. Software aging affects the performance of the application and eventually causes the application to fail. The software rejuvenation technique terminates the program when its performance declines to a certain degree, then restarts it to clean the internal state, and the software performance is restored.
Software rejuvenation was first reported by Huang et al. [1]. It has since been applied in various systems in which software aging has been observed, such as billing applications and process restart in Apache [3]. Furthermore, software rejuvenation has been proposed as an action that increases the availability of two-node clustered computer systems [4], improves service reliability in VoIP servers [5], and counteracts intruder attacks [6].
Many researchers have concentrated on studying software rejuvenation under different circumstances. The research effort varies with respect to the system studied or the kind of modeling that is used.


Huang et al. [1] use a continuous-time Markov chain to model software rejuvenation. Vaidyanathan et al. [7] use stochastic reward nets (SRNs) to model and analyze cluster systems which employ software rejuvenation. Park and Kim [8] use a semi-Markov process to model software rejuvenation in order to improve the availability of personal-computer-based active/standby cluster systems. In [9] both checkpointing and rejuvenation are used together to further reduce the expected completion time of a program. In [10] Dohi et al. formulate software rejuvenation models via semi-Markov processes and derive analytically, for the respective cases, the optimal software rejuvenation schedules which maximize system availability. Furthermore, they develop nonparametric statistical algorithms to estimate the optimal software rejuvenation schedules, provided that complete (uncensored) sample data are available. In [7] Vaidyanathan et al. construct a semi-Markov reward model based on workload and resource usage data collected from the UNIX operating system to model software rejuvenation. Trivedi et al. discuss stochastic models to evaluate the effectiveness of proactive fault management in operational software systems and determine optimal times to perform rejuvenation for different scenarios in [12]. Two software rejuvenation policies for cluster server systems under varying workload, called fixed rejuvenation and delayed rejuvenation, are presented, and one of them is recommended in order to achieve a higher average throughput, by Xie et al. in [13]. Okamura et al. in [14] deal with the dependability analysis of a client/server software system with rejuvenation. Liu et al. in [15] use software rejuvenation as a proactive system maintenance technique deployed in a CMTS (Cable Modem Termination System) cluster system, study different rejuvenation policies and evaluate these policies with stochastic reward net models solved by SPNP (Stochastic Petri Net Package). In [16] the optimal software rejuvenation policy maximizing the interval reliability in the general semi-Markov framework is considered by Suzuki et al. Furthermore, Bobbio et al. in [17] use fine-grained software degradation models for optimal rejuvenation policies.

Software Rejuvenation

Figure 1: Software rejuvenation model of a single application

Software rejuvenation is a proactive fault management technique aimed at cleaning up the internal state of the system to prevent the occurrence of more severe crash failures in the future. It involves occasionally terminating an application or a system, cleaning its internal state and restarting it [18]. The application is unavailable during rejuvenation. Although rejuvenation may sometimes increase the downtime of an application, those downtimes are usually planned and scheduled. If care is taken to schedule rejuvenation during the idlest times of an application, the cost due to those downtimes is expected to be small. Downtime costs are the costs incurred due to the unavailability of the service during the downtime of an application [2]. Let Pij(t) be the transition probability function of a continuous-time Markov process and qij be the transition rate. The Kolmogorov forward equation is defined as follows:

dPij(t)/dt = Σk=0..N Pik(t) qkj,   i, j = 0, 1, 2      (1)

Letting P(t) be the matrix of transition probability functions Pij(t) (i, j = 0, 1, 2) and Q the matrix of transition rate functions qij(t) (i, j = 0, 1, 2), formula (1) can be expressed in matrix form as follows:

P'(t) = P(t) Q      (2)

First, we study the software rejuvenation model for an application with one software version, a model based on a Markov process, as shown in Fig. 1.

The system has three states: the working state 0 (denoted H), the failure state 1 (denoted F) and the rejuvenation state 2 (denoted R). In the beginning, the application stays in the working state 0. As system performance degrades over time, a failure may occur. If a system failure occurs before software rejuvenation is triggered, the application changes from the working state 0 to the failure state 1 and the system recovery operation starts immediately. Otherwise, the application changes from the working state 0 to the software rejuvenation state 2 and the software rejuvenation is then carried out. After completing the system repair or rejuvenation, the application becomes as good as new and returns to the initial working state 0. We define the time interval from one beginning of system working to the next as one cycle. According to the model described above, at any time t the application can be in any one of three states: up and available for service (working state 0), recovering from a failure (failure state 1), or undergoing software rejuvenation (rejuvenation state 2). To formally describe the software rejuvenation model of the single-version application, a continuous-time Markov process Z = (Zt; t ≥ 0) is used, where Zt represents the state of the application at time t. The transition probability function of Z is expressed as follows [10]:

Pij(t) = P(Zt = j | Z0 = i),   i, j ∈ Ω, t ≥ 0      (3)

where Ω = {0, 1, 2} is the state space set. For the software rejuvenation model in Fig. 1, λ1, ρ1, r1 and R1 represent, respectively, the failure rate from the working state to the failure state, the transition rate that triggers software rejuvenation, the rejuvenation rate from the rejuvenation state back to the working state, and the recovery rate from the failure state back to the working state. Let Q be the matrix of transition rate functions. According to the state transition relationships of the single-version application, the transition rate matrix of the continuous-time Markov process Z can easily be derived as:

Q = [ -(λ1 + ρ1)   λ1    ρ1 ;   R1   -R1   0 ;   r1   0   -r1 ]      (4)

Let P(t) be the matrix of transition probability functions Pij(t), i, j ∈ Ω.
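As a numerical illustration of Eqs. (4)-(8), the following Python sketch builds the transition rate matrix Q of the single-version model and solves the steady-state balance equations for the availability PA1 = P0. The rate values used at the end are invented purely for the example and are not taken from the paper.

import numpy as np

def single_version_availability(lam, rho, r, R):
    """Steady-state availability of the single-version rejuvenation model:
    state 0 = working, 1 = failed (recovery rate R), 2 = rejuvenating (rate r);
    lam = failure rate, rho = rejuvenation trigger rate."""
    Q = np.array([
        [-(lam + rho), lam, rho],   # leaving the working state
        [R,            -R,  0.0],   # recovery returns to working
        [r,            0.0, -r],    # rejuvenation returns to working
    ])
    # Solve pi Q = 0 together with sum(pi) = 1 by replacing one balance
    # equation with the normalization condition.
    A = np.vstack([Q.T[:-1], np.ones(3)])
    b = np.array([0.0, 0.0, 1.0])
    pi = np.linalg.solve(A, b)
    return pi[0]                    # availability PA1 = P0

# Example rates (per hour), chosen only for illustration.
print(single_version_availability(lam=0.001, rho=0.01, r=2.0, R=0.5))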


According to the Kolmogorov forward equation (Eq. 1), the transition probability matrix P(t) satisfies:

P'(t) = P(t) Q,   P(0) = I      (5)

where I is the unit matrix.

Let Pj, j ∈ Ω, be the steady-state probability of the single-version application being in state j. According to the limit distribution theorem, Pj is given by:

Pj = lim(t→∞) Pij(t),   i, j ∈ Ω      (6)

Substituting Eq. (4) and Eq. (6) into Eq. (5), the following equations are derived:

-(λ1 + ρ1) P0 + R1 P1 + r1 P2 = 0
-R1 P1 + λ1 P0 = 0
-r1 P2 + ρ1 P0 = 0
P0 + P1 + P2 = 1      (7)

where Pi, i = 0, 1, 2, can be obtained by solving Eq. (7). The application is available for service requests in the working state 0 and unavailable in the failure state 1 and the rejuvenation state 2; therefore, the system availability of the single-version application is given by:

PA1 = P0      (8)

2.1 Software rejuvenation model of two-node application

We extend the software rejuvenation model of the single application to a two-dimensional state space and derive the software rejuvenation model of the two-node application shown in Fig. 2. The states of the application are denoted by a 2-tuple S, formally defined as S = {(i, j) | i, j ∈ {H, F, R}}, where i is the state of the first version of the application and j is the state of the second version. For the first version of the application, λ1, ρ1, r1 and R1 represent the failure rate from the working state to the failure state, the transition rate that triggers software rejuvenation, the rejuvenation rate from the rejuvenation state back to the working state, and the recovery rate, respectively. Correspondingly, for the second version of the application, λ2, ρ2, r2 and R2 denote the failure rate, the transition rate that triggers software rejuvenation, the rejuvenation rate and the recovery rate, respectively.

For simplicity we made some limiting assumptions for this model. The assumptions are as follows:

Assumption 1: Software rejuvenation is not allowed to be carried out for both versions concurrently.
Assumption 2: At any time t only one version can be in the rejuvenation state.
Assumption 3: If one version is in the failure state, the other versions cannot transfer to the rejuvenation state.
Assumption 4: The rejuvenation rate from the rejuvenation state back to the working state is faster than the recovery rate from the failure state back to the working state.

It is also assumed that Zt is the state of the application at time t and Ω' = {0, 1, 2, ..., 7} is the state space set. Similarly, we use a continuous-time Markov process, denoted Z = (Zt; t ≥ 0), to describe the software rejuvenation model of the two-node application. The transition probability function of Z is expressed as in Eq. (10), and Pj, j ∈ Ω', is given by [19]:

Pj = lim(t→∞) Pij(t),   i, j ∈ Ω'      (9)

Correspondingly, the transition probability matrix P(t) also satisfies the condition in Eq. (5). By substituting Eqs. (9) and (10) into Eq. (5), the equations in (11) can be derived [19]:

-(λ1 + λ2 + ρ1 + ρ2) P0 + R1 P1 + R2 P2 + r1 P4 + r2 P5 = 0
-(R1 + λ2) P1 + λ1 P0 + R2 P3 + r2 P7 = 0
-(R2 + λ1) P2 + λ2 P0 + R1 P3 + r1 P6 = 0
-(R1 + R2) P3 + λ2 P1 + λ1 P2 = 0
-(r1 + λ2) P4 + ρ1 P0 + R2 P6 = 0
-(r2 + λ1) P5 + ρ2 P0 + R1 P7 = 0
-(r1 + R2) P6 + λ2 P4 = 0
-(r2 + R1) P7 + λ1 P5 = 0
P0 + P1 + ... + P7 = 1      (11)

By solving the above equations, we can obtain the values of Pi, i ∈ {0, 1, ..., 7}. According to the rejuvenation

215

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

model in Fig.3, the application is unavailable in the number of existence states, and transition rate funcstate of (F,F),(R,F) and (F,R). Thereafter, the avail- tion matrix, for every number version. After accountability of two-node application is given by:
ing of this matrix and it placing in Eq.5 we can obtain
present probability in every state. By accounting of
pA2 = p0 + p1 + p2 + p4 + p5
these probabilities, can obtain availability system by
following formula:
= 1 (p3 + p6 + p7 )
(12)
pA = 1 (pm1 + pm2 + + pmn + p2n 1 ) (14)

Suppose that n software version be available, the number of states at any time t account with following formula:
n

n2

m=3 2

 
 
 
n
n3 n
n4 n
2
2

2
3
4

n(n1)


 
n
nn n
2
n1
n

(15)

Figure 2: Software rejuvenation model of two applications.


Which 3n is all states that exist,2n2 n2 is number of
states that 2 version are in rejuvenation states, Accord2.2 Software rejuvenation model of ing to Assumption 2, At any time t only one version can
be in rejuvenation state therefore number of states that
three-node application
have repeated versions in rejuvenation
state, should

be deduct from 3n . So, 2n3 n3 is the number of
We study this work for three-dimension state space and
states that 3 version are in rejuvenation state and figain the less unavailability by Software rejuvenation
nally ,2nn nn is the state that all the versions be in
model of three-node application as show in Fig.3. Q is
rejuvenation state.
matrix of the transition rate function as in Eq.16.By
solving the obtained equations, we obtain the value
ofPi , i = {0, 1, 2 19}. According to the rejuvenation model in Fig.3, the application is unavailable in
the state of (F,F,F),(R,F,F),(F,R,F),(F,F,R). Thereafter, the system availability of three-node application
is given by[20]:
pA3 = 1 (p7 + p17 + p18 + p19 )

2.3

(13)

Software rejuvenation model of nversion application

As considered in previous parts, we determined the


software rejuvenation model for applications with one,
two and three software versions. By doing several experiments, we could expand this model to n software
versions, and obtained a formula for account of state
space set and also introduce reference matrix that denote the transition rate function matrix for n software
version. The non-zero elements of the matrix are show
in Table 1. In the table, there are three main columns
in which the row number, column number and value
of each non-zero element. We can, therefore, obtain

(16)

Table1: The non-zero elements in the transition rate function


matrix for n-version

216

The Third International Conference on Contemporary Issues in Computer and Information Sciences

R#
C#
Value
R#
C#
Value
0
1
1 2n n 2
2n 1
n
0
2
2
...
2n 1
...
0
...
...
2n 2
2n 1
1
0
n
n
2n
0
r1
0
2n
1
2n + 1
0
r2
0
2n + 1
2
...
0
...
0
...
... 2n + n 1
0
rn
0
2n + n 1 n
2n
2n + n
2
1
0
R1
2n
2n + n + 1 3
1
n+1
2
2n
...
...
1
n+2
3
2n
2n + 2n 2 n
1
...
...
2n + 1 2n + 2n 1 3
1
2n
n
2n + 1
...
...
2
0
R2
2n + 1
...
n
2
n+1
1
2n + 1 2n + 3n 2 1
2
2n+1
3

2
2n+2
4 2n + n 1

1
2
...
... 2n + n 1

2
2
3n-2
n 2n + n 1

...
3
0
R3 2n + n 1

n1
3
n+2
1
2n + n
2
r1
3
2n+1
2 2n + n + 1
3
r1
4
2n+2
2
...
...
r1

2n + 2n 2
n
r1
n
0
Rn 2n + 2n 1
3
r2
n
2n
1
...
...
r2
n+1
3n-2
2 2n + 3n 4
n
r2
n+1
1
R2

n+2
1
R3
n2
n
rn1
...
...
...
n2 1
n-1
rn
2n-1
1
Rn
...
...
rn
n+1
2
R1 n 2 n 2
1
rn
n+2
3
R1
...
n-1
rn1
...
...
...
...
n-1
rn1
2n-1
n
R1

2n
2
R3
m-n-2
1
r2
2n+1
2
R4
2n + n
m-n
n
...
...
...
...
m-n
...
3n-2
2
Rn
...
m-n
2
2n
3
R2 2n + 2n 1
m-n
1
2n+1
4
R2

...
...
...
m-3n
m-1
1
3n-2
n+1
R2
...
m-1
n

...
m-1
...
2n 2 2n 2n 4 Rn m 2n 1
m-1
2
2n 2
...
...
m-2n
m
n
2n 2
2n n 3 R2
...
m
...
2n 1
2n n 2 Rn
...
m
2
2n 1
...
...
m-n-1
m
1
2n 1
2n 2
R1
m-1
2n n 2 rn
2n 2n 4 2n 2
n
m-2
2n n 3 rn1
...
2n 2
...
...
...
r2
2n n 3
2n 2
2
m-n-1
2n 2
r1

After computing m, the transition rate function matrix, which is an m×m matrix, is determined from the reference matrix. The probability of being in each state (Pi, i = 1, 2, ..., m) is then obtained by matrix multiplication and by solving the resulting equations; substituting these Pi into Eq. 14 gives the probability of system availability.
Computing the availability for applications with multiple versions leads to the conclusion that increasing the number of versions consistently and considerably decreases the system unavailability.
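As an illustrative sketch (not the authors' code), the availability computation described above can be reproduced numerically for the single-version model of Eq. 4-8, together with the state-count formula of Eq. 15. The assignment of the Table 2 values 0.005 and 0.002 to the failure rate and the rejuvenation trigger rate is an assumption made here for demonstration.

```python
# Minimal sketch: steady state of the single-version rejuvenation model (Eq. 4-8)
# and the state-count formula of Eq. 15. Parameter assignment is an assumption.
import numpy as np
from math import comb

lam, gam = 0.005, 0.002      # assumed: failure rate, rejuvenation trigger rate (Table 2)
r, R = 1.0, 0.1              # rejuvenation rate, recovery rate (Table 2)

# States: 0 = working, 1 = rejuvenation, 2 = failure; rows sum to zero (Eq. 4).
Q = np.array([[-(lam + gam), gam,  lam],
              [r,            -r,   0.0],
              [R,            0.0,  -R ]])

# Solve p Q = 0 together with sum(p) = 1 (Eq. 5-7).
A = np.vstack([Q.T, np.ones(3)])
p, *_ = np.linalg.lstsq(A, np.array([0.0, 0.0, 0.0, 1.0]), rcond=None)
print("steady-state probabilities:", p.round(6))
print("single-version unavailability:", round(1.0 - p[0], 6))   # pA1 = p0 (Eq. 8)

# Number of states of the n-version model (Eq. 15);
# n = 2 gives 8 and n = 3 gives 20 states, matching Sections 2.1 and 2.2.
def m_states(n):
    return 3 ** n - sum(comb(n, j) * 2 ** (n - j) for j in range(2, n + 1))

print([m_states(n) for n in range(1, 6)])    # [3, 8, 20, 48, 112]
```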


Figure 3: Software rejuvenation model of three applications.


Table 2: Default parameter values
r1 = ... = rn = 1
R1 = ... = Rn = 0.1
λ1 = ... = λn = 0.005
γ1 = ... = γn = 0.002

Table 3: System unavailability for different numbers of versions
Number of versions    Unavailability
1                     0.047528
2                     0.022814
3                     0.022438
4                     0.00108
5                     0.00061

Numerical Results and Analysis

To acquire the availability measure of the application, we perform numerical experiments taking the system unavailability as the evaluation indicator. The default values of the system parameters in the software rejuvenation model are given in Table 2. All parameter values are selected from experimental experience for demonstration purposes.
The change in the unavailability of the software application with different numbers of versions and rejuvenation rates is reported in Table 3. The number of versions is varied from simplex to multiplex (n = 5), and at the same time we perform software rejuvenation with rates from 0.5 upward (rate = 0: no rejuvenation). The decrease in unavailability from simplex to duplex is significant. We can see that the number of versions strongly influences system reliability: as the number of versions increases, the system unavailability reduces rapidly and approaches a steady value.

Conclusion

In this paper, we presented the software rejuvenation structure and set up the software rejuvenation model in one-, two- and three-dimension state spaces for one application. In the model, the system availability formula is derived from a continuous-time Markov process. The numerical experiment results show that the system unavailability decreases greatly when the number of versions increases.

[10] T. Dohi, K. Goseva, and K. Trivedi, Statistical nonparametric algorithms to estimate the optimal software
rejuvenation schedule: ACM SIGMETRICS Conf., ACM
Cambridge, MA (2000).

References

[11] K. Vaidyanathan and K. Trivedi, A comprehensive model


for software rejuvenation: IEEE Transactions on Dependable and Secure Computing 2 (2005).

[1] S. Yu, CH. Qi, and H. Xin, Positive software fault-tolerate


technique based on time policy, Communication and Computer 4 (2007), 1114.
[2] Y. Huang, C. Kintala, N. Koletis, and N. Fulton, Software
rejuvenation: analysis, module and application: Symposium
on Fault Tolerant Computing (1995), 381390.
[3] L. Li, K. Vaidyanathan, and K. Trivedi, An approach to
estimation of software aging in a web server: International Symposium on Empirical Software Engineering ISESE (2002), 125129.
[4] V. P. Koutras and a. N. Platis, Applying Software Rejuvenation in a Two Node Cluster System for High Availability: International Conference on Dependability of Computer
Systems (DEPCOS-RELCOMEX06), Poland 6 (2006),
175182.
[5] A. Platis and V. P. Koutras, Optimal rejuvenation policy
for increasing VoIP service reliability: European Safety and
Reliability (ESREL 2006) Conference (2006), 22852290.
[6] K. M. Aung and J. S. Park, A framework for software rejuvenation for survivability: 18th International Conference
on Advanced Information Networking and Applications 2
(2004).
[7] K. Vaidyanathan and R. Harper, Analysis and implementation of software rejuvenation in cluster systems: ACM
SIGMETRICS Performance Evaluation Review 29 (2001).

[12] K. Trivedi and K. Vaidyanathan, Modelling and analysis of


software aging and rejuvenation: 33rd IEEE Annual Simulation Symposium (2000).
[13] W. Xie and Y. Hong, Software rejuvenation policies for
cluster systems under varying workload: 10th IEEE Pacific
Rim International Symposium on Dependable Computing
(PRDC04) 18 (2004), 163177.
[14] H. Okamura, S. Miyahara, and T. Dohi, Dependability analysis of a client/server software system with rejuvenation:
13th International Symposium on Software Reliability Engineering (2002).
[15] Y. Liu, K. Trivedi, Y. Ma, and H. Levendel, Modelling and
analysis of software rejuvenation in cable modem termination system (2002).
[16] H. Suzuki, K. Trivedi, T. Dohi, and N. Kaio, Modelling and
analysis of software rejuvenation in cable modem termination system (2003).
[17] A. Bobbio, M. Sereno, and C. Anglano, Fine grained software degradation models for optimal rejuvenation policies
46 (2001).
[18] T. Thein and J. Park, Availability Analysis of Application
Servers Using Software Rejuvenation and Virtualization,
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24 (2009), 339346.

[8] K. Park and S. Kim, Availability analysis and improvement


of active/standby cluster systems using software rejuvenation: ACM SIGMETRICS Performance Evaluation Review,
Journal of Systems and Software 61 (2002).

[19] Q. Yong, M. Haining, H. Di, and Ch. Ying, A Study on


Software Rejuvenation Model of Application Server Cluster in Two-Dimension State Space Using Markov Process,
Information Technology Journal 7 (2008), 98104.

[9] S. Garg, Y. Huang, K. Kintala, and K. Trivedi, Minimizing


completion time of a program by checkpointing and rejuvenating: ACM SIGMETRICS Conf., ACM Cambridge, MA
(1996), 252261.

[20] Z. R. Ghobadi and H. Rashidi, Software Rejuvenation


Technique-An Improvement in Applications with Multiple
Versions, Computers and Intelligent Systems Journal 2
(2010), 2226.


A fuzzy neuro-chaotic network for storing and retrieving pattern


Nasrin Shourie
Islamic Azad University, Science and Research Branch
Department of Biomedical Engineering
Tehran, Iran
shourie.n@srbiau.ac.ir

Amir Homayoun Jafari
Tehran University
School of Medicine
Tehran, Iran
amir h jafari@aut.ac.ir

Abstract: In this paper, a fuzzy neuro-chaotic network is proposed for retrieving pattern. Activation function of each neuron is a logistic map with flexible searching area. Bifurcation parameter
and searching area of each neuron are determined depending on its desired output. They are obtained using two fuzzy systems, separately. In the beginning of training process, desired patterns
are stored in fixed points by use of pseudo-inverse matrix learning algorithm. Then required data
for constructing of the fuzzy systems are provided. The fuzzy rule bases are designed by use of look
up table scheme based on provided data. In the retrieving process, all neurons are initially set to
be chaotic. Each neuron searches for its state space completely to find its correct periodic points.
When this occurs, the neuron is driven to periodic state of period 2. In this case, the bifurcation
parameter and the searching area of the neuron are determined by the two obtained fuzzy systems.
When all neurons are driven to periodic state, the desired pattern is retrieved. Computer simulations represent the remarkable performance of the proposed model in the field of retrieving noisy
patterns.

Keywords: Chaotic neural model, Bifurcation, Fuzzy rules, Pattern retrieving.

Introduction

Chaotic behavior exists in many biological systems specially, in behavior of biological neuron. Observation of
chaotic behavior in biological neuron persuades many
researchers to consider these properties in artificial
neural network models, in order to obtain new computational capability. Hence, numerous chaotic neural
models with ability of representing chaotic behavior
and data processing were offered until now.
For example, G. Lee and N.H. Farhat proposed a
chaotic pulse coupled neural network as an associative
memory based on a bifurcation neuron which is mathematically equivalent to the sine circle map [3]. In
another research, a bifurcation neuron is suggested by
M.Lysetskiy and J.M. Zurada that is constructed with
the third iterate of logistic map. It uses an external
input which shifts its dynamics from chaos to one of
the stable fixed points [4]. L.Zhao et al. [5] presented

a chaotic neural model for pattern recognition by using periodic and chaotic dynamics. Periodic dynamic
represents a retrieved pattern and chaotic dynamic corresponds to searching process. A. Taherkhani et al. [6]
designed a chaotic neural network that could be used
for storing and retrieving gray scale and binary patterns. This model contains chaotic neurons with logistic map as activation function and a NDRAM network
which is applied as supervisor model for the neurons of
the model evaluating.
In this paper, we try to show the advantage of
chaotic behavior in artificial neural network. Chaotic
neurons are able to emerge various solutions for a problem. Therefore, we propose a fuzzy neuro-chaotic network, which is capable of pattern retrieving. In this
model, activation function of each neuron is a logistic
map with flexible searching area. Parameters of neurons are obtained using two fuzzy systems, separately.
* Corresponding Author, P. O. Box 1388673111, F: (+98) 21 44524165, T: (+98) 21 44520786

In the training process, data are stored in memory using the pseudo-inverse matrix learning algorithm. Then,


required data for designing the fuzzy systems are provided. The fuzzy rule bases are constructed using look
up table scheme based on provided data. In the retrieving process, noisy pattern is presented to the model as
the initial conditions of neurons. All neurons are initially set to be chaotic. Each neuron starts to search for
its state space to find its proper periodic points. The
neuron that finds its correct periodic points is driven to
periodic state of period 2. Thus, its bifurcation parameter and its searching area are determined by the two
obtained fuzzy systems. When all neurons are driven
to periodic state, the stored pattern is retrieved.

Model Description

The proposed model consists of chaotic neurons which


activation function of each one is a logistic map with
flexible searching area as described below:
x_i(k + 1) = b_i(t) · x_i(k) · (1 − x_i(k)/a_i),   i = 1, 2, ..., N    (1)

where x_i(k) is the output of the i-th neuron and b_i(t) is the bifurcation parameter of the i-th neuron, which determines its dynamical changes. In Eq. (1), the parameter a_i controls the
searching area of ith neuron. Due to using this parameter, it is possible to determine the searching area of
each neuron individually depending on its appropriate
output. Therefore, the model is able to retrieve multi
value content patterns. N is the number of neurons
which equals the number of elements in each pattern vector. The bifurcation diagram of the logistic map with a = 0.5 is represented in Figure 1.
The proposed model is divided into three stages:
the training stage, the designing fuzzy systems stage,
and the retrieving stage.
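A minimal sketch (not the authors' code) of the neuron dynamics in Eq. (1): iterating the logistic map with a flexible searching area a and switching the bifurcation parameter between a chaotic value and a period-2 value. The concrete values Ac = 4.0 and Ap = 3.2 are illustrative assumptions (the paper only fixes Ap(0) = 3.2 in its experiments).

```python
# Sketch of the flexible-search-area logistic neuron of Eq. (1):
#   x(k+1) = b * x(k) * (1 - x(k)/a)
# b = A_c puts the neuron in a chaotic (searching) regime,
# b = A_p drives it towards a period-2 orbit.
def iterate_neuron(x0, b, a, steps):
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(b * x * (1.0 - x / a))
    return xs

A_c, A_p = 4.0, 3.2          # assumed chaotic / periodic bifurcation parameters
a = 1.1                      # initial searching area used in the training stage
chaotic = iterate_neuron(0.3, A_c, a, 200)    # wanders over [0, a]
periodic = iterate_neuron(0.3, A_p, a, 200)   # settles onto two alternating values
print("last chaotic samples :", [round(v, 4) for v in chaotic[-4:]])
print("last periodic samples:", [round(v, 4) for v in periodic[-4:]])
```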

2.1

Training Stage

Then required data for constructing of the fuzzy systems are provided using the training patterns. The
training patterns are noisy versions of basic patterns
that are normalized into [0-1]. Each one of the training
patterns is applied to the model separately as its initial
conditions. At first, all neurons are set to be chaotic
in order to search for their state space completely to
find correct periodic points. As the maximum value
for each element of the training pattern is equal to 1,
initial searching area of each neuron is considered into
[0-1.1] and therefore i (0) is set to 1.1. The dynamic
of each neuron is determined relevant to its error that
is defined as below:

e_i(t) = | x_i(k) − Σ_{j=1}^{N} w_ij x_j(k) |,   i = 1, 2, ..., N    (4)
Where, wij is an element of the connection matrix that
is obtained by Eq. (3) and xi (k) is the output of ith
neuron. As the outputs of neurons in periodic state
alternate with the period of two, t = k 2. Thus ei
of each neuron is evaluated every two time units. The
bifurcation parameter of each neuron is obtained as:

b_i(t) = A_p(0)   if  e_i(t) = | x_i(k) − Σ_{j=1}^{N} w_ij x_j(k) | < θ
b_i(t) = A_c      otherwise    (5)
where θ is the threshold of the error, Ac is a bifurcation
parameter corresponding to chaotic state and Ap (0) is
an initial bifurcation parameter corresponding to periodic orbit with period of 2. If ei (t) is greater than
a threshold, ith neuron still remains in chaotic state.
Otherwise, it indicates that the neuron approximates
its corresponding periodic point. When this occurs, the
neuron is driven to periodic state with period of 2 and
its initial bifurcation parameter is set as bi (0) = Ap (0).
In this case, the output of neuron and its error are
stored for constructing the fuzzy systems and then the
initial i (0) is calculated using Eq.(1). In this way,
bi (0) and the output of neuron are substituted in Eq.
(1). Then Eq. (1) is solved for in a way that one
of the periodic points of logistic map will be equal to
present output of neuron.

Then the output of the present neuron is substituted in Eq. (1) with b_i(0) and a_i(0), and the output of the logistic map is calculated two times, named x1 and x2, where x2 corresponds to the proper periodic point. Thus, the sign of the difference between x2 and x1 is also stored as one of the fuzzy-system inputs. Subsequently, the parameters of the neuron are adjusted using the following equations to minimize the difference between the output of the neuron and its corresponding element in the desired pattern:

b_i = b_i + 2·η1·sign(d_i − o_i)·(x2 − x1)    (6)

a_i = a_i + 2·η2·sign(d_i − o_i)    (7)

Recall that in the training stage the basic patterns are first normalized into [0-1] and then stored in fixed points. It is supposed that the matrix X = {x1, x2, ..., xM} contains the M training patterns, each of which includes N elements. All of the M training patterns are stored in fixed points as [2, 5]:

W X = X    (2)

where W is the connection matrix. It is obtained as below by using the pseudo-inverse matrix learning algorithm [2, 5]:

W = X(XᵀX)⁻¹Xᵀ    (3)

2.2 Design of Fuzzy Systems

The desired pattern in retrieving stage is not given and


therefore the parameters of neurons could not be adjusted using Eq. (6) and (7) in this stage. Each one
of the neurons parameters is determined by three variables: oi , ei and the sign of x. The two fuzzy systems
are designed separately by use of the stored data in the
training stage. Therefore Eq.(6) and (7) could be replaced with constructed fuzzy systems. The fuzzy rule
bases are designed using look up table scheme based on
stored input-output pairs. In order to create a look-up table for each of the fuzzy systems, at first a number
of fuzzy sets are generated which cover input-output
spaces, completely. Then the membership value of
each inputoutput pair in the corresponding fuzzy sets
is determined. The fuzzy sets which have the largest
membership values for each input-output variable are
detected and thereby the Fuzzy IF-THEN rule is generated. Since there are conflicting rules with the same
IF parts and different THEN parts, a degree is assigned
to the generated rules and only one rule from conflicting group with the maximum degree is maintained.
The degree of a rule is determined as product of maximum membership values of input-output variables [1].
Finally, the two Fuzzy systems are constructed using
product inference engine, singleton fuzzifier and center average defuzzifier based on the obtained rule bases.

Figure 1: Bifurcation diagram of the logistic map with a = 0.5 and the searching area in [0-0.5].
Where di and oi are the corresponding element of the
output of neuron in the desired pattern and the output
of the present neuron, respectively. η1 and η2 are the learning rate parameters. By using the updated b_i and a_i,
the neurons output is calculated two times and then
oi is set to the second obtained output. Adjusting of
the parameters of neuron continues until difference between the output of neuron and its corresponding element in the desired pattern is less than a predetermined
amount.
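A minimal sketch (not the authors' implementation) of the two training ingredients described above: storing patterns with the pseudo-inverse rule of Eq. (2)-(3) and the sign-based parameter update of Eq. (6)-(7). The learning rates, pattern sizes and numeric values below are assumed for illustration.

```python
# Sketch of pattern storage (Eq. 2-3) and the parameter update of Eq. (6)-(7).
import numpy as np

rng = np.random.default_rng(0)
N, M = 16, 3                         # pattern length and number of patterns (assumed)
X = rng.random((N, M))               # columns are training patterns, values in [0, 1]

# Pseudo-inverse storage rule: W X = X with W = X (X^T X)^-1 X^T  (Eq. 3)
W = X @ np.linalg.inv(X.T @ X) @ X.T
print("storage check, max |WX - X| =", np.abs(W @ X - X).max())   # ~0

# One sign-based update step for a single neuron (Eq. 6-7), with assumed values.
eta1, eta2 = 0.5, 0.5
b_i, a_i = 3.2, 1.1                  # bifurcation parameter and searching area
d_i, o_i = 0.8, 0.6                  # desired element and current neuron output
x1, x2 = 0.55, 0.82                  # the two periodic points of the neuron
b_i = b_i + 2 * eta1 * np.sign(d_i - o_i) * (x2 - x1)
a_i = a_i + 2 * eta2 * np.sign(d_i - o_i)
print("updated b_i, a_i:", round(b_i, 4), round(a_i, 4))
```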

The parameters of neuron adjusting steps can be


summarized as below:
1. Set b_i(0) = A_p(0) and calculate a_i(0);
2. Calculate the difference between the neuron's output and its corresponding element in the desired pattern; when the result is less than a predetermined number, terminate;
3. Adjust b_i and a_i using Eq. (6) and (7);
4. Calculate the neuron's output two times using the new b_i and a_i, and set o_i to the second obtained output;
5. Go to 2.

The obtained b_i and a_i are stored. This process is performed for the other neurons until all of them are driven to the periodic state. Then this learning process is done for the other training patterns, separately, and their obtained parameters are stored, too. Eventually, a set of data is provided consisting of the resulting b_i and a_i, the error of the neuron (e_i), the output of the neuron (o_i) and the sign of the difference between the two periodic points of the logistic map with period two (Δx). Thereby the two fuzzy systems are constructed, each one responsible for calculating one of the bifurcation parameter and the searching area of the neurons in the retrieving stage.

Figure 2: Ten stored images, each of which contains 16 × 16 pixels.

2.3 Retrieving Stage

In the retrieving stage, the noisy pattern is applied as the initial conditions of the model. The error of each neuron, e_i(t), is evaluated every two iterations, and the neuron whose error is less than the threshold θ is chosen to become periodic. In this case, the output of the neuron and its error are stored and then b_i(0) is set to A_p(0). Then, the initial searching area a_i(0) is obtained. The two periodic points (x1 and x2) of the logistic map are calculated using b_i(0) and a_i(0). Thus, x_i(k), e_i(t) and the sign of the difference between the two periodic points (the latter only for calculating b_i) are applied to the obtained fuzzy systems, and each of the parameters of the neuron is calculated, separately. This process continues until all neurons are driven to the periodic state.

Table 1: The recognition results from applying the images retrieved by the proposed model to the classifier.
Variance of added Gaussian noise (σ):   0.1    0.2    0.3    0.4    0.5
Recognition (%):                        100    100    97.5   91.7   82

Results

To test the proposed model, 100 noisy samples are generated from each one of the basic images. The added noise is Gaussian white noise with mean 0 and variance in [0.1-0.5]. One of the images with its noisy versions is represented in Figure 3. Some of the retrieved images are shown in Figure 4.
The output of the model is applied to a classifier for
The output of the model is applied to classifier for
recognition. We used fuzzy C means (FCM) algorithm
[1] to classify the output of the model. In this way, 10
clusters are assumed and the center of each one is set at
one of the basic images. Using the FCM algorithm, the
membership value of neurons output to each cluster is
evaluated and the cluster with maximum membership
value are chosen as the recognized image. The obtained
results are illustrated in Table.1.
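As a small hedged illustration (not the authors' code) of this classification step, the sketch below computes FCM-style memberships of a retrieved pattern to fixed cluster centers (the basic images) and picks the cluster with the largest membership; the image size and the fuzzifier m = 2 are assumptions.

```python
# Fuzzy memberships of a retrieved image to fixed centers (FCM with fixed centers).
import numpy as np

def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of sample x to each fixed center."""
    d = np.array([np.linalg.norm(x - c) for c in centers]) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

rng = np.random.default_rng(1)
centers = rng.random((10, 256))                    # stand-ins for ten 16x16 basic images
x = centers[3] + 0.1 * rng.standard_normal(256)    # a noisy/retrieved version of image 3
u = fcm_memberships(x, centers)
print("recognized image:", int(np.argmax(u)))      # expected: 3
```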

In order to evaluate the performance of described


model, this model is used in retrieving noisy grey scale
images. Then retrieved images are applied to a classifier. Therefore, ten grey scale images [7] are used
that each one contained 16 16 pixels (Figure 2). In
the training stage, the images are stored and then 10
noisy versions of each one are generated. With the aim
of generating noisy images, Gaussian white noise with
mean 0 and variance 0.1 is added to the basic images.
In this simulation, θ = 0.2, Ap(0) = 3.2, η1 = 0.5, and η2 = 0.5. Ultimately, 24442 input-output pairs are provided for constructing the fuzzy rule bases and, using these, 463 rules are obtained for each of the fuzzy systems, separately.

Figure 3: Example of a stored image with its noisy versions (Gaussian noise with mean 0 and variance σ). (a) Noiseless image, (b) σ = 0.1, (c) σ = 0.2, (d) σ = 0.3, (e) σ = 0.4, (f) σ = 0.5.

4 Conclusion

A new chaotic model for retrieving patterns has been proposed. In this model, the activation function of each neuron is a logistic map with a flexible searching area that has two dynamics: chaotic and periodic. In the chaotic state, each neuron searches its state space completely in order to find the correct periodic points. If the neuron finds its correct periodic points, it is driven to the periodic state, and then its bifurcation parameter and searching area are obtained using the two fuzzy systems. In this model, the bifurcation parameter and the searching area of each neuron are determined individually depending on the desired output. Thereby, the model is able to retrieve multi-value content patterns and faces no limitation in the added noise. The performance of this model is evaluated on retrieving gray-scale images. The obtained results represent the capability of this model in the field of recognition and retrieving patterns.

References
Figure 4: Some examples of pattern retrieval using our
proposed model. (a) Noisy images, (b) retrieved images using proposed model. (c) Noiseless images.

[1] Li Xin Wang, A course in fuzzy systems and control, 1997.


[2] J. Hertz, A. Krogh, and R.G. Palmer, Introduction to the theory of neurocomputing, Addison-Wesley, Reading, MA, 1991.
[3] G. Lee, N.H. Farhat, and S Wu, The Bifurcating Neuron Network 1, Neural Networks 14 (2001), 115131.

[4] M. Lysetskiy and J.M. Zurada, Bifurcating neuron: computation and learning, Neural Networks 17 (2004), 225-232.


[5] L. Zhao, J. C.G. Caceres, A. P.G. Damiance. Jr, and H. Szu,


Chaotic dynamics for multi - value content addressable memory, Neurocomputing 69 (2006), 1628-1636.
[6] A. Taherkhani, A. Mohammadi, A. Seyyedsalehi, and H. Davande, Design of a chaotic neural network by using chaotic nodes and NDRAM network, 2008 IEEE World Congress on Computational Intelligence (WCCI 2008) (2008).
[7] http://www.gaussianprocess.org/gpml/data.

GSM Technology and security impact


Ahmad Sharifi

Jawaharlal Nehru Technological University Hyderabad, INDIA


Department of Information Technology
ahmadsharifi.it@gmail.com

Mohsen Khosravi
Jawaharlal Nehru Technological University Hyderabad, INDIA
Department of Computer Science Engineering
mo kho 1388@yahoo.com

Abstract: GSM (Global System for Mobile Communications) is a standard set introduced by
the European Telecommunications Standards Institute (ETSI) to explain technologies for second
generation (or 2G) digital cellular networks. It was designed to be a secure mobile phone system
with strong subscriber authentication and over-the-air transmission encryption. Security plays
a crucial role in wireless communication. Due to the ubiquitous nature of the wireless medium, it is more susceptible to security attacks than wired communication. Given the daily use of GSM equipment by hundreds of millions of users, more secure and reliable encryption algorithms need to be considered.

Keywords: GSM, Mobile, Cellular network, Security, Key, Algorithm

Introduction

the immediate vicinity.

GSM signifies an extremely successful technology and


bearer for mobile communication system. People use
it not only in business but also in everyday life. It
uses a combination of Frequency Division Multiple
Access (FDMA) and Time Division Multiple Access
(TDMA).GSM system has an allocation of 50 MHz
(890-915 MHz and 935-960MHz) bandwidth in the 900
MHz frequency band. Using FDMA this frequency
band is divided into 124 channels, each with a carrier
of 200 KHz. Using TDMA each of these channels is
then further divided into 8 time slots. Combination of
FDMA and TDMA lead to realization for maximum of
992 channels for transmitting and receiving.

Figure 1: Cellular network architecture

In order to be able to serve hundreds of thousands


of users, the frequency must be reused. This is done
as a 2G technology, the system offers advantages to
through cell. GSM is a cellular network, which means
that cell phones connect to it by searching for cells in both consumers and networks. For the former, it allows
Corresponding

Author, F: (+91)7799199701, T: (+91)7799199701


them to change carriers without changing phones. The system is also preferred by network operators because there are many vendors and companies that implement it.
Because it uses digital technology for speech and signalling, it is more efficient in transmitting data. This is the main reason why the system has become very popular in international roaming.
Another innovative feature of GSM is the short messaging system (SMS or text messaging). This feature allows mobile phone users to send short messages using their phone anywhere around the world. This service has become very popular and is being implemented by other standards. The text services are less expensive than making calls, hence their increasing popularity.
The worldwide emergency number (112) is also part of the system. This feature lets users make contact with emergency numbers. This is very helpful for travelers who are not familiar with local police or hospital numbers.

2 GSM Security

In GSM, security functions, including authentication and confidentiality to encrypt the radio channel, have been defined to protect the air interface. The security of the system depends on the tamper resistance of the Subscriber Identity Module (SIM) that is supplied to the subscriber by the operator. The SIM contains cryptographic algorithms and keys and looks like a small smart card. Before a subscriber can use a mobile phone, his SIM is authenticated to the mobile operator. In the SIM, the authentication algorithm A3 is implemented. It is a one-way function that takes as inputs a random number generated by the operator and the secret subscriber key Ki that is stored in the SIM. The outcome of the algorithm is verified by the operator, and if it is correct the SIM has been authenticated successfully. A3 can be operator specific. To authenticate the subscriber to the SIM, a PIN code is used. With respect to roaming and hand-over, location anonymity is achieved using a temporary identification of the subscriber (TMSI). Confidentiality on the air interface is achieved by scrambling based on the algorithm A5, which is implemented in the mobile phone and in the network. The scrambling key Kc is generated by the operator and by the SIM. Like the authentication algorithm A3, the key generation algorithm A8 takes as inputs the random number generated by the operator and the secret subscriber key Ki that is stored in the SIM.
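As a hedged illustration of this challenge-response flow (not the real A3/A8 algorithms, which are operator secrets), the sketch below uses a keyed hash as a stand-in one-way function; the function names and output sizes are assumptions that only mirror the data flow described above.

```python
# Toy illustration of the GSM challenge-response flow (stand-in A3/A8, NOT the real ones).
import hmac, hashlib, os

def a3_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Stand-in for A3: 32-bit signed response SRES."""
    return hmac.new(ki, b"A3" + rand, hashlib.sha256).digest()[:4]

def a8_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Stand-in for A8: 64-bit ciphering key Kc."""
    return hmac.new(ki, b"A8" + rand, hashlib.sha256).digest()[:8]

ki = os.urandom(16)          # subscriber key held in the SIM and by the operator
rand = os.urandom(16)        # challenge generated by the operator

sres_sim = a3_stand_in(ki, rand)       # computed inside the SIM
sres_net = a3_stand_in(ki, rand)       # recomputed by the operator for verification
kc = a8_stand_in(ki, rand)             # ciphering key later used by the A5 stream cipher

print("authenticated:", hmac.compare_digest(sres_sim, sres_net))
print("Kc:", kc.hex())
```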

Figure 2: General architecture of a GSM network

SIM: Subscriber identity module.


ME: Mobile equipment.
BTS: Base transceiver station.
BSC: Base station controller.
HLR: Home location register.
VLR: Visitor location register.
MSC: Mobile services switching center.
EIR: Equipment identity register.
AuC: Authentication Center.
UM: Represents the radio link.
Abis: Represents the interface between the base stations and base station controllers.
A: The interface between the base station subsystem
and the network subsystem.
PSTN and PSPDN: Public switched telephone network
and packet switched public data network.

Figure 3: Authentication


The security mechanisms specified in the GSM standard make it the most secure cellular telecommunications system available. The use of authentication, encryption, and temporary identification numbers ensures the privacy and anonymity of the system's users, as well as safeguarding the system against fraudulent use. Even GSM systems with the A5/2 encryption algorithm, or even with no encryption, are inherently more secure than analog systems due to their use of speech coding, digital modulation, and TDMA channel access.

2.1

GSM Encryption Algorithms

A partial source code implementation of the GSM A5


algorithm was leaked to the Internet in June, 1994.
More recently there have been rumors that this implementation was an early design and bears little resemblance to the A5 algorithm currently deployed. Nevertheless, insight into the underlying design theory can
be gained by analyzing the available information. The
details of this implementation, as well as some documented facts about A5, are summarized below:
A5 is a stream cipher consisting of three clockcontrolled LFSRs of degree 19, 22, and 23.

Table 1: Brute-force key search times for various key sizes
Key length (bits):             32           40          56           64              128
Time to test all possible keys: 1.19 hours  12.7 days   2,291 years  584,542 years   10.8 × 10^24 years

The time required for a 128-bit key is extremely


large; as a basis for comparison the age of the Universe
is believed to be 1.6 1010 years. An example of an
algorithm with a 128-bit key is the International Data
Encryption Algorithm (IDEA). The key length may alternately be examined by determining the number of
hypothetical cracking machines required to decrypt a
message in a given period of time.
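The entries in Tables 1 and 2 follow from simple arithmetic; the sketch below approximately reproduces them (up to rounding), assuming, as the text does, a machine that tests one million keys per second.

```python
# Approximate reproduction of Table 1 (search times) and Table 2 (machine counts),
# assuming a single machine tests one million keys per second.
RATE = 1_000_000                     # keys per second
YEAR = 365 * 24 * 3600               # seconds in a year

for bits in (32, 40, 56, 64, 128):
    keys = 2 ** bits
    seconds = keys / RATE
    machines_one_year = keys / (RATE * YEAR)
    print(f"{bits:3d}-bit key: {seconds / YEAR:.3g} years to search; "
          f"{machines_one_year:.3g} machines to finish in one year")
```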
Table 2: Number of machines required to search a key space in a given time
Key length (bits)     1 day          1 week         1 year
40                    13             2              --
56                    836,788        119,132        2,291
64                    2.14 × 10^8    3.04 × 10^7    584,542
128                   3.9 × 10^27    5.6 × 10^26    10.8 × 10^24

The clock control is a threshold function of the middle bits of each of the three shift registers.
The sum of the degrees of the three shift registers
is 64. The 64-bit session key is used to initialize
the contents of the shift registers.
The 22-bit TDMA frame number is fed into the
shift registers.
Two 114-bit key streams are produced for each
TDMA frame, which are XOR-ed with the uplink
and downlink traffic channels.
It is rumored that the A5 algorithm has an effective key length of 40 bits.
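A toy sketch of a majority-clocked three-LFSR keystream generator in the spirit of the description above. The tap positions follow commonly cited descriptions of A5/1, but the key loading here is a crude assumption and does not follow the real A5/1 initialization (which also mixes in the 22-bit frame number), so this is only an illustration of the structure.

```python
# Toy majority-clocked generator with three LFSRs of lengths 19, 22 and 23.
# Key loading is a simplification; this is NOT a faithful A5/1 implementation.
def step_lfsr(state, taps, length):
    fb = 0
    for t in taps:                      # feedback = XOR of the tapped bits
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << length) - 1)

def keystream(key64, nbits):
    # crude key loading: split the 64-bit key across the three registers (assumption)
    r = [key64 & 0x7FFFF, (key64 >> 19) & 0x3FFFFF, (key64 >> 41) & 0x7FFFFF]
    lengths = (19, 22, 23)
    taps = ((18, 17, 16, 13), (21, 20), (22, 21, 20, 7))
    clock_bits = (8, 10, 10)
    out = []
    for _ in range(nbits):
        bits = [(r[i] >> clock_bits[i]) & 1 for i in range(3)]
        majority = 1 if sum(bits) >= 2 else 0
        for i in range(3):              # stop/go clocking: step only registers that agree
            if bits[i] == majority:
                r[i] = step_lfsr(r[i], taps[i], lengths[i])
        out.append(((r[0] >> 18) ^ (r[1] >> 21) ^ (r[2] >> 22)) & 1)
    return out

print(keystream(0x0123456789ABCDEF, 16))
```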

2.2

Key Length

Let us focus on key length as a figure of merit of an encryption algorithm. Assuming a brute-force search of every possible key is the most efficient method of cracking an encrypted message (a big assumption), Table 1 summarizes how long it would take to decrypt a message with a given key length, assuming a cracking machine capable of one million encryptions per second.


A machine capable of testing one million keys per


second is possible by today's standards. In considering
the strength of an encryption algorithm, the value of
the information being protected should be taken into
account. It is generally accepted that DES with its 56bit key will have reached the end of its useful lifetime
by the turn of the century for protecting data such as
banking transactions. Assuming that the A5 algorithm
has an effective key length of 40 bits (instead of 64), it
currently provides adequate protection for information
with a short lifetime. A common observation is that
the tactical lifetime of cellular telephone conversations is on the order of weeks. The technical details
of the encryption algorithms used in GSM are closely held secrets. The algorithms were developed in Britain,
and cellular telephone manufacturers desiring to implement the encryption technology must agree to nondisclosure and obtain special licenses from the British
government. Law enforcement and Intelligence agencies from the U.S., Britain, France, the Netherlands,
and other nations are very concerned about the export
of encryption technology because of the potential for
military application by hostile nations. An additional


concern is that the widespread use of encryption technology for cellular telephone communications will interfere with the ability of law enforcement agencies to
conduct surveillance on terrorists or organized criminal
activity.

Security must operate without user assistance,


but the user should know it is happening.

A disagreement between cellular telephone manufacturers and the British government centering on export permits for the encryption technology in GSM was
settled by a compromise. Western European nations
and a few other specialized markets such as Hong Kong
would be allowed to have the GSM encryption technology, in particular, the A5/1 algorithm. A weaker version of the algorithm (A5/2) was approved for export
to most other countries, including central and eastern European nations. Under the agreement, designated countries such as Russia would not be allowed to
receive any functional encryption technology in their
GSM systems. Future developments will likely lead
to some relaxation of the export restrictions, allowing
countries, which currently have no GSM cryptographic
technology to receive the A5/2 algorithm.

Don't relegate lawful interception to an afterthought, especially as one considers end-to-end security.

Base user security on smart cards


Possibility of an attack is a problem even if attack
is unlikely.

Use published algorithms, or publish any specially developed algorithms.

References
[1] GSM/EDGE:A mobile communications system determined
to stay AEU, International Journal of Electronics and Communications (2011).
[2] Steve Gold, Cracking GSM Network Security (2011).
[3] Bernard Menezes, Network Security and Cryptography, CENGAGE Learning (2010).
[4] European Telecommunications Standards Institute (ETSI),
European digital cellular telecommunication system.

Conclusion

In this article, we have described the design and performance issues of GSM cellular networks, the required security, and the development of open international standards. The technical details of the encryption algorithms used in GSM are closely held secrets. GSM
provides a basic range of security features to ensure adequate protection for both the operator and customer.

[5] El-Ghazali Talbi and Herv Meunier, Hierarchical parallel approach for GSM mobile network design, Journal of Parallel
and Distributed Computing 66 (2006), no. 2.
[6] Sukanta Das, Sipra Das (Bit), and Biplab K Sikdar, Nonlinear Cellular Automata Based Design of Query Processor
for Mobile Network, accepted for publication in the proceedings of IEEE SMC, Hawaii (2005).
[7] Brookson C, Can You Clone a Smart Card (SIM):
http://www.brookson.com/gsm/clone.pdf.
[8] Satish Damodaran and Krishna M Sivalingam, Scheduling
algorithms for multiple channel wireless local area networks
Computer Communications (2002).


MicTSP: An Efficient Microaggregation Algorithm Based On TSP


Reza Mortazavi

Saeed Jalili

Tarbiat Modares University

Tarbiat Modares University

Electrical and Computer Engineering Faculty

Electrical and Computer Engineering Faculty

r.mortazavi@modares.ac.ir

sjalili@modares.ac.ir

Abstract: Microaggregation is a perturbative method for statistical disclosure control (SDC).


This technique masks microdata so that they can be released while preserving the privacy of data
owners. The main principle in microaggregation is to group records within clusters and replace them
by the centroid of the corresponding cluster. Each cluster must contain at least k records, where k
is a constant value predefined by data protector. This replacement distorts the dataset and reduces
the utility of published data, which is measured by information loss (IL). Optimal microaggregation
to produce the lowest IL possible for a given k, can be computed in polynomial time for univariate
data, but the problem is NP-hard in the case of multivariate records. Several heuristics have been
proposed in the literature for multivariate microaggregation. This paper proposes MicTSP, an
efficient algorithm for multivariate microaggregation by converting the problem to univariate case
using TSP tour which visits all records in a cyclic manner in the original domain space. The time
complexity of MicTSP, provided that a TSP tour is prepared, is O(nk2 ) which is much better
than related methods in the literature. Experimental results on standard datasets reveal that the
proposed method is effective both in terms of reducing information loss and applicability.

Keywords: Microaggregation; Privacy Preserving Data Publishing (PPDP); TSP; Perturbative Masking Methods

Introduction

Continuous advances in computer technologies enable


corporations to collect enormous amount of personal
data which are valuable resource for researchers and
governmental agencies. However before dissemination
of such data, some procedures must be applied to
make them anonymous. Different approaches to implement a privacy model must tradeoff between what
are needed for privacy level specified by the model and
utility of protected data. A basic computational privacy model is k-anonymity [1], in which the record of a
data owner is made indistinguishable within a group of
at least k 1 other members. Multiple operators may
be used to realize this model, such as generalization
and suppresion, anatomization and permutation, purturbation, and aggregation. A recent survey of these
operators can be found in [2]. This paper focuses on
aggregation operator, where records are first grouped
within clusters and then are replaced by their corre-

sponding centroid. This mechanism is called microaggregation and is widely used in practice. Since original
records are changed, some information is lost after this
anonymization. The more similar records in groups,
the more utility remains in perturbed data. Regarding
microaggregation mechanism, this utility is measured
by information loss (IL) metric. Lower values of IL
means less distortion is introduced and anonymized
dataset is more similar to original one. The optimal
microaggregation problem can be formally defined as
follows: Given a dataset with n records and d numerical attributes, cluster the records into groups, each
of them contain at least k records, such that the sum
of squared error (SSE)
Pc within
Pnj groups is minimized.
2
SSE is defined as
i=1 kXji Xj k , where c
j=1
is the number of groups, nj is the number of records
in j-th group, and Xj is the mean of j-th group. IL
is defined as SSE/SST 100%, where SST is the
sum of squared
the entire dataset and is
Pcerror
Pwithin
nj
calculated as j=1 i=1
kXji Xk2 , where X is the
centroid of the entire dataset.

Author, P. O. Box 143-14115, T: (+98) 21 8288 3374


This problem has been shown to be NP-hard for


the multivariate case, i.e. when d > 1 [3], however
it can be solved in polynomial time when d = 1, in
a method called MHM [4]. Several heuristics have
been proposed that lead to low information loss such
as MDAV [5], DBA [6], and PCA [7]. Recently, some
promising approaches have been proposed that convert the multivariate problem to univariate one, so the
optimal univariate solution can be applied [8]. Generally speaking, if an ordered list of records is prepared
for optimal univariate solution, it can produce optimal
groups for the given list. So the problem of optimal
multivariate microaggregation can be reduced to sort
multi dimensional records based on some criterion in
the original domain space. This paper builds on the
work of [8], while reducing its time complexity from
O(n2 k 2 ) to O(nk 2 ), which makes it practical due to
the fact that k is much smaller than n. This technique
uses a given records order in a cycle. This cycle can be
produced by a solution of travelling salesman problem,
so the method is named MicTSP. Comparing MicTSP
with other methods, shows that the approach can produce lower information loss especially for k < 5, which
are typical in practice, however a refinement procedure
after MicTSP, may adopt it for even larger values of
k. Using previously studied benchmark datasets, our
method is shown to outperform existing microaggregation heuristics.
Section 2, discusses the MHM based microaggregation
method and record ordering heuristics proposed in [8].
Section 3 introduces MicTSP and refinement procedure. Section 4 is devoted to experimental results,
while section 5 concludes the paper.

2
2.1

MHM microaggregation and


related Heuristics
The MHM Algorithm

The MHM algorithm introduced in [4] involves constructing a graph over a list of sorted records, and finding the shortest path in the graph. Each arc in the final
shortest path represents a group containing records under the arc (excluding the record in the beginning of
the arc). MHM constructs the graph as follows: Let
X = X1 . . . Xn be a vector of length n consisting of
the records sorted in ascending manner. Construct a
graph Gk,n . For each record Xi , the graph has a node
with label i. The graph also has one additional node
with label 0. For each pair of graph nodes (i, j) such
that i + k j < i + 2k, the graph has a directed arc
(i, j) from node i to node j. Each arc (i, j) corresponds
a group C(i,j) consisting of {Xh : i < h j}. For each

arc (i, j), let the length L(i,j) of the arc be the within-group sum of squared error for the corresponding group C(i,j), i.e.

L(i,j) = Σ_{h=i+1}^{j} (X_h − M(i,j))²,   where   M(i,j) = (1/(j−i)) Σ_{h=i+1}^{j} X_h

is the centroid of the records in
group C(i,j) . It is proved that every group in each optimal cluster corresponds to an arc of the graph and,
each optimal clustering corresponds to a path from
node 0 to node n in the graph. The length of the
shortest path is equal to SSE of the clustering. The
time complexity of constructing the directed graph is
O(k 2 n). A shortest path algorithm for this graph has
complexity O((k + 1)n) [4]. Since k is small, the algorithm is efficient in practice.
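A compact sketch (an illustrative implementation, not the authors' code) of the MHM idea on an already sorted one-dimensional array: build arcs (i, j) with i + k ≤ j < i + 2k weighted by within-group SSE, and take the shortest path from node 0 to node n.

```python
# Sketch of MHM-style optimal microaggregation for sorted 1-D data.
def mhm_univariate(x, k):
    n = len(x)
    INF = float("inf")
    best = [INF] * (n + 1)          # best[j] = minimal SSE of a partition of x[0:j]
    back = [-1] * (n + 1)
    best[0] = 0.0
    for i in range(n + 1):
        if best[i] == INF:
            continue
        for j in range(i + k, min(i + 2 * k - 1, n) + 1):   # group sizes k .. 2k-1
            grp = x[i:j]
            mean = sum(grp) / len(grp)
            sse = sum((v - mean) ** 2 for v in grp)
            if best[i] + sse < best[j]:
                best[j], back[j] = best[i] + sse, i
    groups, j = [], n
    while j > 0:                    # recover the groups from the back-pointers
        groups.append(x[back[j]:j])
        j = back[j]
    return best[n], groups[::-1]

data = sorted([1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 9.0, 9.1, 9.3])
sse, groups = mhm_univariate(data, k=3)
print(sse, groups)
```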

2.2

MHM based ordering heuristics

MHM algorithm produces optimal clustering for a


given ordering of records. Unfortunately, there does
not exist a unified approach to sort multi dimensional
records, so some ad hoc algorithms have been proposed.
These approaches try to visit the records in original domain space, while preserving locality, i.e. neighbours
must be visited more sooner than other records. Ferrer
et. al proposed nearest point next (NPN), maximum
distance (MD-MHM), maximum distance to average
vector (MDAV-MHM), and centroid-based fixed-size
(CBFS-MHM) heuristics [9]. The NPN heuristic visits nearest neighbour of current record as next visiting
one, starting from the furthest away record from the
centroid of the entire dataset. The MD-MHM heuristic works as follows: Find the two records r and s that
are furthest from one another. Form a group containing r and k 1 closest records to r. Find the k 1
closest records to s and form another group. Repeat
this process, using the remaining records until all the
remaining records are assigned. Stitch the groups together by using NPN heuristic to their centroids. The
MDAV-MHM heuristic is very similar to MD, but considers distances of records from the centroid of the entire dataset rather than distances to other records. The
CBFS-MHM is similar to MDAV-MHM, but considers
only one record at a time to form a group. Another
approach to explore the records is to consider records
as cities and trying to visit all of them in a tour. This
formulation is much more like TSP, while the criterion
on the length of the tour forces to preserve locality constraint indirectly. Recently Heaton et. al [8] have used
the idea to order records in a cycle taken from optimal
tour of equivalent TSP. Even though the problem is
computationally difficult, a large number of heuristics
and exact methods are known, so that some instances
with tens of thousands of cities can be solved [10]. A
good overview of these heuristics can be found in [11].
Experimental results in [8] show the effectiveness of the
approach specially for lower values of k. Despite this


effectiveness, the main drawback of the approach is related to converting a given tour to a path, i.e. selecting
the first record of the path from which the graph must
be constructed. One heuristic would be to delete the
longest edge in the cycle to convert it to a path. Unfortunately, this method doesnt always produce optimal
clustering in terms of IL. Another approach is to test all
possible starting records. Obviously, this method is not
applicable due to its time complexity of O(n2 k 2 ) and
O(n2 k) to construct the graphs and compute shortest
paths, respectively. Additionally, the heuristic of a
shorter tour results in a lower IL, may fail. In the
next section, we illustrate this problem for a small two
dimensional dataset, and propose an efficient refinement procedure to overcome the weakness.

MicTSP

Selecting starting record to convert a tour to a path


is the main difficulty of using TSP to order records
and passing them to MHM algorithm. In this section,
we construct a new graph based on the tour of TSP
and show this graph can be used to produce optimal
clustering regarding the records order in a given tour.
Let X = Xl1 . . . Xln Xln+1 . . . Xln+2k1 be a vector of
length n+2k1 consisting of the records in a TSP tour
with 2k1 repetitions of first records, i.e. Xln+1 = Xl1 ,
Xln+2 = Xl2 , and so on. Set l1 = 1, i.e. the path is
started from the first node in the dataset1 . For a given
k and n, construct a graph Hk,n with n + 2k 1 nodes
with labels from l1 = 1 to ln+2k1 corresponding to
records in X. For each pair of graph nodes (li , lj )
such that i n and i + k j < i + 2k, the graph
has a directed arc (i, j) from node i to node j, so the
graph has nk directed arcs. Each arc (i, j) corresponds
a group C(i,j) consisting of {X_lh : i < h ≤ j}. For each arc (i, j), let the length L(i,j) of the arc be the within-group sum of squared error for the corresponding group C(i,j), i.e.

L(i,j) = Σ_{h=i+1}^{j} (X_lh − M(i,j))²,   where   M(i,j) = (1/(j−i)) Σ_{h=i+1}^{j} X_lh

is the centroid of the records in group C(i,j). The computation of L(1,k) requires O(k) arithmetic operations, while L(1,k+1) can be computed efficiently using only O(1) arithmetic: L(1,k+1) = L(1,k) + (M(1,k) − X_{l_{k+1}})² · k/(k+1), and M(1,k+1) = M(1,k) + (X_{l_{k+1}} − M(1,k))/(k+1). Additionally, the same computation can be used for the remaining arcs, so the time complexity of constructing the graph is reduced from O(k²n) in [4] to O(k + (nk − 1)).
Theorem 1: An optimal clustering corresponds to the shortest path among the 2k − 1 shortest paths from node l_i to node l_{n+i} for 1 ≤ i < 2k. Formally speaking, SSE_OPT = min{ D(i, n+i), 1 ≤ i < 2k }, where D(i,j) is the length of the shortest path from node l_i to node l_j.
Proof: In the optimal clustering, the order of visiting groups does not change the resulting IL. Each of the first 2k − 1 records in the optimal clustering may be a starting node, so it is sufficient to examine only the first 2k − 1 records in H_{k,n}. □
Based on the theorem, it is possible to produce an optimal clustering (for the given tour order) in O(k²n), which is significantly lower than the O(kn²) proposed in [8], given that k ≪ n in practice.
Another drawback of using TSP tour is related to the
principal assumption that always lower tour length produces cycles that can yield clustering with lower IL.
However, this is not always true (specially for larger
values of k). Figure 1 depicts the result of TSP based
microaggregation using shortest tour for a two dimensional dataset consisting of 9 records, while the optimal microaggregation solution in Figure 2, which is
obtained by exhaustive search, produces lower IL. The
problem arises when some near records are assigned to
different clusters, because TSP routes must visit all the
nodes. In other words, the objective function of TSP is
more global than it is required for clustering problem.
We proposed a post processing refinement procedure to
check for such near records in final clustering to overcome this problem. The clusters to which these records
are assigned can exchange them to produce lower IL.
When a pair of clusters are selected to be checked, all
records within them are exchanged temporarily and
if the resulted clustering produces lower IL, the indices of involved clusters and records along with the
amount of reduced IL are saved in the set of potential updates, T , to be applied in the next phase. More
concisely, this reduction can be measured by dSSE =
SSE1 SSE2 , where SSE1 and SSE2 denotes the original and changed SSE, respectively. if dSSE > 0, this
exchange would be a candidate improvement and so is
inserted in T . The calculation of dSSE may be done in
an efficient manner since only local updates have been
occurred in two clusters2 . Suppose two records Xpr
and Xqs of two clusters with indexes of p and q are
considered to be checked. Let M(1,p) , M(1,q) , M(2,p) ,
and M(2,q) be cluster centers of p and q, just before
and after exchanging the records. So = Xqs Xpr ,
M(2,p) = M(1,p) + /np , and M(2,q) = M(1,q) + /nq .
So dSSE = (2np nq (M(2,q) M(2,p) ) + (np +
nq ))/(np nq )).
In the next phase, all candidate exchanges in T are
chosen in greedy manner, i.e. they are first sorted in
descending order based on dSSE, and then, are applied
unless the exchange does not decrease IL, because of

1 l_2 can be selected clockwise or counter-clockwise without affecting the final result of the clustering in terms of the IL measure.
2 The details of the calculation are omitted due to space limitation.


conflict with previous changes. If an exchange is committed on the dataset, involved clusters and all their
neighbours are added to a list to be considered in the
next round. This iteration terminates after no considerable change in IL is achieved or a maximum repeat
count is reached.
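A small numerical check (under the reconstruction of the exchange formulas above) that the closed-form dSSE agrees with recomputing both cluster SSEs from scratch; the toy clusters and exchanged record indices are arbitrary.

```python
# Verify the incremental dSSE of the refinement step against a brute-force recomputation.
import numpy as np

def sse(cluster):
    c = np.asarray(cluster, dtype=float)
    return float(((c - c.mean(axis=0)) ** 2).sum())

rng = np.random.default_rng(42)
P = rng.random((4, 2));  Q = rng.random((5, 2))          # two toy clusters
r, s = 1, 3                                              # records to exchange: P[r] <-> Q[s]

delta = Q[s] - P[r]
M2p = P.mean(axis=0) + delta / len(P)
M2q = Q.mean(axis=0) - delta / len(Q)
dsse_closed = 2 * delta @ (M2p - M2q) - (delta @ delta) * (len(P) + len(Q)) / (len(P) * len(Q))

P2, Q2 = P.copy(), Q.copy()
P2[r], Q2[s] = Q[s], P[r]                                 # perform the exchange
dsse_brute = (sse(P) + sse(Q)) - (sse(P2) + sse(Q2))      # SSE1 - SSE2

print(round(dsse_closed, 10), round(dsse_brute, 10))      # should match
```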
Figure 1: TSP-based microaggregation (TSP tour SSE = 5.1272). Stars denote records, their index and assigned cluster, while discs represent cluster centers and their index. The optimal TSP tour connects the records to each other.

Table 1: Comparison between MicTSP and MDAV

                   IL (%)                                      Time (sec)
Dataset    k    MDAV      MicTSP    MicTSP2*   Imp.**      MicTSP   Refinement
Tarragona  3    16.9326   14.8456   14.7995    12.5976      0.05      0.79
           4    19.5458   17.7523   17.4193    10.8796      0.06      1.51
           5    22.4613   21.1884   20.5634     8.4496      0.06      2.98
           6    26.3252   25.1887   24.0602     8.6039      0.07      3.06
           10   33.1929   33.5223   30.7549     7.3449      0.13      9.56
Census     3     5.6922    5.0710    4.9603    12.8579      0.77      0.84
           4     7.4947    6.8708    6.6662    11.0545      0.09      1.04
           5     9.0884    8.4611    8.0179    11.7788      0.10      2.41
           6    10.3847    9.7662    9.0907    12.4606      0.09      2.79
           10   14.1559   14.6112   12.9363     8.6155      8.03      8.19
EIA        3     0.4829    0.3843    0.3617    25.1129      0.18      3.09
           4     0.6714    0.5262    0.4948    26.2918      0.26      3.46
           5     1.6667    0.8582    0.7730    53.6239      0.30      3.54
           6     1.3078    1.1205    0.9521    27.2014      0.42      5.02
           10    3.8397    2.0756    2.0189    47.4204      0.72      6.48

*  MicTSP2 refers to microaggregation based on TSP followed by refinement.
** Improvement = (MicTSP2 − MDAV)/MDAV × 100%.
Figure 2: Optimal microaggregation (SSE = 4.5618).

Experiments

The results of MicTSP on the standard SDC datasets are shown in Table 1. Tarragona is a dataset of 834 records and 13 attributes, Census consists of 1080 records in a 13-dimensional space, and EIA is an 11-dimensional dataset containing 4092 records. The results of the MDAV method, as the state-of-the-art approach, are also reported for comparison. All experiments are performed on a regular laptop with an Intel Core i7 and 4 GB of memory on Windows 7 using Matlab 2012. Timing results are the best of 5 runs.

References

[1] L. Sweeney, k-anonymity: A model for protecting privacy, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10 (2002), no. 5, 557-570.
[2] B. Fung, K. Wang, and R. Chen, Privacy-preserving data publishing: A survey of recent developments, ACM Computing Surveys (CSUR) 42 (2010), 1-53.
[3] A. Oganian and J. Domingo-Ferrer, On the complexity of optimal microaggregation for statistical disclosure control, Statistical Journal - United Nations Economic Commission for Europe 18 (2001), no. 4, 345-354.
[4] S.L. Hansen and S. Mukherjee, A polynomial algorithm for optimal univariate microaggregation, IEEE Transactions on Knowledge and Data Engineering 15 (2003), no. 4, 1043-1044.
[5] J. Domingo-Ferrer and V. Torra, Ordinal, continuous and heterogeneous k-anonymity through microaggregation, Data Mining and Knowledge Discovery 11 (2005), no. 2, 195-212.
[6] J.L. Lin, T.H. Wen, and J.C. Hsieh, Density-based microaggregation for statistical disclosure control, Expert Systems with Applications 37 (2010), no. 4, 3256-3263.
[7] D. Rebollo-Monedero, J. Forné, and M. Soriano, An algorithm for k-anonymous microaggregation and clustering inspired by the design of distortion-optimized quantizers, Data & Knowledge Engineering (2011).
[8] B. Heaton, New Record Ordering Heuristics for Multivariate Microaggregation, Nova Southeastern University, 2012.
[9] J. Domingo-Ferrer, A. Martínez-Ballesté, and J.M. Mateo-Sanz, Efficient multivariate data-oriented microaggregation, The VLDB Journal 15 (2006), no. 4, 355-369.
[10] M.H. Nasseri and S.H. Khaviari, Solving TSP by considering processing time: Meta-heuristics and fuzzy approaches, Fuzzy Information and Engineering 3 (2011), no. 4, 359-378.
[11] C. Rego, D. Gamboa, and F. Glover, Traveling salesman problem heuristics: Leading methods, implementations and latest advances, European Journal of Operational Research 211 (2011), no. 3, 427-441.

Proposing a new method for selecting a model to evaluate effective factors on job production capabilities of central province industrial cooperatives using Data mining and BSC techniques

Davood Noshirvani Baboli
Islamic Azad University
Young Researchers Club, Arak Branch
Arak, Iran
dnoshirvani@yahoo.com

Peyman Gholami
Islamic Azad University
Young Researchers Club, Arak Branch
Arak, Iran
Peyman711@yahoo.com

Abstract: A high unemployment rate is a serious problem in all countries. Governments try to increase GDP to solve the unemployment problem and have invested heavily in cooperatives as a quick way to create jobs. In this paper 200 industrial cooperatives are inspected according to BSC criteria and their job-making capability is evaluated. The possibility of cooperative job making is then predicted using classification algorithms and BSC criteria, the importance of the factors affecting the job making of cooperatives is calculated using the Fisher score algorithm, the important factors are selected using the CFS algorithm, and finally the database rules are extracted using association rules. The results show the high efficiency of data mining methods for the job-making analysis of industrial cooperatives.

Keywords: Data Mining, Classification Algorithms, Association Rules, Fisher score, Balanced Score Card, Cooperative company, Making Job Opportunity

Introduction

Cooperative organizations are one of the most important social and economic tools and are aimed at special conditions. Cooperative behavior, which can be called social maturity, shows the determination of a society to solve its economic and social problems. Today the cooperative economy is part of the developed economic, social and political knowledge which is taught in many universities across the world. This branch of economy is successfully used in developing countries, helping them to decrease the unemployment rate and to spread social welfare.

Data mining and knowledge discovery (DMKD) has made predominant progress during the past two decades [1]. It utilizes methods, algorithms, and techniques from many disciplines, including statistics, databases, machine learning, pattern recognition, artificial intelligence, data visualization, and optimization [2].

The rest of the paper is organized as follows: Section 2 explains the algorithms used and the BSC, Section 3 describes the experimental study, Section 4 presents the results and conclusions, and future work is outlined in Section 5.

2 Preliminaries

2.1 Classification

Different performance metrics measure different trade-offs in the predictions made by a classifier, and it is possible for a learning method to perform well on one metric but be suboptimal on others. Because of this, it is important to evaluate algorithms on a broad set of performance metrics [7].

Corresponding Author, P. O. Box 3815693699, T: (+98) 918 644 7636


Chen et al. [3] scrutinized the reason why the mixture of experts (ME) performance poorly in multiclass
classification and proposed an approximation for the
NewtonRaphson algorithm to improve the performance
of the ME architecture in multiclass classification.

Platt et al. [4] presented the decision directed acyclic graph architecture and a learning algorithm for multiclass classification. Allwein et al. [5] proposed a unifying framework for multiclass classification using a margin-based binary learning algorithm.

2.2 Association rule

Association rule mining, introduced by Agrawal, Imielinski, and Swami (1993), has been widely used, from traditional business applications such as cross-marketing, attached mailing, catalog design, loss-leader analysis, store layout, and customer segmentation, to e-business applications such as the renewal of web pages and web personalization [9].

2.3 Fisher score

To evaluate the discrimination power of each feature, we use the statistical criterion of the Fisher score, defined as follows:

F_r = \frac{\sum_{i=1}^{c} n_i (\mu_i - \mu)^2}{\sum_{i=1}^{c} n_i \sigma_i^2}    (1)

where n_i is the number of samples in the i-th class, \mu_i is the mean value of the feature in the i-th class, \sigma_i^2 is the variance of the feature in the i-th class, and \mu is the mean value of the feature over all samples.

2.4 Feature Selection in Statistics and Pattern Recognition

Feature subset selection has long been a research area within statistics and pattern recognition. Correlation-based Feature Selection scores a subset of features by

r_{zc} = \frac{k \, \bar{r}_{zi}}{\sqrt{k + k(k-1) \, \bar{r}_{ii}}}    (2)

where r_{zc} is the correlation between the summed components and the outside variable, k is the number of components, \bar{r}_{zi} is the average of the correlations between the components and the outside variable, and \bar{r}_{ii} is the average inter-correlation between components. Equation (2) is, in fact, Pearson's correlation coefficient where all variables have been standardized. The following conclusions can be drawn:

- The higher the correlations between the components and the outside variable, the higher the correlation between the composite and the outside variable.
- The lower the inter-correlations among the components, the higher the correlation between the composite and the outside variable.
- As the number of components in the composite increases (assuming the additional components are the same as the original components in terms of their average inter-correlation with the other components and with the outside variable), the correlation between the composite and the outside variable increases.
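To make the two criteria above concrete, the following sketch computes the Fisher score of Eq. (1) for every feature and the CFS merit of Eq. (2) for a candidate feature subset. It is a minimal reconstruction from the formulas using NumPy, not the authors' code; the function and variable names are ours.

```python
import numpy as np

def fisher_scores(X, y):
    """Eq. (1): F_r = sum_i n_i (mu_i - mu)^2 / sum_i n_i sigma_i^2, per feature."""
    mu = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / den

def cfs_merit(X, y):
    """Eq. (2): merit of the feature subset X with respect to the class variable y."""
    k = X.shape[1]
    r_zi = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(k)])
    r_ii = np.mean([abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
                    for i in range(k) for j in range(i + 1, k)]) if k > 1 else 0.0
    return k * r_zi / np.sqrt(k + k * (k - 1) * r_ii)
```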

3 Experimental study

3.1 Data source

In this paper the information of 200 industrial organizations in the central province based on 2010 survey
are used as Data Base and records are related to these
cooperatives. Used features are a number of BSC criteria:
Primary investment
Amount of export
Relatively low price for products
Customer satisfaction
Number of customers
Keeping customers
Production to capacity ratio
Applying new production technology
Having ISO license
Having insurance security for personnel
Innovation capability
Increasing control over material suppliers
Increasing control over distributers and retailers
Increasing market share
Increasing sale by improving quality
Class: three classes of job making are assigned to each cooperative, according to the number of job opportunities it created in 2011:
Class 1: low job making (fewer than 20 job opportunities created)
Class 2: medium job making (between 20 and 40 job opportunities created)
Class 3: high job making (more than 40 job opportunities created)
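For illustration, this class assignment can be written as a small helper; the function name is ours, not the authors'.

```python
def job_making_class(jobs_created_2011):
    """Map the number of job opportunities created in 2011 to the class label."""
    if jobs_created_2011 < 20:
        return 1  # Class 1: low job making
    elif jobs_created_2011 <= 40:
        return 2  # Class 2: medium job making
    else:
        return 3  # Class 3: high job making
```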

Classification algorithm | Accuracy (%)
J48            | 79
SVM            | 83
Logistic       | 81
Random Forest  | 91
Simple Bayes   | 69

Table 2: Precision of the classification algorithms after omitting the criteria not selected by CFS.

3.2 Proposed Method

The proposed model for analyzing the data in this paper is as follows:

1. Start.
2. Build the database according to the data source described in the previous section.
3. Perform classification with a number of algorithms to estimate how well the number of job-making opportunities can be forecast from the BSC criteria.
4. Use the CFS algorithm to select the most important features of the database.
5. Use the Fisher score algorithm to rank the features of the industrial cooperatives of the Central province.
6. Use the Apriori algorithm to extract the rules of the cooperatives database.
7. End.

Feature                                             | Fisher score
Primary investment                                  | 0.861
Amount of export                                    | 0.671
Relatively low price products                       | 0.511
Customer satisfaction                               | 0.607
Number of customers                                 | 0.418
Keeping customers                                   | 0.319
Production to capacity ratio                        | 0.089
Applying new production technology                  | 0.156
Having ISO license                                  | 0.147
Having insurance security for personnel             | 0.311
Innovation capability                               | 0.264
Increasing control over material suppliers          | 0.056
Increasing control over distributers and retailers  | 0.208
Increasing market share                             | 0.097
Increasing sale by improving quality                | 0.041

Table 3: Fisher score of each feature.

4 Results

First, the database is classified using a number of classification algorithms with 10-fold cross-validation. The results are shown in Table 1.

Classification algorithm | Accuracy (%)
J48            | 76
SVM            | 81
Logistic       | 79
Random Forest  | 87
Simple Bayes   | 64

Table 1: The accuracy of the classification algorithms.

The results in Table 1 show that the classes and features are correctly selected. In the next step, the important features are selected from all features using the CFS algorithm, and the following features are selected: Primary investment, Amount of export, Relatively low price products, Customer satisfaction, Number of customers, Keeping customers, Production to capacity ratio, Having ISO license, Having insurance security for personnel, and Innovation capability. To assess the effect of feature selection, classification with 10-fold cross-validation is performed again after omitting the criteria that were not selected, and the results are shown in Table 2.
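A minimal sketch of the classification step reported in Table 1, assuming the 15 BSC criteria are available as a feature matrix X and the three job-making classes as labels y. The scikit-learn estimators are stand-ins for the algorithms named in the table (J48 is approximated by a CART decision tree, Simple Bayes by Gaussian naive Bayes); this illustrates the 10-fold cross-validation protocol, not the authors' implementation.

```python
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB

def compare_classifiers(X, y):
    """Mean 10-fold cross-validated accuracy for the five classifiers of Table 1."""
    models = {
        "J48 (decision tree)": DecisionTreeClassifier(),
        "SVM": SVC(),
        "Logistic": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(),
        "Simple Bayes": GaussianNB(),
    }
    return {name: cross_val_score(m, X, y, cv=10).mean() for name, m in models.items()}
```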

Now the importance (weight) of each criterion can be obtained using the Fisher score algorithm; the results are shown in Table 3. Finally, the database rules are extracted using the Apriori algorithm. Because of the high number of rules, only 8 rules with high confidence and support are selected. The results are shown in Table 4.
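The rules of Table 4 are scored by their support and confidence. The following sketch shows how these two quantities are computed for a single candidate rule over a transaction database; the item encoding (strings such as "primary investment=high") is our own illustrative assumption, not the authors' representation.

```python
def support_confidence(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    transactions: list of sets of items, e.g. {"primary investment=high", ...}
    antecedent / consequent: iterables of items."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)          # transactions matching the antecedent
    n_ac = sum(1 for t in transactions if (a | c) <= t)   # transactions matching the whole rule
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    return support, confidence
```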

Rule                                                                                               | Result
Primary investment (high) + number of customers (high) + standard (high)                           | high job making
Amount of export (high) + customer satisfaction (high) + innovation capability (high)              | high job making
Increasing market share (high) + new production technology (high)                                  | high job making
Production to capacity ratio (medium) + insurance security for personnel (medium) + keeping customers (high) | medium job making
New production technology (high) + keeping customers (medium) + amount of export (high)            | medium job making
Primary investment (high) + number of customers (medium)                                           | medium job making
Customer satisfaction (low) + amount of export (high) + number of customers (low)                  | low job making
Primary investment (low) + innovation capability (low)                                             | low job making

Confidence / support: 1, 0.56, 0.46, 0.41, 0.62, 0.55, 0.39, 0.59, 0.44

Table 4: Database rules with high confidence and support.

Conclusions

The approach applied in this paper is a new strategy that has not been used by companies. The results show that using BSC criteria is very useful for forecasting the job-making capability of organizations. Based on the importance of the features, primary investment plays an important role in the job making of organizations, and the important association rules are reported in this paper. For a more comprehensive study one can use all BSC criteria. Performing this research in other fields of the economy and comparing the results is also suggested. Organization benefits and other economic indexes can also be used in future research.

References

[1] Y. Peng, G. Kou, Y. Shi, and Z. Chen, A descriptive framework for the field of data mining and knowledge discovery, International Journal of Information Technology and Decision Making 7 (2008), no. 4, 639-682.
[2] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, From data mining to knowledge discovery, Advances in Knowledge Discovery and Data Mining, AAAI Press (1996), 1-34.
[3] K. Chen, L. Xu, and H. Chi, Improved learning algorithms for mixture of experts in multi-class classification, Neural Networks 12 (1999), 1229-1252.
[4] J. C. Platt, N. Cristianini, and J. Shawe-Taylor, Large margin DAGs for multi-class classification, Proceedings of Neural Information Processing Systems, NIPS'99, MIT Press (1999), 547-553.
[5] E. L. Allwein, R. E. Schapire, Y. Singer, and P. Kaelbling, Reducing multi-class to binary: a unifying approach for margin classifiers, Journal of Machine Learning Research 1 (2000), 113-124.
[6] C. Loucopoulos, Three-group classification with unequal misclassification costs: a mathematical programming approach, Omega 29 (2001), no. 3, 291-297.
[7] R. Caruana and A. Niculescu-Mizil, An empirical comparison of supervised learning algorithms, Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh (2006).
[8] R. Agrawal, T. Imielinski, and A. Swami, Mining association rules between sets of items in large databases, SIGMOD, Washington, DC, USA (1993), 207-216.
[9] Y. M. Wang, C. Parkan, and A. Swami, Multiple attribute decision making based on fuzzy preference information on alternatives: ranking and weighting, Fuzzy Sets and Systems 152 (2005), 331-346.

A Complex Scheme For Target Tracking And Recovery Of Lost Targets In Cluster-Based Wireless Sensor Networks
Behrouz Mahmoudzadeh

Amirkabir University Of Technology(Tehran polytechnic)


Department of Electrical Engineering
behruz mahmoudzadeh@aut.ac.ir

Karim Faez
Amirkabir University Of Technology(Tehran polytechnic)
Department of Electrical Engineering
k.faez@aut.ac.ir

Abstract: This paper examines the performance of an energy-efficient target tracking scheme with a recovery process in a cluster-based wireless sensor network. Object tracking is vulnerable to loss of the target due to reasons such as sensor failure, prediction error or an abrupt change in the object's trajectory. For a reliable object tracking method that can be used in critical situations, a fool-proof mechanism is needed. This paper describes an object tracking method (HCTT) along with a recovery mechanism to track objects and recover lost objects (if needed) in a clustered network. The simulation has been carried out using the Castalia simulation framework of OMNeT++.

Keywords: Wireless Sensor Network; Object Tracking; Object Recovery.

Introduction

A Wireless Sensor Network (WSN) consists of a number of sensor nodes (depending on the usage), where each sensor has the prerequisite ingredients to store and process data. One of the practical applications where WSNs can be used is object tracking [1] [2]. However, difficulties exist in a target tracking sensor network which must be overcome to reach ideal tracking. The network is always vulnerable to errors such as sensor failures, detection errors, prediction errors, network failures, and localization errors. These cause the object's course to be lost: the target still exists within the sensor network, but it is not traceable anymore. We therefore need a robust recovery method which recovers lost objects, and the recovery process should be quick and effective. The cluster structure is the only suitable structure that is scalable and beneficial for large-scale WSNs, so a cluster structure is adopted to overcome the object tracking problem. We use a clustered structure consisting of dynamic clustering and static clustering, called hybrid cluster-based target tracking (HCTT), together with a recovery process.

The main contributions of the paper are summarized as follows: (1) we describe the base target tracking method (HCTT), (2) we describe the recovery process, and finally (3) we examine the results of simulations to show the efficiency of the proposed scheme.

2 The Performance Of The HCTT

In this section a review of the method is offered; for a full study with details, readers are referred to [3].

Corresponding Author, T: (+98) 937 803-9792

2.1 System Model

Our network is formed by n static sensor nodes randomly deployed in an area of interest. The sink node is deployed at a corner of the network. The network is made up of m clusters by using any suitable clustering algorithm. Each cluster i has ni nodes, including one cluster head and many members. A general sensing model is adopted for a sensor node vi, defined by R(vi; rs).

2.2 Boundary Problem

There is no problem when the target is inside a cluster. However, when the object is seen along the boundaries among multiple clusters, the boundary problem takes place: the sensor data is not perfect and reliable because the sensor nodes in the monitoring region belong to different clusters. To solve this problem, a dynamic cluster is constructed and then dismissed to reduce energy consumption.

2.3 Inter-cluster Handoffs

The tracking task should be handled by the most proper cluster. As the target travels in the network, the task should be handed over from the previous cluster to the next most suitable one, which is done by the process of inter-cluster handoff. The inter-cluster handoff takes place under three scenarios:

Static to Dynamic Inter-Cluster Handoff (S2DIC Handoff): As the object travels along the boundaries, boundary nodes can sense the target. After the object is detected by a boundary node, a dynamic cluster is made in advance for the arrival of the object. After that, the process of handing over is started, and thereafter the dynamic cluster is responsible for tracking.

Dynamic to Static Inter-Cluster Handoff (D2SIC Handoff): If none of the sensing nodes is a boundary node, the target has moved away from the boundary area and has entered the internal region of the next static cluster. Then the D2SIC handoff is done.

Dynamic to Dynamic Inter-Cluster Handoff (D2DIC Handoff): Sometimes, while a dynamic cluster is responsible for the object tracking task, the requirements for the D2SIC handover might not be met as the target travels. In this case, another new dynamic cluster will be created to continue the tracking. Note that the D2SIC handoff is preferred to the D2DIC handoff, as the latter is more costly and has a larger delay. The new cluster head sends a request message asking for the leadership to be handed over. The previous cluster head answers with a message containing the previous estimations of the object's location, delivering the leadership to the cluster head of the new dynamic cluster. After receiving the leadership, the new active cluster head broadcasts an activate message to wake up its members so that they are ready for the target entering. Every node that senses the target reports the sensing data to the active cluster head. The previous dynamic cluster head sends a dismiss message.

3 Recovery Of Lost Targets

3.1 Problem Description

A problem that can take place when wireless sensor networks are used for object tracking is that the network might lose track of the object [4]. This can occur for many reasons:

Node failures: WSNs have limited resources and nodes have limited battery. Node failure may take place because of hardware failure, battery discharge, enemy interference, etc.

Localization faults: Localization is not perfect and the estimated locations of the sensors may contain errors. These errors may have a cumulative effect on object tracking.

Network failure: The wireless sensor network may break because of communication failure, overload, environmental agents, etc.

Prediction errors: As stated before, the cluster heads are warned before the object arrives, so that they can wake up the other nodes in their cluster. This warning is initiated by the cluster head where the target is currently present: an upstream cluster has to predict the future location of the target and then wake up a downstream cluster. Faults in this prediction can cause target loss, since the target's course may not be found if a cluster is not woken up in advance.

Abrupt change in the object's direction or speed: The target might change its course or speed abruptly, and the network may not be able to track the object properly.

How To Recover The Lost Objects

The recovery process can be explained in four steps, as described below [4]:

Figure 1: Cluster transition.

Loss of target: The recovery mechanism is started when a cluster head reports that the target has been lost. The most important part of initiating the target recovery process is discovering that the object has been lost, i.e. that the object's position is not known. If the downstream cluster head (static or dynamic) does not sense the target within a predetermined time, it concludes that the target has been lost. It is important that a false recovery is not initiated for reasons such as a stationary object. We continue with an example to clarify this: assume the target was being tracked by cluster head A, as shown in Figure 1. Upon detection of the car by boundary nodes a or b, a dynamic cluster is created, and nodes c, d, e and f are invited to join it. Assume that the car is then not seen by the dynamic cluster. After a predetermined time the dynamic cluster head triggers a timeout and declares target loss to the static cluster head.

Search: The search step has been added to reduce false recovery initiations. In this step, the static cluster head queries the dynamic cluster for the presence of the target. Continuing the above example, the static cluster head also determines whether the target is present within its own cluster; on failing to discover the car, it enters the next step.

Active recovery: The static cluster is the focal point of the recovery process. The static cluster head sends a target-loss message to all one-hop clusters (dynamic and static). Upon receiving this message, the neighboring clusters wake up the internal nodes in their clusters. In the above example, the cluster head sends the target-loss message to the one-hop static and dynamic clusters, which are woken up and activated. If the target is not found within a predetermined time, the newly woken cluster heads time out and wake up their own one-hop clusters in a similar manner. This guarantees that the area of active recovery is increased step by step (a simplified sketch of this expanding search is given after this list). On detecting the target, an internal node turns its radio on, sends the collected data and keeps its radio in listen mode.

Sleep: It is important that after the beginning of active recovery the network quickly comes back to the hibernation state. If the object is found during the previous step, the cluster head recovering the target informs the neighboring clusters, which in turn inform their neighbors, and so on. On receipt of the target-recovered message, only the relevant clusters are kept awake while the others go into sleep mode. In the above example, if the car was recovered at cluster B, all clusters other than B go to sleep mode. If for some reason the target is not found at all and the cluster heads time out, the clusters conclude that the target is lost and then hibernate.
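The staged wake-up used in the active-recovery and sleep steps can be abstracted as a breadth-first expansion over the cluster neighbourhood graph. The sketch below is only that abstraction, under our own assumptions (cluster ids, and a `senses_target` callback standing in for the per-cluster detection timeout); it is not the protocol implementation used in the simulations.

```python
from collections import deque

def active_recovery(start_cluster, neighbours, senses_target, max_hops=5):
    """Wake clusters ring by ring around the cluster that declared target loss.

    neighbours    : dict cluster_id -> iterable of one-hop cluster ids
    senses_target : callable(cluster_id) -> bool, True if the woken cluster
                    detects the target before its timeout expires.
    Returns (recovering_cluster, awake_set), or (None, awake_set) if the
    target is declared lost everywhere."""
    awake = {start_cluster}
    frontier = deque([(start_cluster, 0)])
    while frontier:
        cluster, hops = frontier.popleft()
        if senses_target(cluster):
            return cluster, awake            # target recovered; other clusters may sleep
        if hops == max_hops:
            continue
        for n in neighbours.get(cluster, ()):
            if n not in awake:               # wake one more ring of clusters
                awake.add(n)
                frontier.append((n, hops + 1))
    return None, awake                       # timeout everywhere: target lost
```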

5 Simulation

Simulations were carried out to study the proposed tracking and recovery mechanism. The OMNeT++ package with the Castalia framework was used. The network consists of 300 sensor nodes deployed in a field of 140x140 m. There are 80 GPS nodes in the network, and the other nodes determine their location as explained earlier. The sensor network has 30 static clusters with 12 nodes each. The cluster heads are identified during deployment. A node chooses a cluster head based on the minimum distance among all the cluster heads it can reach in just one hop. It is assumed that nodes are perfectly localized.

5.1 Target Tracking

Figure 2 shows the movement of the target in the sensor network when there are no errors. The simulation experiment is run for 120 s. The target starts at 7 seconds from location (10, 140) and moves at a speed of 10 m/s in a zig-zag fashion. The target finally leaves the network at 45 seconds from location (130, 0). The target is tracked from the moment it enters the network until it leaves the network. A total of 164 target locations are recorded along the way. The cluster heads localize the target every 300 ms. Some localized points with the timing information are shown in Figure 2.

Figure 2: Target tracking.

5.2 Target Tracking With Error

Node failure is a major reason for the loss of the target trajectory. The nodes that can fail are internal nodes and cluster heads. Simulation results confirm that as the probability of node failure increases, the number of localized points decreases. Figure 3 shows an example of target recovery for a node failure probability of 0.3.

Figure 3: Target tracking with error.

Conclusions

This paper studied the problem of losing the target trajectory in a target tracking wireless sensor network. In future work, multiple targets may be studied.

References

[1] A. Arora et al., A line in the sand: A wireless sensor network for target detection, classification and tracking, Computer Networks 46 (2004), 605-634.
[2] C. S. Raghavendra and K. M. Sivalingam, Wireless Sensor Networks (2004), 125-128.
[3] Z. Wang, W. Lou, and J. Ma, A novel mobility management scheme for target tracking in cluster-based sensor networks, IEEE DCOSS (2010), 172-186.
[4] A. Khare and K. M. Sivalingam, On recovery of lost targets in a cluster-based wireless sensor network, Pervasive Computing and Communications Workshops (2011), 208-213.

A Measure of Quality for Evaluation of Image Segmentation

Hakimeh Vojodi
Qazvin Branch, Islamic Azad University
Department of IT and Computer Engineering
Qazvin, Iran
h.vojodi@qiau.ac.ir

Amir Masoud Eftekhary Moghadam
Qazvin Branch, Islamic Azad University
Department of IT and Computer Engineering
Qazvin, Iran
eftekhari@qiau.ac.ir

Abstract: In this paper we propose an unsupervised evaluation method based on minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. The method evaluates color image segmentation algorithms and measures their accuracy. The proposed method can be used for any type of color image with any number of regions, and it also limits over-segmentation problems. Experiments were performed on a database composed of 2400 segmented color images, and we compared the proposed method with another unsupervised evaluation method. The experimental results demonstrate the effectiveness of the proposed method.

Keywords: Image segmentation; Unsupervised evaluation; Intra-region; Inter-region

Introduction

Segmentation is a fundamental stage in image processing and machine vision applications. Many segmentation methods have been proposed in the literature [1,2]
but it still is a challenging task to evaluate their efficiency. Consequently, methods for evaluating different
image segmentation algorithms play a key role in image
segmentation research [3].
The evaluation of a segmentation result makes sense at a given level of precision. Generally, two main approaches of evaluation exist: supervised and unsupervised. Supervised evaluation criteria use some prior knowledge such as a ground truth: the results of a segmentation algorithm are compared to a standard image that has been segmented manually. This is the most commonly used method of objective evaluation. However, manually generating a reference image is a subjective and time-consuming task, and for most images, especially natural images, we generally cannot guarantee that one manually-generated segmentation is better than another. These methods are widely used in medical applications [4].
Corresponding Author, T: (+98) 142-7353320


Unsupervised ones compute some statistics in the


segmentation result according to the original image
without any prior knowledge. Unsupervised evaluation
enables the objective comparison of both different segmentation methods and different parameterizations of
a single method, without requiring human visual comparisons or comparison with a manually-segmented or
pre-processed reference image. Additionally, unsupervised methods generate results for individual images
and images whose characteristics may not be known
until evaluation time. Unsupervised methods are crucial to real-time segmentation evaluation and can furthermore enable self-tuning of algorithm parameters
based on evaluation results [5].
Liu and Yang [5] proposed an evaluation function based on empirical studies; their evaluation function has a very strong bias towards segmentations with very few regions. Borsotti et al. [5] improved upon Liu and Yang's method and proposed modified quantitative evaluations. Zeboudj proposed a measure based on the combined principles of maximum inter-region disparity and minimal intra-region disparity measured on a pixel neighborhood [6]. All of these methods are only able to evaluate gray-level segmented images.


Rosenberger presented in [7] a measure that enables estimation of the intra-region homogeneity and the inter-region disparity of gray-level images.

Zhang et al. [3] proposed a novel objective segmentation evaluation method based on information theory. The new method uses entropy as the basis for measuring the uniformity of pixel characteristics within a segmentation region. This method is used to evaluate color segmented images.

The proposed evaluation method is based on minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. It provides a quality score that can be used to compare different segmentations of the same image. This method can be used to compare various parameterizations of one particular segmentation method (including those which differ in terms of the number of regions used in the segmentation) as well as fundamentally different segmentation techniques. We compare the proposed method with Zhang's method [3].

The test images in the benchmark should have a large variety so that the evaluation results can be extended to other images and applications. The experiments are conducted using the images and ground-truth segmentations in the Berkeley segmentation data set [8]. We evaluate the performance of our algorithm on the Berkeley Segmentation Database (BSD). The proposed method limits the over-segmentation problem. Analysis of the experimental results on a large variety of test images from the Berkeley segmentation dataset demonstrates the efficiency of the proposed method.

This paper is organized as follows. In Section 2 we present our new unsupervised evaluation method. Experimental results and analysis are presented in Section 3. In Section 4 we present our conclusion.

2 The Proposed Method

The proposed method is based on maximum inter-region disparity and minimal intra-region disparity measured on a pixel neighborhood. It enables estimation of the intra-region homogeneity and the inter-region disparity. In a segmented image, the pixels located in a region should have similar properties, while compared to the pixels of neighboring regions they should have different properties.

In this article we use the RGB color space. A color image is processed based on each of its components, using the gray-level method. Each intra-region color error of the segmented image is computed based on its R, G and B components; for each component, one error value per region is obtained, and the average of the three color errors of each region represents the total color error of the region.

Intra-region disparity is defined based on the color error. Let I be the original image and Ig be the segmented image, defined as a division of I into N arbitrarily-shaped regions. One defines C_x(s, t) = |g_I(s) - g_I(t)| / (L - 1) as the disparity between two pixels s and t, with L being the maximum gray level. The interior disparity CI_x(R_j) of the region R_j is defined as follows:

CI_x(R_j) = \frac{1}{S_j} \sum_{s \in R_j} \max\{ C_x(s, t) : t \in W(s) \cap R_j \}    (1)

where R_j is the set of pixels in region j, S_j its size, CI_x(R_j) is the value of component x for intra-region R_j, and W(s) is the neighborhood of pixel s. The total intra-region disparity is defined as follows:

CI(R_j) = \frac{1}{3} \sum_{x \in \{R,G,B\}} CI_x(R_j)    (2)

Intra-region uniformity is computed based on criterion (2). Measuring intra-region uniformity is an intuitive and effective way to evaluate segmentation performance, so almost all unsupervised methods apply this kind of metric. While a variety of intra-region uniformity metrics have been proposed, all are based on four quantities: color error, squared color error, texture, and entropy.

The inter-region disparity is computed as the average of the disparity of a region with its neighbors; it is calculated based on the regions' color difference. We use C_x(p) to denote the value of component x for pixel p (or for pixel p and its neighboring pixels in the same region). The average value of component x in region j is defined as

\bar{C}_x(R_j) = \Big( \sum_{p \in R_j} C_x(p) \Big) / S_j    (3)

where x ∈ {R, G, B}. The disparity of two uniform regions R_i and R_j is calculated as

DE_x(R_j) = \frac{1}{N_R} \sum_{R_i \in \nu(R_j)} \frac{ |\bar{C}_x(R_j) - \bar{C}_x(R_i)| }{ N_g(R_j) + N_g(R_i) }    (4)

where N_g(R_j) is the number of gray levels in region R_j, ν(R_j) is the set of regions neighboring R_j, and N_R is their number.

DE(R_j) = \frac{1}{3} \sum_{x \in \{R,G,B\}} DE_x(R_j)    (5)

If the intra-region disparity of a region is less than its inter-region disparity, that region is considered more accurate. The disparity of the region R_j is defined by the measurement CR(R_j) ∈ [0, 1] expressed as follows:

CR(R_j) = 1 - CI(R_j)/DE(R_j)  if 0 < CI(R_j) < DE(R_j);   DE(R_j)  if CI(R_j) = 0;   0  otherwise    (6)

The evaluation measure for the segmented image is then

E3 = \frac{1}{N} \sum_{j=1}^{N} CR(R_j)    (7)

where N is the total number of regions in the segmented image I.

3 Experimental Results

We use 300 images from the Berkeley dataset to implement our experiments. It is available from www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/. We also use the segmented images in the Berkeley dataset as ground truth images. We perform the evaluations on the ground truth images from the Berkeley segmentation dataset and on segmented images produced by the Edison segmentation system [9]. We obtain 6 segmented images by applying different parameters of the Edison segmentation system and use 2 ground truth images for each image of the dataset. Therefore our segmented dataset consists of 2400 (8 x 300) different segmented images.

In this section we analyze our proposed unsupervised evaluation and compare it with the evaluation measure based on entropy (E_Entropy) [3]. We discuss the advantages and shortcomings of each type of method. We use two groups of images (ground truth and machine segmentation) to perform the experiments. Over-segmentation is a major problem for segmentation algorithms; evaluation methods are efficient when they are sensitive to over-segmentation and penalize it.

These measures calculate the amount of segmentation accuracy. To compare the proposed evaluation method with the entropy-based measure, 11 images of the dataset are selected randomly. For each image, two types of segmentation are selected: one with more regions than the ground truth image and one with approximately the same number of regions.

Figures 1 and 2 show the evaluation of the two measures for the 11 selected images, and their precision is shown in Table 1. E_Entropy cannot penalize the over-segmentation problem, and for some images its value for the over-segmented result is higher than for the normal one.

Figure 1: Evolution of the E_Entropy measure over the 11 segmented images.

Figure 2 shows the results of the proposed evaluation method. The curve for over-segmentation lies below the curve for normal segmentation, so our method is sensitive to over-segmentation. The method is able to obtain a suitable degree of similarity for different images and penalizes over-segmentation well. The experimental results demonstrate that the proposed method is appropriate for the evaluation of image segmentation.

The Edison segmentation system has several input parameters (spatial, minimum region, ...) which need to be adjusted. In this paper, we propose a new way to adjust the spatial parameter of the Edison system. First, we select 200 images of the Berkeley dataset and segment each image with the spatial parameter varying over the values 2, 5, 10, 50, 100 and 200, together with the ground truth image. Then each segmented image is evaluated with the proposed method, and we obtain the average segmentation accuracy for each spatial parameter over the 200 images. The experimental results in Table 2 show that spatial = 200 is the most suitable input value for the Edison segmentation system.

Image | Segmentation | E_Entropy | Proposed
1  | Normal | 5.0591 | 0.3038
1  | Over   | 5.1476 | 0.1840
2  | Normal | 5.3755 | 0.5457
2  | Over   | 5.1673 | 0.2900
3  | Normal | 5.1797 | 0.4083
3  | Over   | 4.8328 | 0.0874
4  | Normal | 4.6521 | 0.4101
4  | Over   | 4.6437 | 0.2811
5  | Normal | 5.0339 | 0.8845
5  | Over   | 5.8374 | 0.1184
6  | Normal | 5.8806 | 0.1683
6  | Over   | 6.1022 | 0.0157
7  | Normal | 5.6258 | 0.5115
7  | Over   | 5.8837 | 0.2065
8  | Normal | 6.9249 | 0.2151
8  | Over   | 6.4253 | 0.0174
9  | Normal | 5.2589 | 0.5200
9  | Over   | 5.1189 | 0.4097
10 | Normal | 5.6475 | 0.4031
10 | Over   | 5.2757 | 0.2653
11 | Normal | 5.2629 | 0.4137
11 | Over   | 5.2764 | 0.2688

Table 1: The comparison results for the 11 segmented images given by E_Entropy and the proposed method.

Spatial               | 2      | 5      | 10     | 50     | 100    | 200    | Ground truth
Average of 50 images  | 0.0120 | 0.0104 | 0.0144 | 0.0180 | 0.0221 | 0.0230 | 0.1247
Average of 100 images | 0.0176 | 0.0130 | 0.0218 | 0.0254 | 0.0348 | 0.0360 | 0.1238
Average of 150 images | 0.0145 | 0.0106 | 0.0178 | 0.0223 | 0.0294 | 0.0327 | 0.1173
Average of 200 images | 0.0172 | 0.0119 | 0.0179 | 0.0231 | 0.0293 | 0.0328 | 0.1103

Table 2: The accuracy (%) of the proposed method for 50, 100, 150 and 200 segmented images with the spatial parameter set to 2, 5, 10, 50, 100 and 200, and for the ground truth segmentations.

Conclusion

Evaluation of image segmentation algorithms is necessary to quantify the performance of existing segmentation methods. In this paper, we proposed an unsupervised evaluation method for the evaluation of color image segmentation algorithms. It is based on the minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. We used a large database composed of 2400 segmented images of the Berkeley dataset for the experiments and compared the proposed method with another unsupervised evaluation method. The proposed method is sensitive to over-segmentation and penalizes it. Experimental results demonstrate that the proposed method is appropriate for the evaluation of segmented color images.

Figure 2: Evolution of the proposed E3 measure over the 11 segmented images.


References
[1] N. Senthilumaran and R. Rajesh, Image Segmentation - A
Survey of Soft Computing Approaches, International Conference on Advances in Recent Technologies in Communication and Computing (2009), 844846.
[2] W. Tao, H. Jin, and Y. Zhang, Colour Image Segmentation
Based on Mean Shift and Normalized Cuts, IEEE Transaction on Systems, Man, and CyberneticsPart B: Cybernetics
37 (2007), no. 5, 13821389.
[3] H. Zhang, J. Fritts, and S. Goldman, An entropy-based objective evaluation method for image segmentation, Proceedings of SPIE-Storage and Retrieval Methods and Applications for Multimedia (2004).
[4] N. M. Nasab, M. Analoui, and E. J. Delp, Robust and Efficient Image Segmentation Approaches Using Markov Random Field Models, Journal of Electronic Imaging 12 (2003),
no. 1, 5056.
[5] H. Zhang, J. E. Frittsm, and S. A. Goldman, Image Segmentation Evaluation: a Survey of Unsupervised Methods, Computer Vision and Image Understanding 110 (2008), no. 2,
260-280.


[6] S. Chabrier, B. Emile, C. Rosenberger, and H. Laurent, Unsupervised performance evaluation of segmentation,
EURASIP Journal on Applied Signal Processing 10/1155
(2006).
[7] S. Chabrier, C. Rosenberger, H. Laurent, B. Emile, and
P. Marche, Evaluating the segmentation result of a graylevel image, in Proceedings of 12th European Signal Processing Conference (EUSIPCO04), Vienna, Austria (2004),
953-956.
[8] Martin, C. Fowlkes, D. Tal, and J. Malik, A Database of
Human Segmented Natural Images and its Application to
Evaluating Segmentation Algorithms and Measuring Ecological Statistics, Proceedings of 8th International Conference Computer Vision 2 (2001), 416423.
[9] Edge Detection and Image Segmentation System:
http://www.caip.rutgers.edu/riul/research/code/EDISON/.
[10] R. Unnikrishnan, C. Pantofaru, and M. Heber, Toward
Objective Evaluation of Image Segmentation Algorithms,
IEEE Transaction on Pattern Analysis and Machine Intelligence 29 (2007), no. 6.
[11] F. Ge, S. Wang, and T. Liu, New Benchmark for Image
Segmentation Evaluation, Journal of Electronic Imaging 16
(2007), no. 3, 033011033026.

An Unsupervised Evaluation Method for Image Segmentation Algorithms

Hakimeh Vojodi
Qazvin Branch, Islamic Azad University
Department of IT and Computer Engineering
Qazvin, Iran
h.vojodi@qiau.ac.ir

Amir Masoud Eftekhary Moghadam
Qazvin Branch, Islamic Azad University
Department of IT and Computer Engineering
Qazvin, Iran
eftekhari@qiau.ac.ir

Abstract: Many segmentation methods have been proposed in the literature, but it is difficult to compare their efficiency. In this paper, we propose an unsupervised evaluation method based on the combined principles of minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. The purpose of this paper is to present a framework for the evaluation of image segmentation algorithms. The proposed method measures the accuracy of image segmentation algorithms and can be used for any type of color image with any number of regions. It can also limit the under-segmentation and over-segmentation problems. We compared the proposed method with another unsupervised evaluation method on a database composed of 2400 segmented color images. Experimental results demonstrate the effectiveness of the proposed method.

Keywords: Color Image Segmentation; Unsupervised Evaluation; Intra-region; Inter-region

Introduction

Segmentation is a fundamental stage in image processing, video and computer vision applications. The target of image segmentation is the domain-independent partitioning of the image into several regions which are visually distinct and uniform with respect to some property, such as grey level, texture or color. Many segmentation methods have been proposed in the literature [1, 2], but it still remains a challenging task to evaluate their efficiency.

Research into better segmentation methods invariably encounters two problems: (1) the inability to effectively compare different segmentation methods, or even different parameterizations of a given segmentation method, and (2) the inability to determine whether one segmentation method or parameterization is suitable for all images or for classes of images (e.g. natural images, medical images, etc.). Consequently, methods for evaluating different segmentations play a key role in segmentation research [3].

The evaluation of a segmentation result makes sense at a given level of precision. Generally, two main approaches of evaluation exist: supervised and unsupervised. Supervised evaluation criteria use some prior knowledge such as a ground truth. In these methods, the results of a segmentation algorithm are compared to a standard reference image that is manually segmented beforehand, and the amount of disparity becomes the measure of segmentation effectiveness. This is the most commonly used method for objective evaluation. However, manually generating a reference image is a difficult, subjective, and time-consuming task, and for most images, especially natural images, we generally cannot guarantee that one manually-generated segmentation image is better than another. These methods are widely used in medical applications [4].

Unsupervised methods compute some statistics on the segmentation results according to the original image, without any prior knowledge. Unsupervised evaluation enables the objective comparison of both different segmentation methods and different parameterizations of a single method, without requiring human visual comparisons or comparison with a manually-segmented or pre-processed reference image. Additionally, unsupervised methods generate results for individual images and for images whose characteristics may not be known until the evaluation stage. Unsupervised methods are crucial for real-time segmentation evaluation and can furthermore enable self-tuning of algorithm parameters based on evaluation results [5].

Based on the same idea of intra-region uniformity, Levine and Nazif [5] also defined a criterion that calculates the uniformity of a region characteristic based on the variance of this characteristic. Complementary to the intra-region uniformity, Levine and Nazif defined a disparity measurement between two regions to evaluate the dissimilarity of regions in a segmentation result [6].

Liu and Yang [5] proposed an evaluation function, F, based on empirical studies. Their function requires no user-defined parameters and does not depend on the type of image. However, if the images have very well defined regions with very little variation in luminance and chrominance, the F evaluation function has a very strong bias towards segmentations with very few regions.

Borsotti et al. [5] improved Liu and Yang's method and proposed the modified quantitative evaluations F' and Q. All of these evaluation functions were generated entirely from the results of empirical analysis and have no theoretical foundation. More details about F, F', and Q can be found in [5].

Zeboudj proposed a measure based on the combined principles of maximum inter-region disparity and minimal intra-region disparity measured on a pixel neighborhood. Rosenberger presented a criterion that enables estimation of the intra-region homogeneity and the inter-region disparity; this criterion quantifies the quality of segmentation results [6]. Zhang et al. proposed a novel objective segmentation evaluation method based on information theory, which uses entropy as the basis for measuring the uniformity of pixel characteristics within a segmentation region.

In this paper, we propose a novel objective segmentation evaluation method based on minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. Our new evaluation method provides a quality score that can be used to compare different segmentations of the same image. This method can be used to compare various parameterizations of one particular segmentation method (including those which differ in terms of the number of regions used in the segmentation) as well as fundamentally different segmentation techniques.

The test images in the benchmark should have a large variety so that the evaluation results can be extended to other images and applications. The experiments are conducted using the images and ground-truth segmentations in the Berkeley segmentation data set [7]. We evaluate the performance of our algorithm on the Berkeley Segmentation Database (BSD). The proposed method limits the under-segmentation and over-segmentation problems. Analysis of the experimental results on a large variety of test images from the Berkeley segmentation dataset demonstrates the efficiency of the proposed method.

This paper is organized as follows. In Section 2 we present our new unsupervised evaluation method. Experimental results and analysis are presented in Section 3. In Section 4 we present our conclusion.

2 The Proposed Method

The proposed method is based on maximum inter-region disparity and minimal intra-region disparity measured on a pixel neighborhood. It enables estimation of the intra-region homogeneity and the inter-region disparity.

2.1 Intra-region Disparity

Intra-region squared color error is computed as the proportion of misclassified pixels in an image; in the uniform case this parameter is equal to the normalized standard deviation of the region. In a segmented image, the pixels located in a region should have similar properties, and pixels of neighboring regions should have different properties. Thus a good segmentation criterion should consider two conditions: homogeneity within regions and disparity between neighboring regions.

In this paper, we use the RGB color space. A color image is processed based on each of its components, using the gray-level method. Each intra-region squared color error of the segmented image is computed based on its R, G and B components; for each component, one error value per region is obtained, and the average of the three squared color errors of each region represents the total squared color error of the region.

Corresponding Author, T: (+98) 911 9442764


We use C_x(p) to denote the value of component x for pixel p (or for pixel p and its neighboring pixels in the same region). The average value of component x in region j is defined as

\bar{C}_x(R_j) = \Big( \sum_{p \in R_j} C_x(p) \Big) / S_j    (1)

where x ∈ {R, G, B}, R_j is the set of pixels in region j, and S_j = |R_j| denotes the area of region j.

The squared color error of region j is defined as

e^2_x(R_j) = \sum_{p \in R_j} \Big[ \frac{1}{L - 1} \big( C_x(p) - \bar{C}_x(R_j) \big) \Big]^2    (2)

where L is the total number of gray levels. The total interior disparity, denoted by Vintra(R_j), computes the homogeneity of each region of the segmented image:

Vintra(R_j) = \frac{1}{3 S_j} \sum_{x \in \{R,G,B\}} e^2_x(R_j)    (3)

2.2 Inter-region Disparity

The boundaries separate the regions of the image. Thus the boundary pixels of each region differ from the boundary pixels of its neighboring regions, and the regions are more separable and accurate when this difference (disparity) value is high.

The inter-region disparity is computed from the disparity between the intensity average of the desired region and its boundary-neighborhood pixels lying in other regions. If this disparity value is low, pixels neighboring the boundary of the desired region actually belong to it but have been incorrectly placed in the neighboring region. The fewer such incorrect pixels there are, the more accurate the region boundaries, and as a result the more accurate the segmented image.

The inter-region disparity computes the average disparity of a region with its neighbors. The disparity CE of two uniform regions R_i and R_j is calculated as

CE_x(R_j) = \frac{1}{B_j} \sum_{p_n \in W(p),\; p_n \notin R_j} \big( \bar{C}_x(R_j) - p_n \big) / (L - 1)    (4)

where B_j is the number of p_n pixels, W(p) is the neighborhood of p, and p_n is a neighbor of a pixel p of R_j that lies in a separate region, i.e. it belongs to a region neighboring R_j. N_R is the number of regions that are neighbors of R_j, and CE(R_j) is the average inter-region disparity of region R_j. The total inter-region disparity Vinter(R_j) is defined as follows:

Vinter(R_j) = \frac{1}{3 N_R} \sum_{x \in \{R,G,B\}} CE_x(R_j)    (5)

If the intra-region disparity of a region is less than its inter-region disparity, that region is more accurate (precise). The disparity of the region R_j is defined by the measurement C(R_j) ∈ [0, 1] expressed as follows:

C(R_j) = 1 - Vintra(R_j)/Vinter(R_j)  if 0 < Vintra(R_j) < Vinter(R_j);   Vinter(R_j)  if Vintra(R_j) = 0;   0  otherwise    (6)

The accuracy of a segmented image depends on the accuracy of its regions. Therefore, the evaluation measure for the segmented image is as follows:

E_IntraInter = \frac{1}{S_I} \sum_{j=1}^{N} C(R_j) \, S_j    (7)

where S_I is the area (measured by the number of pixels) of the full image and N is the total number of regions in the segmented image I.

3 Experimental Results

Intra-region uniformity is computed based on criterion (1). Measuring intra-region uniformity is an intuitive and effective way to evaluate segmentation performance, so almost all unsupervised methods apply this kind of metric. While a variety of intra-region uniformity metrics have been proposed, all are based on four quantities: color error, squared color error, texture, and entropy.

The current public version of the Berkeley Segmentation Database is composed of 300 color images. The image size is 481 x 321 pixels, and the database is divided into two sets: a training set containing 200 images that can be used to tune the parameters of a segmentation algorithm, and a testing set containing the remaining 100 images on which the final performance evaluations should be carried out. We perform the evaluations on the ground truth images from the Berkeley segmentation dataset and on segmented images produced by the Edison segmentation system [8].

Under-segmentation and over-segmentation are two major problems for segmentation algorithms, illustrated in Figure 1. If we accept that totally accurate segmentation of natural images is hard to achieve in practice, we need to minimize the under- or over-segmentation as much as possible.

Figure 1: (a) Under-segmented image, (b) ground truth image and (c) over-segmented image.

In the case of under-segmentation, full segmentation has not been achieved, i.e. there are two or more regions that appear as one. In the case of over-segmentation, a region that would ideally be present as one part is split into two or more parts. These problems are important and not easy to resolve; evaluation methods that are sensitive to both over-segmentation and under-segmentation, and can penalize both of them, are the efficient ones.

In this section we analyze our proposed unsupervised evaluation measure and compare it with the evaluation measure based on entropy (E_Entropy). We discuss the advantages and shortcomings of each type of method. We use two groups of images (ground truth and machine segmentation) to perform the experiments: the manually segmented images of the Berkeley dataset serve as ground truth images, and the second group consists of machine segmentation results that make up our test dataset.

We generate machine segmentation images with varying numbers of regions using the Image Segmentation System (EDISON) [8]. Our dataset includes 2400 (8 x 300) images. We produce images with varying numbers of regions in the segmentation (using EDISON to generate the segmentations) to study the sensitivity of these objective evaluation methods to the number of regions in the segmentation. However, producing more regions does not necessarily make a better segmentation, since over-segmentation may occur and the trade-off between the number of regions and the amount of needed detail can be heavily influenced.

We analyze the experimental results and compare the proposed method (E_IntraInter) with the evaluation measure based on entropy [3]. These measures calculate the amount of segmentation accuracy. To compare the proposed evaluation method with the entropy-based measure, 11 images of the dataset are selected randomly. For each image, three types of segmentation are selected: segmentation with more regions than, approximately as many regions as, and fewer regions than the ground truth image.

Figures 2 and 3 show the evaluation of the two measures for the 11 selected images. E_Entropy cannot penalize the under-segmentation and over-segmentation problems, and for some images its value for those segmentations is higher than for the normal one.

Figure 3 shows the results of the proposed evaluation method. The curves for under- and over-segmentation lie below the curve for normal segmentation, so our proposed method is sensitive to under-segmentation and over-segmentation. The method is able to obtain a suitable degree of similarity for different images and penalizes under- and over-segmentation well. The experimental results demonstrate that the proposed method is appropriate for the evaluation of image segmentation.

The Edison segmentation system has several input parameters (spatial, minimum region, ...) which need to be adjusted. In this paper, we propose a new way to adjust the spatial parameter of the Edison system. First, we select 60 images of the Berkeley dataset and segment each image with the spatial parameter varying over the values {2, 5, 10, 50, 100, 200}, together with the ground truth image. Then each segmented image is evaluated with the proposed method, and we obtain the average segmentation accuracy for each spatial parameter over the 60 images. The experimental results in Figure 4 show that spatial = 200 is the most suitable input value for the Edison segmentation system.
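To make Eqs. (1)-(7) concrete, the following sketch evaluates a labelled segmentation of an RGB image with the proposed E_IntraInter measure. It is our reconstruction from the formulas under stated assumptions (a 4-neighbourhood for W(p), an integer label map, and L = 256 grey levels per channel); it is not the authors' implementation.

```python
import numpy as np

def e_intra_inter(image, labels, L=256):
    """E_IntraInter of Eq. (7) for an RGB image (H, W, 3) and an integer label map (H, W)."""
    img = image.astype(float)
    H, W = labels.shape
    shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # 4-neighbourhood
    score, SI = 0.0, float(H * W)
    for r in np.unique(labels):
        mask = labels == r
        S = mask.sum()
        mean = img[mask].mean(axis=0)                                  # Eq. (1)
        # Eqs. (2)-(3): normalised squared colour error of the region
        v_intra = (((img[mask] - mean) / (L - 1)) ** 2).sum() / (3.0 * S)
        # Eqs. (4)-(5): disparity to outside pixels around the boundary
        diffs, n_regions = [], set()
        ys, xs = np.nonzero(mask)
        for dy, dx in shifts:
            ny, nx = ys + dy, xs + dx
            ok = (ny >= 0) & (ny < H) & (nx >= 0) & (nx < W)
            ny, nx = ny[ok], nx[ok]
            outside = labels[ny, nx] != r
            if outside.any():
                diffs.append(np.abs(mean - img[ny[outside], nx[outside]]) / (L - 1))
                n_regions.update(np.unique(labels[ny[outside], nx[outside]]))
        if diffs:
            ce = np.concatenate(diffs).mean(axis=0)                    # CE_x(R_j)
            v_inter = ce.sum() / (3.0 * max(len(n_regions), 1))
        else:
            v_inter = 0.0
        # Eq. (6)
        if 0 < v_intra < v_inter:
            c = 1.0 - v_intra / v_inter
        elif v_intra == 0:
            c = v_inter
        else:
            c = 0.0
        score += c * S / SI                                            # Eq. (7)
    return score
```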


Conclusion

Segmentation evaluation is essential to quantify the performance of existing segmentation methods. In this paper, we presented an unsupervised evaluation method for image segmentation algorithms. It is based on the minimal intra-region disparity and maximum inter-region disparity measured on a pixel neighborhood. We compared the proposed method with an unsupervised evaluation method based on entropy. The proposed method is sensitive to under-segmentation and over-segmentation and penalizes them very well. Experimental results demonstrate that the proposed method is appropriate for the evaluation of segmented color images. In the future, we would like to extend the proposed method so that it can also be used for texture images.

Figure 4: Overall comparison of the parameter values, used to select the better parameter.

Figure 2: Evolution of the EEntropy measure over the 11 segmented images.

Figure 3: Evolution of the EIntraInter measure over the 11 segmented images.

References

[1] N. Senthilkumaran and R. Rajesh, Image Segmentation - A Survey of Soft Computing Approaches, International Conference on Advances in Recent Technologies in Communication and Computing (2009), 844-846.

[2] W. Tao, H. Jin, and Y. Zhang, Colour Image Segmentation Based on Mean Shift and Normalized Cuts, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 37 (2007), no. 5, 1382-1389.

[3] H. Zhang, J. Fritts, and S. Goldman, An entropy-based objective evaluation method for image segmentation, Proceedings of SPIE - Storage and Retrieval Methods and Applications for Multimedia (2004).

[4] N. M. Nasab, M. Analoui, and E. J. Delp, Robust and Efficient Image Segmentation Approaches Using Markov Random Field Models, Journal of Electronic Imaging 12 (2003), no. 1, 50-56.

[5] H. Zhang, J. E. Fritts, and S. A. Goldman, Image Segmentation Evaluation: a Survey of Unsupervised Methods, Computer Vision and Image Understanding 110 (2008), no. 2, 260-280.

[6] S. Chabrier, B. Emile, C. Rosenberger, and H. Laurent, Unsupervised performance evaluation of segmentation, EURASIP Journal on Applied Signal Processing (2006).

[7] D. Martin, C. Fowlkes, D. Tal, and J. Malik, A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics, Proceedings of the 8th International Conference on Computer Vision 2 (2001), 416-423.

[8] Edge Detection and Image Segmentation System: http://www.caip.rutgers.edu/riul/research/code/EDISON/.

[9] R. Unnikrishnan, C. Pantofaru, and M. Hebert, Toward Objective Evaluation of Image Segmentation Algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007), no. 6.

[10] F. Ge, S. Wang, and T. Liu, New Benchmark for Image Segmentation Evaluation, Journal of Electronic Imaging 16 (2007), no. 3, 033011-033026.

Evaluating and improving SPEA using the fuzzy c-means clustering algorithm
Pezhman Gholamnezhad
IAU, Science and Research Branch
Department of Computer Engineering
Tehran, Iran
pezhman.gholamnezhad@gmail.com

Mohammad Mehdi Ebadzadeh
Amirkabir University of Technology
Department of Computer Engineering
Tehran, Iran
ebadzadeh@aut.ac.ir

Abstract: Most current MOEA implementations do not use new mechanisms to produce new solutions; new solutions are obtained from traditional genetic recombination operators such as crossover and mutation. The Strength Pareto Evolutionary Algorithm (SPEA) introduces elitism by explicitly maintaining an external population that stores a fixed number of the non-dominated solutions. The balance between the current population and the external population is an important issue: when they are unbalanced, the current population quickly converges toward the external population and the possibility of exploring the Pareto optimal set decreases. We propose a method based on fuzzy c-means that does not require specifying the size of the external population, while keeping diversity and sufficiency, and thus overcomes a deficiency of SPEA. The results of this method are compared with NSGA-II and SPEA, and systematic experiments show that, overall, the method is faster than the previous algorithms and obtains better results with fewer iterations and evaluations.

Keywords: Strength Pareto Evolutionary Algorithm; Non-dominated Sorting GA; Fuzzy c-means clustering; Multiobjective optimization.

Introduction

Multiobjective optimization is the process of simultaneously optimizing two or more conflicting objectives subject to certain constraints. Maximizing profit and minimizing the cost of a product, maximizing performance and minimizing fuel consumption of a vehicle, and minimizing weight while maximizing the strength of a particular component are examples of multiobjective optimization problems. For nontrivial multiobjective problems, one cannot identify a single solution that simultaneously optimizes each objective. While searching for solutions, one reaches points such that, when attempting to improve an objective further, other objectives suffer as a result. The Pareto optimal set is the set of all optimal points in the decision space, and the Pareto optimal front is the set of all Pareto set points mapped to the objective space.

When the priorities of a decision maker are not explicit or are very difficult to specify mathematically, the decision maker requires an approximation of the Pareto optimal set for making the final choice. Such an approximation can be a set of Pareto optimal solutions or a mathematical approximation model of the Pareto set or Pareto front. The advantage of evolutionary algorithms over other methods is that they can produce several elements of the Pareto optimal set or Pareto optimal front in a single run.

Most MOEAs employ a selection operator to direct the search to promising areas in the decision space. Routine selection operators cannot be applied directly to multiobjective optimization; furthermore, the task of most MOEAs is to produce a set of solutions which are smoothly distributed on the Pareto front. Thus, the selection operators in an MOEA should not drive the search to converge to a single point, and fitness assignment has therefore been the main tool for maintaining the diversity of the search.

It is usually very hard to balance diversity and convergence with a single population in current MOEAs. To overcome this deficiency, an external population is often used in MOEAs for maintaining the non-dominated solutions found during the search. Initially, the number of non-dominated solutions in the principal population is indeterminate and uncontrollable, and this is the main problem in introducing an elitism operator in an MOEA. The number of non-dominated solutions is high, especially in MOEAs with many objective functions.

The rest of this paper is organised as follows. Section 2 introduces multiobjective optimization problems, Pareto optimality, the strength Pareto evolutionary algorithm, the non-dominated sorting genetic algorithm, fuzzy c-means clustering and the crowding distance. Section 3 presents the details of the proposed algorithm. Section 4 presents and analyzes the experimental results. Section 5 concludes this paper and outlines future research work.

Corresponding Author, P. O. Box 1463783931, T: (+98) 21 44235935

Problem Definition

A multiobjective optimization problem (MOP) can be stated as follows [1]:

    Minimize F(x) = (f1(x), f2(x), ..., fm(x))^T
    subject to x in X                                   (1)

where X is a subset of R^n, the decision space, and x = (x1, x2, ..., xn)^T in R^n is the decision variable vector. F(x): X -> R^m consists of m real-valued continuous objective functions fi(x) (i = 1, 2, ..., m), and R^m is the objective space. In the case m = 2, the problem is referred to as a continuous bi-objective optimization problem.

Let a = (a1, a2, ..., am)^T and b = (b1, b2, ..., bm)^T in R^m be two vectors. We say that a dominates b, denoted a < b, if ai <= bi for all i = 1, ..., m and a != b. A point x* in X is called (globally) Pareto optimal if there is no x in X such that F(x) < F(x*). The set of all Pareto optimal points, denoted PS, is called the Pareto set. The set of all Pareto objective vectors, PF = { y in R^m | y = F(x), x in PS }, is called the Pareto front [2].
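The dominance relation above translates directly into code. The short Python sketch below is our own illustration (not taken from the paper): it checks dominance between two objective vectors and extracts the non-dominated subset of a set of points, assuming all objectives are minimized.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization): a <= b everywhere and a != b."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Return the points not dominated by any other point (an approximation of the Pareto set)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Example with two minimized objectives: (3.0, 3.0) is dominated by (2.0, 2.0).
pts = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
print(non_dominated(pts))
```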
Facility location problems are a popular topic within a wide range of issues in computer science. They concern positioning a number of facilities on a problem plane to serve some given demands, in order to optimize one or several objectives, generally known as demand satisfaction. Here we mention a new, hybrid class of facility location problems that concerns locating a set of facilities on a two-dimensional (2D) continuous problem plane in order to provide service to a collection of demands, studied from an agent perspective; these properties belong to a fundamental class of intelligent agents, namely the reactive agents.

Facility location problems are also referred to as location analysis problems among computer scientists. As stated before, the purpose of a facility location problem is to assign a set of facilities to a collection of demands in order to satisfy them completely or, when that is not feasible, optimally. To become more familiar with these issues and tools, one can consult introductions to the facility location problem, to Voronoi diagrams together with their classes and generalizations, and to agent-based issues, especially reactive agents.
Zitzler and Thiele (1998) proposed an elitist evolutionary algorithm which they called the Strength Pareto Evolutionary Algorithm (SPEA) [3]. This algorithm introduces elitism by explicitly maintaining an external population. This population stores a fixed number of the non-dominated solutions found from the beginning of the simulation. At every generation, newly found non-dominated solutions are compared with the existing external population and the resulting non-dominated solutions are preserved. The first step is to assign a fitness to each member of the population and to use genetic operators to find a new population. In addition, a fitness called the strength is assigned to the external population members, and it is smaller than the fitness of the current population members; in this method a solution with a smaller fitness is better. EA population members dominated by many external members get large fitness values. With these fitness values, a binary tournament selection procedure is applied to the combined population to choose solutions with smaller fitness values. Thus, it is likely that external elites will be emphasized during this tournament procedure. As usual, crossover and mutation operators are applied to the mating pool and a new population is created. In this method a clustering algorithm is applied to reduce the size of the external population. Clustering ensures that the non-dominated solutions lead to a better spread and diversity; to make the clustering approach find the most diverse set of solutions, each extreme solution can be forced to remain in an independent cluster. A balance between the regular population size and the external population size is important for the successful working of SPEA. If a large external population is used, the selection pressure for the elites will be large and SPEA may not be able to converge to the Pareto-optimal front. On the other hand, if a small external population is used, the effect of elitism will be lost. Moreover, many solutions in the population will not be dominated by any external population member and their derived fitness values will be identical.
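To make the fitness assignment concrete, the sketch below follows one common reading of the SPEA strength computation described above: each external member's strength is the fraction of current-population members it dominates, and each population member's fitness is one plus the sum of the strengths of the external members that dominate it, so that smaller fitness is better. This is an illustrative interpretation of [3], not code from the paper.

```python
def dominates(a, b):
    """Minimization dominance: a is no worse everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def spea_fitness(external, population):
    """Strengths of external (elite) members and fitnesses of current-population members.
    Inputs are lists of objective tuples; smaller fitness is better."""
    n = len(population)
    strength = [sum(dominates(e, p) for p in population) / (n + 1.0) for e in external]
    fitness = [1.0 + sum(s for e, s in zip(external, strength) if dominates(e, p))
               for p in population]
    return strength, fitness
```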

3 Algorithm

3.1 Basic Idea

In the fuzzy c-means clustering algorithm, a centroid of a cluster is computed as the mean of all points, weighted by their degree of belonging to the cluster. The degree of membership in a certain cluster is related to the inverse of the distance to that cluster. By iteratively updating the cluster centers and the membership grades of each data point, FCM moves the cluster centers to the right location within a dataset [4]. The performance of this algorithm depends on the initial centroids.

The variant of fuzzy c-means clustering used here is a clustering method in which the number of clusters is dynamic and related to the positions of the points, and a radius parameter characterizes the efficacy of each data dimension. We propose to compute the radius parameter from the crowding distance. As a result, in SPEA, the cluster centers are the data points that have the smallest average distance with respect to the solutions in the same class. Hence, while preserving diversity in the external population, we do not need to determine the size of the external population.
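The standard FCM iteration recalled above alternates between a membership update and a center update. The compact NumPy sketch below shows one common formulation (fuzzifier m = 2); it is our own illustration of the basic algorithm, not the exact radius-based variant used by the authors.

```python
import numpy as np

def fcm(data, centers, n_iter=100, m=2.0, eps=1e-9):
    """Basic fuzzy c-means: alternate membership and center updates.
    data: (N, d) array; centers: (c, d) initial centers (results depend on them)."""
    for _ in range(n_iter):
        # distance from every point to every center, shape (N, c)
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2) + eps
        # membership grades are inversely related to distance (fuzzifier m)
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # new centers: means of all points weighted by membership^m
        w = u ** m
        centers = (w.T @ data) / w.sum(axis=0)[:, None]
    return centers, u

# Tiny usage example with two obvious clusters.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
centers, memberships = fcm(pts, centers=pts[[0, 2]].copy())
```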
In NSGA-II the offspring population is first created by using the parent population, and then the two populations are combined. A non-dominated sorting is then used to classify the entire population. Once the non-dominated sorting is over, the new population is filled with solutions from the different non-dominated fronts, one front at a time. The filling starts with the best non-dominated front and continues with solutions of the second non-dominated front, followed by the third non-dominated front, and so on. Since the overall population size is 2N, not all fronts may be accommodated, and the remaining ones are simply deleted. When the last allowed front is being considered, there may be more solutions in that front than remaining slots in the new population. Instead of arbitrarily discarding some members from the last front, it is wise to use a niching strategy to choose the members of the last front which reside in its least crowded region.

3.2 Algorithm Framework

The algorithm works as follows.

Initialization: Generate an initial population and compute the F-value of each initial solution in the population.

Modeling: First, find the best non-dominated set according to the F-values and copy it to the external population. Then find the best non-dominated points in the updated external population and delete all dominated points. At this step, we compute the average crowding distance for the external population via the crowding distance method, and we then apply fuzzy c-means clustering with this radius parameter. As a result, we obtain the variables with the smallest average distance with respect to the other variables in the same class.
The crowding distance is used to estimate the density of solutions surrounding a particular solution i in the population: we take the average distance of the two solutions on either side of solution i along each of the objectives. This quantity d_i serves as an estimate of the perimeter of the cuboid formed by using the nearest neighbors as vertices, and is called the crowding distance.
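A standard implementation of this estimate, in the spirit of NSGA-II, is sketched below as our own illustration: boundary solutions receive an infinite distance, and interior solutions accumulate the normalized gap between their neighbours along every objective.

```python
def crowding_distance(objs):
    """Crowding distance of each solution in a front.
    objs: list of objective tuples; returns one distance per solution (larger = less crowded)."""
    n = len(objs)
    if n == 0:
        return []
    m = len(objs[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: objs[i][k])
        lo, hi = objs[order[0]][k], objs[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")   # keep boundary solutions
        span = (hi - lo) or 1.0                           # avoid division by zero
        for j in range(1, n - 1):
            dist[order[j]] += (objs[order[j + 1]][k] - objs[order[j - 1]][k]) / span
    return dist
```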

3.3 Crossover and Selection

Similar to SPEA, we assign a fitness to the current population and to the external population. We then apply a crowded tournament selection and, using the crossover and mutation operators, we select the next population from the current population and the external population.

3.4 Stopping condition

If the stopping condition is met, stop; otherwise return to the modeling step.


Experimental Results

The test instances F1-F5 in Table 1 are used for this purpose.
Table 1: Test instances

Instances   Variables              Objectives
F1          [-1/sqrt(n), 1/sqrt(n)]   FON
F2          x1, x2                 POL
F3          [0, 1]^n               ZDT1
F4          [0, 1]^n               ZDT2
F5          [0, 1]^n               F (unconstrained)

Table 3 shows the mean and standard deviation of the D-metric for all functions, over 50 final populations.

Table 3: Mean and STD of the D-metric for all functions, over 50 final populations

             NEW               NSGA-II           SPEA
Function   Mean     Std      Mean     Std      Mean     Std
F1         0.0304   0.0257   0.1854   0.21     0.1768   0.1657
F2         0.0286   0.0196   0.1625   0.158    0.151    0.124
F3         0.0754   0.0566   0.1154   0.098    0.2356   0.212
F4         0.0958   0.0874   0.2651   0.217    0.284    0.243
F5         0.0645   0.0581   0.1156   0.095    0.1961   0.174

The general experimental settings are given in Table 2.

Representation        Real numbers
Initial population    Random uniform
Fitness variable      Min-EX
Clustering            Fuzzy c-means
Crossover             SBX method
Mutation              Polynomial
Evaluation            F-value
Stop condition        Number of iterations

Table 2: General experimental settings

Figure 1 shows the evolution of the average IGD of the non-dominated solutions in the current populations over 20 independent runs, with 5000 function evaluations, for the three algorithms on F1.

Figure 1: Evolution of the average IGD of the non-dominated solutions in the current populations over 20 independent runs with 5000 function evaluations, for the three algorithms on F1.

Conclusion

It has not been well studied how to generate new trial solutions in multiobjective evolutionary optimization. In this paper, experimental studies have shown that, overall, our method performs better than the SPEA and NSGA-II methods, and that better results are obtainable with fewer iterations and function evaluations. We can also reach the desired solutions quickly, which is one of the main features of this method. The independence of this method from control parameters, especially in data clustering, is one of its advantages with respect to SPEA.

References

[1] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms, Wiley, Chichester, 2001.

[2] K. Deb, R. Zoref, and S. Ur, Multi-objective evolutionary algorithm: Introducing bias among Pareto optimal solutions, in: A. Ghosh and S. Tsutsui (Eds.), Theory and Applications of Evolutionary Computation: Recent Trends, Springer-Verlag, London.

[3] E. Zitzler and L. Thiele, An evolutionary algorithm for multiobjective optimization: The strength Pareto approach, Technical Report 43, Zurich, Switzerland (1998).

[4] R. Yager, D. Filev, K. Sen, and M. Naik, Generation of Fuzzy Rules by Mountain Clustering, Journal of Intelligent & Fuzzy Systems 2 (1994), no. 3, 209-219.

[5] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, A fast and elitist multi-objective genetic algorithm: NSGA-II, Technical Report 2001, Indian Institute of Technology, Kanpur: Kanpur Genetic Algorithms Laboratory (KanGAL) (2000).


Hypercube Data Grid: a new method for data replication and replica consistency in data grids
Tayebeh Khalvandi
Islamic Azad University, Tehran, Iran
Department of Computer Engineering, Science and Research Branch
t.khalvandi@srbiau.ac.ir

Amir Masoud Rahmani
Islamic Azad University, Tehran, Iran
Department of Computer Engineering, Science and Research Branch
rahmani@srbiau.ac.ir

Seyyed Mohsen Hashemi
Islamic Azad University, Tehran, Iran
Department of Computer Engineering, Science and Research Branch
hashemi@srbiau.ac.ir

Abstract: Nowadays scientific applications generate huge amounts of data, and the grid is an efficient solution to manage and store them. A Data Grid provides sharing and management services for very large datasets around the world. Data replication is a practical and effective method to improve data access time and fault tolerance. However, modification of data in the grid raises the problem of maintaining consistency among the replicas. In this paper we propose a new method for embedding the grid in a hypercube, called the Hypercube Data Grid (HDG). Master data is distributed in the HDG by dividing it into several parts and placing them on the grid sites, and update propagation is done with the broadcast algorithm in the hypercube. Simulation results with OptorSim show that the proposed approach improves the mean job time, the effective network usage, the total number of replications and the percentage of storage filled compared with other approaches.

Keywords: Data Grid, Hypercube, Distribution Master Data, Broadcast.

Introduction

Some scientific applications, such as high energy physics, bioinformatics and earth observation, generate huge amounts of data, on the order of terabytes or petabytes. Managing such large amounts of data in a centralized way is impractical due to the extensive access latency and the load on the central server. Researchers around the world need to share large amounts of data for analysis or for conducting experiments.

A grid is a large-scale resource sharing and problem solving mechanism in virtual organizations [1]. A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities [2]. A Data Grid is a type of grid which provides services that help users to discover, transfer, and manipulate large datasets stored in distributed repositories, and also to create and manage copies of these datasets. At a minimum, a Data Grid provides two basic functionalities: a high-performance, reliable data transfer mechanism, and a scalable replica discovery and management mechanism [3]. A replica management system or data replication mechanism is used to promote high data availability, low bandwidth consumption, increased fault tolerance and improved scalability and response time [4-10]. Data replication can be managed by two main methods: static replication and dynamic replication. Static replication creates and manages replicas manually and does not change with changes in user behavior; dynamic replication creates new replicas automatically and changes the location of replicas as user behavior changes.

In this paper a new method for data replication and replica consistency in the HDG is proposed. The main purpose of this approach is to reduce the job execution time and to improve update propagation. In the proposed approach the master data is divided into smaller parts and distributed in the HDG, and update propagation is done with the broadcast algorithm in the hypercube.

The rest of the paper is organized as follows: Section 2 is related work. In Section 3, the hypercube and broadcast in the hypercube are introduced. Section 4 illustrates the hypercube data grid. Section 5 presents the experimental results and analysis. The conclusion of this research is given in Section 6.

Corresponding Author, F: (+98) 21 44869744, T: (+98) 21 44869730

2 Related Works

In [11] three replication algorithms were introduced: Least Frequently Used (LFU) and two economic strategies. All three algorithms are implemented in each grid site as a part of the Replica Optimization Service in the Replica Manager. When a required file is not available locally and the local storage does not have enough space, LFU and the economic strategies behave differently. LFU always replicates the file from a remote grid site and deletes the files which have been used least frequently in the recent past. The economic strategies associate a value with each file based on the binomial or Zipf distribution; the files whose values are less than the value of the new file are deleted.

In [12] the Least Recently Used (LRU) algorithm is introduced, which deletes the files that have been used least recently.

In [13] the Bandwidth Hierarchy Replication (BHR) algorithm is introduced. The BHR strategy extends site-level replica optimization by considering network locality. Network locality means that sites which are located close together are organized as a group called a network region; intra-region transmissions can outperform inter-region ones due to the network locality. A region optimizer keeps track of the file access frequencies in order to optimize the replication strategy in BHR, and it fetches the needed replica from another region through the Internet if the replica does not exist in its own region.

In [14], the authors proposed a BHR algorithm using a three-level hierarchical structure; they addressed the problem of both scheduling and replication.

In [15] an Adaptable Replica Consistency Service (ARCS) for replica consistency management is proposed. Replicas are separated into two categories, master replicas and secondary replicas. Master replicas can be modified by end users, while secondary replicas are read-only. The average frequency with which a secondary replica is accessed in a grid site is defined as its access weight. A full consistency service is used for secondary replicas with high access probability in order to reduce the access delay, and a partial consistency service is applied for secondary replicas with infrequent access.

In [16] a one-way replica consistency model was introduced. The replica distribution topology has three types of nodes: super nodes (SN), master nodes (MN), and child nodes (CN). The data source is saved in the SN and is modified by grid users. Data is replicated from the SN to the MNs automatically when it is added or modified by grid users, and it is replicated to the CNs depending on two factors: the access frequency of the files and the storage capacity.

In [17] a replica clustering coefficient is defined to represent the update communication ability of a replica node; using it, replica nodes are classified into multiple levels and a replica tree is formed to propagate replica updates efficiently. The first-level replica nodes constitute a peer-to-peer network. Exploiting the roles of the different replica nodes reduces the number of conflicts and realizes replica consistency rapidly.

3 Hypercube

A hypercube of degree k has 2^k nodes and each node has exactly k neighbors. The distance between any two nodes is less than or equal to k. The nodes of a hypercube may be labeled with binary numbers of length k; two nodes are adjacent if their labels differ in exactly one bit position [18]. Some hypercubes are shown in Figure 1.


Figure 1: Some hypercubes

The diameter of a hypercube with 2^k nodes is k, and the bisection width of a network of that size is 2^(k-1); the hypercube therefore has low diameter and high bisection width [18]. Embedding networks of processors into a hypercube is attractive because of the low degree and low diameter of the hypercube [19]. The ability to transmit a large amount of data quickly makes the hypercube a more useful interconnection network than networks such as trees and rectangular grids [20].

A graph G is cubical if there is an embedding of G into a hypercube of degree k for some k [21]. Havel and Liebl [22, 23] showed that all trees, rectangular meshes, and hexagonal meshes are cubical. They also proved that a cycle is cubical if and only if it is even. In [21] it is shown that the star graph with m + 1 nodes and the simple path with m nodes are cubical. There are classes of graphs that cannot be embedded into a hypercube with adjacency preserved, such as the complete graph and graphs with odd cycles [19].

A restriction of the hypercube topology is that the number of nodes in the system must be a power of two. This restriction can be overcome by using an incomplete hypercube, a hypercube missing certain of its nodes. An example of an incomplete hypercube is shown in Figure 2. Unlike the hypercube, an incomplete hypercube can be constructed with any number of nodes, and the routing and broadcast algorithms for the incomplete hypercube are nearly as simple as those for the hypercube [24].

Figure 2: Incomplete Hypercube

3.1 Broadcast in hypercube

One of the fundamental hypercube communication patterns is broadcasting, in which one node has to send the same message to all the other nodes in the hypercube. This problem has been examined in [25, 26]; in this study, the algorithm proposed in [27] for broadcast in the hypercube is used. There are d stages, numbered 0, 1, ..., d-1. During stage k, nodes 0, 1, ..., 2^k - 1 send the message (concurrently) to the nodes obtained by flipping the k-th bit of their own IDs, i.e., node n sends to the node whose ID is the bit-wise exclusive-OR of n and 2^k.
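The staged broadcast of [27] is easy to state in code. The sketch below is our own illustration, assuming a complete hypercube of degree d with node 0 as the source: in stage k every node that already holds the message forwards it across its k-th dimension link, so after d stages all 2^d nodes are covered.

```python
def hypercube_broadcast(d):
    """Simulate the staged broadcast described above on a complete hypercube of degree d.
    Returns the list of (stage, sender, receiver) transmissions, assuming node 0 is the source."""
    have = {0}                       # nodes that currently hold the message
    log = []
    for k in range(d):               # stages 0 .. d-1
        for n in range(2 ** k):      # nodes 0 .. 2^k - 1 send concurrently
            receiver = n ^ (1 << k)  # flip the k-th bit of n
            log.append((k, n, receiver))
            have.add(receiver)
    assert have == set(range(2 ** d))    # every node is reached after d stages
    return log

# Example: a degree-3 hypercube (8 nodes) needs 3 stages.
for stage, src, dst in hypercube_broadcast(3):
    print(f"stage {stage}: {src:03b} -> {dst:03b}")
```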

4 Hypercube Data Grid

In this section the proposed approach is presented.

4.1 Embedding a network in the hypercube

A method for embedding an arbitrary network structure into a hypercube is presented. For the embedding, knowing the number of nodes in the network and their neighbors is sufficient; no other information about the network infrastructure is required. The steps for mapping the network into the hypercube are as follows.

First step: for a network with n nodes, a hypercube of degree ceil(log2 n) is created.

Second step: the network nodes are mapped to hypercube nodes. Initially, a node of the network is selected randomly and mapped to the first node of the hypercube. After a network node is mapped to a hypercube node, the neighbors of this node in the network are mapped to the neighbors of the hypercube node. Depending on the numbers of neighbors of the network node and of the hypercube node, one of the following three cases occurs:

- If the numbers of neighbors are equal, all neighbor nodes in the network are mapped to neighbor nodes in the hypercube.
- If the number of neighbor nodes in the network is larger, a network neighbor is assigned to every neighbor node of the hypercube node.
- If the number of neighbor nodes in the hypercube is larger, all neighbors in the network are mapped to hypercube neighbors.

After each node of the network is mapped to a hypercube node, the same steps are repeated for the neighbors of that node.


An unvisited node of the network is mapped to a hypercube node which has not been assigned in the previous steps. The algorithm ends when all nodes of the network are mapped to hypercube nodes. If mapping all nodes of the network to hypercube nodes is not possible, a hypercube of higher order is created and the network nodes are mapped to the new hypercube nodes in the same way.

It should be mentioned that the above algorithm applies to a connected network. If the network is not connected, the algorithm is run separately for the different parts of the network. This approach creates the hypercube virtually and does not change the actual structure of the network.
Hypercube Data Grid

Grid project is a wide range which can include largescale projects in several countries to smaller scale
projects in several levels of organization. Data grid
includes sites with computing and storage components
to run jobs and routers without computing and storage components to routing. Set of sites linked to the
router and they are considered as a virtual organiza- Figure 4: Distribution of master data in hypercube
tion (VO). The routers in different VO are connected with degree three
with each other and the sites are communicated via
routers. In proposed approach routers are organized in
hypercube. An example of this architecture is shown
4.3 Embedding grid in hypercube
in Figure 3.
The grid includes sites and routers, but only routers
are embedded in hypercube; thus the connection between routers and sites is removed. Routers are classified in a group with maximum eight members, and
then they are embedded in a hypercube with the algorithm in section 2-3, the resulted hypercube maybe
incomplete. Communication between the routers is not
change and the hypercube is a virtual, so each router
knows its location and neighbors in the hypercube. After creation hypercube, each router re-connected with
neighbor sites.

Figure 3: Example of proposed architecture

Generally there are two types of data: master data and secondary data. Master data is the source of the data; its location is specified at first and it cannot be deleted. Secondary data is created by replication and can be deleted. Master data stored in a single site causes a single point of failure and a bottleneck. To solve this problem, solutions have been proposed in [15, 16] in which the master data is stored completely in multiple sites. In this study, a method for distributing the master data in the grid is introduced. As mentioned in the previous section, in a hypercube of degree k each node has exactly k neighbors. The master data is partitioned into k + 1 parts which are then distributed over the hypercube nodes, so that the data required by each site is located in the same site or in one of its neighbors. Thus data access is faster and the master data can be stored in sites with less storage space. The distribution of master data in a hypercube of degree three, where the master data is partitioned into four parts, is shown in Figure 4.

Figure 4: Distribution of master data in a hypercube of degree three

4.4 Distribution of master data in the HDG

The distribution of master data is based on the hypercube structure. Since the hypercube is created over the routers, the master data is stored on the sites associated with the virtual routers. The hypercube may be of order zero up to three. In a zero-order hypercube, the master data is placed completely in a single site. In a one-order hypercube, the master data is divided into two partitions, which are located in the two hypercube nodes. For a hypercube of order two, the master data is divided into three partitions; Figure 5 shows the location of the master data in this case. For a hypercube of degree three, the master data is divided into four partitions, as shown in Figure 4.

The distribution of master data for an incomplete hypercube which is missing a certain number of edges is the same. If some nodes are missing, the first k + 1 parts of the master data are located in the hypercube; if there remain nodes which have no master data, some parts of the master data are also located in them.

Figure 5: Distribution of master data in hypercube with degree three

4.5 Consistency management in the HDG

There are two types of data in the proposed method, master data and secondary data. Master data is divided into smaller parts and distributed over the grid structure. Secondary data is created by data replication. Master data can be changed by grid users, but secondary data is read-only. The changes to the master data must be propagated to the other data.

The process of consistency management starts by selecting a master data manager among all the master data stored in the sites of a VO. The master data saved in the VO with the smallest ID is called the manager of the master data.

Propagation of an update starts by broadcasting the update operation towards every VO in the hypercube. Propagation is started periodically, based on the application requirements. In a hypercube of degree k the time complexity of broadcast is O(k), so this stage has complexity O(k). The sites receive the message, and a site with modified master data broadcasts its changes to the manager of the master data; the time complexity of this stage is also O(k). The manager of the master data receives the changes and broadcasts the latest changes in the hypercube, and the sites then update their data with the latest changes. The time complexity of this stage is O(k) as well. Therefore, maintaining consistency in the hypercube data grid is done with time complexity O(k).

5 Results

To evaluate the proposed approach, OptorSim [28-30], a data grid simulator, is used. It was developed to evaluate and validate replication strategies in data grids. The simulator assumes that the grid is made up of several sites, each consisting of zero or more computing elements (CEs) and zero or more storage elements (SEs); CEs and SEs are used to execute grid jobs and to store files, respectively. Job scheduling and resource allocation are done in the resource broker by the scheduling algorithm. Replicas are managed by the replica manager; the replica optimizer, at the heart of the replica manager, creates and deletes replicas according to the replication algorithm. Our simulation topology is the CMS Data Challenge 2002 topology [31], shown in Figure 6.

Figure 6: Our simulation topology [31]

There are twenty sites and eight routers. The simulation parameters are shown in Table 1.

Parameter                   Value
Number of sites             20
Number of routers           8
Number of jobs              100
Number of job types         6
Each file size (GByte)      1
Total file size (GByte)     97
Number of experiments       10

Table 1: Simulation parameters

For the simulation, the topology is embedded in the hypercube and the master data is distributed in it.


The embedded topology in the hypercube and the position of the master data in it are shown in Figure 7.

Figure 7: Distribution of master data in the HDG

In the proposed approach the master data is distributed in the HDG. For the evaluation, the proposed method is compared with placing the master data in one site and with placing it in two sites. In the CMS topology, the master data is placed at one of the sites marked in red in Figure 6, or at both of them; in the distributed method, the master data is divided into four parts which are placed in the sites specified in Figure 7.

Mean Job Time

The mean job time is the total execution time divided by the number of jobs; it is shown in Figure 8. The proposed approach has the lowest mean job time, because the master data is distributed in the grid and applications access the data locally.

Figure 8: Mean Job Time

Effective Network Usage

A comparison of the methods by effective network usage (ENU) is shown in Figure 9. As the figure shows, the proposed method has the lowest ENU in comparison with the other algorithms. The reason is that the data are distributed, so jobs have parts of the data locally; increasing the number of local accesses and decreasing the number of replications reduces the ENU.

Figure 9: Effective Network Usage

Total Number of Replications

The total number of replications for the three methods is shown in Figure 10. Because of the data distribution in the proposed approach, parts of the data are available locally, so the total number of replications in the proposed approach is lower than in the others.

Figure 10: Total Number of Replication

Percentage of Storage Filled

The last evaluation criterion, the percentage of storage filled, is shown in Figure 11. With the No Replication algorithm, the percentages of storage filled for the distributed master data and for master data in two sites are the same, and master data in one site has the minimum. In the algorithms with replication, because of the smaller number of replications in the distributed method, the percentage of storage filled is improved.

Figure 11: Percentage of Storage Filled

Conclusions

In this paper, a method for embedding a network structure in a hypercube was introduced. The grid structure embedded in the hypercube is called the hypercube data grid (HDG). The master data is distributed in the HDG: it is divided into smaller parts which are distributed over the HDG. Distribution of the master data reduces the mean job time, the total number of replications and the percentage of storage filled, and improves the effective network usage. Update propagation in the HDG is done by the broadcast algorithm in the hypercube, which leads to faster consistency maintenance in the HDG.

References

[1] I. Foster, C. Kesselman, and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications 15 (2001), no. 3.

[2] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 2004.

[3] I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, and A. Chervenak, The Data Grid: towards an architecture for the distributed management and analysis of scientific datasets, J. Net. Comput. Appl. 23 (2000), 187-200.

[4] I. Foster and K. Ranganathan, Design and evaluation of dynamic replication strategies for a high performance data grid, in Proceedings of the International Conference on Computing in High Energy and Nuclear Physics, Beijing, China (2001).

[5] S. Vazhkudai, S. Tuecke, and I. Foster, Replica selection in the Globus data grid, CCGrid (2001), 106.

[6] I. Foster, The grid: A new infrastructure for 21st century science, 2000.

[7] H. Lamehamedi and B. Szymanski, Data replication strategies in grid environments, ICA3PP (2002), 378.

[8] K. Ranganathan, A. Iamnitchi, and I. Foster, Reducing multi-class to binary: a unifying approach for margin classifiers, CCGrid 1 (2000), 376.

[9] R. M. Rahman, K. Barker, and R. Alhajj, Replica placement in data grid: Considering utility and risk, 2005.

[10] H. Stockinger, A. Samar, K. Holtman, B. Allcock, I. Foster, and B. Tierney, File and object replication in data grids, Cluster Computing 5 (2002), no. 3, 305-314.

[11] D. G. Cameron, A. P. Millar, C. Nicholson, R. Carvajal-Schiaffino, F. Zini, and K. Stockinger, Analysis of scheduling and replica optimisation strategies for data grids using OptorSim, Journal of Grid Computing 2 (2004), no. 1, 57-69.

[12] C. Nicholson, D. G. Cameron, A. T. Doyle, A. P. Millar, and K. Stockinger, Dynamic data replication in LCG 2008, UK e-Science All Hands Conference (2006).

[13] S.-M. Park, J.-H. Kim, Y.-B. Ko, and W.-S. Yoon, Dynamic Data Replication Strategy Based on Internet Hierarchy BHR, Lecture Notes in Computer Science 3033, Springer-Verlag, Heidelberg (2004), 838-846.

[14] A. Horri, R. Sepahvand, and Gh. Dastghaibyfard, A hierarchical scheduling and replication strategy, International Journal of Computer Science and Network Security 8 (2008).

[15] R.-S. Chang and J.-S. Chang, Adaptable Replica Consistency Service for Data Grids, Proceedings of the Third International Conference on Information Technology: New Generations (2006).

[16] C.-T. Yang, W.-C. Tsai, T.-T. Chen, and C.-H. Hsu, A One-way File Replica Consistency Model in Data Grids, IEEE Asia-Pacific Services Computing Conference (IEEE APSCC 2007), Tsukuba, Japan (2007).

[17] X.-Y. Ren, R.-C. Wang, and Q. Kong, Efficient Model for Replica Consistency Maintenance in Data Grids, International Symposium on Computer Science and its Applications (2008).

[18] M. J. Quinn, Parallel Computing: Theory and Practice, Vol. 2, McGraw-Hill, 2008.

[19] A. Y. Wu, Embedding of Tree Networks into Hypercubes (1985), 238-249.

[20] J. D. Ullman, Computational Aspects of VLSI, Computer Science Press, Rockville, Md., 1984.

[21] M. Livingston and Q. F. Stout, Embeddings in Hypercubes 11 (1988), 222-227.

[22] I. Havel and P. Liebl, Embedding the polytomic tree into the n-cube 98 (1973), 307-314.

[23] I. Havel and P. Liebl, Embedding the polytomic tree into the n-cube 98 (1972), 210-205.

[24] H. P. Katseff and P. Liebl, Incomplete Hypercubes 31 (1988), no. 5.

[25] C.-T. Ho and S. L. Johnsson, Distributed routing algorithms for broadcasting and personalized communications in hypercubes 31 (1986), no. 5, 640-648.

[26] Y. Saad and M. H. Schultz, Data communications in hypercubes (1985).

[27] Q. F. Stout and B. Wagar, Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 10 (1990), 167-181.

[28] D. G. Cameron, R. Carvajal-Schiaffino, A. P. Millar, C. Nicholson, K. Stockinger, and F. Zini, Evaluating Scheduling and Replica Optimisation Strategies in OptorSim (2003).

[29] W. Bell, D. Cameron, R. Carvajal-Schiaffino, P. Millar, C. Nicholson, K. Stockinger, and F. Zini, OptorSim v1.0 Installation and User Guide (2004).

[30] OptorSim, http://edg-wp2.web.cern.ch/ (2004).

[31] D. G. Cameron, A. P. Millar, C. Nicholson, R. Carvajal-Schiaffino, F. Zini, and K. Stockinger, OptorSim: a simulation tool for scheduling and replica optimization in data grids (2004).

Exploiting Parameters of SLA to Allocate Resources for Bag of Task Applications in Cloud Environment

Masoud Salehpour and Asadollah Shahbahrami
Department of Computer Engineering, Faculty of Engineering,
University of Guilan, Rasht, Iran

Abstract: Cloud computing should support different workload types with different characteristics, whereas no single solution can allocate resources optimally to all imaginable demands. It is therefore necessary to design specific solutions to allocate resources for each workload type. Based on that, this paper proposes an idea to facilitate dynamic resource allocation for bag of tasks applications. The proposed approach exploits the users' service level agreement parameters and a classification technique. Specifically, our approach manages resources and increases their utilization in order to respond to users in a reasonable time. We evaluate the proposed approach using Monte Carlo simulation, and the simulation results are compared with two reference models, First Fit and Proportional Share. The proposed approach outperforms the reference models in terms of the total cost of resource allocation and the total waiting time of clients.

Keywords: Cloud Computing; Resource Management; Bag of Tasks (BoT).

Introduction

Cloud computing presents services by providing infrastructure via the network to facilitate the management of both hardware and software resources [2, 4]. The services are provided in three models, viz. Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). SaaS provides most user applications, PaaS is concerned with application environments, and IaaS is involved in hardware-level management. A contracted Service Level Agreement (SLA), including parameters such as cost of operation and response time, defines the characteristics of these models. Service providers provision resources for customers with regard to the SLAs [5].

Figure 1: Tasks of a BoT tend to be completed at different times.

However, resource allocation to different workload types and applications with different characteristics in cloud computing is a challenging problem [13]. Technically, there is no single hardware or software solution that can allocate resources to all imaginable workload types efficiently [6]. Besides, since each type has its specific properties, a single solution cannot deal with them optimally. Consequently, there is a need to design and implement specific applications for each workload type [1].

Corresponding Author, P. O. Box: 3756-41635, F: (+98) 131-6690271, Email: shahbahrami@guilan.ac.ir

This paper focuses on allocating resources to Bag of Tasks (BoT) applications. Precisely, a BoT includes loosely coupled and compute-intensive tasks demanding minimal intertask communication [12]. The final results of all tasks in a BoT represent the answer to a single problem, and the tasks of a BoT are not required to execute at the same time, so tasks of the same BoT tend to complete at different times, as shown in Fig. 1. Although BoT applications are currently exploited in several fields, including data mining and image processing, the existing approaches to allocate resources to this workload type are not efficient.

The objective of this paper is to propose an idea to facilitate dynamic resource allocation for BoT applications. The proposed approach uses the SLA parameters and a classification technique. Our approach pays attention both to the period of time during which all tasks of a BoT must be serviced and to the agreed response time in the user's SLA, in order to manage resources in accordance with the characteristics of BoT applications. Briefly, the main contributions of this paper are as follows: (i) we propose an idea to allocate resources to BoT applications which increases the utilization of resources and omits idle time of physical servers; (ii) our approach outperforms the reference models, viz. First Fit and Proportional Share, in terms of the total cost of resource allocation and the total waiting time of clients.

This paper is organized as follows. Section 2 addresses some previous work. Section 3 presents the proposed approach. Our evaluation process is discussed in Section 4. Finally, Section 5 concludes this paper.

Figure 2: Functionalities of the different parts, depicted in detail.

Related Work

Proportional Share (PS) [3] was presented as a solution to allocate resources. Originally, it distributed tasks among all servers, which increased the total response time. Therefore, the authors in [13] evaluated a modified version of the original PS to decrease the number of servers that each customer is assigned to. They avoided the searching and selecting problem of PS by assuming a single server with a processing capacity equal to the sum of all servers' processing capacities.

The First Fit (FF) model was presented in [10]. It contains a straightforward greedy algorithm that places each received task in the first server that can support the task's requirement. FF can provide fair load balancing alongside the resource allocation process. The researchers in [9] considered different classes for servers and clients and proposed a heuristic approach using a discrete utility function, with iterative attempts to find under-utilized servers in a distributed environment in order to optimize resource allocation. However, the iterative attempts lead to a time-consuming process.

Moreover, the work in [7] considered a specific workload type to provide resource management. It worked on a workflow model, but its strategy for adapting to changes in the requests did not consider the users' SLA parameters. The work in [8] analyzed the performance and efficiency of resource management by handling and calculating the cost and stability of applications. The work in [11] leveraged a mathematical formulation to provide an economics-based model and used an optimization process over the cost of resource allocation.

The Proposed Approach

The proposed approach consists of two parts, namely the classifier and the Resource Manager (RM). Fig. 2 shows the positions of these parts: the BoT of each client is captured by the classifier, and then all requests are assigned to the RM, which allocates resources. The classifier can be positioned in the SaaS environment, and the RM can be embedded in the PaaS environment. A datacenter with a number of physical servers is considered as our IaaS environment, and each server, represented by an index k, is modeled by its CPU capacity (C_k^p) and memory capacity (C_k^m).

Table 1: Characteristics of the experiments' Bags of Tasks (workload)

Id    Bags   Tasks    Av. Tasks of Bags   Total CPU Req.   Av. Task CPU Req.   St. Dev.
WL1   500    20230    40.460              1416974          70.043              29.295
WL2   1000   39926    39.926              2790405          69.889              29.132
WL3   1500   59753    39.835              4191982          70.155              29.087
WL4   2000   81363    40.681              5702576          70.088              29.125
WL5   2500   100539   40.215              7047664          70.098              29.198
WL6   3000   120410   40.136              8435736          70.058              29.174

The functionalities of both the classifier and the RM are discussed in the following.

The first step in allocating resources to each client is performed by the classifier. This part divides clients, based on their contracted response time, into three classes. When the classifier receives a new BoT, it determines the sender and puts the BoT and the sender's class information into service by delivering the request to the RM. Additionally, since the number of clients' requests is unknown in advance, this paper assumes that users' requests follow a Poisson distribution with a given mean.

The RM then takes advantage of the received information (the client's class) to line up different queues using the Generalized Processor Sharing (GPS) discipline. In accordance with the three classes of clients, the RM lines up three different queues. It is known that, when the service times of users' requests are not too large, GPS can be implemented by Weighted Fair Queueing (WFQ); therefore, the RM exploits WFQ and uses the clients' classes as the WFQ priorities. Consequently, GPS implemented with WFQ provides three single-class queues.
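A minimal sketch of this classifier/RM front end is given below as our own illustration: incoming BoTs are grouped into three SLA classes by contracted response time and lined up in class-priority queues. The class thresholds, the data layout and the strict-priority dequeue (used here as a simple stand-in for WFQ) are invented for illustration only, since the paper does not specify them.

```python
import heapq
import itertools

# Hypothetical SLA classes ordered by contracted response time (minutes); thresholds are illustrative.
SLA_CLASSES = [(1, 30.0), (2, 120.0), (3, float("inf"))]

def classify(contracted_response_time):
    """Classifier step: map a client's agreed response time to one of the three classes."""
    for cls, limit in SLA_CLASSES:
        if contracted_response_time <= limit:
            return cls

class ResourceManagerQueues:
    """RM step: one queue per class; dequeue order follows class priority (stand-in for WFQ)."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()        # FCFS tie-break inside a class

    def submit(self, bot, contracted_response_time):
        cls = classify(contracted_response_time)
        heapq.heappush(self._heap, (cls, next(self._seq), bot))

    def next_bot(self):
        cls, _, bot = heapq.heappop(self._heap)
        return cls, bot

rm = ResourceManagerQueues()
rm.submit({"tasks": 40}, contracted_response_time=200.0)   # class 3
rm.submit({"tasks": 12}, contracted_response_time=15.0)    # class 1, served first
print(rm.next_bot())
```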

Ultimately, the RM considers each of the three single-class queues as an M/G/m queue, which indicates that the arrival times of the included tasks have an exponential distribution while the service times are independent and follow a general distribution. Our datacenter contains m servers which serve task requests in order of arrival (FCFS). When the allocation is complete, the BoT leaves the queue and the number of clients in the system is reduced by one. Finally, the tasks of a BoT execute based on the resources allocated on a physical server. Fig. 2 shows all the mentioned details of the proposed approach.

Evaluation

This section presents the metrics, methodology, and results of the evaluation process.

4.1 Metrics

The main metrics are the total response time, the total service time, and the total waiting time of the users. Based on these metrics and the total idle time of the servers, it is possible to calculate the total cost of resource allocation for our datacenter. To calculate the total response time, the proposed approach sums the service time and the waiting time of the servers.

4.2 Methodology and Results

The proposed approach has been simulated using the Monte Carlo method. Our evaluation process defines a domain of possible inputs, generates inputs randomly from a probability distribution over that domain, and finally performs a deterministic computation on the generated inputs to aggregate the results.

Table 1 shows the characteristics of our experimental workloads: the number of BoTs, the total number of tasks in the BoTs, the average number of tasks per BoT, the total CPU requirement of the tasks in MIPS, the average CPU requirement per task, and the standard deviation of the tasks' CPU requirements, from the second column to the seventh column, respectively. Furthermore, the number of servers in our datacenter and the number of SLA classes are set to 5 and 3, respectively. The total C_k^p of the servers is equal to 5000.0 MIPS, and C_k^m is equal to 3072 MB for each server.

In addition, to compare the obtained results of the proposed approach with known resource allocation methods, two reference models, the modified PS and the FF, have been used; details of these approaches were discussed in Section 2.
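As an illustration of this evaluation style (not the authors' simulator), the sketch below draws random BoT workloads from assumed distributions, runs a deterministic cost computation for each draw, and aggregates the results over many repetitions; all distributions, parameter values and the cost formula are placeholder assumptions.

```python
import random
import statistics

def simulate_once(n_bots=500, mean_tasks=40, mean_cpu=70.0):
    """One Monte Carlo draw: sample a workload and return a deterministic cost for it.
    The distributions and the cost formula are illustrative assumptions only."""
    total_cost = 0.0
    for _ in range(n_bots):
        tasks = max(1, int(random.gauss(mean_tasks, 29.0)))    # tasks per BoT
        for _ in range(tasks):
            cpu_req = max(1.0, random.gauss(mean_cpu, 29.0))   # MIPS requirement of a task
            total_cost += cpu_req / 5000.0                     # time on a 5000-MIPS server pool
    return total_cost

runs = [simulate_once() for _ in range(100)]                   # repeat the experiment many times
print(statistics.mean(runs), statistics.stdev(runs))
```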


In order to validate our results, the experiments were executed and repeated hundreds of times. Fig. 3 depicts the response time in minutes for (i) the FF, (ii) the proposed approach, and (iii) the modified PS; the different BoT workloads (WL1 to WL6) come from Table 1. Fig. 4 depicts the service time in the same setting, and the waiting time is shown in Fig. 5. As can be seen, our approach imposes the minimum waiting time on clients. For our approach the total response time is mainly a function of the total service time, which reveals that our approach can maximize the utilization of the physical servers.

Figure 3: Comparison between the response time of the proposed approach and the two reference models.

Figure 4: Service time of the models

Figure 5: Waiting time of the models

In addition, the idle time of the servers for the different workloads (WL1 to WL6) ranges from 0.049 to 0.858 for the FF and from 0.957 to 9.330 for the modified PS, respectively, while the proposed approach could omit the idle time of the servers. This is because the RM continuously checks the servers and exploits a deallocation process to switch unnecessary servers off. Hence, the two reference models are not comparable to our approach in terms of the total idle time of the servers. Besides, since the total cost of resource allocation must be calculated based on both the service time and the idle time, our approach can provide a cheaper resource allocation compared to the reference models.

Conclusion

Resource management is one of the most important challenges in cloud computing, where there are different workload types with different characteristics. To allocate resources efficiently, it is necessary to design specific applications for each workload type. This paper has proposed an approach for resource allocation for BoT applications. The proposed approach uses SLA parameters and classifies the tasks using a classification technique. Our approach manages resources to increase their utilization while responding to users in a reasonable time. The proposed approach has been simulated using Monte Carlo simulation, and the simulation results have been compared with two reference models, First Fit and Proportional Share. Our approach outperforms the reference models in terms of the total cost of resource allocation and the total waiting time of clients.

References

[1] J. Ekanayake et al., Cloud Technologies for Bioinformatics Applications, IEEE Trans. on Parallel and Distributed Systems 22 (2011), no. 6, 998-1011.

[2] M. Armbrust et al., A View of Cloud Computing, Communications of the ACM 53 (2010), no. 4, 50-58.

[3] Z. Liu et al., On Maximizing Service-Level-Agreement Profits, Proc. of the ACM Int. Conf. on Electronic Commerce, 2006.

[4] M. Salehpour and A. Shahbahrami, Alienable Services for Cloud Computing, Proc. of Int. Symp. on Intelligent Distributed Computing, 2011, pp. 195-200.

[5] H. Khazaei et al., Performance Analysis of Cloud Computing Centers Using M/G/m/m+r Queuing Systems, IEEE Trans. on Parallel and Distributed Systems 23 (2012).

[6] X. Liu et al., Application-Specific Resource Provisioning for Wide-Area Distributed Computing, IEEE Network 24 (2010), no. 4, 25-34.

[7] Y. Lee et al., Profit-driven Service Request Scheduling in Clouds, Proc. of the IEEE/ACM Int. Symp. on Cluster, Cloud and Grid Computing, 2010.

[8] S. Kumar et al., vManage: Loosely Coupled Platform and Virtualization Management in Data Centers, Proc. of Int. Conf. on Autonomic Computing, 2009.

[9] D. Ardagna et al., SLA Based Resource Allocation Policies in Autonomic Environments, Journal of Parallel and Distributed Computing 67 (2007), no. 3, 259-270.

[10] D. Juedes et al., Heuristic Resource Allocation Algorithms for Maximizing Allowable Workload in Dynamic Distributed Real-time Systems, Proc. of the IEEE Int. Symp. on Parallel and Distributed Processing, 2004, pp. 117-130.

[11] A. Chandra et al., Dynamic Resource Allocation for Shared Clusters using Online Measurements, ACM SIGMETRICS (2003).

[12] F. da-Silva and H. Senger, Scalability Limits of Bag-of-Tasks Applications Running on Hierarchical Platforms, Journal of Parallel and Distributed Computing 71 (2012), no. 6, 788-801.

[13] H. Goudarzi and M. Pedram, Maximizing Profit in Cloud Computing System via Resource Allocation, Proc. of Int. Conf. on Distributed Computing System, 2011, pp. 1-6.


Bus Arrival Time Prediction Using Bayesian Learning for Neural Networks

Farshad Bakhshandegan Moghaddam
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
fmoghaddam@iasbs.ac.ir

Alireza Khanteimoory
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
khanteymoori@iasbs.ac.ir

Fatemeh Forutan Eghlidi
Islamic Azad University, Science and Research Branch
fatemeh.forutan@googlemail.com

Abstract: Nowadays, the use of Intelligent Transportation Systems (ITS) is very common in many countries. One important component of ITS is the Advanced Traveler Information System (ATIS), and one of the main roles of ATIS is providing travel time information to travelers. Providing accurate transit arrival time information is important because it attracts additional passengers and increases the satisfaction of users. In this paper we use a Bayesian learning approach for Neural Networks to predict bus arrival time. We compare our proposed model to FeedForward, Backpropagation and Cascade-Forward Backpropagation Neural Networks. The result of Bayesian learning is a posterior distribution over the weights of the network. We use the Markov Chain Monte Carlo (MCMC) method to sample N values from the posterior weight distribution. These N samples help us to choose the best prediction by voting for the best solution. Our results show that the Bayesian Neural Network works better than the standard Neural Networks and the accuracy of prediction is increased.

Keywords: Neural Network; Bayesian Learning; Markov Chain Monte Carlo.

Introduction

With technological development, the price of electronics and components has decreased, so transport companies are looking for ways to keep their customers satisfied. One of these ways is using Intelligent Transportation Systems (ITS). Using ITS improves transport outcomes such as transport safety, transport productivity, travel reliability, informed travel choices, social equity, environmental performance and network operation resilience. One component of Intelligent Transportation Systems (ITS) is the Advanced Traveller Information System (ATIS), and a major component of ATIS is providing travel time information through different modes to travellers. Providing accurate transit arrival time information is important because it attracts additional passengers and increases the satisfaction of users. The increase in the satisfaction of travellers can be achieved by the provision of current travel information [1, 2]. Furthermore, transit operators can track vehicles and figure out whether the vehicles have fallen behind schedule or not. So the vehicle information is important for both transit operators and travellers, and the need for models or techniques to predict transit travel time is increasing. The objective of this research is developing a model and applying it to predict bus arrival time. Many models have been used to predict vehicle information, such as Regression Models, Time Series Models, Kalman Filtering Models, Artificial Neural Network Models, etc. [3]. In this research we use a Bayesian learning approach for Neural Networks to predict bus arrival time. Our results show that by using the Bayesian approach the accuracy of the predicted arrival time is increased. The rest of this paper is organized as follows. Section 2 outlines the principles of Bayesian Neural Networks. In Section 3 we present the experimental results and compare our proposed model with FeedForward, Backpropagation and Cascade-Forward Backpropagation Neural Networks. At the end we present the conclusion and the future work that can be done to create more precise and realistic models.

Corresponding Author, P. O. Box 45195-1159, F: (+98) 241 421-5071, T: (+98) 241 415-5067



2 Bayesian Learning Approach for Neural Networks

In Bayesian analysis all unknown and uncertain parameters can be modelled as probability distributions, and inferences are performed by constructing posterior conditional probabilities for the unobserved variables, given the observed variables and the prior assumptions [4]. The Bayesian approach was used for the first time by Buntine and Weigend in 1991 and reviewed by MacKay and Neal in 1996. The main difficulty of model building in standard Neural Networks is controlling the complexity, because the optimal number of degrees of freedom in the model strictly depends on the number of training samples, the amount of noise in the samples and the complexity of the function being estimated. Another problem of standard Neural Networks is the lack of analysis tools for the results. These issues can be handled in a very natural and consistent way by using the Bayesian approach. The unknown degree of complexity is handled by defining vague (non-informative) priors for the hyperparameters that determine the model complexity, and the resulting model is averaged over all model complexities weighted by their posterior probability given the data sample. Bayesian analysis also yields posterior predictive distributions for any variables of interest, making the computation of confidence intervals possible.

The result of the Bayesian approach is a posterior distribution. In the Bayesian approach predictions are made by integrating all models over this posterior distribution. Use of the posterior probabilities requires an explicit definition of the prior probabilities for the parameters. The posterior probability for the parameters \theta in a model M given data D is, according to Bayes' rule,

p(\theta | D, M) = p(D | \theta, M) p(\theta | M) / p(D | M)    (1)

where p(D | \theta, M) is the likelihood of the parameters \theta, p(\theta | M) is the prior probability of \theta, and p(D | M) is a normalizing constant, called the evidence of the model M. The term M denotes all the assumptions that are made in defining the model, like the choice of MLP network, the specific residual model, etc. The normalization term p(D | M) is the marginal probability of the data, conditioned on M. By integrating over everything, given the chosen assumptions M and the prior p(\theta | M), it becomes

p(D | M) = \int p(D | \theta, M) p(\theta | M) d\theta    (2)

In an MLP network, we have some training data D = (x_1, y_1), ..., (x_n, y_n), and want to know y^new given x^new. This is done by integrating the predictions of the model with respect to the posterior distribution of the model,

p(y^new | x^new, D, M) = \int p(y^new | x^new, \theta) p(\theta | D, M) d\theta    (3)

where \theta denotes all the model parameters of the prior structures.

In the Bayesian approach we have to define a probability distribution for the network parameters. A commonly used prior distribution for the network parameters is Gaussian,

w_k ~ N(0, a_k^2)    (4)

where w_k represents the weights and biases of the network and a_k^2 is the variance hyperparameter for a given weight (or bias). The hyperparameter a_k^2 is given, for example, a conjugate inverse gamma hyperprior,

a_k^2 ~ Inv-gamma(a_ave^2, \nu_a)    (5)

We also have to define a probability distribution for the residual. A commonly used Gaussian noise model is

e ~ N(0, \sigma^2)    (6)

The conjugate distribution for the noise model is the inverse gamma, producing the prior

\sigma^2 ~ Inv-gamma(\sigma_0^2, \nu_\sigma)    (7)

One of the main advantages of the Bayesian approach is that, because we integrate over all the possible solutions, it can avoid overfitting. In other words, a Bayesian MLP theoretically returns all possible solutions and integrates them out. In the case of an MLP the posterior distribution is typically very complex. The integrations required by the Bayesian approach can be approximated using Markov Chain Monte Carlo (MCMC) methods [5]. This method says that an integral \int g(x) p(x) dx can be approximated using a sample of values x^(t) drawn from the distribution p(x),

(1/n) \sum_{t=1}^{n} g(x^(t))    (8)
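To illustrate the sample-based approximation in Eq. (8), a minimal Python sketch is given below. It is not part of the original work; the integrand g and the sampling distribution p are arbitrary choices made for the example.

import numpy as np

# Monte Carlo approximation of the integral of g(x) p(x) dx, cf. Eq. (8):
# draw samples x^(t) ~ p(x) and average g(x^(t)).
rng = np.random.default_rng(0)

def g(x):
    return x ** 2                                            # example integrand (assumption)

samples = rng.normal(loc=0.0, scale=1.0, size=10_000)        # p(x) = N(0, 1) (assumption)
estimate = np.mean(g(samples))                               # approximately E[g(x)] = 1 here
print(estimate)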



MCMC for Bayesian Neural Networks has been proposed by Neal [6]. The posterior distribution is represented by a sample of perhaps a few dozen sets of network weights. The sample is obtained by simulating a Markov chain whose equilibrium distribution is the posterior distribution of the weights. Fig. 1 shows the solutions of a Bayesian MLP for a regression problem.

Figure 1: Bayesian MLP solution for a regression problem.

The dots show the data points, the thin grey lines are the N different solutions, and the dark solid line is the average solution. This figure shows that the average solution is smoother than the individual solutions. The MCMC algorithm is exact in the limit as the size of the sample and the length of time for which the Markov chain is run increase, but convergence can sometimes be slow in practice [5]. Note that samples from the posterior distribution are drawn during the learning phase, which may be computationally very expensive, but predictions for new data can be calculated quickly using the same stored samples.
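As an illustration of how stored posterior samples can be reused at prediction time (cf. Eq. (3)), the following sketch averages the outputs of a one-hidden-layer MLP over N stored weight samples. The architecture and the NumPy implementation are assumptions made for the example and are not tied to the toolbox used by the authors.

import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    # one-hidden-layer MLP with tanh units (assumed architecture)
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def posterior_predictive(x_new, weight_samples):
    # weight_samples: list of (W1, b1, W2, b2) tuples drawn by MCMC;
    # the Bayesian prediction is the average over the stored samples.
    preds = [mlp_forward(x_new, *w) for w in weight_samples]
    return np.mean(preds, axis=0)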

Experimental Results

In the Bayesian approach for Neural Networks there is more need for expert work, because we need to define the probability distributions for the weights, the noise and the other parameters, but once that is done, the results are consistently better in comparison to other approaches. There are several packages for implementing the Bayesian approach for Neural Networks, such as FBM (Flexible Bayesian Modelling, http://www.cs.toronto.edu/radford/fbm.software.html) introduced by Neal, and MCMCstuff from Helsinki University of Technology (Finland) [7]. We chose MCMCstuff; it implements all the Bayesian methods for MLPs in the Matlab environment. The MCMCstuff toolbox is a collection of Matlab functions for Bayesian inference with Markov chain Monte Carlo (MCMC) methods. The toolbox works with Matlab versions 6.* and 7.* and provides different sampling methods to implement MCMC, such as Metropolis-Hastings sampling, Hybrid Monte Carlo sampling, Gibbs sampling and Reversible jump Markov chain Monte Carlo sampling. We have tested this toolbox for an MLP network on a regression problem with Gaussian noise. We implemented our network for one of the bus routes in Zanjan city. This route has 9 bus stations. The bus stations and the route can be seen in Fig. 2.

Figure 2: The route of the bus in Zanjan. Blue circles show the stations and the red line shows the route path.


Because there is no test bed for this route, we collected the data ourselves. This data set consists of the arrival time of the bus at the bus stations, the dwell time of the bus, the schedule adherence of the bus and the amount of time that the bus takes to reach the bus station. We use these parameters to train our Neural Network. In this paper, a fully connected multilayer Neural Network model was chosen. The Neural Network architecture used in this research has three layers: an input layer, a hidden layer and an output layer. Because we have three inputs, the number of neurons in the input layer is 3. It was found that the number of neurons in the hidden layer did not substantially impact the results of the Neural Network models. Therefore, in this paper, we use 15 neurons in the hidden layer. The structure of this network is shown in Fig. 3.
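For reference, a minimal sketch of the 3-15-1 feed-forward architecture described above. Scikit-learn's MLPRegressor is used here only as a convenient stand-in (the paper's experiments were run in Matlab), and X_train and y_train are placeholder names for the three request features and the arrival times.

from sklearn.neural_network import MLPRegressor

# Three inputs, one hidden layer with 15 neurons, one output (predicted arrival time).
model = MLPRegressor(hidden_layer_sizes=(15,), activation='tanh',
                     solver='adam', max_iter=2000, random_state=0)
# model.fit(X_train, y_train)        # X_train: (n_samples, 3), y_train: (n_samples,)
# y_pred = model.predict(X_test)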


Figure 3: Input-output structure of the ANN models.

At first we have to choose the parameters of the probability distributions. The chosen parameters are as follows: a_k = 0.1, a_ave = 0.05, \nu_a = 1, \sigma_0 = 0.05 and \nu_\sigma = 0.05. We compare our proposed model with other models such as the FeedForward Neural Network, the Backpropagation Neural Network and the Cascade-Forward Backpropagation Neural Network. The results are shown in Table 1 and Fig. 4. Our results show that, with an error tolerance of 1 second, using Bayesian learning increases the accuracy of prediction. However, because calculating the integral takes much time, the training phase of the Bayesian approach takes more than one hour, compared to a few minutes for the other models.

Figure 4: Accuracy of the different networks.

Conclusion

We deployed Bayesian learning for Neural Networks for bus arrival time prediction. It is a hybrid model making use of Bayesian inference in artificial Neural Networks. We compared our proposed model to FeedForward, Backpropagation and Cascade-Forward Backpropagation Neural Networks. The results show that the prediction accuracy of our proposed model outperforms all the other discussed standard Neural Networks at an error tolerance of 1 second. The accuracy of our proposed model is 99.26%, while the accuracy of the other methods is at most 90.44%, which shows that our proposed model works well. Enhancing the prediction accuracy by including traffic congestion and other parameters such as street quality and weather conditions can extend our work. By considering these parameters we can achieve a precise model that is very close to the real one.

References

[1] R. F. Casey and L. N. Labell, Advanced Public Transportation Systems: The State of the Art Update 2000, FTA-MA-26-7007-00-1, Federal Transit Administration, U.S. Department of Transportation (2000).
[2] J. J. Fruin, Passenger Information Systems for Transit Transfer Facilities, Synthesis of Transit Practice 7, Transportation Research Board, National Research Council, Washington, D.C. (1985).
[3] Ran Hee Jeong, The Prediction of Bus Arrival Time Using Automatic Vehicle Location Systems Data, PhD Thesis, Texas A&M University, USA (2004).
[4] J. O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer Series in Statistics, Springer, 2nd edition.
[5] Simo Särkkä, Aki Vehtari and Jouko Lampinen, On MCMC Sampling in Bayesian MLP Neural Networks, Helsinki University of Technology.
[6] Alejandro Quintero, A User Pattern Learning Strategy for Managing Users' Mobility in UMTS Networks, IEEE Transactions on Mobile Computing 4, no. 6 (2005).
[7] Jarno Vanhatalo and Aki Vehtari, MCMC Methods for MLP-network and Gaussian Process and Stuff: documentation for Matlab Toolbox MCMCstuff, Laboratory of Computational Engineering, Helsinki University of Technology.


SRank: Shortest Path-Based Ranking in Semantic Network


Hadi Khosravi-Farsani
Computer Engineering Department
University of Isfahan, Iran
khosravi@eng.ui.ac.ir

Mohammadali Nematbakhsh
Computer Engineering Department
University of Isfahan, Iran
nematbakhsh@eng.ui.ac.ir

George Lausen
Informatik Department
Albert-Ludwigs University, Freiburg, Germany
lausen@informatik.uni-freiburg.de

Abstract: Similarity estimation between interconnected objects appears in many real-world applications, and many domain-related measures have been proposed. This work proposes a new perspective on specifying the similarity between resources in linked data, and in general between the vertices of a directed graph. More specifically, we compute a measure that says two objects are similar if they are connected by multiple small-length shortest paths. This general similarity measure, called SRank, is based on simple and intuitive shortest paths. For a given domain, SRank can be combined with other domain-specific similarity measures. The suggested model is implemented in order to cluster resources extracted from the DBPedia knowledge base.

Keywords: Similarity; Linked Data; Clustering; Semantic Web.

Introduction

Extracting the similarity score between items is relevant to many areas of computer science; for instance, social networks, targeted advertisements, clustering, web mining, data mining, ontology mapping, and, in general, information networks require a model to specify the notion of similarity between items. Clearly, a similarity metric can be developed based on the definition of similarity and the context in which the items are found.

Various aspects of resources could be used to determine similarity, which usually depends on connectivity (e.g. the number of possible paths between two vertices) and structural similarity (e.g. the number of common neighbors of two vertices). In this paper, we propose SRank (Short-Rank), which exploits the resource-to-resource relationships found in information networks. Our study is motivated by recent research and applications on RDF resource clustering, link discovery, and RDF ranking over the linked data cloud, which usually require an effective and trustworthy evaluation of the underlying similarity functions among resources. Without loss of generality, all the models in this paper are valid and general for computing similarity scores between the nodes of a directed graph.

The main intuition behind SRank is that two objects are considered similar if they are connected by multiple small-length shortest paths. In comparison with other state-of-the-art similarity measures, SimRank [1] considers in-links, BipartiteRank [1] explores out-links, and P-Rank [2] takes into account both in- and out-links in order to propagate the similarity scores through in-links, out-links, and both of them respectively.

As an example, consider the small RDF graph G, shown in Figure 1, illustrating a soccer player graph where a vertex represents a soccer player and an edge represents the hyperlink relationship (the http://dbpedia.org/ontology/wikiPageWikiLink predicate in the DBPedia knowledge base).

Corresponding Author, F: (+98) 311 793-2670, T: (+98) 311 793-4400



Nodes in this sample graph are articles in the Wikipedia knowledge base or the corresponding resources in its DBPedia version. The full URI address of each node can be formed by putting the Wikipedia prefix (http://en.wikipedia.org/wiki/) or the DBPedia prefix (http://dbpedia.org/page) at the start of its node name. As we can see, Kris Commons points to Nigel Clough and Colin Calderwood. Note that nodes in our sample graph may have more links to other resources which have not been shown in the graph.

Figure 1: A small soccer player RDF graph extracted from DBPedia.

We present the SimRank, BipartiteRank, and P-Rank scores for all obtained similarity pairs in Table 1. The SimRank value for [M, S] and [J, N] is unavailable, since SimRank considers in-relationship information for the similarity computation. More seriously, the [M, J] and [K, N] pairs have no value for SimRank, mainly because these vertex pairs do not have common in-link similarity flows. BipartiteRank works like SimRank with the only difference that it considers out-links. This way, the similarity of S with the other nodes is not available because S has no outgoing links. Furthermore, the similarity between the pairs [K, N], [J, C] and [M, C] is not available in the presence of BipartiteRank. In contrast to SimRank and BipartiteRank, P-Rank scores flow from the in-link neighbors of resources and penetrate through their out-link ones. P-Rank is able to produce more similarity pairs; however, it suffers from some drawbacks, especially concerning the produced scores as well as the symmetry assumption for similarity scores.

SRank takes the different paths between two resources into account and produces a similarity score based on the number of different paths as well as their corresponding lengths. For example, [M, C] should receive a higher similarity score compared to [M, K] since they are connected by two different paths (M points to C by a direct edge and also by an indirect edge through K). As can be seen in Table 1, the [M, C] pair receives a low similarity score in the P-Rank algorithm. The similarity score between them is further strengthened in the presence of the SRank algorithm. In fact, SRank assigns a high similarity value to strongly connected resources, while the previous methods assign the similarity value only based on in- or out-link matching. The SRank similarity scores for path lengths 3 and 4 are presented in columns 6 and 7 of Table 1 respectively.

The main assumption behind the previous algorithms is that the similarity of [a, b] is equal to the similarity of [b, a], which may not be a valid assumption in some domains. Consider again the [J, M] pair in the graph G in Figure 1. Joel Lynch has only one relationship with Matt Thornhill, while Matt Thornhill has two more relationships with the other available resources. SRank differentiates the similarity score of the pair [J, M] from that of [M, J] because from J's viewpoint, M is the correct match for J, while from M's viewpoint, J may not be the correct match for him.

      Nodes   SimRank   BRank   P-Rank   SRank3   SRank4
 1    K, C    .29       N/A     .19      .5       .5
 2    J, K    .58       .11     .36      .25      .25
 3    J, C    .4        N/A     .28      .25      .33
 4    M, C    .23       N/A     .11      .5       .5
 5    C, S    .21       N/A     .06      1.0      1
 6    J, S    .18       N/A     .04      N/A      .08
 7    S, M    .32       N/A     .11      N/A      N/A
 8    M, S    N/A       N/A     N/A      .16      .41
 9    N, S    .23       N/A     .07      N/A      .25
10    M, N    .47       .29     .35      .16      .16
11    K, S    .25       N/A     .07      .5       .5
12    N, C    .4        N/A     .26      .5       .5
13    K, M    N/A       .14     .17      N/A      N/A
14    M, K    N/A       .14     N/A      .33      .33
15    J, N    N/A       .11     .14      N/A      .25
16    J, M    N/A       .05     .08      1.0      1
17    M, J    N/A       .05     .08      .33      .33
18    K, N    N/A       N/A     .12      .5       .5
19    N, K    N/A       N/A     .12      1.0      1

Table 1: The corresponding similarity values for SimRank (C = 0.8), BipartiteRank (C = 0.8) and P-Rank (C = 0.8, lambda = 0.5).

The contributions of this paper are summarized as follows. We propose a new mathematical similarity measure, SRank, which produces similarity pairs based on the number of different paths as well as their corresponding lengths. We study its main advantages over other introduced similarity measures, and its applicability to the RDF data model found in the LOD cloud. We investigate the applicability of SRank to RDF resource clustering and introduce a new clustering algorithm based on the SRank similarity measure. Highly similar resources fall into one cluster while less similar resources are distributed to different clusters.



2 SRank

The basic intuition of SRank can be expressed as follows: two objects in a directed graph are considered similar if they are connected by small-length shortest paths. More specifically, the similarity between objects a and b (which may be different from the similarity between b and a) in a given graph is affected by the following two contradictory conditions: the number of shortest paths from a to b (where the shortest path is defined on the considered domain) and the length of the shortest paths from a to b.

Referring back to Figure 1, the [M, C] pair is similar because there are two paths from M to C. On the other hand, the [C, M] similarity is not available since there is no path from C to M. The similarity between [M, C] is higher than [M, K] because M has two different paths to C while it holds only one path to K.

2.1 Preliminaries

A labeled directed graph is denoted as G = (V, E, C), where V is the set of vertices, <a, b> \in E is a relationship from resource a to resource b with a, b \in V, and C is the set of ground-truth class labels provided by human experts. We denote the number of RDF nodes as |V| = m.

A similarity function takes the graph G and two vertices a and b as input and then computes the similarity score between a and b, that is, d(a, b) = Similarity(G, a, b). RDF graph clustering is to partition an RDF graph G into k disjoint segments G_i = (V_i, E_i, C), where V is the union of the V_i and V_i and V_j are disjoint for any i different from j. A good clustering algorithm must increase \sum_{i \in G_x} \sum_{j \in G_y} d(v_i, v_j) in the case x = y and decrease \sum_{i \in G_x} \sum_{j \in G_y} d(v_i, v_j) in the case x differs from y.

2.2 SRank Formula

In a large graph G (e.g. a web graph or an RDF graph), an arbitrary vertex is strongly connected to some vertices while it is absolutely far from some other vertices. If there are multiple shortest paths from a to b, then a is close to b from a's viewpoint. On the other hand, if the same condition holds from b to a, then b is close to a from b's viewpoint. This way, a and b are similar if there are multiple shortest paths between a and b as well as between b and a.

Definition 1 [Access Value]: Let P^p be the N x N transition probability matrix of length p of a graph G. The access value from a to b is defined as

H(a, b) = w_1 P^1_{a,b} + ... + w_p P^p_{a,b} + ... + w_{n-2} P^{n-2}_{a,b}    (1)

where w_i is the weight assigned to all paths of length i, and P^p_{a,b} is the probability of going from a to b with length p, equal to the number of p-paths from a to b, k_p(a, b), divided by the number of p-paths starting from a:

P^p_{a,b} = k_p(a, b) / \sum_{x \in G \setminus \{a\}} k_p(a, x)    (2)

The access values between the different nodes of a given graph can be approximately estimated by considering only a few terms of formula 1. Moreover, constructing all of the different paths is very time-consuming. As a consequence, H(a, b) is replaced by H_s(a, b), defined as

H_s(a, b) = w_1 P^1_{a,b} + ... + w_s P^s_{a,b},    1 <= s <= n-2    (3)

It is apparent that, to obtain meaningful results, the weight of shorter paths must be higher than that of longer paths. As shown in the experimental section, we investigated several combinations of weight assignment through the clustering algorithm and arrived at the following equation for the path weights:

w_p = 2^{s-p}    (4)

A straightforward approach to estimate the similarity score between a and b is to normalize H_s(a, b) with respect to H_Max and H_Min in the whole collection. In our experiments we use formula 5 for the similarity score between a and b:

SRank_s(a, b) = (H_s(a, b) - H_Min) / (H_Max - H_Min)    (5)

Theorem 1: The above equations (equations 1 to 5) have the following properties:

(Asymmetric) SRank_s(a, b) is not equal to SRank_s(b, a) for some resources, while SRank_s(a, b) = SRank_s(b, a) for resources that are symmetric to each other.
(Positive defined) For all a, b: SRank_s(a, b) >= 0.
If H_s(a, b) >= H_s(c, d), we cannot conclude that H_{s+1}(a, b) >= H_{s+1}(c, d).
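A minimal NumPy sketch of the access-value computation in Eqs. (1)-(5) is given below. It assumes that k_p(a, b) is counted as the number of length-p walks obtained from powers of the adjacency matrix and that the weights follow Eq. (4), so it is an illustration rather than the authors' implementation.

import numpy as np

def srank_scores(A, s=3):
    # A: (n, n) 0/1 adjacency matrix of the directed graph, A[a, b] = 1 if a points to b.
    # H follows Eq. (3) with weights w_p = 2^(s - p) from Eq. (4);
    # the result is min-max normalised as in Eq. (5).
    n = A.shape[0]
    H = np.zeros((n, n))
    Ap = np.eye(n)
    for p in range(1, s + 1):
        Ap = Ap @ A                                   # Ap[a, b] = number of length-p walks a -> b
        row_sums = Ap.sum(axis=1, keepdims=True)
        P = np.divide(Ap, row_sums, out=np.zeros_like(Ap), where=row_sums > 0)   # Eq. (2)
        H += (2.0 ** (s - p)) * P
    h_min, h_max = H.min(), H.max()
    return (H - h_min) / (h_max - h_min) if h_max > h_min else H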


2.3 SRank-based Clustering Algorithm

In order to study the effectiveness of SRank, we chose to investigate how different similarity measures perform in clustering applications. It is worth noting that SRank is not confined only to clustering applications: any data-oriented application adopting structural similarity as its underlying function can make use of SRank as its similarity measure. In this section, we address the challenges in the clustering process by discussing the most important issues.

We plug SimRank, BipartiteRank, P-Rank as well as SRank into a hierarchical clustering method. We choose the agglomerative approach, where each resource starts in its own cluster and pairs of clusters are merged as one moves up the hierarchy. Figure 2 shows the process involved in our clustering procedure. In this approach, we need three main measures: 1) a measure for constructing two-element clusters, 2) another measure for cluster merging, and 3) a threshold measure for stopping the cluster merging.

In order to construct two-element clusters, aggregation functions such as average, sum, count, maximum, and minimum can be applied to the SRank scores (Equation 5) to produce stable distance scores between pairs of resources. Equation 6 shows the suggested formula to derive the distance value between a pair of resources. Intuitively, the original distance scores are correlated with the type of aggregation function used: the value lies between 0 and 2 in the case of the sum function, while it lies between 0 and 1 in the case of the average, maximum, and minimum functions. In the performed evaluation, the sum function is taken as the aggregation function and resources are joined together depending on the sum of the corresponding distance values.

d(a, b) = AggregationFunction(SRank_s(a, b), SRank_s(b, a))    (6)

In order to decide which clusters should be combined, a measure of distance between sets of clusters is required. The structural distance between a pair of resources, d(a, b), given by the SimRank, BipartiteRank, P-Rank, or SRank scores, can be developed to produce distance scores between their corresponding clusters. Unlike the resource distance, only the mean (average) score between resources can be used as the cluster distance. Equation 7 defines the distance between clusters as:

d(C_i, C_j) = ( \sum_{a \in C_i} \sum_{b \in C_j} d(a, b) ) / (m_i m_j),    C_i \in G_i, C_j \in G_j    (7)

where m_i and m_j are the number of resources in clusters i and j respectively.

The clusters are merged together as long as their distance scores satisfy the user-provided threshold value. Given a set of clusters, the threshold value strongly depends on the context in which the clusters are found: in a highly connected graph, a high value should be assumed for the threshold measure, while in a weakly connected graph the threshold value should be chosen lower. In the implementation of all the clustering algorithms, we investigated the clustering qualities and report the best result for each algorithm.

Figure 2: Clustering Approach
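The merging loop of the clustering approach can be sketched as follows. This is one possible reading of Equations 6 and 7 (sum aggregation for resource distances, mean pairwise score for cluster distances, merging while the best score is at least the threshold) and not the authors' code.

import numpy as np

def agglomerative_srank_clustering(S, threshold):
    # S: (n, n) matrix of SRank_s scores; d(a, b) = S[a, b] + S[b, a] (sum aggregation, Eq. 6).
    n = S.shape[0]
    d = S + S.T
    clusters = [[i] for i in range(n)]            # each resource starts in its own cluster
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average pairwise score between the two clusters, cf. Eq. (7)
                score = np.mean([d[a, b] for a in clusters[i] for b in clusters[j]])
                if score > best:
                    best, pair = score, (i, j)
        if best < threshold:                      # stop when no pair is similar enough
            break
        i, j = pair
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters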

Discussion and Future Works

We have proposed a general model for computing similarity scores between the resources of a directed graph. It is based on the number of shortest paths between resources. These similarity measures can be used to compare resources belonging to an RDF graph that are not necessarily directly connected; they rely on the number of paths between these elements. While the model has been developed for the case of a directed graph extracted from the LOD cloud, the notion of SRank can also be applied to undirected graphs in other domains.

We are now working on other domains, such as social networks and the web graph, to assess the feasibility of SRank. We are also comparing the results in other data-oriented applications that need similarity between resources. Moreover, different weight adjustment approaches may be deployed to improve the results. Finally, we are working on generalizations to directed weighted graphs, directed labeled edges, and heterogeneous graphs.




References

[1] G. Jeh and J. Widom, SimRank: A Measure of Structural-Context Similarity, Technical Report, Stanford InfoLab (2011).
[2] P. Zhao, J. Han, and Y. Sun, P-Rank: A Comprehensive Structural Similarity Measure over Information Networks, International Conference on Information and Knowledge Management (2009), 553-562.

Web Anomaly Detection Using Artificial Immune System and Web Usage Mining Approach
Masoumeh Raji
Yazd University
Electrical and Computer Engineering Department
raji.n@yazduni.ac.ir

Vali Derhami
Yazd University
Electrical and Computer Engineering Department
vderhami@yazuni.ac.ir

Reza Azmi
Alzahra University
Computer Engineering Department
azmi@alzahra.ac.ir

Abstract: The analogy between immune systems and intrusion detection systems encourages the use of artificial immune systems for anomaly detection in computer networks, web servers and web-based applications, which are popular attack targets. This paper presents a web anomaly detection system based on the immune system and a web usage mining approach for clustering web sessions into normal and abnormal. In this paper the immune learning algorithm and the attack detection mechanism are described. Theoretical analysis and experimental evaluation demonstrate that the proposed approach is well suited for detecting unknown attacks and is able to provide a real-time defense mechanism for detecting web anomalies.

Keywords: Intrusion Detection Systems; Artificial Immune Systems; Anomaly; Normal behavior; Session.

Introduction

The World Wide Web (WWW) is considered the largest distributed collection of information and plays an important role in human life. Web applications are becoming increasingly popular in all aspects of human activities, ranging from science and business to entertainment. Consequently, web servers and web applications are becoming the major targets of many attacks. Due to the growing number of computer crimes, the need for techniques that can secure and protect web servers and web applications against malicious attacks has been highlighted. Unfortunately, current security solutions, operating at the network and transport layers, have insufficient capabilities for providing an acceptable level of protection against web-based attacks [1]. Attaining desired information has become a difficult task for users even within a particular website. Web usage mining techniques try to extract patterns from the data that are collected from the interaction of users with the web. The aim of any web usage mining process is to learn models of user behavior and use these models for any application that tries to ease the use of the web [2].

The Artificial Immune System (AIS) is a powerful paradigm for learning which is originally inspired by the natural immune system. There are a number of motivations for using the immune system as inspiration for clustering web users, which include recognition, diversity, memory, self-regulation and learning [3]. The vertebrate immune system is composed of a special type of white blood cells (called B-cells), which are responsible for detecting antigens and defending against them. When an antigen is detected by the B-cells, an immune response is promoted, resulting in antigen elimination. One type of response is the secretion of antibodies by B-cells (cloning). Antibodies are Y-shaped molecules on the surface of B-cells that can bind to antigens and recognize them. Each antibody can recognize a set of antigens which can match the antibody. The strength of the antigen-antibody interaction is measured by the affinity of their match [2].

Corresponding Author, T: +98 (912) 7907830




Many artificial immune models have been discussed in the literature, such as negative selection, danger theory and Artificial Immune Networks (AINs). We use the AIN model, which was initially proposed by Jerne [4]. Access log files of web servers are an important source of information for Web Intrusion Detection Systems (WIDSs). In this paper, we work on the access log files of an Apache server and build an anomaly detection system for detecting web-based attacks. In the training phase, the anomaly detection system tries to learn how to distinguish normal behaviors from attacks by considering several parameters. These parameters include: the number of values assigned to the variables of each request within a session [1], the length of the URL of each request [7], the depth of the path of each request, the attribute character distribution [5], and the attribute length [5].

The remainder of this paper is organized as follows. In Section 2, a review of some available IDSs is presented. Section 3 discusses the goals of this study and introduces the algorithm and the data representation. In Section 4, the experimental evaluation of the proposed system is presented; moreover, the detection ability of the system is tested on a dataset from another area. Finally, Section 5 concludes our study.

2 Related Work

There are two possible approaches to intrusion detection. An intrusion detector can be provided with a set of rules or specifications of what is regarded as normal behavior based on human expertise; this approach can be considered an extension of misuse detection systems. In the second approach, the anomaly detector automatically learns the behavior of the system under normal operation and then generates an alarm when a deviation from the normal model is detected [1].

Vigna et al. [5] proposed an IDS that operates on multiple event streams and uses features similar to our work. The system analyzes the HTTP GET requests that use parameters to pass values to server-side programs. However, these systems are misuse-based and therefore not able to detect attacks that have not been previously modeled. Guangmin [6] presents an immune-based active defense model for web attacks (IADMW) which is based on clone selection and hypermutation. HTTP queries are considered as the antigens; an HTTP query is represented by a vector of attributes extracted from the query, with associated weights representing the importance of each attribute in the query. Danforth [7] presents the Web Classifying Immune System (WCIS), a prototype system to detect attacks against web servers by examining web server requests. It focuses on distinguishing self from non-self and laid the foundations for the negative selection algorithm. WCIS considers some features: the length of the URI, the number of variables and the distribution of characters. Guangmin and Danforth do not consider web sessions, and an HTTP query is labeled as an attack. Rassam [8] proposed an immune network clustering method that is robust in detecting novel attacks in the absence of labels. The purpose of that study is to enhance the detection rate by reducing the network traffic features and to investigate the feasibility of a bio-inspired immune network approach for clustering different kinds of attacks and some novel attacks. The Rough Set method was applied to reduce the dimension of the features in the DARPA KDD Cup 1999 intrusion detection dataset; immune network clustering was then applied using the aiNet algorithm to cluster the data.

3 Proposed Method

The proposed Web Host Immune Based Intrusion Detection System (WHIBIDS) introduces immune principles into IDSs to improve the capability of learning and recognizing web attacks, especially unknown web attacks. In the proposed algorithm, sessions and requests are constructed from web logs in which the clickstream data are stored. Clickstream data are generated as a result of user interaction with a website. Antigens and antibodies are represented in the same form and their lengths are equal.

Antigen presenting: Define each user request as a member of the antigen set Ag. Each request is represented by a vector of attributes extracted from the access log file. The form of the vectors of the antigen set Ag is as follows: Ag = { ag | ag = <Session ID, URL length, number of variables, distribution of characters, attribute length, depth of path> }.

There are some shortcomings in the common access log files generated by web servers such as Apache. One of these problems is defining the web sessions. Since the boundaries of sessions are not clearly defined, extraction of web sessions from these log files is not a



straightforward process [1]. In this paper, sessions are generated as in [1], which demonstrates real sessions. In [1] the log file is generated with software written in PHP, called the PHP log generator. A log file generated with the PHP log generator includes sessions and other parameters; the Session-ID shows to which session each request belongs. We calculate the length of the URL and the number of variables of each request [7]. Distributions of characters have a regular pattern [5]; for example, in some attacks like buffer overflow, it is possible to see a completely different distribution of parameters, and this also appears to hold true for attacks that use manifold repetition of a special character, like the multiple use of the dot character in directory traversal flaws [9]. For each character, the percentage of occurrences of that character in proportion to the length of a parameter is calculated, and then for each character the average of these values over all parameters of a request is computed. We also calculate the attribute length and the depth of the path that is part of each request; for example, the depth of the following path is 3:

index/wp-admin/export.php

Finally, the vector corresponding to each request is normalized, so that the range of the output is between 0 and 1. The normalized value of each field in the vector of a request is calculated by dividing the value of that field by the sum of the values over all the fields in that vector.

Affinity function: The similarity measure between two antigens is the Euclidean distance, which determines the distance between two web application requests. Precisely, the similarity between two requests ag_i and ag_j is defined as

dis(ag_i, ag_j) = \sqrt( \sum_{n=1}^{k} (ag_{in} - ag_{jn})^2 )    (1)

where k is the number of features extracted for each request. The pseudo code of the proposed algorithm is presented as follows:

Initialization: fix the maximal population size N_B;
Initialize the B-cell population and sigma_i^2 = sigma_init using a number of random antigens;
while not all antigens have been presented do
    Present the antigen to each B-cell;
    if a B-cell is activated (w_ij > w_min) then
        Refresh its age (t = 0);
        Add the current B-cell and its KNN to the working sub-network;
    else
        Increment the age of the B-cell by one;
    end
    if w_ij < w_min for all B-cells then
        Create a new B-cell = antigen;
    else
        for each B-cell in the working sub-network do
            Compute the B-cell stimulation;
            Update the B-cell sigma_i^2;
        end
    end
    if all antigens of a session have been presented then
        Clone B-cells based on their stimulation level;
        if population size > N_B then
            Remove the extra least stimulated B-cells;
        end
    end
end

Algorithm 1: The modified algorithm of [2].

As shown in the proposed algorithm, when an antigen is unable to activate any B-cell, this antigen may represent noise or a new emerging pattern. In this condition, a new B-cell is created which is a copy of the presented antigen. If this antigen is noisy data and does not represent a new emerging pattern, it will not get enough chance to be stimulated by incoming antigens and is probably eliminated. After all the antigens of a session have been presented to the network, the B-cells undergo a cloning operation based on their stimulation level. When the population of the network exceeds a defined threshold, the least stimulated B-cells are removed from the network. The distance measure presented in this study is used in all the steps for calculating the internal and external (B-cell-to-antigen) interactions of the B-cells. The detailed information about calculating the stimulation level and updating it is described in [2].
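For illustration only, the following Python sketch normalizes two hypothetical request vectors as described above and computes the Euclidean affinity of Eq. (1); the feature values are invented, and the non-numeric Session ID field is left out.

import math

def normalize(request_vector):
    # Each field is divided by the sum of all fields, so values fall in [0, 1].
    total = sum(request_vector)
    return [v / total for v in request_vector] if total else request_vector

def affinity(ag_i, ag_j):
    # Euclidean distance between two normalized request vectors, cf. Eq. (1).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ag_i, ag_j)))

# hypothetical feature vectors:
# [URL length, number of variables, character distribution, attribute length, path depth]
req_a = normalize([42, 3, 0.12, 15, 3])
req_b = normalize([90, 7, 0.30, 40, 5])
print(affinity(req_a, req_b))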



In the training phase, profiles of normal behaviors are built using the proposed algorithm. Then, they are applied to new requests and new sessions in order to detect abnormal behaviors in the testing phase.

4 Experimental Evaluation

There are no available data on web attacks and pure non-attack traffic that can be used as a benchmark test; therefore, we used the dataset gathered by [1], which covers a wide variety of vulnerability tests such as SQL injection, XSS vulnerabilities and directory traversal flaws. The empirical evaluation reported in this paper is performed on the web requests of sessions. The original data used in our experiment contains 43602 requests and 6677 sessions from the log files of the web server for seven random days. Duplicate records in the dataset were removed.

The maximal population size of the network is set to 50; the control parameter for the number of nearest neighbors (K) is set to 3. The activation threshold (w_min) is 0.5 and the similarity threshold is 0.75. If the weighted distance is greater than this threshold, a B-cell is activated, and the activated B-cells belonging to a session together represent the user behavior. Evaluation is based on two criteria on two datasets: one criterion is per request and the other is per session, that is, the array of requests that indicates the user behavior. We believe that an attack is a series of actions, so sets of requests corresponding to actual sessions are considered. To show that the proposed algorithm works on other datasets as well, we also use a Linux system-call dataset in which the concept of sessions also exists. This second dataset contains 13217 sessions and 66159 system calls. Both data types have been tested with the same algorithm parameters.

Different kinds of metrics are measured to evaluate the ability of the algorithm to learn the properties of the features of the data and also to detect anomalous activities. The detection rate is the fraction of true positives over the number of all cases that should have been classified as positive. The false alarm rate is the proportion of actually normal cases that were incorrectly classified as anomalous.
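A small sketch of the two evaluation measures defined above; the counts used in the example are hypothetical.

def detection_rate(true_positives, actual_positives):
    # fraction of attack cases that were correctly flagged
    return true_positives / actual_positives

def false_alarm_rate(false_positives, actual_negatives):
    # fraction of normal cases that were incorrectly flagged as anomalous
    return false_positives / actual_negatives

print(detection_rate(92, 100))      # 0.92
print(false_alarm_rate(0, 250))     # 0.0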

Table 1: Evaluation on the web access log dataset (threshold = 0.75)
               Accuracy   False alarm rate   Detection rate
Request based  97.3%      0                  92%
Session based  98.9%      0                  95%

Table 2: Evaluation on the Linux system-call dataset (threshold = 0.75)
               Accuracy   False alarm rate   Detection rate
Request based  97%        0.03               98%
Session based  98.6%      0.01               98.5%

Table 3: Evaluation on the web access log dataset after adding 20% noise (threshold = 0.75)
               Accuracy   False alarm rate   Detection rate
Request based  80%        0.18               76%

Table 4: Comparison of WHIBIDS vs. the IADMW IDS
          Accuracy   False alarm rate   Detection rate
WHIBIDS   97.3%      0                  92%
IADMW     85%        0.065              67%

We run the proposed algorithm 5 times with 5-fold cross validation, and the final values of the evaluation measures are the averages of these 5 runs. Table 1 and Table 2 show the proposed system's high capability for both criteria and both datasets. As the results show, the session-based performance is better than the request-based one, and we can claim that the proposed algorithm can detect malicious activities with high accuracy. Patterns may be repeated in multiple B-cells within the population; this is called a loss of diversity, or overfitting, which essentially leads to redundancy (e.g. multiple requests have the same signature). To show that there has been no overfitting on the training data, 20% noise is added to the test data. Table 3 shows that the noise has an impact of about 15 percent on the results; if overfitting had occurred, the noise would have had a significant impact on the results. Table 4 shows the comparison of WHIBIDS with the IADMW IDS, which comes from [6]. The detection rate of WHIBIDS is 92%, while the detection rate of IADMW is 67%. At the same time, WHIBIDS is also capable of classifying web attacks and has a high accuracy rate of 97.3%. These results show that WHIBIDS is a competitive alternative for detecting web attacks.

Acknowledgment

This research was supported by the APA center at Yazd University. The authors would like to thank APA for its support.



Conclusions

In this paper we proposed an intrusion detection system based on the principles of the immune system (WHIBIDS) that can detect known and unknown attacks. Here an attack is considered as a series of actions. The requests obtained from the preprocessed log files of the web server are presented to the system as antigens. The network of B-cells represents a summarized version of the antigens encountered by the network, and the B-cells are able to adapt to emerging usage patterns proposed by new antigens at any time. The results show the ability of the proposed AIS to cluster web sessions into normal and abnormal. The results indicate that designing an immune-based IDS has several advantages: (1) self learning and immune learning enable the model to detect both known and unknown web attacks; (2) it is able to detect anomalies in real time; (3) it can recognize abnormal behavior with regard to actual sessions; (4) using the immune network algorithm achieves high detection rates; (5) it can be used as a general classifier. A limitation was the determination of the similarity threshold by testing; future work will determine this threshold by reinforcement learning.

References

[1] I. Khalkhali, R. Azmi, and M. Azimpour-Kivi, Host-based Web Anomaly Intrusion Detection System, an Artificial Immune System Approach, IJCSI International Journal of Computer Science Issues 8 (2011), 14-24.
[2] M. Azimpour-Kivi and R. Azmi, Applying Sequence Alignment in Tracking Evolving Clusters on Web Sessions Data, an Artificial Immune Network Approach, Computational Intelligence, Communication Systems and Networks (CICSyN) (2011).
[3] B. H. Helmi and A. T. Rahmani, An AIS algorithm for Web usage mining with directed mutation, Proc. World Congress on Computational Intelligence (WCCI'08) (2008).
[4] N. K. Jerne, Towards a Network Theory of the Immune System, Annals of Immunology (1974), 373-389.
[5] C. Kruegel and G. Vigna, Anomaly detection of web-based attacks, Proceedings of the 10th ACM Conference on Computer and Communications Security (2003), 251-261.
[6] L. Guangmin, Modeling Unknown Web Attacks in Network Anomaly Detection, International Conference on Convergence and Hybrid Information Technology (2008).
[7] M. Danforth, Towards a Classifying Artificial Immune System for Web Server Attacks, International Conference on Machine Learning and Applications (2009).
[8] M. A. Rassam, M. A. Maarof, and A. Zainal, Intrusion Detection System Using Unsupervised Immune Network Clustering with Reduced Features, Int. J. Advance. Soft Comput. Appl. 2 (2010).
[9] Z. Brewer, Web Server Protection with CSA HTTP Explorer Directory Traversal, Cisco Security Agent Protection Series (2006).


YABAC4.5: Yet Another Boosting Approach for C4.5 Algorithm
B. Shabani
Qazvin Azad University
Department of Electrical, IT & Computer Sciences
Qazvin, Iran
bhr.shabani@gmail.com

H. Sajedi
Tehran University
Department of Computer Sciences
Tehran, Iran
hhsajedi@aut.ac.ir

Abstract: A large number of machine learning and data mining algorithms which are used for classification, prediction, and uncertain reasoning cannot handle continuous attributes, and some other algorithms require a considerably large execution time when the input data contain continuous attributes. Discretization is very important in developing practical methods in data mining. It is the process of converting the continuous attributes of a database into discrete attributes so that they can be used by such classification algorithms. The approach in this study is based on successive pseudo deletions. Our empirical experiments show that C4.5 gives improved performance with the discretized output from YABAC4.5 compared to the SPID4.7 and MDLP discretization algorithms.

Keywords: Data Mining; Machine Learning; Continuous Attributes; Discretization.

Introduction

The data in a database or a data warehouse are usually available either in discrete or continuous form. Accordingly, the number of possible values of a continuous attribute can be very large, while discrete attributes take only a limited number of possible values [1].

There are a great number of algorithms in machine learning and data mining which cannot use continuous data. Data mining is an active area of research that has found wide usage for knowledge discovery in databases. Data mining is an essential step in the process of knowledge discovery in databases (KDD). In addition to data mining, the major steps of KDD also include data cleaning, integration, selection, transformation, pattern evaluation, and knowledge presentation. Since data are frequently interspersed with missing values and noise, which makes them incoherent, data preprocessing has become an important step before data mining to improve the quality of the data; this subsequently improves the data mining results by reducing noise. Many data mining algorithms, including decision trees and classification rules, require discretization when predictor attributes are continuous. Discretization refers to the process of converting the values of a continuous variable into two or more bins. Assuming a dataset consisting of M examples and S target classes, a discretization algorithm would discretize a continuous attribute A in this dataset into n discrete intervals {[d_0, d_1], (d_1, d_2], ..., (d_{n-1}, d_n]}, where d_0 is the minimal value and d_n is the maximal value of attribute A [2]. The continuous attributes should first be discretized into a limited number of distinct ranges so that these algorithms can be applied to real-world data. Furthermore, discretization makes data clearer, so discrete attributes are easier to understand and comprehend. Moreover, the response time of many grouping inference algorithms decreases. There are some algorithms to discretize the continuous values of supervised data. Therefore, in data mining and machine learning, it is more advantageous to deal with discrete data than with continuous data.

A good discretization algorithm has to balance the loss of information intrinsic to this kind of process and generate a reasonable number of cut points, that is, a reasonable search space. Furthermore, a concise summarization of continuous attributes not only helps experts and users understand the data more easily, but also makes learning more accurate and faster. There are five different axes by which the proposed discretization algorithms can be classified: supervised versus unsupervised, static versus dynamic, global versus local, top-down (splitting) versus bottom-up (merging), and direct versus incremental [2].

Corresponding Author

1 Supervised methods discretize attributes with the consideration of class information, while unsupervised methods do not. Many studies show that supervised discretization produces better classification models than unsupervised discretization does [3].

2 Dynamic methods consider the interdependence among the feature attributes and discretize continuous attributes while a classifier is being built. On the contrary, static methods consider attributes in an isolated way, and the discretization is completed prior to the learning task.

3 Global methods, which use all instances to generate the discretization scheme, are usually associated with static methods. On the contrary, local methods are usually associated with dynamic approaches, in which only parts of the instances are used for discretization.

4 Bottom-up methods start with the complete list of all continuous values of the attribute as cut-points and remove some of them by merging intervals in each step. Top-down methods start with an empty list of cut-points and add new ones in each step.

5 Direct methods, such as Equal Width and Equal Frequency, require users to decide on the number of intervals k and then discretize the continuous attributes into k intervals simultaneously. On the other hand, incremental methods begin with a simple discretization scheme and pass through a refinement process, although some of them may require a stopping criterion to terminate the discretization.

In this paper, an improved version of the SPID4.7 [4] discretization algorithm is introduced. YABAC4.5 gives competitive performance with respect to the SPID4.7 and MDLP [5] discretization algorithms. Over the years, several discretization algorithms have been designed. SPID4.7 is a supervised, splitting and global method of discretization, whereas MDLP is a supervised, local and merging method of discretization. ChiMerge [6] is a supervised, local and merging discretization algorithm that works in a bottom-up way to merge the two adjacent intervals with the smallest chi-square value, until the minimum chi-square value becomes greater than both the predefined significance level and the threshold value determined by the degrees of freedom. Khiop [7] is a supervised, merging, global and statistics-based discretization method that has recently been published. CACC [2] is a static, global, incremental and supervised discretization algorithm, proposed in order to raise the quality of the generated discretization scheme by extending the idea of the contingency coefficient and combining it with a greedy method. CACM and Efficient-CACM [3] are supervised, splitting discretization algorithms; for comparing these algorithms with other discretization algorithms, [3] used the C4.5 [8] and RBF-SVM classifiers. YABAC4.5, presented in this paper, is a supervised, splitting and global method of discretization.

We will have the following discussion below: Section 2 delivers some definitions of the practical concepts of the study; the proposed algorithm is introduced in Section 3; the results of the empirical comparison of YABAC4.5 with the other discretization algorithms are presented in Section 4; finally, the conclusion is summarized in Section 5.

2 BASIC CONCEPTS

Before introducing YABAC4.5, some concepts are presented first:

Instance: An instance is a collection of values for all attributes. One of the attributes in a supervised problem is the class attribute.

Cut Point: A Cut Point, c, is an attribute value which divides the values of that attribute into two ranges, such that all values in one range are less than or equal to c and all the values in the other range are greater than c.

Boundary Point: A Boundary Point [5] is a Cut Point such that the class value of the instance I containing the Cut Point differs from the class value of the instance that immediately follows I in the sorted order on the attribute value. When a group of two or more instances have the same attribute value but different class values, the Cut Points on either side of the group are also Boundary Points (see the sketch after these definitions).


Threshold Point: By the process of discretization, the values of a continuous attribute get split into a number of intervals, and each interval is represented by a discrete value. The Boundary Point of an interval is a Threshold Point.

Handling of Missing Values: The data in a database might not be complete, i.e. some of the attributes of a sample might be without a value. This problem can be solved in two ways:

1. The missing values are replaced with new values.
2. The missing values are replaced by the most frequent value.

The method we use for missing values in this study is somewhat similar to approach 2: the values related to the class of the missing values are found first, and then the most frequent attribute value of that class is considered for the missing values of the class. Tables 1(a) and 1(b) show an example of this method of missing value handling.

3 YABAC4.5

The SPID4.7 algorithm handles missing values for an attribute Ai by putting the most frequent value among the existing values of Ai in the place of all missing values of Ai. We have changed this approach and handle missing values by putting in the most frequent attribute value within that class, and we deleted the second part of SPID4.7. Before this, we perform an attribute selection to reduce the instance dimension in order to obtain better results, and then run the algorithm.
(a) Data with missing values        (b) Missing values replaced with correct data

Att1  Att2  class                   Att1  Att2  class
1     f     no                      1     f     no
2     t     yes                     2     t     yes
?     t     yes                     2     t     yes
?     t     yes                     2     t     yes
2     t     yes                     2     t     yes
2     t     yes                     2     t     yes
1     f     no                      1     f     no
?     f     no                      1     f     no
2     t     yes                     2     t     yes

Table 1: Example of missing value handling.
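As an illustration of the missing value handling described above, the following minimal Python sketch (our own, not the paper's code; function and variable names are ours) replaces each missing attribute value with the most frequent value of that attribute among instances of the same class, reproducing the transformation from Table 1(a) to Table 1(b).

from collections import Counter

def impute_by_class_mode(rows, class_index, missing="?"):
    """Replace missing attribute values with the most frequent value
    of that attribute among instances sharing the same class label."""
    n_attrs = len(rows[0])
    imputed = [list(r) for r in rows]
    for j in range(n_attrs):
        if j == class_index:
            continue
        for i, row in enumerate(imputed):
            if row[j] == missing:
                cls = row[class_index]
                # collect observed values of attribute j within the same class
                values = [r[j] for r in rows
                          if r[class_index] == cls and r[j] != missing]
                if values:
                    imputed[i][j] = Counter(values).most_common(1)[0][0]
    return imputed

# Example: the data of Table 1(a); the last column is the class attribute
data = [["1", "f", "no"], ["2", "t", "yes"], ["?", "t", "yes"],
        ["?", "t", "yes"], ["2", "t", "yes"], ["2", "t", "yes"],
        ["1", "f", "no"], ["?", "f", "no"], ["2", "t", "yes"]]
print(impute_by_class_mode(data, class_index=2))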

4 Proposed Algorithm

Stage 1: First, attribute selection takes place on the attributes of the problem so that the dimensions of the problem are decreased. Then the attributes are studied; if there is an attribute with missing values, the classes related to the missing attribute are investigated. Afterwards, the most frequent data in that class (or classes) is considered for the missing attributes, i.e. the missing values are replaced with values of the possible attributes. For instance, when missing values belong to different classes, the data related to those classes are studied and the related missing values are replaced with the most frequent attribute value.
Stage 2: The attributes are investigated as to whether they are continuous or discrete. If attribute Ai is continuous, it is transferred to a new table with the related class, and the table is sorted according to the value of the attribute. Boundary points are determined for the table attribute, and the binary threshold point is inserted in the place with the highest gain. Afterwards, this process is repeated for all continuous attributes and the number of conflict points (PDEL-count) is calculated. At this point all boundary points have been determined and the binary threshold point has been inserted for all attributes. From now on, the following steps are repeated until no boundary point remains or until the number of conflict points is zero. A boundary point with maximum gain is selected for each continuous attribute. The number of conflict points (PDEL-req) is calculated for this boundary point and the attribute with the minimum PDEL-req is selected. If the minimum PDEL-req is less than PDEL-count, that boundary point is selected as the threshold point and, in this case, PDEL-count is replaced by PDEL-req; otherwise those boundary points whose PDEL-req equals the maximum PDEL-count are removed. The algorithm of YABAC4.5 is as follows:
Input: A set of instances S, each instance having values for the attributes and the class. One column in the table is the class and the other columns are attributes. The attributes are discrete or continuous.
Output: Discretized set of values.

Begin
  Apply attribute selection to S
  For each attribute in S
    If there are missing values
    Begin
      Find the class(es) of the missing values
      Replace missing values of attribute i with the most
        frequent attribute value within that class
    End
  For each continuous attribute in S
    Find cut points of attribute i
    Find boundary points of attribute i
    Insert a threshold point at the boundary point having
      maximum information gain
  Find the pseudo-deletion count (PDEL-count)
  While (PDEL-count != 0) and (all boundary points have not been considered) do
    For each continuous attribute in S
      Find the boundary point with maximum information gain
    Calculate min-PDEL-req among the above selected points
    If min-PDEL-req < PDEL-count
    Begin
      Accept the selected boundary point as a threshold point
      PDEL-count <- min-PDEL-req
      Remove boundary point(s) violating m on both sides
        of the selected threshold point
    End
    Else
      Remove the point(s) with max-PDEL-req among the
        maximum gain points of each continuous attribute
  End // end of while
End
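To make the boundary point and information gain notions concrete, the sketch below (our own illustration, not code from the paper; for simplicity it ignores instances sharing the same attribute value) sorts a continuous attribute, collects the boundary points where the class label changes, and picks the one with maximum information gain as the first threshold point.

import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_boundary_point(values, labels):
    """Return the boundary point (midpoint at a class change) with
    maximum information gain, together with that gain."""
    pairs = sorted(zip(values, labels))
    xs, ys = [p[0] for p in pairs], [p[1] for p in pairs]
    base = entropy(ys)
    best_cut, best_gain = None, -1.0
    for i in range(1, len(xs)):
        if ys[i] != ys[i - 1]:                 # class change -> boundary point
            cut = (xs[i] + xs[i - 1]) / 2.0
            left, right = ys[:i], ys[i:]
            gain = base - (len(left) / len(ys)) * entropy(left) \
                        - (len(right) / len(ys)) * entropy(right)
            if gain > best_gain:
                best_cut, best_gain = cut, gain
    return best_cut, best_gain

# toy continuous attribute with its class labels
print(best_boundary_point([1.0, 1.5, 2.0, 3.0, 3.5], ["no", "no", "yes", "yes", "no"]))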

Data Set          #I    #A        #CL   M.V.
Anneal            798   6C, 14D   6     Yes
Australian        690   6C, 8D    2     No
Credit            690   6C, 9D    2     Yes
Dermatology       330   1C, 33D   6     Yes
Echocardiogram    132   8C, 3D    3     Yes
Ecoli             336   7C, 0D    8     No
Glass             214   9C, 0D    7     No
Heart-hungary     294   5C, 8D    5     Yes
Heart-statlog     270   5C, 8D    2     No
Horse-colic       300   7C, 15D   2     Yes
Iris              150   4C, 0D    3     No
Liver-disorder    345   6C, 0D    2     No
Newthyroid        214   5C, 0D    3     No
Pima              768   8C, 0D    2     No
Vehicle           94    18C, 0D   4     No
Wine              178   13C, 0D   3     No
Wisconsin         699   9C, 0D    2     No

Table 2: Characteristics of the data sets. I = #Instances, A = #Attributes, CL = #Classes, M.V. = Missing values, C = #Continuous attributes, D = #Discrete attributes.

To empirically evaluate the performance of the YABAC4.5 algorithm, we performed experiments on 17 data sets from UCI [9]. The general characteristics of the data sets are listed in Table 2.
For comparing YABAC4.5 with the other algorithms, we used these data sets and a classification algorithm which is run on the data sets discretized by the YABAC4.5, SPID4.7 and MDLP algorithms; the results of running these algorithms are presented in Table 3.
For a better comparison of the different algorithms on the different data sets, the experimental results are also shown in Fig. 1, where it can be seen that YABAC4.5 gives better results than the SPID4.7 and MDLP algorithms. Due to the use of the new way of missing value handling, the accuracy of the YABAC4.5 algorithm is better than that of the other algorithms.
C4.5 is run on each original data set with continuous values discretized using its in-built local discretizer. When using a discretization algorithm, the original data sets are first discretized (e.g. with YABAC4.5) and C4.5 is then trained on the discretized data set; we repeat these steps for the other discretization algorithms. The accuracy is calculated using the 10-fold cross-validation method.
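A minimal sketch of this evaluation protocol is given below; it is our own illustration using scikit-learn, with a generic equal-frequency discretizer and a decision tree standing in for the paper's discretization algorithms and for C4.5 (hypothetical substitutes), and with placeholder data instead of the UCI sets.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# placeholder continuous attributes and class labels
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# discretize first, then classify, as in the protocol described above
model = make_pipeline(
    KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile"),
    DecisionTreeClassifier(random_state=0),
)
scores = cross_val_score(model, X, y, cv=10)      # 10-fold cross-validation
print(f"accuracy = {scores.mean():.2f} +/- {scores.std():.2f}")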

Conclusion

We proposed a discretization algorithm, YABAC4.5, which is based on successive pseudo deletions at maximum information gain boundary points. Furthermore, we have used a new way of missing value handling in YABAC4.5: our method replaces a missing value with the most frequent attribute value of that class. Experimental comparison of YABAC4.5 with the MDLP and SPID4.7 algorithms has shown that the data discretized with the suggested algorithm presents acceptable efficiency compared to data discretized by the other algorithms; YABAC4.5 generates a better discretization than the most relevant discretization algorithms.
For future work, this algorithm can be improved for large data sets and other domains. YABAC4.5 can also be used with other data mining and machine learning algorithms such as CN2 [10] and Naive Bayes.

References
[1] U. M. Fayyad and K. B. Irani, On the Handling of Continuous-Valued Attributes in Decision Tree Generation, Machine Learning 8 (1992), 87-106.


Data Set   In-built Discretizer   MDLP           SPID4.7        YABAC4.5
           Acc. ± s.d.            Acc. ± s.d.    Acc. ± s.d.    Acc. ± s.d.
Ann        93.96±2.58             92.76±2.66     92.98±2.52     93.77±3.38
Austr      83.95±4.30             85.42±3.66     85.61±4.05     86.76±3.70
Cred       85.04±4.02             86.29±4.54     85.76±3.79     87.09±3.75
Der        92.11±5.69             91.99±5.68     92.11±5.69     95.35±2.47
Ech        64.28±13.35            65.12±6.11     68.62±11.64    70.17±13.80
Ecol       81.84±4.82             81.84±4.82     82.45±6.06     86.55±7.31
Gla        66.06±8.19             66.06±8.19     72.24±8.20     73.86±8.99
Hun        62.08±8.42             64.98±8.62     61.05±10.85    72.24±14.41
Stat       77.80±8.04             78.91±7.33     80.24±7.42     81.32±6.87
Hc         83.74±6.03             84.19±5.93     82.13±7.13     83.45±5.55
Iris       95.59±5.09             95.48±5.40     96.66±4.06     98.67±2.67
Liv        66.73±6.96             69.22±7.67     67.37±7.57     88.29±4.07
Thy        93.46±5.13             94.68±5.01     94.21±4.79     98.14±3.12
Pim        74.56±5.06             75.37±5.13     76.26±5.44     87.23±5.61
Veh        63.50±14.09            68.06±15.51    68.35±16.07    70.56±3.58
Win        70.60±9.38             96.38±4.47     96.30±4.47     96.43±4.42
Wis        95.41±2.26             96.07±2.29     95.39±2.71     96.20±2.26
mean       80.8                   81.95          82.22          98.55

Table 3: Accuracy comparisons using the C4.5 algorithm. Empirical results: Acc. = average accuracy and s.d. = standard deviation.

Figure 1: Comparison of the results of the different algorithms on the different data sets.

[2] C. Tsai, C. Lee, and W. Yang, A discretization algorithm based on Class-Attribute Contingency Coefficient, Information Sciences (2007), 714-731.
[3] C. Tsai, C. Lee, and W. Yang, An effective discretization algorithm based on Class-Attribute Contingency Coefficient, Pattern Recognition Letters 32 (2011), 1962-1973.
[4] S. Pal and H. Biswas, SPID4.7: Discretization Using Successive Pseudo Deletion at Maximum Information Gain Boundary Points, Proceedings of the 5th SIAM International Conference on Data Mining, Newport Beach, CA, USA (2005), 546-550.
[5] U. M. Fayyad and K. B. Irani, Multi-interval Discretization of Continuous-Valued Attributes for Classification Learning, Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Morgan Kaufmann (1993), 1022-1027.
[6] R. Kerber, ChiMerge: Discretization of Numeric Attributes, Proceedings of the 10th National Conference on Artificial Intelligence, MIT Press (1999), 123-128.
[7] M. Boulle, Khiops: A Statistical Discretization Method of Continuous Attributes, Machine Learning 55 (2004), 53-89.
[8] J. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
[9] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, Department of Information and Computer Science, University of California, Irvine (1990).
[10] P. Clark and R. Boswell, Rule Induction with CN2: Some Recent Improvements, Machine Learning: Proceedings of the 5th European Conference, Berlin (1991), 151-153.
[11] J. Dougherty, R. Kohavi, and M. Sahami, Supervised and unsupervised discretization of continuous features, Proceedings of the Twelfth International Conference on Machine Learning, A. Prieditis and S. Russell, eds., Morgan Kaufmann, San Francisco (1999), 194-202.

A New Method for Automatic Language Identification in Trilingual Documents of Arabic, English, and Chinese with Different Fonts

Einolah Hatami                            Karim Faez
Islamic Azad University of Qazvin         Amirkabir University of Technology
Qazvin, Iran                              Tehran, Iran
Einolah.hatami@gmail.com                  kfaez@aut.ac.ir

Abstract: Identification of the script of the text in multi-script documents is one of the important steps in the design of an OCR system for page analysis and recognition. In this paper, a new and effective method is proposed to identify the script type of a trilingual document printed in Arabic, English and Chinese script. To identify these three languages and to extract their features, two methods based on the horizontal profiles are used. In the first method, we calculate the ratio of the number of black pixels on each text line to the enclosed area of the text line; in the second method, each text line is divided into 3 distinct zones, the upper, middle and lower zones, and we obtain the absolute maximum and the next largest relative maximum of the profile of the middle zone. Text lines with different fonts and sizes were used to test the proposed system; the algorithm was tested on 150 different scanned pages containing 3750 text lines of the three scripts, with an accuracy of 99.84%.

Keywords: script identification, multilingual document processing, optical character recognition

Introduction

Automatic script identification has been a challenging research problem in multi-script environments and has acquired importance in recent years. For automatic processing of such documents through optical character recognition (OCR), it is necessary to identify the different script regions of the document. There are many countries with multilingual nations and several official languages, such as India [1], and today, given the expansion of relations among countries, there are many administrative and commercial documents, magazines, reports and technical papers which may contain several languages, which reflects the importance of this issue. Few results have so far been published in this area, and mostly for bilingual texts, where the discussion is between Latin and another Asian language such as Chinese, Korean, Japanese, Arabic or Hindi [2]. In order to classify the scripts, the following methods have been proposed so far: identification through analyzing the shape of words [3], optical density analysis (the number of on pixels in an area) [4], analysis based on features extracted from the profiles in different directions [2], methods based on Gabor filters [5], systems based on neural networks [6], the use of the bounding boxes of the words [7], and also combinations of these methods [5]. In this paper, the identification of the language in trilingual texts of Arabic, English and Chinese with different fonts and sizes is discussed. Features are first extracted and, with the help of the ratio of the number of black pixels in each text line to the enclosed area of the text line, the Chinese language is identified; then, based on the distance of the absolute maximum to the next relative maximum of the profile, we distinguish Arabic from English. Classification of the scripts is then done according to feature-based rules. The rest of the paper is organized as follows. Section two is devoted to the segmentation of text lines. The third section discusses the feature extraction techniques. Part four covers the classification according to the rules. The details of the experimental results obtained are presented in section five, and the conclusion is given in section six.

Corresponding Author, T: (+98) 241 415-5067


2 Segmentation

In a scanned page in grayscale format, with the help of a threshold, we first convert the image to a binary image, in such a way that each black pixel of the image is given the value one and each white pixel the value zero. Next, we obtain the horizontal projection profiles, which represent the number of on (black) pixels for each row of the image. Then, the white space between text lines is used to segment them, i.e. in the space between lines the horizontal projection profiles are zero. Figure 1 shows the horizontal profiles of two text lines.

Figure 1: Horizontal profiles of two Arabic text lines and the white space between the lines, which has zero value.

The output of this stage is the segmentation of the text lines for each input image.

3 Feature Extraction

Two techniques are used to extract the features.

3.1 Calculating the ratio of black pixels to the enclosed area of the text line

The horizontal projection profile of each text line consists of nearly 100 rows. The text line is surrounded by a rectangle whose area is calculated by multiplying the length by the width of the rectangle enclosing the text line. Then, we calculate the sum of the black pixels enclosed in this area, and we obtain the ratio of the number of black pixels to the area:

M = (Σ_{i=0}^{n} x_i) / area        (1)

In formula (1), x_i represents the number of black pixels in row i of a text line. Now, we obtain the ratio M for each text line; M takes different values for different lines, and this difference is due to the different fonts and sizes. Figure 2 shows the different values of M for 4 different sizes of the three languages and Figure 3 shows the differences for 4 font types. After calculation, the results are obtained according to Table 1.

Figure 2: The ratio of black pixels to the area in 4 different font sizes for Arabic, English and Chinese.

Figure 3: The ratio of black pixels to the area in 4 different fonts for Arabic, English and Chinese.
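The computation behind formula (1) can be sketched as follows (our own illustration, not the authors' code); it assumes a binarized text line given as a 2-D array in which 1 marks a black pixel.

import numpy as np

def black_to_area_ratio(line_img):
    """Ratio M of the number of black pixels in a text line to the
    area of its enclosing rectangle, as in formula (1)."""
    profile = line_img.sum(axis=1)                  # horizontal projection: black pixels per row
    area = line_img.shape[0] * line_img.shape[1]    # height * width of the enclosing rectangle
    return profile.sum() / area, profile

# toy binarized "text line": 4 rows x 10 columns
line = np.array([[0]*10,
                 [0, 1, 1, 0, 1, 1, 1, 0, 1, 0],
                 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                 [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]])
M, profile = black_to_area_ratio(line)
print(M, profile)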


language   M Range
Chinese    0.59 to 0.69
English    0.40 to 0.44
Arabic     0.23 to 0.40

Table 1: Range of the ratio of black pixels to the area for different fonts and sizes.

3.2 The distance between the absolute and relative maximum in the middle zone of the text line

To extract this feature, each text line is divided into the upper, middle and lower zones. The analysis of the horizontal profiles of the middle zone of text lines in Arabic and English illustrates some facts that can be used to identify these two languages. In English text lines, the maximum number of black pixels occurs in the middle zone at the two border points, between the middle and upper zone and between the middle and lower zone. In Arabic, the maximum number of black pixels occurs at one point, located in the middle zone.

Figure 4: The division of a text line into the three zones: upper, middle and lower.

Figure 5: The three distinct zones of a sentence in English and in Arabic together with their horizontal profiles.

Now, we consider the point which has the highest value as the absolute maximum (Pmax1) and we also calculate the next relative maximum point (Pmax2). As is evident in Figure 3, in Arabic, Pmax2 does not exist, or in some fonts may exist at a small distance from Pmax1, whereas it does exist in English and, across different fonts, this distance lies in an interval that is given in Table 2.

language   Distance range D = Pmax1 - Pmax2
English    15 to 40
Arabic     0 to 10

Table 2: Range of the distance between the absolute and the next relative maximum for English and Arabic.

4 Classification based on features

In the proposed system, we used a rule-based classification to identify the written language:
1. We obtain the area of the enclosing rectangle for each text line.
2. Then, we calculate the sum of black pixels for the input text line.
3. Next, we calculate the ratio of the sum of black pixels to the area.
4. If 0.59 < M < 0.69, then it is the Chinese language.
5. If 0.40 < M < 0.44, then it is the English language.
6. If 0.23 < M < 0.40, then it is the Arabic language.
The M value may overlap due to the use of different types and font sizes for Arabic or English text lines, so for more accurate identification of the remaining lines, which are Arabic or English, we use the feature of the second technique:
7. Calculate the absolute maximum in the middle zone of the input line (Pmax1).
8. Calculate the highest next relative maximum in the middle zone (Pmax2).
9. Calculate the distance of the absolute maximum to the relative maximum (D = Pmax1 - Pmax2).
10. If 15 < D < 40, then the text line is identified as English, otherwise as Arabic.
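The rule-based classification above can be written directly as a small function. The sketch below is our own illustration (function names are hypothetical; the threshold values come from Tables 1 and 2, and the definition of the distance D as a difference of profile values is our reading of the paper).

import numpy as np

def classify_line(M, middle_profile):
    """Classify a text line as Chinese, English or Arabic from the
    black-pixel ratio M and the horizontal profile of its middle zone."""
    if 0.59 < M < 0.69:
        return "Chinese"
    # Arabic and English may overlap in M, so use the Pmax1/Pmax2 distance
    p = np.asarray(middle_profile, dtype=float)
    pmax1 = p.max()                               # absolute maximum of the profile
    # relative maxima: rows higher than both neighbours, below the absolute maximum
    rel = [p[i] for i in range(1, len(p) - 1)
           if p[i] > p[i - 1] and p[i] > p[i + 1] and p[i] < pmax1]
    pmax2 = max(rel) if rel else pmax1            # next largest relative maximum
    D = pmax1 - pmax2
    return "English" if 15 < D < 40 else "Arabic"

# usage with made-up values: M = 0.36 and a profile with two separated peaks
print(classify_line(0.36, [5, 40, 12, 8, 18, 9, 6]))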


Experiments and Results

In this paper an effective method has been proposed to identify the type of written language in trilingual printed texts of Arabic, English and Chinese. The proposed method uses two new techniques for analyzing and extracting features based on the horizontal profiles of each text line. The proposed method, despite its simplicity, has the advantage of identifying various fonts and sizes, a capability that similar methods lack, and it is correct with 99.84% confidence on the data collection. The proposed algorithm can also be extended to the word level and to other languages.

language   Whole number of lines   Correct identification   Wrong identification   Identification percentage
Chinese    1300                    1300                      0                      100%
Arabic     1200                    1198                      3                      99.81%
English    1250                    1247                      3                      99.83%

Table 3: Results of the experiments.

The comparison of the proposed algorithm with recent works in this field, which were tested on texts with a fixed font size, is shown in Table 4.

Method                                                             Data (lines)   Confidence
Rule based classifier using top and bottom profile feature [8]     500            96.6%
Rule based classifier using profile feature [2]                    200            99.83%
Proposed Algorithm                                                  3750           99.84%

Table 4: Comparison of the proposed algorithm with recent works.

Conclusion

In this paper an effective method has been proposed to identify the type of language in trilingual printed texts of Arabic, English and Chinese. The proposed method uses two new techniques for analyzing and extracting features based on the horizontal profiles of each text line. The proposed method, despite its simplicity, has the advantage of identifying various fonts and sizes, a capability that similar methods lack, and it is correct with 99.84% confidence on the data collection. The proposed algorithm can also be extended to the word level and to other languages.

References
[1] A. Selamat and Ng Choon Ching, Arabic Script Documents Language Identification Using Fuzzy ART, Modeling & Simulation, AICMS 08, Second Asia International Conference on, Chapter 5, pages 528-533, 2008.
[2] P. K. Aithal, G. Rajesh, D. U. Acharya, and N. V. K. M. Subbareddy, Text line script identification for a tri-lingual document, Lecture Notes in Computer Science (2010), 1-3.

[3] A. Lawrence Spitz, Determination of the Script and Language Content of Document Images, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (1997), 235-245.
[4] A. Zramdini and R. Ingold, Optical font recognition using typographical features, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998), 877-882.
[5] D. Dhanya and A. G. Ramakrishnan, Script Identification in Printed Bilingual Documents, Lecture Notes in Computer Science (2002), 73-82.
[6] E. B. Bilcu and J. Astola, A Hybrid Neural Network for Language Identification from Text, Lecture Notes in Computer Science (2006), 253-258.
[7] H. Rezaee, M. Geravanchizadeh, and F. Razzazi, Automatic language identification of bilingual English and Farsi scripts, Lecture Notes in Computer Science (2009), 1-4.
[8] P. A. Vijaya and M. C. Padma, Text Line Identification from a Multilingual Document, Lecture Notes in Computer Science (2009), 302-305.


Clustering in Backtracking for Solution of the N-queen Problem

Vishal Kesri                       Samaneh Ahmadi                                      Vaibhav Kesri
KIIT University                    Esfahan University                                  NIT Kurukshetra
School of Computer Engineering     Department of Information Science and Library       Department of Electrical
India                              Esfahan, Iran                                       India
vissair@gmail.com                  samanehahmadi71@yahoo.com                           vaibhavkesri1@gmail.com

Abstract: The N-queens problem is a classic problem in which n queens are to be placed on an n x n board such that no queen attacks any other queen. The branching factor grows in a roughly linear way, which is an important consideration for researchers. Many researchers have addressed the problem with the help of artificial intelligence search patterns such as DFS, BFS and backtracking algorithms. We have conducted a study of this problem and propose a new backtracking algorithm based on clustering of the chess board. To perform clustering on the chess board we need to convert the chess board into a network. This algorithm gives all solutions of the n x n chess board.

Keywords: Cluster-A, Cluster-B, Cluster-C

Introduction

The N-queen problem is a generalized form of the 8-queen problem, proposed by the chess player Max Bezzel. In the 8-queen problem, 8 queens are required to be placed on an 8x8 chess board in such a way that no queen attacks any other queen [2]. A queen can move in the horizontal (same row), vertical (same column) and diagonal directions. An N-queen solution must therefore follow the following rules [4]:
1. There is at most one queen in each column.
2. There is at most one queen in each row.
3. There is at most one queen in each diagonal.
The 8-queen problem is computationally very expensive, since the total number of possible arrangements of 8 queens is 64!/(56! 8!) ≈ 4.4 × 10^9, while the total number of possible solutions is 92 [2]. If we consider two solutions to be the same when one can be obtained from the other by rotation or symmetry, only 12 different solutions exist, and it becomes very hard to obtain a unique solution out of these.

1.1 Applications

The N-queen problem is used in many practical solutions like parallel memory storage schemes, VLSI testing, traffic control and deadlock prevention. This problem is also used to find solutions to more practical problems which require permutations, like the travelling salesman problem.

Corresponding Author, T: (+91) 9871271078


2 Solution Strategy

The N-queen problem is an important issue. This algorithm is based on the backtracking technique, and the solution strategy for the N-queen problem depends on the following components:
1. Backtracking
2. Entry of queen attack
3. Condition
4. Cluster
5. Analyzer

2.1 Backtracking

This algorithm is based on the backtracking technique [3]. Backtracking represents one of the most general techniques; many problems which deal with searching for a set of solutions satisfying some constraints can be solved using the backtracking formulation. The N-queen problem is to place n queens on an n x n chess board so that no two queens attack each other. Here we make a root node and start from it; when placing a queen on the chess board, we create a child node. If the child node does not satisfy the constraints (described in the following subsections), we go back to the parent node and try to put the queen into another cell.

2.2 Entry of queen attack

When we place a queen on the chess board, we make an entry in every chess board cell (node) that is attacked by this queen. This helps us choose a cell for placing a new queen. Also, when a queen is removed from the chess board due to backtracking, all entries made by this queen are deleted.

2.3 Condition

In this algorithm we use the following conditions and check them after placing a queen into a cell of the chess board; if any one of these conditions is true, we backtrack and choose another cell to place the queen. The conditions are given below:
1. Row - After placing a queen into a cell of the chess board, take the row value of that cell and check all the rows above; if in any row no free cell remains, backtrack.
2. Column - After placing a queen into a cell of the chess board, take the column value of that cell and check all the columns above; if in any column no free cell remains, backtrack.
3. Diagonal - There are in total n + (n-1) diagonals, where n stands for the chess board size. The condition is that, after placing a queen, we check all diagonals and count the diagonals that have at least one free cell; that count should be greater than or equal to the number of remaining queens. There are also two kinds of diagonals, left diagonals and right diagonals, and this rule is checked for both.

2.4 Cluster

To perform clustering, we need to think of the chess board as a network, where every cell is a node and the connections between nodes are based on the movement of the queen in chess. Each node is assigned a number and a color; the node number is based on the row and column of the chess board, and there are three types of color: black, red and green. Black shows a queen position, red shows that the node is not free (because a previous queen attacks it) and green shows that the node is free. At the start, all nodes have the green color. An example of this network is given below.

Figure 1

Figure 2


Then we make clusters of nodes, classified as cluster-A, cluster-B and cluster-C. We can also perform rotations on a cluster: 90, 270 or 360 degree clockwise or anti-clockwise rotations, and a 180 degree back-side rotation followed by 90, 270 or 360 degree clockwise or anti-clockwise rotations. Each type of cluster is shown below.

Cluster-A
Cluster-A tells us that at most one queen can be placed in each such cluster of nodes. These clusters are of the following type:

Figure 3

After making all clusters, if some single nodes remain, we make clusters of these nodes by using table-1, table-2 and table-3, because any two or more nodes whose table values are the same belong to the same cluster-A. For example:

Figure 4

Here all three green nodes are in the same cluster.

Cluster-B
Cluster-B tells us that at most two queens can be placed in each such cluster of nodes. These clusters are of the following type:

Figure 5

Cluster-C
Cluster-C guarantees that 4 queens can be placed; cluster-C is very useful when the value of n becomes large.

Figure 6

Association Rules
There are some association rules which are applied between clusters. These rules are the following:
1. Take two clusters of the cluster-A category; if there is an intersection of these two clusters and we place a queen on it, both clusters are treated as one cluster.

2.5 Analyzer

The purpose of the analyzer is to analyze the next four rows. If the analyzer gets an appropriate result, the search continues, otherwise it backtracks; appropriate means that it is possible to place 4 queens. The analyzer analyzes the next four rows on the basis of the clustering and the association rules. There are some matrices: anlyz (of size 4 × n), table-1 (of size 4), table-2 (of size n) and table-3 (of size n+6), where n stands for the chess board size. There are certain rules for analyzing the next four rows; these are the following:
1. If there are fewer than 4 clusters of the cluster-A category, backtrack.
2. If there are 2 or more clusters of the cluster-B category, check their queens' corresponding entries in table-1, table-2 and table-3; if there are four or more positions, continue.
3. If there is any cluster of the cluster-C category, just continue.
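To make the overall strategy concrete, the following minimal Python sketch (our own illustration, not the authors' implementation; it omits the cluster network and the association rules) shows backtracking for the N-queen problem with a simple free-cell pruning check in the spirit of the Condition and Analyzer steps: before going deeper, it verifies that every remaining row still has at least one free cell.

def solve_n_queens(n):
    """Backtracking N-queens solver; cols/diag1/diag2 record attacked lines,
    and a pruning check abandons a branch as soon as some remaining row
    has no free cell left (a simplified version of the paper's conditions)."""
    solutions, placement = [], []
    cols, diag1, diag2 = set(), set(), set()

    def row_has_free_cell(r):
        return any(c not in cols and (r - c) not in diag1 and (r + c) not in diag2
                   for c in range(n))

    def place(row):
        if row == n:
            solutions.append(list(placement))
            return
        # pruning: every remaining row must still contain a free cell
        if not all(row_has_free_cell(r) for r in range(row, n)):
            return
        for c in range(n):
            if c in cols or (row - c) in diag1 or (row + c) in diag2:
                continue
            cols.add(c); diag1.add(row - c); diag2.add(row + c)
            placement.append(c)
            place(row + 1)
            placement.pop()
            cols.discard(c); diag1.discard(row - c); diag2.discard(row + c)

    place(0)
    return solutions

print(len(solve_n_queens(6)))   # 4 solutions for the 6 x 6 board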


How it works

Consider a chess board (6 × 6) and convert it into a network. Here we follow the backtracking algorithm. The network of the chess board (6 × 6) is shown below:

Figure 7

Step-1: Place a queen, then make its corresponding entries in the other nodes. After placing two queens, our chess board looks as below:

Figure 8

Step-2: Find clusters for the next four rows and give preference to those clusters which cover more nodes; that is, first we make clusters of 4 nodes, then 3-node clusters, then 2-node clusters.
Here we can see that after placing 2 queens there are 3 clusters. We cannot place 4 queens in 3 clusters, so the selection of the second queen's position is wrong and we need to backtrack and choose another position for the second queen. These clusters belong to the cluster-A category.

Conclusion and Future Work

In this paper we showed that if we have fewer clusters than remaining queens, we do not need to continue further and can instead simply backtrack. Future work is related to finding new clusters which belong to the existing categories of clusters (cluster-A, cluster-B and cluster-C) or to new categories of clusters, so that the new clusters help to improve the Analyzer. Future work is also related to finding new association rules between clusters.

References
[1] V. Kesri, Va. Kesri, and P. Ku. Pattnaik, An Unique Solution for N queen Problem, International Journal of Computer Applications 43 (2012), no. 12, 1-6.
[2] C. Letavec and J. Ruggiero, The n-queen problem, INFORMS Transactions on Education 2 (2002), no. 3.
[3] E. Horowitz, S. Sahni, and S. Rajasekaran, Fundamentals of Computer Algorithms.
[4] B. Sauerwine, Uninformed Search Analysis of the N-queens Problem, http://www.andrew.cmu.edu/user/bsauerwi/research/P 1.pdf (2003).
[5] I. Martinjak and M. Golub, Comparison of Heuristic Algorithms for the N-Queen Problem, Proceedings of the ITI 2007 29th Int. Conf. on Information Technology Interfaces, Cavtat, Croatia 22 (2007), no. 1, 26-36.
[6] E. Horowitz, S. Sahni, and S. Rajasekaran, Fundamentals of Computer Algorithms.
[7] N. J. Nilsson, Artificial Intelligence: A New Synthesis.


An Improved Phone Lattice Search Method for Triphone Based Keyword Spotting in Online Persian Telephony Speech

Maria Rajabzadeh                              Shima Tabibian
Iran University of Science & Technology       Iran University of Science & Technology
m rajabzadeh@comp.iust.ac.ir                  shimatabibian@iust.ac.ir

Ahmad Akbari                                  Babak Nasersharif
Iran University of Science & Technology       K.N.Toosi University of Technology
akbari@iust.ac.ir                             bnasersharif@eetd.kntu.ac.ir

Abstract: Keyword spotting (KWS) refers to finding all occurrences of chosen words in speech utterances. Phone Lattice Search (PLS) is one of the known methods for the KWS problem, in which many aspects can be considered to increase the accuracy and the speed of search at the same time. One measure used in PLS is the Minimum Edit Distance (MED). While this measure improves the detection rate, it also increases the false alarm rate; in addition, it only considers the information of the whole keyword in the phone lattice. In this paper, we propose approaches for increasing the search speed on the lattice and also for increasing the KWS performance. We use Viterbi scores besides the MED measure in order to decrease the false alarm rate, and we use the information of the whole phone lattice for score normalization. Results using phone and triphone based phone recognizers on TFarsdat show that the proposed method increases the accuracy and the search speed of the KWS system in comparison to using only the MED measure.

Keywords: Keyword Spotting; Phone Lattice; Lattice Search; Minimum Edit Distance; Scoring.

Introduction

Keyword spotting (KWS) systems are used for the detection of selected words in speech utterances. Searching for various words or terms is needed in spoken document retrieval, which is a subset of information retrieval. KWS is used in a wide range of applications such as searching in audio files.
One of the noticeable challenges in KWS is the choice of a suitable approach for finding the target keyword in the input speech utterance. KWS approaches can be divided into two main categories: the first is called Large Vocabulary Continuous Speech Recognition-based (LVCSR-based) and the second is called the phone sequence-based approach.
The performance of LVCSR-based approaches depends on the vocabulary size. In these methods, if we want to search for a word not contained in the vocabulary, we have no chance of finding it; this problem is called the Out Of Vocabulary (OOV) problem. Because of the large search space, the search speed of LVCSR-based KWS systems is low. Phone sequence methods do not have the OOV problem, but they have lower accuracy in comparison to LVCSR-based methods; on the other hand, they have a higher search speed than LVCSR approaches.
The search space in each of these categories can be represented as a one-best transcription or as a lattice. Many KWS approaches use lattices because they keep several hypotheses and can produce results with higher accuracy than a one-best transcription [1-4].
Phone lattice-based KWS systems have low accuracy due to the low performance of speech phone recognizers.


The performance of KWS systems is tied to the performance of the speech phone recognizers; thus, high insertion, deletion and substitution error rates affect the performance of KWS systems. The Minimum Edit Distance (MED), used during lattice search in some phone lattice-based KWS approaches, compensates for speech recognizer errors. Given source and target sequences, the MED calculates the minimum cost of transforming the source sequence into the target sequence. In this transformation, a combination of insertion, deletion, substitution and match operations is used, and each of these operations has an associated cost. The sequences extracted from the lattice are then accepted or rejected by thresholding on this MED score. A KWS system can be made robust against the errors of the speech recognizer by using MED, but the MED measure raises the false alarm rate, because it considers unrelated words as keyword hits [5-7]. In [5], a method named Dynamic Match Phone Lattice Search (DMPLS) uses the MED measure during the search.
In this paper, we use the DMPLS method, a phone-based approach which applies a lattice structure as its search space. By using a lattice, we can avoid the OOV problem and also make the search speed higher. We propose to remedy the MED weakness using Viterbi scores; using this technique, the proposed method decreases the false alarm rate while preserving the same detection rate. In addition, we use lattice pruning and indexing to increase the search speed of DMPLS.
The rest of the paper is organized as follows. In Section 2, we describe the MED measure. In Section 3, we propose the Viterbi score for improving MED and techniques for improving the search speed on the lattice. Section 4 contains the experimental results. Finally, we conclude the paper in Section 5.

2 Minimum Edit Distance Measure

As mentioned, the weakness of speech phone recognizers and their errors affect phone-based KWS systems. KWS systems are built on the speech recognizers' results, so they inherently suffer from high insertion, deletion and substitution error rates; errors caused by the speaker can increase these error rates further. Some KWS approaches use the Minimum Edit Distance (MED) during lattice search to compensate for speech recognizer errors. The MED between two strings is defined as the minimum cost of converting one string to another, given three basic operations: insertion, deletion and substitution.
A two-dimensional matrix, M, is used to hold the edit distance values. This matrix is (p+1) × (q+1), where p and q are the lengths of the two strings X and Y, respectively. The MED between X and Y can be computed as a matrix whose last element is the minimum edit distance between X and Y:

M(0,0) = 0
M(i,0) = i · I,   i = 1, ..., p
M(0,j) = j · D,   j = 1, ..., q
M(i,j) = min{ M(i-1,j-1) + S(X(i),Y(j)),  M(i-1,j) + D,  M(i,j-1) + I }

where S, I and D are the substitution, insertion and deletion penalties, respectively. At the time of keyword search in the lattice, all the phone sequences with a MED score lower than a threshold value are declared as keyword hits [7].
The Dynamic Match Phone Lattice Search (DMPLS) method uses the MED measure to improve the performance of the search [5]. The MED measure can cover speech recognizer errors and increases the detection rate of KWS, but it also raises the false alarm rate; higher threshold values give a higher detection rate but also a higher false alarm rate. In this paper, we use an approach to improve the results of a search process that uses only the MED measure. We describe the approach in Section 3.
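As a concrete illustration of the recurrence above, the following Python sketch (our own, not the authors' code; the unit penalties are placeholder values) computes the MED between a recognized phone sequence and a keyword's phone sequence.

def minimum_edit_distance(source, target, ins=1, dele=1, sub=1):
    """Dynamic programming MED between two phone sequences.
    ins, dele, sub are the insertion, deletion and substitution penalties."""
    p, q = len(source), len(target)
    M = [[0] * (q + 1) for _ in range(p + 1)]
    for i in range(1, p + 1):
        M[i][0] = i * ins
    for j in range(1, q + 1):
        M[0][j] = j * dele
    for i in range(1, p + 1):
        for j in range(1, q + 1):
            cost = 0 if source[i - 1] == target[j - 1] else sub
            M[i][j] = min(M[i - 1][j - 1] + cost,   # substitution / match
                          M[i - 1][j] + dele,       # deletion
                          M[i][j - 1] + ins)        # insertion
    return M[p][q]

# e.g. a candidate phone string from the lattice vs. a keyword's phone string
print(minimum_edit_distance(list("salam"), list("salaam")))   # -> 1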


3 Proposed Approach

One drawback of the MED measure is that it does not normalize the scores of candidate keywords and only uses the information of the candidate substring. To compensate for this defect, we use an approach that improves the performance of methods that use only the MED measure; we describe this approach in subsection 3.1. We also use some techniques for increasing the speed of the search process, described in subsection 3.2.

3.1 Viterbi Scoring for Improving the Accuracy of Search

As mentioned in Section 2, the MED measure can increase the detection rate of keywords, but it also increases the false alarm rate. To improve the accuracy of the search method, we use Viterbi scoring.
This paper uses partial Viterbi scoring for the candidate keywords obtained from the MED method. This scoring method normalizes the scores and uses the information of the whole lattice, while the MED measure only uses the information of the candidate keyword, so Viterbi scoring can improve the results of a search process that uses only the MED measure. This scoring method is implemented as a Viterbi algorithm, where the keyword confidences are obtained by Equation (1):

CM_keyword^phoneme = L_alpha^phoneme(keyword) + L^phoneme(keyword) + L_beta^phoneme(keyword) - L_best^phoneme        (1)

where CM is the confidence measure of the keyword and L^phoneme(keyword) is the likelihood of the keyword, computed as the sum of the acoustic likelihoods of the phonemes recognized correctly. The forward likelihood L_alpha^phoneme(keyword) is the likelihood of the best path through the lattice from the beginning of the lattice to the start of the keyword. The backward likelihood L_beta^phoneme(keyword) is computed from the end of the lattice to the end of the keyword. The forward and backward likelihoods are recursively evaluated as:

L_alpha^phoneme(N) = L_acoustic^phoneme(N) + min over Np of L_alpha^phoneme(Np)        (2)

L_beta^phoneme(N) = L_acoustic^phoneme(N) + min over NF of L_beta^phoneme(NF)        (3)

where NF is the set of nodes directly following node N and Np is the set of nodes directly preceding node N. The alpha likelihood for the first node and the beta likelihood for the last node are set to 1. L_best^phoneme is the likelihood of the most probable path through the lattice [8].
As shown in Equations (2) and (3), partial Viterbi scoring normalizes the scores by using the information of the whole lattice, and it can improve the results. The results of applying Viterbi scoring to the search process with the MED measure are presented in Section 4.

3.2 Techniques for Increasing the Search Speed in Online Lattices

In this work, we search for keywords in lattices produced online. In online search, we need a solution to produce the search space faster. We use two techniques that make the search process faster: lattice pruning and lattice indexing. The produced lattices are usually large and contain some redundant information; we increase the speed of the search process by using pruning methods, which reduce the search space.
The other technique that we use for increasing the speed of the search is lattice indexing. We can indicate the occurrence of a phone in a node by phone indexing. This indexing approach makes the search process faster [1, 9].

4 Experiments and Results

We evaluated our keyword spotting system on TFarsdat [10], a database of Persian conversational telephone speech. This database consists of 320 audio files spoken by 64 different speakers. The speakers have a wide variety of genders, ages and educations, and they cover 10 different Persian dialects. The number of different phones in this database is taken as 30.
We considered 20 keywords, chosen so as to have no syllabic overlap between them. The keywords have 2 or 3 syllables and 4 to 8 phones; there are 7 two-syllabic and 13 three-syllabic keywords. There are three kinds of syllables: CV, CVC and CVCC. The distribution of syllabic parts in the keywords is shown in Table 1.

Count                CV   CVC   CVCC
Distinct syllables   2    16    23
Total syllables      3    19    24

Table 1: Distribution of syllabic parts in the selected keywords

The keywords occur in 149 audio files. From these files, 113 files were used for the training phase and 36 files were used for the test phase (2 files for each speaker).

The total duration of the training files is about 3 hours and the total duration of the test files is 0.9 hours. Each phone has been modelled using a left-to-right HMM with 3 states and 64 Gaussian mixtures per state. After that, we used tied mixtures for constructing the triphone models; in this way, we obtained 4535 triphone models for TFarsdat and a better phone recognizer for constructing the phone lattice than with monophone-based models. The feature vectors contain the energy and 12 MFCCs together with their first, second and third order derivatives, so the feature vector dimension is 52.
Figure 1 presents the effect of triphone models on the performance of the KWS system in comparison to monophone models. In this figure, the ROC (Receiver Operating Characteristic) curve is reported for two values of allowed errors. The term allowed errors denotes the maximum allowed number of different phones between a candidate keyword and the desired keyword; therefore, when the allowed errors are set to 1, the MED measure accepts a candidate keyword with only one phone different from the original keyword. We denote the allowed errors by Emax. As shown, triphone models improve the results of the KWS system, so we base our KWS system on triphone models.

Figure 1: KWS system based on triphone models compared with the KWS system based on monophone models

Figure 2 presents the ROC curve comparing the MED method with a search method without a lattice; both methods are based on triphone models. As shown in the figure, the MED method on the lattice increases the accuracy of the search.
Figure 3 presents the ROC curve comparing the method using Viterbi scoring with the method using only the MED measure. The results are summarized in Table 2, which presents the accuracy of the proposed approach measured with FOM (Figure Of Merit). As can be seen from the table, when Emax is set to 1, search using Viterbi scores increases the FOM by 0.02% in comparison to the method using only the MED measure, and when Emax is set to 2, it increases the FOM by 0.08%. In addition, when Emax is set to 1, the proposed approach based on triphone models increases the FOM by 0.172% in comparison to monophone models, and when Emax is set to 2, it increases the FOM by 0.15%.

Method                       Emax   FOM
Monophone-based              1      0.068
                             2      0.19
MED + triphone               1      0.24
                             2      0.34
MED + Viterbi + triphone     1      0.26
                             2      0.42

Table 2: Accuracy of the proposed approach measured with FOM

As mentioned before, we use some techniques to make the search on the lattice faster. The speed of search is measured by the Real Time Factor (RTF). Table 3 presents the search speed results for the two values of allowed errors. When Emax is set to 1, the speed of our KWS system with Viterbi scoring added is 1.7 times faster than real time; when Emax is set to 2, the speed of our KWS system is 1.1 times faster than real time.

Emax   Real Time Factor
1      0.58
2      0.91

Table 3: Results of search speed measured with RTF


Figure 2: Applying the MED measure compared with the method without a lattice

Figure 3: Effect of applying Viterbi Scoring to the MED measure compared with the method using only the MED measure

Conclusions

In this paper, we presented a phone-lattice based keyword spotting system for online Persian conversational telephony speech. This system uses the MED measure, which covers some errors of the speech recognizer and so increases the detection rate of the KWS system; we defined this system as our baseline. To improve the performance of this system, we applied Viterbi scoring to the MED measure. This method uses the information of the whole lattice and normalizes the scores of candidate keywords, so it decreases the false alarm rate. In addition, we showed the effect of triphone models on the performance of the KWS system in comparison to monophone models. We defined the allowed errors as the maximum number of different phones between the candidate keyword and the desired keyword. When the allowed errors are set to 1, search using Viterbi scores increases the FOM by 0.02% in comparison to the method using only the MED measure, and when the allowed errors are set to 2, the FOM is increased by 0.08%. In addition, when Emax is set to 1, the proposed approach based on triphone models increases the FOM by 0.172% in comparison to monophone models, and when Emax is set to 2, it increases the FOM by 0.15%.
We also used techniques such as lattice pruning and indexing to increase the search speed. We considered lattices produced online, which need a fast search method; as a result, the speed of our KWS system is faster than real time.

Acknowledgements

We thank the Iran Telecommunication Research Center for its support during this work.

References
[1] J. Cernocky, I. Szoke, M. Fapso, M. Karafiat, L. Burget, J. Kopecky, F. Grezl, P. Schwarz, O. Glembek, I. Oparin, P. Smrz, and P. Matejka, Search in Speech for Public Security and Defense, Proceedings of the IEEE Workshop on Signal Processing Applications for Public Security and Forensics (SAFE) (2007), 1-7.
[2] J. T. Foote, S. J. Young, G. J. F. Jones, and K. S. Jones, Unconstrained keyword spotting using phone lattices with application to spoken document retrieval, Computer Speech & Language 11 (1997), 207-224.
[3] H. Sak, M. Saraclar, and T. Gungor, On-the-fly Lattice Rescoring for Real-time Automatic Speech Recognition, Proceedings of Interspeech (2010), 2450-2453.
[4] X. Wang, L. Xie, B. Ma, E. S. Chng, and H. Li, Phoneme Lattice based Text Tiling towards Multilingual Story Segmentation, Proceedings of Interspeech (2010), 1305-1308.
[5] K. Thambiratnam and S. Sridharan, Dynamic match phone lattice searches for very fast and accurate keyword spotting, Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2005), 465-468.
[6] K. Thambiratnam and S. Sridharan, Rapid yet accurate speech indexing using dynamic match lattice spotting, IEEE Transactions on Audio, Speech, and Language Processing (2007), 346-357.
[7] K. Audhkhasi and A. Verma, Keyword search using modified Minimum Edit Distance measure, Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2007), 929-932.
[8] I. Szoke, P. Schwarz, P. Matejka, and M. Karafiat, Comparison of keyword spotting approaches for informal continuous speech, Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech) (2005), 633-636.
[9] K. Trinh, H. Nguyen, D. Duong, and Q. Vu, An empirical study of multi-pass decoding for Vietnamese LVCSR, Proceedings of the International Workshop on Spoken Languages Technologies for Under-resourced Languages (SLTU) (2008).
[10] M. Bijankhan, J. Sheykhzadegan, M. Roohani, R. Zarrintare, S. Z. Ghasemi, and M. E. Ghasedi, TFarsdat - the telephone Farsi speech database, Proceedings of EUROSPEECH (2003), 1525-1528.


Adaptive Gaussian Estimation of Distribution Algorithm

Shahram Shahraki                               Mohammad-R. Akbarzadeh-T
Mashhad Branch, Islamic Azad University        Ferdowsi University of Mashhad
Department of AI and Computer Sciences         Departments of Electrical and Computer Engineering
shahram.shahraki@mshdiu.ac.ir                  akbarzadeh@ieee.org

Abstract: Algorithms such as Estimation of Distribution Algorithms use probabilistic modeling to generate candidate solutions in optimization problems. The probabilistic representation and modeling allows these algorithms to climb the hills in the search space. In this paper, the Adaptive Gaussian Estimation of Distribution Algorithm (AGEDA), a kind of multivariate EDA, is proposed for real coded problems. The proposed AGEDA needs no initialization of parameters; the mean and standard deviation of the solutions are extracted adaptively from the population information. A Gaussian data distribution and dependent individuals are the two assumptions considered in AGEDA. The model fitting task in AGEDA is based on a maximum likelihood procedure for estimating the parameters of the Gaussian distribution assumed for the data. The proposed algorithm is evaluated and compared experimentally with the Univariate Marginal Distribution Algorithm (UMDA), Particle Swarm Optimization (PSO) and the Cellular Probabilistic Optimization Algorithm (CPOA). Experimental results show the superior performance of AGEDA over the other algorithms.

Keywords: Probabilistic Optimization Algorithm, Particle Swarm Optimization, Evolutionary algorithms, Univariate Marginal Distribution Algorithm

Introduction

Evolutionary search algorithms have become important population based optimization techniques in recent years as a consequence of the increase in computational ability to solve optimization problems. These techniques search through many possible solutions and operate on a set of potential individuals to obtain a better estimate of the solution by using the principle of survival of the fittest, as in natural evolution. Genetic algorithms (GAs) developed by Fraser [1], Bremermann [2], and Holland [3], evolutionary programming (EP) developed by Fogel [4], and evolution strategies (ES) developed by Rechenberg [5] and Schwefel [6] establish the backbone of evolutionary computation, which has been formed over the past 50 years. Estimation of Distribution Algorithms (EDAs), also called Probabilistic Model-Building Genetic Algorithms or Iterated Density Estimation Algorithms, were proposed by Mühlenbein and Paaß [7] as an extension of genetic algorithms, which are one of the main and basic methods in evolutionary techniques; in other words, EDAs are a novel class of evolutionary optimization algorithms that were developed in the last decade as a natural alternative to genetic algorithms. EDAs generate their new offspring based on the probability distribution defined by the selected points, instead of performing recombination of individuals. The main advantage of EDAs over genetic algorithms is that no multiple parameters need to be set (e.g. crossover and mutation probabilities). In their traditional version, EDAs are inherently defined for problems with a binary representation, so a problem in the real domain must first be mapped to a binary coding before being optimized; this approximation might lead to undesirable limitations and errors on real coded problems [11]. The bottleneck of EDAs lies in estimating the joint probability distribution associated with the population that contains the selected individuals. Accordingly, EDAs can be essentially divided into univariate, bivariate or multivariate approaches.

Corresponding Author, Center of Excellence on Soft Computing and Intelligent Information Processing, IEEE Senior Member, Professor


In this paper a new kind of multivariate EDA, called the Adaptive Gaussian Estimation of Distribution Algorithm (AGEDA), is introduced. AGEDA has been designed for real coded problems. The proposed algorithm assumes a Gaussian distribution of the data to model and estimate the joint distribution of the promising solutions, based on the maximum likelihood technique. The next generation is sampled based on this set of solutions, taken as the parent set, and the estimated joint distribution. This type of probabilistic representation allows AGEDA to escape from local optima and move freely through the fitness function.

Adaptive Continuous Estimation of Distribution Algorithm

Adaptive Gaussian Estimation of Distribution Algorithm (AGEDA) which is a subset of multivariate


EDAs and has been designed for real coded problems
is introduced in this paper. If the size of population
is infinite, it will be proven that if p(x) is the same as
the actual probability distribution of the points in Q(t)
and proportional or tournament or truncation selection
is used in Modeling Step, then the population will be
converged to the global optimal solution[8]. The most
important and crucial step of EDAs is the construction of probabilistic model for estimation of probability
distribution, to do this, in AGEDA, Gaussian distribution of data is assumed to model and estimate the
joint distribution of promising solutions. The following
estimation is used to generate new candidate solutions.
f (X) =

e[
1/2
2 k/2 ||

(x)T 1 (x)
]
2

EDA can improve exploitation without losing the exploration ability of EDAs. To achieve this, AGEDA
benefits different Gaussian distributions estimation in
every dimension of individuals.
The proposed algorithm has two implicit parameters: mean and standard deviation, where these parameters extracted from promising population adaptively,
also AGEDA has not any parameters. Therefore, there
is no need to set the parameter in AGEDA.
The procedure of the proposed algorithm is described below:
Step 1) Initialize the first generation randomly with uniformly distributed random numbers in all dimensions.
Step 2) Evaluate the fitness function of all real valued individuals.
Step 3) The main loop of the algorithm: continue until the termination condition (maximum number of generations) is met.
Step 4) Based on the truncation selection model, the top evaluated individuals are selected to estimate the distribution parameters, and weak individuals are eliminated so that they do not participate in the estimation.
Step 5) The distribution parameters are estimated using the maximum likelihood estimation technique.

Suppose a training set with n patterns X = (x1, x2, ..., xn). For Gaussian distribution functions, the sample estimates given by equations (3) and (4) are maximum likelihood estimates and converge to the true values as the number of cases increases.

where \mu is the mean and \Sigma is the covariance matrix, which can be written for a k-dimensional random vector X = [X_1, X_2, ..., X_k] in the following notation:

\Sigma = E\left[ (x-\mu)(x-\mu)^T \right] =
\begin{bmatrix}
E[(x_1-\mu_1)(x_1-\mu_1)] & \cdots & E[(x_1-\mu_1)(x_k-\mu_k)] \\
\vdots & \ddots & \vdots \\
E[(x_k-\mu_k)(x_1-\mu_1)] & \cdots & E[(x_k-\mu_k)(x_k-\mu_k)]
\end{bmatrix}    (2)

\mu = E(X)    (3)

\Sigma = E\left[ (X-\mu)(X-\mu)^T \right]    (4)

So, to model the data (population), the mean and standard deviation parameters of the promising population, computed by the maximum likelihood technique, are required.
Step 6) Based on the estimated mean and standard deviation for every dimension, a new population is sampled as in (5):

x_{ij} = G(\mu_i, \sigma_i)    (5)

where \mu_i and \sigma_i are the estimated parameters of the population based on the top evaluated individuals and G(.,.) is a Gaussian random number generator. In addition, i = 1, 2, ..., d (for a d-dimensional problem) is the dimension indicator and j = 1, 2, ..., k (with maximum population size k) is the population index.


Step 7) is the consistency check step:

x_{ij} = \begin{cases} x_{ij} & l_i < x_{ij} < u_i \\ G(\mu_i, \sigma_i) & \text{otherwise} \end{cases}    (6)

Step 8) All real valued individuals are evaluated by the fitness function.
Step 9) At this point two populations exist: the current generation and the offspring produced from it by the AGEDA procedure. Finally, based on the truncation selection model, the next generation is selected from these populations.
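To make the nine steps concrete, the following is a minimal Python sketch of the AGEDA loop as described above; the fitness function, bounds, truncation ratio and random seed are illustrative assumptions of ours, not values prescribed by the paper.

import numpy as np

def ageda(fitness, lower, upper, pop_size=36, max_gen=50, trunc_ratio=0.5, seed=0):
    """Minimal sketch of the AGEDA procedure (Steps 1-9); fitness is maximized."""
    rng = np.random.default_rng(seed)
    d = len(lower)
    pop = rng.uniform(lower, upper, size=(pop_size, d))      # Step 1: uniform initialization
    fit = np.apply_along_axis(fitness, 1, pop)               # Step 2: evaluate
    for _ in range(max_gen):                                 # Step 3: main loop
        n_top = max(2, int(trunc_ratio * pop_size))
        top = pop[np.argsort(fit)[-n_top:]]                  # Step 4: truncation selection
        mu, sigma = top.mean(axis=0), top.std(axis=0) + 1e-12 # Step 5: ML estimates per dimension
        offspring = rng.normal(mu, sigma, size=(pop_size, d)) # Step 6: sample new population
        # Step 7: consistency check (eq. (6)) -- out-of-range coordinates are redrawn
        bad = (offspring < lower) | (offspring > upper)
        redraw = rng.normal(np.broadcast_to(mu, offspring.shape),
                            np.broadcast_to(sigma, offspring.shape))
        offspring = np.where(bad, redraw, offspring)
        off_fit = np.apply_along_axis(fitness, 1, offspring) # Step 8: evaluate offspring
        merged = np.vstack([pop, offspring])                 # Step 9: truncation selection over
        merged_fit = np.concatenate([fit, off_fit])          # parents and offspring
        keep = np.argsort(merged_fit)[-pop_size:]
        pop, fit = merged[keep], merged_fit[keep]
    best = np.argmax(fit)
    return pop[best], fit[best]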

In this paper, the benchmark problems used to evaluate our algorithm are numerical function optimization problems: Schwefel, Ackley, Griewank, Rosenbrock, G1, Kennedy, Rastrigin and Michalewicz.
Table II summarizes the experimental results of PSO, AGEDA, UMDA and CPOA for these benchmark functions. Note that the discussed functions are posed as minimization problems with a global minimum and several local minima, whereas the proposed algorithm is designed for maximization; as a result, we redefine them so as to maximize f(x). In most of the benchmark functions the performance of AGEDA is better than that of PSO, and on all the problems the performance of AGEDA is better than that of CPOA. In Table II the best results are marked in bold; as the table shows, AGEDA finds the best results on all problems, although Wolpert and Macready proved in [8] that no search algorithm can be better than another search algorithm over all possible problems.

This type of probabilistic representation allows AGEDA to escape from local optima and move freely through the fitness landscape. Later, guided by the adaptively estimated standard deviation, AGEDA heads towards the global optimum and concentrates on climbing the promising hills found so far. The superiority of AGEDA is due to the adaptive extraction of the Gaussian distribution parameters. Because no coding or decoding procedure is needed in AGEDA, it is faster than classic EDAs, GAs and other algorithms that encode their populations.
Experimental results show that AGEDA initially has good speed compared with the other algorithms.

Experimental result

Table I. Parameter settings for specified algorithms (CPOA)

Function      S      -      -      Mutate   Rmu     Rdel    S
Schwefel      0.3    0.05   0.03   0.005    0.002   0.002   6
Ackley        0.4    0.05   0.29   0.005    0.002   0.002   6
Griewank      0.36   0.5    0.07   0.005    0.002   0.002   6
Rosenbrock    0.3    0.5    0.2    0.005    0.002   0.002   6
G1            0.3    0.05   0.03   0.005    0.002   0.002   6
Kennedy       0.25   0.05   0.08   0.005    0.002   0.002   6
Rastrigin     0.3    0.06   0.03   0.005    0.002   0.002   6
Michalewicz   0.3    0.05   0.09   0.005    0.002   0.002   6

While approaching the global optimum, AGEDA tries to find the best solution using its strength in local search, i.e. exploitation. This superiority comes from the adaptive extraction of the Gaussian distribution parameters and from this type of probabilistic representation. The standard deviation usually has a large value at first, but as the algorithm approaches the global optimum its value decreases adaptively.

In this paper the dimension of the problems is set to m = 50, 100. The population size for all of the experiments is set to 36, and the maximum-generation termination condition equals 50. All results are averaged over 20 runs. To compare our work with other evolutionary algorithms we implemented CPOA (Cellular Probabilistic Optimization Algorithm), PSO (Particle Swarm Optimization) and UMDA (Univariate Marginal Distribution Algorithm), which is a univariate type of EDA. The CPOA parameters are S, Mutate, Rmu, Rdel and S, together with two further rate parameters, and w, c1 and c2 are used for PSO. The best parameters for every algorithm are obviously problem dependent; these parameters are summarized in Table I. The parameters of PSO are the same for all problems: W = 0.9, C1 = 0.1, C2 = 0.2.

This means that AGEDA has more exploration at first and more exploitation later. AGEDA also has better performance than algorithms which are designed for binary coded problems [9] and than UMDA.

Table II: Experimental results of AGEDA, UMDA, PSO and CPOA on the benchmark functions

AGEDA
Function       Mean          STD           Best          Worst
Schwefel       -4.16e+04     -4.16e+04     -4.16e+04     -4.16e+04
Ackley         -0.03578      -0.03578      -0.03578      -0.03578
Griewank       -1.09586      -1.09586      -1.09586      -1.09586
Rosenbrock     -1.045e+04    -1.045e+04    -1.045e+04    -1.045e+04
G1             18.547        18.547        18.547        18.547
Kennedy        -359.565      -359.565      -359.565      -359.565
Rastrigin      -912.939      -912.939      -912.939      -912.939
Michalewicz    9.65          9.65          9.65          9.65

UMDA
Function       Mean          STD           Best          Worst
Schwefel       -4.172e+04    3.1           -4.17e+04     -4.181e+04
Ackley         -2.3          1.1           -1.01         -5.91
Griewank       -1.14         0.05          -1.1          -1.16
Rosenbrock     -2.36e+04     4.36e+04      -1.84e+04     -3.6e+04
G1             18.26         0.09          18.534        18.06
Kennedy        -2.78e+03     1.06e+03      -2.53e+03     -2.96e+03
Rastrigin      -1.53e+03     186           -1156         -2.658e+03
Michalewicz    8.67          1.01          9.56          6.642

PSO
Function       Mean          STD           Best          Worst
Schwefel       -4.173e+04    16.58         -4.17e+04     -4.18e+04
Ackley         -9.12         0.91          -7.84         -10.65
Griewank       -1.12         0.03          -1.08         -1.16
Rosenbrock     -8.168e+05    3.15e+05      -3.66e+05     -1.26e+06
G1             18.53         0.02          18.55         18.5
Kennedy        -3.96e+04     2.68e+04      -7.37e+03     -1.011e+05
Rastrigin      -1.51e+03     100           -1.36e+03     -1.66e+03
Michalewicz    9.095         0.9596        10.558        7.47

CPOA
Function       Mean          STD           Best          Worst
Schwefel       -4.19e+04     1.002         -4.19e+04     -4.29e+04
Ackley         -1.6          0.58          -0.96         -2.69
Griewank       -1.616        0.05          -1.55         -1.69
Rosenbrock     -2.28e+06     9.434e+04     -2.154e+06    -2.474e+06
G1             18.035        0.7724        18.549        16.86
Kennedy        -1.69e+05     3.12e+04      -1.74e+05     -3.41e+05
Rastrigin      -1.69e+03     43.32         -1.60e+03     -1.76e+03
Michalewicz    8.94          1.27          12.32         8.017

Because of its probabilistic nature and Gaussian parameters, the proposed algorithm is trapped in local optima under fewer conditions and with lower probability than the other algorithms. The standard deviation is therefore an important parameter, and it is set adaptively.

Discussion and Future Works

This paper proposed a novel EA inspired by estimation of distribution algorithms. The Adaptive Gaussian Estimation of Distribution Algorithm (AGEDA) is designed for real coded problems. AGEDA uses prior information about the data, where available, to find optimal parameters, and it sets its own required parameters adaptively based on the information in the data.

References
[1] A.S. Fraser, Simulation of genetic systems by automatic digital computers, Aust. J. Biol. Sci. 10 (1957), 484-491.
[2] H. J. Bremermann, Optimization through evolution and recombination, Self-Organizing Systems, M. C. Yovits, G. T. Jacobi, and G. D. Goldstine, Eds., Washington, DC: Spartan (1962), 93-106.
[3] J. H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor: Univ. Michigan Press, 1975.
[4] L. J. Fogel, A. J. Owens, and M. J. Walsh, Artificial Intelligence through Simulated Evolution, New York: Wiley (1966).
[5] I. Rechenberg, Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Stuttgart, Germany: Frommann-Holzboog (1973).
[6] H.-P. Schwefel, Evolution and Optimum Seeking, New York: Wiley (1995).
[7] H. Mühlenbein and G. Paaß, From Recombination of Genes to the Estimation of Distributions, Springer-Verlag PPSN IV, LNCS, Vol. 1141 (1996), 178-187.
[8] D. H. Wolpert and W. G. Macready, No Free Lunch Theorems for Optimization, IEEE Transactions on Evolutionary Computation 1 (1997), 67-82.
[9] L. Zinchenko, M. Radecker, and F. Bisogno, Multi-Objective Univariate Marginal Distribution Optimisation of Mixed Analogue-Digital Signal Circuits, GECCO'07, ACM, London, England, United Kingdom, 978-1-59593-697 (2007).
[10] Q. Zhang and H. Mühlenbein, On the Convergence of a Class of Estimation of Distribution Algorithms, IEEE Transactions on Evolutionary Computation, Vol. 8, No. 2 (April 2004).

Figure 1: Function evaluation for AGEDA, PSO and CPOA for 100 dimensions over 50 function evaluations

[11] Tayarani-N and M.-R Akbarzadeh-T, Probabilistic Optimization Algorithms for numerical function optimization
problems, IEEE Conference on Cybernetics and Intelligent
System (2008), 12041209.


A New Feature Transformation Method Based On Genetic Algorithm


Hannane Mahdavinataj
Department of Computer and Electrical Engineering
Islamic Azad University, Qazvin Branch, Qazvin, Iran
h mahdavinataj@yahoo.com

Babak Nasersharif
Electrical and Computer Engineering Department
K.N. Toosi University of Technology, Iran
bnasersharif@eetd.kntu.ac.ir

Abstract: In pattern recognition, feature transformation methods map features into a new space with the aim of obtaining more discriminative or orthogonal features and thus more separable classes. In this paper, we propose a linear feature transformation method which considers both the class discrimination and the feature orthogonality criteria simultaneously. We obtain this feature transformation using a genetic algorithm whose fitness function is determined using the mentioned criteria. For the feature discrimination criterion, we use the Dunn index. On the other hand, for feature orthogonality and independence, we use the covariance matrix and the ratio of the sum of its diagonal elements to the sum of its non-diagonal elements. We use these criteria to determine the genetic algorithm fitness function. Experiments on UCI datasets show that the proposed feature transformation performs better than or as well as other known linear transformation methods.

Keywords: Feature transformation; Covariance Matrix; Dunn index ; Genetic Algorithm.

Introduction

Classification contains three main steps [1]: feature extraction, classifier training using the extracted features, and evaluating and testing the classifiers. For the best classification performance, we should select a classifier that is appropriate for the data and for our pattern recognition problem. Obviously, a classifier may not be suitable for all data types and problems. In addition, the performance of classifiers highly depends on the selected feature set. The selected features should represent the main data properly, without redundancy, and also separate the data classes in a suitable way. Sometimes the extracted features do not have these properties completely. Thus, a linear or nonlinear transformation is applied to the features to make them more discriminative and independent and to decrease or increase the dimension of the feature space.
Principal component analysis (PCA) is a well-known
technique for feature transformation and dimensionality reduction. It represents a linear transformation
where the data is expressed in a new coordinate basis
that corresponds to the maximum variance direction
[2]. The idea behind PCA is to find a lower dimensional space in which shorter and better vectors are used to describe the features. By projecting data onto a linear subspace spanned by principal components, PCA achieves
dimension reduction with the minimal data reconstruction error. One limitation of PCA is that it does
not model nonlinear relationships among variables efficiently. Several generalizations of PCA have been
proposed to address this limitation. Two examples of
such approaches are: Nonlinear PCA (NLPCA) [3, 4]
and kernel PCA (KPCA) [2, 5]. NLPCA generalizes
the principal components from straight lines to curves.
This can be done using neural networks with an autoassociative architecture. In KPCA, a nonlinear map is
used to translate nonlinear structure of features into
linear ones in a feature space with a higher dimension.
After this, linear PCA is applied to mapped features
[6].
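As a brief illustration of the projection described above, the following Python sketch computes a linear PCA projection via the eigendecomposition of the sample covariance matrix; the function name and interface are our own illustrative choices, not part of the referenced methods.

import numpy as np

def pca_project(X, n_components):
    """Project data rows onto the principal components with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)                               # centre the data
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigval)[::-1]                      # largest-variance directions first
    return Xc @ eigvec[:, order[:n_components]]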
The linear discriminant analysis (LDA) is also a well-known feature transformation and dimension reduction technique. Using LDA, we obtain discriminant feature vectors that maximize the Fisher index, which is the ratio of the between-class scatter matrix to the within-class scatter matrix [7]. For computing the LDA transformation, we must solve a generalized eigenvalue problem. The key idea in LDA is to look for a direction that separates the class means well (when projected onto that direction) while achieving a small variance around these means. However, linear discriminant analysis does not necessarily lead to an optimal transformation, and LDA faces some problems. The first is that LDA ceases to work well when the classes are not linearly separable. The next concerns the distribution of each class: LDA is optimal only when each class has a Gaussian distribution [8]. Because these conditions rarely hold in real world problems, several generalizations of LDA have been proposed to address these limitations. In nonlinear LDA (NLDA), data is transformed non-linearly to a new space before standard linear discriminant analysis is applied. This nonlinear transformation can be performed using neural networks [9] or the kernel trick [7]; the kernel based method for LDA is known as kernel LDA (KLDA) [10]. In addition, evolutionary methods have been proposed to overcome the mentioned drawbacks of LDA [11-13].

In this paper, we propose a linear feature transformation for obtaining more discriminative and orthogonal features simultaneously. For the class discrimination criterion, we use the Dunn index. On the other hand, for more orthogonality and statistical independence, we use the covariance matrix and the ratio of the sum of its diagonal elements to the sum of its non-diagonal elements; in this manner, we look for a transformation that makes the covariance matrix diagonal. These criteria are used as the fitness function of a genetic algorithm in order to determine the transformation. The rest of the paper is organized as follows. Section 2 introduces the Dunn index. Section 3 includes the proposed genetic algorithm and its fitness function. Section 4 contains our experimental results. Finally, we conclude the paper in Section 5.

Cluster Validity Measurement Techniques

To evaluate the results of clustering algorithms, several clustering validity techniques and indices have been developed. The main disadvantage of these validity indices is that they cannot measure arbitrarily shaped clusters, as they usually choose a representative point from each cluster, calculate the distances between the representative points, and compute some other parameter based on these points (for example, the variance). The process of evaluating the results of a clustering algorithm is called cluster validity assessment. Two measurement criteria have been proposed for evaluating and selecting an optimal clustering scheme [14]:

Compactness: The members of each cluster should be as close to each other as possible. A common measure of compactness is the variance.

Separation: The clusters themselves should be widely separated. There are three common approaches to measuring the distance between two different clusters: the distance between the closest members of the clusters, the distance between the most distant members, and the distance between the centres of the clusters.

Obviously, these measures can also be used for classes when their separation and compactness before and after a transformation are analysed. One of the best known measures is the Dunn index.

2.1 Dunn Indices

The Dunn index is a validity index, defined by eq. (1), which identifies compact and well-separated classes for a specific number of classes [15]:

D_{nc} = \min_{i=1,...,n_c} \min_{j=i+1,...,n_c} \left( \frac{d(c_i, c_j)}{\max_{k=1,...,n_c} diam(c_k)} \right)    (1)

where n_c is the number of classes and d(c_i, c_j) is the dissimilarity function between two classes c_i and c_j, defined by eq. (2):

d(c_i, c_j) = \min_{x \in c_i, y \in c_j} d(x, y)    (2)

and diam(c) is the diameter of class c, a measure of the dispersion of the class. The diameter of a class c_i can be defined as:

diam(c_i) = \max_{x, y \in c_i} d(x, y)    (3)

It is clear that if the dataset contains compact and well-separated classes, the distance between the classes is expected to be large and the diameter of the classes is expected to be small. Thus, based on the Dunn index definition, we can conclude that large values of the index indicate the presence of compact and well-separated classes.
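For concreteness, the following Python sketch evaluates eqs. (1)-(3) directly from labelled data with Euclidean distances; it is a straightforward, unoptimized reading of the definitions rather than the authors' implementation.

import numpy as np

def dunn_index(X, labels):
    """Dunn index of eq. (1): minimum inter-class distance over maximum class diameter."""
    classes = np.unique(labels)
    def pairwise_min(A, B):                       # d(ci, cj) of eq. (2)
        return min(np.linalg.norm(a - b) for a in A for b in B)
    def diameter(A):                              # diam(ci) of eq. (3)
        return max(np.linalg.norm(a - b) for a in A for b in A)
    max_diam = max(diameter(X[labels == c]) for c in classes)
    d_min = min(pairwise_min(X[labels == ci], X[labels == cj])
                for i, ci in enumerate(classes) for cj in classes[i + 1:])
    return d_min / max_diam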


Proposed Method

In the proposed method we use a genetic algorithm for determining the transformation matrix. We define a fitness function based on two criteria: class discrimination, and feature orthogonality together with statistical independence. For class discrimination we use the Dunn index, and for orthogonality we use the diagonality of the covariance matrix.

3.1 Transformation Matrix Computation Using Genetic Algorithm

The Genetic Algorithm is one of the meta-heuristic optimization techniques, alongside simulated annealing, tabu search and evolutionary strategies. GA has been demonstrated to converge to the optimal solution for many diverse and difficult problems as a powerful, stochastic tool based on principles of natural evolution [16]. The details of our implementation of GA are described as follows:

Algorithm 1: Genetic Algorithm
Input: Training data
Output: Useful transformation matrix
Step 0: Initialize parameters (e.g. population size, crossover rate, mutation rate and the maximum number of generations).
Step 1: Create the initial population randomly (P(0)).
Step 2: Evaluate the current population (compute the fitness of all chromosomes).
Step 3: While the termination condition is not satisfied, do steps 4-8.
Step 4: Select P(t+1) from P(t) [perform selection].
Step 5: Recombine P(t+1) [perform mutation and crossover].
Step 6: Evaluate the current population (compute the fitness of all chromosomes).
Step 7: t = t + 1.
Step 8: Go to Step 3.
Step 9: End.

Algorithm 1: Pseudo code for GA

3.2 Construction Of The Hypothesis Space

The first step in GAs is to define the encoding that describes any potential solution as a numerical vector. We use a vector of floats to express an individual code. Each element of this vector lies in the interval [-1, 1]. The length of the individuals in the hypothesis space is d^2, where d is the number of features. So, each transformation matrix is represented by a vector with d^2 elements.

3.3 Selection Operator

We use the tournament selection method [17]. In tournament selection, two individuals are chosen from the population randomly and the individual with better fitness is selected for the next population. We also use elitism to transfer individuals directly from the previous generation to the next generation without any change: 4% of the previous generation, those with the best fitness values, are transferred directly to the next generation.

3.4 Recombination

The role of the crossover operation is to create new individuals from old ones. Crossover is usually a probabilistic process that exchanges information between some (usually two) parent individuals in order to generate new child individuals. We use scattered crossover in this paper.

3.5 Mutation Operator

We use Gaussian mutation to increase the performance of the genetic algorithm. Gaussian mutation adds a random number taken from a Gaussian distribution with mean 0 to each element of the parent matrix. Then, we normalize the feature vector elements to the interval [-1, 1].

3.6 Fitness Function

The role of the fitness function is to measure the quality of solutions. In our method each chromosome is a transformation matrix. For its evaluation we use a fitness function which considers both class discrimination and feature orthogonality, using the Dunn index

and the diagonality of the covariance matrix, respectively. For this purpose, we first define two different functions. For the diagonality of the covariance matrix, and thus for feature orthogonality, we define the following function:

f_1(w) = \frac{\sum_{i=1}^{d} \sum_{j=1, j \neq i}^{d} C_w(i,j)}{\sum_{i=1}^{d} C_w(i,i)}    (4)

where d is the feature vector dimension and C_w is the covariance matrix of the features transformed by the transformation matrix W. Thus, we obtain the ratio of the sum of the non-diagonal elements of the covariance matrix to the sum of its diagonal elements. We want to minimize this function; therefore, the sum of the diagonal elements of the covariance matrix should be greater than the remaining elements. In the ideal case, the sum of the non-diagonal elements of the covariance matrix is small and close to zero; in such a case the covariance matrix is diagonal and the features are completely statistically independent and orthogonal. In this case f_1 tends to zero.
For the class discrimination, we use the Dunn index to define the suitable function and measure. As mentioned earlier, this index shows the discrimination between classes. Hence, we use eq. (1) as the desired function:

f_2(w) = \min_{i=1,...,n_c} \min_{j=i+1,...,n_c} \left( \frac{d(c_{i,w}, c_{j,w})}{\max_{k=1,...,n_c} diam(c_{k,w})} \right)    (5)

where c_{i,w} and c_{j,w} denote the features of classes c_i and c_j transformed by the transformation matrix w. In order to consider both measures in the fitness function, we define the fitness function as:

f_{final}(w) = \frac{f_2(w)}{f_1(w)}    (6)

In this way, we maximize f_final and so we maximize f_2 and minimize f_1 simultaneously.

Evaluation Of Proposed Approach

To evaluate the suggested method, we use six datasets from the UCI database [18]: Iris, CMC, Glass, Ionosphere, Tae and Vowel. Their characteristics are shown in Table 1.

Dataset      Attributes   Samples   Classes
Iris         4            150       3
Tae          5            151       3
Glass        9            214       6
Ionosphere   34           351       2
Vowel        13           990       11
CMC          9            1473      3

Table 1: The characteristics of the utilized datasets

In this experiment, we use 4 different classifiers to evaluate the performance of the different pre-processing methods. These classifiers are implemented using Weka [19]: BayesNet (BN), Radial Basis Function Network (RBFN), Instance-Based (IB1) and Multi Layer Perceptron (MLP). We use 66% and 34% of each dataset as training and test set, respectively. In the genetic algorithm, the initial population size is 100 and the maximum number of generations is 150. In Table 2 we show the accuracy of each classifier with and without feature mapping methods. For each classifier, the accuracy improvement is measured and compared to the original case (Normal), where no pre-processing is applied to the dataset. COV and DI denote the transformations obtained with the genetic algorithm when the fitness function is chosen as eq. (4) and eq. (5), respectively, and DI-CV indicates the proposed method, where the transformation matrix is obtained using the fitness function of eq. (6). As shown in Table 2, most of the time COV provides the highest improvement in classifier accuracy across different types of classifiers and datasets compared with PCA. Moreover, the method that uses the Dunn index as the fitness function of the GA (DI) is also often more successful than LDA. The DI-CV method is in most cases more successful than both DI and COV. According to the table, in most cases the proposed method (DI-CV) outperforms the other methods on the different datasets. This is due to considering both class discrimination and feature orthogonality, simultaneously.
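A minimal Python sketch of the fitness evaluation of eqs. (4)-(6) is given below, reusing the dunn_index function sketched in Section 2; the direction in which the candidate matrix W is applied to the data, and the reshaping of a chromosome into W, are our own assumptions rather than details stated in the paper.

import numpy as np

def f1(W, X):
    """Eq. (4): sum of off-diagonal elements of the transformed-feature covariance
    divided by the sum of its diagonal elements (plain sums, as in the paper)."""
    C = np.cov(X @ W.T, rowvar=False)
    return (C.sum() - np.trace(C)) / np.trace(C)

def f_final(W, X, labels):
    """Eq. (6): f2 / f1, with f2 the Dunn index of the transformed features (eq. (5))."""
    return dunn_index(X @ W.T, labels) / f1(W, X)

# A GA chromosome is a length d*d vector in [-1, 1]; one way to obtain W from it:
# W = np.asarray(chromosome).reshape(d, d)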


Table 2: Classification accuracy with pre-processing methods. The bests are bold-underline, the seconds are bold, and the thirds are underline.

Classifier   Dataset      Normal   PCA     LDA     COV     DI      DI-CV
BayesNet     Iris         90.19    96.07   96.07   96.07   98.03   98.03
BayesNet     CMC          48.80    46.21   50.99   48.50   44.02   50.99
BayesNet     Glass        61.64    58.90   53.42   67.12   69.86   65.75
BayesNet     Ionosphere   95       95      93.33   76.66   94.16   93.33
BayesNet     Tae          41.17    33.33   50.98   37.25   39.21   43.13
BayesNet     Vowel        36.58    41.55   28.35   45.45   47.40   45.02
RBFN         Iris         96.07    98.03   98.03   98.03   98.03   98.03
RBFN         CMC          47.21    53.18   49.40   52.89   52.52   53.38
RBFN         Glass        68.49    78.08   64.38   65.75   67.12   67.12
RBFN         Ionosphere   96.66    95.83   92.5    97.5    97.5    98.33
RBFN         Tae          37.25    43.13   50.89   43.13   45.02   50.98
RBFN         Vowel        49.78    43.07   27.72   52.16   51.51   50.21
IB1          Iris         94.11    94.11   94.11   98.03   98.03   98.03
IB1          CMC          45.21    42.82   45.02   45.91   49.40   45.81
IB1          Glass        65.75    71.23   69.86   72.60   75.34   75.34
IB1          Ionosphere   90       87.5    87.5    91.66   91.66   91.66
IB1          Tae          41.17    37.25   45.09   41.17   43.13   41.17
IB1          Vowel        49.78    45.02   35.06   59.95   52.38   57.35
MLP          Iris         96.07    96.07   96.07   96.07   96.07   96.07
MLP          CMC          50.79    50.59   52.19   55.10   55.17   55.37
MLP          Glass        67.12    61.64   64.38   73.97   69.86   73.97
MLP          Ionosphere   90.83    90      91.66   97.5    94.16   96.66
MLP          Tae          29.41    39.21   47.05   50.98   47.05   45.09
MLP          Vowel        46.10    47.61   43.29   55.62   54.54   54.97

Conclusion

In this paper, we proposed a feature transformation method using a genetic algorithm which considers both feature orthogonality and class discrimination measures simultaneously. For the class discrimination measure, we use the Dunn index. On the other hand, for feature orthogonality and statistical independence, we use the diagonality of the covariance matrix as the measure; to test diagonality, we compute the ratio of the sum of the non-diagonal elements of the covariance matrix to the sum of its diagonal elements. We define the genetic algorithm fitness function based on these two measures. Experimental results using different classifiers on UCI datasets show that the proposed method outperforms or performs as well as known feature transformation methods such as principal component analysis and linear discriminant analysis.

References
[1] R.O. Duda, P.E. Hart, and D. Stork, Pattern classification, second edition, Wiley, 2001.
[2] A. Lima, C. Delage, Y. Nankaku, C. Miyajima, K.


Tokuda, and T. Kitamura, On the use of kernel PCA for
feature extraction in speech recognition, In Proceedings of
Eurospeech E87-D (2003), 28022811.
[3] M.A. Kramer, Nonlinear Principal Component Analysis
using Autoassociative Neural Networks, AIchEJournal 37
(1991), 233 243.
[4] E.C. Malthouse, Limitations of Nonlinear PCA as performed with generic Neural Networks, IEEE Transactions
on Neural Networks 9 (1998 ), no. 1, 165173.
[5] T. Takiguchi and Y. Ariki, Robust Feature Extraction Using Kernel PCA, 2006, pp. I509I512.
[6] H. Abbasian, B. Nasersharif, and A. Akbari, Genetic Programming Based Optimization of Class-Dependent PCA for
Extracting Robust MFCC, 2008, pp. 15411544.
[7] R.O. Duda and P. E. Hart, Pattern Classification and
Scene Analysis, New York: Wiley, 1973.
[8] M. Pohar, M. Blas, and S. Turk, Comparison of logistic regression and linear discriminant analysis 1 (2004 ),
no. 1, 143161.
[9] P. Somervuo, Experiments With Linear And Nonlinear
Feature Transformations In HMM Based Phone Recognition, 2003, pp. 5255.
[10] S. Mika, G. Ratsch, J. Weston, B. Scholkopf, and K.
Muller, Fisher Discriminant Analysis with Kernels, 1999.



[11] H. Moeinzadeh, M. Mohammadi, A. Akbari, and B.


Nasersharif, Robust speech recognition using evolutionary
class-dependent LDA, 2009, pp. 21092114.


[12] M. Mohammadi, B. Raahemi, A. Akbari, B. Nasersharif,


and H. Moeinzadeh, Improving linear discriminant analysis with artificial immune system-based evolutionary algorithms, Information Sciences, , In press ( 2011 ).
[13] H. Moeinzadeh, B. Nasersharif, A. Rezaee, and H.
Pazhoumand-dar, Improving Classification Accuracy Using Evolutionary Fuzzy Transformation, 2009.
[14] F. Kovcs, C. Legny, and A. Babos, Cluster Validity Measurement Techniques, 2003.

[15] J.C. Dunn, Well Separated Clusters and Optimal Fuzzy


Partitions, Journal of Cybernetics ( 1974 ), 95104.
[16] M. Gen and Y. Yun, soft computing approach for reliability optimization: state of the art survey, 2006, pp. 1008
1026.
[17] A. Eiben and J.E. Smith, Introduction to Evolutionary
Computing, second edition, 2007.
[18] http://archive.ics.uci.edu/ml/datasets.html.
[19] http://www.cs.waikato.ac.nz/ml/weka.


Evaluating the performance of energy aware tag anti collision protocols in RFID systems
Milad Haj Mirzaei

Science and Research branch, Islamic Azad University, Yazd, Iran


Department of Computer Engineering
Hmirzaei.m@gmail.com

Masoud Ghiasbeigi
Mahalat branch, Islamic Azad University, Mahalat, Iran
Department of Computer Engineering
I@Merkousha.net

Abstract: Tag collision is one of the biggest challenges in RFID systems, and several protocols have been proposed in the literature to address this issue. Our objective in this paper is to introduce these energy aware tag anti collision protocols and evaluate them. Our comparison is based on the messages transferred between tag and reader.

Keywords: RFID; energy aware; anti collision; tag collision.

Introduction

RFID (Radio Frequency Identification) dates back to 1948 [1]; the technology was first used in WWII by the Allied armed forces to distinguish friendly from enemy aircraft and tanks, in a system called IFF (Identify Friend or Foe) [2]. Today RFID has a significant role in fields such as supply chain management, agriculture, military, healthcare, pharmaceuticals and retail. This technology uses radio frequency communication to retrieve data. The main RFID components are the reader, including an antenna, which is the device used to read or write data to RFID tags, and the tag, which is a device with an integrated circuit on which the reader acts. A tag can obtain its energy from the signals it receives from the reader, in which case it is called a passive tag, or from its own battery supply (an active tag). There is also the semipassive tag, which uses a battery supply to power itself and the received signal energy from the reader to transmit data. The technology is used to track inventory, for object identification and more. Data is written on the tags, which are attached to objects so that the data can be read rapidly and automatically.

A typical RFID system consists of some tags, one or more readers, and a computer to integrate and process the data received from the readers. On the system's request, the reader sends a query to receive the information of the tags in its interrogation zone and then transfers the results to the computer for processing. One of the most important RFID issues is tag collision: when more than one tag transmits its ID to the reader simultaneously, a collision occurs and the received signals are usually corrupted. RFID therefore suffers from incorrectly received signals due to collisions; it is reported that in typical RFID deployments the tag read rate is usually only about 60-70% [3]. To address this issue, protocols called tag anti collision protocols have been proposed. These protocols allow all tags in the interrogation zone of the reader to be identified successfully. In this paper we introduce and evaluate some of the research on tag collision (tag anti collision protocols), and our focus is on the energy consumption.

The rest of this paper is organized as follows: section 2 overviews related research works; in section 3 we introduce energy aware tag anti collision protocols, and we evaluate them in section 4; section 5 concludes the paper.

Related research works

Past related research focused on identifying tags successfully using anti collision protocols. The importance of efficient energy consumption has been considered in the combination of RFID with networks such as ZigBee [4], wireless sensor networks [5,6] and other systems. In this field some papers present new approaches to optimize the energy consumption (discussed in section 3), and many papers analyse tag anti collision approaches with mathematical models from the energy consumption point of view [7-10]. Energy aware protocols have not appeared in classifications of tag anti collision protocols until now, except in [11]. When we do not have a sufficient or unlimited energy supply, we can use these protocols as tag anti collision protocols; this is the reason why we consider this group of protocols.

Figure 1: QT Process for three tags. Tags are read in white circles. A=00000, B=00101, C=01001

Energy Aware Tag Anti collision protocols

These schemes are based on three main families of protocols that address the tag collision issue: Aloha-based, counter-based and tree-based protocols. In this part we first discuss QT (Query Tree) [12], because it is the basis of our protocols.

Table 1: QT process for three tags. R: Respond, NR: No Respond. A=00000, B=00101, C=01001

No.  Query  A    B    C    Result
1    Null   R    R    R    Collision
2    0      R    R    R    Collision
3    00     R    R    NR   Collision
4    000    R    NR   NR   Read A
5    001    NR   R    NR   Read B
6    01     NR   NR   R    Read C
7    1      NR   NR   NR   No Reply

3.1 Query Tree (QT)

In this approach the reader first broadcasts a request bit string S to the tags. Each tag whose ID prefix matches S responds to the reader by sending its whole ID. If exactly one tag responds, the tag is identified correctly; otherwise, when more than one tag responds simultaneously, a collision occurs. In this case the reader sends a longer bit string that has one more bit than the previous string: usually the reader appends 0 or 1 to S, giving S0 or S1 (0 is normally tried first). The tags thus divide into two subgroups, those starting with S0 and those starting with S1. This is repeated until only one tag matches the string S and is identified correctly. The delay of this approach depends on the ID length. For instance, suppose we have three tags with IDs 00000, 00101 and 01001. The result of running this protocol can be observed in Figure 1, and more details are given in Table 1. In this example the reader needs to send 7 query messages, and the total number of tag responses is 11.

In this part we discuss four energy aware tag anti collision protocols that have been modified and optimized to reduce the energy consumption.
3.2 Improved QT

This approach modifies QT to improve its performance [13, 14]. In the previous approach all the tags send their complete ID to the reader in the collision case, but in this approach, as soon as a collision is detected by the reader, the reader sends a message to the tags to stop sending their IDs. This signal is not a 0 or 1 symbol. This reduces the number of bits that the tags send in the collision case. In this approach the reader needs one clock cycle to detect the collision and one clock cycle to send the stop signal, so the tags send only 3 bits in a collision. The processing steps of this approach are similar to Figure 1 and Table 1, but with 8 tag responses using fewer bits. Although the reader sends more bits in this case, the reader's energy consumption does not matter here, because the reader uses AC mains as its power source.
The next three protocols are combinations of the QT protocol and the frame slotted Aloha protocol, designed to reduce

reader requests and tag responses, which is important for optimizing the energy consumption.

3.3 Multislotted Scheme (MS)

The MS approach [15, 16] acts as follows: at every node of the binary tree, F slots, together called a frame, are used to read the tag responses. Each tag randomly selects a slot in which to respond. If all tags with the prefix of the node are read successfully within the F slots without collision, the subtrees of that node are not queried further. If there is at least one collision in the responses, the subtrees of that node are queried as before, and so on. In these slots some tags may be read correctly while others cannot be read because of collisions (tags which respond in the same slot). In this case, tags in the subtree which have already been identified will still respond to the reader's query, because the reader has no way of telling the tags to stop responding.
The binary tree shown in Figure 2 contains three tags A, B and C. In this example we set F = 4, and the operation of the approach can be observed on the two sides of each node. Here, as in the previous protocol, the reader first sends a null query to request the tags' IDs. At the first node, tag B responds in a unique slot and is read successfully, but the responses of A and C collide because they have selected the same slot. The reader has no idea which tags responded, so it must repeat the request in the next subtree. In the next query, with the string 0, the responses of both A and B collide. Note that although B has been identified at the previous node, it responds again because there is no sleep command. At the third level the reader sends the query string 00; the IDs of tags A and B match this query, their responses do not collide, and they are identified successfully. Since there is no collision, the reader does not need to query that subtree and continues by sending the query 01; thus at the fourth level C is identified. In this protocol the reader sends 5 request messages and the tags send 9 responses.
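The random slot selection within a frame can be illustrated with the short Python sketch below; the frame size and random seed are illustrative assumptions of ours.

import random

def ms_frame(tags, F, rng):
    """One MS frame at a tree node: each responding tag picks one of the F slots
    uniformly at random; tags alone in their slot are read, shared slots collide."""
    slots = {}
    for tag in tags:
        slots.setdefault(rng.randrange(F), []).append(tag)
    read = [ts[0] for ts in slots.values() if len(ts) == 1]
    collided = any(len(ts) > 1 for ts in slots.values())
    return read, collided

# Example: all three tags respond at the root node with F = 4.
print(ms_frame(["00000", "00101", "01001"], 4, random.Random(1)))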

Figure 2: MS Process for three tags. A=00000, B=00101, C=01001

3.4 Multislotted scheme with selective sleep (MSS)
sleep (MSS)

This scheme [15, 16] is operationally similar to the MS scheme discussed above, with the addition of a selective sleep command. Here, tags which are read at a node of the tree within a collision frame (a frame with at least one collision) are put to sleep by the reader. There is no need to send a sleep command if the tag responses do not collide at a node. The sleep command does not reduce the reader requests directly, because the reader does not know which prefixes the collided tags had, but it reduces the tag responses, thereby reducing collisions indirectly, so that finally the queries are also reduced. Of course, the communication and time costs should be considered. The process of MSS for the three tags A, B and C with the IDs mentioned before is shown in Figure 3. In this scheme the reader sends 3 queries plus a sleep command for tag B, and there are 5 tag responses.

Figure 3: MSS Process for three tags. A=00000, B=00101, C=01001

3.5 Multislotted scheme with assigned slots (MAS)

The two previous protocols were designed to reduce the reader queries and tag responses, but a single collision in the F slots was enough to force the reader to search all of the slots, because it assumes that all slots have collisions. This is because the reader has no idea about the prefixes of the collided tags; tags select a slot randomly to respond. In this protocol [15-17] a scheme called MAS is introduced which uses a structured assignment of slots to nodes of the tree at different levels.
The reader chooses a frame size F as before, but now such that F = B^d, where d is a positive integer (note that we require F >= B now; also log_B F = d). Each node u at level L of the tree (with F slots each) then allocates one slot to each node v in its subtree at level L + log_B F. Thus the tags whose prefix matches that of node v will respond to the reader's interrogation at node u by transmitting on the slot assigned to them at node u. Each tag knows the slot it must transmit on as follows: based on the query prefix q at node u, tags will transmit on slot qxx...x, where the x's are the log_B F bits after the prefix q in the tag IDs. Thus, all tags with prefix qxx...x will transmit with their slots determined by the value of xx...x. Consider an example with B = 2, F = 4, and q = 00; that is, the query is at a node u corresponding to 00 with L = 2 on the binary tree. u has four nodes (the v nodes of the above description) in its subtree at level L + log_B F = 2 + 2 = 4. Each of these nodes v will then have the tags matching its prefix reply on the corresponding slot at node u. The number of x's will be log_B F = 2. Thus, each of the v nodes will correspond to prefixes q00, q01, q10, and q11. The tags with prefix q00 will transmit on slot 1 (in the frame at u), tags with prefix q01 will transmit on slot 2, and so on. In the protocol, starting at level L = 0, queries are sent only at levels 0, log_B F, 2 log_B F, 3 log_B F, and so on.
In Figure 4 the tags A, B and C mentioned before are used again. First, at the root node, the reader sends a query with q = null (the empty string); then the tags, depending on their first two bits, respond in one of the four frame slots. In the first query, tags A and B, which start with 00, respond in slot 1 and C responds in slot 2. C is read successfully, but in the first slot, with prefix 00, a collision occurs. This scheme assigns a node at level 2 to each slot, with addresses q00, q01, q10 and q11; thereby each node is assigned a slot. Because the collision is in slot 1, the reader repeats its query with prefix q00 at level 2, and this continues until A and B are read correctly. In this scheme we have 2 request messages and 5 tag responses.
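The structured slot assignment can be illustrated for the binary case (B = 2) with the following Python sketch; the function name and the 1-indexed slot numbering are our own illustrative choices.

def mas_slot(tag_id, q, F):
    """Slot (1-indexed) on which a tag with this ID replies to query prefix q,
    for a binary tree (B = 2) and frame size F = 2**d."""
    d = F.bit_length() - 1                 # log2(F), assuming F is a power of two
    assert tag_id.startswith(q)
    x = tag_id[len(q):len(q) + d]          # the log2(F) bits following the prefix q
    return int(x, 2) + 1

# Example from the text: B = 2, F = 4, q = "" at the root.
for tag in ["00000", "00101", "01001"]:
    print(tag, "-> slot", mas_slot(tag, "", 4))
# A (00...) and B (00...) share slot 1 and collide; C (01...) replies alone on slot 2.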

Figure 4: MAS Process for three tags. A=00000, B=00101, C=01001

Evaluation

We used C# to study the performance of the aforementioned algorithms. The evaluation is based on the messages transferred between tags and reader; in this paper our focus is on the tag responses. By reducing the number of messages exchanged among the RFID components, the energy needed to transmit messages decreases, and hence the RFID system network traffic is reduced. Tag IDs are 64 bits in length and were generated randomly. We assume a noise-free channel, so packet losses are due to collisions only. Our assumptions are: in Improved QT each STOP command has the same length as a reader query command, and in the collision case, after receiving STOP from the reader, a tag sends back its ID with a response message length 0.1 times the regular response message length; in MSS each sleep command has the same length as a reader query command.
First we calculate the optimized frame size F for the MS, MSS and MAS protocols for each situation with 100, 200, 300, ..., 1000 tags, and we use these values of F in the evaluation of those protocols. The evaluation results are shown in Figure 5; each simulation is repeated 10 times.

Figure 5: Tag responses in the QT, IQT, MS, MSS and MAS protocols for 100 to 1000 tags in the interrogation zone of a reader

Conclusion

We can see that QT, as the base protocol, has the worst performance and MAS has the best. For 100 to 500 tags QT and IQT have approximately the same performance, but in situations with more tags QT performs better. MS and MSS perform similarly here.

References
[1] H. Stockman, Communication by Means of Reflected
Power, Proc. IRE 35 (1948), 1196-1204.
[2] G. Roussos, Networked RFID, systems, software and services, Springer-Verlag London Limited, Chapter 1, pages:
7, 2008.


[3] S. R. Jeffery, M. Franklin, and M. Gaorfalakis, An adaptive


RFID middleware for supporting metaphysical data independence, The VLDB Journal 17(3) (2008), 265-289.
[4] P. Medagliani, G. Ferrari, and M. Marastoni, Hybrid ZigbeeRFID Networks for Energy Saving and Lifetime Maximization Remote Instrumentation and Virtual Laboratories, Springer US, Chapter 7, pages: 473-491, 2010.
[5] J. Sung, .T Sanchez lopez, and D. Kim, The EPC sensor
network for RFID and WSN integration infrastructure Pervasive Computing and Communications Workshops, Fifth
Annual IEEE International Conference PerCom Workshops 07 (2007), 618-621.
[6] L. Zhang and Z. Wang, Integration of RFID into wireless
sensor networks: Architectures, opportunities and challenging problems, 5th International Conference on Grid and Cooperative Computing Workshops (GCCW06) (2006), 463469.
[7] N. Pastos and R. Viswanathan, A modified grouped-tag
TDMA access protocol for radio frequency identification
networks, IEEE Wireless Communications and Networking
Conference 2 (200).
[8] X. Yan and X. Liu, Evaluating the energy consumption
of the RFID tag collision resolution protocols, Springer
Netherlands (2011), 1-8.
[9] A. Ruiz, K. Dheeraj, Klair, and K. Chin, A Simulation
Study on the Energy Efficiency of Pure and Slotted Aloha
based RFID Tag Reading Protocols, Consumer Communications and Networking Conference, CCNC IEEE 6 (2009),
1-5.


[10] D.K. Klair, K-W. Chin, and R. Raad, An Investigation


into the Energy Efficiency of Pure and Slotted Aloha Based
RFID Anti- Collision Protocols, IEEE Intl Symp. a World
of Wireless, Mobile and Multimedia Networks (WoWMoM
07) (2007), 1-4.
[11] Y. Zhangi, RFID AND SENSOR NETWORKS Architectures Protocols, Security and Integrations, Taylor and Francis Group, LLC, Chapter 1,2, pages: 1-58, 2010.
[12] C. Law, K. Lee, and K.Y. Siu, Efficient Memoryless Protocol for Tag Identification, Discrete Algorithms and Methods
for MOBILE Computing and Comm (2000), 75-84.
[13] F. Zhou, D. Jin, C. Huang, and M. Hao, Optimize the power
consumption of passive electronic tags for anti-collision
schemes, 5th International Conference on ASIC 2 (2003),
1213-1217.
[14] F. Zhou, D. Jin, C. Huang, and H. Min, Evaluating and optimizing power consumption of anti-collision protocols for
applications in RFID systems, ISLPED04: Proceedings of
the 2004 International Symposium on Low Power Electronics and Design (2004), 357=362.
[15] V. Namboodiri and L. Gao, Energy-aware tag anti-collision
protocols for RFID systems, PerCom 07: Fifth Annual
IEEE International Conference on Pervasive Computing
and Communications (2007), 23-36.
[16] V. Namboodiri and L. Gao, Energy-aware tag anti-collision
protocols for RFID systems, IEEE transaction on mobile
computing 9 (2010), 44-59.
[17] K. Wu and Y. Liu, A New Energy-Aware Scheme for RFID
System Based on ALOHA, Second International Conference
on Future Networks (2010), 149-152.

GPS GDOP Classification via Advanced Neural Network Training


H. Azami

Iran University of Science and Technology


Department of Electrical Engineering
hamed azami@ieee.org

S. Sanei
University of Surrey, UK
Department of Computing, Faculty of Engineering and Physical Sciences
s.sanei@surrey.ac.uk

H. Alizadeh
Iran University of Science and Technology
Department of Computer Engineering
h.alizadeh.iust@gmail.com

Abstract: Geometry dilution of precision (GDOP), a geometrically determined factor that describes the effect of geometry on the relationship between measurement error and position determination error, plays a very important role in the total positioning accuracy. The calculation of the GPS GDOP is a time and power consuming task, since it requires solving measurement equations with complicated matrix transformations and inversions. In order to reduce this calculation burden, this paper presents satellite geometry classification for selecting a good navigation satellite subset, based on advanced training algorithms, namely the Levenberg-Marquardt (LM) and modified LM algorithms, used to train a feed-forward neural network (NN), together with principal component analysis (PCA). LM and modified LM are powerful algorithms that can train an NN rapidly. PCA is used as a pre-processing step to create uncorrelated and informative features of the GPS GDOP. Simulation results show that these methods converge more efficiently to the optimal value in GPS GDOP classification.

Keywords: GPS GDOP, classification, principal component analysis, Levenberg-Marquardt (LM) and modified LM.

Introduction

A satellite-based radio navigation system that provides three-dimensional position and time using at least four satellites orbiting the earth is called the global positioning system (GPS) [1,2]. The effect of geometry on the relationship between measurement errors and the position determination errors obtained by GPS satellites is described by the dilution of precision (DOP) or geometric DOP (GDOP) [3,4]. A subset of satellites with a GPS GDOP value of less than 2 is ideal, i.e. the location measure attained by these satellites is very reliable, while a higher GPS GDOP value indicates poor satellite positioning and an inferior measurement configuration. Table 1 shows the GPS GDOP ratings.

Table 1: GPS GDOP ratings

Class number   GDOP value   Rating
Class 1        1            Ideal
Class 2        2-3          Excellent
Class 3        4-6          Good
Class 4        7-8          Moderate
Class 5        9-20         Fair
Class 6        21-50        Poor


The most common approach to obtain the GPS GDOP is to calculate the inverse matrix for all combinations of satellites and choose the minimum one, which is a very time consuming approach. There are two approaches to overcome the computational burden, namely regression/approximation and classification of GPS GDOP data by computational intelligence methods such as neural networks (NNs), support vector machines (SVMs), and so on.
In order to select the optimal subset of satellites, GPS GDOP approximators are used for computing the value, while a GPS GDOP classifier is used to select one of the acceptable subsets of satellites for navigation [5]. In [6] Hsu proposed a method based on the volume of the tetrahedron formed by four user-to-satellite vectors. However, it is not universally accepted because it does not guarantee optimum selection of satellites [5]. In order to estimate and classify the GPS GDOP, Simon and El-Sherief first extracted a set of features that includes the traces of the measurement matrix and of its second and third powers, and the determinant of the matrix. Then, for computational efficiency, they used the basic back propagation NN (BPNN) to classify and approximate the GPS GDOP [5].
However, in many cases, including GPS GDOP classification, the BPNN has deficiencies such as slow convergence, a tendency to fall into local minima, and sensitivity to sudden peaks in the signal trend during the learning process. To overcome these problems, Jwo and Lai [5] proposed using basic BP with momentum, the optimal interpolative (OI) network, the probabilistic NN (PNN) and the general regression NN (GRNN) to classify the GPS GDOP. To improve the accuracy of GPS GDOP classification and reduce the computation time, in this paper we propose an approach based on principal component analysis (PCA) and the Levenberg-Marquardt (LM) algorithm to classify the GPS GDOP.
PCA is a useful statistical technique that is widely used in many applications such as face recognition and image compression, and is a common technique for finding patterns in high dimensional data. PCA performs a covariance analysis between coefficients and finds the projection directions corresponding to the largest data variation. These directions are determined by the eigenvectors of the covariance matrix corresponding to the largest eigenvalues [2]. The uncorrelated features obtained by PCA are well suited for classification with an NN.
The paper is organized as follows. The background knowledge for the proposed method, including the concept of GDOP and the classifiers used here, is briefly presented in section 2. Section 3 introduces the proposed methods for GPS GDOP classification. Then, the computer simulation results are discussed in section 4. Finally, conclusions are given in section 5.

Background Knowledge for the Proposed Method

2.1 Geometry dilution of precision

Basically, the GPS accuracy relies on the GDOP. Figure 1 shows the geometry of the satellites and its effect on the GDOP values.

Figure 1: Satellites diagram and its relation with DOP: (a) Bad DOP and (b) Good DOP

The absolute distance between a user and a satellite is defined as follows [1]:

R_i = \rho_i + \delta_{iono,i} + \delta_{trop,i}    (1)

where

\rho_i = \sqrt{(X_i - X_u)^2 + (Y_i - Y_u)^2 + (Z_i - Z_u)^2} - c\,t_u    (2)

\delta_{iono,i} and \delta_{trop,i}, the errors induced by the ionospheric and the tropospheric propagation, are calculated from a model; (X_u, Y_u, Z_u, t_u) are the four system unknowns, t_u is the correction the receiver has to apply to its own clock, and c is the speed of light. To resolve this system we need four equations, which means four pseudo-ranges from four different satellites. The pseudo-ranges can be approximated by a Taylor expansion. We obtain:

\hat{\rho}_i = \sqrt{(X_i - \hat{X}_u)^2 + (Y_i - \hat{Y}_u)^2 + (Z_i - \hat{Z}_u)^2} - c\,\hat{t}_u    (3)

The Taylor expansion at the first order is:

\Delta\rho_i = \rho_i - \hat{\rho}_i = a_{xi}\,\Delta x_u + a_{yi}\,\Delta y_u + a_{zi}\,\Delta z_u - c\,\Delta t_u    (4)

where

a_{xi} = \frac{X_i - \hat{X}_u}{\hat{r}_i};\quad a_{yi} = \frac{Y_i - \hat{Y}_u}{\hat{r}_i};\quad a_{zi} = \frac{Z_i - \hat{Z}_u}{\hat{r}_i};\quad \hat{r}_i = \sqrt{(X_i - \hat{X}_u)^2 + (Y_i - \hat{Y}_u)^2 + (Z_i - \hat{Z}_u)^2}    (5)


Let N_sat be the number of visible satellites. The matrix H is as follows:

H = \begin{bmatrix} a_{x1} & a_{y1} & a_{z1} & 1 \\ a_{x2} & a_{y2} & a_{z2} & 1 \\ \vdots & \vdots & \vdots & \vdots \\ a_{xN_{sat}} & a_{yN_{sat}} & a_{zN_{sat}} & 1 \end{bmatrix}    (6)

Let us define the G matrix:

G = (H^T H)^{-1}    (7)

The GDOP is:

GDOP = \sqrt{trace[G]}    (8)
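Equations (5)-(8) translate directly into a few lines of Python; the sketch below assumes that satellite and user positions are given in the same Cartesian (e.g. ECEF) frame and is meant only as an illustration of the matrix computation, not as the authors' implementation.

import numpy as np

def gdop(sat_pos, user_pos):
    """GDOP from satellite and (estimated) user positions, following eqs. (5)-(8)."""
    sat_pos = np.asarray(sat_pos, dtype=float)       # shape (Nsat, 3)
    user_pos = np.asarray(user_pos, dtype=float)     # shape (3,)
    diff = sat_pos - user_pos
    r = np.linalg.norm(diff, axis=1)                 # \hat{r}_i of eq. (5)
    A = diff / r[:, None]                            # rows [a_xi, a_yi, a_zi]
    H = np.hstack([A, np.ones((len(sat_pos), 1))])   # eq. (6)
    G = np.linalg.inv(H.T @ H)                       # eq. (7)
    return float(np.sqrt(np.trace(G)))               # eq. (8)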
Figure 2: GPS data collecting embedded system used
in our experiments

2.2 Levenberg-Marquardt Algorithm

The LM algorithm is a fast, accurate and stable fitting method that is widely used to train an NN. Like the quasi-Newton methods, the LM algorithm was designed to approach second-order training speed without having to compute the Hessian matrix, using only the Jacobian matrix [7,8]. The main disadvantage of the LM algorithm is that it needs to store some matrices that can be quite large for certain problems, such as face recognition. Another method to train an NN is the modified LM. This method, which is based on Bayesian regularization and the LM algorithm, reduces the difficulty of determining the optimum network architecture [9].

The Proposed Classification Approaches

In this research, a large set of experiments was carried out using the following set-up: a standard GPS receiver was installed at a fixed point and was connected to a PC. For real GPS GDOP data collection, the azimuth (Az) and the elevation (E) of each observed satellite are measured using a developed embedded system. After collecting the GPS information in DRAM, these data are transferred to the serial port of the PC for processing. Figure 2 shows the entire GPS data collecting embedded system.
In the first step, in order to reduce the training time, the input measurement data is normalized. Since M = H^T H is a symmetric 4x4 matrix, it has four real-valued eigenvalues, denoted \lambda_1, \lambda_2, \lambda_3 and \lambda_4. Clearly the four eigenvalues of M^{-1} are \lambda_i^{-1} (i = 1, 2, 3, 4). Based on the fact that the trace of a matrix is the sum of its eigenvalues, equation (8) can be expressed as [9]:

GDOP = \sqrt{\lambda_1^{-1} + \lambda_2^{-1} + \lambda_3^{-1} + \lambda_4^{-1}}    (9)

Mapping with the definition of four variants can be done as follows [9]:

y_1(\lambda) = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = trace(M)    (10)

y_2(\lambda) = \lambda_1^2 + \lambda_2^2 + \lambda_3^2 + \lambda_4^2 = trace[M^2]    (11)

y_3(\lambda) = \lambda_1^3 + \lambda_2^3 + \lambda_3^3 + \lambda_4^3 = trace[M^3]    (12)

y_4(\lambda) = \lambda_1 \lambda_2 \lambda_3 \lambda_4 = det(M)    (13)

where det denotes the determinant of a matrix. In order to have uncorrelated and informative features of the GPS GDOP, PCA is used as a feature extractor. Since the mapping from \lambda to the GPS GDOP classes is often highly non-linear and cannot be determined analytically, it is determined by using an NN.
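As an illustration, the four features of eqs. (10)-(13) can be computed directly from M = H^T H as in the Python sketch below; in the proposed method these features would then be passed through PCA before the NN, which is not shown here.

import numpy as np

def gdop_features(H):
    """Features y1..y4 of eqs. (10)-(13) computed from M = H^T H."""
    M = H.T @ H
    return np.array([np.trace(M),            # y1 = trace(M)
                     np.trace(M @ M),        # y2 = trace(M^2)
                     np.trace(M @ M @ M),    # y3 = trace(M^3)
                     np.linalg.det(M)])      # y4 = det(M)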


Figure 3: Classification block diagram of GPS GDOP using NN


Table 2: Comparison of classification rate and training time for the proposed methods and three well-known existing classifiers

Classification method   Correct classification rate   CPU time
BPNN [5]                93.16%                        about 1.5 s for 200 iterations
PNN [5]                 97.29%                        about 1.5 s
GRNN [5]                97.29%                        about 1.5 s
LM with PCA             99.27%                        about 1.3 s for 200 iterations
Modified LM with PCA    99.48%                        about 1.3 s for 200 iterations
In this research, the NN is designed to perform the mapping from these features to the GPS GDOP classes. It should be noted that the number of features (three) was selected by trial and error. The entire classification block diagram of the GPS GDOP using the NN and PCA is shown in Figure 3.

In this study, in addition to the existing methods, namely the BPNN, GRNN, and PNN, the LM and modified LM algorithms have been applied to train an NN together with PCA.

Simulations and Discussions

In this paper we use a feed-forward NN with three layers; for these methods we train it with 50% of the GPS GDOP measurement data and use the rest of the data for testing the algorithms. The momentum and initial learning rate are set to 0.85 and 0.05, respectively. Because of the non-deterministic behavior of NNs, we run all algorithms 20 times and report the average of the results.

Deciding how many neurons to use in the hidden layer is one of the most important choices in an NN. When the number of neurons is too low, the NN cannot model complex data and the resulting fit may be unacceptable. If too many neurons are used, not only does the training time increase, but the performance of the NN may also degrade. Therefore, we vary the number of neurons in the hidden layer from 10 to 100 and test their performance; the best accuracies are reported.

When we increase the number of iterations beyond 200, the accuracies of the proposed methods do not change significantly, but the training time increases considerably. Thus, we propose to use 200 iterations for training a feed-forward NN with the improved BP training algorithms.

Table 2 shows the accuracies obtained by the proposed methods (a feed-forward NN trained with the LM and modified LM algorithms combined with PCA) and by the BPNN, GRNN, and PNN. Classification accuracies of 99.27% and 99.48% are achieved on the GPS GDOP measurement data using the LM and modified LM with PCA, respectively.

It should be mentioned that Jwo et al. in [2] used an NN for GPS GDOP classification with the BP technique, PNN and GRNN for NN learning. The advantages of the proposed methods based on the LM and PCA over that work are higher accuracy and lower CPU time; they also have lower structural complexity for hardware implementation.
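To make the experimental setup concrete, here is a hedged scikit-learn sketch of the PCA plus feed-forward NN pipeline described above (50/50 split, 200 training iterations, hidden layer size swept, results averaged over repeated runs). scikit-learn's MLPClassifier does not offer LM or Bayesian-regularized training, so its default solver stands in purely to illustrate the pipeline; X and y denote the GDOP feature matrix and class labels, which are not provided in the paper.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def evaluate(X, y, hidden=50, runs=20):
    """Average test accuracy of a PCA + three-layer feed-forward NN over several runs."""
    scores = []
    for seed in range(runs):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=seed)
        model = make_pipeline(
            PCA(n_components=3),   # uncorrelated, informative features
            MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=200, random_state=seed),
        )
        model.fit(X_tr, y_tr)
        scores.append(model.score(X_te, y_te))
    return float(np.mean(scores))

# Sweep the hidden-layer size from 10 to 100 and keep the best accuracy:
# best = max(evaluate(X, y, hidden=h) for h in range(10, 101, 10))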



Conclusions

The GPS errors resulting from the satellite configuration geometry are indicated by the GDOP factor, which is often used for selecting a suitable subset of satellites from the at least 24 existing orbiting satellites. In this paper, a fast and precise approach for GPS GDOP classification using the LM and modified LM algorithms to train a feed-forward NN, together with PCA, has been proposed. NNs are a practical computing approach for classifying the measured GPS GDOP. Also, in order to reduce the computational burden and the training time, we use PCA as a pre-processing step; PCA produces uncorrelated and informative features. The performance of the proposed methods has been studied on the test data of the paper. The simulation results demonstrate the significant advantage of the proposed methods compared with several existing methods, namely the BPNN, GRNN, and PNN.

References
[1] M. R. Mosavi and H. Azami, Applying neural network ensembles for clustering of GPS satellites, Journal of Geoinformatics 7 (2011), no. 3, 7-14.
[2] M. Saraf, K. Mohammadi, and M. R. Mosavi, Classifying the geometric dilution of precision of GPS satellites utilizing Bayesian decision theory, Journal of Computers and Electrical Engineering 37 (2011), no. 6, 1009-1018.
[3] N. Levanon, Lowest GDOP in 2-D scenarios, IEEE Proceedings - Radar, Sonar and Navigation 147 (2000), no. 3, 149-155.
[4] M. Zhang and J. Zhang, A fast satellite selection algorithm: beyond four satellites, IEEE Journal of Selected Topics in Signal Processing 3 (2009), no. 5, 740-747.
[5] D. J. Jwo and C. C. Lai, Neural network-based GPS GDOP approximation and classification, Journal of GPS Solutions 11 (2007), no. 1, 51-60.
[6] D. Y. Hsu, Relations between dilutions of precision and volume of the tetrahedron formed by four satellites, IEEE Position Location and Navigation Symposium (1994), 669-676.
[7] F. Paulin and A. Santhakumaran, Classification of breast cancer by comparing back propagation training algorithms, Journal on Computer Science and Engineering 3 (2011), no. 1, 327-332.
[8] L. Zhang, Y. Zhao, and K. Hou, The research of Levenberg-Marquardt algorithm in curve fittings on multiple GPUs, IEEE Conference on Trust, Security and Privacy in Computing and Communications (2011), 1355-1360.
[9] F. D. Foresee and M. T. Hagan, Gauss-Newton approximation to Bayesian regularization, International Joint Conference on Neural Networks (1997), 1930-1935.
[10] S. H. Doong, A closed-form formula for GPS GDOP computation, Journal of GPS Solutions 13 (2009), no. 3, 183-190.

Improving Performance of Software Fault Tolerance Techniques Using


Multi-Core Architecture
Hoda Banki, Seyed Morteza Babamir, Azam Farokh, Mohamad Mehdi Morovati
University of Kashan, Kashan, Iran
Department of Computer Engineering
babamir@kashanu.ac.ir
{banki, farokh, mm.morovati}@grad.kashanu.ac.ir

Abstract: This research shows the influence of multi-core architecture in reducing the execution time and thus increasing the performance of some software fault tolerance techniques. Given the superiority of the N-version Programming and Consensus Recovery Block techniques in comparison with other software fault tolerance techniques, the implementations were based on these two methods. A comparison between these two methods showed that the Consensus Recovery Block is more reliable. Therefore, in order to improve the performance of this technique, we propose a technique named the Improved Consensus Recovery Block. In our proposed technique, not only is the performance higher than that of the consensus recovery block technique, but the reliability of the proposed technique is also equal to that of the consensus recovery block. The improvement in performance is based on a multi-core architecture in which each version of the software key units is executed by one core. As a result, by executing the versions in parallel, the execution time is reduced and the performance is improved.

Keywords: Software Fault Tolerance; Multi-core; Parallel Execution; Consensus Recovery Block; N-version Programming; Acceptance Test.

Introduction

Nowadays the influence of software in different domains such as economics, medicine, aerospace and so on is quite noticeable. One of the main requirements of these systems is the use of safe and reliable software. Given the importance of software reliability, the need for fault tolerance techniques in software development has increased significantly. Design diversity is one of the fault tolerance methods and requires running multiple versions of the program [1]. While software fault tolerance techniques increase software reliability, increasing the number of versions of the program also increases the execution time, which reduces performance. By taking advantage of distributed and parallel processing systems, however, efficiency is increased and the cost of using these systems becomes acceptable. Using a multi-core architecture is a good way to take advantage of parallel processing.

Corresponding Author, T: (+98) 913 1635211

Based on the idea of software fault tolerance, for some software key units in a system, N versions can be developed separately but with similar functionality [2]. The purpose of design diversity is to construct modules that are as independent as possible and to minimize the occurrence of identical errors in these modules [3]. All versions run with identical initial conditions and inputs. The output of all versions is given to a decision module, which selects a unique result as the correct output.
The paper starts with an introduction to N-version programming, the recovery block, and their derivative techniques. In Section 3, we introduce the satellite motion system as a case study. In Section 4, we discuss the usage of multi-core architecture in fault tolerance


techniques. Implementation results are reviewed in Section 5 and the proposed method is presented in Section 6. Finally, conclusions are discussed in Section 7.

Software Fault-Tolerance Techniques

Now we overview some fault tolerance techniques.

2.1 N-version programming technique

The N-version programming (NVP) technique is one of the main techniques of software fault tolerance. In this technique, N different versions of a module are implemented. The different versions are executed in parallel and the results are presented to a decision module, which selects the correct result [3].

2.2 N-version programming-Tie Broker technique

In order to improve the performance of the NVP technique, a technique called N-version programming-Tie Broker (NVP-TB) has been developed whose strategy is to synchronize the versions. In this technique, assuming that three versions of a software key unit are developed, when the results of the two faster versions are produced it does not wait for the slowest version anymore. In other words, when the two faster versions complete their execution, their results are compared with each other and, if they match, one of the results is returned as the correct result. If they do not match, it waits for the result of the slowest version and selects the correct result by a decision mechanism [4].

2.3 N-version programming-Acceptance test technique

To reduce the probability of an incorrect result, Tai and his colleagues added an acceptance test to the NVP technique. In this technique, when the decision mechanism selects one of the results as the correct result, this result is passed to the acceptance test to check its correctness and thereby increase the reliability [4].

2.4 N-version programming-Tie Broker-Acceptance test technique

Because the two modified NVP techniques are complementary, the N-version programming-Tie Broker-Acceptance test (NVP-TB-AT) technique has been developed, which concentrates on both reliability and performance. This technique is a combination of the NVP-TB technique and an acceptance test. It uses the acceptance test to increase the reliability, which causes the execution time to increase and thus reduces performance; by using the Tie-Broker mechanism, this reduction of performance is compensated. As a result, this technique not only has higher performance than NVP-AT, but also has reliability equal to NVP-AT [5].

2.5 Recovery Block technique

The recovery block (RcB) technique is one of the main techniques of software fault tolerance. In this technique, the different versions are prioritized in order of their importance and then run in order of their priority. Acceptance or rejection of each version's result is decided by a module called the acceptance test. First, the overall state of the system is stored; if a version cannot successfully pass the acceptance test, the system is returned to the saved state and the next version is run [3].

2.6 Distributed Recovery Block technique

The distributed recovery block (DRB) technique is the distributed version of the RcB technique, in which several recovery blocks are implemented on several systems, and the only difference between these blocks is the priority of the modules.

2.7 Consensus Recovery Block technique

The consensus recovery block (CRB) technique is a combination of NVP and RcB: first NVP runs and, if it fails to produce the correct result, the recovery block runs and produces the correct result [3].
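The control flow of the CRB technique can be summarized in a short sketch (a minimal illustration, assuming `versions` is a list of independently developed implementations, `voter` is the NVP decision module and `acceptance_test` is the AT used by the recovery block; none of these names come from the paper).

def consensus_recovery_block(versions, inputs, voter, acceptance_test):
    """CRB: try NVP first; if the voter cannot agree, fall back to recovery blocks."""
    # NVP phase: run every version and let the decision module pick a result.
    results = [v(inputs) for v in versions]
    agreed = voter(results)              # assumed to return None when no consensus exists
    if agreed is not None:
        return agreed
    # RcB phase: run versions in priority order until one passes the acceptance test.
    for v in versions:                   # versions assumed ordered by priority
        candidate = v(inputs)
        if acceptance_test(candidate):
            return candidate
    raise RuntimeError("all versions failed the acceptance test")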



Acceptance Test

Acceptance Test (AT) is the most basic approach to self-checking software. It is typically used with the RcB, CRB and DRB techniques. The AT is used to verify that the system's behavior is acceptable, based on an assertion on the anticipated system state. As shown in Fig. 1, it returns the value TRUE or FALSE. An AT needs to be simple, effective, and highly reliable, to reduce the chance of introducing additional design faults, to keep run-time overhead reasonable, to ensure that anticipated faults are detected, and to ensure that non-faulty behavior is not incorrectly flagged. ATs can thus be difficult to develop, depending on the specification. The form of the AT depends on the application. The coverage of an AT is an indicator of its complexity, where an increase in coverage generally requires a more complicated implementation of the test. A program's execution time and fault manifestation probabilities also increase as the complexity increases [3].

Figure 1: Acceptance test functionality

Satellite Motion System

In this section, the satellite motion system, which is used in scientific computing, is introduced as a case study. Because the calculation of satellite motion is the most critical part of the satellite control system, errors in this part can cause the entire system to fail. In order to increase the reliability of this part, fault-tolerant software techniques were utilized. The satellite motion equation is as follows [6]:

r'' = -(GM / |r|^3) r + k

where r is the position vector, GM is the product of the gravitational constant and the Earth's mass, and k represents the effects of all the perturbing forces acting on the satellite. Since the equation of motion of a satellite is a second order, three-dimensional vector differential equation, it has to be converted to a system of first order differential equations [6], which can then be solved numerically using methods such as Runge-Kutta, Adams-Bashforth and Adams-Moulton. In this paper, various implementations of these methods are used as the different versions required by the fault-tolerant techniques.

Multi-core Architecture Usage

On a single-core platform, only one thread can be running at a given point in time, whereas on a multi-core platform several threads can run on different cores at the same time. So in the multi-core architecture, the threads that are created to run the program really do run in parallel. Therefore, synchronization issues and the cost of communication among cores must be considered. If this extra cost is considerable compared with the normal execution cost of the process on a single-core architecture, this kind of application is not suitable for the multi-core architecture [7].

Figure 2: Non-key software unit and key software unit

As shown in Fig. 2, a software system is composed of a series of software key units and software non-key units. Each software system includes critical


and important parts, where the occurrence of an error causes system failure and the cost cannot be compensated. These critical and important parts are called software key units, and the other sections are software non-key units [2].

One way to increase fault tolerance is to have different versions and to deploy fault tolerance techniques. But since developing different versions of the entire system is very costly, several versions with different implementations are developed only for the software key units. Since the key units have several versions, which increases the execution time, we use multi-core architecture features to reduce this time and run the versions on different cores in parallel. This approach reduces the execution time and thus increases the performance, because the cost of synchronization and communication between the cores is negligible compared with the high cost of the sequential program [2].

Figure 3: Execution time of NVP technique and derived techniques

Implementation and Results of Multi-Core Usage

The effect of multi-core architecture on increasing the performance of the NVP technique has been discussed by Yang and his colleagues [8]. In this paper we discuss the effect of multi-core architecture on the NVP-derived techniques, DRB, CRB and the improved consensus recovery block. Fault-tolerance techniques have been used to increase reliability, so different implementations of numerical methods for solving the differential equations of satellite motion, namely Runge-Kutta, Adams-Bashforth and Adams-Moulton, are implemented as the different versions required by the fault tolerance techniques.

In other words, in each technique we execute the different versions on a single-core architecture, compare the execution time with that on a multi-core architecture, and finally offer a new technique to reach a higher performance.

We reduced the execution time of these techniques significantly by using the multi-core architecture. As shown in Fig. 3, the speedup of the NVP technique is 1.22 for a dual-core processor and 1.89 for a quad-core processor. Because the reliability of this technique is low, the NVP-TB-AT technique is used instead; its speedup is 1.20 for a dual-core processor and 1.61 for a quad-core processor. The effect of multi-core architecture on the performance of the RcB technique is shown in Fig. 4.

Figure 4: Effect of multi-core architecture on performance of recovery block technique

The execution times of the RcB technique on single-core, dual-core and quad-core are 22.04, 16.14 and 12.14, respectively. Because in the single-core architecture all the versions execute sequentially, the execution time is longer than on the other architectures. For example, in our implementation the execution times of the versions are 14.60, 3.71 and 3.67, respectively, so the execution time of the RcB technique on a single core is about the sum of all these times. In order to exploit parallelism, we can use the distributed version of this technique, the distributed recovery block. The DRB technique has a speedup of 1.36 using a dual-core processor and 1.86 using a quad-core processor. As shown in Fig. 3, the execution time improvement for the quad-core architecture is greater than for the dual-core architecture. In other words, by increasing the number of cores, further performance improvement is expected.
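As an illustration of how one such version might integrate the satellite motion equation after converting it to a first-order system, here is a minimal fixed-step Runge-Kutta sketch with a simplified perturbation term; the constants, names and example orbit are ours, not the implementation used in the paper.

import numpy as np

GM = 3.986004418e14          # Earth's gravitational parameter [m^3/s^2]

def deriv(state, k=np.zeros(3)):
    """First-order form of r'' = -GM r/|r|^3 + k, with state = [r, v]."""
    r, v = state[:3], state[3:]
    a = -GM * r / np.linalg.norm(r) ** 3 + k
    return np.concatenate([v, a])

def rk4_step(state, dt):
    """One classical Runge-Kutta step; other versions could use Adams methods."""
    k1 = deriv(state)
    k2 = deriv(state + 0.5 * dt * k1)
    k3 = deriv(state + 0.5 * dt * k2)
    k4 = deriv(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Example: propagate a circular low-Earth orbit for one minute with 1 s steps.
r0 = np.array([7.0e6, 0.0, 0.0])
v0 = np.array([0.0, np.sqrt(GM / 7.0e6), 0.0])
state = np.concatenate([r0, v0])
for _ in range(60):
    state = rk4_step(state, 1.0)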



Suggested Technique (Improved Consensus Recovery Block)

In NVP-TB-AT, if the results of the two faster versions are equal, one of them is announced as the correct result and no acceptance test is performed on it [5]. So if there is an error in the system that causes the results of the two faster versions to be similar and wrong, the overall system failure probability increases with this technique. Thus this technique is less reliable than the RcB technique, because in the RcB technique the result must pass the acceptance test under all conditions before being returned as the correct result. Also, if the program has several correct answers, the NVP-TB-AT technique may fail: if each of the two faster versions produces a correct but different result, the voter waits for the slowest version and uses the decision mechanism to judge among the results of the two faster versions and the result of the slowest one. If the slowest version has a correct but different result from those of the faster versions, the voter cannot decide and the system fails. But if the RcB technique is used and the program has several correct results, the system does not fail, because the AT is applied to every version and so the correct result will be determined.

On the other hand, the performance of the RcB technique largely depends on the performance of the acceptance test, while in many cases creating an acceptance test program is very difficult. The CRB technique reduces the importance of the acceptance test compared with its importance in the RcB technique. Also, the NVP technique is not able to produce the final result in cases where the problem has several correct answers. So the RcB and NVP techniques each have drawbacks in some cases, and the CRB, by combining the two techniques discussed, resolves both drawbacks.

Given the superiority of the CRB technique over the other techniques, we concentrate on it, and in order to improve its performance we propose a technique called the improved consensus recovery block, which is similar to the CRB technique. In the execution of CRB, first the NVP section tries to produce the correct result. If the decision module is able to produce the result, it produces the correct result and the technique terminates. Otherwise, the second section, namely the recovery block, executes to produce the correct result. Since the recovery block executes sequentially, the execution time increases; even with multi-core computers and the possibility of parallel processing, the recovery block does not use these facilities. In this paper, to take full advantage of multi-core facilities and reduce the execution time of the CRB technique, we use DRB instead of RcB. The algorithm of the proposed technique is shown in Fig. 5. First, the versions are executed simultaneously through the NVP technique and their results are given to a voter. If the voter can produce a correct result, this result is returned. Otherwise, the different versions are executed through the DRB technique.

Figure 5: Consensus recovery block technique algorithm

The influence of multi-core architecture on the performance of the CRB technique is shown in Fig. 6. Different implementations of numerical methods for solving the differential equations of satellite motion were used as the different versions required by the CRB technique.

Figure 6: Influence of multi-core architecture on performance of consensus recovery block technique

As shown in Fig. 6, the execution time of the CRB technique is 31.93 and the execution time of the improved consensus recovery block technique is 24.94. In other words, using quad-core processors the speedup of the improved consensus recovery block technique over the CRB technique is 1.28. In these two techniques the NVP section is executed identically; the difference between them is in the recovery block section. In the CRB, the recovery block section is executed sequentially, so



in the worst case, namely when only the last version passes the acceptance test, the execution time equals the total time to run all versions. In the improved consensus recovery block, by contrast, the recovery block section runs in a distributed manner and its execution time equals the execution time of the longest version. According to the above diagrams, the execution time of the DRB technique on the quad-core architecture is less than the execution time of the CRB technique. But since the CRB does not have the problems of the DRB technique, in many cases it is more suitable. In this paper, we showed that the execution time of the CRB technique can also be reduced.
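A hedged sketch of the proposed improved CRB of Fig. 5, using Python threads to stand in for the multiple cores; the helper names and the use of concurrent.futures are our illustration under stated assumptions, not the authors' implementation.

from concurrent.futures import ThreadPoolExecutor

def improved_crb(versions, inputs, voter, acceptance_test, workers=4):
    """Improved CRB: parallel NVP first, then a distributed recovery block phase."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # NVP phase: all versions run at the same time, one per worker/core.
        results = list(pool.map(lambda v: v(inputs), versions))
        agreed = voter(results)
        if agreed is not None:
            return agreed
        # Fallback phase: the already-computed candidate results are checked in
        # priority order, so this phase costs about one (the longest) version run.
        for candidate in results:
            if acceptance_test(candidate):
                return candidate
    raise RuntimeError("all versions failed the acceptance test")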

Conclusion

There are different software fault tolerance techniques. One of these techniques is the consensus recovery block, which has better performance than other techniques in some cases and does not share their problems. To increase the performance of this technique, we proposed a technique named the improved consensus recovery block, in which distribution concepts are used. Also, given the capability of multi-core architectures to support parallel processing, this architecture has been used to reduce execution time and thus increase the performance of fault-tolerance techniques. As a result, we showed that the improved consensus recovery block technique is more suitable than the other techniques from the reliability and performance viewpoints.

Because the satellite motion computation system is the most critical part of the system, in this paper software fault tolerance techniques have been used in solving the numerical differential equation of satellite motion in order to increase reliability. For this purpose, different implementations of numerical methods for the satellite motion differential equation have been used as the different versions required by the software fault tolerance techniques. Then, to determine the rate of performance increase, the execution time on a single-core architecture in sequential mode and on a multi-core architecture in parallel mode has been compared for the different fault tolerance techniques. For the NVP-TB-AT technique, which has higher performance and reliability than the other NVP-derived techniques, the execution time in sequential mode on a single-core architecture is 21.81, and the execution times in parallel mode on dual-core and quad-core architectures are 18.06 and 13.48, respectively. So the speedup for the dual-core architecture is 1.20 and for the quad-core architecture is 1.61. Also, the execution times of the recovery block technique on single-core, dual-core and quad-core are 22.04, 16.14 and 12.14, respectively. Since high reliability is critical in the satellite motion computation system, we use the consensus recovery block technique, which has high reliability but suffers from a high execution time. This problem has been solved by the improved consensus recovery block technique.

The execution time of the consensus recovery block technique is 31.93 and the execution time of the improved consensus recovery block technique is 24.94 on the quad-core architecture. These two techniques have similar reliability but different performance. In other words, because the consensus recovery block does not use distribution and parallelism concepts, it cannot exploit the parallelism of a multi-core architecture. The proposed technique, the improved consensus recovery block, achieves higher performance by taking advantage of distribution concepts and the parallelism of the multi-core architecture. Therefore, according to the results, using the improved consensus recovery block technique on a multi-core architecture will increase the reliability and performance of fault-tolerant software.

References
[1] A. Avizienis and J. P. J. Kelly, Fault tolerance by design diversity: concepts and experiments, IEEE Computer 17 (1984), 67-80.
[2] L. Yang, L. Yu, J. Tang, L. Wang, J. Zhao, and X. Li, McC++/Java: enabling multi-core based monitoring and fault tolerance in C++/Java, 15th IEEE International Conference on Engineering of Complex Computer Systems (2010), 255-256.
[3] L. L. Pullum, Software fault tolerance techniques and implementation, Artech House Publishers, 2001.
[4] A. T. Tai, J. F. Meyer, and A. Avizienis, Performability enhancement of fault-tolerant software, IEEE Transactions on Reliability 42 (1993), no. 2, 227-237.
[5] A. T. Tai, Software performability: from concepts to applications, Kluwer Academic Publishers, 1996.
[6] M. Eshagh and M. Najafi Alamdari, Comparison of numerical integration methods in orbit determination of low earth orbiting satellites, Journal of the Earth and Space Physics 32 (2006), no. 3, 41-57.
[7] S. Akhter and J. Roberts, Multi-core programming, Intel Press, 2006.
[8] L. Yang, Z. Cui, and X. Li, A case study for fault tolerance oriented programming in multi-core architecture, IEEE International Conference on High Performance Computing and Communications (2009), 630-635.


An Introduction to an Architecture for a Digital-Traditional Museum


Reza Asad Nejhad

Mina Serajian

Amirkabir University

Payame Noor University-Zanjan Branch

Department of Nuclear Engineering and Physics

Department of IT and Computer Science

r.asadnejad@aut.ac.ir

mina.serajian@gmail.com

Mohsen Vahed

Seyyed Peyman Emadi

Roozbeh Institution of Higher Education

Roozbeh Institution of Higher Education

Department of IT and Computer Science

Department of IT and Computer Science

ngo.iran@yahoo.com

emadi@roozbeh.ac.ir

Abstract: Present-day museums have not been separated from the digital world and the Internet age. Nowadays there are different museums around the world that have been transformed into digital and virtual museums. But there are still people who prefer a physical presence in the museum and to view its objects up close rather than surfing a virtual museum. Because of this, the aim of this essay is to combine the joy of visiting old-style museums with the characteristics and advantages of digital museums. For this purpose, in the suggested system the user goes to the museum building and, by using the technologies and tools provided by the museum, can easily use the museum services and fully enjoy visiting the objects and items.

Keywords: Digital museum, Information technology usage in museums, Promotion of services quality in museum,
E-Government

Introduction

Museum is a Greek word inspired by "Muse Youn", which means house of the Angels. A museum refers to a space in which a collection of objects is stored and exhibited. Museums are no exception in being affected by the advent of the digital age, and many innovations have been introduced using new technologies [1]. The utilization of new technologies in museums leads to the introduction of a new area of technology called the digital museum. A digital or virtual museum is a museum that exists only online [2]. The Science Museum of London is one of the major science museums in the world and was able to establish an early web presence. Considerable research has been done in this area.

Among the advantages of these museums, one can cite easy searching for a specific object, access to specialized categories of objects, no need to attend the museum physically, the availability of more details, descriptions and related photos of a specific object, and the possibility of saving information [3]. But many people still believe that going to museums and viewing the objects up close has its own special pleasure, and they prefer physical visits to virtual museum surfing. The purpose of this article is to propose a framework and a structure for using IT, digital tools and technologies in order to support the physical presence of visitors in the museum building.

In the following, Part 2 deals with the problems of digital and traditional museums that lead to the basic idea of this paper. Part 3 is the introduction to the proposed system architecture, and Part 4 deals with tools and technology. In Part 5, conclusion and suggestions, we give potential suggestions for future work in order to implement and set up such a system.

Corresponding Author, T: (+98) 912 741-2276


Problem Statement

Several factors and shortcomings were the reasons for the idea of this paper; we mention some of them below:

- Most people still believe that physically attending a place and viewing an object in person is more enjoyable than referring to websites and looking at its photos (as in digital museums).

- Watching 3D photos of an object or a simulated area of a museum (something like spaces designed for games) needs tools, processors, special facilities and high bandwidth [2].

- When present in the museum building, visitors need the help of human guides or printed guidelines, or at more advanced levels can use kiosk guides (like those mentioned in paper [3]). However, human guides may forget or make mistakes when presenting some information or details, and the service these human agents provide may have low quality because of tiredness, a high number of visitors, or not being fluent in a foreign language. Also, reading papers and using kiosks can be boring for visitors.

- In traditional museums, finding the location of a particular object may be difficult. Or, in a big museum like the Louvre, people do not have enough time to visit the whole area and have to choose the parts they are more interested in. Is giving a guide book or a map of the museum the best solution for these problems in the digital age?

Several factors, such as those mentioned above, led to the idea of the system suggested in this essay, which combines the advantages and characteristics of today's digital museums with the joy of visiting traditional museums in practice.

Introduction to the proposed system

The architecture of the system is shown in Fig. 1 to describe and identify the system needs. We use a scenario-building method for this description, as follows. As a visitor enters the building, she receives a special tablet and a headset from the museum operator stationed at the entrance gate of the museum. Special software has been installed on this tablet, and the tablet is equipped with Bluetooth and RFID technology. The visitor selects her required language and then logs in to the software. This software has several parts, such as a classification of museum sections (e.g. musical instruments, art, science, etc.), a museum plan (map), and a ready status for receiving multimedia content from museum objects.

Figure 1: The structure of the proposed system for a digital-traditional museum

Near each museum object there is an RFID chip or tag, installed so that the waves do not overlap with each other, and an RFID reader is embedded in the tablet. As the visitor enters the defined area of an object, it sends its ID information to the tablet and tries to communicate with it. If the software is in ready status, a message appears on the tablet screen asking the visitor whether she would like to connect or receive information for this object; on a positive answer, the software shows the multimedia content of that object with a save button for the user.

She can save the content on her portable memory and take it outside the museum by pressing the save button. If the visitor wants to visit only some selected objects, she can mark them in the application; while passing, only the connection requests of those selected objects will be accepted by the tablet.

The software has a part named place finding or visual map; by selecting this option the user can see her place in the museum and all the paths and corridors of the museum, and so can be guided smartly to a selected place. This feature is provided by RFID technology.
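Purely as an illustration of the scenario above, a hypothetical event loop for the tablet software might look like the following sketch; every class and method name here is invented for illustration and nothing in it is specified by the proposal.

def tablet_loop(rfid_reader, screen, storage, selected_ids=None):
    """Hypothetical tablet flow: detect an object tag, ask the visitor, show the content."""
    for tag in rfid_reader.detected_tags():          # object IDs received over RFID
        if selected_ids and tag.object_id not in selected_ids:
            continue                                 # visitor only wants selected objects
        if screen.ask("Receive information for object %s?" % tag.object_id):
            content = storage.multimedia_for(tag.object_id)  # content embedded in the tablet
            screen.show(content, save_button=True)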



Also, by designing another application for mobile phones, all these services can be provided on visitors' mobile phones. For this purpose, a Bluetooth chip will be installed in addition to the RFID tag for each object. Over a Bluetooth connection the multimedia content is transferred from the object to the mobile phone if the visitor wants it, whereas over an RFID connection the multimedia content is embedded in the tablet and appears if the visitor wants it. (Because of the memory limitations of most mobile phones, it is not possible to hold a database of all items' information in the software.)

As the user exits the museum building, she can copy all the selected, saved information about the items onto a memory device and give the tablet back to the museum operator. A census software module is also implemented: it connects to a computer over a Bluetooth connection as the tablet reaches the gate, and all statistical information about visitors with tablets, such as the items they visited, the classifications they selected, their selected language, the time they spent in the museum and so on, is then sent.

Methodology and technologies

For the implementation of this system, the AIM methodology can be one of the best methodologies to use. AIM is proposed by Oracle and has six project phases: definition, operations analysis, solution design, build, transition and production [4]. Along these phases, major activities such as designing the multimedia database, choosing and preparing suitable software and hardware, designing the software applications, and implementation and integration can be named. Platforms are designed and set up based on existing standards and the defined protocols of the technologies used. Software applications will be designed according to standards and techniques.

Discussion and future works

In this essay a system structure is presented for a digital-traditional museum, which combines the two kinds of museums, traditional and digital. In such a structure, the quality of the services provided to visitors is improved by reducing reliance on human agents, content and explanations are presented in the form of multimedia, and gathering census information becomes easier for museum operators. Among the needs of this system, we can mention designing software with the three functionalities mentioned before, in addition to the mentioned technologies and hardware (infrastructure), which requires engineering design, implementation and coordination with the mentioned technologies. Designing and implementing this software is on the team's list of future works.


References
[1] D. A. Allen, Museums and Education: Museums in Modern Life: Seven Papers Read Before the Royal Society of the Arts in March, April and May 1949, NAL pressmark, London (1949), 17-22.
[2] S. Sugita, Towards a digital museum: experiments at the National Museum of Ethnology, Osaka, Japan, International Conference on Multimedia for Humanities (1998).
[3] R. Wakkary et al., Kurio: a museum guide for families, 3rd Conference on Tangible and Embedded Interaction (TEI'09) (Feb 2009).
[4] J. Crum and BOSS Corporation, Using Oracle 11i, Que Publishing, Chapter 3, Software Implementation Methods, 2000.

A Comparison of Transform-Domain Digital Image


Watermarking Algorithms
Asadollah Shahbahrami

Mitra Abbasfard

Computer Engineering Department

Computer Engineering Laboratory

Faculty of Engineering

Delft University of Technology

University of Guilan, Rasht, Iran

2628 CD Delft, The Netherlands

Reza Hassanpour
Computer Engineering Department
Cankaya University, Ankara Turkey

Abstract: In image processing applications, data authentication is implemented using watermarking techniques. Watermarking is the process of inserting predefined patterns into image data in
a way that the degradation of quality is minimized and remain at an imperceptible level. Many
digital watermarking algorithms have been proposed in the spatial and transform domains. The techniques in the spatial domain still have relatively low-bit capacity and are not resistant enough to
lossy image compression and other image processing operations. For instance, a simple noise in the
image may eliminate the watermark. On the other hand, frequency domain-based techniques can
embed more bits for watermarking and are more robust to attack. Some transforms such as Discrete
Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) are used for watermarking in
the frequency domain. In this paper, the robustness of different transform watermark algorithms is
evaluated by applying different attacks. We evaluate two and six watermark algorithms, which have
been proposed using the DCT and DWT, respectively. Our results show that Cox's algorithm
which is based on DCT is more robust compared to other transform watermark algorithms.

Keywords: Digital Watermarking; Transform-Domain Watermarking.

Introduction

Digital multimedia data are rapidly spreading everywhere. On the other hand, this situation has brought
about the possibility of duplicating and/or manipulating the data. For data transmitted over the Internet, the reliability and originality of the transmitted data should be verifiable; it is necessary that multimedia data be protected and secured. One way to address this problem involves embedding invisible data into the original data to mark its ownership. This is done using digital watermarking
algorithms [6,16].
There are different algorithms in the spatial and transform domains for digital watermarking. The techniques in the spatial domain still have relatively low-bit capacity and are not resistant enough to lossy image compression and other image processing operations. For instance, a simple noise in the image may eliminate the watermark data. On the other hand, frequency domain-based techniques can embed more bits for the watermark and are more robust to attack. Some transforms such as the Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) are used for watermarking in the frequency domain. Most DCT-based techniques work with 8x8 blocks. These transforms are used in several multimedia standards such as MPEG-2, MPEG-4, and JPEG2000. In addition, different watermark algorithms have been proposed using the DCT and DWT.

Corresponding Author, P. O. Box: 3756-41635, Fax: (+98) 131-6690271, Email: shahbahrami@guilan.ac.ir


In considering the attacks on watermarks, the robustness of an algorithm becomes very important. The question is: which transform watermark algorithm is more robust to different attacks compared to the other techniques? We evaluate some transform watermark algorithms and compare their robustness in this paper.

We evaluated two and six watermark algorithms, which have been proposed using the DCT and DWT respectively, using different attacks. Our results show that Cox's algorithm, which is based on the DCT, is more robust compared to the other transform watermark algorithms. In fact, the robustness of an algorithm depends on the frequency at which the watermark data is added. This paper is organized as follows. Section 2 discusses spatial and transform domain watermarking. Some transform domain watermarking algorithms are discussed in Section 3. Experimental and evaluation results are presented in Section 4. Finally, conclusions are drawn in Section 5.

Digital Image Watermarking

In this section, spatial and transform domain watermarking are briefly discussed.

2.1 Spatial Domain Watermarking

Spatial domain watermark algorithms insert watermark data directly into the pixels of an image [8]. For example, some algorithms insert pseudo-random noise into the image pixels. Other techniques modify the Least Significant Bit (LSB) of the image pixels. The invisibility of the watermark data is obtained on the assumption that the LSB bits are visually insignificant. There are several ways of doing an LSB modification: the LSB of each pixel can be replaced with the secret message, or image pixels may be chosen randomly according to a secret key. Here is an example of modifying the LSBs: suppose we have the three R, G, and B components in an image, and their value for a chosen pixel is green, (R, G, B) = (0, 255, 0). If a watermark algorithm wants to hide the bit value 1 in the R component, then the new pixel value has components (R, G, B) = (1, 255, 0). As this modification is so small, the new image is indistinguishable from the original one to the human eye [12].

Although these spatial domain techniques can easily be used on almost every image, they have the following drawbacks. These techniques are highly sensitive to signal processing operations and can be easily damaged. For example, lossy compression could completely defeat the watermark. In other words, watermarking in the spatial domain is easy to destroy using attacks such as low-pass filtering. As a result, transform domain watermarking algorithms are used.

2.2 Transform Domain Watermarking

Transform domain watermarking embeds the watermark data into the transformed image. Transform domain algorithms have many advantages over spatial domain algorithms [3]. Common signal processing includes operations such as upsampling, downsampling, quantization, and requantization. Rotation, translation, and scaling are common geometric operations. A lossy operation is an operation that removes some unimportant parts of the data. Most of the processing for this category takes place in the transform domain and eliminates high-frequency values.

In addition, the techniques in the spatial domain still have relatively low-bit capacity and are not resistant enough to lossy image compression and other image processing [2]. For instance, a simple noise in the image may eliminate the watermark. As another example, watermark data placed in the high-frequency values can be easily eliminated, with little degradation of the image, by any low-pass filtering.

On the other hand, transform-domain watermarking techniques are typically much more robust to image manipulation compared to the spatial domain techniques. This is because the transform domain does not use the original image for embedding the watermark data. In addition, a transform domain algorithm spreads the watermark data over all parts of the image. Additionally, frequency domain-based techniques can embed more bits for the watermark and are more robust to attack. Furthermore, most images are available in the transform domain.

Some transforms such as DCT and DWT are used for watermarking in the frequency domain. Most DCT-based techniques work with 8x8 blocks [3].

Watermark Algorithms Based on DCT and DWT

We discuss two and six watermark algorithms, which have been proposed based on the DCT and DWT, respectively. We focus more on wavelet-domain watermarking algorithms than on DCT. The reasons for this are as follows. First, some multimedia standards such as JPEG2000 and MPEG-4 are based on the DWT [11,2]. These new standards bring new requirements such as progressive and low bit rate transmission as well as region-of-interest coding. In other words, this approach matches the emerging image and video compression standards.

3.1

Watermark Algorithms using the


DCT

Cox et al. [3] proposed a spread spectrum watermarking algorithm. In this technique, the watermark data is spread over many frequency values so that the energy in any single value is very small and undetectable. In order to implement this algorithm, a sequence of values V = v1, v2, ..., vn is extracted from each image I. Then the watermark data X = x1, x2, ..., xn is inserted into the extracted values V to obtain a sequence V' = v'1, v'2, ..., v'n using a scaling factor α with the following equations:

v'_i = v_i + α·x_i        (1)
v'_i = v_i·(1 + α·x_i)        (2)

In order to obtain the final watermarked image I', the sequence V' is inserted back into the original image I.
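A minimal NumPy/SciPy sketch of this spread-spectrum idea, embedding into the largest 2-D DCT coefficients with equation (2); the helper names, the choice of coefficients and the detection-by-correlation step are our assumptions, not Cox et al.'s exact procedure.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(a):  return dct(dct(a, axis=0, norm='ortho'), axis=1, norm='ortho')
def idct2(a): return idct(idct(a, axis=0, norm='ortho'), axis=1, norm='ortho')

def embed(image, watermark, alpha=0.1):
    """Insert x_i into the n largest DCT coefficients using v' = v(1 + alpha*x)."""
    C = dct2(image.astype(float))
    flat = C.reshape(-1)                                        # view into C
    idx = np.argsort(np.abs(flat))[::-1][1:len(watermark) + 1]  # skip the DC coefficient
    flat[idx] = flat[idx] * (1.0 + alpha * watermark)           # equation (2)
    return idct2(C), idx

def detect(test_image, watermark, idx, original):
    """Correlate the recovered coefficient changes with the original watermark."""
    diff = dct2(test_image.astype(float)).reshape(-1)[idx] - dct2(original.astype(float)).reshape(-1)[idx]
    return np.corrcoef(diff, watermark)[0, 1]

# Usage sketch: wm = np.random.randn(1000); marked, idx = embed(img, wm)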
Koch et al. [5] proposed a watermark algorithm
using the DCT domain. First, they transformed an
image using the DCT transform and then pseudorandom numbers are inserted into a subset of blocks.
A triplet of blocks with mid-range frequencies was slightly revised to encode a binary sequence watermark.
They discussed the Randomly Sequenced Pulse Position Modulated Code (RSPPMC) technique. This algorithm consists of two parts. The first part produces
the actual copyright code and a random sequence of locations for embedding the code in the image. The second part embeds the code at the specified locations using a simple pulsing technique. This part includes the
following steps. First, the position sequence is used to
generate a sequence of locations for mapping the pixels.
Second, the blocks of image data are transformed and
quantized. Third, the watermark data, code pulses,
represents the binary code being embedded on selected
locations. Finally, the quantized data is decoded after
the inverse transform of the watermarked transformed
image is obtained.


3.2

Watermark Algorithms using the


DWT

Xie et al. [15] developed a blind watermark technique in the DWT domain. Xie's algorithm modifies the wavelet coefficients using a median filter with a 1x3 sliding window. A non-overlapping 3x1 window runs through the entire low frequency band of the wavelet coefficients. For example, the elements in the window are denoted b1, b2, b3, corresponding to the coordinates (i-1, j), (i, j), (i+1, j), respectively. They are sorted as b(1) <= b(2) <= b(3). Xia et al. [14] insert pseudo-random codes into the large coefficients at the high and
middle frequency bands of the DWT for an image. The
idea is the same as the spread spectrum watermarking
idea proposed by Cox et al. [3]. A pseudo-random sequence, Gaussian noise sequence N [m, n] with mean 0
and variance 1 are inserted to the largest wavelet coefficients. Wavelet coefficients at the lowest resolution
are not changed. In other words, Xia embedded the
watermark data to all sub-bands except LL sub-band.
A watermark technique based on DWT has been
proposed by Wang et al. [13]. They search significant
wavelet coefficients in different sub-bands to embed the
watermark data. The searched significant coefficients
are sorted according to their perceptual importance.
The watermark data is adaptively weighted in different sub-bands to achieve robustness. Tsun et al. [7] also proposed a watermarking algorithm using the DWT. In order to embed the watermark equally over the whole image, Kim et al. [4] embedded watermark data in all sub-bands. The watermark data is generated using a Gaussian distributed random vector. Level-adaptive thresholding has been used in order to select significant coefficients for each sub-band and decomposition level. They used 3-level decompositions and the length of the watermark is about 1000.
Dugad et al. [1] have inserted the watermark
data to sub-bands, which have coefficients larger than
a given threshold T1 except the low-pass sub-bands.
Picking all wavelet coefficients above a threshold is
a natural way of adapting the amount of watermark
added to the image. They used three 2D DWT decomposition levels. Robustness requires the watermark
to be added in significant coefficients in the transform domain. However, the order and number of these
significant coefficients can change due to various image manipulations. Adding watermark data to significant wavelet coefficients in the high frequency bands
is equivalent to adding watermark to the edge areas of
the image, which makes the watermark invisible to the
human eye.
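A hedged PyWavelets sketch of this kind of threshold-based DWT embedding: three decomposition levels, with watermark samples added only to detail coefficients whose magnitude exceeds a threshold T1, in the spirit of Dugad's description; the parameter values and names are our assumptions, not the authors' code.

import numpy as np
import pywt

def embed_dwt(image, watermark, T1=40.0, alpha=0.2, wavelet='haar', levels=3):
    """Add watermark samples to significant detail coefficients, leaving LL untouched."""
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=levels)
    k = 0
    for level in range(1, len(coeffs)):             # coeffs[0] is the low-pass (LL) band
        bands = []
        for band in coeffs[level]:                  # (horizontal, vertical, diagonal)
            band = band.copy()
            flat = band.reshape(-1)
            sel = np.flatnonzero(np.abs(flat) > T1)
            n = int(min(len(sel), len(watermark) - k))
            sel = sel[:n]
            flat[sel] = flat[sel] + alpha * np.abs(flat[sel]) * watermark[k:k + n]
            k += n
            bands.append(band)
        coeffs[level] = tuple(bands)
    return pywt.waverec2(coeffs, wavelet)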


Figure 1: JPEG compression attack on single bit watermarking algorithms.

Experimental Results

We discuss and present our experimental results of watermark robustness evaluation. We have applied different attacks to the previously discussed watermarking algorithms.

4.1 Quality Measurements

Figure 2: Block diagram for watermark robustness experiments (original image and watermark data, apply a watermark algorithm, apply an attack, extract watermark data, measure correlation).

Two common measurements used to quantify the error between two images are the Mean Square Error (MSE) and the PSNR. Their equations are as follows:

MSE = (1 / (N·M)) · Σ_{i=1}^{N} Σ_{j=1}^{M} (f(i, j) - g(i, j))^2        (3)

PSNR = 10 · log10( 255^2 / MSE )        (4)

where the sums over i and j run over all image pixels. An increasing PSNR represents increasing fidelity of compression. In general, when the PSNR is 40 dB or larger, the two images are virtually indistinguishable by human observers; in other words, the transformed image is almost identical to the original.
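Equations (3) and (4) in NumPy form (a small sketch; f and g are assumed to be same-size 8-bit grayscale images):

import numpy as np

def mse(f, g):
    """Mean square error of equation (3)."""
    return np.mean((f.astype(float) - g.astype(float)) ** 2)

def psnr(f, g):
    """Peak signal-to-noise ratio of equation (4), in dB."""
    return 10.0 * np.log10(255.0 ** 2 / mse(f, g))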

It is important to evaluate an image watermark algorithm on many different images. Images should cover a broad range of contents and types. We have used 26 images provided in Fabien Petitcolas' database [10] as a standard evaluation database for watermarking algorithms. We have performed the evaluation keeping in mind that the embedded watermark should be invisible, so we have kept the PSNR value of the images constant at 35 dB and compared the robustness of the different methods.
We performed two groups of experiments, based on single bit and multiple bit watermarking. In the first group of experiments, we fixed the number of modified DCT or DWT coefficients at 1000. This applies to the algorithms that are based on single bit watermarking; of the eight watermarking methods described and used in this research, the Cox, Tsuan, Kim, Wang and Xia methods fall into this group. The algorithms proposed by Dugad, Koch, and Xie are multiple bit algorithms. For this group we have restricted the watermark length to 64 bits. The block diagram given in Fig. 2 indicates how the experiments have been applied to the test images.
First of all, each picture is watermarked using the



methods described in the previous sections, then an attack is applied. Next we try to extract the watermark
and compute the amount of damage done to the watermark. The similarity between damaged watermark
extracted from the image after the attack, and the original watermark is measured using their correlation.
These correlation values have been averaged for
each watermarking algorithm and plotted against the
parameters of each attack separately. Different attacks
such as JPEG compression, EZW compression, median
filter, cropping, and rotation have been applied.

4.2

Evaluation Results

Figures 1 and 3 show the results of the JPEG compression attack on the single bit algorithms and on the multiple bit algorithms using a 64-bit watermark, respectively. JPEG compression is a lossy compression. In the JPEG compression procedure, first a color space transform is performed from RGB to YCbCr. Then the Cb and Cr components are down-sampled by removing every other row and column; this corresponds to 50% compression. Thereafter, a DCT transform is applied and the DCT coefficients are quantized using a pre-defined quantization table. The JPEG compression rate depends on the quantization level used; hence, a higher compression ratio corresponds to lower image fidelity. It should be noted that quantization tends to eliminate high frequency data, which corresponds to image details, noise, and the embedded watermark. As these figures depict, Cox's algorithm and Xie's algorithm are more robust across different JPEG quality levels than the other transform domain watermarking algorithms.

Because of the page limitation, we show the average of the obtained results for each technique. In general, Table 1 summarizes the results of applying different attacks to the watermarked images. The results given in this table are obtained by averaging the results of each method for a given attack over all parameters and all test images.

The results given in Table 1 indicate that Cox's algorithm has a better performance against the DCT-based JPEG attack. This can be related to the fact that JPEG uses the same transform (DCT) as Cox's algorithm. One conclusion from this experiment is that when the watermarking and the attack use the same transform, or are in the same domain, we may choose for embedding the watermark those coefficients that are less affected by the attack and are less distorted. Cox's method uses mid-frequency components to store the watermark, so attacks such as JPEG compression, median filtering and Gaussian filtering, which are low-pass filters, do not have a major effect on it.

The EZW lossy compression distortion rate is higher compared to JPEG compression. This can be related to the fact that EZW is a wavelet based method, which can achieve higher compression rates through its hierarchical coding. The median filtering attack uses a square window to find the median value. Normally the median filter can eliminate isolated noise; however, when the window size is even, and hence the middle of the window is not the middle value in the list of pixel values, more distortion is caused.

Geometric attacks have the most drastic effect on the embedded watermark. Also, using the mid-frequency components makes the watermark invisible, which gives it the robustness expected from a good watermarking method. For large filter sizes such as 15, the watermark is not completely destroyed and its correlation value remains at about 0.534. For other methods, even DWT based methods, achieving a correlation value this high is not easy. Another algorithm which is comparable in performance to Cox's algorithm is Xie's algorithm. Although Xie's algorithm modifies the wavelet coefficients using a median filter with a 1x3 sliding window, its performance and robustness are very close to Cox's algorithm. Cox's algorithm has the advantage of preserving the original image at the watermark detector. This makes it possible for the detector to subtract the retrieved image from the original image and use the result as a metric to evaluate how well the watermark has been embedded. This helps the algorithm to avoid blind comparison, and in fact because of this feature it falls into the non-blind group of algorithms.


Figure 3: JPEG compression attack on multiple bit watermarking algorithms.


Attack             DUGAD   KOCH    XIE     TSUAN   COX     KIM    WANG   XIA
JPEG Compression   0.79    0.794   0.976   0.82    0.99    0.44   0.92   0.83
EZW Compression    0.54    0.49    0.69    0.60    0.94    0.28   0.61   0.55
Median Filter      0.24    0.23    0.72    0.58    0.975   0.18   0.48   0.40
Cropping           0.08    0.44    0.20    0.15    0.22    0.15   0.19   0.61
Rotation           0.22    0.38    0.24    0.21    0.44    0.06   0.11   0.30

Table 1: Test results in terms of correlation values for each watermarking method.
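The correlation values in Table 1 compare the extracted watermark with the original one. A minimal sketch of such a normalized-correlation robustness measure is given below; the exact detector used by each surveyed method may differ, so this only illustrates how the numbers can be interpreted (1.0 means the watermark is recovered perfectly).

import numpy as np

def normalized_correlation(original, extracted):
    """Similarity between original and extracted watermark sequences (1.0 = identical)."""
    o = np.asarray(original, dtype=float).ravel()
    e = np.asarray(extracted, dtype=float).ravel()
    o -= o.mean()
    e -= e.mean()
    return float(np.dot(o, e) / (np.linalg.norm(o) * np.linalg.norm(e) + 1e-12))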
Lossy compression algorithms also try to reduce the image size by removing the small details which correspond to the high frequency content of the image. Therefore, watermarking methods such as Cox's algorithm, which embed the watermark in the mid frequency contents of the image, are more robust against such attacks. Secondly, when the watermarking algorithm and the attack are based on the same transform, the destructive effect of the attack is minimized. This is the case for Cox's algorithm when the JPEG lossy compression attack is applied: Cox's algorithm and JPEG are both based on the DCT transform.

The major consideration in frequency based methods is the robustness of the method to different attacks. To achieve a better performance with respect to robustness, the watermark should be embedded in the lower frequency contents of the image as far as possible. The reason is that the main watermark attacks change or eliminate the high frequency content of the image. From this viewpoint, the main difference between the frequency based watermarking algorithms is their choice of the frequency coefficients used for embedding the watermark data. In fact the robustness of the algorithms is also dependent on the frequency at which the watermark data is added.

References

[1] R. Dugad, K. Ratakonda, and N. Ahuja. A New Wavelet-Based Scheme for Watermarking Images. In Proc. IEEE Int. Conf. on Image Processing, 1998.
[2] M. S. Hsieh, D. C. Tseng, and Y. H. Huang. Hiding Digital Watermarks using Multiresolution Wavelet Transform. IEEE Trans. on Industrial Electronics, 48(5):875-882, October 2006.
[3] I. J. Cox, J. Kilian, T. Leighton, and T. G. Shamoon. Secure Spread Spectrum Watermarking for Multimedia. IEEE Trans. on Image Processing, 6(12):1673-1687, December 1997.
[4] J. R. Kim and Y. S. Moon. A Robust Wavelet-Based Digital Watermarking Using Level-Adaptive Thresholding. In Proc. IEEE Int. Conf. on Image Processing, pages 226-230, October 1999.
[5] E. Koch and J. Zhao. Towards Robust and Hidden Image Copyright Labeling. In Proc. IEEE Workshop on Nonlinear Signal and Image Processing, pages 452-455, June 1995.


[6] S. J. Lee and S. H. Jung. A Survey of Watermarking Techniques Applied to Multimedia. In Proc. IEEE Int. Symp. on Industrial Electronics, June 2001.
[7] C. T. Li and H. Si. Wavelet-Based Fragile Watermarking Scheme for Image Authentication. Journal of Electronic Imaging, 16(1), March 2007.
[8] B. M. Macq and J. J. Quisquater. Cryptology for Digital TV Broadcasting. Proceedings of the IEEE, 83(1):944-957, 1995.
[9] M. W. Marcellin, M. J. Gormish, A. Bilgin, and M. P. Boliek. An Overview of JPEG 2000. In Proc. Data Compression Conf., March 2000.
[10] F. Petitcolas. Photo database. www.petitcolas.net/fabien/watermarking/imagedatabase/.
[11] M. Rabbani and R. Joshi. An Overview of the JPEG2000 Still Image Compression Standard. Signal Processing: Image Communication, 17(1):3-48, January 2002.


[12] R. G. van Schyndel, A. Z. Tirkel, and C. F. Osborne. A Digital Watermark. In Proc. IEEE Int. Conf. on Image Processing, pages 86-90, September 1994.
[13] H. M. Wang, P. C. Su, and C. C. J. Kuo. Wavelet-Based Digital Image Watermarking. Optics Express, 3(12):491, 1998.
[14] X. G. Xia, C. G. Boncelet, and G. R. Arce. Wavelet Transform Based Watermark for Digital Images. Optics Express, 3(12):497, 1998.
[15] L. Xie and G. R. Arce. Joint Wavelet Compression and Authentication Watermarking. In Proc. Int. Conf. on Image Processing, pages 427-431, October 1998.
[16] Y. Yusof and O. O. Khalifa. Digital Watermarking For Digital Images Using Wavelet Transform. In Proc. IEEE Int. Conf. on Telecommunications, May 2007.

Polygon partitioning for minimizing the maximum of geodesic diameters
Zahra Mirzaei Rad

Ali Mohades

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematic and Computer science

Department of Mathematics and Computer science

z.mirzeai@aut.ac.ir

mohades@aut.ac.ir

Abstract: Let S be a set of n points inside a simple polygon. We study the problem of bipartitioning S into two subsets such that the maximum of the geodesic diameters of the subsets is minimized. In ground transportation, where obstacles obstruct the space between the points, the geodesic metric is more convenient and useful than the Euclidean one for representing real world problems. This paper presents an O(n^2 log n) algorithm for the bipartition problem. The proposed algorithm employs the geodesic metric for the distance between points inside a simple polygon.

Keywords: Computational geometry, Facility location, Polygon partitioning, Geodesic diameter, Geodesic distance.

Introduction

Clustering is a prominent problem of fundamental importance in operations research. This problem seeks to partition a set of points into k disjoint clusters subject to some optimization criterion. More formally, such a problem specifies a set of points S, a parameter k, a set measure μ, and a k-argument function f; the solution to the problem is a partition of S into k disjoint subsets S1, ..., Sk such that f(μ(S1), ..., μ(Sk)) is minimized. Such problems are generally NP-hard for arbitrary k, even for planar point sets and simple instances of μ and f, e.g. μ = diameter and f = maximum [3]. Let S be a set of n points inside a simple polygon P. We seek a bipartition of S that minimizes the maximum of the geodesic diameters of the two subsets.

Figure 1: Geodesic distance between points p1 and p2 inside polygon P.

Previous research in computational geometry considers planar points and uses the Euclidean metric. However, for ground transportation, where obstacles obstruct the space between the points, the geodesic metric is more convenient and useful than the Euclidean one in order to represent real world problems.

The geodesic distance between two points in a simple polygon is the length of the shortest path connecting the points that remains inside the polygon. Fig. 1 is an example of geodesic distance.

The rest of this paper is organized as follows: Section 2 introduces some primary definitions, the proposed algorithm is presented in Section 3, and Section 4 focuses on the correctness of the algorithm. Finally, Section 5 ends the paper with a short discussion and future works.
Corresponding Author, P. O. Box 15875-4413, Phone: +98(21)64540-1, Fax: 98(21)6413969


Preliminaries

In this section some preliminaries which are essential in order to present the proposed algorithm are delivered.

Let S be a set of points lying in a simple polygon P. The geodesic diameter of the set S, denoted by DG(S), is defined as follows:

DG(S) = max{ dG(p, q) | p, q ∈ S }

where dG(p, q) is the length of the shortest path between p and q that lies inside P. The geodesic diameter of a set of n points in a simple polygon P can be computed in O(n log n) time [4]. In the following subsections we take a short overview of the geodesic convex hull and the furthest-site geodesic Voronoi diagram.

2.1 Geodesic convex hull

The geodesic convex hull (relative convex hull) of a set of points S = {s1, s2, ..., sn}, called sites, lying in a simple n-gon P is introduced in this section.

Definition: Let Q be a subset of P. Q is called geodesically convex provided that for every pair of points x, y ∈ Q, the geodesic path between x and y constrained to lie in P also lies in Q.

Definition: Let S be a set of sites in P. The geodesic convex hull, CHG(S/P), is the intersection of all geodesically-convex sets containing S (refer to Fig. 2 for an illustration). Alternately, we may view the geodesic convex hull as the minimum-perimeter weakly-simple polygon that contains S and is constrained to lie in P.

Figure 2: Geodesic convex hull of points inside P.

An algorithm that computes the geodesic convex hull of a set of n points in a simple polygon of n sides in O(n log n) time and O(n) space has been proposed by Toussaint [4].

2.2 Furthest-site geodesic Voronoi diagram

Given a finite collection of point sites in the simple polygon, the geodesic furthest-site Voronoi diagram partitions the polygon into Voronoi cells. For each point in a cell, the owner site is the site furthest from that point. The geodesic furthest-site Voronoi diagram of k sites in a simple polygon with n vertices can be computed in O((n + k) log(n + k)) time and O(n + k) space [2]. Let S be the set of sites. The bisector of s and t, denoted by b(s, t), is defined as follows:

b(s, t) = { x | dG(x, s) = dG(x, t) }

where dG(x, y) denotes the geodesic distance between x and y. The s-t half space furthest from s, denoted by H(t, s), is defined as H(t, s) = { x | dG(x, t) < dG(x, s) }. The geodesic furthest-site Voronoi cell of a site s ∈ S is VG(s) = ∩{ H(t, s) | t ≠ s, s, t ∈ S }. The geodesic furthest-site Voronoi diagram VG is the set of points x ∈ b(s, t), for s, t ∈ S, with dG(x, s) = max over r ∈ S of dG(x, r). Figure 3 indicates the Voronoi diagram and Voronoi cells of a set S = {1, 2, 3}. A furthest-site Voronoi edge e(s, t) is VG ∩ b(s, t) for each pair of sites (s, t); if |VG ∩ b(s, t)| < 1 then e(s, t) does not exist. A (Voronoi) vertex is a point x ∈ VG which has three or more sites furthest from it.

Figure 3: Furthest-site geodesic Voronoi diagram

BIPARTITE algorithm

Let S = {s1, ..., sn} be a set of points lying in a simple polygon P. This paper focuses on the problem of partitioning S into two subsets S1 and S2 in order to minimize the maximum of the geodesic diameters of S1 and S2, i.e. max(DG(S1), DG(S2)). In the following we present the BIPARTITE algorithm in a top-down manner to solve this problem.


Algorithm BIPARTITE

Input: a simple polygon P = [1, 2, ..., n] and S = {s1, ..., sn}.
Output: two subsets S1 and S2 which minimize max(DG(S1), DG(S2)).
begin

Step 1: Compute the geodesic convex hull of S (CHG(S/P)), call it Q. It has at most n vertices, Q = [v1, ..., vn]. In addition, compute the furthest-site geodesic Voronoi diagram of the vertices of Q.

Step 2: Call the function CV2(Q) in order to partition the vertices of Q into two sets Q1 and Q2, such that the maximum geodesic distance between any pair of vertices in each set is minimized.

Step 3: Choose the edges e(s, t) where s ∈ Q1 and t ∈ Q2. These edges, which are straight lines or hyperbolic arcs, construct a partitioning path and partition the set S into two sets S1 and S2. This sequence of edges forms a path that satisfies our goal.

Step 4: Report S1 and S2.

The only remaining part of the algorithm which should be unveiled is CV2, which is as follows:

Function CV2
Input: a set of vertices Q.
Output: a partitioning of Q into Q1 and Q2.
begin
Compute the geodesic distances between all pairs of vertices of Q.
Sort all these pairs in decreasing order of their geodesic distances and keep the sorted pairs in a list L. It is obvious that L(1) = (p, q) is the pair of vertices that constitutes the geodesic diameter of Q.
Q1 = {p}; Q2 = {q}; i = 2.
Repeat the following steps while there is a vertex that is not yet assigned to Q1 or Q2:
  (a, b) = L(i).
  If neither a nor b belongs to a predefined set, then create two new sets Qi and Q'i; insert a into Qi and insert b into Q'i.
  If a belongs to one of the predefined sets, e.g. Qj, and b does not belong to any of the predefined sets, then insert b into Q'j.
  If both a and b belong to predefined sets, e.g. a ∈ Qk and b ∈ Ql with k < l, then insert all vertices of Ql into Q'k and insert all vertices of Q'l into Qk.

Theorem: The BIPARTITE algorithm runs in O(n^2 log n) time.

Proof: In Step 1 the geodesic convex hull needs O(n log n) time [4] and the furthest-site Voronoi diagram needs O(n log n) time to be computed [2]. The clustering of the vertices in Step 2 takes O(n^2 log n) time. In Step 3, it is obvious that the number of edges of the furthest-site Voronoi diagram is O(n). Considering all steps, the time complexity of the BIPARTITE algorithm is O(n^2 log n).

Accuracy of the algorithm

In this section we prove the accuracy of the proposed algorithm.

Proof by contradiction: Assume, to the contrary, that there exist sets T1 and T2, where T1 and T2 are a partitioning of S, such that

max(DiamG(T1), DiamG(T2)) < max(DiamG(S1), DiamG(S2))    (4.1)

Without loss of generality, suppose that the maximum value on the left side of statement (4.1) is attained by two points t1 and t2 which belong to T1, and the maximum value on the right side of statement (4.1) is attained by two points s1 and s2 which belong to S1, so DiamG(t1, t2) < DiamG(s1, s2). In the BIPARTITE algorithm, (s1, s2) is processed before (t1, t2) and therefore s1 and s2 are classified into two different sets. This contradiction shows that the supposition is false, so the given statement is true and this completes the proof.
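For illustration, the clustering idea behind CV2 can be sketched as follows. The helper geodesic_distance(p, q, polygon) is assumed to return the length of the shortest path between p and q inside the polygon, and the union-find structure with a parity bit ("opposite side") is our own choice for propagating the separation constraints; it is not necessarily the authors' data structure.

from itertools import combinations

def cv2(vertices, polygon, geodesic_distance):
    parent = {v: v for v in vertices}    # union-find parent pointers
    opposite = {v: 0 for v in vertices}  # parity relative to the component root

    def find(v):
        if parent[v] == v:
            return v, 0
        root, par = find(parent[v])
        parent[v] = root
        opposite[v] ^= par
        return root, opposite[v]

    # all vertex pairs, processed in decreasing order of geodesic distance (the list L)
    pairs = sorted(combinations(vertices, 2),
                   key=lambda ab: geodesic_distance(ab[0], ab[1], polygon),
                   reverse=True)
    for a, b in pairs:
        ra, pa = find(a)
        rb, pb = find(b)
        if ra != rb:                     # force a and b onto different sides
            parent[rb] = ra
            opposite[rb] = pa ^ pb ^ 1
    q1 = [v for v in vertices if find(v)[1] == 0]
    q2 = [v for v in vertices if find(v)[1] == 1]
    return q1, q2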


Discussion and Future Works

Several other variants of the facility location problem have been studied. Most of them are based on the Euclidean metric. We suggest that researchers use the geodesic metric and revisit these problems.

References

[1] J. Hershberger and S. Suri, Matrix searching with the shortest path metric, STOC'93, ACM, 1993.
[2] B. Aronov, S. Fortune, and G. Wilfong, The furthest-site geodesic Voronoi diagram, SCG'88, ACM, 1988.
[3] J. Hershberger, Minimizing the sum of diameters efficiently, Computational Geometry: Theory and Applications 2 (Oct. 1992), no. 2, 111-118.
[4] G. T. Toussaint, Computing Geodesic Properties Inside a Simple Polygon, Revue D'Intelligence Artificielle 3 (1989), 9-42.


Automatic Path-oriented Test Case Generation by considering Infeasible Paths
Shahram Moadab

Hasan Rashidi

Qazvin Branch, Islamic Azad University

Qazvin Branch, Islamic Azad University

Department of Electrical, IT and Computer Sciences

Department of Electrical, IT and Computer Sciences

Qazvin,Iran

Qazvin,Iran

moadabsh@gmail.com

hrashi@googlemail.com

Eslam Nazemi
Shahid Beheshti University
Department of Electrical and Computer Engineering
Tehran, Iran
nazemi@sbu.ac.ir

Abstract: Nowadays about 50 percent of all software development effort takes place in the testing phase. Lack of precision in this phase may cause irrecoverable damage or end with software failure. Automatic test case generation is one way of verifying software. Automatic path-oriented test case generation is one of the most powerful approaches among similar approaches; it is accomplished in three main phases: control flow graph construction, path selection and test case generation. The path selection phase is based on the testing proposed by McCabe, called basis path testing, which is accomplished with basis path set generation. The existence of infeasible paths is one of the most important problems of basis path set generation. Whereas calculating the number of infeasible paths is an undecidable task, in this paper we did our best to make such problems decidable. To solve this problem, potentially most promising areas are predicated at the beginning; then, by finding all these points and labeling them, the potentially most promising areas are limited and in the end all infeasible paths are extracted. Besides a 45 percent improvement in execution time, this approach produces a fully automatic tool without any tester interference.

Keywords: Test Case; Basis Path Testing; Infeasible Path; Cyclomatic Complexity; Control Flow Graph.

Introduction

Software testing is a process of verifying a program with the purpose of finding the bugs of the software. Software testing is an essential task to guarantee and improve software reliability [1]. The goal of software testing is to design a systematic test so as to find the different groups of errors in the least time and with the least effort [2].

Software testing is accomplished with one of the white-box, black-box or gray-box testing techniques. In white-box testing, all of the program's data structures, code and algorithms are available. In black-box testing, or behavioral testing, the most fundamental software requirements are examined; this testing is considered complementary to white-box testing. Gray-box testing is similar to black-box testing; furthermore, to some extent, the software internals and interactive components are available.

Test case generation is one of the software testing methods in the white-box testing technique. Manual


Corresponding Author, P. O. Box 44318-56111, T: (+98) 142 322-4591


test case generation is very time consuming, boring, and it needs a supervisor, which causes a waste of resources [3]. In this regard, researchers have tried to automate the test case generation process, and so far they have presented some approaches to reach this goal. These approaches are classified into three main groups: the randomized approach, the goal-oriented approach and the path-oriented approach [4]. Automatic path-oriented test case generation is the most powerful of these approaches. Whereas selecting all paths is not possible, selecting an appropriate path set is important. Many works have been done on this issue, and most of them are based on McCabe's proposed testing, which is called basis path testing. In this method, a basis path set consisting of linearly independent paths is generated. The presented solutions suffer from many defects, the most important of which is the inability to discover infeasible paths.

In this paper, an automatic tool has been made to produce test cases with support for infeasible paths. Whereas some paths of a program are not executable, and recognizing the number of these paths to produce the basis path set is not decidable, we have tried to make it decidable. At first, potentially most promising areas are predicated, and by discovering these points and the sequentially related paths and labeling them, weak areas are limited and the infeasible paths are extracted. Labeling has been done on assignment instructions, conditions on branches and loop structures. This technique ends with producing a fully automatic tool without any tester interference.

In part 2, related works will be discussed. In part 3, different automatic test case generation techniques will be explained. In part 4, besides explaining the main idea of the paper, the proposed approach and the related algorithm will be presented. In part 5, the experimental results and observations will be declared. Finally, in part 6, after coming to a conclusion, some suggestions for future works will be presented.

Related Works

Some research on one or several phases of control flow graph construction, path selection and test case generation has been done, which is described in the following.

Xu and Zhang (2006) made an automatic tool called SimC to produce the CFG and presented test case generation for C programs which supports data structures such as pointers and arrays, but it still suffers from a lack of function calls and bit operators and also a lack of infeasible path handling [5]. Many papers such as [6-8] did their best to find a solution for the infeasible paths challenge. Although in Jun and Zhang's (2008) proposed algorithm many infeasible path problems have been solved, the help of a tester is used, which makes this case semi-automatic [9]. Sangeeta and Dharmender (2010) presented an automatic tool called PutBracesInIfElse to produce the CFG and test cases for C programs [10]. This automatic tool is based on the if-else structure. The tool uses a LoopToIfFormat function in order to change loop structures to if-else structures; but test cases are produced only based on solving simple constraints, without constraints such as loops and non-linear constraints, complex data structures, function calls, or infeasible paths. Lili and his colleagues (2011) worked on infeasible branches instead of infeasible paths, which led to an efficiency increase, but it is not 100 percent yet [11]. Also, all kinds of data structures, conditions and operators are not covered by their method.

Automatic Test Case Generation

As told, automatic test case generation approaches are classified into 3 main groups: the randomized approach, the goal-oriented approach and the path-oriented approach, which are explained in the following.

3.1 Randomized approach

This approach is the easiest kind of test case generation. It fails due to the low probability of discovering errors in many application programs. The advantages of the randomized approach are that it is easy to implement and has high speed [12]. The disadvantage of the randomized approach is incomplete and inappropriate coverage of the program's paths.

3.2 Goal-oriented approach

This approach focuses on some goals, like discovering mutant errors or reaching assertions. Each version of the program code that includes a simple error is called a mutant. The advantage of this approach is a sharp decline in the probability of encountering infeasible paths [13], and the disadvantage of the goal-oriented approach is the lack of ability to find logical errors [14].


3.3 Path-oriented approach

Automatic path-oriented test case generation is one of the strongest approaches among the white-box approaches; it is accomplished in the 3 main phases below.

3.3.1 Control Flow Graph Construction

A CFG is a directed graph that is represented by the relation G = {N, E, s, f}. In this relation, N is the graph's node set, E is the graph's edge set (E = {(ni, ni+1) | ni, ni+1 ∈ N}), and s and f are the entry and exit nodes, respectively. Each node represents a linear sequence of computations. Each edge e = {ni, nj} is a transfer of control from node ni to node nj, and if the condition of transferring is satisfied, control will transfer from node ni to node nj. Consider the CFG of the function foo in Algorithm 1, which is written in C, in Figure 1.

int y;
void foo(int x){
/*S1*/ if (x < 5)
/*S2*/     y=2;
/*S3*/ while(x < 5)
/*S4*/     x++;
/*S5*/ return;
}
Algorithm 1: The function foo [9]

Figure 1: foo function's control flow graph [9]

3.3.2 Path Selection

After constructing the CFG, some of its paths must be chosen for testing.

Definition 1. A path in a CFG is a sequence of nodes A = na1 na2 ... nam+1 in which na1 = s and nam+1 = f, or equivalently the sequence of edges A = ea1 ea2 ... eam. Here m is the length of path A and eai = {nai, nai+1} (0 < i ≤ m).

Basis path generation has always been challenging because of not being unique [2]. One of the challenges of choosing the basis path set is the possibility of the existence of infeasible paths.

Definition 2. An infeasible path is a non-exercisable path, due to having inconsistent conditions.

Insomuch as traversing all executable paths is impossible, selecting an appropriate path set is important. To do this, McCabe's proposed testing [15], which is known as basis path testing, is mostly used. In this approach, a basis path set consisting of linearly independent paths is generated.

Definition 3. A linearly independent path traverses at least one new edge.

McCabe showed that the size of the basis path set for each obtained CFG is unique and equals the cyclomatic complexity V(G). For calculating the cyclomatic complexity, we can use one of the following formulas:

V(G) = the number of all CFG areas bounded by nodes.
V(G) = E - N + 2, where E is the number of edges and N is the number of nodes.
V(G) = d + 1, where d is the number of predicate nodes.

3.3.3 Test Case Generation

The main goal of the test case generation phase is to find, in the least time, entries for a program such that by executing these entries a large number of paths can be run. Test case generation is accomplished in 2 stages: finding the paths' conditions and satisfying all these conditions. The stage of finding the paths' conditions is done with either the actual execution or the symbolic execution technique. In actual execution, the values of variables are acquired by executing the actual program or by a backtracking technique; this approach has a high order of complexity [16]. To solve this problem, symbolic execution is used. In symbolic execution, variables are acquired without actually executing the program, based on the entry variables and constant amounts. In the stage of satisfying the conditions, the goal is to find the test cases that satisfy the paths' conditions.
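As a small illustration of the last two formulas, the CFG of the foo function can be written as an adjacency dictionary and V(G) computed directly; the exact node and edge set of Figure 1 is assumed here for the sake of the example.

def cyclomatic_complexity(cfg):
    n = len(cfg)                                  # number of nodes N
    e = sum(len(succ) for succ in cfg.values())   # number of edges E
    return e - n + 2                              # V(G) = E - N + 2

foo_cfg = {
    "S1": ["S2", "S3"],   # if (x < 5)
    "S2": ["S3"],         # y = 2;
    "S3": ["S4", "S5"],   # while (x < 5)
    "S4": ["S3"],         # x++;
    "S5": [],             # return;
}
print(cyclomatic_complexity(foo_cfg))   # -> 3, matching V(G) = d + 1 with d = 2 predicate nodes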


Proposed approach

As mentioned, one of the challenges of basis path set selection is the existence of infeasible paths. The outbreak of this problem sometimes makes acquiring the basis path set impossible: in most programs that have a relation between their data structures, the basis path set cannot be obtained. The function f in Algorithm 2 shows this problem clearly [9].

void f(int x){
    int y, z;
    if (x < 5)
        y = 2;
    if (x < 5)
        z = 1;
}
Algorithm 2: The function f [9]

Although the cyclomatic complexity of this algorithm equals 3, only 2 feasible paths are acquirable; so the maximum size of the acquired basis path set will be equal to 2. Calculating the size of a basis path set which includes only feasible paths is undecidable.

Our proposed approach is based on the algorithm suggested in [9]; besides utilizing the advantages of this algorithm, by reducing the domain of each condition an effort has been made to make the calculation of some infeasible paths decidable, towards producing the basis path set. The architecture of our proposed approach is shown in Figure 2 and is explained in the following.

In our proposed approach, at first, potentially most promising areas are predicated, and by discovering these points and the sequentially related paths and labeling them, weak areas are limited and the infeasible paths are extracted. Labeling is done on assignment instructions, the conditions of branches, and loop structures.

Figure 2: Proposed Approach Architecture

Insomuch as variable boundaries are more error-prone (potentially most promising) areas than other areas, assignment instructions and the conditions of branches are labeled with 3 values and loop structures with 6 values. The 3 label values of assignment and conditional instructions include the boundary value, one step before it and one step after it; the 6 label values of a loop instruction include both the top and bottom boundary values of that structure, the two previous values and the two values after them. One of the disadvantages of the base algorithm of this paper is receiving the UL variable from the tester in order to end probable infinite loops, which is in contrast with automatic test case generation. Our proposed algorithm will not get involved in an infinite loop because of decidability, so there is no need to get any variables from the tester and it produces a fully automatic tool. The next improvement is eliminating the condition B == ∅. This condition was considered to solve the first-found-path problem for the basis path set of the algorithm. The first discovered path is always linearly independent with respect to the empty set, so this condition can be omitted. By omitting the condition, the speed of the executed algorithm increases. The proposed algorithm is shown in Algorithm 3.

BOOL FeasibleBPGen(){
  Predicate(T, E);
  EPath(T, E, EP);
  Lbl(T);
  for (len = 1; ; len++) do
    while (P = Select(len)) != NULL do
      if (!LR(P, B)) then
        if (Feasible(P)) then
          add P to B;
          if (Size(B) == FindRC(T)) then
            return TRUE;
          end
        end
      end
    end
  end
}
Algorithm 3: Proposed FeasibleBPGen Algorithm

In the proposed algorithm, the function Predicate(T, E) predicates the potentially most promising areas and the function EPath(T, E, EP) extracts the paths which contain errors. After executing these functions, the E and EP arrays contain the predicated potentially most promising areas and the obtained error paths, respectively. As explained, the function Lbl(T) labels the CFG, the function FindRC(T) calculates the number of infeasible paths of the CFG with the help of the labels, and the function Select(len) selects the next path with length len. The function LR(P, B) returns TRUE if P can be linearly represented by the path set B.
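A minimal sketch of the 3-value and 6-value labeling described above is given below; interpreting "the two previous values and the two values after them" as one value on each side of both loop boundaries is our own reading, and the function names are illustrative rather than those of the paper's tool.

def label_condition(boundary):
    """Label an assignment or branch condition with 3 values around its boundary."""
    return [boundary - 1, boundary, boundary + 1]

def label_loop(lower, upper):
    """Label a loop structure with 6 values around its lower and upper boundaries."""
    return [lower - 1, lower, lower + 1, upper - 1, upper, upper + 1]

# Example: the branch condition (x < 5) in Algorithm 2 yields the candidates 4, 5 and 6.
print(label_condition(5))      # -> [4, 5, 6]
print(label_loop(0, 5))        # -> [-1, 0, 1, 4, 5, 6]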


Experiments

In this section, to show the efficiency of the proposed approach, the approach has been examined on the GNU coreutils package. The acquired results are available in Table 1. The first column of this table gives the name of the tested function; the second column, the number of feasible paths; the third column, the number of paths acquired by [9]; and the fourth column, the number of paths acquired by the proposed algorithm. Whereas all functions are executed by [9] on a Pentium IV PC with a 3.2 GHz CPU and 1 GB of memory in 10 seconds, the proposed algorithm executes them in 5.5 seconds, which makes a 45% time saving. Also, because of the lack of tester involvement, it has turned into a fully automatic tool.
Function Name        |B|   |B| of Base   V(G)
getop()              11    11            11
strol()              7     7             7
InsertionSort()      5     5             5
dosify()             8     8             8
bsd_split_3()        6     6             6
attach()             5     5             5
remove_suffix()      3     3             3
quote_for_env()      4     4             4
isint()              9     9             9

Table 1: The results of executing the FeasibleBPGen algorithm on some functions.

References

[1] C. Kai-Yuan, D. Zhao, and L. Ke, Software testing processes as a linear dynamic system, Information Sciences 178 (2008), no. 6, 1558-1597.
[2] R. S. Pressman, Software Engineering: A Practitioner's Approach, 2003.
[3] J. Edvardsson, X. Li, and S. Wu, A survey on automatic test data generation, Second Conference on Computer Science and Engineering 11 (1999), 21-28.
[4] A. Coen-Porisini, F. De-Paoli, C. Ghezzi, and D. Mandrioli, Software specialization via symbolic execution, IEEE Transactions on Software Engineering 17 (1991), no. 9, 884-899.
[5] Z. Xu, J. Zhang, Y. Shi, and Z. Chen, A test data generation tool for unit testing of C programs, Sixth International Conference on Quality Software (QSIC'06) 7 (2006), no. 4, 107-114.
[6] R. Bodik, R. Gupta, and M. L. Soffa, Refining Data Flow Information using Infeasible Paths, Proceedings of the 6th European conference held jointly with the 5th ACM SIGSOFT international symposium on Foundations of Software Engineering 12 (1997), 361-377.
[7] M. Delahaye, B. Botella, and A. Gotlieb, Explanation-Based Generalization of Infeasible Path, Third International Conference on Software Testing, Verification and Validation (2010), 215-224.
[8] D. Gong, X. Yao, Y. Singer, and P. Kaelbling, Automatic Detection of Infeasible Paths in Software Testing, IET Software 4 (2010), no. 5, 361-370.

Conclusion and Future Works

Automatic test case generation is one of the most practical testing approaches, due to increasing reliability and decreasing cost. In this paper, a fully automatic tool has been presented by limiting the value-area domains, predicating and discovering potentially most promising areas and the related paths, labeling the paths and finally extracting the feasible paths. This tool, by calculating the number of infeasible paths, reduces the execution time. The experimental results show a 45 percent time saving. Also, in the proposed approach, due to presenting a fully automatic tool, there is no need for the tester's support. In future works, the proposed approach in this paper can be used as a base for producing test cases which support more constraints, such as polynomial constraints and function calls.


[9] Y. Jun and J. Zhang, An Efficient Method to Generate Feasible Paths for Basis Path Testing, Information Processing
Letters 107 (2008), no. 34, 8792.
[10] T. Sangeeta and K. Dharmender, Automatic Test Case
Generation of C Program Using CFG, IJCSI International
Journal of Computer Science Issues 7 (2010), no. 4, 2731.
[11] P. Lili, W. Tiane, and Q. Jiaohua, Research on Infeasible
Branch-Based Infeasible Path in Program, JDCTA: International Journal of Digital Content Technology and its Applications 15 (2011), no. 5, 166174.
[12] K.H. Chang, J.H. Cross, W.H. Carlisle, and L. Shih-Sung,
A performance evaluation of heuristics-based test case generation methods for software branch coverage, International
Journal of Software Engineering and Knowledge Engineering 6 (1996), no. 4, 585560.
[13] M. Xiao, M. El-Attar, M. Reformat, and J. Miller, Empirical evaluation of optimization algorithms when used in goaloriented automated test data generation techniques, Empirical Software Engineering, Springer 12 (2007), 183239.
[14] Y. Jia and M. Harman, An analysis and survey of the development of mutation testing, IEEE Transactions on Software
Engineering 28 (2010), 2032.
[15] T.J. McCabe, A Software Complexity Measure, IEEE
Trans. Software Engineering 2 (1976), 308-320.
[16] S. Anand, P. Godefroid, and N. Tillmann, Demand-driven
compositional symbolic execution, Tools and Algorithms for
the Construction and Analysis of Systems, Springer 14
(2008), 367381.

Control Topology based on delay and traffic in wireless sensor networks
Bahareh Gholamiyan Yosef Abad

Masuod Sabaei

Institute for Qazvin Azad University

Institute for Amir Kabir University

Department of Computer and Information Sciences

Department of Computer and Information Sciences

bgholamiyan@yahoo.com

sabaei@aut.ac.ir

Abstract: In wireless sensor networks, energy consumption is one of the influential factors on the lifetime of the network. Therefore, reducing energy consumption is one of the important criteria in our design. Topology control is a method for assigning an appropriate transmission range to nodes by regulating the transmission power, in such a way that energy consumption is reduced. During topology control, some specifications of the network, such as the connectivity of all topology nodes to each other and the prevention of unnecessary regions, must be maintained. In this article we propose an algorithm for creating a topology with the least interference that covers the delay constraint. In this algorithm we divide the network environment in such a way that the delay constraint, based on the number of hops from each sensor node to the sink node, can be met. Also, this division, by considering the transmission power and the traffic of the nodes in each cell, distributes energy consumption in a balanced way over all cells. Then, by using the rate of energy obtained for each cell, the radius size of each cell and the transmission range between the nodes of two adjacent cells are computed. The simulation results show that the proposed algorithm reduces energy consumption by at least 10 percent compared to similar algorithms.

Keywords: Control Topology; Energy Consumption; Transmission Power; Delay; Traffic; Asymmetric Division.

Introduction

One of the most important challenges of wireless sensor networks (WSN) is the energy constraint. In general, wireless communications in these networks use more energy than signal processing, computation, sensing, etc. Thus the application of protocols and algorithms that improve energy efficiency could help in solving this problem [1][2][3]. In some applications of wireless networks, like environment monitoring for sensing the existence of neural gas, it is required that packets reach the sink node within a certain time period [7]. On the other hand, due to the energy constraint, we should use an algorithm that is able to cover the delay constraint with the least amount of energy consumption [9][10]. The traffic rate is also an effective factor in energy consumption [4]. Therefore, in order to reduce
Corresponding Author, T: (+98) 912 2031925

energy consumption, we should focus on the network traffic.
The SPEED protocol is described in [5]. SPEED is a routing protocol that provides soft end-to-end delay guarantees for sensory data transfers. While SPEED takes into account the delay caused by channel access mechanisms, it pays no attention to the problem of energy consumption along the route and does not consider traffic.
In [6], while covering end-to-end delay, the focus is on energy consumption by considering channel access mechanisms. In the delay-constrained, energy-efficient routing problem (DCEERP), sensor nodes are equipped with two radios: a low power radio to be used for short-range communication, and a high power radio for long-range communication. These radios operate at different frequencies, and hence simultaneous transmissions over these two radios do not interfere. The network is divided into sectors and each sector is divided into cells,


and Each cell has a gateway node, whose purpose is to
aggregate the information sensed in that cell and forward it to the sink.
In the intra-cell phase, the sensor nodes within a cell
relay the data to their gateway directly through their
short-range radios. As the distance from a sensor node
to its gateway is limited, a direct relay is possible. In
the inter-cell phase of the data transfer, a gateway relays the cells aggregated data along a path. In this
problem, delay and energy consumption have been considered. But each node of gateway can transmit only to
the node of its lower cell in the same sector, and transmission to other sectors increases energy consumption.
Meanwhile, for the purpose of monitoring energy consumption the problem of traffic and interference have
not been considered at all.
In [6] two other methods titled DIR and MIN-EN are
stated. In DIR each gateway node can transmit its
packets with long-range directly to sink node without
using the nodes of intermediate gateway. In this case
delay is reduced but energy consumption increases. In
MIN-EN transmission of packets is conducted by only
short-range and along one path, that is the nodes of a
sector gateway transmit their packets to lower nodes
in the same sector in order to reach the sink node. In
this case, energy that is consumed reduces but delay
goes up.
Article [7] attempts to reduce the cost of energy consumption with attention to delay. In this article for the
purpose of solving problem Topology control for delayconstraint data collection(TDDC), three algorithms
are proposed. Load-aware power-increased topology
control algorithm(LPTC)whose aim is to minimize total energy consumption in each data collection period.
Two other algorithms, that is, Static division with
equal length (SDEL) and Dynamic division with equal
length (DDEL), considering the delay constraint with
attention to the number of hops of sensor node has
been considered up sink node .
In SDEL algorithm it is supposed that all nodes have
the constraint of having the same number of hops and
the network is a rectangle shaped environment L by
W . If the constraint in the number of hops is considered T , the network is divided into T sections like Fig.
(1).
Then, it determines in which cell each node is located
through computing the length and width distances and
transmits it from each cell to the node of higher cell until it teaches sink node but in this algorithm no attention has been paid to the problem of interference and
traffic. It goes without saying that by getting closer
to the sink node, the traffic rate and consequently interference rises. Therefore, retransmitting packets will
be increased, too. With decreasing transmission range
related to nodes that must do a larger number of re-

transmissions, the energy consumption and transmission power could be reduced.

Figure 1: Division of network environment in SDEL algorithm.

As it was observed in equations (1) and (2), energy


consumption and transmission power have a direct relationship with the transmission range [1] and with
decreasing this range, it is possible to reduce the rate
of energy consumption and transmission power.
Pi,j = c · di,j^α,  2 ≤ α ≤ 5    (1)
Ei,j = k · di,j^α,  2 ≤ α ≤ 5    (2)

where c is a constant, α is the path loss exponent, k is the size of the transmitted packet in bits, and d is the transmission range between nodes i and j.
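For concreteness, equations (1) and (2) can be evaluated as follows; the constant c, the packet size k and the path loss exponent alpha = 2 (the value used later in the paper) are assumed example values.

def tx_power(c, d, alpha=2):
    return c * d ** alpha        # equation (1)

def tx_energy(k, d, alpha=2):
    return k * d ** alpha        # equation (2)

print(tx_power(c=1e-9, d=50.0))  # power for a 50 m transmission
print(tx_energy(k=800, d=50.0))  # energy for an 800-bit packet sent over 50 m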
In this article, the Non-Polar Cell Based (NPCB) algorithm is proposed for wireless sensor networks; it operates toward reducing energy consumption while covering the traffic and delay constraints. For this purpose, the network is divided asymmetrically in such a way that nodes with less traffic transmit with a larger transmission range and nodes with higher traffic transmit with a smaller transmission range.
In the remainder of the article the proposed algorithm
is introduced and the method of division, indexing
cells, Computing the cell index of one node and selection of the next node are stated. In section three the
proposed algorithm is assessed compared with other
methods. In section four the conclusion of the article
is stated.

Proposed Algorithm

As it was mentioned in previous chapter, the more we


move toward sink node, the more the rate of interference increases in such a way that it causes retransmission of packets[8].This retransmissions cause an increase in the consumption of power and energy. Therefore, with a decrease in the range of transmissions, it is
possible to reduce the energy and the power of transmissions based on equations (1) and (2).


In this article the Non-Polar Cell Based (NPCB) algorithm is proposed for asymmetric network division in
order to decrease the power of transmission and energy
consumption given traffic and delay cover constraint.

2.1 Network Division

In this algorithm the network environment is considered circular with sink node center and sensor nodes
aware of location, are dispersed in the environment in
random and are fixed in location. In NPCB algorithm
in order to balance the energy consumed in all cells, the
network environment is divided asymmetrically so that
nodes with lower traffic transmit at a larger range and
nodes with higher traffic transmit with smaller range.
This is done considering traffic and delay cover constraint.
This type of division is conducted because the more we
move toward the sink node, the more the traffic rises
and, as a result, interference increases and packets are
lost. For packet transmission, it must be retransmitted. These retransmissions lead to more power and
energy consumption. Thus, if we make the transmission range shorter according to equations (1) and (2),
this reduces the power and energy consumption.
For this purpose, we suppose that delay constraint is
T , that is, the number of permissible hops from each
original sensor node to sink node must be smaller or
equal to T . In addition, as in Fig. (2) it could be
transmitted from one sensor node in each cell in single
hop to the sensor node of the next cell.

Figure 2: Division of network environment in NPCB algorithm.

As stated, we would like to have balanced energy consumption in all cells. To achieve this objective, we consider equal energy for all cells. The energy is computed according to equation (3), and considering the equality of the energy used we can calculate the radius of each cell.

EM = Pt · r · k · nM    (3)

where EM represents the energy used in region M, Pt the power of transmitting one bit, r the traffic rate, k the number of bits transmitted by one node and nM the number of nodes in region M.

As observed in equation (3), the traffic rate is one of the influential factors on energy consumption. In the network environment, each sensor node transmits its own traffic and that of the other nodes whose traffic is routed through it along a single path. The traffic rate affects the energy consumption, the radius size of each cell and, as a result, the transmission range between the nodes of adjacent cells.

Equations (4), (5) and (6) are used for computing the transmission power of each node, the transmission range between two adjacent cells, and the network density:

Pi,j = c · di,j^2    (4)
dM = (bM + bM-1) / 2    (5)
ρ = N / A    (6)

where Pi,j represents the transmission power of any transmission from i to j, c is a constant and di,j the transmission range between nodes i and j. The path loss exponent α is considered to be 2 [1]. dM is the transmission range between nodes i and j in cells M and M-1, bM is the radius of cell M, bM-1 the radius of cell M-1, ρ the density, N the total number of nodes and A the total area of the network.

After placing the transmission power and the number of nodes per cell in equation (3), we obtain equation (7) for computing the energy of the cells. The number of nodes in each cell is computed using the density and the area of each cell.

EM = (c·r·k·ρ·π/4) · L^2 · b1^2,  if M = 1
EM = (c·r·k·ρ·π/4) · (L^2 − (b1 + ... + bM-1)^2) · (bM-1 + bM)^2,  if M > 1    (7)

The energy consumed in the cells is computed using equation (7). By equating the energies obtained, and given that the sum of the bi is equal to L, we can compute the radius of each cell and, as a result, the transmission range between the nodes of two adjacent cells.

2.2 Indexing cells


After computing the radius size of each cell, we index the cells from the sink node to the farthest cell, Fig. (3).
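To make the division step concrete, the sketch below evaluates the per-cell energy of equation (7), as reconstructed above, for a candidate set of cell radii; this is the quantity that is equalized when solving for b1, ..., bT. All numeric parameters are assumptions for the sake of the example.

import math

def cell_energies(b, c, r, k, rho, L):
    """Energy E_M of each cell M = 1..T for candidate radii b (equation (7))."""
    factor = c * r * k * rho * math.pi / 4.0
    energies, prefix, b_prev = [], 0.0, 0.0
    for m, b_m in enumerate(b, start=1):
        if m == 1:
            energies.append(factor * (L ** 2) * (b_m ** 2))
        else:
            energies.append(factor * (L ** 2 - prefix ** 2) * ((b_prev + b_m) ** 2))
        prefix += b_m
        b_prev = b_m
    return energies

# Equal-width cells (an SDEL-style division) in a 100 m region; NPCB instead
# searches for radii that make these per-cell energies (nearly) equal.
print(cell_energies([100 / 6.0] * 6, c=1e-9, r=1.0, k=100, rho=0.08, L=100.0))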


of data in distance r meter is proportionate to r considered ( = 2).


In simulation, sensor nodes are located in a circular
region. In this simulation the radius of the region is
considered 100 meter. The size of transmitted packets is 100 bites. In Fig. (4) an example of topology
produced is shown.
Figure 3: Indexing cells in NPCB algorithm.

3.1
2.3

Computing the cell index of each


In this scenario, first by considering number of hops
node

To obtain the cell index of node where node i is located


, given equation (8) we use the method mentioned.
r=

p
x
x2 + y 2 , = tan1
y

(8)

By having Cartesian Coordinate (x, y) of node i,


we obtain the Euclidean Coordinate (r, ) of node i.
we have obtained the radius size of each cell earlier.
Now by having r and the radius size of each cell, it is
possible to obtain the cell index of node i.
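A small sketch of the cell-index computation just described: convert the Cartesian coordinates of node i to polar form as in equation (8) and locate r among the previously computed cell radii b1, ..., bT (all numeric values here are assumptions).

import math

def cell_index(x, y, radii):
    r = math.hypot(x, y)                 # r = sqrt(x^2 + y^2), equation (8)
    outer = 0.0
    for index, b in enumerate(radii, start=1):
        outer += b
        if r <= outer:
            return index
    return len(radii)                    # the node lies in the outermost cell

print(cell_index(30.0, 40.0, [20.0, 15.0, 15.0, 20.0, 15.0, 15.0]))  # -> 3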

2.4

First scenario

Selection of the next node

equal to 6, we compare total energy used by algorithms


for the number of different sensor nodes. As it was
mentioned in NPCB algorithm ,each node selects the
nearest node in the adjacent cell as the next node. then
nodes are only allowed to transmit on topology links
formed. We compute total energy used by network for
the state in which all nodes inside the network transmit traffic toward sink node.
Fig. (5) shows the results of simulation for the number of different nodes with a fixed number of hops. As
you observe in the figure, for all nodes that are few
in number in all algorithms almost similar results are
obtained. This is because the number of low traffic
nodes and the interference which is produced is not
considerable, But for higher number of nodes since the
proposed algorithm considers the traffic produced in
each cell, by decreasing transmission range in denser
cells, the total energy used in the network reduces.

When we intend to move toward sink node, we select


the next node from the nodes of the next cell because
this cell becomes closer to sink node.
According to equation (9) each node can select another 3.2
node in the next cell.

Second scenario

In this scenario, first the number of fixed nodes is as(9) signed 1000, then total energy used by algorithms for
different numbers of hop are compared. As it was menCondition I(j) 1 demonstrates that when we tioned in NPCB algorithm, the closest node in adjacent
cell is selected as the next node. Then, nodes are perreach the first cell, we must transmit to sink node.
After finding nodes that are located in the next cell, we mitted to transmit on topology links. Total energy
select the node which is in the shortest distance from consumption of the network for the state in which all
the nodes inside the network transmit traffic toward
present node.
sink node is computed.
Fig. (6) demonstrates the simulation results for a fixed
number of nodes and different number of hops. As you
observe in the figure, more energy is used for the few
3 Performance evaluation
number of hops compared to the case the number of
hops is larger. This is because the more the number of
The efficiency of proposed algorithm has been com- hops increases, the transmission range reduces and, as
pared with that of DCEERP and SDEL algorithms a result, energy consumption reduces, too.
using MATLAB software, and efficiency measurement As you see NPCB algorithm improves energy consumpwas conducted based on total energy consumption by tion at least 10 percent compared to other algorithms
sensor nodes. The energy used in transmitting a unit mentioned.
I(j) = I(i) − 1,  I(j) ≥ 1    (9)



Conclusion

In this paper we presented a new algorithm for monitoring topology with delay constraint cover of the given
traffic. This algorithm reduces energy consumption by
asymmetric division of network environment and, as a
result, decreases the transmission range of high traffic nodes and increases the transmission range of low
traffic nodes. In simulation it was shown that NPCB
algorithm improves energy consumption compared to
other similar methods.
The proposed algorithm given the traffic and permissible delay constraint, divides the network environment
in such a way that the energy used is balanced among
cells. Then by considering the energy consumed by
Figure 4: Topology produced for 2500 nodes in 6 cells. each cell, we can compute the radius size of each cell
and transmission range between two adjacent cells.

Refrences
[1] J. Pan, Y. T. Hou, L. Cai, Y. Shi, S. X. Shen, and V. Tech,
Topology Control for Wireless Sensor Networks, ACM International Conference on Mobile Computing and Networking , MobiCom03 (2003), 286-299.
[2] P. Santi, Topology control in wireless ad hoc and sensor
networks, ACM Comput. Surv. 37/2005 (2005), 164-194.
[3] M. A. Labrador and P. Wightman, Topology Control in
Wireless Sensor Networks, Springer (2009), 1-100.

Figure 5: comparison of energy consumption in algorithms with a fixed number of hops.

[4] S. Zarifzadeh, A. Nayyeri, and N. yazdani, Efficient Construction of Network Topology to Conserve Energy in
Wireless Ad-Hoc Networks, Computer Communications
31/2008 (2008), 160-173.
[5] T. He, J. Stankovic, C. Lu, and J. Stankovic, Speed: A stateless protocol for real-time communication in sensor networks, In ICDCS (2003).
[6] P. K. Pothuri, V. Sarangan, and J. P. Thomas, Delayconstrained energy efficient routing in wireless sensor networks through topology control, Proc. IEEE ,Int. Conf.Netw.
Sensing Control (2006), 35-41.
[7] H. Xu, L. Huang, W. Liu, G. Wang, and Y. Wang, Topology
control for delay-constraint data collection in wireless sensor networks, Computer Communications 32/2009 (2009),
1820-1828.
[8] M. Burkhart, P. Rickenbach, and R. Wattenhofer, Does
topology control reduce interference?, Proceedings of the 5th
ACM international symposium on Mobile ad hoc networking and computing mobihoc 04 (2004).
[9] K. Akkaya and M. Younis, Energy-aware delay-constrained
data in wireless sensor networks, Journal of Communication Systems (2004).

Figure 6: Comparison of energy consumption in algorithms with a fixed number of nodes.


[10] Habib M. Ammari and Sajal K. Das, A tarde-off between


energy and delay in data dissemination for wireless sensor
networks using transmission range slicing, Computer Communications (2008), 1687-1704.

Two-stage Layout of workstations in an organization based on clustering and using an evolutionary approach
Rana ChaieAsl

Shahriar Lotfi

Reza Askari Moghadam

Payame nour University of Tehran

Tabriz University

Payame nour University of Tehran

Technical Group

Computer Science Department

Technical Group

chaieasl@gmail.com

shahriar lotfi@tabrizu.ac.ir

askari@pnu.ac.ir

Abstract: In this paper, an evolutionary algorithm is presented for solving the layout of workstations in organizations. Because of the NP-hard complexity of layout problems, providing solutions for problems of large size is not possible through other mathematical methods; evolutionary approaches, however, are able to offer optimal or near-optimal solutions for these problems. In this paper, the purpose of layout in organizations is maximizing the closeness of workstations to each other. The proposed solution performs the layout by using the amount of working relationships between people and the workstations, and by using evolutionary algorithms and a clustering process. The layout process is performed in two stages: the first stage involves the arrangement of individuals at the workstations, and the second stage involves the arrangement of the workstations in the existing buildings. The proposed algorithm has been evaluated with illustrative examples and a research study.

Keywords: The facility layout problem; layout of workstations; clustering; evolutionary algorithms.

Introduction

Facility layout problems (FLPs) range in scale from


the assignment of activities to cities, sites, campuses
or buildings, to the location of equipment and personnel groups on a single floor of a building [1]. A facility
layout is an arrangement of everything needed for production of goods or delivery of services. A facility is an
entity that facilitates the performance of any job. It
may be a machine tool, a work centre, a manufacturing cell, a machine shop, a department, a warehouse,
etc [2]. The FLP is a common industrial problem of
allocating facilities to either maximize adjacency requirement or minimize the cost of transporting materials between them. The maximizing adjacency objective uses a relationship chart that qualitatively species
a closeness rating for each facility pair. This is then
used to determine an overall adjacency measure for a
given layout. The minimizing of transportation cost
objective uses a value that is calculated by multiplying together the flow, distance, and unit transportation
cost per distance for each facility pair [3].
As regards, Complexity of the facility layout problems
Corresponding

Author


is Non-polynomial type-hard, Quadratic Assignment


Problem (QAP) approach cannot provide an accurate
view of the important qualitative impact on the layout
design process and When the problem size is large, required to many computational time [4]. To overcome
the above problems, researchers found a trend toward
the use of approximate approaches. Several authors
have been used of the meta-heuristic approaches to
obtaining near-optimal designs. Chwif, Pereira Barretto (1998), proposed an algorithm based on a simulation annealing that regard to the typical facility layout problems. Other presented solutions can be mentioned are genetic algorithm (Wu, Appleton, 2002) and
ant colony algorithms (Solimanpur, Vrat, and Shankar)
[4].
Due to the necessity of alignment issues, in this article have been paid to present the layout of human
resources in organizations by using an evolutionary algorithm to decide a suitable arrangement of the individual and the workstation considering the number of
stations, number of floors and buildings. In the article
the Section 2 describes the problem statement, in Section 3 proposed solution and algorithm is described,


evaluation and practical results will be presented in minimizing inter-dependencies between clusters. So we
Section 4.
have [5]:
K
X

Problem Statement

OF =

The issue of human resources layout in workstations is


includes two stages: In first stage, the arrangement of
employees at workstations (rooms) must be done. This
arrangement should be done so that people who work
in a field of activity and they also have a more working
relationship, be placed in a same workstation or the
nearest one.
In second stage, the layout of stations should be done
in terms of their relationships, to different floors of existing buildings.

K
X

Ai

i=1

i,j=1
K(K1)
2

K > 1 (3)

K represents the number of clusters within the layout.


F O is thus bound between -1 (no cohesion within the
clusters) and 1 (no coupling between the clusters) [5].
In the first stage, with the involvement of the relations
between individuals, the number of workstations and
their capacity as the issue input, Formula (1) to obtain the connection of workstations (clusters) and the
formula (2) is used to obtain the relationship between
workstations. In the second stage, the matrix obtained
from the previous step, which represents the amount
of communication between workstations together and
3 The proposed solution
having in hand the number of buildings, Number of
The layout human resources problem in this paper has floors, each floor-based stations and the distance bebeen done by using clustering operation. This opera- tween buildings, like stage one, the clustering of stations has been investigated in different floors.
tion is implemented by genetic algorithm.
In the paper [8], has been investigated to how to solve
the first stage of problem. In this section, in addition
3.1 Encoding
to a brief description of the first stage, the second issue
Encoding method used As follows, Each node in the
is resolved.
For clustering, two types of connection are introduced graph has a unique numerical identifier assigned to it.
between organizational units. Intra-connectivity is These unique identifiers define which position in the
a measure of the density of connections between the encoded string will be used to define that nodes clusnodes of a single cluster. intra-connectivity (Ai ) of ter [5]. For example, this string, S = 11332 Expresses
cluster i with Ni components and i intra-edge depen- the position of components 1 and 2 are in cluster 1,
dencies (relationships to and from modules within the components 3 and 4 are in cluster 3 and component 5
is in cluster 3.
same cluster) as [5]:
i
Ni (Ni 1)

0 Ai 1

(1) 3.2 Selection


Methods used in this algorithm, including of roulette
Ai is 0 when the modules in a cluster are not connected, wheel, tournament and ranked.
and 1 when each module in a cluster is connected to
every module in the same cluster (including itself).
Inter-connectivity is a measure of the connectivity 3.3 Elitism Concept
between distinct clusters. Inter-connectivity Ei,j be- This method increases the efficiency of GA and will
tween clusters i and j, each consisting of Ni and Nj prevent the loss of good answers that was obtained.
components respectively, with ij inter-dependencies
(relationships between the modules of both clusters)
as [5]:
3.4 GA Operators
Ai =


Ei,j =

0
ij
2Ni Nj

if
if

i=j
i 6= j

(2)

Ei,j is 0 when there are no inter-dependencies between


clusters i and j, and 1 when every module in cluster j
depends on every module in cluster and vice versa.
Now with defined two types of connection, we can specify the objective function. The objective is maximizing
intra-connectivity between the modules in a cluster and

According to the type of issue, to avoid repeating


the content of genes in chromosomes, in the algorithm has been used of Partially Matched Crossover
(PMC) method [7]. Mutation operation to avoid missing strings with high fitness when the population will
converge to an optimal position. In the proposed algorithm, to prevent of changes in chromosome structural
is used of swap mutation operator.

351

The Third International Conference on Contemporary Issues in Computer and Information Sciences

3.5

The Termination Condition

In the presented algorithms, with the completion of


generation based on a constant rate the performance
of the algorithm is terminated.

3.6

The Penalty Method


Figure 1: Multipart Graph

In the second stage, during the generation process,


chromosomal structure may be broken and the chromosomes are made with distinctive structure of the
original design of the buildings. To solve this problem,
penalty coefficient is applied for defective chromosomes
and reduced the fitness of them.

3.7

Constraint

Figure 2: Complete Graph

One of the constraints is that the possibility of transfer


stations with the same capacity in the various floors of
buildings or the Possibility of relocation of buildings
or floors. Another limitation is the distance between
floors and buildings from each other; So that the stations have less work relationship with other floors in
the one building, need be housed in the other buildings.

Assessment and practical results

In this section, to the validity of the proposed algorithm has been assessed.

4.1

Table 1: Practical results calculated for figer 1

Assess the validity of the proposed


algorithm

Functional correctness of the proposed algorithm, have


been tested by giving two examples. Examples have
shown in Figures (1 and 2). Due to the nature of a
complete graph where all nodes are connected together,
calculated value for the objective function should be
zero, but about the multipart graph, each sector of
graph should be placement in one cluster. The results
are presented in Table (1).

352

4.2

Convergence and stability assessment

The purpose of these tests is to evaluate the efficacy


of the algorithm. Figure (3) shows the rate of convergence of a random algorithm over 200 generations.
This figure includes two graphs. Thay have other two
graphs: Diagram corresponding to the average of objective function calculated and diagrams that show the
objective function calculated for a random graph in any
algorithm implementation. To prove the stability rate
of answers provided by the stability diagram for the
implementation of the algorithm 10 times for a random graph (Figure 4) is used. The standard deviation
calculated in order to prove the stability of the algorithm is used. This numeric value should be convergent
to zero. SD values obtained for both phases have such
characteristics.

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

The institutions organizational chart is presented in


Figure (5) and the design of building of this organization shown in Figure (6). The results of the algorithm,
for given inputs and also, the results of evaluating the
current situation in the organization chart layout is
presented in Table (4). By comparing these results together, work rate of the algorithm presented is clearly
demonstrable.
Figure 3: Convergence diagram for a random graph

Figure 4: Stability diagram for a random graph

4.3

Figure 5: The institutions organizational chart

Case Study

In this section, a case study conducted at the Institute


of nabi akram in Tabriz. In this study, using a questionnaire (Table 2), the relationship between people
and areas together have been achieved based on evaluation criteria. In order to meet priorities of target
indicators, the coefficients for each of them has been
considered. Parameters studied and their corresponding coefficients are in Table (3).
Figure 6: Design of the organization
Table 2: Questionnaire sample rate between
individuals

Table 4:The obtained results of calculate the


purposed algorithm

Table 3: Indicators to assess the relationship between


people

Conclusion

Facilities layout design is considered as a very important and influential strategy in any administrative ac-

353

The Third International Conference on Contemporary Issues in Computer and Information Sciences

tion. Through a suitable arrangement, the relationship


between an organization and cycle works reduced to the
lowest value and prevent the waste of space consumer
in organizations. On the other hand, the proven effectiveness of evolutionary algorithms in solving problems
with large search space, Chosen this approach can be
very efficient as a tool to resolve these issues.

[3] KY. Lee, SN. Han, and M. Roh, An improved genetic algorithm for facility layout problems having inner structure
walls and passages, Comput Oper Res 30 (2003), 117138.
[4] T. Hamann and F. Vernadat, The intera-cell layout problem in automated manufacturing systems, Rapports de
Recherche 1603 (1992).
[5] D. Doval, B. Mancoridis, and B. Mitchell, Automatic Clustering of Software Systems Using a Genetic Algorithm, Doctor
of Philosophy, Drexel University, Philadelphia, 19104 (1999).
[6] Ch. Hicks, A Genetic Algorithm Tool for Optimising Cellular or Functional Layouts in the Capital Goods Industry, Int.
J. Production Economics 104 (2006), 598614.

Refrences
[1] R.S. Liggett, Automated facilities layout: past, present and
future, Automation in construction 9 (2000), 197215.
[2] A. Dria, H. Pierreval, and S. Hajiri-Gabouj, Facility Layput
problems: A survey, Annual reviews in Control 31 (2007),
255267.

354

[7] F. Azadivar, JJ. Wang, and B. Sadeghi Bigham, Facility layout optimization using simulation and genetic algorithms,
Int J Prod Res 38 (2000), 43694383.
[8] R. ChaieAsl and Sh. Lotfi, Layout of human resource with
evolutionary algorithm approach, first national conference of
Scholars for computer and IT Science of Tabriz University
(2011).

CAB : Channel Available Bandwidth Routing Metric for Wireless


Mesh Networks
Majid Akbari

Abolfazl Toroghi Haghighat

Institute for Qazvin Azad University

Institute for Qazvin Azad University

Department of Computer and Information Sciences

Department of Computer and Information Sciences

majid.ag1@gmail.com

athaghighat@yahoo.com

Abstract: In this paper we propose a routing metric for Wireless Mesh Networks called Channel
Available Bandwidth that is aware of each link load. This metric assign weights to each link which
are based on different causes of channel busyness and available bandwidth. This article aims to
find high throughput paths and avoid route traffic to congested network areas that balance the
traffic. Using obtained weights for each link that comprises path, we are able to consider load and
interference of each link. We combine this new metric with routing protocol AODV and evaluate
performance of this metric using simulation. We show that proposed metric by distinguishing
between different causes of channel busyness obtains high throughput paths compared other related
metrics.

Keywords: Wireless Mesh Network; Routing; Routing Metric; Channel Busyness; Load Aware; Interference Aware.

Introduction

in WMN is the hop-count metric, as used in DSR[2],


AODV[3], and DSDV[4]. It reflects the path- length in
hops, and in many cases the shortest physical path is
used. However, it is insensitive to the quality of links
between hops and to channel busyness ratio.
The ETX routing metric [5], is defined as the expected
number of MAC layer transmissions for successfully
delivering a packet through a wireless link. ETX reflects the difficulty with which the MAC layer sends
a packet to its destination. The weight of a path is
defined as the summation of the ETX values of all links
along the path. In this way, ETX considers both path
length and packet loss ratio. However, ETX fails to
capture the link transmission rate or the interference
from other links.
With the delivery ratio in both forward and reverse
directions, denoted by df and dr , respectively, ETX
is calculated as follows :

Wireless Mesh Networks (WMNs) is an emerging network technology that offers wireless broadband connectivity[1]. They can provide a cost-effective and
flexible solution for extending broadband services to
areas where cabling is difficult. In WMNs, most of the
nodes are either static or minimally mobile and do not
rely on batteries. The goal of routing algorithms is
hence to improve network capacity or the performance
of individual communications, instead of dealing with
mobility or minimizing power consumption.
Since most users of WMNs are interested in accessing the internet or using services provided by some
servers, the traffic is mainly directed towards gateways,
or from gateways to clients. Based on the specific requirements of WMNs, we believe that a good routing
protocol should find paths with minimum delay, max1
imum data rate and low levels of interference. In this
(1)
ET X =
d

dr
f
sense, an effective routing metric, which is used by
routing protocols, must be able to capture the quality
The ETT routing metric[6], is an improvement on
of the links effectively.
The simplest and most commonly-used routing metric ETX made by considering the differences in link trans Corresponding

Author, T: (+98) 918 3071887

355

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

mission rates. ETT is defined as the amount of time


which is needed to transmit a packet through the link.
The weight of a path is the summation of the ETT
values of all links on this path. It is defined as follows:
ET T = ET X

S
B

(2)

S represents the packet size and B represents the


bandwidth or capacity of the link. Despite the improvement with respect to ETX, ETT still fails to
capture the interferences among different links.
Among the metrics that are capable of measuring load
and quality of link, Avail [7] identifies paths with better capacity and nodes with less load. This metric is
calculated throughput of each link using three variables : the packet loss probability (p), the fraction of
busy time (Fb ) and the average duration of a busy
period (Tb ).
In this metric estimation of the packet loss probability
is similar to ETX routing metric. Estimation of Fb is
based on amount of time in which channel is sensed
busy due to other nodes activity. Also, Avail routing
metric consider interflow interference by making conflict graph for a path links.
ACAP [8] routing metric to calculate the amount of
channel busyness ratio considers times spent to transmissions in addition to amount of time that channel
was sensed busy due to other nodes transmissions.
This metric next hop nodes consider congestion domain of nodes, but make no distinction for different
causes of channel busyness reasons. In this metric the
channel busyness ratio is computed as follows :
CB =

Tbusy + Ttransmitting
Tidle + Tbusy + Ttransmitting

(3)

Where CB represents the amount of channel busyness ratio, Tbusy is equal to amount of time that node
senses the channel as busy, TT transmitting is equal to
amount of time that node transmits frames, Tidle is
equal to amount of time that node senses the channel
as idle.
Here Tidle , Tbusy , TT transmitting and are computed as
follows[9]:
data
ack
+
+ SIF S + DIF S
bitrate bitrate
data
+ ack timeout + SIF S + DIF S
=
bitrate

Tsuccessf ul =
Tcollision

Ttransmitting = Tsuccessf ul + Tcollision

(4)

Tidle is equal to number of slot-time channels sensed


as idle multiplied to 20s, and Tbusy is equal to number
of slot-time channel sensed as idle multiplied to 20s.

None of the previous studies make any distinction


among different causes of channel busyness reasons.
These reasons could be attributed to amount of channel busyness due to successful transmission compared
to unsuccessful transmissions, and also amount of channel busyness due to other nodes activity in the interference range.
In this paper given these reasons , computing actual
data rate of each link provided the routing metric with
valuable information. This information that obtained
locally from MAC layer of each link reflects current
state of the network, and helps us to select the next
hop node.
The rest of this paper is organized as follows: Section 2
presents the proposed routing metric, and the routing
protocol into which the routing metric is integrated.
Simulation results and their analysis are presented in
Section 3. Finally, conclusions are drawn in Section 4.

Proposed Metric

As it was mentioned considering the total transmission


time or channel busyness time without making any distinction between them, in some situations could lead
to making incorrect decision in selecting the next hop.
In this paper we introduce the proposed metric CAB
where we make distinction between successful and unsuccessful transmission through assigning weight for
them. Also, present data rate for each link is computed in different time periods. This strategy helps us
in the next hop select a node that makes a larger available bandwidth for us. Directing traffic toward nodes
that have had smaller unsuccessful transmission rate,
causes our metric to prevent from growing transmission
queues and as a result reduce intraflow interference for
the traffic in the network.
Also, channel busyness ratio due to transmitting message from other neighboring nodes could reflect congestion around that node and as a result in an area of
the network. To prevent from creating congested areas
it is better to avoid selecting the nodes of these areas
as the next hop.
Selecting the next hop in such areas not only causes
the packet loss in current flow but also damages other
flows in interference domain that is called interflow interference. As we will see in the next sections, assigning weight to time spent for successful and unsuccessful transmissions implicitly increases the importance of
this type of channel busyness which is a favorite outcome.

356

The Third International Conference on Contemporary Issues in Computer and Information Sciences

2.1

Channel Busyness Ratio

damaging other flows that are around there.


As it was revealed from the example considering parameter as a weight for channel busyness ratio in
To obtain channel busyness ratio we act as following
case of successful transmission, and (1 ) as a weight
given its different reasons:
for channel busyness ratio in collision case leads to reduction in the channel busyness ratio for node m in
Tsucc.
Tcoll.
Tbusy
CB = (
) + (1 ) (
)+(
) which successful transmissions have higher amount of
Ttotal
Ttotal
Ttotal
(5) time. This is an improvement compared to previous
studies that made no distinction among these nodes,
in which:
and in all three nodes as next hop, the channel busyness ratio was considered 50 percent.
Ttotal = Tsucc. + Tcoll. + Tbusy + Tidle
Parameter 0 1 is a tunable quantity which
given the objective of metric in order to specify lower
weight to successful transmissions, it should be less
than 0.5. During simulations conducted, the best domain for quantities of this parameter have been shown.
Other parameters applied in this equation are similar
to those of equation (4).
For example, suppose that in a time period of 10s a
node i has three choices as the next hop that must select one among them : node m that sensed the channel
busy for 5s during last 10s due to the other neighboring nodes, but it has had no transmission. Node n
which has had only a five-second successful transmission, but has not sensed the channel as busy. And node
p that has had a five-second unsuccessful transmission
(collision), has not sensed the channel as busy. If we
use the methods used in previous metrics for all three
nodes mentioned, the channel busyness ratio is 50 percent of total time period.
Now using equation (5) and considering = 0.3, we
have: CBm = 0.15. It means that from the viewpoint
of our proposed metric, the channel busyness ratio in
this node is 15 percent of total time period.
In this way, by using equation (5) the channel busyness
ratio in node n is 35 percent of total time period and
in node p, 50 percent of total time period.
Hence node m as the node selected for the next hop
is of more priority compared to other two nodes, node
n is of second and p of the last priority. Given these
results of the example above, our proposed metric, in
addition to channel busyness ratio, it implicitly considers the probability of collision and packet loss in
selecting the next hop. Because packets that transmitted toward node m are less likely to collide compared
to other nodes.
Node n experiences collision during its total transmission time of packets. Thus selecting it as next hop
means leading traffic toward the link where the probability of packet loss is high. Also, selecting node p
as the next hop means leading traffic toward the link
where the probability of interference with other flows
around it is high. In regard with node p more outcomes
are expected because selecting it as the next hop means

357

2.2

Computing Available Bandwidth

In previous stage we were able to compute the channel busyness ratio. Now, we are going to estimate
the available bandwidth in each link. The importance
of this work becomes clear when we know that due to
changes in wireless network environment, the data rate
in the links varies, in other words, wireless links are
multi-rate[6]. Nodes in wireless network when due to
multiple unsuccessful transmission, sense the busyness
of environment and other reasons become aware of unsuitable condition of its environment, reduce their rate
automatically.
To compute the proposed metric, CAB, that in addition to channel busyness ratio, present data rate is also
taken into consideration and acts as follow :
CAB = (1 CB)

S
Ts

(6)

Where CB represents the busyness rate of the node


which was derived from equation (5), S the size of
transmitted packet and Ts the time required for successful transmission of packet that may include the retransmissions.
As a result, in selecting next hop, the node is selected
that provides higher available bandwidth (CAB).

2.3

Operation of CAB metric in AODV


routing protocol

In order to implement the proposed metric in AODV


routing protocol we use presented method in paper[10].A source broadcasts a Route Request (RREQ)
for that destination because it has to find a path when
a source has data to transmit to a known destination.
This procedure is called as route discovery. Each node
sends a RREQ broadcast packet with computed CAB
during route discovery until RREQ arrives at a destination.

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

In initial time, the value of CAB is set to infinity value


when a source sends an RREQ to destination. At each
intermediate node, a reverse link to the source is created in routing table of it when a RREQ with CAB is
received.
In order to decide minimum value of CAB, each intermediate node has to compare the current value of
its CAB with the CAB value in RREQ received from
previous node . That is why minimum CAB value influences mainly the overall throughput the path. Subsequently a node records the minimum CAB values in
routing table. Each node has to create and send RREQ
with minimum CAB. Its process is repeated to arrive at
the destination. When RREQ packets arrive at its destination, a destination finds the maximum CAB value
among a lot of received CAB value . It is to select the
path having the maximum throughput among candidates. Although intermediate nodes in RREQ procedure find the minimum CAB values of each link, the
destination selects the maximum CAB value for maximum throughput overall path. We present the following two pictures for more details about mechanism of
CAB procedure.
Finally, a destination generates a Route Reply (RREP)
with a computed result. The generated RREP is sent
in a hop-by-hop fashion to the source. As the RREP
propagates, each intermediate node creates a route to
the destination.
The value of CAB is computed using the AODV agents
by querying information about channel busyness and
transmission time from the MAC Layer. For example,
MAC layer becomes aware of channel busyness using
NAV mechanism[11].
For example, consider Fig. (1). In this topology node
s intends to transmit data toward destination d. The
value above each link is the indicator of the available
bandwidth in that link based on Mbps. As you observe
there are two routes from the source to destination,
that is, (s, a, b, c, d) and (s, a, e, f, d).
At first look, node a must choose node b as next hop because it has higher available bandwidth, but due to existence of link (b, c) with available bandwidth of 3Mbps
that is considered as a bottleneck on this path, the destination node will reply to RREQ received from node
f. maximum available bandwidth on path (s, a, b, c, d)
is equal to 3Mbps and on route (s, a, e, f, d) it is equal
to 5Mbps.

The performance of the proposed routing metric CAB


is compared with performances of ETX and ETT by
using NS-2 [12] simulator. Performance of metrics is
evaluated based on network throughput, mean end-toend delay and sensitivity of metric to the existence of
interfered traffics.

3.1

First scenario

In first scenario, it is shown that CAB metric can recognize the existence of traffic in the network and take
new traffic flow away from the congested area. The
simulation was conducted in an area of 270m 270m,
that included 36 nodes. CBR flow was used for interfered traffic flow. The size of transmitted packets
was 1400 bytes. Fig. (2) shows the throughput of link
when there is interfered traffic in the network. With
increasing this traffic, throughput reduces.
As it is shown, when interfered traffic in the network
is low, all three metrics, that is, ETX, ETT and CAB
have high throughput. Since ETX and ETT metrics
are not sensitive to interference and other nodes busyness, by increasing this traffic, they are faced with high
reduction in throughput. The results show the superiority CAB metric to other metrics.

Figure 2: The sensitivity of metrics to the interfered


traffic

3.2

Figure 1: A Simple Topology

Performance evaluation

Second scenario

In this scenario 64 nodes have been located in an area


of 700m 700m. Also the size of transmitted packets

358

The Third International Conference on Contemporary Issues in Computer and Information Sciences

are 1400 bytes. Fig. (3) and (4) illustrate throughput and end-to-end delay of all the network for metrics
compared, respectively. CAB has a higher throughput compared to the other two metrics. Distribution
of traffic in less congested areas of the network by this
metric is the reason for its higher throughput.
Concerning end-to-end delay since CAB metric avoids
selecting nodes with high busyness ratio, at first, it
shows higher delay, but over time, since it balances
traffic better, as a result end-to-end delay is lower and
its standard deviation is less than that of delay.

real data rate of each link, and reduces the packet loss.
The conducted simulations illustrated that CAB can
increase mean network throughput and reduce end-toend delay compared to other metrics.
Our metric collects all obtained information locally and
like ETX and ETT metrics does not impose the overhead of broadcast or unicast packets to the network.
In future work, we are going to evaluate the performance of our metric in multi-channel wireless mesh networks, because information obtained from CAB each
node supplied with multiple channels for transmitting
or receiving is very valuable.

Refrences
[1] I. F. Akyildiz, X. Wang, and W. Wang, Wireless mesh networks :a survey, Computer Networks 47/2005 (2005), 445
487.
[2] David B. Johnson, David A. Maltz, and Yih-Chun Hu, The
Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR), Internet-Draft (2004).
[3] C. Perkins, E. Belding-Royer, and S. Das, Ad hoc Ondemand Distance Vector (AODV) Routing, rfc 4561 (2003).
[4] C. E. Perkins and P. Bhagwat, Highly dynamic DestinationSequenced Distance-Vector routing (DSDV) for mobile
computers, SIGCOMMComput. Commun. Rev. 24/1994
(1994), 234244.

Figure 3: Network Throughput

[5] S. J. De Couto, D. Aguayo, J. Bicket, and R. Morris, A HighThroughput Path Metric for Multi-Hop Wireless Routing,
9th annual international conference on mobile computing
(MobiCom 03) (2003).
[6] R. Draves, J. Padhye, and B. Zill, Routing in Multi-Radio,
Multi-Hop Wireless Mesh Networks, in MobiCom 04: Proceedings of the 10th annual international conference on
Mobile computing and networking Philadelphia, PA, USA:
ACM Press (2004).
[7] T. Salonidis, M. Garetto, A. Saha, and E. Knightly, Identifying high throughput paths in 802.11 mesh networks: a
model-based approach, IEEE International Conference on
Network Protocols(ICNP) (2007), 2130.
[8] Nemesio A. Macabale Jr., Roel M. Ocampo, and Cedric Angelo M. Festin, Attainable Capacity Aware Routing Metric
for Wireless Mesh Networks, The Second International Conference on Adaptive and Self-Adaptive Systems and Applications (2010).
[9] H. Zhai, X. Chen, and Y. Fang, How Well Can the IEEE
802.11 Wireless LAN Support Quality of Service?, IEEE
Transactions on Wireless Communications 4/2005 (2005),
30843094.

Figure 4: Average end-to-end delay

Conclusion

In this paper, we present a new routing metric in order


to find routes with high throughput in wireless mesh
networks. This metric obtains routes with higher available bandwidth using information obtained from MAC
layer such as the amount of node channel busyness and

359

[10] J. Kim, J. Yun, M. Yoon, K. Cho, H. Lee, and K. Han,


A Routing Metric Based on Available Bandwidth in Wireless Mesh Networks, Advanced Communication Technolgy
(ICACT), The 12th International Conference on 12/2010
(2010).
[11] G. Bianchi, Performance analysis of the IEEE 802.11 distributed coordination function, IEEE Journal on Selected
Areas in Communications (2000).
[12] The network simulator NS-2, Obtained from the Internet,
http://www.isi.edu/nsnam/ns/.

A PSO Inspired Harmony Search Algorithm


Farhad Maleki

Ali Mohades

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

farhad.maleki@aut.ac.ir

mohades@aut.ac.ir

F. Zare-Mirakabad

M. E. Shiri

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

f.zare@aut.ac.ir

shiri@aut.ac.ir

Afsane Bijari
University of Economic Science
Department of Knowledge engineering and decision science
afsaneh bijari@yahoo.com

Abstract: In this paper, PSOHS, as a new version of Harmony Search is presented. The proposed
algorithm, while creating a new harmony, takes advantage of both Social and Cognitive components
of Particle Swarm Optimization algorithm. To examine the proposed algorithm, PSOHS is tested
using a battery of standard benchmark functions and the results certify the superiority of PSOHS
over HS, IHS, GHS, and EGHS algorithms.

Keywords: Optimization; Meta-heuristic Algorithms; Harmony Search; Particle Swarm Optimization.

Introduction

There are many bio-inspired algorithms such as


Simulated Annealing[1], Genetic Algorithm[2], Tabu
Search[3, 4], Particle Swarm Optimization[5], Ant
Colony Optimization[6], Bee Colony Optimization[7],
and so on. These methods serve as useful search algorithms which imitate natural phenomena to solve optimization problems.
Harmony Search (HS), as a meta-heuristic algorithm
was developed by Geem et al[8]. This algorithm was
inspired by the process of musical performance that
takes place when a musician searches for a perfect state
of harmony. HS have been widely studied and applied
to solve both standard benchmark functions and real
world problems[912].
There are several approach aimed at improving Harmony Search algorithm. Improved Harmony Search
Corresponding

(IHS) was proposed by Mahdavi et al to improve the


standard version of HS by dynamically updating PAR
and bw [13].In addition, Global best harmony search
(GHS) as another variant of HS, was proposed by Omran and Mahdavi[14]. GHS, Inspired by the concept of
swarm intelligence, removes the PAR component and
adds a social component to the HS using the best harmony in harmony memory. Moreover, EGHS as another modification of HS was Proposed by [15]. EGHS
excludes Harmony Memory Considering Rate (HMCR)
and Pitch Adjustment Rate (PAR) and includes a Location Updating Probability (LUP). In EGHS, each
component of a harmony is updated using a random
value or a function of the worst and the best harmony
in HM.
In this paper, using both cognitive and social components of a Particle Swarm Optimization, a new version
of HS is proposed. The rest of the paper is organized
as follows. In Section 2, the HS, IHS, GHS, and EGSH

Author, P. O. Box 15875-4413, F: (+98) 21 6649-7930, T: (+98) 918 740-1022

360

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

algorithms are introduced in more details. Section 3 on each strategy. Fig. 1 depicts the pseudo-code of
present a brief overview of Particle Swarm Optimiza- this algorithm.
tion. In Section 4, the proposed algorithm is presented.
Section 5 focus on the experimental results. The paAlgorithm HS
per ends with a brief conclusion and future research
Data:
proposals in Section 6.
N=size of harmonies in HM
d=dimension of each harmony in HM
PAR, HMCR, bw
2 Harmony Search and Particle
Result:
The best harmony in HM
Swarm Optimization
Begin
while Stopping Criteria not Satisfied do
j=1;
The purpose of this section is to provide a detailed
while j d do
description of the necessary information which is esif rand() HM CR then
sential to pursue the rest of the paper. In the following
k = a random number {1, , N };
subsections HS, IHS, GHS, and EGHS are represented.
H(j) =HM(k, j);
In addition, Particle Swarm Optimization algorithm is
if rand() P AR then
depicted in this section.
H(j) = HM (k, j) bw;
end
else
2.1 Harmony Search
H(j) = A random value from the
possible range;
end
Harmony Search (HS) was inspired by the musical proj=j+1;
cess of searching for a perfect state of harmony. In HS,
end
each potential solution for the problem is coded as a
if Harmony H dominate the worst hormony
feature vector named harmony and the goal is to find
in HM then
a global optimum as determined by a fitness function.
Substitute H for the worst Harmony in
HS take advantage of a limited subset of successful exHM
periences, i.e. the fitest solutions. These harmonies are
end
gathered in a memory called Harmony Memory (HM ).
end
The algorithm continues for a number of iterations; in
end
each iteration a new harmony is generated. When the
Algorithm 1: Harmony Search Pseudo-code
New Harmony was generated, it is compared with the
worst harmony in harmony memory using the fitness
IHS, GHS and EGHS are variants of original Harfunction; if the new harmony dominates the worst har- mony Search. let us take a look at each one.
mony, the new harmony will be used as a substitute
for the worst one.
Let us use HM(i, j) to show the j th component of the 2.1.1 IHS Algorithm
ith harmony in Harmony Memory. In order to create
a new harmony, all components or features of this new
harmony should be computed, let us name this new IHS was proposed to overcome the shortcomings of
harmony H. HS employs three strategies to compute original harmony search. It is exactly the same as
each component H, e.i. H(j). As the first strategy, one HS except for PAR and bw; IHS dynamically updates
of the harmonies in HM is selected randomly, e.g. the PAR and bw according to equation 1 and equation 2,
ith harmony, and then the value of HM(i,j) is assigned respectively.
to H(j). As the second strategy, one of the harmonies
in HM is selected randomly, e.g. the ith harmony, and P AR(t) = P ARM in + (P ARM ax P ARM in ) t (1)
NI
then an adjacent value of HM(i,j) is assigned to H(j).
Finally, as the third strategy, a random value from the
bwM in
ln ( bw
)
M ax
possible range is used as H(j). To compute each H(j),
bw(t) = bwM ax exp (
t)
(2)
NI
HS uses one of these three strategies and therefore it
needs to decide on one of them. HS uses two param- where P AR(t) is the Pitch Adjustment Rate for genereters named Harmony Memory Consideration Rate ation t; P ARM in is the minimum Pitch Adjustment
(HMCR) and Pitch Adjustment Rate (PAR) to decide Rate; P ARM ax is the maximum Pitch Adjustment

361

The Third International Conference on Contemporary Issues in Computer and Information Sciences

Rate; t is the generation number; N I is the number


of iteration in the algorithm; bw(t) is the bandwidth
for generation t; bwM in is the minimum bandwidth,
and bwM ax is the maximum bandwidth.
All other parts of IHS is exactly the same as original
Harmony Search.

2.2

Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a population


based search algorithm in which each individuals emulate the success of neighboring individuals and their
own successes; the overall result of these imitations
leads us to optimal regions of a high dimensional search
space. In PSO, each particle represents a potential
solution; these particles are flown through the search
space, where the position of each particle is adjusted
2.1.2 GHS Algorithm
according to its own experience and that of its neighbors. In the following the process of position updating
The GHS has exactly the same steps as the IHS with is described in detail
the exception that instead of a peach adjustment bw Let Xi (t) represent the position of ith particle in search
for a component, e.g. j th component, it use the follow- space at time step t. The position of each particle is
ing equation
updated according to equation 3.
Xi (t + 1) = Xi (t) + Vi (t + 1)

h(j) = HM (best, k)
Where k is a random integer number from {1, 2, ..., d},
and best is the index of the best harmony in HM[14].

(3)

Where Vi (t + 1) is the velocity of particle i at time step


t + 1, and it is calculated as follows (equation4):
Vi,j (t + 1) = Vi,j (t) + C1 r1,j (t)(Xp,j (t) Xi,j (t))+
C2 r2,j (t)(Xg,j (t) Xi,j (t))
th

2.1.3

EGHS Algorithm

The EGHS and the HS are different in three aspects as


follows:
First, instead of Harmony Memory Considering Rate
(HMCR) and Pitch Adjustment Rate (PAR), the
Location Updating Probability (LUP) is used in
EGHS.Second, the process of creating new harmony
is different from the original HS. new harmony improvisation in EGHS is as follows:
Harmony Improvisation in EGHS
Begin
foreach component j (j=1 to d) do
if rand() < LU P then
H(j) = HM (best, j) C(t)
(HM (best, j) HM (wort, j)
else
H(j) = A random value from the
possible range;
end
end
end
Algorithm 2: Pseudo-code of Harmony Improvisation in EGHS

(4)
th

Where Vi,j (t) denotes the j component of the i particles velocity vector at time step t; Xi,j (t) represents
the j th component of the ith particles position vector at
time step t; C1 and C2 are positive acceleration constants used to scale the contribution of the cognitive
and social components, respectively. And also r1,j (t)
and r2,j (t) are uniformly distributed random values in
[0, 1], Xp,j (t) is the best position visited by ith particle
since the first time step. And finally Xg,j (t) is the best
position found by swarm i.e. all particles. Both Xp,j (t)
and Xg,j (t) are determined by the use of a fitness function which evaluate each particle to find how close the
corresponding solution is to the optimum. It is obvious
that velocity vector drives the optimization process,
and reflects both the Social and cognitive knowledge of
particles. Originally, gbest and lbest PSO algorithms
have been developed which differ in the size of their
neighborhoods [5]. In gbest PSO, each particle is supposed to be the neighbor of all other particles. In lbest
PSO the degree of connectivity among the population
is less than the gbest PSO. There are many different
social network structures such as Wheel, pyramid, Four
Clusters and Von Neumann which have been developed
and studied for PSO[5].

PSOHS Algorithm

In EGHS C(t),a descending function, is used as an


optimization scale for each time step t. Finally, the
EGHS replaces the worst harmony in HM with the new Exploration and exploitation are two principal compoharmony (H) even if it does not dominate the worse nents of any meta-heuristic search algorithm. Exploharmony in HM.
ration involves finding new solutions in search space

362

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

and exploitation involves utilizing the solutions which


have been found during search process in order to construct new high quality solutions. It is obvious that
any search algorithm should keep balance between
these two concepts. In the first steps of search process,
the algorithm should focus on exploration for discovering appropriate solutions. Then, especially in the
final steps of the algorithm, these solutions should be
exploited to generate new solutions. Therefore almost
all algorithms explicitly or implicitly value exploitation
more than exploration in their final steps.
There are two major weakness which have a bad influence on the ability of HS as a meth-heuristic search
algorithm. First, the HMCR and PAR are the probabilities which control the exploration and exploitation
rate, and they are constant all through the search
process. Second, harmony search generates a new harmony in each iteration and, as it mentioned above, this
new harmony is considered if it dominates the worst
harmony in HM. The problem is that after a number of
iterations, some relatively good or average harmonies
constitute Harmony memory. After that point, HS in
too many of its iterations do not lead us to any change
in HM and it means those iterations are a waste of
time and computation resources. Furthermore, this
phenomenon increases the probability of trapping into
local optima.
Although, booth IHS and GHS take advantage of dynamically updating PAR and bw to keep balance between exploration and exploitation, they update the
HM when a generated harmony which is the result of
improvisation process dominates the worst harmony in
HM. In addition, in the final steps of these algorithms,
HM consists of relatively fit harmonies and therefore
new harmonies hardly ever take part in HM. It means
that both IHS and GHS suffer from the second weakness, presented above.
To overcome the first shortcoming PSOHS, similar to
IHS and GHS dynamically changes PAR and bw using
equation 1 and 2, respectively. In addition HMCR is
changed according to the following equation
HM CR(t) = HM CRM in +
(

HM CRM ax HM CRM in
)t
NI

(5)

PSOHS address the second issue by defining the elite


harmonies as the fittest harmonies in HM. After creating a new harmony (H), it is substituted for a randomly
chosen non-elite harmony. It is obvious that whenever
the number of elite harmonies are equal or greater than
1, PSOHS never loses its best solution which have been
found during search process. In addition, the number
of elite harmonies should be less than the number of
harmonies in HM, i.e. N, otherwise the proposed algorithm does not consider the new harmonies.

To improve the algorithm, the main ideas of particle


swarm optimization is used during the process of harmony improvisation. The pseudo-code of the proposed
algorithm is as follows:
Algorithm PSOHS
Data:
N=size of harmonies in HM
d=dimension of each harmony in HM
P ARM in , P ARM ax ,
C1 , C2 , HM CRM in , HM CRM ax
NI, PSOCR, bwM ax , bwM in
Result:
The best harmony in BHM
Begin
HM= Initialize Harmony Memory including N
harmonies each of dimension d;
BHM=HM;
/* BHM is the best harmonies which have been
found by each individual since the first step of
the algorithm*/
t=1;
while Stopping Criteria not Satisfied do
foreach j=1 to d do
if rand() HMCR(t) then
k = a random number {1, , N };
if rand() PSOCR then
H(j)=C1 rand()
(HM(best,j)-HM(k,j))+C2
rand() (BHM(k,j)-HM(k,j));
else
H(j) =HM(k,j);
if rand() PAR(t) then
H(j) =HM(k,j) +
(2(rand()-1)) bw(t);
end
end
else
H(j) = A random value from the
possible range
end
end
if H dominates BHM(k), i.e. the k th
harmony in BHM then
BHM(k)=H;
end
l = a random number {1, , N } where
HM(l) are not an Elite harmony;
Substitute H for the worst HM(l);
t=t+1;
end
end
Algorithm 3: PSOHS Pseudo-code

363

The Third International Conference on Contemporary Issues in Computer and Information Sciences

Experimental Results

To investigate the quality of the proposed algorithm,


it is tested by a battery of 10 standard benchmark functions. The following parameter setting
was used in all experiments. N = 30,PSOCR=0.6,
HM CRM ax = 0.9, HM CRM in =0.50, P ARM in =0.01
and P ARM ax =0.99.
NI=20000,C1 = C2 =
1
, where UB
2, bwM in =0.000001 and bwM ax = (U BLB)
and LB are the uperbound and lower bound for each
benchmark function. For all functions d is equal to 30
except for the Camel-Back function where d=2.
Table 1. depicts the results of the proposed algorithm
for 10 benchmark functions (f1 , , f10 ) and Table 2
shows the name of each benchmark function.
Considering the experimental result provided in [14],
the results certifies the superiority of the proposed algorithm over the other variants of HS algorithm.

dard benchmark functions; the result confirmed that


the proposed algorithm works well on the benchmark
functions, and in comparison to HS, IHS, GHS, and
EGHS, the proposed algorithm lead us to satisfying
results in a very few iterations. The highly promising
outcome of this research suggests that the proposed
search algorithm is an effective algorithm for solving
engineering optimization problems.

Refrences
[1] S. Russell and P. Norvig, Artificial Intelligence : A Modern Approach (3rd Edition), prentice Hall, Chapter 4, pages:
125126, 2010.
[2] T. Mitchel, Machine Learning, McGraw Hill, Chapter 9,
pages: 249270, 1997.
[3] F. Glover, Tabu search-part I, Computer ORSA Journal on
Computing 1 (1989), 190206.
[4] F. Glover, Tabu search-part II, INFORMS Journal on Computing 2 (1990), 432.

Table 1. Experimental Results


Fnction
Mean
SD
f1
0.000034
0.000008
f2
0.000759
0.002988
f3
19.715321
0.000031
f4
0.000620
0.000299
f5
0.0000
0.0000
f6
12451.1338
21.1305
f7
0.092701
0.087710
f8
11.0869
10.9452
f9
0.000849
0.007511
f10
-1.03162851 0.0000004
f1
f2
f3
f4
f5
f6
f7
f8
f9
f10

[5] A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons Ltd, Chapter 4, pages: 285357,
2007.
[6] A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons Ltd, Chapter 4, pages: 359411,
2007.
[7] S. Bitam, M. Batouche, and E. Talbi, A survey on bee colony
algorithms, IEEE International Symposium on Parallel Distributed Processing Workshops and Phd Forum IPDPSW
(2010).
[8] Z. W. Geem and J. H. Kim, A new heuristic optimization
algorithm: Harmony search, Simulation 76 (2001), 6068.
[9] ShiaF., X. Xia, c. Chang, G. Xu, X. Qina, and Z. Jia, An Application in Frequency Assignment Based on Improved Discrete Harmony Search Algorithm, 2011 International Conference on Advances in Engineering (2011).

Table 2. Rule Names


Sphere function
Schwefels problem 2.22
Rosenbrock
Step function
Rotated hyper-ellipsoid function
Schwefels problem 2.26
Rastrigin function
Ackleys function
Griewank function
Six-hump Camel-back function

[10] D. Zou, L. Gao, J Wu, and S. Li, Novel global harmony


search algorithm for unconstrained problems, Neurocomputing 73 (2010), 33083318.
[11] Z. W. Geem, J. H. Kim, and G. V. Loganathan, A Harmony search optimization: application to pipe network design, International Journal of Model Simulation 22 (2002),
125133.
[12] J.H. Kim, Z.W. Geem, and E.S. Kim, Parameter estimation
of the nonlinear Muskingum model using harmony search,
Journal of American Water Resource Association 37 (2001),
11311138.
[13] M. Mahdavi, M. Fesanghary, and E. Damangir, A An
improvedharmonysearchalgorithm for solvingoptimizationproblems, Applied Mathematics and Computation 188
(2007), 15671579.

Conclusion

[14] M. Omran and M. Mahdavi, Global-best harmony search,


Applied Mathematics and Computation 198 (2008), 643
656.

In this paper, PSOHS was proposed based on Harmony


Search and Particle Swarm Optimization. The proposed algorithm was tested using a battery of stan-

364

[15] D. Zou, L. Gao, S. Li, and J. Wu, An effective global harmony search algorithm for reliability problems, Expert Systems with Applications 38 (2011), 46424648.

Repairing Broken RDF Links in the Web of Data by Superiors and


Inferiors sets
Mohammad Pourzaferani

Mohammad Ali Nematbakhsh

University of Isfahan

University of Isfahan

Department of Computer

Department of Computer

Pourzaferani@eng.ui.ac.ir

Nematbakhsh@eng.ui.ac.ir

Abstract: Broken Link is a well-known obstacle for the Web of Data. There are a few researches
which have repaired broken links through the destination point. These approaches have two major
problems: (i) A single point of failure (ii) inaccurate changes reported from destination to source
dataset. In this paper, we introduce a new method to repair broken links through the source point of
the link. At the time of detecting broken link, it repaired instantly. For this purpose, the algorithm
finds the new address for desirable target from the destination data source and use Just-In-Time
method. In this paper, we introduce two sets which we call them Superiors and Inferiors. Through
these two sets, we create an exclusive graph structure for every entity that need to be observed
and make a fingerprint-like identification for the entity. The result shows that almost 90 percent of
broken links that are referred to a person entity in DBpedia have been repaired.

Keywords: Broken Links; Link Integrity; Superior Entity; Inferior Entity; Just-In-Time.

Introduction

Linked Data is a practical way to achieve Semantic


Web goals. Technically, Linked Data refers to data
published on the Web in such a way that it is machinereadable [3]. In this way machines can process and
mining information without using complex natural language processing techniques which in many times bring
back inappropriate results, therefore this difference between document web and Web of Data makes a big
achievement for Linked Data technology[4]. This technology makes a new type of web that is well-structured
and machine-readable. The RDF Vocabulary Definition Language (RDFS) and the Web Ontology Language (OWL), provide a basis for creating vocabularies
that can be used to describe entities in the world and
how they are related [4]. Vocabularies are collections
of classes and properties. Vocabularies are themselves
expressed in RDF, using terms from RDFS and OWL,
which provide varying degrees of expressivity in modeling domains of interest. Anyone is free to publish
vocabularies to the Web of Data, which in turn can be
Corresponding

connected by RDF triples that link classes and properties in one vocabulary to those in another, thereby
defining mappings between related vocabularies [3].
Ontologies which used in this technology make the links
which connect entities have a meaning that is understandable for machines. Datasets which published according to Linked Data publishing rules and semantic
links like RDF links which connect these datasets to
each other, make a global cloud called Web of Data.
Although there are various types of tool for publishing information regards to Linked Data publishing
rules and many tools for discovery semantic linked between the entities, there are not suitable tools to preserve these links [4]. The purpose of link preservation
is maintaining links for a long period of time. Link
preservation also refers to activities taken to fix broken
links. Broken links appeared in two situations: (i) destination of the link is not dereferencable: in this case,
the destination entity address had deleted or changed.
(ii) definition of the entity had changed over the time:
in this case the entity is dereferencable but its semantic
got changed in such a way that it doesnt mean what

Author, P. O. Box 81746-73441, F: (+98) 311 793-2670, T: (+98) 311 793-4106

365

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

the producer means by it any more. In this paper, we


have involved broken link that has been structurally
get broken. The majority research related to this field
has two common weaknesses, namely: (i) checking and
fixing broken links are held in the destination point
of the link. (ii) Most of the approaches had a bottleneck and make a single point of failure. The proposed
approach in this paper tries to check and fix the broken link from the source point and with distributing
the approach to all of datasets we can cover the second
drawback. Our approach presents two entity sets which
we call them: Superiors and Inferiors. With the usage
of these two datasets the algorithm finds the missing
entity to repair the broken link.
Table 1: Statistics regarding DBpedia dataset
Measure
Value
#Persons entities in 3.6 version
296595
#Entities had changes in address
4726
#Entities had deleted
16527
#Entities added in 3.7 version
510636

Problem Statement

Broken links has introduced in some other topics such


as web and hypertext systems and known as Link Integrity. Actually this problem is divided to two sub
problems [5]:
Dangling Link : this type caused when the destination of the link is not accessible. This may
caused by deleting or movement of destination
entity.

these links have not repaired, it caused some problems


in program and decreases the quality of dataset.

Proposed Algorithm

The proposed algorithm uses semantic links between


entities to repair broken links. The RDF model encodes data in the form of subject, predicate, object
triples. The subject and object of a triple are both
URIs that each identify a resource, or a URI and a
string literal respectively [3]. The predicate specifies
how the subject and object are related, and is also
represented by a URI [3]. Each entitys RDF model
creates a graph structure for that entity. The fundamental idea that we introduce in this paper is that
entities mostly preserve their graph structure even after movement to another address. Despite the changes
that held in properties of the entity, the graph structure
not modified and it is possible to distinct that entity
from other entities by graph structure. The proposed
approach pursuit two basic goals:

Introduce a graph structure for an entity that


used as a fingerprint and distinguish it from other
entities.
Searching for an entity in whole of dataset is not
a good approach. Therefor we introduce an approach to reduce the search area.

Editing Problem: Although the destination en- The proposed algorithm consist of five basis modules
tity is accessible but its semantic got changed in that we have a short description for them.
a way that never meaning what the author mean
at creation.
Broken links are a considerable problem as they interrupt navigational paths in a network leading to the
practical unavailability of information [3]. Broken links
in Web of Data are even worse than they are in the document Web: human users are able to find alternative
paths to the information they are looking for. Such
alternatives include directly manipulating the HTTP
URI they have entered into a Web browser or using
search engines to re-find the respective information.
They may even decide that they do not need this information at this point in time [4]. This is obviously
much harder for machine actors. Therefore redirection
of the problem to application is not suitable because if

Superiors
Repository

Destination
Data Source

Inferiors
Repository

Local
Links
Outside
Links

Crawler
Controller

Analyzer
Ranking
Query
Maker

Figure 1: The proposed algorithm architecture.

366

The Third International Conference on Contemporary Issues in Computer and Information Sciences

3.1

3.4

Entity Repository

This module is responsible for storing entities. Changes


in structure and content is a prominent aspect of data
sources. Therefor, this module should remove obsoleted entities from repository and store new address
for moved entities.
Table 2: Superiors and Inferiors statistics
Measure
Value Percent
#Entities which have Superior
in 3.6 version
274
35%
#Entities which have Inferior
in 3.6 version
786
100%
#Entities which have Superior
in 3.7 version
454
58%
#Entities which have Inferior
in 3.7 version
786
100%
#Entities which have Superior
in 3.6 and 3.7 versions
246
31%
#Entities which have Inferior in
3.6 and 3.7 versions
786
100%

Analyzer

The basic idea behind the approach is that in this step


Crawler Controller search for the superiors of each entity inferior and inferiors of each superior, afterward it
gets this step reverse and find the destination entity
through comparison in their graph similarity. Because
there are many entities create in this manner, therefore we should remove entities which are not assisting
the algorithm to find the target entity. This module
chooses semantic links which are more important for
an entity, afterward ranking type of links for their importance and give the result to Query Builder module
to create an appropriate query. We also introduce a
threshold to limit the reduction of this module. As
shown in Fig. 3 if the threshold point increased, we
have not noticed a big change in performance of the
algorithm but if threshold set to minimum, there is a
major decrease in comparison computation (Fig 2,3).
Superior Entity: An entity that has a semantic link referencing the observed entity. With the observed entity described as (S, P, O) in the RDF data model, each entity that has this entity as its Object is included in the Superiors set.

Inferior Entity: An entity to which the observed entity has a semantic link. Following the previous definition, each entity that has the observed entity as its Subject is included in the Inferiors set.
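The two definitions above map directly onto simple RDF triple patterns. The sketch below is a minimal Python illustration of how such queries could be built and run against a SPARQL endpoint; the endpoint URL, the example entity URI and the run_query helper are illustrative assumptions, not part of the paper's implementation.

```python
# Minimal sketch: building SPARQL queries that collect the Superiors and
# Inferiors sets of an observed entity, as defined above. The endpoint URL
# and this particular client code are assumptions made for illustration.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint

def superiors_query(entity_uri):
    # Entities that have the observed entity as their Object.
    return f"SELECT DISTINCT ?s WHERE {{ ?s ?p <{entity_uri}> . }}"

def inferiors_query(entity_uri):
    # Entities that the observed entity links to (observed entity as Subject).
    return f"SELECT DISTINCT ?o WHERE {{ <{entity_uri}> ?p ?o . FILTER(isIRI(?o)) }}"

def run_query(query):
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return {row[var]["value"] for row in rows for var in row}

# Example usage for one (hypothetical) entity of interest.
entity = "http://dbpedia.org/resource/Example_Person"
superiors = run_query(superiors_query(entity))
inferiors = run_query(inferiors_query(entity))
```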

3.3 Query Maker

When the Analyzer module has analyzed the Superiors and Inferiors sets, it gives the statistics to the Query Maker module; here the appropriate query is built and supplied to the Crawler Controller module. SPARQL is used as the RDF query language to build the queries.

3.4 Analyzer

The basic idea behind the approach is that, in this step, the Crawler Controller searches for the superiors of each inferior entity and the inferiors of each superior entity; it then reverses this step and finds the destination entity by comparing their graph similarity. Because many entities are created in this manner, we should remove the entities that do not assist the algorithm in finding the target entity. This module chooses the semantic links that are most important for an entity, then ranks the link types by their importance and gives the result to the Query Maker module to create an appropriate query. We also introduce a threshold to limit the reduction performed by this module. As shown in Fig. 3, when the threshold point is increased we do not notice a big change in the performance of the algorithm, but when the threshold is set to the minimum there is a major decrease in the comparison computation (Figs. 2 and 3).

Figure 2: Target entities with threshold = 1.

Figure 3: Target entities with threshold = 75.


The main reason why raising the threshold point does not affect the precision is that only a few entities have a similar graph structure. Therefore, when we decrease the threshold we do not see a noticeable change in the results (Fig. 4).

Figure 4: Number of correct targets at various thresholds.

3.5 Ranking

In this module we find the entity most similar to the target entity and pass it on as the final result. We use the Jaro-Winkler algorithm to calculate the similarity between the properties of two entities. The algorithm uses Eq. (1) to get the target entity from the Inferiors and Eq. (2) to get the target from the Superiors; afterwards, Eq. (3) produces the final target.

e1 = max(sim(I(S_e), I_e))    (1)

e2 = max(sim(S(I_e), S_e))    (2)

Target = max(e1, e2)    (3)
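To make Eqs. (1)-(3) concrete, the sketch below scores the candidates reached through each direction of the crawl by property similarity and keeps the best of the two directions. It is a minimal illustration only: the entity representation (a dictionary of properties) and the jaro_winkler helper passed in are assumptions standing in for the Jaro-Winkler comparison used in the paper.

```python
# Minimal sketch of the target selection in Eqs. (1)-(3). Entities are
# represented as dictionaries of properties; jaro_winkler is an assumed
# string-similarity function supplied by the caller.
def property_similarity(entity_a, entity_b, jaro_winkler):
    """Average similarity over the property names shared by both entities."""
    shared = set(entity_a) & set(entity_b)
    if not shared:
        return 0.0
    return sum(jaro_winkler(str(entity_a[p]), str(entity_b[p])) for p in shared) / len(shared)

def best_candidate(broken_entity, candidates, jaro_winkler):
    """Return (candidate, score) with the highest property similarity."""
    best, best_score = None, -1.0
    for cand in candidates:
        score = property_similarity(broken_entity, cand, jaro_winkler)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

def final_target(broken_entity, via_superiors, via_inferiors, jaro_winkler):
    e1 = best_candidate(broken_entity, via_superiors, jaro_winkler)  # Eq. (1)
    e2 = best_candidate(broken_entity, via_inferiors, jaro_winkler)  # Eq. (2)
    return max(e1, e2, key=lambda pair: pair[1])                     # Eq. (3)
```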

4 Evaluation

In order to evaluate our system, we used the DBpedia dataset. The DBpedia project has derived such a data corpus from the Wikipedia encyclopedia [2]. It plays a significant role in the Web of Data, and many data publishers have begun to set data-level links to DBpedia resources, making DBpedia a central interlinking hub for the emerging Web of Data [2].

4.1 Evaluation method

To evaluate the system, we extracted 296,595 person entities from DBpedia version 3.6. As shown in Table 1, 4,726 of these entities have a change in their address. To evaluate our algorithm, we sampled 1,056 entities and evaluated them. Some of these entities have a change in their address and refer to another entity which may not be a person entity. Of all the entities, 786 refer to a person entity; therefore our evaluation dataset contains 786 person entities, which are supplied to the proposed algorithm, and the result is their new addresses. The success of the algorithm depends on the intersection between the Superiors and Inferiors of the two versions (further information is shown in Table 2).

As we noted earlier, DBpedia gives us the changed addresses and uses the wikiPageRedirects link between the entities in the two versions. These links are used as the basis metric to calculate the precision of the algorithm.

Table 3: Changes in Superiors and Inferiors entities

Measure       Value   Changed Address   Percentage
#Superiors      987        74               7%
#Inferiors     6366       477               7%

5 Discussion and Future Works

In this article we introduced an algorithm to repair broken RDF links in the Web of Data. To this end we defined two datasets, which we called Superiors and Inferiors. The results show that the algorithm finds a target entity for all of the entities and, in 694 cases, finds the true target of the entity. In future work we will evaluate the proposed algorithm on other datasets and try to repair broken links that have semantic changes at their destination.

References

[1] H. Ashman, Electronic Document Addressing: Dealing with Change, ACM Computing Surveys 32 (2000), 201-212.
[2] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, DBpedia: A Nucleus for a Web of Open Data (2007), 722-735.
[3] C. Bizer, T. Heath, and T. Berners-Lee, Linked Data - The Story So Far, International Journal on Semantic Web and Information Systems (IJSWIS) 5 (2009), no. 3, 1-22.
[4] N. Popitsch and B. Haslhofer, DSNotify: A Solution for Event Detection and Link Maintenance in Dynamic Datasets, Web Semantics: Science, Services and Agents on the World Wide Web 9 (2011), no. 3, 266-283.
[5] R. Vesse, W. Hall, and L. Carr, Preserving Linked Data on the Semantic Web by the Application of Link Integrity Techniques from Hypermedia, ISWC 09, 2009.
[6] J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov, Discovering and Maintaining Links on the Web of Data, ISWC 09, Springer-Verlag, Berlin, Heidelberg, 2009.

Palmprint Authentication Based on HOG and Kullback Leibler


Ma. Yazdani
Jahrom University, Jahrom, Iran
mj.yazdani1988@gmail.com

F. Moayyedi
School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
moayyedi@cse.shirazu.ac.ir

Mi. Yazdani
Jahrom University, Jahrom, Iran
mi.yazdani@hotmail.com

Corresponding author, T: (+98) 936 109-8785

Abstract: Palmprint is a unique and reliable biometric characteristic with high usability. Given the increasing demand for automatic palmprint authentication systems, the development of accurate and robust palmprint verification algorithms has attracted a lot of interest. This paper focuses on a palmprint recognition method using the Histogram of Oriented Gradients (HOG). The proposed method has three main stages. In the first stage, preprocessing is performed to segment the Regions Of Interest (ROI). In the next stage, HOG is applied to the ROIs extracted in the previous step. In the final stage we employ the Kullback-Leibler divergence to measure the similarity between two palms. The experimental results and the False Acceptance Rate (FAR), False Rejection Rate (FRR) and Total Success Rate (TSR) of the system illustrate that HOG in conjunction with the Kullback-Leibler divergence constitutes a powerful, efficient and practical approach for automatic palmprint authentication systems.

Keywords: Biometric Authentication; Histogram of Oriented Gradient (HOG); Kullback Leibler; Palmprint.

Introduction

Biometrics identifies different people by their physiological and behavioral differences, such as the face, iris, retina, gait, etc. As an alternative personal identity authentication method, biometric identification has attracted increasing attention in recent years. In the field of biometrics, the palmprint is a novel but promising member [1]. It has been attracting much attention because of its merits, such as high speed, user friendliness, low cost, and high accuracy. However, there is room for improvement of online palmprint systems in the aspects of accuracy and resistance to spoof attacks [2].
A palmprint has three types of basic features: principal
lines, wrinkles, and ridges (see Figure 1 ) and they have
been analysed in various ways. Zhang and Shu [3] applied the datum point invariant property and the line
feature matching technique to conduct the verification
process via the palm-print features. They inked the palmprints on paper and then scanned them to obtain 400*400-pixel images. This is not suitable for many on-line security systems because two steps are needed to obtain the palmprint images in their approach. Furthermore, Zhang [4] proposed a texture-based feature extraction method to obtain the global attributes of a palm. In addition, a dynamic selection scheme was also designed to ensure that the palm-print samples are correctly and effectively classified in a large database. Han [5] used an operator-based approach to extract line-like features from palmprints for palmprint verification. Duta [6] employed palm lines, including principal lines and wrinkles, for identity recognition, and Li [7] analysed them in the frequency domain.
In this paper we propose a palmprint authentication system based on the Histogram of Oriented Gradients (HOG) that can operate in real time. The technique counts occurrences of gradient orientations in localized portions of an image. Since palm lines run in different directions for different people, this is a good separating attribute. The output of HOG is a histogram, in other words a distribution, and for the comparison of two distributions the well-known Kullback-Leibler divergence is used. In the following sections we describe each part in more detail.

Figure 1: Important lines in a palmprint.

Methodology

As mentioned above, the proposed system built upon HOG has three main stages. In the first stage, preprocessing is performed to remove unnecessary parts of the palmprint and prepare it for the next stage. After that, HOG is used to extract the features of the palmprint. Finally, the Kullback-Leibler divergence is employed to measure the similarity between two palmprints. In this section these stages are described completely.
2.1 Preprocessing

Once the palmprint is captured, it is processed to obtain the region of interest (ROI). ROI extraction removes unnecessary parts of the image for further processing. This process also reduces, to some extent, the effect of rotation and translation of the hand. The major steps in preprocessing are:

- Apply a lowpass filter and use a threshold to convert the image to a binary image.
- Obtain the boundaries of the gaps.
- Compute the tangent of the two gaps.
- Rotate the image by the found angle (the way of finding the angle is shown in Figure 2).
- Find the gaps again to extract the ROI (with a size of 128*128).

Figure 2: Computing the angle of the palmprint.

Please refer to [9] for the detailed ROI determination process. Figure 3 illustrates a ROI image cropped from the original palmprint image.

Figure 3: Main steps of preprocessing: (a) original image, (b) binary image, (c) boundaries of the gaps, (d) rotated image with the gaps found again, (e) the original image for step (d), (f) extracted ROI (128*128).

2.2 Extracting the HOG Feature

The essential idea behind the Histogram of Oriented Gradients descriptor is that local object appearance and shape within an image can be described by the distribution of intensity gradients or edge directions. The implementation of these descriptors can be achieved by dividing the image into small connected regions, called cells, and for each cell compiling a histogram of gradient directions or edge orientations for the pixels within the cell. The combination of these
histograms then represents the descriptor. For improved accuracy, the local histograms can be contrast-normalized by calculating a measure of the intensity across a larger region of the image, called a block, and then using this value to normalize all cells within the block. This normalization results in better invariance to changes in illumination or shadowing. In this paper we used the code of [10] to extract the HOG feature. In Figure 4 you can see the plot of one palmprint's HOG feature.
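As a concrete illustration of this step, the short sketch below computes a HOG descriptor for a 128*128 ROI with scikit-image standing in for the descriptor code of [10]; the cell and block sizes are assumptions made for illustration, not the paper's settings.

```python
# Minimal sketch: HOG feature extraction for a 128x128 palmprint ROI using
# scikit-image as a stand-in for the code of [10]. Cell/block sizes are
# illustrative assumptions.
import numpy as np
from skimage.feature import hog

def extract_hog(roi):
    """roi: 2-D grayscale array of shape (128, 128)."""
    features = hog(
        roi,
        orientations=9,            # gradient-orientation bins per cell
        pixels_per_cell=(16, 16),  # cell size (assumed)
        cells_per_block=(2, 2),    # block size used for contrast normalization
        block_norm="L2-Hys",
    )
    # Normalize so the descriptor can be treated as a probability
    # distribution for the Kullback-Leibler comparison that follows.
    features = features + 1e-12
    return features / features.sum()
```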

Figure 4: Extracting the HOG feature: (a) an ROI of a palmprint, (b) plot of the normalized histogram.

2.3 Matching with the Kullback-Leibler divergence

The Kullback-Leibler (KL) divergence is a fundamental quantity of information theory that quantifies the proximity of two probability distributions [11]. It is a natural distance function from a true probability distribution, p, to a target probability distribution, q. It can be interpreted as the expected extra message-length per datum when using a code based on the wrong (target) distribution instead of a code based on the true distribution. For discrete (not necessarily finite) probability distributions, p = {p1, ..., pn} and q = {q1, ..., qn}, the KL-distance is defined to be

D_KL(P || Q) = \int_x ln(dP/dQ) dP = \sum_i p_i ln(p_i / q_i)    (1)

Consequently, the Kullback-Leibler divergence is a good measure for comparing two palmprint HOG features.
3 Experimental Results

This section presents and evaluates the results of the experiments carried out according to the three stages: preprocessing, feature extraction and matching.

The method was applied to a set of palmprint images taken from the PolyU database [8], which was collected from 250 volunteers. Samples were collected in two separate sessions. In each session, the subject was asked to provide 6 images for each palm. Therefore, 24 images from 2 palms were collected from each subject. In total, the database contains 6,000 images from 500 different palms with a size of 352*288 pixels. After preprocessing, the extracted ROI is 128*128 pixels.

The results were obtained by calculating the standard error rates (the false acceptance rate (FAR) and the false rejection rate (FRR)). FAR and FRR are defined, respectively, as

FAR = (Number of accepted impostor claims / Total number of impostor accesses) * 100    (2)

FRR = (Number of rejected genuine claims / Total number of genuine accesses) * 100    (3)

The system threshold value is obtained based on the Equal Error Rate (EER) criterion, where FAR = FRR. This is based on the rationale that both rates must be as low as possible for the biometric system to work effectively. Another performance measurement obtained from FAR and FRR is the Total Success Rate (TSR). It represents the verification rate of the system and is calculated as follows [12]:

TSR = (1 - (FAR + FRR) / Total number of accesses) * 100    (4)

In Table 1 you can see the above parameters calculated for the proposed system.

Table 1: Experimental results of the system for the HOG feature

FAR (%)   FRR (%)   TSR (%)
0.97      0.64      99.2

Our approach has been compared with Li's algorithm [7] and Zhang's approach [9]. In Li's algorithm, the R features and the theta features of the palmprint were extracted from the frequency domain to identify different persons; the R features showed the intensity of the lines of a palmprint and the theta features showed the direction of these lines.


Table 2: Comparison of different palmprint recognition methods

Technique               Li's algorithm [7]        Zhang's approach [9]   Our approach
Feature extraction      R feature and theta feature   2D Gabor phase     HOG
ROI size                -                         128*128                128*128
Verification rate (%)   96.32                     98                     99.2

Zhang tried to extract texture features from low-resolution palmprint images and proposed a 2-D Gabor filter for texture analysis. He used the Hamming distance for matching. Table 2 displays the comparison results.

Conclusion

Palmprint recognition is a relatively new biometric method for recognizing a person. In this paper, a palmprint verification system is developed using HOG and the Kullback-Leibler divergence. The features were extracted using HOG and the palms were matched using the Kullback-Leibler divergence. The experimental results show that a verification rate of 99.2% can be achieved using HOG and Kullback-Leibler. This shows that palmprint recognition performs well using the proposed method.

References

[1] Y. Han et al., Palmprint Recognition Based on Directional Features and Graph Matching, Springer-Verlag, Berlin Heidelberg (2007).
[2] D. Zhang, Z. Guo, G. Lu, L. Zhang, and W. Zou, An Online System of Multispectral Palmprint Verification, IEEE Transactions on Instrumentation and Measurement 59 (2010), 480-490.
[3] D. Zhang and W. Shu, Two Novel Characteristics in Palmprint Verification: Datum Point Invariance and Line Feature Matching, Pattern Recognition 32 (1999), 691-702.
[4] D.D. Zhang, Automated Biometrics: Technologies and Systems, Kluwer Academic Publishers, Dordrecht (2000).
[5] C. Han, H.L. Cheng, C.L. Lin, and K.C. Fan, Personal Authentication Using Palmprint Features, Pattern Recognition 36 (2003), 371-381.
[6] N. Duta, A. Jain, and K.V. Mardia, Matching of Palmprints, Pattern Recognition Letters 23 (2001), 477-485.
[7] W. Li, D. Zhang, and Z. Xu, Palmprint Identification by Fourier Transform, International Journal of Pattern Recognition and Artificial Intelligence 16 (2002), 417-432.
[8] PolyU Palmprint Database: http://www.comp.polyu.edu.hk/.
[9] D. Zhang, W.K. Kong, and J. You, On-Line Palmprint Identification, Biometrics Research Centre, Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong.
[10] O. Ludwig, D. Delgado, V. Goncalves, and U. Nunes, Trainable Classifier-Fusion Schemes: An Application to Pedestrian Detection, 12th International IEEE Conference on Intelligent Transportation Systems 1 (2009), 432-437.
[11] J. Shlens, Notes on Kullback-Leibler Divergence and Likelihood Theory, Systems Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037 (2007).
[12] T. Connie, A. Teoh, M. Goh, and D. Ngo, Palmprint Recognition with PCA and ICA, Image and Vision Computing NZ (2003), 227-232.


A Simple and Efficient Fusion Model based on the Majority Criteria


for Human Skin Segmentation
S. Mostafa Sheikholslam, Asadollah Shahbahrami, Reza PR Hasanzadeh
Department of Electronic and Computer Engineering, Faculty of Engineering, University of Guilan
Corresponding author: sheikholslam@msc.guilan.ac.ir

Nima Karimpour Darav
Islamic Azad University, Lahijan Branch

Abstract: A new skin segmentation model based on the combination of some of the most useful existing skin segmentation models is presented. First, different skin segmentation models that have been proposed for different color spaces are reviewed; then, according to their pros and cons, a new method is presented that combines three skin segmentation models (the elliptical model, the single Gaussian model and the HSV fixed range) according to a majority criterion. The advantage of this model is its high true detection rate together with a low false detection rate.

Keywords: Skin Color Segmentation; Skin Cluster; Single Gaussian Model; Elliptical Model; Majority Criteria.

Introduction

Segmentation is one of the main steps in digital image processing. There are many methods for image segmentation, such as thresholding, region growing, edge detection, and color segmentation. Skin color segmentation is taken as one of the useful methods in color segmentation and is applied in many applications such as face detection, hand gesture recognition, hand and face tracking, robotics and Human Computer Interaction (HCI) systems [1]. It is very hard to present a method that is able to overcome the challenges of skin segmentation, such as consistency against changes in light intensity [2, 3]. The result of a skin segmentation model can directly affect the accuracy of the mentioned applications. For instance, a weak skin segmentation algorithm cannot segment the hand region in an image correctly, and this leads to decreased accuracy in an HCI system.
According to [3], the human skin color components in different color spaces have a cluster distribution. Based on the color approach used in this paper, skin color segmentation models can be divided into three categories: models with an explicit threshold on the skin cluster, parametric models, and non-parametric models. Models with an explicit threshold on the skin cluster are built by determining the skin cluster boundary. One of the main problems of this method is its low consistency against changes in the lighting conditions [10]. The parametric models contain statistical models such as the single Gaussian and the mixture of Gaussians [4, 13]. For example, in the single Gaussian model, the distribution of the skin cluster is estimated by a Gaussian function, and then the parameters of this function are determined using a maximum-likelihood training algorithm. The small number of parameters required for building the single Gaussian model leads to a low storage space for this model. In the non-parametric models, such as the histogram-based model, two histograms are computed, one for skin and one for non-skin areas. After dividing every bin by the total count of elements, the probability that a pixel belongs to a skin or non-skin area can be computed [5, 14]. The accuracy of this model is high, although it needs a large amount of storage space. Another

important problem that directly affects the accuracy of a skin segmentation model is the type of color space. A color space with the ability to separate the chrominance components from the intensity component leads to a skin segmentation model with better consistency against changes in the lighting conditions.

In this paper, first some skin segmentation models are presented. Then a new technique is presented, based on a combination of three skin segmentation models (elliptical, single Gaussian, and HSV fixed range) and on a majority criterion. The proposed skin segmentation model is compared with some existing models in terms of the True Detection Rate (TDR) and False Detection Rate (FDR) measures. The results show that our skin segmentation model achieves a high TDR, close to 87%, and a low FDR, close to 1.4%. This means that the proposed approach is able to correctly detect the large portion of the image that is covered by human skin.

The rest of this paper is organized as follows: Section 2 introduces the color spaces. Skin color models are presented in Section 3. The proposed method is presented in Section 4. The performance of this method is presented in Section 5, and Section 6 concludes the paper.

2 Color Spaces

A wide variety of color spaces have been applied to skin color modeling. The most widely used ones are summarized below.

2.1 RGB and Normalized RGB

The RGB color space is the best-known color space; capturing and display systems work based on it. RGB describes color as a combination of three color rays (red, green and blue). The RGB color space is not a very good choice for skin segmentation because it mixes the chrominance and luminance components [6, 7]. Hence, normalized RGB tries to reduce the dependency of the chrominance components on the luminance of each pixel by the simple normalization procedure shown in Eq. (1):

r = R / (R + G + B),   g = G / (R + G + B),   b = B / (R + G + B)    (1)

so that

r + g + b = 1    (2)

2.2 YCbCr Color Space

The YCbCr color space is used by European television studios and for image compression standards [6]. The Y component is luminance and is computed as a weighted sum of the RGB values. The Cb and Cr components are two color-difference values, computed by subtracting the luminance from the Red and Blue components respectively [6]. The main advantage of this color space is that it separates the luminance component from the chrominance components. Eq. (3) shows the transformation matrix between RGB and YCbCr:

(3)

The separation between the chrominance and intensity components leads to a robust model in this color space [6, 7].

2.3 HSV Color Space

The HSV color space is one of the most common cylindrical-coordinate representations of points in the RGB color model. Hue is the dominant color, such as green or blue, Saturation is the colorfulness of a region in proportion to its brightness, and Value is related to the color luminance [6]. Eq. (4) gives the relations used to calculate the HSV components from the RGB components [4, 6, 7]:

(4)
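The sketch below illustrates the three conversions used later by the skin models: the normalized rgb of Eq. (1) computed directly, and OpenCV's built-in YCbCr and HSV conversions standing in for Eqs. (3) and (4), whose exact expressions are not reproduced in the text above.

```python
# Minimal sketch of the color-space conversions used by the skin models.
# OpenCV's built-in conversions stand in for Eqs. (3) and (4); the
# normalized rgb of Eq. (1) is computed explicitly.
import cv2
import numpy as np

def normalized_rgb(bgr):
    """Eq. (1): r, g, b = R/(R+G+B), G/(R+G+B), B/(R+G+B)."""
    bgr = bgr.astype(np.float32)
    total = bgr.sum(axis=2, keepdims=True) + 1e-6
    b, g, r = cv2.split(bgr / total)
    return r, g, b

def to_ycbcr(bgr):
    # OpenCV uses the YCrCb channel order; reorder to (Y, Cb, Cr).
    y, cr, cb = cv2.split(cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb))
    return y, cb, cr

def to_hsv(bgr):
    h, s, v = cv2.split(cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV))
    return h, s, v
```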


The most important characteristic of this color space is its explicit discrimination of luminance from chrominance [6, 7].

3 Related Work

In this section some of the skin segmentation models are presented.

3.1 Normalized RGB Skin Model

The model presented in [9] defines, as a classifier, three attributes in the normalized RGB color space. Eq. (5) illustrates this classifier:

(5)

where r, g, b are the normalized coordinates obtained from Eq. (1). The algorithm uses a restricted covering algorithm, obtained from the skin probability map described in [9]. Each pixel that satisfies Eq. (5) can be classified as a skin pixel.

3.2 Thresholding on the Skin Cluster in the YCbCr color space

The algorithm presented in [10] is one of the most widely used skin color segmentation models. Eq. (6) describes this model:

77 <= Cb <= 127,   133 <= Cr <= 173    (6)

In fact, Eq. (6) presents the range of the skin cluster obtained from a training data set. If an input pixel satisfies Eq. (6), it can be classified as a skin pixel.

3.3 Thresholding on the Skin Cluster in the HSV color space

It is possible to define the skin color model based on the skin cluster in the HSV color space. According to [11], the skin cluster can be defined by Eq. (7):

V > 40,   0.2 < S < 0.6,   0 < H < 25    (7)

A pixel is placed in the skin-pixel class when its values satisfy the above ranges.

3.4 Single Gaussian Model

Results reported in the literature indicate that a single Gaussian function can be used as a model for the skin color cluster [13]. The Gaussian probability function used as the skin model is defined by Eq. (8):

p(x) = (1 / ((2*pi)^(d/2) |S|^(1/2))) exp( -(1/2) (x - m)^T S^(-1) (x - m) )    (8)

where m is the mean vector, S is the covariance matrix and x is the input sample, which in a color image can be x = [Cb Cr]^T or x = [R G B]^T and so on. From a set containing the samples (pixels) of skin-covered regions in different images, and by applying the maximum-likelihood estimates defined in Eqs. (9)-(10), the parameters of the single Gaussian model can be determined:

m_ML = (1/n) * sum_i x_i    (9)

S_ML = (1/n) * sum_i (x_i - m_ML)(x_i - m_ML)^T    (10)

where m_ML is the mean of the samples and S_ML is the covariance of the samples, as mentioned above. For each input pixel, p(x) is first calculated according to Eq. (8). If p(x) is greater than the threshold, the input pixel is classified as a skin pixel; otherwise it is not. In this work we use 0.54 as the threshold.
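A compact sketch of the classifiers just described is given below: it applies the YCbCr ranges of Eq. (6), the HSV ranges of Eq. (7) (with H in degrees, S in [0, 1] and V in [0, 255], which is an assumption about the units), and the single Gaussian of Eqs. (8)-(10) with the 0.54 threshold used in this work.

```python
# Minimal sketch of the per-pixel skin classifiers described above. The HSV
# units (H in degrees, S in [0,1], V in [0,255]) are an assumption.
import numpy as np

def skin_ycbcr(cb, cr):
    """Eq. (6): explicit thresholds on the YCbCr skin cluster."""
    return 77 <= cb <= 127 and 133 <= cr <= 173

def skin_hsv(h, s, v):
    """Eq. (7): explicit thresholds on the HSV skin cluster."""
    return v > 40 and 0.2 < s < 0.6 and 0 < h < 25

def fit_sgm(skin_samples):
    """Eqs. (9)-(10): maximum-likelihood mean and covariance of skin samples."""
    skin_samples = np.asarray(skin_samples, dtype=float)
    mean = skin_samples.mean(axis=0)
    cov = np.cov(skin_samples, rowvar=False, bias=True)  # ML (biased) estimate
    return mean, cov

def gaussian_skin_probability(x, mean, cov):
    """Eq. (8): single multivariate Gaussian evaluated at sample x."""
    x = np.asarray(x, dtype=float)
    d = x.size
    diff = x - mean
    norm = 1.0 / np.sqrt(((2 * np.pi) ** d) * np.linalg.det(cov))
    return float(norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff))

def skin_sgm(x, mean, cov, threshold=0.54):
    """Single Gaussian decision with the threshold used in this work."""
    return gaussian_skin_probability(x, mean, cov) > threshold
```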

3.5 Elliptical Model

As described in [8], it is possible to determine an elliptical range in a set of skin samples and build a classifier according to it. Eqs. (11)-(12) show the elliptical model used in [12]:

(11)

(12)

In this classifier, theta is the rotation of the skin cluster, A is the major radius and B is the minor radius of the ellipse, and Cx and Cy denote the coordinates of the ellipse center. If the value of this equation is less than or equal to one, the pixel is classified as a skin pixel; otherwise it is not.

4 Proposed Model

Among the implemented models, the elliptical model has the highest TDR and FDR, while both the HSV and SGM models have a high TDR and a low FDR. The proposed technique is called the Skin Segmentation Combination Model (SSCM). Figure 1 shows the block diagram of the proposed algorithm.

Figure 1: The proposed model block diagram.

In this model, we combine three skin segmentation models: single Gaussian, elliptical, and HSV. The final result of the segmentation is selected according to a majority criterion. The majority criterion means that an input pixel is classified as a skin pixel if at least two of the techniques classify that pixel as a skin pixel.
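The majority rule itself is a one-line decision once the three individual classifiers are available. The sketch below only illustrates this combination step and assumes the per-pixel decisions from the previous sections (the variable names in the example are hypothetical).

```python
# Minimal sketch of the SSCM majority criterion: a pixel is skin when at
# least two of the three classifiers (single Gaussian, elliptical, HSV) agree.
def sscm_pixel(votes):
    """votes: iterable of booleans, one per individual skin classifier."""
    return sum(bool(v) for v in votes) >= 2

# Example for one pixel (sgm_says, ellipse_says, hsv_says are assumed to have
# been computed by the individual models described above):
# is_skin = sscm_pixel([sgm_says, ellipse_says, hsv_says])
```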

5 Experimental Results

5.1 Database

We used the ECHO [15] database to evaluate our model. This database contains videos under different lighting conditions.

5.2 Performance Measures

To compare the results, we used two measures that are useful for the performance analysis of skin segmentation models [8]. The first measure is the TDR, defined in Eq. (13):

True Detection Rate = (x / Y) * 100    (13)

where x is the number of pixels in the output image that are correctly classified as skin, and Y is the total number of skin-set members obtained from the original image. The second measure is the FDR, defined in Eq. (14):

False Detection Rate = (w / Z) * 100    (14)

where w is the number of pixels in the output image that are incorrectly classified as skin, and Z is the total number of non-skin-set members obtained from the original image.
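Eqs. (13) and (14) can be computed directly from the predicted and ground-truth binary masks, as in the short sketch below.

```python
# Minimal sketch of the TDR (Eq. 13) and FDR (Eq. 14) measures computed
# from a predicted binary skin mask and a ground-truth mask.
import numpy as np

def tdr_fdr(predicted, ground_truth):
    predicted = predicted.astype(bool)
    ground_truth = ground_truth.astype(bool)
    x = np.logical_and(predicted, ground_truth).sum()   # correctly detected skin
    w = np.logical_and(predicted, ~ground_truth).sum()  # falsely detected skin
    Y = ground_truth.sum()                              # all skin pixels
    Z = (~ground_truth).sum()                           # all non-skin pixels
    return 100.0 * x / Y, 100.0 * w / Z
```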

5.3 Implementation Results

According to Figure 2, the YCbCr model shows a good TDR, while its FDR is high. The HSV model has a better TDR than the YCbCr model and its FDR is lower too. It is clear from Figure 2 that the elliptical model has the best TDR but its FDR is high, while the HSV and SGM models have a high TDR and a low FDR. The SSCM model achieves a low FDR while its TDR is still high. Table 1 shows the TDR and FDR values for all of the discussed models. Figure 4 shows the binary masks obtained from the discussed models and from SSCM.

Figure 2a: Detection Rate. 1- NRGB, 2- YCbCr, 3- HSV, 4- SGM, 5- Elliptical, 6- SSCM, respectively.

Figure 2b: False Detection Rate. 1- NRGB, 2- YCbCr, 3- HSV, 4- SGM, 5- Elliptical, 6- SSCM, respectively.

Figure 4: a) Original, b) NRGB, c) YCbCr, d) HSV, e) SGM, f) Elliptical, g) SSCM.

Table 1: True and False Detection rates.

6 Conclusions

In this paper some skin segmentation models are introduced and their pros and cons are discussed. Then a new triple skin segmentation model, which we call the Skin Segmentation Combination Model (SSCM), is presented; it consists of the elliptical boundary, single Gaussian and HSV fixed-range models and is based on a majority criterion. The experimental results illustrate that this model has a high true detection rate, close to 87%, while its false detection rate is low, close to 1.4%. This paper shows that, with a correct combination of available models, it is possible to achieve a skin segmentation model with a high true detection rate and a low false detection rate.

References

[1] L. Sigal et al., Skin color-based video segmentation under time-varying illumination, IEEE Trans. on Pattern Analysis and Machine Intelligence 26 (2004), no. 7, 862-877.
[2] S. Phung et al., Skin segmentation using color and edge information, Proc. IEEE Int. Symp. on Signal Processing and Its Applications, 2003, pp. 525-528.
[3] P. Kakumanu et al., A survey of skin-color modeling and detection methods, Pattern Recognition 40 (2007), 1106-1122.
[4] R. Hassanpour, A. Shahbahrami, and S. Wong, Adaptive Gaussian mixture model for skin color segmentation, Proceedings of World Academy of Science, Engineering and Technology 31 (2008), 102-105.
[5] M. Abdullah-Al-Wadud et al., Skin segmentation using color distance map and water-flow property, Proc. IEEE Int. Conf. on Information Assurance and Security, 2008, pp. 83-88.
[6] V. Vezhnevets et al., A survey on pixel-based skin detection techniques, Cybernetics 85 (2003), 85-92.
[7] B. Zarit et al., Comparison of five color models in skin pixel classification, Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Proc. of IEEE, 1999, pp. 58-63.
[8] J. Kovac et al., Human skin color clustering for face detection, Computer as a Tool, Slovenia, Proc. of IEEE, 2003, pp. 144-148.
[9] G. Gomez et al., Automatic feature construction and a simple rule induction algorithm for skin detection, Proc. of Int. Workshop on Machine Learning in Computer Vision, 2002.
[10] D. Chai et al., Face segmentation using skin color map in videophone applications, IEEE Trans. on Circuits and Systems for Video Technology 9 (1999), no. 1, 551-564.
[11] K. Sobottka et al., A novel method for automatic face segmentation, facial extraction and tracking, Signal Processing: Image Communication 12 (1998), no. 3.
[12] R. Hsu et al., Face detection in color images, IEEE Trans. on Pattern Analysis and Machine Intelligence 24 (2002), no. 24, 696-706.
[13] T. Caetano et al., Performance evaluation of single and multiple-Gaussian models for skin color modeling, Proc. of IEEE Int. Symp. on Computer Graphics and Image Processing, 2002, pp. 275-282.
[14] S. Phung et al., Adaptive skin segmentation in color images, Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2003, pp. 353-356.
[15] ECHO, a European sign language database, available at http://www.let.ru.nl/sign-lang/echo.


A New Memetic Fuzzy C-Means Algorithm For Fuzzy Clustering


Fatemeh Golichenari
Islamic Azad University of Qazvin, Department of Electrical, Computer and IT, Qazvin, Iran
f.golchenari@qiau.ac.ir

Mohammad Saniee Abadeh
University of Tarbiat Modares, Department of Computer Engineering, Tehran, Iran
saniee@modares.ac.ir

Corresponding author: P. O. Box 43361-37438, F: (+98) 132 5721240, T: (+98) 132 5723158

Abstract: Fuzzy clustering is an important problem which is the subject of active research in several real-world applications. The fuzzy c-means (FCM) algorithm is one of the most popular fuzzy clustering techniques because it is efficient, straightforward, and easy to implement. However, FCM is sensitive to initialization and is easily trapped in local optima. Genetic algorithms (GAs) are believed to be effective on NP-complete global optimization problems, and they can provide good near-optimal solutions in reasonable time. Memetic algorithms (MAs) were presented as evolutionary algorithms that hybridize the global optimization characteristics of GAs with local search techniques, allowing the GAs to perform a deeper exploitation of the solutions. In this paper, a hybrid fuzzy clustering method based on FCM and a fuzzy memetic algorithm (MFCMA) is proposed, which removes some of FCM's shortcomings. To improve the expensive crossover operator in memetic algorithms, we hybridize the MA with the fuzzy c-means algorithm and define the crossover operator as a one-step fuzzy c-means algorithm. Experimental results show that the proposed clustering algorithm outperforms FCM, FPSO and FPSO-FCM on several UCI datasets.

Keywords: Genetic algorithm; Fuzzy clustering; Fuzzy memetic algorithm; Fuzzy c-means algorithm (FCM)

Introduction

Clustering is the process of assigning data objects to a set of disjoint groups called clusters, so that objects in each cluster are more similar to each other than to objects from different clusters. Clustering techniques are applied in many application areas such as pattern recognition, data mining, and machine learning. Clustering algorithms can be broadly classified as hard, fuzzy, possibilistic, and probabilistic.

K-means is one of the most popular hard clustering algorithms; it partitions data objects into k clusters, where the number of clusters, k, is decided in advance according to the application. This model is inappropriate for real data sets in which there are no definite boundaries between the clusters.

After fuzzy theory was introduced by Lotfi Zadeh, researchers brought it into clustering. Fuzzy algorithms can assign a data object partially to multiple clusters. The degree of membership in the fuzzy clusters depends on the closeness of the data object to the cluster centers. The most popular fuzzy clustering algorithm is fuzzy c-means (FCM), which was introduced by Bezdek in 1974 [1] and is now widely used.

Fuzzy c-means clustering is an effective algorithm, but the random selection of center points makes its iterative process fall into local optima easily. To solve this problem, evolutionary algorithms such as ant colony optimization (ACO), simulated annealing (SA), the genetic algorithm (GA), and particle swarm optimization (PSO) have recently been applied successfully.

Genetic algorithms are stochastic search techniques that can search large and complicated spaces. They are based on biology, including natural genetics and the evolutionary principle. In particular, GAs are suitable for parameter optimization problems with an objective function subject to various hard and soft constraints. The GA basically explores a complex space in an adaptive way, guided by the biological mechanisms of selection, crossover, and mutation. This algorithm uses natural selection, survival of the fittest, to solve optimization problems [2]. Memetic algorithms (MAs) were presented as evolutionary algorithms that hybridize the global optimization characteristics of GAs with local search techniques, allowing the GAs to perform a deeper exploitation of the solutions [3]. The experimental results over four real-life data sets indicate that the proposed method is superior to the FCM algorithm and the GA.

The rest of the paper is organized in the following manner. Section 2 reviews the related works, Section 3 introduces fuzzy c-means clustering, Section 4 discusses the genetic algorithm and the memetic algorithm for clustering, Section 5 presents our fuzzy memetic algorithm, Section 6 presents our hybrid clustering method, and Section 7 reports the experimental results. Finally, Section 8 concludes this work.

2 Related Works
In [4], the authors introduced two new methods for minimizing the reformulated objective functions of the fuzzy c-means clustering model by particle swarm optimization: PSO-V and PSO-U. In PSO-V each particle represents a component of a cluster center, and in PSO-U each particle represents an unscaled and unnormalized membership value. They also compared the proposed methods with alternating optimization and ant colony optimization. In [5], the authors presented an ant colony clustering algorithm for optimally clustering N objects into K clusters. The algorithm employs global pheromone updating and heuristic information to construct clustering solutions, and a uniform crossover operator to further improve the solutions discovered by the ants. In [6], in order to overcome the shortcomings of fuzzy c-means, a PSO-based fuzzy clustering algorithm is discussed; the proposed algorithm uses the global search capacity of the PSO algorithm to overcome the shortcomings of FCM. In [7], the authors proposed the genetic fuzzy K-modes algorithm for clustering categorical data sets. They treated fuzzy K-modes clustering as an optimization problem and used a genetic algorithm to obtain a globally optimal solution; to speed up the convergence of the algorithm, they used a one-step fuzzy K-modes algorithm in the crossover process instead of the traditional crossover operator. In [8], a hybrid data clustering algorithm based on PSO and KHM is proposed, which makes use of the merits of both algorithms; the proposed method not only helps KHM clustering escape from local optima but also overcomes the slow convergence speed of the PSO algorithm. In [9], the authors used a fuzzy c-means algorithm based on Picard iteration and PSO (PPSOFCM) to overcome the shortcomings of FCM. In [10], the authors describe a GA-based clustering algorithm where the chromosome encodes the centers of the clusters instead of a possible partition of the data points. In [13], a hybrid fuzzy clustering method based on FCM and FPSO is proposed which makes use of the merits of both algorithms.

3 Fuzzy C-means Algorithm

Fuzzy c-means partitions a set of n objects o = {o1, o2, ..., on} in the R^d dimensional space into c (1 < c < n) fuzzy clusters with Z = {z1, z2, ..., zc} cluster centers or centroids. The fuzzy clustering of the objects is described by a fuzzy matrix mu with n rows and c columns, in which n is the number of data objects and c is the number of clusters. mu_ij, the element in the ith row and jth column of mu, indicates the degree of association or membership of the ith object with the jth cluster. The characteristics of mu are as follows:

mu_ij in [0, 1],   i = 1, 2, ..., n;  j = 1, 2, ..., c    (1)

sum_{j=1}^{c} mu_ij = 1,   i = 1, 2, ..., n    (2)

0 < sum_{i=1}^{n} mu_ij < n,   j = 1, 2, ..., c    (3)

The objective function of the FCM algorithm is to minimize Eq. (4):

J_m = sum_{j=1}^{c} sum_{i=1}^{n} mu_ij^m d_ij    (4)

where

d_ij = || o_i - z_j ||    (5)

in which m (m > 1) is a scalar termed the weighting exponent that controls the fuzziness of the resulting clusters, and d_ij is the Euclidean distance from object o_i to the cluster center z_j. The centroid z_j of the jth cluster is obtained using Eq. (6):

z_j = ( sum_{i=1}^{n} mu_ij^m o_i ) / ( sum_{i=1}^{n} mu_ij^m )    (6)


The FCM algorithm is iterative and can be stated as follows [1]:

1. Select m (m > 1); initialize the membership function values mu_ij, i = 1, 2, ..., n; j = 1, 2, ..., c.
2. Compute the cluster centers z_j, j = 1, 2, ..., c, according to Eq. (6).
3. Compute the Euclidean distances d_ij, i = 1, 2, ..., n; j = 1, 2, ..., c.
4. Update the membership functions mu_ij, i = 1, 2, ..., n; j = 1, 2, ..., c, according to Eq. (7).
5. If not converged, go to step 2.

Algorithm 1: Fuzzy c-means

mu_ij = 1 / ( sum_{k=1}^{c} (d_ij / d_ik)^(2/(m-1)) )    (7)

Several stopping rules can be used. One is to terminate the algorithm when the relative change in the centroid values becomes small or when the objective function, Eq. (4), cannot be minimized further. The FCM algorithm is sensitive to initial values and is likely to fall into local optima.
4 Memetic algorithm

4.1 Description of GA

Genetic algorithms were developed by Holland in the 1970s and are based on the evolutionary ideas of natural selection and genetics. GAs generate a population starting from a previous one by crossing the individuals of the previous generation. This procedure allows the algorithm to exploit the historical information kept in the chromosomes of each individual in the population. Thus, if two good individuals are crossed, it is quite probable that the resulting offspring improve the solution of the problem to be solved. In classical GAs, each individual in the population encodes a solution to the problem using a chromosome composed of a sequence of genes whose values are 0 or 1. The general scheme that classical GAs follow is:

    Randomly initialize population(t)
    Determine fitness of population(t)
    while termination criterion not reached do
        1. select parents from population(t)
        2. perform crossover on the parents to generate population(t+1)
        3. mutate population(t+1)
        4. compute fitness of population(t+1)
    end

4.2 Description of MA

Memetic algorithms (MAs) are evolutionary algorithms (EAs) that apply a separate local search process to refine the individuals (i.e., improve their fitness by hill climbing, etc.). Additionally, MAs are inspired by Richard Dawkins' concept of a meme, which represents a unit of cultural evolution that can exhibit local refinement. They are like GAs combined with some kind of local search, and are able to balance the exploration and exploitation capabilities of both the genetic algorithm and the local search [11].

Unlike traditional evolutionary computation approaches, MAs are concerned with exploiting all available knowledge about the problem under study, which can be incorporated in the form of heuristics, approximation algorithms, local search techniques, specialized recombination operators, truncated exact methods, and many other ways. This is not an optional mechanism, but a fundamental feature. Recent studies have shown that memetic approaches can lead to high-quality solutions more efficiently than genetic algorithms.

5 Fuzzy Memetic Algorithm

In general, an MA consists of seven basic elements: coding or string representation, population initialization, selection, crossover, mutation, local search, and the termination criterion. In this section, we introduce these elements of the FMA for fuzzy c-means clustering.

5.1 String representation

In the FM algorithm, X, the chromosome, represents the fuzzy relation from the set of data objects, o = {o1, o2, ..., on}, to the set of cluster centers, Z = {z1, z2, ..., zc}. X can be expressed as follows:

X = [ mu_11 ... mu_1c ; ... ; mu_n1 ... mu_nc ]    (8)

in which mu_ij is the membership of the ith object in the jth cluster, with the constraints stated in (1) and (2). Therefore, we can see that the chromosome is the same as the fuzzy matrix mu in the FCM algorithm.


5.2 Initialization process

In the initialization phase, a population of N legal chromosomes is generated, where N is the size of the population. To generate a chromosome like (8), we employ the method introduced in [12], which is described as follows:

    for ii = 1 to N do
        for i = 1 to n do
            Generate c random numbers v_i1, v_i2, ..., v_ic from (0, 1) for the ith point of the chromosome;
            Replace mu_ij with v_ij / sum_{j=1}^{c} v_ij for j = 1, 2, ..., c;
        end
    end

where N is the number of chromosomes in the population and n is the number of data objects.

After initializing each chromosome, it may violate the constraints given in (1) and (2). So, it is necessary to normalize the chromosome matrix. The matrix undergoes the following transformation, Eq. (9), without violating the constraints:

X_normal = [ mu_11 / sum_{j=1}^{c} mu_1j  ...  mu_1c / sum_{j=1}^{c} mu_1j ; ... ; mu_n1 / sum_{j=1}^{c} mu_nj  ...  mu_nc / sum_{j=1}^{c} mu_nj ]    (9)

5.3 Selection

In our algorithm we use the well-known tournament selection, with t = 4. In the FM algorithm, as in other evolutionary algorithms, we need a function for evaluating the generated solutions, called the fitness function. In this paper, Eq. (10) is used for evaluating the solutions:

f(X) = K / J_m    (10)

where K is a constant and J_m is the objective function of the FCM algorithm (Eq. (4)). The smaller J_m is, the better the clustering effect and the higher the individual fitness f(X).
5.4 Crossover

After the selection process, the population goes through a crossover process. We use three runs of the FCM algorithm according to Algorithm 1.

5.5 Mutation

In the mutation process, each gene has a small probability Pm of mutating, decided by generating a random number. After testing several mutation methods, we finally used the method of [7]: the fuzzy memberships of a point in a chromosome are selected to mutate together with probability Pm. The mutation process is described as follows:

    for i = 1 to n do
        Generate a random number r from (0, 1) for the ith point of the chromosome;
        if r <= Pm then
            Generate c random numbers v_i1, v_i2, ..., v_ic from (0, 1) for the ith point of the chromosome;
            Replace mu_ij with v_ij / sum_{j=1}^{c} v_ij for j = 1, 2, ..., c;
        end
    end

5.6 Local search

As mentioned before, we use a local search after mutation in the memetic algorithm. First we sort all chromosomes based on their fitness. Then we select some chromosomes randomly from the upper half of the sorted population. Afterwards, each selected chromosome searches for the best chromosome in its vicinity randomly, in a hill-climbing manner with 5 neighbours. Then we apply the Lamarckian idea.

6 Simulation Results

6.1 Parameter settings

In order to optimize the performance of the MFCMA, fine tuning has been performed and the best values for its parameters have been selected. Based on the experimental results, the algorithms perform best under the following settings: m = 2, N = 50, Pm = 0.02 and Pc = 0.9. The MA terminating condition is a maximum of 100 iterations. The FCM terminating condition in Algorithm 1 is 5 iterations, which run after the MA.


Table 1: Results of the FCM, FPSO, FCM-FPSO and MFCMA methods on four real data sets.

Method          Measure   Iris(150,3,4)  Glass(214,6,9)  Cancer(683,2,9)  CMC(1473,3,9)
FCM [13]        Best          67.92          72.26           2196.8          3517.1
                Average       70.43          72.87           2213.3          3534.7
                Worst         71.58          73.37           2235.8          3548.3
FPSO [13]       Best          66.26          86.26           2704.6          4025.2
                Average       67.39          86.97           2724.4          4095.6
                Worst         69.72          87.37           2750.1          4190.1
FCM-FPSO [13]   Best          62.19          72.23           2181.9          3416.5
                Average       62.55          72.64           2190.5          3485.6
                Worst         62.96          73.11           2218.7          3531.2
MFCMA           Best          60.53          71.69           2132.1          3334.6
                Average       60.63          71.98           2148.7          3358.9
                Worst         61.11          72.25           2183.8          3412.6

6.2 Experimental results

For evaluating the FCM, FPSO, FCM-FPSO and MFCMA methods, four well-known data sets have been considered: the Iris, Glass, Wisconsin (Cancer) and CMC data sets.

The experimental results over 100 independent runs for FCM and 10 independent runs for FPSO, FCM-FPSO and MFCMA are summarized in Table 1. The figures in this table are the objective function values of Eq. (4). As shown in this table, the hybrid MFCMA obtained results superior to the others on all of the data sets, and it can escape from local optima.

Conclusion

In this paper, in order to overcome the shortcomings of fuzzy c-means, we integrate it with a fuzzy memetic algorithm. Experimental results over four well-known data sets, Iris, Glass, Cancer, and CMC, show that the proposed hybrid method is efficient and reveals very encouraging results in terms of the quality of the solutions found.

References

[1] J. Bezdek, Fuzzy mathematics in pattern classification, Ithaca, NY: Cornell University 3 (1989), no. 2, 68-73.
[2] U. Fayyad, G. Shapiro, and P. Smyth, From data mining to knowledge discovery, Advances in Knowledge Discovery and Data Mining, AAAI Press (2008), 1200-1209.
[3] H. Pomares, A. Guillen, J. Gonzalez, I. Rojas, O. Valenzuela, and B. Prieto, Parallel multi-objective memetic RBFNNs design and feature selection for function approximation problems, Neurocomputing 72 (2009), 3541-3555.
[4] T. A. Runkler and C. Katz, Fuzzy clustering by particle swarm optimization, 2006 IEEE International Conference on Fuzzy Systems 2 (2006), 601-608.
[5] B. Zhao and Z. Chen, An ant colony clustering algorithm, Proceedings of the Sixth International Conference on Machine Learning and Cybernetics (2007), 3933-3938.
[6] L. Li, X. Liu, and M. Xu, A novel fuzzy clustering based on particle swarm optimization, First IEEE International Symposium on Information Technologies and Applications in Education 12 (2007), 88-90.
[7] G. Gan and Z. Yang, A genetic fuzzy k-modes algorithm for clustering categorical data, Expert Systems with Applications 36 (2009), 1615-1620.
[8] F. Yang, T. Sun, and C. Zhang, An efficient hybrid data clustering method based on K-harmonic means and particle swarm optimization, Expert Systems with Applications 36 (2009), 9847-9852.
[9] H.C. Liu, J.M. Yih, D.B. Wu, and S.W. Liu, Fuzzy C-mean clustering algorithms based on Picard iteration and particle swarm optimization, International Workshop on Education Technology and Training (2008), 838-842.
[10] S. Bandyopadhyay and U. Maulik, An evolutionary technique based on K-Means algorithm for optimal clustering in RN, Information Sciences 146 (2002), 221-237.
[11] R. Bansal and K. Srivastava, A memetic algorithm for the cyclic antibandwidth maximization problem, Soft Computing (2011), 397-412.
[12] L. Zhao, Y. Tsujimura, and M. Gen, Genetic algorithm for fuzzy clustering, Proceedings of the IEEE International Conference on Evolutionary Computation, Nagoya, Japan (1996), 716-719.
[13] H. Izakian, A. Abraham, and A. Swami, Fuzzy C-means and fuzzy swarm for fuzzy clustering problem, Expert Systems with Applications 38 (2011), 1835-1838.

Cross-Layer Architecture Design for long-range Quantum


Nanonetworks
Aso Shojaie

Qazvin Azad university


Department of Computer engineering and information technology
aso.shojaie@gmail.com

Mehdi Dehghan Takhtfooladi


Amirkabir University of technology
Department of Computer engineering and information technology
dehghan@aut.ac.ir

Mohsen Safaeinezhad
Amirkabir University of technology
Department of Computer engineering and information technology
safaienezhad@aut.ac.ir

Ebrahim SaeediNia
Lameaie Gorgani Institute
Department of Computer engineering and information technology
saeedi.ebrahim68@gmail.com

Corresponding author: P. O. Box 66156-43753, F: (+98) 871 721-5220.

Abstract: The advances made in nanomaterial sciences have opened the door to electromagnetic communication among nanodevices. Nanonetworks can be more robust than conventional wireless networks because of their non-hierarchical, distributed control and management mechanisms. Since quantum mechanics is an accurate representation of matter at the atomic and subatomic scale, it will naturally be a significant part of nanoscale networking. In this paper, a new cross-layer network architecture for the interconnection of quantum nodes with ad-hoc communication nanonetworks is provided. We propose a network architecture that can transmit signals from the nodes of one ad-hoc structured nanonetwork to another via quantum communication channels. We have simulated the quantum channel and evaluated its throughput for several different multiplexing schemes.

Keywords: nanonetwork, quantum communication, teleportation, network protocol, multiplexing, cross-layer.

Introduction

Nanonetworks will provide humans with non-invasive mechanisms to interact with nanoscale elements. A vast number of applications are envisioned to emerge due to the interconnection of these nanodevices with other macroscale electronic devices [1]. However, many communication issues are still not solved, and this field requires new insights to overcome the peculiarities of the nano-world. Independently of the communication approach chosen, nanonetworks are supposed to be densely populated networks. A high number of nodes is required to maintain the network connectivity and

to have a macroscale effect from the millions of nanoeffects [2]. Due to the high number of nodes and the
mentioned energy constraints, we foresee the necessity
of organizing the access to this shared medium. In the
same way as classical networks were designed to transport data between remote nodes, quantum networks have been proposed to communicate quantum data. Though classical networks can use repeaters
that amplify signals and copy data, quantum networks
cannot rely on such operations as they are forbidden
by quantum mechanics. Quantum Communication is
based on transferring entangled pairs from one location
to another, with the help of swapping, repeating and
purification [3]. Quantum communication is considered
to be the ultimate in privacy because its impossible to
read quantum state data without changing it. Thus,
if the line is tapped in any way, the receiver will know
about it. This breakthrough demonstrates its possible
to perform quantum communication in the real world.
These communication protocols between two remote
parties can be unconditionally secure against eavesdropping [4]. Quantum communication holds promise
for the secret transfer of classical messages as well forming an essential element of quantum networks, allowing
for teleportation of arbitrary quantum states and violations of Bells inequalities over long distances [5]. Each
repeater needs to execute many quantum operations,
which are done by nano quantum nodes. Therefore,
a quantum network is considered to be a distributed
quantum computing problem. In order to provide entangled pairs over long distances, quantum repeaters were proposed, which allow extending the distance of an entangled pair.
The connection of many quantum repeaters in complex
topologies makes a network of nano-transceivers [6]. In
this paper, we focus on distributed control of the communications in nano-quantum networks. The remainder of this paper is organized as follows. In Section 2,
quantum teleportation will be briefly discussed, and in
section 3, the topology of quantum network based on
teleportation is introduced. In section 4, a new network architecture for the interconnection of nanosensor devices with quantum communication is provided.
In Section 5, we have presented the simulation results
of quantum channel modeling for some different multiplexing schemes. Finally, the paper is concluded in
Section 6.

2 Teleportation Architecture in Quantum Networks

The idea of a quantum network emerged after successful experiments on quantum teleportation. Digital abstraction, on which modern computation and communication rest, admits only two possible states: a classical on-off system must be in either state 0 or state 1, representing a single bit of information. Quantum mechanics is quite different [7]. A two-level quantum system can be characterized by two orthogonal basis states (vectors in a Hilbert space), |0> and |1>. The system itself, however, may be in an arbitrary superposition of these two states, a|0> + b|1>, where a and b are complex numbers that satisfy the normalization condition |a|^2 + |b|^2 = 1. Such arbitrary superpositions represent a single quantum bit of information, i.e., a qubit [9]. At the micro and nano scale, qubit superposition is innocuous. In teleportation, the same quantum state is reconstructed at the destination as was transmitted at the source. Teleportation is a key mechanism for nanoscale network protocols. Let us examine the quantum teleportation circuit shown in Figure 1.

Figure 1: A quantum circuit for teleportation. The Hadamard and CNOT gates that create the Bell pair can be replaced by any mechanism that creates a Bell pair over distance, such as Qubus [8].

In this figure, q0 and q1 are two qubits in the transmitter; q2 is another qubit, located at the receiver. The quantum state that we want to teleport is |q0>. q1 and q2 become a Bell pair after the first Hadamard gate and CNOT gate; |psi_i> denotes the joint state of the three qubits at slice A_i [10]. Inputs are on the left side of the circuit and flow proceeds to the right. |psi> is Alice's qubit that is to be teleported. The entangled qubit pair, shared by Alice and Bob, is |beta00>. The standard practice of using single lines for quantum information flow and double lines for classical information flow is used here. We show slices (A0 through A4) through the quantum circuit in the figure to indicate the operation of the circuit one step at a time. The joint state of Alice's qubit and the entangled state at slice A0, |psi0>, is shown in Equation 1:

|psi0> = |psi>|beta00> = (1/sqrt(2)) [ a|0>(|00> + |11>) + b|1>(|00> + |11>) ]    (1)

Next, the controlled-NOT gate implements a flip of the qubits of Alice's shared entangled pair at slice A1, as shown in Equation 2. If |1> is applied, the controlled-NOT gate implements the qubit flip operation; otherwise the state passes through unchanged.

|psi1> = (1/sqrt(2)) [ a|0>(|00> + |11>) + b|1>(|10> + |01>) ]    (2)

In the next step, slice A2, |psi2>, the Hadamard gate is applied to Alice's qubit, as shown in Equation 3:

|psi2> = (1/2) [ a(|0> + |1>)(|00> + |11>) + b(|0> - |1>)(|10> + |01>) ]    (3)

4 Protocol Design

The result can be expanded via simple algebra as In this section, we propose quantum protocols that will
enable distributed and consistent decision making for
shown in Equation 4.
the nanonetwork nodes. Some of the functions these
1
protocols must handle are reporting results of quan2 >= [|00 > (|0 + |1) + |01 > (|1 > +|0 >)
2
tum operations, results of any measurement, exchang| + |10 > (|0 > +|1 >) + |11 > (|1 > +|0 >)] (4) ing density matrices, the time when these operations
were done, etc. Network architecture is more than simWe see that a quantum network using teleportation will ply contacts, formats and semantics of the messages, it
not work without the necessary entangled qubits [11]. also, includes many aspects of the behaviour which may
Yet, a teleportation network could be used to transmit be visible only implicitly, rather than in the contents
the entangled qubits. Entanglement swapping appears of messages.
to be a key to the solution of transmitting quantum
information.

Quantum Network Topology

For quantum networks, we have two type of links, the


quantum channel which is the connection to produce
entanglement between neighboring nodes, and the classical network, which basically is a TCP/IP network
that allows control messages to be exchanged between
any nodes in the network. Failure of a quantum channel or of the classical network should be identified and
reported to the routing protocols to try to find an additional path [9]. Path selection in quantum networks
would change constantly and, it would become very
difficult for nodes to make decisions based on possiblyoutdated available entangled resources tables. It is the
responsibility of the swapping layer to choose the right
Bell pairs to swap. In this paper, for the nanonetworks, we use a static approach, however, for future
works, this layer needs to be studied in depth to find
patterns and good algorithm to make routing decisions
[6,12]. Similarly to classical networks, Quantum networks have some sort of cost which is based on fidelity
of the Bell pairs available for that link. To keep control
of quantum operations and to report the results of the
measurements, a communication protocol is necessary

387

Figure 2: protocol design

the behavior which may be visible only implicitly,


rather than in the contents of messages. The process
of designing quantum networks is similar to designing classical networks, as they require detailed protocol designs, including finite states machines to control
physical resources and track logical state [13]. As we
want use quantum network for transmit the data of
nanosensors, we must integrate these nodes to the network architecture.

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

In the nanonetwork case, when the nodes are biological or electromagnetic nanosensors or nanoactuators,
the application layer may convert the qbit to an action
signal. In this combination, we must use an interface
for converting molecular signals and quantum messages
to each others. This encoder-decoder is placed in the
application layer of quantum network, and after the
physical layer of the nanosensor node.

Figure 3: protocol design

As we want use quantum network for transmit the


data of nanosensors, we must integrate these nodes to
the network architecture.

Figure 6: Quantum protocol stack

Simulation and Evaluation

Figure 4: Quantum protocol stack

In Fig. 2, we show the relation between the different units integrating a nanosensor device and a sensingaware cross-layer protocol for nanosensor network. Fig.
3, shows the proposed protocol stack, that we detailed
its layers interconnect in Fig. 4. It can be seen that
after the Entanglement Control (ESC) layer is done,
the next higher protocol is again Purification Control
(PC), but in this case, the Bell pairs belong to further
stations [13]. And this keeps on repeating until the
end-to-end Bell pair is purified. Finally, the application
layer is reached and the data qbit can be teleported.
The control of a qbit (a single-qbit buffer) is passed
from layer to layer until consumed by the Application
layer or reinitialized to start over from the lowest layer.

In quantum networks, due to the huge amount of messages, measurements and the probabilities of success
of quantum operations, make a calculation completely
analitic of the protocols behaviour is a very hard task.
In addition, the construction of quantum repeaters
is not yet possible due to many physical challenges,
so build a network and measure the protocols performance is not possible either. The only way to study
our protocols is simulation the channel and protocols
in a network simulator. As there is not any standars
simulator for nanonetworks and quantum communications, we must develop our media over existing computer platforms. In our pervious works, we simulate
the molecular nanonetworks using MATLAB.

Figure 5: Quantum protocol stack

Figure 7: Quantum protocol stack

388

The Third International Conference on Contemporary Issues in Computer and Information Sciences

In this paper, we use Omnet++ for Quantum network simulation. Omnet++ allows us to define the
configuration parameters of a node and the network
topology. We simulated a qybus mechanism, with
20km hops with a number of qbits in each transmitter
is 50, and 16 in the receivers. In all of our simulations,
we use a target end-to-end fidelity of 0.98. We have run
simulations for two cases: only one flow, and two flows
competing for shared resources in the network shown
in Fig. 6. Both flows are over three-hop paths (AEFB
and CEFD), with the middle hop (EF) being a shared
link and hence CD flows. Used naively, the first and
third hops on each path will remain idle half of the
time. We have studied different multiplexing schemes
in order to recommend a mechanism for sharing resources in a multi-user network, and ultimately to be
able to predict the performance of a given network under certain trrafic patterns. The total throughput of
all five flows is highest for the statistical multiplexing,
achieving 257 teleported qbits per second, compared to
228 qbits per second for buffer space multiplexing and
201 for tme division multiplexing.

Discussion and Future Works

nanonetworks will have a great impact in almost every field of our society ranging from healthcare to
homeland security and environmental protection, but
Enabling the communication among nanosensors is
still an unsolved challenge [14]. Similarly, Quantum
communications is fast becoming an important component of many applications in information science and
technology. Sharing quantum information over a distance among geographically separated nodes requires
a reconfigurable transparent networking infrastructure
that can support quantum information services. In this
paper, we proposed a new network architecture that
can can transmit signals from the nodes of an Adhoc
structured nanonetwork to another one via some quantum communication channels. The interface between
nano and quantum network that convert molecular or
electromagnetic signals and qbits together, is placed in
the Application layer of quantum network nodes. We
have simulated the quantum channel and evaluate its
throughput for some different multiplexing schemes.
As our results, the best multiplexing scheme was stastical multiplexing in performance and simplicity of
implementation. We also find out that proper tuning
of uncontested links improves the network performance
while spending time of unused links doing purification
and reducing the number of end-to-end purification
steps, which are slower due to the addition of more
hopes and the longer propagation delay for the classi-

389

cal messages. A problem with simulations of quantum


systems is that when measuring one particle belonging
to an entangled group, immediate influence is produced to the rest particles, without any propagation,
therefore, running simulations in a distributed computing environment should consider how manage this
information. Also, designing the protocols such that
the stations were able to make the same consistent
decisions without exchanging unnecessary messages
proved to be a hard task. More complex and different
topologies should be tested in the future in order to
analysis the cross-layer communication protocols and
their performance in the quantum network. We want
to design and simulate the whole multi-scale nanonetwork, specially the quantum - molecular transceivers
in our future works.

Refrences
[1] I. Akyildiz and Josep Miquel Jornet, Electromagnetic wireless nanosensor networks, Elsevier, Nano Communication
Networks 1 (2010), 319.
[2] A.Shojaie, Modeling and analysis a new nanonetwork architecture for molecular communication in medical scenarios, Master of Science Dissertation, June 2011.
[3] Stephen F Bush, Nanoscale Communication Networks,
Artech House Series , Nanoscale Science and Engineering,
2010.
[4] S. Gaertner, C. Kurtsiefer, M. Bourennane, and H. Weinfurter, Exper-imental Demonstration of Four-Party Quantum Secret Sharing, Phys.Rev. Lett 98 (2007).
[5] H. J. Kimble, The quantum internet, Nature 453 (June
2008), 1023 - 1030.
[6] A. Shojaie and M. Dehghan takhtfooladi, Bio-inspired communication using diffusion based long-range nanonetworks,
Proceedings of the 2012 International Conference on Intelligent Information and Networks ICIIN 2012 (2012).
[7] W. McCarthy, Hacking Matter: Levitating Chairs, Quantum Mirages, and the Infinite Weirdness of Programmable
Atoms, Basic Books, free multimedia edition ed., 2003.
[8] R. Van Meter, T. D. Ladd, W. J. Munro, and K.
Nemoto, System design for a long-line quantum repeater,
IEEE/ACM Trans. Netw. 17(3) (2009), 10021013.
[9] A. Luciano and R.V. Meter, Path selection in hetrogeneous
quantum networks, 10th Asian Confference on Quantum Information Science (AQIS) (2010).
[10] W. G. Cooper, Evidence for transcriptase quantum processing implies entanglement and decoherence of superposition
proton states, Biosystems 97 (August 2009), 7389.
[11] C. J. Kaufman, M. A. Nielsen, and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge
University Press, October 2000.
[12] A. Shojaie and M. Safaienezhad, Automata based nanorobot
for molecular communication in medical scenarios, Proceedings of the 7th Vienna International Conference on
Mathematical Modelling (Feb 2012).
[13] M. Hayashi, K. Iwama, H. Nishimura, R. H. Putra, and S.
Yamashitaa, Quantum network coding, STACS (2007), 610
621.

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

[14] A. Shojaie and M. Safaienezhad, Mathematical Swarm


Model for Nanorobotic Communication, Proceedings of the

6th Vienna International Conference on Mathematical Modelling (Feb 2009).

390

Generation And Configuration Of PKI Based Digital Certificate


Based On Robust OpenCA Web Interface
Parisa Taherian

Mohammad Hossein Karimi

Shahrood University of Technology

Shahrood University of Technology

Department of Information Technology Engineering

Department of Software Engineering

Parisa.taherian@gmail.com

mhkarimi@ipm.ir

Abstract: In this paper introduced that the PKI is a security architecture that has been defined
to create an increased level of trust for exchanging information. Digital Certificate sticks an identity
to each pair of public and private keys which its issued by a Certification Authority (CA) and can
be used to encrypt data. Combine apples with encryption makes a complete security solution and
assuring the identity of all parties involved in a transaction. The OpenCA Project is a collaborative
effort to develop a robust, full-featured and Open Source out-of-the-box Certification Authority
implementing the most used protocols with full-strength cryptography world-wide. OpenCA is
based on many Open-Source Projects. Among the supported software is OpenLDAP, OpenSSL,
Apache Project, Apache mod ssl.

Keywords: PKI, Digital Certificate, OpenCA

Introduction

thentication through digital signatures

Ensure of the source and destination of that inublic Key Infrastructure (PKI) is a security architecformation
ture that has been defined to create an increased level
of trust for exchanging information. The more accu Assurance of the time and timing of that inforrate PKI be introduced the style and technologies that
mation
provide a secure infrastructure. In order to achieve this
goal, using a mathematical technique called public key
cryptography that uses a pair key for authentication
and proof of content. These keys are known public key
and private key. Public key can be exposed in public 1.2 A PKI consists of
but the private key is only available to owner and when
information is encrypted with public key, only by the
A certificate authority (CA) that both issues and
private key can be decrypted it.
verifies the digital certificates.[2]

1.1

A PKI provides the following benefits

Ensure of the accuracy of information sent and


received
Ensure the authenticity of the sender and his au Corresponding

Author

391

A registration authority (RA) which verifies the


identity of users requesting information from the
CA [2]
A central directory i.e. a secure location in which
to store and index keys. [3]
A certificate management system [4]

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

To do this

Use whose

Send an encrypted
message
Send an encrypted
signature
Decrypt
an
encrypted message
Decrypt
an
encrypted
signature
(and
authenticate
the sender)

Use the receivers


Use
the
senders
Use the receivers
Use
the
senders

Kind of
key
Public
key
Private
key
Private
key
Public
key

Table 1: Public Key V.S. Private Key [1]

1.3

Different kind of PKI

There is different type of PKI which refers to the distribution of public keys, therefore a PKI is a common
way to exchange public keys which this process called
PKE (Public Key Exchange) and no need to have a
CA, an RA or a current server. Different types of PKI
to be divided into two categories: centralized and decentralized

S/MIME).
Encryption and/or authentication of documents
(e.g., the XML Signature or XML Encryption
standards if documents are encoded as XML).
Authentication of users to applications (e.g.,
smart card logon, client authentication with
SSL). Theres experimental usage for digitally
signed HTTP authentication in the Enig form
and mod openpgp projects.
Bootstrapping secure communication protocols,
such as Internet key exchange (IKE) and SSL. In
both of these, initial set-up of a secure channel
(a security association) uses asymmetric key
(a.k.a. public key) methods, whereas actual communication uses faster symmetric key (a.k.a. secret key) methods.
Mobile signatures are electronic signatures that
are created using a mobile device and rely on signature or certification services in a location independent telecommunication environment.
Universal Metering Interface (UMI) an open
standard, originally created by Cambridge
Consultants for use in Smart Metering devices/systems and home automation, uses a PKI
infrastructure for security

PGP: PGP has a completely decentralized infrastructure: each user generates a pair of keys
public and private keys, signs its public key and 2
Digital Certificates
its email address with its private key and then exchange the result with its acquaintances through
a key server or an offline channel (from hand to As noted PKI provides inpatient to raise the level of
hand, through a telephone line, etc.).
security in computer network and in the insecure In X.509: Unlike PGP, an X.509 PKI is centralized. ternet, but there are many security problems that PKI
Each certificate must be issued by a trusted third cannot solve all of them.
party, but there can be several trusted third parDigital Certificate sticks an identity to each pair of
ties. If two persons communicates using certificates issued by two different trusted third parties, public and private keys which its issued by a Certificaeach person must trust the two third parties un- tion Authority (CA) and can be used to encrypt data.
less the two trusted parties are cross certified. A Combine apples with encryption makes a complete secross certification results from an accord between curity solution and assuring the identity of all parties
two trusted parties,which agrees on the practice involved in a transaction.
of the other contracting party.
Digital Certificates can be used for a variety
of electronic transactions including e-mail, electronic
commerce, groupware and electronic funds transfers.
1.4 Usage PKI [5]
Netscapes popular Enterprise Server requires a Digital Certificate for each secure server.
PKIs of one type or another, and from any of several
vendors, have many uses, including providing public
keys and bindings to user identities which are used for: 2.1 A Digital Certificate typically con-

tains the:
Encryption and/or sender authentication of
e-mail messages (e.g., using OpenPGP or

Owners public key

392

The Third International Conference on Contemporary Issues in Computer and Information Sciences

3.2

Owners name

IDX-PKI

Expiration date of the public key


IDX-PKI is an open source project maintained and
Name of the issuer (the CA that issued the Dig- supported by the french company IDEALX. It has the
ital Certificate)
basic features like the certificate issuance, a web interface and a support of the SCEP. It is also build on top
Serial number of the Digital Certificate
of OpenSSL, which is a good thing for the security.
Digital signature of the issuer
In fact, the main advantage of this product, but
also its main disadvantage, is the support: it must be
paid. It is an advantage because it ensure the quality,
but if you planned to make money with this project, it
2.2 Usage Digital Certificate
may not be the best one.[7]
Using Digital Certificate together encryption eliminates security concerns about privacy, exchange important information and e-commerce. However for this
purpose, Encryption alone - Because encrypted data
does not prove that the senders identity- is not enough.
On the other hand Digital Certificate verifies Someones identity. Used in conjunction with encryption,
Digital Certificates provide a more complete security
solution.
Similarly, a secure server must have its own Digital
Certificate to assure users that the server is run by the
organization it claims to be affiliated with and that the
content provided is legitimate.

3.3

OpenCA

The OpenCA Labs, born from the former OpenCA


Project, is an open organization aimed to provide a
framework for PKI studying and development of related projects.
OpenCA has several advantages like its programming language (perl) which is known by most of the
system administrators. It is a traditional open source
project since it allows the replacement of any part of
the PKI by any other from another project. This great
interoperability is its main advantage: the design of
the PKI can stick to the needs of the client. [8]

Main Projects Dealing With


PKI
4

The Open Group Certified Architect (OpenCA)

For practical implementations PKI has done several


projects that among the most important are following:
The OpenCA Project is a collaborative effort to develop a robust, full-featured and Open Source outof-the-box Certification Authority implementing the
most used protocols with full-strength cryptography
3.1 NewPKI
world-wide. OpenCA is based on many Open-Source
Projects. Among the supported software is OpenLNewPKI is an open source project aiming at building DAP, OpenSSL, Apache Project, Apache mod ssl.[8]
a full PKI. It uses the low level API of OpenSSL (this
has advantages and disadvantages) and is compatible
OpenCAs design has this features that allow to user
with Linux and Windows. The product is shipped with to create an interface for his special requirements. Also
a cross-platform GUI for the administration and the include some support for hiding functionalities and
configuration.
through commands viewCSR, viewCert and viewCRR
be configured to only show some selected links.
The main advantage of this project is the ease of
configuration, it seems very easy to deploy but its caOpenCA is composed of predefined interfaces such
pabilities of integration with the other PKIs are not so as CA, RA, pub, and node management (ca-node and
good. This is mainly a product for small entities, i.e. ra-node) that the user can use them separately or a
a all-in-one PKI. [6]
combination of several.

393

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

4.1

The CA interface

get an issued certificate [9]

The CA interface ensures all the operations related to


the certification authority (CA). This includes the issuance of certificates and the issuance of CRLs.

Figure 3: The public interface

Figure 1: The CA interface

4.2

The RA interface

4.4

The node interface

The node part of OpenCA handles the miscellaneous


tasks like the initialization of the SQL database, the
exchange of data, the backup of the certificates or the
use of the log files.

This interface can mail the owner of the certificates and


warning them to their expiring certificates. Although
the RA cannot issue certificates, it allow its administrators to approve (actually, to sign) the certificate
requests and the revocation requests.

Figure 4: The node interface

4.5
Figure 2: The RA interface

4.3

The public Interface

On this interface, you can do only three actions:


request a certificate
request the revocation of a certificate

Features [9]

OpenCA supports a large number of features like the


SCEP, an OCSP responder or even the use of an HSM
(Hardware Security Module). These features can be
separated in several categories: the general features
and the interoperability ones.
General features: OpenCA supports the generation of a heaps of certificates. These certificates
are created using templates written by the PKIs
administrator thus allowing to support any kind
of certificate.

394

The Third International Conference on Contemporary Issues in Computer and Information Sciences

If you have Cisco routers, you may want to use burn a CD-RW and send it through carrier pigeon.
the SCEP (Simple Certificate Enrollment Proto- All these solutions are working (nevertheless the last
col) to request a certificate directly and automat- is maybe not reliable).
ically from the router.
When the CA needs to sign a certificate several
When a certificate must be revoked, OpenCA can
ways
to do it are possible: you can store the private
do it in two different manners. You can issue
key
in
a file on the hard disk and use it to sign the
a CRL (Certificate Revocation List) and/or use
certificate.
In spite of being simple, this solution is not
an OCSP (Online Certificate Status Protocol) revery
secure.
For example, an assailant might gain acsponder. The CRLs can be published by a Web
cess
to
the
server
by a security hole in the web server
server and/or in a directory.
and steal the private key (this allows him to create false
Interoperability features: When configuring
certificates at will). Instead you can store the private
OpenCA, you will have to cope with a couple
key in an HSM and let the HSM do the signatures, In
of choices: which database you are using, which
this manner, the assailant gaining access to the server
LDAP server you have or how do you exchange
cannot steal the private key (but if your HSM is not
data with the other servers.
protected by PIN or a password the assailant can alOpenCA uses a database to store the issued cer- ways ask the HSM to generate a false certificate). If
tificates. The choice of this database is left to the you do not own an HSM (this is not a cheap hardware),
administrator, the only requirement is to have you can always use a smart card to hold the private key
the Perls DBi driver. Currently, OpenCA is of your CA and do the signatures.
known to operate with:
PostgreSQL, a fully SQL99 compliant
RDBMS
Mysql, a lightweight SQL database
Oracle, a well known and full featured
RDBMS
Once stored in the database, the certficates can be published through an LDAP server. One more time, the
choice of this LDAP server is left to the administrator.
Although any RFC compliant directory might work,
OpenCA was only tested with OpenLDAP.
When two parts of the PKI must exchange data,
both must use the same protocol. The choice, the
implementation and the configuration of this protocol
(as large meaning) is left to the administrator. This
is probably the largest freedom of this software: you
could transmit a zip archive through the HTTPS protocol, a tarball through FTP over an IPsec link or even

395

Refrences
[1] http://searchsecurity.techtarget.com/definition/PKI.
[2] Jhn R Vacca, Public key infrastructure: building trusted applications and Web services, CRC Press. p. 8. ISBN 978-08493-0822-2, 2004.
[3] Barton McKinley, ask of setting up a public-key infrastructure, Network World (2001).
[4] Al-Janabi Sufyan T. Faraj et al, Combining Mediated and
Identity-Based Cryptography for Securing Email, In Ariwa,
Ezendu Digital Enterprise and Information Systems: International Conference, Deis Proceedings. Springer (2010), 2-3.
[5] http://en.wikipedia.org/wiki/Public key infrastructure.
[6] http://www.newpki.org/.
[7] http://www.idealx.com/index.php?lang=en.
[8] http://www.openca.org.
[9] Nicolas Mass, Open source PKI with OpenCA.

Network Intrusion Detection Using Tree Augmented Naive-Bayes


R. Najafi

Mohsen Afsharchi

Sanay Systems

University of Zanjan

najafirobab88@gmail.com

afsharchim@znu.ac.ir

Abstract: Computer networks are nowadays subject to an increasing number of attacks. Intrusion Detection Systems (IDS) are designed to protect them by identifying malicious behaviors or
improper uses. Since the scope is different in each case (register already-known menaces to later
recognize them or model legitimate uses to trigger when a variation is detected), IDS have failed so
far to respond against both kind of attacks. In this paper, we apply two of the efficient data mining
algorithms called Naive Bayes and tree augmented Naive Bayes for network intrusion detection and
compare them with decision tree and support vector machine. We present experimental results on
NSL-KDD data set and then observe that our intrusion detection system has higher detection rate
and lower false positive rate.

Keywords: Anomaly And Misuse Detection, Bayesian Network, Intrusion detection, Tree Augmented Naive-Bayes,
Naive-Bayes.

Introduction

Intrusion detection techniques are the last line of defense against computer attacks behind secure network
architecture design, firewalls, and personal screening.
Despite the plethora of intrusion prevention techniques
available, attacks against computer systems are still
successful. Thus, intrusion detection systems (IDSs)
play a vital role in network security. The attacks are
targeted at stealing confidential information such as
credit card numbers, passwords, and other financial information. One solution to this dilemma is the use of
intrusion detection system (IDS). It is very popular security tool over the last two decades, and today, IDS
based on computer intelligent are attracting attention
of current research community a lot.
We present experimental results on NSL-KDD data
set and WEKA software.

nodes. This is due to the fact that all models (i.e.,


the child nodes) operate independently and only influence the probability of the root node. This single
probability value at the root node can be represented
by a threshold. In addition, the restriction of having
a single parent node complicates the incorporation of
additional information. This is because variables that
represent such information cannot be linked directly to
the nodes representing the model outputs. [2]
ADAM (Audit Data Analysis and Mining) is an
intrusion detector built to detect intrusions using data
mining techniques. It first absorbs training data known
to be free of attacks. Next, it uses an algorithm to
group attacks, unknown behaviors, and false alarms.
ADAM has several useful capabilities, namely;
Classifying an item as a known attack
Classifying an item as a normal event

Classifying an item as an unknown attack


Valdes and Skinner employed a naive Bayesian net Match audit trial data to the rules it gives rise
work to perform intrusion detection on network events.
to. [8]
The classification capability of a naive Bayesian network is identical to a threshold-based system that computes the sum of the outputs obtained from the child Also, TAN algorithm can be used for ranking, regres Corresponding

Author, T: (+98) 919 589-1565

396

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

sion analysis, probability estimation and engine fault


diagnosis improvement.

Bayesian networks in conjunction with Bayesian


methods and other types of models offer an efficient and principled approach for avoiding the
overfitting of data. [1]

The paper is structured as follows: Section 2 introduces Bayesian networks and classifications. Section
3 introduces intrusion detection systems and different
kinds of attacks. Section 4 describes intrusion detection with Bayesian networks. Section 5 presents and Bayesian network structure represents the interanalyzes our experimental results. Section 6 summa- relationships among the dataset attributes. Human
experts can easily understand the network structures
rizes the main conclusions.
and if necessary modify them to obtain better predictive models. By adding decision nodes and utility nodes, BN models can also be extended to deci2 Primary Description
sion networks for decision analysis. Applying Bayesian
network techniques to classification involves two subtasks: BN learning (training) to get a model and BN
A Bayesian network B =< N, A, > is a directed
inference to classify instances. The two major tasks
acyclic graph (DAG) < N, A > where each node n N
in learning a BN are: learning the graphical structure,
represents a domain variable (e.g., a dataset attribute),
and then learning the parameters (CP table entries)
and each arc a A between nodes represents a probfor that structure.
abilistic dependency, quantified using a conditional
probability distribution (CP table) i for each
The set of parents of a node xi in BS is denoted as
node ni . A BN can be used to compute the condii . The structure is annotated with a set of conditional
tional probability of one node, given values assigned to
probabilities (BP ), containing a term P (xi = Xi |i =
the other nodes [3].
i ) for each possible value Xi of xi and each possible
instantiation i of i . [3]
One application of Bayesian networks is classification. A somewhat simplified statement of the problem
of supervised learning is as follows. Given a training
set of labeled instances of the form < a1 , ..., an > , C
construct a classifier f capable of predicting the value
of C, given instances < a1 , ..., an > as input. The variables A1 , ..., An are called features or attributes, and
the variable C is usually referred to as the class variable or label. [11]
Two types of Bayesian network classifiers that we
use them in this paper are: Naive-Bayes and Tree Augmented Naive-Bayes

2.1

Naive-Bayes

Figure 1: Bayesian network for cancer


A Naive-Bayes BN is a simple structure that has the
class node as the parent node of all other nodes (see
The main advantages of Bayesian networks are:
Figure 1). No other connections are allowed in a NaiveBayes structure. Naive-Bayes assumes that all the fea Bayesian networks can readily handle incomplete tures are independent of each other. In recent years, a
data sets.
lot of effort has focused on improving Naive-Bayesian
Bayesian networks allow one to learn about classifiers, following two general approaches: selecting
causal relationships.
feature subset and relaxing independence assumptions.

397

The Third International Conference on Contemporary Issues in Computer and Information Sciences

niques which are misuse-based intrusion detection and


anomaly based intrusion detection.
Misuse-based intrusion detection IDSs that employ misuse detection approach detect attacks by comparing the existing signatures against the network traffics captured by the IDSs. When a match is found,
the IDSs will take action as the traffics are considered
harmful to computer systems or computer networks.
Actions taken by the IDSs will normally include sending alerts to network administrator and logging the intrusive events.

Figure 2: Naive-Bayes Structure [6]

IDSs that implement misuse detection approach


are, however, incapable of detecting novel attacks. The
2.2 Tree
Augmented
Naive-Bayes
network administrator will need to update the stored
(TAN)
signatures frequently to make sure that the IDSs perform well in detecting intrusions. [5]
TAN classifiers extend Naive-Bayes by allowing the at- Anomaly based intrusion detection IDSs that emtributes to form a tree, (see Figure 2) here c is the class ploy anomaly detection are capable of identifying novel
node, and the features x1 , x2 , x3 , x4 , without their re- attacks, that contain activities deviated from the norm.
Such IDSs utilize the built profiles that are learned
spective arcs from c, form a tree. [6]
based on normal activities in computer networks. This
system has two stages:
Learning: It works on profiles. The profiles represent the normal behavioural activities of the
users, systems, or network connections, applications. Great care should be taken while defining
profiles because currently there is no effective way
to define normal profiles that can achieve high detection rate and low false positives at the same
time.

Figure 3: TAN Structure [6]

Detection: The profile is used to detect any deviance in user behavior. [7]

Intrusion Detection Systems


3.1

Intrusion detection systems are used to identify, classify and possibly, to respond to benign activities. Also,
Intrusion Detection System (IDS) is used to monitor
all or partial traffic, detect malicious activities, and
respond to the activities. Network intrusion detection
system was established for the purpose of malicious activities detection to strengthen the security, confidentiality, and integrity of critical information systems.
These systems can be network-based or host-based.
HIDS is used to analyze the internal event such as process identifier while NIDS is to analyze the external
event such as traffic volume, IP address, service port
and others. The challenge of the study is: how we can
have an IDS with high detection and low false positive
rate? [4]

Problems of Intrusion Detection


Systems

IDS have three common problems: temporal complexity, correctness and adaptability.
The temporal complexity problem results from the extensive quantity of data that the system must supervise
in order to perceive the whole situation. False positive
and false negative rates are usually used to evaluate
the correctness of IDS. False positive can be defined as
alarms which are triggered from legitimate activities.
False negative includes attacks which are not detected
by the system. An IDS is more precise if it detects
more attacks and gives few false alarms.

In case of misuse detection systems, security experts must examine new attacks to add their correIntrusion detection is comprised of two main tech- sponding signatures. In anomaly detection systems,

398

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

human experts are necessary to define relevant at- say a1 , a2 , ..., an relative to the attributes A1 , A2 , ..., An
tribute for defining the normal behavior. This leads , respectively. Since naive Bayesian networks work unus to the adaptability problem. [10]
der the assumption that these attributes are independent (giving the parent node C), their combined probability is obtained as follows:

3.2

Different Types of Attacks


P (Ci |A) =

Attacks are grouped into four classes:

P (a1 |Ci )P (a2 |Ci )...P (an |Ci )P (Ci )


P (A)

(2)

Note that there is no need to explicitly compute


Denial of Service (DOS): Making some ma- the denominator P(A) since it is determined by
chine resources unavailable or too busy to answer the normalization condition. Therefore, it is sufficient to compute for each class ci its likelihood,
to legitimate users requests.
i.e. P (a1 |Ci )P (a2 |Ci )...P (an |Ci )P (Ci ) to classify any
User to Root (U2R): Exploiting vulnerability
new object characterized by its attributes values
on a system to obtain a root access.
a1 , a2 , ..., an . [9]
Remote To Local (R2L): Using vulnerability
in order to obtain a local access like a machine
user.

Probing: Collecting useful information or 4.2 Tree Augmented Naive-Bayes


known vulnerabilities about a network or a system. [8]
Bayesian network structure learning without any structural restrictions is known to be a difficult problem.
Several possibilities of adding arcs between classifier
Network Intrusion Detection features have been proposed. TAN models are wellknown extensions of naive Bayes. The main rule of
Using Bayesian Networks
TAN classification is given by:

P (C|A) = P (C)P (A1 |P A1 , C)...P (An |P An , C) (3)


In the following we will first discuss the Naive-Bayes
and then explore our contribution which is Tree AugA is the global information provided by features valmented Naive-Bayes in intrusion detection.
ues. P Ai is the parent feature of Ai (on which Ai
depends). The optimal tree can be obtained simply by
calculating mutual information measures between each
4.1 Naive-Bayes
two variables on the basis of training instances. [10]
The purpose is to find the probability that a computer
or local network attack is going on.The result of the
propagation of changed probabilities of certain events
observed by Bayesian network can be an automatic activation of some mechanism for attack prevention such
as: breaking TCP connections, traffic redirection or
disabling user activity. If the probability of an attack
is significantly increased but not enough to be considered as an attack, the network will generate a report
about the event and warn the system administrator.
Once the network is quantified, it is able to classify
any new object giving its attributes value using the
Bayes rule expressed by:
P (Ci |A) =

P (A|Ci )P (Ci )
P (A)

(1)

Where ci is a possible value in the session class and


A is the total evidence on attributes nodes. The evidence A can be dispatched in the pieces of evidence,

Figure 4: IDS with Bayesian Network

399

The Third International Conference on Contemporary Issues in Computer and Information Sciences

4.3

Algorithm

We present here an algorithm to achieve an optimal


choice and placement of detectors.
Input
(i) Bayesian network BN = (V,CPT(V ),H(V )) where
V is the set of attack vertices, CPT(V ) is the set of
conditional probability tables associated with the attack vertices, and H(V ) is the set of hosts affected if
the attack vertex is achieved.
(ii) Set of detectors D = (di , V (di ), CP T [i][j]) where
di is the ith detector, V (di ) is the set of attack vertices that the detector di can be attached to (i.e., the
detector can possibly detect those attack goals being
achieved), and CPT[i][j] j V (di) is the CPT tables
associated with detector i and attack vertex j.
Output:
Set of tuples = (di, i ) where di is the ith detector
selected and i is the set of attack vertices that it is
attached to.
DETECTOR-PLACEMENT (BN, D)
System-Cost= 0
Sort all (di , aj ), aj V (di ), i by
BENEFIT(di , aj )
Sorted list= L
Length(L)= N
for i=1 to n do
System-Cost= System-Cost + Cost(di , aj )
/* Cost(di , aj ) can be in terms of economic
cost, cost due to false alarms and missed
alarms, etc. */
if System Cost > T hreshold then
break
end
if di then
add aj toi
else
add(di , i = aj )to
end
end
return
The worst-case complexity of this algorithm is O(dv
B(v,CPT(v)) + dv log(dv) + dv), where d is the number of detectors and v is the number of attack vertices.
B(v,CPT(v)) is the cost of Bayesian inference on a BN
with v nodes and CPT(v) defining the edges. The first
term is due to calling Bayesian inference with up to d
times v terms. The second term is the sorting cost and
the third term is the cost of going through the for loop
dv times. [13]

400

BENEFIT (d, a)
/* This is to calculate the benefit from attaching
detector d to attack vertex a */
Let the end attack vertices in the BN be
F = fi , i = 1, 2, ..., M
for each fi, the following cost-benefit table exists
do
Perform Bayesian Inference with d as the
only detector in the network and connected
to attack vertex a
Calculate for each fi , the precision and
recall, call them, Precision(fi , d, a) ,
Recall(fi , d, a) System Benef it =
m
P
Benef it fi (T rue N egative)
i=1

Precision(fi , d, a) + Benefitfi (T rue P ositive)


Recall(fi , d, a)
end
return System-Benefit

Experimental Result

The data used in this paper are those proposed in the


NSL-KDD for intrusion detection which are generally
used for benchmarking intrusion detection problems.
They set up an environment to collect TCP/IP dump
raws from a host located on a simulated military network. Each TCP/IP connection is described by 41 features and labeled as either normal or as an attack.
We evaluate the performance of Naive-Bayes and
then we convert that to tree augmented Naive-bayes.
So the new system has better performance.
parent
dst bytes
srv count
hot
src bytes
count

childs
num compromised
count, srv diff host rate
is guest login
wrong fragment, flag
same srv rate

Table 1: Connections in TAN

At the end compare them with DT and SVM. We


use full training set and 10- fold cross validation for the
testing purposes. In 10-fold cross-validation, the available data is randomly divided in to 10 disjoint subsets
of approximately equal sizes. One of the subsets is then
used as the test set and the remaining 9 sets are used
for building the classifier. The test set is then used
to estimate the accuracy. This is done repeatedly 10
times so that each subset is used as a test subset once.
The accuracy estimates is then the mean of the esti-

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

mates for each of the classifiers. Cross-validation has


been tested extensively and has been found to generally
work well when sufficient data is available.

5.1

Kappa Statistic Rate

The kappa statistic measures the agreement of prediction with the true class 1.0 signifies complete agreement. This rate in Naive-Bayes is 0.759 and in DT is
0.989 and in SVM is 0.961 but TAN has better result,
0.988.

5.2

Confusion Matrix

A Confusion Matrix is sometimes used to represent the


result of testing, as shown in Table 1.It is a two- dimensional table with a row and column for each class,
each element of the matrix show the number of test
examples for which the actual class is the row and the
predicted class is the column. The Advantage of using
this matrix is that it not only tells us how many got
misclassified but also what misclassifications occurred.
False
Positive
NB
TAN
DT
SVM

Normal DOS

R2L

Probe

U2R

0.037
0.009
0.01
0.031

0.02
0.001
0.001
0.002

0.06
0.002
0.003
0.003

0.069
0
0
0

0.021
0.003
0.002
0.005

Normal
NB
TAN
DT
SVM
R2l
NB
TAN
DT
SVM
DOS
NB
TAN
DT
SVM
Probe
NB
TAN
DT
SVM
U2R
NB
TAN
DT
SVM

Normal
3474
4759
4760
4732

DOS
106
7
5
17

R2L
162
8
8
14

Probe
455
12
13
24

U2R
589
0
0
0

0
5
10
31

0
0
0
1

63
69
63
41

9
1
2
1

4
1
1
0

115
9
7
65

3176
3321
3319
3264

1
0
1
0

26
3
6
1

15
0
0
0

40
22
20
32

15
10
9
8

10
0
1
0

718
768
770
764

17
0
0
0

0
2
4
3

0
0
0
0

2
2
0
1

0
0
0
0

2
0
0
0

Table 3: Experimental Result in Confusion Matrix

Table 2: False Positive Rate

5.3

Time Taken to Build Model

Naive-Bayes is build in 3.77 seconds, TAN in 20.09 seconds, DT in 36.86 seconds and SVM in 43.63 seconds.
So Naive-Bayes is faster.

5.4

Percent of Correct Classification

PCC of Naive-Bayes is %85 and PCC of TAN is %99.3


and PCC of DT is %99.4 and PCC of SVM is %97.8.

Figure 5: ROC Curve-Performance Analysis of IDS

5.5

Detection Rate and False Positive


Rate

The detection rate is the number of attacks detected


by the system divided by the number of attacks in the
data set. It is equivalent to Recall.
The false positive rate is the number of normal con-

401

The Third International Conference on Contemporary Issues in Computer and Information Sciences

nections that are misclassified as attacks divided by the


number of normal connections in the data set.

Conclusions

The average of false positive rate in Naive-Bayes is In this paper, we have proposed a framework of intrusion detection systems based on Naive-Bayes and
0.033 and in TAN and DT is 0.006, in SVM is 0.019.
TAN algorithms and compared them with decision tree
and support vector machine. According to the result,
Naive-Bayes is found less time consuming. TAN has
5.6 Accuracy Rate
better accuracy rate and detection rate, and also has
less false positive rate.
Precision Recall F Measure
Normal
0.957
0.726
0.826
NB
0.992
0.994
0.993
Refrences
TAN
0.991
0.995
0.993
DT
0.973
0.989
0.981
[1] K Korb and E Nicholson, Bayesian Artificial Intelligence
SVM
(2004).
DOS
0.963
0.953
0.958
[2] Ch Kruegel, D Mutz, W Robrtson, and F Valeur, Bayesian
NB
0.995
0.996
0.996
Event Classification For Intrusion Detection, 19th Annual
TAN
0.996
0.996
0.996
Computer Security Application Conference IEEE Computer
Society, Washington DC 187 (2008), 1423.
DT
0.992
0.98
0.986
SVM
[3] L Ben-Gal, Bayesian Network, Encyclopedia Of Statistics
In Quality And Reliability, 2007.
R2l
0.265
0.829
0.401
NB
0.873
0.908
0.89
[4] Y Wee, W Cheah, SH Tan, and K Wee, Causal Discovery And Reasoning For Intrusion Detection Using Bayesian
TAN
0.863
0.829
0.846
Network 1 (2011), no. 2.
DT
0.732
0.554
0.631
[5]
K
Chin Khor, CH Ting, and S Amnuaisuk, From Feature
SVM
Selection To Building Of Bayesian Classifiers: A Network
Probe
0.594
0.898
0.715
Intrusion Detection Perspective.
NB
0.98
0.96
0.97
[6] J Cheng and R Greiner, From Feature Selection To BuildTAN
0.973
0.963
0.968
ing Of Bayesian Classifiers: A Network Intrusion Detection
DT
0.967
0.95
0.959
Perspective, Proc.14th Canadian Conference On AI, 2001.
SVM
[7] M Pater, H Kim, and A Pamnam, State Of The Art In
U2R
0.003
0.5
0.006
Intrusion Detection System.
NB
0
0
0
[8] M Panda and M.R Patra, Network Intrusion Detection Using Nave Bayes, International Journal Of Computer Science
TAN
0
0
0
And Network Security 7 (2007), no. 12, 258263.
DT
0
0
0
[9] N Amor, S Benferhat, and Z Elovedi, Nave Bayesian NetSVM
works In Intrusion Detection System, 14th European ConAVG
0.921
0.826
0.861
ference On Machine Learning 17th European Conference
NB
0.991
0.991
0.991
On Principles And Practice Of Knowledge Discovery In
TAN
0.99
0.99
0.99
Databases.
DT
0.977
0.978
0.977
[10] S Benferhat, H Drias, and A Boudjelida, An Intrusion DeSVM
tection Approach Based On Tree Augmented Nave-Bayes
And Expert Knowledge.

Table 4: Accuracy rate in different algorithms

5.7

ROC Curve

We plot a ROC (Receive Operating Characteristic)


curve which is often used to measure performance of
IDS. This curve is a plot of the detection rate against
the false positive rate, which is shown in Figure 5.

402

[11] N Friedman, M Goldzmidt, and A Boudjelida, Building


classifiers Using Bayesian Network, In Proceeding of the
National Conference on AI, Menlo Park, CA: AAAI Press
(1996), 12771284.
[12] Cerquides Bueno J, Improving Bayesian Network Classifiers, Ph.D. Thesis, university Of Politecnica De Catalunya.
(1996), 12771284.
[13] G Modelo-Howard, S Bagchi, and G Lebanon, Determining
Placement Of Intrusion Detectors For A Distributed Application Through Bayesian Network Modelling, SpringerVerlage Berlin Heidelberg (2008), 271290.

Dynamic Fixed-Point Arithmetic:


Algorithm and VLSI Implementation
Mohammad Haji Seyed Javadi

Hamid Reza Mahdiani

Department of Computer, Electronics and IT

ECE Department,

Qazvin Branch, Islamic Azad University

Sh. Abbaspour University of Technology

seyedjavadi@qiau.ac.ir

mahdiani@pwut.ac.ir

Esmaeil Zeinali Kh.


Department of Computer, Electronics and IT
Qazvin Branch, Islamic Azad University
zeinali@qiau.ac.ir

Abstract: Although the fixed-point arithmetic is widely used due to its simple hardware implementation, it suffers from significant drawbacks such as limited dynamic range. A fixed-point
hardware does not provide acceptable accuracy levels when simultaneously deals with large and
small numbers. Although the floating-point arithmetic greatly addresses this problem, it is not
widely used because it faces important challenges when realizes in hardware. A novel computational paradigm named as Dynamic Fixed-Point (DFP) is proposed in this paper which provides
improved precision levels while has a similar VLSI implementation costs when compared to traditional fixed-point. The accuracy simulation results and VLSI synthesis costs of the new method is
presented and compared with fixed-point to prove its efficiency.

Keywords: Finite Precision Arithmetic; Fixed-Point; Floating-Point; VLSI.

Introduction

creases the final output WL. Although there are some


improvements to limit the final WL [2], this method
is not popular because it also imposes considerable extra implementation costs, degrades the system performance, and destroys its WL homogeneity. In traditional fixed-point applications, some Least Significant
Bits (LSBs) of the computational blocks are discarded
and the remaining Most Significant Bits (MSBs) are
rounded or truncated to avoid these drawbacks [3]. For
an adder with WL-bit inputs, a WL-bit sum, and a 1bit carry as an instance, the output is unconditionally
defined as shown in Eq.(1) to enable the designer to
save the carry while preserving the WL at the output.

Fixed-point arithmetic is widely used in various realtime and low-power applications due to simplicity of
its hardware units in terms of area, delay, and power
consumption. However, the main problem with a fixedpoint computational system is to preserve the dynamic
range within a finite and fixed Word-Length (WL)
which is determined based on the cost-accuracy tradeoff [1]. This limitation prevents to simultaneously
demonstrate large and small values in a finite WL. Increasing WL at the output of the computational units
AdderOutF ixedP oint (W L 1 : 0)
(1)
is a trivial solution to maintain the desired dynamic
range and accuracy. According to this approach, the
= Carry&Sum(W L 1 : 1),
WL at the output of each adder in the system should
be increased by one bit and the WL at the output of In a WL by WL bits multiplier with a 2WL bits reeach multiplier should be doubled that significantly in- sult as another instance, the output is always defined
Corresponding

Author, P. O. Box 34139-55697, T: (+98) 281 332-6735

403

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

based on Eq.(2) to provide the best WL-bit precision.


M ultOutF ixedP oint (W L 1 : 0)
= Result(2 W L 1 : W L),

(2)

Although using floating-point as an alternative to


fixed-point results in better dynamic range and precision; floating-point arithmetic is considerably more
expensive and less efficient due to higher complexity of its computational units. As a third solution,
the block floating-point combines the simplicity of the
fixed-point arithmetic with the floating-point improved
dynamic range. The block floating-point strategy and
its different varieties [1, 4], utilize a common scale factor for all numbers inside a system. This implies that
the system should be capable of simultaneously shifting
all its inside numbers, which needs specialized hardware equipment that can only be embedded in some
regular sequential architectures such as digital filters
and FFT [5]. The next section introduces a new solution which improves the accuracy and dynamic range
with respect to fixed-point arithmetic while keeps the
same level of simplicity. The remaining sections discuss
about the accuracy as well as hardware implementation
costs of the new method.

Dynamic Fixed-Point

As discussed in the previous section, the common and


most exploited method to prevent increasing system
WL in traditional fixed-point is to unconditionally discard some LSBs of the computations. Although this
works well for large enough values, it highly limits the
system dynamic range lower bound when dealing with
small values. Because in case of small values most of
the discarded LSBs are informative, while many of the
remaining MSBs does not contain any information and
are zero. To address this problem and increase the dynamic range of the fixed-point arithmetic, the Dynamic
Fixed-Point (DFP) strategy is introduced. In contrast
to fixed-point which unconditionally chooses the WL
MSBs and discards the result LSBs (regardless of this
fact that the MSBs and LSBs are informative or not),
the DFP proposes to choose the WL bits in a manner
to save the most possible informative bits. This means
that if the most significant bits of the result have zero
values, first it shifts up the result to discard the zero
MSBs and then saves the WL desired bits. This also
implies that an additional Scale-Factor (SF) should be
remembered with the WL bit value which demonstrates
how many bits the WL-bit DFP value is shifted up. It
is important to emphasize that in DFP, the SF is a
small limited range value which might be saved using
1 or 2 bits when implemented in hardware.

To implement a hardware system only using DFP


blocks, some general DFP arithmetic block should be
designed to accept input values in DFP format (a
scaled value and a limited scale factor) and produce
its output in DFP format. However, due to the compatibility of the DFP with traditional fixed-point, it is
possible to implement a hardware system in a mixed
format with a combination of fixed-point and dynamic
fixed-point blocks. To form such realizations, some
simplified DFP blocks should be designed to accept
inputs in fixed-point format and produce DFP outputs or vice versa. As an introductory step to show
the capabilities of the DFP method and superiority of
its hardware blocks with respect to traditional blocks,
in this paper we have focused on two simplified DFP
blocks, an adder and a multiplier, which accept fixedpoint values and produce DFP outputs. Such blocks
are useful in some situations for example as the primary input processing blocks where the fixed point input values input to a mixed fixed-point-DFP system
or as the primary output generator blocks where some
fixed-point values might be processed and directed to
primary outputs of the whole system. Hereafter, we
call this type of blocks as DFP blocks throughout the
following sections while the discussion about the general DFP arithmetic blocks in beyond the scope of this
paper. As the following paragraphs of this section, the
general functionality of a DFP adder and multiplier is
described with equations similar to Eq.(1) and Eq.(2)
while their accuracy analysis results is presented next.
The output of a WL-bit DFP adder is computed based
on the following equation in comparison with Eq.(1)
which describes functionality of a traditional adder.
AdderOutDF P (W L 1 : 0) =

Carry&Sum (W L 1 : 1) , SF = 0

W hen Carry = 1

(3)

Sum (W L 1 : 0) , SF = 1
W hen Carry = 0

In a similar manner, the output of a DPF multiplier


which accepts two WL-bit fixed point inputs and produce a WL-bit DFP output with a suitable SF is computed using Eq.(4). As clearly indicated in this equation, the DFP applies maximum three shifts to the multiplication result according to the values of the three
MSBs of the result. The following section provides
some evidences to show why the value 3 is selected
as the maximum number of shifts in a multiplier.
From a detailed technical viewpoint, DFP resembles a floating-point approach in which the exponent (SF) is limited to only a small set of values. The main DFP advantage over a floating-point system lies in this simple but effective limitation. The simulation results of the following section show that DFP provides comparable


precision with respect to floating-point, while at the same time its limited exponent value makes it possible to realize DFP arithmetic building blocks by exploiting fixed-point based hardware, which improves the complexity, cost, and performance of the DFP realization with respect to floating-point hardware.
MultOut_DFP(WL-1 : 0) =
    result(2WL-1 : WL),    SF = 0,  when the result MSB is 1
    result(2WL-2 : WL+1),  SF = 1,  when the result MSBs are 01
    result(2WL-3 : WL+2),  SF = 2,  when the result MSBs are 001
    result(2WL-4 : WL+3),  SF = 3,  when the result MSBs are 0001        (4)

The maximum SF in a DFP multiplier could theoretically be any positive non-zero value; however, there are simulation results which justify why the maximum SF in a DFP multiplier is limited to 3, as described before. Fig. 1 simultaneously includes the logarithmic average error of WL-bit normal adders and multipliers as well as WL-bit DFP adders and multipliers with different WL and SF values. The figure shows that the average error of a DFP component is always better than that of its traditional rival regardless of WL. It also demonstrates that the average error of a DFP multiplier improves as SF increases.


2.1 Accuracy Analysis of DFP Blocks

In this section, the accuracy analysis of the presented DFP adder and multiplier is provided and compared with the accuracy of their traditional fixed-point rivals, based on analytical studies followed and confirmed by simulation results. While a WL-bit normal scaled adder provides an average error of 1/2^(WL+1), the average error of a DFP adder with a maximum SF value of 1 is 1/2^(WL+2). Eq.(5) summarizes the average error of a WL-bit normal adder (SF = 0) and DFP adder (SF = 1):

    1 / 2^(WL+SF+1),   SF = 0, 1        (5)

Figure 1: Average Error of Normal and DFP Adders and Multipliers with Different WLs and SFs.

To demonstrate an interesting feature of the DFP multipliers, Fig. 2 shows the average error of a DFP multiplier as the SF increases, for different WLs.


In a similar manner, Eq.(6) shows the average error of a WL-bit fixed-point multiplier (SF = 0) and a DFP multiplier (SF > 0) with a given SF. It shows that while the average error of the normal multiplier is 1/2^WL, the DFP multiplier provides significantly less error, which decreases as SF increases:

    (1 / 2^(WL+2SF)) * (1 + 2*(2^(2SF) - 1)/3),   SF >= 0        (6)

Figure 2: Average Error of DFP Multipliers vs. SF.



To verify the accuracy efficiency of the DFP blocks and to confirm equations (5) and (6), MATLAB bit-true models of the fixed-point and DFP adders and multipliers were developed and simulated using 20,000 random fixed-point numbers, and their average output errors were computed with respect to a floating-point adder and multiplier, respectively. It is necessary to emphasize that in all simulation results, SF = 0 represents a traditional fixed-point component while a non-zero SF value represents a DFP block. It should be stated that the maximum SF in a DFP adder is 1 and it is not necessary to increase it further, while the choice of the maximum SF in a DFP multiplier is justified by the simulation results discussed above.
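A rough Python analogue of this bit-true experiment is sketched below (the paper's models are in MATLAB; the operand distribution, scaling convention and trial count here are our own assumptions):

import random

def avg_error(wl: int, sf_max: int, trials: int = 20000) -> float:
    """Average absolute error of a WL-bit multiplier output versus the exact
    product, with operands interpreted as fractions in [0, 1).  sf_max = 0
    models plain fixed-point truncation; sf_max > 0 keeps up to sf_max extra
    informative bits via the DFP scale factor."""
    total = 0.0
    for _ in range(trials):
        a = random.getrandbits(wl)
        b = random.getrandbits(wl)
        exact = (a / 2**wl) * (b / 2**wl)
        p = a * b                                   # 2WL-bit exact product
        sf = 0
        while sf < sf_max and not (p >> (2 * wl - 1 - sf)) & 1:
            sf += 1
        kept = p >> (wl - sf)                       # WL retained bits
        approx = kept / 2**(wl + sf)                # undo the scale factor
        total += abs(exact - approx)
    return total / trials

for sf in range(4):                                 # SF = 0 (fixed-point) .. 3
    print(sf, round(avg_error(8, sf), 6))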


According to Fig. 2, regardless of the multiplier size (WL), the average error of the DFP multiplier does not improve significantly when SF increases beyond 3. Since increasing SF leads to higher implementation costs, as will be shown in the following sections, this is the main reason to limit the maximum SF in a DFP multiplier: to gain the best accuracy while paying minimum cost. Fig. 2 also indicates that a WL-bit DFP multiplier with SF = 3 provides better accuracy than a normal WL-bit multiplier (SF = 0). The next section provides and compares the synthesis results of the DFP and normal arithmetic blocks.


Table 1: Physical Characteristics of Traditional and Dynamic Building Blocks

Module Type               SF   Area (# of Gates)   Delay (ns)   Average Error
Adder, Fixed-Point        0    299                 2.96         0.005865
Adder, DFP                1    409                 3.25         0.004895
Multiplier, Fixed-Point   0    2132                7.66         0.003855
Multiplier, DFP           1    2234                8.44         0.003027
Multiplier, DFP           2    2275                9.09         0.002734
Multiplier, DFP           3    2345                9.52         0.002604

DFP Blocks Hardware

Fig. 3 demonstrates the block diagram of the extra hardware which should be appended to the 2WL-bit output of a fixed-point multiplier to form a DFP one.
Figure 4: Variable Shifter Internal Structure

Figure 3: Block Diagram of Extra DFP Multiplier

According to this diagram, the combinational logic determines the number of zero MSBs of the multiplication result (up to 3 zero MSBs, which corresponds to SF = 3), while a simple variable shifter then applies the suitable number of shifts to the multiplication result according to the combinational circuit output. The final DFP output contains a WL-bit result and a 2-bit SF. Fig. 4 also demonstrates the internal structure of the variable shifter. It shows that the final result is the WL MSBs of the suitably shifted multiplication result of the fixed-point multiplier. The block diagram of a DFP adder is similar to that of the DFP multiplier but much simpler and smaller: in a DFP adder, the combinational logic is omitted and the variable shifter is replaced with a single multiplexer.

Table 1 includes the synthesis results of 8-bit normal and DFP adders and multipliers. To obtain these results, VHDL models of all components were developed and synthesized on 0.13 µm CMOS library cells using Mentor Graphics Leonardo Spectrum. The synthesis results of the DFP multiplier are presented for different SFs. The table also includes the average accuracy simulation results of the DFP and fixed-point blocks (last column) to simplify the overall comparison. The table shows that a fixed-point adder is 26% smaller and 9% faster than the DFP adder, while it also provides 20% worse accuracy. The fixed-point multiplier, as another instance, is 9% smaller and 20% faster than a DFP multiplier with SF = 3, while it provides 48% worse accuracy.

Conclusions

A new computational paradigm called Dynamic Fixed-Point (DFP) is introduced in this paper. The DFP arithmetic blocks have similar VLSI structures and implementation costs with respect to fixed-point blocks while providing an improved dynamic range and accuracy. The presented analytic studies as well as simulation results show the accuracy improvement of DFP with respect to fixed-point. Synthesis results are also provided to compare the VLSI implementation costs of these two rivals.


References

[1] T. Lenart and V. Owall, Architectures for Dynamic Data Scaling in 2/4/8K Pipeline FFT Cores, IEEE Trans. VLSI Syst. 14/11 (2006), 1286-1290.
[2] O. Sarbishei and K. Radecka, Analysis of precision for scaling the intermediate variables in fixed-point arithmetic circuits, ICCAD (2010), 739-745.
[3] H.R. Mahdiani and S.M. Fakhraei, A cost-error tunable round-off method: Finite-length absorption, IEICE Electron. Express 6/18 (2009), 1312-1317.
[4] S. Kobayashi and G.P. Fettweis, A new approach for block-floating-point arithmetic, ICASSP (1999), 2009-2012.
[5] T. Lenart and V. Owall, A 2048 complex point FFT processor using a novel data scaling approach, ISCAS (2003), 45-48.


Cost of Time-shared Policy in Cloud Environment


Gh. Dastghibyfard

Abbas Horri

College of Engineering, Shiraz University

College of Engineering, Shiraz University

Department of Computer Science and Engineering

Department of Computer Science and Engineering

dstghaib@shirazu.ac.ir

horri@shirazu.ac.ir

Abstract: Cloud providers must ensure that their service delivery is flexible in order to meet various consumer requirements. However, in order to support green computing, cloud providers also need to minimize the cloud infrastructure energy consumption while conducting the service delivery. In this study, an energy consumption model is proposed for the time-shared policy in cloud environments. This model has been implemented and evaluated using the CloudSim simulator. The related simulation results validate the model and indicate that the energy consumption may be considerable. Simulation results demonstrate that there is a tradeoff between energy consumption and quality of service in the cloud environment.

Keywords: Cloud Computing, Energy Consumption, Time-Shared Policy, Green Computing.

Introduction

One of the cloud benefits is the possibility to dynamically adapt (i.e., scale up or scale down) the amount of provisioned resources to applications in order to follow the variations in demand, which are predictable [1]. Elastic (i.e., automatically scaling) applications, such as web hosting, content delivery, and social networks, can use this cloud ability, as they are amenable to elasticity. Although the cloud has been used as a platform supporting elastic applications, it faces limitations such as ownership, scale, and locality. For instance, the number of hosting capabilities (virtual machines and computing servers) that can be offered to application services at a given instant of time by a cloud is limited; hence, scaling the application capacity in this situation becomes complex. Therefore, the applications hosted in a cloud must compromise on the overall QoS delivered to their users when the number of requests overshoots the cloud capacity [2].
One of the important requirements to be provided by cloud computing environments is reliable QoS [3]. It is defined in terms of the service level agreements (SLA), which describe such characteristics as the throughput, response time, or latency delivered by the deployed system. Response time is the amount of time obtained from the interval between the request submission and the first-response production. In interactive systems such as web services and real-time applications, response time is the major QoS factor [4]. Response time is a function of load intensity, which can be measured in terms of arrival rates (such as requests per second) or the number of concurrent requests. In situations where the response time is more important than the other QoS factors, the system must apply the time-shared policy.

1.1 Time-share and space-share policy

Cloud computing infrastructure uses virtualization technology; hence, the cloud has a virtualization layer [5]. Even though the VMs are isolated in memory space, they need to share the processing cores. Thus, the hardware resources of each VM are constrained by the hardware resources of the node. The CloudSim simulator provides VM provisioning at two levels: host level and VM level. At the host level, it is possible to specify how much of the total processing power of each core will be assigned to each VM. At the VM level, the VM assigns a determined amount of processing power to
Corresponding Author, P. O. Box 71348-51154, T: (+98) 711 613-3168


the job units. At each level, CloudSim can apply the time-shared or space-shared provisioning policies. In the space-shared mode, the arrival of new tasks does not have any effect on the tasks under execution, and every new task is queued. However, in the time-shared mode, the execution time of each task varies with an increase in the number of submitted tasks. The time-shared policy for allocating tasks to VMs has a significant effect on execution time, as the processor is heavily context switched among the active tasks. Hence, when the time-shared policy is used, the cost and energy consumption of context switches must be taken into account. The rest of the paper is organized as follows. The next section presents related work and motivations. Section 3 introduces the proposed model. Experimental results are detailed in Section 4. Finally, Section 5 concludes the paper.

Related work and motivation

In a multi-task system, a context switch refers to the switching of the CPU from one process to another. Context switching allows processes to execute concurrently. Measuring the cost of the context switch is a challenging problem, and the cost of the time-shared policy has not been modeled in the cloud literature. Hence, the context switch cost should be measured on a real system and this measurement then used to model the cost of the time-shared policy in CloudSim. In [6] and [7], the cost of the context switch was measured using a benchmark with two processes communicating with each other via two pipes. In our work, pipe communication was also used to implement frequent context switches between two processes, and the cost per context switch (c1) was measured using Ousterhout's method [6]. DVFS is a method to provide a variable amount of energy for executing a task by scaling the operating voltage/frequency of the CPU. Based upon [8] and [9], applying DVFS on the CPU results in an approximately linear power-to-frequency relationship; on average, an idle server consumes approximately 70% of the power consumed by the server running at 100% CPU utilization. In [10] and [11], the authors have proposed the following model for the power and energy of the CPU in the cloud:

P(u) = K * Pmax + (1 - K) * Pmax * u        (1)

where Pmax is the maximum power consumed when the server is fully utilized, K is the fraction of power consumed by the idle server, and u is the CPU utilization. The utilization of the CPU may change over time due to the workload; thus, the CPU utilization is a function of time, represented as u(t). Therefore, the total energy consumption E of a physical host can be calculated as the integral of the power consumption function over a period of time:

E = ∫t P(u(t)) dt        (2)

In this study, the power model defined in (1) has been used. To the best of our knowledge, no research has been carried out on the measurement of the context switch cost, or on modeling the energy consumption of the time-shared policy in CloudSim.

The Approach

In this section, the proposed method is introduced. The method is based on data measured in a number of observations on a real system. In the first subsection, the real system measurement is described, and in the second subsection the model used for simulation is depicted.

3.1 Measuring Context Switch Cost in a Real System

According to Ousterhout's method, two processes repeatedly send a single-byte message to each other via two pipes. In each round-trip communication, two context switches, as well as one read and one write system call in each process, will occur. In this study, the time cost of 1,000 round-trip communications (t1) was measured, and the time cost of 1,000 simulated round-trip communications (t2), which includes no context switch cost, was measured as well. The direct time cost per context switch was calculated as c1 = t1/2000 - t2/1000.
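A minimal Python sketch of this pipe-based measurement is shown below (our own illustration for Unix-like systems; the original benchmark was presumably written in C, and the byte counts and timer choice here are our assumptions):

import os, time

N = 1000  # round trips

def measure_roundtrips() -> float:
    """Two processes ping-pong a single byte over two pipes; each round
    trip forces two context switches plus one read and one write per process."""
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()            # Unix only
    if pid == 0:                               # child: echo N bytes back
        for _ in range(N):
            os.write(c2p_w, os.read(p2c_r, 1))
        os._exit(0)
    start = time.perf_counter()
    for _ in range(N):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed

def measure_simulated() -> float:
    """Same read/write system calls on a self-pipe, with no context switches."""
    r, w = os.pipe()
    start = time.perf_counter()
    for _ in range(N):
        os.write(w, b"x")
        os.read(r, 1)
    return time.perf_counter() - start

t1, t2 = measure_roundtrips(), measure_simulated()
c1 = t1 / (2 * N) - t2 / N     # direct cost per context switch (seconds)
print(c1)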


3.2 Modeling Context Switch in a Simulator

In this study, based on the result of the real-system measurement, the context switch time was calculated. To model the context switch in the simulator, the number of instructions per context switch (IPC) was required. To calculate IPC we used the IPS (instructions per second) of the processor, which depends on the CPU architecture and is available in the processor specification table. One of the important parameters influencing the cost of the context switch is the context switch frequency. To model this parameter, a quantum parameter was defined, which shows the number of instructions that execute between two context switches in MI (million instructions). Hence, the cost of context switches is computed as:

Icost(J) = Σ_{j∈J} ( I(j) / (q * IPS) ) * t        (3)

where J is the set of jobs to be executed under the time-shared policy, IPS is the processor instructions per second, t is the cost of a context switch in MI (million instructions), I(j) is the total instructions of job j in MI, and q is the quantum parameter in seconds. To measure the extra energy consumption of the time-shared policy, the linear model described in (1) and (2) has been used. Based upon this model and the cost model depicted above, the extra energy consumption of the time-shared policy is:

E = ∫_{t=0}^{Icost/(VMMIPS*u)} P(u) dt        (4)

where P(u) is the power consumed by the host, VMMIPS indicates the processor speed of the VM in MIPS (million instructions per second), Icost is the cost of the time-shared policy in MI, and u is the CPU utilization. Hence, Icost/VMMIPS is the extra time caused by the time-shared policy, and E is the extra energy usage when the time-shared policy is applied. Each job in the cloud must be executed on a VM; hence, for each job to be executed under the time-shared policy, the cost and energy usage are calculated with the VM parameters based on the above methods.

Experimental Result

In this section, we will discuss the analysis of the model described in the previous section. As the targeted system is a cloud computing environment, it is essential to evaluate it on a large-scale infrastructure. Hence, a data center with 100 heterogeneous physical hosts was simulated. Each host was modeled to have a dual-core CPU, with the performance of each core equivalent to 1000 million instructions per second (MIPS), 4 GB of RAM, and 1 TB of storage. The power consumption of the hosts was defined according to the model described in the previous section: a host consumes from 210 W at 0% CPU utilization up to 300 W at 100% CPU utilization. Each VM requires one CPU core with 250 MIPS, 128 MB of RAM and 1 GB of storage. The users submit requests for the provisioning of 800 heterogeneous VMs, and each VM runs four jobs. Initially, VMs are provisioned according to the requested characteristics, assuming 100% utilization.

Figure 1 demonstrates the cost of the time-shared policy as the size of the jobs is increased. In this figure, the X axis represents the job size and the Y axis represents the extra cost caused by context switch overhead, given in MI. As can be seen, if the job size increases, the cost of the time-shared policy also increases. This is due to the fact that larger jobs cause more context switches than smaller jobs, and in the cloud environment the job size is sufficiently large. In this case, the quantum size is 5 msec.

Figure 1: The indirect cost as the size of the jobs is increased.

The next experiment was planned to evaluate the effect of the quantum parameter on the turnaround time under the time-shared policy. Figure 2 shows that the turnaround time of jobs in the time-shared policy increases as the quantum parameter decreases. In this experiment, the job length is 75,000 MI. An increase in the quantum parameter contributes to a decrease in the time-shared policy cost.

Figure 2: The indirect cost increases as the quantum parameter decreases.
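To make the cost and energy model concrete, the following Python sketch (ours, not the paper's code) evaluates Eqs. (1)-(4) for a set of jobs. The per-context-switch cost t_cs_mi is an illustrative placeholder, while Pmax, K, the host IPS, the VM speed and the quantum follow the experimental setup described above.

def power(u, p_max=300.0, k=0.7):
    """Eq. (1): linear power model; u is CPU utilization in [0, 1]."""
    return k * p_max + (1 - k) * p_max * u

def icost(job_lengths_mi, q, ips_mips, t_cs_mi):
    """Eq. (3): extra instructions (MI) spent on context switches when the
    jobs run under the time-shared policy with quantum q (seconds)."""
    return sum((length / (q * ips_mips)) * t_cs_mi for length in job_lengths_mi)

def extra_energy(job_lengths_mi, q, ips_mips, t_cs_mi, vm_mips, u=1.0):
    """Eqs. (2) and (4): energy over the extra Icost/(VMMIPS*u) seconds,
    assuming utilization u stays constant over that interval."""
    extra_time = icost(job_lengths_mi, q, ips_mips, t_cs_mi) / (vm_mips * u)
    return power(u) * extra_time      # integral of a constant P(u)

# Example: four 75,000-MI jobs on a 250-MIPS VM, 5 ms quantum, 1000-MIPS core;
# t_cs_mi = 0.005 MI per context switch is an assumed value.
jobs = [75_000.0] * 4
print(extra_energy(jobs, q=0.005, ips_mips=1000.0, t_cs_mi=0.005, vm_mips=250.0))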


Increasing the quantum size decreases the turnaround time but increases the average response time (the time from submission until the first response is produced). On the other hand, a decrease in the quantum parameter leads to a decline in the response time and a rise in the time-shared policy cost. Hence, there is a tradeoff between the cost of the context switch and the response time.
The final experiment was designed to show the effect of the quantum parameter on the energy consumption under the time-shared policy. Figure 3 shows that the energy consumption of jobs in the time-shared policy increases as the quantum parameter decreases. In this experiment, the job length is 75,000 MI. An increase in the quantum parameter contributes to a decrease in the time-shared policy energy consumption.

Figure 3: The energy consumption as the quantum parameter is increased.


Conclusion

In this study, based on the results obtained from a real system, the cost and energy usage of the time-shared policy were modeled in the CloudSim simulator, and this model was then evaluated through various scenarios. The results indicate that the energy consumption may be considerable. This research can be extended by investigating the cost of data access and the workload type.

References

[1] M. Armbrust, A. D. Joseph, R. H. Katz, D. A. Patterson, A. Fox, and R. Griffith, Above the clouds: A Berkeley view of cloud computing, EECS Department, U.C. Berkeley, 2009.
[2] L. Wu, S. K. Garg, and R. Buyya, SLA-based admission control for a Software-as-a-Service provider in Cloud computing environments, Journal of Computer and System Sciences (2011), 367-378.
[3] A. Beloglazov, R. Buyya, and Y. Lee, A taxonomy and survey of energy-efficient data centers and cloud computing systems, Advances in Computers 82 (2011).
[4] H. Aydin, R. Melhem, D. Mosse, and P. Mejia-Alvarez, Power-aware scheduling for periodic real-time tasks, IEEE Transactions on Computers 53 (2004), 584-600.
[5] I. Foster, Y. Zhao, I. Raicu, and S. Lu, Cloud Computing and Grid Computing 360-Degree Compared, Grid Computing Environments Workshop (2008), 1-10.
[6] J. Ousterhout, Why Aren't Operating Systems Getting Faster As Fast As Hardware?, USENIX (1990), 247-256.
[7] C. Li, C. Ding, and K. Shen, Quantifying the Cost of Context Switch, Experimental Computer Science Workshop (2007), 13-14.
[8] A. Gandhi, M. Harchol-Balter, and C. Lefurgy, Optimal power allocation in server farms, Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems - SIGMETRICS (2009), 157.
[9] R. Raghavendra, P. Ranganathan, V. Talwar, Z. Wang, and X. Zhu, No power struggles: coordinated multi-level power management for the data center, Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (2008), 48.
[10] R. Buyya, A. Beloglazov, and J. Abawajy, Energy-Efficient Management of Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and Open Challenges, Proceedings of the 2010 International Conference on Parallel and Distributed Processing Techniques and Applications (2010), 12.
[11] A. Beloglazov and R. Buyya, Adaptive Threshold-Based Approach for Energy-Efficient Consolidation of Virtual Machines in Cloud Data Centers, Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science (2010).


Using Fuzzy Classification System for Diagnosis of Breast Cancer


Maryam Sadat Mahmoodi

Bahram Sadeghi Bigham

University of Siastan and Bluchestan

Institute for Advanced Studies in Basic Sciences

Faculty of Computer Science

Department of Information Technology

Zahedan, Iran

Zanjan, Iran

M mahmoodi 64@yahoo.com

B sadeghi b@yahoo.com

Adel Najafi-Aghblagh Rostam Khan


University of Mohaghegh Ardebili
Department of Mathematics and Computer Sciences
Ardebil, Iran
Najafi.a.ark@uma.ac.ir

Seyed Abbas Mahmoodi


Islamic Azad University
Faculty of Computer Engineering Science and Research Branch
Yazd, Iran
Sa mahmoodi 85@yahoo.com

Abstract: Evolutionary algorithms, an important class of algorithms, have been used in the data mining field to induce fuzzy if-then rule based classification systems. The Harmony Search Algorithm (HSA), an algorithm inspired by nature, has been successfully applied to classification. This paper proposes a rule-based system for medical data mining using a combination of HSA and fuzzy set theory, which we call FHDD. In the classification problem the objective is to maximize the number of correctly classified data and minimize the number of rules. We have evaluated our new classification system on UCI machine learning data sets. Results show the proposed algorithm can detect diseases with an acceptable accuracy, or even better than previous works. In addition, the computation time to build the classifier is reduced because FHDD utilizes an HSA to learn a set of fuzzy rules from labeled data in a parallel manner.

Keywords: Harmony Search Algorithm, Fuzzy Classification, Breast Cancer.

Introduction

Medical diagnosis can be viewed as a pattern classification problem: based on a set of input features the goal is to classify a patient as having cancer or as not having it, i.e. as a malignant or a benign case [1]. Breast cancer is the most common cancer in women, accounting for about 30% of all cases [2]. Most breast cancers are detected as a mass on the breast. Some diseases such as breast cancer show symptoms,
Corresponding Author, T: (+98) 0913 954 5939


some of which also appear in other diseases. Thus physicians must pay attention to previous decisions made for patients in the same conditions, and since early diagnosis is important, as it is directly linked with increased survival chances, the physician needs both knowledge and experience for proper decision making. From a computational point of view, breast cancer diagnosis can be viewed as a pattern classification problem: based on a set of input features the goal is to classify a patient as having cancer or as not having it, i.e. as a malignant or a benign

case. This job is not easy, considering the number of factors that the expert has to evaluate. To reduce the possible errors and help the expert, a classification system can be used. Classification is a supervised learning technique that takes labeled data samples and generates a classifier that classifies new data samples into different predefined groups or classes. This classification problem can be solved by fuzzy logic with interpretable if-then rules and membership functions. Classification schemes have been developed successfully for several applications such as medical diagnosis, speech recognition, etc. In classification problems, sets of if-then rules for learned hypotheses are considered to be one of the most expressive and comprehensive representations. Generally the rules and the membership functions used by the fuzzy logic for solving the classification problem are formed from the experience of human experts. Because it is not usually easy to derive fuzzy rules from human experts, many approaches have recently been proposed to generate fuzzy rules automatically from the training patterns.

Expert systems and artificial intelligence techniques for classification also help experts a great deal and, for this reason, many algorithms have been proposed to classify breast cancer. The technique which we have applied is based on fuzzy genetic learning. Several methods have been proposed to produce fuzzy if-then rules. For example, fuzzy rule-based classification systems are created by simple heuristic procedures [3, 4], neuro-fuzzy techniques [5, 6] and genetic algorithms [7]. One of the applications of GAs is pattern classification problems. Genetics-based machine learning methods for rule generation are divided into two categories: the Michigan approach [8] and the Pittsburgh approach [7]. In the Michigan approach, each rule is handled as an individual, called a classifier. The Pittsburgh approach [9] handles an entire rule set as an individual. In this paper, we have used a Michigan-based algorithm for the classification problem. To accomplish this purpose we have used a hybrid genetic algorithm with a harmony search algorithm.

This paper is organized as follows: Section 2 describes the Harmony Search Algorithm. Section 3 describes the fuzzy rule based classifier using GA and HSA. Experimental results are reported in Section 4, and Section 5 is the conclusion.

Harmony Search Algorithm

The harmony search (HS) method is a meta-heuristic optimization algorithm first proposed by Geem in 2001 [10]. The HS algorithm is based on natural musical performance processes that occur when a musician searches for a better state of harmony. When a musician is improvising, he or she has three possible choices: (1) play any famous piece of music exactly from his or her memory; (2) play something similar to a known piece; or (3) compose new or random notes. In music improvisation, each player sounds any pitch within the possible range, together making one harmony vector [11]. This kind of efficient search for a perfect state of harmony is analogous to the procedure of finding the optimal solutions to engineering problems.

Proposed Algorithm

In computer simulations, we used a typical set of linguistic values in Figure 1 as antecedent fuzzy sets. The membership functions applied to a fuzzy rule set are assumed to be isosceles-triangle functions and half-open trapezoids as shown in Figure 1, where aij denotes the j-th linguistic value of Feature Ai. A total of 6 membership points (zi1, zi2, zi3, zi4, zi5, zi6) are required for representing each input variable as a fuzzy set. Of these 6 points, the first and last points (z1, z6) are fixed, being the minimum and maximum of the input variable. To compute the other remaining four membership points we divided the distance between the endpoints by 5.

Figure 1: Attribute Value Ai

Then we represent each membership function as a quintuplet (P1, P2, P3, P4, P5). If ai is lower than or equal to zi2 then P1 is 1 and P2, P3, P4, P5 are 0; if ai is lower than or equal to zi3 then P2 is 1 and P1, P3, P4, P5 are 0; and so on. An example in Figure 2 represents the process of encoding membership function sets.
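As a small illustration of this encoding (our own sketch; the paper's Figure 2 gives the actual example), a crisp attribute value can be mapped to its five-bit quintuplet as follows:

def membership_points(x_min: float, x_max: float):
    """Six equally spaced points z1..z6 over the attribute range; the four
    interior points split the range into five equal parts."""
    step = (x_max - x_min) / 5.0
    return [x_min + i * step for i in range(6)]

def encode(a: float, x_min: float, x_max: float):
    """Return the quintuplet (P1..P5): a single 1 marking which of the five
    linguistic values (S, MS, M, ML, L) the crisp value a falls into."""
    z = membership_points(x_min, x_max)
    quint = [0, 0, 0, 0, 0]
    for i in range(1, 5):                 # compare against z2..z5
        if a <= z[i]:
            quint[i - 1] = 1
            return quint
    quint[4] = 1                          # otherwise it is Large
    return quint

print(encode(3.0, 0.0, 10.0))             # -> [0, 1, 0, 0, 0]  (Medium Small)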


Figure 2: The used antecedent fuzzy sets in the paper

S:   Small           10000
MS:  Medium Small    01000
M:   Medium          00100
ML:  Medium Large    00010
L:   Large           00001

3.1 Learning Fuzzy Rule

An HSA is applied to learn the fuzzy rules, and the learning process for each class is done independently. In other words, FHDD learns the fuzzy rules related to each class in a parallel way, which decreases the time taken to build the output classifier.

The outline of the proposed approach is as follows:
Step 1: Generate an initial population of fuzzy if-then rules. (Initial Population)
Step 2: Evaluate each fuzzy if-then rule in the current population.
Step 3: Generate new fuzzy if-then rules by Harmony operations. (Rule Generation)
Step 4: Replace a part of the current population with the newly generated rules. (Updating the Current Rule)
Step 5: Terminate the algorithm if a stopping condition is satisfied, otherwise return to Step 3.
Step 6: Save the best individuals of the resulting population. Terminate the whole algorithm if the maximum number of iterations is reached; otherwise, go to Step 1.
Each of the above steps is explained in detail below.

3.1.1 Initial population

Let us denote the number of fuzzy if-then rules in each population in our fuzzy classifier system by Npop (i.e., Npop is the population size). To construct an initial population, Npop fuzzy if-then rules are generated by randomly selecting their antecedent fuzzy sets from the five linguistic values. Each linguistic value is randomly selected with the probability of 1/5.

3.1.2 Evaluating each fuzzy if-then rule

The fitness value of each fuzzy if-then rule is evaluated by classifying all the given training patterns using the set of Npop fuzzy if-then rules in the current population. The fitness value of the fuzzy if-then rule Rj is evaluated by the following fitness function:

Fitness(Rj) = {j1, j2, ..., jh},   1 ≤ h ≤ C        (1)

where jh(Rj) is the sum of the differences of the training patterns (binary strings) in Class h with the fuzzy if-then rule Rj.

3.1.3 Rule generation

This algorithm learns rules for each class separately; therefore, a fuzzy if-then rule is generated from the generated rules and Harmony operations for each class. An HMCR operation is applied to the selected fuzzy if-then rule with a prespecified HMCR probability. With a prespecified PAR probability, each antecedent fuzzy set of the fuzzy if-then rule is randomly replaced with a different fuzzy set after the HMCR operation [12]. After the generation of the fuzzy if-then rules, the fitness value of each of the newly generated fuzzy if-then rules is determined.

3.1.4 Updating the current population

In our fuzzy classifier system, the worst rule with the smallest fitness value in each class is removed from the current population, and the newly generated fuzzy if-then rule is added if it is better. The above procedure is iterated until a pre-specified number of fuzzy if-then rules are generated.

3.1.5 Stopping condition

We can use any stopping condition for terminating the algorithm. In the computer simulations of this paper, we used the total number of generations as the stopping condition.
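The following Python sketch summarizes the learning loop of Sections 3.1.1-3.1.5 in schematic form (our own simplification: the rule encoding, fitness function, population size and probabilities are illustrative stand-ins for the paper's definitions):

import random

N_POP, N_GENERATIONS = 30, 200        # illustrative sizes
HMCR, PAR = 0.9, 0.3                  # harmony memory consideration / pitch adjust rates
LINGUISTIC_VALUES = range(5)          # S, MS, M, ML, L

def random_rule(n_features):
    return [random.choice(LINGUISTIC_VALUES) for _ in range(n_features)]

def learn_class_rules(patterns, n_features, fitness):
    """Learn fuzzy if-then rules for one class; classes are handled
    independently (and could therefore run in parallel)."""
    population = [random_rule(n_features) for _ in range(N_POP)]       # 3.1.1
    for _ in range(N_GENERATIONS):                                     # 3.1.5
        new_rule = []
        for j in range(n_features):                                    # 3.1.3
            if random.random() < HMCR:                    # pick from memory
                value = random.choice(population)[j]
                if random.random() < PAR:                 # pitch adjustment
                    value = random.choice(LINGUISTIC_VALUES)
            else:                                         # random note
                value = random.choice(LINGUISTIC_VALUES)
            new_rule.append(value)
        worst = min(population, key=lambda r: fitness(r, patterns))    # 3.1.4
        if fitness(new_rule, patterns) > fitness(worst, patterns):     # 3.1.2
            population[population.index(worst)] = new_rule
    return max(population, key=lambda r: fitness(r, patterns))

# Toy usage: a dummy fitness that simply prefers rules with many "Medium" antecedents
best = learn_class_rules([], 9, lambda r, _: r.count(2))
print(best)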


3.2 Fuzzy Inference

Let us assume that our pattern classification problem is a c-class problem in the n-dimensional pattern space with continuous attributes. We also assume that M real vectors xp = (xp1, xp2, ..., xpn), p = 1, 2, ..., M, are given as training patterns from the c classes. When the algorithm has generated a fuzzy if-then rule for each class using the M patterns, a fuzzy inference engine is needed to classify test patterns. Figure 3 illustrates the fuzzy inference engine [13].

Figure 3: The Testing Stage

When a rule set S is given, an input pattern xp = (xp1, xp2, ..., xpn) is classified by a single winner rule Rj in S, which is determined as follows [14]:

μj(xp) = max{ μj(xp) | Rj ∈ S }

Experimental Results

For the evaluation of our proposed classification system we used data sets from the UCI data repository [15]: Wisconsin Breast Cancer (Wisconsin), diabetes (Pima) and Heart (Table 1).

Table 1: Data sets
Data set     Instances   Attributes   Classes
Wisconsin    699         10           2
Pima         768         8            2
Heart        270         13           2

The classification rate is calculated according to (2):

Classification Rate = (TP + TN) / (TP + TN + FN + FP)        (2)

where
TP: true positives, the number of cases in our training set covered by the rule that have the class predicted by the rule.
FP: false positives, the number of cases covered by the rule that have a class different from the class predicted by the rule.
FN: false negatives, the number of cases that are not covered by the rule but that have the class predicted by the rule.
TN: true negatives, the number of cases that are not covered by the rule and that do not have the class predicted by the rule.

Also, the Precision, Recall and F-Measure are computed by the following equations; the F-Measure is a trade-off between Precision and Recall.

Precision = TP / (TP + FP)        (3)
Recall = TP / (TP + FN)        (4)
F-Measure = 2 * Precision * Recall / (Precision + Recall)        (5)

Tables 2-4 show the mean classification rate, Precision, Recall and F-Measure for the rules generated by the proposed algorithm and by several well-known methods that have been tested with the Weka software. Also, because the number of rules that our algorithm produces is low, the algorithm has good comprehensibility (Table 5).

Table 2: Wisconsin breast cancer data set
Method               Classification Rate   Precision   Recall   F-Measure
C4.5                 0.946                 0.946       0.946    0.946
NN                   0.958                 0.959       0.959    0.959
KNN                  0.951                 0.951       0.951    0.951
BayesNet             0.96                  0.962       0.96     0.96
Proposed algorithm   0.9778                0.978       0.978    0.978

Table 3: Pima data set
Method               Classification Rate   Precision   Recall   F-Measure
C4.5                 0.762                 0.754       0.754    0.751
NN                   0.753                 0.75        0.754    0.751
KNN                  0.702                 0.696       0.702    0.698
BayesNet             0.743                 0.741       0.743    0.742
Proposed algorithm   0.793                 0.798       0.795    0.795

Table 4: Heart data set
Method               Classification Rate   Precision   Recall   F-Measure
C4.5                 0.762                 0.766       0.767    0.767
NN                   0.751                 0.75        0.752    0.751
KNN                  0.735                 0.713       0.732    0.728
BayesNet             0.811                 0.811       0.811    0.811
Proposed algorithm   0.793                 0.798       0.795    0.795
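For reference, Eqs. (2)-(5) can be computed directly from the four counts; the short Python sketch below is a generic illustration, not the paper's implementation.

def classification_metrics(tp: int, tn: int, fp: int, fn: int):
    """Classification rate, precision, recall and F-measure, Eqs. (2)-(5)."""
    rate = (tp + tn) / (tp + tn + fn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return rate, precision, recall, f_measure

# Example with made-up counts
print(classification_metrics(tp=120, tn=80, fp=10, fn=5))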


Table 5: Result of algorithm — Number of Rules for each class

Conclusion

We have introduced a novel approach to fuzzy classification for medical diagnosis. This paper presents a mixture of the Harmony Search Algorithm and fuzzy logic for classification. The proposed algorithm is used in the structure of a Michigan-based evolutionary fuzzy system. The algorithm learns the rules for each class independently. Our experiments have confirmed that the algorithm can classify the data with considerable classification accuracy. The algorithm has features such as an increased classification rate, the generation of only one rule for each class, and increased interpretability.

References

[1] T. Nakashima, G. Schaefer, Y. Yokota, S. Ying Zhu, and H. Ishibuchi, Weighted Fuzzy Classification with Integrated Learning Method for Medical Diagnosis, IEEE Engineering in Medicine and Biology 27th Annual Conference (2005), 5623-5626.
[2] American Cancer Society, Cancer facts and figures: http://www.cancer.org/docroot/STT/stt 0.asp.
[3] S. Abe and M.S. Lan, A method for fuzzy rules extraction directly from numerical data and its application to pattern classification, IEEE Trans. on Fuzzy Systems 3 (1995), no. 1, 18-28.
[4] H. Ishibuchi, K. Nozaki, and H. Tanaka, Distributed representation of fuzzy rules and its application to pattern classification, Fuzzy Sets Syst. 52 (1992), 21-32.
[5] S. Mitra, L.I. Kuncheva, Y. Shi, and Z. Chen, Improving classification performance using fuzzy MLP and two-level selective partitioning of the feature space, Fuzzy Sets and Systems 70 (1995), no. 1, 1-13.
[6] V. Vebele, S. Abe, and M. Lan, Neural-network-based fuzzy classifier, IEEE Transactions on Systems, Man and Cybernetics 25 (1995), no. 2, 333-361.
[7] A. Gonzoles and R. Perez, SLAVE: A genetic learning system based on an iterative approach, IEEE Transactions on Systems 7 (1999), no. 2, 176-191.
[8] H. Ishibuchi, T. Nakashima, and T. Murata, Performance evaluation of fuzzy classifier systems for multidimensional sample classification problems, IEEE Trans. on Systems, Part B 29 (1999), no. 5, 601-618.
[9] S.F. Smith, A learning system based on genetic algorithms, 1980.
[10] O. M. Alia, R. Mahdavi, D. Ramachandram, and M. E. Aziz, Harmony search-based cluster initialization for fuzzy C-Means segmentation of MR images, TENCON 2009 (2009), 1-6.
[11] A. Kattan and R. Abdullah, An enhanced parallel & distributed implementation of the harmony search based supervised training of artificial neural networks, Third International Conference on Computational Intelligence, Communication Systems and Networks (2011), 275-280.
[12] Z. W. Geem, C. Tseng, and Y. Park, Harmony search for generalized orienteering problem: best touring in China, Springer Lecture Notes in Computer Science 3412 (2005), 741-750.
[13] M. F. Ganji and M. S. Abadeh, A fuzzy classification system based on ant colony optimization for diabetes disease diagnosis, Expert Systems with Applications 38 (2011), 14650-14659.
[14] M. F. Ganji and M. S. Abadeh, Parallel Fuzzy Rule Learning Using an ACO-Based Algorithm for Medical Data Mining, IEEE Fifth International Conference on Bio-Inspired Computing: Theories and Applications (2010), 573-581.
[15] UCI Machine Learning Repository: http://www.archive.ics.uci.edu.

Government Above the Clouds: Cloud Computing Based Approach


to Implement E-Government
Toofan Samapour

Sama Technical and Vocational Training College, Islamic Azad University, Astara Branch, Astara, Iran
Department of Computer
ce@samapour.ir

Mohsen Solhnia
Science and Research Branch, Islamic Azad University, Guilan, Iran
Department of Computer Engineering
m.solhnia@gmail.com

Abstract: The fast development of connection network infrastructure and the increase in IT usage across many human activities have greatly enhanced the need to provide e-services for clients. Cost is generally considered one of the main obstacles to presenting such services in developing countries. In this paper, we present new cloud computing based approaches to overcome the existing barriers against the implementation of e-government services by using cloud computing capacities. Nowadays, with the increasing extension of e-government services, the integration of these services from the viewpoints of economy, culture, etc., has become almost a necessity. This integration, achieved by cloud computing techniques, in addition to decreasing e-government costs and facilitating the use of the services for clients, helps to spread the use of e-government services and thereby significantly decreases government costs. This paper discusses the concept of cloud computing, the proposed cloud based approaches for e-government services, and the related economic opportunities and challenges.

Keywords: Cloud Computing; E-Government; Cloud Architecture.

Introduction

In each period, the most common challenges of governments are dealing with resource constraints, reducing costs, and using technology within a specific framework, moving from e-government to c-government via cloud computing [1]. The common goal of the mentioned cases is obtaining the maximum benefit from the minimum opportunity. Thus, the role of research in the field of optimization and cost reduction grows every day. On the other side, the world is turning into a small village with the growth of global communications and communications equipment, and government is not excluded from this development. Therefore, government has developed to cover cost reduction, resource management and communication, in parallel with progress in other sciences such as information technology. This progress has reached the point where, in the computer industry and information technology, the multi-year dream of specialists has become a reality, namely the use and sharing of resources at different scales, which is known as cloud computing. So, the next development of e-government will be the transition from traditional e-government to e-government based on cloud computing. But for the adoption of a technology, a platform and infrastructure should be provided, together with clear objectives for its use. Besides, the interaction of citizens with this emerging phenomenon is important, as is its use at the heart of government.

Corresponding Author, P.C. 43911-47869, M: (+98) 911 381-3765, T: (+98) 0182 526-3595


Section 2 of this paper briefly discusses the existing e-government services and challenges. Section 3 introduces the cloud computing structure. Section 4 presents ideas for using cloud computing capacities in the e-government system. The benefits and challenges of using cloud computing in e-government are elaborated in Section 5, and finally Section 6 concludes the paper with future plans.

E-Government Definition and Challenges

This section presents the definition and challenges of e-government. The World Bank defines e-government as the use by government agencies of information technologies that have the ability to transform relations with citizens, businesses, and other arms of government. These technologies can serve a range of different ends: better delivery of government services to citizens, improved interactions with business and industry, citizen empowerment through access to information, or more efficient government management. The resulting benefits can be less corruption, increased transparency, greater convenience, revenue growth, and/or cost reductions [2].
The development of e-government services, besides the many advantages raised, faces challenges that must be considered in the development phase [3]:
1) Tremendous data: The first is the tremendous amount of data, compounded by the continual tendency to create exclusive solutions for each governmental service. Generally, the result of this kind of leaning has been multiple islands of data, with duplication that often produces conflicting results.
2) Lack of long-term programs and contracts: The second is the lack of long-term programs and contracts, which hinders the government from receiving the full value of well-timed technology promotions. The reality of sectional contracts often creates an excess of infrastructure that cannot be continued between contracts or transferred from one contractor team to its successor. With the public sector being held responsible for cost control and enhanced service, these issues must be considered when using information technology.
3) Difference in technology environments: The third challenge for public organizations is the difference in technology environments, which impedes the adoption of technology refresh programs and the introduction of innovative new applications and strategies.
4) Methods: With key benefits of accountability, transparency, security, innovation, and superiority of service, the stage is set for an added focus on cost control and enhanced information sharing. An additional new step is moving to Infrastructure as a Service, particularly cloud computing, as a means of providing on-demand computing resources, government data, and government services over the Internet in order to further expand the usage of electronic government services.
5) Human acceptance: This principally concerns the usage made by the citizens. There is a challenge of accessibility, usage and acceptance of the e-government services. Even if the number of internet users is growing exponentially, there is a significant part of the population who may not be able to access e-government for various reasons. In most countries users are often not professional users; they need guidance to find the right way to perform their transactions. The successful implementation of e-government services requires facilitating the use of these services for all users [4].

Introduction to Cloud Computing

The term cloud computing refers to a scalable and flexible set of software and hardware resources developed to serve users' requests. All services are independent of the arrangement of hardware and software resources in the data centers, of resource development, and of their restrictions. Given the above definition, it is well known that all of the interactions in the cloud are on-demand services within a specified framework, or so-called x (software, platform, infrastructure) as a service. In the cloud, users are divided into two categories: SaaS users and users of cloud hardware resources [1]. This means that the first step is to form an infrastructure of hardware such as processors, memories, storage and the other devices required for communication and maintenance, which are not necessarily in a centralized data center; this is the task of the cloud provider. Then cloud users, independent of the hardware installations in different geographical areas (the so-called virtualized infrastructure), use it and provide the necessary platform for creating various network applications. After the software is created, it is delivered to end users. Indeed, platform providers are a bridge between cloud providers and SaaS users. In fact, looking at the cloud from the service provider view, the cloud is composed of three main layers: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS); these three layers are shown in Figure 1.


Figure 1: Cloud

In cloud computing there are other divisions. In terms of the ownership of resources, clouds are divided into three different types: private cloud, public cloud and hybrid cloud [5]. A private cloud is defined as a state in which an enterprise owns all resources in the cloud. In a public cloud, all resources are available for the enterprise to rent. A hybrid cloud is a combination of private and public clouds; this means that it is composed of a part that is rented and other parts that are fully owned.

Cloud Approach for E-Government

Cloud computing concepts were described in the previous section. In general, it is defined in two ways. In the first type, cloud computing can provide the use of expensive hardware and software resources for a definite time. In the second type of definition, through cloud computing the e-services are integrated, and people access more e-services from a single portal. Utilization of either type of cloud capacity will bring many benefits for both governments and people. By applying cloud capacities to governmental services, a new world, namely c-government, is created [3]. However, cloud computing is the next generation of information technology, in which data and software reside centrally and are accessible anywhere and anytime by different devices. In the next section, the opportunities and challenges of the approach are discussed.

C-Government Benefits and Challenges

There are many expected opportunities from using cloud computing technologies in e-government services. The resulting opportunities for governments, societies and the environment have many benefits. On the other hand, the implementation of c-government has some challenges. In what follows, some of the most important benefits and challenges are mentioned:

5.1 Benefits

a) Increased flexibility: One of cloud computing's basic characteristics is the ability to provide easy and unlimited scalability [6]. Clients have access to a large virtualized resource warehouse which allows them to meet unusual loads in peak periods in an efficient, flexible, and cost-effective way, because the cost per unit decreases as the number of units increases [7]. Thus, performance and economic sustainability are balanced. Moreover, cloud computing services can be scaled in both directions automatically and can be supplied in any quantity at any time [8].
b) Facilitated maintenance and technical support: Cloud computing service providers maintain the purchased applications and servers. They also manage software updates and provide full technical support. Here the excellence of the cloud appears for e-government services, especially for small countryside government offices, because neither the recruitment of such skilled and professional staff is cheap nor do the experts prefer to work in such remote places [9]. In addition, cloud computing technologies do not require installing or deploying software updates on the user-side computer, and as a result this will reduce maintenance and support challenges.
c) Rapid, easy and inexpensive scalability: Generally, one of the main goals of cloud computing development is to provide cost-effective services for large organizations and governmental entities. The cloud's capabilities decrease the dependency on hardware and software infrastructures for e-service development; therefore, the investment cost required to change the scale decreases, and this process also becomes faster and easier [8].
d) Reduced infrastructure needs on both sides and lower support costs: Cloud computing can actualize the sharing of physical devices and the dynamic allocation of system resources, thus decreasing hardware requirements and reducing the costs of the data center and of the user-side computer. Also, the system software license is a one-time investment cost, especially for government;


therefore the government can use the mode of platform


as services and software as services in cloud computing
environment and decrease its costs.
e) Helping government implement teleworking and similar plans: New services are added to e-government day after day. Teleworking is one of the popular e-services introduced in recent years. European Teleworking Online, on its website, defines teleworking (or telecommuting) as follows: teleworking occurs when information and communications technologies (ICTs) are applied to enable work to be done at a distance from the place where the work results are needed or where the work would conventionally have been done [10]. By combining cloud computing and e-government services, employees will be able to connect to their workplace using common rather than specialized devices.
f) Helping government better manage disasters: The mission-critical nature of technology-based solutions is highly reflected in disaster recovery. It is a vital issue for the survival of most organizations to ensure that they have the ability to survive disasters that might hit their IT infrastructure. Disaster recovery plans in the clouds provide organizations with more options to restore data rapidly and effectively compared to a traditional disaster recovery model [5]. The cloud changes the disaster recovery concept by reducing costs and increasing operational speed. By using the cloud as a backup for disaster recovery, governments can stay in communication with the operational teams and thereby better manage rescue operations.
g) Increased service security and compatibility: Cloud computing promotes the e-government system to provide many official services within a unified work environment, which significantly improves compatibility and stability. On the other hand, most e-government services lack an integrated management and security strategy, causing the governmental entities to act independently. However, cloud computing, with its characteristics of high-level system integration, is conducive to the foundation of management and security strategies on both sides of the e-government system.
h) Easier use and wider adoption of e-services: One of the main benefits of c-government for societies is improving the level of social services such as health, education and cultural affairs by helping to expand the usage of e-government services.
i) Helping to improve environmental conditions: The exponential use of ICT devices in the governmental sector has had a negative effect on the environment, as it increases the rate of carbon dioxide emission and involves more power consumption. Cloud computing is comparatively better suited to reducing power consumption and providing eco-friendly systems through virtualized services. Using virtualized services can reduce typical PC power consumption by up to 90 percent [11].

5.2

Challenges

a) Security policies: In the e-government world, data security is one of the main challenges. The cloud computing environment provides multiple users and applications with access to shared hardware and network resources in order to improve resource usage. However, different governmental entities unavoidably face the situation of sharing the same physical infrastructure. Thus the entities are deeply concerned about important and sensitive data being released without security and privacy guarantees [12].
b) Network infrastructure: Cloud computing is entirely network-based and highly dependent on the network condition. Thus, the risks of network transmission delay or other problems increase after the e-government services move to cloud computing environments, and therefore system reliability is reduced. On the other hand, migration to the cloud that neglects the network condition imperils the success of the system implementation and its security.
c) Security considerations: In cloud computing, e-government service platform applications become more varied, and thus the difficulty of application service management also increases correspondingly. The public hopes to use the e-services without violation of personal privacy by the government. In this regard, the government must also consider the actual needs of the citizens and the legal standards, to ensure service availability and to prevent serious crimes that may be derived. Moreover, in order to strengthen security, the government may at the same time infringe citizenship rights through close supervision of activities in the cloud.
d) Appropriate laws: In most cases it can be seen that the laws have not changed in proportion to the growth rate of technology. For instance, cloud computing involves many legal issues which are not fully addressed currently. In some countries, the gap between cloud computing technology and policy has raised concern, and some governments have begun to formulate and improve relevant laws, but the development of cloud technology goes much faster than the governments' legislation does. Therefore, the cloud computing environment is still full of legal uncertainty.

Conclusion

In this paper we suggested merging cloud computing into the e-government system in order to use the positive capacities of cloud computing in e-government services. We examined the natures of cloud computing and e-government and deduced that creating cloud-based e-government has many benefits for governments and their people. In our forthcoming work, we intend to discuss in more detail the technical prerequisites of the implementation procedures and to prepare a comprehensive framework for the migration to cloud computing.

References

[1] M. Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing (2009).
[2] The e-government handbook for developing countries: unpan1.un.org/intradoc/groups/public/documents/apcity, World Bank, 2009.
[3] W. Zhang and Q. Chen, From E-government to C-government via Cloud Computing, 2010.
[4] A. Tripathi and B. Parihar, E-governance challenges and cloud benefits, 2011.
[5] D. C. Wyld and R. Maurin, Moving to the Cloud: An Introduction to Cloud Computing in Government, E-Government Series (2009).
[6] R. Buyya, J. Broberg, and A.M. Goscinski, Cloud Computing: Principles and Paradigms, New Jersey: John Wiley and Sons, 2011.
[7] B. Furht and A. Escalante, Handbook of Cloud Computing, New York: Springer, 2010.
[8] E.A. Marks and B. Lozano, Executive's Guide to Cloud Computing, New Jersey: John Wiley and Sons, 2010.
[9] W. Cellary and S. Streyjowski, E-Government Based on Cloud Computing and Service-Oriented Architecture, 2009.
[10] Available at: http://www.eto.org.uk/faq/defn tw.htm.
[11] T. Velte, A. Velte, T.J. Velte, and R.C. Elsenpeter, Cloud Computing: A Practical Approach, New York: McGraw Hill Professional, 2010.
[12] C. Yeh, Y. Zhou, H. Yu, and H. Wang, Analysis of E-Government Service Platform Based on Cloud Computing, Lecture Notes in Computer Science, 2010.

Human Tracking-by-Detection using Adaptive Particle Filter based on HOG and Color Histogram

Fatemeh Rezaei
Sharif University of Technology, Department of Electrical Engineering
f rezaei@ee.sharif.edu

Babak H. Khalaj
Sharif University of Technology, Department of Electrical Engineering
khalaj@sharif.edu

Corresponding Author, T: (+98) 21 77932704

Abstract: Human tracking is one of the main problems in the object tracking field. There are many challenges such as human pose variation, illumination changes in the environment, lack of specific moving behavior, occlusion and image noise. This paper presents an adaptive particle filter using HOG and color histogram for human tracking. A motion model is proposed which estimates the target speed from the history of its last displacements. The experimental results show improvements in the robustness of tracking. In addition, by using background subtraction before extracting the HOG features, the running time of the algorithm improves. The publicly available data set PETS2009 S2.L1 is used to evaluate the performance of the proposed method. It is shown that the correct tracking percentage improves and the probability of missing targets decreases.

Keywords: human tracking; particle filter; motion model; HOG; color histogram.

Introduction

Human tracking is an essential step in many computer vision based applications. These applications include human-computer interaction, security and surveillance, sports television enhancement, video indexing, assisted driving, activity recognition and human behavior research. There are many challenges in human tracking such as human pose variation, illumination changes in the environment, lack of specific moving behavior, occlusion and image noise [1]. These challenges make human tracking one of the main problems in the object tracking field.

There are different approaches to tracking in the literature such as optical flow, CamShift, and estimator-based trackers, i.e. the Kalman and particle filters [2]. In comparison with the other basic approaches, estimator-based trackers are more robust against sudden changes in motion and missed detections of the targets. As human motion is almost nonlinear and non-Gaussian, the particle filter is a better choice for human tracking [3] than Kalman filtering [4]. The particle filter highly depends on the features used as its observations. The color histogram has been used as the observation feature in [5], [6], [7] and [8], while [9] uses HOG [10] as the observation feature. More complicated approaches have been proposed in [11], [12], [13] and [14]: [11] and [12] present particle filtering based methods using clustering, and [13] and [14] propose a detector confidence particle filter. Although these recent works achieve reliable tracking results, the approaches are very complicated and time consuming; they add overhead to the program and cause difficulties for online tracking. A particle filter based on color histogram and HOG has been proposed in [15]. It considers both color and shape information and makes the filter more robust against illumination and pose variations. The idea in [15] is simple and useful, so this work builds on it and improves it.

In this paper, an adaptive particle filter using HOG and color histogram is proposed. The improvement of our method is an adaptive motion model which helps the robustness of tracking. By estimating the target speed from the history of its last displacements, the power of the tracker improves and the probability of missing targets decreases. In addition, by using background subtraction and segmentation on the image before extracting the HOG features, the running time of the algorithm improves, which helps the algorithm to be real time. The paper is organized as follows. In section 2, the particle filter, HOG and the color histogram are briefly introduced. In section 3, the proposed method is explained. Section 4 presents experimental results, followed by the conclusion in section 5.

2  Particle Filter Tracking-by-Detection

2.1  Particle Filter

Particle filtering is one of the most important algorithms for human tracking. As people have no specific structure or equation for their motions, it is necessary to have a tracking algorithm that does not require the targets' equations of motion. The particle filter is not a tracking algorithm by itself, but a sampling algorithm; by combining it with an observation model, it can be used for tracking applications. So, depending on the observation model and the feature extraction methods used in the particle filter, there are various approaches with different results.

2.2  Human Detection Features

HOG. One of the best human detection algorithms is the Histogram of Oriented Gradients (HOG), introduced in [10]. The HOG detector is a sliding window algorithm: for any given image, a window is moved across all locations and scales and a descriptor is computed, so it is a time consuming algorithm and causes difficulties for real time applications. A pre-trained classifier is used to assign a matching score to the descriptor to decide whether there is a human or not. The classifier is a linear SVM and the descriptor is based on histograms of gradient orientations.

Color Histogram. The color histogram is computed from the RGB color space as follows [15]:

p(y) = \{p^{(u)}(y) \mid u = 1 : m\}, \qquad p^{(u)}(y) = f \sum_{i=1}^{n} k\!\left(\frac{\|y - y_i\|}{a}\right) \delta\!\left(h(x_i) - u\right)  \quad (1)

where y is the center of the object, y_i denotes the pixel locations of the target centered at y, m is the number of histogram bins, k(x) is the kernel function, a is a scale factor, \delta(x) is the Dirac function, n is the total number of pixels and f is a normalization factor.
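As a concrete illustration of Eq. (1), the following minimal Python sketch (our own illustration, not code from the paper; the bin count m, the default kernel scale and the Epanechnikov-style kernel are assumptions) builds a kernel-weighted RGB histogram of a target window:

```python
import numpy as np

def color_histogram(patch, center, m=8, a=None):
    """Kernel-weighted color histogram of a target window (sketch of Eq. 1).

    patch  : H x W x 3 uint8 RGB window around the target
    center : (row, col) of the target centre y inside the patch
    m      : number of bins per channel (assumed value)
    a      : kernel scale factor; defaults to half the patch diagonal (assumed)
    """
    h, w, _ = patch.shape
    if a is None:
        a = 0.5 * np.hypot(h, w)
    # bin index h(x_i) for every pixel: one joint index over the 3 channels
    bins = (patch // (256 // m)).reshape(-1, 3).astype(int)
    idx = bins[:, 0] * m * m + bins[:, 1] * m + bins[:, 2]
    # kernel weight k(||y - y_i|| / a) for every pixel location
    rows, cols = np.mgrid[0:h, 0:w]
    r = np.hypot(rows - center[0], cols - center[1]).ravel() / a
    weights = np.maximum(0.0, 1.0 - r ** 2)       # Epanechnikov-style kernel
    hist = np.bincount(idx, weights=weights, minlength=m ** 3)
    return hist / max(hist.sum(), 1e-12)          # normalization factor f
```

The Bhattacharyya coefficient between two such normalized histograms p and q, used later in Eq. (8), is then simply np.sum(np.sqrt(p * q)).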

3  Proposed Method

3.1  Initial Steps

First of all, the first frame of the video sequence is subtracted from its background and then segmented using a proper threshold to extract the binary foreground image. It should be mentioned that if the background images are not available, they can be derived using methods such as reference image extraction. Second, the HOG algorithm is run only on the white pixels of the binary image to extract the initial locations of the targets. The centroids of these detection windows are

\{(x_r, y_r) \mid r = 1 : R\}  \quad (2)

in which R is the number of detected targets. This step is proposed because the white pixels of the binary image, representing the foreground, are the pixels most likely to lie on the bodies of the human targets. As mentioned before, the HOG method is one of the most important and reliable approaches for human detection, but its speed is very low, so this simple background subtraction brings a huge improvement in the speed of the algorithm and helps it to be real time. At this step, target templates called q_r are derived using the color histograms of the initial target window locations and their HOG features. The template we propose is the combination of both features mentioned above, including color information of the target and human contour information. It makes the template robust against variations and noise in the video sequence such as illumination changes, human pose variation and occlusion.

3.2  Proposed Particle Filter

The next step is running the particle filter for the targets. The particle filter contains samples and their corresponding weights, which show the probability of each sample being at the centroid of the target. It can be written as

S = \{(X^{(n)}, W^{(n)}) \mid n = 1 : N\}  \quad (3)

where S is the sample set, X^{(n)} is each sample, W^{(n)} is the corresponding weight and N is the number of samples. For the particle filter modeling, three components are considered as follows.

State Vector. The state vector of the filter is considered as

X_t = \{x_t, y_t, u_t, v_t\}  \quad (4)

where x, y are the centroid coordinates and u, v are the estimated speeds of each target at time step t.

Motion Model. As mentioned before, in general tracking applications people have no specific motion model. So if the tracking algorithm does not adapt itself to the changes in human direction and speed, it fails to track the targets. This is the reason why the Kalman filter and its extensions such as the EKF [3], which perform properly for objects with well-known motion equations, such as the objects in military applications, fail in common applications such as human tracking. In order to adapt the particle filter for human tracking, we propose the motion model

(x, y)_t = (x, y)_{t-1} + (u, v)_{t-1}\,\Delta t + \varepsilon_{x,y}  \quad (5)

(u, v)_t = \big((x, y)_t - (x, y)_{t-1}\big)/\Delta t + \varepsilon_{u,v}  \quad (6)

where \Delta t is the between-frame time interval and \varepsilon is Gaussian random noise. The speed equation is the key point which adapts the motion model to the human direction and speed. Suppose a situation where a human target enters the image from one side and continues walking to the other side, but suddenly sees a friend, stands in place, starts talking and after some minutes changes his path to another side. In this common, simple situation, if the particle tracker does not adapt the estimated speed to the human's movement, it fails to track the target. In section 4, it is shown that by applying the proposed motion model, the particle tracker becomes significantly more robust to changes in human motions.

Observation Model. The observation model proposed in this paper is a combination of HOG features and the color histogram. It helps the model to be more robust against variations and noise in video sequences such as illumination changes, human pose variation and occlusion, since it includes color information and human contour information of the target simultaneously. The proposed observation model is

W_n = \alpha \left( \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{d^2}{2\sigma^2}} \right) + (1 - \alpha)\, W_{HOG}  \quad (7)

where

d^2 = 1 - \rho\big(p(X_n), q\big)  \quad (8)

and \rho is the Bhattacharyya coefficient [16] comparing the template model with the observation. W_{HOG} represents the observation weight obtained by the HOG descriptor and \alpha is a coefficient controlling the effect of each observation on the total weight.
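To make the interaction of the state, motion and observation models concrete, the following Python sketch shows one filtering step built from Eqs. (3)-(8). It is our own illustration rather than the authors' implementation: the noise scales, the values of alpha and sigma, and the two helper callables (a color-histogram extractor and a HOG weight function) are assumed placeholders.

```python
import numpy as np

def particle_filter_step(particles, prev_pos, template_q,
                         color_hist_at, hog_weight_at,
                         dt=1.0, alpha=0.5, sigma=0.2,
                         noise_xy=3.0, noise_uv=1.0):
    """One step of the adaptive particle filter (sketch of Eqs. 3-8).

    particles     : (N, 4) array of states [x, y, u, v]            (Eqs. 3-4)
    prev_pos      : previously estimated target position (x, y)
    template_q    : reference color-histogram template q_r
    color_hist_at : callable (x, y) -> candidate color histogram   (assumed helper)
    hog_weight_at : callable (x, y) -> HOG observation weight W_HOG (assumed helper)
    """
    N = len(particles)
    # Motion model, Eq. (5): propagate positions with the estimated speed plus Gaussian noise
    particles[:, 0:2] += particles[:, 2:4] * dt + np.random.randn(N, 2) * noise_xy
    # Observation model, Eqs. (7)-(8): combine color and HOG evidence
    weights = np.empty(N)
    for i, (x, y, u, v) in enumerate(particles):
        p = color_hist_at(x, y)
        rho = np.sum(np.sqrt(p * template_q))          # Bhattacharyya coefficient
        d2 = 1.0 - rho                                 # Eq. (8)
        w_color = np.exp(-d2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
        weights[i] = alpha * w_color + (1 - alpha) * hog_weight_at(x, y)   # Eq. (7)
    weights /= weights.sum()
    est_pos = weights @ particles[:, 0:2]              # weighted position estimate
    # Adaptive speed update, Eq. (6): speed from the last displacement plus noise
    particles[:, 2:4] = (est_pos - np.asarray(prev_pos)) / dt + np.random.randn(N, 2) * noise_uv
    return particles, weights, est_pos
```

A resampling step (drawing N new particles in proportion to their weights) would normally follow before processing the next frame.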

4  Experimental Results

In this section, we describe the experimental results. We test the performance of the proposed method on the publicly available data set PETS2009 S2.L1. Visual C++ 2008 is used for the implementation. Some results of target tracking by our proposed method are shown in Figure 1. For the numerical analysis, we use the metric of [17] as follows: if the overlap between our bounding box and the ground truth bounding box is larger than 70% and the size of our bounding box is less than 1.5 times that of the ground truth bounding box, we treat this as a correct tracking. The correct tracking percentage in Table 1 is the number of correctly tracked frames divided by the number of all frames in the video sequence. We compare the performance of our method with two other trackers: a HOG based particle filter called HOG PF and a color histogram based particle filter called CH PF. As shown in Table 1, our method improves the correctness and robustness of tracking.

Figure 1: Some results of target tracking by our proposed method.

Table 1: Tracker performance

Tracker        Correct tracking percentage
Proposed PF    86%
HOG PF         72%
CH PF          63%

5  Conclusion

In this paper, we presented an adaptive particle filter using HOG and color histogram. A motion model was proposed which estimated the target speed from the history of its last displacements. In addition, background subtraction was used before extracting the HOG features. It has been shown that the correct tracking percentage, the robustness and the running time of the algorithm improve with our proposed method.

Acknowledgement

The authors would like to thank the National Elite Foundation for their support.

References

[1] B. Song, R.J. Sethi, and A.K. Roy-Chowdhury, Robust Wide Area Tracking in Single and Multiple Views, in Guide to Video Analysis of Humans: Looking at People, Springer (2011).
[2] M. Isard and A. Blake, Condensation - conditional density propagation for visual tracking, Computer Vision (1998).
[3] M.S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking, IEEE Transactions on Signal Processing, Vol. 50, No. 2 (2002).
[4] G.W. Pulford, Taxonomy of multiple target tracking methods, IEE Proc.-Radar Sonar Navig., Vol. 152, No. 5 (2005).
[5] K. Nummiaro, E. Koller-Meier, and L. Van Gool, A Color-Based Particle Filter, First International Workshop on Generative-Model-Based Vision, in conjunction with ECCV02 (2002).
[6] J. Schmidt, B. Kwolek, and J. Fritsch, Kernel particle filter for real-time 3D body tracking in monocular color images, in Proc. Autom. Face Gesture Recognit., Southampton, U.K. (2006).
[7] C. Shan, T. Tan, and Y. Wei, Real-time hand tracking using a mean-shift embedded particle filter, Pattern Recognit., Vol. 40, No. 7 (2007).
[8] X. Xu and B. Li, Head tracking using particle filter with intensity gradient and color histogram, IEEE International Conference on Multimedia and Expo (ICME) (2005).
[9] F. Xu and M. Gao, Human Detection and Tracking based on HOG and Particle Filter, Int. Conf. Image and Signal Processing, CISP (2010).
[10] N. Dalal and B. Triggs, Histogram of Oriented Gradients for Human Detection, CVPR (2005).
[11] E. Maggio, M. Taj, and A. Cavallaro, Efficient multi-target visual tracking using Random Finite Sets, IEEE Trans. Circuits and Systems for Video Technology (2008).
[12] E. Maggio, E. Piccardo, C. Regazzoni, and A. Cavallaro, Particle Filtering for Multi-Target Visual Tracking, Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) (2007).
[13] M.D. Breitenstein, F. Reichlin, B. Leibe, E.K. Meier, and L.V. Gool, Robust Tracking-by-Detection using a Detector Confidence Particle Filter, ICCV (2009).
[14] M.D. Breitenstein, F. Reichlin, B. Leibe, E.K. Meier, and L.V. Gool, Online Multi-Person Tracking-by-Detection from a Single Uncalibrated Camera, IEEE Trans. Pattern Analysis and Machine Intelligence (2010).
[15] L. Jin, J. Cheng, and H. Huang, Human Tracking in the complicated background by particle filter using color-histogram and HOG, ISPACS (2010).
[16] A. Bhattacharyya, On a measure of divergence between two statistical populations defined by probability distributions, Bulletin of the Calcutta Mathematical Society 35 (1943).
[17] Y. Li and B. Bhanu, Fusion of Multiple Trackers in Video Networks, ICDSC (2011).

Use of multi-agent system approach for concurrency control of transactions in distributed databases

Seyed Mehrzad Almasi
Islamic Azad University, Science and Research of Kerman, Department of Computer
mehrzad.almasi@gmail.com

Hamid Reza Naji
Kerman Graduate University of Technology, College of Electrical and Computer Engineering
hamidnaji@ieee.org

Reza Ebrahimi Atani
University of Guilan, Department of Computer Engineering
rebrahimi@guilan.ac.ir

Corresponding Author, T: (+98) 936 261-8479

Abstract: One of the most important issues in distributed databases is the concurrency control of transactions that can run simultaneously. This is a critical issue because it can endanger the integrity and consistency of the data in distributed databases, and concurrency control protocols ensure that integrity and consistency. In this paper, we propose a new concurrency control algorithm based on multi-agent systems which is an extension of the majority protocol.

Keywords: Distributed Database; Concurrency Control; Majority Protocol; Multi-Agents.

Introduction

A distributed database is a set of several databases that correlate with each other logically over a network of interconnected computers. Collections of data (e.g. in a database) can be distributed across multiple physical locations. Since the database is distributed, different users can access it without interfering with one another. The database system, through a scheduler, must monitor, examine, and control the concurrent accesses so that the overall correctness of the database is maintained [1]. The traditional approach to concurrency control is based on locking [2]. In this method, which is based on the allocation of data to transactions, when a transaction wants to access data for writing or reading, it first sends a corresponding lock request to a component called the lock manager. In this paper we propose a new concurrency control algorithm based on multi-agent systems which is an extension of the majority protocol. We use a message passing mechanism between nodes to let them communicate with each other and, finally, a voting method to determine which transactions can be executed and which cannot.

2  A Review of the Technologies

2.1  Multi-agent systems

An agent is a computer system situated in some environment and capable of autonomous action in this environment in order to meet its design objectives [3]. A multi-agent system consists of a group of agents that can potentially interact with each other.

2.2  Majority protocol

The majority protocol is one of the methods of distributed lock management for concurrency control. In this protocol, the local lock manager is responsible for locking and unlocking data. In a distributed database with a data item Q replicated on all sites, when a transaction wants to lock Q, the request is sent to more than half of the sites. If the corresponding data has been locked in an incompatible mode by a previous request, the new request is suspended until it becomes compatible. A transaction can use data item Q only if a majority of the sites lock their copy. The majority protocol has some advantages and disadvantages: if a site failure occurs and some data are inaccessible, the method can continue its work, which is one of its advantages; too much message passing is one of its disadvantages, and the possibility of deadlock is another.

3  The Approach

Our algorithm is based on the majority protocol but has some additional features. We use a message passing mechanism for exchanging messages between nodes, as in the majority protocol, but here all nodes collaborate with each other for concurrency control. This means that the nodes exchange messages with one another, in contrast to the majority protocol, in which sites communicate peer-to-peer with the initiating node. One important characteristic of our algorithm is that several different transactions can be considered in a single iteration of the algorithm simultaneously; that is, the concurrency control procedure can be applied to several different transactions at the same time to determine which transactions can be run. The node that initiates the algorithm is called the monitoring node. In this method we have used the multi-agent system concept: each node in the distributed database environment is an agent that interacts with the others. Message passing between agents can therefore be done in parallel, and counting the votes for each transaction can also be done in parallel. Parallelism in vote counting and message passing gives better results in terms of time and reduces the workload on the monitoring agent. In our proposed algorithm no node sends valuable information during message passing; we send metadata instead of valuable information in order to reduce the bandwidth load. To present the algorithm we make the following assumptions: let n be the number of transactions in each iteration, where n is also the number of agents in the network; each data item Q is replicated on all local databases; in each iteration of the algorithm, n requests for locking n different data items can be considered; each transaction has a unique ID in [0, ..., n-1], and the nodes are likewise identified by IDs [0, ..., n-1]. The steps of our new algorithm are described as follows:

Step 1: The monitoring node M broadcasts a message (metadata) to the other n-1 nodes, asking them to check the status of the data items Q_1, ..., Q_n; each node should check whether the requested data item Q_i can be granted. With this message the algorithm is initiated.

Step 2: Upon receiving the message (metadata), node i (and also the monitoring node) checks whether transaction j can lock its requested data or not. If node i can lock the data copy Q_j through its local lock manager, it sends a message for transaction j to the agent whose ID is j. If i = j, this message sending is implicit.

Step 3: In this step, each node does the following: for each i and each j (j ≠ i), let M_j be the message node j has received from node i in step 2. This message is a vote for transaction j, so agent j counts the votes received for transaction j. All agents do this in parallel. If the number of votes for transaction j is at least n/2 + 1, transaction j can be run.

Step 4: In this step, each agent announces the result of its vote count and sends a commit message to the other nodes for running the corresponding transaction.

Step 1 is used by the monitoring node to initiate the algorithm. In step 2, depending on the status of data item Q (whether it can be locked or not), each agent sends a message to the other agents, where the destination nodes are selected by the following rule: node i sends a message to node j if it can lock the data requested by transaction j. This is done for all transactions on each agent. In step 3, each agent counts the number of messages it received during step 2; these counts can be done in parallel. In fact, there is no strict boundary between steps 2 and 3: agents start counting votes as soon as messages arrive, even though some agents may not yet have received any messages. We have presented the algorithm in several steps for better readability. In the final step, agents send a commit command for executing a transaction if at least n/2 + 1 nodes sent their votes for locking the requested data of that transaction during the previous steps.
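A compact, single-process simulation of one such round (our own sketch; in the actual system each agent runs on its own node and the messages are real network messages, while here the local lock decisions and link states are supplied as boolean matrices) is:

```python
def run_round(can_lock, link_ok):
    """Simulate one iteration of the agent-based voting scheme (Steps 1-4).

    can_lock[i][j] : True if node i can lock the data requested by transaction j
    link_ok[i][j]  : True if the link from node i to node j is up
    Returns the set of transaction IDs gathering at least n/2 + 1 votes.
    """
    n = len(can_lock)
    votes = [0] * n
    # Step 2: node i sends a vote (metadata message) to agent j whenever it
    # can lock transaction j's data; for i == j the message is implicit.
    for i in range(n):
        for j in range(n):
            if can_lock[i][j] and (i == j or link_ok[i][j]):
                votes[j] += 1          # Step 3: agent j counts its received votes
    # Step 4: transactions with a majority of votes are committed.
    return {j for j in range(n) if votes[j] >= n // 2 + 1}
```

With a link-failure pattern such as the one used in the illustrative example below, this yields vote counts like 4, 4, 2, 4, 3 and 3 for six nodes, of which only the transactions gathering at least 4 votes commit.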

3.1  An Illustrative Example

To understand how this algorithm works, we consider the case where there are 6 nodes in the distributed database environment. For simplicity of the example, we consider our new algorithm in a situation where all n transactions can lock their requested data, but due to the failure of some links not all messages can reach their destinations. Table 1 shows which links have failed and which have not. Fig. 1 illustrates the messages passed between the nodes during the first two steps of our algorithm, where node 0 is the monitoring node. In step 1, node 0 sends out a message (represented by solid lines) to the other five nodes (1, 2, 3, 4 and 5). Message passing during step 2 is done as described above (represented by dashed lines).

Figure 1: Message passing during steps 1 and 2

Fig. 2 illustrates the messages received by the n nodes during steps 1 and 2 of the algorithm. Entries of 1 and 0 indicate whether or not there was a message exchange between the two corresponding nodes in step 2.

Figure 2: Matrix showing the messages passed, corresponding to Fig. 1

A row denotes the messages received by a node. For example, the first row of the matrix denotes that node 0 has received messages (metadata) from nodes 0, 1, 4 and 5. Similarly, the third row denotes that node 2 has received messages (metadata) from nodes 2 and 5, and so on. After counting votes, it is clear that nodes 0 to 5 have 4, 4, 2, 4, 3 and 3 votes respectively. As mentioned above, each request gaining at least n/2 + 1 (here, 4) votes can be run.

4  The Clustering Part

The purpose of clustering is to decrease the messages exchanged between nodes in order to reduce network traffic and the overload on the monitoring node and the other nodes. Here, each cluster is treated as an external agent, and the concurrency control algorithm described in the previous section is run in each of the clusters separately. Nodes inside clusters are called internal agents and the node initiating the algorithm is called the monitoring agent. The use of external agents can increase the speed of the algorithm due to parallelism and the level of concurrency control. To present the clustering model we make the following assumptions: the number of nodes in the system should be an odd number, and the algorithm is most efficient when there are the fewest link failures.

In Fig. 3 there are 11 nodes where node 0 is the monitoring agent; nodes 1 to 5 are in the first cluster (cluster 0) and the others are in cluster 1. First, the monitoring agent broadcasts a message to all other nodes to initiate the algorithm and the parallel execution of the algorithm in both clusters (Fig. 3a).

Figure 3: Stages of the proposed algorithm in the clustering model

Then, transactions 1 to 5 are considered in cluster 0 and transactions 6 to 10 are considered in cluster 1. Here, the monitoring node receives votes for transaction 0 from the other n-1 nodes, acting as one of the inner agents. As described in the previous section, each node which can lock the data required by transaction i sends a message to node i (Fig. 3b). Upon receiving n/2 + 1 messages from the nodes (including the monitoring node, shown with a dashed arrow), each inner agent broadcasts a commit message to the other n-1 nodes, which means that the corresponding transaction can be run. If the monitoring agent does not receive a commit message from some inner agents (here, nodes 4 and 7, shown by colored nodes) before a threshold time, the monitoring node broadcasts a message to cluster 0 for transaction 7 as (7, 2), in which the first argument is the ID of the node and the second is the number of votes previously received for transaction 7 (Fig. 3c). The same holds for transaction 4 in cluster 1. In the next stage, nodes 1 to 5 in cluster 0 (k1) send their votes to the corresponding node (j) of node i in cluster 1 (k2). The corresponding node (j) of node i is derived as follows: j = i + (|k1 - k2| * m), where m is the number of nodes in each cluster. This holds if i < j; otherwise the formula becomes j = i - (|k1 - k2| * m). In this example, the corresponding node for transaction 7 is 2, where i > j.
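A small sketch of this cross-cluster mapping (our own illustration; the sign rule is our reading of the "i < j, otherwise" condition, under the assumption that higher-numbered clusters hold higher node IDs):

```python
def corresponding_node(i, source_cluster, target_cluster, m):
    """Corresponding node j of node i, with m nodes per cluster.

    Implements j = i + |k1 - k2| * m or j = i - |k1 - k2| * m from the
    clustering model, depending on the direction of the mapping.
    """
    offset = abs(source_cluster - target_cluster) * m
    return i + offset if target_cluster > source_cluster else i - offset

# Example from the text: with m = 5 nodes per cluster, the corresponding node
# of transaction 7 (cluster 1) inside cluster 0 is 7 - 5 = 2.
print(corresponding_node(7, source_cluster=1, target_cluster=0, m=5))   # -> 2
```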

5  The Implementation

We implemented our new algorithm on a Core 2 Duo PC using MPI. The Message Passing Interface (MPI) is a standard developed by the Message Passing Interface Forum (MPIF) [4]. It specifies a portable interface for writing message-passing programs, and aims at practicality, efficiency, and flexibility at the same time [4]. Fig. 4 shows the number of messages exchanged between nodes in each algorithm for n transactions, where n is between 130 and 150.

Figure 4: Number of messages exchanged between nodes in each of the algorithms

The number of messages required for concurrency control of n transactions in the majority protocol is 3n(n-1), while in the proposed algorithm, which is an extension of the majority protocol, this value equals (n-1)(1+2n). In the clustering model this value is 3(n^2-1)/2. These formulas are obtained from analytic computations when all links are intact and all messages are passed. In the proposed algorithm, message passing is shared between all nodes and we have good load balancing: only 4(n-1) messages are exchanged between the monitoring node and the others. For example, in a system with 11 nodes there are 330, 230 and 180 messages for the majority protocol, the proposed normal algorithm and the clustering model respectively, when all links are intact.

Fig. 5 shows the runtime of the algorithms over 20 iterations for the majority protocol and the proposed algorithms (both the normal and clustering models) in an environment with 11 nodes. As shown, the runtime of the algorithm in the clustering model is lower than the other two due to the reduction in message passing and the parallel computation.

Figure 5: Runtime of the algorithms over 20 iterations with 11 nodes
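The analytic message counts quoted above can be checked directly; the following snippet (our own illustration) reproduces the 330, 230 and 180 messages of the 11-node example:

```python
def message_counts(n):
    """Messages needed for concurrency control of n transactions, all links intact."""
    majority   = 3 * n * (n - 1)             # majority protocol
    proposed   = (n - 1) * (1 + 2 * n)       # proposed (normal) algorithm
    clustering = 3 * (n ** 2 - 1) // 2       # clustering model
    return majority, proposed, clustering

print(message_counts(11))   # -> (330, 230, 180)
```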

6  Conclusions

In this paper, we have presented a new algorithm for concurrency control in distributed database systems. Our algorithm is a global algorithm with static parameters which uses a collaborative effort from a group of nodes to determine which requests can be accepted. Load balancing across the nodes, reduced message passing and parallel computation are some of the advantages of our proposed algorithm.

References

[1] B. Bhargava, Concurrency Control in Database Systems, IEEE Transactions on Knowledge and Data Engineering 11 (1999).
[2] P.A. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency control and recovery in database systems, Addison Wesley, Reading (1987).
[3] M. Wooldridge, An introduction to multiagent systems, John Wiley and Sons Ltd (2002).
[4] Y. Aoyama and J. Nakano, RS/6000 SP: Practical MPI Programming, International Technical Support Organization, IBM, Chapter 1, pages 11-12, 1999.

Multi-scale Local Average Binary Pattern based Genetic algorithm (MLABPG) for face recognition

A. Hazrati Bishak
Islamic Azad University, Ahar Branch, Department of Computer Engineering, Ahar, Iran
AHazrati@iau-ahar.ac.ir

K. Faez
Amirkabir University of Technology, Department of Electrical Engineering, Tehran, Iran
Kfaez@aut.ac.ir

T. Taheri
Islamic Azad University, Qazvin Branch, Department of Computer Engineering and IT, Qazvin, Iran
taheri tayebeh2002@yahoo.com

P. Hazrati Bishak
Islamic Azad University, Ahar Branch, Department of Computer Engineering, Ahar, Iran
PHazrati@iau-ahar.ac.ir

Corresponding Author, F: (+98) 4262227729, T: (+98) 914 926 5155

Abstract: In this paper, we propose a fast and robust scheme, called the Multiscale Local Average Binary Pattern operator based on a Genetic algorithm (MLABPG), and apply it to face recognition. The proposed scheme consists of two steps: feature selection and classification. In MLABPG, feature selection is based on a modified multiscale LBP operator; we take the size of the window (S) as a parameter, with S×S denoting the scale of the LBP operator. The calculation is performed based on the average gray-values of the pixels within the windows, instead of individual pixels, and the standard deviation values of the pixels are used for the comparison threshold. In the classification step, the classifiers are weighted according to their importance using a Genetic Algorithm (GA); we optimize the classification accuracy by combining the classifiers with the weights obtained by the GA. Our fitness function measures the accuracy rate achieved by the classification fusion. The experimental results on the ORL database validate that the offered algorithm has better performance than, or comparable performance with, state-of-the-art local feature based methods.

Keywords: Local Binary Pattern, Genetic Algorithm, Average values, Standard Deviation values

Introduction

As one of the most active and visible research topics in computer vision, pattern recognition and biometrics, face recognition has been extensively studied in the past two decades [1, 2], yet it is still a challenging problem in practice due to uncontrolled environments, occlusions and variations in pose, illumination, expression, aging, etc. Various methods have been offered for face feature extraction, among which the representatives include Eigen-face [3], Fisher-face [4], Gabor Feature based Classification (GFC) [5] and LBP methods [6].

Recently, Local Binary Patterns (LBP) were introduced as a powerful local descriptor for the microstructures of images [19]. The LBP operator labels the pixels of an image by thresholding the 3×3 neighborhood of each pixel with the center value and considering the result as a binary string or a decimal number. Ahonen et al. offered a novel approach for face recognition which takes advantage of the Local Binary Pattern (LBP) histogram [7]. After it was extended to the uniform LBP, it has been used in many places because of its highly efficient coding and excellent local texture description, and many researchers have since worked on LBP [7-16]. Excellent results in face recognition have been achieved by using the LBP method. It has been verified that uniform patterns play an important role in texture classification [8], and uniform patterns have also shown their advantage in face recognition [9, 10]. However, the original LBP operator has the following drawback in its application to face recognition: it has a small spatial support region, so the calculations within the original LBP, which are performed between two single pixel values, are much affected by small changes in the pattern.

Genetic Algorithms (GA) have been shown to be an effective tool in data analysis and pattern recognition [20], [21], [22]. An important aspect of GAs in a learning context is their use in pattern recognition, and the combination of classifiers is one area where GAs have been used for optimization. Kuncheva and Jain in [23] used a GA for selecting the features as well as the types of the individual classifiers in their design of a classifier fusion system. A GA has also been used for selecting the prototypes in case-based classification [24].

We offer a novel representation based on the uniform LBP, called the Multiscale Local Average Binary Pattern based on a Genetic algorithm (MLABPG), to overcome the restrictions of LBP, and apply it to face recognition. The proposed scheme can fully utilize the information of non-uniform LBPs at multiple scales. In MLABPG, the calculation is performed based on the average values of the P neighbors of each pixel, instead of individual pixels, and the standard deviation values of the pixels are used for the comparison. We use four scale windows (3×3, 5×5, 7×7 and 9×9) to extract features. Finally, by weighting the classifiers according to their importance using a Genetic Algorithm (GA), we optimize the classification accuracy.

The rest of this paper is organized as follows. In Sections 2 and 3 we briefly review Local Binary Patterns (LBP) and Genetic Algorithms, respectively. Section 4 presents the Multiscale Local Average Binary Pattern (MLABPG) method. In Section 5, experiments on the ORL face database are presented to demonstrate the effectiveness of MLABPG. Section 6 concludes the paper with a conclusion and perspectives on future work.

2  Local Binary Pattern (LBP)

The original LBP operator, introduced by Ojala et al. [19], is a powerful means of texture description. The operator labels the pixels of an image by thresholding the 3×3 neighborhood of each pixel with the center value and converts the result into a binary number using (1):

LBP_{P,R}(x,y) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}  \quad (1)

where g_c is the intensity of the central pixel, g_p is the gray level intensity of a neighborhood pixel and 2^p is the weighting factor of each neighbor; s(.) is a sign function. The process is demonstrated in Figure 1.

Figure 1: The basic LBP operator

Two extensions of the original operator were made in [8]. The first defined LBPs for neighborhoods of different sizes, thus making it possible to handle textures at different scales. Using circular neighborhoods and bilinearly interpolating the pixel values allows any radius and any number of pixels in the neighborhood; in this extension, P sampling points on a circle of radius R form the pattern denoted (P, R). The second defined the so-called uniform patterns: an LBP is uniform if it contains at most one 0-1 and one 1-0 transition when viewed as a circular bit string. For example, 00000000, 00011110 and 10000011 are uniform patterns. Uniformity is important because it characterizes the patches that contain primitive structural information such as edges and corners. Ojala et al. noticed that in their experiments with texture images, uniform patterns account for a bit less than 90% of all patterns when using the (8,1) neighborhood and for around 70% in the (16,2) neighborhood. There are various extensions and reformulations of the original LBP following its first introduction by Ojala et al. [19]. A good source of references can be found in [18].
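A minimal sketch of the basic operator of Eq. (1) (our own illustration; the clockwise ordering of the eight neighbors, and hence of the weights 2^p, is a convention rather than something fixed by the paper):

```python
import numpy as np

def lbp_3x3(image, x, y):
    """Basic LBP code of Eq. (1) for an interior pixel (x, y) of a 2-D gray image."""
    gc = image[x, y]
    # the eight neighbours of the 3x3 window
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for p, (dx, dy) in enumerate(offsets):
        gp = image[x + dx, y + dy]
        code += (1 if gp >= gc else 0) << p      # s(g_p - g_c) * 2^p
    return code
```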

3  Genetic Algorithms (GAs)

A genetic algorithm is a population-based search and optimization method that simulates the process of natural evolution. The two main concepts of natural evolution, which are natural selection and genetic dynamics, inspired the development of this method. The basic principles of this technique were first laid down by Holland [25] and are well described, for example, in [26], [27].

In general, GAs start with an initial set of random solutions called the population [28]. A GA generally has four components: a population of individuals, where each individual represents a possible solution; a fitness function, i.e. an evaluation function by which we can tell whether an individual is a good solution or not; a selection function, which decides how to pick good individuals from the current population for creating the next generation; and genetic operators such as crossover and mutation, which explore new regions of the search space while keeping some of the current information at the same time. Each individual in the population, representing a solution to the problem, is called a chromosome. Chromosomes represent candidate solutions to the optimization problem being solved. In GAs, chromosomes are typically represented by binary bit vectors and the resulting search space corresponds to a high dimensional Boolean space. It is assumed that the quality of each candidate solution can be evaluated using the fitness function.

4  MLABPG Method for Face Recognition

4.1  Multiscale Local Average Binary Pattern (MLABP) operator

The performance of a single LBP operator is limited. A multiscale or multiresolution representation can capture more image features under different settings [13]. The motivation for having a multi-scale representation of the face image comes from the basic observation that real-world objects are composed of different structures at different scales. Also, to extract representative features, the uniform LBP was proposed and its effectiveness has been validated [9, 10]. However, all non-uniform patterns are clustered into one pattern, so a lot of useful information is lost; the useful information of non-uniform patterns at a large scale has to be dug out from its counterpart at a small scale. In this paper, to overcome the limitations of LBP and apply it to face recognition, a simple but powerful texture representation, called the multi-scale local average binary pattern based genetic algorithm, is proposed. This multi-resolution representation based on the uniform LBP is obtained with windows of four scales (3×3, 5×5, 7×7 and 9×9). Since the calculations within the original LBP are performed between two single pixel values, it is much affected by small changes in the pattern and is too local to be robust. In order to obtain a better feature representation, in our method the LABP operator employs a larger number of sample points. In LABP(S), we take the size of the window (S) as a parameter, with S×S denoting the scale of the operator; first, single pixels in the original image I are replaced with the average gray-values of the pixels within windows of scale S (S = 3×3, 5×5, 7×7, 9×9), producing four averaged images M(S). The LABP code is then computed on the averaged image M as in Equation (2):

LABP(S)_{P,R}(x,y) = \sum_{p=0}^{P-1} s(M_p - M_c)\, 2^p  \quad (2)

where M_c is the gray value of the center pixel and M_p are the gray values of the P equally spaced pixels on the circumference of a 3×3 window in the image M, and s(x) is the sign function of Equation (3):

s(x) = \begin{cases} 1, & x \ge \varepsilon\sigma \\ 0, & x < \varepsilon\sigma \end{cases}, \qquad 0 < \varepsilon < 1  \quad (3)

Instead of utilizing a fixed threshold, we assign its value based on \sigma: the standard deviation of all pixels in the window of size S in image I is taken as \sigma, and \varepsilon is a scaling factor. To extract representative features, we consider only the uniform patterns of LABP(S). Finally, a histogram is used to extract the features from each LABP(S), and each histogram is used for classification separately.
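A minimal sketch of Eqs. (2)-(3) (our own illustration under stated assumptions: wrap-around borders via np.roll, a single 3×3 neighbor ring on the averaged image, and no uniform-pattern mapping or histogram step, which the full method would add):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def labp_image(image, S=5, eps=0.1):
    """LABP(S) codes of Eqs. (2)-(3): LBP computed on an S x S averaged image,
    thresholded at eps times the local standard deviation instead of zero."""
    img = image.astype(float)
    M = uniform_filter(img, size=S)                      # average image M(S)
    M2 = uniform_filter(img ** 2, size=S)
    sigma = np.sqrt(np.maximum(M2 - M ** 2, 0.0))        # std dev of the S x S window
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(img.shape, dtype=np.uint8)
    for p, (dx, dy) in enumerate(offsets):
        Mp = np.roll(M, shift=(-dx, -dy), axis=(0, 1))   # neighbour value M_p at offset (dx, dy)
        codes |= ((Mp - M) >= eps * sigma).astype(np.uint8) << p    # Eq. (3) threshold
    return codes
```

In the full scheme this is repeated for S = 3, 5, 7 and 9, and a histogram of the uniform codes of each scale feeds one of the four LABP(S) classifiers.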

4.2  Combining the MLABP Classifiers Using a Genetic Algorithm (MLABPG)

Classification fusion combines multiple classifications of the data into a single classification solution of greater accuracy. A combination of multiple classifiers leads to a significant improvement in classification performance. By weighting each classifier according to its importance using a Genetic Algorithm (GA), we can optimize the prediction accuracy and obtain a marked improvement over raw classification.

In this paper we use a GA to optimize a combination of classifiers. Our objective is to assign a weight to each LABP(S) classifier by means of a fitness function; the goal is to find the weights for the LABP(S) classifiers which minimize the classification error rate. In our encoding scheme, the chromosome is a bit string whose length (n) is determined by the number of classifiers; each weight assigned to a classifier is associated with one bit in the string. We randomly initialized a population of four-dimensional weight vectors with values between 0 and 1, corresponding to the four LABP(S) classifiers, and experimented with different population sizes; we found good results using a population of 100 individuals. Our fitness function measures the accuracy rate achieved by the classification fusion, and our objective is to maximize this performance (minimize the error rate).
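A toy version of this fusion and of the GA search over the weight vectors is sketched below (our own illustration: a real-valued encoding, truncation selection and one-point crossover are simplifications of the bit-string scheme described above, and the score tensor layout is an assumption):

```python
import numpy as np

def fused_accuracy(weights, scores, labels):
    """Fitness of one chromosome: accuracy of the weighted classifier fusion.

    weights : (4,) weights of the four LABP(S) classifiers
    scores  : (4, n_samples, n_classes) per-classifier similarity scores (assumed layout)
    labels  : (n_samples,) ground-truth identities
    """
    fused = np.tensordot(weights, scores, axes=1)          # weighted sum of the scores
    return np.mean(np.argmax(fused, axis=1) == labels)

def ga_search(scores, labels, pop_size=100, generations=50, mut=0.1, rng=None):
    """Very small GA over weight vectors in [0, 1]^4 maximizing fused accuracy."""
    rng = rng or np.random.default_rng(0)
    pop = rng.random((pop_size, 4))
    for _ in range(generations):
        fit = np.array([fused_accuracy(w, scores, labels) for w in pop])
        parents = pop[np.argsort(fit)[-pop_size // 2:]]               # keep the best half
        kids = parents[rng.integers(len(parents), size=pop_size - len(parents))].copy()
        mates = parents[rng.integers(len(parents), size=len(kids))]
        cut = rng.integers(1, 4)                                      # one-point crossover
        kids[:, cut:] = mates[:, cut:]
        kids += rng.normal(0.0, mut, kids.shape)                      # mutation
        pop = np.clip(np.vstack([parents, kids]), 0.0, 1.0)
    fit = np.array([fused_accuracy(w, scores, labels) for w in pop])
    return pop[np.argmax(fit)]
```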

5  Experimental Results

Our system is implemented and compared with existing Local Binary Pattern face recognition systems [6, 11], LDA [4] and Gabor systems [5] on the FERET and ORL face databases. All experiments were run randomly 20 times, after which the results were averaged. The scaling factor is set to 0.1 in the experiments.

5.1  Experiments on the ORL database

The ORL database [17, 18] contains ten distinct images of 40 separate subjects in an upright, frontal position, with tolerance for some tilting and rotation of up to 20 degrees. Furthermore, the variation of some image scales is approximately 10%. Hence, it is expected that this is a more difficult database to work with. Figure 2 shows some sample images from the ORL database.

Figure 2: Some sample images from the ORL dataset

The recognition results of the different methods on the ORL database are shown in Table 1.

Table 1: Performance of different methods on the ORL database

#Train   MLABG   LBP [6]   LDA [4]   Gabor [5]   LGBPH [12]
2        83.87   79.03     76.33     81.33       85.57
3        91.13   86.80     86.67     88.10       94.72
4        96.35   93.76     92.86     93.43       97.56
5        98.37   96.21     95.47     95.67       98.83
6        98.75   97.12     96.67     97.43       99.56
7        99.43   97.75     97.10     98.64       99.86
8        100     98.67     97.33     99.65       100
9        100     99.60     98.56     100         100

All experiments were run randomly 20 times, after which the results were averaged. The scaling factor is set to 0.1 in this experiment. The weights that the GA assigned to the LABP classifiers are 0.9, 0.7, 0.4 and 0.4 for the scales 9×9, 5×5, 7×7 and 3×3, respectively. In this experiment, the training set is formed using n different samples of each individual (n varies from 2 to 9) and the remaining images are used for testing. For each n, we independently ran the system 20 times. We can see again that the proposed CMLABP approach has better performance than the other methods, especially when the number of training samples is small. We achieved more than 97% accuracy on the ORL database. Compared with LBP face recognition systems, the proposed method obtains more than a 2% improvement on the ORL database.

6  Conclusion

In this paper, we offered a novel Local Binary Pattern (LBP) operator to deal with the main defect of the original LBP operator, namely the Multiscale Local Average Binary Pattern operator based on a GA (MLABPG). In our method, the calculations are performed based on the average gray-values of the pixels within windows of scale S, and the standard deviation values of these pixels are used instead of a fixed threshold. In addition, we use a GA to combine the LABP classifiers. Since our offered operator employs a larger number of sample points, it obtains a better feature representation that is not much affected by small changes in the pattern.

References

[1] W.Y. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, Face recognition: A literature survey, ACM Computing Surveys 34 (2003), no. 4, 399-485.
[2] A. K. Jain, R. Ross, and S. Prabhakar, An introduction to biometric recognition, IEEE Transactions on Circuits and Systems for Video Technology 14 (2004), no. 1, 84-92.
[3] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 13 (1991), no. 1, 71-86.
[4] P. Belhumeur, J. Hespanha, and D. Kriegman, Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection, IEEE TPAMI 19 (1997), no. 7, 711-720.
[5] C. Liu, H. Wechsler, and Y. Shi, Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition, IEEE TPAMI 11 (2002), no. 4, 467-476.
[6] A. Timo, H. Abdenour, and P. Matti, Face recognition with Local Binary Patterns, Proc. ECCV (2004), 469-481.
[7] T. Ahonen, A. Hadid, and M. Pietikainen, Face Recognition with Local Binary Patterns, Computer Vision Proceedings, ECCV 2004, Lecture Notes in Computer Science, Springer 3021 (1999), 469-481.
[8] T. Ojala, M. Pietikainen, and T. Maenpaa, Multi-resolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002), no. 7, 971-987.
[9] G. Zhang, Z. Huang, and S. Z. Li, Boosting local binary pattern (LBP)-based face recognition, SinoBiometrics (2004), 179-186.
[10] T. Ahonen, A. Hadid, and M. Pietikainen, Face description with local binary patterns: application to face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2006), no. 12, 2037-2041.
[11] W. Zhang, S. Shan, W. Gao, X. Chen, and H. Zhang, Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition, Proc. 10th IEEE Int. Conf. Computer Vision (2005), 786-791.
[12] A. Hazrati and A. M. Eftekhari Moghadam, Face Recognition System Using a hybrid Approach of Co-occurrence Matrix and Local Binary Pattern, the Fifth Data Mining Conference, IDMC2011, Tehran, Iran (2011).
[13] C.H. Chan, C. Parkan, and A. Swami, Multi-scale local binary pattern histogram for face recognition, Centre for Vision, Speech and Signal Processing, School of Electronics and Physical Sciences, University of Surrey, Guildford, Surrey, U.K. (2008).
[14] A.-A. Bhuiyan and C. H. Liu, On face recognition using Gabor filters, World Academy of Science, Engineering and Technology 28 (2007).
[15] R. Mehta, J. Yuan, and K. Egiazarian, Local Polynomial Approximation-Local Binary Pattern (LPA-LBP) based Face Classification, Proc. SPIE 7881 (2011).
[16] J. Shelton, G. Dozier, K. Bryant, K. Popplewell, T. Abegaz, K. Purington, L. Woodard, and K. Ricanek, Genetic Based LBP Feature Extraction and Selection for Facial Recognition, Proceedings of ACM South-east Conference, Kennesaw, GA (2011).
[17] The Olivetti Research Laboratory (ORL) database, Cambridge, U.K.: http://www.uk.research.att.com/pub/data/att faces.zip, 1994.
[18] The Olivetti Database: http://www.cam-orl.co.uk/facedatabase.html.
[19] T. Ojala, M. Pietikainen, and D. Harwood, A comparative study of texture measures with classification based on feature distributions, Pattern Recognition 29 (1996), no. 1, 51-59.
[20] M.L. Raymer, W.F. Punch, E.D. Goodman, L.A. Kuhn, and A.K. Jain, Dimensionality Reduction Using Genetic Algorithms, IEEE Transactions on Evolutionary Computation 4 (2000), 164-171.
[21] A. K. Jain, D. Zongker, and M. Pietikainen, Feature Selection: Evaluation, Application, and Small Sample Performance, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (1997), no. 2.
[22] K.A. De Jong, W.M. Spears, and D.F. Gordon, Using genetic algorithms for concept learning, Machine Learning 13 (1996), 161-188.
[23] L.I. Kuncheva, L.C. Jain, and S. Z. Li, Designing Classifier Fusion Systems by Genetic Algorithms, IEEE Transactions on Evolutionary Computation 33 (2000), 351-373.
[24] D. B. Skalak, A. Hadid, and M. Pietikainen, Using a Genetic Algorithm to Learn Prototypes for Case Retrieval and Classification, Proceedings of the AAAI-93 Case-Based Reasoning Workshop, Washington, D.C., American Association for Artificial Intelligence, Menlo Park, CA (1994), 64-69.
[25] J. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan, 1975.
[26] K. De Jong, An analysis of the behavior of a class of genetic adaptive systems, The University of Michigan (1975).
[27] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[28] D. E. Goldberg, Adaptation in natural and artificial systems, Ann Arbor, MI, Univ. of Michigan Press, 1975.

A Novel Method for Function Approximation in Reinforcement Learning

Bahar Haghighat
Sharif University of Technology, Department of Electrical Engineering, 202 ACL
haghighat bahar@ee.sharif.ir

Saeed Bagheri Shouraki
Sharif University of Technology, Department of Electrical Engineering, 202 ACL
bagheri-s@sharif.ir

Mohsen Firouzi
Sharif University of Technology, Department of Electrical Engineering, 202 ACL
mfirouzi@alum.sharif.edu

Corresponding Author, T: (+98) 21 6616-5984

Abstract: In this work we propose a straightforward combination of dynamic programming and


Active Learning Method (ALM) fuzzy modeling technique for function approximation and generalization in continuous domain reinforcement learning. Our approach incorporates ALM and smooth
value iteration algorithm, a continuous version of value iteration in discrete state spaces, to obtain
a robust and reliable general-purpose learning method without the need to modify the learning
scheme to explicitly avoid divergence. Being capable of modeling behaviors of any degree of complexity arbitrarily accurately, ALM follows the evolution of the state value function approximation through
different stages of iteration flexibly, eliminating the possibility of estimation error accumulation and
enlargement which eventually results in divergence. The arbitrarily local updates in ALM, allow
for different arbitrary levels of generalization and also arbitrarily local refinements of the approximation. Simulation results of the proposed algorithm on three typical problems demonstrate the
reliable and fast convergence of this strategy.

Keywords: Reinforcement Learning; Function Approximation; Active Learning Method; Fuzzy Modeling.

Introduction

In continuous domain Reinforcement Learning (RL), dealing with


the continuous and inherently high dimensional state
spaces of real-world applications, the problem of the
curse of dimensionality emerges. The well-known approach to this issue is to incorporate generalization
into dynamic programming (DP) through replacing the
lookup table with appropriate function approximators
[1], [2]. However this is a seriously challenging problem due to several complications. The value function to
be approximated is in fact of an evolutionary nature;
during the learning process the value function varies
as a result of both policy evaluation and also policy
improvement processes [1]. This causes serious difficulties since even if the chosen function approximator


is perfectly capable of representing the final optimal
value function it might not be able to approximate
the value function in the intermediate stages. This
lack of capability to follow the evolution of the value
function through learning process accurately enough
can result in converging to an anti-optimal approximation of value function or even to a diverging one, never
reaching equilibrium [3]. The situation gets even more
complicated when it comes to incremental RL where
the arrival of data is dependent on the path that the
agent actively decides to traverse based on its current
estimation of optimal policy. This causes the sampling
of state-action space to be actually very biased to regions of higher interest/lower cost and thus making it
difficult for the function approximation (FA) method


to estimate the more sparsely sampled and yet relevant regions accurately [1]. These inherent complications in the problem emphasize the significance of the
FA method selection issue to ensure reliable and fast
learning. In this work, we propose the use of the fuzzy
modeling technique of Active Learning Method (ALM)
[4] as an ideal FA method capable of securing a fast
and reliable learning scheme in continuous domain reinforcement learning. We will show that ALM is capable of overcoming the aforementioned challenges due to
its unique modeling approach. Compared to existing
approaches, our proposed strategy utilizes much less
mathematical exactness and computational effort due
to its fuzzy nature, and yet outperforms typical powerful FA approaches such as a MLP trained by back
propagation in terms of convergence behavior. The
powerful modeling and function approximation characteristics of ALM seem a natural answer to the aforementioned difficulties. By utilizing Ink Drop Spread
(IDS) operator and an effective partitioning scheme,
ALM is capable of representing functions of any degree to any arbitrary level of accuracy, allowing for
a perfect tracing of the value function evolution during the learning process and thus avoiding the possible
divergence due to accumulated intermediate approximation errors. This advantage is what Gaussian Processes (GP) for function approximation [5], [6], [7] have
obtained through being non-parametric. GP methods
provide the expected value of the approximated function alongside with its variance as a quantitative indicator to the amount of uncertainty of the approximated value. This indicator can be very useful to
guide the search in RL [7], [8]. Such information is
naturally incorporated into ALM modeling technique
in the form of Narrow Path (NP) and spread values,
extracted from IDS planes, which are the fuzzy equivalents of the expected value and variance. In an attempt to solve the biased sampling and the local nonstationary problems through making the effect of each
update sufficiently local, [9] uses a Gaussian Mixture
Model (GMM). ALM also incorporates the notion of
arbitrarily local updates through the arbitrary choice
of radius of the ink stain utilized in the IDS curve extractor units. In addition to addressing the biased
sampling and the local non-stationary problems, this
also enables the algorithm to avoid the undesired and
unpredictable changes caused by the global updates
which are common in neural networks-based FA methods, and ultimately eliminating the necessity for batch
updates [10]. Variable resolution methods also try to
maintain the locality of update effects by partitioning the domain into independently updating regions
[11]. The locality of updates is managed through partitioning the regions into further subdivisions. However
the partitioning scheme is unrecoverable and there is
no generalization between neighbor portioned regions.

This would not be the case in ALM since the locality


of updates applied to the current estimation of value
function is controlled by the radius of the ink drop that
continuously diffuses to neighbor states and fades exponentially. Partitioning in ALM is also not meant to
produce independent update regions, it actually serves
as part of the feature extracting scheme. In [8] the
use of many competing parallel function approximators with overlapping domains is proposed. In order to
select the appropriate competitor to give the best estimation value, a relevance function is computed for each
competitor corresponding to its expected accuracy and
confidence. Only the competitor with the highest relevance at the query point will be used to estimate the
value at that point. ALM, in contrast, estimates the
function value at each point by combining the estimations of several IDS units. These partial estimations
are weighted by their relative confidence degree, forming a kind of weighted voting among an ensemble of
experts, resulting in higher reliability. This paper is
organized as follows. In section II we will briefly study
Active Learning Method modeling technique. Section
III describes our proposed algorithm. Eventually the
performance of the proposed algorithm is assessed in
section IV where experimental results are presented,
before conclusions in section V.

Active Learning Method

Active Learning Method [4] has been shown to be a


very effective fuzzy modeling technique [12]. However,
the processing nature of ALM is more similar to neural
networks concepts than the notions of fuzzy logic computations. This is mainly due to the fact that ALM is
powered by an intuitive pattern-based processing engine, which resembles the way that humans interpret
information; in the form of pattern-like images rather
than in numerical or pure logical forms [12]. ALM
modeling technique splits a multi-input single-output
(MISO) system into several single-input single-output
(SISO) systems that together with a combination rule
engine provide an arbitrarily accurate description of
the original MISO system. The concept of this split is
in agreement with the way humans usually comprehend
complex subjects or relations, by breaking it down to
simpler aspects and descriptions. Figure 1 depicts the
split performed by ALM. Each of the obtained SISO
subsystems corresponds to an IDS unit, consisting of
an IDS plane and a feature extractor engine as depicted in figure 2. Each IDS plane corresponds to a
partition of the input space and depicts the behavior
of the output with respect to the unpartitioned input
variable. The key idea here is that in order to observe


the output behavior of a MISO system with respect


to an input variable, what we naturally think of is to
limit the variation range of other effective inputs, now
if we have a collection of such behaviors for different
limited ranges, i.e. partitions, of other inputs, we may
claim that we thoroughly know how the MISO system
output behaves with respect to that specific input variable. ALM is said to be a universal approximator since
in the limit that each partition is reduced to a single
point, a lookup table with 100% accuracy is obtained.
Characteristic curve and spread feature extractions in
IDS units are done through applying IDS operator to
the collection of sampled or experienced points. IDS
drops an ink stain on the sampled data points. As figure 3 shows, the ink stains diffuse to the neighbouring
points and overlap, implementing a fuzzy interpolation
scheme, with the darkness level corresponding to the
confidence degree. The notion of IDS operator is the
very basic intuitive idea inspired from human learning,
that humans usually do not perceive patterns or information abruptly i.e. observing a point in the feature
space having a certain characteristic; common sense
implies that neighboring points must also have similar characteristics. The farther we get from the point
of experience the less we believe that we might observe
such characteristics. In other words our degree of confidence to observing a previously observed characteristic
in a point in feature space diffuses to the points in the
neighboring space. It would be useful here to note that,
although the gravity field FA methods might seem relevant when it comes to the diffusion phenomenon advantage, they lack the automatic spread feature extracting
scheme. As stated before, spread feature information
would be useful to guide the search.
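To make the ink-drop idea concrete, the following Python sketch spreads a fading darkness stain around each sampled (x, y) point of a SISO subsystem and accumulates the result into a discrete darkness plane. This is our own illustration, not the authors' implementation: the plane resolution, the stain radius and the linear fading profile are assumptions (the paper describes an exponentially fading stain).

import numpy as np

def apply_ids(samples, resolution=256, radius=26):
    """Accumulate fading ink stains around sampled (x, y) points.

    samples: iterable of (x, y) pairs, both normalized to [0, 1].
    Returns a (resolution x resolution) darkness plane, where larger
    values mean higher confidence near observed data.
    """
    plane = np.zeros((resolution, resolution))
    for x, y in samples:
        cx, cy = int(x * (resolution - 1)), int(y * (resolution - 1))
        for i in range(max(0, cx - radius), min(resolution, cx + radius + 1)):
            for j in range(max(0, cy - radius), min(resolution, cy + radius + 1)):
                dist = ((i - cx) ** 2 + (j - cy) ** 2) ** 0.5
                if dist <= radius:
                    # darkness fades with distance from the experienced point
                    plane[i, j] += 1.0 - dist / radius
    return plane

# Example: a noisy sine-like SISO behaviour observed at 50 points
xs = np.random.rand(50)
ys = np.clip(0.5 + 0.4 * np.sin(2 * np.pi * xs) + 0.02 * np.random.randn(50), 0, 1)
ids_plane = apply_ids(zip(xs, ys))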

Figure 3: IDS Applied to Data Points on an IDS Plane

As depicted in figure 2, each IDS unit outputs two


important features, the Narrow-Path and the Spread.
Here we define these values to be computed as below
[13].

ψ(x) = { b | Σ_{y=1}^{b} d(x, y) = Σ_{y=b}^{y_max} d(x, y), b ∈ Y }    (1)

σ(x) = max{ y | d(x, y) > 0 } − min{ y | d(x, y) > 0 }    (2)

where ψ(x) denotes the narrow path value corresponding to an input x, d(x, y) indicates the darkness value
associated with the point (x, y) in the IDS plane, and
σ(x) denotes the spread value associated with point x.
Suppose that ψ_ik(x) denotes the output of the k-th IDS plane
with respect to the input x_i. Then the final model output
y would be

y = ω_11 ψ_11 + ... + ω_ik ψ_ik + ... + ω_{N l_N} ψ_{N l_N}    (3)

where ω_ik is computed as below [13]:

β_ik = log(1 / σ_ik)    (4)

ω_ik = (β_ik Y_ik) / (Σ_{p=1}^{N} Σ_{q=1}^{l_p} β_pq Y_pq)    (5)

Figure 1: Splitting Scheme in ALM

Figure 4 depicts the flow chart of the ALM algorithm as explained in [4]. However, here we use a simplified
version of the algorithm, with a fixed number of partitions and also a fixed number of effective inputs.
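A small Python sketch of the feature extraction step of Eqs. (1)-(2), under the assumption that plane[x, y] stores the darkness d(x, y) of a discrete IDS plane: for each input column x, the spread is the extent of non-zero darkness, and the narrow path is taken here as the balance point where the cumulative darkness below equals the cumulative darkness above (our reading of Eq. 1). Columns with no darkness yield NaN.

import numpy as np

def narrow_path_and_spread(plane):
    """Extract Narrow Path and Spread features from an IDS darkness plane."""
    nx, ny = plane.shape
    narrow = np.full(nx, np.nan)
    spread = np.zeros(nx)
    for x in range(nx):
        col = plane[x]
        nz = np.nonzero(col > 0)[0]
        if nz.size == 0:
            continue  # no experience yet for this input value
        spread[x] = nz.max() - nz.min()
        cum = np.cumsum(col)
        # balance point: darkness mass below b matches the mass above b
        narrow[x] = int(np.searchsorted(cum, cum[-1] / 2.0))
    return narrow, spread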

Figure 2: IDS Unit Schematic


Figure 4: Flowchart of ALM Modeling Technique

Proposed Algorithm

As stated before, many of the main popular reinforcement learning algorithms are based on the dynamic
programming algorithm known as value iteration [14].
This might be called the discrete value iteration, since
the algorithm takes as input a complete model of the
world as a Markov Decision Task (MDT), and computes the optimal value function J* as the minimum
possible sum of future costs starting from x. In order
for the J* to be well-defined, it is assumed that costs
are non-negative and that some absorbing goal state
is reachable from all states. By extending the discrete
value iteration to the continuous case the smooth value
iteration algorithm is proposed in [3]. This is done by
replacing the lookup table over all states with a FA
method trained over a sample of states. The authors
report that as suggested by test results of a variety
of function approximators, including polynomial regression, an MLP trained by back propagation, and locally
weighted regression, convergence is no longer guaranteed in contrast to the discrete case. Instead four possible classes of behavior are recognized. These behaviors
and their description according to [3] are summarized
in table 1. Here we propose the use of ALM modeling
technique as the function approximator to be utilized
in smooth value iteration. We will use a simplified version of ALM where the iterative partitioning scheme
is replaced with an intuitive choice of variable partitions by the user. The pseudo code for the proposed algorithm, namely the ALM Smooth Value Iteration (ALM-SVI), is depicted in figure 5.
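The following Python sketch outlines the generic smooth value iteration loop of [3] with a pluggable function approximator; it is only an illustration, not the ALM-SVI pseudo code of figure 5. The sampled states, the cost function, the discretized action set, the convergence tolerance and the fa_fit/fa_predict interface are all assumptions.

import numpy as np

def smooth_value_iteration(states, actions, step, cost, fa_fit, fa_predict,
                           n_iters=200, tol=1e-3):
    """Smooth value iteration over a sample of continuous states.

    step(s, a): successor state; cost(s, a): non-negative immediate cost.
    fa_fit(X, y): trains the approximator; fa_predict(model, X): evaluates it.
    """
    targets = np.zeros(len(states))
    model = fa_fit(states, targets)          # J starts as an all-zero surface
    for _ in range(n_iters):
        new_targets = np.empty_like(targets)
        for i, s in enumerate(states):
            # backup: minimum over actions of immediate cost + estimated J(s')
            succ = np.array([step(s, a) for a in actions])
            new_targets[i] = min(cost(s, a) + fa_predict(model, succ[j:j + 1])[0]
                                 for j, a in enumerate(actions))
        model = fa_fit(states, new_targets)  # refit the approximator each sweep
        if np.max(np.abs(new_targets - targets)) < tol:
            break
        targets = new_targets
    return model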

When the algorithm is run, the approximation


starts with an all zero surface in the first iteration. In
the following iterations, as the values of the states start to
grow, the low value of the goal state and its neighboring
states propagates through the state space via the
minimum operation, as depicted in figure 5. It is crucial to note that this propagation scheme is in contrast
to what happens in fixed degree FA methods or even
the neural network based methods. This step-by-step
approximation refinement scheme is actually what accelerates the convergence speed due to the fact that it
avoids unnecessary updates of state values which may
inject error in the approximation computations, especially in early iteration stages where the approximation
is still poor. In addition, unlike the neural network
based FA methods, the locality of updates in ALM algorithm provides a safe refinement scheme in the sense
that updates effects remain local and thus the induced
error in other states approximated values due to an update in some other state is minimized. This also is an
effective factor in convergence speed acceleration and
reliability.
It is also crucial to note that, unlike fixed degree FA
methods, we do not expect the ALM-SVI to converge
to the exact optimal J* since the approximation accuracy is limited anyway. The expected achievement is
that ALM-SVI performs reliably as a general-purpose
learning method. By a reliable performance we mean
that the convergence behavior would be independent of
the domain or application the algorithm is applied to.
This would be a significant achievement in comparison
to existing approaches.

Simulation Results

We have conducted the simulations in a variety of domains including a simple continuous 2D grid-world, a
continuous 2D grid-world containing costly puddles,
and a mountain car problem. These are the same examples addressed by [3] and the simulations are run
using the same test setup specifications in order for
the results to be comparable. The first set of results
is from the simple continuous 2D grid world described
in figure 6. For a quantized state space the J* can be
computed using discrete value iteration, with the optimal value function being exactly linear: J*(x,y) = 20 − 10x − 10y. In order to simulate the proposed algorithm
in this domain we use an ALM approximator with IDS
plane resolution of 256 points, IDS radius of 26 points,


i.e. 10%, and a partition vector of [8, 8] indicating 8 partitions for each of the two state variables x and y.
The simulation is then run on a sample of 256 randomly chosen states. The cost of any action at all time steps
is 0.5 until the agent reaches the goal state. Figures 7-9 depict the evolution of the state value function
approximation through different iteration stages. Note that all values are normalized. It is evident that by using
ALM as the FA method in the smooth value iteration, the evolution of the value function is successfully and
effectively traced. The proposed algorithm converges after 48 iterations, while [3] reports that linear regression
FA converges after 500 iterations and quadratic regression FA does not converge at all.

Table 1: Possible Classes of Convergence Behavior

Good Convergence: The FA accurately represents intermediate value functions and converges to the optimal J*
Lucky Convergence: The FA does not accurately represent intermediate value functions, yet the algorithm converges to a J function whose greedy policy is optimal
Bad Convergence: The algorithm converges, but the resulting value function and policy are poor
Divergence: Worst of all, small fitter errors may become magnified from one iteration to the next, resulting in a value function which never stops changing

Figure 5: Pseudo Code of Proposed Algorithm

Figure 6: Simple 2D Continuous Grid World Domain

In order to obtain the second set of results we have


augmented the simple 2D continuous grid world to include two puddles through which it is costly to step,
as depicted in figure 10. The ALM approximator specifications and the state sampling is similar to the one
used for the previous problem. The cost of stepping
through the puddles is assumed to be 4 times higher
than through other states. Figures 11-13 show the evolution of state value function approximation at different iteration stages. Again it is evident that utilizing
ALM as the FA method results in flexible and robust
tracing of value function estimation evolution through
iteration stages. The proposed algorithm converges to
a state value function corresponding to the optimal policy after 59 iterations. This success is even more
appreciated when we note that two powerful FA methods, namely LWR and a neural network trained by back propagation, fail to even converge [3].

Figure 7: J(x,y) Function Approximation Iteration 5


Figure 8: J(x,y) Function Approximation Iteration 22

Figure 9: J(x,y) Function Approximation Iteration 48

Figure 10: Continuous Grid World Including Costly Puddles

Figure 11: J(x,y) Function Approximation Iteration 10

The third and last set of experimental results is obtained by applying the proposed algorithm to the
mountain car problem [1]. The mountain car problem is a well-known benchmark in the RL domain. The
task is to drive an under-powered car up a steep mountain road, as suggested by the diagram in figure 14.
The difficulty is that gravity is stronger than the car's engine, and even at full power it cannot accelerate
up the steep slope. The only solution is to first move away from the goal and up the opposite slope on the
left; then, by moving at full power, the car can build up enough inertia to carry it up the steep slope [1].
The cost of any action in this problem is assumed to be +1 on all time steps until the car has moved past
its goal at the top of the mountain. There are three possible actions of +1, -1, and 0, corresponding to full
power forward, full power reverse, and no power. The problem's dynamics are computed through the simplified
physics equations denoted below.
ẋ_{t+1} = bound(ẋ_t + 0.001 a_t − 0.0025 cos(3 x_t))    (6)

x_{t+1} = bound(x_t + ẋ_{t+1})    (7)

where the bound operation enforces −1.2 ≤ x_{t+1} ≤ 0.5 and −0.07 ≤ ẋ_{t+1} ≤ 0.07. When x_{t+1} reaches the left
bound, ẋ_{t+1} is reset to zero. When it reaches the right bound, the goal is reached and the episode
is terminated. We used an ALM approximator with an IDS plane resolution of 256 points, an IDS radius of 26
points, i.e. 10%, and a partition vector of [8, 8] indicating 8 partitions for each of the two state variables.
The simulation is then run on a sample of 512 randomly chosen states. The proposed algorithm
converges to the optimal state value function after 124 iterations. Figures 15-17 depict the evolution of the
state value function approximation. A 2-layer MLP neural network with 80 hidden units, trained for 2000
epochs, fails to converge. Table 2 summarizes the results achieved in this paper alongside those from previous work in [3].

Figure 12: J(x,y) Function Approximation Iteration 25

Figure 13: J(x,y) Function Approximation Iteration 59

Figure 14: Mountain Car Problem
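A minimal Python transcription of the dynamics in Eqs. (6)-(7) with the stated position and velocity bounds; the function and variable names are our own.

import math

def mountain_car_step(x, x_dot, a):
    """One step of the mountain car dynamics, a in {-1, 0, +1}."""
    x_dot_next = x_dot + 0.001 * a - 0.0025 * math.cos(3 * x)
    x_dot_next = max(-0.07, min(0.07, x_dot_next))      # bound the velocity
    x_next = max(-1.2, min(0.5, x + x_dot_next))         # bound the position
    if x_next <= -1.2:
        x_dot_next = 0.0          # hitting the left bound resets the velocity
    goal_reached = x_next >= 0.5  # reaching the right bound ends the episode
    return x_next, x_dot_next, goal_reached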


Table 2: Summary of Convergence Results for Different FA Methods


Domain             Linear   Quadratic   LWR       BackProp   ALM
2D Grid World      lucky    diverge     good      lucky      lucky
2D Puddle World    -        -           diverge   diverge    lucky
Mountain Car       -        -           good      diverge    lucky

Figure 15: J(x,y) Function Approximation Iteration 25

Figure 16: J(x,y) Function Approximation Iteration 80

Figure 17: J(x,y) Function Approximation Iteration 124

As suggested by table 2, one might argue that


ALM-SVI converges only luckily while LWR displays
good convergence. It is crucial to note that lucky convergence of ALM is a result of its fuzzy and inexact
nature which on the other hand ensures a reliable and
fast convergence to a state value function that is capable of inducing optimal policy all the time. As a matter
of fact the term lucky should not mislead us to suppose
that the convergence happens only by chance or under
specific conditions. On the other hand, even the LWR
algorithm that offers good convergence in two of the problems diverges in one of them. This indicates an unreliable and domain-based behavior of LWR, while, as explained above, ALM-SVI converges reliably all the time, regardless of the application domain, to the optimal policy, following the lucky scheme as defined in table 1.

Discussion and Future Works
The results obtained through this work demonstrate


the unique capabilities of Active Learning Method
modeling technique, which perfectly meet the requirements of the function approximation problem in the continuous Reinforcement Learning domain. Being capable
of modeling behaviors of any complexity level, in terms of both the number of effective input variables and
the functional degree, to an arbitrary accuracy, ALM flexibly follows
the evolution of the state value function approximation
through the different iteration stages. This eliminates the accumulation
and enlargement of estimation errors, which would eventually result in divergence from the
optimal state value function and thus from the optimal policy. The arbitrarily local updates in ALM,
with the locality controlled by the ink stain radius parameter, allow for different arbitrary levels of generalization and also arbitrarily local refinements of the
approximation. In the present work we studied ALM
performance in Dynamic Programming context; however as stated above, Active Learning Method seems
to be very promising in the incremental RL context as
well. This is mainly due to the fact that ALM already
has the means to overcome the biased sampling and
the local non-stationary problems by making the effect of each update sufficiently local, through ink stain
radius parameter selection. We strongly encourage further research and study on these potential capabilities
and plan to propose and evaluate the performance of
ALM-FA-based algorithms on such problems in our future works.


Acknowledgment
It is the authors' pleasure to thank Dr. Hamid Beigi
for his continuing interest and support.

References
[1] R.S. Sutton and A.G. Barto, Reinforcement learning: An introduction, Cambridge Univ Press, Chapter 5, pages 201-290, 1998.
[2] T. Mitchell, Machine learning, McGraw Hill, Chapter 5, pages 201-290, 1997.
[3] J. Boyan and A.W. Moore, Generalization in reinforcement learning: Safely approximating the value function, Advances in Neural Information Processing Systems (1995), 369-376.
[4] S.B. Shouraki, A novel fuzzy approach to modeling and control and its hardware implementation based on brain functionality and specifications, PhD. Thesis, University of Electro-Communications, Chofu, Japan, 2000.
[5] A. Rottman and W. Burgard, Adaptive autonomous control using online value iteration with Gaussian processes, In Proc. of the Int. Conf. on Robotics and Automation (2009), 3033-3038.
[6] M.P. Deisenroth, C.E. Rasmussen, and J. Peters, Gaussian Process Dynamic Programming, Neurocomputing 72(7-9) (2009), 1508-1524.
[7] Y. Engel, S. Mannor, and R. Meir, Reinforcement Learning with Gaussian Processes, In Proc. of the 22nd Int. Conf. on Machine Learning (2005), 201-208.
[8] A. Agostini and E. Celaya, A competitive strategy for function approximation in Q-learning, In Proc. of the 22nd Int. Joint Conf. on Artificial Intelligence (2011), 1146-1151.
[9] A. Agostini and E. Celaya, Reinforcement learning with a Gaussian mixture model, In Proc. of the Int. Joint Conf. on Neural Networks (2010), 3485-3492.
[10] M. Riedmiller, Neural Reinforcement learning to swing up and balance a real pole, In Proc. of the Int. Conf. on Systems, Man and Cybernetics 4 (2005), 3191-3196.
[11] A.W. Moore, Variable resolution dynamic programming: Efficiently learning action maps in multi-variate real-valued state-spaces, In Proc. of the 8th Int. Workshop on Machine Learning (1991), 333-337.
[12] M. Murakami and N. Honda, A study on the modeling ability of the IDS method: A soft computing technique using pattern-based information processing, Int. Journal of Approximate Reasoning 45 (2007), 470-487.
[13] M. Firouzi, Spike-IDS, a novel biologically inspired spiking neural model for Active Learning Method fuzzy modeling, MSc. Thesis, Sharif University of Technology, Tehran, Iran, 2011.
[14] R.S. Sutton and A.G. Barto, Reinforcement learning: An introduction, Cambridge Univ Press, Chapter 5, pages 201-290, 1998.


An Intelligent Hybrid Data Mining Method for Car-Parking Management
Sevila Sojudi

Susan Fatemieparsa

Department of Computer Engineering and IT

Department of Computer Engineering and IT

Payamenoor University of Tabriz

Payamenoor University of Tabriz

se.sojudi@gmail.com

s.fatemiparsa@yahoo.com

Reza Mahini

Parisa YosefZadehfard

Department of Computer Engineering and IT

Department of Computer Engineering and IT

Payamenoor University of Tabriz

Payamenoor University of Tabriz

r mahini@pnu.ac.ir

p yousefzadeh@yahoo.com

Somayeh Ahmadzadeh
Department of Computer Engineering and IT
Payamenoor University of Tabriz
somayeh.ahmadzadeh@gmail.com

Abstract: This paper presents the use of intelligent data-driven methods for developing car parking systems. Finding a suitable parking place, with the lowest traffic and cost, by considering people's priorities, is presented. Learning from the previous behavior of the system is performed. To obtain these goals, first a preprocessing phase using an association rule mining method is performed, and rules are selected using support and confidence algorithms. Then, by applying these rules in the fuzzy reasoning system, the system presents optimized parking places. Finally, experimental results present the benefits of using an intelligent model in human systems compared with today's systems.

Keywords: Data-driven modeling; Car parking system; Data Mining; Fuzzy expert systems; Decision support
systems.

Introduction

Parking cars around critical places, such as hospitals, is one of the most important problems in today's life. In recent years researchers have been willing to solve this problem. Given the sensitivity of this case, it is necessary to design an intelligent system to manage parking places. The problems in this case include:

Parking in sensitive places (entrance of the hospital, offices, important places, etc.), even for short times, causes delays in transferring patients;
Shortage of parking places for the vehicles that take patients to the hospitals;
Problems that this situation causes for the other cars.

There are several solutions for solving the above problems: managing parking places using fuzzy inference systems [1], fuzzy expert systems in [2], a car parking locator system [3], management systems based on wireless sensor networks in [4] and [5], [6], [7], [8], etc. The designed system, by using several criteria and by effectively combining data mining methods, learns from the previous data and behaviors of the existing system.
Corresponding Author, T: (+98) 914 103-7659


The obtained rules are used in the fuzzy inference system to make suitable decisions. By managing and organizing the hospital's doctors, employees and customers, we can reduce the problems in hospitals; also, by using some rules, such as costs and priority, we can manage them. The structure of the paper is as follows: in section 2 a brief overview of data mining methodology is presented, in section 3 the fuzzy expert system is illustrated, section 4 describes the methods, section 5 presents the experimental results, and in section 6 the conclusion and future work are presented.

Data Mining

Today, intelligent methods are used for the modeling and interpretation of many systems. Since, as mentioned, the character of the proposed system is data-driven, the system's data has an important role in the system identification, and the methods of data mining are useful in this data analysis [9]. Data mining, as one of the important stages of knowledge discovery in data (KDD), is an iterative process within which progress is defined by discovery, through either automatic or manual methods.

Fuzzy Expert systems

The central notion of fuzzy systems is that truth values (in fuzzy logic) or membership values (in fuzzy sets) are indicated by a value in the range [0.0, 1.0], with 0.0 representing absolute falseness and 1.0 representing absolute truth. A fuzzy set is an extension of an ordinary (crisp) set. A fuzzy set A is characterized by its membership function μ_A(x),

μ_A : X → [0, 1],    (1)

which is called the membership function of A. The set

A = {(u, μ_A(u)) | u ∈ U}

is called a fuzzy set in U [10].

Fuzzy controllers and fuzzy reasoning have found particular applications in industrial systems which are very complex and cannot be modeled precisely, even under various assumptions and approximations. Expert systems are one of the most successful solutions for artificial intelligence optimization problems. When the domain knowledge of an expert system is defined, the system can solve problems like a human expert and, having sets of facts, the expert system gives results based on its knowledge [11]. Two kinds of expert systems have been applied to solve different problems: rule-based expert systems and knowledge-based expert systems. These expert systems can be created from the system's own data or from rules [11].

Case study

As mentioned before, considering the significance of the parking problem in critical places, and in order to get better results in parking placement and traffic control around hospitals, universities and busy places such as markets in large cities, several studies have been carried out to solve this problem. With the development of technology and transportation systems, and also with the increasing number of personal vehicles, researchers want to manage the parking lots and the heavy traffic in busy places like hospitals and universities. This paper presents a data-driven model as a fuzzy expert system to solve this problem.

4.1 Introduce Subject and Methods

This system has been designed to achieve better car parking placement and traffic management by considering different criteria. The designed system proposes a suitable parking place, with a special park code and location, from the given conditions. Considering the parameters that affect the problem, the following criteria are selected for the decision mechanism of the system:

If ambulances or other vehicles want to park in the sensitive places, the system lets them park there for a short time, but if they want to stay there for a long time the system does not let them;
For the hospital's employees and doctors, the system lets them park their cars far from the sensitive places for hours;
For those who want to park near the sensitive places for minutes or so, the system determines a low cost for them;
If all the places are full, the system shows a place outside of the hospital according to the priority of customers.
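To make the fuzzy set definition of the Fuzzy Expert systems section concrete, a minimal Python sketch of a triangular membership function mapping a crisp input to a degree in [0, 1]; the "near the entrance" distance set and its breakpoints are purely illustrative assumptions, not values from the paper.

def triangular_membership(u, a, b, c):
    """Degree to which u belongs to a fuzzy set with support [a, c] and peak b."""
    if u <= a or u >= c:
        return 0.0
    if u <= b:
        return (u - a) / (b - a)
    return (c - u) / (c - b)

# illustrative "near the entrance" fuzzy set over distance in meters
for d in (10, 40, 80, 150):
    print(d, triangular_membership(d, 0, 30, 120))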


Figure 3: Fuzzy sets membership functions of price, time, place and people

Figure 1: Presented algorithm for the parking system

Figure 4: The results fuzzy sets (Suitable amount and Traffic)

4.2 Presented solution for management of parking the cars

This section of the paper illustrates the architecture and the algorithm of the system for solving the mentioned problems. According to the system architecture, the system gets the behaviors of the data as raw rules; at this step we have many rules which need reduction or mining. To get better results, and because of the differences in the parameters, a preprocessing step is executed for all of the data and then data range normalization is performed.

As mentioned before, in this research, in order to get better fuzzy rules, we first use the rule reduction algorithms, selecting 54 rules from 256 rules. Afterward we use the method of searching for association rules in data mining; support and confidence parameters are used for selecting 61 rules.

As seen in this step, an algorithm is used for normalization of the raw data, and then a preprocessing phase for initialization of the rules as system behaviors is done. In this step the support and confidence calculation algorithm [9] is used for obtaining the rule frequency and accuracy measures [12]. The rules obtained from this method are then applied in the fuzzy inference system. Finally, we test our design over new test data to measure place and traffic load with the system.

Figure 2: Architecture of an intelligent management system

5 Experimental results

For performing the experiments, the designed system was applied in Sina Hospital of Tabriz. To get appropriate results, the system has been used in three different conditions: when the environment is quiet (during mornings or nights), in busy conditions (meeting times), and at normal times, and the results are compared with the human system and the fuzzy expert system. The comparison diagrams are shown in figures 4, 5 and 6. Considering the values in figures 4, 5 and 6, the human system tends to park at sensitive places near the sections of the hospital, which causes the mentioned problems, and also many empty parking places near the hospital are unsuitable. The illustrated results show that the presented system, by using different criteria such as time, personal priority, cost and distance, prevents crowding at the sensitive parts.
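A small Python sketch of the support and confidence measures used above to filter candidate rules; the transaction encoding (each record as a set of items) and the toy records are assumptions made for illustration only.

def support(records, itemset):
    """Fraction of records that contain every item of the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def confidence(records, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    return support(records, antecedent | consequent) / support(records, antecedent)

# toy parking records: each set describes one parking request and its outcome
records = [
    {"doctor", "long_stay", "far_lot"},
    {"ambulance", "short_stay", "sensitive_area"},
    {"visitor", "short_stay", "near_lot"},
    {"doctor", "long_stay", "far_lot"},
]
rule_support = support(records, {"doctor", "far_lot"})
rule_confidence = confidence(records, {"doctor"}, {"far_lot"})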


Figure 5: Comparing results in simple mode for the three systems

Figure 6: Comparing results in normal mode for the three systems

Figure 7: Comparing results in busy mode for the three systems

Figure 8: Requests lost in the three statuses

Also, figure 5 shows the systems in normal mode; as is shown, the human system tends to park near the sensitive places without attention to time and traffic jams, but this system can create a good balance by applying several criteria for a better parking strategy. Finally, figure 6 shows the systems in a complex condition, where the human system may select unsuitable places for parking, but the designed system can manage the classified parking places.

6 Conclusions and Future Works

This research describes a hybrid data-driven intelligent system for car parking management, using combined data mining methods over several problem criteria to find better parking places and to reduce traffic jams by up to 46.32 percent compared with a similar system and the human system. We can use this system at fire stations or police stations to improve the speed of spatial activities. In future work we will use other intelligent methods to improve car parking quality and apply the mentioned methods to other critical places.

References

[1] T. Leephakpreeda, Car-parking guidance with fuzzy knowledge-based decision making, Building and Environment 42 (2007), 803-809.

[2] R. Mahini, M. H. Norozi, and M. R. Kangavari, Car-Parking Management System, 2nd Joint Congress on Fuzzy and Intelligent Systems (2008), 28-30.
[3] L. Ganchev, M. O'Droma, and D. Meer, Intelligent Car Parking Locator Service, International Journal Information Technologies and Knowledge, Vol. 2 (2008).
[4] V.W.S. Tang, Y. Zheng, and J. Cao, An Intelligent Car Park Management System based on Wireless Sensor Networks, 1st International Symposium on Pervasive Computing and Applications (2006), 65-70.
[5] T. Rye and S. Ison, Overcoming barriers to the implementation of car parking charges at UK workplaces, Transport Policy 12 (2005), 57-64.
[6] K. Aldridge, C. Carreno, S. Ison, T. Rye, and I. Straker, Car parking management at airports: A special case?, Transport Policy 13 (2006), 511-521.


[7] H. E. Nosratabadi, S. Pourdarab, and M. Abbasian, Evaluation of Science and Technology Parks by using Fuzzy Expert System, The Journal of Mathematics and Computer Science, Vol. 2, No. 4 (2011), 594-606.
[8] M. Crowder and M. Walton, Developing an Intelligent Parking System for the University of Texas at Austin, Southwest Region University Transportation Center, Center for Transportation Research (2003).
[9] J.K. Cios, W. Pedrycz, W.R. Swiniarski, and A.L. Kurgan, Data Mining: A Knowledge Discovery Approach, Springer Science+Business Media, LLC, ISBN-13 978-0-387-33333-5, 2007.
[10] L. A. Zadeh, Fuzzy logic, neural networks, and soft computing, Communications of the ACM 37 (1994), 77-84.
[11] A. Kandel, Fuzzy Expert Systems, CRC Press LLC, ISBN 084934297X, 1991.
[12] P. Walley, Measures of uncertainty in expert systems, Artificial Intelligence 176 (2007), 302659.

Iris Recognition with Parallel Algorithms Using GPUs


Meisam Askari

Reyhane azimi

University of Kashan, Kashan, Iran

Islamic Azad University of Arak

Department of Computer Engineering

Department of Electrical and Computer Engineering

Askari@grad.kashanu.ac.ir

Azimireyhane@gmail.com

Hossein Ebrahimpour Komle

University of Kashan, Kashan, Iran


Department of Computer Engineering
Ebrahimpour@kashanu.ac.ir

Abstract: Parallel algorithms have been applied in this paper in order to run an identification system using iris patterns. For identifying the internal and external boundaries of the iris in this system, the circular Hough transform, which is the most time consuming part of extracting the features, has been used. Since this transformation is time consuming, the iris recognition system has faced problems in real applications. In this paper, we have attempted to design this system with parallel algorithms and implement it on GPUs (Graphics Processing Units). The CUDA (Compute Unified Device Architecture) platform in MATLAB has been used in order to implement the system on GPUs. Finally, it is concluded that the computation time in the parallel mode is significantly reduced compared to the sequential mode, which makes the use of this system possible in prompt real-world applications.

Keywords: Parallel Algorithms; Iris Recognition System; Hough Transform; CUDA; GPU.

Introduction

Recent advances in computer hardware and software


technology, have led the industry to the development of
reliable systems based on the biometrics. Various biometric systems based on the fingerprints, facial characteristics, voice, hand geometry, handwriting and iris
have been developed. Biometric patterns provide efficient, normalized and privileged performance which
can actually be compared with other patterns in order
to determine the identity. Using the unique and stable
features is one of the most important evaluation factors
of biometric systems. In recent years, using the iris has
been much considered in biometric systems due to the
high reliability, highly complex tissue, and the rate of
accurate diagnosis[1]. The most obvious application of
iris recognition technology is in security-sensitive systems. In addition to these systems, this technology is
Corresponding Author

used in the fields such as file and directory access, access to websites, and key access for file encrypting and
decrypting. Moreover, the iris recognition is used in
the fields which require high throughput and queuing
such as the clearance, air traveling without a ticket,
transportation, and airport security[2]. Primary algorithms of iris recognition were proposed by Professor Daugman in 1990[3]. Then, other algorithms such
as Wildes [4], Boles[5] and Noh[6] were presented but
Daugman algorithms has been the most successful one
in this field. Data used for iris recognition systems,
which include eye images, are databases such as Bath,
CASIA, MMU1, MMU2, LEI, etc.; in this paper the two iris image databases CASIA and LEI are considered. Despite the fact that using the Daugman algorithm
has had good results, using this algorithm in the online
applications has faced some problems due to its time
consuming property. Using the GPUs, in this paper we
have provided a method to reduce the response time of


this algorithm significantly. In Section 2 Daugman algorithms and iris recognition system, which use this algorithm, are described. Section 3 introduces the GPUs
and the platform for using them. In Section 4 the implementation of iris recognition system by GPUs are
described and in Section 5 we will provide the results
obtained from this implementation and finally in Section 6 the conclusion and future works are presented.

Iris recognition


Three steps in the iris recognition system are the image


preprocessing, feature extraction, and pattern matching. A brief description of these steps is presented
below.

2.1

Preprocessing

Iris images presented in the database need preprocessing for obtaining the useful iris area. Image preprocessing is divided into two stages: iris localization and its
normalization. Iris localization determines the outer
and inner boundaries of iris, and eyelids and eyelashes,
which may cover the iris area, are detected and removed. For the iris localization algorithm we have
used the circular Hough transform. The Hough transform is a kind of transform which maps the points from the
Cartesian coordinate space to a storage space. A sample of this mapping for a circle can be seen in figure 1.
In this mapping, there will be a circle in the storage space for each point in the original image. Finally, for
detecting the circle center it is enough to calculate the maximum value in the storage space. However, this
storage space should be drawn for all possible radii, and this is time consuming work.
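The following Python sketch illustrates the circular Hough transform voting just described (it is not the authors' MATLAB code): every edge pixel votes for all centers lying on a circle of the candidate radius around it, and the accumulator peak gives the circle center and radius. The edge-point format, the 360-step angle sampling and the accumulator layout are assumptions.

import numpy as np

def circular_hough(edge_points, shape, r_min, r_max):
    """Accumulate votes for circle centers over all candidate radii.

    edge_points: iterable of (row, col) edge pixel coordinates.
    Returns the accumulator of shape (r_max - r_min + 1, height, width)
    and the (radius, row, col) of its global peak.
    """
    h, w = shape
    radii = range(r_min, r_max + 1)
    acc = np.zeros((len(radii), h, w))
    thetas = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
    for y, x in edge_points:
        for k, r in enumerate(radii):
            # candidate centers lie on a circle of radius r around the edge pixel
            cy = np.round(y - r * np.sin(thetas)).astype(int)
            cx = np.round(x - r * np.cos(thetas)).astype(int)
            ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
            np.add.at(acc[k], (cy[ok], cx[ok]), 1)
    k, py, px = np.unravel_index(np.argmax(acc), acc.shape)
    return acc, (r_min + k, py, px)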

Figure 1: a. Image containing a circle b. Accumulator space for a special radius [7]

Iris normalization converts the iris image from the Cartesian coordinates to the polar coordinates. The normalized iris image is a rectangular image with angular and radial resolution. Iris images may be taken with different sizes and various imaging distances, and accordingly the size of the radius may change. The resulting deformation of the iris texture will affect the performance of the feature extraction and matching stages. Therefore, the iris area is required to be normalized in order to regulate these variables. Thus, the Daugman sheet model is used for this purpose. This model maps each point inside the iris to a pair of polar coordinates (r, θ), where r is defined on the interval [0, 1] and θ is an angle in [0, 2π], as shown in figure 2 [8].

Figure 2: Polar map of iris

2.2

Feature extraction

After identifying the iris region, the features should


be extracted from the image. Extracted features will
have the code for generating a biometric pattern. In
this stage, the iris pattern is obtained by the convolution with wavelets of normalized LOG-Gabor. The
normalized two-dimensional pattern rows, each which
each corresponds to a circular ring on the iris region,
are divided into a number of one-dimensional signals
and then these one-dimensional signals are convoluted
with the one dimensional Gabor wavelets[3]. Finally,
the obtained features are stored in the database.

2.3

Pattern matching

In this section, the user pattern is compared with the


patterns stored in the database by a matching benchmark. A matching benchmark will be a criterion of
similarity between two iris patterns. Finally, it results
in a decision in a high confidence level for identifying
whether the user is valid or not. Hamming distance is
used as one of the matching benchmarks. The value 0
will show a full matching. In order to decide whether
two patterns are for one person or two different individuals, a threshold is used, so that if the Hamming
distance is less than this threshold a correct matching, and otherwise a false matching, is detected. The
Hamming distance results from summing the xor be-


tween two patterns on the total number of bits. Mask


patterns are used in calculations in order to eliminate
the noise areas. Only the bits of the patterns corresponding to bit 1 in the mask pattern will be used in the
calculations.
HD = (1/N) Σ_{j=1}^{N} X_j ⊕ Y_j

In the above equation, X and Y are the binary patterns and HD is the Hamming distance, which is defined
as the number of opposing bits divided by N, the total number of bits in the binary patterns. The fast
matching speed of the Hamming distance is an advantage, because the patterns are in binary format; the
runtime for the xor comparison of two patterns is almost 10 μs. The Hamming distance is therefore suitable
for comparing millions of patterns in large databases.
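A short Python sketch of the masked Hamming distance described above; the bit codes are assumed to be numpy boolean vectors, and the convention that mask bits equal to 1 mark valid (noise-free) positions follows the text. The masked variant divides by the number of valid bits rather than by N.

import numpy as np

def hamming_distance(code_x, code_y, mask_x, mask_y):
    """Fraction of disagreeing bits, counted only where both masks are valid."""
    valid = mask_x & mask_y                  # keep bits marked 1 in both masks
    disagreeing = (code_x ^ code_y) & valid
    return np.count_nonzero(disagreeing) / np.count_nonzero(valid)

# toy example with 8-bit patterns
x = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
y = np.array([1, 1, 1, 0, 0, 0, 1, 1], dtype=bool)
m = np.ones(8, dtype=bool)
print(hamming_distance(x, y, m, m))   # 3 of 8 bits differ -> 0.375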
Figure 3: Performance of Different CPUs and GPU [13]

Using CUDA in Parallel Processing

Parallelism will determine the future of computing science. Because on the one hand increasing the number
of transistors inside the CPU has made its speed increase very difficult, and on the other hand the need
for real-time abilities and three-dimensional graphics
is increasingly growing. Using the multi-core CPUs is
also an attempt in line with the parallelism[9]. But
these CPUs are expensive and the maximum efficiency
can be increased equal to the number of cores.
GPUs (graphics processing units), which have recently been much considered, are the appropriate tools
for implementing the parallel algorithms. Each GPU
includes a large number of cores whose parallel execution enables the GPU to perform a set of operations with much
higher speed than the CPU. Very high performance and availability are the advantages of GPUs. Figure 3 compares
different models of GPUs and CPUs for floating point operations.
GPU has its own related memories and there is no common memory between GPU and CPU thus at the beginning of program, data are transferred from the main
memory to the GPU memory and at the end of program the results are transferred from the GPU memory
to the main memory[10, 11]. In 2006, NVIDIA Company offered CUDA platform in order to accomplish
the massive parallel computing with high performance
on the GPUs produced by this company. Along with
CUDA, a software environment, which allowed the developers to write their own programs in language C
and run it on GPU, was provided.


Each CUDA program has two parts: Host and Device. Section Host is a program which is run sequentially on CPU and the Section Device is a program
which is run in parallel on the GPU cores.
As can be seen in figure 4, each parallel program includes a number of threads. Threads are light
processes, each of which does an independent operation. A number of dependent threads form a block,
and a number of blocks form a grid. There are different types of memory in GPUs. Each thread has its own local
memory; each block has a shared memory to which the threads inside it have access; and there is a global
memory which all threads can access.
In Host part, the total number of threads or in the
other words the number of light processes, which might
be run on GPU cores, should be determined. The code
of section Device is run according to the number of
threads defined in section Host. Each thread can find
its own position by the functions considered in CUDA
and does its own work according to the position. Finally, the calculated results should be returned to the
main memory.
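As an illustration of the host/device split and the per-thread indexing described above, a minimal kernel written with Numba's CUDA bindings for Python; the papers use CUDA C together with MATLAB, so this is only an analogous sketch, and it assumes a CUDA-capable GPU and the numba package.

import numpy as np
from numba import cuda

@cuda.jit
def threshold_kernel(pixels, out, level):
    # Device code: each thread computes its global index and handles one pixel
    i = cuda.grid(1)
    if i < pixels.size:
        out[i] = 1.0 if pixels[i] > level else 0.0

# Host code: copy data to the GPU, launch the grid, copy the result back
pixels = np.random.rand(256 * 256).astype(np.float32)
d_pixels = cuda.to_device(pixels)
d_out = cuda.device_array_like(d_pixels)
threads_per_block = 128
blocks = (pixels.size + threads_per_block - 1) // threads_per_block
threshold_kernel[blocks, threads_per_block](d_pixels, d_out, 0.5)
result = d_out.copy_to_host()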
GPUs are a great tool for implementing the image processing Algorithms. Because most of the operations,
which can act on the image, are local and should be
run on all the pixels, thus by considering one thread
for each pixel (if the number of required thread can
be defined) the time for doing the calculations can be
reduced to O(1). Yang has implemented a number of
famous image operators [12] by CUDA. In addition,
based on the work, which was previously done by us,
CUDA has been used for processing the spatial im-


ages [14]; also, in [15] Grauer-Gray has benefited from CUDA for determining the direction.

Implementing Iris recognition system Using GPU

The average calculated times for different parts of Iris


Recognition System in images of database CASIA are
presented in table 1. As can be seen in table 1, the most time consuming part of this system is related to
the two parts using the circular Hough algorithm. Therefore, our objective is to make the Hough transform
parallel and thus reduce the time related to these two parts. In order to make this transformation parallel,
the range in which there is the target radius, should
be determined; for example by observing the images in
CASIA database we can conclud that the pupil radius
ranges from 28 to 75. If rmin and rmax are minimum
and maximum pupil radius in the database, then the
storage space should be formed according to the total
values of R:
R = rmax rmin

Figure 4: Grid of Thread Blocks[10]

For this purpose, we formed blocks equal in number to the image pixels, and each block contained R
threads (figure 5). In figure 5, P is the number of image pixels. So, for each pixel, or in other words for every
block, R threads simultaneously update all R two-dimensional storage spaces. Hence, each thread has the
duty of drawing the storage space for one pixel at a certain radius. All threads of the grid cannot be run
simultaneously, due to the interference among the threads' work, and only the threads of one block, which
work on R different images, have the ability to run simultaneously.

Recently, MATLAB software has provided the facilities by which, CUDA programs are used in MATLAB.
For this purpose, after writing CUDA program in language C, the resulted program should be converted to
the format PTX, which can be implemented in MATLAB, by the compiler nvcc. In the next section, the
most time consuming part of the system, introduced
in Section 2, has been implemented in MATLAB by
CUDA.

Figure 5: Grid of Thread Blocks[10]


Table 1: Average execution time for different parts of iris recognition system

Operations                               Time (s)
Iris boundary detection                  11.4
Pupil boundary detection                 9.17
Normalization & quality enhancement      0.645
Feature extraction using Gabor filter    0.155
Total time                               21.12

After initializing the R two-dimensional spaces by the mentioned method, the maximum value should be
calculated in each space, and the overall maximum, which represents the center of the target circle, should be
determined. For this purpose we can also apply CUDA, so that we define only one block with R threads in a
new configuration, under which each thread has the duty


to calculate the maximum value in a two-dimensional


space. After calculating the maximums in each dimension, just the maximum should be chosen from the R
obtained values. The peak coordinates in the space r_i are the coordinates of the circle center, and i gives
the identified circle radius. The time improvement resulting from using this parallel algorithm is presented
in the next section.
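A short Python sketch of the two-stage reduction just described (again an illustration, not the MATLAB/CUDA code): first the maximum of each of the R accumulator planes is found, then the best plane gives both the radius and the circle center. The accumulator layout matches the earlier Hough sketch and the helper name is ours.

import numpy as np

def find_circle(acc, r_min):
    """Two-stage reduction over an accumulator of shape (R, height, width)."""
    # stage 1: per-radius maxima (one "thread" per plane in the CUDA version)
    per_plane_max = acc.reshape(acc.shape[0], -1).max(axis=1)
    # stage 2: the winning plane gives the radius, its peak gives the center
    k = int(np.argmax(per_plane_max))
    py, px = np.unravel_index(np.argmax(acc[k]), acc[k].shape)
    return r_min + k, (py, px)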

Results

We selected 677 unique iris images of CASIA database


v.3 [16] for testing. These images belonged to 70 persons, and for each person there are several iris images
of the left and right eyes in our dataset. We used 18% of the dataset as training data, containing at least one
left and one right eye image of each person. By testing the rest of the dataset we obtained 99.8% accuracy
using a threshold value of 0.4 for the Hamming distance. For making the algorithms parallel, we have used
the 96-core GeForce GT 430 graphics card. Furthermore, the system code presented in this paper is in the
MATLAB language, and due to the possibility of communication between CUDA files and MATLAB, MATLAB
version 2010b has been used. The time consumed in the two uses of the Hough algorithm with the parallel
method, and the speedup of the algorithm in these two steps, for one iris image, are presented in table 2.
Table 2: Average execution time for different parts of iris recognition system

Operations              Sequential execution time(ms)   Parallel execution time(ms)   Speed up
Obtain Iris boundary    11.14                            0.95                          11.7
Obtain Pupil boundary   9.18                             0.87                          10.55
Total time              21.12                            2.03                          10.4


Discussion and Future Works

As observed in the previous section, we could increase the iris recognition speed by up to 10 times using the parallel algorithms and applying CUDA for their implementation. It should be noted that this speedup is obtained using an ordinary graphics card, and if a more powerful graphics card were used, the speedup would be considerably higher. Another remarkable point is that we are usually faced with massive databases in real applications, so the time for matching the pattern in these systems is high. CUDA cannot be used for solving this problem, because the memory of the graphics card is limited and the database cannot be loaded onto it. A cluster can be used to overcome this problem, so that we can divide the database over the cluster nodes. When an input is received, a copy of it is sent to all nodes, each of them performs the recognition, and finally the results are sent back to the source computer.

References
[1] M. Shamsi, P.B. Saad, and A. Rasouli, A New Iris Recognition Technique Using Daugman Method (2007).
[2] L. Ma, Y. Wang, and T. Tan, Iris recognition based on
multichannel Gabor filtering, Springer, Berlin/Heidelberg 1
(2002), 279-283.
[3] J. Daugman, How iris recognition works, Circuits and
Systems for Video Technology, IEEE Transactions on 14
(2004), no. 1, 21-30.
[4] R.P. Wildes, Iris recognition: an emerging biometric technology, Proceedings of the IEEE 85 (1997), no. 9, 1348-1363.
[5] WW. Boles and B. Boashash, A human identification technique using images of the iris and wavelet transform, Signal
Processing, IEEE Transactions on 46 (1998), no. 4, 1185-1188.
[6] S. Noh, K. Pae, C. Lee, and J. Kim, Multiresolution independent component analysis for iris identification, 2002,
pp. 1674-1678.
[7] M.S. Nixon and A. Aguado, Feature Extraction and Image Processing, second edition, Academic Press (an imprint of Elsevier), 2008.
[8] L. Masek, Recognition of human iris patterns for biometric
identification, M. Thesis, The University of Western Australia (2003).
[9] J.D. Owens, M. Houston, D. Luebke, S. Green, J.E. Stone,
and J.C. Phillips, GPU Computing, Proceedings of the
IEEE, 2008, pp. 879 - 899.
[10] Nvidia Cuda C Programming Guid v.4, Nvidia Corporation,
2011.
[11] ATI Stream Computing user guide rev1.4.0a, 2009.
[12] Zhiyi and Yang, Parallel Image Processing Based on
CUDA, 2008, pp. 198-201.
[13] J. Michalakes and M. Vachharajani, GPU acceleration of
numerical weather prediction, 2008, pp. 1-7.


[14] M. Askari, R. Babaee, and H. Ebrahimpour, Performance Improvement of Lucy-Richardson Algorithm using GPU (2010).
[15] S. Grauer-Gray, C. Kambhamettu, and K. Palaniappan,
GPU implementation of belief propagation using CUDA for
cloud tracking and reconstruction (2008), 1-4.
[16] Chinese Academy of Sciences Institute of Automation (CASIA), CASIA Iris Image Database,
http://biometrics.idealtest.org/ Version 3.0, 2010.

Improving Performance of Mandelbrot Set Using Windows HPC Cluster and MPI.NET
Azam Farokh, Hoda Banki, Mohamad Mehdi Morovati, Hossein Ebrahimpour Komle
University of Kashan, Kashan, Iran
Department of Computer Engineering
{farokh, banki, mm.morovati}@grad.kashanu.ac.ir
ebrahimpour@kashanu.ac.ir

Abstract: Solving complex problems that have heavy computations and require high processing power is not possible by common methods, and since the time consumed for solving them is very important, high performance computing methods should be used to solve these problems. The Mandelbrot set is one of these types of problems. Since in the Mandelbrot set each pixel is computed without the need for the neighboring pixels' information, the advantages of parallel processing can be used to solve it. This paper shows the influence of a Microsoft Windows HPC Server cluster and parallel programming with MPI in reducing the execution time of computing the Mandelbrot set and thus increasing the performance. Also, the effect of the number of nodes and processes on performance is discussed by changing the number of cluster nodes and the processes assigned to run the job.

Keywords: Mandelbrot set, High performance Computing, Windows HPC Cluster, MPI

Introduction

Nowadays, in almost all scientific and engineering applications, the usage of computers is inevitable. There are many problems that, because of their complexity, large amount of computations and extreme data involvement, require more processing, and therefore the time required to solve them increases. Some examples of these problems are decoding genomes, animating movies and analyzing financial risks. Solving these problems by common methods, such as running sequential algorithms on one PC or workstation, is not possible; in other words, solving these problems by these methods requires unreasonable time. In the past, the most common way to solve complex problems was the usage of supercomputers. But nowadays, new approaches in the computer hardware and software industry, and the possibility of using personal computer components with the new methods proposed in parallel processing, have provided a new way to achieve high processing throughput. Parallel processing is one of the computation types in which the activities are performed simultaneously and in parallel; therefore the time of solving the problem is reduced. In other words, parallel processing uses several processors concurrently, each of them working on a part of the tasks and data, so that many computations can be performed quickly and problems can be solved in less time [1].

The complexity of computer architectures is increasing, and most computers have more than one processor, so they are suitable for parallel processing. Clusters are made from a set of computers which are called nodes. Today most clusters have nodes with multiple processors. Although the processors of a node have private cache memories, they can access the main memory of that node and share it with each other. In addition to the hardware facilities which are required for parallel processing, there should also be support from the software, which can run the program in parallel and coordinate the different execution steps. Such coordination is necessary due to the dependency of the parallel program's steps on each other.

Corresponding Author, T: (+98) 361 5912450


passing is a method which is used more than other


methods to achieve this coordination. In this method
different execution steps of programs are coordinated
by passing the message to each other.

High Performance Computing

ment with distributed memory. The main advantage


of developing a message passing interface for these environments is portability and ease of use. The message
passing standard is an important component to build
concurrent computing environments[2]. MPI is used
in programs which have message passing and widely
is used in high performance parallel programs which
are executed on clusters or supercomputers. In a system that uses message passing, different processes run
concurrently and communicate with each other by message passing. Most of the MPI programs are parallel
and their algorithm is Single Program Multiple Data
(SPMD). In this model each of the processes executes
the same instance of the program but each one works
on different parts of data. In SPMD codes, by usage of
MPI, the same programs which are in different nodes
can be executed concurrently just by one command[3].

High performance computing is a branch of computer


science that concentrates on processing of large amount
of data and for this purpose uses supercomputers or
clusters. A supercomputer is a computer which has
high processing capacity. Supercomputers have more
performance than clusters but supercomputers are
more expensive than clusters. Cluster is a technique
which two or more computers are connected to each
At first, all processes are the same and similar to
other by a network and uses parallel processing. Clus- each other and they are distinctive only by a unique
ters are more economical than supercomputers because characteristic which is given to each process. In other
they can be made of common computers.
words a rank is assigned to each process. By this
rank, different processes in program can have different behaviors and can exchange messages with those
processes that are in the same job[4]. In this paper,
Mpi.Net has been used in implementation of programs
3 Message passing
that provides possibility of taking advantage of message passing interface for .Net programming languages.
4 Microsoft Windows HPC

Microsoft Windows HPC Server 2008 R2 has been designed for programs which require clusters with high performance computing capability. It can easily scale to work with thousands of processing cores. This operating system has powerful management facilities for clustering and job scheduling[5], and it uses the Microsoft Message Passing Interface (MS-MPI) to carry out parallel tasks.

The nodes in Windows HPC Server 2008 R2 are divided into four categories. The head node is responsible for cluster management and job scheduling. Compute nodes and workstation nodes perform the tasks that the head node has defined for them. The broker node is used for routing and connecting nodes when the number of nodes is high. Both compute nodes and workstation nodes carry out the tasks determined by the head node, but they have some differences. Compute nodes are always available and perform the jobs assigned to them, whereas workstations are available only when they are idle.


When workstations are online and available, they work like a compute node, and a schedule, such as a weekly program, can be defined for the times when workstations are available. A compute node is completely dominated by the head node, which can even change its operating system. Another difference concerns their operating systems: the operating system of a compute node can be Windows Server 2008 R2 or the 64-bit edition of Windows Server 2008, whereas a workstation node can run the 32-bit or 64-bit edition of Windows 7[6]. The architecture of a Windows HPC cluster is shown in Figure 1.

A component of the Windows HPC cluster called the job scheduler queues the jobs and their related tasks. A job is a collection of tasks that a user initiates; in effect, a job is a resource request. A task represents the execution of a program on compute nodes. A task can be a serial program that executes on one processor, or a parallel program that takes advantage of multithreading, MPI or OpenMP. The job scheduler assigns resources to jobs, initiates tasks on compute nodes, and monitors the status of jobs, tasks and compute nodes[7].

Figure 1: Architecture of Windows HPC Cluster

5 Case Study

The Mandelbrot set is a mathematical set of complex numbers. These numbers represent points in the complex plane, and their two-dimensional fractal shape is easily recognizable. More precisely, the Mandelbrot set is the collection of complex numbers C whose boundary is obtained by iterating formula (1). To create a fractal picture, each pixel in a given rectangular area is colored; this is done after a specified number of iterations. The larger the number of iterations, the sharper the details in the picture, but the longer the computation and execution time[8]. A sample Mandelbrot set image is shown in Figure 2.

Z_{n+1} = Z_n^2 + C,    Z_0 = 0 + 0i    (1)

In the Mandelbrot set, each pixel is computed without any information from neighboring pixels. The parallel algorithm for the Mandelbrot set is SPMD and uses data partitioning: the picture is divided into sections, the sections are distributed between the nodes, and each node computes the color values of the pixels in its own section. When a node finishes its job, it returns its results to the head node[9]. Therefore, to increase performance and reduce execution time, a system can be used that has multiple processors with non-shared memory which communicate with each other by message passing. In this paper, a Microsoft Windows HPC Server 2008 R2 cluster and parallel programming with MPI.NET have been used to implement such a system.

Figure 2: A Sample of Mandelbrot Set Image

In the implementation of the Mandelbrot set pixel computation, the computation of the pixel values assigned to one node itself requires many calculations, and since the values in each node are computed independently, without information from neighboring pixels, the computation of the pixel values within one node can also be performed in parallel.
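As a rough illustration of this data partitioning (not the authors' MPI.NET implementation), the following sketch computes escape-time counts for the image rows assigned to one node; the image size, complex window and cyclic row partition are illustrative assumptions.

    import numpy as np

    def mandelbrot_rows(rank, num_nodes, width=800, height=600, max_iter=2000,
                        re_min=-2.0, re_max=1.0, im_min=-1.2, im_max=1.2):
        """Iteration counts for the image rows assigned to this rank (cyclic split)."""
        my_rows = list(range(rank, height, num_nodes))
        counts = np.zeros((len(my_rows), width), dtype=np.int32)
        for out_row, y in enumerate(my_rows):
            c_im = im_min + (im_max - im_min) * y / (height - 1)
            for x in range(width):
                c = complex(re_min + (re_max - re_min) * x / (width - 1), c_im)
                z = 0 + 0j                      # Z_0 = 0 + 0i, as in formula (1)
                n = 0
                while abs(z) <= 2.0 and n < max_iter:
                    z = z * z + c               # Z_{n+1} = Z_n^2 + C
                    n += 1
                counts[out_row, x] = n          # no neighbouring pixels are needed
        return counts                           # each node would return this to the head node

    # e.g. the piece computed by node 2 of an 8-node cluster:
    # part = mandelbrot_rows(rank=2, num_nodes=8)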
6 Implementation and Results

In this paper, systems with Intel(R) Core(TM) i7 CPU 960 3.2 GHz processors have been used for the implementation; each processor has 8 cores. Initially, the implementation was performed sequentially on a single system.


As shown in Table 1, the obtained results indicate a long runtime. In order to reduce the run-time, and thus increase performance, the algorithm was implemented in parallel using MPI.NET programming on a Microsoft Windows HPC Server 2008 R2 cluster.

Table 1: Results of Sequential Execution

    Iteration    Run-Time (s)
    2000         27.9891
    5000         69.0419
    10000        136.8468
    25000        342.3106
    50000        683.8875

To run the job on the HPC cluster, nodes were selected as the job resource; in this case, the processes specified for the job are divided equally between the nodes. Initially, two nodes and 8 processes were used to run the program, so 4 processes were assigned to each node and consequently only 4 cores per node participated in running the program. To reduce the run-time, the number of processes was then increased to match the number of available cores in the cluster, i.e. 16 processes were used. In this case it was observed that each core used about 55% of its capacity. Therefore, to increase performance, the number of processes was increased to 32, so that each core ran 2 processes and used about 75% of its capacity. Then 48 processes were used, so that each core ran 3 processes; in this case each core used about 100% of its capacity. In other words, when 3 processes run on each core, the processor reaches its maximum performance. To improve the run-time further, the number of processes was increased to 4 times and then to 5 times the total number of available cores in the cluster. As can be seen in Figure 3, the execution time did not improve in these cases; sometimes the runtime even increased because of the process management and switching required.

Since increasing the number of processes beyond 3 times the number of cores does not improve the execution time, the number of nodes has to be increased to reduce the run-time. With an increasing number of nodes, the computations were performed with different numbers of processes, and it was again found that the best way to use the full power of the cores is to run 3 processes on each core. To improve the run-time, the number of nodes was increased up to 8. By doubling the number of nodes in the cluster, the run-time decreased by a factor of 1.4 to 2.6, and performance increased by the same factor. The obtained results are shown in the following diagrams.

Figure 3 shows the results of running the program with 2 nodes; in this case there are 16 cores available in the cluster. The horizontal axis of the charts shows the number of repetitions of the formula used to obtain the pixel color values in the Mandelbrot set, and the vertical axis is the run-time measured in seconds. As can be seen, increasing the number of processes up to 48, that is, 3 times the number of cores, clearly improves the execution time; after that, with more processes, the run-time stays almost constant. Figures 4 and 5 show the results of running the program with 4 and 8 nodes, respectively. The lowest run-time for this problem is obtained when the program runs on 8 nodes with 192 processes. Note that with 8 nodes the total number of available cores is 64.

Figure 3: Diagram of Execution Results with 2 Nodes

Figure 4: Diagram of Execution Results with 4 Nodes


Figure 5: Diagram of Execution Results with 8 Nodes

Conclusion

Since solving complex problems that involve heavy computation and require high processing power is not possible by common methods, and since the time consumed in solving them is very important, high performance computing should be used to solve these problems. The Mandelbrot set is one of these problems, and the benefits of parallelism can be used to improve its running time. In this paper, a Microsoft Windows HPC Server 2008 R2 cluster and MPI programming have been used to reduce the run-time and thus increase performance for this problem. The implementations showed that if only one process runs on each core, the full power of the core cannot be used, and that by increasing the number of processes on each core, the percentage of CPU utilization increases. By testing different numbers of processes it was found that processor utilization differs in different situations: when the number of processes assigned to the job equals the total number of available cores, processor utilization is about 55%; when the number of processes is 2 times and 3 times the total number of available cores, processor utilization is about 75% and 100%, respectively.


After this, increasing the number of processes beyond 3 times the total number of available cores in the cluster leaves the computation time almost constant; sometimes the runtime even increased because of the process management and switching required. Thus, to reduce the runtime, the number of nodes should be increased instead of the number of processes. By doubling the number of nodes in the cluster and assigning the optimal number of processes for running the program, the run-time decreased by a factor of 1.4 to 2.6 and performance increased by the same factor. In our implementations, for 50000 iterations the run-time of the program in serial mode was 683.887 s. In the parallel algorithm, the greatest reduction of run-time was obtained with 8 nodes and 192 processes, where the run-time was reduced to 10.284 s. In other words, the run-time was reduced by about 8.31 times and therefore performance increased by about 8.31 times.

References
[1] P. Pacheco, An Introduction to Parallel Programming, Morgan Kaufmann Publishers, 2011.
[2] D. Walker and J. Dongarra, MPI: A Standard Message Passing Interface, Supercomputer 12 (1996), no. 1, 56-68.
[3] F. M. Hoffman and W. W. Hargrove, High Performance Computing: An Introduction to Parallel Programming with Beowulf (2000).
[4] D. Gregor, MPI.NET Tutorial in C#, Open Systems Laboratory, Indiana University, 2008.
[5] Windows HPC Server 2008 R2: System Management Overview, Microsoft Corporation, 2008.
[6] Windows HPC Server 2008 R2: Adding Workstations to HPC Server Clusters, Microsoft Corporation, 2010.
[7] Windows HPC Server 2008 R2: Using Windows HPC Server 2008 Job Scheduler, Microsoft Corporation, 2008.
[8] R. L. Devaney, Chaos and Fractals: The Mathematics behind the Computer Graphics, Vol. 39, American Mathematical Society, 1989.
[9] P. Werstein, M. Pethick, and Z. Huang, A Performance Comparison of DSM, PVM, and MPI, 4th International Conference on Parallel and Distributed Computing, Applications and Technologies (2003), 476-482.

The study of indices and spheres for implementation and development of trade single window in Iran

Elham Esmaeilpour
University of Sistan and Balochestan
M.Sc. student of Information Technology Management
Elham-ie@yahoo.com

Dr. Noor Mohammad Yaghobi
Faculty member, University of Sistan and Balochestan
Department of Management

Abstract: With the advent of the information era, many economic, social and cultural features of life underwent drastic changes. One aspect of this turnaround is the profound alteration in the economic relationships among individuals, firms and governments. Because of this, administrations, using the latest technologies, have turned to rendering information and services on a wide scale, not only to the various strata of society but also to the subdivisions of the government itself. The present research has been conducted using the descriptive-survey method with a practical orientation. This study endeavors to explore and explicate the trade single window as an effective solution for facilitating cross-border transactions. Given its vast dimensions and the undeniable benefits it encompasses, the creation and development of this single window in Iran is examined. To accomplish this goal, drawing on the experiences of other countries and the recommendations of CEFACT, the essential elements for implementing the trade single window have been identified, and, through a survey of experts together with the Friedman test, the degree of significance and the current condition of these criteria in Iran have been surveyed and evaluated.

Keywords: single window; e-government; e-commerce; cross-border trade.

Introduction

Considering that cross-border trade has its own procedures, today, amidst fierce competition, only those states which have utilized information technology, and e-commerce in particular, have managed to make their import-export procedures effective and dynamic. They have also succeeded in reducing costs and expediting these procedures as much as possible. In this way they have improved working conditions, in addition to taking great strides in the development of the economy and the growth of foreign trade. In line with this, UN/CEFACT has introduced standards and recommendations for the facilitation of trade


with the increase of world trade as the main objective. This organization has so far made numerous recommendations, one of which is the trade single window; its degree of significance and the current condition of the indices for its implementation and development in Iran are discussed in this study. In order to become more familiar with the issues and tools used in this article, some introductory descriptions are given in the following sections.


E-government and single window

Today, with the extent of access to and penetration of the world network, both the public and the private sector have attempted to render their services electronically, with more speed and at reduced cost to their clients. Setting up local and international standards, and the close, mutual relationship between trade and various sectors of the government, such as the payment of taxes and duties, company registration, acquiring certificates and permits, customs formalities, obtaining visas and access to general information, are all issues that have made the participation of the government in information technology not only essential but vital. With rapid changes in technology it is of the utmost importance for organizations to be flexible in the face of globalization. Thus it is necessary for all subdivisions of the government, step by step and along with other members of the community, to enter the digital world, providing online all the services and information that are consistent with the needs and demands of the members of society and within reach of all, so that the various sectors of society, in particular trade, can use the rendered services easily and with higher quality, speed, accuracy, transparency and lower cost. The resulting rise in general welfare is the direct consequence of the establishment of e-government.

On the other hand, citizens increasingly seek access to all levels of governmental services together and in one place (physical or electronic) across the various state agencies. The administration is therefore obliged to gather and offer all its services to citizens in a single body. In line with this, rendering services through a single window enables governments to provide multi-channel access for merchants and citizens.

The trade single window is in fact the manifestation of the government single window for foreign transactions. It acts as a bridge, relating the merchant on one side with the institutions and other entities associated with trade on the other. This system has many benefits for both governments and trade: it reduces administrative corruption, results in coherent and standardized databases and data banks, and thus diminishes the time and cost of the processes, saving precious government resources. The cost of establishing a single window depends on the approach chosen for its implementation; if the government undertakes the financing of the single window, these costs become part of the macro costs of national trade policy.

Calculating the costs and benefits associated with international trade procedures, along with the implementation of the single window, is very difficult and complex. A study in the Netherlands concluded that although the costs of customs transactions are quite obvious in terms of economic theory, little empirical evidence is at hand. Some studies have shown that the costs associated with transactions through customs are about 2 percent of total global trade volume, with a standard deviation of about 4 percent for this estimate; therefore, one cannot reach a precise number in estimating the costs of international trade.

Identification of indicators for the trade single window

Implementing a single window requires tremendous effort and huge investment in infrastructure, logistical and organizational support, legislation, system development and maintenance. The electronic system that provides information exchange and processing is a significant part of the single window.

Although the goal of implementing a single window is the integration and standardization of the relevant information and documentation, there is no unique, homogeneous model for a single window. This means that each country must design and implement its system according to its national conditions and special requirements. The use of existing standards and tools ensures that the single window of each country is roughly similar to, and its processes consistent with, those of other countries, and can contribute to the exchange of information among different countries. Thus, the experiences of other countries can help to develop these concepts and lead to models drawn from their findings.

Many countries have adopted information technology as a key to national development, and a powerful presence in the new century on the world stage depends on being equipped with information. On the other hand, balanced growth of the subsectors related to information technology brings about sustainable national development. To explain the concept of information development, various models have been put forward, one of the most effective of which is the dynamic model developed under the United Nations Development Programme. This model, as shown in figure (1), is the interaction between the main elements affecting development, and includes the development of technical infrastructure, human resources, policies, institutions and applications, which are considered here as the indices for the establishment and development of the trade single window. Then, based on research carried out in the theoretical literature, the variables related to each of the indicators were identified.


4 Research Method

In this study, a descriptive-survey research method with a practical orientation has been used. One of the characteristics of a descriptive study is that the researcher only studies what exists and endeavours to describe and explain it. The purpose of descriptive survey research is the recognition of the traits, characteristics, opinions, attitudes, behaviors and other issues of a community by referring to its individuals.

Therefore, in this study library and field methods have been used for data collection, together with questionnaires. In the questionnaire, 6 to 7 variables were identified for each of the five indicators of the dynamic model. In all, 33 variables were derived, to be assessed on two dimensions, the necessity of the indices and their current situation in Iran, using a Likert scale.

To work out the reliability and validity of the measuring instrument and the internal consistency of the questionnaire, Cronbach's alpha was used; the overall alpha value of 0.91 suggests high reliability of the questionnaire.

The time scope of this survey is the year 1390 (Iranian calendar), and the organizational scope comprises the Ministry of Industry, Mines and Commerce along with the Customs administration and the Central Bank of Iran.

The statistical population consisted of experts in the area of the trade single window who are familiar with its basic concepts; within the organizational scope in which the survey was administered, approximately 123 subjects met these requirements. Subjects were selected by random sampling, and estimating the sample size from the variance of a pilot sample showed that at least 61 questionnaires were needed. Thus, 100 questionnaires were distributed among the target statistical population and, given the need for familiarity with the basic concepts of the trade single window, 62 questionnaires were received from qualified individuals.

5 Analysis and findings

In order to rate the necessity and the current situation of the indicators for the development of the trade single window in Iran, the Friedman test was used. This test measures the presence or absence of a consensus in the statistical population on the different variables of a composite factor; since the responses are dependent, they can be compared in terms of ranking using this method.

As Table 1 shows, according to this test the development of technical infrastructure and of policies has the highest rank among the five indicators of the trade single window, both in necessity and in the current situation in Iran.

On the other hand, the capacity of human resource development has the lowest ranking in necessity. The situation of institutional development also had the lowest ranking, and officials need to pay more attention to this indicator.

According to the index ranking, which represents the average importance rating of each area, the necessity of the indicators averaged approximately 4, while their situation in Iran averaged about 2, which is rather weak. Thus, reaching the ideal situation requires attention to, and provision of, the necessary platforms and technical infrastructure.


Table 1: Necessity (significance) and current conditions of the indicators

    Indicator                    Significance   Conditions
    Technical infrastructure     4.44           2.42
    Policies                     4.41           2.25
    Content                      4.29           2.19
    Institutions                 4.26           2.03
    Human resource capacity      4.23           2.20
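As an aside, a Friedman test of the kind used above can be computed with standard statistical software; the sketch below uses SciPy, with made-up ratings purely as placeholders, not the survey data of this study.

    # Illustrative Friedman test over related rankings; the ratings are invented.
    from scipy.stats import friedmanchisquare

    # each list holds one indicator's ratings from the same (hypothetical) respondents
    technical    = [5, 4, 5, 4, 5]
    policies     = [4, 4, 5, 4, 4]
    content      = [4, 3, 4, 4, 4]
    institutions = [4, 4, 4, 3, 4]
    human        = [3, 4, 4, 4, 3]

    stat, p_value = friedmanchisquare(technical, policies, content, institutions, human)
    print(f"Friedman chi-square = {stat:.3f}, p = {p_value:.3f}")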

Conclusion

The utilization of tools based on information technology, given the savings it makes possible in exploiting momentary and temporary opportunities in world trade, is unavoidable. Obviously, handling all transactions within the country electronically is not feasible in the short term, but conducting e-commerce in overseas trade from the very beginning is doable. Considering domestic and foreign needs and the special attention given to upgrading the criteria for cross-border transactions, the development of the trade single window is one of the programs of the Ministry of Industry, Mines and Commerce for increasing the deployment of e-commerce, and its development and sustainable growth require planned investment, adequate law-making and other essential measures on the part of the government.


In relation to the research done, the establishment of the trade single window in electronic form in the country first requires the development of technical infrastructure, so that, together with the harmonized and standardized development of that infrastructure, the effective use of ICT leads to an increase in the flow of data and its availability through all agencies and trade entities dealing in import-export services, while providing the needed security. In Iran, e-services in the areas of cross-border trade, including registration of imports, commercial cards, export bonuses and incentives, currency allocation and so on, are offered through e-commerce, and this is carried out relatively well. However, for the complete setting up of the trade single window, harmony and uniformity are required between the networks and outlets that render the existing e-services in cross-border trade.

The indicators related to the development of content and applications are rated third in the table. In order to upgrade this area it is necessary to work on a uniform plan for its various practical uses and, ultimately, on the establishment of an information portal for cross-border trade.

To reach agreement and coordination among the authorities involved in trade, it is necessary to develop various executive, financial and training institutions, although this responsibility has been passed on to the electronic commerce promotion centre. To improve this situation it is essential to set up specialized single windows, as well as to change the format of the hierarchy and the outside communications of the different bodies.

Ultimately, to develop and promote all the mentioned components, much attention should be given to the development of human resources in the country, which is in a relatively good condition.

7 References

[1] United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT), Recommendation and Guidelines on Establishing a Single Window, Recommendation No. 33, United Nations Publication, July 2005.
[2] J. McMaster, The Evolution of Electronic Trade Facilitation: Towards a Global Single Window Trade Portal, University of the South Pacific, Fiji Islands, 2007.
[3] G. Linington, International Trade Single Window and Potential Benefits to UK Business, SITPRO Ltd, London, February 2005.
[4] P. Kimberley, Trade Facilitation and Single Windows: Some Emerging Trends, The World Bank Border Management Conference, June 2011.
[5] External Author, Electronic Single Window, Coordinated Border Management - Best Practices Studies, Inter-American Development Bank, December 2010.
[6] Accenture, Markle Foundation, and UNDP, 2001.


Web Anomaly Detection Using Artificial Immune System and Web Usage Mining Approach

Masoumeh Raji
Yazd University, Electrical and Computer Engineering Department
raji.n@yazduni.ac.ir

Vali Derhami
Yazd University, Electrical and Computer Engineering Department
vderhami@yazuni.ac.ir

Reza Azmi
Alzahra University, Computer Engineering Department
azmi@alzahra.ac.ir

Abstract: The analogy between immune systems and intrusion detection systems encourages the use of artificial immune systems for anomaly detection in computer networks, web servers and web-based applications, which are popular attack targets. This paper presents a web anomaly detection approach based on the immune system and web usage mining for clustering web sessions into normal and abnormal. The immune learning algorithm and the attack detection mechanism are described. Theoretical analysis and experimental evaluation demonstrate that the proposed approach is well suited to detecting unknown attacks and is able to provide a real-time defense mechanism for detecting web anomalies.

Keywords: Intrusion Detection Systems; Artificial Immune Systems; Anomaly; Normal behavior; Session.

Introduction

The World Wide Web (WWW) is considered the largest distributed collection of information and plays an important role in human life. Web applications are becoming increasingly popular in all aspects of human activity, ranging from science and business to entertainment. Consequently, web servers and web applications are becoming the major targets of many attacks. Due to the growing number of computer crimes, the need for techniques that can secure and protect web servers and web applications against malicious attacks has been highlighted. Unfortunately, current security solutions, operating at the network and transport layers, have insufficient capabilities for providing an acceptable level of protection against web-based attacks [1]. Attaining the desired information has become a difficult task for users even within a particular website. Web usage mining techniques try to extract patterns from the data collected from the interaction of users with the web; the aim of any web usage mining process is to learn models of user behavior and to use these models in any application that tries to ease the use of the web [2].

The Artificial Immune System (AIS) is a powerful paradigm for learning, originally inspired by the natural immune system. There are a number of motivations for using the immune system as an inspiration for clustering web users, including recognition, diversity, memory, self-regulation and learning [3]. The vertebrate immune system is composed of a special type of white blood cell (called B-cells), which is responsible for detecting antigens and defending against them. When an antigen is detected by the B-cells, an immune response is promoted, resulting in antigen elimination. One type of response is the secretion of antibodies by B-cells (cloning). Antibodies are Y-shaped molecules
even in a particular website. Web usage mining tech Corresponding

Author, T:+98 (912) 7907830

462

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

on the surface of B-cells that can bind to antigens and


recognize them. Each antibody can recognize a set of
antigens which can match the antibody. The strength
of the antigen-antibody interaction is measured by the
affinity of their match [2].
Many artificial immune models have been discussed in the literature, such as negative selection, danger theory and Artificial Immune Networks (AINs). We use the AIN model, which was initially proposed by Jerne [4]. Access log files of web servers are an important source of information for Web Intrusion Detection Systems (WIDSs).

In this paper, we work on the access log files of an Apache server and build an anomaly detection system for detecting web-based attacks. In the training phase, the anomaly detection system tries to learn how to distinguish normal behaviors from attacks by considering several parameters. These parameters include: the number of values assigned to the variables of each request within a session [1]; the length of the URL of each request [7]; the depth of the path of each request; the attribute character distribution [5]; and the attribute length [5].

The remainder of this paper is organized as follows. In Section 2, a review of some available IDSs is presented. Section 3 discusses the goals of this study and introduces the algorithm and the data representation. In Section 4, the experimental evaluation of the proposed system is presented, and the detection ability of the system is also tested on a dataset from another domain. Finally, Section 5 concludes our study.

2 Related Work

There are two possible approaches for intrusion detection. An intrusion detector can be provided with a set of rules or specifications of what is regarded as normal behavior, based on human expertise; this approach can be considered an extension of misuse detection systems. In the second approach, the anomaly detector automatically learns the behavior of the system under normal operation and then generates an alarm when a deviation from the normal model is detected [1].

Vigna et al. [5] proposed an IDS that operates on multiple event streams and uses features similar to our work. The system analyzes HTTP GET requests that use parameters to pass values to server-side programs. However, these systems are misuse-based and therefore not able to detect attacks that have not been previously modeled. Guangmin [6] presents an immune-based active defense model for web attacks (IADMW) which is based on clonal selection and hypermutation: HTTP queries are considered the antigens, and an HTTP query is represented by a vector of attributes extracted from the query, with associated weights representing the importance of each attribute. Danforth [7] presents the Web Classifying Immune System (WCIS), a prototype system that detects attacks against web servers by examining web server requests; it focuses on distinguishing self from non-self and builds on the negative selection algorithm. WCIS considers features such as the length of the URI, the number of variables and the distribution of characters. Guangmin and Danforth do not consider web sessions, and a single HTTP query is labeled as an attack. Rassam [8] proposed an immune network clustering method that is robust in detecting novel attacks in the absence of labels. The purpose of that study was to enhance the detection rate by reducing the network traffic features and to investigate the feasibility of a bio-inspired immune network approach for clustering different kinds of attacks, including some novel attacks. The Rough Set method was applied to reduce the dimension of the features in the DARPA KDD Cup 1999 intrusion detection dataset, and immune network clustering was then applied using the aiNet algorithm to cluster the data.

3 Proposed Method

The proposed Web Host Immune Based Intrusion Detection System (WHIBIDS) introduces immune principles into IDSs to improve the capability of learning and recognizing web attacks, especially unknown web attacks. In the proposed algorithm, sessions and requests are constructed from the web logs in which the clickstream data are stored; clickstream data are generated as a result of user interaction with a website. Antigens and antibodies are represented in the same form and their lengths are equal.

Antigen presenting: each user request is defined as a member of the antigen set Ag. Each request is represented by a vector of attributes extracted from the access log file. The form of the antigen set Ag is:

Ag = { ag | ag = <SessionID, URL length, number of variables, distribution of characters, attribute length, depth of path> }

There are some shortcomings to the common access log files generated by web servers such as Apache. One of these problems is defining the web sessions. Since the boundaries of sessions are not clearly defined, the extraction of web sessions from these log files is not a


straightforward process [1]. In this paper sessions are generated as in [1], which produces realistic sessions: in [1] the log file is generated with a PHP program called the PHP log generator, and a log file generated with it includes sessions as well as the other parameters; the Session-ID shows which session each request belongs to. The length of the URL and the number of variables of each request are then calculated [7]. Distributions of characters have a regular pattern [5]; for example, in some attacks, such as buffer overflows, it is possible to see a completely different distribution of parameters, and the same appears to hold for attacks that use many repetitions of a special character, such as the multiple dots used in directory traversal flaws [9]. For each character, the percentage of its occurrences relative to the length of a parameter is calculated, and then, for each character, the average of these values over all the parameters of a request is computed. The attribute length and the depth of the path, which are part of each request, are also calculated; for example, the depth of the following path is 3:

index/wp-admin/export.php

Finally, the vector corresponding to the request is normalized, so that the output range is between 0 and 1. The normalized value of each field in the vector of a request is calculated by dividing the value of that field by the sum of the values over all the fields of that vector.
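A minimal sketch of this per-request feature vector and its normalization is given below; the example URL, the helper names and the simplified character-distribution summary are illustrative assumptions, not the exact procedure of the paper.

    # Build and normalize a per-request attribute vector (URL length, number of
    # variables, a simplified character-distribution score, attribute length,
    # depth of path); the URL below is only an illustrative example.
    from urllib.parse import urlparse, parse_qsl

    def char_distribution_score(values):
        """Average, over parameters, of the relative frequency of the most common
        character in each parameter value (a simplified stand-in for the
        per-character distribution described in the text)."""
        scores = []
        for v in values:
            if v:
                most_common = max(v.count(ch) for ch in set(v))
                scores.append(most_common / len(v))
        return sum(scores) / len(scores) if scores else 0.0

    def request_vector(url):
        parsed = urlparse(url)
        params = parse_qsl(parsed.query, keep_blank_values=True)
        values = [v for _, v in params]
        features = [
            len(url),                                        # URL length
            len(params),                                     # number of variables
            char_distribution_score(values),                 # character distribution (simplified)
            max((len(v) for v in values), default=0),        # attribute length
            len([p for p in parsed.path.split("/") if p]),   # depth of path
        ]
        total = sum(features) or 1.0
        return [f / total for f in features]                 # each field divided by the sum

    # the depth of "index/wp-admin/export.php" is 3, as in the example above:
    print(request_vector("http://example.com/index/wp-admin/export.php?id=7&name=ali"))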
Affinity function: the similarity measure between two antigens is the Euclidean distance, which determines the distance between two web application requests. Precisely, the similarity between two requests ag_i and ag_j is defined as:

dis(ag_i, ag_j) = \sqrt{ \sum_{n=1}^{k} (ag_{in} - ag_{jn})^2 }    (1)

where k is the number of features extracted for each request. The pseudo code of the proposed algorithm is presented below:

Algorithm 1: The modified immune network algorithm of [2]

    initialization:
        fix the maximal population size N_B;
        initialize the B-cell population and sigma_i^2 = sigma_init using a number of random antigens;
    while not all antigens have been presented do
        present the antigen to each B-cell;
        if the B-cell is activated (w_ij > w_min) then
            refresh its age (t = 0);
            add the current B-cell and its K nearest neighbours to the working sub-network;
        else
            increment the age of the B-cell by one;
        end
        if w_ij < w_min for all B-cells then
            create a new B-cell equal to the antigen;
        else
            for each B-cell in the working sub-network do
                compute the B-cell stimulation;
                update the B-cell sigma_i^2;
            end
        end
        if all antigens of a session have been presented then
            clone the B-cells based on their stimulation level;
            if the population size > N_B then
                remove the extra, least stimulated B-cells;
            end
        end
    end

As shown in the proposed algorithm, when an antigen is unable to activate any B-cell, that antigen may represent noise or a new emerging pattern. In this case a new B-cell is created which is a copy of the presented antigen. If the antigen is noisy data and does not represent a new emerging pattern, it will not get enough chance to be stimulated by incoming antigens and is probably eliminated. After all the antigens of a session have been presented to the network, the B-cells undergo a cloning operation based on their stimulation level. When the population of the network exceeds a defined threshold, the least stimulated B-cells are removed from the network. The distance measure presented in this study is used in all steps for calculating the internal and the external (B-cell to antigen) interactions of B-cells. The detailed information about calculating the stimulation level and updating it is described


by [2]. In the training phase, two profiles, of normal and of abnormal behavior, are built using the proposed algorithm. They are then applied to new requests and new sessions in order to detect abnormal behaviors in the testing phase.

Experimental Evaluation

Table 2: Evaluation on the Linux system-call dataset (similarity threshold = 0.75)

                     Accuracy   False alarm rate   Detection rate
    Request based    97%        0.03               98%
    Session based    98.6%      0.01               98.5%

Table 3: Evaluation on the web access log dataset with 20% added noise (similarity threshold = 0.75)

                     Accuracy   False alarm rate   Detection rate
    Request based    80%        0.18               76%

There are no publicly available data on web attacks and pure non-attack traffic that can be used as a benchmark; therefore, we used the dataset gathered by [1], which contains a wide variety of vulnerability tests such as SQL injection, XSS vulnerabilities and directory traversal flaws. The empirical evaluation reported in this paper is performed on the web requests of sessions. The original data used in our experiments contain 43602 requests and 6677 sessions from the log files of the web server for seven random days; duplicate records were removed from the dataset.

The maximal population size of the network is set to 50, and the control parameter for the number of nearest neighbors (K) is set to 3. The activation threshold (w_min) is 0.5 and the similarity threshold is 0.75. If the weighted distance is greater than this threshold, the B-cell is activated, and the activated B-cells belonging to one session are taken to represent the user behavior. Evaluation is based on two criteria and two datasets: one criterion is per request and the other is per session, i.e. the array of requests that represents the user behavior. We believe that an attack is a series of actions, so sets of requests, as actual sessions, are considered. To show that the proposed algorithm also works on other datasets, we use a Linux system-call dataset in which the concept of sessions also exists; this second dataset contains 13217 sessions and 66159 system calls. Both data types were tested with the same algorithm parameters.
Different metrics are measured to evaluate the ability of the algorithm to learn the properties of the features of the data and to detect anomalous activities. The detection rate is the fraction of true positives over all cases that should have been classified as positive. The false alarm rate is the proportion of actually normal cases that were incorrectly classified as anomalous.
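For concreteness, the two metrics just defined can be computed from raw counts as in the following sketch; the counts in the example calls are placeholders, not results of this paper.

    # Detection rate and false alarm rate from raw classification counts.
    def detection_rate(true_positives, false_negatives):
        """Fraction of actual attack cases that were classified as attacks."""
        return true_positives / (true_positives + false_negatives)

    def false_alarm_rate(false_positives, true_negatives):
        """Fraction of actually normal cases incorrectly classified as anomalous."""
        return false_positives / (false_positives + true_negatives)

    print(detection_rate(true_positives=92, false_negatives=8))     # 0.92
    print(false_alarm_rate(false_positives=3, true_negatives=97))   # 0.03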

Table 4: Comparison of WHIBIDS and the IADMW IDS

               Accuracy   False alarm rate   Detection rate
    WHIBIDS    97.3%      0                  92%
    IADMW      85%        0.065              67%

We ran the proposed algorithm 5 times with 5-fold cross-validation, and the final values of the evaluation measures are the averages of these 5 runs. Table 1 and Table 2 show the proposed system's high capability on both criteria and both datasets. The results show that the session-based performance is better than the request-based one, and we can claim that the proposed algorithm detects malicious activities with high accuracy. Patterns may be repeated in multiple B-cells within the population; this is called a loss of diversity, or overfitting, and essentially leads to redundancy (e.g. multiple requests having the same signature). To show that overfitting has not occurred on the training data, 20% noise was added to the test data. Table 3 shows that the noise affects the results by about 15 percent; if overfitting had occurred, it would have had a much larger impact on the results. Table 4 compares WHIBIDS with the IADMW IDS, whose results come from [6]. The detection rate of WHIBIDS is 92%, while the detection rate of IADMW is 67%. At the same time, WHIBIDS is also capable of classifying web attacks and has a high accuracy of 97.3%. These results show that WHIBIDS is a competitive alternative for detecting web attacks.

Table 1: Evaluation on the web access log dataset (similarity threshold = 0.75)

                     Accuracy   False alarm rate   Detection rate
    Request based    97.3%      0                  92%
    Session based    98.9%      0                  95%

Acknowledgment

This research was supported by the APA center at Yazd University. The authors would like to thank APA for its support.


Conclusions

In this paper we proposed an intrusion detection system based on the principles of the immune system (WHIBIDS) that can detect both known and unknown attacks. Here an attack is considered as a series of actions. The requests obtained from the preprocessed log files of the web server are presented to the system as antigens. The network of B-cells represents a summarized version of the antigens encountered by the network, and it can adapt at any time to emerging usage patterns introduced by new antigens. The results show the ability of the proposed AIS to cluster web sessions into normal and abnormal, and indicate that an immune-based IDS of this design has several advantages: (1) self-learning and immune learning allow the model to detect both known and unknown web attacks; (2) it can detect anomalies in real time; (3) it can recognize abnormal behavior with respect to the actual sessions; (4) the immune network algorithm achieves high detection rates; (5) it can be used as a general classifier. A limitation is that the similarity threshold had to be determined by testing; future work will determine this threshold by reinforcement learning.


References
[1] I. Khalkhali, R. Azmi, and M. Azimpour-Kivi, Host-based Web Anomaly Intrusion Detection System, an Artificial Immune System Approach, IJCSI International Journal of Computer Science Issues 8 (2011), 14-24.
[2] M. Azimpour-Kivi and R. Azmi, Applying Sequence Alignment in Tracking Evolving Clusters on Web Sessions Data, an Artificial Immune Network Approach, Computational Intelligence, Communication Systems and Networks (CICSyN) (2011).
[3] B. H. Helmi and A. T. Rahmani, An AIS algorithm for Web usage mining with directed mutation, Proc. World Congress on Computational Intelligence (WCCI08) (2008).
[4] N. K. Jerne, Towards a Network Theory of the Immune System, Annals of Immunology (1974), 373-389.
[5] C. Kruegel and G. Vigna, Anomaly detection of web-based attacks, Proceedings of the 10th ACM Conference on Computer and Communications Security (2003), 251-261.
[6] L. Guangmin, Modeling Unknown Web Attacks in Network Anomaly Detection, International Conference on Convergence and Hybrid Information Technology (2008).
[7] M. Danforth, Towards a Classifying Artificial Immune System for Web Server Attacks, Department of Computer and Electrical Engineering and Computer Science, International Conference on Machine Learning and Applications (2009).
[8] M. A. Rassam, M. A. Maarof, and A. Zainal, Intrusion Detection System Using Unsupervised Immune Network Clustering with Reduced Features, Int. J. Advance. Soft Comput. Appl. 2 (2010).
[9] Z. Brewer, Web Server Protection with CSA HTTP Explorer Directory Traversal, Cisco Security Agent Protection Series (2006).

A Fast and Robust Face Recognition Approach Using Weighted Haar and Weighted LBP Histogram

Mohsen Biglari
University of Kashan, Kashan, Iran
biglari@grad.kashanu.ac.ir

F. Mirzaei
University of Kashan, Kashan, Iran
biglari@grad.kashanu.ac.ir

H. Ebrahimpour-Komleh
University of Kashan, Kashan, Iran
ebrahimpour@kashanu.ac.ir

Abstract: This paper presents a novel face recognition approach based on the Local Binary Pattern (LBP) and the Haar wavelet transform. We propose a fast and robust three-layer weighted Haar and weighted LBP histogram (WHWLBP) representation for face recognition. In this method, the face image is decomposed using a first-level Haar wavelet decomposition, and then a multi-block LBP operator is applied to each of the four-channel subimages, with different block sizes, in order to extract the features efficiently. The extracted histograms are concatenated into a single final feature vector. In the recognition stage, the Chi-square statistic is used to measure the difference between feature histograms. A weighted approach is used for the histogram comparison, to emphasize the more important regions of faces. The performance of the proposed method is tested on the Yale and ORL face databases. The results show that our method performs better than traditional methods such as LDA, PCA, KPCA and even the LBP operator, and is more robust to face variations such as illumination, expression and pose.

Keywords: Face Recognition; Local Binary Pattern; Haar Wavelet Transform; Chi Square.

Introduction

Automatic face recognition is one of the most challenging research topics in pattern recognition, and it has gained significant attention in recent decades. A large number of novel face recognition techniques have been developed in the last few years [1, 2], and many of these methods have been used successfully in real-world applications. However, despite all this progress, there are still many challenging problems, such as different lighting conditions, pose variations and facial expressions, that cause a significant decrease in the performance of face recognition systems. Face recognition based on the Local Binary Pattern has recently been proposed as a fast and robust recognition approach [3-5]. In the LBP representation, instead of using the raw intensity values of pixels, a higher-level pattern that reflects the relationships between pixel intensity values in a region is used. Some approaches have combined LBP with other features, such as Gabor features and skin color, to improve the overall


performance. For example, in [6] LBP is applied to Gabor-filtered images and the result is used for face recognition. A boosting algorithm is used to extract discriminative local binary patterns for classification in [5, 7]. Finding a good descriptor for the appearance of local facial regions is still an open issue. In this paper, we develop a novel face recognition approach based on the LBP histogram and the Haar wavelet transform. We propose a three-layer weighted Haar and weighted LBP histogram (WHWLBP) representation for face recognition. In the first layer, the face image is decomposed using a first-level Haar wavelet decomposition, and then the multi-block extended LBP operator is applied to each of the four-channel subimages in order to extract the features. Finally, in the last layer, the extracted histograms are concatenated into a single final feature vector. A weighted approach is used for layers one and two in the recognition stage, to emphasize the more important channels. Finally, the Chi-square statistic is used for classification. Compared with existing approaches, our approach achieves better


accuracy with more robustness to expression and pose variations and to different lighting conditions. The rest of the paper is organized as follows: Section 2 presents the definition of the LBP descriptor and its modified versions. Section 3 gives an introduction to the wavelet transform. The proposed feature extraction approach is presented in Section 4. Experimental results are reported in Section 5, and finally conclusions are drawn in Section 6.

2 Local Binary Pattern

The Local Binary Pattern (LBP) operator was proposed by Ojala et al. [8] for texture description. It has gained much attention in the face recognition field [3, 9, 10], not only because of its computational efficiency and high discrimination power, but also for its invariance to monotonic gray-scale transformations.

2.1 The original LBP

The basic LBP operator is a non-parametric 3x3 kernel. It assigns a label to every pixel of an image by thresholding the eight surrounding pixels against the center pixel value and considering the result as a binary number. See Fig. 1 for more details.

Figure 1: The original LBP operator
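A minimal sketch of this 3x3 labelling is given below; the bit ordering of the neighbours is a convention chosen only for illustration.

    # Basic LBP code of the centre pixel of a 3x3 patch: the eight neighbours are
    # thresholded against the centre value and the resulting bits form one number.
    import numpy as np

    def lbp_3x3(patch):
        center = patch[1, 1]
        neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                      patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= center:        # threshold against the centre pixel
                code |= 1 << bit       # collect the bits into a binary label
        return code

    print(lbp_3x3(np.array([[6, 5, 2],
                            [7, 6, 1],
                            [9, 8, 7]])))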

2.2 The extended LBP

Ojala et al. [11] expanded the original LBP operator to a multi-scale LBP that uses neighborhoods of different sizes. The extended LBP uses different numbers of sampling points on circles of different radii, and bilinear interpolation is used when a sampling point does not fall in the center of a pixel. The notation LBP_{P,R} refers to P sampling points on a circle of radius R. Fig. 2 shows several multi-scale LBP patterns with different P and R. This operator is more effective than the original one in the field of face recognition.

Figure 2: Multi-scale LBP operator

2.3 The uniform LBP

Another extension of the original LBP is the definition of uniform patterns [11]. A local binary pattern is called uniform if it contains at most two bitwise transitions from 0 to 1 or from 1 to 0. The experiments in [11] show that uniform patterns account for around 90 percent of all patterns in the (8, 1) neighborhood. We use the notation LBP^{u2}_{P,R} for the uniform LBP operator.

3 Haar Wavelet Transform

The Haar wavelet [12] is one of the simplest wavelets, yet it is very useful in the field of signal processing. It can be used to transform an image from the spatial domain to the frequency domain, where more robust information can be obtained from face images and therefore more robust classification is possible. Among the large range of wavelets, the Haar wavelet offers fast calculation and memory efficiency. Using a first-level 2D Haar wavelet transform, any face image can be decomposed into the four-channel subimages LL, LH, HL and HH in the frequency domain; Fig. 3 shows an example. Among the four channels, the LL channel provides a smaller approximation of the original face image, because it is the low-frequency part of the image. The HL and LH channels are in the middle frequencies and provide the changes of the original face image in the horizontal and vertical directions, respectively. The HH channel is the high-frequency part of the image and contains less useful information about the face.

Figure 3: An example of first-level wavelet decomposition
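For reference, a first-level 2D Haar decomposition of this kind can be obtained with PyWavelets as sketched below; the random array stands in for a preprocessed face image, and the mapping of the detail coefficients onto the paper's LH/HL naming is assumed.

    # First-level 2D Haar decomposition into four-channel subimages with PyWavelets.
    import numpy as np
    import pywt

    face = np.random.rand(124, 124)                # placeholder for a grayscale face image
    LL, (LH, HL, HH) = pywt.dwt2(face, "haar")     # approximation + three detail channels
    print(LL.shape, LH.shape, HL.shape, HH.shape)  # each about half the original size (62x62)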


4 Feature Extraction

We propose a three-layer weighted Haar and weighted LBP histogram (WHWLBP) representation for face recognition, shown in Fig. 4. In this representation, the input face image is first decomposed by the Haar wavelet into four-channel subimages. Each subimage is then divided into m x n rectangular regions, and the LBP^{u2}_{P,R} operator is applied to each region. Finally, all the extracted histograms are concatenated into a single feature vector. The region counts (m, n) differ between channels, in order to gather more local information in the more important channels.

Figure 4: Our proposed three-layer weighted Haar and weighted LBP histogram

The LL channel carries more appearance-related information; therefore we divide it into 4 x 4 regions. The LH and HL channels describe the texture grade of the face image and are more robust to face variations; by reducing the number of blocks we gain more discriminative power in each region, so we use 2 x 2 regions for these channels. We ignore the HH channel and do not use it in the second layer, because of its poor performance, which is examined in sub-section A of Section 5.

The length L of the final feature vector fv can be calculated by (1). This feature vector is partially robust to illumination changes and to pose and expression variations. In order to reduce the effect of nonlinear illumination changes, histogram equalization is applied to the input images in the preprocessing stage.

L(fv) = 4 x 4 x 59 + 2 x (2 x 2 x 59) = 1416    (1)

5 Experiments and Results

Different metrics are available for histogram comparison, such as histogram intersection (2), the log-likelihood statistic (3) and the Chi-square statistic (4); H1 is the input histogram and H2 is the registered histogram. Reference [3] has shown that the Chi-square statistic performs better than the other two, so we choose Chi-square in our experiments. For a more efficient histogram comparison, we use the weighted Chi-square (5) to place more emphasis on specific channels in layer one of our three-layer approach.
D(H1, H2) = \sum_{i=0}^{n-1} min(H1_i, H2_i)    (2)

L(H1, H2) = \sum_{i=0}^{n-1} H1_i \cdot log(H2_i)    (3)

\chi^2(H1, H2) = \sum_{i=0}^{n-1} (H1_i - H2_i)^2 / (H1_i + H2_i)    (4)

\chi^2_w(H1, H2) = \sum_{i,j} w_j (H1_{i,j} - H2_{i,j})^2 / (H1_{i,j} + H2_{i,j})    (5)

In (5), the indices i and j refer to the i-th bin of the feature histogram corresponding to the j-th channel, and w_j is the weight of channel j.
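A small sketch of the plain and weighted Chi-square comparisons of equations (4) and (5) is given below; the per-channel weights [3, 1, 2, 0] anticipate the values the paper selects later for the LL, LH, HL and HH channels, and the epsilon guard against empty bins is an implementation assumption.

    # Chi-square (eq. 4) and weighted Chi-square (eq. 5) between histogram sets.
    import numpy as np

    def chi_square(h1, h2, eps=1e-12):
        """Plain Chi-square distance between two histograms, as in equation (4)."""
        h1, h2 = np.asarray(h1, dtype=float), np.asarray(h2, dtype=float)
        return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

    def weighted_chi_square(channels1, channels2, weights=(3, 1, 2, 0)):
        """Weighted Chi-square of equation (5) over per-channel histograms."""
        return sum(w * chi_square(a, b)
                   for w, a, b in zip(weights, channels1, channels2))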
The nearest-neighbor (NN) classifier is used in the recognition stage; the first rank is selected as the output of the classifier.

To assess the efficiency of the proposed algorithm and measure its robustness to different face variations, it has been tested on the ORL face database [13] and the Yale face database [14].

5.1 Yale face database

The Yale face database contains 11 gray-scale images of each of 15 distinct subjects, one for each of the following facial expressions or configurations: center-light, with glasses, happy, left-light, without glasses, normal, right-light, sad, sleepy, surprised, and wink (Fig. 5). The original size of each image is 320 x 243, including background. In order to reduce the effect of the background on the recognition results, the faces were extracted from the original images. The extraction is done as follows:


- The center position of both eyes in each face is marked manually. The distance d between the eye centers is used in the next stage.


- Face/head anthropometric measures are used to determine the face bounding box and crop the face region (Fig. 6). The width bbx of the face region (in pixels) can be calculated by (6):

bbx = (mw_hf / (2 · pupil_se)) · d    (6)

where mw_hf = 139.1 is the mean width of a human face in mm, pupil_se = 33.4 is half of the inter-pupil distance in mm, and d is the distance between the eye centers in pixels. These measures are anthropometric constants given in [15].
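A direct computation of Eq. (6) (a small illustrative snippet, with the eye distance d as an assumed input) is:

MW_HF = 139.1    # mean width of a human face in mm [15]
PUPIL_SE = 33.4  # half of the inter-pupil distance in mm [15]

def face_bbox_width(eye_distance_px):
    """Width of the face bounding box in pixels from the inter-eye distance (Eq. 6)."""
    return (MW_HF / (2.0 * PUPIL_SE)) * eye_distance_px

# e.g. eyes 60 px apart -> bounding box about 125 px wide
print(round(face_bbox_width(60)))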

The LH and HL channels are more robust to face variations, so we can use them together with the LL channel to build a more robust face recognizer. The HH channel is usually mixed with noise and does not perform well in recognition, as can be seen in Table 1. According to these results and by trial and error, we selected the weight vector [3 1 2 0] for the LL, LH, HL and HH channels respectively. This weight vector is used as w_j in the weighted Chi-square statistic. For comparison purposes, in addition to our proposed method, LDA [16], PCA [17], KPCA [18] and multi-block LBP with different block sizes were also tested on this database. For multi-block LBP, different block counts were tested. Table 2 shows the results. The proposed WHWLBP method is tested in weighted and non-weighted modes.

Channel   Accuracy
LL        85.9%
LH        68.8%
HL        82.2%
HH        42.2%

Table 1: Result of the recognition on Yale for each channel subimage separately.

Figure 5: Sample images from the Yale database

We calculated the width and position of the bounding box for each face image according to Fig. 5 and then extracted the face regions. Finally, all images were resized to 124 x 124 pixels. Fig. 6 shows some sample cropped images from Yale. In the experiments, we selected the 2 normal images (without any expression) of each subject for training and the remaining 9 images for testing. In order to measure the discrimination power of each channel subimage after Haar decomposition, the classification module was run on Yale for each of the four channel subimages. Table 1 shows the result. In Table 1, the LL channel achieves the best result, as expected, but it can be distorted by facial expressions and illumination variations.

Figure 6: Determined face bounding box by face/head anthropometric measures defined in [15]

Methods                Accuracy
LDA                    57.7%
PCA                    45.9%
KPCA                   65.9%
LBP_{8,2}^{u2} 1x1     70.3%
LBP_{8,2}^{u2} 4x4     79.2%
LBP_{8,2}^{u2} 6x6     89.6%
LBP_{8,2}^{u2} 8x8     88.1%
WHWLBP_NW              87.4%
WHWLBP_W               91.11%

Table 2: Result of the recognition on Yale.

5.2 ORL face database

The ORL face database contains 10 different images of each of 40 distinct subjects. All the images were taken against a dark homogeneous background with the subjects in upright, frontal positions (Fig. 7), in gray-scale mode. The size of each image is 92 x 112. In the experiments, we selected the odd-numbered images of each subject for training and the even-numbered images for testing, and vice versa. Table 3 shows the result.


Figure 7: Sample images from the ORL database

Methods                Accuracy
LDA                    61%
PCA                    46.5%
KPCA                   69%
LBP_{8,2}^{u2} 1x1     87.5%
LBP_{8,2}^{u2} 4x4     94%
LBP_{8,2}^{u2} 6x6     93%
LBP_{8,2}^{u2} 8x8     90.5%
WHWLBP_NW              96%
WHWLBP_W               97.5%

Table 3: Result of the recognition on ORL.

5.3 Discussions

As can be observed, our approach has the best accuracy on both databases. From these results we can draw several conclusions:

- Our method is partially robust to expression, illumination and pose variations, and clearly more robust than the other tested methods.
- The tested traditional methods, LDA and PCA, do not perform well under face variations.
- Multi-block LBP performs better than the original LBP, and the best block count depends on the conditions of the face images. On Yale, whose face images are larger, 6 x 6 multi-block LBP performs better than the other LBP methods, but on ORL, 4 x 4 multi-block LBP has better accuracy.
- The weighted approach improved the accuracy of the non-weighted mode.
- WHWLBP is much faster than LDA, PCA, LBP_{8,2}^{u2} 6x6 and LBP_{8,2}^{u2} 8x8, because it has a smaller feature vector and therefore faster calculation in both training and testing, and faster comparison.
- WHWLBP has a larger margin in recognition rate over the other methods on ORL than on Yale. This is because the Yale images have more variation in expression and illumination, and this is why we used the phrase "partially robust" in this paper.

Conclusion

In this paper, we developed a novel face recognition approach based on the LBP histogram and the Haar wavelet transform. We proposed a three-layer weighted Haar and weighted LBP histogram (WHWLBP) representation for face recognition which blends the power of both the Haar wavelet transform and the LBP operator.
In our final feature vector, we effectively have a description of the face at different levels of locality: the Haar wavelet operator gathers information at the frequency level; the LBP operator used in layer two contains information about texture patterns at the pixel level; the LBP operator's output is then summed over a small region to gather information at a middle level; and each region's histograms are concatenated to make a global description of the input face image in the last layer.
The results of the experiments on the Yale and ORL databases show that our proposed feature extraction method is effective for face representation and recognition and is partially robust to expression, pose and illumination variations.

References

[1] R. Jafri and H. R. Arabnia, A survey of face recognition techniques, Journal of Information Processing Systems 5 (2009), 41-68.

[2] X. Zhang and Y. Gao, Face recognition across pose: A review, Pattern Recognition 42 (2009), 2876-2896.

[3] T. Ahonen, A. Hadid, and M. Pietikainen, Face description with local binary patterns: Application to face recognition, Pattern Analysis and Machine Intelligence, IEEE Transactions on 28 (2006), 2037-2041.

[4] D. Maturana, D. Mery, and A. Soto, Face recognition with local binary patterns, spatial pyramid histograms and naive Bayes nearest neighbor classification (2009), 125-132.

[5] G. Zhang, X. Huang, S. Li, Y. Wang, and X. Wu, Boosting local binary pattern (LBP)-based face recognition, Advances in Biometric Person Authentication (2005), 179-186.

[6] W. Zhang, S. Shan, W. Gao, X. Chen, and H. Zhang, Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition 1 (2005), 786-791.

[7] Y. Gao and Y. Wang, Boosting in random subspace for face recognition, Intelligent Computing in Signal Processing and Pattern Recognition (2006), 172-181.

[8] T. Ojala, M. Pietikainen, and D. Harwood, A comparative study of texture measures with classification based on featured distributions, Pattern Recognition 29 (1996), 51-59.


[9] H. K. Ekenel, M. Fischer, E. Tekeli, R. Stiefelhagen, and A. Ercil, Local binary pattern domain local appearance face recognition (2008), 1-4.

[10] H. M. Vazquez, E. G. Reyes, and Y. C. Molleda, A new image division for LBP method to improve face recognition under varying lighting conditions (2008), 1-4.

[11] T. Ojala, M. Pietikainen, and T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, Pattern Analysis and Machine Intelligence, IEEE Transactions on 24 (2002), 971-987.

[12] A. Haar, Zur Theorie der orthogonalen Funktionensysteme, Mathematische Annalen 69 (1910), 331-371.

[13] F. Samaria and A. Harter, Parameterisation of a stochastic model for human face identification (1994), 138-142.

[14] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection, Pattern Analysis and Machine Intelligence, IEEE Transactions on 19 (1997), 711-720.

[15] L. Farkas, Anthropometry of the Head and Face, Raven Press, New York (1994).

[16] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, Face recognition using LDA-based algorithms, Neural Networks, IEEE Transactions on 14 (2003), 195-200.

[17] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1991), 71-86.

[18] B. Scholkopf, A. Smola, and K. R. Muller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation 10 (1998), 1299-1319.


An Unsupervised Method for Change Detection in Breast MRI


Images based on SOFM
Marzieh Salehi

Reza Azmi

Alzahra University

Alzahra University

Computer Engineering Department

Computer Engineering Department

Marzieh.salehi.sh@gmail.com

azmi@alzahra.ac.ir

Narges Norozi
Alzahra University
Computer Engineering Department
Na.norozi@gmail.com

Abstract: An automatic change analysis method that can efficiently detect changes in an MRI sequence is very important for medical diagnosis, follow-up and prognosis. Chemotherapy is a standard therapy for cancerous diseases, but accurately analyzing the reaction of tumors to this therapy is a challenging task, because direct comparison and manual analysis are very time-consuming, difficult and sometimes impossible. In this paper we propose a novel unsupervised method for change detection in breast MRI images. We apply a modified self-organizing feature map (SOFM) neural network. To obtain an appropriate threshold for the network, we use a correlation-based criterion.

Keywords: Change Detection; SOFM; Neural Network; MRI; Artificial Intelligence.

Introduction

One of the most common injurious cancerous diseases


among women is breast cancer. Medical imaging plays
an important role in detection, diagnosis, and treatment monitoring. Mammography is one of the usual
ways for breast medical imaging but this method does
not detect about 10-30 % of cases [10]. One of the
robust methods for medical imaging is MRI. MRI has
better performance in cancer detection compared to
mammography and sonography [11]. Chemotherapy
is a standard therapy for cancerous diseases but there
is no methods for analyzing the reaction of tumors to
this therapy accurately because direct comparing and
manual analyzing is very time-consuming, difficult and
sometimes impossible. Therefore automatic change
analysis methods have become challenging problems in
medical image processing. In recent years many methods have been proposed for change detection in brain

MRI sequences [4-8], but there is less research on breast cancer [9], and because of the differences in structure and tissue between the brain and the breast it is not possible to use brain methods directly for the breast. According to the approach used for comparing images, change detection methods are classified as follows. One approach is segmentation based [12]; it is appropriate for some applications such as comparison of whole tumor contents, but it is not efficient for the analysis of invisible changes. Another approach is registration based; it uses the Jacobian operator for extracting changed regions, and its shortcoming is that it cannot detect small changes [13]. In the third approach the intensities of pixels are compared individually or together, so these methods are susceptible to noise. One of the robust change detection methods is the Generalized Likelihood Ratio Test (GLRT) proposed by Bosc et al. [4]. GLRT has recently been used for change detection in brain images by [7], [8].
In this paper we propose a novel method for change detection of breast tumors in MRI images. The proposed framework has three main stages: preprocessing, change detection and optimization. In the preprocessing stage, a linear intensity normalization and a non-rigid registration are applied. In the next stage we apply a modified SOFM neural network for change detection in breast MRI images. The basic idea exploited in the proposed method is inspired by [1]. Finally, in the last stage, to obtain an appropriate threshold for the network and minimize the error, we use a correlation-based criterion.

2  Preprocessing

2.1  Intensity Normalization

Global intensity changes may occur in different stages of MRI acquisition due to calibration errors and variations in imaging system components. Lemieux et al. used linear intensity normalization to overcome this problem. Bosc et al. proposed a nonlinear normalization method relying on the estimation of the joint probability distribution; they observed that estimating the joint probability distribution was very sensitive to image noise [14]. In this paper we use the linear intensity normalization method.

2.2  Rigid and Non Rigid Registration

Rigid-body transformations, which consist of only rotations and translations, have been used to correct different patient positioning in the successive scans. Since more complex deformations and undesired global differences may occur between two exams, using only a rigid or even an affine registration is not sufficient, especially in breast MRI images. These complex deformations can be the result of acquisition artifacts or natural features of breast tissue such as variations in breast compression. Generally, the main problem in the registration of mammography and MR images is the large deformation of the breast during acquisitions. For these reasons, using a non-rigid registration is essential. In this paper we use an intensity-based non-rigid registration algorithm that was proposed by Rueckert et al. and extended by Rohlfing et al. In this algorithm a hierarchical transformation model of the motion of the breast has been developed. The global motion of the breast is modeled by an affine transformation, while the local breast motion is described by a free-form deformation based on B-splines. Normalized mutual information is used as an intensity-based image similarity measure. Registration is achieved by minimizing a cost function which consists of a weighted combination of the similarity measure and a regularization term. The regularization term is a local volume-preservation (incompressibility) constraint, implemented by penalizing deviations of the Jacobian determinant of the deformation from unity [14].

Given two images, A (an input image) and B (a reference image), if T(A) is the image of A under the transformation function T, then the goal of image registration is to find the parameters of T that satisfy Eq. (1):

T_max = argmax_T SimilarityMeasure(B, T(A))    (1)

As mentioned above, image registration is the process of estimating the spatial transformation that maps points from the input image to similar points in the reference one. For this reason, choosing the type of the mapping function is one of the important tasks that should be considered. The type of the mapping function should correspond to the assumed geometric deformation of the sensed image, to the method of image acquisition (e.g. scanner-dependent distortions and errors) and to the required accuracy of the registration.

3  Unsupervised Change Detection Method

3.1  SOFM model

The SOFM model [2], [3] is made of two layers, an input layer and an output one. The d-dimensional input signal X = [x1, x2, ..., xd] is transmitted to all output neurons. Let the synaptic weight vector of an output neuron j be denoted by Wj = [wj1, wj2, ..., wjd], j = 1, 2, ..., M, where M is the number of output-layer neurons and wjk is the weight of the j-th unit for the k-th component of the input. If the weight vector of the i-th output neuron matches the input vector X best, then Σ_{k=1}^{d} w_ik·x_k will be the maximum among Σ_{k=1}^{d} w_jk·x_k for any j; in this case i is the winning neuron for the input X, and Wi will be updated. The weight-updating rule is:

Wi(p+1) = Wi(p) + α(p) · h(p) · (U − Wi)    (2)

where α is the learning rate that determines the rapidity of convergence and h(p) is the topological neighborhood at the p-th iteration of learning.
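For intuition, here is a compact sketch of this kind of competitive update (our own illustration; it uses the standard Kohonen form where the winner's weights move toward the input, and the learning rate and neighborhood values are assumed placeholders, not the authors' settings).

import numpy as np

def sofm_update(weights, x, alpha=0.1, neighborhood=None):
    """One competitive-learning step: pick the winner by the largest dot product
    with the input x, then move weights toward x (cf. Eq. 2).

    weights      : (M, d) float array, one weight vector per output neuron
    x            : (d,) input vector
    neighborhood : (M,) array h(p) with one factor per neuron (1 for the winner,
                   smaller for its neighbours); only the winner is updated if omitted
    """
    scores = weights @ x                    # U_j = W_j . x for every output neuron
    winner = int(np.argmax(scores))         # best-matching neuron i
    if neighborhood is None:
        neighborhood = np.zeros(len(weights))
        neighborhood[winner] = 1.0
    weights += alpha * neighborhood[:, None] * (x - weights)
    return winner, weights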


3.2  Change detection method

In this section we propose an unsupervised technique for change detection in breast MRI images using the SOFM model. In the first step we must calculate the difference image. For this purpose we use the following simple formula:

l_mn = (int)|img1_mn − img2_mn|    (3)

Let the difference image be denoted by D = {l_mn, 1 < m < p, 1 < n < q}. We assign to each pixel (m, n) of the difference image a neuron in the output layer. For each pixel (m, n) of the difference image we define an input pattern that contains (m, n) and its 8 neighboring pixels. Figure 1 shows the neighborhood for pixel (m, n).

Figure 1: The neighborhood for pixel (m, n).

The number of neurons in the input layer is the same as the size of the input pattern. To keep the value of each component of the input less than or equal to 1, we define a mapping function.

3.3  Learning the weights

For each output neuron (m, n) we have an input vector X and a weight vector W. In the SOFM model the pixel that has the maximum value of U(m, n) = W·X = Σ_{k=1}^{d} x_mn,k · w_mn,k is updated, but here a threshold th is defined and only those neurons that satisfy U(m, n) > th update themselves and their neighbors. Also, in SOFM the same input is given to all output neurons, but here different inputs are given to the output neurons. The threshold th lies in [0, 1]; therefore the value of U(m, n) must be less than or equal to 1 (here all components of the input and weight vectors are nonnegative). So we normalize the weight vector as follows:

w̄_mn,k = w_mn,k / Σ_{k=1}^{d} w_mn,k    (4)

Then Σ_{k=1}^{d} w̄_mn,k = 1. The updating continues until the network converges. To check the convergence, C(p) is computed at each iteration of learning as follows:

C(p) = Σ_{U(m,n) > th} U(m, n)    (5)

The condition for stopping the learning is |C(p+1) − C(p)| < ε, where ε is a pre-determined small positive value. After the network has converged, the pixel (m, n) in D belongs to the changed region if U(m, n) > th and to the unchanged region otherwise.

Proposed threshold selection method

By varying the threshold, the obtained changed region changes. Our goal is to find the threshold that yields the minimum error in the changed and unchanged regions. In the proposed method, after preprocessing the difference image by converting it to black and white with an appropriate threshold, we maximize its correlation with the map obtained from the above change detection method over different values of th, where th is changed by an appropriate step such as 1/M and M is the maximum gray value of D.
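A rough sketch of this correlation-based threshold search (illustrative only; the binarization of the difference image and the correlation measure are our assumptions based on the description above) might look like this:

import numpy as np

def select_threshold(diff_image, activation, n_steps=None):
    """Pick the activation threshold th whose change map best correlates with a
    binarized version of the difference image, as described above."""
    max_gray = int(diff_image.max())
    n_steps = n_steps or max_gray             # step th by roughly 1/M
    binary_ref = (diff_image > diff_image.mean()).ravel().astype(float)  # assumed binarization
    best_th, best_corr = 0.0, -np.inf
    for th in np.linspace(0.0, 1.0, n_steps + 1):
        change_map = (activation > th).ravel().astype(float)
        if change_map.std() == 0 or binary_ref.std() == 0:
            continue                           # correlation undefined for constant maps
        corr = np.corrcoef(binary_ref, change_map)[0, 1]
        if corr > best_corr:
            best_th, best_corr = th, corr
    return best_th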


Results and Discussion

The results of change detection are very difficult to evaluate because preparing reference patterns for the original changes by a human expert is impossible, for the following reasons: 1) some of the changes are invisible to human experts; 2) determining and labeling each changed pixel exactly is very time-consuming, expensive and error-prone. Therefore we use simulated images in this paper. The method of generating the simulated images is given in [14].


Table 1: Evaluation of the proposed method and GLRT for 14 test images through 5 criteria.

          GLRT                                Proposed
          Mean    Std-dev  Max     Min        Mean    Std-dev  Max     Min
VOR (%)   33.77   21.28    38.55   7.05       47.59   19.43    69.78   16.49
FPR (%)   3.77    2.73     2.43    1.22       0.97    1.48     4.93    0.00
ACC (%)   96.14   2.60     98.97   89.92      98.58   1.26     99.69   95.08
SPC (%)   96.23   2.73     98.03   89.85      99.03   1.48     99.99   95.07
PPV (%)   37.44   27.04    47.52   7.06       67.65   29.67    99.66   16.60

In this paper, an unsupervised approach is presented as a new approach for breast change detection in MRIs. This approach is evaluated through 5 criteria: accuracy (ACC), specificity (SPC), false positive rate (FPR), positive predictive value (PPV) and VOR. The results show that the proposed method has a better performance compared to the GLRT method. Tables 1 and 2 show the results.
In the results, we see that ACC and SPC have large values. This is because the number of true negatives is large, i.e. the black pixels around the breast region cause the values of these two criteria to be higher than the reality. If we extract the region of interest, the values of ACC and SPC will be more realistic.

Table 2: Change detection using the proposed method and GLRT for 5 sample images (original patterns, GLRT results, and proposed-approach results for test images 1-5).

Conclusion

In this paper we proposed a novel unsupervised method for change detection in breast MRI images. We apply a modified SOFM neural network. Each pixel of the difference image is assigned to a neuron in the output layer. The number of neurons in the input layer is equal to the number of input-pattern pixels for each output neuron. To obtain an appropriate threshold for the network, we use a correlation-based criterion. To evaluate the results, we use five criteria. The obtained results give a considerable improvement in these criteria compared to the GLRT method.

References
[1] S. Ghosh, S. Patra, and A. Ghosh, An unsupervised context-sensitive change detection technique based on modified self-organizing feature map neural network, Journal of Approximate Reasoning (2009), 37-50.


[2] T. Kohonen, Self-organized formation of topologically correct feature maps, Biol. Cybernet (1982).

[3] T. Kohonen, Self-Organizing Maps, second ed., Springer-Verlag, Berlin (1997).

[4] M. Bosc and F. Heitz et al., Automatic change detection in multimodal serial MRI: application to multiple sclerosis lesion evolution, NeuroImage (2003), 643-656.

[5] L. Lemieux, U.C. Wieshmann, and N.F. Moran et al., The detection and significance of subtle changes in mixed-signal brain lesions by serial MRI scan matching and spatial normalization, Medical Image Analysis (1998), 227-242.

[6] D. Rey, G. Subsol, H. Delingette, and N. Ayache, Automatic detection and segmentation of evolving processes in 3D medical images: application to multiple sclerosis, Medical Image Analysis (2002), 163-179.

[7] H. Boisgontier and V. Noblet et al., Generalized likelihood ratio tests for change detection in diffusion tensor images: Application to multiple sclerosis, Medical Image Analysis (2012), 325-338.

[8] E.D. Angelini and J. Delon et al., Differential MRI analysis for quantification of low grade glioma growth, Medical Image Analysis (2012), 114-126.

[9] X. Li, B.M. Dawant, and E.B. Welch, A nonrigid registration algorithm for longitudinal breast MR images and the analysis of breast tumor response, Magnetic Resonance Imaging (2009), 1258-1270.

[10] D.B. Kopans, The positive predictive value of mammography, American Journal of Roentgenology (1992).


[11] S. Malur and S. Wurdinger et al., Comparison of written reports of mammography, sonography and magnetic resonance mammography for preoperative evaluation of breast lesions, with special emphasis on magnetic resonance mammography, Breast Cancer Res (2001), 55-60.

[12] Z. Xue, D. Shen, and C. Davatzikos, CLASSIC: Consistent longitudinal alignment and segmentation for serial image computing, NeuroImage (2006), 388-399.


[13] M.I. Miller, Computational anatomy: shape, growth, and atrophy comparison via diffeomorphisms, NeuroImage (2004).

[14] R. Azmi, N. Norozi, and R. Anbiaee, A novel framework for automatic analysis of chemotherapy effects on breast tumors in MRIs based on region growing and local mutual information, Iran Computer Conference (1390).

A new image steganography method based on LSB replacement using Genetic Algorithm and chaos theory
Amirreza Falahi

Maryam Hasanzadeh

Shahed university, Tehran, Iran

Shahed university, Tehran, Iran

a.falahi@shahed.ac.ir

hasanzadeh@shahed.ac.ir

Abstract: In this article a new information hiding method based on LSB replacement in the spatial domain is presented. In this method, the message bits are first shuffled by a chaotic map whose parameters are adjusted by a Genetic Algorithm. Then the best order of the shuffled message bits is selected for embedding with consideration of the image pixels. This makes minimal changes to the visual perception of the image while the first-order statistics of the image are also preserved. The experimental results indicate the imperceptibility, security and high capacity of this method. Moreover, to extract the message, the recipient does not require the original image.

Keywords: Steganography, LSB replacement, Genetic Algorithm, Chaos

Introduction

Steganography refers to the science of invisible communication. Unlike cryptography, where the goal is secure
communications from an eavesdropper, steganographic
techniques strive to hide the presence of the message
from an observer [1]. Steganography embeds the data in the least significant components of a cover medium, such that unauthorized users are not aware of the existence of the hidden data [2]. The cover object can be a still digital image, a video or an audio file. The hidden message can also be raw text, an image, an audio file or a video file [3], [4]. A steganography algorithm embeds the hidden message in a cover medium; the combination of the cover and the hidden message is called the stego object.
Steganography techniques can be divided into two main categories: embedding in the frequency domain and embedding in the spatial domain. In the frequency domain most methods are based on the discrete cosine transform (DCT): after performing the DCT on 8 x 8 blocks and quantizing the DCT coefficients, the hidden messages are embedded in the quantized DCT coefficients. LSB replacement is the most commonly used method in the spatial domain; it directly replaces the LSBs of the cover image with the hidden message bits [5].
Due to the increasing knowledge of hackers, the need for approaches with high security and acceptable capacity has increased sharply. In recent years many approaches for embedding data in images based on evolutionary algorithms, genetic algorithms and chaos theory have been presented. The use of chaos for shuffling the message bits and an improved adaptive LSB method were suggested in [6]. In [7] the optimization power of evolutionary algorithms is used to increase resistance against statistical attacks. In [8] a genetic-algorithm-based technique for watermarking data inside images was proposed. In 2007, an innovative watermarking scheme based on progressive transmission with genetic algorithms was proposed in [9]. In 2003, another approach for data hiding in the frequency domain using chaos theory was introduced [10]. In 2010, a steganography approach based on LSB replacement and a hybrid edge detector was proposed [11]. In [12] a watermarking algorithm based on chaos for images in the wavelet domain was proposed, and in [13] a watermarking algorithm based on SVD and a genetic algorithm was presented.
In this paper chaos is used for shuffling the message bits. The required parameters are adjusted by a Genetic Algorithm in which the operators are selected intelligently. In the following, the proposed method


will be described in Section 2, the experimental results will be illustrated in Section 3, and finally Section 4 concludes the paper.

2  Proposed Method

In this section we describe the logistic map (one of the simplest chaotic maps) and the Genetic Algorithm utilized in the proposed method.

2.1  Chaos

The chaos phenomenon is a deterministic and seemingly stochastic process appearing in a nonlinear dynamical system. Because of its extreme sensitivity to initial conditions and the spreading of orbits over the entire space, it has been used in information hiding to increase security [6]. The logistic map is described by

x_{n+1} = λ · x_n · (1 − x_n)    (1)

where 0 ≤ λ ≤ 4 and x_n ∈ (0, 1). Research on chaotic dynamical systems shows that the logistic map stands in a chaotic state when 3.5699456 < λ ≤ 4; that is, the sequence {x_n, n = 0, 1, 2, ...} generated by the logistic map is non-periodic and non-convergent. All the sequences generated by the logistic map are very sensitive to the initial conditions, in the sense that two logistic sequences generated from different initial conditions are statistically uncorrelated.

2.2  Genetic Algorithm

In this paper the genetic algorithm is used to obtain optimized chaos parameters (x0, λ). (x0, λ) is used as a key at the sender for embedding the message, and the same key is used for extracting the message at the recipient. Without the key, the message cannot be extracted. The genetic algorithm in our proposed method is described below.
Representation: each individual of the population is described as a pair (x0, λ), in which x0 and λ are each a gene of one chromosome.
Initial population: the initial population is generated randomly. Considering that the logistic map stands in a chaotic state when 3.5699456 < λ ≤ 4, the λ value is taken from the range [3, 4] and the x0 value from the [0, 1] interval for each chromosome.
Fitness function: since the target is to minimize the changes between the cover image and the stego image, the Peak Signal-to-Noise Ratio (PSNR) is considered as the fitness function, which measures the image changes before and after the message embedding. Greater values of PSNR correspond to less change in the output image. The PSNR function is defined by the following relationship:

PSNR = 10 · log10( 255² / ( (1/(M·N)) · Σ_{i=1}^{M} Σ_{j=1}^{N} (x(i, j) − x′(i, j))² ) )    (2)

in which M and N are the image dimensions and i and j are the current pixel locations.
Selection: we always choose the half of the population chromosomes with better fitness for generating the population of the next iteration.
Crossover: we use one-point crossover and arithmetic crossover. Based on the empirical results we choose the crossover probability from the (0.6, 0.9) interval.
Mutation: the empirical results indicate that small changes in x0 result in large changes in the sequence generated by the logistic map, while small changes in λ result in small changes in the generated sequence. Because the changes produced by the mutation operator should be random and rather small, we only mutate the second gene, i.e. λ, and take the mutation probability equal to 0.1.
Survivor selection: considering that we are seeking only one optimized value (x0, λ), we always keep the best individual of each generation and replace the rest of the individuals with offspring. Figure 1 shows the flowchart of the proposed method for obtaining the optimized (x0, λ).
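To make the fitness evaluation concrete, here is a small illustrative sketch of the logistic-map sequence and the PSNR fitness of Eq. (2); the bit-shuffling scheme is an assumption based on the description above, and this is not the authors' implementation.

import numpy as np

def logistic_sequence(x0, lam, n):
    """Generate n values of the logistic map x_{k+1} = lam * x_k * (1 - x_k), Eq. (1)."""
    xs = np.empty(n)
    x = x0
    for k in range(n):
        x = lam * x * (1 - x)
        xs[k] = x
    return xs

def shuffle_bits(bits, x0, lam):
    """Shuffle message bits in an order derived from the chaotic sequence (assumed scheme)."""
    order = np.argsort(logistic_sequence(x0, lam, len(bits)))
    return [bits[i] for i in order], order

def psnr(cover, stego):
    """Fitness of Eq. (2): PSNR between the cover and stego images (8-bit gray levels)."""
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)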

2.3  Embedding and extracting algorithms

Embedding algorithm: as shown in Figure 2, after obtaining the best optimized (x0, λ) by the GA, the embedding is performed using the well-known LSB replacement.
Extracting algorithm: in extraction we use (x0, λ) as the key. Using (x0, λ) and the logistic map, the hidden message is extracted and retrieved (Figure 3).
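A minimal sketch of LSB embedding and extraction along the chaos-derived order (our own illustration; pixels are assumed to be visited in raster order, which the paper does not specify) is:

import numpy as np

def lsb_embed(cover, shuffled_bits):
    """Replace the least significant bit of the first len(bits) pixels (raster order, assumed)."""
    stego = cover.copy().ravel()
    for pos, bit in enumerate(shuffled_bits):
        stego[pos] = (stego[pos] & 0xFE) | bit   # clear the LSB, then set it to the message bit
    return stego.reshape(cover.shape)

def lsb_extract(stego, n_bits, order):
    """Read back n_bits LSBs and undo the chaotic shuffle using the stored order (the key)."""
    flat = stego.ravel()
    shuffled = [int(flat[pos]) & 1 for pos in range(n_bits)]
    message = [0] * n_bits
    for dst, src in enumerate(order):            # order[dst] tells where bit dst came from
        message[src] = shuffled[dst]
    return message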


Figure 1: Embedding algorithm

Figure 2: Extracting algorithm

Figure 3: Output chart of the genetic algorithm for an example

Figure 4: Stego images obtained from the LSB replacement method [11]

Figure 5: Stego images obtained from the method presented in [11]

Experimental Results

In each iteration of the proposed algorithm we obtain different values, the best of which is carried to the next generation, and the (x0, λ) value of the final generation is returned as the final value. For example, in Figure 4 the output of the genetic algorithm for a population of 100 individuals and 10 iterations is shown. The test images are 128 x 128 pixels and the message length is 1500 bits. The numbers on the horizontal axis indicate the generations and the numbers on the vertical axis indicate the PSNR of the best individual in each generation.

Figure 7: Stego images obtained from our proposed method

Figures 5, 6 and 7 illustrate the comparison of the classic LSB method and the method used in [11] with the proposed method. The results of Figure 7 were obtained from a population of 100 individuals and 10 generations. These figures show that the results of the proposed method are better than the other depicted ones regarding subjective evaluation and the resulting PSNR values.


Conclusion

In this paper we presented a new image steganography method for gray-level images based on LSB replacement. In this method chaos is used for shuffling the message bits, and its parameters are set by a genetic algorithm. This results in minimal changes to the image, which also increases the security level of the method. The experimental results indicate that our method, in addition to having a high embedding capacity, maintains the initial statistics of the image in a satisfactory way, and the changes it introduces are visually imperceptible to the human visual system.

References

[1] R. Chandramouli, Image steganography and steganalysis: Concepts and practice, Springer-Verlag Berlin Heidelberg (2004).

[2] H. Wang and S. Wang, Cyber warfare: steganography vs. steganalysis, Communications of the ACM (October 2004).

[3] T. Morkel, J.H.P. Eloff, and M.S. Olivier, An overview of image steganography, Information and Computer Security Architecture (ICSA) Research Group, Pretoria, South Africa (2005).

[4] Kefa Rabah, Steganography - The art of hiding data, Information Technology Journal, Asian Network for Scientific Information (2004), 245-269.

[5] A. Cheddad, J. Condell, K. Curran, and P.M. Kevitt, Digital image steganography: Survey and analysis of current methods, School of Computing and Intelligent Systems, University of Ulster at Magee, Londonderry, United Kingdom (2010).

[6] L. Yu, Y. Zhao, R. Ni, and T. Li, Improved adaptive LSB steganography based on chaos and genetic algorithm, EURASIP Journal on Advances in Signal Processing, Hindawi Publishing Corporation (2010).

[7] R. Khakpoor, A.M. Eftekhari Moghadam, and H. Nabaee, A secure steganography method based on genetic algorithm to increase the volume of data embedding, 5th Iranian Conference on Machine Vision and Image Processing (2008).

[8] H.C. Huang, C. Chu, and J.S. Pan, The optimized copyright protection system with genetic watermarking, Springer-Verlag, Soft Computing (2009).

[9] H.C. Huang, J.S. Pan, Y.H. Huang, F.H. Wang, and K.C. Huang, Progressive watermarking techniques using genetic algorithms, Circuits, Systems, and Signal Processing, Birkhauser Boston (2007), 671-679.

[10] Zhen Liu and Lifeng Xi, Image information hiding encryption using chaotic sequence, Springer-Verlag Berlin Heidelberg (2007), 202-207.

[11] W.J. Chen, C.C. Chang, and T.H. Ngan Le, High payload steganography mechanism using hybrid edge detector, Expert Systems with Applications, Elsevier (2010), 3292-3299.

[12] Veysel Aslantas, A singular value decomposition based image watermarking using genetic algorithm, AEU - International Journal of Electronics and Communications (2007).

[13] Zhao Dawei, Chen Guanrong, and Liu Wenbo, A chaos-based robust wavelet-domain watermarking algorithm, Chaos, Solitons and Fractals (2004), 202-207.


Providing a CACP Model for Web Services Composition


Mehregan Mahdavi

Parinaz Mobedi

Faculty of Engineering

Faculty of Engineering

Department of Computer Engineering

Department of Computer Engineering

University of Guilan

University of Guilan

p.mobedi@gmail.com

mahdavi@guilan.ac.ir

Abstract: Nowadays, quick response to customer requirements is considered a major challenge. Users' demands change over time and organizations have to change their systems according to the new user requirements. The concept of service in a service-oriented architecture may address this issue: when a new request is put forward, it can be resolved by taking advantage of existing services and combining them. In this paper, we present the CACP framework for web services composition. The aim of our model is to show the whole scenario, from the moment a user requests a desired web service until a full service is created, with respect to system analysis and design.

Keywords: Web Service; Web Services Composition; QoS; GUI; UML.

Introduction

Web services are defined by the W3C as software systems that interact over a network from one machine to another. Web services can be described, advertised and discovered using (XML-based) standard languages, and interacted with through standard internet protocols. Web services can be composed with each other in the context of inter-organisational business processes, leading to composite services. Service composition provides a number of benefits; one of them is decreasing the cost and risk of building new business applications, in the sense that existing business logic is represented as web services and can be reused.
Most composition languages are based on XML. However, they are hard to write and understand. Therefore, we need standard graphical languages like UML. There are many reasons for using UML for service composition; one of the obvious reasons is that UML models are independent of the executable composition language and can be transformed directly into a composition specification [1]. UML has different diagrams for modeling software, some of which model the static aspect of the system and others the dynamic aspect. Our proposed framework contains an analysis phase which includes a class diagram and a sequence diagram to model both the static and dynamic aspects of the system. The article is organized as follows: Section 2 describes related work; Section 3 proposes the CACP framework; Section 4 presents a case study and finally concludes the paper.

2  Related Work

Self-Serv uses state charts for service composition. Self-Serv creates a state coordinator for each state in the state chart. The work is explained with a Travel Planning Service as a case study; in this approach just the behavioral aspect of a web service is considered [2]. In the approach introduced by Skogan, class diagrams and activity diagrams are employed; the methodology is explained with a gas dispersion emergency as a case study [2]. Gronmo and Jaeger [6] explain four phases for the development of a web service composition: modeling, discovery, selection and deployment of the composite web service. In this approach control and data flow are shown by activity diagrams while ontology concepts are represented by class diagrams. The authors discuss a case study of Express Congo Buy to show their approach. In the approach proposed by John and Gerald [7], class diagrams are used to specify the structure of the composite web service and activity diagrams are used for compositions; the UML specifications are transformed to OWL-S specifications. This approach is presented with Find Cheaper Book as a case study.

Proposed Framework

This architecture is composed of four entities: Client, System Analysis, Composition Broker and Service Provider. These entities of the proposed framework work with UDDI to satisfy the user's QoS requirements. The structure of the proposed framework and the relations between the various components are depicted in Figure 1. The framework is named CACP, an abbreviation of its four main entities (C = Client, A = system Analysis, C = Composition broker, P = service Provider). In the following we describe each of the components.

1. Service Provider: service providers publish their service descriptions (i.e., WSDL) in UDDI. UDDI is designed to integrate SOAP messages and provide access to WSDL documents.

2. Client: the user announces the needed service; this is performed through the GUI. The user request contains both functional and non-functional requirements.

3. System Analysis: this component takes the high-level user requirements and, after extracting and analyzing them, creates a class diagram and a sequence diagram. In UML, classes are displayed by a class name, a list of attributes and a list of methods. The benefit of using the class diagram is that the name of a class is a service that must be discovered in UDDI, the methods of the class describe the functionality that must be performed by the service, and the QoS information can also be found in the class diagram. It is also useful to use a sequence diagram besides the class diagram, since the class diagram models the static aspect of the system while the sequence diagram models its dynamic aspect.

4. Match Function: the main task of this component is service discovery. To do this, the Match Function takes the class diagram as input and then sends a request containing the class name and the methods with their input and output parameters to UDDI.

5. Classification: the input of this step is the table of appropriate services obtained in step 4. This component puts related services in one category and then creates one agent for each category.

6. Select Candidate: the best candidate in each category should be identified. At this step the best candidate is chosen according to the non-functional requirements such as QoS.

7. Composition: the best service candidates of the categories need to collaborate. How to establish this collaboration lies at the heart of service composition. In this step the coordination between web services is represented by the sequence diagram.

8. Execute engine: the task of this component is the execution of the composite service created in step 7; the result is returned to the user after the execution.

9. Generate WSDL: when a service composition is completed, the WSDL of the new web service should be created and then published in UDDI, so that if the request is repeated the new service can be detected in the discovery phase.

10. Web Service: web services can be atomic or composite.

11. Agent: an agent is created for the web services that do the same work.

The Composition Broker comprises multiple components, namely steps 4-9.

3.1  Receive User Requirement and System Analysis

At first the client enters the required services through a GUI form. This GUI allows clients to select and compose their own services. The first form of the GUI contains 3 steps: 1) select web services from among the available web services; 2) identify the functionality of each web service; 3) specify the sequence of the web services. In the second form, information based on the previous step is shown, and the client can customize the selection criteria in order to find the best services for the task. In the analysis part, based on the user's input, a class diagram for the service composition and a sequence diagram for the coordination between the services are created. The sequence diagram has a controller whose task is to distribute the work among the web services.


Figure 1: Framework CACP for web services composition

3.2  Relationship Between Web Services and Agent

After the Match Function component sends a user request to UDDI, multiple services may be found by UDDI that can fulfill the client's need. So the main functionality of the Classification component is to take the table from the previous step as input, put the candidate services that are capable of performing the same work into a category, and then generate an agent for each category. Each agent has a table that contains the service id, method, input and output parameters, reliability and availability. The agent is in charge of selecting the best candidate by utilizing this table; in selecting a candidate it should be considered that the QoS information of the best candidate web service is closest to the user's demand. If there is a significant change in the current QoS information of a web service, the new QoS value should be notified to the agent. The agent becomes aware of the QoS change by receiving an XML-based message, after which it updates the QoS records in its table.

Case Study

In this part we review our approach for planning a trip.
Step 1. First the user fills out the required information via a form; the proposed GUI is shown in Figure 2. In part (a) of Figure 2, the user checks the required web services among the existing web services. Since each web service may have multiple functions for different tasks, the functionality of each web service is determined, and after that the sequence of execution of the web services is defined: if all three web services are in a row it means that they should be run in sequential order, but if they are in a column it means they should be executed concurrently. In part (b) of Figure 2 the form is shown based on the user's web service selection.

Figure 2: Receiving the user's request from the GUI ((a) first form, (b) second form)


Table 1: Match Function's table for planning the trip

Class name   Method           Input parameters                                                                   Output parameters
flight       Booking()        String From, String to, Date Depart-date, Date Return-date                         Num SeatNo, Num FlightNumber, String flightBooking-Id
car          CarRenting()     String departure-address, Date date-leave, Date date-back, String arrive-address   String carBooking-id
hotel        HotelBooking()   String name-hotel, String hotel-city, Date date-receive, Date date-back            String hotelBooking-id

Table 2: Candidate services returned for each request

Name of service   Candidate services
flight            MahanAirBooking, AsemanAirBooking
car               bestRentcar, LuxeRentcar
hotel             Mashhad-3Star-hotelBooking, Mashhad-5Star-hotelBooking

Figure 3: Class diagram for whole planning trip

Step 2. With respect to the user's data, in this stage both the class diagram and the sequence diagram are built. The class diagram for the whole trip planning is shown in Figure 3. At first, the interface object class gets the user's request, which is then sent to the control object; the controller splits the user's request into small messages and sends each message to the entity that can accomplish the work. Here an entity models a web service. In this class diagram the functional requirements are put in the methods section of the classes and the non-functional requirements are put in the QoS Metric class. The sequence diagram for the whole trip planning is shown in Figure 4; this diagram represents how one object (web service) interacts with another until the task is complete, and the communication between the web services is managed by the controller object.


Figure 4: Sequence diagram for whole planning trip

Step 3. After creating the class diagram, the task of the Match Function is to send a request to UDDI; the request includes the class names, methods and parameters, as shown in Table 1.
Step 4. For each request, UDDI returns a number of candidate services, as shown in Table 2.
Steps 5 and 6. The input of the Classification component is the table obtained in step 4. The task of this component is to put related services into one category and then create an agent; the agent then selects the best candidate from its category.
Step 7. The best candidate services can be placed together based on the sequence diagram. First the user communicates with the system by means of the UI object, and the controller object acts as a single point of contact between the UI object and the web services. Now the relationship between the best candidate web services is formed and the controller object coordinates between them.
Step 8. At the end, the new web service composition should be translated to BPEL4WS; how to map a sequence diagram to BPEL4WS is described in [4].
Step 9. After the new web service composition is complete, the WSDL for the new web service should be created and published in the UDDI registry. The class diagram is used for creating the WSDL; the rules for converting UML to WSDL are expressed in [5].


Conclusion

In this paper a new framework for web services composition is proposed. This framework is made up of four main parts: in the first part the user interacts with the GUI; the second part is the analysis phase; the third and fourth parts contain service discovery and service classification; and then the service composition is done through the sequence diagram. Finally, the work is explained with trip planning as a case study.


References
[1] D. Skogan, R. Gronmo, and I. Solheim, Web Service Composition in UML, 8th IEEE Intl Enterprise Distributed Object Computing Conference (2004), 47-57.

[2] I. Rauf, M. Zohabib, and Z. Malik, UML based Modeling of Web Service Composition - A Survey, Sixth International Conference on Software Engineering Research, Management and Applications (2008).

[3] E. Badidi, A Publish/Subscribe Model for QoS-aware Service Provisioning and Selection, International Journal of Computer Applications 26 (2011).

[4] B. Bauer and J. Muller, MDA applied: From Sequence Diagrams to Web Service Choreography, in N. Koch, P. Fraternali, M. Wirsing (eds.), Web Engineering, Lecture Notes in Computer Science (2004), 136-148.

[5] R. Gronmo, D. Skogan, I. Solheim, and J. Oldevik, Model-driven Web Services Development, IEEE International Conference on e-Technology, e-Commerce and e-Service (2004), 42-45.

[6] R. Gronmo and M. Jaeger, Model-driven semantic web service composition, In Proc. 12th Asia-Pacific Software Engineering Conference (APSEC) (2005).

[7] T.E.M. John and C.G. Gerald, Specifying Semantic Web Service Compositions using UML and OCL, IEEE International Conference on Web Services (ICWS) (2007).

Using Collaborative Filtering for Rate Prediction


Sonia Ghiasifard

Amin Nikanjam

Technical college of Dr.Shariaty

Iran University of Science and Technology

Department of Computer engineering

School of Computer engineering

s.ghiasifard@gmail.com

nikanjam@iust.ac.ir

Abstract: Recommendation systems are designed to allow users to locate preferable items quickly and to avoid possible information overload. Recommendation systems apply data mining techniques to determine the similarity among thousands or even millions of data items. Collaborative filtering is one of the most successful recommendation techniques. The basic idea of CF-based algorithms is to provide item recommendations or predictions based on the opinions of other like-minded users. In this paper, we propose an approach to predict user ratings on new items for users of the Yahoo Music Dataset. The proposed approach is based on collaborative filtering and consists of seven different methods. We combine the results of these methods using a linear blending model. To evaluate the accuracy of the predictions, the Mean Absolute Error (MAE) is reported.

Keywords: recommendation systems; collaborative filtering; prediction; rate; Yahoo Music Dataset.

Introduction

Nowadays, the users' challenge is finding the right content on the internet, content which answers their interests and needs; actually some users may not even know what to look for. Recommender systems are made for these cases, to present recommendations to users and predict their future trends based on their interests, as demonstrated by some past activities on the relevant sites. Collaborative filtering (CF) is one of the most promising recommendation techniques. It works by discovering the correlations between users and items based on the observed user preferences (i.e., ratings, clicks, etc.), so that unobserved user preferences can be interpolated from the observed ones. For example, the well-known user-based CF algorithm first finds the similarities between users based on their past ratings; then a target user's rating on a new item can be predicted from the ratings of that item by other similar users, also known as the neighborhood. This paper starts with a brief explanation of traditional recommender approaches. A description of the structure of the Yahoo Music dataset is presented in Section III. Then we describe some CF methods implemented on this dataset and our proposed methods inspired by user behaviors and concepts of CF in Section IV. Finally we explain our blending method, which unifies the results of the implemented methods, in Section V. Concluding remarks are discussed in the last section of our work.

2  Background

2.1  Traditional Recommender Approaches

Recommender systems are tools for filtering and sorting items and information. They use the opinions of all users to help individuals effectively identify their interest from a potentially overwhelming set of choices. Recommender systems are usually categorized into the three approaches described below.

- Content-based filtering: This approach is based on textual information such as documents. Items are typically described with keywords. By using nearest-neighbor functions or clustering methods, the recommendation system is able to analyze these keywords and the document content and use them as a basis to recommend a suitable item. Recommended items are those that are most similar to the positively rated ones.

- Collaborative filtering: The main idea of collaborative filtering is finding users that share appreciations [7]. If two users have the same or almost the same rated items, then they have similar tastes. Such users are called neighbors. A user gets recommendations for items it has not rated before but that were rated positively by its neighbors.


- Hybrid recommender systems: To achieve better results, these recommender systems combine different techniques of collaborative and content-based approaches.

2.2  Evaluating accuracy

The mean absolute error (MAE) is the quantity we used to measure how close the predictions are to the true ratings of users. The mean absolute error is the average of the absolute errors e_j = |y′_j − y_j|, where y′_j is the prediction and y_j the true value, and is given by (1):

MAE = (1/n) · Σ_j |y′_j − y_j|    (1)

2.3  Notation

We write u for a user and i for an item. The symbol k is the set of items and k|u| is the set of items that user u has rated. Predictions for user/item pairs are denoted r_{u,i}. In the methods we also consider parameters p_u as a user-dependent feature and q_i as an item-dependent feature.

3  Dataset

We used the Yahoo Music Dataset [1] for training the methods and evaluating the prediction errors. The Yahoo Music dataset comprises 262,810,175 ratings of 624,961 music items by 1,000,990 users, collected during 1999-2010. Each item and each user has at least 20 ratings in the whole dataset. All ratings were split into train, validation and test sets, such that the last 6 ratings of each user were placed in the test set and the preceding 4 ratings were used in the validation set; the train set consists of all earlier ratings (at least 10). The total sizes of the train, validation and test sets are therefore 252,800,275, 4,003,960, and 6,005,940 respectively. The ratings are integers between 0 and 100. A distinct feature of this dataset is that each user's ratings are given to entities of four types: genre, artist, album and track. As shown in Figure 1, the most frequent ratings in this dataset are 0, 30, 50, 70 and 90, the second most frequent are 10, 20, 40, 60, 80 and 100, and the remaining ratings are the numbers not divisible by 10.

Figure 1: Rating distribution [1]

4  Algorithms

We have implemented different methods, some innovative and some already known, to predict the ratings of users on given items. In the following, some previously used collaborative filtering methods are described. All of these methods are trained using the train set of the Yahoo Music Dataset. To provide a comprehensive comparison, the MAE measured on the validation set of each model is reported to evaluate the accuracy of our predictions.

4.1  AC Method

The AC (Adjusted-Cosine) method predicts the rating $\hat{r}_{u,i}$ of user u for a new item i using the ratings that the users in the set U gave to the items that are similar to i. In this approach we used the adjusted-cosine formula to compute the similarity adjst(i, j) between items [4]. This measure uses the difference between the rating of an item $r_{u,i}$ and the average rating of the user $\bar{r}_u$; each pair in the co-rated set corresponds to a different user. For all items that the user has rated, we used the weighted-sum formula (3) to predict the ratings of users on items. To address the computational challenges arising from the huge number of users, we split the users into N parts; in our experiment we set N = 1000. With this method we get MAE = 18.334 on the validation set.

$$adjst(i,j) = \frac{\sum_{u \in U}(r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U}(r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in U}(r_{u,j} - \bar{r}_u)^2}} \qquad (2)$$

$$\hat{r}^{(1)}_{u,i} = \frac{\sum_{N \in k|u|} adjst_{i,N}\, r_{u,N}}{\sum_{N \in k|u|} |adjst_{i,N}|} \qquad (3)$$
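A minimal sketch of the adjusted-cosine similarity (2) and the weighted-sum prediction (3) on a small dense rating matrix; the function and variable names are ours, and the user-splitting trick used for scalability is omitted.

```python
import numpy as np

def adjusted_cosine(R):
    """R: users x items rating matrix (np.nan for missing). Returns item-item similarity as in (2)."""
    user_mean = np.nanmean(R, axis=1, keepdims=True)
    D = np.where(np.isnan(R), 0.0, R - user_mean)   # user-centered ratings, 0 where unrated
    norms = np.sqrt((D ** 2).sum(axis=0))            # per-item norms
    sim = D.T @ D / np.outer(norms, norms)
    return np.nan_to_num(sim)

def predict(R, sim, u, i):
    """Weighted sum (3) over the items user u has rated."""
    rated = ~np.isnan(R[u])
    w = sim[i, rated]
    denom = np.abs(w).sum()
    return R[u, rated] @ w / denom if denom > 0 else np.nanmean(R)

R = np.array([[90., 70., np.nan],
              [50., np.nan, 30.],
              [70., 90., 50.]])
S = adjusted_cosine(R)
print(round(predict(R, S, 0, 2), 2))   # predicted rating of user 0 for item 2
```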

4.2 AC+Avg

This method is quite similar to the previous one; the difference is that we also exploited the mean rating of the items $\bar{r}_i$ when calculating the final prediction. In this way we get MAE = 22.3457, which did not help much.

$$\hat{r}^{(2)}_{u,i} = \frac{\hat{r}^{(1)}_{u,i} + \bar{r}_i}{2} \qquad (4)$$

4.3 CB Method

This approach predicts the rating $\hat{r}_{u,i}$ of user u for a new item i. In this method we used the correlation-based formula (5) to compute the similarity $corr_{i,j}$ between items, which uses the difference between the ratings of the users for an item and the average rating of that item $\bar{r}_i$.

$$corr_{i,j} = \frac{\sum_{u \in U}(r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U}(r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U}(r_{u,j} - \bar{r}_j)^2}} \qquad (5)$$

Like the AC model, we used the weighted-sum formula to predict the ratings of users on items and, to overcome the computational challenges, we split the users into N parts; in our experiment we set N = 1000. With this model we get MAE = 24.0286 on the validation set. The problem of this model is items that have not been rated by users or whose number of ratings is too small.

$$\hat{r}^{(3)}_{u,i} = \frac{\sum_{N \in k|u|} corr_{i,N}\, r_{u,N}}{\sum_{N \in k|u|} |corr_{i,N}|} \qquad (6)$$

4.4 UserTaste

In the Yahoo Music Dataset there are some users with special behavior: the ratings of these users have a standard deviation greater than or equal to a threshold $\theta$. For these users we extract some additional information from the train set. We find the relationships between the items (genre, artist, album and track) they have rated and assign the same rating to related items. For the other users we used the AC method to predict their ratings. With this method we get MAE = 18.6343 with $\theta$ = 44 and MAE = 18.6388 with $\theta$ = 50 on the validation set.

$$\hat{r}^{(4)}_{u,i} = \begin{cases} \hat{r}^{(1)}_{u,i} & dev < \theta \\ \hat{r}^{(UserTaste)}_{u,i} & dev \ge \theta \end{cases} \qquad (7)$$

4.5 TypeTaste

In this model, as in the previous one, we extract additional information for users whose rating standard deviation is greater than $\theta$. Here we find the item types that each user prefers. Preferred items are those which the user usually rates 43 or higher; in this way we find the types (genre, album, artist, track) that the user likes most and assign them the average rating of that user. For the other users we used the AC model to predict the ratings. Using this model we get MAE = 18.2149 with $\theta$ = 44 on the validation set.

$$\hat{r}^{(5)}_{u,i} = \begin{cases} \hat{r}^{(1)}_{u,i} & dev < \theta \\ \hat{r}^{(TypeTaste)}_{u,i} & dev \ge \theta \end{cases} \qquad (8)$$

4.6 Round

As stated previously in the Dataset section, most ratings in the Yahoo Music Dataset are multiples of 10; therefore in this model we convert all predictions calculated with the TypeTaste method, $\hat{r}^{(5)}_{u,i}$, to the nearest available multiple of 10. To do so, we used formula (9) to round the predictions, where k|val| is the set of user/item pairs in the validation set, and a is the integer part and b the fractional part of $\hat{r}^{(5)}_{u,i}/10$ written as a + ib. With this method we get MAE = 18.04 on the validation set.

$$\hat{r}^{(6)}_{u,i} = round\big(\hat{r}^{(5)}_{u,i}\big), \quad \forall\, (u,i) \in k|val| \qquad (9)$$

$$a + ib = \frac{\hat{r}^{(5)}_{u,i}}{10} \qquad (10)$$

$$round\big(\hat{r}^{(5)}_{u,i}\big) = \begin{cases} a + 1 & b > 0.5 \\ a & b = 0.5 \\ a - 1 & b < 0.5 \end{cases} \qquad (11)$$
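For concreteness, a small sketch of snapping predictions to the nearest multiple of 10, which is the effect the Round model aims for; it uses ordinary arithmetic rounding rather than the explicit integer/fractional-part case analysis of (9)-(11), and the names are ours.

```python
def round_to_ten(prediction: float) -> int:
    """Snap a predicted rating to the nearest multiple of 10 within the 0-100 scale."""
    snapped = int(round(prediction / 10.0)) * 10
    return max(0, min(100, snapped))

print([round_to_ten(p) for p in (18.2, 44.9, 75.0, 96.3)])  # [20, 40, 80, 100]
```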

4.7 KNN+SVD

First we describe SVD and the k-nearest neighbor approach separately, and then we explain the KNN+SVD method, which uses both of these algorithms to predict ratings [6].

4.7.1 SVD

The Singular Value Decomposition (SVD) is a factorization of the rating matrix. Predictions for a user/item pair are given by formula (12), where $q_i$ is an F x 1 vector of item features, $p_u$ is an F x 1 vector of user features and F is the number of features.

$$\hat{r}_{u,i} = p_u^T q_i \qquad (12)$$

SVD has been one of the most popular models in collaborative filtering since the Netflix Prize [2] in 2006. In this method the prediction for one user/item pair is done in constant time O(1); using gradient descent as the learning algorithm, the training time grows linearly with the number of ratings |R|. For completeness we sketch the stochastic gradient descent training in Algorithm 1.

Data: Sparse rating matrix $R \in \mathbb{R}^{|U| \times |I|} = [r_{u,i}]$
Result: Values of $\hat{r}_{u,i}$
Tunable: learning rate $\eta$, regularization $\lambda$, feature size F
Initialize user weights p and item weights q with small random values;
while e < 0.1 do
    $\hat{r}_{u,i} \leftarrow p_u^T q_i$;
    $e \leftarrow r_{u,i} - \hat{r}_{u,i}$;
    for k = 1..F do
        $c \leftarrow p_{u,k}$;
        $p_{u,k} \leftarrow p_{u,k} + \eta\,(e\, q_{i,k} - \lambda\, p_{u,k})$;
        $q_{i,k} \leftarrow q_{i,k} + \eta\,(e\, c - \lambda\, q_{i,k})$;
    end
end
Algorithm 1: Pseudo code for training an SVD on rating data [6]

4.7.2 K-Nearest Neighbor

The k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on the closest training examples in the feature space. Neighborhood models can be applied effectively if the correlation matrix can be computed. The size of the correlation matrix depends on the number of items in the dataset: for an item-item KNN we should store all item-to-item correlations, but the Yahoo Music dataset has a huge number of items (about 624,961). Therefore calculating all of the correlations would need too much memory and take a long time. Calculating the correlations on-the-fly solves the memory problem but dramatically slows down the prediction time, and Pearson correlation is not an option because it is too slow on this dataset. To solve this problem we describe a practical way to build a neighborhood approach. The item-item neighborhood model uses similarities of items as weights to compute the prediction for a user/item pair. To make an item-item neighborhood model applicable, we compute the correlations on-the-fly using the inverse of the normalized Euclidean distance between two item feature vectors. The correlation $c_{i,j}$ between item i and item j with corresponding item features $q_i$ and $q_j$ is given by equation (13), which can be computed in constant time O(1), where F is the size of the features.

$$c_{i,j} = \frac{\sum_{k=1}^{F}(q_{i,k} - q_{j,k})^2}{\sqrt{\sum_{k=1}^{F} q_{i,k}^2}\,\sqrt{\sum_{k=1}^{F} q_{j,k}^2}} \qquad (13)$$

We also used a sigmoid function to map the correlations $c_{i,j}$ to $c'_{i,j}$ by introducing two new parameters, a scale and an offset.

$$c'_{i,j} = 0.5\,\tanh(scale \cdot c_{i,j} + offset) + 0.5 \qquad (14)$$

In order to predict the rating $\hat{r}^{(7)}_{u,i}$ we used the k-nearest neighbor algorithm to select the K ratings with the highest correlations to the item i. Hence we introduce the set of items k|u'|, where u' is the set of items rated by user u with the K highest correlations to item i (k|u'| is a subset of k|u|). Furthermore, we used the weighted sum of the ratings from user u multiplied by the $c'_{i,j}$ and normalized by the absolute sum of the $c'_{i,j}$. This gives the final prediction formula (15) for the item-item KNN with SVD features (KNN+SVD).

$$\hat{r}^{(7)}_{u,i} = \frac{\sum_{j \in k|u'|} c'_{i,j}\, r_{u,j}}{\sum_{j \in k|u'|} |c'_{i,j}|} \qquad (15)$$

This method requires features for all items, so we used the SVD to obtain the item features. The training process is as follows. First, the algorithm uses the SVD as feature learner. In the second step, the three meta-parameters scale, offset and neighborhood size K are optimized with the parameter searcher APT2 [3], a simple coordinate search. The optimization target is the linear combination of all available predictions within the current ensemble, with linear regression used as the linear combiner. The item-item KNN is more effective when it is applied to the residuals of another model. The meta-parameters are scale = 3.49, offset = -4.28 and K = 184. For the SVD we use F = 50, $\eta$ = 0.0003 and $\lambda$ = 0.1. The resulting error on the validation set is MAE = 14.303.
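A short sketch, under our own naming, of the on-the-fly correlation (13), the tanh mapping (14) and the prediction (15), given item feature vectors such as those learned by the SVD sketch above.

```python
import numpy as np

def mapped_correlation(qi, qj, scale=3.49, offset=-4.28):
    """Normalized squared feature distance (13) pushed through the sigmoid mapping (14)."""
    c = ((qi - qj) ** 2).sum() / (np.linalg.norm(qi) * np.linalg.norm(qj))
    return 0.5 * np.tanh(scale * c + offset) + 0.5

def knn_svd_predict(u_ratings, Q, i, K=2):
    """Weighted sum (15) over the K rated items with the highest mapped correlation to item i."""
    scored = [(mapped_correlation(Q[i], Q[j]), r) for j, r in u_ratings.items() if j != i]
    scored.sort(key=lambda t: t[0], reverse=True)
    top = scored[:K]
    denom = sum(abs(c) for c, _ in top)
    return sum(c * r for c, r in top) / denom if denom > 0 else 0.0

Q = np.array([[0.5, 1.0], [0.4, 1.1], [1.2, -0.3]])   # toy item feature vectors
u_ratings = {0: 90.0, 2: 30.0}                        # items already rated by the user
print(round(knn_svd_predict(u_ratings, Q, i=1), 2))
```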


Blending

To reduce the deviation of the predictions and make them more uniform, we blend the models to get better results. Our blending is a simple linear combination of the AC, CB, AC+Avg, average item rating and Round models, and finally the SVD+KNN model. In this way we get MAE = 15.4267 on the validation set.
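A minimal sketch of linear blending: the combination weights are fit by least squares on validation predictions; the matrix and variable names are ours.

```python
import numpy as np

def fit_blend(pred_matrix, targets):
    """Least-squares weights for a linear combination of model predictions (one column per model)."""
    weights, *_ = np.linalg.lstsq(pred_matrix, targets, rcond=None)
    return weights

def blend(pred_matrix, weights):
    return pred_matrix @ weights

# Columns could be, e.g., AC, CB and SVD+KNN predictions for the same user/item pairs.
preds = np.array([[72., 68., 75.], [31., 40., 28.], [55., 60., 52.], [88., 80., 91.]])
truth = np.array([70., 30., 50., 90.])
w = fit_blend(preds, truth)
print(np.round(blend(preds, w), 1))
```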

Conclusions

We combined several CF algorithms and new methods to improve the prediction accuracy of collaborative filtering on a huge dataset, with acceptable run time using just 3 GB of RAM and a Core i5 CPU; the results are given in Table 1. We plan to implement other CF methods and combine them for better predictions.

Table 1: Comparison of MAE results (a lower value means better prediction)

Method       MAE (10,000 users)    MAE (500,000 users)
AC           22.5684               18.3346
AC+Avg       25.49                 22.3457
CB           21.218                20.04
UserTaste    18.6343 (θ=44)        18.6388 (θ=55)
TypeTaste    18.2145               18.2149
Round        17.983                18.04
SVD+KNN      14.6937               14.303
Blending     16.276                15.496

References

[1] G. Dror, Y. Koren, and M. Weimer, Yahoo! Music Rating Dataset for the KDD Cup 2011 Track 1 (2011), 1-7.
[2] J. Bennett, S. Lanning, and N. Netflix, The Netflix Prize, KDD Cup and Workshop in conjunction with KDD (2007).
[3] A. Toscher and M. Jahrer, The BigChaos Solution to the Netflix Prize 2008, Technical report, commendo research and consulting (November 25, 2008), 1-17.
[4] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, Item-based Collaborative Filtering Recommendation Algorithms, University of Minnesota (2001), 285-295.
[5] A. Das, M. Datar, A. Garg, and S. Rajaram, Google News Personalization: Scalable Online Collaborative Filtering (2007), 271-280.
[6] M. Jahrer and A. Toscher, Collaborative Filtering Ensemble, 17th ACM Int. Conference on Knowledge Discovery and Data Mining (2011).
[7] G. Linden, B. Smith, and J. York, Amazon.com Recommendations: Item-to-item Collaborative Filtering, IEEE Internet Computing 7, Issue 1 (January 2003), 76-80.

A New Backbone Formation Algorithm For Wireless


Ad-Hoc Networks Based On Cellular Learning Automata
Maryam Gholami

Mohammad Reza Meybodi

Department Of Computer Engineering

Department Of IT And Computer Engineering

Islamic Azad University

Amirkabir University Of Technology

Qazvin, Iran

Tehran, Iran

Maryamgholami83@yahoo.com

Mmeybodi@aut.ac.ir

Ali Nourollah
Department Of Computer Engineering
Islamic Azad University
Qazvin, Iran

Abstract: In this paper, we propose an intelligent backbone formation algorithm based on Cellular
Learning Automata (CLA) in which a near optimal solution to the minimum CDS problem in Unit
Disk Graphs (UDG) is found. UDGs are used for modelling Ad-Hoc networks, and finding MCDS
in such graphs is a promising approach to construct an efficient virtual backbone in wireless AdHoc networks. The simulation results show that the proposed algorithm outperforms the existing
CDS-based backbone formation algorithms in terms of the backbone size.

Keywords: Wireless Ad-Hoc networks; Backbone formation; Cellular learning automata; Connected dominating
set.

Introduction

Wireless Ad-Hoc networks can be quickly deployed for many applications. Unlike wired networks, there is no physical backbone infrastructure in wireless Ad-Hoc networks. In such networks two hosts can communicate directly when they are within the range of each other, and otherwise they communicate indirectly by relaying packets through intermediate hosts. Constructing a virtual backbone in such networks significantly reduces the communication overhead, and CDS formation is a promising approach for constructing this virtual backbone [2].

Because of the dynamic nature of wireless Ad-Hoc networks, learning methods have been used widely in the operations of these networks. In this paper, a cellular learning automata based method is proposed to form a virtual backbone for wireless Ad-Hoc networks by finding a near optimal solution to the minimum CDS problem in the graph of the network. In energy constrained Ad-Hoc networks, the proposed method helps to extend the network lifetime, due to its smaller CDS compared to other CDS schemes, in terms of:

Giving better energy conservation;

Reducing the network traffic.

The rest of this paper is organized as follows. The next section introduces preliminary concepts. Section 3 reviews the related works. Section 4 describes the proposed algorithm, our simulation results are given in Section 5, and we draw our conclusions in Section 6.

Corresponding Author, T: (+98) 938 767-0549


Preliminary Concepts

In this section we describe the preliminary concepts used in this paper.

2.1 Unit Disk Graph

A wireless Ad-Hoc network can be modelled as a Unit Disk Graph (UDG) [1]. A graph G = (V, E) is a UDG if and only if its vertices can be put in one-to-one correspondence with equal-sized circles in a plane in such a way that an edge connects two nodes if and only if the corresponding circles intersect. In modelling a network by a UDG, nodes represent the individual hosts and an edge connects two nodes if the corresponding hosts are within the transmission range of each other.

2.2 Connected Dominating Set

For a graph G, a Dominating Set S is defined as a subset of V such that each node in V-S is adjacent to at least one node in S. Each node in the dominating set S is called a dominator node. A node of S is said to dominate itself and all its neighbours. A minimum DS (MDS) is a DS with the minimum cardinality; finding an MDS is NP-hard [7]. A Connected Dominating Set (CDS) C of a graph G is a dominating set whose induced subgraph is connected. A CDS with minimum cardinality is called a Minimum CDS (MCDS), which forms a virtual backbone in the graph [3-6]. Restricting the routing to the CDS results in a significant reduction in the message overhead associated with routing updates [1], by which the routing overhead can be reduced. Finding an MCDS is also NP-hard [7]. A sample UDG and one of its virtual backbones induced by a CDS are shown in Figure 1.

2.3 Learning Automata

A learning automaton [8] is an adaptive decision-making unit that improves its performance by learning how to choose the optimal action from a finite set of allowed actions through repeated interactions with a random environment. The action is chosen at random based on a probability distribution kept over the action-set, and at each instant the chosen action serves as the input to the random environment. The environment in turn responds to the taken action with a reinforcement signal, and the action probability vector is updated based on this reinforcement feedback.

The objective of a learning automaton is to find the optimal action from the action-set so that the average penalty received from the environment is minimized [8]. The environment can be described by a triple $E \equiv \{\alpha, \beta, c\}$, where $\alpha = \{\alpha_1, \alpha_2, ..., \alpha_r\}$ represents the finite set of inputs, $\beta$ denotes the set of values that can be taken by the reinforcement signal, and $c = \{c_1, c_2, ..., c_r\}$ denotes the set of penalty probabilities, where the element $c_i$ is associated with the action $\alpha_i$. The recurrence equations (1) and (2) define a linear learning algorithm by which the action probability vector p is updated. Let $\alpha_i(k)$ be the action chosen by the automaton at instant k.

$$p_j(n+1) = \begin{cases} p_j(n) + a\,[1 - p_j(n)] & j = i \\ (1-a)\,p_j(n) & \forall j,\ j \neq i \end{cases} \qquad (1)$$

when the taken action is rewarded by the environment (i.e. $\beta(n) = 0$), and

$$p_j(n+1) = \begin{cases} (1-b)\,p_j(n) & j = i \\ \frac{b}{r-1} + (1-b)\,p_j(n) & \forall j,\ j \neq i \end{cases} \qquad (2)$$

when the taken action is penalized by the environment (i.e. $\beta(n) = 1$). Here r is the number of actions that can be chosen by the automaton, and a(k) and b(k) denote the reward and penalty parameters, which determine the amount of increase and decrease of the action probabilities, respectively.
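As a small illustration of equations (1) and (2), a sketch of the linear reward-penalty update of an action probability vector; the parameter values and function names are ours.

```python
def update_probs(p, chosen, rewarded, a=0.1, b=0.05):
    """Linear reward-penalty update of an action probability vector (eqs. (1) and (2))."""
    r = len(p)
    new_p = []
    for j, pj in enumerate(p):
        if rewarded:                      # beta(n) = 0
            new_p.append(pj + a * (1 - pj) if j == chosen else (1 - a) * pj)
        else:                             # beta(n) = 1
            new_p.append((1 - b) * pj if j == chosen else b / (r - 1) + (1 - b) * pj)
    return new_p

p = [0.5, 0.5]
p = update_probs(p, chosen=0, rewarded=True)
print([round(x, 3) for x in p])   # action 0 becomes more probable; the vector still sums to 1
```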

Figure 1: Sample unit disk graph and the virtual backbone induced by the CDS

2.4 Cellular Learning Automata (CLA)

Cellular Learning Automata (CLA) [8] is a powerful mathematical model for many decentralized problems and phenomena. CLA is obtained by combining cellular automata and learning automata. The basic idea of CLA is to utilize learning automata to adjust the state transition probabilities of a stochastic CA. A CLA is a CA in which one learning automaton is assigned to every cell. This model is superior to a CA because of its ability to learn, and it is also superior to a single learning automaton because it is a collection of learning automata which can interact with each other to solve a particular problem.

2.5

Irregular Cellular Learning Automata (ICLA)

An Irregular Cellular Learning Automata (ICLA) [9] is a cellular learning automata (CLA) in which the restriction of the rectangular grid structure of traditional CLA is removed. This generalization is needed because there are applications, such as wireless sensor networks, immune network systems and graph related applications, that cannot be adequately modelled with rectangular grids. An ICLA is defined as an undirected graph in which each vertex represents a cell equipped with a learning automaton. The learning automaton residing in a particular cell determines its state (action) on the basis of its action probability vector. Like a CLA, there is a rule under which the ICLA operates. The rule of the CLA and the actions selected by the neighbouring LAs of any particular LA determine the reinforcement signal to the LA residing in a cell. The neighbouring LAs of any particular LA constitute the local environment of that cell. The local environment of a cell is non-stationary because the action probability vectors of the neighbouring LAs vary during the evolution of the ICLA.

2.6 Open Cellular Learning Automata (OCLA)

The CLA studied so far are closed, because they do not take into account the interaction between the CLA and external environments. Open CLA (OCLA) [10], in which the evolution of the CLA is influenced by external environments, has been introduced. Two types of environments can be considered in the open CLA: the global environment and the exclusive environment. Each CLA has one global environment that influences all cells, and an exclusive environment for each particular cell.

Related Works

The MCDS problem is NP-hard, and so several approximation algorithms have been proposed to find a near optimal solution to this problem in a reasonable time. Guha and Khuller [11] proposed two greedy centralized algorithms to construct a CDS in a general graph G. The idea of the first algorithm is to build a spanning tree T rooted at the node with maximum degree and grow T until all nodes are added to T; the non-leaf nodes in T then form a CDS. The second algorithm is an improvement of the first one and consists of two phases: the first phase constructs a DS and the second phase connects the DS nodes using a Steiner tree algorithm.

Butenko et al. [12] also proposed a prune-based heuristic algorithm for constructing the MCDS. In this algorithm, the connected dominating set is initialized to the whole node set of the graph and then each node is examined to determine whether it should be removed or retained. If eliminating a given node disconnects the induced subgraph of the connected dominating set, it is retained; otherwise it is removed. Wu and Li [13] proposed an algorithm that determines a CDS using a marking process: if a node has two unconnected neighbours, it is marked true, and at the end all the marked nodes form a CDS. The authors also introduce some pruning rules to reduce the size of the CDS. Several distributed algorithms have also been proposed for solving the MCDS problem, such as the algorithm proposed by Alzoubi et al. in [14], which forms a minimum CDS in a wireless Ad-Hoc network in two phases: in the first phase a maximum independent set (MIS) is constructed, and in the second phase the CDS is generated by adding intermediate nodes to the MIS.

Li et al. [15] proposed a greedy algorithm for constructing a CDS in the network topology graph of a wireless Ad-Hoc network. The first step of the proposed algorithm focuses on constructing an MIS of the graph. In the second step, a Steiner tree is approximated with the minimum number of Steiner nodes to interconnect the nodes in the MIS. In [16], Xie et al. proposed a distributed approximation algorithm to construct the MCDS in wireless sensor networks. In their algorithm, the network is modelled as a hierarchical graph; at each level of this graph, a selected set of nodes serves as the message hubs (to route the messages) for the other nodes in the next level of the hierarchical graph. This algorithm uses a competition-based strategy to select the nodes at each level. Misra et al. in [17] proposed a new heuristic called collaborative cover, using two principles:

The domatic number of a connected graph is at least two;

Optimal substructure, defined as a subset of independent dominators, preferably with a common connector.

Many algorithms have also been proposed for solving the CDS problem. One can find a good survey of these algorithms in [18].

The Proposed Algorithm (OICLA-CDS)

The model we introduce and use in our proposed algorithm is a combination of open cellular learning automata and irregular cellular learning automata, which we call open irregular CLA (OICLA). The proposed model has all the features of ICLA; in addition, the reinforcement signal provided for updating the action probability vector of any learning automaton is a combination of the global and local environment responses. These responses are combined via local rules. Finally, all of the learning automata update their action probability vectors based on the received signal. To solve the CDS problem, we apply the open irregular cellular learning automata as a set of selector learning automata.

The OICLA is mapped onto the network such that each cell maps to a node of the network. Recall that the network graph is a unit disk graph, so the neighbourhood relation that we apply in the OICLA is defined by the graph edges. In this model, each node i has a learning automaton named LA_i, by which it can select itself as a dominated or dominating node. Therefore the action set of LA_i is a two-element set {a1, a2}: if LA_i selects action a1, node i is selected as a dominator node, and if LA_i selects action a2, node i is selected as a non-dominating node.

The action selection process continues until the number of stages exceeds a pre-specified threshold max-step.

In this algorithm, all nodes keep the neighbourhood information in adjacency matrix format. The learning automaton in each cell has an action probability vector that holds the probability of selecting each of the legal actions. At the start of the process, the probability of selecting each action is equal to 0.5. These probabilities are updated in the iterations of the algorithm, until the ending condition, based on the local rules defined in the cellular learning automata structure. Through the actions selected by the learning automata in the cells, the subset of CDS nodes in the graph is formed. The procedure of the proposed algorithm is shown in Algorithm 1.

Data: max step, ICLA mapped onto the network graph
Result: The minimum size CDS
while step < max step do
    foreach LA in parallel do
        Select an action depending on the action probability vector;
        Calculate beta;
        Update the action probability vector depending on beta;
    end
end
Algorithm 1: Procedure of the OICLA-CDS algorithm

Three local rules are used in the proposed algorithm, described as follows:

If a node has only one neighbour and its learning automaton does not select this node as a dominating node, it will receive a reward for its action;

If a node has at least one leaf neighbour and its learning automaton selects this node as a dominator node, it will receive a reward.

And the global rules are:

All of the nodes are dominated, meaning that each node has at least one dominator node as a neighbour;

The resulting subgraph is a connected subgraph;

The number of dominator nodes in this set is equal to or less than the number of dominator nodes in the previous iterations.

If these conditions hold, one more local rule is applied to decide on rewarding or penalizing the actions selected by the LAs: if a node has the maximum degree among its neighbours and is selected as a dominator node, its selected action will receive a reward.
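A rough sketch, under our own simplifying assumptions, of the main loop in Algorithm 1: each node keeps a two-action probability vector, picks the dominator or non-dominator action, and is rewarded when a degree-based rule holds. The reinforcement here is deliberately reduced to the single maximum-degree rule; the graph, parameter values and names are ours.

```python
import random

def oicla_cds_step(adj, probs, a=0.1, b=0.05):
    """One iteration: every node's LA picks an action, then is rewarded or penalized."""
    n = len(adj)
    dominator = [random.random() < probs[v][0] for v in range(n)]   # action a1 = dominator
    for v in range(n):
        max_degree = all(len(adj[v]) >= len(adj[u]) for u in adj[v])
        rewarded = dominator[v] == max_degree        # reward rule: highest-degree nodes should dominate
        chosen = 0 if dominator[v] else 1
        p = probs[v]
        if rewarded:
            p[chosen] += a * (1 - p[chosen])
            p[1 - chosen] *= (1 - a)
        else:
            p[chosen] *= (1 - b)
            p[1 - chosen] = b + (1 - b) * p[1 - chosen]   # r = 2, so b/(r-1) = b
    return dominator

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}   # small unit-disk-like graph
probs = {v: [0.5, 0.5] for v in adj}                 # initial action probabilities
for _ in range(50):
    oicla_cds_step(adj, probs)
print([v for v in adj if probs[v][0] > 0.5])         # nodes that learned to act as dominators
```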

Computational Results

To study the performance of our proposed algorithm for solving the CDS problem, we have conducted simulation experiments in two groups. The performance measure of the experiments is the CDS size. In our experiments we generate random connected graphs repeatedly and run the algorithms, measuring the size of the CDS. The size of the graph ranges from 60 to 200 nodes. To simulate the structure of Ad-Hoc networks, we place the nodes (hosts) randomly in a square simulation area of size 100 x 100 units. The coordinates of the nodes are chosen uniformly in each dimension. It is assumed that the transmission range of each host is 20 and the number of iterations is 100.

5.1 Experiment 1

Figure 2: The size of the CDS constructed by the algorithms, when the transmission range is 15.

In this experiment, we compare the results obtained from our proposed algorithm (OICLA-CDS) with the results of Butenko et al.'s algorithm [12], Li et al.'s algorithm [15], Xie et al.'s algorithm [16] and Misra's algorithm [17], in terms of the induced CDS size. The results are shown in Figure 2. From the figure it is obvious that the size of the CDS constructed by our proposed algorithm is comparable with those constructed by the best existing algorithms.

5.2 Experiment 2

Figure 3: The size of the CDS constructed by the algorithms, when the transmission range is 30.

We change the transmission range to 30 and repeat the same experiments. The results are shown in Figure 3. Like Figure 2, the results shown in Figure 3 indicate that the proposed algorithm (OICLA-CDS) considerably outperforms the other algorithms. Comparing the results shown in Figure 2 with Figure 3, we observe that for all algorithms the average size of the CDS becomes smaller when the transmission range increases. The reason for this reduction is that the number of neighbours of a dominator increases when the transmission range increases; therefore the graph nodes can be dominated by a smaller number of dominators. From Figures 2 and 3, it can also be seen that the gap between the curve for our proposed algorithm and the curves for the other centralized algorithms becomes significant as the number of nodes increases.


Conclusion

In this paper, we presented a method based on a new model of Cellular Learning Automata for solving the MCDS problem in unit disk graphs. Experimental results show that the proposed algorithm outperforms the best existing algorithms for finding the MCDS in UDGs.

References

[1] V. Bharghavan and B. Das, Routing in Ad-Hoc Networks Using Minimum Connected Dominating Sets, International Conference on Communications '97, Montreal, Canada (1997), 376-380.
[2] Y. Z. Chen and A. L. Liestman, Approximating minimum size weakly connected dominating sets for clustering mobile Ad-Hoc networks, MobiHoc 2002 (2002), 157-164.
[3] K. M. Alzoubi, P. J. Wan, and O. Frieder, Maximal independent set, weakly connected dominating set, and induced spanners for mobile Ad-Hoc networks, International Journal of Foundations of Computer Science 14 (2003), no. 2, 287-303.
[4] P. J. Wan, K. Alzoubi, and O. Frieder, Distributed construction of connected dominating set in wireless Ad-Hoc networks, INFOCOM 2002 3 (2002), 1597-1604.
[5] J. Wu, B. Wu, and I. Stojmenovic, Power-aware broadcasting and activity scheduling in Ad-Hoc wireless networks using connected dominating sets, Journal of Wireless Communications and Mobile Computing (2002), 425-438.
[6] H. Lim, C. Kim, and I. Stojmenovic, Flooding in wireless Ad-Hoc networks, Journal of Computer Communications (2001), 353-363.
[7] M. R. Garey and D. S. Johnson, Computers and Intractability: A guide to the theory of NP-completeness, Freeman, San Francisco, 1978.
[8] K. S. Narendra and K. S. Thathachar, Learning Automata: An Introduction, Prentice-Hall, New York, 1989.
[9] M. Esnaashari and M. R. Meybodi, A novel clustering algorithm for wireless sensor networks using Irregular Cellular Learning Automata, Telecommunication, IST 2008 20 (1998), no. 4, 330-336.
[10] H. Beygi and M. R. Meybodi, Open Synchronous Cellular Learning Automata, Advances in Complex Systems 10 (2007), no. 4, 527-576.
[11] S. Guha and S. Khuller, Approximation algorithms for connected dominating sets, Journal of Computer Communications 20 (1998), no. 4, 374-387.
[12] S. Butenko, X. Cheng, C. Oliveira, and P. M. Pardalos, A new heuristic for the minimum connected dominating set problem on Ad-Hoc wireless networks, Recent Developments in Cooperative Control and Optimization, Kluwer Academic Publishers (2004), 61-73.
[13] J. Wu and H. Li, On calculating connected dominating set for efficient routing in Ad-Hoc wireless networks, ACM DIALM 1999 (1999), 7-14.
[14] K. M. Alzoubi, X. Y. Li, Y. Wang, P. J. Wan, and O. Frieder, Geometric spanners for wireless Ad-Hoc networks, IEEE Transactions on Parallel and Distributed Systems 14 (2003), no. 4, 408-421.
[15] Y. Li, M. T. Thai, F. Wang, C. W. Yi, P. J. Wang, and D. Z. Du, On greedy construction of connected dominating sets, WCMC (2005).
[16] R. Xie, D. Qi, Y. Li, and J. Z. Wang, A novel distributed MCDS approximation algorithm for wireless sensor networks, Journal of Wireless Communications and Mobile Computing (2007).
[17] R. Misra and Ch. Mandal, Minimum Connected Dominating Set using a Collaborative Cover Heuristic for Ad-Hoc Sensor Networks, Wireless Communications & Mobile Computing, Distributed Systems of Sensors and Actuators archive 9 (2009), no. 3.
[18] Z. Liu, B. Wang, and L. Guo, A survey on connected Dominating Set Construction Algorithm for Wireless Sensor Networks, Information Technology Journal 9 (2010), no. 6, 1081-1092.


Solving Dominating Set Problem


In Unit Disk Graphs By Genetic Algorithms
Azadeh Gholami

Mahmoud Shirazi

Department of Computer and Information Sciences

Department of Computer and Information Sciences

Institute for Advanced Studies in Basic Sciences

Institute for Advanced Studies in Basic Sciences

Zanjan, Iran

Zanjan, Iran

azadehgholami@iasbs.ac.ir

m.shirazi@iasbs.ac.ir

Dr. Bahram Sadeghi Bigham


Department of Computer and Information Sciences
Institute for Advanced Studies in Basic Sciences
Zanjan, Iran
b sadeghi b@iasbs.ac.ir

Abstract: In this paper, we use Genetic Algorithms to find the Minimum Dominating Set (MDS)
of Unit Disk Graphs (UDG). UDGs are used for modelling Ad-Hoc networks and finding MDS in
such graphs is a promising approach to clustering the wireless Ad-Hoc networks. The MDS problem
is proved to be NP-complete. The simulation results show that the proposed algorithm outperforms
the existing algorithms for finding MDS in terms of the DS size.

Keywords: Wireless Ad-Hoc networks; Backbone formation; Genetic algorithm; Dominating set.

Introduction

Wireless Ad-Hoc networks can be quickly deployed for many applications. Unlike wired networks, there is no physical backbone infrastructure in wireless Ad-Hoc networks. In such networks two hosts can communicate directly when they are within the range of each other, and otherwise they communicate indirectly by relaying packets through intermediate hosts. Clustering the wireless Ad-Hoc network significantly reduces the communication overhead. Finding a dominating set (DS) is a promising approach for clustering wireless Ad-Hoc networks. A wireless Ad-Hoc network can be modelled as a Unit Disk Graph (UDG) [1]. A graph G = (V, E) is a UDG if and only if its vertices can be put in one-to-one correspondence with equal-sized circles in a plane in such a way that an edge connects two nodes if and only if the corresponding circles intersect. In modelling a network by a UDG, nodes represent the individual hosts and an edge connects two nodes if the corresponding hosts are within the transmission range of each other.

Corresponding Author, T: (+98) 241 415-5055


For graph G, Dominating Set, S, is defined as a


subset of V such that each node in V-S is adjacent to
at least one node in S. Each node in dominating set
S is called a dominator node. A node of S is said to
dominate itself and all its neighbours. A minimum DS
(MDS) is a DS with the minimum cardinality[7]. A
sample UDG and one of its dominating sets have been
shown in Figure 1.
Genetic Algorithm (GA) is a stochastic search method which is inspired by natural biological evolution. It was first proposed by John Holland [8]. D. E. Goldberg gave GA a new dimension by using it in search, optimization and machine learning [9]. The basic concept of GA is designed to simulate the processes in natural systems necessary for evolution, specifically those that follow the principle of survival of the fittest, laid down by Charles Darwin.


In this paper, a genetic algorithm based method is proposed for clustering wireless Ad-Hoc networks by finding a near optimal solution to the minimum DS problem in the graph of the network. In energy constrained Ad-Hoc and sensor networks, the proposed method helps to extend the network lifetime, due to its smaller DS compared to other DS schemes, in terms of:

Giving better energy conservation;

Reducing the network traffic.

The rest of this paper is organized as follows. The next section reviews the related work. Section 3 describes the proposed genetic algorithm, our simulation results are given in Section 4, and we draw our conclusions in Section 5.

Figure 1: Sample unit disk graph and a dominating set

Related Works

The MDS problem is NP-hard, and so several approximation algorithms have been proposed to find a near optimal solution to this problem in a reasonable time. Jia et al. [10] proposed a greedy distributed algorithm to construct a DS in a graph G. The idea of this algorithm is to build a spanning tree T rooted at the node with maximum degree and grow T until all nodes are added to T; the non-leaf nodes in T then form a DS. Li et al. [11] also proposed a prune-based heuristic algorithm for constructing the MDS. In this algorithm, the dominating set is initialized to the whole node set of the graph and then each node is examined to determine whether it should be removed or retained. If eliminating a given node disconnects the induced subgraph of the connected dominating set, it is retained; otherwise it is removed. One can find a good survey of algorithms for finding the MDS in [13].

The Proposed Genetic Algorithm For Finding the MDS (GA-DS)

We describe the main features of our genetic algorithm (GA) for the DS problem as follows.

3.1 Representation

The chromosome used for this problem consists of a string of bits whose length is equal to the number of nodes in the graph. Each gene in the chromosome corresponds to a node in the considered graph, and the value of each gene determines whether that node is selected as a dominator node or not.

3.2 Fitness

We use a rather simple fitness function, which describes just one property of the subgraph, and embed a penalizing mechanism to prevent the production of infeasible solutions. We define the fitness value of an individual S in the GA population as follows. If S is a feasible solution, the fitness value is equal to the number of vertices in the solution (i.e. the number of 1s occurring in the string). If S is an infeasible solution, the fitness function gives it a bad fitness value, namely the number of all nodes of the graph. This avoids selecting, for the next generation, chromosomes whose corresponding subgraph is not a DS. The fitness function is shown in (1), where n is the number of nodes in the graph:

$$f(S) = \begin{cases} \sum_{j=1}^{n} S[j] & \text{if } S \text{ is a feasible solution} \\ n & \text{otherwise} \end{cases} \qquad (1)$$

3.3 Selection and Crossover

Selection is motivated by the evolutionary mechanism implied by the well-known phrase "survival of the fittest". We use an elitist selection mechanism which copies the two best individuals of a population to the population of the next generation. In our proposed algorithm, after two parents (chromosomes) have been selected for crossover, the GA uses stochastic uniform crossover for generating offspring.
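To make the fitness definition in (1) concrete, a small sketch that checks domination feasibility and scores a bit-string chromosome; the graph representation and function names are our own.

```python
def is_dominating(chromosome, adj):
    """True if every node is in the selected set or has a neighbour in it."""
    selected = {v for v, bit in enumerate(chromosome) if bit == 1}
    return all(v in selected or selected & set(adj[v]) for v in range(len(adj)))

def fitness(chromosome, adj):
    """Eq. (1): number of selected nodes for feasible solutions, n for infeasible ones (lower is better)."""
    n = len(adj)
    return sum(chromosome) if is_dominating(chromosome, adj) else n

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(fitness([0, 0, 1, 0], adj))   # {2} dominates all nodes -> fitness 1
print(fitness([1, 0, 0, 0], adj))   # node 3 is uncovered     -> fitness 4
```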


3.4 Mutation Operators

This operator is motivated by the evolutionary mechanism of mutation, where a chromosome undergoes a small but important modification. We implemented and compared three types of mutation operator in the proposed GA algorithm. The mutation operators we used are described below:

3.4.1 The simple mutation operator

The simple mutation operator simply selects one vertex in the subset of vertices and changes its corresponding allele to its complement, meaning that one node is added to the sub-graph or one node is deleted from it.

Data: The population
Result: The population with mutated chromosomes
Step 1: Randomly select a subset of 10% of the chromosomes from the entire population;
Step 2:
foreach chromosome X selected in Step 1 do
    if the number of 1s in X > n/2 then
        HM1(X);
    else
        HM2(X);
    end
end
Figure 2: The hypermutation algorithm

3.4.2 Hypermutation (HM) Operator

The basic hypermutation heuristic proposed by Correa et al. [12] starts by randomly selecting a percentage of the chromosomes of the population. The algorithm then tries to improve the fitness of each of the selected chromosomes as follows: for a given node in the chromosome, the algorithm performs the change that most improves the chromosome's fitness. As an example of their representation, assume the vertex set of a given graph is labeled {1, 2, 3, 4, 5, 6, 7}; a typical chromosome {2, 3} means that nodes 2 and 3 are selected as dominators. Note that the chromosome representation in [12], a list of the vertices in the subset, is different from our bit-string representation. We customize the hypermutation operator for our algorithm with its particular representation. The implementation of our hypermutation algorithm is shown in Figure 2. We investigate two cases:

Case 1: If the number of 1s in the chromosome is more than or equal to half of the length of the chromosome, we use hypermutation type 1, called HM1. This algorithm deletes dominator nodes one by one by inverting their value to 0 in the chromosome, checks which of these changes results in the largest improvement in the fitness of the chromosome, and then performs that change on the chromosome. The implementation of HM1 is shown in Figure 3.

We illustrate the use of these operators in our proposed algorithm by an example. Consider a graph with vertex set {1, 2, 3, 4, 5, 6, 7}; a typical chromosome 1011011 means that nodes 1, 3, 4, 6 and 7 are in the dominator set. Since the number of 1s is more than half of the length of this chromosome, HM1 is applied. This algorithm inverts the 1s one by one, so the chromosomes that should be evaluated are 0011011, 1001011, 1010011, 1011001 and 1011010. Finally, the algorithm selects the change that results in the maximum improvement in the fitness of the chromosome.

Data: The chromosome
Result: The mutated chromosome
foreach node j with value 1 in chromosome X do
    Let Y be a new chromosome with value 0 at the jth gene;
    Calculate the fitness of Y;
    if fitness(Y) < fitness(best) then
        Best = Y;
    end
end
if fitness(best) < fitness(X) then
    X = best;
end
Insert the new X into the population replacing the old X;
Figure 3: The HM1 algorithm

Case 2: If the number of 1s in the chromosome is less than half of the length of the chromosome, we use hypermutation type 2, called HM2. HM2 changes a dominator node into a non-dominator node and instead selects another non-dominator node to become a dominator. The implementation of HM2 is shown in Figure 4. Let us consider another chromosome such as 1010010; in this case HM2 should be applied.


The algorithm first considers the first position containing a 1, changes its value to 0, and then searches for a non-dominator node which, when made a dominator, causes the maximum improvement in the fitness of the chromosome. The chromosomes that should be examined are 0110010, 0011010, 0010110, and 0010011. The drawback of this algorithm is that it takes the entire chromosome and mutates it node by node against all non-dominator nodes, so it takes a lot of time and is computationally expensive.

Data: The chromosome
Result: The mutated chromosome
Let H be the set of nodes in the graph that are currently 1 in the chromosome X;
foreach i in set H do
    Best = X;
    foreach node j that has value 0 in X do
        Let Y be a new chromosome with value 1 at the jth gene and value 0 at the ith gene;
        Calculate the fitness of Y;
        if fitness(Y) < fitness(best) then
            Best = Y;
        end
    end
    if fitness(best) < fitness(X) then
        X = best;
    end
end
Insert the new X into the population replacing the old X;
Figure 4: The HM2 algorithm

3.4.3 KN Algorithm

We developed another heuristic which is an enhancement of the algorithms described above. This algorithm uses a local optimization rather than a global optimization technique. The idea behind this algorithm is to mutate every gene only with its neighbours. The number of neighbours that are investigated (K) depends on the average degree of the network nodes. We define the average degree of the network nodes as (2):

$$\bar{\alpha} = \frac{\sum_{i=1}^{n} \alpha_i}{n} \qquad (2)$$

in which $\bar{\alpha}$ denotes the average degree of the network, $\alpha_i$ is the degree of node i and n is the number of nodes in the network. In fact, k neighbours of a given node are investigated, where k is at most $\bar{\alpha}$, so we call this algorithm k neighbours (KN). The implementation of KN is shown in Figure 5. Note that the set H is calculated for each node in the chromosome.

Data: The population
Result: The population with mutated chromosomes
Step 1: Randomly select a subset of 10% of the chromosomes from the entire population;
Step 2:
foreach chromosome X selected in Step 1 do
    foreach node i in X that has value 1 do
        Best = X;
        Let H be the set of neighbours of node i that are currently 0 in the chromosome X (the maximum cardinality of this set is the average degree of the graph);
        foreach node j in set H do
            Let Y be a new chromosome with value 1 at the jth gene and value 0 at the ith gene;
            Calculate the fitness of Y;
            if fitness(Y) < fitness(best) then
                Best = Y;
            end
        end
        if fitness(best) < fitness(X) then
            X = best;
        end
    end
    Insert the new X into the population replacing the old X;
end
Figure 5: The KN algorithm
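A brief sketch of the KN idea from Figure 5 for a single chromosome: each selected node is tentatively swapped only with its currently unselected neighbours, keeping the swap that most improves the fitness of (1). The graph layout, the inline fitness helper and the names are our own assumptions.

```python
def fitness(chromosome, adj):
    """Eq. (1): number of selected nodes if they dominate the graph, otherwise n (lower is better)."""
    sel = {v for v, b in enumerate(chromosome) if b}
    dominating = all(v in sel or sel & set(adj[v]) for v in adj)
    return sum(chromosome) if dominating else len(adj)

def kn_mutate(chromosome, adj):
    """Figure 5 idea: swap each 1-gene only with its 0-valued neighbours, keeping the best improvement."""
    x = list(chromosome)
    for i in range(len(x)):
        if x[i] != 1:
            continue
        best, best_fit = x, fitness(x, adj)
        for j in adj[i]:
            if x[j] == 0:
                y = list(x)
                y[i], y[j] = 0, 1            # move the dominator role from node i to neighbour j
                f = fitness(y, adj)
                if f < best_fit:
                    best, best_fit = y, f
        x = best
    return x

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(kn_mutate([1, 1, 0, 0], adj))          # the infeasible set {0, 1} is repaired into {1, 2}
```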


Computational Results

To study the performance of our GA algorithm for solving the DS problem, we have conducted simulation experiments in two groups. The first experiment compares the impact of applying three different mutation operators in the implementation of the proposed genetic algorithm for solving the DS problem. The second group compares the results of the proposed GA with those of the best-known DS formation algorithms. The performance measures of experiment 1 are DS size and run time, and in experiment 2 the performance measure is only DS size.

In our experiments we generate random connected graphs repeatedly and run the algorithms, measuring the size of the DS. The size of the graph ranges from 60 to 200 nodes. To simulate the structure of Ad-Hoc networks, we place the nodes (hosts) randomly in a square simulation area of size 100 x 100 units. The coordinates of the nodes are chosen uniformly in each dimension. It is assumed that the transmission range of each host is 20. The parameters of our genetic algorithm are set as follows: the population size is 100, the crossover rate is 0.8 and the number of iterations is 100.

4.1 Experiment 1

In this experiment we study the impact of applying three different mutation operators in our genetic algorithm for solving the DS problem. We set the stall time of the GA algorithm to 8 seconds and compare the size of the DSs produced by the GA with the three different mutation algorithms. The results are depicted in Figure 6. As shown in Figure 6, the GA with the KN heuristic performs better than the simple mutation and the hypermutation in the same amount of time. This occurs because hypermutation tries all of the options, whereas the KN algorithm does not; therefore better results can be achieved in the same amount of time by applying the KN algorithm.

4.2 Experiment 2

In this experiment, we compare the results obtained from our proposed algorithm with the results of Jia et al.'s algorithm [10] and Li et al.'s algorithm [11], in terms of the induced DS size. The results are shown in Figure 7. Our algorithm is labeled GA-DS. From the figure it is obvious that the size of the DS constructed by GA-DS is comparable with those constructed by the best existing algorithms.

Figure 7: Comparison of the DS size for the DS-based clustering algorithms

CONCLUSION

In this paper, we presented a method based on genetic


algorithms for solving the MDS problem in unit disk
graphs. Most of the components of proposed GA are
comparable to those used in a standard GA. We implemented the hypermutation heuristic concept presented
by Correa, et al. [12] with some modifications to tune it
for our problem. We also developed a heuristic called
KN (K Neighbours) heuristic for mutation operator.
Experimental results show that GA with KN heuristic
works better than simple mutation and hypermutation.
Furthermore we found that the proposed GA with KN
heuristic also outperforms the best existing algorithms
for finding MDS in UDGs.

Figure 6: Comparison of three mutation algorithms on the resulting backbone size


References
[1] V. Bharghavan and B. Das, Routing in Ad-Hoc Networks Using Minimum Connected Dominating Sets, International Conference on Communications97, Montreal,
Canada (1997), 376380.
[2] Y. Z. Chen and A. L. Liestman, Approximating minimum
size weakly connected dominating sets for clustering mobile
Ad-Hoc networks, MobiHoc2002 (2002), 157-164.
[3] K. M. Alzoubi, P. J. Wan, and O. Frieder, Maximal independent set, weakly connected dominating set, and induced
spanners for mobile Ad-Hoc networks, International Journal of Foundations of Computer Science 14 (2003), no. 2,
287-303.
[4] P. J. Wan, K. Alzoubi, and O. Frieder, Distributed construction of connected dominating set in wireless Ad-Hoc
networks, INFOCOM 2002 3 (2002), 1597-1604.
[5] J. Wu, B. Wu, and I. Stojmenovic, Power-aware broadcasting and activity scheduling in Ad-Hoc wireless networks using connected dominating sets, Journal of Wireless Communications and Mobile Computing (2002), 425-438.
[6] H. Lim, C. Kim, and I. Stojmenovic, Flooding in wireless
Ad-Hoc networks, Journal of Computer Communications
(2001), 353-363.

[7] M. R. Garey and D. S. Johnson, Computers and Intractability: A guide to the theory of NP-completeness, Freeman, San
Frncisco, 1978.
[8] J. H. Holland, Adaptation in Natural and Artificial Systems,
University of Michigan Press, Ann Arbor, MI, 1975.
[9] D. E. Goldberg, Genetic Algorithm in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA,
1989.
[10] L. Jia, R. Rajaraman, and T.Suel, An Efficient Distributed
Algorithm for Constructing Small Dominating Sets, Distributed Computing 15 (2002), no. 4, 193205.
[11] J. Li, J. Jannotti, D. S. J. D. Couto, D. R. Karger, and R.
Morris, A scalable location service for geographic Ad-Hoc
routing, Proc. of the sixth annual international conference
on Mobile computing and networking, Boston, MA (2000).
[12] E. S. Correa, M. T. A. Steiner, A. A. Freitas, and C.
Carnieri, A Genetic Algorithm for solving a capacity Pmedian problem, Numerical Algorithms 35 (2004), no. 24,
373388.
[13] Z. Liu, B. Wang, and L. Guo, A survey on connected Dominating Set Construction Algorithm for Wireless Sensor
Networks, Information Technology Journal 9 (2010), no. 6,
10811092.


Conflict Detection and Resolution in Air Traffic Management based


on Graph Coloring Problem using Prioritization Method
Hojjat Emami

Farnaz Derakhshan

Msc Student in Artificial Intelligence

Assistant Professor in Artificial Intelligence

Faculty of Electrical and Computer Engineering

Faculty of Electrical and Computer Engineering

University of Tabriz

University of Tabriz

Tabriz, Iran

Tabriz, Iran

hojjatemami@yahoo.com

derakhshan@tabrizu.ac.ir

Abstract: The current air traffic management systems are not able to manage the enormous volume of air traffic perfectly and do not have sufficient capability to service different types of flights. Free flight is a new concept presented as a potential solution to the problems of the current air traffic management system. Despite the many advantages of free flight (such as less fuel consumption, minimum delays and a reduced workload for the air traffic control centers), it causes problems such as collisions between aircraft. Conflict detection and resolution (CDR) is one of the fundamental challenges in air traffic management systems. In this paper, we present a model for CDR between aircraft in air traffic management using the graph coloring problem concept. In fact, we map the congestion area to a corresponding graph and then find a reliable and optimal coloring for this graph using a prioritization method, in which we assign a priority to each aircraft based on its score.

Keywords: Air Traffic Control, Free Flight, Conflict Detection and Resolution, Graph Coloring Problem, Prioritization Method.

Introduction

Having a reliable, safe and efficient air traffic management system is a fundamental and critical need in aviation industry. In this paper, we define Air Traffic as:
Aircraft operating in the air or on an airport surface,
exclusive of loading ramps and parking areas [1] and
Air Traffic Control as: a service operated by appropriate authority to promote the safe, orderly, and expeditious flow of air traffic [1]. Air traffic management is a
very complex, dynamic and demanding problem which
involves multiple controls and various degree of granularity [2]. Generally, the main goals of air traffic management systems are as follows: providing safety (separate aircraft to prevent collisions between aircrafts),
performance and high efficiency for the flights, detecting and resolving conflicts, reducing travel time (min Corresponding

Author, T: (+98) 914 6761586

504

imum delay) with highest possible accuracy, organize


and expedite the flow of traffic, and providing information and other supports for pilots when able [3, 4].
There are many reasons for to the necessity of proposing new approaches in air traffic control including: the
number of flights are increased and this high air traffic
needs more reliability and high performance, and human errors in the process of information gathering is
inevitable. This problem leads many researchers in the
field of aviation industry to provide innovative solutions for safe and efficient air traffic management (e.g.
in [5, 6, 7]).
Because of various problems in the current air traffic
management systems, the aviation industry has turned
towards a new concept called free flight [8]. Free flight
is a new concept presented potentially to solve problems in the current air traffic management system. Free
flight means that, pilots or other users of the air traf-

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

fic management systems have more freedom for selecting and modifying their flight paths in airspace during
flight time. The free flight concept changes the current
centralized and command-control airspace system (between air traffic controllers and pilots) to a distributed
system that allows pilots choose their own flight paths
more efficient and optimal, and plan for their flight
with high performance themselves. Free flight, also
called user preferred traffic trajectories, is an innovative concept designed to enhance the safety and efficiency of the National Airspace System (NAS) [9, 10].
Despite many advantages of this method, free flight imposes some problems for air traffic management system
that one of the most notably of them is the occurrence
of conflicts between different aircrafts. CDR is one of
the major and fundamental challenges in safe, efficient
and optimal air traffic management.
In this paper, conflict is defined as: conflict is the
event in which two or more than two aircrafts experience a loss of minimum separation from each other
[12]. In other words, the distance between aircrafts violates a criterion defining what is considered unwanted;
that we should avoid of these conflicts during a fast
and accurate process; otherwise air traffic management
may be deal with difficult and also will increase risk of
any aircraft collide. In addition, we use the definition
of conflict detection process as the process of deciding when conflict - conflict between aircrafts- will occur
[12], and also conflict resolution process is considered
as: specifying what action and how should be to resolve
conflicts [12].
So far, various models are proposed for conflicts detection and resolutions in air traffic. We also presented
an organized and systematic model for conflicts detection and resolution between aircrafts in air traffic
management which this model has high efficiency, flexibility and reliability. Our proposed model is based on
the prevention method of conflicts. In this paper, using mapping congestion area to corresponding graph,
we converted the problem of conflicts between aircrafts
to a Graph Coloring Problem (GCP) [11]; In fact, we
make a state space graph from congestion area. Each
node of this graph indicates one aircraft in congestion
area and each edge between two nodes represent the
conflict that may be occur between two aircrafts in
future, and the colors used for coloring this graph indicates a flight path. Then we use prioritization method
as an optimal method with least cost to solve the GCP
(i.e. for solve conflicts between aircrafts in airspace).
In this model, global approach is used to resolve the
multiple conflicts between aircrafts in congestion area.
In fact, for each aircraft, we allocate a flight path in
which this aircraft have a reliable distance (vertical or
horizontal) with each other aircrafts and there will be
no risk of conflict. We believe that if we use this model
beside new technologies such as multi-agent systems [2]

we can obtain promising efficiency in air traffic management systems. Multi-agent system is a natural tool for
air traffic management and if autonomous agents use
appropriate strategy (such as prioritization method)
they can manage the air traffic as properly. Following this short description, we describe Graph Coloring
Problem in Section 2, followed by description of our
proposed model in section 3, then in section 4 we describe prioritization method and finally in section 5 we
make some conclusion.

Graph Coloring Problem

GCP (GCP) [11] is an optimization problem that includes finding an optimal coloring for a given graph
G. GCP is one of the most studied NP-hard problems.
Coloring a graph involves assigning labels to each graph
node so that adjacent nodes have different labels. A
minimum coloring for a graph is a coloring that uses
the minimum number of different labels (colors) as possible [13]. GCP is a practical method of representing
many real world problems including time scheduling,
frequency assignment, register allocation, and circuit
board testing. In GCP the fundamental challenge for
any given graph is to find the minimum number of colors for which. This is most often implemented by using
a conflict minimization algorithm [14].
The GCP can be stated as follows: given an undirected
graph G with a set of vertices V and a set of edges E,
(G= (V, E)), a k-coloring of G consists of assigning a
color to each vertex of V; such that neighboring vertices
have different colors (labels). Formally, a k-coloring of
G= (V, E) can be stated as a function F from V to
a set of colors K such that |K| = k and F (u) 6= F (v)
whenever E contain an edge (u, v) for any two vertices
u and v of V. The minimal number of colors allocated
to a graph is called the chromatic number of G. Optimal coloring is one that uses exactly the predefined
chromatic number for any given graph. Since the GCP
is NP complete [11, 13], we need to use heuristics methods to solve it. As we know there are many methods
that proposed for GCP such as: evolutionary methods (e.g. GA [15]), local search algorithms (e.g. Tabu
search [16] or Simulated Annealing [17]) or other mathematical and optimization methods. In this paper, we
use the Prioritization Method for solving the GCP.
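As an illustration of the kind of coloring procedure used later, the following is a minimal Python sketch of a greedy k-coloring that processes vertices in a given priority order; the graph representation, the priority list and the function name are our own illustrative assumptions, not part of the paper.

    # Minimal sketch (illustrative assumptions): greedy coloring driven by a
    # priority order, where each color index can be read as one available flight path.
    def priority_coloring(adjacency, priority_order):
        """adjacency: dict vertex -> set of neighboring vertices.
        priority_order: vertices sorted from highest to lowest priority.
        Returns a dict vertex -> color index (0, 1, 2, ...)."""
        coloring = {}
        for v in priority_order:                      # high-priority aircraft are served first
            used = {coloring[u] for u in adjacency[v] if u in coloring}
            color = 0
            while color in used:                      # smallest color not used by any neighbor
                color += 1
            coloring[v] = color
        return coloring

    # Tiny usage example: three aircraft, where 'a' conflicts with 'b' and 'c'.
    conflicts = {'a': {'b', 'c'}, 'b': {'a'}, 'c': {'a'}}
    print(priority_coloring(conflicts, ['a', 'b', 'c']))   # e.g. {'a': 0, 'b': 1, 'c': 1}

Vertices earlier in the priority order tend to keep the lowest color indices, which matches the idea that higher-priority aircraft deviate least from their original paths.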

Our Proposed Model

In this model we take a preventive approach and attempt to present a method with high performance for CDR. In our proposed model, the criterion for conflict detection is the reduction of the distance between aircraft below a certain limit in future time steps. The pseudo-code of the proposed model (solving the conflict problem using the GCP concept) is shown in Figure 1.
As shown in Figure 1, the traffic environment must first be monitored, and the appropriate current state information must be collected (using proper equipment [12]). These states provide an estimate of the current traffic situation (such as the position, direction, destination and speed of each aircraft). Then, the congestion area is detected based on this status information from the current air traffic. Also at this stage, the minimum reliable distance threshold for detecting conflicts can be determined. The congestion area is then mapped to a corresponding graph based on the minimum reliable distance threshold; in other words, a state-space graph is created from the congestion area. Next, the distance matrix between all aircraft in the congestion area is computed, and the adjacency matrix is created based on the distances between aircraft and the determined minimum reliable distance threshold.
In the second stage, the scores of the aircraft in the congestion area are computed, and then based on these scores the priority of each aircraft is computed. The computation of these scores and priorities is described in the next sections. In the third stage, the corresponding graph is colored using the prioritization method; in other words, we use the prioritization method for solving the GCP. The output of the algorithm is an optimal and reliable coloring (an efficient solution for resolving the conflicts between aircraft in the congestion area). If there is no collision, the algorithm ends. Then, the new conflict-free flight plan is sent to the aircraft on their flight paths. Here we emphasize that our proposed model can interact with innovative technologies (such as multi-agent system technology) for conflict detection and resolution in air traffic management, and also in ground traffic and
related applications.
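As a concrete illustration of the first stage (distance matrix and adjacency matrix), the following Python sketch builds a conflict graph from predicted aircraft positions and a separation threshold; the data layout and names are illustrative assumptions rather than the paper's implementation.

    import itertools
    import math

    # Minimal sketch (illustrative assumptions): build the adjacency structure of the
    # conflict graph from predicted aircraft positions and a separation threshold.
    def build_conflict_graph(positions, reliable_threshold):
        """positions: dict aircraft_id -> (x, y, altitude) predicted for the next time step.
        Returns (distance matrix, adjacency dict); an edge means a possible conflict."""
        ids = list(positions)
        distance = {}
        adjacency = {i: set() for i in ids}
        for a, b in itertools.combinations(ids, 2):
            d = math.dist(positions[a], positions[b])   # Euclidean distance
            distance[(a, b)] = distance[(b, a)] = d
            if d < reliable_threshold:                  # closer than the reliable separation
                adjacency[a].add(b)
                adjacency[b].add(a)
        return distance, adjacency

The adjacency dictionary produced here is exactly the kind of input the coloring sketch above expects.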


Data: Problem Parameters (Traffic Information)
Result: A New Conflict-Free Flight Plan for Aircrafts in Airspace

Define Problem Parameters;
ReliableThreshold = PredefinedValue;

Step 1: Create the Corresponding Graph (i.e. Adjacency Matrix) of the congestion area
    CongestionArea = DetectCongestionArea();
    DistanceBetweenAircrafts(ProblemParameters);
    MapCongestionArea(ProblemParameters, CongestionArea);
    MakeAdjacencyMatrix(ProblemParameters);

Step 2: Compute the Priority for Aircrafts (in the Congestion Area)
    AircraftsScores = ComputeScores(ProblemParameters);
    AircraftsPriority = ComputePriority(AircraftsScores, ProblemParameters);

Step 3: Solve the GCP (Conflicts of Aircrafts in the Congestion Area) using the Prioritization Method
    PrioritizationAlg(CorrespondingGraph, AircraftsPriority);
    SendNewFlightPlan();

Figure 1: Pseudo-Code for Our Proposed Model
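Putting the two sketches given earlier together, a minimal driver for the three steps of Figure 1 might look as follows; the helper names and the way scores are passed in are illustrative assumptions.

    # Illustrative driver for the three steps of Figure 1, assuming the
    # build_conflict_graph and priority_coloring sketches given earlier.
    def resolve_conflicts(positions, scores, reliable_threshold):
        """positions: aircraft_id -> predicted position; scores: aircraft_id -> priority score.
        Returns aircraft_id -> flight-path index (color)."""
        _, adjacency = build_conflict_graph(positions, reliable_threshold)   # Step 1
        order = sorted(adjacency, key=lambda a: scores[a], reverse=True)     # Step 2: highest score first
        return priority_coloring(adjacency, order)                           # Step 3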

Prioritization Method

Here, we assign a priority to each aircraft based on its conditions in the airspace. In this model, the priority of each aircraft is specified based on its score: when an aircraft has a high score it will have a high priority and, conversely, when an aircraft has a low score it will have a low priority. Score allocation, and consequently priority assignment, for each aircraft is performed as follows:
When an aircraft is close to its destination, its score increases.
The score of an aircraft increases when the aircraft flies in satisfactory weather conditions.
The score of an aircraft increases when the aircraft flies at a high altitude (within the valid altitude).
The score of an aircraft increases when the aircraft has a high (appropriate) speed.
The score of an aircraft increases when its distance (horizontal or vertical) from the other aircraft is large.


We assign a priority to each aircraft based on its score. When a conflict occurs, the aircraft with the lower priority must change its flight path and deviate from its primary, original route to prevent the conflict. In fact, we use a hierarchical method to resolve conflicts.
At first glance this process may seem very similar to a greedy method, but the priority method is general and reasonable; for example, when an aircraft is closer to its destination and has the minimum deviation from its main route, it must be serviced first, and the other aircraft are serviced afterwards. Although starvation is not unexpected in this case, we can avoid this problem by allocating extra score to aircraft that have been on their flight paths in the airspace for a long time, so that these aircraft are also serviced as soon as possible.
The pseudo-code for assigning priorities to the aircraft in the congestion area is given in Figure 2. We used this method for solving conflicts between aircraft in the congestion area. The prioritization method can also be used to resolve conflicts without using the GCP.

Data: Aircrafts in Congestion Area
Result: Priority for Aircrafts
while i <= NumberOfAircrafts do
    DistanceToDestination(i) = ProblemParams.Aircrafts(i).Destination - ProblemParams.Aircrafts(i).CurrentPosition;
    Velocity(i) = ProblemParams.Aircrafts(i).Velocity;
    Altitude(i) = ProblemParams.Aircrafts(i).Altitude;
    Weather(i) = ProblemParams.Aircrafts(i).WeatherConditions;
end

// Compute Priority for each Aircraft
while i <= NumberOfAircrafts do
    ScoreDistanceToDestination = MaxDistanceToDestination - DistanceToDestination(i);
    ScoreVelocity = Velocity(i) - MinVelocity;
    ScoreAltitude = Altitude(i) - MinAltitude;
    ScoreWeather = PredefinedValue;
    ProblemParams.Aircrafts(i).Score = ScoreDistanceToDestination + ScoreVelocity + ScoreAltitude + ScoreWeather;
    Priority(i) = ProblemParams.Aircrafts(i).Score;
end

Figure 2: Calculate the Priority for Aircrafts

Conclusion

CDR has been an active research topic in recent years, and presenting new algorithms that automate the process of CDR is very important. Free flight is a new concept that has been presented as a potential solution to the problems of the current air traffic management system. In this paper, a new approach to CDR in air traffic management is presented. In this approach, we mapped the conflict resolution problem in the congestion area to a GCP and then used the prioritization method to solve the GCP. The corresponding colored graph of the aircraft congestion area represents an efficient and reliable solution to the conflict problem. This model uses a global strategy to resolve multiple conflicts and has high efficiency compared to other models that do not consider this aspect.
Although in this paper we presented only an abstract, preliminary and conceptual model for CDR, our next goal is to combine this model with multi-agent system technology in order to present a comprehensive, highly efficient model for CDR in air traffic management systems.

References

[1] Federal aviation regulations and aeronautical information manual, 2010 edition. ASA, Inc., Newcastle, Washington, 2009.
[2] K. Tumer and A. Agogino, Improving air traffic management with a learning multiagent system, IEEE Intell. Syst., vol. 24, no. 1, pp. 18-21, Jan/Feb 2009.
[3] Department of Transport, U.K., Air traffic forecasts for the United Kingdom 1997, U.K. Government, Department of Transport, Tech. Rep., 1997. [Online]. Available: http://www.aviation.dft.gov.uk/aed/air/aircont.htm.
[4] Federal Aviation Administration, Advancing free flight through human factors, www.hf.faa.gov/docs/508/docs/freeflt.pdf, accessed 1 August 2008, 1995.
[5] M. Nguyen-Duc, J. Briot, and A. Drogoul, An application of Multi-Agent Coordination Techniques in Air Traffic Management, Proceedings of the IEEE/WIC International Conference on Intelligent Agent Technology (IAT03), 2003.
[6] S. Wollkind, J. Valasek, and T. R. Ioerger, Automated conflict resolution for air traffic management using cooperative multi-agent negotiation, AIAA Guidance, Navigation and Control Conference, 2004.
[7] N. Archambault and N. Durand, Scheduling Heuristics for On-Board Sequential Air Conflict Solving, IEEE, 2004.
[8] Radio Technical Commission for Aeronautics, Final report of RTCA Task Force 3: Free flight implementation, RTCA, Washington DC, Tech. Rep., Oct. 1995.
[9] Federal Aviation Administration, Free Flight - Introduction, http://www.faa.gov/freeflight/ff ov.htm, September 2011.
[10] J. Rong, J. Valasek, S. Geng, and T. R. Ioerger, Air traffic conflict negotiation and resolution using an onboard multi-agent system, Proceedings of the 21st Digital Avionics Systems Conference, 2002.
[11] T. R. Jensen and B. Toft, Graph Coloring Problems, Wiley-Interscience Series in Discrete Mathematics and Optimization, 1995.
[12] J. Kuchar and C. Yang, A Review of Conflict Detection and Resolution Modeling Methods, IEEE Transactions on Intelligent Transportation Systems, Vol. 1, No. 4, December 2000.
[13] D. de Werra, Heuristics for Graph Coloring, Computing Suppl. 7, pp. 191-208, 1990.
[14] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, New York, 1979.
[15] C. Fleurent and J. A. Ferland, Genetic and hybrid algorithms for graph coloring, in G. Laporte and I. H. Osman (Eds.), Metaheuristics in Combinatorial Optimization, Annals of Operations Research, 63: 437-441, 1996.
[16] M. Kubale, Introduction to Computational Complexity and Algorithmic Graph Coloring, Gdanskie Towarzystwo Naukowe, 1998.
[17] M. Chams, A. Hertz, and D. de Werra, Some experiments with simulated annealing for coloring graphs, EJOR 32: 260-266, 1987.

A Review of M-Health Approach for Chronic Disease Management


Marva Mirabolghasemi (1), N.A. Iahadi (2), Maziar Mirabolghasemi (3) and Vida Zakerifardi (4)
(1,2) Universiti Teknologi Malaysia (UTM), Department of Information Systems
(3,4) Guilan University of Medical Sciences, Department of Nursing and Midwifery
marva.mirabolghasemi@yahoo.com, minshah@utm.my, maziar.mirabolghasemi@yahoo.com, v.zakerifardi@yahoo.com
(Corresponding author: Malaysia, T: (+60) 0177905394)

Abstract: The growing number of patients suffering from chronic diseases causes a growing focus
on the use of information and communication technology to reduce the time consuming and costly
nature of treating chronic diseases. More than any other technology, mobile phones can provide
solutions for chronic diseases at various levels of organizations. The need for expansion of Chronic
Disease Management (CDM) is well recorded. Mobile technology is ubiquitous and can play an
essential role in healthcare, particularly in disease management. The objective of this study is to
review various researches in M-Health area to show the inevitable role of mobile communication
technologies in CDM.

Keywords: Change Management, Chronic Disease Management, Communication, M- Health, Tele-Monitoring.

Introduction

The increasing number of patients suffering from chronic lifelong conditions such as hypertension and diabetes has placed an immense burden on health systems. More than any other technology, mobile technology can provide solutions at different organizational levels, such as large communities, individual patients and providers [1]. M-Health and e-health are inextricably linked and are used to improve health outcomes. For example, many e-health initiatives consist of digitizing patient records in order to standardize access to patient data within a national system. M-Health programs can serve as the access point for entering patient data into the system, and as remote information tools that prepare information for healthcare clinics, health workers in the field, and home providers [1]. Mobile technology offers an efficient information management solution for chronic patient care. The need for patient Chronic Disease Management (CDM) is well recognized [2-4]. Patients have an essential role in the management of their condition and the opportunity to achieve expertise and knowledge about their condition and its management. Therefore, people with chronic lifelong conditions should be empowered so that they become the centre of the management of their condition. Furthermore, the motives for a CDM that physicians are concerned about are information delivery, the potential for automating readings, and clinical-related decisions [5].
The multidisciplinary nature of such care requires that the entire group accepts the technologies if treatment is to be integrated and seamless. They must also adapt to the impact that mobile devices and wireless technologies may have on the doctor-patient relationship. These necessities, and change management issues, will become significantly more prominent as the influence of chronic disease care on under-resourced health services becomes more severe.

1.1 M-Health Definitions and Benefits

M-Health is defined as the usage of telecommunication and multimedia technologies integrated within wireless health care delivery systems [6]. It can be explained as medical sensor, mobile computing, and communications technologies for health care [7]. M-Health involves new devices, systems, technology, policies, and standards for communication between healthcare providers and patients, integration of applications and disease management, and collaboration and care coordination systems, among others. It can be said that mobile devices will bring significant cost savings for the health sector by decreasing the frequency of patient visits to health facilities and enhancing the detection of causes for action. An M-Health strategy should be divided into two initiative parts: citizen-centric and health-worker centric [8]. Mobile technologies have the potential to reduce isolation and to provide ongoing support to health care workers as well as citizens [9]. However, patients become the focus of care, not the doctor or the hospital [10].
Mobile technology has the merit of being location-independent, offering mobility and flexibility to the range of healthcare stakeholders [7, 11]. This also enhances the ability of both clinicians and patients to obtain information, with consequent advantages for the constant monitoring of patients' conditions, fast emergency responses, remote/rural care, and interactive consultancy [6, 7, 11].
A core value of mobile technologies is their low cost. In the context of chronic diseases, where patients should attend healthcare centres frequently for several years, there may be a trade-off between the reduction in travel costs and mobile costs. However, as with telemedicine [12], cost benefits may accrue more readily to patients than to providers, and there is a need to demonstrate real savings if providers are to adopt m-health more widely.

Chronic Disease Management Model

The most well-known model is the Chronic Care Model (CCM), which is the largest chronic care reference outside the formal health care system. Although this is an attractive conceptual model and research has shown that it can improve the quality of care and outcomes, it is difficult to implement [13, 14]. Olmen J, Ku GM, Bermejo R, Kegels G, Hermann K and Van Damme W [15] introduced full self-management, and its elements are shown in Figure 1.

Figure 1: Full Self-management Model

Patients need to be supported by information and communication tools, peers, immediate care givers and families, medication and technology, and by professionals for specialised advice, certain tests and information. The literature about patient self-management has already brought about a major paradigm shift, from a so-called traditional medical model to a collaborative patient-doctor model. Chen et al. [16] proposed mobile phones as social interaction devices that offer the capability for tighter communication with the patient's community, and promote and sustain patients' motivation through social influence from their community. Figure 2 shows the architecture of the proposed system.

Figure 2: Mobile Based Patient Compliance System Model


The medical personnel can submit advice as a set of rules to the server, which also receives data from the patient's mobile phone, either through automatic sensing or manual inputs. The user's context information is also periodically sent to the server, which uses historical data and the user context to determine an optimal reminder delivery schedule.

Different Types of Mobile Communication Technologies

The primary feature of mobile phones that has been most extensively documented in the context of health is text messaging, which is available on almost all mobile phones; however, only a small number of experimental studies have been published on the subject [17].
PDAs, especially when combined with mobile
phones, have become a platform for processing, data
collection, and communication [6]. They are perceived
to be durable, portable, powerful and relatively easy
to use for healthcare providers and database managers.
They also enable access to information, such as treatment protocols as well as a means of rapid data transfer [17]. Low- and middle- income countries are also
making advances in the use of PDAs for consolidation,
data collection, and reporting as well as disease control and surveillance [18]. There is a growing trend in
schools and universities to use iPods to deliver lectures
and podcasts as part of the educational process [19], an
approach which could easily be used to deliver health
information [9].
A smartphone is a mobile phone that has more advanced communication options and computing ability
than regular mobile phones. For example, it has the
ability to read and store documents and to connect to
the internet. The medical industry is rapidly developing diagnostic test devices which can be connected to
smartphones, such as glycometers, sphygmomanometers, and software that records and interprets the results. Experience illustrates that smart phones can be
easily mastered by persons of all ages with very little
education [15].

Monitoring Chronic Disease

Patient monitoring is a rapidly accepted element in CDM strategies [20]. These technologies bring potential benefits to both doctor and patient: doctors can focus more on the tasks at hand by saving the time spent on consulting chronically ill patients [21], and patients can move in their own environment without making extensive and frequent trips to the doctor. The research presented in [22-26] focused on the telemonitoring of patients with chronic diseases. From these and other existing works we can conclude that remote monitoring systems represent one of the most significant technological research areas in the health context.

Body Area Network

Sensor technology has enabled the development of


lightweight and small medical sensors which can be
worn by the patients while wirelessly transferring data.
This releases the patients from the confines of traditional wired sensors, allowing them to increase comfort
in daily living. It is foreseen that with the help of these
enhanced mobile health systems, better healthcare services can be delivered to patients, and physicians can
also benefit from a better information management.
In addition, it will provide users the ability to access their medical records anytime and anywhere [27]. Morón et al. [28] present an ongoing prototype of a telemonitoring system based on a BAN (Body Area Network) that is integrated with a GPS (Global Positioning System) unit, a Bluetooth (BT) pulse-oximeter, and a smartphone. The smartphone is the hardware platform for running the software that manages the BT piconet formed by the sensors. The smartphone then sends the data received from the sensor devices to a central server that provides universal access to the health status through a web application.
Kunze et al. [29] designed three levels of networks, namely the body area network (BAN), the personal area network (PAN) and the wide area network (WAN), for cardiac monitoring. This involves a Bluetooth-enabled wireless network of various body-parameter sensors that can communicate with the mobile device. The PAN component of the framework connects the BAN to users who communicate through the local cellular network. Meanwhile, the WAN provides connectivity between the patient and a remote physician.

Electronic Health Record

De Toledo et al. [20] described a standards-based Connectivity Interface designed to interconnect a mobile telehealth solution with electronic healthcare record systems from external providers, enhancing the appropriateness of this technical solution for different business models for mobile telehealth. Figure 3 includes a cell phone that acts as a gateway for a set of devices and telemonitoring information that may or may not be used depending on the patient's condition, such as a glucometer, spirometer, sphygmomanometer, pulse oximeter and scale. This collection of supported sensors enables the identification of solutions suitable for the management of a wide range of chronic diseases.

Figure 3: User Terminal

More professionalized functionality, such as permitting the healthcare professional to directly modify the periodicity and type of tests and send this configuration to the cell phone, is not requested at this point, but it may be of interest in the future, when patients and professionals become more accustomed to the capabilities of mobile health care management [20].

The Potential of Cloud Computing for Chronic Disease Management

Cloud computing provides the capability for managing and distributing information in a pervasive manner to and from several applications and platforms. This approach makes it possible for end users to access the computing infrastructure remotely over the Internet. Cloud computing is a model for providing on-demand network access, servers, services, applications, and storage, which can be provisioned rapidly with minimal management effort. Piette et al. [30] examined the potential and feasibility of a cloud-computing approach to IVR-supported chronic disease care within a very low-income community. This is a feasible strategy for providing IVR services globally. IVR self-care support may improve glycemic control and self-care for patients in developing countries. The outcome of this research study suggests that m-health tools for CDM using standard phone lines and a cloud-computing model are potentially useful, acceptable, and technically feasible in resource-poor areas.
Doukas, Pliakas and Maglogiannis [31] implemented a mobile system that provides e-health data storage, retrieval and update using cloud computing. The m-health application is developed on Google's Android operating system, and the system provides management of medical images and patient health records. The cloud computing platforms that exist for managing users' data are either commercial (Amazon AWS) or free (iCloud). The concept of using cloud computing in the field of healthcare information management is quite new, but it is considered to have great potential [32].

Conclusion

The potential of the use of mobile phones and other information and communication technologies in health care is enormous [1]. The current trend of m-health research has focused more on proof-of-concept studies and specific applications [33] than on the analysis of critical success factors or the consideration of generic principles.
However, m-health for the chronic care sector will have to develop within other health services, and progress towards sustainable systems across the stages of chronic disease management is likely to be challenging. The most pressing problem in the case of chronic disease management is the integration of information and standards. Meanwhile, the stakeholders involved in chronic care need to carefully implement radical changes in the way information is collected, analysed, disseminated and assimilated, and a national approach to m-health standards [5].
Communication-enhanced health care through m-health is a paradigm for the future, but it may be held back by the costly infrastructural requirements of such technology, a lack of skilled operators, Internet connectivity, and the amount of personnel training required [34].
Despite these limitations, m-health is expected to create a revolutionary change in healthcare delivery systems because of the rapidly developing field of mobile digital tools such as PDAs, smartphones, enterprise digital assistants, and sensor gadgets.

[18] T. Shields, A. Chetley, and J. Davis, ICT in the health


sector: Summary of the online consultation, infoDev, 2005.
[19] M. Carmichael, iPods Teach Docs to Recognize Heartbeats:
Medical Students use iPods to learn the sounds of unhealthy
hearts, Newsweek, 2007.
[20] P. De Toledo, W. Lalinde, F. Del Pozo, D. Kotz, D. Thurber,
and S. J. Fernndez, Interoperability of a Mobile Health
Care Solution with Electronic Healthcare Record System
(2006).

References
[1] Kahn, C. James, J. Yang, and S. J. Kahn, Mobile Health
Needs And Opportunities In Developing Countries, Health
Affairs 29 (2010), 252-258.

[21] M. Schwaibold, M. Gmelin, and G. V. Wagner, Key factors for personal health monitoring and diagnosis devices,
Heidelberg, 2002.

[2] A. Opie, Nobodys asked me for my view: users empowerment by multidisciplinary health teams, Qualitative Health
Research 18 (1998), 188-206.

[22] S. Winkler, M. Schieber, S. Lcke, P. Heinze, T. Schweizer, D.


Wegertseder, M. Scherf, H. Nettlau, S. Henke, M. Braecklein, S. D. Anker, and F. Koehler, A new telemonitoring
system intended for chronic heart failure patients using mobile telephone technology- Feasibility study, International
Journal of Cardiology (2010).

[3] P. Brennan and C. Safran, Patient empowerment, International Journal of Medical Informatics 69 (2003), 301-304.
[4] B. Paterson, Myth of empowerment in chronic illness,
Journal of Advanced Nursing 34 (2003), 574-581.
[5] R. Stockdale, Peer-to-peer online communities for people
with chronic diseases: a conceptual framework, Journal of
Systems and Information Technology 10 (2008), 39-55.
[6] R. Istepanian, Introduction to the Special Section on MHealth: Beyond Seamless Mobility and Global Wireless
Health-care Connectivity, IEEE Transactions on Information Technology in Biomedicine (2004), 405-413.
[7] R. Istepanian and J. Lacal, Emerging Mobile Communication Technologies for Health: Some Imperative notes on
m-Health, The 25th Silver 59 Anniversary International
Conference of the IEEE Engineering in Medicine and Biology Society (2003).
[8] A. Iluyemi, Feedback on Draft WHO mHealth Review, London, 2007.
[9] P. Mechael,
Creating an Enabling Environment for
mHealth, Information and Communications Technology,
ITI 5th International Conference (2007).
[10] D. Fuscaldo and Soon, Cellphones Will Monitor the Vital Signs of the Chronically Ill, The Wall Street Journal
On-line (2004).
[11] A. Prentza, S. Maglavera, and L. Leondaridis, Delivery
of healthcare services over mobile phones: e-Vital and CJS
paradigms, Proceedings of the 28th Annual International
Conference of the IEEE EMBS (2008).
[12] A. C. Norris, Essentials of telemedicine and telecare,
Chichester: Wiley, 2002.
[13] M. Pearson, S. Wu, J. Schaefer, A. Bonomi, and S. Mendel,
Assessing the implementation of the chronic care model in
quality improvement collaborative, RAND 40 (2005), 978996.
[14] K. Coleman, B. T. Austin, C. Brach, and E. H T. Wagner,
Evidence on the chronic care model in the new millennium,
Health Affairs 28 (2009), 75-85.
[15] J. Olmen, G. M. Ku, R. Bermejo, G. Kegel, K. Hermann, and W. V. Damme, The growing caseload of
chronic life-long conditions calls for a move towards full
self-management in low income countries, GLOBALIZATION AND HEALTH 38 (2011), 1-10.
[16] G. Chen, B. Yan, M. Shin, D. Kotz, G. M. Ku, and E. Berke,
MPCS: Mobilephone based patient compliance system for
chronic illness care, 6th Annual International Mobile and
Ubiquitous Systems: Networking & Services (2009), 1-7.
[17] SatelLife, Handhelds for Health: SatelLifes Experiences in
Africa and Asia, 2005.


[23] A. Tura, L. Quareni, D. Longo, C. Condoluci, A. V. Rijn,


and G. Albertini, Wireless home monitoring and health
care activity management through the Internet in patients
with chronic diseases, Medical Informatics and the Internet
in Medicine 30 (2005), 241253.
[24] K. Perakis, M. Haritou, R. Stojanovic, B. Asanin, and D.
Koutsouris, Wireless patient monitoring for the e-inclusion
of chronic patients and elderly people, Proceedings of the
1st international conference on pervasive Technologies Related to Assistive Environments (2008), 14.
[25] S. Sultan and P. Mohan, How to interact: Evaluating the
interface between mobile healthcare systems and the monitoring of blood sugar and blood pressure, 6th Annual International Mobile and Ubiquitous Systems (2009), 16.
[26] D. Capozzi and G. Lanzola, An agent-based architecture
for home care monitoring and education of chronic patients,
COMPENG 10 (2010), 138140.
[27] A. Bourouis and M. Feham, Ubiquitous Mobile Health Monitoring System for Elderly, Journal of Computer Science &
Information Technology 3 (2011).
[28] M. J. Morn, A. Gmez-Jaime, J. R. Luque, and E. Casilari,
Development and Evaluation of a Python Telecare System
Based on a Bluetooth Body Area Network, EURASIP Journal on Wireless Communications and Networking (2011).
[29] C. Kunze, U. Gromann, and W. Stork, Application of ubiquitous computing in personal health monitoring systems,
Biomed. Tech. 47 (2002), 360362.
[30] J. D. Piette, M. Mendoza, O. Ganser, M. Mohamed, M.
Marinec, and N. Krishnan, A preliminary study of a cloud
computing model for chronic illness self-care support in underdeveloped countries, American Journal of Preventive
Medicine 40 (2011), 629-632.
[31] C. Doukas, T. Pliakas, and I. Maglogiannis, Mobile healthcare information management utilizing Cloud Computing
and Android OS, Engineering in Medicine and Biology Society (EMBC) (2010 ), 1037-1040.
[32] Ofer and Shimrat, Cloud Computing and Healthcare, San
Diego Physician, pages: 26-29, 2009.
[33] R. Guruajan and S. Murugesan, Wireless solutions developed for the Australian Healthcare, the 4th International
Conference on Mobile Business (2005).
[34] K. K. Agbele, H. O. Nyongesa, and A. O. Adesina, ICT
and information security perspectives in e-health systems,
Mobile Communication 4 (2010 ), 17-22.

A New IIR Modeling by means of Genetic Algorithm


Tayebeh Mostajabi

Javad Poshtan

Iran University of Science and Technology

Iran University of Science and Technology

Department of Electrical Engineering

Department of Electrical Engineering

mostajabi@elec.iust.ac.ir

jposhtan@iust.ac.ir

Abstract: The genetic algorithm is a powerful optimization technique for minimizing multimodal functions; it has therefore recently been applied to estimating the parameters of IIR model structures. IIR structures are extensively employed in many systems, such as communications, speech recognition, bio-systems and acoustics, which are recursive in nature. This paper proposes a novel fitness function, based on frequency-domain data, to enhance the performance of the estimated model when a GA is applied as the estimator algorithm. The numerical results presented here indicate that the proposed fitness function is effective in building an acceptable model in IIR modeling.

Keywords: IIR Filtering; Genetic Algorithm; Fitness Function; Modeling.

Introduction

The genetic algorithm is a powerful optimization technique for minimizing multimodal functions; thus several researchers have proposed various methods specifically designed for use in adaptive IIR filtering applications [1, 3]. On the other hand, many systems are recursive in nature and would greatly benefit from implementing infinite impulse response (IIR) filters. There are principally two important sets of applications in IIR filter design: adaptive signal processing and adaptive system identification. The design of IIR filters for adaptive signal processing emphasizes specific pass-bands and a desired frequency response, whereas in adaptive system identification the IIR filter is employed for modeling, and therefore it should behave the same as the real system in both the time and frequency domains. Adaptive digital signal processing is an important subject in many applications. Adaptive system identification, adaptive noise cancellation, active noise control, adaptive linear prediction, and adaptive channel equalization are just a few of the important application areas that have been significantly advanced by adaptive signal processing techniques. These systems are recursive in nature and would greatly benefit from implementing infinite impulse response filters. Several researchers

have proposed various methods for using the GA in adaptive IIR filtering applications. For instance, in [3] the GA and input-output data in the time domain are utilized to estimate the parameters of an IIR model for system identification, where a fitness function based on the mean squared error (MSE) between the unknown plant and the estimated model is used. The GA is also extensively employed in adaptive signal processing applications; the primary application is reported in [2]. In recent years, several successful techniques have been introduced to improve the GA's capability for signal processing applications [6-8]. In [6] a stability criterion embedded in the GA is applied to design a robust D-stable IIR filter. This method tries to design the most stable IIR filter that has the same magnitude frequency response as the ideal one. This means that if the method is used for system identification, the estimated IIR filter may be more stable than the real plant, whereas the aim of system identification is to design an acceptable model that can reveal the defects of the real system.
In most applications of optimization methods to adaptive IIR filtering, the objective function is designed based on the MSE. In [9] other cost functions based on the least mean squared error (LMS) and the mean absolute error (MAE) are considered alongside the MSE.

Another example is [7], in which the phase response is considered in addition to the magnitude response, and a linear-phase filter is designed via a fitness function based on the variance of the phase-difference sequence of the designed IIR filter. In this paper, the frequency-response-based design methods for IIR filters in signal processing are employed for adaptive system identification, and for this purpose a new fitness function is proposed to enhance the performance of the estimated model in both the time and frequency responses.

Problem Statement

The recursive expression with input u(n) and output y(n), and the equivalent transfer function of the IIR filter, can be described by (1) and (2):

    y(n) = \sum_{k=0}^{N} b_k x(n-k) - \sum_{j=1}^{M} a_j y(n-j)    (1)

    G(z) = (b_0 + b_1 z^{-1} + ... + b_M z^{-M}) / (1 + a_1 z^{-1} + ... + a_N z^{-N})    (2)

where the a_k and b_k are the filter coefficients, which define its poles and zeros respectively. These parameters are estimated by the genetic algorithm so that the error, measured by the fitness function, between the frequency response of the designed IIR filter and the real frequency response (of the plant) is minimal.
The genetic algorithm begins with a random set of possible solutions, each embedded in one chromosome; each chromosome has (M+N+1) genes. At every generation, the cost of each individual (chromosome) is evaluated by a predetermined cost function, and individuals with lower fitness values are preferred. The population is then evolved through the cycle of natural selection, survival of the fittest, and mutation. This cycle is repeated in order to find the optimal solution.

Proposed Fitness Function

The typical fitness function in adaptive filtering is the mean squared error (MSE) between the frequency response of the unknown system, H(w), and the adaptive filter estimated by the i-th chromosome, H_i(w):

    Emse_i = (1/N_t) \sum_{w} |H(w) - H_i(w)|^2,    0 <= w <= pi    (3)

in which N_t is the number of sampling points in the domain w.
The magnitude frequency response can be guaranteed with this fitness function, but the phase response may not be similar enough to that of the real system. The quality of the magnitude response and of the desired pass-band and stop-band frequency responses may ensure the desired performance in adaptive signal processing applications, whereas in adaptive system identification, high quality and similarity of the whole time response and frequency response (both magnitude and phase) are required. As a result, the following fitness function is proposed:

    min f_p1 = Emse_i |_{w} + [ (1/L) \sum_{w_L} ( |H(w_L)| - |H_i(w_L)| )^2 ]^{0.5} + variance[ angle(H(w_L)) - angle(H_i(w_L)) ],    0 <= w <= pi,  0 <= w_L <= pi    (4)

where L is the number of sampling points in the domain w_L.

Case Study

In this section, an IIR filter taken from [7] is considered as the unknown plant in order to examine the suggested fitness function, which is employed by the GA for system identification. In the simulations two models are utilized: a matched-order and a reduced-order model. The performances of the models estimated with the proposed cost function are then compared with those obtained with the conventional MSE-based cost function, in both the time and frequency domains.

    H_plant = 0.1823 (1 + 0.643 z^{-1})(1 - 1.0019 z^{-1} + 0.9958 z^{-2}) / [ (1 - 0.3888 z^{-1})(1 - 1.1631 z^{-1} + 0.6501 z^{-2}) ]    (5)

4.1 Matched Order Filter Modeling

For this part, a matched-order model is used for linear identification:

    H_MM = (b_0 + b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3}) / (1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3})    (6)
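Before turning to the results, a minimal Python sketch of how the fitness in (3)-(4) could be evaluated for one chromosome is given below; it assumes SciPy's freqz for the frequency response, and the variable names and sampling choices are our own illustrative assumptions, not the authors' implementation.

    import numpy as np
    from scipy.signal import freqz

    def proposed_fitness(b_est, a_est, b_plant, a_plant, n_points=256):
        """Fitness of one chromosome (b_est, a_est) against the plant, in the spirit of (3)-(4):
        complex-response MSE + RMS magnitude error + variance of the phase error."""
        w, h_plant = freqz(b_plant, a_plant, worN=n_points)   # plant response sampled on [0, pi)
        _, h_est = freqz(b_est, a_est, worN=n_points)          # candidate model response
        emse = np.mean(np.abs(h_plant - h_est) ** 2)           # MSE term of equation (3)
        mag_err = np.sqrt(np.mean((np.abs(h_plant) - np.abs(h_est)) ** 2))
        phase_err = np.var(np.angle(h_plant) - np.angle(h_est))
        return emse + mag_err + phase_err                      # smaller is better

    # Plant of equation (5), expanded into polynomial coefficients:
    b_plant = 0.1823 * np.polymul([1, 0.643], [1, -1.0019, 0.9958])
    a_plant = np.polymul([1, -0.3888], [1, -1.1631, 0.6501])
    # Evaluate an arbitrary (hypothetical) third-order candidate:
    print(proposed_fitness([0.2, 0.0, 0.0, 0.0], [1.0, -1.5, 0.8, -0.1], b_plant, a_plant))

A GA would minimize this value over the (M+N+1) coefficient genes of each chromosome.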


Figure (1) exhibits the pole-zero maps of the real plant (equation 5), the model estimated with the MSE fitness function, and the model estimated with the proposed cost function. It can clearly be seen that the pole-zero locations of the model estimated with the proposed cost function are more similar to those of the real plant than those of the model estimated with the MSE cost function. In addition, the superior quality of the model estimated with the proposed fitness function is confirmed by comparing the step responses, frequency responses and root-locus maps, which are drawn in Figures (2)-(4) respectively.

Figure 1: Comparative pole-zero map for estimated


models based on matched order IIR filter

Figure 4: Comparative root-locus map for estimated


models based on matched order IIR filter

Figure 2: Comparative step responses for estimated


models based on matched order IIR filter

Figure 3: Comparative frequency responses for estimated models based on matched order IIR filter

4.2 Reduced Order Filter Modeling

For this part, a reduced-order model is used for linear identification:

    H_RM = (b_0 + b_1 z^{-1} + b_2 z^{-2}) / (1 + a_1 z^{-1} + a_2 z^{-2})    (7)

Figures (5)-(7) illustrate the frequency responses, step responses and impulse responses comparatively. These figures show that the model estimated with the proposed cost function (equation 4) has the best performance in both the time and frequency responses compared to the conventional MSE-based one.


Figure 5: Comparative frequency responses for estimated models based on reduced order IIR filter

Figure 6: Comparative step responses for estimated models based on reduced order IIR filter

Figure 7: Comparative impulse responses for estimated models based on reduced order IIR filter

Discussion and Future Works

The quality of the magnitude response and of the desired pass-band and stop-band frequency responses may ensure the desired performance in adaptive signal processing applications, whereas in adaptive system identification, high quality and similarity of the whole time response and frequency response (magnitude and phase) are required. Therefore this paper proposed a new fitness function in order to attain an estimated model with acceptable performance in both the time and frequency responses.
The proposed cost function is examined in two situations: a parametric model with matched order and also with reduced order. In each situation, the quality of the estimated models is compared in both the time and frequency responses. The numerical results indicate that the proposed fitness function is effective in building an acceptable model for linear identification.

References

[1] J. J. Shynk, Adaptive IIR Filtering, IEEE ASSP Magazine, April 1989.
[2] D. Etter, M. Hicks, and K. Cho, Recursive adaptive filter design using an adaptive genetic algorithm, Proc. IEEE Int. Conf. on ASSP 7 (1982), 635-638.
[3] S. C. Ng, S. H. Leung, C. Y. Chung, A. Luk, and W. H. Lau, The genetic search approach: A new learning algorithm for adaptive IIR filtering, IEEE Signal Processing Magazine, Nov. 1996, 38-46.
[4] V. Hegde, S. Pai, and W. K. Jenkins, Genetic Algorithms for Adaptive Phase Equalization of Minimum Phase SAW Filters, 34th Asilomar Conf. on Signals, Systems, and Computers, November 2000.
[5] S. Pai, W. K. Jenkins, and D. J. Krusienski, Adaptive IIR Phase Equalizers Based on Stochastic Search Algorithms, Proc. of the 37th Asilomar Conf. on Signals, Systems, and Computers, November 2003.
[6] S. T. Pan, Design of robust D-stable IIR filters using genetic algorithms with embedded stability criterion, IEEE Trans. Signal Processing 57/8 (2009).
[7] Y. Yang and X. Yu, Cooperative Coevolutionary Genetic Algorithm for Digital IIR Filter Design, IEEE Trans. Industrial Electronics 54/3 (2007).
[8] M. Haseyama and D. Matsuura, A filter coefficient quantization method with genetic algorithm, including simulated annealing, IEEE Signal Processing Letters 13/4 (2006).
[9] N. Karabogal and B. Cetinkaya, Performance Comparison of Genetic Algorithm Based Design Methods of Digital Filters with Optimal Magnitude Response and Minimum Phase, Proc. IEEE Int. Symp. on Micro-Nano Mechatronics and Human Science (2003).


A New Similarity Measure for Improving Recommender Systems


Based on Fuzzy Clustering and Genetic Algorithm
Fereshteh Kiasat

Parham Moradi

Department of Electrical and Computer Engineering

Department of Electrical and Computer Engineering

University of Kurdistan

University of Kurdistan

f.kiasat@uok.ac.ir

p.moradi@uok.ac.ir

Abstract: Recommender systems are widely applied in e-commerce websites to help customers find the items they want. A recommender system should be able to provide users with useful information about items that might interest them. The similarity measure is the most important factor in a recommender system; it is used to compute the user similarity, and a more precise similarity measure improves the recommender system's results. The purpose of this paper is to introduce a new similarity measure based on the combination of both the users' profiles and the users' rating records. The major advantage of the proposed measure compared with previous ones is that it uses two different information sources, which yields more precise results, whereas the previous measures compute the similarity from the user profile or the ratings alone. Designing a new similarity measure based on the combination of different user information sources, e.g. user profile and ratings, can overcome sparsity and cold start, which are the major problems in recommender systems. The experimental results show that the proposed measure can give satisfactory and high quality recommendations.

Keywords: recommender systems; similarity measure;fuzzy c means;genetic algorithm

Introduction

Nowadays, the product information created by internet sellers is very attractive for customers, who have a wide choice. Connecting users with suitable products is a key to user satisfaction and loyalty. In recent years, recommender systems (RS) have played an important role in reducing the negative effect of information overload [6]. In addition, domains such as e-commerce and e-learning make RS even more important [6, 26].
Recommender systems make recommendations using three basic steps: acquiring preferences from the customer's input data, computing recommendations using proper techniques, and presenting the recommendations to customers [27].
Based on how recommendations are made, personalized recommender systems are usually classified into the following categories [2, 13, 25]:

Collaborative filtering (CF) [1, 11] based RS [5, 7] allow users to give ratings to a set of items (e.g. videos, songs, films, etc. in a CF-based website), in such a way that when enough information is stored in the system we can make recommendations to each user based on information provided by those users considered to have the most in common with them.
Demographic filtering [15] based RS rely on the principle that individuals with certain common personal attributes (sex, age, country, etc.) will also have common preferences.
Content-based RS [3, 16] recommend items that are similar in content to items the user has liked in the past, or matched to attributes of the user.
Hybrid RS [9, 12] commonly use a combination of CF with demographic filtering or CF with content-based filtering, to exploit the strengths of each of these techniques.

Corresponding Author, P. O. Box 66177-15175, F: (+98) 871-6660073, T: (+98) 871-6660073


Currently, collaborative filtering (CF) is a widely used and studied technique [11].
Collaborative RS have their own limitations: firstly, there is the new-user problem, since it is necessary for a set of items to be rated in order to perform the similarity analysis, and the more ratings a user has made, the more accurate the assignment to a group of similar users [22, 24]; secondly, there is the new-item problem, since an item which has not been rated previously cannot be recommended [5, 14, 20, 21]. These limitations come from the similarity measures used in collaborative filtering RS, such as the cosine or Pearson similarity measures, because in all of these the similarity between users is computed only from the ratings that users give to items. This causes weak results when the user-rating matrix is incomplete: almost every e-commerce site holds information on millions of users and items, and each user buys a limited number of items and rates only some of the bought items, so most item ratings are unknown. Thus in this paper we focus on similarity measures and propose a new similarity measure that gives better recommendations without these limitations.
In this paper a new hybrid measure based on a genetic algorithm and fuzzy clustering is given for computing the user similarity, which uses the users' profile information in addition to the users' ratings. In the proposed method the genetic algorithm is used for calculating similarity based on the rating information, and by fuzzy clustering users are grouped according to the information given in their profiles, and their similarity is calculated according to whether or not they fall in the same cluster. Finally, by combining these two parts we obtain the user similarity in the recommender system. The experiments show that using this new measure increases the performance of the recommender system. In Section 2 collaborative filtering recommender systems are explained, Section 3 presents the proposed method, in Section 4 we compare the results of the proposed method with similar methods, and Section 5 concludes the paper.

User similarity measure

A key factor in the quality of the recommendations obtained in a CF-based RS lies in its capacity to determine which users are the most similar to a given user. A series of algorithms and similarity metrics between users [18] are currently available which enable this important function to be performed in the CF core of this type of RS.
Assuming the existence of a user-item matrix, R, the similarity between two users from that matrix, user u_i and user u_k, can be calculated utilizing either the Pearson Correlation Similarity or the Cosine Similarity, which are the main similarity metrics used in the Recommender Systems literature.

2.1 Pearson similarity

To find the similarity between users u_i and u_k, we can utilize the Pearson Correlation metric. Pearson Correlation was first introduced in the context of the GroupLens project [19]:

    sim_{ik} = \sum_{j=1}^{l} (r_{ij} - \bar{r}_i)(r_{kj} - \bar{r}_k) / [ sqrt( \sum_{j=1}^{l} (r_{ij} - \bar{r}_i)^2 ) * sqrt( \sum_{j=1}^{l} (r_{kj} - \bar{r}_k)^2 ) ]

It should be noted that the summations over j are calculated over the l items for which both users u_i and u_k have expressed their opinions. Apparently, l <= n, where n represents the total number of items in the user-item matrix, R.

2.2 Cosine similarity

In the n-dimensional item space (or k-dimensional item space, in case dimension-reduction techniques, like SVD/LSI, were applied), we can view different users as feature vectors. A user vector consists of n feature slots, one for each available item. The values used to fill those slots can be either the rating, r_{ij}, that a user, u_i, provided for the corresponding item, i_j, or 0, if no such rating exists. Now, we can compute the proximity between two users, u_i and u_k, by calculating the similarity between their vectors, as the cosine of the angle formed between them:

    sim_{ik} = \sum_{j} r_{ij} r_{kj} / [ sqrt( \sum_{j} r_{ij}^2 ) * sqrt( \sum_{j} r_{kj}^2 ) ]


Proposed user similarity measure

To calculate a similarity measure we need something that indicates this similarity correctly. Up to now many measures, such as the cosine similarity measure [8] and the Pearson similarity measure [19], have been presented. In these methods the similarity of users is computed only from the user ratings. There are challenges in these methods such as sparsity and cold start (an item cannot be recommended unless a user has rated it before; this problem applies to new items and also to obscure items, and is particularly harmful to users with eclectic tastes). In the proposed method, for calculating the similarity of users we use not only the users' ratings of items but also the users' profile information. The measure proposed in this paper uses the combination of user ratings and profile information to compute this similarity. There are three steps:
Step 1: Profile Based Similarity Measure. In this step demographic user information is extracted and the users' similarity is computed by the fuzzy c-means algorithm.
Step 2: Rating Based Similarity Measure. This measure computes the user similarity from the user ratings.
Step 3: Hybrid Measure. The weighted sum of Step 1 and Step 2, with weights alpha and 1-alpha respectively, is computed as the users' similarity.

3.1 Step 1: Fuzzy c-means clustering

Fuzzy clustering algorithms partition the data set into overlapping groups such that the clusters describe an underlying structure within the data. The best-known method of fuzzy clustering is the Fuzzy c-Means method (FCM), initially proposed by Dunn [8, 26] and generalized in [4, 10, 23]. Fuzzy c-means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters [4, 8]. It is based on minimization of the following objective function:

    J(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m D_{ik}^2(x_k, v_i)

In the above formula m indicates the amount of fuzzification, x_k is the k-th sample, v_i is the center of cluster i, and n indicates the number of samples; u_{ik} indicates the degree of membership of the i-th sample in cluster k, and D_{ik} is the distance of the sample from the center of the cluster. Users are divided into a specific number of clusters c using their demographic information. To run the clustering algorithm, the initial values of the centers are chosen at random. The distance between users and the centers of the clusters can be computed by:

    dist(i, v) = sqrt( \sum_{m=1}^{M} (F_m^i - F_m^v)^2 )

According to this formula, M indicates the number of features used in the clustering and F_m^i indicates the value of feature m for user i.
The degree of membership of users to the clusters is calculated using the following function:

    u_{ik} = 1 / \sum_{j=1}^{c} (d_{ik} / d_{jk})^{2/(m-1)}

According to this formula, d_{ik} indicates the distance of user i from the k-th cluster center.
Here, we first determine the degree of membership to each cluster for every user. The degree-of-membership vector over the k clusters is written as:

    MF_x = < MF_x^1, MF_x^2, ..., MF_x^k >

For example, if we want to divide the users into five clusters, the degree-of-membership vectors of users 1 and 2 are shown as follows:
    MF_1 = <0.1, 0.6, 0.1, 0.18, 0.02>
    MF_2 = <0.49, 0.23, 0.18, 0.01, 0.09>
The similarity of users x and y with these degrees of membership is computed as follows:

    PB(x, y) = 1 / ( (1/c) \sum_{j=1}^{c} | MF_x^j - MF_y^j | )

where PB(x, y) is the similarity between user x and user y. The pseudo code of the fuzzy c-means clustering of users, based on the information from their demographic profiles, is shown in Algorithm 1.

Algorithm 1. Pseudo code of the Profile Based Similarity Algorithm

Input:
    UP: user profiles
    C: number of clusters
    x: user ID1
    y: user ID2
Output:
    Similarity of user x and user y
Begin
    centers <- PredefinedCenters
    J <- 0                  // initialize value of the objective
    m <- 2                  // degree of fuzzification
    DF <- ExtractUserDemographicFeatures()
    while the change in J is greater than epsilon do
        centers <- updateCenters()          // update cluster centers
        J <- computeJ(UP, C, m)
        for each u in UP                     // compute degree of membership for each user
            for each f in DF
                MF <- membershipFunction()
            end
        end
    end
    for each i in C                          // distance between the MF of users x and y per cluster
        difference_i <- |MF_x^i - MF_y^i|
    end
    Similarity_{x,y} <- 1 / Average(difference)
End

3.2 Step 2: Genetic Algorithm

The ratings made by a particular user x can be represented by a vector r_x = (r_x^(1), r_x^(2), ..., r_x^(I)) with dimension I (the number of items in the RS), in such a way that r_x^(i) indicates the rating that user x has made for item i. Obviously, a user may not rate all the items in the RS; we use a special symbol to represent that a user has not rated an item, so the expression r_x^(i) = * states that user x has not rated item i yet.
In order to compare the two vectors r_x and r_y, we can consider another vector v_{x,y} = (v_{x,y}^(0), v_{x,y}^(1), ..., v_{x,y}^(M-m)), whose dimension is the number of possible ratings that a user can make for an item. Each component v_{x,y}^(i) of the vector v_{x,y} represents the absolute difference between the ratings of the items rated by both users:

    v_{x,y}^(i) = a / b

where b is the number of items rated by both users, and a is the number of items rated by both users for which the absolute difference between the ratings of the two users is i.
For each vector W = (w^(0), w^(1), ..., w^(M-m)), whose components lie in the range [-1, 1], there is a similarity function in which the component w^(i) represents the importance of the component v_{x,y}^(i) for computing the similarity between two users. In order to find an optimal similarity function, RB, we use a genetic algorithm to find the vector w associated with the optimal similarity function RB:

    RB(x, y) = (1 / (M - m + 1)) \sum_{i=0}^{M-m} w^(i) v_{x,y}^(i)

We use a supervised learning task whose fitness function is the Mean Absolute Error (MAE) of the RS. In this way, the population of our genetic algorithm is the set of different vectors of weights, w. Each individual, that is to say each vector of weights, represents a possible genetic similarity measure; therefore we evaluate the MAE of the RS using this part of the similarity measure. When running our genetic algorithm, the successive population generations tend to improve the MAE of the RS. Our genetic algorithm stops generating populations when the MAE of the RS for a vector of weights is less than a threshold, epsilon.

3.3 Step 3: Hybrid Similarity Measure

In the proposed method, for calculating the similarity of users we use not only the users' ratings of items but also the users' profile information. The effect of the two kinds of information is controlled by the parameter alpha:

    sim(x, y) = alpha * RB(x, y) + (1 - alpha) * PB(x, y)

where sim(x, y) indicates the similarity of the two users, RB(x, y) computes the similarity of the users based on the user-rating matrix using the genetic algorithm presented in reference [17], and PB(x, y) calculates the similarity of the users based on their profile information; in this paper we use fuzzy c-means for computing the user similarity based on the profile information. The parameter alpha is also used for controlling the effect of user
521

The Third International Conference on Contemporary Issues in Computer and Information Sciences

ratings and user profile information in computing the


similarity among them.
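As a concrete illustration, here is a small Python sketch of RB(x, y) and of the hybrid measure above. It is an assumed toy setup, not the paper's code: ratings are on a 1 to 5 scale, 0 marks an unrated item, the weight vector is one candidate GA individual, and the profile similarity pb is taken from the fuzzy c-means step.

# Illustrative sketch of the rating-based similarity and the hybrid measure
# sim(x, y) = alpha*RB(x, y) + (1 - alpha)*PB(x, y).
import numpy as np

M, m = 5, 1                      # max and min rating on the 1..5 scale
UNRATED = 0                      # 0 marks "not rated" in this toy rating matrix

def v_vector(rx, ry):
    """v[i] = fraction of co-rated items whose absolute rating difference is i."""
    both = (rx != UNRATED) & (ry != UNRATED)
    b = both.sum()
    v = np.zeros(M - m + 1)
    if b == 0:
        return v
    diffs = np.abs(rx[both] - ry[both])
    for i in range(M - m + 1):
        v[i] = (diffs == i).sum() / b
    return v

def RB(rx, ry, w):
    """Weighted similarity; w is the GA-learned weight vector in [-1, 1]."""
    return (w * v_vector(rx, ry)).sum() / (M - m + 1)

def hybrid_sim(rx, ry, pb, w, alpha=0.2):
    return alpha * RB(rx, ry, w) + (1 - alpha) * pb

# toy example with 6 items; pb would come from the profile-based step
rx = np.array([5, 4, 0, 3, 1, 0])
ry = np.array([5, 3, 2, 3, 0, 0])
w = np.array([1.0, 0.6, 0.1, -0.4, -0.8])      # one candidate GA individual
print(hybrid_sim(rx, ry, pb=0.7, w=w))

The value alpha = 0.2 mirrors the setting used in the experiments below; in a full system the GA would evolve many such weight vectors and keep the one that minimizes the MAE of the recommender.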

Experiments

The implementation of our approach was based on the MovieLens data set, collected by the GroupLens Research Project at the University of Minnesota through the MovieLens web site. The data set includes 100,000 ratings assigned by 943 users to 1682 movies. All ratings follow the numerical scale 1-bad, 2-average, 3-good, 4-very good, 5-excellent. Each user has rated at least 20 movies, and demographic information (sex, profession and age) is available for the users. In our experiments, 15 clusters were found by trial and error to be suitable for the users.

Figure 1: comparative Mean Absolute Error results for the proposed metric and the genetic similarity method

Figure 2: comparative coverage results for the proposed metric and the genetic similarity method

Figure 3: comparative Recall results for the proposed metric and the genetic similarity method

Figure 4: comparative Precision results for the proposed metric and the genetic similarity method

Fig. 1 shows the mean absolute error obtained by the genetic similarity measure [17] and by the proposed similarity measure with alpha = 0.2, for neighbourhood sizes from 5 to 85. As reported in reference [17], the genetic similarity measure achieves better results than the traditional measure, and our proposed measure obtains an even lower error. It is important to note that the mean absolute error alone is not sufficient to evaluate a recommender system: the system should not only predict user ratings accurately but also reach a suitable coverage percentage, i.e. the percentage of items it is able to recommend to users. Limitations such as sparsity and an incomplete user-item matrix reduce recommender system coverage. The proposed measure obtains a better coverage percentage than the traditional and genetic [17] measures, since it uses both the user-item matrix and the user profile information; it therefore reduces the mean absolute error while increasing coverage more than the traditional and genetic measures (Fig. 2). Two further quality measures of accuracy, recall and precision, were also tested and gave reliable results (Figs. 3-4). According to these results, the proposed similarity measure can predict user ratings and improve the recommendations given to users.
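For completeness, the two evaluation measures discussed above can be computed as in the following rough sketch (an assumed setup, not the authors' evaluation code): MAE over the test ratings the system can predict, and coverage as the share of test ratings that are predictable at all.

# MAE and coverage of a recommender on a held-out test set.
import numpy as np

def evaluate(predictions, actual):
    """predictions: dict (user, item) -> predicted rating, or missing if not predictable."""
    errors, predictable = [], 0
    for key, true_rating in actual.items():
        pred = predictions.get(key)
        if pred is not None:
            predictable += 1
            errors.append(abs(pred - true_rating))
    mae = float(np.mean(errors)) if errors else float("nan")
    coverage = 100.0 * predictable / len(actual)
    return mae, coverage

actual = {(1, 10): 4, (1, 11): 2, (2, 10): 5, (2, 12): 3}
predictions = {(1, 10): 3.6, (1, 11): 2.4, (2, 10): 4.1}   # (2, 12) is not predictable
print(evaluate(predictions, actual))                        # approx (0.57, 75.0)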

Conclusion

Choosing a suitable similarity measure is an important factor in obtaining better results from a recommender system; the similarity measure plays a major role in producing better recommendations for users and increasing their satisfaction. This paper proposed a new similarity measure based on both user profiles and user ratings. Using both sources of information makes the computation of user similarity less dependent on the user-item matrix, so the effect of an incomplete matrix on similarity computation is decreased; on the other hand, the new similarity measure helps to overcome the sparsity and cold-start problems.

References

[1] G. Adomavicius and A. Tuzhilin, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions, IEEE Transactions on Knowledge and Data Engineering 17 (2005), no. 6, 734-749.
[2] S.K.L. Al Mamunur Rashid, G. Karypis, and J. Riedl, ClustKNN: a highly scalable hybrid model- & memory-based CF algorithm, Proc. of WebKDD 2006: KDD workshop on web mining and web usage analysis, in conjunction with the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD 2006), August 20-23, 2006, Philadelphia, PA, 2006.
[3] M.Y.H. Al-Shamri and K.K. Bharadwaj, Fuzzy-genetic approach to recommender systems based on a novel hybrid user model, Expert Systems with Applications 35 (2008), no. 3, 1386-1399.
[4] J.C. Bezdek, Pattern recognition with fuzzy objective function algorithms, Kluwer Academic Publishers, 1981.
[5] J. Bobadilla, F. Serradilla, and J. Bernal, A new collaborative filtering metric that improves the behavior of recommender systems, Knowledge-Based Systems 23 (2010), no. 6, 520-528.
[6] J. Bobadilla, F. Serradilla, A. Hernando, et al., Collaborative filtering adapted to recommender systems of e-learning, Knowledge-Based Systems 22 (2009), no. 4, 261-265.
[7] L. Candillier, K. Jack, F. Fessant, and F. Meyer, State-of-the-art recommender systems, Collaborative and Social Information Retrieval and Access: Techniques for Improved User Modeling (2009), 1-22.
[8] J.C. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters (1973).
[9] L.Q. Gao and C. Li, Hybrid personalized recommended model based on genetic algorithm, Int. Conf. on Wireless Communications, Networking and Mobile Computing, 2008, pp. 9215-9218.
[10] R.J. Hathaway and J.C. Bezdek, Recent convergence results for the fuzzy c-means clustering algorithms, Journal of Classification 5 (1988), no. 2, 237-247.
[11] J.L. Herlocker, J.A. Konstan, L.G. Terveen, and J.T. Riedl, Evaluating collaborative filtering recommender systems, ACM Transactions on Information Systems (TOIS) 22 (2004), no. 1, 5-53.
[12] Z. Huang, H. Chen, and D. Zeng, Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering, ACM Transactions on Information Systems (TOIS) 22 (2004), no. 1, 116-142.
[13] M. Kalz, H. Drachsler, J. Van Bruggen, H. Hummel, and R. Koper, Wayfinding services for open educational practices (2008).
[14] B.M. Kim, Q. Li, C.S. Park, S.G. Kim, and J.Y. Kim, A new approach for combining content-based and collaborative filters, Journal of Intelligent Information Systems 27 (2006), no. 1, 79-91.
[15] B. Krulwich, Lifestyle finder: Intelligent user profiling using large-scale demographic data, AI Magazine 18 (1997), no. 2, 37.
[16] R.D. Lawrence, G.S. Almasi, V. Kotlyar, M.S. Viveros, and S.S. Duri, Personalization of supermarket product recommendations, Data Mining and Knowledge Discovery 5 (2001), no. 1, 11-32.
[17] D. Li, Q. Lv, X. Xie, L. Shang, H. Xia, T. Lu, and N. Gu, Interest-based real-time content recommendation in online social communities, Knowledge-Based Systems (2011).
[18] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, GroupLens: an open architecture for collaborative filtering of netnews, Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, 1994, pp. 175-186.
[19] G. Salton and C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing & Management 24 (1988), no. 5, 513-523.
[20] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, Item-based collaborative filtering recommendation algorithms, Proceedings of the 10th International Conference on World Wide Web, 2001, pp. 285-295.
[21] J. Schafer, D. Frankowski, J. Herlocker, and S. Sen, Collaborative filtering recommender systems, The Adaptive Web (2007), 291-324.
[22] A.I. Schein, A. Popescul, L.H. Ungar, and D.M. Pennock, Methods and metrics for cold-start recommendations, Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002, pp. 253-260.
[23] H.D. Sherali and J. Desai, A global optimization RLT-based approach for solving the hard clustering problem, Journal of Global Optimization 32 (2005), no. 2, 281-306.
[24] X. Su and T.M. Khoshgoftaar, A survey of collaborative filtering techniques, Advances in Artificial Intelligence 2009 (2009), 4.
[25] E. Vozalis and K.G. Margaritis, Analysis of recommender systems algorithms, Proceedings of the Sixth Hellenic-European Conference on Computer Mathematics and its Applications (HERCMA), 2003.
[26] H.F. Wang and C.T. Wu, A strategy-oriented operation module for recommender systems in e-commerce, Computers & Operations Research (2010).
[27] K. Wei, J. Huang, and S. Fu, A survey of e-commerce recommender systems, Service Systems and Service Management, 2007 International Conference on, 2007, pp. 1-5.

The lattice structure of Signed chip firing games and related models
A. Dolati

S. Taromi

B. Bakhshayesh

Shahed University

Shahed University

Shahed University

Department of mathematics

Department of Mathematics

Department of Mathematics

dolati@shahed.ac.ir

taroomi@shahed.ac.ir

Bakhshayesh@shahed.ac.ir

Abstract: In this paper the lattice structure of Signed Chip Firing Games is studied, and the class of lattices induced by Signed Chip Firing Games is compared with the class of ULD lattices and with the classes of lattices induced by Mutating Chip Firing Games and the Abelian sandpile model.

Keywords: lattice, Signed Chip Firing Game (SCFG), Abelian sandpile model (ASM), Mutating Chip Firing Game (MCFG), ULD lattices.

Introduction

The Signed Chip Firing Game (SCFG) is a discrete dynamical system that was introduced by R. Cori and T.T.T. Huong [6]. It is defined over an undirected graph G = (V, E), which is called the support graph. An integer-valued weight is considered for each vertex v in V, namely the number of chips stored at v, and the sum of the weights of all vertices is zero. The game proceeds with an evolution rule, called the firing rule, and may stop after a limited time or continue forever.

The chip firing game (CFG), the Signed Chip Firing Game (SCFG), the Abelian sandpile model (ASM) and the mutating chip firing game (MCFG) are discrete dynamical models which are used in physics [1, 8] and computer science [3, 4, 11]. A lot of research has been carried out on such discrete dynamical models (CFG, ASM, etc.), for example: the study of the set of all configurations reached from an initial configuration ([12, 14]); the study of the configurations that reach the initial configuration again after some stages of the game, hereafter called the recurrent configurations ([2, 5, 8]); and the study of whether a given game terminates or continues forever, which is of high interest among researchers in this field ([3, 4, 9]).

In this paper we want to study these subjects more broadly. The set of all configurations reached from an initial configuration, called the configuration space of the SCFG, is studied. So far, the configuration spaces of the mentioned models have been characterized and some conclusions have been drawn. For instance, the class of lattices induced by the CFG lies between the distributive lattices and the ULD lattices ([14]). It has been proved that the class of lattices induced by the ASM includes the distributive lattices, and that the class of lattices induced by the CFG includes the class of lattices induced by the ASM. Also, it has been shown that the classes of lattices induced by the CFG and the MCFG are the same [13]. For more detailed information see [10].

In this paper the classes of lattices induced by the SCFG, the ASM and the MCFG are compared. It will be proven that the SCFG induces some new lattices which are not induced by the other models; hence this game is important in terms of inducing new lattices.

2 Definitions

In this section, a definition of posets and lattices will be given and the models will be defined.
Corresponding Author, Tehran, PO Box: 18151-159, Iran.


2.1 Posets and Lattices

A partially ordered set (poset) is a set with a relation that is transitive, reflexive and antisymmetric. For two elements x and y of a poset with x < y, if x <= z < y implies z = x, then it is said that x is covered by y (x is a lower cover of y, or y is an upper cover of x), and we write x -< y. The interval [x, y] is the set containing all elements of the poset between x and y, including both x and y. The Hasse diagram is used to represent a poset P [13]. Two posets P and P' are called isomorphic if there exists a bijective function f : P -> P' such that the inequality x <= y implies f(x) <= f(y).
The least upper bound of two elements x and y of a poset is called the join of x and y and is denoted by x v y. The greatest lower bound of two elements x and y of a poset is called the meet of x and y and is denoted by x ^ y. A poset L is called a lattice if any two of its elements have a least upper bound and a greatest lower bound. As all lattices studied here are finite, each lattice has a unique maximal as well as minimal element.
A lattice is called a hypercube of dimension n if it is isomorphic to the power set of a set of n elements, ordered by inclusion. A lattice is called lower locally distributive (LLD) if the interval between any element and the meet of all its lower covers is a hypercube; upper locally distributive (ULD) lattices are defined dually [15]. A lattice is ranked if all the paths in the covering relation from the minimal to the maximal element have the same length. The reader is referred to the references provided at the end of the paper for a more detailed study of posets and lattices [7].

2.2 Definitions of the models

Each of the following models is defined over a multigraph G = (V, E), called the support graph of the model. A configuration is a mapping that assigns to each vertex v in V a weight, namely the number of chips stored at v; this weight can be positive, negative or even zero. If s is a configuration of the game, then the configuration of a vertex v, denoted s(v), is the number of chips stored at v. The games start with an initial configuration and continue by an evolution rule, called the firing rule.
The Chip Firing Game (CFG) [13] is played over a directed graph G = (V, E) and the configuration of each vertex is positive. The firing rule is as follows: a vertex v in V whose number of chips is greater than or equal to its outgoing degree can be fired, and it sends a chip along each of its arcs. If no such vertex exists, the game stops and the last configuration is called the fixed point of the game. In other words, a configuration s is called a fixed point if the following statement holds: for all v in V, s(v) < deg_G(v), where deg_G(v) is the outgoing degree of vertex v in G.
The Abelian Sandpile Model (ASM) [13] is played over an undirected graph G = (V, E) with a special vertex, called the sink, which never fires, and the configuration of each vertex is positive. Note that a directed graph is obtained if each edge {i, j} of the given undirected graph is replaced by the two arcs (i, j) and (j, i). The firing rule and the fixed point of this game are similar to those of the preceding game, the CFG.
The Mutating Chip Firing Game (MCFG) [13] is played over a directed graph G = (V, E) and the configuration of each vertex is positive. An infinite sequence Mv = (Sv1, Sv2, ...) of multisets of nodes of V is fixed for each node v in V, called the mutation sequence of v. If the number of chips stored at a node v in V is greater than or equal to its current outgoing degree, then v fires according to the firing rule of the CFG. After a vertex fires, the mutation of node v occurs: the outgoing arcs of v are removed and a new arc (v, w) is added for each w in Sv1; then Sv1 is removed from Mv so that Mv = (Sv2, Sv3, ...). The support graph of the initial state is called the initial support graph of the MCFG.
The Signed Chip Firing Game (SCFG) [6] is played over an undirected graph G = (V, E) where the configuration of each vertex is an integer and the sum of the configurations of all vertices equals zero. For any positive vertex (i.e. a vertex whose number of chips is positive) the firing rule is similar to that of the CFG. If the configuration of a negative vertex v (i.e. a vertex whose number of chips is negative) is less than or equal to -deg(v) (where deg(v) is the degree of node v), then v can be fired and receives a chip from each of its neighbors. The game stops when no vertex is able to fire.
If a vertex v of a configuration s can be fired and a configuration s' is obtained, then s is called the predecessor of s' and we write s ->_v s'; this relation is called the predecessor relation. Such a game may last forever or may converge to a unique fixed point. The sequence of all vertices fired through the game in order to reach the final configuration from the initial configuration is called an execution of the game. The set of all configurations reachable from the initial configuration C, ordered by the predecessor relation, is called the configuration space of C and is denoted by L(C). Two convergent games C and C' are equivalent if L(C) is isomorphic to L(C'). From now on we will denote the configuration spaces of the CFG, the ASM and the MCFG by L(CFG), L(ASM) and L(MCFG), respectively.
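To make the evolution rule concrete, the following small Python sketch simulates a CFG until it reaches a fixed point. The graph representation, the choice of which firable vertex to fire first and the step limit are assumptions for illustration; the signed variant would additionally fire negative vertices that are at or below minus their degree.

# A vertex with at least as many chips as its out-degree fires and sends one
# chip along each outgoing arc; the game stops at a fixed point.
def cfg_fixed_point(out_arcs, config, max_steps=10_000):
    """out_arcs: dict vertex -> list of successors; config: dict vertex -> chips."""
    config = dict(config)
    execution = []                                   # sequence of fired vertices
    for _ in range(max_steps):
        firable = [v for v in out_arcs if config[v] >= len(out_arcs[v]) > 0]
        if not firable:
            return config, execution                 # fixed point reached
        v = firable[0]
        config[v] -= len(out_arcs[v])
        for w in out_arcs[v]:
            config[w] += 1
        execution.append(v)
    raise RuntimeError("no fixed point within the step limit (game may not converge)")

# toy support graph: a directed triangle plus a sink-like vertex 'd' that never fires
out_arcs = {"a": ["b", "d"], "b": ["c", "d"], "c": ["a", "d"], "d": []}
config = {"a": 3, "b": 0, "c": 1, "d": 0}
print(cfg_fixed_point(out_arcs, config))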

Main results

In this section we study a special class of L(SCFG) and show that the SCFGs related to this class are equivalent to the ASM, the MCFG and ULD lattices. First we state some definitions. A SCFG is simple if each of its vertices can be fired at most once during an execution.
Definition 1: A vertex that stores a positive (negative) number of chips is called a positive (negative) vertex.
Definition 2: The firing of a positive (negative) node is called a positive (negative) firing.
We now present two properties of the SCFG that are needed to reach our result.
Definition 3: A given SCFG is said to have the p1 property if the number of chips at all vertices of its initial configuration is nonzero.
Definition 4: A given SCFG is said to have the p2 property if no negative node prevents a positive node from being fired and no positive node prevents a negative node from being fired.
Now we can state the main results:
Theorem 1: The configuration space of a simple SCFG with the p2 property is a lattice.
Theorem 2: Any simple SCFG with the p1 and p2 properties is equivalent to a convergent ASM.
Theorem 3: The configuration space of a simple SCFG with the p1 and p2 properties is a ULD lattice.
Theorem 4: The configuration space of a simple SCFG with the p1 and p2 properties is equivalent to a MCFG.

Discussion and Future Works

In this paper, we have introduced a class of SCFGs with the p1 and p2 properties, which are equivalent to other known models. We intend to study those classes of SCFGs, not necessarily having the p1 and p2 properties, that are equivalent to other known models. In this way we would be able to find the lattices induced by the SCFG which do not overlap the lattices induced by other known models and hence introduce a new class of lattices.

References


[1] P. Bak, C. Tang, and K. Wiesenfeld, Self-organized criticality: An explanation of the 1/f noise, Phys. Rev. Lett.
(1987).
[2] N. Biggs, Chip firing and the critical group of a graph, Journal of Algebraic Combinatorics 9 (1999), 2545.
[3] A. Bjorner and L. Lovasz, Chip-firing games on directed
graphs, Journal of Algebraic Combinatorics 1 (1992), 304328.
[4] A. Bjorner, L. Lovasz, and W. Shor, On computing fixed
points for generalized sandpiles, European Journal of Combinatorics 12 (1991), 283-291.
[5] R. Cori and D. Rossin, On the sandpile group of a graph,
European Journal of Combinatorics 21 (2000), 447459.
[6] R. Cori and T.T.T.Huong, Signed chip firing games on some
particular casesand its applications, LIX, Lecole Polytechnique, France Institute of Mathematics, Hanoi, Vietnam,
October 26 (2009).
[7] B. Davey and H. Priestley, Introduction to Lattices and Orders, 1990.
[8] D. Dhar, P. Ruelle, S. Sen, and D. Verma, Algebraic aspects
of sandpile models, Vol. 28, 1995.
[9] K. Eriksson, Chip firing games on mutating graphs, SIAM
Journal of Discrete Mathematics 9 (1996), 118128.
[10] E. Goles, M. Latapy, C. Magnien, and M. Movan, Sandpile
Models and Lattices: A Comprehensive Survey, Theoretical
Computer Science 322 (2004), 383 407.
[11] E. Goles, M. Morvan, and H. Phan, Lattice structure and
convergence of a game of cards, Annals of Combinatorics
(1998).
[12] M. Latapy and H. D. Phan, The lattice structure of chip
firing games and related models, Physica D 155 ( 2001),
69-82.
[13] C. Magnien, Classes of lattices induced by chip firing (and
sandpile) dynamics, European Journal of Combinatorics
(2003), 665 - 683.
[14] H.D. Phan, L. Vuillon, and C. Magnien, Characterization
of lattices induced by (extended) chip firing games, Discrete
Mathematics and Theoretical Computer Science, Proc. 1st
Internat. Conf. Discrete Models: Combinatorics, Computation, and Geometry (DM-CCG01), MIMD (July 2001), 229244.
[15] B. Monjardet, K.P. Bogart, R. Freese, and J. Kung, The
consequences of Dilworths work on lattices with unique irreductible decompositions, The Dilworth Theorems Selected
Papers of Robert P. Dilworth, Birkhauser, Boston (1990),
192-201.

Tiling Finite Planes


Jalal Khairabadi

Bahram Sadeghi Bigham

Department of Computer and Information Sciences

Department of Computer and Information Sciences

Institute for Advanced Studies in Basic Sciences

Institute for Advanced Studies in Basic Sciences

Zanjan, Iran

Zanjan, Iran

j.khair@iasbs.ac.ir

b sadeghi b@iasbs.ac.ir

Rebvar Hosseini

Zohreh Mohammad Alizadeh

Department of Computer and Information Sciences

Department of Computer and Information Sciences

Institute for Advanced Studies in Basic Sciences

Institute for Advanced Studies in Basic Sciences

Zanjan, Iran

Zanjan, Iran

r.hosseini@iasbs.ac.ir

z.alizadeh@iasbs.ac.ir

Abstract: A set of natural numbers tiles the plane if a square-tiling of the plane exists using exactly one square of side length n for every n in the set. From [2] we know that N itself tiles the plane. From that and [3] we know that the set of even numbers tiles the plane while the set of odd numbers does not. According to [1] it is possible to tile the plane using only one odd square, and it is possible to tile the plane using exactly three odd squares, but it is impossible to tile the plane using exactly two odd squares. In this paper we show that there exists a finite set containing n != 3 odd numbers together with a set of even numbers that can tile a finite plane.

Keywords: Tiling; Plane; Finite Set; Finite Plane; Fibonacci.

Introduction

In 1903, M. Dehn [4] asked: is it possible to tile a square with smaller squares such that no two squares are the same size? In 1925, Moron found several rectangles that can be tiled with such squares [9]. Dehn's question was answered affirmatively by R. Sprague [10]. The question and its answer were the subject of the paper "Squaring the square" by Tutte [11], and were popularized in Scientific American by Martin Gardner [12]. Several papers have been presented on this subject ever since [5-7].
In 1975, Golomb [8] asked whether it is possible to tile an infinite plane using different squares with every side length represented. In 1997, Karl Scherer [13] tiled the plane using squares with integral sides but of different sizes; the number of squares of side n used, t(n), is finite, but the function t is not bounded. Golomb's question was answered affirmatively in "Squaring the plane" [2]. The presented solution raised plenty of questions, for example: which sets can tile the plane? Is it possible to tile without squared rectangles? Is there a three-coloured tiling? Is it possible to tile a half-plane?
The second paper [3] showed that neither the set of odd numbers nor the set of primes can tile the plane. That paper found a kind of tiling without squared rectangles and showed that the set of natural numbers can tile many infinite planes, but it also raised many questions. Can a superset of a tiling set tile the plane? Is it possible to partition N into two tiling and non-tiling sets? Is it possible to tile the Riemann plane?
There are some relationships between squared planes and squared squares, but there are also some strange disconnects. It can be proved that it is not possible to cube a cube [11], but it has not been shown that it is impossible to cube the plane.

Corresponding Author, P. O. Box 45195-1159, M: (+98) 918 998-4279, T: (+98) 241 415-5056


Tiling Finite Planes

In this section, we will review tiling for various scenarios. Suppose that a set of even numbers is given; the purpose is to tile a finite plane with this set of even numbers and one or two odd numbers.

2.1 Set With One or Two Odd Numbers

Figure 2: Plane with two even-value tiles

2.1.1 Proposition 1

Definition: A set containing just one odd number can tile the plane.
Proof: Here the set of even numbers is empty. Consider a square with an odd side length; another square with the same side length tiles it. For example, a square with side length 5 can be tiled with another square of side length 5.
By [1] we know that a set containing just two odd numbers and a set of evens cannot tile an infinite plane, so we have the following proposition:

Figure 1: Integral Side

Figure 3: Tiling Example, 32 x 33

2.1.2 Proposition 2

Definition: A set containing exactly two odd numbers and a set of evens cannot tile a finite plane.
To prove this proposition the following definition is needed.
Definition: At every corner of a square s there is an edge extending away from s. We will call such a line a spoke of s. We say that s has an integral side if it has two spokes extending in parallel from adjacent corners. If s has no integral side we say it is a pin-wheel (Figure 1).
Proof: From one point of view, the finite plane is in one of two states:
1. At least one of the sides is even.
2. Both sides are odd.
Proof 1: What if the plane has one even side? As shown in Figure 2, if the squares m and n with odd sides are placed in the plane, one of the distances a or b is odd; since both odd numbers have already been used, an odd length would have to be tiled with the even set alone, which is not possible.


Proof 2: What if both sides of the finite plane are odd? The area will be odd, but if we calculate the area for an even set with two odd numbers, it will be even. So, an even set with two odd numbers cannot tile the plane.

2.2 Set With Four or Five Odd Numbers

Till now, tiling finite planes with 1, 2 and 3 odd numbers and a set of evens has been covered. The next propositions cover tiling with n >= 4 odd numbers and a set of evens.

2.2.1 Proposition

Definition: It is possible to tile a finite plane by 4 odd numbers and a set of evens.
Proof: Start with a 32 x 33 squared rectangle composed of nine squares of sides 1, 4, 7, 8, 9, 10, 14, 15, 18. This sequence of numbers can tile this finite plane, so the proposition is true.
For n = 5 another square with odd side length is needed. If we add 33 to the sequence above, there is a square tiling of a 33 x 65 squared rectangle that contains 5 odd numbers and a set of evens.

2.3 Set With 6 Odd Numbers

So far, tiling finite planes with a set of evens and up to five odd numbers has been covered. The next proposition covers six odd numbers.

2.3.1 Proposition

Definition: It is possible to tile a finite plane by 6 odd numbers and a set of evens.
Proof: Start with a 98 x 65 squared rectangle composed of eleven squares of sides 1, 4, 7, 8, 9, 10, 14, 15, 18, 33, 65. This sequence of numbers can tile this finite plane, so the proposition is true.

Figure 4: Tiling Example, 98 x 65

2.4 Set With n >= 7 Odd Numbers

Theorem 1: For n >= 7, the sequence 1, 4, 7, 8, 9, 10, 14, 15, 18, 33, 65 followed by a Fibonacci-rule sequence starting from 88 can tile a finite plane with n >= 7 odd numbers and a set of evens.
Proof: So far, tiling for the different values of n <= 6 is possible with the following sequences:
n = 4: 1, 4, 7, 8, 9, 10, 14, 15, 18
n = 5: 1, 4, 7, 8, 9, 10, 14, 15, 18, 33
n = 6: 1, 4, 7, 8, 9, 10, 14, 15, 18, 33, 65
From this point on, adding the last two numbers in the sequence gives the next number, as in the Fibonacci sequence. If the next number is odd, the sequence for the next n is complete and tiling with this sequence is possible. Otherwise, adding the last two numbers once more gives the following number, which is odd, so the sequence is complete. The rule is: if the last two numbers k, l in the sequence are odd, then the following two numbers k + l and l + (k + l) are required, just like the Fibonacci sequence rule; otherwise, if one of k and l is odd and the other is even, adding just k + l to the sequence is sufficient. Also, two consecutive even numbers cannot occur, except for 10, 14. So the proof is complete.
Note: if a set grows faster than the Fibonacci sequence, it is not possible to tile the plane with it [1].
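The sequence construction in the proof of Theorem 1 can be sketched in a few lines of Python. This is my reading of the k + l rule above, not the authors' code; in particular, the base sequence is extended from 65 + 33 = 98, whereas the theorem statement mentions a start value of 88, so the exact starting point of the extension is an assumption here.

# Extend the base sequence Fibonacci-style until it contains n_odd odd side lengths.
def tiling_sequence(n_odd):
    """Return a side-length sequence containing at least n_odd odd numbers (n_odd >= 6)."""
    seq = [1, 4, 7, 8, 9, 10, 14, 15, 18, 33, 65]        # base case with 6 odd numbers
    odd_count = sum(s % 2 for s in seq)
    while odd_count < n_odd:
        nxt = seq[-1] + seq[-2]                          # Fibonacci-like extension
        seq.append(nxt)
        if nxt % 2 == 1:
            odd_count += 1
    return seq

for n in (7, 8, 10):
    s = tiling_sequence(n)
    print(n, s[-4:], "odd sides:", sum(x % 2 for x in s))

As a quick consistency check of the base cases, 1^2 + 4^2 + 7^2 + 8^2 + 9^2 + 10^2 + 14^2 + 15^2 + 18^2 = 1056 = 32 x 33, and adding 33^2 and 65^2 gives 6370 = 98 x 65, matching the squared rectangles used above.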

Discussion and Future Works

In this paper, we were not able to prove or disprove whether a set of even numbers together with 3 odd numbers can tile a finite plane, but we proved that if we have a set of even numbers and n odd numbers [1] and the plane is bounded, it is possible to tile the plane. The same question remains for infinite planes: it has been proved for n in {1, 2, 3} in [1], but it is yet an open problem for n = 3.


References
[1] A. M. Berkoff, J. M. Henle, A. E. McDonough, and A. P. Wesolowski, Possibilities and impossibilities in square-tiling, IJCGA, World Scientific Publishing 21 (2011), no. 5, 545-558.
[2] F. V. Henle and J. M. Henle, Squaring the plane, The Am. Math. Monthly 115 (2008), no. 1, 3-12.
[3] J. M. Henle and F. V. Henle, Squaring and not squaring one or more planes, Joint Mathematics Meetings, San Antonio, Dallas, Texas 89 (2005), no. 5, 5476.
[4] M. Dehn, Über die Zerlegung von Rechtecken in Rechtecke, Math. Ann. 57 (1903), 314-322.
[5] I. Feshchenko, D. Radchenko, L. Radzivilovsky, and M. Tantsiura, Dissecting a brick into bars, Geometriae Dedicata 145 (2010), no. 1, 159-168.
[6] C. Freiling, R. Hunter, C. Turner, and R. Wheeler, Tiling with squares and anti squares, The Am. Math. Monthly (2000), 195-204.
[7] I. Gambini, A method for cutting squares into distinct squares, Discr. Appl. Math. 98 (1999), no. 1-2, 65-80.
[8] S. W. Golomb, The heterogeneous tiling conjecture, The J. Recreat. Math. 8 (1975), no. 1-2, 138-139.
[9] Z. Moroń, O rozkładach prostokątów na kwadraty, Przegląd Mat. Fiz. 3 (1925), 152-153.
[10] R. Sprague, Beispiel einer Zerlegung des Quadrats in lauter verschiedene Quadrate, Math. Z. 45 (1939), 607-608.
[11] W. Tutte, Squaring the square, Canadian J. Math. 2 (1950), 197-209.
[12] M. Gardner, A New Selection, The Second Scientific American Book of Mathematical Puzzles & Diversions (1961).
[13] K. Scherer, New Mosaics, privately printed (1997).


J2ME And Mobile Database Design


Seyed Rebvar Hosseini

Lida Ahmadi

Department of Computer and Information Sciences

Department of Mathematics

Institute for Advanced Studies in Basic Sciences

University Of Kurdistan

Zanjan, Iran

Kurdistan, Iran

r.hosseini@iasbs.ac.ir

lida.ehmedi@gmail.com

Bahram Sadeghi Bigham

Jalal Khairabadi

Department of Computer and Information Sciences

Department of Computer and Information Sciences

Institute for Advanced Studies in Basic Sciences

Institute for Advanced Studies in Basic Sciences

Zanjan, Iran

Zanjan, Iran

b sadeghi b@iasbs.ac.ir

j.khair@iasbs.ac.ir

Armin Ghasem Azar


Department of Computer and Information Sciences
Institute for Advanced Studies in Basic Sciences
Zanjan, Iran
a.ghasemazar@iasbs.ac.ir

Abstract: J2ME is a development platform for mobile devices that was introduced by Sun Microsystems Inc. in 1999 for programming limited devices such as phones, PDAs and other small devices. Because of its limitations, this architecture does not provide APIs for data persistence management and relational databases. This paper presents a basis for a local relational database and data persistence management for J2ME based applications, which can be used in any database-aware application on the J2ME platform. In the design and implementation of this database system, mobile device and J2ME limitations have been considered. Also, the B+tree indexing structure has been implemented in this project, which allows fast insertion, deletion, range queries, search, and a backup and restore mechanism for RMS based databases.

Keywords: J2ME; B+tree; mobile devices; data management; storage; relational; database.

Introduction

Mobile devices are becoming more and more popular today and the need for data-oriented software for them is growing very fast, so the need for a fast and acceptable way of storing and retrieving data has become more obvious in the past few years.
Mobile applications should be interactive, in the sense that they respond to user actions. For an application that stores and retrieves data this is an important issue: if storing and retrieving data is not implemented well, it can be very lengthy and time consuming, which will not let the mobile platform succeed. Also, a mobile application should be available in off-line mode as well as on-line. This matters when the cost of data transmission over the network is high or the network is not always available; network speed is also an issue. Hence a data persistence management system is needed in order to store and retrieve information and data on the mobile device [3].
Corresponding Author, P. O. Box 45195-1159, M: (+98) 918 376-0623, T: (+98) 241 415-5056

We have chosen J2ME as our development platform because the number of mobile devices that support this platform has been increasing in the past few years. The Java virtual machine implemented on mobile devices is called the KVM (K for Kilobyte). A J2ME virtual machine is installed on nearly every mobile device manufactured nowadays, and for devices that do not come with a KVM by default, custom releases by third-party manufacturers are available (like IBM J9 for Windows Mobile and Palm-OS, and JBED and Net-Mite for Android).
However, J2ME does not support JDBC, because the classes that implement JDBC are too heavy and complex for such limited devices. In J2ME the available options are JSR 75 for the file connection API and Record Store Management (RMS). JSR 75 needs special permissions to access the disk directly and is not a good option for a database implementation, so the latter is used in this project; that is, this system is built on top of RMS and uses the RMS low-level APIs to store and retrieve data in the background. We know that there is no unique database management solution that supports every kind of mobile device; all the designed systems are specific to one type of system (for example, the Symbian, Android, Palm-OS and Windows Mobile database management mechanisms are different). So some solutions for this lack of a unique DBMS have been developed; among them, IBM Toolbox for J2ME [1] for remote database access, JT-Open [2], Point-Base Micro by Data-Mirror [3] and Code-Base Micro [4] are the most important.
The implementation here uses the B+tree [5] indexing structure, which together with hashing is among the most powerful indexing mechanisms today; most powerful database management systems, such as SQL Server and Oracle, use this type of index in their databases. In the next section the database structure and implementation will be described, section 3 will introduce an application of this database system, and section 4 suggests other implementation ideas and discusses future works.

System Internals

Till now, some information about the platform and its limitations has been discussed. In this section we study the internal structure of the system.

2.1 Overall system analysis

Record Store Management is a mechanism for storing and retrieving data in a J2ME application without special permissions. It is a low-level API and the programmer has to manipulate bytes directly: RMS supports some basic operations and its structure is similar to a simple file, so the programmer packs bytes directly and then writes them to the database [1]. There is no good on-disk data structure for data storage such as a B-tree, B+tree or hashing.

Table 1: System Architecture
  Simple SQL Subset
  RMS B+tree Database
  RMS
  J2ME MIDlet

This basic API supports operations like adding a record, updating a record by ID, accessing a record by ID, record filters and record enumerations. But these operations and their performance are not good enough for database applications like accounting systems or dictionaries, and creating applications that need relational databases is nearly impossible with these simple operations [1]. The main goal here is to extend the RMS capabilities for search, backup, restore and high-speed querying of the stored data. As mentioned earlier, one of the most important data structures for database indexing is the B+tree. This indexing method allows fast searches, high-speed range queries, and addition and deletion of data with minimal disk access. The system also supports object serialization, which is not supported directly in J2ME.
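The benefit of an ordered index such as a B+tree for exact search and range queries can be illustrated with the following language-agnostic sketch (Python rather than J2ME, and not the project's actual code): keys are kept sorted, so both operations reduce to binary search plus a short scan.

# Minimal ordered-index sketch: sorted keys enable fast search and range queries.
import bisect

class OrderedIndex:
    def __init__(self):
        self.keys, self.record_ids = [], []          # kept sorted by key

    def add(self, key, record_id):
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.record_ids.insert(pos, record_id)

    def search(self, key):
        pos = bisect.bisect_left(self.keys, key)
        if pos < len(self.keys) and self.keys[pos] == key:
            return self.record_ids[pos]
        return None

    def range_query(self, lo, hi):
        lo_pos = bisect.bisect_left(self.keys, lo)
        hi_pos = bisect.bisect_right(self.keys, hi)
        return self.record_ids[lo_pos:hi_pos]

idx = OrderedIndex()
for rid, key in enumerate([42, 7, 19, 73, 7, 55]):
    idx.add(key, rid)
print(idx.search(19), idx.range_query(10, 60))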

2.2 Add Mechanism

Two types of addition are supported in the system; the difference between them is the number of records added to the database at one time. The first is the add-record method, which inserts a single record into the database and updates the B+tree index accordingly. This is the fundamental method for adding data in most applications. The other is used when multiple records are waiting to be added to the database and they share the insertion key. For example, consider an accounting table for the items of an invoice we are buying, and suppose that the key is the invoice number. In this situation the items share a common key, so instead of accessing the disk multiple times, the addition can be done via 2, 3 or a few more disk accesses; this can reduce the number of disk accesses from, say, 20 to 2 or even fewer, depending on the database properties. This saves the program time, and on the mobile platform this time saving is considerable. Another type of addition, called bulk loading, is also supported; it is used mainly for backup and restore of the saved data [2] and is discussed in the following sections.

2.3 Search and Range Query Mechanism

The system supports both exact search and range queries and uses the B+tree properties. Since all data are indexed, search and range queries are performed directly on the index, which speeds up these operations; the B+tree structure is known to perform range queries very efficiently. Search and create-result-set methods are implemented to perform these operations, and the programmer interprets the results. Test results on random data are presented in [2].

2.4 Deletion Mechanism

Two types of deletion are common in database programming: physical deletion and marking [4]. In this system the latter is used; that is, when a record is deleted, it is not deleted physically. This method is used by many DBMS systems nowadays. It has some space overhead, but considering the size of memory and storage these days, this is not a big problem for database management systems. On the other hand, the database is compacted during backup and restore operations; that is, the marked records are deleted physically when performing those operations. Unfortunately, J2ME does not support services for running tasks in the background, hence we do not have a real-time mechanism for compacting data whenever the program is not active.

2.5 Updating Records

Updating records is a bit tricky. An update is performed on a record and may change different parts of it. If the changes affect the key part of a record, the location of the record in the B+tree index must be updated as well; otherwise, updating the non-key part of the record is sufficient.

2.6 Backup and Restore

If we have a large collection of records and we want to create a B+tree on some field, doing so by repeatedly inserting records is very slow. Bulk loading is a technique to overcome this shortcoming and can be done much more efficiently. The technique is used on sorted data entries and usually is applied to a large number of records. In this system we gain some improvement in the number of disk accesses while implementing the bulk loading technique. The improvement is achieved by caching the higher B+tree inner nodes in main memory and writing them to disk at once instead of writing them one by one; for this purpose a queue stores the list of upper levels, for as many levels as exist, and the writing process uses this queue to minimize the number of disk accesses.
One application of this method is importing data from other database systems and restoring data from system backups. Furthermore, compacting the database, i.e. physically deleting the records marked for deletion, is possible via this method. This is also the best method for backing up and restoring data, because accessing the RMS file directly and manipulating it is impossible in the J2ME world; using this mechanism minimizes the number of disk accesses and hence the elapsed time for these operations [2].

3 Case Study: J2ME High Speed Accounting Software

For testing the system, a complete accounting software package has been developed. The software was ported from the Windows platform to J2ME and the operations of the two versions are identical; all operations are done in real time. The first step is creating the database schema. For defining a schema and opening a table, the field sizes of the records of the table are needed. So in the constructor


of the table, the field sizes are passed to the table creator. After this step the database is ready for all kinds of actions. After defining all the tables and creating the schemas, the tables remain open until the programmer closes them; this is done to speed up the program by reducing the number of open and close operations on tables. In this software a simple relational database model has been used. Of course, the relational model still needs more work: taking reports is done by carrying out joins on tables manually.
All listings in this software are sorted; this is one of the interesting properties of B+tree indexes, as mentioned earlier. The database system in this software has some special properties. A background queue works simultaneously with the main program to perform lengthy database operations in the background and keep the user interface responsive. This background thread starts with the main program and usually is used for operations which do not affect the next possible actions, or for actions whose probability is low. The system has been tested with real-world data and about 30 tables, and the program speed and performance were good. This database system can be used in any database application such as dictionaries, accounting systems, etc.
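As an aside, the bulk-loading idea from Section 2.6 can be sketched as follows. This is an assumed, simplified illustration of the technique in Python, not the RMS-based implementation: leaves are filled from already-sorted records and parent levels are built bottom-up, so each node is written once instead of being touched by repeated insertions.

# Build B+tree-like levels bottom-up from sorted (key, record) pairs.
def bulk_load(sorted_items, capacity=4):
    """sorted_items: list of (key, record) pairs sorted by key.
    Returns the levels of the structure, leaves first."""
    leaves = [sorted_items[i:i + capacity] for i in range(0, len(sorted_items), capacity)]
    levels = [leaves]
    while len(levels[-1]) > 1:
        children = levels[-1]
        # an inner entry keeps the smallest key of its child as the separator
        inner = [[(child[0][0], child) for child in children[i:i + capacity]]
                 for i in range(0, len(children), capacity)]
        levels.append(inner)
    return levels

items = sorted((k, "record-%d" % k) for k in [15, 3, 42, 8, 23, 4, 16, 31, 12])
levels = bulk_load(items, capacity=3)
print("leaf nodes:", len(levels[0]), "tree height:", len(levels))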

Discussion and Future Works

This system is still young and many new features can be added to it. An example is a simple SQL language manager for insert, search, query, update, delete and other operations like creating joins. Also, the future of database indexing technology is possibly the UB-tree; however, because of the limitations of mobile devices and J2ME, porting SQL completely to this platform and implementing a complete UB-tree system is nearly impossible. Network management and access to remote databases is an important matter for a DBMS, and a network manager for a database system based on J2ME should be designed carefully, because mobile network and communication systems are also limited compared to computers. Indexing on multiple fields separately is also an idea for future work, and data exchange via Bluetooth is another idea that is important for mobile devices.

Notes
[1] IBM Toolbox for J2ME. Available at http://www.ibm.com/.
[2] JTOpen. Available at http://sourceforge.net.
[3] PointBase Micro. Available at http://www.pointbase.com/.
[4] Oracle Lite. Available at http://www.oracle.com/.

References
[1] J. Keogh, J2ME: The Complete Reference, McGraw-Hill/Osborne, 2003.
[2] R. Ramakrishnan, Database Management Systems, McGraw-Hill.
[3] V. Kumar, Mobile Database Systems, John Wiley and Sons, Inc., Hoboken, New Jersey, 2006.
[4] Siddhartha Sen and Robert E. Tarjan, Deletion Without Rebalancing in Multiway Search Trees (2010), 125-128.
[5] R. Bayer and E. M. McCreight, Organization and maintenance of large ordered indexes, Acta Informatica 1 (1972), no. 3, 173-189.
[6] A.V. Aho, J.E. Hopcroft, and J.D. Ullman, Data Structures and Algorithms, Addison-Wesley, Reading, MA, 1987.
[7] D. Comer, The ubiquitous B-trees, ACM Computing Surveys 11 (1979), no. 2, 121-137.
[8] E. Horowitz, S. Sahni, and S. Anderson-Freed, Fundamentals of Data Structures in C, Computer Science Press, Rockville, MD, 1993.
[9] J. Jannink, Implementing deletion in B+-trees, ACM SIGMOD Record 24 (1995), no. 1, 33-38.
[10] D. Knuth, The Art of Computer Programming: Sorting and Searching, Vol. 3, Addison-Wesley, Reading, MA, 1973.
[11] J.D. Ullman, Principles of Database and Knowledge-Base Systems, Computer Science Press, Rockville, MD, 1988.
[12] G. Wiederhold, Database Design, McGraw-Hill, New York, 1983.


IIR Modeling via Skimpy Data and Genetic Algorithm


Tayebeh Mostajabi

Javad Poshtan

Iran University of Science and Technology

Iran University of Science and Technology

Department of Electrical Engineering

Department of Electrical Engineering

mostajabi@elec.iust.ac.ir

jposhtan@iust.ac.ir

Abstract: Because of the wide application of signal processing, in areas such as echo cancellation, noise reduction, bio-systems, speech recognition, communications and control applications, the topic of IIR modeling attracts noticeable interest from researchers. IIR structures are very useful for modeling such recursive systems; however, they produce multimodal error surfaces and need a powerful optimization technique, such as a genetic algorithm, to minimize the resulting error function. On the other hand, in order to find an acceptable model we need a complete and informative data set, which is rarely at hand in many practical applications. In this paper we employ a genetic algorithm for estimating the parameters of IIR structures in which two kinds of skimpy data are used simultaneously. The numerical results presented here indicate that the proposed method is effective and practical in building an acceptable model based on IIR (infinite impulse response) filters, especially when there is a skimpy data set in the time domain.

Keywords: IIR Modeling; Genetic Algorithm; Skimpy Data; System Identification.

Introduction

In many engineering applications we need a suitable mathematical model of the system or process under control or study. In cases where the system is a small device and a relatively detailed map of all its components is available, an appropriate model can be obtained by using the laws and theories of electricity, mechanics and thermodynamics. But in many cases such a detailed map is not available and we need system identification [1]. System identification based on infinite impulse response (IIR) models is preferably utilized in real-world applications because IIR models describe physical plants more accurately than equivalent adaptive FIR (finite impulse response) filters; in addition, they are typically capable of meeting performance specifications using fewer filter parameters. Despite that, IIR structures tend to produce multimodal error surfaces whose cost functions are significantly difficult to minimize, so many conventional methods, especially stochastic gradient optimization strategies, may become trapped in local minima. There are principally two different important sets of applications

in IIR filter design: IIR filter modeling (system identification) and pass-band frequency filter design. The design of an IIR filter as a pass-band filter is mostly based on the desired frequency response (pass-band), whereas IIR filter modeling mainly refers to the identification of unknown plants; the identified model should therefore behave as similarly as possible to the real system in both the time and frequency domains. In order to use IIR modeling, a practical, efficient and robust global optimization algorithm is necessary to minimize the multimodal error function. On the other hand, genetic algorithms are robust search and optimization techniques which have found applications in numerous practical problems; the robustness of GA is due to its capacity to locate the global optimum in a multimodal landscape [2]. Thus several researchers have proposed various methods for using GA in IIR system identification [4, 7], in which population-based algorithms and input-output data in the time domain are utilized to estimate the parameters of an IIR model, with a cost function based on the MSE between the unknown plant and the estimated model. On the other hand, GA is extensively employed in pass-band frequency filter design applications, and in recent years several successful techniques have been introduced to improve GA capability for pass-band filter design [8-10]. In [8] a stability criterion embedded in the GA is applied to design robust D-stable IIR filters, but the IIR filters produced by that algorithm do not necessarily describe the dynamic plant; therefore it is not useful for system identification. In this paper the method of designing IIR filters for pass-band frequency filter design with a genetic algorithm based on the frequency response is employed for IIR system identification, and this method is combined with IIR model identification in the time domain. The models estimated by this combined method have the desired quality in both the time and frequency domains, but its advantages are most apparent when only a skimpy data set is available. Therefore, in this paper a genetic algorithm is applied to combine the information of two separated skimpy data sets. This paper is organized as follows: system identification by IIR filter modeling, and the genetic algorithm as an adaptive algorithm for IIR modeling, are described in Sec. 2. In Sec. 3 the cost functions used, including the combined time-frequency one, and the proposed method for combining the information of two separated skimpy data sets are introduced. Simulations and a comparative study are in Sec. 4. Finally, the conclusions and discussion are given in Sec. 5.

Problem Statement

The recursive expression, with input u(n) and output y(n), and the equivalent transfer function of an IIR filter can be described by (1) and (2):

y(n) = \sum_{k=0}^{M} b_k u(n-k) - \sum_{j=1}^{N} a_j y(n-j)    (1)

G(z) = \frac{b_0 + b_1 z^{-1} + \dots + b_M z^{-M}}{1 + a_1 z^{-1} + \dots + a_N z^{-N}}    (2)

The genetic algorithm begins with a random set of possible solutions, each one embedded in one chromosome; each chromosome has (M+N+1) genes. At every generation, the cost of each individual (chromosome) is evaluated by a predetermined cost function, and individuals with lower fitness values are preferred. The population is then evolved based on the cycle of natural selection, survival of the fittest, and mutation. This cycle is continued in order to find the optimal solution [11].

Proposed Method for Combining the Information of Two Separated Skimpy Data Sets in the Time and Frequency Domains

The typical cost function in the time domain is

f_T = \frac{1}{N_t} \sum_{n=1}^{N_t} [y(n) - \hat{y}_i(n)]^2    (3)

which is the mean squared error (MSE) between the plant output y and the output of the IIR model estimated by the i-th chromosome; a chromosome with a smaller MSE is more valuable. In order to find an appropriate model for system identification, a white-noise input of proper length is usually needed to excite all of the structures of the unknown system, whereas in many practical cases importing such an input is hardly possible, or only an incomplete and uninformative data set is available. Therefore it seems useful to make the skimpy data set more complete and informative by adding a few frequency data. On the other hand, a genetic algorithm, unlike conventional adaptive algorithms, can optimize two cost functions in two different search spaces simultaneously. Therefore, in this paper a genetic algorithm is applied to combine the information of two separated skimpy data sets in the time and frequency domains. As a result, the following multiobjective cost function is considered:

f_{TF} = \frac{1}{2N_t} \sum_{n=1}^{N_t} [y(n) - \hat{y}_i(n)]^2 + \frac{1}{2N_t} \sum_{\omega} [\,|H(\omega)| - |H_i(\omega)|\,]^2    (4)

Case Study

In this section, a benchmark IIR filter taken from [7] is considered as an unknown plant, in order to examine the suggested combinator method for skimpy data sets that is employed by the GA for system identification. In the simulations, the quality and performance of the model estimated by the combinator method are compared with models estimated by using one kind of data set. In this regard, consider the fifth-order IIR plant with the following transfer function and the reduced-order model:

H_p = \frac{B(z^{-1})}{A(z^{-1})}    (5)

B(z^{-1}) = 0.1084 + 0.5419 z^{-1} + 1.0837 z^{-2} + 1.0837 z^{-3} + 0.5419 z^{-4} + 0.1084 z^{-5}
A(z^{-1}) = 1 + 0.9853 z^{-1} + 0.9738 z^{-2} + 0.3864 z^{-3} + 0.1112 z^{-4} + 0.0113 z^{-5}

H_m = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3} + b_4 z^{-4}}{1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} + a_4 z^{-4}}    (6)

Figure 1: Comparative step responses for models estimated from 10 time data, 10 frequency data, and 10 time plus 10 frequency data

Figure 2: Comparative impulse responses for models estimated from 10 time data, 10 frequency data, and 10 time plus 10 frequency data

Figure 3: Comparative Bode diagrams for models estimated from 10 time data, 10 frequency data, and 10 time plus 10 frequency data


First, the genetic algorithm is applied with 500 samples produced by a white-noise input; then the GA estimates a model with only 10 of them. After that, the GA is employed in the frequency domain with only 10 data points. Finally, 10 data points in the time domain are combined with 10 frequency data points in order to examine the proposed method. To compare the four estimated models with each other and with the real plant (equation (5)), their step responses, impulse responses and Bode diagrams are depicted in Figures 1 to 6. These diagrams illustrate valuable information about the system structure that is needed for other applications such as controller design, so it is important that the estimated model behaves as similarly as possible to the real plant in such responses. Figure 1 shows that the model estimated from 10 time data has no acceptable performance with respect to the real plant; it also shows the poor step-response behaviour obtained when only 10 frequency data are employed for estimation. In contrast, when the two skimpy data sets are combined using the proposed combinator method, the estimated model has an acceptable step-response quality (transient and steady state) with respect to the real plant. Similar conclusions can be drawn from the impulse responses of the estimated models in Figure 2 and from their Bode diagrams in Figure 3. In Figures 4 to 6 the model estimated from the skimpy data collection consisting of 10 time data plus 10 frequency data is also compared with the one estimated from 500 data samples produced by the white-noise input. These figures emphasize that the model estimated by the combinator method has better responses with respect to the real plant. These results illustrate that combining skimpy time data with skimpy frequency data can be useful for system identification.
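For readers who want to reproduce the general workflow, the following is a minimal real-coded GA sketch under assumed operators (truncation selection, blend crossover, Gaussian mutation); it is an illustration of the parameter-estimation loop, not the exact algorithm or settings of the paper, and for brevity it optimizes only a time-domain MSE.

# Minimal GA loop estimating the 9 coefficients of the reduced-order model.
import numpy as np
from scipy.signal import lfilter

b_p = [0.1084, 0.5419, 1.0837, 1.0837, 0.5419, 0.1084]
a_p = [1.0, 0.9853, 0.9738, 0.3864, 0.1112, 0.0113]
rng = np.random.default_rng(1)
u = rng.standard_normal(200)
y = lfilter(b_p, a_p, u)

def mse(theta):
    y_hat = lfilter(theta[:5], np.concatenate(([1.0], theta[5:])), u)
    err = np.mean((y - y_hat) ** 2)
    return err if np.isfinite(err) else np.inf       # unstable candidates get worst fitness

pop = rng.uniform(-1, 1, size=(40, 9))
for gen in range(100):
    fit = np.array([mse(ind) for ind in pop])
    parents = pop[np.argsort(fit)[:20]]              # keep the best half
    children = []
    for _ in range(20):
        p1, p2 = parents[rng.integers(20)], parents[rng.integers(20)]
        children.append(0.5 * (p1 + p2) + rng.normal(0, 0.05, size=9))
    pop = np.vstack([parents, children])
print("best MSE:", min(mse(ind) for ind in pop))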


Figure 4: Comparative step responses for the model estimated from 500 time data and the one estimated from 10 time plus 10 frequency data

Figure 5: Comparative impulse responses for the model estimated from 500 time data and the one estimated from 10 time plus 10 frequency data

Figure 6: Comparative Bode diagrams for the model estimated from 500 time data and the one estimated from 10 time plus 10 frequency data

Discussion and Conclusions

It is well known that obtaining an informative data set and importing a white-noise input of proper length is rarely possible in many practical situations. Therefore this article proposed a novel combinator method embedded in a GA for system identification and tried to demonstrate some of its capability through the case study. The numerical results indicate that the suggested method is effective in building an acceptable model for IIR identification; especially when there is only a skimpy data set in the time domain, the collection can be made considerably more informative by adding a few frequency data.

References


Concurrent Overlap Partitioning: A New Parallel Framework for Haplotype Inference with Maximum Parsimony

Mohsen Taheri
Islamic Azad University, Damavand Branch
Department of Computer Engineering
Damavand, Iran
Mohsenta2003@gmail.com

Alireza Meshkin
Islamic Azad University, Damavand Branch
Department of Computer Engineering
Damavand, Iran
Meshkin@ibb.ut.ac.ir

Mehdi Sadeghi
National Institute of Genetic Engineering and Biotechnology
Tehran, Iran
M Sadeghi@ibb.ut.ac.ir

Abstract: Haplotype information has become increasingly important in analyzing fine-scale molecular genetics data, such as disease gene mapping and drug design, and haplotype inference is now one of the most important challenges in Bioinformatics. Since obtaining haplotype information directly from natural chromosomes is too time consuming and costly, bioinformatics researchers try to obtain this valuable information with computational methods. In this paper a new partitioning algorithm for haplotype inference under the parsimony criterion is introduced. This approach builds a new structure for solving HIPP with parallel execution capability. It is called overlapping window partitioning and merge results (OWPMR), because it divides the original problem into smaller and simpler sub-problems, infers the sub genotype matrices (sub-partitions) with a specific solver, and finally merges the sub-results to form the final haplotypes. Using the window size as the column division factor, OWPMR converts the original problem into several sub-problems and, by resolving the overlapping genotype columns in each step, tries to find an optimum haplotype for these common columns. OWPMR's performance depends strongly on the size of the overlap window and on the algorithm used for solving the sub-partitions. In this article we modify and improve the parsimonious tree-grow algorithm (PTG), adding greedy selection to eliminate its random behavior, and name the result greedy PTG. Greedy PTG, with its tree-based structure for solving HIPP and its good results, is used as the solver in OWPMR. Also, by choosing the window size from a specific range, OWPMR reaches the best result in terms of the number of haplotypes and the error rate. The accuracy of OWPMR, measured by error rate, global site error, single-site error and switch accuracy, was compared with PHASE, 2SNP and GERBIL, and it is shown that in most test and real sample problems OWPMR obtains results with the same accuracy as PHASE in 2SNP's time complexity.

Keywords: Haplotype inference; pure parsimony; Bioinformatics; SNPs; PTG.

Corresponding Author, T: (+98) 912 286 2958

Introduction

The role of genetic differences and inheritance in human disease is very important, but most aspects of the specification of these differences remain unknown [1]. Single nucleotide polymorphisms (SNPs) are the most common form of genetic difference between human chromosomes. In other words, the differences between individual DNA sequences mostly occur at single-base sites, in which more than one nucleic acid or gap is observed across the population. Such variations are called single nucleotide polymorphisms (SNPs).

A haplotype is the SNP information for each of the two unlike copies of a chromosome in a diploid organism, and it is estimated from aligned single nucleotide polymorphism (SNP) fragments. Because of its importance for the analysis of fine-grain genetic data, for mapping disease genes to specific haplotype patterns, and for drug design, haplotype inference has attracted more and more attention in recent years.

Haplotypes encode the genetic data of an individual at a single chromosome. The human chromosome set is diploid, so each chromosome has a maternal and a paternal origin, but it is technologically infeasible to separate the information of homologous chromosomes by experimental methods. For this reason, interest in computational methods has increased in recent years. A relevant approach in haplotype inference is pure parsimony. Haplotype inference by pure parsimony (HIPP) aims at finding the minimum number of distinct haplotypes which explains a given set of genotypes. Parsimony haplotype inference belongs to the NP-hard or APX-hard class [14], [16], [20].

Previous Related Works

Generally, all approaches that have been proposed for the haplotype inference problem depend on the nature of the criterion used to choose a set of haplotypes among the possible alternatives [32], [28]. Most algorithms used for haplotype inference are of one of the following types: statistical methods such as Clark's [2], parsimony methods [3], [21], [23], maximum likelihood methods [24], phylogeny-tree-based methods [25-28] and Bayesian methods [29-30]. All of these algorithms fall into two main families: the likelihood criterion [29-32] and the parsimony criterion [3].

The likelihood criterion states that among many plausible explanations of an observed phenomenon, the one with the highest probability of occurring should be preferred [33]. Hence, under the likelihood criterion, a set of haplotypes is defined to be optimal (or the most likely) if it has the highest probability of explaining the observed genotypes [34]. The likelihood criterion is well-suited when the genotypes are characterized by high variability [29]. As drawbacks, the corresponding optimization problem is NP-hard [34], and in some circumstances it may provide misleading results [35]. A recent extension of the likelihood criterion, Bayesian inference [36-38], also uses biologically-based prior probabilities to obtain more accurate estimates of haplotype frequencies [37], [39], [3]. Just as with the likelihood criterion, however, finding the optimal phylogeny using Bayesian inference is NP-hard [33], [32].

The parsimony criterion states that among many plausible explanations of an observed phenomenon, the one requiring the fewest assumptions should be preferred [32]. Hence, based on the parsimony criterion, a set H of haplotypes is defined to be optimal, or the most parsimonious, for the genotypes analyzed if H has the smallest cardinality [3], [25], [37]. The parsimony criterion is well-suited when the genotypes are characterized by low-to-medium variability [6], [35] and is at the core of several versions of the haplotype problem, namely: Clark's problem [2], the pure parsimony haplotyping problem [25], the minimum perfect phylogeny problem [35], the minimum recombination haplotype configuration problem [40], the zero recombination haplotype configuration problem [40], and the k-minimum recombination configuration problem [40]. As a drawback, apart from some polynomial cases ([35], [40], [28]), each version of these optimization problems has been proved to be NP-hard [33], [34].

The first ideas proving the parsimony criterion started with Gusfield's observations on the results of Clark's algorithm [3]. He observed that, among all runs of Clark's algorithm, the run with the minimum number of distinct haplotypes, or with the haplotypes used the greatest number of times in phasing the genotype sequences, gives the accurate set for the haplotype inference problem [2]. In recent years heuristic algorithms [15], [16], [17], greedy algorithms [3], branch and bound algorithms [4], linear programming methods [3], [5], [7], semidefinite programming [8], [9], SAT-based algorithms [21], [22], [23], [24], and pseudo-Boolean optimization algorithms [12] have been proposed by bioinformatics researchers for HIPP. The first approximation algorithm, with a guaranteed performance of O(2^{k-1}) where k is the number of heterozygous sites in a genotype, was introduced by Lancia [25]. Hange gave another approximation algorithm in the O(log n) complexity class, where n is the number of genotypes [19]. A new heuristic algorithm based on the parsimonious tree-grow (PTG) method was announced by Li in [18], with time order O(n^2 m) where n is the number of genotypes with m SNP sites. Recently, [9] considered the Clark compatibility graph and published a polynomial algorithm for HIPP. Markov-chain-based algorithms are also used for the haplotype inference problem in the PHASE [18], [19] and PLEM [20] software. In recent years the SAT-based algorithms have attracted much interest, and the SHIPS software [12] is one of the SAT-based algorithms. Also PBO, as



a special form of SAT solver, is used in the RPOLY software [5], [6].

In this article a new tree-structured algorithm based on the divide and conquer method is introduced. Using the overlap window size as the partitioning factor, and using an enhanced version of PTG, named greedy PTG, for solving each sub-partition, the haplotype inference algorithm is dramatically improved both in time and in accuracy. A powerful merge algorithm is used to mix the sub-partition results and shape the final haplotypes. Hence the method is named the overlap window partitioning and merge results method, or OWPMR for short. The partition solver, greedy PTG, is a new version of the PTG method introduced in [18]. It builds a tree for each partition, which makes it possible to merge the sub-result trees and obtain the final haplotypes. With a DFS search on the greedy PTG trees, the result of each partition is shaped, and then the merge algorithm uses a matching process to find the longest common string between partitions and shape the final haplotypes.

The organization of this paper is as follows: the definition of the haplotype inference problem under the parsimony criterion is surveyed in section 2; greedy PTG and its enhancements over the PTG method are discussed in section 3; after an in-depth look at the OWPMR idea in section 4, section 5 discusses the parameters that affect OWPMR's results; finally, in section 6, OWPMR is compared with three other HIPP programs, PHASE, 2SNP and GERBIL, and for real and simulated data sets the accuracy and execution time of these programs are evaluated by charts and numbers.

PROBLEM FORMULATION

SNP sites are positions in a DNA sequence where the nucleotides of different individuals have different alleles. All distinct nucleotides that occur at an SNP site are named the alleles of that SNP site. Usually only two of the four possible nucleotides occur at an SNP site, so we can restrict our computation to bi-allelic SNPs, which form the vast majority of known SNPs. In this case a haplotype can be represented by a 0/1 vector, typically by representing the most frequent allele as 0 and the alternate allele as 1. A genotype is represented as a 0/1/2/? vector, where 0 (1) means that both the maternal and the paternal chromosome contain the 0 (1) allele, 2 means that the two chromosomes contain different alleles, and ? means that the allele identities are unknown. In this article the unknown sites are discarded, so in the following the genotype matrix is represented over 0/1/2.

The allele at locus i of haplotype h is denoted by h(i). Similarly, for a given genotype vector g, the genotype at locus i is denoted by g(i).

We say that an unordered haplotype pair <h, k> resolves the genotype g, and write <h, k> -> g, if the following conditions hold for each j = 1, ..., m:

g[j] = 0  iff  (h[j] = 0 and k[j] = 0)
g[j] = 1  iff  (h[j] = 1 and k[j] = 1)
g[j] = 2  iff  (h[j] = 0 and k[j] = 1) or (h[j] = 1 and k[j] = 0)

In this case <h, k> is a pair of haplotypes inferred from the genotype g.

The input of HI algorithms consists of n genotype vectors, each with m coordinates corresponding to SNPs. SNP values belong to {0, 1, 2}, where 0s and 1s denote homozygous sites with the major allele and the minor allele, respectively, and 2s stand for heterozygous sites. Phasing replaces each genotype vector by two haplotype vectors with SNP values belonging to {0, 1} [12].

Problem: Pure Parsimony Haplotyping (PPH) (HIPP). Given a set G, as a matrix with n genotypes having m SNPs each, find a minimum set H of haplotypes such that for each genotype g_k in G there exists a pair of haplotypes {h_i, h_j} in H resolving g_k. PPH belongs to the NP-hard and APX-hard class [14], [16], [20].
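For concreteness, the resolution relation defined above can be written as a small predicate. This is an illustrative sketch only; the encoding of genotypes and haplotypes as Python lists of integers is an assumption made for the example, not part of the paper.

    def resolves(h, k, g):
        """Return True if the haplotype pair (h, k) resolves genotype g.

        h, k : lists of 0/1 alleles; g : list of 0/1/2 genotype values,
        all of the same length m (encoding assumed for this sketch).
        """
        for hj, kj, gj in zip(h, k, g):
            if gj in (0, 1):              # homozygous site: both alleles equal gj
                if not (hj == gj and kj == gj):
                    return False
            elif gj == 2:                 # heterozygous site: the alleles must differ
                if hj == kj:
                    return False
            else:
                raise ValueError("unknown site value %r" % gj)
        return True

    # Example: g = 0 2 1 2 is resolved by h = 0 0 1 1 and k = 0 1 1 0
    assert resolves([0, 0, 1, 1], [0, 1, 1, 0], [0, 2, 1, 2])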

GREEDY PTG, AN ENHANCED VERSION OF PTG

The analysis of PTG in the previous section shows that the main disadvantage of PTG is the accidental (random) choices made when solving undividable indexes. Accordingly, a greedy enhancement of the PTG algorithm is introduced to overcome some of these choices. Greedy PTG suggests that at each level 1 <= j <= m of PTG's tree and for each undividable index i in I(j), instead of a random selection between the two nodes of T_i^j, the node with the maximum number of indexes in the next level, (j+1), is chosen.
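The greedy selection rule can be illustrated with a tiny sketch; the representation of the two candidate nodes as dictionaries carrying their index sets for level j+1 is an assumption made only for this example.

    def greedy_select(node_a, node_b):
        """Pick the candidate node whose index set at level j+1 is larger.

        node_a, node_b : dicts with an 'indexes_next_level' set (assumed encoding).
        Ties fall back to the first node, mirroring an arbitrary choice.
        """
        if len(node_b["indexes_next_level"]) > len(node_a["indexes_next_level"]):
            return node_b
        return node_a

    # The node pointed to by more indexes in the next level is preferred
    a = {"label": "v1", "indexes_next_level": {2, 5}}
    b = {"label": "v2", "indexes_next_level": {1, 3, 4}}
    print(greedy_select(a, b)["label"])   # -> "v2"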


In other words, between the two nodes of T_i^j, the node that has the larger set of indexes in the (j+1)-th level is nearer to the parsimony criterion, because it is pointed to by more indexes from the previous levels; so selecting this node for solving the undivided index i leads to fewer nodes in level j+1 and finally to a minimum number of distinct haplotypes.

Based on greedy PTG, for a selection between the same branches of two nodes in T_i^j, or a selection of different branches from one member of T_i^j, the branch type with the larger cardinality of the index set in level j+1 is selected, and in this way the haplotype paths are routed through common haplotype prefixes.

Some other enhancements of PTG were also made. The first is using the compatibility graph to discover independent sub-sections and then solving each of these sub-sections independently of the others. Another enhancement in greedy PTG is the detection of symmetric columns in the reduction phase, keeping one of these columns and eliminating the other redundant columns in the preprocessing phase. All in all, the main steps of greedy PTG are summarized in the following order:

4.1 Preprocessing

Make the compatibility graph and detect the independent sub-problems of the original problem. Discover the symmetric columns of the genotype matrix and eliminate all of them except one.

4.2 Solving dividable indexes

This step is the same as in PTG.

4.3 Solving undividable indexes

For each level j and for each set I(j), based on the greedy rules, try to choose the node that has the larger set of indexes in the next level. So for an undividable index i in I(j), if T_i^j = {v_i^{k1}, v_i^{k2}} and both v_i^{k1} and v_i^{k2} have the same type of branch, the branch with the maximum number of pointing nodes is selected. If there is no appropriate branch, one must be created for that level.

This new version of PTG was implemented in C and named greedy PTG. Greedy PTG, by using greedy rules instead of random choices, gains a fixed result with comparably good accuracy. Obviously greedy PTG is of the same complexity order as PTG, and both of them are members of the O(n^2 m) class.

OWPMR, A New Tree Based Algorithm For Haplotype Inference With Maximum Parsimony

In order to find a parallel algorithm for HIPP, the divide and conquer method, as a top-down architecture, seems suitable. Since HI under the pure parsimony criterion is an optimization problem, independently solving the sub-matrices and merging their results cannot attain the optimum result; the solving and the merging of the sub-matrices are related to each other. In this regard, there are two basic issues that must be considered.

First of all, it is necessary to divide the original genotype matrix into sub-problems such that each resulting partition can be inferred by the current HIPP solver efficiently, both in time and in accuracy, and such that it is possible to reach an acceptable parsimonious solution for each partition. The selection of an appropriate HIPP solver is the main role of this phase. As a second step, the merging of the sub-problem results to obtain the final haplotypes under the parsimony criterion is another point that needs to be addressed. The existence of overlap sections between the divided partitions and the need to merge the consequent haplotypes are the main features and needs of this new algorithm.

To satisfy the parsimony criterion, the partitioning phase of the new algorithm should make overlapping sub-matrices. In other words, the divide and conquer algorithm makes genotype sub-matrices with common columns, so solving these overlapping sub-matrices yields sub-haplotypes that have common postfixes and prefixes with each other. With these common postfixes or prefixes present, it becomes possible to merge the sub-results and reach the final parsimonious solution.

This new algorithm, based on using overlapping SNP sites in the partitioning phase, is named overlap window partitioning and merge results, or OWPMR. The main idea of OWPMR is using a partitioning window size factor. For a genotype matrix G = [g_ij]_{n x m}, OWPMR makes m-w+1 sub-matrices with n genotypes and w SNPs, where w is the size of the overlap window. If G_{i,k} represents the genotype matrix from column i until column k, then OWPMR, after the preprocessing (reduction) phase, divides and solves G in the following segments:

G_{1,w}, G_{2,w+1}, G_{3,w+2}, ..., G_{k,w+k-1}, ..., G_{m-w+1,m}.

Also, for each 1 <= t <= w-1 and for each 1 <= i, k <= m, the sub-matrix G_{i,k} has common prefixes or postfixes of length (w-t) SNPs with the other nearby sub-matrices at distance t from G_{i,k}, and its haplotype result can be merged with 2(w-1) other sub-matrices.
These m-w+1 sub-matrices are solved independently and, for each of them, the results are stored in a well-defined data structure. Common SNP sites are solved in multiple partitions, and during the repeated resolving of the w columns in each cycle, common haplotypes for the SNP sites i..i+w, for 2 <= i <= m-1, are chosen. Obviously such a haplotype set is close to the parsimony criterion. The performance of OWPMR depends directly on the window size, and with increasing window size, up to a specific range, the accuracy of the results increases.

After partitioning the genotype matrix into sub-problems with w columns, solving each sub-problem, and finally building a data structure for saving the sub-haplotypes, these substrings must be merged to obtain the final haplotypes. The overlap window factor w used for partitioning makes it possible to concatenate the substrings obtained in the first step. Since for each w consequent genotype sub-matrices there already exist a prefix and a postfix for concatenation, with a good merger algorithm it is possible to reach the maximum parsimony haplotypes. Pseudo code 1 is an abstract version of OWPMR. Suppose that G is the input genotype matrix with n genotypes and m SNP sites per genotype, and HAPSET is the data structure with m-w+1 entries for saving the elementary results. nw is the index pointer to the current partition of the genotype matrix; in each iteration of the algorithm, the nw-th partition of the genotype matrix, with n genotypes and w sites, is forwarded to the HIPP solver. OWPMR uses greedy PTG as the solver, and the reasons for this choice are discussed below. Finally, after all m-w+1 sub-partitions are inferred, the merge algorithm receives HAPSET as input and, after merging and concatenating the intermediate results, shapes the final haplotypes and stores them in the ALLHAP data structure. ALLHAP is a structure with 2n x m entries, and the final haplotypes are saved in it.
for nw = 0; nw < m-w+1 do
    Sub-G = Partitioning[nw];              // Sub-G is the nw-th sub-matrix
    GreedyPTG(Sub-G, n, w, HAPSET[nw]);    // solve each sub-matrix and save the result in HAPSET[nw]
end
MERGE(HAPSET, ALLHAP);                     // ALLHAP saves the merged result

Algorithm 1: OWPMR pseudo code


5.1 Partitioning step

In the partitioning step of OWPMR, the genotype matrix G is divided based on the overlap window size factor. Since the result of the partitioning algorithm is used in an optimization problem, the results of the sections must be related to each other. OWPMR draws a connection between each w consequent partitions by defining overlapping columns between them. When the window size equals w, OWPMR's partitioning algorithm creates m-w+1 genotype sub-matrices with w SNP sites in each partition; the j-th sub-matrix contains the j-th column up to the (j+w-1)-th column. The following pseudo code shows the partitioning algorithm for OWPMR.
for nw = 1; nw <= m-w+1 do
    for row = 0; row < n do
        for column = 0; column < w do
            SUB_G[row*w + column] = G[row*m + column + nw];
        end
    end
    GreedyPTG(SUB_G, n, w, HAPSET[nw], outp);
end

Algorithm 2: The partitioning algorithm for OWPMR
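A minimal, runnable version of this partitioning step might look like the following; the NumPy representation of the genotype matrix and the function name are assumptions made for illustration, not the authors' code.

    import numpy as np

    def partition_overlapping(G, w):
        """Split an n x m genotype matrix into m-w+1 overlapping n x w windows.

        Window j (0-based) holds columns j .. j+w-1, so consecutive windows
        share w-1 columns, as required by OWPMR's merge step.
        """
        n, m = G.shape
        return [G[:, j:j + w] for j in range(m - w + 1)]

    # Tiny example: 2 genotypes, 5 SNP sites, window size 3 -> 3 overlapping windows
    G = np.array([[0, 2, 1, 2, 0],
                  [2, 2, 0, 1, 1]])
    for j, sub in enumerate(partition_overlapping(G, 3)):
        print("window", j, sub.tolist())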

5.2 Merging the partition results and forming the final result

The main step of OWPMR is merging the partition results and shaping the final haplotypes. The merge method takes the mixed results produced by the inference of the sub-matrices and produces the final haplotypes by probing the HAPSET data structure row by row and, for each row, forming the parsimonious haplotype. Since consecutive sub-matrices have common columns, their haplotypes have common postfixes and prefixes. If G_{i,k} is the genotype sub-matrix from column i until column k, and H_{i,k}^j is the set of parsimony haplotypes for G_{i,k} inferred by greedy PTG, then some postfixes of H_{i,k}^j are prefixes of H_{i,k+1}^j, and this rule holds for each w partitions. The existence of common postfixes and prefixes between each w subsequences makes it possible to check the matching of the inferred haplotypes and merge them efficiently. The merge algorithm needs to find the largest common substring in each w columns of HAPSET. In this regard, OWPMR uses a MATCH routine to find the largest common substrings and produce the final haplotypes. The MATCH routine in each step compares the last k bits of one substring with the first k bits of the other, with k starting from w-1 down to 1, and tries to find the largest matching between the two strings.
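A small sketch of this suffix/prefix matching step is given below; the encoding of partial haplotypes as Python strings is an assumption made only for this illustration.

    def match(left, right, w):
        """Return the largest k (w-1 >= k >= 1) such that the last k symbols of
        `left` equal the first k symbols of `right`, or 0 if none match."""
        for k in range(w - 1, 0, -1):
            if left[-k:] == right[:k]:
                return k
        return 0

    def merge_pair(left, right, w):
        """Concatenate two partial haplotypes on their largest suffix/prefix overlap."""
        k = match(left, right, w)
        return left + right[k:]

    # Example with window size 3: "0101" and "011" overlap on "01"
    print(merge_pair("0101", "011", 3))   # -> "01011"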


Since the results of greedy PTG on a genotype sub-matrix have tree form, to merge these trees a DFS search is performed on each tree and the extracted form of the tree is saved in HAPSET.

In the worst case, the number of comparisons needed to fix an allele of a parsimonious haplotype from w-1 partitions equals the number of paths of any length in a full binary tree with w levels, which is 2^w - 2. For a genotype matrix with n genotypes and m SNP sites this is multiplied by m and n, so the complexity of the merge in the worst case is O(mn2^w). To avoid this situation, w must be limited as the main factor. OWPMR uses only four levels of HAPSET, which decreases the number of comparisons to mn*2^4 and makes the merge an O(mn) algorithm. If there is no matching within four levels of the current partition, the merge algorithm uses a default approach and, for each of the next 4 alleles, takes the last bit of the appropriate partition. This default selection is revised in the next steps without further investigation.

Figure 1: the number of distinct haplotypes with different window sizes

With this consideration, there is a trade-off between the time complexity of the matching algorithm and the overlap window size. As the window size increases, the time for matching increases, so OWPMR needs to find an optimum range of window sizes for each problem. In other words, the merging algorithm of the sub-genotypes needs to find a maximum matching path between each w partitions, and without reducing the size of the matching level it is not possible to find parsimony solutions in polynomial time.

EVALUATION OF RESULTS

Four widely used measures for assessing accuracy are adopted in this study; the metrics include:

The haplotype error rate (HE): the average proportion of haplotypes incorrectly inferred (percentage of reconstructed haplotypes with at least one site erroneously assigned) [26].

The single-site error rate (SSE): the average proportion of ambiguous SS (that is, heterozygote SS in the individual) whose phase is incorrectly inferred [26].

The global single-site error rate (GSSE): the average proportion of all SS whose phase is incorrectly inferred. Note that the denominator here is the total number of sites, regardless of whether they are ambiguous or not [26].

The switch error (SWE), as defined by [19], corresponds to one minus the switch accuracy of [24]: the average proportion of heterozygous positions mis-assigned relative to the previous heterozygous position. It shows whether errors in haplotype reconstruction are mainly due to the mis-assignment of isolated SS (high error) or of blocks of neighboring SS (low error) [26].

The β2AR data set was used as the real data set for the evaluation of OWPMR. β2 adrenergic receptors (β2AR) are G protein-coupled receptors that mediate the actions of catecholamines in multiple tissues. There are 13 variable sites within a span of 1.6 kb in the human β2AR gene. Among 121 individuals there are 18 distinct genotypes, but only 10 haplotypes resolve all the genotypes [27].

Most haplotype inference software needs to run the original problem multiple times to achieve the best result. OWPMR instead solves it with different overlap windows; it can also be run iteratively, but the aim is to find the best window, or a range of optimum windows, for solving with OWPMR. Figures 2 and 3 show OWPMR's numbers of distinct haplotypes. As they show, for window sizes less than 2 OWPMR does not obtain good results; as the window size increases it obtains better results, and finally with w = 7 it reaches the best result and in all 10 runs of the algorithm obtains the same result as the real data. For 7 <= w <= 10 OWPMR reaches the same results with the same accuracy, so w = 7 is the optimum overlap window size.

Figure 2: the number of distinct haplotypes with different window sizes

The optimum overlap window size is the value of w for which OWPMR obtains the maximum-parsimony haplotype set in minimum time with high accuracy. The analysis of OWPMR's accuracy charts shows that when the growth of the accuracy and of the number of distinct haplotypes becomes fixed on the accuracy/distinct-haplotype axes, the window size is optimum or near the optimum size. In the following simulation data this becomes completely clear.

CONCLUSIONS

An advanced divide and conquer approach for the haplotype inference problem with the pure parsimony (HIPP) criterion was introduced in this paper. OWPMR, a new fast and highly accurate phasing algorithm, uses an overlap partitioning algorithm. It can be hoped that OWPMR will be very useful for high-throughput genotype data processing, e.g., SNP mapping arrays. For future work we will extend our method to use an agent-based system for HIP. Since OWPMR uses a parallel framework, it can be implemented with a parallel algorithm, and with parallel execution of each step the performance of OWPMR can be improved dramatically. A multi-agent system has not been used for the haplotype inference problem so far. With regard to the parallel execution capabilities of our proposed method (OWPMR), it seems important to create a specific methodology to solve the HIP problem with a multi-agent system.

References

[1] Bafna V., Istrail S., and Lancia G., Polynomial and APX-hard cases of the individual haplotyping problem, Theoretical Computer Science 335 (2005), no. 1, 109-125.

[2] Clark A. G., Inference of haplotypes from PCR-amplified samples of diploid populations, Molecular Biology and Evolution 7 (1990), no. 2, 111-122.

[3] Gusfield D., Haplotyping by pure parsimony, 14th Symposium on Combinatorial Pattern Matching (CPM) (2003), 144-155.

[4] Yuzhong Zh., Xu-Yun Z., Qiangfeng Zh., and Guoliang Ch., An overview of the haplotype problems and algorithms, Higher Education Press, co-published with Springer-Verlag GmbH 1 (2007), no. 3, 272-282.

[5] Krmann K., Haplotype inference using overlapping segments, Master's Thesis, University of Tartu (2006).

[6] Adkins R. M., Comparison of the accuracy of methods of computational haplotype inference using a large empirical dataset, BMC Genet (2004), 22.

[7] Lynce I., Graca A., Marques-Silva J., and Oliveira A. L., Haplotype Inference with Boolean Constraint Solving: An Overview, Oxford Journal - Bioinformatics 14 (2008), 3545-3549.

[8] Li Z. P., Zhou W. F., Zhang X. S., and Chen L., A parsimonious tree-grow method for haplotype inference, Oxford Journal - Bioinformatics 21 (2005), 3475-3481.

[9] Benedettini S., Roli A., and Di Gaspero L., Two-level ACO for Haplotype Inference under pure parsimony, IEEE/ACM Transactions on Computational Biology and Bioinformatics 8 (2008), no. 12, 149-158.

[10] Hung P. and Chen H., Parallel Algorithm for inferring Haplotype, 2007.

[11] Lynce I., Marques-Silva J., and Di Gaspero L., Haplotype inference with Boolean satisfiability, International Journal on Artificial Intelligence Tools 17 (2008), no. 2, 355-387.

[12] Graca A., Marques-Silva J., Lynce I., and Oliveira A., Efficient haplotype inference with pseudo-Boolean optimization (2007), 125-139.

[13] Graca A., Marques-Silva J., Lynce I., and Oliveira A., Efficient haplotype inference with combined CP and OR techniques, CPAIOR08 (2008), 308-312.


[14] M. Melanie, An Introduction to Genetic Algorithms, MIT Press, Cambridge, Massachusetts - London, England (1999).

[15] M. Dorigo and T. Stutzle, Ant Colony Optimization, MIT Press, Cambridge, MA, USA (2004).

[16] Li Z., Zhou W., Zhang X., and Chen L., A parsimonious tree-grow method for haplotype inference, Oxford Journal - Bioinformatics 21 (2005), 3475-3481.

[17] Chung R. H., Gusfield D., Zhang X., and Chen L., Haplotype inferral using a tree model, Bioinformatics 19 (2003), no. 11, 780-781.

[18] Scheet P. and Stephens M., A fast and flexible statistical model for large scale population genotype data: applications to inferring missing genotypes and haplotypic phase, American Journal of Human Genetics 78 (2006), no. 4, 629-644.

[19] Stephens M. and Donnelly P., A comparison of Bayesian methods for haplotype reconstruction from population genotype data, American Journal of Human Genetics 73, 1162-1169.

[20] Niu T., Qin Z., Xu X., and Liu J., Bayesian haplotype inference for multiple linked Single-Nucleotide Polymorphisms, American Journal of Human Genetics 70 (2002), 157-169.


[21] Lynce. I, MarquesSilva. J, Xu X, and Liu J, Efficient haplotype inference with Boolean satisfiability, AAAI Conference
on Artificial Intelligence (2006), 104109.
[22] Lynce. I, Marques-Silva. J, Xu X, and Liu J, Haplotype inference with Boolean satisfiability, International Journal on
Artificial Intelligence Tools 17 (2008), no. 2, 104109.
[23] Grac .A, Marques-Silva .J, Lynce. I, and Oliveira. A, Efficient haplotype inference with pseudo-Boolean optimization, Algebraic Biology (2007), 125139.
[24] Grac .A, MarquesSilva .J, Lynce. I, and Oliveira. A, Efficient haplotype inference with combined CP and OR techniques, CPAIOR08 (2008), 308312.
[25] Lancia, Haplotyping Populations by Pure Parsimony Complexity, Exact and Approximation Algorithms, Bioinformatics (2004), 54.
[26] Stephens.M and Scheet.P, Accounting for Decay of Linkage
Disequilibrium in Haplotype Inference and Missing-Data
Imputation, Am. J. Hum. Genet 76 (2005), 449462.
[27] C.F Xu, Niu T, and Liu J, Effectiveness of computational
method in haplotype prediction Human Genetics, Human
Genetics 110 (2003), 148156.
[28] Zhang Y, Niu T, and Liu J, A coalescence-guided hierarchical Bayesian method of haplotype inference, Am. J. Hum.
Genet. 79: 313322 79 (2003), 313322.
[29] Excoffier L and Slatkin M, Maximum likelihood estimation
of molecular haplotype frequencies in a diploid population,
Molecular Biology and Evolution 12 (1995), no. 5, 921927.
[30] Fallin D and Schork N.J, Accuracy of haplotype frequency
estimation for biallelic loci via the expectation maximization algorithm for unphased diploid genotype data, American Journal of Human Genetics 67 (2000), 947959.
[31] Niu T, Qin Z, and S. Liu, Partitionligationexpectationmaximization algorithm for haplotype inference with singlenucleotide polymorphisms, American Journal of Human Genetics 71 (2002), 1242-1247.

[32] Catanzaro Da and Labb Ma, The pure parsimony haplotyping problem: overview and computational advances, International Transactions in Operational Research 16 (2009),
no. 5, 561-584.
[33] D Catanzaro, The minimum evolution problem: Overview
and classification, Networks 53 (2008), no. 2, 8990.
[34] Halldrsson B.V, Bafna V, Edwards N, and R Lippert, Combinatorial problems arising in SNP and haplotype analysis,
Discrete Mathematics and Theoretical Computer Science,
Springer-Verlag, Berlin 2731 (2003), no. 2, 2647.
[35] Gusfield D, Orzack S.H, Edwards N, and R Lippert, Haplotype inference, Handbook on Bioinformatics. CRC Press,
Boca Raton, FL (2005), 128.
[36] P Erixon, B Svennblad, T Britton, and B Oxelman, Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics, Systematic Biology 52 (2003),
665-673.
[37] J.P Huelsenbeck, F Ronquist, and R Nielsen, Bayesian inference of phylogeny and its impact on evolutionary biology
294 (2001), 2310-2314.
[38] B Larget and D.L Simon, Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees,
Molecular Biology and Evolution 16 (1999), 750-759.
[39] B.V Halldrsson, V Bafna, N Edwards, and R Lippert, Combinatorial problems arising in SNP and haplotype analysis,
Discrete Mathematics and Theoretical Computer Science,
Springer-Verlag, Berlin 2731 (2003), 2647.
[40] J Li and T Jiang, Efficient inference of haplotype from genotype on a pedigree, Journal of Bioinformatics and Computational Biology 10 (2003), no. 1, 4169.


A Bayesian Neural Network for Price Prediction in Stock Markets

Sara Amini
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
s-amini@iasbs.ac.ir

Farzaneh Yahyanejad
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
f.yahyanejad@iasbs.ac.ir

Alireza Khanteymoori
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
khanteymoori@iasbs.ac.ir

Abstract: Price prediction in a stock market is a challenging task due to the complexity of the behavior of both customers and owners, and of course many other factors that are at play in this area. In this paper a Bayesian neural network is proposed to predict the final price of a company (IranTransfo) in the Tehran Stock Exchange. We use Monte Carlo Markov Chain (MCMC) sampling to implement our Bayesian neural network. In addition, several multilayer perceptron networks are discussed and their performance is compared with the proposed Bayesian neural network. The results show that MCMC is more effective for stock market price prediction.

Keywords: Bayesian Neural Network, MCMC method, MLP Neural Networks, stock market.

Corresponding Author, P. O. Box 45195-1159, F: (+98) 241 421-5071, T: (+98) 241 415-5051

Introduction

Artificial Neural Networks are designed on the basis of the human brain and its neurons. They are parallel structures that were designed in 1943 by McCulloch and Pitts [1]. Nowadays artificial neural networks (ANNs) are very popular learning tools in many areas because they can easily be applied to a wide range of problems. They can be used in different learning approaches: supervised learning, unsupervised learning and even reinforcement learning [2]-[3]. They can be applied to optimization problems, especially with the Hopfield architecture (for a more complete discussion see e.g. [2]). An artificial neural network (ANN) is a network of computation units that, like a function, gets an input and produces a corresponding output. The point is that by employing different synaptic weights (which connect the computational units) and nonlinear complex activation functions (in the computation units), the outputs can be quite complex functions of the inputs. In Figure 1 a simple ANN with one hidden layer is shown. Although the use of hidden layers is optional, if they are used with nonlinear activation functions the computational power of the network increases dramatically.

Figure 1: a simple ANN with one hidden layer

Back propagation is an algorithm that is used in the training phase of a feed-forward neural network (NN). However, it is not the only way to train an NN. For example, in a Bayesian Neural Network (BNN) the main part of the training can be done via a sampling process. There are several ways to predict prices in stock markets, such as time series, Hidden



Markov Models [4], Fuzzy Models [5], ..., and of course classical neural networks (as used for price prediction in [6], though not in a stock market but in the Trading Agent Competition (TAC)). We use a BNN to predict the final price of a company (IranTransfo) in the Tehran Stock Exchange. The rest of the paper is organized as follows. In Section 2 we give a review of BNNs. Section 3 is devoted to explaining different MLPs. Experimental results and the conclusion are presented in Sections 4 and 5 respectively.

Bayesian Neural Networks

Bayes' rule is about obtaining a posterior probability on the basis of a prior probability. The posterior probability of the parameters θ in a model M given data D is defined as

p(θ|D, M) = p(D|θ, M) p(θ|M) / p(D|M)    (1)

where p(D|M) is the normalizing constant (the evidence of model M), p(D|θ, M) is the likelihood of the parameters θ and p(θ|M) is the prior probability of θ [7]. We have

p(D|M) = ∫ p(D|θ, M) p(θ|M) dθ    (2)

The result of Bayesian modeling is the conditional probability of the unobserved variables of interest given the observed data [7]. In a Bayesian MLP the natural end variables are the predictions of the model for new inputs [8]. If we are given a training data set D = {(x^1, y^1), ..., (x^n, y^n)}, the posterior predictive distribution of the output y^new for a new input x^new is obtained by integrating the predictions of the model with respect to the posterior distribution of the model [8]:

p(y^new | x^new, D) = ∫ p(y^new | x^new, θ) p(θ|D) dθ    (3)

where θ denotes all the model parameters and hyperparameters of the prior structure. In the Bayesian approach the prior distribution is important, because without it our knowledge is only about the training samples and we cannot generalize to other examples. A considerable advantage of the Bayesian approach is that it gives a principled way to do inference when we do not have complete prior knowledge, and in this way we do not have to guess values for attributes that are unknown. This is done by marginalization, that is, by integrating over the posterior distribution of the unknown variables [8]. In the Bayesian approach we use the Maximum A Posteriori method. In this method, instead of considering the whole posterior distribution, we want to find the parameters that maximize the posterior probability. In a full Bayesian approach we do not compute fixed values for the parameters or hyperparameters; we approximate the integration over the parameters, and also over the hyperparameters, to obtain the prediction of the model by calculating integral (2) [8]. Several methods can be applied to calculate the above integral, such as MCMC and ensemble learning. The MCMC method is explained in this paper.

2.1 Monte Carlo Markov Chain Sampling

When dealing with a complex integral of the form

∫_a^b h(x) dx,    (4)

one way is to rewrite h(x) as the product of a function f(x) and a probability density p(x) related to the interval (a, b). Therefore we have

∫_a^b h(x) dx = ∫_a^b f(x) p(x) dx = E_{p(x)}[f(x)].    (5)

So if we can generate a large number of random points a ≤ x_i ≤ b drawn from p(x), we can approximate the integral by

∫_a^b h(x) dx ≈ (1/n) Σ_{i=1}^{n} f(x_i).    (6)

This method is called Monte Carlo integration [9]. For Equation (3) we have

y^new ≈ (1/N) Σ_{i=1}^{N} f(x^new, θ_i).    (7)

Samples from the posterior distribution are drawn in the learning phase, which is computationally expensive, but for the prediction at a new input there is no need for sampling and predictions can be made quickly on the basis of the stored samples [8]. In MCMC the samples are produced using a Markov chain. A Markov chain is a sequence of random variables produced by a Markov process. A process is Markov if it satisfies the following equation:

p(S_{t+1} | S_t, S_{t-1}, ..., S_1) = p(S_{t+1} | S_t)    (8)

If a process satisfies (8), it has the Markov property. Concretely, in MCMC, sampling is done according to the Markov property [9].
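Equations (6) and (7) amount to averaging the network's output over stored posterior samples. A minimal sketch of this prediction step is given below; the way the samples are stored as a list of weight sets and the tiny one-hidden-layer network are assumptions made for the illustration, not the toolbox used by the authors.

    import numpy as np

    def mlp_output(x, theta):
        """One-hidden-layer MLP with tanh units for a single parameter sample theta."""
        h = np.tanh(theta["W1"] @ x + theta["b1"])
        return float(theta["W2"] @ h + theta["b2"])

    def posterior_predictive(x_new, samples):
        """Equation (7): average the model output over N stored MCMC samples."""
        return sum(mlp_output(x_new, t) for t in samples) / len(samples)

    # Toy usage with 3 random "posterior samples" (for illustration only)
    rng = np.random.default_rng(0)
    samples = [{"W1": rng.normal(size=(4, 5)), "b1": rng.normal(size=4),
                "W2": rng.normal(size=4), "b2": rng.normal()} for _ in range(3)]
    print(posterior_predictive(np.ones(5), samples))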



Multilayer Perceptron (MLP) Neural Networks

In an MLP, the synaptic weights are updated via a two-step process which is repeated several times:
a) Forward propagation. In this phase an input vector is applied to the network and the output of the network corresponding to this input is calculated.
b) Backward propagation. On the basis of this output and the target, the error is calculated and propagated backward through the network.
Using this error, the synaptic weights are updated. The way these synaptic weights are updated can differ considerably from method to method. In the following subsections we review some of these learning approaches.

3.1 Gradient Descent

In gradient descent the weights are updated in the direction of the fastest decrease of the error function. Therefore we have

w_ij = w_ij - η ∂e/∂w_ij    (9)

where e is the error function, η is the learning rate and w_ij is one of the synaptic weights of the network.

3.2 Conjugate Gradient

In gradient descent the search direction is the direction in which the gradient of the error decreases faster than in any other direction. However, this is not the direction in which the fastest convergence takes place. There are methods in which a conjugate direction is found that produces faster convergence than the steepest descent direction used in gradient descent. In the following we review some of them [10].

3.2.1 Fletcher-Reeves Update

Fletcher-Reeves is explained as follows. Search in the steepest descent direction in the first iteration: p_0 = -g_0. Do a line search to decide the optimal distance to move along the current search direction:

x_{k+1} = x_k + α_k p_k.    (10)

The next search direction is conjugate to the previous one. In fact the new search direction is a combination of the steepest descent direction and the previous search direction:

p_k = -g_k + β_k p_{k-1}.    (11)

Various options are available for β_k in (11). In Fletcher-Reeves [10] we have

β_k = (g_k^T g_k) / (g_{k-1}^T g_{k-1}).    (12)

3.2.2 Polak-Ribiere

Another option for the conjugate gradient is Polak-Ribiere [10], in which β_k is

β_k = (Δg_{k-1}^T g_k) / (g_{k-1}^T g_{k-1}),    (13)

where Δg_{k-1} = g_k - g_{k-1} is the change in the gradient on the previous iteration.

3.2.3 Powell-Beale Restarts

In this method the search direction [10] is reset to the negative of the gradient whenever

|g_{k-1}^T g_k| ≥ 0.2 ||g_k||^2.    (14)

3.2.4 Scaled Conjugate Gradient

In all of the conjugate gradient methods a line search has to be done in every iteration, which can be very expensive, especially when there are many training inputs. The scaled conjugate gradient was designed to solve this problem by combining the model-trust-region approach with the conjugate gradient approach [10].

3.3 BFGS

Newton's method is an optimization algorithm applied on the basis of Equation (15):

x_{k+1} = x_k - A_k^{-1} g_k,    (15)

where A_k is the Hessian matrix. Computing the Hessian matrix is expensive. Quasi-Newton algorithms do not need to calculate second derivatives and the Hessian matrix; they compute an approximation of the Hessian matrix in each iteration. In BFGS the formula is

y_k = ∇f(x_{k+1}) - ∇f(x_k),    (16)
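As an illustration of the direction updates of equations (11)-(13), a small generic sketch in NumPy follows; it is not the MATLAB toolbox routine the paper relies on, and the toy gradients are invented for the example.

    import numpy as np

    def fletcher_reeves_beta(g_new, g_old):
        """Equation (12): beta_k = (g_k^T g_k) / (g_{k-1}^T g_{k-1})."""
        return float(g_new @ g_new) / float(g_old @ g_old)

    def polak_ribiere_beta(g_new, g_old):
        """Equation (13): beta_k = (dg_{k-1}^T g_k) / (g_{k-1}^T g_{k-1})."""
        return float((g_new - g_old) @ g_new) / float(g_old @ g_old)

    def next_direction(g_new, g_old, p_old, rule=fletcher_reeves_beta):
        """Equation (11): p_k = -g_k + beta_k * p_{k-1}."""
        return -g_new + rule(g_new, g_old) * p_old

    # Toy usage with made-up gradients
    g0 = np.array([1.0, -2.0]); g1 = np.array([0.5, -1.0]); p0 = -g0
    print(next_direction(g1, g0, p0))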



Methods |   2HLs   |   3HLs   |   4HLs   |   5HLs   |   6HLs   |   7HLs   |   8HLs   |   9HLs   |  10HLs
cgb     |  98.6767 |  98.7817 |  97.3968 | 183.2696 | 110.2846 | 120.4961 | 129.8047 | 107.0397 | 155.5716
cgf     |  99.0119 | 155.3317 |  96.8207 | 133.7310 | 174.3894 | 163.4421 | 208.7062 | 249.7942 | 308.8018
gd      | 144.4841 | 227.4080 | 243.5424 | 414.7131 | 390.8027 | 331.1582 | 459.6300 | 524.0216 | 603.9015
scg     |  96.4165 | 141.1372 | 103.1662 | 121.0033 |  99.9745 | 110.4170 | 125.3221 | 149.3719 | 107.3567
lm      | 113.7195 | 192.8755 | 102.6875 | 104.9285 |  97.7067 | 112.2179 | 170.8513 | 190.8371 | 117.7208

Table 1: MLPs' results with different numbers of hidden layers (columns) and different training methods (rows)
And the approximation for the Hessian [10] is

H_{k+1} = (I - (x_k y_k^T)/(y_k^T x_k)) H_k (I - (y_k x_k^T)/(y_k^T x_k)) + (x_k x_k^T)/(y_k^T x_k).    (17)

3.4 Levenberg-Marquardt

There is no need to compute the Hessian matrix in this method. In ANNs the performance function usually has the form of a sum of squares. Thus the Hessian matrix can be approximated as

H = J^T J    (18)

and the gradient as

g = J^T e,    (19)

where J is the Jacobian matrix and e is the network error. The Levenberg-Marquardt [10] formula is

x_{k+1} = x_k - [J^T J + μI]^{-1} J^T e.    (20)

Experimental Results

We apply our approach to a data set from the Tehran Stock Exchange. This data set contains daily records of the IranTransfo Company. The number of records is 2138, with five attributes: transaction value, capital turnover, number of exchanges, maximum price and minimum price. The target for each feature vector is the next day's final price. The data are split into a 70% training set and a 30% test set. In our BNN we use one hidden layer with 10 hidden units. We create an MLP network and priors for the network weights and residual. The prior of the network is a Gaussian multivariate hierarchical prior with ARD [7]. We use MCMC with 2500 samples to train our Bayesian neural network. The accuracy of the model is then compared with the MLPs' results. We design five different MLPs with various learning methods. In order to obtain the best performance we fix the number of hidden units and run each of the MLPs with different numbers of hidden layers, choosing the best option for each. The results are shown in Table 1, and the best number of hidden layers for each training method is marked in bold. At the end we compare the MLPs with the best parameters (i.e. number of hidden layers) to the BNN results in Figure 2.

Figure 2: Error of different MLPs and BNN in price prediction related to the IranTransfo Company data set

Conclusion

In this work we apply a Bayesian neural network to predict the final price of a company (IranTransfo) in the Tehran Stock Exchange. The result is compared to different MLPs with different learning methods and parameters, and Table 3 shows the effectiveness of the proposed method.

Table 3: The ratio of the MLPs' error to the BNN's error
Methods | accuracy rate
GD      | 1.5084 %
LM      | 1.0201 %
CGB     | 1.0168 %
CGF     | 1.0108 %
SCG     | 1.0055 %


References

[1] S. Haykin, Neural Networks: A Comprehensive Foundation, ch. 1, 1999.

[2] S. Sehad and C. Touzet, Reinforcement Learning and Neural Reinforcement Learning (1994).

[3] G. L. Rogova and J. Kasturi, Reinforcement Learning Neural Networks for distributed decision making, Proc. of the FUSION (2001).

[4] Md. Rafiul Hassan and Baikunth Nath, Stock Market Forecasting Using Hidden Markov Model: A New Approach, Proceedings of the 2005 5th International Conference on Intelligent Systems Design and Applications (2005).

[5] Md. Rafiul Hassan, A combination hidden Markov model for stock market forecasting, Neurocomputing (2009).

[6] Y. Kovalchuk and Maria Fasli, Deploying Neural-Network-Based Models for Dynamic Pricing for Supply Chain Management, Computational Intelligence (2008).

[7] Jarno Vanhatalo and A. Vehtari, MCMC Methods for MLP Network and Gaussian Process and Stuff - A Documentation for the Matlab Toolbox MCMCstuff (2006).

[8] Jouko Lampinen and A. Vehtari, Bayesian Approach for Neural Networks - Review and Case Studies, Neural Networks (2001).

[9] B. Walsh, Markov Chain Monte Carlo and Gibbs Sampling: Lecture Notes for EEB 581, version 26 April (2004).

[10] www.mathworks.com/products/matlab/demos.html.

Maintaining the Envelope of an Arrangement Fixed

Marzieh Eskandari
Alzahra University
eskandari@alzahra.ac.ir

Marjan Abedin
Amirkabir University of Technology
m.abedin@aut.ac.ir

Abstract: We present a simple algorithm for computing the dual of the envelope polygon of an arrangement of n lines in dual space, and then we present an algorithm for finding sets of lines such that adding them to the arrangement leaves the envelope polygon of the primal arrangement fixed.

Keywords: Computational Geometry; Arrangements; Envelopes; Duality.

Corresponding Author, P. O. Box 45195-1159, F: (+98) 261 455-0899, T: (+98) 261 457-9600

Introduction

The study of the morphological properties of arrangements of lines in the plane is of considerable interest in graphics (e.g. computer graphics, architectural design, geography, etc.), to nuclear physicists and to urban planners. The envelope polygon also contains a great deal of information about an arrangement, and studying this polygon helps to understand the morphology of arrangements better [1]. While arrangements as geometric objects are well studied in discrete and computational geometry, their envelope polygon seems to have received little attention recently. There are some efficient algorithms to compute the envelope polygon of an arrangement (for more information see [2]); in this paper, besides presenting another efficient algorithm for computing the envelope polygon, we find sets of lines which can be added to an arrangement without changing its envelope polygon. This paper is organized as follows: in section 2 we present the definitions, preliminaries and basic results on arrangements of lines and the envelope polygon; in section 3 we present a simple algorithm for constructing the envelope polygon; in section 4 we present an algorithm for finding sets of lines which, when added to the primal arrangement, leave the envelope polygon unchanged.

Definitions

In this section we present some definitions and preliminaries on arrangements of lines which are needed in this paper.

2.1 Arrangement of n lines in the plane

An arrangement of n lines in the plane is a partition of the plane into faces, edges, and vertices (intersection points). Let A = {l_1, ..., l_n} be an arrangement of n lines. Denote the intersection of two non-parallel lines l_i and l_j by I(i, j). We can classify the vertices of an arrangement A as follows. A vertex p = I(l_i, l_j) (i, j in [0, n-1]) is said to be extreme on l_i if all intersection points (other than p) lying on l_i lie on one side of p. The vertex p is said to be critical if it is extreme on both l_i and l_j; it is interior if it is extreme on neither l_i nor l_j.

A point is at level k, denoted L_k, in an arrangement if there are exactly k-1 lines above the point and n-k lines below it. The k-th level of an arrangement is an x-monotone polygonal chain such that all points on the chain are at level k. The upper envelope is a polygonal chain E_U such that no line l in A is above E_U, and the



lower envelope is a polygonal chain E_L such that no line l in A is below E_L. In fact E_U and E_L are made up of two infinite rays and a sequence of segments, each one at level n and level 0 respectively.

2.2 Duality

A point p = (p_x, p_y) and a line l: (y = ax - b) in the primal plane are mapped through * to a dual line p* and a dual point l* as follows:

p*: (b = p_x a - p_y),    l* = (a, b)

A logical choice for the dual of a segment like s = pq is the union of the duals of all points on s. What we get is the infinite set of lines passing through one point in the dual plane. Their union forms a double wedge, which is bounded by the duals of the endpoints of s. The lines dual to the endpoints of s define a double wedge (D.W. for short), a left-right and a top-bottom wedge; s* is the left-right wedge, more precisely the wedge that does not contain the vertical line passing through the dual point, the center of the D.W. It also shows that a line l intersects s iff its dual l* lies in s* [3]. Two segments intersect iff the duals of the segments share a common line which passes through the centers of their D.W.s.

Assume a Cartesian coordinate system containing an arrangement of n lines, each line with equation l: y = ax - b, and dual points l* = (a, b). By rotating the Cartesian coordinate system by θ, the equation of a line in the new system changes to y = ax cot(θ) - b/sin(θ), and therefore l* in dual space changes to the point l* = (a cot(θ), b/sin(θ)). Now let us find the dual of the biggest segment on each line of an arrangement.

2.3 Finding the biggest segment

For finding the dual of the biggest segment on each line of an arrangement of lines, we need to find a D.W. that contains the duals of all other lines and that does not contain the vertical line passing through its center; so for each l_i* in dual space we should find the two other points that have the smallest angle with the vertical line passing through l_i*; then the biggest segment on l_i is bounded by the intersections with the duals of those two other points. In this paper we need to find the dual of the biggest segment for all lines in this manner in dual space.

2.4 Finding critical vertices

Let us find the duals of the critical vertices of an arrangement. From the definition, a critical vertex is extreme on both lines, which means that it is an endpoint of the biggest segment on both of the two intersecting lines, and then the duals of the intersecting lines share a common bounding line. We can find all critical vertices in parallel with finding the biggest segments of section 2.3 and save them in a list of critical vertices.

2.5 Envelope Polygon

The envelope of an arrangement is one of the components of the arrangement which has received considerable attention in the literature.

E_U and E_L, as defined in section 2.1, can be constructed by computing the convex hull of the dual points in dual space. The leftmost/rightmost point in dual space corresponds to the line with the smallest/largest slope, and these two lines contribute the unbounded edges of the arrangement in E_U and E_L; the points on the upper/lower convex hull, except the leftmost and rightmost points, correspond to the bounded edges of E_U and E_L.

The envelope of an arrangement A, denoted E(A), is the union of the bounded faces of the arrangement, which means that all intersection points of A lie inside E(A) or on the edges of E(A). A simple polygon P is an envelope polygon if there exists an arrangement of lines A such that P = E(A).

Lemma 1. Let P be an envelope polygon. A vertex of P is convex iff it is a critical vertex of IA(P).

Proof. The authors refer the reader to [1].

It is easy to see that all critical vertices of an arrangement contribute to the envelope polygon.

Constructing the envelope polygon

The strategy which we follow for constructing the envelope polygon is very simple and is based on the fact that all edges of the envelope polygon are bounded segments of E_U and E_L while rotating the whole arrangement from 0 to 2π, and also on the fact that by rotating the arrangement the envelope polygon won't
554

The Third International Conference on Contemporary Issues in Computer and Information Sciences

change.
Concentrating on the note that, during the rotation of the whole arrangement from 0 to 2π, each line becomes the line with the smallest/largest slope in the rotated arrangement twice, we can discretize the computation and stop whenever the line with the largest slope in the arrangement becomes the line with the smallest slope during the rotation. Let us describe the algorithm formally:

Computing the Envelope Polygon Algorithm {

1. Compute the convex hull of the dual points in dual space.

2. Do {

3. Mark the steepest line as a used line.

4. Connect the points on the lower hull. (The duals of these wedges correspond to the bounded edges of EU of the arrangement.)

5. Let α be the angle between the x-axis and the steepest line:
   If (α ≤ 90) then rotate all lines in the arrangement clockwise by |90 + α|.
   If (α ≥ 90) then rotate all lines in the arrangement clockwise by |450 + α|.
   By this rotation, the steepest line becomes the line with the smallest slope.
   Compute the convex hull of the new dual points in dual space.
   } while (not all lines have been marked as used twice)

6. Compute the dual of what is constructed above.

}

It is clear that the running time is still O(n log n) for an arrangement of n lines to construct the envelope polygon, as it just needs to compute the convex hull of n points 2n times, whenever we rotate all lines of the arrangement until the steepest line becomes the line with the smallest slope.
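The rotate-and-hull loop can be sketched in a few lines of Python. This is only our own illustration of the idea (hulls of the dual points, duals rotated with the formula stated in section 2.2), not the authors' implementation, and a crude fixed-step sweep stands in for the exact "rotate until the steepest line becomes the flattest" rule.

import math

def hulls(points):
    # Andrew's monotone chain: returns (lower hull, upper hull) of 2D points
    pts = sorted(set(points))
    def build(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and ((h[-1][0]-h[-2][0])*(p[1]-h[-2][1])
                                   - (h[-1][1]-h[-2][1])*(p[0]-h[-2][0])) <= 0:
                h.pop()
            h.append(p)
        return h
    return build(pts), build(list(reversed(pts)))

def rotate_duals(duals, theta):
    # dual point (a, b) of y = a*x - b after rotating the plane by theta,
    # using the rotation formula stated in section 2.2
    return [(a / math.tan(theta), b / math.sin(theta)) for a, b in duals]

def bounded_edge_wedges(duals, steps=64):
    # sweep over rotations; hull points other than the two extreme ones
    # correspond to wedges of the bounded edges of EU and EL
    out = []
    for k in range(1, steps):
        theta = 2 * math.pi * k / steps
        lower, upper = hulls(rotate_duals(duals, theta))
        out.append((theta, lower[1:-1], upper[1:-1]))
    return out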

4 Maintaining the Envelope Polygon

Each line that we add to an arrangement either changes the envelope polygon, which means that it contributes to the new envelope polygon, or it leaves the previous envelope polygon unchanged. As mentioned before, we are searching for the sets of lines such that, if we add them to the primal arrangement, the envelope polygon remains fixed. Let us call such a line ℓ. It is possible to find all these lines in at most O(n³).

Lemma 2. Line ℓ (as described above) should not intersect the unbounded edges of the arrangement.

Proof. An intersection of ℓ with the unbounded edges of the arrangement causes at least one new critical vertex with one of the unbounded edges, and because all critical vertices contribute to the envelope polygon, there would be new critical vertices in the new envelope polygon, so the envelope polygon would change. In other words, line ℓ should intersect each line on the biggest segment of that line.

Lemma 3. Line ℓ (as described above) should not intersect any reflex chain of the envelope polygon more than once. (A reflex chain is a chain consisting of a series of reflex vertices between two critical vertices in the envelope polygon.)

Proof. Otherwise, some vertices of the reflex chain would move to the inside of the envelope polygon, which means the envelope polygon would change.

It is clear that the sets of lines which satisfy lemma 2 and lemma 3 are exactly those lines which, when added to the primal arrangement, leave the envelope polygon unchanged.

For satisfying lemma 2, we need to find the lines that intersect certain segments, namely the biggest segments on each line of the arrangement found in section 2.3. There are well-known studies to find such lines. At first, find the duals of all biggest segments on the lines, as explained in section 2.3, and after that find the intersection of the D.W.s; the resulting region in dual space contains the points whose duals are the lines that satisfy lemma 2. Let us call the resulting region P. We can use the divide and conquer algorithm in [3] to find the intersection of n D.W.s in O(n log n).

For satisfying lemma 3, first compute the envelope polygon with the simple algorithm in section 3, and for finding all reflex chains in the envelope polygon we start


traversing the envelope polygon from an arbitrary critical vertex that we found in section 2.4 up to the next critical vertex in the critical-vertices list. We need to save all the edges of the envelope polygon that lie on a reflex chain during the traversal. We save the segments of every reflex chain in Ci if there is more than one edge of the envelope in the chain. Line ℓ should intersect a reflex chain of the envelope polygon at most once; note that if ℓ does not intersect a reflex chain of the envelope, then ℓ satisfies lemma 3. Therefore, we need to compute the union of the intersections of the D.W.s of each pair of edges that exist in the chain. The resulting region in dual space contains the points whose duals would intersect that reflex chain more than once. We compute these regions for all the chains, and the union of all of them yields a region (call it Q) such that the dual of each point in Q is a line that intersects some reflex chain of the envelope polygon more than once; therefore, if we add these lines to the arrangement, the envelope polygon will change. We are looking for the intersection of the complement of Q with P, call it H, to obtain the duals of the sets of lines that, if added to the arrangement, leave the envelope polygon unchanged. Each connected region in H introduces the dual of a set of the desired lines.

Let us summarize the algorithm as follows:

1. Find the biggest segment on each line in the dual plane.

2. Compute the intersection of the D.W.s of the segments found in section 2.3 and call the resulting space P.

3. Find the envelope polygon with the algorithm presented in section 3.

4. Traverse the envelope polygon and save the segments of the envelope polygon for each reflex chain separately if there is more than one segment in a chain.

5. For each reflex chain: compute the union of the intersections of the D.W.s of each pair of segments that exist in the chain.

6. Compute the union of the resulting regions in step 5, and call the resulting space Q.

7. Compute the intersection of the complement of Q with P.
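As a naive alternative to intersecting the double wedges of step 2 explicitly, a candidate line can be tested directly: it satisfies lemma 2 exactly when its dual point lies in the double wedge of every biggest segment, i.e. when it crosses every biggest segment. A small Python sketch of that membership test (our own illustration, reusing the duality test of section 2.2; not the divide-and-conquer routine of [3]):

def in_double_wedge(a, b, p, q):
    # dual point (a, b) of line y = a*x - b lies in the D.W. of segment pq
    bp = p[0] * a - p[1]
    bq = q[0] * a - q[1]
    return min(bp, bq) <= b <= max(bp, bq)

def satisfies_lemma2(a, b, biggest_segments):
    # biggest_segments: list of (p, q) endpoint pairs, one per line of the
    # arrangement (section 2.3); the candidate line must cross all of them
    return all(in_double_wedge(a, b, p, q) for p, q in biggest_segments)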

4.1 Complexity analysis

1. The first step can be done in O(n²), because we just need to find the two smallest angles for each dual point in dual space.

2. We can compute the intersection of the D.W.s related to the duals of the n segments found in step 1 in O(n log n) by the divide and conquer algorithm in [3].

3. This step of the algorithm can also be done in O(n log n).

4. In [1] it is proved that an envelope polygon has at most O(n) edges, so traversing the envelope polygon can be done in O(n).

5. Finding the intersection of the D.W.s belonging to each pair of segments can be done in O(1), and there are at most O(n) edges in each reflex chain, therefore O(n²) pairs per reflex chain; if we assume there are at most O(n) reflex chains, then computing all the intersections required in this stage needs at most O(n³).

6. In step 5 we obtained convex regions with at most O(n³) edges. After this, we want to compute the union of all the regions, and this can also be done in O(n³).

7. Detecting whether two geometric objects intersect, and computing the region of intersection, are fundamental and well studied problems in computational geometry [2]. Geometric intersection problems arise naturally in a number of applications, and because both P and the complement of Q are convex regions with O(n²) and O(n³) edges respectively, the intersection can be computed in O(n³).

References
[1] D. Eu, E. Guevremont, and G. T. Toussaint, On envelopes of arrangements of lines, Journal of Algorithms (1996).
[2] D. Keil, A simple algorithm for determining the envelope of a set of lines, Elsevier Science Publishers B.V., Information Processing (1991).
[3] M. de Berg and D. T. Lee, Computational Geometry: Algorithms and Applications, Third Edition, Springer-Verlag Berlin Heidelberg, 2008.


Investigating and Recognizing the Barriers of Exerting E-Insurance in


Iran Insurance Company According to the
Model of Mirzai Ahar Najai
(Case Study: Iran Insurance Company in Orumieh City)
Parisa Jafari

Hamed Hagtalab

Islamic Azad University, Torbat-E-Jam Branch

Islamic Azad University of Torbat-E-Jam Branch

Torbat-E-Jam, Iran

Torbat-E-Jam, Iran

p.jafari551@gmail.com

Morteza Shokrzadeh

Hasan Danaie

Islamic Azad University of Jolfa International Branch

Islamic Azad University of Torbat-E-Jam Branch

Jolfa, Iran

Torbat-E-Jam, Iran

Abstract: The goal of this study is investigating and recognizing the barriers of exerting e-insurance in Iran Insurance Company according to the 3-branched model of Mirzai Ahar Najai. In this study, different environmental barriers (including legal, cultural, and technological barriers), organizational barriers (such as policies, insurance rules, internal structure, and technology), and behavioral barriers (such as expert staff shortage, the lack of top-manager support, and staff resistance against changes) were evaluated. This study is a descriptive survey with applied goals. The statistical population included the managers, assistants, organizational experts, and different branches of Iran Insurance in Orumieh city. The sampling method was simple random sampling. Research hypotheses were examined using a One-Sample T-Test to investigate the effect of each variable on exerting e-insurance. The Friedman test was also used to rank the variables. The research results showed that the means of the barriers of exerting e-insurance were higher than average.

Keywords: Information technology, Electronic transaction, Electronic business, Electronic insurance

Introduction

The present era is called the period of electronic phenomena, since it includes electronic business, banking, government, insurance, and life. By using information technology in the insurance industry, manifested in the form of e-insurance, time and geographical limitations are removed and wide revolutions are produced in the informatics systems of the insurers (Bahramali 2005, 281). Although electronic technologies have had limited effects on the insurance industry compared with other industries, it seems that this effect will change considerably in the short term. Nowadays, the issue of new opportunities for supplying insurance services through the Internet is being intensely considered. Among the major incentives leading the insurance industry into the electronic world are deepening customer relations, reducing costs, improving services, and developing new sources (Bahramali 2005, 281). This study aims to recognize and examine the barriers of exerting e-insurance in Iran Insurance Company from environmental, organizational, and behavioral aspects. Using e-insurance and information technology in the interactions between insurance companies and their customers can have numerous advantages, such as providing 24-hour services, no need for in-person visits to receive compensation, fast and secure services, prevention of insurance fraud, and an increase in the income of the insurance company.

∗ Corresponding Author, P. O. Box 5413676996, F: (+98) 4923025252, T: (+98) 9144919720


Exerting e-insurance needs awareness of the organizational capabilities in the context which aims to create it (Karimi 2005, 65). Considering the theoretical frame of the research, using the 3-branched Mirzai Ahar Najai model (2005), this paper had the following goals:

1. Identifying the exertion barriers of e-insurance
2. Investigating environmental factors as barriers of e-insurance exertion
3. Investigating behavioral factors as barriers of e-insurance exertion
4. Investigating organizational factors as barriers of e-insurance exertion

2 Methodology

The statistical population of this study included the managers, assistants, organizational experts, and branches of Iran Insurance in Orumieh city, 2011. Using simple random sampling and the Cochran formula, a sample size of 120 people was obtained. To gather data, a researcher-made questionnaire including 23 Likert-scale questions was used. To confirm its consistency, the questionnaire was tested and modified by experts and college teachers. Using Cronbach's α, a reliability of 0.78 was achieved. The questionnaires' information was analyzed with SPSS software to yield descriptive and inferential statistics. Research findings were examined using a One-Sample T-Test, the Friedman test, and step-by-step regression.

3 Research analysis

The condition for using parametric tests, especially a One-Sample T-Test, is data normality. For this purpose, a Kolmogorov-Smirnov test was used for each variable. As seen in Table 1, the significance levels of all values are bigger than 0.05, representing their normality.
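For readers who want to reproduce this kind of analysis, the checks described above (Kolmogorov-Smirnov normality test, One-Sample T-Test against a criterion value, Friedman ranking) map directly onto standard SciPy calls. The snippet below is only an illustrative sketch with synthetic scores and an assumed criterion value; it is not the authors' code.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# hypothetical questionnaire scores for one barrier factor (e.g. the legal factor)
legal = rng.normal(loc=12.0, scale=2.0, size=120)

# 1) Kolmogorov-Smirnov test for normality against a fitted normal distribution
ks_stat, ks_p = stats.kstest(legal, 'norm', args=(legal.mean(), legal.std(ddof=1)))
print('K-S p-value:', ks_p)          # p > 0.05 -> normality not rejected

# 2) One-Sample T-Test against an assumed criterion mean (Likert midpoint x items)
t_stat, t_p = stats.ttest_1samp(legal, popmean=9)
print('one-sample t p-value:', t_p)  # p < 0.05 with mean above criterion -> barrier

# 3) Friedman test to rank several factors measured on the same respondents
cultural = rng.normal(8.3, 2.0, 120)
technological = rng.normal(12.9, 2.0, 120)
chi2, fr_p = stats.friedmanchisquare(legal, cultural, technological)
print('Friedman p-value:', fr_p)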

4 Research hypotheses

The hypotheses of this research were as follows:
H1. How much do environmental factors affect using e-insurance in Iran Insurance Company?
H2. How much do organizational factors affect using e-insurance in Iran Insurance Company?
H3. How much do behavioral factors affect using e-insurance in Iran Insurance Company?

5 Discussion

For Hypothesis 1, the environmental factor variable was examined using 8 questions and 3 factors. To test its significance, a One-Sample T-Test was used; its results showed that the legal factor with a mean of 11.98, the cultural factor with a mean of 8.32, the technological factor with a mean of 12.89, and the environmental factor in general with a total mean of 33.2 act as barriers to using e-insurance, since all of their significance values were smaller than 0.05.

For H2, the organizational factor variable was examined using 9 questions and 4 factors: internal policies, interorganizational technology, insurance rules, and the structural factor. To test their significance, a One-Sample T-Test was used. The results showed that internal policies with a mean of 8.55, insurance rules with a mean of 8, the interorganizational technology factor with a mean of 8.84, the structural factor with a mean of 10.92, and generally the organizational factor with a total mean of 33.2 act as barriers to using e-insurance in Iran Insurance Company (p < 0.05).

For H3, the behavioral factor variable was examined using 6 questions and 3 factors: staff resistance against changes, the lack of top-manager support, and expert staff shortage. To test their significance, a One-Sample T-Test was used. The results showed that staff resistance against changes with a mean of 7.84, the lack of top-manager support with a mean of 8.5, the expert staff shortage factor with a mean of 8.29, and generally the behavioral factor with a total mean of 24.64 act as barriers to using e-insurance in Iran Insurance Company (p < 0.05).

6 Conclusion

The general results of this paper are represented in Table 2.


Variable                                       Mean    Std. deviation   Kolmogorov-Smirnov z   Significance level
Environmental factor                           33.2    2.8              1.11                   0.165
Technological factor                           2.1     7.5              9.4                    0.165
Legal factor                                   11.98   2.1              2.64                   0.08
Cultural factor                                8.32    1.19             1.88                   0.059
Organizational factor                          36.34   4.05             1.28                   0.11
Internal policy factor                         8.5     1.22             1.77                   0.059
Interorganizational technology factor          8.8     0.88             2.74                   0.066
Insurance rule factor                          8       1.35             1.9                    0.063
Structural factor                              10.9    2.08             1.49                   0.088
Behavioral factor                              24.64   2.7              1.65                   0.085
Personnel resistance against changes factor    7.84    1.57             1.41                   0.072
Lack of manager support factor                 8.5     0.99             1.96                   0.06
Expert human resource shortage factor          8.29    1.28             1.47                   0.067

Table 1: Results of the Kolmogorov-Smirnov test for identifying data normality
Rank   Mean Rank   Variable
1st    9.43        Environmental Factor
2nd    8.65        Technological Factor
3rd    7.68        Legal Factor
4th    5.09        Cultural Factor
5th    4.46        Organizational Factor
6th    4.45        Internal Policy Factor
7th    4.15        Interorganizational Factor
8th    4.12        Insurance Rule Factor
9th    3.61        Structural Factor
10th   3.35        Behavioral Factor

Table 2: The general results of the study

Since all research hypotheses were confirmed, reflecting the above-average obstructiveness of environmental, organizational, and behavioral factors, it is suggested that Iran Insurance managers should try to remove them. Because the technological field has the highest obstruction value, Iranian insurance managers should improve their technological capabilities and remove its obstacles. From the environmental aspect, specific regulations should be provided for insurance companies in the field of electronic signatures, contracts, and transactions. Trade rules should be amended, supervised, and followed by the officials and all other stakeholders. People should be informed about the advantages of e-trade, and its culture should be extended in organizations. The culture of using computers and the Internet among different classes and insurers should be extended. Definitions of Internet crime and penalties should be clarified for Internet users. Enough


telecommunication and communicative bases should be provided for exerting e-trade in Iran. The necessary context and equipment should be provided so that the public, or at least the target electronic-insurance customers, have access to the Internet. In the behavioral field, enough expert staff with knowledge of information technology and e-insurance should be employed. Attempts should be made to reduce staff resistance against changes through educational classes, meetings, and the distribution of informative brochures, posters, and the like in the organization. Top managers should promote their knowledge about communication and information technology. Trust should be created in customers toward the privacy and confidentiality of information, transactions, and Internet documents. Top insurance managers should be sent to the pioneering countries in this field for training, observing insurance advances up close, or interacting with their managers. In the organizational field, insurance companies should be realistic and welcome e-trade after providing the essential capacities and capabilities. It is also suggested that insurance fees and informative advertisements should be accessible to customers online and in the short term. E-insurance can be started in its simpler forms, such as individual insurance; the cost savings can be compared with the conventional ways and then possibly generalized to the whole insurance industry, because the sale and supply of complicated or unique insurance products need properly designed networks for specific and distinct studies. Insurance companies should cooperate optimally with the Central Insurance Company of Iran in preparing e-insurance standards. The Central Insurance Company of Iran should identify proper poli-


cies and programs in the field of e-insurance. Training programs on computers, the Internet, and e-insurance for organizational staff should be a priority.

References
[1] M Azad, Identifying and Investigating Effective Factors in
Purchase Purpose of E-Insurance in Tehran, 2010.
[2] A Ebrahimi, E-Trade and E-Insurance It.Technical Quarterly of Asia, 2005.
[3] F Deghpasand, E-Trade and E-Insurance. Planning Assistance of Trading Ministry, Sizan Publication, 2006.
[4] J Sahamian, The Challenges and Strategies of IT Development in Insurance Industry of Iran, Conference of Managing Insurance Challenges (2008).
[5] A Sarafizadeh, IT in the Organizations, Mir Publication,
2008.
[6] Sh Azizi, Identifying the Barriers of E-Trade Usages in Iran
Khodro Factory and Solutions for Them., 2005.
[7] F Ghasemzadeh, Legal Challenges of Using E-Trade in Iran,
Article Collections of E-Trade Conference. Bazargani publication., 2005.
[8] B Ghezelbash, The Principals of Supervising E- Insurance:
A Phenomenon in Insurance World, Insurance Research
House, 2005.
[9] M Castles, Information Arena: Society, Economy, and
Culture: Translated By Aligolian, A; Khakbaz, A.Tarhno.
Tehran, 2002.
[10] A Kameli, Marketing And Selling E- Insurance, Technical
Quarterly of Asia, 2005.

[13] A Afuah and C Tucchi, Internet Business Models and


Strategies: Text and Cases, New York: Mc Graw Hill, 2003.
[14] T Albert and W. B Sanders, E Business Marketing, New
Jercy. Prentice Hill, 2003.
[15] A Bender and J Marks, E-insurance, CSFB, Group Technology, E-commerce, 2000.
[16] R Bernam, P Baines, and P Garanear, Contemporary
Strategic Marketing, New York: Palgrave Macmillian, 2003.
[17] E Booker, Web Users Cruising for information, Not Purchases, Computer World, 1995.
[18] A.A Bromideh and M.M Amani, The Necessity of ICT and
E-Commerce Applications in the Iranian Insurance Industry: An Unbundling Proposal, presented in the 1st Conference on Insurance & ICT, Central Insurance Company of
Iran, Iran (2004).
[19] R Burder and D Dias, On E-Insurance Strategy, the EInsurance Company, presented in the 1st Conference on Insurance & ICT, Central Insurance Company of Iran, Iran
28.
[20] R Burder, S Dias, and P Leukert, On E-Insurance Strategy.
Goldman Saohs.
[21] Pave Chaffey, Internet Marketing, Strategy Implementation and Practice. UK: Prentice Hall, 2000.
[22] S Donaton, Pathfinder Blaees a Trail to Ads, Advertising Age, EIU (The Economist Intelligence Unit) 2007,
E-insurance-creating a competitive Advantage, Insurance
Journal, well Publish Hing, 1995.
[23] C Gersh and P Weiser, Capitalizing on E-Business in Insurance: Strategies for Success, 2001.
[24] David Jobber, Principles and Practice of Marketing, London. Mc Graw Hill, 2004.

[11] Mirzai Ahar Najai, Representing a 3d Analysis Model for


Management Theories Bases, Management Science. Tehran
University, 2005.

[25] P Kotler, A Framework for Marketing Management, USA:


Prentice Hill, 2001.

[12] M Nahavandian and A Haghighatkhah, E-Trade Development in Iran, Trade Reasearches, 2005.

[26] R Swiss, World Insurance in 2005: Insurance Industry on


the Road to Recovery, USA: Prentice Hill, 2006.


Identifying and Prioritizing Effective Factors in Electronic Readiness


of the Organizations for Accepting and Using
Teleworking by Fuzzy AHP Technique
(Case Study: Governmental and Semi-Governmental
Organizations in Tabriz City)
Morteza Shokrzadeh

Naser Norouzi

Islamic Azad University, Jolfa International Branch

Islamic Azad University, Jolfa International Branch

Jolfa, Iran

Jolfa, Iran

Morteza.Shokrzadeh@yahoo.com

Jabrael Marzi Alamdari

Alireza Rasouli

Islamic Azad University of Jolfa International Branch

Islamic Azad University E-Campus

Jolfa, Iran

Iran

Abstract: Nowadays, moving toward globalization, removing physical borders, and living in a global village have made societies accept information technology as an inseparable part of their lives. Teleworking is an important innovation embedded in the context of information technology and the Internet. But before any widespread use of a new technology, the necessary basis should be provided for it to be welcomed by the users; otherwise, obligation in its exertion will lead the society to blind usage of it. This paper first investigated the effective factors in the electronic readiness of governmental and semi-governmental organizations of Tabriz city; then, the effective factors in accepting information technologies and teleworking were recognized using research theories, exploratory factor analysis, and the KMO test. To identify different aspects of the electronic readiness of the organizations considering their types and dimensions, 34 factors were regarded, from which 7 factors were extracted, expressing 66.74% of the total variance. To identify different aspects of information technology and teleworking acceptance, 19 variables were used, from which 7 variables were extracted, eliminating 2 questions (11 and 19) from the questionnaire, expressing 75.27% of the total variance. Using a One-Sample T-Test, the effect of each variable on the electronic readiness of organizations was tested through the research hypotheses. Exerting fuzzy AHP (Chang's method), the factors were ranked. The results showed that electronic readiness variables have higher priority than technology acceptance variables.

Keywords: Teleworking, E-readiness, Technology Acceptance, Information Communication Technology, Fuzzy AHP

Introduction

Investigating the readiness level of different organizations is the first step; then, providing the essential contexts for it leads organizations to using teleworking (Abtahi 2010, 16). Since accepting teleworking processes needs organizational and staff behavioral changes, managers evaluate organizational readiness for accepting teleworking processes or changes to iden-

∗ Corresponding Author, P. O. Box 5413676996, F: (+98) 4923025252, T: (+98) 914 4919720


tify a proper starting point for it; otherwise they will have to bear excessive costs rather than benefits. Readiness is a prerequisite for the successful confrontation of a person or organization with organizational changes. Therefore, a true readiness estimation seems necessary for the true direction of attempts and strategies. Other prerequisites for the successful implementation of teleworking should also be carefully considered. The time and place in which people accept a new technology and adapt to it are important. Finding the effective variables in accepting and using IT has been of great interest to researchers, without which no efficiency can be achieved.

2 Teleworking

In the industrial era, trade centers were organized in definite locations for more conformity. Work instruments were concentrated and unmovable, and the physical presence of the staff was necessary at work. In the informatic era, production and supply of tools are electronic. With the advent of cheap computers, networks and the Internet are accessible for everyone without a physical presence at work. Along with great technological advances, work hours, environment, and time are losing their importance; instead, job quality is gaining more significance. Thus, liquid work or teleworking is growing fast in every place and time. Unlike traditional employees, a teleworker is away from the formal organizational area and is connected to the employer by electronic media. Considering the above-mentioned points, this paper follows the following scientific and applied goals:
1. Identifying the effective factors in the electronic readiness of governmental and semi-governmental organizations for exerting teleworking.
2. Identifying the effective factors in accepting and applying teleworking by governmental and semi-governmental users.
3. Prioritizing the effective factors in governmental and semi-governmental organizations' readiness for accepting and using telework by the AHP technique.
4. Representing a model including the indices and effective factors in governmental and semi-governmental organizations' readiness for accepting and using telework.

3 Methodology

This study is a survey with applied goals using descriptive methods. The statistical population of this study in 2011 included 120 people. Using simple random sampling and the Cochran formula, a sample size of 92 people was obtained. Library and field data-gathering methods were exerted. Two researcher-made questionnaires were used: one about the electronic readiness of the organizations with 34 questions, and the other examining the acceptance of IT from the view of informatic employees of governmental organizations with 19 questions, both using a 5-item Likert scale (very low, low, average, high, very high). The number of questions matched the number of criteria and sub-criteria. A questionnaire including fuzzy pairwise comparisons was used to weigh these factors.

4 Data analysis

After gathering the questionnaires, they were codified. To analyze the questionnaires' data, they were given to SPSS software to be investigated. To identify the effective factors, factor analysis and the KMO test were used. To test the significance of the results, a One-Sample T-Test was used, and to prioritize the factors, fuzzy AHP was used.

5 The results of exploratory factor analysis

After the indices and measurement criteria of electronic readiness and telework/IT acceptance had been identified for each index, and before doing the factor analysis, the KMO test was applied to the different factors of the electronic readiness of the organizations; the KMO test was also done for the different variables of telework/IT acceptance. The results of this test showed the acceptability of the variables.

Factor analysis of the variables was done using orthogonal Varimax rotation. Generally, factor analysis of the variables was done by main factor analysis: for the electronic readiness of the organizations, 7 factors (managerial indices, informatic and communicative bases indices, human resource indices, accessibility of IT, network-based economy, security indices, and network-based policies) were extracted, expressing 66.74% of the whole variance; for telework/IT acceptance, 7 factors (perceived profitability, ease of use, technology use purpose, job relation and conformity, mental norms and mental image, testability, and provability) were examined. Eliminating 2 questions (11 and 19) from the questionnaire that had no factor loading, those 7 factors represented 75.27% of the total variance.


Criterion                        1      2      3      4      5      6      7
IT and informatic bases          0.171  0.169  0.166  0.173  0.173  0.170  0.175
Managerial indices               0.158  0.173  0.166  0.162  0.149  0.154  0.143
Human resource indices           0.141  0.134  0.132  0.136  0.148  0.146  0.150
IT accessibility                 0.131  0.127  0.125  0.127  0.132  0.135  0.139
IT use purpose                   0.1    0.097  0.105  0.103  0.107  0.095  0.093
Perceived benefit                0.083  0.093  0.091  0.085  0.078  0.080  0.079
Ease of use                      0.078  0.074  0.076  0.075  0.081  0.082  0.080
Job relation and conformity      0.072  0.075  0.078  0.07   0.069  0.066  0.074
Mental norms and picture         0.065  0.058  0.061  0.069  0.063  0.072  0.067

Table 1: Final weights of electronic readiness and IT/telework criteria (columns 1-7: the seven expert respondents)


Using Cronbach's α, each factor and its variables were examined, and the obtained values confirmed their validity.

6 Hypotheses test

H1. What factors contribute to IT/telework acceptance in governmental and semi-governmental organizations in Tabriz?
To answer this question, the IT/telework acceptance factor was investigated with 7 variables and 19 questions in the questionnaire. To test the significance of the values, a One-Sample T-Test was used. The results showed that perceived benefit with a mean of 6.38, ease of use with a mean of 8.2, technology use purpose with a mean of 8.8, job relation and conformity with a mean of 4.3, mental norms with a mean of 5.9, testability with a mean of 8.1, provability with a mean of 3.2, and in general technology/IT acceptance with a mean of 88.9 contribute to telework exertion, because the significance level of the One-Sample T-Test is smaller than 0.05.

H2. What factors contribute to the electronic readiness of governmental and semi-governmental organizations for exerting telework in Tabriz?
To answer this question, the electronic readiness factor was investigated with 7 variables and 34 questions in the questionnaire. To test the significance of the values, a One-Sample T-Test was used. The results showed that managerial indices with a mean of 6.38, basis indices with a mean of 22.91, human resource indices with a mean of 11.96, network-related policies with a mean of 5.2, network-related economy with a mean of 8.1, security indices with a mean of 11.06, IT accessibility with a mean of 10.5, and in general electronic readiness with a mean of 45.2 contribute to telework exertion, because the significance level of the One-Sample T-Test is smaller than 0.05.

H3. Which of the effective factors in the electronic readiness of governmental and semi-governmental organizations has higher priority for exerting telework in Tabriz?
To calculate the criteria weights and effective factors, as well as to prioritize them by fuzzy AHP and the fuzzy pairwise comparison questionnaire, 7 questionnaires were distributed among experts using the Delphi method. Chang's fuzzy AHP method and the Excel and Expert Choice software were used to weigh and prioritize each criterion. After analyzing all the questionnaires, they should be incorporated. To prioritize the criteria and effective factors, first the final weights of all criteria were put in a table (Table 1) and then the geometric mean of each row was calculated; the first row belongs to the respondents. Thus, the priorities of the effective factors in electronic readiness and IT/telework are shown in Table 2.

Priority   Geometric mean   Criterion
6          0.084            Perceived benefit
7          0.078            Ease of use
5          0.1              Technology use purpose
8          0.072            Job relation and conformity with personal life style
9          0.065            Mental norms and picture
1          0.171            Informatic and communicative bases
2          0.157            Managerial indices
3          0.141            Human resource indices
4          0.131            IT accessibility

Table 2: The priorities of the effective factors in electronic readiness and IT/telework
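The prioritization step described for H3 (aggregate the seven experts' weights per criterion by the geometric mean, then rank) can be reproduced with a few lines of Python. This is only an illustrative sketch of that aggregation step using the weights of Table 1; it is not the Chang extent-analysis computation itself, and the variable names are ours.

import numpy as np

# weights of each criterion as given by the 7 expert respondents (Table 1)
weights = {
    'Informatic and communicative bases': [0.171, 0.169, 0.166, 0.173, 0.173, 0.170, 0.175],
    'Managerial indices':                 [0.158, 0.173, 0.166, 0.162, 0.149, 0.154, 0.143],
    'Human resource indices':             [0.141, 0.134, 0.132, 0.136, 0.148, 0.146, 0.150],
    'IT accessibility':                   [0.131, 0.127, 0.125, 0.127, 0.132, 0.135, 0.139],
    'Technology use purpose':             [0.100, 0.097, 0.105, 0.103, 0.107, 0.095, 0.093],
    'Perceived benefit':                  [0.083, 0.093, 0.091, 0.085, 0.078, 0.080, 0.079],
    'Ease of use':                        [0.078, 0.074, 0.076, 0.075, 0.081, 0.082, 0.080],
    'Job relation and conformity':        [0.072, 0.075, 0.078, 0.070, 0.069, 0.066, 0.074],
    'Mental norms and picture':           [0.065, 0.058, 0.061, 0.069, 0.063, 0.072, 0.067],
}

# geometric mean of each row, then rank criteria from highest to lowest weight
geo = {c: float(np.exp(np.mean(np.log(w)))) for c, w in weights.items()}
for rank, (crit, g) in enumerate(sorted(geo.items(), key=lambda kv: -kv[1]), start=1):
    print(f'{rank}. {crit}: {g:.3f}')
# this reproduces the ordering of Table 2 (informatic bases first, mental norms last)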


7 Conclusion

These priorities show that the electronic readiness criteria are of higher importance than the IT acceptance criteria. In other words, to exert teleworking in governmental and semi-governmental organizations, first the electronic readiness criteria, including informatic and communicative bases, managerial indices, human resource indices, and IT accessibility, should be provided, and then the IT acceptance criteria, such as IT use purpose, perceived benefit, ease of use, job relation and conformity with personal lifestyle, and mental norms and picture, should be met. These points are reflected in Figure 1.

Figure 1: Research model

7.1 Suggestions from the study

According to the research results, the following suggestions can be presented:
1. Telecommunication bases should be provided for the users to enable them to use teleworking.
2. The organizations should conform themselves electronically to provide proper contexts for teleworking.
3. Trust should be created in the managers toward the employees for teleworking.
4. Managers should agree with using electronic communications.
5. Managers should clearly determine vocational goals for the employees.
6. Employees should be trained in IT, and enough budget should be allocated for this purpose.
7. Employees should have access to IT experts inside and outside the organization to support organizational activities.

7.2 Suggestions for further research

1. Managers should evaluate organizational capabilities and prioritize organizations according to electronic readiness and IT acceptance using fuzzy AHP or the model of this research.
2. All 14 criteria identified by factor analysis should be weighed by fuzzy AHP.
3. The relation between the effective factors in electronic readiness and IT acceptance for teleworking should be determined.

References
[1] S. Abtahi and B. Jokar, Evaluating E-Trade Performance in Manufacturing Units of Shiraz According to Electronic Readiness, Business, and Their Effects: A Report of a Study Scheme of the Business Organization of Fars Province (2010).
[2] American Management Association, AMA/ITAC Survey on Telework, available at: www.amanet.org (2010).
[3] S. Al-Gahtani, Computer-Technology Adoption in Saudi Arabia: Correlates of Perceived Innovation Attributes, Information Technology for Development 10 (2006), 57-69.
[4] Cyber Security Industry Alliance, Making Telework a Federal Priority: Security Is Not the Issue (2005).
[5] F. D. Davis, R. P. Bagozzi, and P. R. Warshaw, User Acceptance of Computer Technology: A Comparison of Two Theoretical Models, Management Science 35 (1989), no. 8, 982-1003.

[6] J. Edwards, Assessing Your Organization's Readiness for Teleworking, The Public Manager 30 (2001), no. 1. Available at: http://www.thepublicmanager.org/docsarticles/archive/Vol30,2001/1./ol30 , Issue03W30N3AssessingYour0g-Edwards.pdf.
[7] Y. C Erensal, T Oncan, and M. L Demircan, Determining
Key Capabilities in Technology Management Using Fuzzy
Analytic Hierarchy Process, A Case Study of Turkey, Information Sciences, Industrial Engineering Department, Dogus
University 176 (2006), 27552770.
[8] M Fathian and M Khanjari, Teleworking and Provision of
Proper Entrepreneurship with Modern Technologies, 1st National Conference of Entrepreneurship, Creativity, and Future Organizations (2008).
[9] M Castles, Information Arena: Society, Economy, and
Culture: Translated By Aligolian, A; Khakbaz, A.Tarhno.
Tehran, 2002.
[10] V Illegems, A Verbeke, and R SJegers, The Organizational Context of teleworking, Implementation, Technological Forecasting and Social Change 68 (2001), no. 2, 275291.
[11] K.B Kowalski and Jennifer A.S, Critical Success Factors in
Developing Teleworking Programs, Benchmarking: An International Journal 12 (2005), no. 3, 236249.
[12] Y Lee, K. A Kozar, and K. R. T Larsen, The Technology
Acceptance Model: Past, Present, and Future, Communication of the Association for Information Systems 12 (2003),
no. 50, 752780.
[13] Mark. M.H and F Clark, Using the AHP to Determine the
Correlation of Productive Issues to Profit, European Journal of Marketing 35 (2001), no. 7.


[14] Robert E Morgan and W. B Sanders, Teleworking: an Assessment of the Benefits and Challenges, For European
Business Review 16 (2004), no. 4.
[15] Nag T Nguyen and J Marks, The Consequence of Spatial Distance and Electronic Communication Teleworks: A
Mull-level Investigation, A dissertation submitted to Temple University Graduate Broad. (2004).
[16] Obra Ana Rosa del Aguila, Sebastian Bruque Camara, and
Antonio Padilla Melendez, An Analysis of Teleworking Centres in Spain. Facilities 20 (2002), no. 11/12, 394-399.
[17] A. Oddershede, A. Arias, and H. Cancino, Rural Development Decision Support Using the Analytic Hierarchy Process, Mathematical and Computer Modelling 46 (2007), 1107-1114.
[18] M Perez, S Angel Marti-nez, C Filar De.Luis, and J Mana
Jose Vela, AI Technology Acceptance Model of Innovation
Adoption: The Case of Teleworking, European Journal of
Innovation Management 7 (2004), no. 4, 280291.
[19] G Pophal, Telecommuting: Managing Off-Site Staff for
Small Business (2008).
[20] Sanchez, Angel Martinez, Prez Perez, Manuela, et al., Telework Adoption Change Management and Firm Performance, Journal of Organizational Change Management 21.

Hybrid Harmony Search for the Hop Constrained Connected Facility


Location Problem
Bahareh khazaei

Farzane Yahyanejad

Institute for Advance Studies in Basic Sciences

Institute for Advance Studies in Basic Sciences

Department of Computer and Information Technology

Department of Computer and Information Sciences

b khazaei@iasbs.ac.ir

f.yahyanejad@iasbs.ac.ir

Angeh Aslanian

S. Mehdi Hashemi

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics

Department of Mathematics

Angeh.a2@gmail.com

hashemi@aut.ac.ir

Abstract: The Hop Constrained Connected Facility Location (HC-ConFL) problem is a combination of connected facility location and Steiner trees with hop constraints. HC-ConFL is an NP-complete problem, and until now no heuristic algorithm has been customized to solve it. This paper customizes the Harmony Search algorithm in order to solve this problem. For comparison, we also solve the problem's model with CPLEX. Experimental results demonstrate that the proposed algorithm is an effective procedure that finds high-quality solutions very fast.

Keywords: Hop Constrained Steiner Trees, Connected Facility Location, Harmony Search Heuristic, Linear Programming Models.

Introduction

Due to the recent growth of telecommunication networks, telecommunication companies have motivated researchers to find solutions for network design problems. Such networks are designed to connect a source to subscribers through intermediate switching devices. The intermediate switching devices installed in these networks are very expensive. In the context of reliability, hop constraints (HC) are used as a limit on the number of intermediate devices used between the source and the subscribers. The aim of this paper is to minimize the cost of such networks. Our heuristic complements Harmony Search with a modified Bellman-Ford algorithm. The HC-ConFL problem does not belong to APX. A similar problem arises in the design of communication networks: Gollowitzer and Ljubic (2010) [1] have shown that the Fiber-to-Curb strategy can be modelled by connected facility location. The HC-ConFL problem is related to two well known problems: the Connected Facility Location problem (ConFL) and the Steiner tree problem with hop constraints.

In ConFL (Karger and Minkoff [2]), an undirected graph G = (V, E) is given with a dedicated root node v0 ∈ V and edge costs ce ≥ 0, e = (u, v), corresponding to the cost of installing a new route between u and v. Furthermore, a set of facilities F ⊆ V and customer nodes D ⊆ V are given, and an opening cost fi ≥ 0 is assigned to each facility. We try to find a minimum cost tree such that every customer node is assigned to an open facility and the open facilities are connected to the root through a Steiner tree.

Steiner tree problem with hop constraints (SPH): given a directed connected graph G = (V, E) with non-negative weights associated with the edges, a set Q of essential vertices, a root vertex, some other non-essential nodes, and a positive integer H ≤ n, the problem is to find a minimum cost sub-

∗ Corresponding Author, P. O. Box 45195-1159, F: (+98) 241 421-5071, T: (+98) 241 415-5051


graph T of G such that for each essential vertex there exists a path in T from v0 ∈ V with no more than H intermediate edges (eventually including vertices from S = V \ Q) [3]. Here the objective pivots around developing heuristics that behave efficiently in practice.

Figure 1: 1-Constraint Connected Facility Location Example

The rest of the paper is organized as follows. In Section 2, the Harmony Search algorithm is introduced. The customized algorithm details are presented in Section 3. Section 4 is devoted to the implementation and results, and Section 5 concludes.

2 Harmony Search Algorithm

The Harmony Search algorithm is a metaheuristic algorithm for optimizing mathematical functions and engineering problems, which is inspired by the art of music [4]. Similar to the way a musician improves his skill based on an aesthetic standard, design variables in a computer memory can be improved based on an objective function. The steps of the Harmony Search algorithm are summarized as follows:

STEP 1. Define the objective function of the problem and initialize the parameters (HMCR, PAR, bw);
STEP 2. Harmony memory construction (HM);
STEP 3. New harmony improvisation;
STEP 4. Harmony memory update;
STEP 5. Termination criterion check.

In the step of defining the objective function, the optimization problem is specified as

Minimize f(x)                                         (1)
subject to  xi ∈ Xi,  i = 1, 2, ..., N                (2)

where f(x) is the objective function, the solution vector X is the set of decision variables, and N is the number of decision variables. In this step, the parameters of the algorithm are also defined. The basic parameters are: the Harmony Memory Size (HMS), the Harmony Memory Considering Rate (HMCR), the pitch adjusting rate (PAR), the bandwidth (bw), and the number of iterations we want the algorithm to execute (ITR). The harmony memory is where the best harmonies (solution vectors) are saved during the algorithm. HMS is the number of best harmonies kept in the harmony memory, similar to the gene pool in a Genetic Algorithm. HMCR, PAR, and bw are parameters which are used during the algorithm in order to control the harmony improving process. In Step 2 the Harmony Memory (HM) is filled with randomly generated solution vectors. The harmony memory matrix looks like the matrix below.

    x1^1          x2^1          ...   xN^1          |  f(x^1)
    x1^2          x2^2          ...   xN^2          |  f(x^2)
    ...                                             |  ...
    x1^(HMS-1)    x2^(HMS-1)    ...   xN^(HMS-1)    |  f(x^(HMS-1))
    x1^HMS        x2^HMS        ...   xN^HMS        |  f(x^HMS)

For ITR iterations we repeat Steps 3 and 4, each time improvising a new harmony and updating the harmony memory. The new harmony is constructed variable by variable; the following three rules are used to assign a value to each variable of the solution vector X' = (x'1, x'2, ..., x'N):

1. randomly select a value, with probability (1 - HMCR);
2. choose a value from the harmonies in HM, with probability HMCR;
3. pitch-adjust the value chosen by Rule 2, with probability PAR [5].

x'1, the value of the first decision variable for the new harmony, is selected with probability HMCR from the values in the first column of the harmony memory (x1^1, ..., x1^HMS). HMCR is a parameter between 0 and 1. It is the probability of selecting one value from the historical values stored in the HM, while (1 - HMCR) is the probability of randomly selecting one value from the possible range of values [4]. The values of the remaining decision variables (x'2, ..., x'N) are selected in the same way. For instance, when HMCR is 0.9, the probability that the algorithm chooses a decision variable value from the historically stored values in the HM is 0.9, and the probability of selecting it from the total space is 0.1. Once a decision variable is selected by memory consideration, it is examined to determine whether it should be pitch-adjusted or not. Here the PAR parameter is used, which is between 0 and 1 and is the probability of pitch-adjusting the selected value to something close to it; (1 - PAR) is the probability of doing nothing to the value (not adjusting the selected value and using exactly the value chosen from the HM) [6]. For a clear understanding see the pseudocode in the next section. Once the whole new solution vector is created, we replace the worst solution in the harmony memory with it if the new one is a better solution.
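To make the loop concrete, here is a compact, generic Harmony Search sketch in Python for a real-valued minimization problem. It is a textbook-style illustration of the steps described above, not the authors' C++ implementation, and the parameter values are only examples.

import random

def harmony_search(f, bounds, HMS=50, HMCR=0.9, PAR=0.1, bw=0.05, ITR=2000):
    # Step 2: fill the harmony memory with random solution vectors
    hm = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(HMS)]
    cost = [f(x) for x in hm]
    for _ in range(ITR):
        new = []
        for i, (lo, hi) in enumerate(bounds):
            if random.random() < HMCR:                  # memory consideration
                v = hm[random.randrange(HMS)][i]
                if random.random() < PAR:               # pitch adjustment
                    v = min(hi, max(lo, v + random.uniform(-bw, bw)))
            else:                                       # random selection
                v = random.uniform(lo, hi)
            new.append(v)
        c = f(new)
        worst = max(range(HMS), key=lambda k: cost[k])  # Step 4: replace the worst
        if c < cost[worst]:
            hm[worst], cost[worst] = new, c
    best = min(range(HMS), key=lambda k: cost[k])
    return hm[best], cost[best]

# example usage: minimize a simple sphere function over [-5, 5]^3
x, fx = harmony_search(lambda v: sum(t * t for t in v), [(-5, 5)] * 3)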


3 Customizing the Harmony Algorithm for HC-ConFL

In our implementation we consider HMS to be 50 unique solution vectors; after generating each new solution vector we also check that we do not add duplicate harmonies to the HM. The HMS is chosen by experiment and trial and error. Because a large part of a newly generated harmony in each iteration is made by using the previous best harmonies saved in the HM, if we set the HMS to a big number, say 5000, then the new harmony would also be affected by some not really good harmonies, and newly generated harmonies would not improve as we expect. On the other hand, if the HMS is set to something smaller, then the improvement is made faster, but as the innovation in creating new solution vectors also becomes smaller, we would get trapped in some good and close-to-each-other solutions (local optima). By trial and error the parameters HMCR and PAR are set to 0.95 and 0.1, respectively. Choosing the number of iterations of HS is important: a small number leads to non-optimal solutions and a large number takes time. So instead of choosing a static number of iterations (ITR) we set it dynamically. We use two parameters for choosing ITR in each run. First, by trial and error we found that (250 × hop constraint number) is a proper limit for the ITR. Second, we check the improvement rate of the new harmonies. By combining these two parameters we end the algorithm's iterations when the current iteration number is close to the ITR limit (the first parameter described) and the best solution has not improved during the last iterations.

3.1 Initializing the harmony memory

The solution in this paper is a 0/1 vector whose size is the number of facilities. Hence, to build the HM, we need to produce HMS vectors X = (x1, ..., x|F|), where xi = 1 means that facility i is open. While initializing the HM, besides generating totally random harmonies, we also tried to solve some sub-problems (sub-problems with a smaller hop constraint number) by recursively using the algorithm on them, and included the best solutions of the sub-problems in the HM. This takes a bit more time, but as we fill the HM with initial good-quality solutions it significantly improves the total result.


3.2 Improvising a new harmony

At each iteration a new harmony is created; the harmony vector is built step by step by generating each element. For generating each element we use a random number (rnd) in [0, 1]. If the number is greater than HMCR we generate a random value for the current element; if rnd is less than HMCR we choose a value among the corresponding elements of the previously generated harmonies in the HM; and if rnd is also less than PAR we change the chosen value using a function like Xi(pitched) = Xi(chosen) ± bandwidth tolerance. In our case, where the vector elements can take only the two values 0/1, changing the current value does not give something near the current value but exactly the opposite value. This might be useful in the initial iterations, but as the algorithm progresses it would change good solutions into opposite, bad solutions. So we start the HS algorithm with a PAR of 0.1, but as the algorithm progresses we dynamically reduce the pitch rate to 0.01; various tests on different data sets showed that with this change the algorithm produces better solution harmonies.

3.3 Objective function and evaluating the harmonies

As mentioned before, the purpose is to open facilities on some nodes of the graph in such a way that the distance between the customers and their nearest open facilities, plus the distance from the facilities to the root node under the hop constraint, becomes minimum. Before evaluating each harmony we need to ensure that it exactly meets the HC-ConFL problem constraints; we can also make some simple improvements in the generated harmony by greedy decisions. So before evaluating the harmonies we refine and validate them. The generated harmony must induce a connected subgraph, and in fact a tree with maximum depth H (the hop constraint) of minimum cost. First we calculate the shortest path to each node using a modified Bellman-Ford algorithm. The modified Bellman-Ford algorithm is a variant of the classic Bellman-Ford algorithm in which we also count the number of steps (intermediate edges), using a 2D array indexed by [number of steps][vertex], and ensure that no path uses more than H edges. In this algorithm the result is multiple shortest paths to each node, one for each possible number of steps. Next we exclude from our solution vector all the nodes that are not reachable within H hops. The next step is to connect each customer node to the nearest and cheapest facility node which is open (xi == 1 in our generated harmony); we add all these

minimum costs for each customer to the harmony's total cost and then exclude all facility nodes to which no customer is connected. At the next step we need to ensure that the solution subgraph is a tree. Recall that after using the modified Bellman-Ford, the shortest paths are related to the number of intermediate hops (see Figure 2). Obviously the shortest path from the root to F1 is the blue path. Suppose the hop constraint is 4 edges. If in our generated harmony only Facility 4 and Facility 1 are open, we can go to F1 using the blue path and then to F4 and still have a subtree; but if in the generated harmony Facility 7 and Facility 1 are open, we cannot go to F1 using the blue path, as it would take more than 4 hops to reach F7, so we have to either exclude F7 from our solution vector or use the red path to reach F1 and then F7.

Figure 2: Example

So by using the classic greedy algorithm described in [6], the subgraph and the Bellman-Ford shortest paths are converted to a tree with maximum depth H and minimum cost. The cost of the tree is added to the harmony's total cost (this validation is also done when we initially fill the HM with randomly generated harmonies). After the newly generated harmony is validated to match the problem constraints, we add the cost of opening each open facility (xi == 1) to the total cost of the harmony. The next step is to update the HM with this newly generated harmony: if the new harmony's total cost is better than that of the worst harmony in the HM, we replace the worst harmony in the HM with the new one. We use a heap data structure in order to reduce the time of finding the worst harmony in the HM. At the end of the last iteration of the HS algorithm, the best harmony is chosen as the final result. The algorithm pseudocode is:

Harmony search (HS) Algorithm Used:
Begin
. %{Define cost evaluation function
.    f(x) = evaluate( X = x0, x1, ..., x(n-1) )}
. %{Define HMCR (e.g. 0.9), PAR (e.g. 0.1)}
. %{Generate Harmony memory (HM) with random harmonies}
. %{Use HS for some Sub-Problems [in our case: with
.    less hop number], add best Harmonies to HM}
. While (it <= NUMBER_OF_ITERATIONS)
.    While (var <= NUMBER_OF_VARIABLES)
.       If (HMCR <= rand 0-1)
.          use a random value for X_var^it
.       Else If (PAR <= rand 0-1 <= HMCR)
.          choose a value from all X_var^it in HM;
.       Else
.          choose a value from all X_var^it in HM and
.          adjust it to a close value.
.    End While
.    %{Verify Harmony (in our case: if not a Tree,
.      change it to be a Tree with best evaluation)}
.    %{Evaluate new harmony and accept it if it is
.      better than the worst harmony in HM}
. End While
. %{Choose the best Harmony in the HM}
End
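The hop-bounded shortest-path table that Section 3.3 builds with a 2D array [number of steps][vertex] can be sketched as follows in Python. This is our own illustration of that idea, not the authors' C++ code; the graph is assumed to be given as a list of directed edges (u, v, cost).

def hop_bounded_shortest_paths(n, edges, source, H):
    # dist[h][v] = cheapest cost of a path from source to v using at most h edges;
    # classic Bellman-Ford relaxation, kept separately for every hop count h <= H
    INF = float('inf')
    dist = [[INF] * n for _ in range(H + 1)]
    dist[0][source] = 0.0
    for h in range(1, H + 1):
        dist[h] = dist[h - 1][:]                 # a path with <= h-1 edges also uses <= h edges
        for u, v, w in edges:
            if dist[h - 1][u] + w < dist[h][v]:  # relax edge (u, v) at hop level h
                dist[h][v] = dist[h - 1][u] + w
    return dist

# tiny example: nodes 0..3, root 0, hop limit 2
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 5.0)]
d = hop_bounded_shortest_paths(4, edges, source=0, H=2)
# d[2][3] == 5.0: within 2 hops, node 3 is reachable only via the direct (expensive) edge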

4 Computational Results

In this section we report computations with our heuristic for solving HC-ConFL. We coded our heuristic in Visual Studio 2005 (C++). All experiments were performed on an Intel Core2 Quad 2.33 GHz machine with 2 GB RAM.

1. Data set. We consider a class of benchmark instances, originally introduced in Ljubic (2007), and also used by Tomazic (2008) and Bardossy and Raghavan (2010). These benchmarks are obtained by merging data from two public sources [7, 8]. UFLP instances are combined with an STP instance to generate ConFL input graphs in the following way: first, nodes of the STP instance are selected as potential facility locations, and the node with index one is selected as the root. The number of facilities, the number of customers, the opening costs, and the assignment costs are provided in the UFLP files. The STP files provide the edge costs and additional Steiner nodes. We consider mp-{1,2} and mq-{1,2}, a set of non-trivial UFLP instances, and {c,d}n for n ∈ {5, 10, 15, 20}.

2. To evaluate the performance of our algorithm, we solved a formulation of HC-ConFL with AIMMS, using CPLEX as the solver. The model is as follows:

min  Σ_{p=1..H} Σ_{(i,j)∈A_S} C_ij X_ij^p  +  Σ_{(j,k)∈A_D} X_jk  +  Σ_{i∈F} f_i y_i        (3)

s.t.

Σ_{(i,j)∈A_S, i∈S\{k}} X_ij^{p-1}  ≥  X_jk^p,        (j,k) ∈ A_S, j ≠ r, p = 2, ..., H        (4)

Σ_{(i,j)∈A_S} Σ_{p=1..H} X_ij^p  ≥  Y_j,             j ∈ F \ {r}                              (5)

X_ij^p = 0,        (i,j) ∈ A_S,  { i = r, p = 2, ..., H;   i ≠ r, p = 1 }                     (6)

Σ_{(j,k)∈A_D} X_jk = 1,        k ∈ D                                                          (7)

X_jk ≤ Y_j,        (j,k) ∈ A_D                                                                (8)

Y_r = 1                                                                                       (9)

X_ij^p, X_jk, Y_i ∈ {0, 1}                                                                    (10)

Instances    Number of Hops    HS Result    HS Time    AIMMS Result    AIMMS Time    gap
{c5,d5}      H=3               3942.854     0.115      3942.854        3.37          0.0
             H=5               3755.054     1.574      3755.054        9.1           0.0
             H=7               3615.769     12.173     3614.792        167.1         0.02
{c10,d10}    H=3               3802.791     0.661      3802.79         8.7           0.0
             H=5               3555.85      12.66      3552.57         59.87         0.09
             H=7               3525.83      29.265     3520.07         110.172       0.16
{c15,d15}    H=3               3563.588     85.76      3561.83         8.75          0.04
             H=5               3489.542     76.52      3489.02         107.698       0.0
             H=7               3488.81      221.35     3487.792        186.245       0.02
{c20,d20}    H=3               3474.557     153.224    3473.263        137.853       0.03
             H=5               3473.10      810.77     ...             ...           ...
             H=7               3471.801     1349.58    ...             ...           ...

Table 1: Comparison of the HS with AIMMS on large-scale instances


For the instances described above, Table 1 shows the increase of costs caused by a reduced number of allowed hops in the solution.

5 Conclusion

Hop Constrained Connected Facility Location was proposed by Ljubic in 2009, and there is no heuristic for it yet. In this paper we proposed a heuristic algorithm that combines Harmony Search with a modified Bellman-Ford algorithm, and we considered a family of benchmark data. Our extensive computational experiments show that our heuristic obtains high-quality solutions rapidly. The results are quite consistent, in the sense that the variance of the performance gap is quite low.

References
[1] I. Ljubic and S. Gollowitzer, Layered graph approaches to the Hop Constrained Connected Facility Location Problem, Lecture Notes in Computer Science (2010).
[2] D. R. Karger and M. Westermann, Approximation algorithms for data management in networks, Theory of Computing Systems (2003).
[3] S. Voss, The Steiner tree problem with hop constraints, Annals of Operations Research (1999).
[4] X.-S. Yang, Harmony Search as a Metaheuristic Algorithm, Studies in Computational Intelligence, Vol. 191, 2009.
[5] Z. W. Geem and K. S. Lee, Application of Harmony Search to Vehicle Routing, American Journal of Applied Sciences (2005).
[6] A. Kaveh and H. Nasr, Solving the conditional and unconditional p-center problem with modified harmony search: A real case study, Scientia Iranica (2011).
[7] http://www.mpi-inf.mpg.de/departments/d1/projects/benchmarks/UFLP
[8] http://people.brunel.ac.uk/mastjjb/jeb/orlib/steininfo.html

Gene Selection using Tabu Search in Prostate Cancer Microarray


Data
Farzane Yahyanejad
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
f.yahyanejad@iasbs.ac.ir

Mehdi Vasighi
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Technology
Vasighi@iasbs.ac.ir

Angeh Aslanian
Amirkabir University of Technology
Department of Mathematics and Computer Science
Angeh.a2@gmail.com

Bahareh Khazaei
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Technology
b khazaei@iasbs.ac.ir

Abstract: Prostate tumors are the second leading cause of cancer deaths and the most common cancer in males around the world. This paper introduces a method which uses Tabu search to identify the most differentially expressed genes between normal and prostate cancer gene expressions. Tabu search is an optimization method that provides solutions near to the optimal in a large search space. We want to find an optimal subset of genes from the original large data set to reduce the dimensionality of the data and improve classification accuracy between normal and cancer samples. We defined a class separability index, a criterion that is employed as an objective function in Tabu search to maximize the class separability. For comparison, the Genetic Algorithm, as a common optimization method, was also examined, and the experimental results showed that the suggested method is a powerful tool for gene selection in microarray data.

Keywords: Tabu Search; Linear Discriminant Analysis; Gene Selection; Microarray; Prostate Cancer diagnosis .

Introduction

Recently prostate cancer has become the most common cancer in the world. Early diagnosis and detection of this disease lead to earlier treatment and can save lives. Multiple genes are involved in cancer formation. Genomic methodologies have been used to explore gene expression correlates of prostate cancer [1-3]. The benefit gained from gene selection in microarray data is the improvement of the predictive performance of analytical models that identify correlated gene expression profiles [4]. Functional genomics involves the analysis of large datasets of information derived from various biological experiments. One such type of large-scale experiment involves monitoring the expression levels of thousands of genes simultaneously under a particular condition, called gene expression analysis. Microarray technology makes this possible, and the quantity of data generated from each experiment is enormous, dwarfing the amount of data generated by genome sequencing projects [5]. The problem of gene selection is nearly the same as feature selection [6] and wavelength selection [7], whose goal is to acquire an efficient subset so as to reduce the dimension of the set. The goal of gene selection is to find an optimal subset of the most differentially expressed genes that maximizes the separation of the normal and cancer classes. This selection can lead us to build a more rigid and accurate classification model with a simpler discrimination function by increasing the decision region between the cancer and normal classes. We use the Tabu search (TS) method for gene selection, where the fitness of each solution is measured according to the ratio of between-class variations to within-class variations. Tabu search is a simple heuristic,

* Corresponding author: P. O. Box 45195-1159, F: (+98) 241 421-5071, T: (+98) 241 415-5051


which has been applied for solving combinatorial optimization problems. One of the useful aspects of tabu search is the ability to adapt a rudimentary prototype implementation to encompass additional model elements [8]. In the present paper, TS is combined with a separability ratio that helps to find the best subset of genes, including the dependency among them. We increase the search space with this algorithm. We propose the use of the tabu search method for gene selection and compare the performance of the proposed method with a genetic algorithm. In the following section, a brief description of tabu search is presented. In Section 3, the objective function used to find the best solutions is explained. In Section 4, experimental results are shown, providing also comparisons to the Genetic Algorithm, and the effect of the parameters of tabu search on the results is discussed. Finally, a short conclusion is provided in Section 5.

The Tabu search scheme

Tabu search (TS) is a meta-heuristic method introduced by Glover in 1986 for combinatorial problems. The basic idea of TS has also been sketched by Hansen [9]. TS is an extension of the local search method which includes a short-term memory, called the tabu list, to guide the search process and prevent the reversal of recent moves, besides avoiding being trapped in local optima. This elegant method can be viewed as an iterative technique and a local neighborhood search procedure which explores the solution space by making moves from one solution s to a solution s' located in the neighborhood N(s) of s. TS starts with an initial solution, which can be random or obtained by deterministic methods, and evaluates the objective function. Including generation of all possible neighbors of the initial solution, the search algorithm moves with the aim of reaching a solution that is optimal or near-optimal by evaluation of some objective function. To avoid cycling, the method records recent moves in a tabu list. The tabu status of a solution is overridden when a certain criterion is satisfied. The TS algorithm can be summarized in Figure 1.

Figure 1: General flow chart of the Tabu Search Algorithm
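A minimal tabu-search loop in the spirit of the description above (and of Figure 1) can be sketched as follows. The names objective() and neighbours() are placeholders for user-supplied functions, and solutions are assumed to have a hashable encoding (e.g. tuples); this is an illustration, not the authors' implementation.

from collections import deque

def tabu_search(initial, objective, neighbours, tabu_length=100, iterations=1000):
    current = initial
    best, best_value = initial, objective(initial)
    tabu = deque(maxlen=tabu_length)          # short-term memory of recent moves
    for _ in range(iterations):
        candidates = [s for s in neighbours(current) if s not in tabu]
        if not candidates:
            break
        # Move to the best admissible (non-tabu) neighbour, even if it is worse.
        current = max(candidates, key=objective)
        tabu.append(current)
        value = objective(current)
        if value > best_value:
            best, best_value = current, value  # remember the overall best solution
    return best, best_value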
In the TS algorithm, the performance of each solution (neighbor) is measured by an objective function. The objective function is calculated in a similar manner as Linear Discriminant Analysis (LDA), which can be used for data classification and dimensionality reduction. The function maximizes the ratio of the between-class variance to the within-class variance, thereby guaranteeing maximal class separability. Increasing class separability while finding discriminatory genes has the advantages of model rigidity, dimension reduction and a simpler boundary between classes. The class separability (criterion) can be defined as follows:

Criterion = inv(Sw) · Sb      (1)

where Sw is the expected covariance of each of the classes and Sb can be thought of as the covariance of the data set whose members are the mean vectors of each class [10].
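A sketch of this criterion for two classes is given below, assuming each class is a NumPy array of shape (samples x selected genes). The paper does not state how the matrix inv(Sw)·Sb is reduced to a single score; using its trace here is an assumption for illustration.

import numpy as np

def separability(class_a, class_b):
    mean_a, mean_b = class_a.mean(axis=0), class_b.mean(axis=0)
    grand = np.vstack([class_a, class_b]).mean(axis=0)
    # Within-class scatter: pooled covariance of the two classes.
    sw = np.cov(class_a, rowvar=False) + np.cov(class_b, rowvar=False)
    # Between-class scatter: covariance of the class means around the grand mean.
    diff_a, diff_b = (mean_a - grand)[:, None], (mean_b - grand)[:, None]
    sb = diff_a @ diff_a.T + diff_b @ diff_b.T
    # Assumed scalar reduction: trace of inv(Sw) * Sb.
    return np.trace(np.linalg.pinv(sw) @ sb)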

Computational Results and discussion

In this section, the implementation of the proposed method and an extensive set of computational experiments are explained in detail.

3.1 Generation of neighbors

The following scheme is used for generating a neighbor S' of a given solution vector S = (s1, s2, ..., sn). Suppose the original gene (variable) space is n-dimensional and si is the ith component of the given vector S. We define the neighborhood of S with the following steps:
Step 1. Let alpha be a percentage of the n variables. Alpha is a parameter which can be set by the user in this scheme, and will be called the replacement threshold.
Step 2. Set k = 1.
Step 3. Select i randomly.
Step 4. If si = 1, set si = 0, and if si = 0, set si = 1.
Step 5. If k = alpha (the replacement threshold), stop; else set k = k + 1 and go to Step 3.
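The Steps 1-5 scheme can be sketched as follows, assuming a solution is encoded as a 0/1 list over the n genes and alpha is the fraction of positions to flip; names are illustrative only.

import random

def generate_neighbour(solution, alpha):
    neighbour = list(solution)
    n = len(neighbour)
    flips = max(1, int(alpha * n))       # Step 1: alpha percent of the n variables
    for _ in range(flips):               # Steps 2 and 5: repeat the flip `flips` times
        i = random.randrange(n)          # Step 3: select i randomly
        neighbour[i] = 1 - neighbour[i]  # Step 4: flip the i-th bit
    return neighbour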
3.2 Finding the best neighbor

In our algorithm, the chosen neighbor has to be stored in the tabu list (TL). The chosen neighbor is the one that has the maximum value of the objective function. This can be achieved by sorting index k at the termination of the above scheme for generation of neighbors. After finding the best neighbor, the respective index is stored in the tabu list. The TL is a queue with a fixed length (storage); each index remains in it for a number of iterations given by the tabu list length. In this way, storing new indices in the tabu list forces the deletion of old ones.

3.3 Parameter setting

TS is a parameter-sensitive technique. We have performed the analysis at different parameter settings; the effect of these parameters on the performance of tabu search was investigated and is discussed below.

3.3.1 Initial solution

Different initial solutions were generated by varying the number of selected genes (variables). A parameter is defined to control the percentage of randomly selected variables. Different percentages of selected variables (genes), from 20% up to 50%, were tested. At the same time, the effect of the number of neighbors was investigated. Results of changing these parameters on the criterion are shown in Table 1. The percentage of selected variables (genes) at the initial solution can be changed during the running process.

Table 1: Comparison of different initial solutions

3.3.2 TL size

Experimental runs can determine the size of the tabu list. A small TL size allows more diversification, while a list size that is too large forbids too many moves. An appropriate list size depends on the strength of the tabu restrictions. The effect of different sizes of the tabu list on the best solution is shown in Table 2.

Table 2: Observation of TS space with random initialization, alpha = 0.35, tabu-length = 1000 and number of neighbors = 700

3.3.3 Neighborhood

The neighborhood size is the number of random solutions to be generated from the current one. Clearly, the larger the neighborhood size, the better the quality of the solution, but more time is needed to find a comparable result (Table 3).

Table 3: Comparison of different sizes of neighbors

3.3.4 Data set

High-quality expression profiles were derived from 55 prostate tumor and non-tumor prostate samples; some of them are shavings of prostate tissue with cancer, and the others are shavings of prostate tissue without cancer. The matrix contains measurements on 12,626 genes. The data matrix was downloaded from [11].

3.3.5 Result and discussion

The program was run several times with an entirely random starting point. The program also has another parameter which represents the number of iterations. With 2000 iterations we optimized the other parameters of the algorithm, and then different initial solutions were used to start the algorithm.


The goal is to reach the maximum value of the objective function, which means maximum separability between the normal and tumor classes. Figure 2 shows one example of the search space of the algorithm with an ordinary initial solution.

Figure 2: Observation of TS space with alpha = 0.35, tabu-length = 1000 and number of neighbors = 700

Finally, the proposed algorithm was compared to the genetic algorithm and the results are summarized in Table 4. The criterion function used in the GA was the same as in TS. The crossover operation is implemented by randomly choosing the genes. Mutation is modified so that it adds a gene and eliminates another simultaneously. We arrived at the best parameter setting by experiment, which is as follows:
Population: 1000
Crossover rate: 80%
Mutation rate: 0.4
Generations: 1000
Iterations: 2000

Table 4: Comparison of the performance of TS on gene selection with GA

4 Conclusion

In the present study, the TS algorithm was modified to be used for gene selection in microarray data, in order to increase the accuracy of classification of prostate cancer and normal individuals. The ratio based on the variances of the classes was employed as the objective function. The effect of different parameters on the best final solution was investigated. The experimental results were compared with a genetic algorithm, and Table 4 shows that we obtained a better improvement in performance with the TS heuristic.

References

[1] Veer L., Dai H., Vijver M.v.d., He Y., Hart A., Mao M., Peterse H., Kooy K.v.d., Marton M., Witteveen A., Schreiber G., Kerkhoven R., Roberts C., Linsley P., Bernards R., and Friend S., Gene expression profiling predicts clinical outcome of breast cancer, Nature (2002), 530-536.
[2] Perou C.M., Sorlie T., Eisen M.B., van de Rijn M., Jeffrey S.S., Rees C.A., Pollack J.R., Ross D.T., Johnsen H., Akslen L.A., et al., Molecular portraits of human breast tumors, Nature (2000), 747-752.
[3] Golub T.R., Slonim D.K., Tamayo P., Huard C., Gaasenbeek M., Mesirov J.P., Coller H., Loh M.L., Downing J.R., Caligiuri M.A., et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science (1999), 531-537.
[4] Singh D., Febbo P.G., Ross K., Jackson D.G., Manola J., Ladd C., Tamayo P., D'Amico A.V., Richie J.P., Lander E.S., Loda M., Kantoff P.W., Golub T.R., and Sellers W.R., Gene expression correlates of clinical prostate cancer behavior, Cancer Cell (2002).
[5] Madan Babu M., An Introduction to Microarray Data Analysis, Chapter 11, pages 225-249.
[6] Zhang H. and Sun G., Feature selection using tabu search method, Pattern Recognition (2002).
[7] Hageman J.A., Streppel M., Wehrens R., and Buydens L.M.C., Wavelength selection with tabu search, Pattern Recognition (2003).
[8] Glover F., An introduction to Tabu search, ORSA Journal on Computing (1989).
[9] Hansen P., The steepest ascent mildest descent heuristic for combinatorial programming, Computing (1990).
[10] Balakrishnama S. and Ganapathiraju A., Linear Discriminant Analysis: A Brief Tutorial (1998).
[11] http://www.broad.mit.edu/cgi-bin/cancer/dataset.cgi.


BI Capabilities and Decision Environment in BI Success


Mahmoud Shirazi
Shahid Beheshti University
Faculty of Management and Accounting
Tehran, Iran
M-Shirazi@Sbu.ac.ir

Zahra Jafari
Islamic Azad University, Borujerd branch
Faculty of Management
Borujerd, Iran
ZZ.Jafari@gmail.com

Mohammad Hosseion Hayati

SOOSAN HEAVY INDUSTRIES CO., LTD. Korea Iran branch


Tehran, Iran
Hayati@soosan.ir

Abstract: This study aims to find out the roles of BI capabilities and the decision environment in BI (Business Intelligence) success. The main objective of this research is to understand how parameters such as technological and organizational capabilities can affect BI success, considering the decision environment in Iran. Based on our findings, the decision environment can affect some of the items of technological BI capabilities and organizational BI capabilities as they contribute to BI success.

Keywords: Business Intelligence Success; Technological BI Capabilities; Organizational BI Capabilities; Decision Environment.

Introduction

Today, progress in different fields of science has led to the expansion of technologies, the transformation of local business into global business, customer awareness, and high expectations of goods, quality of services, etc. Consequently, there is a tense competition in the business sector to survive. In business, industries need access to information regarding their customers' preferences for goods in order to excel in the market. Paying attention to such issues has helped the business world to overcome some obstacles and reach new promising horizons. Business Intelligence, as a new concept, takes advantage of all these new trends to show itself as a successful model in business administration [1].

Business Intelligence was introduced by Howard Dresner, a Gartner Research Group analyst, as a collection of concepts and ways to expand successful business decision making via decision-supporting systems [2]. Business intelligence is not only seen as an instrument, product or system; it is considered a new approach in organizational architecture. Such a model helps managers to make accurate and right decisions in the shortest time [3].

There are some steps to follow for business intelligence in any organization:
1. Planning.
2. Collecting data.
3. Processing data.
4. Analyzing and producing data.
5. Distributing data [4].

One of the main reasons why an organization employs business intelligence is the help it can provide in decision making. BI can also help to develop service quality. The related softwares are able to extract analyses and provide reports [5].
* Corresponding author: P. O. Box 15876-96441, F: (+98) 21 88537959, T: (+98) 21 88537960-62


The organizations which do not employ BI face problems such as a big volume of data and dispersion, which, in turn, affect their decision-making process. By helping to solve such problems, BI can shape the organizational structure in a way that creates more opportunities for the organization [1]. Figure 1 shows the empirical model used for the study conducted in Iran.

Figure 1: The empirical model by James D. Meernik used for the study conducted in Iran

BI Success

Intelligence plays a very important role in the success of any organization. Figure 2 illustrates how each type of intelligence can interact with the others; it shows the different layers of intelligence. The core in this model is Artificial Intelligence, which provides assistance for our brain in learning, thinking and explaining. The second layer is Knowledge Management, which can help people to accept, organize and share knowledge in any organization. Along with Knowledge Management, any organization should efficiently deal with information. Consequently, business intelligence makes up the next layer, which is followed by competitive intelligence. The latter employs any kind of data to expand systematic management, analysis and execution to obtain the optimum output. Unlike Competitive Intelligence, which mostly deals with data coming from any source, inside or outside, BI mostly takes advantage of the inside data. Collectively all these layers help to make Strategic Intelligence. Strategic intelligence is a collection of all other types of intelligence to provide sufficient information for any organizational decision-making. A newly suggested approach is to add a new layer, called Customer Intelligence, to this model [6].

Figure 2: Different layers of intelligence

BI success is an advantage for organizations to achieve more benefit by investing in it. These advantages can include added value, reduced cost, and improved efficiency. One of the common ways to measure BI is the return on investment (ROI) index [7]. As an illustration, ROI can show the success of BI in marketing. BI can also benefit from CIMM and objective observation.
Business Intelligence Capabilities

Technological BI provides the necessary data, and organizational BI is used to assess the efficiency of the data. All of these can lead to profit for the organization by making decision making more mature [8]. In 2010, James Meernik conducted a study in which he came to the conclusion that technological BI can greatly affect BI success. This means that technology can stimulate BI. Additionally, he realized that organizational capabilities can be effective in IT. In data analysis, flexibility is important in organizational BI.


3.1 Technological BI Capabilities

3.1.1 Data Sources

A data source can be defined as the place where the data that is used for analysis resides and is retrieved. Internal data resides, for example, in a data warehouse, a data mart, or an online analytical processing (OLAP) cube. External data includes the data that organizations exchange with customers, suppliers and vendors [9]. The first question is: Is there any relationship between the data and BI success?

3.1.2 Data Types

Data type refers to the nature of the data: numerical or non-numerical and dimensional or non-dimensional. In this dissertation, numerical and dimensional data is referred to as quantitative data, and non-numerical and non-dimensional data as qualitative data [10]. The second question is: Is there any relationship between the variety of data and BI success?

3.1.3 Interaction with Other Systems

User interaction integration provides a single personalized interface to the user, and business process integration provides a unified view of the organization's business processes [11]. The third question is: Is there any relationship between interaction of data across systems and BI success?

3.1.4 User Access

Organizations may need to employ different BI tools from different vendors because different groups of users have different reporting and analysis needs as well as different information needs [12]. The fourth question is: Is there any relationship between access to data and BI success?

3.1.5 Data Reliability

Organizations make critical decisions based on the data they collect every day, so it is vital for them to have accurate and reliable data [13]. The fifth question is: Is there any relationship between reliability of data and BI success?

3.2 Organizational BI Capabilities

3.2.1 Flexibility

A BI needs to be flexible in order to be effective. Flexibility can be defined as the capability of a BI to accommodate a certain amount of variation regarding the requirements of the supported business process [14]. The sixth question is: Is there any relationship between flexibility and BI success?

3.2.2 Intuition Involved in Analysis

Intuition, in the context of analysis, can be described as rapid decision making with a low level of cognitive control and high confidence in the recommendation [15]. The seventh question is: Is there any relationship between intuition and BI success?

3.2.3 Risk Level

Risk can be defined as making decisions when all the facts are not known. Some organizations use BI to minimize uncertainty and make better decisions [16]. The eighth question is: Is there any relationship between risk level and BI success?

4 Decision Environment

The decision environment can be defined as the totality of physical and social factors that are taken directly into consideration in the decision-making behavior of individuals in the organization [14]. This definition considers both internal and external factors. Internal factors include people, functional units and organizational factors. External factors include customers, suppliers, competitors, sociopolitical issues and technological issues. The information processing needs of the decision maker are also a part of the decision environment, provided that decision making involves processing and applying the information gathered. Because appropriate information depends on the characteristics of the decision-making context, it is hard to separate the information processing needs from decision making. This indicates that information processing needs are also a part of the decision environment. They are topics of interest in research and have been discussed from both technical and managerial perspectives [12].


BI capabilities and Decision Environment    Correlation coefficient    Sig      Deg. of freedom
Data Sources                                 0.235                     0.023    91
Data Types                                   0.038                     0.715    91
Interaction with Other Systems              -0.147                     0.159    91
User Access                                  0.190                     0.068    91
Data Reliability                             0.129                     0.035    91
Flexibility                                  0.209                     0.045    91
Intuition Involved in Analysis               0.082                     0.435    91
Risk Level                                   0.293                     0.004    91
Types of decisions                           0.058                     0.290    95
The information processing needs            -0.017                     0.435    95

Table 1: Relationship between BI capabilities and BI success
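The study reports correlation coefficients with significance values and degrees of freedom. The paper does not name the exact coefficient; assuming Pearson's correlation on two aligned vectors of questionnaire scores, each row of Table 1 could be computed with a sketch like the following (function and variable names are illustrative only).

from scipy import stats

def bi_relationship(capability_scores, success_scores):
    # Pearson correlation and its two-sided p-value (the "Sig" column).
    r, p = stats.pearsonr(capability_scores, success_scores)
    dof = len(capability_scores) - 2   # degrees of freedom of the t-test on r
    return r, p, dof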


Decision making is a process which is implemented by one person or a group to find a solution for a present or probable problem. It depends on the person who makes the decision and the stages which should be followed [17]. The ninth and tenth questions are: Is there any relationship between the types of decisions made and BI success? And: Is there any relationship between the requirements of data and BI success?

5 The relationship between capabilities and decision making environment in Iran

To answer the questions above, a descriptive correlational study was conducted. The instrument of the study was the Meernik questionnaire, comprising 70 questions. In order to test validity, specialists' opinions were considered and some improvements were made. To determine reliability, Cronbach's coefficient alpha was used; the related figure for BI success was 0.896 and for technological-organizational BI was 0.763, which are quite acceptable for the study. The approved questionnaire was put on the internet to be answered by IT producers and users. 95 people answered the questionnaire, of whom 84.2% (80) were IT employees, 9.5% (9) were in marketing and 6.4% (6) were in other sections. The education level showed 66.3% with a BA, 28.4% with a master's degree and 5.3% with a PhD. The statistical approach of the study is descriptive statistics for data description and inferential statistics for applying the results; the statistical methods used are the correlation coefficient and the t-test. Table 1 shows the relationship between BI capabilities and BI success. As shown in the table, there are meaningful relationships between data sources, user access, flexibility and risk level and BI success at the 5% level, while for type of data, reliability of information, interaction with other systems, intuition, type of decision and information processing requirements there are no meaningful relationships at the 5% level.

Discussion

The findings show that this study's results are compatible with Meernik's, in which all parts of organizational capability contributed to BI success. However, in this study some of them did not show any meaningful relationship. This can be explained by a lack of familiarity with organizational capabilities and BI technology in Iran. Limitations of the study include the nature of the questionnaire, which can be affected by the mood of those answering. Another drawback was the lack of necessary cooperation from organization management. A further limitation concerns the decision-making environment used in Meernik's study; in this study it was treated as an independent variable, as the limited number of participants did not allow otherwise.

References

[1] R.L. Daft and N.B. Macintosh, A tentative exploration into the amount and equivocality of information processing in organizational work units, Administrative Science Quarterly 26 (2010), no. 2, 207-224.
[2] W.W. Chin, Frequently asked questions: partial least squares & PLS graph, http://disc-nt.cba.uh.edu/chin/plsfaq/multigroup.htm.
[3] B. Azvine, Z. Cui, B. Majeed, and M. Spott, Operational risk management with real-time business intelligence, BT Technology Journal 25 (2007), no. 1, 154-167.


[4] Asset, a quarterly newsletter from Arcil, The power of business intelligence 1 (2008).
[5] M.D. Solomon, Ensuring a successful data warehouse initiative, Information Systems Management 22 (2005), no. 1, 26-36.
[6] V. Pirttimaki, A. Lonnqvist, and A. Karjaluoto, Measurement of business intelligence in a Finnish telecommunications company, Electronic Journal of Knowledge Management 4 (2007), no. 1, 83-90.
[7] M.J. Tarokh, Strategy intelligence (BI, CI and KM), 2010.
[8] V. Oltra, Knowledge management effectiveness factors: The role of HRM, Journal of Knowledge Management 9 (2005), no. 4, 70-86.
[9] S. Williams and N. Williams, The profit impact of business intelligence, San Francisco, CA: Morgan Kaufmann, 2007.
[10] B. Hostmann, G. Herschel, and N. Rayner, The evolution of business intelligence: The four worlds, Gartner database (2007).
[11] J. Srivastava and R. Cooley, Web business intelligence: Mining the web for actionable knowledge, INFORMS Journal on Computing 15 (2003), no. 2, 191-207.
[12] C. White, The next generation of business intelligence: Operational BI, Information Management Magazine 1 (2005).
[13] W. Eckerson, Smart companies in the 21st century: The secrets of creating successful business intelligence solutions, TDWI (The Data Warehousing Institute) Report Series, 1-35, 290 (2003).
[14] S. Damianakis, The ins and outs of imperfect data, DM Direct 2 (2008).
[15] J. Gebauer and F. Schober, Information system flexibility and the cost efficiency of business processes, Journal of the Association for Information Systems 7 (2006), no. 3, 122-145.
[16] M.L. Gonzales and L.E. Sucar, What's your BI environment IQ?, DM Review Magazine (2005).
[17] L. Fink and S. Neumann, Gaining agility through IT personnel capabilities: The mediating role of IT infrastructure capabilities, Journal of the Association for Information Systems, Vol. 8, 2007.

Computation in Logic and Logic in Computation


Saeed Salehi
Department of Mathematical Sciences, University of Tabriz, 29 Bahman Blvd., 5166617766 Tabriz, Iran
School of Mathematics, Institute for Research in Fundamental Sciences (IPM), 193955746 Tehran, Iran
http://saeedsalehi.ir/

root@saeedsalehi.ir

salehipour@tabrizu.ac.ir

Abstract: The theory of addition in the domains of natural (N), integer (Z), rational (Q), real (R) and complex (C) numbers is decidable; so is the theory of multiplication in all those domains. By Gödel's Incompleteness Theorem the theory of addition and multiplication is undecidable in the domains of N, Z and Q, though Tarski proved that this theory is decidable in the domains of R and C. The theory of multiplication and order ⟨·, ≤⟩ behaves differently in the above mentioned domains of numbers. By a theorem of Robinson, addition is definable by multiplication and order in the domain of natural numbers; thus the theory ⟨N, ·, ≤⟩ is undecidable. By a classical theorem in mathematical logic, addition is not definable in terms of multiplication and order in R. In this paper, we extend Robinson's theorem to the domain of integers (Z) by showing the definability of addition in ⟨Z, ·, ≤⟩; this implies that ⟨Z, ·, ≤⟩ is undecidable. We also show the decidability of ⟨Q, ·, ≤⟩ by the method of quantifier elimination. Whence, addition is not definable in ⟨Q, ·, ≤⟩.

Keywords: Decidability; First-Order Logic; Gödel's Incompleteness Theorems; Church's Theorem; Presburger Arithmetic; Skolem Arithmetic; Quantifier Elimination.

Introduction

The question of the decidability of logical inference triggered the beginning of computer science. Propositional Logic is decidable, since truth tables provide a finite semantics for it. Aristotle's Syllogism, or in modern terminology the first-order logic of unary predicates, is decidable, since it has the finite model property. The notion of a Turing Machine was a successful outcome of the struggle to settle the question of the decidability of full First-Order Logic. It is now known that first-order logic is undecidable if it has a binary relation symbol or a binary function symbol ([1]). The additive theory of natural numbers ⟨N, +⟩ was shown to be decidable by Presburger in 1929 (and by Skolem in 1930; see [4]). The additive theories of integer, rational, real and complex numbers (⟨Z, +⟩, ⟨Q, +⟩, ⟨R, +⟩ and ⟨C, +⟩) are decidable as well. The multiplicative theory of the natural numbers ⟨N, ·⟩ was also shown to be decidable by Skolem in 1930; the theories ⟨Z, ·⟩, ⟨Q, ·⟩, ⟨R, ·⟩ and ⟨C, ·⟩ are also decidable.

It was then expected that the theory of addition and multiplication of natural numbers would be decidable too, confirming Hilbert's Program. But the world was shocked in 1931 by Gödel's Incompleteness Theorem, which showed that the theory ⟨N, +, ·⟩ is undecidable (see [4]). The theory ⟨Z, +, ·⟩ is undecidable too, since N is definable in this structure: by Lagrange's Theorem, ∀k ∈ N ∃a, b, c, d ∈ Z (k = a² + b² + c² + d²). So is the theory ⟨Q, +, ·⟩, by Robinson's result [2] which shows that N is definable in this structure too. However, Tarski showed that the theories ⟨R, +, ·⟩ and ⟨C, +, ·⟩ are decidable ([3]). It is worth mentioning that the order relation ≤ is definable by means of addition and multiplication in all the above domains of numbers. For example, the formulas ∃z(z + x = y) and ∃z(z² + x = y) define the relation x ≤ y in the structures ⟨N, +, ·⟩ and ⟨R, +, ·⟩ respectively. The theory of addition and order ⟨+, ≤⟩ is somewhat weak in all the above number domains, since it cannot define multiplication. The theory of multiplication and order ⟨·, ≤⟩ has not been extensively studied; one reason is that addition is not

* This paper is dedicated to Alan Turing, to commemorate the Turing Centenary Year 2012, his 100th birth year.


definable in ⟨R, ·, ≤⟩, since the bijection x ↦ x³ of R preserves multiplication and order but does not preserve addition. Also it is known that addition is definable in ⟨N, ·, ≤⟩ by Tarski's identity ([2]):

x + y = z  ⟺  [x = y = z = 0]  ∨  [z ≠ 0 ∧ S(z·x) · S(z·y) = S(z·z·S(x·y))],

where S(u) is the successor of u, which is definable by the order relation: S(u) = v ⟺ ∀w[u < w ↔ v ≤ w]. The symbol u < v is a shorthand for u ≤ v ∧ u ≠ v.

The language ⟨·, ≤⟩ does not allow quantifier elimination for ⟨Q, ·, ≤⟩, since e.g. the formula ∃y[x = y²] is not equivalent to a quantifier-free formula. So, we restrict our attention to Q⁺ = {r ∈ Q | r > 0} and extend the language to L = ⟨0, 1, ·, ⁻¹, <, R₂, R₃, ...⟩, where Rₙ is interpreted as "being the n-th power of a rational"; in other words Rₙ(x) ≡ ∃y[x = yⁿ].

The question of the decidability or undecidability of the structures ⟨Z, ·, ≤⟩ and ⟨Q, ·, ≤⟩ is missing in the literature. In this paper, by modifying Tarski's identity we show that addition is definable in the structure ⟨Z, ·, ≤⟩; this implies the undecidability of ⟨Z, ·, ≤⟩. On the contrary, addition is not definable in ⟨Q, ·, ≤⟩; here we show a stronger result by the method of quantifier elimination: the theory ⟨Q, ·, ≤⟩ is decidable. Whence, by Robinson's above-mentioned result [2], addition cannot be defined in this structure. An interesting outlook of our results is that although ⟨+, ·⟩ puts the domains N, Z and Q on the undecidable side and the domains R and C on the decidable side, the language ⟨·, ≤⟩ puts the domains N and Z on the undecidable side, but Q and R on the decidable side.

We note that the main theorem below implies that the structure ⟨Q, L⟩ admits quantifier elimination as well. It is enough to distinguish the signs: for any x, either x < 0 or x = 0 or x > 0; so eliminating the quantifiers in each case will eliminate all of the quantifiers. Let us also note that the quantifier-free formulas of L are decidable: for any given rational number r and any natural n one can decide whether r is an n-th power of another rational number or not. Thus, quantifier elimination in ⟨Q, L⟩ implies the decidability of the structure ⟨Q, L⟩, and hence of ⟨Q, ·, ≤⟩.
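For concreteness, the decision procedure for the atoms Rₙ(r) mentioned above can be sketched as follows (not from the paper): a positive rational a/b in lowest terms is an n-th power of a rational exactly when every prime exponent in a and in b is divisible by n.

from fractions import Fraction

def prime_exponents(m):
    # Exponents of the prime factorisation of a positive integer m.
    exps, d = {}, 2
    while d * d <= m:
        while m % d == 0:
            exps[d] = exps.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        exps[m] = exps.get(m, 0) + 1
    return exps

def is_nth_power_of_rational(r, n):
    r = Fraction(r)
    if r <= 0:
        return r == 0                     # the paper works in Q+; 0 = 0^n is the trivial case
    num = prime_exponents(r.numerator)
    den = prime_exponents(r.denominator)
    return all(e % n == 0 for e in list(num.values()) + list(den.values()))

# Example: is_nth_power_of_rational(Fraction(8, 27), 3) evaluates to True.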

Multiplication and Order in Z

Tarski's identity S(z·x) · S(z·y) = S(z·z·S(x·y)) can define the formula x + y = z in Z when x + y ≠ 0. The case x + y = 0 was easily settled in the natural numbers: for any x, y ∈ N we have x + y = 0 ⟺ x = y = 0. But this does not hold in Z, and so we have to treat this case differently. Our trick is to define the relation x = −y in terms of multiplication and successor (which is definable by order): x = −y ⟺ S(x) · S(y) = S(x·y). Thus, the following formula defines addition in terms of multiplication and order in Z:

x + y = z  ⟺  [z = 0 ∧ S(x) · S(y) = S(x·y)]  ∨  [z ≠ 0 ∧ S(z·x) · S(z·y) = S(z·z·S(x·y))].

So the theories ⟨Z, ·, ≤⟩ and ⟨Z, +, ·⟩ are inter-definable, and hence ⟨Z, ·, ≤⟩ is undecidable.
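As a quick numerical sanity check (not part of the paper), the defining formula above can be verified by brute force over a small window of integers, with S(u) = u + 1:

def S(u):
    return u + 1

def defined_sum_is(x, y, z):
    # The right-hand side of the defining formula for x + y = z in Z.
    if z == 0:
        return S(x) * S(y) == S(x * y)
    return S(z * x) * S(z * y) == S(z * z * S(x * y))

assert all(defined_sum_is(x, y, z) == (x + y == z)
           for x in range(-15, 16) for y in range(-15, 16) for z in range(-15, 16))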

Multiplication and Order in Q

Unlike the case of Z, addition is not definable in the structure ⟨Q, ·, ≤⟩. In fact, the theory of this structure is decidable. To show this we use the method of quantifier elimination; as noted above, the language ⟨·, ≤⟩ itself does not allow quantifier elimination, so we work with the extended language L over Q⁺.

Theorem. The structure ⟨Q⁺, L⟩ admits quantifier elimination.

The rest of the paper is devoted to proving the main theorem. The folklore technique of quantifier elimination starts from characterizing the terms and atomic formulas, eliminating negations, implications and universal quantifiers, and then removing disjunctions from the scopes of existential quantifiers, which leaves as the final case an existential quantifier applied to a conjunction of atomic (or negated atomic) formulas. Removing this one existential quantifier implies the ability to eliminate all the other quantifiers by induction. Let us summarize the first steps.

For a variable x and parameter a, all L-terms are equal to x^k · a^l for some k, l ∈ Z. Atomic L-formulas are of the form u = v, u < v or R_n(u) for terms u, v and n ≥ 2. Negated atomic L-formulas are thus u ≠ v, ¬(u < v) and ¬R_n(u); the formulas u ≠ v and ¬(u < v) are equivalent to u < v ∨ v < u and u = v ∨ v < u respectively. By De Morgan's laws we can assume that negation appears only in front of atomic formulas of the form R_n(u), and by the equivalences A → B ≡ ¬A ∨ B and ∀x φ ≡ ¬∃x ¬φ we can assume that the implication symbol and the universal quantifier do not appear in the formula whose quantifiers are to be eliminated. Finally, the equivalence ∃x(φ ∨ ψ) ≡ ∃x φ ∨ ∃x ψ leaves us with elementary formulas of the form ∃x(∧_i η_i), where each η_i is of the form (x^α = v), (r < x^β), (x^γ < s), R_n(t · x^δ) or ¬R_m(u · x^ε) for some α, β, γ, δ, ε ∈ N and L-terms r, s, t, u, v. Whence, it suffices to show that the L-formula

∃x [ ∧_h (x^{α_h} = v_h) ∧ ∧_i (r_i < x^{β_i}) ∧ ∧_j (x^{γ_j} < s_j) ∧ ∧_k R_{n_k}(t_k · x^{δ_k}) ∧ ∧_l ¬R_{m_l}(u_l · x^{ε_l}) ]

is equivalent to another L-formula in which x (and so ∃x) does not appear. This will finish the proof.


Here come the next steps of the quantifier elimination. The powers of x can be unified: let p be the least common multiple of the α_h's, β_i's, γ_j's, δ_k's and ε_l's. From the ⟨Q⁺, L⟩-equivalences a = b ⟺ a^q = b^q, a < b ⟺ a^q < b^q and R_n(a) ⟺ R_{nq}(a^q), we infer that the above formula can be re-written equivalently as

∃x [ ∧_h (x^p = v_h) ∧ ∧_i (r_i < x^p) ∧ ∧_j (x^p < s_j) ∧ ∧_k R_{n_k}(t_k · x^p) ∧ ∧_l ¬R_{m_l}(u_l · x^p) ]

for possibly new v_h's, r_i's, s_j's, n_k's, t_k's, m_l's and u_l's. This formula is in turn equivalent to

∃y [ ∧_h (y = v_h) ∧ ∧_i (r_i < y) ∧ ∧_j (y < s_j) ∧ ∧_k R_{n_k}(t_k · y) ∧ ∧_l ¬R_{m_l}(u_l · y) ∧ R_p(y) ]

(with the substitution y = x^p). Thus it suffices to show that the formula

∃x [ ∧_h (x = v_h) ∧ ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x) ∧ ∧_l ¬R_{m_l}(u_l · x) ]

is equivalent to a quantifier-free formula. If the conjunction ∧_h (x = v_h) is not empty, then the above formula is equivalent to the quantifier-free formula

∧_h (v_0 = v_h) ∧ ∧_i (r_i < v_0) ∧ ∧_j (v_0 < s_j) ∧ ∧_k R_{n_k}(t_k · v_0) ∧ ∧_l ¬R_{m_l}(u_l · v_0)

for some term v_0. So, let us assume that the conjunction ∧_h (x = v_h) is empty, and thus we are to eliminate the quantifier of the formula ∃x [ ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x) ∧ ∧_l ¬R_{m_l}(u_l · x) ].

The formula ∃x [ ∧_i (r_i < x) ∧ ∧_j (x < s_j) ] is ⟨Q⁺, L⟩-equivalent to the quantifier-free formula ∧_{i,j} (r_i < s_j) (that is, max_i{r_i} < min_j{s_j}), since ⟨Q⁺, <⟩ is dense.

For the formula ∃x ∧_k R_{n_k}(t_k · x), let p be a prime number, and let t'_k be the greatest number such that p^{t'_k} divides t_k; similarly x' is the greatest number such that p^{x'} divides x. Then R_{n_k}(t_k · x) is equivalent to ∀p [ t'_k + x' ≡ 0 (mod n_k) ]. By a generalized form of the Chinese Remainder Theorem ([4]) the existence of such an x' is equivalent to ∧_{k ≠ κ} [ t'_k ≡ t'_κ (mod (n_k, n_κ)) ]; here (a, b) is the greatest common divisor of a and b. That is equivalent to ∧_{k ≠ κ} R_{(n_k, n_κ)}(t_k · t_κ⁻¹). We further note that in case ∧_{k ≠ κ} t'_k ≡ t'_κ (mod (n_k, n_κ)) there are infinitely many solutions of ∧_k [ t'_k + x' ≡ 0 (mod n_k) ], which are of the form x' = N·y' − Σ_k λ_k t'_k for some fixed integers N and λ_k's, with y' arbitrary. In fact N is the least common multiple of the n_k's and the λ_k's satisfy Σ_k λ_k N/n_k = 1; the existence of such λ_k's follows from the fact that the greatest common divisor of the N/n_k's is 1. Moreover, the solution x' is unique up to the modulus N. So, if there exists some x ∈ Q⁺ which satisfies ∧_k R_{n_k}(t_k · x) for some t_k ∈ Q⁺, then it must be of the form x = λ^N · Π_k (t_k)^{−λ_k} for some (arbitrary) λ ∈ Q⁺.

Thus, the formula ∃x [ ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x) ] is equivalent to the quantifier-free formula ∧_{i,j} (r_i < s_j) ∧ ∧_{k ≠ κ} R_{(n_k, n_κ)}(t_k · t_κ⁻¹), since the solution x = λ^N · Π_k (t_k)^{−λ_k} of ∧_k R_{n_k}(t_k · x) can be chosen to satisfy max_i{r_i} < x < min_j{s_j}: choose a rational number λ ∈ Q⁺ between the positive real numbers ρ = ( max_i{r_i} · Π_k (t_k)^{λ_k} )^{1/N} and σ = ( min_j{s_j} · Π_k (t_k)^{λ_k} )^{1/N}. Since the set Q is dense in R, there exists such a rational number λ. Then x = λ^N · Π_k (t_k)^{−λ_k} is the desired solution.

Finally, we show that the formula

∃x [ ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x) ∧ ∧_l ¬R_{m_l}(u_l · x) ]

is equivalent to the following quantifier-free formula:

∧_{i,j} (r_i < s_j) ∧ ∧_{k ≠ κ} R_{(n_k, n_κ)}(t_k · t_κ⁻¹) ∧ ∧_{l : m_l | N} ¬R_{m_l}(u_l · t),

where N is the least common multiple of the n_k's and t = Π_k (t_k)^{−λ_k}, in which the λ_k's satisfy Σ_k λ_k N/n_k = 1.

If for some x ∈ Q⁺ the formula ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x) ∧ ∧_l ¬R_{m_l}(u_l · x) holds, then clearly ∧_{i,j} (r_i < s_j) is true, and it can easily be seen that we also have ∧_{k ≠ κ} R_{(n_k, n_κ)}(t_k · t_κ⁻¹). Assume m_l | N; we show that ¬R_{m_l}(u_l · t). Note that there exists some λ such that x = λ^N · t. Now if R_{m_l}(u_l · t), then u_l · x = λ^N · u_l · t, and so by m_l | N we have R_{m_l}(u_l · x), which contradicts the assumption ∧_l ¬R_{m_l}(u_l · x). Whence ∧_{l : m_l | N} ¬R_{m_l}(u_l · t) holds.

Conversely, if we have ∧_{i,j} (r_i < s_j) ∧ ∧_{k ≠ κ} R_{(n_k, n_κ)}(t_k · t_κ⁻¹) ∧ ∧_{l : m_l | N} ¬R_{m_l}(u_l · t), then by the above arguments there exist positive real numbers ρ < σ such that for any rational λ with ρ < λ < σ the number z = λ^N · t satisfies the formula ∧_i (r_i < z) ∧ ∧_j (z < s_j) ∧ ∧_k R_{n_k}(t_k · z), where N and t are as above. Let P be a sufficiently large prime number which does not divide any of the numerators or denominators of (the reduced fractions of) the t_k's or u_l's. Let M = Π_l m_l and let ζ be a positive rational number such that (ρ/P)^{1/M} < ζ < (σ/P)^{1/M}. We show that x = P^N · ζ^{N·M} · t satisfies ∧_l ¬R_{m_l}(u_l · x). Note that since ρ < P · ζ^M < σ we already have ∧_i (r_i < x) ∧ ∧_j (x < s_j) ∧ ∧_k R_{n_k}(t_k · x). For showing ¬R_{m_l}(u_l · x) we distinguish two cases. (1) If m_l | N, then R_{m_l}(u_l · x) ≡ R_{m_l}(u_l · P^N · ζ^{N·M} · t) implies R_{m_l}(u_l · t), contradicting the assumption ∧_{l : m_l | N} ¬R_{m_l}(u_l · t); thus ¬R_{m_l}(u_l · x). (2) If ¬(m_l | N), then R_{m_l}(u_l · x), or equivalently R_{m_l}(u_l · P^N · ζ^{N·M} · t), implies R_{m_l}(u_l · t · P^N), since m_l | M. Since P does not divide any of the numerators or denominators of (the reduced fractions of) the u_l's or t (the t_k's), we must have R_{m_l}(P^N), which holds if and only if m_l | N; this contradicts our assumption ¬(m_l | N). Thus ¬R_{m_l}(u_l · x). Whence, all in all, we have shown that ∧_l ¬R_{m_l}(u_l · x) holds. Q.E.D.

Acknowledgements. This research was partially supported by grant No. 90030053 of the Institute for Research in Fundamental Sciences (IPM), Tehran.

References

[1] E. Börger, E. Grädel, and Y. Gurevich, The Classical Decision Problem, Springer-Verlag, Berlin, 2001.
[2] J. Robinson, Definability and Decision Problems in Arithmetic, The Journal of Symbolic Logic 14 (1949), 98-114.
[3] D. Marker, Model Theory: An Introduction, Springer-Verlag, Berlin, 2002.
[4] C. Smoryński, Logical Number Theory I: An Introduction, Springer-Verlag, Berlin, 1991.


Rating System for Software based on International


Standard Set 25000 ISO/IEC
Hassan Alizadeh
Information Technology and Digital Media Developments Centre
Ministry of Culture, Iran
alizadeh@farhang.gov.ir

Hossein Afsari
Information Technology and Digital Media Developments Centre
Ministry of Culture, Iran
Hosein.afsari@yahoo.com

Bahram Sadeghi Bigham

Department of Computer and Information Sciences


Institute for Advanced Studies in Basic Sciences
Zanjan, Iran
b sadeghi b@iasbs.ac.ir

Abstract: In software rating, it is necessary to have a measurement reference model for the quality evaluation of software packages; an example is the standard 25030, which is based on the determination of evaluation purposes and stakeholders' needs. In this study, after investigating the implicit and explicit needs of stakeholders, as well as the needs of users and policy makers, a reference model is offered whose evaluation indices, defined in four levels, are weighted with the AHP method. The new model is in conformity with the native needs of Iran. It is offered as a first version, and the weighting has been done for the first and second layers.

Keywords: Software Quality Evaluation; Software Rating; Content Based Software Packages; Software Evaluation Standard

Introduction

Rating in-sale software packages based on users' needs is one of the effective methods that can direct the supply and sale market of these works toward better works with more agreement with users' needs. To rate content software packages, it is necessary to perform a product quality evaluation, and this requires designing a quality evaluation model for the software product. One of the basic standards for software evaluation is the international standard set ISO/IEC 25000 [1-4]. The ISO 25000 standard set has introduced the 25010 quality model as an offered sample for a quality reference model. The standard quality model 25010 is based on the evaluation of software functions against stakeholder requirements [5, 6]. Because of its view of functional tool softwares and systems, it is not possible to evaluate content-based software using it, since the content cannot be evaluated by this model. In this standard set, it is stated that if the method offered in the 25030 set is followed in designing the measurement reference model, the resulting quality reference model will be acceptable. In this study, by extracting the general model from the above standards, a quality measurement model has been offered to evaluate content-based software products in ten steps.

* Corresponding author: P. O. Box 45195-1159, T: (+98) 241 415-5063


Requirements

In the first stage, the system requirements have been determined. This stage is performed in five steps. Based on this method, firstly the evaluation purposes have been determined; then, all stakeholders of these software packages have been identified. The users' needs, as the basic stakeholders, were assessed and their needs determined. Fig. 1 shows the relation between stakeholders' requirements in the system.

Figure 1: Analysis of stakeholders' requirements

In the first step, the evaluation purpose is defined as follows: software quality evaluation with qualitative requirements that represent user needs. In the second step, the type of evaluation product is determined; it is related to the evaluation purpose. The basic step of the evaluation process is to determine the products, and in this model the media softwares are considered as the evaluation product. Media softwares aim to increase the users' scientific and cultural awareness by offering scientific, cultural and artistic content, or to entertain them. They influence users culturally, mentally and psychologically, both directly and indirectly. In the third step, the system stakeholders are determined. A stakeholder is a person who has a right, claim or share in the system and its characteristics to meet his needs and expectations. The stakeholders have different needs and expectations that can be classified into three general branches:

Software producers: This group either distributes certain content in order to influence its users culturally, mentally and psychologically, or entertains its audiences with a set of contents and functions which the users have a liking for. Some of them try to make a tool for a certain function or to offer certain services for users' needs.

End users: This group contains the software audiences; they use the software to see certain content or to meet their functional needs.

Policy makers and supervisors: This group contains governments and supervising institutions that inspect digital content publication or, by assessing the digital publication sphere, investigate the situation of people, society and digital publishers. Their results are used in cultural, social, political or security contexts.

Since it is necessary to determine a view based on software qualitative requirements, it must be mentioned that this study has been carried out from the view of users and policy makers. After determining the stakeholders, in the fourth step their needs have been determined. These include society's needs and expectations, limits determined on the part of the client, and needs of the final users. The user requirements have been determined in two classes: implicit and explicit needs.

2.1 Explicit needs

The explicit needs are those clearly expressed by users of these softwares. Each end user follows two purposes and two basic elements in using software; in other words, two basic needs lead them to buy software. These needs are explicit: users purchase software to meet them. They are: content, and performing a certain function to meet user requirements. Some of the current functions in media softwares include: media distribution, training, research (offering research content), sound and video processing, and computational functions.


2.2 Implicit needs

Implicit needs are non-expressed but real needs: needs that are not stated explicitly but are hidden because they are taken for granted. Based on field research among media software producers and the judgement of experts, the following implicit needs were determined for every software:
1. Software packaging: The user receives the media software product as an in-sale package, so the software packaging is considered one of the user needs. In fact, every media software is considered a commercial off-the-shelf software product.
2. Internal consistency and installation: Every media software is, by nature, a software, so two basic factors must be considered in using it. It must enjoy internal efficiency and consistency. In other words, it must have the characteristics of reliability, efficiency, maintainability and security, without any failure or fault. It must install, run, activate and uninstall without fault, in addition to having an appropriate software type and agreement with its audience.
3. User interface: This is the observable and touchable part of the software with which the user interacts directly. It includes the information channels that provide communication between user and computer. The user interface in media softwares is generally one of two types: a choice interface or a graphical user interface. The user interface is another implicit user need.
4. Support: Since the majority of media software users are members of the public, support is one of the requirements of users.

After determining the stakeholders' needs, in the fifth step the system requirements have been determined. A system often includes different elements, each of which has certain specifications and responds to different purposes in the system. For the system to function, the system requirements must be transformed into requirements on the different elements of the system. The result of this requirement-definition process is called the stakeholders' requirements. In this step, for each of the elements defined in the previous step, extracted from user needs, the quality requirements have been extracted.

Model and criteria

In this stage, the reference model and its criteria have been designed and determined. Two characteristics have been determined based on the stated needs: content and function. In regard to the implied needs, four other characteristics were determined: software packaging, internal consistency, user interface and support. The software quality model is completed by determining criteria (attributes) for each of the six previous characteristics (in three layers). In the sixth step, the criteria (attributes) are determined. An attribute is an inherent characteristic of an entity that can be determined quantitatively or qualitatively by humans or automatic tools. Attributes are divided into two groups: permanent attributes, which exist in the nature of things, and acquired attributes of a system, process or product (such as product price or product owner). An acquired attribute is not an inherent qualitative attribute of a system, process or product. Quantity determination and quality evaluation of a software product is done by criteria and is related to sufficient quality attributes. In the seventh step, for each of the attributes the quality characteristics and criteria are determined in three layers. By using the criteria determined in the previous step, the quality model has been designed in the second stage. The quality of a system is the result of the quality of its constituents and their interaction; software quality includes the software product's potential to meet implicit and explicit needs in certain conditions. A quality model is a determined set of attributes and the relationships between them that provides a framework to determine quality needs and evaluation. The quality model is used as a framework to ensure all quality aspects are considered with regard to both the internal and the user-view aspects. In regard to the extracted requirements of the past step, the following quality model has been extracted, and for every basic quality attribute the secondary attributes have been defined. In this model, two aspects of quality are defined:

- Internal software quality: it contains software package, internal consistency, user interface, content, function and support.
- Quality in use: the users' opinions are obtained about the software components.


The defined quality attributes cover all quality aspects for the majority of media software products, so they can be used as an inventory to assure the complete coverage of quality. The next step in designing the quality model is to determine the weight of each characteristic and criterion. In the characteristic and criteria weighing process, one basic problem is that there are some major differences between different media softwares in regard to their weights (and importance). For example, in children's software, because of the necessity of apparent attractiveness, the experts consider the user interface as the most important aspect, while in an encyclopaedia software the basic characteristic is content.

So in next step (the eighth step), all of software


packages have been determined. Thus by using the
field and study work in regard to the expert opinion,
the following general species were determined for content based softwares: Encyclopaedia softwares, Training softwares, Children softwares, General softwares
and Functional softwares.

form it, Expert choice software has been used for


following cases:
Tenth step- Determine rating level.
Three following processes are used to determine rating levels: (Figure 3)
1 Software quality evaluation based on quality
model on 190 software is performed by experts
and the results are obtained.

Weighting and rating

2 These works transferred to some after people


who are familiar to evaluation softwares completely without any past knowledge about them.
They are asked to place these works in following
groups.

The last stage of model determination is to weight


and rate levels. AHP method is used to weight. The
weights of characteristics and criteria determined (for
each of softwares). Finally to determine rating levels
Tenth step - determining the rating levels. The following three processes are used to determine the rating levels (Figure 3):

1 Software quality evaluation based on the quality model is performed by experts on 190 software products and the results are obtained.

2 These products are then given to other people who are thoroughly familiar with software evaluation but have no prior knowledge of the products. They are asked to place the products into the following groups.

3 Finally, the classification list from the second step is compared with the numbers from the first step; the maximum and minimum numbers are determined and the numbers are normalized.

Figure 3: Range of points earned for the stars

The above rating was performed based on user requirements. Two other stars, named exportable and certain innovation and intelligence, are complementary stars whose purpose is to improve the quality level of the products.

Conclusion and future works

In this paper, a quality reference evaluation is offered from the two following views, based on the method given in standard 25030: internal quality of software and quality in use. A standard method derived from standard 25030 is used to design the model. The model is also derived from the constituents of media software with regard to the cultural needs of media software users. Moreover, a scientific method is offered to measure content in the measurement reference model, and the defined quality characteristics cover all quality aspects for most media software, so the model can be used as a checklist to assure complete coverage of quality. In future research, the weighting of the third and fourth functions can be carried out in order to reduce subjective judgement and to consider more quantitative indices.
Acknowledgement: The authors wish to acknowledge Mr. Meisam Abdoli, Meisam ZargarVafa,
Ali Javedani and Madjid Paksima, whose help aided in
the completion of this study.

References

[1] B. Czarnacka-Chrobot, The ISO/IEC Standards for the Software Processes and Products Measurement, SoMeT (2009), 187-200.

[2] B. Czarnacka-Chrobot, Analysis of the Functional Size Measurement Methods Usage by Polish Business Software Systems Providers, IWSM/Mensura (2009), 17-34.

[3] A. Abran and L. Buglione, The Software Measurement Body of Knowledge, Proceedings of the 1st Software Measurement European Forum (SMEF), Rome (2004).

[4] M. Kasunic, The State of Software Measurement Practice: Results of 2006 Survey, Software Engineering Institute, Carnegie Mellon University, Pittsburgh (2006).

[5] CMMI Product Team, CMMI for Development, Version 1.2, Software Engineering Institute, Carnegie Mellon University, Pittsburgh (2006).

[6] ISO/IEC 90003:2004 Software Engineering, Guidelines for the application of ISO 9001:2000 to computer software, ISO, Geneva (2004).


TOMSAGA: TOolbox for Multiple Sequence Alignment using Genetic Algorithm
Farshad Bakhshandegan Moghaddam
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
fmoghaddam@iasbs.ac.ir

Mahdi Vasighi
Institute for Advanced Studies in Basic Sciences
Department of Computer and Information Sciences
vasighi@iasbs.ac.ir

Corresponding author: P. O. Box 45195-1159, F: (+98) 241 415-5071, T: (+98) 241 415-5062

Abstract: In this paper we present TOMSAGA (TOolbox for Multiple Sequence Alignment using Genetic Algorithm), a collection of MATLAB routines for multiple sequence alignment. TOMSAGA uses a genetic algorithm to solve the multiple sequence alignment problem. The toolbox routines are programmed in MATLAB 7.0 and are freely available on the Web at http://www.iasbs.ac.ir/vasighi/TOMSAGA. The toolbox functions allow a user to control the genetic algorithm parameters in an easy way and provide a straightforward way to visualize the obtained results.

Keywords: Genetic Algorithm; Multiple Sequence Alignment; Matlab Toolbox

Introduction

Bioinformatics can be defined as applying informatics techniques to biology conceptualized in terms of macromolecules (in the sense of physical chemistry) [1]. In short, bioinformatics leverages techniques borrowed from computer science to solve problems in molecular biology.

Deoxyribonucleic acid (DNA) is a nucleic acid containing the genetic information used in the development and functioning of all known living organisms. DNA molecules consist of two chains of simple units called nucleotides. There are four different types of nucleotides [4], denoted by A (adenine), T (thymine), G (guanine) and C (cytosine). Therefore, DNA molecules can be represented as strings of letters over a relatively small alphabet. The use of sequence data for different purposes has greatly increased in parallel with the improvement of sequencing technology.

The exponential growth in the size of biological databases goes hand in hand with the increasing need for tools to analyse them and extract the valuable information they contain. One of the first steps in making this information manageable is to design tools that identify comparable fragments in DNA sequences. This process is referred to as sequence alignment.

Phylogenetic analyses, the construction of PCR (polymerase chain reaction) primers and the prediction of secondary or tertiary structures can all be carried out by aligning sequences [6]. Because alignment is such a central task, many algorithms for it have already been developed.

Multiple sequence alignment (MSA) is an extension of pairwise sequence alignment [5]. Nowadays, multiple sequence alignment is an important tool in molecular biology and provides key information for sequence analysis. As the name suggests, in multiple sequence alignment we would like to find an optimal alignment for a collection of sequences.

MSA is characterized by high computational complexity. Needleman and Wunsch [7] first used dynamic programming in the comparison of two sequences. This method has also been extended directly to the comparison of three sequences using a three-dimensional matrix [8], reduced by Murata et al. [9] to O(n^3) computational complexity, where n is the length of the longest sequence to be aligned. When the dynamic programming method is used for simultaneous multiple sequence alignment, the computational complexity is O(n^k), where k is the number of sequences. More details can be found in the literature [10, 20].
Stochastic optimization methods such as simulated annealing [11], Gibbs sampling [12] and genetic algorithms (GA) [13, 17] can be used to solve MSA. Simulated annealing is very slow but works well as an alignment improver. Gibbs sampling is very good at finding local multiple alignment blocks with no gaps but does not handle gapped situations well. Isokawa et al. [13] and Wayama et al. [14] applied simple genetic algorithms [18] with bit matrices. Zhang and Wong [16] developed a method combining the techniques of genetic algorithms and pairwise dynamic programming. Notredame et al. [15] used a genetic algorithm for aligning two homologous RNA sequences through their secondary structure.

In this article, we present a toolbox for multiple sequence alignment using GA and explain its features. The rest of the paper is organized as follows: in the next section, the genetic algorithm approach for MSA is briefly reviewed; then the MATLAB modules and their features are introduced; and finally the results obtained for some synthetic example sequences and a real sequence dataset are presented.

2 GA FOR MSA

As an example of genetic algorithms, we use the algorithm introduced by Horng et al. [3], which solves the multiple sequence alignment problem in biology using genetic algorithms. For simplicity and without loss of generality, we avoid some of the mathematical representations in this paper and try to describe them verbally or by showing examples. Figure 1 shows the general structure of a genetic algorithm; more detail about the GA can be found in the cited papers.

Figure 1: Genetic Algorithm Flow Chart

2.1 Chromosome Encoding

A chromosome is defined as a set of numbers of fixed length that represents the gap positions in a sequence (gap-oriented representation). Before starting the GA, we have to adjust the sequences to the same length by adding a different number of gaps to each sequence.

Suppose N is the desired length of the sequences after the addition of gaps. The value of N strongly affects the size of the search space. Choosing a small value of N makes it hard, or sometimes impossible, to find an optimal alignment for less similar sequences; for larger values of N, more time is needed to find the optimal alignment of highly similar sequences. In this paper, N is determined by Eq. 1 [3]:

N = n_max (1 + r_sp)    (1)

where r_sp is the gap ratio and n_max is the length of the longest sequence. As an example, consider the alignment of Figure 2. In this figure n_max = 10 and r_sp is set to 0.2. The chromosome that corresponds to this alignment is shown in the figure.

Figure 2: Example of a Chromosome
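To illustrate the gap-oriented encoding, the short Python sketch below (an illustration only, not part of the MATLAB toolbox) converts between an aligned sequence of length N and the chromosome that stores its gap positions. The sequence used is hypothetical, with n_max = 10 and r_sp = 0.2 so that N = 12 as in the example above.

# Gap-oriented encoding: a chromosome stores only the positions (1-based)
# of the gaps in each padded sequence of fixed length N.
def encode(aligned_seq):
    """Return the sorted gap positions of an aligned (padded) sequence."""
    return [i + 1 for i, c in enumerate(aligned_seq) if c == '-']

def decode(raw_seq, gap_positions, N):
    """Rebuild the aligned sequence of length N from the raw sequence and gaps."""
    gaps = set(gap_positions)
    out, it = [], iter(raw_seq)
    for pos in range(1, N + 1):
        out.append('-' if pos in gaps else next(it))
    return ''.join(out)

# Hypothetical example with n_max = 10 and r_sp = 0.2, so N = 10 * (1 + 0.2) = 12.
aligned = "AC-GT-TACGTA"            # hypothetical aligned sequence of length 12
genes = encode(aligned)             # -> [3, 6]
print(genes, decode("ACGTTACGTA", genes, 12) == aligned)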

2.2 GA Process Flow for MSA

To solve the MSA problem for DNA sequences, the procedure of Algorithm 1 is applied. In Algorithm 1, the symbol |P| represents the size of the population. The first population is generated randomly: each chromosome X is generated by randomly picking m_i unique numbers (m_i < N) and then sorting these numbers in increasing order. The Roulette Wheel method [21] is used for the selection of parent chromosomes to produce offspring. After producing the offspring, there are several methods to select the top |P| chromosomes: all the parents can be replaced with the offspring (rpa), the best 50 percent of the new offspring can be combined with the best 50 percent of the parents (hlf), or the offspring and parents can be merged together and the best ones selected for the next generation (kbr). The evolution is repeated until one of the following termination conditions is satisfied: the number of generations exceeds the maximum value specified by the user (g_max), or a chromosome remains the best individual for a certain number of generations, denoted b_max.

Generate the initial population P
Let n be the size of the population
while the termination condition is not satisfied do
    for i = 1 to n do
        Select two chromosomes x and y from the population based on the Roulette Wheel method.
        P2 = Crossover(x, y)
        Mutation(P2)
    end
    Select the best top |P| chromosomes to replace the original population.
end
Algorithm 1: The Flow of Our Approach
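The overall flow of Algorithm 1 can be sketched as follows in Python. The functions fitness, crossover and mutate are placeholders for the operators described in the following subsections, fitness is assumed to be "higher is better", and only the generation-limit termination condition is shown; this is an illustrative sketch, not the toolbox's MATLAB implementation.

import random

def roulette_select(population, fitness_values):
    """Roulette-wheel selection: pick one chromosome with probability
    proportional to its (non-negative) fitness."""
    total = sum(fitness_values)
    r = random.uniform(0, total)
    acc = 0.0
    for chrom, fit in zip(population, fitness_values):
        acc += fit
        if acc >= r:
            return chrom
    return population[-1]

def run_ga(init_population, fitness, crossover, mutate, g_max=100):
    """Skeleton of Algorithm 1: evolve until the generation limit is reached."""
    population = list(init_population)
    for generation in range(g_max):
        scores = [fitness(c) for c in population]
        offspring = []
        for _ in range(len(population) // 2):
            x = roulette_select(population, scores)
            y = roulette_select(population, scores)
            children = crossover(x, y)          # horizontal or vertical crossover
            offspring.extend(mutate(c) for c in children)
        # "kbr"-style replacement: merge parents and offspring, keep the best |P|.
        merged = population + offspring
        merged.sort(key=fitness, reverse=True)
        population = merged[:len(population)]
    return max(population, key=fitness)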

2.3 Fitness Value

The sum-of-pairs function and the entropy function are used to evaluate the fitness of the generated chromosomes [19]. The SP-score is a very popular scoring scheme. It defines the quality of a multiple alignment as the sum of the scores of all distinct unordered pairs of letters in the columns. Given a set of N aligned sequences, each of length L, in the form of an L x N MSA alignment matrix A, and a substitution matrix (PAM or BLOSUM [22]) that gives the score s(x, y) for aligning two characters x and y, the SP-score of the i-th column of A (denoted m_i), SP(m_i), is calculated using the formula below:

SP(m_i) = sum_{k < l} s(m_i^k, m_i^l)    (2)

where m_i^k is the k-th entry and m_i^l is the l-th entry of the i-th column.

The other measure for evaluating the quality of an alignment (the fitness value of a chromosome) is entropy. The entropy function calculates the entropy of an alignment; alignments with a lower entropy value have better quality. The entropy of an alignment can be calculated by the following steps:

Step 1: Compute the frequency of occurrence of each letter in each column of the multiple alignment.

Step 2: Compute the entropy of each column using the formula below:

entropy_i = - sum_{x = A,T,C,G} p_x log p_x    (3)

Step 3: Calculate the sum of the entropies over all columns:

Entropy = sum_i entropy_i    (4)

The user can select either of the fitness functions provided in the toolbox.
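As an illustration of the two fitness measures, the Python sketch below scores a small toy alignment with a simple match/mismatch substitution scheme (a stand-in for a PAM or BLOSUM matrix) and with the column-entropy measure of Eqs. 3-4. The alignment, the scores and the gap handling are hypothetical choices made only for the example.

from math import log
from itertools import combinations

def sp_score(alignment, s=lambda a, b: 2 if a == b else -1, gap=-2):
    """Sum-of-pairs score (Eq. 2): sum s(x, y) over all unordered pairs in
    every column. Pairs involving a gap get the 'gap' score in this sketch."""
    total = 0
    for col in zip(*alignment):
        for a, b in combinations(col, 2):
            total += gap if '-' in (a, b) else s(a, b)
    return total

def entropy(alignment, alphabet="ATCG"):
    """Alignment entropy (Eqs. 3-4): sum over columns of -sum p_x log p_x.
    Gaps are simply ignored when counting column frequencies here."""
    total = 0.0
    for col in zip(*alignment):
        n = len(col)
        for x in alphabet:
            p = col.count(x) / n
            if p > 0:
                total -= p * log(p)
    return total

aln = ["AC-GT", "ACGGT", "AC-GA"]    # hypothetical toy alignment
print(sp_score(aln), round(entropy(aln), 3))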

2.4 Cross Over

In the crossover process, two parent chromosomes, denoted X and Y, are selected by the Roulette Wheel method in order to produce two offspring chromosomes. Two kinds of crossover are used in this toolbox: horizontal crossover and vertical crossover. In horizontal crossover, a sequence (including its gaps) is randomly selected from parent X and exchanged with the corresponding row in parent Y (Figure 3). In vertical crossover, the sequences in each parent are randomly split into two parts and new offspring are generated by combining the different slices (Figure 4). Note that after a vertical crossover the number of gaps in each sequence must not be changed.

Figure 3: Horizontal CrossOver

Figure 4: Vertical CrossOver
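A minimal Python sketch of the two crossover operators, assuming each chromosome is stored as a list of gap-position lists (one list per sequence); this is an illustration under that assumption, not the toolbox code.

import random

def horizontal_crossover(parent_x, parent_y):
    """Swap one randomly chosen row (one sequence's gap positions)
    between the two parents."""
    row = random.randrange(len(parent_x))
    child_a = [list(r) for r in parent_x]
    child_b = [list(r) for r in parent_y]
    child_a[row], child_b[row] = list(parent_y[row]), list(parent_x[row])
    return child_a, child_b

def vertical_crossover(parent_x, parent_y, N):
    """Split every row at a random column and recombine the left part of one
    parent with the right part of the other. The number of gaps per row may
    change, so a repair step would be needed to keep it constant (see text)."""
    cut = random.randrange(1, N)
    child_a, child_b = [], []
    for gx, gy in zip(parent_x, parent_y):
        child_a.append(sorted([g for g in gx if g <= cut] + [g for g in gy if g > cut]))
        child_b.append(sorted([g for g in gy if g <= cut] + [g for g in gx if g > cut]))
    return child_a, child_b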

2.5 Mutation

The mutation operator merges some gaps of a sequence together and then shifts them to other columns. The details of the mutation operator are given in Algorithm 2.

Select a number-string x_i = (x_i,1, x_i,2, ..., x_i,m) in X at random.
Select two numbers x_i,g and x_i,g+1.
Select two numbers h and h+1 that are not members of x_i.
Replace the numbers x_i,g and x_i,g+1 with h and h+1, respectively.
Sort the numbers in x_i in increasing order.
Algorithm 2: Mutation (chromosome X)

In this operator, two gaps are selected randomly and moved to other columns. An example is given in Figure 5: the gaps at positions 7 and 9 in the third number-string are selected to be shifted and merged together; therefore, the new gaps are generated at positions 13 and 14.

Figure 5: Example of mutation

Mutation helps the algorithm to escape local minima, so the mutation rate is important. For a very large value of the mutation rate, the algorithm becomes nothing more than a random search; if it is too small, the algorithm may be trapped in local minima, may converge very slowly, or may never converge.
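The mutation operator of Algorithm 2 can be sketched in Python on the same gap-position representation; the random choices and variable names are illustrative.

import random

def mutate(chromosome, N):
    """Sketch of Algorithm 2: pick one number-string, take two consecutive gap
    entries and move them to a new pair of adjacent free positions h, h+1."""
    child = [list(row) for row in chromosome]
    rows = [r for r in child if len(r) >= 2]
    if not rows:
        return child
    row = random.choice(rows)
    g = random.randrange(len(row) - 1)            # index of x_{i,g}; x_{i,g+1} follows
    occupied = set(row)
    candidates = [h for h in range(1, N) if h not in occupied and h + 1 not in occupied]
    if not candidates:
        return child                              # no free adjacent pair available
    h = random.choice(candidates)
    row[g], row[g + 1] = h, h + 1                 # replace the two gaps by h and h+1
    row.sort()
    return child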

3 SOFTWARE

3.1 Software Requirements

The toolbox was developed under MATLAB 7.0 (MathWorks), and the collection of functions and algorithms is provided as MATLAB source files, with no requirement for any third-party utilities beyond the standard MATLAB installation. The files only need to be copied into a folder declared in the MATLAB path. The model calculation can be performed via the MATLAB command window, which enables the user to perform all the analysis steps, i.e. data loading, setting preparation, model calculation and graphical analysis of the results.

3.2 Modules

3.2.1 Input Data

Data must be structured as a TXT file, with each sequence on one line. There is no limitation on the number or length of the sequences. The user is able to load the data in a very straightforward way.

3.2.2 Setting Parameters

Setting the algorithm parameters is an easy and straightforward task in this toolbox. There is a data structure named setting that includes all the setting parameters; the list of all setting parameters is shown in Figure 6. The user can set all the parameters for the GA and the MSA easily by changing the elements of this structure. The parameters that the user can alter are: the scoring scheme (sum-of-pairs or entropy), the number of generations, the size of the population, the type of crossover (vertical or horizontal), the mutation rate, and RSP. RSP is a rate that determines how many gaps must be added to the sequences; the minimum of this rate corresponds to n_max (the maximum length of the sequences) + 2.

Figure 6: Setting Parameters
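To give a feel for these settings, the snippet below collects the parameters listed above into a plain Python dictionary. The field names and values are hypothetical and do not correspond to the actual field names of the toolbox's setting structure.

# Hypothetical settings container mirroring the parameters described in the text;
# the real toolbox stores these in a MATLAB structure named "setting".
settings = {
    "scoring_scheme": "entropy",     # "entropy" or "sum_of_pairs"
    "generations": 200,              # maximum number of generations (g_max)
    "population_size": 50,           # |P|
    "crossover_type": "horizontal",  # "horizontal" or "vertical"
    "mutation_rate": 0.05,           # probability of applying the mutation operator
    "rsp": 0.2,                      # gap ratio used in N = n_max * (1 + r_sp)
}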

3.2.3 Results

Once the algorithm has finished, the sequence alignment can be visualized. We use the standard sequence alignment representation for showing the results. Figure 7 shows an example of the output for MSA. The progress toward the best fitness is also shown in a separate window, as in Figure 8.

Figure 7: Example of output

Figure 8: Fitness values (Entropy)

3.3 Example of Analysis

In this section, it is shown how MSA can be performed on real data. The data consist of five sequences, stored in an ASCII file with one sequence per line. The data are loaded into MATLAB; once the data set is ready, the GA is run on it with the parameters previously set by the user. These parameters are listed in Figure 9. Figures 10 and 12 visualize the MSA results for the entropy and SP scoring schemes respectively, and Figures 11 and 13 show the fitness values for these two scoring schemes.

Figure 9: Setting Parameters

Figure 10: MSA result with ent scoring scheme

Figure 11: Fitness value for ent scoring scheme

Figure 12: MSA result with sop scoring scheme

Figure 13: Fitness value for sop scoring scheme

CONCLUSION

In this paper, we introduced a GA toolbox for multiple sequence alignment. This toolbox is a collection of modules for calculating MSA. The algorithm settings (GA and MSA settings), such as the number of generations, the mutation rate and the scoring scheme, can be defined by the user and are automatically stored in a MATLAB data structure by means of a proper function. The user can then calculate the MSA via the MATLAB command window. It is our hope that TOMSAGA promotes the use of this approach in research by making its best features more readily accessible. This work suggests several interesting directions for future studies: designing a graphical user interface (GUI), adding the capability to handle protein sequences, implementing different kinds of mutation, and adding different types of scoring scheme are among our future plans for this project.

References

[1] D. Greenbaum, N. M. Luscombe and M. Gerstein, What is bioinformatics? A proposed definition and overview of the field, Department of Molecular Biophysics and Biochemistry, Yale University, USA (2001).

[2] R. A. Salam, M. F. Omar and R. Abdullah, Multiple Sequence Alignment Using Optimization Algorithms, World Academy of Science, Engineering and Technology 5 (2005).

[3] J.-T. Horng, L.-C. Wu and C.-M. Lin, A genetic algorithm for multiple sequence alignment, Soft Computing Journal (2005).

[4] A. G. McLennan, P. C. Turner and A. D. Bates, Nucleic Acid Structure, in Instant Notes in Molecular Biology, C1:31-35, Bios Scientific Publishers Limited, Liverpool (1997).

[5] M. Tompa, Lecture Notes on Biological Sequence Analysis, Technical Report, Department of Computer Science and Engineering, University of Washington (2001).

[6] W. Zhong, Using Travelling Salesman Problem Algorithms to Determine Multiple Sequence Alignment Orders, Master's Thesis, University of Georgia, Athens, Georgia (2003).

[7] S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequences of two proteins, J. Mol. Biol. 48 (1970), 443-453.

[8] R. A. Jue, N. W. Woodbury and R. F. Doolittle, Sequence homologies among E. coli ribosomal proteins: evidence for evolutionarily related groupings and internal duplications, J. Mol. Evol. 15 (1980), 129-148.

[9] M. Murata, J. S. Richardson and J. L. Sussman, Simultaneous comparison of three protein sequences, Proc. Natl. Acad. Sci. USA 82 (1985), 3073-3077.

[10] S. C. Chan, A. K. C. Wong and D. K. Y. Chiu, A survey of multiple sequence comparison methods, Bull. Math. Biol. 54 (1992), 563-598.

[11] P. van Laarhoven and E. Aarts, Simulated Annealing: Theory and Applications, Kluwer Academic (1987).

[12] C. Lawrence, S. Altschul and M. Boguski, Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Science 262 (1993), 208-214.

[13] M. Isokawa, M. Wayama and T. Shimizu, Multiple sequence alignment using a genetic algorithm, Genome Informatics 7 (1996), 176-177.

[14] M. Wayama, K. Takahashi and T. Shimizu, An approach to amino acid sequence alignment using a genetic algorithm, Genome Informatics 6 (1995), 122-123.

[15] C. Notredame and D. G. Higgins, SAGA: sequence alignment by genetic algorithm, Nucleic Acids Res. 24(8) (1996), 1515-1524.

[16] C. Zhang and A. K. C. Wong, A genetic algorithm for multiple molecular sequence alignment, Comput. Applic. Biosci. 13(6) (1997), 565-581.

[17] K. Chellapilla and G. B. Fogel, Multiple sequence alignment using evolutionary programming, Congress on Evolutionary Computation (1999), 445-452.

[18] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley (1989).

[19] J. C. Setubal and J. Meidanis, Introduction to Computational Molecular Biology (1997).

[20] A. Chan, An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST.

[21] D. E. Goldberg and K. Deb, A Comparative Analysis of Selection Schemes Used in Genetic Algorithms, Foundations of Genetic Algorithms, San Francisco, CA: Morgan Kaufmann (1991), 69-93.

[22] D. Wheeler, Current Protocols in Bioinformatics, Unit 3.5 (2002).


To enrich the life book of IT specialists through shaping a living schema strategy based on a balance-oriented model

Mostafa Jafari
Zanjan University
Human Science Faculty
Zanjan, Iran
mjafari@znu.ac.ir

Corresponding author: P. O. Box 45195-1159, T: (+98) 241 415-5059

Abstract: This research paper answers the following question: how can we improve the social-life analysing abilities of computer and information technology (IT) specialists in order to enrich their social life? The paper is based on applied research; the target community consists of 90 people, comprising IT and computer specialists (professors, instructors, scholars, engineers) and social science experts. The three main hypotheses were as follows: there is a meaningful difference between the life strategies of IT specialists and social specialists; the life schema of IT specialists is not balanced; and the mean level of literacy (knowledge) of IT specialists is low. The analysis of the data confirms only the third hypothesis. Based on the results of the research, we propose a multi-dimensional model to measure and balance the life schema shaping strategy of IT specialists.

Keywords: Life schema strategy, IT specialists, Spider model diagram.

Introduction

Today people live in confusing times, as is often the case in periods of historic transition between different forms of society [4]. The world is large and complex, while the human brain and its information-processing capacity are highly limited in comparison [10], so people make decisions in a space of confusion. People do not live based on reality but based on their own perception of reality. One of the cognitive instruments for understanding how to enrich and structure our life is the schema: humans experience time in different ways depending on how their lives are structured [4].

Scientific framework

In this section we briefly describe the scientific theories on which this research draws.

2.1 IT paradigm, network society and identity (self-image) forming

Societies are increasingly structured around a bipolar opposition between the Net and the Self [4]. The ability of an actor in the network, be it a company, an individual, a government or another organization, to participate in the network is determined by the degree to which the node can contribute to the goals of the network [4] (Castells, 1999). Interactive computer networks are growing exponentially, creating new forms and channels of communication, shaping life and being shaped by life at the same time. This new environment requires skilled, flexible workers: the organization man gives way to the flexible woman [4]. People will be able to play an efficient role in the network society through an efficient life schema. Schemas help us to reduce confusion and improve our life-strategy shaping processes; schemas classify our knowledge and literacy.

Computers, communications systems, and programming are all amplifiers and extensions of the mind. What we think becomes expressed in goods, services, material and intellectual output, missiles, health, education, or images [4]. Our living schemas are reformed within the IT paradigm, so knowing its main dimensions is a necessity. This paradigm has five characteristics [4]:
1 The new IT paradigm's raw material is information. These are technologies that act on information, not just information that acts on technology, as was the case in previous technological revolutions.

2 The second feature refers to the pervasiveness of the effects of the new technologies. Because information is an integral part of all human activity, all processes of our individual and collective existence are directly shaped (although certainly not determined) by the new technological medium.

3 The third characteristic refers to the networking logic of any system or set of relationships using these new information technologies. Robert Metcalfe's formula shows how the value of a network increases roughly as the square of the number of nodes in the net: V = n(n - 1), where n is the number of nodes in the network.

4 The IT paradigm is based on flexibility. Not only are processes reversible, but organizations and institutions can be modified, and even fundamentally altered, by rearranging their components.

5 The IT paradigm is the growing convergence of specific technologies into a highly integrated system, within which old, separate technological trajectories become literally indistinguishable.

Therefore, enriching social life in this paradigm requires multi-dimensional knowledge or literacy. People can acquire this type of knowledge or literacy through life schema strategies.

2.2 Life schema shaping strategy

Realizing a balanced life in the network society strongly depends on an efficient and effective life schema shaping strategy. According to the cognition school of strategy, a schema is the mental structure of a person. Everyone is bombarded with data; the problem is how to store it and make it available at a moment's notice. Schemas do this by representing knowledge at different levels, which enables people to create full pictures from rudimentary data and to fill in the blanks. When we think about a matter, for example about ways to enrich life, the mind likely triggers a schema with knowledge at the political, financial, and technological levels; certain implicit assumptions go with this schema [10].

The combination of these schemas finally and dynamically reshapes the identity of any person and any nation. In a world of global flows of wealth, power, and images, the search for identity, collective or individual, ascribed or constructed, becomes the fundamental source of social meaning [4]. Thus all people strongly need a suitable model in order to be able to realize their own identity and continuously enrich the content of their life book. In a network society, the spider model is a simple, efficient and effective model for capturing this valuable vision.

2.3 Spider model diagram

A spider model diagram is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The chart consists of equal-angular spokes, with each spoke representing one of the variables. The data length of a spoke is proportional to the magnitude of the variable for the data point relative to the maximum magnitude of the variable across all data points. A line is drawn connecting the data values on each spoke, which gives the plot a star-like appearance and is the origin of one of its popular names. One application of the spider model is in quality improvement, to display the performance metrics of any ongoing program [8]. The spider model is primarily suited to strikingly showing outliers and commonality, or cases where one chart is greater in every variable than another, and it is primarily used for ordinal measurements where each variable corresponds to "better" in some respect and all variables are on the same scale [6]. The following model is an example of life and work balance.

Figure 1: A spider model
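As an illustration of how such a spider (radar) chart can be drawn, the following Python/matplotlib sketch plots a hypothetical set of life and work dimensions on equal-angle spokes; the labels and values are invented for the example and are not taken from the paper.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical dimensions and scores (0-100) for a life/work balance chart.
labels = ["Work", "Family", "Health", "Learning", "Leisure", "Community"]
values = [70, 55, 60, 40, 35, 50]

angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
values_closed = values + values[:1]          # close the polygon
angles_closed = angles + angles[:1]

ax = plt.subplot(111, polar=True)
ax.plot(angles_closed, values_closed, linewidth=1)
ax.fill(angles_closed, values_closed, alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(labels)
ax.set_ylim(0, 100)
plt.show()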

Thus people make decisions and live based on their own literacy, which has been structured through various schemas. How are these schemas shaped? They are formed through life strategies, which are lifelong trends of awareness and literacy acquisition.

Methodology

This paper is based on applied research. The target community was 90 persons in two groups: one group of computer and information technology specialists, and a second group of professors from other fields (management, accounting, psychology). The main question of the research was how we can improve the social-life analysing abilities of computer and information technology (IT) specialists in order to enrich their social life. We gathered the data through interviews with professors and instructors and by holding workshops in classes, and we analysed the data using descriptive statistical techniques.

There are many kinds of models that can be used and applied. Some types of models are:

1 Mathematical models (e.g. E = mc^2, V = n(n - 1))
2 Descriptive models (a text)
3 Analytic models (regression)
4 Image models (a photo or picture)
5 Geometric or drawing models

We selected the spider model as a geometric model because it is suitable (efficient and simple) for analysing and comparing any multidimensional phenomenon, particularly in a network context.

In this research, the basic (target) model was the model of the Iran Education superior consultant. Based on this descriptive model, all students should learn ten types of literacy, as follows:

1 Technological literacy
2 Scientific literacy
3 Economical-professional literacy
4 Political literacy
5 Social literacy
6 Health literacy
7 Cultural-art literacy
8 Inter-cultural (global) literacy
9 Ecological literacy
10 Spiritual literacy

We transformed this descriptive model into a geometric model (Figure 2) and measured and compared the literacy levels of the members of the two target communities.

Figure 2: The model of the literate man. Source: Education Strategic Plan of Zanjan, edited by Mustafa Jafari


The kind of knowledge                      IT group (%)   Other group (%)
1. Technological literacy                       65              40
2. Scientific literacy                          30              30
3. Economical-professional literacy             50              60
4. Political literacy                           15              30
5. Social literacy                              20              30
6. Health literacy                              45              45
7. Cultural-art literacy                        20              20
8. Inter-cultural (global) literacy             20              20
9. Ecological literacy                          30              30
10. Spiritual literacy                          45              45

Table 1: Ten Literacies
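The summary statistics quoted in the discussion below can be checked directly from Table 1, for example with the following small Python computation:

it_group    = [65, 30, 50, 15, 20, 45, 20, 20, 30, 45]
other_group = [40, 30, 60, 30, 30, 45, 20, 20, 30, 45]

mean_it    = sum(it_group) / len(it_group)        # 34.0 percent
mean_other = sum(other_group) / len(other_group)  # 35.0 percent
spread_it  = max(it_group) - min(it_group)        # 65 - 15 = 50

print(mean_it, mean_other, spread_it)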

Results

The mean literacy levels of the two groups, from a comparative point of view, are shown in Figures 3 and 4.

Figure 3: The mean level of the 10 literacies among IT specialists

Figure 4: The mean literacy level of social specialists

What is the meaning of these results? The discussion answers this question.

Discussion

Analysing the results and comparing the literacy attributes of the IT and social specialists shows that the mean literacy of the members of both target communities is at a low level. The mean literacy of the second group is meaningfully higher than that of the IT and computer specialists, although the technological literacy of the IT and computer specialists is better than that of the second group. The highest literacy levels of the IT group members are technological literacy (65%), economical-professional literacy (50%) and spiritual literacy (50%); the lowest are political (15%), cultural-art (20%) and inter-cultural literacy (20%). The mean literacy of the IT group is 34 percent. The highest literacy levels of the second group members are economical-professional (60%), health (45%) and spiritual literacy (45%); the lowest are cultural-art (20%) and inter-cultural knowledge (20%). The distance between the maximum and minimum mean literacy of the IT group is 50 (65 - 15). This is not a desirable situation. There is a relative balance among the 10 kinds of literacy within both the IT and the social experts groups, but the mean literacy level of both groups is low. It seems that not only the social specialists but also the IT experts do not have an effective strategy for writing their own life book.

Conclusions and future works

All IT and social experts should strongly promote their own literacy level in every field. The spider diagrams of literacy of the IT and social specialists are relatively balanced, but at a small size. Therefore they should this


Authors and Attendances Index

Abbasfard, Mitra
Abdi reyhan, Zahra
Abdollahi, Mahdi
Abdollahi, Davood
Abedin, Marjan
Afsari, Hossein
Afsharchi, Mohsen
Agha-Mohaqeq, Mahnaz
Ahmadi, Lida
Ahmadian Ramaki, Ali
Ahmadzadeh, Vahid
Ahmadzadeh, Somayeh
Akbari, Ahmad
Akbari, Majid
Akbarzadeh, M
Alizadeh, H
Alizadeh, Hassan
Allahyar, Amin
AlmasiMousavi, SeyedMehrzad
Amini, Sara
Aminian, Media
Arabani Mostaghim, Saideh
Arabfard, Masoud
Asad Nejhad, Reza
Ashkezari Toussi, Soheila
Askari, Meisam
Askari Moghadam, Reza
Aslanian, Angeh
Asosheh, Abbass
Azadi, Neda
Azami, H
Azimi, Reyhane
Azmi, Reza
Babaee, Hossein

Babamir, Morteza
Babu, Praveen
Bagheri, Ahmad
Bagheri Shouraki, Saeed
Bakhshandegan Moghaddam, Farshad
Bakhshayesh, B
Banki, Hoda
Baraani, Ahmad
Barzegar, HamidReza
Bazargan, Kamal
Biglari, Mohsen
Bijari, Afsane
Borna, Keivan
ChaieAsl, Rana
Danaie, Hasan
Dastghibyfard, Gh
Davardoost, Farnaz
Dehghan Takhtfooladi, Mehdi
Derakhshan, Farnaz
Derakhshanfar, Roya
Derhami, Vali
Dolati, A
Ebadi, Shabnam
Ebadzadeh, Mohammad Mehdi
Ebrahimi Atani, R
Ebrahimpour-Komleh, Hossein
Eftekhary Moghadam, Amir Masoud
Emadi, Seyyed Peyman
Emami, Hojjat
Eskandari, Marzieh
Faez, Karim
Falahi, Amirreza
Farokh, Azam
Fatemie parsa, Susan


Firouzi, Mohsen
Forutan Eghlidi, Fatemeh
Fotouhi-Ghazvini, Faranak
Ghadimi, Fatemeh
Ghasem Azar, Armin
Ghasemzadeh, Mohammad
Gheibi, Amin
Ghiasbeigi, Masoud
Ghiasifard, Sonia
Gholami, Peyman
Gholami, Maryam
Gholami, Azadeh
Gholamiyan Yousef Abad, Bahareh
Gholamnezhad, Pezhman
Gohargazi, Hojjat
Golichenari, Fatemeh
H.Khalaj, Babak
Haghighat, Bahar
Hagtalab, Hamed
Haj Mirzaei, Milad
Haji Seyed Javadi, Mohammad
Hajinazari, Parvaneh
Hasanzadeh, Maryam
Hasanzadeh, Maryam
Hashemi, Seyyed Mohsen
Hasheminejad, S.M.Hossein
Hassanpour, Reza
Hassanzade, Elmira
Hatami, Einolah
Hatamzadeh, Payam
Hayati, Mohammad Hosseion
Hazrati Bishak, Akhtar
Hazrati Bishak, Morteza
Horri, Abbas
Hosseini, Seyed Rebvar
Iahad, N.A
Jabraeil Jamali, Mohammad Ali
Jafari, Amir Homayoun
Jafari, Parisa
Jafari, Zahra

Jafari, Mostafa
Jalalian, Zahra
Jalili, Saeed
Jamali Dinan, Samirasadat
Javadi, Marzieh
Javadi, SeyyedMohammadAli
Kalantari, Mohammad
Kargar, Saeed
Kargar, Hossein
Karimi, Mohammad Hossein
Karimian Ravandi, Masoud
Karimpour Darav, Nima
Katanforoush, Ali
Kesri, Vishal
Khairabadi, Jalal
Khakabi, Sina
Khalvandi, Tayebeh
Khanteimoory, Alireza
Khayyambashi, Mohammad Reza
Khazaei, Bahareh
Khodadadian, Elahe
Khosravi, Alireza
Khosravi, Mohsen
Khosravi-Farsani, Hadi
Kiasat, Fereshteh
Laleh, Abolghasem
Lausen, George
Lotfi, Shahriar
M.Bassiri, Maisam
Mahdavi, Mehrgan
Mahdavinataj, Hannane
Mahdiani, Hamid Reza
Mahini, Reza
Mahmoodi, Seyed Abbas
Mahmoodi, Maryam Sadat
Mahmoudzadeh, Behrouz
Maleki, Farhad
Marzaei Afshord, Masumeh
Marzi Alamdari, Jabrael
Masoud, Hamid


Meshkin, Alireza
Meybodi, Mohammad Reza
Minaei-Bigdeli, Behrooz
Mirabolghasemi, Marva
Mirabolghasemi, Maziar
Mirehi, Narges
Mirzaei, F
Mirzare Rad, Zahra
Moadab, Shahram
Moayyedi, Fatemeh
Mobedi, Parinaz
Moeinii, Ali
Mohades, Ali
Mohammad Alizadeh, Zohreh
Mohammad khanli, Leyli
Moradi, Amin
Moradi, Parham
Morovati, Mohamad Mehdi
Mortazavi, Reza
Mostajabi, Tayebeh
Naderi, Hassan
Najafi, Elahe
Najafi, Robab
Najafi, Adel
Naji, Hamid Reza
Namazi, Babak
Nasersharif, Babak
Nazemi, Eslam
Nematbakhsh, Mohammadali
Nikanjam, Amin
Nilforoushan, Zahra
Norouzi, Naser
Norozi, Narges
Noshirvani Baboli, Davood
Nourollah, Ali
Poshtan, Javad
Pourhaji Kazem, Ali Asghar
Pourzaferani, Mohammad
PR Hasanzadeh, Reza
Qiasi, Razieh

Rahimipour, Shiva
Rahmani, Amir Masoud
Rahmani Ghobadi, Zahra
Rajabzadeh, Maria
Raji, Masoumeh
Rashidi, Hasan
Rasouli, Alireza
Rezaei, Fateme
Roozbahani, Zahra
Sabaei, Masuod
Sadeghi, Mehdi
Sadeghi Bigham, Bahram
Sadoghi Yazdi, Hadi
Sadreddini, Zhaleh
SaeediNia, Ebrahim
Safaeinezhad, Mohsen
Safilian, Masoud
Sajedi, H
Salahshoor Mottaghi, Zahra
Salehi, Marzieh
Salehi, Saeed
Salehpour, Masoud
Samapour, Toofan
Sanei, S
Saniee Abadeh, Mohammad
Serajian, Mina
Setarehdan, S.Kamaledin
Seyyed Hamzeh, Mehdi
Shabani, B
Shahbahrami, Asadollah
Shahgholi, Abdolmajid
Shahraki, Shahram
Sharifi, Ahmad
Sheikhi, Sanaz
Sheikholslam, S. Mostafa
Shirazi, Mahmoud
Shirazi, Mahmoud
Shiri, Mohammad Ebrahim
Shirmohammadzadeh,Shahin
Shojaie, Aso


Shokrzadeh, Morteza
Shourie, Nasrin
Sojudi, Sevila
Solhnia, Mohsen
Tabibian, Shima
Taheri, Fatemeh
Taheri, Mohsen
Taheri, T
Taherian, Parisa
Tahmasbi, Maryam
Taromi, S
Tashakkori Hashemi, Seyyed Mehdi

Toroghi Haghighat, Abolfazl


Vahed, Mohsen
Vasighi, Mehdi
Veghari Baheri, Farzaneh
Vojodi, Hakimeh
Yahyanejad, Farzaneh
Yazdani, Marjan
Yazdani, Mina
Yosef Zadeh fard, Parisa
Zare-Mirakabad, F
Zeinali Kh, Esmaeil
Zolfagharnasab, Hooshiar

