ABSTRACT- Biometric verification provides authentication of a person based on the unique characteristics possessed by the individual. Biometric systems have been developed based on various features, such as fingerprint, facial image, voice, hand geometry, handwriting, iris, and retina. Among them, the iris is considered one of the most reliable and accurate candidates because it is unique to each individual, well protected, and difficult to modify. A thorough understanding of the iris code is essential, because 100 million people have been enrolled in biometric personal identification systems, and template protection methods have been developed based on the iris code. Nowadays hackers can decompress the iris code and generate the iris template; using this template, a hacker can break high-level security and easily misuse our information. This is overcome by using randomised attributes. In this paper, the colour image is converted to grayscale and median filtered, and then the pupil is detected and normalized. During this process, a threshold value is obtained, which is termed the iris code; it is unique for every person. Finally, the code is encrypted using the RSA algorithm and the secure key is stored in a database for authentication.

Keywords- randomized attributes, median filtering, histogram equalization, RSA algorithm.

I. INTRODUCTION

Among biometric verification methods, iris recognition is considered one of the most accurate and robust. Iris features can be easily extracted from eye images and efficiently compared. However, if the biometric reference template or set of biometric features is disclosed, the whole biometric system becomes useless for that individual, because biometric information cannot be canceled or revoked like a password. Therefore, there is a need to perform iris feature matching without revealing either the biometric data acquired during the verification process or the reference template from the database. Generally, biometric verification is based on the comparison between the features extracted from the input and the template. Due to the uniqueness of the biometric characteristics, the storage of the reference template is a key factor for the entire system security. Therefore, it is essential to protect the template from possible attacks. One approach is to encrypt the template using a secret key before storing it; when a verification task is requested, the matcher decrypts the template and performs the comparison.

An iris is the colored ring around the pupil. Its structure is determined during the fetal development of the eye and remains unchanged; on the contrary, the color of the iris can change as a result of the variable pigmentation in tissues. The main role of the iris is to control the size of the pupil and adjust the amount of light which enters through the pupil into the eye interior. The iris is surrounded by the sclera, which is a white area of tissues and blood vessels, and is covered by a transparent layer called the cornea. The whole iris is visible only with eyes wide open, as eyelids and eyelashes usually occlude its lower and upper parts. Iris features remain constant over an individual's lifetime and are not subject to changes produced by the effects of aging as other biometric features may be. For these reasons, the human iris is an ideal feature for highly accurate and efficient identification systems. The possibility of using the iris to distinguish individuals has been known for over 100 years, but the first patent for an automated iris biometric system was obtained by Flom and Safir in 1987 [6]. However, the most important work in the field of iris recognition was done by Daugman [5]. He introduced the first methods for iris image segmentation, unique feature extraction, and matching, which with slight modifications are used today and serve as the reference models for other algorithms.

Figure 1. Human Eye
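The abstract states that the iris code is encrypted with RSA before storage. As a purely illustrative sketch (toy primes, a 12-bit modulus, no padding, and a 6-bit hypothetical code chunk — none of these values come from the paper; a deployed system would use a vetted RSA library with large keys), the round trip looks like this:

```python
# Illustrative textbook RSA on one small chunk of a hypothetical iris code.
# The primes p, q and exponent e are toy values chosen for exposition only.

def egcd(a, b):
    # Extended Euclid: returns (g, x, y) with a*x + b*y = g
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def make_keys(p=61, q=53, e=17):
    # n = 3233, phi = 3120; gcd(e, phi) must be 1 for d to exist
    n, phi = p * q, (p - 1) * (q - 1)
    g, d, _ = egcd(e, phi)
    assert g == 1
    return (e, n), (d % phi, n)  # (public key, private key)

def crypt(m, key):
    # Modular exponentiation serves both encryption and decryption
    k, n = key
    return pow(m, k, n)

public, private = make_keys()
iris_code_block = 0b101101          # hypothetical 6-bit chunk of a code
cipher = crypt(iris_code_block, public)
assert crypt(cipher, private) == iris_code_block
```

In practice the template (or the symmetric key protecting it) would be encrypted blockwise with proper padding; the sketch only shows why decryption inverts encryption, since d is the modular inverse of e.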
The uniqueness of iris texture lies in the fact that the processes generating those textures are completely chaotic but stable. Hence, in order to use the iris as a biometric, the extracted feature should be able to capture and encode the randomness present in the iris texture.

Biometrics is the ability to identify an individual based on physiological characteristics such as fingerprint, handwriting, retina, iris, and face. There are many advantages to employing a biometric system for identification, but also some disadvantages. High recognition accuracy, uniqueness, and no need to memorize a code count among the advantages; low public acceptance and complex or expensive equipment count among the disadvantages. Even so, the advantages of biometric systems outweigh their drawbacks, so their use is increasing daily.

Based on an extensive literature survey, we classify iris recognition systems into three categories depending on the method by which the features are extracted for matching purposes: (a) appearance based, (b) texture based, and (c) feature based extraction.
Viewpoint dependent
Viewpoint independent.

The Daugman algorithm is designed to perform this decompression by exploiting a graph composed of the bit pairs in the iris code compression (Daugman) algorithm, prior knowledge from iris image databases, and theoretical results. The post-processing techniques are normalization, segmentation, and the Gabor filters which influence the distributions of the bits, so that the bitwise Hamming distance can be regarded as a bitwise phase distance calculated using a graph-based estimation algorithm. For these reasons, the human iris is an ideal feature for highly accurate and efficient identification systems. Like most other biometric authentication systems, the input eye images need to be processed so that the characteristic iris features can be extracted for comparison, as shown in Figure 1.
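The bitwise Hamming distance mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: codes are plain bit lists, the ±2-bit rotation range is a hypothetical choice (it compensates for eye rotation between captures), and real iris codes are thousands of bits with occlusion masks.

```python
# Minimal iris-code matching by bitwise Hamming distance, in the
# spirit of Daugman's matcher. Sizes and shift range are illustrative.

def hamming(a, b):
    # Fraction of disagreeing bits between two equal-length bit lists
    return sum(x != y for x, y in zip(a, b)) / len(a)

def match_score(code, template, max_shift=2):
    # Best (lowest) distance over small circular rotations of the code
    best = 1.0
    for s in range(-max_shift, max_shift + 1):
        rotated = code[s:] + code[:s]
        best = min(best, hamming(rotated, template))
    return best

template = [1, 0, 1, 1, 0, 0, 1, 0]
probe    = [0, 1, 1, 0, 0, 1, 0, 1]   # the template rotated by one bit
assert match_score(probe, template) == 0.0
```

A score below some decision threshold (Daugman's systems use values around 0.32 of the bits disagreeing) would be declared a match; an exact rotated copy scores 0.0 as above.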
… image. This type of filtering eliminates sparse noise while preserving image boundaries. After filtering, the contrast of the image is enhanced to produce sharp variation at image boundaries using histogram equalization.

… inner and outer boundaries of the iris, it is easy to map the iris ring to a rectangular block of texture. Here, a convolution filter is also employed for the purpose of enhancement. The original image has low contrast and may have non-uniform illumination caused by the position of the light source. These may impair the results of texture analysis, so we enhance the iris image in order to reduce the effect of non-uniform illumination.
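The two enhancement steps named above can be sketched directly. This is a minimal pure-Python illustration on a small 8-bit grayscale image held as a list of lists (a real implementation would use an image library); edge pixels are left unchanged for brevity, and the simple CDF-based remapping is one common form of global histogram equalization.

```python
# 3x3 median filtering (removes sparse salt-and-pepper noise while
# preserving edges), then global histogram equalization (stretches
# contrast by remapping gray levels through the cumulative histogram).

def median3x3(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]          # edges copied unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]          # middle of 9 sorted values
    return out

def equalize(img, levels=256):
    # Build histogram, cumulate it, then remap each gray level
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for c in hist:
        total += c
        cdf.append(total)
    n = len(flat)
    lut = [round(c / n * (levels - 1)) for c in cdf]
    return [[lut[p] for p in row] for row in img]
```

A lone bright pixel in a dark neighborhood is replaced by the neighborhood median, which is why median filtering suppresses isolated noise without blurring the pupil boundary the way a mean filter would.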
… has become a popular approach in various applications like image retrieval, remote sensing, biomedical image analysis, motion analysis, etc., to extract the entire iris template features. LBP is used to extract the features of the normalized iris image, and the output of LBP is an n-dimensional feature vector. Finally, these feature vectors are given as input to the LVQ classifiers.

… OR operation. Finally, matching of the iris is performed against the trained images, so that if an image matches one present in our database, the details of that person are shown, such as personal and health details. If it does not match the database, the details can be collected for further investigation if needed.
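The LBP step above can be illustrated with the basic 3x3 operator. This is a sketch, not the paper's exact variant: each interior pixel gets an 8-bit code from thresholding its neighbors against the center, and the histogram of codes is the n-dimensional feature vector handed to a classifier (the LVQ classifier itself is not sketched here).

```python
# Basic 3x3 Local Binary Pattern: bit i of a pixel's code is 1 when
# neighbor i is >= the center value; the code histogram is the feature.

def lbp_codes(img):
    # Clockwise neighbor offsets starting at the top-left corner
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    codes = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offs):
                if img[y + dy][x + dx] >= c:
                    code |= 1 << bit
            codes.append(code)
    return codes

def lbp_histogram(img, bins=256):
    # The n-dimensional feature vector fed to the classifier
    hist = [0] * bins
    for code in lbp_codes(img):
        hist[code] += 1
    return hist
```

Because the codes depend only on sign comparisons against the center pixel, the feature is invariant to monotonic illumination changes, which is why LBP works well on the unevenly lit normalized iris strip.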
Abstract— Over the past few years, various works have been done on image/video databases, most of them concentrating on accessing visual features. The principal component of video data is the spatial/temporal semantics associated with it. Here, we propose a semantic content extraction system that allows the user to query and retrieve objects, events, and concepts that are extracted automatically. This is based on ontology definitions. This metaontology definition provides a wide-domain applicable rule construction standard that allows the user to construct an ontology for a given domain. In addition to domain ontologies, we use additional rule definitions (without using ontology) to lower the spatial relation computation cost and to define some complex situations more effectively. The proposed framework has been fully implemented and tested on three different domains. We have obtained satisfactory precision and recall rates for object, event, and concept extraction.

Index Terms—Semantic; content extraction; video content modeling; fuzziness; ontology;

1. INTRODUCTION

There are basically three levels of video content: raw video data, low-level features, and semantic content. First, raw video data consist of elementary physical video units together with some general video attributes such as format, length, and frame rate. Second, low-level features are characterized by audio, text, and visual features such as texture, color distribution, shape, motion, etc. Third, semantic content contains high-level concepts such as objects and events. The first two levels, on which content modeling and extraction approaches are based, use automatically extracted data, which represent the low-level content of a video but hardly provide the semantics that are much more appropriate for users. Users are mostly interested in querying and retrieving a video in terms of what it contains. Therefore, raw video data and low-level features alone are not sufficient to fulfill the user's need; that is, a deeper understanding of the information at the semantic level is required in many video-based applications.

There are many research works in this area. Most of them use manual semantic content extraction methods. Manual extraction approaches are tedious, subjective, and time consuming [1], which limits querying capabilities. Besides, the studies that perform automatic or semiautomatic extraction do not provide a satisfying solution. Although there are several studies employing different methodologies such as object detection and tracking, multimodality, and spatiotemporal derivatives, most of these studies propose techniques for specific event types or work only under specific cases and assumptions. In [2], simple periodic events are recognized, where the success of event extraction is highly dependent on the robustness of tracking.

The event recognition methods described in [3] are based on a heuristic method that cannot handle multiple-actor events. Event definitions are made through predefined object motions and their temporal behavior. The shortcoming of this study is its dependence on motion detection. Another key issue in semantic content extraction is the representation of the semantic content. Many researchers have studied this from different aspects. A simple representation could relate events with their low-level features (shape, color, etc.) using shots from videos, without any spatial or temporal relations. However, effective use of spatiotemporal relations is crucial to achieve reliable recognition of events. Employing domain ontologies facilitates the use of applicable relations in a domain. There are no studies using both spatial relations between objects and temporal relations between events together in an ontology-based model to support automatic semantic content extraction. Studies such as BilVideo [4], [5], extended-AVIS [6], multiView [7], and classView [8] propose methods using spatial/temporal relations but do not have ontology-based models for semantic content representation. Bai et al. [9] present a semantic content analysis framework based on a domain ontology that is used to define semantic events with a temporal description logic, where event extraction is done manually and event descriptions use only temporal information. Nevatia and Natarajan [10] propose an ontology model using spatiotemporal relations to extract complex events, where the extraction process is manual. In [11], each linguistic concept in the domain ontology is associated with a corresponding visual concept, with only temporal relations, for soccer videos. Nevatia et al. [12] define
an event ontology that allows natural representation of complex spatiotemporal events in terms of simpler subevents. A Video Event Recognition Language (VERL), which allows users to define events without interacting with the low-level processing, is defined. VERL is intended to be a language for representing events for the purpose of designing an ontology of the domain, and the Video Event Markup Language (VEML) is used to manually annotate VERL events in videos. The lack of low-level processing and the use of manual annotation are the drawbacks of this study. Akdemir et al. [13] present a systematic approach to the problem of designing ontologies for visual activity recognition. The general ontology design principles are adapted to the specific domain of human activity ontologies using spatial/temporal relations between contextual entities. However, most of the contextual entities which are utilized as critical entities in spatial and temporal relations must be manually provided for activity recognition. Yildirim [14] provides a detailed survey of the existing approaches for semantic content representation and extraction.

In this study, a new Automatic Semantic Content Extraction Framework (ASCEF) for videos is proposed for bridging the gap between low-level representative features and high-level semantic content in terms of object, event, concept, and spatial and temporal relation extraction. In order to address the modeling need for objects, events, and concepts during the extraction process, a wide-domain applicable, ontology-based fuzzy VIdeo Semantic Content Model (VISCOM) that uses objects and spatial/temporal relations in event and concept definitions is developed. VISCOM is a metaontology for domain ontologies and provides a domain-independent rule construction standard. It is also possible to give additional rule definitions (without using ontology) for defining some special situations and for speeding up the extraction process. ASCEF performs the extraction process by using these metaontology-based and additional rule definitions, making ASCEF wide-domain applicable.

… can have weaknesses. Besides, VISCOM provides a standardized rule construction ability with the help of its metaontology, which eases the rule construction process and makes its use on larger video data possible.

Both the ontology model and the semantic content extraction process are developed with uncertainty issues in mind. For semantic content representation, the VISCOM ontology introduces fuzzy classes and properties. The Spatial Relation Component, Event Definition, Similarity, Object Composed Of Relation, and Concept Component classes are fuzzy classes, as they aim at fuzzy definitions. Object instances carry a membership value as an attribute, which represents the relevance of the given Minimum Bounding Rectangle (MBR) to the object type. Spatial relation calculations return fuzzy results, and Spatial Relation Component instances are extracted with fuzzy membership values.

2.2 Ontology-Based Modeling

The linguistic part of VISCOM contains classes and the relations between these classes. Some of the classes represent semantic content types such as Object and Event, while others are used in the automatic semantic content extraction process. The relations defined in VISCOM give the ability to model events and concepts related to other objects and events. VISCOM is developed on an ontology-based structure where semantic content types and the relations between these types are collected under VISCOM Classes, VISCOM Data Properties, which associate classes with constants, and VISCOM Object Properties, which are used to define relations between classes. In addition, there are some domain-independent class individuals.
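The structure described above — object and event classes, relations between them, and fuzzy membership on extracted instances — can be rendered as a toy data model. All names here ("Goal", "Ball", the 0.8 membership, the 0.5 threshold) are hypothetical examples for illustration, not definitions taken from VISCOM itself.

```python
# A toy, dictionary-based rendering of an ontology in the style the
# text describes: semantic content types, an event defined through a
# spatial relation between objects, and fuzzy instance membership.

domain_ontology = {
    "objects": {"Ball": {}, "GoalPost": {}},
    "events": {
        "Goal": {
            # an event modeled through a spatial relation between objects
            "spatial_relation": ("Ball", "inside", "GoalPost"),
        }
    },
}

# An extracted object instance: an MBR plus a fuzzy membership value
# expressing how relevant the MBR is to the object type.
instance = {"type": "Ball", "mbr": (40, 12, 52, 24), "membership": 0.8}

def is_relevant(inst, threshold=0.5):
    # Instances below the membership threshold are ignored downstream
    return inst["membership"] >= threshold

assert is_relevant(instance)
```

In the actual framework this structure lives in an OWL-style ontology rather than Python dictionaries; the point of the sketch is only how fuzzy membership gates which instances participate in event extraction.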
2.2.2 Object

Spatial relations express the relative positions of two objects, such as above, inside, or far. The spatial relation types are grouped under three categories: topological, distance, and positional spatial relations.

The Spatial Change class is utilized to express spatial relation changes between objects, or spatial movements of objects, in order to model events.
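The three relation groups above can be sketched over MBRs given as (x1, y1, x2, y2) tuples. This is an illustrative sketch, not the paper's formulas: the fuzzy "near" membership and its 100-pixel scale are assumptions added here, matching only the general idea that spatial relation calculations return fuzzy results.

```python
# Topological, positional, and fuzzy distance relations between two
# objects represented by Minimum Bounding Rectangles (x1, y1, x2, y2).

def topological(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    if ax1 >= bx1 and ay1 >= by1 and ax2 <= bx2 and ay2 <= by2:
        return "inside"      # a is fully contained in b
    if ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2:
        return "overlap"
    return "disjoint"

def positional(a, b):
    # Vertical relation of a's center to b's center
    acy = (a[1] + a[3]) / 2
    bcy = (b[1] + b[3]) / 2
    return "above" if acy < bcy else "below"  # image y grows downward

def near_membership(a, b, scale=100.0):
    # Fuzzy distance relation: 1.0 at distance 0, decaying linearly
    acx, acy = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bcx, bcy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    d = ((acx - bcx) ** 2 + (acy - bcy) ** 2) ** 0.5
    return max(0.0, 1.0 - d / scale)
```

An event rule can then combine these results, e.g. requiring "inside" plus a high "near" membership between two object instances over consecutive frames.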
2.2.16 Similarity
4 EMPIRICAL STUDY
All these tests show that the proposed ontology-based automatic semantic content extraction framework is successful for both event and concept extraction. There are two points that must be ensured to achieve this success. The first is to obtain object instances correctly: whenever a missing or misclassified object instance occurs in the object instance set, which is used by the framework as input, the success of event and concept extraction decreases. The second is to use the proposed VISCOM metamodel effectively and construct a well and correctly defined domain ontology. Wrong, extra, or missing definitions in the constructed ontology can decrease the extraction success. In the tests, we encountered wrong extractions because of wrong Similarity class individual definitions for the typing event in the office domain.

V CONCLUSION

The Automatic Semantic Content Extraction Framework contributes in several ways to the semantic video modeling and semantic content extraction research areas. First of all, the semantic content extraction process is done automatically. In addition, a generic ontology-based semantic metaontology model for videos (VISCOM) is proposed. Moreover, the semantic content representation capability and extraction success are improved by adding fuzziness in class, relation, and rule definitions. An automatic Genetic Algorithm-based object extraction method is integrated into the proposed system to capture semantic content. In every component of the framework, ontology-based modeling and extraction capabilities are used. The test results clearly show the success of the developed system. Future work includes the enhancement of the domain ontology with more complex model representations and the definition of semantically more important and complex events.

[7] J. Fan, W. Aref, A. Elmagarmid, M. Hacid, M. Marzouk, and X. Zhu, "Multiview: Multilevel Video Content Representation and Retrieval," J. Electronic Imaging, vol. 10, no. 4, pp. 895-908, 2001.

[8] J. Fan, A.K. Elmagarmid, X. Zhu, W.G. Aref, and L. Wu, "Classview: Hierarchical Video Shot Classification, Indexing, and Accessing," IEEE Trans. Multimedia, vol. 6, no. 1, pp. 70-86, Feb. 2004.

[9] L. Bai, S.Y. Lao, G. Jones, and A.F. Smeaton, "Video Semantic Content Analysis Based on Ontology," IMVIP '07: Proc. 11th Int'l Machine Vision and Image Processing Conf., pp. 117-124, 2007.

[10] R. Nevatia and P. Natarajan, "EDF: A Framework for Semantic Annotation of Video," Proc. 10th IEEE Int'l Conf. Computer Vision Workshops (ICCVW '05), p. 1876, 2005.

[11] A.D. Bagdanov, M. Bertini, A. Del Bimbo, C. Torniai, and G. Serra, "Semantic Annotation and Retrieval of Video Events Using Multimedia Ontologies," Proc. IEEE Int'l Conf. Semantic Computing (ICSC), Sept. 2007.

[12] R. Nevatia, J. Hobbs, and B. Bolles, "An Ontology for Video Event Representation," Proc. Conf. Computer Vision and Pattern Recognition Workshop, p. 119, 2004.

[13] U. Akdemir, P.K. Turaga, and R. Chellappa, "An Ontology Based Approach for Activity Recognition from Video," Proc. ACM Int'l Conf. Multimedia, pp. 709-712, 2008.
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
Abstract— SMS can be used in various types of daily applications, including medical, banking, and other mobile applications. Sending an SMS is one of the most economical, fastest, and simplest methods of communication. When we exchange SMS from one mobile phone to another, the data contained in the message is transmitted in decoded form (unencrypted text). Often the data enclosed in the message is private, such as a bank account number or an identification key, and it is a serious risk to transfer such data through SMS, as the established message service does not provide encryption for the data contained in the SMS before its transmission.

Keywords- Cryptographic Algorithm, Encryption, SMS disclosure, replay attack, Decryption.

I. INTRODUCTION

The SMS service has become one of the fastest and strongest communication channels for conveying information all over the world. The SMS service completed 22 years on December 3, 2014. The world's first SMS was sent by Neil Papworth in the United Kingdom over the Vodafone network. The GSM and SMS standards were primarily developed by ETSI, the European Telecommunications Standards Institute [1]. Messages are sent as plain text between the MS and the SMS centre using the wireless network. The SMS information is stored in the systems of the network operator, and this data can easily be read by their personnel. Because of this problem, encryption is important for every SMS sent through the network operator. Sometimes we transfer private information like passwords and banking details to our family members, colleagues, and service providers through SMS, but the traditional message service does not encrypt the data contained in the SMS before its transmission.

A. Aims and objectives

Secure Messenger is an application used to encrypt and decrypt messages. The encryption and decryption are based on a Cryptographic Algorithm (CA). The messenger is used to send an encrypted message to a target user or target mobile (the end user or target mobile should have the same version of Secure Messenger). The application can also be used to transfer money between two clients. The SMS message is encrypted on the mobile using the smart client form, and the encrypted message is passed over the network; the network operator cannot detect or read the encrypted message. Mobile transactions are performed against a banking application; users must be registered in a bank database and should have a supported mobile for transactions [2].

B. Scope of problem

This application sends encrypted SMS messages using cryptographic methods. The encryption algorithm is keyed by a secret key. The application is developed using the Microsoft Visual Basic .NET programming language.

SMS messages are sometimes used for the interchange of confidential data such as social security numbers or important passwords. A typing fault in selecting a number when sending such a message can have severe consequences if the message is readable by any receiver. Most mobile operators encrypt all exchanged information, but this is not always possible. Among other things, these constraints give rise to the need to develop additional encryption for SMS messages [3].
The avenue to this problem is to develop an application and secure message transfer from one mobile phone to
that can be used in mobile devices to encrypt sending another [10]
messages. Naturally decryption for encrypted messages is
also provided. The encryption and decryption are
III.PROPOSED PLAN OF WORK
designated by a secret key that all legal parties have to
possess. In addition to cryptographic strength, when
developing this type of an application for mobile phones Proposed work will be worked out in three phases as
are restraint in memory and processing capacity[4]. follows:
Next application design by the new author called the Figure describes the overall communication model of
SSMS. This new application design for achieves the better proposed system. For complete and successful execution
security than the previous one. This application is used for of proposed system needs to deal with:
alimony system. For generate the key in this application
used the elliptic curve cryptography. This application • Secure SMS protocol designing.
provides the low bandwidth and cost effective solution [8]. • Mobile Application development for Windows
mobile devices.
Another application is also based on the payment system.
This application is based on the high security foundation. • AES based encryption on both end
This application generates the shared key for each period • User account management for banking like
and transfer the secure information between two peers [9]. application.
Next application is design for the public health care. This Sometimes, we send the confidential information like
application is based on the java public key cryptography.
banking details and private identity to our family
This application stored all the medical data of each person
members and service providers through an SMS[13]. But
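The encrypt-before-send / decrypt-on-receipt flow listed above can be sketched as follows. Python's standard library has no AES implementation, so this sketch substitutes a simple SHA-256-based keystream cipher as a clearly-labeled stand-in for AES; a real application would use an actual AES library on both ends. The key, nonce and message below are illustrative only.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random keystream from the shared key.
    NOTE: a hash-based stand-in for AES in counter mode, for illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_sms(key: bytes, nonce: bytes, plaintext: str) -> bytes:
    """Encrypt an outgoing SMS payload before transmission."""
    data = plaintext.encode("utf-8")
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

def decrypt_sms(key: bytes, nonce: bytes, ciphertext: bytes) -> str:
    """Decrypt a received SMS payload; XOR with the same keystream inverts encryption."""
    ks = keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, ks)).decode("utf-8")

shared_key = b"secret shared by all legal parties"   # illustrative key
ct = encrypt_sms(shared_key, b"msg-001", "PIN 4321, acct 99-1234")
assert decrypt_sms(shared_key, b"msg-001", ct) == "PIN 4321, acct 99-1234"
```

The essential point is the one the text makes: without a shared secret key on both handsets, the SMS payload travels in the clear.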
V. IMPLEMENTATION MODEL

The implementation module is divided into three parts:

A. Developing a Windows mobile based application for:
• AES encrypted SMS sending.
• Receiving incoming SMS and decryption.

VI. REFERENCES

[1] Shital D. Rautkar and Prakash S. Prasad, "An overview of real time secure SMS transmission", International Journal of Advanced Research in Computer and Communication Engineering, 2015.
[2] D. Lisonek and M. Drahansky, "SMS encryption for mobile communication", International Conference on Security Technology, Hainan Island, 2008.
[3] H. Zhao and S. Muftic, "Design and implementation of a mobile transaction client system: secure UICC mobile wallet", International Journal for Information Security Research, vol. 1, 2011.
[4] H. Harb, H. Farahat and M. Ezz, "SecureSMSPay: secure SMS mobile payment model", Proc. 2nd International Conference, 2008.
Abstract—The advent of the a-Si Electronic Portal Imaging Device (EPID) has given clinicians an important tool to verify the location of the radiation therapy beam with respect to the patient anatomy. However, Electronic Portal Images (EPI) are blurred and suffer from low contrast due to Compton scattering. It is difficult to differentiate between the organs and tissues in low contrast images. We need better in-treatment images to extract relevant features of the anatomy for reliable patient set-up verification. The goal of this research work was to inspect several image processing techniques for contrast enhancement and edge detection/sharpening on EPI in DICOM format and to improve their visual aspects for better diagnosis and intervention. We propose a hybrid approach to enhance the quality of electronic portal images using the CLAHE algorithm and median filtering followed by image sharpening. Results suggest impressive improvement in image quality by the proposed method. To quantify the degree of enhancement or degradation for the various techniques experimentally, metrics like RMSE and PSNR are compared.

Keywords—Electronic Portal Imaging Device (EPID); DICOM; image enhancement; image processing.

I. INTRODUCTION

Cancer is a cellular malignancy that results in unregulated growth. Radiotherapy is used to kill malignant cancer cells. The goal of a Treatment Planning System (TPS) is to maximize the dose delivered to a well defined targeted tumour volume and minimize the exposure to healthy surrounding tissues, to avoid DNA damage or other such undesirable complications. [4] The use and development of active matrix flat panel amorphous silicon (a-Si) electronic portal imaging devices (EPIDs), for verification of Intensity Modulated Radiation Therapy (IMRT) and modification of the patient setup to match the planning geometry and avoid discrepancies in the treatment outcome, is increasing. [5] Electronic Portal Image (EPI) quality is poor due to the Compton effect, which is dominant in the MV range. Typically, EPIs show low to no distinction between tissues and bones.

To produce higher quality images for assessment and increase the treatment throughput, we test an aggregation of image processing techniques to refine the visual aspect of EPI images. The operations were performed on 5 images: pelvis (AP), chest (AP), head (lateral), neck (lateral) and thorax (lateral, with MLC). This paper gives a qualitative analysis of the effect of various operations on EPI.

The paper is divided into seven sections: section 1 gives a brief introduction to the use of EPID, section 2 covers the literature survey, section 3 discusses the DICOM standard, section 4 describes EPI processing steps, section 5 gives the steps of the proposed method, and section 6 discusses the results obtained, followed by the conclusion in section 7.

II. LITERATURE SURVEY

In the past decade, relatively little work has been done on improving electronic portal image quality. Experiments were conducted to increase the contrast of portal images based on local enhancement of the pixel values of the image matrix, and it was found that the processed images were superior in quality; however, the processing was time consuming. [8] The influence of contrast enhancement, noise reduction and edge sharpening was examined in a 3-step sequence for 12 combinations of operations, and results were compared for portal images obtained from the PIPS-PRO system. The majority of the images had superior quality after processing, but it was not enough to locate essential structures in all cases. [9] The use of Gray Level Grouping for global contrast enhancement and Adaptive Image Contrast Enhancement (AICE) for local contrast enhancement was suggested to enhance the visual quality and contrast of the whole EPI. It was concluded that these methods greatly improve the perceived quality of the images. [10]

III. THE DICOM STANDARD

DICOM stands for Digital Imaging and Communications in Medicine, and it represents the universal and fundamental standard in digital medical imaging. It stores the details about the image and the patient in the same file. Established by the National Electrical Manufacturers Association (NEMA), it allows communication between equipment from different modalities and vendors, facilitating the management of digital images. Since the maximum gray value of DICOM images is greater than 255, it supports up to 65,536 (16-bit) shades of gray for monochrome image display. [13]

IV. ELECTRONIC PORTAL IMAGE PROCESSING

A. Denoising
Noise in an electronic portal imaging system is quantum mottle, which arises from the discrete nature of radiation and its interactions with matter. [4] Gaussian
noise is obtained due to random fluctuations. During transmission, the capturing device itself contributes salt and pepper noise. During radiotherapy, the size and shape of the tumour can change due to breathing or motion of the organ/patient; this can cause noise in the form of motion blur. [15]

B. Edge Detection
First order edge detectors like Canny, Roberts, Prewitt and Sobel, and the second order edge detector Marr-Hildreth (LoG), are tested to preserve bony anatomical boundaries in EPI. [2, 7] For these edge detectors, CPU running time was measured on a 1.6 GHz - 2.3 GHz Intel Core i5 PC. The Sobel operator is fastest at 0.0959 sec (average), as shown in Fig. 1, though the Canny operator also performs fast with good visual results, making it suitable for real time use.

2) Gamma Correction or Power Law Transformation: Gamma is a non-linear form of brightness adjustment. An improvement in the luminance of images is seen as gamma varies from dark to light tones. For low contrast images, γ < 1 improves the contrast of the image, while γ > 1 reverses the effect and makes the image dark. We inspect results for γ = 0.5 and γ = 2. The results shown indicate that γ < 1 produces better EPI. Mathematically,

s = c · r^γ    (1)

where c and γ are positive constants.
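The power-law transformation s = c·r^γ can be sketched in a few lines. Here c = 1 and pixel intensities are assumed normalised to [0, 1], so γ < 1 raises dark and mid-tone values while γ > 1 suppresses them:

```python
def power_law(pixels, gamma, c=1.0):
    """Apply the power-law (gamma) transformation s = c * r**gamma
    to a list of intensities r normalised to the range [0, 1]."""
    return [c * (r ** gamma) for r in pixels]

row = [0.1, 0.25, 0.5, 0.75]          # one row of normalised pixel values
bright = power_law(row, gamma=0.5)    # gamma < 1: brightens low/mid tones
dark = power_law(row, gamma=2.0)      # gamma > 1: darkens them

assert all(b > r for b, r in zip(bright, row))
assert all(d < r for d, r in zip(dark, row))
```

With γ = 0.5, a pixel at 0.25 maps to 0.5; with γ = 2 it maps to 0.0625, matching the paper's observation that γ < 1 brightens a dark, low-contrast EPI.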
TABLE II. CPU TIME (SEC) FOR 3X3, 5X5, 7X7 MASKS

Figure 8. Chest: Effect of (a) imadjust, (b) logarithmic transformation
Figure 9. Head: Effect of (a) imadjust, (b) logarithmic transformation
Figure 10. Neck: Effect of (a) imadjust, (b) logarithmic transformation
Figure 11. Thorax: Effect of (a) imadjust, (b) logarithmic transformation
Figure 12. Pelvis: (a) original image, (b) CLAHE image, (c) Histogram equalised image
Figure 13. Chest: (a) original image, (b) CLAHE image, (c) Histogram equalised image
Figure 14. Head: (a) original image, (b) CLAHE image, (c) Histogram equalised image
Figure 15. Neck: (a) original image, (b) CLAHE image, (c) Histogram equalised image
Figure 16. Thorax: (a) original image, (b) CLAHE image, (c) Histogram equalised image
Figure 20. Neck: (a) Negation, (b) Solarisation, (c) Proposed Method
7) Image Negation: The negative of an image with grey levels in the range [0, L−1] is obtained by the negative transformation given by the expression

s = L − 1 − r    (3)

8) Solarisation: This operation reverses the image tone and helps in contrast enhancement. It takes the complement of all the pixels in the image whose gray-scale values are less than 128. [3]

Figure 21. Thorax: (a) Negation, (b) Solarisation, (c) Proposed Method

V. PROPOSED METHOD

The basic steps of the proposed hybrid method are:

• DICOM image acquisition
• Normalise it in the range 0 to 1, then scale to 255
• Apply the CLAHE algorithm with clip limit 0.02
• Median filter (non-linear) to reduce noise
• Image sharpening

The results of negation, solarisation and the proposed method are compared in Fig. 17 - Fig. 21.

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

Image enhancement is necessary to provide a better representation of the images. The speed and accuracy of the edge detection process for EPI was observed to be best with the Canny edge detector. The power law transformations gave better results with γ < 1. For dark images with low contrast, better results were obtained with the logarithm transformation. For images with low contrast in gray scale, histogram equalization works better. It is found that CLAHE improves the contrast and also equalizes the image histogram efficiently. We quantify the results using statistical parameters like RMSE and PSNR. PSNR is improved with the CLAHE method in comparison with HE. The proposed method improves the appearance of the EPI details significantly in terms of visual quality and preservation of edges.
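The steps of the hybrid method can be sketched in pure Python on a small grayscale image (a list of lists with values 0-255). This is a simplified illustration, not the authors' implementation: the contrast step here is a global clip-limited histogram equalisation, whereas real CLAHE additionally works on local tiles with interpolation; the 3x3 kernels and the sharpening amount are illustrative choices.

```python
def clip_limited_equalise(img, clip=0.02):
    """Histogram equalisation with a clip limit (simplified, global CLAHE-like step)."""
    flat = [p for row in img for p in row]
    n = len(flat)
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    limit = max(1, int(clip * n))                       # cap each histogram bin
    excess = sum(max(0, h - limit) for h in hist)
    hist = [min(h, limit) + excess // 256 for h in hist]  # redistribute clipped counts
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    lut = [round(255 * c / total) for c in cdf]          # map via the clipped CDF
    return [[lut[p] for p in row] for row in img]

def median3(img):
    """3x3 median filter (non-linear noise reduction)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = sorted(img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = win[4]                           # median of the 9 neighbours
    return out

def sharpen(img, amount=0.5):
    """Unsharp masking: add back a fraction of (image - local mean)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            blur = sum(img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9
            out[y][x] = min(255, max(0, round(img[y][x] + amount * (img[y][x] - blur))))
    return out

def enhance(img):
    """Contrast enhancement, then median denoising, then sharpening."""
    return sharpen(median3(clip_limited_equalise(img, clip=0.02)))
```

A production pipeline would read the 16-bit DICOM pixel data first and normalise it to this 8-bit range, as the step list above describes.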
A. RMSE
RMSE is the root mean square error between the original and the processed image:

RMSE = sqrt( (1/N) Σ (x_i − y_i)² )    (4)

where x_i and y_i are corresponding pixel values of the two images and N is the number of pixels.

TABLE III. RMSE FOR VARIOUS TECHNIQUES

Figure 17. Pelvis: (a) Negation, (b) Solarisation, (c) Proposed Method
Figure 19. Head: (a) Negation, (b) Solarisation, (c) Proposed Method

B. PSNR
PSNR is a measure of denoising and contrast enhancement and is expressed in dB. The higher the PSNR, the better the image quality. When the PSNR reaches or exceeds 40 dB, the two images are indistinguishable.
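Both metrics follow directly from their definitions; a minimal sketch, with PSNR written in the common 20·log10(MAX/RMSE) form and MAX = 255 assumed for 8-bit images:

```python
import math

def rmse(x, y):
    """Root mean square error between two equal-sized images (eq. 4)."""
    n = len(x) * len(x[0])
    se = sum((a - b) ** 2 for rx, ry in zip(x, y) for a, b in zip(rx, ry))
    return math.sqrt(se / n)

def psnr(x, y, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means the images are closer."""
    r = rmse(x, y)
    if r == 0:
        return float("inf")               # identical images
    return 20 * math.log10(max_val / r)

a = [[100, 120], [130, 140]]
b = [[101, 119], [131, 141]]              # every pixel off by 1
assert round(rmse(a, b), 4) == 1.0
assert psnr(a, b) > 40                    # near-identical: above the 40 dB threshold
```

The 2x2 arrays are toy data; in the paper these metrics are computed between the original and enhanced EPI.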
978-1-4673-6524-6/15/$ 31.00
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
Acceleration Factor   CS (RMSE)   CS (Computation time in seconds)
2x                    0.0320      163.76
3x                    0.0312      159.12
4x                    0.0322      154.54
5x                    0.0325      154.26
8x                    0.0495      150.56
10x                   0.1131      146.80

Figure 3. Pictorial effect of 2x, 3x, 4x, 5x, 8x, 10x acceleration factors on brain data by ZFWDC
RMSE = sqrt( (1/N) Σ (x_p − x_r)² )

where x_p and x_r are the ZFWDC/CS reconstructed and the full k-space reconstructed image respectively, and N is the number of elements in x_p and x_r.

Acceleration Factor   ZFWDC (RMSE)   ZFWDC (Computation time in seconds)
2x                    0.0304         160.05
3x                    0.0459         159.32
4x                    0.0581         155.16
5x                    0.0697         154.75
8x                    0.1081         92.715
10x                   0.1483         90.54

2) PSNR (Peak signal to noise ratio)
PSNR = 10 log10( 1 / MSE )

where MSE is the mean squared error between the reconstructed image and the full k-space image, with intensities normalised to [0, 1].

Figure 4. Pictorial effect of 2x, 3x, 4x, 5x, 8x, 10x acceleration factors on brain data by CS

The RMSE value of the CS method is more than that of the zero-filled method for all acceleration factors.

TABLE IV. THE STATISTICAL COMPARISON OF DIFFERENT ACCELERATION FACTORS ON PHANTOM DATA IN COMPRESSED SENSING RECONSTRUCTION
Acceleration Factor   CS (RMSE)   CS (Computation time in seconds)
2x                    0.0517      102.53
3x                    0.0245      119.36
4x                    0.0764      101.14
5x                    0.0919      95.22
8x                    0.1548      92.72
10x                   0.2352      90.95

Figure 7. Pictorial effect of 2x, 3x, 4x, 5x, 8x, 10x acceleration factors on phantom image by CS

The pictorial effect of the acceleration factor on the phantom image is shown for ZFWDC and CS in Fig. 6 and Fig. 7 respectively. The PSNR value is calculated for varying sampling factors, as shown in Fig. 8; for all acceleration factor values the PSNR of CS is higher than that of the ZFWDC reconstruction method.

REFERENCES

[1] M. Lustig, D. Donoho, and J. M. Pauly, "Sparse MRI: The Application of Compressed Sensing for Rapid MR Imaging", Magnetic Resonance in Medicine, vol. 58, pp. 1182-1195, 2007.
[2] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, "Compressed Sensing MRI", IEEE Signal Processing Magazine, vol. 25, 2008, pp. 72-82.
[3] D. L. Donoho, "Compressed Sensing", IEEE Transactions on Information Theory, vol. 52, 2006, pp. 1289-1306.
[4] Amaresha Shridhar Konar, Jain A. Divya, Shamshia Tabassum, Rajagopalan Sundaresan, Julianna Czum, Barjor Gimi, Ramesh Babu D. R., Ramesh Venkatesan
Abstract—Wired systems are complex, heavy, less secure and expensive. Hence, in today's 21st century, wireless technology has been gradually adopted by automobile manufacturers. A vehicle has various control units, which in olden days were connected using a traditional point-to-point wiring architecture; these were later replaced by a CAN bus. This paper uses Wireless CAN (WCAN) to interconnect various control units. This has several important advantages such as system flexibility, message routing, filtering and multicast, together with data consistency. This paper proposes a drive-by-wireless technique for vehicle control and monitor functions using a Wireless Controller Area Network. Traditional hydraulic or mechanical methods of steering, braking and accelerating a vehicle are replaced by the drive-by-wireless technique, and traditional vehicle monitoring methods are likewise performed wirelessly. The algorithm includes Unique Identification Codes, which are sent with all transactions involving wireless communication packets to reduce interference from adjacent drive-by-wireless systems.

Keywords—Drive-by-wireless; WCAN; Vehicle Control and Monitor; Unique Identification Codes

I. INTRODUCTION

Drive-by-wireless techniques replace the mechanical and hydraulic connections between the driver and the associated vehicle actuators with electronic communication systems. These systems transmit electronic messages to direct a vehicle component based on the action taken by the driver of the vehicle, e.g., turning a steering wheel, pressing a brake pedal, or pressing an accelerator pedal [1]. In the past, vehicle bus communication used point-to-point wiring systems, which cause complexity, bulkiness and expense as more electronics and controllers are deployed in vehicles. The abundance of wiring required makes the whole circuit complicated. CAN solves this complexity by using a twisted pair cable that is shared throughout the control system. Not only does it reduce the wiring complexity, it also makes it possible to interconnect several devices using only a single pair of wires while allowing them simultaneous data exchange. WCAN has several important advantages such as system flexibility, message routing, filtering and multicast, together with data consistency [2]. The new WCAN is proposed to exploit the advantages of CAN while still providing wireless access. The rest of the paper is organized as follows: section II outlines the related work and the drive-by-wireless technique, section III describes the block diagram of the system, section IV briefs on the components used, section V presents the circuit diagram of the system, section VI discusses the algorithm, section VII presents the hardware output and section VIII briefs the conclusion.

II. RELATED WORK AND DRIVE-BY-WIRELESS TECHNIQUE

Stähle et al. [1] investigated so-called drive-by-wireless, i.e., using a wireless network to control steering, braking, accelerating and other functions within an automobile. Mary et al. [2] showed that WCAN is suited for real time control applications, giving maximum throughput at minimal latency for an optimized number of nodes. Iturri et al. [11] showed that ZigBee is a viable technology for successfully deploying intra-car wireless sensor networks. Lin et al. [3] proposed an intra-car Wireless Sensor Network (WSN) to eliminate the amount of wiring harness and simplify the wiring structure. Lin et al. [6] evaluated the performance of intra-vehicular wireless sensor networks (IVWSNs) under interference from WiFi and Bluetooth devices. Torbitt et al. [7] analyzed the surface wave hypothesis at different frequencies in intra-vehicular environments. Ahmed et al. [8] investigated the issues around replacing the current wired data links between electronic control units (ECUs) and sensors/switches in a vehicle with wireless links. Lin et al. [9] proposed the use of Bluetooth Low Energy (BLE) and outlined a new architecture for IVWSNs. This paper proposes a drive-by-wireless technique using WCAN.

A. Drive-by-wire System
Drive-by-wire technology in the automotive industry is the use of electrical or electro-mechanical systems for performing vehicle functions traditionally achieved by mechanical linkages. This technology replaces the traditional mechanical control systems with electronic control systems using electromechanical actuators and human-machine interfaces such as pedal and steering feel emulators. The drive-by-wire system used point-to-point communication wiring, as shown in Figure 1; this causes complexity, heaviness and expense.

The Steering, Brake and Accelerator sensors are associated with the Engine Control Unit. The Dashboard Unit contains the LCD, and the D.C. Motor Unit contains a D.C. motor with a motor drive and a temperature sensor. Finally, the Servo Motor Unit contains a servo motor and a level sensor.
IV. COMPONENTS DESCRIPTION

The PIC18F45K22 is the microcontroller used in the project. Circular potentiometers are used for the brake, accelerator and steering inputs. A servo motor and a level sensor are used for the Servo Motor Unit, and the D.C. Motor Unit contains a D.C. motor with a motor drive and a temperature sensor. An LCD display is used for displaying the engine temperature and fuel levels.

B. Potentiometers
A potentiometer is a three-terminal resistor with a sliding contact that forms an adjustable voltage divider; when only two terminals are used (one end and the wiper), it acts as a variable resistor.

Figure 3. Block Diagram
D. DC Motor
A DC motor has a two-wire connection, and all drive power is supplied over these wires. Most DC motors are fairly fast, at about 5000 rpm. The DC motor speed is controlled by a technique called pulse width modulation, or PWM.

Figure 5. D.C. Motor

E. Servo Motor
The function of the servo is to receive a control signal that represents a desired output position of the servo shaft, and to apply power to its DC motor until the shaft turns to that position. It uses a position sensing device to rotate the shaft. The shaft can turn a maximum of about 200 degrees back and forth.

Figure 7. Pressure Sensor

G. Temperature Sensor
The LM35 series are precision integrated-circuit temperature sensors whose output voltage is linearly proportional to the Celsius (centigrade) temperature. The LM35 thus has an advantage over linear temperature sensors calibrated in kelvin, as the user is not required to subtract a large constant voltage from its output to obtain convenient centigrade scaling. The LM35 does not require any external calibration or trimming to provide typical accuracies of ±¼°C at room temperature and ±¾°C over a full −55°C to +150°C temperature range. Low cost is assured by trimming and calibration at the wafer level. The LM35's low output impedance, linear output, and precise inherent calibration make interfacing to readout or control circuitry especially easy. It can be used with single power supplies, or with plus and minus supplies. As it draws only 60 μA from its supply, it has very low self-heating, less than 0.1°C in still air. The LM35 is rated to operate over a −55°C to +150°C temperature range, while the LM35C is rated for a −40°C to +110°C range (−10°C with improved accuracy). The LM35 series is available packaged in hermetic TO-46 transistor packages, while the LM35C, LM35CA, and LM35D are also available in the plastic TO-92 transistor package. The LM35D is also available in an 8-lead surface mount small outline package and a plastic TO-220 package.
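Since the LM35 outputs 10 mV per °C, converting a raw ADC reading into a temperature needs only the ADC reference voltage and resolution. A sketch of that conversion follows; the 10-bit ADC and 5 V reference are assumptions matching a typical PIC configuration, not values stated in the text:

```python
def lm35_celsius(adc_count, vref=5.0, resolution=1024):
    """Convert a raw ADC reading of the LM35 output to degrees Celsius.

    The LM35 produces 10 mV/degC, so temperature = measured volts / 0.010.
    vref and resolution describe the (assumed) 10-bit, 5 V ADC.
    """
    volts = adc_count * vref / resolution   # ADC count -> voltage at the pin
    return volts / 0.010                    # 10 mV per degree Celsius

# an ADC count of 51 is about 0.249 V at the pin, i.e. roughly 25 degC
assert round(lm35_celsius(51)) == 25
```

This is the conversion the Engine Control Unit would apply before forwarding the temperature to the dashboard display.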
Figure 6. Servo Motor

H. CAN MCP2515
The MCP2515 is a stand-alone CAN controller with an SPI interface, in an 18-pin IC.
• Implements CAN V2.0B at 1 Mb/s: 0-8 byte length in the data field; standard and extended data and remote frames
S.SRINATH et al: DRIVE-BY-WIRELESS FOR VEHICLE CONTROL AND MONITOR USING WIRELESS CONTROLLER AREA NETWORK
• Receive buffers, masks and filters: two receive buffers with prioritized message storage, six 29-bit filters and two 29-bit masks
• Data byte filtering on the first two data bytes (applies to standard data frames)
• Three transmit buffers with prioritization and abort features
• High-speed SPI interface (10 MHz): SPI modes 0,0 and 1,1
• One-shot mode ensures message transmission is attempted only one time
• Clock out pin with programmable prescaler: can be used as a clock source for other device(s)
• Start-of-frame signal is available for monitoring the SOF signal: can be used for time-slot-based protocols and/or bus diagnostics to detect early bus degradation

V. CIRCUIT DIAGRAM

The system comprises four control units which communicate with each other using ZigBee over the 802.15.4 protocol. The four modules are the Engine Control Unit, the D.C. Motor Unit, the Servo Motor Unit and the Dashboard Unit. The input 220 V A.C. power supply is converted to 12 V D.C. by an adapter. The various units in the modules require only 5 V D.C. and 3.3 V D.C. supplies, hence a regulator is used for this purpose. The PIC18F45K22 microcontroller is a 40-pin IC with 5 ports: Ports A, B, C and D have 8 pins each, while Port E has 3 pins. The remaining 5 pins are used for MCLR, VDD and ground. The ICSP (In-Circuit Serial Programming) interface is a 5-pin connection used by the PICkit 3 to transfer the program from the computer to the microcontroller. Pin 1 of the ICSP is connected to a high voltage to erase any previous programs, Pin 2 is the clock, Pin 3 is the data, Pin 4 is connected to ground and Pin 5 is connected to VDD.

The Dashboard Module circuit diagram is shown in Figure 8. It consists of a 16x2 LCD display.

Figure 9. The D.C. Motor Module
As shown in Figure 9, Pin 1 of Port C is used for the motor drive circuit while Pin 1 of Port A is used for the pressure sensor.

Figure 10. The Servo Motor Module
As shown in Figure 10, Pin 1 of Port C is connected to the servo motor while Pin 1 of Port A is connected to the temperature sensor.

SP1 and SP2 of the PIC18F45K22 are pins A5, C3, C4, C5 and A6, C3, C4, C5 respectively. A5 and A6 are the Enable pins, C3 is the clock, C4 is the Data Input and C5 is the Data Output. UART1 and UART2 are pins 25, 26 and 29, 30 respectively; 25 and 29 are for transmission while 26 and 30 are for reception. In CAN, CANL is for transmission and CANH is for reception. In ZigBee, Pin 2 is for transmission and Pin 3 is for reception.

Figure 11. The Engine Control Module
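The mask/filter acceptance rule behind the MCP2515's receive filtering (section IV) is simple to state: a received identifier is accepted when it matches some filter in every bit position that the mask marks as relevant. A sketch of that rule, with purely illustrative IDs:

```python
def accepted(rx_id: int, filters, mask: int) -> bool:
    """Return True if rx_id matches any filter on the mask-selected bits.

    A mask bit of 1 means 'this bit must match the filter';
    a mask bit of 0 means 'don't care'.
    """
    return any((rx_id & mask) == (f & mask) for f in filters)

MASK_29BIT = 0x1FFFFFFF          # compare all 29 bits of an extended ID
filters = [0x100, 0x200]         # illustrative acceptance filters

assert accepted(0x100, filters, MASK_29BIT)
assert not accepted(0x101, filters, MASK_29BIT)
assert accepted(0x101, filters, mask=0x1FFFFFF0)   # low 4 bits ignored
```

With six filters and two masks per the feature list, each receive buffer can be dedicated to a subset of the vehicle's message IDs.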
VI. ALGORITHM

Some of the pseudo-code for the various control units is shown below. MPLAB IDE is the development platform used for coding.

tostring(adcvalue1, dispstring);  // convert the ADC reading into a character string
cantx('A');                       // code identifying the transmitting unit precedes the payload
cantx(dispstring[0]);             // then send the reading one character at a time
cantx(dispstring[1]);
cantx(dispstring[2]);
cantx(dispstring[3]);
cantx(dispstring[4]);

Figure 12. The Servo Motor Module
The Servo Motor Module is shown in Figure 12. It consists of a servo motor and a level sensor.

The Engine Control Module is shown in Figure 13. It consists of three sensors, namely the accelerator sensor, the brake sensor and the steering sensor.

The D.C. Motor Module is shown in Figure 15. It consists of a D.C. motor and a temperature sensor.
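The Unique Identification Code scheme from the abstract — prefixing every wireless packet with the sending unit's code so that packets from an adjacent drive-by-wireless system can be rejected — can be sketched as below. The code letters and payload format are illustrative, mirroring the cantx('A') prefix in the pseudo-code above:

```python
MY_SYSTEM_CODES = {"A", "B", "C", "D"}   # illustrative codes for this vehicle's four units

def make_packet(unit_code: str, payload: str) -> str:
    """Tag an outgoing reading with the transmitting unit's identification code."""
    return unit_code + payload

def receive(packet: str):
    """Accept only packets tagged by one of our own units; drop anything
    originating from an adjacent drive-by-wireless system."""
    code, payload = packet[0], packet[1:]
    if code not in MY_SYSTEM_CODES:
        return None                      # interference from a neighbour: ignore
    return code, payload

assert receive(make_packet("A", "01234")) == ("A", "01234")
assert receive("Z9999") is None          # neighbouring system's packet dropped
```

In the real system the same check would run on the microcontroller before a received ZigBee frame is acted upon.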
Abstract—Robotics provides new innovations to the industrial revolution and the more recent information revolution. Inverse kinematics computation is one of the main problems in robotics research, and its complexity increases with the degrees of freedom of the manipulator. Solving the inverse kinematics problem for a robotic manipulator means obtaining the required manipulator joint values for a given desired end point position and orientation. The Artificial Neural Network is an algorithm which works similarly to the biological neuron. In this paper, neural network approaches such as the feed forward and Radial Basis Function algorithms are proposed to solve the inverse kinematics problem of a five degree of freedom robot end effector.

Keywords—Inverse Kinematics, degrees of freedom, feed forward and Radial Basis Function neural network algorithm

INTRODUCTION

The effect is accomplished by assigning a node in the next layer with respect to the nodes in the previous layer. The ANN is a versatile tool and is widely used to solve many problems [15]. This paper proposes neural network approaches such as Feed Forward Backpropagation and the Radial Basis Function (RBF) algorithm. The feed forward neural network is the first and simplest artificial neural network, in which connections between the units do not form back loops and information moves in only a single direction.

A robotic manipulator is composed of several links connected together through joints. Figure 1 gives a view of the general structure of a series manipulator with revolute joints (5 DOF). This explains the relation between the links, joints and the corresponding frames in which they lie.

2. METHODOLOGY

A. Inverse Kinematics
numerical data, and the particular algorithms automatically control and adjust the weights, thresholds and biases of the processing elements. Training means adjusting these parameters to obtain the minimal difference between the ANN output and the targeted output; the network is then said to be trained. A neural network can produce the best possible result by adjusting the input without any redesign of the output conditions. Nowadays certain learning architectures are recommended for training: by giving the appropriate input, the desired result is obtained. The interconnection of the processing elements, the transfer functions of the processing elements, and the learning law control and adjust the performance of the network.

That information flows in only a single direction is one condition. The final neural network output is usually taken from the output layer. The system error is the difference between the target value and the output value, and the network is adjusted to minimise this error. The global optimum is achieved by pursuing the most accurate values of the parameters through a trial and error method. Meanwhile, the learning process cannot assure the global optimum because the network can become trapped in a local optimum. One advantage of adopting this strategy is its simple implementation and low expense. [15]

The network output is

F(x) = Σ_i w_i φ(‖x − c_i‖)

In the case of the Gaussian Radial Basis Function,

φ(‖x − c_i‖) = exp( −‖x − c_i‖² / (2σ²) )    (1)

Fig 7. Performance plot of Feed forward NN using NN.
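Equation (1) and the network output F(x) = Σ w_i φ(‖x − c_i‖) can be sketched directly. The centres, widths and weights below are illustrative; in the paper's setting they would be fitted so that F maps end-effector poses to joint values:

```python
import math

def gaussian_rbf(x, c, sigma):
    """phi(||x - c||) = exp(-||x - c||^2 / (2 sigma^2)) -- eq. (1)."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2 * sigma ** 2))

def rbf_output(x, centres, weights, sigma):
    """Network output F(x) = sum_i w_i * phi(||x - c_i||)."""
    return sum(w * gaussian_rbf(x, c, sigma) for w, c in zip(weights, centres))

centres = [(0.0, 0.0), (1.0, 1.0)]      # illustrative RBF centres
weights = [0.5, 2.0]                    # illustrative output weights

# at the first centre, phi = 1 there and exp(-1) at the other centre
y = rbf_output((0.0, 0.0), centres, weights, sigma=1.0)
assert abs(y - (0.5 * 1.0 + 2.0 * math.exp(-1.0))) < 1e-12
```

Training an RBF network then reduces to choosing the centres c_i and solving for the weights w_i, which is what makes it attractive for the inverse kinematics mapping discussed here.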
REFERENCES

[2] Xiaojun Li, Yide Ma, Xiaowen Feng, "Self-adaptive autowave pulse-coupled neural network for shortest-path problem", 2013.
vol. 2 (2011), International Conference on Computer Engineering and Applications, IPCSIT.
[5] Rasit Koker, Cemil Oz, Tarik Cakar, Huseyin Ekiz, "A study of neural network based inverse kinematics solution for a three-joint robot", 2004.
[6] Daniel Tarnita, Dan B. Marghitu, "Analysis of a hand arm system", Robotics and Computer-Integrated Manufacturing, vol. 29, pp. 493-501, 2013.
[16] www.mattlavery.com/photographybtqe/jointed-arm-robot
[17] Arun, Harish, G. Salomon, K. Saravanan, R. Kalpana, K. Jaya, "Neural networks and genetic algorithm based intelligent robot for face recognition and obstacle avoidance", 2013, International Conference on Current Trends in Engineering and Technology (ICCTET).
[18] Dominik Gront and Andrzej Kolinski, "Efficient scheme for optimization of parallel tempering Monte Carlo method".
Abstract—The design of a low profile, compact, wide beam and wideband linear tapered slot antenna with a stripline feed, meant for radar applications in X-band, is presented. The aim of this study is to investigate the effects of changes in the physical parameters of linear tapered slot antennas on their radiation characteristics. The radiation characteristics of linearly tapered slot antennas with different opening rate (aperture area) parameters are compared. It is observed that an increased opening rate of the linear tapered slot antenna exhibits wider beamwidth with lower sidelobe level. Stripline-fed Vivaldi antennas are comprised of: 1) a stripline-to-slotline transition; 2) a stripline stub and a slotline cavity; and 3) a tapered slot. The effects of the aperture height of the slot on the radiation characteristics, return loss, beamwidth and scan angle are investigated for different opening rates of the slot, through the use of the commercially available electromagnetic simulation software HFSS by ANSYS.

Index Terms—Tapered slot antenna, radar, stripline, slotline, Vivaldi antenna, wideband.

I. INTRODUCTION

Phased array antenna systems can be used in numerous applications, one of the oldest being radar systems. The first phased array radar system dates back to the Second World War, and today phased array radar systems are increasingly used on naval ships and aircraft. Modern phased array radar systems can perform several tasks simultaneously, like keeping track of ground and air targets while at the same time communicating with other units. The modern shared aperture radar concept explored the development of multi-function wideband arrays capable of simultaneous and time-interleaved radar, electronic warfare, and communications functions. This necessitated frequency independent wideband antennas. The term frequency independent means that the antenna pattern and impedance remain constant over a relatively wide frequency bandwidth. The stripline-fed tapered slot antenna (TSA) array was introduced by Lewis in 1974 [1], and its potential for wideband and wide-scan arrays makes it a prime candidate for high-performance phased-array systems.

A tapered slot antenna uses a slot line etched on a dielectric material, which widens through its length to produce endfire radiation. An electromagnetic (EM) wave propagates along the surface of the antenna substrate with a velocity less than the speed of light, which gives TSAs slow wave antenna properties. The EM wave moves along the increasingly separated metallization tapers until the separation is such that the wave detaches from the antenna structure and radiates into free space from the substrate end. The E-plane of the antenna is the plane containing the electric field vectors of the radiated electromagnetic waves. For TSAs, this is parallel to the substrate, since the electric field is established between the two conductors that are separated by the tapered slot. The H-plane, containing the magnetic component of the radiated EM wave, runs perpendicular to the substrate.

TSAs have moderately high directivity (on the order of 10-17 dB) and narrow beamwidth because of their traveling wave properties and almost symmetric E-plane and H-plane radiation patterns over a wide frequency band, as long as antenna parameters like shape, total length, dielectric thickness and dielectric constant are chosen properly. Other important advantages of TSAs are that they exhibit broadband operation, low sidelobes, planar footprints and ease of fabrication. A TSA can have a large bandwidth if it exhibits a good match both at the input side (transition from the feed line to the slot line) and the radiation side (transition from the antenna to free space). The gain of a TSA is proportional to the length of the antenna in terms of wavelength. Tapered slot antennas are also suitable for use at high operating frequencies (greater than 10 GHz), where a long electrical length corresponds to a considerably short geometrical length. The main disadvantage of the TSA is that only linear polarization can be obtained with conventional geometries.

Tapered slot antennas are strong candidates for use in Ultra Wideband Air Borne (UWAB) technology. These antennas offer a wide bandwidth, significant gain and symmetric patterns in both co-polarization and cross-polarization. TSAs are efficient and light weight. In addition, TSAs are appreciably simple in geometry, making them more advantageous. The most commonly used class of TSA in Ultra Wide Band (UWB) technology is the Vivaldi antenna. The Vivaldi antenna, first introduced by Gibson [2] in 1979, has an exponentially tapered slotline. As a member of the TSA class, the Vivaldi antenna provides broad bandwidth, low cross polarization and directive propagation at microwave frequencies. Vivaldi antennas are low cost, easy to fabricate and fairly insensitive to dimensional tolerances in the fabrication process due to the printed circuit technology used for their construction. Moreover, Vivaldi arrays are small in size and low in weight, enabling compact arrays. It should also be noted that the beamwidth and directivity of a Vivaldi antenna can be considerably improved by varying the design parameters [3], [10].
Active Electronically Scanned Array (AESA) antenna impedance is matched to the impedance of free space
requires active array elements employing wideband and wide [4],[14],[15].
beam antenna elements like dipoles, linearly tapered slot
antenna, exponentially tapered slot antenna or vivaldi antenna,
constant slot width antenna, doubly layered exponentially
tapered slot antenna, waveguide slot antenna, micro strip
Bottom layer
antenna, helices and spiral antenna [5]. The tapered slot
element is a broad band element that has good scan angle
Top layer
performance, but it is bulky. Waveguide aperture elements are
narrow band and capable of wide angle performance. The
fragmented aperture antenna is thin wide-band antenna but Stripline feed
lacks of wide-angle performance [6],[8],[9].
A tapered slot antenna has a slotline flare from a small gap Dielectric substrate 1
to a large opening, matching to free space wave impedance.
TSA is larger than a half wavelength to achieve the desired Dielectric substrate 2
performance. TSAs have moderately high directivity and
narrow beamwidth because of the travelling wave properties
and almost symmetric E-plane and H-plane radiation patterns
over a wide frequency band as long as antenna parameters like
shape, total length, dielectric thickness and dielectric constant
are chosen properly. Other important advantages of TSAs are
Fig. 1 Exploded view of stripline-feed Linear Tapered slot antenna
that they exhibit broadband operation, low sidelobes and ease
of fabrication [7] [11].
Designing a wideband scanning array is also difficult. Typically, the element spacing should be less than one-half of the free-space wavelength at the highest frequency to avoid grating lobes. In the case of a 5:1 bandwidth array, the element spacing may therefore be less than one-tenth of the free-space wavelength at the lowest frequency. Mutual coupling between the radiating elements may be quite large, and this coupling may cause scan blindness and/or anomalies within the desired bandwidth and scan volume [12], [13].
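The half-wavelength spacing rule above is a one-line calculation; a minimal sketch (the 2-10 GHz band below is an assumed example of a 5:1 bandwidth, not a figure from the paper):

```python
# Grating-lobe element-spacing check for a wideband scanning array.
C = 3e8  # speed of light, m/s

def max_spacing(f_max_hz):
    """Half the free-space wavelength at the highest operating frequency."""
    return C / (2.0 * f_max_hz)

f_low, f_high = 2e9, 10e9      # assumed 5:1 bandwidth
d = max_spacing(f_high)        # maximum element spacing, metres
lam_low = C / f_low            # free-space wavelength at the lowest frequency
print(d * 1e3)                 # 15.0 (mm)
print(d / lam_low)             # 0.1 -> one-tenth wavelength at f_low
```

This reproduces the statement in the text: for a 5:1 band, a spacing chosen at the top of the band is only a tenth of a wavelength at the bottom of the band.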
II. DESIGN PARAMETERS OF A LINEAR TAPERED SLOT ANTENNA

The stripline/slotline transition is specified by WST (stripline width) and WSL (slotline width). The exponential taper profile is defined by the opening rate R and two points P1(z1, y1) and P2(z2, y2):

y = c1·e^(R·z) + c2                                   (1)

where

c1 = (y2 − y1) / (e^(R·z2) − e^(R·z1))
c2 = (y1·e^(R·z2) − y2·e^(R·z1)) / (e^(R·z2) − e^(R·z1))

978-1-4673-6524-6/15/$31.00
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
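The exponential taper of equation (1) can be evaluated directly from the two defining points; a minimal sketch (the numeric dimensions in the example call are illustrative assumptions, not design values from the paper):

```python
import math

def taper(z, z1, y1, z2, y2, R):
    """Exponential taper y = c1*exp(R*z) + c2 through P1(z1,y1) and P2(z2,y2)."""
    e1, e2 = math.exp(R * z1), math.exp(R * z2)
    c1 = (y2 - y1) / (e2 - e1)
    c2 = (y1 * e2 - y2 * e1) / (e2 - e1)
    return c1 * math.exp(R * z) + c2

# By construction the curve passes exactly through both defining points,
# so taper(z1, ...) == y1 and taper(z2, ...) == y2.
half_width = taper(7.1, 0.0, 0.3, 14.2, 12.5, 0.25)  # slot half-width mid-taper
```

Substituting z = z1 and z = z2 into (1) with these c1 and c2 recovers y1 and y2, which is the check used in the assertions.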
[Simulated S(1,1) and radiation-pattern plots; only the captions, marker data and accompanying text are recoverable.]

Fig. 6 Simulated radiation pattern: polar plots of LTSA with H = 8.5 mm in two principal planes at 10 GHz.
Fig. 7 Simulated radiation pattern: 3-D polar plot of LTSA with H = 8.5 mm.
Fig. 10 Simulated radiation pattern plot of LTSA in E-plane and H-plane.
Fig. 11 Simulated radiation pattern: 3-D polar plot of LTSA with H = 10.5 mm.
Fig. 12 Geometry of LTSA with slot aperture height H = 12.5 mm, R = 0, L = 14.2 mm.
Fig. 14 Simulated radiation pattern plot of LTSA in E-plane and H-plane with aperture height H = 12.5 mm.

(S(1,1) markers: m1 at 11.1013 GHz, −10.0632 dB; m2 at 13.8000 GHz, −10.5536 dB; m3 at 12.4000 GHz, −36.2690 dB.)

... 3-dB beamwidth (HPBW) greater than 90°. Fig. 11 shows the 3-D polar plot of the LTSA with a gain of 4.84 dB over the X-band operating frequency. ... over the X-band frequency is less than −10 dB. The simulated radiation patterns of an active element are shown in Fig. 14.
IV. CONCLUSION

... Lecture Notes in Electrical Engineering, vol. 326, pp. 1489-1496, Springer India, 2015.

... in 2007, and is presently working as a Professor in the Department of Electrical Engineering, Noorul Islam University, India.

Babu Saraswathi K. Lekshmi was born in India. She received the Bachelor of Electrical and Electronics Engineering degree with distinction from Manonmaniam Sundaranar University, India, in 2001. She received the Master of Engineering degree with distinction in the field of Power Electronics and Drives from Anna University, India, in 2005. Currently, she is a research scholar pursuing a Ph.D. at Noorul Islam University in the area of phased arrays for airborne radar applications.
Abstract—The term Big Data refers to huge, complex and heterogeneous data. Based on the HACE characteristics of Big Data (Heterogeneous, Autonomous, Complex and Evolving associations), many algorithms have been proposed. Hadoop is an open-source framework used extensively for distributed storage and processing. The Hadoop framework provides parallel, distributed data-processing standards which increase the overall computational power and reduce processing time. But choosing the right component for a given requirement is an important task: it helps optimize the overall performance of the data analysis irrespective of data volume. Here we describe the Hadoop technology stack and its optimal usage for analyzing various data sources, especially social data.

Index Terms—Big Data, Hadoop, Hive, Spark, HACE

I. INTRODUCTION

In today's internet world, the volume of data grows exponentially. It is very difficult to handle this massive growth using traditional systems. The solution for handling this type of data is to choose a distributed system. Migrating to a distributed system may sound quite expensive, but the system configuration required for nodes in the cluster is very low when compared to a single server machine. Hadoop is an open-source framework that can be integrated with commodity hardware clusters to efficiently store, process and analyze Big Data. An important note is that Hadoop is optimal mainly for batch processing; until an enhancement to Hadoop is developed, relational databases are preferred for transactional data.

II. DATA MINING IN BIG DATA

Data mining is the process of extracting information and patterns from the underlying data. When this process is deployed on a huge volume of data, the task takes a long time to complete, since the speed is proportional to the processing speed of the server. Big Data is data with size in the range of hundreds of terabytes. If the traditional method is used to mine data from Big Data, the processing speed will be too low and it takes days to complete one data mining process. Hence the data mining process can be carried out in a distributed environment where it is processed in parallel; the data is then mined within hours of submitting the data mining process.

For data to be analyzed using the Hadoop stack, the data needs to be ingested into the Hadoop Distributed File System (HDFS). Once the data is moved to HDFS, the data is ready for processing. HDFS is a block-structured file system where files are stored as blocks and are replicated across the cluster. The main advantage of a Big Data platform is that it can process data imported from any kind of source and of any type: text, images, videos, etc. In addition to HDFS, data can be persisted to external platforms such as Microsoft Azure, Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3). Data from these data centers is available for direct access from the Hadoop cluster.

III. BIG DATA CHARACTERISTICS

A. Volume

The main characteristic of Big Data is its huge volume. This is an important factor in determining the value and potential of the data under consideration. In simple words, it is the feature which determines whether the current data belongs to the Big Data group.

B. Variety

Variety is the feature that represents the capability of Big Data to handle heterogeneous data. This helps users effectively manage data that is spread across different data centres.

C. Velocity

Velocity is defined as the rate at which the data is completely processed. Since multiple commodity machines work in parallel, the processing speed is proportional to the cluster size.

IV. SOCIAL DATA ANALYSIS

The major social data generators are Facebook, Twitter, Google, Yahoo, Instagram, etc. They are the best examples of sources that are autonomous and provide heterogeneous data for analysis. Here we describe the various Hadoop ecosystem components that can be used for optimized data mining. In addition to these properties, these sources generate a huge
volume of data, as many as millions of records in less than an hour.

A. Data Ingestion

Data ingestion is the process of importing data into the Hadoop Distributed File System (HDFS) from external sources. An external source can be a relational database or a flat file. Apache Sqoop is used to import data from a relational database into the Hadoop file system; Sqoop provides compatible drivers and connectors for most of the major relational databases. It is a good idea to denormalise the data before it is persisted to HDFS. Hadoop is designed to process huge, incomplete and duplicate data, so denormalization improves the performance of the process by reducing the overhead of join operations.

Since our main requirement is to analyze social data, which is mostly in log-file or flat-file format, the Hadoop tool to be used is Apache Flume. Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming event data. The architecture of Flume is as follows:

The client initiates a job submission by getting a job ID from the job tracker. All the required files, such as property files and configuration files, are placed on HDFS. The job tracker handles job submission, resource allocation and job monitoring: it splits the job into multiple tasks and distributes them to task trackers, and each map task accesses files from HDFS for the execution of the job. The advantages of MapReduce are its scale-out architecture, security and authentication, resource manager and optimized scheduling. The major use cases for MapReduce are data analytics, parallel processing, data locality, data integrity and data mining. The architecture of the MapReduce framework is as follows:
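The map/shuffle/reduce flow that the job tracker coordinates can be illustrated with a minimal word-count sketch (a pure-Python stand-in for a Hadoop Streaming job; the function names are ours, not from the paper):

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    # Reducer: sum counts per key (the shuffle step groups keys together).
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

lines = ["big data needs Hadoop", "Hadoop processes big data"]
counts = reduce_phase(chain.from_iterable(map_phase(l) for l in lines))
print(counts["big"], counts["hadoop"])  # 2 2
```

In a real cluster the mapper and reducer run on different task trackers and the framework performs the grouping; the per-record logic is the same.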
Abstract—Internet-of-Things (IoT) is the expansion of internet services, and its applications are increasing. Use of new technologies in the IoT environment is growing rapidly; IoT has already been developed for Industrial Wireless Sensor Networks (WSN). A smart home is also one of the applications of IoT. Rapid growth in technologies and improvements in architecture bring out many problems: how to manage and control the whole system, security at the server, security in smart homes, etc. This paper presents the architecture of IoT. Smart homes are those where household devices/home appliances can be monitored and controlled remotely. When these household devices in smart homes connect to the internet using a proper network architecture and standard protocols, the whole system can be called a Smart Home in an IoT environment, or an IoT-based Smart Home. Smart homes ease the home automation task. This paper presents not only the problems and challenges that arise in IoT and smart-home systems using IoT, but also some solutions that would help to overcome some of these problems and challenges.

I. INTRODUCTION

The internet has changed human life by providing anytime, anywhere connectivity with anyone. As technology has advanced, sensors, processors, transmitters, receivers, etc. are now available at a very cheap rate, and all of these things can be used in our day-to-day life [4]. The Internet of Things can be seen as the expansion of internet services [1]. Today's internet is now expanding towards the Internet of Things (IoT).

1.1 Internet-of-Things: When the existing internet of computer systems is connected to real-world objects or things (which may include any objects, home appliances, devices, vehicles, etc.), and these things connect to the internet in a specific infrastructure via standard protocols, the whole system is said to be the Internet of Things (IoT) [1-4].

1.2 Things: Things may be real or virtual, moving or steady, but things will be active participants in the whole system. Things communicate with each other, which is called things-to-things communication. Things are also able to communicate or interact with humans, which is called things-to-human communication [4].

However, the Internet of Things is not just a deep vision for the future. It is already here and is having an impact on more than just technological development. These things and communicating objects, which communicate via the internet, can configure themselves independently and can operate without human intervention [3]. Figure 1 shows the architecture of IoT [5].

Fig. 1. Architecture of IoT [5]

1.3 Smart Home: A smart home is a home or living environment having technology that allows all the household devices/home appliances to be controlled automatically and remotely [8]. In smart homes the user can easily monitor and control all home devices/home appliances through the internet. Home
PRANAY PRASHANT GAIKWAD et al: A SURVEY BASED ON SMART HOMES SYSTEM USING INTERNET-OF-THINGS
appliances connect in a predefined, proper network architecture using standard protocols. The basic idea for smart homes using IoT is shown in Figure 2 [2].

The whole system can be divided into two parts: one part consists of all the home devices, the switch modules and the RF transmitter/receiver, and the second part includes the interface device, processor, data collector and GPRS module that communicates with the internet.

Fig. 2. Basic idea for Smart Home System using IoT [2]

In this paper only four household devices are considered: light, fan, television and gas outlet. But in reality the user can connect any number of devices. All household devices connect to switch modules. A switch module may contain any type of module which changes its state when it receives a signal. The switch module is connected to the device in such a way that when the module changes state, the state of the household device connected to it also changes [2] [4] [8] [14]. Relays can be used as switch modules. A relay is an electromagnetic device, normally called a relay switch, which isolates two circuits electrically and connects them magnetically. A basic relay has three contacts: normally open (NO), normally closed (NC), and common (COM). In the idle condition, when the household device is not in working mode, COM rests on one contact; when the relay receives a signal it switches over to the other contact and the device goes into working state [9]. Switch modules connect to the smart central controller through RF transceivers. Each switch module may have its own transceiver, or one transceiver can be connected to all switch modules. Each switch module and device is identified by assigning a unique identity to it. One RF transceiver is connected at the smart central controller. The RF modules communicate with each other at 433 MHz; the 433 MHz spectrum is specifically allocated for such RF communication [2] [4]. The smart central controller acts as the interface device between the household devices and the internet server. The smart central controller will not be a single device: it will be a set of devices such as a microcontroller, CPLD processor, RF transceiver, and GPRS or Zigbee module. The microcontroller can be used as the main controller and for data processing. Data acquisition can easily be done by the microcontroller, hence it can act as the interface device [5] [8].

II. RELATED WORK AND METHODOLOGIES USED

A layered architecture of the IoT-based Smart Home System is described by Kang Bing et al. in [8]. The smart home system is divided into three layers: application layer, network layer, and sensing layer. Starting from the bottom, the sensing layer is responsible for data collection from all the home appliances; it sends data to the middle layer, the network layer. The network layer uses the internet to send data to the uppermost application layer, which hosts different applications at different levels for different purposes. For data collection and data processing at the sensing layer it uses the SAMSUNG S3C2440A microprocessor, which is a type of ARM processor [8]. To transfer the collected data to the network layer it uses a Zigbee module, which is based on the IEEE 802.15.4 wireless standard [7-8].

Fig. 3. Layered architecture of the IoT-based Smart Home System [8]

A reconfigurable smart sensor interface device that integrates data collection, data processing, and wired and wireless transmission together has already been designed for industrial Wireless Sensor Networks (WSN) in an IoT environment using a CPLD by Qingping Chi et al. [5].
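The switch-module behaviour described above (a uniquely addressed module that toggles its household device when an RF signal arrives) can be sketched as a tiny state machine; the class and message names below are our assumptions for illustration, not from the paper:

```python
class SwitchModule:
    """Toggles one household device when an addressed RF signal arrives."""

    def __init__(self, module_id):
        self.module_id = module_id  # unique identity assigned to module/device
        self.device_on = False      # idle state: relay de-energized, device off

    def on_rf_signal(self, target_id):
        # All modules hear the 433 MHz broadcast; only the addressed one reacts.
        if target_id == self.module_id:
            self.device_on = not self.device_on

fan = SwitchModule("FAN-01")
fan.on_rf_signal("TV-01")   # not addressed to us: ignored
fan.on_rf_signal("FAN-01")  # addressed: relay switches, fan turns on
```

The smart central controller would hold one such object per registered device and translate commands received over GPRS/Zigbee into these addressed RF signals.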
Abstract—Growth in meter reading has led to Automatic Meter Reading (AMR) systems. There are various wire-based AMR systems, such as Power Line Carrier (PLC) and telephone-line networks (optical or cable), and wireless AMR systems such as e-metering based on GPRS, Bluetooth and GSM, which have been increasing rapidly in recent years together with real-time billing data management via online connection. Data collected from many users creates a huge demand for storing, managing and processing on various streams. An electric energy meter designed for long-distance data transfer based upon GSM/GPRS cannot be implemented so easily, because regular use of GSM/GPRS is still a dream for the common man. A GSM/GPRS-based energy meter with instant billing facility is efficient, but the problem of missing SMS degrades its accuracy and performance. We present a more reliable and user-friendly system by creating a web portal for multiple access using the Visual Studio .NET framework, which manages the data efficiently even if SMS are lost. This makes the design different from previous proposals and also increases the throughput. The GSM/GPRS channel is a very useful mode of communication, as sending data by SMS/Email turns out to be a very handy tool due to its good area coverage, user friendliness and cost effectiveness.

Keywords—Automatic Meter Reading System (AMRS); GSM; PIC; Short Messaging System (SMS); Visual Studio .NET

I. INTRODUCTION

We cannot imagine human life without electrical power, because human survival and progress depend totally on it. Electrification provides opportunities for new and more efficient metering technologies to be implemented and for future residential development needs. The Power Line Communication (PLC) previously used for data communication has a number of limitations, such as cost of installation, complexity of the network, and maintenance. Different types of services require different frequency bands, while the power line operates at 50 Hz and is unable to support higher frequency bands. The power line is exposed, so losses increase; on the other hand, interference between different channels is also a big problem. There are also some general problems: meter reading is highly person-dependent, human errors cannot be avoided, accessibility in rural zones is poor, and monthly billing and its processing take excess time. The Digital Telewattmeter System is an example of a microprocessor-based meter; it was designed to transmit data on a monthly basis to a remote central office through a dedicated telephone line and a pair of modems.

Hence there is increased demand for Wireless Automatic Meter Reading (WAMR) systems, which automatically collect consumption, diagnostic and status data from metering devices and transfer that data to a central database for billing, troubleshooting and processing. This mainly reduces human effort as well as manual errors, provides correct real-time consumption, and allows remote power switch-on/off, which reduces the required time and increases throughput.

This paper presents an implementation methodology for a wireless automatic meter reading system (WAMRS) incorporating the widely used GSM/GPRS [1][5] network. In many countries the GSM and GPRS network is widely known for its vast coverage area, cost effectiveness and competitive, ever-growing market. The system includes a microcontroller which transmits the power consumption values periodically, via an existing GSM/GPRS [1][5][8] network, to a master station. To maintain transparency between consumer and company, we suggest a method where we utilize telecommunication systems for automated transmission of data, to facilitate bill generation at the server end and delivery to the customer via SMS and Email. A new interactive, user-friendly graphical user interface is developed using the Microsoft Visual Studio .NET framework. With proper authentication, users can access the developed web page from anywhere in the world. It also serves the goal of load management and power demand control, and maintains transparency by sending the monthly consumption to the user by SMS; the user can also check his latest usage at any place and any time through the web portal.

In this paper, following a brief introduction, the latest Automatic Meter Reading Systems are described and summarized. The existing problems and future research directions are also discussed.

II. LITERATURE SURVEY AND RELATED WORKS

2. METHODOLOGIES

A. Power Line Communication Based System
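The SMS-based billing flow described in the introduction (the meter sends a periodic reading; the server turns it into a bill) can be sketched as follows. The SMS field format and the flat tariff below are our assumptions for illustration, not the format used in the paper:

```python
def parse_meter_sms(text):
    """Parse an SMS payload like 'ID=MTR042;KWH=231.5' (assumed format)."""
    fields = dict(part.split("=") for part in text.split(";"))
    return fields["ID"], float(fields["KWH"])

def monthly_bill(prev_kwh, curr_kwh, rate_per_kwh):
    """Bill = consumption since the previous reading times a flat tariff."""
    return (curr_kwh - prev_kwh) * rate_per_kwh

meter_id, reading = parse_meter_sms("ID=MTR042;KWH=231.5")
print(meter_id, monthly_bill(150.0, reading, 5.0))  # MTR042 407.5
```

Because the cumulative reading is carried in every message, a lost SMS does not corrupt the bill: the next message still allows the server to compute the full consumption since the last stored reading, which is the robustness property the paper's web-portal design relies on.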
Abstract—MANET (Mobile Adhoc Network) is vulnerable to the Sybil attack as well as the masquerading attack. A Sybil attack is an attack in which a malicious node masquerades as several different nodes, called Sybil nodes, simultaneously creating a problem in the functioning of the network. Such attacks may cause damage on a fairly large scale, especially when they are difficult to detect. Alongside this, the masquerade attack is also a serious attack: it uses a fake identity, such as a network identity, to gain unauthorized access to personal computer information through legitimate access identification. Once the attackers gain access, they can get into all of the organization's critical data and can delete or modify it, steal sensitive data, or alter routing information and network configuration. In this paper, we discuss the different kinds of Sybil attacks and masquerading attacks. In addition, various methods that have been suggested over time to decrease or eliminate the risk of both attacks are also analyzed.

Index Terms—Identity-based attack, received signal strength (RSS), sensor network, masquerade attack, Sybil attack

I. INTRODUCTION

1.1 Masquerade Attack: A masquerade attack is an attack that uses a fake identity, such as a network identity, to gain unauthorized or fraudulent access to personal computer information through legitimate access identification. If an authorization process is not fully secured or protected, it can become extremely harmful for the system [1][3].

1.2 Masquerading attacks can happen in several ways: In the first case [3][4], the masquerading attack happens within an organization (i.e., an insider attack): a masquerade attacker gains access to the account of an authorized user either by stealing the authorized user's identity (i.e., account ID) and password, or by using a key logger. Secondly, a common way is by taking advantage of an authorized user's laziness or trust. In the second case, the identity thief must log in with the victim's credentials and begin issuing commands within the bounds of one user session. In either case, a monitoring system ought to detect any significant deviations from a user's typical profiled behaviors in order to detect a likely masquerade attack. Ideally, we seek to detect a possible masquerader at any time during a session.

1.3 Sybil Attack: As wireless technology [1] improves methods of communication and reduces wired-network overheads, it also opens a wide spectrum for hackers and attackers to breach security. Wireless communication happens through open air, which makes it easier to fetch information from the air medium using sniffing software tools, and an attacker can easily disturb the signals using wireless signal jammers and repeaters. Sybil is a well-known attack on wireless networks, and it works in different scenarios to attack a wireless network: signal disturbance, fake identities, and false access gaining.

The Sybil attack [1][5] is a particularly harmful threat to sensor networks, where a single sensor node illegitimately claims multiple identities. A malicious node may generate an arbitrary number of additional node identities using only one physical device. The Sybil attack can disrupt normal functioning of the sensor network, such as multipath routing, which is used to explore multiple disjoint paths between source-destination pairs: the Sybil attack can disrupt it when a single adversary presents multiple identities which appear on the multiple paths.

1.4 Specific types of Sybil attacks: There are several malicious applications of Sybil attacks in different environments [2].

1.4.1 Routing: Sybil attacks can disrupt routing protocols in ad hoc networks, especially the multicast routing mechanism. Separate paths that initially provided
Sybil nodes. Another vulnerable concept is geographical routing, where malicious nodes may appear at more than one place at a time. An attack in an ad hoc network, and thus the availability of fake identities, may further lead to a large-scale attack such as a distributed DoS, and create problems in the routing protocols of such networks.

1.4.2 Tampering with Voting and Reputation Systems: In any environment where there is a voting scheme in place for purposes such as reporting and identifying node misbehavior, with reputation scores updated repeatedly, a Sybil attack may be particularly harmful.

1.4.3 Fair resource allocation: Sybil attacks may also be used to enable the attacker to obtain an unfair, disproportionate share of the resources that should be shared equally among all nodes on the network. This attack prevents legitimate nodes from taking their share of resources and also gives the malicious node the means for other attacks.

1.4.4 Distributed Storage: File storage systems in peer-to-peer and wireless sensor networks can be compromised by the Sybil attack. This is achieved by defeating the fragmentation and replication processes in the file system: a system can be tricked into storing data into the multiple Sybil identities of the same node on the network.

1.4.5 Data Aggregation: To conserve energy, sensor network readings are computed by query protocols in the network rather than returning the reading of each individual sensor. Sybil identities may be able to report incorrect sensor readings, thereby influencing the overall computed aggregate. A malicious user with enough identities may be able to significantly alter the aggregate.

II. METHODS PROPOSED TO COUNTER SYBIL ATTACKS

There is no general, universally accepted solution to the Sybil attack, but a number of approaches for various combinations of environments and attacks have been proposed. Some methods minimize the threat level of these attacks in a system to a satisfactory minimum without incurring an appreciable performance overhead, although they will not completely eliminate the possibility of the attack occurring. Notable techniques to counter Sybil attacks are as follows.

2.1 Trusted Certification: Certification is the most frequently used solution to defeating Sybil attacks. It involves the presence of a trusted certifying authority (CA) [1][2] that validates the one-to-one correspondence between an entity on the network and its associated identity. This centralized CA thus eliminates the problem of establishing a trust relationship between two communicating nodes. Douceur has proven that this kind of certification is the only method that may potentially eliminate Sybil attacks completely, so this approach looks like the ideal method to tackle these attacks.

Disadvantages / Limitations:
Significant performance overhead and expense

Application Domain:
General

2.2 Resource Testing: Resource testing [1][2] is the most commonly implemented solution to eliminate Sybil attacks. The basic principle is that the computing resources of each entity on the network are limited. A verifier then checks whether each identity has as many resources as the single physical device it is associated with. Storage, computation and communication were initially proposed as resources. In a wireless sensor network, an attacker might have storage and computation resources in large capacities compared to legitimate sensor nodes; alternatively, the verification messages for verifying communication resources might flood the entire system itself. Hence, all three are inadequate choices for sensor networks. Radio resource testing is an extension of the resource-testing verification method for wireless sensor networks. The key assumptions of this approach are that any physical device has only one radio and that this radio is incapable of transmitting and receiving messages on more than one channel at any given time. Resource tests have been suggested by many as a minimal defense against Sybil attacks, where the goal is to reduce their risk substantially rather than to eliminate it altogether.

Disadvantages / Limitations:
Ineffective for most systems

Application Domain:
General

2.3 Recurring Costs: This method [1][2] is a variation of resource testing in which resource tests are conducted after specific time intervals, to impose a certain "cost" on the attacker for every identity that he controls or introduces into the network. A number of researchers who have endorsed this method have used computational power in their resource test. In this method, an economic model proposes a critical value that exists for a particular combination of
application domain and attacker objective. An attack is that will reveal a Sybil node. No physical tokens are
reduces successful only if ratio of the attacker’s objective required such as radios and clock skews unlike other
value to the cost per identity exceeds this critical value. Sybil detection approaches.
They conclude that using recurring costs or fees per Disadvantages / Limitations:
identity is more effective to avoid Sybil attacks than a May encourage Sybil attackers that have no
one-time resource test. interest in subverting the application protocols,
but that are interested in being paid to reveal
their presence
Disadvantages / Limitations:
Requires the use of electronic cash or of Application Domain:
significant human effort General
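The economic argument behind recurring costs can be sketched as follows. An attack pays off only when the attacker's objective value divided by the per-identity cost exceeds the critical value; a recurring fee raises the effective per-identity cost with every interval the identities must stay alive. All function names and numbers below are illustrative assumptions, not values from the cited work.

```python
# Sketch of the recurring-cost criterion described above.

def attack_is_profitable(objective_value: float,
                         cost_per_identity: float,
                         critical_value: float) -> bool:
    """An attack succeeds only if objective value / cost per identity
    exceeds the critical value for this application domain."""
    return objective_value / cost_per_identity > critical_value

def recurring_cost(fee_per_interval: float, intervals: int) -> float:
    """Unlike a one-time resource test, the fee accrues every interval."""
    return fee_per_interval * intervals

one_time = attack_is_profitable(1000.0, 5.0, 150.0)
# 1000 / 5 = 200 > 150 -> the one-time cost does not deter the attack
recurring = attack_is_profitable(1000.0, recurring_cost(5.0, 12), 150.0)
# 1000 / 60 ~= 16.7 < 150 -> the recurring fee makes the attack unprofitable
```

This illustrates why the authors conclude that recurring fees per identity deter Sybil attacks more effectively than a one-time test.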
… keys as a shared secret session key. The main ideas are the association of the identity with the key assigned to a node, and the validation of the key. Validation involves ensuring that the network is able to validate the keys that an identity might have. The forged Sybil identity will not pass the key-validation test, as the keys associated with a random identity will most likely not have an appreciable intersection with the compromised key set.

Disadvantages / Limitations:
Limited to sensor networks

Application Domain:
Sensor Networks

III. METHODS PROPOSED TO COUNTER MASQUERADING ATTACKS

3.1 Iris Recognition System: A novel two-stage protection scheme for automatic iris recognition systems against masquerade attacks carried out with synthetically reconstructed iris images is presented in [3]. The method uses different characteristics of real iris images to differentiate them from the reconstructed ones, thereby addressing important security flaws detected in state-of-the-art commercial systems. Experiments carried out on the publicly available BioSecure database demonstrate the efficiency of the proposed security-enhancing approach.

From the results presented in [3] it is clear that efficient countermeasures must be developed and embedded in practical biometric applications in order to deal with the security flaws disclosed in the vulnerability-assessment experiments. The global solution proposed there classifies an input sample into the real or synthetic class following a two-stage approach:

Stage 1: Edge Detection
Stage 2: Power Spectrum

This two-stage protection method [3] against attacks carried out with reconstructed iris images has shown, in experiments, its ability to detect fraudulent access attempts using synthesized iris images, thereby solving important security flaws detected in the vulnerability evaluation of a state-of-the-art iris system.

3.2 Search-Behavior Modeling Approach: Masquerade attacks (such as identity theft and fraud) are a serious computer security problem. It is conjectured that individual users have unique computer-usage behavior, which can be profiled and used to detect masquerade attacks. The behavior captures the types of activities that a user performs on a computer and when they perform them.

The use of search-behavior profiling [4] for masquerade-attack detection permits limiting the range and scope of the profiles computed about a user. This limits potentially large sources of error in predicting user behavior that would be likely in a far more general setting, and reduces the overhead of the masquerade-attack detection sensor. Prior work modeling user commands shows very high false-positive rates with only moderate true-positive rates.

In [4], a modeling approach aims to capture the intent of a user more accurately, based on the insight that a masquerader is likely to perform untargeted and widespread search. The paper models the search behavior of the legitimate user using a set of seven features [4] and detects a masquerader who deviates from that normal search behavior:

1. Number of search actions
2. Number of non-search actions
3. Number of new processes
4. Number of window touches (e.g., closing a window)
5. Total number of document-editing applications running on the system
6. Total number of processes running on the system
7. Number of user-induced actions (e.g., manually starting or killing a process, opening a window, or manually searching for some file or content)

Disadvantages / Limitations:
Few features to describe behavior

Application Domain:
General
… and remove both attacks, with their limitations and application domains.

V. POSSIBLE RESEARCH DIRECTIONS

Our future work includes tackling issues related to the Sybil attack in a network for the different methods, especially the RSSI-based scheme, and trying to increase its efficiency. With this, we will also try to improve the detection and removal accuracy of masquerading attacks in a network.

ACKNOWLEDGMENT

All the faculty members should be praised for contributing to the success of this survey in various ways. I also want to thank my guide for researching this topic with me and for the references I have used throughout this project, as well as the anonymous reviewers for their valuable comments.

… Telecommunication from PRMIT&R College of Engineering, Amravati University, Amravati. He is currently a postgraduate student in the wireless communication and computing field at the Department of Computer Technology, Priyadarshini College of Engineering, Nagpur University, Nagpur. His research interests include WSN and network security.

Mr. Animesh R. Tayal received the Bachelor of Engineering degree in Computer Technology from Nagpur University, India, and the Master of Engineering degree in Wireless Communication and Computing from G. H. Raisoni College of Engineering, Nagpur, India, in 2002 and 2009, respectively. His research areas are Natural Language Processing and Wireless Sensor Networks. He has 10 years of teaching experience. Presently he is an Assistant Professor at Priyadarshini College of Engineering, Nagpur. He is the author of fifteen research papers in international and national journals and conferences.
REFERENCES
I. INTRODUCTION
Nowadays there is rapid growth in parking systems, so there is a need to research automatic parking systems that will be useful for the careful parking of cars and other vehicles [9]. Various approaches have been used in parking systems: the user-interface-based approach, the free-space-based approach, the parking-slot-marking-based approach, and the infrastructure-based approach. The fusion of an AVM system and ultrasonic sensors is used to detect and track the vacant parking slot in the automatic parking system. The Around View Monitor (AVM) provides a virtually 360° bird's-eye view of the car. The AVM is a support technology that assists drivers to park more easily by better understanding the vehicle's surroundings through a virtual bird's-eye view from above the vehicle. It helps the driver visually confirm the vehicle's position relative to the lines around parking spaces and adjacent objects, allowing the driver to maneuver into parking spots with more ease [1].

The AVM makes parking easier because (1) through the bird's-eye view, a driver can check for obstructions around the vehicle; (2) the system can display the bird's-eye, front, and rear views, making it possible to check the vehicle's 360-degree surroundings simultaneously with either the fore or the back; and (3) it is possible to display the rear view and front-side view together, so a driver can simultaneously check the rear and front-side views of the vehicle, the points of most concern when parallel parking [1] [2] [3].

Fig. 1. General flow chart of the car parking system. [1]

Ultrasonic sensors are known as transceivers when they both send and receive; they are also called transducers. Transducers evaluate attributes of a target by interpreting the echoes it returns. Ultrasonic sensors generate high-frequency sound waves and evaluate the echo received back by the sensor, measuring the time interval between sending the signal and receiving the echo to determine the distance to an object [4]. Fig. 1 shows the flow chart of the proposed system. Once a driver starts parking, the system continuously detects parking-slot markings and classifies their occupancies. First, the parking-slot markings are detected in the AVM image sequence: a tree-structure-based method detects the markings using the individual AVM images and an image-registration technique. Second, empty slots are detected using the ultrasonic sensors; the probability of parking-slot occupancy is calculated using ultrasonic-sensor data acquired while the vehicle passes by the parking slots. Finally, the selected empty parking slot is tracked and the vehicle is properly parked in it [1] [2] [3] [4].
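The time-of-flight ranging principle described above can be sketched directly: the measured interval covers the round trip, so the distance is half the time of flight times the speed of sound. The speed value is a nominal assumption (about 343 m/s in air at 20 °C), not a figure from [4].

```python
# Sketch of ultrasonic echo ranging: distance from round-trip time.

SPEED_OF_SOUND_M_S = 343.0  # nominal speed of sound in air at 20 deg C

def echo_distance_m(round_trip_s: float) -> float:
    """Distance to the object; the echo travels there and back,
    so only half the time of flight counts toward the range."""
    return SPEED_OF_SOUND_M_S * round_trip_s / 2.0

# An echo returning after 11.66 ms corresponds to roughly 2 m.
print(round(echo_distance_m(0.01166), 2))  # -> 2.0
```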
II. LITERATURE REVIEW
In the earlier parking systems, the driver manually selects the parking slot and drives into it. This method is useful as a backup tool for failure cases of automatic parking methods, but manpower is needed at each car park to select a parking slot manually and give directions to drive properly into the slot [1]. Because of this need for manpower, such systems were replaced by ultrasonic-sensor-based systems. In these systems, two ultrasonic sensors are mounted on both sides of the front bumper, and adjacent vehicles are detected using the ultrasonic-sensor data. The sensors find the adjacent vehicles, and the driver drives into the free space between them. Using the multiple-echo function, the parking space is detected more accurately in real parking environments. This method fails when there are no adjacent vehicles, and in slanted parking situations where adjacent vehicle surfaces are not perpendicular to the heading directions of the ultrasonic sensors [1] [4]. Another method is the parking-slot-marking-based method, in which vehicle-mounted cameras are used. It simply tracks the parking-slot markings present on the road. The distance between a point and a line segment is used to distinguish the guideline from the recognized marking line segments. Once the guideline is successfully recognized, T-shape template matching easily recognizes the dividing marking line segments. This method fails where parking-slot markings are not present [1] [5]. A scanning-laser-radar-based system recognizes the free space between vehicles as a parking slot. This system consists of range-data preprocessing, corner detection, and target parking-position designation. Its major disadvantage is the expensive price of the sensor [7].
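The marking-based method above relies on the distance between a point and a line segment to distinguish the guideline from other recognized marking segments. A standard way to compute that distance is to project the point onto the segment and clamp the projection; this is a generic geometric sketch, not code from [1] or [5].

```python
# Point-to-line-segment distance via clamped projection.
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to segment a-b (2-D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                  # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Projection parameter along the infinite line, clamped to [0, 1]
    # so the closest point stays on the segment itself.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    cx, cy = ax + t * dx, ay + t * dy      # closest point on the segment
    return math.hypot(px - cx, py - cy)

print(point_segment_distance((1.0, 2.0), (0.0, 0.0), (4.0, 0.0)))  # -> 2.0
```

A detected marking segment whose distance to a candidate guideline stays below a small threshold along its length would be grouped with that guideline.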
A Photonic Mixer Device (PMD) camera can also be used to scan the parking scene to detect a free parking slot; the PMD sensor provides a large number of spatial point measurements representing detailed cuts of the observed scene [6]. Finally, the infrastructure-based method uses a bird's-eye-view camera, which helps to track the vacant parking slot [1] [8].

Fig. 2. Hierarchical tree structure of parking slot marking: (a) parking slot markings, (b) slots, (c) junctions. [2]

III. METHODOLOGY

The main concept behind the detection of a free parking slot is to recognize the parking-slot marking. The hierarchical-tree-structure-based parking-slot-marking method deals with the four most commonly appearing types of parking-slot markings, i.e., the rectangular, slanted rectangular, diamond, and open rectangular types. These four types of parking-slot markings consist of four types of slots, i.e., the TT-slot, TL-slot, YY-slot, and II-slot, and each slot is composed of two junctions. These junctions can be categorized into T-junctions, L-junctions, Y-junctions, and I-junctions. The method detects and classifies corners and produces junctions by pairing two corners [1] [2]. The AVM image sequence is used to recognize the parking-slot markings: various images are generated by the AVM camera, and by combining them the empty parking slot is detected. Parking slots are detected using the hierarchical tree structure. Corners are detected by the Harris corner detector, junctions are generated by combining two corners, and slots are generated by combining two junctions. Sometimes parking slots overlap in real situations; the overlap of two slots is determined by the Jaccard coefficient:

J(Si, Sj) = |Si ∩ Sj| / |Si ∪ Sj|    (1)

where J(Si, Sj) is the Jaccard coefficient between the two rectangles formed by the ith slot Si and the jth slot Sj. If this value is larger than a predetermined threshold (T1), the two slots are considered overlapped. If there are overlapped slots, the normalized average intensity value (NAIV) of the two slots is calculated, given by

NAIVi = (1 / MAX(I)) · (1/N) Σ_{(x,y)∈Si} I(x, y)    (2)

where NAIVi is the NAIV of Si, MAX(I) is the maximum intensity value of image I, N is the number of pixels in the skeleton of the parking-slot entrance, and (x, y) is the location of a pixel along the x and y axes [1] [2].

Hough Transform

To determine the current position of previously detected parking slots, the Hough transform is used. The wide-angle lens image is transformed into a bird's-eye-view image: the undistorted input image is transformed with a homography, a one-to-one relation between two coordinate systems. In the parking system, the camera is placed at a constant height and a particular tilt angle.

E(x, y) = | Σ_{i=-1}^{1} Σ_{j=-1}^{1} Sobel_vert(i, j) · B(x+i, y+j) | + | Σ_{i=-1}^{1} Σ_{j=-1}^{1} Sobel_hori(i, j) · B(x+i, y+j) |    (3)

where E(x, y) denotes a pixel value of the edge image and B(x, y) denotes a pixel value of the bird's-eye-view image. The edge image is generated from the bird's-eye-view image using the Sobel edge detector [3].

After the detection and prediction of parking slots, a correction is conducted by combining all the detected and predicted parking slots. First, it is checked whether the parking slots overlap, using the Jaccard coefficient. The overlapping situation is classified into the following three cases according to the Jaccard coefficient and two predetermined thresholds, T1 and T2 (T1 < T2).

Case I: J(Si, Sj) < T1. If the Jaccard coefficient between the ith slot (Si) detected in the current image and the jth slot (Sj) predicted from a previous image is less than T1, the two slots are considered non-overlapping. If Si does not overlap with any previously detected slot, Si is identified as a newly detected slot.

Case II: J(Si, Sj) ≥ T2. If the Jaccard coefficient is greater than or equal to T2, the same slot can be considered to be repetitively detected in sequential images. In this case, the position of the parking slot is corrected using the NAIV.

Case III: T1 ≤ J(Si, Sj) < T2. If the Jaccard coefficient is between T1 and T2, the two slots are considered to be overlapping, but they are not repetitive detections of the same slot. Since two different parking slots cannot overlap in real situations, only one of the two slots should be selected.

After combining the sequentially detected parking slots, the empty parking slots are determined; an empty parking slot is free space within a non-overlapping marking slot. The driver then selects a particular parking slot and drives into it [1] [2].

IV. CONCLUSION

In this paper we have discussed the various types of parking systems that are used to recognize an empty parking slot. The Around View Monitor (AVM) is one of the best systems for parking-slot-marking detection and tracking. All sequentially captured AVM images are combined to find the empty parking slot. This system efficiently solves the common problem of allocating parking space in busy areas of big cities, such as shopping complexes, stadiums, and other popular places, especially during peak hours.

REFERENCES
[1] Jae Kyu Suhr and Ho Gi Jung, "Sensor Fusion-Based Vacant Parking Slot Detection and Tracking," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 1, pp. 21-36, February 2014.
[2] Jae Kyu Suhr and Ho Gi Jung, "Fully-automatic Recognition of Various Parking Slot Markings in Around View Monitor (AVM) Image Sequences," 15th International IEEE Conference on Intelligent Transportation Systems, Anchorage, Alaska, USA, pp. 1294-1299, September 16-19, 2012.
[3] Ho Gi Jung, Dong Suk Kim, Pal Joo Yoon, and Jaihie Kim, "Parking Slot Markings Recognition for Automatic Parking Assist System," IEEE Intelligent Vehicles Symposium 2006, Tokyo, Japan, pp. 106-113, June 13-15, 2006.
[4] Wan-Joo Park, Byung-Sung Kim, Dong-Eun Seo, Dong-Suk Kim, and Kwae-Hi Lee, "Parking Space Detection Using Ultrasonic Sensor in Parking Assistance System," 2008 IEEE Intelligent Vehicles Symposium, Eindhoven University of Technology, Eindhoven, The Netherlands, pp. 1039-1044, June 4-6, 2008.
[5] Ho Gi Jung, Yun Hee Lee, and Jaihie Kim, "Uniform User Interface for Semiautomatic Parking Slot Marking Recognition," IEEE Transactions on Vehicular Technology, vol. 59, no. 2, February 2010.
[6] Ullrich Scheunert, Basel Fardi, Norman Mattern, Gerd Wanielik, and Norbert Keppeler, "Free space determination for parking slots using a 3D PMD sensor," Proceedings of the 2007 IEEE Intelligent Vehicles Symposium, Istanbul, Turkey, pp. 154-159, June 13-15, 2007.
[7] Ho Gi Jung, Young Ha Cho, Pal Joo Yoon, and Jaihie Kim, "Scanning Laser Radar-Based Target Position Designation for Parking Aid System," IEEE Transactions on Intelligent Transportation Systems, vol. 9, no. 3, pp. 406-424, September 2008.
[8] Kyounghwan An, Jungdan Choi, and Dongyong Kwak, "Automatic Valet Parking System Incorporating a Nomadic Device and Parking Servers," 2011 IEEE International Conference on Consumer Electronics (ICCE), pp. 111-112.
[9] Gongjun Yan, Weiming Yang, Danda B. Rawat, and Stephan Olariu, "Smart Parking: Secure and Intelligent Parking System," IEEE Intelligent Transportation Systems Magazine, pp. 18-30.
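The slot-overlap test of Eq. (1) and the three-case classification of Section III can be sketched as follows, using axis-aligned rectangles for simplicity (real AVM slots may be rotated). The thresholds T1 and T2 below are illustrative assumptions, not the values used in [1].

```python
# Sketch of the Jaccard-coefficient overlap test and case classification.

def rect_area(r):
    """r = (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    return (r[2] - r[0]) * (r[3] - r[1])

def jaccard(ri, rj):
    """Eq. (1): J(Si, Sj) = |Si intersect Sj| / |Si union Sj|."""
    ix1, iy1 = max(ri[0], rj[0]), max(ri[1], rj[1])
    ix2, iy2 = min(ri[2], rj[2]), min(ri[3], rj[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = rect_area(ri) + rect_area(rj) - inter
    return inter / union if union > 0 else 0.0

def classify(ri, rj, t1=0.3, t2=0.7):
    """Three-case rule with thresholds t1 < t2 (illustrative values)."""
    j = jaccard(ri, rj)
    if j < t1:
        return "new slot"            # Case I: non-overlapping
    if j >= t2:
        return "same slot"           # Case II: repetitive detection
    return "conflicting slots"       # Case III: keep only one of the two

# A slot re-detected with a small positional shift counts as the same slot.
print(classify((0, 0, 2, 4), (0.1, 0, 2.1, 4)))  # -> same slot
```

In the full system, Case II would then trigger the NAIV-based position correction of Eq. (2).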
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
Video Sequences
Abstract— In this paper, the algorithms used for fire detection in video sequences are compared: fire detection in video sequences using a generic color model; fire detection based on vision sensors and support vector machines; a computer vision-based method for real-time fire and flame detection; a probabilistic approach for vision-based fire detection in videos; and fire flame detection in video sequences using multi-stage pattern recognition techniques. The comparison study gives a clear indication that the multi-stage pattern recognition technique outperforms all the other algorithms, but it requires more computational time, since its four stages make it more complex than the other methods.

Keywords—Fire detection; generic color model; vision sensors; support vector machines; probabilistic approach; pattern recognition techniques.

I. INTRODUCTION

Fire detection has been arising as one of the relevant topics in the context of the security aspects of both personal and commercial scenarios. Such systems are used in large industrial areas, hospitals, shopping malls, etc. Whenever a fire flame is detected by the cameras erected on the sites, the fire alarm sounds, which then enables the rescue processes. There are several methods for implementing fire detection. In this paper, various such methods are compared to determine which is the best method to detect fire flames in video sequences. Most of these algorithms are based on color-pixel recognition, motion detection, or both.

The compared algorithms are: fire detection in video sequences using a generic color model (Algorithm 1) [1], fire detection based on vision sensors and support vector machines (Algorithm 2) [4], a computer vision-based method for real-time fire and flame detection (Algorithm 3) [5], a probabilistic approach for vision-based fire detection in videos (Algorithm 4) [6], and fire flame detection in video sequences using multi-stage pattern recognition techniques (Algorithm 5) [8].

These methods will sometimes detect non-fire regions instead of actual fire regions, which are called candidate fire regions, so most of their false-positive values are somewhat high. The false-positive value is the rate of recognizing a non-fire region as real fire, and the true-positive value is the rate of detecting real fire as a real fire region. Each of the methods is implemented by testing with a large collection of fire as well as non-fire videos. The training movies tested include fire, outdoor fire, and moving objects that are the same color as fire.

II. VIDEO-BASED FIRE DETECTION SYSTEMS

Fire flame detection is mostly applied in the industrial field for monitoring and surveillance. Point detectors and sensors are the conventional fire detection approaches, but due to the inherent delay in the time taken for fire to reach the sensors, fire detection based on fire flame characteristics came into existence. Both fire flame and smoke detection systems prevail in this area; among them, a comparison is made of the existing fire flame detection systems.

A. Fire detection in video sequences using a generic color model

This is a flame-based fire detection method [1]. A rule-based generic color model is used for flame-pixel classification. The YCbCr color space is used to separate luminance from chrominance, rather than the RGB color space [2], [3], in order to carry out the separation more effectively, and to construct a generic chrominance model for flame-pixel classification. New rules included in the YCbCr color space avoid the bad effects of constantly changing illumination and improve detection, reducing the dependency on the RGB color space. One of the disadvantages of the RGB color space is its illumination dependence: if the illumination of the images changes, fire-pixel classification cannot be performed efficiently.

Chrominance can be used to model the color of fire rather than modeling its intensity. When translating the RGB color space into one of the color spaces that separate intensity from chrominance, the discrimination is better. The YCbCr color space is used to model fire pixels because of the linear conversion between the RGB and YCbCr color spaces. Y is luminance; Cb and Cr are the Chrominance-Blue and Chrominance-Red components, respectively.

This model outperforms the model which uses RGB values in both detection rate and false-alarm rate, producing a detection rate of up to 99%, and the system is cheap in computational complexity. The YCbCr color space is better at discriminating the luminance from the chrominance and hence is more robust to illumination changes than the RGB color space.

It has disadvantages too: the model does not take into account the flickering nature of fire. It should therefore be improved to cope with the flickering nature of fire, in order to produce an efficient fire flame detection system.

B. Fire detection based on vision sensors and support vector machines

A vision-based fire detection system is introduced in [4]. The approach offers several advantages, such as relatively low equipment cost, fast response time, and fast confirmation through the surveillance monitor, which makes it practical to develop robust fire detection systems using video cameras. The majority of the existing vision-based approaches suffer a high rate of false alarms, as they use color information and the temporal variation of pixels; moreover, they use heuristic features, which leads to the same problems. To overpower these difficulties, this model illustrates a fire detection algorithm based on a luminance map and an SVM, for reliable results in general video sequences.

The authors analyzed the temporal variation of candidate pixels in consecutive frames using a luminance map and built a temporal fire model for training and testing a two-class SVM that ultimately verifies the fire regions. Experimental results showed that this approach is more robust to noise, such as smoke, and to subtle differences between consecutive frames than previous research. Even though it exhibits an ideal way of detecting fire flames, the performance of the system still has to be improved, reducing the false-alarm rate in non-fire regions as well as the missing rate of candidate fire regions; only then can it be applied in real-time environments.

In addition, the computation time for fire detection needs to be improved in order to design a real-time fire-warning system. The model would then be a great achievement in fire monitoring and surveillance, and it could be fabricated alongside already-installed surveillance monitoring systems at low expense.

C. A computer vision-based method for real-time fire and flame detection

A wavelet-based fire detection method for video sequences is developed in [5]. This method detects both fire flames and flame-colored moving regions in videos and analyzes the motion of such regions in the wavelet domain, mainly to estimate the flickering nature of the fire flames. The estimate is based on the flame-flickering characteristics, such as the flicker frequency, which is closely related to the contour, chrominance, and luminosity values of the moving objects in the videos. Thus fire detection can be made more robust to false alarms by incorporating the frequency behavior of the moving pixels.

Both the temporal and the spatial variations of the pixels of the moving objects are observed; this is the reason why the results show a dramatic reduction in the false-alarm rate, so that the method can be applied in real-time applications. This model is applicable to both indoor and outdoor monitoring systems.

D. A probabilistic approach for vision-based fire detection in videos

A new color-based detection metric for fire detection in videos was proposed in [6]. Usually, computer vision-based fire detection models are incorporated into Closed-Circuit Television (CCTV) surveillance in a controlled fashion, but this model is also applicable to automatic video classification. The model analyzes frame-to-frame changes of potential fire regions. In addition, it extracts important visual features of fire such as color, area size, surface coarseness, boundary roughness, and skewness of the fire-pixel distribution. All these characteristics of fire are powerful discriminants for robust fire recognition. The changes in these features are evaluated and combined according to a Bayes classifier [7].

Since the extracted features are not computationally complicated, the model shows very fast processing, making the system applicable not only to real-time fire detection but also to video retrieval in news content, which requires faster-than-real-time analysis.

E. Fire flame detection in video sequences using multi-stage pattern recognition techniques

In [8], a four-stage approach was used for the detection of fire flames in video sequences, mainly to enhance the performance of fire detection. In the first stage, moving regions are detected using an adaptive Gaussian Mixture Model (GMM) [9], which helps to manage illumination changes as well as sudden movement of the camera or even of vegetation. The adaptive GMM models each pixel, with or without morphological operations such as dilation and erosion: the adaptive GMM improves the quality of the moving region, whereas dilation and erosion remove noise and smooth the result.

In the second stage, fuzzy c-means (FCM) clustering [10], [11] is used to segment fire and non-fire regions. There may be moving objects in addition to the fire which are of a different color than fire; these are segmented by the FCM algorithm, the pixels of the moving region are categorized, and the fire-colored regions are selected. In the third stage, additional parameters are extracted from the tempo-spatial characteristics of the fire regions. In the final stage, the fire is classified using support vector machines (SVMs), introduced by [12]. The SVM is widely applied to many fields of pattern recognition [13]; it is not only capable of learning in high-dimensional spaces but can also provide high performance with limited training data sets, and many techniques have been proposed to improve its performance [14], [15]. Experimental results indicated that this method has a high accuracy rate for detecting fire and provides a low false-alarm rate, with high reliability in both indoor and outdoor test videos.

Clearly, the method proposed by Tung Xuan Truong et al. outperformed the other conventional algorithms by consistently increasing the accuracy of fire detection and decreasing the error of false fire detection for each video.

One limitation of this approach is that it requires much more computational time than the other approaches, as its four stages make the algorithm more complex. High-performance processors, including NVIDIA GPUs (Graphics Processing Units) and TI DSPs (Digital Signal Processors), can support these algorithms in real time, but the accuracy of fire detection remains more important. The approach can be made more practical by reducing the excess computational time.

The comparison of the conventional fire detection approaches is made more effective by comparing the computational time each requires to accomplish the task. The computational time required for each method is tabulated in Table II.

TABLE II. COMPUTATIONAL TIME OF THE VARIOUS FIRE DETECTION METHODS [8]
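The style of rule-based YCbCr flame-pixel classification described in Section II-A can be sketched as below. The specific rules (a flame pixel is brighter than its Cb value, red-dominant, deviates from the frame means, and shows a strong Cb-Cr separation) and the threshold value only illustrate the shape of such a model; they are assumptions, not the exact rules or thresholds of [1].

```python
# Illustrative rule-based flame-pixel test in the YCbCr color space.

def is_flame_pixel(y, cb, cr, y_mean, cb_mean, cr_mean, tau=40):
    """Channel values are in the 0-255 YCbCr range; *_mean are the
    per-frame channel means. Returns True for a candidate flame pixel."""
    rule1 = y > cb                   # flame luminance exceeds its Cb value
    rule2 = cr > cb                  # flame colors are red-dominant
    rule3 = y > y_mean and cb < cb_mean and cr > cr_mean
    rule4 = abs(cb - cr) >= tau      # strong chrominance separation
    return rule1 and rule2 and rule3 and rule4

# A bright orange pixel in a darker frame passes; a neutral grey one does not.
print(is_flame_pixel(200, 80, 160, y_mean=120, cb_mean=128, cr_mean=128))  # True
print(is_flame_pixel(128, 128, 128, y_mean=120, cb_mean=128, cr_mean=128)) # False
```

Because every rule is a cheap per-pixel comparison, a model of this kind keeps the low computational cost noted for Algorithm 1, at the price of ignoring temporal cues such as flicker.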
REFERENCES
[1] Celik, T., Demirel, H., (2009). "Fire detection in video sequences using a generic color model". Fire Saf. J. 44, 147–158.
[2] Chen, T., Wu, P., Chiou, Y., (2004). "An early fire-detection method based on image processing". In: Proceedings of the IEEE International Conference on Image Processing, pp. 1707–1710.
[3] Celik, T., Demirel, H., Ozkaramanli, H., (2006). "Automatic fire detection in video sequences". In: Proceedings of the European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, September 2006.
[4] Ko, B.C., Cheong, K.H., Nam, J.Y., (2009). "Fire detection based on vision sensor and support vector machines". Fire Saf. J. 44, 322–329.
[5] Toreyin, B.U., Dedeoglu, Y., Gudukbay, U., Cetin, A.E., (2006). "Computer vision-based method for real-time fire and flame detection". Pattern Recog. Lett. 27, 49–58.
[6] Borges, P.V.K., Izquierdo, E., (2010). "A probabilistic approach for vision-based fire detection in videos". IEEE Trans. Circuits Syst. Video Technol. 20, 721–731.
[7] Theodoridis, S., Koutroumbas, K., (2006). "Pattern Recognition". New York: Academic.
[8] Truong, T.X., Kim, J.-M., (2012). "Fire flame detection in video sequences using multi-stage pattern recognition techniques". Engineering Applications of Artificial Intelligence 25, 1365–1372.
[9] KaewTraKulPong, P., Bowden, R., (2002). "An Improved Adaptive Background Mixture Model for Real-Time Tracking with Shadow Detection". In: Proc. 2nd European Workshop on Advanced Video-Based Surveillance Systems, AVBS01, pp. 135–144.
[10] Bezdek, J.C., (1981). "Pattern Recognition with Fuzzy Objective Function Algorithms". Plenum Press, New York.
[11] Bezdek, J.C., Keller, J., Krisnapuram, R., Pal, N., (2005). "Fuzzy Models and Algorithms for Pattern Recognition and Image Processing". Springer.
[12] Vapnik, V., (1982). "Estimation of Dependences Based on Empirical Data". Springer-Verlag.
[13] Burges, C.J.C., (1998). "A tutorial on support vector machines for pattern recognition". J. Knowl. Discovery Data Mining 2, 121–167.
[14] Platt, J.C., (1998). "Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines". Microsoft Research Technical Report MSR-TR-98-14.
[15] Cristianini, N., Shawe-Taylor, J., (2000). "An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods". Cambridge University Press.
Abstract— A novel approach to designing a microstrip antenna for wireless sensor networks is presented here. Nowadays, for any sensor network, the antenna plays a vital role in transmitting and receiving the signal. The antenna presented here is a new design in which the sensor network designer finds a flat gain response, dual-band resonance and wider bandwidth. The proposed antenna is a square ring structure fed by electromagnetic (EM) coupling. The design introduces a technique that, to our knowledge, has not been implemented earlier, and it helps designers as well as researchers implement their ideas more efficiently. The proposed antenna is a multilayer microstrip square ring structure, which enables a flat-band gain response with a gain higher than that reported in comparable literature. The antenna also resonates at multiple frequency bands, which is good for multi-band wireless applications, and it can be used for various mobile communication applications including MIMO (Multiple-Input Multiple-Output). A simulation study of the various parameters of the designed antenna was performed in the IE3D electromagnetic (EM) simulator and is presented in this paper. The layer thickness is optimized to obtain the best result.

Keywords— Microstrip fractal antenna; wireless sensor network; square ring structure; multilayer structure; high gain; dual band; wider bandwidth; EM coupling; IEEE 802.15.4.

I. INTRODUCTION

Microstrip antennas serve wireless sensor networks and many more wireless communication systems. This design approach plays a great role in the miniaturization of telecommunication terminals: such antenna designs reduce the number of terminals, which in turn reduces communication cost and complexity. Developers focus on multitasking, i.e. they want one gadget to do multiple jobs. If the receiver and the transmitter can work efficiently on multiple bands, then the circuit complexity and the gadget size can be optimized. Since the antenna is the main part of both the transmitter and the receiver, the antenna itself has to be optimized, and the proposed antenna is a superior choice for the developer [10], [12].

The microstrip patch antenna is a conducting plate mounted over a dielectric substrate. The radiator of a microstrip antenna comes in different shapes, and the antenna can be fed by various methods. Alongside their advantages, these patch antennas have some drawbacks: comparatively low gain, very narrow bandwidth and low efficiency. Many researchers have recommended methods to overcome these drawbacks, but every time there is a trade-off between the proposed antenna design and its performance [6]. The antenna presented in this paper overcomes the low gain and narrow bandwidth of conventional designs. Some structures are shown in Fig. 1 [13]. The patch dimensions are calculated from the simplified formulas of the transmission-line model [R4].

A. Antenna design

The microstrip antenna presented here has a multilayer structure. The first layer, above the ground plane, is a simple air gap between layer 2 and the ground plane. The second layer is the last as well as the top layer of the design; it carries the antenna radiator, which is fed electromagnetically by a feed patch. The feed patch itself is excited by a probe-feeding technique. The antenna radiator is placed on the dielectric substrate (FR4).

The physical width of the proposed antenna can be obtained from

    W = (c / (2 fr)) · sqrt( 2 / (εr + 1) )        (1)

where c is the velocity of light, fr is the resonant frequency and εr is the relative dielectric constant of the substrate. The effective dielectric constant of the designed antenna can then be determined from

    εeff = (εr + 1)/2 + ((εr − 1)/2) · [1 + 12 (h/W)]^(−1/2)        (2)

where h is the substrate thickness.
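As a numeric sanity check, equations (1) and (2) can be evaluated directly. The sketch below takes εr = 4.4 and the substrate thickness and patch width from Table 1, with 5.9 GHz (one of the two design frequencies) as the resonant frequency; it is an illustration of the formulas, not part of the IE3D design flow.

```python
import math

C = 3e8  # speed of light, m/s

def patch_width(fr_hz: float, eps_r: float) -> float:
    """Equation (1): physical patch width W = (c / 2fr) * sqrt(2 / (eps_r + 1))."""
    return (C / (2 * fr_hz)) * math.sqrt(2 / (eps_r + 1))

def eps_effective(eps_r: float, h_m: float, w_m: float) -> float:
    """Equation (2): effective dielectric constant for substrate height h."""
    return (eps_r + 1) / 2 + ((eps_r - 1) / 2) * (1 + 12 * h_m / w_m) ** -0.5

w = patch_width(5.9e9, 4.4)                 # patch width in metres
ee = eps_effective(4.4, 1.66e-3, 16.4e-3)   # uses h2 and W from Table 1
print(round(w * 1000, 2), round(ee, 3))     # -> 15.47 3.842
```

The computed width of roughly 15.5 mm is of the same order as the radiator dimensions listed in Table 1.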
C. Tables
TABLE 1
DIMENSIONS FOR THE PROPOSED ANTENNA DESIGN

Symbol  Quantity                                              Value
L       Length of the radiator patch                          15.5 mm
W       Width of the radiator patch                           16.4 mm
Lf      Length of the feed strip                              0.6 mm
Wf      Width of the feed strip                               1.2 mm
g       Separation gap between feed strip and radiator patch  0.5 mm
L1      Length of the square slot                             8 mm
h1      Air gap between substrate and ground plane            6 mm
h2      Thickness of the substrate                            1.66 mm
εr      Relative dielectric constant of the substrate         4.4

Particular dimensions of the proposed antenna.
TABLE 2
OPTIMIZATION OF THE SQUARE SLOT USING DIFFERENT LENGTHS, KEEPING THE OTHER PARAMETERS UNCHANGED

L1(mm)  L(mm)  W(mm)  Lf(mm)  Wf(mm)  g(mm)  h1(mm)  h2(mm)  εr   S11 at Fr1(dB)  S11 at Fr2(dB)
0       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -35             -29
1       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -36             -29
2       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -35.5           -28
4       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -35             -26.5
6       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -37             -23
8       15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -40             -20
10      15.5   16.4   0.6     1.2     0.5    6       1.66    4.4  -45             -13.5

Step-by-step optimization for the best result.
First, we obtain the optimum dimensions of a conventional rectangular microstrip patch antenna with the best impedance-matched feed, resonating at 4 GHz and 5.9 GHz, through parametric analysis in the commercial IE3D software. To perform the parametric analysis, the length and width of the radiator patch were modified first, and then the length and width of the feed patch were optimized to attain a very deep return loss at 4 GHz and 5.9 GHz; the dimensional values are listed in Table 1. After obtaining the optimum return loss, a square slot was introduced in the exact middle of the rectangular microstrip antenna radiator. By increasing the square slot side from 1 mm to 12 mm, the best result was obtained at 8 mm: a square slot of 8 mm at the middle of the antenna radiator gives the maximum return loss as well as gain, as shown in Table 2. A major advantage of this antenna is its uniform gain response, called a flat gain response, which is comparatively above 5 dBi. It has been observed that the feed patch plays an important role in matching the impedance as well as in obtaining the best return loss characteristic.

The feed patch on the top layer, where the antenna radiator is present, is excited by the probe-feeding technique. The choice of probe also plays a major role in obtaining and recording an optimum result.

The simulated results of the proposed antenna are shown here (Fig. 3).
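The slot-length sweep of Table 2 can be reproduced as a small selection script. The −15 dB acceptance criterion for the second band is an assumption introduced here for illustration (the paper selects 8 mm by inspecting return loss and gain together), but it recovers the same choice:

```python
# Rows are (L1 in mm, S11 at Fr1 in dB, S11 at Fr2 in dB), taken from Table 2.
sweep = [
    (0, -35.0, -29.0), (1, -36.0, -29.0), (2, -35.5, -28.0),
    (4, -35.0, -26.5), (6, -37.0, -23.0), (8, -40.0, -20.0),
    (10, -45.0, -13.5),
]

def best_slot(rows, fr2_limit_db=-15.0):
    """Among slot lengths whose second-band return loss still meets the
    (assumed) -15 dB criterion, pick the deepest first-band return loss."""
    acceptable = [r for r in rows if r[2] <= fr2_limit_db]
    return min(acceptable, key=lambda r: r[1])[0]

print(best_slot(sweep))  # -> 8, matching the 8 mm slot chosen above
```

Note how the 10 mm slot, despite its deeper −45 dB dip at Fr1, is rejected because the second resonance degrades to −13.5 dB.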
The optimized results captured during the simulation process are shown in Fig. 3. Fig. 3(a) is the radiation pattern of the proposed antenna, and Fig. 3(b) shows the gain response of the proposed antenna over the whole frequency range. The gain of the proposed antenna is the same over a band of frequencies, which is a major benefit for a wireless sensor network and helps the user obtain an optimum result. Fig. 3(c) shows the radiation pattern of the proposed antenna.

IV. CONCLUSION

A novel design of a microstrip antenna, compact in size and superior in performance for high-frequency wireless communication, is presented here. The design is also applicable to multi-band or MIMO systems, which helps cover RF, microwave and wireless communication. Notably, the proposed antenna resonates at multiple frequencies and has a better gain response, which is very helpful for a wireless sensor network in sensing radio waves efficiently. The introduction of a square slot on the antenna radiator plays a vital role in size reduction and in obtaining better performance: deep return loss (S11), a good radiation pattern, high gain and impedance matching. It has been observed that increasing the size of the square slot gives better S11 and gain response. The input impedance and S11 characteristics obtained using the developed equations are found to be in good agreement with the IE3D-simulated and experimental results. Some of the expressions for the constituent models have been modified to suit the proposed antenna configuration. The antenna resonates first at 4.15 GHz and secondly at 5.93 GHz; at both resonances the bandwidth is quite high compared with other microstrip patch antenna designs, being greater than 500 MHz and 1 GHz respectively. This design also records a high gain, greater than 6 dBi, and the gain plot shows a flat-band gain response over the whole frequency range.

ACKNOWLEDGMENT

This research is supported by the Science and Engineering Research Board through the Department of Science and Technology, Govt. of India.

REFERENCES

[1] C. A. Balanis, "Antenna Theory: Analysis and Design", Third edition, John Wiley & Sons, Inc.
[2] Jung-Tang Huang, Jia-Hung Shiao and Jain-Ming Wu, "A Miniaturized Hilbert Inverted-F Antenna for Wireless Sensor Network Applications", IEEE Transactions on Antennas and Propagation, ISSN: 0018-926X, Vol. 58, Issue 9, pp. 3100-3103, 2010.
[3] E. D. Skiani, S. A. Mitilineos, and S. C. A. Thomopoulos, "A Study of the Performance of Wireless Sensor Networks Operating with Smart Antennas", IEEE Antennas and Propagation Magazine, ISSN 1045-9243, Vol. 54, No. 3, 2012.
[4] C. K. Singh and S. Mohan, "Effect of Antenna Correlation on the Performance of MIMO Systems in Wireless Sensor Networks", Wireless Telecommunications Symposium, ISSN 1934-5070, pp. 1-5, 2010.
[5] Yuling Lei, Yan Zhang and Yanjuan Zhao, "The Research of Coverage Problems in Wireless Sensor Network", International Conference on Wireless Networks and Information Systems, ISBN: 978-0-7695-3901-0, pp. 31-34, 2009.
[6] Syed Ahsan Ali, Umair Rafique, Umair Ahmad, M. Arif Khan, "Multiband Microstrip Patch Antenna for Microwave Applications", IOSR-JECE, ISSN: 2278-2834, ISBN: 2278-8735, Vol. 3, Issue 5, pp. 43-48, 2012.
[7] Dong-Hee Park, Yoon-Sik Kwak, "Design of Multi-Band Microstrip Patch Antenna for Wireless Terminals", IEEE Future Generation Communication and Networking, Vol. 2, pp. 439-441, 2007.
[8] Arnab Das, Bipa Datta, Samiran Chatterjee, Bipadtaran Sinhamahapatra, Supriya Jana, Moumita Mukherjee, Santosh Kumar Chowdhury, "Multi-Band Microstrip Slotted Patch Antenna for Application in Microwave Communication", International Journal of Science and Advanced Technology, ISSN 2221-8386, Vol. 2, pp. 91-95, 2012.
[9] Nima Bayatmaku, Parisa Lotfi, Mohammadnaghi Azarmanesh, "Design of Simple Multiband Patch Antenna for Mobile Communication Applications Using New E-shape Fractal", IEEE Antennas and Wireless Propagation Letters, Vol. 10, 2011.
[10] K. J. Vinoy, K. A. Jose, V. K. Varadan and V. V. Varadan, "Gain-enhanced electronically tunable microstrip patch antenna", Microwave & Optical Technology Letters, Vol. 23, pp. 368-370, 1999.
[11] S. Behera and K. J. Vinoy, "A Multiport Network Approach for the Analysis of Conformal Dual-Band Fractal Antennas", IEEE Trans. Antennas and Propagation, Vol. 60, No. 11, pp. 5100-5106, 2012.
[12] J. Romeu, C. Bojra, and S. Blanch, "High Directivity Modes in the Koch Island Fractal Patch Antenna", Proc. of the 2001 Antennas and Propagation Society International Symposium, IEEE, Vol. 3, pp. 1696-1699, 2000.
[13] B. Manimegalai, S. Raju and V. Abhaikumar, "A Multifractal Cantor Antenna for Multiband Wireless Applications", IEEE Antennas and Wireless Propagation Letters, Vol. 8, pp. 356-362, 2009.
[14] Q. Lee, K. F. Lee and J. Bobinchak, "Characteristics of a Two-Layer Electromagnetically Coupled Rectangular Patch Antenna", Electronics Letters, Vol. 23, No. 20, pp. 1070-1072, 1987.
[15] Hoseon Lee, George Shaker, Vasileios Lakafosis, Rushi Vyas, Trang Thai, Sangkil Kim, Xiaohua Yi, Yang Wang and Manos Tentzeris, "Antenna-based Smart "Skin" Sensors for Sustainable, Wireless Sensor Networks", IEEE International Conference on Industrial Technology (ICIT), pp. 189-193, 2012.
[16] Jared Burdin and James Dunyak, "Enhancing the Performance of Wireless Sensor Networks with MIMO Communication", IEEE Military Communications Conference, ISBN: 0-7803-9393-7, Vol. 4, pp. 2321-2326, 2005.

Subhrakanta Behera was born in Orissa, India. He received the B.Sc. degree from Utkal University in 1995, the M.Sc. degree from Berhampur University in 1997, the M.Tech. degree from Cochin University of Science and Technology, Kochi, Kerala, in 2004, and the Ph.D. degree from the Indian Institute of Science, Bangalore, India, in 2011, under the guidance of Prof. K. J. Vinoy. He spent a year as a Postdoctoral Research Fellow at the GE Global Research Centre, Bangalore. He is currently an Assistant Professor in the School of Electronics Engineering, KIIT University, Bhubaneswar, India. His research interests are in the fields of computational electromagnetics and microwave engineering.

Debaprasad Barad was born in Odisha, India. He received the B.Tech. degree from Biju Patnaik University of Technology, Odisha. He is currently a Junior Research Fellow in the School of Electronics Engineering, KIIT University, Bhubaneswar, India (under a DST project, Govt. of India). His research interests are in the fields of communication, networking and microwave engineering.
Abstract— A lifetime-centric model of a single-hop regenerative-relay-based cognitive radio system is introduced. The objective of this paper is to minimize the outage probability while satisfying the total power constraint and the interference threshold of the primary user (PU). The lifetime of the cognitive radio network is also considered, using the concept of battery life to improve the overall capacity of the cognitive radio (CR). The overall capacity of the secondary user (SU) is calculated from the subcarrier allocation between the source and destination of the secondary user. Extensive simulations have been performed for the energy-aware (EA) and non-energy-aware (NEA) schemes to obtain a better network lifetime as well as a lower outage probability without disturbing the threshold limit of the PU.

Keywords- cognitive radio network, OFDM, interference threshold power, outage probability, network lifetime.

I. INTRODUCTION

A CR has the capability of acquiring knowledge about its environment by itself. The concept came into existence in 1998, introduced by J. Mitola in a seminar at KTH (the Royal Institute of Technology in Stockholm) and discussed broadly in [1] and [2]. The concept of cognitive radio was published in an article in 1999 [3]. Later, S. Haykin gave a brief architectural view of cognitive radio in [4], taking the software radio concept into account.

CR has been proposed as a means for secondary users to improve spectrum utilization by exploiting the existence of spectrum holes [5]. A Federal Communications Commission (FCC) survey found that many licensed bands remain unused 90% of the time [6]; hence there is a definite need to utilize these unused licensed bands. Orthogonal frequency division multiplexing (OFDM) is the most widely adopted technique in cognitive radio networks because of its manifold benefits for spectrum sensing. Interference in OFDM-based systems proves to be lower, or at least more adaptable, than in the other available techniques, and a tradeoff between interference and overall capacity can be achieved with the help of OFDM [7]. Subcarriers play an important role in OFDM-based networks, which helps greatly in cognitive radio networks associated with OFDM [8], [9] & [10].

The cognitive relay network (CRN) came into existence later, but it faces difficulties such as deficient spectrum allocation and failure to meet user demand. To make communication flawless, cooperative relay models have been introduced and remain an active field of research; these relays are often termed cooperative relays for the CRN [11]. Relay selection, and power allocation schemes based on it, were produced later to improve the CRN and keep it compatible with the growing level of research [12]. Optimal power allocation is a vital problem in building a CRN with efficient characteristics, and in this regard relay-assisted OFDM-based CRNs prove worthwhile. Different power allocation schemes have been generated for different relay networks [12], and there are different arrangements of subcarriers for different relay-assisted CRNs [13]. On the basis of subcarrier allocation, new ideas can be generated from the perspective of CRNs.

The simplest approach was the single-hop CRN based on relay-aided OFDM; then came two-hop relayed transmission networks [14]. Their performance has been evaluated on the basis of the Rayleigh distribution [15]. Resource or power allocation based on QoS (Quality of Service) monitoring is also taken into account, side by side with the realization of the outage probability [16]. By implementing the network-lifetime concept, the power allocation can be driven to an optimal point while keeping the network life in view [17]; the lifetime-based approach is considered for a multihop CRN in [17].

The rest of the paper is organized as follows. Section II deals with the system model. The objective function formulation is covered in Section III. Extensive simulation and result analysis are discussed in Section IV. Finally, Section V concludes the paper.

II. SYSTEM MODEL

The system model is a single-hop relayed cognitive radio transmission network based on subcarriers. All channels are assumed to be Rayleigh distributed for simplicity of calculation, and the fading is likewise considered Rayleigh fading. The primary concern here is the outage probability, defined as the link quality falling below a certain threshold on the path from the source towards the destination. The outage probability of the SU is limited by the interference power threshold (IPT). In the transmission scheme, the SU and the relay interfere with the PU band, but this interference must be kept below the threshold limit. The power is to be distributed in an optimal way between the source and the relay node to minimize the outage probability, and using this optimal power allocation we also increase the overall capacity of the system.

For a single-hop CRN under Rayleigh fading, the outage probability is given as [15]

    Pout = 1 − exp( −γth (Gk/Pk + Gj/Qj) )        (2)

where
γth = predetermined threshold;
Gk, Gj = link parameters accounting for antenna gain, path loss, shadowing, noise power, etc.;
Pk = total power allocated to the k subcarriers (source to relay);
Qj = total power allocated to the j subcarriers (relay to destination).

Now, to minimize the outage probability, we have to maximize [15]

    exp( −γth (Gk/Pk + Gj/Qj) )        (3)

or, equivalently, minimize

    γth (Gk/Pk + Gj/Qj)        (4)
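A small numerical sketch of equations (2)-(4): under a sum power constraint Pk + Qj = Ptot, minimizing Gk/Pk + Gj/Qj has the closed-form solution Pk ∝ sqrt(Gk), a standard Cauchy-Schwarz result. The specific numbers below are illustrative assumptions, not values from the paper:

```python
import math

def outage(gamma_th, gk, gj, pk, qj):
    """Equation (2): outage probability of the single-hop relayed link."""
    return 1.0 - math.exp(-gamma_th * (gk / pk + gj / qj))

def optimal_split(p_tot, gk, gj):
    """Minimise gk/pk + gj/qj subject to pk + qj = p_tot (equation (4)):
    power is allocated proportionally to the square root of the link parameter."""
    pk = p_tot * math.sqrt(gk) / (math.sqrt(gk) + math.sqrt(gj))
    return pk, p_tot - pk

gk, gj, p_tot, gamma_th = 2.0, 8.0, 10.0, 0.1   # illustrative values
pk, qj = optimal_split(p_tot, gk, gj)
# The optimal split yields a lower outage probability than an equal split.
print(outage(gamma_th, gk, gj, pk, qj) <
      outage(gamma_th, gk, gj, p_tot / 2, p_tot / 2))  # -> True
```

With these numbers the optimal split gives Pk = 10/3 and Qj = 20/3, i.e. the weaker hop (larger G) receives the larger share of the power budget.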
Fig. 3 provides a comparison between the NEA and EA schemes on the basis of outage probability versus transmission power. In this particular case, the outage probability is observed to be almost 17% to 20% higher for the EA approach than for the NEA approach, but a better result is achieved with respect to the multihop scheme in [17]: a 7-11% reduction in outage probability for EA, and a 2-7% reduction for the NEA single-hop scheme, compared with the previous work with 4 relays [17].

Figure 4. Comparison of network lifetime and overall capacity of SUs under the optimal scheme.

Fig. 4 gives an overview of network lifetime, overall capacity and total transmission power in the same plot. It also distinguishes between the characteristics of the proposed optimal scheme and the previously published scheme [17]. This comparative study concludes that, using y = 0.5 in the EA scheme, we can reach an optimal solution where a substantial increase in network lifetime is achieved at the cost of a slight (8%) compromise in overall capacity. Network lifetime increases by almost 9-11% in our proposed optimal scheme (taking y = 0.5) compared with the previous optimal scheme with 5 relays in [17].

V. CONCLUSION

In this paper, optimal power allocation is performed by minimizing the outage probability under some limitations. Two constraints are taken into account. The first is the sum power constraint, i.e. a bound on the total power allocated to the subcarriers between the secondary-user source (S.U.S.) and the relay, and between the relay and the secondary-user destination (S.U.D.). The second is the interference threshold constraint experienced by the primary-user destination (P.U.D.) due to mutual interference with the S.U.S. and the relay, respectively. In the proposed scheme, two different types of power allocation are covered in terms of network lifetime. The simulation results discussed earlier show that the EA-based power allocation scheme is far superior to the NEA scheme: network lifetime increases by almost 9-11% while overall capacity decreases by just 8%. As a result of the increased battery life, the total transmission cost can be reduced. It can therefore be concluded that the optimal power allocation approach (y = 0.5) gains a higher network lifetime (a 9-11% increase) and also causes a lower outage probability (7-11%) for the EA scheme than the previously published approach in [17], with only a slight compromise in overall capacity.

REFERENCES

[1] J. Mitola III, "Cognitive Radio: An Integrated Agent Architecture for Software Defined Radio", Ph.D. thesis, Royal Institute of Technology (KTH), Stockholm, Sweden, pp. 1-313, May 2000.
[2] J. Mitola III, Cognitive Radio Architecture, New York: Wiley, 2006.
[3] J. Mitola, Q. Maguire, "Cognitive Radio: Making Software Radios More Personal", IEEE Communications Society, pp. 13-18, Vol. 6, Aug. 1999.
[4] S. Haykin, "Cognitive radio: Brain-empowered wireless communications", IEEE J. Sel. Areas Commun., Vol. 23, pp. 201-220, Feb. 2005.
[5] R. Tandra, S. M. Mishra, and A. Sahai, "What is a spectrum hole and what does it take to recognize one?", Proc. IEEE, Vol. 97, pp. 824-848, Mar. 2009.
[6] FCC, "Spectrum Policy Task Force", Report, ET Docket No. 02-135, Nov. 2002.
[7] T. Weiss, J. Hillenbrand, A. Krohn, and F. K. Jondral, "Mutual interference in OFDM-based spectrum pooling systems", Proc. of the IEEE Vehicular Technology Conference (VTC'04), Vol. 4, pp. 1873-1877, May 2004.
[8] A. Pandharipande, M. Kountouris, Ho Yang, Hyoungwoon Park, "Subcarrier allocation schemes for multiuser OFDM systems", IEEE International Conference on Signal Processing and Communications (SPCOM'04), pp. 540-544, Dec. 2004.
[9] L. K. S. Jayasinghe, N. Rajatheva, "Optimal power allocation for relay assisted cognitive radio networks", Proc. of the 72nd IEEE Vehicular Technology Conference Fall (VTC-2010-Fall), Ottawa, ON, pp. 1-5, Sep. 2010.
[10] G. Bansal, Z. Hasan, Md. J. Hossain and V. K. Bhargava, "Subcarrier and power adaptation for multiuser OFDM-based cognitive radio systems", National Conference on Communication (NCC), Chennai, India, pp. 1-5, Jan. 2010.
[11] Juncheng Jia, Jin Zhang, Qian Zhang, "Cooperative Relay for Cognitive Radio Networks", IEEE INFOCOM, Rio de Janeiro, pp. 2304-2312, April 2009.
[12] Liying Li, Xiangwei Zhou, Hongbing Xu, G. Y. Li, Dandan Wang, A. Soong, "Simplified Relay Selection and Power Allocation in Cooperative Cognitive Radio Systems", IEEE Transactions on Wireless Communications, Vol. 10, No. 1, Jan. 2011.
[13] G. A. S. Sidhu, Feifei Gao, Wei Wang, Wen Chen, "Resource Allocation in Relay-Aided OFDM Cognitive Radio Networks", IEEE Transactions on Vehicular Technology, pp. 3700-3710, April 2013.
[14] Yong Li, Wenbo Wang, Jia Kong, M. Peng, "Subcarrier pairing for amplify-and-forward and decode-and-forward OFDM relay links", IEEE Communications Letters, Vol. 13, No. 4, April 2009.
[15] M. O. Hasna, M. S. Alouini, "Optimal power allocation for relayed transmission over Rayleigh fading channels", IEEE Transactions on Wireless Communications, Vol. 3, No. 6, pp. 1999-2004, Nov. 2004.
[16] M. Chiang, D. O'Neill, D. Julian and S. Boyd, "Resource allocation for QoS provisioning in wireless ad hoc networks".
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
Abstract—Modern digital systems become more complex with increasing use of multi-clocking techniques for better performance. Multiple asynchronous clock domains are being used for different I/O interfaces in today's modern system-on-chip (SoC). Each system needs to communicate continuously with other systems or peripherals, and these multiple asynchronous clock domains face meta-stability, data loss and other clock domain crossing (CDC) issues. CDC is an important issue in all of today's SoCs. In this paper we demonstrate how meta-stability occurs at a CDC boundary, and present a comparison of basic synchronizers on the basis of latency, with the aims of reducing the propagation of meta-stability, increasing the mean time between failures (MTBF) and avoiding data loss in multi-clock domains.

Index Terms—Multi clock domain crossing, meta-stability, setup and hold time violation, synchronizer, MTBF.

I. INTRODUCTION

Clock signals are fundamentally the most important control signals in digital systems. The co-ordination of clock signals among the components determines the performance of a system at any level: on chip, on board, or across boards. A system can be triggered at an edge of the clock or at a level of the clock; a double-data-rate system can be triggered at both the positive and negative edges. The main issue with edge triggering is that if the receiver samples a signal during a transition, the BER (bit error rate) increases, whereas sampling at the middle of the bit decreases the bit error rate [6]. In a multi-clock system, one clock can be derived from the other, and the two will be synchronized to each other. Multiple asynchronous clocks can be used to run multiple cores independently. Suppose that in a multi-processing system one processor is faster than another; to keep them synchronous, the whole system would have to run at the clock rate of the slowest processor, and the throughput of the system would decrease. To increase throughput with processors of different speeds, asynchronous clocking can be used: each processor runs at its own speed without compromising the other. A system-on-chip consisting of multiple clock domains connected through asynchronous interconnect is called a globally asynchronous, locally synchronous network-on-chip (GALS) [6]. GALS is a model of computation which allows the synchronous assumption to be used to model and design a system; in such a system each synchronous module (IP core) is independent [2].

To build such a system, in which asynchronous clock domains interconnect synchronously, a synchronizer is used; it can be called a clock domain crossing synchronizer. When data is transferred between two independently clocked systems (of different phase and frequency), a synchronizer has to be used to avoid meta-stability and data loss and to ensure proper reception of data at the receiver side.

A. Clock Domain Crossing Issues

• Meta-stability

Meta-stability is a well-known issue that may cause system failures in digital systems where signals are transmitted across asynchronous clock domains [4]. Meta-stability mainly occurs when sampling takes place during the transition of a signal from one level to another; it arises from setup and hold time violations. The proper operation of a flip-flop in a synchronization circuit depends on the stability of its input signal for a certain time period before and after the clock edge. The time interval before the clock edge is called the setup time, and the time period after the clock edge is called the hold time. If a transition of the input signal takes place during this setup-and-hold window, meta-stability will occur.

Fig. 1. An example of a CDC circuit and metastability

Meta-stability means that the output of the flip-flop may oscillate for an indefinite time, and may or may not settle to a stable value before the next clock pulse. Fig. 1(a) [1] shows the multi-clock CDC circuit, in which clk1 is the sender domain clock and clk2 is the receiver domain clock.
Fig. 3. Timing waveforms showing hold-time violations for the circuit in
Fig. 2. Timing waveforms showing setup-time violations for the circuit in Fig. 1(a).
Fig. 1(a).
Two D-flip flop has been shown,D1 is sender domain side flip- of the receiver flip-flop then that changes in Q1 will not be
flop driven by clk1 and D2 is receiver side flip flop driven by captured by the receiver flip-flop at the output as shown in
clk2.In this the Q1 signal is transmitted by sender domain and fig.3(b)[1].
need to capture at the receiver side properly.
Here the clk1 and clk2 are asynchronous to each other, II. R ELATED W ORK
shown in fig.1 (b)[1].If the transition in signal Q1 takes
place during the setup or hold time period, the meta-stability
will occur. If a transition takes place near to active edge of
clk2.a setup time violation occurs and if transition takes place When passing signals between clock domains, an important
during hold time period, hold time violation occurs and leads thing required is a synchronizer to synchronize multi clock
to meta-stability on Q2. domains and to transfer signals from one domain to another
domain properly.
• Setup time violation The authors in [1] discussed about multi-flip flop synchro-
Fig.2 (a)[1] demonstrate a sample waveform for the cir- nizer to resolve meta-stability. In [1], [2] the authors proposed
cuit shown in fir.1(a).If unexpected delay occurs at Q1, and multi-clock domain crossing fault detection method and fault
transition in signal takes place in the setup time period of diagnosis and recovery method in post-silicon verification
the receiver flip flop clock i.e.clk2 the receiver flip flop may process. In [3] how meta-stabilities led to an error in system is
capture the value 0 even though the expected value is 1.As represented and how often meta-stability will occur, a FPGA
shown in above example, if transition at signal Q1 will not based experimentation is demonstrated.
takes place until the next clock edge of the receiver then the In [4],a synchronization technique was proposed to design
receiver will get expected value ”1” at the next clock pulse. safe synchronizer in FPGA based TMR circuits and demon-
Such a setup time violation produces a one-clock-pulse delay. Fig. 2(b) [1] demonstrates another possibility of setup time violation. In this waveform we can see that, after the first setup time violation, if the signal Q1 changes its state before the next receiving clock edge, or during the setup time period of the next clock pulse, then the receiver cannot capture the change in the signal Q1. If the same kind of transition takes place on the next successive receiver clock pulses, the signal Q2 remains unchanged.

• Hold time violation
If a signal changes during the hold time, a flip-flop experiences a hold-time violation, and the change at its input may be captured incorrectly. Fig. 3(a) [1] demonstrates the waveform of such a case for the CDC circuit shown in Fig. 1(a). If the input of the receiver flip-flop changes during its hold-time period, an incorrect change appears on the output signal of the receiver flip-flop. For example, as shown in Fig. 3(a) [1], the receiver flip-flop gets an output value of 1 one clock cycle earlier than expected. A similar incorrect capture can occur if the input signal to the receiver flip-flop changes just before the next rising clock pulse.

The proposed synchronizer was demonstrated by mathematical modelling and fault injection testing on an FPGA circuit to provide improvements in reliability. The author in [5] discussed various synchronization techniques to address passing of one or multiple signals across a clock domain boundary. The authors in [6] proposed a synchronization model and simulated it to analyze and verify the probability of failure, and demonstrated its relation with clock rate. In [7], the authors presented the design of five types of clock domain crossing schemes, along with an approach using assertion-based verification (ABV) to verify proper functionality of CDC signals. In [8], the authors presented a detailed comparison of three synchronizers, along with their latency and power consumption. In [9], the authors proposed C-element based synchronizers at the clock domain crossing interface in order to improve the MTBF. As discussed earlier, different synchronization techniques have been developed for different applications, according to the data and clock rates. In the next section we discuss some basic synchronization techniques and their disadvantages under different clock- and data-rate conditions.
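The setup and hold windows described in this section can be expressed as a simple timing predicate. The following is a minimal sketch, not taken from the paper; the function name and the use of nanoseconds are illustrative assumptions.

```python
# Sketch (illustrative, not from the paper): classify a single data
# transition relative to one active clock edge. All times are in ns.

def violates_timing(t_transition, t_clock_edge, t_setup, t_hold):
    """Return 'setup', 'hold', or None for one data transition."""
    if t_clock_edge - t_setup <= t_transition < t_clock_edge:
        return 'setup'   # data changed inside the setup window
    if t_clock_edge <= t_transition < t_clock_edge + t_hold:
        return 'hold'    # data changed inside the hold window
    return None          # transition was safely outside both windows
```

For a clock edge at 10 ns with a 1 ns setup window and a 0.5 ns hold window, a data transition at 9.5 ns is a setup violation and one at 10.2 ns is a hold violation.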
A. Clock domain crossing synchronizers

In this section some of the common synchronization techniques are discussed. One problem introduced by synchronizers is the possibility that a signal from the sending clock domain changes more than once before it can be sampled, or changes too close to the sampling edges of the slower clock domain. This situation causes sent signals to be lost or missed across the domain boundary. There are two general approaches to the problem when missed signals are not allowed [3]:

(1) Open-loop: signals are captured without acknowledgment.

(2) Closed-loop: acknowledgement of receipt is required for every signal that crosses a CDC boundary.

1) Two flip-flop synchronizer

The reliability of a synchronizer is commonly expressed through its mean time between failures (MTBF), modelled as

    MTBF = e^(t_MET / tau) / (T_W * f_C * f_D)

where f_C is the sampling clock frequency, f_D the data change frequency, T_W the metastability window, tau the resolution time constant, and t_MET the time available for a metastable output to resolve. As we can observe in this equation, the sampling clock frequency (how fast signals are being sampled) and the data change frequency across the clock domain crossing boundary are the main factors that directly impact the MTBF of a synchronizer circuit, and it can be concluded that in higher speed designs failures occur more frequently.

2) Gray-coding synchronizer

For multi-bit clock domain crossing signals, even if a correct synchronizer is used there is no guarantee that all bits are received properly by the receiving domain, because metastability can corrupt the value. For multi-bit transmission across clock domains, one common approach is the Gray code synchronizer shown in Fig. 5 [7].

The latencies of the common synchronizers compare as follows:

    Synchronizer             Latency
    Multi-flip-flop          >2 cycles (depends on the number of flip-flops)
    Gray code synchronizer   2 cycles (depends on the number of flip-flops)
    Asynchronous FIFO        >2 cycles (depends on memory status)
    Handshake synchronizer   6-7 cycles
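The property that makes Gray coding safe for multi-bit crossings, namely that successive values differ in exactly one bit, can be sketched as follows (a generic illustration, not code from [7]):

```python
# Sketch: binary <-> Gray conversion as used by Gray-coding synchronizers.
# Successive counter values differ in exactly one Gray-coded bit, so at
# most one bit can be metastable when the value crosses the clock boundary.

def bin_to_gray(n):
    return n ^ (n >> 1)

def gray_to_bin(g):
    n = 0
    while g:
        n ^= g       # fold the higher bits back in
        g >>= 1
    return n
```

For example, successive counter values 3 (011) and 4 (100) differ in three binary bits, but their Gray codes 010 and 110 differ in only one bit.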
Abstract: Currently, exhaustive hacker intelligence leads to leakages in software applications (SA), and thereby to violation of intellectual property protection in software. The software industry evaluates output applications and data based on Digital Rights Management (DRM) as a part of Intellectual Property Protection [3]. The term refers to a range of intangible rights of ownership in an asset such as a software program [3]. To prevent software applications (SA) from leaking, obfuscation is one of the solutions identified [1]. The obfuscation concept is one of the techniques that can be researched and employed in programming languages with boundary statistics. This paper presents an analysis of obfuscation and its related techniques, used in the protection of intellectual property rights (IPR). As an advantage, obfuscation can be utilized to secure software against injected faulty code and data attacks. Even though the obfuscation transformation can protect code, it has drawbacks and caveats in its implementation, with reduced performance due to larger code checkpoints.

Keywords – Intellectual Property Protection, Obfuscation, Code Obfuscation, Software Protection

I. INTRODUCTION

The word obfuscation means "creating confusion". This technique is used to create complicated code segments. It generates obfuscated code, which makes a program harder to understand when it is decompiled, while the decompiled code has no effect on the code's functionality. Obfuscated programs are used to protect code by making it harder to reverse engineer. The main objective of obfuscation is to harden software against reverse engineering, at the cost of some change in compilation time consumption.

Figure 1 shows a general obfuscation of a program: the source code (input) is passed through an obfuscator, which produces obfuscated code (output).

Fig.1 Overview of Obfuscation

The obfuscation methods in a software code include the following tasks: searching for code sequences in the software; if similar sequences are found, generating different code sequences as part of the existing code sequences; inserting a branch at the end of each code sequence; applying a branch operation to the control flow; and following each code sequence, which results in the execution of the multiple tasks specified in the program.

A. Need of Intellectual Property protection

Recent statistics [6] show that four out of every ten software programs are pirated worldwide. This indicates a software threat to programmers, and thus to the global economy. The intellectual property of a software product is protected via two mechanisms, namely legal IPP and technical IPP. Legal IPP comprises acquiring software copyrights and legally authorizing the parties involved, to prevent the generation of duplicates. Technical IPP includes various methods: code authentication, encoding, tamper-proofing, watermarking, and code obfuscation [1]. These techniques are developed to defend the intellectual property protection (IPP) of software applications (SA) against various software attacks. Nowadays, during the development of a software design, its intellectual property is generated simultaneously. A defense against reverse engineering is obfuscation, a technique that leaves software confusing yet functional.
978-1-4673-6524-6/15/$ 31.00
S.POORNIMA et al: ANALYSIS ON OBFUSCATION FOR THE EVALUATION OF INTELLECTUAL PROPERTY PROTECTION (IPP)
Fig.2 Intellectual Property Protection (IPP) analysis on software via different obfuscations (the figure's branches include symbols renaming and instructions substitution)
A. Symbols renaming

Obfuscating the software code makes the debugging process harder. This is due to a feature called symbols renaming, in which obfuscated symbol names substitute the program's original symbols. A rename refactoring operation can be performed on any symbol, such as the name of a class, method, or function, except for a method declared inside a protocol interface [7]. During the development of a software program, the required data is inserted in terms of symbols, classes, variables, functions, calls, etc. This data is what an attacker needs in order to understand the working principle of the software. Symbols renaming is thus the technique of substituting the names of classes, member fields, methods, etc. with other names that do not carry much meaning, which makes the code harder to understand and more difficult to reverse engineer.

There are, majorly, two methods implemented for symbols renaming:

Printable chars – In this method, ASCII characters are used for renaming classes and methods. Here, the original characters can be recovered easily from a stored stack trace.

Unprintable chars – In this method, unprintable Unicode characters are used. Here, the original characters cannot be recovered.
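As a rough illustration of symbols renaming, identifiers can be substituted through a rename map. This sketch operates on plain source text with word-boundary matching; a production obfuscator works on a parsed representation. The function name and example identifiers are illustrative assumptions, not from the paper.

```python
# Sketch: substitute meaningful identifiers with meaningless ones via a
# rename map. Word boundaries (\b) prevent renaming inside longer names.
import re

def rename_symbols(source, rename_map):
    pattern = re.compile(r'\b(' + '|'.join(map(re.escape, rename_map)) + r')\b')
    return pattern.sub(lambda m: rename_map[m.group(0)], source)
```

For example, renaming with {"price": "a", "quantity": "b", "total": "c"} turns "total = price * quantity" into "c = a * b", while leaving an identifier like "subtotal" untouched.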
The most powerful concept that protects the software is code virtualization, which modifies the header of the software to be protected. Code virtualization typically transforms the software's byte code into instructions for an embedded virtual machine.

IV. AUTOMATIC CODE OPTIMIZATION

Code optimization is implemented on compiled classes and is language dependent. Automatic code optimization improves performance on platform-dependent languages.

Theory: A transformation T is a local transformation if it affects a single basic block of a control-flow graph (CFG); it is global if it affects an entire CFG; it is inter-procedural if it affects the flow of information between procedures; and it is an inter-process transformation if it affects the interaction between independently executing threads of control.

Automatic code optimization turns the application into a top-performing application by making its execution fast. It performs optimization automatically by replacing code fragments, which helps to establish an effective pipeline.

V. CODE CONTROL FLOW OBFUSCATION

Control flow obfuscation techniques are renowned and well constructed, but they are language specific. Control flow obfuscation is one of the techniques employed against the reverse engineering addressed by obfuscation [8]. For control flow obfuscation, three criteria are considered.

Functions – Control flow obfuscation operates at the function level, and proves the most successful against reverse engineering. It replaces the original functional code with false code, which leaves the attacker with no understanding; the pieces of false code are tied together by complex opaque predicates.

Code – It performs well-known operations such as reorder, replace, fold and unfold, and delete and add function.

Platform – Depending on the language platform, it operates on compiled classes; the codes are replaced with false codes, which cannot yield proper results even under reverse engineering.

The control-flow efficiency is defined with respect to the applied transformation and increases as the duplicate labels l'duplicates are reduced relative to the labels of the total program l'total; it can be expressed as the ratio

    Slabel = (l'total - l'duplicates) / l'total        (1)

Similarly, the efficiency of the code during compilation, with C'compiled compiled lines of code and C'LOC unreadable lines of code after the transformations are applied, can be expressed as

    Sreadable = C'LOC / C'compiled        (2)

String encryption generally encrypts all strings at every point in the program (usually by replacing strings with a method call), which makes string encryption easier to automate than most data obfuscations.

VI. ASSEMBLIES MERGING AND EMBEDDING

Function codes in an application program are known as assemblies. A set of assemblies is grouped into a single file by a process called merging. This technique can be used to merge an executable file with its dynamically linked libraries, which allows distributing a runnable program as a single file [10]. Assemblies merging reduces the overall source lines of code used in the implementation.

Theory: Two or more scalar variables V1, ..., Vk can be merged into one variable VM, provided the combined ranges of V1, ..., Vk fit within the precision of VM.

Fig. 4 Merging variables

Assemblies merging takes a set of input assemblies and merges them into one target assembly. The first assembly in the list of input assemblies is the primary assembly. When the primary assembly is an executable, the target assembly is created as an executable with the same entry point as the primary assembly. Also, if the primary assembly has a strong name and a signature file is provided, the target assembly is re-signed with the specified key so that it also has a strong name, as shown in Figure 4. Assemblies merging occurs before any other protection operation; therefore, if assemblies merging is used, all protection settings set for secondary assemblies are ignored. The protection settings for the merged assembly are taken from the primary assembly.

A. Debugging support

One of the side effects of obfuscation is the difficulty of debugging obfuscated code. Exceptions generated and reported by a user will typically include obfuscated method and class names, making it almost impossible to map the stack trace back to the source code. The obfuscator therefore generates a clearly labeled map file containing a detailed description of the obfuscated entities and their original names; this information is essential for the user when interpreting debugger output from the obfuscated assembly.

VII. INSTRUCTIONS SUBSTITUTION

The goal of this obfuscation technique is simply to replace standard binary operators (such as addition, subtraction, or boolean operators) with functionally equivalent but more complicated sequences of instructions. When several equivalent instruction sequences are available, one is chosen at random [13]. This kind of obfuscation is rather straightforward and does not add much security, as it can easily be removed by re-optimizing the generated code. However, provided the pseudo-random generator is seeded with different values, instruction substitution brings diversity into the produced binary. Currently, only operators on integers are supported, as substituting operators on floating-point values introduces rounding errors and unnecessary numerical inaccuracy.

VIII. CONTROL FLOW FLATTENING

The purpose of this pass is to completely flatten the control flow graph of a program. As the example shows, all basic blocks are split and put into an infinite loop, and the program flow is controlled by a switch on the variable b.
Original program:

    #include <stdlib.h>

    int main(int argc, char** argv) {
        int a = atoi(argv[1]);
        if (a == 0)
            return 1;
        else
            return 10;
        return 0;
    }

Flattened program:

    #include <stdlib.h>

    int main(int argc, char** argv) {
        int a = atoi(argv[1]);
        int b = 0;
        while (1) {
            switch (b) {
            case 0:
                if (a == 0)
                    b = 1;
                else
                    b = 2;
                break;
            case 1:
                return 1;
            case 2:
                return 10;
            default:
                break;
            }
        }
        return 0;
    }
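The instruction substitution idea of Section VII can likewise be sketched. Assuming, for illustration, a small table of integer-addition equivalents (the identities are standard two's-complement facts, not taken from [13]):

```python
# Sketch of instruction substitution on integer addition: x + y is
# replaced by a functionally equivalent but more complicated sequence,
# chosen at random from a small table (integers only, as in Section VII).
import random

SUBSTITUTES = [
    lambda x, y: (x ^ y) + 2 * (x & y),  # add via xor/and carry identity
    lambda x, y: x - (~y) - 1,           # add via two's-complement negation
]

def obfuscated_add(x, y, rng=random):
    return rng.choice(SUBSTITUTES)(x, y)
```

Because every entry in the table computes the same function, the result is unchanged while the emitted instruction sequence varies from build to build.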
administrator over a period of 29 years in the service of the student community. Presently he is working as the Principal of S R Engineering College (an autonomous institution under JNTUH), Ananthasagar, Warangal, India. He holds a Bachelor's degree in Electronics & Communications Engineering and double postgraduate degrees: an M.Tech in Electronic Instrumentation and an M.E in Information Science & Engineering. He was awarded a Ph.D degree in Computer Science and Engineering by the Indian Institute of Technology, Kharagpur, West Bengal, India. He started his career as a Lecturer in Electronics and Instrumentation Engineering in 1985 at Kakatiya Institute of Technology & Science, Warangal, and rose to occupy the post of Principal of the same institute in 2007. Under his guidance 15 students have completed their Master's dissertations, including a scholar from Italy. Two research scholars were awarded Ph.D degrees, two more research scholars have submitted their Ph.D theses, and 26 other research scholars are working at different universities in the state on a part-time basis toward a Ph.D. He has published 146 technical research papers in various national and international journals and conferences, and 3 textbooks. He has been contributing to several international journals, helping to bring out pragmatic research publications, as a member of their editorial and advisory boards and as Editor in Chief. He has served several times as a member as well as chairperson of the Boards of Studies in Computer Science & Engineering and Information Technology at several universities and institutions. He has served as a member of the Industry Institute Interaction (III) Panel of the A.P. State Council of the Confederation of Indian Industry (CII), Andhra Pradesh. He was also nominated as a member of the Admissions Committee of the Engineering Agricultural Medicine Common Entrance Test (EAMCET) and the Post Graduate Engineering Common Entrance Test (PGECET) Committee in 2010 by the Andhra Pradesh State Council of Higher Education, Hyderabad.

... in 2004; an M.Tech (CSE) with a Gold Medal from Dr. MGR Educational and Research Institute University in the year 2007 (distinction with honor); and a Ph.D from St. Peter's University in August 2014. He is presently working as an Associate Professor at SR Engineering College, Warangal. He has 10+ years of teaching experience. His areas of interest include information security and sensor networks. He has published papers in various national and international journals and conferences, and has also attended many national workshops, FDPs, and seminars. He is a member of ISTE, CSI, IACSIT, and IAENG.
Abstract—This paper elaborates the concept of real-time object detection using image processing, applied to determining a robot's position. The main task of this system is the development of an image processing unit that is able to process captured images of the robot and enable detection of the robot's position. The system executes three main steps, carried out with MATLAB software. First, images with and without the robot are captured with the help of a webcam. The second step is image subtraction, which shows whether the robot is present in the image or not. Finally, by finding the centroid of the detected robot, we determine the exact position of the robot in the image.

Keywords—Webcam, Image Subtraction, Robot Detection, Centroid Detection

I. INTRODUCTION

Nowadays, the most interesting research field is image processing and its applications. In particular, real-time image processing and its automatic analysis have wide applicability in today's era: they not only reduce personnel expenses but also make performance accurate and reliable. Video processing is possible in two ways, offline and online. In offline video processing, the whole set of images is taken and then processed. In online video processing, each captured image is processed immediately, i.e., capturing time and processing time coincide [1][2]. Both methods have advantages and disadvantages. Offline image processing is used for medical image processing, biomedicine data processing, etc. But the most advantageous form of video processing is online image processing, which has applications in object tracking, object detection, finding the exact position of a robot, observing a robot, etc. A. Sai Suneel et al. [3][2] give an application of moving-object detection and tracking in video surveillance, in which a static camera is used to check velocity and distance parameters and the object is detected using an image difference algorithm; the algorithm estimates the moving object's velocity using image processing techniques from the camera adjustment parameters. The detection of objects is a challenging task in computer vision when multiple objects have to be detected from video. Detection and tracking of moving objects increases the complexity of the algorithm. The Kalman filter has been used for the detection of multiple moving objects, with applications in security systems, video indexing, etc. [4].

S. M. Low et al. [5] present an image capturing subsystem with two IR range finders as sensors and a wireless camera to capture the images; the successful development of this system enables further work in object identification and pattern recognition for robotic vision systems. To find the exact position of an object, the center point, i.e., the centroid of the image, has to be calculated. Since the intensity at the center point is higher than in other regions, centroid detection is sensitive to environmental noise, as it is obtained from the pixels' intensities [6].

In the image subtraction method, two images are subtracted from each other, after which only the object remains; in this way we obtain the position of the object. The centroid detection method is more difficult because the centroid is obtained by calculating the mean value of the pixels' intensities, and the pixel intensities change when the object is outside the camera region [7]. The centroid detection algorithm has been applied in many fields. For centroid detection, evaluating the uncertainty of the detected centroid is the most important approach to quantifying its reliability [8].

In this paper, the concept of online object detection is applied to finding a robot's position. The proposed system is a real-time system. The surveyed literature shows that previous systems are mostly based on video or vision, so those systems require a database to store videos; as the proposed system is real time, images are captured at the time of processing.

The paper is organized in five sections: Section I gives the introduction, Section II presents methods for object position finding, Section III contains an overview of the proposed work, Section IV discusses experimental results, and Section V concludes the paper. The references used to clarify the different concepts related to the proposed system are listed after the conclusion.
II. METHODS FOR OBJECT POSITION FINDING

For object position finding, three methods are used. These are as follows:

The flow of the proposed work is shown in Figure 1, beginning with image capture by the webcam.
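The image subtraction and centroid detection steps described in this paper can be sketched as follows. This is a minimal illustration on plain grayscale intensity grids; the threshold value and function names are assumptions, and the paper's actual implementation is in MATLAB.

```python
# Sketch of the two core steps: image subtraction (background vs. current
# frame) followed by centroid detection on the resulting binary mask.

def subtract(background, frame, threshold=30):
    """Binary mask of pixels that changed by more than `threshold`."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def centroid(mask):
    """Mean (row, col) of the foreground pixels, or None if none."""
    pts = [(r, c) for r, row in enumerate(mask)
           for c, v in enumerate(row) if v]
    if not pts:
        return None
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
```

Subtracting the plain image from the robot image isolates the robot's pixels, and the centroid of those pixels gives its position in image coordinates.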
b) Captured 2nd image (robot image): after a time interval of 5 sec, the webcam captures the second image, now with the robot present (Fig. 3 shows the plain image; the robot image is shown alongside).

IV. CONCLUSION

In this paper, the image subtraction method and the centroid detection method are used to find the robot position in the image and the exact center point of the robot. This methodology of object position finding can be used for tracking one moving object against any background. In future work, we want to track the robot up to a destination point; through automatic and manual operation, the robot will reach the destination point.
REFERENCES
[8] Jiechun Chen, Liping Zhao, "A New Method for Uncertainty Evaluation of Centroid Detection," 2nd International Conference on Image and Signal Processing, 2009.
[9] N. Prabhakar, V. Vaithiyanathan, Akshaya Prakash Sharma, Anurag Singh and Pulkit Singhal, "Object Tracking Using Frame Differencing and Template Matching," Research Journal of Applied Sciences, Engineering and Technology, pp. 5497-5501, December 15, 2012.
[10] Bharti, Tejinder Thind, "Background Subtraction Techniques - Review," International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume 2, Issue 3, pp. 166-168, February 2013.
[11] C. Alard and Robert H. Lupton, "A Method for Optimal Image Subtraction," The Astrophysical Journal, August 10, 1998, pp. 325-331.
[12] Shruti V. Salunkhe, Kailash J. Karande, Amol B. Jagadale, "Recognition of Robot Position On-line using Still Camera Images," Proceedings of the International Conference on Current Innovations in Engineering and Technology, pp. 64-67, 2.3.2015, ISBN: 978-15-086565-24.
[13] E. Hasanabadi, M. J. Mahjoob, "Trajectory Tracking of a Planar Snake Robot Using Camera Feedback," 2011 2nd International Conference on Control, Instrumentation and Automation (ICCIA).
[14] Klinovský, M., Kruczek, A., "Předmět ROBOTI na ČVUT FEL v Praze," in Proceedings of Conference ARTEP'10, 24.-26. 2. 2010, Technická univerzita Košice, 2010, ISBN 978-80-553-0347-5.
[15] G. Bekey and J. Yuh, "The Status of Robotics: Part I," IEEE Robotics & Automation Magazine, vol. 14, no. 4, pp. 76-81, 2007.
Abstract— Dysarthria is a common acquired speech disorder which may be due to neurological injury, cerebral palsy, or stroke. It is characterized by weakness in the muscles involved in speech production, affecting the meaningful content of speech. An assistive technology that can recognize dysarthric speech, correct it, and synthesize it in the patient's own voice is the baseline of this work. For the present work, the Nemours database of dysarthric speech is used. The current work concentrates only on methods to improve the recognition performance. The dysarthric speech is first analyzed using perceptual and speech rate analysis, which helps to classify the dysarthric speakers as mild, moderate, and severe. The dysarthric speech is then assessed using an automatic speech recognition (ASR) system based on the hidden Markov model (HMM) technique. An isolated-style phoneme recognition system gives details about the deviations in the dysarthric speakers' speech. Improving the recognition performance was first attempted by training and testing the system using all classes of dysarthric speakers, which led to confusions in the recognized text. The second system was therefore trained using only the mild and moderate classes of dysarthric speakers, which yielded a performance improvement of 0.5-2%. To improve the performance further, a third system was trained using a bi-gram language-model-based network. Over the other two systems, the third system achieved a performance improvement of up to 17%. The recognized text from the third system is synthesized using an HMM-based speaker-adaptive speech synthesis system, and its performance is evaluated using a mean opinion score (MOS) obtained from 10 listeners. The average MOS ranges from 2 to 3.5.

Keywords—dysarthria, hidden Markov model (HMM), recognition performance, isolated style recognition, language model, automatic speech recognition (ASR).

I. INTRODUCTION

Dysarthria is a neurological speech disorder whose characteristics are reflected in abnormalities in the strength, speed, range, timing, or accuracy of the speech movements, resulting from weakness in the muscles involved in speech production [1]. This makes the intelligibility of the dysarthric speaker very poor for the listener. The intelligibility levels vary with the severity of the dysarthria. Clinically, patients can be treated with proper medicines and speech therapy techniques based on their degree of dysarthria. But a clinic-based approach to the issue is highly subjective. Certainly, dysarthric speakers use their natural skills to express their thoughts; however, listeners prefer synthetic speech over gestures or dysarthric speech. To improve the meaningfulness of their speech, Kent et al. [2] introduced the concept of intelligibility for dysarthric speakers. The intelligibility level was scaled based on an item identification test; this was used for clinical assessment, but it was not meant to modify the intelligibility. Chen et al. [3] worked on a word detection algorithm that uses a left-to-right HMM technique; it detected 10 digits with a recognition rate of more than 90%. Yakcoub et al. [4] also used an HMM-based technique, but varied the duration of the Hamming window. To maintain naturalness in the synthesized speech, a grafting technique is used which corrects the badly pronounced phonemes. P. Vijayalakshmi and M. R. Reddy [5] developed a continuous speech recognition system for the assessment of dysarthric speakers. A number of insertions were present in the recognition system; to overcome this, an isolated-style recognition system was developed. From the performance of the isolated-style recognition system it was noted that the speakers were also affected by velopharyngeal incompetence, which was analyzed using the group delay function.

Speech recognition basically converts speech to its text form. The focus of this paper is to analyze dysarthric speech using isolated-style phoneme recognition and to throw light on performance improvement in a dysarthric speech recognition system, so that the work required to correct the recognized text is minimal, helping to achieve good performance in the synthesized speech.

This paper is organized as follows. The upcoming section explains the speech corpora used for the analysis and for the recognition system. Section III deals with the analysis of the dysarthric speech; both perceptual and speech rate analyses are considered. Section IV explicates the assessment of the dysarthric speech, gives details about the deviations in the phonemes uttered by the dysarthric speakers, and discusses the systems that were trained to improve the recognition performance. Section V discusses the HMM-based speaker-adaptive speech synthesis system that adapted to and tested the dysarthric speaker's voice. Section VI presents the conclusions and future work.
II. SPEECH CORPORA

As discussed above, the Nemours database [6] of dysarthric speech is used. This database is a collection of 814 short nonsense sentences spoken by 11 different male adult dysarthric speakers. The speaker KS in the database is not considered in the current work because of his very poor intelligibility, with too much struggle in his voice. For the speech rate analysis, the speech rate calculated from the dysarthric speech needs to be compared with normal speakers' rates, for which the TIMIT speech corpus [7] is used. It contains normal speakers' data collected from 630 speakers, each of whom has spoken 10 sentences, giving a collection of 6300 sentences. For the assessment of dysarthric speech, an ASR system needs to be trained with normal speakers' training data and tested with dysarthric speech. The normal speakers' database was collected from 10 different normal male adult speakers in a lab environment. Since the Nemours database is phonetically balanced, and since it serves as the test data, the normal speakers spoke the same text. The 10 speakers spoke 74 sentences each, giving a collection of 740 sentences.

TABLE I. SPEECH RATE ANALYSIS OF DYSARTHRIC SPEAKERS (phones/second)

    Speaker   Average      Max.   Min.
    BB        7            10     5
    BV        8            12     4
    BK        3 (2.996)    5      1
    SC        5            8      3
    JF        6            7      4
    FB        6            8      5
    RK        8            12     4
    RL        3 (3.300)    5      2
    LL        5            4      6
    MH        7            11     5
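The speech rate values in Table I are derived from the time-aligned phonetic transcription. A minimal sketch of that computation follows; the alignment format and the sample values are illustrative assumptions, not data from the Nemours database.

```python
# Sketch: speech rate (phones per second) from a time-aligned phonetic
# transcription. Each entry is (start_seconds, end_seconds, phone_label).

def speech_rate(alignment):
    duration = alignment[-1][1] - alignment[0][0]  # total spoken time
    return len(alignment) / duration               # phones per second
```

For instance, three phones aligned over one second of speech give a rate of 3 phones/second, comparable to the slowest speakers in Table I.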
III. ANALYSIS OF DYSARTHRIC SPEECH

The analysis of dysarthric speech paves the way for a clear understanding of the dysarthric speakers and also helps to identify the deviations in their speech when compared with normal speech. In the perceptual analysis of dysarthric speech, all 740 sentences uttered by the 10 dysarthric speakers in the Nemours database were listened to separately, and the phones in error were noted. A clear picture of the intelligibility of the 10 speakers was gained through the perceptual analysis. This helped to classify them into mild, moderate, and severe classes based on their severity level. Speakers FB, LL, MH, and BB fall under the mild class; speakers BV, RK, and JF belong to the moderate class; and speakers SC, RL, and BK come under the severe category. The details obtained from the perceptual analysis are validated in the assessment stage, which is explained in the following section. The speech rate analysis gives the number of phones uttered per second. Since a dysarthric speaker may elongate, delete, or shorten any phonetic sound, the speech rate varies from speaker to speaker; hence it is also considered in the analysis. The speech rate is calculated from the time-aligned phonetic transcription available in the database. Table I summarizes the speech rates calculated for the dysarthric speakers.

The calculated speech rate needs to be compared with normal speech to reveal the deviations. For calculating the speech rate of normal speakers, the TIMIT speech corpus is used. The speech rate was calculated for all the speakers in that corpus and observed to be in the range of 6-17 phones/second. It can be seen from Table I that certain dysarthric speakers fall within the normal speakers' range; thus, certain speakers follow a normal pace in pronouncing the phones. But speakers BK, RL, and SC, who mostly pronounce only one or two phonemes per second, find it difficult to follow the normal rate: they elongate the phonemes they find difficult, leading to a very low rate of speech.

With this understanding of the dysarthric speakers' intelligibility from the analysis, in the following section we discuss the methods used to assess the database, identify the deviations, and refine the recognition performance using an HMM-based ASR system.

IV. ASSESSMENT OF DYSARTHRIC SPEECH

The dysarthric speech being analyzed is assessed using an ASR system based on the HMM technique, which helps to capture the deviations of the dysarthric speech from normal speech. The methods to reduce these deviations are the core of the paper.

A. Methodology

The ASR system converts speech to text. The ASR system uses HMM-based speech recognition. Initially, the speech is segmented at the phoneme or word level using a Hamming window of size 20 ms. Feature vectors are extracted from the speech. Mel-frequency cepstral analysis, whose coefficients are statistically independent, is used for feature extraction [8]. The feature vectors are 39-dimensional, comprising 13 static coefficients along with 13 delta and 13 acceleration coefficients. These coefficients describe the vocal tract positions. HMMs are trained on the segmented speech. The initial models are trained based on the number of states and mixture components for the segmented context, and are then re-estimated using the Baum-Welch re-estimation technique [9]. The lexicon is the part of the ASR system that gives details about the syntax and semantics of the language to be displayed as text. The dictionary and networks are part of the lexicon. The dictionary describes what is to be displayed on the screen when a model is identified from the spoken text. The network gives sequential information about the recognized segments. All these elements put together formulate an ASR system. The input test speech is given to the trained ASR system. The test speech is segmented and, based on its features, matched with the available trained models. The label corresponding to the model is chosen from the dictionary and displayed as the recognized text. The network helps to provide the sequence of the recognized test speech. The trained and tested recognition system needs to be evaluated. The evaluation is based on the word error rate (WER), which is calculated from the number of words substituted, deleted, and inserted by the ASR system. The % correctness and % accuracy are measured from the WER. These are calculated using the formulae given below, obtained from [10]:
978-1-4673-6524-6/15/$ 31.00
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
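The speech-rate analysis described above (phones uttered per second, computed from a time-aligned phonetic transcription) can be sketched as follows. This is a minimal illustration, not the authors' code; it assumes a hypothetical TIMIT-style segmentation format with one "<start_sample> <end_sample> <phone>" entry per line at a 16 kHz sampling rate, and assumed silence markers h#, pau and sil.

```python
# Sketch: phones per second from a TIMIT-style time-aligned transcription.
# Assumed (hypothetical) line format: "<start_sample> <end_sample> <phone>".

SAMPLE_RATE = 16000  # Hz, typical for TIMIT-style corpora

def speech_rate(phn_lines, sample_rate=SAMPLE_RATE):
    """Return phones uttered per second for one utterance."""
    entries = [line.split() for line in phn_lines if line.strip()]
    # Silence markers carry no phonetic content, so they are excluded.
    phones = [(int(s), int(e), p) for s, e, p in entries
              if p not in ("h#", "pau", "sil")]
    if not phones:
        return 0.0
    duration = (phones[-1][1] - phones[0][0]) / sample_rate  # seconds of speech
    return len(phones) / duration

lines = [
    "0 2400 h#",       # leading silence
    "2400 4800 dh",
    "4800 8000 ax",
    "8000 12800 b",
    "12800 18400 ao",
    "18400 20800 l",
    "20800 23200 h#",  # trailing silence
]
print(round(speech_rate(lines), 2))  # prints 4.35 (5 phones in 1.15 s)
```

A dysarthric utterance with elongated phones yields a longer duration for the same phone count, and hence a lower rate, which is exactly the deviation the analysis looks for.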
%correctness = ((N - D - S) / N) * 100%    ......(1)

%accuracy = ((N - D - S - I) / N) * 100%    ......(2)

where
N - total number of phones or words under recognition
D - total number of phones or words that were deleted
S - total number of phones or words that were substituted
I - total number of phones or words that were inserted
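The %correctness and %accuracy measures in equations (1) and (2) can be computed from a Levenshtein alignment between the reference and the recognized sequence. The sketch below is illustrative (it is not the evaluation tool used by the authors) and assumes the sequences are given as Python lists of words or phones.

```python
def align_counts(ref, hyp):
    """Levenshtein-align ref vs hyp; return (substitutions, deletions, insertions)."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrace to count each error type.
    S = D = I = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            S += ref[i - 1] != hyp[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            D += 1
            i -= 1
        else:
            I += 1
            j -= 1
    return S, D, I

def wer_metrics(ref, hyp):
    S, D, I = align_counts(ref, hyp)
    N = len(ref)
    correctness = (N - D - S) / N * 100   # equation (1)
    accuracy = (N - D - S - I) / N * 100  # equation (2)
    return correctness, accuracy

ref = "the pot is wearing the shoe".split()
hyp = "the pat is wearing the the sue".split()  # 2 substitutions, 1 insertion
c, a = wer_metrics(ref, hyp)
print(round(c, 2), round(a, 2))  # prints 66.67 50.0
```

Note that %accuracy subtracts insertions as well, so it can go negative when a speaker's output triggers many spurious insertions, which is exactly what the negative entries in the performance tables below reflect.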
B. Systems trained for assessment of dysarthric speech with the normal speaker's data

To assess the dysarthric speech, an ASR system was trained using the normal speakers' training data. The ASR system was trained with all 740 sentences of the 10 normal speakers, and word-level segmentation was performed. The database has 113 unique words, thus 113 models were trained. A sentence-level network is used. This ASR system was first tested with the training data itself, which gave 100% accuracy. The system was then tested with the dysarthric data, which achieved poor recognition performance. The next assessment was made by training and testing the sentences with dysarthric speech; 113 word models were trained. Since the system was trained and tested with the same data, a performance of 100% was achieved with the mild and moderate classes, but with the severe class the performance measured was in the range of 93-94%. However, this does not give any details about the phones in difficulty for the dysarthric speakers, so an isolated-style phoneme recognition was carried out.

C. Isolated style phoneme recognition system

For a better analysis of dysarthric speech, the deviations of the dysarthric speech from normal speech need to be captured [11]. This is carried out by developing an isolated-style phoneme recognition system. Inventories are basically used to recognize any speech unit in an isolated style. The Nemours database has 38 unique phones, thus 38 phone models were trained. The inventories were created at the phoneme level but for a sentence pattern; thus, in recognition it is easy to find the phone in error for each sentence. For every test input sentence the features are not extracted as a sentence but in the inventory order, and are recognized as phones. Fig. 1 shows an example of the recognized text in isolated style for the speaker LL for the sentence 'The pat is wearing the sue'. The speaker LL has difficulty in pronouncing the phone /r/, which was noted in the perceptual analysis and has been confirmed through isolated-style recognition, as the phone /r/ has been recognized as /ey/, which is encircled in Fig. 1.

Figure 1. Inventory based recognition in sentence style

This method was first tried with all three classes of dysarthric speakers. But since the phone models had the features of all three classes of dysarthric speakers, who may or may not have pronounced the sound unit correctly, it led to confusions in recognition. To avoid these confusions, the isolated-style phoneme recognition was tried with only the mild and moderate classes of dysarthric people. To get exact assessments, separate models for each speaker could be trained, but we are limited by the database. This method of assessing the dysarthric speech, using inventories for a sentence-level display and using phone models, gives a clear idea about every speaker and also conforms to the perceptual testing.

D. Performance improvement in the recognition system

1) Models trained using all dysarthric speakers:

In system 1, the dysarthric speech of the 10 speakers in the Nemours database was trained and tested using phone models. Since the speakers have spoken 74 sentences each, 50 sentences of every speaker were kept for training and 24 sentences for testing. The phone models are trained using features of all classes of dysarthric speakers; 38 phone models were trained. A dictionary-based network is used in the lexicon, which helps to combine the phones to form the word sequence. The recognition performance is calculated as explained above. Table II gives the recognition performance results for system 1. It is observed from Table II that the severe class of dysarthric speakers has very poor performance. As explained previously, combining the mild and moderate classes may lead to even better performance, because the presence of the severe class in the model may affect the performance of the other dysarthric people.

TABLE II. PERFORMANCE ANALYSIS FOR MODELS TRAINED USING ALL CLASSES OF DYSARTHRIC SPEAKERS

Speaker name   % Correctness   % Accuracy
BB             61.81           42.36
BV             31.25           13.19
BK             25.69           -88.19
LL             46.53           30.56
MH             57.64           34.03
FB             65.28           54.86
JF             37.50           -9.72
RK             24.31           -0.69
RL             43.75           -16.67
SC             33.33           -38.19

2) Models trained using only mild and moderate class of dysarthric speakers:

System 2 was trained using only the mild and moderate classes of speakers because of the confusions in the combined model. Hence, system 2 was trained and tested using only the mild and moderate classes. 38 phone models were trained using only the features of the mild and moderate classes of speakers. A dictionary-based network is used as in system 1. Table III gives the recognition performance results for system 2. The improvement in the performance is only 0.5-2%.
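The isolated-style error localization described in section C (finding the phone in error for each sentence) can be sketched as a position-wise comparison of the reference phone sequence with the recognized one. This is an illustrative sketch, not the authors' implementation, and the phone sequences below are hypothetical examples patterned on the speaker LL case.

```python
# Sketch: locate phones in error by comparing the reference phone sequence of a
# sentence with its isolated-style recognition output (same length, since each
# inventory entry is recognized separately).

def phones_in_error(reference, recognized):
    """Return (position, reference phone, recognized phone) for each mismatch."""
    return [(i, r, h)
            for i, (r, h) in enumerate(zip(reference, recognized))
            if r != h]

# Hypothetical phone sequences for a fragment of speaker LL's test sentence:
reference  = ["dh", "ax", "p", "ae", "t", "ih", "z", "w", "eh", "r", "ih", "ng"]
recognized = ["dh", "ax", "p", "ae", "t", "ih", "z", "w", "eh", "ey", "ih", "ng"]

for pos, ref, hyp in phones_in_error(reference, recognized):
    print(f"position {pos}: expected /{ref}/, recognized /{hyp}/")
# prints: position 9: expected /r/, recognized /ey/
```

Collecting such mismatches over all test sentences gives the per-speaker picture of the phones in difficulty that the perceptual analysis is compared against.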
MARIYA CELIN T.A et al: INTELLIGIBILITY MODIFICATION IN DYSARTHRIC SPEECH
TABLE III. PERFORMANCE ANALYSIS FOR MODELS TRAINED USING ONLY MILD AND MODERATE CLASS OF DYSARTHRIC SPEAKERS

Speaker name   % Correctness   % Accuracy
BB             63.89           43.75
BV             31.94           15.97
LL             48.61           33.33
MH             59.72           36.81
FB             65.89           54.86
JF             39.58           -9.03
RK             24.31           2.08

A recognition performance improvement of up to 17% is achieved using the bi-gram language model as the network. The insertion penalty is a factor that scales the insertion log probability, as explained in [10]; when included in the recognition engine it tries to minimize the unwanted insertions in the recognized output. The insertion penalty varies for every speaker based on their recognition performance. Table V gives the recognition performance analysis using the insertion penalty as a factor of improvement. The configuration of the system is the same as system 3. A performance improvement of 0.2-2% is achieved over system 3.

Fig. 2 shows a performance comparison chart of %correctness for all the systems trained, for all the speakers under the mild and moderate dysarthric classes. System 4 depicts the recognition performance using the insertion penalty as a factor along with a bi-gram language model.

V. SPEAKER ADAPTIVE SPEECH SYNTHESIS SYSTEM

The dysarthric speech was initially analysed, assessed and then recognized, and methods to improve the recognition system were imparted, yielding an improvement. Now, the recognized text obtained from system 4 needs to be synthesized in the dysarthric speaker's own voice, for which an HMM based speaker adaptive speech synthesis system (HTS) is used. The adaptive synthesis system has three phases, namely a training phase, an adaptation phase and a testing phase. For training the speech synthesis system, an hour of speech data of three male speakers from the CMU Arctic database [12] is used, and HMMs were trained on this training data. To adapt the output speech to the dysarthric speaker's voice, 74 sentences from each speaker are used for adaptation. The models were adapted to the dysarthric speaker's voice using the Constrained Maximum Likelihood Linear Regression (CMLLR) [13] technique followed by the Maximum A Posteriori (MAP) [14] technique. The adapted models are then used for synthesis so as to maintain the speaker's identity. The synthesized speech is evaluated using the mean opinion score (MOS). MOS is a five-point grade scale, where a score of 5 depicts excellent intelligibility in the synthesized speech and 1 depicts very poor intelligibility. The MOS was collected from 10 listeners in a lab environment for the dysarthric speaker BB; the MOS ranges from 2-3.5. The same method is applicable to all other dysarthric speakers. The errors made by the dysarthric speakers are replaced by correct phones through the systems trained above. The error-corrected synthetic speech thus bears the identity of the dysarthric speaker.

VI. CONCLUSION

This paper summarizes the analysis of the Nemours database of dysarthric speech; the analysis was assessed using ASR systems. A small improvement was noted in the recognition performance when using only the mild and moderate classes of dysarthric speakers, and an improvement of up to 17% was noted when the system was trained using a bi-gram LM based network. The recognized text is then synthesized, and the synthesized speech was evaluated using MOS, achieving an average MOS in the range of 2-3.5. Although only a small improvement in the recognition system is achieved, a few aspects of articulatory data can also be explored along with the system to gain a better recognition performance improvement.

REFERENCES

[1] R. D. Kent and K. Rosen, "Motor control perspectives on motor speech disorders," in Speech Motor Control in Normal and Disordered Speech, B. Maassen, R. Kent, H. Peters, P. V. Lieshout, and W. Hulstijn, Eds. Oxford, U.K.: Oxford Univ. Press, 2004, ch. 12, pp. 285-311.
[2] R. D. Kent, G. Weismer, J. F. Kent, and J. C. Rosenbek, "Toward phonetic intelligibility testing in dysarthria," Journal of Speech and Hearing Disorders, vol. 54, no. 4, pp. 482-499, 1989.
[3] F. Chen and A. Kostov, "Optimization of dysarthric speech recognition," in Proc. 19th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1997, vol. 4, pp. 1436-1439.
[4] M. Sidi Yakcoub, S. Selouani, and D. O'Shaughnessy, "Speech assistive technology to improve the interaction of dysarthric speakers with machines," in Proc. 3rd International Symposium on Communications, Control and Signal Processing (ISCCSP 2008), 2008, pp. 1150-1154.
[5] P. Vijayalakshmi and M. R. Reddy, "Assessment of dysarthric speech and an analysis on velopharyngeal incompetence," in Proc. 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS '06), 2006, pp. 3759-3762.
[6] X. Menéndez-Pidal, J. B. Polikoff, S. M. Peters, J. E. Leonzio, and H. T. Bunnell, "The Nemours database of dysarthric speech," in Proc. Fourth International Conference on Spoken Language Processing (ICSLP 96), 1996, vol. 3, pp. 1962-1965.
[7] W. M. Fisher, G. R. Doddington, and K. M. Goudie-Marshall, "The DARPA speech recognition research database: Specifications and status," in Proc. DARPA Workshop on Speech Recognition, Feb. 1986, pp. 93-99.
[8] D. O'Shaughnessy, Speech Communications: Human and Machine. New York: IEEE Press, 2000.
[9] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition. Upper Saddle River, NJ: Prentice-Hall, 1993.
[10] S. Young, G. Evermann, M. Gales, T. Hain, et al., The HTK Book. Cambridge University Engineering Department, 2001-2009, revised for HTK version 3.4, Dec. 2006.
[11] S. Lilly Christina, P. Vijayalakshmi, and T. Nagarajan, "HMM-based speech recognition system for the dysarthric speech evaluation of articulatory subsystem," in Proc. International Conference on Recent Trends in Information Technology (ICRTIT), 2012.
[12] J. Kominek and A. W. Black, "The CMU Arctic speech databases," in Proc. Fifth ISCA Workshop on Speech Synthesis, 2004.
[13] M. Ferras, C. C. Leung, C. Barras, and J.-L. Gauvain, "Constrained MLLR for speaker recognition," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 2007, vol. 4, pp. IV-53-IV-56.
[14] S. Goronzy and R. Kompe, "A combined MAP + MLLR approach for speaker adaptation," in Proc. Sony Research Forum, vol. 99, no. 1, 1999.

BIBLIOGRAPHY

Mariya Celin T.A completed her B.E in Electronics and Communication Engineering from Prathyusha Institute of Technology and Management and is currently pursuing her M.E in Communication Systems at SSN College of Engineering, Chennai.

Dr. P. Vijayalakshmi (M'08) is a Professor in the Department of ECE, SSN College of Engineering, Chennai, Tamilnadu. She completed her M.E (Communication Systems) from NIT, Trichy, earned her Ph.D. degree from IIT Madras, and worked as a doctoral trainee for a year at INRS, Montreal, Canada. She is a Co-investigator at SSNCE of a DeitY, MCIT funded consortium project on TTS for Indian languages. Her areas of research are speech recognition, speech synthesis and speech pathology.
Mrinalini K, Department of ECE, SSN College of Engineering, Chennai, India
Vijayalakshmi P, Department of ECE, SSN College of Engineering, Chennai, India (vijayalakshmip@ssn.edu.in)
Abstract— A speech-to-speech translation system enables the translation of speech signals in a source language A to a target language B. A good speech-to-speech translation (S2ST) system can be characterized by its ability to keep intact the fluency and meaning of the original speech input. An S2ST system to enable translation between Hindi and English is the main idea of the proposed work. A preliminary dataset concentrating on basic travel expressions in both of the languages considered is used for this work. In order to develop a successful S2ST system, three subsystems are required, namely an automatic speech recognition (ASR) system, a machine translation (MT) system and a text-to-speech synthesis (TTS) system. A hidden Markov model based ASR system is developed for both languages, and their performances are analyzed based on the word error rate (WER). The MT subsystem makes use of the statistical machine translation (SMT) approach for translating the text between the two languages involved. The SMT makes use of IBM alignment models and language models to enable proper translation. The performance of the MT is analyzed based on the translation edit rate (TER) and analysis of the translation table. An HMM-based speech synthesis system (HTS) is used to synthesize the translated text. The performance of the synthesizer is analyzed based on the mean opinion score (MOS) from a group of listeners.

Keywords- Speech-to-Speech Translation System; HMM based speech recognition; Statistical Machine Translation; HMM based speech synthesis system.

I. INTRODUCTION

Speech is the most widely used form of communication between people across the globe. Attempts to build systems for recognizing speech signals and for artificially synthesizing speech signals have been made for several decades. Speech systems find a wide range of applications in education, medicine, the military, marketing, cross-border operations, etc. The language of the speech is an important factor to be dealt with in order to complete an effective communication link. Over the past few decades, the need to overcome this language barrier between people belonging to different linguistic backgrounds has been an area of interest in research. Language translation systems have served as a major breakthrough for this issue. As an expansion of this milestone, speech-to-speech translation (S2ST) systems are being attempted at several institutions across the globe. On an overview, any S2ST system consists of three modules, namely speech recognition, machine translation and speech synthesis. Translation becomes a challenging task when the languages involved have completely different linguistic structures; hence, the approach used for machine translation becomes arguable. Sneha Tripathi et al. in [1] compared the various approaches that can be used for machine translation. Philipp Koehn in [2] gives a detailed description of statistical machine translation. Alon Lavie in [3] describes the various techniques which can be used to evaluate a translation system.

A number of significant research works have been conducted in the field of S2ST. Sakriani Sakti et al. in [4] came up with the first network-based speech-to-speech translation system for Asian languages, which was built by successfully combining the three subsystems using rule-based translation and networking. V. V. Babu et al. in [5] formulated a system named ANUVAADHAK, which is a two-way Indian-language speech-to-speech translation system for local travel information assistance. It was noticed that most speech-to-speech translation systems use a rule-based language processing approach by incorporating Natural Language Processing (NLP).

The proposed work is an S2ST system for travel expressions which can perform translation between English and Hindi and is based on the statistical machine translation approach. In this system, the speech recognition is carried out using HMM based acoustic models at the word level. The translation approach used for the system is statistical, based on statistical models built from the bilingual corpus. The synthesis system used in the proposed S2ST system is an HMM based speech synthesis system (HTS). The paper is organized as follows: the following section gives a detailed description of a speech-to-speech translation (S2ST) system. Section III discusses the recognition system and the various approaches used for this work. Section IV explains in detail the machine translation subsystem; the SMT approach and alignment models used are detailed there. Section V gives an insight into the synthesis system used for the current work. Section VI discusses the results and an analysis of the performance of each subsystem of the S2ST system. Section VII concludes the paper and discusses the areas of further improvement.

II. SPEECH-TO-SPEECH TRANSLATION SYSTEM

The basic idea of the speech-to-speech translation (S2ST) system is to translate speech in language A to language B. The current work allows speech translation between Hindi and English. The system is mainly applicable for tourism purposes. The transition between both languages occurs with the help of three major modules, namely speech recognition, machine translation and speech synthesis. The outline of the S2ST system is given in Fig. 1.

Fig. 1. An S2ST system having recognition, translation and synthesis systems and using statistical machine translation.

The first module is the ASR system, which is responsible for converting the speech signal in language A to the corresponding text in the same language. The ASR system makes use of HMMs to train the system; it is described in detail in section III. Once the text is obtained, the machine translation (MT) system comes into the picture. Here a statistical machine translation (SMT) based on the Bayesian criterion is used, which is discussed in section IV. The SMT performs the task of converting the text in language A into text in language B, producing a translation for each word in the target language. Following the translation task, the translated text must be converted back to a speech signal in language B. This operation is performed by the text-to-speech (TTS) system, which is done using the HMM based speech synthesis system (HTS); the HTS is described in section V. The TTS system produces the synthetic speech signal output in the target language B.

A. Bilingual Corpus

One of the pre-requisites for an S2ST system is a bilingual parallel database of the language pair in translation. This is vital in order to generate statistical models for alignment in machine translation. The present work is domain-specific and concentrates on travel and tourism. The corpus is thus a set of the most common expressions used by tourists across the globe [6]. The work on creating the preliminary corpus began with formulating the text of travel expressions in English and Hindi [7]. The text is such that the expressions in one language have their corresponding translations in the other language aligned in parallel. The preliminary database contains 306 sentences covering various domains such as restaurant, basic phrases, shopping, sightseeing, and transportation & direction. Once the text data is ready, the speech corpus can be created by recording these sentences. Recording for the system was carried out in a laboratory environment by a single female speaker. The recorded waveforms have a sampling rate of 16 kHz and are mono-channel in nature. The corpus used contains a total of 650 different words in English and 642 unique words in Hindi, including case variations. The preliminary database of sentences can be further expanded, which will result in better performance of the system as the statistical models built will be more accurate.

III. HMM BASED SPEECH RECOGNITION SYSTEM

As discussed earlier, the ASR system converts the speech signal to its corresponding text. The ASR system uses HMM based recognition. The HMMs can be trained to capture the sequential information available in the speech. The speech is divided into words or sub-word units such as phones or syllables; hence the speech signals are segmented and time-aligned transcriptions are created. Mel Frequency Cepstral Coefficient (MFCC) feature vectors are extracted. These coefficients are statistically independent and depict the various vocal tract positions. The MFCC vectors are 39-dimensional, comprising 13 static and 26 dynamic coefficients; the 26 dynamic coefficients can further be split into 13 delta and 13 acceleration coefficients. The recognition system is trained using HMMs. The initial prototype models are trained based on the number of mixture components and states of the segmented units. The number of states can be defined as the number of different phones occurring in the speech unit. The number of mixture components is approximated as 1/10th of the number of examples available for a particular word: the more examples, the better the model, and hence the better the quality of recognition. The prototype models are re-estimated using the Baum-Welch re-estimation procedure to form the final HMMs. The final HMM can be defined as given in equation (1) [8]:

λi = (A, B, π)    (1)

where i = 1 to W, W = number of words
A - state transition probability
B - observation symbol probability
π - initial state probability

The syntax and semantics of a language are defined in the lexicon used for recognition. A dictionary forms the core part of a lexicon and defines what is to be displayed when a particular word is recognized by the system. The lexicon also contains a network which gives the sequential information of the recognized words. In the current work, sentence-level networks are used to improve performance. When a test speech is given as input, it is segmented and, based on the acoustic likelihood of its features, it is matched to an existing model. Following this, the label corresponding to the model with the maximum acoustic likelihood is chosen and displayed as the recognized text. Fig. 2 represents the steps involved in building the ASR system for the current S2ST system.
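The model-selection step described above (scoring the observations against every word HMM λ = (A, B, π) from equation (1) and choosing the label with maximum likelihood) can be illustrated with the forward algorithm for a discrete-observation HMM. This is a toy sketch: the two word models, their parameters and the quantized observation symbols are all invented for illustration, and real systems use Gaussian-mixture emissions over MFCC vectors.

```python
# Toy sketch: score an observation sequence against word HMMs λ = (A, B, π)
# with the forward algorithm, then pick the most likely word.
# All parameters below are invented for illustration.

def forward_likelihood(obs, A, B, pi):
    """P(obs | λ) for a discrete HMM: A[i][j] transition, B[i][k] emission, pi[i] initial."""
    n_states = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
                 for j in range(n_states)]
    return sum(alpha)

# Two hypothetical 2-state word models over a 2-symbol observation alphabet.
models = {
    "yes": ([[0.7, 0.3], [0.0, 1.0]],   # A: left-to-right transitions
            [[0.9, 0.1], [0.2, 0.8]],   # B: emission probabilities
            [1.0, 0.0]),                # pi: start in state 0
    "no":  ([[0.5, 0.5], [0.0, 1.0]],
            [[0.1, 0.9], [0.8, 0.2]],
            [1.0, 0.0]),
}

obs = [0, 0, 1, 1]  # quantized feature symbols for the test utterance
best = max(models, key=lambda w: forward_likelihood(obs, *models[w]))
print(best)  # prints: yes
```

In the actual system this per-model scoring is constrained by the sentence-level network, which restricts which word models may follow one another.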
The first ASR system built for the current work is based on a bilingual dictionary. This type of dictionary serves as a one-to-one mapping between words in both languages. Such an approach is similar to direct machine translation. This method, though successful, is expensive and time consuming, as it requires a large amount of human intervention in creating the bilingual dictionary. The shortcomings of the above system can be overcome by using a mono-lingual dictionary and passing the recognized text to the machine translation subsystem for translation, which is discussed in section IV. The mono-lingual dictionary based system was built for English as well as Hindi.

Fig. 2. Steps to build an Automatic Speech Recognition (ASR) system.

The performance of the ASR system is measured using the word error rate (WER), which is based on the number of words substituted, deleted and inserted by the system. The WER is represented by % correctness and % accuracy, which are defined as [9]:

%correctness = ((N - D - S) / N) * 100    (2)

%accuracy = ((N - D - S - I) / N) * 100    (3)

N - number of words under recognition
D - number of words that were deleted
S - number of words that were substituted
I - number of words that were inserted

The performance of the ASR system for the current work is discussed in section VI.

IV. MACHINE TRANSLATION SYSTEM

On a basic level, machine translation (MT) performs simple substitution of words in one language for words in another, but that alone usually cannot produce a good translation of a text, because recognition of whole phrases and their closest counterparts in the target language is needed. Solving this problem with corpus and statistical techniques is a rapidly growing field that is leading to better translations. The objective of an effective translation system is to restore the meaning of the original text in the translated version. The approaches to machine translation can be broadly classified into dictionary based, rule based and corpus based machine translation [1]. In the current work, a corpus-based statistical machine translation (SMT) approach is used, which is elaborated in this section. SMT allows us to capture relationships between words, phrases and grammatical structures which are often vague. The statistical models involved can be trained over large amounts of training data, and as the amount of data increases, so does the performance of the system. These models can be made language-independent and hence can be applied to any language with minimal modifications. The objective of SMT is to translate an input sentence 'f' in one language to a sentence 'e' in another language without altering the meaning of the original sentence. The translation process is considered to be a source-channel model, where the output language sentence 'e' is viewed as being generated by a source with probability P(e), defined by the language model; this is then passed through the translation channel to produce the input language sentence 'f', according to the translation probability P(f|e). The task of the translation system is to find 'e' from the observed 'f' [2]. The best translation is found by computing

ê = argmax_e P(f|e)P(e) / P(f) = argmax_e P(f|e)P(e)    (4)

A. Alignment Modeling

The translation model P(f|e) can be viewed as an analogue of the acoustic model P(O|λ) used in a recognition system. However, unlike in a recognition system, the words in the translation need not occur in the same order as in the source language. The translation from one language to another may also have a few words which have no actual translation and have to be removed, or words which need to be added for the sentence to be meaningful. Hence we need an intermediate alignment model or variable. These alignment models can be used in both translation directions. Given a source language sentence 'e' and a target language sentence 'f', an alignment variable 'A' is introduced which determines the correspondence between the words or phrases in the sentences; the translation probability is calculated by summing over all possible alignments 'A' [10]:

P(f|e) = Σ_A P(f, A|e)    (5)

The alignment models used for the current work are IBM Models 1-5 and the word-to-word HMM. Each of these models has its own level of accuracy based on the parameters used in each model [10].

B. Performance of translation

The performance of a machine translation system can be evaluated by several automated as well as human evaluation techniques [3]. The evaluation technique used for the current work is the translation edit rate (TER), which is similar to the WER used for ASR system evaluation. TER can be defined as the minimum number of edits needed to make a hypothesis exactly match a given reference, normalized by the total number of words in the reference [3]. Mathematically, it is represented as

%TER = ((D + I + S + Sh) / N) * 100    (6)
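The noisy-channel decision rule of equation (4) can be sketched with a toy word-for-word translation table and language model. All probabilities, glosses and entries below are invented for illustration; a real SMT system estimates P(f|e) from the aligned bilingual corpus and uses an n-gram language model with a proper decoder rather than exhaustive enumeration.

```python
import itertools

# Toy noisy-channel decode: ê = argmax_e P(f|e) * P(e)   -- equation (4)
# All probabilities are invented; a real system estimates them from a corpus.

# Hypothetical word-for-word translation table P(f_word | e_word).
t_table = {
    "paanii": {"water": 0.8, "rain": 0.2},
    "chaahiye": {"please": 0.3, "needed": 0.7},
}

# Toy unigram "language model" P(e) over English words.
def lm_prob(words, unigram={"water": 0.05, "rain": 0.01,
                            "needed": 0.02, "please": 0.04}):
    p = 1.0
    for w in words:
        p *= unigram.get(w, 1e-6)
    return p

def decode(f_words):
    """Enumerate word-for-word candidates e and maximize P(f|e) * P(e)."""
    choices = [list(t_table[f]) for f in f_words]
    best, best_score = None, -1.0
    for e_words in itertools.product(*choices):
        p_f_given_e = 1.0
        for f, e in zip(f_words, e_words):
            p_f_given_e *= t_table[f][e]
        score = p_f_given_e * lm_prob(e_words)
        if score > best_score:
            best, best_score = list(e_words), score
    return best

print(decode(["paanii", "chaahiye"]))  # prints: ['water', 'needed']
```

Note that P(f) is dropped from the maximization, as in equation (4), because it is constant for a given input sentence.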
MRINALINI K et al: HINDI-ENGLISH SPEECH-TO-SPEECH TRANSLATION SYSTEM FOR TRAVEL EXPRESSIONS
N - number of words in the reference sentence
D - number of words that were deleted
I - number of words that were inserted
S - number of words that were substituted
Sh - number of words that were shifted

Apart from the TER score, the percentage change in the number of lines of the translation table is also used as an evaluation metric. The analysis of the MT system based on these techniques is discussed in section VI. The performance of this subsystem can be improved by incorporating natural language processing (NLP) and using higher, phrase-level alignment models.

V. HMM BASED SPEECH SYNTHESIS SYSTEM

The translated output of the MT system needs to be synthesized back into a speech signal in order to complete the S2ST system. This is done using a text-to-speech (TTS) synthesis system. The function of a synthesis system is to artificially generate human-like speech signals. An unrestricted text-to-speech system is expected to produce a speech signal, corresponding to the given text in a language, which is highly intelligible to a human listener. The synthesis system for the S2ST system is unrestricted and HMM based [11][12]. This is a statistical parametric approach, where the required sequence of context-dependent HMMs is concatenated and the resultant HMM is used as an observation sequence generator. The HMM based speech synthesis system (HTS) has the advantage of being able to synthesize voice with a minimum amount of training data, unlike conventional concatenative speech synthesis approaches. The overall operation of the synthesis system is shown in Fig. 3 [13].

Fig. 3. HMM based speech synthesis system [13].

The text-to-speech synthesis consists of two phases, namely a training phase and a synthesis phase. In the training phase, the spectral features of the speech data are extracted. These features include the Mel generalized cepstral coefficients, the log fundamental frequency and their dynamic features [13]. Using these features, the phonetic transcriptions and HMMs are trained. The basic sub-word unit considered for the HMM based system is the context-dependent pentaphone. These models are built starting from monophones and refining them in sequential steps; the sequential steps involve state-tying, which is performed using tree-based clustering [14]. In the synthesis phase, context-dependent label files are generated for the given text, and the required context-dependent HMMs are concatenated to obtain the HMM for the sentence translated by the statistical machine translation. Spectral and excitation parameters are generated for the sentence and the speech waveform is synthesized.

The S2ST synthesis system is trained on 1 hour of English data recorded in a lab environment in a female voice at a sampling rate of 16 kHz. The lexicon and tokenizer used for training the system are modified with respect to the phoneset used. After the system is trained, the English wave files consisting of travel expressions from the preliminary dataset are used to test the synthesis system. The performance of the English HTS system is measured on the basis of the mean opinion score (MOS). The MOS ranges from 1-5, with 5 indicating good intelligibility to the listener and 1 depicting poor intelligibility. The performance of the English synthesis system based on this score is discussed in section VI.

VI. PERFORMANCE STUDY OF S2ST SYSTEM

The S2ST system is built using a preliminary database. The total performance of the S2ST system depends on each of the three systems mentioned above. Hence, the performance of each of the systems is studied individually.

A. Performance of ASR system

As discussed in section III, the ASR system used in the current work is trained with HMM based acoustic models. The mono-lingual recognition system was built in Hindi and English. The ASR system makes use of a mono-lingual dictionary where words in one language are mapped to the trained models in the same language. In order to train the system with HMM word models, a minimum of 10 examples of each word is required. Some words which lacked this criterion were recorded individually and considered for training. After the HMMs are trained, a sentence-level network is created to capture the sequence of recognized HMMs. The performance of the ASR system was evaluated based on the WER. The use of sentence-level networks eliminates the possibility of insertion, deletion or substitution during recognition. This resulted in 100% accuracy in recognition.

B. Performance of MT system

The MT system used for the current work is SMT-based, as discussed in section IV. The text is converted into bitext, which is the binary representation of the sentences with respect to the word index of each word in the vocabulary. The initial statistical model is built from the bitext. The IBM Models 1, 2 and the WtW-HMM described in section IV are used to form alignments of source sentences in the target language. The final alignment is formed after 15 iterations on the initial model. Along with this, a translation table (t-table) is formed where the words in the source language are mapped to their translated versions in the target language. Each of these entries is associated with a probability which determines the most likely translation of a particular word. The most
978-1-4673-6524-6/15/$ 31.00
2015 INTERNATIONAL CONFERENCE ON COMPUTATION OF POWER, ENERGY, INFORMATION AND COMMUNICATION
likely translation for a given sentence can be extracted from the series of translations in the t-table. The MT system was built for both the languages, enabling translation from Hindi to English and vice versa. The performance of the MT system is measured on the basis of the TER score, and the t-table analysis is given in TABLE I and TABLE II respectively.

TABLE I. PERFORMANCE OF MT BASED ON TER

Name of the System    TER score
Hindi – English       49.46
English – Hindi       59.19

The high TER score indicates errors in the translations. This is due to errors in the alignment models. The English–Hindi system shows a higher TER score due to the unavailability of exact translations for prepositions in Hindi, leading to NULL alignments or deletion of a few words in the resultant translation. Higher IBM models and text processing approaches can be used to reduce these errors. The efficiency of the translation table was analyzed based on the frequency of occurrence of words in the table; a lower frequency indicates better one-to-one word translation alignment. High occurrence of a word in the t-table is due to many-to-one mappings with different probabilities. The t-tables of IBM models 1, 2 and WtW-HMM for both the languages are analyzed. From TABLE II it is inferred that translation in Model 2 and WtW-HMM is better when compared with IBM model 1. The frequency of occurrence of words in the initial model of Model 1 was reduced by approximately 50% at the end of 15 iterations. In the case of Model 2 and WtW-HMM the reduction was found to be approximately 29% and 48% respectively. This can be improved by using more data and increasing the number of iterations. However, a large number of iterations may cause overtraining of the models, resulting in omission of some words from the t-table.

TABLE II. ANALYSIS OF T-TABLE

Model     T-Table (Eng-Hin)   T-Table (Hin-Eng)   % Decrease
Model 1   8237                7607                -
Model 2   2277                2116                72.177
WtW HMM   2221                1917                73.348

C. Performance of HTS system

The synthesis system used for the current S2ST system is the HMM-based speech synthesis system (HTS). The system is trained on HMMs and phonetic transcriptions taken from pre-recorded data, as discussed in section V. The synthesis system is built to synthesize English speech in a female voice. The performance of the HTS system is based on the mean opinion score (MOS) collected from 10 listeners and is shown in TABLE III.

TABLE III. PERFORMANCE OF HTS SYSTEM

Name of the System         MOS
English synthesis system   3.21

The performance of each of the systems discussed above can be improved, which forms the future scope of the system and is discussed in section VII.

VII. CONCLUSION AND FUTURE WORK

The speech-to-speech translation system translates speech in one language into speech in another language. The S2ST system built can recognize and translate English and Hindi speech signals and synthesize speech from English text. The recognition system, trained with HMM word models and a sentence-level network, shows good performance. The machine translation is carried out using statistics derived from a bilingual corpus. The alignment and language models aid in better translation. The synthesis system is an HMM-based speech synthesis system. The overall performance of the system can be improved further by increasing the performance of each subsystem individually. The recognition system can be built using phoneme-level models and language models, without a sentence-level network, in order to expand the system and incorporate more words. The alignments in machine translation can be improved by text processing. For example, the prepositions can be combined with the main words, or parts-of-speech (POS) tagging can be used to make the system a hybrid MT. The synthesis system for Hindi text can be built using the same concept as the English synthesis system. The system can be extended further to applications in other domains.

REFERENCES

[1] T. Sneha and J.K. Sarkhel, "Approaches to machine translation," in Annals of Library and Information Studies, Vol. 57, pp. 388-393, Dec 2010.
[2] K. Phillipp, "Statistical Machine Translation," Cambridge University Press, 2009.
[3] L. Alon, "Evaluating the Output of Machine Translation System," AMTA Tutorial, 2010.
[4] S. Sakti, N. Kimura, P. Michael, H. Chiori, E. Sumita, S. Nakamura et al., "The Asian network-based speech-to-speech translation system," in Automatic Speech Recognition & Understanding (ASRU) IEEE workshop, pp. 507-512, Dec 2009.
[5] V. Venkata, K. Pradeep, K. Mrudula, T. Poornima and P. Kishore, "ANUVAADHAK: A Two-way, Indian language Speech-to-Speech Translation System for Local Travel Information Assistance," in International Journal of Engineering Science and Technology, Vol. 2(8), pp. 3865-3873, 2010.
[6] http://www.fodors.com/language/french/basic-phrases
[7] https://translate.google.co.in/
[8] L. Rabiner and B.H. Juang, "Fundamentals of speech recognition," Prentice-Hall, 1993.
[9] W.M. Fisher, G.R. Doddington and K.M. Goudie Marshal, "The DARPA speech recognition research database: specifications and status," Proceedings of DARPA workshop on Speech Recognition, pp. 93-99, Feb. 1986.
[10] D. Yonggang, "MTTK: An Alignment Toolkit for Statistical Machine Translation," HLT-NAACL Demonstrations Program, 2006.
[11] B. Ramani, S. L. Christina, G. A. Rachel, V. S. Solomi, M. K. Nandwana, A. Prakash, A. S. S, R. Krishnan, S. Kishore, K. Samudravijaya, P. Vijayalakshmi, T. Nagarajan, and H. A. Murthy, "A Common Attribute based Unified HTS framework for Speech Synthesis in Indian Languages," in ISCA Workshop on Speech Synthesis, Barcelona, Spain, 2013, pp. 291-296.
BIBLIOGRAPHY

Vijayalakshmi P, Department of ECE, SSN College of Engineering, Chennai. vijayalakshmip@ssn.edu.in
Nagarajan T, Department of IT, SSN College of Engineering, Chennai. nagarajant@ssn.edu.in
Abstract— Speech has been in use as an effective medium in human-machine interactions for a long time. A conventional text-to-speech (TTS) system produces monotonous speech without any appropriate emotion. The inclusion of emotions in such synthesis systems will not only result in expressive speech but also reduce the monotony of the synthetic speech. The time-domain parameters of speech signals, like short-time energy, duration, and pitch contour, are influenced by emotions. Hence, to incorporate a desired emotion into neutral speech, signal processing methods are used in this work for modifying the prosodic speech parameters in the time domain, either in a few words or in the entire speech utterance. An initial analysis is performed by comparing neutral speech with happy and sad speech. Based on the observations from the analysis, the parameters of the speech signal are varied using the TD-PSOLA technique. The parameters short-time energy, duration, and pitch contour in the neutral speech are modified and further analyzed quantitatively to decide the combination of parameter modifications that better synthesizes emotional speech from neutral speech.

Keywords— text-to-speech (TTS), neutral speech, pitch contour, short time energy, TD-PSOLA

I. INTRODUCTION

A text-to-speech (TTS) system is one that synthesizes intelligible and natural speech corresponding to a given text. Even though the linguistic information is well delivered through the synthetic speech of a TTS system, the appropriate emotion for a particular utterance is not perceived. The lack of such emotional aspects makes naturalness a concern in speech synthesizers. Incorporating emotional features in neutral speech will improve the synthesizer performance in terms of naturalness. With TTS systems this is rather a difficult task, because the written text carries no information about the feelings and emotional state of a speaker. But the extensive use of speech synthesizers in telephone dialogue systems and screen readers for visually challenged people makes the requirement for naturalness in synthetic speech inevitable.

Initial work on emotional speech synthesis dealt with using a different corpus for each emotion. Concatenative TTS systems trained from such corpora were built individually for each emotion so that they conveyed the underlying emotion in the synthetic speech output [1]. Speech with appropriate emotion was also synthesized by switching between the corpora corresponding to each emotion [2]. However, maintaining consistency among the recorded databases remained a big task. The requirement of large databases in such corpus-based methods was then overcome based on the fact that acoustic signal information such as energy, pitch and speech rate is affected by emotion. This idea was implemented by analyzing and characterizing the prosodic features of emotions like anger, compassion and happiness and then transforming the neutral speech to emotional speech [3]. Emotional speech was also synthesized in varying degrees, as strong, medium and weak, by modeling the prosodic variations between neutral and emotional speech [4]. Expressivity in synthetic speech was later improved by modeling source parameters, based on the fact that nuances in speaking style and emotion are also captured by glottal source parameters apart from prosodic features [5]. Linear prediction (LP) and time-domain pitch synchronous overlap-add (TD-PSOLA) techniques were used for synthesizing happy emotion, and the emotion content was better perceived in the latter method [6].

The proposed work aims at incorporating emotions in neutral speech using the TD-PSOLA method to modify the prosodic speech parameters in the time domain. The modifications are carried out based on the inferences from an analysis performed using the Berlin [7] and Surrey Audio-Visual Expressed Emotion (SAVEE) [8] databases. Further, a detailed analysis is carried out by varying the speech parameters at different portions of the speech utterance, so as to identify which combination of parameter variations might synthesize emotion. The work concentrates on two emotions, namely, happiness and sadness.

The paper is organized as follows: Section II briefs on the observations made from the analysis on the emotions happiness and sadness. Section III discusses the modification of time-domain speech parameters using signal processing methods. Section IV gives descriptions of incorporating happy and sad emotion into neutral speech. Section V concludes the paper.

II. ANALYSIS OF EMOTION WITH RESPECT TO NEUTRAL SPEECH

An analysis is performed for happy and sad emotions to identify the pattern of variations that can be seen in emotional speech. Previous work on emotional speech synthesis indicates that speech signal parameters like intensity (loudness and breathiness), pitch contour and duration undergo changes with emotions [9]. Two emotional databases, namely Berlin and SAVEE, are used for the analysis. In the analysis, a sentence in neutral emotion is compared with the corresponding sentence with emotion. The two speech utterances are analyzed such that the way in which the emotional speech differs from the neutral speech, in terms of the parameters short-time energy, pitch contour and duration, is observed. The analysis is carried out using 100 sentences from Berlin and 70 sentences from SAVEE, for happy and sad emotions. The databases used are described as follows.

A. Speech Corpora

1) Berlin database [7]: It is a German-language emotional database, consisting of seven emotions, namely, happiness, sadness, anger, fear, disgust, boredom, and neutral. 10 sentences per emotion are recorded by ten professional German actors (five male and five female), at a sampling rate of 16 kHz.

2) SAVEE database [8]: The database consists of recordings from 4 male actors in 7 different emotions (the same as those in the Berlin database). There are 30 English sentences in neutral emotion and 15 each in the others.

3) Database developed by the authors: A text and speech corpus is prepared neutrally, without any emotion. The keywords specific to emotions such as happiness and sadness are collected initially. Sentences are then framed using these keywords such that, when spoken, they will reflect the emotion. For each emotion 100 such sentences are collected. The text corpus collected is then recorded neutrally in a lab environment, at a sampling rate of 16 kHz, by a female speaker. This corpus is used for synthesizing speech with emotion.

B. Analysis of Happy vs. Neutral Speech

Energy of a signal refers to its strength. The quasi-stationary nature of the speech signal indicates that it is stationary for a short duration of 20 to 30 ms. Hence, for the purpose of analysis, short-time energy is computed for frames of the speech signal obtained using windows of duration 25 ms. The fact that a person speaks with high energy when they feel happy is observed in the analysis as well. However, the increase in energy is more prominent in certain emotion-specific words of the sentence, owing to the stress placed on them. This can be observed in Fig. 1. When compared to the energy of the neutral speech corresponding to the sentence "The eastern coast is a place of pure pleasure and excitement", in Fig. 1(b), the increasing energy in the happy speech is clearly seen in Fig. 1(d), especially in the circled emotion-specific words, namely, "pleasure" and "excitement".

For the same utterance, words are uttered at a faster rate in happy speech, as seen in Fig. 2(b). So a decrease in duration is observed throughout the sentence in happy emotion. The duration of the neutral speech of about 3.4 sec in Fig. 2(a) is decreased to 2.7 sec in the happy speech, as seen in Fig. 2(b). As far as the pitch contour is concerned, happy speech has a hat contour in the emotion-specific words "pleasure and excitement", as circled in Fig. 2(b), when compared to the flat contour of the neutral speech in Fig. 2(a).

Fig. 1. Utterance "The eastern coast is a place of pure pleasure and excitement" (a) Waveform of neutral speech (b) short time energy plot of neutral speech (c) waveform of happy speech (d) short time energy plot of happy speech.

Fig. 2. Utterance "The eastern coast is a place of pure pleasure and excitement" in (a) neutral speech of 3.4 sec duration with flat pitch contour (b) happy speech of duration 2.7 sec with hat contour.

C. Analysis of Sad vs. Neutral Speech

The intensity of sad speech, in the sentences analyzed, is less than the intensity of neutral speech. The short-time energy of the sad speech is computed and observed to be decreased either in a few words or in the trailing portion of an utterance. The decrease in the short-time energy of the sad speech in the sentence "Das schwarze Stück Papier befindet sich da oben neben dem Holzstück" (meaning "The black sheet of paper is located up there besides the piece of timber") is circled in Fig. 3(d).

The observation regarding duration is that sad speech will have increased duration with respect to the neutral speech, since some words are elongated, or due to the presence of pauses between words. This is seen in Fig. 4(b), for the sentence "Ich will das eben wegbringen und dann mit Karl was trinken gehen." (meaning "I will just discard this and then go for a drink with Karl.") where the duration is increased to 4 sec from the 2.4 sec duration in neutral speech in Fig. 4(a). For the same utterance the sad speech has a falling contour, as in Fig. 4(b).
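The 25 ms frame-wise short-time energy comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code; the sampling rate, window length and toy signals follow the paper's stated settings (16 kHz, 25 ms) but the amplitude scale factor is an assumption for demonstration.

```python
import numpy as np

def short_time_energy(signal, fs=16000, win_ms=25.0):
    """Frame-wise short-time energy: sum of squared samples per window."""
    win = int(fs * win_ms / 1000)          # 25 ms -> 400 samples at 16 kHz
    n_frames = len(signal) // win
    frames = np.reshape(signal[:n_frames * win], (n_frames, win))
    return np.sum(frames.astype(float) ** 2, axis=1)

# Toy comparison: a version scaled up in amplitude (as in stressed,
# emotion-specific words of happy speech) shows higher frame energy.
t = np.linspace(0, 1, 16000, endpoint=False)
neutral = 0.3 * np.sin(2 * np.pi * 120 * t)
happy = 1.5 * neutral                       # assumed 1.5x amplitude increase
e_neutral = short_time_energy(neutral)
e_happy = short_time_energy(happy)
print(e_happy.mean() / e_neutral.mean())    # energy ratio ~ 1.5**2 = 2.25
```

Because energy is quadratic in amplitude, even a modest amplitude increase in a stressed word stands out clearly in the short-time energy plot.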
Fig. 5. The utterance "Have a great day" (a) waveform of original signal (b) waveform of energy modified signal (c) short time energy plot of actual signal (d) short time energy plot with energy decreased in "great day".

Fig. 6. Utterance "Your music excites me" with (a) flat contour in original speech (b) falling contour fitted in words "your" and "music" (c) bucket contour fitted in the entire utterance.

C. Duration Modification

The duration of speech can either be increased or decreased by a desired factor using TD-PSOLA. The excitation instants, namely the GCIs obtained from DYPSA [11], are used in calculating the pitch periods. Frames of the signal are obtained by using analysis windows of length two pitch periods. To increase or decrease the duration, the instants are replicated or deleted respectively. The pitch-synchronous frames extracted previously are placed at distances corresponding to the new excitation instants, overlapped and added. Fig. 7 shows the increase in duration performed on the utterance "Your music excites me". The original signal of duration 2.8 sec in Fig. 7(a) is increased by a factor of 2, so the modified duration is 4.5 sec in Fig. 7(b).

Fig. 7. Waveform of the utterance "Your music excites me" (a) original speech signal (b) duration modified signal.

sounded quite excited, closer to happy emotion, than the speech synthesized by modifying the keyword alone. Fig. 8(b) shows the waveform of synthesized speech fitted with a hat contour in the words "awesome show", which is circled. Fig. 8(d) shows the short-time energy plot of the synthesized speech, where the energy of the words "awesome show" is increased by a factor of 1.8 and the duration is also increased by 0.3 sec by stretching the words. This set of quantitative variations in parameters, when tried, gives synthetic speech with happiness.

Similar experiments are repeated using thirty neutral utterances from the corpus, either using emotive keywords alone located in different positions, or using emotive keywords as well as the adjacent non-emotive words. Depending on the keyword position, the short-time energy parameter is considered for modification. A keyword located in the trailing part of the utterance requires the energy to be increased; otherwise it can be left unchanged. But the parameters duration and pitch contour are varied irrespective of the keyword position.
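The duration-modification procedure described in subsection C (replicate or delete pitch-synchronous frames, then overlap-add at the new instants) can be sketched as below. This is a simplified illustration under strong assumptions: a near-constant pitch period, a Hann analysis window of two pitch periods, and synthetic GCIs for a toy tone; a real implementation would use DYPSA-estimated GCIs from actual speech.

```python
import numpy as np

def change_duration(x, gci, factor):
    """Time-scale x by repeating (factor > 1) or dropping (factor < 1)
    pitch-synchronous frames centred on glottal closure instants (GCIs),
    then overlap-adding them at uniformly rescaled positions."""
    n_out = int(len(gci) * factor)
    # Map each synthesis instant back to the nearest analysis GCI.
    src = np.minimum((np.arange(n_out) / factor).astype(int), len(gci) - 1)
    period = int(np.median(np.diff(gci)))      # assume near-constant pitch
    win = np.hanning(2 * period)               # two-pitch-period window
    y = np.zeros(int(len(x) * factor) + 2 * period)
    for k, i in enumerate(src):
        c = gci[i]
        seg = x[max(c - period, 0):c + period]
        pos = k * period                       # new, uniformly spaced instants
        y[pos:pos + len(seg)] += seg * win[:len(seg)]
    return y

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t)                # 100 Hz voiced-like tone
gci = np.arange(0, len(x), fs // 100)          # one synthetic instant per 10 ms
y = change_duration(x, gci, 2.0)               # double the duration
print(len(y) / len(x))
```

Because the synthesis instants stay one pitch period apart, the pitch is preserved while the duration scales, which is the core property of TD-PSOLA time-scale modification.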
the actual, and by stretching the segment of speech by a factor between 1.8 and 2 where the variations are implemented. For the pitch contour, the analysis is performed using falling and bucket contours, individually or in combinations, since they are mostly observed in the sad sentences of the emotional databases described in Section II.

In the analysis carried out using thirty-five neutral sentences from the corpus, when the variations were performed only in the emotive keywords, or along with non-emotive words, sadness was not perceived well from the synthesized speech. So the work for sad emotion switched to analyzing variations in phrases or in the whole sentence. Phrases are extracted from the neutral utterance based on the occurrences of possible pauses in the sentence. Combinations of bucket and falling contours are fitted in those phrases.

One such analysis, implemented on the neutral utterance "I miss having you around", is shown in Fig. 9, where the phrases "I miss" and "having you around" are separately fitted with a falling contour in Fig. 9(b), and the duration is increased by 0.7 sec, resulting in a 2.6 second duration in the synthesized speech. The short-time energy in Fig. 9(c) is decreased by a factor of 2 throughout, as highlighted in Fig. 9(d). The resultant synthetic speech after implementing all these variations sounds closer to sad emotion.

Fig. 9. The utterance "I miss having you around" (a) neutral speech of duration 1.9 sec with flat contour (b) synthesized speech fitted with falling contour in phrases "I miss", "having you around" and of duration 2.6 sec. (c) short time energy plot of neutral speech (d) short time energy plot of synthesized speech.

The modifications are performed at the phrase level using a falling contour, decreasing energy, and increasing duration. A similar set of parametric variations, when implemented in thirty-five neutral utterances of the corpus, indicates that, instead of introducing variations only in the emotive keywords of neutral speech, the modifications, when extended to phrases or sentences, synthesize sad speech. To do so, the duration is increased at least by a factor of 1.8 and the energy is decreased by 20% from the maximum energy, separately in each phrase. Combinations of falling, falling and bucket, or bucket contours give synthetic speech closer to emotional speech. Among these, falling followed by falling better synthesizes sad speech. The maximum and minimum used in a falling contour are 20% more and 20% less than the average pitch period of the speaker, respectively.

V. EVALUATION OF SYNTHESIZED SPEECH

The Mean Opinion Score (MOS) is a subjective measure used to evaluate the quality of synthesized speech. In this work MOS is used to verify the emotional aspect of the synthesized speech. Ten synthesized utterances each for happy and sad emotion are tested by 11 listeners, and a three-point score is assigned based on their perception. A score of one indicates the emotion is not perceived, a score of two indicates the emotion is perceived, and a score of three corresponds to the case where the emotion is clearly perceived. The average value of the score is then computed. The test results give the score for synthesized happy speech as 2.25 and for synthesized sad speech as 2.48, indicating that the emotion was better incorporated in the neutral speech.

VI. CONCLUSION

Emotions denote an individual's state of mind. The vast applications of human-machine interactions require machines also to express emotions. Instead of using a separate corpus for each emotion, the speech characteristics that are unique to each emotion can be modeled and incorporated in the available neutral speech data by using signal processing techniques. In this paper the happy and sad emotions are incorporated in neutral speech by varying the prosodic parameters in the time domain in different segments of the speech. The TD-PSOLA technique used gives naturalness and better perceptual quality to the synthesized speech. The observations from the experiments indicate that in the case of happiness it is sufficient to modify the parameters of the emotive keywords and their adjacent non-emotive words in the sentence. For sadness it is observed that the modification needs to be performed at the phrase level to synthesize sad speech. The variations from the analysis, when implemented, give satisfactory results for shorter utterances. Word-spotting techniques can further be used in the case of happy emotion to automate the identification of emotive keywords. A similar analysis can also be extended for incorporating other emotions into neutral speech.

REFERENCES

[1] E. Eide, "Preservation, identification, and use of emotion in a text-to-speech system," in Speech Synthesis, Proceedings of the IEEE Workshop, pp. 127-130, 2002.
[2] A. Iida, N. Campbell, F. Higuchi, M. Yasumura, "A corpus-based speech synthesis system with emotion," in Speech Communication, vol. 40.1, pp. 161-187, 2003.
[3] T.V. Sagar, K.S. Rao, S.R.M. Prasanna, S. Dandapat, "Characterization and incorporation of emotions in speech," in IEEE India Annual Conference, pp. 1-5, 2006.
[4] J. Tao, Y. Kang, A. Li, "Prosody conversion from neutral speech to emotional speech," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1145-1154, 2006.
[5] J. L. Trueba, R. B. Chicote, "Towards glottal source controllability in expressive speech synthesis," in Interspeech, 13th Annual Conference of the International Speech Communication Association, 2012.
[6] G. Anushiya Rachel, S. Sreenidhi, P. Vijayalakshmi, T. Nagarajan, "Incorporation of happiness into neutral speech by modifying emotive-keywords," in Proceedings of IEEE TENCON, Oct. 2014, pp. 1-6.
[7] "Berlin database for German language, http://www.expressice-speech.net."
[8] "Surrey Audio-Visual Expressed Emotion (SAVEE) for British English database - http://personal.eesurvey.ac.uk/Personal/P.JacksonSAVEE."
[9] S. Narayanan, A. Alwan, Text to Speech Synthesis – New Paradigms and Advances. Prentice Hall, 2004.
[10] A. Paeschke, M. Kienast, W. F. Sendlmeier, "F0 contours in emotional speech," in Proceedings of 14th International Conference of Phonetic Sciences, pp. 929-932, 1999.
[11] P. A. Naylor, A. Kounoudes, J. Gudnason, and M. Brookes, "Estimation of glottal closure instants in voiced speech using the DYPSA algorithm," IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 34-43, Jan 2007.
discriminative training techniques, improved acoustic modelling techniques, etc.

BIBLIOGRAPHY
Abstract—Currently, transmissions of huge data rates and their manipulation are completed by a wireless distributed computing (WDC) network, in order to minimize time. But with this network, we are unable to save both power and bandwidth to a large extent. These parameters create critical issues when the wired or wireless medium is affected by noise and interference conditions. In this paper, the main objectives are to produce a good tradeoff between power consumption and bandwidth consumption, and a good tradeoff between the distributed computing power ratio (DCPR) and bandwidth consumption. I propose an OFDM-based WDC system to achieve very low interference and to satisfy the above objectives. This paper uses an optimization algorithm called particle swarm optimization (PSO). This algorithm gives a globally optimal solution. The performance of the proposed system with this evolutionary algorithm is analyzed using MATLAB simulation.

Keywords—Wireless Distributed Computing System, OFDM, Power Consumption, bandwidth consumption and DCPR

I. INTRODUCTION

Most applications of wireless distributed computing networks are in areas like military and multimedia. A basic WDC network contains a single transmitter and multiple receivers. All these receivers can share information and complete an enormous task within a very small, finite amount of time. But the WDC network faces interference from fading and noisy channel conditions.

Due to this disadvantage of the WDC system, the orthogonal frequency division multiplexing (OFDM) scheme is deployed in the WDC system to achieve very low inter-symbol and inter-carrier interference. OFDM belongs to the physical layer. The OFDM-based WDC system is employed to accomplish the required task at a low signal-to-noise ratio (SNR) and without interference. But this scheme does not contain adaptivity in its subcarrier allocation, which means it considers only static channel conditions and static capabilities of receivers.

Generally, subcarrier allocation in an OFDM-based wireless distributed computing (WDC) network is a main issue. This allocation is efficiently achieved by an evolutionary algorithm, called particle swarm optimization (PSO), presented in [1]. In a WDC network, power is a major factor, and it mostly depends on the switching frequency and computing density. The performance of power consumption (communication and computing power) and the distributed computing power ratio (DCPR) are major issues in a WDC network [3]. With the help of PSO, resource allocation for an OFDM-based wireless communication system is presented in [8].

However, in all the above works, none has addressed the tradeoff between bandwidth (number of subcarriers) and power minimization in a WDC network with an effective evolutionary algorithm.

In this paper, I propose a tradeoff between power consumption and bandwidth (number of subcarriers) consumption. This paper is organized as follows: In section II, the problem statement is presented. In section III, the problem is formulated. In section IV, I present the proposed system model. In section V, I present the simulation results. Finally, the paper is concluded in section VI.

II. PROBLEM STATEMENT

In this paper, the tradeoff between power consumption and bandwidth (number of subcarriers) consumption is considered as the main objective. The tradeoff between the distributed computing power ratio (DCPR) and bandwidth (number of subcarriers) consumption is considered as a sub-objective. These objectives are analyzed with and without channel variances (noise, fading, etc.) between the transmitter and the different receivers. These objectives depend on the computational capabilities of the receivers and the number of operations performed at all receivers. The tradeoff between DCPR and bandwidth (number of subcarriers) consumption is considered a special performance measure.
III. PROBLEM FORMULATION

As mentioned earlier, the objective of the OFDMA-based WDC network is to achieve the required quality of service (QoS). In order to do so, we need to mathematically formulate the required objectives. In this section, I present mathematical equations that relate various WDC network parameters [2].

One of the important resources in a WDC network is the total computing power. It is defined as the total power required for a receiver to perform a certain computational task. In digital circuits, computing power consumption is mainly classified into static and dynamic power consumption [2]. The computing power consumed by the receiver is given by [2]

P_comp = P_static + P_dynamic (watts) (5)

where 'K' indicates the total number of destination nodes or receivers. Another important resource in WDC is the total communication power consumed. As the name suggests, it is the total power required for a source node to communicate the computational task to the destination nodes. It depends on factors like bandwidth, required data rate and channel variance [2]. The relationship between the above factors and the communication power required is given by Shannon's theorem

d = B × log2(1 + P_comm / P_noise) (bits/sec) (10)

where 'B' indicates the total available bandwidth, 'd' is the maximum possible data rate, 'P_comm' is the communication power required and 'P_noise' is the noise power in the channel [2].
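Equation (10) can be inverted to expose the power–bandwidth tradeoff studied in this paper: for a fixed data rate d, the required communication power is P = N·(2^(d/B) − 1), so allocating more subcarriers (a larger B) lowers the power needed. The sketch below uses illustrative values only (the target rate, per-subcarrier bandwidth and noise power are assumptions, not figures from the paper).

```python
import math

def comm_power(d, B, noise_power=1e-9):
    """Power needed to sustain data rate d (bits/s) over bandwidth B (Hz),
    obtained by inverting Shannon's theorem d = B * log2(1 + P / N)."""
    return noise_power * (2 ** (d / B) - 1)

# Illustrative values: fixed 1 Mbit/s target rate, bandwidth grows
# linearly with the number of subcarriers allocated (assumed 15 kHz each).
subcarrier_bw = 15e3
for n_sub in (32, 64, 128, 256):
    P = comm_power(1e6, n_sub * subcarrier_bw)
    print(n_sub, P)
```

The printed power falls as the subcarrier count rises, which is the gradual power decrease with bandwidth that the simulation results below report.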
The block diagram of the proposed system is shown In Fig. 2, I study the effect of fixed (10 ) and
in Fig. 1. In the proposed system, initially transmitter allocated variation in system density on the tradeoff between power
to all receivers with equalnumber of subcarriers, amount of consumption andbandwidth (number of sub carriers)
load and number of operations. Here one controllerblock is consumption. For a proposed network size,computing density
used. This controller block estimate the parameters relate to varies from 0.1 to 1.50. When the bandwidth (number of sub
power consumption andbandwidth (number of sub carriers) carriers) increases, power consumption is decreased in gradual
consumption from every receiver. These parameters are manner.
completelydepending on channel characteristics and receivers
12
capabilities and their densities (loads). The transmitter collects with out channel variance and with sytems densites (a=1.5)
with out channel variance and with sytems densites (a=1.2)
the data from controller block, after then analyze about the with out channel variance and with sytems densites (a=0.9)
10 with out channel variance and with sytems densites (a=0.6)
power andbandwidth (number of sub carriers). After the with out channel variance and with sytems densites (a=0.3)
with out channel variance and with sytems densites (a=0.1)
analysis,it changes the number of subcarriers allocation. So with channel variance and with sytems densites (a=1.5)
8 with channel variance and with sytems densites (a=1.2)
that controller block will observe thepower (communication with channel variance and with sytems densites (a=0.9)
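Equation (10) implies a direct tradeoff between bandwidth and communication power: inverting it for a target data rate d gives the power needed over a bandwidth B. The following is a minimal illustrative sketch (not from the paper), taking N_0 as the total noise power as in the text:

```python
def comm_power(d, B, N0=1.0):
    """Communication power needed to sustain data rate d (bits/sec) over
    bandwidth B (Hz), obtained by inverting d = B * log2(1 + P / N0)."""
    return N0 * (2.0 ** (d / B) - 1.0)

# Increasing the bandwidth (more subcarriers) lowers the power needed
# to sustain the same data rate, matching the trend reported in Fig. 2.
for B in (10.0, 20.0, 40.0):
    print(B, comm_power(d=100.0, B=B))
```

Because the required power grows exponentially in d/B, even a modest increase in the number of subcarriers yields a large power saving at a fixed rate.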
[Fig. 3. Power consumption vs. number of subcarriers for data rates from 10 to 10^6 bits/sec, with and without channel variance.]
In Fig. 4, I study the effect of a fixed value (10 ) and of variation in computing density on the tradeoff between average power consumption and bandwidth (number of subcarriers) consumption.

The DCPR is defined as the ratio of the dynamic network computing power consumption to the total available network power to complete the task [1].

In Fig. 6, I study the effect of a fixed value (10 ) and of variation in computing density on the tradeoff between DCPR and bandwidth (number of subcarriers) consumption. For the proposed network size, the computing density varies from 0.1 to 1.50. As the bandwidth (number of subcarriers) increases, the average power consumption decreases gradually. The power consumption under the channel-variance condition is higher than without channel variance, because when channel variance is present, a higher supply voltage is used. Compared to Fig. 6, the tradeoff between DCPR and bandwidth (number of subcarriers) consumption in Fig. 7 is better at every stage. Hence, compared to the switching frequency, the computing density has more effect on the tradeoff between DCPR and bandwidth (number of subcarriers) consumption.

[Fig. 6. DCPR vs. number of subcarriers for system densities a = 0.1 to 1.5, with and without channel variance.]

VI. CONCLUSION

In this paper, I proposed a new scheduling scheme for an OFDM-based WDC system. The proposed scheduling scheme gives a good tradeoff between power consumption and bandwidth (number of subcarriers) consumption, and a good tradeoff between DCPR and bandwidth (number of subcarriers) consumption. The performance of the proposed scheme is studied using computer simulations. The simulation results show that, compared to the switching frequency (number of operations), the computing density (load in the systems) has more effect on the specified tradeoffs.
REFERENCES

[4] D. J. F. Barros and J. M. Kahn, "Optimized dispersion and compensation
Abstract— In linguistics, variation has not been a matter of concern to prosodic typologists. Generally, it is treated as unwanted noise in the data, concealing what is genuinely important regarding the prosodic configuration of the language. Furthermore, most studies are limited to a single variety, ignoring cross-speaker variation or masking it by statistical processing. The final results are usually believed to be representative of the language as a whole; however, the current investigation challenges such an approach. The dialects of a language may differ in their rhythmic configuration as much as two different languages can, in terms of the acoustic correlates of rhythm class. The two dialects may thus be classified into 'stress-timed' or 'syllable-timed'. Additionally, appreciable cross-speaker variation exists within dialects. Modeling prosodically informative features has been a challenging problem, despite the broad diversity of methods investigated. This paper proposes a Prosodic Attribute Model (PAM) to capture prosodic features with robust models of the language-specific co-occurrence statistics of a comprehensive set of prosodic features. The prosodic LID system with PAM has been evaluated on NIST LRE 2007 and demonstrates a 20% relative EER reduction in comparison to a phonotactic LID system. Positive contributions are made by the comprehensive set of most prosodic attributes.

Keywords—Suprasegmental Feature, PAM, Phonotactic, LID

I. INTRODUCTION

To address the issues of spoken LID, cepstral features (MFCC: Mel-Frequency Cepstral Coefficients) have typically been used, alongside prosodic features such as voice fundamental frequency (F0), intensity, duration and F0 gradient. The curvature of the fundamental frequency (F0) contour and durational features have been found to be effective features for LID. Comparisons of various prosodic features in LID applications have been made in [Shriberg 2005, Rouas 2007, B. Yegnanarayana and L. Mary 2008]. The use of many prosodic features is limited by the lack of a precise definition of these features, in addition to the issues of feature normalization. Applying different techniques to the various types of prosodic feature yields hundreds of attributes, although the effectiveness of the various prosodic attributes is often unknown in a recognition problem [E. Shriberg 2005]. This paper extracts a number of prosodic attributes to design a Prosodic Attribute Model (PAM), wherein a comparison is made of the relative contributions of several prosodic attributes. Moreover, system fusion with PAM demonstrates a 20% relative EER reduction over the LID system on the NIST LRE 2007 data set.

The paper is arranged in the following fashion. Section II illustrates prosodic feature extraction with the aid of peak-picking syllabification and prosodic attribute examination. Section III describes the Prosodic Attribute Model (PAM), followed by the experiments and result analysis in Section IV. Finally, in Section V, a summary and conclusions are put forward.

II. PROSODIC FEATURE EXTRACTION

The rhythmic, stress and intonational properties of speech are referred to as prosody in linguistics. The phonotactic approach (in a LID system) models the co-occurrence statistics of sequences of allowable phones/phonemes in the various languages, as stated in [M.A. Zissman and K.M. Berling 2001]. Peak-Picking Syllabification (PPS) has been used, in which tokenization (as the frontend) at the syllable level is done for sequencing discrete phone units. In our approach, the maxima of the intensity profile are identified, followed by segmentation of the fundamental frequency and intensity contours into pseudo-syllabic contours. PPS, to our knowledge, shows the best performance within prosodic LID modules and has therefore been used for our work too.

Once the pseudo-syllabic contours and nuclei positions are obtained, feature extraction is performed. Features are extracted on a syllable-by-syllable basis. Introduced herein is a comprehensive set of prosodic attributes, comprising prosodic features with precise extraction methods belonging to the fundamental frequency, intensity or duration type. Bias-removal and Z-normalization approaches are utilized for measurement
normalization, consequently reducing ignorable bias due to factors like speaker variation.

    Bias removal: p'[n] = p[n] − µ
    Z-normalization: p'[n] = (p[n] − µ) / σ

Here µ and σ are estimated from a consecutive sequence of attributes over a window of 2W+1 syllables in the vicinity of the target syllable at position n. Regression analysis is performed on the contours of two syllables. The residue attributes, finally, are normalized against intonation effects. Linear regression is performed on all syllables within an utterance to obtain a phrase curve.

III. PROSODIC ATTRIBUTE MODEL

The prosodic attributes discussed in Section II can be put together into six groups, viz. fundamental frequency (F0) basis, intensity basis, duration basis, F0 regression, intensity regression and F0 residue, wherein the first three are the groups with different normalization techniques. The PAM approach to LID uses Vector Space Models (VSM) to train language recognizers. VSM with cepstral features can be studied in [H. Li et al. 2007 and W.M. Campbell et al. 2007] for details.

We start with N unigram Acoustic Words (AW), which can be regarded as the N phone models in a standard phone recognizer. To model up to the bigram level, N + N × N AWs should be considered. To combine the information (statistics) of all training syllables, the VSM is constructed by concatenating their corresponding bag-of-sounds vectors, thus training a vector-based classifier. A prosodic phone set can be derived by combining the prosodic attributes with a Cartesian product: if there exist K prosodic attributes, the total number of unigram and bigram PW (Prosodic Words) is indexed over the K feature attributes, where N(i,1) and N(i,2) are the levels of quantization for the i-th attribute at the unigram and bigram level, respectively. The PAM is defined by equation (iv), wherein the prosodic attributes have to be modeled separately. Here, for every attribute i in [1…K] there are N(i,1) + N(i,2) prosodic words.

IV. EXPERIMENTS AND RESULTS

Experiments for LID models with PAM allow flexible feature combinations. The table below compares the LID results of six partial sets. The results show that all prosodic attribute groups, rather than attributes of a particular kind, are of importance to LID. The EER degrades when the 'intensity basis' attributes are excluded, although the cause of the ineffectiveness of this attribute group is unknown; we suggest further investigation into the matter.

Fusion with phonotactic LID uses score-level system fusion on both the NIST LRE 2007 and 2009 experiments, wherein the phonotactic system adopts a PPR (Parallel Phone Recognition) followed by a VSM approach, as in [C.C. Leung et al. 2009]. Both the prosodic and phonotactic scores are calibrated separately using a linear backend. The results show that the prosodic features bring 19.02% and 8.27% relative EER reduction over the phonotactic system on the 2007 and 2009 sets, respectively.

V. DISCUSSIONS

With regard to modeling, a logarithmic distribution modeling technique is used for the fundamental frequency (F0) values and also for the F0-dynamics prosodic feature, whereas GMM modeling is used for F0 and energy dynamics, duration, prosody (using ASR), the F0 variation spectrum and the like. An ANN is used for the F0 contour (tilt parameters), energy and duration, and JFA (Joint Factor Analysis) modeling of GMM means is used for prosodic features such as F0 variation and energy using Legendre polynomials, and duration.

Prosody, being an inherent source of information robust to unwanted sounds, offers an enhancement to phoneme-based, word-based or spectral LID systems. Language-specific cues include rhythm, stress and intonation, wherein each cue is a language-dependent complex perceptual entity denoted basically as a 'triple combination' of measurable parameters, i.e. F0 (the fundamental frequency), amplitude and duration.

The task of identifying an utterance is accomplished using an automatic LID system. The LID system reduces a given utterance to lower-dimensional information, say, a probability for each possible language. There are about 7000 spoken languages, and therefore we can consider only a subset, irrespective of the differences between languages on all linguistic levels (such as different words, phoneme sets, acoustic realization, prosody, phonotactic constraints, grapheme-to-phoneme relation, etc.).

A brief note is put below on the different approaches, viz. spectral-similarity, prosody-based and phone recognition. In the 'spectral-similarity' approach, many short-term spectra are evaluated from the utterance of the language. The spectra of the specific test utterances are compared with the training utterances using a distance metric (Mahalanobis, Euclidean, etc.). The 'prosody-based' approach relies on feature extraction for pitch estimation and amplitude contours, which are later normalized to be insensitive to average amplitude, pitch and rate of speech. This approach is highly language-pair specific. The 'phone-recognition' approach investigates the phone inventory of the syllable, particularly of its utterance. The
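As a concrete illustration (a sketch, not the authors' code), the bias removal and Z-normalization of a prosodic measurement described in Section II, with the statistics estimated over the 2W+1-syllable window around the target syllable, can be written as:

```python
import statistics

def normalize_window(p, n, W):
    """Bias removal and Z-normalization of attribute value p[n], with the
    mean and standard deviation estimated over the 2W+1 syllables centred
    on position n (the window is clipped at utterance boundaries)."""
    window = p[max(0, n - W): n + W + 1]
    mu = statistics.mean(window)
    sigma = statistics.pstdev(window)
    bias_removed = p[n] - mu
    z_normalized = bias_removed / sigma if sigma > 0 else 0.0
    return bias_removed, z_normalized

print(normalize_window([1.0, 2.0, 3.0, 4.0, 5.0], n=2, W=2))
```

Subtracting the local mean removes slowly varying speaker- and channel-dependent offsets, while dividing by the local deviation equalizes attribute scales across speakers.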
characteristics of the language are extracted on the basis of the temporal order of the phones. Phonotactic constraints may be utilized in an N-gram analysis to improve the result. This approach requires phonetically labeled corpora, although it typically delivers comparatively better performance.

For language modeling, the language model is trained to predict the probability of a word u_k given its preceding words u_1^{k-1} = u_1 … u_{k-1}. In practice this is not feasible for longer utterances. Hence we compute N-grams, in which we approximate the probability of a word by its last N words:

    P(u_k | u_1^{k-1}) = P(u_k | u_{k-N+1}^{k-1})

N-gram computation is useful because it encodes the syntax, pragmatics and semantics at a local level. Its limitation is the data-sparsity problem, which can however be overcome by the use of 'back-off' and 'discounting'.

The N-gram approach does not account for grammatical rules at the sentence level, and different approaches have been proposed to remedy this deficiency. Automatic speaker recognition uses N-grams because they give a good tradeoff between accuracy and computational cost.
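The N-gram approximation above can be sketched with a maximum-likelihood bigram estimate (N = 2); this is an illustrative toy, without the back-off or discounting the text mentions:

```python
from collections import Counter

def bigram_prob(corpus, w_prev, w):
    """MLE estimate of P(w | w_prev) from a list of token sequences,
    approximating the full history u_1..u_{k-1} by the last word only."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        unigrams.update(sent[:-1])          # contexts that have a successor
        bigrams.update(zip(sent, sent[1:]))  # adjacent word pairs
    if unigrams[w_prev] == 0:
        return 0.0
    return bigrams[(w_prev, w)] / unigrams[w_prev]

corpus = [["a", "b", "a", "b"], ["a", "b", "c"]]
print(bigram_prob(corpus, "a", "b"))  # every "a" context is followed by "b"
```

The data-sparsity problem shows up here as unseen pairs receiving probability zero, which is exactly what back-off and discounting are meant to repair.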
Conclusion

In this paper we have introduced the utilization of a comprehensive set of prosodic attributes for LR (Language Recognition). Feature extraction relies on the PPS of the speech. A compact PAM for a large number of prosodic attributes is proposed for LR. Above all, as suggested in our previous papers, prosodic features may advance the performance of LID systems and language-specific modeling, while pitch-related features help LR recognition. Moreover, when the prosodic subsystem performs better, a better fusion result is generally witnessed. A study establishing the superiority in LID performance of comprehensive attribute modeling, in comparison to other prosodic LID systems modeled with a small number of features, has been made experimentally. Suprasegmental phonology, better known as prosodic features, deals with phonemes and the auditory qualities of speech.
[Fig. 7. Flowchart of the proposed algorithm: Start → Image acquisition → Image preprocessing → Image segmentation → Feature extraction → End.]
[Fig. 3. Object in cluster 2. Fig. 4. Object in cluster 3.]
III. OVERVIEW OF SYSTEM

The approaches of all the existing techniques of image classification are almost the same. First, digital images are acquired from the environment around the sensor using a digital camera. Then image-processing techniques are applied to extract useful features that are necessary for further analysis of these images. After that, classification techniques are used to identify the images according to the specific problem at hand. Figure 7 depicts the basic procedure of the proposed algorithm in this research.

The first phase is the image acquisition phase. In this step, the images of the various leaves are taken with a digital camera. In the second phase, image preprocessing is completed. In the third phase, segmentation using K-means clustering is performed to find the actual segments of the leaf in the image. Then feature extraction for the infected part of the leaf is completed, based on specific properties among the pixels in the image or their texture. After this step, certain statistical analysis tasks are completed to choose the best features that represent the image.

In the initial step, the RGB images of all the leaf samples were obtained. Samples of anthracnose disease are shown in Figure 1. For each image in the data set the following steps were repeated. Image segmentation of the leaf is done on each image of the leaf sample using K-means clustering; a sample clustered image with four clusters of the leaf sample image is shown in the figure. Once the infected object was determined, the image was then converted from RGB format to HSI format. The SGDM matrices were then generated for each pixel map of the image, for the H and S images only. The SGDM is a measure of the probability that a given pixel at one particular gray level will occur at a distinct distance and orientation angle from another pixel, given that that pixel has a second particular gray level. From the SGDM matrices, the texture statistics for each image were generated.
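The SGDM described above is essentially a gray-level co-occurrence matrix. A minimal sketch (not the authors' implementation, and assuming the H or S channel has already been quantized to a small number of gray levels) is:

```python
import numpy as np

def sgdm(img, levels, dx, dy):
    """Spatial gray-level dependence matrix: M[i, j] counts how often a
    pixel of level i has a pixel of level j at offset (dy, dx), i.e. at a
    fixed distance and orientation angle."""
    M = np.zeros((levels, levels), dtype=np.int64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                M[img[y, x], img[y2, x2]] += 1
    return M

img = np.array([[0, 1], [1, 0]])
print(sgdm(img, levels=2, dx=1, dy=0))
```

Texture statistics such as contrast, energy and homogeneity are then computed from the normalized matrix, one per image and channel.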
Krishanjeet Bhaliyan
Electronics and Communication Engineering
Department
NIT Hamirpur
Hamirpur, India
E-mail: kbhaliyan@gmail.com
Abstract— A neural network (NN) based analysis model is developed for locating the operating frequencies of two multiband fractal antennas. Well-known fractal geometries are adopted to make the antennas work in multiband and broadband applications. The developed ANN model can locate the operating frequencies of the fractal antennas. The performance of the neural model is validated against simulations.

Keywords-Fractals; microstrip antennas; multiband antennas; neural networks.

I. INTRODUCTION

In modern wireless communication systems, broadband, multiband and low-profile antennas are in great demand for both commercial and military applications. Nowadays users demand antennas that can operate over multiple frequency bands or that are reconfigurable as the demands on the system change. Furthermore, in many applications it is important for the antenna system to be as miniaturized as possible. Outstanding solutions for miniature and multiband antennas have been found in fractal antennas, and fractal antenna theory has made great progress in antenna engineering.

The word 'fractal' was originally coined by Benoit Mandelbrot to describe a family of complex shapes that possess an inherent self-similarity or self-affinity in their geometrical structure. Mandelbrot, the pioneer of classifying this geometry, coined the term in 1975 from the Latin word "fractus", which means broken. Fractals are space-filling contours, meaning electrically large features can be efficiently packed into small areas [1-3].

Several fractal geometries have been found useful in developing new and innovative designs for miniaturized and multiband antennas. Some of these classical fractal geometries, e.g. the Sierpinski gasket, Sierpinski carpet and Koch fractal geometry [4, 5], have been studied for multiband antennas.

The monopole antenna based on the Sierpinski gasket has been studied extensively as an excellent candidate for multiband applications [6]. The classical structure of this Sierpinski gasket monopole antenna is characterized by the overall height (h) and the scaling (r) of the antenna. Its electrical properties translate into a log-periodic allocation of frequency bands, where these multiple bands each have a common behavior. Manifestations of this behavior have also been observed in the radiation patterns. It has also been demonstrated that the position of the multiple bands can be controlled by proper adjustment of the scaling used to generate the Sierpinski antenna [7, 8].

Fractal antennas based on the Koch fractal shape have been widely explored for size miniaturization and multiband applications [9, 10]. The effective length of the Koch fractal curve increases at each iteration of the fractal, which shifts the frequency response of the antenna. A variety of loop antennas have been designed with a number of fractal geometries. A Koch fractal island [11, 12] is superimposed on circular loop antennas. The fractal circular loop antenna based on the Koch island curve has the advantage that the increased effective length of the Koch island loop at higher iterations can be packed into a small antenna space. A Koch loop antenna system is characterized by the radius (r) of the loop and the feed length (Lf) of the antenna. These design parameters of the antenna can control its frequency behavior.

In the present work, a neural network CAD tool is developed for two categories of multiband antennas: (i) a Sierpinski-based fractal antenna, and (ii) a Koch-based loop fractal antenna. These antennas have been simulated using CST Microwave Studio software to analyze the multiband behavior.
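The growth of the Koch curve's effective length mentioned above follows the standard Koch construction: each iteration replaces every segment by four segments one third as long, so the length scales by (4/3)^n. A toy calculation (not from the paper):

```python
def koch_length(L0, n):
    """Effective length of a Koch curve after n iterations: each iteration
    replaces every segment with 4 segments of 1/3 the length."""
    return L0 * (4.0 / 3.0) ** n

# Length grows geometrically while the occupied footprint stays fixed,
# which is why higher iterations pack more electrical length into the loop.
for n in range(4):
    print(n, koch_length(1.0, n))
```

Since the resonant frequency of a wire antenna falls as its electrical length grows, each iteration nudges the frequency response downward at a fixed footprint.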
II. MODEL OF MULTIBAND ANTENNAS

In the present work, a neurocomputational approach is used for modeling the multiband antennas.

[Figure 1. Sierpinski gasket multiband antenna (overall height h and scaling r).]
[Figure 3. Neural network input and output parameters for the Sierpinski gasket antenna: inputs h and r; outputs fr1, fr2 and fr3.]
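The ANN of Figure 3 maps the two design parameters (h, r) to the three resonant frequencies (fr1, fr2, fr3). A minimal numpy sketch of such a forward pass is shown below; the layer sizes follow Table I (2 inputs, 20 hidden neurons, 3 outputs), but the weights here are random placeholders, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from Table I: 2 inputs (h, r), 20 hidden neurons, 3 outputs.
W1, b1 = rng.normal(size=(20, 2)), np.zeros(20)
W2, b2 = rng.normal(size=(3, 20)), np.zeros(3)

def forward(h, r):
    """Forward pass: tanh hidden layer, linear output layer producing the
    three resonant-frequency estimates (fr1, fr2, fr3)."""
    x = np.array([h, r])
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2

print(forward(44.0, 0.5).shape)  # (3,)
```

In the paper the weights are fitted by backpropagation against CST-simulated frequencies to the stated error tolerance, after which the network replaces the slow electromagnetic simulation for design exploration.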
TABLE I. TRAINED NN PARAMETERS FOR THE SIERPINSKI GASKET ANTENNA.

  No. of training data sets: 153
  Input layer neurons: 2
  Hidden layer neurons: 20
  Output layer neurons: 3
  Learning rate: 0.073
  Epochs: 1000
  Error tolerance: 10^-3

TABLE II. TRAINED NN PARAMETERS FOR THE KOCH LOOP ANTENNA.

  No. of training data sets: 26
  Input layer neurons: 2
  Hidden layer neurons: 15
  Output layer neurons: 3
  Learning rate: 0.6
  Epochs: 2000
  Error tolerance: 10^-5

TABLE III (excerpt). COMPARISON OF THE NEUROMODELER WITH THE CST SIMULATOR FOR THE SIERPINSKI GASKET ANTENNA (frequencies in GHz).

  h (mm)  Scaling r  CST fr1  CST fr2  CST fr3  ANN fr1  ANN fr2  ANN fr3
  44      0.5        1.03     3.56     6.63     1.00     3.48     6.1
  44      0.6        1.03     3.36     5.87     1.01     3.33     5.95
  44      0.7        1.04     3.09     4.54     1.03     3.07     4.54
  44      0.8        1.04     2.92     4.0      1.04     2.91     4.07

TABLE IV. PERFORMANCE COMPARISON OF THE NEUROMODELER WITH THE CST SIMULATOR FOR A KOCH LOOP ANTENNA WITH THEIR DESIGN PARAMETERS (frequencies in GHz).

  r (mm)  Lf (mm)  CST fr1  CST fr2   CST fr3   ANN fr1  ANN fr2   ANN fr3
  4.5     13       8.41     16.64     22.84     8.4634   16.6587   22.8532
  5       14       7.82     15.44     21.05     7.8093   15.4112   21.0629

IV. RESULTS AND DISCUSSION
The trained neural network was tested for its validity with test data from the simulator. In order to test the generalization of the network, a set of Sierpinski gaskets of height h = 44 mm with scale factors from 0.1 to 0.9 was simulated. Table III shows the comparison of the neuromodeler response with that obtained from the simulator for the values of fr1, fr2 and fr3. As expected, the first frequency is almost the same for all values of the scaling, because it depends on the overall size of the antenna. As verified by other authors for scaling equal to, less than and greater than 0.5, the second and third frequencies shift as per the fractal behavior of the antenna, and in this range the ANN results are in close agreement with the simulation results. The neuromodeler performance was also tested for different heights and scalings of the gasket, as shown in Table III, where a good agreement can be seen. The network was trained to give fr1, fr2 and fr3 as outputs for ease of calculating all the frequencies.

The second fractal antenna was tested with the trained neural network, and the results were validated against simulations as shown in Table IV. A set of test data for the Koch loop antenna with radius r = 4.5 and 5 mm and Lf = 13 and 14 mm was taken for the validation. The performance of the network is compared with the simulator for the three frequencies fr1, fr2 and fr3. Both results are quite close, as seen in Table IV.

V. CONCLUSION

Self-similarity is a property common to many fractals, but in order to become a useful radiator it is necessary for the fractal antenna to fulfill specific requirements at the desired frequencies. For this reason, a generalized NN model was developed for the analysis of the Sierpinski gasket antenna and the Koch loop antenna in order to locate their operational frequencies. The performance of the network was validated with the simulations. With the growing interest in using fractals as multiband antennas for GSM, DECT, WLAN and ISM-band handset applications, the developed NN model can be useful to effectively locate the frequency bands of operation. Similar formulations can also be developed for other fractal structures.

REFERENCES

[1] D.H. Werner, "An overview of fractal antenna engineering," IEEE Antennas and Propagation Magazine, vol. 45, no. 1, pp. 38-57, Feb. 2003.
[2] S.R. Best and J.D. Morrow, "The Effectiveness of Space-Filling Fractal Geometry in Lowering Resonant Frequency," IEEE Antennas and Wireless Propagation Letters, vol. 1, pp. 112-115, 2002.
[3] Kulbir Singh, Vinit Grewal and Rajiv Saxena, "Fractal Antennas: A Novel Miniaturization Technique for Wireless Communications," International Journal of Recent Trends in Engineering, vol. 2, no. 5, November 2009.
[4] Douglas H. Werner, Randy L. Haupt and Pingjuan L. Werner, "Fractal Antenna Engineering: The Theory and Design of Fractal Antenna Arrays," IEEE Antennas and Propagation Magazine, vol. 41, no. 5, October 1999.
[5] Neetu, Savina Bansal and R.K. Bansal, "Design and Analysis of Fractal Antennas based on Koch and Sierpinski Fractal Geometries," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, vol. 2, issue 6, June 2013.
[6] C. Puente and J. Romeu, "On the behavior of the Sierpinski multiband fractal antenna," IEEE Transactions on Antennas and Propagation, vol. 46, no. 4, pp. 517-524, April 1998.
[7] C. Puente, J. Romeu, R. Pous, X. Garcia and F. Benitez, "Fractal multiband antenna based on the Sierpinski gasket," Electronics Letters, vol. 32, no. 1, pp. 1-2, Jan. 1996.
[8] C. Puente, J. Romeu, R. Bartoleme and R. Pous, "Perturbation of the Sierpinski antenna to allocate operating bands," Electronics Letters, vol. 32, pp. 2186-2188, Nov. 1996.
[9] S.R. Best, "On the performance properties of the Koch fractal and other bent wire monopoles," IEEE Transactions on Antennas and Propagation, vol. 51, no. 6, pp. 1292-1300, 2003.
[10] J. Liang, C.C. Chiau, X. Chen and C.G. Parini, "Printed circular disc monopole antenna for ultra-wideband applications," Electronics Letters, vol. 40, no. 20, 30 September 2004.
[11] Amer Basim Shaalan, "Design of Fractal Quadratic Koch Antenna," International Multi-Conference on Systems, Signals and Devices, 2010.
[12] C. Puente, J. Romeu and A. Cardama, "The Koch monopole: a
[14] R.K. Mishra and A. Patnaik, "Neural network-based CAD model for the design of square-patch antennas," IEEE Transactions on Antennas and Propagation, vol. 46, no. 12, pp. 1890-1891, Dec. 1998.
[15] www.cst.com
[16] www.mathworks.com

AUTHOR DETAILS

Anuradha Sonker was born in India. She received her bachelor's degree in Engineering in Electronics and Telecommunication from the University of Agra, India, in 2001, and a Master in Engineering in Telecommunications; she completed her PhD at the Indian Institute of Technology, Roorkee, India. She was one of the few recipients of an Indo-Swiss Exchange Program Fellowship to work at the Laboratory of Electromagnetics and Acoustics (LEMA), ELB, EPFL, Lausanne, Switzerland during 2010-11. Her research interests include fractal antennas, frequency-selective surfaces, and the application of biologically inspired computational techniques.

Pravin Kumar was born in India. He received his bachelor's degree in Engineering in Electronics and Communication from Uttar Pradesh Technical University, Lucknow, India, in 2012, and is pursuing a Master in Engineering in Communication Systems and Networks at the National Institute of Technology, Hamirpur, India. His research interests include fractal antennas and artificial neural networks for antennas.
Krishanjeet Bhaliyan was born in India. He received his bachelor’s
small fractal antenna”. IEEE Trans. On Antennas and Propagation,
Vol. 48, pp. 1173-1781, 2001. degree in Engineering in Electronics and Communication from Uttar
Pradesh Technical University, Lucknow, India, in 2012, and Pursuing
[13] Y. Kim, “Application of artificial neural networks to broadband Master in Engineering in Communication System and Networks from
antenna design based on parametric frequency model”, IEEE National Institute of Technology, Hamirpur, India. His research interests
Transactions on Antennas and Propagation., vol. 55, no.3, pp. include fractal antennas, ultra wideband antennas.
669–674, March 2007.
Abstract— A wireless sensor network consists of a large number of sensor nodes deployed in a hostile environment to monitor the physical world accurately. A wireless sensor network suffers from many issues due to the resource-constrained nature of its nodes, but energy is the most important area of concern, as replacement of batteries is a tedious task. Enhancing network lifetime by minimizing energy usage is therefore of utmost importance, and one of the ways to enhance network lifetime is clustering of sensor nodes.

In this paper we propose a new clustering protocol called Density Aware Energy Efficient Clustering Protocol (DAEECP) for wireless sensor networks, which is a variation of an existing protocol called Enhanced Energy Efficient Protocol with Static Clustering (EEEPSC). Similar to EEEPSC, it forms static clusters to reduce the overhead of dynamic clustering. However, unlike EEEPSC, cluster head selection is performed by taking into account the density of nodes apart from the spatial distribution of nodes and their residual energy. The protocol divides each cluster into two equal parts and calculates the number of nodes in each part; the cluster head is chosen from the part having the higher density of nodes. A set of experiments has been performed to compare the results of DAEECP with existing schemes such as LEACH and EEEPSC. Based on the experimental evidence, the proposed protocol outperforms the existing schemes.

Keywords— Clustering, cluster head, density, network lifetime, energy efficiency, wireless sensor networks.

I. INTRODUCTION

Recent technological advances allow us to envision a future where large numbers of low-power, inexpensive sensor devices are densely embedded in the physical environment, operating together in a wireless network. A sensor network is an infrastructure comprised of sensing (measuring), computing, and communication elements that gives an administrator the ability to instrument, observe, and react to events and phenomena in a specified environment, which enables the deployment of large-scale sensor networks. The envisioned applications of these wireless sensor networks range widely: ecological habitat monitoring, structural health monitoring, environmental contaminant detection, industrial process control, and military target tracking [1].

Even though wireless sensor networks have countless applications, they also suffer from some drawbacks, such as limited node battery lifetime. In most applications, where nodes are deployed in unattended or inaccessible areas, it is impossible to replace their batteries. Therefore most of the research in WSNs is focused on how to minimize energy usage in order to extend the network lifetime.

One of the ways to minimize such energy usage is the employment of clustering [2]. Clustering is defined as the grouping of similar objects, or the process of finding a natural association among sensor nodes. It is used in WSNs to transmit aggregated data to the base station, and it minimizes the number of nodes that take part in long-distance communication, which leads to a reduction of total energy consumption. Clustering also conserves bandwidth, since all inter-cluster interactions are handled by cluster heads (CHs). CHs aggregate the data collected from their sensors, which decreases the number of relayed packets. A number of works related to clustering have been discussed in the literature [3-9].

In the current paper we propose a new clustering strategy, Density Aware Energy Efficient Clustering Protocol (DAEECP). It is based on static clustering, which eliminates the dynamic clustering overhead. Apart from energy conservation, it also tries to ensure coverage by taking the density of sensors into account. Once the clusters are formed, they are divided into two parts, left and right. Cluster heads are always selected from the denser part, which improves the network lifetime as well as coverage.

The rest of the paper is summarized as follows. Section II describes the work related to clustering. In Section III the proposed protocol is discussed. Simulation results and conclusions are presented in Sections IV and V respectively.
II. RELATED WORK

The Low Energy Adaptive Clustering Hierarchy (LEACH) [3] given by Heinzelman is one of the most popular clustering protocols aimed at enhancing network lifetime. The operation of LEACH is divided into two phases: a setup phase and a steady state phase. In the setup phase clusters are formed and cluster heads are elected. In the steady state phase the actual data transfer takes place. LEACH has some major drawbacks, such as extra overhead due to dynamic clustering, and it fails when the elected cluster heads are concentrated in one part of the network.

To overcome the inefficiencies of LEACH, the same authors proposed a new protocol, LEACH-C [4], which takes the location and energy of nodes into account so that cluster heads are well distributed over the network area. However, LEACH-C makes one unrealistic assumption: that all nodes have enough power to talk directly with the base station.

Lindsey and Raghavendra proposed Power Efficient Gathering in Sensor Information Systems (PEGASIS) [5], an enhancement over LEACH in which nodes need to communicate with their closest neighbors only, and they take turns in communicating with the base station. It outperforms LEACH but suffers from the problem of extreme delay, and is thus unable to provide quality of service (QoS).

To overcome the problem of delay, the same authors proposed an extension to PEGASIS [6] in which they enabled parallel transmission of data instead of linear data transmission among sensor nodes.

A. S. Zahmati and his group proposed a protocol called Energy Efficient Protocol with Static Clustering (EEPSC) [7] in 2007, which works on static clustering and thus eliminates the overhead of dynamic clustering. Clusters are formed only once, at the start. After the formation of clusters it uses the residual energies of nodes to decide which node should act as cluster head in its respective cluster. It also outperforms LEACH, but sometimes the selected cluster head lies near the boundary of the network area, which unnecessarily drains the batteries of other nodes and thus decreases the network lifetime. It also does not take the location of nodes into consideration when choosing a cluster head.

S. K. Chaurasiya tries to overcome this problem of EEPSC in the protocol EEEPSC [8], which uses the spatial distribution of nodes along with their residual energies to decide the cluster heads. The node having a high value of residual energy and placed close to the centre is chosen as cluster head, so that energy consumption due to inter-cluster communication is minimized.

III. PROPOSED SCHEME

This section describes the proposed scheme, Density Aware Energy Efficient Clustering Protocol, followed by the architecture of the proposed protocol. The proposed scheme is a modification of EEEPSC [9] which takes the density of nodes into account, apart from residual energies and the spatial distribution of nodes, and thus enhances the network lifetime.

A. Network Model

We consider a network model similar to the one used in [3, 4, 8, 9]:

- All sensor nodes are static and homogeneous with limited energy resources.
- The nodes have power control capabilities.
- The base station is fixed and located outside the network area.
- The base station knows the location of all nodes in the network.
- The network adopts the continuous data flow model.

B. Energy Model

We adopt a generalized first-order radio energy model for measuring energy consumption during communication. The transmitter dissipates energy to run the radio electronics and the power amplifier, while the receiver dissipates energy only for the radio electronics, as shown in Fig. 1. Both free-space and multipath fading channel models are used, depending upon the distance between communicating nodes: if the distance is less than a predefined threshold d0, the free-space model is used, while the multipath model is used in other cases. Thus, to transmit an l-bit message over a distance d, the radio expends

E_TX(l, d) = E_TX-elec(l) + E_TX-amp(l, d)
           = l·E_elec + l·ε_fs·d²,  d < d0
           = l·E_elec + l·ε_mp·d⁴,  d ≥ d0        (1)

where d0 is:

d0 = √(ε_fs / ε_mp)        (2)

Similarly, to receive an l-bit message the radio expends

E_RX(l, d) = E_RX-elec(l) = l·E_elec        (3)

where d is the distance between sensor nodes and their respective cluster heads, or the distance between cluster heads and the base station; E_elec is the energy expended in the transmitter/receiver electronics; and ε_fs, ε_mp are the transmitter amplifier energy expenses of a sensor node when d < d0 and d ≥ d0 respectively, required to achieve an acceptable signal-to-noise ratio (SNR).

C. Protocol Architecture

Like EEPSC [7] and EEEPSC [8], DAEECP is also a static clustering scheme in which clusters are formed only once, at the start, with the help of the base station. The complete operation consists of several rounds, where each round is broken into three phases: a setup phase, a responsible node selection phase and a steady state phase. The operation of the protocol along with these phases is discussed in the following subsections.

1) Setup Phase: If more nodes lie in one part of a cluster than in the other, the cluster head will be selected from the higher-density part. Besides this, the base station chooses a temporary cluster head and announces the TDMA schedule for all the nodes in each cluster. After all these tasks the base station broadcasts the 3-tuple (TCHi, TDMA schedule, …) to every node in each cluster. This completes the setup phase.
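As a rough sketch of the density-aware selection just described, the fragment below splits a cluster into two halves, counts the nodes on each side, and picks the cluster head from the denser side. The `Node` class, the split-by-x-coordinate rule and the residual-energy tie-break are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of density-aware cluster-head selection (assumed
# details; not the authors' code). The cluster is split into two halves
# at x_mid, nodes are counted on each side, and the cluster head is
# taken from the denser side, preferring the highest residual energy.
from dataclasses import dataclass

@dataclass
class Node:
    x: float       # node position along the split axis
    energy: float  # residual energy (J)

def choose_cluster_head(nodes: list[Node], x_mid: float) -> Node:
    left = [n for n in nodes if n.x < x_mid]
    right = [n for n in nodes if n.x >= x_mid]
    denser = left if len(left) > len(right) else right
    # within the denser half, pick the node with the most residual energy
    return max(denser, key=lambda n: n.energy)
```

For example, with three nodes to the left of the split and one to the right, the head is the highest-energy node among the left three.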
2) Responsible Node Selection Phase: Nodes from denser areas are eligible for cluster head election, whereas nodes from less dense areas work as normal nodes only. The selected cluster head informs the nodes of its cluster about the beginning of the round.

3) Steady State Phase: In this phase nodes turn on their radios in their respective slots and send the sensed data towards their respective cluster heads. Cluster heads always keep their radios on to receive data from all nodes. Cluster heads remove redundancy and send the aggregated data to the base station. A round terminates after a predefined time period.

IV. PERFORMANCE EVALUATION

MATLAB is used as a simulation tool to evaluate the performance of DAEECP and to compare it with existing protocols like EEEPSC. The simulation environment, performance metrics and results are discussed in this section.

A. Simulation Environment

B. Performance Metrics

The metrics considered are the energy consumption per round and the number of messages received at the base station. Network lifetime is defined as the time until the last node dies in the network. The number of messages received at the base station reflects the throughput.

C. Results and Discussion

A set of experiments is conducted to compare the performance of the proposed scheme DAEECP with existing schemes like EEEPSC. Fig. 3 shows the number of messages received at the base station over time. It shows that the number of messages received in DAEECP is higher than that of EEEPSC, which clearly reflects that DAEECP outperforms EEEPSC.
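The first-order radio model of Eqs. (1)-(3), which underlies this kind of per-round energy accounting, can be sketched as follows. The numeric parameter values are typical values from the LEACH literature, used here only as illustrative assumptions, not values given in this paper.

```python
# First-order radio energy model, Eqs. (1)-(3). Parameter values are
# illustrative assumptions (typical in the LEACH literature).
import math

E_ELEC = 50e-9       # electronics energy per bit, J/bit
EPS_FS = 10e-12      # free-space amplifier energy, J/bit/m^2
EPS_MP = 0.0013e-12  # multipath amplifier energy, J/bit/m^4
D0 = math.sqrt(EPS_FS / EPS_MP)  # distance threshold d0, Eq. (2)

def tx_energy(l_bits: int, d: float) -> float:
    """Energy to transmit an l-bit message over distance d, Eq. (1)."""
    if d < D0:
        return l_bits * E_ELEC + l_bits * EPS_FS * d ** 2  # free-space
    return l_bits * E_ELEC + l_bits * EPS_MP * d ** 4      # multipath

def rx_energy(l_bits: int) -> float:
    """Energy to receive an l-bit message, Eq. (3)."""
    return l_bits * E_ELEC
```

With these values the threshold d0 is roughly 87 m, so a 4000-bit packet sent over 50 m is charged under the free-space term and the same packet over 200 m under the multipath term.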
Abstract— In all distributed wireless sensor networks, synchronization is important, as it provides a common time scale for the local clocks of nodes in the network. Several types of time synchronization protocols have been discussed in the literature, depending upon the application. In this article we present a survey of time synchronization in WSNs and of the challenges in achieving time synchronization. We also discuss the drawbacks and accuracy of each technique. The article concludes with possible future directions.

Keywords— Distributed wireless sensor networks, clock synchronization, controller, clock offset, clock drift, challenges.

I. INTRODUCTION

Time synchronization is a vital piece of infrastructure in any distributed system; however, wireless sensor networks make especially extensive use of synchronized time. Synchronization is essential for many sensor network applications that require precise mapping of the data collected from the sensors to time information, such as object tracking, health monitoring, in-network signal processing, medium access, sleep scheduling, industrial process automation, environmental monitoring, etc.

Time synchronization is a research area in which people have been working over the past few decades. Several techniques have been discussed in this domain, like Cristian's synchronization algorithm, reference broadcast synchronization (RBS), pair-wise sender-receiver synchronization (TPSN), and flooding time synchronization (FTSP). We are basically concerned with three problems in WSN time synchronization. The first limitation is energy, since sensor nodes are mostly battery-operated. Second, achieving synchronization requires messages to be exchanged, while the bandwidth of wireless communication is limited. The third problem is the small size of a sensor node, which imposes restrictions on computational power and storage capacity. Therefore existing synchronization schemes such as the network time protocol (NTP) and GPS are not feasible in wireless sensor networks because of time complexity, energy issues, cost, limited size and so on.

The rest of the paper is organized as follows. Section II describes the synchronization challenges in WSNs, while Section III gives the details of the different types of time synchronization protocols. Concluding remarks are given in Section IV.

II. SYNCHRONIZATION CHALLENGES IN WSNS

A. Nondeterministic delays

There are many sources of uncertainty and delay which impact the accuracy of time synchronization. In general, message latency can be decomposed into four parts, each of which contributes to uncertainty [1], [2]:

Send time- It includes any processing time and the time taken to assemble the message and move it to the link layer.

Access time- It includes random delays while the message is buffered at the link layer due to contention and collisions. This is the most significant component, and it is highly variable depending upon the specific MAC protocol.

Propagation time- It is the time taken for point-to-point message travel. It may be considered negligible for a single link, and may be the dominant term over multiple hops if there is network congestion.

Receive time- It is the time taken to process the message and record its arrival.

B. Clock offset and skew

Every individual sensor in a network has its own clock. Ideally, the clock of a sensor node should be configured such that C(t) = t, where t stands for the ideal or reference time. However, because of the imperfection of the clock oscillator, a clock will drift away from the ideal time even if it is initially perfectly tuned. For example, according to the data sheet of a typical crystal-quartz oscillator commonly used in sensor networks, the frequency of the clock varies up to 40
ppm. In general, the clock function of the node is modelled as [4]:

C(t) = θ + f·t        (1)

where the parameters θ and f are called the clock offset (phase difference) and the clock skew (frequency difference) respectively. A graphical representation of the clock model is illustrated in Fig. 1. For a pair of nodes, the clock of one node can be modelled relative to that of another as

C1(t) = θr + fr·C2(t)        (2)

and all the relative clock offsets and skews are estimated with respect to a reference node.

C. Robustness

Sensor nodes in wireless sensor networks are always distributed on a large scale, and a sensor node may be connected to another through many hops. This increases the difficulty of reducing the convergence time in time synchronization algorithm design. Synchronization can be classified into two types:

- Centralized synchronization
- Distributed synchronization

Centralized synchronization protocols such as RBS [5], TPSN [6] and FTSP [2] have fast convergence speed and little synchronization error. These types of protocols divide the nodes into different roles, for example, client and beacon node in RBS. If the node with a special role fails, the protocol will suffer. To cope with the dynamic topology of WSNs, a centralized synchronization protocol has to be designed with complex logic. Another disadvantage of centralized synchronization protocols is that the synchronization error grows with the number of network hops.

Distributed synchronization protocols such as TDP [7], GCS [8], RFA [9], ATS [10] and CCS [11] can use local information to achieve whole-network synchronization. This kind of protocol can easily adapt to the dynamic topology of WSNs. The disadvantage of distributed synchronization is that the convergence speed may be a bit slow, depending on the network topology.

Fig. 2. Cristian's synchronization algorithm

B. Reference Broadcast Synchronization (RBS)

This protocol was proposed in [5]. It is a receiver-receiver synchronization protocol. To understand this algorithm, consider three sensor nodes A, B and C within the same broadcast domain. If B is a beacon node, it broadcasts a reference signal (which contains no timing information) that is received by nodes A and C simultaneously. The two receiver nodes A and C record the local time at which the reference signal was received. Then nodes A and C exchange these local timestamps through a separate message. This is sufficient for the two receivers to determine their relative offset at the time of reference message reception. This procedure is illustrated in Fig. 3.

In TPSN's level discovery phase, once a node is assigned a level, it discards further incoming level discovery packets. This chain goes on through the network, and the phase is completed when all nodes are assigned a level.
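The RBS timestamp exchange described above can be sketched as follows. The use of a simple mean over several beacons is an illustrative simplification (the published RBS also fits the clock skew by least squares), and the timestamp values in the example are assumptions.

```python
# Sketch of RBS relative-offset estimation (illustrative simplification).
# Receivers A and C timestamp the same beacon broadcasts with their local
# clocks, exchange the timestamps, and estimate their relative offset as
# the mean pairwise difference over the beacons.

def rbs_offset(ts_a: list[float], ts_c: list[float]) -> float:
    """Offset of A's clock relative to C's, from matched beacon timestamps."""
    assert len(ts_a) == len(ts_c) and ts_a
    diffs = [a - c for a, c in zip(ts_a, ts_c)]
    return sum(diffs) / len(diffs)
```

For instance, with local beacon timestamps [10.02, 11.03, 12.01] at A and [10.00, 11.00, 12.00] at C, the estimated relative offset is about 0.02 s.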
Taking into account the order of the transferred messages, the following relationships would hold [4]:

C1(t) < C2(t) + δmax        (6)

C1(t) > C2(t) + δmin        (7)

A further refinement makes this algorithm more efficient: before a sensor node sends a synchronization request, it checks with its neighbours to see if any request is pending; if one is, it adds its request to the pending one, so that the path will be synchronized only once.
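Returning to the linear clock model C(t) = θ + f·t of Section II, the effect of the 40 ppm skew mentioned earlier can be illustrated with a short sketch; the one-hour horizon is an arbitrary illustrative choice.

```python
# Linear clock model C(t) = theta + f*t: an imperfect clock's reading
# differs from the ideal time t through its offset (theta) and skew (f).

def clock_reading(t: float, theta: float, f: float) -> float:
    """Local clock reading at ideal time t."""
    return theta + f * t

# drift of a 40 ppm fast clock relative to a perfect one after one hour
perfect = clock_reading(3600.0, 0.0, 1.0)
fast = clock_reading(3600.0, 0.0, 1.0 + 40e-6)
drift = fast - perfect  # about 0.144 s of drift per hour at 40 ppm
```

This is why even initially well-tuned sensor clocks must be resynchronized periodically: at 40 ppm two free-running clocks can diverge by more than a tenth of a second every hour.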