Contents
Welcome Message
Program at a Glance
Keynote Speeches
Tutorials
Technical Program
Session 1A
Session 1B
Session 2A
Session 2B
Poster Session
Session 3A
Session 3B
Session 4A
Session 4B
TPC co-Chairs
Conference Organization
Keynote Speech 1
Biography
Ramesh Harjani is the E.F. Johnson Professor of Electronic Communications in the Department of
Electrical & Computer Engineering at the University of Minnesota. He is a Fellow of the IEEE. He received his
Ph.D. in Electrical Engineering from Carnegie Mellon University in 1989. He was at Mentor Graphics, San Jose
before joining the University of Minnesota. He has been a visiting professor at Lucent Bell Labs, Allentown, PA
and the Army Research Labs, Adelphi, MD. He co-founded Bermai, Inc, a startup company developing CMOS
chips for wireless multi-media applications in 2001. His research interests include analog/RF circuits for wired and
wireless communication systems.
Dr. Harjani received the National Science Foundation Research Initiation Award in 1991 and Best Paper
Awards at the 1987 IEEE/ACM Design Automation Conference, the 1989 International Conference on Computer-Aided Design, and the 1998 GOMAC. His research group was the winner of the SRC Copper Design Challenge in
2000 and the winner of the SRC SiGe challenge in 2003. He is an author/editor of seven books. He was an Associate
Editor for IEEE Transactions on Circuits and Systems Part II, 1995-1997, a Guest Editor for the International Journal
of High-Speed Electronics and Systems and for Analog Integrated Circuits and Signal Processing in 2004 and a
Guest Editor for the IEEE Journal of Solid-State Circuits, 2009-2011. He was a Senior Editor for the IEEE Journal
on Emerging & Selected Topics in Circuits & Systems (JETCAS), 2011-2013. He was the Technical Program Chair
for the IEEE Custom Integrated Circuits Conference 2012-2013, the Chair of the IEEE Circuits and Systems Society
technical committee on Analog Signal Processing from 1999 to 2000 and a Distinguished Lecturer of the IEEE
Circuits and Systems Society for 2001-2002.
Keynote Speech 2
Evolution of Opto-electronics
Technologies for Ultrawide-band Optical
Transmissions and Wireless
Communications
Kazutoshi Kato, Kyushu University
Abstract
Opto-electronics technologies have been successfully developed to expand the capacity of optical
transmissions. Laser diodes, optical modulators and photodetectors have usually been the key
components, and combinations of these components with electronics continue to lead to
new technologies. Recently, opto-electronics technologies have also come to play an important role in ultrawide-band wireless
communications. In this presentation, I first review the background of the optical fiber network system and
the requirements it places on opto-electronics devices. Next, I take up the technologies of laser diodes and
photodetectors, which have been the essential components of the optical fiber network, and explain their history
and recent trends. Then, I show a future photonics approach to realizing an ultrawide-band wireless system
such as terahertz-wave communication. Finally, I present the activities of our laboratory on the high-speed
wavelength-tunable laser and terahertz carrier generation.
Biography
Kazutoshi Kato received the B.S. and M.S. degrees in physics and the Ph.D. degree from Waseda
University, Tokyo, Japan, in 1985, 1987, and 1993, respectively.
From 1987, he was with NTT Opto-Electronics Laboratories, Kanagawa, Japan, where he was engaged
in research on opto-electronics devices for wide-band optical transmissions, microwave applications, and optical
access networks. From 1994 to 1995, he was on leave from NTT at France Telecom CNET Bagneux Laboratory,
France, as a Visiting Researcher working on novel photodetectors. From 2000 to 2003, he was with NTT Electronics
Corporation, where he was involved in developing photonic network systems. From 2009 to 2011, he was an
executive manager at the NTT Photonics Laboratories, Atsugi, Kanagawa, Japan. He is currently a Professor of
Information Science and Electrical Engineering, Kyushu University. His current research interests include the
advanced opto-electronics devices and subsystems for high-speed optical transmissions and high-frequency wireless
communications.
Dr. Kato is a senior member of the IEEE Photonics Society, a senior member of the Institute of Electronics,
Information and Communication Engineers ( IEICE ), Japan, and a member of the Japan Society of Applied Physics.
Keynote Speech 3
Biography
Bahman Javadi is a Senior Lecturer in Networking and Cloud Computing at the University of Western
Sydney, Australia. He was recently appointed Director of Academic Program for the Postgraduate ICT Course
in the School of Computing, Engineering and Mathematics. Prior to this appointment, he was a Research Fellow at
the University of Melbourne, Australia. From 2008 to 2010, he was a Postdoctoral Fellow at INRIA Rhône-Alpes, France. He was a Research Scholar at the School of Engineering and Information Technology, Deakin
University, Australia during his PhD course. He is a co-founder of the Failure Trace Archive, which serves as a public
repository of failure traces and algorithms for distributed systems. He has received numerous Best Paper Awards at
IEEE/ACM conferences for his research papers. He has served on the program committees of many international
conferences and workshops. He has also guest edited many journal special issues. His research interests include
Cloud and Grid computing, performance evaluation of large-scale distributed computing systems, and reliability
and fault tolerance.
Keynote Speech 4
Next-generation Self-organizing
Networks
Haris Gačanin, Alcatel-Lucent Bell
Abstract
The next generation (called 5G) of communication systems will most
likely not be an incremental advance on contemporary communication systems. They are expected to be extremely
dense and heterogeneous, which introduces many new challenges for network optimization and management. It is
still under discussion whether 5G networks will further enhance peak data rates or focus on area-wise spectral
and energy efficiency. In general, it is expected that 5G innovations will enable new services and enrich our
societies beyond what we experience today. However, the largest technology challenge will be to enable customer-centric technologies that take customers' quality of experience into consideration.
Next-generation networks should aim to enrich the customer experience by providing broadband
multimedia content (a thousand-fold increase in network capacity) and connectivity for a mass (billions) of
devices. Because of this, 5G networks are expected to require more advanced self-organization and self-optimization (Self-X) capabilities, mainly because current concepts may not be
flexible enough to support such complex deployments and ultra-high performance requirements. This
is even more challenging when we consider that services may (and most probably will) have different performance
requirements (e.g. latency, bandwidth, etc.). Hence, in 5G networks, customer (and service) management may be
an integral part of the network optimization process.
Today's requirements from the mobile customer perspective are known. Customers expect to be connected
all the time through different devices. They expect to have access to broadband services indoors (home, office,
shopping mall) and outdoors. Today, mobile data traffic is growing tenfold, mainly from indoor users, and it is
clear that contemporary communication systems may not support this trend. Studies have shown that more than 50
percent of voice and 70 percent of all data traffic originates from indoor users. This sets a challenging requirement
on 5G technologies to provide both target data rates per area and a seamless customer experience with respect to
network, device and service. Future Self-X networks must be able to provide a high quality of customer experience
across the network by maintaining seamless connectivity and connection quality irrespective of location and/or
interference from other sources. This is not the case today. In this presentation we give an overview of the technical
and business requirements for customer-centric Self-X networks. We point out the issues that may arise with respect
to their optimization and management. With this in mind, we also describe the technical challenges and
give some ideas of possible directions. Finally, unlike today, new technologies must be able to utilize service
information and thus optimize both the network and the service quality per customer.
Biography
Haris Gačanin received his Dipl.-Ing. degree in Electrical Engineering from the Faculty of Electrical
Engineering, University of Sarajevo in 2000. He received his M.E.E. and Ph.D.E.E. from the Graduate School of
Electrical Engineering, Tohoku University, Japan, in 2005 and 2008, respectively. From April 2008 to May 2010
he worked first as a Japan Society for Promotion of Science (JSPS) postdoctoral research fellow and then
as an Assistant Professor at the Graduate School of Engineering, Tohoku University. He is currently working as
Research Director at Alcatel-Lucent Bell, Antwerp, Belgium. His professional interest is to develop, lead and
motivate the activities of real and virtual multinational research and development teams with strong emphasis on
product/solution development through applied research projects. His work covers advanced signal processing and algorithms with
a focus on mobile/wireless and wireline physical (L1) and media access (L2) layer technologies and network
architectures. He has more than 120 scientific publications (journals, conferences and patent applications). He is a
senior member of the Institute of Electrical and Electronics Engineers (IEEE) and a senior member of the Institute of
Electronics, Information and Communication Engineers (IEICE), where he is chair of the Europe Section. He is an
Associate Editor of IEICE Transactions on Communications and has acted as a chair, reviewer and technical program
committee member for various technical journals and conferences. He is a recipient of the 2013 Alcatel-Lucent
Award of Excellence, the 2010 KDDI Foundation Research Grant Award, the 2008 Japan Society for Promotion of
Science (JSPS) Postdoctoral Fellowships for Foreign Researchers, the 2005 Active Research Award in Radio
Communications, the 2005 Vehicular Technology Conference (VTC 2005-Fall) Student Paper Award from the IEEE VTS
Japan Chapter and the 2004 IEICE Society Young Researcher Award. He was awarded a Japanese
Government (MEXT) Research Scholarship in 2002.
Keynote Speech 5
conducting plane is 0.245λ (λ: wavelength) by 0.49λ, the antenna height is λ/30, and the length of the horizontal
element is around λ/4, the input impedance of this antenna is matched to 50 Ω and its directivity becomes more than
4 dBi.
In this antenna, the inverted L element and the conducting plane are strongly coupled and the
electromagnetic field concentrates near the inverted L element and the ground plane.
structure and adding the parasitic elements, the dual band antenna, the wideband antenna for TV reception, and the
high gain planar antenna have been proposed. The single band MIMO and dual band MIMO antennas composed
of two ULPIL antennas have been proposed. The circularly polarized antenna composed of a ULPIL antenna and an L-shaped slot, and the antennas for the wireless power transmission (WPT) system have also been proposed. When
the distance between the transmitting and receiving antennas in the WPT system is 10 mm, a power transfer efficiency of
99.2 % is obtained at the design frequency of 1 GHz. In this talk, these antennas will be introduced and their
design concepts will be presented.
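The electrical sizes quoted in this abstract can be turned into physical dimensions once a frequency is chosen. The following is a minimal sketch, not the speaker's design procedure: it assumes free-space wavelength, borrows the 1 GHz figure from the WPT example purely for illustration, and uses hypothetical names (`ulpil_dimensions_mm`, `plane_side_a`, `plane_side_b`) because the abstract does not say which plane dimension is which.

```python
# Speed of light in vacuum in m/s (exact by SI definition).
C0 = 299_792_458.0


def wavelength_mm(freq_hz: float) -> float:
    """Free-space wavelength in millimetres."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return C0 / freq_hz * 1e3


def ulpil_dimensions_mm(freq_hz: float) -> dict:
    """Scale the electrical sizes quoted in the abstract to millimetres.

    The key names are illustrative assumptions; only the ratios
    (0.245, 0.49, 1/30, 1/4 of a wavelength) come from the abstract.
    """
    lam = wavelength_mm(freq_hz)
    return {
        "plane_side_a": 0.245 * lam,
        "plane_side_b": 0.49 * lam,
        "antenna_height": lam / 30,
        "horizontal_element": lam / 4,
    }


if __name__ == "__main__":
    # 1 GHz is used only as an illustrative frequency.
    for name, value in ulpil_dimensions_mm(1e9).items():
        print(f"{name}: {value:.1f} mm")
```

At 1 GHz the wavelength is close to 300 mm, so a height of one thirtieth of a wavelength is roughly 10 mm, which is what makes the structure "ultra low profile" in practical terms.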
Biography
He received his B. E. and M. E. degrees from Saga University, Japan in 1975 and 1977, respectively, and
a Dr. Eng. Degree from Kyushu University Japan in 1986.
University.
researcher at the Department of Electrical Engineering at the University of California, Los Angeles. Since 2007,
he has been a Professor in Nagasaki University. His research interests are low profile antennas for mobile
communication and the education by using the electromagnetic simulator. He was a Chair of Technical group of
Microwave Simulator in IEICE from 2006 to 2007, IEEE AP-S Fukuoka Chapter Chair from 2007 to 2008, and
IEICE Kyushu Section Chair in 2013. He wrote the following books; Portable TV Antenna, in Antenna
Engineering Handbook Fourth Edition, Chapter 30, edited by J. Volakis, McGraw Hill, 2007, Modern Antenna
Engineering, Sogo-Denshi Publishing, 2004 (in Japanese), and so on.
Tutorial 1
Abstract
Today, it is estimated that there are over 5 billion broadband devices connected to different access (indoor
and outdoor) networks with more than one billion mobile broadband users. As more devices, applications, content
and services are connected to communications networks, the resulting complexity is driving up costs for service
providers and putting the customer experience at risk. This is especially so given that different
wireless and wireline access technologies coexist around customers. To succeed in this rapidly changing
market, operators need the ability to deliver high-value services that differentiate and enhance the customer
experience over their access technologies, such as fiber or digital subscriber line (DSL), powerline communication
(PLC), Wi-Fi and mobile (macro, small cells), all targeting speeds of up to and exceeding 1 Gbps. This tutorial
provides an overview of current communication technologies available for broadband access networks. The focus
is on the mobile technology (and network) evolution with respect to both user and operator challenges and
management (optimization) requirements. Finally, the talk presents a few advanced examples of optimization
solutions suitable for all-IP networks.
Tutorial 2
Abstract
OpenPAT.org is the home of the Open Program Analysis Toolkit project that originated in Cambridge
and Imperial UK. OpenPAT differs from program analysis toolkits such as SUIF (Stanford), GILK (Imperial),
Valgrind (Cambridge) and Pin (Intel) in that it instruments code statically and gathers dynamic timing, control and
data flow information as the program runs. In this presentation we will review the OpenPAT approach and examine
its benefits in comparison with the alternatives and then we will create a new tool for OpenPAT with just a few lines
of code that can be used to analyse the internal workings of programs written in any compilable language.
Biography
Dr Simon Spacey graduated top of the class in Computer Science at Cambridge University and completed
his Ph.D. in Computer Science at Imperial College London early, winning the Systems Prize. Simon is currently a
Senior Lecturer at Waikato University in New Zealand where he lectures Computer Systems, Computer
Architectures, Software Engineering and Computational Optimization. Simon's main research area is in
Performance Optimisation which involves using tools like OpenPAT to analyse software to identify new system
architectures that deliver computational and power advantages.
I.
INTRODUCTION
II.
The proposed geometry of the antenna array with CSRRs-based DGS is shown in Fig. 1; it was obtained through simulations using a
commercial full-wave analysis software package. The
rectangular patch has dimensions W = 11 mm and L = 9 mm,
whereas the feeding microstrip has length Ltl = 17.5 mm and
width Wtl = 3.4 mm, which ensures a 50 Ω characteristic
impedance. The inset length provides the necessary
impedance matching. The substrate used for this array is
Rogers RO3003 with a thickness of t = 1.524 mm and a
dielectric constant of εr = 3. The spacing between the elements
is chosen to be 7.5 mm (0.22 λ0). The CSRR structures are designed
to produce transmission zeros in the operating band of the antenna
array. The dimensions of the CSRR structures chosen for this
frequency of operation are rin = 1 mm, rin1 = 1.2 mm, c = 0.4
mm, g = 0.4 mm and d = 0.4 mm. The BSF significantly affects
the array mutual coupling and the isolation between the
two elements; the proposed geometry shows a small
deviation in the resonant frequency of about 2.7 % (250 MHz)
due to the presence of the CSRRs-based DGS in the ground
plane. In numerical experiments, the proposed configuration produced a mutual coupling
of about -61 dB, better than that of the conventional array with the same
dimensions.
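The feed-line and spacing figures quoted above can be sanity-checked with textbook closed forms. The sketch below is an illustration under stated assumptions, not the authors' full-wave model: it uses a widely used quasi-static microstrip approximation (valid for W/h >= 1, zero conductor thickness, no dispersion), and the function names are hypothetical.

```python
import math

# Speed of light in vacuum in m/s (exact by SI definition).
C0 = 299_792_458.0


def microstrip_z0(width_mm: float, height_mm: float, eps_r: float):
    """Quasi-static characteristic impedance of a microstrip line.

    Closed-form approximation for W/h >= 1 that neglects conductor
    thickness and dispersion. Returns (Z0 in ohms, effective permittivity).
    """
    u = width_mm / height_mm
    if u < 1:
        raise ValueError("this closed form assumes W/h >= 1")
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 / u)
    z0 = 120 * math.pi / (
        math.sqrt(eps_eff) * (u + 1.393 + 0.667 * math.log(u + 1.444))
    )
    return z0, eps_eff


def frequency_for_spacing(spacing_mm: float, fraction_of_lambda0: float) -> float:
    """Frequency (Hz) at which a spacing equals a given fraction of lambda0."""
    lambda0_m = spacing_mm / fraction_of_lambda0 * 1e-3
    return C0 / lambda0_m


if __name__ == "__main__":
    z0, eps_eff = microstrip_z0(width_mm=3.4, height_mm=1.524, eps_r=3.0)
    print(f"Z0 ~ {z0:.1f} ohm, eps_eff ~ {eps_eff:.2f}")
    f = frequency_for_spacing(spacing_mm=7.5, fraction_of_lambda0=0.22)
    print(f"7.5 mm is 0.22 lambda0 at about {f / 1e9:.2f} GHz")
```

With Wtl = 3.4 mm on the 1.524 mm, εr = 3 substrate, the approximation lands in the neighbourhood of 50 Ω; a difference of a few ohms from a full-wave result is expected from a formula of this kind. The spacing check places the operating frequency in the X band, consistent with the frequency range in which the array is characterized.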
[Figure: array geometry with labels Patch Antenna, Microstrip Line, Ltl, Wtl, 7.5 mm element spacing, Ground Plane, CSRRs Based-DGS, rin, rin1, c, d]
[Figure: |S| parameters (dB) versus Frequency (GHz), 7.5 to 11 GHz; curves S11, S12, S11 without CSRRs, S12 without CSRRs]
[Figure: polar plots of Directivity (dBi), E-plane and H-plane]
Figure 2. Optimized array radiation pattern results with and without CSRRs
Antenna Structure                          Mutual Coupling (dB)   Improvement (dB)   Directivity (dBi)   Gain (dB)
Conventional Antenna Array without CSRR    -26                    -                  10.6                10.3
Proposed Array with 2-CSRRs-based DGS      -37                    11                 9.4                 9.3
Proposed Array with 3-CSRRs-based DGS      -61                    35                 9.6                 9.5
Literature comparison rows (Ref. No. [3] to [8]; techniques listed: Using EBG, Using interdigital capacitor loaded slots, Using Metamaterials, Using dumbbell DGS, Using U-shaped); extracted values, not aligned to rows: 9.3, 17, 20, 6.19, 10.
Abstract: In this paper, a double-sided printed compact ultra-wideband antenna is studied. The frequency band
considered is from 2.35 GHz to 22.5 GHz. The fractional
bandwidth of the antenna is 162%. This antenna covers the entire
band for UWB applications approved by the Federal
Communications Commission. The antenna is fabricated on an
inexpensive FR4 substrate measuring 35.4 mm × 22 mm × 1.6 mm. The
measurement results are in close agreement with the simulation
results.
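The 162% figure follows from the usual definition of fractional bandwidth, the band width divided by the arithmetic centre frequency. A short sketch, assuming that definition is the one used here (the function name is illustrative):

```python
def fractional_bandwidth(f_low: float, f_high: float) -> float:
    """Fractional bandwidth relative to the arithmetic centre frequency.

    Both frequencies must use the same unit; the result is a ratio,
    so 1.62 corresponds to 162%.
    """
    if not 0 < f_low < f_high:
        raise ValueError("expected 0 < f_low < f_high")
    centre = (f_low + f_high) / 2
    return (f_high - f_low) / centre


if __name__ == "__main__":
    fbw = fractional_bandwidth(2.35, 22.5)  # band edges in GHz
    print(f"fractional bandwidth: {fbw * 100:.0f}%")
```

With this definition the value can never exceed 200%, so 162% indicates a band whose upper edge is nearly ten times its lower edge.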
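Figure 4.
Ref.         Pass Band        Dimensions                   Year
[1]          2.5~18 GHz       -                            2013
[2]          3.5~12 GHz       -                            -
[4]          3~11.2 GHz       -                            2011
[5]          3.1~10.6 GHz     -                            -
This paper   2.35~22.5 GHz    35.4 mm × 22 mm × 1.6 mm     2014
(Years 2012 and 2004 also appear in the table but are not aligned to a row in the extracted text.)
Figure 5.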
IV.
SUMMARY
In this paper, we have proposed and investigated a double-sided printed compact ultra-wideband antenna. The simulated
and measured results show that the antenna has a good
omnidirectional characteristic at the low frequencies and
covers the entire conventional UWB band. The fractional
bandwidth of the proposed antenna reaches 162%. The
antenna measures only 35.4 mm × 22 mm × 1.6 mm.
Figure 3. Radiation patterns at 3 GHz, 10 GHz and 20 GHz for the E plane and H
plane
[Figure: circuit schematic from the CIRCUIT DESCRIPTION section, with labels VDD, RFIN, RFOUT, M1, M2, M3, L1, L2, L3, Cin, Cout, C1, C2, C3, C4, Rbias1, Rbias2, VBias1, VBias2]
I. INTRODUCTION
Low power, high efficiency and full integration are the key
design specifications for WBAN CMOS PAs. High
efficiency is of particular importance for medical
implants, which is the main motivation to seek the best
techniques for improving the efficiency of the power-hungry PA. Many
IEEE 802.15.6 PAs relying on traditional design techniques
have recently been presented [2], [3]. In this work we
propose a PA design using load pull in order to optimize the
PAE and output power while assuring low-power operation.
Load pull has slowly been adopted by VLSI PA designers [4].
Compared to traditional PA design methods, Load pull design
[Figure: Pout and PAE (%) versus Pin (dBm), pre-layout and post-layout curves; annotations PAE = 47.8 % and PAE = 41.8 %]
[Figure: S-Parameters (dB) versus Freq (GHz); curves S11, S22 and S21, pre-layout and post-layout]
III.
Performance summary
Ref.   Tech. [um]   Freq. [GHz]   Supply [V]   Gain [dB]   Power [mW]   PAE [%]   Input P1dB [dBm]
[6]    0.18         -             1.8          11          19.8         18        -5
[7]    0.13         -             1.2          18.09       22           25.34     -9.08
The entries for This* work and [5] and the Freq. column (2.4, 2.45, 2.4-2.48, 2.4-2.483) are not aligned in the extracted text; the remaining extracted values are 1.8, 15, 47.8, -7 and 1.4, 19.5, 15.4, 28.5, -13.
REFERENCES
[1] IEEE Standard for Local and metropolitan area networks - Part 15.6:
Wireless Body Area Networks. pp. 1-271, 2012.
[2] L. Zhang, H. Jiang, J. Wei, J. Dong, F. Li, W. Li, J. Gao, J. Cui, B. Chi,
C. Zhang, and Z. Wang, "A Reconfigurable Sliding-IF Transceiver for
400 MHz/2.4 GHz IEEE 802.15.6/ZigBee WBAN Hubs With Only 21%
Tuning Range VCO," IEEE J. Solid-State Circuits, vol. 48, no. 11, pp.
2705-2716, Nov. 2013.
[3] H.-C. Chen, M.-Y. Yen, Q.-X. Wu, K.-J. Chang, and L.-M. Wang,
"Batteryless Transceiver Prototype for Medical Implant in 0.18-µm
CMOS Technology," IEEE Trans. Microw. Theory Tech., vol. 62, no. 1,
pp. 137-147, Jan. 2014.
[4] F. M. Ghannouchi and M. S. Hashmi, Load-Pull Techniques with
Applications to Power Amplifier Design, 2012.
[5] S. M. Abdelsayed, M. J. Deen, and N. K. Nikolova, "A Fully Integrated
Low-Power CMOS Power Amplifier for Biomedical Applications," in
The European Conference on Wireless Technology, 2005, pp.
277-280.
[6] K. Haridas and T. H. Teo, "A 2.4-GHz CMOS Power Amplifier Design
for Low Power Wireless Sensors Network," in 2009 IEEE International
Symposium on Radio-Frequency Integration Technology (RFIT), 2009,
pp. 299-302.
[7] Anran Shao, Zhiqun Li, and Chuanchuan Wan, "0.13 µm CMOS Power
Amplifier For Wireless Sensor Network applications," in The 19th
Annual Wireless and Optical Communications Conference (WOCC
2010), pp. 1-4.
Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
I.
INTRODUCTION
C3, C4, C5 and C6 are the tail capacitors that further force
the circuit to operate in class-C. The parasitic
capacitors of the tail transistors might already push the
cross-coupled pairs to work partially in class-C; however, a large
added capacitance guarantees the generation of impulse-like current waveforms and thus increases the current
efficiency of the transistors. Furthermore, these large capacitors
filter out the noise from the tail transistors at higher
frequencies, so the phase noise is improved further. To
dynamically bias the pair transistors, an operational amplifier
[6] is used. It provides negative feedback: it adjusts Vbias1
by sensing the variation of the common-mode voltage at the source
of the NMOS cross-coupled pair and keeping it equal to a
reference voltage.
III.
TABLE I.
                      This work       [4]             [7]        [8]
Process               CMOS18          CMOS18          CMOS18     CMOS13
Topology              Cross-coupled   Cross-coupled   Pierce     Colpitts
Vdd (V)               1.5             1.1             1.8        0.6
f0 (MHz)              1964            1962            1500       2000
Power (mW)            1.13            1.7             3.78       0.126
L (1 MHz) (dBc/Hz)    -160.0          -151.5          -142.0     -149.0
FOM                   225             215             -          224
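The FOM row of Table I is consistent with the conventional oscillator figure of merit, FOM = |L(Δf)| + 20·log10(f0/Δf) - 10·log10(P/1 mW). The excerpt does not state the formula, so the sketch below assumes that definition; the function name is illustrative.

```python
import math


def oscillator_fom(phase_noise_dbc_hz: float, f0_hz: float,
                   offset_hz: float, power_w: float) -> float:
    """Conventional oscillator figure of merit, reported as a positive number.

    Assumes FOM = -L(df) + 20*log10(f0/df) - 10*log10(P / 1 mW),
    with L(df) the phase noise in dBc/Hz at offset df.
    """
    if min(f0_hz, offset_hz, power_w) <= 0:
        raise ValueError("f0, offset and power must be positive")
    return (-phase_noise_dbc_hz
            + 20 * math.log10(f0_hz / offset_hz)
            - 10 * math.log10(power_w / 1e-3))


if __name__ == "__main__":
    # (label, phase noise at 1 MHz offset, f0 in Hz, power in W) from Table I
    rows = [
        ("This work", -160.0, 1964e6, 1.13e-3),
        ("[4]", -151.5, 1962e6, 1.7e-3),
        ("[8]", -149.0, 2000e6, 0.126e-3),
    ]
    for label, pn, f0, p in rows:
        print(f"{label}: FOM ~ {oscillator_fom(pn, f0, 1e6, p):.1f}")
```

Applied to the three columns whose FOM survives in the table, this definition reproduces the listed values after rounding, which also supports the column alignment used in the table above.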
CONCLUSION
ACKNOWLEDGMENT
This work was supported by a Grant-in-Aid for Scientific
Research (B) (KAKENHI-B). This work was also partly
supported by VLSI Design and Education Center (VDEC), the
University of Tokyo in collaboration with CADENCE
Corporation and Agilent Corporation.
Figure 2. Drain-source current of the NMOS pair with and without tail
capacitors.
Kazi Mozaher Hossein1, Ashir Ahmed1, Abdullah Al Emran2 and Akira Fukuda1
1
people,
e-Commerce,
Village
I. INTRODUCTION
B. GramWeb
GramWeb is a village information platform that collects
village-specific information, e.g. demographic information
(population, location, socio-economic status etc.), as well as
the daily activities of that community. A Village Information
Entrepreneur (VIE) owns the website for his/her village.
The VIE stays connected with the villagers by providing other
social services to them. The social services include
healthcare, education, purchase and learning activities.
III. DESIGN OF A VILLAGE SPECIFIC CATALOG
Abstract
Examining student learning behavior is one of the crucial
educational issues. In this paper, we propose a new method to
predict student performance by mining comment data, which
strongly reflect student learning attitudes and activities. Analyzing
comment data after each lesson helps to grasp student learning
attitudes and situations. This paper proposes a new model based
on statistical latent class topics for the task of student grade
prediction: our model converts student comments using
Probabilistic Latent Semantic Analysis (PLSA), and an SVM
generates prediction models of final student grades. Choosing the
number of topics and the number of words in each topic for the
PLSA model improves the prediction results. In
addition, considering the student grades predicted over a range of
lessons can compensate for prediction errors that occur in individual lessons
and achieves further improvement of the student grade
prediction.
Keywords- Comment Data Mining, Student Grade Prediction,
PLSA.
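The idea of using predictions from a range of lessons to absorb per-lesson errors can be illustrated with a simple majority vote. This is only a sketch of one possible combination rule; the abstract does not specify the rule actually used, and the grade labels, the tie-breaking choice and the function name are assumptions made for illustration.

```python
from collections import Counter


def combine_lesson_predictions(per_lesson_grades):
    """Combine per-lesson grade predictions for one student by majority vote.

    per_lesson_grades is ordered from the earliest to the latest lesson.
    Ties are broken in favour of the grade predicted most recently
    (an arbitrary choice made for this sketch).
    """
    if not per_lesson_grades:
        raise ValueError("need at least one per-lesson prediction")
    counts = Counter(per_lesson_grades)
    top = max(counts.values())
    # Walk backwards so the most recent grade among the tied ones wins.
    for grade in reversed(per_lesson_grades):
        if counts[grade] == top:
            return grade


if __name__ == "__main__":
    # Hypothetical predictions for one student over lessons 1 to 6.
    predictions = ["B", "A", "B", "C", "B", "A"]
    print(combine_lesson_predictions(predictions))
```

A single mispredicted lesson cannot change the combined result unless the vote was already close, which is the error-absorbing behaviour the abstract describes.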
I. INTRODUCTION
II. BACKGROUND
III. METHODOLOGY
PLSA     0.756   0.783   0.773
PLSA*    0.782   0.843   0.792
[Figure: TP rate versus Lessons (L(1-3), L(1-6), L(1-9), L(1-12), L(1-15)) for PLSA and PLSA*; panels P-Comments, C-Comments and N-Comments]
IV. EXPERIMENT RESULTS
TABLE II.
Model    Lesson 7-15
PLSA     0.583   0.643   0.513   0.501   0.563   0.92
PLSA*    0.632   0.683   0.577   0.586   0.631   0.554
V. CONCLUSION
Dipok Kumar Choudhury1, Mansur Ahmed2, Akinori Ozaki3, Md. Abiar Rahman4, Shoichi Ito5 and Ashir Ahmed6
1 (MS Student): Agriculture and Resource Economics, Kyushu University, Fukuoka, Japan, dipokch@gmail.com
2 (Technical Manager, Agriculture): Kyushu University-JICA Grass Root Project, Dhaka, Bangladesh
3 (Coordinator): Kyushu University-JICA Grass Root Project, Dhaka, Bangladesh
4 (Associate Professor): Dept. of Agroforestry and Environment, Bangabandhu Sheikh Mujibur Rahman Agricultural University (BSMRAU), Gazipur, Bangladesh
5 (Professor): Agriculture and Resource Economics, Kyushu University, Fukuoka, Japan
6 (Associate Professor): Dept. of Advanced Information Technology, Kyushu University, Fukuoka, Japan
II.
METHODOLOGY
I. INTRODUCTION
To feed the ever-increasing population, the productivity of
agricultural land in Bangladesh needs to be raised. The
productive capacity of agricultural land is low due to poor soil
fertility, which has been degraded by mismanagement of
agricultural resources. Intensive cultivation is associated with
high inputs of agrochemicals. The intensive use of
agrochemicals can not only degrade soil
fertility but also pollute the environment and harm human health. A
good soil should have at least 2.5% organic matter. However,
the soil of about 45% of the net cultivable area in Bangladesh
has less than 1% [1]. It is believed that the lower productivity
of the soil is associated with the depletion of organic matter
due to increasing cropping intensity, higher rates of
decomposition of organic matter under the prevailing hot and
humid climate, use of lesser quantities of organic manure, little
or no use of green manure etc. [2]. As a result, farmers are not
getting the desired yields. In addition, foods are not safe because
excessive agrochemicals are used in crop production.
Due to a poor marketing system, farmers are not getting a fair
price for quality products. Most of the farmers are poor and
illiterate. They do not know much about modern technologies
for production, marketing and management of products.
Kyushu University proposed a grass-root project named
Income Generation Project for Farmers using ICT (IGPF),
based on producing semi-organic vegetables called Q-Vegie,
which is also expected to become a new brand in Bangladesh.
The goal was to generate income for BoP (Bottom of Pyramid)
Selected vegetable
Bitter Gourd, Bottle Gourd, Ash Gourd,
Cucumber, Okra, Ridge Gourd, Long Bean
Rabi (Winter)
(November-February)
B. Marketing
In Phase-I, emphasis was given to the production of semi-organic vegetables, because it was a new concept among the
C. Use of ICT
ICT tools were used in production and marketing activities,
which ensured quick and easy solutions to problems and
high income. The ICT components include e-agriculture, Agri-Eye, semi-organic learning and e-commerce.
(1) The e-agriculture application is the foremost part of the ongoing
project phase-II. It provides previously stored technical farming
content, lets farmers upload farming activity
information, and supports communication
among farmers, experts and market agents.
(2) BIGBUS (BoP Information Generation,
Broadcast and Upload System) is a component of e-agriculture
that can be used even by low-literacy farmers. Farmers can
access the system from their own phones and upload
harvest information by following voice navigation. (3) C2D
(Click to dial) is also part of e-agriculture; it connects
registered users (farmers, project staff, agriculture experts,
market agents and consumers) to each other
as needed. (4) Agri-Eye provides farmers with
weather information on which they can base farming
decisions. (5) Semi-organic Learning is
learning support content on particular semi-organic vegetables,
updated by the agriculture expert. This content
combines text, pictures and animation. Finally, (6) E-commerce is a marketplace for Q-Vegie where consumers can
select products for purchase and
negotiate directly with farmers. This application would
reduce middleman activity and increase the farmers' end profit
margin.
III.
CONCLUSION
Farmers' indigenous knowledge has never been archived in
developing countries. ICT can collect this knowledge, archive it
and disseminate it to new farmers. Our Q-Vegie production
involves higher manual labor costs, and protecting
the vegetables from insects in an organic way carries risk. The ICT
tools we developed helped the farmers to access indigenous knowledge,
take faster action and increase productivity. We are now
developing ICT tools for identifying suitable markets and
delivering products in a safer way.
ACKNOWLEDGMENT
The project was funded by JICA and implemented by
Kyushu University. The authors are thankful to BSMRAU,
WIN, GCC and BARI authority for their excellent cooperation
in implementing the project.
REFERENCES
RESULTS
A. Project Activities
In Phase-I, it was found that the farm income of individual rural
farmers is too low to meet their minimum livelihood requirements,
because they lack opportunities to obtain advanced information on
farming techniques and because the network of agricultural
production and marketing is undeveloped. As a result, most farmers
cannot break out of poverty and remain at the BoP [3]. Food
security, food safety and income generation can be improved through
Q-Vegie cultivation, but the marketing system needs to be
facilitated more systematically in Phase-II of the project.
Unfortunately, the Phase-I report also showed that demand for
Q-Vegie was very weak and that the project had not organized a
marketing channel in Dhaka city. This motivated the Phase-II
initiative, which includes three new locations: Mirjapur,
Monohordi and Boshundia. Accordingly, the current initiative has
three new main goals compared with the previous phase: 1) to create
entrepreneurs in rural areas based on the new Q-Vegie business
model; 2) to
Andrew Rebeiro-Hargrave (1), Ashir Ahmed (2), Naoki Nakashima (3) and Partha Pratim Ghosh (4)
(1) Institute of Decision Science for Sustainable Society, Kyushu University, Japan
(2) Department of Advanced Information Technology, Kyushu University, Japan
(3) Kyushu University Hospital, Japan
(4) Grameen Communications, Dhaka, Bangladesh
I. INTRODUCTION
Telehealth is the delivery of health-related services and
information via telecommunications technologies. Health-related
services are delivered by healthcare workers and supported by
remote doctors in medical institutions [1]. Telehealth is normally
used to keep and treat patients at home and out of hospitals [2],
and to remotely monitor chronically ill patients [3]. It is not
used for medical screening programs that test for chronic
Non-Communicable Diseases (NCDs) such as hypertension, diabetes,
dyslipidemia, obesity, kidney disease and liver dysfunction in
individuals who do not show symptoms. Medical screening programs
are good for identifying morbidity at an early and treatable stage,
and for exposing individuals who normally ignore the disease
symptoms [4]. Once identified as at risk, a patient is referred to
a clinic for treatment. A portable telehealth service at a
screening center can shorten the time between the patient being
referred and receiving a medical intervention. In this study we
introduce the Portable Health Clinic as an efficient medical
screening system and a synchronous telehealth component.
II. METHODOLOGY
The Portable Health Clinic (PHC) system is an e-health system with
a telehealth component. The PHC was designed by Kyushu University
and Grameen Communications Global Communication Center (GCC) to
provide an affordable e-health service to low-income subjects
living in unreached communities [5]. It consists of a back-end of
data servers and a medical call center, and inexpensive front-end
instances: a portable briefcase containing medical sensors and
measuring equipment (costing USD 500). The front-end communicates
with the back-end over the mobile network and the Internet (Fig. 1).
is 18.25 minutes (turn-around time includes client registration,
sensor measuring, consultancy and e-prescription print-off).
[Figure: flowchart (START, channel selection) and channel priority table for channels #0 to #C-1.]
IV. COMPUTER SIMULATION
A. Simulation model
An example of the HetNet model is illustrated in Figure 2. An
MBS is located at the center of a hexagonal macro cell. N_SBS
SBSs are distributed uniformly within the macro cell, and a
static UE is assumed to be uniformly located within each cell.
The simulation parameters are summarized in Table I. We
show that the channel reuse pattern formed by IACS-DCA can
reduce the CCI that originates from the macro cell and is
received by the small cells. A perfectly synchronous time division
multiple access (TDMA) system is assumed. As shown in
Figure 3, we assume C time-division channels (i.e., C timeslots
within one timeframe). The average CCI power is measured by
first-order filtering with a forgetting factor. If too small a
forgetting factor is used, the measured average CCI tends
[Figure 2: HetNet model, one MBS and surrounding SBSs within a macro cell.]
[Figure 3: Timeframe structure, one timeframe of C timeslots (#0 to #C-1).]
[Figure 4: CDF of downlink SIR (dB) with IACS-DCA, MBS on and MBS off.]
ACKNOWLEDGMENT
The research results presented in this material have been
achieved by "Towards Energy-Efficient Hyper-Dense Wireless
Networks with Trillions of Devices", a Commissioned
Research of the National Institute of Information and
Communications Technology (NICT), Japan.
TABLE I. Simulation parameters
Network
  No. of MBSs: N_MBS = 1
  No. of SBSs: N_SBS = 29
  No. of channels: C = 8
  Carrier frequency: 2 GHz
  Frequency bandwidth: 10 MHz
  Noise power spectrum density [10]: -174 dBm/Hz
Macro cell
  Radius: 250 m
  Min. MBS-SBS distance: 75 m
  Transmit power of MBS: 46 dBm
Small cell
  Radius: 40 m
  Min. SBS-SBS distance: 40 m
  Transmit power of SBS: 30 dBm
Path loss [11]
  MBS-SBS, MBS-UE: 15.3 + 37.6 log10(d_BS(m),BS(n)) dB
  SBS-SBS, SBS-UE: 27.6 + 37.6 log10(d_BS(m),BS(n)) dB
  (d_BS(m),BS(n): distance between BS(m) and BS(n) in m)
IACS-DCA
  Filter forgetting factor: 0.99
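The first-order filtering with a forgetting factor that is used to measure the average CCI power can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the exact recursion, so the common form avg(t) = beta * avg(t-1) + (1 - beta) * I(t) is assumed, and the rule "prefer the channel with the lowest average CCI" is likewise an assumption. The names `update_average_cci`, `select_channel` and `beta` are hypothetical.

```python
def update_average_cci(prev_avg, instantaneous, beta=0.99):
    """One step of first-order filtering (assumed recursive form).

    avg(t) = beta * avg(t-1) + (1 - beta) * I(t)
    beta is the forgetting factor (0.99 in Table I).
    """
    return beta * prev_avg + (1.0 - beta) * instantaneous


def select_channel(avg_cci):
    """Assumed selection rule: index of the channel with the lowest average CCI."""
    return min(range(len(avg_cci)), key=lambda c: avg_cci[c])


if __name__ == "__main__":
    # Hypothetical instantaneous CCI powers (linear scale) on three channels.
    avg = [0.0, 0.0, 0.0]
    samples = [[1.0, 0.2, 0.5], [0.9, 0.3, 0.4], [1.1, 0.1, 0.6]]
    for sample in samples:
        avg = [update_average_cci(a, s) for a, s in zip(avg, sample)]
    print("preferred channel:", select_channel(avg))
```

With a forgetting factor close to 1 the average changes slowly, so a single noisy measurement barely moves the channel ranking.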
B. Simulation result
Figure 4 plots the CDF of the downlink SIR. For comparison, we
also plot the downlink SIR when the MBS is switched off after channel
segregation has finished (so that there is no CCI from the MBS to the
small cells when the SIR is measured). We observe that the downlink SIR
with the MBS on is almost equal to that with the MBS off.
This indicates that the proposed IACS-DCA is effective in reducing
the CCI between the macro cell and the small cells.
Dept. of Communication Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan
6-6-05, Aza-Aoba, Aramaki, Aoba-ku, Sendai, Miyagi, 980-8579, Japan
Email: (yoneya, mehbod)@mobile.ecei.tohoku.ac.jp
adachi@ecei.tohoku.ac.jp
I. INTRODUCTION
[Fig. 1. HetNet topology (MBS, SBS1 to SBS3, UE).]
A BS's total consumption power is decided by its transmission power [1].
A BS's load is the sum of the traffic loads of all UEs in its cell,
where a UE's traffic load is defined as the ratio of its required data
rate to its actual link capacity. The utility of the s-th BS is formed
from its total consumption power, $P_s^{\mathrm{All}}(t)$, and its
traffic load, $\rho_s(t)$, according to:
$$u_s(t) = -\left(\alpha \frac{P_s^{\mathrm{All}}(t)}{P_{s,\mathrm{MAX}}^{\mathrm{TX}}} + \beta \rho_s(t)\right), \tag{3}$$
where $\alpha$ and $\beta$ are the weighting coefficients for power
consumption and traffic load (Table II), the transmission power level of
the s-th BS takes one of the values in Table I, and
$P_{s,\mathrm{MAX}}^{\mathrm{TX}}$ is the maximum transmission power of
the s-th BS.
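The load and utility definitions above can be illustrated with a short sketch. The load part follows the text directly (a UE's load is its required rate over its link capacity, and a BS's load is the sum over its UEs). The utility function is an assumption: it is written as a negative weighted sum of normalised power and load, with the weights 10 and 5 taken from Table II; the function names and the sign convention are hypothetical.

```python
def ue_traffic_load(required_rate, link_capacity):
    """Traffic load of one UE: required data rate over actual link capacity."""
    return required_rate / link_capacity


def bs_load(ues):
    """Load of a BS: sum of the traffic loads of all UEs in its cell.

    ues is a list of (required_rate, link_capacity) pairs in the same unit.
    """
    return sum(ue_traffic_load(rate, capacity) for rate, capacity in ues)


def bs_utility(p_all, p_max, load, w_power=10.0, w_load=5.0):
    """Assumed utility: negative weighted sum of normalised power and load."""
    return -(w_power * p_all / p_max + w_load * load)


if __name__ == "__main__":
    # Hypothetical cell with two UEs: (required rate, link capacity) in Mbit/s.
    load = bs_load([(1.0, 4.0), (2.0, 8.0)])
    print(load, bs_utility(p_all=5.0, p_max=10.0, load=load))
```

Under this convention a BS that consumes less power or carries less load has a higher (less negative) utility.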
TABLE I
Identification number of strategy i    Transmission power level
1                                      0
2                                      1/3
3                                      2/3
4                                      1
B. HO Algorithm
Each UE receives the traffic load estimate of each BS, the BS
position and the cell radius r from all BSs through beacon signals.
The UE obtains its velocity and position using its integrated GPS,
and decides on HO based on this information and the received power
$P_s^{\mathrm{RX}}(t)$. Algorithm 3 shows HONEP at the UE. In this
algorithm, $v(t)$ ($\ge 0$) is the velocity of the UE and $v_b(t)$
($-\infty < v_b(t) < +\infty$) is its velocity component in the
direction of the connected BS. $d(t)$ is the UE's distance to its
connected BS. For new UEs or UEs needing HO, each UE at point z
selects the BS to connect to (HOEP), $s(z, t)$, based on the
following criterion:
[Fig. 2. Total number of HOs for 1 s vs different number of UEs for 7 SBSs and average velocity 4 km/h. x-axis: number of users; curves: proposed HO algorithm and baseline.]
V. CONCLUSION
In this paper, a joint distributed handover (HO) and base
station sleep mode algorithm was proposed for HetNets. The proposed
algorithm yields a significant improvement in the number of HOs
compared with conventional algorithms at high UE densities.
TABLE II
SIMULATION PARAMETERS
Parameter: Value
Path loss (d: distance between BS and user in m; unit: dB)
  MBS - UE: 15.3 + 37.6 log10(d) [1]
  SBS - UE: 27.9 + 36.7 log10(d) [1]
Algorithm parameters
  Power threshold P_TH: 60 dBm [3]
  Distance threshold d_TH: 20 m
  Weighting coefficients for power consumption and traffic load: 10, 5
  Weighting exponents of traffic load and distance: 1, 0.5
REFERENCES
[1] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, "Opportunistic
sleep mode strategies in wireless small cell networks," in IEEE International Conference on Communications 2014 - Mobile and Wireless
Networking Symposium (ICC'14-MWS), June 2014, pp. 2707-2712.
[2] S. Zhou, A. J. Goldsmith, and Z. Niu, "On optimal relay placement and
sleep control to improve energy efficiency in cellular networks," in IEEE
International Conference on Communications, June 2011, pp. 1-6.
[3] G. F. Pedersen, "Mobile phone antenna performance," Aalborg University, November 2013, p. 14.
Mohamed Elwekeil (1), Masoud Alghoniemy (2), Osamu Muta (3), Adel B. Abd El-Rahman (1), and Hiroshi Furukawa (4)
(1) Department of Electronics and Communications Engineering, Egypt-Japan University of Science and Technology, Alexandria, 21934 Egypt
(2) Department of Electrical Engineering, University of Alexandria, 21544 Egypt
(3) Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Fukuoka-shi, Fukuoka, Japan
(4) Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka-shi, Fukuoka, Japan
e-mail: mohamed.elwekeil@ejust.edu.eg, alghoniemy@alexu.edu.eg,
muta@ait.kyushu-u.ac.jp, adel.bedair@ejust.edu.eg, furuhiro@ait.kyushu-u.ac.jp
Abstract: An optimization model for solving the channel
assignment problem in IEEE 802.11 WLANs is proposed. The
proposed model is based on minimizing the total interference
at all access points in the network while allowing only
non-overlapping channels. The proposed model is formulated as a
mixed integer linear program. The main advantage of the proposed
algorithm is that it guarantees a global solution. Simulation
results show that the proposed algorithm performs better than
both the pick-first greedy algorithm and the
single channel assignment method.
I. INTRODUCTION
Most existing WLANs follow the IEEE 802.11b/g/n standard, which operates in the unlicensed 2.4 GHz Industrial, Scientific
and Medical (ISM) band. Figure 1 shows the IEEE 802.11 channels
in the ISM band, where the bandwidth of each channel is about
22 MHz and the separation between adjacent channels
is only 5 MHz; thus, neighboring channels overlap.
This band consists of eleven frequency channels, only three of which are non-overlapping [1]. Therefore, careful channel assignment in
multi-cell WLANs is very important. In a multi-cell WLAN, multiple
interfering access points (APs) produce a considerable increase in
collisions. In this situation, the objective of channel assignment is to
assign a channel to each AP so as to reduce interference and thus
maintain an acceptable throughput.
[Fig. 1: Channels for the IEEE 802.11 in the 2.4 GHz ISM band [2].]
II. THE PROPOSED MODEL
(2)
where $f_i$ and $f_j$ are the channels assigned to AP$_i$ and AP$_j$, respectively. To get rid of the modulus function in (2), the co-channel
interference factor $\delta_{ij}$ can be expressed by the following linear inequalities:
$$\delta_{ij} \ge 1 - (Z_{ij}^{+} + Z_{ij}^{-}), \quad \delta_{ij} \ge 0, \tag{3}$$
where $Z_{ij}^{+}$ and $Z_{ij}^{-}$ are auxiliary variables representing the positive
and negative values of $(f_i - f_j)$, with $Z_{ij}^{+} - Z_{ij}^{-} = f_i - f_j$. This
assures that $Z_{ij}^{+} + Z_{ij}^{-}$ equals the modulus $|f_i - f_j|$, provided that
at least one of $Z_{ij}^{+}$ and $Z_{ij}^{-}$ is zero. To guarantee this, which is
required for replacing the modulus part in (2) by $Z_{ij}^{+} + Z_{ij}^{-}$,
an EITHER-OR constraint (4) is defined as in [5]:
$$Z_{ij}^{+} \le M Q_{ij}, \quad Z_{ij}^{-} \le M (1 - Q_{ij}), \tag{4}$$
where $Q_{ij}$ is a binary variable and $M$ is an upper bound (e.g., 100) for both $Z_{ij}^{+}$
and $Z_{ij}^{-}$. Thus, $f_i - f_j$ equals either $Z_{ij}^{+}$ or $-Z_{ij}^{-}$.
[Figure: positions of the access points (AP1 to AP9); x-axis: building width (m).]
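The objective (7) minimizes the total interference at all APs, subject to the constraints
$$\begin{aligned}
& \delta_{ij} + Z_{ij}^{+} + Z_{ij}^{-} \ge 1, \quad \delta_{ij} \ge 0, \\
& Z_{ij}^{+} - Z_{ij}^{-} - f_i + f_j = 0, \\
& Z_{ij}^{+} - M Q_{ij} \le 0, \quad Z_{ij}^{-} + M Q_{ij} \le M, \quad Q_{ij} \in \{0, 1\}, \\
& f_i - 5 k_i = 1, \quad k_i \in \{0, 1, 2\},
\end{aligned} \tag{8}$$
where $\delta_{ij}$ is the co-channel interference factor and $M$ is the upper bound on $Z_{ij}^{+}$ and $Z_{ij}^{-}$.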
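The effect of these constraints can be checked by brute force on a small case. The sketch below is not a MILP solver and is not the authors' implementation; it only enumerates integer values to confirm two things: that f = 5k + 1 with k in {0, 1, 2} yields the non-overlapping channels 1, 6 and 11, and that the EITHER-OR constraint forces Z+ + Z- to equal |fi - fj|. The helper names and the enumeration range are hypothetical.

```python
from itertools import product

M = 100  # upper bound on Z+ and Z- (the text suggests e.g. 100)
CHANNELS = [5 * k + 1 for k in (0, 1, 2)]  # f = 5k + 1 gives 1, 6, 11


def feasible_z(fi, fj, q):
    """Return the integer (Z+, Z-) pairs allowed by the constraints for a given Q."""
    pairs = []
    for zp, zm in product(range(0, 11), repeat=2):
        if zp - zm != fi - fj:
            continue  # violates Z+ - Z- = fi - fj
        if zp > M * q or zm > M * (1 - q):
            continue  # EITHER-OR: Q = 0 forces Z+ = 0, Q = 1 forces Z- = 0
        pairs.append((zp, zm))
    return pairs


def interference_factor(fi, fj):
    """Smallest delta satisfying delta >= 1 - (Z+ + Z-) and delta >= 0."""
    best = None
    for q in (0, 1):
        for zp, zm in feasible_z(fi, fj, q):
            delta = max(0, 1 - (zp + zm))
            best = delta if best is None else min(best, delta)
    return best


if __name__ == "__main__":
    for fi, fj in product(CHANNELS, repeat=2):
        print(fi, fj, interference_factor(fi, fj))
```

With only channels 1, 6 and 11 allowed, the factor is 1 when two APs share a channel and 0 otherwise, which is what the linearization is meant to reproduce.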
Interference (dBm)
AP       Proposed    Pick-first    Single Ch
AP1      -60.30      -59.77        -49.74
AP2      -61.04      -59.42        -44.16
AP3      -61.44      -55.15        -45.46
AP4      -58.86      -54.66        -32.64
AP5      -59.23      -58.36        -48.86
AP6      -66.10      -62.04        -49.48
AP7      -60.75      -62.63        -46.25
AP8      -65.23      -69.34        -50.64
AP9      -60.05      -62.81        -32.71
Total    -51.36      -49.23        -29.16
where $d_0$ is the reference distance for the antenna far field, $d_{ij}$ is the
distance between AP$_i$ and AP$_j$, and $L_{FS}(d_0)$ is the free space path
loss for distance $d_0$, which is given by
$$L_{FS}(d_0) = 20 \log_{10}\left(\frac{4 \pi d_0}{\lambda \sqrt{G_t G_r}}\right) \text{ dB},$$
where $\lambda$ is the carrier wavelength and $G_t$ and $G_r$ are the transmit and receive antenna gains in the line
of sight direction, respectively. In the simulation, it is assumed that
$d_0 = 5$ m, $G_t = G_r = 3$ dBi and the AP transmit power equals 20
dBm.
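As a worked example of the free space path loss at the reference distance, the sketch below evaluates the standard far-field expression with antenna gains for d0 = 5 m and Gt = Gr = 3 dBi. The carrier frequency is not stated in this passage; 2.412 GHz (the centre of channel 1 in the 2.4 GHz band) is an assumption, as is the function name.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(d0_m, freq_hz, gt_dbi=3.0, gr_dbi=3.0):
    """Free space path loss at distance d0, including antenna gains.

    L = 20 * log10(4 * pi * d0 / (wavelength * sqrt(Gt * Gr))),
    with Gt and Gr converted from dBi to linear scale.
    """
    wavelength = SPEED_OF_LIGHT / freq_hz
    gt = 10 ** (gt_dbi / 10)
    gr = 10 ** (gr_dbi / 10)
    return 20 * math.log10(4 * math.pi * d0_m / (wavelength * math.sqrt(gt * gr)))


if __name__ == "__main__":
    # Assumed carrier: 2.412 GHz (channel 1 of the 2.4 GHz ISM band).
    print(round(free_space_path_loss_db(5.0, 2.412e9), 2), "dB")
```

The two 3 dBi gains together lower the loss by 6 dB relative to isotropic antennas.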
III. NUMERICAL RESULTS
1 (Egypt-Japan University of Science and Technology (E-JUST)): Department of Electronics and Communications Engineering,
New Borg El Arab, Alexandria, Egypt, mahmoud.sleem@ejust.edu.eg, shalaby@ieee.org
2 (Kyushu University): Center for Japan-Egypt Cooperation in Science and Technology, Fukuoka-shi, Fukuoka, Japan,
muta@ait.kyushu-u.ac.jp
3 (Kyushu University): Graduate School of Information Science and Electrical Engineering, Fukuoka-shi, Fukuoka, Japan,
furuhiro@ait.kyushu-u.ac.jp
A
are uniformly distributed such that > . System chunks are
INTRODUCTION
] where , = {[( ) ]
+
} and is the | |
1/2
I.
(, , )
filling equation [
, =
1.6,
1
)
0.2 ( 1
=(1)+1
,
(1)
SYSTEM MODEL
III.
A. Channel model
We consider a single-cell multi-user OFDMA-based system
with N sub-carriers, served by one central base station (BS)
equipped with T transmit antennas, and K single-antenna users
A. Simulation Environment
We simulate the downlink of a multi-user MISO-OFDMA
system with T = 4 antennas and a bandwidth of 100 MHz divided
into 1024 sub-carriers, under a Rayleigh fading channel model
with an exponential power decay profile (PDP). The number of users,
K, is set to 10; the BER constraint, BER_th, is set to 10^-3; and the
minimum user rate per sub-carrier, R_min/N, is set heuristically to
0.5. We compare our proposed algorithm with the algorithm in
[2], which maximizes the sum rate under power and average BER
constraints only; the algorithm in [3], which maximizes the sum rate
under power, BER and proportional rate constraints among users;
and the round robin algorithm.
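C. Fairness performance
To further show the effectiveness of our proposed algorithm
in terms of fairness among users, we use Jain's fairness
index (FI) [5], defined as
$$FI = \frac{\left(\sum_{k=1}^{K} R_k\right)^2}{K \sum_{k=1}^{K} R_k^2},$$
where $R_k$ is the rate of user $k$. Fig. 2 shows the FI against number of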
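Jain's fairness index is easy to compute directly from the per-user rates. The sketch below is a minimal illustration; the function name and the sample rates are hypothetical.

```python
def jain_fairness_index(rates):
    """Jain's fairness index: (sum of rates)^2 / (K * sum of squared rates)."""
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if squares == 0:
        raise ValueError("at least one rate must be non-zero")
    return total * total / (len(rates) * squares)


if __name__ == "__main__":
    print(jain_fairness_index([1.0, 1.0, 1.0, 1.0]))  # equal rates
    print(jain_fairness_index([1.0, 0.0, 0.0, 0.0]))  # one user takes everything
```

The index is 1 when all K users get the same rate and falls to 1/K when a single user receives all of the rate.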
CONCLUSION
Abstract: In this paper, we propose a limited-feedback-based
interference alignment (IA) scheme suitable for two-tier
macrocell-femtocell heterogeneous networks. First, an analytical
expression for the total system sum-rate loss due to the
use of limited-feedback channel versions is derived.
Then, a comparative simulation study is carried out between two IA
schemes employed in the proposed limited feedback
system, namely the Hierarchical IA (HIA) scheme and the Iterative
Reweighted Least Squares (IRLS) IA scheme. Simulation results
confirm the severe effect of CSI quantization on IA
performance. Additionally, the results show that the
IRLS-based IA scheme is more robust to quantization errors
than the HIA scheme.
I. INTRODUCTION
Heterogeneous networks (HetNets) are considered a promising
technology for extending the coverage and capacity of cellular
networks [1]. However, HetNets are accompanied by large intercell
interference (ICI). Many IA techniques have been proposed to recover
from such ICI [1][2]. In [1], the authors proposed a Hierarchical IA
(HIA) technique. In HIA, the transmit weights for the femtocell BSs
(FBSs) are calculated first, followed by those of the macrocell
BS (MBS). All the transmit weights can be calculated in closed
form by separating the calculations for the FBSs and the MBS. In [2],
we proposed a downlink interference mitigation framework
based on two algorithms, namely the restricted water-filling (RWF)
algorithm and the IRLS-based IA algorithm.
This framework showed excellent performance in HetNet
scenarios compared with other IA techniques. The RWF
algorithm is responsible for maximizing the downlink sum rate
of the MBS on a restricted number of eigenmodes, leaving the
other eigenmodes for the operation of the accompanying
shared-spectrum femtocells [2]. These femtocells coordinate their
transmissions to be in directions that are free from MBS
transmissions. However, the achievable performance of neither
the HIA technique nor the IRLS-based IA technique has been
clarified in a limited feedback environment.
In this paper, we make a comparative study of the HIA technique
and the IRLS-IA technique to clarify their achievable performance
in a limited feedback environment. We also evaluate, in closed form,
the upper bound on the sum-rate loss incurred by limited feedback
systems.
II. SYSTEM MODEL
(i)
+ Wk
dk
pk
(j) (j)
k,f (k) Hk,f (k) Vk sk
d
k
j = 1
j = i
+ Wk
pk
(i) (i)
k,f (k) Hk,f (k) Vk sk
dk
dm
4
pm
m = 1 l=1
m = k
dm
(1)
(l) (l)
k,f (m) Hk,f (m) Vm sm +
nk
ci C
2
1
hkj ci
(2)
hkj =
1 ekj
ekj skj
hkj +
(3)
(i)
Rk
4
tot =
RLtot = E Rtot R
E {RLk }
(i)
R
k
(4)
k=1
where
and
are the perfect CSIs and limited feedback
based CSIs sum rate for the ith stream of the k th user
44
.indd
44
2015/03/11
11:00:03
( k ) k,f (k)
4 dk
dk
log2 1 +
Rtot =
2
d
k p
k=1 i=1
(i)
(j)
( k ) k,f (k) W
Hk,f (k) V
+
dk
k
k
j=1
j=i
(1) Each user k uses its codebook $C_{k,f(k)}$, with $B_{k,f(k)}$ bits, to
quantize its cross-link CSI, $H_{k,f(k)}$, to the quantized version $\hat{H}_{k,f(k)}$.
(2) Each user k sends the vector indexes of all its cross channels
obtained in step (1) to its corresponding BS f(k).
(3) Each BS receives the channel indexes from all its served users and,
using the same codebooks $C_{k,f(k)}$, constructs the quantized versions of the
channels, $\hat{H}_{k,f(k)}$.
(4) Each FBS forwards the quantized channels of its FUs
to the MBS through the backhaul links. The MBS adds the quantized
cross-channel CSIs of its MUs and forwards all the quantized channels to
the IA design central unit.
(5) The IA design central unit uses the quantized CSIs forwarded by
the MBS to evaluate the IA transceivers, $(\hat{W}_i, \hat{V}_i)$, using either
the IRLS algorithm in [2] or the HIA algorithm in [1].
(6) The IA design central unit forwards all IA transceivers,
$(\hat{W}_i, \hat{V}_i)$, to the MBS.
(7) Each user obtains its precoder, $(\hat{V}_i)$, hierarchically through its BS,
RL
UB
dk
4
k=1 i=1
4
dk
pk
j=1
j=i
d
m pm
m=1 l=1
m=k
dm
Bk,f (k)
dk
Bk,f (m)
M2
M2 1
M2
M2 1
,2
,2
Bk,f (k)
Bk,f (m)
[Figure: three panels with x-axis B (number of feedback bits), from 7 to 15.]
levels (B) and at different signal-to-noise ratios (15 dB, 25 dB,
and 35 dB). This is because the IRLS-IA scheme depends on an
optimization problem that aims to maximize the sum rate.
V. CONCLUSION
A limited feedback IA framework for heterogeneous networks has
been proposed. The proposed framework is employed together with both
the IRLS and HIA schemes. An expression for the upper bound on the
sum-rate loss in the proposed HetNet scenario is derived. A comparative
simulation study of the proposed limited feedback framework is carried
out based on both the IRLS-IA and HIA schemes. The simulation
results show that the IRLS-IA scheme is more robust than the HIA
scheme to the interference misalignment caused by the quantization
process.
10
Fig. 2.
1
n
B
Bk,f (k)
1+
, 2 k,f (k)
2 1
n
M
n=1
+ log2 1 +
15
(log2 (e)) 2
20
Fig. 1.
25
0
0
(5)
30
2
(i)
(i)
W
Hk,f (k) V
k
k
2
dm
4
(i)
(l)
pm
)
W
(
H
V
+
1
k,f (m)
k,f (m) m
dm
k
m=1 l=1
m=k
(6)
REFERENCES
[1] Wonjae Shin, Wonjong Noh, Kyunghun Jang, Hyun-Ho Choi, "Hierarchical Interference Alignment for Downlink Heterogeneous Networks," IEEE
Trans. on Wireless Comm., vol. 11, no. 12, pp. 4549-4559, Oct. 2012.
[2] Mohamed Rihan, Maha Elsabrouty, Osamu Muta, and Hiroshi Furukawa,
"Iterative Interference Alignment in Macrocell-Femtocell Networks: A
Cognitive Radio Approach," IEEE Inter. Symposium on Wireless Comm.
Systems (ISWCS), Barcelona, Spain, August 2014.
[3] Mohamed Rihan, Maha Elsabrouty, Osamu Muta, Hiroshi Furukawa,
"Interference Alignment with Limited Feedback for Macrocell-Femtocell
Two-Tier Heterogeneous Networks," Technical Report of IEICE RCS,
RCS 2014-178, Vol. 114, No. 254, October 2014.
[4] N. Jindal, "MIMO Broadcast Channels With Finite-Rate Feedback," IEEE
Trans. on Inf. Theory, vol. 52, no. 11, pp. 5045-5060, Nov. 2006.
[5] R. Bhagavatula and R. W. Heath, "Adaptive Bit Partitioning for Multicell
Interference Nulling with Delayed Limited Feedback," IEEE Trans. Signal
Proc., vol. 59, no. 8, pp. 3824-3836.
(3) Center for Japan-Egypt Cooperation in Science and Tech., Kyushu Univ., Fukuoka, Japan (muta@ait.kyushu-u.ac.jp).
(4) Graduate School of Information Science and Electrical Eng., Kyushu Univ., Fukuoka, Japan (furuhiro@ait.kyushu-u.ac.jp).
Abstract: A variant of the K-best (KB) MIMO decoding algorithm is proposed, namely reduced complexity K-best (RCKB).
The RCKB provides a significant complexity reduction of up to 51.7%,
with performance comparable to that of the traditional KB in
well-conditioned channels. The complexity reduction results from
discarding, at each tree level, irrelevant nodes whose distance
metric is greater than a predetermined radius.
Complexity analysis and simulation results are presented.
I. INTRODUCTION
In multi-input multi-output (MIMO) communication systems, the
traditional K-best sphere decoder (KB) memorizes the best K nodes
at each level of the search tree [1]. The chosen K nodes include
irrelevant nodes that increase the decoding complexity without improving performance; by discarding these irrelevant nodes, one can
decrease the complexity without compromising performance.
In this paper, a variation of the KB decoder for MIMO systems is
proposed, namely reduced complexity K-best (RCKB). The RCKB
provides lower complexity than the traditional KB algorithm without
sacrificing its performance in well-conditioned channels. The reduction in complexity comes from discarding irrelevant nodes at every
tree level according to a specific value that varies from one tree level
to another.
II. BACKGROUND
In an additive white Gaussian noise (AWGN) environment, the maximum likelihood (ML) decoder is the optimum decoder, where the ML
solution finds the symbol estimate $\hat{x}_{ML}$ that minimizes the 2-norm
of
$$\hat{x}_{ML} = \arg\min_{x} \| y - Hx \|^2. \tag{1}$$
Fixing the number of nodes that survive at each tree level may
result in visiting unnecessary nodes. To see this, consider the
following eight distance metrics at the third tree level shown in Fig.
1, D = [0.2 8 9 9 9 9 10 10]. The KB algorithm [4]
with K = 2 will choose the two smallest values, [0.2 8].
It is unlikely that the surviving path will emanate
from the node with metric 8, especially near the end of the tree.
Hence, the number of surviving nodes at each tree level should be
varied adaptively. In particular, the radius $\rho_i$ should be modified as
we traverse the tree. We provide a heuristic for determining the pruned
radius $\rho_i$ at a specific tree level i:
$$\rho_i = \frac{i K^2 d_i^{min}}{\sqrt{10^{SNR/10}}}, \tag{3}$$
where $d_i^{min}$ is the minimum distance metric at tree level i. In this
case the radius of the sphere, $\rho_i$, is not fixed; it
varies with the tree level i and the operating SNR for a
specific value of K. Note that equation (3) has no proof, but it provides
good results.
The RCKB algorithm avoids visiting unnecessary nodes in order
to reduce the complexity without affecting performance. To
achieve this, only $N_i^{RCKB}$ nodes survive at each tree level i, where
$N_i^{RCKB}$ is given by
$$N_i^{RCKB} = \min(\mathrm{card}(S_i^{RCKB}), K), \quad i = 2M-1, \ldots, 2, \tag{4}$$
where $\mathrm{card}(S_i^{RCKB})$ is the cardinality of the set
$$S_i^{RCKB} = \{ j \mid d_i^j < \rho_i, \; j \in \{1, 2, \ldots, N_{i+1}^{RCKB} \sqrt{q}\} \}, \quad i = 2M-1, 2M-2, \ldots, 2. \tag{5}$$
In essence, $\mathrm{card}(S_i^{RCKB})$ is the number of nodes at level i that
have distance metrics smaller than the pruned radius $\rho_i$.
Figure 1 illustrates a numerical example of the RCKB algorithm
for 16-QAM signaling with 2 x 2 MIMO and K = 2 at SNR = 5 dB.
The RCKB algorithm starts at the second-highest tree level, i = 3.
From Eq. (3), $\rho_3 = 6.75 d_3^{min} = 1.35$. Then, according to (4), the
number of surviving nodes at this tree level is $N_3^{RCKB} = \min(1, 2) = 1$.
Similarly, for the next tree level, i = 2, $\rho_2 = 35.6$. According to
(4), the number of surviving nodes is $N_2^{RCKB} = \min(4, 2) = 2$, which
are the nodes with distance metrics [8 9]. Unlikely nodes
have thus been discarded without affecting the solution.
.indd
46
2015/03/11
11:00:06
[Fig. 1. Numerical example of the RCKB tree search at SNR = 5 dB. Legend: level (i), branch label (symbol), branch distance metric, node metric, pruned node, saved nodes in RCKB over KB, first group and second group (if K = 6).]
[Fig. 2. BER and complexity curves vs. SNR (dB) for K = 2, 4, 6 and RCKB2, RCKB4, RCKB6.]
A. Complexity of the KB
With
$$P_K = \left\lfloor \frac{\ln(K)}{\ln(\sqrt{q})} \right\rfloor, \tag{7}$$
the complexity of the KB, taken as the total number of visited nodes over the first and second groups of tree levels, is
$$C_K^{KB} = \sqrt{q} \left[ \frac{1 - (\sqrt{q})^{P_K + 1}}{1 - \sqrt{q}} + (2M - P_K - 1) K \right], \tag{9}$$
and the RCKB complexity satisfies $C_K^{RCKB} \le C_K^{KB}$.
The percentage gain in complexity can be defined as
$$C_K^{gain} = \frac{C_K^{KB} - C_K^{RCKB}}{C_K^{KB}} \times 100\%. \tag{10}$$
V. SIMULATION
The performance of the proposed decoder is compared with that of the KB
decoder. It is assumed that the transmitted power is independent of the
number of transmit antennas, M, and equals the average symbol
energy, in a well-conditioned Rayleigh fading channel. Figure 2
illustrates the performance and complexity of RCKB for 16-QAM
K    C_K^KB    C_K^RCKB    C_gain^max
3 x 3 MIMO for 16-QAM
2    44        32          27.2%
4    84        48          42.9%
6    116       56          51.7%
4 x 4 MIMO for 16-QAM
2    60        44          26.7%
4    116       68          41.4%
6    164       84          48.8%
4 x 4 MIMO for 64-QAM
2    120       88          26.7%
4    232       136         41.4%
6    344       184         46.5%
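The KB complexity and gain columns of the table can be cross-checked numerically. The sketch below assumes a closed form that reproduces every KB entry in the table: sqrt(q) times the sum of a geometric first group (levels where fewer than K nodes exist) and a second group of K nodes per remaining level, over 2M real-valued tree levels. That closed form and the function names are assumptions made for illustration; the RCKB values are taken from the table rather than computed.

```python
def kb_complexity(k, sqrt_q, levels):
    """Assumed KB node count: sqrt(q) * (first group + second group).

    levels is the number of tree levels (2M for an M x M real-valued model).
    """
    p_k = 0  # largest integer with sqrt_q ** p_k <= k
    while sqrt_q ** (p_k + 1) <= k:
        p_k += 1
    first_group = sum(sqrt_q ** j for j in range(p_k + 1))
    second_group = (levels - p_k - 1) * k
    return sqrt_q * (first_group + second_group)


def complexity_gain_percent(c_kb, c_rckb):
    """Percentage gain in complexity of the RCKB over the KB."""
    return 100.0 * (c_kb - c_rckb) / c_kb


if __name__ == "__main__":
    # 3 x 3 MIMO, 16-QAM (sqrt(q) = 4, 6 levels), K = 6; RCKB value from the table.
    c_kb = kb_complexity(6, 4, 6)
    print(c_kb, round(complexity_gain_percent(c_kb, 56), 1))
```

The largest gain in the table, 51.7%, corresponds to K = 6 in the 3 x 3 MIMO, 16-QAM case.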
VI. CONCLUSIONS
We have proposed a modified K-best sphere decoding algorithm,
namely, reduced complexity K-best (RCKB). The RCKB achieves
complexity reduction compared to the KB without sacrificing its
performance. We have provided complexity analysis for proposed
and traditional algorithms. Simulation results have confirmed the
improvement of the proposed decoder.
REFERENCES
[1] Z. Guo, P. Nilsson, "Algorithm and Implementation of the K-best
Sphere Decoding for MIMO Detection," IEEE Journal on
Selected Areas in Communications, pp. 491-503, 2006.
[2] Y. Hsuan Wu, Y. Ting Liu, H. Chang, Y. Liao, H. Chang, "Early-Pruned
K-best Sphere Decoding Algorithm Based on Radius
Constraints," ICC, pp. 4496-4500, 2008.
[3] M. O. Damen, H. El Gamal, G. Caire, "On maximum-likelihood
detection and the search for the closest lattice point," Information
Theory, IEEE Transactions, pp. 2389-2402, 2003.
[4] R. Shariat-Yazdi, T. Kwasniewski, "Configurable K-best MIMO
Detector Architecture," ISCCSP, pp. 1565-1569, 2008.
[5] R. Graham, D. Knuth, O. Patashnik, Concrete Mathematics,
Addison-Wesley, 1989.
Graduate school of Information Science and Electrical Engineering, Kyushu University, Japan.
1
Faculty of Science, Kafrelsheik University, Egypt. Alaa_83moh@yahoo.com
2
Research Institute for Information Technology, Kyushu University, Japan.
controller has a global view of the current status of the network
and can interact with its network devices. All multicast
management, such as multicast tree computation and group
management, is handled by this controller. Because the
controller has complete knowledge of the topology and the
members of each group, it can create more efficient
multicast trees than the distributed approach [3].
I.
INTRODUCTION
II.
III.
ESTIMATED RESULTS
EJUST Center, Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
2
Department of Advanced Information Technology, Kyushu University, Fukuoka, Japan
Email: farhad@ejust.kyushuu.ac.jp, {ferdous, murakami}@soc.ait.kyushuu.ac.jp
I. INTRODUCTION
The amount of energy consumed by the sensor nodes in a
wireless sensor network (WSN) may vary widely. When the
energy level of a sensor node falls below a certain threshold,
the node dies, and the death of sensor nodes may shorten the
lifetime of the WSN. Dividing the network into smaller
partitions (clusters), each comprising a cluster head and a
number of sensor nodes, may alleviate the problem and
extend the network lifetime. Unlike the traditional case, in which
every sensor transmits data directly to the destination, in a
clustered network the data is transmitted by the cluster heads via
several hops to the base station. This saves energy for the
network. However, to reduce the node death rate, it may be
necessary to recluster the network under certain conditions.
[Fig. 1. A WSN including clusters, where data is transmitted from sensors (black nodes) via cluster heads (red nodes) to the base station.]
$$E_t - n_c \times E_c = E_r \tag{1}$$
1000 Joules
from 1 to 100
from 0.0 to 0.9
1 sec
from 1 to 1
[Fig. 2. WSN lifetime vs. the number of clusterings (nc) for alpha ranging from 0.0 to 0.9.]
[Fig. 3. WSN lifetime vs. the number of clusterings (nc) for different values, while Pr/Pc 1 and Pr/Pc 1.]
IV. CONCLUSION
Clustering a WSN can significantly improve the network's
energy efficiency and lifetime. However, because clustering has
an energy overhead, it is essential to determine an optimal
clustering frequency depending on the network conditions.
We consider a regular time interval for the clustering operation,
although it may be necessary to recluster the network under certain
conditions to avoid a high death rate of sensor nodes.
REFERENCES
[1] W. Heinzelman, A. Chandrakasan and H. Balakrishnan, "Energy
Efficient Communication Protocol for Wireless Microsensor Networks,"
Proceedings of the 33rd Hawaii International Conference on System
Sciences (HICSS '00), 2000.
[2] Kyung Tae Kim and Hee Yong Youn, "Energy-Driven Adaptive
Clustering Hierarchy (EDACH) for Wireless Sensor Networks," EUC
Workshops, LNCS 3823, pp. 1098-1107, 2005.
[3] Manju Bala and Lalit Awasthi, "Proficient D-HEED Protocol for
Maximizing the Lifetime of WSN and Comparative Performance
Investigations with Various Deployment Strategies," International Journal
of Advance Science and Technology, Vol. 45, August 2012.
1 Electrical Engineering Department, Engineering Faculty, Ferdowsi University of Mashhad, Iran, msafaie@stu.um.ac.ir
2 Computer Engineering Department, Engineering Faculty, Ferdowsi University of Mashhad, Iran, hnoori@um.ac.ir
3 E-JUST center, Graduate School of Information Science and Electronics Engineering, Kyushu University, Japan
B. Hardware Modules
The proposed firewall is composed of three main modules:
1) The IPv4 Controller: the heart of the design. It extracts the
required fields of the Ethernet packets and forwards them to the
Memory Controller.
2) The Memory Controller: a small finite state machine (FSM) that
detects the type of the data on the input port (IP, Port, or neither)
and forwards the data to the memory modules.
3) The memory modules: the IP TCAM and Port TCAM modules store the
firewall rule tables. They determine whether the extracted fields
match the rules within a single clock cycle.
A TCAM allows a third matching state, X or "don't care", besides 1
and 0, which adds flexibility to the search. For example, by placing
Xs in a few least significant bits of an IP address, a single memory
address can match hundreds of IP addresses during packet
classification.
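The ternary matching described above can be sketched in software. The following is a minimal illustration, not the authors' hardware: the class name `Tcam`, the helper `ip_to_int`, and the example addresses are all hypothetical, and the sequential loop only models what a real TCAM evaluates for every entry in parallel within one clock cycle.

```python
def ip_to_int(dotted: str) -> int:
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def ternary_match(key: int, value: int, care_mask: int) -> bool:
    """True if key equals value on every bit where care_mask is 1.

    Bits where care_mask is 0 behave like X ("don't care").
    """
    return (key & care_mask) == (value & care_mask)


class Tcam:
    """Software model of a ternary CAM (hypothetical helper class)."""

    def __init__(self) -> None:
        self.entries = []  # list of (value, care_mask) pairs

    def add(self, value: int, dont_care_lsbs: int) -> int:
        """Store a rule whose lowest dont_care_lsbs bits are X."""
        care_mask = (0xFFFFFFFF << dont_care_lsbs) & 0xFFFFFFFF
        self.entries.append((value, care_mask))
        return len(self.entries) - 1

    def lookup(self, key: int):
        """Return the index of the first matching entry, or None."""
        for index, (value, care_mask) in enumerate(self.entries):
            if ternary_match(key, value, care_mask):
                return index
        return None


tcam = Tcam()
# One entry with 8 don't-care bits covers 256 addresses (192.168.1.0-255).
tcam.add(ip_to_int("192.168.1.0"), dont_care_lsbs=8)
print(tcam.lookup(ip_to_int("192.168.1.77")))  # matches entry 0
print(tcam.lookup(ip_to_int("192.168.2.1")))   # no match
```

With 8 don't-care bits, one stored entry stands in for 256 exact-match entries, which is the flexibility the text refers to.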
I. INTRODUCTION
A. Assumptions
There are many encapsulation protocols for the payload of
an Ethernet frame. These protocols are distinguished by a
two-byte field in the Ethernet frame header called EtherType. In
the current implementation, we assume that all frames are IPv4.
The Protocol field of the IPv4 packet header declares the
encapsulation protocol of the IPv4 packet payload (e.g., TCP, UDP,
EXPERIMENTAL RESULTS
A. Device Utilization
The size of the rule tables plays an important role in resource
utilization. Larger TCAMs require more logic elements (LEs)
and memory blocks (Fig. 1). Increasing the number of rules from
128 to 1024 causes the operating frequency to decrease from 185
MHz to 165 MHz.
B. Throughput
Because a fixed number of clock cycles is required to classify
a packet, a higher operating frequency enables the firewall to
process more packets per unit time and offer higher throughput.
Fig. 2 depicts the maximum operating frequencies as a function of
the TCAM size in the three Altera FPGA families considered.
Generally, the operating frequency decreases for larger rule
tables, but in some cases the synthesis results show a rise in
frequency that might be the consequence of optimization
algorithms applied by the synthesis tool.
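The relation between operating frequency and packet rate can be made concrete with a short calculation. This is a sketch under a stated assumption: the number of clock cycles per packet is not given in the text, so `CYCLES_PER_PACKET` below is a hypothetical placeholder. The two frequencies are the ones reported for 128 and 1024 rules.

```python
def packets_per_second(clock_hz: float, cycles_per_packet: int) -> float:
    """Packet rate when every packet needs a fixed number of clock cycles."""
    return clock_hz / cycles_per_packet


CYCLES_PER_PACKET = 4      # hypothetical value, not reported in the text
F_128_RULES_HZ = 185e6     # operating frequency with 128 rules
F_1024_RULES_HZ = 165e6    # operating frequency with 1024 rules

rate_128 = packets_per_second(F_128_RULES_HZ, CYCLES_PER_PACKET)
rate_1024 = packets_per_second(F_1024_RULES_HZ, CYCLES_PER_PACKET)

# With a fixed cycle count, the relative throughput loss equals the
# relative frequency loss, whatever the cycle count is.
relative_drop = 1.0 - rate_1024 / rate_128
print(f"relative throughput drop: {relative_drop:.1%}")
```

Because the cycle count cancels out of the ratio, the relative drop (about 10.8%) does not depend on the assumed value of `CYCLES_PER_PACKET`.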
ACKNOWLEDGMENT
We thank Mr. Mahmoud Fathi of Ferdowsi University of
Mashhad for his noteworthy assistance in configuring the
Ethernet PHY chips. We also thank the Laboratory of
Embedded Systems of Ferdowsi University of Mashhad for
providing Altera FPGA boards and tools.
Figure 1. Required LEs for different FPGA families.
Vinesha Selvarajah1,3, Mueen Uddin2, Shinichi Matsumoto1,3, Junpei Kawamoto1,3, Kouichi Sakurai1,3
1 Faculty of Information Science and Electrical Engineering, Kyushu University, Kyushu, Japan, vinesh@itslab.inf.kyushu-u.ac.jp, kawamoto@inf.kyushu-u.ac.jp, sakurai@csce.kyushu-u.ac.jp
2 Department of Computer Science, University Malaysia Pahang, Pahang, Malaysia, mueenmalik9516@gmail.com
3 Institute of Systems, Information Technologies and Nanotechnologies (ISIT), Kyushu, Japan, smatsumoto@isit.or.jp
A. Case Scenario
Case: User A is a cloud user of XYZ Cloud Service
company. Recently, she felt that her information on the cloud
was modified or deleted without her knowledge. She suspects
that her account has been compromised. She then sought help
from the police to investigate the matter and
serve justice. We designed the scenario based on the recent
security breach involving hackers and the well-known public
cloud service provider "Dropbox", in which attackers prepared
to leak hundreds of usernames and passwords, along with
pictures, videos, and other files, and requested bitcoins in
exchange [6].
I. INTRODUCTION
Over the last decade and a half, since cloud technology
was introduced and implemented, many have seen it as an
opportunity to use the resources it provides, especially
the availability of an unlimited amount of storage space. Early
research [1] found that one of the major issues
in cloud forensics is location transparency: information
kept on a cloud server may be replicated in several different
locations. Further research in Liverpool [2] also discusses that the
main flaw concerning the cloud lies in evidence
acquisition, due to the remote locations of data centers. Adding
to this research, [3] agrees with the earlier researchers on
multi-server locations and explains how data centers are
vulnerable to attacks or can be dominated by hackers without
footprints being left behind. [3] suggests that the answer to
this problem is to image records and files on the data centers to
aid forensic investigation. Reference [4] adds the
suggestion that a private cloud server be created and used
only when needed to aid forensic investigation; however,
evidence acquisition in a large data store acts as a barrier to
this suggestion. Recent research [5] states that physically
acquiring objects from an indirect environment is uncertain
because customers and data centers are spread around the
world.
II. PROPOSED METHOD
III. CONCLUSION
Cloud computing readily accommodates clients' need to
reduce the cost of maintaining servers and hardware. However, this
technology complicates the investigation process because the
cloud servers are geographically located elsewhere. Focusing
on the steps to acquire evidence makes the investigation
process more effective, so that the integrity of the evidence
can be protected and validated for use in court
proceedings. Our method results in a systematic flow of
forensic investigation steps in cloud forensics. The limitation of
the proposed method is that it is only a conceptual
model, and considerable effort is required to implement it
and to ensure that the flow of forensic investigations is
systematized.
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
PMIPv6;
I. INTRODUCTION
CONCLUSION
ACKNOWLEDGMENT
This work was supported by Malaysia-Japan International
Institute of Technology (MJIIT) center at Universiti Teknologi
Malaysia.
1 Dep. of Electronics and Electrical Communication Eng., Faculty of Engineering, Tanta University, Tanta, Egypt, alaazain1986@gmail.com
2 Dep. of Electronics and Electrical Communication Eng., Faculty of Engineering, Tanta University, Tanta, Egypt, h_khobby@yahoo.com
3 Dept. of Information Systems, Information and Technology Institute, Menoufia University, Menoufia, Egypt, hatem6803@yahoo.com
4 Dep. of Electronics and Electrical Communication Eng., Faculty of Engineering, Tanta University, Tanta, Egypt, mnaby45@gmail.com
III.
IV.
I.
SIMULATION SETUP
INTRODUCTION
V.
A. Throughput
From Figure 2, it is clear that the throughput of OLSR is
high compared to that of AODV. This is because of its lower
routing-forwarding and routing traffic. Here, the malicious
node drops the data rather than forwarding it to the
destination, thus affecting throughput. The same is observed
for AODV: its throughput without the attack is higher
than with the attack, because packets are discarded
by the malicious node.
B.
Figure 3. Network load of (AODV, DSR and OLSR) with and without attack
Figure 6. Network load of (AODV, DSR and OLSR) with and without attack.
VI.
C. NETWORK LOAD
XI. CONCLUSION
A. Throughput
From Figure 4, it is clear that the throughput of AODV
is high compared to that of DSR. This is because of its
lower routing-forwarding and routing traffic. Here, the
malicious node jams the network, thus affecting throughput.
Comparing AODV and DSR under attack, the throughput is
higher for AODV than for DSR.
Figure 4. Throughput of (AODV, DSR and OLSR) with and DOS attack.
Figure 5. Packet End-to-end delay of AODV, DSR and OLSR with attack
Abstract: Recently, 60 GHz band WLAN systems have become attractive. We examine a multiuser MIMO (multiple-input multiple-output) OFDM (orthogonal frequency division
multiplexing) system that spatially multiplexes 4 users,
with a target performance of 6 Gbps
per user. In 60 GHz WLAN systems, beamforming with RF
circuits is employed to compensate for the large path loss and to reduce
interference between users. However, it also changes the channel
between each transmitter and receiver, so user
selection should also consider the variation of the beam patterns formed
by the RF circuits. We modified the user selection algorithms
to take RF beam pattern selection into account and estimated the
performance of the modified algorithms by
simulation. We describe the channel model adopted in the simulation
and propose two types of user selection algorithm that include beam
pattern selection. The first is capacity-based selection and
the second is an SUS (semi-orthogonal user selection)-based
algorithm. Both algorithms achieve
6 Gbps per user when the total number of users is more than 8, with
low computational complexity.
Keywords: IEEE802.11ad, Channel modeling, Multiuser MIMO, OFDM, User selection
Fig. 1. Millimeter-wave channel model with one LOS component and several
clusters (horizontal axis: time of arrival).
TABLE I. EXTRACTED PARAMETERS
Parameter                       Value
Cluster pdtc                    9.03 [ns]
Forward rays pdtc               4.50 [ns]
Backward rays pdtc              7.50 [ns]
Forward K-factor, kf            7.42 [dB]
Backward K-factor, kb           11.8 [dB]
Cluster arrival rate            0.15 [ns^-1]
Forward ray arrival rate        0.47 [ns^-1]
Backward ray arrival rate       0.39 [ns^-1]
No. of forward rays, Nf         1
No. of backward rays, Nb        4
I. INTRODUCTION
Recently, due to increasing data traffic, the unlicensed 60 GHz band has become attractive because of its
larger bandwidth and has been standardized in standards such as
IEEE802.11ad [1]. According to the standards, there are four
channels from 57.24 GHz to 65.88 GHz. Each channel is
used by one user, and all antennas adopt beamforming
with RF circuits. To further improve the channel efficiency
of the millimeter-wave channels, we examine a multiuser
MIMO (multiple-input multiple-output) OFDM (orthogonal
frequency division multiplexing) system that spatially multiplexes 4
users. In this paper, we investigate two modified user
selection algorithms by simulation with an established
channel model, where the user selection algorithms also include beam pattern selection for each antenna element.
B. Angular Profile
In 60 GHz wireless communication systems, beamforming
in the RF domain is adopted, so the channel model must include an angular profile. Characterization of the angular profiles of the clusters
and of the rays in each cluster is needed for this channel model.
From ray tracing, the angular profile of clusters becomes a
uniform distribution in horizontal, and two kinds of uniform
distributions for the first cluster and the other clusters, respectively (as
shown in Table II). The angular profile of the rays in each cluster
becomes a Laplace distribution (as shown in Table III).
TABLE II. DISTRIBUTIONS
AoD [deg]       AoA [deg]
[140,175]       [140,175]
[95,120]        [60,85]
TABLE III.
Horizontal AoD      32.7 [deg]
Horizontal AoA      39.7 [deg]
Vertical AoA        18.5 [deg]
TABLE IV. SIMULATION SETTING
512
336
128
2.64 [GHz]
10 [dBm]
-174 [dBm/Hz]
Using this work
ZF
TABLE V. NUMERICAL COMPLEXITY
E-search        5.2 x 10^12
C-based         2.8 x 10^5
SUS-based       1.6 x 10^3
III.
Fig. 2. Antenna configuration at the AP with four BF units (BF Unit 1 to BF Unit 4); each unit
has four beam patterns.
B. Selection Algorithm
We combine RF beamforming selection with two user
selection algorithms. The first one is a capacity-based
algorithm [4], shown in Algorithm 1, where K is the
number of existing users and F is the number of beam patterns.
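The greedy structure of a capacity-based selection can be sketched as follows. This is an illustrative sketch, not the paper's Algorithm 1: the function `greedy_select`, the toy gain table `GAIN`, and the capacity model `toy_capacity` are all hypothetical. The only elements taken from the text are the joint choice of a user and a beam pattern in each step and the termination rule that stops once the capacity no longer grows. Removing every candidate pair of an already selected user is an assumption.

```python
import math
from typing import Callable, List, Tuple

Pair = Tuple[int, int]  # (user index, beam pattern index)


def greedy_select(num_users: int, num_beams: int, max_streams: int,
                  capacity: Callable[[List[Pair]], float]):
    """Greedily add the (user, beam) pair that maximizes total capacity."""
    candidates = [(k, b) for k in range(num_users) for b in range(num_beams)]
    selected: List[Pair] = []
    best_capacity = 0.0
    for _ in range(max_streams):
        if not candidates:
            break
        # Score every remaining pair when added to the current selection.
        new_capacity, pick = max((capacity(selected + [p]), p)
                                 for p in candidates)
        if new_capacity < best_capacity:
            break  # capacity decreased: terminate, as in the stopping rule
        selected.append(pick)
        best_capacity = new_capacity
        # Assumption: a selected user is removed from the candidate set.
        candidates = [p for p in candidates if p[0] != pick[0]]
    return selected, best_capacity


# Hypothetical per-(user, beam) channel gains for a 3-user, 2-beam toy case.
GAIN = {(0, 0): 8.0, (0, 1): 1.0, (1, 0): 6.0,
        (1, 1): 5.0, (2, 0): 0.5, (2, 1): 0.4}


def toy_capacity(pairs: List[Pair]) -> float:
    """Toy sum rate: users sharing a beam interfere with each other."""
    total = 0.0
    for user, beam in pairs:
        interferers = sum(1 for u, b in pairs if b == beam and u != user)
        total += math.log2(1.0 + GAIN[(user, beam)] / (1.0 + 4.0 * interferers))
    return total


chosen, rate = greedy_select(3, 2, 4, toy_capacity)
print(chosen, round(rate, 3))
```

In this toy case the search stops after two users because adding the weak third user would lower the sum rate. The number of capacity evaluations grows roughly with K x F per step, rather than with the number of all user and beam combinations as in an exhaustive search.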
Fig. 3. Simulation result (horizontal axis: number of users; curves: exhaustive search, capacity-based, SUS-based, and target capacity).
IV. CONCLUSION
The performance of the proposed user selection algorithms
with RF beamforming decreases by 25% compared with the
exhaustive search. However, the numerical
complexity of the proposed algorithms is reduced remarkably.
ACKNOWLEDGMENT
This work was supported by Research and development of radio spectrum resources of the Ministry of Internal Affairs and Communications,
Japan.
REFERENCES
[1]
[2]
kTi
i1
i
if C(i) < C(i-1) then
    Algorithm terminated
end if
So ← So ∪ {(i)}, Ti+1 = {k ∈ Ti , k = (i), bk = b(i) }
end for
[3]
[4]
The second one is based on the SUS (semi-orthogonal user selection) algorithm [5]. For all combinations, the differences
between the directions of the beam patterns are set to 90 degrees
in the SUS-based algorithm.
[5]
1 Graduate School of Information Science and Electrical Engineering, Kyushu University, Japan.
2 Egypt-Japan University of Science and Technology, Egypt.
3 NTT Network Innovation Laboratories, NTT Corporation, Japan
E-mail: nusrat05cuet@gmail.com
INTRODUCTION
Mobile communication systems now form an
indispensable part of the daily lives of millions of people. Next-generation
mobile communication requires higher data rates
using many frequency bands and MIMO with many antennas.
Wideband RF circuits are a viable solution for such multi-band
access points and user terminals. These devices require precise
RF gain control to form beams in phased array antennas
and to limit the incident power to receiver circuits. For precise
gain control, the traditional variable gain amplifier (VGA) and
the variable FET attenuator are good candidates [1, 2]. The latter
has the advantages of low power consumption, bi-directionality,
and stability against unwanted oscillation and thermal variation.
Because mobile user terminals require low-power, low-cost, and
small-size circuits, CMOS technology has become popular for realizing
RF circuits on chip. This paper describes a wideband variable
attenuator design in 0.18 µm CMOS technology.
ATTENUATOR DESIGN
There are several conventional attenuators using T, π, and bridged-T
topologies, in which the series and shunt resistances are adjusted
[3]. In the π-attenuator, minimum attenuation occurs when the
FET switches are controlled so that the series resistance is small
and the shunt resistances are large. In that case, the loss at the
lowest frequencies comes only from the nonzero on-resistance
of the series switch. As this resistance gets smaller, the insertion
loss of the attenuator at its minimum setting gets smaller.
At higher frequencies, there is additional loss caused by the
parasitic capacitors to ground; therefore, minimizing these
capacitors reduces the insertion loss. Similarly, the series
components of the T-attenuator at the minimum attenuation setting are
completely on and the shunt component is turned off.
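As background for the series and shunt resistances mentioned above, the ideal resistor values of a matched π-attenuator follow from standard textbook formulas. The sketch below is not taken from the paper: it assumes ideal resistors, a 50 ohm system, and no FET parasitics, and the function names are illustrative.

```python
import math


def pi_attenuator(att_db: float, z0: float = 50.0):
    """Ideal shunt and series resistances of a matched pi-attenuator."""
    k = 10.0 ** (att_db / 20.0)          # voltage attenuation ratio
    r_shunt = z0 * (k + 1.0) / (k - 1.0)
    r_series = z0 * (k * k - 1.0) / (2.0 * k)
    return r_shunt, r_series


def parallel(a: float, b: float) -> float:
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def pi_input_impedance(r_shunt: float, r_series: float, z_load: float) -> float:
    """Input impedance of the pi network terminated in z_load."""
    return parallel(r_shunt, r_series + parallel(r_shunt, z_load))


def pi_attenuation_db(r_shunt: float, r_series: float, z_load: float) -> float:
    """Voltage attenuation from input node to output node, in dB."""
    out_node = parallel(r_shunt, z_load)
    return -20.0 * math.log10(out_node / (r_series + out_node))


# Attenuation steps from 6 dB to 24 dB in 2 dB increments.
for att in range(6, 26, 2):
    rp, rs = pi_attenuator(att)
    print(f"{att:2d} dB: shunt = {rp:7.2f} ohm, series = {rs:7.2f} ohm")
```

Larger attenuation needs a larger series resistance and smaller shunt resistances, which is the opposite of the minimum-attenuation condition described in the text.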
Vc      V1      V2      V3      V4      V5      Attenuation (dB)
Low     High    Low     Low     Low     Low     -6
Low     Low     High    Low     Low     Low     -8
Low     Low     Low     High    Low     Low     -10
Low     Low     Low     Low     High    Low     -12
Low     Low     Low     Low     Low     High    -14
High    High    Low     Low     Low     Low     -16
High    Low     High    Low     Low     Low     -18
High    Low     Low     High    Low     Low     -20
High    Low     Low     Low     High    Low     -22
High    Low     Low     Low     Low     High    -24
Bandwidth
(GHz)
DC-5
0.4-0.8
This
work
DC-4
Step size
Max
attenuation
Return loss
(dB)
3/6
(dB)
-24
-48
-24
(dB)
> 14
> 12
>10
Noise Figure
(NF)
(dB)
Max 48
Min 5.8
Max 24.5
Min 6.1
Discrete
step
0.16 m
CMOS
20
Discrete
step
65 nm
CMOS
6.4/3.2
Discrete
step
0.18 m
CMOS
48
Parameters
SIMULATION RESULTS
The attenuator has been designed and simulated using the TSMC
0.18 µm CMOS technology model. Fig. 2 shows the simulated
frequency response of the proposed attenuator when changing
the attenuation from 6 dB to 24 dB.
As the attenuator has been designed symmetrically, the return
losses at the input and output are approximately the same.
Figure 3 shows the input/output return loss versus frequency
at the minimum and maximum attenuation. Table 2 shows
the performance summary of the proposed digital attenuator
compared with conventional works. We calculate the
attenuator's figure of merit.
Control mode
Technology
FoM
[4]
[5]
CONCLUSIONS
A digitally controlled wideband variable attenuator has been
presented. The designed circuit achieves good performance over
the entire band with acceptable return loss. The worst return loss
is -8.2 dB at 4 GHz when the attenuation level is -14 dB.
ACKNOWLEDGMENT
This work was supported by the Funding Program for World-Leading Innovative R&D on Science and Technology and a
Grant-in-Aid for Scientific Research (B) (KAKENHI-B). This
work was also partly supported by the VLSI Design and Education
Center (VDEC), the University of Tokyo, in collaboration with
CADENCE Corporation and Agilent Corporation.
REFERENCES
[1]
[2]
[3]
[4]
[5]
I. INTRODUCTION
$$y(t) = \sum_{i=1} a_i\, y(t-i) + \sum_{j=1} b_j\, u(t-j) + e(t) \qquad (1)$$
Figure 2. Proposed VCO control voltage comparisons with the original circuits
and conventional PLL model
TABLE . MODEL IDENTIFICATION TIME AND SIMULATION TIMES
REFERENCES
[1] L. Ljung, System Identification: Theory for the User, 2nd ed. Englewood
Cliffs, NJ, USA: Prentice Hall, 1999.
[2] B. Bond, Z. Mahmood, Y. Li, R. Sredojevic, A. Megretski, V. Stojanovic,
Y. Avniel, and L. Daniel, "Compact modeling of nonlinear analog circuits
using system identification via semidefinite programming and incremental
stability certification", IEEE Trans. Comput.-Aided Design Integr. Circuits
Syst., vol. 29, no. 8, pp. 1149-1162, Aug. 2010.
[3] L. Liu, T. Sakurai, and M. Takamiya, "A Charge-Domain Auto- and
Cross-Correlation Based Data Synchronization Scheme with Power- and Area-Efficient PLL for Impulse Radio UWB Receiver", IEEE J. Solid-State
Circuits, vol. 46, no. 6, pp. 1349-1359, June 2011.
[4] B. Razavi, "Modeling and simulation", in Monolithic Phase-Locked Loops
and Clock Recovery Circuits: Theory and Design. New York, NY, USA:
IEEE Press, 1996.
[5] L. Liu and R. Pokharel, "Post-Layout Simulation Time Reduction for
Phase-Locked Loop Frequency Synthesizer Using System Identification
Techniques", IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 33, no. 11, pp. 1751-1755, Nov. 2014.
The proposed PLL frequency synthesizer model for post-layout simulation time reduction is shown in Fig. 1. The
layout data of the PLL frequency synthesizer from [3] is used
for parameter estimation of each building block. Fig. 2 shows
the simulated VCO control voltage comparisons with the
I. INTRODUCTION
CIRCUIT DESCRIPTION
Fig. 2. UWB-PA.
MEASUREMENT RESULTS
ACKNOWLEDGMENT
The authors would like to thank the Ministry of Higher
Education (MoHE), Mission Department, and Egypt-Japan
University of Science and Technology (E-JUST) for funding
our work. In addition, this work was partly supported by a
Grant-in-Aid for Scientific Research (B) from
JSPS KAKENHI (Grant no. 23360159).
                [2]         [3]         [4]         This work
Freq. (GHz)     3-10        3-7         3-10        4-9
|S11| (dB)      <-10        <-6         <-10        <-5.0
|S22| (dB)      <-14        <-7         <-10        <-8
Gain (dB)       11 ± 0.6    14 ± 0.5    10 ± 0.8    13.5 ± 0.7
Gd (ps)         86          178         250         60
PAE (%)
NA
NA
NA
10
5.6
Area (mm )
0.77
0.88
1.76
0.77
Power (mW)
100
24
84
21
OP1dB
2
REFERENCES
[1] Federal Communication Commission, Revision of Part 15 of The
Commission's Rules Regarding Ultra-Wideband Transmission Systems,
First Report and Order, ET Docket 98-153, FCC 02-48, April 2002.
[2] R. Sapawi, R. Pokharel, S. A. Z. Murad, A. Anand, N. Koirala, H.
Kanaya and K. Yoshida, Low Group Delay 3.1-10.6 GHz CMOS
Power Amplifier for UWB Applications, IEEE Microwave and
Wireless Components Letters, vol. 22, no. 1, pp. 41-43, Jan. 2012.
[3] S. A. Z. Murad, R. K. Pokharel, A. I. A. Galal, R. Sapawi, H. Kanaya,
and K. Yoshida, An Excellent Gain Flatness 3.0-7.0 GHz CMOS PA
for UWB Applications, IEEE Microwave and Wireless Components
Letters, vol. 20, no. 9, September 2010.
[4] C. Lu, A. V. Pham and M. Shaw, A CMOS power amplifier for full-band UWB transmitters, 2012 IEEE Radio Frequency Integrated
Circuits (RFIC) Symposium, pp. 400, 11-13 June 2006.
I. INTRODUCTION
C. System overview
The prototype of Noh-Guide is built with HTML5, a Web
browser, and a Web server. The parties involved are actors, audiences, and a
system controller. The system controller is responsible for
operating the subtitles.
Figure: Noh-Guide screen (labels: Singing, Highlighted, Display area, Forward & back, Adjust, Change mode to explain).
MOTIVATION
B. Results
The answers to the questionnaires indicated the difficulty of
Noh, the usefulness of our system, and the importance of
helpful subtitles, as shown in Table 1.
The effectiveness of the Noh guide is shown through the answers
to this questionnaire. The program that most appealed to the
examinees was Futari Hakama, as we had assumed, because the
program was Kyogen and performed by a living national treasure.
We also gathered free-format comments from the examinees
to evaluate the effect of our system. According to the comments,
most examinees judged that our system helped them to
understand the Singing and the actors' actions. However, some
comments also pointed out problems with the system.
When the examinees focused on their mobile device, they were not able
to watch and concentrate on the stage.
C. Discussion
Table 1 shows part of the results of the experiment. Examinees
found that the Casual subtitle is more helpful than the Singing
one. We consider the reason to be that Singing is difficult to
understand for most audiences and the Casual language can
help them to watch a Noh performance. We conclude that letting
the audiences understand the meaning of the performance is an
important key to making them enjoy the performance.
Figure 2 shows actions of a main actor. The action
performance is linked with a song and background music,
where the jumping action means avoiding waves over the sea. If
novice audiences cannot understand the meaning of the singing,
they cannot obtain accurate information about the action either.
The most important result is that our system was able to help
all the examinees understand the meaning of Singing.
Figure 2. Action of Shite (stage of YASHIMA): a) attacking, b) jumping
Table 1. Effects of Subtitle
Answer
Question
Which is the more helpful subtitle, Casual or Singing?
Do you feel the subtitle is suitable to the performance?
Does Noh-guide help to understand this stage?
Is Noh-guide easy to control?
III.
Casual
Singing
Other
Yes
No
Other
10
IV.
EXPERIMENT
A. Experiments
We demonstrated Noh-guide on the stage of Maibayashi of
Kiyotsune in the Kumamoto Prefectural Theater on
September 19th, 2014. The target stage was part of the event
called Kumamoto Noh Zanmai. Maibayashi consists of
Shite, Hayashi, and Jiutai.
There were 10 examinees, 8 women and 2 men, with an
average age of 31. Every examinee brought their own
mobile device, an iPhone or an Android phone, and watched the stage with
it. We gathered their impressions through
questionnaires. At the same time, we tried to gather their system
interaction logs; however, this attempt failed because a few
devices failed during the performance.
ACKNOWLEDGMENT
We thank Ichiro Nakamura, who provided the data necessary to
construct this system. We would also like to thank Kumamoto
Prefectural Theater for giving us the opportunity to conduct this experiment.
I. INTRODUCTION
II. METHODOLOGY
B. SimWare Simulator
SimWare [2] is a novel holistic simulator that measures the
power consumption of a datacenter by considering the most influential
factors, such as servers, the cooling system, fans, and the thermal
effect of servers on each other due to heat recirculation.
CPUs, CPU utilization, runtime, wait time and some others are
recorded.
We assume that one hardware accelerator is attached to
each server and that every job can use it. An accelerator can
handle only one job at a time, and when a job is using the
accelerator, the speedup is 10X. The 10X speedup
is hypothetical, and different jobs might obtain different
speedups. However, since we want to compare accelerators with
different power consumption and the speedup is the
same for all the accelerators, this number (10X speedup) is
justifiable.
No. of Chassis                              50
No. of Servers in each Chassis              10
No. of Cores in each Server                 10
Power Consumption of Cores (total)          130W
REFERENCES
[1] http://www.eetimes.com/document.asp?doc_id=1324372
[2] Y. Sungkap and H. H. S. Lee, "SimWare: A Holistic Warehouse-Scale
Computer Simulator," Computer, vol. 45, pp. 48-55.
[3] http://impact.asu.edu/BlueTool
[4] A. Pahlavan, M. Momtazpour, and M. Goudarzi, "Data center power
reduction by heuristic variation-aware server placement and chassis
consolidation," in Computer Architecture and Digital Systems (CADS),
2012 16th CSI International Symposium on, pp. 150-155.
[5] www.cs.huji.ac.il/labs/parallel/workload/swf.html
Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Japan
email: {maher.salem, mohammed.sayed}@ejust.edu.eg, victor.goulart@acm.org
stages is divided into two vectors: the left bits and the
right bits. Finally, the edge detector is used to detect the
zero-to-one transition. The detected one in the first vector
represents the first active requester, while the detected one
in the second vector represents the second active
requester. The main problem in this algorithm occurs
when the highest-priority requester is active. The
sequence of additions then results in a vector of ones, and this
vector, when passed to the edge detector stage, results in a
vector of zeros. This bug was not mentioned in the
original paper [3].
The bug in the 3DP2S architecture was fixed in [4].
We will name this circuit 3DP2S_OZU. Its idea is to
AND the priority vector with the input request vector
to indicate whether the highest-priority requester is active.
If it is active, the first grant vector is the
priority vector itself, since the priority vector grants the
highest-priority requester. Otherwise, the grant vector
generated by the circuit is used. The selection is
done through a multiplexer circuit. The fixed circuit is
shown in Fig. 2.
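The intended behavior of a two-select round-robin arbiter can be captured in a small behavioral reference model. This sketch is not the 3DP2S or 3DP2S_OZU circuit: it only mirrors two ideas from the description, a running request count saturated at 2 and the detection of the points where that count changes. The function name `two_pick` and the scan direction (increasing index from the pointer, wrapping around) are assumptions.

```python
from typing import List, Optional, Tuple


def two_pick(requests: List[int], pointer: int) -> Tuple[Optional[int], Optional[int]]:
    """Return the first two active requesters, scanning from pointer.

    requests[i] is 1 if requester i is active. The requester at index
    pointer has the highest priority.
    """
    n = len(requests)
    count = 0  # running number of requests seen, saturated at 2
    first: Optional[int] = None
    second: Optional[int] = None
    for offset in range(n):
        idx = (pointer + offset) % n
        new_count = min(count + (1 if requests[idx] else 0), 2)
        if count == 0 and new_count == 1:
            first = idx      # count rises from 0 to 1: first grant
        elif count == 1 and new_count == 2:
            second = idx     # count rises from 1 to 2: second grant
        count = new_count
    return first, second


# The highest-priority requester (index 2) is active: it must be granted.
print(two_pick([1, 0, 1, 1, 0, 0, 0, 1], pointer=2))
# Only one requester is active, reached after wrapping around.
print(two_pick([0, 0, 0, 0, 0, 1, 0, 0], pointer=6))
```

A model like this is useful as a golden reference in a testbench, in particular for the corner case discussed above where the highest-priority requester itself is active.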
INTRODUCTION
RELATED WORK
In the state of the art, only two works have targeted two-select RRAs. The first is the 3-dimensional
programmable 2-select (3DP2S) arbiter
proposed in [3]. The arbitration circuit of the 8-point 3DP2S
is shown in Fig. 1. It consists of log2(8) stages of unit
blocks (UBs) and an edge detector stage. The UB of 3DP2S
is a thermometer-coded adder saturated at 2, i.e., whenever
the sum of the two inputs is larger than 2, the output is
2. Further, the UB also takes two pointer bits (i.e.,
input priorities) and outputs their OR. The
pointer bits control the adder function: if the right input
priority is logic high, the UB does not add its inputs and
passes the right input as it is. Therefore, the result of the
additions propagates through the stages to the paths of
requesters that have lower priority. The result of the three
PROPOSED SOLUTION
V.
IV.
CONCLUSIONS
REFERENCES
[1] W. J. Dally and B. Towles, Route Packets, Not Wires: On-Chip Interconnection Networks, In Proc. of 38th Design
Automation Conference (DAC), pp. 684-689, 2001
[2] M. Abdelrasoul, M. Ragab, V. Goulart, "Impact of Round
Robin Arbiters on router's performance for NoCs on
FPGAs," IEEE International Conference on Circuits and
Systems (ICCAS), pp. 59-64, 2013
[3] J. S. Ahn, D. K. Jeong, and S. Kim, Fast three-dimensional
programmable two-selector, Electronics Letters, vol. 40, no.
18, 2004
[4] H.F. Ugurdag, F. Temizkan, O. Baskirt, and B. Yuce, Fast
two-pick n2n round-robin arbiter circuit, Electronics Letters,
vol. 48, no. 13, 2012
[5] P. Gupta and N. McKeown, Designing and implementing a
fast crossbar, Micro IEEE, vol. 19, Issue 1, pp. 20-28, 1999
I. INTRODUCTION
With the incorporation of multi-core processors and embedded GPUs,
the performance capabilities of embedded systems are rapidly
improving. However, exploiting parallelism is even harder than
in traditional systems owing to tighter constraints on serial
performance, memory, and power. Moreover, embedded
applications generally must meet real-time constraints,
which further increases application development complexity.
Therefore, developers must use tools to help determine
application and system performance bottlenecks and
characteristics. Such information can help in both manual and
automatic compiler optimisations.
III. METHODOLOGY
We aim to develop and integrate a low-overhead profiling
subsystem into the well-known LLVM compilation system.
LLVM supports just-in-time (JIT) compilation, which allows
us to implement dynamic code analysis. To decrease the
perceived profiling overhead, the profiling subsystem will
make use of device idle time and the large storage space
generally available in embedded devices to perform
incremental profiling that gives suitable information for
parallelism analysis.
Figure 1. A proposed diagram of our system (blocks: Source Code, CFG Analysis, Instrumentation, Data Dependence Profiler, Binary Code, Programmers/Compilers).
3 Center for Japan-Egypt Cooperation in Science and Tech., Kyushu Univ., Fukuoka, Japan. (muta@ait.kyushu-u.ac.jp).
4 Graduate School of Information Science and Electrical Eng., Kyushu Univ., Fukuoka, Japan. (furuhiro@ait.kyushu-u.ac.jp)
NiIP KB
iU
i=M +1
iL
d
o
g
>
t
i
w
(
P
(3)
(4)
w
M
n
n
KB
KB
where card(IP
) is the cardinality of the set IP
which is the
i
i
number of nodes at level i that have distance metrics smaller than the
pruned radius i . In particular, we provide a heuristic for determining the
pruned radius, i at a specific tree level i. In particular
i K 2 dmin
i
,
i =
10SN R/10
IP KB
), K)
max(card(i
= K
2K max(card(IP KB ), K)
i+M
KB
= {j|dji < i , j {1, 2, , 2K 1}, i U }
IP
i
II. BACKGROUND
$$\hat{\mathbf{x}}_{SD} = \arg\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{R}\mathbf{x}\|^2,$$
In multi-input multi-output (MIMO) communication systems, the traditional K-best sphere decoder (KB) retains the best K nodes at each
level of the search tree [1].
In this paper, a variation of the KB decoder for MIMO systems
is proposed, namely, improved performance K-best (IPKB). Unlike
[2], which presents an algorithm that reduces complexity without
performance degradation, the IPKB provides performance improvement
without complexity increase. The chosen K nodes include irrelevant
nodes that increase the decoding complexity without improving performance; by discarding these irrelevant nodes, one can decrease the
complexity without compromising the performance, as mentioned in [2].
We can reinvest the nodes discarded in some tree levels by
increasing the visited nodes (VNs) in other tree levels by the same
number, in order to improve the performance at the
same complexity.
a
n
I. I NTRODUCTION
F
S
metrics d(x) =
y Rx can be computed recursively using partial
Euclidean distances [3].
Alternatively, the KB algorithm traverses the tree in a breadth-first
search strategy at each level. Where, a decision is made on a level-by-level
basis by keeping only the best K-nodes that corresponds to the smallest
K-distance metrics. Note that this paper uses real form representation
mentioned in [4].
i = 2M, 2M 1, , 2
(5)
(2)
U
V
K
H
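The level-by-level KB search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `k_best_detect`, the list-based matrix layout, and the toy 4x4 upper-triangular R are assumptions made for the example, and the metric is assumed to be the squared Euclidean distance.

```python
def k_best_detect(R, y, symbols, K):
    """Breadth-first K-best search for the x minimising ||y - R x||^2.

    R is an upper-triangular matrix given as a list of rows, y is a list,
    and symbols is the real-valued alphabet of each entry of x.
    """
    n = len(R)
    # Each survivor is (partial distance, symbols decided for entries i+1 .. n-1).
    survivors = [(0.0, [])]
    for i in range(n - 1, -1, -1):           # start at the last row of R
        candidates = []
        for dist, tail in survivors:
            # Contribution of the already decided (higher-index) entries of x.
            interference = sum(R[i][i + 1 + k] * s for k, s in enumerate(tail))
            for s in symbols:                # expand into len(symbols) children
                err = y[i] - interference - R[i][i] * s
                candidates.append((dist + err * err, [s] + tail))
        candidates.sort(key=lambda c: c[0])  # smallest distance metrics first
        survivors = candidates[:K]           # keep only the best K nodes
    best_dist, best_x = survivors[0]
    return best_x, best_dist


# Hypothetical noise-free example with a 4-level real-valued tree.
R = [[2.0, 0.5, -0.3, 0.1],
     [0.0, 1.5, 0.4, -0.2],
     [0.0, 0.0, 1.2, 0.6],
     [0.0, 0.0, 0.0, 1.0]]
x_true = [1.0, -3.0, 3.0, -1.0]
y = [sum(r * x for r, x in zip(row, x_true)) for row in R]
x_hat, dist = k_best_detect(R, y, [-3.0, -1.0, 1.0, 3.0], K=2)
```

In the noise-free case the transmitted vector has a zero partial distance at every level, so it survives for any K >= 1; with noise, a small K may prune the maximum-likelihood path, which is the trade-off the KB and IPKB variants address.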
[Figure residue: search-tree example with node distance metrics; labels: Level (i), Upper level, Neutral level, Lower level, IPKB solution, SNR = 5dB.]
[Figure residue: BER and complexity curves versus SNR (dB); series: K = 2, IPKB2, K = 4, IPKB4, K = 6, IPKB6.]
A. Complexity of the KB
To determine the complexity of the KB algorithm, tree levels are
divided into two groups. The first group contains the tree levels where the number
of available nodes in each tree level satisfies $N_i^{KB} \le K$, whereas the second
group contains the tree levels where the number of available nodes per level
is $> K$. Note that each surviving node is expanded into $\sqrt{q}$ child nodes in
the next tree level. Then, the number of VNs for the first group (the same as the number of available nodes
in this group), $N^{KB} = \sum_j N_j^{KB}$ over the levels of this group, is
$N^{KB} = \sqrt{q} \sum_{j=0}^{P_K - 1} (\sqrt{q})^j, \quad (\sqrt{q})^{P_K} \le K,$ (6)
where $P_K$ is the number of tree levels in the first group for a specific K. Given that
$(\sqrt{q})^{P_K} \le K < (\sqrt{q})^{P_K + 1}$ and knowing q and K, we can determine
$P_K$ [6]:
$P_K = \lfloor \ln(K) / \ln(\sqrt{q}) \rfloor,$ (7)
where $\lfloor \cdot \rfloor$ is the floor operation. For example, consider 16-QAM for the 2×2
MIMO shown in Fig. 1. In the case of K = 6, from Eqs. (6) and (7), the
number of tree levels for the first group is $P_K = \lfloor 1.29 \rfloor = 1$ and the total number of
nodes of the first group is $N^{KB} = 4$.
For the second group, each tree level has a fixed number of VNs,
(8)
Using $P_K$ from (7), the complexity of the KB, $C_K^{KB}$, is the total number of
VNs:
$C_K^{KB} = \sqrt{q} \left[ \frac{1 - (\sqrt{q})^{P_K + 1}}{1 - \sqrt{q}} + (2M - P_K - 1) K \right],$ (9)
VI. CONCLUSIONS
We have proposed a modified K-best sphere decoding algorithm,
namely, improved performance K-best (IPKB). The IPKB achieves performance improvement over the KB without increasing its complexity.
We have provided a complexity analysis for the proposed algorithm. Simulation
results have confirmed the improvements of the proposed decoder.
REFERENCES
[1] Z. Guo and P. Nilsson, "Algorithm and Implementation of the K-best
Sphere Decoding for MIMO Detection," IEEE Journal on Selected
Areas in Communications, pp. 491-503, 2006.
[2] I. Al-Nahhal, M. Alghoniemy, O. Muta, A. B. Abd El-Rahman, and
H. Furukawa, "A Reduced Complexity K-best Sphere Decoding
Algorithm for MIMO Channels," JEC-ECC, 2015.
[3] Y. Hsuan Wu, Y. Ting Liu, H. Chang, Y. Liao, and H. Chang, "Early-Pruned K-best Sphere Decoding Algorithm Based on Radius Constraints," ICC, pp. 4496-4500, 2008.
[4] M. O. Damen, H. El Gamal, and G. Caire, "On maximum-likelihood
detection and the search for the closest lattice point," IEEE Transactions on Information
Theory, pp. 2389-2402, 2003.
[5] R. Shariat-Yazdi and T. Kwasniewski, "Configurable K-best MIMO
Detector Architecture," ISCCSP, pp. 1565-1569, 2008.
[6] R. Graham, D. Knuth, and O. Patashnik, Concrete Mathematics,
Addison-Wesley, 1989.
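As a cross-check of the KB complexity analysis (Eqs. (7) and (9)), the short sketch below evaluates the visited-node count for the 16-QAM, 2×2 example. It assumes Eq. (9) has the bracketed form sqrt(q) * [(1 - sqrt(q)^(P_K+1)) / (1 - sqrt(q)) + (2M - P_K - 1) K]; the function names and the integer parameter `branching` (standing for sqrt(q), assumed to be at least 2) are hypothetical, and the sketch assumes P_K < 2M.

```python
def kb_first_group_levels(K, branching):
    """P_K of Eq. (7): the largest P with branching**P <= K (integer-exact floor of the log ratio)."""
    P = 0
    while branching ** (P + 1) <= K:
        P += 1
    return P


def kb_complexity(K, branching, M):
    """Total visited nodes of the K-best decoder over the 2M real-valued tree levels."""
    P = kb_first_group_levels(K, branching)
    # Geometric sum 1 + b + ... + b**P; the integer division is exact.
    geometric = (1 - branching ** (P + 1)) // (1 - branching)
    return branching * (geometric + (2 * M - P - 1) * K)


# 16-QAM in real form: sqrt(q) = 4 children per node; 2x2 MIMO: M = 2.
levels = kb_first_group_levels(6, 4)
visited = kb_complexity(6, 4, 2)
```

For K = 6 this gives P_K = 1 and 4 + 16 + 24 + 24 = 68 visited nodes, which agrees with counting the expanded children level by level under the stated assumption.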
1 Kyushu University
2 Institute of Systems, Information Technologies and Nanotechnologies
3 Fujitsu Laboratories Limited
4 Fujitsu Limited
This paper outlines the different ML driven compilation approaches that have been proposed in related works
and the challenges that need to be resolved to make it
practical.
Abstract: Modern compilers rely on heuristics to choose an optimization strategy to apply to
an input program with the objective of improving its
performance. An optimization strategy is defined as a
sequence of code optimization techniques and their respective parameters. This approach has
turned out to be suboptimal, and researchers have
proposed using machine learning (ML) driven compilation as a way of improving the selection of the optimization
strategy. This paper first outlines the different ML
driven compilation approaches that have been proposed in related works. It then identifies, from the
state of the art, the five major challenges to be resolved
in order to make ML driven compilation practical.
I. Introduction
The compiler's job is to translate human-readable code
into machine code that makes efficient use of hardware resources. To apply sequences of optimizations, compilers
rely on pre-defined optimization levels and static performance models of the system to estimate whether a given
optimization would be beneficial. An optimization level
is a fixed optimization scenario that is applied to any input program, mostly ignoring its specificities. Still, modern compilers fall short in selecting the optimizations that yield the highest speedups. As an illustrative
example, we tested the ability of Intel's ICC compiler to optimize 750 tensor contraction programs with a relatively
small optimization space of 9505 strategies. We observed
that for 59% of our test programs there was at least one
optimization strategy that provided over 5% more
speedup than the one chosen by ICC, often
resulting in more than twice the speedup.
More recent compiler research has shown promising results in the application of machine learning (ML) techniques to improve compiler performance. In this alternative approach, the hardware and compiler are treated as a
black box and a tunable model is trained with benchmark
programs to learn to recognize which types of programs
benefit from which optimizations.
case a scenario encoding must also be defined. The scenario input can be avoided by training one predictor for
each scenario. However, the number of optimization scenarios may be too large for this technique to be practical.
Classification is an alternative approach but is harder to
model and is prone to a strongly unbalanced training set,
which ML techniques do not handle well.
Challenge 4: Generality. How varied are the types
of programs for which the predictor can accurately find
beneficial optimizations? This entails first determining
which application domain would benefit the most from
ML driven compilation. The next step is identifying a
source from which to mine programs to train and test
the predictor. There are different sources that can be
mined to generate the software characteristics including
benchmarks, auto-generated code, and crowd-sourcing [2].
Once we have a program source to mine, we need some
method to ensure enough program varieties are covered
[6].
Challenge 5: Reproducibility. In the ML driven
compilation research field there is a general lack of trust
in published results because the experimental data is usually not readily accessible for independent investigation.
Moreover, much research effort is lost because the tools
employed are often developed ad hoc and not made
publicly available. This is further complicated because
ML driven compilation has no standard methodology
or metrics for evaluating and comparing different prediction approaches. Therefore, there is a need for services that share experimental environments and for predictor evaluation
methodologies, to enable collaboration that can lead to
resolving the challenges presented in this work.
References
[1] F. Agakov et al., "Using machine learning to focus iterative optimization," International Symposium on Code Generation and
Optimization (CGO), pp. 295-305, March 2006
[2] G. Fursin and O. Temam, "Collective optimization: a practical
collaborative approach," ACM Transactions on Architecture and
Code Optimization (TACO), vol. 7, no. 4, 2010
[3] E. Park et al., "Using graph-based program characterization for
predictive modeling," International Symposium on Code Generation and Optimization (CGO), pp. 196-206, 2012
[4] S. Kulkarni and J. Cavazos, "Mitigating the compiler optimization phase-ordering problem using machine learning," OOPSLA,
2012
[5] A. Trouvé et al., "Using Machine Learning in order to Improve
Automatic SIMD Instruction Generation," in The eighth international workshop on automatic performance tuning (iWAPT),
2013
[6] K. Hoste and L. Eeckhout, "Microarchitecture-independent
workload characterization," IEEE Micro, vol. 27, no. 3, pp. 63-72, 2007
1 reem.elkhouly@ejust.edu.eg
2 ahmed.elmahdy@ejust.edu.eg
3 elmasry@mpi-inf.mpg.de
Abstract
Control dependence elimination, namely if-conversion, is an essential optimisation when generating efficient parallel code from
serial code. While many if-conversion heuristics have been proposed in the literature, few studies have investigated the
effectiveness of transformations guided by detected patterns. In our research we tackle this problem by exploring the optimisation space
for a set of representative kernels, focusing on frequent branches. We have implemented our technique as an LLVM prototype
tested on an Intel x86 platform. Compared to some well-known optimisation techniques, our idea focuses on extracting a pattern that
can identify the profitable conversions. We thereby highlight the performance improvement opportunities that may be investigated
via new learning techniques.
I. INTRODUCTION
Making computers run faster is a major goal in the fields of computer architecture and compilers. A key performance
driver is providing parallel execution at various granularities. With the multicore shift, larger parallelism granularities are
generally sought. Still, single-core performance remains important, as it is a major factor limiting scaling. Accelerating single-core performance mainly relies on instruction-level parallelism, where several independent instructions are executed in the
same processing cycle. The degree of parallelism is inherently limited by the data-flow characteristics of the running program.
However, control-flow significantly hinders exploiting the true dependence manifested by the data-flow, significantly reducing
the achievable degree of parallelism [1], [2]. When branches are highly mispredicted, converting control-dependence into
data-dependence via predicated execution is a solution. In this model, instructions are generally guarded by predicates,
thereby eliminating control-flow [3]. This approach thus relies on if-conversion optimisations to convert conditional branches
into predicated instructions, allowing further potential parallelisation subject to the inherent data-flow dependences. However,
predication comes at the extra cost of executing nullified instructions. This can potentially degrade performance for large
if-then bodies. Moreover, branches interact in terms of allowing for different execution schedules, for which finding the
optimal schedule is generally a hard combinatorial search problem. In our work we revisit the problem of deciding which
branches to convert.
II. METHODOLOGY
[Figure 1 labels: Source code (program.c); LLVM bitcode (program.bc); Time profiling; Hotspot functions (program.bc); Pre-conversion optimization (mem2reg, ..); LLVM bitcode in SSA form (program.bc); BitmaskControlledIfConversion; Optimized assembly code (program.s); Executable (program.o)]
In particular, we consider representative, frequently executed kernels from selected SPEC-CPU2006 benchmarks [4]. We
exhaustively try all possible combinations of if-converting conditional branch instructions, and report the obtained performance
on an x86 processor. Moreover, we measure the effectiveness of some commonly used heuristics by comparing their performance
with that of the optimal strategy while varying the metrics that the heuristics rely on. We will use
these observations to extract a pattern for profitable conversions using techniques such as Monte Carlo tree search. Figure 1
describes the implemented prototype, where the source code is initially converted into the bitcode format. At this point, the
code needs to be prepared by some pre-if-conversion optimisations that allow the if-conversion to perform well [5].
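The exhaustive exploration described above can be pictured as enumerating one bit per candidate branch. The sketch below is illustrative only: `enumerate_ifcvt_masks`, `best_mask`, and the `measure_runtime` callback are hypothetical names, and in a real setup the callback would build and time the kernel for a given bitmask rather than look up a table.

```python
from itertools import product


def enumerate_ifcvt_masks(num_branches):
    """Yield every bitmask string; bit k == '1' means branch k is if-converted."""
    for bits in product("01", repeat=num_branches):
        yield "".join(bits)


def best_mask(num_branches, measure_runtime):
    """Return the bitmask with the smallest runtime reported by the callback."""
    return min(enumerate_ifcvt_masks(num_branches), key=measure_runtime)


# Toy stand-in for timing runs: invented runtimes (seconds) for two branches.
toy_runtimes = {"00": 4.40, "01": 4.20, "10": 4.50, "11": 4.30}
masks = list(enumerate_ifcvt_masks(2))
winner = best_mask(2, toy_runtimes.get)
```

The search space doubles with every additional branch, which is why the study restricts itself to frequently executed branches in hotspot kernels.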
Fig. 2. Runtimes of all selective ifcvt along with the clang -O3 output runtime: (a) bzip2, (b) mcf, (c) astar. [Plot residue: y-axis Runtime (s); x-axis tick labels are bit patterns (00, 01, 10, 11); series: selected ifcvt, clang -O3.]
[Plot residue, caption lost: series Clang -O3, Best ifcvt, Worst ifcvt; x-axis Benchmarks (bzip2, mcf, astar); recovered axis label: Runtime to optimal ratio.]
Fig. 4. Heuristics tested on bzip2: (a) Average no. of instructions per basic block, (b) Average if depth.
1 Department of Electronics and Communications, Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt
2 Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
Emails: mostafa.saied@ejust.edu.eg, farhad@ejust.kyushu-u.ac.jp, murakami@ait.kyushu-u.ac.jp, m.ragab@ejust.edu.eg
Abstract: The emerging 3D integration technology overcomes significant limitations of the 2D integration process. The use of very short
Through-Silicon Vias (TSVs), much shorter than the average wire length, yields
a significant reduction in routing area, power consumption, and delay.
However, 3D technology suffers from extremely low yield. It has been shown
in the literature that reducing the TSV count considerably
improves yield. The TSV multiplexing technique called TSVBOX was
introduced in [1] to reduce the TSV count without affecting the direct
benefits of TSVs. The TSVBOX introduces some delay to the signals
to be multiplexed. In this paper, we derive a design methodology for
TSVBOX-based 3D Network-on-Chip (NoC) to overcome the impact of the TSVBOX
delay degradation on system validity.
I. INTRODUCTION
[Figure residue: conventional TSV link model. Labels: input inverter driver (Rdr_Conv), node V, wire parasitics (RW, CW), TSV (RTSV/2, CTSV, RTSV/2), output 1x-inverter driver, load CL.]
where
$C_{PN} = C_{dbP} + C_{dbN}, \quad R_{PN} = \frac{R_{onP} R_{onN}}{R_{onP} + R_{onN}}$. (2)
While the delay of the SEL (SEL) signal can be approximated by:
VDD
TdSEL = ln
(3)
[Figure residue: TSVBOX-based link model. Labels: input inverter drivers (Rdr-SEL, Rdr_TSVBOX), SEL lines routed to other TSVBOXes in the data bus, switch parasitics (RonP, RonN, CP, CN, 2Cg), wire parasitics (RW, CW), TSV (RTSV/2, CTSV, RTSV/2), output 1x-inverter drivers, load CL, nodes V1 and V2.]
[Figure residue: design-methodology flowchart. Boxes: Technology to be used (180 nm, 130 nm, etc); Calculate TSVBOX model parasitics; Conventional data driver design steps (Calculate Rdr-Conv; Is Rdr-Conv<=Rdr-max?; Select Rdr-Conv=Rdr-max; Calculate KN-Conv); Data driver design steps (Calculate Rdr-TSVBOX; Calculate KN-TSVBOX); SEL driver design steps (Select Td-SEL | Td-SEL <= 0.5TCLK-Td-TSVBOX; Calculate Rdr-SEL; Is Rdr-SEL<=Rdr-max?; Select Rdr-SEL=Rdr-max; Calculate KN-SEL); SEL driver design steps (Select Td-Disch-SEL | Td-Disch-SEL <=Td-SEL; Is Rdr-SEL<=Rdr-max?; Select Rdr-SEL=Rdr-max; Calculate KN-SEL); Finish.]
[Figure residue: SEL timing diagrams (a) and (b). Labels: SEL (CLK), TCLK, 0.5TCLK, T1, T2, Trem, Td-SEL, max(VthN, VthP).]
[Table residue: row labels lost]
Unit    Theoretical    Simulation    |error|
nsec    2.499          2.45          0.04%
nsec    4.1455         4.15          0.1%
nsec    0.5            0.5           0.0%
nsec    0.1383         0.12          13.23%
[Table residue: row labels and one column header lost; unit printed as "k"]
Values               [header lost]
(1, 1.5)             15.51
(1, 1.5)             15.51
(2.7526, 4.1289)     5.6344
(9.9531, 14.93)      1.5582
REFERENCES
[1] M. Said, F. Mehdipour, and M. El-Sayed, "Improving Performance
and Fabrication Metrics of Three-Dimensional ICs by Multiplexing Through-Silicon Vias," DSD'13, pp. 581-586, 2013.
[2] I. Loi, S. Mitra, T. Lee, S. Fujita, and L. Benini, "A low-overhead Fault
Tolerance Scheme for TSV-based 3D Network on Chip Links,"
ICCAD'08, pp. 598-602, 2008.
[3] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems
Perspective, Addison Wesley, 2011.
[4] A. Papanikolaou, D. Soudris, and R. Radojcic, Three Dimensional
System Integration, Springer, 2011.
[5] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, Prentice Hall, 2003.
[6] http://www.itrs.net/reports.html
Research Institute for Nano device and Bio Systems, Hiroshima University, Higashi-Hiroshima, 739-8527, Japan
Phone: +81-82-424-6265 E-mail:yamasaki-shogo@hiroshima-u.ac.jp
1. Introduction
In recent years, a variety of applications that require pattern matching, such as face/object recognition in images or voice recognition, have become hot topics
[1, 2]. Many users want these applications to be implemented in advanced mobile devices, such as smartphones.
Conventionally, pattern matching and other computation-intensive processes are off-loaded from the mobile device
to the Cloud, i.e., to data centers with highly parallel
servers. Due to the required data transmission between terminal and Cloud, this method involves relatively large time
delays. Furthermore, a Cloud connection isn't available
everywhere and may sometimes be disrupted. Also, total
power consumption is very high due to data transmission
and external server processing.
Important classifiers for recognition are based on the K
nearest-neighbor (k-NN) algorithm. Flexible integrated
k-NN hardware, which combines high searching speed for
the most similar data in a reference database with low
power consumption, high search reliability and accurate
recognition, can be expected to find wide application in mobile devices. Furthermore, such a VLSI chip has the potential of
becoming an application-specific standard product (ASSP)
for intelligent systems with recognition capability. Unfortunately, there is no previous example of efficient VLSI integration of the k-NN algorithm without algorithm simplification
and reduced matching accuracy. This paper reports a flexible low-power recognition SoC (System on Chip) based on
the k-NN algorithm.
2. k-NN Realization with searching by clock counting
and expansion method for feature-vector dimensions
k-NN is a method of statistical classification, based on
multiple reference samples closest in the feature space,
which is often used in pattern recognition. The number of
reference data may be very large and k-NN algorithm consistency is fairly reliable. However, the computation amount is
large because all data must be compared, which is one of the
reasons that no efficient VLSI realization of k-NN exists.
Here, we adopted the distance-search method by clock
counting, which allows fully-parallel low-power Euclidean-distance search [3, 4]. Fig. 1 shows the block diagram of the
k-NN algorithm integration with associative memory operating in the clock domain. After calculating the distances
between each component of the feature vectors for reference
and input samples, these component distances are expressed
as a number of clock cycles. During the distance search by
clock counting, match signals are received from the reference-data rows in the sequence of their distances to the input sample. In this manner, fully-parallel error-free comparison of data becomes possible. On the other hand, a disadvantage of the clock-counting search is that it takes a long
time if the distance of the most similar data is large. The
worst-case clock number for the SFCC (Straight Forward
Clock Counting) method increases exponentially with the
number N of feature-vector component bits. The previously
reported CCR (Clock Counting Reduction) method [3]
achieves reduction to a linear increase by starting the search
from the highest-value bit and including lower-value bits in
the search sequentially after a match for the higher-value
bits has been found.
Match signals are received from the reference-sample
rows in the sequence of their distances to the input sample.
By summing the number of match signals, the discovery of the K-th nearest
neighbor is recognized and the search is terminated by a stop signal. After this, the recognized class is identified by a majority vote, carried out with an identification circuit that can be freely set to the number K of interest. One evaluated concept identifies the class of each of
the K nearest neighbors by the storage location in the associative memory. However, with this concept the reference
data has to be sorted and the data number for each class becomes inflexible. Figure 2 shows the preferred architecture
for the k-NN clustering circuit, which enables complete
flexibility with respect to the number of classes, the number
of reference data for each class and the storage locations of
the reference data in the associative memory. This flexibility
is achieved by a look-up table stored in an SRAM, which can
be changed for each application and specifies the class information of the reference data in each row of the associative memory. The Match Signal Storage Registers on the
left are connected to Match Signal Detecting Circuits,
which sequentially identify the row numbers of the K nearest neighbors. For nearest-neighbor row i, an act_i signal is
generated and used to read the correct class information
from the look-up table. This class information then controls a
de-multiplexer to increase the status of the corresponding
class counter by 1. The class identification and counting process
is finished when the nextR signal is asserted at the output
of the last Match Signal Detecting Circuit. Afterwards, a
comparator at the output of the class counters identifies the
class whose counter holds the highest count as the recognition result. The associative memory rows whose stored data is not among
the K nearest neighbors are skipped in this clustering process. Therefore, the clustering process finishes with the
recognition result after K clock cycles.
The required number of feature-vector dimensions is different for each application, for example, face recognition,
character recognition, or fingerprint authentication. An architecture
that can handle a different number of dimensions in the same
hardware is required for a SoC with standardization potential.
A bit slice of the developed DEC (Dimension Extension Circuit) for enabling dimension extension is shown in Fig. 3.
With this DEC circuit, additional component distances are
added sequentially at the time of their input and stored in the
lower-right DFF. In this way, a flexible feature-vector dimensionality in the range 8~2042 dimensions is achieved.
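The clock-counting search followed by the look-up-table majority vote can be mimicked in software as below. This is a behavioural sketch under stated assumptions, not a model of the chip: `knn_clock_counting`, `distances`, and `class_lut` are hypothetical names, distances are assumed to be non-negative integers already expressed in clock cycles, and ties between rows are broken by row order.

```python
from collections import Counter


def knn_clock_counting(distances, class_lut, k):
    """Return (winning class, matched rows) for a clock-counting k-NN search.

    distances[row] is the distance of a reference row in clock cycles;
    class_lut[row] is the class stored for that row in the look-up table.
    """
    matched_rows = []
    clock = 0
    last_clock = max(distances)
    # Count clocks until K match signals have been received (stop signal).
    while len(matched_rows) < k and clock <= last_clock:
        for row, d in enumerate(distances):
            # A row raises its match signal when the clock count reaches its distance.
            if d == clock and len(matched_rows) < k:
                matched_rows.append(row)
        clock += 1
    # Each matched row increments the class counter selected by the look-up table.
    votes = Counter(class_lut[row] for row in matched_rows)
    return votes.most_common(1)[0][0], matched_rows


label, rows = knn_clock_counting([5, 1, 3, 1, 9], ["a", "b", "a", "b", "c"], k=3)
```

Because rows are only consulted through the look-up table, reference data need not be sorted by class, which reflects the flexibility argument made for the preferred architecture.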
Ahmed Medhat1, Ahmed Shalaby1, Mohammed S. Sayed1, Maha Elsabrouty1, Farhad Mehdipour2
1 ECE Department, Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt
{ahmed.abdelsalam, ahmed.shalaby, mohammed.sayed, maha.elsabrouty}@ejust.edu.eg
2 Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Fukuoka, Japan
{farhad@ejust.kyushu-u.ac.jp}
[Table residue: caption and one column header lost]
Sequence (Resolution)     [header lost]    BR (%)    PSNR (dB)
Duck (1280x720)           -55.04           0.93      -0.003
Ice (1280x720)            -46.39           0.67      0
Blue_Sky (1280x720)       -32.67           0.88      -0.011
Shields (1280x720)        -39.24           0.87      0
Stockholm (1280x720)      -28.73           0.52      0
Park_Joy (1280x720)       -45.06           1.03      0
Average                   -41.19           0.82      -0.002
TABLE II.
Search Method         BR (%)    PSNR (dB)
Belgith et al. [5]    1.10      -0.120
Anand Paul [6]        3.44      -0.183
Proposed ASWA         0.82      -0.002
Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Japan
III.
In this section, the performance of the base integer 2D-DCT
architecture is evaluated against the same architecture
with optimized SAU and with both optimized SAU and
pipelined OAU, in terms of delay and area. The architectures
are described in Verilog RTL and synthesized using Cadence
Encounter RTL Compiler RC11.10 and an ASIC CMOS 65
nm technology.
IV. CONCLUSIONS
REFERENCES
[1] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the
High Efficiency Video Coding (HEVC) Standard," IEEE Trans. on Circuits
and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, 2012.
[2] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE
Trans. Comput., vol. C-23, pp. 90-93, 1974.
[3] http://spiral.ece.cmu.edu/mcm/gen.html (last visit 15-1-2015)
[4] Y. Voronenko and M. Püschel, "Multiplierless Multiple Constant
Multiplication," ACM Transactions on Algorithms (TALG), vol. 3, issue 2,
May 2007.
[5] S. Y. Park and P. K. Meher, "Flexible Integer DCT Architectures for
HEVC," IEEE International Symposium on Circuits and Systems (ISCAS),
2013.
[6] W. Zhao, T. Onoye, and T. Song, "High-performance multiplierless
transform architecture for HEVC," IEEE International Symposium on
Circuits and Systems (ISCAS), 2013.
[7] P. K. Meher, S. Y. Park, B. K. Mohanty, K. S. Lim, and C. Yeo, "Efficient
Integer DCT Architectures for HEVC," IEEE Transactions on Circuits and
Systems for Video Technology, 2013.
[8] W. H. Chen, C. H. Smith, and S. C. Fralick, "A fast computational
algorithm for the discrete cosine transform," IEEE Trans. Commun., vol.
COM-25, no. 9, pp. 1004-1009, 1977.
[9] C.-P. Fan, "Fast 2-dimensional 4x4 forward integer transform
implementation for H.264/AVC," IEEE Trans. Circuits Syst. II, vol. 53, no.
3, pp. 174-177, 2006.
I. INTRODUCTION
Figure 1. Principle of skin resistance measuring. (a) Electrode configuration,
(b) Touch Panel sectional view and equivalent circuit, (c) Measured
impedance spectrum.
METHODOLOGY
$Z = j 2\pi f L + \frac{2}{j 2\pi f C_f} + R_f$, (1)
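To make Eq. (1) concrete, the sketch below evaluates the series impedance and the frequency at which its reactive part cancels. It assumes Eq. (1) reads Z = j2πfL + 2/(j2πfC_f) + R_f; the function names and the component values are hypothetical and chosen only for illustration.

```python
import math


def series_impedance(f, L, C_f, R_f):
    """Impedance of Eq. (1): inductor L, two series capacitances C_f, resistance R_f."""
    w = 2 * math.pi * f
    return 1j * w * L + 2 / (1j * w * C_f) + R_f


def resonance_frequency(L, C_f):
    """Frequency where Im(Z) = 0, i.e. 2*pi*f*L = 2 / (2*pi*f*C_f)."""
    return math.sqrt(2 / (L * C_f)) / (2 * math.pi)


# Illustrative values only: 1 uH, 1 nF, 100 ohm.
f_c = resonance_frequency(1e-6, 1e-9)
z_at_resonance = series_impedance(f_c, 1e-6, 1e-9, 100.0)
```

At the resonance frequency the inductive and capacitive terms cancel, so the measured impedance reduces to R_f, which is the quantity of interest under this reading of the equation.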
L ( 2 f c )
(2)
III. RESULTS
Sherif Hekal1, Adel B. Abdel-Rahman1, Ahmed Allam1, Ramesh K. Pokharel2, H. Kanaya2, and H. Jia2
1 ECE department, Egypt-Japan University of Science and Technology, Alex, Egypt, sherif.hekal@ejust.edu.eg
2 Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
introduced to obtain miniaturized structures that can be
embedded in portable and biomedical devices.
I. INTRODUCTION
The technology of wireless power transfer (WPT) has
attracted a great interest for its wide potential applications such
as RFIDs, biomedical implants, wireless buried sensors, and
portable electronic devices. The implementation methods of
WPT can be classified into two main techniques: near-field and
far-field. The short-range and mid-range WPT can be achieved
based on near field coupling, which can be divided into three
types: inductive coupling, EM resonant coupling, and strong
resonant coupling. Inductive coupling is the most popular
technique for high power transfer, and is usually applied at the
lower frequency region [1]. At higher frequencies, the resonant
type becomes a good choice. Resonant circuits focus the power
at a certain frequency, so that power transfer efficiency can be
improved [2]. Strong coupling uses intermediate resonators
with high Q-factors to increase the total efficiency of the
proposed WPT system [3]. Far field technology is usually
applied for long distance transfer based on radiated waves like
radio waves, Microwave links, or laser beams.
Figure 1. New proposed WPT system using two H-slot coupled resonators.
(a) EM simulator implementation. (b) Equivalent circuit. (c) Simulated and
measured power transfer efficiency.
$\eta = \frac{k^2 Q_d Q_l}{1 + k^2 Q_d Q_l}$ (1)
$\eta = \eta_{dt} \, \eta_{tl} = \frac{k_1^2 Q_d Q_t}{1 + k_1^2 Q_d Q_t} \cdot \frac{k_2^2 Q_t Q_l}{1 + k_2^2 Q_t Q_l}$ (2)
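The two efficiency expressions can be evaluated numerically as sketched below. The sketch assumes Eqs. (1) and (2) take the usual coupled-resonator form k²Q_aQ_b / (1 + k²Q_aQ_b), with the three-resonator efficiency being the product of the driver-to-TX and TX-to-load links; the function names and the sample values are hypothetical.

```python
def link_efficiency(k, q_a, q_b):
    """Efficiency of one coupled-resonator link: k^2 Qa Qb / (1 + k^2 Qa Qb)."""
    x = k * k * q_a * q_b
    return x / (1 + x)


def three_resonator_efficiency(k1, k2, q_d, q_t, q_l):
    """Product of the driver-to-TX and TX-to-load link efficiencies."""
    return link_efficiency(k1, q_d, q_t) * link_efficiency(k2, q_t, q_l)


# Illustrative values only (not taken from the measurements).
eta_direct = link_efficiency(0.1, 100.0, 100.0)
eta_strong = three_resonator_efficiency(0.5, 0.1, 100.0, 200.0, 100.0)
```

Each link efficiency approaches 1 as k²Q_aQ_b grows, which is why an intermediate resonator with a high Q-factor can raise the overall efficiency despite adding a second link.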
IV.
[Table residue: column headers and first row label lost]
Resonator parameters
[label lost]     20    20
Lslot (mm)       6     10
Wslot (mm)       1     1.9
LH (mm)          14    14
Figure 2. Strong resonant coupling. (a) Schematics of driver, TX, and load
resonators. (b) Bottom layer of H-slot resonator. (c) Equivalent circuit.
Figure 3. S-parameters of the proposed structure using strong resonant coupling.
Figure 4. Measured S-parameters for the proposed WPT system with/without
strong resonant coupling.
REFERENCES
[1] C.-J. Chen, T.-H. Chu, C.-L. Lin, and Z.-C. Jou, "A Study of Loosely
Coupled Coils for Wireless Power Transfer," IEEE Trans. Circuits Syst.
II Express Briefs, vol. 57, no. 7, pp. 536-540, Jul. 2010.
[2] B. L. Cannon, J. F. Hoburg, D. D. Stancil, and S. C. Goldstein,
"Magnetic resonant coupling as a potential means for wireless power
transfer to multiple small receivers," IEEE Trans. on Power Electron.,
vol. 24, no. 7, pp. 1819-1825, Jul. 2009.
[3] A. Kurs, A. Karalis, R. Moffatt, J. D. Joannopoulos, P. Fisher, and M.
Soljacic, "Wireless energy transfer via strongly coupled magnetic
resonances," Science, vol. 317, pp. 83-85, 2007.
[4] D.-J. Woo, T.-K. Lee, J.-W. Lee, C.-S. Pyo, and W. Choi, "Novel U-slot
and V-slot DGSs for bandstop filter with improved Q factor," IEEE
Trans. Microw. Theory Tech., vol. 54, no. 6, pp. 2840-2847, Jun. 2006.
[5] A. K. RamRakhyani and G. Lazzi, "On the design of efficient multicoil
telemetry system for biomedical implants," IEEE Trans. Biomed.
Circuits Syst., vol. 7, no. 1, pp. 11-23, Feb. 2013.
ACKNOWLEDGMENT
The authors thank NTT Laboratories for their experimental
support. A part of this work was supported by the Strategic
Information and Communications R&D Promotion Programme
(SCOPE) 2014, from the Ministry of Internal Affairs and
Communications, Japan, and CREST/JST.
Figure 4. Interfered light intensity detected with the photodetector.
I. INTRODUCTION
Wavelength division multiplexing (WDM) transmission technology has
driven the expansion of large-capacity optical communication. Beyond
large capacity, WDM has another advantage: it can carry differently
formatted signals with different bandwidths. Figure 1(a) shows three
lights with different wavelengths transmitted through an optical
fiber in a WDM system. In the conventional WDM system, where the
grid spacing is 50 GHz or 100 GHz, one full channel is allocated
even for small-capacity data traffic. This is illustrated in the
figure by the red band (the band actually used) and the blue band
(the allocated band). For future transmission with high spectral
efficiency, the conventional WDM channel plan would therefore be
inefficient. To increase the spectral efficiency of WDM
transmission, the ITU has defined WDM channels on a flexible
grid [1].
On the flexible grid, channels are allocated on the basis of
6.25-GHz spacing. Each channel can occupy a bandwidth of
12.5 GHz × n (n is a natural number) centered on a grid point.
Figure 1(b) shows an example in which channels with 37.5-GHz,
12.5-GHz, and 25-GHz bands are allocated on the grid. The advantage
of the flexible grid is that each channel occupies only the minimum
bandwidth needed for the signal being transported. Because the
center optical frequency must be aligned to the 6.25-GHz-spacing
grid, a flexible-grid system needs an optical source whose
optical-frequency stability is within 0.1 GHz.
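The grid arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the proposed method: the 193.1 THz anchor frequency is taken from ITU-T G.694.1 and is not stated in this paper, and the function names are hypothetical.

```python
# Minimal sketch of flexible-grid arithmetic (names are illustrative).
ANCHOR_THZ = 193.1        # assumed anchor frequency from ITU-T G.694.1
CENTER_STEP_GHZ = 6.25    # spacing of allowed center frequencies
WIDTH_STEP_GHZ = 12.5     # granularity of the channel bandwidth


def center_frequency_thz(k: int) -> float:
    """Center frequency of grid point k (k may be negative)."""
    return ANCHOR_THZ + k * CENTER_STEP_GHZ / 1000.0


def slot_width_ghz(n: int) -> float:
    """Channel bandwidth for a natural number n."""
    if n < 1:
        raise ValueError("n must be a natural number")
    return n * WIDTH_STEP_GHZ


def nearest_grid_index(freq_thz: float) -> int:
    """Index of the grid point closest to a measured frequency."""
    return round((freq_thz - ANCHOR_THZ) * 1000.0 / CENTER_STEP_GHZ)


def is_within_tolerance(freq_thz: float, tolerance_ghz: float = 0.1) -> bool:
    """True if the source sits within tolerance_ghz of its nearest grid point."""
    k = nearest_grid_index(freq_thz)
    offset_ghz = abs(freq_thz - center_frequency_thz(k)) * 1000.0
    return offset_ghz <= tolerance_ghz
```

With n = 3, 1, and 2, `slot_width_ghz` gives the 37.5-GHz, 12.5-GHz, and 25-GHz bands used in the example of Figure 1(b).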
In this paper, we propose a novel technique to accurately
monitor the optical frequency of the distributed feedback
(DFB) laser, which is the common light source for the WDM
2.9 GHz. On the other hand, Fig. 4(b) shows the spectrum of
the beat signal when Δf ≠ 0. It has two peaks, at
2.9 GHz + Δf and 2.9 GHz − Δf. The RF power measured by the RF
detector decreased as |Δf| increased, which confirmed that the
optical frequency of the laser can be stabilized by holding the
RF power at its maximum. We therefore controlled the LD
temperature so that the RF power stayed at its maximum. Figure 5
shows the Max Hold trace over thirty minutes with the feedback
control. The width of the Max Hold trace indicates the wavelength
deviation. The FWHM of the trace is less than 0.1 GHz, which
shows that the optical frequency can be stabilized within 0.1
GHz by our proposed method.
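The feedback idea, adjusting the LD temperature until the detected RF power is at its maximum, can be illustrated with a simple hill-climbing loop. This is a sketch under stated assumptions, not the controller used in the experiment: `measure_rf_power` is a hypothetical callback, the step size is arbitrary, and `toy_rf_power` is an invented single-peak response used only to exercise the loop.

```python
def hill_climb_temperature(measure_rf_power, temp_c, step_c=0.01, iterations=200):
    """Move the temperature setpoint toward the RF power maximum.

    measure_rf_power: hypothetical callback returning RF power at a setpoint.
    """
    power = measure_rf_power(temp_c)
    direction = 1.0
    for _ in range(iterations):
        candidate = temp_c + direction * step_c
        candidate_power = measure_rf_power(candidate)
        if candidate_power > power:
            # The move increased the RF power, so keep it.
            temp_c, power = candidate, candidate_power
        else:
            # The move reduced the RF power, so try the other direction.
            direction = -direction
    return temp_c


def toy_rf_power(temp_c, peak_temp_c=25.0):
    """Invented single-peak response, maximal at peak_temp_c."""
    return 1.0 / (1.0 + (temp_c - peak_temp_c) ** 2)
```

Once the setpoint reaches the peak, the loop alternates between the two neighbouring setpoints and rejects both, so the residual error is bounded by the step size.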
Fig. 2. The principle of the proposed method.
I. INTRODUCTION
Space-division multiplexing (SDM) is a promising degree
of freedom for increasing the transmission capacity, which is
rapidly approaching its fundamental limit in single mode fibers
[1]. Few-mode fibers (FMFs) are attractive channels for
SDM techniques. However, the nonlinear interaction between
different propagation modes in FMFs is a major source of
performance limitation and must be understood before it can be
mitigated. Few analytical models have been developed for
nonlinear propagation in multi-mode fibers [2], [3]. In this
paper, we extend the GN-model developed for single mode
fibers [4] to capture the impact of the different nonlinearities
in FMFs. In [3], a general integral formula for the cross-modal
nonlinear interaction was proposed for multimode fibers.
In this work, by contrast, a simple closed-form expression (with
less computational complexity) for the nonlinear capacity of
FMFs is derived for the weak linear coupling regime
among the different spatial modes. In addition, the effect of
the different nonlinearity penalties for various constellation
orders is investigated.
II. PROPOSED GN-MODEL FOR FEW-MODE FIBERS
The signal propagation of mode p in an FMF has already been
described in [5]. It is divided into a linear part
(dispersion + attenuation) and a nonlinear part, given by
$N_p = j\gamma \left( \frac{8}{9} f_{pppp} |A_p|^2 + \frac{4}{3} \sum_{h \neq p} f_{pphh} |A_h|^2 \right)$.
Here $A_p$ is the field envelope of mode p, $\gamma$ is the fiber
nonlinearity coefficient, $f_{pppp}$ is the intra-modal nonlinear
coefficient tensor of mode p, and $f_{pphh}$ is the inter-modal
nonlinear coefficient tensor between the p and h spatial modes.
The calculated values of these tensors have been reported in [1].
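As a numerical illustration of the nonlinear part, the sketch below evaluates N_p for one mode, assuming the Manakov-type form with the factors 8/9 (intra-modal) and 4/3 (inter-modal) and a sum over all modes h other than p. The function name and the two-dimensional list `f`, where `f[p][p]` stands for f_pppp and `f[p][h]` for f_pphh, are illustrative assumptions rather than notation from the paper.

```python
def nonlinear_part(p, envelopes, f, gamma):
    """Nonlinear part N_p for mode p.

    envelopes: complex field envelopes A_h, one per spatial mode.
    f: f[p][p] plays the role of f_pppp, f[p][h] that of f_pphh (assumed layout).
    gamma: fiber nonlinearity coefficient.
    """
    intra = (8.0 / 9.0) * f[p][p] * abs(envelopes[p]) ** 2
    inter = (4.0 / 3.0) * sum(
        f[p][h] * abs(a) ** 2 for h, a in enumerate(envelopes) if h != p
    )
    return 1j * gamma * (intra + inter)
```

The intra-modal term depends only on the power in mode p, while the inter-modal term grows with the power carried by every co-propagating mode.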
The GN-model for single mode fibers assumes that the nonlinearity source can be modeled as an additive Gaussian noise
which is statistically independent from both the amplifier noise
and the transmitted signal [4]. Also, it assumes the transmitted
signal as a wavelength-division multiplexed (WDM) comb
signal with Nch channels. These assumptions can be extended
for FMFs based on the fact that the interaction between
any two orthogonal polarization modes is equivalent to that
[The capacity expression C and equation (2) are not recoverable from the extracted text.]
Fig. 1: Capacity versus fiber maximum reach at different nonlinear penalties for a FMF of Ls = 100 km. (Axes: capacity (b/symbol) versus L (km).)
[Figure: capacity (b/symbol) versus P (dBm) for 4QAM, 16QAM, and 64QAM in panels (a) to (c), showing the linear Shannon limit, the nonlinear Shannon limit, and the intramodal, intermodal, and total penalties.]
[1] I. Kaminow, T. Li, and A. E. Willner, Eds., Optical Fiber Telecommunications Volume VIB, Sixth Edition: Systems and Networks, 6th ed.
Amsterdam; Boston: Academic Press, May 2013.
[2] F. Ferreira, S. Jansen, P. Monteiro, and H. Silva, "Nonlinear semi-analytical model for simulation of few-mode fiber transmission," Photonics Technology Letters, IEEE, vol. 24, no. 4, pp. 240-242, 2012.
[3] G. Rademacher, S. Warm, and K. Petermann, "Analytical description of
cross-modal nonlinear interaction in mode multiplexed multimode fibers,"
Photonics Technology Letters, IEEE, vol. 24, no. 21, pp. 1929-1932,
2012.
[4] A. Carena, V. Curri, G. Bosco, P. Poggiolini, and F. Forghieri, "Modeling
of the impact of nonlinear propagation effects in uncompensated optical
coherent transmission links," Journal of Lightwave Technology, vol. 30,
no. 10, pp. 1524-1539, 2012.
[5] S. Mumtaz, R. Essiambre, and G. P. Agrawal, "Nonlinear propagation
in multimode and multicore fibers: Generalization of the Manakov equations," Lightwave Technology, Journal of, vol. 31, no. 3, pp. 398-406,
2013.
[6] A. Mecozzi, C. Antonelli, and M. Shtaif, "Nonlinearities in space-division
multiplexed transmission," in Optical Fiber Communication Conference
and Exposition and the National Fiber Optic Engineers Conference
(OFC/NFOEC), 2013, Mar. 2013, pp. 1-3.
[7] A. E. El-Fiqi, A. Ismail, Z. A. El-Sahn, H. M. H. Shalaby, and R. K.
Pokharel, "Evaluation of nonlinear interference in few-mode fiber using
the Gaussian noise model," submitted to CLEO-2015 (unpublished).
Electronics and Communication Engineering, Egypt-Japan University of Science and Technology (EJUST), New Borg ElArab City, Alexandria, Egypt, {sawsan.abdelsalam, maha.elsabrouty} @ejust.edu.eg
I. INTRODUCTION
efficiency of embedding a perceptual-based weighting strategy
into the CS framework for intra-frames. It mainly exploits the
structural sparsity of the 2D-DCT transform and focuses the
measurements and recovery on the low-frequency 2D-DCT
coefficients. In this paper, we propose a further
improvement that exploits the inter-frame correlation among
successive frames. Residual-based recovery is used here
because it performs well by recovering only the residual
between the required frame and its predicted frame [8, 9].
Combining residual-based recovery with our perceptual-based
recovered frames is expected to further improve the
performance over other works in the literature.
II. METHODOLOGY
[Equations (1) to (3), constrained minimizations of the form "... s. t. ...", are not recoverable from the extracted text; the surviving definition states that the bound in the constraint is a small tolerance error.]
Then, non-key frames predictions (a.k.a side information
III.
[Plots: panels (a) and (b) versus average measurement rate per frame (%), comparing Standard L1-min, intra-PercCS, and Resid-PercDCVS. Equations (4) and (5) are not recoverable from the extracted text.]
Figure 2: Reconstruction time for (a) News and (b) Foreman sequences
ACKNOWLEDGMENT
This work has been supported by the Egyptian Mission of
Higher Education (MoHE) and Egypt-Japan University of
Science and Technology (E-JUST).
ECE Department, Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt
{elsayed.elgendy, ahmed.shalaby, mohammed.sayed}@ejust.edu.eg
Abstract: HEVC has adopted Sample Adaptive Offset (SAO) as
a new in-loop filter block. SAO can significantly improve coding
efficiency; however, it requires intensive operations to find the
best SAO parameters for each CTB. Real-time and low-power video
encoders still require a more efficient SAO encoding algorithm. In
this paper, the statistics of SAO modes are explored and the
frequently used types are analyzed. Moreover, the effect of
eliminating rarely used modes on video quality is studied in
terms of PSNR. Based on this statistical analysis, we
propose an algorithm that reduces the SAO encoding time by
40.6% with only a 0.05% YUV PSNR reduction on average.
I. INTRODUCTION
The rapidly increasing demand for high-resolution video
has driven the development of effective compression techniques.
High Efficiency Video Coding (HEVC) was jointly developed
by MPEG and VCEG as a video compression standard that
aims to reduce the bit rate by 50% in comparison with
H.264/MPEG-4 AVC at the same quality [1]. HEVC
adopted the in-loop filter in its main profile to reduce artifacts
generated by quantization and block-based processes, such as
blocking artifacts, color biases, and ringing artifacts. It is used
on both sides, the encoder and the decoder. The in-loop filter
consists of three main blocks: Deblocking Filter
(DBF), Sample Adaptive Offset (SAO), and Adaptive Loop
Filter (ALF). The SAO filter aims to improve visual quality by
suppressing ringing artifacts near object edges. It reduces
the mean sample distortion by adaptively adding an offset value
to each sample [2]. Although the SAO encoding time is lower
than that of other coding modules, real-time video encoding still
requires efficient algorithms for SAO encoding.
TABLE I. EO CATEGORIES
Category  Offset sign  Selection logic                             Condition
1         Positive     C < 2 neighbors                             C < A && C < B
2         Positive     C < 1 neighbor and C = the other neighbor   (C < A && C = B) || (C = A && C < B)
3         Negative     C > 1 neighbor and C = the other neighbor   (C > A && C = B) || (C = A && C > B)
4         Negative     C > 2 neighbors                             C > A && C > B
0         None         C > 1 neighbor and C < the other neighbor   None of the above
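The conditions in the EO category table translate directly into a small classifier. The sketch below is illustrative only: the function name is hypothetical, C is the current sample, A and B are its two neighbors along the chosen edge direction, and the negative-offset row between categories 2 and 4 is taken to be category 3.

```python
def eo_category(c: int, a: int, b: int) -> int:
    """Edge-offset category of sample c given its two neighbors a and b."""
    if c < a and c < b:
        return 1  # below both neighbors, positive offset
    if (c < a and c == b) or (c == a and c < b):
        return 2  # below one neighbor, equal to the other, positive offset
    if (c > a and c == b) or (c == a and c > b):
        return 3  # above one neighbor, equal to the other, negative offset
    if c > a and c > b:
        return 4  # above both neighbors, negative offset
    return 0      # none of the above, no offset applied
```

For example, a sample of value 3 between neighbors 5 and 3 falls in category 2, while a sample lying strictly between its neighbors falls in category 0.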
TABLE II. EXPERIMENTAL CONDITIONS
Item                                      Encoding configuration
Anchor                                    HM 16.2 main configuration
Prediction structure (GOP structure)      All intra (AI), Random access (RA), Low delay with P picture (LP), Low delay with B picture (LB)
Target sequences                          6 sequences of 6 classes
Operating system                          Windows 7
Machine specification                     Intel Xeon X5690 at 3.46 GHz CPU and 96 GB RAM
Metric               SAO ON    SAO BO OFF  Reduction  Relative percent
SAO execution time   230       136.7       93.3       40.6%
Y_PSNR               34.2011   34.1833     -0.0178    -0.05%
U_PSNR               41.3052   41.2858     -0.0194    -0.05%
V_PSNR               41.9512   41.9161     -0.0351    -0.08%
YUV_PSNR             35.4638   35.4463     -0.0175    -0.05%
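The reduction and relative-percent figures follow from simple arithmetic on the first two result columns, which the short sketch below reproduces. The function name is illustrative, and the relative percentage is assumed to be taken with respect to the SAO ON value.

```python
def reduction(sao_on: float, sao_bo_off: float):
    """Return (absolute reduction, reduction as a percentage of the SAO ON value)."""
    diff = sao_on - sao_bo_off
    return diff, 100.0 * diff / sao_on


time_diff, time_pct = reduction(230, 136.7)          # SAO execution time
psnr_diff, psnr_pct = reduction(35.4638, 35.4463)    # YUV_PSNR
```

Rounded to the precision used in the results, these reproduce the 93.3 and 40.6% entries for the execution time and the 0.0175 and 0.05% entries for YUV_PSNR.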
I. INTRODUCTION
Practical implementation of digital communication and signal processing systems requires representing the signal in finite
precision form. In fact, studying quantization effects and
fixed-point representation is one of the key bridges that link
theoretical signal processing algorithms to their implementation
platforms. In the extreme case of 1-bit quantized CS,
only the signs of the measurements are kept. More specifically,
in analog-to-digital (A/D) conversion, acquiring 1-bit
measurements of an analog signal requires only a comparator
with a reference voltage of zero, which can be implemented using
inexpensive and fast hardware that is robust to amplification
of the signal and other errors.
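A minimal sketch of 1-bit measurement acquisition is given below to make the "sign only" idea concrete. It is an illustration under stated assumptions: the random Gaussian measurement rows and the function name are not taken from the paper, and only the sign of each projection is kept, as a zero-reference comparator would do.

```python
import random


def one_bit_measurements(signal, num_measurements, seed=0):
    """Keep only the sign (+1 or -1) of each random linear measurement."""
    rng = random.Random(seed)
    signs = []
    for _ in range(num_measurements):
        # One row of an assumed Gaussian measurement matrix.
        row = [rng.gauss(0.0, 1.0) for _ in signal]
        projection = sum(r * s for r, s in zip(row, signal))
        # Comparator with reference voltage zero.
        signs.append(1 if projection >= 0.0 else -1)
    return signs


y = one_bit_measurements([0.0, 1.0, 0.0, -0.5], 16)
y_amplified = one_bit_measurements([0.0, 2.0, 0.0, -1.0], 16)
```

Here `y` and `y_amplified` are identical, which illustrates the robustness to amplification mentioned above: scaling the signal by a positive gain does not change any sign.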
The objective of this paper is to clarify the effect of one-bit CS in audio signal compression for digital transmission
systems and to evaluate the achievable performance in terms of
Mean Opinion Score (MOS)
The resulting samples are divided into frames; each frame
contains 1321 samples and lasts 30 ms.
One faithful measure of the perceptual quality of an audio signal
is the Perceptual Evaluation of Speech Quality (PESQ). The
output of this test is called the Mean Opinion Score (MOS). A signal
with a higher MOS is less distorted and more similar to
the original signal. The simulations were run on a Panasonic
workstation with an Intel Core i7 processor at 2.9 GHz.
$\hat{f} = \arg\min_{f} \|f\|_{1}$
(2)
s.t yCS = Q(Hf ) + 1
TABLE I.
Multi-bit recovery algorithms (columns in extraction order):
Signal        Bits    IRl1     l1 Stand.  IRl1       IRl1 With Perc  FPC [2]  FPC [2]
                      No Perc  With Perc  With Perc  and Const. H    No Perc  With Perc
BachHymn      2-Bits  1.1      2.06       1.35       1.38            1.45     1.91
BachHymn      4-Bits  1.21     3.39       1.25       2.36            1.24     2.31
Piano         2-Bits  0.73     1.6        0.78       0.84            1.07     1.58
Piano         4-Bits  0.77     2.75       0.88       2.2             1.11     1.6
Folk Music    2-Bits  0.92     1.21       0.93       0.97            0.56     0.8
Folk Music    4-Bits  0.14     2.32       0.3        1.52            0.8      0.92
Bach Partita  2-Bits  0.33     1.65       0.7        1.41            0.84     1.41
Bach Partita  4-Bits  0.62     2.84       0.84       2.36            0.9      1.74

1-Bit FPC [4]:
Signal        No Perc  With Perc
BachHymn      0.14     1.64
Piano         1.23     1.85
Folk Music    0.01     0.59
Bach Partita  0.8      1.14
IV. CONCLUSION
Egypt-Japan
University of Science and Technology (E-JUST), New Borg Al-Arab, Alexandria, Egypt
Email: {ahmed.emran, maha.elsabrouty}@ejust.edu.eg
Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Fukuoka-shi, Fukuoka, Japan
Email: muta@ait.kyushu-u.ac.jp
Graduate School of Information Science and Electrical Engineering, Kyushu University
Email: furuhiro@ait.kyushu-u.ac.jp
Abstract: Developing a fast converging LDPC code is desirable in many state-of-the-art systems that require high throughput
and low error rate. In addition, quantization of LDPC codes is
one of the important aspects of practical LDPC implementation.
In this paper, we combine both targets and propose a highly
efficient scaled min-sum layered implementation. The work
jointly optimizes the scaling factor of the LDPC decoder along
with the quantization step to provide optimized performance. The
simulation results show the performance improvement of
using optimal scaling with floating point (without quantization).
They also show the overall performance enhancement of using
both optimal scaling and quantization parameters compared with
previous literature results.
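For readers unfamiliar with scaled min-sum decoding, the sketch below shows a generic check-node update with an optional uniform quantizer. It is not the authors' optimized decoder: the layered schedule is omitted, and the scaling factor, quantization step, and clipping level are free parameters whose names are chosen here for illustration.

```python
def quantize(value, step, max_level):
    """Uniform quantizer: round to a multiple of step and clip to +/- max_level steps."""
    level = round(value / step)
    level = max(-max_level, min(max_level, level))
    return level * step


def scaled_min_sum_check_node(incoming, scale, step=None, max_level=None):
    """Scaled min-sum check-node update for one check node.

    incoming: variable-to-check messages (LLRs), at least two of them.
    Each outgoing message uses the signs and the minimum magnitude of the others.
    """
    outgoing = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        magnitude = min(abs(v) for v in others)
        message = scale * sign * magnitude
        if step is not None:
            message = quantize(message, step, max_level)
        outgoing.append(message)
    return outgoing
```

With incoming messages [2.0, -1.0, 4.0] and a scaling factor of 0.75, the unquantized outputs are [-0.75, 1.5, -0.75]; a coarse quantizer then clips the largest magnitude, which is the kind of interaction between scaling and quantization that a joint optimization has to account for.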
I. INTRODUCTION
Fig. 1. BER and WER of the long code with rate 2/3 modulated by (a) BPSK
and (b) 256-QAM (horizontal axis: Eb/N0 in dB).
The gap between our results and the floating-point performance is about 0.1 dB for BPSK and about 0.2 dB
for 256-QAM.
V. CONCLUSION
REFERENCES
[1] ETSI, "EN 302 307: Digital Video Broadcasting (DVB); Second generation framing structure, channel coding and modulation systems for
Broadcasting, Interactive Services, News Gathering and other broadband
satellite applications," vol. 1, p. 2, 2006.
[2] D. J. MacKay, "Good error-correcting codes based on very sparse
matrices," Information Theory, IEEE Transactions on, vol. 45, no. 2,
pp. 399-431, 1999.
[3] M. M. Mansour and N. R. Shanbhag, "High-throughput LDPC decoders,"
Very Large Scale Integration (VLSI) Systems, IEEE Transactions on,
vol. 11, no. 6, pp. 976-996, 2003.
[4] X. Zhang and P. H. Siegel, "Quantized min-sum decoders with low error
floor for LDPC codes," in Information Theory Proceedings (ISIT), 2012
IEEE International Symposium on. IEEE, 2012, pp. 2871-2875.
[5] R. Zarubica, R. Hinton, S. G. Wilson, and E. K. Hall, "Efficient
quantization schemes for LDPC decoders," in Military Communications
Conference, 2008. MILCOM 2008. IEEE. IEEE, 2008, pp. 1-5.
[6] C. Marchand, L. Conde-Canencia, and E. Boutillon, "Architecture and
finite precision optimization for layered LDPC decoders," Journal of
Signal Processing Systems, vol. 65, no. 2, pp. 185-197, 2011.
[7] A. A. Emran and M. Elsabrouty, "Generalized simplified variable-scaled min sum LDPC decoder for irregular LDPC codes," in Personal
Indoor and Mobile Radio Communications (PIMRC), 2014 IEEE 25th
International Symposium on. IEEE, 2014, pp. 892-896.
I. Introduction
Time and frequency synchronization is essential for
accurate data communication in OFDMA systems with a large
Doppler shift [1]. A transmitted signal in an OFDMA system
consists of a reference block and a data block. The Schmidl-Cox
(S&C) method [2], in which the reference block consists of two
identical parts, is used as a coarse synchronization method. The
reference block of the Shi and Serpedin (S&S) method [3] consists
of four identical parts. A separate paper [4] shows that the
optimal number of partitions for a multi-path fading channel
with a Doppler shift is approximately given by M = N/2 for
60 ≤ N ≤ 240, where M is the number of partitions and N is the
length of the reference block. In this paper, we investigate
how many parts the reference block should be divided into to
maximize the synchronization performance in a multi-path
environment with Doppler shift. The effects of M are investigated
for N = 480, 960.
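To make the two-part (S&C) case concrete, the sketch below computes the classical Schmidl-Cox timing metric, the squared magnitude of the correlation between the two halves normalized by the squared energy of the second half. It illustrates only the M = 2 baseline from [2], not the multi-partition metric studied in this paper; the function name is illustrative.

```python
def schmidl_cox_metric(samples, half_length):
    """Timing metric |P(d)|^2 / R(d)^2 for every candidate start index d.

    samples: received complex samples.
    half_length: length of each of the two identical halves.
    """
    metrics = []
    last_start = len(samples) - 2 * half_length
    for d in range(last_start + 1):
        # Correlation between the first and second half of the window.
        p = sum(
            samples[d + m].conjugate() * samples[d + m + half_length]
            for m in range(half_length)
        )
        # Energy of the second half of the window.
        r = sum(abs(samples[d + m + half_length]) ** 2 for m in range(half_length))
        metrics.append(abs(p) ** 2 / r ** 2 if r > 0 else 0.0)
    return metrics
```

In a noiseless channel the metric equals 1 at the start of the reference block. A constant frequency offset only rotates the correlation P(d) and leaves its magnitude unchanged, which is why this family of metrics suits coarse timing under Doppler shift.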
II. Timing Synchronization
A. Channel model
A multi-path fading channel with a Doppler shift can be modelled as follows. Consider a discrete-time, time-invariant
system with an impulse response h whose taps are indexed 0, 1, ..., LT − 1, where
LT denotes the maximum multi-path delay. Let the normalized frequency offset N fD Ts
take values in {0, 1, ..., N − 1}, where fD is
the Doppler frequency, Ts is the sampling interval, and N is the size
of the Discrete Fourier Transform (DFT) for an OFDMA which
This work is supported by Japan Society for the Promotion of Science (JSPS)
KAKENHI Grant Number 25820162
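[Equations (3) and (4), the timing metric, are not fully recoverable from the extracted text. Surviving terms: equation (3) is a sum over k from 1 to F − 1 of the magnitude of a k-indexed term; equation (4) defines that term as a double sum, starting at i = 0 and n = 0, of d_i d_{i+k} multiplied by products of received samples r spaced (i+k)L − iL = kL apart.]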
The parameter value that attains the maximum of the metric is
selected as the estimate, i.e.,
[Figures: two plots with M on the horizontal axis (10 to 60), one curve for each of Lt = 1, ..., 10.]

[Table, caption lost]
LT    N = 480   N = 960
1     12        12
2     12        12
3     15        12
4     8         12
5     8         8
6     6         8
7     8         8
8     5         4
9     4         6
10    5         5

[Table, caption lost]
LT    N = 480   N = 960
1     a         a
2     a         a
3     a         a
4     a         b
5     a         a
6     b         a
7     c         b
8     c         b
9     b         b
10    b         c

IV. Conclusion

References
[1] M. Morelli, I. Scott, C.-C. J. Kuo, and M.-O. Pun, "Synchronization Techniques for Orthogonal Frequency Division Multiple Access (OFDMA): A
Tutorial Review," Proc. of the IEEE, Vol.95, pp.1394-1427, 2007.
[2] T. M. Schmidl and D. C. Cox, "Robust Frequency and Timing Synchronization for OFDM," IEEE Trans. Commun., Vol.45, No.12, pp.1613-1621,
Dec. 1997.
[3] K. Shi and E. Serpedin, "Coarse Frame and Carrier Synchronization of
OFDM Systems: A New Metric and Comparison," IEEE Trans. Wireless
Commun., Vol.3, No.4, pp.1271-1284, July 2004.
[4] T. Higuchi and Y. Jitsumatsu, "Design Criteria of Preamble Sequence
for Multipath Fading Channels with Doppler Shift," 17th Int. Symp. on
Wireless Personal Multimedia Commun. (WPMC2014), 2014.
[5] M. Ruan, M. C. Mark, and Z. Shi, "Training Symbol Based Coarse Timing
Synchronization in OFDM Systems," IEEE Trans. Wireless Commun., Vol.8,
No.5, pp.2558-2569, 2009.
I. INTRODUCTION
Fig. 2: An example of the transmitted signal of a hybrid DQPSK-MPPM scheme with M = 4 and n = 2.
One of the most important issues in many optical communications systems is the receiver sensitivity. Indeed, when
the receiver sensitivity increases, fewer signal photons per bit need to be transmitted to achieve a given bit-error
rate (BER) [1]. One of the preeminent modulation schemes for
increasing the receiver sensitivity in optical communications
systems is direct-detection differential quadrature phase shift
keying (DD-DQPSK) [2]. DQPSK is one of the most popular
schemes for multilevel phase-modulated optical communications systems and is more bandwidth efficient than differential
binary phase shift keying (DBPSK), at the price of
increased complexity. DD-DQPSK can be demodulated using
an optical delay demodulator, which avoids the need for an
optical local oscillator [2] and significantly simplifies the receiver implementation. In this paper
we propose a hybrid differential quadrature phase shift keying-multipulse pulse-position modulation (DQPSK-MPPM) technique, assuming optical amplifier-noise limited systems, in an
attempt to further increase the receiver sensitivity of optical
communications systems. The key idea is to use DQPSK
on top of an energy efficient modulation scheme, e.g., MPPM,
in order to gain the advantages of both schemes. It turns out
that the proposed system improves on the performance of
traditional DBPSK, DQPSK, and MPPM techniques.
A block of $\lfloor \log_2 \binom{M}{n} \rfloor + 2n$ bits is transmitted in each time frame as
follows. The first $\lfloor \log_2 \binom{M}{n} \rfloor$ bits are encoded using the MPPM
scheme. These bits identify the positions of the n pulses
within the frame. Each MPPM optical pulse is then DQPSK
modulated using an additional two bits. That is, compared
with traditional DQPSK, instead of transmitting a consecutive
stream of DQPSK pulses (each with a relatively low power),
we transmit fewer, higher-power DQPSK pulses. The
positions of these pulses within the frames are identified using
more data bits. An example of the transmitted signal of a
hybrid DQPSK-MPPM scheme with M = 4 and n = 2 is
shown in Fig. 2.
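The bit budget per frame can be checked with a short calculation. The sketch below assumes the usual MPPM convention that the position bits are the floor of the base-2 logarithm of the binomial coefficient "M choose n" (the brackets are not legible in the extracted text), plus two DQPSK bits for each of the n pulses. The function names are illustrative.

```python
from math import comb, floor, log2


def mppm_bits(M: int, n: int) -> int:
    """Bits carried by the choice of n pulse positions among M slots (floor assumed)."""
    if not 0 < n <= M:
        raise ValueError("require 0 < n <= M")
    return floor(log2(comb(M, n)))


def hybrid_bits_per_frame(M: int, n: int) -> int:
    """Position bits plus two DQPSK bits on each of the n pulses."""
    return mppm_bits(M, n) + 2 * n
```

For the example with M = 4 and n = 2, there are 6 possible position patterns, which carry 2 bits, and the two pulses carry 4 DQPSK bits, giving 6 bits per frame.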
At the receiver side, the received signal is first split into
two branches using a 3-dB coupler, Fig. 3. The lower branch
is a traditional direct-detection MPPM receiver that
identifies the positions of the received n pulses within
the frame. In the upper branch, the DQPSK data is directly
detected.
In the upper branch, DD-DQPSK demodulation requires
the received optical signal to be split through two asymmetric
interferometers with a phase difference of π/2 [2]. As shown in
the figure, the received optical signal is further divided into two
parts, one of which is variably delayed depending on the positions
of the previous and current signal slots being compared. If
the previous and current signal slots being compared lie
in the same frame, the delay is (m2 − m1) slot durations, where m1 ∈
{0, 1, ..., M − 2} and m2 ∈ {m1 + 1, m1 + 2, ..., M − 1}
$\ldots + (1 - \mathrm{SER}_{MPPM}) \, 2n \, \mathrm{BER}_{DQPSK}$.
(1)
IV. CONCLUSION
A hybrid DQPSK-MPPM modulation technique has been
proposed for high sensitivity optical communications systems.
[Figures: two plots on a logarithmic vertical axis versus average received optical power (Pav) in dBm; captions not recoverable from the extracted text.]
Department of Electrical Engineering and Computer Science, School of Engineering, Kyushu University
Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University
Alcatel-Lucent Bell N.V Belgium
Graduate School of Information Science and Electrical Engineering, Kyushu University
744 Motooka, Nishi-ku, Fukuoka-shi, Fukuoka-ken, 819-0395 Japan
Abstract: In multicarrier modulation systems such as orthogonal
frequency division multiplexing (OFDM), reducing the high
peak-to-average power ratio (PAPR) is a challenging problem.
Recently, adaptive peak cancellation was proposed to reduce
the out-of-band leakage power as well as the in-band distortion
power (EVM), while keeping the transmitted PAPR below a predetermined, permissible value. In this paper, we evaluate and
discuss the performance of MIMO-OFDM systems using eigenbeam SDM in terms of bit error rate (BER), complementary
cumulative distribution function (CCDF), and the system's computational complexity. Our computer simulation results confirm
the effectiveness of adaptive peak cancellation under the
restriction of out-of-band power radiation.
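As background for the PAPR and CCDF metrics named above, the following sketch computes the PAPR of one OFDM symbol and an empirical CCDF over a set of PAPR values. It is a generic illustration, not the peak cancellation scheme: it uses a plain inverse DFT at the Nyquist rate (no oversampling, no MIMO processing), and all function names are chosen here for illustration.

```python
import cmath
import math


def idft(freq_symbols):
    """Inverse DFT: map subcarrier symbols to time-domain samples."""
    n = len(freq_symbols)
    return [
        sum(x * cmath.exp(2j * math.pi * k * t / n) for k, x in enumerate(freq_symbols)) / n
        for t in range(n)
    ]


def papr_db(time_samples):
    """Peak-to-average power ratio of one symbol, in dB."""
    powers = [abs(x) ** 2 for x in time_samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def ccdf(papr_values_db, threshold_db):
    """Fraction of symbols whose PAPR exceeds the threshold."""
    return sum(1 for v in papr_values_db if v > threshold_db) / len(papr_values_db)
```

When all 8 subcarriers carry the same symbol, the time-domain signal collapses to a single peak and the PAPR reaches 10 log10(8), about 9 dB, whereas a single active subcarrier gives a constant envelope with 0 dB PAPR.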
Fig. 4. Bit error rate performance of OFDM system with PAPR reduction.
[Fig. 1: block diagram of the transmitter and receiver. Surviving block labels: D/A, BPF, power amplifier, down-converter (IF to BB), ADC, MLSE, LPF.]
I. INTRODUCTION
Radio relay transmission is a promising technique for constructing broadband wireless communication networks, where the
communications traffic from/to the base nodes connected to
the relay node is handed to/from a wire-line system. In such a
network, it is desirable to improve the power efficiency of the power
amplifier and to reduce the hardware complexity of the analog
receiver circuits; the analog-to-digital (A/D) converter (ADC) and
related analog hardware designs are important factors in simplifying
the transceiver circuits at each node.
Single carrier transmission is a promising candidate for power
efficient wireless communications, since it achieves a lower
peak-to-average power ratio (PAPR), which improves the power
efficiency of the transmit power amplifier. Because of its
low PAPR characteristics, single carrier offset quadrature
amplitude modulation (OQAM) is attractive. On the other hand,
to reduce the analog hardware complexity of the receiver circuits,
the required ADC resolution must be minimized
while mitigating the influence of quantization errors, i.e., there
is a trade-off between hardware complexity and
nonlinearity at the ADC. To mitigate the influence of nonlinearity
in a low resolution ADC, we have proposed two nonlinearity
mitigated A/D conversion techniques, i.e., the dither ADC and
the hysteresis ADC [1].
In this paper, we investigate the effect of the nonlinearity mitigated A/D conversion techniques on the achievable performance
of single carrier OQAM systems, where the dither ADC and the
hysteresis ADC [1] are adopted on the receiver side. In addition,
we extend the proposed ADCs to the multi-bit quantization case and
evaluate their performance. These techniques are expected
to mitigate the performance degradation caused by the nonlinearity
of a low resolution ADC.
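The general principle behind dithering a 1-bit quantizer can be shown with a small numerical experiment. This is a textbook illustration and not the dither ADC or hysteresis ADC circuit of [1]: it assumes a uniform dither on [-A, A] added before a sign comparator, with the 1-bit outputs averaged over many samples. All names are illustrative.

```python
import random


def dithered_one_bit_average(x, dither_amplitude, num_samples, seed=0):
    """Estimate x from averaged 1-bit decisions on x plus uniform dither.

    For |x| <= dither_amplitude, the mean of sign(x + d) with d uniform on
    [-A, A] equals x / A, so scaling the average by A recovers x.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(num_samples):
        dither = rng.uniform(-dither_amplitude, dither_amplitude)
        total += 1 if x + dither >= 0.0 else -1
    return dither_amplitude * total / num_samples


estimate = dithered_one_bit_average(0.3, 1.0, 20000)
```

Without dither the comparator would output +1 for every positive input, so the amplitude information would be lost; with dither and oversampling, the average of the 1-bit outputs tracks the input amplitude.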
[Fig. 2: diagram with parallel candidate branches (#1, #2, ...), each comprising an estimated channel, an ADC model, demodulation, and a square error calculation.]
A. System Model
Figure 1 shows block diagrams of the transmitter and the
receiver considered in this paper. We consider offset quadrature amplitude modulation (OQAM) as a power efficient single
carrier transmission scheme, where the transmit signal is bandlimited by a pulse shaping filter whose frequency transfer
function is a square root of raised cosine roll-off function with
[Figs. 3 and 4 (captions lost): circuit diagrams of (a) the dither ADC and (b) the hysteresis ADC. Surviving labels: input, output, comparator, Z^-1 delay, digital MUX, switching circuit.]
TABLE I
SIMULATION PARAMETERS.
Modulation                    OQPSK, O16QAM
Demodulation                  Coherent detection
Channel model                 6-path Rayleigh fading
ADC sampling frequency        f = 32 fs
Number of quantization bits   Q = 1, 2, 3
[Figure: BER curves comparing ideal quantization with 1-bit and 2-bit ADCs; axis values and caption not recoverable from the extracted text.]
IV. CONCLUSION
This paper investigated the achievable performance of single
carrier OQAM systems using the enhanced ADCs, where two
dither based ideas are utilized to mitigate the nonlinearity of
ADC. Simulation results prove the effectiveness of the enhanced
ADC in single carrier OQAM systems.
REFERENCES
[1] D. Kanemoto, O. Muta, et al., "Linearity enhancement technique for one bit
A/D converter in wireless communication devices," ISCE2014, June 2014.
[2] N. Yamasaki, "The application of large amplitude dither to the quantization
of wide range audio signals," J. Acoust. Soc. Jpn. (E), 1983.
[3] R. Jacob Baker, CMOS Circuit Design, Layout, and Simulation, Third
Edition, Wiley-IEEE Press, 2010.
[4] E. M. Mohamed, O. Muta, and H. Furukawa, "Adaptive Channel Estimation for MIMO-Constant Envelope Modulation," IEICE Trans. Commun.,
Vol.E95-B, No.07, pp.2393-2404, July 2012.
The Third International Japan-Egypt Conference on Electronics, Communications and Computers (JEC-ECC 2015)