
Low-Power and High-Performance for Cryptosystem Using Power Aware and Pipeline Techniques



Minh-Tung DAM
Center of Electrical Engineering
DUYTAN University, Danang, Vietnam
damminhtung@dtu.edu.vn

Van-Cuong NGUYEN
Faculty of Electronics & Telecommunications
DANANG University of Technology, Vietnam
ngvancuong2000@gmail.com

Trong-Tuan NGUYEN
Danang, Center of IC (CENTIC) - Vietnam
tuannt@centic.vn

Thang-Dong TRAN LE
Center of Electrical Engineering
DUYTAN University, Danang, Vietnam
tranthangdong@duytan.edu.vn

Abstract: It has been a decade since the National Institute of
Standards and Technology (NIST) selected the Rijndael
algorithm as the Advanced Encryption Standard (AES). Since
then, AES has been the block cipher standard of the US
government. In recent years, the technological trend has shifted
towards power aware system design, so low power AES
architectures have gained importance over area and performance
oriented designs. In this study, we propose a low power design,
called the power aware technique, for a cryptosystem. It reduces
power consumption while promising to enhance the level of
security and to achieve the high performance needed by
applications with real-time requirements.
Keywords: AES, Low Power Design, SHA, Cryptosystem
I. INTRODUCTION
Cryptographic algorithms provide security services in
applications ranging from wireless networks to smart grids,
each with specific area and performance needs. Until modern
times, cryptography referred almost exclusively to encryption,
which is the process of converting ordinary information
(plaintext) into unintelligible ciphertext. Decryption is the
reverse, in other words, moving from the unintelligible
ciphertext back to plaintext. In November 2001, with some
minor changes, the block cipher Rijndael took the name AES
[1] and was announced as the new block cipher standard of the
US government. AES replaces the aged DES [2] (Data
Encryption Standard) algorithm, which dates back to the 1970s.
As AES is a standardized encryption algorithm and is considered
secure, it has become the default choice in numerous
applications. With its longer and flexible key sizes, AES
seems to be out of reach of exhaustive search for the near
future. Over the past 10 years, through deeper analysis and
measurements, AES has gained significant confidence in its
security. However, AES is a computationally complex algorithm
that requires a large amount of hardware resources and power.
Recent literature reports implementations of the AES algorithm
on Field Programmable Gate Arrays (FPGAs) [3-6] and
Application Specific Integrated Circuits (ASICs) [7-9].
Most of them focus on high throughput; they are not
area-efficient and result in high power consumption. With the
shift of the technological trend towards power aware system
design, low power AES architectures have gained importance
over area and performance oriented designs.
Besides that, to enhance the level of information security in
various environments, many research works have focused on
improving encryption and authentication. For example,
Himanshu Gupta [10] employed multiple encryption for
secure electronic transactions using the MD and SHA
algorithms. In [11], Sairam Natarajan combined four well-known
cryptographic algorithms, namely AES, DES, Rivest-Shamir-Adleman
(RSA) and Caesar, to implement multilevel security by creating
multi-cipher text. Jayant [12] proposed a double encryption
method for protecting sensitive images.
Almost all of these works are implemented at the algorithm level
and programmed on a CPU platform. A critical problem with
deploying multiple encryptions, or with combining encryption
and authentication, on a CPU platform is that these methods do
not satisfy real-time applications. Examples include live video
conferencing and on-scene video streamed from UAVs to a base
station. In our previous works [13]-[14], we proposed a
novel architecture for a Cryptosystem that uses only the AES
algorithm to enhance the level of information security while
keeping the encryption processing suitable for real-time
requirements.
However, none of the above works addresses optimization for
low energy/power consumption, which is also a very important
factor in modern applications such as video conferencing,
on-scene military video, remote surgery and satellite systems.
For example, as military services improve battlefield
communications with cameras, the need to secure communications
both in transmission and at rest, in the event that devices are
lost or stolen, on low power/energy devices becomes paramount.
Therefore, there is an urgent need for a cryptosystem that
provides a high level of information security while also
offering low power/energy consumption and high-speed processing
that meets real-time requirements.
Based on these requirements, in this paper we continue our
previous work in [13]-[14] by proposing the power aware
technique and applying the pipeline technique to a Cryptosystem
with authentication using AES and SHA-512 (Secure Hash
Algorithm). The design targets are low power/energy
consumption, high-speed processing that suits applications with
real-time requirements, an enhanced level of security and
efficient use of hardware resources. Because the AES core has
the highest power consumption in the overall Cryptosystem and
authentication design, this paper focuses on evaluating the
power consumption of AES_core.
This paper is organized as follows. Section II reviews related
research and gives an overview of our previous works.
Section III proposes the power aware technique for the
Cryptosystem and authentication to reduce power/energy
consumption. Section IV provides experimental results and
comparisons. Section V concludes the paper.
II. RELATED WORKS
A. Current Research
The AES algorithm has received significant interest over the
past decade due to its performance and security level. Both
ASICs and FPGAs are of interest for realizing AES. In ASIC
design flows, an FPGA is usually used as the prototype for
verification before fabrication. However, FPGAs have become a
modern trend for embedded systems because of their fast time to
market, low design cost and reusability. Most of the published
AES hardware designs for FPGAs focus on high speed, low area
occupancy and high throughput.
For example, Singh and Mehra [3] researched an FPGA-based
high-speed and area-efficient AES encryption for data
security using a fully pipelined design. Its operating
frequency reaches 347.6 MHz and its throughput reaches
44.5 Gbps. Nalini Iyer [4] exploited functional block
resource sharing between encryption and decryption as well as
on-the-fly key schedule generation. They proposed two
architectures: (1) an iterative architecture optimized for area,
whose frequency and hardware efficiency are 188 MHz and
0.09 Mbps/slices, respectively; (2) a pipelined architecture
optimized for speed, whose frequency and hardware efficiency
are 373 MHz and 3.8 Mbps/slices, respectively. Yulin Zhang
[5] used BRAM to store the S-box values and exploited two
kinds of BRAM: one for the round transformation and the other
for key expansion. Combining the operations in a single round
reduces the critical delay. The clock frequency reaches
271.15 MHz and the throughput reaches 34.7 Gbps. Chih-Peng
Fan [6] also proposed two architectures: (1) a sequential
architecture whose frequency and throughput are 75.3 MHz and
0.876 Gbps, respectively; (2) a fully pipelined architecture
whose frequency and throughput are 222.2 MHz and 28.4 Gbps,
respectively.
In addition, some ASIC implementations have been reported in
the literature. For example, Hodjat [7] developed a 3.84 Gbps
AES crypto coprocessor with modes of operation support based
on a 0.18 µm CMOS technology. Their design features a 128-bit
data-path and encrypts a block of data in 11 clock cycles. A
completely different design approach is necessary when
optimizing AES hardware for low power consumption or small
silicon area. Feldhofer [8] introduced an AES implementation
suited for passively-powered devices such as RFID tags. It
comprises an 8-bit data-path which occupies an area of 3,595
gates (including registers and control logic) when synthesized
using a 0.35 µm standard cell library. M. Feldhofer [9] showed
that the AES algorithm allows for a wide range of trade-offs
between performance, power consumption, and hardware cost.
However, most of these works focus on the performance of the
AES core and do not result in low power consumption. The
requirements of high speed, low area and high throughput for a
high-performance AES core are therefore met by existing AES
implementations, whereas low power consumption has not been
achieved effectively.
Hence, in this paper, we propose a power aware technique for a
Cryptosystem that has low power consumption as well as a
good performance balance.
B. Our Previous Works
Cryptographic algorithms in general, and the AES algorithm in
particular, consist of two main elements: key expansion, and
data permutation and substitution. To speed up data
encryption/decryption, our previous work [13] proposed a
pipelined architecture for SubBytes, ShiftRow, MixColumn and
AddRoundKey. Moreover, in [14], we also used simultaneous
Encryptions/Decryptions AES HW-Threads to increase the AES
processing speed.
As mentioned above, the operation of the AES algorithm
involves two main steps: AES_Key_Expander and AES
Transformation. Fig. 1 shows the pipelined architecture for
the AES Transformation procedures. The AES Transformation has
four basic procedures: SubBytes, ShiftRow, MixColumn and
AddRoundKey. In these procedures, the key operations are
permutation and substitution with SBOX and Key_Expander. We
describe the four procedures at RTL as one combinational
circuit that is processed within one clock cycle. Fig. 2 shows
this combinational circuit for the AES Transformation with
one-clock-cycle processing. For this pipelined architecture,
with 128-bit data-in, our design operates in 10, 12 and 14
clock cycles for key modes of 128, 192 and 256, respectively.
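The cycle counts above match the round counts fixed by the AES standard (FIPS 197) when one round completes per clock cycle. As a minimal sketch of that relationship, where the function name `aes_rounds` is chosen here for illustration and is not part of the design:

```python
def aes_rounds(key_bits: int) -> int:
    """Return the AES round count Nr for a key size, using Nr = Nk + 6 (FIPS 197)."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES key size must be 128, 192 or 256 bits")
    nk = key_bits // 32  # key length in 32-bit words
    return nk + 6


# With one round per clock cycle, the round count is also the cycle count.
for bits in (128, 192, 256):
    print(bits, aes_rounds(bits))
```

The sketch only models the round count; it says nothing about how the RTL schedules data-in and data-out.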



Fig. 1. The pipelined architecture for AES Transformation





Fig. 2. Combinational circuit for AES Transformation

III. PROPOSED POWER AWARE TECHNIQUE FOR AES ALGORITHM



Fig. 3. The overall diagram of the embedded system
A. The overall diagram of the embedded system.
The embedded system is shown in Fig. 3. The IP core
(AES & SHA-512) connects to a 64-bit AXI bus. The
ARM/MicroBlaze processor acts as the controller and sets the
required parameters for data-transfer processing. After
receiving the message, it sets the necessary parameters for
SHA-512 and AES to start the operation. The output value of the
SHA-512 authentication is compared to the message
authentication code. If the two values are the same, the AES
core operates to encrypt/decrypt the message. In this paper, we
focus on evaluating the power consumption and performance of
the AES core in this Cryptosystem and authentication design
using the power aware and pipeline techniques.
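The control flow described above (hash the message, compare, then run the cipher only on a match) can be sketched in software. This is an illustration under stated assumptions, not the hardware implementation: the expected value is modelled as a plain SHA-512 digest, and `authenticate_then_process` and `placeholder_core` are hypothetical names, the latter standing in for the AES core without performing real encryption.

```python
import hashlib
import hmac
from typing import Callable, Optional


def authenticate_then_process(
    message: bytes,
    expected_digest: bytes,
    aes_core: Callable[[bytes], bytes],
) -> Optional[bytes]:
    """Run the cipher only if the SHA-512 digest matches the expected value."""
    digest = hashlib.sha512(message).digest()
    # Constant-time comparison of the computed and received digests.
    if not hmac.compare_digest(digest, expected_digest):
        return None  # authentication failed, the cipher is never started
    return aes_core(message)


def placeholder_core(data: bytes) -> bytes:
    """Hypothetical stand-in for the hardware AES core (NOT real encryption)."""
    return data[::-1]


msg = b"frame-0001"
good = hashlib.sha512(msg).digest()
print(authenticate_then_process(msg, good, placeholder_core) is not None)
print(authenticate_then_process(msg, b"\x00" * 64, placeholder_core))
```

Keeping the cipher behind the comparison means a failed check costs only one hash computation.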
B. Proposed power aware technique for AES algorithm
FPGAs represent an evolutionary improvement in gate array
technology which offers potential reductions in prototype
system costs and product time-to-market. FPGAs can be
reconfigured within a system, which is a big advantage in
applications that need multiple trial versions. For example, an
FPGA can be used in an encryption scheme to perform encryption
with whatever encryption algorithm is programmed into it.
Moreover, the same chip can be used for multiple rounds of
encryption that combine different encryption algorithms. In
this paper, we focus on implementing a low power design, called
the power aware technique, to reduce the power/energy
consumption of the AES algorithm. With this technique, only the
clock signals of modules that are processing valid data are
active, which leads to a reduction in power consumption.
Power consumption is taken into consideration when designing
the AES algorithm. FPGA chip power comprises static and dynamic
parts. The static power is caused mainly by the leakage current
between power supply and ground. It depends heavily on the
transistor technology and operating temperature, and it is not
feasible to control at system level. Therefore, the leakage
current was not taken into consideration because of its
negligible level. Because an FPGA is programmable, the dynamic
power depends on the design. There are three major factors to
target when reducing FPGA power consumption: the effective
capacitance of resources, the resource utilization, and the
switching activity of resources. The effective capacitance
corresponds to the sum of parasitic effects due to
interconnection wires and transistors. Since an FPGA
architecture usually provides more resources than required to
implement a particular design, some resources are not used
after chip configuration and they do not consume dynamic power.
This is referred to as resource utilization. Switching activity
represents the average number of signal transitions in a clock
cycle. Generally, it depends on the clock itself. Our proposed
technique is based on the third factor. In the baseline design,
the AES core uses a single system clock to activate all three
Key_Expander modules, so all three Key_Expander modules are
activated simultaneously. Fig. 4 shows the waveform of the
active clocks for all three Key_Expander modules of 128, 192
and 256. When the system clock is activated, the clocks of all
three Key_Expander modules are also activated and the modules
work concurrently. These clocks are active during one phase of
operation only and do not change for a long period of time.
However, after key expansion completes, only one of the key
expander results is selected for processing in the AES
transformation procedure. The results of the other
Key_Expander modules are not used in this procedure.
Therefore, a significant amount of the power dissipated by the
clock driver is wasted.
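The three factors above correspond to the terms of the standard first-order CMOS dynamic power model, P = alpha * C_eff * Vdd^2 * f, where alpha is the switching activity. The sketch below shows why lowering switching activity lowers dynamic power proportionally. All numeric values are hypothetical and chosen only for illustration; they are not measurements from this design.

```python
def dynamic_power(alpha: float, c_eff: float, vdd: float, f_clk: float) -> float:
    """First-order CMOS dynamic power in watts: alpha * C_eff * Vdd^2 * f."""
    return alpha * c_eff * vdd ** 2 * f_clk


# Hypothetical values, for illustration only.
C_EFF = 2e-9   # effective switched capacitance in farads
VDD = 1.2      # supply voltage in volts
F_CLK = 100e6  # clock frequency in hertz

p_all = dynamic_power(0.30, C_EFF, VDD, F_CLK)  # all modules toggling
p_one = dynamic_power(0.10, C_EFF, VDD, F_CLK)  # one third of the activity
print(f"{p_all * 1e3:.1f} mW -> {p_one * 1e3:.1f} mW")
```

In this model, capacitance, voltage and frequency stay fixed, so the power ratio equals the ratio of switching activities.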



Fig. 4. Waveform of Active Clocks for all three Key_Expander modules

We propose a power aware technique that gates the clock and
thus reduces the power dissipation by a significant
percentage. Minimizing power consumption is a question of
avoiding unnecessary signal transitions which do not contribute
to the computation in question. The power aware technique can
be used to eliminate the requirement for any global clock in a
system. By eliminating any global clock, we eliminate a major
source of power consumption. We apply this idea to turn off the
activity on the clock net as one way of reducing power
consumption in the clock. In the AES algorithm, only one of the
Key_Expander modules of 128, 192 or 256 is needed at any given
time, as shown in Fig. 5.



Fig. 5. Waveform of Active Clock for each Key_Expander module

Therefore, we divide our design into three clock regions called
Key_Exp_128 clock, Key_Exp_192 clock and Key_Exp_256
clock, which correspond to the three key sizes of 128, 192 and
256. The essential idea in our proposed technique is to
eliminate any unused clock and activate only the clock of the
module that is processing valid data. In the AES
architecture, the Key_Expander modules of 128, 192 and 256
operate asynchronously with respect to each other, and the
clock of each module is locked while the module is unused.
Each clock can thus follow local needs. For example, if the
128-bit key is selected, only the Key_Exp_128 clock is
activated. Meanwhile, the Key_Exp_192 clock and Key_Exp_256
clock are deactivated during the Key_Expander processing.
Similarly, the clocks of Key_Exp_128 and Key_Exp_256 are
deactivated if the clock of Key_Exp_192 is activated. Hence,
the overall power/energy consumption is decreased
significantly.
Compared to the AES core with active clocks for all three
Key_Expander modules of 128, 192 and 256, our proposed
method with an active clock for each Key_Expander module
only consumes energy when a module is in active mode. The
remaining Key_Expander modules are quiescent because their
clocks are turned off, and they stay quiescent until they are
activated. After completing its task, a module returns to a
quiescent, almost non-dissipating state until its next
activation.
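The gating scheme can be illustrated with a small behavioural model that counts the clock edges delivered to each Key_Expander region. This is a sketch only: clock edges are used as a rough proxy for clock-net activity, the function `clock_edges` is a hypothetical name, and the result does not predict the measured power savings, which also depend on the rest of the design.

```python
from typing import Dict

MODULES = ("Key_Exp_128", "Key_Exp_192", "Key_Exp_256")


def clock_edges(selected: str, cycles: int, gated: bool) -> Dict[str, int]:
    """Count rising clock edges delivered to each Key_Expander module."""
    if selected not in MODULES:
        raise ValueError(f"unknown module: {selected}")
    edges = {}
    for name in MODULES:
        # Gated design: the clock enable is high only for the selected module.
        enable = (name == selected) if gated else True
        edges[name] = cycles if enable else 0
    return edges


baseline = clock_edges("Key_Exp_128", cycles=10, gated=False)
proposed = clock_edges("Key_Exp_128", cycles=10, gated=True)
print(sum(baseline.values()), sum(proposed.values()))
```

With the 128-bit key selected, the gated model delivers edges to one region instead of three.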
IV. EXPERIMENTAL RESULTS AND COMPARISONS
Our proposed system is synthesized and implemented using
Xilinx ISE Design Suite 14.5 on an Atlys Spartan-6 LX45 FPGA
board, which is also used to measure the dynamic power with the
Digilent Adept tool, as shown in Fig. 6. The static power is
measured with the XPower Analyzer tool in ISE 14.5. The
results of these implementations are compared with those
found in the literature.
The Xilinx ChipScope tool is used for tracing run-time values
on the Atlys Spartan-6 LX45 FPGA device. ChipScope
is an embedded, software-based logic analyzer. By inserting an
Integrated Controller Core (ICON) and an Integrated Logic
Analyzer (ILA) into the design and connecting them properly,
the signals in the design can be monitored. Fig. 7 shows the
clock signals that are activated and deactivated, captured from
the FPGA through the ChipScope tool after programming the
Spartan-6 LX45 device.



Fig. 6. The Current and Dynamic Power are measured by Digilent Adept tool.



(a) Only clock signal of key_128 is activated



(b) All clock signals of keys 128, 192 and 256 are activated

Fig. 7. The captured signals are displayed and analyzed by using the
ChipScope Pro Analyzer tool.

From Table I, compared with activating the clocks of all three
Key_Expander modules, activating only the clock of each
Key_Expander in use reduces static power consumption by 17% and
8% at 100 and 200 MHz clock frequencies, respectively.
Table II compares the AES core with an active clock for each
Key_Expander module against the AES core with active clocks for
all three Key_Expander modules in terms of current and dynamic
power consumption.

TABLE I. STATIC POWER COMPARISON
Mode           Activate clock for each    Activate clocks for all
               Key_Expander               three Key_Expanders
Clock (MHz)    100         200            100         200
Power (W)      0.00596     0.01091        0.007140    0.01190
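The reductions quoted for Table I can be recomputed from the table entries. The helper `reduction_percent` below is a name introduced here for illustration, and the figures are copied from Table I.

```python
def reduction_percent(baseline: float, proposed: float) -> float:
    """Relative saving of `proposed` against `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Static power in watts from Table I: (all three clocks, one clock per module).
static_power_w = {100: (0.007140, 0.00596), 200: (0.01190, 0.01091)}

for mhz, (all_clocks, per_module) in static_power_w.items():
    print(mhz, round(reduction_percent(all_clocks, per_module), 1))
```

Rounded to whole percentages, the results agree with the 17% and 8% stated in the text.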

In Table II, the current consumption and dynamic power
consumption are average values taken from 10 measurements at
random times. Case 1 does not use our proposed power aware
technique, while Case 2 uses it. As can be seen from Table II,
the dynamic power and current in Case 2 are reduced remarkably
compared with Case 1. For example, compared with Case 1,
the power consumption of Case 2 is reduced by
29%, 23% and 24% in encode mode for keys 128, 192 and 256,
respectively. Similarly, our proposed technique in Case 2
consumes less power than Case 1 by 29%, 22.2% and 23.7% in
decode mode for keys 128, 192 and 256, respectively. Besides
that, the current consumption is decreased by 29.2%, 22.6%
and 24.1% in encode mode for keys 128, 192 and 256,
respectively. We can quantify the throughput of AES
encryption processing for each block of 128 bits as follows:

\[
\mathrm{Throughput\_key\_size} = \frac{\mathrm{KeySize}}{\mathrm{Time\_in\_data} + \mathrm{Time\_processing} + \mathrm{Time\_out\_data}}
\]


Following the AES proposal, the time for Data_in and for
Data_out is one clock cycle each. The processing time of the
AES Transformation for 128, 192 and 256 is 11, 13 and 15 clock
cycles, respectively. With the maximum frequency from the
synthesis result, the throughput and performance parameters of
this design are given in Table III.
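As a worked check of the throughput formula, the sketch below evaluates it for the 128-bit key mode using the maximum frequencies listed for our design in Table III. It assumes the numerator is the 128-bit block, which equals the key size in this mode, and that data-in and data-out take one cycle each. The function name `throughput_gbps` is illustrative.

```python
BLOCK_BITS = 128  # bits processed per AES block


def throughput_gbps(f_max_mhz: float, processing_cycles: int,
                    io_cycles: int = 2) -> float:
    """Throughput in Gbps for one block every (processing + I/O) clock cycles."""
    total_cycles = processing_cycles + io_cycles
    return BLOCK_BITS * f_max_mhz * 1e6 / total_cycles / 1e9


# 11 processing cycles for the 128-bit key, plus 1 cycle in and 1 cycle out.
case1 = throughput_gbps(134.7, 11)    # without the power aware technique
case2 = throughput_gbps(160.253, 11)  # with the power aware technique
print(f"{case1:.3f} {case2:.3f}")
```

Both values agree with the Table III throughputs to within rounding.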
Table III compares our proposed design with four previous
designs in terms of performance and low power consumption.
Each design is implemented in a different technology with
different parameters such as clock frequency and supply
voltage. Therefore, it is difficult to compare them directly.
However, the comparison still gives meaningful information.
Compared with the other designs in [15]-[18], our design in
Case 1 consumes the fourth lowest power. However, in Case 2,
when we apply the power aware technique, we achieve the lowest
power consumption with 222.75 mW.

TABLE II: THE CURRENT AND DYNAMIC POWER COMPARISON BETWEEN TWO CASES

                      Case 1                      Case 2
Mode  Quantity        Encode   Decode   IDLE      Encode   Decode   IDLE
128   Current (mA)    261.4    260.6    195       185.2    185.8    170
      Power (mW)      314.1    312.9    234       222.3    223.2    204
192   Current (mA)    260.4    257.9    195       201.6    200.4    170
      Power (mW)      312.6    309.3    234       242.2    240.6    204
256   Current (mA)    253.8    251.2    195       192.6    196.4    170
      Power (mW)      304.8    302.1    234       231.6    236.1    204

Case 1: without using our proposed power aware technique
Case 2: using our proposed power aware technique

TABLE III: COMPARISON OF OUR PROPOSED DESIGN WITH PREVIOUS DESIGNS

Paper                             [15]        [16]           [17]          [17]          [18]                [18]                Our design     Our design
Variant                           -           -              DOR           DOR + K       Option 1            Option 2            Case 1*        Case 2**
Technology                        Stratix II  Virtex 4vf100  Virtex 2 PRO  Virtex 2 PRO  Virtex-II XC2V1000  Virtex-II XC2V1000  Spartan6-LX45  Spartan6-LX45
Clock frequency (MHz)             100         100            25            25            25                  25                  100            100
Power (mW)                        301         283            768.06        778.84        885                 1130                313.5          222.75
Max frequency (MHz)               475         N/A            26.88         27.32         111                 90                  134.7          160.253
Throughput (Gbps)                 0.617       N/A            0.3441        0.3497        0.267               0.267               1.327          1.578
Area (slices)                     N/A         2856           2448          2439          452                 1226                2990           2942
Hardware efficiency (Mbps/slices) N/A         N/A            0.1406        0.1434        0.59                0.218               0.444          0.536
Power efficiency (Mbps/mW)        2.05        N/A            0.448         0.449         0.302               0.236               4.233          7.084
Latency (clock cycles)            N/A         N/A            N/A           N/A           N/A                 N/A                 13             13

*without using our proposed power aware technique
**using our proposed power aware technique

Besides that, our design in Case 1 and Case 2 has the largest
throughput among the designs, with 1.327 and 1.578 Gbps,
respectively. The hardware efficiency, defined as throughput
per slice, of Option 1 of the design in [18] is slightly
better than that of our design, but that design consumes about
four times more power and has about six times lower
throughput than ours. Apart from this case, our design
achieves the highest hardware efficiency among the remaining
designs. Moreover, the power efficiency, defined as
throughput per unit of power consumption (Mbps/mW), of our
two cases is much better than that of the other designs, at
4.233 and 7.084, respectively. Therefore, our design not only
achieves the highest throughput but also consumes the lowest
power. Overall, our proposed design achieves the highest
performance and power efficiency compared with the previous
designs.
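The two efficiency figures for our design follow directly from the throughput, area and power entries in Table III. A short sketch of the arithmetic, with the dictionary layout and the function name `efficiencies` chosen here for illustration:

```python
from typing import Dict, Tuple

# Values for our design, copied from Table III.
designs: Dict[str, Dict[str, float]] = {
    "Case 1": {"throughput_mbps": 1327.0, "slices": 2990, "power_mw": 313.5},
    "Case 2": {"throughput_mbps": 1578.0, "slices": 2942, "power_mw": 222.75},
}


def efficiencies(d: Dict[str, float]) -> Tuple[float, float]:
    """Return (hardware efficiency in Mbps/slice, power efficiency in Mbps/mW)."""
    return d["throughput_mbps"] / d["slices"], d["throughput_mbps"] / d["power_mw"]


for name, d in designs.items():
    hw_eff, pw_eff = efficiencies(d)
    print(name, round(hw_eff, 3), round(pw_eff, 3))
```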
V. CONCLUSION
In this paper, we presented power aware and pipeline
techniques for a Cryptosystem with authentication that reduce
power consumption and promise to enhance the level of
security while achieving the high performance needed by
applications with real-time requirements. The results show
that our design consumes low power, 222.75 mW, and obtains
high performance for real-time processing, with 1.578 Gbps
throughput and a hardware efficiency of 0.536 Mbps/slices.
Compared with designs found in the literature, our design is
also superior in terms of performance and low power
consumption.
REFERENCES
[1] NIST, "Advanced encryption standard (AES)," Nov. 2001,
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
[2] NIST, "Data encryption standard (DES)," Oct. 1999,
http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf
[3] G. Singh, R. Mehra, "FPGA Based High Speed and Area Efficient
AES Encryption For Data Security," International Journal of
Research and Innovation in Computer Engineering, vol. 1, no. 2, pp.
53-56, Feb. 2011.
[4] Nalini Iyer, P.V. Anandmohan, D.V. Poornaiah, and V.D. Kulkarni,
"Efficient Hardware Architectures for AES on FPGA," CIIT 2011,
CCIS 250, pp. 249-257, Springer-Verlag Berlin Heidelberg, 2011.
[5] Yulin Zhang, Xinggang Wang, "Pipelined Implementation of AES
Encryption Based on FPGA," Information Theory and Information
Security (ICITIS), pp. 170-173, Dec. 2010.
[6] Chih-Peng Fan and Jun-Kui Hwang, "FPGA implementations of high
throughput sequential and fully pipelined AES algorithm,"
International Journal of Electrical Engineering, Vol. 15, No. 6,
pp. 447-455, 2008.
[7] Hodjat, D. D. Hwang, B.-C. Lai, K. Tiri, and I. M. Verbauwhede, "A
3.84 Gbits/s AES crypto coprocessor with modes of operation in a
0.18-µm CMOS technology," in Proceedings of the 15th ACM Great
Lakes Symposium on VLSI (GLSVLSI 2005), pp. 351-356. ACM
Press, 2005.
[8] M. Feldhofer, J. Wolkerstorfer, and V. Rijmen, "AES implementation
on a grain of sand," IEE Proceedings Information Security,
152(1):13-20, Oct. 2005.
[9] M. Feldhofer, K. Lemke, E. Oswald, F.-X. Standaert, T. Wollinger,
and J. Wolkerstorfer, "State of the Art in Hardware Architectures,"
ECRYPT deliverable D.VAM.2, available for download at
http://www.ecrypt.eu.org/documents/D.VAM.2-1.0.pdf, Sept. 2005.
[10] Himanshu Gupta, "Role of multiple encryption in secure electronic
transaction," International Journal of Network Security & Its
Applications (IJNSA), Vol. 3, No. 6, November 2011.
[11] Sairam Natarajan, "A Novel Approach for Data Security Enhancement
Using Multi Level Encryption Scheme," (IJCSIT) International
Journal of Computer Science and Information Technologies, Vol. 2
(1), 2011, 469-473.
[12] Jayant Kushwaha, Bhola Nath Roy, "Secure Image Data by Double
encryption," International Journal of Computer Applications
(0975-8887), Volume 5, No. 10, August 2010.
[13] Trong-Tuan NGUYEN, Van-Cuong NGUYEN, Hung-Manh PHAM,
"Enhance the performance and security of SoC using pipeline and
dynamic partial reconfiguration," The 2012 International Conference
on Integrated Circuits and Devices in Vietnam (ICDV 2012), Section
#6.
[14] Trong-Tuan NGUYEN, Van-Cuong NGUYEN, Mai-Duyen Le
NGUYEN, "A Novel Architecture for Cryptosystem with
Simultaneous Encryptions/Decryptions HW-Threads and Self-
Dynamic Reconfiguration," The 2013 International Conference on
Integrated Circuits and Devices in Vietnam (ICDV 2013).
[15] G. H. Karimian, B. Rashidi, and A. Farmani, "A High Speed and Low
Power Image Encryption with 128-Bit AES Algorithm," International
Journal of Computer and Electrical Engineering, Vol. 4, No. 3, June
2012.
[16] Bahram Rashidi and Bahman Rashidi, "FPGA Based A New Low
Power and Self-Timed AES 128-bit Encryption Algorithm for
Encryption Audio Signal," I. J. Computer Network and Information
Security, pp. 10-20, 2013.
[17] Jason Van Dyken and José G. Delgado-Frias, "FPGA schemes for
minimizing the power-throughput trade-off in executing the
Advanced Encryption Standard algorithm," Journal of Systems
Architecture, pp. 116-123, 2010.
[18] Roohi Banu and Tanya Vladimirova, "Fault-Tolerant Encryption for
Space Applications," IEEE Trans. on Aerospace and Electronic
Systems, vol. 45, No. 1, January 2009.
