PROJECT REPORT
By
Slot: B1
October, 2018
Abstract:
Images play an important role in the modern world, and their confidentiality, integrity and
authentication matter as much as those of text. Encrypting and decrypting images, however,
differs significantly from handling plain text because of their large size and the
redundancy of their pixels. With the rise of powerful GPUs containing thousands of efficient
cores, cryptographic algorithms that were once considered secure can now be broken far more
quickly than before. Earlier, increasing the complexity of an algorithm meant an increase in
the processing time for encryption as well as decryption. Now, with these GPUs and the
growth of GPU computing, the processing time has been reduced to a fraction of what it used
to be. This paper presents a parallel implementation of AES using NVIDIA CUDA and OpenCV to
encrypt images rapidly. We achieved an average speed-up of about four times on the GPU
compared to a CPU-only implementation, for both 128-bit and 256-bit keys.
Security of data is a prime concern of this era. We need much stronger security in data
transmission than previous generations did. AES (Advanced Encryption Standard) is one of the
most popular and extensively used algorithms in cryptography, and AES-256 is considered
secure enough to be used by the US government for transmitting classified information. AES
consists of several rounds of permutations and substitutions, the number depending upon the
key size: 10 rounds for 128-bit, 12 for 192-bit and 14 for 256-bit keys. Security increases
with increasing key size. Applying AES to an image differs considerably from applying it to
general text because of the image's large size and the various lossy or lossless compression
techniques used on images. An image can be viewed as a two-dimensional matrix of integers
whose range depends upon the bit depth of the image. Higher-resolution images have a larger
number of pixels and therefore take longer to process. Hence, the encryption has to be
implemented efficiently on a GPU.
An advantage of our method is that the encrypted file can be shared in image form. Without
the key, it is computationally infeasible (at the time of writing this paper) to recover the
contents of the image, as the encrypted image is just a random-looking pattern of colours.
The next section presents a brief review of the methods researchers have used in the past
for encryption/decryption of text and images. We then present our proposed method in detail,
followed by the observations and results in the subsequent sections.
Literature Review:
Image Encryption
9. Image Encryption Using Genetic Algorithm [Roza Afarin; Saeed Mozaffari, 2013]
   Authors: Roza Afarin; Saeed Mozaffari
   Venue: 2013 8th Iranian Conference on Machine Vision and Image Processing (MVIP)
   Method: The paper proposes an improved version of the genetic algorithm for image
   encryption. In the initial steps the image is dislocated randomly and then divided into
   four parts. Crossover and mutation are then applied. If the entropy of the final result
   becomes higher [...]. Randomness is further assessed through entropy, correlation
   coefficient and histogram analysis.
   Remarks: The analysis shows that the algorithm can be applied successfully to any type of
   image. The result is sensitive to the key and to distortions in the image. A future
   attempt could increase the cipher value to get a better result.
10. Pixel Based Image Encryption Using Magic Square [S. Sowmiya; I. Monica Tresa;
    A. Prabhu Chakkaravarthy, 2017]
   Authors: S. Sowmiya; I. Monica Tresa; A. Prabhu Chakkaravarthy
   Venue: 2017 International Conference on Algorithms, Methodology, Models and Applications
   in Emerging Technologies (ICAMMAET)
   Method: To avoid data redundancy, the pixel method and the magic square method are
   combined to give a new algorithm, also named the Pan Magic Square method. The plain text
   is divided into pixels and a total of 64 keys are generated. The encrypted result is very
   different from the text, so it can easily be transmitted over the internet.
   Remarks: More secure than PMS because of the 64 keys. It overlooks security while
   changing to RGB format and can easily be attacked. A future attempt could minimize the
   time and space complexity.
11. A New Duffing-Lorenz Chaotic Algorithm and Its Application in Image Encryption
    [Jianming Liu; Huijing Lv, 2012]
   Authors: Jianming Liu; Huijing Lv
   Venue: 2012 International Conference on Control Engineering and Communication Technology
   Method: An attempt is made to modify the Duffing and Lorenz chaotic algorithm, because
   the original is very simple and cannot be used for encryption. The modified version is
   6-dimensional and more complex.
   Remarks: Lyapunov exponent simulation shows that the improved version has better chaotic
   features. A newly designed dynamic mapping is required to apply this version. Since it is
   6-dimensional, it could be simplified to 4 dimensions for ease of calculation.
12. A Novel Image Encryption Algorithm Based On Fractional Fourier Transform and Magic Cube
    Rotation [Xiao Feng, Xiaolin Tian, Shaowei Xia, 2011]
   Authors: Xiao Feng; Xiaolin Tian; Shaowei Xia
   Venue: 2011 4th International Congress on Image and Signal Processing
   Method: This paper introduces a method based on the fractional Fourier transform for
   image encryption. Magic cube rotation scrambling is also used.
   Remarks: Comparing the encrypted and decrypted images, the performance is very high when
   the Fourier transform is used. The algorithm has a large key space, which could be
   vulnerable to attacking agents. A future attempt could make it safer from attackers and
   enhance the security of the data.
13. Attack to an Image Encryption Algorithm based on Improved Chaotic Cat Maps
    [Shuai Ren, Chengshi Gao, Qing Dai, XiaoFei Fei, 2010]
   Authors: Shuai Ren; Chengshi Gao; Qing Dai; XiaoFei Fei
   Venue: 2010 3rd International Congress on Image and Signal Processing (CISP 2010)
   Method: This paper addresses the security concerns of image encryption using chaotic
   maps. It shows that the image can be decrypted very easily by the proposed key-solving
   algorithms, which proves the vulnerability of the previous algorithms.
   Remarks: Information leaks can be traced with the proposed algorithm to ensure the
   security of image transmission. The security of the image rests on the encryption
   algorithm; any attempt to compromise it makes the image vulnerable to attackers. In
   future, a key generation algorithm may be used to enhance the overall security of the
   image.
14. Design and Realization of Image Encryption System Based on SMS4 Commercial Cipher
    Algorithm [Zhang Lei, Li Li, Gao Xianwei, 2011]
   Authors: Zhang Lei; Li Li; Gao Xianwei
   Venue: 2011 4th International Congress on Image and Signal Processing
   Method: This paper presents an image encryption algorithm derived from the SMS4
   commercial cipher algorithm. The proposed algorithm ensures encryption, decryption and
   safe transmission of the image. The conclusion drawn is that grey-level images are more
   equally distributed.
   Remarks: The security of the image is enhanced, and grey-level pixels produce better
   results. SMS4 is issued by the State Secrets Administration. A future attempt could
   design commercial cipher products.
AES encryption on CUDA
[… Luis Carlos Erpen De Bona, 2012]
22. Acceleration of AES encryption with OpenCL [Yuheng Yuan; Zhenzhong He; Zheng Gong;
    Weidong Qiu, 2014]
   Authors: Yuheng Yuan; Zhenzhong He; Zheng Gong; Weidong Qiu
   Venue: 2014 Ninth Asia Joint Conference on Information Security
   Keywords: Graphics processing units, Instruction sets, Encryption, Resource management,
   Parallel processing, Computational modeling, Throughput
   Remarks: I/O is the bottleneck of performance when using the GPU for parallel processing,
   and the reported throughput rate does not include I/O overheads. Future work can
   experiment with various techniques to improve I/O efficiency and thereby the overall
   efficiency of parallelization, test the scheme on other algorithms, and apply parallel
   computing to cryptanalysis.
Image encryption on CUDA
Advantages
• Shared memory – CUDA exposes a fast shared memory region that can be
shared among threads. This can be used as a user-managed cache, enabling
higher bandwidth than is possible using texture lookups.
• Full support for integer and bitwise operations, including integer texture
lookups
Steps to install CUDA 10 on Ubuntu 18.04
Step 6) Add the following lines to your ~/.profile file for CUDA 10.0
# set PATH for cuda 10.0 installation
if [ -d "/usr/local/cuda-10.0/bin/" ]; then
    export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
fi
Step 7) Reboot the computer
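#include <stdio.h>
#include <cuda.h>

// Host function: runs on the CPU. (Body lost in the source; a print is assumed.)
void fromCPU() { printf("Hello from the CPU\n"); }

// Kernel: runs on the GPU, once per launched thread. (Definition lost in the source; assumed.)
__global__ void fromGPU() { printf("Hello from the GPU\n"); }

int main() {
    fromCPU();
    fromGPU<<<2, 3>>>();      // launch 2 blocks of 3 threads each
    cudaDeviceSynchronize();  // wait for the kernel to finish
    return 0;
}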
Algorithm explanation and identifying the areas of parallelism.
The Advanced Encryption Standard (AES), also known by its original name Rijndael, is a
specification for the encryption of electronic data established by the U.S. National
Institute of Standards and Technology (NIST) in 2001.
For AES, NIST selected three members of the Rijndael family, each with a block size of 128
bits, but three different key lengths: 128, 192 and 256 bits.
DES has a small key size, which makes it less secure. Triple DES was introduced to overcome
this, but it turned out to be slower. Hence, AES was later introduced by NIST.
The basic difference between DES and AES is that in DES the plaintext block is divided into
two halves before the main algorithm starts, whereas in AES the entire block is processed to
obtain the ciphertext.
• AES supports larger key sizes than 3DES's 112 or 168 bits.
AES comprises three block ciphers: AES-128, AES-192 and AES-256. Each cipher encrypts and
decrypts data in blocks of 128 bits using cryptographic keys of 128, 192 and 256 bits,
respectively. The Rijndael cipher was designed to accept additional block sizes and key
lengths, but for AES, those functions were not adopted.
The key size used for an AES cipher specifies the number of transformation rounds that
convert the input, called the plaintext, into the final output, called the ciphertext. The
number of rounds is as follows:
• 10 rounds for 128-bit keys
• 12 rounds for 192-bit keys
• 14 rounds for 256-bit keys
Each round consists of several processing steps, including one that depends on the
encryption key itself. A set of reverse rounds is applied to transform ciphertext back into
the original plaintext using the same encryption key.
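The relation between key size and round count described above can be written down in a few lines of host-side C. This is only a minimal sketch: the function names `aes_rounds` and `aes_round_key_bytes` are illustrative and are not taken from the project's code. The second function uses the fact that AES needs one 128-bit round key per round plus one more.

```c
#include <assert.h>
#include <stddef.h>

/* Number of AES rounds for a key of key_bits bits: 10, 12 or 14.
   Returns 0 for a key size that AES does not define. */
int aes_rounds(int key_bits)
{
    switch (key_bits) {
    case 128: return 10;
    case 192: return 12;
    case 256: return 14;
    default:  return 0;
    }
}

/* Bytes of round-key material: one 16-byte round key per round plus one more. */
size_t aes_round_key_bytes(int key_bits)
{
    int rounds = aes_rounds(key_bits);
    assert(rounds != 0);  /* illustrative guard: only AES key sizes are accepted */
    return 16u * (size_t)(rounds + 1);
}
```

For a 128-bit key this gives 11 round keys (176 bytes); for a 256-bit key, 15 round keys (240 bytes).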
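1. KeyExpansion: round keys are derived from the cipher key using Rijndael's key schedule.
   AES requires a separate 128-bit round key block for each round plus one more.
2. Initial round key addition:
   1. AddRoundKey
3. 9, 11 or 13 rounds:
   1. SubBytes
   2. ShiftRows
   3. MixColumns
   4. AddRoundKey
4. Final round (making 10, 12 or 14 rounds in total):
   1. SubBytes
   2. ShiftRows
   3. AddRoundKey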
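The KeyExpansion step in the list above can be sketched in host-side C as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names `build_sbox` and `key_expansion` are hypothetical, and the S-box is computed at start-up instead of being stored as a 256-entry constant table purely to keep the listing short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint8_t rotl8(uint8_t x, int s) { return (uint8_t)((x << s) | (x >> (8 - s))); }

/* Build the AES S-box: multiplicative inverse in GF(2^8), then the affine map. */
static void build_sbox(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;  /* invariant: q is the inverse of p */
    do {
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1B : 0));  /* p *= 3 */
        q ^= (uint8_t)(q << 1);                                  /* q /= 3 */
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (uint8_t)((q & 0x80) ? 0x09 : 0);
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;  /* 0 has no inverse and is handled separately */
}

/* Expand a key of nk 32-bit words (4, 6 or 8) into 16 * (rounds + 1) bytes at w. */
void key_expansion(const uint8_t *key, int nk, uint8_t *w)
{
    assert(nk == 4 || nk == 6 || nk == 8);
    uint8_t sbox[256];
    build_sbox(sbox);

    int rounds = nk + 6;                 /* 10, 12 or 14 */
    int total_words = 4 * (rounds + 1);  /* 4 words per round key */
    uint8_t rcon = 0x01;                 /* round constant, doubled in GF(2^8) */

    memcpy(w, key, (size_t)(4 * nk));    /* the first nk words are the key itself */
    for (int i = nk; i < total_words; i++) {
        uint8_t t[4];
        memcpy(t, w + 4 * (i - 1), 4);
        if (i % nk == 0) {
            /* RotWord, SubWord, then XOR the round constant into the first byte */
            uint8_t first = t[0];
            t[0] = (uint8_t)(sbox[t[1]] ^ rcon);
            t[1] = sbox[t[2]];
            t[2] = sbox[t[3]];
            t[3] = sbox[first];
            rcon = (uint8_t)((rcon << 1) ^ ((rcon & 0x80) ? 0x1B : 0));
        } else if (nk > 6 && i % nk == 4) {
            /* extra SubWord used only by 256-bit keys */
            for (int j = 0; j < 4; j++) t[j] = sbox[t[j]];
        }
        for (int j = 0; j < 4; j++)
            w[4 * i + j] = (uint8_t)(w[4 * (i - nk) + j] ^ t[j]);
    }
}
```

The key schedule is sequential and identical for every 128-bit block, so it is natural to run it once and let every thread reuse the resulting round keys.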
In the SubBytes step, each byte in the state is replaced with its entry in a fixed 8-bit
lookup table, the S-box.
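A minimal C sketch of SubBytes follows. The names `build_sbox` and `sub_bytes` are illustrative. Real implementations usually store the S-box as a constant table; here it is computed from its definition (inverse in GF(2^8) followed by an affine map) only to avoid listing 256 constants.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t rotl8(uint8_t x, int s) { return (uint8_t)((x << s) | (x >> (8 - s))); }

/* Fill sbox[] with the AES substitution table. */
void build_sbox(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;  /* invariant: q is the inverse of p in GF(2^8) */
    do {
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1B : 0));  /* p *= 3 */
        q ^= (uint8_t)(q << 1);                                  /* q /= 3 */
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (uint8_t)((q & 0x80) ? 0x09 : 0);
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;                /* 0 has no inverse and is handled separately */
    assert(sbox[0x53] == 0xED);    /* sanity check against a known S-box entry */
}

/* SubBytes: replace each of the 16 state bytes with its S-box entry. */
void sub_bytes(uint8_t state[16], const uint8_t sbox[256])
{
    for (int i = 0; i < 16; i++)
        state[i] = sbox[state[i]];
}
```

Each byte is substituted independently of the others, which is one reason the step fits well on a GPU.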
In the ShiftRows step, bytes in each row of the state are shifted cyclically to the
left. The number of places each byte is shifted differs for each row.
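ShiftRows can be sketched in C as below. The sketch assumes the usual AES layout in which the 16-byte state is stored column by column, so the byte in row r and column c sits at index r + 4c; the function name `shift_rows` is illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ShiftRows: row r of the 4x4 state is rotated left by r positions.
   The state is stored column-major: byte (row r, column c) is state[r + 4*c]. */
void shift_rows(uint8_t state[16])
{
    assert(state != 0);
    uint8_t out[16];
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[r + 4 * c] = state[r + 4 * ((c + r) % 4)];
    memcpy(state, out, sizeof out);
}
```

Row 0 is left unchanged, and rows 1, 2 and 3 are rotated by one, two and three bytes respectively.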
In the MixColumns step, each column of the state is multiplied with a fixed polynomial
c(x) = 3x^3 + x^2 + x + 2, modulo x^4 + 1.
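A minimal C sketch of MixColumns follows, with illustrative names `xtime` and `mix_columns`. Multiplying a column by c(x) = 3x^3 + x^2 + x + 2 amounts to combining each byte with its neighbours using the GF(2^8) multipliers 1, 2 and 3, where multiplication by 2 is a left shift followed by a conditional XOR with 0x1B.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8) with the AES reduction polynomial. */
static uint8_t xtime(uint8_t b)
{
    return (uint8_t)((b << 1) ^ ((b & 0x80) ? 0x1B : 0x00));
}

/* MixColumns on a column-major 16-byte state: each 4-byte column is
   multiplied by the fixed polynomial c(x) = 3x^3 + x^2 + x + 2. */
void mix_columns(uint8_t state[16])
{
    assert(state != 0);
    for (int c = 0; c < 4; c++) {
        uint8_t *col = state + 4 * c;
        uint8_t a0 = col[0], a1 = col[1], a2 = col[2], a3 = col[3];
        col[0] = (uint8_t)(xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3);  /* 2 3 1 1 */
        col[1] = (uint8_t)(a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3);  /* 1 2 3 1 */
        col[2] = (uint8_t)(a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3));  /* 1 1 2 3 */
        col[3] = (uint8_t)((xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3));  /* 3 1 1 2 */
    }
}
```

For example, the column db 13 53 45 is transformed into 8e 4d a1 bc.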
In the AddRoundKey step, each byte of the state is combined with a byte of the round subkey
using the XOR operation (⊕).
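AddRoundKey is the simplest of the four steps. A minimal C sketch, with the illustrative name `add_round_key`:

```c
#include <assert.h>
#include <stdint.h>

/* AddRoundKey: XOR each state byte with the matching byte of the round key. */
void add_round_key(uint8_t state[16], const uint8_t round_key[16])
{
    assert(state != 0 && round_key != 0);
    for (int i = 0; i < 16; i++)
        state[i] ^= round_key[i];
}
```

Because XOR is its own inverse, applying the same round key a second time restores the previous state, which is why decryption can reuse this step unchanged.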
Parallelization strategy
• For encryption, the image file to be encrypted is divided into blocks of 128 bits.
• All the rounds of AES have to be run on each of these 128-bit blocks in parallel.
• The number of threads in each block is the number of cores (i.e. 3 × 128 = 384).
For efficiency, we take the number of blocks in the grid to be the number of
multiprocessors in the GPU. Our GPU has three multiprocessors, and hence we made three
blocks in our grid. We divide the number of pixel-blocks of the input image by this number
of blocks to obtain the number of threads per block. If the number of threads in a block
exceeds 1024, we allocate a new block to stay within this limit. Each block of sixteen
pixels is then run on an independent thread for all the rounds of AES. This ensures maximum
parallelization.
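One way to read the sizing rule above is sketched below in host-side C. This is an interpretation, not the project's code: the names `launch_plan` and `plan_launch` are hypothetical, a pixel is assumed to be one byte (so sixteen pixels form one 128-bit block), and "allocating a new block" is taken to mean growing the grid until no block needs more than the thread limit.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t grid_blocks;        /* CUDA blocks in the grid */
    size_t threads_per_block;  /* CUDA threads in each block */
} launch_plan;

static size_t ceil_div(size_t a, size_t b) { return (a + b - 1) / b; }

/* Start with one CUDA block per multiprocessor; if that would need more than
   max_threads threads per block, grow the grid until the limit is respected. */
launch_plan plan_launch(size_t image_bytes, size_t multiprocessors, size_t max_threads)
{
    assert(image_bytes > 0 && multiprocessors > 0 && max_threads > 0);
    size_t aes_blocks = ceil_div(image_bytes, 16);  /* one thread per 128-bit block */
    launch_plan p;
    p.grid_blocks = multiprocessors;
    p.threads_per_block = ceil_div(aes_blocks, p.grid_blocks);
    if (p.threads_per_block > max_threads) {
        p.grid_blocks = ceil_div(aes_blocks, max_threads);
        p.threads_per_block = ceil_div(aes_blocks, p.grid_blocks);
    }
    return p;
}
```

With three multiprocessors and a limit of 1024 threads, a 48,000-byte input keeps the three-block grid with 1000 threads each, while a 160,000-byte input needs ten blocks of 1000 threads. The last launched threads may have no data block to process, so a kernel using this plan would need a bounds check.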
After obtaining the ciphertext, we can optionally convert it into hexadecimal digits and
then represent each encrypted pixel as a combination of these digits, taking two at a time.
These pixels can then be displayed in the form of an encrypted image, which will mostly
contain random structures. The advantage of converting the ciphertext into an image is the
reduction in the size of the encrypted file, provided a lossless image compression technique
is used (a lossy format would alter the ciphertext and make decryption impossible).
The same technique, in reverse order, is used to decrypt the encrypted image back to the
original image.
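The conversion from hexadecimal ciphertext to pixel values can be sketched as follows. The function name `hex_to_pixels` is illustrative, and the sketch assumes 8-bit pixel values, so that two hexadecimal digits make exactly one pixel value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Turn hex_len hexadecimal digits into hex_len / 2 pixel values (0..255),
   taking two digits at a time. Returns the number of pixels written,
   or -1 if the input has an odd length or contains a non-hex character. */
long hex_to_pixels(const char *hex, size_t hex_len, uint8_t *pixels)
{
    assert(hex != 0 && pixels != 0);
    if (hex_len % 2 != 0)
        return -1;
    for (size_t i = 0; i < hex_len; i += 2) {
        int hi = hex_value(hex[i]);
        int lo = hex_value(hex[i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        pixels[i / 2] = (uint8_t)((hi << 4) | lo);
    }
    return (long)(hex_len / 2);
}
```

For instance, the digits "ff00A5" become the three pixel values 255, 0 and 165.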
Evaluation results
We processed images of different resolutions, ranging from an HD image of size 0.3 MB to a
high-resolution image of size 11.6 MB, using both 128-bit and 256-bit keys. Fig. 2 shows
the original image and the encrypted image obtained. We then compared the processing time
on the GPU and on the CPU. The configuration of the machine on which the experiments were
performed is given in Table 1. The timing results for images are tabulated in Table 2 for a
128-bit key and Table 3 for a 256-bit key. Table 4 and Table 5 list the timing results for
text files of the same sizes as the images, for 128-bit and 256-bit keys respectively.
Note that these timings cover the total time taken to read an image, encrypt it, generate
and save the encrypted image, then read the encrypted image again, decrypt it and finally
output the original image.
Table 1. Configuration of the machine used for the experiments
GPU: NVIDIA GeForce 940MX, 2 GB DDR3, 3 Multiprocessors, 128 CUDA Cores/MP
CPU: Intel® Core™ i5 7th Generation, 4 cores, 8 GB DDR4 RAM
Fig. 2. Notice how the encrypted image has random pixels and hence original
image cannot be guessed just by looking at it.
Table 2. Timing results for encrypting/decrypting images with a 128-bit key
File size  Image       GPU    CPU     Speed-up  Throughput (Mbps)
(MB)       resolution  (sec)  (sec)             GPU    CPU
11.6       7680x4320   29.42  119.63  4.1x      3.15   0.77
5.32       5120x2880   13.19  52.36   4x        3.22   0.81
3.17       3840x2160   7.40   29.67   4x        3.42   0.85
1.32       2560x1440   3.36   13.20   3.9x      3.14   0.80
0.71       1920x1080   2.02   7.44    3.7x      2.81   0.76
0.30       1280x720    0.91   3.34    3.6x      2.64   0.72
Table 5. Timing results for encrypting/decrypting text files with a 256-bit key
File size  GPU    CPU    Speed-up  Throughput (Mbps)
(MB)       (sec)  (sec)            GPU    CPU
11.6       1.33   6.19   4.65x     69.77  14.99
5.32       0.66   2.86   4.33x     64.48  14.88
3.17       0.46   1.66   3.61x     55.13  15.27
1.32       0.21   0.71   3.38x     50.28  14.87
0.71       0.17   0.40   2.35x     33.41  14.20
0.30       0.12   0.16   1.33x     20.00  15.00
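The speed-up and throughput columns of the tables appear to follow directly from the file size and the two timings: the speed-up is the CPU time divided by the GPU time, and the throughput in Mbps is the file size in megabits (MB × 8) divided by the time in seconds. The sketch below, with illustrative function names, reproduces the first row of Table 5 (11.6 MB, 1.33 s on the GPU, 6.19 s on the CPU) under that assumption.

```c
#include <assert.h>

/* Speed-up of the GPU over the CPU for the same job. */
double speed_up(double cpu_sec, double gpu_sec)
{
    assert(gpu_sec > 0.0);
    return cpu_sec / gpu_sec;
}

/* Throughput in megabits per second for size_mb megabytes processed in sec seconds. */
double throughput_mbps(double size_mb, double sec)
{
    assert(sec > 0.0);
    return size_mb * 8.0 / sec;
}
```

For that row, `speed_up(6.19, 1.33)` gives about 4.65, `throughput_mbps(11.6, 1.33)` about 69.77 and `throughput_mbps(11.6, 6.19)` about 14.99, matching the tabulated values.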
We observe that it takes a long time for the CPU to encrypt an image, while a GPU can
complete the same task in a fraction of the time. We can infer from Figure 1 that there is
only a small difference between the execution times on GPU and CPU for low-resolution
images such as 1280x720 and 1920x1080. But as we encrypt/decrypt high-resolution images of
the order of 5120x2880 and 7680x4320, the time difference becomes significant. Another
interesting observation is that the time taken on the GPU seems to be independent of the
key size used, i.e. there is not much time difference between 128-bit and 256-bit key
encryptions. On the other hand, the time taken for encryption/decryption increases
significantly when a 256-bit key is used in place of a 128-bit key in the CPU-only
implementation.
Performance parameters considered
Fig. 3. GPU vs CPU: Time comparison for different resolution images for both key sizes
Fig. 4. GPU vs CPU: Time comparison for different text files for both key sizes
Conclusion
We have presented a method to accelerate image encryption/decryption with AES using the
GPU. We measured the timing for images of different resolutions and compared it with a
CPU-only computation. The speed-up achieved on the GPU is around four times with respect to
the CPU. We also observed that the encryption/decryption time of an image seems to be
independent of the key size when the encryption is done in parallel on the GPU, whereas the
time increased significantly with the key size when a serial computation was done on the
CPU. Another observation is that the speed-up of the GPU over the CPU increases with the
resolution of the images and also when longer keys are used, i.e. a 256-bit key instead of
a 128-bit key. Hence, the user need not compromise the security of the image by choosing a
smaller key in order to encrypt it in a shorter time.
Our future aim is to try the same parallel implementation on other parallel architectures
and compare their performance. We also aim to reduce the space complexity of our
implementation. With the increasing sharing of multimedia over social networking sites, we
also need efficient encryption techniques for the secure transfer of media types such as
audio and video. Finally, the method described in this paper works on the GPUs of personal
computers. We aim to extend this technique to the GPUs of modern smartphones.
References
[1] Nagendra, M., & Sekhar, M. C. (2014). Performance improvement of
Advanced Encryption Algorithm using parallel computation. International
Journal of Software Engineering and Its Applications, 8(2), 287-296.
[5] Soltani, A., & Sharifian, S. (2015). An ultra-high throughput and fully
pipelined implementation of AES algorithm on FPGA. Microprocessors and
Microsystems, 39(7), 480-493.
[6] Li, Q., Zhong, C., Zhao, K., Mei, X., & Chu, X. (2012, June).
Implementation and analysis of AES encryption on GPU. In High
Performance Computing and Communication & 2012 IEEE 9th International
Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE
14th International Conference on (pp. 843-848). IEEE.
[8] Noer, D., Engsig-Karup, A. P., & Zenner, E. (2011). Improved software
implementation of DES using CUDA and OpenCL. In Western European
Workshop on Research in Cryptology.
[9] Afarin, R., & Mozaffari, S. (2013, September). Image encryption using
genetic algorithm. In Machine Vision and Image Processing (MVIP), 2013 8th
Iranian Conference on (pp. 441-445). IEEE.
[10] Sowmiya, S., Tresa, I. M., & Chakkaravarthy, A. P. (2017, February). Pixel
based image encryption using magic square. In Algorithms, Methodology, Models
and Applications in Emerging Technologies (ICAMMAET), 2017 International
Conference on (pp. 1-4). IEEE.
[12] Feng, X., Tian, X., & Xia, S. (2011, October). A novel image encryption
algorithm based on fractional Fourier transform and magic cube rotation. In
Image and Signal Processing (CISP), 2011 4th International Congress on
(Vol. 2, pp. 1008-1011). IEEE.
[13] Ren, S., Gao, C., Dai, Q., & Fei, X. (2010, October). Attack to an
image encryption algorithm based on improved chaotic cat maps. In Image
and Signal Processing (CISP), 2010 3rd International Congress on (Vol. 2,
pp. 533-536). IEEE.
[14] Lei, Z., Li, L., & Xianwei, G. (2011, October). Design and
realization of image encryption system based on SMS4 commercial cipher
algorithm. In Image and Signal Processing (CISP), 2011 4th International
Congress on (Vol. 2, pp. 741-744). IEEE.
[15] Li, Q., Zhong, C., Zhao, K., Mei, X., & Chu, X. (2012, June).
Implementation and analysis of AES encryption on GPU. In High
Performance Computing and Communication & 2012 IEEE 9th International
Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE
14th International Conference on (pp. 843-848). IEEE.
[19] Ma, J., Chen, X., Xu, R., & Shi, J. (2017, June). Implementation and
Evaluation of Different Parallel Designs of AES Using CUDA. In Data
Science in Cyberspace (DSC), 2017 IEEE Second International Conference on
(pp. 606-614). IEEE.
[20] Luo, C., Fei, Y., Luo, P., Mukherjee, S., & Kaeli, D. (2015, October).
Side-channel power analysis of a GPU AES implementation. In 2015 33rd
IEEE International Conference on Computer Design (ICCD) (pp. 281-288).
IEEE.
[22] Yuan, Y., He, Z., Gong, Z., & Qiu, W. (2014, September).
Acceleration of AES encryption with OpenCL. In Information Security
(ASIA JCIS), 2014 Ninth Asia Joint Conference on (pp. 64-70). IEEE.
[23] Qiu, H., & Memmi, G. (2014, December). Fast selective encryption
method for bitmaps based on GPU acceleration. In Multimedia (ISM),
2014 IEEE International Symposium on (pp. 155-158). IEEE.
[24] Habibpour, L., Yousefi, S., Lighvan, M. Z., & Aghdasi, H. S. (2016).
1D Chaos-based image encryption acceleration by using GPU. Indian
Journal of Science and Technology, 9(6).
[25] Abduljabbar, W. K., Abdul-Rahman, S. Y. A. R. I. Z. A., &
Ramli, R. (2017). A new processing of chaos-based fast
image encryption algorithms. Journal of Theoretical & Applied
Information Technology, 95(11).