https://doi.org/10.1007/s11277-019-06171-x
S. Jancy1 · C. Jayakumar2
Abstract
Sensors play an integral part in the technologically advanced real world. Wireless sensors are powered by batteries with limited capacity, so energy efficiency is one of the major issues with wireless sensors. Many techniques have been proposed to improve sensor efficiency. This paper discusses improving the energy efficiency of sensors through data compression. A sequence statistical code based data compression algorithm is proposed to improve the energy efficiency of sensors. The algorithm uses SDC and FOST codes to achieve a better compression ratio. The simulation result was compared with arithmetic data compression techniques; the computation process of the proposed algorithm is much simpler than that of arithmetic data compression techniques.
1 Introduction
1.1 Data Compression
Data compression involves modelling and encoding. It is a process in which the bit structure of data is converted so that it takes less space on the disk, reducing the storage size of the data. This process is also called bit rate reduction or source reduction. Reference [1] discusses DCT and DWT data compression techniques. Reference [2] analyses the state of the art of confidentiality preserving techniques for WSNs. Reference [3] examines the problem of irregular energy consumption. Reference [4] presents a methodical and comprehensive classification of energy conservation schemes. Reference [5] evaluates recent trends and developments in the use of wavelets in wireless communication. Reference [6] proposes an energy saving clustering algorithm. Reference [7] analyses signal processing tasks.
* S. Jancy
jancyphd16@gmail.com
1 CSE Department, Sathyabama University, Chennai, India
2 Sri Venkateswara College of Engineering, Chennai, India
Table 1 Compression ratio of arithmetic coding and Huffman coding

File no.  File size (bytes)  Compression ratio (arithmetic coding)  Compression ratio (Huffman coding)
1         52,331             1.12                                   1.06
2         57,206             1.10                                   1.25
3         37,137             1.58                                   1.51
4         55,050             1.13                                   1.09
Table 2 Compression ratio and compression time, for sequential code data compression (SDC) [18] (columns: File no., File size, Compressed file size, Compression ratio, Compression time (ms))
References [8–13] analyse the power consumption problem and hybrid routing algorithms. With its wide use in computing services, data compression is heavily employed in data communication. Several software solutions and compression techniques are used to reduce the size of data, and an average compression ratio is normally achieved on common types of data with the available techniques. The outcome of these data compression techniques is the need for much less storage space and faster communication. Redundancy of stored data and storage costs are limited through data storage compression (Table 1).
1.2 Entropy Coding
A coding scheme in which symbols are assigned codes whose lengths match the probabilities of the symbols is called entropy coding. Characteristically, symbols represented by equal length codes are replaced by codes whose lengths are proportional to the negative logarithm of the probability, so the shortest codes go to the most common symbols. In accordance with Shannon's theorem, −log_b p is a symbol's optimal code length, where b is the number of symbols used to make the output codes and p is the input symbol's probability. Huffman coding and arithmetic coding are the two most widely used encoding techniques. Variable length codes and fixed length codes are the two classifications of code types. In accordance with Shannon's theorem, the entropy H of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X:
H(X) = ∑_{x∈X} p(x) ⋅ log₂ (1 / p(x))
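As a concrete illustration of the formula above, the entropy of a string can be computed directly from symbol frequencies. The following Python sketch (the function name is ours, not from the paper) estimates H(X) from observed probabilities:

```python
import math
from collections import Counter

def shannon_entropy(message: str) -> float:
    """H(X) = sum over symbols x of p(x) * log2(1 / p(x))."""
    counts = Counter(message)
    n = len(message)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

print(shannon_entropy("abab"))  # -> 1.0 (two equiprobable symbols: 1 bit each)
print(shannon_entropy("aaaa"))  # -> 0.0 (no uncertainty at all)
```

A string of one repeated symbol has entropy zero, while maximally random strings approach log₂ of the alphabet size, matching the measure of randomness described in Sect. 1.3.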
1.3 Information Theory
Mathematical laws constrain the transformation of information. Information theory addresses the two fundamental concerns of communication theory: data compression and the transmission rate of communication. In data compression the compression limit is the entropy of the data, H; the channel capacity C limits the transmission rate of communication. Every communication scheme is positioned between these two limits. The term entropy refers to the information encoded in a message [14]. In information theory, entropy is a measure that allows for the evaluation of the level of randomness in a string of symbols [15]. The two major entropy coding techniques are:
A. Huffman coding
The Huffman coding algorithm is used when data compression is performed using individual letter frequencies. It is an optimal compression algorithm based on statistical coding. The algorithm works on the concept of using fewer bits to encode letters that occur more frequently than letters that occur less frequently. A symbol's probability has a direct bearing on the length of its representation: with the Huffman compression coding system, the more frequently used characters are assigned smaller codes and the less frequently used characters larger codes. This variable length coding system reduces the size of the file that is compressed and transferred. The drawback of the Huffman algorithm is that its compression ratio depends on the individual character frequencies (Fig. 1).
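The frequency-to-code-length idea can be sketched with a minimal Huffman table builder in Python. This is not the paper's implementation; it is the standard greedy construction that repeatedly merges the two lightest subtrees:

```python
import heapq
from collections import Counter

def huffman_codes(message: str) -> dict:
    """Build a Huffman code table: more frequent symbols get codes
    that are never longer than those of less frequent symbols."""
    freq = Counter(message)
    # Heap entries: (subtree weight, unique tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate one-symbol message
        return {sym: "0" for sym in freq}
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # pop the two lightest subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes("PROCEDURE")
# 'R' and 'E' occur twice each, so their codes are at most as long
# as those of the letters that occur only once.
assert max(len(codes["R"]), len(codes["E"])) <= min(len(codes[c]) for c in "POCDU")
```

The resulting table is prefix-free, so a compressed stream can be decoded unambiguously without separators.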
B. Arithmetic coding
Arithmetic coding is commonly used in both lossless and lossy data compression algorithms. This entropy coding technique encodes more frequent symbols with fewer bits than less frequent symbols. With arithmetic coding, the input message, which
is composed of symbols, is converted into a floating point number greater than or equal to zero and less than one. To characterize the symbols, arithmetic coding relies on a model. The model's task is to tell the encoder the probability of each character in the input message. If the probabilities of the characters in the message are given accurately by the model, the message will be encoded close to the optimum. On the contrary, if the probabilities of the symbols are misrepresented by the model, the encoder expands the message instead of compressing it [12]. With arithmetic coding, an interval of real numbers between 0 and 1 represents a message. The interval needed to represent the message shrinks as the length of the message grows, and the number of bits needed to specify that interval grows accordingly. References [16, 17] propose a universal algorithm for sequential data compression and a pixel scanning method for data compression (Fig. 2).
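The interval-narrowing idea can be illustrated with a short Python sketch. The two-symbol model and its probabilities are hypothetical, chosen only for the example:

```python
def narrow_interval(message, probs):
    """Narrow [low, high) once per symbol using each symbol's
    cumulative probability range; the final interval identifies the message."""
    # Build cumulative ranges, e.g. A -> [0.0, 0.6), B -> [0.6, 1.0).
    cum, start = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = (start, start + p)
        start += p
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        s_low, s_high = cum[sym]
        low, high = low + span * s_low, low + span * s_high
    return low, high

# Hypothetical model: p(A) = 0.6, p(B) = 0.4.
low, high = narrow_interval("AAB", {"A": 0.6, "B": 0.4})
print(low, high)  # the interval [0.216, 0.36) encodes "AAB" under this model
```

Each additional symbol shrinks the interval by that symbol's probability, so longer (or less probable) messages need narrower intervals and hence more bits, exactly the inverse relation stated above.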
In the first order model the characters are statistically independent. Let pᵢ be the probability of the ith letter in the alphabet and let m be the size of the alphabet. The entropy is calculated as

H = −∑_{i=1}^{m} pᵢ log₂ pᵢ
Table 3 Compression ratio and compression time, for packet level data compression (PLDC) [18] (columns: File no., File size, Compressed file size, Compression ratio, Compression time (ms))
The proposed algorithm uses the first order statistic code (FOST) and the sequence code (SDC) (Fig. 4, Table 4).
Steps for the Proposed Algorithm:
1. A sequence code is generated and assigned for the given input data.
2. For the given input data the first order statistic code is assigned.
3. The difference between the sequence code and the first order statistic code is calculated:
   ΔD = SCD − FOSD
   where SCD is the sequence coded data, FOSD is the first order statistic coded data, and D is the difference.
4. Double digits and single digits are segregated from the SDC.
5. Once segregated, all double digits are converted into single digits.
6. A location table is generated for every double and single digit.
7. Once a double digit is converted into a single digit, the derived value is assigned the corresponding SDC code.
8. Steps 3, 4, 5, 6 and 7 are repeated once the sequence digits are assigned.
9. This process continues until all the input data (SDC) are converted into single digits.
10. The SDC of the resultant single digits is derived.
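Steps 1, 4 and 5 can be sketched in Python. This is our reading of the worked example that follows, not a full implementation: the sequence code is A=1 … Z=26, and, judging from the example (16 → 7, 18 → 9, 21 → 3), a double-digit code is collapsed to a single digit by summing its digits.

```python
def sequence_code(ch: str) -> int:
    """Step 1: sequence code, A=1 ... Z=26."""
    return ord(ch.upper()) - ord('A') + 1

def to_single_digit(n: int) -> int:
    """Steps 4-5: collapse a double-digit code by summing its digits
    (16 -> 1+6 = 7), repeating if the sum is itself double-digit."""
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n

data = "PROCEDURE"
codes = [sequence_code(c) for c in data]
doubles = [(c, n) for c, n in zip(data, codes) if n >= 10]   # step 4
singles = [(c, n) for c, n in zip(data, codes) if n < 10]    # step 4
reduced = [(c, to_single_digit(n)) for c, n in doubles]      # step 5
print(codes)    # [16, 18, 15, 3, 5, 4, 21, 18, 5]
print(reduced)  # [('P', 7), ('R', 9), ('O', 6), ('U', 3), ('R', 9)]
```

These values match the segregation and conversion tables of the PROCEDURE example.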
A  B  C  D  E  F  G  H  I  J
1  2  3  4  5  6  7  8  9  10
K  L  M  N  O  P  Q  R  S  T
11 12 13 14 15 16 17 18 19 20
U  V  W  X  Y  Z
21 22 23 24 25 26
Example:
The input data is “PROCEDURE”.
Assign sequence code for the given input data.
P  R  O  C  E  D  U  R  E
16 18 15 3  5  4  21 18 5
Input data  SDC  FOST
P           16   0.0137645
R           18   0.0497563
O           15   0.0564513
C           3    0.0217339
E           5    0.0141442
D           4    0.0349835
U           21   0.0225134
R           18   0.0497563
E           5    0.0141442
Segregate the double digits and single digits in the sequence coded data.

Double digits: P—16, R—18, O—15, U—21, R—18
Single digits: C—3, E—5, D—4, E—5

Convert the double digits into single digits: P—16—7, R—18—9, O—15—6, U—21—3, R—18—9.
Generate the location table for double digits and single digits.
Entry    Double digit  Single digit
A[0] P   YES           –
A[1] R   YES           –
A[2] O   YES           –
A[3] C   –             YES
A[4] E   –             YES
A[5] D   –             YES
A[6] U   YES           –
A[7] R   YES           –
A[8] E   –             YES
Once the double digits are converted into single digits, assign the corresponding SDC of
the single digits (Fig. 5, Table 5).
P − 16 − 7 − G − 0.0158610
R − 18 − 9 − I − 0.0558094
O − 15 − 6 − F − 0.0197881
U − 21 − 3 − C − 0.0124248
R − 18 − 9 − I − 0.0558094
Entry      Double digit  Single digit
A[0] P(G)  –             YES
A[1] R(I)  –             YES
A[2] O(F)  –             YES
A[6] U(C)  –             YES
A[7] R(I)  –             YES
Obtain the difference between the newly generated sequence coded data and the corresponding FOST.
ΔG = 7 − 0.0158610 = 6.984139
ΔI = 9 − 0.0558094 = 8.9441906
ΔF = 6 − 0.0197881 = 5.9802119
ΔC = 3 − 0.0124248 = 2.9875752
ΔI = 9 − 0.0558094 = 8.9441906
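The differences above can be reproduced mechanically. In this sketch the FOST values are simply copied from the tables in the example (the dictionary names are ours):

```python
# FOST values as tabulated in the example above.
fost = {"G": 0.0158610, "I": 0.0558094, "F": 0.0197881, "C": 0.0124248}
sdc = {"G": 7, "I": 9, "F": 6, "C": 3}   # single-digit sequence codes

# Delta = SDC value minus FOST value, per symbol.
deltas = {k: sdc[k] - fost[k] for k in sdc}
print(deltas)
```

The printed deltas agree with ΔG, ΔI, ΔF and ΔC computed above.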
P − 16 − 7 − G ⎫
R − 18 − 9 − I ⎪
O − 15 − 6 − F ⎬ 34 − 7 − G − 0.0158610
U − 21 − 3 − C ⎪
R − 18 − 9 − I ⎭

C − 3 ⎫
E − 5 ⎪
D − 4 ⎬ 17 − 8 − H − 0.0492888
E − 5 ⎭
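The grouping above can be checked in code: the single-digit codes in each group are summed, the sum is collapsed to a single digit, and that digit is mapped back to a letter through the sequence table. This is our reading of the example; the function names are ours.

```python
def to_single_digit(n: int) -> int:
    """Collapse to one digit by repeated digit summing (34 -> 3+4 = 7)."""
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n

def to_letter(n: int) -> str:
    """Inverse sequence code: 1 -> A ... 26 -> Z."""
    return chr(ord('A') + n - 1)

double_group = [7, 9, 6, 3, 9]   # P, R, O, U, R after conversion
single_group = [3, 5, 4, 5]      # C, E, D, E

d = to_single_digit(sum(double_group))     # 34 -> 7
s = to_single_digit(sum(single_group))     # 17 -> 8
print(sum(double_group), d, to_letter(d))  # 34 7 G
print(sum(single_group), s, to_letter(s))  # 17 8 H
```

The results match the two grouped reductions shown above (34 − 7 − G and 17 − 8 − H).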
H(O) = ∑_{i=3}^{7} 14.9 log 4 = 8.9    H(C) = ∑_{i=4}^{6} 2.9 log 2 = 0.8
H(E) = ∑_{i=5}^{5} 4.8 log 2 = 1.4    H(D) = ∑_{i=6}^{4} 3.9 log 2 = 1.1
H(U) = ∑_{i=7}^{3} 20.9 log 4 = 12.5    H(R) = ∑_{i=8}^{2} 17.9 log 4 = 10.7
H(P) = ∑_{i=9}^{1} 4.8 log 2 = 1.4
H(O) = ∑_{i=3}^{7} 5.9 log 2 = 1.7    H(C) = ∑_{i=4}^{6} 2.9 log 2 = 0.8
H(E) = ∑_{i=5}^{5} 4.8 log 2 = 1.4    H(D) = ∑_{i=6}^{4} 3.9 log 2 = 1.1
H(U) = ∑_{i=7}^{3} 2.9 log 2 = 0.8    H(R) = ∑_{i=8}^{2} 8.9 log 2 = 2.6
H(P) = ∑_{i=9}^{1} 4.8 log 2 = 1.4

H = ∑_{i=2}^{1} 7.9 log 4 = 2.3
7 Performance and Measurement
8 Conclusion
Wireless sensor networks are made up of many sensors, each with its own processing capacity. These sensors are battery powered, and energy efficiency is the major disadvantage of wireless sensor networks. Compression techniques are widely used to improve the efficiency of wireless sensor networks. The sequence statistical code based data compression algorithm is proposed in this paper to achieve a better compression ratio. The proposed algorithm uses SDC and FOST codes, and its computational process is much better than that of arithmetic coding.
References
1. Sheltami, T., Musaddiq, M., & Shakshuki, E. (2016). Data compression techniques in wireless sen-
sor networks. Future Generation Computer Systems, 64, 151–162.
2. Li, N., Zhang, N., Das, S. K., & Thuraisingham, B. (2009). Privacy preservation in wireless sensor
networks: a state-of-the-art survey. Ad Hoc Networks, 7, 1501–1514.
3. Li, J., & Mohapatra, P. (2007). Analytical modeling and mitigation techniques for the energy hole
problem in sensor networks. Pervasive and Mobile Computing, 3, 233–254.
4. Anastasi, G., Conti, M., Di Francesco, M., & Passarella, A. (2009). Energy conservation in wireless
sensor networks: A survey. Ad Hoc Networks, 7, 537–568.
5. Lakshmanan, M. K., & Nikookar, H. (2006). A review of wavelets for digital wireless communication. Wireless Personal Communications, 37, 387–420.
6. Chang, J.-Y., & Pei-Hao, J. (2012). An efficient cluster-based power saving scheme for wireless
sensor networks. EURASIP Journal on Wireless Communications and Networking, 2012, 172.
7. Xiao, J.-J., Ribeiro, A., Luo, Z.-Q., & Giannakis, G. B. (2006). Distributed compression-estimation
using wireless sensor networks. IEEE Signal Processing Magazine, 23(4), 41.
8. Alippi, C., Anastasi, G., Di Francesco, M., & Roveri, M. (2010). An adaptive sampling algorithm for effective energy management in wireless sensor networks with energy hungry sensors. IEEE Transactions on Instrumentation and Measurement, 59(2), 335–344.
9. Srisooksai, T., Keamarungsi, K., Lamsrichan, P., & Araki, K. (2012). Practical data compression in
wireless sensor networks: A survey. Journal of Network and Computer Applications, 35, 37–59.
10. Ravindra Babu, T., Narasimha Murty, M., & Agrawal, V. K. (2007). Classification of run-length
encoded binary data. Pattern Recognition, 40, 321–323.
11. Yick, J., Mukherjee, B., & Ghosal, D. (2008). Wireless sensor network survey. Computer Networks,
52, 2292–2330.
12. Abdulla, A. E. A. A., Nishiyama, H., & Kato, N. (2012). Extending the lifetime of wireless sensor
networks: A hybrid routing algorithm. Computer Communications, 35, 1056–1063.
13. Kolo, J. G., Ang, L.-M., Shanmugam, S. A., Lim, D. W. G., & Seng, K. P. (2013). A simple data
compression algorithm for wireless sensor networks. AISC, 188, 327–336.
14. Witten, I. H., Neal, R. M., & Cleary, J. G. (1987). Arithmetic coding for data compression. Communications of the ACM, 30(6), 520–540.
15. Giancarlo, R., Scaturro, D., & Utro, F. (2012). Textual data compression in computational biology:
Algorithmic techniques. Computer Science Review, 6, 1–25.
16. Ziv, J., & Lempel, A. (1977). A universal algorithm for sequential data compression. IEEE Trans-
actions on Information Theory, 23(3), 337–343.
17. Kolo, J. G., Seng, K. P., Ang, L.-M., & Prabaharan, S. R. S. (2011). Data compression algorithm
for visual information. In ICIEIS 2011, Part III, CCIS (Vol. 253, pp. 484–497). Berlin: Springer.
18. Jancy, S., & Jayakumar, C. (2015). Packet level data compression techniques for wireless sensor
networks. Journal of Theoretical and Applied Information Technology, 75. ISSN:1992-8645.