RE = diag(3 ; 1 ; 2 ). Hence, the sum of these three matrices is a scalar multiple of the identity matrix. The same is the case for the autocorrelation matrices corresponding to $E_6$, $E_7$, and $E_8$. Thus, the whiteness of the quantization error over the fundamental unit of a WP partition may be viewed as a consequence of the whiteness over each dodecahedron, and over each of the two sets of three 14-hedra, namely, $\{S_3, S_4, S_5\}$ and $\{S_6, S_7, S_8\}$.

By Theorem 2, the whiteness of the quantization error $E$ implies that the effective NMI of any WP partition cannot be improved by means of invertible linear transforms. Thus, in conclusion, any image of a WP partition under an invertible linear transform has larger effective NMI than that of the truncated octahedron.

ACKNOWLEDGMENT

The authors wish to express their gratitude to T. Hales for sparking their interest in the Weaire–Phelan partition, and to D. Hui for pointing out that it is instructive to reformulate (9) as (12).

REFERENCES

[1] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1992.
[2] P. L. Zador, “Development and evaluation of procedures for quantizing multivariate distributions,” Ph.D. dissertation, Stanford Univ., Stanford, CA, 1963.
[3] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed. New York, NY: Springer-Verlag, 1998.
[4] A. Gersho, “Asymptotically optimal block quantization,” IEEE Trans. Inform. Theory, vol. IT-25, pp. 373–380, July 1979.
[5] D. Weaire, Ed., The Kelvin Problem. London, U.K.: Taylor & Francis, 1996.
[6] D. Weaire and R. Phelan, “A counter-example to Kelvin’s conjecture on minimal surfaces,” Phil. Mag. Lett., vol. 69, no. 2, pp. 107–110, 1994.
[7] R. Zamir and M. Feder, “On lattice quantization noise,” IEEE Trans. Inform. Theory, vol. 42, pp. 1152–1159, July 1996.
[8] R. Kusner and J. M. Sullivan, “Comparing the Weaire–Phelan equal-volume foam to Kelvin’s foam,” in The Kelvin Problem, D. Weaire, Ed. London, U.K.: Taylor & Francis, 1996, pp. 71–80.
[9] G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 1952.
[10] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1985.

Modified Symmetrical Reversible Variable-Length Code and Its Theoretical Bounds

Chien-Wu Tsai and Ja-Ling Wu, Senior Member, IEEE

Abstract: Reversible variable length codes (RVLCs) have been adopted in the emerging video coding standards H.263 and MPEG-4 to enhance their error-resilience capabilities (which are important and essential) in error-prone environments. This study proposes an efficient algorithm to construct a symmetrical RVLC from a given Huffman code. In addition, theoretical bounds on the maximum codeword length for fixed-length Huffman codes, and on the optimal average codeword lengths for sources with exponential distribution are provided.

Index Terms: Error resilience, Huffman codes, MPEG-4, reversible variable length codes (RVLCs).

I. INTRODUCTION

Almost all image coding standards, such as the JPEG still image coding standard [1], the ITU series of H.261 and H.263 video coding standards [2], [3], the ISO series of MPEG-1 and MPEG-2 standards [4], [5], adopt variable-length codes (VLCs) as their entropy coding stage. Due to the variable code length nature of VLCs, they are very sensitive to errors occurring in noisy environments. Even a single bit error is extremely likely to induce propagation errors such that the data received after the bit error position becomes useless and results in a serious problem.

Reducing the effect of this problem has led to the development of reversible variable length codes (RVLCs), which can be decoded in both the forward and backward directions. RVLCs have received extensive attention only recently, especially during the development of the new video standards H.263+ and MPEG-4 [6], which require enhanced error resilient capabilities. Fraenkel and Klein [8] presented necessary conditions for the existence of RVLCs along with an algorithm to construct a complete RVLC for a given set of codeword lengths. Meanwhile, Takishima, Wada, and Murakami [7] published the first work specifying algorithms for constructing symmetrical and asymmetrical RVLCs from a given Huffman code. Finally, Wen and Villasenor [9], [10] proposed a new class of asymmetrical RVLCs with the same code-length distribution as the Golomb–Rice codes and exp-Golomb codes, and applied them to the H.263+ and the MPEG-4 standards.

This study proposes a novel symmetrical codeword selection mechanism that can make the symmetrical RVLC construction algorithm, originally proposed by Takishima et al., generate more efficient codes, and also overcome a problem in variations of the bit alignment patterns. In addition, theoretical results of maximum codeword lengths and optimal average codeword lengths of the constructed symmetrical RVLCs under specific conditions and source distributions are derived.
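To make the two-directional decoding property concrete, the following minimal Python sketch decodes the same bitstream forward and backward with a single table. It is an illustration only: the four-codeword codebook and the function names are assumptions chosen for the example, not a code taken from the correspondence. It relies on the fact that a symmetrical (palindromic) codeword reads the same in both directions, so the reversed bitstream is a concatenation of the same codewords in reverse order.

```python
def decode_forward(bits, codebook):
    """Greedy prefix decoding; codebook maps codeword -> symbol."""
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in codebook:      # prefix-free, so the first match is final
            symbols.append(codebook[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return symbols


def decode_backward(bits, codebook):
    """Decode from the end of the stream using the same table.

    Every codeword is a palindrome, so reversing the stream yields the same
    codewords in reverse order; decode it and undo the reversal.
    """
    return decode_forward(bits[::-1], codebook)[::-1]


# Hypothetical symmetrical RVLC: palindromic and prefix-free codewords.
CODEBOOK = {"0": "a", "11": "b", "101": "c", "1001": "d"}
ENCODE = {symbol: word for word, symbol in CODEBOOK.items()}

message = list("abcdab")
bitstream = "".join(ENCODE[s] for s in message)
forward = decode_forward(bitstream, CODEBOOK)
backward = decode_backward(bitstream, CODEBOOK)
```

In an error-prone channel, a decoder could run the forward pass up to the first detected error and the backward pass from the end of the packet, which is the motivation for RVLCs given above.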
Manuscript received March 5, 2000; revised March 19, 2001. The material in
this correspondence was presented in part at the IS&T/SPIE 13th International
Symposium on Electronic Imaging 2000, San Jose CA, January 23–28, 2000.
The authors are with the Department of Computer Science and Information
Engineering, National Taiwan University, Taipei, 106, Taiwan, ROC (e-mail:
cwtsai@cmlab.csie.ntu.edu.tw; wjl@cmlab.csie.ntu.edu.tw).
Communicated by P. A. Chou, Associate Editor for Source Coding.
Publisher Item Identifier S 0018-9448(01)06219-8.
Authorized licensed use limited to: Jaypee Institute of Technology. Downloaded on September 25, 2009 at 11:21 from IEEE Xplore. Restrictions apply.
2544 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 6, SEPTEMBER 2001
late the prefix condition must be eliminated from $m_0(L)$. Let $u(i, L)$ denote the number of symmetrical codewords at level $L$ that violate the prefix condition when a symmetrical codeword at the $i$th level is selected as a target codeword. Three different cases then exist:

i) $u(i, L) = m_0(L - 2i)$, when $i \le L/2$;
ii) $u(i, L) = 1$, when $L/2 < i < L$, and the $(2i - L)$-bit suffix is symmetrical for the target codeword;
iii) $u(i, L) = 0$ in all other cases.

Hence, the number of symmetrical codewords at level $L$ in an instantaneously decodable VLC, $m(L)$, is as follows:

$$m(L) = m_0(L) - \sum_{i=1}^{\lfloor L/2 \rfloor} u(i, L) \cdot n(i) - \sum_{i=\lfloor L/2 \rfloor + 1}^{L-1} x(i, L) \tag{2}$$

where $n(i)$ are components of the bit length vector of the original VLC denoting the number of codewords at level $i$, and $x(i, L)$ represents the number of codewords at level $i$ whose $(2i - L)$-bit suffixes are symmetrical. In (2), the second and third terms are used to calculate the total number of symmetrical codewords caused by the violation of the prefix condition in cases i) and ii), respectively.

From (2), the first and second terms are clearly determined only by the code length $L$ and bit length vector $n(i)$, while the last term is influenced by the bit alignment pattern of each codeword. This finding implies that different symmetrical RVLCs with different bit length vectors can be constructed based on different VLCs with identical bit length vectors, referred to herein as the variation problem. Furthermore, if the final term in (2) can be reduced, the number of available symmetrical codewords represented by $m(i)$ will increase, producing a more efficient symmetrical RVLC.

The procedure used by Takishima’s algorithm to construct the symmetrical RVLC can be summarized as follows.

Step 1) Initialize the bit length vector of the target symmetrical RVLC, $n_{\mathrm{rev}}(i)$, by the bit length vector $n(i)$ of the starting VLC (a Huffman code, for example).
Step 2) If $n_{\mathrm{rev}}(i) \le m(i)$, $n_{\mathrm{rev}}(i)$ is preserved unchanged because more symmetrical codewords than needed are currently available. Otherwise, one bit is added to some codewords such that

$$n_{\mathrm{rev}}(i+1) = n_{\mathrm{rev}}(i+1) + n_{\mathrm{rev}}(i) - m(i)$$

$$n_{\mathrm{rev}}(i) = m(i).$$

Step 3) Repeat Step 2) until every codeword has been assigned a symmetrical RVLC codeword.

III. TAKISHIMA’S ALGORITHM WITH AN EFFICIENT CODEWORD SELECTION MECHANISM

An efficient codeword selection mechanism is now presented, helping Takishima’s algorithm to overcome the variation problem and develop a more efficient symmetrical RVLC. A Huffman code is generated for a given probability distribution of symbols, and the bit-length vector of this Huffman code is then taken as the only input required by the novel scheme. Consequently, the constructed symmetrical RVLC is unrelated to the bit alignment patterns of the starting Huffman code.

Let $p(i)$ denote the total number of symmetrical codewords located at level $i$ which violate the prefix condition owing to some symmetrical codewords positioned in the path from the root to level $i$ that have been selected as target symmetrical codewords. That is,

$$p(i) = \sum_{k=1}^{\lfloor i/2 \rfloor} u(k, i) \cdot n_{\mathrm{rev}}(k) + \sum_{k=\lfloor i/2 \rfloor + 1}^{i-1} x(k, i). \tag{3}$$

Thus, the number of available symmetrical codewords, $m'(i)$, at level $i$ is

$$m'(i) = m_0(i) - p(i). \tag{4}$$

To make $m'(i)$ as large as possible for each level $i$, $p(i)$ must be minimized, which can be achieved in two ways. One method is to set $n_{\mathrm{rev}}(k)$, the first term of (3), at as small a value as possible. This signifies that some candidate codewords in the first half of the levels (from level 1 to level $i/2$) have to be marked as unavailable to maximize the number of available candidates at level $i$. However, this may negatively impact the average codeword length, causing it to increase. The other way to reduce $p(i)$ is to minimize the second term, which can be accomplished by carefully selecting target codewords when the number of available candidates exceeds the number of codewords needed at this level.

To simplify the selection mechanism, the available candidates are arranged in ascending order according to the maximum length of their symmetrical bit suffixes, excluding the first bit of each candidate codeword. The codeword selection order is based on this arrangement. Table I illustrates this ordering for symmetrical codewords at levels 4, 5, and 6, respectively. If a symmetrical codeword is required at level $l$, the first candidate $C$ to be selected is the one with maximum length $r$ of the symmetrical bit suffixes and without prefix condition violation for the selected codewords at earlier levels. The prefix condition violation occurs earliest at level $2l - r$ due to the selection of candidate $C$, and does not alter the number of candidates available at levels between $l + 1$ and $2l - r - 1$. Consequently, if a candidate codeword $C'$, for which the maximum length $t$ of the symmetrical bit suffixes exceeds $r$, is selected, a symmetrical codeword at level $2l - t - 1$ will no longer be a potential candidate. Hence, owing to the selection of codeword $C'$ instead of codeword $C$, the number of available symmetrical codewords at level $2l - t - 1$ is reduced by one, while the number of symmetrical codewords needed at the next level is increased by one. Consequently, the average code length obtained
from the selection of codeword $C'$ will probably exceed that obtained from the selection of codeword $C$. Thus, a better result will be obtained by adopting the proposed codeword selection criterion. As an example, assume a codeword is required at level 5, and further assume that codewords “01110” and “01010” are currently available for selection. Since the maximum lengths of the reversible bit suffixes of codewords “01110” and “01010” are one and three, respectively, codeword “01110” will be the first choice according to the proposed selection criterion. The reason for this choice is that selecting “01110” rules out a later candidate only at level 9, namely “011101110,” which would violate the prefix condition. Selecting “01010” instead rules out the candidate “0101010” already at level 7.

The procedure used by the proposed algorithm (Takishima’s algorithm with efficient codeword selection mechanism) to construct the symmetrical RVLC is summarized as follows.

Step 1) Initialize the bit-length vector of the target symmetrical RVLC, $n_{\mathrm{rev}}(i)$, by the bit length vector $n(i)$ of the starting VLC.
Step 2) Calculate $m'(i)$ and the maximum length $b(i)$ of the symmetrical bit suffixes, excluding the first bit of each candidate codeword in $m'(i)$, and arrange these codewords in ascending order of $b(i)$. Select the candidate codewords as target codewords in this sequence.
Step 3) Repeat Step 2) until every codeword has been assigned a symmetrical RVLC codeword.

IV. BOUNDS OF SYMMETRICAL RVLCs

In this section, we derive some theoretical results of the maximum codeword length and optimal average codeword length for certain sources.

A. Maximum Codeword Length for a Fixed-Length Huffman Code

Assume that a source has $N = 2^L$ symbols with a uniform probability distribution, and that the entropy of this source is known to be a maximum [11]. It would be the worst case for the construction of Huffman VLCs and symmetrical RVLCs for such a source. To obtain the average codeword length of a symmetrical RVLC, we must determine the maximum codeword length of the symmetrical RVLC in advance. The Huffman code produced by this situation is a fixed-length code with equal length $L$ for each symbol. The symmetrical codewords are then constructed starting from level $L$ by the proposed algorithm. There are $m(L) = m_0(L)$ codewords at level $L$, and all of these codewords are selected due to the fact that $m(L)$ is equal to or less than $N$. If $m(L)$ is less than $N$, we proceed to the next level to seek more symmetrical codewords without violating the prefix condition for the remaining VLCs. The procedure is then repeated until all Huffman VLCs have been assigned to symmetrical codewords.

Assuming the process is repeated $k + 1$ times, the maximum codeword length is then $L + k$. From (3), let $n_{\mathrm{rev}}(i)$ be zero when $1 \le i < L$. Since construction of the symmetrical RVLCs begins from level $L$, the following equations are obtained:

$$\begin{aligned}
m(L) &= m_0(L) \\
m(L+1) &= m_0(L+1) - x(L, L+1) \\
m(L+2) &= m_0(L+2) - x(L, L+2) - x(L+1, L+2) \\
&\;\;\vdots \\
m(L+k) &= m_0(L+k) - x(L, L+k) - x(L+1, L+k) - \cdots - x(L+k-1, L+k), \quad 0 \le k \le L-1.
\end{aligned} \tag{5}$$

Intuitively, $k$ should be between 0 and $L - 1$ since the upper bound of the maximum codeword length in this case is $2L - 1$. Consequently, $L + k$ cannot exceed $2L - 1$, guaranteeing that $k$ falls within this range. Let $x(L+j, L+i)$ denote the number of symmetrical codewords selected at level $L + j$ which are also the prefix of symmetrical candidates at level $L + i$ for $i > j$, as previously mentioned. Based on the symmetrical property, if the $L + 2j - i$ suffix bits of the codeword at level $L + j$ are reversible, then it will be a prefix of a symmetrical codeword at level $L + i$. Hence, these unqualified codewords must be eliminated from each level. Adding up the previous equations and then comparing the result with $N$ produces the following inequality:

$$T = m(L) + m(L+1) + \cdots + m(L+k) \ge N \tag{6}$$

... counted in the result of $x(l_2, l_3)$. For example, Table I reveals that when the codeword “0000” is selected at level 4, then codewords “00000” at level 5 and “000000” at level 6 have to be eliminated, and this codeword elimination process is represented by subtracting 1 from $x(4, 5)$ and $x(4, 6)$, respectively. However, $x(5, 6)$ should not have 1 subtracted due to the selection of codeword “0000.”

As mentioned above, calculating $x(l_2, l_3)$ requires exact knowledge of all the codewords selected prior to level $l_2$, and no formula could be derived to find the maximum symmetrical codeword length because of this restriction. To simplify the calculation of function $x(\cdot, \cdot)$ in (6), the dependence of codewords among several levels is ignored, only considering dependence between two levels. Consequently, a new term, $s(l_2, l_3)$, is defined, which represents the number of symmetrical codewords at level $l_2$ that are also the prefixes of the symmetrical codewords at level $l_3$. Substituting $s(l_2, l_3)$ for $x(l_2, l_3)$ in (5) produces

$$m(L+k) = m_0(L+k) - s(L, L+k) - s(L+1, L+k) - \cdots - s(L+k-1, L+k), \quad 0 \le k \le L-1. \tag{7}$$

Since $s(l_2, l_3)$ is greater than or equal to $x(l_2, l_3)$, the inequality (6) will also change into the following form through substitution:

$$T = \sum_{i=0}^{k'} m_0(L+i) - \sum_{i=1}^{k'} \sum_{j=0}^{i-1} s(L+j, L+i) \ge N \tag{8}$$

where $k'$ represents the minimum number of levels that must be sought, such that the total number of available candidates at levels from $L$ to $L + k'$ is greater than or equal to $N$, and $k'$ must be greater than or
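The level-by-level process of this subsection, combined with the selection order of Section III, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: candidates are enumerated exhaustively instead of being counted with (2) to (4), all function names are hypothetical, and the lexicographic tie-break among candidates with equal suffix length is an arbitrary choice. The sketch builds a symmetrical code for a uniform source with $N = 2^L$ symbols and lets one check that the longest codeword does not exceed $2L - 1$ for small $L$.

```python
from itertools import product


def palindromes(length):
    """Symmetrical (palindromic) binary words of a given length, lexicographic order."""
    half = (length + 1) // 2
    return [
        "".join(bits) + "".join(bits)[: length // 2][::-1]
        for bits in product("01", repeat=half)
    ]


def max_symmetric_suffix(word):
    """Length of the longest palindromic suffix that excludes the first bit."""
    for k in range(len(word) - 1, 0, -1):
        suffix = word[-k:]
        if suffix == suffix[::-1]:
            return k
    return 0


def is_prefix_free(words):
    return not any(a != b and b.startswith(a) for a in words for b in words)


def build_symmetrical_rvlc(bit_length_vector, order_by_suffix=True, max_level=20):
    """bit_length_vector[i - 1] is the number of codewords requested at level i."""
    selected, carry, level = [], 0, 0
    total = sum(bit_length_vector)
    while len(selected) < total:
        level += 1
        if level > max_level:  # safety guard for this sketch only
            raise RuntimeError("no symmetrical code found within max_level")
        requested = bit_length_vector[level - 1] if level <= len(bit_length_vector) else 0
        wanted = carry + requested
        # Candidates: palindromes that do not extend an already selected codeword.
        candidates = [
            w for w in palindromes(level)
            if not any(w.startswith(c) for c in selected)
        ]
        if order_by_suffix:
            # Ascending order of the maximum symmetrical suffix length (stable sort).
            candidates.sort(key=max_symmetric_suffix)
        chosen = candidates[:wanted]
        selected.extend(chosen)
        carry = wanted - len(chosen)  # unmet demand moves one level down
    return selected


def uniform_vector(L):
    """Bit length vector of the fixed-length Huffman code for 2**L symbols."""
    return [0] * (L - 1) + [2 ** L]


code_for_L3 = build_symmetrical_rvlc(uniform_vector(3))
longest_for_L3 = max(len(w) for w in code_for_L3)
```

For the example in Section III, `max_symmetric_suffix("01110")` and `max_symmetric_suffix("01010")` give the suffix lengths one and three quoted in the text, so “01110” sorts first.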
TABLE III
SYMMETRICAL RVLCs GENERATED FROM TAKISHIMA’S AND THE PROPOSED ALGORITHMS FOR ENGLISH ALPHABET
From the relation indicated in (19), the Huffman tree will be a single-side-growing tree, implying that the bit length vector has the form $(1, 1, \ldots, 1, 2)$.

Based on the general bit-length vector form $(1, 1, \ldots, 1, 2)$, only one symmetrical codeword is needed for each level except the last. Fortunately, one symmetrical codeword can always be found at each level without violating the prefix condition. All of these symmetrical codewords have a regular pattern (except for level one), namely “$10_n1$” or “$01_n0$,” depending on whether codeword “0” or codeword “1” is selected at level one. The subscript $n$ indicates that the designated bit is repeated $n$ times. Consequently, the candidate codewords become “0,” “11,” “101,” “1001,” and so on. Finally, the optimal average codeword length of symmetrical RVLCs for this kind of sources will be obtained. Fig. 1 displays the relative coding efficiency of Huffman and the proposed algorithm when they are applied to a source represented by (14), with = 2 for various symbol sizes. The figure indicates that the optimal average
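The regular codeword pattern described above can be generated with a few lines of Python. This is a hedged sketch: the function names are hypothetical, and the four-symbol dyadic distribution is an assumption chosen only so the arithmetic is exact; it is not the source defined by (14). It also assumes that the second codeword requested at the last Huffman level moves one level deeper, since the pattern supplies only one symmetrical codeword per level.

```python
def single_side_codewords(count, first="0"):
    """One symmetrical codeword per level: `first`, then other + first*n + other."""
    other = "1" if first == "0" else "0"
    words = [first]
    n = 0
    while len(words) < count:
        words.append(other + first * n + other)
        n += 1
    return words


def average_length(probabilities, lengths):
    return sum(p * length for p, length in zip(probabilities, lengths))


codewords = single_side_codewords(4)            # pattern for "0" chosen at level one
alternate = single_side_codewords(4, first="1")  # pattern for "1" chosen at level one

# Assumed dyadic source; its Huffman bit length vector has the form (1, 1, 2).
probabilities = [0.5, 0.25, 0.125, 0.125]
huffman_lengths = [1, 2, 3, 3]
rvlc_lengths = [len(w) for w in codewords]

huffman_average = average_length(probabilities, huffman_lengths)
rvlc_average = average_length(probabilities, rvlc_lengths)
```

Under these assumptions the symmetrical code costs one extra bit only on the least probable symbol, which is why its average length stays close to the Huffman average for such sources.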