UbiCC Bouzid BookChapter 494

ROBUST ENCODING OF THE FS1016 LSF PARAMETERS:
APPLICATION OF THE CHANNEL OPTIMIZED TRELLIS CODED VECTOR QUANTIZATION
BOUZID Merouane
Speech Communication and Signal Processing Laboratory,
Electronics Faculty, University of Sciences and Technology Houari Boumediene (USTHB),
P.O. Box 32, El-Alia, Bab-Ezzouar, Algiers, 16111, ALGERIA
mbouzid@usthb.dz, mbouzid@yahoo.com
ABSTRACT
Speech coders operating at low bit rates necessitate efficient encoding of the linear
predictive coding (LPC) coefficients. Line spectral Frequencies (LSF) parameters
are currently one of the most efficient choices of transmission parameters for the
LPC coefficients. In this paper, we present an optimized trellis coded vector
quantization (OTCVQ) scheme designed for robust encoding of the LSF
parameters. The objective of this system, called initially "LSF-OTCVQ Encoder",
is to achieve a low bit-rate quantization of the FS1016 LSF parameters. The
efficiency of the LSF-OTCVQ encoder (with weighted distance) was first demonstrated
in the ideal case of transmission over a noiseless channel. We then turned to
improving its robustness for real transmissions over noisy
channels. To implicitly protect the transmission parameters of the LSF-OTCVQ
encoder incorporated in the FS1016, we used joint source-channel coding carried
out by the channel optimized vector quantization (COVQ) method. For
transmissions over a noisy channel, we show that the new encoding system,
called "COVQ-LSF-OTCVQ Encoder", contributes significantly
to improving the FS1016 performance by ensuring robust coding
of its LSF spectral parameters.
Keywords: source-channel coding, robust speech coding, LSF parameters.
1 INTRODUCTION
In speech coding systems, the short-term spectral
information of the speech signal is often modelled by
the frequency response of an all-pole filter whose
transfer function is denoted by H(z) = 1/A(z), in
which $A(z) = 1 + a_1 z^{-1} + \cdots + a_p z^{-p}$ [1]. In telephone
band speech coding (300-3400 Hz, sampling frequency fe = 8 kHz), the
parameters of this filter are derived from the input
signal through linear prediction (LP) analysis of order p =
10. The 10 parameters $\{a_i\}_{i=1,2,\ldots,10}$, known as
the Linear Predictive Coding (LPC) coefficients [1],
play a major role in the overall bandwidth and in
preserving the quality of the encoded speech.
Therefore, the challenge in the quantization of the
LPC parameters is to achieve the transparent
quantization quality [2], with the minimum bit-rate
while maintaining the memory and computational
complexity at a low level.
In practice, the LPC coefficients are not quantized directly
because they have poor quantization
properties. Instead, other equivalent parametric
representations have been formulated which convert
them into parameters that are much more suitable for
quantization. One of the most efficient representations of

the LPC coefficients is the Line Spectral Frequency
(LSF) [3]. The LSF parameters (LSFs) which are
related to the zeros of polynomials derived from A(z)
[1] exhibit a number of interesting properties. These
properties [2] make them a very attractive set of
transmission parameters for the LPC coefficients.
Exploiting these properties, various coding schemes
based on scalar and vector quantization were
developed in the past for the efficient quantization of
spectral LSF parameters. Several works showed that
vector quantizer (VQ) schemes, such as
multistage VQ [4] and split VQ [2], can achieve
transparent quantization quality of
the LSFs at lower bit rates than schemes based on
a scalar quantizer (SQ).
In this paper, we present an optimized trellis
coded vector quantization (OTCVQ) scheme
designed for the efficient and robust coding of LSF
parameters. The aim of this system, initially called
"LSF-OTCVQ Encoder", is to achieve a
low bit rate transparent quantization of LSFs by
exploiting the intra-frame dependence between the
closest pairs of the LSF parameters. In the case of
ideal transmissions over a noiseless channel, we have
already shown in [5] that the LSF-OTCVQ encoder
(with weighted distance) achieves good
performance when applied to encode the LSF
parameters of the US Federal Standard FS1016.
Indeed, we have shown that the LSF-OTCVQ encoder
at 27 bits/frame produces perceptual
quality equivalent to that obtained when the LSF parameters are
unquantized.
Subsequently, our interest was drawn to
improving the robustness of the LSF-OTCVQ encoder
for real transmissions over noisy channels.
In low bit rate speech coding domain, the essential
objective is to reduce the bit rates of speech coders
while maintaining a good quality of transmission. In
general, during the design of speech coding systems,
the effects of transmission noises are often neglected.
A redundant channel coding [6] is conventionally
used to ensure an "explicit" protection to sensitive
parameters of speech coders against channel errors.
According to the separate design approach,
suggested by Shannon in his classical source/channel
coding theorems [7], the channel encoder can be
designed separately from the source encoder by
adding redundant bits (Error-detecting-correcting
codes) to source data. Indeed, robust encoding
systems could be designed according to this
separation approach but at the cost of an increase of
the bit-rate/delay transmission and the complexity of
the coding/decoding. However, at low bit rate where
the constraints in complexity and delay are very
severe, this channel coding is not especially
recommended. The separation design disadvantages
have motivated some researchers to investigate a
joint solution to the source and channel coding
optimization problem so that they can reduce the
complexity on both sides, while providing
performances close to the optimum. For these
purposes, Joint Source-Channel Coding (JSCC) was
introduced in which the overall distortion is
minimized by simultaneously considering the impact
of the transmission errors and the distortion due to
source coding [8], [9], [10]. Most of these works
have proved the effectiveness of the JSCC to protect
implicitly (i.e., without redundancy) source data
while maintaining a constant bit rate and a reduced
complexity.
To implicitly protect the transmission indices of
our LSF-OTCVQ encoder incorporated in the
FS1016, we used a JSCC method carried out by the
Channel Optimized Vector Quantization (COVQ).
We will first show how to adapt and apply
the COVQ technique to the robust
design of a new encoding system (called "COVQ-LSF-OTCVQ encoder") in order to implicitly protect
some of its indices. Finally, we will generalize the
study to the complete protection of all the indices
of the COVQ-LSF-OTCVQ encoder.
An outline of this paper is as follows. In section
2, we briefly review the basics of vector quantization.
In section 3, we describe the design steps of the
OTCVQ encoding system. Examples of comparative
results of TCVQ/OTCVQ encoders are reported in
this section. In section 4, we present the joint coding
method based on the COVQ technique. The performance
of the COVQ system applied to encode a memoryless
source is presented at the end of that section. The
application of the OTCVQ scheme to encoding the
LSF parameters is described in section 5. Simulation
results, obtained with two different distance measures
(unweighted and weighted) in the design and the
operation of the LSF-OTCVQ encoder, are provided.
In section 6, we present the application of the LSF-OTCVQ encoder to quantize the LSF parameters of
the FS1016 speech coder. A JSCC-COVQ
method is then used to implicitly protect the LSF-OTCVQ indices for transmissions over a noisy
channel. Conclusions are given in section 7.
2 VECTOR QUANTIZATION
A k-dimensional vector quantizer (VQ) of size L
is a mapping Q of the k-dimensional Euclidean space $\mathbb{R}^k$
into a finite subset (codebook) $Y = \{y_0, \ldots, y_{L-1}\}$
composed of L codevectors [11]. The design
principle of a VQ consists of partitioning the k-dimensional space of source vectors into L non-overlapping
cells $\{R_0, \ldots, R_{L-1}\}$ and associating with
each cell Ri a unique codevector yi.
Coding a sequence of input source vectors with a
VQ thus consists of associating with each source vector x
the binary index $i \in \{0, \ldots, L-1\}$ of the
codevector yi whose distance from the input vector is
minimal. In general, vector quantization
involves an irreversible loss of information, which
results in a quality degradation commonly evaluated
by a distortion measure. For a given VQ, the average
distortion is defined by [11]:
$$D = \frac{1}{k} \sum_{i=0}^{L-1} \int_{x \in R_i} d(x, y_i)\, p(x)\, dx, \qquad (1)$$
where p(x) is the k-fold probability density function

of the source and d(x, yi) is the widely used squared
Euclidean distance.
The optimal design of a VQ is based on
simultaneously searching for the partition
$\{R_0, \ldots, R_{L-1}\}$ and the representative codevectors $\{y_0, \ldots, y_{L-1}\}$
that minimize the average distortion D. To
solve this problem, two necessary conditions
of optimality need to be successively satisfied during
the VQ design process [11]:
1. For a given codebook $Y = \{y_0, y_1, \ldots, y_{L-1}\}$, the
optimal partition satisfies:
$$R_i = \{x : d(x, y_i) \le d(x, y_j),\ \forall j \ne i\} \qquad (2)$$
This is the nearest neighbor optimality condition.
2. Given an encoder partition $\{R_i,\ i = 0, \ldots, L-1\}$, the
optimal codevectors yi are the centroids of the
partition cells Ri (centroid condition):
$$y_i = \mathrm{Cent}(R_i) = E(X / X \in R_i) \qquad (3)$$
Various algorithms for the design of VQs have
been developed in the past. The most popular one is
certainly the LBG algorithm [12]. This algorithm
(LBG-VQ) is an iterative application of the two
optimality conditions, such that the partition and the
codebook are iteratively updated.
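The alternation between the nearest neighbor and centroid conditions can be sketched as follows. This is a minimal illustration, not the implementation used for the reported results: the function names, the random initialization from training vectors, the stopping threshold `tol` and the synthetic Gaussian training set are all assumptions made for the example.

```python
import numpy as np

def nearest_neighbor(train, codebook):
    """Nearest-neighbor partition (Eq. 2) and per-sample average distortion (Eq. 1)."""
    d2 = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    dist = d2[np.arange(len(train)), idx].mean() / train.shape[1]
    return idx, dist

def lbg(train, L, n_iter=100, tol=1e-5, seed=0):
    """Iterate the two optimality conditions until the distortion stops decreasing."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: L distinct training vectors drawn at random.
    codebook = train[rng.choice(len(train), size=L, replace=False)].copy()
    prev = np.inf
    for _ in range(n_iter):
        idx, dist = nearest_neighbor(train, codebook)
        if prev - dist <= tol * dist:
            break
        prev = dist
        for i in range(L):
            members = train[idx == i]
            if len(members) > 0:          # empty cells keep their codevector
                codebook[i] = members.mean(axis=0)   # centroid condition (Eq. 3)
    return codebook

# Synthetic example: unit-variance Gaussian source, k = 2, L = 16 (R = 2 bps).
rng = np.random.default_rng(1)
train = rng.standard_normal((4000, 2))
codebook = lbg(train, L=16)
_, dist = nearest_neighbor(train, codebook)
snr_db = 10 * np.log10(train.var() / dist)
print(f"training-set SNR = {snr_db:.2f} dB")
```

The SNR printed here is measured on the training set itself, so it is only indicative of what a separate test set would give.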
3 OPTIMIZED ENCODING SYSTEM BASED ON THE TRELLIS CODED VECTOR QUANTIZATION
The scalar trellis coded quantization (TCQ) [13]

and its generalized version to vector case (TCVQ)
[14], [15] improve upon traditional trellis encoders
[16] by labelling the trellis branches with entire
subsets rather than with individual reproduction
levels. This approach, which was motivated by
Ungerboeck's formulation of Trellis Coded
Modulation (TCM) [17], uses a structured alphabet
with an extended set of quantization levels.
In this work, we are particularly interested in
the TCVQ encoder, whose structure is quite similar to
that of TCQ, with an increase in complexity due to vector
codebook searching [14]. The design of a TCVQ
encoder consists of several interrelated steps. These
steps include selection of trellis, extended initial
codebook construction, partitioning of the
codebook's codevectors into subcodebooks (subsets)
and labelling the trellis branches with these subsets.
Consider the design process of a k-dimensional
TCVQ encoder of rate R bits per sample (bps) used
to encode a sequence of source vectors. The S-state
trellis used in TCVQ can be any one of Ungerboeck's
amplitude modulation trellises [17]. The extended
initial TCVQ codebook is generally designed by the
LBG algorithm. It contains $2^{kR+1}$ codevectors (twice
as many as the VQ). However, during the TCVQ
encoding process, only a subset of size $2^{kR}$ of these
codevectors may be used to represent a source vector
at any instant of time. According to Ungerboeck's
set partitioning method, the codevectors are then
partitioned into four subsets D0, D1, D2 and D3, each
of size $2^{kR-1}$. In our TCVQ encoder designs, we used
the heuristic algorithm described in [15] to partition
the extended TCVQ codebook. After that, the subsets
are labelled on the trellis branches according to
Ungerboeck's rules of TCM [17]. These rules are
meant to ensure that the distortion between the
original and the reconstructed source sequences

(under clear channel assumptions) is close to the
minimum.
To encode the sequence of source vectors, the well-known Viterbi algorithm [16] is used to find the
legitimate path through the trellis that
results in minimum distortion. The TCVQ encoder
transmits to the receiver a bit sequence specifying the
corresponding optimal path (sequence of subsets), in
addition to a sequence of (kR-1)-bit codewords
needed to specify the codevectors within the chosen
subsets. At the TCVQ decoder side, the bit sequence
that specifies the selected optimal trellis path is used
as the input to the convolutional coder of the TCVQ
system. The output of this coder selects the proper
subset Di. The codewords of the second binary
sequence are used to select the correct codevectors
from each subset.
An example of a 4-state scalar TCQ encoder of
rate R = 2 bps used to encode a memoryless source
that is uniformly distributed on the interval [-A, A]
is illustrated in Fig. 1.
[Figure 1: TCQ encoder of rate R = 2 bps: (a) section of the labelled 4-state trellis, with branch labels (input bit/subset) 0/D0, 1/D2, 0/D1, 1/D3, 1/D2, 0/D0, 1/D3, 0/D1; (b) output alphabet levels -7A/8, -5A/8, -3A/8, -A/8, A/8, 3A/8, 5A/8, 7A/8, assigned in that order to the subsets D0, D1, D2, D3, D0, D1, D2, D3; (c) TCQ convolutional coder, with input bit x0, output bits y1 and y0, and labels s2, s1, s0.]
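The Viterbi search over the 4-state trellis of Fig. 1 can be sketched as below for the scalar case (R = 2 bps, eight output levels). The subset labels per state and the level-to-subset assignment follow the figure; the next-state table, the function names and the uniform test source are assumptions made for illustration, so this is not necessarily the exact trellis wiring used by the author.

```python
import numpy as np

# TRELLIS[state][input_bit] = (subset index, next state). Subset labels follow
# Fig. 1(a); the next-state assignments are an ASSUMPTION for this sketch.
TRELLIS = (
    ((0, 0), (2, 1)),   # state 0: 0/D0, 1/D2
    ((1, 2), (3, 3)),   # state 1: 0/D1, 1/D3
    ((0, 1), (2, 0)),   # state 2: 0/D0, 1/D2
    ((1, 3), (3, 2)),   # state 3: 0/D1, 1/D3
)

def make_subsets(A=1.0):
    levels = (2 * np.arange(8) - 7) * A / 8.0    # -7A/8, ..., 7A/8
    return [levels[d::4] for d in range(4)]      # D0..D3, two levels each

def tcq_encode(x, A=1.0, start_state=0):
    """Viterbi search: returns one path bit and one in-subset bit per sample."""
    subsets = make_subsets(A)
    cost = np.full(4, np.inf)
    cost[start_state] = 0.0
    history = []                                  # survivor branches per step
    for sample in x:
        new_cost = np.full(4, np.inf)
        step = [None] * 4
        for s in range(4):
            if not np.isfinite(cost[s]):
                continue
            for bit in (0, 1):
                d, nxt = TRELLIS[s][bit]
                errs = (subsets[d] - sample) ** 2
                j = int(np.argmin(errs))          # best level inside the subset
                c = cost[s] + errs[j]
                if c < new_cost[nxt]:
                    new_cost[nxt] = c
                    step[nxt] = (s, bit, j)
        history.append(step)
        cost = new_cost
    s = int(np.argmin(cost))                      # best final state
    path_bits, code_bits = [], []
    for step in reversed(history):                # trace the survivor path back
        prev, bit, j = step[s]
        path_bits.append(bit)
        code_bits.append(j)
        s = prev
    return path_bits[::-1], code_bits[::-1]

def tcq_decode(path_bits, code_bits, A=1.0, start_state=0):
    """Replay the path bits through the trellis to select subsets and levels."""
    subsets = make_subsets(A)
    s, out = start_state, []
    for bit, j in zip(path_bits, code_bits):
        d, s = TRELLIS[s][bit]
        out.append(subsets[d][j])
    return np.array(out)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
path_bits, code_bits = tcq_encode(x)
x_hat = tcq_decode(path_bits, code_bits)
mse = float(np.mean((x - x_hat) ** 2))
print(f"MSE = {mse:.4f}")
```

Each sample costs one path bit plus one bit selecting a level inside the chosen subset, which gives the 2 bps of the example. The vector case replaces the scalar levels with codevectors and the squared error with the chosen vector distance.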
Examples of simulation results for encoding
unit-variance memoryless Gaussian sources using
integer and fractional rate TCVQ encoders are
respectively given in tables 1 and 2. For different
rates, results are given in terms of Signal to Noise
Ratio (SNR) in dB, along with the corresponding
LBG-VQ performance and distortion rate function
D(R). Notice that when the rate is fractional, the
dimension k has to be such that kR is an
integer.
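As a check on the D(R) reference values, the worked equation below recalls the distortion-rate function of a memoryless Gaussian source of variance $\sigma^2$ under squared-error distortion. It is a standard result restated here for convenience, not a derivation taken from this paper.

```latex
D(R) = \sigma^2\, 2^{-2R}, \qquad
\mathrm{SNR}(R) = 10 \log_{10} \frac{\sigma^2}{D(R)}
               = 20\, R \log_{10} 2 \approx 6.02\, R \ \text{dB}.
% Examples: R = 1 bps gives 6.02 dB and R = 2 bps gives 12.04 dB.
```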
Table 1: Performance (SNR in dB) of TCVQ encoding with
integer rates for the Gaussian source.
Rate    Dim.   TCVQ trellis size (number of states)     LBG-VQ   D(R)
(bps)   k      4       8       16      32      64
1       1      4.64    4.77    4.85    4.93    4.98     4.40     6.02
               4.85    5.03    5.09    5.10    5.21     4.42
               4.98    5.08    5.12    5.15    5.22     4.49
               5.05    5.14    5.16    5.18    5.23     4.69
2              10.18   10.31   10.38   10.46   10.51    9.31     12.04
               10.36   10.50   10.57   10.60   10.69    9.70
               10.59   10.69   10.72   10.75   10.81    10.00
               10.93   11.02   11.05   11.07   11.12    10.41
Table 2: Performance (SNR in dB) of TCVQ encoding with
fractional rates for the Gaussian source.
Rate    Dim.   TCVQ trellis size (number of states)                LBG-VQ   D(R)
(bps)   k      4      8      16     32     64     128    256
0.66    6      3.34   3.39   3.41   3.42   3.45   3.48   3.49     3.05     4.01
0.75    4      3.72   3.78   3.80   3.82   3.87   3.90   3.93     3.36     4.51
0.80    5      3.96   4.04   4.07   4.08   4.14   4.18   4.20     3.69     4.82
At the same encoding rate, these results show

that the TCVQ outperforms the TCQ (k = 1).
Moreover, the TCVQ allows fractional rates as
shown by the simulation results listed in table 2. We
can see also that, for a given rate, the TCQ/TCVQ
performances are higher than those of the
conventional SQ/VQ.
To further improve the TCVQ performance, a
training optimization procedure for the extended
TCVQ codebook design was developed [5]. For a
given set of training source vectors, this procedure updates
the TCVQ codebook by replacing each codevector
with the average of all the source vectors mapped to
this codevector. This leads to an iterative design
algorithm for the overall TCVQ encoder. With this
optimization variant, the algorithm is called the
OTCVQ (Optimized Trellis Coded Vector
Quantization) algorithm.
Examples of simulation results for encoding
memoryless Gaussian sources using fractional rate
OTCVQ encoders are listed in table 3.
Table 3: Performance (SNR in dB) of the OTCVQ with
fractional rates for the Gaussian source.
Rate    Dim.   TCVQ trellis size (number of states)                LBG-VQ   D(R)
(bps)   k      4      8      16     32     64     128    256
0.66    6      3.41   3.45   3.47   3.48   3.49   3.52   3.53     3.05     4.01
0.75    4      3.81   3.85   3.87   3.89   3.93   3.96   3.97     3.36     4.51
0.80    5      4.08   4.14   4.16   4.17   4.21   4.23   4.25     3.69     4.82
Comparing these results with those given in table 2,

we clearly notice the performance improvements
brought by the optimization of the TCVQ codebooks.

4 JOINT CODING BY THE CHANNEL OPTIMIZED VECTOR QUANTIZATION
Vector quantization is currently used in various

practical applications and since some type of channel
noise is present in any practical communication
system, the analysis and design of VQs for noisy
channels is receiving increasing attention.
In this work, we considered joint source-channel coding (JSCC) associated specifically with
the use of VQ in order to provide implicit
protection to our quantizers. In particular, we were
interested in a category of JSCC relating to
quantizers optimized by taking into account the
channel error probability, namely channel
optimized vector quantization [8], [18].
4.1 COVQ system principle: Modified optimality conditions
A channel optimized vector quantizer (COVQ) is

a coding scheme based on the principle of VQ
generalization by taking into account the present
noise on the transmission channel. The idea is to
exploit the knowledge about the channel in the
codebook design process and the encoding algorithm.
Thus, the operations of source and channel coding
are integrated jointly into the same entity by
incorporating the channel characteristics in the
design procedure. Indeed, the LBG-VQ is well
appropriate to a modification in this sense. The
purpose then is to minimize a modified total average
distortion between the reconstituted signal and the
original signal, given the channel noise.
The design of a COVQ encoder is carried out by
a VQ version extended to the noisy case [8], [18].
The COVQ scheme keeps the same VQ block
structure (encoder/decoder, dimension, bit rate). The
difference is in the formulation of the necessary
conditions of optimality to minimize a modified
expression of the total average distortion. This new
distortion is formulated by considering
simultaneously the distortion due to vector
quantization and channel errors [18], [19]:
$$D = \frac{1}{k} \sum_{i=0}^{L-1} \int_{R_i} p(x) \left\{ \sum_{j=0}^{L-1} p(j/i)\, d(x, y_j) \right\} dx, \qquad (4)$$
where p(j/i) is the channel transition probability,
which represents the probability that the index j is
received given that the index i is transmitted. By
comparing Eq. (4) with Eq. (1), one can easily notice
that these two equations are equivalent, except
that Eq. (4) uses a modified distance measure
(the term in braces). It is the same distance d, but
weighted by the channel transition
probabilities p(j/i), i, j = 0, ..., L-1.

The necessary optimality conditions
of COVQ are also derived in two steps,
according to the minimization principle of the
modified total average distortion [8], [18], [19].
For a given codebook $Y = \{y_0, \ldots, y_{L-1}\}$ and by
using a squared Euclidean distance measure, the
optimal partition Ri (i = 0, ..., L-1) for a noisy channel
is such that:
$$R_i = \left\{ x \in \mathbb{R}^k : \sum_{j=0}^{L-1} p(j/i)\, \|x - y_j\|^2 \le \sum_{j=0}^{L-1} p(j/l)\, \|x - y_j\|^2,\ \forall l \ne i \right\} \qquad (5)$$
Similarly, the optimum codebook for a fixed
partition is given by:
$$y_j = \frac{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} x\, p(x)\, dx}{\sum_{i=0}^{L-1} p(j/i) \int_{R_i} p(x)\, dx}, \quad j = 0, \ldots, L-1. \qquad (6)$$
The codevector yj represents now the centroid of
all input vectors that are decoded into the cell Rj,
even if the transmitted index i is different from j. The
equations (5) and (6) are respectively referred as the
generalized nearest neighbour and centroid
conditions with a modified distortion measure. The
optimal codevectors for noisy channel are thus linear
combinations of those for the noiseless case,
weighted by the a posteriori channel transition
probabilities.
In our applications, the communication channel
considered is a discrete memoryless channel with
finite input and output alphabets. Precisely, we
assumed a memoryless binary symmetric channel
(BSC) model with bit error (crossover) probability p
[6], [16]. For codewords (VQ indices) of n bits, the
BSC transition probabilities are described by [9],
[19]:
$$p(j/i) = (1 - p)^{\,n - d_H(i,j)}\, p^{\,d_H(i,j)}, \qquad (7)$$
where dH(i, j) (0 ≤ dH(i, j) ≤ n) is the Hamming
distance between the n-bits binary codewords
represented by integers i and j.
When the channel bit error probability p is
sufficiently small, the probability of multiple bit
errors in an index is very small relative to the
probability of zero or one bit error [9], [18], [19]. To
simplify the numerical computations, it is often
adequate to consider only the effects of single bit
errors on channel codewords. The BSC channel
model can be then approximated by [9]:
$$p(j/i) = \begin{cases} p, & j \in E_i, \\ 1 - np, & j = i, \\ 0, & \text{otherwise}, \end{cases} \qquad (8)$$
where $E_i$ is the set of all integers j, (0 ≤ j ≤ L-1),
such that the binary representation of j is of
Hamming distance one from the binary
representation of i.
In the case where the source distribution is
unknown, long training database of k-dimensional
vectors can be used for the quantizer design. With
the approximation given in Eq. (8), the equations (4)
and (6) will be respectively modified as:
$$D = \frac{1}{N} \frac{1}{k} \sum_{t=0}^{N-1} \sum_{j=0}^{L-1} p(j/i_t)\, d(x_t, y_j), \qquad (9)$$
and:
$$y_j = \frac{\sum_{i=0}^{L-1} p(j/i) \sum_{l : x_l \in R_i} x_l / N}{\sum_{i=0}^{L-1} p(j/i)\, |R_i| / N}, \qquad (10)$$
where N is the size of the training base and |Ri|
denotes the number of training vectors belonging to
the cell Ri.
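To make the BSC index transition probabilities concrete, the sketch below builds the exact matrix of Eq. (7) and the single-bit-error approximation of Eq. (8) for n-bit indices and compares them. The function names and the values n = 4, p = 0.01 are illustrative assumptions.

```python
import numpy as np

def hamming_distances(n_bits):
    """d_H(i, j) for all pairs of n-bit indices."""
    L = 1 << n_bits
    return np.array([[bin(i ^ j).count("1") for j in range(L)] for i in range(L)])

def bsc_exact(n_bits, p):
    """P[i, j] = p(j/i) = (1 - p)^(n - d_H) * p^d_H, as in Eq. (7)."""
    d_h = hamming_distances(n_bits)
    return (1.0 - p) ** (n_bits - d_h) * p ** d_h

def bsc_single_error(n_bits, p):
    """Approximation of Eq. (8): only zero or one bit error per index."""
    d_h = hamming_distances(n_bits)
    P = np.where(d_h == 1, p, 0.0)
    np.fill_diagonal(P, 1.0 - n_bits * p)
    return P

P_exact = bsc_exact(4, 0.01)
P_approx = bsc_single_error(4, 0.01)
max_gap = float(np.abs(P_exact - P_approx).max())
print(f"largest entry-wise difference: {max_gap:.2e}")
```

Both matrices have rows that sum to one. For small p the entry-wise difference stays small, which is the justification given above for using the approximation.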
4.2 COVQ encoder design algorithm
The design procedure of the COVQ encoding
system is a straightforward extension of the LBG-VQ algorithm. An iterative optimization of the two
modified optimality conditions is carried out such that
the partition and the codebook codevectors are
updated by using the modified distortion including
the channel probability [8], [18]. The steps of our
version of the COVQ algorithm are detailed in [20].
We suppose that a set of input vectors is
available (training base) and that the BSC channel
error probability is given. This channel probability,
often called the design error probability ε of the
COVQ codebook, is considered as an input
parameter in the optimization process. At the
beginning, this design parameter is temporarily set to
a low value; it is then gradually increased until it matches
the desired design error probability.
The choice of the initial codebook is very
important since it can significantly impact the final
results. In our design, the initial codebook is
designed for ε = 0 (i.e., for a noiseless channel) by
a simple run of the conventional LBG-VQ
algorithm, which converges to a locally optimal
codebook. This codebook is used as the initial
codebook of the COVQ algorithm. Then, for each
stage of ε, the algorithm converges to an
intermediate codebook, which is used as the initial
codebook of the next stage in the COVQ design
process.
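The staged design described above can be sketched as follows, using the training-set forms of the generalized nearest neighbor and centroid conditions with the exact BSC probabilities. This is only an illustration under stated assumptions: the function names, the linear schedule of design probabilities, the fixed number of iterations per stage and the synthetic Gaussian training set are hypothetical, and the author's actual algorithm is the one detailed in [20].

```python
import numpy as np

def bsc_matrix(n_bits, eps):
    """P[i, j] = p(j/i) for n-bit indices over a BSC (identity when eps = 0)."""
    L = 1 << n_bits
    d_h = np.array([[bin(i ^ j).count("1") for j in range(L)] for i in range(L)])
    return (1.0 - eps) ** (n_bits - d_h) * float(eps) ** d_h

def modified_distances(train, codebook, P):
    """mod[t, i] = sum_j p(j/i) ||x_t - y_j||^2 (modified distance measure)."""
    d2 = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2 @ P.T

def covq_iteration(train, codebook, P):
    mod = modified_distances(train, codebook, P)
    idx = mod.argmin(axis=1)                      # generalized nearest neighbor
    counts = np.bincount(idx, minlength=len(codebook)).astype(float)
    sums = np.zeros_like(codebook)
    np.add.at(sums, idx, train)                   # per-cell sums of vectors
    num, den = P.T @ sums, P.T @ counts           # generalized centroid
    updated = codebook.copy()
    ok = den > 0                                  # unreachable cells are kept
    updated[ok] = num[ok] / den[ok, None]
    return updated

def covq_design(train, codebook, n_bits, eps_schedule, n_iter=30):
    """Each stage starts from the codebook produced by the previous stage."""
    for eps in eps_schedule:
        P = bsc_matrix(n_bits, eps)
        for _ in range(n_iter):
            codebook = covq_iteration(train, codebook, P)
    return codebook

def channel_distortion(train, codebook, P, channel_aware):
    """Expected per-sample distortion over the channel for a given encoder rule."""
    mod = modified_distances(train, codebook, P)
    if channel_aware:
        idx = mod.argmin(axis=1)
    else:                                         # plain nearest-neighbor encoder
        d2 = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
    return float(mod[np.arange(len(train)), idx].mean() / train.shape[1])

rng = np.random.default_rng(0)
train = rng.standard_normal((3000, 2))
lbg_cb = covq_design(train, train[:16].copy(), 4, [0.0])      # eps = 0: plain LBG
covq_cb = covq_design(train, lbg_cb, 4, [0.0125, 0.025, 0.0375, 0.05])
P = bsc_matrix(4, 0.05)
d_lbg = channel_distortion(train, lbg_cb, P, channel_aware=False)
d_covq = channel_distortion(train, covq_cb, P, channel_aware=True)
print(f"expected distortion at p = 0.05: LBG {d_lbg:.4f}, COVQ {d_covq:.4f}")
```

With a design probability of zero the transition matrix is the identity, so the same iteration reduces to the conventional LBG-VQ update. That is why the initial codebook can be produced by the same routine.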
The greatest difficulty in the COVQ system
design is that the channel error probability is a
parameter in the optimization process. In real
transmission situation, this parameter is difficult to
estimate. It may even vary in time, making the
design according to a specific value rather academic.
Thus, according to the practical situation and to the
estimates of the real communication channel
characteristics, COVQ encoders can be selected to
obtain the highest degree of robustness.
4.3 COVQ encoder performances
We now present numerical results on the
performance of the COVQ encoding system operating
over a BSC channel with variable bit error
probability p. Examples of simulation results of
COVQ encoders, trained for various values of the
design probability parameter (ε = 0.001, 0.005,
0.010 and 0.050), are given in table 4. These encoders,
whose selected characteristics are k = 2, R = 2 bps
and L = 16, were applied to encode a memoryless
Gaussian source. For a comparative evaluation with
the conventional VQ, the LBG-VQ (designed for a
noiseless channel, ε = 0.000) performance is also
included in the table.
Table 4: SNR performance (in dB) comparison between
COVQ and VQ over the BSC channel.
Channel    Design error probability ε
p          0.000 (LBG-VQ)   0.001    0.005    0.01     0.05
0.000      9.686            9.685    9.624    9.537    8.664
0.001      9.584            9.604    9.565    9.481    8.643
0.005      9.292            9.314    9.357    9.332    8.571
0.01       8.927            8.965    9.034    9.179    8.477
0.05       6.824            6.918    7.351    7.608    7.800
0.1        4.650            5.292    5.875    6.801    7.043
0.2        2.518            3.109    3.876    4.752    5.886
In the case of transmissions over noisier
channels (higher values of p), the results indicate that
COVQ performs better than LBG-VQ. For example,
for a BSC with p = 0.2, a considerable SNR gain of
3.36 dB was obtained by the COVQ (trained for ε =
0.05) compared with the LBG-VQ. One notices that,
when the channel probability p does not match
the design probability ε, the COVQ encoders trained for
ε identical or close to p are those which yield the
best performance. However, when the channel is
noiseless (p = 0.000), the SNR performance of
COVQ encoders degrades as
the design parameter ε increases. In this case, the LBG-VQ
ensures performance comparable to or better than that of the
COVQ. The same remarks hold when the channel error
probability is low (p < 0.005), with a slight
performance improvement obtained by COVQ
encoders trained for a low value of the design
parameter (for example, COVQ for ε = 0.001).
5 OPTIMIZED-TCVQ FOR LOW-BIT RATE ENCODING OF LSF PARAMETERS
Using the OTCVQ encoding technique, an
encoding scheme for the LSF parameters is presented
in this section. The aim of this encoding system,
called "LSF-OTCVQ Encoder" [5], is to efficiently
quantize the LSF parameters of one frame using only
the dependencies among the parameters of that same frame.
For speech coding applications, the OTCVQ is
used in block mode, where each block corresponds to
an LSF vector of size 10. In this work, two-dimensional (2-D) codebooks (k = 2) are used for
encoding the LSF vectors. Thus, each stage in the
trellis diagram is associated with two components of the LSF
vector. Hence, there are five stages in the LSF-OTCVQ trellis, with two branches entering and
leaving each state. Since the LSF parameters have
different means and variances, five extended
codebooks are needed to encode an LSF vector.
Knowing that the choice of an appropriate distance
measure is an important issue in the design of any
VQ system, we have used another distance measure
in the design and operation steps of the LSF-OTCVQ encoder: the weighted Euclidean
distance measure. Based on the LSF parameter
properties, several weighted distance measures have
been proposed for LSF encoding [2], [4], [21]. In
our applications, we used the weighted squared
Euclidean distance given by:
$$d(f, \hat{f}) = \sum_{i=1}^{10} c_i\, w_i\, (f_i - \hat{f}_i)^2, \qquad (11)$$
where $f_i$ and $\hat{f}_i$ are respectively the ith coefficients of
the original $f$ and quantized $\hat{f}$ LSF vectors; ci and wi
represent respectively the constant and variable
weights assigned to the ith LSF coefficient.
These weights are meant to provide a better
quantization of LSF parameters in the formant
regions. Many weighting functions have been
defined to calculate the variable weight vector w =
[w1, ..., w10]. In particular, we used the weighting
function known as the inverse harmonic mean
(IHM) [21]:
$$w_i = \frac{1}{f_i - f_{i-1}} + \frac{1}{f_{i+1} - f_i}, \qquad (12)$$
where f0 = 0 and f11 = 0.5. The constant weight vector
c = [c1, ..., c10] is experimentally determined [2]:
$$c_i = \begin{cases} 1.0, & 1 \le i \le 8, \\ 0.8, & i = 9, \\ 0.4, & i = 10. \end{cases} \qquad (13)$$
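The weighted distance of Eqs. (11) to (13) can be sketched as follows. The function names are illustrative, and the LSFs are assumed to be expressed as frequencies normalized by the sampling frequency (so that they lie between f0 = 0 and f11 = 0.5), which is how the boundary values above are read here.

```python
import numpy as np

# Constant weights c_1..c_10 of Eq. (13).
C = np.array([1.0] * 8 + [0.8, 0.4])

def ihm_weights(f):
    """Inverse harmonic mean weights of Eq. (12), with f_0 = 0 and f_11 = 0.5."""
    ext = np.concatenate(([0.0], f, [0.5]))
    gaps = np.diff(ext)                 # f_i - f_{i-1} for i = 1..11
    return 1.0 / gaps[:-1] + 1.0 / gaps[1:]

def weighted_distance(f, f_hat):
    """Weighted squared Euclidean distance of Eq. (11); weights use the original f."""
    w = ihm_weights(f)
    return float(np.sum(C * w * (f - f_hat) ** 2))

# Toy LSF vector with equally spaced frequencies (all gaps equal to 1/22).
f = np.arange(1, 11) / 22.0
f_hat = f + 0.001
print(ihm_weights(f))                   # every weight equals 44 in this toy case
print(weighted_distance(f, f_hat))
```

Closely spaced LSFs, which typically mark formant regions, produce small gaps and therefore large weights, so errors there are penalized more heavily.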
The LSF quantizer performances are evaluated
by the average spectral distortion (SD) which is often
used as an objective measure of the LSF encoding
performance. This measure correlates well with
human perception of distortion. When calculated
discretely over a limited bandwidth, the spectral
distortion for frame i is given, in decibels, by [4]:
$$SD_i = \sqrt{\frac{1}{n_1 - n_0} \sum_{n=n_0}^{n_1 - 1} \left[ 10 \log_{10} \frac{S(e^{j 2\pi n/N})}{\hat{S}(e^{j 2\pi n/N})} \right]^2}. \qquad (14)$$
For speech signal sampled at 8 kHz with a 3 kHz
bandwidth, an N = 256 point FFT is used to compute
the original $S(e^{j 2\pi n/N})$ and quantized $\hat{S}(e^{j 2\pi n/N})$ power
spectra of the LPC synthesis filter, associated with
the ith frame of speech. The spectral distortion is thus
computed discretely with a resolution of 31.25 Hz
per sample over 96 uniformly spaced points from
125 Hz to 3.125 kHz. The constants n0 and n1 in Eq.
(14) correspond to 1 and 96 respectively.
Generally, it is accepted that an average SD of
about 1 dB indicates negligible audible distortion has
incurred during quantization. This value has been, in
the past, suggested for transparent quantization
quality and used as a goal in designing many LPC
quantization schemes. In [2], Paliwal and Atal
established that the average SD is not sufficient to
measure perceived quality alone. They introduced
the notion of spectral outliers frames. Consequently,
we can get transparent quality if we maintain the
following three conditions:
1) The average SD is about 1 dB,
2) The percentage of outlier frames having SD
between 2 and 4 dB is less than 2%,
3) No frames must have SD greater than 4 dB.
Now, we evaluate the performances of our LSF-OTCVQ encoder operating at different bit rates. All
simulation results reported in this section were
obtained by using four-state trellis and 2-D
codebooks. For each encoding rate, 2 bits are thus
assigned to represent the initial state. When the
remaining bits cannot be equally assigned to
represent the five 2-D codebooks, fewer bits are used
in the last codebooks, since it is known that human
resolution in the higher frequency bands is less than
in the lower frequency bands. We investigated the
optimum bit allocations for the LSF-OTCVQ
encoder and found that the bit allocations given in
table 5 yield the best results.
Table 5: Bit allocations of each LSF-OTCVQ trellis
stage codebook as a function of bit rate.
Bits / LSF    Bits / stage codebook (trellis stage number)
vector        1    2    3    4    5
24            5    5    5    4    3
25
26
27
28
The speech data used in the experiments of this
section consists of approximately 43 min of speech
taken from the TIMIT speech database [22]. To
construct the LSF database, we have used the same
LPC analysis function of the FS1016 speech coder
[23]. A 10-order LPC analysis, based on the
autocorrelation method, is performed every analysis
frame of 30 ms using a Hamming window. One part
of the LSF database, consisting of 75000 LSF
vectors, is used for training and the remaining part,
of 11262 LSF vectors (different from the training
set), is used for test.
For different bit rates, the performances of the
LSF-OTCVQ encoder are shown in table 6. These
results have been obtained by using separately
two different distortion measures (unweighted and
weighted distances) in both the design and the
operation of the LSF-OTCVQ encoder.
Table 6: Performances of the LSF-OTCVQ encoder as a function of bit rate.
              LSF-OTCVQ (unweighted distance)      LSF-OTCVQ (weighted distance)
Bits/frame    Average    SD outliers (in %)        Average    SD outliers (in %)
              SD (dB)    2-4 dB     > 4 dB         SD (dB)    2-4 dB     > 4 dB
24            1.34       7.04       0.03           1.29       5.26       0.02
25            1.24       3.97       0.03           1.19       2.99       0.00
26            1.18       3.01       0.02           1.15       2.72       0.00
27            1.14       2.95       0.02           1.07       1.90       0.00
28            1.04       1.60       0.01           0.98       1.10       0.00
These comparative results clearly show the
improvement in LSF-OTCVQ performance
obtained by using the weighted distance. The LSF-OTCVQ encoder, designed with a weighted distance,
needs 27 bits/frame to achieve transparent quantization
quality. Compared to the encoder designed with the
unweighted distance, it saves about 1-2 bits/frame
while maintaining comparable performance.
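The spectral distortion measure behind these results can be sketched as below: the root mean square of the log power spectrum difference, as commonly defined in [4], together with the outlier statistics. The function names are illustrative. The default bin range n0 = 4, n1 = 100 is an assumption derived from the stated 125 Hz to 3.125 kHz range at 31.25 Hz resolution (96 points of a 256-point FFT at 8 kHz). The inputs are LPC polynomials [1, a1, ..., ap], so LSF vectors would first have to be converted back to LPC coefficients, which is not shown.

```python
import numpy as np

def lpc_power_spectrum(a, n_fft=256):
    """Power spectrum 1/|A(e^{jw})|^2 of the all-pole filter with A = [1, a1, ..., ap]."""
    A = np.fft.fft(a, n_fft)            # zero-padded FFT of the LPC polynomial
    return 1.0 / np.abs(A) ** 2

def spectral_distortion(a, a_hat, n0=4, n1=100, n_fft=256):
    """RMS log-spectral difference in dB over FFT bins n0 .. n1-1."""
    s = lpc_power_spectrum(a, n_fft)[n0:n1]
    s_hat = lpc_power_spectrum(a_hat, n_fft)[n0:n1]
    diff_db = 10.0 * np.log10(s / s_hat)
    return float(np.sqrt(np.mean(diff_db ** 2)))

def sd_statistics(sd_values):
    """Average SD and percentages of frames in the 2-4 dB and > 4 dB outlier classes."""
    sd = np.asarray(sd_values, dtype=float)
    pct_2_4 = 100.0 * np.mean((sd >= 2.0) & (sd <= 4.0))
    pct_gt_4 = 100.0 * np.mean(sd > 4.0)
    return float(sd.mean()), float(pct_2_4), float(pct_gt_4)

# Toy check: scaling A(z) by 2 lowers the power spectrum by a factor of 4 everywhere.
a = np.array([1.0, -0.9])
print(spectral_distortion(a, a))        # identical filters: zero distortion
print(spectral_distortion(a, 2.0 * a))  # constant gap of 10*log10(4) dB
print(sd_statistics([1.0, 3.0, 5.0, 1.0]))
```

In an evaluation of the kind reported in table 6, `spectral_distortion` would be applied frame by frame and `sd_statistics` to the resulting list of values.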
6 EFFICIENT AND ROBUST CODING OF THE FS1016 LSF PARAMETERS: APPLICATION OF THE LSF-OTCVQ
In this section, we use the LSF-OTCVQ encoder
(with weighted distance) to quantize the LSF
parameters of the FS1016. For the moment, we
suppose that the transmissions are done over an
ideal noiseless channel. Recall that the US Federal
Standard FS1016 is a 4.8 kbits/s Code Excited Linear
Prediction (CELP) speech coder [23]. In
the FS1016 standard, the LSF parameters are originally
encoded by an SQ of 34 bits/frame.
For the same test database (11262 LSF vectors),
this 34 bits/frame LSF SQ results in an average SD
of 1.72 dB, 25.99 % outliers in the range 2-4 dB, and
0.46 % outliers having SD greater than 4 dB. By
comparing these results with those given in table 6,
we can see that the LSF-OTCVQ encoder (at all the
studied lower rates) performs better than the 34
bits/frame SQ originally used in the FS1016. Thus,
several bits per frame can be saved by
applying the LSF-OTCVQ in the LSF encoding
process of the FS1016.
Subjective listening tests of the 27 bits/frame
LSF-OTCVQ encoder were also performed.
With this encoder incorporated in the FS1016, the bit rate
for the quantization of the LSF parameters decreases
to 900 bits/s and consequently the FS1016 operates at
a bit rate of 4.57 kbits/s. To carry out these tests, we
generated three versions of synthetic speech from the
same original speech signal: one with
unquantized LSFs and two with LSFs quantized
respectively by the 27 bits/frame LSF-OTCVQ
encoder and by the 34 bits/frame SQ.
Subjective quality is evaluated here through
A-B comparison and MOS (Mean Opinion Score)
tests using 8 listeners. Six sentences from the TIMIT
database (spoken by three male and three female
speakers) are used for the subjective evaluations.
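The rate figures above follow from simple arithmetic, sketched below. The 30 ms frame duration is not stated in this passage; it is inferred here from the correspondence between 27 bits/frame and 900 bits/s, and the variable names are illustrative.

```python
FRAME_DURATION_S = 0.030   # assumption: inferred from 27 bits/frame <-> 900 bits/s
ORIGINAL_CODER_BPS = 4800  # FS1016 rate with the 34 bits/frame SQ

def lsf_rate_bps(bits_per_frame, frame_duration_s=FRAME_DURATION_S):
    """Bit rate spent on the LSF parameters for a given per-frame budget."""
    return bits_per_frame / frame_duration_s

sq_bps = lsf_rate_bps(34)       # original scalar quantizer
otcvq_bps = lsf_rate_bps(27)    # LSF-OTCVQ encoder
new_coder_bps = ORIGINAL_CODER_BPS - (sq_bps - otcvq_bps)

print(round(otcvq_bps))                 # LSF rate in bits/s
print(round(new_coder_bps / 1000, 2))   # overall coder rate in kbits/s
```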
The A-B comparison test involves presenting
listeners with a sequence of two speech test signals
(A and B). For each sentence, a comparison is made
between two synthetic signals: one, A (or B), with
unquantized LSFs and the other, B (or A), with LSFs
quantized by the LSF-OTCVQ encoder. The A-B
signal pairs are presented in randomized order. The
listeners choose one or the other of the two
synthesized versions, or indicate no preference. For
the MOS tests, the listeners were asked to rate
each synthetic speech sentence (with LSF-OTCVQ
quantized LSFs) on a scale from 1 (bad) to 5
(excellent). Finally, the mean opinion score
(MOS) is calculated.
Results from the A-B comparison tests show that
the majority of the listeners (58.84 %) have no
preference. The mean preference for the speech signal
coded with LSF-OTCVQ quantized LSFs (20.83 %)
is identical to that obtained for the speech signal
coded with unquantized LSFs. Roughly, we can
conclude that the two versions of coded
speech are statistically indistinguishable, i.e., there
are no perceptible differences and the quantization
does not contribute to audible distortion. In terms
of MOS, the coded version of speech considered
achieves a good score of 3.89. This implies that good
communications quality and high levels of
intelligibility [2] are obtained using the 27 bits/frame
LSF-OTCVQ encoder in the FS1016.
In addition, in terms of average segmental
signal-to-noise ratio (SSNR), the synthetic speech signals
with unquantized LSF parameters gave an average
SSNR of 11.05 dB; with LSF-OTCVQ encoding of the
LSF parameters, the average SSNR obtained is 10.31
dB. When the LSF parameters are quantized
by the 34 bits/frame SQ, an average SSNR of 9.59 dB was
obtained. Thus, applying the LSF-OTCVQ encoding
system reduces the coding rate while improving
the SSNR performance of the FS1016.
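For reference, a common way to compute an average segmental SNR is sketched below: the SNR is evaluated in dB frame by frame and then averaged. The exact SSNR definition used for the figures above is not given in this passage, so the frame length of 240 samples (30 ms at 8 kHz), the skipping of silent or error-free frames, and the function name are assumptions.

```python
import numpy as np

def segmental_snr_db(reference, synthesized, frame_len=240):
    """Average of per-frame SNR values (dB) between two aligned signals."""
    ref = np.asarray(reference, dtype=float)
    syn = np.asarray(synthesized, dtype=float)
    n_frames = min(len(ref), len(syn)) // frame_len
    frame_snrs = []
    for t in range(n_frames):
        seg = slice(t * frame_len, (t + 1) * frame_len)
        signal_energy = np.sum(ref[seg] ** 2)
        noise_energy = np.sum((ref[seg] - syn[seg]) ** 2)
        # Frames with zero signal or zero error are skipped (assumption).
        if signal_energy > 0.0 and noise_energy > 0.0:
            frame_snrs.append(10.0 * np.log10(signal_energy / noise_energy))
    return float(np.mean(frame_snrs)) if frame_snrs else float("nan")

# Toy check: a constant signal reproduced with a 10 % amplitude error.
reference = np.ones(480)
ssnr = segmental_snr_db(reference, 0.9 * reference)
print(ssnr)
```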
6.1
Robustness of the COVQ-OTCVQ encoder: Transmission over a noisy channel
In a practical communication system, the
robustness of the LSF-OTCVQ encoder must be
reinforced so that the encoder can cope
with channel errors. In this part, we are interested
in the implicit protection of the encoder by application
of the JSCC-COVQ technique. We first see how
to apply the COVQ to the robust design of the
LSF-OTCVQ encoder in order to provide implicit
protection to some of its indices. We then
generalize the study to the full protection of all the
indices of the new LSF-OTCVQ encoder designed with the
COVQ technique.
6.1.1 Design of the LSF-OTCVQ encoder with
JSCC-COVQ technique
The design of the LSF-OTCVQ encoder
optimized for a noisy channel is based mainly on the
LSF-OTCVQ design algorithm, modified
according to the basic concept of the COVQ. In our
applications, the five extended codebooks of the new
encoding system, denoted "COVQ-LSF-OTCVQ
encoder", were optimized for a design error
probability ε = 0.05.
The basic steps of our design algorithm for the 27
bits/frame COVQ-LSF-OTCVQ encoder are
summarized below. Notice that the number of trellis
states of the encoder is always S = 4; consequently
2 bits/frame are necessary to represent the initial
state. The remaining 25 bits are assigned to the 5
codebooks according to the bit allocation given in
table 5. Note that, at the beginning, the 5
initial extended codebooks are designed by the
LBG-VQ algorithm (ε = 0.000) using the weighted
Euclidean distance. The codebooks of the
COVQ-LSF-OTCVQ encoder are designed using the same
training database (75000 LSF vectors). This
database is then divided into 5 training subsets of 2-D
LSF vector pairs (LSF 1-2, LSF 3-4, LSF 5-6, LSF
7-8 and LSF 9-10).
Design steps of the COVQ-LSF-OTCVQ encoder:
Step 1: Initial design
- Based on the 5 training subsets, use the COVQ (ε = 0.05) algorithm to design the five (2-D) extended initial codebooks of the encoder.
- Partition each initial codebook into 4 sub-codebooks using the set partitioning algorithm. Then, label the transitions of each trellis stage with the corresponding partitioned COVQ-codebook (i.e., COVQ-codebook LSF 1-2 for stage 1, and so on).
- Set a stop threshold to a very small value.
Step 2: TCVQ coding/decoding process
- For the given training base of LSF vectors, find the best possible reproduction LSF vectors through the trellis by using a modified version of the Viterbi procedure.
- Calculate the average SD between the original and quantized LSF vectors.
Step 3: Termination test
- If the relative decrease of the average SD is below the threshold, save the 5 optimized codebooks of the COVQ-LSF-OTCVQ encoder and stop.
- Otherwise, update the 5 COVQ-codebooks using a modified version of the optimization procedure and go to step 2.
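The control flow of steps 2 and 3 can be sketched as a generic iteration. This is only a skeleton under stated assumptions: `encode` and `update` are hypothetical callables standing in for the modified Viterbi search and the modified codebook optimization, neither of which is implemented here, and the toy functions at the end exist only to exercise the loop.

```python
def run_design(codebooks, encode, update, threshold=1e-3, max_iter=100):
    """Iterate encode/update until the relative decrease of the average SD
    falls below `threshold` (Step 3) or `max_iter` is reached."""
    prev_sd = None
    avg_sd = None
    for iteration in range(1, max_iter + 1):
        # Step 2: encode the training base, get the average SD.
        assignments, avg_sd = encode(codebooks)
        # Step 3: termination test on the relative decrease.
        if prev_sd is not None and prev_sd > 0:
            if (prev_sd - avg_sd) / prev_sd < threshold:
                return codebooks, avg_sd, iteration
        # Otherwise update the codebooks and go back to Step 2.
        codebooks = update(codebooks, assignments)
        prev_sd = avg_sd
    return codebooks, avg_sd, max_iter

# Toy stand-ins (hypothetical): the "codebook" is one number x and the
# distortion is 1 + x, halving x at every update.
def toy_encode(x):
    return None, 1.0 + x

def toy_update(x, _assignments):
    return x / 2.0

final_x, final_sd, n_iter = run_design(1.0, toy_encode, toy_update)
print(final_sd, n_iter)
```

Passing the search and update procedures as arguments keeps the stopping rule separate from the quantizer-specific parts.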
In step 2, the TCVQ encoding of the input LSF
vectors consists in finding the best possible sequence of
codevectors (optimal path) through the trellis. This
search is carried out by the Viterbi algorithm with
a slight modification of the distance computation.
The distance to be minimized
during the TCVQ search for the optimal
codevector is formulated as follows:
\[ d(f, \hat{f}_i) = \sum_{j \in \Omega_i} p(j/i) \, \frac{1}{k} \sum_{m=1}^{k} c_m w_m \left( f(m) - \hat{f}_j(m) \right)^2 \qquad (15) \]
where k is the dimension of the LSF vectors (k = 2 for
LSF pairs) and Ω_i is the set of neighbors j of index i such
that dH (i, j) = 1. Recall that after the encoding process, the
COVQ-LSF-OTCVQ encoder transmits two binary
sequences in addition to the two bits representing the
initial trellis state.
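A minimal sketch of such a channel-weighted distance for one 2-D LSF pair is given below. Several points are assumptions rather than the encoder's actual implementation: the transition probabilities p(j/i) are those of a BSC with bit error probability eps, the per-component weights c_m and w_m are passed as one combined vector `cw`, and the error-free term j = i is included by default (switchable through `include_self`), since with neighbors at Hamming distance 1 only the distance would vanish at eps = 0.

```python
import numpy as np

def bsc_transition_prob(i, j, n_bits, eps):
    """P(index j received | index i sent) over a BSC, for n_bits-bit indices."""
    d = bin(i ^ j).count("1")          # Hamming distance between the indices
    return (eps ** d) * ((1.0 - eps) ** (n_bits - d))

def covq_distance(f, codebook, i, eps, cw, include_self=True):
    """Channel-weighted distance between input vector f and index i."""
    f = np.asarray(f, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    cw = np.asarray(cw, dtype=float)
    size, k = codebook.shape
    n_bits = (size - 1).bit_length()
    total = 0.0
    for j in range(size):
        d_h = bin(i ^ j).count("1")
        if d_h == 1 or (include_self and d_h == 0):
            weighted_err = np.sum(cw * (f - codebook[j]) ** 2) / k
            total += bsc_transition_prob(i, j, n_bits, eps) * weighted_err
    return total

# Toy 2-bit codebook of 2-D codevectors (illustrative values only).
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
d0 = covq_distance([0.0, 0.0], toy_codebook, i=0, eps=0.1, cw=[1.0, 1.0])
print(d0)
```

In a trellis search, this value would replace the plain weighted distance in the branch metric of the Viterbi procedure.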
In this part, note that only the index
sequence of the COVQ-LSF-OTCVQ codevectors
(a sequence of 20 bits for the 5 indices) is
implicitly protected by COVQ. This sequence
results directly from the COVQ search procedure
through the 5 codebooks of the encoder. The
other binary sequences (initial state,
optimal path), on the other hand, are not delivered by
the VQ search process and are consequently not
implicitly protected against channel errors.
6.1.2 Performances of the COVQ-LSF-OTCVQ
system: Encoding of the FS1016 LSF
parameters
We now present the performance of the 27
bits/frame COVQ-LSF-OTCVQ encoder (ε = 0.05)
applied to the efficient and robust coding of the FS1016
LSF parameters. In these simulations, the channel
errors affect only the transmission of the LSF
parameters. For the moment, only the sequences of
20 bits/frame specifying the COVQ-LSF-OTCVQ
codevector indices are transmitted over a BSC
channel with a bit error probability p varying between 0
and 0.5.
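The channel model used in these simulations can be sketched as follows: each transmitted bit is flipped independently with probability p. The frame count, the random seed and the function name `bsc` are illustrative assumptions, and the random bits merely stand in for the 20-bit index sequences.

```python
import numpy as np

def bsc(bits, p, rng):
    """Pass an array of 0/1 bits through a binary symmetric channel."""
    bits = np.asarray(bits, dtype=np.uint8)
    flips = rng.random(bits.shape) < p     # True where the channel flips a bit
    return bits ^ flips.astype(np.uint8)

rng = np.random.default_rng(0)
# Placeholder index bits: 1000 frames of 20 bits each (random, illustrative).
frame_bits = rng.integers(0, 2, size=(1000, 20), dtype=np.uint8)
received = bsc(frame_bits, 0.05, rng)
measured_ber = float(np.mean(received != frame_bits))
print(measured_ber)
```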
The database used in the following evaluations
consists of 13.69 s of speech extracted
from the test database. The synthesized speech signals for
this set were generated by the FS1016, and the
objective evaluations are expressed in terms of average SD for the
LSF encoders and average SSNR for the synthetic
speech signals. The SD performance of the two 27
bits/frame systems, LSF-OTCVQ (without
protection) and COVQ-LSF-OTCVQ (ε = 0.05), is
reported in table 7.
These results show that when the channel error
probability becomes rather high (p > ε = 0.05), the
COVQ yields a significant improvement in the
performance of the LSF-OTCVQ encoder. Without
protection, the LSF-OTCVQ suffers a more
severe degradation than the protected LSF
encoder. This degradation appears as a sharp
increase in the average SD of the LSF-OTCVQ as
well as in the percentage of outlier frames having SD >
4 dB. Under these conditions, the COVQ (ε = 0.05)
gives the LSF-OTCVQ good
robustness against channel errors by keeping the
increase of the average SD and of the
number of outlier frames (SD > 4 dB) small and slow.
Table 7: Performance comparison between the COVQ-LSF-OTCVQ and LSF-OTCVQ encoders at 27 bits/frame:
application to the FS1016 LSF parameters encoding

                   LSF-OTCVQ Encoder                 COVQ-LSF-OTCVQ Encoder
BSC                Average   SD Outliers (in %)      Average   SD Outliers (in %)
probability p      SD (dB)   2-4 dB     > 4 dB       SD (dB)   2-4 dB     > 4 dB
0.000              1.073      0.440      0.000       1.690     25.607      0.441
0.001              1.099      0.441      0.442       1.693     25.827      0.441
0.005              1.148      0.883      1.544       1.710     26.710      0.662
0.010              1.224      2.649      1.986       1.712     26.931      0.441
0.050              1.707     10.596      7.505       1.800     32.671      0.883
0.100              2.696     15.010     21.192       1.924     38.852      0.883
0.200              4.251     17.439     43.929       2.130     46.799      3.532
0.500              6.649     13.245     80.573       2.696     67.911      7.726

However, when transmission takes place over a
noiseless channel (p = 0.000) or a slightly disturbed
one (small p), the performance of the COVQ-LSF-OTCVQ
becomes suboptimal and compromises the transparent
quantization quality.
On the other hand, important observations were
made concerning the objective SSNR performance
of the global FS1016 coder. Indeed, contrary to
some of the conclusions drawn above, the FS1016 SSNR
performance (with LSF parameters coded by
COVQ-LSF-OTCVQ) is also remarkable when the
channel is slightly disturbed. The comparative
evaluation of the FS1016 objective performance,
with LSFs coded by the LSF-OTCVQ and
COVQ-LSF-OTCVQ encoders, is presented in Fig. 2.

Figure 2: Average SSNR performances of the FS1016 speech coder.
[Plot: average SSNR (dB) versus error probability p; curves: FS1016 with LSF-OTCVQ, FS1016 with COVQ-LSF-OTCVQ.]

For error probabilities p ≤ 0.01, these results
show that the distortions are negligible for the two
LSF encoding systems. We can conclude that the
COVQ-LSF-OTCVQ encoding system (ε = 0.05) can
provide good implicit protection to the FS1016
LSF parameters, with suboptimal SD performance
when the channel is slightly disturbed.
6.2
COVQ-LSF-OTCVQ encoder with redundant channel coding
We now generalize the study to the full
protection of all the transmission indices of the 27
bits/frame COVQ-LSF-OTCVQ encoder (ε = 0.05).
By making appropriate use of the bits saved by this
encoder, a redundant channel code is used to
explicitly protect the 7 bits/frame that remain without
protection. Since in our simulations transmission
takes place over a BSC channel under the assumption that
single-bit errors dominate in each corrupted index
(single error), a simple single error-correcting code is
largely sufficient to correct all possible single errors
affecting the transmitted sequences of the
encoder (the 5 bits of the optimal path and the 2 bits of
the initial state). Notice, of course, that the 20 bits/
frame representing the codevector indices of the
optimal path are already protected by COVQ.
To carry out the channel coding of the
non-protected 7 bits/frame, we used two error-correcting
Hamming (7, 4, 3) codes, which belong to the category
of systematic linear block codes. In this paper, we
do not review the design and operation theory of
Hamming codes, which is well documented
[6]. These codes were first conceived to
correct a single error per transmission block (single
error-correcting codes). In our design, the two
Hamming (7, 4, 3) codes can protect
8 bits by generating 14 bits together. The 27
bits/frame COVQ-LSF-OTCVQ encoder, with the
two Hamming (7, 4, 3) codes, thus operates at a
rate of 34 bits/frame. This is the same number of
bits as allocated by the original coding of the
FS1016's LSF parameters. Thus, the global design of
the FS1016 with COVQ-LSF-OTCVQ encoding (plus the 2
Hamming codes) of the LSF parameters keeps the
speech coder rate at its original value of 4.8 kbits/s.
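The explicit protection can be illustrated with a minimal systematic Hamming (7, 4) encoder and syndrome decoder. The particular parity matrix `P`, the padding of the 7 unprotected bits with one zero bit to fill the 8 information positions, and the function names are assumptions chosen for this sketch; the text does not specify how the 7 bits are mapped onto the two codes.

```python
import numpy as np

# Systematic Hamming (7, 4): codeword = [d1 d2 d3 d4 p1 p2 p3].
P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]], dtype=np.uint8)
G = np.hstack([np.eye(4, dtype=np.uint8), P])    # generator matrix [I | P]
H = np.hstack([P.T, np.eye(3, dtype=np.uint8)])  # parity-check matrix [P^T | I]

def encode(data4):
    """Encode 4 information bits into a 7-bit codeword."""
    return (np.asarray(data4, dtype=np.uint8) @ G) % 2

def decode(word7):
    """Correct at most one bit error and return the 4 information bits."""
    word = np.array(word7, dtype=np.uint8)
    syndrome = (H @ word) % 2
    if syndrome.any():
        # The syndrome equals the column of H at the error position.
        pos = int(np.where((H.T == syndrome).all(axis=1))[0][0])
        word[pos] ^= 1
    return word[:4]

def protect7(bits7):
    """Map 7 bits (2 initial-state + 5 path bits) onto two codewords."""
    padded = np.zeros(8, dtype=np.uint8)   # eighth bit is padding (assumption)
    padded[:7] = bits7
    return np.concatenate([encode(padded[:4]), encode(padded[4:])])

def recover7(word14):
    w = np.asarray(word14, dtype=np.uint8)
    return np.concatenate([decode(w[:7]), decode(w[7:])])[:7]

side_bits = np.array([1, 0, 1, 1, 0, 0, 1], dtype=np.uint8)
sent = protect7(side_bits)
corrupted = sent.copy()
corrupted[2] ^= 1    # one error in the first codeword
corrupted[9] ^= 1    # one error in the second codeword
restored = recover7(corrupted)
total_bits = 20 + len(sent)   # COVQ-protected indices + Hamming-coded bits
print(restored, total_bits)
```

The final count reproduces the budget stated above: 20 index bits plus 14 coded bits give 34 bits/frame.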
The performance of the non-protected LSF-OTCVQ
encoder, compared with that of the COVQ-LSF-OTCVQ
(ε = 0.05) encoder with Hamming (7, 4, 3)
codes, is given in table 8.

Table 8: Performance comparison between the LSF-OTCVQ encoder and the COVQ-LSF-OTCVQ (ε = 0.05) +
Hamming (7, 4, 3) codes

                   LSF-OTCVQ Encoder                 COVQ-LSF-OTCVQ Encoder +
                   without protection                2 Hamming (7, 4, 3) codes
BSC                Average   SD Outliers (in %)      Average   SD Outliers (in %)
probability p      SD (dB)   2-4 dB     > 4 dB       SD (dB)   2-4 dB     > 4 dB
0.000              1.073      0.440      0.000       1.690     25.607      0.441
0.001              1.665      4.635      9.933       1.689     25.607      0.441
0.005              1.993      5.077     14.790       1.701     26.048      0.441
0.010              2.030      5.960     14.128       1.725     26.490      0.662
0.050              2.896     13.907     26.931       1.802     28.697      1.545
0.100              3.825     17.880     41.721       1.948     32.229      3.532
0.200              5.070     23.620     54.304       2.226     34.878      7.505
0.500              7.057     12.362     86.754       2.389     35.982     10.596

Over the whole range of error probabilities, the
results show that channel coding by the Hamming
(7, 4, 3) codes has clearly improved the
performance of the 27 bits/frame COVQ-LSF-OTCVQ
encoding system. The global system thus
has good robustness against the errors of the noisy
channel. On the other hand, comparing these
results with those given in table 7, the LSF-OTCVQ
encoder has incurred a larger degradation in terms of
average SD and outliers. This is mainly due to
channel errors affecting the binary sequences
that specify the initial state or the optimal path.
Concerning the SSNR performance of the
global FS1016 (with LSFs coded by COVQ-LSF-OTCVQ
+ 2 Hamming (7, 4, 3) codes), the
degradations are very low and even negligible for
error probabilities p < 0.01. The SSNR performance
of the FS1016, with and without LSF
protection, is presented in Fig. 3.

Figure 3: Average-SSNR performances of global FS1016
[Plot: average SSNR (dB) versus error probability p; curves: FS1016 with non-protected LSF-OTCVQ, FS1016 with COVQ-LSF-OTCVQ + 2 Ham(7,4).]
CONCLUSION
In this work, an optimized trellis coded vector
quantization scheme has been developed and
successfully applied to the efficient and robust
encoding of the FS1016 LSF spectral parameters. In
the case of ideal transmission over a noiseless
channel, objective and subjective evaluation results
revealed that the 27 bits/frame LSF-OTCVQ encoder
(with weighted distance) produces a perceptual
quality equivalent to that obtained when the LSF parameters
are unquantized.
We then used a JSCC-COVQ technique to
implicitly protect the transmission indices of the
LSF-OTCVQ encoder incorporated in the FS1016.
The simulation results showed that our new
COVQ-LSF-OTCVQ encoding system gives the
basic LSF-OTCVQ encoder good
robustness against BSC channel errors, especially
when the transmission error probability is high. To
complete this work, it was necessary to protect all the
transmission indices of the COVQ-LSF-OTCVQ
encoder, since only a part of its indices was protected
implicitly by JSCC-COVQ. By making appropriate use of the
bits per frame saved by this encoder, redundant
channel coding by Hamming codes was used to
explicitly protect the bits remaining without
protection. We showed that the COVQ-LSF-OTCVQ
encoder, using the Hamming (7, 4, 3) codes,
contributes significantly to the improvement of
the encoding performance of the FS1016's LSF
parameters.
We can conclude that our global COVQ-LSF-OTCVQ
encoding system with Hamming channel
codes can ensure effective and robust coding of
the LSF parameters of the FS1016 operating over
a noisy channel.
REFERENCES
[1] W. B. Kleijn and K. K. Paliwal: Speech coding and synthesis, Elsevier Science B.V. (1995).
[2] K. K. Paliwal and B. S. Atal: Efficient vector quantization of LPC parameters at 24 bits/frame, IEEE Transactions on Speech and Audio Processing, vol. 1, no. 1, pp. 3-14 (1993).
[3] F. Itakura: Line spectrum representation of linear predictive coefficients of speech signals, Journal of Acoustical Society of America, vol. 57, p. 535 (1975).
[4] W. F. LeBlanc, B. Bhattacharya, S. A. Mahmoud and V. Cuperman: Efficient search and design procedures for robust multi-stage VQ of LPC parameters for 4 kb/s speech coding, IEEE Transactions on Speech and Audio Processing, vol. 1, no. 4, pp. 373-385 (1993).
[5] M. Bouzid, A. Djeradi and B. Boudraa: Optimized Trellis Coded Vector Quantization of LSF Parameters: Application to the 4.8 Kbps FS1016 Speech Coder, Signal Processing, vol. 85, issue 9, pp. 1675-1694 (2005).
[6] S. Lin: An Introduction to Error-Correcting Codes, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA (1970).
[7] C. E. Shannon: A Mathematical Theory of Communication, Bell System Technical Journal, vol. 27, no. 3 and 4, pp. 379-423 and 623-656 (1948).
[8] K. A. Zeger and A. Gersho: Vector quantizer design for memoryless noisy channels, in Proceedings of the International Conference on Communications (ICC'88), Philadelphia, pp. 1593-1597 (1988).
[9] N. Farvardin: A Study of vector quantisation for Noisy Channels, IEEE Transactions on Information Theory, vol. 36, no. 4, pp. 799-809 (1990).
[10] S. B. Z. Azami, P. Duhamel and O. Rioul: Combined source-channel coding: Panorama of methods, CNES Workshop on Data Compression, Toulouse, France (1996).
[11] A. Gersho and R. M. Gray: Vector quantization and Signal compression, Kluwer Academic Publishers, USA (1992).
[12] Y. Linde, A. Buzo and R. M. Gray: An Algorithm for Vector Quantization Design, IEEE Transactions on Communications, COM-28, pp. 84-95 (1980).
[13] M. W. Marcellin and T. R. Fischer: Trellis coded quantization of memoryless and Gauss-Markov sources, IEEE Trans. on Communications, vol. 38, pp. 83-93 (1990).
[14] T. R. Fischer, M. W. Marcellin and M. Wang: Trellis coded vector quantization, IEEE Transactions on Information Theory, vol. 37, pp. 1551-1566 (1991).
[15] H. S. Wang and N. Moayeri: Trellis coded vector quantization, IEEE Trans. on Communications, vol. 40, pp. 1273-1276 (1992).
[16] A. J. Viterbi and J. K. Omura: Principles of Digital Communication and Coding, McGraw-Hill Kogakusha (1979).
[17] G. Ungerboeck: Trellis-coded modulation with redundant signal sets, Part I and II, IEEE Commun. Magazine, vol. 25, pp. 5-21 (1987).
[18] N. Farvardin and V. Vaishampayan: On the performance and Complexity of Channel-Optimized Vector Quantizers, IEEE Transactions on Information Theory, vol. 37, no. 1, pp. 155-159 (1991).
[19] D. M. Chiang and L. C. Potter: Vector Quantisation For Noisy Channels: A guide To performance And Computation, IEEE Trans. on Circuits and systems for Video Technology, vol. 7, no. 1, pp. 604-612 (1997).
[20] M. Bouzid: Codage conjoint de source et de canal pour des transmissions par canaux bruités, Doctorate Thesis, Speech Communication, USTHB University, Algiers (2006).
[21] R. Laroia, N. Phamdo and N. Farvardin: Robust and efficient quantization of speech LSP parameters using structured vector quantizers, Proc. IEEE Int. Conf. Acoust., Speech and Signal Processing, pp. 641-644 (1991).
[22] J. S. Garofolo et al.: DARPA TIMIT Acoustic-phonetic Continuous Speech Database, Technology Building, National Institute of Standards and Technology (NIST), Gaithersburg (1988).
[23] J. P. Campbell, T. E. Tremain and V. C. Welch: The Proposed Federal Standard 1016 4800 bps Voice Coder: CELP, Speech Technology Magazine, pp. 58-64 (1990).
