Error Control Codes

Tarmo Anttalainen
23.1.2013
Abstract: This paper describes the basic theory, encoding and decoding principles, implementation and characteristics of various error control codes. The main attention is paid to binary block and convolutional codes, which are widely used in telecommunication systems. The most important evolving coding schemes, such as Trellis Coded Modulation (TCM) and Turbo codes, are also introduced.

Contents
1. INTRODUCTION TO ERROR CONTROL CODES
1.1. PURPOSE OF ERROR CONTROL CODING
1.2. SOME HISTORICAL MILESTONES
1.3. COMMUNICATION SYSTEM AND CHANNEL
1.4. APPLICATIONS FOR ERROR CONTROL CODING
1.5. WAVEFORM CODING AND LINE CODING
1.6. AUTOMATIC REPEAT REQUEST (ARQ)
2. LINEAR BLOCK CODES
2.1. INTRODUCTION TO BLOCK CODES
2.2. BLOCK CODE CHARACTERISTICS
2.2.1. Code Rate
2.2.2. Hamming Distance
2.2.3. Minimum Hamming Distance
2.2.4. Hamming Weight
2.2.5. Systematic Codes
2.2.6. Error Correction and Detection Capability of the Block Codes
2.3. EXAMPLES OF SIMPLE BLOCK CODES
2.3.1. Simple Parity-Check Codes
2.3.2. Repetition Codes
2.3.3. Residual Error Rate of a Repetition Code
2.3.4. Hamming Codes
2.4. INTRODUCTION TO LINEAR ALGEBRA
2.4.1. Fields and Galois Fields
2.4.2. Polynomials
2.4.3. Vector Spaces
2.4.4. Multiplication in Extension Field GF(2^m)
2.5. STRUCTURE OF LINEAR BLOCK CODES
2.5.1. Hamming Weight and Minimum Distance
2.5.2. Matrix Representation of Linear Block Codes
2.5.3. Parity Check Matrix and Error Correction Capability of Linear Block Codes
2.6. HAMMING CODES
2.6.1. Design of a Hamming Code
2.6.2. Implementation of an Encoder for a Hamming Code
2.7. MAXIMUM-LIKELIHOOD DECODING
2.7.1. Syndrome Decoding
2.7.2. Standard Array
2.7.3. The Relation of Code Rate and Block Length
2.8. GENERAL ERROR PERFORMANCE CHARACTERISTICS OF LINEAR BLOCK CODES
2.9. WEIGHT DISTRIBUTION

PROBLEMS
3. CYCLIC BLOCK CODES
3.1. THE STRUCTURE OF LINEAR CYCLIC BLOCK CODES
3.1.1. Generator Polynomial of a Cyclic Code
3.1.2. Parity Check Polynomial of a Cyclic Code
3.2. DESIGN OF A CYCLIC CODE
3.3. GENERATOR MATRIX OF A CYCLIC CODE
3.4. IMPLEMENTATION OF CYCLIC CODES
3.4.1. Encoders for Cyclic Codes
3.4.2. Application, Error Detection Encoder in GSM
3.5. SYNDROME DECODING OF CYCLIC CODES
3.6. SOME POPULAR CYCLIC CODES
3.6.1. Hamming Codes
3.6.2. Bose-Chaudhuri-Hocquenghem (BCH) Codes
3.6.3. Cyclic (23, 12) Golay Code
3.7. CYCLIC CODES FOR BURST ERROR CORRECTION
3.7.1. Error Bursts
3.7.2. Cyclic Codes for Correction of Error Bursts
3.7.3. Fire Code
PROBLEMS
4. CYCLIC REDUNDANCY CHECK, CRC
4.1. ERROR DETECTION AND ARQ
4.2. STRUCTURE OF CRC CODE
4.2.1. CRC Encoder
4.2.2. CRC Decoder
4.3. ERROR DETECTION CAPABILITY OF CRC
4.4. DESIGN OF CRC CODE
PROBLEMS
5. CONVOLUTIONAL CODES
5.1. CONVOLUTIONAL ENCODING
5.2. TREE DIAGRAM OF A CONVOLUTIONAL CODE
5.3. STATE DIAGRAM OF A CONVOLUTIONAL CODE
5.4. TRELLIS DIAGRAM
5.5. DECODING OF CONVOLUTIONAL CODE
5.5.1. Maximum Likelihood Decoding
5.5.2. Hard-Decision Decoding and Viterbi Algorithm
5.5.3. Soft-Decision Decoding
5.5.4. Implementation Aspects of the Viterbi Algorithm
5.6. DISTANCE PROPERTIES OF CONVOLUTIONAL CODES
5.7. APPLICATION EXAMPLE, GSM ERROR CORRECTION
PROBLEMS
6. TRELLIS CODED MODULATION
6.1. BINARY PHASE SHIFT KEYING, BPSK
6.2. QUADRATURE PHASE SHIFT KEYING, QPSK
6.3. COMPARISON OF QPSK AND 8-PSK
6.4. TRELLIS CODED 8-PSK
PROBLEMS
7. TURBO CODES
7.1. CODEWORD STRUCTURE OF TURBO CODES
7.2. ITERATIVE DECODING AND PERFORMANCE OF TURBO CODES
7.3. PERFORMANCE OF TURBO CODES
7.4. APPLICATION EXAMPLE, 3GPP TURBO ENCODER
PROBLEMS

INDEX
REFERENCES


1. Introduction to Error Control Codes


In this first chapter we introduce error control codes and their role in communication systems. Some applications are presented, as well as the most important historical milestones and the basic alternative techniques for error control.

Coding means any process that transforms a sequence of digital symbols into another sequence of digital symbols. It includes, for example, ciphering (encryption), but we concentrate here on error control coding only.

1.1. Purpose of Error Control Coding


Figure 1.1.1 illustrates a unidirectional communication system using carrier modulation. In the transmitter, data modulates the carrier wave's amplitude, frequency and/or phase. Most of the transmitted power Pout is lost on the way to the receiver, and the received power Pin is very small. The loss of power L depends, among other things, on the transmission distance d. The received signal contains noise and interference from other systems, and after filtering the signal-power-to-noise-power ratio is S/N. The detector detects 0s and 1s (in a binary system) and regenerates the data. The bit error rate, BER, of the regenerated data depends on the signal-to-noise ratio: the lower the S/N, the higher the BER.
[Figure: data → modulator (transmitter, output power Pout) → transmission channel (bandwidth B, loss L, noise and interference, transmission distance d / section length) → RF front end and demodulator (input power Pin) → detector (receiver) → regenerated data with errors; the S/N after filtering determines the BER.]

Figure 1.1.1 A digital communication system.


In many communication channels, especially in radio communications, errors occur frequently. It would naturally be possible to keep data transmission practically error free, but this would lead to uneconomical implementations. For example, by using high transmission power and/or low channel attenuation, we can increase the signal-to-noise ratio in the receiver and get a lower bit error rate. Lower attenuation can often be achieved only by shortening the transmission distance, and for this we probably have to install additional intermediate repeaters or relay equipment. In cellular networks, new base station sites and equipment are required.
Transmission power is always a limiting factor. For example, in cellular mobile systems transmission power is adjusted to as low a value as possible to keep the capacity of the system as high as possible. Radio channels of GSM (2G) are reused in nearby cells, and interference from other cells using the same frequencies is the major noise source; it has to be
minimized. In the WCDMA (3G) system the same carrier frequency is used by all users, and the signals of the other users are the main noise source. In order to maximize capacity, the power of all received signals must be adjusted to be the same.
Fading causes additional variable attenuation of radio waves; the signal-to-noise ratio in the receiver temporarily drops below the error threshold and many errors occur. As a consequence, an efficient error correction scheme is a mandatory feature of digital cellular systems. Very Large Scale Integration (VLSI) technology and Digital Signal Processing (DSP) devices have made implementation of error control codes efficient and low cost.
With the help of error correction coding we may use lower transmission power and get the same error performance as with higher power without coding. This saving of power is often expressed as the coding gain. For example, if we can reduce signal power by 3 dB with the help of error control coding and still get the same performance as without error control (despite a 3 dB worse signal-to-noise ratio), the coding gain is 3 dB.
Even if we have a high average signal-to-noise ratio and practically error-free transmission, errors may still occur temporarily in practical systems. Some applications do not tolerate errors at all. For example, errors on the internal bus of a multiprocessor system may cause severe consequences if no error control is in use.
Efficient end-to-end communication between applications requires the communication system to correct, or at least detect, errors at the low OSI layers, namely the physical or data link layers of each intermediate transmission section.
There has been a lot of research work since the 1950s on finding good error control codes to protect digital data against errors. Depending on the application, two basic methods are in use:
Backward Error Correction (BEC) uses error detection and retransmission. If the received data frame is detected to be in error, it is retransmitted. This method is also called Automatic Repeat Request (ARQ), and it is used in most computer communication systems, such as LANs (Local Area Networks). Error detection and retransmission are usually functions of the Data Link Layer (and/or a higher layer) in the OSI reference model. Because of retransmissions, some parts of the data suffer longer delays than others, and not all applications tolerate this. BEC also requires a feedback channel, used by the receiver for acknowledgments, to inform the transmitter whether retransmission is needed or not.
Forward Error Correction (FEC) is a method that corrects some number of errors in the received data. FEC does not require retransmissions, and it is thus suitable for isochronous applications where constant low delay is required, such as speech transmission in cellular networks. FEC provides weaker protection even though it requires more redundant error control data than error detection. However, for some applications FEC is the only choice because it does not introduce the long variable delay that ARQ does. Nor does FEC require a feedback channel, so it can be used on simplex connections. FEC is implemented in the physical layer of the OSI model.
As we will show later, BEC is much more efficient than FEC, and it is used whenever the application allows it.


1.2. Some historical milestones


The history of error control coding began in 1948 [4, p.3] when Claude Shannon showed that if the transmission data rate is less than the maximum capacity C of the channel, it is possible to design an error control code that makes the probability of output error as small as desired. He showed that this maximum capacity through a lowpass channel is
C = B log2(1 + S/N) bit/s        (1.2.1)

where B is the bandwidth in Hz and S/N is the signal-to-noise power ratio. If the information data rate is less than C, then by using a good enough error correction code it is possible to transmit coded messages over the channel with an arbitrarily small probability of error. Equality is an asymptotic limit that can be achieved only by increasing the code length towards infinity. In his fundamental studies Shannon established that such good codes exist, but his proof does not tell how to find them. The search for them has been an extremely difficult task and has proceeded slowly. Approaching the maximum channel capacity C requires more and more complex coding schemes.
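As a quick illustration, here is a minimal Python sketch of Equation 1.2.1; the bandwidth and S/N figures below are assumed example values for a voice-band telephone channel, not taken from the text.

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity C = B*log2(1 + S/N) of Equation 1.2.1, in bit/s."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# Assumed example: a voice-band channel with B = 3100 Hz and S/N = 30 dB.
snr = 10 ** (30 / 10)               # convert 30 dB to a linear power ratio (1000)
print(shannon_capacity(3100, snr))  # about 30.9 kbit/s, the same order as the
                                    # voice-band modem rates mentioned below
```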
The first block codes were introduced by Hamming in 1950. They were able to correct a single error. It then took ten years (1959-60) to find better codes that were able to correct multiple errors. They are called BCH codes after their inventors, Bose, Ray-Chaudhuri and Hocquenghem. Block codes and their important subset, cyclic block codes, are discussed in Chapter 2.
Convolutional codes were developed in the late 1950s. They are not block codes because they may have infinite length. An important decoding algorithm for convolutional codes, the Viterbi algorithm, was developed in 1967, and it made implementation of convolutional codes feasible. Convolutional codes are discussed in Chapter 5.
In the 1970s the performance of error control codes was improved by increasing the complexity of the codes, but the principles, such as convolutional coding and Viterbi decoding, remained the same. The impact of the increase in complexity on performance was not as dramatic as expected. At the end of the 1970s Trellis Coded Modulation (TCM) was invented. TCM combines channel coding and modulation, and it increased, for example, voice-band modem data rates from 9.6 kbit/s up to 33.6 kbit/s. The principle of Trellis coding is introduced in Chapter 6.
In 1993 the latest major new coding scheme, Turbo codes, or Parallel Concatenated Convolutional Codes (PCCC), was reported; it achieves performance near Shannon's maximum capacity on an AWGN (Additive White Gaussian Noise) channel. The discovery of Turbo codes has stimulated a huge research effort to fully understand the performance of this new coding scheme. We review the structure and performance of these codes in Chapter 7.
The use of error control codes in practical systems has been limited by implementation technology. Another reason for the delay from invention to implementation is the transfer of knowledge from mathematicians, or other research personnel, to implementation engineers. Typically the delay from the discovery of an essentially new and better coding scheme to its volume implementation has been from 10 to 30 years. The first applications have often been deep-space communications.
The problem of how to find and implement good error control codes is not yet solved. The search for new, good error control codes and their implementation technologies continues.

1.3. Communication System and Channel


A communication system connects a data source to the destination user via a communication channel with the help of the transmitter and the receiver [4, p. 2]. The major functions of the transmitter and receiver are presented in the block diagram in Figure 1.3.1.

[Figure: source (source information) → source encoder (source codeword) → channel encoder (channel codeword) → line encoder and/or modulator → transmission channel (noise) → demodulator and/or line decoder (received channel codeword) → channel decoder (estimated source codeword) → source decoder (received information) → destination.]

Figure 1.3.1 Block diagram of a communication system.

The transmitter in Figure 1.3.1 transforms the signal given by the source into a form that minimizes the error probability in the received information. The channel encoder is responsible for error control coding, and it adds redundant data to the signal to be transmitted. The added error control data makes the data rate higher; a higher data rate requires a wider bandwidth, and then the received noise level becomes higher, which reduces the signal-to-noise ratio in the receiver and increases the error rate in the received channel codewords. However, if we have designed the system properly, the channel decoder in the receiver is able, with the help of the redundant error control information, to correct (or detect) most of the errors. This makes the residual error probability smaller than it would be without error control coding. The functions of each block in Figure 1.3.1 are briefly explained below.
Source encoder
Information from the data source is first processed by the source encoder, which may compress digital data or perform A/D conversion of analog information. The source encoder produces source codewords for the channel encoder.
Channel encoder
The channel encoder performs error control coding and produces channel codewords that contain, in addition to the source codewords, redundant information for error control.
A code that sends each input source word as one longer, fixed-length channel codeword is called a block code. In convolutional coding, data is not divided into words but encoded as a
continuous data stream. In both cases the data rate at the output of the channel encoder is higher than the data rate at the input because of the added redundant error control information.
Line encoder
The line encoder sends channel codewords as a continuous data stream with additional block synchronization information that the receiver needs to detect where each codeword starts. In the case of baseband transmission, line coding is carried out before transmission, and its purpose is to shape the spectrum of the transmitted signal so that it better suits the transmission channel. Line coding cancels the direct current content of the data, inserts bit synchronization information, and decreases the required bandwidth, for example by coding binary digits into multiple-value, M-ary, symbols.
Modulator
For radio transmission, modulation is carried out. The modulator converts each digital symbol into a corresponding analog waveform from a finite set of possible analog waveforms. For example, in Binary Phase Shift Keying (BPSK), binary value "1" may be sent as an in-phase carrier and binary value "0" as a carrier with a 180-degree phase shift (or vice versa, depending on the system).
Note that we can describe the operation of a modern radio system as having a separate line coding phase before modulation, or we may combine the two. For example, in the case of 8-PSK we may think that three bits are first encoded into an 8-value digital signal, and in the modulator each of these digital symbol values is represented as one of the 8 possible carrier phase shifts. We may alternatively think that each different block of three bits corresponds directly to a certain carrier phase, so that there is no need for a separate line encoding function in the transmitter.
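As a sketch of this view, the following Python fragment maps each block of three bits to one of the 8 carrier phase shifts; the bit-to-phase assignment here is an arbitrary illustrative choice, not the mapping of any particular standard.

```python
# Map each 3-bit group to one of 8 carrier phase shifts in 45-degree steps.
def psk8_phase(bits: str) -> float:
    symbol = int(bits, 2)        # interpret the three bits as a value 0..7
    return symbol * 45.0         # carrier phase shift in degrees

data = "101001110"
for i in range(0, len(data), 3):
    group = data[i:i + 3]
    print(group, "->", psk8_phase(group), "degrees")
```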
Transmission channel
The sequence of analog waveforms or line-encoded symbols is transmitted through the channel. The channel is subject to various types of noise, distortion and interference, so the signal at the channel output differs from the transmitted signal. The received signal is distorted and attenuated, and it contains noise.
Receiver
In the receiver, the demodulator produces received data from the received analog symbols, which contain noise and other disturbances added to the signal in the channel. If the noise is too high, the detected symbols may differ from the line-encoded data at the output of the transmitter. The line decoder then produces received codewords that may differ from the transmitted ones.
If the system uses Forward Error Correction (FEC), it tries to correct possible errors. The task of the channel decoder is then to estimate, with the help of the redundancy of the error control code, what the actual transmitted codeword was. In data communication systems that allow variable delay, Backward Error Correction (BEC), or ARQ, is used instead of FEC, and the decoder uses the error control information to detect whether errors have occurred. If the received codeword is detected to be in error, it is discarded and the transmitter retransmits the data block in error. It is much more efficient (requires less error control data) to detect errors than to correct them, but ARQ requires a reverse channel for acknowledgments to inform the transmitter whether retransmission is needed or not.

Finally, the source decoder produces the original information data or signal for the destination. As an example, the source decoder may carry out decompression of data, D/A conversion, etc.
Combined channel coding and modulation
In some modern systems, channel coding and modulation are not independent, as in Figure 1.3.1, but are combined. One widely used method for combined coding and modulation is Trellis Coded Modulation (TCM), which will be introduced in Chapter 6.

1.4. Applications for Error Control Coding


The main application area of error control codes is communications. These codes, however, have many other important applications. Codes are used to protect data against noise and faults in digital circuits. The performance of computers without error correction and detection coding would be poor. Error control codes protect data in computer memories and on digital tapes and disks. Compact Discs (CDs) and Digital Video Disks (DVDs) utilize error correction technology.
Error control codes are used to achieve reliable communication even when the received signal power level is close to the noise power, i.e., the signal-to-noise ratio (S/N) is low. This makes communication over longer distances possible because, with the help of these codes, the receiver can properly detect a very low-level signal that has suffered high attenuation. This is why deep-space communications often utilize the latest and most sophisticated error control schemes.
In some systems, such as mobile cellular systems, interference is the limiting factor instead of noise. With the help of error correction coding, receivers are able to tolerate more interference, and the utilization of the frequency band can be improved because other users in nearby cells are able to use the same frequencies. Radio transmission in mobile applications is very unstable because of variable fading, and transmission power should be kept as low as possible at all times to minimize interference to other simultaneous users. These radio transmission characteristics, together with the requirement for high utilization of the frequency band, make implementation of a high-performance digital mobile radio system impossible without efficient error control coding. The most common fading is so-called Rayleigh fading, which causes bursts of errors in the transmitted data. In addition to error control coding, efficient interleaving schemes are implemented in digital cellular mobile systems to improve the performance of the error control codes against such error bursts.
Error control technology is also important within a system where data flows between subsystems via internal buses. Many communication systems, such as routers and modular transmission systems, have this kind of structure. Corruption of control data on an internal bus may cause severe malfunctions that are avoided if an error control (detection and/or correction) code is implemented.
One example of a system in every home that uses error control coding is the teletext of TV broadcasts. That system utilizes the single-error-correcting Hamming code discussed in Chapters 2 and 3.


1.5. Waveform Coding and Line Coding


We can divide channel coding into two different areas, error control coding and waveform coding [2, p 246]. Our main concern here is error control coding, which adds redundant data to the transmitted data stream; the decoder uses this redundancy for error detection and correction.
The input data of the modulator contains the source codeword together with the error control redundancy. Waveform coding in the modulator transforms waveforms into "better waveforms", which are suitable for the transmission channel and make the detection process in the receiver less susceptible to errors. For example, in binary phase modulation, Binary Phase Shift Keying (BPSK), we use carrier phases 0 and 180 degrees to make the Euclidean distance between "0" and "1" as long as possible at a given symbol energy. Carrier wave modulation transfers the data spectrum to the radio frequency band available for transmission. The modulator has then performed waveform coding and changed binary digits, bits, into carrier waveforms for radio transmission.
In digital baseband transmission, such as in LANs, we do not use continuous wave (CW) modulation, and we often use the term line coding, which is one form of waveform coding. Line coding selects the symbols, pulse shapes and values that are transmitted to the line or the channel in such a way that at a certain average signal energy the error rate is minimized. One purpose of line coding is to remove the direct current component from the data stream, since it does not carry any information but wastes power.
In both radio and baseband transmission the receiver has to be able to recover symbol synchronization, which it needs to sample each received symbol in order to estimate its value. This is another purpose of line coding. In baseband transmission we may use, for example, the Manchester code, which transmits each bit as a transition from one value to another. We could also use Manchester coding before modulation, which would ensure that the receiver is informed, by the carrier phase changes, of the rate at which the symbols arrive.
Although waveform coding is essential for good-quality transmission, we concentrate from now on on error control coding, which provides a further improvement in performance.

1.6. Automatic Repeat Request (ARQ)


ARQ is a Backward Error Correction (BEC) scheme. For data transmission, Forward Error Correction (FEC) is rarely used alone, because it requires a large amount of redundant data and becomes overloaded when errors occur very frequently. Data transmission tolerates variable delay from frame to frame, making retransmission of discarded frames feasible. The need for retransmission is checked in the receiver with the help of an error detection code. If the decoder in the receiver detects that errors have occurred, it discards the frame in error. In some systems this is all that the receiver does in the case of errors, but in other systems the data link layer sends a negative acknowledgment to the transmitting party. In the case of successful reception, a positive acknowledgment is sent to the transmitting machine to tell it that retransmission is not necessary.
In the Data Link Layer of the sending machine (e.g. in a LAN), error detection encoding is carried out and a timer is activated for each transmitted frame. If the transmitter does not receive the positive acknowledgment within a defined time, the timer expires and the data frame is retransmitted.
There are many different ARQ protocols in use, and many of them, for example TCP, use the so-called "sliding window" principle. These systems are able to send a certain number of frames before any of them has been acknowledged. This makes the utilization of the transmission channel efficient, especially if the channel delay is long. The retransmission principle may be so-called "go back n" or selective repeat. The former resends all frames starting from the first one that has not been acknowledged. Selective repeat is a more efficient scheme: it resends only the frame in error, and acknowledged frames that were sent later are not retransmitted. We will not study ARQ methods further here but concentrate on the design and implementation of various error control codes.
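The timer-and-retransmission idea can be sketched as a toy stop-and-wait simulation in Python; the loss probability and retry limit are assumed illustrative values, and real protocols add sequence numbers, sliding windows and explicit acknowledgment frames.

```python
import random

P_LOSS = 0.3                 # assumed probability that a frame is lost or corrupted

def channel(frame):
    """Deliver a frame, or lose it with probability P_LOSS."""
    return None if random.random() < P_LOSS else frame

def send(frames, max_tries=10):
    for seq, frame in enumerate(frames):
        for attempt in range(max_tries):      # each try: transmit, wait for ACK
            if channel((seq, frame)) is not None:
                print(f"frame {seq} delivered after {attempt + 1} transmission(s)")
                break                          # positive acknowledgment received
            # no ACK: the timer expires and the frame is retransmitted
        else:
            raise RuntimeError(f"frame {seq}: gave up after {max_tries} tries")

random.seed(1)
send(["alpha", "beta", "gamma"])
```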


2. Linear Block Codes


The history of error control block codes began in 1950 when a class of single-error-correcting block codes was introduced by Hamming [8, p 399]. The correcting capability of Hamming codes is quite weak, but they are still used in many applications, such as Bluetooth. A major breakthrough was the discovery of a large family of multiple-error-correcting binary codes in 1959-60. They were named BCH codes after their inventors Bose, Ray-Chaudhuri and Hocquenghem. An important subclass of BCH codes was discovered by Reed and Solomon. These RS codes are not binary codes, and they are maximum distance separable, i.e., they achieve the largest possible minimum distance between their codewords. Since the 1960s, block codes have become practical to implement for a wide range of applications thanks to the development of integrated circuits. Satellite communications were one of the first applications of these codes.
Block codes are described by two parameters:

n is the length of the codeword, or block length, and
k is the number of information symbols encoded into each codeword.

We write a block code as (n, k). How the codewords are generated for each set of k information bits is defined by the generator polynomial or generator matrix of the code.
Most of the known good codes belong to a class called linear codes. The structure of these codes makes encoders and decoders easier to implement. The codewords of a linear code have the following properties:

- the code contains the all-zero word;
- the sum of any two codewords is another codeword of the code.

In a linear code every codeword is a combination of other words in the code. This makes the analysis of their performance much easier, as we will see later. Many practical codes are also systematic codes, in which the first bits in a code block are equal to the information bits and the rest are redundant symbols for error control.
Next we define some terms that we need later when we analyze different codes. Then we introduce some simple block codes. Later in this chapter we introduce the linear algebra that is needed to create and analyze block codes, and we then learn to design and implement some block codes.

2.1. Introduction to block codes


Most technically trained persons can easily understand the basic principles of error control coding. On the other hand, it is extremely difficult to understand and find codes with good performance. We first introduce some very simple codes to clarify the basic terms needed in later studies. Then we give an introduction to linear algebra and Galois fields, on which the coding theory of linear block codes is based.
Let V be the set of all possible combinations of n symbols (in the binary case, a set of bits) xi, called n-tuples, where i = 1...n [5]. In the binary case these symbols xi may take the values 0 or 1.
V = {x1, x2, ..., xn}        (2.1.1)

We call a subset C of V a code, and the selected n-tuples of C we call codewords. We use M for the number of codewords in C. Note that error control requires redundancy, which means that not all possible combinations of n bits (the whole of V) are used as codewords. If all codewords have a constant length n, we speak of a block code with block length n.
For a certain code we select M = 2^k of the vectors in V as codewords. This means that to transmit k information bits we need M codewords, each representing one of the possible sets of k information bits.
Example 2.1.1
Consider a binary block code with block length n = 5 where each codeword transmits two information bits, i.e., k = 2 [4, p 6]. There are 2^n = 2^5 = 32 different 5-bit words, and thus V includes 32 different sets of 5 binary digits. Two information bits have only 2^k = 4 possible combinations, and we need only 4 different codewords from V to transmit information containing the four possible sets of information bits. Thus the subset C, the code, includes four 5-bit words. For example, we may define our code as:

C = { [10101], [10010], [01110], [11111] }        (2.1.2)

This is a very small (5, 2) code with k = 2, M = 4 and n = 5. We use one codeword for each set of two information bits, and we may define, for example, that:

00 → 10101
01 → 10010
10 → 01110
11 → 11111        (2.1.3)

If error correction is used and a five-bit word is received, the decoder selects the corresponding 2-bit information sequence for the destination. If an error has occurred, the received word is most probably different from all four 5-bit codewords. The decoder in the receiver then has to estimate (for error correction) which of the four codewords was most probably the transmitted one and gives the corresponding 2-bit sequence to the destination. For example, if the received word is [01100], the transmitted codeword was most probably [01110] and hence the information word was most probably [10].
If error detection is used, it is most probable that in the case of errors one of the words not included in the code is received. If the received word is not one of the codewords, we know that errors have occurred, and this is enough for error detection. Without error control coding, transmission errors in a two-bit uncoded sequence would create another two-bit word, and the receiver would not be able to detect or correct the errors.
Let us now assume that a severe noise burst has occurred and the receiver receives a random five-bit word. For this simple code we can easily calculate the probability that the error is detected: 1 − 4/32 = 87.5 %. Error detection fails if one of the words in the code is received, and the residual error rate is then 4/32 = 12.5 %.
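The decoding rules of this example are easy to express as a small Python sketch: the dictionary below encodes mapping 2.1.3, and the decoder simply picks the information word whose codeword is closest to the received word.

```python
# The (5, 2) code of Example 2.1.1 (mapping 2.1.3).
CODE = {"00": "10101", "01": "10010", "10": "01110", "11": "11111"}

def hamming_distance(x: str, y: str) -> int:
    return sum(a != b for a, b in zip(x, y))

def correct(received: str) -> str:
    """Error correction: the nearest codeword wins."""
    return min(CODE, key=lambda info: hamming_distance(CODE[info], received))

def detect(received: str) -> bool:
    """Error detection: flag an error if the word is not a codeword."""
    return received not in CODE.values()

print(correct("01100"))   # -> '10', since [01110] is the closest codeword
print(detect("01100"))    # -> True, errors detected
```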

We can see that the code of Example 2.1.1 is not necessarily able to correct or even detect two errors, because two errors may change a codeword into another error-free codeword. For example, errors in the second and fourth bits would change [10101] to [11111], and the errors would not be noticed.
We could improve the performance by using longer codewords. The performance of the code depends not only on the length of the code but also on how we select the codewords that are used for transmission, i.e., how we define the code. The problem of finding the best possible selection of codewords is an extremely difficult task, and many good codes are still undiscovered.

2.2. Block Code Characteristics


The encoding process for a block code receives k bits of information at a time from the source or source encoder and encodes these into an n-bit codeword, which is transmitted, probably with the help of modulation, to the transmission channel as shown in Figure 2.2.1. We call this an (n, k) code. Note that there are many codes with this same designation (n, k), because there are many ways to define which of the n-bit words are selected and how they are related to the k-bit information sequences. To improve error performance with the help of error control coding, we have to make n > k.
[Figure, upper part (block code): source (source information) → channel encoder (k-bit block in, n-bit block out) → transmission channel → channel decoder (n-bit block in, k-bit block out) → destination (received information). Lower part (tree code): source (k bits/frame) → channel encoder (channel-encoded data, n bits/frame) → transmission channel (received data) → channel decoder (k bits/frame) → destination (received information).]

Figure 2.2.1 Block coding and tree coding [4, p 8].

Block codes, see Figure 2.2.1, encode k information bits into n-bit codewords. Encoding is done block by block, and each block is independent of the previous and following blocks. There is a one-to-one relationship between each sequence of k information bits and the corresponding codeword.
We will see that the longer the code, the better the performance. However, the selection of the coding method should be based on the characteristics of the channel and the application requirements.

For example, for data applications over a radio channel, where errors often occur in bursts, it is often reasonable to use a short block length with error detection and to retransmit the frames in error.
In the case of tree codes, such as convolutional codes, coding is continuous and data is not divided into independent blocks. However, during a certain time interval an n-bit frame is transmitted for each k-bit information frame, see Figure 2.2.1. The encoder stores a few previous information frames, and they affect the encoded n-bit frame in addition to the present information. Thus subsequent transmitted n-bit frames are not independent. We will come back to these codes in Chapter 5.
2.2.1. Code Rate
We define the Code Rate for a block code (n, k) to be
Rc = k/n        (2.2.1)

The code rate is always smaller than one; it tells the proportion of information in the transmitted data, while 1 − Rc tells the proportion of redundancy inserted into the transmitted data. In many error correction codes in use, the code rate is on the order of 0.5 to make the performance good enough. This is the case, for example, in GSM radio path error correction. Note that such a code rate doubles the data rate. If only error detection is needed, a code rate close to one usually gives good enough performance. For example, in Ethernet four bytes are used for error detection in frames of up to 1500 bytes, so the code rate in that case is about 1500/1504 ≈ 0.997.
2.2.2. Hamming Distance
The Hamming distance d(x, y) of two codewords x and y is the number of places in which they differ [4, p 9]. For example, if x = [10101] and y = [01100], then d(x, y) = d(10101, 01100) = 3. If we used a non-binary code with multiple symbol values, not just 0 and 1, the Hamming distance would tell only how many symbols are different, not how much they differ.
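In Python, for example, the Hamming distance between two equal-length binary words can be computed with a one-line comparison (a minimal sketch):

```python
def hamming_distance(x: str, y: str) -> int:
    """Number of places in which two equal-length words differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

print(hamming_distance("10101", "01100"))   # -> 3, as in the text
```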
2.2.3. Minimum Hamming distance
The minimum Hamming distance, or free distance, dmin, is the smallest number of places in which any two codewords of a code differ [4, p 9; 1, p 473].

dmin = min d(x, y)        (2.2.2)

where x and y range over all pairs of different codewords of the code. The smallest of all possible distances is taken as the minimum distance. The minimum distance plays an important role in error control coding because it tells the minimum number of errors that may change one codeword into another. Sometimes we describe a code as (n, k, dmin) instead of (n, k) because of the importance of the minimum distance.
Our fundamental goal when designing a code is to select the 2^k codewords from the 2^n available words so that the minimum distance is maximized.
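Equation 2.2.2 translates directly into a short Python sketch that finds dmin of the code of Example 2.1.1 by checking every pair of codewords:

```python
from itertools import combinations

def hamming_distance(x: str, y: str) -> int:
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(codewords) -> int:
    """dmin = min d(x, y) over all pairs of different codewords (Eq. 2.2.2)."""
    return min(hamming_distance(x, y) for x, y in combinations(codewords, 2))

code = ["10101", "10010", "01110", "11111"]   # the code of Example 2.1.1
print(minimum_distance(code))                 # -> 2
```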

2.2.4. Hamming Weight
The Hamming weight w(c) of a codeword c is equal to the number of nonzero places in the codeword [4, p 46]. We will see in Subsection 2.5.1 that for linear codes the Hamming weights give the Hamming distances. It is much simpler to evaluate the weights of the codewords than to calculate all the Hamming distances.
Example 2.2.1
In Example 2.1.1 we defined a code with four codewords. Hamming distances for that
code are:
d(10101,10010) = 3
d(10101,01110) = 4
d(10101,11111) = 2
d(10010,01110) = 3
d(10010,11111) = 3
d(01110,11111) = 2
We see that the minimum distance dmin = 2 for this code. This tells how many errors are needed, in the worst case, for the code to fail. The Hamming weights are simply w(10101) = 3, w(10010) = 2, w(01110) = 3 and w(11111) = 5.
We saw that the code in the example above had 2^2 = 4 Hamming weights and six Hamming distances, i.e., different pairs of codewords. Generally we get 2^k = M Hamming weights, the same as the number of codewords, and C(2^k, 2) = 2^k(2^k − 1)/2 Hamming distances, the number of different combinations of two codewords, because we have to compare each word with all the others.
Example 2.2.2
Let us take two simple codes from Table 3.2.1: the (7, 4) and (15, 11) Hamming codes. Code (7, 4) has 2^4 = 16 Hamming weights and C(16, 2) = 16!/(2!·14!) = 120 Hamming distances. Code (15, 11) has 2^11 = 2048 weights and C(2048, 2) = 2 096 128 distances.
Most codes used in practice have much longer block lengths and the difference between the
number of distances and weights is much larger than in the example above. We see that if we
can manage with weights only, it is an important advantage. This is why all codes in use are
linear codes for which the minimum weight tells the minimum distance.
2.2.5. Systematic Codes
For systematic codes, the first bits of the transmitted codeword equal the information bits, i.e., the first k bits of the n-bit codeword contain the information bits as they are, and the remaining n − k bits are redundant bits for error control. A convolutional code (tree code in Figure 2.2.1) can also be systematic if the first k bits in each n-bit frame are equal to the information bits. Most block codes in use are systematic.

2.2.6. Error Correction and Detection Capability of the Block Codes
When a codeword in error is received, the task of the error-correcting decoder is to find the codeword that was most probably the transmitted one. For that we assume that the codeword with the smallest Hamming distance to the received word is the best choice. Example 2.2.3 shows that this is a well-founded assumption.
Example 2.2.3:
Let us assume that the Bit Error Rate (BER) of the received data is 1·10^-3, i.e., there is on average one error per 1000 bits of data, and that the encoded data is transmitted in 10-bit blocks. We may use the binomial distribution of Equation 2.3.1 to derive the probabilities of different numbers of errors. Now n = 10 and p = BER = 1·10^-3. We get the probabilities of zero (i = 0), one (i = 1) and two (i = 2) errors as:

P(i) = C(n, i) p^i (1 − p)^(n−i), where C(n, i) = n!/(i!(n − i)!)

P(0) ≈ 0.990 = 99 %
P(1) ≈ 9.9·10^-3 ≈ 1 %
P(2) ≈ 4.46·10^-5 ≈ 0.0045 %

These figures mean that 99 words out of 100 are error free and approximately one in 100 contains one error. For each block containing two errors, we on average receive more than 22 000 blocks containing a smaller number of errors (0 or 1 error).
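The figures of Example 2.2.3 are easy to reproduce with a few lines of Python using the binomial formula above:

```python
from math import comb

def p_errors(n: int, i: int, p: float) -> float:
    """Probability of exactly i errors in an n-bit block at bit error rate p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

n, ber = 10, 1e-3
for i in range(3):
    print(i, p_errors(n, i, ber))
# prints P(0) ~ 0.990, P(1) ~ 9.9e-3, P(2) ~ 4.46e-5, as in Example 2.2.3
```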
Because a smaller number of errors is more probable than a larger number, the error-correcting decoder goes through all possible codewords and assumes that the one closest to the received word was transmitted. This natural method of error correction decoding is called maximum likelihood decoding.
If the received codeword in error happens to be equal to one of the error-free codewords, the decoder is not able to correct, or even detect, that errors have occurred. The minimum number of errors that may cause this situation is equal to the minimum distance dmin.
In Example 2.1.1 we saw that, for the code used in the example, two errors, the first and last bits in error, change the word [01110] to [11111], and the error would not be detected. If only one error had occurred, the first bit in error, the received word would be [11110], and the decoder would not be able to correct the error because it would not know whether the transmitted word was [01110] or [11111]; the received word could be either of these with one bit in error. However, the decoder would detect the error because the received word is not exactly the same as one of the error-free codewords.
Generally, if t errors occur, the decoder is always able to correct them if

dmin ≥ 2t + 1        (2.2.3)

For the code in Example 2.1.1, dmin = 2, and thus t must equal 0; not even a single error is guaranteed to be corrected. Looking at the code of Examples 2.1.1 and 2.2.1, we see that sometimes error correction is possible even if the inequality is not satisfied, but t-error correction is not guaranteed if dmin < 2t + 1.

A code is able to detect an error if the received word is not the same as one of the codewords. This is always the case if fewer errors than the minimum distance have occurred. Up to l bit errors are always detected if

dmin ≥ l + 1        (2.2.4)

If l errors occur and Equation 2.2.4 holds, it is still certain that the errors have not changed the codeword into another codeword, because all codewords differ from each other in more than l bit positions.

2.3. Examples of simple block codes


In this section we look at some examples of simple block codes. With the help of these codes we get an introduction to block codes in general. These codes are very easy to implement, and they provide a much smaller residual error rate than using no error control coding at all. Much better codes are presented in later chapters.
2.3.1. Simple Parity-Check Codes
Parity-check codes are very simple high-rate codes with poor performance [4, p 11]. A codeword contains k information bits and one parity bit. The parity check bit is set to 0 or 1 so that the total number of ones in the codeword is even (even parity) or odd (odd parity). Thus, for example, if k = 4 and even parity is used, four-bit information sequences are coded into 5-bit codewords as follows:

0000 → 00000
0001 → 00011
0010 → 00101
0011 → 00110
...
1111 → 11110
Sometimes we use odd parity, and the parity bit then gets the value that makes the number of ones in the codeword odd. The length of the codeword is n = k + 1. We write this code as an (n, n−1) or (k+1, k) code. Its minimum distance is 2, and hence it cannot correct any errors, but it detects any single error and, more generally, any odd number of errors, see Equations 2.2.3 and 2.2.4. The code rate of the (5, 4) code above is Rc = k/n = 4/5 = 0.8.
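A minimal Python sketch of even-parity encoding and checking (the function names are illustrative):

```python
def parity_encode(info: str) -> str:
    """Append an even-parity bit so that the total number of ones is even."""
    return info + str(info.count("1") % 2)

def parity_ok(word: str) -> bool:
    """Even parity must hold in an error-free word."""
    return word.count("1") % 2 == 0

print(parity_encode("0011"))   # -> '00110', as in the list above
print(parity_ok("00111"))      # -> False: a single error is detected
```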
Vertical and Longitudinal Parity, VRC/LRC
We can improve the previous simple single-error-detecting parity-check code so that it corrects a single error by building bigger code blocks and adding one parity word to the end of each code block. We calculate a "horizontal" parity bit for each byte or character and call a byte including its character parity bit a Vertical Redundancy Check (VRC) [10, p 64]. Then we calculate the "vertical" parity of the first bits of the bytes, the second bits of the bytes, etc., over the whole block, and add an additional block parity word at the end of the block. This we call a Longitudinal Redundancy Check (LRC). Let us take an example that uses even LRC and VRC parity.
Example 2.3.1
Information data is divided into 4-bit bytes and a block of four bytes is transmitted as
one information data block. Let us now construct a code block when the information
sequence is [0100 0011 1111 1100]:
The VRC parity bits of the four-bit bytes are shown after the vertical bar:

0 1 0 0 | 1
0 0 1 1 | 0
1 1 1 1 | 0
1 1 0 0 | 0
LRC byte:
0 1 0 0 | 1

The parity word (Longitudinal Redundancy Check, LRC byte) is [01001]. The codeword to be transmitted then becomes [01001 00110 11110 11000 01001]. This is a (25, 16) code, and its code rate is 16/25 = 0.64.
If, in the example above, the first bit is received as a one, the receiver notices that the first horizontal parity as well as the first vertical parity indicates an error. The only bit that can cause this is the first one. The decoder then inverts it, and the error is corrected. If two errors occur in the same row (or column), the horizontal (vertical) parity does not indicate an error, and only detection of the errors is possible: two parity bits indicate errors, but the decoder is not able to locate them. Two or more errors cannot be corrected even if they are located neither in the same column nor in the same row, because the decoder does not know which indicated rows and columns belong together. However, the detection of multiple errors is quite reliable. Note that the last bit in a block is just as important as any other parity check bit in the code, because it tells whether a bit in error is a parity bit or a bit in the information section.
This code has minimum distance of three and according to previously presented formula dmin
2t+1 all single errors are correctable. According to error detection formula (dmin l + 1) all
double error cases are always detected.
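The VRC/LRC scheme is equally simple to implement. A Python sketch under the conventions of Example 2.3.1 (even parity; the function names are ours):

    def vrc_lrc_encode(rows):
        # Append an even-parity bit to each data row (VRC) ...
        coded = [row + [sum(row) % 2] for row in rows]
        # ... and an even-parity word over all columns (LRC).
        return coded + [[sum(col) % 2 for col in zip(*coded)]]

    def vrc_lrc_correct(block):
        # A single error makes exactly one row and one column parity fail;
        # the error sits at their crossing and is inverted.
        bad_rows = [i for i, row in enumerate(block) if sum(row) % 2]
        bad_cols = [j for j, col in enumerate(zip(*block)) if sum(col) % 2]
        if len(bad_rows) == 1 and len(bad_cols) == 1:
            block[bad_rows[0]][bad_cols[0]] ^= 1
        return block

    block = vrc_lrc_encode([[0,1,0,0], [0,0,1,1], [1,1,1,1], [1,1,0,0]])
    block[0][0] ^= 1           # inject a single error
    vrc_lrc_correct(block)     # the error is located and corrected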
2.3.2. Repetition Codes
Repetition codes are low-rate codes with good error performance. They are usable in applications where data rate is not a limiting factor but simple implementation is; one example is Bluetooth, where the packet header and the payload of heavily protected packets use a (3, 1) repetition code. The encoder simply sends each information bit several times. It is reasonable to repeat each bit an odd number of times to get better correction capability, because for example the (3, 1) and (4, 1) codes can both correct only one error per codeword.
Example 2.3.2
If we repeat each information bit five times, n = 5 and we have a (5, 1) block code. Its codewords are:

    0 → 00000
    1 → 11111

We see that the only Hamming distance in the code is 5, so it is naturally also the minimum distance. If two errors occur, the distance to the original codeword is two and the distance to
the other one is three, and error correction is successful. Four errors are not enough to change one codeword to the other, so up to four errors are successfully detected. The reader may verify these results with the help of Equations 2.2.3 and 2.2.4.
Generally the minimum distance of a repetition code is n and it can correct up to (n−1)/2 errors; the reader may check this by substituting t = (n−1)/2 into Equation 2.2.3. In the same way we get from Equation 2.2.4 that l ≤ dmin − 1 = n − 1, so n − 1 errors are successfully detected. The code rate of a repetition code is Rc = 1/n, that is, 0.2 for the (5, 1) code of Example 2.3.2.
Instead of encoding information bit by bit, we could encode an information word into a codeword in which each bit is repeated, say, r times [5, p 405]. An example, where information bits are grouped into four-bit blocks (k = 4), is shown in Table 2.3.1. The table also presents what may happen when a single, double or triple error occurs.
Table 2.3.1 Examples of encoding, repetition code (12, 4)

    example  original  encoded message   received message  decoded
             message                                       message
       1      0110     000 111 111 000   000 111 101 000    0110
       2      0110     000 111 111 000   000 111 100 000    0100
       3      0110     000 111 111 000   000 011 101 001    0110
       4      0110     000 111 111 000   111 111 111 000    1110

We can easily analyze this simple (12, 4) code without deeper knowledge of coding theory. We see immediately that its Hamming distances are 3, 6, 9 and 12, so the minimum distance is 3; the code can correct 1...4 bit errors (always a single error) and detect 1...8 bit errors (always single and double errors). See Equations 2.2.3 and 2.2.4 to compare minimum distance and error control capability.
Example 1 of Table 2.3.1 shows what happens in decoding when a single error occurs: the decoder corrects the error using a majority decision and the third bit of the message is received as the original one. Example 2 presents a double error case where the majority decision gives an output of 0 instead of 1, so the third bit of the decoded message is in error. Example 3 shows a triple error case that the decoder corrects successfully, because only one repeated bit is in error in each three-bit repetition subsequence. Example 4 shows a three-error case that the decoder can neither detect nor correct.
We could improve the code in Table 2.3.1 with interleaving for better burst error tolerance. We would then send one bit from each of the four three-bit blocks of the encoded message at a time, which we could implement simply by sending the four-bit information sequence as it is three times. This code would then correct error bursts of length four or less. Burst error correction is discussed at the end of Chapter 3.
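Encoding and majority-vote decoding of such a repetition code take only a few lines of Python (rep_encode and rep_decode are illustrative names):

    def rep_encode(bits, r=3):
        # Repeat every information bit r times.
        return [b for b in bits for _ in range(r)]

    def rep_decode(coded, r=3):
        # Majority decision over each r-bit subsequence.
        return [1 if sum(coded[i:i + r]) > r // 2 else 0
                for i in range(0, len(coded), r)]

    # Example 2 of Table 2.3.1: a double error defeats the majority decision.
    received = [0,0,0, 1,1,1, 1,0,0, 0,0,0]
    print(rep_encode([0, 1, 1, 0]))   # 000 111 111 000
    print(rep_decode(received))       # [0, 1, 0, 0]: third bit decoded wrongly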
2.3.3. Residual error rate of a repetition code
Error correction schemes are never perfect, but they are able to decrease the received bit error probability, usually by several orders of magnitude. The error rate remaining after decoding we call the residual error rate. Another way to see this is that we can tolerate a lower signal to noise ratio
(and received bit error rate); then we call this improvement coding gain, which is discussed in Section 1.1.
For repetition codes we can easily calculate the improvement in error performance by using the binomial distribution, taking p as the bit error probability (BER):

    P(i) = C(n, i) pⁱ (1 − p)ⁿ⁻ⁱ        2.3.1

This formula gives the probability of i errors in a sequence of n bits when the probability of a bit error is p (and errors are independent). The binomial coefficient C(n, i) takes into account all possible error sequences of i errors in an n-bit block and is given as

    C(n, i) = n! / [i! (n − i)!]        2.3.2

Let us now analyze the code in Table 2.3.1 in terms of the residual error rate. Note that the results we get are valid only for this case, where we derive the residual block error probability, i.e., the probability that the decoded four-bit word contains one or more errors.
Example 2.3.3
Assume that the bit error probability is 0.05, i.e., 5·10⁻². Let us study two examples where the number of information bits k per codeword is 4 or 6. Each of these information bits is error control coded so that it is sent as it is (repetition rate one, i.e., no encoding at all), or repeated 3 or 5 times. Thus the code block length for k = 4 is 4, 12 or 20 and for k = 6 it is 6, 18 or 30. Here we consider error correction only. Now we may calculate the residual block error probability for received 4- and 6-bit codewords, i.e., the probability that the decoded word contains 1 or more errors:
Repetition rate 1 (no error control coding): the decoding error probability is

    1 − (1 − p)ᵏ = P(i > 0) = P(1) + P(2) + ... + P(k)

This is one minus the probability that all k bits are error free (left-hand side), or equally the probability of 1, 2, ..., k errors (right-hand side). Both give the same result, as we will see in Problem 2.3.5.
Repetition rate 3:

    1 − [P(0) + P(1)]ᵏ = 1 − [(1 − p)³ + 3p(1 − p)²]ᵏ

This is one minus the probability that all k subsequences of 3 bits are received error free or include a single error, i.e., one minus the probability that all k bits are corrected successfully.
Repetition rate 5:

    1 − [P(0) + P(1) + P(2)]ᵏ = 1 − [(1 − p)⁵ + 5p(1 − p)⁴ + 10p²(1 − p)³]ᵏ

This is one minus the probability that all k subsequences of five bits contain 0, 1 or 2 errors, i.e., one minus the probability that they are all correctable.
Table 2.3.2 Decoding block error probability when BER = p = 0.05

    Length of information          Repetition rate
    data block                 1           3           5
    k = 4                  1.9·10⁻¹    2.9·10⁻²    4.6·10⁻³
    k = 6                  2.6·10⁻¹    4.3·10⁻²    6.9·10⁻³

Table 2.3.2 shows the error probability of decoded k-bit words when the k information bits are sent without repetition (repetition rate 1) and when they are repeated 3 and 5 times. We see that performance is approximately ten times better when bits are repeated three times and about 100 times better when they are repeated five times. With a lower bit error rate the difference would be much larger: for example, when BER = 1·10⁻³, a repetition rate of three reduces the number of words in error by a factor of about 1000. With more sophisticated codes we can do even much better than this!
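The figures of Table 2.3.2 are easy to reproduce with Equation 2.3.1, assuming independent bit errors (block_error_prob is an illustrative name):

    from math import comb

    def block_error_prob(p, k, r):
        # A k-bit word decodes wrongly unless every r-bit group carries
        # at most t = (r-1)//2 errors, which the majority decision corrects.
        t = (r - 1) // 2
        group_ok = sum(comb(r, i) * p**i * (1 - p)**(r - i) for i in range(t + 1))
        return 1 - group_ok**k

    for k in (4, 6):
        print([round(block_error_prob(0.05, k, r), 4) for r in (1, 3, 5)])
    # k = 4: [0.1855, 0.0287, 0.0046]   k = 6: [0.2649, 0.0427, 0.0069]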
2.3.4. Hamming Codes
Here we introduce codes that can always correct a single error. Later we will handle these codes in a more analytic way, but at this point we only illustrate the structure and operation of a Hamming code. For each integer m a (2ᵐ − 1, 2ᵐ − 1 − m) Hamming code exists [4, p12]. When m = 1 we get a one-bit (1, 0) code whose codeword carries no information, so a single error causes no harm. With m = 2 we get the (3, 1) code, a simple repetition code that can correct one error in each three-bit block. When m is large the code rate is close to one, but then the codewords are long and only one error in each of them is correctable.
The shortest non-trivial Hamming code (7, 4) we get when m = 3. One possible set of codewords for the (7, 4) Hamming code is presented in Table 2.3.3. There are four information bits, i1 i2 i3 i4, followed by three parity bits, p1 p2 p3, in each codeword. This is a systematic code because the first bits are identical to the information bits. In this Hamming code example the parity bits are defined as (you might define them in other ways, as we will see later):

    p1 = i1 + i2 + i3
    p2 = i2 + i3 + i4        2.3.3
    p3 = i1 + i2 + i4

Information bits are added modulo-2, i.e., an odd number of "1" bits in the sum makes 1 and an even number of them makes 0:

    0 + 0 = 0,  0 + 1 = 1,  1 + 0 = 1,  1 + 1 = 0

Table 2.3.3 (7, 4) Hamming code example.

    i1 i2 i3 i4   p1 p2 p3
    0  0  0  0    0  0  0
    0  0  0  1    0  1  1
    0  0  1  0    1  1  0
    0  0  1  1    1  0  1
    0  1  0  0    1  1  1
    0  1  0  1    1  0  0
    0  1  1  0    0  0  1
    0  1  1  1    0  1  0
    1  0  0  0    1  0  1
    1  0  0  1    1  1  0
    1  0  1  0    0  1  1
    1  0  1  1    0  0  0
    1  1  0  0    0  1  0
    1  1  0  1    0  0  1
    1  1  1  0    1  0  0
    1  1  1  1    1  1  1
The first four bits of each codeword in Table 2.3.3 are equal to the information bits and the last three are parity bits calculated according to Equations 2.3.3. A code like this, which transmits first the information bits as they are and then the parity bits, we call a systematic code. Naturally there are 2ᵏ = 2⁴ = 16 different codewords and each block of 4 information bits is transmitted as one of them. Figure 2.3.1 presents the encoding logic of this Hamming code.
[Figure: the 4-bit data word i1 i2 i3 i4 passes through unchanged, and three modulo-2 adders compute the parity bits of the 7-bit codeword: p1 over i1, i2, i3; p2 over i2, i3, i4; p3 over i1, i2, i4.]

Figure 2.3.1 Hamming (7, 4) encoder [4, p13].

A modulo-2 adder output is one if the number of input bits with value 1 is odd, i.e., 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0. Figure 2.3.2 presents the decoder of the (7, 4) Hamming code. The decoder receives codewords that may be in error:

    c' = [i'1, i'2, i'3, i'4, p'1, p'2, p'3]
We write the encoded and transmitted bit as, for example, i1 and the received bit as i'1 because it may be in error. The decoder in the receiver now computes three bits for error correction as:

    s1 = p'1 + i'1 + i'2 + i'3
    s2 = p'2 + i'2 + i'3 + i'4        2.3.4
    s3 = p'3 + i'1 + i'2 + i'4

[Figure: three modulo-2 adders compute the syndrome bits s1, s2, s3 from the received 7-bit codeword according to Equations 2.3.4; s1 indicates an error at i1, i2, i3 or p1, s2 an error at i2, i3, i4 or p2, and s3 an error at i1, i2, i4 or p3. Error correction logic with outputs o1...o4 inverts the indicated bit and produces the decoded 4-bit data word.]

Figure 2.3.2 Hamming (7, 4) decoder [4, p13].

The three-bit pattern [s1, s2, s3] is called a syndrome. It does not depend on the information bits but on the error pattern only, i.e., on how errors are located in the codeword. There are eight possible syndromes: one corresponds to an error-free word and the other seven each correspond to one of the possible single error patterns. If a single error occurs, the error correcting decoder can identify, by calculating the syndrome, which of the received bits is in error and correct it (invert it).
Example 2.3.4
Let us, as an example, derive the syndrome s = [s1, s2, s3] for a single error case where i'1 is in error and all other received bits are error free. Then i'1 ≠ i1, i'2 = i2, i'3 = i3 and p'1 = p1 = i1 + i2 + i3 (as calculated in the encoder, Equation 2.3.3). Now we get, for example, syndrome bit s1 according to Equation 2.3.4 as

    s1 = p'1 + i'1 + i'2 + i'3 = p1 + i'1 + i2 + i3 = (i1 + i2 + i3) + i'1 + i2 + i3
       = (i'1 + i1) + (i2 + i2) + (i3 + i3) = 1 + 0 + 0 = 1

This is valid no matter what the actual information bit values are, because both 1+1 and 0+0 make 0 and both 1+0 and 0+1 make 1. When we derive the other syndrome bits the same way, we get:

    s = [1 0 1]

This indicates that error correction should invert i'1 for this and only this syndrome. With this input the output o1 of the error correction logic in Figure 2.3.2 goes to the high state.
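The complete encoder and syndrome decoder of this (7, 4) Hamming code fit in a short Python sketch; the syndrome-to-position table below was filled in by deriving, as in Example 2.3.4, the syndrome of each of the seven single error patterns:

    def ham74_encode(i1, i2, i3, i4):
        # Parity bits according to Equations 2.3.3 (+ is modulo-2, i.e. XOR).
        return [i1, i2, i3, i4, i1 ^ i2 ^ i3, i2 ^ i3 ^ i4, i1 ^ i2 ^ i4]

    def ham74_decode(c):
        # Syndrome bits according to Equations 2.3.4.
        s = (c[4] ^ c[0] ^ c[1] ^ c[2],
             c[5] ^ c[1] ^ c[2] ^ c[3],
             c[6] ^ c[0] ^ c[1] ^ c[3])
        # Each single error pattern has its own syndrome; invert that bit.
        pos = {(1,0,1): 0, (1,1,1): 1, (1,1,0): 2, (0,1,1): 3,
               (1,0,0): 4, (0,1,0): 5, (0,0,1): 6}
        if s in pos:
            c[pos[s]] ^= 1
        return c[:4]      # parity bits are discarded after correction

    c = ham74_encode(1, 0, 1, 1)   # -> [1, 0, 1, 1, 0, 0, 0], see Table 2.3.3
    c[0] ^= 1                      # single error in i1, syndrome [1 0 1]
    print(ham74_decode(c))         # [1, 0, 1, 1]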

After error correction the parity bits are discarded. If two or more errors occur, the decoder is not able to correct them: the syndrome takes a value corresponding to one of the single error patterns and the decoder inverts one bit, so the decoded word may contain one more error than the received word.
We may define the parity bits in many ways; Equations 2.3.3 are not the only choice. We may also permute the bit positions of the codewords (all codewords the same way); the code would no longer be systematic, but its performance would remain the same. All of these variations are called the (7, 4) Hamming code. Hamming codes are used in many applications, such as Bluetooth (a shortened (15, 10) code for the packet payload), due to their simple implementation.

2.4. Introduction to Linear Algebra


The theory of error control codes has been developed by mathematicians with the help of linear algebra. We can see an information word as well as a codeword as a sequence of field elements, numbers in a field. Error control coding works with these numbers. It has been found handy to use finite field algebra instead of, for example, real decimal numbers. One of its advantages is that any operation on two bit sequences of the same length (numbers) gives a word of the same length.
In linear algebra we work with a set of numbers (field elements) that we are able to add, subtract, multiply and divide. This set of numbers, together with the rules for these mathematical operations, is called a field. The most common example of a field is the field of real numbers. It is natural to us to write 5 + 5 = 10, but this follows, even though we need not think about it, from the definition of the field of real numbers. Note that this field is not a finite field: we can always, for example, add one to the largest number we have imagined, so the field has an infinite number of different elements (numbers).
2.4.1. Fields and Galois Fields
A field is a set of mathematical objects that can be added, subtracted, multiplied and divided. The real numbers form a familiar field containing an infinite number of elements (numbers). The field of complex numbers is an extension field of the field of real numbers, and it contains all real numbers as well [4, p17].
The following mathematical rules are valid for all fields [4, p 27]:
- Every field contains elements 0 and 1.
- Adding or subtracting two field elements or numbers gives a number in the field.
- Multiplying or dividing (division by 0 is not allowed) one number by another gives a number in the field.
- The distributive law, (a + b)c = ac + bc, holds for all field elements.
In the study of error control coding we need special fields containing a finite number of elements. Such a field is called a finite field or Galois field GF(q), and it contains q elements. The result of any algebraic operation is then always one element of the given finite set of elements. The result of a multiplication, for example, does not grow without limit as in ordinary algebra of real and complex numbers.
The mathematical characteristics of Galois fields contain all the rules above with some additions [4, p 27]:
- A Galois field GF(q) contains a finite number q of elements.
- Every Galois field contains elements 0 and 1.
- Adding or subtracting two numbers gives a number in the finite set of elements of the Galois field.
- Multiplying or dividing (division by 0 is not allowed) one number by another gives a number in the finite set of elements.
- The distributive law, (a + b)c = ac + bc, holds for all field elements.
Galois field GF(2)
Let us consider the smallest Galois field GF(2), which contains only two numbers or elements, 0 and 1. This is a finite field, which means that an operation on numbers in the set gives a number in the set, and now 0 and 1 are the only numbers we have available. For example, we must define that 1 + 1 = 0, not 1 + 1 = 2 as in ordinary algebra, which would correspond to the binary representation 10.
The calculation rules for GF(2) = {0, 1} are shown in Table 2.4.1. The reader may check that if we use the rules in Table 2.4.1 for addition and multiplication of the numbers 0 and 1, all the rules of Galois fields above are fulfilled.
Table 2.4.1 Addition and multiplication tables for GF(2) = {0, 1}.

    +   0  1        ·   0  1
    0   0  1        0   0  0
    1   1  0        1   0  1

Additive and multiplicative inverses

Note that we may also subtract and divide in the finite field using the addition and multiplication tables in Table 2.4.1 [8, p 403]. For subtraction we take the additive inverse of each element a, written −a. To find −a, the additive inverse of a, we write a + b = c, which gives b = c − a = c + (−a). Now we look up in the table which element we should have in place of −a so that the equation is valid; that is the additive inverse of a. In GF(2) the additive inverse is 1 for 1 and 0 for 0. The reader may check this by giving b the values 0 and 1 in turn and checking that the equations are valid for all possible values of b and c. In the same way we can divide by multiplying by the multiplicative inverse, b⁻¹ for b. The multiplicative inverse we find by writing a/b = c, a·b⁻¹ = c, a = b·c; we notice that the inverse of 1 is 1. Zero has no multiplicative inverse, because division by zero is not allowed.
The addition rule in Table 2.4.1 is also known as mod-2 addition, where 1 + 1 = 0. The other rules in Table 2.4.1 are self-evident. The operations are identical to the "exclusive-or" and "and" operations, but we call them addition and multiplication so that we can later use matrix formalism.
We will concentrate on binary codes where the elements of information words and codewords are from GF(2), i.e., "1" or "0". However, for the evaluation of non-binary codes, where code symbols have more values than just 0 and 1, we need more field elements. For these codes we construct a Galois field where each element represents a set of bits. This field is GF(2ᵐ),
an extension field of GF(2), where each element represents m bits. There are 2ᵐ different elements (numbers) in this extension field of GF(2).
Prime number fields GF(q)
In general we can define a Galois field GF(q) for any value of q that is a prime number (not divisible by any smaller number except 1). This field has the elements from zero to q − 1. For example, GF(3) = {0, 1, 2} is defined according to Table 2.4.2 and GF(4) = {0, 1, 2, 3} is given in Table 2.4.3. Note that GF(3) uses mod-3 addition and multiplication, for example 2 + 1 = 0 and 2·2 = 1. Mod-q addition and multiplication mean that the result we get with ordinary addition or multiplication is divided by q and the remainder is taken as the final result.
Table 2.4.2 Addition and multiplication tables for GF(3) = {0, 1, 2}

    +   0  1  2        ·   0  1  2
    0   0  1  2        0   0  0  0
    1   1  2  0        1   0  1  2
    2   2  0  1        2   0  2  1

Examples of mod-2 addition and mod-3 multiplication: in GF(2), 1 + 1 = 2, the quotient of the division 2/2 is 1 and the remainder is 0; this is the result of 1 + 1 in GF(2), see Table 2.4.1. In the case of GF(3), 2·2 = 4 and 4/3 gives the remainder 1; this is the result of 2·2 in GF(3), see Table 2.4.2.
Extension fields GF(qᵐ)
Finite or Galois fields exist for all prime numbers q, but also for all qᵐ where q is a prime number and m is an integer. We write an extension field of GF(2) as GF(2ᵐ). These extension fields we use to handle non-binary codes, where the (non-binary) code symbols are expressed as m-bit bytes. For example GF(2⁸) or GF(256) consists of 256 different (eight-bit) bytes, and GF(2⁴) or GF(16) contains 16 different (hexadecimal) symbols, four-bit bytes. For each extension field GF(2ᵐ) we need to define addition and multiplication tables that follow the Galois field characteristics above [6, p213].
Extension fields such as GF(2²) = GF(4) do not follow mod-4 arithmetic, see the addition and multiplication rules for GF(4) in Table 2.4.3. We will introduce a procedure for constructing the addition and multiplication tables of extension fields of GF(2), such as GF(4), later in this chapter.
Table 2.4.3 Addition and multiplication tables for GF(4) = {0, 1, 2, 3}.

    +   0  1  2  3        ·   0  1  2  3
    0   0  1  2  3        0   0  0  0  0
    1   1  0  3  2        1   0  1  2  3
    2   2  3  0  1        2   0  2  3  1
    3   3  2  1  0        3   0  3  1  2

Addition in an extension field is defined as bit-by-bit mod-2 addition according to GF(2): we just mod-2 add (exclusive-or) the m bits, bit by bit. For example, if two elements of GF(4) = GF(2²) are 11 (3) and 10 (2), addition gives
    11 + 10 = 01 (1)

As another example we take two elements, 1101 (13) and 1110 (14), from GF(2⁴) = GF(16), and addition gives

    1101 + 1110 = 0011 (3)
The multiplication in GF(2ᵐ) must be consistent with addition in the sense that the distributive law

    (a + b)c = ac + bc

holds for any a, b and c in GF(2ᵐ), where addition is bit-by-bit exclusive-or. This suggests that multiplication should have the structure of shift and exclusive-or rather than the conventional structure of shift and add. For example, we may try to define multiplication for GF(2⁴) in accordance with the following product of the four-bit numbers 1110 and 1101:

       1110
     × 1101
     ------
       1110
      0000
     1110
    1110
    -------
    1000110
The result is seven bits long and is not found in GF(2⁴). Our system should be a finite field under multiplication, so the product of two elements of GF(2ᵐ) should be another element of GF(2ᵐ) and the word length should not increase: the product of any two m-bit numbers from GF(2ᵐ) must produce an m-bit number in GF(2ᵐ). We therefore need a method to fold the overflow bits back in such a way that division remains meaningful and the resulting word length stays the same (the result of the example above should be one of the sixteen possible four-bit sequences).
To get multiplication rules over GF(2ᵐ) we express binary numbers as polynomials, multiply them, divide the result by a special polynomial called the primitive polynomial, and take the remainder as the result, see Subsection 2.4.4. First, however, we have to get familiar with the polynomial expression of binary numbers. We will use polynomials also to define cyclic codes and to implement encoders and decoders.
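The shift and exclusive-or multiplication above is straightforward to code. A Python sketch with binary words represented as integers, bit i being the coefficient of xⁱ (gf2_poly_mul is an illustrative name):

    def gf2_poly_mul(a, b):
        # Shift-and-XOR: add (XOR) a shifted copy of a for every 1-bit of b.
        result = 0
        while b:
            if b & 1:
                result ^= a
            a <<= 1
            b >>= 1
        return result

    print(bin(gf2_poly_mul(0b1110, 0b1101)))   # 0b1000110, as computed above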
2.4.2. Polynomials
As we know, the numbers we use are actually polynomials where the place of a digit tells the power of the base number; each digit is the coefficient of the corresponding power of the base. For example, the decimal number

    1234 = 1·10³ + 2·10² + 3·10¹ + 4·10⁰

where the powers of 10 define the place of each coefficient. Generally we can write a polynomial representation for any number system by using x as the base number and placing a coefficient in front of each term (power of x). In the binary system we have only two possible coefficients, 0 and 1, and we may express binary numbers as polynomials of x where each coefficient gets the value 1 or 0. For example,

    1101 ↔ p(x) = 1·x³ + 1·x² + 0·x¹ + 1·x⁰ = x³ + x² + 1
    1010 ↔ p(x) = x³ + x¹ = x³ + x

If we use binary numbers to represent decimal numbers (as we use them in computers), we just put 2 in the place of x and get the corresponding decimal number. Here we use the polynomial expression just to tell which places in the binary word are non-zero; x is just a dummy parameter and has no importance. In the addition of polynomials we use modulo-2 addition (GF(2) arithmetic) for the coefficients. Multiplication we carry out in the ordinary way, see the example below.
Example 2.4.1
Multiplication of the two binary polynomials (x³ + x² + x) and (x³ + x² + 1) gives:

    (x³ + x² + x)(x³ + x² + 1) = x⁶ + x⁵ + x³ + x⁵ + x⁴ + x² + x⁴ + x³ + x
                               = x⁶ + (1+1)x⁵ + (1+1)x⁴ + (1+1)x³ + x² + x
                               = x⁶ + x² + x

Compare this result, 1000110, with our earlier example where we multiplied the binary words 1110 and 1101. The polynomial expression gives the same result as the shift and exclusive-or multiplication in the previous section.
We saw that polynomials are handy for the multiplication of long binary numbers, and we will use this concept of binary polynomials later when we analyze and design cyclic block codes in Chapter 3.
2.4.3. Vector Spaces
We may represent codewords as vectors. Codewords are sets of n elements, n-tuples, from a field [8, p404]; in the case of binary codes we take these elements from GF(2).
Figure 2.4.1 presents 2- and 3-tuples over GF(2). Each bit in the binary codeword represents a scalar value in one dimension (coordinate) of a vector. The figure presents all possible vectors in two-dimensional space (four different vectors) and in three-dimensional space (eight vectors).

[Figure: the two-dimensional vector space with coordinates x1, x2 contains the four vectors [0,0], [0,1], [1,0] and [1,1] (including the zero vector); the three-dimensional space with coordinates x1, x2, x3 contains the eight vectors [0,0,0], [0,0,1], [0,1,0], [1,0,0], [0,1,1], [1,0,1], [1,1,0] and [1,1,1].]

Figure 2.4.1 Two and three bit codewords, vectors, in vector space over GF(2).

It is not easy to imagine what spaces with more than three dimensions look like. However, we can extend this expression to as many dimensions as we like by making the binary words longer. The addition and subtraction of vectors are performed element by element over GF(2), and the result of the operation is another vector in the same vector space [8, p 406].
For example, [0, 1] + [1, 0] = [1, 1], where [1, 1] is also a vector in the two-dimensional space.
We have defined vector spaces to help our study of error control codes. The elements or scalars of the vectors represent code symbols, which for binary codes are 0 or 1 from GF(2). Codewords are n-tuples of elements, that is, n-bit sequences of bits in the case of a binary code (GF(2)) [8, p 405].
Vectors v1, ..., vk are said to be linearly independent if there is no set of scalars ai (excluding the case where all scalars are equal to 0) such that

    a1v1 + a2v2 + ... + akvk = 0        2.4.1

For example, the vectors [0,0,1], [0,1,0] and [1,0,0] are linearly independent, but the vectors [1,1,1], [0,1,1] and [1,0,0] are linearly dependent. We notice immediately that a set of linearly independent vectors must not contain the zero vector and the vectors must all be different.
To construct an n-dimensional vector space we need n linearly independent basis vectors; any vector in the space is then a sum of a set of basis vectors. The number of basis vectors equals the number of dimensions of the space, and the basis vectors have to be linearly independent. The unit vectors in Figure 2.4.1 are not the only possible basis vectors: we may select them as we like, as long as we take as many linearly independent basis vectors as there are dimensions. Then all vectors in the space can be written as linear combinations of the basis vectors. For example, for a three-dimensional space we need three basis vectors, and we could define them to be, for example, [1,0,0], [0,1,0] and [0,1,1] instead of the unit vectors [1,0,0], [0,1,0] and [0,0,1].
2.4.4. Multiplication in Extension Field GF(2m)
If our code symbols or information symbols are not binary, we represent them as a set of bits, an m-tuple, i.e., a vector. The number m of these bits corresponds to the number of dimensions of the vector space. The addition and subtraction of vectors are performed element by element (bit-by-bit) over GF(2) and the result is another vector in the same vector space. For example, in GF(2⁴) = GF(16) = {0000, 0001, 0010, ..., 1111} we may add [1010] + [0110] = [1100] [6, p215].
Let us use the 2-dimensional space as an example, GF(2ᵐ) = GF(2²), where we have the 2-tuples or vectors [0,0], [0,1], [1,0] and [1,1] (m = 2). We can express these vectors as the polynomials 0, 1, x, x + 1, respectively. We notice that all polynomials in the set have degree m − 1 or less. The addition and subtraction of these polynomials are performed on their coefficients over GF(2), and they result in another polynomial in the set. For example (x + 1) + 1 = x + (1 + 1) = x.
The multiplication and division operations of the vectors are not obvious [8, p406]. To multiply elements of GF(2ᵐ), we express them as polynomials, multiply them and divide the result by a primitive polynomial of degree m. Primitive polynomials are a subset of the irreducible (prime) polynomials over GF(2) (binary coefficients) of degree m. Irreducible means that we are not able to divide the polynomial into a product of two or more lower-degree polynomials. Table 2.4.6 gives primitive polynomials up to degree 28.
Let us now construct the arithmetic tables of GF(4) to clarify this process. The addition table is straightforward, but for the multiplication table we need the primitive polynomial of degree m = 2,
because we are working with the field GF(2ᵐ) = GF(2²). For m = 2 we take the primitive polynomial from Table 2.4.6:

    p(x) = x² + x + 1        2.4.2

We can check that this polynomial is irreducible by dividing it by the lower-degree polynomials x and x + 1: the remainder of each division is non-zero, so polynomial 2.4.2 is not a product of any combination of lower-degree polynomials and is thus irreducible. Now we write all possible two-bit words as polynomials, multiply them pairwise and divide each result by p(x), which gives the multiplication rules in Table 2.4.4. In the table we have used the polynomial expressions of the two-bit binary words:

    00 ↔ 0        10 ↔ x
    01 ↔ 1        11 ↔ x + 1
Table 2.4.4 Addition and multiplication tables for GF(4) [8, p 407].

    +      0    1    x    x+1        ·      0    1    x    x+1
    0      0    1    x    x+1        0      0    0    0    0
    1      1    0    x+1  x          1      0    1    x    x+1
    x      x    x+1  0    1          x      0    x    x+1  1
    x+1    x+1  x    1    0          x+1    0    x+1  1    x

We have created the addition table by mod-2 adding the polynomial coefficients. For the multiplication table we have divided the product of two polynomials by p(x) in Equation 2.4.2 and taken the remainder as the entry (mod p(x)). Note that Table 2.4.4 is equivalent to Table 2.4.3; Table 2.4.3 just presents the field elements as decimal numbers, and we get that expression by replacing x by 2 in Table 2.4.4.
We can check that each element in Table 2.4.4 has an additive and a multiplicative inverse. The additive inverse of an element a is −a, satisfying a + (−a) = 0, and the multiplicative inverse of a is a⁻¹, satisfying a·a⁻¹ = 1. For example the additive inverse of x+1 is x+1 itself, because (x+1) + (x+1) = 0, and the multiplicative inverse of x is x+1, because x(x+1) = 1 (the remainder of x² + x divided by x² + x + 1).
In general a finite field GF(qᵐ) exists for any number qᵐ where q is a prime and m is a positive integer. The relationship between the prime field GF(q) and the extension field GF(qᵐ) is that the prime field is a subfield of GF(qᵐ): the elements of GF(q) are a subset of the elements of GF(qᵐ). If we look at the extension field in Table 2.4.4, we see that its prime field in Table 2.4.1 is included in the "extended" Table 2.4.4. This is why we say that GF(qᵐ) is an extension field of GF(q) [8, p 407].
Example 2.4.2
As an example of Galois fields, we will construct GF(2⁴) = GF(16) [6, p215]. The elements are the set of sixteen different four-bit bytes,

    GF(2⁴) = {0000, 0001, 0010, ..., 1111}

and the addition of two elements is bit-by-bit modulo-2 addition. To define multiplication we use the primitive polynomial of degree m = 4, p(x) = x⁴ + x + 1, from Table 2.4.6. To multiply, for example, 0101 by 1011 we write these as polynomials
    a(x) = x² + 1    and    b(x) = x³ + x + 1

and we get

    c(x) = a(x)b(x) = (x² + 1)(x³ + x + 1) = x⁵ + x² + x + 1

Now we perform the mod p(x) operation, i.e., divide by p(x) = x⁴ + x + 1 and take the remainder as the result:

    d(x) = 0·x³ + 0·x² + 0·x + 1 = 1

which is the polynomial representation of the four-bit byte [0001]. We have now got the product [0101]·[1011] = [0001] in GF(16), i.e., defined one entry in the multiplication table of GF(16).
Division is also always possible by every element a of the field except 0. We divide by multiplying with the inverse element a⁻¹; the product of an element and its inverse element gives 1, i.e.,

    a·a⁻¹ = 1

We can see from the example above that the inverse element of 0101 is 1011 and vice versa. Inverse elements are easily found from the multiplication table by looking for the element that gives 1 as the result of the multiplication.
Calculation is sometimes easier if we express the field elements as powers of a primitive element. This expression is explained below.
Primitive elements and polynomials
In addition to binary and polynomial notation we can use exponential notation for extension field elements. Every Galois field has at least one primitive element, denoted α, whose powers represent every field element except zero [8, p408]. Consider the example of GF(4), where m = 2 and the prime field is GF(2) (GF(2ᵐ) = GF(4)). We try α = x, α² = x + 1 and α³ = 1, where the results are modulo-p(x) values with p(x) = x² + x + 1, i.e., remainders of the results divided by the primitive polynomial. Now we can write the binary 2-tuples of GF(4) in three different notations, as shown in Table 2.4.5.
Table 2.4.5 Representations for the field elements of GF(4).

    Exponential    Polynomial    Binary
    notation       notation      notation
    0              0             00
    α⁰             1             01
    α¹             x             10
    α²             x + 1         11

In the exponential notation the field elements are represented by successive powers of the primitive element. The advantage of this notation is that the multiplication of two elements is equivalent to the addition of their exponents. For example, the multiplication of x and (x+1) gives (x² + x), which taken modulo p(x) = x² + x + 1 gives 1. Equivalently, the multiplication of these two elements in exponential notation, α¹·α², results in α³ = α⁰, namely 1. The reader may check this result by dividing α³ = x³ by p(x) = x² + x + 1. Generally,
we simply subtract qᵐ − 1 = 3 from the exponent of α as many times as needed to get an element of the field.
The elements of the prime field GF(q) are the set of integers {0, ..., q−1}, that is, GF(2) = {0, 1}. In the extension field GF(qᵐ) we may represent the polynomial elements as successive powers of the primitive element, and we can multiply two polynomials by adding the exponents of the exponential notation. To construct a field we use a special prime (irreducible) polynomial, called a primitive polynomial, for which x is a primitive element of the field, i.e., α = x. We use these primitive polynomials to construct extension fields GF(2ᵐ) where all elements of the field can be expressed as powers of the primitive element. Table 2.4.5 gave an example where m = 2. These fields we need when considering non-binary codes. The primitive polynomials over GF(2) are listed in Table 2.4.6. Note that only one primitive polynomial of each degree is listed and that not all prime polynomials are primitive. For example the polynomial x⁴ + x³ + x² + x + 1 is a prime polynomial but not a primitive polynomial, and it cannot be used for the construction of GF(2⁴).
Table 2.4.6 Primitive polynomials over GF(2) [4, p 79].
(a subset of the prime (irreducible) polynomials)

    Degree  Polynomial                   Degree  Polynomial
    2       x² + x + 1                   16      x¹⁶ + x¹² + x³ + x + 1
    3       x³ + x + 1                   17      x¹⁷ + x³ + 1
    4       x⁴ + x + 1                   18      x¹⁸ + x⁷ + 1
    5       x⁵ + x² + 1                  19      x¹⁹ + x⁵ + x² + x + 1
    6       x⁶ + x + 1                   20      x²⁰ + x³ + 1
    7       x⁷ + x³ + 1                  21      x²¹ + x² + 1
    8       x⁸ + x⁴ + x³ + x² + 1        22      x²² + x + 1
    9       x⁹ + x⁴ + 1                  23      x²³ + x⁵ + 1
    10      x¹⁰ + x³ + 1                 24      x²⁴ + x⁷ + x² + x + 1
    11      x¹¹ + x² + 1                 25      x²⁵ + x³ + 1
    12      x¹² + x⁶ + x⁴ + x + 1        26      x²⁶ + x⁶ + x² + x + 1
    13      x¹³ + x⁴ + x³ + x + 1        27      x²⁷ + x⁵ + x² + x + 1
    14      x¹⁴ + x⁴ + x³ + x + 1        28      x²⁸ + x³ + 1
    15      x¹⁵ + x + 1

We now summarize the construction of an extension field GF(qᵐ), where q is a prime and m is an integer. We first generate all field elements as powers of x modulo p(x), where p(x) is a degree-m primitive polynomial from Table 2.4.6. If the prime field is GF(2), the coefficients of the polynomials are binary. The addition of two polynomial elements is done by adding the coefficients of corresponding powers of x over GF(q). The multiplication of two elements is the addition of the powers in their corresponding exponential notation (or the polynomial product mod p(x)).
Example 2.4.3
Now we construct GF(16) = GF(2⁴), where q = 2 and m = 4. For this we take the primitive polynomial of degree m = 4 from Table 2.4.6, that is p(x) = x⁴ + x + 1. The primitive element α = x is raised to the powers {0, 1,
..., 14} to represent all fifteen non-zero field elements of GF(16), and their modulo-p(x) values are computed to give the polynomial representations shown in Table 2.4.7. For example

    α⁶ = x⁶ [mod p(x)] = x³ + x²

To add two elements we add the coefficients of the powers of x over GF(2) and then express the sum in its exponential notation according to Table 2.4.7. For example

    α⁶ + α⁷ = (x³ + x²) + (x³ + x + 1) = x² + x + 1 = α¹⁰

where we have taken the polynomials from Table 2.4.7, added them, and written the result again as a power of α. Multiplication of the two polynomials above gives

    α⁶·α⁷ = (x³ + x²)(x³ + x + 1) [mod p(x)] = x³ + x² + 1 = α¹³

An easier way to get this same result is to add the exponents (and, if necessary, subtract 15).
Table 2.4.7: Field elements of GF(16) generated by p(x) = x⁴ + x + 1 [8, p 410].

    Exponential      Polynomial            Binary
    notation         notation              notation
    0                0                     0000
    α⁰ = α¹⁵         1                     0001
    α¹               x                     0010
    α²               x²                    0100
    α³               x³                    1000
    α⁴               x + 1                 0011
    α⁵               x² + x                0110
    α⁶               x³ + x²               1100
    α⁷               x³ + x + 1            1011
    α⁸               x² + 1                0101
    α⁹               x³ + x                1010
    α¹⁰              x² + x + 1            0111
    α¹¹              x³ + x² + x           1110
    α¹²              x³ + x² + x + 1       1111
    α¹³              x³ + x² + 1           1101
    α¹⁴              x³ + 1                1001

Exponential notation is handy for multiplication. For example, α⁴·α⁵ = α⁹. The same result we get in polynomial notation: (x + 1)(x² + x) = x³ + x² + x² + x = x³ + x = α⁹.
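Table 2.4.7 can be generated mechanically: starting from α⁰ = 1, each multiplication by α = x is a left shift, and whenever the degree reaches m the primitive polynomial is exclusive-ored away. A Python sketch of this procedure:

    prim, m = 0b10011, 4       # p(x) = x^4 + x + 1
    element = 1                # alpha**0
    for exp in range(15):
        print(f"alpha^{exp:2} = {element:04b}")
        element <<= 1                # multiply by alpha (= x)
        if element & (1 << m):       # degree reached m: subtract p(x)
            element ^= prim

    # Prints alpha^0 = 0001, alpha^1 = 0010, ..., alpha^14 = 1001,
    # reproducing the binary notation column of Table 2.4.7.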
Minimal Polynomials
In this section we introduce minimal polynomials, which play a key role in the formation of the generator polynomials of BCH codes. The primitive polynomials that we used in constructing extension fields are a special case of minimal polynomials.
In ordinary arithmetic, a polynomial of degree n with real coefficients has exactly n roots, some of which may have the same value. As we know, a root of a polynomial f(x) is a value of x that makes f(x) = 0. We are used to the fact that the roots of many polynomials do not exist in the field of real numbers, and we then need the field of complex numbers, which is an extension field of the field of real numbers. Similarly, in finite field arithmetic, if a polynomial defined in the subfield is irreducible, it has no roots in the subfield, only in the extension field.
For example f(z) = z⁴ + z³ + z² + z + 1 is irreducible over GF(2) and has no roots in GF(2). We use here z as the parameter instead of x (to avoid confusion with the polynomials of x that represent binary numbers). We can easily check that no roots exist in GF(2) by giving z the values 0 and 1 and noticing that f(z) does not go to zero. Instead it has all four roots, α³, α⁶, α⁹ and α¹², in GF(2⁴), which is an extension field of GF(2). With the help of Table 2.4.7 of GF(2⁴) we can verify that for example z = α³ is a root:

    f(α³) = (α³)⁴ + (α³)³ + (α³)² + α³ + 1 = α¹² + α⁹ + α⁶ + α³ + 1
          = (x³ + x² + x + 1) + (x³ + x) + (x³ + x²) + x³ + 1 = 0

Hence α³ is a root of f(z). As f(z) has degree four with roots α³, α⁶, α⁹ and α¹², the product (z + α³)(z + α⁶)(z + α⁹)(z + α¹²) must be equal to f(z) = z⁴ + z³ + z² + z + 1. This comes from the fact that both formulas give 0 if z takes the value of a root; note that, for example, if z = α⁶, one term of the product becomes (z + α⁶) = (α⁶ + α⁶) = 0, which makes f(z) = 0. To check that the two formulas of f(z) are equal, we use Table 2.4.7 again and write

    f(z) = (z + α³)(z + α⁶)(z + α⁹)(z + α¹²)
         = (z² + (α⁶ + α³)z + α⁹)(z² + (α¹² + α⁹)z + α²¹)
         = (z² + α²z + α⁹)(z² + α⁸z + α⁶)
         = z⁴ + (α⁸ + α²)z³ + (α⁶ + α¹⁰ + α⁹)z² + (α⁸ + α¹⁷)z + α¹⁵
         = z⁴ + z³ + z² + z + 1
Let f(z) be a primitive (irreducible) polynomial of degree n over GF(2) and let it have a root α in the extension field, i.e., f(α) = 0. The important properties of these roots are [8, p413]:
- For any l ≥ 0, α^(2^l) is also a root of f(z), i.e., α, α², α⁴, α⁸, ... are roots. The element α^(2^l) is called a conjugate of α.
- As every element of the extension field GF(2ᵐ) is a root of the polynomial z^(2^m) + z, there is for each element a polynomial over GF(2), called the minimal polynomial of α. This polynomial is the smallest-degree monic polynomial (leading coefficient is 1) having α as a root [8, p 414].
These minimal polynomials play an important role when constructing BCH codes.
One question that may come to the reader's mind is: why do we have to use these Galois fields when working with coding theory, and why can we not manage with the ordinary rules of binary numbers? Let us take two binary words, [1010] and [0110], and add them. The result we need in coding theory is [1100], where we have added the words bit by bit according to the rules of GF(16).
In ordinary binary algebra, which we need to handle decimal mathematics in binary systems like computers, we express decimal numbers in binary form: decimal 10 as 1010 and decimal 6 as 0110, and their sum is 16, which in binary form is 10000. Note that the previous result, [1100], corresponding to the decimal number 12, is different. Adding or multiplying two numbers in GF(2ᵐ) keeps the length of the word unchanged, and this is one of the important features of Galois fields when we are dealing with error control codes.
Although the multiplication and division rules of a Galois field are unfamiliar to us, logic circuits or computer subroutines that implement them are straightforward. Most algebraic manipulations in a Galois field behave much like those in the ordinary field of real numbers; for example, we perform matrix operations as we are used to performing them.
Galois fields have also many other valuable properties that are needed in deeper studies of coding theory. Here we have introduced only some properties of Galois fields, but this is enough for our study of binary codes.
From now on we will deal with binary symbols from GF(2) or non-binary symbols that are represented as m-bit vectors from GF(2ᵐ). Other prime number fields, such as the GF(3) presented above, were shown just as examples of other possible Galois fields.

2.5. Structure of Linear Block Codes


The codewords of a linear block code are n-tuples of elements that in the binary case are from GF(2), i.e., bits with value 0 or 1. These codewords form a k-dimensional vector space spanned by k linearly independent basis vectors, each n bits long; each codeword is made up by adding a set of basis vectors together.
General properties of linear codes [4, p 46]:
- The sum of two codewords is also a codeword.
- The all-zero vector is one of the codewords (the origin of the vector space; it is the sum of any codeword with itself).
This structure of linear codes makes the design and the evaluation of performance easy, as we will see later.
2.5.1. Hamming Weight and Minimum Distance
The Hamming weight w(c) of a codeword c is equal to the number of non-zero places or components in the codeword [4, p 46]. We noticed in Section 2.2 that the minimum distance dmin is an important measure when we study the error control capability of a code: it defines the minimum number of errors that can change a codeword into another codeword of the code, in which case neither error correction nor detection works. Let c be a codeword of a linear code; naturally c − c (or c + c) is the all-zero codeword. By evaluating all non-zero codewords we get the Hamming weight w(c) of every codeword, equal to the number of its non-zero places. Let us take two binary codewords x and z. The Hamming distance between these two codewords is [1, p 481]

    d(x, z) = w(x + z)

For example, if x = [1010] and z = [1100], then x + z = [0110], where we have mod-2 added the words bit by bit. For linear codes y = x + z is also a codeword, so the distance between x and z equals the weight of this third codeword y:

    d(x, z) = w(x + z) = w(y)

Thus when we calculate the distances between all pairs of codewords, we actually evaluate the weights of the codewords that are the sums of the pairs under study. When the codeword z is the all-zero vector, x + z = x and the evaluation gives the weights of all non-zero codewords. In a linear code every distance between two codewords equals the weight of some codeword of the code, so the weights of the codewords give all Hamming distances of the code.
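The relation d(x, z) = w(x + z) is a one-liner to verify in Python (weight and distance are illustrative names):

    def weight(c):
        return sum(c)                               # number of non-zero places

    def distance(x, z):
        return sum(a ^ b for a, b in zip(x, z))     # weight of the mod-2 sum

    x, z = [1, 0, 1, 0], [1, 1, 0, 0]
    print(distance(x, z))                           # 2
    print(weight([a ^ b for a, b in zip(x, z)]))    # 2, the weight of [0110]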

We can now state that the minimum distance of a linear code equals the minimum Hamming weight of the code:

    dmin = min w(c) = w(c)min        2.5.1

where c is any codeword except the all-zero word, and we have taken the smallest weight over all non-zero codewords as the minimum distance. The study of the distance characteristics of a linear code is much easier than in the case of non-linear codes, where we could not make the assumption above. Now we can say that a linear code that can correct t errors and detect l errors must have a minimum weight satisfying [4, p 47]

    dmin = w(c)min ≥ 2t + 1
    dmin = w(c)min ≥ l + 1        2.5.2
2.5.2. Matrix Representation of Linear Block Codes


Generally linear block codes are defined by the generator matrix G of the code [3, p417]:

        | g11  g12  ...  g1n |
    G = | g21  g22  ...  g2n |        2.5.3
        | ...  ...       ... |
        | gk1  gk2  ...  gkn |

where each row is a vector gi, for example g1 = [g11 g12 ... g1n]. The matrix has k rows, equal to the number of information bits encoded into each codeword, and the number of columns n equals the length of the codewords.
Encoding
An information vector is a k-bit vector written as

    i = [i1 i2 ... ik]

The corresponding n-bit codeword, c = [c1 c2 ... cn], is given by

    c = i G        2.5.4

Each bit of the codeword is generated by the information vector and one column of the generator matrix:

    c = [i1g11 + i2g21 + ... + ikgk1,  i1g12 + ... + ikgk2,  ...,  i1g1n + ... + ikgkn]

For a binary code the information words as well as the codewords are binary, i.e., sequences of binary symbols, bits. Any codeword is a linear combination of the row vectors of G, because the information bits with value 1 define which rows are added together bit by bit to make up the codeword. A general requirement for the generator matrix is that its rows have to be linearly independent, that is, no set of rows may sum to the all-zero word (equivalently, no row may equal the sum of a set of other rows). Otherwise different information words would produce equal codewords and the code would not work.
Each row vector can be seen as the basis vector of one dimension of a k-dimensional vector space. The number of dimensions is the number of information bits, which is
equal to the number of rows in the generator matrix. Codewords are vectors in this space, created by adding basis vectors bit by bit.
Two codes are equivalent if and only if their generator matrices are related [4, p 49] by
1. column permutations and/or
2. elementary row operations.
Equivalent codes have similar performance, but their sets of codewords may be different. In Problem 2.5.2 we illustrate that the set of codewords always equals the permuted set of codewords of an equivalent code.
The code is not changed (the codes are equal, i.e., the set of codewords remains the same) under elementary row operations. The elementary row operations on a matrix are [4, p37]:
1. interchange of any two rows;
2. multiplication of any row by a non-zero field element (only 1 in the binary case);
3. replacement of any row by the sum of itself and (a multiple of) any other row.
Elementary row operations change the mapping of information words to codewords, but the performance of the code remains the same because the set of codewords is not changed.
Any generator matrix of an (n, k) linear code can be reduced by row operations and column permutations to the systematic form [4, p 49] [3, p 417]:

                  | 1 0 ... 0  p11  p12  ...  p1,n-k |
    G = [Ik  P] = | 0 1 ... 0  p21  p22  ...  p2,n-k |        2.5.5
                  | ...                              |
                  | 0 0 ... 1  pk1  pk2  ...  pk,n-k |

where Ik is the k × k identity matrix and P is a k × (n−k) matrix that determines the n−k redundant bits or parity check bits. Every linear code has an equivalent systematic code [4, p 50]. A generator matrix in systematic form generates a systematic linear block code in which the first k bits of each codeword are identical to the information bits and the remaining n−k bits of each codeword are linear combinations of the k information bits. Systematic codes make the implementation of long codes efficient, as we will see later.
Example 2.5.1
Let us take a generator matrix of a simple systematic binary linear code [4, p 47]:

        | 1 0 0 1 0 |
    G = | 0 1 0 0 1 |
        | 0 0 1 1 1 |

If the information vector is i = [0 1 1], the encoded codeword becomes
                            | 1 0 0 1 0 |
    c = i G = [0 1 1] ·     | 0 1 0 0 1 |  =  [0 1 1 1 0]
                            | 0 0 1 1 1 |
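The rule c = iG over GF(2) amounts to exclusive-oring together the rows of G selected by the 1-bits of i. A Python sketch reproducing this example (encode is an illustrative name):

    def encode(i, G):
        # c = i G over GF(2): XOR the rows of G where the information bit is 1.
        c = [0] * len(G[0])
        for bit, row in zip(i, G):
            if bit:
                c = [a ^ b for a, b in zip(c, row)]
        return c

    G = [[1, 0, 0, 1, 0],
         [0, 1, 0, 0, 1],
         [0, 0, 1, 1, 1]]
    print(encode([0, 1, 1], G))   # [0, 1, 1, 1, 0]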

Parity Check Matrix and Decoding


The decoder in the receiver checks whether the received codeword is one of the error-free original codewords or whether it is in error. For this we need the parity check matrix H, which gives for every error-free codeword

    c Hᵀ = 0        2.5.6

The decoder just computes according to Equation 2.5.6, and if the result is the 0-vector, most probably no errors have occurred. Naturally H must be compatible with the G that is used for the generation of the codewords.
To derive the parity check matrix we start with the generator matrix in systematic form. The encoder calculates the codewords as

    c = i G = i [Ik  P] = [i  iP] = [ck  cn−k]

where [ck cn−k] represents a codeword divided into two parts: the first k bits of the codeword are identical to the information bits and the second part is the parity check section of n−k bits. From the equation above we see (note that i = ck for a systematic code) that

    cn−k = i P = ck P

where ck represents the first k bits of the codeword. Now we may write

    −ck P + cn−k = 0

which we may write in matrix form as

                 | −P   |
    [ck  cn−k] · |      | = c Hᵀ = 0        2.5.7
                 | In−k |

Comparing Equations 2.5.6 and 2.5.7 we notice that we have got the transpose of the parity check matrix:

          | −P   |
    Hᵀ =  |      |        2.5.8
          | In−k |

Hᵀ has n rows and n−k columns. The parity check matrix itself we get in the form

    H = [−Pᵀ  In−k]        2.5.9

H has n−k rows and n columns. Note that −Pᵀ equals Pᵀ if we are dealing with binary codes. Because c Hᵀ = 0 must hold for all codewords, including each row of G, we get [4, p48]

                      | −P   |
    G Hᵀ = [Ik  P] ·  |      | = −P + P = 0
                      | In−k |

where 0 is the k × (n−k) zero matrix with all elements equal to zero. The parity check matrix we have found is valid because it fulfils the requirement in Equation 2.5.6.

Example 2.5.2
The generator matrix of the (5, 3) systematic linear block code in Example 2.5.1 was:

        | 1 0 0 1 0 |
    G = | 0 1 0 0 1 | = [I  P]
        | 0 0 1 1 1 |

According to Equation 2.5.9 the corresponding parity check matrix is:

    H = [−Pᵀ  In−k] = | 1 0 1 1 0 |
                      | 0 1 1 0 1 |

This has n−k = 2 rows and n = 5 columns.

To check a received codeword, for example c = [0 1 1 1 0], which corresponds to the information vector i = [0 1 1], the decoder calculates

                         | 1 0 |
                         | 0 1 |
    c Hᵀ = [0 1 1 1 0] · | 1 1 | = [0 0] = 0
                         | 1 0 |
                         | 0 1 |

The codeword is detected to be error free. The reader may check as an exercise that G Hᵀ gives the zero matrix.
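The check c Hᵀ is equally short in Python (check is an illustrative name):

    def check(c, H):
        # c H^T over GF(2): one modulo-2 dot product per row of H.
        return [sum(ci * hi for ci, hi in zip(c, row)) % 2 for row in H]

    H = [[1, 0, 1, 1, 0],
         [0, 1, 1, 0, 1]]
    print(check([0, 1, 1, 1, 0], H))   # [0, 0]: accepted as error free
    print(check([1, 1, 1, 1, 0], H))   # [1, 0]: an error is detected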
Note that just as there is more than one choice for G, there is also more than one choice for H [4, p 48]. We may change the generator matrix into a different form by elementary row operations, but the code itself (the set of codewords) remains the same and the same H can be used for all of these code variations.
In this section we have looked at the decoding process when the received codeword is error free. In Section 2.7 we will evaluate how a codeword in error is corrected.
2.5.3. Parity Check Matrix and Error Correction Capability of Linear Block Codes
We can evaluate the error correction capability of a linear block code by looking at its parity check matrix. Generally we can say that the minimum Hamming weight (and distance) of a code is at least w if and only if every set of w−1 columns of H (or rows of Hᵀ) is linearly independent [4, p 48]. The minimum Hamming distance of a linear code equals its minimum Hamming weight, and if the minimum Hamming weight is w, the code can correct t ≤ (w−1)/2 errors and detect up to w−1 errors according to Equation 2.5.2.
For example, if every set of three columns (w−1 = 3) of H (equivalently, three rows of Hᵀ) is linearly independent, one error or a set of two or three errors in the received codeword can never produce the all-zero vector in decoding, and the minimum weight and distance is then w = 4. The sum of three rows of Hᵀ, which three errors add together, always gives a non-zero result, but the sum of
four rows may give zero and four errors may not be detected. This we will further clarify later
in Section 2.7 where we take a look at the syndrome decoding.
Hence, in order to define an (n, k) code that can correct t errors, we have to define an (n-k) by n matrix H in which every set of 2t columns is linearly independent [4, p 49]. Then 2t = w-1 and w
= 2t+1. In the same way, if we want to detect l errors, every set of l = w-1 columns must be linearly independent. In Example 2.5.2 we had a code whose parity check matrix has equal
columns (w-1=1), so w=2, and hence it is not able to correct even a single error. However, a
single error is always detected because multiplication with a word in error always gives a non-zero result.

2.6. Hamming Codes


We introduced Hamming codes in Subsection 2.3.4. Now we describe them with the help of
the matrix formalism discussed in Subsection 2.5.2.
Hamming codes are able to correct one error in a codeword, and this means that the minimum distance (or minimum weight) of the code must be at least 3 according to Equation 2.2.3. This
requires that at least every set of two columns (w-1=2) of H be linearly independent, as
explained above. This simply means that adding any two columns together should not give an all-zero result. Then all columns are different and the minimum distance is 3 (w=3). Linear independence of a set of two vectors means that their sum is not equal to zero, i.e., they are different; see
Equation 2.4.1.
If a parity check matrix for a binary code has m rows, then each column is an m-bit binary number. The parity check matrix has as many rows as the code has parity check bits, that is, m = n-k.
There are 2^m different m-bit binary numbers and thus not more than 2^m - 1 different non-zero
columns (the number of columns of H equals the block length n). Hence each m defines an (n, k) =
(2^m - 1, 2^m - 1 - m) Hamming code [4, p 54] [3, p 421]. This comes from the fact that the length of
the codeword n is equal to the number of columns in the parity check matrix, 2^m - 1, and the number of information bits k equals the number of code bits n minus the number of rows in the parity
check matrix. As mentioned earlier, the number of rows in the parity check matrix equals the
number of redundant bits of the code, i.e., m = n-k.
Example 2.6.1
The simplest nontrivial example of a Hamming code we get when m = 3 = n-k, which leads
to an m × (2^m - 1) parity check matrix. The parity check matrix has 3 rows, each column
vector is a 3-bit (m=3) vector, and there are 2^m - 1 different columns when the all-zero vector is excluded. The parity check matrix in systematic form can be for example [4, p 54]:

        1 1 0 1 1 0 0
    H = 1 0 1 1 0 1 0  = [-P^T  I]     (m = n-k = 3 rows)
        0 1 1 1 0 0 1

and the corresponding generator matrix becomes

        1 0 0 0 1 1 0
    G = 0 1 0 0 1 0 1  = [ I P ]
        0 0 1 0 0 1 1
        0 0 0 1 1 1 1
This (2^m - 1, 2^m - 1 - m) = (7, 4) code is a Hamming code. Clearly, every pair of columns
of H is linearly independent because no pair of different binary column vectors can sum
to zero (i.e., they are different and w-1=2). Some of the sets of three columns are
dependent (for example columns 3, 4 and 5 of H), so the minimum distance is
w = 2+1 = 3 and the code is always able to correct a single error and detect double errors.
Note that all rows of the generator matrix have at least 3 non-zero elements in order to
make the minimum weight of the code equal to 3.
As a conclusion, each binary Hamming code has the structure (n, k) = (2^m - 1, 2^m - 1 - m), where
m = n-k is the number of rows in the parity check matrix. The columns of the parity check matrix are all different and they represent all non-zero m-bit binary words.
2.6.1. Design of a Hamming code
Now we are able to design any Hamming code we like. First we define the number of parity
check bits (n-k), which equals the number of rows m in the parity check matrix. That gives the block
length 2^m - 1, which is the number of columns of H. Then we write the parity check matrix
whose columns contain all different m-bit binary numbers except the all-zero column. This makes
each pair of columns linearly independent. We write the m columns that form an identity matrix
on the right hand side, and the rest of the words we write on the left hand side
in any order we like. From the parity check matrix we can construct the generator matrix for
the encoder as shown in Subsection 2.5.2:

    H = [-P^T  I_(n-k)]        G = [ I_k  P ]
Note that Hamming codes exist only with certain block lengths because they exist only
for integer values of m, i.e., m = 3 gives the block length 2^m - 1 = 7, m = 4 gives the block
length 2^m - 1 = 15, etc. Note also that the longer the block length we use, the higher the code rate we
get!
There exist also non-binary Hamming codes where the elements of the matrixes get more values than
just 0 or 1 from GF(2). Some examples of these are presented in [4, p 55, 17].
Another way to design a binary Hamming code is to write the parity check part of the generator
matrix in systematic form so that all rows are different and they all contain two or more
ones [1, p 482]. From this we could then construct the parity check matrix.
2.6.2. Implementation of an Encoder for a Hamming Code
Check bits of the Hamming code are generated from the information bits according to the
submatrix P of the generator matrix [1, p 483], see the generator matrix in Example 2.6.1.

Each binary 1 in the three last columns picks one of the information bits, and one check bit
is generated for each column. In the case of Example 2.6.1 the check bits are
p1 = i1 + i2 + i4
p2 = i1 + i3 + i4
p3 = i2 + i3 + i4
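In software each check bit is one mod-2 sum. A minimal sketch of the equations above (our own illustration for the code of Example 2.6.1; the function name is arbitrary):

```python
def encode_74(i1, i2, i3, i4):
    """Systematic (7, 4) Hamming encoder of Example 2.6.1 (bits are 0/1)."""
    p1 = (i1 + i2 + i4) % 2      # mod-2 sums given by the columns of P
    p2 = (i1 + i3 + i4) % 2
    p3 = (i2 + i3 + i4) % 2
    return [i1, i2, i3, i4, p1, p2, p3]

print(encode_74(1, 0, 1, 1))     # -> [1, 0, 1, 1, 0, 1, 0]
```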
Figure 2.6.1 presents an encoder that computes the check bits for the systematic Hamming code in
Example 2.6.1. Each block of message bits going to the transmitter is loaded into a message
register. The cells of the message register are connected to exclusive-OR gates whose outputs give the check bits. An input buffer holds the next block of message bits while the check bits
of the previous code block are shifted out. The cycle then repeats with the next block of message
bits [1, p 483].
[Figure 2.6.1: an input buffer feeds the message register (i4 i3 i2 i1); mod-2 adders form the
parity check bits p3, p2, p1 from the register cells; the bits go to the transmitter in the
order p3 p2 p1 i4 i3 i2 i1, with i1 first and the check bits last.]
Figure 2.6.1 Encoder for the (7, 4) Hamming code in Example 2.6.1.

Note that we most often write information and code vectors in order where the first bit is on
the left and last one on the right that corresponds to the display of oscilloscope or logic analyzer. In Figure 2.6.1 we have the first bit on the right hand side because data flows from left
to right.

2.7. Maximum-Likelihood Decoding


There are two basic principles of decoding: hard- and soft-decision decoding. We concentrate
here only on hard-decision decoding methods, which means that the received bits are detected
to be one or zero (in binary transmission) bit by bit, and only then does error control decoding take
place. Soft-decision decoding means that we receive the whole codeword (samples of bits may
have many values, not just one or zero) and then calculate which error free codeword is closest. Soft-decision decoding is more complicated, but its performance is approximately 2
dB better than that of hard-decision decoding.
The encoder encodes each block of message bits i into a codeword c. The received codeword
contains possibly some errors and we write it as c'. Now, we assume that the hard-decision
decoding is in use, and the received binary codeword c' is passed to the decoder.
The task of the error correcting decoder is to find out what was the actual transmitted codeword c and correct (or just detect) errors if errors have occurred. For this it uses the stored information about the error control code in use that is defined by the protocol of physical layer
in the case of Forward Error Correction (FEC), and data link layer in the case of error detection that requires retransmission procedures.
Because a small number of errors occurs much more frequently than a high number of errors,
the error correcting decoder computes which error free codeword has the smallest Hamming distance to the received word and assumes that it was the transmitted word. This principle results
in the minimum probability of codeword error and is called Minimum-Distance or Maximum-Likelihood Decoding.
A direct way to perform error correction would be to compare c' with every codeword in the
code [1, p 484]. This method requires storing all 2^k code vectors at the receiver and performing
2^k comparisons for each received codeword. However, efficient codes usually have quite a long
block length and a large number of information bits k. For this comparison the decoder could
mod-2 add the received codeword to all error free codewords in the code [3, p 446]. This
would result in error vectors, and the error vector with the smallest weight tells the minimum distance and thus which word was (most probably) transmitted. This is one possible
but inefficient way to implement the minimum-distance decoding rule.
As an example, for the Hamming code (31, 26), which is quite a modest code today, the receiver
should store n × 2^k > 10^9 bits and perform 2^26 = 67 108 864 comparisons for each received
codeword.
2.7.1. Syndrome Decoding
More practical decoding methods for codes with large k use parity-check information given by
parity check matrix H that we introduced in Subsection 2.5.2. As we saw, the parity check matrix H used by the decoder is directly related to the generator matrix G used in the encoder. The decoder
multiplies the received codeword by the transpose of the parity check matrix. If the received word is
one of the codewords c, it is most probably error free and the decoder gets

$$\mathbf{c}\mathbf{H}^T = \mathbf{c}\begin{bmatrix} \mathbf{P} \\ \mathbf{I}_{n-k} \end{bmatrix} = [0 \;\dots\; 0] \qquad (2.7.1)$$

If the received codeword is in error (and not equal to any of the error free codewords), we write it as c', and Equation 2.7.1 gives at least one non-zero element as the result of the multiplication. We call this vector, which has as many elements as the parity check matrix
has rows (or H^T has columns), the syndrome s, given by

    s = c' H^T    (2.7.2)

where c' represents a received codeword, which may be error free or in error. If the syndrome
s = 0, i.e., all elements of the syndrome equal zero, the received word is error free or errors have
changed it to another codeword, in which case the errors are undetectable. Otherwise errors are
indicated by the presence of non-zero elements in s. Checking whether there are non-zero
elements in the syndrome is enough for error detection. The hardware needed for syndrome calculation is essentially the same as is needed for the generation of parity check bits in the encoder
[1, p 484].

Error correction can also be based on the syndrome but requires more circuitry. To develop a
decoding method we introduce the n-bit error vector e, whose nonzero elements mark the positions of transmission errors in c'. For instance, if the transmitted codeword c = [1 0 1 1 0] and
the received word in error is c' = [1 0 0 1 1], then e = [0 0 1 0 1]. In general

    c' = c + e    (2.7.3)

and in the case of binary numbers (in the finite field GF(2) the additive inverses are -1 = 1 and
-0 = 0, see Section 2.4.1) we can write

    c = c' - e = c' + e    (2.7.4)

We can think of this in such a way that a second error in the same bit location cancels the original
error and the resulting codeword is the original one. If we successfully identify the error vector that has occurred, we can reproduce the original codeword according to Equation 2.7.4. If we
now substitute this into Equation 2.7.2 we obtain

    s = c' H^T = (c + e) H^T = c H^T + e H^T = e H^T    (2.7.5)

This result shows us that the syndrome depends only on the error pattern, not on the transmitted codeword.
The syndrome has as many elements as the codewords have redundant bits, that is, n-k. This equals
the number of columns in the submatrix P of G and the number of rows of H (or the number of
columns of H^T). This restricts the number of different syndromes to 2^(n-k).
The number of different error vectors equals the total number of different n-bit words minus
the number of the codewords, that is, 2^n - 2^k. This is usually much higher than the number of
syndromes and thus syndromes are not unique for all possible error patterns. For example, in
the case of the (7, 4) Hamming code the number of detectable error patterns is 2^n - 2^k = 112 and
the number of syndromes is only 2^(n-k) = 8. This means that all 112 error cases are detected (all
that are not the same as codewords) but only 7 of them can be corrected properly by the syndrome decoder.
For error correction we have 2^(n-k) - 1 different syndromes when we have excluded the all-zero syndrome that indicates error free transmission. Hence we are able to correct 2^(n-k) - 1 different (selected) error patterns. These error patterns we select to be the most likely error vectors, that is, the
error patterns with the smallest number of errors. This strategy is called maximum-likelihood decoding and it is optimum in the sense that it minimizes the word error probability. Maximum-likelihood decoding corresponds to choosing the code vector that has the smallest Hamming
distance to the received codeword [1, p 485].
To carry out maximum-likelihood decoding, we must first compute the syndromes generated
by the 2^(n-k) - 1 most likely error vectors. The table look-up decoder in Figure 2.7.1 then operates as
follows. The decoder calculates the syndrome s from the received vector c' according to Equation
2.7.5 and looks up the assumed error vector e stored in the table. The sum c' + e = cr, which corresponds to the most likely transmitted codeword, is then generated by exclusive-OR gates. If
no error has occurred, the syndrome and error vector are all-zero vectors and the received word is
accepted as it was received. The last n-k elements of the n-bit decoded words are omitted and the
output then corresponds (most probably) to the original k-bit message block in the input of the
encoder.

[Figure 2.7.1: the received codeword c' = (c'_n ... c'_2 c'_1) feeds a syndrome calculator; the
syndrome s addresses a table that stores the error patterns e = (e_n ... e_2 e_1); exclusive-OR
gates form the decoded codeword c' + e.]
Figure 2.7.1 Table-lookup decoder [1, p 485].

As we saw, the number of nonzero syndromes, 2^(n-k) - 1, defines how many different error patterns we can correct in the decoder, and this depends on the number of redundant bits, n-k, in
the codeword. In an n-bit word the number of different j-error patterns is

$$\binom{n}{j} = \frac{n!}{j!\,(n-j)!} \qquad (2.7.6)$$

where we include all possible places for j errors in an n-bit word. Hence, to correct up to t errors, k
and n must satisfy

$$2^{n-k}-1 \ge \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{t} \qquad (2.7.7)$$

which simply requires that the number of non-zero syndromes is larger than or equal to the number
of error patterns to be corrected. In the case of single error correcting codes, such as the Hamming
code, the equation reduces to

$$2^{n-k}-1 \ge n \qquad (2.7.8)$$

which is valid for Hamming codes as we have seen in our examples.
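Inequality 2.7.7 is easy to test numerically. A small sketch (our own; the function name is arbitrary) that checks whether an (n, k) code can possibly correct t errors:

```python
from math import comb

def hamming_bound_ok(n, k, t):
    """True if 2^(n-k) - 1 >= C(n,1) + ... + C(n,t) (Equation 2.7.7)."""
    return 2**(n - k) - 1 >= sum(comb(n, j) for j in range(1, t + 1))

print(hamming_bound_ok(7, 4, 1))    # True:  (7, 4) Hamming code, t = 1
print(hamming_bound_ok(15, 7, 3))   # False: compare Problem 2.7.7
```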


Furthermore, when e corresponds to a single error in the jth bit of the codeword, we find from
Equation 2.7.5 that the syndrome s is identical to the jth row of H^T. Therefore, to provide a
distinct syndrome for each single-error pattern and for the error free pattern, the rows of H^T
(or columns of H) must all be different (every pair of rows must be linearly independent)
and each of them must contain at least one nonzero element. As explained in Section
2.6.1, Hamming codes are designed to satisfy this requirement for H, with 2^(n-k) - 1 = 2^m - 1 = n.

Example 2.7.1
Let us design the table lookup decoder for the (7, 4) Hamming code that we used in
Example 2.6.1. There we wrote a parity check matrix as
                      1 1 0 1 1 0 0
    H = [-P^T  I] =   1 0 1 1 0 1 0
                      0 1 1 1 0 0 1

We see that all columns are different and contain at least one non-zero element because the all-zero column is omitted.
The transpose of H is:

            1 1 0
            1 0 1
            0 1 1
    H^T =   1 1 1
            1 0 0
            0 1 0
            0 0 1

Now we compute the syndromes for all single error vectors [1 0 0 0 0 0 0],
[0 1 0 0 0 0 0], . . . , [0 0 0 0 0 0 1] according to Equation 2.7.5, s = e H^T, and we get
Table 2.7.1. There are 2^(n-k) - 1 = 2^3 - 1 = 7 single error patterns and the corresponding syndromes are listed in Table 2.7.1. The syndromes equal the rows of H^T. This table requires
only 80 bits of memory in the decoder.
Table 2.7.1 Syndromes for the (7, 4) Hamming code.

    e                  s
    0 0 0 0 0 0 0      0 0 0
    1 0 0 0 0 0 0      1 1 0
    0 1 0 0 0 0 0      1 0 1
    0 0 1 0 0 0 0      0 1 1
    0 0 0 1 0 0 0      1 1 1
    0 0 0 0 1 0 0      1 0 0
    0 0 0 0 0 1 0      0 1 0
    0 0 0 0 0 0 1      0 0 1

We see that all single error patterns have unique syndromes and this code is able to
correct all of them. Let us now suppose that the all-zero codeword is transmitted and the received word contains two errors such that e = [1 0 0 1 0 0 0]. The decoder calculates s
= e H^T = [0 0 1] and from Table 2.7.1 we take the error pattern e = [0 0 0 0 0 0 1], assuming
that the last bit has been in error. The decoder inverts the last bit, which actually was not
in error, and the decoded word includes three errors, one of which has appeared in the erroneous correction by the decoder. However, if the bit error rate is BER = 1·10^-3 (for example), there are about 500 properly corrected single error cases for each double error
case, as shown in Problem 2.2.9, so the use of the single error vectors in the table is
a very reasonable choice.
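The whole of Example 2.7.1 fits in a few lines of Python. The sketch below (our own illustration, not from the source) builds Table 2.7.1 as a dictionary from syndrome to error pattern and then repeats the double-error experiment above:

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 1, 0, 0],    # parity check matrix of Example 2.6.1
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
n = H.shape[1]

# Table 2.7.1: syndrome -> single-error pattern (plus the all-zero entry).
table = {(0, 0, 0): np.zeros(n, dtype=int)}
for j in range(n):
    e = np.zeros(n, dtype=int)
    e[j] = 1
    s = tuple(e @ H.T % 2)               # s equals the j-th row of H^T
    table[s] = e

r = np.array([1, 0, 0, 1, 0, 0, 0])      # all-zero codeword with two errors
s = tuple(r @ H.T % 2)                   # -> (0, 0, 1)
print((r + table[s]) % 2)                # "corrects" the last bit: 3 errors remain
```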
If multiple transmission errors per word occur infrequently, we need not be concerned
about occasional extra errors caused by erroneous error correction. If multiple errors occur
frequently, a more powerful code is required.
2.7.2. Standard Array
We noticed that the syndrome has the characteristic of an error pattern and does not depend
on the transmitted codeword, see Equation 2.7.5. We observe that there are 2^n possible error
patterns (as many as different n-bit words) and only 2^(n-k) syndromes. The number of unique
syndromes of the Hamming code in Example 2.7.1 equals the number of single error patterns.
This is not always the case.
Now we construct a decoding table in which we list all the possible 2^k codewords in the first
row, beginning with the all-zero codeword in the first column [3, p 446]. This all-zero codeword
also represents the all-zero error pattern. Then we list all n error patterns of weight 1 in
the first column. For each error pattern we fill its row by adding the error pattern in the first
column to the error free codeword in the first row. Each new error pattern written into the first
column, called a coset leader, must not be present in the upper rows of the table.
Then, if n + 1 < 2^(n-k), we continue with double error patterns that are not yet present in the table.
For each error pattern we fill its row by adding it to the codewords. Then we continue with triple
error patterns, etc., until we have 2^(n-k) rows, which is equal to the number of different syndromes.
When all 2^(n-k) rows are done we have a 2^(n-k) × 2^k table, called the standard array, that is presented in
Table 2.7.2.
Table 2.7.2 Standard Array

    c1               c2                  c3                  . . .   c_2^k
    e1               c2 + e1             c3 + e1             . . .   c_2^k + e1
    e2               c2 + e2             c3 + e2             . . .   c_2^k + e2
    .                .                   .                           .
    .                .                   .                           .
    e_(2^(n-k)-1)    c2 + e_(2^(n-k)-1)  c3 + e_(2^(n-k)-1)  . . .   c_2^k + e_(2^(n-k)-1)

Each row of the standard array consists of all received words that would result from the corresponding error pattern in the first column. Each row is called a coset and the first, left-most
word, the error pattern, is called a coset leader. The table contains all possible n-bit words, error
free words and words in error, and the coset leader represents an error vector.
Example 2.7.2
Let us construct the standard array for the (5, 2) systematic code with the generator matrix

    G = 1 0 1 0 1  = [ I P ]
        0 1 0 1 1

We can easily see that the minimum distance dmin = 3: the code has only 4 different codewords and the minimum weight is 3. The standard array for this code is given in Table
2.7.3.
Table 2.7.3 Standard array for the (5, 2) code [3, p 447].

    Code words:
    00000   01011   10101   11110
    00001   01010   10100   11111
    00010   01001   10111   11100
    00100   01111   10001   11010
    01000   00011   11101   10110
    10000   11011   00101   01110
    11000   10011   01101   00110
    10010   11001   00111   01100

The coset leaders consist of the all-zero error pattern, all error patterns of weight 1, and two error
patterns of weight 2. There are many more double error patterns, but in the table there
is room for only two of them because the number of rows and syndromes is 2^(n-k) = 2^3 = 8. The double
error patterns that we have selected have syndromes different from the syndromes of
the single error patterns. The syndrome calculation is presented in Example 2.7.3. With other
choices of double error patterns the words of the standard array would be different. The reader may try other double error patterns and fill the table. If a syndrome is not unique but equal to one of the single error syndromes, we get identical words in the table, the decoder is not able to select the
right row and column, and the error correction fails.
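One way to convince oneself of the structure of Table 2.7.3 is to generate it. The sketch below (our own illustration) greedily takes the lowest-weight unused word as the next coset leader; this yields a valid standard array, although the two double-error leaders it happens to pick ([00110] and [01100]) differ from those selected above, as discussed in Example 2.7.3:

```python
from itertools import product

G = [[1, 0, 1, 0, 1],
     [0, 1, 0, 1, 1]]
n, k = 5, 2

def add(a, b):
    return tuple((x + y) % 2 for x, y in zip(a, b))

# All 2^k codewords of the (5, 2) code.
codewords = []
for i in product([0, 1], repeat=k):
    c = (0,) * n
    for bit, row in zip(i, G):
        if bit:
            c = add(c, tuple(row))
    codewords.append(c)

used = set()
for leader in sorted(product([0, 1], repeat=n), key=sum):  # lowest weight first
    if leader in used:
        continue
    coset = [add(c, leader) for c in codewords]            # one row of the array
    used.update(coset)
    print(leader, coset)        # 2^(n-k) = 8 rows, together all 2^n words
```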
Note that the standard array contains all possible received words, error free codewords and codewords in error; that is, all binary words of length n are included. The number of rows is 2^(n-k)
and the number of columns is 2^k, and hence the number of words in the table is 2^(n-k) × 2^k = 2^n.
If we stored the standard array in the decoder, it would check to which column the received word belongs and would take the uppermost word in that column as the corrected codeword. However, it
is usually not reasonable to store the standard array because in the case of efficient (long)
codes it is very large. Instead of that, we use the syndrome and error vectors that we already discussed in Section 2.7.1.
Now suppose that e is a coset leader and c is the transmitted codeword. Then this error pattern
would result in the received codeword written in the table [3, p 448]

    c' = c + e    (2.7.9)

The syndrome is, as previously (2.7.5),

    s = (c + e) H^T = c H^T + e H^T = e H^T    (2.7.10)

Clearly, all received codewords in the same coset (row) have the same syndrome, since the latter depends only on the error pattern. Furthermore, each coset has a different syndrome because
the selected error vectors produce unique syndromes. We may construct a syndrome decoding
table in which we list all 2^(n-k) different syndromes and the corresponding minimum weight error
patterns, i.e., the coset leaders. Then, when a word c' is received, the decoder computes the syndrome

    s = c' H^T    (2.7.11)

We have selected error vectors (coset leaders) that give a unique syndrome for each row (coset)
of the standard array. The syndrome defines the row, and we take the error vector e (coset
leader) and add it to the received word:

    cr = c' + e    (2.7.12)

Now we have performed error correction and cr is one of the words in the first row. The correction is successful if the error pattern that actually occurred is one of the coset leaders. This process is identical to the syndrome decoding in Section 2.7.1, but we
have now considered multiple error cases as well and explained error correction with the help
of the standard array. The standard array contains all possible received words and each of them is
decoded to one of the error free codewords. The use of the syndrome gives the same result as if
we had stored the standard array in the decoder.
Example 2.7.3
Let us write the syndrome decoding table for the (5, 2) code in Example 2.7.2. To calculate
the syndromes we construct the parity check matrix from the generator matrix according to Equation 2.5.8 and Example 2.7.1 and write its transpose, H^T = [P over I], as

            1 0 1
            0 1 1
    H^T =   1 0 0
            0 1 0
            0 0 1

We can now compute syndromes for each error vector (coset leader) according to
Equation 2.7.10. First we write all single error syndromes and then try which double
error patterns produce the remaining syndromes. All syndromes are listed in Table
2.7.4.
Table 2.7.4 Syndrome table for the (5, 2) code.

    Error pattern e    Syndrome s
    00000              000
    00001              001
    00010              010
    00100              100
    01000              011
    10000              101
    11000              110
    10010              111

We could select the two last rows in a different way, because the error vector [00110]
also gives the syndrome [110] and the error vector [01100] gives the syndrome [111]. If we used these
and constructed the standard array in Table 2.7.3, we would notice that the order of the words in the
last two rows changes, but the table would still contain all possible words of length
n.
Suppose now that the received word is c'= [10101]. The resulting syndrome is
[000], and the reception is assumed to be error free. As another example suppose that
two errors have occurred and c' = [01101]. The decoder computes the syndrome [110] and
adds the corresponding error vector [11000] to the received codeword. Two errors are
(probably) corrected and the corrected codeword is [10101].
Now suppose that the actual error vector is e = [10100] and it has changed the error free
codeword [10101] to [00001]. The computed syndrome becomes s = [001]. Hence, the
error determined from Table 2.7.4, e = [00001], is added to the received word c'. The
result is a decoding error that adds one more error to the received word. We have seen that
this (5, 2) code corrects all single errors and only the two double error cases,
[11000] and [10010], which we selected. There are ten different sequences of two
errors, so this code corrects 20% of them. If a double error vector is different from those
two that were selected to be correctable, the decoder makes additional errors.
Assume that the error vector that occurred is, for example, [00110]; then the resulting
syndrome would be [110]. Now the decoder would invert the two first bits (according
to Table 2.7.4) even though the third and fourth bits were actually the ones in error. This is not a
severe problem because it usually does not matter how many errors there are if
we cannot correct the received codeword. The main goal is to correct as many error sequences as possible.
2.7.3. The Relation of Code Rate and Block Length
Forward error correction (FEC) codes are designed to correct t >= 1 errors per received
word. The code should also be efficient in terms of the code rate, Rc = k/n, i.e., for the required performance its code rate should be as close to one as possible. Code rate and block length
are related by the inequality [1, p 487]

$$R_c \le 1 - \frac{1}{n}\log_2\left(\sum_{i=0}^{t}\binom{n}{i}\right) \qquad (2.7.13)$$

which follows from Equation 2.7.7 because n - k = n(1 - Rc), see Problem 2.7.4. For efficient
transmission Rc should be close to one, and inequality 2.7.13 shows that n should then be
large. Then naturally k is also large because Rc = k/n. However, long codewords make encoders and decoders expensive, power consuming and difficult to implement if we are not able to
utilize the structure of the code to simplify the implementation. Cyclic codes are a special subclass of linear block codes whose cyclic structure leads to very practical implementations. Thus the block codes used in communication systems are almost always cyclic codes, and
we describe them in the following chapter.


2.8. General error performance characteristics of linear block codes


To evaluate the error correction capability of a linear code we should find out its minimum Hamming distance (or, for a linear code, its minimum weight). Then we would know that it
can always correct all errors up to t if dmin >= 2t + 1 (Equation 2.2.3), and that it is always able to detect up to l errors if dmin >= l + 1 (Equation 2.2.4). The code may be able to detect and correct
some error sequences with more errors, but that is not guaranteed. For the evaluation of the minimum distance we have to know the code in detail. If we knew the parity check matrix we
could find out the minimum distance according to the procedure in Subsection 2.5.3. By knowing
the generator matrix we could write all codewords and see the minimum weight, which equals the
minimum distance.
Generally we know about any code that the minimum weight of a (non-zero) information sequence is 1. If we know that the code has n-k parity check bits we can see, without any further
knowledge about the code, that the highest possible minimum distance (= minimum weight)
of a linear block code, known as the Singleton bound, is [4, p 50]:

    dmin = w(c)_min <= 1 + n - k    (2.8.1)

This we can say without any knowledge about the generator or parity check matrixes. We can
never design a code better than this. Note that Equation 2.8.1 typically gives a very optimistic
value for dmin.
Equation 2.8.1 follows from the property of linear codes that the minimum distance between
codewords equals the minimum weight of the codewords. The smallest weight non-zero information
word has one non-zero element. If all n-k parity bits are non-zero, the minimum weight of the
codewords is 1 + (n-k), and hence the minimum distance cannot be larger than 1 + n - k.
The minimum distance of the repetition code, which repeats each bit n times and transmits it as an n-bit
codeword, fulfills Equation 2.8.1 with equality. It is the only binary code that meets the Singleton
bound. For some short codes the minimum distance equals n-k, as in the case of the (7, 4) Hamming code, but for most practical codes it is much smaller. The reader may look at the code examples
in Table 3.2.1 and compare the number of redundant bits in the codes and their minimum distances.

2.9. Weight Distribution


For linear codes the minimum weight is equal to the minimum distance, and it tells the minimum number of errors (worst case) that we can neither detect nor correct. The weight distribution tells how many codewords the code has for each value of the Hamming weight w, and it
gives more information about the performance of the code. The weight distribution is
an (n+1)-dimensional vector with components A_l, where l = 0, ..., n [4, p 431]

    A = [A0, A1, A2, ..., An]    (2.9.1)

For all linear codes with minimum distance d, A0 = 1 (the all-zero codeword always exists) and A1 = A2 = ... = A_(d-1) = 0. The weight distribution is not trivial to derive for long codes, but with
the help of it we would know, for example, that a code with very many low weight codewords has worse performance than another with only a few codewords close to the minimum
weight.
Example 2.9.1
Let us derive the weight distribution of the example Hamming code in Table 2.3.3. Its
components are: A0 = 1, A1 = A2 = 0, A3 = 7, A4 = 7, A5 = A6 = 0, A7 = 1. The weight
distribution vector is then:
A = [1 0 0 7 7 0 0 1]
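For any short linear code the weight distribution can be found by brute force. A sketch (our own, not from the source) that enumerates all 2^k codewords generated by the G of Example 2.6.1 and counts the weights; any (7, 4) Hamming code gives the same result:

```python
from itertools import product

G = [[1, 0, 0, 0, 1, 1, 0],      # generator matrix of Example 2.6.1
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
n, k = 7, 4

A = [0] * (n + 1)
for i in product([0, 1], repeat=k):
    # codeword c = i G over GF(2)
    c = [sum(bit * row[j] for bit, row in zip(i, G)) % 2 for j in range(n)]
    A[sum(c)] += 1                # count one codeword of this weight
print(A)                          # [1, 0, 0, 7, 7, 0, 0, 1]
```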
In this chapter we have studied the structure, implementation and characteristics of linear
block codes. In the following chapter we concentrate on a specific class of linear block codes
that have a cyclic property, namely cyclic block codes.

Problems
Problem 2.2.1: What are the Hamming distances of the codewords below? What are their
Hamming weights?
[00110101] and [00111011]
[00100101] and [11001101]
Problem 2.2.2: We have defined a code that transmits two information bits using four codewords. What is the code, written as (n, k) code? Codewords are
0000000
0100111
1011100
1101011
What are the Hamming distances of the code? What is the minimum distance of the code?
What is the code rate? Is this a systematic code? How many errors it can always correct? How
many errors it can always detect?
Problem 2.2.3: Improve the code of Problem 2.2.2 so that it would be able to correct a
single error. You have to replace at least one of the codewords with another possible 7-bit
codeword. Evaluate the distances, the minimum distance and the error detection and correction capability of your new code.
Problem 2.2.4: Find out how many errors the code in Example 2.1.1 is always capable of detecting. Explain! Write each codeword with a single error in all possible places. Which of these
single error cases can the code correct and which not? Explain!
Problem 2.2.5: The code in Example 2.1.1 is used to transmit information sequence 00. If a
single error occurs, is the decoder able to correct all possible single errors? If not, which of
them it cannot correct? Is it able to detect them all? Which double, triple or four error cases it
is not even able to detect?
Problem 2.2.6: We assume that because of a severe error burst a random codeword is received.
What is the probability that the code in Problem 2.2.2 is not able to detect the errors? If we have a
block code (15, 11) in use, what is the probability that error detection fails?
Problem 2.2.7: We assume that a random word is received because of a severe error burst.
What are the code rates and probabilities that error detection fails in the case of codes: a) (5,
4), b) (5, 3), c) (5, 2), d) (5, 1). e) What do you see?
Problem 2.2.8: A random word is received because of a severe error burst. Calculate probabilities for successful error detection when the code rate is 0.8 and the block length is a) 5, b)
10, c) 50 and d) 100. e) What do you see?
Problem 2.2.9: Data is transmitted in 5-bit blocks. Find the probabilities that the received 5-bit block is error free, contains one error, two errors, 3 errors, 4 errors or 5 errors. Find all
probabilities for Bit Error Rates (BER) a) 0.5, b) 0.1, c) 1·10^-2, d) 1·10^-3 and e) 1·10^-5. Write
your results into a table where rows represent different bit error rates and columns different
numbers of errors. f) Compare the probabilities of small and large numbers of errors when the bit
error rate is small. What do you see?
Problem 2.2.10: How many Hamming weights and Hamming distances do the a) (7, 3) and b)
(23, 12) block codes have?
Problem 2.3.1: Horizontal and vertical parity code uses even parity. Information sequence is
parity protected in Bytes with five information bits (VRC) and Longitudinal Redundancy
Check is calculated over five Bytes. a) What is the transmitted encoded block when the information sequence is:
10101 01000 11110 10110 00101?
b) What is the information sequence after error correction if the received codeword is:
101011 001111 110010 000000 000101 010111?
c) Are there errors in the received data block below? If there are and can we correct them,
what is the information block? If we cannot correct them, why not? The received block is:
100010 010001 111100 100001 111010 011000.
d) What is this code (n, k) and what is its code rate? What is the information bit rate via 64
kbit/s channel if this error protection encoding were the only redundant information required?
Problem 2.3.2: The repetition encoder transmits each bit two times. What is the residual error
rate at the output of the error detecting decoder? The bit error rate on the line is 1·10^-3.
Problem 2.3.3: Consider a code which transmits each information bit three times as a codeword. a) What is the code (n, k) and its code rate? b) What is the minimum distance of this
code? c) How many bit errors can this code always correct? Is your result consistent with
Equation 2.2.3? Describe the error correction algorithm of the receiver. d) How many errors can it
always detect? Is this consistent with Equation 2.2.4? Describe in words the error detection algorithm of the receiver.
Problem 2.3.4: The bit error probability of the channel is 1·10^-3. a) What is the residual error
rate if no error coding is used? The code is actually a (1, 1) code where each bit is transmitted as
it is. Consider next the repetition code (3, 1) where each bit is transmitted as a three-bit codeword. b) What is the residual error rate when error correction is used? c) What is the residual
error rate when error detection (and retransmission) is used? d) What percentage of the received words has to be retransmitted (all words in error are retransmitted)? Are the cases with
double errors essential from the retransmission probability point of view?
Problem 2.3.5: Four-bit data blocks are transmitted through the channel. Show that the probability of a block containing one or more errors, that is, P(1) + P(2) + P(3) + P(4), is equal to
the probability that not all bits are error free, i.e., 1 - (1-p)^4, where p is the bit error rate of the
channel. Hint: Write the latter formula as a polynomial of p. Then use the binomial distribution and write P(1), ..., P(4) also as polynomials of p. Compare the two polynomials you have
got.
Problem 2.3.6. Calculate probabilities of correct error correction decoding (decoded information block is error free) for the repetition code in Table 2.3.1. Information block length k =
4 and bit error probability p=0.05 and the repetition rate is a) 1; b) 3 or c) 5. Use the binomial
distribution. If the information block length k=6, what is the probability that the decoded information block is error free if the repetition rate is d) 1; e) 3 or f) 5.
Problem 2.3.7: Calculate the residual error rate when the repetition code is used and each information bit is transmitted five times. The bit error rate is 1·10^-3. a) The decoder only detects errors and codewords in error are retransmitted. b) What is the probability of retransmission? c)
What is the residual error rate if error correction is used instead of ARQ?
Problem 2.3.8: Is the Hamming code (7, 4) in Table 2.3.3 a cyclic code? In cyclic code all
shifts of the codewords are also codewords. When shifting to left, leftmost bit moves to the
place of the rightmost bit. What are the bit sequences that make up all codewords with cyclic
shifts?
Problem 2.3.9: Design the error correction logic for the Hamming code (7, 4) decoder in Figure 2.3.2. Use inverters and two input AND or NAND gates.
Problem 2.4.1: Show that the distributive law in Table 2.4.1 holds for all elements of GF(2).
Problem 2.4.2: Find additive inverses of a) 0 and b) 1 in GF(2).
Problem 2.4.3: Find a) additive and b) multiplicative inverses for 2 in GF(3).
Problem 2.4.4: Construct addition and multiplication tables of Finite field GF(5). Use modulo-5 addition and multiplication. An example of these tables is given in Table 2.4.3 for GF(4).
Problem 2.4.5: Write following binary words as polynomials p(x) of x: 11011; 1000010 and
101010.
Problem 2.4.6: Are all binary vectors in each of the following sets linearly independent? If
you find any dependence in the set you know that the set does not contain linearly independent
vectors.
a) [1,1,1], [0,1,1] and [1,0,0]
b) [1,0,0], [0,1,0] and [0,1,1]
c) [1,1,0,0], [0,1,1,0], [1,0,1,0] and [1,1,1,1]
Problem 2.4.7: What are the field elements of the extension field GF(q^m) = GF(2^3) = GF(8) in
exponential, polynomial and binary representations?
Problem 2.4.8: Show that p(x) = x^4 + x + 1 is a prime polynomial (irreducible). Hint: Divide
by all lower degree polynomials and check the remainder.
Problem 2.4.9: Is p(x) = x^3 + x^2 + x + 1 a prime polynomial? If not, what are the factors?
Problem 2.4.10: Show that multiplication Table 2.4.4 for GF(4) is valid.
Problem 2.4.11: Show that the exponential notation α^6 of the field element corresponds to the polynomial notation x^3 + x^2 in GF(2^4) that is constructed with the primitive polynomial p(x) = x^4 + x + 1 and the primitive element α = x.
Problem 2.4.12: Add the four-bit codewords a) [0010] and [1011], b) [0011] and [1101] in
GF(2^4). See Table 2.4.7. Present the results in binary, exponential and polynomial notation.
Problem 2.4.13: Multiply the four-bit codewords a) [0010] and [1011], b) [0011] and [1101] in
GF(2^4). Use the exponential notation of Table 2.4.7. Present the results in binary, exponential and
polynomial notation.
Problem 2.5.1: Is the code generated by the matrix below, equivalent to the code in example
2.5.2? Is this code a systematic code? How can you construct this generator matrix from the
one in Example 2.5.2?
1 1 0 1 1
G = 0 1 1 1 0

1 1 1 0 0
Problem 2.5.2: Change the generator matrix below into an equivalent systematic form.
0 0 1 0 1

G = 1 0 0 1 0
1 1 0 0 1
a) Perform elementary row operations and show that the code remains the same (the set of
codewords remains the same). b) Perform three column permutations (to right) first and then
row operations to change matrix to systematic form. Is the resulting set of codewords still the
same? If not, why can we argue that the code has still the same performance?
Problem 2.5.3: Show that G HT = 0 using the matrixes given in Example 2.5.2.
Problem 2.6.1: Is the code in Example 2.5.1 a Hamming code? How many errors can the code in
Example 2.5.1 correct and detect? What are the minimum weight and the minimum distance of the code? Check this by writing down all codewords.
Problem 2.6.2: Why cannot we use the matrix below as the generator matrix of a linear block
code? What is wrong with it?
1 1 1 0 0
G = 1 0 0 1 0
0 1 1 1 0
Hint: Write down all codewords to see what is wrong. Then consider generator matrix to see
what is wrong with it.
Problem 2.6.3: Write the parity check matrix of the (7, 4) code generated by the generator
matrix below. Is this code a Hamming code? Hint: Check if it is designed according to the Hamming code design rules.

    G = 1 0 0 0 1 0 1
        0 1 0 0 1 1 1
        0 0 1 0 1 1 0
        0 0 0 1 0 1 1
Problem 2.6.4: What is the code (n, k) defined by the generator matrix below? Write the
codewords when information sequences to be encoded are 1011 and 1111. What is the code
rate of this code?
    G = 1 0 0 0 0 1 1
        0 1 0 0 1 1 1
        0 0 1 0 1 1 ?
        0 0 0 1 1 0 ?

(The two entries marked "?" in the right-most column are not legible in the source.)

Problem 2.6.5: Write the parity check matrix for the code in Problem 2.6.4. Write all syndromes and the corresponding error vectors. What is the decoded word when the received word is
[0110110]?
Problem 2.6.6: Which binary Hamming codes have a block length shorter than 500 bits?
What are their code rates? What do you see?
Problem 2.6.7: What are the user data rates if the data rate of the channel is 64 kbit/s and different Hamming codes calculated in Problem 2.6.6 are used?
Problem 2.6.8: Design a systematic Hamming code (3, 1). What previously introduced simple
code is similar to this?
Problem 2.6.9: a) Design the parity check matrix and generator matrix for systematic Hamming code (15, 11). b) What is the codeword for information vector
i = [00011000000]. c) Decode the received vector c' = [100000000011100]. Is this word error
free? What is the corresponding information vector i? Note that we may write many different
(15, 11) Hamming codes and the results may be different depending on our design.
Problem 2.7.1: Derive the syndromes for the (7, 4) Hamming code defined by the generator
matrix below:
    G = 1 0 0 0 1 0 1
        0 1 0 0 1 1 1
        0 0 1 0 1 1 0
        0 0 0 1 0 1 1

Is the received word [0110110] in error? If it is, what is it after error correction and what is the
information sequence?
Problem 2.7.2: The block code given in Examples 2.7.2 and 2.7.3 is in use. What are most
probably the transmitted codewords and information bits if the received ones are: a) [10001],
b) [11011], c) [11110], d) [10000]?
Problem 2.7.3: The linear block code in Problem 2.7.1 is used. Is the received word error free
and if not, what was (most probably) the transmitted codeword? Received words are: a) c' =
[1000101]; b) c' = [0101001]; c) c' = [1000110]; d) c' = [1110100]; e) c' = [0100100];
Problem 2.7.4: Show that Equation 2.7.13 follows from Equations 2.7.6 and 2.7.7.
Problem 2.7.5: What is the maximum code rate of any block code when block length is n = 7
and the code should correct a) one error, b) up to two errors? Use Equations 2.7.7 and 2.8.1
and compare the results. Note that k gets only integer values.
Problem 2.7.6: Find, with help of Equation 2.7.7, the minimum block length of a code that is
able to correct three errors. What is the number of information bits per codeword? How would
you describe this code?
Problem 2.7.7: Show with the help of Equation 2.7.7 that a block code (15, 7) cannot be
able to correct all error patterns with more than two errors.
Problem 2.7.8: What is the maximum code rate of a code with a block length 100 that can
correct all single and double error cases. Use Equation 2.7.7.
Problem 2.7.9: Show that the code rate of a block code with length of n = 100, cannot exceed
0.93, when it is designed to correct all words with a single error.
Problem 2.7.10: The code rate is 0.7. What is the maximum number of errors that the code
with block length of a) 10, b) 20 can always correct?
Problem 2.7.11: The block length of a block code is 100 and the code rate is 0.9. How many errors can this
code always correct?
Problem 2.7.12: Write the standard array and syndrome table for a (5, 2) code defined by the
generator matrix below. What error sequences the code can correct?
    G = 1 0 1 1 1
        0 1 1 0 1
Problem 2.7.13: Which double error patterns could we correct with the code in Examples
2.7.2 and 2.7.3?
Problem 2.7.14: Modify the standard array in Table 2.7.3 if the selected two error patterns
were [00110] and [01100]. What would be the corresponding syndromes? Do the original and
modified standard arrays contain all possible received words?
Problem 2.8.1: What is the maximum code rate of a code with a block length 100 that can
correct all single and double errors according to Equation 2.8.1? Compare your result with
Equation 2.7.7 used in Problem 2.7.8.
Problem 2.8.2: What is the maximum code rate of a block code with length of n=100, that
corrects all single error cases? Use Equation 2.8.1. Compare your result with maximum code
given by Equation 2.7.7 in Problem 2.7.9.
Problem 2.9.1: Write the weight distribution vector for a code in Example 2.7.2.


3. Cyclic Block Codes


Cyclic codes are widely used in practical communication systems because their structure
makes implementation of encoder/decoder circuitry simple. Good block codes have a large
number of long codewords. Generator matrix and parity check matrix for long codes would be
large and they would be difficult to handle. Instead of matrix representation we prefer to use
polynomials to define cyclic codes and to explain their operation.

3.1. The Structure of Linear Cyclic Block Codes


Cyclic codes are a subset of the class of linear block codes and they satisfy the cyclic shift
property, i.e., if c = [c_(n-1) c_(n-2) ... c_1 c_0] is a codeword of a cyclic code, then all cyclic shifts of c,
for example [c_(n-2) c_(n-3) ... c_1 c_0 c_(n-1)], are codewords of the same cyclic code as well [4, p 96].
3.1.1. Generator polynomial of a cyclic code
We described in Subsection 2.4.2 how we can represent binary (or non-binary) words as polynomials. When dealing with cyclic codes, which are often very long, it is convenient to express
the generator matrix of a cyclic code (n, k) as a generator polynomial g(x). This polynomial
divides x^n - 1 (in the binary case we may write x^n + 1) and has degree n - k [3, p 425], which is
equal to the number of redundant bits of the code.
Definition 3.1: There is a cyclic code of block length n with generator polynomial g(x) if and
only if g(x) divides x^n - 1 [4, p 98], where n is the block length of the code.
The simple encoding rule for a cyclic code is

    c(x) = i(x) g(x)    (3.1.1)

where c(x) is a polynomial of degree n-1 that represents a codeword of length n, i(x) is
an information polynomial of degree k-1, and g(x) is the generator polynomial that must have,
according to Equation 3.1.1, degree n-k.
Example 3.1.1:
Let us derive a cyclic code with block length n = 4 and the generator polynomial
g(x) = x + 1. First we must check that this cyclic code exists, and for that we divide
x^n - 1 = x^4 - 1 by x + 1. The remainder is zero, so we know that the cyclic code with this generator and block length 4 exists. The degree of the codeword polynomial is 3 (the number
of bits is 4) and the degree of the information polynomial is 2 (three bits) because the degree of g(x) is one. Now, for example, if i(x) = x^2 + x + 1 [111], we get
c(x) = i(x) g(x) = (x^2 + x + 1)(x + 1) = x^3 + x^2 + x + x^2 + x + 1 = x^3 + 1 [1001]. We see that the
first three bits are not equal to the information bits, so this code is not systematic.
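Polynomials over GF(2) are conveniently handled in software as integers whose bit j is the coefficient of x^j. A minimal sketch (our own convention and code, not from the source) that reproduces Example 3.1.1 with a carry-less multiplication:

```python
def gf2_mul(a, b):
    """Carry-less (GF(2)) product of two polynomials given as bit masks."""
    r = 0
    while b:
        if b & 1:          # this power of x is present in b
            r ^= a         # add (XOR) the shifted copy of a
        a <<= 1
        b >>= 1
    return r

g = 0b11      # g(x) = x + 1
i = 0b111     # i(x) = x^2 + x + 1
print(bin(gf2_mul(i, g)))    # 0b1001, i.e. c(x) = x^3 + 1
```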
The encoding according to Equation 3.1.1 does not give codewords in the systematic form that we
usually prefer. The systematic form has the advantage that the information sequence in the beginning of the codeword can be forwarded as it is (for example in a router) while the error check part
is computed. The error control part is attached to the frame after the information section. We may
generate a systematic cyclic code that contains the same set of codewords as the non-systematic code created by Equation 3.1.1, i.e., an equal code but in systematic form. Codewords
c(x) in systematic form we get from
    c(x) = x^(n-k) i(x) + p(x)    (3.1.2)

The first term simply means that the information word is shifted n-k bits to the left, and the error check
polynomial bits are inserted into the lower order part of the codeword. The degree of p(x) is
n-k-1, which is smaller than the degree of g(x), that is n-k.
For example, if n-k = 4, then the least significant bit of i(x) is shifted four places to the left according to Equation 3.1.2. The degree of the generator polynomial is also n-k = 4. Now there are four
free places for the check bits (polynomial p(x)) on the right, and the highest power of a four-bit word
is n-k-1 = 3 (x^3, x^2, x^1, x^0).
Now the problem is: how do we compute the error check part p(x) of the systematic codeword in
the encoder? If we divide both sides of Equation 3.1.2 by g(x), the remainders must be the
same. Because the degree of p(x) is smaller than the degree of g(x), the remainder of
the division p(x)/g(x) is naturally

    Rg(x)[p(x)] = p(x)    (3.1.3)

The remainder of the division c(x)/g(x) is 0 because according to Equation 3.1.1 g(x) must divide c(x) (we get only the quotient i(x)). Now we have

    Rg(x)[c(x)] = Rg(x)[x^(n-k) i(x)] + Rg(x)[p(x)]
    0 = Rg(x)[x^(n-k) i(x)] + p(x)
    p(x) = - Rg(x)[x^(n-k) i(x)]    (3.1.4)

In the binary case we may write:

    p(x) = Rg(x)[x^(n-k) i(x)]    (3.1.5)

That gives us the rule for the generation of a systematic cyclic code, which involves three steps [3, p 429]:
- the message polynomial with degree k-1 is first multiplied by x^(n-k) (i.e., n-k zeros are added to the end of the sequence of k information bits),
- the result, with degree n-1, is divided by the generator polynomial g(x) (degree n-k),
- the remainder p(x) of the division is added to the shifted message polynomial.
Example 3.1.2:
Let us use the same generator, g(x) = x + 1, and block length, n = 4, as in Example
3.1.1, but generate the codewords in systematic form. Now the block length n = 4, the length
of the information sequence k = 3, and the number of parity bits n-k = 1 (equal to the
degree of the generator polynomial). As an example let
i(x) = x^2 + x + 1 [111]; we multiply it by x^(n-k) and get
x^(n-k) i(x) = x (x^2 + x + 1) = x^3 + x^2 + x [1110].
We see that we have shifted i(x) one place to the left to get free space for p(x).
Now we divide x^(n-k) i(x) = x^3 + x^2 + x by g(x) = x + 1 and get the remainder
p(x) = Rg(x)[x^(n-k) i(x)] = Rg(x)[x^3 + x^2 + x] = 1. Then the complete codeword is
c(x) = x^(n-k) i(x) + p(x) = x^3 + x^2 + x + 1 [1111]
We see that the first three bits are equal to the information sequence, so we have derived a systematic codeword.
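The three-step rule is a shift followed by a GF(2) polynomial division. Continuing the integer-as-polynomial convention of the previous sketch (again our own code, not from the source), the systematic encoder of Example 3.1.2 becomes:

```python
def gf2_mod(a, g):
    """Remainder of polynomial a divided by g over GF(2) (bit-mask form)."""
    while a.bit_length() >= g.bit_length():
        a ^= g << (a.bit_length() - g.bit_length())   # cancel the leading term
    return a

def encode_systematic(i, g, n_minus_k):
    shifted = i << n_minus_k                # x^(n-k) * i(x), Equation 3.1.2
    return shifted | gf2_mod(shifted, g)   # append p(x) as the low-order bits

print(bin(encode_systematic(0b111, 0b11, 1)))   # 0b1111: c(x) = x^3+x^2+x+1
```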

The systematic encoding according to Equation 3.1.2 and the non-systematic encoding according
to Equation 3.1.1 produce exactly the same set of codewords, but the association between
i(x) and c(x) is different [4, p 99], i.e., information words are transmitted as different
codewords.
3.1.2. Parity check polynomial of a cyclic code
We stated previously in Definition 3.1 that the generator polynomial g(x) of a cyclic code has
to divide x^n - 1, where n is the block length of the code. When we have defined the generator polynomial, we get the parity check polynomial h(x) as the quotient of the division, and the remainder is zero (that is the basic requirement for any cyclic code):

    (x^n - 1)/g(x) = h(x)  <=>  x^n - 1 = g(x) h(x)    (3.1.6)

Then

    h(x) c(x) = [(x^n - 1)/g(x)] c(x) = (x^n - 1) i(x)    (3.1.7)

where the last result comes from the fact that c(x) = i(x) g(x). Note that this is valid for a non-systematic code, but the same codeword exists in the systematic code as well, for some message polynomial. Division of Equation 3.1.7 by (x^n - 1) gives only the quotient i(x), and the remainder [4, p 98]
is zero:

    R(x^n - 1)[h(x) c(x)] = 0    (3.1.8)
Every error free codeword c(x) has to satisfy this requirement, which corresponds to Equation
2.5.6 where we used the parity check matrix instead of the parity check polynomial.
According to Equation 3.1.8 the decoder may check if the received codeword is error free. For this it
- multiplies the received word by h(x),
- divides the result by (x^n - 1),
and if the remainder of the division is 0, the received word was most probably error free.
Example 3.1.3:
For the code used in the previous examples we get the parity check polynomial as
h(x) = (x^n - 1)/g(x) = (x^4 - 1)/(x + 1) = x^3 + x^2 + x + 1
We may now check if the received word x^3 + 1 [1001], encoded in Example 3.1.1, is
error free:
R(x^4 - 1)[h(x) c(x)] = R(x^4 - 1)[(x^3 + x^2 + x + 1)(x^3 + 1)] =
R(x^4 - 1)[x^6 + x^5 + x^4 + x^3 + x^3 + x^2 + x + 1] = 0
The word is error free and the quotient of the division is x^2 + x + 1. This is equal to
i(x) of the non-systematic code in Example 3.1.1, in accordance with Equation 3.1.7.
We may also check if the word x^3 + x^2 + x + 1 [1111], encoded in Example 3.1.2, is
error free:
R(x^4 - 1)[h(x) c(x)] = R(x^4 - 1)[(x^3 + x^2 + x + 1)(x^3 + x^2 + x + 1)] =
R(x^4 - 1)[x^6 + x^4 + x^2 + 1] = 0
The word is error free, but the quotient of the division, x^2 + 1, is not equal to i(x). In this
systematic case we extract the parity check part p(x) = 1 from c(x) and get
x^3 + x^2 + x = x^(n-k) i(x) = x i(x). Division by x, or a shift by one place to the right, gives
i(x) = x^2 + x + 1 [111], which is the same as in Example 3.1.2.

We have seen how to generate codewords of a cyclic code in non-systematic and systematic form.
Next we show how to find a suitable generator polynomial for a cyclic code.

3.2. Design of a cyclic code


To design a cyclic code with block length n we must find a generator polynomial that fulfills
Definition 3.1, i.e., g(x) divides (x^n - 1). One way to see which polynomials
could be used as a generator polynomial is to find all divisors of x^n - 1, that is, its prime factors
[4, p 101]:

x^n - 1 = f_1(x) f_2(x) . . . f_s(x)        3.2.1

where s is the total number of prime factor polynomials. Any factor polynomial, or product of
two or more factor polynomials, can be used as the generator polynomial g(x). If all prime factors are different, there are 2^s - 2 different non-trivial cyclic codes of length n. We have excluded the trivial cases g(x) = 1 (no error protection at all) and g(x) = x^n - 1 (no information in
the codewords).
Note that the degree of the selected generator polynomial defines the number of error check bits
in each codeword. Usually a higher degree generator polynomial gives better error protection
but a lower code rate. We may improve the code rate by increasing the block length n, but this requires
a new search for a generator polynomial for the longer code.
The straightforward way to find these prime factors is trial and error. We try first x + 1, then
x^2 + 1, then x^2 + x + 1, then x^3 + 1, etc. Every time the division is successful (the remainder is
zero) we have found one of the prime factors, and we continue the search by dividing the quotient further. Note that the trial polynomials correspond to all
binary words whose least significant bit is 1: x + 1 = [11], x^2 + 1 = [101], x^2 + x + 1 = [111], x^3 + 1 =
[1001], and so on. The constant term x^0 = 1 must be included in every factor so that the product
of all the factors f_i(x) in Equation 3.2.1 produces the term 1 of x^n + 1.
Note that not all generator polynomials fulfilling the requirement above are good ones. Additional
study is needed to evaluate the performance of a code with a certain generator polynomial. Some selected cyclic codes and their main characteristics are presented as
examples in Table 3.2.1.

Table 3.2.1 Some selected cyclic codes [1, p 489]

Code            n     k     Rc     dmin   g(x)

Hamming codes   7     4     0.57   3      x^3 + x + 1
                15    11    0.73   3      x^4 + x + 1
                31    26    0.84   3      x^5 + x^2 + 1

BCH codes       15    7     0.46   5      x^8 + x^7 + x^6 + x^4 + 1
                31    21    0.68   5      x^10 + x^9 + x^8 + x^6 + x^5 + x^3 + 1
                63    45    0.71   7      x^18 + x^17 + x^16 + x^15 + x^9 + x^7 + x^6 + x^3 + x^2 + x + 1
                127   113   0.89   5      x^14 + x^9 + x^8 + x^6 + x^5 + x^4 + x^2 + x + 1

Golay code      23    12    0.52   7      x^11 + x^9 + x^7 + x^6 + x^5 + x + 1

Codes in Table 3.2.1 are reviewed later in Section 3.6. We see that longer codes with the same
correction capability have higher code rates. We also see that with the same code rate we get
better performance if the block length is increased. Now we summarize the polynomials we
have used and will use later to describe cyclic codes.
A summary of the polynomials
The polynomials of cyclic codes with block length n that we have discussed and will discuss
later in this chapter are listed below [4, p 100]:
Generator polynomial:                          g(x)     degree: n-k
Error check polynomial:                        p(x)     degree: n-k-1
Parity-check polynomial:                       h(x)     degree: k
Information or message polynomial:             i(x)     degree: k-1
Codeword polynomial:                           c(x)     degree: n-1
Received codeword polynomial:                  c'(x)    degree: n-1
Recovered codeword (after error correction):   c'r(x)   degree: n-1
Error polynomial:                              e(x)     degree: n-1
Syndrome polynomial:                           s(x)     degree: n-k-1
We clarify the procedures presented above with the help of a couple of examples.
Example 3.2.1
We want to design a cyclic code with block length n = 7. According to Definition 3.1
and Equation 3.2.1 we divide x^7 + 1 into prime factors. We find the prime factors by trying all possible divisor candidates, x + 1 = [11], x^2 + 1 = [101], x^2 + x + 1 = [111], x^3 + 1 = [1001], ...,
taking the factors that divide x^7 + 1, and finally the quotient of the division. We get the
factor polynomials

x^7 + 1 = (x + 1)(x^3 + x^2 + 1)(x^3 + x + 1)        3.2.2

We may check this result by multiplying all the factors on the right hand side, which gives x^7 + 1.
Now we may select any of these polynomials, or a product of some of them, as
the generator polynomial of our code. We could easily show that the two third order
polynomials construct equivalent codes (each codeword of one is the reversal of a codeword of the other).
Let us take g(x) = x^3 + x^2 + 1 as our choice. We get the codewords with non-systematic
encoding using Equation 3.1.1, c(x) = i(x) g(x); they are listed in Table 3.2.2 together with the information polynomials. Both the polynomial and the binary vector representations are shown. As an example, for the information word i(x) = x^3 + 1 (corresponding to the binary
sequence [1001]) we get c(x) = i(x) g(x) = (x^3 + 1)(x^3 + x^2 + 1) = x^6 + x^5 + x^3 + x^3 + x^2 + 1
= x^6 + x^5 + x^2 + 1, which corresponds to the binary codeword [1100101], as we see in Table
3.2.2 [3, p 425].
Table 3.2.2 Nonsystematic cyclic code (7, 4), generator
polynomial g(x) = x^3 + x^2 + 1 [3, p 425].

i(x)                 c(x)                             i       c
0                    0                                0000    0000000
1                    x^3 + x^2 + 1                    0001    0001101
x                    x^4 + x^3 + x                    0010    0011010
x + 1                x^4 + x^2 + x + 1                0011    0010111
x^2                  x^5 + x^4 + x^2                  0100    0110100
x^2 + 1              x^5 + x^4 + x^3 + 1              0101    0111001
x^2 + x              x^5 + x^3 + x^2 + x              0110    0101110
x^2 + x + 1          x^5 + x + 1                      0111    0100011
x^3                  x^6 + x^5 + x^3                  1000    1101000
x^3 + 1              x^6 + x^5 + x^2 + 1              1001    1100101
x^3 + x              x^6 + x^5 + x^4 + x              1010    1110010
x^3 + x + 1          x^6+x^5+x^4+x^3+x^2+x+1          1011    1111111
x^3 + x^2            x^6 + x^4 + x^3 + x^2            1100    1011100
x^3 + x^2 + 1        x^6 + x^4 + 1                    1101    1010001
x^3 + x^2 + x        x^6 + x^2 + x                    1110    1000110
x^3 + x^2 + x + 1    x^6 + x^3 + x + 1                1111    1001011

We see that this code is not systematic, because the first section of each codeword is not
identical to the information bits. The code includes three redundant bits, the same as
the degree of the generator polynomial. We see also that the code is cyclic:
shifting any codeword to the left and moving the leftmost bit to the first
place on the right results in another codeword of the code. There are four different
sequences, [0000000], [0001101], [1111111] and [0010111], that together make up all
the codewords when shifted cyclically.
We see from Table 3.2.2 that the minimum weight of the non-zero codewords is three, so this code
is always able to correct one error and detect two errors. This follows from the general property of linear codes that we stated in Section 2.5.1.
The parity check polynomial we get by dividing x^7 - 1 by g(x) according to Equation 3.1.6,
or with the help of the other factors given in Equation 3.2.2,
h(x) = (x + 1)(x^3 + x + 1) = x^4 + x^2 + x + x^3 + x + 1 = x^4 + x^3 + x^2 + 1.
Any error free codeword multiplied by h(x) should be divisible by x^7 + 1 according to
Equation 3.1.8. Taking as an example the codeword on the second row of the table,
c(x) = x^3 + x^2 + 1, and multiplying it by h(x), we get:
h(x) c(x) = (x^4 + x^3 + x^2 + 1)(x^3 + x^2 + 1) =
x^7 + x^6 + x^4 + x^6 + x^5 + x^3 + x^5 + x^4 + x^2 + x^3 + x^2 + 1 = x^7 + 1
When we divide this h(x) c(x) by x^7 + 1 we get the quotient 1 and zero remainder, so Equation 3.1.8
holds.
In the following example we construct the same code as in Example 3.2.1, but now in systematic form.
Example 3.2.2
If we want our code to be systematic, the procedure for generating the codewords is a little
more complicated. We want to construct a systematic code with block length n
= 7, so we find the prime factors as in Example 3.2.1 and take g(x) = x^3 + x^2 + 1 as the
generator polynomial of our code. This decision fixes the number of check bits
of the code, which equals the degree of the generator polynomial: n - k = 3 and k =
n - 3 = 4. First we multiply each information polynomial by x^(n-k) = x^3, i.e., shift the bits
three steps to the left. Then we divide the result by the generator polynomial, and the remainder p(x)
defines the three check bits of the codeword according to Equation
3.1.5. As an example let's take the information word i(x) = x^3 + 1, corresponding to the binary
sequence [1001]. Then we get
i(x) x^(n-k) = (x^3 + 1) x^3 = x^6 + x^3 [1001000]
Division of this by g(x) = x^3 + x^2 + 1 gives the quotient x^3 + x^2 + x + 1 and the remainder x + 1.
We add the remainder to the shifted information word x^6 + x^3 and get the codeword c(x) =
i(x) x^(n-k) + p(x) = x^6 + x^3 + x + 1, which corresponds to the codeword [1001011]. This code
is systematic and the first four bits equal the information bits. All codewords of this
systematic code are listed in Table 3.2.3.
Table 3.2.3 Systematic cyclic code (7, 4), g(x) = x^3 + x^2 + 1.

i(x)                 c(x)                             i       c
0                    0                                0000    0000000
1                    x^3 + x^2 + 1                    0001    0001101
x                    x^4 + x^2 + x + 1                0010    0010111
x + 1                x^4 + x^3 + x                    0011    0011010
x^2                  x^5 + x + 1                      0100    0100011
x^2 + 1              x^5 + x^3 + x^2 + x              0101    0101110
x^2 + x              x^5 + x^4 + x^2                  0110    0110100
x^2 + x + 1          x^5 + x^4 + x^3 + 1              0111    0111001
x^3                  x^6 + x^2 + x                    1000    1000110
x^3 + 1              x^6 + x^3 + x + 1                1001    1001011
x^3 + x              x^6 + x^4 + 1                    1010    1010001
x^3 + x + 1          x^6 + x^4 + x^3 + x^2            1011    1011100
x^3 + x^2            x^6 + x^5 + x^2 + 1              1100    1100101
x^3 + x^2 + 1        x^6 + x^5 + x^3                  1101    1101000
x^3 + x^2 + x        x^6 + x^5 + x^4 + x              1110    1110010
x^3 + x^2 + x + 1    x^6+x^5+x^4+x^3+x^2+x+1          1111    1111111

Looking at the table we see that this code is systematic and cyclic. The generator and
parity check polynomials are the same as in Example 3.2.1 and the codewords are the
same, i.e., the codes are equal. However, the information sequences are mapped differently
to the codewords. There are four different sequences, [0000000], [0001101],
[1111111] and [0010111], that make up all the codewords when shifted cyclically. We
see that the minimum weight of the non-zero codewords is three, so the error correction capability is
naturally the same as that of the non-systematic code.
We have defined the cyclic code with a generator polynomial, but we could use a generator matrix as
well. The relation between the generator polynomial and the generator matrix in systematic form is presented in the following section.

3.3. Generator matrix of a cyclic code


We can construct the generator matrix of a cyclic code in systematic form,
G = [I_k P], from the generator polynomial as follows [3, p 427 and 4, p 109]. The l-th row
of G corresponds to a polynomial of the form x^(n-l) + R_l(x), l = 1, 2, ..., k, where R_l(x) is a polynomial of degree less than n - k. This form can be obtained by dividing x^(n-l) by g(x). We get the
quotient Q_l(x) and the remainder R_l(x) = R_g(x)[x^(n-l)]. When we multiply the result by g(x) we
get

x^(n-l)/g(x) = Q_l(x) + R_l(x)/g(x),
x^(n-l) = Q_l(x) g(x) + R_l(x) = Q_l(x) g(x) + R_g(x)[x^(n-l)],   l = 1, 2, ..., k        3.3.1

where Q_l(x) is the quotient and R_g(x)[x^(n-l)] is the remainder of the division of x^(n-l) by
g(x). The desired polynomial corresponding to the l-th row of G (binary code) is

x^(n-l) - R_g(x)[x^(n-l)] = Q_l(x) g(x)        3.3.2

where x^(n-l) represents the element in the identity-matrix section and R_g(x)[x^(n-l)] represents the
parity check part of the matrix. Compare this with Equation 3.1.5 and imagine that i(x) has
only one non-zero element (the l-th bit); then we compute p(x) for the l-th row according to Equation
3.1.5, and the check part of that row becomes R_g(x)[x^(n-l)] = p(x). Note that x^(n-k) i(x) in Equation
3.1.5 corresponds to x^(n-l) of Equation 3.3.2 when i(x) contains only one non-zero bit.
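As a sketch of this construction, with polynomials again stored as integers and illustrative function names:

    def gf2_mod(a: int, g: int) -> int:
        # Remainder of a(x) / g(x) over GF(2).
        dg = g.bit_length() - 1
        while a and a.bit_length() - 1 >= dg:
            a ^= g << (a.bit_length() - 1 - dg)
        return a

    def generator_matrix(g: int, n: int):
        # Row l (l = 1..k) is x^(n-l) + R_g(x)[x^(n-l)], stored as an n-bit integer.
        k = n - (g.bit_length() - 1)
        return [(1 << (n - l)) | gf2_mod(1 << (n - l), g) for l in range(1, k + 1)]

    # Example 3.3.1 below: g(x) = x^3 + x + 1 (0b1011), n = 7
    for row in generator_matrix(0b1011, 7):
        print(format(row, "07b"))       # 1000101, 0100111, 0010110, 0001011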
Now we look at an example where we construct generator matrix in systematic form when the
generator polynomial for this cyclic code is known.
Example 3.3.1
Let us now take the generator polynomial
g(x) = x^3 + x + 1
This polynomial is used in GSM to protect the most sensitive bits of the speech frame. We
may construct a short cyclic code with, for example, block length n = 7 with this polynomial. First we have to check whether a cyclic code with block length 7 exists. For this
we divide x^7 + 1 by g(x) = x^3 + x + 1. The remainder is zero, so this cyclic code exists and
we may construct a short cyclic code (7, 4). This we actually did in Example 3.2.1 when we searched for the prime factors of x^7 + 1, see Equation 3.2.2. The code has 3
parity bits, because the degree (highest power) of the generator polynomial is 3. We
get the polynomial for each row l = 1, 2, 3, 4 when we divide x^(n-l) by g(x) and write the
row polynomial as x^(n-l) + R_g(x)[x^(n-l)], where R_g(x)[x^(n-l)] is the remainder of the division.


         x^(n-l) = Q_l(x) g(x) + R_g(x)[x^(n-l)]:        x^(n-l) + R_g(x)[x^(n-l)]:

l = 1:   x^6 = (x^3 + x + 1) g(x) + x^2 + 1              x^6 + x^2 + 1
l = 2:   x^5 = (x^2 + 1) g(x) + x^2 + x + 1              x^5 + x^2 + x + 1
l = 3:   x^4 = x g(x) + x^2 + x                          x^4 + x^2 + x
l = 4:   x^3 = g(x) + x + 1                              x^3 + x + 1

Now we may write the generator matrix for this short code as:

    1 0 0 0 1 0 1
G = 0 1 0 0 1 1 1
    0 0 1 0 1 1 0
    0 0 0 1 0 1 1

To make the connection between the generator polynomial and the matrix even clearer, we
take the information word i(x) = x^3 [1000]. The corresponding codeword equals
the uppermost row of the matrix. We get this codeword in systematic form according
to Equations 3.1.2 and 3.1.3 as
c(x) = x^(n-k) i(x) + p(x) = x^(n-k) i(x) + R_g(x)[x^(n-k) i(x)]
     = x^3 x^3 + R_g(x)[x^3 x^3] = x^6 + x^2 + 1 [1000101]
We see that this is the same as the uppermost row of the matrix. If the information sequence is [0100] we get the second row, etc.
In GSM the same generator polynomial as in Example 3.3.1 is used for error detection, but that
code is long: the block length is n = 53. The complete generator matrix of the (53, 50) cyclic code
would have 53 columns (the number of bits in a codeword) and 50 rows (the number of message bits),
where the first 50 columns would form an identity matrix and would not carry much information. This is why most telecommunication standards, which usually define quite long codes, use the generator polynomial expression instead of the matrix representation.
In the next section we look at the implementation of cyclic codes. It shows why cyclic
codes, especially those in systematic form, are so popular.

3.4. Implementation of cyclic codes


The encoding operations for generating a cyclic code may be performed by a linear feedback
shift register [3, p 429]. In the case of a systematic code, the first k bits are identical to the information bits and the parity bits are appended at the end of the information sequence. With this
systematic structure we need not store a long information sequence before we are able to generate the codeword.
3.4.1. Encoders for Cyclic Codes
The encoding operations for generating a cyclic code may be performed by a linear feedback
shift register that is designed according to the generator polynomial [3, p 430]. As shown in
the previous section, the generation of a systematic cyclic code involves three steps, namely
multiplying the message polynomial i(x) by x^(n-k), dividing the product by g(x) and finally
adding the remainder to x^(n-k) i(x), see Equation 3.1.2. Of these steps, only the division is nontrivial.
Figure 3.4.1 presents a general division circuit where the divisor is a general generator polynomial of the form

g(x) = g_(n-k) x^(n-k) + g_(n-k-1) x^(n-k-1) + ... + g_1 x + g_0        3.4.1

In the binary case the coefficients take only the values zero (connection open) or one (connection closed) in
Figure 3.4.1. The shift register has n-k stages that initially contain all zeros. The coefficients
of the information polynomial are clocked into the shift register one coefficient (bit) at a time, beginning with the highest order coefficient. In the binary case adding and subtracting are equal and are performed by mod-2 adders.
Figure 3.4.1 A feedback shift register for division of x^(n-k) i(x) by the generator polynomial g(x). (Figure not reproduced: the input i(x) x^(n-k) feeds an (n-k)-stage register with feedback taps -g_0, -g_1, ..., -g_(n-k-1), -g_(n-k); the quotient appears at the output.)

After n-k shifts (equal to the number of redundant bits of the code) the first term appears at the
output and is multiplied by -g_(n-k), giving the first term of the quotient. Then, as in ordinary long
division, the divisor g(x) multiplied by that term of the quotient is subtracted and the next shift
takes place. When all coefficients have been clocked into the shift register (k shifts) it contains the
remainder.
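The operation of this circuit can be imitated bit by bit in software. The following sketch assumes the register layout of Figure 3.4.1 (stages ordered so that the oldest bit leaves from the right); it reproduces the division of Example 3.4.1 below:

    def lfsr_divide(bits, g_taps):
        # Divide the MSB-first bit sequence by g(x), taps [g_0, ..., g_(n-k)].
        # Returns the quotient bits and the final register contents (the
        # remainder, lowest order coefficient first).
        m = len(g_taps) - 1                 # m = n-k register stages
        reg = [0] * m
        quotient = []
        for b in bits:
            out = reg[-1]                   # bit leaving the register = quotient bit
            quotient.append(out)
            # shift one step while feeding back out * g(x) (mod 2)
            reg = [b ^ (out & g_taps[0])] + \
                  [reg[i - 1] ^ (out & g_taps[i]) for i in range(1, m)]
        return quotient, reg

    # Example 3.4.1: A(x) = x^14+x^12+x^11+x^10+x^8+x^4+x^3+x, g(x) = x^4 + x + 1
    A = [1,0,1,1,1,0,1,0,0,0,1,1,0,1,0]
    q, rem = lfsr_divide(A, [1, 1, 0, 0, 1])
    print(q[4:])     # [1,0,1,0,0,1,0,0,1,1,1] = x^10 + x^8 + x^5 + x^2 + x + 1
    print(rem)       # [1, 1, 0, 0] = x + 1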
Let us now take an example where we use shift register to divide one polynomial (binary
word) by another polynomial.
Example 3.4.1
Let's divide a polynomial A(x) by the polynomial g(x) [10, p 46]:
A(x) = x^14 + x^12 + x^11 + x^10 + x^8 + x^4 + x^3 + x, corresponding to [101110100011010]
g(x) = x^4 + x + 1, corresponding to [10011]
Figure 3.4.2 A circuit that divides by g(x) = x^4 + x + 1. (Figure not reproduced: a four-stage register with feedback taps -g_0 = 1, -g_1 = 1, -g_2 = -g_3 = 0, -g_4 = 1; A(x) is the input and the quotient is the output.)

For comparison, the same long division with the coefficients of the polynomials, g(x) = x^4 + x + 1 = [10011]:

            1 0 1 0 0 1 0 0 1 1 1              Q(x) = x^10 + x^8 + x^5 + x^2 + x + 1
          -------------------------------
1 0 0 1 1 | 1 0 1 1 1 0 1 0 0 0 1 1 0 1 0      A(x) = x^14 + x^12 + x^11 + x^10 + x^8 + x^4 + x^3 + x
            1 0 0 1 1
            ---------
            0 1 0 0 0    1)
            0 0 0 0 0
            ---------
            1 0 0 0 1    2)
            1 0 0 1 1
            ---------
            0 0 1 0 0    3)
            0 0 0 0 0
            ---------
            0 1 0 0 0    4)
            0 0 0 0 0
            ---------
            1 0 0 0 0    5)
            1 0 0 1 1
            ---------
            0 0 1 1 1    6)
            0 0 0 0 0
            ---------
            0 1 1 1 1    7)
            0 0 0 0 0
            ---------
            1 1 1 1 0    8)
            1 0 0 1 1
            ---------
            1 1 0 1 1    9)
            1 0 0 1 1
            ---------
            1 0 0 0 0    10)
            1 0 0 1 1
            ---------
            0 0 1 1  = remainder, x + 1
The contents of the shift register change shift by shift. Every time the quotient bit is
one, g(x) is added (subtracted) to the data in the shift register, just as in the long
division. When the last term of A(x) has entered the shift register and the last subtraction is
done, the shift register contains the remainder.
To see how the shift register operates, we shift the sequence A(x) into the register starting with the highest order coefficient (leftmost bit). After four shifts the input bit and the contents of the shift register are, from left to right, 11101. When the next shift takes place,
the first bit of the quotient is read out, and the next input bit and contents of the register
have become 00010. This is the same result as the first subtraction 1) in the long division above (in reversed order, because the oldest bit is on the right). After the next shift we have
10001 as in the long division step 2), etc. The process continues until the last bit has been read into the register and the shift register contains the remainder.
Usually we use systematic cyclic codes and we are only interested in generating the parity check
bits for each codeword. The encoder then takes the form of Figure 3.4.3. The first k bits at the
output of the encoder are simply the k information bits. These k bits are simultaneously clocked into the shift register, since switch 1 is in the closed position. Note that the generator polynomial always contains the terms x^0 = 1 and x^(n-k), and thus the first and the last connections always
exist. Note also that the multiplication of i(x) by x^(n-k) is not performed explicitly, because the
adder has been moved to the end of the shift register, avoiding those unnecessary shifts.

-g0 = 1

-g1

-g

-g2

n-k-1

-g = 1
n-k

c0

c1

c2

cn-k-1

Parity check
bits
Message
n-k
i(x) x

Figure 3.4.3 Encoder for a systematic cyclic code.

When all k information bits have been clocked into the encoder, the positions of the two switches are
reversed. At this time the shift register contains the n-k parity check bits, which correspond to
the coefficients of the remainder polynomial. After the k information bits these n-k check bits are
clocked out.
Note that the dividing circuitry of Figure 3.4.3 works properly only if there are n-k zeros at the
end of the word to be divided. These zeros are not fed into the circuitry; instead the division is
stopped, the feedback switch position is changed, zeros appear in the feedback, and the check bits are
read out. This is optimal for generating systematic codewords. The general division circuitry of
Figure 3.4.1 works for any data sequence.
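A bit-level sketch of this encoder (assumed register layout, with the feedback adder at the end of the register as in Figure 3.4.3; names are illustrative):

    def systematic_encode(info_bits, g_taps):
        # Parity bits for generator taps [g_0, ..., g_(n-k)]; switch closed
        # while the k message bits pass, then the register is clocked out.
        m = len(g_taps) - 1
        reg = [0] * m
        for b in info_bits:
            fb = b ^ reg[-1]                # adder moved to the register end
            reg = [fb & g_taps[0]] + \
                  [reg[i - 1] ^ (fb & g_taps[i]) for i in range(1, m)]
        return list(info_bits) + reg[::-1]  # highest order parity bit first

    # Example 3.4.2 below: (7, 4) code, g(x) = x^3 + x + 1 -> taps [1, 1, 0, 1]
    print(systematic_encode([0, 0, 0, 1], [1, 1, 0, 1]))   # [0,0,0,1,0,1,1]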
Let us now take an example of an encoder used in a telecommunication system.
3.4.2. Application, error detection encoder in GSM
The first and most sensitive 50 bits of the speech data are protected by an error detecting block code
before error correction encoding. The code implemented is a (53, 50) shortened systematic
cyclic code with the generator polynomial

g(x) = x^3 + x + 1        3.4.2

We can show that this code is not an original cyclic code by checking whether g(x) = x^3 + x + 1 divides x^53 + 1 [4, p 98]. We would notice that it does not, so no cyclic code with block length
n = 53 exists for this generator polynomial. The original cyclic code could be found by dividing x^54 + 1, x^55 + 1, etc., by g(x); when the division succeeds, we have found the block
length of the original code. However, we may adapt the block length of a cyclic code to our system
by shortening the code, as presented below.
Shortening a code means that we set the first l information bits to zero (i.e., reduce k by l) and calculate the check bits as in the original code [3, p 421]. Because the first l information bits are always
zero, we need not transmit them (we drop l rows and columns from the generator matrix) and the
codewords are shortened by l. In the decoder we add the l zero bits back before decoding.
A shortened code has at least the same minimum distance dmin as the original code, and the error detection and correction capabilities are the same or even better [4, p 153]. Shortening reduces the number of
codewords, which may increase the minimum distance. A shortened cyclic code is no longer cyclic, because a cyclic shift of a (shortened) codeword does not necessarily give another codeword. However, in the decoder we may first add the l zero bits to restore the cyclic code
and perform decoding in the same way as for the original code. The disadvantage of shortening is that the code rate decreases.
The error detection encoder of this (53, 50) shortened cyclic block code is displayed in Figure
3.4.4 [8, p 711]. The most sensitive bits of the speech frame are, in addition to the error correction,
protected by this error detection code.

Figure 3.4.4 The Class 1a (53, 50) systematic, cyclic block code encoder. (Figure not reproduced: Class 1a speech data enters in 50-bit blocks; the switch is closed during cycles 1...50 and open during cycles 51...53, so each 53-bit output block consists of the first 50 bits followed by the last 3 parity bits.)

Observe that the taps of the linear shift register encoder are allocated at the positions specified
by the generator polynomial; the shift register divides by the polynomial. Since the systematic
encoding rule has been adopted (the first k bits are equal to the information bits), after the first fifty
clock pulses the switch is opened and the three parity bits (the remainder) are ready to leave the encoder. Note that the first information bit corresponds to the coefficient of the highest order
term of the message polynomial, and the parity check bits are appended after the lowest order coefficients.
Another example of a shortened cyclic code is the (15, 10) shortened Hamming code used in Bluetooth (generator g(x) = x^5 + x^4 + x^2 + 1) [16, p 68].
Example 3.4.2
In order to clarify the operation of this simple encoder, we assume that the encoder generates
the short (7, 4) code given in Example 3.3.1. This means that instead of 50 we have 4 information bits and 3 parity bits in each codeword.
Let's take the information word i = [0001], which corresponds to the polynomial
i(x) = 0 x^3 + 0 x^2 + 0 x^1 + 1 x^0 = 1
With the help of the generator matrix in Example 3.3.1 the codeword becomes
c = i G = [0001011], which corresponds to the codeword polynomial
c(x) = 0 x^6 + 0 x^5 + 0 x^4 + 1 x^3 + 0 x^2 + 1 x^1 + 1 x^0 = x^3 + x + 1
We may use the polynomial expression and see that the information sequence is followed
by the remainder of the long division, where the information sequence is first multiplied by
x^(n-k) = x^3, which gives x^0 x^3 = x^3, and then divided by the generator polynomial g(x) = x^3 + x + 1,
which gives p(x) = x + 1 as the remainder. The transmitted codeword is the sum of
these, i.e., c(x) = x^3 + x + 1 [3, p 431], [1, p 491].
Let's now look at how the shift register in Figure 3.4.4 performs this process. The four
information bits are first sent to the output as they are. When the fourth shift has taken
place, the contents of the stages of the shift register are 110 from left to right. Then the
last information bit has been read out and the switch is opened. The parity
bits are then read out according to Table 3.4.1.
Table 3.4.1 Contents of the shift register in Figure 3.4.4.

Information    Number      Contents of the shift register:        Encoded
sequence       of shift    Stage 1    Stage 2    Stage 3          data
0              1           0          0          0                0
0              2           0          0          0                0
0              3           0          0          0                0
1              4           1          1          0                1
-              5           0          1          1                0
-              6           0          0          1                1
-              7           0          0          0                1

The complete generated codeword thus becomes c = [0001011], corresponding to the polynomial c(x) = x^3 + x + 1. This is the same result we obtained previously with the help of the
generator matrix and the generator polynomial.

For a linear cyclic code the largest possible minimum distance is (Equation
2.8.1, the Singleton bound, valid for any linear block code)

dmin <= 1 + n - k        3.4.3

This follows from the property of a linear code that the minimum weight of a non-zero codeword
equals the minimum distance of the code: the smallest weight non-zero information
word has at least one non-zero element, and even if all n-k parity bits are non-zero, the weight of the
word is 1 + (n-k). Hence the minimum Hamming weight (and the minimum distance) cannot be
larger than 1 + n - k. The only binary code that achieves equality in Equation 3.4.3 is the repetition
code.
With the help of this we know that the error detection code of this application example, with
n-k = 3, may fail if there are more than three errors (dmin <= 4), i.e., error detection does not necessarily work. However, the distance between two non-zero systematic codewords whose message parts are single-bit sequences is only two if their parity check parts are equal. For a larger distance at least one
parity bit should differ between such words. This makes us suspect that the
minimum weight of this code is not larger than three, which would mean that error detection may fail
if there are more than two errors in a codeword.
To examine this argument we may calculate the codewords of the short (7, 4) code using this same
generator polynomial (Problem 3.1.3). If we look at these codewords (which are actually the
same as in Example 3.2.1), we notice that the minimum distance is indeed three. Consider the k message sequences with only a single non-zero bit; the distance
between the message parts of any two of them is two. For a minimum distance of three, each such pair should differ in at least one
parity bit, but there are only 7 different parity bit sequences with at least one non-zero element.
In a long code the parity sequence must therefore be the same for some message sequences with a single
non-zero element, which gives a minimum distance of two.
We can carry this evaluation further. Assume that the difference between two information sequences is only one bit, one having a single non-zero element and the other two non-zero elements (one of them shared). If the block length is 7 or shorter, we can keep the
distance between such words at three, as Hamming codes do. If the code is much longer
than 7, as the GSM code of Figure 3.4.4 is, we have very many information sequences at distance one from each other. If there are so many that we have to use the same parity bit sequence for two information sequences at distance one, the minimum Hamming distance is only one.
Even though the (53, 50) code discussed above is weak, it provides an improvement in GSM with a
small amount of redundant information, and it was found acceptable for checking whether the preceding error correction has failed.

3.5. Syndrome Decoding of Cyclic Codes


We already introduced the principle of minimum distance or maximum likelihood decoding
with the help of syndromes in Section 2.7. Now we look at the syndrome decoding method for
cyclic codes.

Let us assume that hard-decision decoding is in use and the received binary codeword is
passed to the decoder. The task of the decoder is to decide which error-free codeword has the
smallest Hamming distance to the received word. This principle gives the minimum probability
of a codeword error.
In the decoder we could mod-2 add the received codeword to all possible error-free codewords [3, p 446]. This would result in error vectors, and the error vector with the smallest weight
would tell which codeword has the minimum distance to the received one. This is one
possible but inefficient way to implement the minimum-distance decoding rule.
A more efficient method is to use a syndrome polynomial in the same way as we used the syndrome
vector in Section 2.7. For decoding we define the syndrome polynomial s(x)
as the remainder of the received codeword polynomial c'(x) = c(x) + e(x) divided by g(x) [4, p 99]:

s(x) = R_g(x)[c'(x)] = R_g(x)[c(x) + e(x)] = R_g(x)[e(x)]        3.5.1

We see that, just as the syndrome vector of Equation 2.7.5 depended only on the error
vector, the syndrome polynomial does not depend on c(x) or i(x) but only on e(x), the
error polynomial corresponding to an error vector. We see from Equation 3.5.1 that if g(x)
divides c'(x), the syndrome is zero and the error polynomial is zero, so there are no errors, or at least
c'(x) is equal to some codeword of the code.
The principle of syndrome decoding is the same as presented previously for general linear
block codes. We list the error polynomials (vectors) with the smallest number of non-zero coefficients and compute the syndrome polynomial for each error polynomial. The decoder then looks up
the decoding table and inverts the bits of the received word at the places of the non-zero bits of the
error vector, i.e., calculates c'(x) + e(x) = c_r(x).
A decoder sketch is given below; Example 3.5.1 then illustrates the process using the polynomial representation.
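A single-error table look-up decoder can be sketched as follows (integer polynomials, illustrative names):

    def gf2_mod(a: int, g: int) -> int:
        # Remainder of a(x) / g(x) over GF(2).
        dg = g.bit_length() - 1
        while a and a.bit_length() - 1 >= dg:
            a ^= g << (a.bit_length() - 1 - dg)
        return a

    def syndrome_table(g: int, n: int):
        # Map syndrome -> single-bit error polynomial e(x) = x^j.
        return {gf2_mod(1 << j, g): 1 << j for j in range(n)}

    def decode(received: int, g: int, table: dict) -> int:
        # Correct a single error: c_r(x) = c'(x) + e(x).
        s = gf2_mod(received, g)
        return received ^ table.get(s, 0)   # zero syndrome: word unchanged

    g, n = 0b1101, 7                         # g(x) = x^3 + x^2 + 1
    table = syndrome_table(g, n)
    # Example 3.5.1: received [0001100] = x^3 + x^2 -> corrected [0001101]
    print(format(decode(0b0001100, g, table), "07b"))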
Example 3.5.1
The principle of syndrome decoding is the same as presented previously with the help
of the matrix representation. Let's take the same (7, 4) code as in the previous Examples 3.2.1 and 3.2.2. This code is a cyclic Hamming code: a Hamming code is cyclic if
its block length is 2^m - 1 and it has m = n-k parity bits, where m is a positive integer [3, p 433]
[4, p 111]. We use the same generator polynomial as in Example 3.2.1:
g(x) = x^3 + x^2 + 1
The error polynomials with a single error are 1, x, x^2, x^3, x^4, x^5 and x^6. We divide each of
these by g(x) and get the corresponding syndromes. Table 3.5.1 presents all syndromes of
this code in polynomial and binary vector representation.

Table 3.5.1 Syndromes of a (7, 4) cyclic code, g(x) = x^3 + x^2 + 1.

s(x)           e(x)     s       e
0              0        000     0000000
1              1        001     0000001
x              x        010     0000010
x^2            x^2      100     0000100
x^2 + 1        x^3      101     0001000
x^2 + x + 1    x^4      111     0010000
x + 1          x^5      011     0100000
x^2 + x        x^6      110     1000000

Let us now assume that the received word is [0001100], corresponding to the polynomial
c'(x) = x^3 + x^2. The syndrome decoder divides c'(x) by the generator polynomial and the remainder, the syndrome polynomial, is 1. The second line of the table tells that the rightmost bit is in error, so the decoder inverts it. The corrected word is then [0001101], which
corresponds to the information bits [0001] of the code, see Table 3.2.2. This was most
probably the encoded word, and the single error is corrected.
Note that if the code has more non-zero syndromes than correctable single-error patterns, we
add lines to the table so that it includes all 2^(n-k) different syndromes. For that we add double-error patterns that give syndromes not yet on the list. The principle is exactly the same as in Section 2.7.2 where we constructed the standard array.
Note also that the syndromes and the corresponding error vectors are the same for non-systematic and systematic codes generated by the same generator polynomial, because the set of codewords is
the same. The only difference is how the resulting codewords are mapped to information sequences.
Implementation of the Syndrome Calculator
The decoder has to find the syndrome for the table look-up. To derive the syndrome, the decoder
divides the received codeword c'(x) by the generator polynomial g(x). This we may perform with a
shift register, as we did previously for the generation of the parity check bits.
The coefficients of the received polynomial c'(x) (or the bits of the received binary vector) are
shifted into the (n-k)-stage shift register shown in Figure 3.5.1. Initially the content of the shift
register is all zeros and the feedback switch is closed.

Figure 3.5.1 An (n-k)-stage shift register for syndrome computation. (Figure not reproduced: the received codeword enters a register with stages s_0, s_1, ..., s_(n-k-1) and feedback taps g_0, g_1, g_2, ..., g_(n-k-1), g_(n-k); the output syndrome is clocked out once the word has been shifted in.)

When the entire n-bit received word has been shifted into the register, the contents of the n-k stages constitute the syndrome. The switch position is then changed and the syndrome is clocked out. This syndrome is
used for the syndrome table look-up to identify the most likely error vector.
Example 3.5.2
Let us consider the syndrome computation for the (7, 4) cyclic (Hamming) code generated by the polynomial g(x) = x^3 + x^2 + 1. Suppose that the received word polynomial is
c'(x) = x^6 + x^2, i.e., the vector c' = [1000100]. This is clocked into the shift
register in Figure 3.5.2; note that the highest order bit is shifted in first. After seven
shifts the content of the shift register is 010, which corresponds to the syndrome polynomial s(x) = x, or the vector s = [010] (note the bit order: the highest order bit of the vector is
on the left, but in the shift register on the right hand side). We get the same result if we use
long division to divide the received polynomial by the generator polynomial.
The received codeword [1000100] is clocked in, and the register contents develop as follows:

Input    Shift    Register contents s_0 s_1 s_2
                  0 0 0
1        1        1 0 0
0        2        0 1 0
0        3        0 0 1
0        4        1 0 1
1        5        0 1 1
0        6        1 0 0
0        7        0 1 0

Figure 3.5.2 Syndrome computation for the (7, 4) cyclic code,
g(x) = x^3 + x^2 + 1, and the received vector c' = [1000100].

We can now look at the syndrome decoding table, Table 3.5.1, and find the error polynomial e(x) = x, or vector e = [0000010], for the syndrome s(x) = x, s = [010].
Then we add this to the received word and get c_r(x) = c'(x) + e(x) = x^6 + x^2 + x, or c_r =
[1000110]. This is an error free word of the systematic (7, 4) cyclic (Hamming) code
in Table 3.2.3, and it corresponds to the information word i(x) = x^3, i.e., i = [1000].
The table look-up decoding method using the syndrome is practical only when n-k (the number of
redundant or parity bits) is not very large, e.g., n-k < 10 [3, p 451]. The table look-up method may
be impractical for many interesting and powerful modern codes with a large number of redundant check bits. For example, if n-k = 20, the table has 2^20, i.e., approximately one million,
entries.
More efficient and practical decoding algorithms have been developed for a class of cyclic
codes, especially for the BCH codes that are introduced in the following section.

3.6. Some Popular Cyclic Codes


In this section we introduce a few important cyclic codes.
3.6.1. Hamming Codes
The class of cyclic codes includes the Hamming codes. Their block length is n = 2^m - 1 and
they use n-k = m parity bits [3, p 433], where m is any positive integer. The cyclic Hamming
codes are the same codes we described in Sections 2.3.4 and 2.6. Hamming codes are the subset
of BCH codes with k = 2^m - 1 - m and error correction capability t = 1.
3.6.2. Bose-Chaudhuri-Hocquenghem (BCH) Codes
The BCH codes comprise a large class of cyclic codes that include both binary and nonbinary
codes [3, p 436]. Binary BCH codes may be constructed with the parameters [17, p 455; 18, p 81]

n = 2^m - 1
n - k = mt    (in some special cases n-k may be smaller than mt)
dmin = 2t + 1        3.6.1

where m (m >= 3) and t are any positive integers. The maximum number of errors that the code
is always able to correct is t. This class of binary cyclic codes provides the communication
system designer with a large selection of block lengths and code rates. Nonbinary BCH codes
include the powerful Reed-Solomon codes, described for example in [3, p 437], [4, p 174] and
[8, p 423]; they are used in CD-ROMs, Digital Video Broadcasting (DVB) systems and in hard disks.
The generator polynomials of BCH codes can be constructed from the factors of
x^n - 1, just as we constructed the generator polynomial of any cyclic code in Section 3.1, but now
the block length must satisfy n = 2^m - 1. Some BCH codes were given in Table 3.2.1,
and many more are listed in [3, p 437] with their generator polynomials, together with the values
of n and k and their error correction capability.
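Under the assumption n - k = mt of Equation 3.6.1, the available parameters are easy to enumerate, as in this small sketch:

    # Enumerate binary BCH code parameters from Equation 3.6.1, assuming
    # n - k = mt (true for most, but not all, BCH codes).
    for m in range(3, 8):
        n = 2**m - 1
        for t in (1, 2, 3):
            k = n - m * t
            if k > 0:
                print(f"({n}, {k})  t = {t}  Rc = {k / n:.2f}")
    # The t = 2 and t = 3 rows reproduce, e.g., (15, 7), (31, 21) and
    # (63, 45) of Table 3.2.1.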
One application example of binary BCH codes is the Radio Link Protocol (RLP) of GSM, which
uses a shortened (240, 216) code (the original BCH code (255, 231) shortened by 15 bits, m = 8, t =
3) [20, p 75]. (RLP is used for non-transparent (ARQ) radio connections.)
Another application of a BCH code is DVB-T. It uses a shortened BCH (67, 53) code with t =
2, derived from the original systematic BCH (127, 113) code, with the generator polynomial
g(x) = x^14 + x^9 + x^8 + x^6 + x^5 + x^4 + x^2 + x + 1
The shortened BCH code is implemented by adding 60 bits, all set to zero, before the information bits of a BCH (127, 113) encoder. After encoding these null bits are discarded,
leading to a BCH codeword of 67 bits.

3.6.3. Cyclic (23, 12) Golay Code
The Golay code is a binary linear (23, 12) code with minimum distance dmin = 7 [3, p 423].
This cyclic code is generated by means of the generator polynomial [3, p 433]

g(x) = x^11 + x^9 + x^7 + x^6 + x^5 + x + 1        3.6.2

3.7. Cyclic Codes for Burst Error Correction


Most error control codes are designed to correct any random error pattern of t errors [4, p
114]. This is not always optimal, because in many channels errors occur in bursts. If there is
only a need to correct t errors during a short time interval, we may obtain a more efficient, i.e.,
higher rate, code. Sometimes we prefer interleaving of subsequent codewords to improve burst error performance, but interleaving increases delay and cannot be used when frames are not
transmitted regularly.
3.7.1. Error bursts
A cyclic error burst of length t is a vector whose non-zero components are among t cyclically
successive components, the first and the last of which are non-zero [4, p 114].
We can describe a burst error pattern as

e(x) = x^i b(x)  (mod x^n - 1),    i = 0, ..., n-1        3.7.1

where b(x) is a polynomial of degree t-1 with lowest order coefficient 1, i.e., the
error burst starts with a bit in error and ends with the last bit in error. The polynomial b(x) describes the burst error pattern. The length of the codeword is n, and the location of the burst is
given by x^i, which defines the place of the lowest order bit of the error burst. The reduction modulo x^n - 1,
expressed as (mod x^n - 1) in Equation 3.7.1, gives the cyclic shift property from the beginning
of the codeword to its end, i.e., it moves higher order bits to the lower order end of the codeword when i is increased and the error burst is shifted left.
3.7.2. Cyclic codes for correction of error bursts
A cyclic code for correcting burst errors must have a unique syndrome polynomial for each
error pattern that the code is to correct. As we saw in Section 3.5, the syndrome is computed as

s(x) = R_g(x)[e(x)]        3.7.2

If the syndrome is different for all error bursts of Equation 3.7.1 with the length of b(x) at most
t, the code is able to correct all error bursts of length t or less.
Example 3.7.1
For a binary (15, 9) cyclic code we may use the generator polynomial [4, p 114]
g(x) = x^6 + x^3 + x^2 + x + 1
There are n-k = 6 parity bits in this code. According to Equation 3.7.1 we can write all
cyclic error bursts of length t = 3 or less, and they are:
e(x) = x^i                                  i = 0, . . . , 14
e(x) = x^i (x + 1)        (mod x^15 - 1),   i = 0, . . . , 14
e(x) = x^i (x^2 + 1)      (mod x^15 - 1),   i = 0, . . . , 14
e(x) = x^i (x^2 + x + 1)  (mod x^15 - 1),   i = 0, . . . , 14

With the help of Equation 3.7.2 we see that the degree of s(x) is at most the degree of g(x) minus
one, i.e., the syndrome is a sequence of six bits, and thus there are 2^6 = 64 different syndromes, including the all-zero syndrome.
It is straightforward to verify, even though it takes some time, that the syndromes of
all the error vectors above are different, and thus this cyclic code is able to correct all error bursts
of length 3 or less. Compare this code with the (15, 7) BCH code and the (15, 11)
Hamming code in Table 3.2.1.
Note that a codeword plus a correctable burst error cannot be equal to another codeword plus a
correctable error burst. If this were the case, the decoder would not know which of the
two error bursts it should correct. In general, a linear code that can correct all bursts of length t
or less cannot have a burst of length 2t or less as a codeword! This means that no codeword may
have all its non-zero bits within a span of 2t cyclically successive positions.
The following theorem, the Rieger bound, is valid for all block codes, not only cyclic codes:
A linear block code that corrects all bursts of length t or less must have at least 2t parity symbols.
To see that this is plausible, imagine two systematic codewords, the all-zero word and one with
only the rightmost information bit non-zero. When n-k = 4 those words could be
00000000
00011111
Error correction fails if, after an error burst, the word in error is no longer closest to the original error-free word. If two errors occur, i.e., t = 2, error correction still works properly:
after two errors the word is still closer to the original word than to the other one. If t = 3 the Rieger bound is
exceeded and error correction may fail, for example if the last three bits are in error.
The Rieger bound gives only the minimum number of parity bits: we cannot design a code
with fewer than 2t parity bits that corrects t-bit long error bursts. Whether we can manage with 2t
bits or need more depends on how we define the code, i.e., how we compute the parity bits.
See Table 3.7.1 and compare the number of check bits with the length of the error burst that each
code can correct.
The best known burst-correcting codes are cyclic codes. A number of good burst error correcting cyclic codes over GF(2) have been found by computer search. Some of them are listed in
Table 3.7.1. We see that many short codes in Table 3.7.1 meet the Rieger bound, but longer codes
require more parity bits.

Table 3.7.1 Some binary burst-correcting cyclic codes [4, p 115].

Generator polynomial                              Parameters     Maximum length of error burst
x^4 + x^3 + x^2 + 1                               (7, 3)         2
x^5 + x^4 + x^2 + 1                               (15, 10)       2
x^6 + x^5 + x^4 + x^3 + 1                         (15, 9)        3
x^6 + x^5 + x^4 + 1                               (31, 25)       2
x^7 + x^6 + x^5 + x^3 + x^2 + 1                   (63, 56)       2
x^8 + x^7 + x^6 + x^3 + 1                         (63, 55)       3
x^12 + x^8 + x^5 + x^3 + 1                        (511, 499)     4
x^13 + x^10 + x^7 + x^6 + x^5 + x^4 + x^2 + 1     (1023, 1010)   4

Interleaving to improve performance in the case of error bursts


We can construct longer and better burst error correcting codes with the help of a technique
called interleaving. To get a (jn, jk) code from an (n, k) code we take j subsequent codewords and
transmit the first symbols of all codewords first, then the second symbols, etc. Now if the original (n,
k) code can correct bursts of t errors, the interleaved (jn, jk) code can correct error bursts of length jt.
Example 3.7.2
We want to design a code that can always correct 8-bit long bursts. For this we can
take the (31, 25) code from Table 3.7.1, which can correct error bursts up to length 2. By
interleaving subsequent groups of four codewords (j = 4) we get a (124, 100) code that can
correct error bursts up to length 8. The Rieger bound says that for t = 8 the number of check
bits must be at least 2t = 16, but this practical code uses 24 check bits.
If the original code is cyclic then the interleaved code is cyclic as well [4, p 116].
This principle of interleaving is used in many telecommunication applications, for example in
GSM radio transmission.
3.7.3. Fire code
The Fire codes are burst correcting cyclic codes with the following property [4, p 116]:
A Fire code is a cyclic burst-correcting code over GF(2) with the generator polynomial

g(x) = (x^(2t-1) - 1) p(x)        3.7.3

where p(x) is a prime polynomial over GF(2), from Table 2.4.6, whose degree m is not smaller
than t and which does not divide (x^(2t-1) - 1). If a p(x) of degree m divides (x^(2t-1) - 1), we take a higher degree polynomial. The block length of the Fire code is the smallest integer n such that g(x) divides (x^n - 1). A Fire code is able to correct error bursts of length t or less.
The structure of any Fire code is

((q^m - 1)(2t - 1), (q^m - 1)(2t - 1) - m - 2t + 1)        3.7.4

where q = 2 for binary codes,
t is the maximum length of a correctable error burst, and
m is the degree of the prime polynomial p(x) that is used to construct this code. The
degree m is greater than or equal to t.
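These formulas are easy to evaluate in code; the sketch below (illustrative names; p(x) is supplied by the caller as an integer polynomial) reproduces Example 3.7.3 that follows:

    def fire_code_params(m: int, t: int, q: int = 2):
        # (n, k) = ((q^m-1)(2t-1), (q^m-1)(2t-1) - m - 2t + 1), Equation 3.7.4.
        n = (q**m - 1) * (2*t - 1)
        return n, n - m - (2*t - 1)

    def fire_generator(p: int, t: int) -> int:
        # g(x) = (x^(2t-1) + 1) p(x) over GF(2), Equation 3.7.3.
        factor = (1 << (2*t - 1)) | 1        # x^(2t-1) + 1
        g = 0
        for bit in range(p.bit_length()):    # carry-less multiplication
            if (p >> bit) & 1:
                g ^= factor << bit
        return g

    # Example 3.7.3: t = m = 10, p(x) = x^10 + x^3 + 1 -> (19437, 19408)
    print(fire_code_params(10, 10))
    print(bin(fire_generator((1 << 10) | (1 << 3) | 1, 10)))
    # g(x) = x^29 + x^22 + x^19 + x^10 + x^3 + 1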

The following example illustrates how we design a Fire code for a desired application.
Example 3.7.3
As an example of a binary Fire code, we choose t = m = 10 if we want to correct all
error bursts of length 10 or shorter. The prime polynomial of degree 10 we get from Table
2.4.6:
p(x) = x^10 + x^3 + 1
The polynomial p(x) must not divide x^(2t-1) - 1 = x^19 - 1. If it divides, we have to take a higher
order p(x) and try again. Here the division is not successful, and the generator polynomial
becomes, according to Equation 3.7.3,
g(x) = (x^19 - 1) p(x) = x^29 + x^22 + x^19 + x^10 + x^3 + 1
From Equation 3.7.4 we get the parameters of this code:
((q^m - 1)(2t - 1), (q^m - 1)(2t - 1) - m - 2t + 1);
((2^10 - 1)(20 - 1), (2^10 - 1)(20 - 1) - 10 - 20 + 1);
(19437, 19408)
This code is able to correct error bursts up to 10 errors long, and its code rate is Rc = 0.9985.
Compare the number of check bits with the minimum given by the Rieger bound.
As we noticed in the example above, the Fire codes have a very high rate. The redundancy n-k
is smallest, and the code rate highest, when m equals t. We get from Equation 3.7.4 that in
this best case the number of redundant bits of a Fire code is 3t - 1, which exceeds the Rieger bound
(2t) of Section 3.7.2 by t - 1.
One application example is GSM signaling over the radio interface, where a shortened Fire code
with the generator g(x) = (x^23 + 1)(x^17 + x^3 + 1) is used [9, p 244]. Signaling messages are transmitted as separate frames only when needed, so interleaving of subsequent frames cannot be
used, and a very good burst error correcting code is needed for protection against error bursts. The Fire code is shortened to a (224, 184) code.
By interleaving Fire codes we get the best high-rate burst-correcting codes known [4, p 119].

Problems
Problem 3.1.1: a) Is there a cyclic code (4, k) with generator polynomial g(x) = x + 1? What is
the value of k? Calculate the codewords of b) the non-systematic and c) the systematic code. d)
Show that these codes are cyclic codes. How are codewords constructed by cyclic shifts of
other codewords?
Problem 3.1.2: What are the block lengths of the two shortest cyclic codes with the generator
polynomial g(x) = x^2 + x + 1? a) What are these codes (n, k)? b) What are the codewords in
non-systematic form? c) Do they have the cyclic property, i.e., can we obtain every codeword by cyclically shifting some other codeword of the code?
Problem 3.1.3: Find the shortest cyclic code for the generator polynomial
g(x) = x^3 + x + 1. a) What are the block length, the number of information bits and the number of redundant bits
in a codeword, and what is the code rate? b) Write the codewords for all information vectors
in non-systematic and c) in systematic form. d) Are these codes equal (the same set of codewords)? e) Is each information sequence encoded into the same codeword in both cases b and
c? f) How many errors can this code always correct?
Problem 3.1.4: Use g(x) = x^3 + x + 1 to generate a (6, 3) code. Write all codewords in non-systematic form and show whether this is a cyclic code (perform all cyclic shifts of all codewords).
Check also whether the given g(x) can be used to generate a cyclic code of length 6.
Problem 3.2.1: a) Find all non-trivial prime factors of x^7 - 1. b) How many different cyclic
codes are there with block length 7, and what are their generator polynomials? What are
these codes (n, k)?
Problem 3.2.2: Are a) the two (7, 4) codes and b) the two (7, 3) codes in Problem 3.2.1 equal
(the same codewords)? Write down all the codewords in non-systematic form (or in systematic
form).
Problem 3.2.3: Explain what kind of code the (7, 1) code defined in Problem 3.2.1 is.
Problem 3.3.1: a) Write the generator matrix in systematic form for the cyclic code (7, 4) specified by the generator polynomial g(x) = x^3 + x^2 + 1. b) Show that the codewords are the same
whether we use the generator matrix or the generator polynomial. Use the information sequences [1010] and
[1100] as examples.
Problem 3.3.2: Write the generator matrix in non-systematic form for a (7, 4) code with generator polynomial g(x) = x^3 + x^2 + 1. Write the rows of the matrix as c(x) = i(x) g(x), where i(x) = x^3, x^2,
x and 1. Then change the matrix into systematic form to show that this is the same code as in
Problem 3.3.1.
Problem 3.3.3: Use elementary row operations to change the generator matrix below into
systematic form.

    1 1 0 1 0 0 0
G = 0 1 1 0 1 0 0
    0 0 1 1 0 1 0
    0 0 0 1 1 0 1

What is the block code (n, k) defined by this generator matrix? Write the codewords for both the
non-systematic and the systematic code when the information sequences to be encoded are 1011
and 1111. What is the code rate?
Problem 3.4.1: Draw the encoder for a systematic cyclic code (7, 4) with generator polynomial g(x) = x^3 + x^2 + 1. What is the content of each stage, shift by shift, when the information sequence [1010] is shifted in? What are the parity check bits? Derive the codeword with the
help of polynomials to check your result.
Problem 3.5.1: The generator polynomial of a (7, 4) cyclic code is g(x) = x^3 + x + 1. a) What
is the parity check polynomial? b) How many unique syndromes does this code have? What are the
lowest weight error polynomials and their syndromes? c) What are the most probable transmitted words when the received words are c'(x) = x^6 + x^5 + x^4 + x^3 + x^2 and c'(x) = x^6 + x^4 + x^2 + x?

Problem 3.5.2: Construct a syndrome calculator for the (7, 4) cyclic code defined by g(x)
= x^3 + x + 1. What are the syndrome and the corrected word when the received word is c'(x) = x^2?
Follow the contents of the shift register shift by shift.
Problem 3.5.3: Write all information words, codewords and corresponding polynomials in
systematic form for a (7, 4) cyclic code with the generator polynomial g(x) = x^3 + x + 1. What
is the decoded sequence when the received word is c'(x) = x^6 + x^4 + x^2?
Problem 3.6.1: What are the available binary BCH codes (n, k) that are able to correct double
errors and have a block length of less than two hundred? What are their code rates? Assume
(as is true for most BCH codes) that the number of redundant bits in a codeword equals mt,
where m is an integer 3, 4, 5, ..., and t is the number of errors the code can always correct.
Problem 3.7.1: a) Write all single, double and triple error burst polynomials of the (15, 9)
code that is generated with generator polynomial g(x) = x6 +x3 +x2 + x + 1. Use Equation
3.7.1 as it is used in Example 3.7.1. b) How many syndromes this code has? c) Is it enough for
correction of all error bursts in case a?
Problem 3.7.2: Show that the syndromes of all single, double and triple bit error bursts are
distinct when the highest order non-zero component of an error vector is at most x6, i.e., evaluate only the cases when errors occur in the section of seven least significant bits. The code is
(15, 9) code, generated with generator polynomial g(x) = x6+x3+x2+ x + 1.
Problem 3.7.3: a) Write the syndrome table for a burst error correcting code (7, 3) defined by
g(x) = x4 + x3 + x2 + 1 (from Table 3.7.1). Derive syndromes for all single and double error
(cyclic) bursts. b) Is there a unique syndrome for each error bursts? c) Is there any other error
case that this code could correct? What is that? d) How many syndromes we would need to
correct all double error cases, not only error bursts? What would then be the code rate?
Problem 3.7.4: a) What is the parity check polynomial for a burst error correction code (7, 3)
in Table 3.7.1? Check with the parity check polynomial if the received word is in error and if
it is, perform error correction. The received words are b) [0001001], c) [0001011] and d)
[0011101]. e) What are the information sequences if nonsystematic coding were used? You
may use the results of Problem 3.7.3.
Problem 3.7.5: Design a Fire code which can correct all bursts up to two errors. What is this
code written as (n, k) and what is its generator polynomial?
Problem 3.7.6: a) What is the number of syndromes of the code in Problem 3.7.5? b) Is it
high enough for correction of error bursts shorter or equal to two? c) Is it high enough for correction of all single and double error cases?
Problem 3.7.7: Design a Fire code that can correct all error bursts of length 3 or shorter. a)
What is the generator polynomial and what is this code, written as (n, k)? b) How many syndromes it has? c) What is the number of cyclic error bursts up to length of 3? Do we have
enough syndromes for all these error bursts? d) What is the number of all error cases with 3 or
less errors? Is it possible to design a code with this block length and code rate that could correct all error cases up to three errors?
Problem 3.7.8: Consider a Fire code with g(x) = (x^23 + 1)(x^17 + x^3 + 1) used in GSM signaling. a) What is the maximum length of error burst it can correct? b) How many parity bits does this code have? c) How many parity bits are needed at least according to the Reiger bound?
Problem 3.7.9: A shortened Fire code (224, 184) with g(x) = (x^23 + 1)(x^17 + x^3 + 1) is used in GSM signaling. a) What is (n, k) for the original code? b) How many zeroes are added for shortening in front of the information sequence before encoding? c) What are the code rates of the original code and the shortened code?


4. Cyclic Redundancy Check, CRC


In the previous chapters we have mainly been concerned with Forward Error Correction (FEC) codes and their capabilities. We use FEC in systems where variable delay of data frames is not allowed. Error-correcting codes are used in some data transmission systems, for example in radio applications such as GSM (speech transmission over an unstable radio path). In many applications, such as interactive speech, we have no other choice because of the requirement for a constant and short delay.
However, in data transmission we usually need not worry about delay variation between frames and we can handle errors more efficiently. Most data applications can tolerate variable delay, but they cannot tolerate a single error that may slip through because of the residual error rate of error correction decoding. An error correction decoder fails if the number of errors exceeds its correction capability.
Another disadvantage of error correction coding is the large number of redundant bits required for good performance, i.e., a low code rate. Good FEC codes usually require a code rate in the order of 0.5, i.e., half of the transmitted data is used only for error correction. Even at this quite low code rate the code is sometimes unable to correct errors. This may occur, for example, when a radio signal suffers a deep fade (radio signal loss is high for a period of time) and as a result the received data contains a very long error burst.
Another widely used error control method is Backward Error Correction (BEC), which uses efficient error detection and retransmission in the case of error. This principle is also known as Automatic Repeat Request (ARQ).

4.1. Error Detection and ARQ


The most popular error detection method used by modern systems is the Cyclic Redundancy Check (CRC). Examples of applications of CRC are Local Area Networks such as Ethernet, 2 Mbit/s primary rate transmission in the telecommunications network, the global signaling network (SS7), computer disks, etc.
To compare the efficiency of error correction and error detection/ARQ technologies, let us take a simple example [11, p 208].
Example 4.1.1
Let us assume that the bit error rate in the channel is BER = 1·10^-6 and frames, each containing 1000 bits, are transmitted through it. The probability of a certain number of errors is given by the binomial distribution in Equations 2.3.1 and 2.3.2. However, it is difficult to use and requires very accurate calculation when the frame is very long, i.e., n is very large, and the bit error probability, BER, is very low. For example, the probability of an error free frame would be

P(0) = C(1000, 0) · (1·10^-6)^0 · (1 - 1·10^-6)^1000 = 0.999999^1000

Most calculators are not accurate enough to give even an approximately correct result for this. The Poisson distribution gives a good approximation of the binomial distribution, and it is easier to use when the number of occurrences, such as the number of bits in a data frame, is high. The probability of a given number of errors we get with the help of the Poisson distribution:
P(i) = (m^i / i!) e^-m    (4.1.1)
where i is a certain number of errors in a frame and we want to find the probability that exactly i errors occur. The average number of errors in the frame is m, and in our case it is

m = 1000 · 1·10^-6 = 0.001
Now the probabilities P(i) for i errors are:

P(0) = e^-0.001 = 0.9990
P(1) = 0.001 e^-0.001 ≈ 1·10^-3
P(2) = (0.001^2/2) e^-0.001 ≈ 5·10^-7
P(3) = (0.001^3/6) e^-0.001 ≈ 1.7·10^-10
etc.
We see from the results that approximately one frame in a thousand contains one error and one frame in two million (or a little more) has more than one error (P(2)+P(3)+P(4)+···).

Error correction of a single error requires information about which of the 1000 bits in the frame is in error, and this requires 10 redundant bits, because 2^10 = 1024 is enough to tell the location of the bit in error (2^9 = 512 is not enough). The residual error probability of frames is then approximately 1/1000000 (the same as the probability of more than one error in a frame) when all single error cases are corrected.
Error detection of a single error requires only one parity bit. A parity bit is able to detect all single bit errors and any odd number of errors. In this case the residual error probability of a frame is approximately 0.5/1000000, which is approximately the probability of a frame with a double error. The performance of error detection using only a single redundant bit is even better than error correction with 10 redundant bits!

If more errors are expected, the redundant data needed for error correction increases dramatically, but error detection can manage with a very much smaller number of check bits.
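The Poisson figures above are easy to reproduce numerically. The following small Python sketch (the function name poisson is ours, not from the text) evaluates Equation 4.1.1 for the values of the example:

    # Reproduces the Poisson probabilities of Example 4.1.1.
    from math import exp, factorial

    n, p = 1000, 1e-6
    m = n * p                    # average number of errors per frame, m = 0.001

    def poisson(i, m):
        """Probability of exactly i errors when the average is m (Eq. 4.1.1)."""
        return m**i / factorial(i) * exp(-m)

    for i in range(4):
        print(f"P({i}) = {poisson(i, m):.4g}")
    # P(0) = 0.999, P(1) = 0.000999, P(2) = 4.995e-07, P(3) = 1.665e-10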
If a single parity bit is added to a block for error detection and the block is badly garbled by a long error burst, the probability that the error will be detected is only 0.5, which is hardly acceptable. We may improve the parity check as explained in Subsection 2.3.1, but we have a much better detection method, known as CRC, that is based on division by a parity check polynomial.


4.2. Structure of CRC-code


The basic idea of CRC is very similar to the cyclic codes we discussed in Chapter 3, but here we concentrate on detection only. If the frame is detected to be in error it is discarded; we do not try to correct errors! For a CRC code we define a polynomial that we use both for generation of the check bits in the encoder and for error detection in the decoder. The definition of the CRC polynomial, as well as the retransmission procedures, is included in the protocol specification of the data link layer in the OSI model.
4.2.1. CRC Encoder
We take a message block, which usually contains addresses and control information in addition to the actual user data, as a polynomial and divide it by an error check polynomial in the encoder. Then we transmit the remainder of the division at the end of the frame, after the message block. The receiver divides the whole block by the same polynomial. This results in a zero remainder (most likely) if no errors have occurred.
The CRC codes in use are systematic, and the first section of a frame contains the unchanged original message bits.
As earlier, the message or information sequence contains k bits and corresponds to a message polynomial of degree k-1. We use n-k error check bits, which means that the corresponding remainder polynomial must have degree n-k-1. Furthermore, this requires that the divisor, the generator polynomial, must have degree n-k. We obtained this same result in Section 3.2, where we studied generator polynomials of error correcting cyclic codes.
Formally, we carry out the following phases in the encoder (see the generation of any systematic cyclic code in Section 3.1):
- the message polynomial i(x) of degree k-1 is first multiplied by x^(n-k) (i.e., n-k zeros are added at the end of the k information bits),
- the result, of degree n-1, is divided by the generator polynomial g(x) of degree n-k,
- the remainder of the division, p(x) of degree n-k-1, is added to x^(n-k) i(x).
Now we have the codewords in systematic form:

c(x) = x^(n-k) i(x) + p(x)    (4.2.1)

The first term simply means that the information word is shifted n-k bits to the left, and the error check polynomial, or bits, is inserted into the lower order part of the codeword. The degree of p(x) is n-k-1, which is smaller than the degree n-k of g(x).
For example, if n-k = 4, the least significant bit of i(x) is shifted four places to the left. The degree of the generator polynomial is also 4. Now there are four free places for check bits. The polynomial p(x) corresponds to a four bit word with highest power n-k-1 = 3 (x^3, x^2, x^1, x^0).
The encoder computes the check polynomial (check bits) with the division

p(x) = R_g(x)[x^(n-k) i(x)]    (4.2.2)
which gives the remainder of the division (we call this mod-g(x)). In a practical encoder we need not perform the shift of the message explicitly; we just transmit the information sequence followed by the check bits that the encoder has calculated according to Equation 4.2.2. As we saw, the encoder operates in exactly the same way as the encoder of any cyclic systematic FEC code.
4.2.2. CRC Decoder
When the data block is received, the decoder divides the received codeword by g(x). If the received word is error free, it equals a word c(x) generated by Equation 4.2.1. The decoder divides it by g(x) and checks the remainder:
R_g(x)[c(x)] = R_g(x)[x^(n-k) i(x)] + R_g(x)[p(x)]    (4.2.3)

If the codeword is error free, the first term gives (Equation 4.2.2)
R_g(x)[x^(n-k) i(x)] = p(x)    (4.2.4)

and naturally, because the degree of p(x) is smaller than the degree of g(x), the second term
gives
R_g(x)[p(x)] = p(x)    (4.2.5)
Now with binary codes the decoder gets from Equation 4.2.3

R_g(x)[c(x)] = R_g(x)[x^(n-k) i(x)] + R_g(x)[p(x)] = p(x) + p(x) = 0    (4.2.6)

and the decoder has verified that the received word was error free.
Example 4.2.1
Let us take the information sequence [1101011011], which corresponds to the polynomial [11, p 210]

i(x) = x^9 + x^8 + x^6 + x^4 + x^3 + x + 1

and the generator polynomial

g(x) = x^4 + x + 1

which corresponds to the vector [10011]. Now k = 10 (10 information bits) and from the generator polynomial (degree n-k = 4) we see that we add four check bits to the frame, so the length of the codeword will be n = 14. To find the corresponding codeword we divide the shifted i(x), x^(n-k) i(x), by g(x).

x^(n-k) i(x) = x^4 (x^9 + x^8 + x^6 + x^4 + x^3 + x + 1) = x^13 + x^12 + x^10 + x^8 + x^7 + x^5 + x^4

This corresponds to [11010110110000], where we now have a four bit space on the right for the error detection bits. Then we carry out the long division:


Carrying out the long division of x^13 + x^12 + x^10 + x^8 + x^7 + x^5 + x^4 by g(x) = x^4 + x + 1 gives the quotient Q(x) = x^9 + x^8 + x^3 + x and the remainder, which is taken as the check part of the codeword, p(x) = x^3 + x^2 + x. Note that the coefficients are from GF(2), so -x = (-1)x = 1x = +x (the inverse element of 1 is 1 itself) and -1 - 1 = +1 + 1 = 0. We may carry out the long division by writing only the coefficients instead of x and its powers:

                 1100001010          Q(x)
    g(x) 10011 ) 11010110110000      x^(n-k) i(x)
                 10011
                  10011
                  10011
                   00001
                   00000
                    00010
                    00000
                     00101
                     00000
                      01011
                      00000
                       10110
                       10011
                        01010
                        00000
                         10100
                         10011
                          01110
                          00000
                           1110      p(x) = x^3 + x^2 + x
Now we have the codeword polynomial

c(x) = x^(n-k) i(x) + p(x) = x^4 (x^9 + x^8 + x^6 + x^4 + x^3 + x + 1) + p(x)
     = x^13 + x^12 + x^10 + x^8 + x^7 + x^5 + x^4 + x^3 + x^2 + x

which corresponds to the transmitted codeword (first bit on the left)

[11010110111110]
The reader may check what the decoder gets when it divides this by g(x). The remainder will be zero and the data is accepted by the decoder; otherwise it is discarded.
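The numbers of Example 4.2.1 can be verified with a few lines of code. The following Python sketch (function and variable names are ours, for illustration only) performs the GF(2) polynomial division on bit lists, with the first (highest order) bit on the left:

    def crc_remainder(bits, gen):
        # Remainder of bits(x) divided by gen(x) over GF(2); after the loop
        # the last len(gen)-1 positions hold the remainder.
        bits = bits[:]                          # work on a copy
        for i in range(len(bits) - len(gen) + 1):
            if bits[i]:                         # quotient bit is 1: XOR gen in
                for j, g in enumerate(gen):
                    bits[i + j] ^= g
        return bits[-(len(gen) - 1):]

    info = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]       # i(x) = x^9+x^8+x^6+x^4+x^3+x+1
    gen  = [1, 0, 0, 1, 1]                      # g(x) = x^4+x+1

    check = crc_remainder(info + [0] * (len(gen) - 1), gen)
    print(check)                                # [1, 1, 1, 0] -> p(x) = x^3+x^2+x
    codeword = info + check                     # [1,1,0,1,0,1,1,0,1,1,1,1,1,0]
    print(crc_remainder(codeword, gen))         # [0, 0, 0, 0] -> accepted

A non-zero result on the last line would make the decoder discard the frame.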
Note that although the encoding and decoding processes are the same as in the case of error correcting cyclic codes, these codes are designed for very different purposes. An FEC decoder uses the remainder as a syndrome for locating the bits in error. CRC does not care about the exact value of the remainder: if it is non-zero the data is simply discarded, and if it is zero the data is accepted.
To extract the information from an error free codeword, the decoder simply strips off the four rightmost bits of the codeword. If the operation of the decoder is described with the help of polynomials, the decoder divides c(x) by x^(n-k), takes the quotient as i(x) and discards the remainder. The reader may perform this division to see that the codeword in Example 4.2.1 gives the original information.
We have stated that CRC gives much better results than, for example, simple parity bit error detection. Let us now look at how good it really is.

4.3. Error detection capability of CRC


Let us now analyze what kind of error patterns a CRC code can detect [11, p 211]. If transmission errors have occurred, the received codeword is, instead of the error free c(x),

c'(x) = c(x) + e(x)    (4.3.1)

The error polynomial e(x) indicates all bits (terms) that have been inverted on the line. The number of non-zero elements, bits, in e(x) equals the number of errors. An error burst is characterized by an initial 1, a mixture of 0s and 1s, and a final 1. All intermediate terms of e(x) that are not in error are zero.
The receiver divides the received codeword, containing the checksum, by the generator polynomial to find out the remainder, which becomes

R_g(x)[c'(x)] = R_g(x)[c(x) + e(x)] = R_g(x)[c(x)] + R_g(x)[e(x)]
             = R_g(x)[x^(n-k) i(x)] + R_g(x)[p(x)] + R_g(x)[e(x)]
             = p(x) + p(x) + R_g(x)[e(x)] = R_g(x)[e(x)]    (4.3.2)

We see that the remainder of the division depends only on the error polynomial; it does not depend on the codeword. If there are no errors, the resulting remainder is zero.
The decoder evaluates the remainder of the division above, and if g(x) happens to divide e(x), the decoder does not notice that errors have occurred. This requires that g(x) is a factor of e(x), i.e., e(x) = g(x)Q(x), where Q(x) is the quotient of the division of e(x) by g(x) and the remainder is zero.
Now we evaluate the requirements for generator polynomials that generate good CRC codes.
The summary of these requirements is given in Section 4.4.
Single error detection
If the received word contains a single error, we write the error polynomial as e(x) = x^i, where i gives the location of the bit in error. Now if g(x) contains more than one term, it will never divide e(x), i.e., e(x)/g(x) = Q(x) (with zero remainder) cannot hold. We can easily convince ourselves that a word with more than one bit can never divide a word with a single bit. One way to show this is to try to divide e(x) = x^i into factors to see whether e(x) = g(x)Q(x) could hold. We notice immediately that all factors of e(x) have only a single term, so there is no factor similar to g(x) containing multiple terms, and thus g(x) with more than one term does not divide e(x). This confirms that CRC, if the generator polynomial has more than one term, detects all single bit errors [11, p 211].
Double error detection
If there have been two isolated single bit errors, the error polynomial is e(x) = x^i + x^j.
We can write this equivalently as e(x) = x^j (x^(i-j) + 1). If g(x) divides e(x), the errors are not noticed and the result of the division is a quotient only, i.e.,

Q(x) = x^j (x^(i-j) + 1)/g(x) and Q(x)g(x) = x^j (x^(i-j) + 1)

If we specify g(x) in such a way that it contains the term 1, then x^j does not divide g(x) (for j > 0). Then, if the equation above is valid, x^j has to be a factor of Q(x), i.e.,

(Q(x)/x^j)g(x) = Q'(x)g(x) = x^(i-j) + 1

where Q'(x) = Q(x)/x^j. The equation above is valid only if g(x) divides x^(i-j) + 1, and then the errors would not be noticed. If g(x) does not divide x^(i-j) + 1 when i-j takes all possible values from 1 to n-1, all double error patterns are detected. The maximum value of i-j is n-1, where n is the length of the codeword. When i-j = n-1, the first and last bits in the frame are in error, i.e., e(x) = x^(n-1) + 1.
Now we may conclude that if we design g(x) in such a way that it contains the term 1 and for a given frame length n it does not divide the error polynomial e(x) = x^i + 1, where i = 1 ... n-1, all double error patterns are detected. Simple low degree polynomials that give protection to long frames are known. For example, x^15 + x^14 + 1 does not divide x^i + 1 for any i below 32767 [11, p 211].
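A claim like this can be checked mechanically. The sketch below (our own illustration) searches for the smallest i for which g(x) = x^15 + x^14 + 1 divides x^i + 1, which over GF(2) happens exactly when x^i mod g(x) = 1:

    g = (1 << 15) | (1 << 14) | 1      # x^15 + x^14 + 1 as a bit mask

    r = 1                              # x^0 mod g(x)
    for i in range(1, 1 << 16):
        r <<= 1                        # multiply by x
        if r >> 15:                    # degree 15 reached: reduce modulo g(x)
            r ^= g
        if r == 1:                     # now g(x) divides x^i + 1
            print("smallest i:", i)    # prints 32767
            break

So every double error is detected in any frame shorter than 32767 bits.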
Detection of odd number of errors
If there is an odd number of errors, e(x) contains an odd number of terms, e.g., x^5 + x^2 + 1 (not, for example, x^2 + 1). Fortunately there is no binary polynomial with an odd number of terms that has x+1 as a factor. That would mean that (x+1)f1(x) = f2(x) with an odd number of terms, which can never happen, no matter what f1(x) is.
If the generator polynomial has x+1 as a factor, i.e., g(x) = (x+1)g'(x), error detection would fail if the remainder became zero, that is

e(x)/g(x) = Q(x)
e(x) = g(x)Q(x) = (x+1)g'(x)Q(x)

We see that the right hand side always has an even number of terms, no matter what the polynomials g'(x) and Q(x) are. Thus if e(x) contains an odd number of terms the equation never holds and the errors are always detected.
Another way to show this is to substitute x = 1 into e(x) with an odd number of terms; this always gives e(1) = 1. Let us now assume that x+1 divides e(x). Then the division would give a quotient Q(x) = e(x)/(x+1) only. Now we may write e(x) = (x+1)Q(x), and substituting x = 1 gives e(1) = (1+1)Q(1) = 0. The right hand side is always 0 whatever Q(x) is. The equation is thus not valid if e(x) has an odd number of terms, and we have shown that e(x) with an odd number of terms is not divisible by x+1.

Now we can conclude that if we design g(x) in such a way that it contains x+1 as a factor, the code is always able to detect all error patterns with an odd number of errors.
Burst error detection
A burst error of length l can be represented by e(x) = x^i (x^(l-1) + ... + 1), where i determines how far from the right hand end of the codeword the error burst is located. If g(x) divides e(x), the result of the division is a quotient only and the errors are not noticed, i.e.,

Q(x) = x^i (x^(l-1) + ... + 1)/g(x) and Q(x)g(x) = x^i (x^(l-1) + ... + 1)

If g(x) contains the term x^0 = 1, it will not have x^i as a factor (for i > 0) and then x^i has to be a factor of Q(x), i.e., (Q(x)/x^i)g(x) = Q'(x)g(x) = x^(l-1) + ... + 1, where Q'(x) = Q(x)/x^i. If the degree of the remaining error polynomial x^(l-1) + ... + 1 is lower than the degree of g(x) (that is, n-k), the equation above cannot be valid and all error bursts of length l or less are detected, i.e., when n-k > l-1. If the degree of g(x) is l (= n-k), its degree is higher than l-1 and all error bursts up to length l are detected. This maximum length of an error burst that is always detected equals the number of check bits in the code (n-k, the degree of g(x)).
Suppose the length of an error burst is l = n-k+1, where n-k is equal to the number of check bits. Then the degree of the factor (x^(l-1) + ... + 1) of e(x) equals the degree of g(x), and the remainder is zero only if the error burst is identical to g(x). A polynomial divides another polynomial of the same degree only if they are identical. The first and last bits are 1 according to the definition of the burst, and whether the burst is the same as g(x) depends on the l-2 = n-k-1 intermediate bits. If we assume that all error sequences have equal probability, the probability that such an error sequence, i.e., one random burst of length l = n-k+1 in a frame, goes unnoticed becomes

(1/2)^(l-2) = (1/2)^(n-k-1)
It can also be shown that when an error burst longer than l = n-k+1 bits occurs, or several shorter bursts occur, the probability of a bad frame getting through unnoticed is (1/2)^(l-1) = (1/2)^(n-k), assuming that all error patterns are equally likely (a random burst). This comes from the probability that the n-k check bits of the code match the received random frame in error so that the remainder happens to become zero.

4.4. Design of CRC code


As a summary of the requirements above we can say that the polynomial g(x) should be designed in such a way that g(x):
- has more than one term, to detect all single error cases,
- does not divide x^i + 1, where i = 1, 2, ..., n-1, to detect all double errors in a frame of length n,
- contains x+1 as a prime factor, to detect all odd numbers of errors (this also ensures that x does not divide g(x), which was required in our double error case study),
- has a degree of n-k = l, to detect all error bursts of length l or less.

If we design g(x), with degree n-k, according to the guidelines above, this code is also able to detect:
- random bursts of length l = n-k+1 with probability 1 - (1/2)^(l-2) = 1 - (1/2)^(n-k-1), and
- random bursts longer than l = n-k+1 with probability 1 - (1/2)^(l-1) = 1 - (1/2)^(n-k).
Many international standards use CRC polynomials for error checking. Some examples of standardized polynomials are given below.

CRC-4          g(x) = x^4 + x + 1
CRC-8          g(x) = x^8 + x^7 + x^4 + x^2 + x + 1
CRC-12         g(x) = x^12 + x^11 + x^3 + x^2 + x + 1
CRC-12         g(x) = x^12 + x^11 + x^3 + 1
CRC-16         g(x) = x^16 + x^15 + x^2 + 1
CRC-16 (ANSI)  g(x) = x^16 + x^15 + x^5 + 1
CRC-CCITT      g(x) = x^16 + x^12 + x^5 + 1
CRC-32         g(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 (hex. 104C11DB7)

CRC-4 is used in the 2 Mbit/s PCM frame in the telecommunication network [13, p 99] and CRC-8 for Bluetooth packet header protection [16, p 64]. CRC-16 (ANSI) is commonly used in the USA and CRC-CCITT in Europe. CRC-CCITT is used in the HDLC (High Level Data Link Control) protocol and in other HDLC related protocols such as LAPD (Link Access Protocol, D-channel) and GSM signaling [9, p 272]. The Bluetooth packet payload and IEEE 802.11b Wireless LAN radio packets are also protected by CRC-CCITT. Local Area Networks (LANs) use the CRC-32 code above, which is specified in the IEEE/ANSI 802 and IS 8802 standards [12, p 267].
Note that a CRC code is not specified by an abbreviation only, such as CRC-16. There are several codes in use which have the same length but different polynomials.
Note that the generator polynomials above with an even number of terms contain x+1 as a prime factor; CRC-4 and CRC-32 do not.
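This can be seen directly from the number of terms: over GF(2), x+1 divides g(x) exactly when g(1) = 0, i.e., when g(x) has an even number of non-zero terms. The short Python sketch below (our illustration, related to Problem 4.4.5) applies this test to the listed polynomials:

    # Exponents of the non-zero terms of each listed generator polynomial.
    polys = {
        "CRC-4":         [4, 1, 0],
        "CRC-8":         [8, 7, 4, 2, 1, 0],
        "CRC-12":        [12, 11, 3, 2, 1, 0],
        "CRC-12 (alt.)": [12, 11, 3, 0],
        "CRC-16":        [16, 15, 2, 0],
        "CRC-16 (ANSI)": [16, 15, 5, 0],
        "CRC-CCITT":     [16, 12, 5, 0],
        "CRC-32":        [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0],
    }
    for name, exps in polys.items():
        has_factor = len(exps) % 2 == 0   # even number of terms <=> (x+1) | g(x)
        print(f"{name:14s} contains x+1: {has_factor}")
    # Every listed polynomial except CRC-4 and CRC-32 contains x+1 as a factor.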
A 16-bit checksum, as an example, catches:
- all single and double errors (up to very long frames),
- all odd numbers of errors,
- all burst errors of length 16 or less,
- 99.997% of 17-bit error bursts (1 - (1/2)^15), and
- 99.998% of 18-bit or longer error bursts (1 - (1/2)^16) [11, p 212].
The performance of different codes with 16 check bits is in practice the same, and the results above are valid for all properly designed CRC-16 codes.
The figures above show that CRC is a very reliable error detection scheme that gives good protection at a very high code rate. For example, in Ethernet a four byte CRC-32 checksum is used in each frame that is 64, ..., 1500 bytes long. This gives a code rate of 0.938, ..., 0.997, and the error protection is much better than with the CRC-16 analyzed above.

Problems
Problem 4.2.1: A binary message polynomial is i(x) = x^4 + x^3 + 1 (a 5 bit word) and the generator polynomial is g(x) = x^3 + x + 1. Cyclic redundancy check is used for error detection. a) What is this code expressed as (n, k)? Write the encoded transmitted codeword as a vector and as a polynomial. b) Divide the transmitted codeword by g(x). What are the result, quotient and remainder?
Problem 4.2.2: The generator polynomial of a CRC code is g(x) = x^4 + x + 1 and the received 14 bit word is a) c(x) = x^13 + x^12 + x^10 + x^8 + x^7 + x^5 + x^3 + x^2 + x, b) c(x) = x^13 + x^10 + x^6 + x^4 + x^2 + 1. Are the received words error free? What are the message sequences?
Problem 4.2.3: The generator polynomial of a CRC code is g(x) = x^4 + x + 1 and the received codeword is a) c(x) = x^6 + x^5 + x^4 + x. Is the received word error free? What is the 5-bit message sequence?
Problem 4.3.1: The CRC-12 code uses the generator polynomial g(x) = x^12 + x^11 + x^3 + x^2 + x + 1. a) Show or explain why it is able to detect all single errors and all odd numbers of errors. b) What is the maximum length of an error burst that it can always detect? c) What is the probability of detection of a random error burst of length 13 bits? d) What is the probability of detection of bursts of length 14 bits or longer?
Problem 4.4.1: Design two CRC polynomials that can, with the help of two and three redundant bits respectively, detect all odd numbers of errors.
Problem 4.4.2: The generator polynomial for a CRC code is g(x) = x^3 + x + 1. a) What is the longest block length for this CRC code at which it can detect all double error cases? b) What is the first double error case that cannot be detected as the block length increases?
Problem 4.4.3: How many redundant bits do we have in a CRC code with the generator polynomial g(x) = x^3 + x^2 + x + 1? Is this code able to detect all odd numbers of errors?
Problem 4.4.4: a) Is a CRC code with the generator polynomial g(x) = x^3 + x^2 + x + 1 able to detect all double errors and all odd numbers of errors if the block length is 6? How long error bursts are always detected? b) What is the probability of detection of error bursts of length 4? What is the probability of detection of error bursts longer than 4?
Problem 4.4.5: Check whether the CRC codes given in Section 4.4 can detect all error cases with an odd number of errors.


5. Convolutional Codes
Unlike in the case of block codes, the input symbols (usually binary, one bit per symbol) of convolutional codes are not grouped into independent blocks; instead, each input bit influences a span of output bits. When we say that a certain encoder produces an (n, k, K) convolutional code, we express that for k input bits, n output bits are generated, giving a code rate of k/n. K gives the encoder's memory measured in input symbols. Convolutional encoding may be a continuous process, but in many applications encoding is performed for subsequent blocks of data independently. For example, in GSM each speech frame is encoded independently using the convolutional encoder explained in Section 5.7.

5.1. Convolutional Encoding


A convolutional code is a sequence of encoded symbols that is generated by passing the information sequence through a binary shift register as shown in Figure 5.1.1 [8, p 359]. At each symbol instant, a k-bit information symbol is inserted into the input stages of the shift register. The register consists of Kk binary stages that hold the present k-bit information symbol and the K-1 previous k-bit input symbols. The parameter K is known as the constraint length and determines the memory length of the shift register, i.e., how many information symbols influence an encoded symbol. Note that there are many different definitions of constraint length in the literature; we have selected the one used in references 1, 3 and 8 (some sources write the memory size as K-1 because the present (last) symbol need not necessarily be stored). Other ways to see the constraint length of a convolutional code are: how many successive output symbols each input bit influences [1, p 492], or how many input symbols impact a single output symbol. If the input symbols are binary, i.e., k = 1, then one bit is inserted into the encoder at a time.
[Figure: general convolutional encoder. A k-bit information symbol enters a Kk-stage shift register at each symbol instant; n mod-2 adders (generators 1, 2, ..., n) tap the register stages and produce the sequence of n-bit encoded symbols.]

Figure 5.1.1 General convolutional encoder for the CC(n, k, K) code [8, p359].

The n linear algebraic function generators g1, g2, ..., gn, defined by their connections to the shift register stages, form the encoded symbols with the set of mod-2 adders, see Figure 5.1.1. At each symbol instant, a new information symbol enters the register and the other symbols move to
the following symbol location, while one symbol leaves the register. The new encoded symbol is then generated before the next symbol enters the register.
The process continues symbol by symbol. The encoded data is not divided into codewords or blocks as in the case of the block codes that we discussed in the previous chapters.
The convolutional code CC(n, k, K) is described by three parameters: n, the length of the encoded symbol; k, the length of the information symbol; and K, the constraint length. The code rate of a convolutional code is then given as

Rc = k/n    (5.1.1)

and it represents the amount of information per encoded bit or symbol. For binary codes k = 1 and for each information bit the encoder transmits n bits.
In order to generate a convolutional code, we have to define a set of generators [8, p 359]. Each of these is described as a vector or polynomial with the dimension of the full register length of Kk bits. Each place in this vector (or coefficient of the generator polynomial) specifies whether the corresponding bit position is connected to the modulo-2 adder of this generator or not.
We demonstrate here a binary convolutional code CC(2, 1, 3) with constraint length K = 3 in Figure 5.1.2. A constraint length of three means that the two previous bits and the present bit impact the output. There are three one bit memories in the encoder (two would actually be sufficient because we do not necessarily need to store the present bit, and that is why the first memory is drawn as a dashed line box in Figure 5.1.2). Now we have n = 2, k = 1 and Rc = k/n = 1/2. With this code we will generate two output bits for each information bit. For this we need two generators to define how the three information bits stored in the registers create the two encoded bits. Let us take these generators as
g1 = [1 0 1] and g2 = [1 1 1]    (5.1.2)

These vectors we can equivalently write as polynomials of the delay operator D as

g1(D) = 1 + D^2 and g2(D) = 1 + D + D^2    (5.1.3)

We use polynomials of D instead of x, which we used in Chapters 3 and 4, because this notation is used in most sources that describe convolutional codes. Note that we now write generator polynomials with the highest power of D on the right hand side (polynomials of x we wrote with the highest order term, the eldest bit, on the left). Now we want the order of the terms in the generator polynomial to correspond to the order of the information symbols in the shift register when they enter bit by bit from the left. The eldest bit (the first one) is on the right, and the presence of the highest power term in the generator polynomial defines whether it has an impact on the encoded symbol or not. The generator vectors are written in the same order, with the highest degree coefficient (the first bit) on the right.
The shift register in our example has three binary stages; the first stage is for the latest bit and the latter stages are for the two previous input bits. The status of the register is defined by the logical values of its bits, while the state of the code is given by the two previous input bits, the values of the two rightmost stages in Figure 5.1.2.

[Figure: CC(2, 1, 3) encoder with K = 3, k = 1. Input bits enter a three-stage shift register from the left (the first stage, holding the present bit, drawn as a dashed box); the generator vectors g1 = [1 0 1] and g2 = [1 1 1], i.e., g1(D) = 1 + D^2 and g2(D) = 1 + D + D^2, define the connections to the two mod-2 adders that produce the encoded symbols.]
Figure 5.1.2 Convolutional encoder for the CC(2, 1, 3) code [8, p 360].

Initially, the shift register is set to the all-zero status and then an input bit of logical value 1 is applied, see Figure 5.1.2. The generators produce the first encoded symbol, which is 11. At the next instant the data moves to the right and the encoded symbol 01 is generated. As the encoding process continues, the information bits 1010... (written with the first bit on the left; in the figure the first bit is on the right hand side) are encoded as 11 01 00 01 ... . The corresponding states (defined by the two previous bits stored in the two rightmost registers) of the encoder were 00, 10, 01, 10, 01, ... These states and the present input bit are used to describe the code with the help of the state diagram and the trellis diagram, which we will describe later in this chapter.
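The behaviour described above is easy to reproduce with a few lines of code. The following Python sketch (names are ours) implements the CC(2, 1, 3) encoder of Figure 5.1.2 directly from the generators g1 = 1 + D^2 and g2 = 1 + D + D^2:

    def cc213_encode(bits):
        # s1 = previous input bit, s2 = second previous input bit (the state).
        s1 = s2 = 0
        out = []
        for u in bits:
            out += [u ^ s2, u ^ s1 ^ s2]    # g1 = 1 + D^2, g2 = 1 + D + D^2
            s1, s2 = u, s1                  # shift the register
        return out

    print(cc213_encode([1, 0, 1, 0]))       # [1,1, 0,1, 0,0, 0,1] -> 11 01 00 01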
We could also define the convolutional code with a matrix, called the generator-polynomial matrix [4, p 354]. In this matrix we write all generator polynomials of the code. In the case of our previous example the matrix would be
G(D) = [ g1(D)  g2(D) ] = [ 1 + D^2    1 + D + D^2 ]    (5.1.4)

There is one row (one input bit at a time) and two polynomials in two columns in the matrix, and two output bits are generated for each input bit. The polynomials define how the present and two previous bits create the first and second output bits of each encoded two-bit block.
Convolutional codes can be viewed as the discrete convolution of the impulse response of the encoder and the information sequence. In our example, if the information sequence were a logical 1 followed by all zeros, the output sequence is referred to as the impulse response of the encoder, and the two-bit (g1(D), g2(D)) output blocks would be 11, 01, 11, 00, 00, 00, ..., written as a bit sequence with the first bit on the left as 110111000000... For an arbitrary input sequence, the encoder output is the modulo-2 addition of the impulse responses arising from each logical 1 at the input, where each response is positioned in accordance with the logical ones in the input sequence [8, p 361].
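The same output can be produced as a modulo-2 superposition of shifted impulse responses, which illustrates the linearity of the code. A small Python sketch (our illustration; the impulse response 11 01 11 is the one derived above):

    impulse = [1, 1, 0, 1, 1, 1]             # response to input 1 0 0 ...: 11 01 11

    def encode_by_superposition(bits):
        out = [0] * (2 * len(bits))
        for t, u in enumerate(bits):
            if u:                             # each logical 1 at instant t ...
                for j, b in enumerate(impulse):
                    if 2 * t + j < len(out):
                        out[2 * t + j] ^= b   # ... adds a shifted impulse response
        return out

    print(encode_by_superposition([1, 0, 1, 0]))   # 11 01 00 01 again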

Systematic Convolutional Code
Convolutional codes may have systematic or non-systematic form, just as block codes. The example encoder in Figure 5.1.2 generates a non-systematic code. Systematic means that each information bit is sent as it is, and the previous bits stored in the encoder do not impact it. As an example, the code in Figure 5.1.2 would be in systematic form if g1(D) = 1, i.e., mod-2 adder 1 were connected only to the leftmost stage.
Puncturing
The code rate of a convolutional code is typically quite low. To increase it, an encoder may discard some encoded bits. For example, in the case of a systematic rate-1/2 code we could send only every other parity bit, and the code rate would then be increased to 2/3. The decoder does not consider these discarded bits at all. The performance of a punctured code is not as good as the performance of the original code, because puncturing decreases the minimum free distance (the corresponding Hamming distance) of the code. A sketch of this parity-deletion idea is shown below.
An application of puncturing is the ordinary GPRS data traffic channel, where the user data rate is increased by additional puncturing. Puncturing is increased when the quality of the channel is good enough for a weaker error correction code.
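As a toy illustration of the idea (not from the text), the following sketch deletes every second parity bit from the output pairs of a systematic rate-1/2 encoder, raising the rate to 2/3:

    pairs = [(1, 1), (0, 1), (1, 0), (0, 1)]   # (information, parity) pairs

    punctured = []
    for t, (info, parity) in enumerate(pairs):
        punctured.append(info)                  # the systematic bit is always sent
        if t % 2 == 0:                          # keep only every other parity bit
            punctured.append(parity)

    print(punctured)     # 6 transmitted bits for 4 information bits: rate 2/3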

5.2. Tree Diagram of a Convolutional Code


There are three alternative methods that we may use to describe a convolutional code [3, p 472]: the tree diagram, the state diagram and the trellis diagram. As an example, the tree diagram for the convolutional encoder shown in Figure 5.1.2 is given in Figure 5.2.1. Assuming that the encoder is initially in the all-zero state, the first branch in the diagram shows that if the first input bit is 0, the output sequence is 00, and if the first bit is 1, the output is 11.
[Figure: tree diagram. From each node the upper branch corresponds to input bit 0 and the lower branch to input bit 1. Branch labels give the generated output bits g1 g2 (the first encoded bit written on the left), and the contents of the encoder (first bit on the right) are shown at the nodes. Nodes with two leaving branches are labeled with the states a, b, c and d; the example input sequence 100... (first bit on the left) can be traced through the tree.]
Figure 5.2.1 Tree diagram for the CC(2, 1, 3) code.

________________________________________________________________________
ECC35.doc

Page 95/128

Tarmo Anttalainen

Error Control Codes

23.1.2013

_________________________________________________________________________

Now if the first bit is 1 and the second bit is 0, the output is 1101. In order to get the output sequence directly from the labels of the tree diagram, we have written the first bit of the encoded symbol, the one generated by g1(D), on the left. Continuing through the tree, we see that if the third bit is 0 the output is 11; if it is 1, the following two output bits are 00. We follow the uppermost branch if the input is a continuous 0 and the lowermost branch if the input is a continuous 1. To see the generated output sequence we trace the particular path through the tree that is determined by the input sequence.
If we look at the tree in Figure 5.2.1 closely, we notice that it repeats itself after the third step: a branch for the fourth input bit gives the same output as the third input bit with the same value, provided the state (the two previous bits) has been the same. This follows from the fact that the constraint length of this encoder is three. In Figure 5.2.1 we have marked the nodes with two leaving branches with the labels a, b, c and d. These are the states of the encoder defined by the two bits stored in the encoder. We can write the states and their values as:
a = 00,   b = 10,   c = 01   and   d = 11

These are the values of the two rightmost stages of the shift register in Figure 5.1.2 (oldest bit on the right) when a new bit has arrived and the output bits are generated. We can see that the output is defined by the state of the encoder (nodes a, b, c and d in the diagram) and an input bit defining one of the two leaving branches in the diagram.
Note, for example, that if the state is b = 10, the tree diagram shows the generated output bits when the bits 1 and 0 are stored in the rightmost places of the shift register. The state that impacts the next output is actually defined by the two previous bits that are stored in the leftmost places of the shift register in Figure 5.1.2 when the new bit arrives. When the new bit arrives, these two bits are moved to the right and the output symbol is generated by them together with the newly entered bit. The next state (which together with the next input bit defines the next output) is then defined by the arrived bit and one previous bit (the two bits in the leftmost places).
We can see from Figure 5.2.1 that the contents of the two leftmost places always define the state that impacts the next output bits, because they are shifted to the rightmost places when the generation of the output bits takes place. For example, the content of the shift register is (10x) always for state b.
By connecting the branches that correspond to the same state, we can construct the following diagrams, called the state and trellis diagrams.

5.3. State Diagram of a Convolutional Code


We saw in Sections 5.1 and 5.2 that the state of a convolutional encoder is defined by the previous symbols that are stored in the encoder when the output bits are generated [8, p 361]. The state is defined by the two leftmost bits, which are shifted to the rightmost places before the new bit arrives, see Figure 5.1.2. The state transition and the next state depend on the latest arrived bit stored in the leftmost register place. The total number of register stages defining the state is (K-1)k, which is 2 in our example (K = 3, k = 1), and the number of states is 2^((K-1)k), that is, 4 in our example. The state changes when the latest symbol value, in the leftmost place of the register, is shifted to the right and the eldest bit is dropped out. Figure 5.3.1 presents the state diagram of our encoder example CC(2, 1, 3) [8, p 364].

[Figure: state diagram with the states a = 00, b = 10, c = 01 and d = 11 and the transitions labeled input/output: a: 0/00 to a, 1/11 to b; b: 0/01 to c, 1/10 to d; c: 0/11 to a, 1/00 to b; d: 0/10 to c, 1/01 to d.]

Figure 5.3.1 The state diagram of the CC(2, 1, 3) convolutional code.

Now we use the state diagram in Figure 5.3.1 to describe the operation of the encoder. We label the states as previously: a = 00, b = 10, c = 01 and d = 11. The transitions between states we label by the input bit / generated output bits. The shift register is initially reset and the state of the encoder is a. If we assume that the bit sequence to be encoded is 1010..., the first bit 1 causes the transition from state a to state b. At the same time the output bits 11 are generated, as shown on the transition branch. At the next clock instant the input is 0, the state changes from b to c, and the output bits 01 are generated (first bit on the left hand side).
If the input sequence continued as alternating 1 and 0, the state would change from b to c, c to b, b to c, ..., and the encoder would transmit the data sequence 11 01 00 01 00 01 ... (the first bit on the left hand side). This path is unique to the given input sequence.

5.4. Trellis Diagram


A third and very descriptive representation of a code is the trellis diagram, shown in Figure 5.4.1. There we have concatenated the consecutive instants of the state diagram in Figure 5.3.1. All states are present in each "column" and each "row" represents one of the states. The lines show the next state and the output for a certain input bit value. We follow the diagram from left to right. Transitions start from the initial all-zero state, and we can now easily follow the paths of state transitions and write the output sequence.
The diagram in Figure 5.4.1 shows all 16 possible paths that occur for 4-bit information sequences. The path corresponding to the input sequence 1010... is drawn with a thick line, corresponding to the encoded sequence 11 01 00 01 ... Naturally, there must be 16 different output sequences to be able to transmit all possible combinations of 4 information bits.
[Figure: trellis diagram over symbol instants J = 0, ..., 4. The four rows are the states a = 00, b = 10, c = 01 and d = 11; the branches carry the input/output labels of the state diagram (a: 0/00 to a, 1/11 to b; b: 0/01 to c, 1/10 to d; c: 0/11 to a, 1/00 to b; d: 0/10 to c, 1/01 to d). The path for the input sequence 1010, with output 11 01 00 01, is drawn with a thick line.]

Figure 5.4.1 The trellis diagram of a CC(2, 1, 3)-code.

Note that in Figure 5.4.1 we have written the first output bit (generator polynomial one) on the left, so that we can write the output sequence directly as a sequence of branch labels from left to right.

5.5. Decoding of Convolutional Code


In the previous sections we looked at convolutional codes from the encoder's point of view. In this section we study how decoders can use convolutional codes for error correction.
5.5.1. Maximum Likelihood Decoding
In the previous section we saw that the input sequence changes the encoder's states, which together with the present input bit generate the sequence of encoded symbols to be transmitted. The convolutional decoder in the receiver estimates the most likely path of state transitions in the trellis. When that is identified, the corresponding information sequence is delivered as the decoded information sequence. If the decoder uses the Viterbi algorithm, all possible paths in the trellis are searched, and their distances to the sequence at the decoder input are compared. The path with the smallest distance to the received sequence is then selected, and the corresponding information sequence is regenerated. This method is known as maximum likelihood decoding because the most likely sequence of all the paths in the trellis is selected. It therefore results in the minimum error rate [8, p 366].
We have seen that a convolutional code has no well defined block length and it can be used for continuous coding of a data stream. However, in practice the data to be encoded usually has a certain defined block length, for example a data frame (e.g., Ethernet), the 20 ms speech frame of GSM, etc. Convolutional codes can be used for independent data blocks so that the encoding starts from the initial state (usually the all-zero state) at the beginning of each frame and the block is terminated by appending (K-1)k logical zeros after the last information symbol into the encoder. These additional tail bits force the encoder to return to the all-zero state at the end of the encoded block (when there is no feedback circuit). Encoding of a new block then starts again from the all-zero state.
5.5.2. Hard-decision Decoding and Viterbi algorithm
We explain here the operation of the Viterbi algorithm with the help of the simple example code CC(2, 1, 3) that we have used previously. Here we assume that the receiver has made hard decisions before convolutional decoding, i.e., decided whether each received bit is a logical 0 or 1. The Hamming distances between the received two-bit symbols and the output symbols corresponding to each branch in the trellis are used as a metric. Figure 5.5.1 records all paths selected by the Viterbi decoder when the received sequence is [11 01 00 01].
Suppose that there are no channel errors and the received sequence is the same as the encoded sequence 11, 01, 00, 01, ... At the first instant, J = 1, the received symbol is 11, and the decoder compares it with the output sequences of the possible input branches of states a and b. We write the Hamming distance between the input branch and the received sequence as the metric of each path. We then continue to the right in the trellis and write the path metric of the input branches of each node as a sum of branch metrics. Let us take one example where J = 2. The transitions a-a-b correspond to the encoded sequence 0011. The received sequence was 1101, and the Hamming distance between the path a-a-b and the received sequence is 3.
[Figure: trellis of the CC(2, 1, 3) code for the received symbols 11 01 00 01. Branch labels are input/output pairs, accumulated path weights are written in italics, and the surviving path a-b-c-b-c with weight 0 gives the decoded output 1010.]

Figure 5.5.1 Example of Viterbi decoding.

From instant J = 3 onwards there are two possible branches to each node a, b, c and d. The Viterbi decoder selects one of the possible paths as the survivor path to each of the four nodes at all following instants J = 4, 5, ... For this decision the decoder calculates both path metrics for each node, selects the smaller one as the survivor path and terminates the path with the higher distance. Let us take state c at instant J = 3 as an example. There are two possible paths: a-a-b-c with metric 4 and a-b-d-c with metric 3. The decoder selects a-b-d-c as the survivor path and terminates the path a-a-b(-c). All terminated paths are cleared from the decoder memory.
If we analyze the trellis further, we notice that there is always only one survivor path at each instant J from the initial state to each state of the decoder. If we now had to decide at instant J = 4 what the transmitted sequence was, we would take the path a-b-c-b-c because its Hamming distance to the received sequence is minimal, in this error free case zero. This decoding
method is optimum in the sense that it minimizes the probability that the entire sequence is in
error [8, p368].
Example 5.5.1
Let us assume that the convolutional code in Figure 5.4.1 is in use and the received sequence is 01 00 10 00 01. The decoder decides at the end of the sequence which sequence was transmitted and what the corresponding information is. The decoder knows the trellis in Figure 5.4.1 and calculates the survivor paths that are presented in Figure 5.5.2.
[Figure: trellis with the survivor paths and their weights (in italics) for the received symbols 01 00 10 00 01. The survivor with the smallest total weight (2) gives the decoded sequence 11 10 10 00 01 and the decoded output 11010.]

Figure 5.5.2 Example of Viterbi decoding.

At time instant J = 2 the all-zero path has the smallest distance, but at time instant J = 4 it has become clear that it was most probably a wrong path. The whole sequence is considered in detection, the decision is made at the end of the whole sequence, and the two errors that occurred at the beginning of the sequence are corrected. The successful error correction is a result of the fact that the latter part of the sequence is error free; the distances of the wrong paths tend to increase, while the distance of the right path remains the same if no more errors occur.
As shown in the example above, errors may make a wrong path seem to be the correct one at some time instants. However, as time goes on and more data is taken into account, the distances of the wrong paths tend to increase because error free data increases their distance. The distance of the correct path increases only by the number of errors that have occurred.
If a convolutional code is used to encode independent data blocks, the encoder is usually forced to return to the all-zero state at the end of each data block. When the encoder does not have any feedback loop, it returns to zero if K-1 zero tail bits are added to the information data block that is encoded. The encoder is then in the all-zero state when encoding of a new data block starts. The decoder then simply chooses the path that starts from the all-zero state and returns back to the all-zero state at the end.

5.5.3. Soft-decision decoding
In hard-decision decoding the input of the Viterbi decoder consists of only two (binary) values. In soft-decision decoding the received signal is quantized to multiple levels, not just binary 0 or 1. The process we have described previously is still valid, but we measure the distance between the received sequence and each branch more accurately. Now we derive one survivor path for each state at each time instant with the help of this so-called confidence measure instead of the Hamming distance.
For an Additive White Gaussian Noise (AWGN) channel, hard quantization of the received signal results in a loss of about 2 dB in noise performance (Eb/N0) compared with infinitely fine quantization [8, p 369], while an 8-level quantization reduces the loss to 0.25 dB. This indicates that 8-level quantization could be good enough for most purposes.
5.5.4. Implementation aspects of the Viterbi Algorithm
We illustrated the operation of the Viterbi decoder in Section 5.5.2. A more complete definition of the algorithm is given, for example, in [8, p 372]. The complexity of the trellis diagram directly reflects the computation and memory requirements of the Viterbi decoder [8, p 371]. In general, for a CC(n, k, K) code, there are 2^((K-1)k) states in the encoder. At each node in the trellis there are 2^k paths terminating, and one of these is selected as the survivor path. As a consequence there are 2^k path comparisons at each node, repeated for all 2^((K-1)k) nodes at each time instant. The computation increases exponentially with K and k, and this restricts K and k to relatively small values.
The Viterbi decoder selects and updates the surviving sequences and stores them into memory at each time instant. At the end of the encoded packet, the decoder selects the survivor with minimum distance, i.e., the minimum Hamming distance with hard-decision decoding or the minimum confidence measure with soft-decision decoding.
In practice the encoded packets are usually very long and it is impractical to store the entire length of the surviving sequences before making the decision. This might lead to unacceptable delay and memory consumption. Instead, only the most recent L information bits in each of the surviving sequences are stored. For the present minimum distance survivor, the symbol of this path L periods ago is decoded into an information symbol. In order to prevent the most recent symbols from influencing the decoded information symbols too much, the parameter L is usually selected to be L ≥ 5K. The present received symbol may change the selected survivor path, but usually all survivors join at distance L in the past.

5.6. Distance Properties of Convolutional Codes


The Hamming distances of the paths in the trellis determine the correction capability of the code [8, p 375]. As convolutional encoding involves modulo-2 linear operations on the information sequence, the convolutional code is linear, and therefore the distance separation of the paths in the trellis is independent of which particular code sequence is considered. In Section 2.5.1 we saw that for linear codes the sum of codewords is another codeword and the minimum distance between codewords is equal to the minimum weight of the non-zero codewords.
The reader may look at the trellis in Figure 5.4.1 and check whether each of the first, for example, 6-bit sequences can be generated as a sum of two other possible sequences.
Convolutional codes are linear codes because they are produced by linear generators (adders). We know that the minimum distance of a linear code equals its minimum weight, see Section 2.5.1. Then, for the sake of simplicity, we may assume that the all-zeros sequence is transmitted and analyze only that. Now if a non-zero path in the trellis is favored by the Viterbi decoder, erroneous decoding occurs. We illustrate the distance properties of convolutional codes with the simple example code CC(2, 1, 3) we have used previously.
If an error free all-zero sequence is received, the decoder loops in state a of the state diagram in Figure 5.3.1. When a decoding error occurs, a non-zero path is selected, and the state changes from a to another state for some instants before returning to state a again. By splitting the node a of the state diagram in Figure 5.3.1 into an input state ai and an output state ao, the traces of all the incorrect paths are revealed by the possible connections from the input to the output, as shown in Figure 5.6.1.
[Figure: the state diagram of Figure 5.3.1 redrawn with state a (00) split into an input node a_i and an output node a_o; the branches between states b (10), c (01) and d (11) carry the input/output bit labels of the code together with the factors J H D², J D, J H, J H D and J D² discussed below.]

Figure 5.6.1 Modified state diagram for CC(2, 1, 3) code.

The transition from a_i to b represents leaving the correct path (of the transmitted all-zero sequence) and the transition from c to a_o represents the return to the correct path. We gave additional labels J, H and D to the branches. The exponent of D in each branch corresponds to the Hamming distance of that transition from the all-zero sequence. The factor H marks those branches, i.e., state transitions, activated by input bit 1. The factor J (its exponent) serves as a counter indicating the number of instants in any given path before it merges back to the correct all-zero sequence.
Let us look at the transition from a_i to b in Figure 5.6.1. This is a single transition (one hop), so the exponent of J is unity. The encoder input for this state transition is a single bit 1, which is why we have labeled it by H with exponent 1. The transition produces 11 instead of the all-zero 00, so the distance is 2 and the factor D² is attached to the branch; the complete label of this branch becomes J H D².
Let X_s be a variable that represents the accumulated weight of each path that enters state s, i.e., what has happened to the weight (or distance) by the time we enter this state. The transfer function associated with all transitions from a_i to a_o then provides the required measure of path weights. We can express all possible incorrect paths from a_i to a_o by the transfer function:

T(D, H, J) = X_ao / X_ai          5.6.1
We can find this function by writing equations that define how states are reached from other
states:
X_b  = J H D² X_ai + J H X_c
X_c  = J D X_b + J D X_d
X_d  = J H D X_b + J H D X_d
X_ao = J D² X_c          5.6.2

We have four equations and four unknowns, and the transfer function of the CC(2, 1, 3) code is obtained by solving this group of equations. By doing so we get

T_CC213(D, H, J) = X_ao / X_ai = J³ H D⁵ / (1 − J H D (1 + J))          5.6.3

When we divide this out, we get


T_CC213(D, H, J) = J³ H D⁵ + J⁴ H² D⁶ + J⁵ H² D⁶ + J⁵ H³ D⁷ + 2 J⁶ H³ D⁷ + J⁷ H³ D⁷ + ...          5.6.4

This formula contains all paths in error (all non-zero paths). The first term indicates that there is an incorrect path at a Hamming distance of five (the exponent of D is 5) from the all-zero sequence. It merges back to node a after three instants (the exponent of J is 3) and produces one erroneous information bit (the exponent of H is 1) when the all-zero sequence is assumed to be the transmitted one.
Each term in the sum of the transfer function represents one incorrect trellis path. We can also read, for example, the first term directly from the state diagram in Figure 5.6.1 as the transition a_i-b-c-a_o: there are three hops, one of them corresponding to input 1, and their total weight is 5, which is the result of multiplying all the transition labels. An important property of the transfer function is that it provides the distance properties of all the paths of the convolutional code.
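As a cross-check, the state equations 5.6.2 can be solved symbolically. The following sketch uses the sympy library (assuming it is available); it reproduces the transfer function 5.6.3 and, with H = J = 1, the series 5.6.7 derived below.

import sympy as sp

D, H, J, Xai, Xb, Xc, Xd, Xao = sp.symbols('D H J Xai Xb Xc Xd Xao')

eqs = [sp.Eq(Xb,  J*H*D**2*Xai + J*H*Xc),   # equations 5.6.2
       sp.Eq(Xc,  J*D*Xb + J*D*Xd),
       sp.Eq(Xd,  J*H*D*Xb + J*H*D*Xd),
       sp.Eq(Xao, J*D**2*Xc)]
sol = sp.solve(eqs, [Xb, Xc, Xd, Xao], dict=True)[0]
T = sp.simplify(sol[Xao] / Xai)     # equivalent to J^3 H D^5 / (1 - J H D (1 + J))
print(T)
print(sp.series(T.subs({J: 1, H: 1}), D, 0, 8))   # D**5 + 2*D**6 + 4*D**7 + O(D**8)

The lowest-order term of the series gives the free distance d_free = 5 directly.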
Let us try to find the path in the trellis of Figure 5.4.1 that corresponds to the first term of Equation 5.6.4. The factor J in the transfer function determines the number of branches in a path, i.e., the number of hops. In the first term we have J³, which means that the shortest non-zero path has three hops. From the trellis in Figure 5.4.1 we see that the shortest path which leaves the all-zero state and returns to it indeed contains three hops (the exponent of J), has weight 5 (the exponent of D), and has one hop related to an input bit with binary value one (the exponent of H).
The minimum Hamming distance between two paths of a convolutional code is called the minimum free distance and written d_free. The d_free of our CC(2, 1, 3) code is five, the exponent of D in the first term of the transfer function in Equation 5.6.4. This lowest-weight non-zero path gives the smallest distance between any two paths in the trellis. Looking at Figure 5.4.1 of this same code, we see that if a non-zero path is selected, its smallest weight is 5 (a-b-c-a). From this we know that this convolutional code is able to correct all two-bit error patterns, because in the case of two errors the correct (all-zero) path is still closest to the received sequence.
If the convolutional code is terminated after q instants, the transfer function is obtained by truncating T(D, H, J) at the term J^q. However, for a very long code sequence the transfer function has a practically infinite number of terms, and J is then no longer important in determining the truncation [8, p 377]. If we are not interested in the path length, we can suppress J by setting J = 1, and Equation 5.6.4 takes the form:
T_CC213(D, H, 1) = H D⁵ + H² D⁶ + H² D⁶ + H³ D⁷ + 2 H³ D⁷ + H³ D⁷ + ...          5.6.5
When we combine some terms of the transfer function we get:
T_CC213(D, H, 1) = H D⁵ + 2 H² D⁶ + 4 H³ D⁷ + ...          5.6.6

This transfer function no longer depends on the path length J. For example, the second and third terms in the original transfer function both contain two non-zero input bits, i.e., if either of these wrong paths is selected, two information bits are in error. Both paths have Hamming distance six to the all-zero path but different path lengths (exponents of J). These terms are combined into the second term of Equation 5.6.6, which tells us that there are two distance-6 paths.
Furthermore, the exponent of H indicates the number of erroneous information bits associated with a path. We need not consider it if it is enough to know whether there are errors or not (i.e., whether the error correction fails or not). Then we can set H = 1, and the transfer function takes a form that depends only on D and gives the Hamming distances of all incorrect paths to the all-zero sequence:
T_CC213(D, 1, 1) = D⁵ + 2 D⁶ + 4 D⁷ + ...          5.6.7

As the correct path is the all-zero sequence, the Hamming distance between an incorrect path and the correct one is the weight (number of logical ones) of the incorrect path. We may look at the trellis in Figure 5.4.1 and notice that there really is one weight-5 path (transition a-b-c-a) and two weight-6 paths (transitions a-b-d-c-a and a-b-c-b-c-a). We know that in each section of six bits up to two errors are always successfully corrected, because the correct path is then never eliminated.
Example 5.6.1
To illustrate the performance of our simple CC(2, 1, 3) example code, we may look at its trellis in Figure 5.4.1 and the time period from J=2 to J=5 (the section from J=4 to 5 is equal to the section from J=3 to 4). Assume, as an example, that the encoder state at J=2 is b and the next transmitted six bits are 010001. This correct path is eliminated at J=5 only if the other path, 100110 to state c at J=5, is preferred. Their distance is five, so with two errors or fewer the correct path will always survive. This is the case for any sequence of data from any state at J=2 to any state at J=5.
Our example convolutional code in Figure 5.4.1 has a free distance of 5 and only one three-hop path with weight 5. Thus if two errors occur in six subsequent bits, they are successfully corrected, and in the next 6-bit sequence another set of two errors can be corrected. The code may therefore manage error situations where the error rate is as high as 2/6 ≈ 33%!
In the following section we use the expressions presented above to understand the convolutional CC(2, 1, 5) code used for error protection of the GSM speech, data and signaling channels.


5.7. Application Example, GSM Error Correction


In the GSM speech channel the most sensitive bits of a speech frame are error correction encoded. A half-rate convolutional code CC(2, 1, 5) is used for each speech frame independently. Each speech frame has four zero tail bits to reset the encoder shift register. The constraint length of this code is 5.
The four zero bits are clocked into the shift register to reset it to the initial state. When the first information bit of the next speech frame appears, the shift register is in the initial state (all four memory elements in the zero state) and encoding of the new frame starts. A diagram of the encoder is presented in Figure 5.7.1.
Generator polynomials:

g0(D) = 1 + D³ + D⁴
g1(D) = 1 + D + D³ + D⁴

[Figure: the K = 5 encoder shift register; information bits u_0 ... u_188 are clocked in, and a switch alternates between the two generator outputs to produce the encoded bits c_0 ... c_377.]

Figure 5.7.1 The CC(2, 1, 5) TCH/F convolutional encoder [8, p 711].

The 189 information bits are encoded by the half-rate convolutional code shown in Figure 5.7.1 and designated CC(2, 1, 5). The generator polynomials are shown in the figure as well. We could write the generator matrix of this convolutional code as

G(D) = [ 1 + D³ + D⁴    1 + D + D³ + D⁴ ]          5.7.1

The matrix has one row because one bit is taken into the encoder at a time. It contains two columns because two bits are transmitted for each input bit. As defined by the polynomials and shown in the figure, the even output bits are constructed as the modulo-2 sum of the present bit and the oldest two bits in the shift register. The odd output bits are generated as the modulo-2 sum of all five bits except the third one in the shift register.
We also see in Figure 5.7.1 that each input bit influences the encoder output for five cycles, during which ten output bits are generated. That gives the constraint length of five.
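A minimal sketch of this encoder in Python (variable names are our own, not from the GSM specification) makes the tap positions of the two polynomials explicit:

def cc215_encode(bits):
    s = [0, 0, 0, 0]            # shift register contents, s[0] = newest stored bit
    out = []
    for u in bits:              # in GSM the last four bits are the zero tail bits
        out.append(u ^ s[2] ^ s[3])         # even bit: g0(D) = 1 + D^3 + D^4
        out.append(u ^ s[0] ^ s[2] ^ s[3])  # odd bit:  g1(D) = 1 + D + D^3 + D^4
        s = [u] + s[:3]                     # shift the register
    return out

print(cc215_encode([1, 0, 1, 0, 0, 0, 0]))  # rate 1/2: 14 output bits for 7 inputs

Encoding the full 189-bit frame with this function would produce the 378 bits c_0 ... c_377 of Figure 5.7.1.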
Error Correction Capability of CC(2, 1, 5) Code
The state diagram of the convolutional code CC(2, 1, 5) has 2^((K-1)k) = 16 states, because the constraint length K = 5 and k = 1 for this binary code. The transfer function of the code would be found by solving a group of 16 equations [8, p 380]. For binary codes with k = 1, the complexity of this computation grows exponentially with the constraint length K.
An alternative way to find the transfer function for a large value of K is to trace through all possible non-zero paths in the trellis by computer search and record their path distances. The minimum distance d_free of this code has been found to be 7, so it corrects all error cases with 3 or fewer errors. If the errors are distributed over the data, the error performance is much better.
Convolutional codes rely on the adjacent bits to correct an error [8, p 390]. For example, the code above always corrects up to three errors, even in subsequent bits. If an error burst contains a larger number of errors, the probability that the correct path is eliminated becomes high. Bursts with a large number of errors occur in Rayleigh fading radio channels, and the convolutional code becomes overloaded by the deep fades and the consequent long error bursts. However, the code is able to correct a large total number of errors spread over many bursts of up to three errors each, if there is some distance between the bursts.
In order to improve the performance of convolutional codes, interleaving techniques are used. If we interleave subsequent convolutionally encoded data blocks, we transmit one bit from each interleaved block at a time. Interleaving distributes the errors of an error burst over all the interleaved encoded blocks and improves the performance essentially. Another way to use interleaving is applied in GSM, where a convolutionally encoded data frame is transmitted in data blocks each containing every eighth bit of the frame. Then if one block suffers a deep fade, only every eighth bit of the frame is probably in error, the error correction can easily recover the original data, and no user data is lost.
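A simple sketch of this idea (a hypothetical helper, not the exact GSM mapping) spreads a frame over eight blocks so that each block carries every eighth bit:

def interleave_frame(frame, depth=8):
    return [frame[j::depth] for j in range(depth)]   # block j gets bits j, j+8, ...

def deinterleave_blocks(blocks):
    depth = len(blocks)
    frame = [None] * sum(len(b) for b in blocks)
    for j, block in enumerate(blocks):
        frame[j::depth] = block
    return frame

frame = list(range(16))                  # stand-in for encoded bits
assert deinterleave_blocks(interleave_frame(frame)) == frame

If one of the eight blocks is lost in a fade, the resulting errors land eight bit positions apart in the deinterleaved frame, well within the correction capability discussed above.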

Problems
Problem 5.1.1: Draw the convolutional encoder for the binary convolutional code CC(2, 1, 3). Its generator polynomials are g1 = 1 + D², g2 = 1 + D. What is the constraint length of the encoder? What is the code rate?
Problem 5.1.2: Encode the input sequence [1011100] using the encoder diagram in Figure 5.1.2.
Problem 5.2.1: Draw the tree diagram of the convolutional encoder in Problem 5.1.1. Which nodes of the diagram are repeated later on?
Problem 5.2.2: Encode the input sequence [1011100] using the tree diagram in Figure 5.2.1.
Problem 5.3.1: Draw the state diagram of the convolutional encoder in Problems 5.1.1 and 5.2.1.
Problem 5.3.2: Use the state diagram in Problem 5.3.1 and define the encoded output sequence when the input is [11001000] (the first bit on the left). The encoder is assumed to be initially in the all-zero state.
Problem 5.3.3: Use the state diagram in Figure 5.3.1 to encode the input sequence [1011100].
Problem 5.4.1: Draw the trellis diagram for the CC(2, 1, 3) convolutional encoder in Problems 5.1.1, 5.2.1 and 5.3.1.
Problem 5.4.2: Encode the input sequence 10010 (first bit on the left) with the help of the trellis diagram in Problem 5.4.1.
Problem 5.4.3: Encode the input sequence 1011100 (first bit on the left) with the help of the trellis diagram in Figure 5.4.1.
Problem 5.5.1: Decode the received sequence 0011101101 (first bit on the left) with the help of the trellis in Problem 5.4.1.
Problem 5.5.2: Decode the received sequence 1100100101 (first bit on the left) with the help of the trellis in Problem 5.4.1. Perform the decoding step by step according to the Viterbi algorithm. Is this sequence error free? If there were errors, how many were there and what was most probably the transmitted sequence? What was the corresponding information sequence?
Problem 5.5.3: Decode the received data sequence 01 00 11 01 11 using the convolutional code CC(2, 1, 3) defined in Figure 5.1.2. Use the trellis diagram presented in Figure 5.4.1. Draw the survivor paths and write the Hamming distances of each path as shown in Figure 5.5.1. Is the received sequence in error? If there are errors, what is the corrected data sequence and the corresponding information sequence?
Problem 5.5.4: Decode the received data sequence 00 01 10 11 11 using the convolutional code CC(2, 1, 3) defined in Figure 5.1.2. Use the trellis diagram presented in Figure 5.4.1. Draw the survivor paths and write the Hamming distances of each path as shown in Figure 5.5.1. Are there errors in the received sequence? If there are, how many, and what are the decoded data sequence and the corresponding information sequence?
Problem 5.6.1: Show that the set of six-bit codewords generated by the trellis in Figure 5.4.1 makes up a linear code.
Problem 5.6.2: Draw the state diagram for the code in Problem 5.3.1 in a form similar to Figure 5.6.1 and derive the transfer function of this CC(2, 1, 3) code.
Problem 5.6.3: Derive the transfer function of the CC(2, 1, 3) code with the help of the state diagram in Problem 5.6.2, setting H = 1 and J = 1 from the beginning. What is the minimum free distance d_free of this code?


6. Trellis Coded Modulation


In the previous chapters we have looked at error control coding as a separate process, carried out independently of line coding and modulation, see Figure 1.3.1. This is not always the case; we may combine error control coding with line coding and modulation. Error control coding and line coding together are called channel coding. Trellis Coded Modulation (TCM), which combines channel coding and modulation, was invented at the end of the 1970s. Utilization of TCM increased, for example, analog modem rates from 9.6 kbit/s up to the present 33.6 kbit/s.
In this chapter we describe the principle of TCM with the help of an example where we compare ordinary Quadrature Phase Shift Keying (QPSK) with trellis coded 8-ary PSK (8-PSK).

6.1. Binary Phase Shift Keying, BPSK


Phase modulation (PM) is a method in the class of exponential Continuous (or carrier) Wave (CW) modulations. In PM the instantaneous phase of the carrier wave varies according to the message. Figure 6.1.1 shows two examples of digital phase modulation.
The first one in Figure 6.1.1 is binary digital PM, which is called Phase Shift Keying (PSK) or Binary Phase Shift Keying (BPSK): the phase of the carrier is varied according to whether the digital modulating signal has the logical value 1 or 0. The modulating binary signal has only two possible values, so we need only two carrier phases, 0 and 180 degrees (π radians). If the transmission time of each symbol or bit is T, we say that the symbol time is T and the modulation rate or symbol rate is

r = 1/T baud          6.1.1

In BPSK the modulator may change the phase of the carrier 1/T times per second. For BPSK we may define that the binary values of the message correspond to the carrier waveforms

0  →  Ac cos(ωc t)
1  →  Ac cos(ωc t + π)

where
Ac is the amplitude of the carrier,
ωc = 2π fc is the radian frequency and
fc is the carrier frequency.
The bit rate r_b of BPSK is equal to the symbol rate r because each symbol (a carrier burst of duration T) carries one bit with its two possible values "0" and "1". An easily understandable way to describe digital phase modulation is the constellation diagram, shown in Figure 6.1.1.


[Figure: constellation diagrams and waveforms of BPSK and QPSK. In BPSK the carrier phase is 0 or 180 degrees depending on the binary value of the data. In QPSK two bits at a time set the carrier to one of four phases: 00 → 0 degrees, 01 → +90 degrees, 11 → +180 degrees, 10 → +270 (or −90) degrees.]

Figure 6.1.1 Digital Phase Modulations, Binary and Quadrature Phase Shift Keying.

In the constellation diagram the I-axis corresponds to the in-phase carrier wave and the Q-axis to the carrier with a 90-degree phase shift. Each signal point in the diagram represents one possible transmitted "symbol", a waveform that represents one value of the digital modulating signal. The distance of a point from the origin corresponds to the amplitude of the carrier and is thus related to the energy of the transmitted symbol. For BPSK we select opposite phases to get the highest possible distance between the signal points for a given symbol energy. We call the distance between signal points the Euclidean distance, to distinguish it from the Hamming distance. A maximum Euclidean distance minimizes the error rate, because the farther the symbols are from each other, the higher the noise level needed to cause errors in the receiver.

6.2. Quadrature Phase Shift Keying, QPSK


In many systems we prefer to use more than two phases of the carrier to increase the bit rate. When four carrier phases are used, each phase represents the value of two bits and we talk about Quadrature Phase Shift Keying (QPSK). If the modulation rate is the same as in BPSK, the bandwidth of the signal remains the same but the binary data rate is doubled, because each symbol now carries the values of two bits. For this modulation scheme we may define, for example:

00  →  Ac cos(ωc t)
01  →  Ac cos(ωc t + π/2)
11  →  Ac cos(ωc t + π)
10  →  Ac cos(ωc t − π/2)          6.2.1

Figure 6.1.1 illustrates an example of a QPSK constellation. The original carrier wave and the modulated waveform are drawn in the figure. A pair of bits at a time is taken from the incoming bit stream (110001101111...) and the carrier phase is shifted by the modulator according to the value of these two bits until the next two bits are received. We can see from Figure 6.1.1 that, for example, the bit combination 01 is sent as a carrier with a +90 degree phase shift. The distance between a signal point and the origin of the diagram gives the carrier amplitude, which is the same for all symbols in our example in Figure 6.1.1.
Note that if the transmission power of QPSK is the same as that of BPSK, the distance between the signal points and the origin must remain the same. The Euclidean distance between the signal points of QPSK is then decreased by a factor of the square root of two compared with BPSK (with the same transmission power), and the noise tolerance is correspondingly decreased. We could further increase the number of phases to increase the data rate, which would further decrease the Euclidean distance and make the noise tolerance worse.
If the symbol duration in Figure 6.1.1 is the same as for BPSK, the bit rate of QPSK is double, because two bits are transmitted in each time period T. That is, for QPSK

r_b = 2 r = 2/T bit/s          6.2.2
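The QPSK mapping 6.2.1 is easy to express in complex baseband. The sketch below (our own helper, not from any standard) maps bit pairs to unit-amplitude carrier phases:

import cmath, math

PHASE = {(0, 0): 0.0,
         (0, 1): math.pi / 2,
         (1, 1): math.pi,
         (1, 0): -math.pi / 2}

def qpsk_map(bits, Ac=1.0):
    assert len(bits) % 2 == 0
    return [Ac * cmath.exp(1j * PHASE[(bits[i], bits[i + 1])])
            for i in range(0, len(bits), 2)]

print(qpsk_map([1, 1, 0, 0, 0, 1]))   # three symbols carry six bits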

If we used two amplitude values in addition to the four phases, we would have four more points in the constellation diagram, and each point would represent the values of three bits. This combination of phase and amplitude modulation is called Quadrature Amplitude Modulation (QAM). The use of QAM with many signal points is reasonable if the noise level in the channel is low, as for example in a speech channel through the telecommunications network or in short-haul radio relay systems. We may increase the number of signal points further, to the order of hundreds in voice-band modems, if we use TCM, which combines error control coding with QAM.
As long as we keep the symbol duration unchanged, the required channel bandwidth of all PSK and QAM methods remains the same. A usual approximation for the required radio bandwidth is

B_T = r = 1/T          6.2.3

Note that the bandwidth depends only on the symbol rate; both modulation examples in Figure 6.1.1 require the same bandwidth although the bit rate of QPSK is double.

6.3. Comparison of QPSK and 8-PSK


We now illustrate the performance of Trellis Coded Modulation with the help of an example in which we compare ordinary QPSK and trellis coded 8-PSK (PSK with 8 phases). The constellation diagrams and Euclidean distances of these modulation methods are shown in Figure 6.3.1.


[Figure: constellation diagram of QPSK (points 00, 01, 11, 10 with nearest-neighbor distance d1) and of 8-PSK (points 000 ... 111 with distances d0, d1, d2 and d3 between points one, two, three and four steps apart).]

Figure 6.3.1 Euclidean distances of QPSK and 8-PSK.

If we take the distance from the origin as 1, we may write the distances as [6, p 270]:

d0 = √(2 − √2)
d1 = √2
d2 = √(2 + √2)
d3 = 2          6.3.1

When an error occurs in the receiver, the symbol is most often detected as one of its neighbors. Other error cases have no importance in normal operational conditions where the error rate is not very high. We see that for QPSK the smallest Euclidean distance is d1 = √2 and for 8-PSK it is d0 = √(2 − √2). The distances in Figure 6.3.1 correspond to voltage values, so to compare the two modulation methods in terms of signal-to-noise power ratio we get the dB value

10 log10(d1²/d0²) dB = 10 log10{2/(2 − √2)} dB = 5.33 dB
This means that 8-PSK requires an about 5.3 dB higher signal-to-noise ratio in order to achieve the same symbol error rate and bit error rate as QPSK! The bit error rate is equal to (or very close to) the symbol error rate, because we assume that in an error case a neighboring symbol is detected, and each symbol error then produces one bit error in both cases, see Figure 6.3.1. However, 8-PSK provides a 1.5 times higher data rate. In the next section we illustrate the principle and performance of TCM by comparing these two modulation methods, QPSK without TCM and 8-PSK with TCM. We will see that with TCM, 8-PSK gives better performance while the information data rate and the use of spectrum remain the same.
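The distances 6.3.1 and the 5.33 dB figure are quick to verify numerically; on the unit circle, points k steps apart in M-PSK are 2 sin(kπ/M) apart:

import math

def psk_distance(M, k):
    return 2 * math.sin(k * math.pi / M)   # chord length on the unit circle

d1 = psk_distance(4, 1)                    # QPSK nearest neighbors: sqrt(2)
d0 = psk_distance(8, 1)                    # 8-PSK nearest neighbors: sqrt(2 - sqrt(2))
print(10 * math.log10(d1**2 / d0**2))      # about 5.33 dB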

6.4. Trellis Coded 8-PSK


As we saw, the more symbol values we use, the lower the noise tolerance if no coding is involved. Now we show that if we keep the user data rate the same and use TCM together with 8-PSK, we get much better performance than with QPSK without TCM. In this case we use the additional bit per symbol as redundant information to improve performance. Figure 6.4.1 shows a trellis encoder including a rate-1/2 convolutional encoder with a constraint length of three. With the same modulation or symbol rate, this encoder together with 8-PSK gives the same user data rate and uses the same spectrum as QPSK without a trellis encoder.
We now have two options. We may send two user bits in each symbol (duration T) using QPSK, or we may use the encoder in Figure 6.4.1 and transmit two user bits in each 8-PSK symbol. When the symbol rate is the same, the same bandwidth is used and the user rate is the same in both options.
[Figure: a 2-bit input (i0, i1); bit i1 drives the convolutional encoder producing the coded bits c1 and c2, while i0 passes through uncoded as c0; the three bits c0, c1, c2 drive the 8-PSK modulator.]

Figure 6.4.1 Encoder for a trellis code [6, p 271].

The encoder is systematic, but odd and even input bits are transmitted in different 3-bit frames. The generator matrix of the encoder in Figure 6.4.1 is [6, p 270]

G(D) = | 1      0       0 |
       | 0   1 + D²     D |          6.4.1

The trellis for the convolutional encoder in Figure 6.4.1 is shown in Figure 6.4.2. Branches are
labeled as in/out-bits, i.e., i1/c1, c2.
[Figure: four-state trellis (states 00, 10, 01, 11) with branch labels in/out = i1/c1 c2; for example, from state 00 input 0 gives output 00 and input 1 gives output 10.]

Figure 6.4.2 Trellis diagram for the convolutional encoder.

We see from the trellis in Figure 6.4.2 that the free distance of this convolutional code is three, and thus it is able to recover all error cases with no more than one error in each six-bit period. The trellis is independent of the other input bit i0, but we may draw the complete trellis for the encoder in Figure 6.4.1, containing both input bits and all three output bits. We then have two parallel branches between the nodes of Figure 6.4.2.

[Figure: the same four-state trellis with branch labels in/out = i0 i1/c0 c1 c2; each branch of Figure 6.4.2 appears as two parallel branches that differ only in the uncoded bit c0, e.g. 00/000 and 10/100 from state 00.]

Figure 6.4.3 Complete trellis diagram for the encoder in Figure 6.4.1.

The task of the convolutional decoder in the receiver is to decide which path the encoder has followed in the trellis. In an operational system the error rate is typically better than 1·10⁻³, and the convolutional decoder is able to correct close to all errors; note that it fails only if there are two or more errors in six subsequent bits. Assuming that the convolutional code corrects all errors of the coded bits, it always chooses the correct path through the trellis. Then all errors that may occur in decoding are errors of the uncoded bit c0, which is the first of the three bits in the branch labels of Figure 6.4.3 and in the signal points of Figure 6.4.4. The convolutional code cannot correct them, because the paths in the trellis are independent of bit c0. However, from Figure 6.4.4 we see that the value of c0 selects which of the two signal points on opposite sides of the constellation diagram is transmitted; the set of two points is selected by the convolutionally coded bits c1 and c2. The distance between the points of each two-point set is the maximum, d3 (see also Figure 6.3.1), and thus the probability of error is minimized.

[Figure: a general combined encoder/modulator in which k1 information bits enter a binary convolutional encoder whose output selects one of the 2^n constellation subsets, while k2 uncoded bits select the point within the subset (1, 2, ... 2^k); and the example trellis encoder for 8-PSK, where the coded bits c1, c2 select one of the four antipodal point pairs (c1 c2 = 00, 01, 10, 11) and the uncoded bit c0 selects the point within the pair.]

Figure 6.4.4 General combined encoder/modulator [3, p 514] and trellis coded 8-PSK.

If we used uncoded QPSK instead of trellis coded 8-PSK, the Euclidean distance would be d1 instead of d3, as shown in Figure 6.3.1. The coding gain is then approximately

g = 10 log10(d3²/d1²) dB = 10 log10 2 = 3 dB          6.4.2

This means that uncoded QPSK requires an approximately 3 dB better signal-to-noise ratio at the same error rate. (Actually the error rate of QPSK is then about two times higher, because each signal point in QPSK has two nearest neighbors, see Figure 6.3.1, instead of one in trellis coded 8-PSK.) The improvement of 3 dB means that with the help of TCM the transmission power of 8-PSK can be reduced to half of that of QPSK without coding. The improvement is smaller than 3 dB at very low signal-to-noise ratios, because the convolutional code used for signal point subset selection may also fail when the noise is very high.
Example 6.4.1
Assume that the information data 110010 is encoded and transmitted using our example TCM. The encoded data for input bit i1 is obtained from the trellis in Figure 6.4.2; the encoder path leaves the all-zero state and returns to it after three hops.

Info bits       i0 i1 = 1 1,  0 0,  1 0
Encoded data    c0 c1 c2 = 1 10,  0 01,  1 10

This correct path is eliminated in the decoder only if, because of errors, the all-zero path is preferred, and then decoding fails. The distance between the paths is three, so more than one error is needed for false decoding (an error rate of 2/6 = 33% or higher). In an operational system the error rate is much better and the correct path is (almost) always found. Then the only problem of the receiver is to distinguish between the parallel branches in Figure 6.4.3, that is between

c0 c1 c2   c0 c1 c2   c0 c1 c2
 010        001        010
 110        101        110
When we design the constellation so that these parallel branches are located on opposite sides of the constellation, as in Figure 6.4.4, we have the same noise tolerance for the uncoded bit as in BPSK, which is 3 dB better than the QPSK we would use without TCM.
We have used a simple example to illustrate the performance improvement that TCM achieves. Figure 6.4.4 shows a general TCM encoder where a set of information bits is convolutionally encoded and used in the receiver to choose a subset of the constellation points. The signal points of each subset are designed to have as large a Euclidean distance as possible. The uncoded bits define the signal point within each subset. TCM provides a 3 to 6 dB improvement depending on the implementation [3, p 523].
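The example encoder of Figure 6.4.1 is small enough to sketch directly from the generator matrix 6.4.1: c0 = i0 uncoded, c1 = (1 + D²) i1 and c2 = D i1. The sketch below (our own variable names) reproduces the encoded symbols of Example 6.4.1.

def tcm_encode(pairs):
    d1 = d2 = 0                       # the two delay elements acting on i1
    symbols = []
    for i0, i1 in pairs:
        c0 = i0                       # uncoded bit selects the point within a subset
        c1 = i1 ^ d2                  # (1 + D^2) i1
        c2 = d1                       # D i1
        symbols.append((c0, c1, c2))  # (c1, c2) select the antipodal point pair
        d1, d2 = i1, d1
    return symbols

print(tcm_encode([(1, 1), (0, 0), (1, 0)]))
# [(1, 1, 0), (0, 0, 1), (1, 1, 0)], i.e. 110, 001, 110 as in Example 6.4.1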

Problems
Problem 6.4.1: Draw the block diagram of the trellis encoder for the generator matrix below.

G(D) = | 1      0           0      |
       | 0   1 + D²    1 + D + D² |
Problem 6.4.2: Draw the trellis diagram for the trellis-coded 8-PSK in Problem 6.4.1.
Problem 6.4.3: Draw the constellation diagram for the trellis-coded 8-PSK in Problems 6.4.1
and 6.4.2.


7. Turbo Codes
In 1993 a new encoding and decoding scheme, Turbo codes, was reported that approaches Shannon's maximum channel capacity given in Chapter 1. The simulation results shown in [14, Figure 8.1] illustrate that at a typical error rate of 1·10⁻⁵ a Turbo code tolerates a 3 dB lower signal-to-noise ratio than the reference convolutional code used in the simulations. The characteristics of Turbo codes are not yet well known, and we only briefly review their structure with the help of a simple example. The most important class of Turbo codes is known as Parallel Concatenated Convolutional Codes (PCCC), which we introduce next.

7.1. Codeword structure of Turbo codes


A Turbo encoder consists of the parallel concatenation of two or more, usually identical, rate-1/2 convolutional encoders realized in systematic feedback form, and a pseudorandom interleaver, see Figure 7.1.1 [14, Figure 8.2]. The encoder structure is called a parallel concatenation because the two encoders operate on the same set of input bits, rather than one encoding the output of the other.
[Figure: input bits u_r pass directly to the output as v_r(0) and into the first rate-1/2 systematic convolutional encoder, which produces v_r(1); an interleaver of length N permutes u_r into u'_r for the second rate-1/2 systematic convolutional encoder, which produces v_r(2); an optional puncturer follows.]

Figure 7.1.1. The PCCC encoder with two encoders, interleaver and optional puncturer.

The interleaver in Figure 7.1.1 permutes, or mixes, the input bits: the set of input bits of the second encoder remains the same but the input sequence is different. The input bits are grouped into finite-length sequences whose length N equals the size of the interleaver. The encoders are rate-1/2 encoders and produce two output bits for each input bit. Only one output, the parity bit, is shown for each encoder in Figure 7.1.1, because the other output bit of a systematic encoder is identical to the input bit, as we see in Figure 7.1.2. The first encoder receives the input bits u_r and produces the output bits v_r(0) = u_r and v_r(1); the second encoder receives the interleaved sequence u'_r and produces v_r(2).
Since both encoders are systematic (i.e., the first of the two produced output bits is equal to the input bit) and operate on the same set of input bits, it is necessary to send the input bits only once. By knowing the structure of the interleaver, the receiver is able to regenerate u'_r from u_r. The overall code rate is then 1/3. To increase the code rate to 1/2 we could puncture the parity sequences by alternately deleting v_r(1) or v_r(2). Puncturing is also used with convolutional codes, and Viterbi decoding simply does not consider a deleted bit when producing path metrics. Puncturing may significantly simplify decoding of high-rate convolutional codes [2, p 382], but the performance of the code is not as good as without puncturing.
We will refer to a Turbo code as (h0, h1, N), where h0 and h1 are the generator polynomials and N stands for the length of the interleaver. As an example, let us take a Turbo encoder as in Figure 7.1.1 where the two encoders are equal CC(2, 1, 2) systematic feedback convolutional encoders shown in Figure 7.1.2. Our example generator polynomials are

g0(D) = 1 + D²          7.1.1

g1(D) = D          7.1.2

The first polynomial is the feedback polynomial and the second is the feedforward polynomial. We have again used the delay operator D, which corresponds directly to the delay of one flip-flop in the shift register. The input bit impacts the parity bit after a delay of D (one bit), as we see from the feedforward polynomial g1(D). The feedback polynomial tells that we make a sum of the present input bit and the register bit from two delays earlier. The transfer function, i.e., how the parity bit v depends on the input bit u, becomes D/(1 + D²). We may write the generator matrix as

G(D) = [ 1    D/(1 + D²) ]          7.1.3

This means that for each input bit the output contains the input bit as it is, and a parity bit generated by the second element of the matrix. If we carry out the division D/(1 + D²) we get D + D³ + D⁵ + ..., and we see that the parity bit is the modulo-2 sum of the previous input bit (delay D) and the input bits three, five, etc. bits in the past. The reader may check that this is valid by looking at Table 7.1.1. From Figure 7.1.2 we can see that the first parity bit is zero (the initial state is the all-zero state), the second equals the first input bit, the third is the second input bit + the first parity bit, the fourth is the third input bit + the second parity bit, etc. In Table 7.1.1 each parity bit is accordingly written as the sum of the input bit one position earlier and the parity bit two positions earlier.
[Figure: systematic feedback encoder with input u, feedback polynomial g0(D) = 1 + D² and feedforward polynomial g1(D) = D.]

Figure 7.1.2. CC(2, 1, 2) systematic convolutional encoder of the example Turbo code.
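A sketch of this recursive systematic encoder (our own variable names): by the feedback polynomial the register input is a_r = u_r + a_(r-2), and the parity bit is the one-delay tap v_r = a_(r-1):

def rsc_encode(u):
    a1 = a2 = 0                 # delay elements D and D^2
    parity = []
    for bit in u:
        a = bit ^ a2            # feedback g0(D) = 1 + D^2
        parity.append(a1)       # feedforward tap g1(D) = D
        a1, a2 = a, a1
    return parity

print(rsc_encode([1,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0]))
# [0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,0], the sequence v(1) of Table 7.1.1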

If the pseudorandom interleaver has size N = 16, we have defined a (1 + D², D, 16) Turbo code. The interleaver in Figure 7.1.1 is a 4x4 matrix filled row by row with the input bits u_r. When it becomes full, the input bits u'_r for the second encoder are read out in a pseudorandom manner until each bit has been read once and only once. Let us define a pseudorandom interleaver by the permutation function

π16 = {15, 10, 1, 12, 2, 0, 13, 9, 5, 3, 8, 11, 7, 4, 14, 6}          7.1.4

This gives the position of each bit in the interleaved data block: for example u'_0 = u_15 and u'_1 = u_10, etc. Now if the information data block to be encoded is, for example,
u = [u_0, u_1, ..., u_15] = [1,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0]          7.1.5

then according to the permutation π16 we get

u' = [u'_0, u'_1, ..., u'_15] = [0,0,0,1,0,1,0,0,0,0,0,0,0,1,1,0]          7.1.6

which is the input sequence of the second encoder. To see the corresponding parity sequences, we may use the CC(2, 1, 2) convolutional encoder in Figure 7.1.2 or its trellis in Figure 7.1.3; the output sequences for u and u' are shown in Table 7.1.1.
Table 7.1.1 Output sequences of the example Turbo encoder.

u                             1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0
v(1), vi = u(i-1) + v(i-2)    0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0
u'                            0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0
v(2), vi = u'(i-1) + v(i-2)   0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1
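Using rsc_encode() from the sketch above, the whole table can be reproduced by applying the permutation 7.1.4 and encoding both sequences:

PI16 = [15, 10, 1, 12, 2, 0, 13, 9, 5, 3, 8, 11, 7, 4, 14, 6]

def permute(u, pi):
    return [u[pi[i]] for i in range(len(pi))]    # u'_i = u_pi(i)

u  = [1,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0]
up = permute(u, PI16)          # [0,0,0,1,0,1,0,0,0,0,0,0,0,1,1,0]
v1 = rsc_encode(u)             # first parity sequence
v2 = rsc_encode(up)            # second parity sequence
print(sum(u) + sum(v1) + sum(v2))   # codeword weight 4 + 3 + 3 = 10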

The trellis diagram of this convolutional encoder is shown in Figure 7.1.3. The states are written so that the first (left) bit represents the contents of the left-hand delay element.
[Figure: four-state trellis with states s0 = 00, s1 = 01, s2 = 10 and s3 = 11, branches labeled in/out.]

Figure 7.1.3 Trellis diagram for the CC(2, 1, 2) convolutional encoder in Figure 7.1.2.

Figure 7.1.4 [14, Figure 8.3] shows the paths in the trellis for the input sequences u and u'. Above each branch the input bit is written, and below the line the corresponding output parity bit.
[Figure: trellis paths for the input sequence u = {1,0,0,0,1,0,0,0,0,0,0,0,1,0,1,0} and for the interleaved sequence u' = {0,0,0,1,0,1,0,0,0,0,0,0,0,1,1,0}, with in/out labels on each branch.]

Figure 7.1.4. Example paths in the encoders.

As we have seen, the corresponding unpunctured parity sequences are

v(1) = {0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,0}
v(2) = {0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1}

The resulting codeword contains u, v(1) and v(2), and thus its Hamming weight (without puncturing) is

d = w(u) + w(v(1)) + w(v(2)) = 4 + 3 + 3 = 10

where w(u) is the Hamming weight of sequence u. If the code is punctured starting with v0(1), we discard bits 0, 2, 4, etc. from v(1) and bits 1, 3, 5, etc. from v(2). The resulting Hamming weight is then

d = 4 + 3 + 2 = 9

If we start puncturing with v0(2), we discard bits 0, 2, 4, etc. from v(2) and bits 1, 3, 5, etc. from v(1). The resulting Hamming weight becomes

d = 4 + 0 + 1 = 5
Finding the free distance of a Turbo code is a complicated matter, because the Turbo encoder is time varying due to the interleaver. If we write a delayed input sequence as ũ = Du, then the corresponding delayed parity sequence of the first encoder is Dv(1). Because of the interleaving, however, the interleaved version of ũ is in general not Du', and thus the second parity sequence is, with high probability, not equal to Dv(2).
As an example we may write the delayed input sequence of the example above as

ũ = Du = {0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,1}

According to the permutation π16 we get

ũ' = {1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0}
We see that this is not equal to the delayed u'. The second parity sequence now becomes

y(2) = {0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0}

Comparing this with v(2) in Table 7.1.1, we see that the delayed input sequence results in a parity sequence (of the interleaved input sequence) that differs both in bit positions and in Hamming weight.
The simple example above has illustrated that, because of the pseudorandom interleaver, the input sequences u and u' are almost always different. Their weights are always the same, but the two encoders will (with very high probability) produce parity sequences of different weights.
Note that a non-zero tail sequence is required to return the encoder to the all-zero state, as we see in Figures 7.1.3 and 7.1.4. This is a consequence of the systematic feedback form of the encoder: a non-zero input is required to leave the all-zero state and a non-zero input is required to return there; otherwise a non-zero element continues to circulate in the encoder. In this particular encoder, any information sequence of weight 1 produces a parity sequence of alternating 1s and 0s after the input 1 has entered the encoder.
The input sequence u that consists of N−1 zeros followed by a 1 is a valid sequence for the first encoder. For some interleavers u will be permuted to itself, so u' = u. In this case both v(1) and v(2) have weight zero, and the overall weight of the codeword, and hence the free distance, is only 1, regardless of whether or not puncturing is used. For a systematic feedback encoder, such as the one in Figure 7.1.2, forcing the first encoder to return to the all-zero state ensures that every information sequence has at least weight 2. For this reason it is common to force the first encoder to the all-zero state.
With a pseudorandom interleaver it is highly unlikely that both encoders will be returned to the all-zero state at the end of the sequence, even when the K−1 last bits of the input sequence u, the tail bits, are chosen to force the first encoder back to the all-zero state. The final state of the second encoder is then not known, but according to simulations it does not influence the performance of the code much in the case of large interleavers. We will assume here that the first encoder is forced to return to the all-zero state and the final state of the second encoder is unknown. There are special interleaver structures that force both encoders to the all-zero state, but they are not discussed here.

7.2. Iterative Decoding and Performance of Turbo Codes


For the decoding of Turbo codes an iterative algorithm has been developed. For the detection of each bit, the whole received sequence, containing the systematic bits and both parity sequences, is considered. Decoding is a soft input/soft output (SISO) process where the value of each bit is quantized to at least 8 levels, not just to 0 or 1 as in hard decision decoding.
Figure 7.2.1 shows the block diagram of a PCCC decoder. Initially the decoder is reset. The decoding procedure contains the following steps; a structural sketch of the iteration loop as code is given after Figure 7.2.1 below.
1. The soft values of the systematic information bits (x), the first parity sequence (y1) and the second parity sequence (y2) are measured and stored.
2. The SISO 1 decoder derives extrinsic information (Le1(d)) for each systematic bit using the values of the other systematic bits of x and the first parity sequence (y1).
3. The extrinsic information Le1(d) from the first decoder is interleaved and used, together with the measured values x, as an input to the second decoding step by SISO 2. SISO 2 thus gets slightly more accurate values of the systematic bits.
4. SISO 2 derives second extrinsic information (Le2(d)) for each bit using the improved values of the other systematic bits (x and Le1(d)) and the whole second parity sequence (y2).
5. The second extrinsic information Le2(d) is deinterleaved and used, together with the measured values x, as an input to the next decoding step by SISO 1.
6. SISO 1 derives new extrinsic information Le1(d) for each bit using the other systematic bits of x, corrected by their extrinsic information Le2(d), and the first parity sequence (y1).
7. SISO 2 again derives second extrinsic information Le2(d) for each bit using the improved values of the other systematic bits (x and Le1(d)) and the whole second parity sequence (y2).
8. Steps 5 to 7 are repeated until the defined number of cycles is done.
9. When the iteration is finished, SISO 2 produces an estimate for each systematic bit: the measured value x corrected with the final extrinsic information of each bit (both Le1(d) and Le2(d) impact the output).
10. A decision is made for each bit (based on a set threshold) whether it is 0 or 1.
[Figure: PCCC decoder block diagram. The measured systematic bit values x and the first parity sequence y1 feed SISO 1, whose extrinsic output Le1(d) is interleaved and fed, together with x and the second parity sequence y2, to SISO 2; the extrinsic output Le2(d) of SISO 2 is deinterleaved and fed back to SISO 1, and the final soft output L2(d) is thresholded into the detected binary sequence d.]

Figure 7.2.1. PCCC decoder.
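The loop structure of the procedure above can be sketched as follows. The SISO decoders themselves (e.g. MAP/BCJR decoders) are beyond the scope of this text, so siso1() and siso2() are hypothetical functions that return one extrinsic value per bit from their soft inputs; the log-likelihood-style combination and the zero threshold are likewise assumptions of this sketch, not a definitive implementation.

def turbo_decode(x, y1, y2, pi, iterations=18):
    n = len(x)
    le1 = [0.0] * n                              # extrinsic info, initially zero
    le2 = [0.0] * n
    for _ in range(iterations):
        le1 = siso1(x, y1, le2)                  # steps 2 and 6
        le1_i = [le1[pi[i]] for i in range(n)]   # step 3: interleave
        x_i = [x[pi[i]] for i in range(n)]
        le2_i = siso2(x_i, y2, le1_i)            # steps 4 and 7
        for i in range(n):                       # step 5: deinterleave
            le2[pi[i]] = le2_i[i]
    # steps 9 and 10: final soft value of each bit, thresholded at zero
    return [1 if x[i] + le1[i] + le2[i] > 0 else 0 for i in range(n)]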

As explained above, so-called extrinsic information is calculated for each bit by using the other bits of the systematic bit sequence and the first parity sequence. This extrinsic information, learned from the first parity sequence, is then used to correct each measured input bit value. The more accurate values of the systematic bits are used together with the second parity sequence to derive the second extrinsic information sequence, and the result is used together with the systematic sequence to produce a third estimate, etc. It takes about eighteen iteration cycles to get close to the final performance, as shown in Figure 7.3.1 [14, Figure 8.5]. All characteristics of the algorithm are not yet known, but simulations have shown that it gives very good results. There is a lot of ongoing research work to reduce the power consumption of decoder implementations [21].
The brief explanation above is not intended to be a complete description of the decoder operation. Turbo decoding is not a very simple matter and we have to skip the details here. However, if the reader is interested in this topic, a more complete description with an illustrative example is given in reference [20].

7.3. Performance of Turbo Codes


Why do Turbo codes perform so well? There is actually no complete answer available today. However, we have seen that the reason is not a large free distance; in our example it was only 2, which is smaller than the free distance of a typical convolutional code. The reason is related to the interleaver, which mixes the input sequence so that if the first encoder produces a low-weight parity sequence, the other generates a high-weight sequence with high probability. Because of this there are very few low-weight paths and they do not impact the performance very much. In a convolutional code the number of low-weight paths is much higher.
This explains why the error rate of a Turbo code increases much more slowly than that of a convolutional code when the signal-to-noise ratio decreases. Actually, convolutional codes give a lower error rate than Turbo codes when the signal-to-noise ratio is high. However, in practice it is enough to reach a residual error rate in the order of 1·10⁻⁵, and at this error rate Turbo codes perform 2 to 3 dB better than convolutional codes, as shown in Figure 7.3.1 and [14, Figure 8.1].

[Figure: bit error probability (10⁻¹ ... 10⁻⁷) versus Eb/N0 (dB) for a (2, 1, 14) convolutional code and for the example PCCC after 1, 2, 3, 6, 10 and 18 iterations; the Turbo code curves show an error floor at low error rates.]

Figure 7.3.1. Performance of an example PCCC and a reference convolutional code.

Turbo codes have been proposed for many modern systems, such as the Universal Mobile Telecommunications System (UMTS) and power line communications systems. However, their iterative decoding method requires much computation, which consumes power and causes additional delay.

________________________________________________________________________
ECC35.doc

Page 122/128

Tarmo Anttalainen

Error Control Codes

23.1.2013

_________________________________________________________________________

7.4. Application Example, 3GPP Turbo Encoder


The Third Generation Partnership Project (3GPP) specifies one Turbo code for the third generation (3G) mobile radio network as an alternative to a convolutional code. Figure 7.4.1 shows the block diagram of the 3GPP encoder.
[Figure: 3GPP Turbo encoder. The input bits are sent as systematic bits and feed encoder 1, which produces the first parity bits; an interleaver of size 40 ... 5114 bits feeds encoder 2, which produces the second parity bits.]

Figure 7.4.1 3GPP Turbo encoder.

As we see, it contains two identical eight-state encoders. The input sequence is sent as it is and is also used by encoder 1 to produce the first parity sequence. The input sequence is scrambled by an interleaver and used by encoder 2 to produce the second parity sequence. The interleaver size may be anything between 40 and 5114 bits, to adapt the encoder to different channels using different frame sizes. The code rate of this encoder is 1/3.
In 3G Universal Mobile Telecommunication System (UMTS) networks the Turbo code is proposed to be used first in applications that are not delay critical.

Problems
Problem 7.1.1: Use the Turbo encoder example explained in Section 7.1. The input sequence is

u = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, x, x].

Define the tail bits x in the input sequence above in such a way that the first constituent encoder is forced to the all-zero state at the end of the sequence. What is the weight of the complete output sequence of this Turbo encoder without puncturing? What is the state of the second constituent encoder at the end of the sequence?


Index

Automatic Repeat Request, ARQ  2, 82
Backward Error Correction, BEC  2, 82
basis vectors  27
BCH codes  74
Binary Phase Shift Keying, BPSK  5, 108
Binomial coefficient  18
block length  10
Bluetooth  16
Bose-Chaudhuri-Hocquenghem (BCH) Codes  74
burst error pattern  75
channel coding  108
Channel encoder  4
Code Rate  12
Code rate of convolutional code  93
Codewords  10
coding gain  2, 18
Combined channel coding and modulation  6
Compact Discs (CDs)  6
confidence measure  101
conjugate  32
constellation diagram  108
constraint length  92
Continuous (or carrier) Wave, CW, modulations  108
Convolutional Codes  92
coset  45
coset leader  45
CRC Decoder  85
CRC Encoder  84
CRC-12  90
CRC-16  90
CRC-16 (ANSI)  90
CRC-32  90
CRC-CCITT  90
Cyclic Block Codes  56
cyclic burst  75
Cyclic Redundancy Check, CRC  82
Decoding of convolutional code  98
Design of CRC polynomials  89
Digital Video Disks (DVDs)  6
dimension  27
Distance  12
Elementary row operations  35
equivalent code  35
error burst  87
Error correction capability of CRC  87
error detection  41
Error Detection and ARQ  82
Error detection capability of CRC  87
Euclidean distance  109
extension field  24, 28
extrinsic information  121
Field  22
Finite field  22
Fire code  77
Forward Error Correction, FEC  2, 5
free distance  12
Galois Field  22, 23
generator polynomial  56
Generator polynomial of a cyclic code  56
generator-polynomial matrix  94
3GPP Turbo Encoder  123
Hamming code  19
Hamming distance  12
Hamming weight  13, 33
hard-decision decoding  40, 70, 99
High Level Data Link Control, HDLC  90
Interleaving  77
Line encoder  5
linear codes  9, 33
linearly dependent  27
linearly independent  27, 34
Longitudinal Redundancy Check, LRC  15
maximum likelihood decoding  14, 42, 98
minimal polynomial  31, 32
minimum distance  12
Minimum Hamming distance  12
Minimum-Distance Decoding  41
Mod-n addition and multiplication  24
Modulation  5
modulation rate  108
Modulator  5
monic polynomial  32
m-tuple  27
n-tuples  9
Parallel Concatenated Convolutional Codes (PCCC)  3, 116
parity check bits  35
parity check matrix  36
Parity check matrix and error correction capability  37
parity check polynomial  58
Parity check polynomial of a cyclic code  58
Phase modulation, PM  108
Phase Shift Keying, PSK  108
polynomial  25
Power Line Communications systems  122
prime field  30
prime polynomial  27
primitive element  29
primitive polynomial  30
Puncturing  95
Quadrature Amplitude Modulation, QAM  110
Quadrature Phase Shift Keying, QPSK  108
Receiver  5
redundant bits  35
Reed-Solomon codes  74
residual error rate  17
Shannon  3
Shortened code  67
Singleton bound  49
sliding window  7
soft input/soft output (SISO)  120
Soft-decision decoding  40
Source encoder  4
Standard Array  48
Structure of CRC-code  84
subfield  28
survivor path  99
syndrome  21, 41
Syndrome Decoding  70
syndrome polynomial  71
syndromes  21
systematic code  20
systematic codes  9
Systematic Convolutional Code  95
systematic form  35
tail bits  98, 120
transfer function  102
Transmission channel  5
Tree Diagram of a Convolutional Code  95
Trellis Coded Modulation, TCM  3, 6, 108
Trellis Diagram  97
Turbo codes  3
Vertical Redundancy Check, VRC  15
Very Large Scale Integration, VLSI  2
Viterbi Algorithm  101
Weight  13
Weight Distribution  49

References
[1] A. Bruce Carlson, Communication Systems: An Introduction to Signals and Noise in Electrical Communication, McGraw-Hill International Editions, Third Edition, 1988.
[2] Bernard Sklar, Digital Communications, Prentice-Hall International Editions, 1988.
[3] John G. Proakis, Digital Communications, McGraw-Hill International Editions, Third Edition, 1995.
[4] Richard E. Blahut, Theory and Practice of Error Control Codes, Addison-Wesley Publishing Company, 1983.
[5] Chester J. Salwach, Codes that Detect and Correct Errors, The College Mathematics Journal.
[6] Richard E. Blahut, Digital Transmission of Information, Addison-Wesley Publishing Company, 1990.
[7] Harry Leib, Course "Combined Channel Coding with Modulation" material and notes, Helsinki University of Technology, spring 1996.
[8] Raymond Steele, Mobile Radio Communications, Second Edition, IEEE Press/Pentech Press, 1999.
[9] M. Mouly, M. B. Pautet, The GSM System for Mobile Communications.
[10] Jan Ekberg, Seppo J. Halme, Koodausmenetelmät, Otakustantamo 498, 1987.
[11] Andrew S. Tanenbaum, Computer Networks, Prentice-Hall International Editions, 1988.
[12] Gilbert Held, Understanding Data Communications: From Fundamentals to Applications, John Wiley & Sons, 1991 and 1995.
[13] Roger L. Freeman, Practical Data Communications, John Wiley & Sons, New York.
[14] Christian Schlegel, Trellis Coding, IEEE Press, New York, 1997.
[15] Halsall, Data Communications, Computer Networks and Open Systems, Addison-Wesley, Fourth Edition, Appendix A.
[16] ETSI, Specification of the Bluetooth System, Core, v1.0B, Dec. 1st 1999.
[17] Gordon L. Stuber, Principles of Mobile Communication, 2nd Edition, Kluwer Academic Publishers, 2001.
[17] Stephen G. Wilson, Digital Modulation and Coding, Prentice Hall, 1996.
[18] Martin Bossert, Channel Coding for Telecommunications, John Wiley & Sons, 1999.
[19] Ezio Biglieri et al., Introduction to Trellis Coded Modulation with Applications, Macmillan Publishing, New York, 1991.
[20] Bernhard H. Walke, Mobile Radio Networks: Networking and Protocols, John Wiley & Sons, Ltd, 2000.
[21] T. Anttalainen, Turbo Codes, post-graduate seminar presentation at Helsinki University of Technology, 3rd of February 2002.
