Sunteți pe pagina 1din 84

# Resmi N.G.

Reference:
Digital Image Processing 2nd Edition
Rafael C. Gonzalez
Richard E. Woods
Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 2

 Error-Free Compression
 Variable-Length Coding
 Huffman Coding
 Other Near Optimal Variable Length Codes
 Arithmetic Coding
 LZW Coding
 Bit-Plane Coding
 Bit-Plane Decomposition
 Constant Area Coding
 One-Dimensional Run-Length Coding
 Two-Dimensional Run-Length Coding

##  Lossless Predictive Coding

 Lossy Compression
 Lossy Predictive Coding

## 3/24/2012 CS 04 804B Image Processing Module 3 3

 Transform Coding
Transform Selection

 Bit Allocation

##  Threshold Coding Implementation

 Wavelet Coding
 Wavelet Selection

##  Decomposition Level Selection

 Quantizer Design

##  Image Compression Standards

 Binary Image Compression Standards
 One Dimensional Compression

## 3/24/2012 CS 04 804B Image Processing Module 3 4

 Continuous Tone Still Image Compression Standards
JPEG

 JPEG 2000

## 3/24/2012 CS 04 804B Image Processing Module 3 5

Introduction
 Need for Compression
 Huge amount of digital data
 Difficult to store and transmit
 Solution
 Reduce the amount of data required to represent a digital image
 Remove redundant data
 Transform the data prior to storage and transmission
 Categories
 Information Preserving
 Lossy Compression

## 3/24/2012 CS 04 804B Image Processing Module 3 6

Fundamentals
 Data compression
 Difference between data and information
 Data Redundancy
 If n1 and n2 denote the number of information-carrying
units in two datasets that represent the same information,
the relative data redundancy RD of the first dataset is
defined as 1
RD  1  ,
CR
n1
where, CR  , is called the compression ratio.
n2
3/24/2012 CS 04 804B Image Processing Module 3 7
Case1: n2  n1
CR  1 and RD  0  no redundant data
Case 2 : n2  n1
CR   and RD  1  highly redundant data
significant compression
Case 3 : n2  n1
CR  0 and RD    second dataset contains
more data than the original

## 3/24/2012 CS 04 804B Image Processing Module 3 8

Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 9

Coding Redundancy
 Let a discrete random variable rk in [0,1] represent the
graylevels of an image.
 pr(rk) denotes the probability of occurrence of rk.
nk
pr ( rk )  , k  0,1, 2,...L  1
n
 If the number of pixels used to represent each value of rk
is l(rk), then the average number of bits required to
represent each pixel is
L 1

Lavg   l ( rk ) pr ( rk )
k 0
3/24/2012 CS 04 804B Image Processing Module 3 10
 Hence, the total number of bits required to code an MxN
image is MNLavg.
 For representing an image using an m-bit binary code,
Lavg= m.

## 3/24/2012 CS 04 804B Image Processing Module 3 11

 How to achieve data compression?
 Variable length coding - Assign fewer bits to the more
probable graylevels than to the less probable ones.

## 3/24/2012 CS 04 804B Image Processing Module 3 12

3/24/2012 CS 04 804B Image Processing Module 3 13
Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 14

Interpixel Redundancy
 Related to interpixel correlation within an image.
 The value of a pixel in the image can be reasonably
predicted from the values of its neighbours.
 The gray levels of neighboring pixels are roughly the
same and by knowing gray level value of one of the
neighborhood pixels one has a lot of information about
gray levels of other neighborhood pixels.
 Information carried by individual pixels is relatively
small. These dependencies between values of pixels in the
image are called interpixel redundancy.
3/24/2012 CS 04 804B Image Processing Module 3 15
 Autocorrelation

## 3/24/2012 CS 04 804B Image Processing Module 3 16

3/24/2012 CS 04 804B Image Processing Module 3 17
3/24/2012 CS 04 804B Image Processing Module 3 18
 The autocorrelation coefficients along a single line of
image are computed as
A(n)
 (n) 
A(0)
1 N 1n
where A(n)  
N  n y 0
f ( x, y) f ( x, y  n)

## 3/24/2012 CS 04 804B Image Processing Module 3 19

 To reduce interpixel redundancy, transform it into an
efficient format.
 Example: The differences between adjacent pixels can be
used to represent the image.
 Transformations that remove interpixel redundancies are
termed as mappings.
 If original image can be reconstructed from the dataset,
these mappings are called reversible mappings.

## 3/24/2012 CS 04 804B Image Processing Module 3 20

3/24/2012 CS 04 804B Image Processing Module 3 21
Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 22

Psychovisual Redundancy
 Based on human perception
 Associated with real or quantifiable visual information.
 Elimination of psychovisual redundancy results in loss of
quantitative information. This is referred to as
quantization.
 Quantization – mapping of a broad range of input values
to a limited number of output values.
 Results in lossy data compression.

## 3/24/2012 CS 04 804B Image Processing Module 3 23

Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 24

Fidelity Criteria
 Objective fidelity criteria
 When the level of information loss can be expressed as a
function of original (input) image and the compressed and
subsequently decompressed output image.
 Example: Root Mean Square error between input and
output images. 
e ( x , y )  f ( x , y )  f ( x, y )
1
 1 M 1 N 1
 
 
2 2
erms    
x 0 y 0 
f ( x, y )  f ( x, y )  
 
 MN
3/24/2012 CS 04 804B Image Processing Module 3 25
 Mean Square Signal-to-Noise Ratio
M 1 N 1 

 x 0 y 0
f ( x, y ) 2
SNRms  M 1 N 1 2
 

  
x 0 y 0 
f ( x , y )  f ( x, y ) 

## 3/24/2012 CS 04 804B Image Processing Module 3 26

 Subjective fidelity criteria
 Measures image quality by subjective evaluations of a
human observer.

## 3/24/2012 CS 04 804B Image Processing Module 3 27

Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 28

Image Compression Models

## 3/24/2012 CS 04 804B Image Processing Module 3 29

 Encoder – Source encoder + Channel encoder

##  Source encoder – removes coding, interpixel, and

psychovisual redundancies in input image and outputs a
set of symbols.

##  Channel encoder – To increase the noise immunity of the

output of source encoder.

## 3/24/2012 CS 04 804B Image Processing Module 3 30

 Source Encoder

 Mapper
 Transforms input data into a format designed to reduce
interpixel redundancies in input image.
 Reversible process generally
 May or may not reduce directly the amount of data required
to represent the image.
 Examples
 Run-length coding(directly results in data compression)
 Transform coding

## 3/24/2012 CS 04 804B Image Processing Module 3 31

 Quantizer
 Reduces the accuracy of the mapper’s output in
accordance with some pre-established fidelity criterion.
 Reduces the psychovisual redundancies of the input
image.
 Irreversible process (irreversible information loss)
 Must be omitted when error-free compression is desired.

## 3/24/2012 CS 04 804B Image Processing Module 3 32

 Symbol encoder
 Creates a fixed- or variable-length code to represent the
quantizer output and maps the output in accordance with
the code.
 Usually, a variable-length code is used to represent the
mapped and quantized output.
 Assigns the shortest codewords to the most frequently
occuring output values.
 Reduces coding redundancy.
 Reversible process

 Source decoder

 Symbol decoder
 Inverse Mapper

## 3/24/2012 CS 04 804B Image Processing Module 3 34

 Channel Encoder and Decoder

##  Essential when the channel is noisy or error-prone.

 Source encoded data – highly sensitive to channel noise.
 Channel encoder reduces the impact of channel noise by
inserting controlled form of redundancy into the source
encoded data.

 Example
 Hamming Code – appends enough bits to the data being
encoded to ensure that two valid codewords differ by a
minimum number of bits.

## 3/24/2012 CS 04 804B Image Processing Module 3 35

 7-bit Hamming(7,4) Code
 7-bit codewords
 4-bit word
 3 bits of redundancy
 Distance between two valid codewords (the minimum
number of bit changes required to change from one code to
another) is 3.
 All single-bit errors can be detected and corrected.
 Hamming distance between two codewords is the number
of places where the codewords differ.
 Minimum Distance of a code is the minimum number of
bit changes between any two codewords.
 Hamming weight of a codeword is equal to the number of
non-zero elements (1’s) in the codeword.

## 3/24/2012 CS 04 804B Image Processing Module 3 36

Binary data Hamming Codeword
b3b2b1b0 h1h2h3h4h5h6h7
0000 0000000
0001 1101001
0010 0101010
0011 1000011
0100 1001100
0101 0100101
0110 1100110
0111 0001111
3/24/2012 CS 04 804B Image Processing Module 3 37
Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 38

Basics of Probability

Ref: http://en.wikipedia.org/wiki/Probability
3/24/2012 CS 04 804B Image Processing Module 3 39
Ref: http://en.wikipedia.org/wiki/Probability
3/24/2012 CS 04 804B Image Processing Module 3 40
Ref: http://en.wikipedia.org/wiki/Probability

## 3/24/2012 CS 04 804B Image Processing Module 3 41

Elements of Information Theory
 Measuring Information
 A random event E occuring with probability P(E) is said
to contain
1
I ( E )  log   log( P( E ))
P( E )
 units of information.
 I(E) is called the self-information of E.
 Amount of self-information of an event E is inversely
related to its probability.

## 3/24/2012 CS 04 804B Image Processing Module 3 42

 If P(E) = 1, I(E) = 0. That is, there is no uncertainty
associated with the event.
 No information is conveyed because it is certain that the
event will occur.
 If base m logarithm is used, the measurement is in m-ary
units.
 If base is 2, the measurement is in binary units. The unit of
information is called a bit.
 If P(E) = ½, I(E) = -log (½) = 1 bit. That is, 1 bit of
information is conveyed when one of the two possible
equally likely outcomes occur.

## 3/24/2012 CS 04 804B Image Processing Module 3 43

Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 44

The Information Channel
 Information channel is the physical medium that connects
the information source to the user of information.
 Self-information is transferred between an information
source and a user of the information, through the
information channel.
 Information source – Generates a random sequence of
symbols from a finite or countably infinite set of possible
symbols.
 Output of the source is a discrete random variable.

## 3/24/2012 CS 04 804B Image Processing Module 3 45

 The set of source symbols or letters{a1, a2, …, aJ} is
referred to as the source alphabet A.
 The probability of the event that the source will produce
symbol aj is P(aj).
J

 P(a )  1
j 1
j

## z   P(a1 ), P(a2 ),..., P(aJ )

T
 The Jx1 vector is used to
represent the set of all source symbol probabilities.
 The finite ensemble (A,z) describes the information source
completely.

## 3/24/2012 CS 04 804B Image Processing Module 3 46

 The probability that the discrete source will emit symbol
aj is P(aj).
 Therefore, the self-information generated by the
production of a single source symbol is,

I (a j )   log P( a j )

##  If k source symbols are generated, the average self-

information obtained from k outputs is
 kP(a1 ) log P(a1 )  kP(a2 ) log P(a2 )  ...  kP(aJ ) log P(aJ )
J
 k  P(a j ) log P(a j )
j 1

## 3/24/2012 CS 04 804B Image Processing Module 3 47

 The average information per source output, denoted as
H(z), is
J
H (z )  E[ I (z )]   P(a j ) I (a j )
j 1
J J
1
  P(a j ) log   P(a j ) log P(a j )
j 1 P(a j ) j 1

##  This is called the uncertainty or entropy of the source.

 It is the average amount of information (in m-ary units
per symbol) obtained by observing a single source
output.
 If the source symbols are equally probable, the entropy is
maximized and the source provides maximum possible
average information per source symbol.
3/24/2012 CS 04 804B Image Processing Module 3 48
3/24/2012 CS 04 804B Image Processing Module 3 49
 A simple information system

##  Output of the channel is also a discrete random variable

which takes on values from a finite or countably infinite
set of symbols {b1, b2, …, bK} called the channel alphabet
B.

 The finite ensemble (B,v), where v   P(b1 ), P(b2 ),..., P(bJ )T
describes the channel output completely and thus the

## 3/24/2012 CS 04 804B Image Processing Module 3 50

 The probability P(bk) of a given channel output and the
probability distribution of the source z are related as
J
P(bk )   P(bk | a j ) P(a j )
j 1

## where P(bk | a j ) is the conditional probability that

the output symbol bk is received , given that the
source symbol a j was generated .

## 3/24/2012 CS 04 804B Image Processing Module 3 51

 Forward Channel Transition Matrix or Channel Matrix

 P  b1 | a1  P  b1 | a2  ... P  b1 | aJ  
 
 P  b2 | a1  P  b2 | a2  ... P  b2 | aJ  
Q
 : : ... : 
 
 P  bK | a1  P  bK | a2  ... P  bK | aJ  

##  Matrix element, qkj  P  bk | a j 

 The probability distribution of the output alphabet can be
computed from
 v = Qz

## 3/24/2012 CS 04 804B Image Processing Module 3 52

 Conditional entropy function
Entropy
J J
H (z )  E[ I (z )]   P(a j ) I (a j )   P (a j ) log P(a j )
j 1 j 1

##  Conditional entropy function

J
H (z | bk )  E[ I (z | bk )]   P (a j | bk ) I (a j | bk )
j 1
J
  P(a j | bk ) log P(a j | bk )
j 1

## where P (a j | bk ) is the probability that symbol a j is

transmitted by the source given that the user receives bk .
3/24/2012 CS 04 804B Image Processing Module 3 53
 The expected or average value over all bk is
K
H (z | v )   H (z | bk ) P(bk )
k 1
K  J 
    P(a j | bk ) log P(a j | bk )  P(bk )
k 1  j 1 
K J
  P(a j | bk ) P(bk ) log P(a j | bk )
k 1 j 1

P(a j , bk )
Conditional Probability, P (a j | bk ) 
P(bk )
K J
 H (z | v )   P(a j , bk ) log P(a j | bk )
k 1 j 1

## 3/24/2012 CS 04 804B Image Processing Module 3 54

 P(aj,bk) is the joint probability of aj and bk. That is, the
probability that aj is transmitted and bk is received.
 Mutual information
 H(z) is the average information per source symbol,
assuming no knowledge of the output symbol.
 H(z|v) is the average information per source symbol,
assuming observation of the output symbol.
 The difference between H(z) and H(z|v) is the average
information received upon observing the output symbol,
and is called the mutual information of z and v, given by
 I(z|v) = H(z) - H(z|v)

## 3/24/2012 CS 04 804B Image Processing Module 3 55

I (z | v)  H (z )  H (z | v)
 J   J K 
   P(a j ) log P(a j )     P(a j , bk ) log P( a j | bk ) 
 j 1   j 1 k 1 
J J K
  P(a j ) log P(a j )   P(a j , bk ) log P( a j | bk )
j 1 j 1 k 1

K
  P(a j , bk )
k 1

## 3/24/2012 CS 04 804B Image Processing Module 3 56

J K J K
I (z | v )   P( a j , bk ) log P( a j )   P( a j , bk ) log P( a j | bk )
j 1 k 1 j 1 k 1
J K P (a j | bk )
  P (a j , bk ) log
j 1 k 1 P(a j )
J K P (a j , bk )
  P (a j , bk ) log
j 1 k 1 P (a j ) P (bk )

## 3/24/2012 CS 04 804B Image Processing Module 3 57

P (a j , bk )  P (a j | bk ).P (bk )
P (a j , bk )  P (bk | a j ).P (a j )
J K P (bk | a j ).P (a j )
I (z | v )   P (bk | a j ).P (a j ) log
j 1 k 1 P (a j ) P (bk )
J K qkj .P (a j )
  qkj .P (a j ) log
j 1 k 1 P (a j ) P (bk )
J K qkj
  qkj .P (a j ) log
j 1 k 1 P (bk )
J K qkj
  qkj .P (a j ) log
j1 k 1 P (bk )
3/24/2012 CS 04 804B Image Processing Module 3 58
J
P (bk )   P (bk | a j ) P (a j )
j 1
J K qkj
I (z | v )   qkj .P(a j ) log J
j 1 k 1
 P(b
i 1
k | ai ) P(ai )
J K qkj
  qkj .P(a j ) log J
j 1 k 1
q
i 1
ki P(ai )

## 3/24/2012 CS 04 804B Image Processing Module 3 59

 The minimum possible value of I(z|v) is zero.
 Occurs when the input and output symbols are statistically
independent.
 That is, when P(aj,bk) = P(aj)P(bk).
J K P (a j , bk )
I(z | v )   P( a j , bk ) log
j 1 k 1 P (a j ) P (bk )
J K P (a j ) P (bk )
  P (a j , bk ) log
j 1 k 1 P (a j ) P (bk )
J K
  P (a j , bk ) log1  0
j 1 k 1

## 3/24/2012 CS 04 804B Image Processing Module 3 60

 Channel Capacity
 The maximum value of I(z|v) over all possible choices of
source probabilities in the vector z is called the capacity,
C, of the channel described by channel matrix Q.
C  max[I(z | v)]
z

##  Channel capacity is the maximum rate at which

information can be transmitted reliably through the
channel.
 Binary information source
 Binary Symmetric Channel (BSC)

## 3/24/2012 CS 04 804B Image Processing Module 3 61

 Binary Information Source
Source alphabet A  {a1 , a2 }  0, 1
P  a1  pbs , P  a 2   1- pbs  p bs
Entropy of source,
H (z )   pbs log 2 pbs  p bs log 2 p bs
where z   P  a1 , P  a 2     pbs ,1- pbs 
T T

##   pbs log 2 pbs  p bs log 2 p bs  is called the binary entropy

 
function denoted as H bs (.)
For example, H bs (t )  t log 2 t  t log 2 t
3/24/2012 CS 04 804B Image Processing Module 3 62
3/24/2012 CS 04 804B Image Processing Module 3 63
 Binary Symmetric Channel (Noisy Binary Information
Channel)
Let the probability of error during transmission
of any symbol be pe .
Channel matrix for BSC
 P (b1 | a1 ) P (b1 | a2 ) 
Q 
 2 1
P (b | a ) P (b2 | a 2 
)
 P (0 | 0) P (0 |1) 
 
 P (1| 0) P (1|1) 
1  pe pe   pe pe 
   
 pe 1  pe   pe p e 
3/24/2012 CS 04 804B Image Processing Module 3 64
Output alphabet B  {b1 , b 2 }  0, 1
v   P  b1  , P  b 2     P  0  , P 1 
T T

## The probabilities of the receiving output symbols

b1 and b2 can be determined by,
v  Qz
 pe pe   pbs 
=  
 pe p e   p bs 
 P(0)  p e pbs  pe p bs
P (1)  pe pbs  p e p bs
3/24/2012 CS 04 804B Image Processing Module 3 65
 The mutual information of BSC can be computed as
 2 2 qkj
I (z | v )   qkj .P(a j ) log 2 2
j 1 k 1
q
i 1
ki P(ai )

q11
 q11.P(a1 ) log 2
q11 P(a1 )  q12 P(a2 )
q21
 q21.P(a1 ) log 2
q21 P(a1 )  q22 P(a2 )
q12
 q12 .P(a2 ) log 2
q11 P(a1 )  q12 P(a2 )
q22
 q22 .P(a2 ) log 2
q21 P(a1 )  q22 P(a2 )
3/24/2012 CS 04 804B Image Processing Module 3 66
pe pe
 p e . pbs log 2  pe . pbs log 2
p e pbs  pe p bs pe pbs  p e p bs
pe pe
 pe . p bs log 2  p e . p bs log 2
p e pbs  pe p bs pe pbs  p e p bs
 
 p e . pbs log 2 p e  p e . pbs log 2 p e pbs  pe p bs

##  pe . pbs log 2 p  p . p log  p p  p p 

e e bs 2 e bs e bs

 pe . p bs log 2 p  p . p log  p p  p p 
e e bs 2 e bs e bs

 p e . p bs log 2 p  p . p log  p p  p p 
e e bs 2 e bs e bs

 H bs ( pe pbs  p e p bs )  H bs ( pe )
where H bs (.)    pbs log 2 pbs  p bs log 2 p bs 
3/24/2012 CS 04 804B Image Processing Module 3 67
 Capacity of BSC
 Maximum of mutual information over all source distributions.
T
1 1 1
I (z | v ) is max imum when pbs is .This corresponds to z   ,  .
2 2 2
1 1
 I (z | v )  H bs ( pe  p e )  H bs ( pe )
2 2
1 1
 H bs ( pe  (1  pe ) )  H bs ( pe )
2 2
1
 H bs    H bs ( pe )
2
1 1 1 1
  log 2  log 2  H bs ( pe )
2 2 2 2
 1  H bs ( pe )
3/24/2012 CS 04 804B Image Processing Module 3 68
3/24/2012 CS 04 804B Image Processing Module 3 69
Overview
 Introduction
 Fundamentals
 Coding Redundancy
 Interpi xel Redundancy
 Psychovisual Redundancy
 Fidelity Criteria
 Image Compression Models
 Source Encoder and Decoder
 Channel Encoder and Decoder
 Elements of Information Theory
 Measuring Information
 The Information Channel
 Fundamental Coding Theorems
 Noiseless Coding Theorem
 Noisy Coding Theorem
 Source Coding Theorem

## 3/24/2012 CS 04 804B Image Processing Module 3 70

Fundamental Coding Theorems

## 3/24/2012 CS 04 804B Image Processing Module 3 71

 The Noiseless Coding Theorem or Shannon’s First
Theorem or Shannon’s Source Coding Theorem for
Lossless Data Compression
 When both the information channel and communication
system are error-free
 Defines the minimum average codeword length per source
symbol that can be achieved.
 Aim: to represent source as compact as possible.
 Let the information source (A,z), with statistically
independent source symbols, output an n-tuple of symbols
from source alphabet A. Then, the source output takes on
one of the Jn possible values, denoted by, αi , from
A'  {1 ,  2 ,  3 , ,  J n }

## 3/24/2012 CS 04 804B Image Processing Module 3 72

Probability of a given  i , P ( i ) is related to single  symbol
probabilities as
P( i )  P(a j1 ) P(a j 2 )... P(a jn )
z '  {P(1 ), P( 2 ),..., P( J n )}
Entropy of the sourceis given by
Jn
H (z ')   P( i ) log P( i )
i 1
Jn
   P(a j1 ) P(a j 2 )... P(a jn )  log  P(a j1 ) P(a j 2 )... P(a jn ) 
i 1

H (z ')  nH (z )

## 3/24/2012 CS 04 804B Image Processing Module 3 73

 Hence, the entropy of the zero-memory source is n times
the entropy of the corresponding single symbol source.
Such a source is called the nth extension of single-symbol
source.
1
Self information of  i is log .
P( i )
1 1
log  l ( i )  log 1
P( i ) P( i )
αi is therefore represented by a codeword whoselength
is the smallest integer exceeding the self - information
of αi .

## 3/24/2012 CS 04 804B Image Processing Module 3 74

1 1
P ( i ) log  P( i )l ( i )  P ( i ) log  P( i )
P ( i ) P( i )
Jn Jn Jn
1 1

i 1
P ( i ) log   P( i )l ( i )   P( i ) log
P ( i ) i 1 i 1 P( i )
1

## H (z ')  L 'avg  H ( z ')  1

Jn
where L 'avg   P ( i )l ( i )
i 1

## H (z ') L 'avg H (z ')  1

 
n n n
L 'avg 1
H (z )   H (z ) 
n n
 L 'avg 
lim    H (z )
n 
 n 
3/24/2012 CS 04 804B Image Processing Module 3 75
 Shannon’s source coding theorem for lossless data
compression states that for any code used to represent the
symbols from a source, the minimum number of bits
required to represent the source symbols on an average
must be atleast equal to the entropy of the source.
L 'avg 1
H (z )   H (z) 
n n
The efficiency  of any encoding strategy can be defined as
nH (z )

L 'avg
H (z ')

L 'avg
3/24/2012 CS 04 804B Image Processing Module 3 76
 The Noisy Coding Theorem or Shannon’s Second
Theorem
 When the channel is noisy or prone to error
 Aim: to encode information so that the communication is
made reliable and the error is minimized.

##  Use of repetitive coding scheme

 Encode nth extension of source using K-ary code
sequences of length r, Kr ≤ Jn.
 Select only φ of the Kr possible code sequences as valid
codewords.

## 3/24/2012 CS 04 804B Image Processing Module 3 77

 A zero-memory information source generates information
at a rate equal to its entropy.
 The nth extension of the source provides information at a
rate of H (z ') information units per symbol.
n

##  If the information is coded, the maximum rate of coded

information is log(φ/r) and occurs when the φ valid
codewords used to code the source are equally probable.
 Hence, a code of size φ and block length r is said to have a
rate of 
 R  log
r
 information units per symbol.

## 3/24/2012 CS 04 804B Image Processing Module 3 78

 The noisy coding theorem thus states that for any R<C,
where C is the capacity of the zero-memory channel with
matrix Q, there exists an integer r, and code of block
length r and rate R such that the probability of a block
decoding error is less than or equal to ε for any ε>0.

##  That is, the probability of error can be made arbitrarily

small so long as the coded message rate is less than the
capacity of the channel.

## 3/24/2012 CS 04 804B Image Processing Module 3 79

 The Source Coding Theorem for Lossy Data
Compression
 When channel is error-free, but communication process is
lossy.
 Aim: information compression
 To determine the smallest rate at which information about
the source can be conveyed to the user.
 To encode the source so that the average distortion is less
than a maximum allowable level D.
 Let the information source and ecoder output be defined
by (A,z) and (B,v) respectively.
 A nonnegative cost function ρ(aj,bk), called distortion
measure, is used to define the penalty associated with
reproducing source output aj with decoder output bk.
3/24/2012 CS 04 804B Image Processing Module 3 80
Average value of distortion is given by
J K
d (Q)    (a j , bk ) P(a j , bk )
j 1 k 1
J K
   (a j , bk ) P(a j )qkj
j 1 k 1

## where Q is the channel matrix.

Rate distortion function R ( D ) is defined as
R( D)  min  I (z, v ) 
QQD

## where QD  {qkj | d (Q)  D} is the set of all

D  admissible encoding  decoding procedures.
3/24/2012 CS 04 804B Image Processing Module 3 81
 If D = 0, R(D) is less than or equal to the entropy of the
source, or R(0)≤H(z).

R( D)  min  I (z, v)defines the minimum rate at
QQD
which information can be conveyed to user subject to the
constraint that the average distortion be less than or equal
to D. K

k 1

##  d(Q) = D indicates that the minimum information rate

occurs when the maximum possible distortion is allowed.

## 3/24/2012 CS 04 804B Image Processing Module 3 82

 Shannon’s Source Coding Theorem for Lossy Data
Compression states that for a given source (with all its
statistical properties known) and a given distortion
measure, there is a function, R(D), called the rate-
distortion function such that if D is the tolerable amount
of distortion, then R(D) is the best possible compression
rate.
 The theory of lossy data compression is also known as
rate distortion theory.

##  The lossless data compression theory and lossy data

compression theory are collectively known as the source
coding theory.
3/24/2012 CS 04 804B Image Processing Module 3 83
Thank You