
Chapter 2: Source Coding

Graduate Program, Department of Electrical and Computer Engineering

Source Coding
Information sources may be analog or discrete (digital).
Analog sources: audio or video. Discrete sources: computers and storage devices such as magnetic or optical disks.

Whatever the nature of the information source, digital communication systems are designed to transmit information in digital form. Thus the output of a source must be converted into a format that can be transmitted digitally.


Source Coding
This conversion is generally performed by the source encoder, whose output may be assumed to be a sequence of binary digits. Encoding is based on mathematical models of information sources and on a quantitative measure of the information emitted by a source. We shall first develop mathematical models for information sources.


Mathematical Models for Information Sources


Any information source produces an output that is random and can therefore be characterized in statistical terms. A simple discrete source emits a sequence of letters from a finite alphabet.
Example: a binary source produces a binary sequence such as 1011000101100; alphabet: {0,1}. In general, a discrete information source with an alphabet of L possible letters produces a sequence of letters selected from this alphabet.

Assume that each letter of the alphabet has a probability of occurrence $p_k$ such that $p_k = P\{X = x_k\}$, $1 \le k \le L$, and $\sum_{k=1}^{L} p_k = 1$.

Two Source Models


1. Discrete sources
Discrete memoryless source (DMS): the output letters from the source are statistically independent; the current output does not depend on any of the past outputs.

Stationary source: the discrete source outputs are statistically dependent but statistically stationary. That is, two sequences of length n, $(a_1, a_2, \ldots, a_n)$ and $(a_{1+m}, a_{2+m}, \ldots, a_{n+m})$, have identical joint probabilities for all $n \ge 1$ and for all shifts $m$.

2. An analog source has an output waveform $x(t)$ that is a sample function of a stochastic process $X(t)$, where we assume $X(t)$ is a stationary process with autocorrelation $R_{xx}(\tau)$ and power spectral density $\Phi_{xx}(f)$.

Mathematical Models
When $X(t)$ is a band-limited stochastic process such that $\Phi_{xx}(f) = 0$ for $|f| \ge W$, the sampling theorem gives

$$X(t) = \sum_{n=-\infty}^{\infty} X\!\left(\frac{n}{2W}\right) \frac{\sin\!\left[2\pi W\!\left(t - \frac{n}{2W}\right)\right]}{2\pi W\!\left(t - \frac{n}{2W}\right)}$$

where $\{X(n/2W)\}$ denote the samples of the process taken at the sampling rate (Nyquist rate) of $f_s = 2W$ samples/s.

Thus, by applying the sampling theorem, we may convert the output of an analog source into an equivalent discrete-time source, as illustrated in the sketch below.
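
As an added illustration (not part of the original slides), here is a minimal Python sketch of sinc reconstruction from Nyquist-rate samples; the bandwidth W, the test tone, and the time grid are assumptions chosen only for the example.

import numpy as np

W = 4.0                                    # assumed bandwidth in Hz (signal limited to |f| < W)
fs = 2 * W                                 # Nyquist sampling rate, 2W samples/s
t_samp = np.arange(0, 4, 1 / fs)           # sample instants n/(2W)
x_samp = np.cos(2 * np.pi * 3.0 * t_samp)  # example band-limited signal: a 3 Hz tone

def reconstruct(t, t_samp, x_samp, W):
    # Sinc interpolation: X(t) = sum_n X(n/2W) * sinc(2W (t - n/2W)),
    # where np.sinc(u) = sin(pi*u)/(pi*u) matches the slide's sin(2*pi*W(.))/(2*pi*W(.)) term
    return sum(xn * np.sinc(2 * W * (t - tn)) for tn, xn in zip(t_samp, x_samp))

t = np.linspace(1.5, 2.5, 500)             # evaluate away from the window edges
x_hat = reconstruct(t, t_samp, x_samp, W)
err = np.max(np.abs(x_hat - np.cos(2 * np.pi * 3.0 * t)))
print(err)                                 # reconstruction error (small away from the edges)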


Mathematical Models
The output is statistically characterized by the joint pdf $p(x_1, x_2, \ldots, x_m)$ for $m \ge 1$, where $x_n = X(n/2W)$, $1 \le n \le m$, are the random variables corresponding to the samples of $X(t)$. Note that the samples $\{X(n/2W)\}$ are in general continuous-valued and cannot be represented digitally without loss of precision. Quantization may be used so that each sample takes a discrete value, but it introduces distortion (more on this later).


Logarithmic Measure of Information


Consider two discrete random variables X and Y such that $X \in \{x_1, x_2, \ldots, x_n\}$ and $Y \in \{y_1, y_2, \ldots, y_m\}$. Suppose we observe $Y = y_j$ and wish to determine, quantitatively, the amount of information the event $Y = y_j$ provides about the event $X = x_i$, $i = 1, 2, \ldots, n$. Note that if X and Y are statistically independent, $Y = y_j$ provides no information about the occurrence of $X = x_i$. If they are fully dependent, $Y = y_j$ uniquely determines the occurrence of $X = x_i$;
the information content is then simply that provided by the event $X = x_i$ itself.


Logarithmic Measure of Information


A suitable measure that satisfies these conditions is the logarithm of the ratio of the conditional probability $P\{X = x_i \mid Y = y_j\} = P(x_i \mid y_j)$ to the probability $P\{X = x_i\} = p(x_i)$. The information content provided by the occurrence of $Y = y_j$ about the event $X = x_i$ is defined as
$$I(x_i; y_j) = \log \frac{P(x_i \mid y_j)}{P(x_i)}$$

$I(x_i; y_j)$ is called the mutual information between $x_i$ and $y_j$. The unit of the information measure is the nat if the natural logarithm is used and the bit if base 2 is used. Note that $\ln a = (\ln 2)\log_2 a = 0.69315 \log_2 a$.
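
As a quick numerical illustration (an addition to the slides, with an arbitrarily assumed joint pmf), a minimal Python sketch of the event-level mutual information:

import numpy as np

# Assumed joint pmf P(X = x_i, Y = y_j) for a 2x2 example (rows: x_i, columns: y_j)
p_xy = np.array([[0.40, 0.10],
                 [0.15, 0.35]])
p_x = p_xy.sum(axis=1)   # marginal p(x_i)
p_y = p_xy.sum(axis=0)   # marginal p(y_j)

# I(x_i; y_j) = log2[ P(x_i | y_j) / P(x_i) ] = log2[ p(x_i, y_j) / (p(x_i) p(y_j)) ]
i, j = 0, 0
I_ij = np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
print(f"I(x{i}; y{j}) = {I_ij:.4f} bits")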

Logarithmic Measure of Information


If X and Y are independent, $I(x_i; y_j) = 0$. If the occurrence of $Y = y_j$ uniquely determines the occurrence of $X = x_i$, then $I(x_i; y_j) = \log\big(1/p(x_i)\big) = -\log p(x_i) = I(x_i)$, the self-information of the event $X = x_i$. Note that a high-probability event conveys less information than a low-probability event. For an event x with $p(x) = 1$, $I(x) = 0$: the occurrence of a sure event conveys no information.


Logarithmic Measure of Information


Example:
1. A binary source emits either 0 or 1 every $\tau_s$ seconds with equal probability. The information content of each output is then $I(x_i) = -\log_2 p(x_i) = -\log_2(1/2) = 1$ bit, $x_i = 0, 1$.
2. Suppose successive outputs are statistically independent (DMS) and consider a block of binary digits from the source in a time interval of $k\tau_s$. The number of possible k-bit blocks is $M = 2^k$, each of which is equally probable with probability $1/M = 2^{-k}$. Then $I(x_i) = -\log_2 2^{-k} = k$ bits in $k\tau_s$ seconds (note the additive property).
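
A minimal Python check of the additive property for a block of k independent, equiprobable bits (an added illustration; the block length k is an arbitrary assumption):

import math

k = 8                         # assumed block length
p_block = 0.5 ** k            # each of the 2**k equally likely blocks has probability 2**-k
I_block = -math.log2(p_block)
print(I_block)                # 8.0, i.e. k bits of information per block of k source outputs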

Logarithmic Measure of Information


Now consider the following relationships:

$$I(x_i; y_j) = \log \frac{p(x_i \mid y_j)}{p(x_i)} = \log \frac{p(x_i \mid y_j)\, p(y_j)}{p(x_i)\, p(y_j)} = \log \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)}$$

and since

$$p(y_j \mid x_i) = \frac{p(x_i, y_j)}{p(x_i)}$$

it follows that

$$I(x_i; y_j) = \log \frac{p(y_j \mid x_i)}{p(y_j)} = I(y_j; x_i)$$

The information provided by $y_j$ about $x_i$ is therefore the same as that provided by $x_i$ about $y_j$. Using the same procedure as before, we can also define the conditional self-information as

Logarithmic Measure of Information

$$I(x_i \mid y_j) = \log_2 \frac{1}{p(x_i \mid y_j)} = -\log_2 p(x_i \mid y_j)$$

Now consider the following relationship:

$$I(x_i; y_j) = \log \frac{p(x_i \mid y_j)}{p(x_i)} = \log \frac{1}{p(x_i)} - \log \frac{1}{p(x_i \mid y_j)}$$

which leads to

$$I(x_i; y_j) = I(x_i) - I(x_i \mid y_j)$$

Note that $I(x_i \mid y_j)$ is the self-information about the event $X = x_i$ after having observed the event $Y = y_j$.
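
A short Python check of the two identities above, the symmetry of mutual information and $I(x_i; y_j) = I(x_i) - I(x_i \mid y_j)$ (an added illustration using the same arbitrarily assumed 2x2 joint pmf as earlier):

import numpy as np

p_xy = np.array([[0.40, 0.10],    # assumed joint pmf p(x_i, y_j)
                 [0.15, 0.35]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)
i, j = 0, 1

I_xy = np.log2((p_xy[i, j] / p_y[j]) / p_x[i])   # I(x_i; y_j) = log2[p(x_i|y_j)/p(x_i)]
I_yx = np.log2((p_xy[i, j] / p_x[i]) / p_y[j])   # I(y_j; x_i) = log2[p(y_j|x_i)/p(y_j)]
I_x = -np.log2(p_x[i])                           # self-information I(x_i)
I_x_given_y = -np.log2(p_xy[i, j] / p_y[j])      # conditional self-information I(x_i | y_j)

print(np.isclose(I_xy, I_yx))                    # True: I(x_i; y_j) = I(y_j; x_i)
print(np.isclose(I_xy, I_x - I_x_given_y))       # True: I(x_i; y_j) = I(x_i) - I(x_i | y_j)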


Logarithmic Measure of Information


Further, note that $I(x_i) \ge 0$ and $I(x_i \mid y_j) \ge 0$, and thus

$$I(x_i; y_j) > 0 \ \text{ when } I(x_i) > I(x_i \mid y_j), \qquad I(x_i; y_j) < 0 \ \text{ when } I(x_i) < I(x_i \mid y_j)$$

indicating that the mutual information between two events can be either positive or negative.
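
A tiny Python example of an event pair with negative mutual information (an added illustration; the joint pmf is an arbitrary assumption chosen so that observing $y_0$ makes $x_0$ less likely):

import numpy as np

p_xy = np.array([[0.05, 0.45],    # assumed joint pmf: x_0 rarely occurs together with y_0
                 [0.45, 0.05]])
p_x = p_xy.sum(axis=1)            # p(x_0) = 0.5
p_y = p_xy.sum(axis=0)            # p(y_0) = 0.5

I_00 = np.log2(p_xy[0, 0] / (p_x[0] * p_y[0]))
print(I_00)                       # log2(0.05 / 0.25) = log2(0.2) ≈ -2.32 bits, i.e. negative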


Average Mutual Information


The average mutual information between X and Y is given by

$$I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j)\, I(x_i; y_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)}$$

Similarly, the average self-information is given by

$$H(X) = \sum_{i=1}^{n} p(x_i)\, I(x_i) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$$

where X represents the alphabet of possible output letters from the source. H(X) represents the average self-information per source letter and is called the entropy of the source.
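
A minimal Python sketch computing the entropy H(X) and the average mutual information I(X;Y) from a joint pmf (an added illustration; the joint pmf is the same arbitrary 2x2 example used earlier):

import numpy as np

p_xy = np.array([[0.40, 0.10],    # assumed joint pmf p(x_i, y_j)
                 [0.15, 0.35]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

H_X = -np.sum(p_x * np.log2(p_x))                        # entropy of the source alphabet X
# I(X;Y) = sum over (i, j) of p(x_i, y_j) log2[ p(x_i, y_j) / (p(x_i) p(y_j)) ]
I_XY = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))
print(f"H(X) = {H_X:.4f} bits, I(X;Y) = {I_XY:.4f} bits")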

Average Mutual Information


If $p(x_i) = 1/n$ for all i, then the entropy of the source becomes

$$H(X) = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = \log_2 n$$

In general, $H(X) \le \log_2 n$ for any given set of source letter probabilities. Thus, the entropy of a discrete source is maximum when the output letters are all equally probable.


Average Mutual Information


Consider a binary source whose letters {0, 1} are independent, with probabilities $P(x_0 = 0) = q$ and $P(x_1 = 1) = 1 - q$. The entropy of the source is then given by $H(X) = -q \log_2 q - (1 - q)\log_2(1 - q) = H(q)$, whose plot as a function of q is shown below.

[Figure: Binary entropy function H(q) versus q]
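
A minimal Python sketch of the binary entropy function (an added illustration; the grid of q values is arbitrary):

import numpy as np

def binary_entropy(q):
    # H(q) = -q log2 q - (1 - q) log2(1 - q), with H(0) = H(1) = 0 by convention
    q = np.asarray(q, dtype=float)
    H = np.zeros_like(q)
    inside = (q > 0) & (q < 1)
    H[inside] = -q[inside] * np.log2(q[inside]) - (1 - q[inside]) * np.log2(1 - q[inside])
    return H

q = np.linspace(0, 1, 11)
for qi, Hi in zip(q, binary_entropy(q)):
    print(f"q = {qi:.1f}  H(q) = {Hi:.3f}")   # peaks at 1 bit for q = 0.5, zero at q = 0 and 1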


Average Mutual Information


In a similar manner as above, we can define the conditional self-information, or conditional entropy, as follows:

$$H(X \mid Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log_2 \frac{1}{p(x_i \mid y_j)}$$

The conditional entropy is the information, or uncertainty, in X after Y is observed. From the definition of average mutual information one can show that

$$I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j)\big[\log p(x_i \mid y_j) - \log p(x_i)\big] = H(X) - H(X \mid Y)$$
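
A short Python check of I(X;Y) = H(X) - H(X|Y) (an added illustration; the 2x2 joint pmf is the same arbitrary assumption used before):

import numpy as np

p_xy = np.array([[0.40, 0.10],    # assumed joint pmf p(x_i, y_j)
                 [0.15, 0.35]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

p_x_given_y = p_xy / p_y                                 # column j holds p(x_i | y_j)
H_X = -np.sum(p_x * np.log2(p_x))                        # entropy H(X)
H_X_given_Y = -np.sum(p_xy * np.log2(p_x_given_y))       # equivocation H(X|Y)
I_XY = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y))) # average mutual information

print(np.isclose(I_XY, H_X - H_X_given_Y))               # True: I(X;Y) = H(X) - H(X|Y)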


Average Mutual Information


Thus $I(X;Y) = H(X) - H(X \mid Y)$, and $H(X) \ge H(X \mid Y)$, with equality when X and Y are independent. $H(X \mid Y)$ is called the equivocation; it is interpreted as the amount of average uncertainty remaining in X after observation of Y.


Average Mutual Information


H(X): the average uncertainty prior to observation. I(X;Y): the average information provided about the set X by the observation of the set Y. The above results can be generalized to more than two random variables. Suppose we have a block of k random variables $X_1, X_2, \ldots, X_k$ with joint probability $P(x_{j_1}, x_{j_2}, \ldots, x_{j_k})$; the entropy of the block is then given by

$$H(\mathbf{X}) = -\sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_k=1}^{n_k} P(x_{j_1}, x_{j_2}, \ldots, x_{j_k}) \log P(x_{j_1}, x_{j_2}, \ldots, x_{j_k})$$
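
A minimal Python sketch of the block entropy for k = 3 binary random variables (an added illustration; the joint pmf below is an arbitrary assumption, and for a DMS it would factor into a product of marginals):

import numpy as np

# Assumed joint pmf P(x1, x2, x3) over {0,1}^3, indexed as p_joint[x1, x2, x3]
p_joint = np.array([0.20, 0.10, 0.05, 0.15,
                    0.10, 0.05, 0.05, 0.30]).reshape(2, 2, 2)
assert np.isclose(p_joint.sum(), 1.0)

# H(X1, ..., Xk) = -sum over all blocks of P(block) log2 P(block)
H_block = -np.sum(p_joint * np.log2(p_joint))
print(f"H(X1, X2, X3) = {H_block:.4f} bits")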


Continuous Random Variables - Information Measure


If X and Y are continuous random variables with joint pdf f(x, y) and marginal pdfs f(x) and f(y), the average mutual information between X and Y is defined as

$$I(X; Y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x)\, f(y \mid x) \log \frac{f(y \mid x)\, f(x)}{f(x)\, f(y)}\, dx\, dy$$

The concept of self-information does not carry over exactly to continuous random variables, since they would require an infinite number of binary digits to be represented exactly, making their entropies infinite.


Continuous Random Variables


We can, however, define the differential entropy of a continuous random variable as

$$H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx$$

and the average conditional (differential) entropy as

$$H(X \mid Y) = -\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y) \log f(x \mid y)\, dx\, dy$$

The average mutual information is then given by $I(X;Y) = H(X) - H(X \mid Y)$ or $I(X;Y) = H(Y) - H(Y \mid X)$.
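
As an added illustration, a minimal Python sketch that numerically approximates the differential entropy of a Gaussian pdf and compares it with the known closed form (1/2) log2(2*pi*e*sigma^2); the variance and the integration grid are assumptions:

import numpy as np

sigma = 1.5                               # assumed standard deviation
x = np.linspace(-10 * sigma, 10 * sigma, 200001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))   # zero-mean Gaussian pdf

# Differential entropy H(X) = -integral of f(x) log2 f(x) dx, approximated by a Riemann sum
H_numeric = -np.sum(f * np.log2(f)) * dx
H_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)
print(H_numeric, H_closed)                # the two values agree closely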


Continuous Random Variables


If X is discrete and Y is continuous, the density of Y is expressed as

$$f(y) = \sum_{i=1}^{n} f(y \mid x_i)\, p(x_i)$$

The mutual information about X = xi provided by the occurrence of the event Y = y is given by

$$I(x_i; y) = \log \frac{f(y \mid x_i)\, p(x_i)}{f(y)\, p(x_i)} = \log \frac{f(y \mid x_i)}{f(y)}$$
The average mutual information between X and Y is then

$$I(X; Y) = \sum_{i=1}^{n} \int_{-\infty}^{\infty} f(y \mid x_i)\, p(x_i) \log \frac{f(y \mid x_i)}{f(y)}\, dy$$
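
A minimal Python sketch of this mixed case for an assumed example, equiprobable binary X in {-1, +1} observed in additive Gaussian noise, with I(X;Y) evaluated by numerical integration (the letters, probabilities, and noise level are all assumptions):

import numpy as np

levels = np.array([-1.0, 1.0])     # assumed discrete source letters x_i
p_x = np.array([0.5, 0.5])         # assumed letter probabilities
sigma = 1.0                        # assumed noise standard deviation

y = np.linspace(-10, 10, 20001)
dy = y[1] - y[0]

# Conditional densities f(y | x_i): a Gaussian centered on each letter
f_y_given_x = np.array([np.exp(-(y - a)**2 / (2 * sigma**2)) /
                        (sigma * np.sqrt(2 * np.pi)) for a in levels])
f_y = p_x @ f_y_given_x            # f(y) = sum_i f(y | x_i) p(x_i)

# I(X;Y) = sum_i integral of f(y | x_i) p(x_i) log2[ f(y | x_i) / f(y) ] dy
integrand = f_y_given_x * p_x[:, None] * np.log2(f_y_given_x / f_y)
I_XY = np.sum(integrand) * dy
print(f"I(X;Y) ≈ {I_XY:.4f} bits")  # bounded above by H(X) = 1 bit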
