Source Coding
Information sources may be analog or discrete (digital)
Analog sources: audio or video
Discrete sources: computers, storage devices such as magnetic or optical devices
Whatever the nature of the information source, digital communication systems are designed to transmit information in digital form. Thus, the output of the source must be converted into a format that can be transmitted digitally.
This conversion is generally performed by the source encoder, whose output may generally be assumed to be a sequence of binary digits. Encoding is based on mathematical models of information sources and on a quantitative measure of the information emitted by a source. We shall first develop mathematical models for information sources.
Mathematical Models
A discrete source emits a sequence of letters from a finite alphabet $\{x_1, x_2, \ldots, x_L\}$. Assume that each letter of the alphabet has a probability of occurrence $p_k$ such that

$$p_k = P\{X = x_k\}, \quad 1 \le k \le L, \qquad \sum_{k=1}^{L} p_k = 1$$
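As a concrete illustration, the following is a minimal Python sketch of such a source; the four-letter alphabet, its probabilities, and the sequence length are made-up values for demonstration.

```python
import numpy as np

# Hypothetical L = 4 letter alphabet with assumed probabilities p_k.
letters = ['a', 'b', 'c', 'd']
p = np.array([0.5, 0.25, 0.125, 0.125])
assert np.isclose(p.sum(), 1.0)              # the p_k must sum to 1

# Draw 20 letters independently according to p_k (a memoryless source).
rng = np.random.default_rng(0)
sequence = rng.choice(letters, size=20, p=p)
print(''.join(sequence))
```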
Stationary source: The discrete source outputs are statistically dependent but statistically stationary
That is, two sequences of length $n$, $(a_1, a_2, \ldots, a_n)$ and $(a_{1+m}, a_{2+m}, \ldots, a_{n+m})$, have identical joint probabilities for all $n \ge 1$ and for all shifts $m$.
An analog source has an output waveform $x(t)$ that is a sample function of a stochastic process $X(t)$, where we assume $X(t)$ is a stationary process with autocorrelation $R_{xx}(\tau)$ and power spectral density $\Phi_{xx}(f)$.
When $X(t)$ is a band-limited stochastic process such that $\Phi_{xx}(f) = 0$ for $|f| > W$, by the sampling theorem

$$X(t) = \sum_{n=-\infty}^{\infty} X\!\left(\frac{n}{2W}\right) \frac{\sin\!\left(2\pi W\left(t - \frac{n}{2W}\right)\right)}{2\pi W\left(t - \frac{n}{2W}\right)}$$

where $\{X(n/2W)\}$ denote the samples of the process taken at the sampling (Nyquist) rate of $f_s = 2W$ samples/s.
Thus, by applying the sampling theorem, we may convert the output of an analog source into an equivalent discrete-time source.
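A minimal numerical sketch of this idea (the bandwidth, test signal, and window length are assumed values, not from the text): sample a band-limited signal at the Nyquist rate and rebuild it with the sinc interpolation series above.

```python
import numpy as np

W = 4.0                                    # assumed bandwidth (Hz)
fs = 2 * W                                 # Nyquist rate: 2W samples/s
x = lambda t: np.cos(2 * np.pi * 3.0 * t)  # a 3 Hz tone, band-limited below W

t_samp = np.arange(0, 8, 1 / fs)           # sample instants n/2W
samples = x(t_samp)

# Sinc interpolation: X(t) = sum_n X(n/2W) sinc(2W(t - n/2W)).
# np.sinc(u) = sin(pi u)/(pi u), so this matches the series above.
t = np.linspace(3.0, 5.0, 400)             # evaluate away from the edges
x_hat = np.array([np.sum(samples * np.sinc(2 * W * (ti - t_samp)))
                  for ti in t])

# With a finite window the series is truncated, so the error is small but
# nonzero; it shrinks as more samples are included.
print(np.max(np.abs(x_hat - x(t))))
```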
The output is then statistically characterized by the joint pdf $p(x_1, x_2, \ldots, x_m)$ for $m \ge 1$, where $x_n = X(n/2W)$, $1 \le n \le m$, are the random variables corresponding to the samples of $X(t)$. Note that the samples $\{X(n/2W)\}$ are in general continuous and cannot be represented digitally without loss of precision. Quantization may be used so that each sample takes a discrete value, but it introduces distortion (more on this later).
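To make the distortion concrete, here is a minimal sketch of a uniform (midrise) quantizer applied to Gaussian samples; the bit depth, overload point, and source statistics are illustrative assumptions, not values from the text.

```python
import numpy as np

def uniform_quantize(x, n_bits, x_max):
    """Midrise uniform quantizer with 2**n_bits levels over [-x_max, x_max]."""
    levels = 2 ** n_bits
    step = 2 * x_max / levels                     # quantization step size
    idx = np.clip(np.floor(x / step), -levels // 2, levels // 2 - 1)
    return (idx + 0.5) * step                     # mid-point reconstruction

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 100_000)                 # Gaussian source samples
xq = uniform_quantize(x, n_bits=4, x_max=4.0)
print("mean-squared distortion:", np.mean((x - xq) ** 2))
```

Each quantized sample now takes one of 16 discrete values representable with 4 bits, at the cost of the printed mean-squared error.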
$I(x_i, y_j)$ is called the mutual information between $x_i$ and $y_j$. The unit of the information measure is the nat if the natural logarithm is used and the bit if base 2 is used. Note that $\ln a = \ln 2 \cdot \log_2 a = 0.69315 \log_2 a$.
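A one-line numerical check of this conversion between nats and bits:

```python
import math

# ln(a) = ln(2) * log2(a): information in nats is 0.69315x the value in bits.
a = 10.0
print(math.log(a))                  # 2.302585... nats
print(math.log(2) * math.log2(a))   # the same number via base-2 logs
print(math.log(2))                  # 0.693147... nats per bit
```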
From the definition,

$$I(x_i, y_j) = \log \frac{p(x_i/y_j)}{p(x_i)} = \log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)}$$

and since

$$p(y_j/x_i) = \frac{p(x_i, y_j)}{p(x_i)}$$

it follows that

$$I(x_i, y_j) = \log \frac{p(y_j/x_i)}{p(y_j)} = I(y_j, x_i)$$
The information provided by $y_j$ about $x_i$ is therefore the same as that provided by $x_i$ about $y_j$. Using the same procedure as before, we can also define the conditional self-information as
$$I(x_i/y_j) = \log_2 \frac{1}{p(x_i/y_j)} = -\log_2 p(x_i/y_j)$$

Writing the mutual information as

$$I(x_i, y_j) = \log_2 \frac{p(x_i/y_j)}{p(x_i)} = -\log_2 p(x_i) - \left(-\log_2 p(x_i/y_j)\right)$$

leads to

$$I(x_i, y_j) = I(x_i) - I(x_i/y_j)$$
Note that $I(x_i/y_j)$ is the self-information about the event $X = x_i$ after having observed the event $Y = y_j$.
Since $I(x_i) \ge 0$ and $I(x_i/y_j) \ge 0$, we have $I(x_i, y_j) > 0$ when $I(x_i) > I(x_i/y_j)$ and $I(x_i, y_j) < 0$ when $I(x_i) < I(x_i/y_j)$, indicating that the mutual information between two events can be either positive or negative.
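A small numerical illustration with an assumed 2x2 joint pmf: when observing $y_j$ makes $x_i$ less likely than it was a priori, the mutual information comes out negative.

```python
import numpy as np

# Assumed 2x2 joint pmf p(x_i, y_j); rows index x, columns index y.
p_xy = np.array([[0.1, 0.4],
                 [0.4, 0.1]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

for i in range(2):
    for j in range(2):
        I_x = -np.log2(p_x[i])                       # I(x_i)
        I_x_given_y = -np.log2(p_xy[i, j] / p_y[j])  # I(x_i / y_j)
        print(i, j, round(I_x - I_x_given_y, 3))     # I(x_i, y_j)
# Here the (0,1) and (1,0) pairs give positive mutual information,
# while (0,0) and (1,1) give negative values.
```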
The average mutual information between $X$ and $Y$ is obtained by averaging $I(x_i, y_j)$ over all pairs:

$$I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)}$$

Similarly, the average self-information is

$$H(X) = \sum_{i=1}^{n} p(x_i) I(x_i) = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$
where $X$ represents the alphabet of possible output letters from a source. $H(X)$ represents the average self-information per source letter and is called the entropy of the source.
In general, $H(X) \le \log n$ for any given set of source letter probabilities.
Thus, the entropy of a discrete source is maximum when the output letters are all equally probable
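A minimal sketch that computes $H(X)$ for an assumed probability vector and checks that the uniform distribution attains the maximum $\log_2 n$:

```python
import numpy as np

def entropy(p):
    """H(X) = -sum_k p_k log2 p_k in bits, treating 0*log(0) as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

n = 4
print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits
print(entropy(np.full(n, 1 / n)))           # 2.0 bits = log2(4), the maximum
```

The first value, 1.75 bits, falls below $\log_2 4 = 2$ bits because the letters are not equiprobable.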
$$H(X/Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log \frac{1}{p(x_i/y_j)}$$
The conditional entropy is the information (or uncertainty) in $X$ after $Y$ is observed. From the definition of average mutual information, one can show that
$$I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \left[\log p(x_i/y_j) - \log p(x_i)\right] = -H(X/Y) + H(X)$$

that is, $I(X; Y) = H(X) - H(X/Y)$.
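A quick numerical verification with an assumed joint pmf, computing $I(X;Y)$ both from its definition and as $H(X) - H(X/Y)$:

```python
import numpy as np

# Assumed 2x2 joint pmf p(x_i, y_j); rows index x, columns index y.
p_xy = np.array([[0.1, 0.4],
                 [0.4, 0.1]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

H_x = -np.sum(p_x * np.log2(p_x))
p_x_given_y = p_xy / p_y                        # p(x_i/y_j), column-wise
H_x_given_y = -np.sum(p_xy * np.log2(p_x_given_y))

# Direct definition of I(X;Y) for comparison.
I_xy = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))
print(I_xy, H_x - H_x_given_y)                  # the two agree
```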
For a block $\mathbf{X} = (X_1, X_2, \ldots, X_k)$ of $k$ source letters, the entropy of the block is

$$H(\mathbf{X}) = -\sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_k=1}^{n_k} P(x_{j_1}, x_{j_2}, \ldots, x_{j_k}) \log P(x_{j_1}, x_{j_2}, \ldots, x_{j_k})$$
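A minimal sketch of this joint entropy for an assumed block of $k = 2$ binary letters (the pmf values are illustrative):

```python
import numpy as np

# Assumed joint pmf P(x_{j1}, x_{j2}) for a block of k = 2 binary letters.
P = np.array([[0.35, 0.15],
              [0.15, 0.35]])
H_block = -np.sum(P * np.log2(P))      # entropy of the 2-letter block
print(H_block, "bits per block,", H_block / 2, "bits per letter")
```

Because the two letters are statistically dependent here, the per-letter entropy comes out below the 1 bit a single equiprobable binary letter would carry.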
The concept of self-information does not carry over exactly to continuous random variables, since representing a continuous value exactly would require an infinite number of binary digits, making the entropy infinite.
Instead, we define the (differential) entropy

$$H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x)\,dx$$

and the conditional entropy

$$H(X/Y) = -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \log f(x/y)\,dx\,dy$$
The average mutual information is then given by $I(X; Y) = H(X) - H(X/Y)$ or $I(X; Y) = H(Y) - H(Y/X)$.
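As a sanity check on the differential entropy definition, a minimal sketch integrating $-f \log_2 f$ numerically for an assumed Gaussian pdf and comparing against the known closed form $\tfrac{1}{2}\log_2(2\pi e \sigma^2)$:

```python
import numpy as np

# Differential entropy of a Gaussian pdf by numerical integration, compared
# with the closed form 0.5 * log2(2*pi*e*sigma^2). Sigma is an assumed value.
sigma = 2.0
x = np.linspace(-40.0, 40.0, 400_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

H_num = -np.sum(f * np.log2(f)) * dx   # Riemann-sum approximation
H_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)
print(H_num, H_closed)                 # the two values agree closely
```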
The mutual information about X = xi provided by the occurrence of the event Y = y is given by
$$I(x_i; y) = \log \frac{f(y/x_i)\,p(x_i)}{f(y)\,p(x_i)} = \log \frac{f(y/x_i)}{f(y)}$$
The average mutual information between X and Y is
$$I(X; Y) = \sum_{i=1}^{n} \int_{-\infty}^{\infty} f(y/x_i)\,p(x_i) \log \frac{f(y/x_i)}{f(y)}\,dy$$
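A minimal numerical sketch of this mixed discrete/continuous case, assuming an equiprobable binary $X \in \{-1, +1\}$ observed in additive Gaussian noise (all parameter values are illustrative):

```python
import numpy as np

sigma = 1.0                             # assumed noise standard deviation
levels = np.array([-1.0, 1.0])          # assumed source letters x_i
p_x = np.array([0.5, 0.5])              # equiprobable letters

y = np.linspace(-12, 12, 200_001)
dy = y[1] - y[0]

# f(y / x_i): Gaussian pdf centered at each source letter.
f_y_given_x = (np.exp(-(y[None, :] - levels[:, None])**2 / (2 * sigma**2))
               / np.sqrt(2 * np.pi * sigma**2))
f_y = p_x @ f_y_given_x                 # f(y) = sum_i p(x_i) f(y/x_i)

# I(X;Y) = sum_i integral f(y/x_i) p(x_i) log2[f(y/x_i)/f(y)] dy
I = np.sum(p_x[:, None] * f_y_given_x * np.log2(f_y_given_x / f_y)) * dy
print(I, "bits")
```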