
Source Coding

Dr. Mohamed A. Abdelhamed

Department of Communication and Computer Eng.

Higher Institute of Engineering, El-Shorouk Academy.

Academic year (2018-2019)

Contacts:

WhatsApp: +201002323525

E-mail: m.abdelhamed@sha.edu.eg

mohabdelhamed@yahoo.com

2.1 SOURCE ENTROPY

The entropy of a source is defined as the average amount of information per symbol (bits/symbol) for the messages generated by the source X, that is

H(X) = Σ_{i=1}^{L} P_i log2(1/P_i)   bits/symbol

where P_i is the probability of the i-th source symbol. The entropy depends only on the source symbol probabilities, and it is maximized when the symbols of the source are equiprobable, as will be shown below.
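As a quick illustration, the following short Python sketch computes H(X) directly from a list of symbol probabilities (the function name entropy is our own, not from the slides):

    import math

    def entropy(probs):
        """Source entropy in bits/symbol: H(X) = sum of P_i * log2(1/P_i)."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    # Equiprobable symbols give the maximum entropy log2(L):
    print(entropy([0.25] * 4))                  # 2.0 bits/symbol
    # A non-uniform source has lower entropy:
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits/symbol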


Generally, H(X) ≤ log2(L) for any given set of source symbol probabilities, and the equality is achieved when the symbols are equiprobable.

Example:

Consider a discrete memoryless source that emits two symbols (or letters) x1 and x2 with probabilities q and 1-q, respectively. Find and sketch the entropy of this source as a function of q. Hint: This source can be a binary source that emits the symbols 0 and 1 with probabilities q and 1-q, respectively.

Solution:

H(X) = q log2(1/q) + (1 − q) log2(1/(1 − q))

This is the binary entropy function. It equals 0 when q = 0 or q = 1 (no uncertainty), and it reaches its maximum value of 1 bit/symbol at q = 1/2, i.e. when the two symbols are equiprobable.
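A minimal sketch of this computation (the helper name binary_entropy is our own), which can also be used to tabulate or plot H(X) versus q:

    import math

    def binary_entropy(q):
        """H(q) = q*log2(1/q) + (1-q)*log2(1/(1-q)), with H(0) = H(1) = 0."""
        if q in (0.0, 1.0):
            return 0.0
        return q * math.log2(1.0 / q) + (1.0 - q) * math.log2(1.0 / (1.0 - q))

    for q in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"q = {q:4.2f}  ->  H = {binary_entropy(q):.4f} bits/symbol")
    # The maximum, 1 bit/symbol, occurs at q = 0.5.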


2.1 SOURCE ENTROPY
(Extended source)

The k-th order extension of a DMS treats blocks of k successive source symbols as single extended symbols. The entropy of the extended source is k times the entropy of the original source:

H(X^k) = k H(X)

Example: consider an extended source whose nine symbols have the probabilities

1/16, 1/16, 1/8, 1/16, 1/16, 1/8, 1/8, 1/8, 1/4
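These nine probabilities are consistent with the second-order extension of a three-symbol source with probabilities 1/4, 1/4, 1/2 (our assumption for this sketch); the snippet below verifies that the extended-source entropy is twice the original entropy:

    import math

    def entropy(probs):
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    base = [1/4, 1/4, 1/2]                                    # assumed underlying source
    extended = [1/16, 1/16, 1/8, 1/16, 1/16, 1/8, 1/8, 1/8, 1/4]

    print(entropy(base))       # 1.5 bits/symbol
    print(entropy(extended))   # 3.0 bits/symbol = 2 * H(X)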

2.2 Coding for Discrete Memoryless Sources (DMS)

Source coding is concerned with the efficient representation of the data generated by a discrete source.

The device that performs this representation is called a source encoder, shown in Figure 2.2.

Figure 2.2 Source encoding: the DMS feeds the source encoder, which produces a stream of bits corresponding to the stream of source symbols (the message).


Recall that H(X) ≤ log2(L), where the equality is achieved when the symbols are equiprobable.

The average code word length is

R̄ = Σ_{i=1}^{L} P_i n_i   bits/symbol

where P_i is the probability of occurrence of symbol x_i and n_i is the code word length of symbol x_i.

The minimum possible average code word length equals the source entropy:

R̄ ≥ Rmin = H(X)

Code efficiency:

η = Rmin / R̄ = H(X) / R̄ ≤ 1
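A small sketch putting these definitions together (the function names are our own); it computes H(X), the average code word length R̄, and the efficiency η for an example code:

    import math

    def entropy(probs):
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    def avg_length(probs, lengths):
        """Average code word length: R = sum of P_i * n_i (bits/symbol)."""
        return sum(p * n for p, n in zip(probs, lengths))

    probs   = [0.5, 0.25, 0.125, 0.125]
    lengths = [1, 2, 3, 3]               # e.g. code words 0, 10, 110, 111

    H = entropy(probs)
    R = avg_length(probs, lengths)
    print(H, R, H / R)                   # 1.75, 1.75, efficiency 1.0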

2.2.1 Fixed-Length Code Words

Assuming L possible symbols, each symbol is encoded into a fixed number of bits

R = log2(L)             when L is a power of 2
R = ⌊log2(L)⌋ + 1       when L is not a power of 2

where:
⌊x⌋: denotes the largest integer less than x,
R: number of bits per symbol,
L: number of symbols.

The code efficiency of the fixed-length encoder is

η = H(X) / R

For equiprobable symbols, H(X) = log2(L), so η = log2(L) / R; when L is a power of 2, R = log2(L) and the efficiency is η = 1.

Problem:

Find the code efficiency of the fixed-length code word encoder for the following DMSs (a numerical sketch is given below):

(a) 8 equiprobable symbols
(b) 10 equiprobable symbols
(c) 100 equiprobable symbols
(d) 4 symbols with probabilities 0.5, 0.25, 0.125, 0.125
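A small numerical sketch of the four cases, using the fixed-length formulas above (the helper name fixed_length_efficiency is our own):

    import math

    def entropy(probs):
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    def fixed_length_efficiency(probs):
        """Efficiency of a fixed-length code: eta = H(X) / R."""
        L = len(probs)
        log2L = math.log2(L)
        R = int(log2L) if log2L.is_integer() else int(log2L) + 1
        return entropy(probs) / R

    print(fixed_length_efficiency([1/8] * 8))                  # (a) 3 / 3      = 1.0
    print(fixed_length_efficiency([1/10] * 10))                # (b) 3.32 / 4  ~= 0.830
    print(fixed_length_efficiency([1/100] * 100))              # (c) 6.64 / 7  ~= 0.949
    print(fixed_length_efficiency([0.5, 0.25, 0.125, 0.125]))  # (d) 1.75 / 2   = 0.875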


For an extended source, blocks of k successive source symbols are encoded together into N bits. This requires

2^N ≥ L^k,   or   N ≥ k log2(L)

Hence, the minimum value of N is given by

N = ⌊k log2(L)⌋ + 1


The efficiency of the fixed-length encoder can be made to approach 1 as k increases, but at the expense of complexity in design. The efficiency of the encoder is given by

η = H(X) / R̄ = k H(X) / N
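A short sketch (our own illustration, assuming a source of L = 10 equiprobable symbols) showing how the block-coding efficiency k·H(X)/N approaches 1 as k grows:

    import math

    L = 10                       # assumed number of equiprobable symbols
    H = math.log2(L)             # entropy of a single symbol

    for k in (1, 2, 5, 10, 20):
        logval = k * math.log2(L)
        N = int(logval) if logval.is_integer() else int(logval) + 1   # minimum bits per block
        print(f"k = {k:2d}: N = {N:3d} bits/block, efficiency = {k * H / N:.4f}")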

2.2.2 Variable Length Code Words

When the source symbols are not equally probable, a more efficient coding method is to use variable-length code words.

An example of such coding is the Morse code, in which the letters that occur frequently are assigned short code words and those that occur infrequently are assigned long code words. This type of coding is called entropy coding.

Advantage: provides the optimum (lowest) data rate.
Disadvantage: more complex encoder/decoder design.

The average code word length is

R̄ = Σ_{i=1}^{L} n_i P_i   bits/symbol

where P_i is the probability of occurrence of symbol x_i and n_i is the code word length of that symbol.

The code must be uniquely decodable (what has been sent can be recovered without error) and instantaneously decodable (as soon as the last bit of a code word arrives, we know that the code word has finished).

Example:

Consider a DMS with output symbols x1, x2, x3 and x4 with probabilities 1/2, 1/4, 1/8, and 1/8, respectively. Three different codes for this source are given in Table 2.1; the sequence to be decoded is 0 0 1 0 0 1.

Table 2.1 Three different codes for the same source.

Symbol   Pi     Code I   Code II   Code III
x1       1/2    1        0         0
x2       1/4    00       01        10
x3       1/8    01       011       110
x4       1/8    10       111       111

In Code I, the first symbol corresponding to 0 0 is x2. However, the next four bits are not uniquely decodable, since they may be decoded as x4 x3 or x1 x2 x1. Perhaps the ambiguity can be resolved by waiting for additional bits, but such a decoding delay is highly undesirable.

The tree structures of Code II and Code III are shown in Figure 2.3. Code II is uniquely decodable but not instantaneously decodable (there is a delay in the decoding process, which is undesirable). Code III is both uniquely and instantaneously decodable: it is a prefix code, as discussed below.
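As an illustration of instantaneous decoding with a prefix code, this sketch (the function name is our own) decodes a bit stream with Code III by matching code words greedily from the start of the stream:

    # Code III from Table 2.1 is a prefix code, so greedy matching decodes it instantly.
    CODE_III = {"0": "x1", "10": "x2", "110": "x3", "111": "x4"}

    def decode_prefix(bits, codebook):
        """Decode a bit string symbol by symbol; each code word ends unambiguously."""
        symbols, word = [], ""
        for b in bits:
            word += b
            if word in codebook:          # the end of a code word is recognizable at once
                symbols.append(codebook[word])
                word = ""
        return symbols, word              # any leftover bits form an incomplete code word

    print(decode_prefix("0010110111", CODE_III))   # (['x1', 'x1', 'x2', 'x3', 'x4'], '')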


Kraft Inequality

A code in which no code word is the prefix of any other code word is called a prefix code. The existence of an instantaneously decodable (prefix) code with code word lengths n_1, n_2, ..., n_L is governed by this condition (Kraft inequality):

Σ_{k=1}^{L} 2^(−n_k) ≤ 1

where n_k is the code word length of the k-th symbol. Note that the Kraft inequality is a condition on the code word lengths of the code, not on the code words themselves.

Code I violates the Kraft inequality (Σ 2^(−n_k) = 1/2 + 1/4 + 1/4 + 1/4 = 1.25 > 1) and therefore cannot be a prefix code. However, the Kraft inequality is satisfied by both Code II and Code III, though only Code III is a prefix code.

Prefix codes are distinguished from other uniquely decodable codes by the fact that the end of a code word is always recognizable, which makes the code instantaneously decodable.
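A minimal sketch checking the Kraft inequality for the three codes of Table 2.1 (the helper name kraft_sum is our own):

    def kraft_sum(code_words):
        """Kraft sum: sum over code words of 2^(-length)."""
        return sum(2.0 ** -len(w) for w in code_words)

    codes = {
        "Code I":   ["1", "00", "01", "10"],
        "Code II":  ["0", "01", "011", "111"],
        "Code III": ["0", "10", "110", "111"],
    }
    for name, words in codes.items():
        print(name, kraft_sum(words))
    # Code I: 1.25 (violates the inequality); Code II and Code III: 1.0.
    # Satisfying the inequality is necessary but not sufficient: Code II is still not a prefix code.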

Huffman Coding Algorithm

1. List the source symbols in order of decreasing probability.
2. Combine the two symbols of lowest probability into a new (combined) symbol whose probability is the sum of the two, assigning a 0 to one branch and a 1 to the other.
3. Reorder the reduced list and repeat step 2 until only one combined symbol of probability 1 remains.
4. Read the code word of each symbol as the sequence of branch bits from the final combined symbol back to that symbol.

Example:

Consider a DMS with five possible symbols having the probabilities 0.4, 0.2, 0.2, 0.1 and 0.1. Use the Huffman encoding algorithm to find the code word for each symbol and the code efficiency.

Solution:

Applying the Huffman encoding steps above to the given source gives the following result.


The five symbols with probabilities 0.4, 0.2, 0.2, 0.1, 0.1 are assigned the code words 00, 10, 11, 010, 011, respectively.

Therefore, the average number of bits per symbol is

R̄ = Σ_{i=1}^{L} n_i P_i = (2×0.4 + 2×0.2 + 2×0.2 + 3×0.1 + 3×0.1) = 2.2 bits

while the source entropy is

H(X) = Σ_{i=1}^{L} P_i log2(1/P_i) = 2.1219 bits

The code efficiency is

η = H(X) / R̄ = 2.1219 / 2.2 = 0.9645

and the redundancy is 1 − η = 0.0355.
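For reference, a compact Huffman-coding sketch (our own implementation, built on Python's heapq); it reproduces code words with the same lengths as above, although the exact bit patterns may differ because the Huffman code is not unique:

    import heapq
    import math
    from itertools import count

    def huffman(probs):
        """Return {symbol index: code word}, built by repeatedly merging the two least probable groups."""
        tie = count()        # tie-breaker so equal probabilities never compare the payload lists
        heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
        codes = {i: "" for i in range(len(probs))}
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, s0 = heapq.heappop(heap)    # least probable group: prepend a '0'
            p1, _, s1 = heapq.heappop(heap)    # next least probable group: prepend a '1'
            for i in s0:
                codes[i] = "0" + codes[i]
            for i in s1:
                codes[i] = "1" + codes[i]
            heapq.heappush(heap, (p0 + p1, next(tie), s0 + s1))
        return codes

    probs = [0.4, 0.2, 0.2, 0.1, 0.1]
    codes = huffman(probs)
    R = sum(p * len(codes[i]) for i, p in enumerate(probs))
    H = sum(p * math.log2(1.0 / p) for p in probs)
    print(codes)               # code word lengths 2, 2, 2, 3, 3
    print(R, H, H / R)         # 2.2 bits, 2.1219 bits, efficiency ~0.9645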

The Huffman code is not unique. There is arbitrariness in the way a bit 0 and a bit 1 are assigned to the two symbols in the last stage. Also, when the probability of a combined symbol is found to equal another probability in the list, the combined symbol may be placed as high as possible or as low as possible. In these cases the code words can have different lengths, but the average code word length is the same.

A measure of the variability of the code word lengths is the variance

σ² = Σ_{i=1}^{L} (n_i − R̄)² P_i
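A small sketch (our own illustration) evaluating this variance for two valid length assignments of the example above; both give R̄ = 2.2 bits but differ in variance:

    probs = [0.4, 0.2, 0.2, 0.1, 0.1]

    def length_stats(lengths):
        R = sum(p * n for p, n in zip(probs, lengths))
        var = sum(p * (n - R) ** 2 for p, n in zip(probs, lengths))
        return R, var

    print(length_stats([2, 2, 2, 3, 3]))   # (2.2, 0.16)
    print(length_stats([1, 2, 3, 4, 4]))   # (2.2, 1.36)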


Report:

Consider a DMS that produces three symbols x1, x2 and x3

with probabilities 0.45, 0.35, and 0.2, respectively. Find

the entropy, the code words, and the encoding efficiency

in both cases of single symbol encoding and two symbol

encoding (second-order extension code).


Fano Coding Algorithm

1. Arrange the information source symbols in order of decreasing probability.
2. Divide the symbols into two groups whose probabilities are as nearly equal as possible.
3. Each group receives one of the binary symbols (i.e. 0 or 1) as the first code bit.
4. Repeat steps 2 and 3 within each group as many times as possible.
5. Stop when there are no more groups to divide.


Note that:

If it is not possible to divide the probabilities into exactly equally probable groups, we should make the division as nearly equal as possible, as illustrated in the sketch below.
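A minimal recursive sketch of the Fano procedure (our own implementation; the split point is chosen so that the two group probabilities are as nearly equal as possible):

    def fano(symbols):
        """symbols: list of (name, probability) sorted by decreasing probability.
        Returns {name: code word} by recursively splitting into near-equal halves."""
        codes = {name: "" for name, _ in symbols}

        def split(group):
            if len(group) < 2:
                return
            total, running, cut, best = sum(p for _, p in group), 0.0, 1, float("inf")
            for i in range(1, len(group)):            # find the most balanced split point
                running += group[i - 1][1]
                if abs(total - 2 * running) < best:
                    best, cut = abs(total - 2 * running), i
            for name, _ in group[:cut]:               # first group receives a '0'
                codes[name] += "0"
            for name, _ in group[cut:]:               # second group receives a '1'
                codes[name] += "1"
            split(group[:cut])
            split(group[cut:])

        split(symbols)
        return codes

    probs = [("x1", 0.4), ("x2", 0.2), ("x3", 0.2), ("x4", 0.1), ("x5", 0.1)]
    print(fano(probs))
    # {'x1': '0', 'x2': '10', 'x3': '110', 'x4': '1110', 'x5': '1111'} -> average length 2.2 bits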

Huffman vs. Fano coding

Huffman coding always produces an optimum prefix code (minimum average code word length), whereas Fano coding is simpler to construct but does not always achieve the minimum average code word length.

MUTUAL INFORMATION IN DMCs

A discrete memoryless channel (DMC), in which the output depends only on the present input, is a mathematical (statistical) model for a channel with discrete input X and discrete output Y.

The channel is completely specified by a set of transition probabilities.

The DMC is represented graphically as shown in the figure: each branch from an input symbol to an output symbol is labeled with the transition (conditional) probability of the channel, and the sum of the probabilities leaving the same input symbol must equal one.

If the channel is noiseless, the received symbol determines exactly what was transmitted. If the channel is noisy, there is some amount of uncertainty, so the characteristics of the channel are described by the channel matrix P:

P = [ P(y1|x1)  P(y2|x1)  ...  P(yQ|x1)
      P(y1|x2)  P(y2|x2)  ...  P(yQ|x2)
      ...
      P(y1|xq)  P(y2|xq)  ...  P(yQ|xq) ]

Each row represents one input, so the sum of the probabilities in each row equals one.

The element P(yj|xi) is the probability of receiving yj given that xi is transmitted.

It should be noted that each row of the channel matrix P corresponds to a fixed channel input, whereas each column corresponds to a fixed channel output. The sum of the elements along a row is always equal to one, that is

Σ_{j=1}^{Q} P(yj|xi) = 1   for all i

The joint probability distribution of the random variables X and Y is given by

P(xi, yj) = P(yj, xi) = P(X = xi, Y = yj)
          = P(Y = yj | X = xi) P(X = xi)
          = P(X = xi | Y = yj) P(Y = yj)

that is, P(xi, yj) = P(yj|xi) P(xi) = P(xi|yj) P(yj).

The marginal probability distribution of the output random variable Y can be determined by averaging P(xi, yj) over xi, that is

P(yj) = Σ_{i=1}^{q} P(xi, yj) = Σ_{i=1}^{q} P(yj|xi) P(xi)

Hence the output probabilities can be determined knowing the probabilities of the input symbols P(xi), i = 1, 2, ..., q, and the matrix of transition probabilities P(yj|xi).
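As a sketch of this computation (our own example channel, not from the slides), the output distribution is simply the input distribution multiplied by the channel matrix:

    import numpy as np

    # Rows of P are channel inputs, columns are outputs; each row sums to one.
    P = np.array([[0.9, 0.1],     # assumed transition probabilities
                  [0.2, 0.8]])
    px = np.array([0.6, 0.4])     # assumed input probabilities P(xi)

    py = px @ P                   # P(yj) = sum over i of P(yj|xi) P(xi)
    joint = px[:, None] * P       # P(xi, yj) = P(yj|xi) P(xi)
    print(py)                     # [0.62 0.38]
    print(joint)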

The a posteriori probability P(xi|yj) is the probability that xi was transmitted when yj is received, and it can be determined using Bayes' rule as

P(xi|yj) = P(xi, yj) / P(yj)
         = P(yj|xi) P(xi) / P(yj)
         = P(yj|xi) P(xi) / Σ_{i=1}^{q} P(yj|xi) P(xi)

The amount of uncertainty remaining about the channel input when the output yj is received is given by

H(X|yj) = Σ_{i=1}^{q} P(xi|yj) log2(1 / P(xi|yj))   bits/symbol

The average amount of uncertainty remaining about a transmitted symbol when a symbol is received is obtained by averaging H(X|yj) over the output symbols:

H(X|Y) = Σ_{j=1}^{Q} H(X|yj) P(yj)
       = Σ_{j=1}^{Q} Σ_{i=1}^{q} P(xi|yj) P(yj) log2(1 / P(xi|yj))
       = Σ_{j=1}^{Q} Σ_{i=1}^{q} P(xi, yj) log2(1 / P(xi|yj))   bits/symbol

The conditional entropy H(X|Y) is called the equivocation, and it represents the amount of uncertainty remaining about the channel input X after observing the channel output Y.

Since H(X) represents the amount of uncertainty about the channel input X before observing the channel output Y, the difference H(X) − H(X|Y) represents the amount of information provided by observing the channel output Y, and it is called the mutual information of the channel, that is

I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
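A self-contained sketch (our own example: a binary symmetric channel with crossover probability 0.1 and equiprobable inputs) that evaluates H(X), the equivocation H(X|Y), and I(X;Y) from the formulas above:

    import numpy as np

    def H(p):
        """Entropy in bits of a probability vector (zero entries ignored)."""
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    P  = np.array([[0.9, 0.1],    # assumed channel matrix: rows = inputs, columns = outputs
                   [0.1, 0.9]])
    px = np.array([0.5, 0.5])     # input probabilities

    joint = px[:, None] * P               # P(xi, yj)
    py    = joint.sum(axis=0)             # P(yj)
    post  = joint / py                    # P(xi|yj) by Bayes' rule (each column sums to one)

    HX  = H(px)
    HXY = float(-(joint * np.log2(post)).sum())    # equivocation H(X|Y)
    print(HX, HXY, HX - HXY)                       # 1.0, ~0.469, I(X;Y) ~0.531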

Relationships between entropies

The various entropies of the channel are related by

H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X, Y)
