
Unit V.

Image Compression

Two mark Questions

1. What is the need for image compression?


In terms of storage, the capacity of a storage device can be effectively increased with methods that compress a body of data on its way to a storage device and decompresses it when it is retrieved. In terms of communications, the bandwidth of a digital communication link can be effectively increased by compressing data at the sending end and decompressing data at the receiving end. At any given time, the ability of the Internet to transfer data is fixed. Thus, if data can effectively be compressed wherever possible, significant improvements of data throughput can be achieved. Many files can be combined into one compressed document making sending easier.

2. What is run length coding?


Run-length Encoding, or RLE is a technique used to reduce the size of a repeating string of characters. This repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of data to be compressed affects the compression ratio. Compression is normally measured with the compression ratio.

3. What are the different compression methods?


The different compression methods are:
i. Run Length Encoding (RLE)
ii. Arithmetic coding
iii. Huffman coding
iv. Transform coding

4. Define compression ratio.


Compression ratio is defined as the ratio of original size of the image to compressed size of the image. It is given as Compression Ratio = original size / compressed size: 1

5. What are the basic steps in JPEG?


The major steps in JPEG coding involve:
i. DCT (Discrete Cosine Transform)
ii. Quantization
iii. Zigzag scan
iv. DPCM on the DC component
v. RLE on the AC components
vi. Entropy coding

6. What is coding redundancy?


If the gray level of an image is coded in a way that uses more code words than necessary to represent each gray level, then the resulting image is said to contain coding redundancy.

7. What is interpixel redundancy?


The value of any given pixel can be reasonably predicted from the values of its neighbours, so the information carried by an individual pixel is relatively small. The visual contribution of a single pixel to an image is therefore largely redundant. This is also called spatial redundancy, geometric redundancy or interpixel redundancy. Eg: run length coding exploits it.

8. What is psychovisual redundancy?


In normal visual processing, certain information has less relative importance than other information. Such information is said to be psychovisually redundant; it can be eliminated without significantly impairing the perceived quality of the image.

9. What is meant by fidelity criteria?


Removing psychovisually redundant data causes a loss of information, so the loss may need to be measured. Fidelity criteria are measures of such loss. There are two kinds of fidelity criteria: 1) subjective and 2) objective.

10. What is run length coding?

Run-length Encoding, or RLE is a technique used to reduce the size of a repeating string of characters. This repeating string is called a run; typically RLE encodes a run of symbols into two bytes, a count and a symbol. RLE can compress any type of data regardless of its information content, but the content of data to be compressed affects the compression ratio. Compression is normally measured with the compression ratio.

11. Define source encoder.

The source encoder performs three operations:
1) Mapper - transforms the input data into a (usually non-visual) format designed to reduce interpixel redundancy.
2) Quantizer - reduces the psychovisual redundancy of the input image. This step is omitted if the system is to be error free.
3) Symbol encoder - reduces the coding redundancy. This is the final stage of the encoding process.

12. Draw the JPEG decoder.

13. What are the types of decoder?

Source decoder - has two components:
a) Symbol decoder - performs the inverse operation of the symbol encoder.
b) Inverse mapper - performs the inverse operation of the mapper.
Channel decoder - this is omitted if the system is error free.

14. Differentiate between lossy compression and lossless compression methods.


Lossless compression can recover the exact original data after compression. It is used mainly for compressing database records, spreadsheets or word processing files, where exact replication of the original is essential. Lossy compression will result in a certain loss of accuracy in exchange for a substantial increase in compression. Lossy compression is more effective when used to compress graphic images and digitized voice where losses outside visual or aural perception can be tolerated.

15. What is meant by wavelet coding?

Wavelet coding is a transform coding technique in which the image is decomposed using the discrete wavelet transform (DWT); the resulting coefficients are then quantized and entropy coded. It is the basis of the JPEG 2000 standard.

16. Define channel encoder.

The channel encoder reduces the impact of the channel noise by inserting redundant bits into the source encoded data. Eg: Hamming code

17. What is JPEG?

The acronym is expanded as "Joint Photographic Experts Group". It became an international standard in 1992. It works with both colour and greyscale images and is used in many applications, e.g. satellite and medical imaging.

18. Differentiate between the JPEG and JPEG 2000 standards.

JPEG:
- Good for photography.
- Compression ratios of 20:1 are easily attained.
- 24 bits per pixel can be used, leading to better accuracy.
- Progressive JPEG (interlacing).

JPEG 2000:
- An all-encompassing, wavelet-based image compression standard.
- Lossless and lossy compression.
- Progressive transmission by pixel accuracy and by resolution.
- Region-of-Interest coding.
- Random codestream access and processing.
- Robustness to bit errors.
- Content-based description.
- Side channel spatial information (transparency).
19. What are the operations performed by error free compression?

1) Devising an alternative representation of the image in which its interpixel redundancies are reduced.
2) Coding the representation to eliminate coding redundancy.

20. Define Huffman coding.

Huffman coding is a popular technique for removing coding redundancy. When the symbols of an information source are coded individually, the Huffman code yields the smallest possible number of code symbols per source symbol.

21. What is image compression?

Image compression refers to the process of reducing the amount of data required to represent a given quantity of information in a digital image. The basis of the reduction process is the removal of redundant data.

(or) A technique used to reduce the volume of information to be transmitted about an image.

22. Define encoder.

An encoder has two components: A) the source encoder and B) the channel encoder. The source encoder is responsible for removing coding, interpixel and psychovisual redundancy.

23. What is variable length coding?

Variable Length Coding is the simplest approach to error free compression. It reduces only the coding redundancy. It assigns the shortest possible codeword to the most probable gray levels.

24. Define arithmetic coding.

In arithmetic coding there is no one-to-one correspondence between source symbols and code words; instead, a single arithmetic code word is assigned to an entire sequence of source symbols. The code word defines an interval of real numbers between 0 and 1.

25. Draw the block diagram of a transform coding system.

Twelve mark Questions

1. Explain the various functional blocks of the JPEG standard.

JPEG (Joint Photographic Experts Group) is the international standard for photographic images, with both lossless and lossy modes. It is based on the facts that:
- Humans are more sensitive to the lower spatial frequency components.
- A large majority of useful image content changes relatively slowly across an image.

Steps involved:
- The image is converted to Y, Cb, Cr format.
- The image is divided into 8x8 blocks.
- Each 8x8 block is subjected to the DCT, followed by quantization.
- Zigzag scan of the quantized coefficients.
- The DC coefficients are stored using DPCM.
- RLE is used for the AC coefficients.
- Huffman encoding.
- Frame generation.

Functional block diagram of JPEG standard

Block preparation
Compute the luminance (Y) and chrominance (I and Q) according to the formulas:
Y = 0.3R + 0.59G + 0.11B (0 to 255)
I = 0.6R - 0.28G - 0.32B
Q = 0.21R - 0.52G + 0.31B
Separate matrices are constructed for Y, I and Q. Square blocks of four pixels are averaged in the I and Q matrices (this is lossy and compresses the image by a factor of 2). 128 is subtracted from Y, I and Q, and each matrix is then divided up into 8x8 blocks.
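A minimal Python sketch of this block-preparation step, assuming the image is supplied as equally sized NumPy arrays R, G, B whose dimensions are multiples of the block size (the function name and these assumptions are illustrative):

```python
import numpy as np

def prepare_blocks(R, G, B, block=8):
    """Convert RGB planes to Y, I, Q, level-shift by 128 and cut into 8x8 blocks."""
    # Luminance/chrominance conversion using the formulas given above
    Y = 0.30 * R + 0.59 * G + 0.11 * B
    I = 0.60 * R - 0.28 * G - 0.32 * B
    Q = 0.21 * R - 0.52 * G + 0.31 * B

    # Average each 2x2 square of pixels in I and Q (lossy 2:1 reduction)
    I = I.reshape(I.shape[0] // 2, 2, I.shape[1] // 2, 2).mean(axis=(1, 3))
    Q = Q.reshape(Q.shape[0] // 2, 2, Q.shape[1] // 2, 2).mean(axis=(1, 3))

    # Level shift so values are centred on zero before the DCT
    planes = [Y - 128, I - 128, Q - 128]

    # Split every plane into block x block submatrices
    blocks = []
    for p in planes:
        for r in range(0, p.shape[0], block):
            for c in range(0, p.shape[1], block):
                blocks.append(p[r:r + block, c:c + block])
    return blocks
```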

Discrete cosine transformation


The output of each DCT is an 8x8 matrix of coefficients. Element (0,0) is the average (DC) value of the block; the other elements describe how the block content deviates from that average at each spatial frequency. The transform itself is theoretically lossless, but rounding of the coefficients can make it slightly lossy in practice.

Quantization
The less important DCT coefficients are wiped out. This is the main lossy step in JPEG. It is done by dividing each of the coefficients in the 8x8 matrix by a weight taken from a table; the weight tables themselves are not part of the JPEG standard.
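As a sketch, quantization of one 8x8 block of DCT coefficients is an element-wise division by a weight table followed by rounding. The table below is purely illustrative (it is not a table defined by the JPEG standard):

```python
import numpy as np

# Illustrative weight table: larger weights at higher spatial frequencies,
# so those coefficients are quantized more coarsely.
weights = np.array([[1 + i + j for j in range(8)] for i in range(8)], dtype=float)

def quantize(dct_block, table=weights):
    """Divide each DCT coefficient by its weight and round to the nearest integer."""
    return np.round(dct_block / table).astype(int)

def dequantize(q_block, table=weights):
    """Approximate reconstruction: multiply back by the weights (the rounding loss remains)."""
    return q_block * table
```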

Differential quantization
It reduces the (0,0) (DC) value of each block by replacing it with the amount by which it differs from the corresponding element in the previous block. Since these elements are the average values of their respective blocks, they change slowly from block to block.

Run length encoding


It linearizes the 64 quantized coefficients of each block (in zigzag order) and applies run length encoding to the resulting list.

Statistical output encoding
JPEG uses Huffman encoding for this purpose. It often produces a 20:1 compression or better. For decoding, the algorithm is run backward; JPEG is roughly symmetric, so decoding takes about as long as encoding.

Advantages and disadvantages

Advantages
- Compression ratios of 20:1 are easily attained.
- 24 bits per pixel can be used, leading to better accuracy.
- Progressive JPEG (interlacing).

Disadvantages
- Doesn't support transparency.
- Doesn't work well with sharp edges.
- Almost always lossy.
- No target bit rate.


JPEG 2000 STANDARD:
A wavelet-based image compression standard.
Encoding:
- Decompose the source image into components.
- Decompose the image and its components into rectangular tiles.
- Apply the wavelet transform on each tile.
- Quantize and collect the subbands of coefficients into rectangular arrays of code-blocks.
- Encode so that certain regions of interest (ROIs) can be coded at a higher quality.
- Add markers in the bitstream to allow error resilience.

Advantages:
- Lossless and lossy compression.
- Progressive transmission by pixel accuracy and resolution.
- Region-of-Interest coding.
- Random codestream access and processing.
- Robustness to bit errors.
- Content-based description.
- Side channel spatial information (transparency).

2. Explain (i) one-dimensional run length coding and (ii) two-dimensional run length coding.

RLE stands for Run Length Encoding. It is a lossless algorithm that only offers decent compression ratios in specific types of data.

It is a pre-processing method that works well when one symbol occurs with high probability or when symbols are dependent. It counts how many times each symbol is repeated; the coded source symbol is the length of the run.

(i) One-dimensional run length coding
- Used for binary images.
- The lengths of the runs of ones and zeroes are detected; by convention each row is assumed to begin with a white (1) run.
- Additional compression is achieved by variable-length coding (Huffman coding) the run lengths.
- An m-bit gray scale image can be converted into m binary images by bit-plane slicing, and these individual bit planes are then encoded using run length coding.
- However, a small difference in the gray level of adjacent pixels can disrupt a run of zeroes or ones. Example: suppose one pixel has gray level 127 and the next has gray level 128. In binary, 127 = 01111111 and 128 = 10000000, so this small change in gray level shortens the runs in all the bit planes.
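A minimal run-length encoder/decoder sketch in Python (the function names are illustrative):

```python
def rle_encode(row):
    """Encode a sequence of symbols as (count, symbol) pairs."""
    runs = []
    count = 1
    for prev, cur in zip(row, row[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append((count, prev))
            count = 1
    if row:
        runs.append((count, row[-1]))
    return runs

def rle_decode(runs):
    """Expand (count, symbol) pairs back into the original sequence."""
    return [symbol for count, symbol in runs for _ in range(count)]

# A binary row with long runs compresses well.
row = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0]
print(rle_encode(row))   # [(4, 1), (2, 0), (3, 1), (1, 0)]
```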

(ii) Two-dimensional run length coding
- Developed in the 1950s; together with its 2-D extensions it became the standard approach in facsimile (FAX) coding.
- The source image is a two-dimensional array of pixel values, so both spatial redundancy and temporal redundancy can be exploited.
- The human eye is less sensitive to the chrominance signal than to the luminance signal (so U and V can be coarsely coded), less sensitive to the higher spatial frequency components, and less sensitive to quantizing distortion at high luminance levels.
- The R, G, B format requires three matrices, one each for the R, G, B quantized values; in the Y, U, V representation the U and V matrices can be half as small as the Y matrix.
- The source image matrix is divided into blocks of 8x8 submatrices. The small block size helps the DCT computation; the individual blocks are fed sequentially to the DCT, which transforms each block separately.

Advantages and disadvantages


This algorithm is very easy to implement and does not require much CPU horsepower. RLE compression is only efficient with files that contain lots of repetitive data. These can be text files if they contain lots of spaces for indenting but line-art images that contain large white or black areas are far more suitable. Computer generated colour images (e.g. architectural drawings) can also give fair compression ratios.

3. Explain variable length coding and Huffman coding.

Variable length coding:


Assigning fewer bits to the more probable gray levels than to the less probable ones achieves data compression; this is called variable length coding. The length of each code word is inversely proportional to the frequency of the character it represents. The code must satisfy the prefix property (no code word is the prefix of another) to be uniquely decodable.

It is a two-pass algorithm: the first pass accumulates the character frequencies and generates the codebook, and the second pass performs the compression using that codebook. Huffman codes require a large number of computations: for N source symbols, N-2 source reductions (sorting operations) and N-2 code assignments must be made. Sometimes coding efficiency is sacrificed to reduce the number of computations.

Create codes by constructing a binary tree:
1. Consider all characters as free nodes.
2. Assign the two free nodes with the lowest frequencies to a parent node whose weight equals the sum of their frequencies.
3. Remove the two free nodes and add the newly created parent node to the list of free nodes.
4. Repeat steps 2 and 3 until only one free node is left; it becomes the root of the tree.
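A minimal Python sketch of this construction using a priority queue; the symbol names and probabilities are illustrative:

```python
import heapq

def huffman_codes(freqs):
    """Build Huffman codes from a {symbol: frequency} map."""
    # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)      # the two free nodes with lowest frequency
        f2, _, c2 = heapq.heappop(heap)
        # Prefix 0 to the codes in one subtree and 1 to the other,
        # then merge them under a new parent node.
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

print(huffman_codes({"a1": 0.4, "a2": 0.3, "a3": 0.2, "a4": 0.1}))
# e.g. {'a1': '0', 'a2': '10', 'a3': '111', 'a4': '110'}
```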

Table: Variable-Length Codes

Huffman Coding
This coding reduces the average number of bits per pixel. It assigns variable length bit strings to different symbols and achieves compression in two steps:
- Source reduction
- Code assignment

Steps
1. Find the gray level probabilities from the image histogram.
2. Arrange the probabilities in decreasing order, highest at the top.
3. Combine the smallest two by addition, always keeping the sums in decreasing order.
4. Repeat step 3 until only two probabilities are left.
5. Working backward along the tree, generate the code by alternately assigning 0 and 1.

Fig: Huffman Source Reductions

Fig : Huffman code assignment procedure


4. Explain arithmetic coding and LZW coding.

Arithmetic coding
Arithmetic compression is an alternative to Huffman compression; it enables characters to be represented with fractional bit lengths. In Huffman compression fractional code lengths are not possible: even the most frequently occurring character needs a code word of at least one bit, no matter how high its frequency. Arithmetic coding works by representing a message by an interval of real numbers greater than or equal to zero but less than one. As the message becomes longer, the interval needed to represent it becomes smaller and smaller, and the number of bits needed to specify it increases.

- The entire sequence of source symbols (the message) is assigned a single arithmetic code word; there is no one-to-one coding as in Huffman coding.
- The code word lies within the interval [0, 1).
- As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units (bits) required to represent the interval becomes larger. For example, more digits are required to represent 0.003 than 0.1.

Steps: Arithmetic Coding


The basic algorithm for encoding a file using arithmetic coding works conceptually as follows:
(1) Begin with the current range [L, H) initialized to [0, 1). (The notation [0, 1) means equal to or greater than 0 but less than 1.)
(2) For each symbol of the file, perform two steps:
a) Subdivide the current interval into subintervals, one for each alphabet symbol.
b) Select the subinterval corresponding to the symbol that actually occurs next in the file and make it the new current interval.
(3) Output enough bits to distinguish the final interval from all other possible intervals.

Example: Encode the message a1 a2 a3 a3 a4.

Table : Arithmetic Coding example

Fig: Arithmetic coding procedure
Any number in the interval [0.06752, 0.0688), for example 0.068, can be used to represent the message. Here 3 decimal digits are used to represent the 5-symbol source message. This translates into 3/5 = 0.6 decimal digits per source symbol and compares favourably with the entropy of the source, -(3 x 0.2 log10 0.2 + 0.4 log10 0.4) = 0.5786 digits per symbol. As the length of the sequence increases, the resulting arithmetic code approaches the bound set by the entropy. In practice the length fails to reach this lower bound because of:
- the end-of-message indicator that is needed to separate one message from another, and
- the use of finite precision arithmetic.
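The interval-narrowing step can be sketched as follows. The probabilities below are an assumption (P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4) chosen so that the final interval matches the one quoted above:

```python
def arithmetic_encode(message, probs):
    """Return the final [low, high) interval for a message under a fixed model."""
    # Cumulative probability range for each symbol
    cum, start = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = (start, start + p)
        start += p

    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        s_low, s_high = cum[sym]
        low, high = low + width * s_low, low + width * s_high
    return low, high

probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}   # assumed model
print(arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], probs))
# approximately (0.06752, 0.0688)
```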

LZW (Lempel-Ziv-Welch) coding


LZW (Lempel-Ziv-Welch) coding assigns fixed-length code words to variable-length sequences of source symbols, and it requires no a priori knowledge of the probabilities of the source symbols. LZW was formulated in 1984. The nth extension of a source can be coded with fewer average bits per symbol than the original source. LZW is used in:
- Tagged Image File Format (TIFF)
- Graphics Interchange Format (GIF)
- Portable Document Format (PDF)

The Algorithm:
- A codebook or dictionary containing the source symbols is constructed.
- For 8-bit monochrome images, the first 256 words of the dictionary are assigned to the gray levels 0-255.
- The remaining part of the dictionary is filled with sequences of gray levels as they are encountered.

Example: consider a 4 x 4 image in which every row is 39 39 126 126, i.e. the pixel sequence 39 39 126 126 39 39 126 126 39 39 126 126 39 39 126 126.

Table: LZW coding example
Compression ratio = (8 x 16) / (10 x 9) = 64 / 45 = 1.4

Important features of LZW:
- The dictionary is created while the data are being encoded, so encoding can be done on the fly.
- The dictionary need not be transmitted; it can be built up at the receiving end on the fly.
- If the dictionary overflows, it has to be reinitialized and a bit added to each of the code words.
- Choosing a large dictionary size avoids overflow but spoils the compression.
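A minimal LZW encoder sketch in Python; with the 4 x 4 example image above it reproduces the code stream that is decoded below:

```python
def lzw_encode(pixels):
    """Encode a sequence of 8-bit gray levels with LZW."""
    # The first 256 dictionary words are the gray levels 0-255.
    dictionary = {(i,): i for i in range(256)}
    next_code = 256
    current = ()
    out = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate                 # keep extending the recognized sequence
        else:
            out.append(dictionary[current])     # emit the code for the recognized sequence
            dictionary[candidate] = next_code   # add the new sequence to the dictionary
            next_code += 1
            current = (p,)
    if current:
        out.append(dictionary[current])
    return out

pixels = [39, 39, 126, 126] * 4                 # the 4x4 example image, row by row
print(lzw_encode(pixels))                       # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```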

Decoding LZW: let the code stream received be 39 39 126 126 256 258 260 259 257 126.

In LZW, the dictionary which was used for encoding need not be sent with the image. A separate dictionary is built by the decoder, on the fly, as it reads the received code words.

Recognized | Encoded value | Pixels        | Dict. address | Dict. entry
-          | 39            | 39            | -             | -
39         | 39            | 39            | 256           | 39-39
39         | 126           | 126           | 257           | 39-126
126        | 126           | 126           | 258           | 126-126
126        | 256           | 39-39         | 259           | 126-39
256        | 258           | 126-126       | 260           | 39-39-126
258        | 260           | 39-39-126     | 261           | 126-126-39
260        | 259           | 126-39        | 262           | 39-39-126-126
259        | 257           | 39-126        | 263           | 126-39-39
257        | 126           | 126           | 264           | 39-126-126

5. Explain wavelet based image compression.

In contrast to image compression using the discrete cosine transform (DCT), which has poor frequency localization due to its inadequate basis window, the discrete wavelet transform (DWT) resolves the problem by trading off spatial (or time) resolution for frequency resolution. Wavelet coding exploits the structure between coefficients to remove redundancy.

Fig : Wavelet coding system ( encoder)

Fig: Wavelet coding system (decoder)

Advantages:
- Lossless and lossy compression.
- Progressive transmission by pixel accuracy and resolution.
- Region-of-Interest coding.
- Random code stream access and processing.
- Robustness to bit errors.
- Content-based description.
- Side channel spatial information (transparency).
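A minimal sketch of lossy wavelet coding in Python, assuming the PyWavelets package (pywt) is available; the wavelet, decomposition level and threshold are illustrative choices, and the thresholding stands in for the quantization stage:

```python
import numpy as np
import pywt

def wavelet_compress(image, wavelet="haar", level=3, threshold=10.0):
    """Decompose, discard small detail coefficients, and reconstruct."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Keep the approximation band, threshold the detail bands
    new_coeffs = [coeffs[0]]
    for details in coeffs[1:]:
        new_coeffs.append(tuple(np.where(np.abs(d) < threshold, 0, d) for d in details))
    return pywt.waverec2(new_coeffs, wavelet)
```

In a real coder the surviving coefficients would be quantized and entropy coded; zeroing the small coefficients here only illustrates where the compression comes from.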

6. Explain arithmetic coding and Huffman coding.

Arithmetic coding
Arithmetic compression is an alternative to Huffman compression; it enables characters to be represented with fractional bit lengths. In Huffman compression fractional code lengths are not possible: even the most frequently occurring character needs a code word of at least one bit, no matter how high its frequency. Arithmetic coding works by representing a message by an interval of real numbers greater than or equal to zero but less than one. As the message becomes longer, the interval needed to represent it becomes smaller and smaller, and the number of bits needed to specify it increases.

- The entire sequence of source symbols (the message) is assigned a single arithmetic code word; there is no one-to-one coding as in Huffman coding.
- The code word lies within the interval [0, 1).
- As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units (bits) required to represent the interval becomes larger. For example, more digits are required to represent 0.003 than 0.1.

Steps: Arithmetic Coding


The basic algorithm for encoding a file using arithmetic coding works conceptually as follows:
(1) Begin with the current range [L, H) initialized to [0, 1). (The notation [0, 1) means equal to or greater than 0 but less than 1.)
(2) For each symbol of the file, perform two steps:
a) Subdivide the current interval into subintervals, one for each alphabet symbol.
b) Select the subinterval corresponding to the symbol that actually occurs next in the file and make it the new current interval.
(3) Output enough bits to distinguish the final interval from all other possible intervals.

Example: Encode the message a1 a2 a3 a3 a4.

Fig : Arithmetic Coding example

Fig : Arithmetic coding procedure

Huffman Coding
This coding reduces the average number of bits per pixel. It assigns variable length bit strings to different symbols and achieves compression in two steps:
- Source reduction
- Code assignment

Steps
1. Find the gray level probabilities from the image histogram.
2. Arrange the probabilities in decreasing order, highest at the top.
3. Combine the smallest two by addition, always keeping the sums in decreasing order.
4. Repeat step 3 until only two probabilities are left.
5. Working backward along the tree, generate the code by alternately assigning 0 and 1.

Fig : Huffman Source Reductions

Fig : Huffman code assignment procedure

7. Explain how compression is achieved in transform coding and explain the DCT.

Transform Coding
Three steps:
1. Divide the data sequence into blocks of size N and transform each block using a reversible mapping.
2. Quantize the transformed sequence.
3. Encode the quantized values.

Benefits:
- the transform coefficients are relatively uncorrelated,
- the energy is highly compacted,
- the scheme is reasonably robust to channel errors.

The DCT is similar to the DFT, but it can provide a better approximation with fewer coefficients, and its coefficients are real valued instead of complex valued as in the DFT. The discrete cosine transform (DCT) is the basis for many image compression algorithms; one clear advantage of the DCT over the DFT is that there is no need to manipulate complex numbers. The equation for the forward DCT of an N x N block is

F(u, v) = α(u) α(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x, y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

and for the reverse (inverse) DCT

f(x, y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} α(u) α(v) F(u, v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

where α(u) is defined below.

DCT in Terms of Basis Functions

The basis functions (basis images) of the DCT are given by:

g(x, y, u, v) = α(u) α(v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

where

α(u) = √(1/N)  for u = 0
α(u) = √(2/N)  for u = 1, 2, ..., N-1

and N is the block size of the image (normally N = 8).
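A direct (unoptimized) Python sketch of the forward transform defined above, for a single N x N block:

```python
import numpy as np

def dct2(block):
    """Forward 2-D DCT of an NxN block using the basis functions above."""
    N = block.shape[0]
    alpha = np.full(N, np.sqrt(2.0 / N))
    alpha[0] = np.sqrt(1.0 / N)
    x = np.arange(N)
    F = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            # basis[x, y] = cos((2x+1)u*pi/2N) * cos((2y+1)v*pi/2N)
            basis = np.outer(np.cos((2 * x + 1) * u * np.pi / (2 * N)),
                             np.cos((2 * x + 1) * v * np.pi / (2 * N)))
            F[u, v] = alpha[u] * alpha[v] * np.sum(block * basis)
    return F
```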

Matrix of Discrete Cosine Transform (DCT)

Zigzag scan of DCT blocks
- Groups the low frequency coefficients at the top of the vector.
- Maps the 8 x 8 block to a 1 x 64 vector.
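A small Python sketch of the zigzag reordering (it follows the alternating anti-diagonal pattern used by JPEG):

```python
import numpy as np

def zigzag(block):
    """Reorder an 8x8 block into a 1x64 vector, low frequencies first."""
    n = block.shape[0]
    # Walk the anti-diagonals (constant i+j), alternating direction on each one
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return np.array([block[i, j] for i, j in order])
```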

8. Explain any two basic data redundancies in digital image compression.

Data Redundancy
Various amount of data may be used to represent the same information. Data which either do not provide necessary information or provide the same information again are called redundant data. Removing redundant data from the image reduces the size.

Redundancies In Image
In image compression, 3 basic data redundancies can be identified:
1. Coding redundancy (CR)
2. Interpixel redundancy (IR)
3. Psychovisual redundancy (PVR)

Data compression is achieved when one or more of these redundancies are reduced or eliminated

Coding redundancy
A natural m-bit coding method assigns m bits to each gray level without considering the probability with which that gray level occurs, so the result is very likely to contain coding redundancy.
Basic concept: use the probability of occurrence of each gray level (the histogram) to determine the length of the code representing that gray level - variable-length coding. Assign shorter code words to the gray levels that occur most frequently, and longer code words to the rarer ones.
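A small sketch of how this pays off: compute the average code length for a given code assignment over the gray-level probabilities (the probabilities and code lengths below are illustrative):

```python
def average_length(probabilities, code_lengths):
    """Average bits per pixel for code lengths l(k) over gray levels k."""
    return sum(p * l for p, l in zip(probabilities, code_lengths))

probs = [0.4, 0.3, 0.2, 0.1]             # illustrative 4-level source
natural = [2, 2, 2, 2]                   # natural (fixed-length) binary code
variable = [1, 2, 3, 3]                  # shorter words for the more probable levels
print(average_length(probs, natural))    # 2.0 bits/pixel
print(average_length(probs, variable))   # 1.9 bits/pixel: coding redundancy removed
```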

Fig : Graphical representation of fundamental basis of data compression

Interpixel Redundancy
Interpixel redundancy is caused by the high interpixel correlation within an image, i.e., the gray level of any given pixel can be reasonably predicted from the values of its neighbours, so the information carried by individual pixels is relatively small. It is also called spatial redundancy, geometric redundancy or interframe redundancy (in general, interpixel redundancy).
Interpixel redundancy occurs because adjacent pixels tend to be highly correlated: adjacent pixel values tend to be close to each other, the value of a given pixel can be predicted from the values of its neighbours, and the visual contribution of a single pixel to an image is redundant.
To reduce interpixel redundancy, the image is transformed into a more efficient format. For example, the differences between adjacent pixels can be used to store an image. This transformation process is called mapping, and its reverse is called inverse mapping.

We can detect the presence of correlation between pixels (interpixel redundancy) by computing the autocorrelation coefficients along a row of pixels:

γ(n) = A(n) / A(0)

where

A(n) = (1 / (N - n)) Σ_{y=0}^{N-1-n} f(x, y) f(x, y + n)

The maximum possible value of γ(n) is 1, and this value is approached for this image both for adjacent pixels and for pixels which are separated by 45 pixels (or multiples of 45).
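A small Python sketch of this measurement along a single row (the synthetic row below is illustrative):

```python
import numpy as np

def autocorrelation(row, n):
    """gamma(n) = A(n) / A(0) for one image row, following the formula above."""
    N = len(row)
    A = lambda k: np.sum(row[:N - k] * row[k:]) / (N - k)
    return A(n) / A(0)

row = np.arange(100.0, 164.0)     # a slowly varying (strongly correlated) row
print(autocorrelation(row, 1))    # close to 1: adjacent pixels are highly correlated
```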

Psychovisual Redundancy
Psychovisual redundancy refers to the fact that some information is more important to the human visual system than other information. Using fewer gray levels reduces the size of the image. The elimination of psychovisually redundant data from an image results in a loss of quantitative information, and this process is not reversible.

The key in image compression algorithm development is to determine the minimal data required to retain the necessary information. This is achieved by taking advantage of the redundancy that exists in the image: any redundant information that is not required can be eliminated to reduce the amount of data used to represent the image.

The eye does not respond with equal sensitivity to all visual information. Certain information simply has less relative importance than other information in normal visual processing; it is said to be psychovisually redundant and can be eliminated without significantly impairing the quality of image perception. Because the elimination of psychovisually redundant data results in a loss of quantitative information, this is a lossy data compression method. Image compression methods based on the elimination of psychovisually redundant data (usually called quantization) are typically applied to commercial broadcast TV and similar applications intended for human viewing.

9. Explain the Huffman coding algorithm, giving a numerical example.

Huffman Coding
This coding reduces the average number of bits per pixel. It assigns variable length bit strings to different symbols and achieves compression in two steps:
- Source reduction
- Code assignment

Steps
1. Find the gray level probabilities from the image histogram.
2. Arrange the probabilities in decreasing order, highest at the top.
3. Combine the smallest two by addition, always keeping the sums in decreasing order.
4. Repeat step 3 until only two probabilities are left.
5. Working backward along the tree, generate the code by alternately assigning 0 and 1.

Fig : Huffman Source Reductions

Fig : Huffman code assignment procedure

Calculating Lavg and entropy:
Lavg = 2.2 bits/pixel
Entropy = 2.14 bits/pixel
Efficiency of the Huffman code = 2.14 / 2.2 = 0.973
Constraint: symbols must be coded one at a time, and the resulting code must be uniquely and instantaneously decodable.

Encoding

Decoding

10. Explain Constrained Least Squares filtering.

Constrained Least Squares Filtering


Only the mean and variance of the noise are required. The degradation model in vector-matrix form is

g = H f + η

where g, f and η are MN x 1 vectors and H is an MN x MN matrix.

The objective function is

min C = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} [∇² f(x, y)]²

subject to

||g - H f̂||² = ||η||²


The solution in the frequency domain is

F̂(u, v) = [ H*(u, v) / ( |H(u, v)|² + γ |P(u, v)|² ) ] G(u, v)

where P(u, v) is the Fourier transform of the Laplacian operator

           0  -1   0
p(x, y) = -1   4  -1
           0  -1   0

and γ is a parameter adjusted so that the constraint is satisfied.
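A sketch of this frequency-domain solution with NumPy FFTs (gamma is a user-chosen regularization parameter, h_psf is the blur point spread function, and PSF centring details are ignored):

```python
import numpy as np

def cls_filter(g, h_psf, gamma=0.01):
    """Constrained least squares restoration of a degraded image g."""
    M, N = g.shape
    # Laplacian constraint operator p(x, y), zero-padded to the image size
    p = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
    P = np.fft.fft2(p, s=(M, N))
    H = np.fft.fft2(h_psf, s=(M, N))
    G = np.fft.fft2(g)
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + gamma * np.abs(P) ** 2)
    return np.real(np.fft.ifft2(F_hat))
```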

In the unconstrained case we seek a solution that minimizes the function

M(f) = ||y - H f||²

A necessary condition for M(f) to have a minimum is that its gradient with respect to f is equal to zero. This gradient is given by

∇M(f) = ∂M(f)/∂f = 2(-H^T y + H^T H f)

By using the steepest descent type of optimization we can formulate an iterative rule as follows:

f_0 = β H^T y
f_{k+1} = f_k + β H^T (y - H f_k) = β H^T y + (I - β H^T H) f_k

Constrained least squares iteration

In this method we attempt to solve the problem of constrained restoration iteratively. As already mentioned, the following functional is minimized:

M(f, α) = ||y - H f||² + α ||C f||²

The necessary condition for a minimum is that the gradient of M(f, α) is equal to zero:

∇M(f, α) = ∂M(f, α)/∂f = 2[(H^T H + α C^T C) f - H^T y]

The initial estimate and the updating rule for obtaining the restored image are then

f_0 = β H^T y
f_{k+1} = f_k + β [H^T y - (H^T H + α C^T C) f_k]

It can be proved that the above iteration (known as iterative CLS or the Tikhonov-Miller method) converges if

0 < β < 2 / λ_max

where λ_max is the maximum eigenvalue of the matrix (H^T H + α C^T C). If the matrices H and C are block-circulant, the iteration can be implemented in the frequency domain.
