Compression

Multimedia Compression
Audio, image and video require vast amounts of data

320x240x8 bits grayscale image: 77 KB
1100x900x24 bits color image: 3 MB
640x480x24 bits x 30 frames/sec video: 27.6 MB/sec
Low network bandwidth does not allow real-time video transmission
Slow storage devices do not allow fast playback
Compression reduces storage requirements
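The raw sizes quoted above follow from width x height x bits per pixel (x frame rate for video). A minimal sketch of that arithmetic; the helper name `raw_bytes` is illustrative and not part of the slides:

```python
def raw_bytes(width, height, bits_per_pixel, frames=1):
    """Uncompressed size in bytes of `frames` images of the given geometry."""
    return width * height * bits_per_pixel * frames // 8

gray = raw_bytes(320, 240, 8)          # 76,800 bytes, about 77 KB
color = raw_bytes(1100, 900, 24)       # 2,970,000 bytes, about 3 MB
video = raw_bytes(640, 480, 24, 30)    # 27,648,000 bytes for one second of video
```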
E.G.M. Petrakis Multimedia Compression 1
Classification of Techniques
Lossless: recover the original representation
Lossy: recover a representation similar to the original one
    high compression ratios
    more practical use
Hybrid: JPEG, MPEG, px64 combine several approaches

Compression Standards
[Figure omitted: Furht et al. 96]
Lossless Techniques
[Figure omitted: Furht et al. 96]
Lossy Techniques
[Figure omitted: Furht et al. 96]
JPEG Modes of Operation

Sequential DCT: the image is encoded in one left-to-right, top-to-bottom scan
Progressive DCT: the image is encoded in multiple scans (if the transmission time is long, a rough decoded image can be reproduced)
Hierarchical: encoding at multiple resolutions
Lossless: exact reproduction
JPEG Block Diagrams
[Figure omitted: Furht et al. 96]
JPEG Encoder
Three main blocks:
    Forward Discrete Cosine Transform (FDCT)
    Quantizer
    Entropy Encoder
Essentially the sequential JPEG encoder
Main component of progressive, lossless and hierarchical encoders
For gray level and color images
Sequential JPEG
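Pixels in [0, 2^p - 1] are shifted to [-2^(p-1), 2^(p-1) - 1]
The image is divided into 8x8 blocks
Each 8x8 block is DCT transformed

$$F(u,v) = \frac{C(u)}{2}\,\frac{C(v)}{2} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$

$$C(u) = \begin{cases} 1/\sqrt{2} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases} \qquad C(v) = \begin{cases} 1/\sqrt{2} & \text{for } v = 0 \\ 1 & \text{for } v > 0 \end{cases}$$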
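The forward DCT of one 8x8 block can be sketched directly from the formula above. This is a plain, unoptimized illustration (real encoders use fast factorizations); the names `c` and `fdct_8x8` are chosen here and are not from the slides:

```python
import math

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def fdct_8x8(block):
    """Forward DCT of one 8x8 block of level-shifted samples f[x][y]."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = c(u) * c(v) / 4 * s
    return out

# A flat block (every level-shifted sample equal to 100):
# all of the energy ends up in the DC coefficient F(0,0).
flat = [[100] * 8 for _ in range(8)]
coeffs = fdct_8x8(flat)
```

For the flat block, `coeffs[0][0]` is 8 times the sample value and every AC coefficient is zero up to floating-point error.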
DCT Coefficients
F(0,0) is the DC coefficient: proportional to the average value over the 64 samples (8 times the mean under the DCT normalization used here)
The remaining 63 coefficients are the AC coefficients
Pixels in [-128,127]: DCTs in [-1024,1023]
Most coefficients are 0 or close to 0 and need not be encoded
This is what makes compression possible
Quantization Step
All 64 DCT coefficients are quantized
    Fq(u,v) = Round[F(u,v)/Q(u,v)]
Reduces to 0 the amplitude of coefficients that contribute little or nothing to image quality
Discards information that is not visually significant
The quantization coefficients Q(u,v) are specified by quantization tables
JPEG allows up to 4 quantization tables
Quantization Tables
Furht et al. 96
    for (i = 0; i < 8; i++)
        for (j = 0; j < 8; j++)
            Q[i][j] = 1 + (1 + i + j) * quality;
quality = 1: best quality, lowest compression
quality = 25: poor quality, highest compression
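A short sketch of the quantization step using the table rule above, assuming an 8x8 table (one entry per DCT coefficient). The coefficient block `F` is made-up demonstration data, and the function names are illustrative:

```python
def quant_table(quality, n=8):
    """Q[i][j] = 1 + (1 + i + j) * quality for an n x n block."""
    return [[1 + (1 + i + j) * quality for j in range(n)] for i in range(n)]

def quantize(F, Q):
    """Fq(u,v) = Round[F(u,v) / Q(u,v)] (Python's round() sends ties to even)."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

def dequantize(Fq, Q):
    """Decoder side: F(u,v) is approximated by Fq(u,v) * Q(u,v)."""
    return [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]

Q = quant_table(quality=2)
# Hypothetical coefficient block: a large DC term and small AC terms.
F = [[3.0] * 8 for _ in range(8)]
F[0][0] = 476.0
Fq = quantize(F, Q)        # small high-frequency terms collapse to 0
F_hat = dequantize(Fq, Q)  # the DC term comes back as 477, not 476: the loss
```

The step size grows with i + j, so high-frequency coefficients are quantized more coarsely than low-frequency ones.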
AC Coefficients
The 63 AC coefficients are ordered by a zig-zag sequence
Places low frequencies before high frequencies
High frequencies are likely to be 0
Sequences of such 0 coefficients are encoded with fewer bits
[Figure omitted: Furht et al. 96]
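The zig-zag sequence can be generated by walking the anti-diagonals of the block in alternating directions. A minimal sketch; `zigzag_order` is a name chosen for this example:

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    def key(pos):
        r, c = pos
        d = r + c                        # index of the anti-diagonal
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the row decreasing.
        return (d, r if d % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

order = zigzag_order()
# order[0] is the DC position; order[1:] lists the 63 AC positions,
# low frequencies first.
```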
DC Coefficients
Predictive coding of DC coefficients
Adjacent blocks have similar DC intensities
Coding the differences yields high compression
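Predictive coding of the DC coefficients amounts to storing each block's DC value as a difference from the previous block. A minimal sketch, assuming the predictor for the first block is 0; the DC values are made-up demonstration data:

```python
def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    prev = 0                       # assumed predictor for the first block
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dc_restore(diffs):
    """Decoder side: accumulate the differences to recover the DC values."""
    prev = 0
    values = []
    for d in diffs:
        prev += d
        values.append(prev)
    return values

dcs = [120, 122, 121, 125]         # hypothetical DC values of adjacent blocks
diffs = dc_differences(dcs)        # small numbers after the first block
```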
Entropy Encoding
Encodes sequences of quantized DCT coefficients into binary sequences
AC: (runlength, size)(amplitude)
DC: (size, amplitude)
runlength: number of consecutive 0s, up to 15 (takes up to 4 bits for coding)
amplitude: first non-zero value
size: number of bits needed to encode the amplitude
Example: 0 0 0 0 0 0 476 is coded as (6,9)(476)
Longer runs are split: (39,4)(12) = (15,0)(15,0)(7,4)(12), where each (15,0) stands for a run of 16 zeros
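The two examples above can be reproduced with a small run-length pass over the zig-zag ordered AC values. This is a sketch of the symbol-forming step only (no Huffman coding); the (0,0) end-of-block marker for trailing zeros is a JPEG convention that the slides do not mention, and the function name is illustrative:

```python
def ac_symbols(ac):
    """Turn zig-zag ordered AC values into ((runlength, size), amplitude) pairs."""
    symbols = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1
            continue
        while run > 15:                    # split long runs: (15,0) covers 16 zeros
            symbols.append(((15, 0), None))
            run -= 16
        size = abs(value).bit_length()     # bits needed for the amplitude
        symbols.append(((run, size), value))
        run = 0
    if run:                                # only zeros remain: end-of-block marker
        symbols.append(((0, 0), None))
    return symbols

short_run = ac_symbols([0] * 6 + [476])
long_run = ac_symbols([0] * 39 + [12])
```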
Huffman coding
Converts each sequence into binary
The DC coefficient comes first, followed by the ACs
Huffman tables are specified in JPEG
Each (runlength, size) is encoded using Huffman coding
Each (amplitude) is encoded using a variable length integer code
Example: (1,4)(12) => (11111101101100)
Example of Huffman table

[Figure omitted: Furht et al. 96]
JPEG Encoding of a 8x8 block

[Figure omitted: Furht et al. 96]
Compression Measures
Compression ratio (CR): increases with higher compression
CR = OriginalSize/CompressedSize
Root Mean Square Error (RMS): better quality with lower RMS

$$RMS = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - x_i)^2}$$

X_i: original pixel values
x_i: restored pixel values
n: total number of pixels
[Figure omitted: Furht et al. 96]
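Both measures are straightforward to compute. A minimal sketch, taking RMS as the square root of the mean squared pixel difference; the pixel lists are made-up demonstration data:

```python
import math

def compression_ratio(original_size, compressed_size):
    """CR = OriginalSize / CompressedSize."""
    return original_size / compressed_size

def rms_error(original, restored):
    """Root mean square error between original and restored pixel values."""
    n = len(original)
    return math.sqrt(sum((X - x) ** 2 for X, x in zip(original, restored)) / n)

cr = compression_ratio(76800, 7680)                    # a 10:1 ratio
rms = rms_error([10, 20, 30, 40], [12, 18, 30, 40])    # sqrt((4 + 4) / 4)
```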
JPEG Decoder
The same steps in reverse order
The binary sequences are converted to symbol sequences using the Huffman tables
Dequantization: F(u,v) = Fq(u,v)Q(u,v)
Inverse DCT:

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\,C(v)\,F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
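The last two decoder steps (dequantization and inverse DCT) can be sketched as follows. The flat quantization table and the DC-only block are assumptions made for the demonstration, and p = 8 bits per sample is assumed when undoing the level shift:

```python
import math

def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def idct_8x8(F):
    """Inverse DCT: rebuild the 8x8 samples f[x][y] from coefficients F[u][v]."""
    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    s += (c(u) * c(v) * F[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = s / 4
    return out

# A block whose only non-zero quantized value is the DC term.
Fq = [[0] * 8 for _ in range(8)]
Fq[0][0] = 50
Q = [[16] * 8 for _ in range(8)]                                 # hypothetical flat table
F = [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]   # dequantize
samples = idct_8x8(F)                                            # every sample is 800 / 8
pixels = [[round(s) + 128 for s in row] for row in samples]      # undo the level shift
```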
Progressive JPEG
When image encoding or transmission takes a long time, it is useful to produce an approximation of the original image first and improve it gradually
[Figure omitted: Furht et al. 96]
Progressive Spectral Selection

The DCT coefficients are grouped into several bands
Low-frequency bands are sent first
    band1: DC coefficient only
    band2: AC1, AC2 coefficients
    band3: AC3, AC4, AC5, AC6 coefficients
    band4: AC7, AC8 coefficients
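Spectral selection can be pictured as slicing the zig-zag ordered coefficients of each block into scans. In the sketch below the first four bands follow the example above; the final catch-all band for AC9 to AC63 and all names are assumptions of this illustration:

```python
# (first, last) zig-zag index of each band; the last band is an assumption.
BANDS = [(0, 0), (1, 2), (3, 6), (7, 8), (9, 63)]

def spectral_scans(zigzag_coeffs):
    """Split the 64 zig-zag ordered coefficients of a block into bands."""
    return [zigzag_coeffs[lo:hi + 1] for lo, hi in BANDS]

def partial_block(scans, received):
    """Coefficients available after `received` scans; the rest are taken as 0."""
    coeffs = [value for scan in scans[:received] for value in scan]
    return coeffs + [0] * (64 - len(coeffs))

coeffs = list(range(64))             # stand-in values, one per zig-zag index
scans = spectral_scans(coeffs)
rough = partial_block(scans, 2)      # DC, AC1 and AC2 only: a rough image
```

Decoding `rough` gives a blurry approximation that sharpens as further scans arrive.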
Lossless JPEG
Simple predictive encoding
[Figure omitted: prediction schemes, Furht et al. 96]
Hierarchical JPEG
Produces a set of images at multiple resolutions
Begins with small images and continues with larger images (down-sampling)
The reduced image is scaled up to the next resolution and used as a predictor for the higher resolution image
Encoding
1. Down-sample the image by 2^a in each of x, y
2. Encode the reduced size image (sequential, progressive, ...)
3. Up-sample the reduced image by 2
4. Interpolate by 2 in x, y
5. Use the up-sampled image as predictor
6. Encode the differences (predictive coding)
7. Repeat steps 3 to 6 until the full resolution is encoded
[Figure omitted: Furht et al. 96]
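The encoding steps above can be sketched as a small resolution pyramid. Several simplifications are assumed here: down-sampling averages 2x2 neighbourhoods, up-sampling uses pixel replication in place of interpolation, the difference images are kept raw instead of being entropy coded, and image dimensions are divisible by 2^levels. All names are illustrative:

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 neighbourhood (integer mean)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) // 4
             for c in range(0, w, 2)] for r in range(0, h, 2)]

def upsample(img):
    """Double the resolution by pixel replication (stand-in for interpolation)."""
    return [[img[r // 2][c // 2] for c in range(2 * len(img[0]))]
            for r in range(2 * len(img))]

def encode_pyramid(img, levels):
    """Return the coarsest image plus one difference image per finer level."""
    stack = [img]
    for _ in range(levels):
        stack.append(downsample(stack[-1]))
    diffs = []
    for finer, coarser in zip(reversed(stack[:-1]), reversed(stack[1:])):
        pred = upsample(coarser)               # predictor for the finer level
        diffs.append([[f - p for f, p in zip(fr, pr)]
                      for fr, pr in zip(finer, pred)])
    return stack[-1], diffs

def decode_pyramid(base, diffs):
    """Rebuild the full-resolution image from the base and the differences."""
    img = base
    for diff in diffs:
        pred = upsample(img)
        img = [[p + d for p, d in zip(pr, dr)] for pr, dr in zip(pred, diff)]
    return img

image = [[r * 4 + c for c in range(4)] for r in range(4)]   # toy 4x4 image
base, diffs = encode_pyramid(image, levels=2)
```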
JPEG for Color images

Encoding of 3 bands (RGB, HSV etc.) in two ways:
    Non-interleaved data ordering: encodes each band separately
    Interleaved data ordering: different bands are combined into Minimum Coded Units (MCUs)
Interleaving allows images to be displayed, printed or transmitted in parallel with decompression
Interleaved JPEG
Minimum Coded Unit (MCU): the smallest group of interleaved data blocks (8x8)
[Figure omitted: Furht et al. 96]
Video Compression
Various video encoding standards: QuickTime, DVI, H.261, MPEG etc.
Basic idea: compute motion between adjacent frames and transmit only differences
Motion is computed between blocks
Effective encoding of camera and object motion
MPEG
The Moving Picture Experts Group (MPEG) is a working group for the development of standards for compression, decompression, processing, and coded representation of moving pictures and audio
MPEG groups are open and have attracted large participation
http://mpeg.telecomitalialab.com
MPEG Features
Random access
Fast forward / reverse searches
Reverse playback
Audio-visual synchronization
Robustness to errors
Editability
Cost trade-off
MPEG-1, 2
At least 4 MPEG standards finished or under construction
MPEG-1: storage and retrieval of moving pictures and audio on storage media
    352x288 pixels/frame, 25 fps, at 1.5 Mbps
    Real-time encoding even on an old PC
MPEG-2: higher quality, same principles
    720x576 pixels/frame, 2-80 Mbps
MPEG-4
Encodes video content as objects
Based on identifying, tracking and encoding object layers which are rendered on top of each other
Enables objects to be manipulated individually or collectively in an audiovisual scene (interactive video)
Only a few implementations
Higher compression ratios
MPEG-7
Standard for the description of multimedia content
    XML Schema for content description
    Does not standardize extraction of descriptions
MPEG-1, 2, and 4 make content available
MPEG-7 makes content semantics available
MPEG-1,2 Compression
Compression of full motion video, interframe compression, stores differences between frames
A stream contains I, P and B frames in a given pattern
Equivalent blocks are compared and motion vectors are computed and stored as P and B frames
[Figure omitted: Furht et al. 96]
Frame Structures
I frames: self-contained, JPEG encoded
    Random access frames in MPEG streams
    Low compression
P frames: predictive coding with reference to the previous I or P frame
    Higher compression
B frames: bidirectional or interpolated coding using past and future I or P frames
    Highest compression
Example of MPEG Stream

[Figure omitted: Furht et al. 96]
B frames 2 3 4 are bi-directionally coded using I frame 1 and P frame 5
P frame 5 must be decoded before B frames 2 3 4
I frame 9 must be decoded before B frames 6 7 8
Frame order for transmission: 1 5 2 3 4 9 6 7 8
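The transmission order above follows from one rule: every B frame is held back until the I or P frame that follows it in display order has been sent. A minimal sketch; the function name and the string encoding of the frame pattern are choices made for this example:

```python
def transmission_order(frame_types):
    """Reorder display-order frames so each B frame follows both its references."""
    order, pending_b = [], []
    for number, kind in enumerate(frame_types, start=1):
        if kind == "B":
            pending_b.append(number)     # wait for the next I or P frame
        else:
            order.append(number)         # the reference frame goes out first
            order.extend(pending_b)      # then the B frames that depend on it
            pending_b = []
    return order + pending_b             # trailing B frames, if any

sent = transmission_order("IBBBPBBBI")   # display order: frames 1..9
```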
MPEG Coding Sequences

The MPEG application determines a sequence of I, P, B frames
For fast random access, code the whole video as I frames (MJPEG)
High compression is achieved by using a large number of B frames
Good sequence: (IBBPBBPBB)(IBBPBBPBB)...
Motion Estimation
The motion estimator finds the best matching block in P, B frames
Block: 8x8 or 16x16 pixels
P frames use only forward prediction: a block in the current frame is predicted from a past frame
B frames use forward prediction, backward prediction, or prediction by interpolation: the average of the forward and backward predicted blocks
Motion Vectors
Block: 16x16 pixels
[Figure omitted: Furht et al. 96]
One or two motion vectors per block
    One vector for forward predicted P or B frames or backward predicted B frames
    Two vectors for interpolated B frames
MPEG Encoding
I frames are JPEG compressed
P, B frames are encoded in terms of future or previous frames
Motion vectors are estimated and differences between predicted and actual blocks are computed
    These error terms are DCT encoded
    Entropy encoding produces a compact binary code
Special cases: static and intracoded blocks
MPEG encoder
[Figure omitted (label: JPEG encoding): Furht et al. 96]
MPEG Decoder
[Figure omitted: Furht et al. 96]
Motion Estimation Techniques

Not specified by MPEG
Block matching techniques
Estimate the motion of an n x m block in the present frame in relation to pixels in previous or future frames
The block is compared with a previous or future block within a search area of size (m+2p)x(n+2p)
    m = n = 16
    p = 6
Block Matching
[Figure omitted: search area in block matching techniques, Furht et al. 96]
Typical case: n=m=16, p=6
F: block in current frame
G: search area in previous (or future) frame
Cost functions
The block has moved to the position that minimizes a cost function
I. Mean Absolute Difference (MAD)

$$MAD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left| F(i,j) - G(i+dx, j+dy) \right|$$

F(i,j): a block in the current frame
G(i,j): the same block in a previous or future frame
(dx,dy): vector for the search location
dx, dy in [-p, p]
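A minimal sketch of the MAD cost. For simplicity it indexes blocks from their top-left corner instead of the centred indices of the formula, and it assumes G is the full (n+2p) x (m+2p) search area with the co-located block at offset (p, p). The toy data and all names are illustrative:

```python
def mad(F, G, dx, dy, p):
    """Mean absolute difference between block F and G displaced by (dx, dy)."""
    n, m = len(F), len(F[0])
    total = 0
    for i in range(n):
        for j in range(m):
            total += abs(F[i][j] - G[i + p + dx][j + p + dy])
    return total / (m * n)

p = 2
G = [[r * 8 + c for c in range(8)] for r in range(8)]       # toy 8x8 search area
# Toy 4x4 current block: a copy of the area displaced by (dx, dy) = (1, -1).
F = [[G[i + p + 1][j + p - 1] for j in range(4)] for i in range(4)]

best = min(((dx, dy) for dx in range(-p, p + 1) for dy in range(-p, p + 1)),
           key=lambda d: mad(F, G, d[0], d[1], p))
```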
More Cost Functions

II. Mean Squared Difference (MSD)

$$MSD(dx,dy) = \frac{1}{mn} \sum_{i=-n/2}^{n/2} \sum_{j=-m/2}^{m/2} \left[ F(i,j) - G(i+dx, j+dy) \right]^2$$

III. Cross-Correlation Function (CCF)

$$CCF(dx,dy) = \frac{\sum_i \sum_j F(i,j)\,G(i+dx,j+dy)}{\left[\sum_i \sum_j F^2(i,j)\right]^{1/2} \left[\sum_i \sum_j G^2(i+dx,j+dy)\right]^{1/2}}$$

Unlike MAD and MSD, the matching block maximizes CCF
More Cost Functions

IV. Pixel Difference Classification (PDC)

$$PDC(dx,dy) = \sum_i \sum_j T(dx,dy,i,j)$$

$$T(dx,dy,i,j) = \begin{cases} 1 & \text{if } \left| F(i,j) - G(i+dx,j+dy) \right| \le t \\ 0 & \text{otherwise} \end{cases}$$

t: predefined threshold
Each pixel is classified as a matching pixel (T=1) or a mismatching pixel (T=0)
The matching block maximizes PDC
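The cost functions II to IV can be sketched in the same way. As assumptions of this example, blocks are indexed from their top-left corner, G is the full search area with the co-located block at offset (p, p), and the toy data is made up:

```python
import math

def pairs(F, G, dx, dy, p):
    """Yield (F value, displaced G value) for every pixel of block F."""
    for i, row in enumerate(F):
        for j, f in enumerate(row):
            yield f, G[i + p + dx][j + p + dy]

def msd(F, G, dx, dy, p):
    """Mean squared difference: lower is better."""
    return sum((f - g) ** 2 for f, g in pairs(F, G, dx, dy, p)) / (len(F) * len(F[0]))

def ccf(F, G, dx, dy, p):
    """Normalized cross-correlation: higher is better."""
    num = sum(f * g for f, g in pairs(F, G, dx, dy, p))
    f_energy = sum(f * f for f, _ in pairs(F, G, dx, dy, p))
    g_energy = sum(g * g for _, g in pairs(F, G, dx, dy, p))
    return num / (math.sqrt(f_energy) * math.sqrt(g_energy))

def pdc(F, G, dx, dy, p, t):
    """Number of pixels whose absolute difference is within threshold t."""
    return sum(1 for f, g in pairs(F, G, dx, dy, p) if abs(f - g) <= t)

p = 2
G = [[r * 8 + c for c in range(8)] for r in range(8)]       # toy 8x8 search area
F = [[G[i + p + 1][j + p - 1] for j in range(4)] for i in range(4)]  # moved by (1, -1)
```

At the true displacement (1, -1), MSD is 0, CCF is 1 and PDC counts all 16 pixels as matching.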
Block Matching Techniques

Exhaustive: very slow but accurate
Approximation: faster but less accurate
    Three-step search
    2-D logarithmic search
    Conjugate direction search
    Parallel hierarchical 1-D search (not discussed here)
    Pixel difference classification (not discussed here)
Exhaustive Search
Evaluates the cost function at every location in the search area
    Requires (2p+1)^2 computations of the cost function
    For p=6 this requires 169 computations per block
Very simple to implement but very slow
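A minimal sketch of exhaustive search. Here `cost` stands for any cost function to be minimized (for example MAD for a fixed block and search area); the quadratic `toy_cost`, with its minimum placed at (1, 6), is a made-up stand-in:

```python
def exhaustive_search(cost, p):
    """Evaluate cost(dx, dy) at all (2p+1)^2 displacements; return best and count."""
    candidates = [(dx, dy) for dx in range(-p, p + 1) for dy in range(-p, p + 1)]
    best = min(candidates, key=lambda d: cost(*d))
    return best, len(candidates)

def toy_cost(dx, dy):
    """Hypothetical cost surface whose minimum lies at (1, 6)."""
    return (dx - 1) ** 2 + (dy - 6) ** 2

vector, evaluations = exhaustive_search(toy_cost, p=6)
```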
Three-Step Search
Computes the cost function at the center and 8 surrounding locations in the search area
The location with the minimum cost becomes the center location for the next step
The search range is reduced by half
Three-Step Motion Vector Estimation (p=6)

[Figure omitted: Furht et al. 96]
Three-Step Search
1. Compute cost (MAD) at 9 locations
    Center + 8 locations at distance 3 from center
2. Pick the min MAD location and recompute MAD at 9 locations at distance 2 from center
3. Pick the min MAD location and do the same at distance 1 from center
The smallest MAD from all locations indicates the final estimate
    M24 at (dx,dy)=(1,6)
    Requires 25 computations of MAD (9 + 8 + 8)
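The three steps can be sketched as a loop over the distances 3, 2, 1. Here `cost` stands for MAD evaluated for a fixed block, and the quadratic `toy_cost` with its minimum at (1, 6) is a made-up stand-in that mirrors the example; results are cached so no location is evaluated twice:

```python
def three_step_search(cost, steps=(3, 2, 1)):
    """Return the estimated motion vector and the number of cost evaluations."""
    center = (0, 0)
    evaluated = {}
    for step in steps:
        cx, cy = center
        neighbourhood = [(cx + ox, cy + oy)
                         for ox in (-step, 0, step)
                         for oy in (-step, 0, step)]
        for pos in neighbourhood:
            if pos not in evaluated:           # the center is reused, not recomputed
                evaluated[pos] = cost(*pos)
        center = min(neighbourhood, key=evaluated.get)
    return center, len(evaluated)

def toy_cost(dx, dy):
    """Hypothetical cost surface whose minimum lies at (1, 6)."""
    return (dx - 1) ** 2 + (dy - 6) ** 2

vector, evaluations = three_step_search(toy_cost)
```

With this smooth toy cost the search reaches (1, 6) after 25 evaluations, compared with 169 for exhaustive search; on real image data the result can be a local minimum.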
2-D Logarithmic Search

Combines a cost function and a predefined threshold T
Check the cost at M(0,0), 2 horizontal and 2 vertical locations and take the minimum
If the cost at any location is less than T then the search is complete
If not, search again along the direction of minimum cost, within a smaller region
[Figure omitted: Furht et al. 96]
    if cost at M(0,0) < T then search ends
    compute cost at M1, M2, M3, M4 and take their min
    if min cost < cost at M(0,0)
        if min cost < T then search ends
        else compute cost along the direction of minimum cost (M5, M6 in the example)
    else compute cost in the neighborhood of the min cost within p/2 (M5 in the example)
Conjugate Direction Search

[Figure omitted: Furht et al. 96]
Repeat
    find min MAD along dx=0,-1,1 (y fixed): M(1,0) in example
    find min MAD along dy=0,-1,1 starting from previous min (x fixed): M(2,2)
    search similarly along the direction connecting the above mins
Other Compression Techniques

Digital Video Interactive (DVI)
    similar to MPEG-2
Fractal Image Compression
    Find regions resembling fractals
    Image representation at various resolutions
Sub-band image and video coding
    Split signal into smaller frequency bands
Wavelet-based coding
References
B. Furht, S. W. Smoliar, H.-J. Zhang, Video and Image Processing in Multimedia Systems, Kluwer Academic Publishers, 1996