Image Compression

Introduction
Much of the information is in form of images

Images are handled by machines as a matrix
of digital picture elements, or pixels
The appearance of an image depends on
image type
resolution
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 1

Types of images & Resolution
bilevel (black & white)
e.g. faxes
grayscale
color
dot per inches (dpi)

600 x 600 actual medium quality laser printer
1200 x 1200 low cost phototypesetter
4800 x 4800 high resolution phototypesetter

Bilevel images: CCITT fax
standard
fax: facsimile
CCITT Comit Consultatif International Tlphonique et
Tlgraphique, it is part of the ITU International
Telecommunication Union, one of the specialized
agencies of the United Nations
In the late 70s CCITT starts thinking about a
standard for fax transmission
1980 CCITT Group 3 standard
group 1 & 2 are earlier attempt, which use simpler
encoding and modulations techniques, resulting in
very slow transmissions
CCITT Group 3 - I
It is the most common standard for fax
transmission
It is accepted worldwide, almost every fax
machine supports this standard
It uses compression algorithms for bilevel
images

CCITT Group 3 - II
Paper size: international A4 (not US letter)
standard resolution 204x98 dpi (200x100)
high resolution 204x196 dpi (200x200)
bilevel image 1 bit/pixel
image size: 1728x1188 bits at 1728 bits/line
standard resolution about 2 Mbit

1188
Transmission rate: 4.8 Kbit/s lines/page
today is usually higher, 14.4 33.6 Kbit/s

At 4.8Kbit/s in std resolution one page would
take about 430 sec, but only 1 minute on
average with Group 3 algorithms
5
Run-length coding
Each scan line is composed by
sequences of pixel of the same color
Count the number of element of each

run
Example 3w 4b 9w 2b 2w 6b 5w 2b 5w...

G3 1D
Group 3 One-Dimensional coding (G3 1D) is
called Modified Huffman (MH) as it encodes run
lengths using a
predefined Huffman
code
In order to maintain
black/white
syncronization, each line
begins with a white run,
eventually of zero length

G3 1D
1000 011 10100 11 0111 0010 ...
predefined Huffman codewords

have been found from the
probabilities of the runs in
typical handwritten documents

G3 1D
As one line has 1728 bits, we have to define a
codeword for all 1728 black and white run
lengths
As shorter runs occur more frequently that
longer runs, we code each run length in an
additive form
there is a terminating and makeup codeword
Lengths form 0 to 63 are coded with a single terminating
codeword
Longer runs are coded with one or more makeup
codewords and a terminating codeword
Each line is terminated with a EOL symbol composed of
eleven 0 and one 1
G3 2D
Group 3 Two-Dimensional coding (G3 2D) is
called Modified READ (MR) as it is a variant of a
previously defined code, called READ (Relative
Element Address Designate)
Many images have a high degree of vertical
coherence between consecutive lines
changing elements are coded w.r.t. a nearby change

position of the same color in the previous (reference) line
10
G3 2D
Nearby means within an interval of radius 3
pixels
If there are changing elements in the current
line without correspondents in the reference
line switch to horizontal mode (1D)
On the opposite if the ref line has a run with
no counterpart in the current line special
pass code

G3 2D
reference line
current line
vertical mode horizontal pass vertical ...

mode code mode
-1 0 +2 -2
from a Huffman table, <mode | length of preceding 0001

generated
with codewords for white run | length of black run>
code
-3, -2, -1, 0, +1, +2, +3
12
G3 2D
Two dimensional coding is more prone to
transmission errors
In the G3 1D an error may cause problems in the
entire line, but syncronization is forced back by EOL
codeword
Here an error in the reference line is likely
propagated in all the other lines
For this reason there are 1 reference line for each k
lines (i.e. k-1 are coded w.r.t. each ref line)
standard resolution k=2
high resolution k=4

CCITT fax standard
compression performances
Standard resolution (~200x100 dpi)
G3 1D 0.13 bits/pixel 57s. for A4 at 4.8 Kbps
G3 2D (k=2) 0.11 bits/pixel 47s. for A4 at 4.8 Kbps
High resolution (~200x200 dpi)

G3 2D (k=4) 0.09 bits/pixel 74s. for A4 at 4.8 Kbps
Compression is very good for office image where

run lengths are long
It would be very bad for bilevel natural images
14
Continuous-tone images:
why lossless compression?
lossy compression is often preferred to have
remarkably more compressed images, with
good quality
However there are some situations in which
using an approximation may not be adequate
medical images
historical documents
images with legal relevance

lossless compression
GIF standard
PNG standard
JPEG-LS
It is a quite new standard. The original JPEG
standard included a lossless mode, but its
performances were not close to state of the
art
extimation of pixel value using quite simple
context: effective and low cost solution
www.hpl.hp.com/loco
16
GIF image format - I
Adopted by CompuServe to minimize the time
required to download images over a modem
link
The most widely used lossless image format
until 1995
8-bit pixel description
256 color images, but it is possible to use a
color map

GIF image format - II
The color map can be specified for each image
or can be omitted
if specified, it is included as an header into image
file, in uncompressed form
color map is composed of 256 24-bit entries, that
specify 256 RGB colors
Compression scheme used is LZW
Alphabet symbols are the 256 colors of the color
map plus a clear code and an end-of-information
code

GIF image format - III
Even if this feature is not widely used, GIF files
may contain more than one image, and it is
possible to share the color map
LZW-coded information is grouped into blocks
preceded by a byte-count, in order to skip an
image without decompressing it
In 1995 Unisys announced that there would be
royalties on GIF implementations due to an old
patent they held on LZW
This catalyzed the development of a new lossless
image format, designed for public domain and
with the last improvements
19
PNG image format - I
Portable Network Graphics (pronounced ping)
it uses gzip compression scheme
through some improvements compression obtained
is about 10-30% better than GIF
By default it encodes the pixels in raster scan
order, but some other methods are available
it is possible to code horizontal difference, i.e. the
difference between current pixel value and the previous
one or vertical difference, i.e. the difference w.r.t. the
above pixel
average difference, the difference with the average of
above and next pixel
...
PNG image format - II
It is possible to use more than 256 colors, up
to 16 bit grayscale and 48 bit color
GIF uses one special pixel value to indicate
transparency, PNG uses 256 different values
per pixel, allowing for picture progressively
fading into the background
It seems inevitable that PNG format will

gradually assume the role of standard lossless
image format for the WWW, replacing GIF

why lossy compression?
Digital images are yet an approximation of the
real analog phenomenon
lossy techniques allow to obtain very good
compression with a modest lost of details
This is useful for storing and trasmitting
images

lossy compression
JPEG
JPEG2000
a new image coding system that uses state-
of-the-art compression techniques based on
wavelet technology
file extension .jp2
With very compressed files, if image size is
the same, perceived quality of JPEG2000
images is better w.r.t. JPEG images

JPEG format - I
JPEG is a standard defined by the Joint
Photographic Experts Group in 1992
It was conceived to transmit images at 64
Kbps
It has a lossy mode and a lossless mode (not
so much used, and today replaced by the
JPEG-LS standard)
With lossy mode it allows to obtain very good
quality at about 1 bit/pixel
Implementation complexity is reasonable

JPEG format - II
It could be used with graylevel and color
images
Each channel of the color space (RGB, YUV...)
is treated separately
it allows progressive transmission (that is
much better suited for WWW than raster
transmission)

Raster vs. progressive
transmission
JPEG Coder - I
Quantization Discrete
Cosine
Transform
Binary
Encoder

JPEG Coder - II
Image is divided in 8x8-pixel squares
Preprocessing
Apply Discrete Cosine Transform on
each square
Coefficient quantization
Bit stream encoding

Preprocessing: color space
transformation & downsampling
from RGB into YUV
The Y component represents the brightness
of a pixel, and the U and V components
together represent the hue and saturation
Human eye can see more detail in the Y
component than in the U and V, that can be
compressed more aggressively
4:4:4 no downsampling
4:2:2 horizontal downsampling of a factor 2
4:2:0 both horizontal and vertical downsampling

Discrete Cosine Transform - I
The discrete cosine transform (DCT)
is a Fourier-related transform similar to
the discrete Fourier transform (DFT),
but using only real numbers
It is used in JPEG because it is fast and
quite easy to implement efficiently

Discrete Cosine Transform - II
where
the block is N1 N2 pixels (in JPEG, 8x8)
A(i,j) is the value of pixel of position (i,j)
B(k1 ,k 2 ) is the DCT coefficient of position (k1 ,k 2 )
low values for k1 corresponds to low vertical
frequencies, low values for k 2 to low horizontal
frequencies
Generally higher frequencies have very low values

Discrete Cosine Transform - III
DCT function basis
each 8x8 square is

reduced to 64
coefficient

Discrete Cosine Transform - IV
Knowing with infinite precision the 64 DCT
coefficient it is possible to reconstruct exactly
the pixels of the square
But
finite precision
quantization of the coefficients (always)
Some coefficient related to high frequency are not
transmitted. This allows higher compression without
sacrifying too much quality as human eye is less
responsible

Quantization - I
The DCT matrix obtained is scaled differently in each
component, dividing each by a diferent factor
the factor for each component has been decided based
on human sensitivity to changes at each frequency
In practice the matrix of factor is usually
34
Quantization - II
Next, all values are rounded to nearest integer
This leads to a quite high number of 0s in the
high frequency zone, as factors are bigger

Zig-zag scan
Low frequency
coefficients are
transmitted before
higher frequency
coefficients
This allows for
progressive
visualization of this
8x8 block

Raster vs. progressive
transmission
Raster transmission
DCT coefficient of the upper left block, then
those of all the others in the upper part of
the image and so on
Progressive transmission
first all (0,0) coefficients, than all (0,1) and
so on, following zig-zag scan in each block

Binary coding
DCT(0,0) has usually a very slow variation
from one block to the next, as it is the mean
value
For this reason it is convenient to encode the
difference from the previous value
Tipically the bit stream is coded with Huffman
It is possible to use arithmetic scheme, gaining some
compression at cost of decoding speed
Huffman codes are predefined, or it is possible
to build optimal tables and insert them in the
stream
JPEG Decoder
Binary
Decoder
Some values are lost!
Dequantization
Good quality, but

reconstruction is not
Inverse DCT
exact

JPEG performances - I

JPEG performances - II
Original Quality factor 75
Quality factor 20 Quality factor 3
41

Image Compression

Încărcat de

Informații document

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Image Compression

Încărcat de

Drepturi de autor:

Formate disponibile

Introduction

Much of the information is in form of images

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 1

dot per inches (dpi)

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 2

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 4

standard resolution about 2 Mbit

today is usually higher, 14.4 33.6 Kbit/s

Count the number of element of each

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 6

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 7

1000 011 10100 11 0111 0010 ...

predefined Huffman codewords

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 8

changing elements are coded w.r.t. a nearby change

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 11

vertical mode horizontal pass vertical ...

from a Huffman table, <mode | length of preceding 0001

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 13

High resolution (~200x200 dpi)

Compression is very good for office image where

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 15

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 17

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 18

It seems inevitable that PNG format will

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 21

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 22

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 23

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 24

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 25

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 27

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 28

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 29

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 30

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 31

DCT function basis

each 8x8 square is

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 32

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 33

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 35

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 36

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 37

Some values are lost!

Good quality, but

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 39

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 40

Quality factor 20 Quality factor 3

S-ar putea să vă placă și