Sunteți pe pagina 1din 41

Introduction

Much of the information is in form of images


Images are handled by machines as a matrix
of digital picture elements, or pixels
The appearance of an image depends on
image type
resolution

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 1


Types of images & Resolution
bilevel (black & white)
e.g. faxes
grayscale
color

dot per inches (dpi)


600 x 600 actual medium quality laser printer
1200 x 1200 low cost phototypesetter
4800 x 4800 high resolution phototypesetter

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 2


Bilevel images: CCITT fax
standard
fax: facsimile
CCITT Comit Consultatif International Tlphonique et
Tlgraphique, it is part of the ITU International
Telecommunication Union, one of the specialized
agencies of the United Nations
In the late 70s CCITT starts thinking about a
standard for fax transmission
1980 CCITT Group 3 standard
group 1 & 2 are earlier attempt, which use simpler
encoding and modulations techniques, resulting in
very slow transmissions
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 3
CCITT Group 3 - I
It is the most common standard for fax
transmission
It is accepted worldwide, almost every fax
machine supports this standard
It uses compression algorithms for bilevel
images

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 4


CCITT Group 3 - II
Paper size: international A4 (not US letter)
standard resolution 204x98 dpi (200x100)
high resolution 204x196 dpi (200x200)
bilevel image 1 bit/pixel
image size: 1728x1188 bits at 1728 bits/line

standard resolution about 2 Mbit


1188
Transmission rate: 4.8 Kbit/s lines/page

today is usually higher, 14.4 33.6 Kbit/s


At 4.8Kbit/s in std resolution one page would
take about 430 sec, but only 1 minute on
average with Group 3 algorithms
5
Run-length coding
Each scan line is composed by
sequences of pixel of the same color

Count the number of element of each


run
Example 3w 4b 9w 2b 2w 6b 5w 2b 5w...

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 6


G3 1D
Group 3 One-Dimensional coding (G3 1D) is
called Modified Huffman (MH) as it encodes run
lengths using a
predefined Huffman
code
In order to maintain
black/white
syncronization, each line
begins with a white run,
eventually of zero length

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 7


G3 1D

1000 011 10100 11 0111 0010 ...

predefined Huffman codewords


have been found from the
probabilities of the runs in
typical handwritten documents

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 8


G3 1D
As one line has 1728 bits, we have to define a
codeword for all 1728 black and white run
lengths
As shorter runs occur more frequently that
longer runs, we code each run length in an
additive form
there is a terminating and makeup codeword
Lengths form 0 to 63 are coded with a single terminating
codeword
Longer runs are coded with one or more makeup
codewords and a terminating codeword
Each line is terminated with a EOL symbol composed of
eleven 0 and one 1
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 9
G3 2D
Group 3 Two-Dimensional coding (G3 2D) is
called Modified READ (MR) as it is a variant of a
previously defined code, called READ (Relative
Element Address Designate)
Many images have a high degree of vertical
coherence between consecutive lines

changing elements are coded w.r.t. a nearby change


position of the same color in the previous (reference) line
10
G3 2D
Nearby means within an interval of radius 3
pixels
If there are changing elements in the current
line without correspondents in the reference
line switch to horizontal mode (1D)
On the opposite if the ref line has a run with
no counterpart in the current line special
pass code

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 11


G3 2D

reference line

current line

vertical mode horizontal pass vertical ...


mode code mode
-1 0 +2 -2

from a Huffman table, <mode | length of preceding 0001


generated
with codewords for white run | length of black run>
code
-3, -2, -1, 0, +1, +2, +3
12
G3 2D
Two dimensional coding is more prone to
transmission errors
In the G3 1D an error may cause problems in the
entire line, but syncronization is forced back by EOL
codeword
Here an error in the reference line is likely
propagated in all the other lines
For this reason there are 1 reference line for each k
lines (i.e. k-1 are coded w.r.t. each ref line)
standard resolution k=2
high resolution k=4

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 13


CCITT fax standard
compression performances
Standard resolution (~200x100 dpi)
G3 1D 0.13 bits/pixel 57s. for A4 at 4.8 Kbps
G3 2D (k=2) 0.11 bits/pixel 47s. for A4 at 4.8 Kbps

High resolution (~200x200 dpi)


G3 2D (k=4) 0.09 bits/pixel 74s. for A4 at 4.8 Kbps

Compression is very good for office image where


run lengths are long
It would be very bad for bilevel natural images
14
Continuous-tone images:
why lossless compression?
lossy compression is often preferred to have
remarkably more compressed images, with
good quality
However there are some situations in which
using an approximation may not be adequate
medical images
historical documents
images with legal relevance

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 15


Continuous-tone images:
lossless compression
GIF standard
PNG standard
JPEG-LS
It is a quite new standard. The original JPEG
standard included a lossless mode, but its
performances were not close to state of the
art
extimation of pixel value using quite simple
context: effective and low cost solution
www.hpl.hp.com/loco
16
GIF image format - I
Adopted by CompuServe to minimize the time
required to download images over a modem
link
The most widely used lossless image format
until 1995
8-bit pixel description
256 color images, but it is possible to use a
color map

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 17


GIF image format - II
The color map can be specified for each image
or can be omitted
if specified, it is included as an header into image
file, in uncompressed form
color map is composed of 256 24-bit entries, that
specify 256 RGB colors
Compression scheme used is LZW
Alphabet symbols are the 256 colors of the color
map plus a clear code and an end-of-information
code

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 18


GIF image format - III
Even if this feature is not widely used, GIF files
may contain more than one image, and it is
possible to share the color map
LZW-coded information is grouped into blocks
preceded by a byte-count, in order to skip an
image without decompressing it
In 1995 Unisys announced that there would be
royalties on GIF implementations due to an old
patent they held on LZW
This catalyzed the development of a new lossless
image format, designed for public domain and
with the last improvements
19
PNG image format - I
Portable Network Graphics (pronounced ping)
it uses gzip compression scheme
through some improvements compression obtained
is about 10-30% better than GIF
By default it encodes the pixels in raster scan
order, but some other methods are available
it is possible to code horizontal difference, i.e. the
difference between current pixel value and the previous
one or vertical difference, i.e. the difference w.r.t. the
above pixel
average difference, the difference with the average of
above and next pixel
...
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 20
PNG image format - II
It is possible to use more than 256 colors, up
to 16 bit grayscale and 48 bit color
GIF uses one special pixel value to indicate
transparency, PNG uses 256 different values
per pixel, allowing for picture progressively
fading into the background

It seems inevitable that PNG format will


gradually assume the role of standard lossless
image format for the WWW, replacing GIF

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 21


Continuous-tone images:
why lossy compression?
Digital images are yet an approximation of the
real analog phenomenon
lossy techniques allow to obtain very good
compression with a modest lost of details
This is useful for storing and trasmitting
images

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 22


Continuous-tone images:
lossy compression
JPEG
JPEG2000
a new image coding system that uses state-
of-the-art compression techniques based on
wavelet technology
file extension .jp2
With very compressed files, if image size is
the same, perceived quality of JPEG2000
images is better w.r.t. JPEG images

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 23


JPEG format - I
JPEG is a standard defined by the Joint
Photographic Experts Group in 1992
It was conceived to transmit images at 64
Kbps
It has a lossy mode and a lossless mode (not
so much used, and today replaced by the
JPEG-LS standard)
With lossy mode it allows to obtain very good
quality at about 1 bit/pixel
Implementation complexity is reasonable

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 24


JPEG format - II
It could be used with graylevel and color
images
Each channel of the color space (RGB, YUV...)
is treated separately
it allows progressive transmission (that is
much better suited for WWW than raster
transmission)

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 25


Raster vs. progressive
transmission
JPEG Coder - I

Quantization Discrete
Cosine
Transform

Binary
Encoder

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 27


JPEG Coder - II
Image is divided in 8x8-pixel squares
Preprocessing
Apply Discrete Cosine Transform on
each square
Coefficient quantization
Bit stream encoding

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 28


Preprocessing: color space
transformation & downsampling
from RGB into YUV
The Y component represents the brightness
of a pixel, and the U and V components
together represent the hue and saturation
Human eye can see more detail in the Y
component than in the U and V, that can be
compressed more aggressively
4:4:4 no downsampling
4:2:2 horizontal downsampling of a factor 2
4:2:0 both horizontal and vertical downsampling

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 29


Discrete Cosine Transform - I
The discrete cosine transform (DCT)
is a Fourier-related transform similar to
the discrete Fourier transform (DFT),
but using only real numbers
It is used in JPEG because it is fast and
quite easy to implement efficiently

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 30


Discrete Cosine Transform - II

where
the block is N1 N2 pixels (in JPEG, 8x8)
A(i,j) is the value of pixel of position (i,j)
B(k1 ,k 2 ) is the DCT coefficient of position (k1 ,k 2 )
low values for k1 corresponds to low vertical
frequencies, low values for k 2 to low horizontal
frequencies
Generally higher frequencies have very low values

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 31


Discrete Cosine Transform - III

DCT function basis

each 8x8 square is


reduced to 64
coefficient

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 32


Discrete Cosine Transform - IV
Knowing with infinite precision the 64 DCT
coefficient it is possible to reconstruct exactly
the pixels of the square
But
finite precision
quantization of the coefficients (always)
Some coefficient related to high frequency are not
transmitted. This allows higher compression without
sacrifying too much quality as human eye is less
responsible

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 33


Quantization - I
The DCT matrix obtained is scaled differently in each
component, dividing each by a diferent factor
the factor for each component has been decided based
on human sensitivity to changes at each frequency
In practice the matrix of factor is usually

34
Quantization - II
Next, all values are rounded to nearest integer
This leads to a quite high number of 0s in the
high frequency zone, as factors are bigger

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 35


Zig-zag scan
Low frequency
coefficients are
transmitted before
higher frequency
coefficients
This allows for
progressive
visualization of this
8x8 block

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 36


Raster vs. progressive
transmission
Raster transmission
DCT coefficient of the upper left block, then
those of all the others in the upper part of
the image and so on
Progressive transmission
first all (0,0) coefficients, than all (0,1) and
so on, following zig-zag scan in each block

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 37


Binary coding
DCT(0,0) has usually a very slow variation
from one block to the next, as it is the mean
value
For this reason it is convenient to encode the
difference from the previous value
Tipically the bit stream is coded with Huffman
It is possible to use arithmetic scheme, gaining some
compression at cost of decoding speed
Huffman codes are predefined, or it is possible
to build optimal tables and insert them in the
stream
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 38
JPEG Decoder
Binary
Decoder

Some values are lost!

Dequantization

Good quality, but


reconstruction is not
Inverse DCT
exact

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 39


JPEG performances - I

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2004-2005 40


JPEG performances - II
Original Quality factor 75

Quality factor 20 Quality factor 3

41

S-ar putea să vă placă și