
Fundamental of Image and Video

Processing

by
Dr. S. D. Ruikar
Associate Professor,
Department of Electronics Engineering
Walchand College of Engineering,
Sangli
Module 1: Fundamental of
Image Processing and
Transforms
Syllabus
Basic steps of an image processing system,
sampling and quantization of an image,
basic relationships between pixels, need
for image transforms, 2D Discrete
Fourier Transform, Discrete Cosine
Transform (DCT), Walsh transform,
Hadamard transform, Haar transform,
SVD and KL transform
Introduction to Image Processing
Nature of Image Processing
Images are everywhere! Sources of Images are
paintings, photographs in magazines, Journals, Image
galleries, digital Libraries, newspapers, advertisement
boards, television and Internet.

Images are imitations of real-world objects.

In image processing, the term image is used to denote


the image data that is sampled, quantized, and readily
available in a form suitable for further processing by
digital computers.
IMAGE PROCESSING
ENVIRONMENT
Reflective mode Imaging

Reflective mode imaging represents the


simplest form of imaging and uses a
sensor to acquire the digital image. All
video cameras, digital cameras, and
scanners use some types of sensors for
capturing the image.
Emissive type imaging

Emissive type imaging is the second type, where


the images are acquired from self-luminous
objects without the help of a radiation source. In
emissive type imaging, the objects are self-
luminous. The radiation emitted by the object is
directly captured by the sensor to form an
image. Thermal imaging is an example of
emissive type imaging.
Transmissive imaging

Transmissive imaging is the third type,


where the radiation source illuminates the
object. The absorption of radiation by the
objects depends upon the nature of the
material. Some of the radiation passes
through the objects. The attenuated
radiation is sensed into an image.
Image Processing
Optical image processing is an area that deals
with the object, optics, and how processes are
applied to an image that is available in the form
of reflected or transmitted light.

Analog image processing is an area that deals
with the processing of analog electrical signals
using analog circuits. The imaging systems that
use film for recording images are also known as
analog imaging systems.
What is Digital Image Processing?

Digital image processing is an area that


uses digital circuits, systems, and software
algorithms to carry out the image
processing operations. The image
processing operations may include quality
enhancement of an image, counting of
objects, and image analysis.
Reasons for Popularity of DIP
1. It is easy to post-process the image. Small corrections
can be made in the captured image using software.
2. It is easy to store the image in the digital memory.
3. It is possible to transmit the image over networks. So
sharing an image is quite easy.
4. A digital image does not require any chemical
process. So it is very environment friendly, as harmful
film chemicals are not required or used.
5. It is easy to operate a digital camera.
IMAGE PROCESSING AND
RELATED FIELDS
Relations with other branches

Image processing deals with raster data or


bitmaps, whereas computer graphics primarily
deals with vector data.

In digital signal processing, one often deals with


the processing of a one-dimensional signal. In
the domain of image processing, one deals with
visual information that is often in two or more
dimensions.
Relations with other branches

The main goal of machine vision is to


interpret the image and to extract its
physical, geometric, or topological
properties. Thus, the output of image
processing operations can be subjected to
more techniques, to produce additional
information for interpretation.
Relations with other branches

Image processing is about still images.
A video is a sequence of still images (frames),
so video processing is an extension of
image processing. In addition, images are
strongly related to multimedia, as the field
of multimedia broadly includes the study of
audio, video, images, graphics, and
animation.
Relations with other branches

Optical image processing deals with


lenses, light, lighting conditions, and
associated optical circuits. The study of
lenses and lighting conditions has an
important role in the study of image
processing.
Relations with other branches

Image analysis is an area that concerns


the extraction and analysis of object
information from the image. Imaging
applications involve both simple statistics
such as counting and mensuration and
complex statistics such as advanced
statistical inference. So statistics play an
important role in imaging applications.
Digital Image
An image can be defined as a 2D signal
that varies over the spatial coordinates x
and y, and can be written mathematically
as f (x, y).
Digital Image

The value of the function f (x, y) at every point


indexed by a row and a column is called grey
value or intensity of the image.

Resolution is an important characteristic of an
imaging system. It is the ability of the imaging
system to reproduce the smallest discernible
details, i.e., to render the smallest-sized object clearly and
differentiate it from the neighbouring small
objects that are present in the image.
Useful definitions

Image resolution depends on two factors:
the optical resolution of the lens and the spatial
resolution. A useful way to define resolution is
the number of discernible line pairs per unit
distance.

Spatial resolution depends on two parameters:
the number of pixels of the image and the
number of bits necessary for adequate intensity
resolution, referred to as the bit depth.
Useful definitions

The number of bits necessary to encode
the pixel value is called bit depth. The number
of grey levels is usually a power of two: a bit
depth of k gives 2^k levels.
So the total number of bits necessary to
represent the image is
Number of rows × Number of columns ×
Bit depth
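The storage formula above can be checked with a small sketch. The function names and the 512 × 512, 8-bit example are illustrative assumptions, not part of the slides, and the count is for an uncompressed image without any file header.

```python
def image_storage_bits(rows: int, cols: int, bit_depth: int) -> int:
    """Bits needed for an uncompressed image: rows x columns x bit depth."""
    return rows * cols * bit_depth


def grey_levels(bit_depth: int) -> int:
    """Number of distinct grey levels available with the given bit depth."""
    return 2 ** bit_depth


# Hypothetical example: a 512 x 512 grey scale image with 8 bits per pixel.
bits = image_storage_bits(512, 512, 8)
print(bits, "bits =", bits // 8, "bytes;", grey_levels(8), "grey levels")
```

A true colour image with three 8-bit components would use a bit depth of 24 in the same formula.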
TYPES OF IMAGES
Types of Images Based on
Colour

Grey scale images are different from


binary images as they have many shades
of grey between black and white. These
images are also called monochromatic as
there is no colour component in the image,
like in binary images. Grey scale is the
term that refers to the range of shades
between white and black or vice versa.
Types of Images

In binary images, the pixels assume a value of 0


or 1. So one bit is sufficient to represent the pixel
value. Binary images are also called bi-level
images.

In true colour images, the pixel has a colour that


is obtained by mixing the primary colours red,
green, and blue. Each colour component is
represented like a grey scale image using eight
bits. Mostly, true colour images use 24 bits to
represent all the colours.
Indexed Image
A special category of colour images is the
indexed image. In most images, the full
range of colours is not used. So it is better
to reduce the number of bits by
maintaining a colour map, gamut, or
palette with the image.
Pseudocolour Image

Like true colour images, Pseudocolour


images are also used widely in image
processing. True colour images are called
three-band images. However, in remote
sensing applications, multi-band images or
multi-spectral images are generally used.
These images, which are captured by
satellites, contain many bands.
Types of Images based on
Dimensions

Types of Images Based on Dimensions


2D and 3D
Types of Images Based on Data Types
Single, double, Signed or unsigned.
DIGITAL IMAGE PROCESSING
OPERATIONS
Image Analysis
Image Enhancement
Image Restoration
Image Compression
Image Analysis
Image Synthesis
Image Processing Applications
Digital Imaging System
Components
Digital Imaging System
A digital imaging system is a set of
devices for acquiring, storing,
manipulating, and transmitting digital
images.
Nature of Light

Human beings perceive objects because of light.
Light sources are of two types: primary and
secondary. The sun and lamps are examples of
primary light sources. While primary sources
generate light, secondary light sources simply
reflect or diffuse light from primary sources. The
moon and clouds are examples of secondary
sources of light.
Nature of Light

Wavelength - Wavelength is the distance
between two successive wave crests or wave
troughs in the direction of travel.

Amplitude - Amplitude is the maximum distance
the oscillation travels away from its horizontal
axis.

Frequency - The frequency of vibration is the
number of waves crossing a point per unit time.
Simple image model

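$I(x, y, \lambda) = \sigma(x, y, \lambda)\, L(\lambda)$

where $\lambda$ is the wavelength, $L(\lambda)$ is the illumination, and $\sigma(x, y, \lambda)$ is the surface reflectance.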


Simple Image Formation Process
BIOLOGICAL ASPECTS OF
IMAGE ACQUISITION
Properties of Human Visual
System

Brightness adaptation

Intensity and brightness

Simultaneous contrast
Simultaneous contrast
Mach bands

Mach band effect is a phenomenon of


lateral inhibition of rods and cones, where
the sharp intensity changes are attenuated
by the visual system.
Frequency response
REVIEW OF DIGITAL CAMERA
SAMPLING AND QUANTIZATION
Shannon-Nyquist theorem

What should be the ideal size of the pixel?
Should it be big or small? The answer is
given by the Shannon-Nyquist theorem.
As per this theorem, the sampling
frequency should be greater than or equal
to 2 f_max, where f_max is the highest
frequency present in the image.
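A minimal sketch of what happens when the sampling condition is violated, using a 1D cosine as a stand-in for one image row. The frequencies (3 and 1 cycles per unit length) and the sampling rate of 4 samples per unit length are assumptions chosen for illustration.

```python
import math


def sample_cosine(freq, sampling_rate, num_samples):
    """Sample cos(2*pi*freq*t) at t = n / sampling_rate."""
    return [math.cos(2 * math.pi * freq * n / sampling_rate)
            for n in range(num_samples)]


def satisfies_nyquist(f_max, sampling_rate):
    """True when the sampling rate is at least twice the highest frequency."""
    return sampling_rate >= 2 * f_max


fs = 4.0                            # samples per unit length (assumed)
high = sample_cosine(3.0, fs, 8)    # would need fs >= 6, so it is undersampled
low = sample_cosine(1.0, fs, 8)     # needs fs >= 2, so it is sampled adequately

# The undersampled 3-cycle cosine yields exactly the same samples as the
# 1-cycle cosine: the high frequency is aliased to a lower one.
aliased = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low))
print(satisfies_nyquist(3.0, fs), satisfies_nyquist(1.0, fs), aliased)
```

In an image the same effect shows up as moiré patterns when fine detail is sampled with pixels that are too large.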
Image quantization
IMAGE QUALITY
Optical Resolution
Image Display Devices and
Device Resolution

Frame rate refers to the rate at which


video images are acquired and processed
or the rate at which the images are
transferred from the system to the display.
The international standard for frame rate is
25 frames per second and in the US it is
30 frames per second.
Pixel Size

Pixel size is the distance between the dots


in the device monitor. It is also known as
dot pitch. Pixel density is the number of
pixels per unit length in the device monitor.
Geometric Resolution

Geometric resolution is the order of the display


matrix. It is defined as the number of physical
pixels of display compatible with the image.
Colour resolution is the number of colours
available for the display. Colour depth is the
number of bits that is required to display all
colours. Gamut or palette refers to the range of
colours that are supported by the display
system.
Digital Halftone Process

Halftoning is a technique used to produce


the grey shades for bi-level devices such
as printers.
Halftoning Process
Random dithering

Random dithering is a simple way of
creating an illusion of continuous grey
levels. The method generates a random
number in the range 1 to 256 for a pixel. If
the pixel value is greater than the random
number generated, the pixel is plotted as
white. Otherwise, it is plotted as a black
pixel.
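The rule above can be sketched as follows. The image is assumed to be a list of rows of 8-bit grey values (0 to 255), white is encoded as 1 and black as 0, and the `seed` parameter is an addition so that runs are repeatable.

```python
import random


def random_dither(image, seed=0):
    """Binarise a grey image: a pixel is white (1) when its value exceeds
    a random number drawn from 1..256, otherwise black (0)."""
    rng = random.Random(seed)
    return [[1 if pixel > rng.randint(1, 256) else 0 for pixel in row]
            for row in image]


# Hypothetical test image: a uniform mid-grey patch of 100 x 100 pixels.
grey_patch = [[128] * 100 for _ in range(100)]
dithered = random_dither(grey_patch, seed=1)
white_fraction = sum(map(sum, dithered)) / (100 * 100)
print(round(white_fraction, 2))  # close to one half for mid-grey input
```

Because every pixel is thresholded independently, the fraction of white pixels tracks the grey level, but the result looks noisy compared with ordered or error-diffusion dithering.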
Ordered dithering

The patterns in ordered dithering are in a


more compact form, based on the order of
dots added. Some of the patterns are
shown in Fig. 2.17. This pattern array is
then used as a threshold mask for the
given image. If the values of the pixel are
less than the threshold value, it is plotted
as white and otherwise as dark.
Ordered Dithering
Algorithm

The algorithm for ordered dithering can be
written as follows:
1. Load the image.
2. Create a pattern of size n × n.
3. Apply an interpolation or replication technique to
enlarge the image.
4. If the enlarged image (x, y) > threshold array,
produce a dot at (x, y); otherwise insert zero.
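The four steps can be sketched as below. The 2 × 2 pattern (a Bayer-style ordering), the scaling of the pattern to 8-bit thresholds, and the use of pixel replication for the enlargement are all assumptions made for this illustration; the slides do not fix them.

```python
def replicate(image, factor):
    """Enlarge an image by repeating each pixel factor x factor times."""
    enlarged = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide) for _ in range(factor))
    return enlarged


def ordered_dither(image, pattern):
    """Threshold the enlarged image against a tiled n x n pattern."""
    n = len(pattern)
    # Spread the pattern ranks 0..n*n-1 evenly over the 0..255 grey range.
    thresholds = [[(rank + 0.5) * 256 / (n * n) for rank in row]
                  for row in pattern]
    big = replicate(image, n)
    return [[1 if big[y][x] > thresholds[y % n][x % n] else 0
             for x in range(len(big[0]))]
            for y in range(len(big))]


PATTERN_2X2 = [[0, 2], [3, 1]]   # assumed dot-growth order
halftone = ordered_dither([[0, 100, 255]], PATTERN_2X2)
for row in halftone:
    print(row)
```

Each input pixel becomes a 2 × 2 cell: the black pixel produces no dots, the mid-dark pixel two dots, and the white pixel four.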
Non-periodic dithering
The Floyd-Steinberg algorithm for non-periodic
dithering is as follows:
1. Load the image.
2. Perform the quantization process.
3. Calculate the quantization error.
4. Spread the error over the neighbours to the right and
below. The right pixel gets 7/16 of the error value,
the bottom pixel gets 5/16 of the error, the south-west
neighbour gets 3/16 of the error, and the
south-east neighbour gets 1/16 of the error.
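A minimal sketch of the error-diffusion steps above, assuming an 8-bit grey image stored as a list of rows, a fixed quantization threshold of 128, and a left-to-right, top-to-bottom scan. Error that would fall outside the image is simply dropped, which is one common choice rather than the only one.

```python
def floyd_steinberg(image, threshold=128):
    """Binarise an 8-bit grey image with Floyd-Steinberg error diffusion."""
    height, width = len(image), len(image[0])
    work = [[float(p) for p in row] for row in image]   # accumulates errors
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255.0 if old >= threshold else 0.0     # quantization
            out[y][x] = 1 if new else 0
            err = old - new                              # quantization error
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16           # right
            if y + 1 < height:
                work[y + 1][x] += err * 5 / 16           # bottom
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16   # south-west
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16   # south-east
    return out


# Hypothetical input: a uniform mid-grey 16 x 16 patch.
patch = [[127] * 16 for _ in range(16)]
result = floyd_steinberg(patch)
white_share = sum(map(sum, result)) / 256
print(white_share)
```

The four weights sum to 16/16, so inside the image the full quantization error is passed on and the average grey level is approximately preserved.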
IMAGE STORAGE AND FILE
FORMATS
Some of the raster file formats that are
very popular are
1. GIF (Graphics Interchange Format)
2. JPEG (Joint Photographic Experts Group)
3. PNG (Portable Network Graphics)
4. DICOM (Digital Imaging and Communications in Medicine)
Structure of TIFF File Format
Generally, file formats consist of two parts:

1. Image header
2. Image data
Structure of TIFF File Format
The tagged image file format (TIFF) is a
standard format that is considered for the
purpose of illustration. The TIFF image
format is as follows:

1. Image file header (IFH)


2. Image file directory (IFD)
3. Directory entry (DE)
4. Image data
BASIC RELATIONSHIPS AND
DISTANCE METRICS
Image Coordinate System
Image Coordinate system
Image Topology
Diagonal Elements
8-Neighbourhood
Connectivity
8-connectivity Vs m-connectivity
Relations
Distance Measures
Distance Measures
Classification of Image Operations

One way of classification is


Point
Local and
Global
Classification
Image Vs Array Operations
Arithmetic operations - Addition
Image Subtraction
Image Multiplication
Image Division
Image Division
Logical Operations
XOR
NOT Operation
Geometrical Operation
Scaling Operations
Zooming
Linear Interpolation
Reflection
Reflection along X
Shearing
Rotation
Affine Transform
Inverse Transform
Image Interpolation
Downsampling
Upsampling
Set Operations
Statistical Operations
Mean
Mode
Standard deviation
Variance
Entropy
Image Convolution
1D-Convolution
1D-Correlation
2D-Convolution
Properties of Convolution
Data Structures
Chain Code
RAG
Relational Structures
Hierarchical Structures
Pyramid Structures
Quadtree
Application Development
Image Transform
Image transforms can be simple arithmetic operations on images or
complex mathematical operations which convert images from one
representation to another.
Mathematical operations include simple image arithmetic, the Fourier transform, the fast
Hartley transform, the Hough transform and the Radon transform.
Histogram modification includes histogram equalization and adaptive
histogram equalization.
Image interpolation includes various methods for scaling, Kriging, image
warping and radial aberration correction.
Image registration is a tool for registering two similar 2D or 3D images and
finding an affine transformation that can be used to convert one into the
other. The operation is suitable for registering medical images of the same
object.
Background removal is a process to correct an image for non-uniform
background or non-uniform illumination.
Image rotation is a simple tool to rotate an image about its center by the
specified number of degrees.
NEED FOR IMAGE TRANSFORMS
Spatial Frequencies in Image
Processing

Spatial frequency is used to describe the
rate of change of pixel intensity of an
image in space. It can be visualized using
the line profile of the image in any row or
column.
Image Profile
Types of image transforms
Transforms
Introduction to Fourier
Transform
DFT
4 Point DFT
2D Discrete Fourier Transform
Properties of 2D DFT
1. Separable
2. Spatial shift
3. Periodicity
4. Convolution
5. Correlation
6. Scaling
7. Conjugate
8. Rotation
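As an illustration of the first property in the list, the sketch below computes a 2D DFT directly and again as 1D DFTs over the rows followed by 1D DFTs over the columns. It assumes the unnormalised forward kernel exp(-j2π(ux/M + vy/N)); some texts place a 1/MN or 1/N factor in the forward transform, which would scale both results equally. The function names and the 2 × 2 test image are illustrative.

```python
import cmath


def dft_1d(signal):
    """Unnormalised 1D DFT of a sequence."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * u * x / n)
                for x in range(n))
            for u in range(n)]


def dft_2d_direct(image):
    """2D DFT computed straight from the double sum."""
    m, n = len(image), len(image[0])
    return [[sum(image[x][y] * cmath.exp(-2j * cmath.pi * (u * x / m + v * y / n))
                 for x in range(m) for y in range(n))
             for v in range(n)]
            for u in range(m)]


def dft_2d_separable(image):
    """2D DFT as row-wise 1D DFTs followed by column-wise 1D DFTs."""
    rows = [dft_1d(row) for row in image]                 # transform each row
    cols = [dft_1d(list(col)) for col in zip(*rows)]      # then each column
    return [list(r) for r in zip(*cols)]                  # back to [u][v] order


image = [[1, 2], [3, 4]]
direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
max_diff = max(abs(direct[u][v] - separable[u][v])
               for u in range(2) for v in range(2))
print(direct[0][0].real, max_diff < 1e-9)
```

The separable route needs only 1D transforms, which is why fast 2D implementations are built from 1D FFTs.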
Separable
Shifting

Periodicity
Convolution
Scaling

Conjugate
2 D DFT Properties
Walsh Transform

Walsh basis function for N=4


Walsh Transform
Shortcut method for finding sign change
Walsh Transform
$$W(u) = \sum_{x=0}^{N-1} f(x)\, g(x, u), \qquad N = 2^n$$

where

$$g(x, u) = \frac{1}{N} \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u)}$$

$b_k(z)$: the kth bit in the binary representation of z

e.g.,
If $N = 8$, then $8 = 2^3$ and $n = 3$.
Let $z = 6 = 110_2$; then $b_0(z) = 0$, $b_1(z) = 1$, $b_2(z) = 1$.
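Example: N = 4 (n = 2)

$$W(u) = \frac{1}{N} \sum_{x=0}^{N-1} f(x) \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u)}$$

$$W(0) = \frac{1}{4} \sum_{x=0}^{3} f(x) \prod_{i=0}^{1} (-1)^{b_i(x)\, b_{1-i}(0)} = \frac{1}{4}\,[\,f(0) + f(1) + f(2) + f(3)\,]$$

$$W(1) = \frac{1}{4} \sum_{x=0}^{3} f(x) \prod_{i=0}^{1} (-1)^{b_i(x)\, b_{1-i}(1)} = \frac{1}{4}\,[\,f(0) + f(1) - f(2) - f(3)\,]$$

$$W(2) = \frac{1}{4}\,[\,f(0) - f(1) + f(2) - f(3)\,]$$

$$W(3) = \frac{1}{4}\,[\,f(0) - f(1) - f(2) + f(3)\,]$$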
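The kernel definition can be evaluated directly in a few lines. This is a sketch of the 1D Walsh transform exactly as defined above (including the 1/N factor in the forward kernel); the function names and the sample vector are illustrative assumptions.

```python
def bit(z, k):
    """b_k(z): the kth bit (k = 0 is least significant) of z."""
    return (z >> k) & 1


def walsh_kernel(x, u, n_bits):
    """g(x, u) = (1/N) * prod_i (-1)^(b_i(x) * b_{n-1-i}(u)), N = 2^n_bits."""
    exponent = sum(bit(x, i) * bit(u, n_bits - 1 - i) for i in range(n_bits))
    return (-1) ** exponent / (1 << n_bits)


def walsh_transform(f):
    """W(u) = sum_x f(x) g(x, u) for a sequence whose length is a power of two."""
    size = len(f)
    n_bits = size.bit_length() - 1
    if size == 0 or (1 << n_bits) != size:
        raise ValueError("length must be a power of two")
    return [sum(f[x] * walsh_kernel(x, u, n_bits) for x in range(size))
            for u in range(size)]


# Bits of z = 6 = 110 in binary, as in the N = 8 illustration.
print(bit(6, 0), bit(6, 1), bit(6, 2))

# N = 4 example with a hypothetical input f = [1, 2, 3, 4].
print(walsh_transform([1, 2, 3, 4]))
```

For N = 4 the signs produced by the kernel reproduce the four sums W(0) to W(3) written out in the example.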
Example : N = 8

Discrete cosine Transform
Properties of DCT
Hadamard Transform
2D Hadamard Transform

N=2

N=4
Haar Transform
The Haar transform is based on a class of
orthogonal matrices whose elements are
either 1, -1, or 0 multiplied by powers of
√2.

The Haar transform is a computationally
efficient transform, as the transform of an
N-point vector requires only 2(N-1)
additions and N multiplications.
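The addition count can be seen in a sketch of the fast Haar transform, which repeatedly replaces pairs of samples by their scaled sum and difference. This version assumes the orthonormal scaling 1/√2 at every stage for readability, so it performs more than N multiplications; only the 2(N-1) addition count quoted above is tracked. The function name and input are illustrative.

```python
import math


def fast_haar(signal):
    """Orthonormal Haar transform by pairwise sums and differences.

    Returns the coefficients (overall average first, finest details last)
    and the number of additions/subtractions performed.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = [float(v) for v in signal]
    scale = 1 / math.sqrt(2)
    additions = 0
    length = n
    while length > 1:
        half = length // 2
        sums = [(data[2 * i] + data[2 * i + 1]) * scale for i in range(half)]
        diffs = [(data[2 * i] - data[2 * i + 1]) * scale for i in range(half)]
        additions += 2 * half            # one sum and one difference per pair
        data[:length] = sums + diffs     # details stay, sums are refined further
        length = half
    return data, additions


coeffs, additions = fast_haar([1, 2, 3, 4, 5, 6, 7, 8])
energy_in = sum(v * v for v in [1, 2, 3, 4, 5, 6, 7, 8])
energy_out = sum(c * c for c in coeffs)
print(additions, round(energy_in, 6), round(energy_out, 6))
```

For N = 8 the stages use 8 + 4 + 2 = 14 = 2(N-1) additions, and because the transform is orthogonal the signal energy is preserved.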
Haar Transform
Haar Transform
Flow chart to compute HAAR Basis
Haar Transform
Generate one HAAR basis for N=2
Slant Transform
SVD Transform
SVD Transform
Properties

Applications
Singular Value Decomposition
We already know that the eigenvectors of a matrix A form a convenient
basis for working with A.

However, for rectangular matrices A (m × n), dim(Ax) ≠ dim(x) and the
concept of eigenvectors doesn't exist.

Yet, $A^T A$ (n × n) is a symmetric real matrix (A is real) and therefore there is
an orthonormal basis of eigenvectors $\{u_k\}$.

Consider the vectors $\{v_k\}$:

$$v_k = \frac{A u_k}{\sigma_k}$$

They are also orthonormal, since:

$$u_j^T A^T A\, u_k = \sigma_k^2\, \delta(k - j)$$

Here $\sigma_k^2 = \lambda_k$ is the eigenvalue of $A^T A$ belonging to $u_k$.
Singular Value Decomposition
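Since $A^T A$ is positive semidefinite, its eigenvalues satisfy $\{\lambda_k \ge 0\}$.

Define the singular values of A as $\sigma_k = \sqrt{\lambda_k}$

and order them in non-increasing order: $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$

Motivation: One can see that if A itself is square and symmetric,
then $\{u_k, \sigma_k\}$ are the set of its own eigenvectors and eigenvalues (up to sign).

For a general matrix A, assume $\{\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0 = \sigma_{r+1} = \sigma_{r+2} = \dots = \sigma_n\}$.

$$A u_k = 0 \cdot v_k, \qquad k = r+1, \dots, n$$

$$u_k \in \mathbb{R}^{n}; \qquad v_k \in \mathbb{R}^{m}$$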
Singular Value Decomposition
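Now we can write:

$$\begin{bmatrix} Au_1 & \cdots & Au_r & Au_{r+1} & \cdots & Au_n \end{bmatrix} = A \begin{bmatrix} u_1 & \cdots & u_r & u_{r+1} & \cdots & u_n \end{bmatrix} = AU$$

$$\begin{bmatrix} \sigma_1 v_1 & \cdots & \sigma_r v_r & 0 \cdot v_{r+1} & \cdots & 0 \cdot v_n \end{bmatrix} = \begin{bmatrix} v_1 & \cdots & v_r & v_{r+1} & \cdots & v_n \end{bmatrix} \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0 \end{bmatrix} = V \Sigma$$

$$A U U^T = V \Sigma U^T$$

$$A_{m \times n} = V_{m \times m}\, \Sigma_{m \times n}\, U^T_{n \times n}$$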
SVD: Example
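Let us find the SVD for the matrix

$$A = \begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix}$$

In order to find U, we calculate the eigenvectors of $A^T A$:

$$A^T A = \begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix}$$

$$(5 - \lambda)^2 - 9 = 0; \qquad \lambda_{1,2} = \frac{10 \pm \sqrt{100 - 64}}{2} = 5 \pm 3 = 8,\ 2$$

$$u_1 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}, \qquad u_2 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{bmatrix}$$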
SVD: Example

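The corresponding eigenvectors are found by:

$$\begin{bmatrix} 5 - \lambda_i & 3 \\ 3 & 5 - \lambda_i \end{bmatrix} u_i = 0$$

$$\begin{bmatrix} -3 & 3 \\ 3 & -3 \end{bmatrix} u_1 = 0 \ \Rightarrow\ u_1 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}$$

$$\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix} u_2 = 0 \ \Rightarrow\ u_2 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{bmatrix}$$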
SVD: Example
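Now, we obtain V and Σ:

$$A u_1 = \sigma_1 v_1: \quad \begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 0 \\ 2\sqrt{2} \end{bmatrix} = 2\sqrt{2} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \ \Rightarrow\ v_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\ \sigma_1 = 2\sqrt{2}$$

$$A u_2 = \sigma_2 v_2: \quad \begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} \sqrt{2} \\ 0 \end{bmatrix} = \sqrt{2} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \ \Rightarrow\ v_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\ \sigma_2 = \sqrt{2}$$

$A = V \Sigma U^T$:

$$\begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{bmatrix}$$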
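The recipe of this example (eigenvectors of AᵀA, singular values as square roots of the eigenvalues, then v_k = A u_k / σ_k) can be sketched for a 2 × 2 matrix with the standard library alone. The matrix used here, A = [[1, -1], [2, 2]], is an assumption chosen to be consistent with AᵀA = [[5, 3], [3, 5]] and the singular values 2√2 and √2 of the example; the helper names are illustrative, and the sketch assumes a full-rank matrix so that no σ_k is zero.

```python
import math


def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def eig_symmetric_2x2(m):
    """Eigenpairs of the symmetric matrix [[a, b], [b, c]], largest first."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    if abs(b) < 1e-12:                       # already diagonal
        pairs = [(a, [1.0, 0.0]), (c, [0.0, 1.0])]
        return sorted(pairs, key=lambda p: p[0], reverse=True)
    centre, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    pairs = []
    for lam in (centre + radius, centre - radius):
        norm = math.hypot(b, lam - a)
        pairs.append((lam, [b / norm, (lam - a) / norm]))
    return pairs


def svd_2x2(a):
    """Return (v_k list, sigma_k list, u_k list) with A = V Sigma U^T."""
    pairs = eig_symmetric_2x2(matmul(transpose(a), a))
    sigmas = [math.sqrt(lam) for lam, _ in pairs]
    us = [u for _, u in pairs]
    vs = []
    for sigma, u in zip(sigmas, us):
        au = [a[0][0] * u[0] + a[0][1] * u[1], a[1][0] * u[0] + a[1][1] * u[1]]
        vs.append([au[0] / sigma, au[1] / sigma])    # needs sigma > 0
    return vs, sigmas, us


def reconstruct(vs, sigmas, us):
    """Sum of sigma_k * v_k * u_k^T, i.e. V Sigma U^T."""
    return [[sum(s * v[i] * u[j] for s, v, u in zip(sigmas, vs, us))
             for j in range(2)] for i in range(2)]


A = [[1.0, -1.0], [2.0, 2.0]]
vs, sigmas, us = svd_2x2(A)
rebuilt = reconstruct(vs, sigmas, us)
print(sigmas, rebuilt)
```

In practice a library routine would be used instead; forming AᵀA explicitly is fine for a hand-sized example but loses accuracy for ill-conditioned matrices.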
Singular Value Decomposition
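For an m × n matrix A of rank r there exists a factorization
(Singular Value Decomposition = SVD) as follows:

$$A = U \Sigma V^T$$

where U is m × m, Σ is m × n, and V is n × n.

The columns of U are orthogonal eigenvectors of $AA^T$.
The columns of V are orthogonal eigenvectors of $A^TA$.
The eigenvalues $\lambda_1, \dots, \lambda_r$ of $AA^T$ are also the eigenvalues of $A^TA$.

$$\sigma_i = \sqrt{\lambda_i}, \qquad \Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$$

The $\sigma_i$ are the singular values. Note that this is the common naming, in which U holds the eigenvectors of $AA^T$; it is the reverse of the labelling used in the derivation above.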
Singular Value Decomposition
Illustration of SVD dimensions and
sparseness
SVD and Rank-k
approximations
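$A = U \Sigma V^T$
[Figure: SVD and rank-k approximation. The rows of A are objects and the columns are features; in each factor the significant components are separated from the noise components.]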
EXAMPLE SVD
EXAMPLE SVD
SVD example
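Let

$$A = \begin{bmatrix} 1 & -1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Thus m = 3, n = 2. Its SVD is

$$A = \begin{bmatrix} 0 & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{6}} & \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{6}} & -\tfrac{1}{\sqrt{3}} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \sqrt{3} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{bmatrix}$$

Typically, the singular values are arranged in decreasing order.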


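Example (2x2, full rank)

$$A = \begin{bmatrix} 2 & 2 \\ -1 & 1 \end{bmatrix}$$

$$A^T A = \begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix}, \qquad v_1 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}, \qquad v_2 = \begin{bmatrix} -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}$$

$$A v_1 = \sigma_1 u_1 = \begin{bmatrix} 2\sqrt{2} \\ 0 \end{bmatrix}, \qquad u_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

$$A v_2 = \sigma_2 u_2 = \begin{bmatrix} 0 \\ \sqrt{2} \end{bmatrix}, \qquad u_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

$$A = U \Sigma V^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{bmatrix}$$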
KL Transform
The KL transform is named after Kari Karhunen and Michel Loève, who
developed a series expansion method for continuous random
processes.
The KL transform is also known as the Hotelling transform (Harold
Hotelling developed the discrete formulation).
It is a reversible linear transform that exploits the statistical properties
of a vector representation.
The basis functions of the KL transform are the orthogonal eigenvectors
of the covariance matrix of a data set.
A KL transform optimally decorrelates the input data.
After a KL transform, most of the energy of the transform coefficients
is concentrated within the first few components.
This is the energy compaction property of the KL transform.
KL Transform
Eigenvalues and Eigenvectors
The concepts of eigenvalues and eigenvectors are important for
understanding the KL transform.

If C is a matrix of dimension n × n, then a scalar λ is called an
eigenvalue of C if there is a nonzero vector e in $\mathbb{R}^n$ such that:

$$C e = \lambda e$$

The vector e is called an eigenvector of the matrix C corresponding to
the eigenvalue λ.
Vector population
Consider a population of random vectors of the following form:

$$x = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$$

The quantity $x_i$ may represent the value (grey level)
of the image i.

The population may arise from the formation of the above
vectors for different image pixels.
Example: x vectors could be pixel values in several
spectral bands (channels)
Mean and Covariance Matrix
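The mean vector of the population is defined as:

$$m_x = E\{x\} = \begin{bmatrix} m_1 & m_2 & \cdots & m_n \end{bmatrix}^T = \begin{bmatrix} E\{x_1\} & E\{x_2\} & \cdots & E\{x_n\} \end{bmatrix}^T$$

The covariance matrix of the population is defined as:

$$C_x = E\{(x - m_x)(x - m_x)^T\}$$

For M vectors of a random population, where M is large
enough:

$$m_x = \frac{1}{M} \sum_{k=1}^{M} x_k$$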
Karhunen-Loeve Transform

Let A be a matrix whose rows are formed from the eigenvectors of the
covariance matrix C of the population.

They are ordered so that the first row of A is the eigenvector
corresponding to the largest eigenvalue, and the last row the
eigenvector corresponding to the smallest eigenvalue.

We define the following transform:

$$y = A(x - m_x)$$

It is called the Karhunen-Loeve transform.


Karhunen-Loeve Transform

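You can demonstrate very easily that:

$$E\{y\} = 0$$

$$C_y = A\, C_x A^T$$

$$C_y = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$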
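These properties can be checked numerically on a small population. The sketch below works with two-component vectors so that the eigenvectors of the 2 × 2 covariance matrix have a closed form; the six sample vectors are made-up data, and the covariance is estimated with a 1/M factor, which is an assumption (a 1/(M-1) factor is also common).

```python
import math


def mean_vector(samples):
    """Sample mean m_x = (1/M) * sum of x_k."""
    count = len(samples)
    return [sum(x[i] for x in samples) / count for i in range(2)]


def covariance_matrix(samples, mean):
    """Sample covariance C_x = (1/M) * sum of (x - m_x)(x - m_x)^T."""
    count = len(samples)
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for x in samples:
        d = [x[0] - mean[0], x[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                cov[i][j] += d[i] * d[j] / count
    return cov


def eig_symmetric_2x2(m):
    """Eigenpairs of [[a, b], [b, c]], largest eigenvalue first."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    if abs(b) < 1e-12:                       # already diagonal
        pairs = [(a, [1.0, 0.0]), (c, [0.0, 1.0])]
        return sorted(pairs, key=lambda p: p[0], reverse=True)
    centre, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    pairs = []
    for lam in (centre + radius, centre - radius):
        norm = math.hypot(b, lam - a)
        pairs.append((lam, [b / norm, (lam - a) / norm]))
    return pairs


def kl_transform(samples):
    """Return y = A (x - m_x) for every sample, plus A, eigenvalues, mean."""
    mean = mean_vector(samples)
    pairs = eig_symmetric_2x2(covariance_matrix(samples, mean))
    A = [vec for _, vec in pairs]            # rows are eigenvectors
    eigenvalues = [lam for lam, _ in pairs]
    ys = [[sum(A[r][i] * (x[i] - mean[i]) for i in range(2)) for r in range(2)]
          for x in samples]
    return ys, A, eigenvalues, mean


samples = [[2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 7.0], [7.0, 8.0]]
ys, A, eigenvalues, mean = kl_transform(samples)
mean_y = mean_vector(ys)
cov_y = covariance_matrix(ys, mean_y)
print(mean_y, cov_y, eigenvalues)
```

The transformed vectors have zero mean, and their covariance matrix is diagonal with the eigenvalues of C_x on the diagonal, which is the decorrelation property.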
Inverse Karhunen-Loeve Transform

To reconstruct the original vectors x from the corresponding y:

$$A^{-1} = A^T$$

$$x = A^T y + m_x$$

We form a matrix $A_K$ from the K eigenvectors which correspond to
the K largest eigenvalues, yielding a transformation matrix of size
K × n.
The y vectors would then be K-dimensional.
The reconstruction of the original vector x is

$$\hat{x} = A_K^T y + m_x$$
Mean squared error of approximate
reconstruction
It can be proven that the mean square error between
the perfect reconstruction x and the approximate reconstruction
$\hat{x}$ is given by the expression

$$e_{ms} = \|x - \hat{x}\|^2 = \sum_{j=1}^{n} \lambda_j - \sum_{j=1}^{K} \lambda_j = \sum_{j=K+1}^{n} \lambda_j$$

By using $A_K$ instead of A for the KL transform we can achieve
compression of the available data.
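The error expression can be checked on a toy population of two-component vectors by keeping only K = 1 eigenvector. The data values, the 1/M covariance estimate, and the function names are assumptions made for this sketch; the error is averaged over the population.

```python
import math


def principal_axes(samples):
    """Mean, eigenvalues (descending) and eigenvector rows of the 2x2 sample covariance."""
    count = len(samples)
    mean = [sum(x[i] for x in samples) / count for i in range(2)]
    a = sum((x[0] - mean[0]) ** 2 for x in samples) / count
    c = sum((x[1] - mean[1]) ** 2 for x in samples) / count
    b = sum((x[0] - mean[0]) * (x[1] - mean[1]) for x in samples) / count
    if abs(b) < 1e-12:
        raise ValueError("sketch assumes correlated components (b != 0)")
    centre, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    eigenvalues, rows = [], []
    for lam in (centre + radius, centre - radius):
        norm = math.hypot(b, lam - a)
        eigenvalues.append(lam)
        rows.append([b / norm, (lam - a) / norm])
    return mean, eigenvalues, rows


def reconstruct_with_k(samples, k):
    """Project onto the k leading eigenvectors (A_K) and map back with A_K^T."""
    mean, eigenvalues, rows = principal_axes(samples)
    a_k = rows[:k]
    approximations = []
    for x in samples:
        y = [sum(row[i] * (x[i] - mean[i]) for i in range(2)) for row in a_k]
        approximations.append(
            [sum(a_k[r][i] * y[r] for r in range(k)) + mean[i] for i in range(2)])
    return approximations, eigenvalues


def mean_squared_error(samples, approximations):
    """Average of ||x - x_hat||^2 over the population."""
    total = sum((x[0] - xh[0]) ** 2 + (x[1] - xh[1]) ** 2
                for x, xh in zip(samples, approximations))
    return total / len(samples)


samples = [[2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 7.0], [7.0, 8.0]]
approx_k1, eigenvalues = reconstruct_with_k(samples, 1)
approx_k2, _ = reconstruct_with_k(samples, 2)
mse_k1 = mean_squared_error(samples, approx_k1)
mse_k2 = mean_squared_error(samples, approx_k2)
print(mse_k1, eigenvalues[1], mse_k2)
```

With K = 1 the error equals the discarded eigenvalue λ₂, and with K = n = 2 the reconstruction is exact.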
Drawbacks of the KL Transform

Despite its favourable theoretical properties, the KLT is not
used in practice for the following reasons.

Its basis functions depend on the covariance matrix of the
image, and hence they have to be recomputed and
transmitted for every image.

Perfect decorrelation is not possible, since images can
rarely be modelled as realisations of ergodic fields.

There are no fast computational algorithms for its
implementation.
Example: x vectors could be pixel values in several spectral bands (channels)
Example of the KLT: Original images
6 spectral images
from an airborne
Scanner.

(Images from Rafael C. Gonzalez and Richard E.
Woods, Digital Image Processing, 2nd Edition.)
KL Transform
KL Transform
Pros and Cons of K-L
Transform
Optimality
Decorrelation and MMSE for the same number of partial coefficients
Data dependent
Have to estimate the 2nd-order statistics to determine the
transform
Can we get a data-independent transform with similar
performance?
DCT

Applications
(non-universal) compression
pattern recognition: e.g., eigen faces
analyze the principal (dominating) components
Thank You