Image Representation
There are many kinds of digital images, such as binary, grayscale, and color images. These
can be classified according to the number and nature of the values of a pixel. Each pixel
of an image is represented by a specific position in some 2D region. A binary image is an
image that has been quantized to two values, usually denoted 0 and 1, but often stored with
pixel values 0 and 255, representing black and white. A grayscale image is an image in
which the value of each pixel is a single sample. Images of this sort are typically composed
of shades of gray, varying from black to white depending on intensity, though in principle
the samples could be displayed as shades of any color, or even coded with various colors
for different intensities. An example of such an image is shown in Figure 3.1. The original
image (leftmost) is a grayscale image of the letter a with intensities from 0 to 255; the
center image is a zoomed-in version that reveals the individual pixels of the letter; the
rightmost image shows the normalized numerical value of each pixel. In this example the
coding used is that 1 (255) is brightest and 0 (0) is darkest.
Figure 3.1
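To make the Figure 3.1 coding concrete, the following is a minimal sketch (assuming NumPy is available; the pixel values are invented for illustration) of an 8-bit grayscale patch and its normalization to the 0-to-1 coding described above:

```python
import numpy as np

# Hypothetical 3x3 patch of an 8-bit grayscale image (values 0..255).
patch = np.array([[0, 128, 255],
                  [64, 192, 32],
                  [255, 0, 16]], dtype=np.uint8)

# Normalize to [0, 1]: 255 -> 1.0 (brightest), 0 -> 0.0 (darkest).
normalized = patch.astype(np.float64) / 255.0
print(normalized[0, 2])  # 1.0, the brightest pixel
```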
Color
A color image is a digital image that includes color information for each pixel, usually
stored in memory as a raster map (a two-dimensional array of small integer triplets) or as
three separate raster maps, one for each channel. One of the most popular color models is
the RGB model. The colors red, green, and blue were formalized by the CIE (Commission
Internationale de l'Eclairage), which in 1931 specified the spectral characteristics of red (R),
green (G), and blue (B) to be monochromatic light of wavelengths 700 nm, 546.1 nm, and
435.8 nm respectively (Morris, 2004). Almost any color can be matched using linear
combinations of red, green, and blue:
C = rR + gG + bB
Today there are many RGB standards in use, such as ISO RGB, sRGB,
ROMM RGB, and NTSC RGB (Buckley et al., 1999). These standards are
specifications for specific applications of the RGB color spaces.
Another color model is the HSV model. HSV uses three components to represent an
image: the underlying color of the sample, the hue (H); the saturation or depth of the
sample's color (S); and the intensity or brightness of the sample, the value (V).
Figure 3.2
RGB and HSV Colorspaces
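As a small illustration of the HSV decomposition described above, Python's standard colorsys module converts between the two color spaces (the color chosen here is an arbitrary example): pure red has hue 0, full saturation, and full value.

```python
import colorsys

# Pure red, with channels scaled to [0, 1] as colorsys expects.
r, g, b = 1.0, 0.0, 0.0
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h, s, v)  # 0.0 1.0 1.0
```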
Resolution
The term resolution is often used as a pixel count in digital imaging. Resolution is
sometimes identified by the width and height of the image, as well as by the total number of
pixels in the image. For example, an image that is 2048 pixels wide and 1536 pixels high
(2048 × 1536) contains 2048 × 1536 = 3,145,728 pixels (about 3.1 megapixels). The
resolution of an image expresses how much detail we can see in it, and depends on the
number of samples (N) and the number of gray levels (m). Resolution is also a
measurement of sampling density: the resolution of a bitmap image gives a relationship
between pixel dimensions and physical dimensions. The most often used measurement is
ppi, pixels per inch.
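The pixel-count and sampling-density relationships above can be worked through directly (the 300 ppi figure is an assumed example value, not from the text):

```python
width_px, height_px = 2048, 1536
ppi = 300  # assumed print density, pixels per inch

print(width_px * height_px)  # 3145728 pixels, ~3.1 megapixels
print(width_px / ppi)        # physical width in inches, ~6.83
print(height_px / ppi)       # physical height in inches, 5.12
```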
Megapixels
Megapixels refer to the total number of pixels in the captured image. An easier metric is
the image dimensions, which represent the number of horizontal and vertical samples in
the sampling grid. An image with a 4:3 aspect ratio and dimensions 2048 × 1536 pixels
contains a total of 2048 × 1536 = 3,145,728 pixels, approximately 3 million; thus it is a 3
megapixel image.
Scaling / Resampling
When we need to create an image with dimensions different from those we have, we scale
the image. Another name for scaling is resampling: resampling algorithms try to
reconstruct the original continuous image and create a new sample grid from it.
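A minimal resampling sketch is shown below using nearest-neighbour reconstruction, one simple choice among many (practical resamplers use better interpolation such as bilinear or bicubic): each pixel of the new sample grid is taken from the nearest pixel of the original grid.

```python
def resample_nearest(image, new_w, new_h):
    """Resample a 2D list of pixels to new_w x new_h, nearest neighbour."""
    old_h, old_w = len(image), len(image[0])
    return [[image[r * old_h // new_h][c * old_w // new_w]
             for c in range(new_w)]
            for r in range(new_h)]

# Upscale a 2x2 checkerboard to 4x4: each source pixel becomes a 2x2 block.
src = [[0, 255],
       [255, 0]]
print(resample_nearest(src, 4, 4))
```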
Sample depth
Sample depth is the number of bits used in the binary representation of each pixel.
The spatial continuity of the image is approximated by the spacing of the samples in the
sample grid; the range of values we can represent for each pixel is determined by the
sample format chosen.
8-bit
A common sample format is 8-bit integers. An 8-bit integer can represent only 256
discrete values (2^8 = 256), so brightness levels are quantized into these levels.
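The quantization just described can be sketched as follows: a continuous brightness in [0, 1] is mapped onto one of the 256 discrete levels an 8-bit sample can hold (the clamping convention is an assumption for illustration).

```python
def quantize_8bit(brightness):
    """Map a continuous brightness in [0, 1] to one of 256 discrete levels."""
    brightness = min(max(brightness, 0.0), 1.0)  # clamp to [0, 1]
    return round(brightness * 255)

print(quantize_8bit(0.0))  # 0, darkest level
print(quantize_8bit(1.0))  # 255, brightest level
```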
12-bit
For high dynamic range images (images with detail in both shadows and highlights), the
256 discrete values of 8-bit samples do not provide enough precision to store an accurate
image. Some digital cameras operate with more than 8-bit samples internally, and
higher-end cameras also provide RAW images that are often 12-bit (2^12 = 4096 values).
16-bit
The PNG and TIFF image formats support 16-bit samples. Many image processing and
manipulation programs perform their operations in 16 bits when working on 8-bit images
to avoid quality loss in processing. The film industry in Hollywood often uses floating
point values to represent images, to preserve both contrast and information in shadows
and highlights.
3.2.1 PC Camera
PC cameras (webcams) are widely used for video conferencing via the Internet. Images
acquired from such a device can be uploaded to a web server, making them accessible
over the World Wide Web for applications such as remote monitoring and weather
monitoring. Web cameras typically include a lens, an image sensor, and some support
electronics. The image sensor can be a CMOS or a CCD, the former being dominant for
low-cost cameras. Typically, consumer webcams offer a resolution in the VGA region at
a rate of around 25 frames per second. Various lenses are also available, the most common
being a plastic lens that can be screwed in and out to manually control the camera focus.
Support electronics are present to read the image from the sensor.
3.2.2 Projector
Projectors are classified into two technologies, DLP (Digital Light Processing)
and LCD (Liquid Crystal Display). This refers to the internal mechanism that the
projector uses to compose the image.
3.2.2.1 DLP
Digital Light Processing technology, originally developed by Texas Instruments,
uses an optical semiconductor chip, the digital micromirror device (DMD), to
recreate the source material. There are two ways in which DLP projection creates
a color image: the first employs a single-chip DLP projector, and the other a
three-chip projector. On a single-chip projector, colors are generated by placing
a color wheel between the lamp and the DMD chip. Basically, the color wheel is
divided into four sectors: red, green, and blue, plus an additional clear section
to boost brightness. The latter is usually omitted since it only reduces color
saturation. The DMD chip is synchronized with the rotating color wheel, so that
when a certain color section of the wheel is in front of the lamp, that color is
displayed on the DMD. In a three-chip DLP projector, a prism is used to split the
light from the lamp. Each primary color of light is routed to its own DMD chip,
then recombined and directed out through the lens. Three-chip DLP is referred to
in the market as DLP2.
DLP has several advantages. First, there is less 'chicken wire' or 'screen door'
effect, because pixels in DLP are much closer together. Another advantage is
higher contrast compared to LCD. DLP projectors are also more portable, since
they require fewer components. On the other hand, claims have shown that a DLP
projector's picture dims as the lamp deteriorates with time, and that it has less
color saturation. The 'rainbow effect', which is present only on single-chip DLP
projectors, appears when the viewer's gaze moves quickly from one side of the
screen to the other; manufacturers reduce it by rotating the color wheel at a much
higher speed or by using a color wheel with more color segments.
3.2.2.2 LCD
LCD projectors contain three separate LCD glass panels, one each for the red,
green, and blue components of the image signal being sent to the projector.
As the light passes through the LCD panels, individual pixels can be opened to
allow light to pass or closed to block it. This activity modulates the light
and produces the image that is projected onto the screen (Projectorpoint, n.d.).
Preprocessing Algorithms
Preprocessing algorithms and techniques are used to perform the necessary data
reduction and to make the analysis easier. This stage is basically where we eliminate
information that is unwanted for the specific application. Such techniques include
extracting the region of interest (ROI), performing basic mathematical operations,
enhancing specific features, and reducing data (Umbaugh, 2005).
• Defining Region-of-Interest
In image analysis we seldom need the whole image; we usually want to concentrate
on a specific area of the image called the region of interest (ROI). Image
geometry operations are used to extract the ROI. Examples of these operations
include crop, zoom, and rotate (Umbaugh, 2005).
• Spatial Filters
Spatial filtering is used for noise reduction and image enhancement. This is done
by applying filter functions or filter operators in the spatial domain of the image
(Umbaugh, 2005).
• RGB to Binary Conversion
Converting RGB to binary is important because, besides making the analysis easier,
it also reduces the size of the image: a binary image has only two intensity
values (0 and 1), in contrast to an RGB image, which has three channels each with
256 intensity values (0 to 255).
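The spatial filtering technique listed above can be illustrated with a 3×3 mean (averaging) filter, a simple noise-reduction filter (this example, including the test image, is a sketch and not from the source; border pixels are left unfiltered for brevity):

```python
import numpy as np

def mean_filter_3x3(image):
    """Replace each interior pixel with the mean of its 3x3 neighbourhood."""
    out = image.astype(np.float64).copy()
    h, w = image.shape
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r, c] = image[r-1:r+2, c-1:c+2].mean()
    return out

img = np.full((3, 3), 10.0)
img[1, 1] = 100.0  # a single noisy pixel
print(mean_filter_3x3(img)[1, 1])  # 20.0, i.e. (8*10 + 100) / 9
```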
Thresholding
Thresholding converts an image f into a binary image g by comparing each pixel
with a threshold θ:

g(i, j) = 0 if f(i, j) ≤ θ
          1 otherwise
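The rule above can be sketched directly (the sample values are invented for illustration):

```python
def threshold(image, theta):
    """Binarize: pixels at or below theta become 0, all others become 1."""
    return [[0 if pixel <= theta else 1 for pixel in row]
            for row in image]

f = [[ 12, 200],
     [130,  40]]
print(threshold(f, 128))  # [[0, 1], [1, 0]]
```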
The main difficulty in thresholding lies in selecting the correct value for the
threshold. There are many ways to choose it; the simplest is to use the mean or median
pixel value. This is effective provided that the object pixels are brighter than the
background, and also brighter than the average. The next approach is to build a histogram
of the frequency of occurrence of the image's pixel values and use the valley point as the
threshold. The histogram approach assumes that there is some average value for the
background and object pixels, but that the actual pixel values have some variation around
these average values. A more effective way to select the threshold is to use an iterative
method.
There are two ways to perform the iterative method. The first method
incrementally searches through the histogram for a threshold. Starting at the lower end
of the histogram, the average of the gray values less than the suggested threshold is
computed and labeled L, and likewise the average of the gray values greater than the
suggested threshold is labeled G. The average of L and G is then computed. If this
average is equal to the suggested threshold, it is taken as the threshold. Otherwise the
suggested threshold is incremented and the process repeats (Umbaugh, 2005).
The second method searches the histogram iteratively rather than incrementally.
First an initial threshold value is suggested; a suitable choice is the average of the
image's four corner pixels. The next steps are similar to the first method; the only
difference lies in the updating of the suggested threshold, which in this method is set to
the average of the values of L and G (Umbaugh, 2005).
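The second method can be sketched as follows (a minimal illustration with an invented 3×3 image; the 0.5 convergence tolerance is an assumption, not from the source):

```python
def iterative_threshold(image):
    """Iterative threshold selection: start from the corner average, then
    repeatedly set theta to the mean of L (average of values <= theta) and
    G (average of values > theta) until it stops changing."""
    pixels = [p for row in image for p in row]
    h, w = len(image), len(image[0])
    # Initial suggestion: average of the four corner pixels.
    theta = (image[0][0] + image[0][w-1] + image[h-1][0] + image[h-1][w-1]) / 4
    while True:
        low = [p for p in pixels if p <= theta]
        high = [p for p in pixels if p > theta]
        if not low or not high:       # degenerate image, keep current guess
            return theta
        new_theta = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_theta - theta) < 0.5:  # assumed convergence tolerance
            return new_theta
        theta = new_theta

img = [[ 10,  20, 200],
       [ 15, 210, 220],
       [ 12,  18, 205]]
print(iterative_threshold(img))  # a value separating the dark and bright groups
```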
Image Differencing
A common method for detecting moving objects is image differencing. Differencing
successive pairs of frames should reveal the pixels that changed, which should belong to
the moving object. However, certain considerations complicate the matter. Regions of
constant intensity and edges parallel to the direction of motion give no sign of motion
(Davies, 2005). Image differencing also suffers from noise: it is prone to errors due to
subtle changes in illumination, which can be caused by environmental changes and by the
digitization process of the camera, in which internal noise produces subtle changes
between successive frames.
The documentation of the OpenCV library suggests using the mean of a number of frames
as the reference for the differencing. The mean and standard deviation are calculated as

m(x, y) = S(x, y) / N
σ(x, y) = sqrt( Sq(x, y) / N − m(x, y)^2 )

where S(x, y) is the sum of the individual pixel intensities at point (x, y),
Sq(x, y) is the sum of the squares of the individual pixel intensities at point (x, y),
and N is the total number of frames.
A pixel is regarded as part of the moving object if it satisfies the condition
(m(x, y) − p(x, y)) > cσ(x, y)
where p(x, y) is the pixel intensity in the current frame and c is a constant that
controls the sensitivity of the differencing. When c = 3, this is known as the
3-sigma rule (Intel, 2001).
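The background model and the 3-sigma test above can be sketched as follows (assuming NumPy; the frames and the darkened pixel are invented for illustration, and as in the text, the one-sided condition flags only pixels darker than the background):

```python
import numpy as np

def background_model(frames):
    """Accumulate S and Sq over N frames, return per-pixel mean and sigma."""
    n = len(frames)
    s = np.zeros_like(frames[0], dtype=np.float64)
    sq = np.zeros_like(frames[0], dtype=np.float64)
    for f in frames:
        s += f
        sq += f.astype(np.float64) ** 2
    m = s / n                          # m(x, y) = S(x, y) / N
    sigma = np.sqrt(sq / n - m ** 2)   # sigma(x, y) = sqrt(Sq/N - m^2)
    return m, sigma

def moving_mask(m, sigma, p, c=3.0):
    """Flag pixels where (m - p) > c * sigma (c = 3: the 3-sigma rule)."""
    return (m - p) > c * sigma

# Three reference frames of a nearly constant background around 100.
frames = [np.full((2, 2), v, dtype=np.float64) for v in (100, 102, 98)]
m, sigma = background_model(frames)

p = np.array([[100.0, 10.0],
              [ 99.0, 100.0]])  # one pixel much darker than the background
print(moving_mask(m, sigma, p))  # flags only the darkened pixel
```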
3.6 OpenCV
References
Petrou, M., and Bosdogianni, P. (1999). Image Processing: The Fundamentals. John Wiley
& Sons: New York.
Kolas, O. (2005) Image Processing with gluas: introduction to pixel molding. Available:
http://pippin.gimp.org/image_processing/chap_dir.html
Buckley, R., et. al. (1999). Standard RGB color spaces. In the IS&T/SID Seventh Color
Imaging Conference: Color Science, Systems and Applications. Scottsdale,
Arizona
DLP and LCD Projector Technology Explained. (n.d.). Retrieved June 2, 2006, from
http://www.projectorpoint.co.uk/projectorLCDvsDLP.htm.
Webcam. (n.d.). Wikipedia. Retrieved June 03, 2006, from Answers.com Web site:
http://www.answers.com/topic/web-cam.
Davies, E. (2005). Machine Vision: Theory, Algorithms, Practicalities. Elsevier: CA.
Intel (2001). Open source computer vision library reference manual. Available:
http://developer.intel.com
Umbaugh, S. (2005). Computer Imaging: Digital Image Analysis and Processing. CRC
Press: Boca Raton, Florida.
Shapiro, L. and Stockman, G. (2001). Computer Vision. Prentice Hall. Upper Saddle
River, New Jersey.
Sites: http://www.microscope-microscope.org/imaging/image-resolution.htm