An Architecture for Enhancing Image Processing via Parallel Genetic Algorithms & Data
Compression
[6]. DCT is the easiest method to implement efficiently on-chip. Consequently the DCT compression method was chosen as an effective lossy compression technique for reducing the image size. An additional benefit of using a lossy algorithm is the ability to compress the image to a fixed compressed image size. This permits a variety of image sizes to be processed on a chip with limited memory per chromosome. Consequently this system is far more flexible than the original method. The technique and its implications are described in this paper along with simulated results for a number of images. Conclusions and future developments are subsequently discussed.

Parallel Genetic Algorithm for Vision

A brief background to Parallel Genetic Algorithms (PGAs) can be found in [7]. For this application each chromosome specifies a two-dimensional transform which maps a reference image to an external image. The images are 64x64 byte greyscale images. The transform contains information about scale, rotation and position. A measure of the accuracy of the transform is found by summing the absolute error between the transformed reference image and the external image. The known largest possible error would be 2^20. In order to make the largest number the best match, the 'fitness' of the chromosome is measured as 2^20 minus the absolute error. Once the transform which produces the best match (highest value) has been identified, the position, orientation and scale of the target can be determined.

The transform used will be:

I = RT (1)

where I is the real image, T is the transform and R is the reference image.

The transform is a 3x2 matrix. Here S refers to a scaling factor, φ to rotation, and x0 and y0 are position offsets. Each element of the matrix is stored in the chromosome as a six bit value, with the exception of the offsets, which are only stored to three bit accuracy. However, if the original image were used the memory requirements would be large; consequently a DCT compressed version of the external image and the reference image are used.

DCT Compression

The DCT takes the image in sets of 8x8 blocks. Each 8x8 block is separately transformed (figure 1). The transformed block is not necessarily stored to the same resolution as the original. This allows some compression of the image. Consequently the number produced after transformation is usually divided by some factor and then quantised before storage. The transformed block is related to the Discrete Fourier transform and can be regarded as the relative magnitudes of the two dimensional spatial frequencies which make up the picture. Images concentrate most of the information in the lower spatial frequencies. Consequently an image can be compressed by not storing the higher frequencies as accurately as the lower frequencies. For this work the higher frequencies (top half of the spectrum) are set to zero and the lower frequencies are only stored to 4 bit accuracy. The consequence of this is to reduce the storage capacity required per chromosome by a factor of four. The equations for transforming the image are,

\[ f(x,y) = \frac{1}{4}\left[\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}\right] \qquad (3) \]

where C(0) = 1/\sqrt{2}, else C(·) = 1.

Difficulties arise using this technique in dealing appropriately with the transform domain. The position of the 8x8 blocks conforms to the normal equation (1); however, the frequency distribution changes when scaling and rotation occur. Offsets have no effect on the frequency distributions within a block because only offsets corresponding to complete block movements have been permitted in this work. Scaling can be properly incorporated, as frequency changes inversely with the scale in each dimension. Further work will be required to properly compensate for rotational movement.

Hardware Implementation

DCT has been widely recognised as the most effective technique for image and video compression and its single chip
implementation has already been reported [8,9]. For the hardware implementation proposed in this paper the DCT is only applied once, prior to commencing the genetic evolution for image registration. Hence, it was decided to perform the DCT off-chip since it is not a speed critical task in the case of this application. Incorporating the DCT algorithm on-chip would further increase both the complexity and the computational intensity of the proposed architecture.

The proposed hardware can be divided into four main blocks as shown in figure 2. The following is a brief description of each block:

Block A: performs the following functions: Selection of the best of the four neighbours; Crossover and mutation; Deposition of the best individual, from the contents of registers (REG0, REG1, REG2), into REG2.

The genetic evolution commences when an external chromosome (CHROMOi) is deposited in REG0, which is subsequently duplicated in REG1 and REG2. The Register Control Logic (RCL) is the block responsible for handling the transfer of data among the registers above.

An appropriate control signal on multiplexer MUX1 will select either the four neighbouring chromosomes (Cs ... Ce) or the chromosomes in the registers (REG0, REG1, and REG2). The same control signal will be used to select the corresponding fitness values via multiplexer MUX14. The comparison logic will enable MUX2 to select the best chromosome and place it in REG2. RCL will ensure that the appropriate fitness value is passed to FREG2. The signal MIX is applied externally to the processor to indicate whether a crossover or mutation should take place, in which case a 16-bit random number (RNo) is used to indicate the appropriate position(s). In the case of crossover RNo is split into two individual 8-bit numbers by another logic block (MXCL) in order to provide the positions required for a two-point crossover. In the case of mutation only a single 8-bit number is extracted from RNo.

Block B: is responsible for fitness evaluation of chromosomes. This is performed either after possible crossover/mutation or during the optimisation process. The appropriate control signal on MUX8 will select a chromosome from one of the above two sources.

As can be seen from figure 2 the circuit consists of two similar sections. The top section is for the calculation of x' while the bottom section calculates y' (x' and y' are the x-y co-ordinate addresses for the transformed pixel of the image). The calculation of x' commences by depositing the value of a13 in register Rxi. This value is used until the end of the x-axis is reached (monitored by the counter C1), at which point RxO is loaded into Rxi. This operation continues until the complete image is processed. A similar procedure is followed for y' in the lower section. Both x' and y' are evaluated in parallel.

Block C: processes the twelve bit address corresponding to x' and y' so that it can be mapped to the compact frame buffer FB. This process involves separating the most significant three bits of the x' and y' addresses, i.e. the bits responsible for identifying a transform block, into a separate six bit bus. This allows a simple logic function, DL3, to map the six bit addresses of the individual pixels (within a transform block) into a five bit address. The resulting eleven bit address will be used to identify the individual pixel in the frame buffer for fitness comparison.

Block D: After the calculation of x' and y' the corresponding pixel is extracted from a 2Kx4 Frame Buffer (FB) and is compared with the corresponding pixel of the reference image. If a match is detected then C3 is incremented, during which the count is stored in register Rft. When the whole image is processed, one of the following is performed:

1. If the fitness is being calculated for a chromosome transferred from REG2 (after possible crossover), then the final fitness, calculated here, will be moved to FREG2 by enabling the appropriate control signals on DX5, MX12, and DX0.
2. If, however, the fitness is being evaluated for a chromosome during the optimisation phase, then it is compared with the previous fitness value in the register Rftmp (at the start of the evaluation phase Rftmp = 0), which stores the best fitness seen so far during optimisation, and the corresponding chromosome is stored in RCtmp. If the fitness value is better than the one in Rftmp, then it is copied to Rftmp and its associated chromosome is transferred to RCtmp, otherwise the chromosome in RC is incremented or
decremented depending on the state of the counter C4 (see below).

The optimisation phase commences by moving the chromosome in REG2 to RC. MX13 selects each parameter sequentially, incrementing and then decrementing its value (decided by the code produced by the counter C4). The chromosome produced is then presented to the evaluation section by enabling MX8, which should be enabled with the appropriate code signalling the use of the evaluation section in the optimisation phase.

The design was evaluated using a 1 µm ES2 CMOS process, in which an individual chromosome could be processed in approximately 2 milliseconds.

Results

The results of using this technique are given in figure 3. Each image has six separate PGA runs, with the average result for each generation plotted. In addition, for picture A the best result for each generation is plotted. Clearly the technique has managed to find the optimum solution in some cases (2^20 = 1048576). Investigation of the results which do not reach the optimum result shows that a local minimum has been reached where one of the scaling factors has collapsed to a suboptimal value. Limitations on the permissible change in scale would substantially assist with this problem.

In addition, the coefficients found under transformation do have some limitations in this implementation. In particular the offsets are only coarsely calculated, to the nearest 8x8 block. This limitation is imposed because a simple offset in the compressed domain is not equivalent to an offset in the original domain unless it is by an integer number of blocks. A more advanced version of this algorithm would be able to adjust for this effect, thus permitting more accurate comparisons.

[...] offsets and rotations, and rules should be incorporated to guide the PGA into legitimate regions. These must all be provided for in hardware and consequently will require further research into algorithms which are both effective and simple to implement in hardware using commonplace technology.

References

1. Fitzpatrick J.M, Grefenstette J.J and Van Gucht D (1984) 'Image Registration by Genetic Search' IEEE SouthEastcon pp 460-64
2. Mandava V.R, Fitzpatrick J.M and Pickens D.R (1989) 'Adaptive Search Space Scaling in Digital Image Registration' IEEE Transactions on Medical Imaging MI-8 No 3 pp 251-62
3. McAulay A.D and Oh J.C (1989) 'Image Learning Classifier System Using Genetic Algorithms' IEEE pp 705-10
4. Turton B.C.H, Arslan T, Horrocks D.H (1994) 'A Hardware Architecture for a Parallel Genetic Algorithm for Image Registration' IEE Colloquium on "Genetic Algorithms in Image Processing and Vision" Digest No: 1994/193 pp 11/1-6
5. Pettey C.C and Leuze M.R (1989) 'A Theoretical Investigation of a Parallel Genetic Algorithm' in Proceedings of the Third International Conference on Genetic Algorithms, Schaffer J.D (Ed), Morgan Kaufmann Publishers pp 398-405
6. Wallace G.K (1992) 'The JPEG still picture compression standard' IEEE Transactions on Consumer Electronics 38 No 1 pp 18-34
7. Turton B.C.H, Arslan T (1995) 'A Parallel Genetic VLSI Architecture for Combinatorial Real-Time Applications - Disc Scheduling' First IEE/IEEE International Conference on Genetic Algorithms in Engineering Systems: Innovations & Applications Conference.
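The optimisation phase described under Hardware Implementation (each parameter is incremented and then decremented, with the best chromosome and fitness held in RCtmp and Rftmp) can be sketched in software as follows. This is only an illustration under stated assumptions: the function name `optimise`, the single pass over the parameters, the choice to start each trial from the best chromosome found so far, and the per-parameter limits are not taken from the paper.

```python
from typing import Callable, List, Tuple


def optimise(chromosome: List[int],
             fitness: Callable[[List[int]], int],
             limits: List[Tuple[int, int]]) -> Tuple[List[int], int]:
    """One pass of parameter-wise increment/decrement search (assumed scheme)."""
    best = list(chromosome)        # plays the role of RCtmp
    best_fitness = fitness(best)   # plays the role of Rftmp
    for i, (low, high) in enumerate(limits):   # parameters taken sequentially
        for step in (+1, -1):                  # increment first, then decrement
            trial = list(best)
            trial[i] += step
            if not low <= trial[i] <= high:
                continue                       # stay inside the stored bit range
            trial_fitness = fitness(trial)
            if trial_fitness > best_fitness:   # keep only strict improvements
                best, best_fitness = trial, trial_fitness
    return best, best_fitness
```

For a six bit matrix element the limits would be (0, 63), and for a three bit offset (0, 7).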
[Figure: hardware block diagram; legible labels include registers REG0, REG1, FREG0 and FREG1]
[Figure: Parallel GA Results; legend entry: Picture A, Average]
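The fitness measure defined in the Parallel Genetic Algorithm for Vision section, 2^20 minus the summed absolute error, can be sketched as follows. This is a minimal illustration that assumes both images are given as 64x64 lists of 8-bit grey levels; the names `fitness` and `MAX_ERROR` are chosen here and do not appear in the paper.

```python
from typing import Sequence

MAX_ERROR = 2 ** 20  # 64 * 64 pixels * 256 grey levels, the bound quoted in the text


def fitness(transformed_reference: Sequence[Sequence[int]],
            external: Sequence[Sequence[int]]) -> int:
    """Return 2^20 minus the summed absolute pixel error (higher is better)."""
    error = 0
    for row_t, row_e in zip(transformed_reference, external):
        for pixel_t, pixel_e in zip(row_t, row_e):
            error += abs(pixel_t - pixel_e)
    return MAX_ERROR - error
```

A perfect match gives a fitness of exactly 2^20, so the best transform is the one with the highest value.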