A Visual Model Weighted Cosine Transform for Image Compression and Quality Assessment

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-33, NO. 6, JUNE 1985

NORMAN B. NILL

Abstract: Utilizing a cosine transform in image compression has several recognized performance benefits, resulting in the ability to attain large compression ratios with small quality loss. Also, incorporation of a model of the human visual system into an image compression or quality assessment technique intuitively should (and has often proven to) improve performance. Clearly, then, it should prove highly beneficial to combine the image cosine transform with a visual model. In the past, combining these two has been hindered by a fundamental problem resulting from the scene alteration that is necessary for proper cosine transform utilization. A new analytical solution to this problem, taking the form of a straightforward multiplicative weighting function, is developed in this paper. This solution is readily applicable to image compression and quality assessment in conjunction with a visual model and the image cosine transform. In the development, relevant aspects of a human visual system model are discussed, and a refined version of the mean square error quality assessment measure is given which should increase this measure's utility.

Paper approved by the Editor for Signal Processing and Communication Electronics of the IEEE Communications Society for publication without oral presentation. Manuscript received August 16, 1984; revised February 11, 1985. This work was supported by the Rome Air Development Center, Griffiss AFB, NY, under Contract F19628-83-C-0001, as MITRE Mission-Oriented Investigation and Experimentation Project 7330.
The author is with the MITRE Corporation, Bedford, MA 01730.
0090-6778/85/0600-0551$01.00 © 1985 IEEE

I. INTRODUCTION

NUMEROUS image compression techniques have been developed over the years which have as their goal the reduction of the number of bits needed to transmit or store digital imagery, consistent with implementation complexity and cost constraints. The chosen technique must, of course, maintain acceptable image quality upon final image decompression, reconstruction, and display. Several review papers on this subject have been published in recent years; the ones by Netravali and Limb [1] and Jain [2] are particularly comprehensive. It is generally recognized that one of the most efficient and effective methods for obtaining large compression ratios with manageable complexity is transform compression and, in particular, cosine transform compression [3].

The cosine transform has been found to perform better overall than other transforms in many digital image compression applications [3]. The cosine transform greatly reduces the blocking problem (related to the undesirable Fourier-Gibbs phenomenon or "ringing" at edges) when small subimage pixel blocks (typically 16 × 16 pixels) are individually transformed, encoded, transmitted, decoded, and then added to form the reconstructed image. (Applying transforms to subimages rather than the entire image as a whole has advantages in terms of allowing adaptivity to differences in scene parts; see Habibi [4].) In addition, it has been shown that the cosine transform is a limiting case of the Karhunen-Loeve transform [5], [6]. The latter is an optimum transform for image compression in the mean square sense, but requires knowledge of the source correlation for implementation (information which is seldom available), whereas the cosine transform is a deterministic transform not requiring such knowledge. Finally, the all-real cosine transform can be rapidly computed by applying fast discrete transform algorithms [21].

Now, in conjunction with the proven utility of transform techniques applied to imagery compression, one could intuitively expect that if a suitable model of the human visual system (HVS) could be successfully combined with the compression process, an improvement in compression performance would result. This expectation follows from the fact that, in most compression applications, a human is the final observer of the imagery operated upon. Application of various models of the HVS has in fact been empirically found to improve (noncosine) transform compression performance [7]-[11].

Along the same lines of reasoning, incorporation of an HVS model in an image quality measure should result in a quantitative correlate of human quality assessment. Developing such a measure leads to greater insight, efficiency, and exactness in the process of developing a prospective compression scheme since it should entail rapid, precise, mathematically tractable, digital computations in place of time-consuming, labor-intensive, subjective (human) quality assessments [11]-[13].

The preceding discussion indicates that 1) a combination of the cosine transform together with an appropriate HVS model would lead to a digital image compression scheme exhibiting improved performance; and 2) the performance of a compression scheme would be accurately quantifiable if an objective image quality measure which also incorporates an HVS model is utilized. There is, however, a fundamental problem in combining an HVS model with an image cosine transform which stems from the input scene alteration necessary for proper cosine transform utilization. This problem has neither been given adequate attention in the literature nor has it been adequately solved. This paper describes this problem and its ramifications, then develops a mathematically and physically meaningful and useful solution.

Section II presents the fundamental problem associated with combining an image cosine transform with an HVS model. Section III then describes the selection of an appropriate HVS model and its incorporation in an objective image quality measure, followed by a discussion in Section IV of how to incorporate the HVS model in transform image compression. With this basic methodology of use established, Section V solves the HVS/cosine transform problem in as mathematically rigorous a fashion as is possible, while also being consistent with practical considerations. Section VI looks at the implications of this solution.

II. PROBLEM STATEMENT

In order to combine the cosine transform of imagery with an HVS model, the two must be both mathematically and physically compatible for application to either compression or quality assessment. We assume that an adequate mathematical compatibility is achieved if we can correctly combine the cosine transform with the HVS model in a linear systems theoretic sense. Given the current imperfect state of knowledge of the HVS and the fact that, in many applications,
little a priori knowledge of scenes or viewing conditions is available, it is appropriate to model the HVS as a stationary linear system. This first-order approximation to the complex process of vision has been successfully applied in the past to image processing problems [8]-[10].

We thereby arrive at the basic model of interest: a previously processed scene is input to a linear system (the HVS), is modified by that system's impulse response function, and is displayed or recorded at the output (i.e., at the brain). The basic problem in using the cosine transform in this model (in the scene processing stage) revolves around the fact that the processing of the scene must include a basic scene alteration in order to correctly apply the cosine transform to it. This necessary alteration takes the form of forcing a symmetry onto a normally asymmetrical original scene, even though it is a likeness of the asymmetrical scene that will be viewed by the human observer, not the altered (symmetrical) scene. It is this necessary forced symmetry which makes it difficult to rigorously combine the desired linear systems theory with the physical cases of interest. This important property of the cosine transform, when used in image processing, has not been adequately brought out in the literature and is therefore explored in more depth in the following.

It is well known that (see, for example, Bracewell [14]): 1) if a function g(x) is even, i.e., g(x) = f(-x) + f(x) where g(-x) = g(x), and 2) if g(x) is all real, then the Fourier transform of g(x) reduces exactly to the cosine transform of g(x), which in turn equals the cosine transform of f(x); i.e.,

$$\int_{-\infty}^{\infty} g(x)\,e^{-2\pi jux}\,dx = \int_{-\infty}^{\infty} g(x)\cos(2\pi ux)\,dx = 2\int_{0}^{\infty} f(x)\cos(2\pi ux)\,dx. \tag{1}$$

Although this relation has been known for many years in a purely mathematical sense, its practical application and utility in signal data compression has been identified only fairly recently by Ahmed, Natarajan, and Rao [15].

Now let f(x) represent the intensity distribution of a one-dimensional (1-D), asymmetrical scene¹ as a function of distance x. Forcing a scene to be symmetrical, as represented by the formation of g(x), then allows application of the cosine transform in place of the Fourier transform with no loss of information. That is, the scene can be exactly reconstructed from just the cosine transform.

¹ In most of this paper, a 1-D continuous analysis is performed for ease of understanding of the concepts and derivations involved. The conversion to 2-D is performed where necessary.

Now, from stationary linear systems theory it is known that the convolution integral and Fourier convolution theorem hold, as given in (2) and (3), respectively:

$$\int_{-\infty}^{\infty} f(t)\,h(x-t)\,dt = \text{output} \tag{2}$$

where
f(x) = input scene
h(x) = HVS impulse response function

$$\int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} f(t)\,h(x-t)\,dt\right]e^{-2\pi jux}\,dx = F(u)H(u) \tag{3}$$

where F(u), H(u) are the Fourier transforms of f(x), h(x), respectively.

Equations (2) and (3) are highly useful results in the context of application to imagery transform compression and quality assessment, and we would like to utilize these relations with the cosine transform. As was stated, however, in applying the cosine transform, the original scene f(x), which is asymmetrical (noneven), must first be made symmetrical (even). This is easily accomplished by forming g(x) from f(x) as previously discussed. Since g(x) is even and real (real because light intensity is all real), the Fourier transform of g(x) equals the cosine transform of g(x) as in (1). Furthermore, since h(x) is real due to physical constraints on the visual process and even by assumption, the convolution theorem in (3) therefore holds for the cosine transform as well as for the Fourier transform. That is, because g(x) and h(x) are real, even functions,

$$\int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} g(t)\,h(x-t)\,dt\right]\cos(2\pi ux)\,dx = F_c(u)H_c(u) = F_c(u)H(u) \tag{4}$$

where F_c(u), H_c(u) are the cosine transforms of f(x), h(x), respectively.

Using a different line of reasoning, Clarke [34] has demonstrated that the cosine transform of an input signal, corresponding to a given Fourier spatial frequency and arbitrary phase, is only significant in a small neighborhood of that Fourier spatial frequency. Clarke concluded that it is feasible to utilize a multiplicative factor [in this case H_c(u)] in conjunction with the cosine transform of a signal [in this case F_c(u)]. Even though (4) is true in a mathematical sense, however, it is not sufficient by itself in a physically meaningful sense for application to image processing. In other words, a contradiction results from the fact that in order to utilize the cosine transform, the scene must be altered, but this very alteration causes the loss of a necessary physical significance, since the human observer is not viewing this altered scene. This contradiction is the fundamental problem that must be overcome in order to properly combine an image cosine transform with an HVS model. It will be seen that (4) can be used as a starting point in the problem solution.

Before discussing solution approaches, however, we first look at just how the cosine transform, in combination with the HVS model, could be used in image quality assessment and image compression.

III. HVS MODEL AND IMAGE QUALITY ASSESSMENT

The image quality measure, actually a measure of quality degradation, that has most often been used in digital image compression research is the mean square error (MSE) between the original, unprocessed image and the processed image. However, it has often been empirically determined that the MSE and its variants do not correlate well with subjective (human) quality assessments [9], [10], [13]. The reasons are not well understood, but one suspects that the MSE does not adequately track the types of degradations caused by digital image compression processing techniques, and that it does not adequately "mimic" what the human visual system does in assessing image quality.

Some researchers [16], [17] have attempted to improve upon quality assessment by incorporating elaborate models of the visual process. Such models have been devised in an attempt to simulate the effects of many of the parameters affecting vision, such as orientation, field angle, and Mach bands, but their utility for practical problems is small due to their complexity, inherent unknowns, and need for sometimes detailed a priori knowledge of viewing condition parameter values. Incorporation of an elaborate visual system model into an image quality measure is not practical at present.
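The suspected weakness of the plain MSE noted in this section can be illustrated with a minimal sketch. It assumes NumPy and a synthetic 64 × 64 flat gray scene; the image, the error amplitude, and the two error frequencies are illustrative choices, not values from this paper. Two error patterns of equal energy, one at a low and one at a high spatial frequency, give the same MSE even though an observer would not be expected to weight them equally.

```python
import numpy as np

def mse(original, processed):
    """Plain mean square error between two images of equal shape."""
    diff = np.asarray(original, dtype=float) - np.asarray(processed, dtype=float)
    return float(np.mean(diff ** 2))

n = 64
x = np.arange(n)
original = np.full((n, n), 128.0)   # hypothetical flat gray scene
amplitude = 4.0                     # illustrative error amplitude

def sinusoidal_error(cycles):
    """Horizontal sinusoidal error with an integer number of cycles per image width."""
    row = amplitude * np.sin(2.0 * np.pi * cycles * x / n)
    return np.tile(row, (n, 1))

low = original + sinusoidal_error(2)     # low spatial frequency error
high = original + sinusoidal_error(20)   # high spatial frequency error

mse_low = mse(original, low)
mse_high = mse(original, high)
# Both values equal amplitude**2 / 2 up to rounding, regardless of frequency.
print(mse_low, mse_high)
```

Because the MSE depends only on the error energy, any frequency-dependent weighting has to be introduced separately, which is the role of the visual response function discussed next.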
However, it has been found that several simplifying assumptions for the visual model can still lead to a quality measure that performs better than, for instance, the MSE, which does not incorporate a visual model [10], [13]. If one assumes that the visual system is linear, at least for low contrast images, and is isotropic, and that the scenes viewed are monochrome and static, with observer-preferred luminance levels, and are viewed for an observer-preferred length of time, then these assumptions lead to a single, straightforward function representing the visual system, which is amenable to incorporation in a quality measure.² These assumptions are valid for certain classes of image observation, notably reconnaissance images being viewed for interpretation purposes.

² An initial nonlinearity is sometimes introduced into an HVS model by preprocessing the image with a logarithmic or power function [12]. Here we are particularly interested in low contrast images, however, whereby we can assume to be working in a linear region of a (possible) overall nonlinearity [32]. In any case, it is not at all clear what nonlinear function would be most appropriate to use [33], and none is used in this study.

More specifically, many researchers have measured the human threshold contrast sensitivity to periodic patterns (sine waves, square waves, etc.) viewed at a range of spatial frequencies; good reviews of this work can be found in Levi [18] and Kelly [19]. By taking the reciprocal of such a contrast sensitivity curve, one arrives at a curve akin to the spatial frequency response function of the visual system. Mathematically, applying linear systems concepts, this is equivalent to the Fourier transform of the response of the visual system to an impulse light stimulus. Fig. 2 shows the shape of this visual spatial frequency response function H(r).

Even after incorporating a visual response function in a quality measure, however, a further refinement is in order to more closely mimic how a human assesses quality. In many classes of image observation, a good assumption is that the observer will base his/her judgment of overall scene quality on the higher structural (activity) regions contained in the scene. Thus, an improvement to an overall scene quality measure should become apparent by incorporating a weighting factor that puts more emphasis on high structure subimage areas and less emphasis on low structure subimage areas. A concept similar to this postulate has in fact been found to improve quality assessment performance [13].

Bringing together the preceding concepts of a visual response function and subimage structure weighting, and working within the framework of the MSE difference between the original and processed images as well as within the framework of stationary linear systems theory, a quality measure results that, it is felt, will more accurately track human assessments of quality. This quality measure can be stated in the 2-D discrete Fourier (for now) spatial frequency domain as given in (5):

$$\ldots\ \sum_{i=1}\ \sum_{u=0}\ \sum_{v=0}\ \ldots \tag{5}$$

where
B = number of subimage blocks in scene
K = normalization factor such as total energy
H(r) = rotationally symmetric spatial frequency response of HVS, r = sqrt(u² + v²)
F_i, F̂_i = Fourier transform of unprocessed and processed subimage i, respectively
M, N = number of Fourier coefficients + 1, in orthogonal u, v directions
W_i = subimage i structure weighting factor, proportional to subimage's intensity level variance (see Chen and Smith [20])
W_i = 1.0 for maximum structure subimage
W_i → 0.0 for minimum structure subimage.

IV. HVS MODEL IN IMAGE COMPRESSION

After the cosine transform of a subimage is obtained, a variety of encoding schemes can be applied to produce compression of the cosine coefficients. One promising technique is to reorder the 2-D array of subimage cosine coefficients into a more tractable 1-D array, as suggested by Tescher [22]. This conversion to 1-D readily invites application of various simple yet effective schemes to produce compression. Note that some refinement to the 1-D "zig-zag" coefficient reordering developed by Tescher can be accomplished if it is replaced with 1-D radial coefficient reordering (see (13) and surrounding discussion). In radial reordering, constant radial frequencies are grouped together to achieve, on the average, a monotonically decreasing 1-D curve.

As an example of use, a 1-D least squares polynomial could be fit to the reordered 1-D cosine coefficients, whereupon the polynomial coefficients, rather than the cosine coefficients, would be encoded. If a polynomial order sufficiently less than the number of cosine coefficients can adequately reconstruct those cosine coefficients (upon decompression), then good compression has been achieved. In this approach, the HVS model is incorporated by weighting the residual squared error between a given cosine coefficient and the prospective polynomial fit value, according to the relative weight the eye-brain system would give to that particular spatial frequency, as exemplified by the HVS model in the spatial frequency domain (see Fig. 2). In this way, the accuracy of the polynomial fit at a given spatial frequency is directly proportional to the relative emphasis placed on that spatial frequency by the HVS. This approach also avoids the necessity of taking out the HVS model in the decoding/decompression steps.

Other, nonpolynomial schemes for compressing/encoding the 1-D reordered cosine coefficients are possible³ and would incorporate the HVS model in essentially the same manner as described in the preceding example. This application, of course, presupposes a solution to the HVS/cosine problem, which is given in Section V.

³ Research in image compression/encoding approaches based on 2-D to 1-D cosine transform coefficient reordering is currently active at The MITRE Corporation; see Sullivan et al. [23].

V. HVS/COSINE PROBLEM SOLUTION

The problem is to correctly combine a linear systems model of the HVS with the cosine transform of imagery, for the image quality assessment and image compression applications described in Sections III and IV. The problem can be avoided by treating the HVS model and image cosine transform separately, as suggested by Hall [9]. This procedure, however, although correct, is computationally very intensive. As shown in Fig. 1(a), it requires four separate 2-D complex Fourier transforms of the image in addition to the two 2-D cosine transforms that actually form the compression/decompression scheme.

Griswold [24] developed a solution to the problem in the power spectrum domain whereby a circular symmetric bit map was sought and obtained. The development follows a set of predetermined rules regarding the form of the solution function, borrowed from properties of Fourier power spectra and autocorrelation functions. However, in this author's opinion, these particular solution function construction rules do not have a bearing on the form of a solution in the cosine transform domain. In essence, a solution function has been force-fitted to conform to certain preconditions that cannot be adequately justified for their applicability to the problem at hand.
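The HVS-weighted polynomial fit described in Section IV can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme used in the MITRE work: the coefficient data are synthetic, the mapping of coefficient index to cycles/degree is hypothetical, the polynomial order is arbitrary, and H(r) of (19) stands in for the weight (in cosine transform applications the paper substitutes |A(r)|H(r)). NumPy's `polyfit` minimizes the sum of squared products of weight and residual, so the square root of the HVS weight is passed in order to weight the squared residuals by the HVS response itself.

```python
import numpy as np

def hvs_response(r):
    """H(r) of (19); r is radial spatial frequency in cycles/degree."""
    r = np.asarray(r, dtype=float)
    return (0.2 + 0.45 * r) * np.exp(-0.18 * r)

def weighted_sse(poly, freqs, coeffs, weights):
    """HVS-weighted sum of squared residuals of a polynomial fit."""
    residual = np.polyval(poly, freqs) - coeffs
    return float(np.sum(weights * residual ** 2))

# Hypothetical radially reordered 1-D cosine coefficients (synthetic data).
rng = np.random.default_rng(0)
freqs = np.linspace(0.5, 32.0, 64)   # cycles/degree, assumed index mapping
coeffs = 100.0 / (1.0 + freqs) + rng.normal(0.0, 1.0, freqs.size)

weights = hvs_response(freqs)
order = 4                            # illustrative polynomial order

# np.polyfit minimizes sum((w * residual)**2), so pass sqrt(weights).
poly_hvs = np.polyfit(freqs, coeffs, order, w=np.sqrt(weights))
poly_plain = np.polyfit(freqs, coeffs, order)

sse_hvs = weighted_sse(poly_hvs, freqs, coeffs, weights)
sse_plain = weighted_sse(poly_plain, freqs, coeffs, weights)
print(sse_hvs, sse_plain)
```

Only the `order + 1` polynomial coefficients would be encoded; the decoder evaluates the polynomial directly, which is why no HVS weighting has to be removed at decompression.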
[Figure 1: block diagrams. Block labels in panel (a): digital image, Fourier transform, multiply, cosine transform, encode, channel, decode, display image. Block labels in panel (b): convolve, cosine transform, weighted MSE. H, h: HVS model in the frequency domain and spatial domain, respectively.]
Fig. 1. Mathematically correct but computationally intensive and, therefore, generally impractical methods for combining the human visual system (HVS) model with the image cosine transform for (a) image transform compression and (b) image quality assessment.

As for the quality measure given in (5), one could apply the equivalent equation in the spatial (image) domain, thus avoiding the HVS/cosine problem, as shown in Fig. 1(b). This approach, however, necessitates the application of two 2-D convolutions, a generally computationally intensive procedure. In addition, it would not directly aid in identifying the visual spatial frequency range of importance for compression optimization in the cosine transform domain.

In short, a generally viable solution has not been reported in the literature, to this author's knowledge. What is sought is a fairly simply utilized function that can be directly applied to the HVS model in the spatial frequency domain and to the image cosine transform, such that combining the two becomes both a theoretically correct procedure and a practically useful one.

Referring back to (4), it was seen that the HVS model h(x) was convolved with the altered, symmetrical scene g(x), whereas the convolution of h(x) with f(x) is really desired, since f(x) represents the unaltered scene. The latter can be achieved by bringing in a unit step function u(x) as follows:

given:
$$h(x)\otimes g(x) \tag{6}$$
where ⊗ denotes convolution,

then:
$$h(x)\otimes[u(x)g(x)] = h(x)\otimes f(x) \tag{7}$$
where
$$u(x)=\begin{cases}1, & x>0\\ 0, & x<0.\end{cases}$$

Converting the left-hand side of (7) to the Fourier transform spatial frequency domain results in

$$H(u)\left\{\left[\frac{1}{2}\delta(u)-\frac{j}{2\pi u}\right]\otimes F_c(u)\right\}. \tag{8}$$

Using the convolution property associated with the Dirac delta function, δ(u), and rearranging terms of (8) yields

$$\frac{1}{2}H(u)F_c(u) - jH(u)\left[\frac{1}{2\pi u}\otimes F_c(u)\right]. \tag{9}$$

The quantity enclosed in brackets in (9) is seen to be the Hilbert transform of F_c(u), which equals the sine transform F_s(u) of f(x). Thus, (9) contains the cosine and sine transforms of f(x), which together completely specify the Fourier transform of f(x). By computing F_s(u) from F_c(u), the full Fourier transform of f(x) would be generated, which would solve the problem of how to combine H(u) in a linear systems theoretic sense when starting with F_c(u). However, this approach would also defeat the purpose of using the cosine transform in the first place (see Section I). Instead, we will accept the basic condition that only the image cosine transform is to be computed. But is there a reasonably straightforward function which, when multiplied by the image cosine transform and HVS model, results in a good approximation to (9)? That is, we seek a function A(u) such that

$$A(u)H(u)F_c(u) = \frac{1}{2}H(u)F_c(u) - jH(u)\left[\frac{1}{2\pi u}\otimes F_c(u)\right]. \tag{10}$$

Rearranging terms in (10) and dividing out H(u),

$$A(u) = \frac{1}{2} - j\,\frac{u^{-1}\otimes F_c(u)}{2\pi F_c(u)}. \tag{11}$$

Now for tractability j = sqrt(-1) must be eliminated in (11), which can be accomplished⁴ by taking the modulus of A(u),

$$|A(u)| = \frac{1}{2}\left[1+\frac{\left[u^{-1}\otimes F_c(u)\right]^2}{\pi^2F_c^2(u)}\right]^{1/2}. \tag{12}$$

⁴ In optics, for example, the modulus alone often proves to adequately represent the full Fourier transform in linear systems spatial frequency applications; see, for example, Smith [25]. In this application it is also compatible with H(u) and F_c(u).

In order to be able to apply the same |A(u)| to every image cosine transform computed, a value for F_c(u) in (12) is needed that is representative of all scenes to be encountered. It is reasonable in this regard to consider the individual scenes as single realizations of a stochastic process. Indeed, a scene can be accurately modeled statistically as a random step process [26]-[28] which results in a negative exponential autocorrelation function and a 1-D power spectrum, P(u), taking the form

$$P(u) = \frac{s}{(2\pi u)^2+\alpha^2} + \bar{I}^2\delta(u) \tag{13}$$

where s is related to the scene intensity variance, α equals the reciprocal of the average intensity pulse width, and Ī is the scene's average intensity level. If the scenes are assumed to be wide-sense stationary, then the statistics (joint probability distribution) of the scene f(x) will be identical to the statistics of the altered scene g(x) = f(x) + f(-x). Hence, the power spectrum (as calculated from the scene statistics) representing the scene equals the power spectrum representing the altered scene, which leads to

$$P(u) = |\tilde{F}_c(u)|^2. \tag{14}$$
[Figure 2: plot of H(r), |A(u)|, and |A(r)|H(r) versus spatial frequency (u or r) in cycles per degree; horizontal axis ticks from 4 to 40.]
Fig. 2. H(r): the normalized 2-D spatial frequency response of the human visual system; |A(u)|: the 1-D correction factor which allows use of H(r) with an image cosine transform; |A(r)|H(r): the 2-D (normalized) function which replaces H(r) in image cosine (as opposed to Fourier) transform applications. The mathematical forms of these curves are given in (19), (18), and (20), respectively.
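The three curves named in the caption can be tabulated with a short sketch. H(r) is taken from (19). The closed form used for |A(u)| and the piecewise curve fit used for |A(r)|H(r) are assumptions of this sketch: the former is the expression that follows from (12), (15), and the integral evaluated in Appendix A, and the latter is the form commonly quoted for (20). The frequency grid is an arbitrary illustrative choice.

```python
import numpy as np

ALPHA = 11.636  # reciprocal degrees, the value quoted for Fig. 2

def hvs_response(r):
    """H(r) of (19); r in cycles/degree."""
    r = np.asarray(r, dtype=float)
    return (0.2 + 0.45 * r) * np.exp(-0.18 * r)

def correction_factor(u, alpha=ALPHA):
    """Assumed closed form of |A(u)|:
    sqrt(1/4 + (1/pi^2) * ln^2(t + sqrt(t^2 + 1))), with t = 2*pi*u/alpha.
    Note ln(t + sqrt(t^2 + 1)) is arcsinh(t)."""
    t = 2.0 * np.pi * np.asarray(u, dtype=float) / alpha
    return np.sqrt(0.25 + (np.arcsinh(t) / np.pi) ** 2)

def weighted_hvs_fit(r):
    """Assumed piecewise curve fit for |A(r)|H(r); valid for r > 0."""
    r = np.asarray(r, dtype=float)
    low = 0.05 * np.exp(r ** 0.554)
    high = np.exp(-9.0 * np.abs(np.log10(r) - np.log10(9.0)) ** 2.3)
    return np.where(r < 7.0, low, high)

r = np.linspace(0.1, 40.0, 4000)   # cycles/degree, illustrative grid
peak_h = float(r[np.argmax(hvs_response(r))])
peak_fit = float(r[np.argmax(weighted_hvs_fit(r))])
a_at_zero = float(correction_factor(0.0))

print("H(r) peak (cycles/degree):", peak_h)
print("|A(r)|H(r) fit peak (cycles/degree):", peak_fit)
print("|A(0)|:", a_at_zero)
```

Under these assumptions the correction factor starts at 1/2 and grows with frequency, which is what moves the peak of the weighted response above the peak of H(r).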
In (14), the tilde implies the cosine transform of the representative average scene. For convenience, the scene intensity bias level can be set equal to zero, which is equivalent to simply rescaling the scene intensity values, and results in [combining (13) and (14)]

$$|\tilde{F}_c(u)|^2 = \frac{s}{(2\pi u)^2+\alpha^2}. \tag{15}$$

Then substituting (15) into (12) and working with the positive square root of (15) yields

$$|A(u)| = \frac{1}{2}\left[1+\frac{\left\{u^{-1}\otimes\left[\dfrac{s}{(2\pi u)^2+\alpha^2}\right]^{1/2}\right\}^2}{\pi^2\,\dfrac{s}{(2\pi u)^2+\alpha^2}}\right]^{1/2}. \tag{16}$$

The numerator in (16) is evaluated in closed form in Appendix A with the result

$$u^{-1}\otimes\left[\frac{s}{(2\pi u)^2+\alpha^2}\right]^{1/2} = \frac{2\sqrt{s}}{\sqrt{(2\pi u)^2+\alpha^2}}\,\ln\left[\frac{2\pi u}{\alpha}+\sqrt{\frac{4\pi^2u^2}{\alpha^2}+1}\right]. \tag{17}$$

Finally, substituting (17) into (16) gives the solution function in closed form

$$|A(u)| = \left\{\frac{1}{4}+\frac{1}{\pi^2}\left[\ln\left(\frac{2\pi u}{\alpha}+\sqrt{\frac{4\pi^2u^2}{\alpha^2}+1}\right)\right]^2\right\}^{1/2}. \tag{18}$$

The 1-D |A(u)| function (18) is shown in Fig. 2 for α = 11.636 degrees⁻¹, where this typical value of the parameter α is derived in Appendix B. An appropriate rotationally symmetric HVS model is also shown in Fig. 2 for

$$H(r) = (0.2 + 0.45r)\,e^{-0.18r} \tag{19}$$

where the radial frequency r is in cycles per degree of visual angle subtended. This particular curve is based on the work of Mannos and Sakrison [10] and DePalma and Lowry [29]. The former identified an H(r) with a peak at 8 cycles/degree by a well-conceived experimental trial and error process, albeit utilizing digital imagery raised to a power as the input. The latter arrived at H(r) (actually its inverse) by psychophysical spatial frequency threshold measurements on human observers, resulting in an H(r) peak around 3 cycles/degree. This author has found the H(r) curve based on DePalma and Lowry's results to perform quite well in image quality work done a few years ago (unpublished) on nondigital imagery. Considering all of these factors, it is felt that the H(r) in Fig. 2, which is a composite derived from [10] and [29], is a good working representation of the HVS.

Now, since the general image is 2-D, it is more appropriate to utilize the 2-D version of |A(u)|. It would, however, be very difficult if not impossible to obtain this 2-D version directly from (18) in closed form. Instead, this conversion can be conveniently carried out numerically (for given α) by utilizing any one of several methods, as discussed by Marchand [30]. The steps in the numerical procedure used here were: 1) compute the Abel transform of H(r) to convert H(r) to 1-D, i.e., to H(u); 2) multiply |A(u)| by H(u); 3) compute the 1-D cosine transform of the even function |A(u)|H(u); and 4) compute the 1-D Hankel transform of the result of step 3. This 2-D rotationally symmetric function (i.e., the result of step 4) is shown in Fig. 2. A curve fit to this numerically
obtained function is

$$|A(r)|H(r) = \begin{cases}0.05\,e^{r^{0.554}}, & r<7\\ e^{-9\left[\,|\log_{10}r-\log_{10}9|\,\right]^{2.3}}, & r\ge 7.\end{cases} \tag{20}$$

This function can now be treated in image cosine transform applications in the same manner as H(r) would be treated in image Fourier transform applications. For example, for quality assessment one simply substitutes |A(r)|H(r) from Fig. 2 for the H(r) in (5), when dealing with image cosine transforms instead of image Fourier transforms.

VI. DISCUSSION AND CONCLUSIONS

By utilizing the multiplicative weighting factor |A(u)| derived in this paper, together with a human visual system model H(r), it becomes correct in a physical sense as well as a linear systems sense (to within the assumptions made in Section V) to combine an image cosine transform with a human visual system model. It is believed that this combining of an HVS model with the image cosine transform will result in better performance in image compression and image quality assessment applications. In quality assessment, performance should also be enhanced by inclusion of the subimage structure weighting given in (5).

As can be seen in Fig. 2, the effect of |A(u)| is to translate H(r) into the more positive frequency direction, resulting in a higher frequency peak, relatively less emphasis on the lower (than peak) frequencies, and relatively more emphasis on the higher (than peak) frequencies. This implies that the higher spatial frequencies in the cosine transform domain play a more important role in the corresponding human-observed image quality than they do in the "equivalent" Fourier transform domain (where A(u) = 1.0 in the latter). Thus, encoding of image cosine transform coefficients for bit compression requires more attention and emphasis on maintaining the fidelity of these higher frequencies than would be needed for image Fourier transform coefficients.

The actual curve for |A(r)|H(r) is a function of the original H(r) and the several parameters noted in Appendix B affecting |A(u)|. With the realistic values utilized here, H(r) and |A(r)|H(r) have peak values at 5.2 and 9.0 cycles/degree, respectively. Other parameter values and H(r) curves could, of course, result in different peak locations, but the trend and implications for all reasonable values remain as stated above.

APPENDIX A

Evaluation of

$$u^{-1}\otimes\left[\frac{s}{(2\pi u)^2+\alpha^2}\right]^{1/2}$$

for (16): It is more convenient to evaluate this convolution integral by utilizing the Fourier convolution theorem (3), thereby considering the two components separately:

$$\mathcal{F}^{-1}\left\{u^{-1}\right\} = j\pi\,\mathrm{sgn}(x) \tag{A-1}$$

where

$$\mathrm{sgn}(x)=\begin{cases}1, & x>0\\ -1, & x<0\end{cases}$$

and

$$\mathcal{F}^{-1}\left\{\left[\frac{s}{(2\pi u)^2+\alpha^2}\right]^{1/2}\right\} = \int_{-\infty}^{\infty}\frac{\sqrt{s}\;e^{2\pi jux}}{\sqrt{(2\pi u)^2+\alpha^2}}\,du \tag{A-2}$$

$$= \frac{\sqrt{s}}{\pi}\,K_0(\alpha|x|) \tag{A-3}$$

where K_0(·) is the modified Bessel function of the second kind, of order zero.

Then combining (A-1) and (A-2) and transforming back to the spatial frequency domain,

$$u^{-1}\otimes\left[\frac{s}{(2\pi u)^2+\alpha^2}\right]^{1/2} = \int_{-\infty}^{\infty} j\sqrt{s}\;\mathrm{sgn}(x)\,K_0(\alpha|x|)\,e^{-2\pi jux}\,dx \tag{A-4}$$

$$= j\sqrt{s}\,(-2j)\int_{0}^{\infty}K_0(\alpha|x|)\sin(2\pi ux)\,dx \tag{A-5}$$

(since sgn(x) is odd and K_0(α|x|) is even)

$$= \frac{2\sqrt{s}}{\sqrt{(2\pi u)^2+\alpha^2}}\,\ln\left[\frac{2\pi u}{\alpha}+\sqrt{\frac{4\pi^2u^2}{\alpha^2}+1}\right] \tag{A-6}$$

where integral tables can be used to obtain (A-3) and (A-6) (see [31, eq. 333.78a and 541.8c]).

APPENDIX B

The scene can be modeled as a pulse process resulting in the scene power spectrum given in (13), where α is equal to the reciprocal of the average pulse width. With u in cycles/degree, it is appropriate to have α in units of reciprocal degrees, which can be arrived at by calculating the angle subtended by the average pulse width of the displayed scene from the human observer's location. From geometry this angle is equal to

$$\alpha^{-1} = 2\tan^{-1}\left(\frac{W}{2D}\right)$$

where
W = average pulse width (in linear units)
D = observer to displayed-scene distance.

For a digitized scene that has been quantized to about 8 bits of grey levels and is displayed on a CRT, W is on the order of 1.5 pixels, where a typical CRT display pixel is effectively 0.5 mm on a side; then, with a typical value for D of 0.5 m, this results in

α = 11.636 (degrees)⁻¹.
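The arithmetic behind this value can be checked with a short sketch. It assumes the subtended angle is 2·arctan(W / 2D), the usual geometric expression for an object of width W viewed head-on from distance D; the function name is illustrative. The inputs are the values stated above.

```python
import math

def alpha_reciprocal_degrees(pulse_width_mm, viewing_distance_mm):
    """Reciprocal of the angle, in degrees, subtended by the average pulse width.

    Assumes the subtended angle is 2 * atan(W / (2 * D)).
    """
    angle_rad = 2.0 * math.atan(pulse_width_mm / (2.0 * viewing_distance_mm))
    return 1.0 / math.degrees(angle_rad)

pixel_size_mm = 0.5                     # effective CRT pixel size from the text
pulse_width_mm = 1.5 * pixel_size_mm    # W = 1.5 pixels
viewing_distance_mm = 500.0             # D = 0.5 m

alpha = alpha_reciprocal_degrees(pulse_width_mm, viewing_distance_mm)
print(round(alpha, 3))                  # agrees with the quoted 11.636 to three decimals
```

At this viewing geometry the angle is so small that arctan(W / D) gives the same result to the quoted precision, so the value of α is insensitive to which form of the geometric relation is used.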
REFERENCES

[1] A. N. Netravali and J. O. Limb, "Picture coding: A review," Proc. IEEE, vol. 68, pp. 366-406, Mar. 1980.
[2] A. K. Jain, "Image data compression: A review," Proc. IEEE, vol. 69, pp. 349-389, Mar. 1981.
[3] A. G. Tescher, "Transform image coding," in Advances in Electronics and Electron Physics, suppl. 12, W. K. Pratt, Ed. New York: Academic, 1979, ch. 4, pp. 113-155.
[4] A. Habibi, "Survey of adaptive image coding techniques," IEEE Trans. Commun., vol. COM-25, pp. 1275-1284, Nov. 1977.
[5] N. Ahmed and M. D. Flickner, "A derivation for the discrete cosine transform," Proc. IEEE, vol. 70, pp. 1132-1134, Sept. 1982.
[6] K. S. Shanmugan, "Comments on discrete cosine transform," IEEE Trans. Comput., vol. C-24, p. 759, July 1975.
[7] J. O. Limb, "Visual perception applied to the encoding of pictures," in Advances in Image Transmission Techniques, Proc. Soc. Photo-Opt. Instrum. Eng., vol. 87, pp. 80-87, 1976.
[8] T. G. Stockham, "Image processing in the context of a visual model," Proc. IEEE, vol. 60, pp. 828-842, July 1972.
[9] C. F. Hall, "Digital color image compression in a perceptual space," Ph.D. dissertation, Dep. Elec. Eng., Univ. Southern California, Los Angeles, Jan. 1978.
[10] J. L. Mannos and D. J. Sakrison, "The effects of a visual fidelity criterion on the encoding of images," IEEE Trans. Inform. Theory, vol. IT-20, pp. 525-536, July 1974.
[11] D. J. Granrath, "The role of human visual models in image processing," Proc. IEEE, vol. 69, pp. 552-561, May 1981.
[12] C. F. Hall and E. L. Hall, "A nonlinear model for the spatial characteristics of the human visual system," IEEE Trans. Syst., Man, Cybern., vol. SMC-7, pp. 161-170, Mar. 1977.
[13] F. X. J. Lukas and Z. L. Budrikis, "Picture quality prediction based on a visual model," IEEE Trans. Commun., vol. COM-30, pp. 1679-1692, July 1982.
[14] R. N. Bracewell, The Fourier Transform and Its Applications. New York: McGraw-Hill, 1965.
[15] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Comput., vol. C-23, pp. 90-93, Jan. 1974.
[16] I. Overington, "Toward a complete model of photopic visual threshold performance," Opt. Eng., vol. 21, pp. 2-13, Jan.-Feb. 1982.
[17] A. Schnitzler, "Effects of spatial frequency filtering on the performance of the composite photographic-human visual system," Photogr. Sci. Eng., vol. 21, pp. 209-215, July-Aug. 1977.
[18] L. Levi, "Vision in communication," in Progress in Optics, vol. 8, E. Wolf, Ed. New York: Elsevier, 1970, sect. 7, pp. 345-374.
[19] D. H. Kelly, "Visual contrast sensitivity," Opt. Acta, vol. 24, pp. 107-130, Feb. 1977.
[20] W. Chen and C. H. Smith, "Adaptive coding of monochrome and color images," IEEE Trans. Commun., vol. COM-25, pp. 1285-1292, Nov. 1977.
[21] W. Chen, C. H. Smith, and S. Fralick, "A fast computational algorithm for the discrete cosine transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, Sept. 1977.
[22] A. G. Tescher, "A dual transform coding algorithm," in Proc. Nat. Telecommun. Conf., Nov. 1981, pp. C9.2.1-C9.2.3.
[23] D. R. Sullivan, N. B. Nill, R. D. Braun, D. H. Lehman, and B. W. Fam, "Advanced imagery lab final report," and D. R. Sullivan, B. W. Fam, R. D. Braun, and N. B. Nill, "Advanced imagery laboratory image compression research," RADC Tech. Reps., to be published, 1985.
[24] N. C. Griswold, "Perceptual coding in the cosine transform domain," Opt. Eng., vol. 19, pp. 306-311, May-June 1980.
[25] W. J. Smith, Modern Optical Engineering. New York: McGraw-Hill, 1966, ch. 11, pp. 308-324.
[26] L. E. Franks, "A model for the random video process," Bell Syst. Tech. J., vol. 45, pp. 609-630, Apr. 1966.
[27] Y. Itakura, S. Tsutsumi, and T. Takagi, "Statistical properties of the background noise for the atmospheric windows in the intermediate infrared region," Infrared Phys., vol. 14, pp. 17-29, 1974.
[28] D. Halford, "A general mechanical model for |f|^α spectral density random noise with special reference to flicker noise 1/|f|," Proc. IEEE, vol. 56, pp. 251-258, Mar. 1968.
[29] J. J. DePalma and E. M. Lowry, "Sine wave response of the visual system. II. Sine wave and square wave contrast sensitivity," J. Opt. Soc. Amer., vol. 52, pp. 328-335, Mar. 1962.
[30] E. W. Marchand, "Derivation of the point spread function from the line spread function," J. Opt. Soc. Amer., vol. 54, pp. 915-919, July 1964.
[31] W. Grobner and N. Hofreiter, Integraltafel-Zweiter Teil-Bestimmte Integrale. New York: Springer-Verlag, 1966.
[32] T. N. Cornsweet, Visual Perception. New York: Academic, 1970, p. 334.
[33] C. N. Nelson, "The theory of tone reproduction," in The Theory of the Photographic Process, C. E. K. Mees and T. H. James, Eds. New York: MacMillan, 1966, ch. 22, pp. 470-477.
[34] R. J. Clarke, "Spectral response of the discrete cosine and Walsh-Hadamard transforms," IEE Proc., vol. 130, part F, pp. 309-313, June 1983.

Norman B. Nill received the B.S. degree in photographic science from Rochester Institute of Technology, Rochester, NY, performed initial graduate studies in optics at the University of Rochester, Rochester, and received the M.S. degree in electrical engineering from New York University, New York, NY.
Over a ten-year period at Perkin-Elmer, Danbury, CT, starting in 1968, he was involved in analytical and experimental work in image science, including image evaluation, coherent optical image processing, and work on the NASA Space Telescope. From 1978 to 1981 he worked in acoustic signal processing at Analysis and Technology, North Stonington, CT, and image pattern recognition at Synectics, Fairfax, VA. Since he joined the MITRE Corporation, Bedford, MA, in 1981, he has been involved in concepts development and assessment of advanced techniques for image processing and exploitation. He is the author of several articles in the image science area.
Mr. Nill is a member of the Society of Photographic Scientists and Engineers.
