
678 PROCEEDINGS OF THE IEEE, VOL. 63, NO. 4, APRIL 1975

Blind Deconvolution Through Digital Signal Processing

Invited Paper

Abstract-This paper addresses the problem of deconvolving two signals when both are unknown. The authors call this problem blind deconvolution. The discussion develops two related solutions which can be applied through digital signal processing in certain practical cases. The case of reverberated and resonated sound forms the center of the development. The specific problem of restoring old acoustic recordings provides an experimental test. The important effects of noise and nonstationary signals lead to the detailed part of the presentation. In addition, the paper presents results for the case of images degraded by some common forms of blur.

Manuscript received October 10, 1974; revised November 9, 1974. This work was supported in part by the Advanced Research Projects Agency of the Department of Defense under Contract DAHC15-73-C-0363.
T. G. Stockham, Jr., and R. B. Ingebretsen are with the Computer Science Department, University of Utah, Salt Lake City, Utah 84112.
T. M. Cannon was with the Computer Science Department of the University of Utah, Salt Lake City, Utah. He is now with the Los Alamos Scientific Laboratory, Los Alamos, N.Mex. 87544.

INTRODUCTION

THE DECONVOLUTION problem (i.e., the problem of separating two signals that have been convolved) appears in many contexts. Several varied discussions occur in this issue of the PROCEEDINGS alone [1]-[4]. For the purposes of this paper, it is important to distinguish two different forms of the problem. The first and simpler of these assumes that one of the two signals is known. The second assumes that both signals are unknown and, therefore, that the only data available is the convolution itself. We have come to call the task of estimating or eliminating one of the unknown signals blind deconvolution.

Of course, as in any problem involving the separation of two signals, something must be known about the distinguishing characteristics of both signals for which blind deconvolution is to be carried out. In developing a solution it is desirable to keep these characteristics as nonspecific as possible. Broader applications are then available.

The distinguishing characteristics in this work are the signal extents. We assume that one signal is of considerably smaller extent than the other. Specifically, we mean that one of the signals may be significantly nonzero over a restricted domain the size of which is probably two decimal orders of magnitude or more smaller than the domain of the convolution. This situation happens frequently in practice when a long or continuing signal is degraded by a linear shift-invariant system. Reverberated and resonated sound and certain blurred images are examples which we consider here.

It should be noted that the blind deconvolution problem as encountered in speech [1], [2] and seismic [3] signal processing is in conflict with the distinguishing characteristics just described because the durations of the signals involved tend to be comparable. However, some variance with this observation is presently the subject of research [5].

Digital signal processing has been used to explore and implement the methods we are presenting. The most important reason for using digital processing is that the complexity basic to the method is not presently within the scope of analog signal processing technology. Although this situation lowers cost and speed performance from what it might otherwise be, the method is much more practical in this respect than might be expected and is well within the reach of many classes of potential users.

An aspect of the digital processing method which often receives too little attention is the requirement for converting accurately to and from digital data, respectively, before and after processing. When employing digital deconvolution, special attention must be paid to accuracy, because it is the nature of the deconvolution process to increase greatly the impact of small errors in the data. It is essential to reduce the effects of conversion imperfections to be below those of the noise in the original data. To do this requires care and effort. However, the basic methods are well understood [6], [7] and the means for conveniently maintaining quality are available [8].

Our approach to this work has been based heavily upon the interaction of theory and experiment. As a result, the discussion centers around reverberated and resonated sound. Our original interest involved blurred images. However, the two-dimensional aspect of, and other detailed difficulties with, that application set us looking for one involving one-dimensional signals. The problem we selected was that of restoring old recordings made by the acoustic method which was used until about the mid 1920's. The problem seemed simple enough in theory and yet was practical and rich enough to provide a good test. Let us now place this problem in the context of blind deconvolution.

Contrary to the popular concept concerning old recordings, whether they be acoustic or electric, the problem of surface noise or scratch is not the most important. While this form of degradation is immediately obvious when playing any old recording, it is generally not the major difficulty that connoisseur listeners complain about, at least where collectors quality copies are concerned. For acoustic recordings the major problem seems to be the resonant or reverberant characteristic given to the musical instruments or vocal sound by the primitive recording horns which were used to focus the sound energy onto the original wax disks. While it is well known that these acoustic mechanisms were incapable of transcribing frequencies much below 200 Hz or above 4000 Hz, these frequency limitations alone do not account for the degree of the degradation
produced. One can firmly convince oneself of this by taking a modern recording and passing it through sharp cutoff filters (digitally realized if one desires) with cutoff frequencies of the kind mentioned. One can even add some artificially generated surface noise if one wishes and still the listening quality will far exceed any available on old acoustic disks. Indeed, measurements made on old recording equipment indicate that sharp resonant amplitude distortions with variations from 10 to 20 dB or more are commonly encountered in the frequency range between 100 and 1000 Hz on such recordings. This effect not only produces a megaphone quality (which the reader might simulate by cupping his hands in front of his mouth and speaking) but also produces a very unpleasant effect in the form of loud bursts of sound when certain frequencies are played or sung.

Unfortunately, the amplitude distortions produced by acoustic recording equipment were not of fixed character from recording to recording. A typical setup, depicted in Fig. 1, involved five basic mechanisms [9]. They were a horn for gathering acoustic energy, a tube for conducting that energy to a device called a sound box, a sound-box diaphragm mechanism, a lever mechanism and associated stylus for cutting the groove, and a turntable driving a wax disk into which the recording was cut. As shown in the figure, the singing which appeared at the mouth of the recording horn as an acoustic signal s(t) was delivered to the disk after having been modified by the horn and its associated mechanism.

Fig. 1. A recording setup typical of those before the mid 1920's.

Surprisingly enough, the engineers of that day were excellent craftsmen and managed to avoid what we presently call nonlinear distortion quite well. The major effect of the recording horn system is, therefore, the same as that of a linear filter. In terms of functions, the singing waveform s(t) was convolved with the impulse response of the recording mechanism h(t) to produce the recorded waveform v(t). Except for surface noise, it is this latter waveform which is available to us today by playing an old recording.

Unfortunately, the recording horn impulse response h(t) was varied from one recording to another. The primary cause of this effect was probably that the sound box mechanism played more the role of a resonant amplifier than that of an acoustic to mechanical transducer. The recording engineers knew this and were constantly making an effort to tune their sound boxes for maximum efficiency. Such tuning, coupled with the placement of recording horns, the length of connecting tubes, and the shape of the horns, provided marked variation in the characteristics of h(t) from recording to recording even for the same performer from day to day. Thus it is that in restoring these old recordings, we are faced with a blind deconvolution problem since we know neither the singing signal s(t) nor the impulse response h(t) involved in any particular recording under consideration.

Stated in mathematical terms, one is given the result of having convolved two unknown signals, s(t) and h(t), and from that combination is asked to estimate one of them. It has been our philosophy from the very beginning to approach this problem as a filtering problem in which one is required to separate two signals. The difference from conventional filtering, such as that used to separate two television channels or two radio stations, is that in this case the undesirable component, h(t), has not been added to but instead convolved with the desired component, s(t).

Following the homomorphic theory of Oppenheim [10], we started our work by assuming that we could ignore the effect of surface noise on our process [11], map the process of convolution into one of addition, and treat the transformed problem by conventional filtering techniques. While this approach led to interesting results from the outset, we have been constantly refining our attitude and approach to the problem to a point far beyond our initial understanding. As we shall see, there is a close relationship between the problem of adapting power spectrum estimation techniques to nonstationary signals and the homomorphic deconvolution filtering idea we just described. As a result, we have been forced to gain a better understanding of both subjects.

HOMOMORPHIC DECONVOLUTION

The recording setup depicted in Fig. 1 is represented in Fig. 2 in terms of a signal block diagram. In addition, we have included the effects of surface noise which becomes combined with the recorded waveform v(t) to produce the playback waveform p(t) that is available to us today when we play back an old disk on modern reproduction equipment. Although the surface noise on all records tends to become louder as the recorded signal increases in volume, this effect is minor in high-quality copies. As a result, we consider the effects of surface noise as additive. Since the recorded signal v(t) is related to the original acoustic waveform by

$$ v(t) = s(t) \circledast h(t) \tag{1} $$

and the playback waveform to the recorded signal by

$$ p(t) = v(t) + n(t) \tag{2} $$

the overall relationship is

$$ p(t) = \bigl(s(t) \circledast h(t)\bigr) + n(t). \tag{3} $$

Fig. 2. A signal block diagram for the recording setup of Fig. 1 including the effect of additive surface noise.

The object here is, given the playback waveform p(t), to reproduce the original acoustic waveform s(t) as closely as possible. In this form, the problem sounds very much like a classical Wiener filtering problem. Unfortunately, such problems usually assume that the details of the system H are known. Of course, the major difficulty of the old recordings restoration
problem is that the recording system frequency response H(f) is unknown. A second difference encountered here is that in classical filtering problems, the signals are assumed to be stationary. Acoustic waveforms of singing such as s(t) are far from stationary as can be readily appreciated by the fact that the energies in such signals possess strong variations from moment to moment.

Noting these issues and assuming for the time being that the noise n(t) is negligible, one is faced with the problem: given v(t) as in (1), find s(t). Following the homomorphic filtering concept, the first step in the process is to transform the convolution equation of (1) into a linear equation in the hope that the well-developed discipline of linear signal processing might be applied. This can be done by first taking the Fourier transform of both sides of (1), thus producing

$$ V(f) = S(f) \cdot H(f) \tag{4} $$

which is simply the familiar frequency response relationship between the input and the output spectra for a linear system. By further taking the complex logarithms of both sides of (4), we obtain

$$ \log V(f) = \log S(f) + \log H(f) \tag{5} $$

which possesses the additive property we wish.

At this juncture, one approach might be to take several recordings made with the same recording equipment, put each one in the form of (5), and average both sides of these equations across all of the recordings. If there were enough recordings and the singing on each were sufficiently different, one might expect, according to the central limit theorem, that the right-hand side of (5) would converge to log H(f).

There are difficulties with this idea. The most important is that there is not a multiplicity of recordings available. There is only one, because h(t) was varied from one recording to another. This problem is overcome by chopping that recording into many intervals of moderate length (e.g., one-half-second intervals). Each of these intervals, which we will call $v_i(t)$, represents a slightly different musical passage perturbed by the same recording mechanism. A small difficulty arises, however. Each of these intervals is not given precisely as a convolution of the corresponding interval of the original acoustic wave and the impulse response of the recording horn. The problem details have to do with edge effects at the ends of the chopped intervals. Nevertheless, if the intervals are long compared to the temporal extent of the impulse response h(t), the approximate relationship

$$ v_i(t) \approx s_i(t) \circledast h(t) \tag{6} $$

results.

Furthermore, by multiplying the chopped intervals by a smooth window function, we obtain a slightly modified set of $v_i$'s which are related closely to the corresponding modified set of $s_i$'s by

$$ v_i(t) = w_i(t) \cdot v(t) = w_i(t) \cdot [s(t) \circledast h(t)] \approx [w_i(t) \cdot s(t)] \circledast h(t) = s_i(t) \circledast h(t). \tag{7} $$

If the windows $w_i(t)$ are smooth and long compared to the temporal extent of the impulse response h(t), the approximate equality of (6) holds very closely. Taking the Fourier transform of both sides of (6), we obtain

$$ V_i(f) \approx S_i(f) \cdot H(f). \tag{8} $$

Again, taking the complex logarithm, we obtain

$$ \log V_i(f) \approx \log S_i(f) + \log H(f). \tag{9} $$

If windows whose length is on the order of a half-second are used and a 50-percent overlap from window to window is employed (e.g., as is commonly done with a simple hanning window), and since a typical old recording might be between 3 and 5 min in duration, then from 300 to 600 intervals for which (9) holds would be available. Averaging over all N of these intervals yields the relationship

$$ \frac{1}{N} \sum_{i=1}^{N} \log V_i(f) \approx \frac{1}{N} \sum_{i=1}^{N} \log S_i(f) + \log H(f). \tag{10} $$

We would at this point hope that the first term in the right-hand side of (10) would converge to zero for large N.¹ Unfortunately, this does not happen. This fact is made clearer by writing (10) as two separate equations, the first equating the real parts and the second equating the imaginary parts of the complex quantities involved. This is done in

$$ \frac{1}{N} \sum_{i=1}^{N} \log |V_i(f)| \approx \frac{1}{N} \sum_{i=1}^{N} \log |S_i(f)| + \log |H(f)| \tag{11a} $$

and

$$ \frac{1}{N} \sum_{i=1}^{N} \arg V_i(f) \approx \frac{1}{N} \sum_{i=1}^{N} \arg S_i(f) + \arg H(f) \tag{11b} $$

which describe the approximate attenuation and phase relationships between the original singing and the recorded waveforms, respectively.

¹ The requirement for large N, and that the windows be long compared to the temporal extent of h(t), is the essential motivation for the assumption made in the introduction that one unknown signal be of considerably smaller extent than the other.
² Claims for the subjective importance of this step were first made by N. J. Miller who also selected the data and made the first estimates that were used.

Concentrating first on the log amplitudes of (11a), it would be our hope that the first term in the right-hand side would converge to zero for large N. Theoretical considerations would predict that this does not happen since, crudely speaking, this term represents in decibels the average distribution in frequency of the energies constituting musical signals. Practical experience bears out this expectation. Indeed, as we shall show shortly, if the singing waveform were a stationary process, this term would be one half of the logarithm of the power spectrum of that signal except for a small additive constant. It is common knowledge that even if one were to assume that a music waveform were stationary, its spectrum would be certainly far from white (i.e., constant or flat). In pursuing the blind deconvolution objective, it remains to remove this term so that log |H(f)| may be revealed.

This goal is reached by processing a modern recording of a musical selection similar to the one to be restored in a manner identical to that just described. The recording mechanisms used to make the modern recordings, hereafter called prototypes, possess virtually flat frequency responses. For them, the second term in the right-hand side of (11) reduces to zero. Thus, for the modern recording, the averaging process yields an isolated version of the first term in the right-hand side of (11a) which can then be removed by subtraction.² The basic assumption of course is that the prototype recording has the same statistical characteristics as does the original singing to be restored.
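The averaging of (11a) and the prototype subtraction described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 8 kHz sampling rate, the 4096-sample Hann window, the white-noise stand-ins for the singing and the prototype, and the decaying-cosine impulse response are all hypothetical choices made for the example.

```python
import numpy as np

def mean_log_magnitude(x, win_len, eps=1e-12):
    """Average of log|X_i(f)| (nepers) over 50%-overlapped Hann-windowed segments."""
    hop = win_len // 2
    window = np.hanning(win_len)
    n_seg = 1 + (len(x) - win_len) // hop
    acc = np.zeros(win_len // 2 + 1)
    for i in range(n_seg):
        seg = x[i * hop : i * hop + win_len] * window
        acc += np.log(np.abs(np.fft.rfft(seg)) + eps)  # eps guards against log(0)
    return acc / n_seg

def estimate_log_h(degraded, prototype, win_len):
    """Left-hand side of (11a) for the degraded signal minus that of the prototype."""
    return mean_log_magnitude(degraded, win_len) - mean_log_magnitude(prototype, win_len)

# Synthetic check (all values below are assumptions for illustration only).
rng = np.random.default_rng(0)
fs = 8000                                  # hypothetical sampling rate, Hz
n = fs * 30                                # 30 s of signal
source = rng.standard_normal(n)            # stand-in for the unknown singing s(t)
prototype = rng.standard_normal(n)         # independent signal with the same statistics
t = np.arange(64) / fs
h = np.exp(-400.0 * t) * np.cos(2 * np.pi * 1000.0 * t)  # short resonant h(t)
degraded = np.convolve(source, h)[:n]      # v(t) = s(t) convolved with h(t), as in (1)

win_len = 4096                             # roughly half a second at 8 kHz
log_h_est = estimate_log_h(degraded, prototype, win_len)
log_h_true = np.log(np.abs(np.fft.rfft(h, win_len)))
mean_abs_err = np.mean(np.abs(log_h_est - log_h_true))
```

The impulse response here spans 64 samples against a 4096-sample window, which respects the requirement that the windows be long compared to the extent of h(t). With only about a hundred segments the estimate is still visibly noisy from bin to bin.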
Fig. 3. The averages of (14) for the 1907 recording of "Vesti la Giubba" by Enrico Caruso. (Axes: frequency in Hz, amplitude in dB.)

The problem with (11b) is not with the first term on the right-hand side converging to zero, but with the computation of phase in such a manner as to be compatible with the equation itself. The issue, which is discussed thoroughly elsewhere [10], is that the four quadrant inverse tangent function conventionally used to compute phase yields only the principal value of the complex logarithm function. Unfortunately, the sum of principal value phases is not the principal value of a sum of phases. Thus the linearity which we require in (11b) cannot be obtained unless these phases are computed in a special way. One such method explored extensively by Schafer [12] relies on a process called phase unwrapping. However, we have so far been unable to apply phase unwrapping to this problem. As a result, we have thus far not been able to estimate the phase distortions associated with old recordings. Since the human ear is known to be relatively insensitive to phase [13], we have assumed the correction of phase distortions to be unnecessary. As we shall describe later, some experiments we have been able to perform indicate that this assumption is probably justified.
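The failure of additivity for principal-value phases can be seen in a two-number example. The sketch below is illustrative only: the angles are arbitrary, and `np.unwrap` is NumPy's simple sample-to-sample unwrapping routine, used here as a stand-in rather than as the method of [12].

```python
import numpy as np

# Two unit-magnitude complex numbers with phases 3.0 and 2.5 rad (arbitrary values).
z1, z2 = np.exp(1j * 3.0), np.exp(1j * 2.5)

# np.angle is a four-quadrant arctangent and returns the principal value in (-pi, pi].
sum_of_principal = np.angle(z1) + np.angle(z2)   # 3.0 + 2.5 = 5.5
principal_of_sum = np.angle(z1 * z2)             # 5.5 wrapped back into (-pi, pi]

# The two differ by a multiple of 2*pi, so the additivity needed in (11b) is lost.
discrepancy = sum_of_principal - principal_of_sum

# For a densely sampled, slowly varying phase curve, unwrapping restores continuity.
true_phase = np.linspace(0.0, 20.0, 200)
wrapped = np.angle(np.exp(1j * true_phase))
unwrapped = np.unwrap(wrapped)
```

Unwrapping of this kind relies on the phase changing by less than pi between adjacent samples, which is easy to arrange in this toy curve and need not hold for the spectra of real recordings.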
Returning our attention to the log amplitudes, we see that having computed the left-hand side of (11a) for both the recording to be restored and the prototype, the next step is to subtract the latter from the former leaving an estimate of log |H(f)|. The remainder of the restoration process involves the construction of the compensating linear digital filter whose frequency response is the inverse of |H(f)|. Of course, this approach is valid only under the strict assumption that the recording is noise free. In fact, such is not the case. Care is thus taken to confine the frequency response of the compensating filter to the frequency band in which the old recording has appreciable components. An attempt to recover frequencies outside this band would only serve to amplify surface noise unduly without retrieving informative signals. This effect is formally known as ill-conditioning.
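A band-limited inverse of this kind might be sketched as follows. Everything specific here is an assumption made for illustration: the function names, the 200 Hz and 3500 Hz band edges, the choice of zero gain outside the band, the synthetic resonance near 500 Hz, and the use of whole-signal FFT multiplication (circular filtering) in place of a designed digital filter.

```python
import numpy as np

def compensation_gain(log_h_est, freqs, f_lo, f_hi):
    """Return 1/|H(f)| inside [f_lo, f_hi] and zero elsewhere.

    log_h_est is an estimate of log|H(f)| in nepers on the grid `freqs` (Hz).
    """
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return np.where(in_band, np.exp(-log_h_est), 0.0)

def apply_zero_phase(x, gain):
    """Apply a real gain curve sampled on the rfft grid of x (circular filtering)."""
    return np.fft.irfft(np.fft.rfft(x) * gain, n=len(x))

fs = 8000                                   # hypothetical sampling rate, Hz
n = 4096
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

# Hypothetical horn magnitude response: one resonance of about 12 dB near 500 Hz.
h_mag = 1.0 + 3.0 * np.exp(-((freqs - 500.0) / 100.0) ** 2)
gain = compensation_gain(np.log(h_mag), freqs, f_lo=200.0, f_hi=3500.0)

rng = np.random.default_rng(0)
original = rng.standard_normal(n)
recorded = apply_zero_phase(original, h_mag)        # simulate the amplitude distortion
restored = apply_zero_phase(recorded, gain)         # compensate inside the band only
band_limited = apply_zero_phase(original, (gain > 0).astype(float))
```

In this noise-free toy case the restored signal equals the band-limited original. With surface noise present, the gain outside the band would multiply noise alone, which is the reason for confining the response.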
Fig. 3 shows the averages of (11a), modified to include noise, as computed from the singing portion of the 1907 recording of "Vesti la Giubba" by Enrico Caruso.³ Fig. 4 shows the same averages for a quite recent recording of the same selection by Jussi Bjoerling. Fig. 4(a) presents the raw data and Fig. 4(b) is the same data smoothed in frequency. Smoothing further stabilizes the average and is justified because no major resonance phenomena which could produce sharp peaks are expected in the prototype spectrum. Fig. 5(a) is the difference of the two and thus represents an estimate of the frequency response of the horn used to make this recording.
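Smoothing a prototype average across frequency can be illustrated with a simple moving average. This is only a sketch: the text does not specify the smoothing kernel, so the boxcar kernel, its 31-bin width, the edge padding, and the noisy flat test curve are all assumptions.

```python
import numpy as np

NEPER_TO_DB = 20.0 / np.log(10.0)   # about 8.686 dB per neper

def smooth_in_frequency(log_spec, width):
    """Moving average of a log spectrum over `width` neighbouring bins (width odd)."""
    kernel = np.ones(width) / width
    padded = np.pad(log_spec, width // 2, mode="edge")  # repeat end values at the edges
    return np.convolve(padded, kernel, mode="valid")    # same length as the input

rng = np.random.default_rng(0)
raw = 0.1 * rng.standard_normal(512)        # stand-in for a noisy, nominally flat average
smoothed = smooth_in_frequency(raw, width=31)

raw_db = NEPER_TO_DB * raw                  # the same curves expressed in decibels
smoothed_db = NEPER_TO_DB * smoothed
```

A wider kernel lowers the scatter further but would also flatten any genuine narrow feature, which is acceptable for a prototype expected to have no sharp resonances and would not be acceptable for the old recording itself.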
³ In Figs. 3 through 14, inclusive, the abscissa represents frequency in hertz on a logarithmic scale. The ordinate represents decibels on a scale of 10 dB per main division. The equations in this paper corresponding to these figures have been formulated using the units of nepers because the simplicity of the natural logarithm was required to produce equations of manageable complexity. In all cases, the results obtained from the equations may be converted from nepers to decibels simply by multiplying them by 8.686.

Fig. 4. The averages of (14) for a prototype recording of "Vesti la Giubba" by Jussi Bjoerling. (a) The raw data. (b) The data smoothed in frequency. (Axes: frequency in Hz, amplitude in dB.)

When these data are inverted, as in Fig. 5(b), to produce compensation and properly truncated to avoid ill-conditioning, the frequency response of the final restoration system is obtained. The results are shown in Fig. 6. From these data, a digital filter is realized using established design methods [14], [15]. This filter is then used to process the digital samples of the original acoustic disk by high-speed convolution processing. Finally, the resulting samples are converted to an analog signal and presented for listening.

The results of this processing are very striking, especially from an artistic point of view where the emphasis is not so much on producing a recording of modern quality as it is on
having a clear glimpse into past musical events. All the restorations we have made, which so far concentrate on the recordings of Enrico Caruso, retain some of the "acoustic flavor" but the clarity of expression, the texture of the voice, and the artistic interest are dramatically changed. In addition, the prominent surges in volume caused when the pitch of the singing voice strikes the recording horn resonances are almost entirely gone. The voice seems much closer to the listener, the megaphone sound having been almost completely eliminated. The realistic qualities of the voice provided by the upper range of frequencies within the range of the restoration process are dramatically obvious.

Fig. 5. The difference between the data of Figs. 3 and 4(b). (a) Estimate of amplitude distortions. (b) Inverted data producing compensation frequency response. (Axes: frequency in Hz, amplitude in dB.)

Fig. 6. The frequency response of Fig. 5(b) truncated to avoid ill-conditioning due to surface noise. (Axes: frequency in Hz, amplitude in dB.)

Restorations have been auditioned by a broad audience including laymen, musicians, and serious collectors of acoustic disks. A curious phenomenon has been revealed by the remarks of the latter group. Connoisseur collectors almost uniformly agree that when old acoustic recordings are played directly (i.e., without restoration processing), they sound better on a wide range system reproducing frequencies well above 3500 Hz. As we shall see, there is convincing evidence that no components of the original musical signal in this high-frequency band were recorded. Theories as to why reproduction of this high band of the original should sound better center around the following arguments. It is better for the ear to hear something in any band of frequencies than nothing at all. The surface noise which does extend into these frequencies is modulated by the singing and thus provides a kind of artificial high-frequency structure. Small amounts of nonlinear distortion create harmonic components in these regions that enhance the listening experience. The curious phenomenon is that the restorations which we have produced contain no sensible energies in these frequency bands and yet are almost always judged superior to a high quality acoustic original reproduced in the simple wide-band manner.

At this point one might wonder whether the omission of phase compensation from the restoration procedure prevents it from realizing its full potential. While there is some hope that one might overcome the previously mentioned problems with computing the averages of (11b), no attempt has been made thus far. Instead, an experiment was performed to confirm the appropriateness of ignoring phase as a contributor to the audible defects associated with old acoustic disks. This experiment involves the computation of the minimum phase associated with the estimates of amplitude distortion. Starting with that estimate, Fig. 5(b), the associated minimum phase is computed by means of a discrete Hilbert transform. That minimum phase is then combined with the truncated compensation frequency response of Fig. 6. The auxiliary restoration linear filter thus formed is used to produce a restored sound. This restored sound is then compared with that obtained using zero phase as previously described. Careful auditioning on loudspeakers and earphones reveals no perceptible difference between these two restorations. The absence of such differences supports the assumption that the phases of (11b) need not be estimated, provided the difference between the actual phase and the minimum phase is not too great. While it seems reasonable to ignore phase in restoring acoustic disks, this issue still remains unresolved and awaits further work.

THE EFFECTS OF NOISE

From (4) through (11b), we have neglected the effects of surface noise upon the restoration of old recordings through blind deconvolution. If we attempt to rectify this, we obtain

$$ P_i(f) \approx S_i(f) \cdot H(f) + N_i(f) \tag{12} $$

instead of (8). Here $N_i(f)$ is the Fourier transform of the ith windowed noise segment. One can quickly see that taking the logarithm of both sides of (12) will now present some problems, because the right-hand side will not reduce to a sum of logarithms as is required. Proceeding anyway, and ignoring
At this point, let us make the rather unrealistic supposition that the singing waveform s(t) is a stationary random process. Under these circumstances, as is shown in Appendixes A and B, the left-hand side of (14) is a sample mean estimate of half of the logarithm of the power spectrum of the playback waveform p(t). Considering the expected values and variances of both sides of (14), we obtain

where the $\Phi$'s are the respective true power spectra⁴ and C is Euler's constant (C = 0.57721...). We see that while both estimates are biased, the bias is a universal constant independent of the data and frequency and thus could be removed a priori. In (16a) the simple involvement of only the power spectrum of the original singing and the power spectrum of the noise is made possible by the further assumption that the noise and the singing are independent processes. Proceeding similarly for a modern recording, we obtain

⁴ The $\Phi$'s are of course functions of frequency f.

where A(f) is the estimated amplitude distortion. In the noise-free case (i.e., $n(t) \equiv 0$), this becomes

$$ E\{A(f)\} = \tfrac{1}{2} \log |H(f)|^2 = \log |H(f)| \tag{20} $$

as previously indicated.

Deriving the frequency response of the compensating filter,

$$ R(f) = \exp\!\left(-\tfrac{1}{2} \log\!\left[\frac{\Phi_S |H(f)|^2 + \Phi_N}{\Phi_S}\right]\right) = \left[\frac{\Phi_S}{\Phi_S |H(f)|^2 + \Phi_N}\right]^{1/2} \tag{21} $$

we see that, as before, it becomes the inverse filter with frequency response 1/|H(f)| in the noise-free case. When noise is present, the frequency response is the geometric mean between the inverse filter and the Wiener filter⁵ which would have been produced if the system H and the noise spectrum $\Phi_N$ had been known a priori as is required in Wiener filtering situations. It is very interesting to note that all of this transpires in spite of the fact that the system function for the system H and the power spectrum for the noise are completely unspecified a priori and never explicitly isolated during the process.

⁵ Actually, the Wiener filter would involve the phase of H(f) in its formulation, but recall that we have assumed that phase compensation is completely neglected in this application at this time.

If at this point we were to process the playback signal p(t) through the compensation filter R, we would obtain a signal q(t) with a power spectrum given by

$$ \Phi_Q = \Phi_P |R|^2 = \Phi_P \cdot \left[\frac{\Phi_S}{\Phi_S |H(f)|^2 + \Phi_N}\right] = \Phi_S. \tag{22} $$

In other words, the compensating filter would produce an output signal with a power spectrum identical to that of the original singing. We, therefore, see that, in this case, the presence of noise acts to alter the averaging process motivated by the homomorphic theory in the right direction to move the effects of that noise away from ill-conditioning. In this light, the frequency response truncation used to avoid ill-conditioning and shown in Fig. 6 might be regarded as unnecessary. On the other hand, since the compensating frequency response of (21) is geometrically only half way between the ill-conditioned inverse filter and the optimum Wiener filter, it might be feared that ill-conditioning is not avoided sufficiently.

Since error is to be measured by the human ear and it is certain that the ear does not use a mean-square error criterion, it is not clear what option will perform best. Further complicating the consideration is the fact that since the singing waveform is not a stationary random process, optimum filtering is not Wiener filtering in the usual sense. As a consequence, our attempts to achieve a best restoration centered initially around three options. The first was to use the R(f) of (21) as shown in Fig. 5(b) directly without truncation. The second was to use the truncated R(f) as shown in Fig. 6. The third was to modify R(f) to resemble the frequency response required by the Wiener theory as closely as possible.

Our tests with these systems show that the truncated response of Fig. 6 produces a definite improvement over the pure compensation of Fig. 5(b). While this result suggests that the approximate Wiener filter of option three might sound even better, such is not the case. These restorations are definitely deficient in bass. Indeed, the restorations resulting from the truncated R(f) are also judged somewhat the same way.

A better balanced sound is obtained with the insertion of an empirically determined bass boost characteristic as shown in
Fig. 7. The resulting restoration frequency response shown in Fig. 8 performs better than that shown in Fig. 6, although both achieve the major objective of alleviating the resonant and megaphone qualities of the original equally well. Comparison of Fig. 8 with inversions of frequency response curves published for old acoustic transducers indicates a better match than the data of Fig. 6. Assuming that the published data are correct, calculations with (21) using measured surface noise energies and measured prototype spectra produce a frequency response which deviates from that of Fig. 6 especially in the bass region.

Fig. 7. Empirically determined bass boost. (Axes: frequency in Hz, amplitude in dB.)

Fig. 8. The compensation frequency response of Fig. 6 with the bass boost of Fig. 7 added. (Axes: frequency in Hz, amplitude in dB.)

POWER SPECTRUM APPROACH

Since (21) suggests the direct use of power spectra, our restoration scheme was modified to produce the compensation frequency response of (21) directly by power spectrum estimation techniques. In this approach the basic difference is that the averaging process performed after the taking of the logarithm in the homomorphic method is instead performed directly upon the squared magnitude of the Fourier transformed intervals of the playback signal. Starting again with (12), we obtain instead of (13)

Now after averaging, we have the following equation instead of (14):

Following our desire to keep our results in terms of attenuation and to allow comparison with the homomorphic approach, we take the logarithm of both sides of (24), divide by two, and obtain

Retaining our supposition about the stationary nature of s(t), as shown in Appendixes A and C, the left-hand side of (25) is a second sample mean estimate of half of the logarithm of the power spectrum of the playback waveform. Considering the expected values and variances of both sides of (25), we obtain

$$ \cdots = \psi'(N)/4 \approx 1/(4N) $$

$$ \cdots = \psi'(N)/4 \approx 1/(4N) \tag{27b} $$

where $\psi(x)$ is the digamma function. Again, we have assumed that the surface noise and the singing are independent processes. For a modern recording, these same steps produce

Again (18) holds and so the subtraction of (28) from (26) and (27) leads to the same result as before expressed in (19)
through (22), except that (19b) becomes

$\operatorname{var}\{\hat{A}(f)\} = 1/(2N).$    (19c)

In other words, these two system estimators are equivalent,
(19), with the one based upon direct power spectrum estimates
being about 22 percent more stable (i.e., having smaller
standard deviation or being smoother) than the homomorphic
estimate.

Fig. 9. The averages of (25) for the same Caruso recording as used to
produce Fig. 3. (Plot: amplitude in dB versus frequency in Hz.)

Fig. 10. The averages of (25) for the same Bjoerling recording used
to produce Fig. 4. (a) The raw data. (b) The data smoothed in
frequencies.

Fig. 11. The difference between Figs. 9 and 10(b). (a) Estimate of
amplitude distortions. (b) Inverted data producing compensation
frequency response.

Fig. 12. The frequency response of Fig. 11(b) truncated to avoid
ill-conditioning due to surface noise.

Nevertheless, as implied before, significant differences between
this power spectrum approach and the homomorphic
approach are confirmed experimentally. Specifically, Figs. 9-12
correspond to Figs. 3-6 except for this change in strategy. What
is more, notice the similarity between Figs. 8 and 12. In Fig.
12, the restoration frequency response agrees very closely with
the empirically bass boosted homomorphic estimate. Also, upon
listening, the two sound very much alike. Furthermore, even
though (19b) and (19c) predict that the results of Fig. 5(a)
ought to be less smooth than those of Fig. 11(a), just the
reverse appears to be true.
At this point it is only natural to wonder about the cause of
the experimental differences between the two approaches.
Considering the physical nature of audio signals, especially
music signals, it is somewhat obvious that the stationary
assumption for the singing signal is at the heart of the issue. In
addition, since the power spectrum estimates give more natural
sounding results directly, a shadow of doubt is cast across the
advisability of using homomorphic blind deconvolution for
noisy signals in spite of its attractive theoretical motivation in
the noise-free case. Somehow this makes no sense, especially
since the homomorphic theory makes no assumptions
about the stationarity of the signals involved and the power
spectral approach does. As a result, let us set out to determine
the effects of nonstationarity on both methods.

THE EFFECT OF NONSTATIONARY SIGNALS UPON
HOMOMORPHIC VERSUS POWER SPECTRUM ESTIMATES OF
AMPLITUDE DISTORTIONS

As is well known, the application of power spectrum estimation
techniques is permissible in the strict sense only when the
signals involved are stationary. In addition, estimates involving
time averages are possible only if the signals are ergodic. If, on
the other hand, the time variations which characterize the
nonstationary signals change slowly enough with time, power
spectral estimates applied to such signals may still make sense.
Since all power spectrum estimates used in this work involve
averages over finite lengths of time, the requirement for this
to be true is that the statistics of the signal change so slowly
in each time interval to be analyzed that the estimation
calculations are virtually unaffected.
In this light, let us model the singing signal s(t) as being
formed by passing a stationary random signal through a
time-varying linear system. Also let the variations of this system be
so slow that we can consider the system to be time-invariant
over any interval for which a Fourier analysis is to be applied.
This is equivalent to modeling the singing waveform as if it
were produced by a speech synthesis system such as commonly
used in vocoders subject to the constraint that only unvoiced
(i.e., hiss) [16] excitation be used and that the parameters
determining the frequency response of the vocal tract be slowly
varying. Admittedly, this is a relatively simple model. It does
not permit coherent components such as those produced by
the very nearly periodic vibration of the vocal chords.
Nonetheless, it has served us very well in this analysis and to
produce a more sophisticated working model begins to approach
the complexity of creating an automated singing music box.
Proceeding with this model and paying attention to the
considerations with respect to slow variations and windows, (7)
becomes

$v_i(t) = g_i(t) \otimes b_i(t) \otimes h(t)$    (29)

where $b_i(t)$ is the impulse response of the slowly varying linear
system during the ith interval. Taking the Fourier transform
of both sides of (29), we obtain

$V_i(f) = G_i(f) \cdot \beta_i(f) \cdot H(f)$    (30)

where $\beta_i(f)$ is the frequency response of the slowly varying
linear system during the ith interval. Adding noise as before,
we get

$P_i(f) = G_i(f)\,\beta_i(f) \cdot H(f) + N_i(f)$    (31)

instead of (12). Computing averages according to the
homomorphic theory leads to

$\frac{1}{N}\sum_{i=1}^{N} \log |P_i(f)| = \frac{1}{N}\sum_{i=1}^{N} \log |G_i(f)\,\beta_i(f) \cdot H(f) + N_i(f)|.$    (32)

Proceeding as motivated by the power spectrum approach, we
have

It should be noted that (32) and (33) are no longer estimators
of true log power spectra. This is so because the $\beta_i$'s are
different for each interval.
In the noise-free case, as is shown in Appendixes D and E,
the expected value and variances for these expressions are

(35b)

where $L_p(f)$ represents the right-hand side of (32) and $\hat{P}_p(f)$
represents the right-hand side of (33). Compare these equations
with (16a), (16b), (27a), and (27b), respectively. Assuming
no noise (i.e., $\phi_N = 0$) and that all of the $\beta_i$'s are
identical, the equations become identical, and

$\phi_s = \phi_G \cdot \beta^2(f)$    (36)

as is expected in the stationary case. Permitting the $\beta_i$'s to
vary, we see that both $E\{L_p\}$ and $E\{\hat{P}_p\}$ will in general be
modified from the stationary result, each by a different
amount. In addition, while $\operatorname{var}\{L_p\}$ remains the same,
$\operatorname{var}\{\hat{P}_p\}$ is increased, the increase being larger the greater the
variability.
Proceeding similarly for a modern recording, we obtain
where $\alpha_i(f)$ is the frequency response of the slowly varying
linear system during the ith interval of the prototype. The
assumption that the prototype and the signal to be restored
have the same statistics requires, for the homomorphic
approach, that

(39a)

and, for the power spectrum approach, that

For the assumption to hold simultaneously for both approaches,
it is further required that the $\alpha$'s and the $\beta$'s be
identical in pairs for each value of frequency f. However, the
pairing need not be the same at all frequencies. If the prototype
is the same musical selection, it would not be unreasonable
to assume that

$\alpha_i(f) = \beta_i(f), \quad i = 1, 2, \cdots, N$    (40)

which is more than enough to guarantee (39a) and (39b).
Under this assumption, the subtraction of (37) from (34)
and of (38) from (35) yields

$E\{\hat{A}(f)\} = (1/2) \log |H(f)|^2 = \log |H(f)|$    (41)

for both approaches, but

$\operatorname{var}\{\hat{A}_H(f)\} = \pi^2/(12N)$    (42a)

for the homomorphic approach, and

for the power spectrum approach.
Thus we see that while both attenuation estimates are
unbiased and equivalent, the one based upon the power spectrum
approach is no longer guaranteed to be more stable. As a
matter of fact, the greater the dynamic range of the singing,
the more unstable the power spectrum approach will be.
Assuming some reasonable variations for the $\beta_i$'s, we have
shown both theoretically and experimentally that one obtains
estimates which are about twice as stable using the homomorphic
approach than using the power spectrum approach
[17].
In practice, of course, the noise is not zero. Returning our
attention to (32) and (33), we see, however, that due to the
variation of the $\beta_i$'s there is no way to proceed as we did from
(14) to (16a) and from (25) to (27a). The closest we can come
is to regard the noise as a perturbation on each $\beta_i(f)$ and thus
absorb it into (34a) and (35). The net effect would thus be to
reduce the variations of the $\beta_i$'s at frequencies for which the
noise energy becomes large compared to the singing energies
for an appreciable fraction of the time.
Because the prototype is relatively noise free, the $\alpha_i$'s are
perturbed very little. As a result, neither (39a) nor (39b) will
hold, and (41) becomes

$E\{\hat{A}_H\} = \log |H(f)| + \cdots$

for the homomorphic approach, and

for the power spectrum approach.
For the case of surface noise, then, the two attenuation
estimates are both biased but each by a different amount.
What is more, the bias in the case of (43) will be greater than
the bias of (44) especially at those frequencies for which the
noise energy is greatest. This is true, because
$(1/N) \sum \log \beta_i^2 \le \log\,(1/N) \sum \beta_i^2$,
with the equality holding when all of the $\beta_i$'s are
the same. Since the addition of noise tends to increase all the
$\beta_i$'s while making them more equal, $(1/N) \sum \log \beta_i^2$ must be
increased more.
For large dynamic range signals, this effect is quite large for
the homomorphic case but remains relatively small for the
other. This fact fits quite well with the data obtained in Figs.
5(a) and 11(a) and accounts at least qualitatively for the
differences between the restoration frequency responses of Figs.
7 and 12. Indeed, these results have been quantitatively
confirmed through experiments with simulated data [17].
An interesting collateral issue arises if we return to our
comparison of $E\{L_p\}$ with $E\{\hat{P}_p\}$ and also compare $E\{L_M\}$
with $E\{\hat{P}_M\}$ in the same way. Figs. 13 and 14 are plots of
$L_p - \hat{P}_p$ and $L_M - \hat{P}_M$, respectively, for the Caruso and
Bjoerling data. In both cases, there are bands of frequencies for
which these functions are constant and stay close to -2.5 dB in
value. For Fig. 13, this happens below about 160 Hz and
above 3250 Hz. For Fig. 14, it occurs below about 60 Hz and
above 4500 Hz. Since -2.5 dB corresponds to -C/2 nepers,
which is the expected difference between $L$ and $\hat{P}$ in the case
of stationary signals, one would assume that these frequency
bands represent pure surface noise for their respective
recordings. Given the available data about old recordings, this
assumption would seem to be quite reasonable.6 Where there is
music energy the difference becomes more negative than -2.5
dB indicating the nonstationary nature of such signals. Indeed,
the sensitivity of the $L - \hat{P}$ curve to this effect seems very high,
thus providing an excellent test for the presence of nonstationary
signals in the presence of stationary backgrounds. In that
regard, Fig. 13 strongly indicates the absence of recorded
energies outside the frequency band of 160-3250 Hz from the
1907 recording of "Vesti la Giubba" by Caruso.

6 The modern recording was filtered by a sharp cutoff low-pass filter,
with a cutoff frequency of 4 kHz before sampling to prevent aliasing.

As we can see from the foregoing, noise will always bias our
estimates of the amplitude distortions present in old recordings.
In the case of (19a), we have seen that this bias is desirable
because it tends to avoid the effects of ill-conditioning
in just about the right way. For stationary signals, the bias is
the same regardless of which approach is used. Since the power
spectrum approach provides slightly smoother results and from
a computational point of view avoids the frequent computation
of the logarithm function, it would be preferred.

Fig. 13. $L_p - \hat{P}_p$ as produced by the Caruso recording. These data
strongly suggest that no music energies were recorded on the original
disk outside the band of frequencies between 160 and 3250 Hz.

Fig. 14. $L_M - \hat{P}_M$ for the Bjoerling recording used as a prototype.
Compare with Fig. 13.

In the nonstationary cases of (43) and (44), it is understood
that some bias is still required to avoid the effects of
ill-conditioning. As our listening tests have shown, the power
spectrum approach of (44) produces a subjectively superior
result, an outcome which is in harmony with the fact that the
homomorphic estimate is the more heavily biased. The temptation
to draw conclusions from this observation about the
possibility that the attenuation estimate of (44) leads to a
near-optimum restoration system should be avoided, however.
The desirable truncation effects introduced into Fig. 11(b) to
produce the restoration frequency response of Fig. 12 indicate
that this is so. More importantly, it must be remembered that
since the noise is assumed stationary and the singing is not,
one would expect that the best restoration filter would itself
be a time-varying one. We have not yet experimented with
this notion but it provides a tantalizing possibility for the
improvement of the results which we have already obtained.

BLIND DECONVOLUTION IN IMAGE PROCESSING

We have applied the theoretical results developed for
eliminating unknown reverberations and resonances from sound to
the problem of eliminating unknown blurs from photographs.
The details of this work are reported elsewhere [18], [19].
However, we discuss them briefly here because they present
actual results of blind deconvolution and further illustrate
details of the theory.
Images blurred by such causes as camera motion during
exposure, or inaccurately focused lenses, suffer degradation
through a convolution with an unknown impulse response
commonly called a point spread function [4]. The situation
is much the same as described by (3), except that the signals
are two-dimensional and the noise is signal dependent. Such
dependence is characteristic of commonly encountered image
recording media such as photographic film. For example, a
multiplicative noise model is more appropriate than an additive
one in many cases [20].
The process of deblurring images closely resembles that of
dereverberating sound except for some important differences
in detail. As with reverberated sound, we chop the blurred
photograph into intervals each of which takes the form of a
rectangular or square subpicture. After multiplying by suitable
smooth two-dimensional windows, we compute the Fourier
transform of these subpictures producing results similar to
those of (12). These Fourier transforms represent the
two-dimensional spatial frequency components of the blurred and
noisy subpictures. Next, by computing the logarithm of the
magnitude of each transformed subpicture and by averaging
over all subpictures, we obtain results similar to (13) and (14).
Alternately, we can use the power spectrum approach and
obtain results similar to (25). Even for low-resolution images
involving, let us say, 340 × 340 picture elements, approximately
100 subpictures each containing 64 × 64 picture elements
are available, thus allowing reasonable convergence of
these averages.
For a prototype image we use a sharp photograph of objects
similar in type to those in the blurred photograph. After
chopping, windowing, transforming, taking the log magnitudes,
and averaging the subpictures of the sharp photograph, we
proceed as in (19)-(21). Alternately, for the power spectrum
approach, we would proceed as in (26)-(28), instead of (15)-
(17). Finally, the restoration frequency response, in (21), is
then used to design a two-dimensional digital filter which in
turn is used to process the digital samples of the blurred
photograph by high-speed convolution processing.
Unlike the ear, the eye is sensitive to phase distortions. As a
result, we have attempted to bridge the problems encountered
with (11b) and have achieved some practical successes for
limited but commonly occurring classes of blurs [19]. For
phases corresponding to camera motion during exposure or
inaccurately focused lenses, the exact phase distortion is known
except for one or two parameters. For motion blur, two
parameters are related to the direction and extent of the blur.
For defocused blur involving lenses with circular apertures, a
single parameter is related to the radius of the resulting circular
point spread function.
For a particular blurred photograph these parameters can be
determined by computing the two-dimensional Fourier transforms
of $L_p(f)$ or $\hat{P}_p(f)$. Except for the effects of noise, these
Fourier transforms can be considered to be estimates of the
two-dimensional cepstrum of the blurred image. The individual
characteristics of such cepstra possess a sensitive and reliable
relationship to the parameters desired. We have developed
automatic techniques for identifying the type of blur and
extracting these parameters in cases where motion blur and
defocus blur occur independently. For some test cases in
which both types of blur occur, visual inspection of a display
of the cepstrum has yielded the three parameters. Once the
parameters are found, we form the phase compensation function
from the appropriate known analytic expressions, we associate
that phase with the amplitude compensation of (21),
and we proceed with the digital filtering.
Figs. 15 and 16 demonstrate some results of our experiments
with blurred images. In both examples, the blurring degradations
were produced through deliberate misuse of a real camera.
The restorations shown were then produced by the blind
deconvolution methods just outlined.

Fig. 15. Blind deconvolution of image motion blur. (a) Photograph of
a sign blurred in a real camera. (b) Restoration of (a) achieved through
homomorphic blind deconvolution.

Fig. 16. Blind deconvolution of lens defocus blur. (a) Photograph of
a building blurred in a real camera. (b) Restoration of (a) achieved
through blind homomorphic deconvolution.

For image deblurring, the choice between the homomorphic
approach and the power spectrum approach is based upon
different considerations than for dereverberating sound. The
reason is that the noise is signal dependent.
Consider a multiplicative noise model. Also note that image
signals are nonstationary in much the same way as sound
signals. For example, in one area, an image may be very dark
and shadowy, resulting in low signal amplitudes, and in another
area, the image may be well illuminated and bright, resulting
in high signal amplitudes. The result of all this is that the noise
amplitudes will also be small in the dark areas and high in the
bright areas.
These effects reduce the perturbations of the $\beta_i$'s in (43) and
(44) caused by the presence of $N_i(f)$ in (31). The result is that
(43) and (44) possess more nearly equal biases than in the case
of additive noise and produce results closer to those of (19a).
This fact reduces the bias problems previously encountered
with the homomorphic approach.
In contrast, the smoothness of the estimates produced by
the two approaches remains much the same as in the additive
noise case. Specifically, for multiplicative noise, the dynamic
range of the image signals is greater and (42b) deviates more
from (19c). Thus, in the case of images, the two attenuation
estimates are both biased by the same amount, and in the
direction away from ill-conditioning, while for large dynamic
range signals, the homomorphic approach provides smoother
results.

CONCLUSIONS

Practical solutions to the blind deconvolution problem are
available through digital signal processing when one of the
convolved signals is of much greater extent than the other.
Successful examples have been demonstrated for the cases of
reverberated and resonated sound and images degraded by
some common forms of blur. At the present time, the major
weakness of the solutions is the inability to correct for the
unknown phase distortions in the general case. The technology
presently available for analog signal processing cannot realize
the system complexity required by these solutions to the blind
deconvolution problem. Thus digital signal processing is
presently essential to the use of these methods. Although
present-day signal processing involves higher costs and lower speeds
than might be desired, the methods presented here can be
practically applied at modest cost.7

APPENDIX A

In the remaining appendixes, $\chi^2$ and $\log \chi^2$ statistics will be
used and the following results needed.
Consider, first, two random variables, y and z = log(y) with
$y \sim \chi_2^2$ ($\chi^2$ with 2 degrees of freedom), $E\{y\} = \mu_y$, and
$\operatorname{var}\{y\} = \sigma_y^2 = \mu_y^2$. Noting that a $\chi_2^2$ distribution is equivalent
to an exponential distribution, it follows that
$p_y(y) = k \cdot \exp(-ky)$, where $p_y(y)$ is the probability density
distribution associated with y and $k = 1/\mu_y$.
We will now derive $p_z(z)$ (called a $\log \chi^2$ distribution with 2
degrees of freedom and denoted $\log \chi_2^2$). Recall that
z = f(y) = log(y). Under appropriate conditions [21],

$p_z(z) = p_y(f^{-1}(z)) \cdot |(d/dz) f^{-1}(z)|.$    (A1)

Since $df^{-1}(z)/dz = f^{-1}(z) = \exp(z)$, it follows that

$p_z(z) = k \cdot \exp(z - k \exp(z)).$    (A2)

Rewriting k as $k = \exp(-\log(1/k)) = \exp(-\log \mu_y)$, (A2)
becomes

$p_z(z) = \exp[z - \log \mu_y - \exp(z - \log \mu_y)].$    (A3)

Using (A3), it can be shown [18] that

$E\{z\} = E\{\log(y)\} = \log \mu_y - C$    (A4a)
$\operatorname{var}\{z\} = \pi^2/6 = 1.6449341\cdots$    (A4b)

where C is Euler's constant ($C = 0.57721\cdots$).
Generalizing for sums of independent random variables,
we now let $y \sim \chi_{2N}^2$ and $z \sim \log \chi_{2N}^2$. It has been shown by
Bartlett and Kendall [22] that

$E\{z\} = \log \mu_y + \psi(N) - \log(N) \approx \log \mu_y$    (A5a)
$\operatorname{var}\{z\} = \psi'(N) \approx 1/N$    (A5b)

where $\psi(N)$ is the digamma function.8 The derivation of (A5)
involves development of the appropriate characteristic function
and is beyond the scope of this paper. The reader is
referred to [22] for details.
Using a Euler-Maclaurin expansion [23],

$\psi(N) = \log(N) - 1/(2N) - 1/(12N^2) + 1/(120N^4) - \cdots.$    (A6)

Thus, for large N, $\psi(N) \to \log N$ and $\psi'(N) \to 1/N$ and the
approximations of (A5) hold. For N = 1, $\psi(1) = -C$,
$\psi'(1) = \pi^2/6$, and (A5) reduces to (A4).

APPENDIX B

Equations (15a) and (15b) may be derived as follows.
We note that $|P_i(f)|^2 = \operatorname{Re}\{P_i(f)\}^2 + \operatorname{Im}\{P_i(f)\}^2$, where
$\operatorname{Re}\{P_i(f)\}$ and $\operatorname{Im}\{P_i(f)\}$ are the real and imaginary parts,
respectively, of the Fourier transform of $p_i(t)$. Even for a
non-Gaussian process, $\operatorname{Re}\{P_i(f)\}$ and $\operatorname{Im}\{P_i(f)\}$ will be nearly
Gaussian. Thus $|P_i(f)|^2$, as the sum of two squared Gaussian
random variables,9 will be distributed as $\chi_2^2$ and $\log |P_i(f)|^2$ as
$\log \chi_2^2$.
Applying the results of Appendix A, we have

$\operatorname{var}\{\frac{1}{N}\sum_{i=1}^{N} \log |P_i(f)|^2\} = \pi^2/(6N) = 1.6449341\cdots/N$    (B1b)

where $\phi_p = E\{|P_i(f)|^2\}$ is the true power spectrum of p(t)
under the usual assumptions of power spectral estimation.
Noting that

we have, as desired,

7 Using a Digital Equipment Corporation PDP-10 computer, to
accomplish these restorations requires about 20 min of computing time
for each image like those of Figs. 15 and 16 and about 2 h of computing
time for a 4-min acoustic recording. A-D and D-A conversion
contributes a minor additional overhead. Digital processing facilities more
suited to production signal processing tasks would reduce these costs
significantly.
8 Note that $\psi(t) = d \log \Gamma(t)/dt$ and $\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx$ is the gamma
function.
9 Throughout these statistical discussions we will make the approximating
assumption that each of the $|P_i(f)|^2$'s are independent random
variables, and that $\operatorname{Re}\{P_i(f)\}$ and $\operatorname{Im}\{P_i(f)\}$ are uncorrelated.
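The stationary, noise-free statistics of Appendixes A and B can be checked numerically with a short simulation. The sketch below is not the authors' processing chain: it simply draws $|P_i(f)|^2$ as exponential variables (the $\chi_2^2$ model of Appendix B) and compares the average of the logs with the log of the average. The values of `phi`, `n`, the number of trials, the seed, and the function name `simulate_estimates` are arbitrary assumptions made for illustration.

```python
import math
import numpy as np

EULER_C = 0.5772156649015329          # Euler's constant C of (A4a)
NEPER_TO_DB = 20.0 / math.log(10.0)   # 1 neper is about 8.686 dB

def simulate_estimates(phi, n_intervals, n_trials, seed=0):
    """Return (L, P): homomorphic and power spectrum log-amplitude estimates in nepers.

    |P_i(f)|^2 is drawn as an exponential variable of mean phi, i.e. the
    chi-square model with 2 degrees of freedom assumed in Appendix B.
    """
    rng = np.random.default_rng(seed)
    power = rng.exponential(phi, size=(n_trials, n_intervals))
    homomorphic = 0.5 * np.log(power).mean(axis=1)      # average of log |P_i|
    power_spectrum = 0.5 * np.log(power.mean(axis=1))   # half log of averaged |P_i|^2
    return homomorphic, power_spectrum

phi, n = 4.0, 50                       # assumed true power and number of intervals
L, P = simulate_estimates(phi, n, n_trials=20_000)

# Mean offset between the two estimates, to compare with -C/2 nepers.
bias_db = (L.mean() - P.mean()) * NEPER_TO_DB
# Variance ratio, to compare with (1/(4N)) / (pi^2/(24N)) = 6/pi^2.
var_ratio = P.var() / L.var()

print(f"L - P = {bias_db:.2f} dB (theory {-0.5 * EULER_C * NEPER_TO_DB:.2f} dB)")
print(f"std ratio = {math.sqrt(var_ratio):.3f} (theory {math.sqrt(6) / math.pi:.3f})")
```

Under these assumptions the offset should come out near the -2.5 dB figure quoted for stationary signals, and the standard deviation ratio near 0.78, which is the source of the statement that the power spectrum estimate is about 22 percent more stable in the stationary case.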
APPENDIX C

Equations (26a) and (26b) may be derived as follows.
As in Appendix B, we will consider $|P_i(f)|^2$ to be distributed
as $\chi_2^2$. Accordingly, $(1/N) \sum_{i=1}^{N} |P_i(f)|^2$ is distributed as $\chi_{2N}^2$,
and $\log((1/N) \sum_{i=1}^{N} |P_i(f)|^2)$ as $\log \chi_{2N}^2$. Again, using the
results of Appendix A, we have

$E\{\frac{1}{2} \log[\frac{1}{N}\sum_{i=1}^{N} |P_i(f)|^2]\} = \frac{1}{2}(\log \phi_p + \psi(N) - \log(N)) \approx \frac{1}{2} \log \phi_p$    (C1a)

$\operatorname{var}\{\frac{1}{2} \log[\frac{1}{N}\sum_{i=1}^{N} |P_i(f)|^2]\} = \psi'(N)/4 \approx 1/(4N)$    (C1b)

where $\phi_p = E\{|P_i(f)|^2\}$ is the true power spectrum of p(t) under
the usual assumptions of power spectral estimation. The
approximations of (C1) are valid for large N (e.g., $N \ge 20$). It is
interesting to note that as $N \to \infty$,

and

(C3)

as in (C1a).

APPENDIX D

Equations (34a) and (34b) may be derived as follows.
Assuming the process is noiseless, we have

$L_p(f) = \frac{1}{N}\sum_{i=1}^{N} \log |G_i(f) \cdot H(f) \cdot \beta_i(f)|.$    (D1)

Equation (D1) may be rewritten as

Noting that $\log |G_i(f)\,H(f)|^2$ is distributed as $\log \chi_2^2$ and
applying the results of Appendix A, we can write

$\operatorname{var}\{L_p(f)\} = (\frac{1}{2})^2 \operatorname{var}\{\frac{1}{N}\sum_{i=1}^{N} \log |G_i(f) \cdot H(f)|^2\} = \pi^2/(24N)$    (D3b)

where $\phi_G = E\{|G_i(f)|^2\}$ is the true power spectrum of g(t) under
the appropriate conditions.

APPENDIX E

Equations (35a) and (35b) may be derived as follows.
Assuming the process is noiseless, we may write

$P_p(f) = \frac{1}{N}\sum_{i=1}^{N} |G_i(f) \cdot H(f) \cdot \beta_i(f)|^2$    (E1)

and

Because $\beta_i^2(f)$ is different, in general, for each i, $P_p(f)$ is no
longer the sum of identically distributed $\chi_2^2$ random variables,
and cannot be considered to be distributed as $\chi_{2N}^2$. Exact
computation of the statistics of $\hat{P}_p(f) = \frac{1}{2} \log P_p(f)$ yields
open-form results which are not readily applied. The following
approximation, however, has been used with empirical success.
We will assume that $P_p(f)$ is approximately chi-square with
2K degrees of freedom. Using the concept of equivalent degrees
of freedom (EDF), as introduced by others [24], [25],
we have

$2K = \mathrm{EDF}\{P_p(f)\} = 2 \cdot E^2\{P_p(f)\} / \operatorname{var}\{P_p(f)\}.$    (E3)

$\mathrm{EDF}\{P_p(f)\}$ may be evaluated by noting that

and

At this point, it should be explicitly noted that, for each
experiment, we consider the $\beta_i(f)$'s to be deterministic. Thus
we have
Note that if all the $\beta_i(f)$'s are identical, then (E6) becomes
K = N as expected for a stationary process.
Using this approximation with the results of Appendix A,
we have, as desired,

$\operatorname{var}\{\hat{P}_p(f)\} = (\frac{1}{2})^2 \psi'(K) \approx 1/(4K)$

where the last terms in each equation are valid for large N.

ACKNOWLEDGMENT

The authors wish to thank the people who have helped along
the course of research leading to the ideas presented here.
They are grateful for the contributions of G. Randall, R. B.
Warnock, M. Milochik, K. Gerber, N. J. Miller, R. Rom, E.
Ferretti, and many others who have given encouragement,
interest, criticism, and ideas. Special thanks are due D. W.
Evans of Salt Lake City, Utah, for the acoustic restoration
idea, and S. B. Fassett of Boston, Mass., for his deep insight
into and interest in the technical, musical, and historic aspects
of acoustic recordings.

REFERENCES

[1] J. Makhoul, "Linear prediction: A tutorial review," this issue, pp. 561-580.
[2] R. W. Schafer and L. R. Rabiner, "Digital representations of speech signals," this issue, pp. 662-677.
[3] L. C. Wood and S. Treitel, "Seismic signal processing," this issue, pp. 649-661.
[4] B. R. Hunt, "Digital image processing," this issue, pp. 693-708.
[5] R. B. Smith and R. M. Otis, "Homomorphic deconvolution by log spectral averaging," submitted for publication to Geophysics.
[6] T. G. Stockham, Jr., "A-D and D-A converters: Their effect on digital audio fidelity," in Digital Signal Processing, L. R. Rabiner and C. M. Rader, Eds. New York: IEEE Press, 1972, pp. 484-496.
[7] S. Kriz, "A 16-bit A-D-A conversion system for high fidelity audio research," in Proc. IEEE Symp. Speech Recognition, pp. 278-282, Apr. 1974.
[8] C.-M. Tsai, "A digital technique for testing A-D and D-A converters," M.S. thesis, Univ. Utah, Salt Lake City, June 1973.
[9] O. Read and W. L. Welch, From Tin Foil to Stereo. Indianapolis, Ind.: Howard W. Sams, 1959.
[10] A. V. Oppenheim, R. W. Schafer, and T. G. Stockham, Jr., "Nonlinear filtering of multiplied and convolved signals," Proc. IEEE, vol. 56, pp. 1264-1291, Aug. 1968.
[11] M. Medress, "Noise analysis of a homomorphic automatic volume control," S.M. thesis, Dep. Elec. Eng., M.I.T., Cambridge, Mass., Jan. 1968.
[12] R. W. Schafer, "Echo removal by discrete generalized linear filtering," Tech. Rep. 466, M.I.T. Res. Lab. Electron., Cambridge, Mass., Feb. 1969.
[13] J. L. Goldstein, "Auditory spectral filtering and monaural phase perception," J. Acoust. Soc. Amer., vol. 41, no. 2, pp. 458-479, 1967.
[14] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969, pp. 203-232.
[15] H. D. Helms, "Nonrecursive digital filters: Design methods for achieving specifications on frequency response," IEEE Trans. Audio Electroacoust., vol. AU-16, pp. 336-342, Sept. 1968.
[16] J. L. Flanagan, Speech Analysis Synthesis and Perception, 2nd ed. New York: Springer-Verlag, 1972, pp. 321-395.
[17] R. B. Ingebretsen, "Log spectral estimation for stationary and nonstationary processes," M.S. thesis, Comput. Sci. Dep., Univ. Utah, Salt Lake City, 1975.
[18] E. R. Cole, "The removal of unknown image blurs by homomorphic filtering," Comput. Sci. Dep., Univ. Utah, Salt Lake City, UTEC-CSc-74-029, June 1973.
[19] T. M. Cannon, "Digital image deblurring by nonlinear homomorphic filtering," Comput. Sci. Dep., Univ. Utah, Salt Lake City, UTEC-CSc-74-091, Aug. 1974.
[20] J. F. Walkup and R. C. Choens, "Image processing in signal-dependent noise," Opt. Eng., vol. 13, no. 3, May/June 1974.
[21] E. Parzen, Modern Probability Theory and Its Applications. New York: Wiley, 1960, p. 312.
[22] M. S. Bartlett and D. G. Kendall, "The statistical analysis of variance heterogeneity and the logarithmic transformation," J. Roy. Statist. Soc. (Suppl.), vol. 8, pp. 128-138, 1946.
[23] CRC Standard Mathematical Tables, 22nd ed., Samuel M. Selby, Ed. Cleveland, Ohio: CRC Press, 1973, p. 483.
[24] R. B. Blackman and J. W. Tukey, The Measurement of Power Spectra. New York: Dover, 1958, p. 22.
[25] P. D. Welch, "The use of fast Fourier transform for the estimation of power spectra: A method based on time averages over short, modified periodograms," IEEE Trans. Audio Electroacoust., vol. AU-15, pp. 70-73, June 1967.
