
Speech processing

ECE 5525

Spectral subtraction algorithm and optimization

Wanfeng Zou

7/3/2014

Abstract

Language is the most important, direct, effective and convenient means of
information exchange. With the rapid development of science and technology in
recent years, people are no longer satisfied with the existing ways of exchanging
information with computers; they hope to get rid of the keyboard and the mouse and
to control the computer by speech. Speech signal processing technology emerged to
meet this need. It is an emerging discipline, and also an interdisciplinary one
that draws on many fields and covers a very wide range. Some speech signal
processing systems are now embedded in intelligent systems, but they work well only
in quiet environments, while speech acquisition is inevitably subject to various
kinds of noise interference. Noise not only reduces speech intelligibility and
voice quality, it also degrades the accuracy of speech processing and can even
prevent the system from working properly. This paper discusses the principles and
methods of speech enhancement and mainly introduces one method, the spectral
subtraction algorithm, together with an improved version of it. The basic method
can effectively eliminate stationary additive noise; the improved algorithm can
effectively reduce the "musical noise" that the basic method produces and
noticeably improves the speech signal-to-noise ratio.

Keywords: speech signal processing; speech enhancement; spectral subtraction
algorithm; improved algorithm

Summary

A speech enhancement algorithm was developed based on spectral subtraction to
reduce the disturbances of noise on speech communications. The algorithm uses a
Gaussian statistical model to revise the noise spectrum estimate for the speech
enhancement. The algorithm then uses a simple method to compute the presence
probability of speech in each frequency bin to enhance the speech signal.

Experimental Tools

MATLAB is a high-level language and interactive environment for numerical
computation, visualization, and programming. Using MATLAB, you can analyze data,
develop algorithms, and create models and applications.

Experimental objects

First, we use the Windows recorder software to record a clean speech signal in
'wav' format. Next, using MATLAB, we add a sine-wave noise signal (amplitude 0.5,
frequency 1000 Hz) to the clean speech to obtain a new audio file.

Code:

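clc; clear;
% Read the clean recording.
[x, fs] = audioread('11.wav');
x1 = x(:, 1);                        % first channel only
fn = 1000;                           % tone frequency in Hz
t = (1:length(x1))';                 % sample indices as a column
x2 = 0.5 * sin(2*pi*fn/fs * t);      % sine-wave noise, amplitude 0.5
y = x1 + x2;                         % noisy speech
audiowrite('12.wav', y, fs);         % samples outside [-1, 1] are clipped on write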
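A quick way to confirm that the tone has the intended frequency and amplitude is to look for the strongest bin in the spectrum of the noise component. The sketch below is only an illustration: the sample rate of 8000 Hz and the one-second buffer are assumed values, not taken from the recording.

```matlab
fs = 8000;                          % assumed sample rate (illustrative)
fn = 1000;                          % tone frequency used in the text
t = (1:fs)';                        % one second of sample indices
x2 = 0.5 * sin(2*pi*fn/fs * t);     % same noise formula as in the listing
X2 = abs(fft(x2));
[~, k] = max(X2(1:fs/2));           % strongest bin in the lower half
peakHz = (k - 1) * fs / length(x2); % convert bin index to Hz
peakAmp = 2 * X2(k) / length(x2);   % recover the sine amplitude
```

With these assumed values the tone falls exactly on a bin, so peakHz and peakAmp should match the 1000 Hz and 0.5 used to generate the noise.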

General Spectral subtraction algorithm

Among the many speech enhancement methods, spectral subtraction is one of the most
popular because it is easy to implement and requires little computation. It came
into use around 1980 and has become an effective speech enhancement algorithm.

Basic spectral subtraction assumes that the speech signal is stationary over a
short frame and that the noise is additive and uncorrelated with the speech. The
noisy speech signal can then be expressed as:

$$y(t) = s(t) + n(t)$$

where y(t) is the noisy speech signal, s(t) is the clean speech signal, and n(t) is
the noise signal. Letting Y(w), S(w) and N(w) represent the Fourier transforms of
y(t), s(t) and n(t) gives the following relationship:

$$Y(w) = S(w) + N(w)$$

Because s(t) and n(t) are independent, S(w) and N(w) are also independent, so that
on average the cross term vanishes:

$$E[S(w)N^*(w)] = 0 \quad \text{and} \quad |Y(w)|^2 = |S(w)|^2 + |N(w)|^2$$

The noise can be estimated from the "silent frames" that precede the speech.

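The original speech spectrum can then be estimated with the formula:

$$|\hat{S}(w)|^2 = |Y(w)|^2 - E[|N(w)|^2]$$

where the hat denotes an estimated value and $E[|N(w)|^2]$ is the noise power
averaged over the speech-free frames.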
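The text calls for an average over the speech-free frames, whereas the listing that follows uses only the first frame. A minimal sketch of an averaged estimate, assuming the first numSilent frames contain only noise; numSilent, the sample rate and the synthetic test signal are illustrative choices, not values from this report:

```matlab
fs = 8000;                                  % assumed sample rate
n = 350;                                    % frame length, as in the report
numSilent = 5;                              % assumed number of noise-only frames
t = (1:numSilent*n)';
x = 0.5 * sin(2*pi*1000/fs * t);            % synthetic noise-only lead-in
frames = reshape(x(1:numSilent*n), n, numSilent);   % one frame per column
noisePow = mean(abs(fft(frames)).^2, 2);    % E[|N(w)|^2] per frequency bin
```

Averaging over several frames smooths the estimate, which may reduce the random residual peaks that a single-frame estimate leaves behind.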

If the subtraction gives a negative result, it is set to 0 or its sign is changed
to positive, because a power spectrum cannot be negative.
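The two ways of handling negative results can be compared on a few bins. The values below are made up purely for illustration:

```matlab
powY = [4; 1; 9];                    % |Y(w)|^2 for three bins (made-up values)
powN = [1; 4; 1];                    % E[|N(w)|^2] for the same bins (made-up)
d = powY - powN;                     % raw difference: [3; -3; 8]
halfWave = max(d, 0);                % negative results set to 0
fullWave = abs(d);                   % negative results change sign
magHalf = sqrt(halfWave);            % back to magnitude
magFull = sqrt(fullWave);
```

Only the second bin differs: setting it to 0 silences the bin, while flipping the sign keeps some energy there.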

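So we can get the estimate of the original speech by combining the estimated
magnitude with the phase of the noisy spectrum, $\varphi_Y(w)$:

$$\hat{s}(t) = \mathrm{IFFT}\left[\,|\hat{S}(w)|\,e^{j\varphi_Y(w)}\right]$$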

Code:

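clc; clear;
% Read the noisy recording.
[x, fs] = audioread('12.wav');
x = x(:, 1);                           % first channel only
n = 350;                               % frame length in samples
% Noise magnitude estimate from the first frame, assumed to contain no speech.
magY = abs(fft(x(1:n)));
numFrames = floor(length(x) / n);      % process only complete frames
x2 = zeros(numFrames * n, 1);          % preallocate the output
for i = 0:numFrames-1
    idx = 1+n*i : n+n*i;
    X1 = fft(x(idx));
    magX = abs(X1);
    S = magX.^2 - magY.^2;             % power spectral subtraction
    S1 = abs(S).^0.5;                  % flip negative values, back to magnitude
    % Recombine with the noisy phase before the inverse transform.
    x2(idx) = real(ifft(S1 .* exp(1i*angle(X1))));
end
plot(x2);
sound(x2, fs);
audiowrite('13.wav', x2, fs);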

Fig 13.wav

Improved spectral subtraction algorithm

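In fact, the noise spectrum follows a Gaussian distribution:

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-m)^2}{2\sigma^2}\right)$$

where m is the mean of x and $\sigma$ is the standard deviation.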

Therefore, after basic spectral subtraction has removed the noise, some residual
components with relatively large power remain and appear as randomly located
spikes in the spectrum. After the inverse Fourier transform, these spikes form a
new, rhythmically fluctuating noise (musical noise) in the enhanced speech, and
basic spectral subtraction cannot remove this kind of noise.

To minimize the secondary pollution of the speech caused by "musical noise"
(rhythmically fluctuating noise), spectral subtraction can be improved. In noisy
speech, the speech energy is generally concentrated in certain frequencies or
frequency bands, while the noise energy is usually spread over the entire
frequency range. Therefore, more noise can be removed in the frames with higher
amplitude.

Subtracting a scaled noise estimate, $\beta\,E[|N(w)|^2]$, will highlight the
speech power spectrum.

In addition, there is a second improvement, which modifies how the power spectrum
is processed: $|Y(w)|^2$ and $|N(w)|^2$ are changed to $|Y(w)|^\alpha$ and
$|N(w)|^\alpha$.

Combining these two improvements, the enhanced form of spectral subtraction can be
expressed as:

$$|\hat{S}(w)|^\alpha = |Y(w)|^\alpha - \beta\,E[|N(w)|^\alpha]$$

When $\alpha = 2$ and $\beta = 1$, this is general spectral subtraction. $\alpha$
is the spectral subtraction correction factor; changing its value can further
enhance the signal-to-noise ratio. $\beta$ is the spectral subtraction noise
coefficient; its role is to reduce the noise power spectrum, and modifying it
serves to reduce noise and highlight the speech spectrum.
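The role of the two parameters can be sketched as a one-line function. This is a minimal illustration, assuming the enhanced form $|\hat{S}|^\alpha = |Y|^\alpha - \beta|N|^\alpha$ with negative results flipped to positive; the name specsub and the magnitude values are hypothetical.

```matlab
% Generalized subtraction for one frame; alpha is the exponent, beta the noise coefficient.
specsub = @(magX, magN, alpha, beta) abs(magX.^alpha - beta*magN.^alpha).^(1/alpha);

magX = [3; 1; 5];                    % noisy magnitudes (made-up values)
magN = [1; 2; 4];                    % noise magnitudes (made-up values)
basic = abs(magX.^2 - magN.^2).^0.5; % basic power subtraction
general = specsub(magX, magN, 2, 1); % alpha = 2, beta = 1
softer = specsub(magX, magN, 0.4, 0.5);
```

With alpha = 2 and beta = 1 the function reproduces the basic result, which is a convenient sanity check before trying other parameter values.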

Code:

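clc; clear;
% Read the noisy recording.
[x, fs] = audioread('12.wav');
x = x(:, 1);                           % first channel only
n = 350;                               % frame length in samples
% Noise magnitude estimate from the first frame, assumed to contain no speech.
magY = abs(fft(x(1:n)));
a = 0.4; b = 0.5;                      % exponent and noise coefficient
numFrames = floor(length(x) / n);      % process only complete frames
x2 = zeros(numFrames * n, 1);          % preallocate the output
for i = 0:numFrames-1
    idx = 1+n*i : n+n*i;
    X1 = fft(x(idx));
    magX = abs(X1);
    S = magX.^a - b*magY.^a;           % generalized spectral subtraction
    S1 = abs(S).^(1/a);                % flip negative values, back to magnitude
    % Recombine with the noisy phase before the inverse transform.
    s1 = real(ifft(S1 .* exp(1i*angle(X1))));
    m = abs(mean(s1)) * 300;           % amplitude threshold for this frame
    peaks = abs(s1) > m;
    s1(peaks) = s1(peaks) / 4;         % attenuate samples above the threshold
    x2(idx) = s1;
end
plot(x2);
sound(x2, fs);
audiowrite('14.wav', x2, fs);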

Fig 14.wav

Conclusion

Comparing Fig 13.wav and Fig 14.wav, we can see that the speech waveform is clearly
improved, and we can also hear less musical noise. However, eliminating musical
noise inevitably attenuates the speech as well. Many experiments show that
modifying $\alpha$ further enhances the signal-to-noise ratio and that changing
the coefficient $\beta$ serves to reduce noise and highlight the speech spectrum.
However, values of $\alpha$ and $\beta$ that are too large cause speech
distortion. The results show that the improved algorithm eliminates musical noise
more effectively and improves the signal-to-noise ratio without significantly
impairing speech intelligibility.
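The signal-to-noise ratio can be measured directly when the clean signal is available, as it is in this experiment. A minimal sketch, assuming the clean and processed signals are time-aligned and of equal length; the helper name snrdb, the sample rate and the synthetic signals standing in for the recordings are all illustrative.

```matlab
% SNR in dB of a processed signal relative to a known clean reference.
snrdb = @(clean, proc) 10*log10(sum(clean.^2) / sum((proc - clean).^2));

fs = 8000;                                   % assumed sample rate
t = (1:fs)';
clean = sin(2*pi*200/fs * t);                % stand-in for the clean speech
noisy = clean + 0.5 * sin(2*pi*1000/fs * t); % tone noise as in the experiment
snrIn = snrdb(clean, noisy);                 % SNR before enhancement
```

Applying the same function to the enhanced output would give the SNR after enhancement, so the two algorithms can be compared with numbers as well as waveforms.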


