
30th Annual International IEEE EMBS Conference Vancouver, British Columbia, Canada, August 20-24, 2008

Neural Network Based EEG Denoising


Yongjian Chen, Masatake Akutagawa, Masato Katayama, Qinyu Zhang and Yohsuke Kinouchi
Abstract: A novel filter is proposed by applying a back propagation neural network (BPNN) ensemble in which the noisy signal and the reference signal are the same during learning. This neural network (NN) ensemble filter not only reduces additive and multiplicative white noise within signals effectively, but also preserves signal characteristics. The reduction of noise by the NN ensemble filter is shown to be better than that of an improved nonlinear filter and a single NN filter when the signal-to-noise ratio is low. The performance of the NN ensemble filter is demonstrated in computer simulations and on actual electroencephalogram (EEG) signals.

I. INTRODUCTION

In this paper, the reduction of white noise consisting of additive and multiplicative (A&M) components is investigated, with a focus on biomedical signals, especially electroencephalogram (EEG) signals. EEG signals consist of a set of electric potential differences between pairs of scalp electrodes and are usually small in amplitude, of the order of 20 μV. The noise that reduces the clinical usefulness of EEG is mostly caused by the gap between electrodes and scalp and by the recording apparatus itself. Moreover, EEG is often contaminated by a variety of bioelectric signals called artifacts, e.g. the electromyogram (EMG) and the electrooculogram (EOG). The traditional method of ensemble averaging (EA) has been widely used to extract evoked potentials (EP) from noisy background EEG. It is based on the assumptions that the EEG signals evoked by the stimulus in each trial are the same, and that the background EEG activity is random and uncorrelated with the EP. However, the EA method has shortcomings, because the components of a single-trial EP are not strictly locked to the stimulus and may change in both amplitude and latency across trials in practice [1]. Because the improvement of EA in signal-to-noise ratio (SNR) is proportional to the square root of the number of trials averaged, commonly more than 100 trials are required for EP measurement due to its poor SNR [2]. As a result the averaged signal tends to erase or smooth variations of the EP, and the EA method may sometimes fail to track trial-to-trial variations in latency and amplitude. Therefore new methods to enhance the SNR of EP signals are desirable.

Various techniques have been developed for noise reduction. Additive noise can be reduced by many methods, e.g. linear prediction (Wiener) filters [3], wavelet filters [4], subspace decomposition [5], and independent component analysis (ICA) [6]; however, the first three hardly suppress multiplicative noise or noise at low SNR, and the performance of a linear filter is commonly lower than that of a nonlinear one. For multiplicative noise, ICA is most often applied, especially in digital image processing, but by itself it hardly reduces both kinds of noise effectively. Nonlinear adaptive noise cancellation techniques [6-8] have been applied to a variety of biomedical signal processing problems. Among them, neural networks (NN) have been used for noise reduction, but the training of such a filter requires examples of the noisy input signal together with the corresponding noise-free desired signal, or at least strongly correlated signals [7-8]. In practice, however, the desired signal is usually not available. What, then, does a back propagation neural network (BPNN) output if white noise is fed into it? Based on the BPNN algorithm, it must output the mean of the noise. Because the mean of white noise is zero,

\bar{n} = \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^{m} n(i) = 0    (1)

where n(·) is a sample of the white noise. Since the samples of white noise are independent of each other, the output should be zero, i.e., the BPNN forecast of white noise is zero. On the other hand, signals and white noise are independent of each other, and a BPNN can estimate a signal precisely. So if a signal with white noise is fed into a BPNN, it should drive the noise toward zero and estimate the signal precisely at the same time. Based on this, a novel filter is proposed using a BPNN ensemble, where the noisy signal and the reference signal are the same in the learning process. It can suppress both additive and multiplicative noise effectively when the desired signal is unknown beforehand.

II. MODELING

A. Filter Modeling
In this paper a BPNN is used to suppress the noise. The BPNN is based on the autoregressive moving average prediction model [9]:

[\,y(n), \ldots, y(n+L-1)\,] = f\{x(n-1), x(n-2), \ldots, x(n-K)\}    (2)

where x(·) is the input, y(·) is the output, K is the number of nodes in the input layer, and L is the number of nodes in the output layer. The input series is [x(1), x(2), …, x(n), x(n+1), …, x(n+K), …, x(N-1), x(N)].
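As a concrete illustration of this prediction-based scheme, the sketch below (in Python rather than the paper's Matlab, with illustrative network sizes, learning rate, and epoch count, not the paper's tuned values) trains a small one-hidden-layer back propagation predictor on a noisy sinusoid, using the noisy samples themselves as the reference:

```python
import numpy as np

# Minimal sketch (illustrative parameters, not the paper's exact network):
# a one-hidden-layer back propagation network predicts x(n) from the K
# previous samples, with the SAME noisy signal used as both input and
# reference.
rng = np.random.default_rng(0)
N, K, H = 4000, 5, 20                          # samples, input taps, hidden nodes
t = np.arange(N)
clean = np.sin(2 * np.pi * t / 50)             # signal source
noisy = clean + 0.5 * rng.standard_normal(N)   # additive white noise

X = np.stack([noisy[i:i + K] for i in range(N - K)])  # K past samples ...
d = noisy[K:]                                         # ... predict the next one

# Plain batch-mode back propagation on a K-H-1 network.
W1 = 0.3 * rng.standard_normal((K, H)); b1 = np.zeros(H)
W2 = 0.3 * rng.standard_normal(H);      b2 = 0.0
lr = 0.1
for epoch in range(800):
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    y = h @ W2 + b2                           # network output
    e = y - d                                 # error against the noisy reference
    gW2 = h.T @ e / len(d); gb2 = e.mean()
    gh = np.outer(e, W2) * (1.0 - h ** 2)     # back-propagated hidden error
    gW1 = X.T @ gh / len(d); gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

y = np.tanh(X @ W1 + b1) @ W2 + b2
print(np.std(noisy[K:] - clean[K:]), np.std(y - clean[K:]))
```

Because the white noise is zero-mean and unpredictable from the past K samples, the trained predictor's output tracks the sinusoid while the residual noise shrinks.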

Manuscript received April 7, 2008. Yongjian Chen is with Graduate School of Advanced Technology and Science, The University of Tokushima, Tokushima, Japan (phone: 81-090-6287-8886; e-mail: cyj622@ee.tokushima-u.ac.jp). Masatake Akutagawa is with The University of Tokushima, Tokushima, Japan (phone: 81-088-656-7475; e-mail: makutaga@ee.tokushima-u.ac.jp). Masato Katayama is with Graduate School of Advanced Technology and Science, The University of Tokushima, Tokushima, Japan (phone: 81-088-656-7476; e-mail: zmm76933@ee.tokushima-u.ac.jp). Qinyu Zhang is with Harbin Institute of Technology, Shenzhen, China (phone: 86-755-2603-3786; e-mail: zqy@hit.edu.cn). Yohsuke Kinouchi is with The University of Tokushima, Tokushima, Japan (phone: 81-088-656-7475; e-mail: kinouchi@ee.tokushima-u.ac.jp).

978-1-4244-1815-2/08/$25.00 2008 IEEE.


Fig.1. The BPNN ensemble filter. The symbol z^{-1} is the delay operator, d is the reference signal, and e is the error signal. Here [x(n), x(n+1), …, x(n+K-1)] is the n-th moving input of the NN.

Based on this model, a BPNN ensemble filter is designed as shown in Fig.1. In each iteration, a series of data x_{n-1}, x_{n-2}, …, x_{n-K} from the original signal is fed into the input layer, x_n is the reference, and a corresponding output y_n is produced by the neural network filter. The incremental mode [10] is used in the first NN, and the batch mode [10] is used in the second. In the first NN, both the input signal and the reference are the original x_n; in the second NN, both are the output of the first. In batch mode, the gradients and errors calculated at each training example are added together to determine the change in the weights and biases; the connection weights of the filter are then updated once by the back propagation algorithm after each epoch. As the iterations proceed from epoch to epoch, the filter adapts to the behavior of the original signal.

B. Algorithm Selection
Steepest descent algorithms [11] adapt the weights of a discrete time dynamical system by minimizing an error function. The batch mode is an epochwise training algorithm. An epoch is an iteration-to-iteration cycling of a discrete time dynamical system from the initial iteration (n=0) to the final iteration (n=n_e). An epochwise training algorithm is any algorithm in which training takes place after each epoch or after a series of epochs. If the filter is required to drive the plant to a set of desired states \{d_0, \ldots, d_{n_e}\}, then the error function is
E = \sum_{n=0}^{n_e} \frac{1}{2} (d_n - y_n)^T (d_n - y_n).    (3)

The ordered partial derivatives of the error with respect to the outputs, defined as \lambda_n = \partial^{+} E / \partial y_n, are called the Lagrange multipliers and are used to simplify the calculation of the error gradient. They are computed by cycling backwards from iteration n = n_e - 1 to n = 0:

\lambda_n = \frac{\partial E}{\partial y_n} + \sum_{j=1}^{L} \lambda_{n+j} \frac{\partial y_{n+j}}{\partial y_n}, \quad n = n_e - 1, \ldots, 0    (4)

where L is the number of outputs. The following two equations establish the initial conditions for (4):

\lambda_{n_e} = \frac{\partial E}{\partial y_{n_e}}, \quad \lambda_{n_e+1} = \cdots = \lambda_{n_e+L} = 0.    (5)

Given the Lagrange multipliers, the ordered partial derivative of the error with respect to a weight is calculated by

\frac{\partial^{+} E}{\partial w(i)} = \sum_{n=0}^{n_e} \lambda_n \frac{\partial y_n}{\partial w_n(i)}.    (6)

The term \partial y_n / \partial w_n(i) is the derivative of the output at iteration n with respect to one of the weights at iteration n. The index i denotes the number of times the weight vector has been updated using a steepest descent rule. Utilizing steepest descent, epochwise algorithms update the weights using

w(i+1) = w(i) - \eta \frac{\partial^{+} E}{\partial w(i)},    (7)

where \eta, the learning rate, is a chosen positive constant.

C. Signal Model
Because

x' = \sum_{k=0}^{m} a_k \sin\!\left( (2k+1)\, 2\pi i / (f_s T) \right),

where each a_k is a constant, is a standard signal representation, it is used to generate the signal source; here T is the period, f_s is the sampling frequency, and i is the index of the sampling point.

D. Noise Model
The noise components N_i, i = 1, 2, …, n, used as the noise source for the simulated A&M noises, are independent and identically distributed zero-mean random variables with a common variance E[N_i^2]. With f_N(x) as the symmetric marginal probability density function (pdf) of N_i, the following noises are considered here:
1) Gaussian pdf, f_N(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-x^2/(2\sigma^2)};
2) Double exponential pdf, f_N(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}.

E. Model Parameter Estimation
To determine the optimum network size, the standard deviation of the filtered noise has been evaluated. It is

\sigma_n = \sqrt{ \sum_k n_k^2 / (N-1) }    (8)

where n_k is the noise value at the k-th sample point and N is the number of training data. Based on simulations, 5 nodes in the input layer and 1 node in the output layer of each NN is optimum. The size of each network in Fig.1 is then determined by the number of neurons in its hidden layer; if networks of different sizes give almost the same standard deviation of filtered noise, the smallest size is optimum. In order to determine the



optimum size, the standard deviation of the filtered noise has been evaluated at every SNR; the case of SNR = 0 dB is illustrated in Fig.2. (The standard deviation of the original noise before filtering is 0.707.) Here, the standard deviation of the filtered noise is examined for different numbers of connections in the neural network ensemble and for the ratio R (R = H1/H2, where H1 and H2 denote the number of hidden-layer nodes of the first and second NN respectively). The number of training data is 10,000, because the deviation of the filtered noise stays at almost the same small value beyond 10,000 data. After training, the standard deviation of the filtered noise is minimal for the neural network ensemble with 180 connections, as shown in Fig.2(a), so 180 connections were chosen. As shown in Fig.2(b), when the total number of connections is 180, the filter with ratio R = 2 gives the smallest standard deviation of filtered noise, and the accuracy is almost the same for R ≥ 2. Therefore R = 2 is chosen, i.e., the NN ensemble structure of 5*20*1 and 5*10*1 is used in this study.

Fig.2. (a) Relationship between the number of NN connections and noise reduction. (b) Relationship between the ratio R and noise reduction.

Furthermore, the reduction of noise commonly changes by less than 10% across the different NN structures in the above simulations, so even if the optimum NN structure is unknown in practice, the degradation of the noise reduction capability remains small.

III. SIMULATION AND DISCUSSION

All computer simulations were carried out in Matlab.

A. Additive and Multiplicative Noise Removal
In this section the reduction of both additive and multiplicative noise is investigated. The primary signal x containing both noises is generated by

x = x'(1 + n_m) + n_a    (9)

where x' is the signal source, n_a is the additive noise, and n_m is the multiplicative noise. Using the method above, the result is illustrated in Fig.3.

Fig.3. (a) The sinusoidal signal with additive and multiplicative noise of equal power. The SNR is 0 dB, and the standard deviation of the noise is 0.707. (b) The output of the filter; the standard deviation of the remaining noise is 0.241.

B. Improvement of SNR
In this section the reduction of white noise at different SNRs is investigated. The noise reduction rate, the ratio of the original noise power (before filtering) to the filtered noise power, is denoted as

\mathrm{Rate} = 10 \log_{10}(N/n)    (10)

where N is the power of the original noise and n is the power of the filtered noise; the original SNR is \mathrm{SNR}_0 = 10 \log_{10}(S/N), with S the power of the signal. As shown in Fig.4, when the original SNR is low, the reduction of noise by the NN ensemble filter is better than that of the previously proposed improved nonlinear filter [12] and of an incremental-mode NN filter consisting of a single NN. This demonstrates that the degradation of the noise reduction capability of the NN ensemble caused by increasing noise power is much smaller than that of the improved nonlinear filter.

Fig.4. The SNR improvement of the three filters. The noise consists of additive and multiplicative noise of equal power.
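The generation step of eq. (9) can be sketched as follows (a Python stand-in for the Matlab simulation; the component variances are illustrative choices that reproduce the stated 0 dB case with a total noise standard deviation near 0.707):

```python
import numpy as np

# Sketch of the test signal of eq. (9): a sinusoid corrupted by additive
# and multiplicative white noise of equal power. The variances below are
# illustrative, chosen so that the total noise power equals the signal
# power (SNR = 0 dB).
rng = np.random.default_rng(1)
n = 10_000
x_clean = np.sin(2 * np.pi * np.arange(n) / 50)   # signal source x'
n_a = rng.normal(0.0, 0.5, n)             # additive noise, power 0.25
n_m = rng.normal(0.0, np.sqrt(0.5), n)    # multiplicative noise; since the
                                          # signal power is 0.5, x'*n_m also
                                          # contributes power 0.25
x = x_clean * (1.0 + n_m) + n_a           # eq. (9)
noise = x - x_clean
snr_db = 10 * np.log10(x_clean.var() / noise.var())
print(snr_db)
```

The total noise power (0.25 additive plus 0.25 multiplicative) equals the signal power, so the measured SNR comes out close to 0 dB.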

C. Noise Removal in Clinical EEG Signals
The filter is tested on the reduction of noise in normal EEG and EMG. There are three groups of data, each 1 minute long. The first is the normal EEG of a 24-year-old male; the second is the normal EEG with EMG, recorded under the same experimental conditions except that the subject gnashed his teeth; and the third is the EMG of the right forearm. The sampling frequency of the signals is 1000 Hz, and 10,000 samples (10 seconds) of each signal are used. Each of them is fed into the filter separately, and there are differences between the output of the filter and the original signals, as shown in Fig.5. The differences, whose means are about 0, are the noise removed from the original signals in Fig.5(a) and Fig.5(b), and Fig.5(c) shows that the filter can reduce the EMG to nearly 0. The frequency spectrum of the filter output is shown in Fig.6. It is worth noting that the output spectra of the EEG with EMG and of the EEG alone are almost the same in Fig.6(a) and Fig.6(b); the EMG has been reduced. If the single NN in incremental mode is used to remove the noise in the original EEG of Fig.5(a), the result is as shown in Fig.7. Compared with Fig.6(a), the NN ensemble filter removes more noise than the single NN filter in incremental mode. The nervous system, responsible for the activation of muscle fibers, conducts electrical pulses from the brain to the muscles. During voluntary muscle contractions, whole sequences of electrical pulses, so-called innervation pulse trains (IPT), are transferred by the same motoneuron. Because the motor units' IPTs are distributed independently according to a Poisson distribution, and EMG is a shot noise, the filter can reduce the EMG signal.
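The spectrum comparison behind Fig.6 can be sketched as follows, using synthetic stand-ins for the recordings (a 10 Hz rhythm for the EEG and broadband white noise for the EMG; the actual clinical data are not reproduced here), with the paper's 10 s records sampled at 1000 Hz:

```python
import numpy as np

# Sketch of the Fig.6-style spectrum computation on synthetic stand-ins:
# a narrowband "EEG-like" rhythm plus a broadband "EMG-like" noise term.
rng = np.random.default_rng(2)
fs = 1000                                        # sampling frequency (Hz)
t = np.arange(10 * fs) / fs                      # 10 s of data
eeg_like = np.sin(2 * np.pi * 10 * t)            # 10 Hz alpha-band stand-in
emg_like = 0.5 * rng.standard_normal(t.size)     # broadband EMG stand-in
x = eeg_like + emg_like

spec = np.abs(np.fft.rfft(x)) / t.size           # one-sided amplitude spectrum
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
peak_hz = freqs[np.argmax(spec)]
print(peak_hz)
```

The narrowband component concentrates its power in a single bin, so the spectral peak sits at the 10 Hz rhythm even though the broadband term has comparable total power; this is the same reason the EEG peaks remain visible in Fig.6 while the EMG spreads across the band.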


Fig.5. The 1000-sample (1 second) output of the filter. (a) Output of the filter; the original signal is the normal EEG with EMG in the Pz channel. (b) The original signal is the normal EEG in the Pz channel. (c) The original signal is the EMG of the right forearm.

Fig.6. (a) The frequency spectrum of the output in Fig.5(a). (b) The frequency spectrum of the output in Fig.5(b).

Fig.7. The single NN in incremental mode is used to remove the noise in the original EEG of Fig.5(a); the frequency spectrum of the output is shown.

IV. CONCLUSION

In this paper it has been shown that the BPNN ensemble filter can not only reduce random additive and multiplicative white noise within signals but also preserve their characteristics. It is notable that the noisy signal and the reference signal are the same in the learning process, and that even if the optimum NN structure is not known in practice, the degradation of the noise reduction capability remains small. It has been demonstrated that the reduction of noise by the NN ensemble filter is better than that of the improved nonlinear filter and the single NN filter when the SNR is low. For clinical data, it has been shown that the filter can reduce noise and EMG in the normal EEG.

REFERENCES

[1] D. Regan, Human Brain Electrophysiology: Evoked Potentials and Evoked Magnetic Fields in Science and Medicine. New York: Elsevier, 1989.
[2] C. D. McGillen, J. I. Aunon, and K. B. Yu, "Signals and noise in evoked brain potentials," IEEE Transactions on Biomedical Engineering, vol. BME-32, pp. 1012-1016, 1985.
[3] T. Fechner, "Nonlinear noise filtering with neural networks: comparison with Wiener optimal filtering," in Proc. Third International Conference on Artificial Neural Networks, IEE Conference Publication no. 372, p. 143, 1993.
[4] N. Cai, J. Cheng, and J. Yang, "Applying a wavelet neural network to impulse noise removal," in Proc. 2005 IEEE International Conference on Neural Networks and Brain, pp. 781-783, 2005.
[5] J. Gu and J. Yang, "Speckle filtering in polarimetric SAR data based on the subspace decomposition," IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 8, pp. 1635-1641, Aug. 2004.
[6] M. Haritopoulos, H. Yin, and N. M. Allinson, "Image denoising using self-organizing map-based nonlinear independent component analysis," Neural Networks, vol. 15, pp. 1085-1098, 2002.
[7] S. Selvan and R. Srinivasan, "Removal of ocular artifacts from EEG using an efficient neural network based adaptive filtering technique," IEEE Signal Processing Letters, vol. 6, no. 12, pp. 330-332, Dec. 1999.
[8] C. J. James and M. T. Hagan, "Multireference adaptive noise canceling applied to the EEG," IEEE Transactions on Biomedical Engineering, vol. 44, no. 8, pp. 775-779, Aug. 1997.
[9] X. Feng and J. Y. Schulteis, "Identification of high noise time series signals using hybrid ARMA modeling and neural network approach," in Proc. IEEE Int. Conf. Neural Networks, 1993, pp. 1780-1785.
[10] BPNN toolbox, The MathWorks, Natick, Massachusetts, USA. Available: www.mathworks.com/access/helpdesk/help/toolbox/nnet.
[11] S. W. Piche, "Steepest descent algorithms for neural network controllers and filters," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 198-212, 1994.
[12] H. Harashima, K. Odajima, Y. Shishikui, and H. Miyakawa, "ε-separating nonlinear digital filter and its applications," Trans. IEICE, vol. J65-A, no. 4, pp. 297-304, Apr. 1982.

