Abstract: In this paper, multi-scale deep convolutional neural networks are introduced to learn representations for imagined motor electroencephalography (EEG) signals. We propose to learn a set of high-level feature representations through a deep learning algorithm, referred to as Deep Motor Features (DeepMF), for brain computer interfaces (BCI) with imagined motor tasks. As the extracted DeepMF are dissimilar for different tasks and alike for the same task, it is convenient to tell the EEG signals of the diverse imagined motor tasks apart. Our approach achieves 100% accuracy for 4-class imagined motor EEG signal classification on the Project BCI - EEG motor activity dataset. Moreover, thanks to the highly abstract features DeepMF learns, only 4.125-second trials of training data are needed, compared with the 8.75-second trials the conventional BLDA algorithm demands to achieve the same accuracy; accordingly, the BCI response time and the required training trials are almost halved. Experiments are provided to illustrate the effectiveness of the proposed design approach.
Key Words: deep learning, electroencephalography (EEG), brain computer interface (BCI), convolutional neural networks (CNNs)
Fig. 3: Visualization of DeepMF for random trials. (a) Task 1. (b) Task 2. (c) Task 3. (d) Task 4.
segments before being concatenated into feature vectors[21].
(iii) Non-stationarity: EEG signals may vary rapidly over time; consequently, the extracted features are non-stationary.
(iv) Randomness: Many factors in the environment may influence the EEG signals. To rule out factors that occur by chance, the real features of EEG need to be extracted in spite of various adverse effects.
(v) Nonlinearity: Owing to the self-regulating mechanism of the brain, linear methods cannot serve as an effective approach for precise analysis; a higher-level feature extracting approach is required.

3.2 Configuration of Deep ConvNets

Our deep convolutional neural networks contain four convolutional layers and three max-pooling layers to extract features layer by layer; two fully connected layers are applied to generate the DeepMF. The softmax output layer is used to identify the different imagined motor tasks. For each single trial, the 50 sample points of 19 channels are resized to 19 x 1 x 50 square patches (Fig. 4). Fig. 1 shows the detailed architecture of the deep convolutional neural networks, which takes one signal trial of 19 x 1 x 50 as input and predicts on the four imagined motor tasks; the details of the parameters are listed in Tab. 1.

The convolutional operation is expressed as

    y_j = \mathrm{Activation}^{(s)}\Big( b_j + \sum_i k_{ij} * x_i \Big)    (1)

    \mathrm{Activation}^{(s)}(x) =
      \begin{cases}
        \mathrm{ReLU}(x) = \max(0, x), & s = 0 \\
        \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}, & s = 1
      \end{cases}

where x_i and y_j are the i-th input map and the j-th output map, k_{ij} is the convolution kernel between the i-th input map and the j-th output map, and * denotes the convolutional computation. We alternate the activation function between the hyperbolic tangent function and the rectified linear unit (ReLU), which has been shown to be closer to biological behaviour and to have better fitting abilities[22]. When s = 1, we initialize the biases to 0 and the weights W_{ij} at each layer with the following commonly used heuristic[24]:

    W_{ij} \sim U\Big( -\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}} \Big)    (2)

where U(-a, a) is the uniform distribution on the interval (-a, a) and n is the size of the previous layer.

The max-pooling operation is expressed as

    y^i_{j,k} = \max_{0 \le m, n < s} x^i_{js+m,\, ks+n}    (3)

where s stands for the pooling size; accordingly, the i-th output map y^i pools over an s x s non-overlapping area of the i-th input map x^i.
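As a concrete illustration of Eqs. (1)-(3), the NumPy sketch below implements the convolution, activation, and non-overlapping max-pooling steps for one toy trial shaped like the paper's input (19 channels x 50 samples). This is a minimal re-implementation under stated assumptions, not the authors' code: the kernels are treated as 1-D along the sample axis, since the patches have height 1, and all numeric values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(shape, n):
    """Eq. (2): draw weights from U(-1/sqrt(n), 1/sqrt(n)), n = size of previous layer."""
    bound = 1.0 / np.sqrt(n)
    return rng.uniform(-bound, bound, size=shape)

def conv_layer(x, k, b, s=1):
    """Eq. (1): y_j = Activation^(s)(b_j + sum_i k_ij * x_i), 1-D 'valid' convolution.
    x: (in_maps, length), k: (out_maps, in_maps, width), b: (out_maps,)."""
    in_maps, length = x.shape
    out_maps, _, width = k.shape
    out_len = length - width + 1
    y = np.empty((out_maps, out_len))
    for j in range(out_maps):
        acc = np.full(out_len, b[j])
        for i in range(in_maps):
            # convolve each input map with its kernel and sum over input maps i
            acc += np.convolve(x[i], k[j, i], mode="valid")
        # s = 1 selects tanh, s = 0 selects ReLU, as in Eq. (1)
        y[j] = np.tanh(acc) if s == 1 else np.maximum(0.0, acc)
    return y

def max_pool(x, s):
    """Eq. (3): non-overlapping max over windows of size s along the sample axis."""
    maps, length = x.shape
    out_len = length // s
    return x[:, :out_len * s].reshape(maps, out_len, s).max(axis=2)

# one toy trial shaped like the paper's input: 19 channels x 50 samples
x = rng.standard_normal((19, 50))
k = init_weights((20, 19, 3), n=19 * 3)
b = np.zeros(20)
h = conv_layer(x, k, b, s=1)   # (20, 48), matching layer 1 in Tab. 1
p = max_pool(h, 2)             # (20, 24), matching layer 2 in Tab. 1
```

The output shapes (20, 48) after convolution and (20, 24) after pooling match rows 1 and 2 of Tab. 1 for a single trial.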
Fig. 4: The input EEG data resizing (19 channels x 50 samples).

The feature numbers gradually reduce along the data flow until the DeepMF layer, in which the highly abstract features representing imagined motor activities are formed. With these high-level features, the imagined motor tasks can be identified more conveniently; furthermore, the BCI system will benefit from these high-level features both in speed and in a tremendous decline in memory consumption.

The DeepMF layer is fully connected to both the 3rd pooling layer and the 4th convolutional layer. The ConvNets are able to learn multi-scale features through this double fully connected structure[23]. This is crucial for learning more effective features, since this design provides different scales of receptive fields to the last softmax layer for identification. We will show the performance gain of using such a layer-skipping structure in the Experiments section.

The DeepMF layer is expressed as

    \mathrm{DeepMF} = \mathrm{Activation}^{(1)}\big( A_{\mathrm{DeepMF}} + B_{\mathrm{DeepMF}} \big)
    A_{\mathrm{DeepMF}} = W_{\mathrm{DeepMF}} \begin{bmatrix} \mathrm{flatten}(X_{\mathrm{pool}}) \\ \mathrm{flatten}(X_{\mathrm{conv}}) \end{bmatrix}    (4)

where X_pool and X_conv are the outputs of the 3rd pooling layer and the 4th convolutional layer. All the data are in the form of 4D tensors, each dimension standing for (batchsize, channelsize, patchsize, patchsize); the flatten operation resizes a 4D tensor into a 2D matrix of shape (batchsize, channelsize x patchsize x patchsize). W_DeepMF and B_DeepMF are the weight matrix and the bias item of the DeepMF layer.
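The double fully connected structure of Eq. (4) can be sketched in NumPy as follows: the 3rd-pooling and 4th-conv outputs are flattened, concatenated, and passed through one fully connected layer with the s = 1 (tanh) activation. This is a minimal sketch, not the authors' implementation; the batch size of 5 and the 100-dimensional DeepMF follow Tab. 1, while the weights and inputs here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def flatten(x):
    """Resize a 4D (batch, channels, h, w) tensor into a 2D (batch, channels*h*w) matrix."""
    return x.reshape(x.shape[0], -1)

def deepmf_layer(x_pool, x_conv, W, B):
    """Eq. (4): fully connect the concatenated multi-scale features, then apply tanh."""
    multi_scale = np.concatenate([flatten(x_pool), flatten(x_conv)], axis=1)
    return np.tanh(multi_scale @ W + B)

# assumed shapes: pooling layer 3 -> (5, 60, 1, 3), conv layer 4 -> (5, 80, 1, 1)
x_pool = rng.standard_normal((5, 60, 1, 3))
x_conv = rng.standard_normal((5, 80, 1, 1))
n_in = 60 * 1 * 3 + 80 * 1 * 1        # 260 multi-scale features per trial
W = rng.uniform(-1 / np.sqrt(n_in), 1 / np.sqrt(n_in), size=(n_in, 100))
B = np.zeros(100)
deepmf = deepmf_layer(x_pool, x_conv, W, B)   # (5, 100): one 100-d DeepMF per trial
```

The concatenation is what gives the softmax layer access to two different receptive-field scales at once, which is the point of the layer-skipping design.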
Table 1: Configuration of the multi-scale deep CNNs

    Layer  Layer type     Kernel shape     Output shape
    0      Input          -                [5, 19, 1, 50]
    1      Convolutional  [20, 19, 1, 3]   [5, 20, 1, 48]
    2      Pooling        [1, 2]           [5, 20, 1, 24]
    3      Convolutional  [40, 20, 1, 3]   [5, 40, 1, 22]
    4      Pooling        [1, 2]           [5, 40, 1, 11]
    5      Convolutional  [60, 40, 1, 3]   [5, 60, 1, 9]
    6      Pooling        [1, 3]           [5, 60, 1, 3]
    7      Convolutional  [80, 60, 1, 3]   [100, 80, 1, 1]
    8      DeepMF         -                [100]
    9      Softmax        -                [4]

The last layer of the deep convolutional neural networks is a 4-way softmax predicting the probability distribution over the corresponding 4 imagined motor tasks. The output of the last layer is expressed as

    y_j = \frac{ \exp\big( \sum_{i=1}^{len} x_i w_{i,j} + b_j \big) }{ \sum_{k=1}^{n} \exp\big( \sum_{i=1}^{len} x_i w_{i,k} + b_k \big) }    (5)

where x_i is one of the DeepMF features processed by neuron j, the variable len is the length of the DeepMF, b_j is the bias item, and y_j is the output probability distribution.
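Eq. (5) is the standard softmax over the 4 task neurons. A NumPy sketch (with the usual max-subtraction trick for numerical stability, which the paper does not discuss) might look like this; the DeepMF vectors and weights below are random placeholders:

```python
import numpy as np

def softmax_layer(x, w, b):
    """Eq. (5): y_j = exp(sum_i x_i w_ij + b_j) / sum_k exp(sum_i x_i w_ik + b_k)."""
    z = x @ w + b
    z = z - z.max(axis=-1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
deepmf = rng.standard_normal((5, 100))    # batch of DeepMF vectors, len = 100
w = rng.standard_normal((100, 4)) * 0.01  # 4-way softmax weights (toy values)
b = np.zeros(4)
y = softmax_layer(deepmf, w, b)           # each row: a distribution over 4 tasks
```

Each row of y is non-negative and sums to 1, as a probability distribution over the 4 imagined motor tasks must.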
[Figure residue: DeepMF of random trials plotted along Dimension 1 / Dimension 2; trials of the same task cluster together while trials of different tasks lie far apart. (a) Both dissimilar random trials for task 1.]
[Figure residue: accuracy vs. number of input batches for minibatch errors err.1-err.10.]

The prediction of the network is expressed as

    \mathrm{Prediction} = \mathrm{label}_{\arg\max_i \, \mathrm{softmax}( x\, w_{\mathrm{soft}} + b_{\mathrm{soft}} )_i}    (6)

where w_soft and b_soft are the weight and the bias item of the softmax layer, and label_i stands for the i-th imagined motor task.
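Reading Eq. (6) as picking the task whose softmax probability is largest, a minimal NumPy sketch could be the following; the probability values and label names are hypothetical, chosen only for illustration:

```python
import numpy as np

# toy softmax outputs for a batch of 3 trials over the 4 imagined motor tasks
y = np.array([
    [0.05, 0.80, 0.10, 0.05],
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.15, 0.05, 0.70],
])

labels = ["task 1", "task 2", "task 3", "task 4"]   # hypothetical label names

# Eq. (6): the predicted label is the task with the largest softmax probability
predictions = [labels[i] for i in np.argmax(y, axis=1)]
# predictions -> ["task 2", "task 1", "task 4"]
```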
Fig. 11: CNNs with single-scale structure (Input Layer, Conv Layers 1-4, Pooling Layers 1-3, Hidden Layer, Softmax Layer).
[Figure residue: accuracy vs. number of input batches for minibatch errors err.1-err.10.]
[Figure residue: accuracy vs. time (s) for Single-scale, Shallow-net, and BLDA.]
Fig. 10: The convergence curve of the shallow net.

...reduce the time delay for BCI applications; accordingly, a better user experience will be achieved for BCI users.

References

[1] W. Jiang, G.Z. Xu, L. Wang and H.Y. Zhang, Feature extraction of brain-computer interface based on improved multivariate adaptive autoregressive models, in Biomedical Engineering and Informatics (BMEI), 2010 3rd International Conference on, 2010: 895-898.
[2] U. Hoffmann, J-M. Vesin, T. Ebrahimi and K. Diserens, An efficient P300-based brain-computer interface for disabled subjects, in Journal of Neuroscience Methods, 167(1): 115-125, 2008.
[3] M. Duvinage, T. Castermans and T. Dutoit, A P300-based quantitative comparison between the Emotiv Epoc headset and a medical EEG device, in Biomedical Engineering, 765, 2012.
[4] T. Mutanen, H. Maki and R. Ilmoniemi, The effect of stimulus parameters on TMS-EEG muscle artifacts, in Brain Stimulation, 6(3): 371-376, 2013.
[5] J. Hu, C.S. Wang, M. Wu, Y.X. Du, Y. He and J.H. She, Removal of EOG and EMG artifacts from EEG using combination of functional link neural network and adaptive neural fuzzy inference system, in Neurocomputing, 151(1): 278-287, 2015.
[6] Y. Bengio, Scaling up deep learning, in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD'14, 2014: 1966.
[7] Y. Sun, Y.H. Chen, X.G. Wang and X.O. Tang, Deep learning face representation by joint identification-verification, in Advances in Neural Information Processing Systems 27, accepted.
[8] L. Deng, J.Y. Li and J.T. Huang et al., Recent advances in deep learning for speech research at Microsoft, in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, May 2013: 8604-8608.
[9] K.J. Friston, Characterizing functional asymmetries with brain mapping, in The Asymmetrical Brain, 161-186, 2003.
[10] A. Xiu, D.P. Kuang and X.J. Guo, A deep learning method for classification of EEG data based on motor imagery, in Intelligent Computing in Bioinformatics, accepted.
[11] H. Piroska and S. Janos, Specific movement detection in EEG signal using time-frequency analysis, in Complexity and Intelligence of the Artificial and Natural Complex Systems, Medical Applications of the Complex Systems, Biomedical Computing, 2008. CANS '08. First International Conference on, Nov. 2008: 209-215.
[12] C.J. Lin and M.H. Hsieh, Classification of mental task from EEG data using neural networks based on particle swarm optimization, in Neurocomputing, 72(4): 1121-1130, 2009.
[13] A.T. Boye, U.Q. Ulrik and M. Billinger, Identification of movement-related cortical potentials with optimized spatial filtering and principal component analysis, in Biomedical Signal Processing and Control, 3(4): 300-304, 2008.
[14] D.J. McFarland, C.W. Anderson and K. Muller, BCI meeting 2005-workshop on BCI signal processing: feature extraction and translation, in IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2): 135, 2006.
[15] S. Chiappa and D. Barber, EEG classification using generative independent component analysis, in Neurocomputing, 69(7): 769-777, 2006.
[16] G.L. Wallstrom, R.E. Kass and A. Miller, Automatic correction of ocular artifacts in the EEG: a comparison of regression-based and component-based methods, in International Journal of Psychophysiology, 53(2): 105-119, 2004.
[17] H. Ramoser, J. Muller-Gerking and G. Pfurtscheller, Optimal spatial filtering of single trial EEG during imagined hand movement, in Rehabilitation Engineering, IEEE Transactions on, 8(4): 441-446, 2000.
[18] G. Florian and G. Pfurtscheller, Dynamic spectral analysis of event-related EEG data, in Electroencephalography and Clinical Neurophysiology, 95(5): 393-396, 1995.
[19] B. Dal Seno, M. Matteucci and L. Mainardi, A genetic algorithm for automatic feature extraction in P300 detection, in Neural Networks, 2008. IJCNN 2008 (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on, 2008: 3145-3152.
[20] S. Baillet, J.C. Mosher and R.M. Leahy, Electromagnetic brain mapping, in Signal Processing Magazine, IEEE, 18(6): 14-30, 2001.
[21] A. Rakotomamonjy, V. Guigue and G. Mallet et al., Ensemble of SVMs for improving brain computer interface P300 speller performances, in Artificial Neural Networks: Biological Inspirations - ICANN 2005, 3696: 45-50, 2005.
[22] V. Nair and G.E. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010: 807-814.
[23] P. Sermanet and Y. LeCun, Traffic sign recognition with multi-scale convolutional networks, in Neural Networks (IJCNN), The 2011 International Joint Conference on, 2011: 2809-2813.
[24] X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in International Conference on Artificial Intelligence and Statistics, 2010: 249-256.
[25] L.F. Nicolas-Alonso and J. Gomez-Gil, Brain computer interfaces, a review, in Sensors, 12(2): 1211-1279, 2012.