Using stacked autoencoders for the P300 component detection

L. Vareka, P. Mautner - frontiersin.org
Deep neural networks (i.e., neural networks with multiple hidden layers) are not very practical to train with plain backpropagation because the weights in the lower layers (closer to the input) typically do not change significantly during training. Therefore, training often ends in local minima, and the performance of such networks is typically lower than that of traditional linear classifiers such as linear discriminant analysis (LDA) or support vector machines (SVM). Since novel training methods (commonly referred to as deep learning) emerged, often combining unsupervised pre-training with subsequent supervised fine-tuning, deep neural networks have become one of the most reliable classification methods. Because deep neural networks are especially powerful for high-dimensional and non-linear feature vectors, electroencephalography (EEG) and event-related potentials (ERPs) are among their promising applications. Furthermore, to the authors' best knowledge, there are very few papers that study deep neural networks for EEG/ERP data [1, 2]. Consequently, the aim of the experiments presented here was to verify whether deep learning-based models can also perform well for classification in P300 brain-computer interfaces (BCIs).

The P300 data used were recorded in the EEG/ERP laboratory at the Department of Computer Science and Engineering, University of West Bohemia, and are publicly available in the EEG/ERP portal (http://eegdatabase.kiv.zcu.cz/home.html). The datasets are based on stimulation with three LEDs (i.e., target, non-target, and distractor stimuli) and are described in detail in [3]. The windowed means paradigm [4] was used for feature extraction. The feature vectors consisted of both spatial and temporal features and had a dimensionality of 133 (7 averaged time intervals per EEG channel × 19 EEG channels). For training the classifiers, data from three subjects containing 730 ERP trials were used; for testing, 11 datasets from 11 subjects were used.
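The windowed-means idea can be sketched briefly: each EEG channel is averaged over a set of post-stimulus time windows, and the per-window, per-channel means are concatenated into one feature vector. The sketch below is a minimal illustration, not the authors' code; the sampling rate and the exact window boundaries are hypothetical (the abstract only fixes the counts: 7 windows × 19 channels = 133 features).

```python
import numpy as np

def windowed_means(epoch, fs=1000.0, windows=None):
    """Windowed-means feature extraction for one ERP epoch.

    epoch   -- 2-D array, shape (n_channels, n_samples)
    fs      -- sampling rate in Hz (assumed value, for illustration)
    windows -- list of (start_s, end_s) time windows in seconds
    """
    if windows is None:
        # 7 illustrative windows covering a typical P300 latency range;
        # the boundaries actually used in the study are not given here.
        windows = [(0.2, 0.3), (0.3, 0.4), (0.4, 0.5), (0.5, 0.6),
                   (0.6, 0.7), (0.7, 0.8), (0.8, 0.9)]
    feats = []
    for start, end in windows:
        a, b = int(start * fs), int(end * fs)
        feats.append(epoch[:, a:b].mean(axis=1))  # one mean per channel
    # Concatenate into n_windows * n_channels features.
    return np.concatenate(feats)

# 19 channels, 1 s of data at 1 kHz -> 7 * 19 = 133 features
epoch = np.random.randn(19, 1000)
print(windowed_means(epoch).shape)  # (133,)
```

With 7 windows and 19 channels this reproduces the 133-dimensional feature vectors described above.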
Stacked autoencoders (SAEs) were implemented and compared with some of the most reliable state-of-the-art classification methods: LDA and the multi-layer perceptron (MLP). The parameters of the stacked autoencoders were optimized empirically. The layers were pre-trained and inserted one by one; at the end, the last layer was replaced with a supervised softmax classifier, and fine-tuning using backpropagation was performed. The architecture of the neural network was 133-100-75-60-30-2. For each classifier, the average accuracy, precision, and recall, each with the corresponding standard deviation (SD), are listed in Table 1. Fig. 1 depicts the classification accuracy achieved for each testing dataset. The performance was comparable with LDA and higher than that of the multi-layer perceptron. This result encourages the use of stacked autoencoders for P300 BCIs.

For future work, several issues remain to be addressed. Although stacked autoencoders are less prone to overtraining than MLPs, during the fine-tuning phase the accuracy peaked after approximately 100 iterations and then slowly levelled off. Therefore, early stopping with a validation set can be considered. It could also be interesting to explore stacked denoising autoencoders, deep belief networks, or other deep learning models. Furthermore, we plan to test deep learning in real on-line BCIs.
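The greedy layer-wise procedure described above can be sketched in plain NumPy: each autoencoder is trained to reconstruct its input, its hidden codes become the input of the next layer, and the resulting 133-100-75-60-30 encoder stack would finally be topped with a 2-unit softmax layer for supervised fine-tuning. This is a minimal sketch, not the authors' implementation; the tied weights, learning rate, epoch count, and toy data are all assumptions, with only the layer sizes taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class Autoencoder:
    """One tied-weight autoencoder layer, trained by gradient descent
    on the squared reconstruction error (assumed training setup)."""
    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b = np.zeros(n_hidden)   # hidden bias
        self.c = np.zeros(n_in)       # reconstruction bias

    def encode(self, X):
        return sigmoid(X @ self.W + self.b)

    def pretrain(self, X, lr=0.1, epochs=50):
        for _ in range(epochs):
            H = self.encode(X)
            R = sigmoid(H @ self.W.T + self.c)      # reconstruction of X
            dR = (R - X) * R * (1 - R)              # grad at decoder pre-activation
            dH = (dR @ self.W) * H * (1 - H)        # grad at encoder pre-activation
            gW = X.T @ dH + dR.T @ H                # tied-weight gradient (both uses of W)
            self.W -= lr * gW / len(X)
            self.b -= lr * dH.mean(axis=0)
            self.c -= lr * dR.mean(axis=0)

# Greedy layer-wise pre-training of the 133-100-75-60-30 encoder stack.
sizes = [133, 100, 75, 60, 30]
X = rng.random((64, 133))                           # toy stand-in for ERP feature vectors
layers = []
for n_in, n_hid in zip(sizes[:-1], sizes[1:]):
    ae = Autoencoder(n_in, n_hid)
    ae.pretrain(X)
    X = ae.encode(X)                                # codes feed the next layer
    layers.append(ae)

print(X.shape)  # (64, 30)
```

After pre-training, the 30-dimensional codes would be passed to a softmax output layer and the whole stack fine-tuned with backpropagation, as in the experiment described above.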