An Effective Brain-Computer Interface System Based on the Optimal Timeframe Selection of Brain Signals

Background: The brain responds within a short timeframe (after a certain delay) to a request to perform a motor imagery task. It is therefore likely that the individual does not focus continuously on the task over the entire data acquisition interval, and may even think about other things for very short time slices. In this paper, an effective brain-computer interface (BCI) system is presented based on optimal timeframe selection of brain signals. Methods: To test this claim, timeframes with different durations and delays were selected according to a specific rule from electroencephalography (EEG) signals recorded during a right/left hand motor imagery task, and feature extraction and classification were then performed on each timeframe. Results: Results on 2 well-known datasets, Graz 2003 and Graz 2005, show that the smallest systematically created timeframe of the data acquisition interval gave the best classification results. Using this smallest timeframe, the classification accuracy increased up to 91.43% for Graz 2003 and 88.96%, 83.64% and 84.86% for the O3, S4 and X11 subjects of the Graz 2005 database, respectively. Conclusion: Removing the additional information recorded while the individual is not focused on the motor imagery task, and using the most discriminative timeframe of the EEG signal that correctly reflects the individual's intentions, improves BCI system performance.

6][17][18][19] The reduced system consists of 2 basic steps: extraction of EEG signal features and their appropriate classification. 20,21 The assumed nature of the EEG signal on the one hand, and the way motor imagery affects it on the other, determine which features of the EEG signal should be extracted and which classifier is appropriate. 22 So far, different assumptions have been made about the nature of the EEG signal, including stable and non-stable, 20,22 Gaussian and non-Gaussian, 23,24 linear and non-linear, 20,21 time series (random process) 1,17 and scalable patterns. 18,19 It should be noted that a sophisticated view of the signal makes it difficult to trace the effects of motor imagery on it and requires complex features to be extracted, so that using a complex classifier may seem reasonable. It may therefore be better to avoid such a sophisticated view of the signal. In this study, basic simplifying assumptions are made about the nature of the signal and how motor imagery influences it. The proposed method considers choosing the best timeframe of the EEG signal, extracting its most suitable independent features, and using a simple classifier matched to the nature of the signal and the type of extracted features.
Assuming that each motor imagery task creates a unique pattern at a specific timeframe of the brain signal, appropriate features describing the pattern at that timeframe should be selected. In addition, irrelevant data should be excluded in the early stages of the algorithm (feature extraction), so that it does not have to be laboriously discarded or removed later.
Band power (BP) features are the features most often extracted from motor imagery related signals. 25 These features are relatively good at distinguishing left and right hand motor imagery, but their effectiveness in applications such as detecting left or right motion of the same hand is debatable. It is conceivable that various time patterns with the same or even different durations produce the same BP features, so they are not suitable when there are more than 2 classes. There is no guarantee that an increase or decrease in the alpha or beta frequency band is accurately detected when there is little difference between the long-term or short-term patterns.
In some studies an analytical method is used to select the effective timeframe. 23,26 This selection has been based on changes in the statistics of the BP features. In other words, the selection is driven by its effect on the feature statistics rather than by its effect on the correct classification rate, so it yields little improvement in the classification results.
Some researchers have focused on extracting better features in the first stage and then reducing them. 19,20,22 Using better features and subsequently reducing them may not be sufficient to reach the desired timeframe of the signal, because of the signal nature or computational error. In other words, the task of removing the information of additional timeframes should not be left to the feature reduction algorithm alone.
It is probably unnecessary to include additional data first and then force a feature reduction algorithm to remove them. Using the entire data acquisition interval increases computational time and complexity while the correct classification rate decreases. In some recent algorithms, features are extracted from different timeframes and a feature reduction algorithm is then applied to the collection of these features. 1,17,18,21,24 This method also has the aforementioned problem: feature reduction algorithms may select features from regions that lie outside the desired timeframe. A better approach, which avoids unnecessary computation, is to select the desired timeframe using the signal itself instead of its extracted features.
Although the idea of timeframe selection was raised by Zhong et al in 2008, 23 it received little attention in subsequent research because the selection method was analytical, the timeframe had to be extracted analytically for each individual, and because of the type of features used. A data mining based timeframe selection method may attract more interest.

Materials and Methods
In this section, the proposed methodology and data sets used to examine the effectiveness of the proposed method are presented.

Database
The effectiveness of the proposed algorithm is studied using 2 well-known datasets, Graz 2003 (dataset III) 27 and Graz 2005 (dataset IIIB), 28 which have been made available by Graz University of Technology.
Dataset III signals were recorded from the C3, Cz and C4 brain areas of a normal subject (a 25-year-old female) during a feedback process. The experiment consisted of 7 runs, all on the same day, with 40 trials in each. The duration of each trial was 9 seconds and all records were sampled at a rate of 128 Hz. Nothing happened in the first 2 seconds; at t = 2 s an acoustic alarm marked the start of the experiment, and at t = 3 s a visual cue pointing to the right or left was displayed and the subject was asked to move a feedback bar in the same direction by imagining movement of her right or left hand.
Dataset IIIB signals were recorded from the C3 and C4 brain areas of 3 subjects (X11, S4, O3) during a cued motor imagery task with online feedback. Experiments were conducted in 3 sessions, each consisting of 4 to 9 runs. The duration of each trial was 8 seconds and all records were sampled at a rate of 125 Hz. Nothing happened in the first 2 seconds; at t = 2 s an acoustic alarm marked the start of the experiment, and at t = 3 s a left or right arrow (virtual reality for O3 or basket for S4 and X11) was displayed as a cue, and the subject was asked to move a feedback bar in the same direction, by doing

Proposed Method
The proposed BCI system is based mainly on selecting the optimal timeframe that contains useful information in the signal, followed by extraction of appropriate features from this timeframe. Removing timeframes that do not contain useful information reduces computational time and complexity and increases system performance, for both signal analysis and classification.
As mentioned in the previous section, analytical methods have already been presented for selecting the timeframe, but these methods are based on statistical analysis of BP features, which has major disadvantages such as computational complexity and data dependency. Furthermore, the generalizability of such methods to similar motor imagery tasks is low.
In this section, a new data mining based timeframe selection method to improve the efficiency of a BCI system is presented. The overall block diagram of the proposed optimal timeframe selection is shown in Figure 1. The practical development and details of the procedure are described below.

Preprocessing
The main purpose of preprocessing in brain-computer interfacing is to remove redundant parts recorded during data acquisition. EEG signals can be obtained easily and noninvasively by electrodes placed on the scalp. However, because the number of available electrode channels is large (for example 14, 64 or 128), only those channels that provide the most useful part of the data should be selected for EEG acquisition. To control right and left hand motor imagery based BCI systems, the electrodes must be placed at the C3 and C4 areas, which lie over the left and right sides of the motor cortex region, respectively. 18 In the Graz 2003 database the Cz channel is ignored because it was shown to be independent of the motor imagery tasks. 18 To eliminate useless frequency content of the EEG signals, a third-order Butterworth band-pass filter (8-30 Hz) is used because of its smooth pass band. This frequency range is selected because the ERD/ERS patterns originating from the sensorimotor cortex appear in the alpha (8-13 Hz) and beta (13-30 Hz) bands, which have been postulated to be good signal features for EEG-based BCIs. 2,29 Moreover, this filtering increases the signal to noise ratio (SNR) by removing some physiological artifacts such as the electrooculogram (EOG) (2-5 Hz) and non-physiological artifacts such as AC noise (50 Hz).
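The band-pass step described above can be sketched as follows. This is a minimal illustration that assumes SciPy is available; the function name bandpass_eeg and the choice of a causal single-pass filter (sosfilt) are assumptions, since the text does not state how the filter was applied.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def bandpass_eeg(eeg, fs, low=8.0, high=30.0, order=3):
    """Butterworth band-pass filter applied along the last (time) axis.

    Assumption: a causal single pass is used. Note that for a band-pass
    design SciPy's butter(order, ...) returns a filter with 2 * order poles.
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, np.asarray(eeg, dtype=float), axis=-1)
```

For a 9 s trial sampled at 128 Hz, bandpass_eeg(trial, fs=128) returns an array of the same shape in which a 50 Hz component is strongly attenuated while a 10 Hz component is largely preserved.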
Afterward, each filtered EEG signal is normalized to have zero mean and unit standard deviation. This kind of normalization leads to uniform scaling of all inputs.
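The normalization above amounts to a z-score of each signal. A minimal sketch, assuming NumPy, a hypothetical function name zscore, and the population standard deviation (the text does not say which estimator was used):

```python
import numpy as np


def zscore(signal, axis=-1):
    """Scale a signal to zero mean and unit standard deviation along `axis`.

    Assumption: population standard deviation (ddof=0) and a non-constant signal.
    """
    signal = np.asarray(signal, dtype=float)
    mean = signal.mean(axis=axis, keepdims=True)
    std = signal.std(axis=axis, keepdims=True)
    return (signal - mean) / std
```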

Systematic Timeframe Creation
In previous studies, features have been taken from a long signal interval that is always the same and is not adapted to each person. In fact, the brain responds within a short timeframe, after a certain delay following the request to perform a motor imagery task, and this timeframe differs between persons and across scenarios and situations. Restricting feature extraction to this limited interval reduces the feature dimension while increasing discrimination capability. A simple and practical algorithm that automatically determines the most significant and distinctive timeframes of the EEG signal, in which the motor imagery task is performed properly, and ignores distracting and redundant information can therefore be very beneficial.
To test this claim, each signal interval is divided automatically into smaller timeframes with variable starting point and duration before features are extracted. With this segmentation method, the optimal timeframe of the signal, in which motor imagery is performed properly, can be identified.
journals.sbmu.ac.ir/Neuroscience
An overall framework of the proposed timeframe creation algorithm, followed by optimal timeframe selection, can be expressed in the form of pseudocode, where t_RP, t_END and CCR(t, Δt) are the reference period time, the end of the signal acquisition time and the correct classification rate of the (t, t + Δt) timeframe, respectively.
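As a minimal sketch of such a procedure (one possible reading, not the original pseudocode), the following Python code enumerates candidate timeframes between t_RP and t_END and keeps the one with the highest CCR. The function names, the list of durations and the step size are illustrative assumptions; in practice CCR(t, Δt) would be obtained by training and testing the classifier on features extracted from the (t, t + Δt) timeframe.

```python
from typing import Callable, List, Sequence, Tuple

Timeframe = Tuple[float, float]  # (start time t, duration dt), in seconds


def create_timeframes(t_rp: float, t_end: float,
                      durations: Sequence[float], step: float) -> List[Timeframe]:
    """List every (t, dt) with t = t_rp + k * step and t + dt <= t_end.

    Assumption: starts lie on a regular grid; the original rule may differ.
    """
    frames: List[Timeframe] = []
    for dt in durations:
        k = 0
        # small tolerance guards against floating-point round-off
        while t_rp + k * step + dt <= t_end + 1e-9:
            frames.append((t_rp + k * step, dt))
            k += 1
    return frames


def select_optimal_timeframe(frames: Sequence[Timeframe],
                             ccr: Callable[[float, float], float]) -> Timeframe:
    """Return the timeframe that maximizes the supplied CCR(t, dt) function."""
    return max(frames, key=lambda frame: ccr(*frame))
```

Because the selection depends only on the CCR function passed in, the same loop can be reused for each subject, which matches the idea that the optimal timeframe must be determined individually.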

Feature Extraction
An EEG-based BCI system uses the EEG signals captured from selected channels as the control signal for interaction between humans and their surroundings without the use of nerves and muscles. After preprocessing and systematic timeframe creation, each timeframe must be decoded to recognize the intended interaction. Translating an EEG signal into a computer command is very complex; therefore, as in many previous studies, the system is reduced to a classifier of 2 or more motor imagery EEG signals. The reduced system consists of 2 basic steps: extraction of EEG signal features and their appropriate classification. The nature of the EEG signal and the way motor imagery affects it determine which features of the EEG signal should be extracted and which classifier is appropriate.
As mentioned earlier, researchers have made different assumptions about the nature of the EEG signal, including stable and non-stable, Gaussian and non-Gaussian, linear and non-linear, time series (random process) and scalable patterns. It should be noted that a sophisticated view of the signal makes it difficult to trace the effects of motor imagery on it and requires complex features to be extracted, so that using a complex classifier may seem reasonable. It may therefore be better to avoid such a sophisticated view of the signal. In this study, basic simplifying assumptions are made about the nature of the signal and how motor imagery influences it.
The method considers choosing the optimal timeframe of the EEG signal, extracting its most suitable independent features, and using a simple classifier matched to the nature of the signal and the type of extracted features. Assuming that each motor imagery task creates a unique pattern at a specific timeframe of the brain signal, a fast unitary transform can be used to describe the pattern properly and extract its features. Appropriate discrete wavelet transform (DWT) coefficients describing the pattern at a specified timeframe can be selected. Irrelevant data should be excluded in the early stages of the algorithm (feature extraction), so that it does not have to be laboriously discarded or removed later. In other words, a minimal set of DWT coefficients should be selected, and the task of removing additional DWT coefficients should not be left to the feature reduction algorithm alone.
Two important factors in wavelet applications are the choice of mother wavelet and the number of decomposition levels. The number of decomposition levels can be determined from the sampling frequency and the fact that half of the signal's frequency content is removed at each filtering level (Figure 2). For the first and second databases, whose sampling frequencies are 128 Hz and 125 Hz respectively, the first-level detail coefficients cover 32-64 Hz and 31.25-62.5 Hz, the second-level detail coefficients cover 16-32 Hz and 15.62-31.25 Hz, and the third-level detail coefficients cover 8-16 Hz and 7.81-15.62 Hz. Three levels of decomposition are therefore required. Because the signals were previously filtered to the alpha and beta range (8-30 Hz), the first-level detail coefficients can be ignored, so the D2 and D3 detail coefficients form the feature vector. 17 The appropriate selection of the mother wavelet is important, as it should approximate a given pattern in the signal. After successive experiments, "db4", which demonstrated a greater ability to detect EEG signal patterns, was used as the mother wavelet. Applying the principal component analysis (PCA) algorithm to the set of DWT coefficients yields fewer, uncorrelated, orthogonal features with better separation of the different motor imagery tasks.
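The frequency ranges quoted above follow from halving the band at each level, starting from the Nyquist frequency. The short sketch below reproduces this arithmetic; the function names are hypothetical, and the bands are the nominal (ideal) ranges rather than the exact response of the db4 filters.

```python
def dwt_detail_bands(fs, levels):
    """Nominal frequency band (Hz) of the detail coefficients at each level."""
    bands = {}
    upper = fs / 2.0  # Nyquist frequency
    for level in range(1, levels + 1):
        lower = upper / 2.0
        bands[f"D{level}"] = (lower, upper)
        upper = lower
    return bands


def levels_in_band(bands, low=8.0, high=30.0):
    """Names of the detail levels whose nominal band overlaps [low, high] Hz."""
    return [name for name, (lo, hi) in bands.items() if lo < high and hi > low]
```

For both sampling rates, levels_in_band(dwt_detail_bands(fs, 3)) selects D2 and D3, in line with the feature vector described in the text.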

Classification
Given the minimal time-frequency description of the dynamic EEG signal discussed in the previous section, the Elman recurrent neural network (ERNN) is better able to recognize and distinguish signal transients (frequency content of the signal) and time-dependent patterns than traditional neural networks, which can only create static mappings. 30,31 The feedback connections of the ERNN make it sensitive to history and enable the network to process, manage and model temporal patterns without extensive training. The ERNN architecture is generally similar to that of feed-forward neural networks. 32 This means that all neurons in each layer are connected to all the neurons in the next one. The exception is an additional layer called the context layer, which creates a one-step memory by keeping a copy of the hidden layer outputs for the next time step.
The ERNN has four layers: an input layer, a hidden layer, a context layer and an output layer. Each external input passes through the input layer to the hidden layer, where it is multiplied by the hidden layer weights; the same is done for the context layer outputs. The sum of all these products is passed through the sigmoid transfer function of the hidden layer, and the result is sent to the output layer. There, the entries are multiplied by their corresponding output weights and the sum is passed through a linear transfer function. To modify the network weights, the scaled conjugate gradient algorithm is applied, as it can converge faster by exploring in all gradient directions.
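The forward pass described above can be sketched in a few lines of NumPy. This is an illustration only: the class name, layer sizes and random initial weights are assumptions, and training with the scaled conjugate gradient algorithm is not shown.

```python
import numpy as np


class ElmanNetwork:
    """Forward pass of an Elman network: sigmoid hidden layer, linear output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # Illustrative random weights; a trained network would replace these.
        self.W_in = rng.normal(scale=0.1, size=(n_hidden, n_in))
        self.W_ctx = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.W_out = rng.normal(scale=0.1, size=(n_out, n_hidden))
        self.b_out = np.zeros(n_out)
        self.context = np.zeros(n_hidden)

    def reset(self):
        """Clear the context layer before a new sequence."""
        self.context = np.zeros_like(self.context)

    def step(self, x):
        """Process one time step and update the context layer."""
        pre = self.W_in @ x + self.W_ctx @ self.context + self.b_h
        hidden = 1.0 / (1.0 + np.exp(-pre))       # sigmoid transfer function
        self.context = hidden.copy()              # copy kept for the next step
        return self.W_out @ hidden + self.b_out   # linear transfer function

    def forward(self, sequence):
        """Run a whole sequence of shape (time, n_in) through the network."""
        self.reset()
        return np.array([self.step(x) for x in sequence])
```

Feeding the same input vector at two consecutive time steps generally gives different outputs, because the second step also sees the stored hidden activations; this is the historical sensitivity mentioned in the text.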

Experimental Results
The performance of the proposed algorithm is evaluated by testing it on the Graz 2003 and Graz 2005 public datasets. Each signal recorded during the feedback interval is systematically divided into different timeframes. The previously mentioned DWT coefficients are then extracted from each timeframe and are compressed and projected by the PCA algorithm to reduce the redundancy of the features. The principal components whose variances sum to 99.9% of the total variance are retained. Subsequently, these features are applied to an ERNN classifier to select the optimal timeframe based on the maximum performance of the BCI system.
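The variance-based retention rule can be illustrated with a small NumPy sketch. The function name pca_retain and the use of the singular value decomposition are assumptions made for illustration; only the 99.9% threshold comes from the text.

```python
import numpy as np


def pca_retain(X, var_fraction=0.999):
    """Project rows of X onto the fewest principal components whose
    cumulative variance reaches `var_fraction` of the total variance.

    Returns the projected data, the retained components and the fraction
    of variance actually retained.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)   # cumulative variance ratio
    k = min(int(np.searchsorted(ratio, var_fraction)) + 1, len(s))
    return Xc @ Vt[:k].T, Vt[:k], float(ratio[k - 1])
```

Here X is assumed to hold one trial per row and one DWT coefficient per column; the components fitted on the training trials would then be reused to project the validation and test trials.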
The performance of the BCI system is evaluated using the commonly used correct classification rate (CCR) of the ERNN classifier. The CCR computed on 20% of the samples (the test set) is reported for the different timeframes of each database in Tables 2-5. To ensure that overtraining does not occur, a further 20% of the samples were used for validation and the remaining 60% for training. Finally, by comparing the correct classification rates of all timeframes, the optimal one, in which the motor imagery task is performed properly, is selected.
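A 60/20/20 partition of the trials can be sketched as follows. The random shuffling, the seed and the function name are assumptions, since the text does not state how the samples were assigned to each subset.

```python
import random


def split_indices(n_samples, train=0.6, val=0.2, seed=0):
    """Shuffle sample indices and split them into train/validation/test lists.

    Assumption: a simple random split; the remaining samples form the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_samples * train))
    n_val = int(round(n_samples * val))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])
```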
The performance of the proposed system is evaluated in more detail using 2 statistical criteria, sensitivity and specificity, which correspond to the true positive rate (TPR) and true negative rate (TNR) respectively (Tables 2-5). Although the percentage of correctly classified target motor imagery tasks (sensitivity) is an important indicator of BCI system performance, the percentage of non-target motor imagery tasks that are correctly classified (specificity) is a more significant indicator, since the occurrence of false positives (1-TNR) is the most undesirable event in the system. 33 The results fully meet our expectations. Proper set of features along with a suitable classifier are able to distinguish between different stimuli using a pattern
In the next step, the more effective performance of the proposed BCI system in the optimal timeframe compared with existing methods for Graz 2003 and Graz 2005 is shown in Table 6 and Table 7, respectively. It should be noted that, despite this effectiveness, the presented method needs less storage capacity in feature space and has lower computational complexity in ERNN training and testing. Further analysis of the results is presented in the next section.
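The CCR, sensitivity and specificity used in this evaluation can be computed from the true and predicted labels as in the following sketch. The function name and the label encoding (1 for the target class, 0 for the non-target class) are illustrative assumptions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return CCR, TPR (sensitivity), TNR (specificity) and FPR = 1 - TNR.

    Assumption: both classes occur at least once in y_true.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return {
        "CCR": (tp + tn) / (tp + tn + fp + fn),
        "TPR": tpr,
        "TNR": tnr,
        "FPR": 1.0 - tnr,
    }
```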

Discussion
In this research, an effective method is presented for determining the optimal timeframe of the EEG signal, which contains the pattern created by the motor imagery task. The method is based on the assumption that, over the entire data acquisition interval, the individual may not focus continuously on the request to perform the motor imagery task, and may even think about other things for very short time slices. The distinguishing patterns that are essential for interpreting the individual's intentions therefore exist only in a small timeframe of the data, and features obtained from other parts of the recorded signal are not suitable for training a BCI system.
To identify this optimal timeframe, a novel data mining technique has been presented in this study. Taking a simple view of the signal, and using features that provide a complete time-frequency description of the motor imagery related pattern in the desired time and frequency range together with an appropriate classifier, the validity of the hypothesis is examined through the proposed experimental data mining procedure.
The results listed in the tables in the previous section show that in all cases the best result is obtained with the smallest timeframe. Moreover, among larger time intervals of the same length, those that include the optimal timeframe give better discrimination. It can therefore be concluded that the request produces a unique pattern in a short timeframe, and that this timeframe is detectable by an empirical data mining algorithm.
As the duration of a time interval that includes the optimal timeframe increases, the classification rate decreases. This indicates that the motor imagery related pattern is not spread over the whole data acquisition time.
The time at which this optimal timeframe occurs is not identical for all 3 individuals in the IIIB dataset. This means that there may not be any analytical method or definite formula for determining the location and length of the optimal timeframe for all individuals. In other words, the location and length of the optimal timeframe must be determined for each person using an experimental data mining technique. An intelligent search could be used instead of the systematic method and might lead to smaller timeframes with better results. In the optimal timeframe and the smallest intervals containing it, TPR and TNR are the highest. This also confirms the existence of a distinct pattern in response to the request within a short timeframe. The TNR in other timeframes is significantly lower than in the optimal timeframe, which is undesirable.
In addition to supporting our hypothesis (shortness of the brain response time to the request to perform the motor imagery task) through the comparison of results in different timeframes, the results also indicate the superiority of the presented method over existing methods. The proposed method has basic characteristics that make it superior to existing approaches: a simple view of the signal; use of the smallest set of suitable uncorrelated features that provide a complete description of the desired time-frequency content of the signal; and a classifier compatible with this simple view, which makes it possible to use simple features that provide a complete time-frequency description of the motor imagery related pattern.
These characteristics allow the proposed method to achieve higher performance than existing methods despite its lower computational complexity and storage requirements (Table 6 and Table 7).

Figure 2. Frequency Range of Different Decomposition Coefficients for Graz 2003 Database.

Table 1. Details of the Datasets

Table 2. Results for Database III

Table 6. Proposed BCI System CCR Compared With Existing Methods for Graz 2003