Exceeding chance level by chance: The caveat of theoretical chance levels in brain signal classification and statistical assessment of decoding accuracy

Cited by: 435
Authors
Combrisson, Etienne [1, 2]
Jerbi, Karim [1, 3]
Affiliations
[1] Univ Lyon 1, Lyon Neurosci Res Ctr, DYCOG Lab, INSERM,U1028,UMR 5292, F-69365 Lyon, France
[2] Univ Lyon 1, Ctr Res & Innovat Sport Mental Proc & Motor Perfo, F-69365 Lyon, France
[3] Univ Montreal, Dept Psychol, Montreal, PQ H3C 3J7, Canada
Keywords
k-Fold cross-validation; Small sample size; Classification; Multi-class decoding; Brain-computer-interfaces (BCIs); Machine learning; Binomial cumulative distribution; Classification significance; Decoding accuracy; MEG; ECoG; Intracranial EEG; COMPUTER-INTERFACE; MOVEMENT DIRECTION; PERMUTATION TESTS; ELECTROCORTICOGRAPHIC SIGNALS; PERFORMANCE EVALUATION; PATTERN-RECOGNITION; CLASSIFYING EEG; HAND MOVEMENTS; MOTOR; IMAGERY;
DOI
10.1016/j.jneumeth.2015.01.010
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Machine learning techniques are increasingly used in neuroscience to classify brain signals. Decoding performance is reflected by how much the classification results depart from the rate achieved by purely random classification. In a 2-class or 4-class classification problem, the chance levels are thus 50% or 25% respectively. However, such thresholds hold for an infinite number of data samples but not for small data sets. While this limitation is widely recognized in the machine learning field, it is unfortunately sometimes still overlooked or ignored in the emerging field of brain signal classification. Incidentally, this field is often faced with the difficulty of low sample size. In this study we demonstrate how applying signal classification to Gaussian random signals can yield decoding accuracies of up to 70% or higher in two-class decoding with small sample sets. Most importantly, we provide a thorough quantification of the severity and the parameters affecting this limitation using simulations in which we manipulate sample size, class number, cross-validation parameters (k-fold, leave-one-out and repetition number) and classifier type (Linear-Discriminant Analysis, Naive Bayesian and Support Vector Machine). In addition to raising a red flag of caution, we illustrate the use of analytical and empirical solutions (binomial formula and permutation tests) that tackle the problem by providing statistical significance levels (p-values) for the decoding accuracy, taking sample size into account. Finally, we illustrate the relevance of our simulations and statistical tests on real brain data by assessing noise-level classifications in Magnetoencephalography (MEG) and intracranial EEG (iEEG) baseline recordings. © 2015 Elsevier B.V. All rights reserved.
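The binomial approach mentioned in the abstract can be illustrated with a minimal sketch. It assumes that, under the null hypothesis, each of n test trials is classified correctly with probability 1/c (c classes) independently of the others, so the number of correct trials follows a binomial distribution. The function names `binomial_p_value` and `significance_threshold` are hypothetical and are not taken from the paper; the threshold convention used here (smallest accuracy whose one-sided p-value is at most alpha) is also an assumption and may differ from the convention behind the paper's own tabulated values.

```python
from math import comb


def binomial_p_value(n_correct, n_samples, n_classes):
    """One-sided p-value: P(X >= n_correct) for X ~ Binomial(n_samples, 1/n_classes)."""
    p = 1.0 / n_classes  # theoretical chance level, e.g. 0.5 for two classes
    return sum(
        comb(n_samples, k) * p**k * (1.0 - p) ** (n_samples - k)
        for k in range(n_correct, n_samples + 1)
    )


def significance_threshold(n_samples, n_classes, alpha=0.05):
    """Smallest accuracy (fraction correct) whose one-sided p-value is <= alpha.

    Returns None if even a perfect score is not significant at this alpha,
    which can happen for very small n_samples.
    """
    for n_correct in range(n_samples + 1):
        if binomial_p_value(n_correct, n_samples, n_classes) <= alpha:
            return n_correct / n_samples
    return None


if __name__ == "__main__":
    # Compare the sample-size-aware threshold with the theoretical 50% level.
    for n in (20, 40, 100, 500):
        print(n, significance_threshold(n, n_classes=2))
```

Under this convention, 40 trials and two classes require 26 correct trials (65% accuracy) to reach p <= 0.05, well above the theoretical 50% chance level, and the threshold moves toward 50% as the number of trials grows. A permutation test, the empirical alternative named in the abstract, would instead re-run the classifier on label-shuffled data to build the null distribution.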
Pages: 126-136
Number of pages: 11