An empirical assessment of validation practices for molecular classifiers

Cited by: 68

Authors:

Castaldi, Peter J. [1]

Dahabreh, Issa J. [1]

Ioannidis, John P. A. [1]

Affiliation:

[1] Stanford Univ, Sch Med, Stanford Prevent Res Ctr, Fac Med Sch, Stanford, CA 94305 USA

Source:

BRIEFINGS IN BIOINFORMATICS | 2011 / Volume 12 / Issue 03

Funding:

National Institutes of Health (USA);

Keywords:

predictive medicine; genes; gene expression; proteomics; STAGE OVARIAN-CANCER; GENE-EXPRESSION DATA; BREAST-CANCER; CELL-CARCINOMA; PUBLISHED MICROARRAY; DIAGNOSTIC-TESTS; CROSS-VALIDATION; STATISTICS NOTES; ERROR ESTIMATION; META-REGRESSION;

DOI:

10.1093/bib/bbq073

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21-61%) and 29% (IQR, 15-65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04-5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n=758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice.
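The diagnostic odds ratio (DOR) mentioned in the abstract combines sensitivity and specificity into one number, and a relative DOR compares two such numbers. The following is a minimal sketch of that arithmetic only. The function names are hypothetical, and plugging in the median sensitivities and specificities quoted above is purely illustrative: the reported relative DOR of 3.26 is a pooled estimate across studies, so it should not be expected to match a ratio computed from medians.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return the DOR: [sens / (1 - sens)] * [spec / (1 - spec)].

    Both inputs are proportions strictly between 0 and 1.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


def relative_dor(sens_a: float, spec_a: float, sens_b: float, spec_b: float) -> float:
    """Return DOR(a) / DOR(b); values above 1 mean setting a looks more accurate."""
    return diagnostic_odds_ratio(sens_a, spec_a) / diagnostic_odds_ratio(sens_b, spec_b)


# Illustrative inputs only: median values quoted in the abstract
# (cross-validation 94% / 98%, independent validation 88% / 81%).
dor_cv = diagnostic_odds_ratio(0.94, 0.98)
dor_ext = diagnostic_odds_ratio(0.88, 0.81)
ratio = relative_dor(0.94, 0.98, 0.88, 0.81)

print(f"DOR (cross-validation):      {dor_cv:.1f}")
print(f"DOR (independent validation): {dor_ext:.1f}")
print(f"Ratio of the two (not the pooled estimate): {ratio:.1f}")
```

A relative DOR above 1 indicates that performance looked better in cross-validation than in independent validation, which is the direction of the effect the abstract reports.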

Pages: 189-202

Page count: 14

References (84 in total):

[1] Serum proteome profiling detects myelodysplastic syndromes and identifies CXC chemokine ligands 4 and 7 as markers for advanced disease
Aivado, Manuel
Spentzos, Dimitrios
Germing, Ulrich
Alterovitz, Gil
Meng, Xiao-Ying
Grall, Franck
Giagounidis, Aristoteles A. N.
Klement, Giannoula
Steidl, Ulrich
Otu, Hasan H.
Czibere, Akos
Prall, Wolf C.
Iking-Konert, Christof
Shayne, Michelle
Ramoni, Marco F.
Gattermann, Norbert
Haas, Rainer
Mitsiades, Constantine S.
Fung, Eric T.
Libermann, Towia A.
[J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (04) : 1307 - 1312
[2] Statistics Notes - Interaction revisited: the difference between two estimates
Altman, DG
Bland, JM
[J]. BMJ-BRITISH MEDICAL JOURNAL, 2003, 326 (7382) : 219 - 219
[3] Selection bias in gene extraction on the basis of microarray gene-expression data
Ambroise, C
McLachlan, GJ
[J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (10) : 6562 - 6566
[4] Exonic expression profiling of breast cancer and benign lesions: a retrospective analysis
Andre, Fabrice
Michiels, Stefan
Dessen, Philippe
Scott, Veronique
Suciu, Voichita
Uzan, Catherine
Lazar, Vladimir
Lacroix, Ludovic
Vassal, Gilles
Spielmann, Marc
Vielh, Philippe
Delaloge, Suzette
[J]. LANCET ONCOLOGY, 2009, 10 (04) : 381 - 390
[5] [Anonymous], 2010, R LANG ENV STAT COMP
[6] Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide chemotherapy in breast cancer
Ayers, M
Symmans, WF
Stec, J
Damokosh, AI
Clark, E
Hess, K
Lecocke, M
Metivier, J
Booser, D
Ibrahim, N
Valero, V
Royce, M
Arun, B
Whitman, G
Ross, J
Sneige, N
Hortobagyi, GN
Pusztai, L
[J]. JOURNAL OF CLINICAL ONCOLOGY, 2004, 22 (12) : 2284 - 2293
[7] Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments
Baggerly, KA
Morris, JS
Coombes, KR
[J]. BIOINFORMATICS, 2004, 20 (05) : 777 - U710
[8] Statistics notes - The odds ratio
Bland, JM
Altman, DG
[J]. BRITISH MEDICAL JOURNAL, 2000, 320 (7247) : 1468 - 1468
[9] Cytosolic N-terminal arginine-based signals together with a luminal signal target a type II membrane protein to the plant ER
Boulaflous, Aurelia
Saint-Jore-Dupas, Claude
Herranz-Gordo, Marie-Carmen
Pagny-Salehabadi, Sophie
Plasson, Carole
Garidou, Frederic
Kiefer-Meyer, Marie-Christine
Ritzenthaler, Christophe
Faye, Loic
Gomord, Veronique
[J]. BMC PLANT BIOLOGY, 2009, 9
[10] Is cross-validation valid for small-sample microarray classification?
Braga-Neto, UM
Dougherty, ER
[J]. BIOINFORMATICS, 2004, 20 (03) : 374 - 380