Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers

Cited by: 100
Authors
Dakna, Mohammed [1]
Harris, Keith [2]
Kalousis, Alexandros [3]
Carpentier, Sebastien [4]
Kolch, Walter [5, 6, 7]
Schanstra, Joost P. [8, 9]
Haubitz, Marion [10]
Vlahou, Antonia [12]
Mischak, Harald [1, 11]
Girolami, Mark [13]
Affiliations
[1] Mosa Diagnost & Therapeut, Hannover, Germany
[2] Univ Glasgow, Water & Environm Res Grp, Sch Engn, Glasgow, Lanark, Scotland
[3] Univ Geneva, Dept Comp Sci, Geneva, Switzerland
[4] Katholieke Univ Leuven, Lab Trop Crop Improvement, Leuven, Belgium
[5] Univ Glasgow, Beatson Inst Canc Res, Glasgow, Lanark, Scotland
[6] Univ Glasgow, Sir Henry Wellcome Funct Genom Facil, Glasgow, Lanark, Scotland
[7] Conway Inst, Dublin 4, Ireland
[8] Fac Med Toulouse, INSERM, U858, F-31073 Toulouse, France
[9] Univ Toulouse III Paul Sabatier, Inst Med Mol Rangueil, Equipe N 5, IFR150, Toulouse, France
[10] Hannover Med Sch, Dept Nephrol, D-3000 Hannover, Germany
[11] Univ Glasgow, BHF Glasgow Cardiovasc Res Ctr, Glasgow, Lanark, Scotland
[12] Acad Athens, Res Fdn, Athens, Greece
[13] Univ London Imperial Coll Sci Technol & Med, Dept Stat Sci, London, England
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Funding
Science Foundation Ireland; Engineering and Physical Sciences Research Council (UK);
Keywords
CHRONIC KIDNEY-DISEASE; DNA MICROARRAY DATA; MASS-SPECTROMETRY; CLINICAL PROTEOMICS; URINARY PROTEOME; SAMPLE-SIZE; VERIFICATION BIAS; DISCOVERY; CANCER; SERUM;
DOI
10.1186/1471-2105-11-594
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: This manuscript offers suggestions, based on an extensive analysis of a proteomic data set, for proper statistical analysis in the discovery of sets of clinically relevant biomarkers. As a tractable example, we define the measurable proteomic differences between apparently healthy adult males and females. We chose urine as the body fluid of interest and CE-MS, a thoroughly validated platform technology that allows routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) during the routine medical check-up before recruitment at the Hannover Medical School.
Results: We found that the Wilcoxon test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test set is essential.
Conclusions: Valid proteomic biomarkers for diagnosis and prognosis can only be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.
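The first two recommendations in the abstract (a Wilcoxon test per candidate marker, followed by adjustment for multiple testing) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes the two-sample rank-sum form of the Wilcoxon test with a normal approximation (no continuity correction), and it assumes Benjamini-Hochberg as the adjustment, which the abstract does not name. The function names and the synthetic intensities are hypothetical.

```python
import math
import random


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        for k in range(i, j):
            ranks[k] = mid
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all observations identical: no evidence of a shift
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Synthetic data only: 20 hypothetical peptides, 15 samples per group,
    # with a shift added to the first three peptides in one group.
    random.seed(0)
    pvals = []
    for peptide in range(20):
        shift = 1.5 if peptide < 3 else 0.0
        males = [random.gauss(0.0, 1.0) for _ in range(15)]
        females = [random.gauss(shift, 1.0) for _ in range(15)]
        pvals.append(rank_sum_p(males, females))
    for peptide, (p, q) in enumerate(zip(pvals, benjamini_hochberg(pvals))):
        print(f"peptide {peptide:2d}  p={p:.4f}  adjusted={q:.4f}")
```

Candidates would then be selected on the adjusted values rather than the raw p-values, and, as the abstract stresses, any resulting marker set would still need assessment in an independent test set.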
Pages: 16