A data review and re-assessment of ovarian cancer serum proteomic profiling

Cited by: 245
Authors
Sorace, JM [1]
Zhan, M
Affiliations
[1] Vet Adm Maryland Hlth Care Syst, Dept Pathol, Baltimore, MD 21201 USA
[2] Vet Adm Maryland Hlth Care Syst, Lab Serv, Baltimore, MD 21201 USA
[3] Univ Maryland Baltimore Cty, Dept Informat Syst, Baltimore, MD 21250 USA
[4] Univ Maryland, Sch Med, Dept Pathol, Baltimore, MD 21201 USA
[5] Univ Maryland, Sch Med, Dept Epidemiol & Prevent Med, Baltimore, MD 21201 USA
Keywords
DOI
10.1186/1471-2105-4-24
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: The early detection of ovarian cancer has the potential to dramatically reduce mortality. Recently, the use of mass spectrometry to develop profiles of patient serum proteins, combined with advanced data mining algorithms, has been reported as a promising method to achieve this goal. In this report, we analyze the Ovarian Dataset 8-7-02, downloaded from the Clinical Proteomics Program Databank website, using nonparametric statistics and stepwise discriminant analysis to develop rules to diagnose patients, as well as to understand general patterns in the data that may guide future research.
Results: The mass spectrometry serum profiles derived from cancer and control subjects exhibited numerous statistical differences. For example, using the Wilcoxon test to compare the intensity at each of the 15,154 mass-to-charge (M/Z) values between cancer and control subjects detected 3,591 M/Z values whose intensities differed with a p-value of 10^-6 or less. The region containing the M/Z values of greatest statistical difference between cancer and control subjects occurred at M/Z values less than 500. For example, the M/Z values of 2.7921478 and 245.53704 could be used to significantly separate the cancer group from the control group. Three other sets of M/Z values were developed using a training set that could distinguish between cancer and control subjects in a test set with 100% sensitivity and specificity.
Conclusion: The ability to discriminate between cancer and control subjects based on the M/Z values of 2.7921478 and 245.53704 reveals the existence of a significant non-biologic experimental bias between these two groups. This bias may invalidate attempts to use this dataset to find patterns of reproducible diagnostic value. To minimize false discovery, results obtained using mass spectrometry and data mining algorithms should be carefully reviewed and benchmarked against routine statistical methods.
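The per-M/Z screening described in the Results can be illustrated with a minimal sketch. It assumes a two-sided Wilcoxon rank-sum test with a tie-corrected normal approximation, which may differ from the exact procedure and software the authors used. The data are synthetic stand-ins, not the Ovarian Dataset 8-7-02, and the names `rank_sum_p` and `screen_features` are hypothetical.

```python
import math
import random


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0  # sum of the ranks held by group x
    tie_term = 0.0    # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values share the average rank.
        avg_rank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n + 1) / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (rank_sum_x - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


def screen_features(cancer, control, alpha=1e-6):
    """Return indices of features (columns) whose p-value is at most alpha."""
    n_features = len(cancer[0])
    hits = []
    for f in range(n_features):
        p = rank_sum_p([row[f] for row in cancer], [row[f] for row in control])
        if p <= alpha:
            hits.append(f)
    return hits


# Synthetic example: 50 features, of which the first 5 are shifted in "cancer".
rng = random.Random(0)
N_FEATURES, N_SHIFTED, N_PER_GROUP = 50, 5, 40
control = [[rng.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_PER_GROUP)]
cancer = [
    [rng.gauss(4 if f < N_SHIFTED else 0, 1) for f in range(N_FEATURES)]
    for _ in range(N_PER_GROUP)
]
hits = screen_features(cancer, control)
print("features with p <= 1e-6:", hits)
```

A screen of this kind only flags where the two groups differ. It cannot tell whether a difference is biological or, as the Conclusion argues for this dataset, a product of non-biologic experimental bias.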
Pages: 13
Related papers
22 items in total
  • [1] Adam BL, 2001, PROTEOMICS, V1, P1264, DOI 10.1002/1615-9861(200110)1:10<1264::AID-PROT1264>3.0.CO;2-R
  • [2] Adam BL, 2002, CANCER RES, V62, P3609
  • [3] BAGGERLY KA, PROTEOMICS
  • [4] Plasma lysophosphatidic acid concentration and ovarian cancer
    Baker, DL
    Morrison, P
    Miller, B
    Riely, CA
    Tolley, B
    Westermann, AM
    Bonfrer, JMG
    Bais, E
    Moolenaar, WH
    Tigyi, G
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2002, 287 (23): 3081-3082
  • [5] Cazares LH, 2002, CLIN CANCER RES, V8, P2541
  • [6] Cancer proteomics: The state of the art
    Herrmann, PC
    Liotta, LA
    Petricoin, EF
    [J]. DISEASE MARKERS, 2001, 17 (02): 49-57
  • [7] Genomics and proteomics: application of novel technology to early detection and prevention of cancer
    Michener, CM
    Ardekani, AM
    Petricoin, EF
    Liotta, LA
    Kohn, EC
    [J]. CANCER DETECTION AND PREVENTION, 2002, 26 (04): 249-255
  • [8] Clinical proteomics: Translating benchside promise into bedside reality
    Petricoin, EF
    Zoon, KC
    Kohn, EC
    Barrett, JC
    Liotta, LA
    [J]. NATURE REVIEWS DRUG DISCOVERY, 2002, 1 (09): 683-695
  • [9] Petricoin EF, 2002, J NATL CANCER I, V94, P1576