Multi-platform, multi-site, microarray-based human tumor classification

Cited by: 142
Authors
Bloom, G
Yang, IV
Boulware, D
Kwong, KY
Coppola, D
Eschrich, S
Quackenbush, J
Yeatman, TJ
Affiliations
[1] Univ S Florida, H Lee Moffitt Canc Ctr, Dept Interdisciplinary Oncol, Tampa, FL 33612 USA
[2] Inst Genom Res, Rockville, MD USA
DOI
10.1016/S0002-9440(10)63090-8
Chinese Library Classification number
R36 [Pathology];
Discipline classification code
100104;
Abstract
The introduction of gene expression profiling has resulted in the production of rich human data sets with potential for deciphering tumor diagnosis, prognosis, and therapy. Here we demonstrate how artificial neural networks (ANNs) can be applied to two completely different microarray platforms (cDNA and oligonucleotide), or a combination of both, to build tumor classifiers capable of deciphering the identity of most human cancers. First, 78 tumors representing eight different types of histologically similar adenocarcinoma were evaluated with a 32k cDNA microarray and correctly classified by a cDNA-based ANN, using independent training and test sets, with a mean accuracy of 83%. To expand our approach, oligonucleotide data derived from six independent performance sites, representing 463 tumors and 21 tumor types, were assembled, normalized, and scaled. An oligonucleotide-based ANN, trained on a random fraction of the tumors (n = 343), was 88% accurate in predicting the known pathological origin of the remaining fraction of tumors (n = 120) not exposed to the training algorithm. Finally, a mixed-platform classifier using a combination of both cDNA and oligonucleotide microarray data from seven performance sites, normalized and scaled from a large and diverse tumor set (n = 539), produced similar results (85% accuracy) on independent test sets. Further validation of our classifiers was achieved by accurately (84%) predicting the known primary site of origin for an independent set of metastatic lesions (n = 50), resected from brain, lung, and liver, potentially addressing the vexing classification problems imposed by unknown primary cancers. These cDNA- and oligonucleotide-based classifiers provide a first proof of principle that data derived from multiple platforms and performance sites can be exploited to build multi-tissue tumor classifiers.
Pages: 9-16
Page count: 8