Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes

Cited by: 153
Authors
Pittman, J
Huang, E
Dressman, H
Horng, CF
Cheng, SH
Tsou, MH
Chen, CM
Bild, A
Iversen, ES
Huang, AT
Nevins, JR
West, M [1]
Affiliations
[1] Duke Univ, Inst Stat & Decis Sci, Durham, NC 27708 USA
[2] Duke Univ, Computat & Appl Genomics Program, Inst Genome Sci & Policy, Durham, NC 27708 USA
[3] Duke Univ, Howard Hughes Med Inst, Durham, NC 27708 USA
[4] Koo Fdn, Sun Yat Sen Canc Ctr, Taipei 112, Taiwan
[5] Duke Univ, Dept Med, Durham, NC 27710 USA
[6] Duke Univ, Dept Mol Genet & Microbiol, Durham, NC 27710 USA
DOI
10.1073/pnas.0401736101
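Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract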
We describe a comprehensive modeling approach to combining genomic and clinical data for personalized prediction in disease outcome studies. This integrated clinicogenomic modeling framework is based on statistical classification tree models that evaluate the contributions of multiple forms of data, both clinical and genomic, to define interactions of multiple risk factors that associate with the clinical outcome and derive predictions customized to the individual patient level. Gene expression data from DNA microarrays is represented by multiple, summary measures that we term metagenes; each metagene characterizes the dominant common expression pattern within a cluster of genes. A case study of primary breast cancer recurrence demonstrates that models using multiple metagenes combined with traditional clinical risk factors improve prediction accuracy at the individual patient level, delivering predictions more accurate than those made by using a single genomic predictor or clinical data alone. The analysis also highlights issues of communicating uncertainty in prediction and identifies combinations of clinical and genomic risk factors playing predictive roles. Implicated metagenes identify gene subsets with the potential to aid biological interpretation. This framework will extend to incorporate any form of data, including emerging forms of genomic data, and provides a platform for development of models for personalized prognosis.
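The abstract states only that each metagene characterizes the dominant common expression pattern within a cluster of genes; it does not give the computation. As a minimal sketch, assuming the dominant pattern is taken to be the first singular factor of the row-centered cluster submatrix and that cluster labels are already available (for example from k-means), the summary could be computed as follows. The function names, the synthetic data, and the sign convention are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def metagene(cluster_expr):
    """Summarize one gene cluster (genes x samples) by a single per-sample profile.

    Assumption: the "dominant common expression pattern" is the first right
    singular vector of the row-centered cluster matrix.
    """
    centered = cluster_expr - cluster_expr.mean(axis=1, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    factor = vt[0]
    # Singular vectors are defined only up to sign; orient the factor so it
    # agrees with the cluster's mean centered profile.
    if factor @ centered.mean(axis=0) < 0:
        factor = -factor
    return factor


def metagenes(expr, labels):
    """Stack one metagene per cluster label (rows ordered by sorted label)."""
    return np.vstack([metagene(expr[labels == k]) for k in np.unique(labels)])


# Synthetic illustration (hypothetical data): two clusters of five genes each,
# every gene following its cluster's shared pattern plus small noise.
rng = np.random.default_rng(0)
n_samples = 20
pattern_a = rng.normal(size=n_samples)
pattern_b = rng.normal(size=n_samples)
loadings = rng.uniform(0.5, 1.5, size=(10, 1))
shared = np.vstack([np.tile(pattern_a, (5, 1)), np.tile(pattern_b, (5, 1))])
expr = loadings * shared + 0.1 * rng.normal(size=(10, n_samples))
labels = np.array([0] * 5 + [1] * 5)

M = metagenes(expr, labels)  # shape: (number of clusters, number of samples)
```

Each row of M could then be used as a candidate predictor alongside clinical covariates in a classification tree, which is the role the abstract assigns to metagenes.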
Pages: 8431-8436
Page count: 6