Network-constrained regularization and variable selection for analysis of genomic data

Cited by: 421
Authors
Li, Caiyan [1]
Li, Hongzhe [1]
Affiliations
[1] Univ Penn, Sch Med, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
DOI
10.1093/bioinformatics/btn081
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Graphs or networks are common ways of depicting information. In biology in particular, many biological processes, such as regulatory networks or metabolic pathways, are represented by graphs. This a priori information, gathered over many years of biomedical research, is a useful supplement to standard numerical genomic data such as microarray gene-expression data. Incorporating the information encoded by known biological networks or graphs into the analysis of numerical data raises interesting statistical challenges. In this article, we introduce a network-constrained regularization procedure for linear regression analysis that incorporates the information from these graphs into the analysis of the numerical data, where the network is represented as a graph and its corresponding Laplacian matrix. We define a network-constrained penalty function that penalizes the L1-norm of the coefficients but encourages smoothness of the coefficients on the network.
Results: Simulation studies indicated that the method is quite effective in identifying genes and subnetworks that are related to disease and has higher sensitivity than commonly used procedures that do not use the pathway structure information. Application to a glioblastoma microarray gene-expression dataset identified several subnetworks on several of the Kyoto Encyclopedia of Genes and Genomes (KEGG) transcriptional pathways that are related to survival from glioblastoma, many of which are supported by the published literature.
Conclusions: The proposed network-constrained regularization procedure efficiently utilizes the known pathway structures in identifying the relevant genes and subnetworks that might be related to phenotype in a general regression framework. As more biological networks are identified and documented in databases, the proposed method should find more applications in identifying the subnetworks that are related to diseases and other biological processes.
Contact: hongzhe@mail.med.upenn.edu
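The penalty described in the abstract can be illustrated with a minimal sketch. It assumes the penalty takes the form lam1 * ||beta||_1 + lam2 * beta^T L beta and that L is the symmetric normalized graph Laplacian; the abstract states neither the exact formula nor the normalization, so both are assumptions here, and the function names `normalized_laplacian` and `network_penalty` are hypothetical. The sketch only evaluates the penalty for a given coefficient vector; it does not fit the regression.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} (assumed form).

    Isolated nodes (degree 0) get an all-zero row and column.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    # Diagonal is 1 for connected nodes; off-diagonal is -w_uv / sqrt(d_u * d_v).
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def network_penalty(beta, lap, lam1, lam2):
    """Assumed penalty: lam1 * ||beta||_1 + lam2 * beta^T L beta."""
    beta = np.asarray(beta, dtype=float)
    sparsity = np.abs(beta).sum()   # L1 term: encourages few nonzero coefficients
    smoothness = beta @ lap @ beta  # Laplacian term: small when linked genes have similar scaled coefficients
    return lam1 * sparsity + lam2 * smoothness


# Toy path graph: gene 0 - gene 1 - gene 2 (unweighted edges).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
lap = normalized_laplacian(adj)

smooth = [1.0, np.sqrt(2.0), 1.0]   # degree-scaled coefficients agree along each edge
rough = [1.0, -np.sqrt(2.0), 1.0]   # neighbouring coefficients have opposite signs
print(network_penalty(smooth, lap, lam1=0.0, lam2=1.0))
print(network_penalty(rough, lap, lam1=0.0, lam2=1.0))
```

With this normalization the quadratic term equals the sum over edges of (beta_u / sqrt(d_u) - beta_v / sqrt(d_v))^2, so the first coefficient vector incurs essentially no smoothness penalty while the second is penalized heavily.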
Pages: 1175-1182
Number of pages: 8