Statistical analysis of big data on pharmacogenomics

Cited by: 41

Authors:

Fan, Jianqing [1]

Liu, Han [1]

Affiliations:

[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA

Source:

ADVANCED DRUG DELIVERY REVIEWS | 2013 / Vol. 65 / No. 07

Funding:

National Science Foundation (USA);

Keywords:

Big data; High dimensional statistics; Approximate factor model; Graphical model; Multiple testing; Variable selection; Marginal screening; Robust statistics; NONCONCAVE PENALIZED LIKELIHOOD; COVARIANCE-MATRIX ESTIMATION; FALSE DISCOVERY RATE; GENERALIZED LINEAR-MODELS; VARIABLE SELECTION; THRESHOLDING ALGORITHM; REGULARIZATION; CLASSIFICATION; REGRESSION; SHRINKAGE;

DOI:

10.1016/j.addr.2013.04.008

Chinese Library Classification number:

R9 [Pharmacy];

Discipline classification code:

1007 ;

Abstract:

This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. (C) 2013 Elsevier B.V. All rights reserved.

Pages: 987-1000

Page count: 14

