Power and sample size estimation in high dimensional biology

Cited by: 47
Authors
Gadbury, GL
Page, GP
Edwards, J
Kayo, T
Prolla, TA
Weindruch, R
Permana, PA
Mountz, JD
Allison, DB
Affiliations
[1] Univ Alabama Birmingham, Dept Biostat, Sect Stat Genet, Birmingham, AL 35294 USA
[2] Univ Missouri, Dept Math & Stat, Rolla, MO 65401 USA
[3] Iowa State Univ, USDA ARS, Dept Agron, Ames, IA USA
[4] Univ Wisconsin, Wisconsin Reg Primate Res Ctr, Madison, WI USA
[5] Univ Wisconsin, Dept Genet & Med Genet, Madison, WI USA
[6] Univ Wisconsin, Dept Med, Madison, WI USA
[7] William S Middleton VA Hosp, Ctr Geriatr Res Educ & Clin, Madison, WI USA
[8] NIDDKD, Phoenix Epidemiol & Clin Res Branch, NIH, Phoenix, AZ USA
[9] Univ Alabama Birmingham, Birmingham Vet Adm Med Ctr, Birmingham, AL USA
[10] Univ Alabama Birmingham, Clin Nutr Res Ctr, Birmingham, AL 35294 USA
DOI
10.1191/0962280204sm369ra
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Genomic scientists often test thousands of hypotheses in a single experiment. One example is a microarray experiment that seeks to determine differential gene expression among experimental groups. Planning such experiments involves determining a sample size that will allow meaningful interpretation. Traditional power analysis methods may not be well suited to this task when thousands of hypotheses are tested in discovery-oriented basic research. We introduce the concept of expected discovery rate (EDR) and an approach that combines parametric mixture modelling with parametric bootstrapping to estimate the sample size needed for a desired accuracy of results. While the examples included are derived from microarray studies, the methods herein are 'extraparadigmatic' in their approach to study design and are applicable to most high-dimensional biological situations. Pilot data from three different microarray experiments are used to extrapolate EDR, as well as the related false discovery rate, at different sample sizes and thresholds.
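To illustrate how EDR and the false discovery rate can move with sample size, the following is a minimal Monte Carlo sketch. It is not the authors' mixture-model and parametric-bootstrap procedure. It assumes that EDR can be read as the expected proportion of genes with a true effect that are declared significant, and that the data are unit-variance normal and tested with a two-sided z-test. The function name `simulate_edr_fdr` and all parameter values (number of genes, proportion of true effects, effect size, threshold) are hypothetical choices for illustration only.

```python
import math
import random


def simulate_edr_fdr(n_per_group, n_genes=2000, prop_true=0.2,
                     effect=1.0, alpha=0.05, seed=0):
    """Simulate one two-group experiment and return (EDR, FDR).

    Assumptions (illustrative, not taken from the paper): unit-variance
    normal data, a fixed effect size for every non-null gene, and a
    two-sided z-test applied independently to each gene.
    """
    rng = random.Random(seed)
    # Standard error of the difference between two group means.
    se = math.sqrt(2.0 / n_per_group)
    n_true = int(prop_true * n_genes)
    true_pos = 0   # genes with a real effect that are declared significant
    false_pos = 0  # null genes that are declared significant
    for gene in range(n_genes):
        has_effect = gene < n_true
        delta = effect if has_effect else 0.0
        # Draw the observed mean difference from its sampling distribution.
        diff = rng.gauss(delta, se)
        z = diff / se
        p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
        if p_value < alpha:
            if has_effect:
                true_pos += 1
            else:
                false_pos += 1
    discoveries = true_pos + false_pos
    edr = true_pos / n_true if n_true else float("nan")
    fdr = false_pos / discoveries if discoveries else 0.0
    return edr, fdr


if __name__ == "__main__":
    for n in (5, 10, 20):
        edr, fdr = simulate_edr_fdr(n)
        print(f"n per group = {n:2d}  EDR = {edr:.2f}  FDR = {fdr:.2f}")
```

Under these assumptions, a larger sample size per group raises the simulated EDR and lowers the simulated false discovery rate at a fixed threshold. A planning exercise based on pilot data would replace the fixed effect size and proportion of true effects with estimates from a fitted model.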
Pages: 325-338
Page count: 14