Sequence-level population simulations over large genomic regions

Cited by: 81
Authors
Hoggart, Clive J.
Chadeau-Hyam, Marc
Clark, Taane G.
Lampariello, Riccardo
Whittaker, John C.
De Iorio, Maria
Balding, David J.
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Epidemiol & Publ Hlth, London W2 1PG, England
[2] Serono Int, CH-1211 Geneva 20, Switzerland
[3] London Sch Hyg & Trop Med, Noncommunicable Dis Epidemiol Unit, London WC1E 7HT, England
Funding
UK Medical Research Council
DOI
10.1534/genetics.106.069088
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007 ; 090102 ;
Abstract
Simulation is an invaluable tool for investigating the effects of various population genetics modeling assumptions on resulting patterns of genetic diversity, and for assessing the performance of statistical techniques, for example those designed to detect and measure the genomic effects of selection. It is also used to investigate the effectiveness of various design options for genetic association studies. Backward-in-time simulation methods are computationally efficient and have become widely used since their introduction in the 1980s. The forward-in-time approach has substantial advantages in terms of accuracy and modeling flexibility, but at greater computational cost. We have developed flexible and efficient simulation software and a rescaling technique to aid computational efficiency that together allow the simulation of sequence-level data over large genomic regions in entire diploid populations under various scenarios for demography, mutation, selection, and recombination, the latter including hotspots and gene conversion. Our forward evolution of genomic regions (FREGENE) software is freely available from www.ebi.ac.uk/projects/BARGEN together with an ancillary program to generate phenotype labels, either binary or quantitative. In this article we discuss limitations of coalescent-based simulation, introduce the rescaling technique that makes large-scale forward-in-time simulation feasible, and demonstrate the utility of various features of FREGENE, many not previously available.
Pages: 1725-1731
Number of pages: 7