ANGSD: Analysis of Next Generation Sequencing Data

Cited by: 1851
Authors
Korneliussen, Thorfinn Sand [1]
Albrechtsen, Anders [2]
Nielsen, Rasmus [1, 3]
Affiliations
[1] Nat Hist Museum Denmark, Ctr GeoGenet, Copenhagen, Denmark
[2] Univ Copenhagen, Bioinformat Ctr, Dept Biol, DK-2200 Copenhagen, Denmark
[3] Univ Calif Berkeley, Dept Integrat Biol & Stat, Berkeley, CA 94720 USA
Source
BMC BIOINFORMATICS | 2014 / Volume 15
Funding
National Institutes of Health (USA); National Research Foundation, Singapore;
Keywords
Next-generation sequencing; Bioinformatics; Population genetics; Association studies; SHORT READ ALIGNMENT; GENOME SEQUENCE; NGS DATA; ASSOCIATION; GENOTYPE; ACCURATE; HISTORY; STATISTICS; DISCOVERY; FRAMEWORK;
DOI
10.1186/s12859-014-0356-4
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory-efficient implementations are needed to facilitate analyses of thousands of samples simultaneously. Results: We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics and perform association mapping and population genetic analyses, utilizing the full information in next-generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods. Conclusions: The open-source C/C++ program ANGSD is available at http://www.popgen.dk/angsd. The program is tested and validated on GNU/Linux systems. The program supports multiple input formats, including BAM and imputed Beagle genotype probability files. The program allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere.
Pages: 13