Neutrality Tests for Sequences with Missing Data

Cited by: 28
Authors
Ferretti, Luca [1]
Raineri, Emanuele [2]
Ramos-Onsins, Sebastian [1]
Affiliations
[1] Ctr Res Agr Genom, Bellaterra 08193, Spain
[2] Ctr Nacl Anal Genom, Barcelona 08028, Spain
Keywords
STATISTICAL TESTS; SEGREGATING SITES; DNA POLYMORPHISM; NEXT-GENERATION; SELECTION; HITCHHIKING; MUTATIONS; DIVERSITY; SAMPLES;
DOI
10.1534/genetics.112.139949
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Missing data are common in DNA sequences obtained through high-throughput sequencing. Furthermore, samples of low quality or problems in the experimental protocol often cause a loss of data even with traditional sequencing technologies. Here we propose modified estimators of variability and neutrality tests that can be naturally applied to sequences with missing data, without the need to remove bases or individuals from the analysis. Modified statistics include the Watterson estimator theta(W), Tajima's D, Fay and Wu's H, and HKA. We develop a general framework to take missing data into account in frequency spectrum-based neutrality tests and we derive the exact expression for the variance of these statistics under the neutral model. The neutrality tests proposed here can also be used as summary statistics to describe the information contained in other classes of data like DNA microarrays.
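The idea of keeping every base rather than discarding incomplete sites can be illustrated with a small sketch for the Watterson estimator. This record gives no formulas, so the form used here (segregating sites divided by the sum over sites of the harmonic number for the sequences actually observed at each site) is an assumption about one natural way to do it, not a confirmed transcription of the authors' estimator. The function names and the use of "N" as the missing-data symbol are hypothetical.

```python
from typing import Sequence

MISSING = "N"  # assumed symbol for a missing base (hypothetical convention)


def harmonic(n: int) -> float:
    """Return a_n = sum_{j=1}^{n-1} 1/j; this is 0.0 when n < 2."""
    return sum(1.0 / j for j in range(1, n))


def watterson_theta_missing(alignment: Sequence[str]) -> float:
    """Per-site Watterson-style estimate that tolerates missing bases.

    Each site contributes the harmonic number for the sequences observed
    at that site, so no site or individual has to be dropped.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")

    segregating = 0
    denominator = 0.0
    for pos in range(length):
        # Keep only the bases that were actually observed at this site.
        observed = [seq[pos] for seq in alignment if seq[pos] != MISSING]
        denominator += harmonic(len(observed))
        if len(set(observed)) > 1:
            segregating += 1

    if denominator == 0.0:
        raise ValueError("no site has at least two observed sequences")
    return segregating / denominator


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "NCGA", "ACNT"]
    print(watterson_theta_missing(sample))
```

With complete data every site has the same sample size n, and the expression reduces to the usual S / (L a_n) per-site estimate.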
Pages: 1397 / U511
Number of pages: 12
Related papers
20 in total
[1]   Testing for neutrality in samples with sequencing errors [J].
Achaz, Guillaume .
GENETICS, 2008, 179 (03) :1409-1424
[2]   Frequency Spectrum Neutrality Tests: One for All and All for One [J].
Achaz, Guillaume .
GENETICS, 2009, 183 (01) :249-258
[3]  
Fay JC, 2000, GENETICS, V155, P1405
[4]   Optimal Neutrality Tests Based on the Frequency Spectrum [J].
Ferretti, Luca ;
Perez-Enciso, Miguel ;
Ramos-Onsins, Sebastian .
GENETICS, 2010, 186 (01) :353-U562
[5]   STATISTICAL PROPERTIES OF SEGREGATING SITES [J].
FU, YX .
THEORETICAL POPULATION BIOLOGY, 1995, 48 (02) :172-197
[6]  
FU YX, 1993, GENETICS, V133, P693
[7]  
Fu YX, 1997, GENETICS, V147, P915
[8]   The Next Generation of Molecular Markers From Massively Parallel Sequencing of Pooled DNA Samples [J].
Futschik, Andreas ;
Schloetterer, Christian .
GENETICS, 2010, 186 (01) :207-218
[9]   Population genetic analysis of shotgun assemblies of genomic sequences from multiple individuals [J].
Hellmann, Ines ;
Mang, Yuan ;
Gu, Zhiping ;
Li, Peter ;
de la Vega, Francisco M. ;
Clark, Andrew G. ;
Nielsen, Rasmus .
GENOME RESEARCH, 2008, 18 (07) :1020-1029
[10]  
HUDSON RR, 1987, GENETICS, V116, P153