PDA: a pipeline to explore and estimate polymorphism in large DNA databases

Cited by: 13
Authors
Casillas, S [1]
Barbadilla, A [1]
Affiliations
[1] Univ Autonoma Barcelona, Dept Genet & Microbiol, E-08193 Barcelona, Spain
DOI
10.1093/nar/gkh428
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Polymorphism studies are one of the main research areas of this genomic era. To date, however, no available web server or software package has been designed to automate the process of exploring and estimating nucleotide polymorphism in large DNA databases. Here, we introduce a novel software package, PDA (Pipeline Diversity Analysis), that can automatically (i) search for polymorphic sequences in large databases, and (ii) estimate their genetic diversity. PDA is a collection of modules, mainly written in Perl, that work sequentially as follows: unaligned sequences retrieved from a DNA database are automatically classified by organism and gene, and aligned using the ClustalW algorithm. Sequence sets are then regrouped according to their similarity scores. The main diversity parameters, including polymorphism, synonymous and non-synonymous substitutions, linkage disequilibrium and codon bias, are estimated both for the full length of the sequences and for specific functional regions. Program output includes a database with all sequences and estimates, and HTML pages with summary statistics, the alignments performed and a histogram maker tool. PDA is an essential tool for exploring polymorphism in large DNA databases for sequences from different genes, populations or species. It has already been successfully applied to create a secondary database. PDA is available on the web at http://pda.uab.es/.
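The first and last steps of the workflow described in the abstract (classifying sequences by organism and gene, then estimating polymorphism) can be illustrated with a minimal Python sketch. This is not PDA's code: PDA is written mainly in Perl, and its internals are not given here. The function names, the tuple-based input format, and the choice to skip gapped columns are all assumptions made for illustration, and the sequences are assumed to be already aligned (the ClustalW step is omitted).

```python
from collections import defaultdict
from itertools import combinations


def group_by_organism_and_gene(records):
    """Group (organism, gene, sequence) tuples by (organism, gene).

    Hypothetical input format; PDA itself works from database entries.
    """
    groups = defaultdict(list)
    for organism, gene, sequence in records:
        groups[(organism, gene)].append(sequence.upper())
    return dict(groups)


def segregating_sites(aligned):
    """Count alignment columns holding more than one nucleotide.

    Assumption: columns containing a gap ('-') are ignored.
    """
    count = 0
    for column in zip(*aligned):
        if "-" in column:
            continue
        if len(set(column)) > 1:
            count += 1
    return count


def nucleotide_diversity(aligned):
    """Average number of pairwise differences per site (pi).

    Computed over gap-free columns only; returns 0.0 when fewer than
    two sequences or no usable columns are available.
    """
    columns = [c for c in zip(*aligned) if "-" not in c]
    if len(aligned) < 2 or not columns:
        return 0.0
    pairs = list(combinations(range(len(aligned)), 2))
    differences = sum(1 for c in columns for i, j in pairs if c[i] != c[j])
    return differences / (len(pairs) * len(columns))
```

Calling `group_by_organism_and_gene` on a list of records and then passing each group's aligned sequences to `segregating_sites` and `nucleotide_diversity` gives one pair of estimates per organism and gene. Estimates for synonymous and non-synonymous substitutions, linkage disequilibrium and codon bias, which the abstract also lists, would need coding-region annotation and are left out of this sketch.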
Pages: W166-W169
Number of pages: 4