PseudoPipe: an automated pseudogene identification pipeline
Cited by: 131
Authors:
Zhang, Zhaolei
Carriero, Nicholas
Zheng, Deyou
Karro, John
Harrison, Paul M.
Gerstein, Mark [1]
Affiliations:
[1] Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Univ Toronto, Donnelly CCBR, Banting & Best Dept Med Res, Toronto, ON M5S 3E1, Canada
[3] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
[4] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[5] McGill Univ, Dept Biol, Montreal, PQ H3A 1B1, Canada
Keywords:
DOI: 10.1093/bioinformatics/btl116
Chinese Library Classification number:
Q5 [Biochemistry];
Discipline classification codes:
071010 ;
081704 ;
Abstract:
Motivation: Mammalian genomes contain many 'genomic fossils', i.e. pseudogenes. These are disabled copies of functional genes that have been retained in the genome by gene duplication or retrotransposition events. Pseudogenes are important resources in understanding the evolutionary history of genes and genomes.
Results: We have developed a homology-based computational pipeline ('PseudoPipe') that can search a mammalian genome and identify pseudogene sequences in a comprehensive and consistent manner. The key steps in the pipeline involve using BLAST to rapidly cross-reference potential 'parent' proteins against the intergenic regions of the genome and then processing the resulting 'raw hits', i.e. eliminating redundant ones, clustering together neighbors, and associating and aligning clusters with a unique parent. Finally, pseudogenes are classified based on a combination of criteria including homology, intron-exon structure, and existence of stop codons and frameshifts.
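The raw-hit processing described in the abstract (removing redundant hits, clustering neighbors, and assigning each cluster a unique parent) can be illustrated with a minimal Python sketch. This is not PseudoPipe's actual code: the `Hit` and `Cluster` records, the `cluster_hits` function, and the 60 bp `max_gap` threshold are hypothetical, and picking the lowest e-value hit as the parent is an assumed rule chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One raw BLAST hit of a parent protein against the genome (hypothetical record)."""
    chrom: str
    strand: str
    start: int
    end: int
    parent: str
    evalue: float


@dataclass
class Cluster:
    """A merged genomic interval with a single assigned parent (hypothetical record)."""
    chrom: str
    strand: str
    start: int
    end: int
    parent: str
    evalue: float


def cluster_hits(hits: List[Hit], max_gap: int = 60) -> List[Cluster]:
    """Merge overlapping or nearby hits on the same chromosome and strand.

    max_gap is an assumed threshold, not a value taken from the paper.
    Each cluster keeps the parent of its lowest e-value hit.
    """
    clusters: List[Cluster] = []
    ordered = sorted(hits, key=lambda h: (h.chrom, h.strand, h.start, h.end))
    for h in ordered:
        last = clusters[-1] if clusters else None
        same_locus = (
            last is not None
            and last.chrom == h.chrom
            and last.strand == h.strand
            and h.start - last.end <= max_gap
        )
        if same_locus:
            # Overlapping (redundant) or neighboring hit: extend the cluster.
            last.end = max(last.end, h.end)
            if h.evalue < last.evalue:
                last.parent, last.evalue = h.parent, h.evalue
        else:
            clusters.append(
                Cluster(h.chrom, h.strand, h.start, h.end, h.parent, h.evalue)
            )
    return clusters
```

In this sketch, two overlapping hits from different candidate parents collapse into one interval, and the interval is attributed to whichever parent produced the stronger hit. The alignment and classification steps mentioned in the abstract are not modeled here.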
Pages: 1437-1439
Page count: 3