MultiPipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences

Cited by: 170
Authors
Schwartz, S
Elnitski, L
Li, M
Weirauch, M
Riemer, C
Smit, A
NISC Comparative Sequencing Program
Green, ED
Hardison, RC
Miller, W [1]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, Pond Lab, University Pk, PA 16802 USA
[2] Penn State Univ, Dept Biochem & Mol Biol, University Pk, PA 16802 USA
[3] Penn State Univ, Dept Biol, University Pk, PA 16802 USA
[4] Inst Syst Biol, Seattle, WA USA
[5] NHGRI, Bethesda, MD 20892 USA
DOI
10.1093/nar/gkg579
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs include a stacked set of percent identity plots, called a MultiPip, comparing the reference sequence with subsequent sequences, and a nucleotide-level multiple alignment. New tools are provided to search MultiPipMaker output for conserved matches to a user-specified pattern and for conserved matches to position weight matrices that describe transcription factor binding sites (singly and in clusters). We illustrate the use of MultiPipMaker to identify candidate regulatory regions in WNT2 and then demonstrate by transfection assays that they are functional. Analysis of the alignments also confirms the phylogenetic inference that horses are more closely related to cats than to cows.
Pages: 3518-3524
Page count: 7
Related papers
45 in total
[1]   GAP COSTS FOR MULTIPLE SEQUENCE ALIGNMENT [J].
ALTSCHUL, SF .
JOURNAL OF THEORETICAL BIOLOGY, 1989, 138 (03) :297-309
[2]   Ansari-Lari MA, 1998, GENOME RES, V8, P29
[3]   ReAligner: A program for refining DNA sequence multi-alignments [J].
Anson, EL ;
Myers, EW .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1997, 4 (03) :369-383
[4]   Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome [J].
Berman, BP ;
Nibu, Y ;
Pfeiffer, BD ;
Tomancak, P ;
Celniker, SE ;
Levine, M ;
Rubin, GM ;
Eisen, MB .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (02) :757-762
[5]   LAGAN and Multi-LAGAN: Efficient tools for large-scale multiple alignment of genomic DNA [J].
Brudno, M ;
Do, CB ;
Cooper, GM ;
Kim, MF ;
Davydov, E ;
Green, ED ;
Sidow, A ;
Batzoglou, S .
GENOME RESEARCH, 2003, 13 (04) :721-731
[6]   CHEN QK, 1995, COMPUT APPL BIOSCI, V11, P563
[7]   Chiaromonte F, 2002, Pac Symp Biocomput, P115
[8]   Active conservation of noncoding sequences revealed by three-way species comparisons [J].
Dubchak, I ;
Brudno, M ;
Loots, GG ;
Pachter, L ;
Mayor, C ;
Rubin, EM ;
Frazer, KA .
GENOME RESEARCH, 2000, 10 (09) :1304-1306
[9]   THE STRUCTURE AND EVOLUTION OF THE HUMAN BETA-GLOBIN GENE FAMILY [J].
EFSTRATIADIS, A ;
POSAKONY, JW ;
MANIATIS, T ;
LAWN, RM ;
OCONNELL, C ;
SPRITZ, RA ;
DERIEL, JK ;
FORGET, BG ;
WEISSMAN, SM ;
SLIGHTOM, JL ;
BLECHL, AE ;
SMITHIES, O ;
BARALLE, FE ;
SHOULDERS, CC ;
PROUDFOOT, NJ .
CELL, 1980, 21 (03) :653-668
[10]   Distinguishing regulatory DNA from neutral sites [J].
Elnitski, L ;
Hardison, RC ;
Li, J ;
Yang, S ;
Kolbe, D ;
Eswara, P ;
O'Connor, MJ ;
Schwartz, S ;
Miller, W ;
Chiaromonte, F .
GENOME RESEARCH, 2003, 13 (01) :64-72