PHAST and RPHAST: phylogenetic analysis with space/time models

Cited by: 324
Authors
Hubisz, Melissa J. [1]
Pollard, Katherine S. [2, 3]
Siepel, Adam [1]
Affiliations
[1] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14853 USA
[2] Univ Calif San Francisco, Gladstone Inst, San Francisco, CA 94143 USA
[3] Univ Calif San Francisco, Div Biostat, San Francisco, CA 94143 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
statistical phylogenetics; functional element identification; MULTIPLE ALIGNMENTS; EVOLUTION; ELEMENTS; DISCOVERY; SOFTWARE; DNA;
DOI
10.1093/bib/bbq072
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The PHylogenetic Analysis with Space/Time models (PHAST) software package consists of a collection of command-line programs and supporting libraries for comparative genomics. PHAST is best known as the engine behind the Conservation tracks in the University of California, Santa Cruz (UCSC) Genome Browser. However, it also includes several other tools for phylogenetic modeling and functional element identification, as well as utilities for manipulating alignments, trees and genomic annotations. PHAST has been in development since 2002 and has now been downloaded more than 1000 times, but so far it has been released only as provisional ('beta') software. Here, we describe the first official release (v1.0) of PHAST, with improved stability, portability and documentation and several new features. We outline the components of the package and detail recent improvements. In addition, we introduce a new interface to the PHAST libraries from the R statistical computing environment, called RPHAST, and illustrate its use in a series of vignettes. We demonstrate that RPHAST can be particularly useful in applications involving both large-scale phylogenomics and complex statistical analyses. The R interface also makes the PHAST libraries accessible to non-C programmers, and is useful for rapid prototyping. PHAST v1.0 and RPHAST v1.0 are available for download at http://compgen.bscb.cornell.edu/phast, under the terms of an unrestrictive BSD-style license. RPHAST can also be obtained from the Comprehensive R Archive Network (CRAN; http://cran.r-project.org).
Pages: 41-51
Page count: 11