TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets

Cited by: 1572
Authors
Pertea, G [1]
Huang, XQ
Liang, F
Antonescu, V
Sultana, R
Karamycheva, S
Lee, Y
White, J
Cheung, F
Parvizi, B
Tsai, J
Quackenbush, J
Affiliations
[1] Inst Genom Res, Rockville, MD 20850 USA
[2] Iowa State Univ Sci & Technol, Dept Comp Sci, Ames, IA 50011 USA
DOI
10.1093/bioinformatics/btg034
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
TGICL is a pipeline for analyzing large expressed sequence tag (EST) and mRNA databases. Sequences are first clustered based on pairwise sequence similarity; each cluster is then assembled individually (optionally using quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures, including SMP and PVM.
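The cluster-then-assemble idea in the abstract can be illustrated with a minimal sketch. It assumes the pairwise similarity hits have already been computed by some external aligner and simply groups sequences by transitive closure (single linkage) using a union-find structure. The function name `cluster_by_similarity`, the input format, and the choice of single-linkage grouping are illustrative assumptions, not TGICL's actual code or interface; in a real pipeline each resulting cluster would then be passed to an assembler on its own.

```python
from collections import defaultdict


def cluster_by_similarity(seq_ids, similar_pairs):
    """Group sequence ids into clusters by transitive closure of pairwise hits.

    seq_ids: iterable of sequence identifiers (hypothetical input format).
    similar_pairs: iterable of (id_a, id_b) tuples judged similar by an
        external pairwise comparison step (assumed to be precomputed).
    """
    # Every sequence starts as the root of its own cluster.
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every pair reported as similar.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect members under their final root.
    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)

    # Sort for a deterministic result.
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    ids = ["est1", "est2", "est3", "est4", "est5"]
    pairs = [("est1", "est2"), ("est2", "est3"), ("est4", "est5")]
    for cluster in cluster_by_similarity(ids, pairs):
        # Each cluster would be assembled independently in a later step.
        print(cluster)
```

Because clusters are independent of one another once formed, the per-cluster assembly step is naturally parallel, which is consistent with the abstract's note about multi-CPU execution.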
Pages: 651-652
Number of pages: 2