A hitchhiker's guide to expressed sequence tag (EST) analysis

Times cited: 173
Authors
Nagaraj, Shivashankar H.
Gasser, Robin B.
Ranganathan, Shoba [1]
Affiliations
[1] Macquarie Univ, Dept Chem & Biomol Sci, N Ryde, NSW 2109, Australia
[2] Univ Melbourne, Dept Vet Sci, Parkville, Vic 3052, Australia
[3] Natl Univ Singapore, Singapore 117548, Singapore
Keywords
expressed sequence tags; sequence assembly and clustering; database similarity searches; functional annotation; conceptual translation; transcriptome analysis
DOI
10.1093/bib/bbl015
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Expressed sequence tag (EST) sequencing projects are underway for numerous organisms, generating millions of short, single-pass nucleotide sequence reads that accumulate in EST databases. Extensive computational strategies have been developed to organize and analyse both small- and large-scale EST data for gene discovery, transcript and single nucleotide polymorphism analysis, and functional annotation of putative gene products. We provide an overview of the significance of ESTs in the genomic era, their properties and their applications. We compare the methods adopted by various research groups for each step of EST analysis and identify the challenges that lie ahead in organizing and analysing the ever-increasing volume of EST data. The most appropriate software tools for EST pre-processing, clustering and assembly, database matching and functional annotation have been compiled (available online from http://biolinfo.org/EST). We propose a road map for EST analysis to accelerate the effective analysis of EST data sets. An investigation of EST analysis platforms reveals that they all terminate prior to downstream functional annotation, including gene ontologies, motif/pattern analysis and pathway mapping.
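As a rough illustration of two of the steps named in the abstract (pre-processing and conceptual translation), the following minimal Python sketch trims a trailing poly-A tail from a read and translates it in all six reading frames. It is not taken from the paper or from any tool it surveys: the function names, the `min_run` threshold of 10 bases and the use of the standard genetic code are all assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def trim_poly_a(seq, min_run=10):
    """Remove a trailing run of A's if it is at least min_run long.

    min_run is a hypothetical threshold chosen for this sketch.
    """
    seq = seq.upper()
    stripped = seq.rstrip("A")
    return stripped if len(seq) - len(stripped) >= min_run else seq


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to conceptual translations."""
    frames = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    read = "ATGGCCAAGTAA" + "A" * 12  # toy read with a poly-A tail
    cleaned = trim_poly_a(read)
    for frame, peptide in six_frame_translations(cleaned).items():
        print(frame, peptide)
```

In practice, dedicated tools of the kind the abstract refers to also handle vector and repeat masking, sequencing errors and frameshifts, none of which this sketch attempts.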
Pages: 6-21
Number of pages: 16