ECgene: Genome-based EST clustering and gene modeling for alternative splicing

Cited by: 66
Authors
Kim, N
Shin, S
Lee, S [1]
Affiliations
[1] Ewha Womans Univ, Div Mol Life Sci, Seoul 120750, South Korea
[2] Seoul Natl Univ, Sch Chem, Seoul 151747, South Korea
Keywords
DOI
10.1101/gr.3030405
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
With the availability of the human genome map and fast algorithms for sequence alignment, genome-based EST clustering became a viable method for gene modeling. We developed a novel gene-modeling method, ECgene (Gene modeling by EST Clustering), which combines genome-based EST clustering and the transcript assembly procedure in a coherent and consistent fashion. Specifically, ECgene takes alternative splicing events into consideration. The position of splice sites (i.e., exon-intron boundaries) in the genome map is utilized as the critical information in the whole procedure. Sequences that share any splice sites are grouped together to define an EST cluster in a manner similar to that of the genome-based version of the UniGene algorithm. Transcript assembly is achieved using graph theory that represents the exon connectivity in each cluster as a directed acyclic graph (DAG). Distinct paths along exons correspond to possible gene models encompassing all alternative splicing events. EST sequences in each cluster are subclustered further according to the compatibility with gene structure of each splice variant, and they can be regarded as clone evidence for the corresponding isoform. The reliability of each isoform is assessed from the nature of cluster members and from the minimum number of clones required to reconstruct all exons in the transcript.
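The two core steps described in the abstract (grouping sequences that share a splice site, then reading gene models off a directed acyclic graph of exon connections) can be illustrated with a minimal sketch. This is not the ECgene implementation: the function names, the toy coordinates, and the assumption that every aligned sequence reports identical exon boundaries are all hypothetical simplifications chosen for illustration. Real EST alignments have ragged terminal exons that would need extra handling.

```python
from collections import defaultdict

# Hypothetical toy alignments: sequence name -> ordered exon (start, end) tuples.
ALIGNMENTS = {
    "est1": [(100, 200), (300, 400), (500, 600)],
    "est2": [(100, 200), (500, 600)],            # skips the middle exon
    "est3": [(300, 400), (500, 600)],
    "est4": [(1000, 1100), (1200, 1300)],        # unrelated locus
}


def splice_sites(exons):
    """Return the exon-intron boundaries implied by consecutive exons."""
    sites = set()
    for (_, end), (start, _) in zip(exons, exons[1:]):
        sites.add(("donor", end))
        sites.add(("acceptor", start))
    return sites


def cluster_by_shared_sites(alignments):
    """Group sequences that share any splice site (union-find)."""
    parent = {name: name for name in alignments}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_owner = {}  # splice site -> first sequence seen with it
    for name, exons in alignments.items():
        for site in splice_sites(exons):
            if site in first_owner:
                parent[find(name)] = find(first_owner[site])
            else:
                first_owner[site] = name

    clusters = defaultdict(set)
    for name in alignments:
        clusters[find(name)].add(name)
    return sorted(sorted(members) for members in clusters.values())


def exon_paths(alignments, members):
    """Enumerate source-to-sink paths in the exon connectivity DAG."""
    successors = defaultdict(set)
    nodes, has_incoming = set(), set()
    for name in members:
        exons = alignments[name]
        nodes.update(exons)
        for a, b in zip(exons, exons[1:]):
            successors[a].add(b)
            has_incoming.add(b)

    paths = []

    def walk(path):
        nexts = successors.get(path[-1])
        if not nexts:              # sink exon: one candidate gene model
            paths.append(path)
            return
        for nxt in sorted(nexts):
            walk(path + [nxt])

    for source in sorted(nodes - has_incoming):
        walk([source])
    return paths


clusters = cluster_by_shared_sites(ALIGNMENTS)
models = exon_paths(ALIGNMENTS, clusters[0])
```

In this toy input, the first cluster yields two paths, one including the middle exon and one skipping it, which corresponds to an exon-skipping event. Because genomic coordinates increase along every edge, the graph is acyclic and the recursive walk terminates.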
Pages: 566-576
Number of pages: 11