Detecting Alternative Gene Structures from Spliced ESTs: A Computational Approach

Cited by: 16
Authors
Bonizzoni, Paola [1]
Mauri, Giancarlo [1]
Pesole, Graziano [2]
Picardi, Ernesto [2]
Pirola, Yuri [1]
Rizzi, Raffaella [1]
Affiliations
[1] Univ Milano Bicocca, Dipartimento Informat Sistemist & Comunicaz, I-20126 Milan, Italy
[2] Univ Bari, Dipartimento Biochim & Biol Mol, Bari, Italy
Keywords
algorithms; alignment; alternative splicing; MAXSNP-hardness; structure prediction; genome; database
DOI
10.1089/cmb.2008.0028
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Alternative splicing (AS) is currently considered one of the main mechanisms able to explain the large gap between the number of predicted genes and the high complexity of the human proteome. The rapid growth of Expressed Sequence Tag (EST) data has encouraged the development of computational methods that predict alternative splicing by analyzing EST alignments to genomic sequences. EST data are also a valuable source for reconstructing the different transcript isoforms that derive from the same gene structure as a consequence of AS, since EST sequences are obtained by fragmenting mRNAs from the same gene. Recent studies on alternative splice site detection have shown that this is a challenging computational problem that remains far from solved. In this paper, we investigate the main computational issues in detecting alternative splicing and analyze algorithmic solutions. We first formalize an optimization problem related to the prediction of constitutive and alternative splice sites from EST sequences, the Minimum Exons ESTs Factorization problem (MEF), and show that it is NP-hard, even for restricted instances. This problem leads us to define sets of spliced ESTs, that is, sets of ESTs factorized into their constitutive exons with respect to a gene. We then investigate the computational problem of predicting transcript isoforms from spliced EST sequences and propose a graph algorithm for it that is linear in the number of predicted isoforms and the size of the graph. Finally, an experimental analysis of the method is performed to assess the reliability of the predictions.
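To make the isoform prediction step concrete, the following is a minimal sketch of how transcript isoforms might be read off a graph built from spliced ESTs. It is an illustration only and not the algorithm of the paper: the function names `build_splicing_graph` and `enumerate_isoforms`, the representation of an exon as a `(start, end)` coordinate pair, and the rule that an isoform is any path from an exon with no predecessor to an exon with no successor are all assumptions made here. The sketch also assumes that every spliced EST lists its exons in increasing genomic order, so the graph has no cycles, and it makes no claim to match the linear running time stated in the abstract.

```python
from collections import defaultdict


def build_splicing_graph(spliced_ests):
    """Build a directed graph from spliced ESTs (hypothetical helper).

    Each spliced EST is a list of exons, each exon a (start, end) tuple,
    listed in increasing genomic order. Nodes are the distinct exons;
    an edge joins two exons that are consecutive in at least one EST.
    """
    nodes = set()
    edges = defaultdict(set)
    for est in spliced_ests:
        nodes.update(est)
        for left, right in zip(est, est[1:]):
            edges[left].add(right)
    return nodes, edges


def enumerate_isoforms(nodes, edges):
    """List every path from a source exon to a sink exon (naive enumeration)."""
    with_predecessor = {v for targets in edges.values() for v in targets}
    sources = sorted(n for n in nodes if n not in with_predecessor)
    isoforms = []

    def extend(path):
        # .get avoids creating empty entries in the defaultdict
        successors = sorted(edges.get(path[-1], ()))
        if not successors:
            isoforms.append(list(path))
            return
        for nxt in successors:
            path.append(nxt)
            extend(path)
            path.pop()

    for source in sources:
        extend([source])
    return isoforms


# Made-up example: the second EST skips the middle exon.
example_ests = [
    [(1, 100), (201, 300), (401, 500)],
    [(1, 100), (401, 500)],
    [(201, 300), (401, 500)],
]
example_nodes, example_edges = build_splicing_graph(example_ests)
example_isoforms = enumerate_isoforms(example_nodes, example_edges)
```

On this made-up input the sketch yields two candidate isoforms, one containing all three exons and one that skips the middle exon. A real pipeline would also have to handle alternative splice sites within an exon and noisy EST alignments, which this sketch ignores.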
Pages: 43-66
Page count: 24