Xpro: database of eukaryotic protein-encoding genes

Cited by: 16
Authors
Gopalan, V
Tan, TW
Lee, BTK
Ranganathan, S [1]
Affiliations
[1] Natl Univ Singapore, Dept Biochem, Singapore 119260, Singapore
[2] Natl Univ Singapore, Dept Biol Sci, Singapore 119260, Singapore
DOI
10.1093/nar/gkh051
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Xpro is a relational database that holds all the eukaryotic protein-encoding DNA sequences in GenBank, together with the associated data required for the analysis of eukaryotic gene architecture. In addition to the information found in the GenBank records, which includes properties such as the sequence, position, length and description of introns, exons and protein-coding regions, Xpro provides annotations on the splice sites and intron phases. Furthermore, Xpro validates intron positions using alignments between the record's sequence and EST sequences found in dbEST. Alternative splicing information obtained during this validation is also stored in the database. The intron-containing genes in Xpro are classified as experimental or predicted, based on the intron position validation and on specific keywords in the GenBank records that are characteristic of predicted genes. An Entrez-like query system, which is familiar to most biologists, is provided for accessing the information in the database. A non-redundant set of Xpro database contents is also obtained by cross-referencing to the Swiss-Prot/TrEMBL and Pfam databases. The database currently contains information for 493 983 genes: 351 918 intron-containing genes and 142 065 intron-less genes. Xpro is updated for each new GenBank release and is freely available via the internet at http://origin.bic.nus.edu.sg/xpro.
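The intron phase and splice-site annotations mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common convention that an intron's phase is the number of coding bases preceding it, modulo 3, and that a canonical intron begins with GT and ends with AG. The function names and inputs are hypothetical and are not taken from Xpro itself.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron lying between
    consecutive coding exons, given the coding length of every exon.

    Assumption: phase = (coding bases upstream of the intron) mod 3.
    """
    cumulative = list(accumulate(coding_exon_lengths))
    # The last cumulative total marks the end of the CDS, not an intron.
    return [total % 3 for total in cumulative[:-1]]


def is_canonical_intron(intron_seq):
    """Check for the canonical GT...AG splice-site dinucleotides."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical gene with three coding exons of 100, 50 and 30 bases.
phases = intron_phases([100, 50, 30])
print(phases)
print(is_canonical_intron("gtaagtttcag"))
```

A phase-0 intron falls between two codons, while phase-1 and phase-2 introns interrupt a codon after its first or second base, respectively.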
Pages: D59-D63
Page count: 5
Related papers
24 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
[3]  
Benson DA, 2003, NUCLEIC ACIDS RES, V31, P23, DOI 10.1093/nar/gkg057
[4]   The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003 [J].
Boeckmann, B ;
Bairoch, A ;
Apweiler, R ;
Blatter, MC ;
Estreicher, A ;
Gasteiger, E ;
Martin, MJ ;
Michoud, K ;
O'Donovan, C ;
Phan, I ;
Pilbout, S ;
Schneider, M .
NUCLEIC ACIDS RESEARCH, 2003, 31 (01) :365-370
[5]   DBEST - DATABASE FOR EXPRESSED SEQUENCE TAGS [J].
BOGUSKI, MS ;
LOWE, TMJ ;
TOLSTOSHEV, CM .
NATURE GENETICS, 1993, 4 (04) :332-333
[6]   ISIS, the intron information system, reveals the high frequency of alternative splicing in the human genome [J].
Croft, L ;
Schandorff, S ;
Clark, F ;
Burrage, K ;
Arctander, P ;
Mattick, JS .
NATURE GENETICS, 2000, 24 (04) :340-341
[7]  
DUBOIS P, 2003, MYSQL
[8]   Introns in gene evolution [J].
Fedorova, L ;
Fedorov, A .
GENETICA, 2003, 118 (2-3) :123-131
[9]   ON THE ANCIENT NATURE OF INTRONS [J].
GILBERT, W ;
GLYNIAS, M .
GENE, 1993, 135 (1-2) :137-144
[10]   THE EXON THEORY OF GENES [J].
GILBERT, W .
COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY, 1987, 52 :901-905