HUGE: a database for human large proteins identified in the Kazusa cDNA sequencing project

Cited by: 79
Authors
Kikuno, R [1 ]
Nagase, T [1 ]
Waki, M [1 ]
Ohara, O [1 ]
Affiliations
[1] Kazusa DNA Res Inst, Chiba 2920812, Japan
DOI
10.1093/nar/30.1.166
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
We have been developing a HUGE database to summarize results from the sequence analysis of novel large (>4 kb) human cDNAs identified in the Kazusa cDNA sequencing project, systematically designated KIAA plus a four-digit number. HUGE currently contains nearly 2000 gene/protein characteristic tables harboring the results of the computer-assisted analysis of the cDNA and the predicted protein sequences together with those of expression profiling and chromosomal mapping. In the updated version of HUGE, we made it possible to compare each KIAA cDNA sequence with the corresponding entry in the human draft genome sequence that was published recently. Approximately 90% of KIAA cDNAs in HUGE can be localized along the human genome for at least half of the cDNA's length. Any nucleotide differences between the cDNA and the corresponding genomic sequences are also presented in detail. This new version of HUGE greatly helps us evaluate the completeness of cDNA clones and the accuracy of cDNA/genomic sequences. More interestingly, in some cases, the ability to compare cDNA with genomic sequences allows us to identify candidate sites of RNA editing. HUGE is available on the World Wide Web at http://www.kazusa.or.jp/huge.
Pages: 166-168
Page count: 3