The UCSC Known Genes

Cited by: 379
Authors
Hsu, F [1]
Kent, WJ
Clawson, H
Kuhn, RM
Diekhans, M
Haussler, D
Affiliations
[1] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Sch Engn, Santa Cruz, CA 95064 USA
[2] Univ Calif Santa Cruz, Howard Hughes Med Inst, Santa Cruz, CA 95064 USA
DOI
10.1093/bioinformatics/btl048
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
The University of California Santa Cruz (UCSC) Known Genes dataset is constructed by a fully automated process, based on protein data from Swiss-Prot/TrEMBL (UniProt) and the associated mRNA data from GenBank. The detailed steps of this process are described. Extensive cross-references from this dataset to other genomic and proteomic data were constructed. For each known gene, a details page is provided containing rich information about the gene, together with extensive links to other relevant genomic, proteomic and pathway data. As of July 2005, the UCSC Known Genes are available for the human, mouse and rat genomes. The Known Genes dataset serves as a foundation to support several key programs: the Genome Browser, Proteome Browser, Gene Sorter and Table Browser offered at the UCSC website. All the associated data files and program source code are also available. They can be accessed at http://genome.ucsc.edu. The genomic coverage of UCSC Known Genes, RefSeq, Ensembl Genes, H-Invitational and CCDS is analyzed. Although UCSC Known Genes offers the highest genomic and CDS coverage among major human and mouse gene sets, more detailed analysis suggests all of them could be further improved.
Pages: 1036-1046
Page count: 11
Related papers
15 in total
  • [1] The universal protein resource (UniProt)
    Bairoch, A
    Apweiler, R
    Wu, CH
    Barker, WC
    Boeckmann, B
    Ferro, S
    Gasteiger, E
    Huang, HZ
    Lopez, R
    Magrane, M
    Martin, MJ
    Natale, DA
    O'Donovan, C
    Redaschi, N
    Yeh, LSL
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D154 - D159
  • [2] GenBank
    Benson, DA
    Karsch-Mizrachi, I
    Lipman, DJ
    Ostell, J
    Wheeler, DL
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D34 - D38
  • [3] The Protein Data Bank
    Berman, HM
    Westbrook, J
    Feng, Z
    Gilliland, G
    Bhat, TN
    Weissig, H
    Shindyalov, IN
    Bourne, PE
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 235 - 242
  • [4] The Gene Ontology (GO) database and informatics resource
    Harris, MA
    Clark, J
    Ireland, A
    Lomax, J
    Ashburner, M
    Foulger, R
    Eilbeck, K
    Lewis, S
    Marshall, B
    Mungall, C
    Richter, J
    Rubin, GM
    Blake, JA
    Bult, C
    Dolan, M
    Drabkin, H
    Eppig, JT
    Hill, DP
    Ni, L
    Ringwald, M
    Balakrishnan, R
    Cherry, JM
    Christie, KR
    Costanzo, MC
    Dwight, SS
    Engel, S
    Fisk, DG
    Hirschman, JE
    Hong, EL
    Nash, RS
    Sethuraman, A
    Theesfeld, CL
    Botstein, D
    Dolinski, K
    Feierbach, B
    Berardini, T
    Mundodi, S
    Rhee, SY
    Apweiler, R
    Barrell, D
    Camon, E
    Dimmer, E
    Lee, V
    Chisholm, R
    Gaudet, P
    Kibbe, W
    Kishore, R
    Schwarz, EM
    Sternberg, P
    Gwinn, M
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D258 - D261
  • [5] The UCSC Proteome Browser
    Hsu, F
    Pringle, TH
    Kuhn, RM
    Karolchik, D
    Diekhans, M
    Haussler, D
    Kent, WJ
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D454 - D458
  • [6] Integrative annotation of 21,037 human genes validated by full-length cDNA clones
    Imanishi, T
    Itoh, T
    Suzuki, Y
    O'Donovan, C
    Fukuchi, S
    Koyanagi, KO
    Barrero, RA
    Tamura, T
    Yamaguchi-Kabata, Y
    Tanino, M
    Yura, K
    Miyazaki, S
    Ikeo, K
    Homma, K
    Kasprzyk, A
    Nishikawa, T
    Hirakawa, M
    Thierry-Mieg, J
    Thierry-Mieg, D
    Ashurst, J
    Jia, LB
    Nakao, M
    Thomas, MA
    Mulder, N
    Karavidopoulou, Y
    Jin, LH
    Kim, S
    Yasuda, T
    Lenhard, B
    Eveno, E
    Suzuki, Y
    Yamasaki, C
    Takeda, J
    Gough, C
    Hilton, P
    Fujii, Y
    Sakai, H
    Tanaka, S
    Amid, C
    Bellgard, M
    Bonaldo, MD
    Bono, H
    Bromberg, SK
    Brookes, AJ
    Bruford, E
    Carninci, P
    Chelala, C
    Couillault, C
    de Souza, SJ
    Debily, MA
    [J]. PLOS BIOLOGY, 2004, 2 (06) : 856 - 875
  • [7] The UCSC Table Browser data retrieval tool
    Karolchik, D
    Hinrichs, AS
    Furey, TS
    Roskin, KM
    Sugnet, CW
    Haussler, D
    Kent, WJ
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D493 - D496
  • [8] The UCSC Genome Browser Database
    Karolchik, D
    Baertsch, R
    Diekhans, M
    Furey, TS
    Hinrichs, A
    Lu, YT
    Roskin, KM
    Schwartz, M
    Sugnet, CW
    Thomas, DJ
    Weber, RJ
    Haussler, D
    Kent, WJ
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (01) : 51 - 54
  • [9] Exploring relationships and mining data with the UCSC Gene Sorter
    Kent, WJ
    Hsu, F
    Karolchik, D
    Kuhn, RM
    Clawson, H
    Trumbower, H
    Haussler, D
    [J]. GENOME RESEARCH, 2005, 15 (05) : 737 - 741
  • [10] The human genome browser at UCSC
    Kent, WJ
    Sugnet, CW
    Furey, TS
    Roskin, KM
    Pringle, TH
    Zahler, AM
    Haussler, D
    [J]. GENOME RESEARCH, 2002, 12 (06) : 996 - 1006