Assessment of the total number of human transcription units

Cited by: 40
Authors
Das, M
Burge, CB
Park, E
Colinas, J
Pelletier, J
Affiliations
[1] McGill Univ, Dept Biochem, Montreal, PQ H3G 1Y6, Canada
[2] McGill Univ, McGill Canc Ctr, Montreal, PQ H3G 1Y6, Canada
[3] MIT, Dept Biol, Cambridge, MA 02139 USA
Funding
National Institutes of Health (USA);
Keywords
human chromosome 22; gene prediction; expressed genes;
DOI
10.1006/geno.2001.6620
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Variation in the estimates of the number of genes encoded by the human genome (28,000-120,000) attests to the difficulty of systematically identifying human genes. Sequencing of human chromosome 22 (Chr22) provided the first comprehensive, unbiased view of an entire human chromosome, and intensive analysis of this sequence identified 545 genes and 134 pseudogenes that had similarity or identity to known proteins and/or ESTs and which were listed in the gene annotation (http://www.sanger.ac.uk/HGP/Chr22). This analysis yielded an estimate of approximately 36,000 functional expressed genes in the human genome (and 9000 pseudogenes). However, a key uncertainty in this estimate was that hundreds of additional genes beyond those annotated in the Chr22 sequence are predicted by the gene prediction program Genscan, an unknown number of which might represent additional expressed genes. To determine what fraction of these "predicted novel genes" (PNGs) represents expressed human genes, we used a sensitive RT-PCR assay to detect predicted transcripts in 17 tissues and one cell line. Our results indicate that at least 5000-9000 additional human genes which lack similarity to known genes or proteins exist in the human genome, increasing baseline gene estimates to approximately 41,000-45,000.
Pages: 71-78
Page count: 8
Related papers
29 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Analysis of EST-driven gene annotation in human genomic sequence
    Bailey, LC
    Searls, DB
    Overton, GC
    [J]. GENOME RESEARCH, 1998, 8 (04) : 362 - 376
  • [3] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
  • [4] STABILITY OF NUCLEAR-RNA IN MAMMALIAN-CELLS
    BRANDHORST, BP
    MCCONKEY, EH
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1974, 85 (03) : 451 - 463
  • [5] Prediction of complete gene structures in human genomic DNA
    Burge, C
    Karlin, S
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1997, 268 (01) : 78 - 94
  • [6] Finding the genes in genomic DNA
    Burge, CB
    Karlin, S
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 1998, 8 (03) : 346 - 354
  • [7] Genome sequence of the nematode C. elegans: A platform for investigating biology
    Unknown
    [J]. SCIENCE, 1998, 282 (5396) : 2012 - 2018
  • [8] Generation of longer cDNA fragments from serial analysis of gene expression tags for gene identification
    Chen, JJ
    Rowley, JD
    Wang, SM
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (01) : 349 - 353
  • [9] Church DM, 1999, METHOD ENZYMOL, V303, P83
  • [10] Dunham I, 2000, YEAST, V17, P218, DOI 10.1002/1097-0061(20000930)17:3<218::AID-YEA37>3.0.CO