Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts

Cited by: 7
Authors
Dalla, E
Mignone, F
Verardo, R
Marchionni, L
Marzinotto, S
Lazarevic, D
Reid, JF
Marzio, R
Klaric, E
Licastro, D
Marcuzzi, G
Gambetta, R
Pierotti, MA
Pesole, G
Schneider, C
Affiliations
[1] Lab Nazl Consorzio Interuniv Biotecnol, I-34012 Trieste, Italy
[2] Univ Milan, Dipartimento Sci Biomol & Biotecnol, I-20133 Milan, Italy
[3] Ist Nazl Tumori, Dipartimento Oncol Sperimentale, I-20133 Milan, Italy
[4] Ist FIRC Oncol Mol, I-20139 Milan, Italy
Keywords
full-length cDNA; human transcriptome; cDNA microarrays; gene expression;
DOI
10.1016/j.ygeno.2005.02.009
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5'-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5' ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential. © 2005 Elsevier Inc. All rights reserved.
Pages: 739-751
Page count: 13