Discovery of 342 putative new genes from the analysis of 5′-end-sequenced full-length-enriched cDNA human transcripts

Cited by: 7
Authors
Dalla, E
Mignone, F
Verardo, R
Marchionni, L
Marzinotto, S
Lazarevic, D
Reid, JF
Marzio, R
Klaric, E
Licastro, D
Marcuzzi, G
Gambetta, R
Pierotti, MA
Pesole, G
Schneider, C
Affiliations
[1] Lab Nazl Consorzio Interuniv Biotecnol, I-34012 Trieste, Italy
[2] Univ Milan, Dipartimento Sci Biomol & Biotecnol, I-20133 Milan, Italy
[3] Ist Nazl Tumori, Dipartimento Oncol Sperimentale, I-20133 Milan, Italy
[4] Ist FIRC Oncol Mol, I-20139 Milan, Italy
Keywords
full-length cDNA; human transcriptome; cDNA microarrays; gene expression
DOI
10.1016/j.ygeno.2005.02.009
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5'-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5' ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation. cDNA-microarray gene expression analysis showed that 260 of 342 unknown clones are expressed in at least one cell line and/or tissue. Further analysis of their sequences and the corresponding genomic locations allowed us to conclude that most of them represent potential novel genes, with only a small fraction having protein-coding potential. (c) 2005 Elsevier Inc. All rights reserved.
Pages: 739-751
Page count: 13