Targeting a complex transcriptome: The construction of the mouse full-length cDNA encyclopedia

Cited by: 127
Authors
Carninci, P
Waki, K
Shiraki, T
Konno, H
Shibata, K
Itoh, M
Aizawa, K
Arakawa, T
Ishii, Y
Sasaki, D
Bono, H
Kondo, S
Sugahara, Y
Saito, R
Osato, N
Fukuda, S
Sato, K
Watahiki, A
Hirozane-Kishikawa, T
Nakamura, M
Shibata, Y
Yasunishi, A
Kikuchi, N
Yoshiki, A
Kusakabe, M
Gustincich, S
Beisel, K
Pavan, W
Aidinis, V
Nakagawara, A
Held, WA
Iwata, H
Kono, T
Nakauchi, H
Lyons, P
Wells, C
Hume, DA
Fagiolini, M
Hensch, TK
Brinkmeier, M
Camper, S
Hirota, J
Mombaerts, P
Muramatsu, M
Okazaki, Y
Kawai, J
Hayashizaki, Y
Affiliations
[1] RIKEN, Yokohama Inst, GSC, Lab Genome Explorat Res Grp,Tsurumi Ku, Yokohama, Kanagawa 2300045, Japan
[2] RIKEN, Genome Sci Lab, Wako, Saitama 3510198, Japan
[3] Univ Tsukuba, Inst Basic Med Sci, Tsukuba, Ibaraki 3058577, Japan
[4] Yokohama City Univ, Grad Sch Integrated Sci, Japan Div Genom Informat Resources, Tsurumi Ku, Yokohama, Kanagawa 2300045, Japan
[5] RIKEN, Tsukuba Inst, Biogen Resources Ctr, Expt Anim Res Div, Tsukuba, Ibaraki 3050074, Japan
[6] Dnaform Int Inc, Ami, Ibaraki 3000332, Japan
[7] Aloka Co Ltd, Kasumigaura Cho, Ibaraki 3000134, Japan
[8] Harvard Univ, Sch Med, Dept Neurobiol, Boston, MA 02115 USA
[9] Boys Town Natl Res Hosp, Omaha, NE 68131 USA
[10] NHGRI, NIH, Bethesda, MD 20892 USA
[11] Fleming, Biomed Sci Res Ctr A1, Inst Immunol, Vari 16672, Greece
[12] Chiba Canc Ctr, Inst Res, Div Biochem, Chuo Ku, Chiba 2608717, Japan
[13] Roswell Pk Canc Inst, Buffalo, NY 14263 USA
[14] Kyoto Univ, Inst Frontier Med Sci, Dept Reparat Mat Field Tissue Engn, Sakyo Ku, Kyoto 6068507, Japan
[15] Tokyo Univ Agr, Fac Appl Biosci, Dept Biosci, Setagaya Ku, Tokyo 1568502, Japan
[16] Univ Tokyo, Inst Med Sci, Ctr Med Expt, Lab Stem Cell Therapy,Minato Ku, Tokyo 1088639, Japan
[17] Cambridge Inst Med Res, Diabet & Inflammat Lab DRF WT, Cambridge CB2 2XY, England
[18] Univ Queensland, Inst Mol Biosci, St Lucia, Qld 4072, Australia
[19] RIKEN, BSI, Lab Neuronal Circuit Dev, Neuronal Funct Res Lab, Wako, Saitama 3000198, Japan
[20] Univ Michigan Med, Ann Arbor, MI 48109 USA
[21] Rockefeller Univ, New York, NY 10021 USA
DOI
10.1101/gr.1119703
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We report the construction of the mouse full-length cDNA encyclopedia, the most extensive view of a complex transcriptome, on the basis of preparing and sequencing 246 libraries. Before cloning, cDNAs were enriched in full-length by Cap-Trapper, and in most cases, aggressively subtracted/normalized. We have produced 1,442,236 successful 3'-end sequences clustered into 171,144 groups, from which 60,770 clones were fully sequenced cDNAs annotated in the FANTOM-2 annotation. We have also produced 547,149 5'-end reads, which clustered into 124,258 groups. Altogether, these cDNAs were further grouped in 70,000 transcriptional units (TU), which represent the best coverage of a transcriptome so far. By monitoring the extent of normalization/subtraction, we define the tentative equivalent coverage (TEC), which was estimated to be equivalent to >12,000,000 ESTs derived from standard libraries. High coverage explains discrepancies between the very large numbers of clusters (and TUs) of this project, which also include non-protein-coding RNAs, and the lower gene number estimation of genome annotations. Altogether, 5'-end clusters identify regions that are potential promoters for 8637 known genes and 5'-end clusters suggest the presence of almost 63,000 transcriptional starting points. An estimate of the frequency of polyadenylation signals suggests that at least half of the singletons in the EST set represent real mRNAs. Clones accounting for about half of the predicted TUs await further sequencing. The continued high-discovery rate suggests that the task of transcriptome discovery is not yet complete.
Pages: 1273 - 1289
Page count: 17
Related papers
64 in total
  • [1] ADAMS MD, 1995, NATURE, V377, P3
  • [2] COMPLEMENTARY-DNA SEQUENCING - EXPRESSED SEQUENCE TAGS AND HUMAN GENOME PROJECT
    ADAMS, MD
    KELLEY, JM
    GOCAYNE, JD
    DUBNICK, M
    POLYMEROPOULOS, MH
    XIAO, H
    MERRIL, CR
    WU, A
    OLDE, B
    MORENO, RF
    KERLAVAGE, AR
    MCCOMBIE, WR
    VENTER, JC
    [J]. SCIENCE, 1991, 252 (5013) : 1651 - 1656
  • [3] How to count ... human genes
    Aparicio, SAJR
    [J]. NATURE GENETICS, 2000, 25 (02) : 129 - 130
  • [4] cDNA detection and analysis
    Bashiardes, S
    Lovett, M
    [J]. CURRENT OPINION IN CHEMICAL BIOLOGY, 2001, 5 (01) : 15 - 20
  • [5] Patterns of variant polyadenylation signal usage in human genes
    Beaudoing, E
    Freier, S
    Wyatt, JR
    Claverie, JM
    Gautheret, D
    [J]. GENOME RESEARCH, 2000, 10 (07) : 1001 - 1010
  • [6] Normalization and subtraction: Two approaches to facilitate gene discovery
    Bonaldo, MDF
    Lennon, G
    Soares, MB
    [J]. GENOME RESEARCH, 1996, 6 (09) : 791 - 806
  • [7] BONO H, 2003, GENOME RES, V13
  • [8] An anatomy of normal and malignant gene expression
    Boon, K
    Osório, EC
    Greenhut, SF
    Schaefer, CF
    Shoemaker, J
    Polyak, K
    Morin, PJ
    Buetow, KH
    Strausberg, RL
    de Souza, SJ
    Riggins, GJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (17) : 11287 - 11292
  • [9] BOWTELL D, 2002, DNA MICROARRAYS MOL
  • [10] The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome
    Camargo, AA
    Samaia, HPB
    Dias-Neto, E
    Simao, DF
    Migotto, IA
    Briones, MRS
    Costa, FF
    Nagai, MA
    Verjovski-Almeida, S
    Zago, MA
    Andrade, LEC
    Carrer, H
    El-Dorry, HFA
    Espreafico, EM
    Habr-Gama, A
    Giannella-Neto, D
    Goldman, GH
    Gruber, A
    Hackel, C
    Kimura, ET
    Maciel, RMB
    Marie, SKN
    Martins, EAL
    Nóbrega, MP
    Paçó-Larson, ML
    Pardini, MIMC
    Pereira, GG
    Pesquero, JB
    Rodrigues, V
    Rogatto, SR
    da Silva, IDCG
    Sogayar, MC
    Sonati, MDF
    Tajara, EH
    Valentini, SR
    Alberto, FL
    Amaral, MEJ
    Aneas, I
    Arnaldi, LAT
    de Assis, AM
    Bengtson, MH
    Bergamo, NA
    Bombonato, V
    de Camargo, MER
    Canevari, RA
    Carraro, DM
    Cerutti, JM
    Corrêa, MLC
    Corrêa, RFR
    Costa, MCR
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2001, 98 (21) : 12103 - 12108