Characterizing the mouse ES cell transcriptome with Illumina sequencing

Cited by: 78
Authors
Rosenkranz, Ruben [1 ]
Borodina, Tatiana [1 ]
Lehrach, Hans [1 ]
Himmelbauer, Heinz [1 ]
Affiliations
[1] Max Planck Inst Mol Genet, D-14195 Berlin, Germany
Keywords
gene expression profiling; embryonic stem cells; ultrashort sequence reads; second-generation sequencing;
DOI
10.1016/j.ygeno.2008.05.011
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Large datasets generated by Illumina sequencing are ideally suited to transcriptome characterization. We generated 3,052,501 27-mer reads from F1 mouse embryonic stem (ES) cell cDNA. Using the ELAND alignment tool, 74.5% of reads matched sequenced mouse resources, <1% were contaminants, and 3.7% failed quality control. Of the reads, 21.6% did not match mouse sequences using ELAND, but most of them were successfully aligned with mouse mRNAs using MegaBLAST. We conclude that most of the reads in the dataset are derived from mouse transcripts. A total of 14,434 mouse RefSeq genes were represented by at least 1 read. A Pearson correlation coefficient of 0.7 between Illumina sequencing and Illumina array expression data suggested similar results for both technologies. A weak 3' bias of reads was found. Reads from genes with low expression had lower GC content than the corresponding RefSeq genes, indicating a GC bias. Biases were confirmed with further Illumina read datasets generated with cDNA from mouse brain and from mutagen-treated F1 ES cells. We calculated relative expression values, because transcript length and read number were correlated. In the absence of signal saturation or background noise, we believe that short-read sequencing technologies will have a major impact on gene expression studies in the near future. © 2008 Elsevier Inc. All rights reserved.
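The abstract mentions two calculations without giving formulas: relative expression values that account for transcript length, and a Pearson correlation between sequencing and array data. The sketch below illustrates one plausible way to compute both. The normalization used here (reads per base, rescaled to sum to 1) is an assumption and may differ from what the authors did; the gene names and numbers are hypothetical toy values, not data from the paper.

```python
from math import sqrt


def relative_expression(read_counts, transcript_lengths):
    """Length-normalized share of reads per gene.

    Assumed formula (not taken from the paper): reads per base,
    rescaled so that the values sum to 1 across all genes.
    """
    density = {
        gene: read_counts[gene] / transcript_lengths[gene]
        for gene in read_counts
    }
    total = sum(density.values())
    return {gene: d / total for gene, d in density.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy input: read counts and transcript lengths (bases).
counts = {"geneA": 100, "geneB": 100, "geneC": 50}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}
rel = relative_expression(counts, lengths)

# Hypothetical array intensities for the same genes, in the same order.
genes = sorted(rel)
array_signal = {"geneA": 8.0, "geneB": 5.0, "geneC": 9.0}
r = pearson([rel[g] for g in genes], [array_signal[g] for g in genes])
```

In this toy input geneA and geneB have the same read count, but geneB is twice as long, so its length-normalized value is half that of geneA. This is the effect that dividing by transcript length is meant to capture.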
Pages: 187-194
Number of pages: 8