Analysis of stranded information using an automated procedure for strand specific RNA sequencing

Cited by: 20
Authors
Sigurgeirsson, Benjamin [1]
Emanuelsson, Olof [1]
Lundeberg, Joakim [1]
Affiliation
[1] Royal Inst Technol KTH, Sci Life Lab, Sch Biotechnol, Tomtebodavagen 23A, S-17165 Stockholm, Sweden
Source
BMC GENOMICS | 2014 / Vol. 15
Funding
Swedish Research Council
Keywords
GENE-EXPRESSION PATTERNS; OVERLAPPING GENES; ANTISENSE;
DOI
10.1186/1471-2164-15-631
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Strand specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols, the development of bioinformatical tools is steadily progressing. In the current procedure, the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion, which are then sequenced on an Illumina HiSeq 2000. Using freely available bioinformatical tools, we show through quality metrics that the protocol is robust and reproducible. We further highlight the practicality of strand specific libraries by comparing expression of strand specific libraries to non-stranded libraries, by looking at known antisense transcription of pseudogenes, and by identifying novel transcription. Furthermore, two ribosomal depletion kits, RiboMinus and RiboZero, are compared, as are two sequence aligners, Tophat2 and STAR.
Results: The non-stranded Illumina TruSeq kit can be adapted to generate strand specific libraries and can be used to access detailed information on the transcriptome. The RiboZero kit is very effective in removing ribosomal RNA from total RNA, and the STAR aligner produces high mapping yield in a short time. Strand specific data gives more detailed and correct results than non-stranded data, as we show when estimating expression values and assembling transcripts. Even well annotated genomes need improvements and corrections, which can be achieved using strand specific data.
Conclusions: Researchers in the field should strive to use strand specific data; it allows for more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analysing non-stranded data, researchers should be well aware of the caveats of that approach. © 2014 Sigurgeirsson et al.; licensee BioMed Central Ltd.
Pages: 13