Analysis of stranded information using an automated procedure for strand specific RNA sequencing

Cited by: 20

Authors:

Sigurgeirsson, Benjamin [1]

Emanuelsson, Olof [1]

Lundeberg, Joakim [1]

Affiliations:

[1] Royal Inst Technol KTH, Sci Life Lab, Sch Biotechnol, Tomtebodavagen 23A, S-17165 Stockholm, Sweden

Source:

BMC GENOMICS | 2014 / Vol. 15

Funding:

Swedish Research Council;

Keywords:

GENE-EXPRESSION PATTERNS; OVERLAPPING GENES; ANTISENSE;

DOI:

10.1186/1471-2164-15-631

Chinese Library Classification (CLC) number:

Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];

Subject classification codes:

071005 ; 0836 ; 090102 ; 100705 ;

Abstract:

Background: Strand specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols, the development of bioinformatical tools is steadily progressing. In the current procedure, the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion, which are then sequenced on the Illumina HiSeq 2000. By the use of freely available bioinformatical tools we show, through quality metrics, that the protocol is robust and reproducible. We further highlight the practicality of strand specific libraries by comparing expression of strand specific libraries to non-stranded libraries, by looking at known antisense transcription of pseudogenes and by identifying novel transcription. Furthermore, two ribosomal depletion kits, RiboMinus and RiboZero, and two sequence aligners, TopHat2 and STAR, are compared. Results: The non-stranded Illumina TruSeq kit can be adapted to generate strand specific libraries and can be used to access detailed information on the transcriptome. The RiboZero kit is very effective in removing ribosomal RNA from total RNA, and the STAR aligner produces high mapping yield in a short time. Strand specific data gives more detailed and correct results than non-stranded data, as we show when estimating expression values and assembling transcripts. Even well annotated genomes need improvements and corrections, which can be achieved using strand specific data. Conclusions: Researchers in the field should strive to use strand specific data; it allows for more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analysing non-stranded data, researchers should be well aware of the caveats of that approach. © 2014 Sigurgeirsson et al.; licensee BioMed Central Ltd.

Pages: 13

31 references in total

[1] Comparative analysis of RNA sequencing methods for degraded or low-input samples [J].