GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

Cited by: 40
Authors
Cumbie, Jason S. [1 ,4 ]
Kimbrel, Jeffrey A. [1 ,4 ]
Di, Yanming [2 ]
Schafer, Daniel W. [2 ]
Wilhelm, Larry J. [1 ]
Fox, Samuel E. [1 ,4 ]
Sullivan, Christopher M. [1 ,3 ]
Curzon, Aron D. [1 ]
Carrington, James C. [1 ,4 ]
Mockler, Todd C. [1 ,3 ,4 ]
Chang, Jeff H. [1 ,3 ,4 ]
Affiliations
[1] Oregon State Univ, Dept Bot & Plant Pathol, Corvallis, OR 97331 USA
[2] Oregon State Univ, Dept Stat, Corvallis, OR 97331 USA
[3] Oregon State Univ, Ctr Genome Res & Biocomp, Corvallis, OR 97331 USA
[4] Oregon State Univ, Mol & Cellular Biol Program, Corvallis, OR 97331 USA
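Source
PLOS ONE | 2011 / Vol. 6 / Issue 10
Funding
National Institute of Food and Agriculture (USA); National Science Foundation (USA); National Institutes of Health (USA);
Keywords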
DEFENSE RESPONSES; TRANSCRIPTOME; ARABIDOPSIS; SYRINGAE; INTERPLAY; TOMATO; TOOL; QUANTIFICATION; NORMALIZATION; ULTRAFAST;
DOI
10.1371/journal.pone.0025279
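Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code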
07 ; 0710 ; 09 ;
Abstract
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
Pages: 11