Structural annotation of equine protein-coding genes determined by mRNA sequencing

Cited by: 39
Authors
Coleman, S. J. [1]
Zeng, Z. [2]
Wang, K. [2]
Luo, S. [3]
Khrebtukova, I. [3]
Mienaltowski, M. J. [1]
Schroth, G. P. [3]
Liu, J. [2]
MacLeod, J. N. [1]
Affiliations
[1] Univ Kentucky, Dept Vet Sci, Maxwell H Gluck Equine Res Ctr, Lexington, KY 40546 USA
[2] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[3] Illumina Inc, Hayward, CA 94545 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
gene structure annotation; transcriptome; gene expression; Equus caballus; RNA-seq; CELL TRANSCRIPTOME;
DOI
10.1111/j.1365-2052.2010.02118.x
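Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905
Abstract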
The horse, like the majority of animal species, has a limited amount of species-specific expressed sequence data available in public databases. As a result, structural models for the majority of genes defined in the equine genome are predictions based on ab initio sequence analysis or the projection of gene structures from other mammalian species. The current study used Illumina-based sequencing of messenger RNA (RNA-seq) to help refine structural annotation of equine protein-coding genes and for a preliminary assessment of gene expression patterns. Sequencing of mRNA from eight equine tissues generated 293 758 105 sequence tags of 35 bases each, equalling 10.28 Gbp of total sequence data. The tag alignments represent approximately 207x coverage of the equine mRNA transcriptome and confirmed transcriptional activity for roughly 90% of the protein-coding gene structures predicted by Ensembl and NCBI. Tag coverage was sufficient to refine the structural annotation for 11 356 of these predicted genes, while also identifying an additional 456 transcripts with exon/intron features that are not listed by either Ensembl or NCBI. Genomic locus data and intervals for the protein-coding genes predicted by the Ensembl and NCBI annotation pipelines were combined with 75 116 RNA-seq-derived transcriptional units to generate a consensus equine protein-coding gene set of 20 302 defined loci. Gene ontology annotation was used to compare the functional and structural categories of genes expressed in either a tissue-restricted pattern or broadly across all tissue samples.
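The total-sequence figure in the abstract follows directly from the tag count and tag length. As a minimal sketch of that arithmetic (the constant and function names are illustrative and do not come from the paper):

```python
# Figures quoted in the abstract; names below are illustrative only.
TAG_COUNT = 293_758_105   # sequence tags across the eight tissues
TAG_LENGTH = 35           # bases per tag


def total_bases(tag_count: int, tag_length: int) -> int:
    """Total sequenced bases = number of tags x bases per tag."""
    return tag_count * tag_length


total = total_bases(TAG_COUNT, TAG_LENGTH)
print(f"{total:,} bases = {total / 1e9:.2f} Gbp")
```

This only checks the raw yield. The 207x coverage figure refers to aligned tags, so it cannot be reproduced from these two numbers alone.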
Pages: 121-130
Page count: 10
Related papers
23 in total
  • [1] Antczak D F, 1987, J Reprod Fertil Suppl, V35, P371
  • [2] NCBI GEO: archive for high-throughput functional genomic data
    Barrett, Tanya
    Troup, Dennis B.
    Wilhite, Stephen E.
    Ledoux, Pierre
    Rudnev, Dmitry
    Evangelista, Carlos
    Kim, Irene F.
    Soboleva, Alexandra
    Tomashevsky, Maxim
    Marshall, Kimberly A.
    Phillippy, Katherine H.
    Sherman, Patti M.
    Muertter, Rolf N.
    Edgar, Ron
    [J]. NUCLEIC ACIDS RESEARCH, 2009, 37 : D885 - D890
  • [3] Steady progress and recent breakthroughs in the accuracy of automated genome annotation
    Brent, Michael R.
    [J]. NATURE REVIEWS GENETICS, 2008, 9 (01) : 62 - 73
  • [4] CHOMCZYNSKI P, 1987, ANAL BIOCHEM, V162, P156, DOI 10.1016/0003-2697(87)90021-2
  • [5] Stem cell transcriptome profiling via massive-scale mRNA sequencing
    Cloonan, Nicole
    Forrest, Alistair R. R.
    Kolle, Gabriel
    Gardiner, Brooke B. A.
    Faulkner, Geoffrey J.
    Brown, Mellissa K.
    Taylor, Darrin F.
    Steptoe, Anita L.
    Wani, Shivangi
    Bethel, Graeme
    Robertson, Alan J.
    Perkins, Andrew C.
    Bruce, Stephen J.
    Lee, Clarence C.
    Ranade, Swati S.
    Peckham, Heather E.
    Manning, Jonathan M.
    McKernan, Kevin J.
    Grimmond, Sean M.
    [J]. NATURE METHODS, 2008, 5 (07) : 613 - 619
  • [6] Intron-exon structures of eukaryotic model organisms
    Deutsch, M
    Long, M
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (15) : 3219 - 3228
  • [7] Gene Expression Omnibus: NCBI gene expression and hybridization array data repository
    Edgar, R
    Domrachev, M
    Lash, AE
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 207 - 210
  • [8] Highly integrated single-base resolution maps of the epigenome in Arabidopsis
    Lister, Ryan
    O'Malley, Ronan C.
    Tonti-Filippini, Julian
    Gregory, Brian D.
    Berry, Charles C.
    Millar, A. Harvey
    Ecker, Joseph R.
    [J]. CELL, 2008, 133 (03) : 523 - 536
  • [9] MacLeod JN, 1998, AM J VET RES, V59, P1021
  • [10] Fibronectin mRNA splice variant in articular cartilage lacks bases encoding the V, III-15, and I-10 protein segments
    MacLeod, JN
    BurtonWurster, N
    Gu, DN
    Lust, G
    [J]. JOURNAL OF BIOLOGICAL CHEMISTRY, 1996, 271 (31) : 18954 - 18960