A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data

Cited by: 155
Authors
Barbosa-Morais, Nuno L. [1]
Dunning, Mark J. [1]
Samarajiwa, Shamith A. [1]
Darot, Jeremy F. J. [1]
Ritchie, Matthew E. [1, 2]
Lynch, Andy G. [1]
Tavare, Simon [1]
Affiliations
[1] Univ Cambridge, Dept Oncol, CRUK Cambridge Res Inst, Li Ka Shing Ctr, Cambridge CB2 0RE, England
[2] Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic 3052, Australia
Funding
UK Medical Research Council
Keywords
GENOME BROWSER; AFFYMETRIX; PROBESETS; EXON; TRANSCRIPTION; PERFORMANCE; SEQUENCES; NUMBER; CHIPS; MICE;
DOI
10.1093/nar/gkp942
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Illumina BeadArrays are among the most popular and reliable platforms for gene expression profiling. However, little external scrutiny has been given to the design, selection and annotation of BeadArray probes, which is a fundamental issue in data quality and interpretation. Here we present a pipeline for the complete genomic and transcriptomic re-annotation of Illumina probe sequences, also applicable to other platforms, with its output available through a Web interface and incorporated into Bioconductor packages. We have identified several problems with the design of individual probes and we show the benefits of probe re-annotation on the analysis of BeadArray gene expression data sets. We discuss the importance of aspects such as probe coverage of individual transcripts, alternative messenger RNA splicing, single-nucleotide polymorphisms, repeat sequences, RNA degradation biases and probes targeting genomic regions with no known transcription. We conclude that many of the Illumina probes have unreliable original annotation and that our re-annotation allows analyses to focus on the good quality probes, which form the majority, and also to expand the scope of biological information that can be extracted.
Pages: e17.1-e17.13
Number of pages: 13