Assessing Affymetrix GeneChip microarray quality

Cited by: 51
Authors
McCall, Matthew N. [2 ]
Murakami, Peter N. [3 ]
Lukk, Margus [4 ,5 ]
Huber, Wolfgang [6 ]
Irizarry, Rafael A. [1 ]
Affiliations
[1] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD USA
[2] Univ Rochester, Med Ctr, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
[3] Johns Hopkins Sch Med, Ctr Epigenet, Baltimore, MD USA
[4] EMBL EBI Funct Genom Grp, Cambridge CB10 1SD, England
[5] Canc Res UK Cambridge Res Inst, Li Ka Shing Ctr, Cambridge CB2 0RE, England
[6] EMBL Genome Biol Unit, D-69117 Heidelberg, Germany
Source
BMC BIOINFORMATICS | 2011 / Volume 12
Funding
National Institutes of Health (USA);
Keywords
EXPRESSION; DATABASE;
DOI
10.1186/1471-2105-12-137
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Microarray technology has become a widely used tool in the biological sciences. Over the past decade, the number of users has grown exponentially, and with the number of applications and secondary data analyses rapidly increasing, we expect this rate to continue. Various initiatives such as the External RNA Control Consortium (ERCC) and the MicroArray Quality Control (MAQC) project have explored ways to provide standards for the technology. For microarrays to become generally accepted as a reliable technology, statistical methods for assessing quality will be an indispensable component; however, there remains a lack of consensus in both defining and measuring microarray quality.
Results: We begin by providing a precise definition of microarray quality and reviewing existing Affymetrix GeneChip quality metrics in light of this definition. We show that the best-performing metrics require multiple arrays to be assessed simultaneously. While such multi-array quality metrics are adequate for bench science, as microarrays begin to be used in clinical settings, single-array quality metrics will be indispensable. To this end, we define a single-array version of one of the best multi-array quality metrics and show that this metric performs as well as the best multi-array metrics. We then use this new quality metric to assess the quality of microarray data available via the Gene Expression Omnibus (GEO) using more than 22,000 Affymetrix HGU133a and HGU133plus2 arrays from 809 studies.
Conclusions: We find that approximately 10 percent of these publicly available arrays are of poor quality. Moreover, the quality of microarray measurements varies greatly from hybridization to hybridization, study to study, and lab to lab, with some experiments producing unusable data. Many of the concepts described here are applicable to other high-throughput technologies.
Pages: 10