Combining gene expression data from different generations of oligonucleotide arrays

Cited by: 47
Authors
Hwang, KB
Kong, SW
Greenberg, SA
Park, PJ
Affiliations
[1] Childrens Hosp, Informat Program, Boston, MA 02115 USA
[2] Seoul Natl Univ, Sch Engn & Comp Sci, Seoul 151742, South Korea
[3] Beth Israel Deaconess Med Ctr, Boston, MA 02215 USA
[4] Harvard Univ, Bauer Ctr Genom Res, Cambridge, MA 02138 USA
[5] Brigham & Womens Hosp, Dept Neurol, Boston, MA 02115 USA
[6] Harvard Partners Ctr Genet & Genom, Boston, MA 02115 USA
DOI
10.1186/1471-2105-5-159
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: One of the important challenges in microarray analysis is to take full advantage of previously accumulated data, both from one's own laboratory and from public repositories. Through a comparative analysis on a variety of datasets, a more comprehensive view of the underlying mechanism or structure can be obtained. However, as we discover in this work, continual changes in genomic sequence annotations and probe design criteria make it difficult to compare gene expression data even from different generations of the same microarray platform.
Results: We first describe the extent of discordance between the results derived from two generations of Affymetrix oligonucleotide arrays, as revealed in cluster analysis and in identification of differentially expressed genes. We then propose a method for increasing comparability. The dataset we use consists of a set of 14 human muscle biopsy samples from patients with inflammatory myopathies that were hybridized on both HG-U95Av2 and HG-U133A human arrays. We find that the use of the probe set matching table for comparative analysis provided by Affymetrix produces better results than matching by UniGene or LocusLink identifiers but still remains inadequate. Rescaling of expression values for each gene across samples and data filtering by expression values enhance comparability, but only for a few specific analyses. As a generic method for improving comparability, we select a subset of probes with overlapping sequence segments in the two array types and recalculate expression values based only on the selected probes. We show that this filtering of probes significantly improves the comparability while retaining a sufficient number of probe sets for further analysis.
Conclusions: Compatibility between high-density oligonucleotide arrays is significantly affected by probe-level sequence information. With a careful filtering of the probes based on their sequence overlaps, data from different generations of microarrays can be combined more effectively.
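The probe-filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the overlap criterion or the summarization method, so the `min_overlap` threshold, the median summary, and all function names below are assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from statistics import median


def longest_overlap(a: str, b: str) -> int:
    """Length of the longest contiguous segment shared by two probe sequences."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def select_overlapping(old_probes: dict, new_probes: dict, min_overlap: int = 20):
    """Keep probes that share at least `min_overlap` bases with a probe on the other array.

    `old_probes` and `new_probes` map probe IDs to sequences for one matched
    probe-set pair. The threshold of 20 is a hypothetical choice.
    """
    keep_old, keep_new = set(), set()
    for old_id, old_seq in old_probes.items():
        for new_id, new_seq in new_probes.items():
            if longest_overlap(old_seq, new_seq) >= min_overlap:
                keep_old.add(old_id)
                keep_new.add(new_id)
    return keep_old, keep_new


def summarize(log_intensities: dict, keep: set):
    """Recompute a probe-set value from the kept probes only.

    Uses the median of log2 intensities as a stand-in summary (an assumption);
    returns None when no probe survives the filter.
    """
    values = [v for probe_id, v in log_intensities.items() if probe_id in keep]
    return median(values) if values else None
```

For a matched pair of probe sets, `select_overlapping` would be called once with the probe sequences from each array generation, and `summarize` would then be applied per sample to the surviving probes. Probe sets for which no probe survives are dropped from the combined analysis in this sketch.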
Pages: 16