Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data

Cited by: 1422
Authors
Dai, MH
Wang, PL
Boyd, AD
Kostov, G
Athey, B
Jones, EG
Bunney, WE
Myers, RM
Speed, TP
Akil, H
Watson, SJ
Meng, F [1]
Affiliations
[1] Univ Michigan, Mol & Behav Neurosci Inst, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Psychiat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Michigan Ctr Biol Informat, Ann Arbor, MI 48105 USA
[4] Univ Calif Davis, Dept Psychiat, Davis, CA 95616 USA
[5] Univ Calif Davis, Ctr Neurosci, Davis, CA 95616 USA
[6] Univ Calif Irvine, Dept Psychiat & Human Behav, Irvine, CA 92697 USA
[7] Stanford Univ, Sch Med, Dept Genet, Stanford, CA 94305 USA
[8] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
DOI
10.1093/nar/gni179
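Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry]
Abstract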
Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotation, which is significantly different from current knowledge. The resultant informatics problems have a profound impact on the analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals a ~30-50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
Pages: e175.1-e175.9
Number of pages: 9
Related papers
21 in total
[1]
FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes [J].
Al-Shahrour, F ;
Díaz-Uriarte, R ;
Dopazo, J .
BIOINFORMATICS, 2004, 20 (04) :578-580
[2]
Barrett T, 2005, NUCLEIC ACIDS RES, V33, pD562
[3]
A comparison of normalization methods for high density oligonucleotide array data based on variance and bias [J].
Bolstad, BM ;
Irizarry, RA ;
Åstrand, M ;
Speed, TP .
BIOINFORMATICS, 2003, 19 (02) :185-193
[4]
Finishing the euchromatic sequence of the human genome [J].
Collins, FS ;
Lander, ES ;
Rogers, J ;
Waterston, RH .
NATURE, 2004, 431 (7011) :931-945
[5]
Alternative mapping of probes to genes for Affymetrix chips (art. no. 111) [J].
Gautier, L ;
Moller, M ;
Friis-Hansen, L ;
Knudsen, S .
BMC BIOINFORMATICS, 2004, 5 (1)
[6]
Bioconductor: open software development for computational biology and bioinformatics [J].
Gentleman, RC ;
Carey, VJ ;
Bates, DM ;
Bolstad, B ;
Dettling, M ;
Dudoit, S ;
Ellis, B ;
Gautier, L ;
Ge, YC ;
Gentry, J ;
Hornik, K ;
Hothorn, T ;
Huber, W ;
Iacus, S ;
Irizarry, R ;
Leisch, F ;
Li, C ;
Maechler, M ;
Rossini, AJ ;
Sawitzki, G ;
Smith, C ;
Smyth, G ;
Tierney, L ;
Yang, JYH ;
Zhang, JH .
GENOME BIOLOGY, 2004, 5 (10)
[7]
Genomic profiling of the human heart before and after mechanical support with a ventricular assist device reveals alterations in vascular signaling networks [J].
Hall, JL ;
Grindle, S ;
Han, XQ ;
Fermin, D ;
Park, S ;
Chen, YJ ;
Bache, RJ ;
Mariash, A ;
Guan, ZJ ;
Ormaza, S ;
Thompson, J ;
Graziano, J ;
Lazaro, SED ;
Pan, SC ;
Simari, RD ;
Miller, LW .
PHYSIOLOGICAL GENOMICS, 2004, 17 (03) :283-291
[8]
HARBIG J, NUCL ACIDS RES, V33, pE31
[9]
Ensembl 2005 [J].
Hubbard, T ;
Andrews, D ;
Caccamo, M ;
Cameron, G ;
Chen, Y ;
Clamp, M ;
Clarke, L ;
Coates, G ;
Cox, T ;
Cunningham, F ;
Curwen, V ;
Cutts, T ;
Down, T ;
Durbin, R ;
Fernandez-Suarez, XM ;
Gilbert, J ;
Hammond, M ;
Herrero, J ;
Hotz, H ;
Howe, K ;
Iyer, V ;
Jekosch, K ;
Kahari, A ;
Kasprzyk, A ;
Keefe, D ;
Keenan, S ;
Kokocinsci, F ;
London, D ;
Longden, I ;
McVicker, G ;
Melsopp, C ;
Meidl, P ;
Potter, S ;
Proctor, G ;
Rae, M ;
Rios, D ;
Schuster, M ;
Searle, S ;
Severin, J ;
Slater, G ;
Smedley, D ;
Smith, J ;
Spooner, W ;
Stabenau, A ;
Stalker, J ;
Storey, R ;
Trevanion, S ;
Ureta-Vidal, A ;
Vogel, J ;
White, S .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D447-D453
[10]
Exploration, normalization, and summaries of high density oligonucleotide array probe level data [J].
Irizarry, RA ;
Hobbs, B ;
Collin, F ;
Beazer-Barclay, YD ;
Antonellis, KJ ;
Scherf, U ;
Speed, TP .
BIOSTATISTICS, 2003, 4 (02) :249-264