Maintaining data integrity in microarray data management

Cited by: 12
Authors
Grant, GR [1]
Manduchi, E [1]
Pizarro, A [1]
Stoeckert, CJ [1]
Affiliations
[1] Univ Penn, Penn Ctr Bioinformat, Philadelphia, PA 19104 USA
Keywords
microarray; data integrity; quality control; data management; database; data corruption
DOI
10.1002/bit.10847
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Gene expression microarrays are a relatively new technology, dating back just a few years, yet they have already become a very widely used tool in biology and have evolved to a wide range of applications well beyond their original design intent. However, while the use of microarrays has expanded and the issues of performance optimization have been intensively studied, the fundamental issue of data integrity management has largely been ignored. Now that performance has improved so greatly, the shortcomings of data integrity control methods account for a greater share of the stumbling blocks for investigators. Microarray data are cumbersome, and the rule up to this point has mostly been one of hands-on transformations, leading to human errors that often have dramatic consequences. We show in this review that the time lost on such mistakes can be enormous and that the mistakes can dramatically affect results; therefore, they should be mitigated in any way possible. We outline the scope of the data integrity issue and survey some of the most common and dangerous data transformations and their shortcomings. To illustrate, we review some case studies. We then look at the work done by the research community on this issue (which admittedly is meager up to this point). Some data integrity issues are always going to be difficult, while others will become easier; one of our goals is to expedite the use of integrity control methods. Finally, we present some preliminary guidelines and some specific approaches that we believe should be the focus of future research. © 2003 Wiley Periodicals, Inc.
Pages: 795-800
Page count: 6