Design and analysis of experiments with high throughput biological assay data

Cited by: 51
Author
Rocke, DM [1]
Affiliation
[1] Univ Calif Davis, Div Biostat, Davis, CA 95616 USA
Keywords
gene expression; mass spectrometry; metabolomics; microarray; NMR spectroscopy
DOI
10.1016/j.semcdb.2004.09.007
Chinese Library Classification number
Q2 [Cell biology]
Discipline classification code
071009; 090102
Abstract
The design and analysis of experiments using gene expression microarrays is a topic of considerable current research, and work is beginning to appear on the analysis of proteomics and metabolomics data by mass spectrometry and NMR spectroscopy. The literature in this area is evolving rapidly; commercial software for analysis of array or proteomics data is rarely up to date, and is essentially nonexistent for metabolomics data. In this paper, I review some of the issues that should concern any biologist planning to use such high-throughput biological assay data in an experimental investigation. Technical details are kept to a minimum, and may be found in the referenced literature, as well as in the many excellent papers which space limitations prevent my describing. There are usually a number of viable options for design and analysis of such experiments, but unfortunately, there are even more non-viable ones that have been used even in the published literature. This is an area in which up-to-date knowledge of the literature is indispensable for efficient and effective design and analysis of these experiments. In general, we concentrate on relatively simple analyses, often focusing on identifying differentially expressed genes and the comparable issues in mass spectrometry and NMR spectroscopy (consistent differences in peak heights or areas, for example). Complex multivariate and pattern recognition methods also need much attention, but the issues we describe in this paper must be dealt with first. The literature on analysis of proteomics and metabolomics data is as yet sparse, so the main focus of this paper will be on methods devised for analysis of gene expression data that generalize to proteomics and metabolomics, with some specific comments near the end on analysis of metabolomics data by mass spectrometry and NMR spectroscopy. © 2004 Elsevier Ltd. All rights reserved.
Pages: 703-713
Page count: 11