HIGHLIGHTING RELATIONSHIPS BETWEEN HETEROGENEOUS BIOLOGICAL DATA THROUGH GRAPHICAL DISPLAYS BASED ON REGULARIZED CANONICAL CORRELATION ANALYSIS

Cited by: 41
Authors
Gonzalez, I. [1]
Dejean, S. [1]
Martin, P. G. P. [2]
Goncalves, O. [3]
Besse, P. [1]
Baccini, A. [1]
Affiliations
[1] Univ Toulouse 3, Inst Math Toulouse, CNRS, UMR 5219, F-31062 Toulouse, France
[2] INRA, Lab Pharmacol Toxicol, UR 66, F-31931 Toulouse, France
[3] Univ Blaise Pascal, CNRS, Lab Microorganismes Genome & Environm, UMR 6023, Clermont Ferrand 2, France
Keywords
Canonical Correlation Analysis; Regularization; Cross-Validation; Graphical Display; Gene Expression Data; Anti-Cancer Drugs Efficacy; GENE-EXPRESSION DATA; P-GLYCOPROTEIN; DISCRIMINANT-ANALYSIS; MULTIDRUG-RESISTANCE; PPAR-ALPHA; EFFLUX; CANCER; MICROARRAYS; REGRESSION; SELECTION;
DOI
10.1142/S0218339009002831
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Biological data produced by high-throughput technologies are becoming increasingly abundant and raise many statistical questions. This paper addresses one of them: how to highlight significant relationships between gene expression and other variables when gene expression data are jointly observed with those variables. One relevant statistical method for exploring these relationships is Canonical Correlation Analysis (CCA). Unfortunately, in the context of postgenomic data, the number of variables (gene expressions) is usually greater than the number of units (samples), so CCA cannot be performed directly: a regularized version is required. We applied regularized CCA to data sets from two different studies and show that its interpretation brings out both previously validated relationships and new hypotheses. From the first data sets (nutrigenomic study), we generated hypotheses on the transcription factor pathways potentially linking hepatic fatty acids and gene expression. From the second data sets (pharmacogenomic study on the NCI-60 cancer cell line panel), we identified new candidate substrates of ABC transporters, whose relevance is illustrated by the concomitant identification of several known substrates. In conclusion, regularized CCA is likely to be relevant to a wide variety of biological experiments that generate high-throughput data. We demonstrate here its ability to broaden the range of relevant conclusions that can be drawn from these relatively expensive experiments.
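To make the regularization argument concrete, the following is a minimal sketch of one common form of regularized CCA, in which a ridge term is added to each within-set covariance matrix so that it becomes invertible even when there are more variables than samples. The abstract does not specify the exact form of regularization, so the ridge form, the function name `regularized_cca`, and the parameter names `lambda1` and `lambda2` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def regularized_cca(X, Y, lambda1, lambda2, n_components=2):
    """Ridge-regularized CCA (illustrative sketch, not the paper's code).

    X: (n, p) array, Y: (n, q) array, rows are the same n samples.
    lambda1, lambda2: positive ridge parameters (hypothetical names);
    in practice they would be tuned, e.g. by cross-validation.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariances; the ridge term makes Sxx and Syy positive
    # definite even when p > n or q > n.
    Sxx = Xc.T @ Xc / (n - 1) + lambda1 * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + lambda2 * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # K = Lx^{-1} Sxy Ly^{-T}; its singular values are the
    # (regularized) canonical correlations.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    # Map singular vectors back to canonical weight vectors.
    A = np.linalg.solve(Lx.T, U[:, :n_components])
    B = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return s[:n_components], A, B
```

With `lambda1 = lambda2 = 0` and more variables than samples, the Cholesky factorization would fail because the sample covariance matrix is singular, which is the situation the abstract describes. Projecting the centered data onto the columns of `A` and `B` gives the canonical variates on which graphical displays can be based.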
Pages: 173-199
Page count: 27