Mining gene expression databases for association rules

Cited by: 249
Authors
Creighton, C [1]
Hanash, S [1]
Affiliation
[1] Univ Michigan, Bioinformat Program, Ann Arbor, MI 48109 USA
DOI
10.1093/bioinformatics/19.1.79
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Global gene expression profiling, at both the transcript level and the protein level, can be a valuable tool for understanding genes, biological networks, and cellular states. As increasingly large gene expression data sets become available, data mining techniques can be applied to identify patterns of interest in the data. Association rules, widely used in market basket analysis, can also be applied to the analysis of expression data. Association rules can reveal biologically relevant associations between different genes or between environmental effects and gene expression. An association rule has the form LHS --> RHS, where LHS and RHS are disjoint sets of items, the RHS set being likely to occur whenever the LHS set occurs. Items in gene expression data can include genes that are highly expressed or repressed, as well as relevant facts describing the cellular environment of the genes (e.g. the diagnosis of a tumor sample from which a profile was obtained).
Results: We demonstrate an algorithm for efficiently mining association rules from gene expression data, using the data set from Hughes et al. (2000, Cell, 102, 109-126) of 300 expression profiles for yeast. Using the algorithm, we find numerous rules in the data. A cursory analysis of some of these rules reveals numerous associations between certain genes, many of which make sense biologically, while others suggest new hypotheses that may warrant further investigation. In a data set derived from the yeast data set, but with the expression values for each transcript randomly shifted with respect to the experiments, no rules were found, indicating that nearly all of the rules mined from the actual data set are unlikely to have occurred by chance.
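The rule form LHS --> RHS described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a brute-force search over single-item rules on made-up data, using the standard support and confidence measures from market basket analysis. The item names (geneA_up and so on), the thresholds, and the helper names support, mine_rules, and shuffle_items are all assumptions introduced for illustration. The shuffle_items helper loosely mimics the randomized control mentioned in the abstract by reassigning each item to random profiles while keeping its frequency.

```python
import random
from itertools import permutations

# Toy, made-up data: one set of items per expression profile.
# Item names are hypothetical and not taken from the paper.
profiles = [
    {"geneA_up", "geneB_up"},
    {"geneA_up", "geneB_up", "geneC_down"},
    {"geneA_up", "geneB_up"},
    {"geneB_up", "geneC_down"},
    {"geneC_down"},
]

def support(itemset, transactions):
    """Fraction of profiles that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, min_support, min_confidence):
    """Brute-force search for single-item rules LHS -> RHS.

    Returns (lhs, rhs, support, confidence) tuples, where
    confidence = support(LHS and RHS together) / support(LHS).
    """
    items = sorted(set().union(*transactions))
    rules = []
    for lhs, rhs in permutations(items, 2):
        joint = support({lhs, rhs}, transactions)
        lhs_support = support({lhs}, transactions)
        if joint < min_support or lhs_support == 0:
            continue
        confidence = joint / lhs_support
        if confidence >= min_confidence:
            rules.append((lhs, rhs, joint, confidence))
    return rules

def shuffle_items(transactions, seed=0):
    """Control data: place each item in random profiles, keeping its count."""
    rng = random.Random(seed)
    n = len(transactions)
    shuffled = [set() for _ in range(n)]
    for item in sorted(set().union(*transactions)):
        count = sum(item in t for t in transactions)
        for idx in rng.sample(range(n), count):
            shuffled[idx].add(item)
    return shuffled

for lhs, rhs, s, c in mine_rules(profiles, min_support=0.4, min_confidence=0.8):
    print(f"{lhs} -> {rhs} (support={s:.2f}, confidence={c:.2f})")
```

On this toy data the only rule that passes both thresholds is geneA_up -> geneB_up, which holds in 3 of 5 profiles (support 0.60) and in every profile containing geneA_up (confidence 1.00). Running mine_rules on the output of shuffle_items gives a rough sense of how many rules arise by chance alone; a real analysis over hundreds of profiles and thousands of genes would need a far more efficient search than this exhaustive loop.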
Pages: 79-86
Number of pages: 8
Related papers
24 in total
[1]   Maintenance and integrity of the mitochondrial genome: a plethora of nuclear genes in the budding yeast [J].
Contamine, V ;
Picard, M .
MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, 2000, 64 (02) :281-+
[2]   The ARG11 gene of Saccharomyces cerevisiae encodes a mitochondrial integral membrane protein required for arginine biosynthesis [J].
Crabeel, M ;
Soetens, O ;
DeRijcke, M ;
Pratiwi, R ;
Pankiewicz, R .
JOURNAL OF BIOLOGICAL CHEMISTRY, 1996, 271 (40) :25011-25018
[3]   A CLOSE RELATIVE OF THE NUCLEAR, CHROMOSOMAL HIGH-MOBILITY GROUP PROTEIN HMG1 IN YEAST MITOCHONDRIA [J].
DIFFLEY, JFX ;
STILLMAN, B .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1991, 88 (17) :7864-7868
[4]   Discovery of association rules in medical data [J].
Doddi, S ;
Marathe, A ;
Ravi, SS ;
Torney, DC .
MEDICAL INFORMATICS AND THE INTERNET IN MEDICINE, 2001, 26 (01) :25-33
[5]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[6]   Using Bayesian networks to analyze expression data [J].
Friedman, N ;
Linial, M ;
Nachman, I ;
Pe'er, D .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) :601-620
[7]   Goffeau A, 1997, YEAST, V13, P43, DOI 10.1002/(SICI)1097-0061(199701)13:1<43::AID-YEA56>3.0.CO;2-J
[9]   Functional discovery via a compendium of expression profiles [J].
Hughes, TR ;
Marton, MJ ;
Jones, AR ;
Roberts, CJ ;
Stoughton, R ;
Armour, CD ;
Bennett, HA ;
Coffey, E ;
Dai, HY ;
He, YDD ;
Kidd, MJ ;
King, AM ;
Meyer, MR ;
Slade, D ;
Lum, PY ;
Stepaniants, SB ;
Shoemaker, DD ;
Gachotte, D ;
Chakraburtty, K ;
Simon, J ;
Bard, M ;
Friend, SH .
CELL, 2000, 102 (01) :109-126
[10]  
Kao LR, 1996, YEAST, V12, P1239, DOI 10.1002/(SICI)1097-0061(19960930)12:12<1239::AID-YEA17>3.0.CO