A latent variable model for chemogenomic profiling

Cited by: 44
Authors
Flaherty, P [1]
Giaever, G
Kumm, J
Jordan, MI
Arkin, AP
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Div Comp Sci, Berkeley, CA 94720 USA
[3] Stanford Univ, Sch Med, Stanford Genome Technol Ctr, Palo Alto, CA 94304 USA
[4] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
[5] Univ Calif Berkeley, Howard Hughes Med Inst, Lawrence Berkeley Natl Lab, Phys Biosci Div, Berkeley, CA 94720 USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/bti515
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry]
Abstract
Motivation: In haploinsufficiency profiling data, pleiotropic genes are often misclassified by clustering algorithms that impose the constraint that a gene or experiment belong to only one cluster. We have developed a general probabilistic model that clusters genes and experiments without requiring that a given gene or drug only appear in one cluster. The model also incorporates the functional annotation of known genes to guide the clustering procedure.
Results: We applied our model to the clustering of 79 chemogenomic experiments in yeast. Known pleiotropic genes PDR5 and MAL11 are more accurately represented by the model than by a clustering procedure that requires genes to belong to a single cluster. Drugs such as miconazole and fenpropimorph that have different targets but similar off-target genes are clustered more accurately by the model-based framework. We show that this model is useful for summarizing the relationship among treatments and genes affected by those treatments in a compendium of microarray profiles.
Availability: Supplementary information and computer code at http://genomics.lbl.gov/llda
Contact: flaherty@berkeley.edu
Pages: 3286-3293
Page count: 8