Clustering of diverse genomic data using information fusion

Times cited: 15
Authors
Kasturi, J [1]
Acharya, R [1]
Affiliations
[1] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
Funding
U.S. National Science Foundation
DOI
10.1093/bioinformatics/bti186
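Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry]
Abstract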
Motivation: Genome sequencing projects and high-throughput technologies such as DNA and protein arrays have produced very large amounts of information-rich data. Microarray experimental data are a valuable but limited source for inferring gene regulation mechanisms on a genomic scale. Additional information, such as promoter sequences of genes/DNA binding motifs, gene ontologies and location data, can increase the statistical significance of the findings when combined with gene expression analysis. This paper introduces a machine learning approach to information fusion for combining heterogeneous genomic data. The algorithm uses an unsupervised joint learning mechanism that identifies clusters of genes using the combined data.
Results: The correlation between gene expression time-series patterns obtained from different experimental conditions and the presence of several distinct and repeated motifs in their upstream sequences is examined here using publicly available yeast cell-cycle data. The results show that the combined learning approach taken here identifies correlated genes effectively. The algorithm provides an automated clustering method but allows the user to specify a priori the influence of each data type on the final clustering using probabilities.
Pages: 423-429
Page count: 7