Combinatorial motif analysis and hypothesis generation on a genomic scale

Times cited: 22
Authors
Hu, YJ
Sandmeyer, S
McLaughlin, C
Kibler, D
Affiliations
[1] Univ Calif Irvine, Coll Med, Dept Informat & Comp Sci, Irvine, CA 92717 USA
[2] Univ Calif Irvine, Coll Med, Dept Biol Chem, Irvine, CA 92717 USA
Keywords
DOI
10.1093/bioinformatics/16.3.222
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Computer-assisted methods are essential for the analysis of biosequences. Gene activity is regulated in part by the binding of regulatory molecules (transcription factors) to combinations of short motifs. The goal of our analysis is the development of algorithms to identify regulatory motifs and to predict the activity of combinations of those motifs.
Approach: Our research begins with a new motif-finding method, using multiple objective functions and an improved stochastic iterative sampling strategy. Combinatorial motif analysis is accomplished by constructive induction that analyzes potential motif combinations. The hypothesis is generated by applying standard inductive learning algorithms.
Results: Tests using 10 previously identified regulons from budding yeast and 14 artificial families of sequences demonstrated the effectiveness of the new motif-finding method. Motif combination and classification approaches were used in the analysis of a sample DNA array data set derived from genome-wide gene expression analysis.
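The abstract gives no implementation details, so the following is only a minimal sketch of what constructing features from motif combinations could look like, not the authors' method. It assumes motifs are exact strings (real regulatory motifs are usually degenerate or matrix-based), restricts combinations to pairwise AND, and uses made-up motif names and sequences (`m1`, `geneA`, and so on) purely for illustration.

```python
from itertools import combinations


def motif_features(sequence, motifs):
    """Map each motif name to True if its pattern occurs in the sequence."""
    return {name: pattern in sequence for name, pattern in motifs.items()}


def pair_features(single):
    """Construct one AND feature per unordered pair of motifs."""
    return {
        f"{a}&{b}": single[a] and single[b]
        for a, b in combinations(sorted(single), 2)
    }


# Hypothetical motifs and upstream sequences, for illustration only.
motifs = {"m1": "TGACTC", "m2": "CACGTG", "m3": "GATAAG"}
upstream = {
    "geneA": "AATGACTCTTCACGTGAA",
    "geneB": "CCGATAAGCCTGACTCAA",
    "geneC": "TTTTTTCACGTGTTTTTT",
}

# One row per gene: single-motif features plus constructed pair features.
table = {}
for gene, seq in upstream.items():
    single = motif_features(seq, motifs)
    table[gene] = {**single, **pair_features(single)}
```

A table like this, paired with an expression label for each gene, is the kind of input a standard inductive learner (for example a decision tree) could use to propose which motif combinations are associated with a given expression pattern.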
Pages: 222-232
Number of pages: 11