Finding associations in dense genetic maps: A genetic algorithm approach

Cited by: 13
Authors
Clark, TG
De Iorio, M
Griffiths, RC
Farrall, M
Affiliations
[1] Univ London Imperial Coll Sci & Technol, Dept Epidemiol & Publ Hlth, London W2 1PG, England
[2] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[3] Univ Oxford, Dept Cardiovasc Med, Oxford, England
Keywords
association studies; genetic algorithm; linkage disequilibrium; logic trees; SNP data
DOI
10.1159/000088845
Chinese Library Classification
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Large-scale association studies hold promise for discovering the genetic basis of common human disease. These studies will consist of a large number of individuals, as well as a large number of genetic markers, such as single nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between phenotypes and SNPs in dense genetic maps. Our approach uses a genetic algorithm (GA) to construct logic trees consisting of Boolean expressions involving strings or blocks of SNPs. These blocks, or nodes of the logic trees, consist of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other due to evolutionary processes. At each generation of our GA, a population of logic tree models is modified using selection, cross-over and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and cross-over moves use LD measures to propose changes to the trees, and facilitate movement through the model space. We demonstrate our method and the flexibility of the logic tree structure with variable nodal lengths on simulated data from a coalescent model, as well as data from a candidate gene study of quantitative genetic variation. Copyright (c) 2005 S. Karger AG, Basel.
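
The generation loop described in the abstract (a population of logic trees modified by selection, cross-over and mutation) can be illustrated with a minimal sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: leaves are single binary SNP indicators rather than LD blocks, the fitness is plain classification accuracy with a size penalty standing in for the marginal likelihood of a Bayesian regression, the moves are uniform random rather than LD-guided, and the data are toy draws rather than coalescent simulations. All function names are hypothetical.

```python
import random

# A logic tree is either an int (index of a SNP leaf) or a tuple (op, left, right)
# with op in {"and", "or"}. Leaves here are single SNPs, NOT LD blocks (assumption).

def evaluate(tree, genotype):
    """Evaluate the Boolean expression encoded by `tree` on one genotype (list of 0/1)."""
    if isinstance(tree, int):
        return genotype[tree]
    op, left, right = tree
    a, b = evaluate(left, genotype), evaluate(right, genotype)
    return a & b if op == "and" else a | b

def size(tree):
    """Number of nodes (leaves plus operators) in the tree."""
    return 1 if isinstance(tree, int) else 1 + size(tree[1]) + size(tree[2])

def fitness(tree, genotypes, phenotypes):
    """Stand-in fitness: accuracy minus a size penalty (the paper uses a marginal likelihood)."""
    correct = sum(evaluate(tree, g) == y for g, y in zip(genotypes, phenotypes))
    return correct / len(phenotypes) - 0.01 * size(tree)

def random_tree(n_snps, rng, depth=2):
    """Draw a random tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_snps)
    return (rng.choice(["and", "or"]),
            random_tree(n_snps, rng, depth - 1),
            random_tree(n_snps, rng, depth - 1))

def mutate(tree, n_snps, rng):
    """Uniform random mutation: swap a leaf's SNP, flip an operator, or recurse into a child."""
    if isinstance(tree, int):
        return rng.randrange(n_snps)
    op, left, right = tree
    u = rng.random()
    if u < 1 / 3:
        return ("or" if op == "and" else "and", left, right)
    if u < 2 / 3:
        return (op, mutate(left, n_snps, rng), right)
    return (op, left, mutate(right, n_snps, rng))

def crossover(a, b, rng):
    """Combine the left subtree of `a` with the right subtree of `b` (leaves pass through)."""
    if isinstance(a, int) or isinstance(b, int):
        return rng.choice([a, b])
    return (a[0], a[1], b[2])

def run_ga(genotypes, phenotypes, n_snps, rng, pop_size=40, generations=30):
    """Truncation selection with elitism: keep the fitter half, refill with offspring."""
    score = lambda t: fitness(t, genotypes, phenotypes)
    pop = [random_tree(n_snps, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            child = crossover(parent_a, parent_b, rng)
            if rng.random() < 0.5:
                child = mutate(child, n_snps, rng)
            children.append(child)
        pop = survivors + children
    return max(pop, key=score)

# Toy data (assumption): 8 independent binary SNPs, phenotype = SNP0 AND SNP3.
rng = random.Random(0)
genotypes = [[rng.randint(0, 1) for _ in range(8)] for _ in range(200)]
phenotypes = [g[0] & g[3] for g in genotypes]
best = run_ga(genotypes, phenotypes, 8, rng)
print(best, fitness(best, genotypes, phenotypes))
```

Because the fitter half of the population is carried over unchanged, the best fitness never decreases between generations. Replacing `fitness` with a model-based score and biasing `mutate` toward SNPs in high LD with the current leaf would bring the sketch closer to the approach the abstract describes.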
Pages: 97-108
Page count: 12