Mapping complex traits using Random Forests

Cited by: 45
Authors
Bureau, A
Dupuis, J
Hayward, B
Falls, K
Van Eerdewegh, P
Affiliations
[1] Genome Therapeut Corp, Waltham, MA 02453 USA
[2] Harvard Univ, Sch Med, Dept Psychiat, Boston, MA 02115 USA
DOI
10.1186/1471-2156-4-S1-S64
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Random Forest is a prediction technique based on growing trees on bootstrap samples of data, in conjunction with a random selection of explanatory variables to define the best split at each node. In the case of a quantitative outcome, the tree predictor takes on a numerical value. We applied Random Forest to the first replicate of the Genetic Analysis Workshop 13 simulated data set, with the sibling pairs as our units of analysis and identity by descent (IBD) at selected loci as our explanatory variables. With the knowledge of the true model, we performed two sets of analyses on three phenotypes: HDL, triglycerides, and glucose. The goal was to approach the mapping of complex traits from a multivariate perspective. The first set of analyses mimics a candidate gene approach with a high proportion of true genes among the predictors while the second set represents a genome scan analysis using microsatellite markers. Random Forest was able to identify a few of the major genes influencing the phenotypes, such as baseline HDL and triglycerides, but failed to identify the major genes regulating baseline glucose levels.
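The procedure described in the abstract (trees grown on bootstrap samples, a random subset of explanatory variables considered at each split, and a numerical prediction for a quantitative outcome) can be illustrated with a minimal Python sketch. This is not the authors' implementation. All function names, the tree depth, the forest size, and the toy data are assumptions made for illustration; in particular, the simulated IBD sharing values and the outcome are invented and are not the Genetic Analysis Workshop 13 data.

```python
import random
from statistics import mean


def best_split(rows, y, features):
    """Return the (feature, threshold) pair with the lowest summed squared error, or None."""
    best, best_sse = None, float("inf")
    for f in features:
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y[i] for i, r in enumerate(rows) if r[f] <= t]
            right = [y[i] for i, r in enumerate(rows) if r[f] > t]
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best_sse:
                best, best_sse = (f, t), sse
    return best


def grow_tree(rows, y, mtry, rng, depth=0, max_depth=3, min_size=5):
    """Grow a regression tree; a leaf is a number, an internal node is a tuple."""
    if depth >= max_depth or len(rows) < min_size:
        return mean(y)
    # Random selection of explanatory variables at this node.
    features = rng.sample(range(len(rows[0])), mtry)
    split = best_split(rows, y, features)
    if split is None:
        return mean(y)
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f,
        t,
        grow_tree([rows[i] for i in left], [y[i] for i in left],
                  mtry, rng, depth + 1, max_depth, min_size),
        grow_tree([rows[i] for i in right], [y[i] for i in right],
                  mtry, rng, depth + 1, max_depth, min_size),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's numerical value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def grow_forest(rows, y, n_trees=25, mtry=2, seed=0):
    """Grow each tree on a bootstrap sample (drawn with replacement) of the units."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx], [y[i] for i in idx], mtry, rng))
    return forest


def predict_forest(forest, row):
    """Average the tree predictions to obtain the forest prediction."""
    return mean([predict_tree(tree, row) for tree in forest])


# Hypothetical toy data: 200 units, each with IBD sharing (0, 0.5, or 1) at 4 loci.
# Only locus 0 influences the invented outcome.
data_rng = random.Random(1)
rows = [[data_rng.choice([0.0, 0.5, 1.0]) for _ in range(4)] for _ in range(200)]
y = [2.0 * r[0] + data_rng.gauss(0, 0.1) for r in rows]

forest = grow_forest(rows, y, n_trees=25, mtry=2, seed=0)
low = predict_forest(forest, [0.0, 0.5, 0.5, 0.5])
high = predict_forest(forest, [1.0, 0.5, 0.5, 0.5])
```

In this toy setting the two query rows differ only at locus 0, so a gap between `high` and `low` reflects the forest picking up that locus. The sketch omits variable importance measures and out-of-bag error estimation.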
Pages: 5
Related papers
7 in total
  • [1] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [2] Genetic Analysis Workshop 13: Simulated longitudinal data on families for a system of oligogenic traits
    Daw, EW
    Morrison, J
    Zhou, XJ
    Thomas, DC
    [J]. BMC GENETICS, 2003, 4 (Suppl 1)
  • [3] INVESTIGATION OF LINKAGE BETWEEN A QUANTITATIVE TRAIT AND A MARKER LOCUS
    HASEMAN, JK
    ELSTON, RC
    [J]. BEHAVIOR GENETICS, 1972, 2 (01) : 3 - 19
  • [4] Kruglyak L, 1996, AM J HUM GENET, V58, P1347
  • [5] Mukhopadhyay N, 1999, AM J HUM GENET, V65, pA436
  • [6] SAS Institute Inc., 1990, SAS LANG REF VERS 6
  • [7] Tree-based linkage and association analyses of asthma
    Zhang, HP
    Tsai, CP
    Yu, CY
    Bonney, G
    [J]. GENETIC EPIDEMIOLOGY, 2001, 21 : S317 - S322