Technical note: Computing strategies in genome-wide selection

Cited by: 85

Authors:

Legarra, A. [1]

Misztal, I. [2]

Affiliations:

[1] INRA, UR631, Stn Ameliorat Genet Anim, F-32326 Castanet Tolosan, France

[2] Univ Georgia, Dept Anim & Dairy Sci, Athens, GA 30602 USA

Source:

JOURNAL OF DAIRY SCIENCE | 2008, Vol. 91, No. 01

Keywords:

genome-wide selection; genomic selection; genetic evaluation; marker-assisted selection

DOI:

10.3168/jds.2007-0403

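Chinese Library Classification:

S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]

Discipline classification code:

0905

Abstract: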

Genome-wide genetic evaluation might involve the computation of BLUP-like estimations, potentially including thousands of covariates (i.e., single-nucleotide polymorphism markers) for each record. This implies dense Henderson's mixed-model equations and considerable computing resources in time and storage, even for a few thousand records. Possible computing options include the type of storage and the solving algorithm. This work evaluated several computing options, including half-stored Cholesky decomposition, Gauss-Seidel, and 3 matrix-free strategies: Gauss-Seidel, Gauss-Seidel with residuals update, and preconditioned conjugate gradients. Matrix-free Gauss-Seidel with residuals update adjusts the residuals after computing the solution for each effect. This avoids adjusting the left-hand side of the equations by all other effects at every step of the algorithm and saves considerable computing time. Any Gauss-Seidel algorithm can easily be extended for variance component estimation by Markov chain-Monte Carlo. Let $m$ and $n$ be the number of records and markers, respectively. Computing time for Cholesky decomposition is proportional to $n^3$. Computing times per round are proportional to $mn^2$ in matrix-free Gauss-Seidel, to $n^2$ for half-stored Gauss-Seidel, and to $n$ and $m$ for the rest of the algorithms. Algorithms were tested on a real mouse data set, which included 1,928 records and 10,946 single-nucleotide polymorphism markers. Computing times were in the order of a few minutes for Gauss-Seidel with residuals update and preconditioned conjugate gradients, more than 1 h for half-stored Gauss-Seidel, 2 h for Cholesky decomposition, and 4 d for matrix-free Gauss-Seidel. Preconditioned conjugate gradients was the fastest. Gauss-Seidel with residuals update would be the method of choice for variance component estimation as well as solving.
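
The residual-update idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a simple model, y = mu + Z a + e, with an overall mean, marker effects a, and a single ridge-type penalty `lam` (the residual-to-marker variance ratio). The function name `gsru`, its parameters, and the toy data are hypothetical and chosen only for illustration. The point is that each effect is solved from the current residual vector, which is then corrected immediately, so no coefficient matrix is ever stored.

```python
def gsru(y, Z, lam, n_rounds=1000, tol=1e-20):
    """Gauss-Seidel with residual update for y = mu + Z a + e.

    Assumed setup (not taken from the abstract): one overall mean,
    marker covariates in Z (list of rows), ridge penalty lam on a.
    """
    m, n = len(y), len(Z[0])
    mu = 0.0
    a = [0.0] * n
    e = list(y)  # residuals for the all-zero starting solution
    # Diagonal terms z_j'z_j are computed once and reused every round.
    zz = [sum(Z[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_rounds):
        change = 0.0
        # Overall mean: unpenalised, diagonal element is m.
        new_mu = (sum(e) + m * mu) / m
        delta = new_mu - mu
        e = [ei - delta for ei in e]  # update residuals right away
        mu = new_mu
        change += delta * delta
        for j in range(n):
            # Right-hand side rebuilt from the residuals: z_j'e + z_j'z_j a_j.
            rhs = sum(Z[i][j] * e[i] for i in range(m)) + zz[j] * a[j]
            new_a = rhs / (zz[j] + lam)
            delta = new_a - a[j]
            for i in range(m):
                e[i] -= Z[i][j] * delta  # residual update for this effect
            a[j] = new_a
            change += delta * delta
        if change < tol:  # stop when the solutions no longer move
            break
    return mu, a, e


# Toy data: 6 records, 3 markers coded 0/1/2 (made up for illustration).
Z = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1], [0, 0, 2], [2, 1, 0]]
y = [1.2, 0.7, 1.9, 2.4, 0.3, 1.5]
mu, a, e = gsru(y, Z, lam=1.0)
```

In this sketch each effect costs one pass over the m records, so a full round over n markers costs on the order of mn operations, in line with the per-round cost in n and m that the abstract reports for this class of algorithms.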

Citation:

Pages: 360-366

Page count: 7

