Implications of avoiding overlap between training and testing data sets when evaluating genomic predictions of genetic merit

Cited by: 27
Authors
Amer, P. R. [1]
Banos, G. [2]
Affiliations
[1] AbacusBio Ltd, Dunedin, New Zealand
[2] Aristotle Univ Thessaloniki, Fac Vet Med, Dept Anim Prod, GR-54124 Thessaloniki, Greece
Keywords
genomic selection; validation; selection; index theory; simulation; breeding values
DOI
10.3168/jds.2009-2845
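Chinese Library Classification (CLC) number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
0905
Abstract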
The aim of this study was to evaluate and quantify the importance of avoiding overlap between training and testing subsets of data when evaluating the effectiveness of predictions of genetic merit based on genetic markers. Genomic selection holds great potential for increasing the accuracy of selection in young bulls and is likely to lead quickly to more widespread use of these young bulls with a shorter generation interval and faster genetic improvement. Practical implementations of genomic selection in dairy cattle commonly involve results of national genetic evaluations being used as the dependent variable to evaluate the predictive ability of genetic markers. Selection index theory was used to demonstrate how ignoring correlations among errors of prediction between animals in training and testing sets could result in overestimates of accuracy of genomic predictions. Correlations among errors of prediction occur when estimates of genetic merit of training animals used in prediction are taken from the same genetic evaluation as estimates for validation of animals. Selection index theory was used to show a substantial degree of error correlation when animals used for testing genomic predictions are progeny of training animals, when heritability is low, and when the number of recorded progeny for both training and testing animals is low. Even when training involves a dependent variable that is not influenced by the progeny records of testing animals (i.e., historic proofs), error correlations can still result from records of relatives of training animals contributing to both the historic proofs and the predictions of genetic merit of testing animals. A simple simulation was used to show how an error correlation could result in spurious confirmation of predictive ability that was overestimated in the training population because of ascertainment bias. 
Development of a method of testing genomic selection predictions that allows unbiased testing when training and testing variables are estimated breeding values from the same genetic evaluation would simplify training and testing of genomic predictions. In the meantime, a 4-step approach for separating records used for training from those used for testing after correction of fixed effects is suggested when use of progeny averages of adjusted records (e.g., daughter yield deviations) would result in inefficient use of the information available in the data.
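The mechanism described in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the authors' simulation: the function names (`pearson`, `simulate`), the parameter `marker_signal`, and all variances are assumptions chosen for illustration. It builds an estimated breeding value and a genomic prediction for each testing animal, lets their errors share a common component, and compares the apparent accuracy (correlation with the estimated breeding value) against the correlation with true genetic merit.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_animals=20000, error_corr=0.0, marker_signal=0.5, seed=1):
    """Return (apparent, realized) correlations for a toy testing set.

    error_corr is the assumed correlation between the error in the
    testing animals' estimated breeding values and the error in their
    genomic predictions (hypothetical parameter, between 0 and 1).
    marker_signal is the assumed weight of true merit in the prediction.
    """
    if not 0.0 <= error_corr <= 1.0:
        raise ValueError("error_corr must lie in [0, 1]")
    rng = random.Random(seed)
    w_shared = math.sqrt(error_corr)
    w_own = math.sqrt(1.0 - error_corr)
    true_merit, ebv, prediction = [], [], []
    for _ in range(n_animals):
        g = rng.gauss(0.0, 1.0)        # true genetic merit
        shared = rng.gauss(0.0, 1.0)   # error common to both estimates
        e_ebv = w_shared * shared + w_own * rng.gauss(0.0, 1.0)
        e_pred = w_shared * shared + w_own * rng.gauss(0.0, 1.0)
        true_merit.append(g)
        ebv.append(g + e_ebv)                      # testing variable
        prediction.append(marker_signal * g + e_pred)
    apparent = pearson(prediction, ebv)         # what validation reports
    realized = pearson(prediction, true_merit)  # what actually matters
    return apparent, realized


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5):
        apparent, realized = simulate(error_corr=r)
        print(f"error_corr={r:.2f} apparent={apparent:.3f} realized={realized:.3f}")
```

In this sketch the apparent correlation rises as `error_corr` increases, while the correlation with true merit stays essentially unchanged, which is the sense in which overlapping training and testing information can appear to confirm predictive ability that is not there.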
Pages: 3320-3330
Number of pages: 11