1 Department of Molecular Biology and Genetics - Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Science and Technology, Aarhus University
2 Biostatistics Department, University of Alabama at Birmingham
3 Department of Molecular Biology and Genetics - Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Science and Technology, Aarhus University
As stated by Wray and co-authors (ref. 1), knowing the proportion of variance of a trait that is explained by regression on markers in the population ($h^2_{\text{M}}$) is relevant because, in principle, $h^2_{\text{M}}$ represents the maximum prediction accuracy ($R^2_{\text{TST}}$) that would be achievable in testing (TST) data if marker effects were known (ref. 2). Following one study (ref. 3), Wray and co-authors (ref. 1) suggest estimating $h^2_{\text{M}}$ using a ratio of variance components inferred from a G-BLUP analysis ($h^2_{\text{G-BLUP}}$). However, the realized proportions of allele sharing at markers and at causal loci can be very different (ref. 4) owing to, for example, imperfect linkage disequilibrium (LD) between markers and causal loci. Consequently, the marker-based model may substantially misrepresent the data-generating process, and this problem is exacerbated with unrelated individuals (ref. 5). Under these conditions, it is not clear that the finite-sample estimate of $h^2_{\text{G-BLUP}}$ is an unbiased estimate of $h^2_{\text{M}}$ (ref. 5); consequently, it is not obvious that $R^2_{\text{TST}}$ can reach values equal to the finite-sample estimate of $h^2_{\text{G-BLUP}}$. In a recent article (ref. 5), we studied the $R^2_{\text{TST}}$ of G-BLUP and its relationship with $h^2_{\text{G-BLUP}}$. We showed analytically that mis-specification of the training–testing (TRN–TST) genomic relationships (owing to, for example, imperfect LD between markers and causal loci) can impose a large-sample upper bound on $R^2_{\text{TST}}$ that is considerably lower than the finite-sample estimate of $h^2_{\text{G-BLUP}}$. The same study (ref. 5) also presents simulation scenarios with nominally unrelated individuals in which $R^2_{\text{TST}}$ can be extremely low across scenarios with markedly different values of $h^2_{\text{G-BLUP}}$, suggesting a tenuous relationship between $h^2_{\text{G-BLUP}}$ and $R^2_{\text{TST}}$, even with moderately large TRN samples.
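The point about allele sharing can be illustrated with a toy simulation. The sketch below is not the simulation design of the study cited above. It is a minimal illustration under assumptions introduced here: one marker per causal locus, unrelated individuals, and a hypothetical `copy_prob` parameter that stands in for the strength of LD between markers and causal loci. All function names are likewise hypothetical. It compares the off-diagonal elements of a relationship matrix computed from markers with those of a matrix computed from the causal loci.

```python
import numpy as np


def simulate_genotypes(n, p, copy_prob, rng):
    """Simulate causal and marker genotypes (0/1/2) for n unrelated individuals.

    Assumption for illustration only: each marker allele copies its paired
    causal allele with probability copy_prob and is otherwise drawn
    independently at the same allele frequency (a toy stand-in for LD).
    """
    freq = rng.uniform(0.1, 0.9, size=p)         # allele frequency per locus
    causal_hap = rng.random((2, n, p)) < freq    # two haplotypes per individual
    indep_hap = rng.random((2, n, p)) < freq     # independent alleles, same frequency
    copied = rng.random((2, n, p)) < copy_prob   # marker alleles that track the causal allele
    marker_hap = np.where(copied, causal_hap, indep_hap)
    return causal_hap.sum(axis=0), marker_hap.sum(axis=0)


def genomic_relationship(genotypes):
    """Relationship matrix from column-standardized genotypes: Z Z' / p."""
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    return z @ z.T / genotypes.shape[1]


def offdiag_correlation(g_a, g_b):
    """Correlation between the off-diagonal elements of two relationship matrices."""
    upper = np.triu_indices(g_a.shape[0], k=1)
    return np.corrcoef(g_a[upper], g_b[upper])[0, 1]


rng = np.random.default_rng(2024)
causal, markers = simulate_genotypes(n=200, p=400, copy_prob=0.5, rng=rng)
r = offdiag_correlation(genomic_relationship(causal), genomic_relationship(markers))
print(f"correlation of allele sharing at markers vs causal loci: {r:.2f}")
```

With `copy_prob=1.0` the two matrices coincide. Lowering `copy_prob` weakens the agreement between marker-based and causal-locus relationships among unrelated individuals, which is the kind of mis-specification of genomic relationships discussed above.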